Showing 1–50 of 366 results for author: Zhao, P

Searching in archive cs.
  1. arXiv:2406.13372  [pdf, other]

    cs.AI

    Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation

    Authors: Kaikai An, Fangkai Yang, Liqun Li, Junting Lu, Sitao Cheng, Lu Wang, Pu Zhao, Lele Cao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

    Abstract: Current question answering systems leveraging retrieval augmented generation perform well in answering factoid questions but face challenges with non-factoid questions, particularly how-to queries requiring detailed step-by-step instructions and explanations. In this paper, we introduce Thread, a novel data organization paradigm that transforms documents into logic units based on their inter-conne…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 21 pages, 4 figures

  2. arXiv:2406.12195  [pdf, other]

    quant-ph cs.LG

    Quantum Compiling with Reinforcement Learning on a Superconducting Processor

    Authors: Z. T. Wang, Qiuhao Chen, Yuxuan Du, Z. H. Yang, Xiaoxia Cai, Kaixuan Huang, Jingning Zhang, Kai Xu, Jun Du, Yinan Li, Yuling Jiao, Xingyao Wu, Wu Liu, Xiliang Lu, Huikai Xu, Yirong Jin, Ruixia Wang, Haifeng Yu, S. P. Zhao

    Abstract: To effectively implement quantum algorithms on noisy intermediate-scale quantum (NISQ) processors is a central task in modern quantum technology. NISQ processors feature tens to a few hundreds of noisy qubits with limited coherence times and gate operations with errors, so NISQ algorithms naturally require employing circuits of short lengths via quantum compilation. Here, we develop a reinforcemen…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2406.11886  [pdf, other]

    cs.LG cs.AI cs.CE q-fin.CP

    Financial Assets Dependency Prediction Utilizing Spatiotemporal Patterns

    Authors: Haoren Zhu, Pengfei Zhao, Wilfred Siu Hung NG, Dik Lun Lee

    Abstract: Financial assets exhibit complex dependency structures, which are crucial for investors to create diversified portfolios to mitigate risk in volatile financial markets. To explore the financial asset dependencies dynamics, we propose a novel approach that models the dependencies of assets as an Asset Dependency Matrix (ADM) and treats the ADM sequences as image sequences. This allows us to leverag…

    Submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2406.11105  [pdf, other]

    cs.CV cs.AI

    Exploiting Diffusion Prior for Out-of-Distribution Detection

    Authors: Armando Zhu, Jiabei Liu, Keqin Li, Shuying Dai, Bo Hong, Peng Zhao, Changsong Wei

    Abstract: Out-of-distribution (OOD) detection is crucial for deploying robust machine learning models, especially in areas where security is critical. However, traditional OOD detection methods often fail to capture complex data distributions from large-scale data. In this paper, we present a novel approach for OOD detection that leverages the generative ability of diffusion models and the powerful feature…

    Submitted 16 June, 2024; originally announced June 2024.

  5. arXiv:2406.03001  [pdf, other]

    cs.CV cs.AI

    EdgeSync: Faster Edge-model Updating via Adaptive Continuous Learning for Video Data Drift

    Authors: Peng Zhao, Runchu Dong, Guiqin Wang, Cong Zhao

    Abstract: Real-time video analytics systems typically place models with fewer weights on edge devices to reduce latency. The distribution of video content features may change over time for various reasons (e.g., light and weather changes), leading to accuracy degradation of existing models. To solve this problem, recent work proposes a framework that uses a remote server to continually train and adapt the li…

    Submitted 5 June, 2024; originally announced June 2024.

  6. arXiv:2406.01359  [pdf, other]

    cs.CL cs.SE

    R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models

    Authors: Ken Deng, Jiaheng Liu, He Zhu, Congnan Liu, Jingxin Li, Jiakai Wang, Peng Zhao, Chenchen Zhang, Yanan Wu, Xueqiao Yin, Yuanxing Zhang, Wenbo Su, Bangyu Xiang, Tiezheng Ge, Bo Zheng

    Abstract: Code completion models have made significant progress in recent years. Recently, repository-level code completion has drawn more attention in modern software development, and several baseline methods and benchmarks have been proposed. However, existing repository-level code completion methods often fall short of fully using the extensive context of a project repository, such as the intricacies of…

    Submitted 3 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  7. arXiv:2405.19705  [pdf, ps, other]

    cs.LG math.OC

    Universal Online Convex Optimization with $1$ Projection per Round

    Authors: Wenhao Yang, Yibo Wang, Peng Zhao, Lijun Zhang

    Abstract: To address the uncertainty in function types, recent progress in online convex optimization (OCO) has spurred the development of universal algorithms that simultaneously attain minimax rates for multiple types of convex functions. However, for a $T$-round online problem, state-of-the-art methods typically conduct $O(\log T)$ projections onto the domain in each round, a process potentially time-con…

    Submitted 30 May, 2024; originally announced May 2024.

  8. arXiv:2405.19661  [pdf, other]

    cs.LG

    MGCP: A Multi-Grained Correlation based Prediction Network for Multivariate Time Series

    Authors: Zhicheng Chen, Xi Xiao, Ke Xu, Zhong Zhang, Yu Rong, Qing Li, Guojun Gan, Zhiqiang Xu, Peilin Zhao

    Abstract: Multivariate time series prediction is widely used in daily life, which poses significant challenges due to the complex correlations that exist at multi-grained levels. Unfortunately, the majority of current time series prediction models fail to simultaneously learn the correlations of multivariate time series at multi-grained levels, resulting in suboptimal performance. To address this, we propos…

    Submitted 29 May, 2024; originally announced May 2024.

  9. arXiv:2405.17079  [pdf, other]

    stat.ML cs.LG

    Learning with User-Level Local Differential Privacy

    Authors: Puning Zhao, Li Shen, Rongfei Fan, Qingming Li, Huiwen Wu, Jiafei Wu, Zhe Liu

    Abstract: User-level privacy is important in distributed systems. Previous research primarily focuses on the central model, while the local models have received much less attention. Under the central model, user-level DP is strictly stronger than the item-level one. However, under the local model, the relationship between user-level and item-level LDP becomes more complex, thus the analysis is crucially dif…

    Submitted 27 May, 2024; originally announced May 2024.

  10. arXiv:2405.17061  [pdf, ps, other]

    cs.LG

    Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation

    Authors: Long-Fei Li, Yu-Jie Zhang, Peng Zhao, Zhi-Hua Zhou

    Abstract: We study a new class of MDPs that employs multinomial logit (MNL) function approximation to ensure valid probability distributions over the state space. Despite its benefits, introducing non-linear function approximation raises significant challenges in both computational and statistical efficiency. The best-known method of Hwang and Oh [2023] has achieved an…

    Submitted 27 May, 2024; originally announced May 2024.

  11. arXiv:2405.17051  [pdf, other]

    cs.LG cs.AI

    BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics

    Authors: Hao Wu, Xingjian Shi, Ziyue Huang, Penghao Zhao, Wei Xiong, Jinbao Xue, Yangyu Tao, Xiaomeng Huang, Weiyan Wang

    Abstract: Data-driven deep learning has emerged as the new paradigm to model complex physical space-time systems. These data-driven methods learn patterns by optimizing statistical metrics and tend to overlook the adherence to physical laws, unlike traditional model-driven numerical methods. Thus, they often generate predictions that are not physically realistic. On the other hand, by sampling a large amoun…

    Submitted 27 May, 2024; originally announced May 2024.

  12. arXiv:2405.16797  [pdf]

    cs.SD cs.AI eess.AS

    A Real-Time Voice Activity Detection Based On Lightweight Neural

    Authors: Jidong Jia, Pei Zhao, Di Wang

    Abstract: Voice activity detection (VAD) is the task of detecting speech in an audio stream, which is challenging due to numerous unseen noises and low signal-to-noise ratios in real environments. Recently, neural network-based VADs have alleviated the degradation of performance to some extent. However, the majority of existing studies have employed excessively large models and incorporated future context,…

    Submitted 26 May, 2024; originally announced May 2024.

  13. arXiv:2405.15150  [pdf, other]

    cs.LG

    Enhancing Learning with Label Differential Privacy by Vector Approximation

    Authors: Puning Zhao, Rongfei Fan, Huiwen Wu, Qingming Li, Jiafei Wu, Zhe Liu

    Abstract: Label differential privacy (DP) is a framework that protects the privacy of labels in training datasets, while the feature vectors are public. Existing approaches protect the privacy of labels by flipping them randomly, and then train a model to make the output approximate the privatized label. However, as the number of classes $K$ increases, stronger randomization is needed, thus the performances…

    Submitted 23 May, 2024; originally announced May 2024.

  14. arXiv:2405.14578  [pdf, other]

    cs.LG

    Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling

    Authors: Shuaipeng Li, Penghao Zhao, Hailin Zhang, Xingwu Sun, Hao Wu, Dian Jiao, Weiyan Wang, Chengjun Liu, Zheng Fang, Jinbao Xue, Yangyu Tao, Bin Cui, Di Wang

    Abstract: In current deep learning tasks, Adam style optimizers such as Adam, Adagrad, RMSProp, Adafactor, and Lion have been widely used as alternatives to SGD style optimizers. These optimizers typically update model parameters using the sign of gradients, resulting in more stable convergence curves. The learning rate and the batch size are the most critical hyperparameters for optimizers, which require c…

    Submitted 4 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  15. arXiv:2405.14291  [pdf, other]

    cs.LG cs.AI cs.DC

    Variational Bayes for Federated Continual Learning

    Authors: Dezhong Yao, Sanmu Li, Yutong Dai, Zhiqiang Xu, Shengshan Hu, Peilin Zhao, Lichao Sun

    Abstract: Federated continual learning (FCL) has received increasing attention due to its potential in handling real-world streaming data, characterized by evolving data distributions and varying client classes over time. The constraints of storage limitations and privacy concerns confine local models to exclusively access the present data within each learning cycle. Consequently, this restriction induces p…

    Submitted 23 May, 2024; originally announced May 2024.

  16. arXiv:2405.13746  [pdf, other]

    cs.LG cs.AI cs.DC

    CG-FedLLM: How to Compress Gradients in Federated Fune-tuning for Large Language Models

    Authors: Huiwen Wu, Xiaohan Li, Deyi Zhang, Xiaogang Xu, Jiafei Wu, Puning Zhao, Zhe Liu

    Abstract: The success of current Large-Language Models (LLMs) hinges on extensive training data that is collected and stored centrally, called Centralized Learning (CL). However, such a collection manner poses a privacy threat, and one potential solution is Federated Learning (FL), which transfers gradients, not raw data, among clients. Unlike traditional networks, FL for LLMs incurs significant communicati…

    Submitted 23 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  17. arXiv:2405.13584  [pdf, other]

    cs.LG cs.DC

    Emulating Full Client Participation: A Long-Term Client Selection Strategy for Federated Learning

    Authors: Qingming Li, Juzheng Miao, Puning Zhao, Li Zhou, Shouling Ji, Bowen Zhou, Furui Liu

    Abstract: Client selection significantly affects the system convergence efficiency and is a crucial problem in federated learning. Existing methods often select clients by evaluating each round individually and overlook the necessity for long-term optimization, resulting in suboptimal performance and potential fairness issues. In this study, we propose a novel client selection strategy designed to emulate t…

    Submitted 22 May, 2024; originally announced May 2024.

  18. arXiv:2405.13453  [pdf, other]

    cs.LG cs.CR

    A Huber Loss Minimization Approach to Mean Estimation under User-level Differential Privacy

    Authors: Puning Zhao, Lifeng Lai, Li Shen, Qingming Li, Jiafei Wu, Zhe Liu

    Abstract: Privacy protection of users' entire contribution of samples is important in distributed systems. The most effective approach is the two-stage scheme, which finds a small interval first and then gets a refined estimate by clipping samples into the interval. However, the clipping operation induces bias, which is serious if the sample distribution is heavy-tailed. Besides, users with large local samp…

    Submitted 22 May, 2024; originally announced May 2024.

  19. arXiv:2405.10570  [pdf]

    eess.IV cs.AI

    Simultaneous Deep Learning of Myocardium Segmentation and T2 Quantification for Acute Myocardial Infarction MRI

    Authors: Yirong Zhou, Chengyan Wang, Mengtian Lu, Kunyuan Guo, Zi Wang, Dan Ruan, Rui Guo, Peijun Zhao, Jianhua Wang, Naiming Wu, Jianzhong Lin, Yinyin Chen, Hang Jin, Lianxin Xie, Lilan Wu, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Xiaobo Qu

    Abstract: In cardiac Magnetic Resonance Imaging (MRI) analysis, simultaneous myocardial segmentation and T2 quantification are crucial for assessing myocardial pathologies. Existing methods often address these tasks separately, limiting their synergistic potential. To address this, we propose SQNet, a dual-task network integrating Transformer and Convolutional Neural Network (CNN) components. SQNet features…

    Submitted 29 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 10 pages, 8 figures, 6 tables

  20. arXiv:2405.01990  [pdf, other]

    cs.LG

    Soft Label PU Learning

    Authors: Puning Zhao, Jintao Deng, Xu Cheng

    Abstract: PU learning refers to the classification problem in which only part of positive samples are labeled. Existing PU learning methods treat unlabeled samples equally. However, in many real tasks, from common sense or domain knowledge, some unlabeled samples are more likely to be positive than others. In this paper, we propose soft label PU learning, in which unlabeled data are assigned soft labels acc…

    Submitted 3 May, 2024; originally announced May 2024.

  21. Cross-Task Multi-Branch Vision Transformer for Facial Expression and Mask Wearing Classification

    Authors: Armando Zhu, Keqin Li, Tong Wu, Peng Zhao, Bo Hong

    Abstract: With wearing masks becoming a new cultural norm, facial expression recognition (FER) while taking masks into account has become a significant challenge. In this paper, we propose a unified multi-branch vision transformer for facial expression recognition and mask wearing classification tasks. Our approach extracts shared features for both tasks using a dual-branch architecture that obtains multi-s…

    Submitted 30 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Journal ref: Journal of Computer Technology and Applied Mathematics, vol. 1, no. 1, Apr. 2024, pp. 46-53

  22. arXiv:2404.13630  [pdf]

    cs.SE cs.AI cs.CL cs.LG

    Utilizing Deep Learning to Optimize Software Development Processes

    Authors: Keqin Li, Armando Zhu, Peng Zhao, Jintong Song, Jiabei Liu

    Abstract: This study explores the application of deep learning technologies in software development processes, particularly in automating code reviews, error prediction, and test generation to enhance code quality and development efficiency. Through a series of empirical studies, experimental groups using deep learning tools and control groups using traditional methods were compared in terms of code error r…

    Submitted 3 May, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Report number: JCTAM-2024042100074

  23. arXiv:2404.01650  [pdf, other]

    cs.LG

    Test-Time Model Adaptation with Only Forward Passes

    Authors: Shuaicheng Niu, Chunyan Miao, Guohao Chen, Pengcheng Wu, Peilin Zhao

    Abstract: Test-time adaptation has proven effective in adapting a given trained model to unseen test samples with potential distribution shifts. However, in real-world scenarios, models are usually deployed on resource-limited devices, e.g., FPGAs, and are often quantized and hard-coded with non-modifiable parameters for acceleration. In light of this, existing methods are often infeasible since they heavil…

    Submitted 29 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 18 pages, 4 figures, 17 tables, accepted by International Conference on Machine Learning

  24. arXiv:2404.00292  [pdf, other]

    cs.CV

    LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion

    Authors: Pancheng Zhao, Peng Xu, Pengda Qin, Deng-Ping Fan, Zhicheng Zhang, Guoli Jia, Bowen Zhou, Jufeng Yang

    Abstract: Camouflaged vision perception is an important vision task with numerous practical applications. Due to the expensive collection and labeling costs, this community struggles with a major bottleneck that the species category of its datasets is limited to a small number of object species. However, the existing camouflaged generation methods require specifying the background manually, thus failing to…

    Submitted 12 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024, Fig.3 revised

  25. arXiv:2403.19839  [pdf, other]

    cs.LG cs.AI cs.CL

    The New Agronomists: Language Models are Experts in Crop Management

    Authors: Jing Wu, Zhixin Lai, Suiyao Chen, Ran Tao, Pan Zhao, Naira Hovakimyan

    Abstract: Crop management plays a crucial role in determining crop yield, economic profitability, and environmental sustainability. Despite the availability of management guidelines, optimizing these practices remains a complex and multifaceted challenge. In response, previous studies have explored using reinforcement learning with crop simulators, typically employing simple neural-network-based reinforceme…

    Submitted 28 March, 2024; originally announced March 2024.

  26. arXiv:2403.14958  [pdf, other]

    cs.LG cs.CL math.OC

    Adapprox: Adaptive Approximation in Adam Optimization via Randomized Low-Rank Matrices

    Authors: Pengxiang Zhao, Ping Li, Yingjie Gu, Yi Zheng, Stephan Ludger Kölker, Zhefeng Wang, Xiaoming Yuan

    Abstract: As deep learning models exponentially increase in size, optimizers such as Adam encounter significant memory consumption challenges due to the storage of first and second moment data. Current memory-efficient methods like Adafactor and CAME often compromise accuracy with their matrix factorization techniques. Addressing this, we introduce Adapprox, a novel approach that employs randomized low-rank…

    Submitted 22 March, 2024; originally announced March 2024.

  27. arXiv:2403.13267  [pdf, other]

    cs.SI

    Dynamic Information Dissemination Model Incorporating Non-Adjacent Node Interaction

    Authors: Xinyu Li, Jinyang Huang, Xiang Zhang, Peng Zhao, Meng Wang, Guohang Zhuang, Huan Yan, Xiao Sun, Meng Wang

    Abstract: Describing the dynamics of information dissemination within social networks poses a formidable challenge. Despite multiple endeavors aimed at addressing this issue, only a limited number of studies have effectively replicated and forecasted the evolving course of information dissemination. In this paper, we propose a novel model, DM-NAI, which not only considers the information transfer between ad…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.06385

  28. arXiv:2403.11491  [pdf, other]

    cs.LG

    Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting

    Authors: Mingkui Tan, Guohao Chen, Jiaxiang Wu, Yifan Zhang, Yaofo Chen, Peilin Zhao, Shuaicheng Niu

    Abstract: Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample. Although recent TTA has shown promising performance, we still face two key challenges: 1) prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs to many applications; 2) while existing TTA can signi…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 20 pages, 14 tables, 11 figures. arXiv admin note: substantial text overlap with arXiv:2204.02610

  29. arXiv:2403.07029  [pdf, other]

    cs.CR

    A Model for Assessing Network Asset Vulnerability Using QPSO-LightGBM

    Authors: Xinyu Li, Yu Gu, Chenwei Wang, Peng Zhao

    Abstract: With the continuous development of computer technology and network technology, the scale of the network continues to expand, the network space tends to be complex, and the application of computers and networks has reached deeply into politics, the military, finance, electricity, and other important fields. When security events do not occur, the vulnerability assessment of these high-risk network asse…

    Submitted 10 March, 2024; originally announced March 2024.

  30. arXiv:2403.06141  [pdf, other]

    cs.SI

    Information Dissemination Model Based on User Attitude and Public Opinion Environment

    Authors: Xinyu Li, Jinyang Huang, Xiang Zhang, Peng Zhao, Meng Wang, Guohang Zhuang, Huan Yan, Xiao Sun, Meng Wang

    Abstract: Modeling the information dissemination process in social networks is a challenging problem. Despite numerous attempts to address this issue, existing studies often assume that user attitudes have only one opportunity to alter during the information dissemination process. Additionally, these studies tend to consider the transformation of user attitudes as solely influenced by a single user, overloo…

    Submitted 10 March, 2024; originally announced March 2024.

  31. arXiv:2403.05018  [pdf, other]

    cs.CV

    InstructGIE: Towards Generalizable Image Editing

    Authors: Zichong Meng, Changdi Yang, Jun Liu, Hao Tang, Pu Zhao, Yanzhi Wang

    Abstract: Recent advances in image editing have been driven by the development of denoising diffusion models, marking a significant leap forward in this field. Despite these advances, the generalization capabilities of recent image editing approaches remain constrained. In response to this challenge, our study introduces a novel image editing framework with enhanced generalization robustness by boosting in-…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Preprint

  32. arXiv:2403.05016  [pdf, other]

    cs.CV

    DiffClass: Diffusion-Based Class Incremental Learning

    Authors: Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang

    Abstract: Class Incremental Learning (CIL) is challenging due to catastrophic forgetting. On top of that, Exemplar-free Class Incremental Learning is even more challenging due to forbidden access to previous task data. Recent exemplar-free CIL methods attempt to mitigate catastrophic forgetting by synthesizing previous task data. However, they fail to overcome the catastrophic forgetting due to the inabilit…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Preprint

  33. arXiv:2403.04568  [pdf, other]

    cs.LG stat.ML

    Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition

    Authors: Long-Fei Li, Peng Zhao, Zhi-Hua Zhou

    Abstract: We study reinforcement learning with linear function approximation, unknown transition, and adversarial losses in the bandit feedback setting. Specifically, we focus on linear mixture MDPs whose transition kernel is a linear mixture model. We propose a new algorithm that attains an $\widetilde{O}(d\sqrt{HS^3K} + \sqrt{HSAK})$ regret with high probability, where $d$ is the dimension of feature mapp…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: AISTATS 2024

  34. arXiv:2403.03193  [pdf, other]

    cs.PL cs.DB

    VeriEQL: Bounded Equivalence Verification for Complex SQL Queries with Integrity Constraints

    Authors: Yang He, Pinhan Zhao, Xinyu Wang, Yuepeng Wang

    Abstract: The task of SQL query equivalence checking is important in various real-world applications (including query rewriting and automated grading) that involve complex queries with integrity constraints; yet, state-of-the-art techniques are very limited in their capability of reasoning about complex features (e.g., those that involve sorting, case statement, rich integrity constraints, etc.) in real-lif…

    Submitted 15 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: OOPSLA 2024

  35. arXiv:2403.00376  [pdf, other]

    cs.CV cs.AI cs.LG

    Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model

    Authors: Huan Ma, Yan Zhu, Changqing Zhang, Peilin Zhao, Baoyuan Wu, Long-Kai Huang, Qinghua Hu, Bingzhe Wu

    Abstract: Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data. However, these models also display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of "decision shortcuts" that hinder their generalization capabilities. In this wo…

    Submitted 3 June, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

  36. arXiv:2402.19473  [pdf, other]

    cs.CV

    Retrieval-Augmented Generation for AI-Generated Content: A Survey

    Authors: Penghao Zhao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, Jie Jiang, Bin Cui

    Abstract: Advancements in model algorithms, the growth of foundational models, and access to high-quality datasets have propelled the evolution of Artificial Intelligence Generated Content (AIGC). Despite its notable successes, AIGC still faces hurdles such as updating knowledge, handling long-tail data, mitigating data leakage, and managing high training and inference costs. Retrieval-Augmented Generation…

    Submitted 21 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Citing 353 papers, 22 pages, 1 table, 12 figures. Project: https://github.com/PKU-DAIR/RAG-Survey

  37. arXiv:2402.17531  [pdf, other]

    cs.SE cs.AI cs.CL

    Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides

    Authors: Kaikai An, Fangkai Yang, Junting Lu, Liqun Li, Zhixing Ren, Hao Huang, Lu Wang, Pu Zhao, Yu Kang, Hua Ding, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

    Abstract: Effective incident management is pivotal for the smooth operation of enterprise-level cloud services. In order to expedite incident mitigation, service teams compile troubleshooting knowledge into Troubleshooting Guides (TSGs) accessible to on-call engineers (OCEs). While automated pipelines are enabled to resolve the most frequent and easy incidents, there still exist complex incidents that requ…

    Submitted 10 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Work in progress

  38. arXiv:2402.15980  [pdf, other]

    cs.SI

    Signed Graph Representation Learning: A Survey

    Authors: Zeyu Zhang, Peiyao Zhao, Xin Li, Jiamou Liu, Xinrui Zhang, Junjie Huang, Xiaofeng Zhu

    Abstract: With the prevalence of social media, the connectedness between people has been greatly enhanced. Real-world relations between users on social media are often not limited to expressing positive ties such as friendship, trust, and agreement, but they also reflect negative ties such as enmity, mistrust, and disagreement, which can be well modelled by signed graphs. Signed Graph Representation Learnin…

    Submitted 24 February, 2024; originally announced February 2024.

  39. arXiv:2402.12928  [pdf, other]

    cs.DL cs.AI cs.CV

    A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence

    Authors: Penghai Zhao, Xin Zhang, Ming-Ming Cheng, Jian Yang, Xiang Li

    Abstract: By consolidating scattered knowledge, the literature review provides a comprehensive understanding of the investigated topic. However, reading, conducting, or peer-reviewing review papers generally demands a significant investment of time and effort from researchers. To improve efficiency, this paper aims to provide a thorough review of reviews in the PAMI field from diverse perspectives. First, t…

    Submitted 24 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: IEEE version v1. [February 19, 2024] IEEE version v2 with typos fixed. [February 23, 2024] IEEE version v3 with errors fixed. [February 29, 2024] IEEE version v4 with improved quality. [February 29, 2024]

  40. arXiv:2402.10787  [pdf, other]

    cs.LG cs.AI cs.CL

    EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge

    Authors: Xuan Shen, Zhenglun Kong, Changdi Yang, Zhaoyang Han, Lei Lu, Peiyan Dong, Cheng Lyu, Chih-hsiang Li, Xuehang Guo, Zhihao Shu, Wei Niu, Miriam Leeser, Pu Zhao, Yanzhi Wang

    Abstract: Despite the remarkable strides of Large Language Models (LLMs) in various fields, the wide applications of LLMs on edge devices are limited due to their massive parameters and computations. To address this, quantization is commonly adopted to generate lightweight LLMs with efficient computations and fast inference. However, Post-Training Quantization (PTQ) methods dramatically degrade in quality w…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Preprint

  41. arXiv:2402.07610  [pdf, other]

    cs.CL cs.AI

    Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping

    Authors: Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao

    Abstract: Self-alignment is an effective way to reduce the cost of human annotation while ensuring promising model capability. However, most current methods complete the data collection and training steps in a single round, which may overlook the continuously improving ability of self-aligned models. This gives rise to a key query: What if we do multi-time bootstrapping self-alignment? Does this strategy en…

    Submitted 27 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  42. arXiv:2402.06642  [pdf, other]

    q-fin.ST cs.LG

    From GARCH to Neural Network for Volatility Forecast

    Authors: Pengfei Zhao, Haoren Zhu, Wilfred Siu Hung NG, Dik Lun Lee

    Abstract: Volatility, as a measure of uncertainty, plays a crucial role in numerous financial activities such as risk management. The Econometrics and Machine Learning communities have developed two distinct approaches for financial volatility forecasting: the stochastic approach and the neural network (NN) approach. Despite their individual strengths, these methodologies have conventionally evolved in sepa…

    Submitted 29 January, 2024; originally announced February 2024.

    Comments: Accepted by AAAI'24

  43. arXiv:2402.03139  [pdf, other]

    cs.LG

    Enhancing Neural Subset Selection: Integrating Background Information into Set Representations

    Authors: Binghui Xie, Yatao Bian, Kaiwen Zhou, Yongqiang Chen, Peilin Zhao, Bo Han, Wei Meng, James Cheng

    Abstract: Learning neural subset selection tasks, such as compound selection in AI-aided drug discovery, have become increasingly pivotal across diverse applications. The existing methodologies in the field primarily concentrate on constructing models that capture the relationship between utility function values and subsets within their respective supersets. However, these approaches tend to overlook the va…

    Submitted 9 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  44. arXiv:2402.02211  [pdf, other]

    cs.LG cs.DS

    Query-decision Regression between Shortest Path and Minimum Steiner Tree

    Authors: Guangmo Tong, Peng Zhao, Mina Samizadeh

    Abstract: Considering a graph with unknown weights, can we find the shortest path for a pair of nodes if we know the minimal Steiner trees associated with some subset of nodes? That is, with respect to a fixed latent decision-making system (e.g., a weighted graph), we seek to solve one optimization problem (e.g., the shortest path problem) by leveraging information associated with another optimization probl…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: PAKDD 2024

  45. arXiv:2402.00822  [pdf, other]

    cs.HC cs.AI

    WiOpen: A Robust Wi-Fi-based Open-set Gesture Recognition Framework

    Authors: Xiang Zhang, Jingyang Huang, Huan Yan, Peng Zhao, Guohang Zhuang, Zhi Liu, Bin Liu

    Abstract: Recent years have witnessed a growing interest in Wi-Fi-based gesture recognition. However, existing works have predominantly focused on closed-set paradigms, where all testing gestures are predefined during training. This poses a significant challenge in real-world applications, as unseen gestures might be misclassified as known classes during testing. To address this issue, we propose WiOpen, a…

    Submitted 1 February, 2024; originally announced February 2024.

  46. arXiv:2402.00350  [pdf, other]

    cs.SE cs.AI

    Large Language Models Based Fuzzing Techniques: A Survey

    Authors: Linghan Huang, Peizhou Zhao, Huaming Chen, Lei Ma

    Abstract: In the modern era where software plays a pivotal role, software security and vulnerability analysis have become essential for software development. Fuzzing test, as an efficient software testing method, are widely used in various domains. Moreover, the rapid development of Large Language Models (LLMs) has facilitated their application in the field of software testing, demonstrating remarkable perf…

    Submitted 7 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 9 pages submission under review

  47. arXiv:2402.00034  [pdf, other]

    cs.DC cs.AI

    Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction

    Authors: Haozhe Li, Minghua Ma, Yudong Liu, Pu Zhao, Lingling Zheng, Ze Li, Yingnong Dang, Murali Chintalapati, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

    Abstract: With the rapid growth of cloud computing, a variety of software services have been deployed in the cloud. To ensure the reliability of cloud services, prior studies focus on failure instance (disk, node, and switch, etc.) prediction. Once the output of prediction is positive, mitigation actions are taken to rapidly resolve the underlying failure. According to our real-world practice in Microsoft A…

    Submitted 7 January, 2024; originally announced February 2024.

    ACM Class: K.6.3; I.2.0

  48. arXiv:2401.16766  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV

    Detection and Recovery Against Deep Neural Network Fault Injection Attacks Based on Contrastive Learning

    Authors: Chenan Wang, Pu Zhao, Siyue Wang, Xue Lin

    Abstract: Deep Neural Network (DNN) models when implemented on executing devices as the inference engines are susceptible to Fault Injection Attacks (FIAs) that manipulate model parameters to disrupt inference execution with disastrous performance. This work introduces Contrastive Learning (CL) of visual representations i.e., a self-supervised learning approach into the deep learning training and inference…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: Published in AdvML 2021

  49. arXiv:2401.14913  [pdf, ps, other]

    cs.SE cs.PL

    On Repairing Quantum Programs Using ChatGPT

    Authors: Xiaoyu Guo, Jianjun Zhao, Pengzhan Zhao

    Abstract: Automated Program Repair (APR) is a vital area in software engineering aimed at generating automatic patches for vulnerable programs. While numerous techniques have been proposed for repairing classical programs, the realm of quantum programming lacks a comparable automated repair technique. In this initial exploration, we investigate the use of ChatGPT for quantum program repair and evaluate its…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: The 5th International Workshop on Quantum Software Engineering (Q-SE 2024)

  50. arXiv:2401.13270  [pdf, other]

    cs.CV cs.AI

    Audio-Infused Automatic Image Colorization by Exploiting Audio Scene Semantics

    Authors: Pengcheng Zhao, Yanxiang Chen, Yang Zhao, Wei Jia, Zhao Zhang, Ronggang Wang, Richang Hong

    Abstract: Automatic image colorization is inherently an ill-posed problem with uncertainty, which requires an accurate semantic understanding of scenes to estimate reasonable colors for grayscale images. Although recent interaction-based methods have achieved impressive performance, it is still a very difficult task to infer realistic and accurate colors for automatic colorization. To reduce the difficulty…

    Submitted 24 January, 2024; originally announced January 2024.