Skip to main content

Showing 1–50 of 856 results for author: Gao, X

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.01219  [pdf, other

    cs.CL

    Searching for Best Practices in Retrieval-Augmented Generation

    Authors: Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Xuan**g Huang

    Abstract: Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrievals, these approaches still suffer from their complex implementation and prolong… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.00752  [pdf, other

    cs.CV cs.AI

    Chest-Diffusion: A Light-Weight Text-to-Image Model for Report-to-CXR Generation

    Authors: Peng Huang, Xue Gao, Lihong Huang, **g Jiao, Xiaokang Li, Yuanyuan Wang, Yi Guo

    Abstract: Text-to-image generation has important implications for generation of diverse and controllable images. Several attempts have been made to adapt Stable Diffusion (SD) to the medical domain. However, the large distribution difference between medical reports and natural texts, as well as high computational complexity in common stable diffusion limit the authenticity and feasibility of the generated m… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

  3. Personalized Federated Continual Learning via Multi-granularity Prompt

    Authors: Hao Yu, Xin Yang, Xin Gao, Yan Kang, Hao Wang, Junbo Zhang, Tianrui Li

    Abstract: Personalized Federated Continual Learning (PFCL) is a new practical scenario that poses greater challenges in sharing and personalizing knowledge. PFCL not only relies on knowledge fusion for server aggregation at the global spatial-temporal perspective but also needs model improvement for each client according to the local requirements. Existing methods, whether in Personalized Federated Learning… ▽ More

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Accepted by KDD 2024 Research Track

  4. arXiv:2406.19583  [pdf, other

    cs.IT

    Interference Cancellation Information Geometry Approach for Massive MIMO Channel Estimation

    Authors: An-An Lu, Bingyan Liu, Xiqi Gao

    Abstract: In this paper, the interference cancellation information geometry approaches (IC-IGAs) for massive MIMO channel estimation are proposed. The proposed algorithms are low-complexity approximations of the minimum mean square error (MMSE) estimation. To illustrate the proposed algorithms, a unified framework of the information geometry approach for channel estimation and its geometric explanation are… ▽ More

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 38 pages, 9 figures

  5. arXiv:2406.19130  [pdf, other

    cs.CV

    Evidential Concept Embedding Models: Towards Reliable Concept Explanations for Skin Disease Diagnosis

    Authors: Yibo Gao, Zheyao Gao, Xin Gao, Yuanye Liu, Bomin Wang, Xiahai Zhuang

    Abstract: Due to the high stakes in medical decision-making, there is a compelling demand for interpretable deep learning methods in medical image analysis. Concept Bottleneck Models (CBM) have emerged as an active interpretable framework incorporating human-interpretable concepts into decision-making. However, their concept predictions may lack reliability when applied to clinical diagnosis, impeding conce… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: accepted by MICCAI 2024

  6. arXiv:2406.18938  [pdf, other

    cs.IR

    Towards Personalized Federated Multi-scenario Multi-task Recommendation

    Authors: Yue Ding, Yanbiao Ji, Xun Cai, Xin Xin, Xiaofeng Gao, Hongtao Lu

    Abstract: In modern recommender system applications, such as e-commerce, predicting multiple targets like click-through rate (CTR) and post-view click-through \& conversion rate (CTCVR) is common. Multi-task recommender systems are gaining traction in research and practical use. Existing multi-task recommender systems tackle diverse business scenarios, merging and modeling these scenarios unlocks shared kno… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  7. arXiv:2406.18873  [pdf, other

    cs.AR

    LayoutCopilot: An LLM-powered Multi-agent Collaborative Framework for Interactive Analog Layout Design

    Authors: Bingyang Liu, Haoyi Zhang, Xiaohan Gao, Zichen Kong, Xiyuan Tang, Yibo Lin, Runsheng Wang, Ru Huang

    Abstract: Analog layout design heavily involves interactive processes between humans and design tools. The tools are usually designed to use scripting commands or visualized buttons for manipulation, especially for those interactive automation functionalities, which have a steep learning curve and cumbersome user experience, making a notable barrier to their adoption by designers. Aiming to address such a u… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 8pages, 8figures

  8. arXiv:2406.18588  [pdf, other

    cs.CV cs.LG

    Varying Manifolds in Diffusion: From Time-varying Geometries to Visual Saliency

    Authors: Junhao Chen, Manyi Li, Zherong Pan, Xifeng Gao, Changhe Tu

    Abstract: Deep generative models learn the data distribution, which is concentrated on a low-dimensional manifold. The geometric analysis of distribution transformation provides a better understanding of data structure and enables a variety of applications. In this paper, we study the geometric properties of the diffusion model, whose forward diffusion process and reverse generation process construct a seri… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  9. arXiv:2406.18539  [pdf, other

    cs.CV cs.GR

    TexPainter: Generative Mesh Texturing with Multi-view Consistency

    Authors: Hongkun Zhang, Zherong Pan, Congyi Zhang, Lifeng Zhu, Xifeng Gao

    Abstract: The recent success of pre-trained diffusion models unlocks the possibility of the automatic generation of textures for arbitrary 3D meshes in the wild. However, these models are trained in the screen space, while converting them to a multi-view consistent texture image poses a major obstacle to the output quality. In this paper, we propose a novel method to enforce multi-view consistency. Our meth… ▽ More

    Submitted 17 May, 2024; originally announced June 2024.

    Comments: accepted by Siggraph 2024

  10. arXiv:2406.17260  [pdf, other

    cs.CL

    Mitigating Hallucination in Fictional Character Role-Play

    Authors: Nafis Sadeq, Zhouhang Xie, Byungkyu Kang, Prarit Lamba, Xiang Gao, Julian McAuley

    Abstract: Role-playing has wide-ranging applications in customer support, embodied agents, computational social science, etc. The influence of parametric world knowledge of large language models (LLMs) often causes role-playing characters to act out of character and hallucinate about things outside the scope of their knowledge. In this work, we focus on the evaluation and mitigation of hallucination in fict… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  11. arXiv:2406.16038  [pdf, other

    cs.CV

    LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control

    Authors: Delin Qu, Qizhi Chen, **rui Zhang, Xianqiang Gao, Bin Zhao, Dong Wang, Xuelong Li

    Abstract: This paper aims to advance the progress of physical world interactive scene reconstruction by extending the interactive object reconstruction from single object level to complex scene level. To this end, we first construct one simulated and one real scene-level physical interaction dataset containing 28 scenes with multiple interactive objects per scene. Furthermore, to accurately model the intera… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  12. arXiv:2406.14963  [pdf, other

    cs.LG

    Optimised Grouped-Query Attention Mechanism for Transformers

    Authors: Yuang Chen, Cheng Zhang, Xitong Gao, Robert D. Mullins, George A. Constantinides, Yiren Zhao

    Abstract: Grouped-query attention (GQA) has been widely adopted in LLMs to mitigate the complexity of multi-head attention (MHA). To transform an MHA to a GQA, neighbour queries in MHA are evenly split into groups where each group shares the value and key layers. In this work, we propose AsymGQA, an activation-informed approach to asymmetrically grou** an MHA to a GQA for better model performance. Our Asy… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML2024 ES-FoMo-II Workshop

  13. arXiv:2406.14956  [pdf, other

    cs.LG cs.CL

    Unlocking the Global Synergies in Low-Rank Adapters

    Authors: Zixi Zhang, Cheng Zhang, Xitong Gao, Robert D. Mullins, George A. Constantinides, Yiren Zhao

    Abstract: Low-rank Adaption (LoRA) has been the de-facto parameter-efficient fine-tuning technique for large language models. We present HeteroLoRA, a light-weight search algorithm that leverages zero-cost proxies to allocate the limited LoRA trainable parameters across the model for better fine-tuned performance. In addition to the allocation for the standard LoRA-adapted models, we also demonstrate the ef… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML2024 ES-FoMo-II Workshop

  14. arXiv:2406.13200  [pdf, other

    cs.LG

    RobGC: Towards Robust Graph Condensation

    Authors: Xinyi Gao, Hongzhi Yin, Tong Chen, Guanhua Ye, Wentao Zhang, Bin Cui

    Abstract: Graph neural networks (GNNs) have attracted widespread attention for their impressive capability of graph representation learning. However, the increasing prevalence of large-scale graphs presents a significant challenge for GNN training due to their computational demands, limiting the applicability of GNNs in various scenarios. In response to this challenge, graph condensation (GC) is proposed as… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  15. arXiv:2406.11145  [pdf, other

    cs.CV

    Federated Face Forgery Detection Learning with Personalized Representation

    Authors: Decheng Liu, Zhan Dang, Chunlei Peng, Nannan Wang, Ruimin Hu, Xinbo Gao

    Abstract: Deep generator technology can produce high-quality fake videos that are indistinguishable, posing a serious social threat. Traditional forgery detection methods directly centralized training on data and lacked consideration of information sharing in non-public video data scenarios and data privacy. Naturally, the federated learning strategy can be applied for privacy protection, which aggregates m… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: The code is publicly available

  16. arXiv:2406.10933  [pdf, other

    cs.CV

    Improving Adversarial Robustness via Decoupled Visual Representation Masking

    Authors: Decheng Liu, Tao Chen, Chunlei Peng, Nannan Wang, Ruimin Hu, Xinbo Gao

    Abstract: Deep neural networks are proven to be vulnerable to fine-designed adversarial examples, and adversarial defense algorithms draw more and more attention nowadays. Pre-processing based defense is a major strategy, as well as learning robust feature representation has been proven an effective way to boost generalization. However, existing defense works lack considering different depth-level visual fe… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: The code is publicly available

  17. arXiv:2406.10887  [pdf, other

    cs.CV

    Imperceptible Face Forgery Attack via Adversarial Semantic Mask

    Authors: Decheng Liu, Qixuan Su, Chunlei Peng, Nannan Wang, Xinbo Gao

    Abstract: With the great development of generative model techniques, face forgery detection draws more and more attention in the related field. Researchers find that existing face forgery models are still vulnerable to adversarial examples with generated pixel perturbations in the global image. These generated adversarial samples still can't achieve satisfactory performance because of the high detectability… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: The code is publicly available

  18. arXiv:2406.09822  [pdf, other

    cs.IT cs.CV cs.LG eess.IV eess.SP

    An I2I Inpainting Approach for Efficient Channel Knowledge Map Construction

    Authors: Zhenzhou **, Li You, Jue Wang, Xiang-Gen Xia, Xiqi Gao

    Abstract: Channel knowledge map (CKM) has received widespread attention as an emerging enabling technology for environment-aware wireless communications. It involves the construction of databases containing location-specific channel knowledge, which are then leveraged to facilitate channel state information (CSI) acquisition and transceiver design. In this context, a fundamental challenge lies in efficientl… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 15 pages, 11 figures

  19. arXiv:2406.09198  [pdf, other

    cs.CV

    CLIP-Driven Cloth-Agnostic Feature Learning for Cloth-Changing Person Re-Identification

    Authors: Shuang Li, Jiaxu Leng, Guozhang Li, Ji Gan, Haosheng chen, Xinbo Gao

    Abstract: Contrastive Language-Image Pre-Training (CLIP) has shown impressive performance in short-term Person Re-Identification (ReID) due to its ability to extract high-level semantic features of pedestrians, yet its direct application to Cloth-Changing Person Re-Identification (CC-ReID) faces challenges due to CLIP's image encoder overly focusing on clothes clues. To address this, we propose a novel fram… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  20. arXiv:2406.08098  [pdf, other

    cs.SE

    Scalable Defect Detection via Traversal on Code Graph

    Authors: Zhengyao Liu, Xitong Zhong, Xing**g Deng, Shuo Hong, Xiang Gao, Hailong Sun

    Abstract: Detecting defects and vulnerabilities in the early stage has long been a challenge in software engineering. Static analysis, a technique that inspects code without execution, has emerged as a key strategy to address this challenge. Among recent advancements, the use of graph-based representations, particularly Code Property Graph (CPG), has gained traction due to its comprehensive depiction of cod… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  21. arXiv:2406.06031  [pdf, other

    cs.IR

    A WT-ResNet based fault diagnosis model for the urban rail train transmission system

    Authors: Zuyu Cheng, Zhengcai Zhao, Yixiao Wang, Wentao Guo, Yufei Wang, Xiang Gao

    Abstract: This study presents a novel fault diagnosis model for urban rail transit systems based on Wavelet Transform Residual Neural Network (WT-ResNet). The model integrates the advantages of wavelet transform for feature extraction and ResNet for pattern recognition, offering enhanced diagnostic accuracy and robustness. Experimental results demonstrate the effectiveness of the proposed model in identifyi… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages,10 figures

  22. arXiv:2406.05366  [pdf, other

    cs.LG math.OC

    Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator

    Authors: Wenhao Xu, Xuefeng Gao, Xuedong He

    Abstract: Risk-sensitive linear quadratic regulator is one of the most fundamental problems in risk-sensitive optimal control. In this paper, we study online adaptive control of risk-sensitive linear quadratic regulator in the finite horizon episodic setting. We propose a simple least-squares greedy algorithm and show that it achieves $\widetilde{\mathcal{O}}(\log N)$ regret under a specific identifiability… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

  23. arXiv:2406.05361  [pdf, other

    cs.CL

    Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization

    Authors: Xiuying Chen, Shen Gao, Mingzhe Li, Qingqing Zhu, Xin Gao, Xiangliang Zhang

    Abstract: Nowadays, neural text generation has made tremendous progress in abstractive summarization tasks. However, most of the existing summarization models take in the whole document all at once, which sometimes cannot meet the needs in practice. Practically, social text streams such as news events and tweets keep growing from time to time, and can only be fed to the summarization system step by step. He… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures, published in TASLP

  24. arXiv:2406.05360  [pdf, other

    cs.CL

    Flexible and Adaptable Summarization via Expertise Separation

    Authors: Xiuying Chen, Mingzhe Li, Shen Gao, Xin Cheng, Qingqing Zhu, Rui Yan, Xin Gao, Xiangliang Zhang

    Abstract: A proficient summarization model should exhibit both flexibility -- the capacity to handle a range of in-domain summarization tasks, and adaptability -- the competence to acquire new knowledge and adjust to unseen out-of-domain tasks. Unlike large language models (LLMs) that achieve this through parameter scaling, we propose a more parameter-efficient approach in this study. Our motivation rests o… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 10 pages, 7 figures, published in SIGIR 2024

  25. arXiv:2406.05358  [pdf, other

    cs.LG math.OC

    Reinforcement Learning for Intensity Control: An Application to Choice-Based Network Revenue Management

    Authors: Huiling Meng, Ningyuan Chen, Xuefeng Gao

    Abstract: Intensity control is a type of continuous-time dynamic optimization problems with many important applications in Operations Research including queueing and revenue management. In this study, we adapt the reinforcement learning framework to intensity control using choice-based network revenue management as a case study, which is a classical problem in revenue management that features a large state… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

  26. arXiv:2406.03603  [pdf, other

    cs.LG

    Alignment Calibration: Machine Unlearning for Contrastive Learning under Auditing

    Authors: Yihan Wang, Yiwei Lu, Guojun Zhang, Franziska Boenisch, Adam Dziedzic, Yaoliang Yu, Xiao-Shan Gao

    Abstract: Machine unlearning provides viable solutions to revoke the effect of certain training data on pre-trained model parameters. Existing approaches provide unlearning recipes for classification and generative models. However, a category of important machine learning models, i.e., contrastive learning (CL) methods, is overlooked. In this paper, we fill this gap by first proposing the framework of Machi… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  27. arXiv:2406.01302  [pdf

    cs.CV

    Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data

    Authors: Zhusi Zhong, Helen Zhang, Fayez H. Fayad, Andrew C. Lancaster, John Sollee, Shreyas Kulkarni, Cheng Ting Lin, Jie Li, Xinbo Gao, Scott Collins, Colin Greineder, Sun H. Ahn, Harrison X. Bai, Zhicheng Jiao, Michael K. Atalay

    Abstract: Purpose: Pulmonary embolism (PE) is a significant cause of mortality in the United States. The objective of this study is to implement deep learning (DL) models using Computed Tomography Pulmonary Angiography (CTPA), clinical data, and PE Severity Index (PESI) scores to predict PE mortality. Materials and Methods: 918 patients (median age 64 years, range 13-99 years, 52% female) with 3,978 CTPAs w… ▽ More

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  28. arXiv:2406.00639  [pdf, other

    cs.CV

    An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition

    Authors: Haojun Xu, Yan Gao, Jie Li, Xinbo Gao

    Abstract: Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training. Previous research has focused on aligning sequences' visual and semantic spatial distributions. However, these methods extract semantic features simply. They ignore that proper prompt design for rich and fine-grained action cues can provide robust repr… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 12 pages, 8 figures init commit

  29. arXiv:2406.00602  [pdf, other

    cs.SE cs.PL

    From Effectiveness to Efficiency: Comparative Evaluation of Code Generated by LCGMs for Bilingual Programming Questions

    Authors: Weipeng Jiang, Xuanqi Gao, Juan Zhai, Shiqing Ma, Xiaoyu Zhang, Chao Shen

    Abstract: Large Code Generation Models (LCGMs) have garnered significant attention and achieved promising results across various programming tasks. However, concerns arise regarding performance when using non-English prompts, as these models are primarily trained on English-centric corpora, and most programming language tokens resemble English. Existing benchmarks often rely on English programming questions… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 10 and a quarter pages, 6 figures

  30. arXiv:2406.00588  [pdf, other

    cs.LG cs.CR math.ST

    Generalization Bound and New Algorithm for Clean-Label Backdoor Attack

    Authors: Lijia Yu, Shuang Liu, Yibo Miao, Xiao-Shan Gao, Lijun Zhang

    Abstract: The generalization bound is a crucial theoretical tool for assessing the generalizability of learning methods and there exist vast literatures on generalizability of normal learning, adversarial learning, and data poisoning. Unlike other data poison attacks, the backdoor attack has the special property that the poisoned triggers are contained in both the training set and the test set and the purpo… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  31. arXiv:2405.19098  [pdf, other

    cs.LG cs.AI cs.CR cs.CV stat.ML

    Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior

    Authors: Shuyu Cheng, Yibo Miao, Yinpeng Dong, Xiao Yang, Xiao-Shan Gao, Jun Zhu

    Abstract: This paper studies the challenging black-box adversarial attack that aims to generate adversarial examples against a black-box model by only using output feedback of the model to input queries. Some previous methods improve the query efficiency by incorporating the gradient of a surrogate white-box model into query-based attacks due to the adversarial transferability. However, the localized gradie… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  32. arXiv:2405.19027  [pdf, other

    cs.DC

    A Dual-functional Blockchain Framework for Solving Distributed Optimization

    Authors: Weihang Cao, Xintong Ling, Jiaheng Wang, Xiqi Gao, Zhi Ding

    Abstract: Proof of Work (PoW) has been extensively utilized as the foundation of blockchain's security, consistency, and tamper-resistance. However, long has it been criticized for its tremendous and inefficient utilization of computational power and energy. In this work, we design a dual-functional blockchain framework that uses solving optimization problems to reach consensus as an alternative to PoW, cha… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  33. arXiv:2405.18812  [pdf, other

    cs.CV

    MindSemantix: Deciphering Brain Visual Experiences with a Brain-Language Model

    Authors: Ziqi Ren, Jie Li, Xuetong Xue, Xin Li, Fan Yang, Zhicheng Jiao, Xinbo Gao

    Abstract: Deciphering the human visual experience through brain activities captured by fMRI represents a compelling and cutting-edge challenge in the field of neuroscience research. Compared to merely predicting the viewed image itself, decoding brain activity into meaningful captions provides a higher-level interpretation and summarization of visual information, which naturally enhances the application fle… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 13 pages, 6 figures

  34. arXiv:2405.18700  [pdf, other

    cs.CV

    Multi-Condition Latent Diffusion Network for Scene-Aware Neural Human Motion Prediction

    Authors: Xuehao Gao, Yang Yang, Yang Wu, Shaoyi Du, Guo-Jun Qi

    Abstract: Inferring 3D human motion is fundamental in many applications, including understanding human activity and analyzing one's intention. While many fruitful efforts have been made to human motion prediction, most approaches focus on pose-driven prediction and inferring human motion in isolation from the contextual environment, thus leaving the body location movement in the scene behind. However, real-… ▽ More

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE Transactions on Image Processing

  35. arXiv:2405.18380  [pdf, other

    cs.LG cs.AI cs.CL

    OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning

    Authors: Pengxiang Li, Lu Yin, Xiaowei Gao, Shiwei Liu

    Abstract: The rapid advancements in Large Language Models (LLMs) have revolutionized various natural language processing tasks. However, the substantial size of LLMs presents significant challenges in training or fine-tuning. While parameter-efficient approaches such as low-rank adaptation (LoRA) have gained popularity, they often compromise performance compared to full-rank fine-tuning. In this paper, we p… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  36. arXiv:2405.18113  [pdf, other

    cs.CL cs.AI

    Facilitating Multi-Role and Multi-Behavior Collaboration of Large Language Models for Online Job Seeking and Recruiting

    Authors: Hongda Sun, Hongzhan Lin, Haiyu Yan, Chen Zhu, Yang Song, Xin Gao, Shuo Shang, Rui Yan

    Abstract: The emergence of online recruitment services has revolutionized the traditional landscape of job seeking and recruitment, necessitating the development of high-quality industrial applications to improve person-job fitting. Existing methods generally rely on modeling the latent semantics of resumes and job descriptions and learning a matching function between them. Inspired by the powerful role-pla… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  37. arXiv:2405.18004  [pdf, other

    cs.CV

    SkinCAP: A Multi-modal Dermatology Dataset Annotated with Rich Medical Captions

    Authors: Juexiao Zhou, Liyuan Sun, Yan Xu, Wenbin Liu, Shawn Afvari, Zhongyi Han, Jiaoyan Song, Yongzhi Ji, Xiaonan He, Xin Gao

    Abstract: With the widespread application of artificial intelligence (AI), particularly deep learning (DL) and vision-based large language models (VLLMs), in skin disease diagnosis, the need for interpretability becomes crucial. However, existing dermatology datasets are limited in their inclusion of concept-level meta-labels, and none offer rich medical descriptions in natural language. This deficiency imp… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  38. arXiv:2405.17811  [pdf, other

    cs.GR cs.CV

    Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh

    Authors: Xiangjun Gao, Xiaoyu Li, Yiyu Zhuang, Qi Zhang, Wenbo Hu, Chaopeng Zhang, Yao Yao, Ying Shan, Long Quan

    Abstract: Neural 3D representations such as Neural Radiance Fields (NeRF), excel at producing photo-realistic rendering results but lack the flexibility for manipulation and editing which is crucial for content creation. Previous works have attempted to address this issue by deforming a NeRF in canonical space or manipulating the radiance field based on an explicit mesh. However, manipulating NeRF is not hi… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Project page here: https://gaoxiangjun.github.io/mani_gs/

  39. arXiv:2405.17463  [pdf, other

    cs.GT cs.LG

    No Algorithmic Collusion in Two-Player Blindfolded Game with Thompson Sampling

    Authors: Ningyuan Chen, Xuefeng Gao, Yi Xiong

    Abstract: When two players are engaged in a repeated game with unknown payoff matrices, they may be completely unaware of the existence of each other and use multi-armed bandit algorithms to choose the actions, which is referred to as the ``blindfolded game'' in this paper. We show that when the players use Thompson sampling, the game dynamics converges to the Nash equilibrium under a mild assumption on the… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  40. arXiv:2405.17003  [pdf, other

    cs.LG

    Graph Condensation for Open-World Graph Learning

    Authors: Xinyi Gao, Tong Chen, Wentao Zhang, Yayong Li, Xiangguo Sun, Hongzhi Yin

    Abstract: The burgeoning volume of graph data presents significant computational challenges in training graph neural networks (GNNs), critically impeding their efficiency in various applications. To tackle this challenge, graph condensation (GC) has emerged as a promising acceleration solution, focusing on the synthesis of a compact yet representative graph for efficiently training GNNs while retaining perf… ▽ More

    Submitted 12 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024

  41. arXiv:2405.16449  [pdf, other

    cs.LG math.OC q-fin.MF

    Reinforcement Learning for Jump-Diffusions

    Authors: Xuefeng Gao, Lingfei Li, Xun Yu Zhou

    Abstract: We study continuous-time reinforcement learning (RL) for stochastic control in which system dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized exploratory control problem with stochastic policies to capture the exploration--exploitation balance essential for RL. Unlike the pure diffusion case initially studied by Wang et al. (2020), the derivation of the explora… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

  42. arXiv:2405.16152  [pdf, other

    cs.CV cs.HC

    SuDA: Support-based Domain Adaptation for Sim2Real Motion Capture with Flexible Sensors

    Authors: Jiawei Fang, Haishan Song, Chengxu Zuo, Xiaoxia Gao, Xiaowei Chen, Shihui Guo, Yipeng Qin

    Abstract: Flexible sensors hold promise for human motion capture (MoCap), offering advantages such as wearability, privacy preservation, and minimal constraints on natural movement. However, existing flexible sensor-based MoCap methods rely on deep learning and necessitate large and diverse labeled datasets for training. These data typically need to be collected in MoCap studios with specialized equipment a… ▽ More

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 20 pages conference, accepted ICML paper

  43. arXiv:2405.14113  [pdf, other

    eess.IV cs.CV

    Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation

    Authors: Zhusi Zhong, Jie Li, John Sollee, Scott Collins, Harrison Bai, Paul Zhang, Terrence Healey, Michael Atalay, Xinbo Gao, Zhicheng Jiao

    Abstract: In response to the worldwide COVID-19 pandemic, advanced automated technologies have emerged as valuable tools to aid healthcare professionals in managing an increased workload by improving radiology report generation and prognostic analysis. This study proposes Multi-modality Regional Alignment Network (MRANet), an explainable model for radiology report generation and survival prediction that foc… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  44. arXiv:2405.14060  [pdf, ps, other

    cs.LG physics.comp-ph

    Probabilistic Inference in the Era of Tensor Networks and Differential Programming

    Authors: Martin Roa-Villescas, Xuanzhao Gao, Sander Stuijk, Henk Corporaal, **-Guo Liu

    Abstract: Probabilistic inference is a fundamental task in modern machine learning. Recent advances in tensor network (TN) contraction algorithms have enabled the development of better exact inference methods. However, many common inference tasks in probabilistic graphical models (PGMs) still lack corresponding TN-based adaptations. In this work, we advance the connection between PGMs and TNs by formulating… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 12 pages, 4 figures

  45. arXiv:2405.13707  [pdf, other

    cs.LG cs.AI

    Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition

    Authors: Xinyi Gao, Tong Chen, Wentao Zhang, Junliang Yu, Guanhua Ye, Quoc Viet Hung Nguyen, Hongzhi Yin

    Abstract: The increasing prevalence of large-scale graphs poses a significant challenge for graph neural network training, attributed to their substantial computational requirements. In response, graph condensation (GC) emerges as a promising data-centric solution aiming to substitute the large graph with a small yet informative condensed graph to facilitate data-efficient GNN training. However, existing GC… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  46. arXiv:2405.12969  [pdf, other

    cs.LG

    Can We Treat Noisy Labels as Accurate?

    Authors: Yuxiang Zheng, Zhongyi Han, Yilong Yin, Xin Gao, Tongliang Liu

    Abstract: Noisy labels significantly hinder the accuracy and generalization of machine learning models, particularly due to ambiguous instance features. Traditional techniques that attempt to correct noisy labels directly, such as those using transition matrices, often fail to address the inherent complexities of the problem sufficiently. In this paper, we introduce EchoAlign, a transformative paradigm shif… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 10 pages

  47. arXiv:2405.12217  [pdf, other

    cs.CV cs.AI cs.LG

    Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning

    Authors: Guanglin Zhou, Zhongyi Han, Shiming Chen, Biwei Huang, Liming Zhu, Salman Khan, Xin Gao, Lina Yao

    Abstract: Recent studies indicate that large multimodal models (LMMs) are highly robust against natural distribution shifts, often surpassing previous baselines. Despite this, domain-specific adaptation is still necessary, particularly in specialized areas like healthcare. Due to the impracticality of fine-tuning LMMs given their vast parameter space, this work investigates in-context learning (ICL) as an e… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 17 pages, 7 figures, 7 tables

  48. arXiv:2405.11758  [pdf, other

    cs.LG cs.AI

    Fed-Credit: Robust Federated Learning with Credibility Management

    Authors: Jiayan Chen, Zhirong Qian, Tianhui Meng, Xitong Gao, Tian Wang, Weijia Jia

    Abstract: Aiming at privacy preservation, Federated Learning (FL) is an emerging machine learning approach enabling model training on decentralized devices or data sources. The learning mechanism of FL relies on aggregating parameter updates from individual clients. However, this process may pose a potential security risk due to the presence of malicious devices. Existing solutions are either costly due to… ▽ More

    Submitted 19 May, 2024; originally announced May 2024.

  49. arXiv:2405.10132  [pdf, other

    cs.CV

    Cooperative Visual-LiDAR Extrinsic Calibration Technology for Intersection Vehicle-Infrastructure: A review

    Authors: Xinyu Zhang, Yi** Xiong, Qianxin Qu, Renjie Wang, Xin Gao, **g Liu, Shichun Guo, Jun Li

    Abstract: In the typical urban intersection scenario, both vehicles and infrastructures are equipped with visual and LiDAR sensors. By successfully integrating the data from vehicle-side and road monitoring devices, a more comprehensive and accurate environmental perception and information acquisition can be achieved. The Calibration of sensors, as an essential component of autonomous driving technology, ha… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

  50. arXiv:2405.08423  [pdf, other

    eess.IV cs.CV

    NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

    Authors: Yihong Chen, Zhen Fan, Shuai Dong, Zhiwei Chen, Wenjie Li, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao, Jun-Ming Liu

    Abstract: Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images as typically captured by a dual-camera device. To enhance the quality of SR images, most previous studies focused on increasing the number and size of feature maps and introducing complex and computationally intensive structures, resulting in models with high co… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.