Showing 1–50 of 647 results for author: Tang, X

Searching in archive cs.
  1. arXiv:2406.19853  [pdf, other]

    cs.CL cs.AI

    YuLan: An Open-source Large Language Model

    Authors: Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of training details hinders further research and development. This paper presents the development of YuLan, a series of open-source LLMs with $12$ billi…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2406.19240  [pdf, other]

    cs.SE

    Data Preparation for Deep Learning based Code Smell Detection: A Systematic Literature Review

    Authors: Fengji Zhang, Zexian Zhang, Jacky Wai Keung, Xiangru Tang, Zhen Yang, Xiao Yu, Wenhua Hu

    Abstract: Code Smell Detection (CSD) plays a crucial role in improving software quality and maintainability. And Deep Learning (DL) techniques have emerged as a promising approach for CSD due to their superior performance. However, the effectiveness of DL-based CSD methods heavily relies on the quality of the training data. Despite its importance, little attention has been paid to analyzing the data prepara…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.18873  [pdf, other]

    cs.AR

    LayoutCopilot: An LLM-powered Multi-agent Collaborative Framework for Interactive Analog Layout Design

    Authors: Bingyang Liu, Haoyi Zhang, Xiaohan Gao, Zichen Kong, Xiyuan Tang, Yibo Lin, Runsheng Wang, Ru Huang

    Abstract: Analog layout design heavily involves interactive processes between humans and design tools. The tools are usually designed to use scripting commands or visualized buttons for manipulation, especially for those interactive automation functionalities, which have a steep learning curve and cumbersome user experience, making a notable barrier to their adoption by designers. Aiming to address such a u…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 8 pages, 8 figures

  4. arXiv:2406.16905  [pdf]

    cs.LG cs.AI

    Optimising Random Forest Machine Learning Algorithms for User VR Experience Prediction Based on Iterative Local Search-Sparrow Search Algorithm

    Authors: Xirui Tang, Feiyang Li, Zinan Cao, Qixuan Yu, Yulu Gong

    Abstract: In this paper, an improved method for VR user experience prediction is investigated by introducing a sparrow search algorithm and a random forest algorithm improved by an iterative local search-optimised sparrow search algorithm. The study firstly conducted a statistical analysis of the data, and then trained and tested using the traditional random forest model, the random forest model improved by…

    Submitted 3 June, 2024; originally announced June 2024.

  5. arXiv:2406.14644  [pdf, other]

    cs.CL

    Unveiling the Spectrum of Data Contamination in Language Models: A Survey from Detection to Remediation

    Authors: Chunyuan Deng, Yilun Zhao, Yuzhao Heng, Yitong Li, Jiannan Cao, Xiangru Tang, Arman Cohan

    Abstract: Data contamination has garnered increased attention in the era of large language models (LLMs) due to the reliance on extensive internet-derived training corpora. The issue of training corpus overlap with evaluation benchmarks--referred to as contamination--has been the focus of significant recent research. This body of work aims to identify contamination, understand its impacts, and explore mitig…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Camera-Ready Version

  6. arXiv:2406.14275  [pdf, other]

    cs.CL cs.AI

    Step-Back Profiling: Distilling User History for Personalized Scientific Writing

    Authors: Xiangru Tang, Xingyao Zhang, Yanjun Shao, Jie Wu, Yilun Zhao, Arman Cohan, Ming Gong, Dongmei Zhang, Mark Gerstein

    Abstract: Large language models (LLMs) excel at a variety of natural language processing tasks, yet they struggle to generate personalized content for individuals, particularly in real-world scenarios like scientific writing. Addressing this challenge, we introduce Step-Back Profiling to personalize LLMs by distilling user history into concise profiles, including essential traits and preferences of users. R…

    Submitted 20 June, 2024; originally announced June 2024.

  7. arXiv:2406.14022  [pdf, other]

    cs.LG cs.CL

    Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning

    Authors: Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: The emergence of in-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) for recognizing the task from demonstrations and utilizing pre-trained priors, and task learning (TL) for learning from demonstrations. However, relationships between the two abilities and how such relationships affect the emergence of ICL is unclear. In this paper, we take the first…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: work in progress

  8. arXiv:2406.13604  [pdf, other]

    cs.SE cs.AI cs.PF

    Root Cause Localization for Microservice Systems in Cloud-edge Collaborative Environments

    Authors: Yuhan Zhu, Jian Wang, Bing Li, Xuxian Tang, Hao Li, Neng Zhang, Yuqi Zhao

    Abstract: With the development of cloud-native technologies, microservice-based software systems face challenges in accurately localizing root causes when failures occur. Additionally, the cloud-edge collaborative environment introduces more difficulties, such as unstable networks and high latency across network segments. Accurately identifying the root cause of microservices in a cloud-edge collaborative e…

    Submitted 19 June, 2024; originally announced June 2024.

  9. arXiv:2406.13294  [pdf, other]

    cs.MM cs.LG

    Enhancing Cross-Prompt Transferability in Vision-Language Models through Contextual Injection of Target Tokens

    Authors: Xikang Yang, Xuehai Tang, Fuqing Zhu, Jizhong Han, Songlin Hu

    Abstract: Vision-language models (VLMs) seamlessly integrate visual and textual data to perform tasks such as image classification, caption generation, and visual question answering. However, adversarial images often struggle to deceive all prompts effectively in the context of cross-prompt migration attacks, as the probability distribution of the tokens in these images tends to favor the semantics of the o…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 13 pages

  10. arXiv:2406.13193  [pdf, other]

    cs.LG cs.AI cs.CL physics.chem-ph

    PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes

    Authors: He Cao, Yanjun Shao, Zhiyuan Liu, Zi**g Liu, Xiangru Tang, Yuan Yao, Yu Li

    Abstract: Multimodal Large Language Models (MLLMs) have seen growing adoption across various scientific disciplines. These advancements encourage the investigation of molecule-text modeling within synthetic chemistry, a field dedicated to designing and conducting chemical reactions to synthesize new compounds with desired properties and applications. Current approaches, however, often neglect the critical r…

    Submitted 18 June, 2024; originally announced June 2024.

  11. arXiv:2406.12692  [pdf, other]

    cs.CL cs.AI cs.DB cs.HC

    MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL

    Authors: Arian Askari, Christian Poelitz, Xinye Tang

    Abstract: Self-correction in text-to-SQL is the process of prompting a large language model (LLM) to revise its previously incorrectly generated SQL, and commonly relies on manually crafted self-correction guidelines by human experts that are not only labor-intensive to produce but also limited by the human ability in identifying all potential error patterns in LLM responses. We introduce MAGIC, a novel multi…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 20 pages, 17 figures

  12. arXiv:2406.11252  [pdf, other]

    cs.CV

    Mining Open Semantics from CLIP: A Relation Transition Perspective for Few-Shot Learning

    Authors: Cilin Yan, Haochen Wang, Xiaolong Jiang, Yao Hu, Xu Tang, Guoliang Kang, Efstratios Gavves

    Abstract: Contrastive Vision-Language Pre-training (CLIP) demonstrates impressive zero-shot capability. The key to improve the adaptation of CLIP to downstream task with few exemplars lies in how to effectively model and transfer the useful knowledge embedded in CLIP. Previous work mines the knowledge typically based on the limited visual samples and close-set semantics (i.e., within target category set of d…

    Submitted 28 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  13. arXiv:2406.11187  [pdf, other]

    cs.LG

    Save It All: Enabling Full Parameter Tuning for Federated Large Language Models via Cycle Black Gradient Descent

    Authors: Lin Wang, Zhichao Wang, Xiaoying Tang

    Abstract: The advent of large language models (LLMs) has revolutionized the deep learning paradigm, yielding impressive results across a wide array of tasks. However, the pre-training or fine-tuning of LLMs within a federated learning (FL) framework poses substantial challenges, including considerable computational and memory resource demands, as well as communication bottlenecks between servers and clients…

    Submitted 16 June, 2024; originally announced June 2024.

  14. arXiv:2406.10801  [pdf, other]

    cs.CV

    Saliency-guided and Patch-based Mixup for Long-tailed Skin Cancer Image Classification

    Authors: Tianyunxi Wei, Yi** Huang, Li Lin, Pu** Cheng, Sirui Li, Xiaoying Tang

    Abstract: Medical image datasets often exhibit long-tailed distributions due to the inherent challenges in medical data collection and annotation. In long-tailed contexts, some common disease categories account for most of the data, while only a few samples are available in the rare disease categories, resulting in poor performance of deep learning methods. To address this issue, previous approaches have em…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: IEEE ISBI2024

  15. arXiv:2406.08870  [pdf, other]

    cs.NI

    MEGA: Maximum-Entropy Genetic Algorithm for Router Nodes Placement in Wireless Mesh Networks

    Authors: N. Ussipov, S. Akhtanov, D. Turlykozhayeva, S. Temesheva, A. Akhmetali, M. Zaidyn, T. Namazbayev, A. Bolysbay, A. Akniyazova, Xiao Tang

    Abstract: Over the past decade, Wireless Mesh Networks (WMNs) have seen significant advancements due to their simple deployment, cost-effectiveness, ease of implementation and reliable service coverage. However, despite these advantages, the placement of nodes in WMNs presents a critical challenge that significantly impacts their performance. This issue is recognized as an NP-hard problem, underscoring the…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE Access

  16. arXiv:2406.03464  [pdf, other]

    cs.LG

    Node-wise Filtering in Graph Neural Networks: A Mixture of Experts Approach

    Authors: Haoyu Han, Juanhui Li, Wei Huang, Xianfeng Tang, Hanqing Lu, Chen Luo, Hui Liu, Jiliang Tang

    Abstract: Graph Neural Networks (GNNs) have proven to be highly effective for node classification tasks across diverse graph structural patterns. Traditionally, GNNs employ a uniform global filter, typically a low-pass filter for homophilic graphs and a high-pass filter for heterophilic graphs. However, real-world graphs often exhibit a complex mix of homophilic and heterophilic patterns, rendering a single…

    Submitted 5 June, 2024; originally announced June 2024.

  17. arXiv:2406.02014  [pdf, other]

    q-bio.NC cs.LG cs.SD eess.AS

    Understanding Auditory Evoked Brain Signal via Physics-informed Embedding Network with Multi-Task Transformer

    Authors: Wanli Ma, Xuegang Tang, ** Gu, Ying Wang, Yuling Xia

    Abstract: In the fields of brain-computer interaction and cognitive neuroscience, effective decoding of auditory signals from task-based functional magnetic resonance imaging (fMRI) is key to understanding how the brain processes complex auditory information. Although existing methods have enhanced decoding capabilities, limitations remain in information utilization and model representation. To overcome the…

    Submitted 4 June, 2024; originally announced June 2024.

  18. arXiv:2406.00335  [pdf, other]

    cs.LG

    Benchmarking for Deep Uplift Modeling in Online Marketing

    Authors: Dugang Liu, Xing Tang, Yang Qiao, Miao Liu, Zexu Sun, Xiuqiang He, Zhong Ming

    Abstract: Online marketing is critical for many industrial platforms and business applications, aiming to increase user engagement and platform revenue by identifying corresponding delivery-sensitive groups for specific incentives, such as coupons and bonuses. As the scale and complexity of features in industrial scenarios increase, deep uplift modeling (DUM) as a promising technique has attracted increased…

    Submitted 1 June, 2024; originally announced June 2024.

  19. arXiv:2406.00258  [pdf, other]

    cs.CV cs.AI

    Artemis: Towards Referential Understanding in Complex Videos

    Authors: Jihao Qiu, Yuan Zhang, Xi Tang, Lingxi Xie, Tianren Ma, Pengyu Yan, David Doermann, Qixiang Ye, Yunjie Tian

    Abstract: Videos carry rich visual information including object description, action, interaction, etc., but the existing multimodal large language models (MLLMs) fell short in referential understanding scenarios such as video-based referring. In this paper, we present Artemis, an MLLM that pushes video-based referential understanding to a finer level. Given a video, Artemis receives a natural-language quest…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 19 pages, 14 figures. Code and data are available at https://github.com/qiujihao19/Artemis

  20. arXiv:2405.17221  [pdf, other]

    cs.AI cs.AR

    Efficient Orchestrated AI Workflows Execution on Scale-out Spatial Architecture

    Authors: **yi Deng, Xinru Tang, Zhiheng Yue, Guangyang Lu, Qize Yang, Jiahao Zhang, **xi Li, Chao Li, Shaojun Wei, Yang Hu, Shouyi Yin

    Abstract: Given the increasing complexity of AI applications, traditional spatial architectures frequently fall short. Our analysis identifies a pattern of interconnected, multi-faceted tasks encompassing both AI and general computational processes. In response, we have conceptualized "Orchestrated AI Workflows," an approach that integrates various tasks with logic-driven decisions into dynamic, sophisticat…

    Submitted 21 May, 2024; originally announced May 2024.

  21. arXiv:2405.16233  [pdf, other]

    cs.LG

    Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing

    Authors: Yongxin Guo, Lin Wang, Xiaoying Tang, Tao Lin

    Abstract: Federated Learning (FL) is a privacy-preserving distributed machine learning paradigm. Nonetheless, the substantial distribution shifts among clients pose a considerable challenge to the performance of current FL algorithms. To mitigate this challenge, various methods have been proposed to enhance the FL training process. This paper endeavors to tackle the issue of data heterogeneity from another…

    Submitted 25 May, 2024; originally announced May 2024.

  22. arXiv:2405.15458  [pdf, other]

    cs.LG cs.DC

    FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler

    Authors: Hongyi Peng, Han Yu, Xiaoli Tang, Xiaoxiao Li

    Abstract: Federated learning (FL) enables collaborative machine learning across distributed data owners, but data heterogeneity poses a challenge for model calibration. While prior work focused on improving accuracy for non-iid data, calibration remains under-explored. This study reveals existing FL aggregation approaches lead to sub-optimal calibration, and theoretical analysis shows despite constraining v…

    Submitted 3 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by ICML'24

  23. arXiv:2405.15301  [pdf, other]

    cs.LG

    Rankability-enhanced Revenue Uplift Modeling Framework for Online Marketing

    Authors: Bowei He, Yunpeng Weng, Xing Tang, Ziqiang Cui, Zexu Sun, Liang Chen, Xiuqiang He, Chen Ma

    Abstract: Uplift modeling has been widely employed in online marketing by predicting the response difference between the treatment and control groups, so as to identify the sensitive individuals toward interventions like coupons or discounts. Compared with traditional conversion uplift modeling, revenue uplift modeling exhibits higher potential due to its direct connection with the corpora…

    Submitted 12 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024

  24. arXiv:2405.14782  [pdf, other]

    cs.CL

    Lessons from the Trenches on Reproducible Evaluation of Language Models

    Authors: Stella Biderman, Hailey Schoelkopf, Lintang Sutawika, Leo Gao, Jonathan Tow, Baber Abbasi, Alham Fikri Aji, Pawan Sasanka Ammanamanchi, Sidney Black, Jordan Clive, Anthony DiPofi, Julen Etxaniz, Benjamin Fattori, Jessica Zosa Forde, Charles Foster, Jeffrey Hsu, Mimansa Jaiswal, Wilson Y. Lee, Haonan Li, Charles Lovering, Niklas Muennighoff, Ellie Pavlick, Jason Phang, Aviya Skowron, Samson Tan, et al. (5 additional authors not shown)

    Abstract: Effective evaluation of language models remains an open challenge in NLP. Researchers and engineers face methodological issues such as the sensitivity of models to evaluation setup, difficulty of proper comparisons across methods, and the lack of reproducibility and transparency. In this paper we draw on three years of experience in evaluating large language models to provide guidance and lessons…

    Submitted 29 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  25. arXiv:2405.14297  [pdf, other]

    cs.LG cs.AI

    Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

    Authors: Yongxin Guo, Zhenglin Cheng, Xiaoying Tang, Tao Lin

    Abstract: The Sparse Mixture of Experts (SMoE) has been widely employed to enhance the efficiency of training and inference for Transformer-based foundational models, yielding promising results. However, the performance of SMoE heavily depends on the choice of hyper-parameters, such as the number of experts and the number of experts to be activated (referred to as top-k), resulting in significant computatio…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 9 pages, 21 figures

  26. arXiv:2405.13382  [pdf, other]

    cs.CV

    VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

    Authors: Yongxin Guo, **gyu Liu, Mingda Li, Xiaoying Tang, Xi Chen, Bo Zhao

    Abstract: Video Temporal Grounding (VTG) focuses on accurately identifying event timestamps within a particular video based on a linguistic query, playing a vital role in downstream tasks such as video browsing and editing. While Video Large Language Models (video LLMs) have made significant progress in understanding video content, they often face challenges in accurately pinpointing timestamps within video…

    Submitted 1 July, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  27. arXiv:2405.11921  [pdf, other]

    cs.CV

    MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections

    Authors: Jiayue Liu, Xiao Tang, Freeman Cheng, Roy Yang, Zhihao Li, Jianzhuang Liu, Yi Huang, Jiaqi Lin, Shiyong Liu, Xiaofei Wu, Songcen Xu, Chun Yuan

    Abstract: 3D Gaussian Splatting showcases notable advancements in photo-realistic and real-time novel view synthesis. However, it faces challenges in modeling mirror reflections, which exhibit substantial appearance variations from different viewpoints. To tackle this problem, we present MirrorGaussian, the first method for mirror scene reconstruction with real-time rendering based on 3D Gaussian Splatting.…

    Submitted 20 May, 2024; originally announced May 2024.

  28. arXiv:2405.07966  [pdf, other]

    cs.CV cs.AI

    OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition

    Authors: Qiuchi Xiang, **tao Cheng, Jiehao Luo, ** Wu, Rui Fan, Xieyuanli Chen, Xiaoyu Tang

    Abstract: Place recognition is the foundation for enabling autonomous systems to achieve independent decision-making and safe operations. It is also crucial in tasks such as loop closure detection and global localization within SLAM. Previous methods utilize mundane point cloud representations as input and deep learning-based LiDAR-based Place Recognition (LPR) approaches employing different point cloud ima…

    Submitted 13 May, 2024; originally announced May 2024.

  29. arXiv:2405.07202  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM cs.SD eess.AS

    Unified Video-Language Pre-training with Synchronized Audio

    Authors: Shentong Mo, Haofan Wang, Huaxia Li, Xu Tang

    Abstract: Video-language pre-training is a typical and challenging problem that aims at learning visual and textual representations from large-scale data in a self-supervised way. Existing pre-training approaches either captured the correspondence of image-text pairs or utilized temporal ordering of frames. However, they do not explicitly explore the natural synchronization between audio and the other two m…

    Submitted 12 May, 2024; originally announced May 2024.

  30. arXiv:2405.05991  [pdf, other]

    cs.LG cs.AI cs.GT

    Agent-oriented Joint Decision Support for Data Owners in Auction-based Federated Learning

    Authors: Xiaoli Tang, Han Yu, Xiaoxiao Li

    Abstract: Auction-based Federated Learning (AFL) has attracted extensive research interest due to its ability to motivate data owners (DOs) to join FL through economic means. While many existing AFL methods focus on providing decision support to model users (MUs) and the AFL auctioneer, decision support for data owners remains open. To bridge this gap, we propose a first-of-its-kind agent-oriented joint Pri…

    Submitted 8 May, 2024; originally announced May 2024.

  31. arXiv:2405.05610  [pdf, other]

    cs.CL cs.CR cs.LG

    Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM

    Authors: Xikang Yang, Xuehai Tang, Songlin Hu, Jizhong Han

    Abstract: Large language models (LLMs) have achieved remarkable performance in various natural language processing tasks, especially in dialogue systems. However, LLMs may also pose security and moral threats, especially in multi-round conversations where large models are more easily guided by contextual content, resulting in harmful or biased responses. In this paper, we present a novel method to attack LLM…

    Submitted 9 May, 2024; originally announced May 2024.

  32. arXiv:2405.02911  [pdf, other]

    cs.CV

    Multimodal Sense-Informed Prediction of 3D Human Motions

    Authors: Zhenyu Lou, Qiongjie Cui, Haofan Wang, Xu Tang, Hong Zhou

    Abstract: Predicting future human pose is a fundamental application for machine intelligence, which drives robots to plan their behavior and paths ahead of time to seamlessly accomplish human-robot collaboration in real-world 3D scenarios. Despite encouraging results, existing approaches rarely consider the effects of the external scene on the motion sequence, leading to pronounced artifacts and physical im…

    Submitted 5 May, 2024; originally announced May 2024.

  33. arXiv:2405.02843  [pdf, other]

    cs.CV

    Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration

    Authors: Xiaole Tang, Xin Hu, Xiang Gu, Jian Sun

    Abstract: Deep learning-based image restoration methods generally struggle with faithfully preserving the structures of the original image. In this work, we propose a novel Residual-Conditioned Optimal Transport (RCOT) approach, which models image restoration as an optimal transport (OT) problem for both unpaired and paired settings, introducing the transport residual as a unique degradation-specific cue fo…

    Submitted 10 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  34. arXiv:2405.02008  [pdf, other]

    cs.CV

    DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model

    Authors: Pei** Jia, Tuopu Wen, Ziang Luo, Mengmeng Yang, Kun Jiang, Zhiquan Lei, Xuewei Tang, Ziyuan Liu, Le Cui, Kehua Sheng, Bo Zhang, Diange Yang

    Abstract: Constructing high-definition (HD) maps is a crucial requirement for enabling autonomous driving. In recent years, several map segmentation algorithms have been developed to address this need, leveraging advancements in Bird's-Eye View (BEV) perception. However, existing models still encounter challenges in producing realistic and consistent semantic map layouts. One prominent issue is the limited…

    Submitted 3 May, 2024; originally announced May 2024.

  35. arXiv:2405.01555  [pdf, ps, other]

    cs.NI cs.AI

    Digital Twin-Empowered Task Assignment in Aerial MEC Network: A Resource Coalition Cooperation Approach with Generative Model

    Authors: Xin Tang, Qian Chen, Rong Yu, Xiaohuan Li

    Abstract: To meet the demands for ubiquitous communication and temporary edge computing in 6G networks, aerial mobile edge computing (MEC) networks have been envisioned as a new paradigm. However, dynamic user requests pose challenges for task assignment strategies. Most of the existing research assumes that the strategy is deployed on ground-based stations or UAVs, which will be ineffective in an environme…

    Submitted 5 May, 2024; v1 submitted 17 March, 2024; originally announced May 2024.

  36. arXiv:2404.18361  [pdf, other]

    cs.DC cs.AR

    Improving Multi-Instance GPU Efficiency via Sub-Entry Sharing TLB Design

    Authors: Bingyao Li, Yueqi Wang, Tianyu Wang, Lieven Eeckhout, Jun Yang, Aamer Jaleel, Xulong Tang

    Abstract: NVIDIA's Multi-Instance GPU (MIG) technology enables partitioning GPU computing power and memory into separate hardware instances, providing complete isolation including compute resources, caches, and memory. However, prior work identifies that MIG does not extend to partitioning the last-level TLB (i.e., L3 TLB), which remains shared among all instances. To enhance TLB reach, NVIDIA GPUs reorgani…

    Submitted 28 April, 2024; originally announced April 2024.

  37. Cost-Driven Data Replication with Predictions

    Authors: Tianyu Zuo, Xueyan Tang, Bu Sung Lee

    Abstract: This paper studies an online replication problem for distributed data access. The goal is to dynamically create and delete data copies in a multi-server system as time passes to minimize the total storage and network cost of serving access requests. We study the problem in the emergent learning-augmented setting, assuming simple binary predictions about inter-request times at individual servers. W…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: The formal version of this draft will appear in ACM SPAA'24 conference

  38. arXiv:2404.15381  [pdf, other]

    cs.LG cs.AI

    Advances and Open Challenges in Federated Learning with Foundation Models

    Authors: Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Anran Li, Yulan Gao, Alysa Ziying Tan, Bo Zhao, Xiaoxiao Li, Zengxiang Li, Qiang Yang

    Abstract: The integration of Foundation Models (FMs) with Federated Learning (FL) presents a transformative paradigm in Artificial Intelligence (AI), offering enhanced capabilities while addressing concerns of privacy, data decentralization, and computational efficiency. This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic rel…

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Survey of Federated Foundation Models (FedFM)

  39. arXiv:2404.13244  [pdf, other]

    cs.LG cs.GT

    Intelligent Agents for Auction-based Federated Learning: A Survey

    Authors: Xiaoli Tang, Han Yu, Xiaoxiao Li, Sarit Kraus

    Abstract: Auction-based federated learning (AFL) is an important emerging category of FL incentive mechanism design, due to its ability to fairly and efficiently motivate high-quality data owners to join data consumers' (i.e., servers') FL training tasks. To enhance the efficiency in AFL decision support for stakeholders (i.e., data consumers, data owners, and the auctioneer), intelligent agent-based techni…

    Submitted 19 April, 2024; originally announced April 2024.

  40. arXiv:2404.13062  [pdf, other]

    cs.AR cs.NE

    EasyACIM: An End-to-End Automated Analog CIM with Synthesizable Architecture and Agile Design Space Exploration

    Authors: Haoyi Zhang, Jiahao Song, Xiaohan Gao, Xiyuan Tang, Yibo Lin, Runsheng Wang, Ru Huang

    Abstract: Analog Computing-in-Memory (ACIM) is an emerging architecture to perform efficient AI edge computing. However, current ACIM designs usually have unscalable topology and still heavily rely on manual efforts. These drawbacks limit the ACIM application scenarios and lead to an undesired time-to-market. This work proposes an end-to-end automated ACIM based on a synthesizable architecture (EasyACIM). W…

    Submitted 12 April, 2024; originally announced April 2024.

  41. arXiv:2404.10394  [pdf, other]

    cs.CV

    Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

    Authors: Yiqian Wu, Hao Xu, Xiangjun Tang, Xien Chen, Siyu Tang, Zhebin Zhang, Chen Li, Xiaogang **

    Abstract: Existing neural rendering-based text-to-3D-portrait generation methods typically make use of human geometry prior and diffusion models to obtain guidance. However, relying solely on geometry information introduces issues such as the Janus problem, over-saturation, and over-smoothing. We present Portrait3D, a novel neural rendering-based framework with a novel joint geometry-appearance prior to ach…

    Submitted 16 April, 2024; originally announced April 2024.

  42. arXiv:2404.10089  [pdf, other]

    cs.HC

    CFlow: Supporting Semantic Flow Analysis of Students' Code in Programming Problems at Scale

    Authors: Ashley Ge Zhang, Xiaohang Tang, Steve Oney, Yan Chen

    Abstract: The high demand for computer science education has led to high enrollments, with thousands of students in many introductory courses. In such large courses, it can be overwhelmingly difficult for instructors to understand class-wide problem-solving patterns or issues, which is crucial for improving instruction and addressing important pedagogical challenges. In this paper, we propose a technique an…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 10 pages, 4 figures, conditionally accepted by L@S 24

  43. arXiv:2404.09165  [pdf, ps, other]

    cs.IT cs.CR

    Private Multiple Linear Computation: A Flexible Communication-Computation Tradeoff

    Authors: **bao Zhu, Lan** Li, Xiaohu Tang, ** Deng

    Abstract: We consider the problem of private multiple linear computation (PMLC) over a replicated storage system with colluding and unresponsive constraints. In this scenario, the user wishes to privately compute $P$ linear combinations of $M$ files from a set of $N$ replicated servers without revealing any information about the coefficients of these linear combinations to any $T$ colluding servers, in the…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE ISIT 2024

  44. arXiv:2404.08743  [pdf, other]

    cs.HC

    VizGroup: An AI-Assisted Event-Driven System for Real-Time Collaborative Programming Learning Analytics

    Authors: Xiaohang Tang, Sam Wong, Kevin Pu, Xi Chen, Yalong Yang, Yan Chen

    Abstract: Programming instructors often conduct collaborative learning activities, like Peer Instruction, to foster a deeper understanding in students and enhance their engagement with learning. These activities, however, may not always yield productive outcomes due to the diversity of student mental models and their ineffective collaboration. In this work, we introduce VizGroup, an AI-assisted system that…

    Submitted 12 April, 2024; originally announced April 2024.

  45. arXiv:2404.06351  [pdf, other]

    cs.CV

    HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention

    Authors: Xiaolong Tang, Meina Kan, Shiguang Shan, Zhilong Ji, **feng Bai, Xilin Chen

    Abstract: Predicting the trajectories of road agents is essential for autonomous driving systems. The recent mainstream methods follow a static paradigm, which predicts the future trajectory by using a fixed duration of historical frames. These methods make the predictions independently even at adjacent time steps, which leads to potential instability and temporal inconsistency. As successive time steps hav…

    Submitted 11 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: CVPR2024

  46. arXiv:2404.06270  [pdf, other]

    cs.CV

    3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis

    Authors: Zhicheng Lu, Xiang Guo, Le Hui, Tianrui Chen, Min Yang, Xiao Tang, Feng Zhu, Yuchao Dai

    Abstract: In this paper, we propose a 3D geometry-aware deformable Gaussian Splatting method for dynamic view synthesis. Existing neural radiance fields (NeRF) based solutions learn the deformation in an implicit manner, which cannot incorporate 3D scene geometry. Therefore, the learned deformation is not necessarily geometrically coherent, which results in unsatisfactory dynamic view synthesis and 3D dynam…

    Submitted 14 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024. Project page: https://npucvr.github.io/GaGS/

  47. arXiv:2404.05242  [pdf, other]

    cs.RO

    Collision-Free Trajectory Optimization in Cluttered Environments with Sums-of-Squares Programming

    Authors: Yulin Li, Chunxin Zheng, Kai Chen, Yusen Xie, Xindong Tang, Michael Yu Wang, Jun Ma

    Abstract: In this work, we propose a trajectory optimization approach for robot navigation in cluttered 3D environments. We represent the robot's geometry as a semialgebraic set defined by polynomial inequalities such that robots with general shapes can be suitably characterized. To address the robot navigation task in obstacle-dense environments, we exploit the free space directly to construct a sequence o…

    Submitted 8 April, 2024; originally announced April 2024.

  48. arXiv:2404.04285  [pdf, other]

    cs.CL cs.AI

    MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise

    Authors: Chunyuan Deng, Xiangru Tang, Yilun Zhao, Hanming Wang, Haoran Wang, Wangchunshu Zhou, Arman Cohan, Mark Gerstein

    Abstract: Recently, large language models (LLMs) have evolved into interactive agents, proficient in planning, tool use, and task execution across a wide variety of tasks. However, without specific agent tuning, open-source models like LLaMA currently struggle to match the efficiency of GPT-4, particularly given the scarcity of agent-tuning datasets for fine-tuning. In response, we introduce \textsc{Mimir}…

    Submitted 3 April, 2024; originally announced April 2024.

  49. arXiv:2404.02933  [pdf, other]

    cs.DB cs.AI cs.CL

    NL2KQL: From Natural Language to Kusto Query

    Authors: Amir H. Abdi, Xinye Tang, Jeremias Eichelbaum, Mahan Das, Alex Klein, Nihal Irmak Pakis, William Blum, Daniel L Mace, Tanvi Raja, Namrata Padmanabhan, Ye Xing

    Abstract: Data is growing rapidly in volume and complexity. Proficiency in database query languages is pivotal for crafting effective queries. As coding assistants become more prevalent, there is significant opportunity to enhance database query languages. The Kusto Query Language (KQL) is a widely used query language for large semi-structured data such as logs, telemetries, and time-series for big data ana…

    Submitted 15 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  50. arXiv:2404.01975  [pdf, other]

    cs.LG

    DSGNN: A Dual-View Supergrid-Aware Graph Neural Network for Regional Air Quality Estimation

    Authors: Xin Zhang, Ling Chen, Xing Tang, Hongyu Shi

    Abstract: Air quality estimation can provide air quality for target regions without air quality stations, which is useful for the public. Existing air quality estimation methods divide the study area into disjointed grid regions, and apply 2D convolution to model the spatial dependencies of adjacent grid regions based on the first law of geography, failing to model the spatial dependencies of distant grid r…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Submitted to TKDE, 12 pages and 8 figures