Showing 1–50 of 634 results for author: Yang, Q

Searching in archive cs.

  1. arXiv:2406.18862  [pdf, other]

    cs.SD eess.AS

    Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study

    Authors: Peikun Chen, Sining Sun, Changhao Shan, Qing Yang, Lei Xie

    Abstract: Unified speech-text models like SpeechGPT, VioLA, and AudioPaLM have shown impressive performance across various speech-related tasks, especially in Automatic Speech Recognition (ASR). These models typically adopt a unified method to model discrete speech and text tokens, followed by training a decoder-only transformer. However, they are all designed for non-streaming ASR tasks, where the entire s…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted for Interspeech 2024

  2. arXiv:2406.17404  [pdf, other]

    cs.CL cs.LG

    Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training

    Authors: Yixuan Wang, Xianzhen Luo, Fuxuan Wei, Yijun Liu, Qingfu Zhu, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

    Abstract: Existing speculative decoding methods typically require additional model structure and training processes to assist the model for draft token generation. This makes the migration of acceleration methods to the new model more costly and more demanding on device memory. To address this problem, we propose the Make Some Noise (MSN) training framework as a replacement for the supervised fine-tuning st…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 11 pages, 6 figures

  3. arXiv:2406.16442  [pdf, other]

    cs.CV

    EmoLLM: Multimodal Emotional Understanding Meets Large Language Models

    Authors: Qu Yang, Mang Ye, Bo Du

    Abstract: Multi-modal large language models (MLLMs) have achieved remarkable performance on objective multimodal perception tasks, but their ability to interpret subjective, emotionally nuanced multimodal content remains largely unexplored. Thus, it impedes their ability to effectively understand and react to the intricate emotions expressed by humans through multimodal media. To bridge this gap, we introdu…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 9 pages

  4. arXiv:2406.16271  [pdf, other]

    cs.CV

    Feature-prompting GBMSeg: One-Shot Reference Guided Training-Free Prompt Engineering for Glomerular Basement Membrane Segmentation

    Authors: Xueyu Liu, Guangze Shi, Rui Wang, Yexin Lai, Jianan Zhang, Lele Sun, Quan Yang, Yongfei Wu, Ming Li, Weixia Han, Wen Zheng

    Abstract: Assessment of the glomerular basement membrane (GBM) in transmission electron microscopy (TEM) is crucial for diagnosing chronic kidney disease (CKD). The lack of domain-independent automatic segmentation tools for the GBM necessitates an AI-based solution to automate the process. In this study, we introduce GBMSeg, a training-free framework designed to automatically segment the GBM in TEM images…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: Accepted for MICCAI2024

  5. arXiv:2406.13007  [pdf, other]

    cs.CV

    NTIRE 2024 Challenge on Night Photography Rendering

    Authors: Egor Ershov, Artyom Panshin, Oleg Karasev, Sergey Korchagin, Shepelev Lev, Alexandr Startsev, Daniil Vladimirov, Ekaterina Zaychenkova, Nikola Banić, Dmitrii Iarchuk, Maria Efimova, Radu Timofte, Arseniy Terekhin, Shuwei Yue, Yuyang Liu, Minchen Wei, Lu Xu, Chao Zhang, Yasi Wang, Furkan Kınlı, Doğa Yılmaz, Barış Özcan, Furkan Kıraç, Shuai Liu, Jingyuan Xiao, et al. (25 additional authors not shown)

    Abstract: This paper presents a review of the NTIRE 2024 challenge on night photography rendering. The goal of the challenge was to find solutions that process raw camera images taken in nighttime conditions, and thereby produce a photo-quality output images in the standard RGB (sRGB) space. Unlike the previous year's competition, the challenge images were collected with a mobile phone and the speed of algo…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 10 pages, 10 figures

  6. arXiv:2406.12726  [pdf, other]

    cs.SD cs.AI eess.AS

    ED-sKWS: Early-Decision Spiking Neural Networks for Rapid and Energy-Efficient Keyword Spotting

    Authors: Zeyang Song, Qianhui Liu, Qu Yang, Yizhou Peng, Haizhou Li

    Abstract: Keyword Spotting (KWS) is essential in edge computing requiring rapid and energy-efficient responses. Spiking Neural Networks (SNNs) are well-suited for KWS for their efficiency and temporal capacity for speech. To further reduce the latency and energy consumption, this study introduces ED-sKWS, an SNN-based KWS model with an early-decision mechanism that can stop speech processing and output the…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH2024

  7. arXiv:2406.12403  [pdf, other]

    cs.CL cs.AI

    PDSS: A Privacy-Preserving Framework for Step-by-Step Distillation of Large Language Models

    Authors: Tao Fan, Yan Kang, Weijing Chen, Hanlin Gu, Yuanfeng Song, Lixin Fan, Kai Chen, Qiang Yang

    Abstract: In the context of real-world applications, leveraging large language models (LLMs) for domain-specific tasks often faces two major challenges: domain-specific knowledge privacy and constrained resources. To address these issues, we propose PDSS, a privacy-preserving framework for step-by-step distillation of LLMs. PDSS works on a server-client architecture, wherein client transmits perturbed promp…

    Submitted 18 June, 2024; originally announced June 2024.

  8. arXiv:2406.12254  [pdf, other]

    eess.IV cs.CV

    Enhancing Single-Slice Segmentation with 3D-to-2D Unpaired Scan Distillation

    Authors: Xin Yu, Qi Yang, Han Liu, Ho Hin Lee, Yucheng Tang, Lucas W. Remedios, Michael Kim, Shunxing Bao, Ann Xenobia Moore, Luigi Ferrucci, Bennett A. Landman

    Abstract: 2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure. However, single-slice data necessitates the use of 2D networks for segmentation, but these networks often struggle to capture contextual information effectively. Consequently, even when trained on identical datasets, 3D networks typically achieve superior segmenta…

    Submitted 18 June, 2024; originally announced June 2024.

  9. HiFGL: A Hierarchical Framework for Cross-silo Cross-device Federated Graph Learning

    Authors: Zhuoning Guo, Duanyi Yao, Qiang Yang, Hao Liu

    Abstract: Federated Graph Learning (FGL) has emerged as a promising way to learn high-quality representations from distributed graph data with privacy preservation. Despite considerable efforts have been made for FGL under either cross-device or cross-silo paradigm, how to effectively capture graph knowledge in a more complicated cross-silo cross-device environment remains an under-explored problem. However…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted by SIGKDD 2024

  10. arXiv:2406.10540  [pdf, other]

    cs.AI cs.NE cs.RO

    Generating and Evolving Reward Functions for Highway Driving with Large Language Models

    Authors: Xu Han, Qiannan Yang, Xianda Chen, Xiaowen Chu, Meixin Zhu

    Abstract: Reinforcement Learning (RL) plays a crucial role in advancing autonomous driving technologies by maximizing reward functions to achieve the optimal policy. However, crafting these reward functions has been a complex, manual process in many practices. To reduce this complexity, we introduce a novel framework that integrates Large Language Models (LLMs) with RL to improve reward function design in a…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 7 pages, 6 figures

  11. arXiv:2406.10469  [pdf, other]

    eess.IV cs.CV cs.MM

    Object-Attribute-Relation Representation based Video Semantic Communication

    Authors: Qiyuan Du, Yiping Duan, Qianqian Yang, Xiaoming Tao, Mérouane Debbah

    Abstract: With the rapid growth of multimedia data volume, there is an increasing need for efficient video transmission in applications such as virtual reality and future video streaming services. Semantic communication is emerging as a vital technique for ensuring efficient and reliable transmission in low-bandwidth, high-noise settings. However, most current approaches focus on joint source-channel coding…

    Submitted 14 June, 2024; originally announced June 2024.

  12. arXiv:2406.04601  [pdf, other]

    cs.LG

    Enhancing Size Generalization in Graph Neural Networks through Disentangled Representation Learning

    Authors: Zheng Huang, Qihui Yang, Dawei Zhou, Yujun Yan

    Abstract: Although most graph neural networks (GNNs) can operate on graphs of any size, their classification performance often declines on graphs larger than those encountered during training. Existing methods insufficiently address the removal of size information from graph representations, resulting in sub-optimal performance and reliance on backbone models. In response, we propose DISGEN, a novel and mod…

    Submitted 11 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  13. arXiv:2406.04323  [pdf, other]

    cs.LG cs.AI cs.CV

    ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories

    Authors: Qianlan Yang, Yu-Xiong Wang

    Abstract: Training autonomous agents with sparse rewards is a long-standing problem in online reinforcement learning (RL), due to low data efficiency. Prior work overcomes this challenge by extracting useful knowledge from offline data, often accomplished through the learning of action distribution from offline data and utilizing the learned distribution to facilitate online RL. However, since the offline d…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: ICML 2024 Accepted

  14. arXiv:2406.04025  [pdf]

    cs.CL

    The syntax-semantics interface in a child's path: A study of 3- to 11-year-olds' elicited production of Mandarin recursive relative clauses

    Authors: Caimei Yang, Qihang Yang, Xingzhi Su, Chenxi Fu, Xiaoyi Wang, Ying Yan, Zaijiang Man

    Abstract: There have been apparently conflicting claims over the syntax-semantics relationship in child acquisition. However, few of them have assessed the child's path toward the acquisition of recursive relative clauses (RRCs). The authors of the current paper did experiments to investigate 3- to 11-year-olds' most-structured elicited production of eight Mandarin RRCs in a 4 (syntactic types)*2 (semantic…

    Submitted 6 June, 2024; originally announced June 2024.

  15. arXiv:2406.03868  [pdf, other]

    cs.DC

    PALM: A Efficient Performance Simulator for Tiled Accelerators with Large-scale Model Training

    Authors: Jiahao Fang, Huizheng Wang, Qize Yang, Dehao Kong, Xu Dai, Jinyi Deng, Yang Hu, Shouyi Yin

    Abstract: Deep learning (DL) models are piquing high interest and scaling at an unprecedented rate. To this end, a handful of tiled accelerators have been proposed to support such large-scale training tasks. However, these accelerators often incorporate numerous cores or tiles even extending to wafer-scale, substantial on-chip bandwidth, and distributed memory systems. This results in an exceedingly complex…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 11 pages

  16. arXiv:2406.02916  [pdf, other]

    cs.RO

    Real-time Motion Planning for autonomous vehicles in dynamic environments

    Authors: Mohammad Dehghani Tezerjani, Dominic Carrillo, Deyuan Qu, Sudip Dhakal, Amir Mirzaeinia, Qing Yang

    Abstract: Recent advancements in self-driving car technologies have enabled them to navigate autonomously through various environments. However, one of the critical challenges in autonomous vehicle operation is trajectory planning, especially in dynamic environments with moving obstacles. This research aims to tackle this challenge by proposing a robust algorithm tailored for autonomous cars operating in dy…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 8 pages

  17. arXiv:2406.02224  [pdf, other]

    cs.CL cs.AI

    FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models

    Authors: Tao Fan, Guoqiang Ma, Yan Kang, Hanlin Gu, Yuanfeng Song, Lixin Fan, Kai Chen, Qiang Yang

    Abstract: Recent research in federated large language models (LLMs) has primarily focused on enabling clients to fine-tune their locally deployed homogeneous LLMs collaboratively or on transferring knowledge from server-based LLMs to small language models (SLMs) at downstream clients. However, a significant gap remains in the simultaneous mutual enhancement of both the server's LLM and clients' SLMs. To bri…

    Submitted 18 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  18. arXiv:2406.01956  [pdf, other]

    cs.CV

    Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt

    Authors: Zhicheng Ding, Panfeng Li, Qikai Yang, Siyang Li

    Abstract: This paper presents a novel approach to enhance image-to-image generation by leveraging the multimodal capabilities of the Large Language and Vision Assistant (LLaVA). We propose a framework where LLaVA analyzes input images and generates textual descriptions, hereinafter LLaVA-generated prompts. These prompts, along with the original image, are fed into the image-to-image generation pipeline. Thi…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by 2024 5th International Conference on Information Science, Parallel and Distributed Systems

  19. arXiv:2406.01422  [pdf, other]

    cs.SE cs.CL

    How to Understand Whole Software Repository?

    Authors: Yingwei Ma, Qingping Yang, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li

    Abstract: Recently, Large Language Model (LLM) based agents have advanced the significant development of Automatic Software Engineering (ASE). Although verified effectiveness, the designs of the existing methods mainly focus on the local information of codes, e.g., issues, classes, and functions, leading to limitations in capturing the global context and interdependencies within the software system. From th…

    Submitted 3 June, 2024; originally announced June 2024.

  20. arXiv:2406.01085  [pdf, other]

    cs.CR cs.AI

    FedAdOb: Privacy-Preserving Federated Deep Learning with Adaptive Obfuscation

    Authors: Hanlin Gu, Jiahuan Luo, Yan Kang, Yuan Yao, Gongxi Zhu, Bowen Li, Lixin Fan, Qiang Yang

    Abstract: Federated learning (FL) has emerged as a collaborative approach that allows multiple clients to jointly learn a machine learning model without sharing their private data. The concern about privacy leakage, albeit demonstrated under specific conditions, has triggered numerous follow-up research in designing powerful attacking methods and effective defending mechanisms aiming to thwart these attacki…

    Submitted 3 June, 2024; originally announced June 2024.

  21. arXiv:2405.20681  [pdf, other]

    cs.CR cs.AI

    No Free Lunch Theorem for Privacy-Preserving LLM Inference

    Authors: Xiaojin Zhang, Yulin Fei, Yan Kang, Wei Chen, Lixin Fan, Hai Jin, Qiang Yang

    Abstract: Individuals and businesses have been significantly benefited by Large Language Models (LLMs) including PaLM, Gemini and ChatGPT in various ways. For example, LLMs enhance productivity, reduce costs, and enable us to focus on more valuable tasks. Furthermore, LLMs possess the capacity to sift through extensive datasets, uncover underlying patterns, and furnish critical insights that propel the fron…

    Submitted 31 May, 2024; originally announced May 2024.

  22. arXiv:2405.18802  [pdf, other]

    cs.CR cs.AI

    Enhancing Security and Privacy in Federated Learning using Update Digests and Voting-Based Defense

    Authors: Wenjie Li, Kai Fan, Jingyuan Zhang, Hui Li, Wei Yang Bryan Lim, Qiang Yang

    Abstract: Federated Learning (FL) is a promising privacy-preserving machine learning paradigm that allows data owners to collaboratively train models while keeping their data localized. Despite its potential, FL faces challenges related to the trustworthiness of both clients and servers, especially in the presence of curious or malicious adversaries. In this paper, we introduce a novel framework named \unde…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 14 pages

  23. arXiv:2405.18776  [pdf, other]

    cs.CR cs.CL cs.LG

    LMO-DP: Optimizing the Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models

    Authors: Qin Yang, Meisam Mohammad, Han Wang, Ali Payani, Ashish Kundu, Kai Shu, Yan Yan, Yuan Hong

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) and its variants have been proposed to ensure rigorous privacy for fine-tuning large-scale pre-trained language models. However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $ε< 3$). To address such limitations…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 18 pages, 15 figures

  24. arXiv:2405.17660  [pdf, other]

    cs.CV

    LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking

    Authors: Shaohua Dong, Yunhe Feng, Qing Yang, Yuewei Lin, Heng Fan

    Abstract: High-performance Transformer trackers have shown excellent results, yet they often bear a heavy computational load. Observing that a smaller input can immediately and conveniently reduce computations without changing the model, an easy solution is to adopt the low-resolution input for efficient Transformer tracking. Albeit faster, this hurts tracking accuracy much due to information loss in low re…

    Submitted 27 May, 2024; originally announced May 2024.

  25. arXiv:2405.17522  [pdf, other]

    cs.LG cs.DC

    Efficient Model Compression for Hierarchical Federated Learning

    Authors: Xi Zhu, Songcan Yu, Junbo Wang, Qinglin Yang

    Abstract: Federated learning (FL), as an emerging collaborative learning paradigm, has garnered significant attention due to its capacity to preserve privacy within distributed learning systems. In these systems, clients collaboratively train a unified neural network model using their local datasets and share model parameters rather than raw data, enhancing privacy. Predominantly, FL systems are designed fo…

    Submitted 27 May, 2024; originally announced May 2024.

  26. arXiv:2405.17221  [pdf, other]

    cs.AI cs.AR

    Efficient Orchestrated AI Workflows Execution on Scale-out Spatial Architecture

    Authors: Jinyi Deng, Xinru Tang, Zhiheng Yue, Guangyang Lu, Qize Yang, Jiahao Zhang, Jinxi Li, Chao Li, Shaojun Wei, Yang Hu, Shouyi Yin

    Abstract: Given the increasing complexity of AI applications, traditional spatial architectures frequently fall short. Our analysis identifies a pattern of interconnected, multi-faceted tasks encompassing both AI and general computational processes. In response, we have conceptualized "Orchestrated AI Workflows," an approach that integrates various tasks with logic-driven decisions into dynamic, sophisticat…

    Submitted 21 May, 2024; originally announced May 2024.

  27. arXiv:2405.15474  [pdf, other]

    cs.LG cs.DC

    Unlearning during Learning: An Efficient Federated Machine Unlearning Method

    Authors: Hanlin Gu, Gongxi Zhu, Jie Zhang, Xinyuan Zhao, Yuxing Han, Lixin Fan, Qiang Yang

    Abstract: In recent years, Federated Learning (FL) has garnered significant attention as a distributed machine learning paradigm. To facilitate the implementation of the right to be forgotten, the concept of federated machine unlearning (FMU) has also emerged. However, current FMU approaches often involve additional time-consuming steps and may not offer comprehensive unlearning capabilities, which renders…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024

  28. arXiv:2405.14488  [pdf, other]

    cs.CL

    MoGU: A Framework for Enhancing Safety of Open-Sourced LLMs While Preserving Their Usability

    Authors: Yanrui Du, Sendong Zhao, Danyang Zhao, Ming Ma, Yuhan Chen, Liangyu Huo, Qing Yang, Dongliang Xu, Bing Qin

    Abstract: Large Language Models (LLMs) are increasingly deployed in various applications. As their usage grows, concerns regarding their safety are rising, especially in maintaining harmless responses when faced with malicious instructions. Many defense strategies have been developed to enhance the safety of LLMs. However, our research finds that existing defense strategies lead LLMs to predominantly adopt…

    Submitted 23 May, 2024; originally announced May 2024.

  29. arXiv:2405.14212  [pdf, other]

    cs.CR cs.CL

    Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data

    Authors: Haoran Li, Xinyuan Zhao, Dadi Guo, Hanlin Gu, Ziqian Zeng, Yuxing Han, Yangqiu Song, Lixin Fan, Qiang Yang

    Abstract: As large language models (LLMs) demonstrate unparalleled performance and generalization ability, LLMs are widely used and integrated into various applications. When it comes to sensitive domains, as commonly described in federated learning scenarios, directly using external LLMs on private data is strictly prohibited by stringent data security and privacy regulations. For local clients, the utiliz…

    Submitted 23 May, 2024; originally announced May 2024.

  30. arXiv:2405.13483  [pdf, other]

    cs.IT

    Distributed Indirect Source Coding with Decoder Side Information

    Authors: Jiancheng Tang, Qianqian Yang, Deniz Gündüz

    Abstract: This paper studies a variant of the rate-distortion problem motivated by task-oriented semantic communication and distributed learning problems, where $M$ correlated sources are independently encoded for a central decoder. The decoder has access to a correlated side information in addition to the messages received from the encoders, and aims to recover a latent random variable correlated with the…

    Submitted 22 May, 2024; originally announced May 2024.

  31. arXiv:2405.11493  [pdf, other]

    cs.CV cs.IT eess.SP

    Point Cloud Compression with Implicit Neural Representations: A Unified Framework

    Authors: Hongning Ruan, Yulin Shao, Qianqian Yang, Liang Zhao, Dusit Niyato

    Abstract: Point clouds have become increasingly vital across various applications thanks to their ability to realistically depict 3D objects and scenes. Nevertheless, effectively compressing unstructured, high-precision point cloud data remains a significant challenge. In this paper, we present a pioneering point cloud compression framework capable of handling both geometry and attribute components. Unlike…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 6 Pages, 6 Figures, submitted to IEEE ICCC

  32. arXiv:2405.10762  [pdf]

    q-fin.RM cs.AI cs.LG

    Research on Credit Risk Early Warning Model of Commercial Banks Based on Neural Network Algorithm

    Authors: Yu Cheng, Qin Yang, Liyang Wang, Ao Xiang, Jingyu Zhang

    Abstract: In the realm of globalized financial markets, commercial banks are confronted with an escalating magnitude of credit risk, thereby imposing heightened requisites upon the security of bank assets and financial stability. This study harnesses advanced neural network techniques, notably the Backpropagation (BP) neural network, to pioneer a novel model for preempting credit risk in commercial banks. T…

    Submitted 30 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  33. arXiv:2405.02364  [pdf, other]

    cs.LG cs.DC

    A Survey on Contribution Evaluation in Vertical Federated Learning

    Authors: Yue Cui, Chung-ju Huang, Yuzhu Zhang, Leye Wang, Lixin Fan, Xiaofang Zhou, Qiang Yang

    Abstract: Vertical Federated Learning (VFL) has emerged as a critical approach in machine learning to address privacy concerns associated with centralized data storage and processing. VFL facilitates collaboration among multiple entities with distinct feature sets on the same user population, enabling the joint training of predictive models without direct data sharing. A key aspect of VFL is the fair and ac…

    Submitted 3 May, 2024; originally announced May 2024.

  34. arXiv:2405.01701  [pdf]

    cs.CV

    Active Learning Enabled Low-cost Cell Image Segmentation Using Bounding Box Annotation

    Authors: Yu Zhu, Qiang Yang, Li Xu

    Abstract: Cell image segmentation is usually implemented using fully supervised deep learning methods, which heavily rely on extensive annotated training data. Yet, due to the complexity of cell morphology and the requirement for specialized knowledge, pixel-level annotation of cell images has become a highly labor-intensive task. To address the above problems, we propose an active learning framework for ce…

    Submitted 2 May, 2024; originally announced May 2024.

  35. arXiv:2405.00482  [pdf, other]

    cs.CR cs.LG

    PackVFL: Efficient HE Packing for Vertical Federated Learning

    Authors: Liu Yang, Shuowei Cai, Di Chai, Junxue Zhang, Han Tian, Yilun Jin, Kun Guo, Kai Chen, Qiang Yang

    Abstract: As an essential tool of secure distributed machine learning, vertical federated learning (VFL) based on homomorphic encryption (HE) suffers from severe efficiency problems due to data inflation and time-consuming operations. To this core, we propose PackVFL, an efficient VFL framework based on packed HE (PackedHE), to accelerate the existing HE-based VFL algorithms. PackVFL packs multiple cleartex…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 12 pages excluding references

  36. arXiv:2405.00365  [pdf, other]

    cs.IT eess.SP

    Robust Continuous-Time Beam Tracking with Liquid Neural Network

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Richeng Jin, Qianqian Yang, Ahmed Alhammadi, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Millimeter-wave (mmWave) technology is increasingly recognized as a pivotal technology of the sixth-generation communication networks due to the large amounts of available spectrum at high frequencies. However, the huge overhead associated with beam training imposes a significant challenge in mmWave communications, particularly in urban environments with high background noise. To reduce this high…

    Submitted 1 May, 2024; originally announced May 2024.

  37. arXiv:2405.00253  [pdf, other]

    cs.CL cs.SE

    CodeHalu: Code Hallucinations in LLMs Driven by Execution-based Verification

    Authors: Yuchen Tian, Weixiang Yan, Qian Yang, Qian Chen, Wen Wang, Ziyang Luo, Lei Ma

    Abstract: Large Language Models (LLMs) have made significant progress in code generation, providing developers with unprecedented automated programming support. However, LLMs often generate code that is syntactically correct and even semantically plausible but may not execute as expected or meet specified requirements. This phenomenon of hallucinations in the code domain has not been systematically explored…

    Submitted 26 June, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

  38. arXiv:2404.19750  [pdf, other]

    cs.IT eess.SP

    A Joint Communication and Computation Design for Distributed RISs Assisted Probabilistic Semantic Communication in IIoT

    Authors: Zhouxiang Zhao, Zhaohui Yang, Chongwen Huang, Li Wei, Qianqian Yang, Caijun Zhong, Wei Xu, Zhaoyang Zhang

    Abstract: In this paper, the problem of spectral-efficient communication and computation resource allocation for distributed reconfigurable intelligent surfaces (RISs) assisted probabilistic semantic communication (PSC) in industrial Internet-of-Things (IIoT) is investigated. In the considered model, multiple RISs are deployed to serve multiple users, while PSC adopts compute-then-transmit protocol to reduc…

    Submitted 30 April, 2024; originally announced April 2024.

  39. arXiv:2404.19534  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

    Authors: Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu Jin, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huanjing Yue, Jingyu Yang, et al. (38 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 27 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Nighttime Flare Removal Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  40. arXiv:2404.18848  [pdf, other]

    cs.LG cs.AI cs.CL

    FeDeRA:Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition

    Authors: Yuxuan Yan, Qianqian Yang, Shunpu Tang, Zhiguo Shi

    Abstract: Despite their exceptional performance on various tasks after fine-tuning, pre-trained language models (PLMs) face significant challenges due to growing privacy concerns with data in centralized training methods. We consider federated learning (FL) to fine-tune PLMs in this paper. However, the substantial number of parameters in PLMs poses significant difficulties for client devices with limited co…

    Submitted 25 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  41. arXiv:2404.18081  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG cs.MM eess.AS

    ComposerX: Multi-Agent Symbolic Music Composition with LLMs

    Authors: Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo

    Abstract: Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints. While demonstrating impressive capabilities in STEM subjects, current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and C…

    Submitted 30 April, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  42. TIUP: Effective Processor Verification with Tautology-Induced Universal Properties

    Authors: Yufeng Li, Yiwei Ci, Qiusong Yang

    Abstract: Design verification is a complex and costly task, especially for large and intricate processor projects. Formal verification techniques provide advantages by thoroughly examining design behaviors, but they require extensive labor and expertise in property formulation. Recent research focuses on verifying designs using the self-consistency universal property, reducing verification difficulty as it…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted by ASP-DAC 2024, please note that this is not the final camera-ready version

  43. arXiv:2404.16296  [pdf]

    cs.CV cs.AI

    Research on Splicing Image Detection Algorithms Based on Natural Image Statistical Characteristics

    Authors: Ao Xiang, Jingyu Zhang, Qin Yang, Liyang Wang, Yu Cheng

    Abstract: With the development and widespread application of digital image processing technology, image splicing has become a common method of image manipulation, raising numerous security and legal issues. This paper introduces a new splicing image detection algorithm based on the statistical characteristics of natural images, aimed at improving the accuracy and efficiency of splicing image detection. By a…

    Submitted 17 May, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  44. arXiv:2404.15381  [pdf, other]

    cs.LG cs.AI

    Advances and Open Challenges in Federated Learning with Foundation Models

    Authors: Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Anran Li, Yulan Gao, Alysa Ziying Tan, Bo Zhao, Xiaoxiao Li, Zengxiang Li, Qiang Yang

    Abstract: The integration of Foundation Models (FMs) with Federated Learning (FL) presents a transformative paradigm in Artificial Intelligence (AI), offering enhanced capabilities while addressing concerns of privacy, data decentralization, and computational efficiency. This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic rel…

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Survey of Federated Foundation Models (FedFM)

  45. arXiv:2404.14781  [pdf, other]

    cs.LO cs.FL

    Improved Algorithm for Reachability in $d$-VASS

    Authors: Yuxi Fu, Qizhe Yang, Yangluo Zheng

    Abstract: An $\mathsf{F}_{d}$ upper bound for the reachability problem in vector addition systems with states (VASS) in fixed dimension is given, where $\mathsf{F}_d$ is the $d$-th level of the Grzegorczyk hierarchy of complexity classes. The new algorithm combines the idea of the linear path scheme characterization of the reachability in the $2$-dimension VASSes with the general decomposition algorithm by…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 36 pages

  46. arXiv:2404.13880  [pdf, other]

    cs.CV

    Regional Style and Color Transfer

    Authors: Zhicheng Ding, Panfeng Li, Qikai Yang, Siyang Li, Qingtian Gong

    Abstract: This paper presents a novel contribution to the field of regional style transfer. Existing methods often suffer from the drawback of applying style homogeneously across the entire image, leading to stylistic inconsistencies or foreground object twisted when applied to image with foreground elements such as person figures. To address this limitation, we propose a new approach that leverages a segme…

    Submitted 26 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted by 2024 5th International Conference on Computer Vision, Image and Deep Learning

  47. arXiv:2404.13812  [pdf, other]

    cs.SI cs.AI

    A Comparative Study on Enhancing Prediction in Social Network Advertisement through Data Augmentation

    Authors: Qikai Yang, Panfeng Li, Xinhe Xu, Zhicheng Ding, Wenjing Zhou, Yi Nian

    Abstract: In the ever-evolving landscape of social network advertising, the volume and accuracy of data play a critical role in the performance of predictive models. However, the development of robust predictive algorithms is often hampered by the limited size and potential bias present in real-world datasets. This study presents and explores a generative augmentation framework of social network advertising…

    Submitted 28 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted by 2024 4th International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)

  48. arXiv:2404.13565  [pdf, other]

    cs.CV cs.AI cs.CL

    Exploring Diverse Methods in Visual Question Answering

    Authors: Panfeng Li, Qikai Yang, Xieming Geng, Wenjing Zhou, Zhicheng Ding, Yi Nian

    Abstract: This study explores innovative methods for improving Visual Question Answering (VQA) using Generative Adversarial Networks (GANs), autoencoders, and attention mechanisms. Leveraging a balanced VQA dataset, we investigate three distinct strategies. Firstly, GAN-based approaches aim to generate answer embeddings conditioned on image and question inputs, showing potential but struggling with more com…

    Submitted 20 May, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted by 2024 5th International Conference on Electronic Communication and Artificial Intelligence

  49. arXiv:2404.13401  [pdf, other]

    cs.LG

    Approximate Algorithms For $k$-Sparse Wasserstein Barycenter With Outliers

    Authors: Qingyuan Yang, Hu Ding

    Abstract: Wasserstein Barycenter (WB) is one of the most fundamental optimization problems in optimal transportation. Given a set of distributions, the goal of WB is to find a new distribution that minimizes the average Wasserstein distance to them. The problem becomes even harder if we restrict the solution to be ``$k$-sparse''. In this paper, we study the $k$-sparse WB problem in the presence of outliers,…

    Submitted 20 April, 2024; originally announced April 2024.

  50. arXiv:2404.12634  [pdf]

    cs.CV cs.AI cs.LG

    Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment

    Authors: Danqing Ma, Meng Wang, Ao Xiang, Zongqing Qi, Qin Yang

    Abstract: This study proposes a multi-modal fusion framework Multitrans based on the Transformer architecture and self-attention mechanism. This architecture combines the study of non-contrast computed tomography (NCCT) images and discharge diagnosis reports of patients undergoing stroke treatment, using a variety of methods based on Transformer architecture approach to predicting functional outcomes of str…

    Submitted 19 April, 2024; originally announced April 2024.