Showing 1–50 of 115 results for author: Wei, K

Searching in archive cs.
  1. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.00752  [pdf, other]

    cs.DC

    Blockchain-aided wireless federated learning: Resource allocation and client scheduling

    Authors: Jun Li, Weiwei Zhang, Kang Wei, Guangji Chen, Feng Shu, Wen Chen, Shi Jin

    Abstract: Federated learning (FL) based on the centralized design faces both challenges regarding the trust issue and a single point of failure. To alleviate these issues, blockchain-aided decentralized FL (BDFL) introduces the decentralized network architecture into the FL training process, which can effectively overcome the defects of centralized architecture. However, deploying BDFL in wireless networks…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 14 pages, 4 figures

  3. arXiv:2405.17932  [pdf, ps, other]

    cs.LG cs.DC

    Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization

    Authors: Xiumei Deng, Jun Li, Kang Wei, Long Shi, Zeihui Xiong, Ming Ding, Wen Chen, Shi Jin, H. Vincent Poor

    Abstract: Adaptive moment estimation (Adam), as a Stochastic Gradient Descent (SGD) variant, has gained widespread popularity in federated learning (FL) due to its fast convergence. However, federated Adam (FedAdam) algorithms suffer from a threefold increase in uplink communication overhead compared to federated SGD (FedSGD) algorithms, which arises from the necessity to transmit both local model updates a…

    Submitted 28 May, 2024; originally announced May 2024.

  4. arXiv:2405.17914  [pdf, other]

    cs.LG

    Trustworthy DNN Partition for Blockchain-enabled Digital Twin in Wireless IIoT Networks

    Authors: Xiumei Deng, Jun Li, Long Shi, Kang Wei, Ming Ding, Yumeng Shao, Wen Chen, Shi Jin

    Abstract: Digital twin (DT) has emerged as a promising solution to enhance manufacturing efficiency in industrial Internet of Things (IIoT) networks. To promote the efficiency and trustworthiness of DT for wireless IIoT networks, we propose a blockchain-enabled DT (B-DT) framework that employs deep neural network (DNN) partitioning technique and reputation-based consensus mechanism, wherein the DTs maintain…

    Submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2405.13080  [pdf, other]

    cs.CR cs.LG

    EmInspector: Combating Backdoor Attacks in Federated Self-Supervised Learning Through Embedding Inspection

    Authors: Yuwen Qian, Shuchi Wu, Kang Wei, Ming Ding, Di Xiao, Tao Xiang, Chuan Ma, Song Guo

    Abstract: Federated self-supervised learning (FSSL) has recently emerged as a promising paradigm that enables the exploitation of clients' vast amounts of unlabeled data while preserving data privacy. While FSSL offers advantages, its susceptibility to backdoor attacks, a concern identified in traditional federated supervised learning (FSL), has not been investigated. To fill the research gap, we undertake…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 18 pages, 12 figures

  6. arXiv:2405.06993  [pdf, other]

    cs.LG cs.DC

    Robust Model Aggregation for Heterogeneous Federated Learning: Analysis and Optimizations

    Authors: Yumeng Shao, Jun Li, Long Shi, Kang Wei, Ming Ding, Qianmu Li, Zengxiang Li, Wen Chen, Shi Jin

    Abstract: Conventional synchronous federated learning (SFL) frameworks suffer from performance degradation in heterogeneous systems due to imbalanced local data size and diverse computing power on the client side. To address this problem, asynchronous FL (AFL) and semi-asynchronous FL have been proposed to recover the performance loss by allowing asynchronous aggregation. However, asynchronous aggregation i…

    Submitted 11 May, 2024; originally announced May 2024.

  7. arXiv:2405.05802  [pdf, other]

    cs.DC cs.AI

    Deploying Graph Neural Networks in Wireless Networks: A Link Stability Viewpoint

    Authors: Jun Li, Weiwei Zhang, Kang Wei, Guangji Chen, Long Shi, Wen Chen

    Abstract: As an emerging artificial intelligence technology, graph neural networks (GNNs) have exhibited promising performance across a wide range of graph-related applications. However, information exchanges among neighbor nodes in GNN pose new challenges in the resource-constrained scenario, especially in wireless systems. In practical wireless systems, the communication links among nodes are usually unre…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 5 pages, 3 figures

  8. arXiv:2405.03516  [pdf, other]

    cs.LG

    GI-SMN: Gradient Inversion Attack against Federated Learning without Prior Knowledge

    Authors: ** Qian, Kaimin Wei, Yongdong Wu, Jilian Zhang, Jipeng Chen, Huan Bao

    Abstract: Federated learning (FL) has emerged as a privacy-preserving machine learning approach where multiple parties share gradient information rather than original user data. Recent work has demonstrated that gradient inversion attacks can exploit the gradients of FL to recreate the original user data, posing significant privacy risks. However, these attacks make strong assumptions about the attacker, su…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 18 pages, 10 figures, conference

  9. arXiv:2405.03152  [pdf, other]

    eess.AS cs.SD

    MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

    Authors: Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

    Abstract: Despite notable advancements in automatic speech recognition (ASR), performance tends to degrade when faced with adverse conditions. Generative error correction (GER) leverages the exceptional text comprehension capabilities of large language models (LLM), delivering impressive performance in ASR error correction, where N-best hypotheses provide valuable information for transcription prediction. H…

    Submitted 6 May, 2024; originally announced May 2024.

  10. arXiv:2405.02132  [pdf, other]

    cs.SD cs.CL eess.AS

    Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets

    Authors: Xuelong Geng, Tianyi Xu, Kun Wei, Bingshen Mu, Hongfei Xue, He Wang, Yangze Li, Pengcheng Guo, Yuhang Dai, Longhao Li, Mingchen Shao, Lei Xie

    Abstract: Large Language Models (LLMs) have demonstrated unparalleled effectiveness in various NLP tasks, and integrating LLMs with automatic speech recognition (ASR) is becoming a mainstream paradigm. Building upon this momentum, our research delves into an in-depth examination of this paradigm on a large open-source Chinese dataset. Specifically, our research aims to evaluate the impact of various configu…

    Submitted 6 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  11. arXiv:2404.16348  [pdf, other]

    cs.CV

    Dual Expert Distillation Network for Generalized Zero-Shot Learning

    Authors: Zhijie Rao, Jingcai Guo, Xiaocheng Lu, Jingming Liang, Jie Zhang, Haozhao Wang, Kang Wei, Xiaofeng Cao

    Abstract: Zero-shot learning has consistently yielded remarkable progress via modeling nuanced one-to-one visual-attribute correlation. Existing studies resort to refining a uniform mapping function to align and correlate the sample regions and subattributes, ignoring two crucial issues: 1) the inherent asymmetry of attributes; and 2) the unutilized channel information. This paper addresses these issues by…

    Submitted 29 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 9 pages, 4 figures; Accepted to IJCAI 2024

  12. arXiv:2404.13860  [pdf, other]

    cs.LG cs.CR

    Distributional Black-Box Model Inversion Attack with Multi-Agent Reinforcement Learning

    Authors: Huan Bao, Kaimin Wei, Yongdong Wu, ** Qian, Robert H. Deng

    Abstract: A Model Inversion (MI) attack based on Generative Adversarial Networks (GAN) aims to recover the private training data from complex deep learning models by searching codes in the latent space. However, they merely search a deterministic latent space such that the found latent code is usually suboptimal. In addition, the existing distributional MI schemes assume that an attacker can access the stru…

    Submitted 22 April, 2024; originally announced April 2024.

  13. arXiv:2404.03372  [pdf, other]

    math.OC cs.LG

    Elementary Analysis of Policy Gradient Methods

    Authors: Jiacai Liu, Wenye Li, Ke Wei

    Abstract: Projected policy gradient under the simplex parameterization, policy gradient and natural policy gradient under the softmax parameterization, are fundamental algorithms in reinforcement learning. There have been a flurry of recent activities in studying these algorithms from the theoretical aspect. Despite this, their convergence behavior is still not fully understood, even given the access to exa…

    Submitted 10 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  14. arXiv:2402.15526  [pdf, other]

    cs.AI cs.LG

    Chain-of-Specificity: An Iteratively Refining Method for Eliciting Knowledge from Large Language Models

    Authors: Kaiwen Wei, Jingyuan Zhang, Hongzhi Zhang, Fuzheng Zhang, Di Zhang, Li **, Yue Yu

    Abstract: Large Language Models (LLMs) exhibit remarkable generative capabilities, enabling the generation of valuable information. Despite these advancements, previous research found that LLMs sometimes struggle with adhering to specific constraints (e.g., in specific place or at specific time), at times even overlooking them, which leads to responses that are either too generic or not fully satisfactory.…

    Submitted 20 February, 2024; originally announced February 2024.

  15. arXiv:2402.12957  [pdf, other]

    cs.DC

    Energy-Efficient Wireless Federated Learning via Doubly Adaptive Quantization

    Authors: Xuefeng Han, Wen Chen, Jun Li, Ming Ding, Qingqing Wu, Kang Wei, Xiumei Deng, Zhen Mei

    Abstract: Federated learning (FL) has been recognized as a viable distributed learning paradigm for training a machine learning model across distributed clients without uploading raw data. However, FL in wireless networks still faces two major challenges, i.e., large communication overhead and high energy consumption, which are exacerbated by client heterogeneity in dataset sizes and wireless channels. Whil…

    Submitted 20 February, 2024; originally announced February 2024.

  16. arXiv:2402.10705  [pdf, other]

    cs.AI

    AutoSAT: Automatically Optimize SAT Solvers via Large Language Models

    Authors: Yiwen Sun, Xianyin Zhang, Shiyu Huang, Shaowei Cai, BingZhen Zhang, Ke Wei

    Abstract: Heuristics are crucial in SAT solvers, but no heuristic rules are suitable for all SAT problems. Therefore, it is helpful to refine specific heuristics for specific problems. In this context, we present AutoSAT, a novel framework for automatically optimizing heuristics in SAT solvers. AutoSAT is based on Large Language Models (LLMs) which is able to autonomously generate codes, conduct evaluation,…

    Submitted 31 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  17. arXiv:2402.02008  [pdf, other]

    cs.CL cs.AI

    How well do LLMs cite relevant medical references? An evaluation framework and analyses

    Authors: Kevin Wu, Eric Wu, Ally Cassasola, Angela Zhang, Kevin Wei, Teresa Nguyen, Sith Riantawan, Patricia Shi Riantawan, Daniel E. Ho, James Zou

    Abstract: Large language models (LLMs) are currently being used to answer medical questions across a variety of clinical domains. Recent top-performing commercial LLMs, in particular, are also capable of citing sources to support their responses. In this paper, we ask: do the sources that LLMs generate actually support the claims that they make? To answer this, we propose three contributions. First, as expe…

    Submitted 2 February, 2024; originally announced February 2024.

  18. arXiv:2401.14446  [pdf, other]

    cs.CY cs.AI cs.CR

    Black-Box Access is Insufficient for Rigorous AI Audits

    Authors: Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell

    Abstract: External audits of AI systems are increasingly recognized as a key mechanism for AI governance. The effectiveness of an audit, however, depends on the degree of access granted to auditors. Recent audits of state-of-the-art AI systems have primarily relied on black-box access, in which auditors can only query the system and observe its outputs. However, white-box access to the system's inner workin…

    Submitted 29 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: FAccT 2024

    Journal ref: The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT '24), June 3-6, 2024, Rio de Janeiro, Brazil

  19. arXiv:2401.13363  [pdf, other]

    cs.CV

    Do You Guys Want to Dance: Zero-Shot Compositional Human Dance Generation with Multiple Persons

    Authors: Zhe Xu, Kun Wei, Xu Yang, Cheng Deng

    Abstract: Human dance generation (HDG) aims to synthesize realistic videos from images and sequences of driving poses. Despite great success, existing methods are limited to generating videos of a single person with specific backgrounds, while the generalizability for real-world scenarios with multiple persons and complex backgrounds remains unclear. To systematically measure the generalizability of HDG mod…

    Submitted 24 January, 2024; originally announced January 2024.

  20. arXiv:2401.13138  [pdf, other]

    cs.CY cs.AI

    Visibility into AI Agents

    Authors: Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung

    Abstract: Increased delegation of commercial, scientific, governmental, and personal activities to AI agents -- systems capable of pursuing complex goals with limited supervision -- may exacerbate existing societal risks and introduce new risks. Understanding and mitigating these risks involves critically evaluating existing governance structures, revising and adapting these structures where needed, and ens…

    Submitted 17 May, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted to ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024)

  21. arXiv:2401.01084  [pdf, other]

    cs.LG math.OC

    Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction

    Authors: Jie Feng, Ke Wei, Jinchi Chen

    Abstract: Natural policy gradient (NPG) and its variants are widely-used policy search methods in reinforcement learning. Inspired by prior work, a new NPG variant coined NPG-HM is developed in this paper, which utilizes the Hessian-aided momentum technique for variance reduction, while the sub-problem is solved via the stochastic gradient descent method. It is shown that NPG-HM can achieve the global last…

    Submitted 21 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

  22. arXiv:2312.16497  [pdf]

    cs.DC cs.AI

    Mobility and Cost Aware Inference Accelerating Algorithm for Edge Intelligence

    Authors: Xin Yuan, Ning Li, Kang Wei, Wenchao Xu, Quan Chen, Hao Chen, Song Guo

    Abstract: The edge intelligence (EI) has been widely applied recently. Splitting the model between device, edge server, and cloud can improve the performance of EI greatly. The model segmentation without user mobility has been investigated deeply by previous works. However, in most use cases of EI, the end devices are mobile. Only a few works have been carried out on this aspect. These works still have many…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: 17 pages, 16 figures. arXiv admin note: substantial text overlap with arXiv:2312.15850

  23. arXiv:2312.01080  [pdf, other]

    cs.CR

    A Novel Residual-guided Learning Method for Image Steganography

    Authors: Miaoxin Ye, Dongxia Huang, Kangkang Wei, Weiqi Luo

    Abstract: Traditional steganographic techniques have often relied on manually crafted attributes related to image residuals. These methods demand a significant level of expertise and face challenges in integrating diverse image residual characteristics. In this paper, we introduce an innovative deep learning-based methodology that seamlessly integrates image residuals, residual distances, and image local va…

    Submitted 2 December, 2023; originally announced December 2023.

  24. arXiv:2312.00855  [pdf, other]

    cs.LG cs.AI cs.CR

    Refine, Discriminate and Align: Stealing Encoders via Sample-Wise Prototypes and Multi-Relational Extraction

    Authors: Shuchi Wu, Chuan Ma, Kang Wei, Xiaogang Xu, Ming Ding, Yuwen Qian, Tao Xiang

    Abstract: This paper introduces RDA, a pioneering approach designed to address two primary deficiencies prevalent in previous endeavors aiming at stealing pre-trained encoders: (1) suboptimal performances attributed to biased optimization objectives, and (2) elevated query costs stemming from the end-to-end paradigm that necessitates querying the target encoder every epoch. Specifically, we initially Refine…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 13 pages, 11 figures

  25. arXiv:2311.14750  [pdf, other]

    cs.CV

    Attribute-Aware Representation Rectification for Generalized Zero-Shot Learning

    Authors: Zhijie Rao, Jingcai Guo, Xiaocheng Lu, Qihua Zhou, Jie Zhang, Kang Wei, Chenxin Li, Song Guo

    Abstract: Generalized Zero-shot Learning (GZSL) has yielded remarkable performance by designing a series of unbiased visual-semantics mappings, wherein, the precision relies heavily on the completeness of extracted visual features from both seen and unseen classes. However, as a common practice in GZSL, the pre-trained feature extractor may easily exhibit difficulty in capturing domain-specific traits of th…

    Submitted 1 December, 2023; v1 submitted 23 November, 2023; originally announced November 2023.

    Comments: 11 pages, 6 figures

  26. arXiv:2311.10950  [pdf, other]

    cs.CV physics.optics

    Single-shot Phase Retrieval from a Fractional Fourier Transform Perspective

    Authors: Yixiao Yang, Ran Tao, Kaixuan Wei, Jun Shi

    Abstract: The realm of classical phase retrieval concerns itself with the arduous task of recovering a signal from its Fourier magnitude measurements, which are fraught with inherent ambiguities. A single-exposure intensity measurement is commonly deemed insufficient for the reconstruction of the primal signal, given that the absent phase component is imperative for the inverse transformation. In this work,…

    Submitted 17 November, 2023; originally announced November 2023.

  27. arXiv:2311.09227  [pdf, other]

    cs.CY cs.AI cs.SE

    Open-Sourcing Highly Capable Foundation Models: An evaluation of risks, benefits, and alternative methods for pursuing open-source objectives

    Authors: Elizabeth Seger, Noemi Dreksler, Richard Moulange, Emily Dardaman, Jonas Schuett, K. Wei, Christoph Winter, Mackenzie Arnold, Seán Ó hÉigeartaigh, Anton Korinek, Markus Anderljung, Ben Bucknall, Alan Chan, Eoghan Stafford, Leonie Koessler, Aviv Ovadya, Ben Garfinkel, Emma Bluemke, Michael Aird, Patrick Levermore, Julian Hazell, Abhishek Gupta

    Abstract: Recent decisions by leading AI labs to either open-source their models or to restrict access to their models has sparked debate about whether, and how, increasingly capable AI models should be shared. Open-sourcing in AI typically refers to making model architecture and weights freely and publicly accessible for anyone to modify, study, build on, and use. This offers advantages such as enabling ex…

    Submitted 29 September, 2023; originally announced November 2023.

    Comments: Official release at https://www.governance.ai/research-paper/open-sourcing-highly-capable-foundation-models

  28. arXiv:2311.07538  [pdf, other]

    cs.CL cs.LG

    Leveraging Multiple Teachers for Test-Time Adaptation of Language-Guided Classifiers

    Authors: Kangda Wei, Sayan Ghosh, Rakesh R. Menon, Shashank Srivastava

    Abstract: Recent approaches have explored language-guided classifiers capable of classifying examples from novel tasks when provided with task-specific natural language explanations, instructions or prompts (Sanh et al., 2022; R. Menon et al., 2022). While these classifiers can generalize in zero-shot settings, their task performance often varies substantially between different language explanations in unpr…

    Submitted 13 November, 2023; originally announced November 2023.

  29. arXiv:2311.06728  [pdf, other]

    cs.DB

    A Comprehensive Survey on Database Management System Fuzzing: Techniques, Taxonomy and Experimental Comparison

    Authors: Xiyue Gao, Zhuang Liu, Jiangtao Cui, Hui Li, Hui Zhang, Kewei Wei, Kankan Zhao

    Abstract: Database Management System (DBMS) fuzzing is an automated testing technique aimed at detecting errors and vulnerabilities in DBMSs by generating, mutating, and executing test cases. It not only reduces the time and cost of manual testing but also enhances detection coverage, providing valuable assistance in developing commercial DBMSs. Existing fuzzing surveys mainly focus on general-purpose softw…

    Submitted 11 November, 2023; originally announced November 2023.

    Comments: 34 pages, 22 figures

  30. arXiv:2310.14278  [pdf, other]

    cs.SD cs.CL eess.AS

    Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation

    Authors: Kun Wei, Bei Li, Hang Lv, Quan Lu, Ning Jiang, Lei Xie

    Abstract: Automatic Speech Recognition (ASR) in conversational settings presents unique challenges, including extracting relevant contextual information from previous conversational turns. Due to irrelevant content, error propagation, and redundancy, existing methods struggle to extract longer and more effective contexts. To address this issue, we introduce a novel conversational ASR system, extending the C…

    Submitted 27 April, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: TASLP

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024

  31. arXiv:2310.09002  [pdf, other]

    cs.LG

    Federated Meta-Learning for Few-Shot Fault Diagnosis with Representation Encoding

    Authors: Jixuan Cui, Jun Li, Zhen Mei, Kang Wei, Sha Wei, Ming Ding, Wen Chen, Song Guo

    Abstract: Deep learning-based fault diagnosis (FD) approaches require a large amount of training data, which are difficult to obtain since they are located across different entities. Federated learning (FL) enables multiple clients to collaboratively train a shared model with data privacy guaranteed. However, the domain discrepancy and data scarcity problems among clients deteriorate the performance of the…

    Submitted 13 October, 2023; originally announced October 2023.

  32. arXiv:2308.03521  [pdf, other]

    cs.LG cs.AI cs.DC

    Analysis and Optimization of Wireless Federated Learning with Data Heterogeneity

    Authors: Xuefeng Han, Jun Li, Wen Chen, Zhen Mei, Kang Wei, Ming Ding, H. Vincent Poor

    Abstract: With the rapid proliferation of smart mobile devices, federated learning (FL) has been widely considered for application in wireless networks for distributed model training. However, data heterogeneity, e.g., non-independently identically distributions and different sizes of training data among clients, poses major challenges to wireless FL. Limited communication resources complicate the implement…

    Submitted 4 August, 2023; originally announced August 2023.

  33. arXiv:2308.03407  [pdf, other]

    cs.CV

    Spatially Varying Nanophotonic Neural Networks

    Authors: Kaixuan Wei, Xiao Li, Johannes Froech, Praneeth Chakravarthula, James Whitehead, Ethan Tseng, Arka Majumdar, Felix Heide

    Abstract: The explosive growth of computation and energy cost of artificial intelligence has spurred strong interests in new computing modalities as potential alternatives to conventional electronic processors. Photonic processors that execute operations using photons instead of electrons, have promised to enable optical neural networks with ultra-low latency and power consumption. However, existing optical…

    Submitted 30 December, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  34. arXiv:2307.10551  [pdf, other]

    cs.AI

    PPN: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts

    Authors: Kaiwen Wei, Jie Yao, Jingyuan Zhang, Yangyang Kang, Fubang Zhao, Yating Zhang, Changlong Sun, Xin **, Xin Zhang

    Abstract: Key Information Extraction (KIE) is a challenging multimodal task that aims to extract structured value semantic entities from visually rich documents. Although significant progress has been made, there are still two major challenges that need to be addressed. Firstly, the layout of existing datasets is relatively fixed and limited in the number of semantic entity categories, creating a significan…

    Submitted 19 July, 2023; originally announced July 2023.

  35. arXiv:2307.04630  [pdf, other]

    cs.SD eess.AS

    The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task

    Authors: Kun Song, Yi Lei, Peikun Chen, Yiqing Cao, Kun Wei, Yongmao Zhang, Lei Xie, Ning Jiang, Guoqing Zhao

    Abstract: This paper describes the NPU-MSXF system for the IWSLT 2023 speech-to-speech translation (S2ST) task which aims to translate from English speech of multi-source to Chinese speech. The system is built in a cascaded manner consisting of automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS). We make tremendous efforts to handle the challenging multi-source input. Spec…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: IWSLT@ACL 2023 system paper. Our submitted system ranks 1st in the S2ST task of the IWSLT 2023 evaluation campaign

  36. arXiv:2306.07005  [pdf, other]

    cs.CV cs.CR

    AI-Generated Image Detection using a Cross-Attention Enhanced Dual-Stream Network

    Authors: Ziyi Xi, Wenmin Huang, Kangkang Wei, Weiqi Luo, Peijia Zheng

    Abstract: With the rapid evolution of AI Generated Content (AIGC), forged images produced through this technology are inherently more deceptive and require less human intervention compared to traditional Computer-generated Graphics (CG). However, owing to the disparities between CG and AIGC, conventional CG detection methods tend to be inadequate in identifying AIGC-produced images. To address this issue, o…

    Submitted 8 November, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

    Comments: 8 pages, 41 figures

  37. arXiv:2305.19575  [pdf, other]

    math.OC cs.LG stat.ML

    On the Linear Convergence of Policy Gradient under Hadamard Parameterization

    Authors: Jiacai Liu, Jinchi Chen, Ke Wei

    Abstract: The convergence of deterministic policy gradient under the Hadamard parameterization is studied in the tabular setting and the linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all the iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_0$ itera…

    Submitted 25 November, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

  38. arXiv:2305.17732  [pdf, other]

    cs.SD eess.AS

    StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation

    Authors: Kun Song, Yi Ren, Yi Lei, Chunfeng Wang, Kun Wei, Lei Xie, Xiang Yin, Zejun Ma

    Abstract: Direct speech-to-speech translation (S2ST) has gradually become popular as it has many advantages compared with cascade S2ST. However, current research mainly focuses on the accuracy of semantic translation and ignores the speech style transfer from a source language to a target language. The lack of high-fidelity expressive parallel data makes such style transfer challenging, especially in more p…

    Submitted 25 July, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted to Interspeech 2023

  39. arXiv:2305.02937  [pdf, other]

    cs.CL cs.SD eess.AS

    End-to-end spoken language understanding using joint CTC loss and self-supervised, pretrained acoustic encoders

    Authors: Jixuan Wang, Martin Radfar, Kai Wei, Clement Chung

    Abstract: It is challenging to extract semantic meanings directly from audio signals in spoken language understanding (SLU), due to the lack of textual information. Popular end-to-end (E2E) SLU models utilize sequence-to-sequence automatic speech recognition (ASR) models to extract textual embeddings as input to infer semantics, which, however, require computationally expensive auto-regressive decoding. In…

    Submitted 2 June, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: ICASSP 2023

  40. arXiv:2305.01387  [pdf, other]

    cs.DC cs.CR cs.LG

    Efficient Federated Learning with Enhanced Privacy via Lottery Ticket Pruning in Edge Computing

    Authors: Yifan Shi, Kang Wei, Li Shen, Jun Li, Xueqian Wang, Bo Yuan, Song Guo

    Abstract: Federated learning (FL) is a collaborative learning paradigm for decentralized private data from mobile terminals (MTs). However, it suffers from issues in terms of communication, resource of MTs, and privacy. Existing privacy-preserving FL methods usually adopt the instance-level differential privacy (DP), which provides a rigorous privacy guarantee but with several bottlenecks: severe performanc…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: 13 pages

  41. arXiv:2305.00873  [pdf, other]

    cs.LG cs.CR cs.DC

    Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy

    Authors: Yifan Shi, Kang Wei, Li Shen, Yingqi Liu, Xueqian Wang, Bo Yuan, Dacheng Tao

    Abstract: To defend the inference attacks and mitigate the sensitive information leakages in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise. However, existing DPFL methods tend to make a sharp loss landscape and have poor weight perturbation robustness, resulting in severe performance de…

    Submitted 2 May, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

    Comments: 20 pages. arXiv admin note: substantial text overlap with arXiv:2303.11242

  42. Instance Segmentation in the Dark

    Authors: Linwei Chen, Ying Fu, Kaixuan Wei, Dezhi Zheng, Felix Heide

    Abstract: Existing instance segmentation techniques are primarily tailored for high-visibility inputs, but their performance significantly deteriorates in extremely low-light environments. In this work, we take a deep look at instance segmentation in the dark and introduce several techniques that substantially boost the low-light inference accuracy. The proposed method is motivated by the observation that n…

    Submitted 8 September, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: Accepted by International Journal of Computer Vision (IJCV) 2023

  43. ALR-GAN: Adaptive Layout Refinement for Text-to-Image Synthesis

    Authors: Hongchen Tan, Baocai Yin, Kun Wei, ** Liu, Xin Li

    Abstract: We propose a novel Text-to-Image Generation Network, Adaptive Layout Refinement Generative Adversarial Network (ALR-GAN), to adaptively refine the layout of synthesized images without any auxiliary information. The ALR-GAN includes an Adaptive Layout Refinement (ALR) module and a Layout Visual Refinement (LVR) loss. The ALR module aligns the layout structure (which refers to locations of objects a…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: Accepted by TMM

  44. Gradient Sparsification for Efficient Wireless Federated Learning with Differential Privacy

    Authors: Kang Wei, Jun Li, Chuan Ma, Ming Ding, Feng Shu, Haitao Zhao, Wen Chen, Hongbo Zhu

    Abstract: Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other. However, it suffers the leakage of private information from uploading models. In addition, as the model size grows, the training latency increases due to limited transmission bandwidth and the model performance degrades while using differential privacy (DP…

    Submitted 20 December, 2023; v1 submitted 9 April, 2023; originally announced April 2023.

  45. arXiv:2304.04162  [pdf, other]

    cs.GT cs.DC cs.LG

    Design of Two-Level Incentive Mechanisms for Hierarchical Federated Learning

    Authors: Shunfeng Chu, Jun Li, Kang Wei, Yuwen Qian, Kunlun Wang, Feng Shu, Wen Chen

    Abstract: Hierarchical Federated Learning (HFL) is a distributed machine learning paradigm tailored for multi-tiered computation architectures, which supports massive access of devices' models simultaneously. To enable efficient HFL, it is crucial to design suitable incentive mechanisms to ensure that devices actively participate in local training. However, there are few studies on incentive mechanism desig…

    Submitted 16 January, 2024; v1 submitted 9 April, 2023; originally announced April 2023.

  46. arXiv:2303.17799  [pdf, other]

    cs.CL cs.SD eess.AS

    Dialog act guided contextual adapter for personalized speech recognition

    Authors: Feng-Ju Chang, Thejaswi Muniyappa, Kanthashree Mysore Sathyendra, Kai Wei, Grant P. Strimel, Ross McGowan

    Abstract: Personalization in multi-turn dialogs has been a long standing challenge for end-to-end automatic speech recognition (E2E ASR) models. Recent work on contextual adapters has tackled rare word recognition using user catalogs. This adaptation, however, does not incorporate an important cue, the dialog act, which is available in a multi-turn dialog scenario. In this work, we propose a dialog act guid…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: Accepted at ICASSP 2023

  47. arXiv:2303.11242  [pdf, other]

    cs.LG cs.CR cs.CV

    Make Landscape Flatter in Differentially Private Federated Learning

    Authors: Yifan Shi, Yingqi Liu, Kang Wei, Li Shen, Xueqian Wang, Dacheng Tao

    Abstract: To defend the inference attacks and mitigate the sensitive information leakages in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise. However, existing DPFL methods tend to make a sharper loss landscape and have poorer weight perturbation robustness, resulting in severe performanc…

    Submitted 26 June, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: CVPR2023

  48. arXiv:2303.07787  [pdf, other]

    cs.DB

    One Size Cannot Fit All: a Self-Adaptive Dispatcher for Skewed Hash Join in Shared-nothing RDBMSs

    Authors: **xin Yang, Hui Li, Yiming Si, Hui Zhang, Kankan Zhao, Kewei Wei, Wenlong Song, Yingfan Liu, Jiangtao Cui

    Abstract: Shared-nothing architecture has been widely adopted in various commercial distributed RDBMSs. Thanks to the architecture, query can be processed in parallel and accelerated by scaling up the cluster horizontally on demand. In spite of that, load balancing has been a challenging issue in all distributed RDBMSs, including shared-nothing ones, which suffers much from skewed data distribution. In this…

    Submitted 14 March, 2023; originally announced March 2023.

  49. arXiv:2303.04274  [pdf, other]

    cs.LG cs.CR cs.PF

    Amplitude-Varying Perturbation for Balancing Privacy and Utility in Federated Learning

    Authors: Xin Yuan, Wei Ni, Ming Ding, Kang Wei, Jun Li, H. Vincent Poor

    Abstract: While preserving the privacy of federated learning (FL), differential privacy (DP) inevitably degrades the utility (i.e., accuracy) of FL due to model perturbations caused by DP noise added to model updates. Existing studies have considered exclusively noise with persistent root-mean-square amplitude and overlooked an opportunity of adjusting the amplitudes to alleviate the adverse effects of the…

    Submitted 7 March, 2023; originally announced March 2023.

  50. arXiv:2302.13756  [pdf, other]

    cs.IR

    Multi-Feature Integration for Perception-Dependent Examination-Bias Estimation

    Authors: Xiaoshu Chen, Xiangsheng Li, Kunliang Wei, Bin Hu, Lei Jiang, Zeqian Huang, Zhanhui Kang

    Abstract: Eliminating examination bias accurately is pivotal to apply click-through data to train an unbiased ranking model. However, most examination-bias estimators are limited to the hypothesis of Position-Based Model (PBM), which supposes that the calculation of examination bias only depends on the rank of the document. Recently, although some works introduce information such as clicks in the same query…

    Submitted 27 February, 2023; originally announced February 2023.