Showing 1–50 of 353 results for author: Du, J

  1. arXiv:2407.00119  [pdf, other]

    cs.LG cs.AI cs.CL

    Efficient Long-distance Latent Relation-aware Graph Neural Network for Multi-modal Emotion Recognition in Conversations

    Authors: Yuntao Shou, Wei Ai, Jiayi Du, Tao Meng, Haiyan Liu

    Abstract: The task of multi-modal emotion recognition in conversation (MERC) aims to analyze the genuine emotional state of each utterance based on the multi-modal information in the conversation, which is crucial for conversation understanding. Existing methods focus on using graph neural networks (GNN) to model conversational relationships and capture contextual latent semantic relationships. However, due…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: 11 pages, 3 tables

  2. arXiv:2406.18984  [pdf, other]

    cs.IR

    Amplify Graph Learning for Recommendation via Sparsity Completion

    Authors: Peng Yuan, Haojie Li, Minying Fang, Xu Yu, Yongjing Hao, Junwei Du

    Abstract: Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems. Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges, which significantly reduces the performance of recommendations. In this paper, we study how to enhance the graph structure for CF more effectively, thereby optimizing the…

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  4. arXiv:2406.16203  [pdf, other]

    cs.CL

    LLMs' Classification Performance is Overclaimed

    Authors: Hanzi Xu, Renze Lou, Jiangshu Du, Vahid Mahzoon, Elmira Talebianaraki, Zhuoan Zhou, Elizabeth Garrison, Slobodan Vucetic, Wenpeng Yin

    Abstract: In many classification tasks designed for AI or human to solve, gold labels are typically included within the label space by default, often posed as "which of the following is correct?" This standard setup has traditionally highlighted the strong performance of advanced AI, particularly top-performing Large Language Models (LLMs), in routine classification tasks. However, when the gold label is in…

    Submitted 29 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  5. arXiv:2406.15731  [pdf, other]

    cs.CR cs.AI

    Breaking Secure Aggregation: Label Leakage from Aggregated Gradients in Federated Learning

    Authors: Zhibo Wang, Zhiwei Chang, Jiahui Hu, Xiaoyi Pang, Jiacheng Du, Yongle Chen, Kui Ren

    Abstract: Federated Learning (FL) exhibits privacy vulnerabilities under gradient inversion attacks (GIAs), which can extract private information from individual gradients. To enhance privacy, FL incorporates Secure Aggregation (SA) to prevent the server from obtaining individual gradients, thus effectively resisting GIAs. In this paper, we propose a stealthy label inference attack to bypass SA and recover…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 10 pages, conference to IEEE INFOCOM 2024

  6. arXiv:2406.13348  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    Textual Unlearning Gives a False Sense of Unlearning

    Authors: Jiacheng Du, Zhibo Wang, Kui Ren

    Abstract: Language models (LMs) are susceptible to "memorizing" training data, including a large amount of private or copyright-protected content. To safeguard the right to be forgotten (RTBF), machine unlearning has emerged as a promising method for LMs to efficiently "forget" sensitive training content and mitigate knowledge leakage risks. However, despite its good intentions, could the unlearning mechani…

    Submitted 19 June, 2024; originally announced June 2024.

  7. arXiv:2406.12195  [pdf, other]

    quant-ph cs.LG

    Quantum Compiling with Reinforcement Learning on a Superconducting Processor

    Authors: Z. T. Wang, Qiuhao Chen, Yuxuan Du, Z. H. Yang, Xiaoxia Cai, Kaixuan Huang, Jingning Zhang, Kai Xu, Jun Du, Yinan Li, Yuling Jiao, Xingyao Wu, Wu Liu, Xiliang Lu, Huikai Xu, Yirong Jin, Ruixia Wang, Haifeng Yu, S. P. Zhao

    Abstract: To effectively implement quantum algorithms on noisy intermediate-scale quantum (NISQ) processors is a central task in modern quantum technology. NISQ processors feature tens to a few hundreds of noisy qubits with limited coherence times and gate operations with errors, so NISQ algorithms naturally require employing circuits of short lengths via quantum compilation. Here, we develop a reinforcemen…

    Submitted 17 June, 2024; originally announced June 2024.

  8. arXiv:2406.10304  [pdf, other]

    cs.CL

    Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design

    Authors: Ming Gao, Hang Chen, Jun Du, Xin Xu, Hongxiao Guo, Hui Bu, Jianxing Yang, Ming Li, Chin-Hui Lee

    Abstract: Smart home technology has gained widespread adoption, facilitating effortless control of devices through voice commands. However, individuals with dysarthria, a motor speech disorder, face challenges due to the variability of their speech. This paper addresses the wake-up word spotting (WWS) task for dysarthric individuals, aiming to integrate them into real-world applications. To support this, we…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: to be published in Interspeech 2024

  9. arXiv:2406.08757  [pdf, other]

    cs.CL cs.AI

    SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding

    Authors: Jiefeng Ma, Yan Wang, Chenyu Liu, Jun Du, Yu Hu, Zhenrong Zhang, Pengfei Hu, Qing Wang, Jianshu Zhang

    Abstract: Accurately identifying and organizing textual content is crucial for the automation of document processing in the field of form understanding. Existing datasets, such as FUNSD and XFUND, support entity classification and relationship prediction tasks but are typically limited to local and entity-level annotations. This limitation overlooks the hierarchically structured representation of documents,…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 Track on Datasets and Benchmarks under review

  10. arXiv:2406.07256  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection

    Authors: Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li

    Abstract: The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which stands out as the large…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  11. arXiv:2406.07081  [pdf, other]

    cs.CL

    Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning

    Authors: Menglong Cui, Jiangcun Du, Shaolin Zhu, Deyi Xiong

    Abstract: Large language models (LLMs) exhibit outstanding performance in machine translation via in-context learning. In contrast to sentence-level translation, document-level translation (DOCMT) by LLMs based on in-context learning faces two major challenges: firstly, document translations generated by LLMs are often incoherent; secondly, the length of demonstration for in-context learning is usually limi…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL2024 long paper (Findings)

  12. arXiv:2406.04582  [pdf, other]

    eess.AS cs.SD

    Neural Codec-based Adversarial Sample Detection for Speaker Verification

    Authors: Xuanjun Chen, Jiawei Du, Haibin Wu, Jyh-Shing Roger Jang, Hung-yi Lee

    Abstract: Automatic Speaker Verification (ASV), increasingly used in security-critical applications, faces vulnerabilities from rising adversarial attacks, with few effective defenses available. In this paper, we propose a neural codec-based adversarial sample detection method for ASV. The approach leverages the codec's ability to discard redundant perturbations and retain essential information. Specificall…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  13. arXiv:2406.02463  [pdf, other]

    cs.CR

    Click Without Compromise: Online Advertising Measurement via Per User Differential Privacy

    Authors: Yingtai Xiao, Jian Du, Shikun Zhang, Qiang Yan, Danfeng Zhang, Daniel Kifer

    Abstract: Online advertising is a cornerstone of the Internet ecosystem, with advertising measurement playing a crucial role in optimizing efficiency. Ad measurement entails attributing desired behaviors, such as purchases, to ad exposures across various platforms, necessitating the collection of user activities across these platforms. As this practice faces increasing restrictions due to rising privacy con…

    Submitted 4 June, 2024; originally announced June 2024.

  14. arXiv:2406.02110  [pdf, other]

    cs.CL cs.AI

    UniOQA: A Unified Framework for Knowledge Graph Question Answering with Large Language Models

    Authors: Zhuoyang Li, Liran Deng, Hui Liu, Qiaoqiao Liu, Junzhao Du

    Abstract: OwnThink stands as the most extensive Chinese open-domain knowledge graph introduced in recent times. Despite prior attempts in question answering over OwnThink (OQA), existing studies have faced limitations in model representation capabilities, posing challenges in further enhancing overall accuracy in question answering. In this paper, we introduce UniOQA, a unified framework that integrates two…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

  15. arXiv:2406.00625  [pdf, other]

    cs.CV

    SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection

    Authors: Yun Peng, Xiao Lin, Nachuan Ma, Jiayuan Du, Chuangwei Liu, Chengju Liu, Qijun Chen

    Abstract: Visual anomaly detection is vital in real-world applications, such as industrial defect detection and medical diagnosis. However, most existing methods focus on local structural anomalies and fail to detect higher-level functional anomalies under logical conditions. Although recent studies have explored logical anomaly detection, they can only address simple anomalies like missing or addition and…

    Submitted 5 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  16. arXiv:2405.17245  [pdf, other]

    cs.DC cs.AI cs.LG cs.NI

    Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference

    Authors: Shengyuan Ye, Jiangsu Du, Liekang Zeng, Wenzhong Ou, Xiaowen Chu, Yutong Lu, Xu Chen

    Abstract: Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge, such as voice assistant in smart home. Traditional deployment approaches offload the inference workloads to the remote cloud server, which would induce substantial pressure on the backbone network as well as raise users' privacy concerns. To address that, in-situ inference has been recently recogniz…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE International Conference on Computer Communications 2024

  17. arXiv:2405.15863  [pdf, other]

    cs.SD cs.AI eess.AS

    Quality-aware Masked Diffusion Transformer for Enhanced Music Generation

    Authors: Chang Li, Ruoyu Wang, Lijuan Liu, Jun Du, Yixuan Sun, Zilu Guo, Zhenrong Zhang, Yuan Jiang

    Abstract: In recent years, diffusion-based text-to-music (TTM) generation has gained prominence, offering a novel approach to synthesizing musical content from textual descriptions. Achieving high accuracy and diversity in this generation process requires extensive, high-quality data, which often constitutes only a fraction of available datasets. Within open-source datasets, the prevalence of issues like mi…

    Submitted 24 May, 2024; originally announced May 2024.

  18. arXiv:2405.14137  [pdf, other]

    cs.CV

    RET-CLIP: A Retinal Image Foundation Model Pre-trained with Clinical Diagnostic Reports

    Authors: Jiawei Du, Jia Guo, Weihang Zhang, Shengzhu Yang, Hanruo Liu, Huiqi Li, Ningli Wang

    Abstract: The Vision-Language Foundation model is increasingly investigated in the fields of computer vision and natural language processing, yet its exploration in ophthalmology and broader medical applications remains limited. The challenge is the lack of labeled data for the training of foundation model. To handle this issue, a CLIP-style retinal image foundation model is developed in this paper. Our fou…

    Submitted 22 May, 2024; originally announced May 2024.

  19. arXiv:2405.11862  [pdf, other]

    cs.CV

    SEMv3: A Fast and Robust Approach to Table Separation Line Detection

    Authors: Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du

    Abstract: Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. The "split-and-merge" paradigm is a pivotal approach to parse table structure, where the table separation line detection is crucial. However, challenges such as wireless and deformed tables make it demanding. In this paper, we adhere to the "split-and-merge" paradigm and propose SEMv3 (SEM: Spl…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 9 pages, 6 figures, 5 tables. Accepted by IJCAI2024 main track

  20. arXiv:2405.10436  [pdf, other]

    cs.IR cs.AI

    Positional encoding is not the same as context: A study on positional encoding for Sequential recommendation

    Authors: Alejo Lopez-Avila, Jinhua Du, Abbas Shimary, Ze Li

    Abstract: The expansion of streaming media and e-commerce has led to a boom in recommendation systems, including Sequential recommendation systems, which consider the user's previous interactions with items. In recent years, research has focused on architectural improvements such as transformer blocks and feature extraction that can augment model information. Among these features are context and attributes.…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 19 pages, 3 figures, 12 tables

    MSC Class: I.2.m

  21. arXiv:2405.09353  [pdf, other]

    eess.IV cs.CV

    Large coordinate kernel attention network for lightweight image super-resolution

    Authors: Fangwei Hao, Jiesheng Wu, Haotian Lu, Ji Du, Jing Xu

    Abstract: The multi-scale receptive field and large kernel attention (LKA) module have been shown to significantly improve performance in the lightweight image super-resolution task. However, existing lightweight super-resolution (SR) methods seldom pay attention to designing efficient building block with multi-scale receptive field for local modeling, and their LKA modules face a quadratic increase in comp…

    Submitted 15 May, 2024; originally announced May 2024.

  22. arXiv:2405.05176  [pdf, other]

    cs.CL

    Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming

    Authors: Tommaso Pasini, Alejo López-Ávila, Husam Quteineh, Gerasimos Lampouras, Jinhua Du, Yubing Wang, Ze Li, Yusen Sun

    Abstract: Composing poetry or lyrics involves several creative factors, but a challenging aspect of generation is the adherence to a more or less strict metric and rhyming pattern. To address this challenge specifically, previous work on the task has mainly focused on reverse language modeling, which brings the critical selection of each rhyming word to the forefront of each verse. On the other hand, revers…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 18 pages, 1 figure

    MSC Class: I.2.7

  23. arXiv:2404.17662  [pdf, other]

    cs.CL

    PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games

    Authors: Qinglin Zhu, Runcong Zhao, Jinhua Du, Lin Gui, Yulan He

    Abstract: We propose PLAYER*, a novel framework that addresses the limitations of existing agent-based approaches built on Large Language Models (LLMs) in handling complex questions and understanding interpersonal relationships in dynamic environments. PLAYER* enhances path planning in Murder Mystery Games (MMGs) using an anytime sampling-based planner and a questioning-driven search framework. By equipping…

    Submitted 17 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  24. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i.e., Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  25. arXiv:2404.07748  [pdf, other]

    cs.CV cs.LG

    3D-CSAD: Untrained 3D Anomaly Detection for Complex Manufacturing Surfaces

    Authors: Xuanming Cao, Chengyu Tao, Juan Du

    Abstract: The surface quality inspection of manufacturing parts based on 3D point cloud data has attracted increasing attention in recent years. The reason is that the 3D point cloud can capture the entire surface of manufacturing parts, unlike the previous practices that focus on some key product characteristics. However, achieving accurate 3D anomaly detection is challenging, due to the complex surfaces o…

    Submitted 11 April, 2024; originally announced April 2024.

  26. arXiv:2404.05403  [pdf, other]

    cs.CR cs.AI

    SoK: Gradient Leakage in Federated Learning

    Authors: Jiacheng Du, Jiahui Hu, Zhibo Wang, Peng Sun, Neil Zhenqiang Gong, Kui Ren

    Abstract: Federated learning (FL) enables collaborative model training among multiple clients without raw data exposure. However, recent studies have shown that clients' private training data can be reconstructed from the gradients they share in FL, known as gradient inversion attacks (GIAs). While GIAs have demonstrated effectiveness under ideal settings and auxiliary assumptions, their actual effic…

    Submitted 8 April, 2024; originally announced April 2024.

  27. arXiv:2404.04937  [pdf, other]

    cs.CR cs.GT

    Optimizing Information Propagation for Blockchain-empowered Mobile AIGC: A Graph Attention Network Approach

    Authors: Jiana Liao, Jinbo Wen, Jiawen Kang, Yang Zhang, Jianbo Du, Qihao Li, Weiting Zhang, Dong Yang

    Abstract: Artificial Intelligence-Generated Content (AIGC) is a rapidly evolving field that utilizes advanced AI algorithms to generate content. Through integration with mobile edge networks, mobile AIGC networks have gained significant attention, which can provide real-time customized and personalized AIGC services and products. Since blockchains can facilitate decentralized and transparent data management…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.13237

  28. arXiv:2404.04481  [pdf, other]

    cs.IR cs.AI cs.LG

    Joint Identifiability of Cross-Domain Recommendation via Hierarchical Subspace Disentanglement

    Authors: Jing Du, Zesheng Ye, Bin Guo, Zhiwen Yu, Lina Yao

    Abstract: Cross-Domain Recommendation (CDR) seeks to enable effective knowledge transfer across domains. Existing works rely on either representation alignment or transformation bridges, but they struggle on identifying domain-shared from domain-specific latent factors. Specifically, while CDR describes user representations as a joint distribution over two domains, these methods fail to account for its join…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: accepted to SIGIR 2024 as a Full Research Paper

  29. arXiv:2404.03329  [pdf]

    cs.LG eess.SP stat.ML

    DeepFunction: Deep Metric Learning-based Imbalanced Classification for Diagnosing Threaded Pipe Connection Defects using Functional Data

    Authors: Yukun Xie, Juan Du, Chen Zhang

    Abstract: In modern manufacturing, most of the product lines are conforming. Few products are nonconforming but with different defect types. The identification of defect types can help further root cause diagnosis of production lines. With the sensing development, signals of process variables can be collected in high resolution, which can be regarded as multichannel functional data. They have abundant infor…

    Submitted 24 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Revised version for submission to IISE Transactions

  30. arXiv:2404.01233  [pdf, other]

    math.ST cs.LG stat.ML

    Optimal Ridge Regularization for Out-of-Distribution Prediction

    Authors: Pratik Patil, Jin-Hong Du, Ryan J. Tibshirani

    Abstract: We study the behavior of optimal ridge regularization and optimal ridge risk for out-of-distribution prediction, where the test distribution deviates arbitrarily from the train distribution. We establish general conditions that determine the sign of the optimal regularization level under covariate and regression shifts. These conditions capture the alignment between the covariance and signal struc…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 59 pages, 14 figures

  31. arXiv:2403.20150  [pdf, other]

    cs.LG cs.AI cs.CY

    TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods

    Authors: Xiangfei Qiu, Jilin Hu, Lekui Zhou, Xingjian Wu, Junyang Du, Buang Zhang, Chenjuan Guo, Aoying Zhou, Christian S. Jensen, Zhenli Sheng, Bin Yang

    Abstract: Time series are generated in diverse domains such as economic, traffic, health, and energy, where forecasting of future values has numerous important applications. Not surprisingly, many forecasting methods are being proposed. To ensure progress, it is essential to be able to study and compare such methods empirically in a comprehensive and reliable manner. To achieve this, we propose TFB, an auto…

    Submitted 18 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: Directly accepted by PVLDB 2024

  32. arXiv:2403.17708  [pdf, other]

    cs.CV cs.HC cs.MM

    Panonut360: A Head and Eye Tracking Dataset for Panoramic Video

    Authors: Yutong Xu, Junhao Du, Jiahe Wang, Yuwei Ning, Sihan Zhou, Yang Cao

    Abstract: With the rapid development and widespread application of VR/AR technology, maximizing the quality of immersive panoramic video services that match users' personal preferences and habits has become a long-standing challenge. Understanding the saliency region where users focus, based on data collected with HMDs, can promote multimedia encoding, transmission, and quality assessment. At the same time,…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 7 pages, ACM MMSys'24 accepted

  33. arXiv:2403.17465  [pdf, other]

    cs.CV cs.AI

    LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection

    Authors: Yunpeng Luo, Junlong Du, Ke Yan, Shouhong Ding

    Abstract: The evolution of Diffusion Models has dramatically improved image generation quality, making it increasingly difficult to differentiate between real and generated images. This development, while impressive, also raises significant privacy and security concerns. In response to this, we propose a novel Latent REconstruction error guided feature REfinement method (LaRE^2) for detecting the diffusion-…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  34. arXiv:2403.13322  [pdf, other]

    cs.CV

    DD-RobustBench: An Adversarial Robustness Benchmark for Dataset Distillation

    Authors: Yifan Wu, Jiawei Du, Ping Liu, Yuewei Lin, Wenqing Cheng, Wei Xu

    Abstract: Dataset distillation is an advanced technique aimed at compressing datasets into significantly smaller counterparts, while preserving formidable training performance. Significant efforts have been devoted to promote evaluation accuracy under limited compression ratio while overlooked the robustness of distilled dataset. In this work, we introduce a comprehensive benchmark that, to the best of our…

    Submitted 27 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: * denotes equal contributions; ^ denotes corresponding author. In this updated version, we have expanded our research to include more experiments on various adversarial attack methods and latest dataset distillation studies. All new results have been incorporated into the document

  35. arXiv:2403.11445  [pdf, other]

    cs.CR cs.DS eess.SP

    Budget Recycling Differential Privacy

    Authors: Bo Jiang, Jian Du, Sagar Shamar, Qiang Yan

    Abstract: Differential Privacy (DP) mechanisms usually force reduction in data utility by producing "out-of-bound" noisy results for a tight privacy budget. We introduce the Budget Recycling Differential Privacy (BR-DP) framework, designed to provide soft-bounded noisy outputs for a broad range of existing DP mechanisms. By "soft-bounded," we refer to the mechanism's ability to release most outputs within…

    Submitted 16 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  36. arXiv:2403.11091  [pdf, other]

    cs.SD cs.CV eess.AS

    Multitask frame-level learning for few-shot sound event detection

    Authors: Liang Zou, Genwei Yan, Ruoyu Wang, Jun Du, Meng Lei, Tian Gao, Xin Fang

    Abstract: This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples. However, prevailing methods in few-shot SED predominantly rely on segment-level predictions, which often providing detailed, fine-grained predictions, particularly for events of brief duration. Although frame-level prediction strategies have been…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 6 pages, 4 figures, conference

  37. arXiv:2403.10682  [pdf]

    cond-mat.mtrl-sci cs.LG

    Evaluation of GlassNet for physics-informed machine learning of glass stability and glass-forming ability

    Authors: Sarah I. Allec, Xiaonan Lu, Daniel R. Cassar, Xuan T. Nguyen, Vinay I. Hegde, Thiruvillamalai Mahadevan, Miroslava Peterson, Jincheng Du, Brian J. Riley, John D. Vienna, James E. Saal

    Abstract: Glasses form the basis of many modern applications and also hold great potential for future medical and environmental applications. However, their structural complexity and large composition space make design and optimization challenging for certain applications. Of particular importance for glass processing is an estimate of a given composition's glass-forming ability (GFA). However, there remain…

    Submitted 19 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  38. arXiv:2403.08196  [pdf, other]

    cs.CL eess.AS

    SpeechColab Leaderboard: An Open-Source Platform for Automatic Speech Recognition Evaluation

    Authors: Jiayu Du, Jinpeng Li, Guoguo Chen, Wei-Qiang Zhang

    Abstract: In the wake of the surging tide of deep learning over the past decade, Automatic Speech Recognition (ASR) has garnered substantial attention, leading to the emergence of numerous publicly accessible ASR systems that are actively being integrated into our daily lives. Nonetheless, the impartial and replicable evaluation of these ASR systems encounters challenges due to various crucial subtleties. I…

    Submitted 12 March, 2024; originally announced March 2024.

  39. arXiv:2403.04245  [pdf, other]

    cs.SD cs.CV cs.LG cs.MM eess.AS

    A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

    Authors: Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee

    Abstract: Advanced Audio-Visual Speech Recognition (AVSR) systems have been observed to be sensitive to missing video frames, performing even worse than single-modality models. While applying the dropout technique to the video modality enhances robustness to missing frames, it simultaneously results in a performance loss when dealing with complete data input. In this paper, we investigate this contrasting p…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: the paper is accepted by CVPR2024

  40. Minimum Topology Attacks for Graph Neural Networks

    Authors: Mengmei Zhang, Xiao Wang, Chuan Shi, Lingjuan Lyu, Tianchi Yang, Junping Du

    Abstract: With the great popularity of Graph Neural Networks (GNNs), their robustness to adversarial topology attacks has received significant attention. Although many attack methods have been proposed, they mainly focus on fixed-budget attacks, aiming at finding the most adversarial perturbations within a fixed budget for target node. However, considering the varied robustness of each node, there is an ine…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Published on WWW 2023. Proceedings of the ACM Web Conference 2023

  41. arXiv:2402.18667  [pdf, other]

    cs.CL

    FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability

    Authors: Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, Ran Xu, Wenpeng Yin, Caiming Xiong

    Abstract: This paper presents FoFo, a pioneering benchmark for evaluating large language models' (LLMs) ability to follow complex, domain-specific formats, a crucial yet underexamined capability for their application as AI agents. Despite LLMs' advancements, existing benchmarks fail to assess their format-following proficiency adequately. FoFo fills this gap with a diverse range of real-world formats and in…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: The first two authors contributed equally

  42. arXiv:2402.16775  [pdf, other]

    cs.CL cs.AI

    A Comprehensive Evaluation of Quantization Strategies for Large Language Models

    Authors: Renren Jin, Jiangcun Du, Wuwei Huang, Wei Liu, Jian Luan, Bin Wang, Deyi Xiong

    Abstract: Increasing the number of parameters in large language models (LLMs) usually improves performance in downstream tasks but raises compute and memory costs, making deployment difficult in resource-limited settings. Quantization techniques, which reduce the bits needed for model weights or activations with minimal performance loss, have become popular due to the rise of LLMs. However, most quantizatio…

    Submitted 6 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Findings

  43. arXiv:2402.13018  [pdf, other]

    eess.AS cs.SD

    EMO-SUPERB: An In-depth Look at Speech Emotion Recognition

    Authors: Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee

    Abstract: Speech emotion recognition (SER) is a pivotal technology for human-computer interaction systems. However, 80.77% of SER papers yield results that cannot be reproduced. We develop EMO-SUPERB, short for EMOtion Speech Universal PERformance Benchmark, which aims to enhance open-source initiatives for SER. EMO-SUPERB includes a user-friendly codebase to leverage 15 state-of-the-art speech self-supervi…

    Submitted 12 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: webpage: https://emosuperb.github.io/

  44. arXiv:2402.10600  [pdf, other]

    cs.NI

    Envisioning the Future Role of 3D Wireless Networks in Preventing and Managing Disasters and Emergency Situations

    Authors: Ahmed Alhammadi, Anuj Abraham, Aymen Fakhreddine, Yu Tian, Jun Du, Faouzi Bader

    Abstract: In an era marked by unprecedented climatic upheavals and evolving urban landscapes, the role of advanced communication networks in disaster prevention and management is becoming increasingly critical. This paper explores the transformative potential of 3D wireless networks, an innovative amalgamation of terrestrial, aerial, and satellite technologies, in enhancing disaster response mechanisms. We…

    Submitted 16 February, 2024; originally announced February 2024.

  45. arXiv:2402.06044  [pdf, other]

    cs.AI cs.CL

    OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models

    Authors: Hainiu Xu, Runcong Zhao, Lixing Zhu, Jinhua Du, Yulan He

    Abstract: Neural Theory-of-Mind (N-ToM), machine's ability to understand and keep track of the mental states of others, is pivotal in developing socially intelligent agents. However, prevalent N-ToM benchmarks have several shortcomings, including the presence of ambiguous and artificial narratives, absence of personality traits and preferences, a lack of questions addressing characters' psychological mental…

    Submitted 3 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: ACL 2024

  46. arXiv:2402.02242  [pdf, other]

    cs.CV cs.LG

    Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey

    Authors: Yi Xin, Siqi Luo, Haodi Zhou, Junlong Du, Xiaohong Liu, Yue Fan, Qing Li, Yuntao Du

    Abstract: Large-scale pre-trained vision models (PVMs) have shown great potential for adaptability across various downstream vision tasks. However, with state-of-the-art PVMs growing to billions or even trillions of parameters, the standard full fine-tuning paradigm is becoming unsustainable due to high computational and storage demands. In response, researchers are exploring parameter-efficient fine-tuning…

    Submitted 8 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: 9 pages, 3 figures, 2 tables

  47. arXiv:2401.17573  [pdf]

    stat.ML cs.LG eess.IV eess.SY

    Tensor-based process control and monitoring for semiconductor manufacturing with unstable disturbances

    Authors: Yanrong Li, Juan Du, Fugee Tsung, Wei Jiang

    Abstract: With the development and popularity of sensors installed in manufacturing systems, complex data are collected during manufacturing processes, which brings challenges for traditional process control methods. This paper proposes a novel process control and monitoring method for the complex structure of high-dimensional image-based overlay errors (modeled in tensor form), which are collected in semic…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 30 pages, 5 figures

  48. arXiv:2401.10642  [pdf, other]

    cs.SI cs.AI

    Fast Butterfly-Core Community Search For Large Labeled Graphs

    Authors: JiaYi Du, Yinghao Wu, Wei Ai, Tao Meng, CanHao Xie, KeQin Li

    Abstract: Community Search (CS) aims to identify densely interconnected subgraphs corresponding to query vertices within a graph. However, existing heterogeneous graph-based community search methods need help identifying cross-group communities and suffer from efficiency issues, making them unsuitable for large graphs. This paper presents a fast community search model based on the Butterfly-Core Community (…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 8 pages, 8 figures

  49. arXiv:2401.03357  [pdf]

    cs.NI eess.SP

    Measured and Modeled Outdoor Indoor Coverage at 28 GHz into High Thermal Efficiency Buildings

    Authors: Dmitry Chizhik, Jinfeng Du, Reinaldo Valenzuela, Andrea Bedin, Martti Moisio, Rodolfo Feick

    Abstract: 28 GHz outdoor-indoor coverage into modern office buildings with high thermal efficiency windows is found to be severely limited due to 46 dB median penetration loss at normal incidence and additional 15 dB median oblique incidence loss. The study is based on measurements of path gain over 280 outdoor-indoor links, at ranges up to 100 m. A simple theoretical path gain model is extended to include…

    Submitted 4 September, 2023; originally announced January 2024.

    Comments: 2 pages, 3 figures. Presented at IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting

  50. arXiv:2401.02290  [pdf, other]

    cs.LG cs.AI cs.SI

    Path-based Explanation for Knowledge Graph Completion

    Authors: Heng Chang, Jiangnan Ye, Alejo Lopez Avila, Jinhua Du, Jia Li

    Abstract: Graph Neural Networks (GNNs) have achieved great success in Knowledge Graph Completion (KGC) by modelling how entities and relations interact in recent years. However, the explanation of the predicted facts has not caught the necessary attention. Proper explanations for the results of GNN-based KGC models increase model transparency and help researchers develop more reliable models. Existing pract…

    Submitted 4 January, 2024; originally announced January 2024.