Showing 1–50 of 136 results for author: Zeng, S

Searching in archive cs.
  1. arXiv:2406.14855  [pdf, other]

    cs.CV cs.CR

    Six-CD: Benchmarking Concept Removals for Benign Text-to-image Diffusion Models

    Authors: Jie Ren, Kangrui Chen, Yingqian Cui, Shenglai Zeng, Hui Liu, Yue Xing, Jiliang Tang, Lingjuan Lyu

    Abstract: Text-to-image (T2I) diffusion models have shown exceptional capabilities in generating images that closely correspond to textual prompts. However, the advancement of T2I diffusion models presents significant risks, as the models could be exploited for malicious purposes, such as generating images with violence or nudity, or creating unauthorized portraits of public figures in inappropriate context…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2406.14773  [pdf, other]

    cs.CR

    Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data

    Authors: Shenglai Zeng, Jiankun Zhang, Pengfei He, Jie Ren, Tianqi Zheng, Hanqing Lu, Han Xu, Hui Liu, Yue Xing, Jiliang Tang

    Abstract: Retrieval-augmented generation (RAG) enhances the outputs of language models by integrating relevant information retrieved from external knowledge sources. However, when the retrieval process involves private data, RAG systems may face severe privacy risks, potentially leading to the leakage of sensitive information. To address this issue, we propose using synthetic data as a privacy-preserving al…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.14052  [pdf, other]

    eess.IV cs.CV

    Perspective+ Unet: Enhancing Segmentation with Bi-Path Fusion and Efficient Non-Local Attention for Superior Receptive Fields

    Authors: Jintong Hu, Siyan Chen, Zhiyi Pan, Sen Zeng, Wenming Yang

    Abstract: Precise segmentation of medical images is fundamental for extracting critical clinical information, which plays a pivotal role in enhancing the accuracy of diagnoses, formulating effective treatment plans, and improving patient outcomes. Although Convolutional Neural Networks (CNNs) and non-local attention methods have achieved notable success in medical image segmentation, they either struggle to…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 13 pages, 5 figures

  4. arXiv:2406.13583  [pdf, other]

    cs.CV

    Low-Rank Mixture-of-Experts for Continual Medical Image Segmentation

    Authors: Qian Chen, Lei Zhu, Hangzhou He, Xinliang Zhang, Shuang Zeng, Qiushi Ren, Yanye Lu

    Abstract: The primary goal of continual learning (CL) task in medical image segmentation field is to solve the "catastrophic forgetting" problem, where the model totally forgets previously learned features when it is extended to new categories (class-level) or tasks (task-level). Due to the privacy protection, the historical data labels are inaccessible. Prevalent continual learning methods primarily focus…

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.06874  [pdf, other]

    cs.AI cs.HC cs.RO

    Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback

    Authors: Chenliang Li, Siliang Zeng, Zeyi Liao, Jiaxiang Li, Dongyeop Kang, Alfredo Garcia, Mingyi Hong

    Abstract: Aligning human preference and value is an important requirement for building contemporary foundation models and embodied AI. However, popular approaches such as reinforcement learning with human feedback (RLHF) break down the task into successive stages, such as supervised fine-tuning (SFT), reward modeling (RM), and reinforcement learning (RL), each performing one specific learning task. Such a s…

    Submitted 19 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  6. arXiv:2406.06086  [pdf, other]

    cs.SD eess.AS

    RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection

    Authors: Yujie Chen, Jiangyan Yi, Jun Xue, Chenglong Wang, Xiaohui Zhang, Shunbo Dong, Siding Zeng, Jianhua Tao, Lv Zhao, Cunhang Fan

    Abstract: Fake artefacts for discriminating between bonafide and fake audio can exist in both short- and long-range segments. Therefore, combining local and global feature information can effectively discriminate between bonafide and fake audio. This paper proposes an end-to-end bidirectional state space model, named RawBMamba, to capture both short- and long-range discriminative information for audio deepf…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  7. arXiv:2406.03949  [pdf, other]

    cs.CL

    UltraMedical: Building Specialized Generalists in Biomedicine

    Authors: Kaiyan Zhang, Sihang Zeng, Ermo Hua, Ning Ding, Zhang-Ren Chen, Zhiyuan Ma, Haoxin Li, Ganqu Cui, Biqing Qi, Xuekai Zhu, Xingtai Lv, Hu Jinfang, Zhiyuan Liu, Bowen Zhou

    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains and are moving towards more specialized areas. Recent advanced proprietary models such as GPT-4 and Gemini have achieved significant advancements in biomedicine, which have also raised privacy and security challenges. The construction of specialized generalists hinges largely on high-quality datasets, enh…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Datasets and models are available at https://github.com/TsinghuaC3I/UltraMedical

  8. arXiv:2405.17888  [pdf, other]

    cs.AI

    Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment

    Authors: Jiaxiang Li, Siliang Zeng, Hoi-To Wai, Chenliang Li, Alfredo Garcia, Mingyi Hong

    Abstract: Aligning human preference and value is an important requirement for contemporary foundation models. State-of-the-art techniques such as Reinforcement Learning from Human Feedback (RLHF) often consist of two stages: 1) supervised fine-tuning (SFT), where the model is fine-tuned by learning from human demonstration data; 2) Preference learning, where preference data is used to learn a reward model,…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  9. arXiv:2405.17422  [pdf, other]

    cs.CV cs.AI cs.LG

    Hardness-Aware Scene Synthesis for Semi-Supervised 3D Object Detection

    Authors: Shuai Zeng, Wenzhao Zheng, Jiwen Lu, Haibin Yan

    Abstract: 3D object detection aims to recover the 3D information of concerning objects and serves as the fundamental task of autonomous driving perception. Its performance greatly depends on the scale of labeled training data, yet it is costly to obtain high-quality annotations for point cloud data. While conventional methods focus on generating pseudo-labels for unlabeled samples as supplements for trainin…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Code is available at: https://github.com/wzzheng/HASS

  10. arXiv:2405.15732  [pdf, other]

    cs.LG cs.CE

    Neural Persistence Dynamics

    Authors: Sebastian Zeng, Florian Graf, Martin Uray, Stefan Huber, Roland Kwitt

    Abstract: We consider the problem of learning the dynamics in the topology of time-evolving point clouds, the prevalent spatiotemporal model for systems exhibiting collective behavior, such as swarms of insects and birds or particles in physics. In such systems, patterns emerge from (local) interactions among self-propelled entities. While several well-understood governing equations for motion and interacti…

    Submitted 24 May, 2024; originally announced May 2024.

  11. arXiv:2405.09927  [pdf, other]

    math.OC cs.LG

    Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-loop and Hessian-free Solution Strategy

    Authors: Risheng Liu, Zhu Liu, Wei Yao, Shangzhi Zeng, Jin Zhang

    Abstract: This work focuses on addressing two major challenges in the context of large-scale nonconvex Bi-Level Optimization (BLO) problems, which are increasingly applied in machine learning due to their ability to model nested structures. These challenges involve ensuring computational efficiency and providing theoretical guarantees. While recent advances in scalable BLO algorithms have primarily relied o…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  12. arXiv:2405.09660  [pdf, other]

    math.OC cs.LG

    Fast Two-Time-Scale Stochastic Gradient Method with Applications in Reinforcement Learning

    Authors: Sihan Zeng, Thinh T. Doan

    Abstract: Two-time-scale optimization is a framework introduced in Zeng et al. (2024) that abstracts a range of policy evaluation and policy optimization problems in reinforcement learning (RL). Akin to bi-level optimization under a particular type of stochastic oracle, the two-time-scale optimization framework has an upper level objective whose gradient evaluation depends on the solution of a lower level p…

    Submitted 10 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  13. arXiv:2405.02456  [pdf, ps, other]

    math.OC cs.LG

    Natural Policy Gradient and Actor Critic Methods for Constrained Multi-Task Reinforcement Learning

    Authors: Sihan Zeng, Thinh T. Doan, Justin Romberg

    Abstract: Multi-task reinforcement learning (RL) aims to find a single policy that effectively solves multiple tasks at the same time. This paper presents a constrained formulation for multi-task RL where the goal is to maximize the average performance of the policy across tasks subject to bounds on the performance in each task. We consider solving this problem both in the centralized setting, where informa…

    Submitted 3 May, 2024; originally announced May 2024.

  14. arXiv:2403.18230  [pdf, other]

    cs.AI

    Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation

    Authors: Chuwen Wang, Shirong Zeng, Cheng Wang

    Abstract: Large language models (LLMs), in conjunction with various reasoning reinforcement methodologies, have demonstrated remarkable capabilities comparable to humans in fields such as mathematics, law, coding, common sense, and world knowledge. In this paper, we delve into the reasoning abilities of LLMs within complex human systems. We propose a novel reasoning framework, termed "Mosaic Expert Observa…

    Submitted 26 March, 2024; originally announced March 2024.

  15. arXiv:2403.11052  [pdf, other]

    cs.CV cs.CR

    Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention

    Authors: Jie Ren, Yaxin Li, Shenglai Zeng, Han Xu, Lingjuan Lyu, Yue Xing, Jiliang Tang

    Abstract: Recent advancements in text-to-image diffusion models have demonstrated their remarkable capability to generate high-quality images from textual prompts. However, increasing research indicates that these models memorize and replicate images from their training data, raising tremendous concerns about potential copyright infringement and privacy risks. In our study, we provide a novel perspective to…

    Submitted 16 March, 2024; originally announced March 2024.

  16. arXiv:2402.16893  [pdf, other]

    cs.CR cs.AI cs.CL

    The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)

    Authors: Shenglai Zeng, Jiankun Zhang, Pengfei He, Yue Xing, Yiding Liu, Han Xu, Jie Ren, Shuaiqiang Wang, Dawei Yin, Yi Chang, Jiliang Tang

    Abstract: Retrieval-augmented generation (RAG) is a powerful technique to facilitate language model with proprietary and private data, where data privacy is a pivotal concern. Whereas extensive research has demonstrated the privacy risks of large language models (LLMs), the RAG technique could potentially reshape the inherent behaviors of LLM generation, posing new privacy issues that are currently under-ex…

    Submitted 23 February, 2024; originally announced February 2024.

  17. arXiv:2402.02333  [pdf, other]

    cs.CR cs.CV cs.LG

    Copyright Protection in Generative AI: A Technical Perspective

    Authors: Jie Ren, Han Xu, Pengfei He, Yingqian Cui, Shenglai Zeng, Jiankun Zhang, Hongzhi Wen, Jiayuan Ding, Hui Liu, Yi Chang, Jiliang Tang

    Abstract: Generative AI has witnessed rapid advancement in recent years, expanding their capabilities to create synthesized content such as text, images, audio, and code. The high fidelity and authenticity of contents generated by these Deep Generative Models (DGMs) have sparked significant copyright concerns. There have been various legal debates on how to effectively safeguard copyrights in DGMs. This wor…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 26 pages

  18. arXiv:2401.16164  [pdf, other]

    cs.LG math.OC

    Constrained Bi-Level Optimization: Proximal Lagrangian Value function Approach and Hessian-free Algorithm

    Authors: Wei Yao, Chengming Yu, Shangzhi Zeng, Jin Zhang

    Abstract: This paper presents a new approach and algorithm for solving a class of constrained Bi-Level Optimization (BLO) problems in which the lower-level problem involves constraints coupling both upper-level and lower-level variables. Such problems have recently gained significant attention due to their broad applicability in machine learning. However, conventional gradient-based methods unavoidably rely…

    Submitted 29 January, 2024; originally announced January 2024.

  19. arXiv:2401.09851  [pdf, other]

    cs.AI

    Next-Generation Simulation Illuminates Scientific Problems of Organised Complexity

    Authors: Cheng Wang, Chuwen Wang, Wang Zhang, Shirong Zeng, Yu Zhao, Ronghui Ning, Changjun Jiang

    Abstract: As artificial intelligence becomes increasingly prevalent in scientific research, data-driven methodologies appear to overshadow traditional approaches in resolving scientific problems. In this Perspective, we revisit a classic classification of scientific problems and acknowledge that a series of unresolved problems remain. Throughout the history of researching scientific problems, scientists hav…

    Submitted 14 June, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  20. arXiv:2401.08149  [pdf, ps, other]

    cs.IT eess.SP

    Channel Estimation for Holographic Communications in Hybrid Near-Far Field

    Authors: Shaohua Yue, Shuhao Zeng, Liang Liu, Boya Di

    Abstract: To realize holographic communications, a potential technology for spectrum efficiency improvement in the future sixth-generation (6G) network, antenna arrays inlaid with numerous antenna elements will be deployed. However, the increase in antenna aperture size makes some users lie in the Fresnel region, leading to the hybrid near-field and far-field communication mode, where the conventional far-f…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 6 pages, 5 figures

  21. arXiv:2401.06820  [pdf, other]

    math.OC cs.LG

    QCQP-Net: Reliably Learning Feasible Alternating Current Optimal Power Flow Solutions Under Constraints

    Authors: Sihan Zeng, Youngdae Kim, Yuxuan Ren, Kibaek Kim

    Abstract: At the heart of power system operations, alternating current optimal power flow (ACOPF) studies the generation of electric power in the most economical way under network-wide load requirement, and can be formulated as a highly structured non-convex quadratically constrained quadratic program (QCQP). Optimization-based solutions to ACOPF (such as ADMM or interior-point method), as the classic appro…

    Submitted 11 January, 2024; originally announced January 2024.

  22. arXiv:2401.03868  [pdf, other]

    cs.AR cs.AI

    FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs

    Authors: Shulin Zeng, Jun Liu, Guohao Dai, Xinhao Yang, Tianyu Fu, Hongyi Wang, Wenheng Ma, Hanbo Sun, Shiyao Li, Zixiao Huang, Yadong Dai, Jintao Li, Zehao Wang, Ruoyu Zhang, Kairui Wen, Xuefei Ning, Yu Wang

    Abstract: Transformer-based Large Language Models (LLMs) have made a significant impact on various domains. However, LLMs' efficiency suffers from both heavy computation and memory overheads. Compression techniques like sparsification and quantization are commonly used to mitigate the gap between LLM's computation/memory overheads and hardware capacity. However, existing GPU and transformer-based accelerato…

    Submitted 9 January, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: Accepted to FPGA'24

  23. arXiv:2312.17276  [pdf, other]

    cs.CL cs.LG

    PanGu-$π$: Enhancing Language Model Architectures via Nonlinearity Compensation

    Authors: Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, Dacheng Tao

    Abstract: The recent trend of large language models (LLMs) is to increase the scale of both model size (a.k.a. the number of parameters) and dataset to achieve better generative ability, which is definitely proved by a lot of work such as the famous GPT and Llama. However, large models often involve massive computational costs, and practical applications cannot afford such high prices. However, the method of…

    Submitted 27 December, 2023; originally announced December 2023.

  24. arXiv:2312.09651  [pdf, other]

    cs.SD cs.CR cs.LG eess.AS

    What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection

    Authors: Xiaohui Zhang, Jiangyan Yi, Chenglong Wang, Chuyuan Zhang, Siding Zeng, Jianhua Tao

    Abstract: The rapid evolution of speech synthesis and voice conversion has raised substantial concerns due to the potential misuse of such technology, prompting a pressing need for effective audio deepfake detection mechanisms. Existing detection models have shown remarkable success in discriminating known deepfake audio, but struggle when encountering new attack types. To address this challenge, one of the…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by the main track The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  25. arXiv:2312.08036  [pdf]

    cs.CL

    CoRTEx: Contrastive Learning for Representing Terms via Explanations with Applications on Constructing Biomedical Knowledge Graphs

    Authors: Huaiyuan Ying, Zhengyun Zhao, Yang Zhao, Sihang Zeng, Sheng Yu

    Abstract: Objective: Biomedical Knowledge Graphs play a pivotal role in various biomedical research domains. Concurrently, term clustering emerges as a crucial step in constructing these knowledge graphs, aiming to identify synonymous terms. Due to a lack of knowledge, previous contrastive learning models trained with Unified Medical Language System (UMLS) synonyms struggle at clustering difficult terms and…

    Submitted 13 December, 2023; originally announced December 2023.

  26. arXiv:2311.10927  [pdf, other]

    cs.GT cs.LG

    Learning Payment-Free Resource Allocation Mechanisms

    Authors: Sihan Zeng, Sujay Bhatt, Eleonora Kreacic, Parisa Hassanzadeh, Alec Koppel, Sumitra Ganesh

    Abstract: We consider the design of mechanisms that allocate limited resources among self-interested agents using neural networks. Unlike the recent works that leverage machine learning for revenue maximization in auctions, we consider welfare maximization as the key objective in the payment-free setting. Without payment exchange, it is unclear how we can align agents' incentives to achieve the desired obje…

    Submitted 12 April, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

  27. arXiv:2311.08010  [pdf, other]

    cs.CL cs.AI

    Distantly-Supervised Named Entity Recognition with Uncertainty-aware Teacher Learning and Student-student Collaborative Learning

    Authors: Helan Hu, Shuzheng Si, Haozhe Zhao, Shuang Zeng, Kaikai An, Zefan Cai, Baobao Chang

    Abstract: Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates the burden of annotation, but meanwhile suffers from the label noise. Recent works attempt to adopt the teacher-student framework to gradually refine the training labels and improve the overall robustness. However, we argue that these teacher-student methods achieve limited performance because poor network calibration pr…

    Submitted 14 November, 2023; originally announced November 2023.

  28. arXiv:2311.05965  [pdf, other]

    cs.CL

    Large Language Models are Zero Shot Hypothesis Proposers

    Authors: Biqing Qi, Kaiyan Zhang, Haoxiang Li, Kai Tian, Sihang Zeng, Zhang-Ren Chen, Bowen Zhou

    Abstract: Significant scientific discoveries have driven the progress of human civilisation. The explosion of scientific literature and data has created information barriers across disciplines that have slowed the pace of scientific discovery. Large Language Models (LLMs) hold a wealth of global and interdisciplinary knowledge that promises to break down these information barriers and foster a new wave of s…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: Instruction Workshop @ NeurIPS 2023

  29. arXiv:2311.03115  [pdf, other]

    cs.CY cs.LG stat.AP

    RELand: Risk Estimation of Landmines via Interpretable Invariant Risk Minimization

    Authors: Mateo Dulce Rubio, Siqi Zeng, Qi Wang, Didier Alvarado, Francisco Moreno, Hoda Heidari, Fei Fang

    Abstract: Landmines remain a threat to war-affected communities for years after conflicts have ended, partly due to the laborious nature of demining tasks. Humanitarian demining operations begin by collecting relevant information from the sites to be cleared, which is then analyzed by human experts to determine the potential risk of remaining landmines. In this paper, we propose RELand system to support the…

    Submitted 6 November, 2023; originally announced November 2023.

  30. arXiv:2310.15486  [pdf, other]

    cs.IT

    RIS-based IMT-2030 Testbed for MmWave Multi-stream Ultra-massive MIMO Communications

    Authors: Shuhao Zeng, Boya Di, Hongliang Zhang, Jiahao Gao, Shaohua Yue, Xinyuan Hu, Rui Fu, Jiaqi Zhou, Xu Liu, Haobo Zhang, Yuhan Wang, Shaohui Sun, Haichao Qin, Xin Su, Mengjun Wang, Lingyang Song

    Abstract: As one enabling technique of the future sixth generation (6G) network, ultra-massive multiple-input-multiple-output (MIMO) can support high-speed data transmissions and cell coverage extension. However, it is hard to realize the ultra-massive MIMO via traditional phased arrays due to unacceptable power consumption. To address this issue, reconfigurable intelligent surface-based (RIS-based) antenna…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 8 pages, 5 figures, to be published in IEEE Wireless Communications

  31. arXiv:2310.08045  [pdf, other]

    cs.RO eess.SY

    Model Predictive Inferential Control of Neural State-Space Models for Autonomous Vehicle Motion Planning

    Authors: Iman Askari, Xumein Tu, Shen Zeng, Huazhen Fang

    Abstract: Model predictive control (MPC) has proven useful in enabling safe and optimal motion planning for autonomous vehicles. In this paper, we investigate how to achieve MPC-based motion planning when a neural state-space model represents the vehicle dynamics. As the neural state-space model will lead to highly complex, nonlinear and nonconvex optimization landscapes, mainstream gradient-based MPC metho…

    Submitted 19 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

  32. arXiv:2310.07464  [pdf]

    eess.IV cs.LG q-bio.QM

    Deep Learning Predicts Biomarker Status and Discovers Related Histomorphology Characteristics for Low-Grade Glioma

    Authors: Zijie Fang, Yihan Liu, Yifeng Wang, Xiangyang Zhang, Yang Chen, Changjing Cai, Yiyang Lin, Ying Han, Zhi Wang, Shan Zeng, Hong Shen, Jun Tan, Yongbing Zhang

    Abstract: Biomarker detection is an indispensable part in the diagnosis and treatment of low-grade glioma (LGG). However, current LGG biomarker detection methods rely on expensive and complex molecular genetic testing, for which professionals are required to analyze the results, and intra-rater variability is often reported. To overcome these challenges, we propose an interpretable deep learning pipeline, a…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 47 pages, 6 figures

  33. arXiv:2310.06714  [pdf, other]

    cs.AI cs.CL cs.LG

    Exploring Memorization in Fine-tuned Language Models

    Authors: Shenglai Zeng, Yaxin Li, Jie Ren, Yiding Liu, Han Xu, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin

    Abstract: Large language models (LLMs) have shown great capabilities in various tasks but also exhibited memorization of training data, raising tremendous privacy and copyright concerns. While prior works have studied memorization during pre-training, the exploration of memorization during fine-tuning is rather limited. Compared to pre-training, fine-tuning typically involves more sensitive data and diverse…

    Submitted 22 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  34. arXiv:2310.05263  [pdf, other]

    cs.CR

    Confidence-driven Sampling for Backdoor Attacks

    Authors: Pengfei He, Han Xu, Yue Xing, Jie Ren, Yingqian Cui, Shenglai Zeng, Jiliang Tang, Makoto Yamada, Mohammad Sabokrou

    Abstract: Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios. Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples. Our research highlights the overlooked drawbacks of random sampling, which make that attack detectab…

    Submitted 8 October, 2023; originally announced October 2023.

  35. arXiv:2310.01307  [pdf, other]

    cs.CL cs.AI cs.LG

    On the Generalization of Training-based ChatGPT Detection Methods

    Authors: Han Xu, Jie Ren, Pengfei He, Shenglai Zeng, Yingqian Cui, Amy Liu, Hui Liu, Jiliang Tang

    Abstract: ChatGPT is one of the most popular language models which achieve amazing performance on various natural language tasks. Consequently, there is also an urgent need to distinguish texts generated by ChatGPT from human-written ones. One of the extensively studied methods trains classification models to distinguish both. However, existing studies also demonstrate that the trained models may suffer from distrib…

    Submitted 3 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  36. arXiv:2309.11876  [pdf, other]

    cs.CV cs.AI

    Multi-level Asymmetric Contrastive Learning for Volumetric Medical Image Segmentation Pre-training

    Authors: Shuang Zeng, Lei Zhu, Xinliang Zhang, Qian Chen, Hangzhou He, Lujia Jin, Zifeng Tian, Qiushi Ren, Zhaoheng Xie, Yanye Lu

    Abstract: Medical image segmentation is a fundamental yet challenging task due to the arduous process of acquiring large volumes of high-quality labeled data from experts. Contrastive learning offers a promising but still problematic solution to this dilemma. Because existing medical contrastive learning strategies focus on extracting image-level representation, which ignores abundant multi-level representa…

    Submitted 13 May, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

  37. arXiv:2309.08571  [pdf, other]

    cs.LG

    A Bayesian Approach to Robust Inverse Reinforcement Learning

    Authors: Ran Wei, Siliang Zeng, Chenliang Li, Alfredo Garcia, Anthony McDonald, Mingyi Hong

    Abstract: We consider a Bayesian approach to offline model-based inverse reinforcement learning (IRL). The proposed framework differs from existing offline model-based IRL approaches by performing simultaneous estimation of the expert's reward function and subjective model of environment dynamics. We make use of a class of prior distributions which parameterizes how accurate the expert's model of the enviro…

    Submitted 6 April, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

  38. arXiv:2309.07001  [pdf, other]

    cs.CE cs.AI stat.AP

    Modeling the Evolutionary Trends in Corporate ESG Reporting: A Study based on Knowledge Management Model

    Authors: Ziyuan Xia, Anchen Sun, Xiaodong Cai, Saixing Zeng

    Abstract: Environmental, social, and governance (ESG) reports are globally recognized as a keystone in sustainable enterprise development. However, current literature has not concluded the development of topics and trends in ESG contexts in the twenty-first century. Therefore, we selected 1114 ESG reports from firms in the technology industry to analyze the evolutionary trends of ESG topics by text mining.…

    Submitted 25 May, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: 29 pages, 10 figures, 3 tables

  39. arXiv:2309.02106  [pdf, other]

    cs.CL cs.AI cs.LG eess.AS

    Leveraging Label Information for Multimodal Emotion Recognition

    Authors: Peiying Wang, Sunlu Zeng, Junqing Chen, Lu Fan, Meng Chen, Youzheng Wu, Xiaodong He

    Abstract: Multimodal emotion recognition (MER) aims to detect the emotional status of a given expression by combining the speech and text information. Intuitively, label information should be capable of helping the model locate the salient tokens/frames relevant to the specific emotion, which finally facilitates the MER task. Inspired by this, we propose a novel approach for MER by leveraging label informat…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted by Interspeech 2023

  40. arXiv:2308.04949  [pdf, other]

    cs.CV

    Branches Mutual Promotion for End-to-End Weakly Supervised Semantic Segmentation

    Authors: Lei Zhu, Hangzhou He, Xinliang Zhang, Qian Chen, Shuang Zeng, Qiushi Ren, Yanye Lu

    Abstract: End-to-end weakly supervised semantic segmentation aims at optimizing a segmentation model in a single-stage training process based on only image annotations. Existing methods adopt an online-trained classification branch to provide pseudo annotations for supervising the segmentation branch. However, this strategy makes the classification branch dominate the whole concurrent training process, hind…

    Submitted 9 August, 2023; originally announced August 2023.

  41. arXiv:2307.00866  [pdf, other]

    cs.CL cs.AI

    Mining Clues from Incomplete Utterance: A Query-enhanced Network for Incomplete Utterance Rewriting

    Authors: Shuzheng Si, Shuang Zeng, Baobao Chang

    Abstract: Incomplete utterance rewriting has recently raised wide attention. However, previous works do not consider the semantic structural information between incomplete utterance and rewritten utterance or model the semantic structure implicitly and insufficiently. To address this problem, we propose a QUEry-Enhanced Network (QUEEN). Firstly, our proposed query template explicitly brings guided semantic…

    Submitted 27 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: NAACL 2022

  42. arXiv:2307.00266  [pdf, other]

    cs.CL cs.AI

    Hierarchical Pretraining for Biomedical Term Embeddings

    Authors: Bryan Cai, Sihang Zeng, Yucong Lin, Zheng Yuan, Doudou Zhou, Lu Tian

    Abstract: Electronic health records (EHR) contain narrative notes that provide extensive details on the medical condition and management of patients. Natural language processing (NLP) of clinical notes can use observed frequencies of clinical terms as predictive features for downstream applications such as clinical decision making and patient trajectory prediction. However, due to the vast number of highly…

    Submitted 1 July, 2023; originally announced July 2023.

  43. arXiv:2306.16761  [pdf, other]

    math.OC cs.LG

    Moreau Envelope Based Difference-of-weakly-Convex Reformulation and Algorithm for Bilevel Programs

    Authors: Lucy L. Gao, Jane J. Ye, Haian Yin, Shangzhi Zeng, Jin Zhang

    Abstract: Bilevel programming has emerged as a valuable tool for hyperparameter selection, a central concern in machine learning. In a recent study by Ye et al. (2023), a value function-based difference of convex algorithm was introduced to address bilevel programs. This approach proves particularly powerful when dealing with scenarios where the lower-level problem exhibits convexity in both the upper-level…

    Submitted 20 January, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    MSC Class: 90C99

  44. arXiv:2306.16248  [pdf, other]

    cs.LG

    Latent SDEs on Homogeneous Spaces

    Authors: Sebastian Zeng, Florian Graf, Roland Kwitt

    Abstract: We consider the problem of variational Bayesian inference in a latent variable model where a (possibly complex) observed stochastic process is governed by the solution of a latent stochastic differential equation (SDE). Motivated by the challenges that arise when trying to learn an (almost arbitrary) latent neural SDE from data, such as efficient gradient computation, we take a step back and study…

    Submitted 21 February, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: v3: updated experiments with results using the public source code (commit bc6edd1)

    Journal ref: NeurIPS 2023

  45. arXiv:2306.10453  [pdf, other]

    cs.LG cs.SI

    Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking

    Authors: Juanhui Li, Harry Shomer, Haitao Mao, Shenglai Zeng, Yao Ma, Neil Shah, Jiliang Tang, Dawei Yin

    Abstract: Link prediction attempts to predict whether an unseen edge exists based on only a portion of edges of a graph. A flurry of methods have been introduced in recent years that attempt to make use of graph neural networks (GNNs) for this task. Furthermore, new and diverse datasets have also been created to better evaluate the effectiveness of these new models. However, multiple pitfalls currently exis…

    Submitted 18 November, 2023; v1 submitted 17 June, 2023; originally announced June 2023.

  46. arXiv:2306.00673  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Attribute-Efficient PAC Learning of Low-Degree Polynomial Threshold Functions with Nasty Noise

    Authors: Shiwei Zeng, Jie Shen

    Abstract: The concept class of low-degree polynomial threshold functions (PTFs) plays a fundamental role in machine learning. In this paper, we study PAC learning of $K$-sparse degree-$d$ PTFs on $\mathbb{R}^n$, where any such concept depends only on $K$ out of $n$ attributes of the input. Our main contribution is a new algorithm that runs in time $({nd}/ε)^{O(d)}$ and under the Gaussian marginal distributi…

    Submitted 19 March, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: ICML 2023. V2 fixed typos

  47. arXiv:2305.06511  [pdf, other]

    eess.IV cs.CV

    ParamNet: A Parameter-variable Network for Fast Stain Normalization

    Authors: Hongtao Kang, Die Luo, Li Chen, Junbo Hu, Shenghua Cheng, Tingwei Quan, Shaoqun Zeng, Xiuli Liu

    Abstract: In practice, digital pathology images are often affected by various factors, resulting in very large differences in color and brightness. Stain normalization can effectively reduce the differences in color and brightness of digital pathology images, thus improving the performance of computer-aided diagnostic systems. Conventional stain normalization methods rely on one or several reference images,…

    Submitted 10 May, 2023; originally announced May 2023.

  48. arXiv:2305.04076  [pdf, other]

    cs.CL cs.AI

    SANTA: Separate Strategies for Inaccurate and Incomplete Annotation Noise in Distantly-Supervised Named Entity Recognition

    Authors: Shuzheng Si, Zefan Cai, Shuang Zeng, Guoqiang Feng, Jiaxing Lin, Baobao Chang

    Abstract: Distantly-Supervised Named Entity Recognition effectively alleviates the burden of time-consuming and expensive annotation in the supervised setting. But the context-free matching process and the limited coverage of knowledge bases introduce inaccurate and incomplete annotation noise respectively. Previous studies either considered only incomplete annotation noise or indiscriminately handle two ty…

    Submitted 28 July, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

    Comments: Findings of ACL 2023

  49. arXiv:2303.12981  [pdf, other]

    cs.LG math.OC

    Connected Superlevel Set in (Deep) Reinforcement Learning and its Application to Minimax Theorems

    Authors: Sihan Zeng, Thinh T. Doan, Justin Romberg

    Abstract: The aim of this paper is to improve the understanding of the optimization landscape for policy optimization problems in reinforcement learning. Specifically, we show that the superlevel set of the objective function with respect to the policy parameter is always a connected set both in the tabular setting and under policies represented by a class of neural networks. In addition, we show that the o…

    Submitted 30 September, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  50. arXiv:2303.02668  [pdf, other]

    cs.LG cs.AI cs.DC

    Knowledge-Enhanced Semi-Supervised Federated Learning for Aggregating Heterogeneous Lightweight Clients in IoT

    Authors: Jiaqi Wang, Shenglai Zeng, Zewei Long, Yaqing Wang, Fenglong Ma

    Abstract: Federated learning (FL) enables multiple clients to train models collaboratively without sharing local data, which has achieved promising results in different areas, including the Internet of Things (IoT). However, end IoT devices do not have abilities to automatically annotate their collected data, which leads to the label shortage issue at the client side. To collaboratively train an FL model, w…

    Submitted 5 March, 2023; originally announced March 2023.

    Comments: This paper is accepted by SDM-2023. Jiaqi Wang and Shenglai Zeng contributed equally