Showing 1–50 of 739 results for author: Lu, Z

  1. arXiv:2407.01491  [pdf, other]

    cs.CL cs.CV

    Expressive and Generalizable Low-rank Adaptation for Large Models via Slow Cascaded Learning

    Authors: Siwei Li, Yifan Yang, Yifei Shen, Fangyun Wei, Zongqing Lu, Lili Qiu, Yuqing Yang

    Abstract: Efficient fine-tuning plays a fundamental role in modern large models, with low-rank adaptation emerging as a particularly promising approach. However, the existing variants of LoRA are hampered by limited expressiveness, a tendency to overfit, and sensitivity to hyperparameter settings. This paper presents LoRA Slow Cascade Learning (LoRASC), an innovative technique designed to enhance LoRA's exp…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.00782  [pdf, other]

    cs.CL

    Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning

    Authors: Zimu Lu, Aojun Zhou, Ke Wang, Houxing Ren, Weikang Shi, Junting Pan, Mingjie Zhan

    Abstract: Direct Preference Optimization (DPO) has proven effective at improving the performance of large language models (LLMs) on downstream tasks such as reasoning and alignment. In this work, we propose Step-Controlled DPO (SCDPO), a method for automatically providing stepwise error supervision by creating negative samples of mathematical reasoning rationales that start making errors at a specified step…

    Submitted 30 June, 2024; originally announced July 2024.

  3. arXiv:2407.00735  [pdf, other]

    physics.flu-dyn cs.LG

    Generative prediction of flow field based on the diffusion model

    Authors: Jiajun Hu, Zhen Lu, Yue Yang

    Abstract: We propose a geometry-to-flow diffusion model that utilizes the input of obstacle shape to predict a flow field past the obstacle. The model is based on a learnable Markov transition kernel to recover the data distribution from the Gaussian distribution. The Markov process is conditioned on the obstacle geometry, estimating the noise to be removed at each step, implemented via a U-Net. A cross-att…

    Submitted 30 June, 2024; originally announced July 2024.

  4. arXiv:2406.17841  [pdf, other]

    quant-ph cs.AI

    Probing many-body Bell correlation depth with superconducting qubits

    Authors: Ke Wang, Weikang Li, Shibo Xu, Mengyao Hu, Jiachen Chen, Yaozu Wu, Chuanyu Zhang, Feitong Jin, Xuhao Zhu, Yu Gao, Ziqi Tan, Aosai Zhang, Ning Wang, Yiren Zou, Tingting Li, Fanhao Shen, Jiarun Zhong, Zehang Bao, Zitian Zhu, Zixuan Song, Jinfeng Deng, Hang Dong, Xu Zhang, Pengfei Zhang, Wenjie Jiang, et al. (10 additional authors not shown)

    Abstract: Quantum nonlocality describes a stronger form of quantum correlation than that of entanglement. It refutes Einstein's belief of local realism and is among the most distinctive and enigmatic features of quantum mechanics. It is a crucial resource for achieving quantum advantages in a variety of practical applications, ranging from cryptography and certified random number generation via self-testing…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 11 pages, 6 figures + 14 pages, 6 figures

  5. arXiv:2406.17755  [pdf, other]

    cs.CL

    Accelerating Clinical Evidence Synthesis with Large Language Models

    Authors: Zifeng Wang, Lang Cao, Benjamin Danek, Yichi Zhang, Qiao Jin, Zhiyong Lu, Jimeng Sun

    Abstract: Automatic medical discovery by AI is a dream of many. One step toward that goal is to create an AI model to understand clinical studies and synthesize clinical evidence from the literature. Clinical evidence synthesis currently relies on systematic reviews of clinical trials and retrospective analyses from medical literature. However, the rapid expansion of publications presents challenges in effi…

    Submitted 25 June, 2024; originally announced June 2024.

  6. arXiv:2406.15480  [pdf, other]

    cs.CL cs.AI cs.LG

    On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion

    Authors: Chenghao Fan, Zhenyi Lu, Wei Wei, Jie Tian, Xiaoye Qu, Dangyang Chen, Yu Cheng

    Abstract: Efficient fine-tuning of large language models for task-specific applications is imperative, yet the vast number of parameters in these models makes their training increasingly challenging. Despite numerous proposals for effective methods, a substantial memory overhead remains for gradient computations during updates. Can we fine-tune a series of task-specific small models and transfer their…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: submit under review

  7. arXiv:2406.15479  [pdf, other]

    cs.CL cs.AI cs.LG

    Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging

    Authors: Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

    Abstract: In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models and (b) heterogeneous data during testing. Traditional model merging methods often show significant performance gaps compared to fine-tuned models due to these i…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: submit in review

  8. arXiv:2406.12588  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    UIFV: Data Reconstruction Attack in Vertical Federated Learning

    Authors: Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao

    Abstract: Vertical Federated Learning (VFL) facilitates collaborative machine learning without the need for participants to share raw private data. However, recent studies have revealed privacy risks where adversaries might reconstruct sensitive features through data leakage during the learning process. Although data reconstruction methods based on gradient or model information are somewhat effective, they…

    Submitted 18 June, 2024; originally announced June 2024.

  9. Evolutionary Spiking Neural Networks: A Survey

    Authors: Shuaijie Shen, Rui Zhang, Chao Wang, Renzhuo Huang, Aiersi Tuerhong, Qinghai Guo, Zhichao Lu, Jianguo Zhang, Luziwei Leng

    Abstract: Spiking neural networks (SNNs) are gaining increasing attention as potential computationally efficient alternatives to traditional artificial neural networks (ANNs). However, the unique information propagation mechanisms and the complexity of SNN neuron models pose challenges for adopting traditional methods developed for ANNs to SNNs. These challenges include both weight learning and architecture…

    Submitted 18 June, 2024; originally announced June 2024.

    Journal ref: J Membr Comput (2024)

  10. arXiv:2406.12259  [pdf]

    cs.AI

    Adversarial Attacks on Large Language Models in Medicine

    Authors: Yifan Yang, Qiao Jin, Furong Huang, Zhiyong Lu

    Abstract: The integration of Large Language Models (LLMs) into healthcare applications offers promising advancements in medical diagnostics, treatment recommendations, and patient care. However, the susceptibility of LLMs to adversarial attacks poses a significant threat, potentially leading to harmful outcomes in delicate medical contexts. This study investigates the vulnerability of LLMs to two types of a…

    Submitted 18 June, 2024; originally announced June 2024.

  11. arXiv:2406.12036  [pdf, other]

    cs.CL cs.AI

    MedCalc-Bench: Evaluating Large Language Models for Medical Calculations

    Authors: Nikhil Khandekar, Qiao Jin, Guangzhi Xiong, Soren Dunn, Serina S Applebaum, Zain Anwar, Maame Sarfo-Gyamfi, Conrad W Safranek, Abid A Anwar, Andrew Zhang, Aidan Gilson, Maxwell B Singer, Amisha Dave, Andrew Taylor, Aidong Zhang, Qingyu Chen, Zhiyong Lu

    Abstract: As opposed to evaluating computation and logic-based reasoning, current benchmarks for evaluating large language models (LLMs) in medicine are primarily focused on question-answering involving domain knowledge and descriptive reasoning. While such qualitative capabilities are vital to medical diagnosis, in real-world scenarios, doctors frequently use clinical calculators that follow quantitative e…

    Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Github link: https://github.com/ncbi-nlp/MedCalc-Bench HuggingFace link: https://huggingface.co/datasets/nsk7153/MedCalc-Bench

  12. arXiv:2406.10671  [pdf]

    cs.CL

    Augmenting Biomedical Named Entity Recognition with General-domain Resources

    Authors: Yu Yin, Hyunjae Kim, Xiao Xiao, Chih Hsuan Wei, Jaewoo Kang, Zhiyong Lu, Hua Xu, Meng Fang, Qingyu Chen

    Abstract: Training a neural network-based biomedical named entity recognition (BioNER) model usually requires extensive and costly human annotations. While several studies have employed multi-task learning with multiple BioNER datasets to reduce human effort, this approach does not consistently yield performance improvements and may introduce label ambiguity in different biomedical corpora. We aim to tackle…

    Submitted 18 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: We make data, codes, and models publicly available via https://github.com/qingyu-qc/bioner_gerbera

  13. arXiv:2406.07349  [pdf, other]

    cs.CR

    Erasing Radio Frequency Fingerprints via Active Adversarial Perturbation

    Authors: Zhaoyi Lu, Wenchao Xu, Ming Tu, Xin Xie, Cunqing Hua, Nan Cheng

    Abstract: Radio Frequency (RF) fingerprinting is to identify a wireless device from its uniqueness of the analog circuitry or hardware imperfections. However, unlike the MAC address which can be modified, such hardware feature is inevitable for the signal emitted to air, which can possibly reveal device whereabouts, e.g., a sniffer can use a pre-trained model to identify a nearby device when receiving its s…

    Submitted 12 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  14. arXiv:2406.07001  [pdf, other]

    cs.CL cs.AI

    Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models

    Authors: Zhenyi Lu, Jie Tian, Wei Wei, Xiaoye Qu, Yu Cheng, Wenfeng Xie, Dangyang Chen

    Abstract: Text classification is a crucial task encountered frequently in practical scenarios, yet it is still under-explored in the era of large language models (LLMs). This study shows that LLMs are vulnerable to changes in the number and arrangement of options in text classification. Our extensive empirical analyses reveal that the key bottleneck arises from ambiguous decision boundaries and inherent bia…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ACL2024 findings

  15. arXiv:2406.03688  [pdf, other]

    eess.IV cs.CV

    Shadow and Light: Digitally Reconstructed Radiographs for Disease Classification

    Authors: Benjamin Hou, Qingqing Zhu, Tejas Sudarshan Mathai, Qiao Jin, Zhiyong Lu, Ronald M. Summers

    Abstract: In this paper, we introduce DRR-RATE, a large-scale synthetic chest X-ray dataset derived from the recently released CT-RATE dataset. DRR-RATE comprises 50,188 frontal Digitally Reconstructed Radiographs (DRRs) from 21,304 unique patients. Each image is paired with a corresponding radiology text report and binary labels for 18 pathology classes. Given the controllable nature of DRR generation,…

    Submitted 5 June, 2024; originally announced June 2024.

  16. arXiv:2406.03271  [pdf, other]

    cs.CV

    Image Copy-Move Forgery Detection and Localization Scheme: How to Avoid Missed Detection and False Alarm

    Authors: Li Jiang, Zhaowei Lu, Yuebing Gao, Yifan Wang

    Abstract: Image copy-move is an operation that replaces one part of the image with another part of the same image, which can be used for illegal purposes due to the potential semantic changes. Recent studies have shown that keypoint-based algorithms achieved excellent and robust localization performance even when small or smooth tampered areas were involved. However, when the input image is low-resolution,…

    Submitted 5 June, 2024; originally announced June 2024.

  17. arXiv:2406.01799  [pdf, other]

    cs.LG math.OC stat.ML

    Online Control in Population Dynamics

    Authors: Noah Golowich, Elad Hazan, Zhou Lu, Dhruv Rohatgi, Y. Jennifer Sun

    Abstract: The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics. Most studies on population dynamics focus on the problem of prediction rather than control. Existing mathematical models for control in population dynamics are often restricted to specific, noise-free dynamics,…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  18. arXiv:2406.01394  [pdf, other]

    cs.CR cs.AI

    PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration

    Authors: Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Huiping Zhuang, Cen Chen

    Abstract: The widespread usage of online Large Language Models (LLMs) inference services has raised significant privacy concerns about the potential exposure of private information in user inputs to eavesdroppers or untrustworthy service providers. Existing privacy protection methods for LLMs suffer from insufficient privacy protection, performance degradation, or severe inference time overhead. In this pap…

    Submitted 3 June, 2024; originally announced June 2024.

  19. arXiv:2406.00485  [pdf]

    eess.IV cs.RO

    TacShade: A New 3D-printed Soft Optical Tactile Sensor Based on Light, Shadow and Greyscale for Shape Reconstruction

    Authors: Zhenyu Lu, Jialong Yang, Haoran Li, Yifan Li, Weiyong Si, Nathan Lepora, Chenguang Yang

    Abstract: In this paper, we present TacShade, a newly designed 3D-printed soft optical tactile sensor. The sensor is developed for shape reconstruction under the inspiration of sketch drawing that uses the density of sketch lines to draw light and shadow, resulting in the creation of a 3D-view effect. TacShade, building upon the strengths of the TacTip, a single-camera tactile sensor of large in-depth de…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted by ICRA 2024

  20. arXiv:2406.00337  [pdf, other]

    cs.HC

    The Odyssey Journey: Hemifacial Spasm Patients' Top-Tier Medical Resource Seeking in China from an Actor-Network Perspective

    Authors: Ka I Chan, Yuntao Wang, Siying Hu, Bo Hei, Zhicong Lu, Pei-Luen Patrick Rau, Yuanchun Shi

    Abstract: Health information-seeking behaviors are critical for individuals managing illnesses, especially in cases like hemifacial spasm (HFS), a condition familiar to specialists but not to general practitioners and the broader public. The limited awareness of HFS often leads to scarce online resources for self-diagnosis and a heightened risk of misdiagnosis. In China, the imbalance in the doctor-to-patie…

    Submitted 1 June, 2024; originally announced June 2024.

  21. arXiv:2405.19609  [pdf, other]

    cs.CV cs.GR

    SMPLX-Lite: A Realistic and Drivable Avatar Benchmark with Rich Geometry and Texture Annotations

    Authors: Yujiao Jiang, Qingmin Liao, Zhaolong Wang, Xiangru Lin, Zongqing Lu, Yuxi Zhao, Hanqing Wei, Jingrui Ye, Yu Zhang, Zhijing Shao

    Abstract: Recovering photorealistic and drivable full-body avatars is crucial for numerous applications, including virtual reality, 3D games, and tele-presence. Most methods, whether reconstruction or generation, require large numbers of human motion sequences and corresponding textured meshes. To easily learn a drivable avatar, a reasonable parametric body model with unified topology is paramount. However,…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICME 2024; Project page: https://alex-jyj.github.io/SMPLX-Lite/

  22. arXiv:2405.18577  [pdf, other]

    math.OC cs.LG stat.ML

    Single-loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions

    Authors: Quanqi Hu, Qi Qi, Zhaosong Lu, Tianbao Yang

    Abstract: In this paper, we study a class of non-smooth non-convex problems in the form of $\min_{x}[\max_{y\in Y}φ(x, y) - \max_{z\in Z}ψ(x, z)]$, where both $Φ(x) = \max_{y\in Y}φ(x, y)$ and $Ψ(x)=\max_{z\in Z}ψ(x, z)$ are weakly convex functions, and $φ(x, y), ψ(x, z)$ are strongly concave functions in terms of $y$ and $z$, respectively. It covers two families of problems that have been studied but are m…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  23. arXiv:2405.16205  [pdf]

    cs.AI cs.CL

    GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases

    Authors: Zhizheng Wang, Qiao Jin, Chih-Hsuan Wei, Shubo Tian, Po-Ting Lai, Qingqing Zhu, Chi-Ping Day, Christina Ross, Zhiyong Lu

    Abstract: Gene set knowledge discovery is essential for advancing human functional genomics. Recent studies have shown promising performance by harnessing the power of Large Language Models (LLMs) on this task. Nonetheless, their results are subject to several limitations common in LLMs such as hallucinations. In response, we present GeneAgent, a first-of-its-kind language agent featuring self-verification…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 30 pages with 10 figures and/or tables

  24. arXiv:2405.15541  [pdf, other]

    cs.CV

    Learning Generalizable Human Motion Generator with Reinforcement Learning

    Authors: Yunyao Mao, Xiaoyang Liu, Wengang Zhou, Zhenbo Lu, Houqiang Li

    Abstract: Text-driven human motion generation, as one of the vital tasks in computer-aided content creation, has recently attracted increasing attention. While pioneering research has largely focused on improving numerical performance metrics on given datasets, practical applications reveal a common challenge: existing methods often overfit specific motion expressions in the training data, hindering their a…

    Submitted 24 May, 2024; originally announced May 2024.

  25. arXiv:2405.15373  [pdf, other]

    cs.RO cs.AI

    Autonomous Quilt Spreading for Caregiving Robots

    Authors: Yuchun Guo, Zhiqing Lu, Yanling Zhou, Xin Jiang

    Abstract: In this work, we propose a novel strategy to ensure infants, who inadvertently displace their quilts during sleep, are promptly and accurately re-covered. Our approach is formulated into two subsequent steps: interference resolution and quilt spreading. By leveraging the DWPose human skeletal detection and the Segment Anything instance segmentation models, the proposed method can accurately recogn…

    Submitted 24 May, 2024; originally announced May 2024.

  26. arXiv:2405.15369  [pdf, other]

    cs.LG cs.AI

    Cross-Domain Policy Adaptation by Capturing Representation Mismatch

    Authors: Jiafei Lyu, Chenjia Bai, Jingwen Yang, Zongqing Lu, Xiu Li

    Abstract: It is vital to learn effective policies that can be transferred to different domains with dynamics discrepancies in reinforcement learning (RL). In this paper, we consider dynamics adaptation settings where there exists dynamics mismatch between the source domain and the target domain, and one can get access to sufficient source domain data, while can only have limited interactions with the target…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  27. arXiv:2405.10530  [pdf, other]

    cs.CV

    CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation

    Authors: Mushui Liu, Jun Dan, Ziqian Lu, Yunlong Yu, Yingming Li, Xi Li

    Abstract: Due to the large-scale image size and object variations, current CNN-based and Transformer-based approaches for remote sensing image semantic segmentation are suboptimal for capturing the long-range dependency or limited to the complex computational complexity. In this paper, we propose CM-UNet, comprising a CNN-based encoder for extracting local image features and a Mamba-based decoder for aggreg…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 5 pages, 6 figures

  28. arXiv:2405.08780  [pdf]

    cs.CV cs.AI

    Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling

    Authors: Gregory Holste, Mingquan Lin, Ruiwen Zhou, Fei Wang, Lei Liu, Qi Yan, Sarah H. Van Tassel, Kyle Kovacs, Emily Y. Chew, Zhiyong Lu, Zhangyang Wang, Yifan Peng

    Abstract: Deep learning has enabled breakthroughs in automated diagnosis from medical imaging, with many successful applications in ophthalmology. However, standard medical image classification approaches only assess disease presence at the time of acquisition, neglecting the common clinical setting of longitudinal imaging. For slow, progressive eye diseases like age-related macular degeneration (AMD) and p…

    Submitted 14 May, 2024; originally announced May 2024.

  29. arXiv:2405.07990  [pdf, other]

    cs.CL cs.CV

    Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots

    Authors: Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo

    Abstract: The remarkable progress of Multi-modal Large Language Models (MLLMs) has attracted significant attention due to their superior performance in visual contexts. However, their capabilities in turning visual figure to executable code, have not been evaluated thoroughly. To address this, we introduce Plot2Code, a comprehensive visual coding benchmark designed for a fair and in-depth assessment of MLLM…

    Submitted 13 May, 2024; originally announced May 2024.

  30. arXiv:2405.06238  [pdf]

    cs.LG

    A Novel Pseudo Nearest Neighbor Classification Method Using Local Harmonic Mean Distance

    Authors: Junzhuo Chen, Zhixin Lu, Shitong Kang

    Abstract: In the realm of machine learning, the KNN classification algorithm is widely recognized for its simplicity and efficiency. However, its sensitivity to the K value poses challenges, especially with small sample sizes or outliers, impacting classification performance. This article introduces a novel KNN-based classifier called LMPHNN (Novel Pseudo Nearest Neighbor Classification Method Using Local H…

    Submitted 27 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  31. arXiv:2405.04656  [pdf, other]

    cs.HC

    Corporate Communication Companion (CCC): An LLM-empowered Writing Assistant for Workplace Social Media

    Authors: Zhuoran Lu, Sheshera Mysore, Tara Safavi, Jennifer Neville, Longqi Yang, Mengting Wan

    Abstract: Workplace social media platforms enable employees to cultivate their professional image and connect with colleagues in a semi-formal environment. While semi-formal corporate communication poses a unique set of challenges, large language models (LLMs) have shown great promise in helping users draft and edit their social media posts. However, LLMs may fail to capture individualized tones and voices…

    Submitted 7 May, 2024; originally announced May 2024.

  32. arXiv:2405.02080  [pdf, ps, other]

    cs.IT

    Coding for Synthesis Defects

    Authors: Ziyang Lu, Han Mao Kiah, Yiwei Zhang, Robert N. Grass, Eitan Yaakobi

    Abstract: Motivated by DNA based data storage system, we investigate the errors that occur when synthesizing DNA strands in parallel, where each strand is appended one nucleotide at a time by the machine according to a template supersequence. If there is a cycle such that the machine fails, then the strands meant to be appended at this cycle will not be appended, and we refer to this as a synthesis defect.…

    Submitted 3 May, 2024; originally announced May 2024.

  33. arXiv:2405.00882  [pdf, other]

    cs.RO eess.SY

    A Differentiable Dynamic Modeling Approach to Integrated Motion Planning and Actuator Physical Design for Mobile Manipulators

    Authors: Zehui Lu, Yebin Wang

    Abstract: This paper investigates the differentiable dynamic modeling of mobile manipulators to facilitate efficient motion planning and physical design of actuators, where the actuator design is parameterized by physically meaningful motor geometry parameters. These parameters impact the manipulator's link mass, inertia, center-of-mass, torque constraints, and angular velocity constraints, influencing cont…

    Submitted 1 May, 2024; originally announced May 2024.

  34. arXiv:2404.19444  [pdf, other]

    cs.CV

    AnomalyXFusion: Multi-modal Anomaly Synthesis with Diffusion

    Authors: Jie Hu, Yawen Huang, Yilin Lu, Guoyang Xie, Guannan Jiang, Yefeng Zheng, Zhichao Lu

    Abstract: Anomaly synthesis is one of the effective methods to augment abnormal samples for training. However, current anomaly synthesis methods predominantly rely on texture information as input, which limits the fidelity of synthesized abnormal samples. Because texture information is insufficient to correctly depict the pattern of anomalies, especially for logical anomalies. To surmount this obstacle, we…

    Submitted 1 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

  35. arXiv:2404.18433  [pdf, other]

    cs.CV

    ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal

    Authors: Zhuohao Li, Guoyang Xie, Guannan Jiang, Zhichao Lu

    Abstract: Transformer recently emerged as the de facto model for computer vision tasks and has also been successfully applied to shadow removal. However, these existing methods heavily rely on intricate modifications to the attention mechanisms within the transformer blocks while using a generic patch embedding. As a result, it often leads to complex architectural designs requiring additional computation re…

    Submitted 30 April, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  36. arXiv:2404.17875  [pdf, other]

    cs.LG

    Noisy Node Classification by Bi-level Optimization based Multi-teacher Distillation

    Authors: Yujing Liu, Zongqian Wu, Zhengyu Lu, Ci Nie, Guoqiu Wen, Ping Hu, Xiaofeng Zhu

    Abstract: Previous graph neural networks (GNNs) usually assume that the graph data is with clean labels for representation learning, but it is not true in real applications. In this paper, we propose a new multi-teacher distillation method based on bi-level optimization (namely BO-NNC), to conduct noisy node classification on the graph data. Specifically, we first employ multiple self-supervised learning me…

    Submitted 8 May, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

  37. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  38. A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation

    Authors: Yifan Zhao, Zhenyu Liang, Zhichao Lu, Ran Cheng

    Abstract: As one of the emerging challenges in Automated Machine Learning, the Hardware-aware Neural Architecture Search (HW-NAS) tasks can be treated as black-box multi-objective optimization problems (MOPs). An important application of HW-NAS is real-time semantic segmentation, which plays a pivotal role in autonomous driving scenarios. The HW-NAS for real-time semantic segmentation inherently needs to ba…

    Submitted 28 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: GECCO 2024

  39. arXiv:2404.14209  [pdf]

    cs.CL

    EnzChemRED, a rich enzyme chemistry relation extraction dataset

    Authors: Po-Ting Lai, Elisabeth Coudert, Lucila Aimo, Kristian Axelsen, Lionel Breuza, Edouard de Castro, Marc Feuermann, Anne Morgat, Lucille Pourcel, Ivo Pedruzzi, Sylvain Poux, Nicole Redaschi, Catherine Rivoire, Anastasia Sveshnikova, Chih-Hsuan Wei, Robert Leaman, Ling Luo, Zhiyong Lu, Alan Bridge

    Abstract: Expert curation is essential to capture knowledge of enzyme functions from the scientific literature in FAIR open knowledgebases but cannot keep pace with the rate of new discoveries and new publications. In this work we present EnzChemRED, for Enzyme Chemistry Relation Extraction Dataset, a new training and benchmarking dataset to support the development of Natural Language Processing (NLP) metho…

    Submitted 22 April, 2024; originally announced April 2024.

  40. arXiv:2404.13915  [pdf, other]

    math.OC cs.RO

    Angle-Aware Coverage with Camera Rotational Motion Control

    Authors: Zhiyuan Lu, Muhammad Hanif, Takumi Shimizu, Takeshi Hatanaka

    Abstract: This paper presents a novel control strategy for drone networks to improve the quality of 3D structures reconstructed from aerial images by drones. Unlike the existing coverage control strategies for this purpose, our proposed approach simultaneously controls both the camera orientation and drone translational motion, enabling more comprehensive perspectives and enhancing the map's overall quality…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 17 pages, 8 figures, 2 tables

  41. arXiv:2404.13875  [pdf, ps, other]

    cs.IT eess.SP

    Active RIS-Aided Massive MIMO Uplink Systems with Low-Resolution ADCs

    Authors: Zhangjie Peng, Zecheng Lu, Xue Liu, Cunhua Pan, Jiangzhou Wang

    Abstract: This letter considers an active reconfigurable intelligent surface (RIS)-aided multi-user uplink massive multiple-input multiple-output (MIMO) system with low-resolution analog-to-digital converters (ADCs). The letter derives the closed-form approximate expression for the sum achievable rate (AR), where the maximum ratio combination (MRC) processing and low-resolution ADCs are applied at the base st…

    Submitted 22 April, 2024; originally announced April 2024.

  42. arXiv:2404.10757  [pdf, other]

    astro-ph.IM astro-ph.SR cs.CL cs.LG

    Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification

    Authors: Yu-Yang Li, Yu Bai, Cunshi Wang, Mengwei Qu, Ziteng Lu, Roberto Soria, Jifeng Liu

    Abstract: Light curves serve as a valuable source of information on stellar formation and evolution. With the rapid advancement of machine learning techniques, it can be effectively processed to extract astronomical patterns and information. In this study, we present a comprehensive evaluation of deep-learning and large language model (LLM) based models for the automatic classification of variable star ligh…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 35 pages, 20 figures

  43. arXiv:2404.10498  [pdf, other]

    cs.AI cs.CV cs.DC

    LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Perception System

    Authors: Shijing Hu, Ruijun Deng, Xin Du, Zhihui Lu, Qiang Duan, Yi He, Shih-Chia Huang, Jie Wu

    Abstract: Recent large vision models (e.g., SAM) enjoy great potential to facilitate intelligent perception with high accuracy. Yet, the resource constraints in the IoT environment tend to prevent such large vision models from being deployed locally, incurring considerable inference latency and thereby making it difficult to support real-time applications, such as autonomous driving and robotics. Edge-cloud collaborat…

    Submitted 16 April, 2024; originally announced April 2024.

  44. arXiv:2404.09091  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    Semantic In-Domain Product Identification for Search Queries

    Authors: Sanat Sharma, Jayant Kumar, Twisha Naik, Zhaoyu Lu, Arvind Srikantan, Tracy Holloway King

    Abstract: Accurate explicit and implicit product identification in search queries is critical for enhancing user experiences, especially at a company like Adobe, which has over 50 products and covers queries across hundreds of tools. In this work, we present a novel approach to training a product classifier from user behavioral data. Our semantic model led to >25% relative improvement in CTR (click through r…

    Submitted 29 May, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

  45. arXiv:2404.08449  [pdf, other]

    cs.CV

    OccGaussian: 3D Gaussian Splatting for Occluded Human Rendering

    Authors: Jingrui Ye, Zongkai Zhang, Yujiao Jiang, Qingmin Liao, Wenming Yang, Zongqing Lu

    Abstract: Rendering dynamic 3D humans from monocular videos is crucial for various applications such as virtual reality and digital entertainment. Most methods assume the person is in an unobstructed scene, while various objects may cause the occlusion of body parts in real-life scenarios. A previous method utilizes NeRF for surface rendering to recover the occluded areas, but it requires more than one day t…

    Submitted 14 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

  46. arXiv:2404.07176  [pdf, other]

    cs.CV

    Self-supervised Monocular Depth Estimation on Water Scenes via Specular Reflection Prior

    Authors: Zhengyang Lu, Ying Chen

    Abstract: Monocular depth estimation from a single image is an ill-posed problem for computer vision due to insufficient reliable cues as the prior knowledge. Besides the inter-frame supervision, namely stereo and adjacent frames, extensive prior information is available in the same frame. Reflections from specular surfaces, informative intra-frame priors, enable us to reformulate the ill-posed depth estima…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 16 pages, 8 figures

  47. arXiv:2404.06270  [pdf, other]

    cs.CV

    3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis

    Authors: Zhicheng Lu, Xiang Guo, Le Hui, Tianrui Chen, Min Yang, Xiao Tang, Feng Zhu, Yuchao Dai

    Abstract: In this paper, we propose a 3D geometry-aware deformable Gaussian Splatting method for dynamic view synthesis. Existing neural radiance fields (NeRF) based solutions learn the deformation in an implicit manner, which cannot incorporate 3D scene geometry. Therefore, the learned deformation is not necessarily geometrically coherent, which results in unsatisfactory dynamic view synthesis and 3D dynam…

    Submitted 14 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024. Project page: https://npucvr.github.io/GaGS/

  48. arXiv:2404.05952  [pdf, other]

    cs.RO

    Robot Safe Planning In Dynamic Environments Based On Model Predictive Control Using Control Barrier Function

    Authors: Zetao Lu, Kaijun Feng, Jun Xu, Haoyao Chen, Yunjiang Lou

    Abstract: Implementing obstacle avoidance in dynamic environments is a challenging problem for robots. Model predictive control (MPC) is a popular strategy for dealing with this type of problem, and recent work mainly uses control barrier functions (CBFs) as hard constraints to ensure that the system state remains in the safe set. However, in crowded scenarios, effective solutions may not be obtained due to i…

    Submitted 8 April, 2024; originally announced April 2024.

  49. arXiv:2404.05880  [pdf, other]

    cs.CL

    Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge

    Authors: Weikai Lu, Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Zelin Chen, Huiping Zhuang, Cen Chen

    Abstract: Jailbreaking attacks can enable Large Language Models (LLMs) to bypass their safeguards and generate harmful content. Existing jailbreaking defense methods have failed to address the fundamental issue that harmful knowledge resides within the model, leading to potential jailbreak risks for LLMs. In this paper, we propose a novel defense method called Eraser, which mainly includes three goals: unlearn…

    Submitted 8 April, 2024; originally announced April 2024.

  50. arXiv:2404.05136  [pdf, other]

    cs.CV cs.AI

    Self-Supervised Multi-Object Tracking with Path Consistency

    Authors: Zijia Lu, Bing Shuai, Yanbei Chen, Zhenlin Xu, Davide Modolo

    Abstract: In this paper, we propose a novel concept of path consistency to learn robust object matching without using manual object identity supervision. Our key idea is that, to track an object through frames, we can obtain multiple different association results from a model by varying the frames it can observe, i.e., skipping frames in observation. As the differences in observations do not alter the identi…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024