Showing 1–50 of 174 results for author: Qian, Z

  1. arXiv:2406.11432  [pdf, other]

    cs.CV cs.AI

    AnyTrans: Translate AnyText in the Image with Large Scale Models

    Authors: Zhipeng Qian, Pei Zhang, Baosong Yang, Kai Fan, Yiwei Ma, Derek F. Wong, Xiaoshuai Sun, Rongrong Ji

    Abstract: This paper introduces AnyTrans, an all-encompassing framework for the task Translate AnyText in the Image (TATI), which includes multilingual text translation and text fusion within images. Our framework leverages the strengths of large-scale models, such as Large Language Models (LLMs) and text-guided diffusion models, to incorporate contextual cues from both textual and visual elements during tr…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.01063  [pdf, other]

    cs.CV

    DANCE: Dual-View Distribution Alignment for Dataset Condensation

    Authors: Hansong Zhang, Shikun Li, Fanzhao Lin, Weiping Wang, Zhenxing Qian, Shiming Ge

    Abstract: Dataset condensation addresses the problem of data burden by learning a small synthetic training set that preserves essential knowledge from the larger real training set. To date, the state-of-the-art (SOTA) results are often yielded by optimization-oriented methods, but their inefficiency hinders their application to realistic datasets. On the other hand, the Distribution-Matching (DM) methods sh…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: This work has been accepted by IJCAI-24

  3. arXiv:2405.19769  [pdf, other]

    cs.CV

    All-In-One Medical Image Restoration via Task-Adaptive Routing

    Authors: Zhiwen Yang, Haowei Chen, Ziniu Qian, Yang Yi, Hui Zhang, Dan Zhao, Bingzheng Wei, Yan Xu

    Abstract: Although single-task medical image restoration (MedIR) has witnessed remarkable success, the limited generalizability of these methods poses a substantial obstacle to wider application. In this paper, we focus on the task of all-in-one medical image restoration, aiming to address multiple distinct MedIR tasks with a single universal model. Nonetheless, due to significant differences between differ…

    Submitted 28 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: This article has been early accepted by MICCAI 2024

  4. arXiv:2405.15154  [pdf, other]

    cs.AI cs.LG

    Online Prompt Pricing based on Combinatorial Multi-Armed Bandit and Hierarchical Stackelberg Game

    Authors: Meiling Li, Hongrun Ren, Haixu Xiong, Zhenxing Qian, Xinpeng Zhang

    Abstract: Generation models have shown promising performance in various tasks, making trading around machine learning models possible. In this paper, we aim at a novel prompt trading scenario, prompt bundle trading (PBT) system, and propose an online pricing mechanism. Based on the combinatorial multi-armed bandit (CMAB) and three-stage hierarchical Stackelberg (HS) game, our pricing mechanism considers the…

    Submitted 31 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  5. arXiv:2405.13532  [pdf, other]

    cs.CV

    What Makes Good Few-shot Examples for Vision-Language Models?

    Authors: Zhaojun Guo, Jinghui Lu, Xuejing Liu, Rui Zhao, ZhenXing Qian, Fei Tan

    Abstract: Despite the notable advancements achieved by leveraging pre-trained vision-language (VL) models through few-shot tuning for downstream tasks, our detailed empirical study highlights a significant dependence of few-shot learning outcomes on the careful selection of training examples - a facet that has been previously overlooked in research. In this study, we delve into devising more effective strat…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 8 pages, 4 figures

  6. arXiv:2405.11758  [pdf, other]

    cs.LG cs.AI

    Fed-Credit: Robust Federated Learning with Credibility Management

    Authors: Jiayan Chen, Zhirong Qian, Tianhui Meng, Xitong Gao, Tian Wang, Weijia Jia

    Abstract: Aiming at privacy preservation, Federated Learning (FL) is an emerging machine learning approach enabling model training on decentralized devices or data sources. The learning mechanism of FL relies on aggregating parameter updates from individual clients. However, this process may pose a potential security risk due to the presence of malicious devices. Existing solutions are either costly due to…

    Submitted 19 May, 2024; originally announced May 2024.

  7. arXiv:2405.02844  [pdf, other]

    cs.CV

    SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion

    Authors: Ziyun Qian, Zeyu Xiao, Zhenyi Wu, Dingkang Yang, Mingcheng Li, Shunli Wang, Shuaibing Wang, Dongliang Kou, Lihua Zhang

    Abstract: Motion style transfer is a significant research direction in multimedia applications. It enables the rapid switching of different styles of the same motion for virtual digital humans, thus vastly increasing the diversity and realism of movements. It is widely applied in multimedia scenarios such as movies, games, and the Metaverse. However, most of the current work in this field adopts the GAN, wh…

    Submitted 5 May, 2024; originally announced May 2024.

  8. arXiv:2404.16456  [pdf, other]

    cs.CV

    Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities

    Authors: Mingcheng Li, Dingkang Yang, Xiao Zhao, Shuaibing Wang, Yan Wang, Kun Yang, Mingyang Sun, Dongliang Kou, Ziyun Qian, Lihua Zhang

    Abstract: Multimodal sentiment analysis (MSA) aims to understand human sentiment through multimodal data. Most MSA efforts are based on the assumption of modality completeness. However, in real-world applications, some practical factors cause uncertain modality missingness, which drastically degrades the model's performance. To this end, we propose a Correlation-decoupled Knowledge Distillation (CorrKD) fra…

    Submitted 10 June, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  9. arXiv:2404.04584  [pdf, other]

    cs.CV

    D$^3$: Scaling Up Deepfake Detection by Learning from Discrepancy

    Authors: Yongqi Yang, Zhihao Qian, Ye Zhu, Yu Wu

    Abstract: The boom of Generative AI brings opportunities entangled with risks and concerns. In this work, we seek a step toward a universal deepfake detection system with better generalization and robustness, to accommodate the responsible deployment of diverse image generative models. We do so by first scaling up the existing detection task setup from the one-generator to multiple-generators in training, d…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 14 pages, 3 figures

  10. arXiv:2404.00726  [pdf, other]

    eess.IV cs.CV cs.LG

    MugenNet: A Novel Combined Convolution Neural Network and Transformer Network with its Application for Colonic Polyp Image Segmentation

    Authors: Chen Peng, Zhiqin Qian, Kunyu Wang, Qi Luo, Zhuming Bi, Wenjun Zhang

    Abstract: Biomedical image segmentation is a very important part of disease diagnosis. The term "colonic polyps" refers to polypoid lesions that occur on the surface of the colonic mucosa within the intestinal lumen. In clinical practice, early detection of polyps is conducted through colonoscopy examinations and biomedical image processing. Therefore, accurate polyp image segmentation is of great signi…

    Submitted 31 March, 2024; originally announced April 2024.

  11. arXiv:2404.00589   

    cs.LG cs.CL

    Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing

    Authors: Zhenyu Qian, Yiming Qian, Yuting Song, Fei Gao, Hai Jin, Chen Yu, Xia Xie

    Abstract: Handling graph data is one of the most difficult tasks. Traditional techniques, such as those based on geometry and matrix factorization, rely on assumptions about the data relations that become inadequate when handling large and complex graph data. On the other hand, deep learning approaches demonstrate promising results in handling large graph data, but they often fall short of providing interpr…

    Submitted 12 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: Because my organization does not allow members to privately upload papers to arXiv, I am requesting a withdrawal of my submission

  12. arXiv:2403.13349  [pdf, other]

    cs.LG cs.CV

    Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection

    Authors: Xincheng Yao, Ruoqi Li, Zefeng Qian, Lu Wang, Chongyang Zhang

    Abstract: Unified anomaly detection (AD) is one of the greatest challenges in anomaly detection, where one unified model is trained with normal samples from multiple classes with the objective of detecting anomalies in these classes. For such a challenging task, popular normalizing flow (NF) based AD methods may fall into a "homogeneous mapping" issue, where the NF-based AD models are biased to generate similar la…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: 15 pages

  13. arXiv:2403.10766  [pdf, other]

    cs.LG stat.ME

    ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference

    Authors: Krzysztof Kacprzyk, Samuel Holt, Jeroen Berrevoets, Zhaozhi Qian, Mihaela van der Schaar

    Abstract: Inferring unbiased treatment effects has received widespread attention in the machine learning community. In recent years, our community has proposed numerous solutions in standard settings, high-dimensional treatment settings, and even longitudinal settings. While very diverse, the solutions have mostly relied on neural networks for inference and simultaneous correction of assignment bias. New appr…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Published in The Twelfth International Conference on Learning Representations (ICLR). Copyright 2024 by the author(s)

  14. arXiv:2403.10492  [pdf, other]

    cs.CV

    Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning

    Authors: Dongmin Park, Zhaofang Qian, Guangxing Han, Ser-Nam Lim

    Abstract: Mitigating hallucinations of Large Vision Language Models (LVLMs) is crucial to enhance their reliability for general-purpose assistants. This paper shows that such hallucinations of LVLMs can be significantly exacerbated by preceding user-system dialogues. To precisely measure this, we first present an evaluation benchmark by extending popular multi-modal benchmark datasets with prepended halluci…

    Submitted 25 May, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  15. HandGCAT: Occlusion-Robust 3D Hand Mesh Reconstruction from Monocular Images

    Authors: Shuaibing Wang, Shunli Wang, Dingkang Yang, Mingcheng Li, Ziyun Qian, Liuzhen Su, Lihua Zhang

    Abstract: We propose a robust and accurate method for reconstructing 3D hand mesh from monocular images. This is a very challenging problem, as hands are often severely occluded by objects. Previous works often have disregarded 2D hand pose information, which contains hand prior knowledge that is strongly correlated with occluded regions. Thus, in this work, we propose a novel 3D hand mesh reconstruction ne…

    Submitted 26 February, 2024; originally announced March 2024.

    Comments: 6 pages, 4 figures, ICME-2023 conference paper

    Journal ref: 2023 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2023: 2495-2500

  16. arXiv:2403.06407  [pdf, other]

    cs.CV

    Can LLMs' Tuning Methods Work in Medical Multimodal Domain?

    Authors: Jiawei Chen, Yue Jiang, Dingkang Yang, Mingcheng Li, Jinjie Wei, Ziyun Qian, Lihua Zhang

    Abstract: While large language models (LLMs) excel in world knowledge understanding, adapting them to specific subfields requires precise adjustments. Due to the model's vast scale, traditional global fine-tuning methods for large models can be computationally expensive and impact generalization. To address this challenge, a range of innovative Parameters-Efficient Fine-Tuning (PEFT) methods have emerged an…

    Submitted 10 March, 2024; originally announced March 2024.

  17. arXiv:2403.01489  [pdf, other]

    cs.CV cs.AI

    Regeneration Based Training-free Attribution of Fake Images Generated by Text-to-Image Generative Models

    Authors: Meiling Li, Zhenxing Qian, Xinpeng Zhang

    Abstract: Text-to-image generative models have recently garnered significant attention due to their ability to generate images based on prompt descriptions. While these models have shown promising performance, concerns have been raised regarding the potential misuse of the generated fake images. In response to this, we have presented a simple yet effective training-free method to attribute fake images gener…

    Submitted 3 March, 2024; originally announced March 2024.

  18. arXiv:2403.00277  [pdf, other]

    cs.CL

    Gender Bias in Large Language Models across Multiple Languages

    Authors: Jinman Zhao, Yitian Ding, Chen Jia, Yining Wang, Zifan Qian

    Abstract: With the growing deployment of large language models (LLMs) across various applications, assessing the influence of gender biases embedded in LLMs becomes crucial. The topic of gender bias within the realm of natural language processing (NLP) has gained considerable focus, particularly in the context of English. Nonetheless, the investigation of gender bias in languages other than English is still…

    Submitted 29 February, 2024; originally announced March 2024.

    Comments: 20 pages, 27 tables, 7 figures, submitted to ACL2024

  19. arXiv:2402.17599  [pdf, other]

    cs.LG cs.AI stat.ML

    DAGnosis: Localized Identification of Data Inconsistencies using Structures

    Authors: Nicolas Huynh, Jeroen Berrevoets, Nabeel Seedat, Jonathan Crabbé, Zhaozhi Qian, Mihaela van der Schaar

    Abstract: Identification and appropriate handling of inconsistencies in data at deployment time is crucial to reliably use machine learning models. While recent data-centric methods are able to identify such inconsistencies with respect to the training set, they suffer from two key limitations: (1) suboptimality in settings where features exhibit statistical independencies, due to their usage of compressive…

    Submitted 28 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: AISTATS 2024; added correspondence email

  20. arXiv:2402.17210  [pdf, other]

    cs.CR cs.CV

    Purified and Unified Steganographic Network

    Authors: Guobiao Li, Sheng Li, Zicong Luo, Zhenxing Qian, Xinpeng Zhang

    Abstract: Steganography is the art of hiding secret data into the cover media for covert communication. In recent years, more and more deep neural network (DNN)-based steganographic schemes are proposed to train steganographic networks for secret embedding and recovery, which are shown to be promising. Compared with the handcrafted steganographic tools, steganographic networks tend to be large in size. It r…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 8 pages, 9 figures, Accepted at CVPR2024

  21. arXiv:2402.05212  [pdf, other]

    cs.SE cs.CR

    An Investigation of Patch Porting Practices of the Linux Kernel Ecosystem

    Authors: Xingyu Li, Zheng Zhang, Zhiyun Qian, Trent Jaeger, Chengyu Song

    Abstract: Open-source software is increasingly reused, complicating the process of patching to repair bugs. In the case of Linux, a distinct ecosystem has formed, with Linux mainline serving as the upstream, stable or long-term-support (LTS) systems forked from mainline, and Linux distributions, such as Ubuntu and Android, as downstreams forked from stable or LTS systems for end-user use. Ideally, when a pa…

    Submitted 7 February, 2024; originally announced February 2024.

  22. arXiv:2402.01422  [pdf, other]

    cs.CV

    EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation

    Authors: Guanwen Feng, Haoran Cheng, Yunan Li, Zhiyuan Ma, Chaoneng Li, Zhihao Qian, Qiguang Miao, Chi-Man Pun

    Abstract: Implementing fine-grained emotion control is crucial for emotion generation tasks because it enhances the expressive capability of the generative model, allowing it to accurately and comprehensively capture and express various nuanced emotional states, thereby improving the emotional quality and personalization of generated content. Generating fine-grained facial animations that accurately portray…

    Submitted 2 February, 2024; originally announced February 2024.

  23. arXiv:2401.17618  [pdf, other]

    cs.CR cs.OS

    Beyond Control: Exploring Novel File System Objects for Data-Only Attacks on Linux Systems

    Authors: Jinmeng Zhou, Jiayi Hu, Ziyue Pan, Jiaxun Zhu, Guoren Li, Wenbo Shen, Yulei Sui, Zhiyun Qian

    Abstract: The widespread deployment of control-flow integrity has propelled non-control data attacks into the mainstream. In the domain of OS kernel exploits, by corrupting critical non-control data, local attackers can directly gain root access or privilege escalation without hijacking the control flow. As a result, OS kernels have been restricting the availability of such non-control data. This forces att…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 14 pages, in submission to the 31st ACM Conference on Computer and Communications Security (CCS), 2024

  24. arXiv:2401.17617  [pdf, other]

    cs.CV cs.AI

    Unveiling the Power of Self-supervision for Multi-view Multi-human Association and Tracking

    Authors: Wei Feng, Feifan Wang, Ruize Han, Zekun Qian, Song Wang

    Abstract: Multi-view multi-human association and tracking (MvMHAT) is a new but important problem for multi-person scene video surveillance, aiming to track a group of people over time in each view, as well as to identify the same person across different views at the same time, which is different from previous MOT and multi-camera MOT tasks only considering the over-time human tracking. This way, the video…

    Submitted 31 January, 2024; originally announced January 2024.

  25. arXiv:2401.17205  [pdf, other]

    stat.ML cs.LG

    Adaptive Experiment Design with Synthetic Controls

    Authors: Alihan Hüyük, Zhaozhi Qian, Mihaela van der Schaar

    Abstract: Clinical trials are typically run in order to understand the effects of a new treatment on a given population of patients. However, patients in large populations rarely respond the same way to the same treatment. This heterogeneity in patient responses necessitates trials that investigate effects on multiple subpopulations - especially when a treatment has marginal or no benefit for the overall po…

    Submitted 9 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Proceedings of the 27th International Conference on Artificial Intelligence and Statistics

  26. arXiv:2401.11642  [pdf, other]

    cs.SE cs.CR cs.OS

    SyzRetrospector: A Large-Scale Retrospective Study of Syzbot

    Authors: Joseph Bursey, Ardalan Amiri Sani, Zhiyun Qian

    Abstract: Over the past 6 years, Syzbot has fuzzed the Linux kernel day and night to report over 5570 bugs, of which 4604 have been patched [11]. While this is impressive, we have found the average time to find a bug is over 405 days. Moreover, we have found that current metrics commonly used, such as time-to-find and number of bugs found, are inaccurate in evaluating Syzbot since bugs often spend the major…

    Submitted 21 January, 2024; originally announced January 2024.

  27. arXiv:2401.02600  [pdf, other]

    cs.CV cs.AI

    Object-oriented backdoor attack against image captioning

    Authors: Meiling Li, Nan Zhong, Xinpeng Zhang, Zhenxing Qian, Sheng Li

    Abstract: Backdoor attacks against image classification tasks have been widely studied and proven to be successful, while there exists little research on backdoor attacks against vision-language models. In this paper, we explore backdoor attacks on image captioning models by poisoning training data. Assuming the attacker has total access to the training dataset, and cannot intervene in model construction…

    Submitted 4 January, 2024; originally announced January 2024.

  28. arXiv:2401.01717  [pdf]

    cs.CV

    Fact-checking based fake news detection: a review

    Authors: Yuzhou Yang, Yangming Zhou, Qichao Ying, Zhenxing Qian, Dan Zeng, Liang Liu

    Abstract: This paper reviews and summarizes the research results on fact-based fake news from the perspectives of tasks and problems, algorithm strategies, and datasets. First, the paper systematically explains the task definition and core problems of fact-based fake news detection. Second, the paper summarizes the existing detection methods based on the algorithm principles. Third, the paper analyzes the c…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: Invited short review paper (in Chinese)

  29. arXiv:2401.00653  [pdf, other]

    cs.CV

    PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning

    Authors: Xuntao Liu, Yuzhou Yang, Qichao Ying, Zhenxing Qian, Xinpeng Zhang, Sheng Li

    Abstract: Deceptive images can be shared in seconds with social networking services, posing substantial risks. Tampering traces, such as boundary artifacts and high-frequency information, have been significantly emphasized by massive networks in the Image Manipulation Localization (IML) field. However, they are prone to image post-processing operations, which limit the generalization and robustness of exist…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: Under Review

  30. arXiv:2401.00652  [pdf, other]

    cs.CV

    From Covert Hiding to Visual Editing: Robust Generative Video Steganography

    Authors: Xueying Mao, Xiaoxiao Hu, Wanli Peng, Zhenliang Gan, Qichao Ying, Zhenxing Qian, Sheng Li, Xinpeng Zhang

    Abstract: Traditional video steganography methods are based on modifying the covert space for embedding, whereas we propose an innovative approach that embeds secret messages within semantic features for steganography during the video editing process. Although existing traditional video steganography methods display a certain level of security and embedding capacity, they lack adequate robustness against comm…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: Under Review

  31. arXiv:2401.00282  [pdf, other]

    cs.LG

    Deep Generative Symbolic Regression

    Authors: Samuel Holt, Zhaozhi Qian, Mihaela van der Schaar

    Abstract: Symbolic regression (SR) aims to discover concise closed-form mathematical equations from data, a task fundamental to scientific discovery. However, the problem is highly challenging because closed-form equations lie in a complex combinatorial search space. Existing methods, ranging from heuristic search to reinforcement learning, fail to scale with the number of input variables. We make the obser…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: In the proceedings of the Eleventh International Conference on Learning Representations (ICLR 2023). https://iclr.cc/virtual/2023/poster/11782

    ACM Class: I.2.6; I.2.5

    Journal ref: International Conference on Learning Representations (ICLR), 2023

  32. arXiv:2312.09228  [pdf, other]

    cs.CV

    3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting

    Authors: Zhiyin Qian, Shaofei Wang, Marko Mihajlovic, Andreas Geiger, Siyu Tang

    Abstract: We introduce an approach that creates animatable human avatars from monocular videos using 3D Gaussian Splatting (3DGS). Existing methods based on neural radiance fields (NeRFs) achieve high-quality novel-view/novel-pose image synthesis but often require days of training, and are extremely slow at inference time. Recently, the community has explored fast grid structures for efficient training of c…

    Submitted 4 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Project page: https://neuralbodies.github.io/3DGS-Avatar

  33. arXiv:2312.05262  [pdf, other]

    cs.CR cs.LG

    Model Copyright Protection in Buyer-seller Environment

    Authors: Yusheng Guo, Nan Zhong, Zhenxing Qian, Xinpeng Zhang

    Abstract: Training a deep neural network (DNN) requires a high computational cost. Buying models from sellers with a large number of computing resources has become prevailing. However, the buyer-seller environment is not always trusted. To protect the neural network models from leaking in an untrusted environment, we propose a novel copyright protection scheme for DNN using an input-sensitive neural network…

    Submitted 5 December, 2023; originally announced December 2023.

  34. arXiv:2311.13619  [pdf, other]

    cs.CV cs.CR

    Steal My Artworks for Fine-tuning? A Watermarking Framework for Detecting Art Theft Mimicry in Text-to-Image Models

    Authors: Ge Luo, Junqiang Huang, Manman Zhang, Zhenxing Qian, Sheng Li, Xinpeng Zhang

    Abstract: The advancement in text-to-image models has led to astonishing artistic performances. However, several studios and websites illegally fine-tune these models using artists' artworks to mimic their styles for profit, which violates the copyrights of artists and diminishes their motivation to produce original works. Currently, there is a notable lack of research focusing on this issue. In this paper,…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: A Watermarking Framework for Detecting Art Theft Mimicry in Text-to-Image Models

  35. arXiv:2311.12397  [pdf, other]

    cs.CV

    PatchCraft: Exploring Texture Patch for Efficient AI-generated Image Detection

    Authors: Nan Zhong, Yiran Xu, Sheng Li, Zhenxing Qian, Xinpeng Zhang

    Abstract: Recent generative models show impressive performance in generating photographic images. Humans can hardly distinguish such incredibly realistic-looking AI-generated images from real ones. AI-generated images may lead to ubiquitous disinformation dissemination. Therefore, it is of utmost urgency to develop a detector to identify AI generated images. Most existing detectors suffer from sharp perform…

    Submitted 7 March, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Our project: https://fdmas.github.io/AIGCDetect/

  36. arXiv:2311.12245  [pdf, other]

    cs.RO

    Towards Accurate Loop Closure Detection in Semantic SLAM with 3D Semantic Covisibility Graphs

    Authors: Zhentian Qian, Jie Fu, Jing Xiao

    Abstract: Loop closure is necessary for correcting errors accumulated in simultaneous localization and mapping (SLAM) in unknown environments. However, conventional loop closure methods based on low-level geometric or image features may cause high ambiguity by not distinguishing similar scenarios. Thus, incorrect loop closures can occur. Though semantic 2D image information is considered in some literature…

    Submitted 20 November, 2023; originally announced November 2023.

  37. arXiv:2310.18970  [pdf, other]

    cs.LG

    TRIAGE: Characterizing and auditing training data for improved regression

    Authors: Nabeel Seedat, Jonathan Crabbé, Zhaozhi Qian, Mihaela van der Schaar

    Abstract: Data quality is crucial for robust machine learning algorithms, with the recent interest in data-centric AI emphasizing the importance of training data characterization. However, current data characterization methods are largely focused on classification settings, with regression settings largely understudied. To address this, we introduce TRIAGE, a novel data characterization framework tailored t…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Presented at NeurIPS 2023

  38. arXiv:2310.18899  [pdf]

    cs.CV

    Multi-task deep learning for large-scale building detail extraction from high-resolution satellite imagery

    Authors: Zhen Qian, Min Chen, Zhuo Sun, Fan Zhang, Qingsong Xu, Jinzhao Guo, Zhiwei Xie, Zhixin Zhang

    Abstract: Understanding urban dynamics and promoting sustainable development requires comprehensive insights about buildings. While geospatial artificial intelligence has advanced the extraction of such details from Earth observational data, existing methods often suffer from computational inefficiencies and inconsistencies when compiling unified building-related datasets for practical applications. To brid…

    Submitted 29 October, 2023; originally announced October 2023.

  39. arXiv:2310.18688  [pdf, other]

    cs.LG

    Clairvoyance: A Pipeline Toolkit for Medical Time Series

    Authors: Daniel Jarrett, Jinsung Yoon, Ioana Bica, Zhaozhi Qian, Ari Ercole, Mihaela van der Schaar

    Abstract: Time-series learning is the bread and butter of data-driven *clinical decision support*, and the recent explosion in ML research has demonstrated great potential in various healthcare settings. At the same time, medical time-series problems in the wild are challenging due to their highly *composite* nature: They entail design choices and interactions among components that preprocess data, impute m…

    Submitted 28 October, 2023; originally announced October 2023.

    Journal ref: In Proc. 9th International Conference on Learning Representations (ICLR 2021)

  40. arXiv:2310.07510  [pdf, other]

    cs.CV

    Heuristic Vision Pre-Training with Self-Supervised and Supervised Multi-Task Learning

    Authors: Zhiming Qian

    Abstract: To mimic the way human vision recognizes the diverse and open world, foundation vision models are critical. While recent self-supervised learning techniques show promising potential for this mission, we argue that signals from labelled data are also important for common-sense recognition, and properly chosen pre-text tasks can facilitate the efficiency of vision representati…

    Submitted 11 October, 2023; originally announced October 2023.

  41. arXiv:2310.06397  [pdf, other]

    cs.CR

    Top of the Heap: Efficient Memory Error Protection for Many Heap Objects

    Authors: Kaiming Huang, Mathias Payer, Zhiyun Qian, Jack Sampson, Gang Tan, Trent Jaeger

    Abstract: Exploits against heap memory errors continue to be a major concern. Although many defenses have been proposed, heap data are not protected from attacks that exploit memory errors systematically. Research defenses focus on complete coverage of heap objects, often giving up on comprehensive memory safety protection and/or incurring high costs in performance overhead and memory usage. In this paper,…

    Submitted 10 October, 2023; originally announced October 2023.

  42. Securing Fixed Neural Network Steganography

    Authors: Zicong Luo, Sheng Li, Guobiao Li, Zhenxing Qian, Xinpeng Zhang

    Abstract: Image steganography is the art of concealing secret information in images in a way that is imperceptible to unauthorized parties. Recent advances show that it is possible to use a fixed neural network (FNN) for secret embedding and extraction. Such fixed neural network steganography (FNNS) achieves high steganographic performance without training the networks, which could be more useful in real-world…

    Submitted 18 September, 2023; originally announced September 2023.

  43. arXiv:2309.07428  [pdf, other]

    cs.CV eess.IV

    Physical Invisible Backdoor Based on Camera Imaging

    Authors: Yusheng Guo, Nan Zhong, Zhenxing Qian, Xinpeng Zhang

    Abstract: Backdoor attack aims to compromise a model, which returns an adversary-wanted output when a specific trigger pattern appears yet behaves normally for clean inputs. Current backdoor attacks require changing pixels of clean images, which results in poor stealthiness of attacks and increases the difficulty of the physical implementation. This paper proposes a novel physical invisible backdoor based o…

    Submitted 14 September, 2023; originally announced September 2023.

  44. arXiv:2308.02983  [pdf, other]

    cs.CV

    Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection

    Authors: Xincheng Yao, Ruoqi Li, Zefeng Qian, Yan Luo, Chongyang Zhang

    Abstract: Humans recognize anomalies through two aspects: larger patch-wise representation discrepancies and weaker patch-to-normal-patch correlations. However, the previous AD methods didn't sufficiently combine the two complementary aspects to design AD models. To this end, we find that Transformer can ideally satisfy the two aspects as its great power in the unified modeling of patch-wise representations…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023

  45. arXiv:2308.00245  [pdf, other]

    cs.SE cs.AI

    The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models

    Authors: Haonan Li, Yu Hao, Yizhuo Zhai, Zhiyun Qian

    Abstract: Static analysis is a widely used technique in software engineering for identifying and mitigating bugs. However, a significant hurdle lies in achieving a delicate balance between precision and scalability. Large Language Models (LLMs) offer a promising alternative, as recent advances demonstrate remarkable capabilities in comprehending, generating, and even debugging code. Yet, the logic of bugs c…

    Submitted 15 November, 2023; v1 submitted 31 July, 2023; originally announced August 2023.

  46. arXiv:2307.16418  [pdf, other]

    cs.CV cs.MM eess.IV

    DRAW: Defending Camera-shooted RAW against Image Manipulation

    Authors: Xiaoxiao Hu, Qichao Ying, Zhenxing Qian, Sheng Li, Xinpeng Zhang

    Abstract: RAW files are the initial measurement of scene radiance widely used in most cameras, and the ubiquitously-used RGB images are converted from RAW data through Image Signal Processing (ISP) pipelines. Nowadays, digital images are risky of being nefariously manipulated. Inspired by the fact that innate immunity is the first line of body defense, we propose DRAW, a novel scheme of defending images aga…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: To appear in ICCV 2023. The leading two authors contribute equally

  47. One-shot Joint Extraction, Registration and Segmentation of Neuroimaging Data

    Authors: Yao Su, Zhentian Qian, Lei Ma, Lifang He, Xiangnan Kong

    Abstract: Brain extraction, registration and segmentation are indispensable preprocessing steps in neuroimaging studies. The aim is to extract the brain from raw imaging scans (i.e., extraction step), align it with a target brain image (i.e., registration step) and label the anatomical brain regions (i.e., segmentation step). Conventional studies typically focus on developing separate methods for the extrac…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Published as a research track paper at KDD 2023. Code: https://github.com/Anonymous4545/JERS

  48. arXiv:2307.10846  [pdf, other]

    cs.RO cs.AI

    Goal-Conditioned Reinforcement Learning with Disentanglement-based Reachability Planning

    Authors: Zhifeng Qian, Mingyu You, Hongjun Zhou, Xuanhui Xu, Bin He

    Abstract: Goal-Conditioned Reinforcement Learning (GCRL) can enable agents to spontaneously set diverse goals to learn a set of skills. Despite the excellent works proposed in various fields, reaching distant goals in temporally extended tasks remains a challenge for GCRL. Current works tackled this problem by leveraging planning algorithms to plan intermediate subgoals to augment GCRL. Their methods need t…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted by 2023 RAL with ICRA

  49. arXiv:2307.10642  [pdf, other]

    cs.CV cs.MM

    RetouchingFFHQ: A Large-scale Dataset for Fine-grained Face Retouching Detection

    Authors: Qichao Ying, Jiaxin Liu, Sheng Li, Haisheng Xu, Zhenxing Qian, Xinpeng Zhang

    Abstract: The widespread use of face retouching filters on short-video platforms has raised concerns about the authenticity of digital appearances and the impact of deceptive advertising. To address these issues, there is a pressing need to develop advanced face retouching techniques. However, the lack of large-scale and fine-grained face retouching datasets has been a major obstacle to progress in this fie…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Under review

  50. arXiv:2307.03444  [pdf, other]

    cs.CR cs.AI

    Towards Deep Network Steganography: From Networks to Networks

    Authors: Guobiao Li, Sheng Li, Meiling Li, Zhenxing Qian, Xinpeng Zhang

    Abstract: With the widespread applications of the deep neural network (DNN), how to covertly transmit the DNN models in public channels brings us the attention, especially for those trained for secret-learning tasks. In this paper, we propose deep network steganography for the covert communication of DNN models. Unlike the existing steganography schemes which focus on the subtle modification of the cover da…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: 8 pages. arXiv admin note: text overlap with arXiv:2302.14521