Showing 1–50 of 220 results for author: Deng, H

Searching in archive cs.
  1. arXiv:2405.07977  [pdf, other]

    q-bio.QM cs.LG q-bio.NC

    A Demographic-Conditioned Variational Autoencoder for fMRI Distribution Sampling and Removal of Confounds

    Authors: Anton Orlichenko, Gang Qu, Ziyu Zhou, Anqi Liu, Hong-Wen Deng, Zhengming Ding, Julia M. Stephen, Tony W. Wilson, Vince D. Calhoun, Yu-** Wang

    Abstract: Objective: fMRI and derived measures such as functional connectivity (FC) have been used to predict brain age, general fluid intelligence, psychiatric disease status, and preclinical neurodegenerative disease. However, it is not always clear that all demographic confounds, such as age, sex, and race, have been removed from fMRI data. Additionally, many fMRI datasets are restricted to authorized re…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 12 pages

  2. arXiv:2405.07919  [pdf, other]

    cs.CV

    Exploring the Low-Pass Filtering Behavior in Image Super-Resolution

    Authors: Haoyu Deng, Zi**g Xu, Yule Duan, Xiao Wu, Wenjie Shu, Liang-Jian Deng

    Abstract: Deep neural networks for image super-resolution have shown significant advantages over traditional approaches like interpolation. However, they are often criticized as `black boxes' compared to traditional approaches which have solid mathematical foundations. In this paper, we attempt to interpret the behavior of deep neural networks using theories from signal processing. We first report…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  3. arXiv:2405.04782  [pdf, other]

    cs.CV

    Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection

    Authors: Zhaoxiang Zhang, Hanqiu Deng, **an Bao, Xingyu Li

    Abstract: Image Anomaly Detection has been a challenging task in the Computer Vision field. The advent of Vision-Language models, particularly the rise of CLIP-based frameworks, has opened new avenues for zero-shot anomaly detection. Recent studies have explored the use of CLIP by aligning images with normal and prompt descriptions. However, the exclusive dependence on textual guidance often falls short, highli…

    Submitted 7 May, 2024; originally announced May 2024.

  4. arXiv:2405.04309  [pdf, other]

    cs.CV

    Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling

    Authors: Jiawei Shi, Hui Deng, Yuchao Dai

    Abstract: Even though Non-rigid Structure-from-Motion (NRSfM) has been extensively studied and great progress has been made, there are still key challenges that hinder their broad real-world applications: 1) the inherent motion/rotation ambiguity requires either explicit camera motion recovery with extra constraint or complex Procrustean Alignment; 2) existing low-rank modeling of the global shape can over-…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR 2024

  5. arXiv:2405.04294  [pdf, other]

    cs.AI

    Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework

    Authors: Xiangpeng Wan, Haicheng Deng, Kai Zou, Shiqi Xu

    Abstract: Structured finance, which involves restructuring diverse assets into securities like MBS, ABS, and CDOs, enhances capital market efficiency but presents significant due diligence challenges. This study explores the integration of artificial intelligence (AI) with traditional asset review processes to improve efficiency and accuracy in structured finance. Using both open-sourced and close-sourced l…

    Submitted 7 May, 2024; originally announced May 2024.

  6. arXiv:2405.00236  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    STT: Stateful Tracking with Transformers for Autonomous Driving

    Authors: Longlong **g, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng, Sang** Lee, Chris Sweeney, Qiurui He, Wei-Chih Hung, Tong He, Xingyi Zhou, Farshid Moussavi, Zijian Guo, Yin Zhou, Mingxing Tan, Weilong Yang, Congcong Li

    Abstract: Tracking objects in three-dimensional space is critical for autonomous driving. To ensure safety while driving, the tracker must be able to reliably track objects across frames and accurately estimate their states such as velocity and acceleration in the present. Existing works frequently focus on the association task while either neglecting the model performance on state estimation or deploying c…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: ICRA 2024

  7. arXiv:2404.16522  [pdf, other]

    eess.IV cs.LG

    A Deep Learning-Driven Pipeline for Differentiating Hypertrophic Cardiomyopathy from Cardiac Amyloidosis Using 2D Multi-View Echocardiography

    Authors: Bo Peng, Xiaofeng Li, Xinyu Li, Zhenghan Wang, Hui Deng, Xiaoxian Luo, Lixue Yin, Hongmei Zhang

    Abstract: Hypertrophic cardiomyopathy (HCM) and cardiac amyloidosis (CA) are both heart conditions that can progress to heart failure if untreated. They exhibit similar echocardiographic characteristics, often leading to diagnostic challenges. This paper introduces a novel multi-view deep learning approach that utilizes 2D echocardiography for differentiating between HCM and CA. The method begins by classif…

    Submitted 25 April, 2024; originally announced April 2024.

  8. arXiv:2404.13860  [pdf, other]

    cs.LG cs.CR

    Distributional Black-Box Model Inversion Attack with Multi-Agent Reinforcement Learning

    Authors: Huan Bao, Kaimin Wei, Yongdong Wu, ** Qian, Robert H. Deng

    Abstract: A Model Inversion (MI) attack based on Generative Adversarial Networks (GAN) aims to recover the private training data from complex deep learning models by searching codes in the latent space. However, they merely search a deterministic latent space such that the found latent code is usually suboptimal. In addition, the existing distributional MI schemes assume that an attacker can access the stru…

    Submitted 22 April, 2024; originally announced April 2024.

  9. arXiv:2404.11206  [pdf, other]

    cs.CL

    Prompt-tuning for Clickbait Detection via Text Summarization

    Authors: Haoxiang Deng, Yi Zhu, Ye Wang, Jipeng Qiang, Yunhao Yuan, Yun Li, Runmei Zhang

    Abstract: Clickbaits are surprising social posts or deceptive news headlines that attempt to lure users for more clicks, and they have been posted at unprecedented rates for more profit or commercial revenue. The spread of clickbait has significant negative impacts on users, exposing them to misleading content or even click-jacking attacks. Different from fake news, the crucial problem in clickbait detection is deter…

    Submitted 17 April, 2024; originally announced April 2024.

  10. arXiv:2404.08750  [pdf, other]

    cs.LG

    FastLogAD: Log Anomaly Detection with Mask-Guided Pseudo Anomaly Generation and Discrimination

    Authors: Yifei Lin, Hanqiu Deng, Xingyu Li

    Abstract: Nowadays large computers extensively output logs to record the runtime status and it has become crucial to identify any suspicious or malicious activities from the information provided by the realtime logs. Thus, fast log anomaly detection is a necessary task to be implemented for automating the infeasible manual detection. Most of the existing unsupervised methods are trained only on normal log d…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 10 pages

  11. arXiv:2404.07932  [pdf, other]

    cs.CV eess.IV

    FusionMamba: Efficient Image Fusion with State Space Model

    Authors: Siran Peng, Xiangyu Zhu, Haoyu Deng, Zhen Lei, Liang-Jian Deng

    Abstract: Image fusion aims to generate a high-resolution multi/hyper-spectral image by combining a high-resolution image with limited spectral information and a low-resolution image with abundant spectral data. Current deep learning (DL)-based methods for image fusion primarily rely on CNNs or Transformers to extract features and merge different types of data. While CNNs are efficient, their receptive fiel…

    Submitted 10 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  12. arXiv:2404.07833  [pdf]

    cs.CV cs.LG

    Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution

    Authors: Handi Deng, Yucheng Zhou, Jiaxuan Xiang, Liujie Gu, Yan Luo, Hai Feng, Mingyuan Liu, Cheng Ma

    Abstract: Foundation models have rapidly evolved and have achieved significant accomplishments in computer vision tasks. Specifically, the prompt mechanism conveniently allows users to integrate image prior information into the model, making it possible to apply models without any training. Therefore, we propose a method based on foundation models and zero training to solve the tasks of photoacoustic (PA) i…

    Submitted 11 April, 2024; originally announced April 2024.

  13. arXiv:2404.07543  [pdf, other]

    cs.CV eess.IV

    Content-Adaptive Non-Local Convolution for Remote Sensing Pansharpening

    Authors: Yule Duan, Xiao Wu, Haoyu Deng, Liang-Jian Deng

    Abstract: Currently, machine learning-based methods for remote sensing pansharpening have progressed rapidly. However, existing pansharpening methods often do not fully exploit differentiating regional information in non-local spaces, thereby limiting the effectiveness of the methods and resulting in redundant learning parameters. In this paper, we introduce a so-called content-adaptive non-local convolutio…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  14. arXiv:2403.15432  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG cs.RO

    BRIEDGE: EEG-Adaptive Edge AI for Multi-Brain to Multi-Robot Interaction

    Authors: **hui Ouyang, Mingzhu Wu, Xinglin Li, Hanhui Deng, Di Wu

    Abstract: Recent advances in EEG-based BCI technologies have revealed the potential of brain-to-robot collaboration through the integration of sensing, computing, communication, and control. In this paper, we present BRIEDGE as an end-to-end system for multi-brain to multi-robot interaction through an EEG-adaptive neural network and an encoding-decoding communication framework, as illustrated in Fig.1. As d…

    Submitted 14 March, 2024; originally announced March 2024.

  15. arXiv:2403.12552  [pdf, other]

    cs.CV cs.AI cs.RO

    M2DA: Multi-Modal Fusion Transformer Incorporating Driver Attention for Autonomous Driving

    Authors: Dongyang Xu, Haokun Li, Qingfan Wang, Ziying Song, Lei Chen, Hanming Deng

    Abstract: End-to-end autonomous driving has witnessed remarkable progress. However, the extensive deployment of autonomous vehicles has yet to be realized, primarily due to 1) inefficient multi-modal environment perception: how to integrate data from multi-modal sensors more efficiently; 2) non-human-like scene understanding: how to effectively locate and predict critical risky agents in traffic scenarios l…

    Submitted 19 March, 2024; originally announced March 2024.

  16. arXiv:2403.01774  [pdf, other]

    cs.CL

    WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations

    Authors: Haolin Deng, Chang Wang, Xin Li, Dezhang Yuan, Junlang Zhan, Tianhua Zhou, ** Ma, Jun Gao, Ruifeng Xu

    Abstract: Enhancing the attribution in large language models (LLMs) is a crucial task. One feasible approach is to enable LLMs to cite external sources that support their generations. However, existing datasets and evaluation methods in this domain still exhibit notable limitations. In this work, we formulate the task of attributed query-focused summarization (AQFS) and present WebCiteS, a Chinese dataset f…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 19 pages, 7 figures

  17. arXiv:2403.00862  [pdf, other]

    cs.CL cs.AI

    NewsBench: Systematic Evaluation of LLMs for Writing Proficiency and Safety Adherence in Chinese Journalistic Editorial Applications

    Authors: Miao Li, Ming-Bin Chen, Bo Tang, Shengbin Hou, Pengyu Wang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Keming Mao, Peng Cheng, Yi Luo

    Abstract: This study presents NewsBench, a novel benchmark framework developed to evaluate the capability of Large Language Models (LLMs) in Chinese Journalistic Writing Proficiency (JWP) and their Safety Adherence (SA), addressing the gap between journalistic ethics and the risks associated with AI utilization. Comprising 1,267 tasks across 5 editorial applications, 7 aspects (including safety and journali…

    Submitted 21 March, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: 27 pages

  18. arXiv:2402.17091  [pdf, other]

    cs.CV

    Structural Teacher-Student Normality Learning for Multi-Class Anomaly Detection and Localization

    Authors: Hanqiu Deng, Xingyu Li

    Abstract: Visual anomaly detection is a challenging open-set task aimed at identifying unknown anomalous patterns while modeling normal data. The knowledge distillation paradigm has shown remarkable performance in one-class anomaly detection by leveraging teacher-student network feature comparisons. However, extending this paradigm to multi-class anomaly detection introduces novel scalability challenges. In…

    Submitted 26 February, 2024; originally announced February 2024.

  19. arXiv:2402.15823  [pdf, other]

    cs.CV

    Parameter-efficient Prompt Learning for 3D Point Cloud Understanding

    Authors: Hongyu Sun, Yongcai Wang, Wang Chen, Haoran Deng, Deying Li

    Abstract: This paper presents a parameter-efficient prompt tuning method, named PPT, to adapt a large multi-modal model for 3D point cloud understanding. Existing strategies are quite expensive in computation and storage, and depend on time-consuming prompt engineering. We address the problems from three aspects. Firstly, a PromptLearner module is devised to replace hand-crafted prompts with learnable conte…

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: 9 pages, 5 figures, 6 tables; accepted by ICRA 2024

  20. Graph-Skeleton: ~1% Nodes are Sufficient to Represent Billion-Scale Graph

    Authors: Linfeng Cao, Haoran Deng, Yang Yang, Chun** Wang, Lei Chen

    Abstract: Due to the ubiquity of graph data on the web, web graph mining has become a hot research spot. Nonetheless, the prevalence of large-scale web graphs in real applications poses significant challenges to storage, computational capacity and graph model design. Despite numerous studies to enhance the scalability of graph models, a noticeable gap remains between academic research and practical web grap…

    Submitted 6 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: 21 pages, 11 figures, In Proceedings of the ACM Web Conference 2024 (WWW'24)

  21. arXiv:2402.02085  [pdf, other]

    cs.CV cs.AI

    DeCoF: Generated Video Detection via Frame Consistency

    Authors: Long Ma, Jiajia Zhang, Hong** Deng, Ningyu Zhang, Yong Liao, Haiyang Yu

    Abstract: The escalating quality of video generated by advanced video generation methods leads to new security challenges in society, which makes generated video detection an urgent research priority. To foster collaborative research in this area, we construct the first open-source dataset explicitly for generated video detection, providing a valuable resource for the community to benchmark and improve dete…

    Submitted 5 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  22. arXiv:2401.17043  [pdf, other]

    cs.CL

    CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

    Authors: Yuanjie Lyu, Zhiyu Li, Simin Niu, Feiyu Xiong, Bo Tang, Wen** Wang, Hao Wu, Huanyong Liu, Tong Xu, Enhong Chen, Yi Luo, Peng Cheng, Haiying Deng, Zhonghao Wang, Zijia Lu

    Abstract: Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources. This method addresses common LLM limitations, including outdated information and the tendency to produce inaccurate "hallucinated" content. However, the evaluation of RAG systems is challenging, as existing benchmarks are limited in scope a…

    Submitted 18 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 26 Pages

  23. arXiv:2401.16545  [pdf]

    cs.DC

    Leveraging Public Cloud Infrastructure for Real-time Connected Vehicle Speed Advisory at a Signalized Corridor

    Authors: Hsien-Wen Deng, M Sabbir Salek, Mizanur Rahman, Mashrur Chowdhury, Mitch Shue, Amy W. Apon

    Abstract: In this study, we developed a real-time connected vehicle (CV) speed advisory application that uses public cloud services and tested it on a simulated signalized corridor for different roadway traffic conditions. First, we developed a scalable serverless cloud computing architecture leveraging public cloud services offered by Amazon Web Services (AWS) to support the requirements of a real-time CV…

    Submitted 29 January, 2024; originally announced January 2024.

  24. arXiv:2401.15897  [pdf, other]

    cs.CY cs.HC cs.LG

    Red-Teaming for Generative AI: Silver Bullet or Security Theater?

    Authors: Michael Feffer, Anusha Sinha, Wesley Hanwen Deng, Zachary C. Lipton, Hoda Heidari

    Abstract: In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating these risks. However, despite AI red-teaming's central role in policy discussions and corporate messaging, significant questions remain about what…

    Submitted 15 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  25. arXiv:2401.01568  [pdf, other]

    cs.CR cs.NI

    A Survey of Protocol Fuzzing

    Authors: Xiaohan Zhang, Cen Zhang, Xinghua Li, Zhengjie Du, Yuekang Li, Yaowen Zheng, Yeting Li, Bing Mao, Yang Liu, Robert H. Deng

    Abstract: Communication protocols form the bedrock of our interconnected world, yet vulnerabilities within their implementations pose significant security threats. Recent developments have seen a surge in fuzzing-based research dedicated to uncovering these vulnerabilities within protocol implementations. However, there still lacks a systematic overview of protocol fuzzing for answering the essential questi…

    Submitted 3 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  26. arXiv:2312.13304  [pdf, other]

    eess.IV cs.CV

    End-to-end Rain Streak Removal with RAW Images

    Authors: GuoDong Du, HaoJian Deng, JiaHao Su, Yuan Huang

    Abstract: In this work we address the problem of rain streak removal with RAW images. The general approach is first to process RAW data into RGB images and then to remove rain streaks from the RGB images. However, the original information of rain in RAW images is affected by image signal processing (ISP) pipelines including non-linear algorithms, unexpected noise, artifacts and so on. It gains more benefit to direc…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 10 pages, 5 figures, 4 tables, conference

  27. arXiv:2312.09245  [pdf, other]

    cs.CV

    DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

    Authors: Wenhai Wang, Jiangwei Xie, ChuanYang Hu, Haoming Zou, Jianan Fan, Wenwen Tong, Yang Wen, Silei Wu, Hanming Deng, Zhiqi Li, Hao Tian, Lewei Lu, Xizhou Zhu, Xiaogang Wang, Yu Qiao, Jifeng Dai

    Abstract: Large language models (LLMs) have opened up new possibilities for intelligent agents, endowing them with human-like thinking and cognitive abilities. In this work, we delve into the potential of large language models (LLMs) in autonomous driving (AD). We introduce DriveMLM, an LLM-based AD framework that can perform close-loop autonomous driving in realistic simulators. To this end, (1) we bridge…

    Submitted 25 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Technical Report

  28. arXiv:2312.04948  [pdf, other]

    cs.CV astro-ph.GA cs.LG

    Scientific Preparation for CSST: Classification of Galaxy and Nebula/Star Cluster Based on Deep Learning

    Authors: Yuquan Zhang, Zhong Cao, Feng Wang, Lam, Man I, Hui Deng, Ying Mei, Lei Tan

    Abstract: The Chinese Space Station Telescope (abbreviated as CSST) is a future advanced space telescope. Real-time identification of galaxy and nebula/star cluster (abbreviated as NSC) images is of great value during CSST survey. While recent research on celestial object recognition has progressed, the rapid and efficient identification of high-resolution local celestial images remains challenging. In this…

    Submitted 8 December, 2023; originally announced December 2023.

  29. arXiv:2312.00372  [pdf, other]

    cs.IR cs.CL

    Event-driven Real-time Retrieval in Web Search

    Authors: Nan Yang, Shusen Zhang, Yannan Zhang, Xiaoling Bai, Hualong Deng, Tianhua Zhou, ** Ma

    Abstract: Information retrieval in real-time search presents unique challenges distinct from those encountered in classical web search. These challenges are particularly pronounced due to the rapid change of user search intent, which is influenced by the occurrence and evolution of breaking news events, such as earthquakes, elections, and wars. Previous dense retrieval methods, which primarily focused on st…

    Submitted 4 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

  30. arXiv:2311.17971  [pdf, other]

    cs.CV

    GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation

    Authors: Baorui Ma, Haoge Deng, Junsheng Zhou, Yu-Shen Liu, Tiejun Huang, Xinlong Wang

    Abstract: Text-to-3D generation by distilling pretrained large-scale text-to-image diffusion models has shown great promise but still suffers from inconsistent 3D geometric structures (Janus problems) and severe artifacts. The aforementioned problems mainly stem from 2D diffusion models lacking 3D awareness during the lifting. In this work, we present GeoDream, a novel method that incorporates explicit gene…

    Submitted 30 November, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Code and Demo: https://github.com/baaivision/GeoDream

  31. arXiv:2311.15772  [pdf, other]

    cs.LG

    Attend Who is Weak: Enhancing Graph Condensation via Cross-Free Adversarial Training

    Authors: Xinglin Li, Kun Wang, Hanhui Deng, Yuxuan Liang, Di Wu

    Abstract: In this paper, we study the \textit{graph condensation} problem by compressing the large, complex graph into a concise, synthetic representation that preserves the most essential and discriminative information of structure and features. We seminally propose the concept of Shock Absorber (a type of perturbation) that enhances the robustness and stability of the original graphs against changes in an…

    Submitted 27 November, 2023; originally announced November 2023.

  32. arXiv:2311.15296  [pdf, other]

    cs.CL

    UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation

    Authors: Xun Liang, Shichao Song, Simin Niu, Zhiyu Li, Feiyu Xiong, Bo Tang, Zhaohui Wy, Dawei He, Peng Cheng, Zhonghao Wang, Haiying Deng

    Abstract: Large language models (LLMs) have emerged as pivotal contributors in contemporary natural language processing and are increasingly being applied across a diverse range of industries. However, these large-scale probabilistic statistical models cannot currently ensure the requisite quality in professional content generation. These models often produce hallucinated text, compromising their practical…

    Submitted 19 February, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: 13 Pages

  33. arXiv:2311.11503  [pdf, other]

    quant-ph cs.PL

    A Case for Synthesis of Recursive Quantum Unitary Programs

    Authors: Haowei Deng, Runzhou Tao, Yuxiang Peng, Xiaodi Wu

    Abstract: Quantum programs are notoriously difficult to code and verify due to unintuitive quantum knowledge associated with quantum programming. Automated tools relieving the tedium and errors associated with low-level quantum details would hence be highly desirable. In this paper, we initiate the study of program synthesis for quantum unitary programs that recursively define a family of unitary circuits f…

    Submitted 5 December, 2023; v1 submitted 19 November, 2023; originally announced November 2023.

  34. arXiv:2311.08995  [pdf]

    cs.CV

    Simple but Effective Unsupervised Classification for Specified Domain Images: A Case Study on Fungi Images

    Authors: Zhaocong liu, Fa Zhang, Lin Cheng, Huanxi Deng, Xiaoyan Yang, Zhenyu Zhang, Chichun Zhou

    Abstract: High-quality labeled datasets are essential for deep learning. Traditional manual annotation methods are not only costly and inefficient but also pose challenges in specialized domains where expert knowledge is needed. Self-supervised methods, despite leveraging unlabeled data for feature extraction, still require hundreds or thousands of labeled instances to guide the model for effective speciali…

    Submitted 15 November, 2023; originally announced November 2023.

  35. arXiv:2311.02835  [pdf]

    cs.CV

    Flexible Multi-Generator Model with Fused Spatiotemporal Graph for Trajectory Prediction

    Authors: Peiyuan Zhu, Fengxia Han, Hao Deng

    Abstract: Trajectory prediction plays a vital role in automotive radar systems, facilitating precise tracking and decision-making in autonomous driving. Generative adversarial networks with the ability to learn a distribution over future trajectories tend to predict out-of-distribution samples, which typically occurs when the distribution of forthcoming paths comprises a blend of various manifolds that may…

    Submitted 5 November, 2023; originally announced November 2023.

  36. arXiv:2310.14576  [pdf, other]

    cs.LG cs.CV

    Tensor Decomposition Based Attention Module for Spiking Neural Networks

    Authors: Haoyu Deng, Ruijie Zhu, Xuerui Qiu, Yule Duan, Malu Zhang, Liangjian Deng

    Abstract: The attention mechanism has been proven to be an effective way to improve spiking neural network (SNN). However, based on the fact that the current SNN input data flow is split into tensors to process on GPUs, none of the previous works consider the properties of tensors to implement an attention module. This inspires us to rethink current SNN from the perspective of tensor-relevant theories. Usin…

    Submitted 10 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by Knowledge-Based Systems

  37. arXiv:2310.13482  [pdf, other]

    cs.HC cs.MM

    HSVRS: A Virtual Reality System of the Hide-and-Seek Game to Enhance Gaze Fixation Ability for Autistic Children

    Authors: Chengyan Yu, Shihuan Wang, Dong zhang, Yingying Zhang, Chaoqun Cen, Zhixiang you, Xiaobing zou, Hongzhu Deng, Ming Li

    Abstract: Numerous children diagnosed with Autism Spectrum Disorder (ASD) exhibit abnormal eye gaze patterns in communication and social interaction. Due to the high cost of ASD interventions and a shortage of professional therapists, researchers have explored the use of virtual reality (VR) systems as a supplementary intervention for autistic children. This paper presents the design of a novel VR-based syst…

    Submitted 20 October, 2023; originally announced October 2023.

  38. Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model

    Authors: Haikang Deng, Colin Raffel

    Abstract: While large language models have proven effective in a huge range of downstream applications, they often generate text that is problematic or lacks a desired attribute. In this paper, we introduce Reward-Augmented Decoding (RAD), a text generation procedure that uses a small unidirectional reward model to encourage a language model to generate text that has certain properties. Specifically, RAD us…

    Submitted 1 January, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

  39. arXiv:2310.07990  [pdf]

    q-bio.GN cs.IR cs.LG stat.AP

    Multi-View Variational Autoencoder for Missing Value Imputation in Untargeted Metabolomics

    Authors: Chen Zhao, Kuan-Jui Su, Chong Wu, Xuewei Cao, Qiuying Sha, Wu Li, Zhe Luo, Tian Qin, Chuan Qiu, Lan Juan Zhao, Anqi Liu, Lindong Jiang, Xiao Zhang, Hui Shen, Weihua Zhou, Hong-Wen Deng

    Abstract: Background: Missing data is a common challenge in mass spectrometry-based metabolomics, which can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in metabolomics studies. Method: In this study, we propose a novel method that leverages the information f…

    Submitted 12 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 19 pages, 3 figures

  40. arXiv:2310.04820  [pdf, ps, other]

    cs.IT math.CO

    The BCH Family of Storage Codes on Triangle-Free Graphs is of Unit Rate

    Authors: Haihua Deng, Hexiang Huang, Guobiao Weng, Qing Xiang

    Abstract: Let $Γ$ be a simple connected graph on $n$ vertices, and let $C$ be a code of length $n$ whose coordinates are indexed by the vertices of $Γ$. We say that $C$ is a \textit{storage code} on $Γ$ if for any codeword $c \in C$, one can recover the information on each coordinate of $c$ by accessing its neighbors in $Γ$. The main problem here is to construct high-rate storage codes on triangle-free grap…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: 16 pages

  41. arXiv:2309.15406  [pdf, ps, other]

    cs.CR

    SOCI^+: An Enhanced Toolkit for Secure Outsourced Computation on Integers

    Authors: Bowen Zhao, Weiquan Deng, Xiaoguo Li, Ximeng Liu, Qingqi Pei, Robert H. Deng

    Abstract: Secure outsourced computation is critical for cloud computing to safeguard data confidentiality and ensure data usability. Recently, secure outsourced computation schemes following a twin-server architecture based on partially homomorphic cryptosystems have received increasing attention. The Secure Outsourced Computation on Integers (SOCI) [1] toolkit is the state-of-the-art among these schemes wh…

    Submitted 27 September, 2023; originally announced September 2023.

  42. arXiv:2309.13411  [pdf, other]

    cs.LG cs.AI cs.CV

    Towards Attributions of Input Variables in a Coalition

    Authors: Xinhao Zheng, Huiqi Deng, Bo Fan, Quanshi Zhang

    Abstract: This paper aims to develop a new attribution method to explain the conflict between individual variables' attributions and their coalition's attribution from a fully new perspective. First, we find that the Shapley value can be reformulated as the allocation of Harsanyi interactions encoded by the AI model. Second, based on the re-allocation of interactions, we extend the Shapley value to the attribut…

    Submitted 28 November, 2023; v1 submitted 23 September, 2023; originally announced September 2023.

  43. arXiv:2309.09380  [pdf, other]

    cs.CL cs.LG

    Mitigating Shortcuts in Language Models with Soft Label Encoding

    Authors: Zirui He, Huiqi Deng, Haiyan Zhao, Ninghao Liu, Mengnan Du

    Abstract: Recent research has shown that large language models rely on spurious correlations in the data for natural language understanding (NLU) tasks. In this work, we aim to answer the following research question: Can we reduce spurious correlations by modifying the ground truth labels of the training data? Specifically, we propose a simple yet effective debiasing framework, named Soft Label Encoding (So…

    Submitted 17 September, 2023; originally announced September 2023.

  44. arXiv:2309.02911   

    cs.LG

    A Multimodal Learning Framework for Comprehensive 3D Mineral Prospectivity Modeling with Jointly Learned Structure-Fluid Relationships

    Authors: Yang Zheng, Hao Deng, Ruisheng Wang, Jingjie Wu

    Abstract: This study presents a novel multimodal fusion model for three-dimensional mineral prospectivity mapping (3D MPM), effectively integrating structural and fluid information through a deep network architecture. Leveraging Convolutional Neural Networks (CNN) and Multilayer Perceptrons (MLP), the model employs canonical correlation analysis (CCA) to align and fuse multimodal features. Rigorous evaluati…

    Submitted 9 October, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: Upon careful review, it has come to our attention that inaccuracies exist in the formulation of the structure-fluid relationships, impacting the validity of the presented results

  45. arXiv:2309.01029  [pdf, other]

    cs.CL cs.AI cs.LG

    Explainability for Large Language Models: A Survey

    Authors: Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du

    Abstract: Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, their internal mechanisms are still unclear and this lack of transparency poses unwanted risks for downstream applications. Therefore, understanding and explaining these models is crucial for elucidating their behaviors, limitations, and social impacts. In this paper, we introduce a taxo…

    Submitted 28 November, 2023; v1 submitted 2 September, 2023; originally announced September 2023.

  46. arXiv:2308.15939  [pdf, other]

    cs.CV

    Bootstrap Fine-Grained Vision-Language Alignment for Unified Zero-Shot Anomaly Localization

    Authors: Hanqiu Deng, Zhaoxiang Zhang, Jinan Bao, Xingyu Li

    Abstract: Contrastive Language-Image Pre-training (CLIP) models have shown promising performance on zero-shot visual recognition tasks by learning visual representations under natural language supervision. Recent studies attempt the use of CLIP to tackle zero-shot anomaly detection by matching images with normal and abnormal state prompts. However, since CLIP focuses on building correspondence between paire…

    Submitted 26 February, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

  47. arXiv:2308.10705  [pdf, other]

    cs.CV cs.AI

    Unsupervised 3D Pose Estimation with Non-Rigid Structure-from-Motion Modeling

    Authors: Haorui Ji, Hui Deng, Yuchao Dai, Hongdong Li

    Abstract: Most of the previous 3D human pose estimation work relied on the powerful memory capability of the network to obtain suitable 2D-3D mappings from the training data. Few works have studied the modeling of human posture deformation in motion. In this paper, we propose a new modeling method for human pose deformations and design an accompanying diffusion-based motion prior. Inspired by the field of n…

    Submitted 18 August, 2023; originally announced August 2023.

  48. arXiv:2307.13424  [pdf, other]

    cs.CL

    Holistic Exploration on Universal Decompositional Semantic Parsing: Architecture, Data Augmentation, and LLM Paradigm

    Authors: Hexuan Deng, Xin Zhang, Meishan Zhang, Xuebo Liu, Min Zhang

    Abstract: In this paper, we conduct a holistic exploration of the Universal Decompositional Semantic (UDS) Parsing. We first introduce a cascade model for UDS parsing that decomposes the complex parsing task into semantically appropriate subtasks. Our approach outperforms the prior models, while significantly reducing inference time. We also incorporate syntactic information and further optimized the archit…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: 12 pages, 7 figures, 3 tables

  49. arXiv:2307.10954  [pdf, other]

    cs.RO cs.CV

    Soft-tissue Driven Craniomaxillofacial Surgical Planning

    Authors: Xi Fang, Daeseung Kim, Xuanang Xu, Tianshu Kuang, Nathan Lampen, Jungwook Lee, Hannah H. Deng, Jaime Gateno, Michael A. K. Liebschner, James J. Xia, Pingkun Yan

    Abstract: In CMF surgery, the planning of bony movement to achieve a desired facial outcome is a challenging task. Current bone-driven approaches focus on normalizing the bone with the expectation that the facial appearance will be corrected accordingly. However, due to the complex non-linear relationship between bony structure and facial soft-tissue, such bone-driven methods are insufficient to correct fac…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Early accepted by MICCAI 2023

  50. arXiv:2307.10930  [pdf]

    cs.CL cs.AI

    MediaGPT : A Large Language Model For Chinese Media

    Authors: Zhonghao Wang, Zijia Lu, Bo Jin, Haiying Deng

    Abstract: Large language models (LLMs) have shown remarkable capabilities in generating high-quality text and making predictions based on large amounts of data, including the media domain. However, in practical applications, the differences between the media's use cases and the general-purpose applications of LLMs have become increasingly apparent, especially Chinese. This paper examines the unique characte…

    Submitted 26 July, 2023; v1 submitted 20 July, 2023; originally announced July 2023.