Showing 1–50 of 148 results for author: Guan, J

Searching in archive cs.
  1. arXiv:2406.19934  [pdf, other]

    cs.CL cs.AI

    From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis

    Authors: Chuanqi Cheng, Jian Guan, Wei Wu, Rui Yan

    Abstract: We explore multi-step reasoning in vision-language models (VLMs). The problem is challenging, as reasoning data consisting of multiple steps of visual and language processing are barely available. To overcome the challenge, we first introduce a least-to-most visual reasoning paradigm, which interleaves steps of decomposing a question into sub-questions and invoking external tools for resolving sub…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2406.16708  [pdf, other]

    cs.LG stat.ME

    CausalFormer: An Interpretable Transformer for Temporal Causal Discovery

    Authors: Lingbai Kong, Wengen Li, Hanchen Yang, Yichao Zhang, Jihong Guan, Shuigeng Zhou

    Abstract: Temporal causal discovery is a crucial task aimed at uncovering the causal relations within time series data. The latest temporal causal discovery methods usually train deep learning models on prediction tasks to uncover the causality between time series. They capture causal relations by analyzing the parameters of some components of the trained models, e.g., attention weights and convolution weig…

    Submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2405.18729  [pdf, other]

    cs.LG cs.AI

    Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning

    Authors: Tianle Zhang, Jiayi Guan, Lin Zhao, Yihang Li, Dongjiang Li, Zecui Zeng, Lei Sun, Yue Chen, Xuelong Wei, Lusong Li, Xiaodong He

    Abstract: Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potential as policy models for offline RL issues. However, previous offline RL algorithms based on diffusion policies generally adopt weighted regression to improve the policy. This approach opt…

    Submitted 28 May, 2024; originally announced May 2024.

  4. arXiv:2405.15769  [pdf, other]

    cs.CV

    FastDrag: Manipulate Anything in One Step

    Authors: Xuanjia Zhao, Jian Guan, Congyi Fan, Dongli Xu, Youtian Lin, Haiwei Pan, Pengming Feng

    Abstract: Drag-based image editing using generative models provides precise control over image contents, enabling users to manipulate anything in an image with a few clicks. However, prevailing methods typically adopt $n$-step iterations for latent semantic optimization to achieve drag-based image editing, which is time-consuming and limits practical applications. In this paper, we introduce a novel one-ste…

    Submitted 6 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 13 pages, 13 figures, Project page: https://fastdrag-site.github.io/

  5. arXiv:2404.13528  [pdf, other]

    cs.LG cs.AI cs.DC

    SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile

    Authors: Wei Niu, Md Musfiqur Rahman Sanim, Zhihao Shu, Jiexiong Guan, Xipeng Shen, Miao Yin, Gagan Agrawal, Bin Ren

    Abstract: This work is motivated by recent developments in Deep Neural Networks, particularly the Transformer architectures underlying applications such as ChatGPT, and the need for performing inference on mobile devices. Focusing on emerging transformers (specifically the ones with computationally efficient Swin-like architectures) and large models (e.g., Stable Diffusion and LLMs) based on transformers, w…

    Submitted 21 April, 2024; originally announced April 2024.

  6. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  7. arXiv:2403.15759  [pdf]

    cs.CY

    Deep Learning Approach to Forecasting COVID-19 Cases in Residential Buildings of Hong Kong Public Housing Estates: The Role of Environment and Sociodemographics

    Authors: E. Leung, J. Guan, KO. Kwok, CT. Hung, CC. Ching, KC. Chong, CHK. Yam, T. Sun, WH. Tsang, EK. Yeoh, A. Lee

    Abstract: Introduction: The current study investigates the complex association between COVID-19 and the studied districts' socioecology (e.g. internal and external built environment, sociodemographic profiles, etc.) to quantify their contributions to the early outbreaks and epidemic resurgence of COVID-19. Methods: We aligned the analytic model's architecture with the hierarchical structure of the resident'…

    Submitted 23 March, 2024; originally announced March 2024.

  8. arXiv:2403.13842  [pdf]

    cs.LG physics.soc-ph

    Analyzing the Variations in Emergency Department Boarding and Testing the Transferability of Forecasting Models across COVID-19 Pandemic Waves in Hong Kong: Hybrid CNN-LSTM approach to quantifying building-level socioecological risk

    Authors: Eman Leung, Jingjing Guan, Kin On Kwok, CT Hung, CC. Ching, CK. Chung, Hector Tsang, EK Yeoh, Albert Lee

    Abstract: Emergency department's (ED) boarding (defined as ED waiting time greater than four hours) has been linked to poor patient outcomes and health system performance. Yet, effective forecasting models is rare before COVID-19, lacking during the peri-COVID era. Here, a hybrid convolutional neural network (CNN)-Long short-term memory (LSTM) model was applied to public-domain data sourced from Hong Kong's…

    Submitted 17 March, 2024; originally announced March 2024.

  9. arXiv:2403.07902  [pdf, other]

    q-bio.BM cs.LG

    DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design

    Authors: Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, Quanquan Gu

    Abstract: Designing 3D ligands within a target binding site is a fundamental task in drug discovery. Existing structured-based drug design methods treat all ligand atoms equally, which ignores different roles of atoms in the ligand for drug design and can be less efficient for exploring the large drug-like molecule space. In this paper, inspired by the convention in pharmaceutical practice, we decompose the…

    Submitted 26 February, 2024; originally announced March 2024.

    Comments: Accepted to ICML 2023

  10. arXiv:2403.07040  [pdf, other]

    cs.LG cs.AI

    All in One: Multi-Task Prompting for Graph Neural Networks (Extended Abstract)

    Authors: Xiangguo Sun, Hong Cheng, Jia Li, Bo Liu, Jihong Guan

    Abstract: This paper is an extended abstract of our original work published in KDD23, where we won the best research paper award (Xiangguo Sun, Hong Cheng, Jia Li, Bo Liu, and Jihong Guan. All in one: Multi-task prompting for graph neural networks. KDD 23) The paper introduces a novel approach to bridging the gap between pre-trained graph models and the diverse tasks they're applied to, inspired by the succ…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: submitted to IJCAI 2024 Sister Conferences Track. The original paper can be seen at arXiv:2307.01504

  11. arXiv:2403.03218  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

    Authors: Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, et al. (32 additional authors not shown)

    Abstract: The White House Executive Order on Artificial Intelligence highlights the risks of large language models (LLMs) empowering malicious actors in developing biological, cyber, and chemical weapons. To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs. However, current evaluations are private, preventing furthe…

    Submitted 15 May, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: See the project page at https://wmdp.ai

  12. arXiv:2402.19101  [pdf, other]

    cs.IR cs.LG

    Effective Two-Stage Knowledge Transfer for Multi-Entity Cross-Domain Recommendation

    Authors: Jianyu Guan, Zongming Yin, Tianyi Zhang, Leihui Chen, Yin Zhang, Fei Huang, Jufeng Chen, Shuguang Han

    Abstract: In recent years, the recommendation content on e-commerce platforms has become increasingly rich -- a single user feed may contain multiple entities, such as selling products, short videos, and content posts. To deal with the multi-entity recommendation problem, an intuitive solution is to adopt the shared-network-based architecture for joint training. The idea is to transfer the extracted knowled…

    Submitted 29 February, 2024; originally announced February 2024.

  13. arXiv:2402.03708  [pdf, other]

    cs.CV

    SISP: A Benchmark Dataset for Fine-grained Ship Instance Segmentation in Panchromatic Satellite Images

    Authors: Pengming Feng, Mingjie Xie, Hongning Liu, Xuanjia Zhao, Guangjun He, Xueliang Zhang, Jian Guan

    Abstract: Fine-grained ship instance segmentation in satellite images holds considerable significance for monitoring maritime activities at sea. However, existing datasets often suffer from the scarcity of fine-grained information or pixel-wise localization annotations, as well as the insufficient image diversity and variations, thus limiting the research of this task. To this end, we propose a benchmark da…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 14 pages, 9 figures

  14. arXiv:2402.01469  [pdf, other]

    cs.CL

    AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback

    Authors: Jian Guan, Wei Wu, Zujie Wen, Peng Xu, Hongning Wang, Minlie Huang

    Abstract: The notable success of large language models (LLMs) has sparked an upsurge in building language agents to complete various complex tasks. We present AMOR, an agent framework based on open-source LLMs, which reasons with external knowledge bases and adapts to specific domains through human supervision to the reasoning process. AMOR builds reasoning logic over a finite state machine (FSM) that solve…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Work in progress

  15. arXiv:2401.11040  [pdf, other]

    cs.HC

    Design Frameworks for Spatial Zone Agents in XRI Metaverse Smart Environments

    Authors: Jie Guan, Jiamin Liu, Alexis Morris

    Abstract: The spatial XR-IoT (XRI) Zone Agents concept combines Extended Reality (XR), the Internet of Things (IoT), and spatial computing concepts to create hyper-connected spaces for metaverse applications; envisioning space as zones that are social, smart, scalable, expressive, and agent-based. These zone agents serve as applications and agents (partners, assistants, or guides) for users co-living and co…

    Submitted 19 January, 2024; originally announced January 2024.

    Journal ref: 6th IEEE International Conference on Artificial Intelligence & extended and Virtual Reality (IEEE AIxVR 2024)

  16. arXiv:2312.16855  [pdf, other]

    cs.LG q-bio.BM

    Molecular Property Prediction Based on Graph Structure Learning

    Authors: Bangyi Zhao, Weixia Xu, Jihong Guan, Shuigeng Zhou

    Abstract: Molecular property prediction (MPP) is a fundamental but challenging task in the computer-aided drug discovery process. More and more recent works employ different graph-based models for MPP, which have made considerable progress in improving prediction performance. However, current models often ignore relationships between molecules, which could be also helpful for MPP. For this sake, in this pap…

    Submitted 28 December, 2023; originally announced December 2023.

  17. arXiv:2312.16600  [pdf, other]

    q-bio.GN cs.AI cs.LG

    scRNA-seq Data Clustering by Cluster-aware Iterative Contrastive Learning

    Authors: Weikang Jiang, Jinxian Wang, Jihong Guan, Shuigeng Zhou

    Abstract: Single-cell RNA sequencing (scRNA-seq) enables researchers to analyze gene expression at single-cell level. One important task in scRNA-seq data analysis is unsupervised clustering, which helps identify distinct cell types, laying down the foundation for other downstream analysis tasks. In this paper, we propose a novel method called Cluster-aware Iterative Contrastive Learning (CICL in short) for…

    Submitted 27 December, 2023; originally announced December 2023.

  18. arXiv:2312.04519  [pdf, other]

    cs.CV

    Bootstrapping Autonomous Driving Radars with Self-Supervised Learning

    Authors: Yiduo Hao, Sohrab Madani, Junfeng Guan, Mohammed Alloulah, Saurabh Gupta, Haitham Hassanieh

    Abstract: The perception of autonomous vehicles using radars has attracted increased research interest due its ability to operate in fog and bad weather. However, training radar models is hindered by the cost and difficulty of annotating large-scale radar data. To overcome this bottleneck, we propose a self-supervised learning framework to leverage the large amount of unlabeled radar data to pre-train radar…

    Submitted 18 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: 12 pages, 5 figures, to be published in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024

  19. arXiv:2311.06073  [pdf, other]

    cs.DC

    Collaborative Inference in DNN-based Satellite Systems with Dynamic Task Streams

    Authors: Jinglong Guan, Qiyang Zhang, Ilir Murturi, Praveen Kumar Donta, Schahram Dustdar, Shangguang Wang

    Abstract: As a driving force in the advancement of intelligent in-orbit applications, DNN models have been gradually integrated into satellites, producing daily latency-constraint and computation-intensive tasks. However, the substantial computation capability of DNN models, coupled with the instability of the satellite-ground link, pose significant challenges, hindering timely completion of tasks. It becom…

    Submitted 10 November, 2023; originally announced November 2023.

  20. arXiv:2310.14564  [pdf, other]

    cs.CL

    Language Models Hallucinate, but May Excel at Fact Verification

    Authors: Jian Guan, Jesse Dodge, David Wadden, Minlie Huang, Hao Peng

    Abstract: Recent progress in natural language processing (NLP) owes much to remarkable advances in large language models (LLMs). Nevertheless, LLMs frequently "hallucinate," resulting in non-factual outputs. Our carefully-designed human evaluation substantiates the serious hallucination issue, revealing that even GPT-3.5 produces factual outputs less than 25% of the time. This underscores the importance of…

    Submitted 20 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted in NAACL 2024

  21. arXiv:2310.14173  [pdf, other]

    cs.SD eess.AS

    First-Shot Unsupervised Anomalous Sound Detection With Unknown Anomalies Estimated by Metadata-Assisted Audio Generation

    Authors: Hejing Zhang, Qiaoxi Zhu, Jian Guan, Haohe Liu, Feiyang Xiao, Jiantong Tian, Xinhao Mei, Xubo Liu, Wenwu Wang

    Abstract: First-shot (FS) unsupervised anomalous sound detection (ASD) is a brand-new task introduced in DCASE 2023 Challenge Task 2, where the anomalous sounds for the target machine types are unseen in training. Existing methods often rely on the availability of normal and abnormal sound data from the target machines. However, due to the lack of anomalous sound data for the target machine types, it become…

    Submitted 11 March, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: Accepted at ICASSP 2024

  22. arXiv:2310.08950  [pdf, ps, other]

    cs.SD eess.AS

    Transformer-based Autoencoder with ID Constraint for Unsupervised Anomalous Sound Detection

    Authors: Jian Guan, Youde Liu, Qiuqiang Kong, Feiyang Xiao, Qiaoxi Zhu, Jiantong Tian, Wenwu Wang

    Abstract: Unsupervised anomalous sound detection (ASD) aims to detect unknown anomalous sounds of devices when only normal sound data is available. The autoencoder (AE) and self-supervised learning based methods are two mainstream methods. However, the AE-based methods could be limited as the feature learned from normal sounds can also fit with anomalous sounds, reducing the ability of the model in detectin…

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted by EURASIP Journal on Audio, Speech, and Music Processing

  23. arXiv:2310.05330  [pdf, other]

    cs.CV

    A Lightweight Video Anomaly Detection Model with Weak Supervision and Adaptive Instance Selection

    Authors: Yang Wang, Jiaogen Zhou, Jihong Guan

    Abstract: Video anomaly detection is to determine whether there are any abnormal events, behaviors or objects in a given video, which enables effective and intelligent public safety management. As video anomaly labeling is both time-consuming and expensive, most existing works employ unsupervised or weakly supervised learning methods. This paper focuses on weakly supervised video anomaly detection, in which…

    Submitted 8 October, 2023; originally announced October 2023.

  24. arXiv:2310.04463  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Diffusing on Two Levels and Optimizing for Multiple Properties: A Novel Approach to Generating Molecules with Desirable Properties

    Authors: Siyuan Guo, Jihong Guan, Shuigeng Zhou

    Abstract: In the past decade, Artificial Intelligence driven drug design and discovery has been a hot research topic, where an important branch is molecule generation by generative models, from GAN-based models and VAE-based models to the latest diffusion-based models. However, most existing models pursue only the basic properties like validity and uniqueness of the generated molecules, a few go further to…

    Submitted 5 October, 2023; originally announced October 2023.

  25. arXiv:2309.09705  [pdf, other]

    cs.SD eess.AS

    Synth-AC: Enhancing Audio Captioning with Synthetic Supervision

    Authors: Feiyang Xiao, Qiaoxi Zhu, Jian Guan, Xubo Liu, Haohe Liu, Kejia Zhang, Wenwu Wang

    Abstract: Data-driven approaches hold promise for audio captioning. However, the development of audio captioning methods can be biased due to the limited availability and quality of text-audio data. This paper proposes a SynthAC framework, which leverages recent advances in audio generative models and commonly available text corpus to create synthetic text-audio pairs, thereby enhancing text-audio represent…

    Submitted 18 September, 2023; originally announced September 2023.

  26. arXiv:2309.07498  [pdf, other]

    eess.AS cs.SD

    Hierarchical Metadata Information Constrained Self-Supervised Learning for Anomalous Sound Detection Under Domain Shift

    Authors: Haiyan Lan, Qiaoxi Zhu, Jian Guan, Yuming Wei, Wenwu Wang

    Abstract: Self-supervised learning methods have achieved promising performance for anomalous sound detection (ASD) under domain shift, where the type of domain shift is considered in feature learning by incorporating section IDs. However, the attributes accompanying audio files under each section, such as machine operating conditions and noise types, have not been considered, although they are also crucial…

    Submitted 18 December, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: To appear at ICASSP 2024

  27. arXiv:2309.04819  [pdf, other]

    quant-ph cs.CR cs.LG

    Detecting Violations of Differential Privacy for Quantum Algorithms

    Authors: Ji Guan, Wang Fang, Mingyu Huang, Mingsheng Ying

    Abstract: Quantum algorithms for solving a wide range of practical problems have been proposed in the last ten years, such as data search and analysis, product recommendation, and credit scoring. The concern about privacy and other ethical issues in quantum computing naturally rises up. In this paper, we define a formal framework for detecting violations of differential privacy for quantum algorithms. A det…

    Submitted 9 September, 2023; originally announced September 2023.

    Journal ref: In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS 2023)

  28. arXiv:2308.14063  [pdf, other]

    cs.SD eess.AS

    Anomalous Sound Detection Using Self-Attention-Based Frequency Pattern Analysis of Machine Sounds

    Authors: Hejing Zhang, Jian Guan, Qiaoxi Zhu, Feiyang Xiao, Youde Liu

    Abstract: Different machines can exhibit diverse frequency patterns in their emitted sound. This feature has been recently explored in anomaly sound detection and reached state-of-the-art performance. However, existing methods rely on the manual or empirical determination of the frequency filter by observing the effective frequency range in the training data, which may be impractical for general application…

    Submitted 6 September, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

    Comments: Published in INTERSPEECH 2023

  29. arXiv:2308.09540  [pdf, other]

    cs.CV cs.AI

    Meta-ZSDETR: Zero-shot DETR with Meta-learning

    Authors: Lu Zhang, Chenbo Zhang, Jiajia Zhao, Jihong Guan, Shuigeng Zhou

    Abstract: Zero-shot object detection aims to localize and recognize objects of unseen classes. Most of existing works face two problems: the low recall of RPN in unseen classes and the confusion of unseen classes with background. In this paper, we present the first method that combines DETR and meta-learning to perform zero-shot object detection, named Meta-ZSDETR, where model training is formalized as an i…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted in ICCV 2023

  30. arXiv:2308.06107  [pdf, other]

    cs.CR

    Test-Time Backdoor Defense via Detecting and Repairing

    Authors: Jiyang Guan, Jian Liang, Ran He

    Abstract: Deep neural networks have played a crucial part in many critical domains, such as autonomous driving, face recognition, and medical diagnosis. However, deep neural networks are facing security threats from backdoor attacks and can be manipulated into attacker-decided behaviors by the backdoor attacker. To defend the backdoor, prior research has focused on using clean data to remove backdoor attack…

    Submitted 29 November, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

  31. arXiv:2308.02362  [pdf, other]

    cs.CR cs.AI cs.LG

    Flexible Differentially Private Vertical Federated Learning with Adaptive Feature Embeddings

    Authors: Yuxi Mi, Hongquan Liu, Yewei Xia, Yiheng Sun, Jihong Guan, Shuigeng Zhou

    Abstract: The emergence of vertical federated learning (VFL) has stimulated concerns about the imperfection in privacy protection, as shared feature embeddings may reveal sensitive information under privacy attacks. This paper studies the delicate equilibrium between data privacy and task utility goals of VFL under differential privacy (DP). To address the generality issue of prior arts, this paper advocate…

    Submitted 26 July, 2023; originally announced August 2023.

  32. arXiv:2308.01999  [pdf, other]

    quant-ph cs.PF cs.SE

    cuQuantum SDK: A High-Performance Library for Accelerating Quantum Science

    Authors: Harun Bayraktar, Ali Charara, David Clark, Saul Cohen, Timothy Costa, Yao-Lung L. Fang, Yang Gao, Jack Guan, John Gunnels, Azzam Haidar, Andreas Hehn, Markus Hohnerbach, Matthew Jones, Tom Lubowe, Dmitry Lyakh, Shinya Morino, Paul Springer, Sam Stanwyck, Igor Terentyev, Satya Varadhan, Jonathan Wong, Takuma Yamaguchi

    Abstract: We present the NVIDIA cuQuantum SDK, a state-of-the-art library of composable primitives for GPU-accelerated quantum circuit simulations. As the size of quantum devices continues to increase, making their classical simulation progressively more difficult, the availability of fast and scalable quantum circuit simulators becomes vital for quantum algorithm developers, as well as quantum hardware eng…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: paper accepted at QCE 2023, journal reference will be updated whenever available

    MSC Class: 68Q12; 68Q09; 81P68;

  33. arXiv:2307.16410  [pdf, other]

    cs.CV

    HiREN: Towards Higher Supervision Quality for Better Scene Text Image Super-Resolution

    Authors: Minyi Zhao, Yi Xu, Bingjia Li, Jie Wang, Jihong Guan, Shuigeng Zhou

    Abstract: Scene text image super-resolution (STISR) is an important pre-processing technique for text recognition from low-resolution scene images. Nowadays, various methods have been proposed to extract text-specific information from high-resolution (HR) images to supervise STISR model training. However, due to uncontrollable factors (e.g. shooting equipment, focus, and environment) in manually photographi…

    Submitted 31 July, 2023; originally announced July 2023.

  34. arXiv:2307.10803  [pdf, other]

    cs.LG physics.ao-ph

    Spatial-Temporal Data Mining for Ocean Science: Data, Methodologies, and Opportunities

    Authors: Hanchen Yang, Wengen Li, Shuyu Wang, Hui Li, Jihong Guan, Shuigeng Zhou, Jiannong Cao

    Abstract: With the rapid amassing of spatial-temporal (ST) ocean data, many spatial-temporal data mining (STDM) studies have been conducted to address various oceanic issues, including climate forecasting and disaster warning. Compared with typical ST data (e.g., traffic data), ST ocean data is more complicated but with unique characteristics, e.g., diverse regionality and high sparsity. These characteristi…

    Submitted 3 August, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

  35. arXiv:2307.01542  [pdf, other]

    cs.CL

    Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation

    Authors: Jian Guan, Minlie Huang

    Abstract: Despite the huge progress in myriad generation tasks, pretrained language models (LMs) such as GPT2 still tend to generate repetitive texts with maximization-based decoding algorithms for open-ended generation. We attribute their overestimation of token-level repetition probabilities to the learning bias: LMs capture simple repetitive patterns faster with the MLE loss. We propose self-contrastive…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: ACL 2023 Short Findings

  36. arXiv:2307.01504  [pdf, other]

    cs.SI cs.AI cs.LG

    All in One: Multi-task Prompting for Graph Neural Networks

    Authors: Xiangguo Sun, Hong Cheng, Jia Li, Bo Liu, Jihong Guan

    Abstract: Recently, "pre-training and fine-tuning" has been adopted as a standard workflow for many graph tasks since it can take general graph knowledge to relieve the lack of graph annotations from each application. However, graph tasks with node level, edge level, and graph level are far diversified, making the pre-training pretext often incompatible with these multiple tasks. This gap may even cause a…

    Submitted 17 December, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

    Comments: KDD 23 Best Research Paper Award, which is the first for Hong Kong and Mainland China. A Python Library is released as ProG: https://github.com/sheldonresearch/ProG Submitted to SIGKDD'23 in 03 Feb 2023; Receive Acceptance in 17 May 2023 (Rating 3 4 4 4); Submit to arXiv 1st time in 4 Jul 2023

  37. Design Frameworks for Hyper-Connected Social XRI Immersive Metaverse Environments

    Authors: Jie Guan, Alexis Morris

    Abstract: The metaverse refers to the merger of technologies for providing a digital twin of the real world and the underlying connectivity and interactions for the many kinds of agents within. As this set of technology paradigms - involving artificial intelligence, mixed reality, the internet-of-things and others - gains in scale, maturity, and utility there are rapidly emerging design challenges and new r…

    Submitted 27 January, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

    Journal ref: IEEE Network ( Volume: 37, Issue: 4, July/August 2023)

  38. arXiv:2306.05358  [pdf, other]

    cs.CR cs.AI cs.LG cs.SD eess.AS

    Trustworthy Sensor Fusion against Inaudible Command Attacks in Advanced Driver-Assistance System

    Authors: Jiwei Guan, Lei Pan, Chen Wang, Shui Yu, Longxiang Gao, Xi Zheng

    Abstract: There are increasing concerns about malicious attacks on autonomous vehicles. In particular, inaudible voice command attacks pose a significant threat as voice commands become available in autonomous driving systems. How to empirically defend against these inaudible attacks remains an open question. Previous research investigates utilizing deep learning-based multimodal fusion for defense, without…

    Submitted 29 May, 2023; originally announced June 2023.

  39. An XRI Mixed-Reality Internet-of-Things Architectural Framework Toward Immersive and Adaptive Smart Environments

    Authors: Alexis Morris, Jie Guan, Amna Azhar

    Abstract: The internet-of-things (IoT) refers to the growing number of embedded interconnected devices within everyday ubiquitous objects and environments, especially their networks, edge controllers, data gathering and management, sharing, and contextual analysis capabilities. However, the IoT suffers from inherent limitations in terms of human-computer interaction. In this landscape, there is a need for i…

    Submitted 1 June, 2023; originally announced June 2023.

    Journal ref: 2021 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct)

  40. Extending the Metaverse: Hyper-Connected Smart Environments with Mixed Reality and the Internet of Things

    Authors: Jie Guan, Alexis Morris, Jay Irizawa

    Abstract: The metaverse, i.e., the collection of technologies that provide a virtual twin of the real world via mixed reality, internet of things, and others, is gaining prominence. However, the metaverse faces challenges as it grows toward mainstream adoption. Among these is the lack of strong connections between metaverse objects and traditional physical objects and environments, which leads to inconsiste…

    Submitted 1 June, 2023; originally announced June 2023.

    Journal ref: 2023 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops

  41. Cross-Reality for Extending the Metaverse: Designing Hyper-Connected Immersive Environments with XRI

    Authors: Jie Guan, Alexis Morris, Jay Irizawa

    Abstract: The Metaverse comprises technologies to enable virtual twins of the real world, via mixed reality, internet of things, and others. As it matures unique challenges arise such as a lack of strong connections between virtual and physical worlds. This work presents design frameworks for cross-reality hybrid spaces. Contributions include: i) clarifying the metaverse "disconnect", ii) extended metaverse… ▽ More

    Submitted 1 June, 2023; originally announced June 2023.

    Journal ref: 2023 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW)

  42. Extended-XRI Body Interfaces for Hyper-Connected Metaverse Environments

    Authors: Jie Guan, Alexis Morris

    Abstract: Hybrid mixed-reality (XR) internet-of-things (IoT) research, here called XRI, aims at a strong integration between physical and virtual objects, environments, and agents wherein IoT-enabled edge devices are deployed for sensing, context understanding, networked communication and control of device actuators. Likewise, as augmented reality systems provide an immersive overlay on the environments, an… ▽ More

    Submitted 1 June, 2023; originally announced June 2023.

    Journal ref: 2022 IEEE Games, Entertainment, Media Conference (GEM)

  43. arXiv:2305.12881  [pdf, other

    cs.CV cs.MM

    Building an Invisible Shield for Your Portrait against Deepfakes

    Authors: Jiazhi Guan, Tianshu Hu, Hang Zhou, Zhizhi Guo, Lirui Deng, Chengbin Quan, Errui Ding, Youjian Zhao

    Abstract: The issue of detecting deepfakes has garnered significant attention in the research community, with the goal of identifying facial manipulations for abuse prevention. Although recent studies have focused on developing generalized models that can detect various types of deepfakes, their performance is not always reliable and stable, which poses limitations in real-world applications. Instead of… ▽ More

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: under review

  44. Toward Mixed Reality Hybrid Objects with IoT Avatar Agents

    Authors: Alexis Morris, Jie Guan, Nadine Lessio, Yiyi Shao

    Abstract: The internet-of-things (IoT) refers to the growing field of interconnected pervasive computing devices and the networking that supports smart, embedded applications. The IoT has multiple human-computer interaction challenges due to its many formats and interlinked components, and central to these is the need to provide sensory information and situational context pertaining to users in a more human… ▽ More

    Submitted 19 May, 2023; originally announced May 2023.

  45. arXiv:2305.07508  [pdf, other

    q-bio.BM cs.LG q-bio.QM

    MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation

    Authors: Xingang Peng, Jiaqi Guan, Qiang Liu, Jianzhu Ma

    Abstract: Deep generative models have recently achieved superior performance in 3D molecule generation. Most of them first generate atoms and then add chemical bonds based on the generated atoms in a post-processing manner. However, there might be no corresponding bond solution for the temporally generated atoms as their locations are generated without considering potential bonds. We define this problem as… ▽ More

    Submitted 11 May, 2023; originally announced May 2023.

  46. arXiv:2305.05445  [pdf, other

    cs.CV cs.GR cs.MM

    StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

    Authors: Jiazhi Guan, Zhanwang Zhang, Hang Zhou, Tianshu Hu, Kaisiyuan Wang, Dongliang He, Haocheng Feng, Jingtuo Liu, Errui Ding, Ziwei Liu, Jingdong Wang

    Abstract: Despite recent advances in syncing lip movements with any audio waves, current methods still struggle to balance generation quality and the model's generalization ability. Previous studies either require long-term data for training or produce a similar movement pattern on all subjects with low quality. In this paper, we propose StyleSync, an effective framework that enables high-fidelity lip synch… ▽ More

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. Project page: https://hangz-nju-cuhk.github.io/projects/StyleSync

  47. arXiv:2305.03328  [pdf, other

    eess.AS cs.SD

    Time-weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection

    Authors: Jian Guan, Youde Liu, Qiaoxi Zhu, Tieran Zheng, Jiqing Han, Wenwu Wang

    Abstract: Although deep learning is the mainstream method in unsupervised anomalous sound detection, Gaussian Mixture Model (GMM) with statistical audio frequency representation as input can achieve comparable results with much lower model complexity and fewer parameters. Existing statistical frequency representations, e.g., the log-Mel spectrogram's average or maximum over time, do not always work well for… ▽ More

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: To appear at ICASSP 2023

  48. arXiv:2305.02606  [pdf, other

    cs.CL

    Re$^3$Dial: Retrieve, Reorganize and Rescale Dialogue Corpus for Long-Turn Open-Domain Dialogue Pre-training

    Authors: Jiaxin Wen, Hao Zhou, Jian Guan, Minlie Huang

    Abstract: Pre-training on large-scale open-domain dialogue data can substantially improve the performance of dialogue models. However, the pre-trained dialogue model's ability to utilize long-range context is limited due to the scarcity of long-turn dialogue sessions. Most dialogues in existing pre-training corpora contain fewer than three turns of dialogue. To alleviate this issue, we propose the Retrieve,… ▽ More

    Submitted 22 October, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 Main Conference

  49. arXiv:2304.03588  [pdf, other

    cs.SD cs.LG eess.AS

    Anomalous Sound Detection using Audio Representation with Machine ID based Contrastive Learning Pretraining

    Authors: Jian Guan, Feiyang Xiao, Youde Liu, Qiaoxi Zhu, Wenwu Wang

    Abstract: Existing contrastive learning methods for anomalous sound detection refine the audio representation of each audio sample by using the contrast between the samples' augmentations (e.g., with time or frequency masking). However, they might be biased by the augmented data, due to the lack of physical properties of machine sound, thereby limiting the detection performance. This paper uses contrastive… ▽ More

    Submitted 10 April, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: To appear in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

  50. Graph Attention for Automated Audio Captioning

    Authors: Feiyang Xiao, Jian Guan, Qiaoxi Zhu, Wenwu Wang

    Abstract: State-of-the-art audio captioning methods typically use the encoder-decoder structure with pretrained audio neural networks (PANNs) as encoders for feature extraction. However, the convolution operation used in PANNs is limited in capturing the long-time dependencies within an audio signal, thereby leading to potential performance degradation in audio captioning. This letter presents a novel metho… ▽ More

    Submitted 10 April, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: Accepted by IEEE Signal Processing Letters