Showing 1–50 of 234 results for author: Qian, X

Searching in archive cs.
  1. arXiv:2406.12888

    cond-mat.mtrl-sci cs.AI physics.atom-ph

    A Space Group Symmetry Informed Network for O(3) Equivariant Crystal Tensor Prediction

    Authors: Keqiang Yan, Alexandra Saxton, Xiaofeng Qian, Xiaoning Qian, Shuiwang Ji

    Abstract: We consider the prediction of general tensor properties of crystalline materials, including dielectric, piezoelectric, and elastic tensors. A key challenge here is how to make the predictions satisfy the unique tensor equivariance to O(3) group and invariance to crystal space groups. To this end, we propose a General Materials Tensor Network (GMTNet), which is carefully designed to satisfy the req…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted to ICML 24 as a poster. You are encouraged to cite the conference version of this paper

  2. arXiv:2406.12178

    cs.CV

    FCA-RAC: First Cycle Annotated Repetitive Action Counting

    Authors: Jiada Lu, WeiWei Zhou, Xiang Qian, Dongze Lian, Yanyu Xu, Weifeng Wang, Lina Cao, Shenghua Gao

    Abstract: Repetitive action counting quantifies the frequency of specific actions performed by individuals. However, existing action-counting datasets have limited action diversity, potentially hampering model performance on unseen actions. To address this issue, we propose a framework called First Cycle Annotated Repetitive Action Counting (FCA-RAC). This framework contains 4 parts: 1) a labeling technique…

    Submitted 17 June, 2024; originally announced June 2024.

  3. Learning Flexible Time-windowed Granger Causality Integrating Heterogeneous Interventional Time Series Data

    Authors: Ziyi Zhang, Shaogang Ren, Xiaoning Qian, Nick Duffield

    Abstract: Granger causality, commonly used for inferring causal structures from time series data, has been adopted in widespread applications across various fields due to its intuitive explainability and high compatibility with emerging deep neural network prediction models. To alleviate challenges in better deciphering causal structures unambiguously from time series, the use of interventional data has bec…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted by ACM SIGKDD 2024

  4. arXiv:2406.06045

    cs.CV cs.AI

    Synthesizing Efficient Data with Diffusion Models for Person Re-Identification Pre-Training

    Authors: Ke Niu, Haiyang Yu, Xuelin Qian, Teng Fu, Bin Li, Xiangyang Xue

    Abstract: Existing person re-identification (Re-ID) methods principally deploy the ImageNet-1K dataset for model initialization, which inevitably results in sub-optimal situations due to the large domain gap. One of the key challenges is that building large-scale person Re-ID datasets is time-consuming. Some previous efforts address this problem by collecting person images from the internet e.g., LUPerson,…

    Submitted 10 June, 2024; originally announced June 2024.

  5. arXiv:2405.19949

    cs.CV

    Hyper-Transformer for Amodal Completion

    Authors: Jianxiong Gao, Xuelin Qian, Longfei Liang, Junwei Han, Yanwei Fu

    Abstract: Amodal object completion is a complex task that involves predicting the invisible parts of an object based on visible segments and background information. Learning shape priors is crucial for effective amodal completion, but traditional methods often rely on two-stage processes or additional information, leading to inefficiencies and potential error accumulation. To address these shortcomings, we…

    Submitted 30 May, 2024; originally announced May 2024.

  6. arXiv:2405.19695

    cs.CV

    Distribution Aligned Semantics Adaption for Lifelong Person Re-Identification

    Authors: Qizao Wang, Xuelin Qian, Bin Li, Xiangyang Xue

    Abstract: In real-world scenarios, person Re-IDentification (Re-ID) systems need to be adaptable to changes in space and time. Therefore, the adaptation of Re-ID models to new domains while preserving previously acquired knowledge is crucial, known as Lifelong person Re-IDentification (LReID). Advanced LReID methods rely on replaying exemplars from old domains and applying knowledge distillation in logits w…

    Submitted 30 May, 2024; originally announced May 2024.

  7. arXiv:2405.19149

    cs.CV cs.AI cs.IR

    CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval

    Authors: Xintong Jiang, Yaxiong Wang, Mengjian Li, Yujiao Wu, Bingwen Hu, Xueming Qian

    Abstract: Composed Image Retrieval (CIR) involves searching for target images based on an image-text pair query. While current methods treat this as a query-target matching problem, we argue that CIR triplets contain additional associations beyond this primary relation. In our paper, we identify two new relations within triplets, treating each triplet as a graph node. Firstly, we introduce the concept of te…

    Submitted 30 May, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: To appear at SIGIR 2024. arXiv admin note: text overlap with arXiv:2309.02169

  8. arXiv:2405.19005

    cs.CV

    Auto-selected Knowledge Adapters for Lifelong Person Re-identification

    Authors: Xuelin Qian, Ruiqi Wu, Gong Cheng, Junwei Han

    Abstract: Lifelong Person Re-Identification (LReID) extends traditional ReID by requiring systems to continually learn from non-overlapping datasets across different times and locations, adapting to new identities while preserving knowledge of previous ones. Existing approaches, either rehearsal-free or rehearsal-based, still suffer from the problem of catastrophic forgetting since they try to cram diverse…

    Submitted 30 May, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  9. arXiv:2405.16600

    cs.CV

    Image-Text-Image Knowledge Transferring for Lifelong Person Re-Identification with Hybrid Clothing States

    Authors: Qizao Wang, Xuelin Qian, Bin Li, Yanwei Fu, Xiangyang Xue

    Abstract: With the continuous expansion of intelligent surveillance networks, lifelong person re-identification (LReID) has received widespread attention, pursuing the need of self-evolution across different domains. However, existing LReID studies accumulate knowledge with the assumption that people would not change their clothes. In this paper, we propose a more practical task, namely lifelong person re-i…

    Submitted 26 May, 2024; originally announced May 2024.

  10. arXiv:2405.16597

    cs.CV

    Content and Salient Semantics Collaboration for Cloth-Changing Person Re-Identification

    Authors: Qizao Wang, Xuelin Qian, Bin Li, Lifeng Chen, Yanwei Fu, Xiangyang Xue

    Abstract: Cloth-changing person Re-IDentification (Re-ID) aims at recognizing the same person with clothing changes across non-overlapping cameras. Conventional person Re-ID methods usually bias the model's focus on cloth-related appearance features rather than identity-sensitive features associated with biological traits. Recently, advanced cloth-changing person Re-ID methods either resort to identity-rela…

    Submitted 26 May, 2024; originally announced May 2024.

  11. arXiv:2405.12609

    eess.AS cs.SD

    Mamba in Speech: Towards an Alternative to Self-Attention

    Authors: Xiangyu Zhang, Qiquan Zhang, Hexin Liu, Tianyi Xiao, Xinyuan Qian, Beena Ahmed, Eliathamby Ambikairajah, Haizhou Li, Julien Epps

    Abstract: Transformer and its derivatives have achieved success in diverse tasks across computer vision, natural language processing, and speech processing. To reduce the complexity of computations within the multi-head self-attention mechanism in Transformer, Selective State Space Models (i.e., Mamba) were proposed as an alternative. Mamba exhibited its effectiveness in natural language processing and comp…

    Submitted 30 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  12. arXiv:2405.12202

    cs.CV cs.AI

    Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution

    Authors: Xihaier Luo, Xiaoning Qian, Byung-Jun Yoon

    Abstract: In this work, we present an arbitrary-scale super-resolution (SR) method to enhance the resolution of scientific data, which often involves complex challenges such as continuity, multi-scale physics, and the intricacies of high-frequency signals. Grounded in operator learning, the proposed method is resolution-invariant. The core of our model is a hierarchical neural operator that leverages a Gale…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 20 pages, 14 figures

  13. arXiv:2405.11461

    cs.IR cs.AI cs.CL

    DocReLM: Mastering Document Retrieval with Language Model

    Authors: Gengchen Wei, Xinle Pang, Tianning Zhang, Yu Sun, Xun Qian, Chen Lin, Han-Sen Zhong, Wanli Ouyang

    Abstract: With over 200 million published academic documents and millions of new documents being written each year, academic researchers face the challenge of searching for information within this vast corpus. However, existing retrieval systems struggle to understand the semantics and domain knowledge present in academic papers. In this work, we demonstrate that by utilizing large language models, a docume…

    Submitted 19 May, 2024; originally announced May 2024.

  14. arXiv:2405.05430

    cs.LG

    Towards Invariant Time Series Forecasting in Smart Cities

    Authors: Ziyi Zhang, Shaogang Ren, Xiaoning Qian, Nick Duffield

    Abstract: In the transformative landscape of smart cities, the integration of the cutting-edge web technologies into time series forecasting presents a pivotal opportunity to enhance urban planning, sustainability, and economic growth. The advancement of deep neural networks has significantly improved forecasting performance. However, a notable challenge lies in the ability of these models to generalize wel…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted by ACM WWW Companion 2024

  15. arXiv:2405.01104

    cs.IT eess.SP

    Multi-user ISAC through Stacked Intelligent Metasurfaces: New Algorithms and Experiments

    Authors: Ziqing Wang, Hongzheng Liu, Jianan Zhang, Rujing Xiong, Kai Wan, Xuewen Qian, Marco Di Renzo, Robert Caiming Qiu

    Abstract: This paper investigates a Stacked Intelligent Metasurfaces (SIM)-assisted Integrated Sensing and Communications (ISAC) system. An extended target model is considered, where the BS aims to estimate the complete target response matrix relative to the SIM. Under the constraints of minimum Signal-to-Interference-plus-Noise Ratio (SINR) for the communication users (CUs) and maximum transmit power, we j…

    Submitted 2 May, 2024; originally announced May 2024.

  16. arXiv:2404.18501

    eess.AS cs.SD

    Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention

    Authors: Ruijie Tao, Xinyuan Qian, Yidi Jiang, Junjie Li, Jiadong Wang, Haizhou Li

    Abstract: Audio-visual target speaker extraction (AV-TSE) aims to extract the specific person's speech from the audio mixture given auxiliary visual cues. Previous methods usually search for the target voice through speech-lip synchronization. However, this strategy mainly focuses on the existence of target speech, while ignoring the variations of the noise characteristics. That may result in extracting noi…

    Submitted 8 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  17. arXiv:2404.13153

    eess.IV cs.CV

    Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring

    Authors: Chengxu Liu, Xuan Wang, Xiangyu Xu, Ruhao Tian, Shuai Li, Xueming Qian, Ming-Hsuan Yang

    Abstract: Eliminating image blur produced by various kinds of motion has been a challenging problem. Dominant approaches rely heavily on model capacity to remove blurring by reconstructing residual from blurry observation in feature space. These practices not only prevent the capture of spatially variable motion in the real world but also ignore the tailored handling of various motions in image space. In th…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  18. arXiv:2404.08995

    cs.LG cs.AI cs.CV

    Beyond Known Clusters: Probe New Prototypes for Efficient Generalized Class Discovery

    Authors: Ye Wang, Yaxiong Wang, Yujiao Wu, Bingchen Zhao, Xueming Qian

    Abstract: Generalized Class Discovery (GCD) aims to dynamically assign labels to unlabelled data partially based on knowledge learned from labelled data, where the unlabelled data may come from known or novel classes. The prevailing approach generally involves clustering across all data and learning conceptions by prototypical contrastive learning. However, existing methods largely hinge on the performance…

    Submitted 30 April, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

    Comments: 9 pages, 7 figures

  19. arXiv:2403.18211

    cs.CV cs.LG

    NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

    Authors: Jingyang Huo, Yikai Wang, Xuelin Qian, Yun Wang, Chong Li, Jianfeng Feng, Yanwei Fu

    Abstract: Recent fMRI-to-image approaches mainly focused on associating fMRI signals with specific conditions of pre-trained diffusion models. These approaches, while producing high-quality images, capture only a limited aspect of the complex information in fMRI signals and offer little detailed control over image creation. In contrast, this paper proposes to directly modulate the generation process of diff…

    Submitted 26 March, 2024; originally announced March 2024.

  20. arXiv:2403.16607

    cs.LG cs.CV

    Enhancing Industrial Transfer Learning with Style Filter: Cost Reduction and Defect-Focus

    Authors: Chen Li, Ruijie Ma, Xiang Qian, Xiaohao Wang, Xinghui Li

    Abstract: Addressing the challenge of data scarcity in industrial domains, transfer learning emerges as a pivotal paradigm. This work introduces Style Filter, a tailored methodology for industrial contexts. By selectively filtering source domain data before knowledge transfer, Style Filter reduces the quantity of data while maintaining or even enhancing the performance of transfer learning strategy. Offerin…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 17 pages, 11 figures, 4 tables

  21. arXiv:2403.11857

    cs.LG cond-mat.mtrl-sci

    Complete and Efficient Graph Transformers for Crystal Material Property Prediction

    Authors: Keqiang Yan, Cong Fu, Xiaofeng Qian, Xiaoning Qian, Shuiwang Ji

    Abstract: Crystal structures are characterized by atomic bases within a primitive unit cell that repeats along a regular lattice throughout 3D space. The periodic and infinite nature of crystals poses unique challenges for geometric graph representation learning. Specifically, constructing graphs that effectively capture the complete geometric information of crystals and handle chiral crystals remains an un…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted by ICLR 2024

  22. arXiv:2403.10012

    cs.CV cs.RO eess.IV physics.optics

    Real-World Computational Aberration Correction via Quantized Domain-Mixing Representation

    Authors: Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang

    Abstract: Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal performance in real-world applications. In this paper, in contrast to improving the simulation pipeline, we deliver a novel insight into real-world CAC from the perspective of Unsupervi…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Codes and datasets will be made publicly available at https://github.com/zju-jiangqi/QDMR

  23. Fast-Forward Reality: Authoring Error-Free Context-Aware Policies with Real-Time Unit Tests in Extended Reality

    Authors: Xun Qian, Tianyi Wang, Xuhai Xu, Tanya R Jonker, Kashyap Todi

    Abstract: Advances in ubiquitous computing have enabled end-user authoring of context-aware policies (CAPs) that control smart devices based on specific contexts of the user and environment. However, authoring CAPs accurately and avoiding run-time errors is challenging for end-users as it is difficult to foresee CAP behaviors under complex real-world conditions. We propose Fast-Forward Reality, an Extended…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 17 pages, 7 figures, ACM CHI 2024 Full Paper

    ACM Class: H.5.2

  24. arXiv:2403.06017

    cs.LG cs.CY

    Addressing Shortcomings in Fair Graph Learning Datasets: Towards a New Benchmark

    Authors: Xiaowei Qian, Zhimeng Guo, Jialiang Li, Haitao Mao, Bingheng Li, Suhang Wang, Yao Ma

    Abstract: Fair graph learning plays a pivotal role in numerous practical applications. Recently, many fair graph learning methods have been proposed; however, their evaluation often relies on poorly constructed semi-synthetic datasets or substandard real-world datasets. In such cases, even a basic Multilayer Perceptron (MLP) can outperform Graph Neural Networks (GNNs) in both utility and fairness. In this w…

    Submitted 17 June, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: KDD ADS 2024

  25. arXiv:2403.05660

    cs.CV

    Decoupling Degradations with Recurrent Network for Video Restoration in Under-Display Camera

    Authors: Chengxu Liu, Xuan Wang, Yuanting Fan, Shuai Li, Xueming Qian

    Abstract: Under-display camera (UDC) systems are the foundation of full-screen display devices in which the lens mounts under the display. The pixel array of light-emitting diodes used for display diffracts and attenuates incident light, causing various degradations as the light intensity changes. Unlike general video restoration which recovers video by treating different degradation factors equally, video…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: AAAI 2024

  26. arXiv:2402.12225

    cs.CV

    Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability

    Authors: Xuelin Qian, Yu Wang, Simian Luo, Yinda Zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Xiangyang Xue, Bo Zhao, Tiejun Huang, Yunsheng Wu, Yanwei Fu

    Abstract: Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space. In this paper, we extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously. Firstly, we leverage an ensemble of publicly available 3D datasets to facilitate…

    Submitted 26 March, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Project page: https://argus-3d.github.io/ . Datasets: https://huggingface.co/datasets/BAAI/Objaverse-MIX. arXiv admin note: substantial text overlap with arXiv:2303.14700

  27. arXiv:2402.11857

    cs.LG cs.DC

    Communication-Efficient Distributed Learning with Local Immediate Error Compensation

    Authors: Yifei Cheng, Li Shen, Linli Xu, Xun Qian, Shiwei Wu, Yiming Zhou, Tie Zhang, Dacheng Tao, Enhong Chen

    Abstract: Gradient compression with error compensation has attracted significant attention with the target of reducing the heavy communication overhead in distributed learning. However, existing compression methods either perform only unidirectional compression in one iteration with higher communication cost, or bidirectional compression with slower convergence rate. In this work, we propose the Local Immed…

    Submitted 19 February, 2024; originally announced February 2024.

  28. arXiv:2402.02322

    cs.LG stat.ML

    Dynamic Incremental Optimization for Best Subset Selection

    Authors: Shaogang Ren, Xiaoning Qian

    Abstract: Best subset selection is considered the 'gold standard' for many sparse learning problems. A variety of optimization techniques have been proposed to attack this non-smooth non-convex problem. In this paper, we investigate the dual forms of a family of $\ell_0$-regularized problems. An efficient primal-dual algorithm is developed based on the primal and dual problem structures. By leveraging the d…

    Submitted 4 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2207.02058

  29. arXiv:2402.02277

    cs.LG stat.ML

    Causal Bayesian Optimization via Exogenous Distribution Learning

    Authors: Shaogang Ren, Xiaoning Qian

    Abstract: Maximizing a target variable as an operational objective in a structural causal model is an important problem. Existing Causal Bayesian Optimization (CBO) methods either rely on hard interventions that alter the causal structure to maximize the reward; or introduce action nodes to endogenous variables so that the data generation mechanisms are adjusted to achieve the objective. In this paper, a no…

    Submitted 26 May, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  30. arXiv:2401.16936

    cs.LG cs.CV

    Multi-modal Representation Learning for Cross-modal Prediction of Continuous Weather Patterns from Discrete Low-Dimensional Data

    Authors: Alif Bin Abdul Qayyum, Xihaier Luo, Nathan M. Urban, Xiaoning Qian, Byung-Jun Yoon

    Abstract: World is looking for clean and renewable energy sources that do not pollute the environment, in an attempt to reduce greenhouse gas emissions that contribute to global warming. Wind energy has significant potential to not only reduce greenhouse emission, but also meet the ever increasing demand for energy. To enable the effective utilization of wind energy, addressing the following three challenge…

    Submitted 30 January, 2024; originally announced January 2024.

  31. arXiv:2401.04408

    cs.IR cs.LG

    Fine-Grained Embedding Dimension Optimization During Training for Recommender Systems

    Authors: Qinyi Luo, Penghan Wang, Wei Zhang, Fan Lai, Jiachen Mao, Xiaohan Wei, Jun Song, Wei-Yu Tsai, Shuai Yang, Yuxi Hu, Xuehai Qian

    Abstract: Huge embedding tables in modern Deep Learning Recommender Models (DLRM) require prohibitively large memory during training and inference. Aiming to reduce the memory footprint of training, this paper proposes FIne-grained In-Training Embedding Dimension optimization (FIITED). Given the observation that embedding vectors are not equally important, FIITED adjusts the dimension of each individual emb…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: 16 pages, 9 figures

    ACM Class: I.2.6; H.3.3

  32. arXiv:2312.14066

    cs.LG

    Upper Bounding Barlow Twins: A Novel Filter for Multi-Relational Clustering

    Authors: Xiaowei Qian, Bingheng Li, Zhao Kang

    Abstract: Multi-relational clustering is a challenging task due to the fact that diverse semantic information conveyed in multi-layer graphs is difficult to extract and fuse. Recent methods integrate topology structure and node attribute information through graph filtering. However, they often use a low-pass filter without fully considering the correlation among multiple graphs. To overcome this drawback, w…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  33. arXiv:2312.12681

    cs.CL cs.AI

    Imitation of Life: A Search Engine for Biologically Inspired Design

    Authors: Hen Emuna, Nadav Borenstein, Xin Qian, Hyeonsu Kang, Joel Chan, Aniket Kittur, Dafna Shahaf

    Abstract: Biologically Inspired Design (BID), or Biomimicry, is a problem-solving methodology that applies analogies from nature to solve engineering challenges. For example, Speedo engineers designed swimsuits based on shark skin. Finding relevant biological solutions for real-world problems poses significant challenges, both due to the limited biological knowledge engineers and designers typically possess…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: To be published in the AAAI 2024 Proceedings Main Track

  34. arXiv:2312.09672

    cs.HC cs.AI

    InstructPipe: Building Visual Programming Pipelines with Human Instructions

    Authors: Zhongyi Zhou, Jing Jin, Vrushank Phadnis, Xiuxiu Yuan, Jun Jiang, Xun Qian, Jingtao Zhou, Yiyi Huang, Zheng Xu, Yinda Zhang, Kristen Wright, Jason Mayes, Mark Sherwood, Johnny Lee, Alex Olwal, David Kim, Ram Iyengar, Na Li, Ruofei Du

    Abstract: Visual programming provides beginner-level programmers with a coding-free experience to build their customized pipelines. Existing systems require users to build a pipeline entirely from scratch, implying that novice users need to set up and link appropriate nodes all by themselves, starting from a blank workspace. We present InstructPipe, an AI assistant that enables users to start prototyping ma…

    Submitted 15 December, 2023; originally announced December 2023.

  35. arXiv:2312.07485

    cs.CV

    MinD-3D: Reconstruct High-quality 3D objects in Human Brain

    Authors: Jianxiong Gao, Yuqian Fu, Yun Wang, Xuelin Qian, Jianfeng Feng, Yanwei Fu

    Abstract: In this paper, we introduce Recon3DMind, an innovative task aimed at reconstructing 3D visuals from Functional Magnetic Resonance Imaging (fMRI) signals, marking a significant advancement in the fields of cognitive neuroscience and computer vision. To support this pioneering task, we present the fMRI-Shape dataset, which includes data from 14 participants and features 360-degree videos of 3D objec…

    Submitted 21 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: 26 pages, 13 figures

  36. arXiv:2311.16035

    quant-ph cs.AI cs.AR cs.LG

    RobustState: Boosting Fidelity of Quantum State Preparation via Noise-Aware Variational Training

    Authors: Hanrui Wang, Yilian Liu, Pengyu Liu, Jiaqi Gu, Zirui Li, Zhiding Liang, Jinglei Cheng, Yongshan Ding, Xuehai Qian, Yiyu Shi, David Z. Pan, Frederic T. Chong, Song Han

    Abstract: Quantum state preparation, a crucial subroutine in quantum computing, involves generating a target quantum state from initialized qubits. Arbitrary state preparation algorithms can be broadly categorized into arithmetic decomposition (AD) and variational quantum state preparation (VQSP). AD employs a predefined procedure to decompose the target state into a series of gates, whereas VQSP iterativel…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted to FASTML @ ICCAD 2023. 14 pages, 20 figures

  37. arXiv:2311.12070

    eess.IV cs.CV

    FDDM: Unsupervised Medical Image Translation with a Frequency-Decoupled Diffusion Model

    Authors: Yunxiang Li, Hua-Chieh Shao, Xiaoxue Qian, You Zhang

    Abstract: Diffusion models have demonstrated significant potential in producing high-quality images in medical image translation to aid disease diagnosis, localization, and treatment. Nevertheless, current diffusion models have limited success in achieving faithful image translations that can accurately preserve the anatomical structures of medical images, especially for unpaired datasets. The preservation…

    Submitted 26 June, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  38. arXiv:2311.00342

    cs.CV

    fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for Multi-Subject Brain Activity Decoding

    Authors: Xuelin Qian, Yun Wang, Jingyang Huo, Jianfeng Feng, Yanwei Fu

    Abstract: The exploration of brain activity and its decoding from fMRI data has been a longstanding pursuit, driven by its potential applications in brain-computer interfaces, medical diagnostics, and virtual reality. Previous approaches have primarily focused on individual subject analysis, highlighting the need for a more universal and adaptable framework, which is the core motivation behind our work. In…

    Submitted 1 November, 2023; originally announced November 2023.

  39. arXiv:2310.14778

    cs.MM cs.SD eess.AS

    Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions

    Authors: Jinzheng Zhao, Yong Xu, Xinyuan Qian, Davide Berghi, Peipei Wu, Meng Cui, Jianyuan Sun, Philip J. B. Jackson, Wenwu Wang

    Abstract: Audio-visual speaker tracking has drawn increasing attention over the past few years due to its academic values and wide application. Audio and visual modalities can provide complementary information for localization and tracking. With audio and visual information, the Bayesian-based filter can solve the problem of data association, audio-visual fusion and track management. In this paper, we condu…

    Submitted 17 December, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

  40. arXiv:2310.11426

    cs.RO

    Underwater and Surface Aquatic Locomotion of Soft Biomimetic Robot Based on Bending Rolled Dielectric Elastomer Actuators

    Authors: Chenyu Zhang, Chen Zhang, Juntian Qu, Xiang Qian

    Abstract: All-around, real-time navigation and sensing across the water environments by miniature soft robotics are promising, for their merits of small size, high agility and good compliance to the unstructured surroundings. In this paper, we propose and demonstrate a mantas-like soft aquatic robot which propels itself by flapping-fins using rolled dielectric elastomer actuators (DEAs) with bending motions…

    Submitted 19 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 6 Pages, 12 Figures, Published at IROS 2023

  41. arXiv:2310.10497

    cs.SD cs.AI eess.AS

    LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism

    Authors: Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li

    Abstract: The prevailing noise-resistant and reverberation-resistant localization algorithms primarily emphasize separating and providing directional output for each speaker in multi-speaker scenarios, without association with the identity of speakers. In this paper, we present a target speaker localization algorithm with a selective hearing mechanism. Given a reference speech of the target speaker, we firs…

    Submitted 17 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Submitted to ICASSP 2024

  42. arXiv:2309.16308

    cs.MM cs.SD eess.AS

    Audio Visual Speaker Localization from EgoCentric Views

    Authors: Jinzheng Zhao, Yong Xu, Xinyuan Qian, Wenwu Wang

    Abstract: The use of audio and visual modality for speaker localization has been well studied in the literature by exploiting their complementary characteristics. However, most previous works employ the setting of static sensors mounted at fixed positions. Unlike them, in this work, we explore the ego-centric setting, where the heterogeneous sensors are embodied and could be moving with a human to facilitat…

    Submitted 28 September, 2023; originally announced September 2023.

  43. arXiv:2309.13248  [pdf, other

    cs.CV

    Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation

    Authors: Ke Fan, Jingshi Lei, Xuelin Qian, Miaopeng Yu, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

    Abstract: Video amodal segmentation is a particularly challenging task in computer vision, which requires to deduce the full shape of an object from the visible parts of it. Recently, some studies have achieved promising performance by using motion flow to integrate information across frames under a self-supervised setting. However, motion flow has a clear limitation by the two factors of moving cameras and…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV 2023

  44. arXiv:2309.03061  [pdf, other

    stat.ML cs.LG stat.ME

    Learning Active Subspaces for Effective and Scalable Uncertainty Quantification in Deep Neural Networks

    Authors: Sanket Jantre, Nathan M. Urban, Xiaoning Qian, Byung-Jun Yoon

    Abstract: Bayesian inference for neural networks, or Bayesian deep learning, has the potential to provide well-calibrated predictions with quantified uncertainty and robustness. However, the main hurdle for Bayesian deep learning is its computational complexity due to the high dimensionality of the parameter space. In this work, we propose a novel scheme that addresses this limitation by constructing a low-…

    Submitted 6 September, 2023; originally announced September 2023.

  45. arXiv:2309.02169   

    cs.CV cs.AI

    Dual Relation Alignment for Composed Image Retrieval

    Authors: Xintong Jiang, Yaxiong Wang, Yujiao Wu, Meng Wang, Xueming Qian

    Abstract: Composed image retrieval, a task involving the search for a target image using a reference image and a complementary text as the query, has witnessed significant advancements owing to the progress made in cross-modal modeling. Unlike the general image-text retrieval problem with only one alignment relation, i.e., image-text, we argue for the existence of two types of relations in composed image re…

    Submitted 31 January, 2024; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: The architecture of our model has changed, so the methodology and experiments have changed substantially. We have significantly revised the original manuscript of the paper, so a withdrawal of our original manuscript is needed

  46. arXiv:2308.16825  [pdf, other

    cs.CV

    Coarse-to-Fine Amodal Segmentation with Shape Prior

    Authors: Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

    Abstract: Amodal object segmentation is a challenging task that involves segmenting both visible and occluded parts of an object. In this paper, we propose a novel approach, called Coarse-to-Fine Segmentation (C2F-Seg), that addresses this problem by progressively modeling the amodal segmentation. C2F-Seg initially reduces the learning space from the pixel-level image space to the vector-quantized latent sp…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV 2023

  47. arXiv:2308.10717  [pdf, other

    cs.CV

    Rethinking Person Re-identification from a Projection-on-Prototypes Perspective

    Authors: Qizao Wang, Xuelin Qian, Bin Li, Yanwei Fu, Xiangyang Xue

    Abstract: Person Re-IDentification (Re-ID) as a retrieval task, has achieved tremendous development over the past decade. Existing state-of-the-art methods follow an analogous framework to first extract features from the input images and then categorize them with a classifier. However, since there is no identity overlap between training and testing sets, the classifier is often discarded during inference. O…

    Submitted 21 August, 2023; originally announced August 2023.

  48. Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification

    Authors: Qizao Wang, Xuelin Qian, Bin Li, Xiangyang Xue, Yanwei Fu

    Abstract: Cloth-changing person Re-IDentification (Re-ID) is a particularly challenging task, suffering from two limitations of inferior discriminative features and limited training samples. Existing methods mainly leverage auxiliary information to facilitate identity-relevant feature learning, including soft-biometrics features of shapes or gaits, and additional labels of clothing. However, this informatio…

    Submitted 20 June, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted by IEEE TIFS 2024

  49. arXiv:2308.10087  [pdf, other

    cs.DC cs.AI

    GNNPipe: Scaling Deep GNN Training with Pipelined Model Parallelism

    Authors: Jingji Chen, Zhuoming Chen, Xuehai Qian

    Abstract: Communication is a key bottleneck for distributed graph neural network (GNN) training. This paper proposes GNNPipe, a new approach that scales the distributed full-graph deep GNN training. Being the first to use layer-level model parallelism for GNN training, GNNPipe partitions GNN layers among GPUs, each device performs the computation for a disjoint subset of consecutive GNN layers on the whole…

    Submitted 24 September, 2023; v1 submitted 19 August, 2023; originally announced August 2023.

  50. arXiv:2308.01170  [pdf, other

    cs.LG

    Direct Gradient Temporal Difference Learning

    Authors: Xiaochi Qian, Shangtong Zhang

    Abstract: Off-policy learning enables a reinforcement learning (RL) agent to reason counterfactually about policies that are not executed and is one of the most important ideas in RL. It, however, can lead to instability when combined with function approximation and bootstrapping, two arguably indispensable ingredients for large-scale reinforcement learning. This is the notorious deadly triad. Gradient Temp…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: Submitted to JMLR in Apr 2023