Showing 1–50 of 175 results for author: Ba, Z

  1. arXiv:2407.01284  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG cs.SC

    We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

    Authors: Runqi Qiao, Qiuna Tan, Guanting Dong, Minhui Wu, Chong Sun, Xiaoshuai Song, Zhuoma GongQue, Shanglin Lei, Zhe Wei, Miaoxuan Zhang, Runfeng Qiao, Yifan Zhang, Xiao Zong, Yida Xu, Muxi Diao, Zhimin Bao, Chen Li, Honggang Zhang

    Abstract: Visual mathematical reasoning, as a fundamental visual reasoning ability, has received widespread attention from the Large Multimodal Models (LMMs) community. Existing benchmarks, such as MathVista and MathVerse, focus more on the result-oriented performance but neglect the underlying principles in knowledge acquisition and generalization. Inspired by human-like mathematical reasoning, we introduc…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Work in progress

  2. arXiv:2406.17841  [pdf, other]

    quant-ph cs.AI

    Probing many-body Bell correlation depth with superconducting qubits

    Authors: Ke Wang, Weikang Li, Shibo Xu, Mengyao Hu, Jiachen Chen, Yaozu Wu, Chuanyu Zhang, Feitong Jin, Xuhao Zhu, Yu Gao, Ziqi Tan, Aosai Zhang, Ning Wang, Yiren Zou, Tingting Li, Fanhao Shen, Jiarun Zhong, Zehang Bao, Zitian Zhu, Zixuan Song, Jinfeng Deng, Hang Dong, Xu Zhang, Pengfei Zhang, Wenjie Jiang, et al. (10 additional authors not shown)

    Abstract: Quantum nonlocality describes a stronger form of quantum correlation than that of entanglement. It refutes Einstein's belief of local realism and is among the most distinctive and enigmatic features of quantum mechanics. It is a crucial resource for achieving quantum advantages in a variety of practical applications, ranging from cryptography and certified random number generation via self-testing…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 11 pages, 6 figures + 14 pages, 6 figures

  3. arXiv:2406.16601  [pdf, other]

    cs.CV

    Do As I Do: Pose Guided Human Motion Copy

    Authors: Sifan Wu, Zhenguang Liu, Beibei Zhang, Roger Zimmermann, Zhongjie Ba, Xiaosong Zhang, Kui Ren

    Abstract: Human motion copy is an intriguing yet challenging task in artificial intelligence and computer vision, which strives to generate a fake video of a target person performing the motion of a source person. The problem is inherently challenging due to the subtle human-body texture details to be generated and the temporal consistency to be considered. Existing approaches typically adopt a conventional…

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.03865  [pdf, other]

    cs.CV cs.AI

    Semantic Similarity Score for Measuring Visual Similarity at Semantic Level

    Authors: Senran Fan, Zhicheng Bao, Chen Dong, Haotai Liang, Xiaodong Xu, Ping Zhang

    Abstract: Semantic communication, as a revolutionary communication architecture, is considered a promising novel communication paradigm. Unlike traditional symbol-based error-free communication systems, semantic-based visual communication systems extract, compress, transmit, and reconstruct images at the semantic level. However, widely used image similarity evaluation metrics, whether pixel-based MSE or PSN…

    Submitted 6 June, 2024; originally announced June 2024.

  5. arXiv:2405.04929  [pdf, ps, other]

    cs.IR

    Enabling Roll-up and Drill-down Operations in News Exploration with Knowledge Graphs for Due Diligence and Risk Management

    Authors: Sha Wang, Yuchen Li, Hanhua Xiao, Zhifeng Bao, Lambert Deng, Yanfei Dong

    Abstract: Efficient news exploration is crucial in real-world applications, particularly within the financial sector, where numerous control and risk assessment tasks rely on the analysis of public news reports. The current processes in this domain predominantly rely on manual efforts, often involving keyword-based searches and the compilation of extensive keyword lists. In this paper, we introduce NCEXPLORE…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: The paper was accepted by ICDE 2024

  6. arXiv:2405.03708  [pdf]

    cs.DC cs.DB cs.LG

    Delta Tensor: Efficient Vector and Tensor Storage in Delta Lake

    Authors: Zhiwei Bao, Liu Liao-Liao, Zhiyu Wu, Yifan Zhou, Dan Fan, Michal Aibin, Yvonne Coady, Andrew Brownsword

    Abstract: The exponential growth of artificial intelligence (AI) and machine learning (ML) applications has necessitated the development of efficient storage solutions for vector and tensor data. This paper presents a novel approach for tensor storage in a Lakehouse architecture using Delta Lake. By adopting the multidimensional array storage strategy from array databases and sparse encoding methods to Delt…

    Submitted 13 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  7. arXiv:2405.02335  [pdf, other]

    cs.IT cs.LG

    sDAC -- Semantic Digital Analog Converter for Semantic Communications

    Authors: Zhicheng Bao, Chen Dong, Xiaodong Xu

    Abstract: In this paper, we propose a novel semantic digital analog converter (sDAC) for the compatibility of semantic communications and digital communications. Most of the current semantic communication systems are based on the analog modulations, ignoring their incorporation with digital communication systems, which are more common in practice. In fact, quantization methods in traditional communication s…

    Submitted 26 April, 2024; originally announced May 2024.

  8. arXiv:2404.15878  [pdf, other]

    quant-ph physics.flu-dyn

    Simulating unsteady fluid flows on a superconducting quantum processor

    Authors: Zhaoyuan Meng, Jiarun Zhong, Shibo Xu, Ke Wang, Jiachen Chen, Feitong Jin, Xuhao Zhu, Yu Gao, Yaozu Wu, Chuanyu Zhang, Ning Wang, Yiren Zou, Aosai Zhang, Zhengyi Cui, Fanhao Shen, Zehang Bao, Zitian Zhu, Ziqi Tan, Tingting Li, Pengfei Zhang, Shiying Xiong, Hekang Li, Qiujiang Guo, Zhen Wang, Chao Song, et al. (2 additional authors not shown)

    Abstract: Recent advancements of intermediate-scale quantum processors have triggered tremendous interest in the exploration of practical quantum advantage. The simulation of fluid dynamics, a highly challenging problem in classical physics but vital for practical applications, emerges as a good candidate for showing quantum utility. Here, we report an experiment on the digital simulation of unsteady flows,…

    Submitted 24 April, 2024; originally announced April 2024.

  9. arXiv:2404.14249  [pdf, other]

    cs.CV

    CLIP-GS: CLIP-Informed Gaussian Splatting for Real-time and View-consistent 3D Semantic Understanding

    Authors: Guibiao Liao, Jiankun Li, Zhenyu Bao, Xiaoqing Ye, Jingdong Wang, Qing Li, Kanglin Liu

    Abstract: The recent 3D Gaussian Splatting (GS) exhibits high-quality and real-time synthesis of novel views in 3D scenes. Currently, it primarily focuses on geometry and appearance modeling, while lacking the semantic understanding of scenes. To bridge this gap, we present CLIP-GS, which integrates semantics from Contrastive Language-Image Pre-Training (CLIP) into Gaussian Splatting to efficiently comprehe…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: https://github.com/gbliao/CLIP-GS

  10. Non-Abelian braiding of Fibonacci anyons with a superconducting processor

    Authors: Shibo Xu, Zheng-Zhi Sun, Ke Wang, Hekang Li, Zitian Zhu, Hang Dong, Jinfeng Deng, Xu Zhang, Jiachen Chen, Yaozu Wu, Chuanyu Zhang, Feitong Jin, Xuhao Zhu, Yu Gao, Aosai Zhang, Ning Wang, Yiren Zou, Ziqi Tan, Fanhao Shen, Jiarun Zhong, Zehang Bao, Weikang Li, Wenjie Jiang, Li-Wei Yu, Zixuan Song, et al. (7 additional authors not shown)

    Abstract: Non-Abelian topological orders offer an intriguing path towards fault-tolerant quantum computation, where information can be encoded and manipulated in a topologically protected manner immune to arbitrary local noises and perturbations. However, realizing non-Abelian topologically ordered states is notoriously challenging in both condensed matter and programmable quantum systems, and it was not un…

    Submitted 29 March, 2024; originally announced April 2024.

  11. arXiv:2403.16935  [pdf, other]

    quant-ph

    Measuring Spectral Form Factor in Many-Body Chaotic and Localized Phases of Quantum Processors

    Authors: Hang Dong, Pengfei Zhang, Ceren B. Dag, Yu Gao, Ning Wang, Jinfeng Deng, Xu Zhang, Jiachen Chen, Shibo Xu, Ke Wang, Yaozu Wu, Chuanyu Zhang, Feitong Jin, Xuhao Zhu, Aosai Zhang, Yiren Zou, Ziqi Tan, Zhengyi Cui, Zitian Zhu, Fanhao Shen, Tingting Li, Jiarun Zhong, Zehang Bao, Hekang Li, Zhen Wang, et al. (6 additional authors not shown)

    Abstract: The spectral form factor (SFF) captures universal spectral fluctuations as signatures of quantum chaos, and has been instrumental in advancing multiple frontiers of physics including the studies of black holes and quantum many-body systems. However, the measurement of SFF in many-body systems is challenging due to the difficulty in resolving level spacings that become exponentially small with incr…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 12 pages, 9 figures

  12. arXiv:2403.12542  [pdf, ps, other]

    math.OC

    Attitude Tracking of Uncertain Flexible Spacecraft Systems Subject to Unknown External Disturbances

    Authors: Zean Bao, Maobin Lu, Fang Deng, Jie Chen

    Abstract: In this paper, we investigate the attitude tracking problem of uncertain flexible spacecraft systems subject to external disturbances. In sharp contrast to existing results, the dynamics of flexible spacecraft systems and external disturbances are allowed to be unknown. To deal with the challenges by these unknown factors, we develop a class of nonlinear internal models which converts the attitude…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 8 pages, 2 figures, submitted to TAC on 6 Dec. 2023

  13. arXiv:2403.11495  [pdf, other]

    cs.LG cs.AI

    Semantic-Enhanced Representation Learning for Road Networks with Temporal Dynamics

    Authors: Yile Chen, Xiucheng Li, Gao Cong, Zhifeng Bao, Cheng Long

    Abstract: In this study, we introduce a novel framework called Toast for learning general-purpose representations of road networks, along with its advanced counterpart DyToast, designed to enhance the integration of temporal dynamics to boost the performance of various time-sensitive downstream tasks. Specifically, we propose to encode two pivotal semantic characteristics intrinsic to road networks: traffic…

    Submitted 18 March, 2024; originally announced March 2024.

  14. arXiv:2403.01786  [pdf, other]

    cs.CV cs.IT

    Exposing the Deception: Uncovering More Forgery Clues for Deepfake Detection

    Authors: Zhongjie Ba, Qingyu Liu, Zhenguang Liu, Shuang Wu, Feng Lin, Li Lu, Kui Ren

    Abstract: Deepfake technology has given rise to a spectrum of novel and compelling applications. Unfortunately, the widespread proliferation of high-fidelity fake videos has led to pervasive confusion and deception, shattering our faith that seeing is believing. One aspect that has been overlooked so far is that current deepfake detection approaches may easily fall into the trap of overfitting, focusing onl…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: AAAI2024

  15. arXiv:2402.05027  [pdf, other]

    cs.MA cs.AI

    Towards Generalizability of Multi-Agent Reinforcement Learning in Graphs with Recurrent Message Passing

    Authors: Jannis Weil, Zhenghua Bao, Osama Abboud, Tobias Meuser

    Abstract: Graph-based environments pose unique challenges to multi-agent reinforcement learning. In decentralized approaches, agents operate within a given graph and make decisions based on partial or outdated observations. The size of the observed neighborhood limits the generalizability to different graphs and affects the reactivity of agents, the quality of the selected actions, and the communication ove…

    Submitted 4 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted at AAMAS 2024, version with appendix; corrected typo in equation (1)

  16. arXiv:2402.04648  [pdf, other]

    cs.CV

    OV-NeRF: Open-vocabulary Neural Radiance Fields with Vision and Language Foundation Models for 3D Semantic Understanding

    Authors: Guibiao Liao, Kaichen Zhou, Zhenyu Bao, Kanglin Liu, Qing Li

    Abstract: The development of Neural Radiance Fields (NeRFs) has provided a potent representation for encapsulating the geometric and appearance characteristics of 3D scenes. Enhancing the capabilities of NeRFs in open-vocabulary 3D semantic perception tasks has been a recent focus. However, current methods that extract semantics directly from Contrastive Language-Image Pretraining (CLIP) for semantic field…

    Submitted 7 February, 2024; originally announced February 2024.

  17. arXiv:2402.00936  [pdf, other]

    quant-ph cond-mat.supr-con

    Enhanced quantum state transfer: Circumventing quantum chaotic behavior

    Authors: Liang Xiang, Jiachen Chen, Zitian Zhu, Zixuan Song, Zehang Bao, Xuhao Zhu, Feitong Jin, Ke Wang, Shibo Xu, Yiren Zou, Hekang Li, Zhen Wang, Chao Song, Alexander Yue, Justine Partridge, Qiujiang Guo, Rubem Mondaini, H. Wang, Richard T. Scalettar

    Abstract: The ability to realize high-fidelity quantum communication is one of the many facets required to build generic quantum computing devices. In addition to quantum processing, sensing, and storage, transferring the resulting quantum states demands a careful design that finds no parallel in classical communication. Existing experimental demonstrations of quantum information transfer in solid-state qua…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 10 pages, 4 figures (main text); 14 pages, 20 figures (supplementary materials)

  18. arXiv:2401.15704  [pdf, other]

    cs.CR cs.SD eess.AS

    Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege

    Authors: Peng Huang, Yao Wei, Peng Cheng, Zhongjie Ba, Li Lu, Feng Lin, Yang Wang, Kui Ren

    Abstract: The widespread smart devices raise people's concerns of being eavesdropped on. To enhance voice privacy, recent studies exploit the nonlinearity in microphone to jam audio recorders with inaudible ultrasound. However, existing solutions solely rely on energetic masking. Their simple-form noise leads to several problems, such as high energy requirements and being easily removed by speech enhancemen…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: 14 pages, 28 figures; submitted to IEEE TDSC

  19. arXiv:2401.14726  [pdf, other]

    cs.CV cs.GR

    3D Reconstruction and New View Synthesis of Indoor Environments based on a Dual Neural Radiance Field

    Authors: Zhenyu Bao, Guibiao Liao, Zhongyuan Zhao, Kanglin Liu, Qing Li, Guoping Qiu

    Abstract: Simultaneously achieving 3D reconstruction and new view synthesis for indoor environments has widespread applications but is technically very challenging. State-of-the-art methods based on implicit neural functions can achieve excellent 3D reconstruction results, but their performances on new view synthesis can be unsatisfactory. The exciting development of neural radiance field (NeRF) has revolut…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: 20 pages, 8 figures

  20. arXiv:2401.12900  [pdf, other]

    cs.GR cs.CV

    PSAvatar: A Point-based Shape Model for Real-Time Head Avatar Animation with 3D Gaussian Splatting

    Authors: Zhongyuan Zhao, Zhenyu Bao, Qing Li, Guoping Qiu, Kanglin Liu

    Abstract: Despite much progress, achieving real-time high-fidelity head avatar animation is still difficult and existing methods have to trade-off between speed and quality. 3DMM based methods often fail to model non-facial structures such as eyeglasses and hairstyles, while neural implicit models suffer from deformation inflexibility and rendering inefficiency. Although 3D Gaussian has been demonstrated to…

    Submitted 23 June, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: 13 pages, 10 figures

  21. arXiv:2401.11087  [pdf, other]

    cond-mat.soft cond-mat.mtrl-sci

    Network evolution controlling strain-induced damage and self-healing of elastomers with dynamic bonds

    Authors: Yikai Yin, Shaswat Mohanty, Christopher B. Cooper, Zhenan Bao, Wei Cai

    Abstract: Highly stretchable and self-healable supramolecular elastomers are promising materials for future soft electronics, biomimetic systems, and smart textiles, due to their dynamic cross-linking bonds. The dynamic or reversible nature of the cross-links gives rise to interesting macroscopic responses in these materials such as self-healing and rapid stress-relaxation. However, the relationship between…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 17 pages, 7 figures

  22. arXiv:2401.08284  [pdf, other]

    quant-ph cond-mat.stat-mech cond-mat.str-el

    Schrödinger cats growing up to 60 qubits and dancing in a cat scar enforced discrete time crystal

    Authors: Zehang Bao, Shibo Xu, Zixuan Song, Ke Wang, Liang Xiang, Zitian Zhu, Jiachen Chen, Feitong Jin, Xuhao Zhu, Yu Gao, Yaozu Wu, Chuanyu Zhang, Ning Wang, Yiren Zou, Ziqi Tan, Aosai Zhang, Zhengyi Cui, Fanhao Shen, Jiarun Zhong, Tingting Li, Jinfeng Deng, Xu Zhang, Hang Dong, Pengfei Zhang, Yang-Ren Liu, et al. (8 additional authors not shown)

    Abstract: Greenberger-Horne-Zeilinger (GHZ) states, as maximally entangled Schrödinger cat states, play vital roles in the foundations of quantum physics and technology, but creating and preserving these fragile states pose tremendous challenges. Discrete time crystals (DTCs), originally aimed at exploring exotic nonequilibrium quantum matters, have raised significant scientific interest, but whether this b…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 7 pages, 4 figures + supplementary information

  23. arXiv:2401.04333  [pdf, other]

    quant-ph cond-mat.dis-nn cond-mat.supr-con

    Long-lived topological time-crystalline order on a quantum processor

    Authors: Liang Xiang, Wenjie Jiang, Zehang Bao, Zixuan Song, Shibo Xu, Ke Wang, Jiachen Chen, Feitong Jin, Xuhao Zhu, Zitian Zhu, Fanhao Shen, Ning Wang, Chuanyu Zhang, Yaozu Wu, Yiren Zou, Jiarun Zhong, Zhengyi Cui, Aosai Zhang, Ziqi Tan, Tingting Li, Yu Gao, Jinfeng Deng, Xu Zhang, Hang Dong, Pengfei Zhang, et al. (16 additional authors not shown)

    Abstract: Topologically ordered phases of matter elude Landau's symmetry-breaking theory, featuring a variety of intriguing properties such as long-range entanglement and intrinsic robustness against local perturbations. Their extension to periodically driven systems gives rise to exotic new phenomena that are forbidden in thermal equilibrium. Here, we report the observation of signatures of such a phenomen…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: 8 pages (main text), 16 pages (supplementary information)

  24. arXiv:2401.00659  [pdf, ps, other]

    cs.DB

    Advanced Dataset Discovery: When Multi-Query-Dataset Cardinality Estimation Matters

    Authors: Tingting Wang, Shixun Huang, Zhifeng Bao, J. Shane Culpepper, Reza Arablouei, Volkan Dedeoglu

    Abstract: As available data increases, so too does the demand to dataset discovery. Existing studies often yield coarse-grained results where significant information overlaps and non-relevant data occur. They also implicitly assume that a user can purchase all datasets found, which is rarely true in practice. Therefore, achieving dataset discovery results with less redundancy using fine-grained information…

    Submitted 31 December, 2023; originally announced January 2024.

  25. arXiv:2312.16057  [pdf, other]

    cs.IT eess.SP

    Semantic Importance-Aware Based for Multi-User Communication Over MIMO Fading Channels

    Authors: Haotai Liang, Zhicheng Bao, Wannian An, Chen Dong, Xiaodong Xu

    Abstract: Semantic communication, as a novel communication paradigm, has attracted the interest of many scholars, with multi-user, multi-input multi-output (MIMO) scenarios being one of the critical contexts. This paper presents a semantic importance-aware based communication system (SIA-SC) over MIMO Rayleigh fading channels. Combining the semantic symbols' inequality and the equivalent subchannels of MIMO…

    Submitted 26 December, 2023; originally announced December 2023.

  26. arXiv:2312.14901  [pdf, other]

    quant-ph

    Ancilla-Assisted Process Tomography with Bipartiete Mixed Separable States

    Authors: Zhuoran Bao, Daniel F. V. James

    Abstract: It has been shown that the entanglement between the system state and the ancillary state is not a strict requirement for performing ancilla-assisted process tomography (AAPT). Instead, it only requires that the system-ancilla state be faithful, which, in practice, is the invertibility of a certain matrix representing the state. Our paper takes on the operational definition of faithfulness, i.e., a…

    Submitted 20 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: 6 pages

  27. arXiv:2312.09708  [pdf, other]

    cs.LG cs.AI

    GraphRARE: Reinforcement Learning Enhanced Graph Neural Network with Relative Entropy

    Authors: Tianhao Peng, Wenjun Wu, Haitao Yuan, Zhifeng Bao, Zhao Pengrui, Xin Yu, Xuetao Lin, Yu Liang, Yanjun Pu

    Abstract: Graph neural networks (GNNs) have shown advantages in graph-based analysis tasks. However, most existing methods have the homogeneity assumption and show poor performance on heterophilic graphs, where the linked nodes have dissimilar features and different class labels, and the semantically related nodes might be multi-hop away. To address this limitation, this paper presents GraphRARE, a general…

    Submitted 13 April, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 14 pages, 7 figures

  28. arXiv:2312.06712  [pdf, other]

    cs.CV cs.AI

    Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models

    Authors: Zhipeng Bao, Yijun Li, Krishna Kumar Singh, Yu-Xiong Wang, Martial Hebert

    Abstract: Despite recent significant strides achieved by diffusion-based Text-to-Image (T2I) models, current systems are still less capable of ensuring decent compositional generation aligned with text prompts, particularly for the multi-object generation. This work illuminates the fundamental reasons for such misalignment, pinpointing issues related to low attention activation scores and mask overlaps. Whi…

    Submitted 31 January, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  29. arXiv:2312.05911  [pdf, ps, other]

    math.ST cs.IT math.PR

    A leave-one-out approach to approximate message passing

    Authors: Zhigang Bao, Qiyang Han, Xiaocong Xu

    Abstract: Approximate message passing (AMP) has emerged both as a popular class of iterative algorithms and as a powerful analytic tool in a wide range of statistical estimation problems and statistical physics models. A well established line of AMP theory proves Gaussian approximations for the empirical distributions of the AMP iterate in the high dimensional limit, under the GOE random matrix model and it…

    Submitted 25 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

  30. arXiv:2311.14321  [pdf, other]

    quant-ph

    Hypercontractivity for Quantum Erasure Channels via Multipartite Log-Sobolev Inequality

    Authors: Zongbo Bao, Yangjing Dong, Fengning Ou, Penghui Yao

    Abstract: We prove an almost optimal hypercontractive inequality for quantum erasure channels, generalizing the hypercontractivity for classical binary erasure channels [NW16]. To our knowledge, this is the first hypercontractivity bound for non-unital quantum channels. The traditional inductive arguments for classical hypercontractivity cannot be generalized to the quantum setting due to the nature of non-…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 39 pages, 2 figures

  31. arXiv:2311.10492  [pdf, other]

    cs.CV

    A Relay System for Semantic Image Transmission based on Shared Feature Extraction and Hyperprior Entropy Compression

    Authors: Wannian An, Zhicheng Bao, Haotai Liang, Chen Dong, Xiaodong

    Abstract: Nowadays, the need for high-quality image reconstruction and restoration is more and more urgent. However, most image transmission systems may suffer from image quality degradation or transmission interruption in the face of interference such as channel noise and link fading. To solve this problem, a relay communication network for semantic image transmission based on shared feature extraction and…

    Submitted 17 November, 2023; originally announced November 2023.

  32. arXiv:2310.15696  [pdf]

    cond-mat.mtrl-sci cond-mat.other

    Strain-driven switching between antiferromagnetic states in frustrated antiferromagnet UO2 probed by exchange bias effect

    Authors: E. A. Tereshina-Chitrova, L. V. Pourovskii, S. Khmelevskyi, L. Horak, Z. Bao, A. Mackova, P. Malinsky, T. Gouder, R. Caciuffo

    Abstract: Frustrated antiferromagnets offer a captivating platform to study the intricate relationship of magnetic interactions, geometric constraints, and emergent phenomena. By controlling spin orientations, these materials can be tailored for applications in spintronics and quantum information processing. The research focuses on the interplay of magnetic and exchange anisotropy effects in artificial hete…

    Submitted 24 October, 2023; originally announced October 2023.

  33. arXiv:2310.13424  [pdf, other]

    cs.CR cs.AI cs.DC cs.LG

    FLTracer: Accurate Poisoning Attack Provenance in Federated Learning

    Authors: Xinyu Zhang, Qingyu Liu, Zhongjie Ba, Yuan Hong, Tianhang Zheng, Feng Lin, Li Lu, Kui Ren

    Abstract: Federated Learning (FL) is a promising distributed learning approach that enables multiple clients to collaboratively train a shared global model. However, recent studies show that FL is vulnerable to various poisoning attacks, which can degrade the performance of global models or introduce backdoors into them. In this paper, we first conduct a comprehensive study on prior FL attacks and detection…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 18 pages, 27 figures

  34. arXiv:2309.17450  [pdf, other]

    cs.CV

    Multi-task View Synthesis with Neural Radiance Fields

    Authors: Shuhong Zheng, Zhipeng Bao, Martial Hebert, Yu-Xiong Wang

    Abstract: Multi-task visual learning is a critical aspect of computer vision. Current research, however, predominantly concentrates on the multi-task dense prediction setting, which overlooks the intrinsic 3D world and its multi-view consistent structures, and lacks the capability for versatile imagination. In response to these limitations, we present a novel problem setting -- multi-task view synthesis (MT…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: ICCV 2023, Website: https://zsh2000.github.io/mtvs.github.io/

  35. arXiv:2309.14122  [pdf, other]

    cs.CV cs.CR

    SurrogatePrompt: Bypassing the Safety Filter of Text-To-Image Models via Substitution

    Authors: Zhongjie Ba, Jieming Zhong, Jiachen Lei, Peng Cheng, Qinglong Wang, Zhan Qin, Zhibo Wang, Kui Ren

    Abstract: Advanced text-to-image models such as DALL-E 2 and Midjourney possess the capacity to generate highly realistic images, raising significant concerns regarding the potential proliferation of unsafe content. This includes adult, violent, or deceptive imagery of political figures. Despite claims of rigorous safety mechanisms implemented in these models to restrict the generation of not-safe-for-work…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: 14 pages, 11 figures

  36. arXiv:2309.11131  [pdf, other]

    cs.CV

    Locate and Verify: A Two-Stream Network for Improved Deepfake Detection

    Authors: Chao Shuai, Jieming Zhong, Shuang Wu, Feng Lin, Zhibo Wang, Zhongjie Ba, Zhenguang Liu, Lorenzo Cavallaro, Kui Ren

    Abstract: Deepfake has taken the world by storm, triggering a trust crisis. Current deepfake detection methods are typically inadequate in generalizability, with a tendency to overfit to image contents such as the background, which are frequently occurring but relatively unimportant in the training dataset. Furthermore, current methods heavily rely on a few dominant forgery regions and may ignore other equa…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: 10 pages, 8 figures, 60 references. This paper has been accepted for ACM MM 2023

  37. arXiv:2309.10979  [pdf, other]

    cs.LG

    Towards Data-centric Graph Machine Learning: Review and Outlook

    Authors: Xin Zheng, Yixin Liu, Zhifeng Bao, Meng Fang, Xia Hu, Alan Wee-Chung Liew, Shirui Pan

    Abstract: Data-centric AI, with its primary focus on the collection, management, and utilization of data to drive AI models and applications, has attracted increasing attention in recent years. In this article, we conduct an in-depth and comprehensive review, offering a forward-looking outlook on the current efforts in data-centric AI pertaining to graph data-the fundamental data structure for representing…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 42 pages, 9 figures

  38. arXiv:2309.09526  [pdf, other]

    cs.CV cs.AI

    DFIL: Deepfake Incremental Learning by Exploiting Domain-invariant Forgery Clues

    Authors: Kun Pan, Yin Yifang, Yao Wei, Feng Lin, Zhongjie Ba, Zhenguang Liu, ZhiBo Wang, Lorenzo Cavallaro, Kui Ren

    Abstract: The malicious use and widespread dissemination of deepfake pose a significant crisis of trust. Current deepfake detection models can generally recognize forgery images by training on a large dataset. However, the accuracy of detection models degrades significantly on images generated by new deepfake methods due to the difference in data distribution. To tackle this issue, we present a novel increm…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted by ACMMM2023

  39. arXiv:2308.14346  [pdf, other]

    cs.CL cs.AI

    DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation

    Authors: Zhijie Bao, Wei Chen, Shengze Xiao, Kuang Ren, Jiaao Wu, Cheng Zhong, Jiajie Peng, Xuanjing Huang, Zhongyu Wei

    Abstract: We propose DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical response in end-to-end conversational healthcare services. To construct high-quality Supervised Fine-Tuning (SFT) datasets, we employ three strategies: utilizing medical knowledge-graphs, reconstructing real-world dialogues, and incorporating human-guided preference…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Work in progress

  40. arXiv:2308.09581  [pdf, ps, other]

    math.PR math-ph math.ST

    Phase transition for the smallest eigenvalue of covariance matrices

    Authors: Zhigang Bao, Jaehun Lee, Xiaocong Xu

    Abstract: In this paper, we study the smallest non-zero eigenvalue of the sample covariance matrices $\mathcal{S}(Y)=YY^*$, where $Y=(y_{ij})$ is an $M\times N$ matrix with iid mean $0$ variance $N^{-1}$ entries. We prove a phase transition for its distribution, induced by the fatness of the tail of $y_{ij}$'s. More specifically, we assume that $y_{ij}$ is symmetrically distributed with tail probability…

    Submitted 8 November, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: Typos in equations (1.13) and (2.3) have been corrected

  41. arXiv:2308.05338  [pdf, other]

    cs.CE

    MDVSC -- Wireless Model Division Video Semantic Communication

    Authors: Zhicheng Bao, Haotai Liang, Chen Dong, Cong Li, Xiaodong Xu, Ping Zhang

    Abstract: This paper introduces a novel method for transmitting video data over noisy wireless channels with high efficiency and controllability. The method derivates from model division multiple access (MDMA) to extract common semantic features from video frames. It also uses deep joint source-channel coding (JSCC) as the main framework to establish communication links and deal with channel noise. An entro…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: arXiv admin note: text overlap with arXiv:2305.15799

  42. Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks

    Authors: Xinyu Zhang, Hanbin Hong, Yuan Hong, Peng Huang, Binghui Wang, Zhongjie Ba, Kui Ren

    Abstract: The language models, especially the basic text classification models, have been shown to be susceptible to textual adversarial attacks such as synonym substitution and word insertion attacks. To defend against such attacks, a growing body of research has been devoted to improving the model robustness. However, providing provable robustness guarantees instead of empirical robustness is still widely…

    Submitted 11 June, 2024; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: Published in the 2024 IEEE Symposium on Security and Privacy (SP)

  43. arXiv:2307.10129  [pdf, other]

    cs.CV

    General vs. Long-Tailed Age Estimation: An Approach to Kill Two Birds with One Stone

    Authors: Zenghao Bao, Zichang Tan, Jun Li, Jun Wan, Xibo Ma, Zhen Lei

    Abstract: Facial age estimation has received a lot of attention for its diverse application scenarios. Most existing studies treat each sample equally and aim to reduce the average estimation error for the entire dataset, which can be summarized as General Age Estimation. However, due to the long-tailed distribution prevalent in the dataset, treating all samples equally will inevitably bias the model toward…

    Submitted 19 July, 2023; originally announced July 2023.

  44. arXiv:2307.07909  [pdf, other]

    cs.AI

    Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training

    Authors: Yao Wei, Yanchao Sun, Ruijie Zheng, Sai Vemprala, Rogerio Bonatti, Shuhang Chen, Ratnesh Madaan, Zhongjie Ba, Ashish Kapoor, Shuang Ma

    Abstract: We introduce DualMind, a generalist agent designed to tackle various decision-making tasks that addresses challenges posed by current methods, such as overfitting behaviors and dependence on task-specific fine-tuning. DualMind uses a novel "Dual-phase" training strategy that emulates how humans learn to act in the world. The model first learns fundamental common knowledge through a self-supervised…

    Submitted 9 October, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

  45. arXiv:2307.06027  [pdf, other]

    cs.MM

    Semantic Communications System with Model Division Multiple Access and Controllable Coding Rate for Point Cloud

    Authors: Xiaoyi Liu, Haotai Liang, Zhicheng Bao, Chen Dong, Xiaodong Xu

    Abstract: Point cloud, as a 3D representation, is widely used in autonomous driving, virtual reality (VR), and augmented reality (AR). However, traditional communication systems think that the point cloud's semantic information is irrelevant to communication, which hinders the efficient transmission of point clouds in the era of artificial intelligence (AI). This paper proposes a point cloud based semantic…

    Submitted 12 July, 2023; originally announced July 2023.

  46. arXiv:2306.11363  [pdf, other]

    cs.CV cs.AI cs.LG

    Masked Diffusion Models Are Fast Distribution Learners

    Authors: Jiachen Lei, Qinglong Wang, Peng Cheng, Zhongjie Ba, Zhan Qin, Zhibo Wang, Zhenguang Liu, Kui Ren

    Abstract: Diffusion model has emerged as the \emph{de-facto} model for image generation, yet the heavy training overhead hinders its broader adoption in the research community. We observe that diffusion models are commonly trained to learn all fine-grained visual information from scratch. This paradigm may cause unnecessary training costs hence requiring in-depth investigation. In this work, we show that it…

    Submitted 27 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

  47. arXiv:2306.05239  [pdf, other]

    cs.CV cs.NE

    Point-Voxel Absorbing Graph Representation Learning for Event Stream based Recognition

    Authors: Bo Jiang, Chengguo Yuan, Xiao Wang, Zhimin Bao, Lin Zhu, Yonghong Tian, Jin Tang

    Abstract: Sampled point and voxel methods are usually employed to downsample the dense events into sparse ones. After that, one popular way is to leverage a graph model which treats the sparse points/voxels as nodes and adopts graph neural networks (GNNs) to learn the representation of event data. Although good performance can be obtained, however, their results are still limited mainly due to two issues. (…

    Submitted 29 July, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: In Peer Review

  48. arXiv:2306.02604  [pdf]

    cs.DB

    A Simple Yet High-Performing On-disk Learned Index: Can We Have Our Cake and Eat it Too?

    Authors: Hai Lan, Zhifeng Bao, J. Shane Culpepper, Renata Borovica-Gajic, Yu Dong

    Abstract: While in-memory learned indexes have shown promising performance as compared to B+-tree, most widely used databases in real applications still rely on disk-based operations. Based on our experiments, we observe that directly applying the existing learned indexes on disk suffers from several drawbacks and cannot outperform a standard B+-tree in most cases. Therefore, in this work we make the first…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 14 pages

  49. arXiv:2305.19580  [pdf]

    cond-mat.mtrl-sci physics.chem-ph

    Monofluorinated Ether Electrolyte with Acetal Backbone for High-Performance Lithium Metal Batteries

    Authors: Elizabeth Zhang, Yuelang Chen, Zhiao Yu, Yi Cui, Zhenan Bao

    Abstract: High degree of fluorination for ether electrolytes has resulted in improved cycling stability of lithium metal batteries (LMBs) due to stable SEI formation and good oxidative stability. However, the sluggish ion transport and environmental concerns of high fluorination degree drives the need to develop less fluorinated structures. Here, we introduce bis(2-fluoroethoxy)methane (F2DEM) which feature…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 19 pages, 6 figures

  50. arXiv:2305.15799  [pdf, other]

    cs.MM

    MDVSC -- Wireless Model Division Video Semantic Communication

    Authors: Zhicheng Bao, Haotai Liang, Chen Dong, Xiaodong Xu, Geng Liu

    Abstract: In this paper, we propose a new wireless video communication scheme to achieve high-efficiency video transmission over noisy channels. It exploits the idea of model division multiple access (MDMA) and extracts common semantic features across video frames. Besides, deep joint source-channel coding (JSCC) is applied to overcome the distortion caused by noisy channels. The proposed framework is colle…

    Submitted 25 May, 2023; originally announced May 2023.