
Showing 1–50 of 555 results for author: Liu, Z

Searching in archive eess.
  1. arXiv:2406.19464  [pdf, other]

    cs.RO cs.AI cs.CV cs.SD eess.AS

    ManiWAV: Learning Robot Manipulation from In-the-Wild Audio-Visual Data

    Authors: Zeyi Liu, Cheng Chi, Eric Cousineau, Naveen Kuppuswamy, Benjamin Burchfiel, Shuran Song

    Abstract: Audio signals provide rich information for the robot interaction and object properties through contact. This information can surprisingly ease the learning of contact-rich robot manipulation skills, especially when the visual information alone is ambiguous or incomplete. However, the usage of audio data in robot manipulation has been constrained to teleoperated demonstrations collected by either…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.18555  [pdf]

    eess.IV cs.CV

    Using a Convolutional Neural Network and Explainable AI to Diagnose Dementia Based on MRI Scans

    Authors: Tyler Morris, Ziming Liu, Longjian Liu, Xiaopeng Zhao

    Abstract: As the number of dementia patients rises, the need for accurate diagnostic procedures rises as well. Current methods, like using an MRI scan, rely on human input, which can be inaccurate. However, the decision logic behind machine learning algorithms and their outputs cannot be explained, as most operate in black-box models. Therefore, to increase the accuracy of diagnosing dementia through MRIs,…

    Submitted 25 May, 2024; originally announced June 2024.

    Comments: 4 pages, 4 figures

  3. arXiv:2406.18069  [pdf, other]

    eess.SP cs.AI cs.CL

    Large Language Models for Cuffless Blood Pressure Measurement From Wearable Biosignals

    Authors: Zengding Liu, Chen Chen, Jiannong Cao, Minglei Pan, Jikui Liu, Nan Li, Fen Miao, Ye Li

    Abstract: Large language models (LLMs) have captured significant interest from both academia and industry due to their impressive performance across various textual tasks. However, the potential of LLMs to analyze physiological time-series data remains an emerging research field. Particularly, there is a notable gap in the utilization of LLMs for analyzing wearable biosignals to achieve cuffless blood press…

    Submitted 26 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  4. arXiv:2406.16933  [pdf, other]

    eess.SP cs.AI

    SGSM: A Foundation-model-like Semi-generalist Sensing Model

    Authors: Tianjian Yang, Hao Zhou, Shuo Liu, Kaiwen Guo, Yiwen Hou, Haohua Du, Zhi Liu, Xiang-Yang Li

    Abstract: The significance of intelligent sensing systems is growing in the realm of smart services. These systems extract relevant signal features and generate informative representations for particular tasks. However, building the feature extraction component for such systems requires extensive domain-specific expertise or data. The exceptionally rapid development of foundation models is likely to usher i…

    Submitted 15 June, 2024; originally announced June 2024.

  5. arXiv:2406.16020  [pdf, other]

    cs.SD cs.CL eess.AS

    AudioBench: A Universal Benchmark for Audio Large Language Models

    Authors: Bin Wang, Xunlong Zou, Geyu Lin, Shuo Sun, Zhuohan Liu, Wenyu Zhang, Zhengyuan Liu, AiTi Aw, Nancy F. Chen

    Abstract: We introduce AudioBench, a new benchmark designed to evaluate audio large language models (AudioLLMs). AudioBench encompasses 8 distinct tasks and 26 carefully selected or newly curated datasets, focusing on speech understanding, voice interpretation, and audio scene understanding. Despite the rapid advancement of large language models, including multimodal versions, a significant gap exists in co…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: 20 pages; v2 - typo update; Code: https://github.com/AudioLLMs/AudioBench

  6. arXiv:2406.15160  [pdf, other]

    eess.AS eess.SP

    Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios

    Authors: Ya Jiang, Qing Wang, Jun Du, Maocheng Hu, Pengfei Hu, Zeyan Liu, Shi Cheng, Zhaoxu Nian, Yuxuan Dong, Mingqi Cai, Xin Fang, Chin-Hui Lee

    Abstract: This study presents an audio-visual information fusion approach to sound event localization and detection (SELD) in low-resource scenarios. We aim at utilizing audio and video modality information through cross-modal learning and multi-modal fusion. First, we propose a cross-modal teacher-student learning (TSL) framework to transfer information from an audio-only teacher model, trained on a rich c…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: accepted by ICME 2024

  7. arXiv:2406.08523  [pdf, other]

    eess.IV

    A Plug-and-Play Untrained Neural Network for Full Waveform Inversion in Reconstructing Sound Speed Images of Ultrasound Computed Tomography

    Authors: Weicheng Yan, Qiude Zhang, Yun Wu, Zhaohui Liu, Liang Zhou, Mingyue Ding, Ming Yuchi, Wu Qiu

    Abstract: Ultrasound computed tomography (USCT), as an emerging technology, can provide multiple quantitative parametric images of human tissue, such as sound speed and attenuation images, distinguishing it from conventional B-mode (reflection) ultrasound imaging. Full waveform inversion (FWI) is acknowledged as a technique with the greatest potential for reconstructing high-resolution sound speed images in…

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  8. arXiv:2406.08266  [pdf, other]

    eess.AS cs.SD

    Refining Self-Supervised Learnt Speech Representation using Brain Activations

    Authors: Hengyu Li, Kangdi Mei, Zhaoci Liu, Yang Ai, Li** Chen, Jie Zhang, Zhenhua Ling

    Abstract: It was shown in literature that speech representations extracted by self-supervised pre-trained models exhibit similarities with brain activations of human for speech perception and fine-tuning speech representation models on downstream tasks can further improve the similarity. However, it still remains unclear if this similarity can be used to optimize the pre-trained speech models. In this work,…

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: accepted by Interspeech 2024

  9. arXiv:2406.07854  [pdf, other]

    cs.SD cs.MM eess.AS

    Zero-Shot Fake Video Detection by Audio-Visual Consistency

    Authors: Xiaolou Li, Zehua Liu, Chen Chen, Lantian Li, Li Guo, Dong Wang

    Abstract: Recent studies have advocated the detection of fake videos as a one-class detection task, predicated on the hypothesis that the consistency between audio and visual modalities of genuine data is more significant than that of fake data. This methodology, which solely relies on genuine audio-visual data while negating the need for forged counterparts, is thus delineated as a 'zero-shot' detection pa…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: to be published in INTERSPEECH 2024

  10. arXiv:2406.05551  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    Autoregressive Diffusion Transformer for Text-to-Speech Synthesis

    Authors: Zhijun Liu, Shuai Wang, Sho Inoue, Qibing Bai, Haizhou Li

    Abstract: Audio language models have recently emerged as a promising approach for various audio generation tasks, relying on audio tokenizers to encode waveforms into sequences of discrete symbols. Audio tokenization often poses a necessary compromise between code bitrate and reconstruction accuracy. When dealing with low-bitrate audio codes, language models are constrained to process only a subset of the i…

    Submitted 8 June, 2024; originally announced June 2024.

  11. arXiv:2406.05170  [pdf]

    q-bio.OT cs.CV eess.IV

    Research on Tumors Segmentation based on Image Enhancement Method

    Authors: Danyi Huang, Ziang Liu, Yizhou Li

    Abstract: One of the most effective ways to treat liver cancer is to perform precise liver resection surgery, the key step of which includes precise digital image segmentation of the liver and its tumor. However, traditional liver parenchymal segmentation techniques often face several challenges in performing liver segmentation: lack of precision, slow processing speed, and computational burden. These short…

    Submitted 7 June, 2024; originally announced June 2024.

  12. arXiv:2406.04762  [pdf, other]

    eess.SP

    Holographic Intelligence Surface Assisted Integrated Sensing and Communication

    Authors: Zhuoyang Liu, Yuchen Zhang, Haiyang Zhang, Feng Xu, Yonina C. Eldar

    Abstract: Traditional discrete-array-based systems fail to exploit interactions between closely spaced antennas, resulting in inadequate utilization of the aperture resource. In this paper, we propose a holographic intelligence surface (HIS) assisted integrated sensing and communication (HISAC) system, wherein both the transmitter and receiver are fabricated using a continuous-aperture array. A continuous-d…

    Submitted 7 June, 2024; originally announced June 2024.

  13. arXiv:2406.04679  [pdf, other]

    eess.IV cs.CV

    XctDiff: Reconstruction of CT Images with Consistent Anatomical Structures from a Single Radiographic Projection Image

    Authors: Qingze Bai, Tiange Liu, Zhi Liu, Yubing Tong, Drew Torigian, Jayaram Udupa

    Abstract: In this paper, we present XctDiff, an algorithm framework for reconstructing CT from a single radiograph, which decomposes the reconstruction process into two easily controllable tasks: feature extraction and CT reconstruction. Specifically, we first design a progressive feature extraction strategy that is able to extract robust 3D priors from radiographs. Then, we use the extracted prior informat…

    Submitted 13 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  14. arXiv:2406.02430  [pdf, other]

    eess.AS cs.SD

    Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

    Authors: Philip Anastassiou, Jiawei Chen, Jitong Chen, Yuanzhe Chen, Zhuo Chen, Ziyi Chen, Jian Cong, Lelai Deng, Chuang Ding, Lu Gao, Mingqing Gong, Peisong Huang, Qingqing Huang, Zhiying Huang, Yuanyuan Huo, Dongya Jia, Chumin Li, Feiya Li, Hui Li, Jiaxin Li, Xiaoyang Li, Xingxing Li, Lin Liu, Shouda Liu, Sichao Liu, et al. (21 additional authors not shown)

    Abstract: We introduce Seed-TTS, a family of large-scale autoregressive text-to-speech (TTS) models capable of generating speech that is virtually indistinguishable from human speech. Seed-TTS serves as a foundation model for speech generation and excels in speech in-context learning, achieving performance in speaker similarity and naturalness that matches ground truth human speech in both objective and sub…

    Submitted 4 June, 2024; originally announced June 2024.

  15. arXiv:2406.01153  [pdf, other]

    eess.SY

    Safety-Critical Control of Euler-Lagrange Systems Subject to Multiple Obstacles and Velocity Constraints

    Authors: Zhi Liu, Si Wu, Tengfei Liu, Zhong-** Jiang

    Abstract: This paper studies the safety-critical control problem for Euler-Lagrange (EL) systems subject to multiple ball obstacles and velocity constraints in accordance with affordable velocity ranges. A key strategy is to exploit the underlying inner-outer-loop structure for the design of a new cascade controller for the class of EL systems. In particular, the outer-loop controller is developed based on…

    Submitted 3 June, 2024; originally announced June 2024.

  16. arXiv:2405.20733  [pdf, other]

    eess.SY

    Dynamic Microgrid Formation Considering Time-dependent Contingency: A Distributionally Robust Approach

    Authors: Ziang Liu, Sheng Cai, Qiuwei Wu, Xinwei Shen, Xuan Zhang, Nikos Hatziargyriou

    Abstract: The increasing frequency of extreme weather events has posed significant risks to the operation of power grids. During long-duration extreme weather events, microgrid formation (MF) is an essential solution to enhance the resilience of the distribution systems by proactively partitioning the distribution system into several microgrids to mitigate the impact of contingencies. This paper proposes a…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 5 pages, 5 figures, Accepted by PES General Meeting 2024

  17. arXiv:2405.18712  [pdf, other]

    eess.SY

    Identifying the Most Influential Driver Nodes for Pinning Control of Multi-Agent Systems with Time-Varying Topology

    Authors: Guangrui Zhang, Zhaohui Liu, Xinghuo Yu, Mahdi Jalili

    Abstract: Identifying the most influential driver nodes to guarantee the fastest synchronization speed is a key topic in pinning control of multi-agent systems. This paper develops a methodology to find the most influential pinning nodes under time-varying topologies. First, we provide the pinning control synchronization conditions of multi-agent systems. Second, a method is proposed to identify the best dr…

    Submitted 28 May, 2024; originally announced May 2024.

  18. arXiv:2405.16889  [pdf]

    eess.SP

    Extraction of In-Phase and Quadrature Components by Time-Encoding Sampling

    Authors: Y. H. Shao, S. Y. Chen, H. Z. Yang, F. Xi, H. Hong, Z. Liu

    Abstract: Time encoding machine (TEM) is a biologically-inspired scheme to perform signal sampling using timing. In this paper, we study its application to the sampling of bandpass signals. We propose an integrate-and-fire TEM scheme by which the in-phase (I) and quadrature (Q) components are extracted through reconstruction. We design the TEM according to the signal bandwidth and amplitude instead of upper…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 30 pages, 8 figures

  19. arXiv:2405.15241  [pdf, other]

    eess.IV cs.CV

    Blaze3DM: Marry Triplane Representation with Diffusion for 3D Medical Inverse Problem Solving

    Authors: Jia He, Bonan Li, Ge Yang, Ziwen Liu

    Abstract: Solving 3D medical inverse problems such as image restoration and reconstruction is crucial in the modern medical field. However, the curse of dimensionality in 3D medical data leads mainstream volume-wise methods to suffer from high resource consumption and challenges models to successfully capture the natural distribution, resulting in inevitable volume inconsistency and artifacts. Some recent works…

    Submitted 24 May, 2024; originally announced May 2024.

  20. arXiv:2405.12872  [pdf, other]

    eess.IV cs.CV

    Spatial-aware Attention Generative Adversarial Network for Semi-supervised Anomaly Detection in Medical Image

    Authors: Zerui Zhang, Zhichao Sun, Zelong Liu, Bo Du, Rui Yu, Zhou Zhao, Yongchao Xu

    Abstract: Medical anomaly detection is a critical research area aimed at recognizing abnormal images to aid in diagnosis. Most existing methods adopt synthetic anomalies and image restoration on normal samples to detect anomalies. The unlabeled data consisting of both normal and abnormal data is not well explored. We introduce a novel Spatial-aware Attention Generative Adversarial Network (SAGAN) for one-class…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Early Accept by MICCAI 2024

  21. arXiv:2405.10705  [pdf, other]

    eess.IV cs.CV

    3D Vessel Reconstruction from Sparse-View Dynamic DSA Images via Vessel Probability Guided Attenuation Learning

    Authors: Zhentao Liu, Huangxuan Zhao, Wenhui Qin, Zhenghong Zhou, Xinggang Wang, Wen** Wang, Xiaochun Lai, Chuansheng Zheng, Dinggang Shen, Zhiming Cui

    Abstract: Digital Subtraction Angiography (DSA) is one of the gold standards in vascular disease diagnosing. With the help of contrast agent, time-resolved 2D DSA images deliver comprehensive insights into blood flow information and can be utilized to reconstruct 3D vessel structures. Current commercial DSA systems typically demand hundreds of scanning views to perform reconstruction, resulting in substanti…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 12 pages, 13 figures, 5 tables

  22. arXiv:2405.09470  [pdf, other]

    cs.SD cs.CR cs.LG eess.AS

    Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer

    Authors: Weifei **, Yuxin Cao, Junjie Su, Qi Shen, Kai Ye, Derui Wang, Jie Hao, Ziyao Liu

    Abstract: In light of the widespread application of Automatic Speech Recognition (ASR) systems, their security concerns have received much more attention than ever before, primarily due to the susceptibility of Deep Neural Networks. Previous studies have illustrated that surreptitiously crafting adversarial perturbations enables the manipulation of speech recognition systems, resulting in the production of…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted to SecTL (AsiaCCS Workshop) 2024

  23. Teacher-Student Network for Real-World Face Super-Resolution with Progressive Embedding of Edge Information

    Authors: Zhilei Liu, Chenggong Zhang

    Abstract: Traditional face super-resolution (FSR) methods trained on synthetic datasets usually have poor generalization ability for real-world face images. Recent work has utilized complex degradation models or training networks to simulate the real degradation process, but this limits the performance of these methods due to the domain differences that still exist between the generated low-resolution image…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by ICIP 2023

  24. arXiv:2405.04295  [pdf, other]

    eess.IV cs.CV

    Semi-Supervised Disease Classification based on Limited Medical Image Data

    Authors: Yan Zhang, Chun Li, Zhaoxia Liu, Ming Li

    Abstract: In recent years, significant progress has been made in the field of learning from positive and unlabeled examples (PU learning), particularly in the context of advancing image and text classification tasks. However, applying PU learning to semi-supervised disease classification remains a formidable challenge, primarily due to the limited availability of labeled medical images. In the realm of medi…

    Submitted 7 May, 2024; originally announced May 2024.

  25. arXiv:2405.04253  [pdf]

    eess.SP

    Fermat Number Transform Based Chromatic Dispersion Compensation and Adaptive Equalization Algorithm

    Authors: Siyu Chen, Zheli Liu, Weihao Li, Zihe Hu, Mingming Zhang, Sheng Cui, Ming Tang

    Abstract: By introducing the Fermat number transform into chromatic dispersion compensation and adaptive equalization, the computational complexity has been reduced by 68% compared with the conventional implementation. Experimental results validate its transmission performance with only 0.8 dB receiver sensitivity penalty in a 75 km-40 GBaud-PDM-16QAM system.

    Submitted 7 May, 2024; originally announced May 2024.

  26. arXiv:2404.17973  [pdf, other]

    cs.IT eess.SP

    Over-the-Air Fusion of Sparse Spatial Features for Integrated Sensing and Edge AI over Broadband Channels

    Authors: Zhiyan Liu, Qiao Lan, Kaibin Huang

    Abstract: The 6G mobile networks are differentiated from 5G by two new usage scenarios - distributed sensing and edge AI. Their natural integration, termed integrated sensing and edge AI (ISEA), promised to create a platform for enabling environment perception to make intelligent decisions and take real-time actions. A basic operation in ISEA is for a fusion center to acquire and fuse features of spatial se…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE for possible publication

  27. arXiv:2404.16223  [pdf, other]

    cs.CV eess.IV

    Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey

    Authors: Marcos V. Conde, Florin-Alexandru Vasluianu, Radu Timofte, Jianxing Zhang, Jia Li, Fan Wang, Xiaopeng Li, Zikun Liu, Hyunhee Park, Sejun Song, Changho Kim, Zhijuan Huang, Hongyuan Yu, Cheng Wan, Wending Xiang, Jiamin Lin, Hang Zhong, Qiaosong Zhang, Yue Sun, Xuanwu Yin, Kunlong Zuo, Senyan Xu, Siyuan Jiang, Zhi**g Sun, Jiaying Zhu, et al. (10 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines; however, this problem is not as explored as in the RGB domain. The goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as nois…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - NTIRE Workshop

  28. arXiv:2404.15620  [pdf, other]

    eess.IV

    A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution

    Authors: Zhixiong Yang, **gyuan Xia, Shengxi Li, Xinghua Huang, Shuanghui Zhang, Zhen Liu, Yaowen Fu, Yongxiang Liu

    Abstract: Deep learning-based methods have achieved significant successes on solving the blind super-resolution (BSR) problem. However, most of them request supervised pre-training on labelled datasets. This paper proposes an unsupervised kernel estimation model, named dynamic kernel prior (DKP), to realize an unsupervised and pre-training-free learning-based algorithm for solving the BSR problem. DKP can a…

    Submitted 25 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted for publication in CVPR 2024

  29. arXiv:2404.14693  [pdf, other]

    cs.CR cs.CV eess.IV

    Double Privacy Guard: Robust Traceable Adversarial Watermarking against Face Recognition

    Authors: Yunming Zhang, Dengpan Ye, Sipeng Shen, Caiyun Xie, Ziyi Liu, Jiacheng Deng, Long Tang

    Abstract: The wide deployment of Face Recognition (FR) systems poses risks of privacy leakage. One countermeasure to address this issue is adversarial attacks, which deceive malicious FR searches but simultaneously interfere with the normal identity verification of trusted authorizers. In this paper, we propose the first Double Privacy Guard (DPG) scheme based on traceable adversarial watermarking. DPG employs a…

    Submitted 22 April, 2024; originally announced April 2024.

  30. Multi-agent Reinforcement Learning-based Joint Precoding and Phase Shift Optimization for RIS-aided Cell-Free Massive MIMO Systems

    Authors: Yiyang Zhu, Enyu Shi, Ziheng Liu, Jiayi Zhang, Bo Ai

    Abstract: Cell-free (CF) massive multiple-input multiple-output (mMIMO) is a promising technique for achieving high spectral efficiency (SE) using multiple distributed access points (APs). However, harsh propagation environments often lead to significant communication performance degradation due to high penetration loss. To overcome this issue, we introduce the reconfigurable intelligent surface (RIS) into…

    Submitted 22 April, 2024; originally announced April 2024.

  31. arXiv:2404.09436  [pdf]

    physics.med-ph eess.IV

    Image Reconstruction with B0 Inhomogeneity using an Interpretable Deep Unrolled Network on an Open-bore MRI-Linac

    Authors: Shanshan Shan, Yang Gao, David E. J. Waddington, Hongli Chen, Brendan Whelan, Paul Z. Y. Liu, Yaohui Wang, Chunyi Liu, Hong** Gan, Mingyuan Gao, Feng Liu

    Abstract: MRI-Linac systems require fast image reconstruction with high geometric fidelity to localize and track tumours for radiotherapy treatments. However, B0 field inhomogeneity distortions and slow MR acquisition potentially limit the quality of the image guidance and tumour treatments. In this study, we develop an interpretable unrolled network, referred to as RebinNet, to reconstruct distortion-free…

    Submitted 14 April, 2024; originally announced April 2024.

  32. arXiv:2404.08610  [pdf, other]

    eess.SP

    Full-Duplex Beyond Self-Interference: The Unlimited Sensing Way

    Authors: Ziang Liu, Ayush Bhandari, Bruno Clerckx

    Abstract: The success of full-stack full-duplex communication systems depends on how effectively one can achieve digital self-interference cancellation (SIC). Towards this end, in this paper, we consider an unlimited sensing framework (USF) enabled full-duplex system. We show that by injecting folding non-linearities in the sensing pipeline, one can not only suppress self-interference but also recover the sign…

    Submitted 16 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE Letter

  33. arXiv:2404.07556  [pdf, other]

    eess.IV cs.CV

    Attention-Aware Laparoscopic Image Desmoking Network with Lightness Embedding and Hybrid Guided Embedding

    Authors: Ziteng Liu, Jiahua Zhu, Bainan Liu, Hao Liu, Wenpeng Gao, Yili Fu

    Abstract: This paper presents a novel method of smoke removal from the laparoscopic images. Due to the heterogeneous nature of surgical smoke, a two-stage network is proposed to estimate the smoke distribution and reconstruct a clear, smoke-free surgical scene. The utilization of the lightness channel plays a pivotal role in providing vital information pertaining to smoke density. The reconstruction of smok…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: ISBI2024

  34. arXiv:2404.04878  [pdf, other]

    eess.IV cs.CV

    CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data

    Authors: Wei Fang, Yuxing Tang, Heng Guo, Mingze Yuan, Tony C. W. Mok, Ke Yan, Jiawen Yao, Xin Chen, Zaiyi Liu, Le Lu, Ling Zhang, Minfeng Xu

    Abstract: In the realm of medical 3D data, such as CT and MRI images, prevalent anisotropic resolution is characterized by high intra-slice but diminished inter-slice resolution. The lowered resolution between adjacent slices poses challenges, hindering optimal viewing experiences and impeding the development of robust downstream analysis algorithms. Various volumetric super-resolution algorithms aim to sur…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: CVPR accepted paper

  35. arXiv:2403.18811  [pdf, other]

    cs.CV cs.GR cs.SD eess.AS

    Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment

    Authors: Li Siyao, Tianpei Gu, Zhitao Yang, Zhengyu Lin, Ziwei Liu, Henghui Ding, Lei Yang, Chen Change Loy

    Abstract: We introduce a novel task within the field of 3D dance generation, termed dance accompaniment, which necessitates the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm. Unlike existing solo or group dance generation tasks, a duet dance scenario entails a heightened degree of interaction between t…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: ICLR 2024

  36. arXiv:2403.18200  [pdf, other]

    eess.SY

    Fault-tolerant properties of scale-free linear protocols for synchronization of homogeneous multi-agent systems

    Authors: Anton A. Stoorvogel, Ali Saberi, Zhenwei Liu

    Abstract: Originally, protocols were designed for multi-agent systems (MAS) using information about the network. However, in many cases there is no or only limited information available about the network. Recently, there has been a focus on scale-free synchronization of multi-agent systems (MAS). In this case, the protocol is designed without any prior information about the network. As long as the network c…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: The article was submitted to IEEE Transactions on Automatic Control for review on March 27th, 2024

  37. arXiv:2403.16836  [pdf, other]

    eess.SY physics.optics

    Energy Efficiency Optimization Method of WDM Visible Light Communication System for Indoor Broadcasting Networks

    Authors: Dayu Shi, Xun Zhang, Ziqi Liu, Xuanbang Chen, Jianghao Li, Xiaodong Liu, William Shieh

    Abstract: This paper introduces a novel approach to optimize energy efficiency in wavelength division multiplexing (WDM) Visible Light Communication (VLC) systems designed for indoor broadcasting networks. A physics-based LED model is integrated into system energy efficiency optimization, enabling quantitative analysis of the critical issue of VLC energy efficiency: the nonlinear interplay between illuminat…

    Submitted 25 March, 2024; originally announced March 2024.

  38. arXiv:2403.14775  [pdf, ps, other]

    cs.IT eess.SP

    RIS-Aided Cooperative Mobile Edge Computing: Computation Efficiency Maximization via Joint Uplink and Downlink Resource Allocation

    Authors: Zhenrong Liu, Zongze Li, Yi Gong, Yik-Chung Wu

    Abstract: In mobile edge computing (MEC) systems, the wireless channel condition is a critical factor affecting both the communication power consumption and computation rate of the offloading tasks. This paper exploits the idea of cooperative transmission and employing reconfigurable intelligent surface (RIS) in MEC to improve the channel condition and maximize computation efficiency (CE). The resulting pro…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted for publication in IEEE Transactions on Wireless Communications

  39. arXiv:2403.12521  [pdf]

    eess.SY

    Multi-mode Fault Diagnosis Datasets of Gearbox Under Variable Working Conditions

    Authors: Shi** Chen, Zeyi Liu, Xiao He, Dongliang Zou, Donghua Zhou

    Abstract: The gearbox is a critical component of electromechanical systems. The occurrence of multiple faults can significantly impact system accuracy and service life. The vibration signal of the gearbox is an effective indicator of its operational status and fault information. However, gearboxes in real industrial settings often operate under variable working conditions, such as varying speeds and loads.…

    Submitted 8 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 10 pages, 12 figures

  40. A Novel Mutual Insurance Model for Hedging Against Cyber Risks in Power Systems Deploying Smart Technologies

    Authors: Pikkin Lau, Lingfeng Wang, Wei Wei, Zhaoxi Liu, Chee-Wooi Ten

    Abstract: In this paper, a novel cyber-insurance model design is proposed based on system risk evaluation with smart technology applications. The cyber insurance policy for power systems is tailored via cyber risk modeling, reliability impact analysis, and insurance premium calculation. A stochastic Epidemic Network Model is developed to evaluate the cyber risk by propagating cyberattacks among graphical vu…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Power system reliability, cyber-insurance, power system security, cyber-physical systems, cyber risk modeling, actuarial design, tail risk

    Journal ref: in IEEE Transactions on Power Systems, vol. 38, no. 1, pp. 630-642, Jan. 2023

  41. arXiv:2403.10230  [pdf, ps, other]

    cs.IT eess.SP

    Fairness Optimization for Intelligent Reflecting Surface Aided Uplink Rate-Splitting Multiple Access

    Authors: Shanshan Zhang, Wen Chen, Qingqing Wu, Ziwei Liu, Shunqing Zhang, Jun Li

    Abstract: This paper studies the fair transmission design for an intelligent reflecting surface (IRS) aided rate-splitting multiple access (RSMA). IRS is used to establish a good signal propagation environment and enhance the RSMA transmission performance. The fair rate adaption problem is constructed as a max-min optimization problem. To solve the optimization problem, we adopt an alternative optimization…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: This work has been submitted to TCOM

  42. arXiv:2403.09076  [pdf, ps, other]

    eess.SY

    Chaotic Masking Protocol for Secure Communication and Attack Detection in Remote Estimation of Cyber-Physical Systems

    Authors: Tao Chen, Andreu Cecilia, Daniele Astolfi, Lei Wang, Zhitao Liu, Hongye Su

    Abstract: In remote estimation of cyber-physical systems (CPSs), sensor measurements transmitted through the network may be attacked by adversaries, leading to a risk of privacy leakage (e.g., of the system state) and/or failure of the remote estimator. To deal with this problem, a chaotic masking protocol is proposed in this paper to secure the transmission of sensor measurements. In detail, at the plant side, a chao…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 8 pages, 7 figures

  43. arXiv:2403.08505  [pdf, other]

    eess.IV cs.AI cs.CV cs.MM

    Content-aware Masked Image Modeling Transformer for Stereo Image Compression

    Authors: Xinjie Zhang, Shenyuan Gao, Zhening Liu, Jiawei Shao, Xingtong Ge, Dailan He, Tongda Xu, Yan Wang, Jun Zhang

    Abstract: Existing learning-based stereo image codecs adopt sophisticated transformation with simple entropy models derived from single image codecs to encode latent representations. However, those entropy models struggle to effectively capture the spatial-disparity characteristics inherent in stereo images, which leads to suboptimal rate-distortion results. In this paper, we propose a stereo image compressi…

    Submitted 19 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  44. arXiv:2403.05906  [pdf, other]

    eess.IV cs.CV

    Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration

    Authors: Jingyun Xue, Tao Wang, Jun Wang, Kaihao Zhang, Wenhan Luo, Wenqi Ren, Zikun Liu, Hyunhee Park, Xiaochun Cao

    Abstract: Under-Display Camera (UDC) is an emerging technology that achieves full-screen display via hiding the camera under the display panel. However, the current implementation of UDC causes serious degradation. The incident light required for camera imaging undergoes attenuation and diffraction when passing through the display panel, leading to various artifacts in UDC imaging. Presently, the prevailing…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: 13 pages, 10 figures, conference or other essential info

  45. arXiv:2403.02854  [pdf, ps, other]

    eess.SP

    STAR-RIS Assisted Wireless-Powered and Backscattering Mobile Edge Computing Networks

    Authors: Bin Lyu, Yining Zhang, Pengcheng Chen, Ziwei Liu, Feng Tian

    Abstract: The wireless powered and backscattering mobile edge computing (WPB-MEC) network is a novel network paradigm that supplies energy and computing resources to wireless sensors (WSs). However, its performance is seriously affected by severe attenuations and inappropriate assumptions of infinite computing capability at the hybrid access point (HAP). To address the above issues, in this paper, we propos…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by China Communications. 13 pages, 8 figures

  46. arXiv:2403.00473  [pdf, other]

    cs.GR cs.RO eess.SY

    Computer-Controlled 3D Freeform Surface Weaving

    Authors: Xiangjia Chen, Lip M. Lai, Zishun Liu, Chengkai Dai, Isaac C. W. Leung, Charlie C. L. Wang, Yeung Yam

    Abstract: In this paper, we present a new computer-controlled weaving technology that enables the fabrication of woven structures in the shape of given 3D surfaces by using threads in non-traditional materials with high bending-stiffness, allowing for multiple applications with the resultant woven fabrics. A new weaving machine and a new manufacturing process are developed to realize the function of 3D surf…

    Submitted 8 May, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

  47. arXiv:2402.19085  [pdf, other]

    cs.CL cs.AI eess.SY

    Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

    Authors: Yiju Guo, Ganqu Cui, Lifan Yuan, Ning Ding, Jiexin Wang, Huimin Chen, Bowen Sun, Ruobing Xie, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax": a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness). However, exi…

    Submitted 29 February, 2024; originally announced February 2024.

  48. arXiv:2402.18204  [pdf, other]

    cs.SD eess.AS

    ConvDTW-ACS: Audio Segmentation for Track Type Detection During Car Manufacturing

    Authors: Álvaro López-Chilet, Zhaoyi Liu, Jon Ander Gómez, Carlos Alvarez, Marivi Alonso Ortiz, Andres Orejuela Mesa, David Newton, Friedrich Wolf-Monheim, Sam Michiels, Danny Hughes

    Abstract: This paper proposes a method for Acoustic Constrained Segmentation (ACS) in audio recordings of vehicles driven through a production test track, delimiting the boundaries of surface types in the track. ACS is a variant of classical acoustic segmentation where the sequence of labels is known, contiguous and invariable, which is especially useful in this work as the test track has a standard configu…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 12 pages, 2 figures

  49. arXiv:2402.17645  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation

    Authors: Shuangrui Ding, Zihan Liu, Xiaoyi Dong, Pan Zhang, Rui Qian, Conghui He, Dahua Lin, Jiaqi Wang

    Abstract: We present SongComposer, an innovative LLM designed for song composition. It can understand and generate melodies and lyrics in symbolic song representations by leveraging the capability of LLMs. Existing music-related LLMs treat music as quantized audio signals, but such implicit encoding leads to inefficient encoding and poor flexibility. In contrast, we resort to symbolic song represen…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: project page: https://pjlab-songcomposer.github.io/ code: https://github.com/pjlab-songcomposer/songcomposer

  50. arXiv:2402.17268  [pdf, other]

    eess.SY

    Reinforcement Learning Based Robust Volt/Var Control in Active Distribution Networks With Imprecisely Known Delay

    Authors: Hong Cheng, Huan Luo, Zhi Liu, Wei Sun, Weitao Li, Qiyue Li

    Abstract: Active distribution networks (ADNs) incorporating massive photovoltaic (PV) devices encounter challenges of rapid voltage fluctuations and potential violations. Due to the fluctuation and intermittency of PV generation, the state gap, arising from time-inconsistent states and exacerbated by imprecisely known system delays, significantly impacts the accuracy of voltage control. This paper addresses…

    Submitted 27 February, 2024; originally announced February 2024.