Showing 1–50 of 837 results for author: Zhang, Z

Searching in archive eess.
  1. arXiv:2406.18993  [pdf, ps, other]

    eess.SP

    Interference Cancellation Based Neural Receiver for Superimposed Pilot in Multi-Layer Transmission

    Authors: Han Xiao, Wenqiang Tian, Shi Jin, Wendong Liu, Jia Shen, Zhihua Shi, Zhi Zhang

    Abstract: In this paper, an interference cancellation based neural receiver for superimposed pilot (SIP) in multi-layer transmission is proposed, where the data and pilot are non-orthogonally superimposed in the same time-frequency resource. Specifically, to deal with the intra-layer and inter-layer interference of SIP under multi-layer transmission, the interference cancellation with superimposed symbol ai…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.18102  [pdf]

    eess.IV cs.CV

    A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation

    Authors: Muwei Jian, Hongyu Chen, Zaiyong Zhang, Nan Yang, Haorang Zhang, Lifu Ma, Wenjing Xu, Huixiang Zhi

    Abstract: Recently, Computer-Aided Diagnosis (CAD) systems have emerged as indispensable tools in clinical diagnostic workflows, significantly alleviating the burden on radiologists. Nevertheless, despite their integration into clinical settings, CAD systems encounter limitations. Specifically, while CAD systems can achieve high performance in the detection of lung nodules, they face challenges in accuratel…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.16380  [pdf, other]

    eess.SY

    Testing Topological Data Analysis for Condition Monitoring of Wind Turbines

    Authors: Simone Casolo, Alexander Stasik, Zhenyou Zhang, Signe Riemer-Sorensen

    Abstract: We present an investigation of how topological data analysis (TDA) can be applied to condition-based monitoring (CBM) of wind turbines for energy generation. TDA is a branch of data analysis focusing on extracting meaningful information from complex datasets by analyzing their structure in state space and computing their underlying topological features. By representing data in a high-dimensional s…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: To be published in the proceedings of the Annual Conference of the Prognostics and Health Management Society Style 2024

  4. arXiv:2406.13358  [pdf, other]

    cs.CV eess.IV

    Multi-scale Restoration of Missing Data in Optical Time-series Images with Masked Spatial-Temporal Attention Network

    Authors: Zaiyan Zhang, Jining Yan, Yuanqi Liang, Jiaxin Feng, Haixu He, Wei Han

    Abstract: Due to factors such as thick cloud cover and sensor limitations, remote sensing images often suffer from significant missing data, resulting in incomplete time-series information. Existing methods for imputing missing values in remote sensing images do not fully exploit spatio-temporal auxiliary information, leading to limited accuracy in restoration. Therefore, this paper proposes a novel deep le…

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.12186  [pdf, ps, other]

    eess.IV cs.CV

    Unlocking the Potential of Early Epochs: Uncertainty-aware CT Metal Artifact Reduction

    Authors: Xinquan Yang, Guanqun Zhou, Wei Sun, Youjian Zhang, Zhongya Wang, Jiahui He, Zhicheng Zhang

    Abstract: In computed tomography (CT), the presence of metallic implants in patients often leads to disruptive artifacts in the reconstructed images, hindering accurate diagnosis. Recently, a large amount of supervised deep learning-based approaches have been proposed for metal artifact reduction (MAR). However, these methods neglect the influence of initial training weights. In this paper, we have discover…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2406.11799  [pdf, other]

    eess.IV cs.CV cs.LG

    Mix-Domain Contrastive Learning for Unpaired H&E-to-IHC Stain Translation

    Authors: Song Wang, Zhong Zhang, Huan Yan, Ming Xu, Guanghui Wang

    Abstract: H&E-to-IHC stain translation techniques offer a promising solution for precise cancer diagnosis, especially in low-resource regions where there is a shortage of health professionals and limited access to expensive equipment. Considering the pixel-level misalignment of H&E-IHC image pairs, current research explores the pathological consistency between patches from the same positions of the image pa…

    Submitted 17 June, 2024; originally announced June 2024.

  7. arXiv:2406.09846  [pdf, ps, other]

    cs.IT eess.SP

    Multiple Intelligent Reflecting Surfaces Collaborative Wireless Localization System

    Authors: Ziheng Zhang, Wen Chen, Qingqing Wu, Zhendong Li, Xusheng Zhu, Jingfeng Chen, Nan Cheng

    Abstract: This paper studies a multiple intelligent reflecting surfaces (IRSs) collaborative localization system where multiple semi-passive IRSs are deployed in the network to locate one or more targets based on time-of-arrival. It is assumed that each semi-passive IRS is equipped with reflective elements and sensors, which are used to establish the line-of-sight links from the base station (BS) to multipl…

    Submitted 17 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: 13 pages, 8 figures

  8. arXiv:2406.09389  [pdf, other]

    eess.IV cs.CV

    Sagiri: Low Dynamic Range Image Enhancement with Generative Diffusion Prior

    Authors: Baiang Li, Sizhuo Ma, Yanhong Zeng, Xiaogang Xu, Youqing Fang, Zhao Zhang, Jian Wang, Kai Chen

    Abstract: Capturing High Dynamic Range (HDR) scenery using 8-bit cameras often suffers from over-/underexposure, loss of fine details due to low bit-depth compression, skewed color distributions, and strong noise in dark areas. Traditional LDR image enhancement methods primarily focus on color mapping, which enhances the visual representation by expanding the image's color range and adjusting the brightness…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: https://sagiri0208.github.io

  9. arXiv:2406.09356  [pdf, other]

    cs.CV eess.IV

    CMC-Bench: Towards a New Paradigm of Visual Signal Compression

    Authors: Chunyi Li, Xiele Wu, Haoning Wu, Donghui Feng, Zicheng Zhang, Guo Lu, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai, Weisi Lin

    Abstract: Ultra-low bitrate image compression is a challenging and demanding topic. With the development of Large Multimodal Models (LMMs), a Cross Modality Compression (CMC) paradigm of Image-Text-Image has emerged. Compared with traditional codecs, this semantic-level compression can reduce image data size to 0.1% or even lower, which has strong potential applications. However, CMC has certain defects in…

    Submitted 13 June, 2024; originally announced June 2024.

  10. arXiv:2406.08771  [pdf, other]

    cs.SD cs.AI eess.AS

    MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection

    Authors: Da Mu, Zhicheng Zhang, Haobo Yue

    Abstract: Sound Event Localization and Detection (SELD) involves detecting and localizing sound events using multichannel sound recordings. Previously proposed Event-Independent Network V2 (EINV2) has achieved outstanding performance on SELD. However, it still faces challenges in effectively extracting features across spectral, spatial, and temporal domains. This paper proposes a three-stage network structu…

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  11. arXiv:2406.07918  [pdf, other]

    eess.IV

    Micro-expression recognition based on depth map to point cloud

    Authors: Ren Zhang, Jianqin Yin, Chao Qi, Zehao Wang, Zhicheng Zhang, Yonghao Dang

    Abstract: Micro-expressions are nonverbal facial expressions that reveal the covert emotions of individuals, making the micro-expression recognition task receive widespread attention. However, the micro-expression recognition task is challenging due to the subtle facial motion and brevity in duration. Many 2D image-based methods have been developed in recent years to recognize MEs effectively, but, these ap…

    Submitted 12 June, 2024; originally announced June 2024.

  12. arXiv:2406.06626  [pdf, other]

    cs.LG cs.AI cs.HC eess.SP

    Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications

    Authors: Zhou Zhou, Guohang He, Zheng Zhang, Luziwei Leng, Qinghai Guo, Jianxing Liao, Xuan Song, Ran Cheng

    Abstract: Traditional invasive Brain-Computer Interfaces (iBCIs) typically depend on neural decoding processes conducted on workstations within laboratory settings, which prevents their everyday usage. Implementing these decoding processes on edge devices, such as the wearables, introduces considerable challenges related to computational demands, processing speed, and maintaining accuracy. This study seeks…

    Submitted 7 June, 2024; originally announced June 2024.

  13. arXiv:2406.05961  [pdf, other]

    eess.AS

    BS-PLCNet 2: Two-stage Band-split Packet Loss Concealment Network with Intra-model Knowledge Distillation

    Authors: Zihan Zhang, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie

    Abstract: Audio packet loss is an inevitable problem in real-time speech communication. A band-split packet loss concealment network (BS-PLCNet) targeting full-band signals was recently proposed. Although it performs superiorly in the ICASSP 2024 PLC Challenge, BS-PLCNet is a large model with high computational complexity of 8.95G FLOPS. This paper presents its updated version, BS-PLCNet 2, to reduce comput…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  14. arXiv:2406.05452  [pdf, other]

    eess.SP cs.IT

    Near-Field Channel Estimation for Extremely Large-Scale Terahertz Communications

    Authors: Songjie Yang, Yizhou Peng, Wanting Lyu, Ya Li, Hongjun He, Zhongpei Zhang, Chau Yuen

    Abstract: Future Terahertz communications exhibit significant potential in accommodating ultra-high-rate services. Employing extremely large-scale array antennas is a key approach to realize this potential, as they can harness substantial beamforming gains to overcome the severe path loss and leverage the electromagnetic advantages in the near field. This paper proposes novel estimation methods designed to…

    Submitted 8 June, 2024; originally announced June 2024.

  15. arXiv:2406.04324  [pdf, other]

    cs.CV eess.IV

    SF-V: Single Forward Video Generation Model

    Authors: Zhixing Zhang, Yanyu Li, Yushu Wu, Yanwu Xu, Anil Kag, Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin, Junli Cao, Dimitris Metaxas, Sergey Tulyakov, Jian Ren

    Abstract: Diffusion-based video generation models have demonstrated remarkable success in obtaining high-fidelity videos through the iterative denoising process. However, these models require multiple denoising steps during sampling, resulting in high computational costs. In this work, we propose a novel approach to obtain single-step video generation models by leveraging adversarial training to fine-tune p…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project Page: https://snap-research.github.io/SF-V

  16. arXiv:2406.03875  [pdf, other]

    eess.SY

    Energy-storing analysis and fishtail stiffness optimization for a wire-driven elastic robotic fish

    Authors: Xiaocun Liao, Chao Zhou, Junfeng Fan, Zhuoliang Zhang, Zhaoran Yin, Liangwei Deng

    Abstract: The robotic fish with high propulsion efficiency and good maneuverability achieves underwater fishlike propulsion by commonly adopting the motor to drive the fishtail, causing the significant fluctuations of the motor power due to the uneven swing speed of the fishtail in one swing cycle. Hence, we propose a wire-driven robotic fish with a spring-steel-based active-segment elastic spine. This bion…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 14 pages, 19 figures

  17. arXiv:2406.02854  [pdf]

    eess.SY

    Development of an underwater inductive coupling communication system with power carrier technology

    Authors: Zhongxing Zhang

    Abstract: Inductive coupling communication is one of the main methods of underwater communication systems due to its excellent comprehensive performance. However, the data transmission distance and operational power consumption need to be further enhanced. In this paper, an underwater induction coupling communication scheme based on power carrier technology is proposed to improve the transmission speed and…

    Submitted 4 June, 2024; originally announced June 2024.

  18. arXiv:2406.01234  [pdf, other]

    cs.LG eess.SY math.OC stat.ML

    Achieving Tractable Minimax Optimal Regret in Average Reward MDPs

    Authors: Victor Boone, Zihan Zhang

    Abstract: In recent years, significant attention has been directed towards learning average-reward Markov Decision Processes (MDPs). However, existing algorithms either suffer from sub-optimal regret guarantees or computational inefficiencies. In this paper, we present the first tractable algorithm with minimax optimal regret of $\widetilde{\mathrm{O}}(\sqrt{\mathrm{sp}(h^*) S A T})$, where…

    Submitted 3 June, 2024; originally announced June 2024.

  19. arXiv:2406.00444  [pdf, other]

    eess.SP

    Exploring Channel Estimation and Signal Detection for ODDM-based ISAC Systems

    Authors: Dezhi Wang, Chongwen Huang, Lei Liu, Xiaoming Chen, Wei Wang, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Inspired by providing reliable communications for high-mobility scenarios, in this letter, we investigate the channel estimation and signal detection in integrated sensing and communication (ISAC) systems based on the orthogonal delay-Doppler multiplexing (ODDM) modulation, which consists of a pulse-train that can achieve the orthogonality with respect to the resolution of the delay-Doppler (DD) p…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: accepted by IEEE Wireless Communications Letters

  20. arXiv:2406.00234  [pdf, other]

    cs.LG eess.SY

    Learning to Stabilize Unknown LTI Systems on a Single Trajectory under Stochastic Noise

    Authors: Ziyi Zhang, Yorie Nakahira, Guannan Qu

    Abstract: We study the problem of learning to stabilize unknown noisy Linear Time-Invariant (LTI) systems on a single trajectory. It is well known in the literature that the learn-to-stabilize problem suffers from exponential blow-up in which the state norm blows up in the order of $\Theta(2^n)$ where $n$ is the state space dimension. This blow-up is due to the open-loop instability when exploring the $n$-dimens…

    Submitted 31 May, 2024; originally announced June 2024.

  21. arXiv:2405.20068  [pdf, other]

    eess.SP

    An Efficient Network with Novel Quantization Designed for Massive MIMO CSI Feedback

    Authors: Xinran Sun, Zhengming Zhang, Luxi Yang

    Abstract: The efficacy of massive multiple-input multiple-output (MIMO) techniques heavily relies on the accuracy of channel state information (CSI) in frequency division duplexing (FDD) systems. Many works focus on CSI compression and quantization methods to enhance CSI reconstruction accuracy with lower feedback overhead. In this letter, we propose CsiConformer, a novel CSI feedback network that combines…

    Submitted 30 May, 2024; originally announced May 2024.

  22. arXiv:2405.19298  [pdf, other]

    cs.CV eess.IV

    Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

    Authors: Hanwei Zhu, Haoning Wu, Yixuan Li, Zicheng Zhang, Baoliang Chen, Lingyu Zhu, Yuming Fang, Guangtao Zhai, Weisi Lin, Shiqi Wang

    Abstract: While recent advancements in large multimodal models (LMMs) have significantly improved their abilities in image quality assessment (IQA) relying on absolute quality rating, how to transfer reliable relative quality comparison outputs to continuous perceptual quality scores remains largely unexplored. To address this gap, we introduce Compare2Score, an all-around LMM-based no-reference IQA (NR-IQA)…

    Submitted 29 May, 2024; originally announced May 2024.

  23. Multiscale Spatio-Temporal Enhanced Short-term Load Forecasting of Electric Vehicle Charging Stations

    Authors: Zongbao Zhang, Jiao Hao, Wenmeng Zhao, Yan Liu, Yaohui Huang, Xinhang Luo

    Abstract: The rapid expansion of electric vehicles (EVs) has rendered the load forecasting of electric vehicle charging stations (EVCS) increasingly critical. The primary challenge in achieving precise load forecasting for EVCS lies in accounting for the nonlinear of charging behaviors, the spatial interactions among different stations, and the intricate temporal variations in usage patterns. To address the…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 5 pages, 1 figure, AEEES 2024

  24. arXiv:2405.18731  [pdf, other]

    eess.SP cs.AI physics.comp-ph

    VBIM-Net: Variational Born Iterative Network for Inverse Scattering Problems

    Authors: Ziqing Xing, Zhaoyang Zhang, Zirui Chen, Yusong Wang, Haoran Ma, Zhun Wei, Gang Bao

    Abstract: Recently, studies have shown the potential of integrating field-type iterative methods with deep learning (DL) techniques in solving inverse scattering problems (ISPs). In this article, we propose a novel Variational Born Iterative Network, namely, VBIM-Net, to solve the full-wave ISPs with significantly improved flexibility and inversion quality. The proposed VBIM-Net emulates the alternating upd…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 14 pages, 21 figures

  25. arXiv:2405.16850  [pdf, other]

    eess.IV cs.CV cs.LG

    UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation

    Authors: Runzhao Yang, Yinda Chen, Zhihong Zhang, Xiaoyu Liu, Zongren Li, Kunlun He, Zhiwei Xiong, Jinli Suo, Qionghai Dai

    Abstract: In the field of medical image compression, Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios, yet they are constrained by a one-to-one fitting approach that results in lengthy encoding times. Our novel method, "UniCompress", innovatively extends the compression capabilities of INR by being the first to compress multi…

    Submitted 27 May, 2024; originally announced May 2024.

  26. arXiv:2405.15863  [pdf, other]

    cs.SD cs.AI eess.AS

    Quality-aware Masked Diffusion Transformer for Enhanced Music Generation

    Authors: Chang Li, Ruoyu Wang, Lijuan Liu, Jun Du, Yixuan Sun, Zilu Guo, Zhenrong Zhang, Yuan Jiang

    Abstract: In recent years, diffusion-based text-to-music (TTM) generation has gained prominence, offering a novel approach to synthesizing musical content from textual descriptions. Achieving high accuracy and diversity in this generation process requires extensive, high-quality data, which often constitutes only a fraction of available datasets. Within open-source datasets, the prevalence of issues like mi…

    Submitted 24 May, 2024; originally announced May 2024.

  27. arXiv:2405.15655  [pdf, other]

    cs.SD cs.LG eess.AS

    HiddenSpeaker: Generate Imperceptible Unlearnable Audios for Speaker Verification System

    Authors: Zhisheng Zhang, Pengyang Huang

    Abstract: In recent years, the remarkable advancements in deep neural networks have brought tremendous convenience. However, the training process of a highly effective model necessitates a substantial quantity of samples, which brings huge potential threats, like unauthorized exploitation with privacy leakage. In response, we propose a framework named HiddenSpeaker, embedding imperceptible perturbations wit…

    Submitted 26 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCNN 2024

  28. arXiv:2405.12872  [pdf, other]

    eess.IV cs.CV

    Spatial-aware Attention Generative Adversarial Network for Semi-supervised Anomaly Detection in Medical Image

    Authors: Zerui Zhang, Zhichao Sun, Zelong Liu, Bo Du, Rui Yu, Zhou Zhao, Yongchao Xu

    Abstract: Medical anomaly detection is a critical research area aimed at recognizing abnormal images to aid in diagnosis. Most existing methods adopt synthetic anomalies and image restoration on normal samples to detect anomaly. The unlabeled data consisting of both normal and abnormal data is not well explored. We introduce a novel Spatial-aware Attention Generative Adversarial Network (SAGAN) for one-class…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Early Accept by MICCAI 2024

  29. arXiv:2405.12367  [pdf, other]

    eess.IV cs.CV

    Large-Scale Multi-Center CT and MRI Segmentation of Pancreas with Deep Learning

    Authors: Zheyuan Zhang, Elif Keles, Gorkem Durak, Yavuz Taktak, Onkar Susladkar, Vandan Gorade, Debesh Jha, Asli C. Ormeci, Alpay Medetalibeyoglu, Lanhong Yao, Bin Wang, Ilkin Sevgi Isler, Linkai Peng, Hongyi Pan, Camila Lopes Vendrami, Amir Bourhani, Yury Velichko, Boqing Gong, Concetto Spampinato, Ayis Pyrros, Pallavi Tiwari, Derk C. F. Klatte, Megan Engels, Sanne Hoogenboom, Candice W. Bolan, et al. (13 additional authors not shown)

    Abstract: Automated volumetric segmentation of the pancreas on cross-sectional imaging is needed for diagnosis and follow-up of pancreatic diseases. While CT-based pancreatic segmentation is more established, MRI-based segmentation methods are understudied, largely due to a lack of publicly available datasets, benchmarking research efforts, and domain-specific deep learning methods. In this retrospective st…

    Submitted 25 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: under review version

  30. arXiv:2405.10948  [pdf, other]

    cs.CV cs.AI cs.RO eess.IV

    Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery

    Authors: Guankun Wang, Long Bai, Wan Jun Nah, Jie Wang, Zhaoxi Zhang, Zhen Chen, Jinlin Wu, Mobarakol Islam, Hongbin Liu, Hongliang Ren

    Abstract: Recent advancements in Surgical Visual Question Answering (Surgical-VQA) and related region grounding have shown great promise for robotic and medical applications, addressing the critical need for automated methods in personalized surgical mentorship. However, existing models primarily provide simple structured answers and struggle with complex scenarios due to their limited capability in recogni…

    Submitted 22 March, 2024; originally announced May 2024.

  31. arXiv:2405.10535  [pdf, other]

    eess.SP

    Dual-Robust Integrated Sensing and Communication: Beamforming under CSI Imperfection and Location Uncertainty

    Authors: Wanting Lyu, Songjie Yang, Yue Xiu, Xinyi Chen, Zhongpei Zhang, Chadi Assi, Chau Yuan

    Abstract: A dual-robust design of beamforming is investigated in an integrated sensing and communication (ISAC) system. Existing research on robust ISAC waveform design, while proposing solutions to imperfect channel state information (CSI), generally depends on prior knowledge of the target's approximate location to design waveforms. This approach, however, limits the precision in sensing the target's exact…

    Submitted 17 May, 2024; originally announced May 2024.

  32. arXiv:2405.10507  [pdf, other]

    eess.SP

    Flexible Beamforming for Movable Antenna-Enabled Integrated Sensing and Communication

    Authors: Wanting Lyu, Songjie Yang, Yue Xiu, Zhongpei Zhang, Chadi Assi, Chau Yuen

    Abstract: This paper investigates flexible beamforming design in an integrated sensing and communication (ISAC) network with movable antennas (MAs). A bistatic radar system is integrated into a multi-user multiple-input-single-output (MU-MISO) system, with the base station (BS) equipped with MAs. This enables array response reconfiguration by adjusting the positions of antennas. Thus, a joint beamforming an…

    Submitted 16 May, 2024; originally announced May 2024.

  33. arXiv:2405.10496  [pdf, other]

    cs.IT eess.SP

    Electromagnetic Information Theory for Holographic MIMO Communications

    Authors: Li Wei, Tierui Gong, Chongwen Huang, Zhaoyang Zhang, Wei E. I. Sha, Zhi Ning Chen, Linglong Dai, Merouane Debbah, Chau Yuen

    Abstract: Holographic multiple-input multiple-output (HMIMO) utilizes a compact antenna array to form a nearly continuous aperture, thereby enhancing higher capacity and more flexible configurations compared with conventional MIMO systems, making it attractive in current scientific research. Key questions naturally arise regarding the potential of HMIMO to surpass Shannon's theoretical limits and how far it…

    Submitted 25 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  34. arXiv:2405.10186  [pdf, other]

    eess.IV

    Introducing Learning Rate Adaptation CMA-ES into Rigid 2D/3D Registration for Robotic Navigation in Spine Surgery

    Authors: Zhirun Zhang, Minheng Chen

    Abstract: The covariance matrix adaptive evolution strategy (CMA-ES) has been widely used in the field of 2D/3D registration in recent years. This optimization method exhibits exceptional robustness and usability for complex surgical scenarios. However, due to the inherent ill-posed nature of the 2D/3D registration task and the presence of numerous local minima in the landscape of similarity measures. Evolu…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Technical Report

  35. arXiv:2405.09814  [pdf, other]

    cs.GR cs.CV cs.SD eess.AS

    Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis

    Authors: Zeyi Zhang, Tenglong Ao, Yuyao Zhang, Qingzhe Gao, Chuan Lin, Baoquan Chen, Libin Liu

    Abstract: In this work, we present Semantic Gesticulator, a novel framework designed to synthesize realistic gestures accompanying speech with strong semantic correspondence. Semantically meaningful gestures are crucial for effective non-verbal communication, but such gestures often fall within the long tail of the distribution of natural human motion. The sparsity of these movements makes it challenging fo…

    Submitted 16 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: SIGGRAPH 2024 (Journal Track); Project page: https://pku-mocca.github.io/Semantic-Gesticulator-Page

  36. arXiv:2405.09073  [pdf, other]

    eess.SP

    Interpretable attributed scattering center extracted via deep unfolding

    Authors: Haodong Yang, Zhe Zhang, Zhongling Huang

    Abstract: Most existing sparse representation-based approaches for attributed scattering center (ASC) extraction adopt traditional iterative optimization algorithms, which suffer from lengthy computation times and limited precision. This paper presents a solution by introducing an interpretable network that can effectively and rapidly extract ASC via deep unfolding. Initially, we create a dictionary contain…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by IGARSS2024

  37. arXiv:2405.08745  [pdf, other]

    eess.IV cs.CV cs.MM

    Enhancing Blind Video Quality Assessment with Rich Quality-aware Features

    Authors: Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai

    Abstract: In this paper, we present a simple but effective method to enhance blind video quality assessment (BVQA) models for social media videos. Motivated by previous researches that leverage pre-trained features extracted from various computer vision models as the feature representation for BVQA, we further explore rich quality-aware features from pre-trained blind image quality assessment (BIQA) and BVQ…

    Submitted 14 May, 2024; originally announced May 2024.

  38. arXiv:2405.07291  [pdf, other]

    cs.IT eess.SP

    Robust Beamforming with Gradient-based Liquid Neural Network

    Authors: Xinquan Wang, Fenghao Zhu, Chongwen Huang, Ahmed Alhammadi, Faouzi Bader, Zhaoyang Zhang, Chau Yuen, Merouane Debbah

    Abstract: Millimeter-wave (mmWave) multiple-input multiple-output (MIMO) communication with the advanced beamforming technologies is a key enabler to meet the growing demands of future mobile communication. However, the dynamic nature of cellular channels in large-scale urban mmWave MIMO communication scenarios brings substantial challenges, particularly in terms of complexity and robustness. To address the…

    Submitted 17 May, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

  39. arXiv:2405.06166  [pdf, other]

    eess.IV cs.CV

    MDNet: Multi-Decoder Network for Abdominal CT Organs Segmentation

    Authors: Debesh Jha, Nikhil Kumar Tomar, Koushik Biswas, Gorkem Durak, Matthew Antalek, Zheyuan Zhang, Bin Wang, Md Mostafijur Rahman, Hongyi Pan, Alpay Medetalibeyoglu, Yury Velichko, Daniela Ladner, Amir Borhani, Ulas Bagci

    Abstract: Accurate segmentation of organs from abdominal CT scans is essential for clinical applications such as diagnosis, treatment planning, and patient monitoring. To handle challenges of heterogeneity in organ shapes, sizes, and complex anatomical relationships, we propose a Multi-Decoder Network (MDNet), an encoder-decoder network that uses the pre-trained MiT-B2 as the encoder and multiple di…

    Submitted 9 May, 2024; originally announced May 2024.

  40. arXiv:2405.04311  [pdf, ps, other]

    cs.CV cs.AI eess.IV

    Cross-IQA: Unsupervised Learning for Image Quality Assessment

    Authors: Zhen Zhang

    Abstract: Automatic perception of image quality is a challenging problem that impacts billions of Internet and social media users daily. To advance research in this field, we propose a no-reference image quality assessment (NR-IQA) method termed Cross-IQA based on vision transformer (ViT) model. The proposed Cross-IQA method can learn image quality features from unlabeled image data. We construct the pretext…

    Submitted 7 May, 2024; originally announced May 2024.

  41. arXiv:2405.03956  [pdf, other]

    cs.SD eess.AS

    Adaptive Speech Emotion Representation Learning Based On Dynamic Graph

    Authors: Yingxue Gao, Huan Zhao, Zixing Zhang

    Abstract: Graph representation learning has become a hot research topic due to its powerful nonlinear fitting capability in extracting representative node embeddings. However, for sequential data such as speech signals, most traditional methods merely focus on the static graph created within a sequence, and largely overlook the intrinsic evolving patterns of these data. This may reduce the efficiency of gra…

    Submitted 6 May, 2024; originally announced May 2024.

    Journal ref: published at ICASSP 2024

  42. arXiv:2405.03953  [pdf, other]

    cs.SD eess.AS

    Intelligent Cardiac Auscultation for Murmur Detection via Parallel-Attentive Models with Uncertainty Estimation

    Authors: Zixing Zhang, Tao Pang, Jing Han, Björn W. Schuller

    Abstract: Heart murmurs are a common manifestation of cardiovascular diseases and can provide crucial clues to early cardiac abnormalities. While most current research methods primarily focus on the accuracy of models, they often overlook other important aspects such as the interpretability of machine learning algorithms and the uncertainty of predictions. This paper introduces a heart murmur detection meth…

    Submitted 6 May, 2024; originally announced May 2024.

    Journal ref: published at ICASSP 2024

  43. arXiv:2405.03952  [pdf, other]

    cs.SD cs.CL eess.AS

    HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous Speech

    Authors: Zhongren Dong, Zixing Zhang, Weixiang Xu, Jing Han, Jianjun Ou, Björn W. Schuller

    Abstract: Automatically detecting Alzheimer's Disease (AD) from spontaneous speech plays an important role in its early diagnosis. Recent approaches rely heavily on Transformer architectures due to their efficiency in modelling long-range context dependencies. However, the quadratic increase in computational complexity associated with self-attention and the length of audio poses a challenge when deploying…

    Submitted 6 May, 2024; originally announced May 2024.

    Journal ref: published at ICASSP 2024

  44. arXiv:2405.03300  [pdf, other]

    cs.IT eess.SP

    Active RIS-Aided Massive MIMO With Imperfect CSI and Phase Noise

    Authors: Zhangjie Peng, Jianchen Zhu, Cunhua Pan, Zaichen Zhang, Daniel Benevides da Costa, Maged Elkashlan, George K. Karagiannidis

    Abstract: Active reconfigurable intelligent surface (RIS) has attracted significant attention as a recently proposed RIS architecture. Owing to its capability to amplify the incident signals, active RIS can mitigate the multiplicative fading effect inherent in the passive RIS-aided system. In this paper, we consider an active RIS-aided uplink multi-user massive multiple-input multiple-output (MIMO) system i…

    Submitted 6 May, 2024; originally announced May 2024.

  45. arXiv:2405.02604  [pdf, ps, other]

    cs.IT eess.SP

    Interleave Frequency Division Multiplexing

    Authors: Yuhao Chi, Lei Liu, Yao Ge, Xuehui Chen, Ying Li, Zhaoyang Zhang

    Abstract: In this letter, we study interleave frequency division multiplexing (IFDM) for multicarrier modulation in static multipath and mobile time-varying channels, which outperforms orthogonal frequency division multiplexing (OFDM), orthogonal time frequency space (OTFS), and affine frequency division multiplexing (AFDM) by considering practical advanced detectors. The fundamental principle underlying ex…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE Wireless Communications Letters

  46. arXiv:2405.01503  [pdf, other]

    eess.IV cs.CV

    PAM-UNet: Shifting Attention on Region of Interest in Medical Images

    Authors: Abhijit Das, Debesh Jha, Vandan Gorade, Koushik Biswas, Hongyi Pan, Zheyuan Zhang, Daniela P. Ladner, Yury Velichko, Amir Borhani, Ulas Bagci

    Abstract: Computer-aided segmentation methods can assist medical personnel in improving diagnostic outcomes. While recent advancements like UNet and its variants have shown promise, they face a critical challenge: balancing accuracy with computational efficiency. Shallow encoder architectures in UNets often struggle to capture crucial spatial features, leading to inaccurate and sparse segmentation. To addre…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted at 2024 IEEE EMBC

  47. arXiv:2405.00736  [pdf, other]

    eess.SP cs.LG

    Joint Signal Detection and Automatic Modulation Classification via Deep Learning

    Authors: Huijun Xing, Xuhui Zhang, Shuo Chang, Jinke Ren, Zixun Zhang, Jie Xu, Shuguang Cui

    Abstract: Signal detection and modulation classification are two crucial tasks in various wireless communication systems. Different from prior works that investigate them independently, this paper studies the joint signal detection and automatic modulation classification (AMC) by considering a realistic and complex scenario, in which multiple signals with different modulation schemes coexist at different ca…

    Submitted 29 April, 2024; originally announced May 2024.

  48. arXiv:2405.00734  [pdf, other]

    eess.SP cs.AI cs.LG

    EEG-MACS: Manifold Attention and Confidence Stratification for EEG-based Cross-Center Brain Disease Diagnosis under Unreliable Annotations

    Authors: Zhenxi Song, Ruihan Qin, Huixia Ren, Zhen Liang, Yi Guo, Min Zhang, Zhiguo Zhang

    Abstract: Cross-center data heterogeneity and annotation unreliability significantly challenge the intelligent diagnosis of diseases using brain signals. A notable example is the EEG-based diagnosis of neurodegenerative diseases, which features subtler abnormal neural dynamics typically observed in small-group settings. To advance this area, in this work, we introduce a transferable framework employing Mani…

    Submitted 29 April, 2024; originally announced May 2024.

  49. arXiv:2405.00391  [pdf, ps, other]

    cs.IT eess.SP

    Beamforming Inferring by Conditional WGAN-GP for Holographic Antenna Arrays

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Ahmed Alhammadi, Hui Chen, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: The beamforming technology with large holographic antenna arrays is one of the key enablers for the next generation of wireless systems, which can significantly improve the spectral efficiency. However, the deployment of large antenna arrays implies high algorithm complexity and resource overhead at both receiver and transmitter ends. To address this issue, advanced technologies such as artificial…

    Submitted 15 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  50. arXiv:2405.00365  [pdf, other]

    cs.IT eess.SP

    Robust Continuous-Time Beam Tracking with Liquid Neural Network

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Richeng Jin, Qianqian Yang, Ahmed Alhammadi, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Millimeter-wave (mmWave) technology is increasingly recognized as a pivotal technology of the sixth-generation communication networks due to the large amounts of available spectrum at high frequencies. However, the huge overhead associated with beam training imposes a significant challenge in mmWave communications, particularly in urban environments with high background noise. To reduce this high…

    Submitted 1 May, 2024; originally announced May 2024.