Skip to main content

Showing 1–50 of 8,397 results for author: Li

Searching in archive eess. Search in all archives.
.
  1. arXiv:2407.03159  [pdf, other

    cs.SI eess.SY physics.soc-ph

    Protection Degree and Migration in the Stochastic SIRS Model: A Queueing System Perspective

    Authors: Yuhan Li, Ziyan Zeng, Minyu Feng, Jürgen Kurths

    Abstract: With the prevalence of COVID-19, the modeling of epidemic propagation and its analyses have played a significant role in controlling epidemics. However, individual behaviors, in particular the self-protection and migration, which have a strong influence on epidemic propagation, were always neglected in previous studies. In this paper, we mainly propose two models from the individual and population… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2407.02930  [pdf, other

    eess.SP

    Timely Requesting for Time-Critical Content Users in Decentralized F-RANs

    Authors: Xingran Chen, Kai Li, Kun Yang

    Abstract: With the rising demand for high-rate and timely communications, fog radio access networks (F-RANs) offer a promising solution. This work investigates age of information (AoI) performance in F-RANs, consisting of multiple content users (CUs), enhanced remote radio heads (eRRHs), and content providers (CPs). Time-critical CUs need rapid content updates from CPs but cannot communicate directly with t… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  3. arXiv:2407.02913  [pdf, other

    cs.LG cs.AI eess.IV eess.SP math.NA

    SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic

    Authors: Liulu He, Yufei Zhao, Rui Gao, Yuan Du, Li Du

    Abstract: Fast convolution algorithms, including Winograd and FFT, can efficiently accelerate convolution operations in deep models. However, these algorithms depend on high-precision arithmetic to maintain inference accuracy, which conflicts with the model quantization. To resolve this conflict and further improve the efficiency of quantized convolution, we proposes SFC, a new algebra transform for fast co… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: ICML 2024

  4. arXiv:2407.02877  [pdf, other

    cs.IT eess.SP

    Resource Allocation Design for Next-Generation Multiple Access: A Tutorial Overview

    Authors: Zhiqiang Wei, Dongfang Xu, Shuangyang Li, Shenghui Song, Derrick Wing Kwan Ng, Giuseppe Caire

    Abstract: Multiple access is the cornerstone technology for each generation of wireless cellular networks and resource allocation design plays a crucial role in multiple access. In this paper, we present a comprehensive tutorial overview for junior researchers in this field, aiming to offer a foundational guide for resource allocation design in the context of next-generation multiple access (NGMA). Initiall… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 69 pages, 10 figures, 5 tables

  5. arXiv:2407.02830  [pdf, other

    cs.CV eess.IV

    A Radiometric Correction based Optical Modeling Approach to Removing Reflection Noise in TLS Point Clouds of Urban Scenes

    Authors: Li Fang, Tianyu Li, Yanghong Lin, Shudong Zhou, Wei Yao

    Abstract: Point clouds are vital in computer vision tasks such as 3D reconstruction, autonomous driving, and robotics. However, TLS-acquired point clouds often contain virtual points from reflective surfaces, causing disruptions. This study presents a reflection noise elimination algorithm for TLS point clouds. Our innovative reflection plane detection algorithm, based on geometry-optical models and physica… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  6. arXiv:2407.02826  [pdf, other

    eess.AS

    SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech

    Authors: **gru Lin, Meng Ge, Junyi Ao, Liqun Deng, Haizhou Li

    Abstract: It was shown that pre-trained models with self-supervised learning (SSL) techniques are effective in various downstream speech tasks. However, most such models are trained on single-speaker speech data, limiting their effectiveness in mixture speech. This motivates us to explore pre-training on mixture speech. This work presents SA-WavLM, a novel pre-trained model for mixture speech. Specifically,… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: InterSpeech 2024

  7. arXiv:2407.02765  [pdf, ps, other

    eess.SY cs.AI math.OC math.PR

    Graphon Particle Systems, Part II: Dynamics of Distributed Stochastic Continuum Optimization

    Authors: Yan Chen, Tao Li

    Abstract: We study the distributed optimization problem over a graphon with a continuum of nodes, which is regarded as the limit of the distributed networked optimization as the number of nodes goes to infinity. Each node has a private local cost function. The global cost function, which all nodes cooperatively minimize, is the integral of the local cost functions on the node set. We propose stochastic grad… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  8. Coding-Enhanced Cooperative Jamming for Secret Communication in Fluid Antenna Systems

    Authors: Hao Xu, Kai-Kit Wong, Wee Kiat New, Guyue Li, Farshad Rostami Ghadi, Yongxu Zhu, Shi **, Chan-Byoung Chae, Yangyang Zhang

    Abstract: This letter investigates the secret communication problem for a fluid antenna system (FAS)-assisted wiretap channel, where the legitimate transmitter transmits an information-bearing signal to the legitimate receiver, and at the same time, transmits a jamming signal to interfere with the eavesdropper (Eve). Unlike the conventional jamming scheme, which usually transmits Gaussian noise that interfe… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 6 pages, 3 figures, this paper has been accepted by IEEE Communications Letters

  9. arXiv:2407.02387  [pdf, other

    cs.CV eess.IV

    Real HSI-MSI-PAN image dataset for the hyperspectral/multi-spectral/panchromatic image fusion and super-resolution fields

    Authors: Shuangliang Li

    Abstract: Nowadays, most of the hyperspectral image (HSI) fusion experiments are based on simulated datasets to compare different fusion methods. However, most of the spectral response functions and spatial downsampling functions used to create the simulated datasets are not entirely accurate, resulting in deviations in spatial and spectral features between the generated images for fusion and the real image… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  10. arXiv:2407.02277  [pdf, other

    cs.SD eess.AS

    MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing

    Authors: Shangda Wu, Yashan Wang, Xiaobing Li, Feng Yu, Maosong Sun

    Abstract: In the domain of symbolic music research, the progress of develo** scalable systems has been notably hindered by the scarcity of available training data and the demand for models tailored to specific tasks. To address these issues, we propose MelodyT5, a novel unified framework that leverages an encoder-decoder architecture tailored for symbolic music processing in ABC notation. This framework c… ▽ More

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: 9 pages, 2 figures, 3 tables, accepted by ISMIR 2024

  11. arXiv:2407.02251  [pdf, other

    eess.SP

    White-Box 3D-OMP-Transformer for ISAC

    Authors: Bowen Zhang, Geoffrey Ye Li

    Abstract: Transformers have found broad applications for their great ability to capture long-range dependency among the inputs using attention mechanisms. The recent success of transformers increases the need for mathematical interpretation of their underlying working mechanisms, leading to the development of a family of white-box transformer-like deep network architectures. However, designing white-box tra… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  12. arXiv:2407.02160  [pdf, ps, other

    eess.SP

    Intelligent Reflecting Surface-Assisted NLOS Sensing With OFDM Signals

    Authors: Jilin Wang, Jun Fang, Hongbin Li, Lei Huang

    Abstract: This work addresses the problem of intelligent reflecting surface (IRS) assisted target sensing in a non-line-of-sight (NLOS) scenario, where an IRS is employed to facilitate the radar/access point (AP) to sense the targets when the line-of-sight (LOS) path between the AP and the target is blocked by obstacles. To sense the targets, the AP transmits a train of uniformly-spaced orthogonal frequency… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  13. arXiv:2407.02124  [pdf

    eess.SY

    Data-Driven Subsynchronous Oscillation Suppression for Renewable Energy Integrated Power Systems Based on Koopman Operator

    Authors: Zihan Wang, Ziyang Huang, Xiaonan Zhang, Gengyin Li, Le Zheng

    Abstract: Recently, subsynchronous oscillations (SSOs) have emerged frequently worldwide, with the high penetration of renewable power generation in modern power systems. The SSO introduced by renewables has become a prominent new stability problem, seriously threatening the stable operation of systems. This paper proposes a data-driven dynamic optimal controller for renewable energy integrated power system… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  14. arXiv:2407.02052  [pdf, other

    eess.AS cs.SD

    The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

    Authors: Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, **tao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

    Abstract: This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlap** and Mandarin accent dynamics in the ICMC case. We implement the front-end speaker diarization using the self-supervised learning representation based multi-speaker embedding and beamforming using the speaker position,… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at ICASSP 2024

  15. arXiv:2407.02049  [pdf, other

    eess.AS cs.CL cs.SD

    Accompanied Singing Voice Synthesis with Fully Text-controlled Melody

    Authors: Ruiqi Li, Zhiqing Hong, Yongqi Wang, Lichao Zhang, Rongjie Huang, Siqi Zheng, Zhou Zhao

    Abstract: Text-to-song (TTSong) is a music generation task that synthesizes accompanied singing voices. Current TTSong methods, inherited from singing voice synthesis (SVS), require melody-related information that can sometimes be impractical, such as music scores or MIDI sequences. We present MelodyLM, the first TTSong model that generates high-quality song pieces with fully text-controlled melodies, achie… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Working in progress

  16. arXiv:2407.02005  [pdf, other

    cs.CL cs.SD eess.AS

    An End-to-End Speech Summarization Using Large Language Model

    Authors: Hengchao Shang, Zongyao Li, Jiaxin Guo, Shaojun Li, Zhiqiang Rao, Yuanchang Luo, Daimeng Wei, Hao Yang

    Abstract: Abstractive Speech Summarization (SSum) aims to generate human-like text summaries from spoken content. It encounters difficulties in handling long speech input and capturing the intricate cross-modal map** between long speech inputs and short text summaries. Research on large language models (LLMs) and multimodal information fusion has provided new insights for addressing these challenges. In t… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: InterSpeech 2024

  17. arXiv:2407.01908  [pdf, other

    eess.IV cs.CV

    Efficient Stochastic Differential Equation for DEM Super Resolution with Void Filling

    Authors: Tongtong Zhang, Zongcheng Zuo, Yuanxiang Li

    Abstract: Digital Elevation Model (DEM) plays a fundamental role in remote sensing and photogrammetry. Enhancing the quality of DEM is crucial for various applications. Although multiple types of defects may appear simultaneously in the same DEM, they are commonly addressed separately. Most existing approaches only aim to fill the DEM voids, or apply super-resolution to the intact DEM. This paper introduces… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  18. arXiv:2407.01891  [pdf, other

    cs.RO eess.SY

    Refined Motion Compensation with Soft Laser Manipulators using Data-Driven Surrogate Models

    Authors: Yongjun Yan, Qingpeng Ding, Mingwu Li, Junyan Yan, Shing Shin Cheng

    Abstract: Non-contact laser ablation, a precise thermal technique, simultaneously cuts and coagulates tissue without the insertion errors associated with rigid needles. Human organ motions, such as those in the liver, exhibit rhythmic components influenced by respiratory and cardiac cycles, making effective laser energy delivery to target lesions while compensating for tumor motion crucial. This research in… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  19. arXiv:2407.01801  [pdf, other

    eess.SP

    Joint State and Parameter Estimation Using the Partial Errors-in-Variables Principle

    Authors: Peng Liu, Kailai Li, Gustaf Hendeby, Fredrik Gustafsson

    Abstract: This letter proposes a new method for joint state and parameter estimation in uncertain dynamical systems. We exploit the partial errors-in-variables (PEIV) principle and formulate a regression problem in the sense of weighted total least squares, where the uncertainty in the parameter prior is explicitly considered. Based thereon, the PEIV regression can be solved iteratively through the Kalman s… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 5 pages

  20. arXiv:2407.01530  [pdf, other

    eess.IV cs.CV

    xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart

    Authors: Tianrun Chen, Chaotao Ding, Lanyun Zhu, Tao Xu, Deyi Ji, Yan Wang, Ying Zang, Zejian Li

    Abstract: Convolutional Neural Networks (CNNs) and Vision Transformers (ViT) have been pivotal in biomedical image segmentation, yet their ability to manage long-range dependencies remains constrained by inherent locality and computational overhead. To overcome these challenges, in this technical report, we first propose xLSTM-UNet, a UNet structured deep learning neural network that leverages Vision-LSTM (… ▽ More

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  21. arXiv:2407.01336  [pdf, other

    cs.IT eess.SP

    Compressed Sensing Inspired User Acquisition for Downlink Integrated Sensing and Communication Transmissions

    Authors: Yi Song, Fernando Pedraza, Shuangyang Li, Siyao Li, Han Yu, Giuseppe Caire

    Abstract: This paper investigates radar-assisted user acquisition for downlink multi-user multiple-input multiple-output (MIMO) transmission using Orthogonal Frequency Division Multiplexing (OFDM) signals. Specifically, we formulate a concise mathematical model for the user acquisition problem, where each user is characterized by its delay and beamspace response. Therefore, we propose a two-stage method for… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  22. arXiv:2407.01211  [pdf, other

    cs.CV cs.AI cs.LG eess.IV

    Efficient Cutting Tool Wear Segmentation Based on Segment Anything Model

    Authors: Zongshuo Li, Ding Huo, Markus Meurer, Thomas Bergs

    Abstract: Tool wear conditions impact the surface quality of the workpiece and its final geometric precision. In this research, we propose an efficient tool wear segmentation approach based on Segment Anything Model, which integrates U-Net as an automated prompt generator to streamline the processes of tool wear detection. Our evaluation covered three Point-of-Interest generation methods and further investi… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  23. arXiv:2407.01200  [pdf

    cs.LG cs.AI eess.SP

    Deep Learning Approach for Enhanced Transferability and Learning Capacity in Tool Wear Estimation

    Authors: Zongshuo Li, Markus Meurer, Thomas Bergs

    Abstract: As an integral part of contemporary manufacturing, monitoring systems obtain valuable information during machining to oversee the condition of both the process and the machine. Recently, diverse algorithms have been employed to detect tool wear using single or multiple sources of measurements. In this study, a deep learning approach is proposed for estimating tool wear, considering cutting paramet… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  24. arXiv:2407.01199  [pdf, other

    cs.LG cs.AI eess.SP

    Deep Learning Based Tool Wear Estimation Considering Cutting Conditions

    Authors: Zongshuo Li, Markus Meurer, Thomas Bergs

    Abstract: Tool wear conditions impact the final quality of the workpiece. In this study, we propose a deep learning approach based on a convolutional neural network that incorporates cutting conditions as extra model inputs, aiming to improve tool wear estimation accuracy and fulfill industrial demands for zero-shot transferability. Through a series of milling experiments under various cutting parameters, w… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  25. arXiv:2407.01090  [pdf, other

    eess.IV cs.CV

    Learning 3D Gaussians for Extremely Sparse-View Cone-Beam CT Reconstruction

    Authors: Yiqun Lin, Hualiang Wang, Jixiang Chen, Xiaomeng Li

    Abstract: Cone-Beam Computed Tomography (CBCT) is an indispensable technique in medical imaging, yet the associated radiation exposure raises concerns in clinical practice. To mitigate these risks, sparse-view reconstruction has emerged as an essential research direction, aiming to reduce the radiation dose by utilizing fewer projections for CT reconstruction. Although implicit neural representations have b… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted to MICCAI 2024. Project link: https://github.com/xmed-lab/DIF-Gaussian

  26. arXiv:2407.01018  [pdf, other

    cs.IT eess.SP

    Experimental Comparison of Average-Power Constrained and Peak-Power Constrained 64QAM under Optimal Clip** in 400Gbps Unamplified Coherent Links

    Authors: Wing-Chau Ng, Chuandong Li

    Abstract: We experimentally demonstrated an end-to-end link budget optimization over clip** in 400Gbps unamplified links, showing that the clipped MB distribution outperforms the peak-power constrained 64QAM by 1dB link budget.

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Submitted to European Conference on Optical Communications (ECOC) 2024

  27. arXiv:2407.00896  [pdf, other

    eess.SP cs.AI

    Channel Modeling Aided Dataset Generation for AI-Enabled CSI Feedback: Advances, Challenges, and Solutions

    Authors: Yupeng Li, Gang Li, Zirui Wen, Shuangfeng Han, Shijian Gao, Guangyi Liu, Jiangzhou Wang

    Abstract: The AI-enabled autoencoder has demonstrated great potential in channel state information (CSI) feedback in frequency division duplex (FDD) multiple input multiple output (MIMO) systems. However, this method completely changes the existing feedback strategies, making it impractical to deploy in recent years. To address this issue, this paper proposes a channel modeling aided data augmentation metho… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

  28. arXiv:2407.00837  [pdf, other

    cs.CL cs.AI cs.SD eess.AS

    Towards Robust Speech Representation Learning for Thousands of Languages

    Authors: William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, **chuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe

    Abstract: Self-supervised learning (SSL) has helped extend speech technologies to more languages by reducing the need for labeled data. However, models are still far from supporting the world's 7000+ languages. We propose XEUS, a Cross-lingual Encoder for Universal Speech, trained on over 1 million hours of data across 4057 languages, extending the language coverage of SSL models 4-fold. We combine 1 millio… ▽ More

    Submitted 2 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: Updated affiliations; 20 pages

  29. arXiv:2407.00718  [pdf, other

    eess.IV cs.CV

    ASPS: Augmented Segment Anything Model for Polyp Segmentation

    Authors: Huiqian Li, Dingwen Zhang, Jieru Yao, Longfei Han, Zhongyu Li, Junwei Han

    Abstract: Polyp segmentation plays a pivotal role in colorectal cancer diagnosis. Recently, the emergence of the Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation, leveraging its powerful pre-training capability on large-scale datasets. However, due to the domain gap between natural and endoscopy images, SAM encounters two limitations in achieving effective performan… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI2024

  30. arXiv:2407.00614  [pdf, other

    cs.RO cs.CV eess.IV

    Learning Granularity-Aware Affordances from Human-Object Interaction for Tool-Based Functional Gras** in Dexterous Robotics

    Authors: Fan Yang, Wenrui Chen, Kailun Yang, Haoran Lin, DongSheng Luo, Conghui Tang, Zhiyong Li, Yaonan Wang

    Abstract: To enable robots to use tools, the initial step is teaching robots to employ dexterous gestures for touching specific areas precisely where tasks are performed. Affordance features of objects serve as a bridge in the functional interaction between agents and objects. However, leveraging these affordance cues to help robots achieve functional tool gras** remains unresolved. To address this, we pr… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: The source code and the established dataset will be made publicly available at https://github.com/yangfan293/GAAF-DEX

  31. arXiv:2407.00596  [pdf, other

    eess.IV cs.CV

    HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis

    Authors: Ruining Deng, Quan Liu, Can Cui, Tianyuan Yao, Juming Xiong, Shunxing Bao, Hao Li, Mengmeng Yin, Yu Wang, Shilin Zhao, Yucheng Tang, Haichun Yang, Yuankai Huo

    Abstract: Panoramic image segmentation in computational pathology presents a remarkable challenge due to the morphologically complex and variably scaled anatomy. For instance, the intricate organization in kidney pathology spans multiple layers, from regions like the cortex and medulla to functional units such as glomeruli, tubules, and vessels, down to various cell types. In this paper, we propose a novel… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.19286

  32. arXiv:2407.00280  [pdf, other

    eess.IV cs.CV

    IVCA: Inter-Relation-Aware Video Complexity Analyzer

    Authors: Junqi Liao, Yao Li, Zhuoyuan Li, Li Li, Dong Liu

    Abstract: To meet the real-time analysis requirements of video streaming applications, we propose an inter-relation-aware video complexity analyzer (IVCA) as an extension to VCA. The IVCA addresses the limitation of VCA by considering inter-frame relations, namely motion and reference structure. First, we enhance the accuracy of temporal features by introducing feature-domain motion estimation into the IVCA… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: The report for the solution of second prize winner in ICIP 2024 Grand Challenge on Video Complexity (Team: USTC-iVC_Team1, USTC-iVC_Team2)

  33. Generative Iris Prior Embedded Transformer for Iris Restoration

    Authors: Yubo Huang, Jia Wang, Peipei Li, Liuyu Xiang, Peigang Li, Zhaofeng He

    Abstract: Iris restoration from complexly degraded iris images, aiming to improve iris recognition performance, is a challenging problem. Due to the complex degradation, directly training a convolutional neural network (CNN) without prior cannot yield satisfactory results. In this work, we propose a generative iris prior embedded Transformer model (Gformer), in which we build a hierarchical encoder-decoder… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Our code is available at https://github.com/sawyercharlton/Gformer

    Journal ref: 2023 IEEE International Conference on Multimedia and Expo (ICME), Brisbane, Australia, 2023, pp. 510-515

  34. arXiv:2406.19959  [pdf, other

    cs.SD eess.AS

    RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization

    Authors: Bing Yang, Changsheng Quan, Yabo Wang, Pengyu Wang, Yujie Yang, Ying Fang, Nian Shao, Hui Bu, Xin Xu, Xiaofei Li

    Abstract: The training of deep learning-based multichannel speech enhancement and source localization systems relies heavily on the simulation of room impulse response and multichannel diffuse noise, due to the lack of large-scale real-recorded datasets. However, the acoustic mismatch between simulated and real-world data could degrade the model performance when applying in real-world scenarios. To bridge t… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  35. arXiv:2406.19796  [pdf, other

    eess.IV cs.CV

    Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting

    Authors: Wei Li, **gyang Zhang, Pheng-Ann Heng, Lixu Gu

    Abstract: Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. Task-Incremental Learning (TIL) offers a privacy-preserving training paradigm using tasks arriving sequentially, instead of gathering them due to strict data sharing policies. However, the task evolution can span a wide scope that involves shifts in both image appearanc… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Accepted by MICCAI24

  36. arXiv:2406.19769  [pdf, other

    eess.SP

    Decision Transformer for IRS-Assisted Systems with Diffusion-Driven Generative Channels

    Authors: Jie Zhang, Jun Li, Zhe Wang, Yu Han, Long Shi, Bin Cao

    Abstract: In this paper, we propose a novel diffusion-decision transformer (D2T) architecture to optimize the beamforming strategies for intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) communication systems. The first challenge lies in the expensive computation cost to recover the real-time channel state information (CSI) from the received pilot signals, which usually requi… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  37. arXiv:2406.19749  [pdf, other

    eess.IV cs.CV

    SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation

    Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Bo-Xian Yao, Zeng-Guang Hou

    Abstract: Automatic vessel segmentation is paramount for develo** next-generation interventional navigation systems. However, current approaches suffer from suboptimal segmentation performances due to significant challenges in intraoperative images (i.e., low signal-to-noise ratio, small or slender vessels, and strong interference). In this paper, a novel spatial-frequency learning and topological channel… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  38. arXiv:2406.19718  [pdf, ps, other

    eess.SY

    Global Regulation of Feedforward Nonlinear Systems: A Logic-Based Switching Gain Approach

    Authors: Debao Fan, Xianfu Zhang, Gang Feng, Hanfeng Li

    Abstract: In this article, we investigate the global regulation problem for a class of feedforward nonlinear systems. Notably, the systems under consideration allow unknown input-output-dependent nonlinear growth rates, which has not been considered in existing works. A novel logic-based switching (LBS) gain approach is proposed to counteract system uncertainties and nonlinearities. Furthermore, a tanh-type… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  39. arXiv:2406.19627  [pdf

    eess.SY

    Practical Power System Inertia Monitoring Based on Pumped Storage Hydropower Operation Signature

    Authors: Hongyu Li, Chang Chen, Mark Baldwin, Shutang You, Wenpeng Yu, Lin Zhu, Yilu Liu

    Abstract: This paper proposes a practical method to monitor power system inertia using Pumped Storage Hydropower (PSH) switching-off events. This approach offers real-time system-level inertia estimation with minimal expenses, no disruption, and the inclusion of behind-the-meter inertia. First, accurate inertia estimation is achieved through improved RoCoF calculation that accounts for pre-event RoCoF, redu… ▽ More

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 8 pages, 15 figures

  40. arXiv:2406.19483  [pdf, ps, other

    eess.SP

    Localization in Multipath Environments via Active Sensing with Reconfigurable Intelligent Surfaces

    Authors: Yinghan Li, Wei Yu

    Abstract: This paper investigates an uplink pilot-based wireless indoor localization problem in a multipath environment for a single-input single-output (SISO) narrowband communication system aided by reconfigurable intelligent surface (RIS). The indoor localization problem is challenging because the uplink channel consists of multiple overlap** propagation paths with varying amplitudes and phases, which… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  41. arXiv:2406.19328  [pdf, other

    cs.SD cs.LG eess.AS

    Subtractive Training for Music Stem Insertion using Latent Diffusion Models

    Authors: Ivan Villa-Renteria, Mason L. Wang, Zachary Shah, Zhe Li, Soohyun Kim, Neelesh Ramachandran, Mert Pilanci

    Abstract: We present Subtractive Training, a simple and novel method for synthesizing individual musical instrument stems given other instruments as context. This method pairs a dataset of complete music mixes with 1) a variant of the dataset lacking a specific stem, and 2) LLM-generated instructions describing how the missing stem should be reintroduced. We then fine-tune a pretrained text-to-audio diffusi… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  42. arXiv:2406.19311  [pdf, other

    cs.CR cs.SD eess.AS

    Zero-Query Adversarial Attack on Black-box Automatic Speech Recognition Systems

    Authors: Zheng Fang, Tao Wang, Lingchen Zhao, Shenyi Zhang, Bowen Li, Yunjie Ge, Qi Li, Chao Shen, Qian Wang

    Abstract: In recent years, extensive research has been conducted on the vulnerability of ASR systems, revealing that black-box adversarial example attacks pose significant threats to real-world ASR systems. However, most existing black-box attacks rely on queries to the target ASRs, which is impractical when queries are not permitted. In this paper, we propose ZQ-Attack, a transfer-based adversarial attack… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: To appear in the Proceedings of The ACM Conference on Computer and Communications Security (CCS), 2024

  43. arXiv:2406.19205  [pdf, other

    eess.SP

    Coordinated RSMA for Integrated Sensing and Communication in Emergency UAV Systems

    Authors: Binghan Yao, Ruoguang Li, Yingyang Chen, Li Wang

    Abstract: Recently, unmanned aerial vehicle (UAV)-enabled integrated sensing and communication (ISAC) is emerging as a promising technique for achieving robust and rapid emergency response capabilities. Such a novel framework offers high-quality and cost-efficient C\&S services due to the intrinsic flexibility and mobility of UAVs. In parallel, rate-splitting multiple access (RSMA) is able to achieve a tail… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  44. arXiv:2406.19043  [pdf

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Ya**g Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, **g Qin, Xiahai Zhuang, Claudia Prieto , et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 2 tables

  45. arXiv:2406.18935  [pdf

    eess.SY

    Generalized Averaging Method for Power Electronics Modeling from DC to above Half the Switching Frequency

    Authors: Hongchang Li, Kang** Wang, **gyang Fang, Wenjie Chen, Xu Yang

    Abstract: Modeling power electronic converters at frequencies close to or above half the switching frequency has been difficult due to the time-variant and discontinuous switching actions. This paper uses the properties of moving Fourier coefficients to develop the generalized averaging method, breaking though the limit of half the switching frequency. The paper also proposes the generalized average model f… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  46. arXiv:2406.18840  [pdf

    eess.IV

    Shorter SPECT Scans Using Self-supervised Coordinate Learning to Synthesize Skipped Projection Views

    Authors: Zongyu Li, Yixuan Jia, Xiaojian Xu, Jason Hu, Jeffrey A. Fessler, Yuni K. Dewaraja

    Abstract: Purpose: This study addresses the challenge of extended SPECT imaging duration under low-count conditions, as encountered in Lu-177 SPECT imaging, by develo** a self-supervised learning approach to synthesize skipped SPECT projection views, thus shortening scan times in clinical settings. Methods: We employed a self-supervised coordinate-based learning technique, adapting the neural radiance fie… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 25 pages, 5568 words

  47. arXiv:2406.18679  [pdf, other

    eess.AS cs.AI cs.CL cs.LG

    Speakers Unembedded: Embedding-free Approach to Long-form Neural Diarization

    Authors: Xiang Li, Vivek Govindan, Rohit Paturi, Sundararajan Srinivasan

    Abstract: End-to-end neural diarization (EEND) models offer significant improvements over traditional embedding-based Speaker Diarization (SD) approaches but falls short on generalizing to long-form audio with large number of speakers. EEND-vector-clustering method mitigates this by combining local EEND with global clustering of speaker embeddings from local windows, but this requires an additional speaker… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted at INTERSPEECH 2024

  48. arXiv:2406.18536  [pdf, other

    eess.SY cs.AI cs.AR

    Reliable Interval Prediction of Minimum Operating Voltage Based on On-chip Monitors via Conformalized Quantile Regression

    Authors: Yuxuan Yin, Xiaoxiao Wang, Rebecca Chen, Chen He, Peng Li

    Abstract: Predicting the minimum operating voltage ($V_{min}$) of chips is one of the important techniques for improving the manufacturing testing flow, as well as ensuring the long-term reliability and safety of in-field systems. Current $V_{min}$ prediction methods often provide only point estimates, necessitating additional techniques for constructing prediction confidence intervals to cover uncertaintie… ▽ More

    Submitted 3 May, 2024; originally announced June 2024.

    Comments: Accepted by DATE 2024. Camera-ready version

  49. arXiv:2406.18467  [pdf, other

    eess.SY math.OC

    Algebraic Connectivity Control and Maintenance in Multi-Agent Networks under Attack

    Authors: Wenjie Zhao, Diego Deplano, Zhiwu Li, Alessandro Giua, Mauro Franceschelli

    Abstract: This paper studies the problem of increasing the connectivity of an ad-hoc peer-to-peer network subject to cyber-attacks targeting the agents in the network. The adopted strategy involves the design of local interaction rules for the agents to locally modify the graph topology by adding and removing links with neighbors. Two distributed protocols are presented to boost the algebraic connectivity o… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  50. arXiv:2406.18345  [pdf, other

    cs.LG eess.SP

    EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition

    Authors: Yi Ding, Chengxuan Tong, Shuailei Zhang, Muyun Jiang, Yong Li, Kevin Lim Jun Liang, Cuntai Guan

    Abstract: Integrating prior knowledge of neurophysiology into neural network architecture enhances the performance of emotion decoding. While numerous techniques emphasize learning spatial and short-term temporal patterns, there has been limited emphasis on capturing the vital long-term contextual information associated with emotional cognitive processes. In order to address this discrepancy, we introduce a… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible