Showing 1–50 of 267 results for author: Xue, W

Searching in archive eess.
  1. arXiv:2406.18102  [pdf]

    eess.IV cs.CV

    A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation

    Authors: Muwei Jian, Hongyu Chen, Zaiyong Zhang, Nan Yang, Haorang Zhang, Lifu Ma, Wenjing Xu, Huixiang Zhi

    Abstract: Recently, Computer-Aided Diagnosis (CAD) systems have emerged as indispensable tools in clinical diagnostic workflows, significantly alleviating the burden on radiologists. Nevertheless, despite their integration into clinical settings, CAD systems encounter limitations. Specifically, while CAD systems can achieve high performance in the detection of lung nodules, they face challenges in accuratel…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.16943  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    EarDA: Towards Accurate and Data-Efficient Earable Activity Sensing

    Authors: Shengzhe Lyu, Yongliang Chen, Di Duan, Renqi Jia, Weitao Xu

    Abstract: In the realm of smart sensing with the Internet of Things, earable devices are empowered with the capability of multi-modality sensing and intelligence of context-aware computing, leading to its wide usage in Human Activity Recognition (HAR). Nonetheless, unlike the movements captured by Inertial Measurement Unit (IMU) sensors placed on the upper or lower body, those motion signals obtained from e…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: accepted by 2024 IEEE Coupling of Sensing & Computing in AIoT Systems (CSCAIoT)

  3. arXiv:2406.16200  [pdf, other]

    cs.LG cs.CR cs.IT eess.SP

    Towards unlocking the mystery of adversarial fragility of neural networks

    Authors: Jingchao Gao, Raghu Mudumbai, Xiaodong Wu, Jirong Yi, Catherine Xu, Hui Xie, Weiyu Xu

    Abstract: In this paper, we study the adversarial robustness of deep neural networks for classification tasks. We look at the smallest magnitude of possible additive perturbations that can change the output of a classification algorithm. We provide a matrix-theoretic explanation of the adversarial fragility of deep neural network for classification. In particular, our theoretical results show that neural ne…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 21 pages

  4. arXiv:2406.14795  [pdf, other]

    cs.RO eess.SY

    Design and Control of a Low-cost Non-backdrivable End-effector Upper Limb Rehabilitation Device

    Authors: Fulan Li, Yunfei Guo, Wenda Xu, Weide Zhang, Fangyun Zhao, Baiyu Wang, Huaguang Du, Chengkun Zhang

    Abstract: This paper presents the development of an upper limb end-effector based rehabilitation device for stroke patients, offering assistance or resistance along any 2-dimensional trajectory during physical therapy. It employs a non-backdrivable ball-screw-driven mechanism for enhanced control accuracy. The control system features three novel algorithms: First, the Implicit Euler velocity control algorit…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 12 pages, 15 figures

  5. arXiv:2406.11636  [pdf, other]

    eess.IV cs.CV cs.LG

    Feasibility of Federated Learning from Client Databases with Different Brain Diseases and MRI Modalities

    Authors: Felix Wagner, Wentian Xu, Pramit Saha, Ziyun Liang, Daniel Whitehouse, David Menon, Natalie Voets, J. Alison Noble, Konstantinos Kamnitsas

    Abstract: Segmentation models for brain lesions in MRI are commonly developed for a specific disease and trained on data with a predefined set of MRI modalities. Each such model cannot segment the disease using data with a different set of MRI modalities, nor can it segment any other type of disease. Moreover, this training paradigm does not allow a model to benefit from learning from heterogeneous database…

    Submitted 17 June, 2024; originally announced June 2024.

    ACM Class: I.4.9; I.4.6; I.2.11; I.4.0

  6. arXiv:2406.06640  [pdf]

    physics.comp-ph eess.IV physics.optics

    A high-performance reconstruction method for partially coherent ptychography

    Authors: Wenhui Xu, Shoucong Ning, Pengju Sheng, Huixiang Lin, Angus I Kirkland, Yong Peng, Fucai Zhang

    Abstract: Ptychography is now integrated as a tool in mainstream microscopy allowing quantitative and high-resolution imaging capabilities over a wide field of view. However, its ultimate performance is inevitably limited by the available coherent flux when implemented using electrons or laboratory X-ray sources. We present a universal reconstruction algorithm with high tolerance to low coherence for both f…

    Submitted 9 June, 2024; originally announced June 2024.

  7. arXiv:2406.05763  [pdf, other]

    eess.AS

    WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark

    Authors: Linhan Ma, Dake Guo, Kun Song, Yuepeng Jiang, Shuai Wang, Liumeng Xue, Weiming Xu, Huan Zhao, Binbin Zhang, Lei Xie

    Abstract: With the development of large text-to-speech (TTS) models and scale-up of the training data, state-of-the-art TTS systems have achieved impressive performance. In this paper, we present WenetSpeech4TTS, a multi-domain Mandarin corpus derived from the open-sourced WenetSpeech dataset. Tailored for the text-to-speech tasks, we refined WenetSpeech by adjusting segment boundaries, enhancing the audio…

    Submitted 19 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH2024

  8. arXiv:2406.03888  [pdf, ps, other]

    cs.IT eess.SP

    MSE-Based Training and Transmission Optimization for MIMO ISAC Systems

    Authors: Zhenyao He, Wei Xu, Hong Shen, Yonina C. Eldar, Xiaohu You

    Abstract: In this paper, we investigate a multiple-input multiple-output (MIMO) integrated sensing and communication (ISAC) system under typical block-fading channels. As a non-trivial extension to most existing works on ISAC, both the training and transmission signals sent by the ISAC transmitter are exploited for sensing. Specifically, we develop two training and transmission design schemes to minimize a…

    Submitted 6 June, 2024; originally announced June 2024.

  9. arXiv:2406.02291  [pdf, other]

    cs.NI eess.SP

    A deep-learning-based MAC for integrating channel access, rate adaptation and channel switch

    Authors: Jiantao Xin, Wei Xu, Bin Cao, Taotao Wang, Shengli Zhang

    Abstract: With increasing density and heterogeneity in unlicensed wireless networks, traditional MAC protocols, such as carrier-sense multiple access with collision avoidance (CSMA/CA) in Wi-Fi networks, are experiencing performance degradation. This is manifested in increased collisions and extended backoff times, leading to diminished spectrum efficiency and protocol coordination. Addressing these issues,…

    Submitted 4 June, 2024; originally announced June 2024.

  10. arXiv:2405.17329  [pdf, other]

    cs.IT eess.SP

    Joint MIMO Transceiver and Reflector Design for Reconfigurable Intelligent Surface-Assisted Communication

    Authors: Yaqiong Zhao, Jindan Xu, Wei Xu, Kezhi Wang, Xinquan Ye, Chau Yuen, Xiaohu You

    Abstract: In this paper, we consider a reconfigurable intelligent surface (RIS)-assisted multiple-input multiple-output communication system with multiple antennas at both the base station (BS) and the user. We plan to maximize the achievable rate through jointly optimizing the transmit precoding matrix, the receive combining matrix, and the RIS reflection matrix under the constraints of the transmit power…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 14 pages, 12 figures

  11. arXiv:2405.11386  [pdf, other]

    eess.IV cs.CV

    Liver Fat Quantification Network with Body Shape

    Authors: Qiyue Wang, Wu Xue, Xiaoke Zhang, Fang Jin, James Hahn

    Abstract: It is critically important to detect the content of liver fat as it is related to cardiac complications and cardiovascular disease mortality. However, existing methods are either associated with high cost and/or medical complications (e.g., liver biopsy, imaging technology) or only roughly estimate the grades of steatosis. In this paper, we propose a deep neural network to estimate the percentage…

    Submitted 30 May, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

  12. arXiv:2405.07682  [pdf, other]

    cs.SD cs.AI cs.CL cs.MM eess.AS

    FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation

    Authors: Jianyi Chen, Wei Xue, Xu Tan, Zhen Ye, Qifeng Liu, Yike Guo

    Abstract: Singing Accompaniment Generation (SAG), which generates instrumental music to accompany input vocals, is crucial to developing human-AI symbiotic art creation systems. The state-of-the-art method, SingSong, utilizes a multi-stage autoregressive (AR) model for SAG, however, this method is extremely slow as it generates semantic and acoustic tokens recursively, and this makes it impossible for real-…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: IJCAI 2024

  13. arXiv:2405.07029  [pdf]

    cs.SD eess.AS

    A framework of text-dependent speaker verification for chinese numerical string corpus

    Authors: Litong Zheng, Feng Hong, Weijie Xu, Wan Zheng

    Abstract: The Chinese numerical string corpus serves as a valuable resource for speaker verification, particularly in financial transactions. Research indicates that in short speech scenarios, text-dependent speaker verification (TD-SV) consistently outperforms text-independent speaker verification (TI-SV). However, TD-SV potentially includes the validation of text information, that can be negatively impa…

    Submitted 21 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.01645

  14. arXiv:2405.04027  [pdf, other]

    eess.SP

    Joint Visibility Region Detection and Channel Estimation for XL-MIMO Systems via Alternating MAP

    Authors: Wenkang Xu, An Liu, Min-jian Zhao

    Abstract: We investigate a joint visibility region (VR) detection and channel estimation problem in extremely large-scale multiple-input-multiple-output (XL-MIMO) systems, where near-field propagation and spatial non-stationary effects exist. In this case, each scatterer can only see a subset of antennas, i.e., it has a certain VR over the antennas. Because of the spatial correlation among adjacent sub-arra…

    Submitted 21 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 13 pages, 14 figures, submitted to IEEE TSP

  15. arXiv:2405.03952  [pdf, other]

    cs.SD cs.CL eess.AS

    HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous Speech

    Authors: Zhongren Dong, Zixing Zhang, Weixiang Xu, Jing Han, Jianjun Ou, Björn W. Schuller

    Abstract: Automatically detecting Alzheimer's Disease (AD) from spontaneous speech plays an important role in its early diagnosis. Recent approaches highly rely on the Transformer architectures due to its efficiency in modelling long-range context dependencies. However, the quadratic increase in computational complexity associated with self-attention and the length of audio poses a challenge when deploying…

    Submitted 6 May, 2024; originally announced May 2024.

    Journal ref: published at ICASSP 2024

  16. arXiv:2405.01197  [pdf, other]

    eess.SP cs.IT

    Optimal Beamforming for Bistatic MIMO Sensing

    Authors: Tobias Laas, Ronald Boehnke, Wen Xu

    Abstract: This paper considers the beamforming optimization for sensing a point-like scatterer using a bistatic multiple-input multiple-output (MIMO) orthogonal frequency-division multiplexing (OFDM) radar, which could be part of a joint communication and sensing system. The goal is to minimize the Cramér-Rao bound on the target position's estimation error, where the radar already knows an approximate posit…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 6 pages, 6 figures. Submitted to the IEEE for possible publication

  17. arXiv:2404.19750  [pdf, other]

    cs.IT eess.SP

    A Joint Communication and Computation Design for Distributed RISs Assisted Probabilistic Semantic Communication in IIoT

    Authors: Zhouxiang Zhao, Zhaohui Yang, Chongwen Huang, Li Wei, Qianqian Yang, Caijun Zhong, Wei Xu, Zhaoyang Zhang

    Abstract: In this paper, the problem of spectral-efficient communication and computation resource allocation for distributed reconfigurable intelligent surfaces (RISs) assisted probabilistic semantic communication (PSC) in industrial Internet-of-Things (IIoT) is investigated. In the considered model, multiple RISs are deployed to serve multiple users, while PSC adopts compute-then-transmit protocol to reduc…

    Submitted 30 April, 2024; originally announced April 2024.

  18. arXiv:2404.18081  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG cs.MM eess.AS

    ComposerX: Multi-Agent Symbolic Music Composition with LLMs

    Authors: Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo

    Abstract: Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints. While demonstrating impressive capabilities in STEM subjects, current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and C…

    Submitted 30 April, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  19. Dynamic fault detection and diagnosis for alkaline water electrolyzer with variational Bayesian Sparse principal component analysis

    Authors: Qi Zhang, Weihua Xu, Lei Xie, Hongye Su

    Abstract: Electrolytic hydrogen production serves as not only a vital source of green hydrogen but also a key strategy for addressing renewable energy consumption challenges. For the safe production of hydrogen through alkaline water electrolyzer (AWE), dependable process monitoring technology is essential. However, random noise can easily contaminate the AWE process data collected in industrial settings, p…

    Submitted 23 April, 2024; originally announced April 2024.

    Journal ref: Journal of Process Control, 135:103173, March 2024. ISSN 0959-1524

  20. arXiv:2404.14700  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    FlashSpeech: Efficient Zero-Shot Speech Synthesis

    Authors: Zhen Ye, Zeqian Ju, Haohe Liu, Xu Tan, Jianyi Chen, Yiwen Lu, Peiwen Sun, Jiahao Pan, Weizhen Bian, Shulin He, Qifeng Liu, Yike Guo, Wei Xue

    Abstract: Recent progress in large-scale zero-shot speech synthesis has been significantly advanced by language models and diffusion models. However, the generation process of both methods is slow and computationally intensive. Efficient speech synthesis using a lower computing budget to achieve quality on par with previous work remains a significant challenge. In this paper, we present FlashSpeech, a large…

    Submitted 24 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: Efficient zero-shot speech synthesis

  21. arXiv:2404.13388  [pdf]

    eess.IV cs.CV cs.LG

    Diagnosis of Multiple Fundus Disorders Amidst a Scarcity of Medical Experts Via Self-supervised Machine Learning

    Authors: Yong Liu, Mengtian Kang, Shuo Gao, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Arokia Nathan, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Luigi Occhipinti

    Abstract: Fundus diseases are major causes of visual impairment and blindness worldwide, especially in underdeveloped regions, where the shortage of ophthalmologists hinders timely diagnosis. AI-assisted fundus image analysis has several advantages, such as high accuracy, reduced workload, and improved accessibility, but it requires a large amount of expert-annotated data to build reliable models. To addres…

    Submitted 23 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  22. arXiv:2404.13386  [pdf]

    eess.IV cs.CV cs.LG

    SSVT: Self-Supervised Vision Transformer For Eye Disease Diagnosis Based On Fundus Images

    Authors: Jiaqi Wang, Mengtian Kang, Yong Liu, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Shuo Gao, Luigi G. Occhipinti

    Abstract: Machine learning-based fundus image diagnosis technologies trigger worldwide interest owing to their benefits such as reducing medical resource power and providing objective evaluation results. However, current methods are commonly based on supervised methods, bringing in a heavy workload to biomedical staff and hence suffering in expanding effective databases. To address this issue, in this artic…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ISBI 2024

  23. arXiv:2404.10365  [pdf, other]

    cs.NI cs.LG eess.SP

    Learning Wireless Data Knowledge Graph for Green Intelligent Communications: Methodology and Experiments

    Authors: Yongming Huang, Xiaohu You, Hang Zhan, Shiwen He, Ningning Fu, Wei Xu

    Abstract: Intelligent communications have played a pivotal role in shaping the evolution of 6G networks. Native artificial intelligence (AI) within green communication systems must meet stringent real-time requirements. To achieve this, deploying lightweight and resource-efficient AI models is necessary. However, as wireless networks generate a multitude of data fields and indicators during operation, only…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 12 pages, 11 figures

  24. arXiv:2404.09519  [pdf, other]

    cs.LG eess.SY

    Nonlinear sparse variational Bayesian learning based model predictive control with application to PEMFC temperature control

    Authors: Qi Zhang, Lei Wang, Weihua Xu, Hongye Su, Lei Xie

    Abstract: The accuracy of the underlying model predictions is crucial for the success of model predictive control (MPC) applications. If the model is unable to accurately analyze the dynamics of the controlled system, the performance and stability guarantees provided by MPC may not be achieved. Learning-based MPC can learn models from data, improving the applicability and reliability of MPC. This study deve…

    Submitted 15 April, 2024; originally announced April 2024.

  25. arXiv:2404.08490  [pdf, other]

    eess.SP

    SemHARQ: Semantic-Aware HARQ for Multi-task Semantic Communications

    Authors: Jiangjing Hu, Fengyu Wang, Wenjun Xu, Hui Gao, Ping Zhang

    Abstract: Intelligent task-oriented semantic communications (SemComs) have witnessed great progress with the development of deep learning (DL). In this paper, we propose a semantic-aware hybrid automatic repeat request (SemHARQ) framework for the robust and efficient transmissions of semantic features. First, to improve the robustness and effectiveness of semantic coding, a multi-task semantic encoder is pr…

    Submitted 12 April, 2024; originally announced April 2024.

  26. arXiv:2404.07721  [pdf, other]

    eess.SP cs.IT

    Trainable Joint Channel Estimation, Detection and Decoding for MIMO URLLC Systems

    Authors: Yi Sun, Hong Shen, Bingqing Li, Wei Xu, Pengcheng Zhu, Nan Hu, Chunming Zhao

    Abstract: The receiver design for multi-input multi-output (MIMO) ultra-reliable and low-latency communication (URLLC) systems can be a tough task due to the use of short channel codes and few pilot symbols. Consequently, error propagation can occur in traditional turbo receivers, leading to performance degradation. Moreover, the processing delay induced by information exchange between different modules may…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 17 pages, 12 figures, accepted by IEEE Transactions on Wireless Communications

  27. arXiv:2404.06393  [pdf, other]

    cs.SD cs.AI eess.AS

    MuPT: A Generative Symbolic Music Pretrained Transformer

    Authors: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, et al. (4 additional authors not shown)

    Abstract: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the chal…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  28. arXiv:2404.05322  [pdf]

    eess.SY

    A Power Management and Control System for Portable Ecosystem Monitoring Devices

    Authors: Marcel Balle, Wenxiu Xu, Kevin FA Darras, Thomas Cherico Wanger

    Abstract: Recent advances in Internet of Things (IoT) and Artificial Intelligence (AI) technologies help ecosystem monitoring to shift towards automated monitoring with low power sensors and embedded vision on powerful processing units. Vision-based monitoring devices need an effective power management and control system (PMCS) with system-adapted power input and output capabilities to achieve power-efficie…

    Submitted 8 April, 2024; originally announced April 2024.

  29. arXiv:2403.19185  [pdf, other]

    cs.IT eess.SP

    Deep CSI Compression for Dual-Polarized Massive MIMO Channels with Disentangled Representation Learning

    Authors: Suhang Fan, Wei Xu, Renjie Xie, Shi Jin, Derrick Wing Kwan Ng, Naofal Al-Dhahir

    Abstract: Channel state information (CSI) feedback is critical for achieving the promised advantages of enhancing spectral and energy efficiencies in massive multiple-input multiple-output (MIMO) wireless communication systems. Deep learning (DL)-based methods have been proven effective in reducing the required signaling overhead for CSI feedback. In practical dual-polarized MIMO scenarios, channels in the…

    Submitted 28 March, 2024; originally announced March 2024.

  30. arXiv:2403.17249  [pdf, other]

    cs.RO eess.SY

    Impact-Aware Bimanual Catching of Large-Momentum Objects

    Authors: Lei Yan, Theodoros Stouraitis, João Moura, Wenfu Xu, Michael Gienger, Sethu Vijayakumar

    Abstract: This paper investigates one of the most challenging tasks in dynamic manipulation -- catching large-momentum moving objects. Beyond the realm of quasi-static manipulation, dealing with highly dynamic objects can significantly improve the robot's capability of interacting with its surrounding environment. Yet, the inevitable motion mismatch between the fast moving object and the approaching robot w…

    Submitted 25 March, 2024; originally announced March 2024.

  31. Secure Outage Analysis for RIS-Aided MISO Systems with Randomly Located Eavesdroppers

    Authors: Wei Shi, Jindan Xu, Wei Xu, Chau Yuen, A. Lee Swindlehurst, Xiaohu You, Chunming Zhao

    Abstract: In this paper, we consider the physical layer security of an RIS-assisted multiple-antenna communication system with randomly located eavesdroppers. The exact distributions of the received signal-to-noise-ratios (SNRs) at the legitimate user and the eavesdroppers located according to a Poisson point process (PPP) are derived, and a closed-form expression for the secrecy outage probability (SOP) is…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted by 2023 IEEE Globecom Workshops (GC Wkshps). arXiv admin note: substantial text overlap with arXiv:2312.16814

  32. arXiv:2403.05753  [pdf, other]

    eess.IV cs.CV

    UDCR: Unsupervised Aortic DSA/CTA Rigid Registration Using Deep Reinforcement Learning and Overlap Degree Calculation

    Authors: Wentao Liu, Bowen Liang, Weijin Xu, Tong Tian, Qingsheng Lu, Xipeng Pan, Haoyuan Li, Siyu Tian, Huihua Yang, Ruisheng Su

    Abstract: The rigid registration of aortic Digital Subtraction Angiography (DSA) and Computed Tomography Angiography (CTA) can provide 3D anatomical details of the vasculature for the interventional surgical treatment of conditions such as aortic dissection and aortic aneurysms, holding significant value for clinical research. However, the current methods for 2D/3D image registration are dependent on manual…

    Submitted 8 March, 2024; originally announced March 2024.

  33. arXiv:2403.02693  [pdf, other]

    cs.MM eess.IV

    Optimizing Mobile-Friendly Viewport Prediction for Live 360-Degree Video Streaming

    Authors: Lei Zhang, Tao Long, Weizhen Xu, Laizhong Cui, Jiangchuan Liu

    Abstract: Viewport prediction is the crucial task for adaptive 360-degree video streaming, as the bitrate control algorithms usually require the knowledge of the user's viewing portions of the frames. Various methods are studied and adopted for viewport prediction from less accurate statistic tools to highly calibrated deep neural networks. Conventionally, it is difficult to implement sophisticated deep lea…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 14 pages

  34. arXiv:2403.00434  [pdf, other]

    cs.IT eess.SP

    Probabilistic Semantic Communication over Wireless Networks with Rate Splitting

    Authors: Zhouxiang Zhao, Zhaohui Yang, Ye Hu, Qianqian Yang, Wei Xu, Zhaoyang Zhang

    Abstract: In this paper, the problem of joint transmission and computation resource allocation for probabilistic semantic communication (PSC) system with rate splitting multiple access (RSMA) is investigated. In the considered model, the base station (BS) needs to transmit a large amount of data to multiple users with RSMA. Due to limited communication resources, the BS is required to utilize semantic commu…

    Submitted 1 March, 2024; originally announced March 2024.

  35. arXiv:2402.16153  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG cs.MM eess.AS

    ChatMusician: Understanding and Generating Music Intrinsically with LLM

    Authors: Ruibin Yuan, Hanfeng Lin, Yi Wang, Zeyue Tian, Shangda Wu, Tianhao Shen, Ge Zhang, Yuhang Wu, Cong Liu, Ziya Zhou, Ziyang Ma, Liumeng Xue, Ziyu Wang, Qin Liu, Tianyu Zheng, Yizhi Li, Yinghao Ma, Yiming Liang, Xiaowei Chi, Ruibo Liu, Zili Wang, Pengfei Li, Jingcheng Wu, Chenghua Lin, Qifeng Liu, et al. (10 additional authors not shown)

    Abstract: While Large Language Models (LLMs) demonstrate impressive capabilities in text generation, we find that their ability has yet to be generalized to music, humanity's creative language. We introduce ChatMusician, an open-source LLM that integrates intrinsic musical abilities. It is based on continual pre-training and finetuning LLaMA2 on a text-compatible music representation, ABC notation, and the…

    Submitted 25 February, 2024; originally announced February 2024.

    Comments: GitHub: https://shanghaicannon.github.io/ChatMusician/

  36. arXiv:2401.17108  [pdf, other]

    cs.IT eess.SP

    Joint Semantic Communication and Target Sensing for 6G Communication System

    Authors: Yinchao Yang, Mohammad Shikh-Bahaei, Zhaohui Yang, Chongwen Huang, Wei Xu, Zhaoyang Zhang

    Abstract: This paper investigates the secure resource allocation for a downlink integrated sensing and communication system with multiple legal users and potential eavesdroppers. In the considered model, the base station (BS) simultaneously transmits sensing and communication signals through beamforming design, where the sensing signals can be viewed as artificial noise to enhance the security of communicat…

    Submitted 30 January, 2024; originally announced January 2024.

  37. arXiv:2401.06669  [pdf, other]

    cs.IT eess.SP

    User-Centric Cell-Free Wireless Networks for 6G: Communication Theoretic Models and Research Challenges

    Authors: Fabian Göttsch, Giuseppe Caire, Wen Xu, Martin Schubert

    Abstract: This paper presents a comprehensive communication theoretic model for the physical layer of a cell-free user-centric network, formed by user equipments (UEs), radio units (RUs), and decentralized units (DUs), uniformly spatially distributed over a given coverage area. We consider RUs equipped with multiple antennas, and focus on the regime where the UE, RU, and DU densities are constant and theref…

    Submitted 12 January, 2024; originally announced January 2024.

  38. arXiv:2401.05551  [pdf, other]

    cs.CL cs.SD eess.AS

    Useful Blunders: Can Automated Speech Recognition Errors Improve Downstream Dementia Classification?

    Authors: Changye Li, Weizhe Xu, Trevor Cohen, Serguei Pakhomov

    Abstract: Objectives: We aimed to investigate how errors from automatic speech recognition (ASR) systems affect dementia classification accuracy, specifically in the "Cookie Theft" picture description task. We aimed to assess whether imperfect ASR-generated transcripts could provide valuable information for distinguishing between language samples from cognitively healthy individuals and those wit…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: To appear in the Journal of Biomedical Informatics

  39. arXiv:2401.04570  [pdf, other]

    eess.IV cs.CV

    An Automatic Cascaded Model for Hemorrhagic Stroke Segmentation and Hemorrhagic Volume Estimation

    Authors: Weijin Xu, Zhuang Sha, Huihua Yang, Rongcai Jiang, Zhanying Li, Wentao Liu, Ruisheng Su

    Abstract: Hemorrhagic Stroke (HS) has a rapid onset and is a serious condition that poses a great health threat. Promptly and accurately delineating the bleeding region and estimating the volume of bleeding in Computer Tomography (CT) images can assist clinicians in treatment planning, leading to improved treatment outcomes for patients. In this paper, a cascaded 3D model is constructed based on UNet to per…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted by SWITCH2023: Stroke Workshop on Imaging and Treatment CHallenges, a workshop at MICCAI 2023

  40. DJCM: A Deep Joint Cascade Model for Singing Voice Separation and Vocal Pitch Estimation

    Authors: Haojie Wei, Xueke Cao, Wenbo Xu, Tangpeng Dan, Yueguo Chen

    Abstract: Singing voice separation and vocal pitch estimation are pivotal tasks in music information retrieval. Existing methods for simultaneous extraction of clean vocals and vocal pitches can be classified into two categories: pipeline methods and naive joint learning methods. However, the efficacy of these methods is limited by the following problems: On the one hand, pipeline methods train models for e…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: This paper has been accepted by ICASSP 2024

  41. arXiv:2401.03800  [pdf, other]

    cs.CV eess.IV

    MvKSR: Multi-view Knowledge-guided Scene Recovery for Hazy and Rainy Degradation

    Authors: Dong Yang, Wenyu Xu, Yuan Gao, Yuxu Lu, Jingming Zhang, Yu Guo

    Abstract: High-quality imaging is crucial for ensuring safety supervision and intelligent deployment in fields like transportation and industry. It enables precise and detailed monitoring of operations, facilitating timely detection of potential hazards and efficient management. However, adverse weather conditions, such as atmospheric haziness and precipitation, can have a significant impact on image qualit…

    Submitted 8 January, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

  42. arXiv:2401.03697  [pdf, other]

    cs.SD eess.AS

    An audio-quality-based multi-strategy approach for target speaker extraction in the MISP 2023 Challenge

    Authors: Runduo Han, Xiaopeng Yan, Weiming Xu, Pengcheng Guo, Jiayao Sun, He Wang, Quan Lu, Ning Jiang, Lei Xie

    Abstract: This paper describes our audio-quality-based multi-strategy approach for the audio-visual target speaker extraction (AVTSE) task in the Multi-modal Information based Speech Processing (MISP) 2023 Challenge. Specifically, our approach adopts different extraction strategies based on the audio quality, striking a balance between interference removal and speech preservation, which benefits the back-en…

    Submitted 6 March, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  43. arXiv:2401.01792  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD

    CoMoSVC: Consistency Model-based Singing Voice Conversion

    Authors: Yiwen Lu, Zhen Ye, Wei Xue, Xu Tan, Qifeng Liu, Yike Guo

    Abstract: The diffusion-based Singing Voice Conversion (SVC) methods have achieved remarkable performances, producing natural audios with high similarity to the target timbre. However, the iterative sampling process results in slow inference speed, and acceleration thus becomes crucial. In this paper, we propose CoMoSVC, a consistency model-based SVC method, which aims to achieve both high-quality generatio…

    Submitted 3 January, 2024; originally announced January 2024.

  44. On Secrecy Performance of RIS-Assisted MISO Systems over Rician Channels with Spatially Random Eavesdroppers

    Authors: Wei Shi, Jindan Xu, Wei Xu, Chau Yuen, A. Lee Swindlehurst, Chunming Zhao

    Abstract: Reconfigurable intelligent surface (RIS) technology is emerging as a promising technique for performance enhancement for next-generation wireless networks. This paper investigates the physical layer security of an RIS-assisted multiple-antenna communication system in the presence of random spatially distributed eavesdroppers. The RIS-to-ground channels are assumed to experience Rician fading. Usin…

    Submitted 21 March, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: Accepted by IEEE Transactions on Wireless Communications

  45. arXiv:2312.12874  [pdf, other]

    cs.IT eess.SP

    Deep-Unfolded Joint Activity and Data Detection for Grant-Free Transmission in Cell-Free Systems

    Authors: Gangle Sun, Wenjin Wang, Wei Xu, Christoph Studer

    Abstract: Massive grant-free transmission and cell-free wireless communication systems have emerged as pivotal enablers for massive machine-type communication. This paper proposes a deep-unfolding-based joint activity and data detection (DU-JAD) algorithm for massive grant-free transmission in cell-free systems. We first formulate a joint activity and data detection optimization problem, which we solve appr…

    Submitted 27 February, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Submitted to ISWCS 2024

  46. arXiv:2312.07379  [pdf, other]

    eess.SP

    Sensor Fusion and Resource Management in MIMO-OFDM Joint Sensing and Communication

    Authors: Elia Favarelli, Elisabetta Matricardi, Lorenzo Pucci, Wen Xu, Enrico Paolini, Andrea Giorgetti

    Abstract: This study explores the promising potential of integrating sensing capabilities into multiple-input multiple-output (MIMO)-orthogonal frequency division multiplexing (OFDM)-based networks through innovative multi-sensor fusion techniques, tracking algorithms, and resource management. A novel data fusion technique is proposed within the MIMO-OFDM system, which promotes cooperative sensing among mon…

    Submitted 12 December, 2023; originally announced December 2023.

  47. arXiv:2312.04713  [pdf, other]

    cs.CV cs.AI cs.LG eess.SP

    gcDLSeg: Integrating Graph-cut into Deep Learning for Binary Semantic Segmentation

    Authors: Hui Xie, Weiyu Xu, Ya Xing Wang, John Buatti, Xiaodong Wu

    Abstract: Binary semantic segmentation in computer vision is a fundamental problem. As a model-based segmentation method, the graph-cut approach was one of the most successful binary segmentation methods thanks to its global optimality guarantee of the solutions and its practical polynomial-time complexity. Recently, many deep learning (DL) based methods have been developed for this task and yielded remarka…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: 12 pages

  48. arXiv:2312.04602  [pdf, ps, other]

    cs.IT eess.SP

    Low-Complexity Channel Estimation for Extremely Large-Scale MIMO in Near Field

    Authors: Chun Huang, Jindan Xu, Wei Xu, Xiaohu You, Chau Yuen, Yijian Chen

    Abstract: The extremely large-scale massive multiple-input multiple-output (XL-MIMO) has the potential to achieve boosted spectral efficiency and refined spatial resolution for future wireless networks. However, channel estimation for XL-MIMO is challenging since the large number of antennas results in high computational complexity with the near-field effect. In this letter, we propose a low-complexity sequ…

    Submitted 7 December, 2023; originally announced December 2023.

  49. arXiv:2312.03299  [pdf, other]

    cs.IT eess.SP

    Channel-Transferable Semantic Communications for Multi-User OFDM-NOMA Systems

    Authors: Lan Lin, Wenjun Xu, Fengyu Wang, Yimeng Zhang, Wei Zhang, ** Zhang

    Abstract: Semantic communications are expected to become the core new paradigms of the sixth generation (6G) wireless networks. Most existing works implicitly utilize channel information for codecs training, which leads to poor communications when channel type or statistical characteristics change. To tackle this issue posed by various channels, a novel channel-transferable semantic communications (CT-SemCo…

    Submitted 6 December, 2023; originally announced December 2023.

  50. arXiv:2312.02498   

    eess.SY

    Provable Reinforcement Learning for Networked Control Systems with Stochastic Packet Disordering

    Authors: Wenqian Xue, Yi Jiang, Frank L. Lewis, Bosen Lian

    Abstract: This paper formulates a stochastic optimal control problem for linear networked control systems featuring stochastic packet disordering with a unique stabilizing solution certified. The problem is solved by proposing reinforcement learning algorithms. A measurement method is first presented to deal with PD and calculate the newest control input. The NCSs with stochastic PD are modeled as stochasti…

    Submitted 11 December, 2023; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: This is a wrong version with problem setting and description errors in main sections