Showing 1–50 of 97 results for author: Song, S

Searching in archive eess.
  1. arXiv:2406.19464  [pdf, other]

    cs.RO cs.AI cs.CV cs.SD eess.AS

    ManiWAV: Learning Robot Manipulation from In-the-Wild Audio-Visual Data

    Authors: Zeyi Liu, Cheng Chi, Eric Cousineau, Naveen Kuppuswamy, Benjamin Burchfiel, Shuran Song

    Abstract: Audio signals provide rich information for the robot interaction and object properties through contact. This information can surprisingly ease the learning of contact-rich robot manipulation skills, especially when the visual information alone is ambiguous or incomplete. However, the usage of audio data in robot manipulation has been constrained to teleoperated demonstrations collected by either…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.18156  [pdf, other]

    cs.LG cs.DC cs.NI eess.SP

    FedAQ: Communication-Efficient Federated Edge Learning via Joint Uplink and Downlink Adaptive Quantization

    Authors: Linping Qu, Shenghui Song, Chi-Ying Tsui

    Abstract: Federated learning (FL) is a powerful machine learning paradigm which leverages the data as well as the computational resources of clients, while protecting clients' data privacy. However, the substantial model size and frequent aggregation between the server and clients result in significant communication overhead, making it challenging to deploy FL in resource-limited wireless networks. In this…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  3. arXiv:2406.16323  [pdf, other]

    eess.SP

    Low-Complexity CSI Feedback for FDD Massive MIMO Systems via Learning to Optimize

    Authors: Yifan Ma, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: In frequency-division duplex (FDD) massive multiple-input multiple-output (MIMO) systems, the growing number of base station antennas leads to prohibitive feedback overhead for downlink channel state information (CSI). To address this challenge, state-of-the-art (SOTA) fully data-driven deep learning (DL)-based CSI feedback schemes have been proposed. However, the high computational complexity and…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: submitted to IEEE for publication

  4. arXiv:2406.13165  [pdf, other]

    eess.IV cs.AI cs.CV cs.RO

    Cardiac Copilot: Automatic Probe Guidance for Echocardiography with World Model

    Authors: Haojun Jiang, Zhenguo Sun, Ning Jia, Meng Li, Yu Sun, Shaqi Luo, Shiji Song, Gao Huang

    Abstract: Echocardiography is the only technique capable of real-time imaging of the heart and is vital for diagnosing the majority of cardiac diseases. However, there is a severe shortage of experienced cardiac sonographers, due to the heart's complex structure and significant operational challenges. To mitigate this situation, we present a Cardiac Copilot system capable of providing real-time probe moveme…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Early Accepted by MICCAI 2024

  5. arXiv:2405.09514  [pdf, other]

    eess.SP cs.IT cs.LG

    Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck

    Authors: Hongru Li, Jiawei Shao, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: Task-oriented communication aims to extract and transmit task-relevant information to significantly reduce the communication overhead and transmission latency. However, the unpredictable distribution shifts between training and test data, including domain shift and semantic shift, can dramatically undermine the system performance. In order to tackle these challenges, it is crucial to ensure that t…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 13 pages, 8 figures, submitted to IEEE for potential publication

  6. arXiv:2405.07218  [pdf, other]

    physics.med-ph eess.SY

    Chained Flexible Capsule Endoscope: Unraveling the Conundrum of Size Limitations and Functional Integration for Gastrointestinal Transitivity

    Authors: Sishen Yuan, Guang Li, Baijia Liang, Lailu Li, Qingzhuo Zheng, Shuang Song, Zhen Li, Hongliang Ren

    Abstract: Capsule endoscopes, predominantly serving diagnostic functions, provide lucid internal imagery but are devoid of surgical or therapeutic capabilities. Consequently, despite lesion detection, physicians frequently resort to traditional endoscopic or open surgical procedures for treatment, resulting in more complex, potentially risky interventions. To surmount these limitations, this study introduce…

    Submitted 12 May, 2024; originally announced May 2024.

  7. arXiv:2404.16223  [pdf, other]

    cs.CV eess.IV

    Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey

    Authors: Marcos V. Conde, Florin-Alexandru Vasluianu, Radu Timofte, Jianxing Zhang, Jia Li, Fan Wang, Xiaopeng Li, Zikun Liu, Hyunhee Park, Sejun Song, Changho Kim, Zhijuan Huang, Hongyuan Yu, Cheng Wan, Wending Xiang, Jiamin Lin, Hang Zhong, Qiaosong Zhang, Yue Sun, Xuanwu Yin, Kunlong Zuo, Senyan Xu, Siyuan Jiang, Zhijing Sun, Jiaying Zhu, et al. (10 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines, however, this problem is not as explored as in the RGB domain. The goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as nois…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - NTIRE Workshop

  8. arXiv:2404.11889  [pdf, other]

    eess.IV cs.CV

    Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans

    Authors: Lixing Tan, Shuang Song, Kangneng Zhou, Chengbo Duan, Lanying Wang, Huayang Ren, Linlin Liu, Wei Zhang, Ruoxiu Xiao

    Abstract: X-ray images play a vital role in the intraoperative processes due to their high resolution and fast imaging speed and greatly promote the subsequent segmentation, registration and reconstruction. However, over-dosed X-rays superimpose potential risks to human health to some extent. Data-driven algorithms from volume scans to X-ray images are restricted by the scarcity of paired X-ray and volume d…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 13 pages, 10 figures

  9. arXiv:2404.00309  [pdf, other]

    cs.IT eess.SP

    Model-Driven Deep Learning for Distributed Detection with Binary Quantization

    Authors: Wei Guo, Meng He, Chuan Huang, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: Within the realm of rapidly advancing wireless sensor networks (WSNs), distributed detection assumes a significant role in various practical applications. However, a critical challenge lies in maintaining robust detection performance while operating within the constraints of limited bandwidth and energy resources. This paper introduces a novel approach that combines model-driven deep learning (DL) w…

    Submitted 30 March, 2024; originally announced April 2024.

  10. arXiv:2402.14213  [pdf]

    q-bio.NC cs.LG eess.SP

    Contrastive Learning of Shared Spatiotemporal EEG Representations Across Individuals for Naturalistic Neuroscience

    Authors: Xinke Shen, Lingyi Tao, Xuyang Chen, Sen Song, Quanying Liu, Dan Zhang

    Abstract: Neural representations induced by naturalistic stimuli offer insights into how humans respond to peripheral stimuli in daily life. The key to understanding the general neural mechanisms underlying naturalistic stimuli processing involves aligning neural activities across individuals and extracting inter-subject shared neural representations. Targeting the Electroencephalogram (EEG) technique, know…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 52 pages, 14 figures

  11. arXiv:2402.10071  [pdf, other]

    eess.SP cs.IT

    Approximate Message Passing-Enhanced Graph Neural Network for OTFS Data Detection

    Authors: Wenhao Zhuang, Yuyi Mao, Hengtao He, Lei Xie, Shenghui Song, Yao Ge, Zhi Ding

    Abstract: Orthogonal time frequency space (OTFS) modulation has emerged as a promising solution to support high-mobility wireless communications, for which, cost-effective data detectors are critical. Although graph neural network (GNN)-based data detectors can achieve decent detection accuracy at reasonable computational cost, they fail to best harness prior information of transmitted data. To further mini…

    Submitted 14 April, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 8 pages, 7 figures, and 3 tables. Part of this article was submitted to IEEE for possible publication

  12. arXiv:2402.09976  [pdf, ps, other]

    eess.SP

    Sensing-assisted Robust SWIPT for Mobile Energy Harvesting Receivers

    Authors: Yiming Xu, Dongfang Xu, Shenghui Song

    Abstract: Simultaneous wireless information and power transfer (SWIPT) has been proposed to offer communication services and transfer power to the energy harvesting receiver (EHR) concurrently. However, existing works mainly focused on static EHRs, without considering the location uncertainty caused by the movement of EHRs and location estimation errors. To tackle this issue, this paper considers the sensin…

    Submitted 15 February, 2024; originally announced February 2024.

  13. arXiv:2402.09974  [pdf, ps, other]

    cs.IT eess.SP

    Interference Mitigation for Network-Level ISAC: An Optimization Perspective

    Authors: Dongfang Xu, Yiming Xu, Xin Zhang, Xianghao Yu, Shenghui Song, Robert Schober

    Abstract: Future wireless networks are envisioned to simultaneously provide high data-rate communication and ubiquitous environment-aware services for numerous users. One promising approach to meet this demand is to employ network-level integrated sensing and communications (ISAC) by jointly designing the signal processing and resource allocation over the entire network. However, to unleash the full potenti…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: 7 pages, 6 figures, and the relevant simulation code can be found at https://dongfang-xu.github.io/homepage/code/Two_cases.zip

  14. arXiv:2402.03919  [pdf, other]

    cs.IT eess.SP

    Sensing Mutual Information with Random Signals in Gaussian Channels: Bridging Sensing and Communication Metrics

    Authors: Lei Xie, Fan Liu, Jiajin Luo, Shenghui Song

    Abstract: Sensing performance is typically evaluated by classical radar metrics, such as Cramer-Rao bound and signal-to-clutter-plus-noise ratio. The recent development of the integrated sensing and communication (ISAC) framework motivated the efforts to unify the performance metric for sensing and communication, where mutual information (MI) was proposed as a sensing performance metric with deterministic s…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2311.07081

  15. arXiv:2402.01467  [pdf, other]

    eess.SY cs.AI cs.CE cs.NE q-bio.NC

    Brain-Like Replay Naturally Emerges in Reinforcement Learning Agents

    Authors: Jiyi Wang, Likai Tang, Huimiao Chen, Sen Song

    Abstract: Can replay, as a widely observed neural activity pattern in brain regions, particularly in the hippocampus and neocortex, emerge in an artificial agent? If yes, does it contribute to the tasks? In this work, without heavy dependence on complex assumptions, we discover naturally emergent replay under task-optimized paradigm using a recurrent neural network-based reinforcement learning model, which…

    Submitted 2 February, 2024; originally announced February 2024.

  16. arXiv:2402.01271  [pdf, other]

    eess.AS cs.SD

    An Intra-BRNN and GB-RVQ Based END-TO-END Neural Audio Codec

    Authors: Linping Xu, Jiawei Jiang, Dejun Zhang, Xianjun Xia, Li Chen, Yijian Xiao, Piao Ding, Shenyi Song, Sixing Yin, Ferdous Sohel

    Abstract: Recently, neural networks have proven to be effective in performing speech coding task at low bitrates. However, under-utilization of intra-frame correlations and the error of quantizer specifically degrade the reconstructed audio quality. To improve the coding quality, we present an end-to-end neural speech codec, namely CBRC (Convolutional and Bidirectional Recurrent neural Codec). An interleave…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: INTERSPEECH 2023

  17. Localization of Dummy Data Injection Attacks in Power Systems Considering Incomplete Topological Information: A Spatio-Temporal Graph Wavelet Convolutional Neural Network Approach

    Authors: Zhaoyang Qu, Yunchang Dong, Yang Li, Siqi Song, Tao Jiang, Min Li, Qiming Wang, Lei Wang, Xiaoyong Bo, Jiye Zang, Qi Xu

    Abstract: The emergence of the novel dummy data injection attack (DDIA) poses a severe threat to the secure and stable operation of power systems. These attacks are particularly perilous due to the minimal Euclidean spatial separation between the injected malicious data and legitimate data, rendering their precise detection challenging using conventional distance-based methods. Furthermore, existing researc…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: Accepted by Applied Energy

    Journal ref: Applied Energy 360 (2024) 122736

  18. arXiv:2401.05915  [pdf, other]

    eess.IV

    Neural Implicit Surface Reconstruction of Freehand 3D Ultrasound Volume with Geometric Constraints

    Authors: Hongbo Chen, Logiraj Kumaralingam, Shuhang Zhang, Sheng Song, Fayi Zhang, Haibin Zhang, Thanh-Tu Pham, Edmond H. M. Lou, Kumaradevan Punithakumar, Lawrence H. Le, Rui Zheng

    Abstract: Three-dimensional (3D) freehand ultrasound (US) is a widely used imaging modality that allows non-invasive imaging of medical anatomy without radiation exposure. The surface reconstruction of US volume is vital to acquire the accurate anatomical structures needed for modeling, registration, and visualization. However, traditional methods cannot produce a high-quality surface due to image noise. De…

    Submitted 1 May, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: Preprint

  19. arXiv:2312.13683  [pdf, other]

    eess.SP cs.IT

    Joint Channel Estimation and Cooperative Localization for Near-Field Ultra-Massive MIMO

    Authors: Ruoxiao Cao, Hengtao He, Xianghao Yu, Shenghui Song, Kaibin Huang, Jun Zhang, Yi Gong, Khaled B. Letaief

    Abstract: The next-generation (6G) wireless networks are expected to provide not only seamless and high data-rate communications, but also ubiquitous sensing services. By providing vast spatial degrees of freedom (DoFs), ultra-massive multiple-input multiple-output (UM-MIMO) technology is a key enabler for both sensing and communications in 6G. However, the adoption of UM-MIMO leads to a shift from the far…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Submit to JSAC

  20. Bayes-Optimal Unsupervised Learning for Channel Estimation in Near-Field Holographic MIMO

    Authors: Wentao Yu, Hengtao He, Xianghao Yu, Shenghui Song, Jun Zhang, Ross Murch, Khaled B. Letaief

    Abstract: Holographic MIMO (HMIMO) is being increasingly recognized as a key enabling technology for 6G wireless systems through the deployment of an extremely large number of antennas within a compact space to fully exploit the potentials of the electromagnetic (EM) channel. Nevertheless, the benefits of HMIMO systems cannot be fully unleashed without an efficient means to estimate the high-dimensional cha…

    Submitted 15 June, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: 16 pages, 7 figures, 3 tables, accepted by IEEE Journal of Selected Topics in Signal Processing

  21. arXiv:2312.09952  [pdf, other]

    eess.AS cs.SD

    Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction

    Authors: Yuanbo Hou, Qiaoqiao Ren, Siyang Song, Yuxin Song, Wenwu Wang, Dick Botteldooren

    Abstract: WHO's report on environmental noise estimates that 22 M people suffer from chronic annoyance related to noise caused by audio events (AEs) from various sources. Annoyance may lead to health issues and adverse effects on metabolic and cognitive systems. In cities, monitoring noise levels does not provide insights into noticeable AEs, let alone their relations to annoyance. To create annoyance-relat…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by ICASSP 2024

  22. arXiv:2311.07908  [pdf, other]

    eess.SP cs.IT

    Learning Bayes-Optimal Channel Estimation for Holographic MIMO in Unknown EM Environments

    Authors: Wentao Yu, Hengtao He, Xianghao Yu, Shenghui Song, Jun Zhang, Ross D. Murch, Khaled B. Letaief

    Abstract: Holographic MIMO (HMIMO) has recently been recognized as a promising enabler for future 6G systems through the use of an ultra-massive number of antennas in a compact space to exploit the propagation characteristics of the electromagnetic (EM) channel. Nevertheless, the promised gain of HMIMO could not be fully unleashed without an efficient means to estimate the high-dimensional channel. Bayes-op…

    Submitted 4 February, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 6 pages, 3 figures, 1 table, accepted for presentation at IEEE ICC 2024, Denver, CO, USA

  23. arXiv:2311.07081  [pdf, other]

    cs.IT eess.SP

    Sensing Mutual Information with Random Signals in Gaussian Channels

    Authors: Lei Xie, Fan Liu, Zhanyuan Xie, Zheng Jiang, Shenghui Song

    Abstract: Sensing performance is typically evaluated by classical metrics, such as Cramer-Rao bound and signal-to-clutter-plus-noise ratio. The recent development of the integrated sensing and communication (ISAC) framework motivated the efforts to unify the metric for sensing and communication, where researchers have proposed to utilize mutual information (MI) to measure the sensing performance with determ…

    Submitted 13 November, 2023; originally announced November 2023.

  24. arXiv:2311.04468  [pdf]

    eess.IV q-bio.NC

    A human brain atlas of chi-separation for normative iron and myelin distributions

    Authors: Kyeongseon Min, Beomseok Sohn, Woo Jung Kim, Chae Jung Park, Soohwa Song, Dong Hoon Shin, Kyung Won Chang, Na-Young Shin, Minjun Kim, Hyeong-Geol Shin, Phil Hyu Lee, Jongho Lee

    Abstract: Iron and myelin are primary susceptibility sources in the human brain. These substances are essential for healthy brain, and their abnormalities are often related to various neurological disorders. Recently, an advanced susceptibility mapping technique, which is referred to as chi-separation, has been proposed, successfully disentangling paramagnetic iron from diamagnetic myelin. This method opene…

    Submitted 2 April, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: 19 pages, 9 figures

  25. arXiv:2310.18656  [pdf, other]

    eess.IV cs.CV

    Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical Volumetric Segmentation

    Authors: Haoran Shen, Yifu Zhang, Wenxuan Wang, Chen Chen, Jing Liu, Shanshan Song, Jiangyun Li

    Abstract: Recent works have shown that the computational efficiency of 3D medical image (e.g. CT and MRI) segmentation can be impressively improved by dynamic inference based on slice-wise complexity. As a pioneering work, a dynamic architecture network for medical volumetric segmentation (i.e. Med-DANet) has achieved a favorable accuracy and efficiency trade-off by dynamically selecting a suitable 2D candi…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted by WACV 2024

  26. arXiv:2310.04440  [pdf, other]

    eess.SY cs.AI

    Facilitating Battery Swapping Services for Freight Trucks with Spatial-Temporal Demand Prediction

    Authors: Linyu Liu, Zhen Dai, Shiji Song, Xiaocheng Li, Guanting Chen

    Abstract: Electrifying heavy-duty trucks offers a substantial opportunity to curtail carbon emissions, advancing toward a carbon-neutral future. However, the inherent challenges of limited battery energy and the sheer weight of heavy-duty trucks lead to reduced mileage and prolonged charging durations. Consequently, battery-swapping services emerge as an attractive solution for these trucks. This paper empl…

    Submitted 23 May, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: 9 pages, 6 figures

    MSC Class: 90B06; 68T07

  27. Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification

    Authors: Yuanbo Hou, Siyang Song, Chuang Yu, Wenwu Wang, Dick Botteldooren

    Abstract: Most deep learning-based acoustic scene classification (ASC) approaches identify scenes based on acoustic features converted from audio clips containing mixed information entangled by polyphonic audio events (AEs). However, these approaches have difficulties in explaining what cues they use to identify scenes. This paper conducts the first study on disclosing the relationship between real-life aco…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: IEEE Signal Processing Letters, doi: 10.1109/LSP.2023.3319233

  28. arXiv:2309.12783  [pdf, ps, other]

    cs.NI eess.SP

    Multi-objective Optimization of Space-Air-Ground Integrated Network Slicing Relying on a Pair of Central and Distributed Learning Algorithms

    Authors: Guorong Zhou, Liqiang Zhao, Gan Zheng, Shenghui Song, Jiankang Zhang, Lajos Hanzo

    Abstract: As an attractive enabling technology for next-generation wireless communications, network slicing supports diverse customized services in the global space-air-ground integrated network (SAGIN) with diverse resource constraints. In this paper, we dynamically consider three typical classes of radio access network (RAN) slices, namely high-throughput slices, low-delay slices and wide-coverage slices,…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: 19 pages, 14 figures, journal

  29. arXiv:2309.10065  [pdf, other]

    q-bio.NC cs.LG eess.IV

    Bayesian longitudinal tensor response regression for modeling neuroplasticity

    Authors: Suprateek Kundu, Alec Reinhardt, Serena Song, Joo Han, M. Lawson Meadows, Bruce Crosson, Venkatagiri Krishnamurthy

    Abstract: A major interest in longitudinal neuroimaging studies involves investigating voxel-level neuroplasticity due to treatment and other factors across visits. However, traditional voxel-wise methods are beset with several pitfalls, which can compromise the accuracy of these approaches. We propose a novel Bayesian tensor response regression approach for longitudinal imaging data, which pools informatio…

    Submitted 18 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 28 pages, 8 figures, 6 tables

  30. arXiv:2309.09575  [pdf, other]

    eess.SP cs.IT

    AI-Native Transceiver Design for Near-Field Ultra-Massive MIMO: Principles and Techniques

    Authors: Wentao Yu, Yifan Ma, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: Ultra-massive multiple-input multiple-output (UM-MIMO) is a cutting-edge technology that promises to revolutionize wireless networks by providing an unprecedentedly high spectral and energy efficiency. The enlarged array aperture of UM-MIMO facilitates the accessibility of the near-field region, thereby offering a novel degree of freedom for communications and sensing. Nevertheless, the transceiver…

    Submitted 3 January, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: 7 pages, 3 figures, 2 tables, magazine manuscript, submitted to IEEE for possible publication

  31. arXiv:2308.11980  [pdf, other]

    eess.AS cs.SD

    Joint Prediction of Audio Event and Annoyance Rating in an Urban Soundscape by Hierarchical Graph Representation Learning

    Authors: Yuanbo Hou, Siyang Song, Cheng Luo, Andrew Mitchell, Qiaoqiao Ren, Weicheng Xie, Jian Kang, Wenwu Wang, Dick Botteldooren

    Abstract: Sound events in daily life carry rich information about the objective world. The composition of these sounds affects the mood of people in a soundscape. Most previous approaches only focus on classifying and detecting audio events and scenes, but may ignore their perceptual quality that may impact humans' listening mood for the environment, e.g. annoyance. To this end, this paper proposes a novel…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: INTERSPEECH 2023, Code and models: https://github.com/Yuanbo2020/HGRL

  32. arXiv:2308.03137  [pdf, other]

    eess.SP

    Digital Self-Interference Cancellation With Robust Multi-layered Total Least Mean Squares Adaptive Filters

    Authors: Shiyu Song, Yanqun Tang, Xizhang Wei, Yu Zhou, Xianjie Lu, Zhengpeng Wang, Songhu Ge

    Abstract: In simultaneous transmit and receive (STAR) wireless communications, digital self-interference (SI) cancellation is required before estimating the remote transmission (RT) channel. Considering the inherent connection between SI channel reconstruction and RT channel estimation, we propose a multi-layered M-estimate total least mean squares (m-MTLS) joint estimator to estimate both channels. In each…

    Submitted 6 August, 2023; originally announced August 2023.

  33. arXiv:2308.02416  [pdf, other]

    eess.SP cs.LG

    Local-Global Temporal Fusion Network with an Attention Mechanism for Multiple and Multiclass Arrhythmia Classification

    Authors: Yun Kwan Kim, Minji Lee, Kunwook Jo, Hee Seok Song, Seong-Whan Lee

    Abstract: Clinical decision support systems (CDSSs) have been widely utilized to support the decisions made by cardiologists when detecting and classifying arrhythmia from electrocardiograms (ECGs). However, forming a CDSS for the arrhythmia classification task is challenging due to the varying lengths of arrhythmias. Although the onset time of arrhythmia varies, previously developed methods have not consid…

    Submitted 13 October, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: 14 pages, 6 figures

    MSC Class: 68T07; 92C55

  34. arXiv:2307.10538  [pdf, other]

    eess.SP

    Power Allocation for Device-to-Device Interference Channel Using Truncated Graph Transformers

    Authors: Dohoon Kim, Shenghui Song

    Abstract: Power control for the device-to-device interference channel with single-antenna transceivers has been widely analyzed with both model-based methods and learning-based approaches. Although the learning-based approaches, i.e., data-driven and model-driven, offer performance improvement, the widely adopted graph neural network suffers from learning the heterophilous power distribution of the interfere…

    Submitted 23 July, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: 6 pages, 5 figures. Accepted in IEEE International Mediterranean Conference on Communications and Networking

  35. arXiv:2307.04977  [pdf, other]

    cs.IT eess.SP

    Model-Driven Sensing-Node Selection and Power Allocation for Tracking Maneuvering Targets in Perceptive Mobile Networks

    Authors: Lei Xie, Hengtao He, Shenghui Song, Yonina C. Eldar

    Abstract: Maneuvering target tracking will be an important service of future wireless networks to assist innovative applications such as intelligent transportation. However, tracking maneuvering targets by cellular networks faces many challenges. For example, the dense network and high-speed targets make the selection of the sensing nodes (SNs) and the associated power allocation very challenging. Existing…

    Submitted 28 March, 2024; v1 submitted 10 July, 2023; originally announced July 2023.

  36. arXiv:2306.02344  [pdf, other]

    eess.AS

    Influence of Lossy Speech Codecs on Hearing-aid, Binaural Sound Source Localisation using DNNs

    Authors: Siyuan Song, Stijn Kindt, Jasper Maes, Alexander Bohlender, Nilesh Madhu

    Abstract: Hearing aids are typically equipped with multiple microphones to exploit spatial information for source localisation and speech enhancement. Especially for hearing aids, a good source localisation is important: it not only guides source separation methods but can also be used to enhance spatial cues, increasing user-awareness of important events in their surroundings. We use a state-of-the-art dee…

    Submitted 4 June, 2023; originally announced June 2023.

  37. arXiv:2306.00812  [pdf, other]

    eess.AS cs.SD

    Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

    Authors: Xiaohuai Le, Tong Lei, Li Chen, Yiqing Guo, Chao He, Cheng Chen, Xianjun Xia, Hua Gao, Yijian Xiao, Piao Ding, Shenyi Song, Jing Lu

    Abstract: With fewer feature dimensions, filter banks are often used in light-weight full-band speech enhancement models. In order to further enhance the coarse speech in the sub-band domain, it is necessary to apply a post-filtering for harmonic retrieval. The signal processing-based comb filters used in RNNoise and PercepNet have limited performance and may cause speech quality degradation due to inaccura…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: accepted by Interspeech 2023

  38. arXiv:2305.12423  [pdf, other]

    eess.SP cs.IT eess.IV

    Task-Oriented Communication with Out-of-Distribution Detection: An Information Bottleneck Framework

    Authors: Hongru Li, Wentao Yu, Hengtao He, Jiawei Shao, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: Task-oriented communication is an emerging paradigm for next-generation communication networks, which extracts and transmits task-relevant information, instead of raw data, for downstream applications. Most existing deep learning (DL)-based task-oriented communication systems adopt a closed-world scenario, assuming either the same data distribution for training and testing, or the system could hav…

    Submitted 27 January, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: code available in github, accepted by IEEE GLOBECOM2023

  39. arXiv:2305.09946  [pdf]

    eess.IV cs.CV cs.LG

    AdaMSS: Adaptive Multi-Modality Segmentation-to-Survival Learning for Survival Outcome Prediction from PET/CT Images

    Authors: Mingyuan Meng, Bingxin Gu, Michael Fulham, Shaoli Song, Dagan Feng, Lei Bi, Jinman Kim

    Abstract: Survival prediction is a major concern for cancer management. Deep survival models based on deep learning have been widely adopted to perform end-to-end survival prediction from medical images. Recent deep survival models achieved promising performance by jointly performing tumor segmentation with survival prediction, where the models were guided to extract tumor-related information through Multi-…

    Submitted 19 July, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Under Review

  40. arXiv:2305.05265  [pdf, ps, other]

    eess.SP eess.SY

    Joint BS Selection, User Association, and Beamforming Design for Network Integrated Sensing and Communication

    Authors: Yiming Xu, Dongfang Xu, Lei Xie, Shenghui Song

    Abstract: Different from conventional radar, the cellular network in the integrated sensing and communication (ISAC) system enables collaborative sensing by multiple sensing nodes, e.g., base stations (BSs). However, existing works normally assume designated BSs as the sensing nodes, and thus can't fully exploit the macro-diversity gain. In the paper, we propose a joint BS selection, user association, and b…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 6 pages

  41. arXiv:2305.01213  [pdf, ps, other]

    cs.IT eess.SP

    Integrated Sensing and Communication in Coordinated Cellular Networks

    Authors: Dongfang Xu, Chang Liu, Shenghui Song, Derrick Wing Kwan Ng

    Abstract: Integrated sensing and communication (ISAC) is a promising technique to provide sensing services in future wireless networks. Numerous existing works have adopted a monostatic radar architecture to realize ISAC, i.e., employing the same base station (BS) to transmit the ISAC signal and receive the echo. Yet, the concurrent information transmission causes unavoidable self-interference (SI) to the r…

    Submitted 16 May, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: 6 pages, 3 figures

  42. arXiv:2302.14224  [pdf, other]

    eess.SP cs.NI

    Overview and Performance Analysis of Various Waveforms in High Mobility Scenarios

    Authors: Yu Zhou, Haoran Yin, Jiaojiao Xiong, Shiyu Song, Jiajun Zhu, Jinming Du, Haibo Chen, Yanqun Tang

    Abstract: In the high-mobility scenarios of next-generation wireless communication systems (beyond 5G/6G), the performance of orthogonal frequency division multiplexing (OFDM) deteriorates drastically due to the loss of orthogonality between the subcarriers caused by large Doppler frequency shifts. Various emerging waveforms have been proposed for fast time-varying channels with excellent results. In this p…

    Submitted 27 February, 2023; originally announced February 2023.

  43. arXiv:2302.06896  [pdf, other]

    cs.IT cs.LG eess.SP

    Message Passing Meets Graph Neural Networks: A New Paradigm for Massive MIMO Systems

    Authors: Hengtao He, Xianghao Yu, Jun Zhang, Shenghui Song, Khaled B. Letaief

    Abstract: As one of the core technologies for 5G systems, massive multiple-input multiple-output (MIMO) introduces dramatic capacity improvements along with very high beamforming and spatial multiplexing gains. When developing efficient physical layer algorithms for massive MIMO systems, message passing is one promising candidate owing to the superior performance. However, as their computational complexity…

    Submitted 31 October, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: 30 Pages, 7 Figures, and 4 Tables. This paper has been accepted by the IEEE Transactions on Wireless Communications. The code is available at: https://github.com/hehengtao/AMP_GNN

  44. arXiv:2301.03081  [pdf]

    eess.IV cs.CV cs.CY

    Automatic Diagnosis of Carotid Atherosclerosis Using a Portable Freehand 3D Ultrasound Imaging System

    Authors: Jiawen Li, Yunqian Huang, Sheng Song, Hongbo Chen, Junni Shi, Duo Xu, Haibin Zhang, Man Chen, Rui Zheng

    Abstract: The objective of this study is to develop a deep-learning based detection and diagnosis technique for carotid atherosclerosis using a portable freehand 3D ultrasound (US) imaging system. A total of 127 3D carotid artery scans were acquired using a portable 3D US system which consisted of a handheld US scanner and an electromagnetic tracking system. A U-Net segmentation network was firstly applied…

    Submitted 9 November, 2023; v1 submitted 8 January, 2023; originally announced January 2023.

  45. arXiv:2301.02753  [pdf, other]

    eess.SY

    Planning and Tracking Control of Full Drive-by-Wire Electric Vehicles in Unstructured Scenario

    Authors: Guoying Chen, Min Hua, Wei Liu, Jinhai Wang, Shunhui Song, Changsheng Liu

    Abstract: Full drive-by-wire electric vehicles (FDWEV) with X-by-wire technology can achieve independent driving, braking, and steering of each wheel, providing a good application platform for autonomous driving technology. Path planning and tracking control, in particular, are critical components of autonomous driving. However, it is challenging to comprehensively design a robust control algorithm by inte…

    Submitted 6 January, 2023; originally announced January 2023.

  46. arXiv:2212.10945  [pdf, other]

    eess.SY

    Standoff Tracking Using DNN-Based MPC with Implementation on FPGA

    Authors: Fei Dong, Xingchen Li, Keyou You, Shiji Song

    Abstract: This work studies the standoff tracking problem to drive an unmanned aerial vehicle (UAV) to slide on a desired circle over a moving target at a constant height. We propose a novel Lyapunov guidance vector (LGV) field with tunable convergence rates for the UAV's trajectory planning and a deep neural network (DNN)-based model predictive control (MPC) scheme to track the reference trajectory. Then,…

    Submitted 21 December, 2022; originally announced December 2022.

  47. An Adaptive and Robust Deep Learning Framework for THz Ultra-Massive MIMO Channel Estimation

    Authors: Wentao Yu, Yifei Shen, Hengtao He, Xianghao Yu, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: Terahertz ultra-massive MIMO (THz UM-MIMO) is envisioned as one of the key enablers of 6G wireless networks, for which channel estimation is highly challenging. Traditional analytical estimation methods are no longer effective, as the enlarged array aperture and the small wavelength result in a mixture of far-field and near-field paths, constituting a hybrid-field channel. Deep learning (DL)-based…

    Submitted 8 June, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: 15 pages, 11 figures, 5 tables, accepted by IEEE Journal of Selected Topics in Signal Processing (JSTSP)

  48. arXiv:2211.15079  [pdf, other]

    cs.IT cs.LG eess.SP

    Lightweight and Flexible Deep Equilibrium Learning for CSI Feedback in FDD Massive MIMO

    Authors: Yifan Ma, Wentao Yu, Xianghao Yu, Jun Zhang, Shenghui Song, Khaled B. Letaief

    Abstract: In frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems, downlink channel state information (CSI) needs to be sent back to the base station (BS) by the users, which causes prohibitive feedback overhead. In this paper, we propose a lightweight and flexible deep learning-based CSI feedback approach by capitalizing on deep equilibrium models. Different from existin…

    Submitted 5 June, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: submitted to IEEE for possible publication

  49. arXiv:2211.07985  [pdf, ps, other]

    eess.SP cs.IT

    Blind Performance Prediction for Deep Learning Based Ultra-Massive MIMO Channel Estimation

    Authors: Wentao Yu, Hengtao He, Xianghao Yu, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: Reliability is of paramount importance for the physical layer of wireless systems due to its decisive impact on end-to-end performance. However, the uncertainty of prevailing deep learning (DL)-based physical layer algorithms is hard to quantify due to the black-box nature of neural networks. This limitation is a major obstacle that hinders their practical deployment. In this paper, we attempt to…

    Submitted 5 February, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: 6 pages, 3 figures, 1 table, accepted by IEEE ICC 2023

  50. arXiv:2210.15808  [pdf]

    eess.IV cs.CV

    Hyper-Connected Transformer Network for Multi-Modality PET-CT Segmentation

    Authors: Lei Bi, Michael Fulham, Shaoli Song, David Dagan Feng, Jinman Kim

    Abstract: [18F]-Fluorodeoxyglucose (FDG) positron emission tomography - computed tomography (PET-CT) has become the imaging modality of choice for diagnosing many cancers. Co-learning complementary PET-CT imaging features is a fundamental requirement for automatic tumor segmentation and for developing computer aided cancer diagnosis systems. In this study, we propose a hyper-connected transformer (HCT) netw…

    Submitted 7 August, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: EMBC 2023