Showing 1–50 of 119 results for author: Xiao, X

Searching in archive eess.
  1. arXiv:2407.01083  [pdf, ps, other]

    eess.SP

    A Note On the Clark Conjecture On Time-Warped Bandlimited Signals

    Authors: Xiang-Gen Xia

    Abstract: In this note, a result of a previous paper on the Clark conjecture on time-warped bandlimited signals is extended to a more general class of the time warping functions, which includes most of the common functions in practice.

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.09822  [pdf, other]

    cs.IT cs.CV cs.LG eess.IV eess.SP

    An I2I Inpainting Approach for Efficient Channel Knowledge Map Construction

    Authors: Zhenzhou Jin, Li You, Jue Wang, Xiang-Gen Xia, Xiqi Gao

    Abstract: Channel knowledge map (CKM) has received widespread attention as an emerging enabling technology for environment-aware wireless communications. It involves the construction of databases containing location-specific channel knowledge, which are then leveraged to facilitate channel state information (CSI) acquisition and transceiver design. In this context, a fundamental challenge lies in efficientl…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 15 pages, 11 figures

  3. arXiv:2406.07498  [pdf, other]

    cs.SD eess.AS

    RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention

    Authors: Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan, Yuanjun Lv, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie

    Abstract: In real-time speech communication systems, speech signals are often degraded by multiple distortions. Recently, a two-stage Repair-and-Denoising network (RaD-Net) was proposed with superior speech quality improvement in the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. However, failure to use future information and constraint receptive field of convolution layers limit the system's perfor…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  4. arXiv:2406.05961  [pdf, other]

    eess.AS

    BS-PLCNet 2: Two-stage Band-split Packet Loss Concealment Network with Intra-model Knowledge Distillation

    Authors: Zihan Zhang, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie

    Abstract: Audio packet loss is an inevitable problem in real-time speech communication. A band-split packet loss concealment network (BS-PLCNet) targeting full-band signals was recently proposed. Although it performs superiorly in the ICASSP 2024 PLC Challenge, BS-PLCNet is a large model with high computational complexity of 8.95G FLOPS. This paper presents its updated version, BS-PLCNet 2, to reduce comput…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  5. arXiv:2406.04586  [pdf]

    eess.SP

    A Simple Channel Independent Beamforming Scheme With Parallel Uniform Circular Array

    Authors: Haiyue Jing, Wenchi Cheng, Xiang-Gen Xia

    Abstract: In this letter, we consider a uniform circular array (UCA)-based line-of-sight multiple-input-multiple-output system, where the transmit and receive UCAs are parallel but non-coaxial with each other. We propose a simple channel-independent beamforming scheme with fast symbol-wise maximum likelihood detection.

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: This paper has been published in IEEE Communications Letters. arXiv admin note: substantial text overlap with arXiv:1804.06621

  6. arXiv:2405.11883  [pdf, other]

    cs.IT eess.SP

    Asynchronous MIMO-OFDM Massive Unsourced Random Access with Codeword Collisions

    Authors: Tianya Li, Yongpeng Wu, Junyuan Gao, Wenjun Zhang, Xiang-Gen Xia, Derrick Wing Kwan Ng, Chengshan Xiao

    Abstract: This paper investigates asynchronous MIMO massive unsourced random access in an orthogonal frequency division multiplexing (OFDM) system over frequency-selective fading channels, with the presence of both timing and carrier frequency offsets (TO and CFO) and non-negligible codeword collisions. The proposed coding framework segregates the data into two components, namely, preamble and coding parts,…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 13 pages, 12 figures, submitted to the IEEE for possible publication

  7. arXiv:2404.09734  [pdf, other]

    cs.IT eess.SP

    Weighted Sum-Rate Maximization for Movable Antenna-Enhanced Wireless Networks

    Authors: Biqian Feng, Yongpeng Wu, Xiang-Gen Xia, Chengshan Xiao

    Abstract: This letter investigates the weighted sum rate maximization problem in movable antenna (MA)-enhanced systems. To reduce the computational complexity, we transform it into a more tractable weighted minimum mean square error (WMMSE) problem well-suited for MA. We then adopt the WMMSE algorithm and majorization-minimization algorithm to optimize the beamforming and antenna positions, respectively. Mo…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE Wireless Communications Letters

  8. arXiv:2404.07425  [pdf, ps, other]

    eess.SP cs.IT

    Precoder Design for User-Centric Network Massive MIMO with Matrix Manifold Optimization

    Authors: Rui Sun, Li You, An-An Lu, Chen Sun, Xiqi Gao, Xiang-Gen Xia

    Abstract: In this paper, we investigate the precoder design for user-centric network (UCN) massive multiple-input multiple-output (mMIMO) downlink with matrix manifold optimization. In UCN mMIMO systems, each user terminal (UT) is served by a subset of base stations (BSs) instead of all the BSs, facilitating the implementation of the system and lowering the dimension of the precoders to be designed. By prov…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures, journal

  9. arXiv:2403.07954  [pdf, other]

    cs.LG eess.SP

    Optimizing Polynomial Graph Filters: A Novel Adaptive Krylov Subspace Approach

    Authors: Keke Huang, Wencai Cao, Hoang Ta, Xiaokui Xiao, Pietro Liò

    Abstract: Graph Neural Networks (GNNs), known as spectral graph filters, find a wide range of applications in web networks. To bypass eigendecomposition, polynomial graph filters are proposed to approximate graph filters by leveraging various polynomial bases for filter training. However, no existing studies have explored the diverse polynomial graph filters from a unified perspective for optimization. In…

    Submitted 20 May, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  10. arXiv:2402.01271  [pdf, other]

    eess.AS cs.SD

    An Intra-BRNN and GB-RVQ Based END-TO-END Neural Audio Codec

    Authors: Linping Xu, Jiawei Jiang, Dejun Zhang, Xianjun Xia, Li Chen, Yijian Xiao, Piao Ding, Shenyi Song, Sixing Yin, Ferdous Sohel

    Abstract: Recently, neural networks have proven to be effective in performing speech coding task at low bitrates. However, under-utilization of intra-frame correlations and the error of quantizer specifically degrade the reconstructed audio quality. To improve the coding quality, we present an end-to-end neural speech codec, namely CBRC (Convolutional and Bidirectional Recurrent neural Codec). An interleave…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: INTERSPEECH 2023

  11. arXiv:2401.13197  [pdf, other]

    eess.IV cs.CV

    Predicting Mitral Valve mTEER Surgery Outcomes Using Machine Learning and Deep Learning Techniques

    Authors: Tejas Vyas, Mohsena Chowdhury, Xiaojiao Xiao, Mathias Claeys, Géraldine Ong, Guanghui Wang

    Abstract: Mitral Transcatheter Edge-to-Edge Repair (mTEER) is a medical procedure utilized for the treatment of mitral valve disorders. However, predicting the outcome of the procedure poses a significant challenge. This paper makes the first attempt to harness classical machine learning (ML) and deep learning (DL) techniques for predicting mitral valve mTEER surgery outcomes. To achieve this, we compiled a…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 5 pages, 1 figure

  12. arXiv:2401.08887  [pdf, ps, other]

    cs.SD cs.AI cs.CL eess.AS

    NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription

    Authors: Alon Vinnikov, Amir Ivry, Aviv Hurvitz, Igor Abramovski, Sharon Koubi, Ilya Gurvich, Shai Pe'er, Xiong Xiao, Benjamin Martinez Elizalde, Naoyuki Kanda, Xiaofei Wang, Shalev Shaer, Stav Yagev, Yossi Asher, Sunit Sivasankaran, Yifan Gong, Min Tang, Huaming Wang, Eyal Krupka

    Abstract: We introduce the first Natural Office Talkers in Settings of Far-field Audio Recordings ("NOTSOFAR-1") Challenge alongside datasets and baseline system. The challenge focuses on distant speaker diarization and automatic speech recognition (DASR) in far-field meeting scenarios, with single-channel and known-geometry multi-channel tracks, and serves as a launch platform for two new datasets: First…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: preprint

  13. arXiv:2401.04389  [pdf, other]

    cs.SD eess.AS

    RaD-Net: A Repairing and Denoising Network for Speech Signal Improvement

    Authors: Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan, Yuanjun Lv, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie

    Abstract: This paper introduces our repairing and denoising network (RaD-Net) for the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. We extend our previous framework based on a two-stage network and propose an upgraded model. Specifically, we replace the repairing network with COM-Net from TEA-PSE. In addition, multi-resolution discriminators and multi-band discriminators are adopted in the training…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: submitted to ICASSP 2024

  14. arXiv:2401.03687  [pdf, other]

    eess.AS cs.SD

    BS-PLCNet: Band-split Packet Loss Concealment Network with Multi-task Learning Framework and Multi-discriminators

    Authors: Zihan Zhang, Jiayao Sun, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie

    Abstract: Packet loss is a common and unavoidable problem in voice over internet phone (VoIP) systems. To deal with the problem, we propose a band-split packet loss concealment network (BS-PLCNet). Specifically, we split the full-band signal into wide-band (0-8kHz) and high-band (8-24kHz). The wide-band signals are processed by a gated convolutional recurrent network (GCRN), while the high-band counterpart…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: submitted to ICASSP 2024

  15. arXiv:2401.00413  [pdf, other]

    cs.LG cs.ET eess.SP

    Real-Time FJ/MAC PDE Solvers via Tensorized, Back-Propagation-Free Optical PINN Training

    Authors: Yequan Zhao, Xian Xiao, Xinling Yu, Ziyue Liu, Zhixiong Chen, Geza Kurczveil, Raymond G. Beausoleil, Zheng Zhang

    Abstract: Solving partial differential equations (PDEs) numerically often requires huge computing time, energy cost, and hardware resources in practical applications. This has limited their applications in many scenarios (e.g., autonomous systems, supersonic flows) that have a limited energy budget and require near real-time response. Leveraging optical computing, this paper develops an on-chip training fra…

    Submitted 4 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

    Comments: ML with New Compute Paradigms (MLNCP) at NeurIPS 2023

  16. arXiv:2312.13311  [pdf, other]

    cs.LG eess.IV

    Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural Networks

    Authors: Anzhe Cheng, Zhenkun Wang, Chenzhong Yin, Mingxi Cheng, Heng Ping, Xiongye Xiao, Shahin Nazarian, Paul Bogdan

    Abstract: Backpropagation (BP) has been a successful optimization technique for deep learning models. However, its limitations, such as backward- and update-locking, and its biological implausibility, hinder the concurrent updating of layers and do not mimic the local learning processes observed in the human brain. To address these issues, recent research has suggested using local error signals to asynchron…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: The paper has been accepted by ICASSP 2024

  17. arXiv:2312.06969  [pdf, ps, other]

    cs.IT eess.SP

    Channel Estimation for Movable Antenna Communication Systems: A Framework Based on Compressed Sensing

    Authors: Zhenyu Xiao, Songqi Cao, Lipeng Zhu, Yanming Liu, Xiang-Gen Xia, Rui Zhang

    Abstract: Movable antenna (MA) is a new technology with great potential to improve communication performance by enabling local movement of antennas for pursuing better channel conditions. In particular, the acquisition of complete channel state information (CSI) between the transmitter (Tx) and receiver (Rx) regions is an essential problem for MA systems to reap performance gains. In this paper, we propose…

    Submitted 11 December, 2023; originally announced December 2023.

  18. arXiv:2311.16565  [pdf, other]

    cs.CV cs.SD eess.AS

    DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser

    Authors: Peng Chen, Xiaobao Wei, Ming Lu, Yitong Zhu, Naiming Yao, Xingyu Xiao, Hui Chen

    Abstract: Speech-driven 3D facial animation has been an attractive task in both academia and industry. Traditional methods mostly focus on learning a deterministic mapping from speech to animation. Recent approaches start to consider the non-deterministic fact of speech-driven 3D face animation and employ the diffusion model for the task. However, personalizing facial animation and accelerating animation ge…

    Submitted 2 December, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

  19. arXiv:2311.11804  [pdf, ps, other]

    eess.SP cs.IT

    Robust Multidimentional Chinese Remainder Theorem for Integer Vector Reconstruction

    Authors: Li Xiao, Haiye Huo, Xiang-Gen Xia

    Abstract: The problem of robustly reconstructing an integer vector from its erroneous remainders appears in many applications in the field of multidimensional (MD) signal processing. To address this problem, a robust MD Chinese remainder theorem (CRT) was recently proposed for a special class of moduli, where the remaining integer matrices left-divided by a greatest common left divisor (gcld) of all the mod…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 12 pages, 5 figures

  20. arXiv:2311.10416  [pdf, other]

    eess.SP

    Meta-DSP: A Meta-Learning Approach for Data-Driven Nonlinear Compensation in High-Speed Optical Fiber Systems

    Authors: Xinyu Xiao, Zhennan Zhou, Bin Dong, Dingjiong Ma, Li Zhou, Jie Sun

    Abstract: Non-linear effects in long-haul, high-speed optical fiber systems significantly hinder channel capacity. While the Digital Backward Propagation algorithm (DBP) with adaptive filter (ADF) can mitigate these effects, it suffers from an overwhelming computational complexity. Recent solutions have incorporated deep neural networks in a data-driven strategy to alleviate this complexity in the DBP model…

    Submitted 17 November, 2023; originally announced November 2023.

  21. arXiv:2311.05236  [pdf, ps, other]

    eess.SP

    Delay Doppler Transform

    Authors: Xiang-Gen Xia

    Abstract: This letter is to introduce delay Doppler transform (DDT) for a time domain signal. It is motivated by the recent studies in wireless communications over delay Doppler channels that have both time and Doppler spreads, such as, satellite communication channels. We present some simple properties of DDT as well. The DDT study may provide insights of delay Doppler channels.

    Submitted 3 December, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

  22. arXiv:2310.04992  [pdf, other]

    eess.IV cs.CV

    VisionFM: a Multi-Modal Multi-Task Vision Foundation Model for Generalist Ophthalmic Artificial Intelligence

    Authors: Jianing Qiu, Jian Wu, Hao Wei, Peilun Shi, Minqing Zhang, Yunyun Sun, Lin Li, Hanruo Liu, Hongyi Liu, Simeng Hou, Yuyang Zhao, Xuehui Shi, Junfang Xian, Xiaoxia Qu, Sirui Zhu, Lijie Pan, Xiaoniao Chen, Xiaojia Zhang, Shuai Jiang, Kebing Wang, Chenlong Yang, Mingqiang Chen, Sujie Fan, Jianhua Hu, Aiguo Lv, et al. (17 additional authors not shown)

    Abstract: We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassifi…

    Submitted 7 October, 2023; originally announced October 2023.

  23. arXiv:2310.04715  [pdf, other]

    eess.AS cs.SD

    An Exploration of Task-decoupling on Two-stage Neural Post Filter for Real-time Personalized Acoustic Echo Cancellation

    Authors: Zihan Zhang, Jiayao Sun, Xianjun Xia, Ziqian Wang, Xiaopeng Yan, Yijian Xiao, Lei Xie

    Abstract: Deep learning based techniques have been popularly adopted in acoustic echo cancellation (AEC). Utilization of speaker representation has extended the frontier of AEC, thus attracting many researchers' interest in personalized acoustic echo cancellation (PAEC). Meanwhile, task-decoupling strategies are widely adopted in speech enhancement. To further explore the task-decoupling approach, we propos…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: accepted to ASRU 2023

  24. arXiv:2309.12660  [pdf, ps, other]

    cs.RO eess.SY

    Disturbance Rejection Control for Autonomous Trolley Collection Robots with Prescribed Performance

    Authors: Rui-Dong Xi, Liang Lu, Xue Zhang, Xiao Xiao, Bingyi Xia, Jiankun Wang, Max Q. -H. Meng

    Abstract: Trajectory tracking control of autonomous trolley collection robots (ATCR) is an ambitious work due to the complex environment, serious noise and external disturbances. This work investigates a control scheme for ATCR subjecting to severe environmental interference. A kinematics model based adaptive sliding mode disturbance observer with fast convergence is first proposed to estimate the lumped di…

    Submitted 22 September, 2023; originally announced September 2023.

  25. arXiv:2309.12521  [pdf, other]

    cs.SD eess.AS

    Profile-Error-Tolerant Target-Speaker Voice Activity Detection

    Authors: Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Midia Yousefi, Takuya Yoshioka, Jian Wu

    Abstract: Target-Speaker Voice Activity Detection (TS-VAD) utilizes a set of speaker profiles alongside an input audio signal to perform speaker diarization. While its superiority over conventional methods has been demonstrated, the method can suffer from errors in speaker profiles, as those profiles are typically obtained by running a traditional clustering-based diarization method over the input signal. T…

    Submitted 3 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: Submission for ICASSP 2024

  26. arXiv:2308.16021  [pdf, other]

    cs.SD eess.AS

    CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis

    Authors: Yi Meng, Xiang Li, Zhiyong Wu, Tingtian Li, Zixun Sun, Xinyu Xiao, Chi Sun, Hui Zhan, Helen Meng

    Abstract: To further improve the speaking styles of synthesized speeches, current text-to-speech (TTS) synthesis systems commonly employ reference speeches to stylize their outputs instead of just the input texts. These reference speeches are obtained by manual selection which is resource-consuming, or selected by semantic features. However, semantic features contain not only style-related information, but…

    Submitted 30 August, 2023; originally announced August 2023.

    Comments: Accepted by InterSpeech 2022

  27. arXiv:2308.13148  [pdf, ps, other]

    eess.SP cs.IT

    Understanding Turbo Codes: A Signal Processing Study

    Authors: Xiang-Gen Xia

    Abstract: In this paper, we study turbo codes from the digital signal processing point of view by defining turbo codes over the complex field. It is known that iterative decoding and interleaving between concatenated parallel codes are two key elements that make turbo codes perform significantly better than the conventional error control codes. This is analytically illustrated in this paper by showing that…

    Submitted 24 August, 2023; originally announced August 2023.

  28. arXiv:2308.09512  [pdf, other]

    cs.IT eess.SP

    Multiuser Communications with Movable-Antenna Base Station: Joint Antenna Positioning, Receive Combining, and Power Control

    Authors: Zhenyu Xiao, Xiangyu Pi, Lipeng Zhu, Xiang-Gen Xia, Rui Zhang

    Abstract: Movable antenna (MA) is an emerging technology which enables a local movement of the antenna in the transmitter/receiver region for improving the channel condition and communication performance. In this paper, we study the deployment of multiple MAs at the base station (BS) for enhancing the multiuser communication performance. First, we model the multiuser channel in the uplink to characterize th…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2308.05546

  29. arXiv:2307.10316  [pdf, other]

    cs.CV eess.IV

    CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation

    Authors: Lizhao Liu, Zhuangwei Zhuang, Shangxin Huang, Xunlong Xiao, Tianhang Xiang, Cen Chen, Jingdong Wang, Mingkui Tan

    Abstract: We study the task of weakly-supervised point cloud semantic segmentation with sparse annotations (e.g., less than 0.1% points are labeled), aiming to reduce the expensive cost of dense annotations. Unfortunately, with extremely sparse annotated points, it is very difficult to extract both contextual and object information for scene understanding such as semantic segmentation. Motivated by masked m…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  30. arXiv:2307.05386  [pdf, other]

    eess.SP physics.optics

    Exploring the Potential of Integrated Optical Sensing and Communication (IOSAC) Systems with Si Waveguides for Future Networks

    Authors: Xiangpeng Ou, Ying Qiu, Ming Luo, Fujun Sun, Peng Zhang, Gang Yang, Junjie Li, Jianfeng Gao, Xiaobin He, Anyan Du, Bo Tang, Bin Li, Zichen Liu, Zhihua Li, Ling Xie, Xi Xiao, Jun Luo, Wenwu Wang, ** Tao, Yan Yang

    Abstract: Advanced silicon photonic technologies enable integrated optical sensing and communication (IOSAC) in real time for the emerging application requirements of simultaneous sensing and communication for next-generation networks. Here, we propose and demonstrate the IOSAC system on the silicon nitride (SiN) photonics platform. The IOSAC devices based on microring resonators are capable of monitoring t…

    Submitted 27 June, 2023; originally announced July 2023.

    Comments: 11 pages, 5 figures

  31. arXiv:2307.05365  [pdf]

    eess.SP cs.HC

    Decoding Taste Information in Human Brain: A Temporal and Spatial Reconstruction Data Augmentation Method Coupled with Taste EEG

    Authors: Xiuxin Xia, Yuchao Yang, Yan Shi, Wenbo Zheng, Hong Men

    Abstract: For humans, taste is essential for perceiving food's nutrient content or harmful components. The current sensory evaluation of taste mainly relies on artificial sensory evaluation and electronic tongue, but the former has strong subjectivity and poor repeatability, and the latter is not flexible enough. This work proposed a strategy for acquiring and recognizing taste electroencephalogram (EEG), a…

    Submitted 1 July, 2023; originally announced July 2023.

    Comments: 10 pages, 11 figures, 30 references, article is being submitted

  32. arXiv:2307.03387  [pdf, ps, other]

    cs.IT eess.SP

    A Joint Design for Full-duplex OFDM AF Relay System with Precoded Short Guard Interval

    Authors: Pu Yang, Xiang-Gen Xia, Qingyue Qu, Han Wang, Yi Liu

    Abstract: In-band full-duplex relay (FDR) has attracted much attention as an effective solution to improve the coverage and spectral efficiency in wireless communication networks. The basic problem for FDR transmission is how to eliminate the inherent self-interference and re-use the residual self-interference (RSI) at the relay to improve the end-to-end performance. Considering the RSI at the FDR, the over…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: 16 pages, 5 figures

    MSC Class: 94-10; ACM Class: H.1.1

  33. arXiv:2307.01798  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Edge-aware Multi-task Network for Integrating Quantification Segmentation and Uncertainty Prediction of Liver Tumor on Multi-modality Non-contrast MRI

    Authors: Xiaojiao Xiao, Qinmin Hu, Guanghui Wang

    Abstract: Simultaneous multi-index quantification, segmentation, and uncertainty estimation of liver tumors on multi-modality non-contrast magnetic resonance imaging (NCMRI) are crucial for accurate diagnosis. However, existing methods lack an effective mechanism for multi-modality NCMRI fusion and accurate boundary information capture, making these tasks challenging. To address these issues, this paper pro…

    Submitted 4 July, 2023; originally announced July 2023.

  34. arXiv:2306.10805  [pdf]

    physics.med-ph cs.CV eess.IV

    Experts' cognition-driven ensemble deep learning for external validation of predicting pathological complete response to neoadjuvant chemotherapy from histological images in breast cancer

    Authors: Yongquan Yang, Fengling Li, Yani Wei, Yuanyuan Zhao, Jing Fu, Xiuli Xiao, Hong Bu

    Abstract: In breast cancer imaging, there has been a trend to directly predict pathological complete response (pCR) to neoadjuvant chemotherapy (NAC) from histological images based on deep learning (DL). However, it has been a commonly known problem that the constructed DL-based models numerically have better performances in internal validation than in external validation. The primary reason for this situat…

    Submitted 19 June, 2023; originally announced June 2023.

  35. arXiv:2306.00812  [pdf, other]

    eess.AS cs.SD

    Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

    Authors: Xiaohuai Le, Tong Lei, Li Chen, Yiqing Guo, Chao He, Cheng Chen, Xianjun Xia, Hua Gao, Yijian Xiao, Piao Ding, Shenyi Song, Jing Lu

    Abstract: With fewer feature dimensions, filter banks are often used in light-weight full-band speech enhancement models. In order to further enhance the coarse speech in the sub-band domain, it is necessary to apply a post-filtering for harmonic retrieval. The signal processing-based comb filters used in RNNoise and PercepNet have limited performance and may cause speech quality degradation due to inaccura…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: accepted by Interspeech 2023

  36. arXiv:2305.13753  [pdf, other]

    cs.IT eess.SP

    A Graph-Based Collision Resolution Scheme for Asynchronous Unsourced Random Access

    Authors: Tianya Li, Yongpeng Wu, Wenjun Zhang, Xiang-Gen Xia, Chengshan Xiao

    Abstract: This paper investigates the multiple-input-multiple-output (MIMO) massive unsourced random access in an asynchronous orthogonal frequency division multiplexing (OFDM) system, with both timing and frequency offsets (TFO) and non-negligible user collisions. The proposed coding framework splits the data into two parts encoded by sparse regression code (SPARC) and low-density parity check (LDPC) code.…

    Submitted 18 August, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 6 pages, 6 figures, accepted for the presentation at IEEE GLOBECOM 2023

  37. arXiv:2305.13652  [pdf, ps, other]

    cs.CL eess.AS

    Cross-lingual Knowledge Transfer and Iterative Pseudo-labeling for Low-Resource Speech Recognition with Transducers

    Authors: Jan Silovsky, Liuhui Deng, Arturo Argueta, Tresi Arvizo, Roger Hsiao, Sasha Kuznietsov, Yiu-Chang Lin, Xiaoqiang Xiao, Yuanyuan Zhang

    Abstract: Voice technology has become ubiquitous recently. However, the accuracy, and hence experience, in different languages varies significantly, which makes the technology not equally inclusive. The availability of data for different languages is one of the key factors affecting accuracy, especially in training of all-neural end-to-end automatic speech recognition systems. Cross-lingual knowledge tran…

    Submitted 22 May, 2023; originally announced May 2023.

  38. arXiv:2305.07584  [pdf, other]

    cs.IT eess.SP

    Proactive Content Caching Scheme in Urban Vehicular Networks

    Authors: Biqian Feng, Chenyuan Feng, Daquan Feng, Yongpeng Wu, Xiang-Gen Xia

    Abstract: Stream media content caching is a key enabling technology to promote the value chain of future urban vehicular networks. Nevertheless, the high mobility of vehicles, intermittency of information transmissions, high dynamics of user requests, limited caching capacities and extreme complexity of business scenarios pose an enormous challenge to content caching and distribution in vehicular networks.…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted by IEEE Transactions on Communications

  39. Precoder Design for Massive MIMO Downlink with Matrix Manifold Optimization

    Authors: Rui Sun, Chen Wang, An-An Lu, Xiqi Gao, Xiang-Gen Xia

    Abstract: We investigate the weighted sum-rate (WSR) maximization linear precoder design for massive multiple-input multiple-output (MIMO) downlink. We consider a single-cell system with multiple users and propose a unified matrix manifold optimization framework applicable to total power constraint (TPC), per-user power constraint (PUPC) and per-antenna power constraint (PAPC). We prove that the precoders u…

    Submitted 10 April, 2024; v1 submitted 31 March, 2023; originally announced April 2023.

    Comments: 16 pages, 11 figures, journal

    Journal ref: IEEE Transactions on Signal Processing, vol. 72, pp. 1065-1080, 2024

  40. arXiv:2303.07621  [pdf, other]

    eess.AS cs.SD

    Two-stage Neural Network for ICASSP 2023 Speech Signal Improvement Challenge

    Authors: Mingshuai Liu, Shubo Lv, Zihan Zhang, Runduo Han, Xiang Hao, Xianjun Xia, Li Chen, Yijian Xiao, Lei Xie

    Abstract: In ICASSP 2023 speech signal improvement challenge, we developed a dual-stage neural model which improves speech signal quality induced by different distortions in a stage-wise divide-and-conquer fashion. Specifically, in the first stage, the speech improvement network focuses on recovering the missing components of the spectrum, while in the second stage, our model aims to further suppress noise,…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  41. arXiv:2302.09953  [pdf, other]

    eess.AS cs.SD

    Personalized speech enhancement combining band-split RNN and speaker attentive module

    Authors: Xiaohuai Le, Li Chen, Chao He, Yiqing Guo, Cheng Chen, Xianjun Xia, Jing Lu

    Abstract: Target speaker information can be utilized in speech enhancement (SE) models to more effectively extract the desired speech. Previous works introduce the speaker embedding into speech enhancement models by means of concatenation or affine transformation. In this paper, we propose a speaker attentive module to calculate the attention scores between the speaker embedding and the intermediate feature…

    Submitted 16 March, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

  42. arXiv:2302.08744  [pdf, other]

    eess.SP cs.AR

    Tensorized Optical Multimodal Fusion Network

    Authors: Yequan Zhao, Xian Xiao, Geza Kurczveil, Raymond G. Beausoleil, Zheng Zhang

    Abstract: We propose the first tensorized optical multimodal fusion network architecture with a self-attention mechanism and low-rank tensor fusion. Simulation results show $51.3 \times$ less hardware requirement and $3.7\times 10^{13}$ MAC/J energy efficiency.

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: CLEO 2023 Novel Applications in Integrated Photonics

  43. arXiv:2302.08549  [pdf, other]

    eess.AS cs.SD

    Speaker Change Detection for Transformer Transducer ASR

    Authors: Jian Wu, Zhuo Chen, Min Hu, Xiong Xiao, Jinyu Li

    Abstract: Speaker change detection (SCD) is an important feature that improves the readability of the recognized words from an automatic speech recognition (ASR) system by breaking the word sequence into paragraphs at speaker change points. Existing SCD solutions either require additional ensemble for the time based decisions and recognized word sequences, or implement a tight integration between ASR and SC…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: 5 pages, 1 figure, accepted by ICASSP 2023

  44. arXiv:2301.12804  [pdf, ps, other]

    cs.IT eess.SP

    From ORAN to Cell-Free RAN: Architecture, Performance Analysis, Testbeds and Trials

    Authors: Yang Cao, Ziyang Zhang, Xinjiang Xia, Pengzhe Xin, Dongjie Liu, Kang Zheng, Mengting Lou, Jing Jin, Qixing Wang, Dongming Wang, Yongming Huang, Xiaohu You, Jiangzhou Wang

    Abstract: Open radio access network (ORAN) provides an open architecture to implement radio access network (RAN) of the fifth generation (5G) and beyond mobile communications. As a key technology for the evolution to the sixth generation (6G) systems, cell-free massive multiple-input multiple-output (CF-mMIMO) can effectively improve the spectrum efficiency, peak rate and reliability of wireless communicati…

    Submitted 6 February, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  45. arXiv:2212.04817  [pdf, ps, other]

    eess.SP cs.IT

    A New OFDM System for IIR Channels

    Authors: Xiang-Gen Xia

    Abstract: In this paper, we propose a new OFDM system for an IIR channel with the form of $B(z)/A(z)$ for two polynomials $A(z)$ and $B(z)$. Different from the conventional OFDM transmission over an FIR channel, a guard interval of an OFDM symbol is added such that the corresponding part at receiver is the cyclic prefix (CP) of the received OFDM symbol. The guard interval and CP lengths are the same and not…

    Submitted 9 December, 2022; originally announced December 2022.

  46. arXiv:2212.04191  [pdf]

    eess.SY

    Electromagnetic Environment Analysis of High-Power Wireless Charging Device

    Authors: Zhengyang Zhang, Zhihui Liu, Wen** Zhang, Rui Zhang, Xiang Xiao

    Abstract: Objective: Aiming at the problems of many interference factors in the electromagnetic radiation simulation of electric vehicles, a field measurement scheme for charging devices is designed. Through the monitoring of wireless charging equipment, the radiation level and distribution of the electric field value and magnetic field value around the charging equipment is explored, to analyze the influenc…

    Submitted 8 December, 2022; originally announced December 2022.

  47. arXiv:2211.12671  [pdf, other]

    eess.SP

    3-D Positioning and Resource Allocation for Multi-UAV Base Stations Under Blockage-Aware Channel Model

    Authors: Pengfei Yi, Lipeng Zhu, Zhenyu Xiao, Rui Zhang, Zhu Han, Xiang-Gen Xia

    Abstract: In this paper, we propose to deploy multiple unmanned aerial vehicle (UAV) mounted base stations to serve ground users in outdoor environments with obstacles. In particular, the geographic information is employed to capture the blockage effects for air-to-ground (A2G) links caused by buildings, and a realistic blockage-aware A2G channel model is proposed to characterize the continuous variation of…

    Submitted 22 November, 2022; originally announced November 2022.

  48. Functional Split of In-Network Deep Learning for 6G: A Feasibility Study

    Authors: Jia He, Huanzhuo Wu, Xun Xiao, Riccardo Bassoli, Frank H. P. Fitzek

    Abstract: In existing mobile network systems, the data plane (DP) is mainly considered a pipeline consisting of network elements end-to-end forwarding user data traffics. With the rapid maturity of programmable network devices, however, mobile network infrastructure mutates towards a programmable computing platform. Therefore, such a programmable DP can provide in-network computing capability for many appli…

    Submitted 14 November, 2022; originally announced November 2022.

  49. arXiv:2208.13085  [pdf, other]

    eess.AS cs.CL cs.SD

    Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization

    Authors: Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Takuya Yoshioka, Jian Wu

    Abstract: This paper describes a speaker diarization model based on target speaker voice activity detection (TS-VAD) using transformers. To overcome the original TS-VAD model's drawback of being unable to handle an arbitrary number of speakers, we investigate model architectures that use input tensors with variable-length time and speaker dimensions. Transformer layers are applied to the speaker axis to mak…

    Submitted 25 September, 2022; v1 submitted 27 August, 2022; originally announced August 2022.

  50. Trajectory Tracking Control of the Bionic Joint Actuated by Pneumatic Artificial Muscle Based on Robust Modeling

    Authors: Yang Wang, Qiang Zhang, Xiao-hui Xiao

    Abstract: To simply and effectively realize the trajectory tracking control of a bionic joint actuated by a single pneumatic artificial muscle (PAM), a cascaded control strategy is proposed based on the robust modeling method. Firstly, the relationship between the input voltage of the proportional directional control valve and the inner driving pressure of PAM is expressed as a nonlinear model analytically.…

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: 7 pages, 16 figures, journal paper