
Showing 1–50 of 345 results for author: Yang, X

Searching in archive eess.
  1. arXiv:2406.18935  [pdf]

    eess.SY

    Generalized Averaging Method for Power Electronics Modeling from DC to above Half the Switching Frequency

    Authors: Hongchang Li, Kang** Wang, **gyang Fang, Wenjie Chen, Xu Yang

    Abstract: Modeling power electronic converters at frequencies close to or above half the switching frequency has been difficult due to the time-variant and discontinuous switching actions. This paper uses the properties of moving Fourier coefficients to develop the generalized averaging method, breaking through the limit of half the switching frequency. The paper also proposes the generalized average model f…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.15656  [pdf, other]

    eess.IV cs.CV

    Adaptive Self-Supervised Consistency-Guided Diffusion Model for Accelerated MRI Reconstruction

    Authors: Mojtaba Safari, Zach Eidex, Shaoyan Pan, Richard L. J. Qiu, Xiaofeng Yang

    Abstract: Purpose: To propose a self-supervised deep learning-based compressed sensing MRI (DL-based CS-MRI) method named "Adaptive Self-Supervised Consistency Guided Diffusion Model (ASSCGD)" to accelerate data acquisition without requiring fully sampled datasets. Materials and Methods: We used the fastMRI multi-coil brain axial T2-weighted (T2-w) dataset from 1,376 cases and single-coil brain quantitative…

    Submitted 21 June, 2024; originally announced June 2024.

  3. arXiv:2406.12186  [pdf, ps, other]

    eess.IV cs.CV

    Unlocking the Potential of Early Epochs: Uncertainty-aware CT Metal Artifact Reduction

    Authors: Xinquan Yang, Guanqun Zhou, Wei Sun, Youjian Zhang, Zhongya Wang, Jiahui He, Zhicheng Zhang

    Abstract: In computed tomography (CT), the presence of metallic implants in patients often leads to disruptive artifacts in the reconstructed images, hindering accurate diagnosis. Recently, a large number of supervised deep learning-based approaches have been proposed for metal artifact reduction (MAR). However, these methods neglect the influence of initial training weights. In this paper, we have discover…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2406.08887  [pdf, other]

    eess.SP

    Low-Overhead Channel Estimation via 3D Extrapolation for TDD mmWave Massive MIMO Systems Under High-Mobility Scenarios

    Authors: Binggui Zhou, Xi Yang, Shaodan Ma, Feifei Gao, Guanghua Yang

    Abstract: In TDD mmWave massive MIMO systems, the downlink CSI can be attained through uplink channel estimation thanks to the uplink-downlink channel reciprocity. However, the channel aging issue is significant under high-mobility scenarios and thus necessitates frequent uplink channel estimation. In addition, large numbers of antennas and subcarriers lead to high-dimensional CSI matrices, aggravating the…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures, 3 tables. This paper has been submitted to IEEE journal for possible publication

  5. arXiv:2406.02247  [pdf, other]

    physics.ins-det eess.SY

    A Study of the Latest Updates of the Readout System for the Hybird-Pixel Detector at HEPS

    Authors: Hangxu Li, Jie Zhang, Wei Wei, Zhenjie Li, Xiaolu Ji, Yan Zhang, Xuanzheng Yang, Shuihan Zhang, Xueke Ma, Peng Liu, Zheng Wang, Yuanbai Chen

    Abstract: The High Energy Photon Source (HEPS) represents a fourth-generation light source. This facility has made unprecedented advancements in accelerator technology, necessitating the development of new detectors to satisfy physical requirements such as single-photon resolution, large dynamic range, and high frame rates. Since 2016, the Institute of High Energy Physics has introduced the first user-exper…

    Submitted 4 June, 2024; originally announced June 2024.

  6. arXiv:2406.02126  [pdf, other]

    eess.SY cs.AI cs.LG cs.MA

    CityLight: A Universal Model Towards Real-world City-scale Traffic Signal Control Coordination

    Authors: **wei Zeng, Chao Yu, Xinyi Yang, Wenxuan Ao, Jian Yuan, Yong Li, Yu Wang, Huazhong Yang

    Abstract: Traffic signal control (TSC) is a promising low-cost measure to enhance transportation efficiency without affecting existing road infrastructure. While various reinforcement learning-based TSC methods have been proposed and experimentally outperform conventional rule-based methods, none of them has been deployed in the real world. An essential gap lies in the oversimplification of the scenarios in…

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  7. arXiv:2405.10550  [pdf, other]

    eess.IV cs.CV

    LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion

    Authors: Tong Chen, Qingcheng Lyu, Long Bai, Erjian Guo, Huxin Gao, Xiaoxiao Yang, Hongliang Ren, Lu** Zhou

    Abstract: Advances in endoscopy use in surgeries face challenges like inadequate lighting. Deep learning, notably the Denoising Diffusion Probabilistic Model (DDPM), holds promise for low-light image enhancement in the medical field. However, DDPMs are computationally demanding and slow, limiting their practical medical applications. To bridge this gap, we propose a lightweight DDPM, dubbed LighTDiff. It ad…

    Submitted 17 May, 2024; originally announced May 2024.

  8. arXiv:2404.19087  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Deep Reinforcement Learning for Advanced Longitudinal Control and Collision Avoidance in High-Risk Driving Scenarios

    Authors: Dianwei Chen, Yaobang Gong, Xianfeng Yang

    Abstract: Existing Advanced Driver Assistance Systems primarily focus on the vehicle directly ahead, often overlooking potential risks from following vehicles. This oversight can lead to ineffective handling of high-risk situations, such as high-speed, closely spaced, multi-vehicle scenarios where emergency braking by one vehicle might trigger a pile-up collision. To overcome these limitations, this study i…

    Submitted 29 April, 2024; originally announced April 2024.

  9. arXiv:2404.18105  [pdf, other]

    cs.RO eess.SP

    Tightly-Coupled VLP/INS Integrated Navigation by Inclination Estimation and Blockage Handling

    Authors: Xiao Sun, Yuan Zhuang, Xiansheng Yang, Jianzhu Huai, Tianming Huang, Daquan Feng

    Abstract: Visible Light Positioning (VLP) has emerged as a promising technology capable of delivering indoor localization with high accuracy. In VLP systems that use Photodiodes (PDs) as light receivers, the Received Signal Strength (RSS) is affected by the incidence angle of light, making the inclination of PDs a critical parameter in the positioning model. Currently, most studies assume the inclination to…

    Submitted 28 April, 2024; originally announced April 2024.

  10. arXiv:2404.15946  [pdf]

    cs.CV cs.AI eess.IV

    Mammo-CLIP: Leveraging Contrastive Language-Image Pre-training (CLIP) for Enhanced Breast Cancer Diagnosis with Multi-view Mammography

    Authors: Xuxin Chen, Yuheng Li, Mingzhe Hu, Ella Salari, Xiaoqian Chen, Richard L. J. Qiu, Bin Zheng, Xiaofeng Yang

    Abstract: Although fusion of information from multiple views of mammograms plays an important role in increasing the accuracy of breast cancer detection, developing multi-view mammograms-based computer-aided diagnosis (CAD) schemes still faces challenges and no such CAD schemes have been used in clinical practice. To overcome the challenges, we investigate a new approach based on Contrastive Language-Image Pre-tr…

    Submitted 24 April, 2024; originally announced April 2024.

  11. arXiv:2404.12725  [pdf, other]

    cs.SD cs.CV cs.LG cs.MM eess.AS

    Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction

    Authors: Zhaoxi Mu, Xinyu Yang

    Abstract: The integration of visual cues has revitalized the performance of the target speech extraction task, elevating it to the forefront of the field. Nevertheless, this multi-modal learning paradigm often encounters the challenge of modality imbalance. In audio-visual target speech extraction tasks, the audio modality tends to dominate, potentially overshadowing the importance of visual guidance. To ta…

    Submitted 5 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  12. arXiv:2404.10640  [pdf, other]

    eess.IV

    Adapting SAM for Surgical Instrument Tracking and Segmentation in Endoscopic Submucosal Dissection Videos

    Authors: Jieming Yu, Long Bai, Guankun Wang, An Wang, Xiaoxiao Yang, Huxin Gao, Hongliang Ren

    Abstract: The precise tracking and segmentation of surgical instruments have led to a remarkable enhancement in the efficiency of surgical procedures. However, the challenge lies in achieving accurate segmentation of surgical instruments while minimizing the need for manual annotation and reducing the time required for the segmentation process. To tackle this, we propose a novel framework for surgical instr…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: To appear in IEEE ICRA 2024 C4SR+ Workshop

  13. Cost-effective company response policy for product co-creation in company-sponsored online community

    Authors: Jiamin Hu, Lu-Xing Yang, Xiaofan Yang, Kaifan Huang, Gang Li, Yong Xiang

    Abstract: Product co-creation based on company-sponsored online community has come to be a paradigm of developing new products collaboratively with customers. In such a product co-creation campaign, the sponsoring company needs to interact intensively with active community members about the design scheme of the product. We call the collection of the rates of the company's response to active community member…

    Submitted 14 April, 2024; originally announced April 2024.

  14. arXiv:2404.09000  [pdf, other]

    eess.IV cs.CV cs.LG

    MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images

    Authors: Yingjie Xi, Boyuan Cheng, **gyao Cai, Jian Jun Zhang, Xiaosong Yang

    Abstract: Human whole-body X-rays could offer a valuable reference for various applications, including medical diagnostics, digital animation modeling, and ergonomic design. The traditional method of obtaining X-ray information requires the use of CT (Computed Tomography) scan machines, which emit potentially harmful radiation. Thus it faces a significant limitation for realistic applications because it…

    Submitted 13 April, 2024; originally announced April 2024.

  15. arXiv:2404.03883  [pdf, other]

    eess.IV cs.CV

    LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification

    Authors: Judy X Yang, Jun Zhou, **g Wang, Hui Tian, Alan Wee-Chung Liew

    Abstract: The fusion of hyperspectral and LiDAR data has been an active research topic. Existing fusion methods have ignored the high-dimensionality and redundancy challenges in hyperspectral images, even though band selection methods have been intensively studied for hyperspectral image (HSI) processing. This paper addresses this significant gap by introducing a cross-attention mechanism from the transfor…

    Submitted 15 April, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: 15 pages, 13 figures

    MSC Class: F.2.2; I.2.7

    Journal ref: IEEE - TGRS-2024-00264.R1 Final Files Received

  16. arXiv:2404.00837  [pdf]

    eess.IV cs.CV cs.LG physics.med-ph

    Automated HER2 Scoring in Breast Cancer Images Using Deep Learning and Pyramid Sampling

    Authors: Sahan Yoruc Selcuk, Xilin Yang, Bijie Bai, Yijie Zhang, Yuzhu Li, Musa Aydin, Aras Firat Unal, Aditya Gomatam, Zhen Guo, Darrow Morgan Angus, Goren Kolodney, Karine Atlan, Tal Keidar Haran, Nir Pillar, Aydogan Ozcan

    Abstract: Human epidermal growth factor receptor 2 (HER2) is a critical protein in cancer cell growth that signifies the aggressiveness of breast cancer (BC) and helps predict its prognosis. Accurate assessment of immunohistochemically (IHC) stained tissue slides for HER2 expression levels is essential for both treatment guidance and understanding of cancer mechanisms. Nevertheless, the traditional workflow…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 21 Pages, 7 Figures

  17. arXiv:2403.17337  [pdf, other]

    eess.SY eess.SP

    Destination-Constrained Linear Dynamical System Modeling in Set-Valued Frameworks

    Authors: Xiaowei Yang, Haiqi Liu, Fanqin Meng, Xiao**g Shen

    Abstract: Directional motion towards a specified destination is a common occurrence in physical processes and human societal activities. Utilizing this prior information can significantly improve the control and predictive performance of system models. This paper primarily focuses on reconstructing linear dynamic system models based on destination constraints in the set-valued framework. We treat destinatio…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 15 pages, 11 figures

  18. arXiv:2403.15716  [pdf, other]

    cs.RO cs.AI eess.SY

    Distributed Robust Learning based Formation Control of Mobile Robots based on Bioinspired Neural Dynamics

    Authors: Zhe Xu, Tao Yan, Simon X. Yang, S. Andrew Gadsden, Mohammad Biglarbegian

    Abstract: This paper addresses the challenges of distributed formation control in multiple mobile robots, introducing a novel approach that enhances real-world practicability. We first introduce a distributed estimator using a variable structure and cascaded design technique, eliminating the need for derivative information to improve the real-time performance. Then, a kinematic tracking control method is de…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: This paper is accepted by IEEE Transactions on Intelligent Vehicles

  19. arXiv:2403.09100  [pdf]

    physics.med-ph cs.CV cs.LG eess.IV physics.optics

    Virtual birefringence imaging and histological staining of amyloid deposits in label-free tissue using autofluorescence microscopy and deep learning

    Authors: Xilin Yang, Bijie Bai, Yijie Zhang, Musa Aydin, Sahan Yoruc Selcuk, Zhen Guo, Gregory A. Fishbein, Karine Atlan, William Dean Wallace, Nir Pillar, Aydogan Ozcan

    Abstract: Systemic amyloidosis is a group of diseases characterized by the deposition of misfolded proteins in various organs and tissues, leading to progressive organ dysfunction and failure. Congo red stain is the gold standard chemical stain for the visualization of amyloid deposits in tissue sections, as it forms complexes with the misfolded proteins and shows a birefringence pattern under polarized lig…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 20 Pages, 5 Figures

  20. arXiv:2403.08238  [pdf, other]

    cs.RO cs.AI eess.SY

    A Novel Feature Learning-based Bio-inspired Neural Network for Real-time Collision-free Rescue of Multi-Robot Systems

    Authors: Junfei Li, Simon X. Yang

    Abstract: Natural disasters and urban accidents drive the demand for rescue robots to provide safer, faster, and more efficient rescue trajectories. In this paper, a feature learning-based bio-inspired neural network (FLBBINN) is proposed to quickly generate a heuristic rescue path in complex and dynamic environments, as traditional approaches usually cannot provide a satisfactory solution to real-time resp…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted for publication in IEEE Transactions on Industrial Electronics

  21. arXiv:2403.04116  [pdf, other]

    eess.IV cs.CV

    Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

    Authors: Yuanhao Cai, Yixun Liang, Jiahao Wang, Angtian Wang, Yulun Zhang, Xiaokang Yang, Zongwei Zhou, Alan Yuille

    Abstract: X-ray is widely applied for transmission imaging due to its stronger penetration than natural light. When rendering novel view X-ray projections, existing methods mainly based on NeRF suffer from long training time and slow inference speed. In this paper, we propose a 3D Gaussian splatting-based framework, namely X-Gaussian, for X-ray novel view synthesis. Firstly, we redesign a radiative Gaussian…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: The first 3D Gaussian Splatting-based method for X-ray 3D reconstruction

  22. arXiv:2402.19004  [pdf, other]

    cs.CV eess.IV

    RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation

    Authors: Jie Zhang, Xubing Yang, Rui Jiang, Wei Shao, Li Zhang

    Abstract: The development of high-resolution remote sensing satellites has provided great convenience for research work related to remote sensing. Segmentation and extraction of specific targets are essential tasks when facing the vast and complex remote sensing images. Recently, the introduction of Segment Anything Model (SAM) provides a universal pre-training model for image segmentation tasks. While the…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 12 pages, 11 figures

  23. arXiv:2402.17146  [pdf]

    eess.AS

    Target Speaker Extraction by Directly Exploiting Contextual Information in the Time-Frequency Domain

    Authors: Xue Yang, Changchun Bao, **g Zhou, Xianhong Chen

    Abstract: In target speaker extraction, many studies rely on the speaker embedding which is obtained from an enrollment of the target speaker and employed as the guidance. However, solely using speaker embedding may not fully utilize the contextual information contained in the enrollment. In this paper, we directly exploit this contextual information in the time-frequency (T-F) domain. Specifically, the T-F…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted by ICASSP 2024

  24. arXiv:2402.08235  [pdf, other]

    eess.IV cs.CV

    Color Image Denoising Using The Green Channel Prior

    Authors: Zhaoming Kong, Xiaowei Yang

    Abstract: Noise removal in the standard RGB (sRGB) space remains a challenging task, in that the noise statistics of real-world images can be different in R, G and B channels. In fact, the green channel usually has twice the sampling rate in raw data and a higher signal-to-noise ratio than red/blue ones. However, the green channel prior (GCP) is often understated or ignored in color image denoising since ma…

    Submitted 13 February, 2024; originally announced February 2024.

  25. arXiv:2402.04228  [pdf, other]

    cs.RO cs.AI eess.SY

    Intelligent Collective Escape of Swarm Robots Based on a Novel Fish-inspired Self-adaptive Approach with Neurodynamic Models

    Authors: Junfei Li, Simon X. Yang

    Abstract: Fish schools present high-efficiency group behaviors through simple individual interactions to collective migration and dynamic escape from the predator. The school behavior of fish is usually a good inspiration to design control architecture for swarm robots. In this paper, a novel fish-inspired self-adaptive approach is proposed for collective escape for the swarm robots. In addition, a bio-insp…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: This article is accepted for publication in a future issue of IEEE Transactions on Industrial Electronics

  26. arXiv:2401.15913  [pdf, other]

    eess.IV cs.CV cs.LG physics.flu-dyn stat.AP

    Vision-Informed Flow Image Super-Resolution with Quaternion Spatial Modeling and Dynamic Flow Convolution

    Authors: Qinglong Cao, Zhengqin Xu, Chao Ma, Xiaokang Yang, Yuntian Chen

    Abstract: Flow image super-resolution (FISR) aims at recovering high-resolution turbulent velocity fields from low-resolution flow images. Existing FISR methods mainly process the flow images in natural image patterns, while the critical and distinct flow visual properties are rarely considered. This negligence would cause the significant domain gap between flow and natural images to severely hamper the acc…

    Submitted 29 January, 2024; originally announced January 2024.

  27. arXiv:2401.11205  [pdf, other]

    cs.IT eess.SP

    Joint Beamforming Optimization and Mode Selection for RDARS-aided MIMO Systems

    Authors: **tao Wang, Chengzhi Ma, Shiqi Gong, Xi Yang, Shaodan Ma

    Abstract: Considering the appealing distribution gains of distributed antenna systems (DAS) and passive gains of reconfigurable intelligent surface (RIS), a flexible reconfigurable architecture called reconfigurable distributed antenna and reflecting surface (RDARS) is proposed. RDARS encompasses DAS and RIS as two special cases and maintains the advantages of distributed antennas while reducing the hardwar…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: 13 pages, 9 figures. This paper has been submitted to IEEE journal for possible publication

  28. arXiv:2401.05521  [pdf, other]

    cs.RO cs.AI eess.SY

    Current Effect-eliminated Optimal Target Assignment and Motion Planning for a Multi-UUV System

    Authors: Danjie Zhu, Simon X. Yang

    Abstract: The paper presents an innovative approach (CBNNTAP) that addresses the complexities and challenges introduced by ocean currents when optimizing target assignment and motion planning for a multi-unmanned underwater vehicle (UUV) system. The core of the proposed algorithm involves the integration of several key components. Firstly, it incorporates a bio-inspired neural network-based (BINN) approach…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: This paper was accepted by IEEE Transactions on Intelligent Transportation Systems

  29. arXiv:2401.05412  [pdf, other]

    cs.CV cs.AI eess.SP

    Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics

    Authors: Xueyuan Yang, Chao Yao, Xiaojuan Ban

    Abstract: Leveraging wearable devices for motion reconstruction has emerged as an economical and viable technique. Certain methodologies employ sparse Inertial Measurement Units (IMUs) on the human body and harness data-driven strategies to model human poses. However, the reconstruction of motion based solely on sparse IMUs data is inherently fraught with ambiguity, a consequence of numerous identical IMU r…

    Submitted 26 December, 2023; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  30. arXiv:2312.16247  [pdf, other]

    cs.CV eess.IV

    Toward Accurate and Temporally Consistent Video Restoration from Raw Data

    Authors: Shi Guo, Jianqi Ma, Xi Yang, Zhengqiang Zhang, Lei Zhang

    Abstract: Denoising and demosaicking are two fundamental steps in reconstructing a clean full-color video from raw data, while performing video denoising and demosaicking jointly, namely VJDD, could lead to better video restoration performance than performing them separately. In addition to restoration accuracy, another key challenge to VJDD lies in the temporal consistency of consecutive frames. This issue…

    Submitted 25 December, 2023; originally announced December 2023.

  31. arXiv:2312.16040  [pdf, other]

    cs.CV eess.IV

    Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB Spectral Domain Translation

    Authors: Xingxing Yang, Jie Chen, Zaifeng Yang

    Abstract: NIR-to-RGB spectral domain translation is a challenging task due to the mapping ambiguities, and existing methods show limited learning capacities. To address these challenges, we propose to colorize NIR images via a multi-scale progressive feature embedding network (MPFNet), with the guidance of grayscale image colorization. Specifically, we first introduce a domain translation module that transl…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: Accepted by IEEE VCIP 2023

  32. arXiv:2312.10307  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    MusER: Musical Element-Based Regularization for Generating Symbolic Music with Emotion

    Authors: Shulei Ji, Xinyu Yang

    Abstract: Generating music with emotion is an important task in automatic music generation, in which emotion is evoked through a variety of musical elements (such as pitch and duration) that change over time and collaborate with each other. However, prior research on deep learning-based emotional music generation has rarely explored the contribution of different musical elements to emotions, let alone the d…

    Submitted 1 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  33. arXiv:2312.10305  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction

    Authors: Zhaoxi Mu, Xinyu Yang, Sining Sun, Qing Yang

    Abstract: Speech signals are inherently complex as they encompass both global acoustic characteristics and local semantic information. However, in the task of target speech extraction, certain elements of global and local semantic information in the reference speech, which are irrelevant to speaker identity, can lead to speaker confusion within the speech extraction network. To overcome this challenge, we p…

    Submitted 19 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  34. arXiv:2312.04062  [pdf, other]

    cs.IT cs.AI eess.SP

    A Low-Overhead Incorporation-Extrapolation based Few-Shot CSI Feedback Framework for Massive MIMO Systems

    Authors: Binggui Zhou, Xi Yang, **tao Wang, Shaodan Ma, Feifei Gao, Guanghua Yang

    Abstract: Accurate channel state information (CSI) is essential for downlink precoding in frequency division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems with orthogonal frequency-division multiplexing (OFDM). However, obtaining CSI through feedback from the user equipment (UE) becomes challenging with the increasing scale of antennas and subcarriers and leads to extremely high CSI…

    Submitted 21 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: 16 pages, 12 figures, 5 tables. Accepted by IEEE Transactions on Wireless Communications

  35. arXiv:2312.03410  [pdf, other]

    cs.SD cs.MM eess.AS

    Detecting Voice Cloning Attacks via Timbre Watermarking

    Authors: Chang Liu, Jie Zhang, Tianwei Zhang, Xi Yang, Weiming Zhang, Nenghai Yu

    Abstract: Nowadays, it is common to release audio content to the public. However, with the rise of voice cloning technology, attackers have the potential to easily impersonate a specific person by utilizing his publicly released audio without any permission. Therefore, it becomes significant to detect any potential misuse of the released audio content and protect its timbre from being impersonated. To this…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: NDSS 2024

  36. arXiv:2311.16155  [pdf, other]

    eess.SP cs.LG

    Deep Learning-Based Frequency Offset Estimation

    Authors: Tao Chen, Shilian Zheng, Jiawei Zhu, Qi Xuan, Xiaoniu Yang

    Abstract: In wireless communication systems, the asynchronization of the oscillators in the transmitter and the receiver along with the Doppler shift due to relative movement may lead to the presence of carrier frequency offset (CFO) in the received signals. Estimation of CFO is crucial for subsequent processing such as coherent demodulation. In this brief, we demonstrate the utilization of deep learning fo…

    Submitted 8 November, 2023; originally announced November 2023.

  37. arXiv:2311.10641  [pdf]

    physics.med-ph eess.IV

    Image-Domain Material Decomposition for Dual-energy CT using Unsupervised Learning with Data-fidelity Loss

    Authors: Junbo Peng, Chih-Wei Chang, Huiqiao Xie, Richard L. J. Qiu, Justin Roper, Tonghe Wang, Beth Bradshaw, Xiangyang Tang, Xiaofeng Yang

    Abstract: Background: Dual-energy CT (DECT) and material decomposition play vital roles in quantitative medical imaging. However, the decomposition process may suffer from significant noise amplification, leading to severely degraded image signal-to-noise ratios (SNRs). While existing iterative algorithms perform noise suppression using different image priors, these heuristic image priors cannot accurately…

    Submitted 17 November, 2023; originally announced November 2023.

  38. arXiv:2311.03761  [pdf, other]

    cs.LG cs.AI eess.SP

    Augmenting Radio Signals with Wavelet Transform for Deep Learning-Based Modulation Recognition

    Authors: Tao Chen, Shilian Zheng, Kunfeng Qiu, Luxin Zhang, Qi Xuan, Xiaoniu Yang

    Abstract: The use of deep learning for radio modulation recognition has become prevalent in recent years. This approach automatically extracts high-dimensional features from large datasets, facilitating the accurate classification of modulation schemes. However, in real-world scenarios, it may not be feasible to gather sufficient training data in advance. Data augmentation is a method used to increase the d…

    Submitted 7 November, 2023; originally announced November 2023.

  39. arXiv:2311.02865  [pdf, other]

    cs.IT eess.SP

    Geometrically-Shaped Constellation for Visible Light Communications at Short Blocklength

    Authors: Jia-Ning Guo, Ru-Han Chen, Jian Zhang, Longguang Li, Xu Yang, **g Zhou

    Abstract: In this paper, we present a general framework of designing geometrically shaped constellations for short-packet visible light communications with peak- and average-intensity constraints. By leveraging tools from large deviation theory, we first characterize the second-order asymptotics of the optimal constellation shaping region under aforementioned intensity constraints, which serves as a go…

    Submitted 28 April, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

  40. arXiv:2310.19293  [pdf, other]

    eess.IV cs.CV

    FetusMapV2: Enhanced Fetal Pose Estimation in 3D Ultrasound

    Authors: Chaoyu Chen, Xin Yang, Yuhao Huang, Wenlong Shi, Yan Cao, Mingyuan Luo, Xindi Hu, Lei Zhue, Lequan Yu, Kejuan Yue, Yuanji Zhang, Yi Xiong, Dong Ni, Weijun Huang

    Abstract: Fetal pose estimation in 3D ultrasound (US) involves identifying a set of associated fetal anatomical landmarks. Its primary objective is to provide comprehensive information about the fetus through landmark connections, thus benefiting various critical applications, such as biometric measurements, plane localization, and fetal movement monitoring. However, accurately estimating the 3D fetal pose…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 16 pages, 11 figures, accepted by Medical Image Analysis (2023)

  41. Generalized Firefly Algorithm for Optimal Transmit Beamforming

    Authors: Tuan Anh Le, Xin-She Yang

    Abstract: This paper proposes a generalized Firefly Algorithm (FA) to solve an optimization framework having objective function and constraints as multivariate functions of independent optimization variables. Four representative examples of how the proposed generalized FA can be adopted to solve downlink beamforming problems are shown for a classic transmit beamforming, cognitive beamforming, reconfigurable…

    Submitted 27 October, 2023; originally announced October 2023.

  42. arXiv:2310.14636  [pdf, other]

    eess.IV cs.CV

    Multilevel Perception Boundary-guided Network for Breast Lesion Segmentation in Ultrasound Images

    Authors: Xing Yang, Jian Zhang, Qijian Chen, Li Wang, Lihui Wang

    Abstract: Automatic segmentation of breast tumors from the ultrasound images is essential for the subsequent clinical diagnosis and treatment plan. Although the existing deep learning-based methods have achieved significant progress in automatic segmentation of breast tumor, their performance on tumors with similar intensity to the normal tissues is still not pleasant, especially for the tumor boundaries. T…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 12 pages, 5 figures

  43. arXiv:2310.14485  [pdf, ps, other]

    cs.RO cs.AI eess.SY

    Intelligent Escape of Robotic Systems: A Survey of Methodologies, Applications, and Challenges

    Authors: Junfei Li, Simon X. Yang

    Abstract: Intelligent escape is an interdisciplinary field that employs artificial intelligence (AI) techniques to enable robots with the capacity to intelligently react to potential dangers in dynamic, intricate, and unpredictable scenarios. As the emphasis on safety becomes increasingly paramount and advancements in robotic technologies continue to advance, a wide range of intelligent escape methodologies…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: This paper is accepted by Journal of Intelligent and Robotic Systems

  44. arXiv:2310.11830  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition

    Authors: Kari A Noriy, Xiaosong Yang, Marcin Budka, Jian Jun Zhang

    Abstract: Multilingual speech processing requires understanding emotions, a task made difficult by limited labelled data. CLARA minimizes reliance on labelled data, enhancing generalization across languages. It excels at fostering shared representations, aiding cross-lingual transfer of speech and emotions, even with little data. Our approach adeptly captures emotional nuances in speech, overcoming subject…

    Submitted 1 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  45. arXiv:2310.11230  [pdf, other]

    eess.AS cs.LG cs.SD

    Zipformer: A faster and better encoder for automatic speech recognition

    Authors: Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey

    Abstract: The Conformer has become the most popular encoder model for automatic speech recognition (ASR). It adds convolution modules to a transformer to learn both local and global dependencies. In this work we describe a faster, more memory-efficient, and better-performing transformer, called Zipformer. Modeling changes include: 1) a U-Net-like encoder structure where middle stacks operate at lower frame…

    Submitted 9 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Published as a conference paper at ICLR 2024

  46. arXiv:2310.09474  [pdf, other]

    eess.SY

    Extremum seeking in the presence of large delays via time-delay approach to averaging

    Authors: Xuefei Yang, Emilia Fridman

    Abstract: In this paper, we study gradient-based classical extremum seeking (ES) for uncertain n-dimensional (nD) static quadratic maps in the presence of known large constant distinct input delays and large output constant delay with a small time-varying uncertainty. This uncertainty may appear due to network-based measurements. We present a quantitative analysis via a time-delay approach to averaging. We…

    Submitted 13 October, 2023; originally announced October 2023.

  47. arXiv:2310.08705  [pdf, other]

    cs.CV eess.IV

    A Benchmarking Protocol for SAR Colorization: From Regression to Deep Learning Approaches

    Authors: Kangqing Shen, Gemine Vivone, Xiaoyuan Yang, Simone Lolli, Michael Schmitt

    Abstract: Synthetic aperture radar (SAR) images are widely used in remote sensing. Interpreting SAR images can be challenging due to their intrinsic speckle noise and grayscale nature. To address this issue, SAR colorization has emerged as a research direction to colorize gray scale SAR images while preserving the original spatial information and radiometric information. However, this research field is stil…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 16 pages, 16 figures, 6 tables

  48. arXiv:2310.07730  [pdf, other]

    cs.CV eess.IV

    Domain-Controlled Prompt Learning

    Authors: Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang

    Abstract: Large pre-trained vision-language models, such as CLIP, have shown remarkable generalization capabilities across various tasks when appropriate text prompts are provided. However, adapting these models to specific domains, like remote sensing images (RSIs), medical images, etc, remains unexplored and challenging. Existing prompt learning methods often lack domain-awareness or domain-transfer mecha…

    Submitted 12 December, 2023; v1 submitted 29 September, 2023; originally announced October 2023.

  49. arXiv:2310.06382  [pdf, other]

    cs.IT eess.SP

    Mutual Information Metrics for Uplink MIMO-OFDM Integrated Sensing and Communication System

    Authors: Jinghui Piao, Zhiqing Wei, Xin Yuan, Xiaoyu Yang, Huici Wu, Zhiyong Feng

    Abstract: As the uplink sensing has the advantage of easy implementation, it attracts great attention in integrated sensing and communication (ISAC) system. This paper presents an uplink ISAC system based on multi-input multi-output orthogonal frequency division multiplexing (MIMO-OFDM) technology. The mutual information (MI) is introduced as a unified metric to evaluate the performance of communication and…

    Submitted 10 October, 2023; originally announced October 2023.

  50. arXiv:2309.16813  [pdf, other]

    cs.NI eess.SP

    Wi-Fi 8: Embracing the Millimeter-Wave Era

    Authors: Xiaoqian Liu, Tingwei Chen, Yuhan Dong, Zhi Mao, Ming Gan, Xun Yang, Jianmin Lu

    Abstract: With the increasing demands in communication, Wi-Fi technology is advancing towards its next generation. Building on the foundation of Wi-Fi 7, millimeter-wave technology is anticipated to converge with Wi-Fi 8 in the near future. In this paper, we look into the millimeter-wave technology and other potential feasible features, providing a comprehensive perspective on the future of Wi-Fi 8. Our sim…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 7 pages, 4 figures