
Showing 1–50 of 153 results for author: Gao, X

Searching in archive eess.
  1. arXiv:2406.09822  [pdf, other]

    cs.IT cs.CV cs.LG eess.IV eess.SP

    An I2I Inpainting Approach for Efficient Channel Knowledge Map Construction

    Authors: Zhenzhou Jin, Li You, Jue Wang, Xiang-Gen Xia, Xiqi Gao

    Abstract: Channel knowledge map (CKM) has received widespread attention as an emerging enabling technology for environment-aware wireless communications. It involves the construction of databases containing location-specific channel knowledge, which are then leveraged to facilitate channel state information (CSI) acquisition and transceiver design. In this context, a fundamental challenge lies in efficientl… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 15 pages, 11 figures

  2. arXiv:2405.14113  [pdf, other]

    eess.IV cs.CV

    Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation

    Authors: Zhusi Zhong, Jie Li, John Sollee, Scott Collins, Harrison Bai, Paul Zhang, Terrence Healey, Michael Atalay, Xinbo Gao, Zhicheng Jiao

    Abstract: In response to the worldwide COVID-19 pandemic, advanced automated technologies have emerged as valuable tools to aid healthcare professionals in managing an increased workload by improving radiology report generation and prognostic analysis. This study proposes Multi-modality Regional Alignment Network (MRANet), an explainable model for radiology report generation and survival prediction that foc… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  3. arXiv:2405.08423  [pdf, other]

    eess.IV cs.CV

    NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

    Authors: Yihong Chen, Zhen Fan, Shuai Dong, Zhiwei Chen, Wenjie Li, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao, Jun-Ming Liu

    Abstract: Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images as typically captured by a dual-camera device. To enhance the quality of SR images, most previous studies focused on increasing the number and size of feature maps and introducing complex and computationally intensive structures, resulting in models with high co… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

  4. arXiv:2404.07425  [pdf, ps, other]

    eess.SP cs.IT

    Precoder Design for User-Centric Network Massive MIMO with Matrix Manifold Optimization

    Authors: Rui Sun, Li You, An-An Lu, Chen Sun, Xiqi Gao, Xiang-Gen Xia

    Abstract: In this paper, we investigate the precoder design for user-centric network (UCN) massive multiple-input multiple-output (mMIMO) downlink with matrix manifold optimization. In UCN mMIMO systems, each user terminal (UT) is served by a subset of base stations (BSs) instead of all the BSs, facilitating the implementation of the system and lowering the dimension of the precoders to be designed. By prov… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures, journal

  5. arXiv:2404.00861  [pdf, other]

    eess.AS eess.IV

    Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training

    Authors: Ruijie Tao, Xinyuan Qian, Rohan Kumar Das, Xiaoxue Gao, Jiadong Wang, Haizhou Li

    Abstract: Audio-visual active speaker detection (AV-ASD) aims to identify which visible face is speaking in a scene with one or more persons. Most existing AV-ASD methods prioritize capturing speech-lip correspondence. However, there is a noticeable gap in addressing the challenges from real-world AV-ASD scenarios. Due to the presence of low-quality noisy videos in such cases, AV-ASD systems without a selec… ▽ More

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 10 pages

  6. Integrated Communications and Localization for Massive MIMO LEO Satellite Systems

    Authors: Li You, Xiaoyu Qiang, Yongxiang Zhu, Fan Jiang, Christos G. Tsinos, Wenjin Wang, Henk Wymeersch, Xiqi Gao, Björn Ottersten

    Abstract: Integrated communications and localization (ICAL) will play an important part in future sixth generation (6G) networks for the realization of Internet of Everything (IoE) to support both global communications and seamless localization. Massive multiple-input multiple-output (MIMO) low earth orbit (LEO) satellite systems have great potential in providing wide coverage with enhanced gains, and thus… ▽ More

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 14 pages, 7 figures, to appear in IEEE Transactions on Wireless Communications

  7. arXiv:2402.19111  [pdf, other]

    eess.IV cs.CV

    Deep Network for Image Compressed Sensing Coding Using Local Structural Sampling

    Authors: Wenxue Cui, Xingtao Wang, Xiaopeng Fan, Shaohui Liu, Xinwei Gao, Debin Zhao

    Abstract: Existing image compressed sensing (CS) coding frameworks usually solve an inverse problem based on measurement coding and optimization-based image reconstruction, which still exist the following two challenges: 1) The widely used random sampling matrix, such as the Gaussian Random Matrix (GRM), usually leads to low measurement coding efficiency. 2) The optimization-based reconstruction methods gen… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by ACM Transactions on Multimedia Computing Communications and Applications (TOMM)

  8. arXiv:2402.17718  [pdf]

    cs.LG eess.SP

    Towards a Digital Twin Framework in Additive Manufacturing: Machine Learning and Bayesian Optimization for Time Series Process Optimization

    Authors: Vispi Karkaria, Anthony Goeckner, Rujing Zha, Jie Chen, Jianjing Zhang, Qi Zhu, Jian Cao, Robert X. Gao, Wei Chen

    Abstract: Laser-directed-energy deposition (DED) offers advantages in additive manufacturing (AM) for creating intricate geometries and material grading. Yet, challenges like material inconsistency and part variability remain, mainly due to its layer-wise fabrication. A key issue is heat accumulation during DED, which affects the material microstructure and properties. While closed-loop control methods for… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 12 Pages, 10 Figures, 1 Table, NAMRC Conference

  9. arXiv:2402.15897  [pdf, other]

    eess.SP

    MMW-Carry: Enhancing Carry Object Detection through Millimeter-Wave Radar-Camera Fusion

    Authors: Xiangyu Gao, Youchen Luo, Ali Alansari, Yaping Sun

    Abstract: This paper introduces MMW-Carry, a system designed to predict the probability of individuals carrying various objects using millimeter-wave radar signals, complemented by camera input. The primary goal of MMW-Carry is to provide a rapid and cost-effective preliminary screening solution, specifically tailored for non-super-sensitive scenarios. Overall, MMW-Carry achieves significant advancements in… ▽ More

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: 10 pages

  10. arXiv:2402.15725  [pdf, other]

    eess.AS

    Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks

    Authors: Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

    Abstract: Human language can be expressed in either written or spoken form, i.e. text or speech. Humans can acquire knowledge from text to improve speaking and listening. However, the quest for speech pre-trained models to leverage unpaired text has just started. In this paper, we investigate a new way to pre-train such a joint speech-text model to learn enhanced speech representations and benefit various s… ▽ More

    Submitted 28 February, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: 5 pages, 1 figure, 5 tables, submitted to IEEE Signal Processing Letters (SPL)

  11. arXiv:2402.14401  [pdf, other]

    cs.CV cs.LG eess.IV

    Diffusion Model Based Visual Compensation Guidance and Visual Difference Analysis for No-Reference Image Quality Assessment

    Authors: Zhaoyang Wang, Bo Hu, Mingyang Zhang, Jie Li, Leida Li, Maoguo Gong, Xinbo Gao

    Abstract: Existing free-energy guided No-Reference Image Quality Assessment (NR-IQA) methods still suffer from finding a balance between learning feature information at the pixel level of the image and capturing high-level feature information and the efficient utilization of the obtained high-level feature information remains a challenge. As a novel class of state-of-the-art (SOTA) generative model, the dif… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.

  12. arXiv:2402.06841  [pdf]

    eess.IV cs.CV

    Point cloud-based registration and image fusion between cardiac SPECT MPI and CTA

    Authors: Shaojie Tang, Penpen Miao, Xingyu Gao, Yu Zhong, Dantong Zhu, Haixing Wen, Zhihui Xu, Qiuyue Wei, Hong** Yao, Xin Huang, Rui Gao, Chen Zhao, Weihua Zhou

    Abstract: A method was proposed for the point cloud-based registration and image fusion between cardiac single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) and cardiac computed tomography angiograms (CTA). Firstly, the left ventricle (LV) epicardial regions (LVERs) in SPECT and CTA images were segmented by using different U-Net neural networks trained to generate the point c… ▽ More

    Submitted 9 February, 2024; originally announced February 2024.

  13. arXiv:2401.10242  [pdf, other]

    cs.OH cs.GR cs.HC cs.SD eess.AS

    DanceMeld: Unraveling Dance Phrases with Hierarchical Latent Codes for Music-to-Dance Synthesis

    Authors: Xin Gao, Li Hu, Peng Zhang, Bang Zhang, Liefeng Bo

    Abstract: In the realm of 3D digital human applications, music-to-dance presents a challenging task. Given the one-to-many relationship between music and dance, previous methods have been limited in their approach, relying solely on matching and generating corresponding dance movements based on music rhythm. In the professional field of choreography, a dance phrase consists of several dance poses and dance… ▽ More

    Submitted 30 November, 2023; originally announced January 2024.

    Comments: 10 pages, 8 figures

  14. arXiv:2401.08921  [pdf, other]

    cs.IT eess.SP eess.SY

    Electromagnetic Information Theory: Fundamentals and Applications for 6G Wireless Communication Systems

    Authors: Cheng-Xiang Wang, Yue Yang, Jie Huang, Xiqi Gao, Tie Jun Cui, Lajos Hanzo

    Abstract: In wireless communications, electromagnetic theory and information theory constitute a pair of fundamental theories, bridged by antenna theory and wireless propagation channel modeling theory. Up to the fifth generation (5G) wireless communication networks, these four theories have been developing relatively independently. However, in sixth generation (6G) space-air-ground-sea wireless communicati… ▽ More

    Submitted 16 January, 2024; originally announced January 2024.

  15. Fluid Antenna-Assisted MIMO Transmission Exploiting Statistical CSI

    Authors: Yuqi Ye, Li You, Jue Wang, Hao Xu, Kai-Kit Wong, Xiqi Gao

    Abstract: In conventional multiple-input multiple-output (MIMO) communication systems, the positions of antennas are fixed. To take full advantage of spatial degrees of freedom, a new technology called fluid antenna (FA) is proposed to obtain higher achievable rate and diversity gain. Most existing works on FA exploit instantaneous channel state information (CSI). However, in FA-assisted systems, it is diff… ▽ More

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: to appear in IEEE Communications Letters

    Journal ref: IEEE Communications Letters, vol. 28, no. 1, pp. 223-227, Jan. 2024

  16. arXiv:2311.12592  [pdf, other]

    cs.HC cs.AI eess.SY

    Visual tracking brain computer interface

    Authors: Changxing Huang, Nanlin Shi, Yining Miao, Xiaogang Chen, Yijun Wang, Xiaorong Gao

    Abstract: Brain-computer interfaces (BCIs) offer a way to interact with computers without relying on physical movements. Non-invasive electroencephalography (EEG)-based visual BCIs, known for efficient speed and calibration ease, face limitations in continuous tasks due to discrete stimulus design and decoding methods. To achieve continuous control, we implemented a novel spatial encoding stimulus paradigm… ▽ More

    Submitted 21 November, 2023; originally announced November 2023.

  17. arXiv:2311.11596  [pdf]

    cs.HC cs.IT eess.SP q-bio.NC

    High-performance cVEP-BCI under minimal calibration

    Authors: Yining Miao, Nanlin Shi, Changxing Huang, Yonghao Song, Xiaogang Chen, Yijun Wang, Xiaorong Gao

    Abstract: The ultimate goal of brain-computer interfaces (BCIs) based on visual modulation paradigms is to achieve high-speed performance without the burden of extensive calibration. Code-modulated visual evoked potential-based BCIs (cVEP-BCIs) modulated by broadband white noise (WN) offer various advantages, including increased communication speed, expanded encoding target capabilities, and enhanced coding… ▽ More

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 35 pages, 5 figures

  18. arXiv:2311.05273  [pdf, other]

    eess.SP

    Few-Shot Recognition and Classification Framework for Jamming Signal: A CGAN-Based Fusion CNN Approach

    Authors: Xuhui Ding, Yue Zhang, Gaoyang Li, Xiaozheng Gao, Neng Ye, Dusit Niyato, Kai Yang

    Abstract: Subject to intricate environmental variables, the precise classification of jamming signals holds paramount significance in the effective implementation of anti-jamming strategies within communication systems. In light of this imperative, we propose an innovative fusion algorithm based on conditional generative adversarial network (CGAN) and convolutional neural network (CNN), which aims to deal w… ▽ More

    Submitted 26 June, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Required to supplement the experiments in Section VII, enhance the notations in Table I, and make necessary adjustments to Equation 17 to ensure accuracy and completeness

  19. arXiv:2310.10997  [pdf]

    eess.SY

    Cooperative Dispatch of Microgrids Community Using Risk-Sensitive Reinforcement Learning with Monotonously Improved Performance

    Authors: Ziqing Zhu, Xiang Gao, Siqi Bu, Ka Wing Chan, Bin Zhou, Shiwei Xia

    Abstract: The integration of individual microgrids (MGs) into Microgrid Clusters (MGCs) significantly improves the reliability and flexibility of energy supply, through resource sharing and ensuring backup during outages. The dispatch of MGCs is the key challenge to be tackled to ensure their secure and economic operation. Currently, there is a lack of optimization method that can achieve a trade-off among… ▽ More

    Submitted 17 October, 2023; originally announced October 2023.

  20. arXiv:2310.03771  [pdf, other]

    eess.SP

    A Comprehensive Indoor Environment Dataset from Single-family Houses in the US

    Authors: Sheik Murad Hassan Anik, Xinghua Gao, Na Meng

    Abstract: The paper describes a dataset comprising indoor environmental factors such as temperature, humidity, air quality, and noise levels. The data was collected from 10 sensing devices installed in various locations within three single-family houses in Virginia, USA. The objective of the data collection was to study the indoor environmental conditions of the houses over time. The data were collected at… ▽ More

    Submitted 4 October, 2023; originally announced October 2023.

  21. arXiv:2309.11107  [pdf, other]

    cs.RO eess.SY

    Indoor Exploration and Simultaneous Trolley Collection Through Task-Oriented Environment Partitioning

    Authors: Junjie Gao, Peijia Xie, Xuheng Gao, Zhirui Sun, Jiankun Wang, Max Q.-H. Meng

    Abstract: In this paper, we present a simultaneous exploration and object search framework for the application of autonomous trolley collection. For environment representation, a task-oriented environment partitioning algorithm is presented to extract diverse information for each sub-task. First, LiDAR data is classified as potential objects, walls, and obstacles after outlier removal. Segmented point cloud… ▽ More

    Submitted 20 September, 2023; originally announced September 2023.

  22. arXiv:2309.07460  [pdf, other]

    cs.IT eess.SP

    A Tutorial on Environment-Aware Communications via Channel Knowledge Map for 6G

    Authors: Yong Zeng, Junting Chen, Jie Xu, Di Wu, Xiaoli Xu, Shi Jin, Xiqi Gao, David Gesbert, Shuguang Cui, Rui Zhang

    Abstract: Sixth-generation (6G) mobile communication networks are expected to have dense infrastructures, large antenna size, wide bandwidth, cost-effective hardware, diversified positioning methods, and enhanced intelligence. Such trends bring both new challenges and opportunities for the practical design of 6G. On one hand, acquiring channel state information (CSI) in real time for all wireless links beco… ▽ More

    Submitted 6 February, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  23. arXiv:2308.13234  [pdf, other]

    cs.HC cs.AI eess.SP q-bio.NC

    Decoding Natural Images from EEG for Object Recognition

    Authors: Yonghao Song, Bingchuan Liu, Xiang Li, Nanlin Shi, Yijun Wang, Xiaorong Gao

    Abstract: Electroencephalography (EEG) signals, known for convenient non-invasive acquisition but low signal-to-noise ratio, have recently gained substantial attention due to the potential to decode natural images. This paper presents a self-supervised framework to demonstrate the feasibility of learning image representations from EEG signals, particularly for object recognition. The framework utilizes imag… ▽ More

    Submitted 4 April, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: ICLR, 2024

  24. arXiv:2308.13232  [pdf, other]

    cs.HC cs.IT eess.SP q-bio.NC

    Estimating and approaching maximum information rate of noninvasive visual brain-computer interface

    Authors: Nanlin Shi, Yining Miao, Changxing Huang, Xiang Li, Yonghao Song, Xiaogang Chen, Yijun Wang, Xiaorong Gao

    Abstract: The mission of visual brain-computer interfaces (BCIs) is to enhance information transfer rate (ITR) to reach high speed towards real-life communication. Despite notable progress, noninvasive visual BCIs have encountered a plateau in ITRs, leaving it uncertain whether higher ITRs are achievable. In this study, we investigate the information rate limits of the primary visual channel to explore whet… ▽ More

    Submitted 25 August, 2023; originally announced August 2023.

  25. arXiv:2307.13429  [pdf, ps, other]

    cs.IT eess.SP

    Multi-Objective Optimisation of URLLC-Based Metaverse Services

    Authors: Xinyu Gao, Wenqiang Yi, Yuanwei Liu, Lajos Hanzo

    Abstract: Metaverse aims for building a fully immersive virtual shared space, where the users are able to engage in various activities. To successfully deploy the service for each user, the Metaverse service provider and network service provider generally localise the user first and then support the communication between the base station (BS) and the user. A reconfigurable intelligent surface (RIS) is capab… ▽ More

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Accepted by IEEE Transactions on Communications

  26. arXiv:2307.10837  [pdf, other]

    cs.IT eess.SP

    Sensing User's Activity, Channel, and Location with Near-Field Extra-Large-Scale MIMO

    Authors: Li Qiao, Anwen Liao, Zhuoran Li, Hua Wang, Zhen Gao, Xiang Gao, Yu Su, Pei Xiao, Li You, Derrick Wing Kwan Ng

    Abstract: This paper proposes a grant-free massive access scheme based on the millimeter wave (mmWave) extra-large-scale multiple-input multiple-output (XL-MIMO) to support massive Internet-of-Things (IoT) devices with low latency, high data rate, and high localization accuracy in the upcoming sixth-generation (6G) networks. The XL-MIMO consists of multiple antenna subarrays that are widely spaced over the… ▽ More

    Submitted 16 October, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: To appear in IEEE Transactions on Communications. Codes will be open to all on https://gaozhen16.github.io/ soon

  27. arXiv:2307.09237  [pdf, other]

    eess.SY

    A Quick Guide for the Iterated Extended Kalman Filter on Manifolds

    Authors: Jianzhu Huai, Xiang Gao

    Abstract: The extended Kalman filter (EKF) is a common state estimation method for discrete nonlinear systems. It recursively executes the propagation step as time goes by and the update step when a set of measurements arrives. In the update step, the EKF linearizes the measurement function only once. In contrast, the iterated EKF (IEKF) refines the state in the update step by iteratively solving a least sq… ▽ More

    Submitted 4 October, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: 2 pages excluding references

  28. arXiv:2307.01444  [pdf, other]

    eess.SP

    Static Background Removal in Vehicular Radar: Filtering in Azimuth-Elevation-Doppler Domain

    Authors: Xiangyu Gao, Sumit Roy, Lyutianyang Zhang

    Abstract: Anti-collision assistance (as part of the current push towards increasing vehicular autonomy) critically depends on accurate detection/localization of moving targets in vicinity. An effective solution pathway involves removing background or static objects from the scene, so as to enhance the detection/localization of moving targets as a key component for improving overall system performance. In th… ▽ More

    Submitted 29 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 13 pages

  29. arXiv:2306.14646  [pdf, other]

    eess.IV cs.CV

    Multi-View Attention Learning for Residual Disease Prediction of Ovarian Cancer

    Authors: Xiangneng Gao, Shulan Ruan, Jun Shi, Guoqing Hu, Wei Wei

    Abstract: In the treatment of ovarian cancer, precise residual disease prediction is significant for clinical and surgical decision-making. However, traditional methods are either invasive (e.g., laparoscopy) or time-consuming (e.g., manual analysis). Recently, deep learning methods make many efforts in automatic analysis of medical images. Despite the remarkable progress, most of them underestimated the im… ▽ More

    Submitted 26 June, 2023; originally announced June 2023.

  30. arXiv:2306.00499  [pdf, other]

    eess.IV cs.CV

    DeSAM: Decoupling Segment Anything Model for Generalizable Medical Image Segmentation

    Authors: Yifan Gao, Wei Xia, Dingdu Hu, Xin Gao

    Abstract: Deep learning based automatic medical image segmentation models often suffer from domain shift, where the models trained on a source domain do not generalize well to other unseen domains. As a vision foundation model with powerful generalization capabilities, Segment Anything Model (SAM) shows potential for improving the cross-domain robustness of medical image segmentation. However, SAM and its f… ▽ More

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 12 pages. The code is available at https://github.com/yifangao112/DeSAM

  31. arXiv:2305.13839  [pdf, other]

    cs.CV eess.IV

    SAR-to-Optical Image Translation via Thermodynamics-inspired Network

    Authors: Mingjin Zhang, Jiamin Xu, Chengyu He, Wenteng Shang, Yunsong Li, Xinbo Gao

    Abstract: Synthetic aperture radar (SAR) is prevalent in the remote sensing field but is difficult to interpret in human visual perception. Recently, SAR-to-optical (S2O) image conversion methods have provided a prospective solution for interpretation. However, since there is a huge domain difference between optical and SAR images, they suffer from low image quality and geometric distortion in the produced… ▽ More

    Submitted 23 May, 2023; originally announced May 2023.

  32. arXiv:2304.11697  [pdf, other]

    cs.CV eess.IV

    Informative Data Selection with Uncertainty for Multi-modal Object Detection

    Authors: Xinyu Zhang, Zhiwei Li, Zhenhong Zou, Xin Gao, Yijin Xiong, Dafeng Jin, Jun Li, Huaping Liu

    Abstract: Noise has always been nonnegligible trouble in object detection by creating confusion in model reasoning, thereby reducing the informativeness of the data. It can lead to inaccurate recognition due to the shift in the observed pattern, that requires a robust generalization of the models. To implement a general vision model, we need to develop deep learning models that can adaptively select valid i… ▽ More

    Submitted 23 April, 2023; originally announced April 2023.

  33. SAWU-Net: Spatial Attention Weighted Unmixing Network for Hyperspectral Images

    Authors: Lin Qi, Xuewen Qin, Feng Gao, Junyu Dong, Xinbo Gao

    Abstract: Hyperspectral unmixing is a critical yet challenging task in hyperspectral image interpretation. Recently, great efforts have been made to solve the hyperspectral unmixing task via deep autoencoders. However, existing networks mainly focus on extracting spectral features from mixed pixels, and the employment of spatial feature prior knowledge is still insufficient. To this end, we put forward a sp… ▽ More

    Submitted 22 April, 2023; originally announced April 2023.

    Comments: IEEE GRSL 2023

  34. arXiv:2304.10691  [pdf, other]

    eess.IV cs.CV cs.LG

    SkinGPT-4: An Interactive Dermatology Diagnostic System with Visual Large Language Model

    Authors: Juexiao Zhou, Xiaonan He, Liyuan Sun, Jiannan Xu, Xiuying Chen, Yuetan Chu, Longxi Zhou, Xingyu Liao, Bin Zhang, Xin Gao

    Abstract: Skin and subcutaneous diseases rank high among the leading contributors to the global burden of nonfatal diseases, impacting a considerable portion of the population. Nonetheless, the field of dermatology diagnosis faces three significant hurdles. Firstly, there is a shortage of dermatologists accessible to diagnose patients, particularly in rural regions. Secondly, accurately interpreting skin di… ▽ More

    Submitted 8 June, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

  35. arXiv:2304.03708  [pdf, other]

    eess.IV cs.CV

    Efficient automatic segmentation for multi-level pulmonary arteries: The PARSE challenge

    Authors: Gongning Luo, Kuanquan Wang, Jun Liu, Shuo Li, Xinjie Liang, Xiangyu Li, Shaowei Gan, Wei Wang, Suyu Dong, Wenyi Wang, Pengxin Yu, Enyou Liu, Hongrong Wei, Na Wang, Jia Guo, Huiqi Li, Zhao Zhang, Ziwei Zhao, Na Gao, Nan An, Ashkan Pakzad, Bojidar Rangelov, Jiaqi Dou, Song Tian, Zeyu Liu, et al. (5 additional authors not shown)

    Abstract: Efficient automatic segmentation of multi-level (i.e. main and branch) pulmonary arteries (PA) in CTPA images plays a significant role in clinical applications. However, most existing methods concentrate only on main PA or branch PA segmentation separately and ignore segmentation efficiency. Besides, there is no public large-scale dataset focused on PA segmentation, which makes it highly challengi… ▽ More

    Submitted 7 April, 2023; originally announced April 2023.

  36. Precoder Design for Massive MIMO Downlink with Matrix Manifold Optimization

    Authors: Rui Sun, Chen Wang, An-An Lu, Xiqi Gao, Xiang-Gen Xia

    Abstract: We investigate the weighted sum-rate (WSR) maximization linear precoder design for massive multiple-input multiple-output (MIMO) downlink. We consider a single-cell system with multiple users and propose a unified matrix manifold optimization framework applicable to total power constraint (TPC), per-user power constraint (PUPC) and per-antenna power constraint (PAPC). We prove that the precoders u… ▽ More

    Submitted 10 April, 2024; v1 submitted 31 March, 2023; originally announced April 2023.

    Comments: 16 pages, 11 figures, journal

    Journal ref: IEEE Transactions on Signal Processing, vol. 72, pp. 1065-1080, 2024

  37. arXiv:2302.14536  [pdf, other]

    eess.SP

    On the Road to 6G: Visions, Requirements, Key Technologies and Testbeds

    Authors: Cheng-Xiang Wang, Xiaohu You, Xiqi Gao, Xiuming Zhu, Zixin Li, Chuan Zhang, Haiming Wang, Yongming Huang, Yunfei Chen, Harald Haas, John S. Thompson, Erik G. Larsson, Marco Di Renzo, Wen Tong, Peiying Zhu, Xuemin Shen, H. Vincent Poor, Lajos Hanzo

    Abstract: Fifth generation (5G) mobile communication systems have entered the stage of commercial development, providing users with new services and improved user experiences as well as offering a host of novel opportunities to various industries. However, 5G still faces many challenges. To address these challenges, international industrial, academic, and standards organizations have commenced research on s… ▽ More

    Submitted 28 February, 2023; originally announced February 2023.

  38. Personalized and privacy-preserving federated heterogeneous medical image analysis with PPPML-HMI

    Authors: Juexiao Zhou, Longxi Zhou, Di Wang, Xiaopeng Xu, Haoyang Li, Yuetan Chu, Wenkai Han, Xin Gao

    Abstract: Heterogeneous data is endemic due to the use of diverse models and settings of devices by hospitals in the field of medical imaging. However, there are few open-source frameworks for federated heterogeneous medical image analysis with personalization and privacy protection simultaneously without the demand to modify the existing model structures or to share any private data. In this paper, we prop… ▽ More

    Submitted 20 February, 2023; originally announced February 2023.

  39. arXiv:2302.11385  [pdf, other]

    cs.IT eess.SP

    Reconfigurable Massive MIMO: Harnessing the Power of the Electromagnetic Domain for Enhanced Information Transfer

    Authors: Keke Ying, Zhen Gao, Sheng Chen, Xinyu Gao, Michail Matthaiou, Rui Zhang, Robert Schober

    Abstract: The capacity of commercial massive multiple-input multiple-output (mMIMO) systems is constrained by the limited array aperture at the base station, and cannot meet the ever-increasing traffic demands of wireless networks. Given the array aperture, holographic MIMO with infinitesimal antenna spacing can maximize the capacity, but is physically unrealizable. As a promising alternative, reconfigurabl… ▽ More

    Submitted 22 February, 2023; originally announced February 2023.

    Comments: 7 pages, 3 figures. This paper is accepted by IEEE Wireless Communications Magazine. Copyright may be transferred without notice, after which this version may no longer be accessible

  40. arXiv:2302.06381  [pdf]

    eess.IV cs.CV

    Self-supervised phase unwrapping in fringe projection profilometry

    Authors: Xiaomin Gao, Wanzhong Song, Chunqian Tan, Junzhe Lei

    Abstract: Fast-speed and high-accuracy three-dimensional (3D) shape measurement has been the goal all along in fringe projection profilometry (FPP). The dual-frequency temporal phase unwrapping method (DF-TPU) is one of the prominent technologies to achieve this goal. However, the period number of the high-frequency pattern of existing DF-TPU approaches is usually limited by the inevitable phase errors, set… ▽ More

    Submitted 30 May, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

  41. arXiv:2302.01931  [pdf, other]

    eess.IV physics.geo-ph

    Characterization and Generation of 3D Realistic Geological Particles with Metaball Descriptor based on X-Ray Computed Tomography

    Authors: Yifeng Zhao, Xiangbo Gao, Pei Zhang, Liang Lei, S. A. Galindo-Torres, Stan Z. Li

    Abstract: The morphology of geological particles is crucial in determining its granular characteristics and assembly responses. In this paper, Metaball-function based solutions are proposed for morphological characterization and generation of three-dimensional realistic particles according to the X-ray Computed Tomography (XRCT) images. For characterization, we develop a geometric-based Metaball-Imaging alg… ▽ More

    Submitted 5 February, 2023; originally announced February 2023.

  42. Rate-Splitting Multiple Access for Uplink Massive MIMO With Electromagnetic Exposure Constraints

    Authors: Hanyu Jiang, Li You, Ahmed Elzanaty, Jue Wang, Wenjin Wang, Xiqi Gao, Mohamed-Slim Alouini

    Abstract: Over the past few years, the prevalence of wireless devices has become one of the essential sources of electromagnetic (EM) radiation to the public. Facing with the swift development of wireless communications, people are skeptical about the risks of long-term exposure to EM radiation. As EM exposure is required to be restricted at user terminals, it is inefficient to blindly decrease the transmit…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: to appear in IEEE Journal on Selected Areas in Communications

    Journal ref: IEEE Journal on Selected Areas in Communications, vol. 41, no. 5, pp. 1383-1397, May 2023

  43. arXiv:2211.10152  [pdf, other]

    eess.AS cs.SD

    Self-Transcriber: Few-shot Lyrics Transcription with Self-training

    Authors: Xiaoxue Gao, Xianghu Yue, Haizhou Li

    Abstract: The current lyrics transcription approaches heavily rely on supervised learning with labeled data, but such data are scarce and manual labeling of singing is expensive. How to benefit from unlabeled data and alleviate limited data problem have not been explored for lyrics transcription. We propose the first semi-supervised lyrics transcription paradigm, Self-Transcriber, by leveraging on unlabeled…

    Submitted 2 March, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

    Comments: Accepted by ICASSP 2023

  44. arXiv:2211.06641  [pdf, other]

    eess.IV cs.CV cs.LG

    Prediction of Geometric Transformation on Cardiac MRI via Convolutional Neural Network

    Authors: Xin Gao

    Abstract: In the field of medical image, deep convolutional neural networks(ConvNets) have achieved great success in the classification, segmentation, and registration tasks thanks to their unparalleled capacity to learn image features. However, these tasks often require large amounts of manually annotated data and are labor-intensive. Therefore, it is of significant importance for us to study unsupervised…

    Submitted 12 November, 2022; originally announced November 2022.

    Comments: 8 pages, 6 figures

  45. arXiv:2210.16755  [pdf, other]

    cs.CL cs.SD eess.AS

    token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text

    Authors: Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

    Abstract: Self-supervised pre-training has been successful in both text and speech processing. Speech and text offer different but complementary information. The question is whether we are able to perform a speech-text joint pre-training on unpaired speech and text. In this paper, we take the idea of self-supervised pre-training one step further and propose token2vec, a novel joint pre-training framework fo…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  46. arXiv:2208.06222  [pdf, other]

    cs.CV eess.IV

    Scale-free and Task-agnostic Attack: Generating Photo-realistic Adversarial Patterns with Patch Quilting Generator

    Authors: Xiangbo Gao, Cheng Luo, Qinliang Lin, Weicheng Xie, Minmin Liu, Linlin Shen, Keerthy Kusumam, Siyang Song

    Abstract: Traditional L_p norm-restricted image attack algorithms suffer from poor transferability to black box scenarios and poor robustness to defense algorithms. Recent CNN generator-based attack approaches can synthesize unrestricted and semantically meaningful entities to the image, which is shown to be transferable and robust. However, such methods attack images by either synthesizing local…

    Submitted 19 November, 2022; v1 submitted 12 August, 2022; originally announced August 2022.

  47. Channel Estimation for LEO Satellite Massive MIMO OFDM Communications

    Authors: Ke-Xin Li, Xiqi Gao, Xiang-Gen Xia

    Abstract: In this paper, we investigate the massive multiple-input multiple-output orthogonal frequency division multiplexing channel estimation for low-earth-orbit satellite communication systems. First, we use the angle-delay domain channel to characterize the space-frequency domain channel. Then, we show that the asymptotic minimum mean square error (MMSE) of the channel estimation can be minimized if th…

    Submitted 12 March, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: accepted by IEEE Transactions on Wireless Communications

  48. arXiv:2207.07336  [pdf, other]

    eess.AS cs.SD eess.SP

    PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music

    Authors: Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

    Abstract: Lyrics transcription of polyphonic music is challenging as the background music affects lyrics intelligibility. Typically, lyrics transcription can be performed by a two-step pipeline, i.e. a singing vocal extraction front end, followed by a lyrics transcriber back end, where the front end and back end are trained separately. Such a two-step pipeline suffers from both imperfect vocal extraction an…

    Submitted 5 May, 2023; v1 submitted 15 July, 2022; originally announced July 2022.

    Comments: TALSP

  49. Massive MIMO Hybrid Precoding for LEO Satellite Communications With Twin-Resolution Phase Shifters and Nonlinear Power Amplifiers

    Authors: Li You, Xiaoyu Qiang, Ke-Xin Li, Christos G. Tsinos, Wenjin Wang, Xiqi Gao, Björn Ottersten

    Abstract: The massive multiple-input multiple-output (MIMO) transmission technology has recently attracted much attention in the non-geostationary, e.g., low earth orbit (LEO) satellite communication (SATCOM) systems since it can significantly improve the energy efficiency (EE) and spectral efficiency. In this work, we develop a hybrid analog/digital precoding technique in the massive MIMO LEO SATCOM downli…

    Submitted 8 June, 2022; originally announced June 2022.

    Comments: 14 pages, 8 figures, to appear in IEEE Transactions on Communications

    Journal ref: IEEE Transactions on Communications, vol. 70, no. 8, pp. 5543-5557, Aug. 2022

  50. arXiv:2206.02442  [pdf, other]

    eess.SP

    Pervasive wireless channel modeling theory and applications to 6G GBSMs for all frequency bands and all scenarios

    Authors: Cheng-Xiang Wang, Zhen Lv, Xiqi Gao, Xiaohu You, Yang Hao, Harald Haas

    Abstract: In this paper, a pervasive wireless channel modeling theory is first proposed, which uses a unified channel modeling method and a unified equation of channel impulse response (CIR), and can integrate important channel characteristics at different frequency bands and scenarios. Then, we apply the proposed theory to a three dimensional (3D) space-time-frequency (STF) non-stationary geometry-based st…

    Submitted 6 June, 2022; originally announced June 2022.