Showing 1–50 of 511 results for author: Chen, H

  1. arXiv:2406.18201  [pdf, other]

    eess.IV cs.CV

    EFCNet: Every Feature Counts for Small Medical Object Segmentation

    Authors: Lingjie Kong, Qiaoling Wei, Chengming Xu, Han Chen, Yanwei Fu

    Abstract: This paper explores the segmentation of very small medical objects with significant clinical value. While Convolutional Neural Networks (CNNs), particularly UNet-like models, and recent Transformers have shown substantial progress in image segmentation, our empirical findings reveal their poor performance in segmenting the small medical objects and lesions concerned in this paper. This limitation…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.18102  [pdf]

    eess.IV cs.CV

    A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation

    Authors: Muwei Jian, Hongyu Chen, Zaiyong Zhang, Nan Yang, Haorang Zhang, Lifu Ma, Wenjing Xu, Huixiang Zhi

    Abstract: Recently, Computer-Aided Diagnosis (CAD) systems have emerged as indispensable tools in clinical diagnostic workflows, significantly alleviating the burden on radiologists. Nevertheless, despite their integration into clinical settings, CAD systems encounter limitations. Specifically, while CAD systems can achieve high performance in the detection of lung nodules, they face challenges in accuratel…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.18088  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    LLM-Driven Multimodal Opinion Expression Identification

    Authors: Bonian Jia, Huiyao Chen, Yueheng Sun, Meishan Zhang, Min Zhang

    Abstract: Opinion Expression Identification (OEI) is essential in NLP for applications ranging from voice assistants to depression diagnosis. This study extends OEI to encompass multimodal inputs, underlining the significance of auditory cues in delivering emotional subtleties beyond the capabilities of text. We introduce a novel multimodal OEI (MOEI) task, integrating text and speech to mirror real-world s…

    Submitted 29 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures, accepted by Interspeech 2024

  4. arXiv:2406.18018  [pdf, other]

    eess.IV

    A Cross Spatio-Temporal Pathology-based Lung Nodule Dataset

    Authors: Muwei Jian, Haoran Zhang, Mingju Shao, Hongyu Chen, Huihui Huang, Yanjie Zhong, Changlei Zhang, Bin Wang, Penghui Gao

    Abstract: Recently, intelligent analysis of lung nodules with the assistance of computer aided detection (CAD) techniques can improve the accuracy rate of lung cancer diagnosis. However, existing CAD systems and pulmonary datasets mainly focus on Computed Tomography (CT) images from one single period, while ignoring the cross spatio-temporal features associated with the progression of nodules contained in im…

    Submitted 25 June, 2024; originally announced June 2024.

  5. arXiv:2406.17950  [pdf, other]

    eess.SP

    V2X Sidelink Positioning in FR1: From Ray-Tracing and Channel Estimation to Bayesian Tracking

    Authors: Yu Ge, Maximilian Stark, Musa Furkan Keskin, Hui Chen, Guillaume Jornod, Thomas Hansen, Frank Hofmann, Henk Wymeersch

    Abstract: Sidelink positioning research predominantly focuses on the snapshot positioning problem, often within the mmWave band. Only a limited number of studies have delved into vehicle-to-anything (V2X) tracking within sub-6 GHz bands. In this paper, we investigate the V2X sidelink tracking challenges over sub-6 GHz frequencies. We propose a Kalman-filter-based tracking approach that leverages the estimat…

    Submitted 30 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  6. arXiv:2406.16942  [pdf, other]

    eess.IV cs.AI cs.CV

    Enhancing Diagnostic Reliability of Foundation Model with Uncertainty Estimation in OCT Images

    Authors: Yuanyuan Peng, Aidi Lin, Meng Wang, Tian Lin, Ke Zou, Yinglin Cheng, Tingkun Shi, Xulong Liao, Lixia Feng, Zhen Liang, Xinjian Chen, Huazhu Fu, Haoyu Chen

    Abstract: Inability to express the confidence level and detect unseen classes has limited the clinical implementation of artificial intelligence in the real-world. We developed a foundation model with uncertainty estimation (FMUE) to detect 11 retinal conditions on optical coherence tomography (OCT). In the internal test set, FMUE achieved a higher F1 score of 96.76% than two state-of-the-art algorithms, RE…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: All codes are available at https://github.com/yuanyuanpeng0129/FMUE

  7. arXiv:2406.16929  [pdf, other]

    eess.SP cs.AI

    Modelling the 5G Energy Consumption using Real-world Data: Energy Fingerprint is All You Need

    Authors: Tingwei Chen, Yantao Wang, Hanzhi Chen, Zijian Zhao, Xinhao Li, Nicola Piovesan, Guangxu Zhu, Qingjiang Shi

    Abstract: The introduction of fifth-generation (5G) radio technology has revolutionized communications, bringing unprecedented automation, capacity, connectivity, and ultra-fast, reliable communications. However, this technological leap comes with a substantial increase in energy consumption, presenting a significant challenge. To improve the energy efficiency of 5G networks, it is imperative to develop sop…

    Submitted 13 June, 2024; originally announced June 2024.

  8. arXiv:2406.11519  [pdf, other]

    cs.CV eess.IV

    HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

    Authors: Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

    Abstract: Foundation models (FMs) are revolutionizing the analysis and understanding of remote sensing (RS) scenes, including aerial RGB, multispectral, and SAR images. However, hyperspectral images (HSIs), which are rich in spectral information, have not seen much application of FMs, with existing methods often restricted to specific tasks and lacking generality. To fill this gap, we introduce HyperSIGMA,…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA

  9. arXiv:2406.10724  [pdf, other]

    eess.IV cs.CV cs.LG

    Beyond the Visible: Jointly Attending to Spectral and Spatial Dimensions with HSI-Diffusion for the FINCH Spacecraft

    Authors: Ian Vyse, Rishit Dagli, Dav Vrat Chadha, John P. Ma, Hector Chen, Isha Ruparelia, Prithvi Seran, Matthew Xie, Eesa Aamer, Aidan Armstrong, Naveen Black, Ben Borstein, Kevin Caldwell, Orrin Dahanaggamaarachchi, Joe Dai, Abeer Fatima, Stephanie Lu, Maxime Michet, Anoushka Paul, Carrie Ann Po, Shivesh Prakash, Noa Prosser, Riddhiman Roy, Mirai Shinjo, Iliya Shofman, et al. (4 additional authors not shown)

    Abstract: Satellite remote sensing missions have gained popularity over the past fifteen years due to their ability to cover large swaths of land at regular intervals, making them ideal for monitoring environmental trends. The FINCH mission, a 3U+ CubeSat equipped with a hyperspectral camera, aims to monitor crop residue cover in agricultural fields. Although hyperspectral imaging captures both spectral and…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: To appear in 38th Annual Small Satellite Conference

  10. arXiv:2406.09873  [pdf, other]

    eess.AS cs.AI cs.SD

    Perceiver-Prompt: Flexible Speaker Adaptation in Whisper for Chinese Disordered Speech Recognition

    Authors: Yicong Jiang, Tianzi Wang, Xurong Xie, Juan Liu, Wei Sun, Nan Yan, Hui Chen, Lan Wang, Xunying Liu, Feng Tian

    Abstract: Disordered speech recognition has profound implications for improving the quality of life for individuals afflicted with, for example, dysarthria. Dysarthric speech recognition encounters challenges including limited data, substantial dissimilarities between dysarthric and non-dysarthric speakers, and significant speaker variations stemming from the disorder. This paper introduces Perceiver-Prompt, a…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  11. arXiv:2406.09696  [pdf, other]

    eess.IV cs.CV

    MoME: Mixture of Multimodal Experts for Cancer Survival Prediction

    Authors: Conghao Xiong, Hao Chen, Hao Zheng, Dong Wei, Yefeng Zheng, Joseph J. Y. Sung, Irwin King

    Abstract: Survival analysis, as a challenging task, requires integrating Whole Slide Images (WSIs) and genomic data for comprehensive decision-making. There are two main challenges in this task: significant heterogeneity and complex inter- and intra-modal interactions between the two modalities. Previous approaches utilize co-attention methods, which fuse features from both modalities only once after separa…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 8 + 1/2 pages, early accepted to MICCAI2024

  12. arXiv:2406.09317  [pdf, other]

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, Jinming Guo, Xiaolin Chen, Jingcheng Wang, Yih Chung Tham, et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. To RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  13. arXiv:2406.07399  [pdf, other]

    cs.LG eess.SP

    Redefining Automotive Radar Imaging: A Domain-Informed 1D Deep Learning Approach for High-Resolution and Efficient Performance

    Authors: Ruxin Zheng, Shunqiao Sun, Holger Caesar, Honglei Chen, Jian Li

    Abstract: Millimeter-wave (mmWave) radars are indispensable for perception tasks of autonomous vehicles, thanks to their resilience in challenging weather conditions. Yet, their deployment is often limited by insufficient spatial resolution for precise semantic scene interpretation. Classical super-resolution techniques adapted from optical imaging inadequately address the distinct characteristics of radar…

    Submitted 11 June, 2024; originally announced June 2024.

  14. arXiv:2406.02640  [pdf, other]

    eess.IV physics.med-ph physics.optics

    Ghost imaging-based Non-contact Heart Rate Detection

    Authors: Jianming Yu, Yuchen He, Bin Li, Hui Chen, Huaibin Zheng, Jianbin Liu, Zhuo Xu

    Abstract: Remote heart rate measurement is an increasingly concerned research field, usually using remote photoplethysmography (rPPG) to collect heart rate information through video data collection. However, in certain specific scenarios (such as low light conditions, intense lighting, and non-line-of-sight situations), traditional imaging methods fail to capture image information effectively, that may lead…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 4 pages, 6 figures

  15. arXiv:2406.00582  [pdf]

    eess.SP

    Joint Detection and Classification of Communication and Radar Signals in Congested RF Environments Using YOLOv8

    Authors: Xiwen Kang, Hua-mei Chen, Genshe Chen, Kuo-Chu Chang, Thomas M. Clemons

    Abstract: In this paper, we present a comprehensive study on the application of YOLOv8, a state-of-the-art computer vision (CV) model, to the challenging problem of joint detection and classification of signals in a highly dynamic and congested RF environment. Using our synthetic RF datasets, we explored three different scenarios with congested communication and radar signals. In the first study, we applied…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE MILCOM 2024

  16. arXiv:2405.18167  [pdf, other]

    eess.IV cs.CV

    Confidence-aware multi-modality learning for eye disease screening

    Authors: Ke Zou, Tian Lin, Zongbo Han, Meng Wang, Xuedong Yuan, Haoyu Chen, Changqing Zhang, Xiaojing Shen, Huazhu Fu

    Abstract: Multi-modal ophthalmic image classification plays a key role in diagnosing eye diseases, as it integrates information from different sources to complement their respective performances. However, recent improvements have mainly focused on accuracy, often neglecting the importance of confidence and robustness in predictions for diverse modalities. In this study, we propose a novel multi-modality evi…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 27 pages, 7 figures, 9 tables

  17. arXiv:2405.18092  [pdf]

    cs.AI cs.ET cs.MA cs.RO eess.SY

    LLM experiments with simulation: Large Language Model Multi-Agent System for Process Simulation Parametrization in Digital Twins

    Authors: Yuchen Xia, Daniel Dittler, Nasser Jazdi, Haonan Chen, Michael Weyrich

    Abstract: This paper presents a novel design of a multi-agent system framework that applies a large language model (LLM) to automate the parametrization of process simulations in digital twins. We propose a multi-agent framework that includes four types of agents: observation, reasoning, decision and summarization. By enabling dynamic interaction between LLM agents and simulation model, the developed system…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE-ETFA2024, under peer-review

  18. arXiv:2405.17836  [pdf, other]

    eess.SP cs.LG stat.ML

    An Innovative Networks in Federated Learning

    Authors: Zavareh Bozorgasl, Hao Chen

    Abstract: This paper presents the development and application of Wavelet Kolmogorov-Arnold Networks (Wav-KAN) in federated learning. We implemented Wav-KAN \cite{wav-kan} in the clients. Indeed, we have considered both continuous wavelet transform (CWT) and also discrete wavelet transform (DWT) to enable multiresolution capability which helps in heterogeneous data distribution across clients. Extensive exp…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Work in progress

  19. arXiv:2405.13365  [pdf, other]

    cs.LG cs.MA eess.SP

    Clipped Uniform Quantizers for Communication-Efficient Federated Learning

    Authors: Zavareh Bozorgasl, Hao Chen

    Abstract: This paper introduces an approach to employ clipped uniform quantization in federated learning settings, aiming to enhance model efficiency by reducing communication overhead without compromising accuracy. By employing optimal clipping thresholds and adaptive quantization schemes, our method significantly curtails the bit requirements for model weight transmissions between clients and the server.…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Work in progress

  20. arXiv:2405.12832  [pdf, other]

    cs.LG cs.AI eess.SP stat.ML

    Wav-KAN: Wavelet Kolmogorov-Arnold Networks

    Authors: Zavareh Bozorgasl, Hao Chen

    Abstract: In this paper, we introduce Wav-KAN, an innovative neural network architecture that leverages the Wavelet Kolmogorov-Arnold Networks (Wav-KAN) framework to enhance interpretability and performance. Traditional multilayer perceptrons (MLPs) and even recent advancements like Spl-KAN face challenges related to interpretability, training speed, robustness, computational efficiency, and performance. Wa…

    Submitted 27 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: Work in progress; codes are available at https://github.com/zavareh1/Wav-KAN

  21. arXiv:2405.10157  [pdf]

    eess.SY

    Incorporating ESO into Deep Koopman Operator Modelling for Control of Autonomous Vehicles

    Authors: Hao Chen, Chen Lv

    Abstract: Koopman operator theory is a kind of data-driven modelling approach that accurately captures the nonlinearities of mechatronic systems such as vehicles against physics-based methods. However, the infinite-dimensional Koopman operator is impossible to implement in real-world applications. To approximate the infinite-dimensional Koopman operator through collection dataset rather than manual trial an…

    Submitted 16 May, 2024; originally announced May 2024.

  22. arXiv:2405.10145  [pdf]

    eess.SY

    Deep Koopman Operator-Informed Safety Command Governor for Autonomous Vehicles

    Authors: Hao Chen, Xiangkun He, Shuo Cheng, Chen Lv

    Abstract: Modeling of nonlinear behaviors with physical-based models poses challenges. However, Koopman operator maps the original nonlinear system into an infinite-dimensional linear space to achieve global linearization of the nonlinear system through input and output data, which derives an absolute equivalent linear representation of the original state space. Due to the impossibility of implementing the…

    Submitted 16 May, 2024; originally announced May 2024.

  23. arXiv:2405.09472  [pdf, other]

    eess.IV cs.CV

    Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment

    Authors: Xinying Lin, Xuyang Liu, Hong Yang, Xiaohai He, Honggang Chen

    Abstract: With the advent of image super-resolution (SR) algorithms, how to evaluate the quality of generated SR images has become an urgent task. Although full-reference methods perform well in SR image quality assessment (SR-IQA), their reliance on high-resolution (HR) images limits their practical applicability. Leveraging available reconstruction information as much as possible for SR-IQA, such as low-r…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  24. arXiv:2405.09079  [pdf, other]

    eess.SP cs.IT

    Integrated Monostatic Sensing and Full-Duplex Multiuser Communication for mmWave Systems

    Authors: Murat Bayraktar, Nuria González-Prelcic, Mikko Valkama, Hao Chen, Charlie Jianzhong Zhang

    Abstract: In this paper, we propose a hybrid precoding/combining framework for communication-centric integrated sensing and full-duplex (FD) communication operating at mmWave bands. The designed precoders and combiners enable multiuser (MU) FD communication while simultaneously supporting monostatic sensing in a frequency-selective setting. The joint design of precoders and combiners involves the mitigation…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 13 pages, 7 figures

  25. arXiv:2405.05579  [pdf]

    cs.HC eess.SY

    Intelligent EC Rearview Mirror: Enhancing Driver Safety with Dynamic Glare Mitigation via Cloud Edge Collaboration

    Authors: Junyi Yang, Zefei Xu, Huayi Lai, Hongjian Chen, Sifan Kong, Yutong Wu, Huan Yang

    Abstract: Sudden glare from trailing vehicles significantly increases driving safety risks. Existing anti-glare technologies such as electronic, manually-adjusted, and electrochromic rearview mirrors, are expensive and lack effective adaptability in different lighting conditions. To address these issues, our research introduces an intelligent rearview mirror system utilizing novel all-liquid electrochromic…

    Submitted 9 May, 2024; originally announced May 2024.

  26. arXiv:2405.05239  [pdf, other]

    eess.SY cs.LG

    Cellular Traffic Prediction Using Online Prediction Algorithms

    Authors: Hossein Mehri, Hao Chen, Hani Mehrpouyan

    Abstract: The advent of 5G technology promises a paradigm shift in the realm of telecommunications, offering unprecedented speeds and connectivity. However, the efficient management of traffic in 5G networks remains a critical challenge. It is due to the dynamic and heterogeneous nature of network traffic, varying user behaviors, extended network size, and diverse applications, all of which demand highly ac…

    Submitted 8 May, 2024; originally announced May 2024.

  27. arXiv:2405.05235  [pdf, other]

    eess.SY cs.LG

    RACH Traffic Prediction in Massive Machine Type Communications

    Authors: Hossein Mehri, Hao Chen, Hani Mehrpouyan

    Abstract: Traffic pattern prediction has emerged as a promising approach for efficiently managing and mitigating the impacts of event-driven bursty traffic in massive machine-type communication (mMTC) networks. However, achieving accurate predictions of bursty traffic remains a non-trivial task due to the inherent randomness of events, and these challenges intensify within live network environments. Consequ…

    Submitted 8 May, 2024; originally announced May 2024.

    Report number: TMLCN-10-23-0189

  28. arXiv:2405.02788  [pdf, other]

    eess.SP

    Antenna Failure Resilience: Deep Learning-Enabled Robust DOA Estimation with Single Snapshot Sparse Arrays

    Authors: Ruxin Zheng, Shunqiao Sun, Hongshan Liu, Honglei Chen, Mojtaba Soltanalian, Jian Li

    Abstract: Recent advancements in Deep Learning (DL) for Direction of Arrival (DOA) estimation have highlighted its superiority over traditional methods, offering faster inference, enhanced super-resolution, and robust performance in low Signal-to-Noise Ratio (SNR) environments. Despite these advancements, existing research predominantly focuses on multi-snapshot scenarios, a limitation in the context of aut…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: Invited paper for IEEE Asilomar conference 2024

  29. arXiv:2405.00391  [pdf, ps, other]

    cs.IT eess.SP

    Beamforming Inferring by Conditional WGAN-GP for Holographic Antenna Arrays

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Ahmed Alhammadi, Hui Chen, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: The beamforming technology with large holographic antenna arrays is one of the key enablers for the next generation of wireless systems, which can significantly improve the spectral efficiency. However, the deployment of large antenna arrays implies high algorithm complexity and resource overhead at both receiver and transmitter ends. To address this issue, advanced technologies such as artificial…

    Submitted 15 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  30. arXiv:2404.17089  [pdf, other]

    eess.SP

    Auto-Calibration and 2D-DOA Estimation in UCAs via an Integrated Wideband Dictionary

    Authors: Zavareh Bozorgasl, Hao Chen, Mohammad J. Dehghani

    Abstract: In this paper, we present a novel auto-calibration scheme for the joint estimation of the two-dimensional (2-D) direction-of-arrival (DOA) and the mutual coupling matrix (MCM) for a signal measured using uniform circular arrays. The method employs an integrated wideband dictionary to mitigate the detrimental effects of the discretization of the continuous parameter space over the considered azimut…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: This is a completed version of a work which will be sent to 2024 Asilomar Conference on Signals, Systems, and Computers

  31. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  32. arXiv:2404.14879  [pdf, other]

    eess.SP

    Device-Free 3D Drone Localization in RIS-Assisted mmWave MIMO Networks

    Authors: Jiguang He, Charles Vanwynsberghe, Hui Chen, Chongwen Huang, Aymen Fakhreddine

    Abstract: In this paper, we investigate the potential of reconfigurable intelligent surfaces (RISs) in facilitating passive/device-free three-dimensional (3D) drone localization within existing cellular infrastructure operating at millimeter-wave (mmWave) frequencies and employing multiple antennas at the transceivers. The developed localization system operates in the bi-static mode without requiring direct…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 6 pages, 5 figures, submitted to IEEE GLOBECOM 2024

  33. arXiv:2404.10021  [pdf]

    eess.SY

    Monitoring Based Fatigue Damage Prognosis of Wind Turbine Composite Blades under Uncertain Wind Loads

    Authors: Chizhi Zhang, Hua-Peng Chen

    Abstract: Lifecycle assessment of wind turbines is essential to improve their design and to optimize maintenance plans for preventing failures during the design life. A critical element of wind turbines is the composite blade due to uncertain cyclic wind loads with relatively high frequency and amplitude in offshore environments. It is critical to detect the wind fatigue damage evolution in composite blades…

    Submitted 14 April, 2024; originally announced April 2024.

  34. arXiv:2404.09436  [pdf]

    physics.med-ph eess.IV

    Image Reconstruction with B0 Inhomogeneity using an Interpretable Deep Unrolled Network on an Open-bore MRI-Linac

    Authors: Shanshan Shan, Yang Gao, David E. J. Waddington, Hongli Chen, Brendan Whelan, Paul Z. Y. Liu, Yaohui Wang, Chunyi Liu, Hongping Gan, Mingyuan Gao, Feng Liu

    Abstract: MRI-Linac systems require fast image reconstruction with high geometric fidelity to localize and track tumours for radiotherapy treatments. However, B0 field inhomogeneity distortions and slow MR acquisition potentially limit the quality of the image guidance and tumour treatments. In this study, we develop an interpretable unrolled network, referred to as RebinNet, to reconstruct distortion-free…

    Submitted 14 April, 2024; originally announced April 2024.

  35. Change Guiding Network: Incorporating Change Prior to Guide Change Detection in Remote Sensing Imagery

    Authors: Chengxi Han, Chen Wu, Haonan Guo, Meiqi Hu, Jiepan Li, Hongruixuan Chen

    Abstract: The rapid advancement of automated artificial intelligence algorithms and remote sensing instruments has benefited change detection (CD) tasks. However, there is still a lot of space to study for precise detection, especially the edge integrity and internal holes phenomenon of change features. In order to solve these problems, we design the Change Guiding Network (CGNet), to tackle the insufficien…

    Submitted 14 April, 2024; originally announced April 2024.

  36. arXiv:2404.07092  [pdf, other]

    eess.SP physics.optics

    Net 835-Gb/s/λ Carrier- and LO-Free 100-km Transmission Using Channel-Aware Phase Retrieval Reception

    Authors: Hanzi Huang, Haoshuo Chen, Qian Hu, Di Che, Yetian Huang, Brian Stern, Nicolas K. Fontaine, Mikael Mazur, Lauren Dallachiesa, Roland Ryf, Zhengxuan Li, Yingxiong Song

    Abstract: We experimentally demonstrate the first carrier- and LO-free 800G/λ receiver enabling direct compatibility with standard coherent transmitters via phase retrieval, achieving net 835-Gb/s transmission over 100-km SMF and record 8.27-b/s/Hz net optical spectral efficiency.

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 3 pages, 3 figures

  37. arXiv:2404.04947  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD eess.SP

    Gull: A Generative Multifunctional Audio Codec

    Authors: Yi Luo, Jianwei Yu, Hangting Chen, Rongzhi Gu, Chao Weng

    Abstract: We introduce Gull, a generative multifunctional audio codec. Gull is a general purpose neural audio compression and decompression model which can be applied to a wide range of tasks and applications such as real-time communication, audio super-resolution, and codec language models. The key components of Gull include (1) universal-sample-rate modeling via subband modeling schemes motivated by recen…

    Submitted 7 June, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

    Comments: Demo page: https://yluo42.github.io/Gull/

  38. arXiv:2404.03425  [pdf, other]

    eess.IV cs.AI cs.CV

    ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model

    Authors: Hongruixuan Chen, Jian Song, Chengxi Han, Junshi Xia, Naoto Yokoya

    Abstract: Convolutional neural networks (CNN) and Transformers have made impressive progress in the field of remote sensing change detection (CD). However, both architectures have inherent shortcomings: CNN are constrained by a limited receptive field that may hinder their ability to capture broader spatial contexts, while Transformers are computationally intensive, making them costly to train and deploy on…

    Submitted 26 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE TGRS

  39. arXiv:2404.02394  [pdf, other]

    eess.IV cs.CV

    Cohort-Individual Cooperative Learning for Multimodal Cancer Survival Analysis

    Authors: Huajun Zhou, Fengtao Zhou, Hao Chen

    Abstract: Recently, we have witnessed impressive achievements in cancer survival analysis by integrating multimodal data, e.g., pathology images and genomic profiles. However, the heterogeneity and high dimensionality of these modalities pose significant challenges for extracting discriminative representations while maintaining good generalization. In this paper, we propose a Cohort-individual Cooperative L…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 10 pages, 9 figures

  40. arXiv:2404.01192  [pdf, other]

    eess.IV cs.CV

    iMD4GC: Incomplete Multimodal Data Integration to Advance Precise Treatment Response Prediction and Survival Analysis for Gastric Cancer

    Authors: Fengtao Zhou, Yingxue Xu, Yanfen Cui, Shenyan Zhang, Yun Zhu, Weiyang He, Jiguang Wang, Xin Wang, Ronald Chan, Louis Ho Shing Lau, Chu Han, Dafu Zhang, Zhenhui Li, Hao Chen

    Abstract: Gastric cancer (GC) is a prevalent malignancy worldwide, ranking as the fifth most common cancer with over 1 million new cases and 700 thousand deaths in 2020. Locally advanced gastric cancer (LAGC) accounts for approximately two-thirds of GC diagnoses, and neoadjuvant chemotherapy (NACT) has emerged as the standard treatment for LAGC. However, the effectiveness of NACT varies significantly among…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 27 pages, 9 figures, 3 tables (under review)

  41. arXiv:2404.00989  [pdf, other]

    cs.CV cs.AI cs.MM cs.SD eess.AS

    360+x: A Panoptic Multi-modal Scene Understanding Dataset

    Authors: Hao Chen, Yuqi Hou, Chenyuan Qu, Irene Testini, Xiaohan Hong, Jianbo Jiao

    Abstract: Human perception of the world is shaped by a multitude of viewpoints and modalities. While many existing datasets focus on scene understanding from a certain perspective (e.g. egocentric or third-person views), our dataset offers a panoptic perspective (i.e. multiple viewpoints with multiple data modalities). Specifically, we encapsulate third-person panoramic and front views, as well as egocentri…

    Submitted 7 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 (Oral Presentation), Project page: https://x360dataset.github.io/

    Journal ref: The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) 2024

  42. arXiv:2403.19785  [pdf, other]

    cs.IT eess.SP

    Integrated Communication, Localization, and Sensing in 6G D-MIMO Networks

    Authors: Hao Guo, Henk Wymeersch, Behrooz Makki, Hui Chen, Yibo Wu, Giuseppe Durisi, Musa Furkan Keskin, Mohammad H. Moghaddam, Charitha Madapatha, Han Yu, Peter Hammarberg, Hyowon Kim, Tommy Svensson

    Abstract: Future generations of mobile networks call for concurrent sensing and communication functionalities in the same hardware and/or spectrum. Compared to communication, sensing services often suffer from limited coverage, due to the high path loss of the reflected signal and the increased infrastructure requirements. To provide a more uniform quality of service, distributed multiple input multiple out…

    Submitted 28 March, 2024; originally announced March 2024.

  43. arXiv:2403.17770  [pdf, other]

    eess.IV cs.CV

    CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node Segmentation

    Authors: Yongrui Yu, Hanyu Chen, Zitian Zhang, Qiong Xiao, Wenhui Lei, Linrui Dai, Yu Fu, Hui Tan, Guan Wang, Peng Gao, Xiaofan Zhang

    Abstract: Despite the significant success achieved by deep learning methods in medical image segmentation, researchers still struggle in the computer-aided diagnosis of abdominal lymph nodes due to the complex abdominal environment, small and indistinguishable lesions, and limited annotated data. To address these problems, we present a pipeline that integrates the conditional diffusion model for lymph node…

    Submitted 26 March, 2024; originally announced March 2024.

  44. arXiv:2403.16361  [pdf, other]

    eess.IV cs.CV

    RSTAR: Rotational Streak Artifact Reduction in 4D CBCT using Separable and Circular Convolutions

    Authors: Ziheng Deng, Hua Chen, Haibo Hu, Zhiyong Xu, Tianling Lyu, Yan Xi, Yang Chen, Jun Zhao

    Abstract: Four-dimensional cone-beam computed tomography (4D CBCT) provides respiration-resolved images and can be used for image-guided radiation therapy. However, the ability to reveal respiratory motion comes at the cost of image artifacts. As raw projection data are sorted into multiple respiratory phases, there is a limited number of cone-beam projections available for image reconstruction. Consequentl…

    Submitted 24 March, 2024; originally announced March 2024.

  45. arXiv:2403.15739  [pdf, other]

    eess.SP

    Towards Channel-Resilient CSI-Based RF Fingerprinting using Deep Learning

    Authors: Ruiqi Kong, He Chen

    Abstract: This work introduces DeepCRF, a deep learning framework designed for channel state information-based radio frequency fingerprinting (CSI-RFF). The considered CSI-RFF is built on micro-CSI, a recently discovered radio-frequency (RF) fingerprint that manifests as micro-signals appearing on the channel state information (CSI) curves of commercial WiFi devices. Micro-CSI facilitates CSI-RFF which is m…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: 6 pages, 5 figures, INFOCOM WKSHPS 2024

  46. arXiv:2403.14935  [pdf, ps, other]

    math.OC eess.SY

    Data-Driven Predictive Control with Adaptive Disturbance Attenuation for Constrained Systems

    Authors: Nan Li, Ilya Kolmanovsky, Hong Chen

    Abstract: In this paper, we propose a novel data-driven predictive control approach for systems subject to time-domain constraints. The approach combines the strengths of H-infinity control for rejecting disturbances and MPC for handling constraints. In particular, the approach can dynamically adapt H-infinity disturbance attenuation performance depending on measured system state and forecasted disturbance…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 11 pages, 2 figures

  47. arXiv:2403.13225  [pdf, other]

    eess.IV

    Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation

    Authors: Linshan Wu, Zhun Zhong, Jiayi Ma, Yunchao Wei, Hao Chen, Leyuan Fang, Shutao Li

    Abstract: Weakly-Supervised Semantic Segmentation (WSSS) aims to train segmentation models by weak labels, which is receiving significant attention due to its low annotation cost. Existing approaches focus on generating pseudo labels for supervision while largely ignoring to leverage the inherent semantic correlation among different pseudo labels. We observe that pseudo-labeled pixels that are close to each…

    Submitted 19 March, 2024; originally announced March 2024.

  48. arXiv:2403.11953  [pdf, other]

    eess.IV cs.CV

    Advancing COVID-19 Detection in 3D CT Scans

    Authors: Qingqiu Li, Runtian Yuan, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen

    Abstract: To make a more accurate diagnosis of COVID-19, we propose a straightforward yet effective model. Firstly, we analyse the characteristics of 3D CT scans and remove the non-lung parts, facilitating the model to focus on lesion-related areas and reducing computational cost. We use ResNeSt50 as the strong feature extractor, initializing it with pretrained weights which have COVID-19-specific prior kno…

    Submitted 18 March, 2024; originally announced March 2024.

  49. arXiv:2403.11498  [pdf, other]

    eess.IV cs.CV

    Domain Adaptation Using Pseudo Labels for COVID-19 Detection

    Authors: Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen

    Abstract: In response to the need for rapid and accurate COVID-19 diagnosis during the global pandemic, we present a two-stage framework that leverages pseudo labels for domain adaptation to enhance the detection of COVID-19 from CT scans. By utilizing annotated data from one domain and non-annotated data from another, the model overcomes the challenge of data scarcity and variability, common in emergent he…

    Submitted 18 March, 2024; originally announced March 2024.

  50. arXiv:2403.10537  [pdf, ps, other]

    cs.NI eess.SP

    Semantic Extraction Model Selection for IoT Devices in Edge-assisted Semantic Communications

    Authors: Hong Chen, Fang Fang, Xianbin Wang

    Abstract: Semantic communications offer the potential to alleviate communication loads by exchanging meaningful information. However, semantic extraction (SE) is computationally intensive, posing challenges for resource-constrained Internet of Things (IoT) devices. To address this, leveraging computing resources at the edge servers (ESs) is essential. ESs can support multiple SE models for various tasks, ma…

    Submitted 26 February, 2024; originally announced March 2024.

    Comments: Submitted to IEEE Communications Letters