Skip to main content

Showing 1–50 of 596 results for author: Wang, C

Searching in archive eess. Search in all archives.
.
  1. arXiv:2406.19043  [pdf

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Ya**g Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, **g Qin, Xiahai Zhuang, Claudia Prieto , et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 2 tables

  2. arXiv:2406.16012  [pdf

    eess.IV cs.CV

    Wound Tissue Segmentation in Diabetic Foot Ulcer Images Using Deep Learning: A Pilot Study

    Authors: Mrinal Kanti Dhar, Chuanbo Wang, Yash Patel, Taiyu Zhang, Jeffrey Niezgoda, Sandeep Gopalakrishnan, Keke Chen, Zeyun Yu

    Abstract: Identifying individual tissues, so-called tissue segmentation, in diabetic foot ulcer (DFU) images is a challenging task and little work has been published, largely due to the limited availability of a clinical image dataset. To address this gap, we have created a DFUTissue dataset for the research community to evaluate wound tissue segmentation algorithms. The dataset contains 110 images with tis… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  3. arXiv:2406.14799  [pdf, other

    cs.RO eess.SY

    Capture Point Control in Thruster-Assisted Bipedal Locomotion

    Authors: Shreyansh Pitroda, Aditya Bondada, Kaushik Venkatesh Krishnamurthy, Adarsh Salagame, Chenghao Wang, Taoran Liu, Bibek Gupta, Eric Sihite, Reza Nemovi, Alireza Ramezani, Morteza Gharib

    Abstract: Despite major advancements in control design that are robust to unplanned disturbances, bipedal robots are still susceptible to falling over and struggle to negotiate rough terrains. By utilizing thrusters in our bipedal robot, we can perform additional posture manipulation and expand the modes of locomotion to enhance the robot's stability and ability to negotiate rough and difficult-to-navigate… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Submitted and to be presented at IEEE AIM 2024. arXiv admin note: substantial text overlap with arXiv:2103.15952

  4. arXiv:2406.14794  [pdf, other

    eess.IV cs.CV cs.LG

    ImageFlowNet: Forecasting Multiscale Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images

    Authors: Chen Liu, Ke Xu, Liangbo L. Shen, Guillaume Huguet, Zilong Wang, Alexander Tong, Danilo Bzdok, Jay Stewart, Jay C. Wang, Lucian V. Del Priore, Smita Krishnaswamy

    Abstract: The forecasting of disease progression from images is a holy grail for clinical decision making. However, this task is complicated by the inherent high dimensionality, temporal sparsity and sampling irregularity in longitudinal image acquisitions. Existing methods often rely on extracting hand-crafted features and performing time-series analysis in this vector space, leading to a loss of rich spat… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  5. arXiv:2406.13977  [pdf, other

    eess.IV cs.CV

    Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning

    Authors: Tingyi Lin, Pengju Lyu, Jie Zhang, Yuqing Wang, Cheng Wang, Jianjun Zhu

    Abstract: Non-contrast CT (NCCT) imaging may reduce image contrast and anatomical visibility, potentially increasing diagnostic uncertainty. In contrast, contrast-enhanced CT (CECT) facilitates the observation of regions of interest (ROI). Leading generative models, especially the conditional diffusion model, demonstrate remarkable capabilities in medical image modality transformation. Typical conditional d… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.13118  [pdf, other

    cs.RO eess.SY

    Thruster-Assisted Incline Walking

    Authors: Kaushik Venkatesh Krishnamurthy, Chenghao Wang, Shreyansh Pitroda, Adarsh Salagame, Eric Sihite, Reza Nemovi, Alireza Ramezani, Morteza Gharib

    Abstract: In this study, our aim is to evaluate the effectiveness of thruster-assisted steep slope walking for the Husky Carbon, a quadrupedal robot equipped with custom-designed actuators and plural electric ducted fans, through simulation prior to conducting experimental trials. Thruster-assisted steep slope walking draws inspiration from wing-assisted incline running (WAIR) observed in birds, and intrigu… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 7 pages, 7 figures, submitted to CDC 2024 conference. arXiv admin note: text overlap with arXiv:2405.06070

  7. arXiv:2406.12646  [pdf, other

    eess.IV cs.AI cs.CV

    An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation

    Authors: Qin Li, Yizhe Zhang, Yan Li, Jun Lyu, Meng Liu, Longyu Sun, Mengting Sun, Qirong Li, Wenyue Mao, Xinran Wu, Ya**g Zhang, Yinghua Chu, Shuo Wang, Chengyan Wang

    Abstract: The segmentation foundation model, e.g., Segment Anything Model (SAM), has attracted increasing interest in the medical image community. Early pioneering studies primarily concentrated on assessing and improving SAM's performance from the perspectives of overall accuracy and efficiency, yet little attention was given to the fairness considerations. This oversight raises questions about the potenti… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted to MICCAI-2024

  8. arXiv:2406.09931  [pdf, other

    eess.IV cs.CV cs.LG

    SCKansformer: Fine-Grained Classification of Bone Marrow Cells via Kansformer Backbone and Hierarchical Attention Mechanisms

    Authors: Yifei Chen, Zhu Zhu, Shenghao Zhu, Linwei Qiu, Binfeng Zou, Fan Jia, Yunpeng Zhu, Chenyan Zhang, Zhaojie Fang, Feiwei Qin, ** Fan, Changmiao Wang, Yu Gao, Gang Yu

    Abstract: The incidence and mortality rates of malignant tumors, such as acute leukemia, have risen significantly. Clinically, hospitals rely on cytological examination of peripheral blood and bone marrow smears to diagnose malignant tumors, with accurate blood cell counting being crucial. Existing automated methods face challenges such as low feature expression capability, poor interpretability, and redund… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 15 pages, 6 figures

  9. arXiv:2406.07437  [pdf, other

    cs.SD eess.AS

    Graph-based multi-Feature fusion method for speech emotion recognition

    Authors: Xueyu Liu, Jie Lin, Chao Wang

    Abstract: Exploring proper way to conduct multi-speech feature fusion for cross-corpus speech emotion recognition is crucial as different speech features could provide complementary cues reflecting human emotion status. While most previous approaches only extract a single speech feature for emotion recognition, existing fusion methods such as concatenation, parallel connection, and splicing ignore heterogen… ▽ More

    Submitted 13 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: 25 pages,4 figures

  10. arXiv:2406.07374  [pdf, other

    eess.SP

    Movable-Antenna Array Empowered ISAC Systems for Low-Altitude Economy

    Authors: Ziming Kuang, Wenchao Liu, Chunjie Wang, Zhenzhen **, **ke Ren, Xuhui Zhang, Yanyan Shen

    Abstract: This paper investigates a movable-antenna (MA) array empowered integrated sensing and communications (ISAC) over low-altitude platform (LAP) system to support low-altitude economy (LAE) applications. In the considered system, an unmanned aerial vehicle (UAV) is dispatched to hover in the air, working as the UAV-enabled LAP (ULAP) to provide information transmission and sensing simultaneously for L… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  11. arXiv:2406.07329  [pdf, other

    cs.CV eess.IV

    Cinematic Gaussians: Real-Time HDR Radiance Fields with Depth of Field

    Authors: Chao Wang, Krzysztof Wolski, Bernhard Kerbl, Ana Serrano, Mojtaba Bemana, Hans-Peter Seidel, Karol Myszkowski, Thomas Leimkühler

    Abstract: Radiance field methods represent the state of the art in reconstructing complex scenes from multi-view photos. However, these reconstructions often suffer from one or both of the following limitations: First, they typically represent scenes in low dynamic range (LDR), which restricts their use to evenly lit environments and hinders immersive viewing experiences. Secondly, their reliance on a pinho… ▽ More

    Submitted 21 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  12. arXiv:2406.06086  [pdf, other

    cs.SD eess.AS

    RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection

    Authors: Yujie Chen, Jiangyan Yi, Jun Xue, Chenglong Wang, Xiaohui Zhang, Shunbo Dong, Siding Zeng, Jianhua Tao, Lv Zhao, Cunhang Fan

    Abstract: Fake artefacts for discriminating between bonafide and fake audio can exist in both short- and long-range segments. Therefore, combining local and global feature information can effectively discriminate between bonafide and fake audio. This paper proposes an end-to-end bidirectional state space model, named RawBMamba, to capture both short- and long-range discriminative information for audio deepf… ▽ More

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  13. arXiv:2406.03872  [pdf, other

    cs.CL cs.SD eess.AS

    BLSP-Emo: Towards Empathetic Large Speech-Language Models

    Authors: Chen Wang, Minpeng Liao, Zhongqiang Huang, Junhong Wu, Chengqing Zong, Jiajun Zhang

    Abstract: The recent release of GPT-4o showcased the potential of end-to-end multimodal models, not just in terms of low latency but also in their ability to understand and generate expressive speech with rich emotions. While the details are unknown to the open research community, it likely involves significant amounts of curated data and compute, neither of which is readily accessible. In this paper, we pr… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  14. arXiv:2406.03391  [pdf, other

    eess.SP

    Joint Association, Beamforming, and Resource Allocation for Multi-IRS Enabled MU-MISO Systems With RSMA

    Authors: Chunjie Wang, Xuhui Zhang, Huijun Xing, Liang Xue, Shuqiang Wang, Yanyan Shen, Bo Yang, ** Guan

    Abstract: Intelligent reflecting surface (IRS) and rate-splitting multiple access (RSMA) technologies are at the forefront of enhancing spectrum and energy efficiency in the next generation multi-antenna communication systems. This paper explores a RSMA system with multiple IRSs, and proposes two purpose-driven scheduling schemes, i.e., the exhaustive IRS-aided (EIA) and opportunistic IRS-aided (OIA) scheme… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  15. arXiv:2406.02918  [pdf, other

    eess.IV cs.CV

    U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation

    Authors: Chenxin Li, Xinyu Liu, Wuyang Li, Cheng Wang, Hengyu Liu, Yixuan Yuan

    Abstract: U-Net has become a cornerstone in various visual applications such as image segmentation and diffusion probability models. While numerous innovative designs and improvements have been introduced by incorporating transformers or MLPs, the networks are still limited to linearly modeling patterns as well as the deficient interpretability. To address these challenges, our intuition is inspired by the… ▽ More

    Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  16. arXiv:2406.02232  [pdf, other

    eess.SP

    Optimizing Air-borne Network-in-a-box Deployment for Efficient Remote Coverage

    Authors: Sidrah Javed, Yunfei Chen, Mohamed-Slim Alouini, Cheng-Xiang Wang

    Abstract: Among many envisaged drivers for the sixth generation, one is from the United Nations Sustainability Development Goals 2030 to eliminate digital inequality. Remote coverage in sparsely populated areas, difficult terrains, or emergency scenarios requires on-demand access and flexible deployment with minimal capex and opex. In this context, network-in-a-box (NIB) is an exciting solution that packs t… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  17. arXiv:2405.20357  [pdf

    eess.IV physics.app-ph physics.optics

    Encryption in ghost imaging with Kronecker products of random matrices

    Authors: Yi-Ning Zhao, Lin-Shan Chen, Lingxin Kong, Chong Wang, Cheng Ren, De-Zhong Cao

    Abstract: By forming measurement matrices with the Kronecker product of two random matrices, image encryption in computational ghost imaging is investigated. The two-dimensional images are conveniently reconstructed with the pseudo-inverse matrices of the two random matrices. To suppress the noise, the method of truncated singular value decomposition can be applied to either or both of the two pseudo-invers… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 5 pages, 4 figures

  18. arXiv:2405.19041  [pdf, other

    cs.CL cs.SD eess.AS

    BLSP-KD: Bootstrap** Language-Speech Pre-training via Knowledge Distillation

    Authors: Chen Wang, Minpeng Liao, Zhongqiang Huang, Jiajun Zhang

    Abstract: Recent end-to-end approaches have shown promise in extending large language models (LLMs) to speech inputs, but face limitations in directly assessing and optimizing alignment quality and fail to achieve fine-grained alignment due to speech-text length mismatch. We introduce BLSP-KD, a novel approach for Bootstrap** Language-Speech Pretraining via Knowledge Distillation, which addresses these li… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  19. arXiv:2405.17818  [pdf, other

    cs.CV eess.IV

    Hyperspectral and multispectral image fusion with arbitrary resolution through self-supervised representations

    Authors: Ting Wang, Zipei Yan, Jizhou Li, Xile Zhao, Chao Wang, Michael Ng

    Abstract: The fusion of a low-resolution hyperspectral image (LR-HSI) with a high-resolution multispectral image (HR-MSI) has emerged as an effective technique for achieving HSI super-resolution (SR). Previous studies have mainly concentrated on estimating the posterior distribution of the latent high-resolution hyperspectral image (HR-HSI), leveraging an appropriate image prior and likelihood computed from… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  20. arXiv:2405.17659  [pdf, other

    eess.IV cs.CV

    Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba

    Authors: Jiahao Huang, Liutao Yang, Fanwen Wang, Yang Nan, Weiwen Wu, Chengyan Wang, Kuangyu Shi, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang

    Abstract: Deep learning has been extensively applied in medical image reconstruction, where Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) represent the predominant paradigms, each possessing distinct advantages and inherent limitations: CNNs exhibit linear complexity with local sensitivity, whereas ViTs demonstrate quadratic complexity with global sensitivity. The emerging Mamba has sh… ▽ More

    Submitted 25 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  21. arXiv:2405.10570  [pdf

    eess.IV cs.AI

    Simultaneous Deep Learning of Myocardium Segmentation and T2 Quantification for Acute Myocardial Infarction MRI

    Authors: Yirong Zhou, Chengyan Wang, Mengtian Lu, Kunyuan Guo, Zi Wang, Dan Ruan, Rui Guo, Peijun Zhao, Jianhua Wang, Naiming Wu, Jianzhong Lin, Yinyin Chen, Hang **, Lianxin Xie, Lilan Wu, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Xiaobo Qu

    Abstract: In cardiac Magnetic Resonance Imaging (MRI) analysis, simultaneous myocardial segmentation and T2 quantification are crucial for assessing myocardial pathologies. Existing methods often address these tasks separately, limiting their synergistic potential. To address this, we propose SQNet, a dual-task network integrating Transformer and Convolutional Neural Network (CNN) components. SQNet features… ▽ More

    Submitted 29 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 10 pages, 8 figures, 6 tables

  22. arXiv:2405.10561  [pdf, other

    eess.IV cs.CV

    Infrared Image Super-Resolution via Lightweight Information Split Network

    Authors: Shijie Liu, Kang Yan, Feiwei Qin, Changmiao Wang, Ruiquan Ge, Kai Zhang, Jie Huang, Yong Peng, ** Cao

    Abstract: Single image super-resolution (SR) is an established pixel-level vision task aimed at reconstructing a high-resolution image from its degraded low-resolution counterpart. Despite the notable advancements achieved by leveraging deep neural networks for SR, most existing deep learning architectures feature an extensive number of layers, leading to high computational complexity and substantial memory… ▽ More

    Submitted 27 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  23. arXiv:2405.09995  [pdf, other

    eess.SP

    Semantic Communication via Rate Distortion Perception Bottleneck

    Authors: Zihe Zhao, Chunyue Wang

    Abstract: With the advancement of Artificial Intelligence (AI) technology, next-generation wireless communication network is facing unprecedented challenge. Semantic communication has become a novel solution to address such challenges, with enhancing the efficiency of bandwidth utilization by transmitting meaningful information and filtering out superfluous data. Unfortunately, recent studies have shown tha… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

  24. arXiv:2405.09787  [pdf, other

    eess.IV cs.CV cs.LG

    Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge

    Authors: Dominic LaBella, Ujjwal Baid, Omaditya Khanna, Shan McBurney-Lin, Ryan McLean, Pierre Nedelec, Arif Rashid, Nourel Hoda Tahon, Talissa Altes, Radhika Bhalerao, Yaseen Dhemesh, Devon Godfrey, Fathi Hilal, Scott Floyd, Anastasia Janas, Anahita Fathi Kazerooni, John Kirkpatrick, Collin Kent, Florian Kofler, Kevin Leu, Nazanin Maleki, Bjoern Menze, Maxence Pajot, Zachary J. Reitman, Jeffrey D. Rudie , et al. (96 additional authors not shown)

    Abstract: We describe the design and results from the BraTS 2023 Intracranial Meningioma Segmentation Challenge. The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas, which are typically benign extra-axial tumors with diverse radiologic and anatomical presentation and a propensity for multiplicity. Nine participating teams each developed deep-learning… ▽ More

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 16 pages, 11 tables, 10 figures, MICCAI

  25. arXiv:2405.09539  [pdf, ps, other

    eess.IV cs.CV cs.MM

    MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer

    Authors: Chengyu Wu, Chengkai Wang, Yaqi Wang, Huiyu Zhou, Yatao Zhang, Qifeng Wang, Shuai Wang

    Abstract: Esophageal cancer is one of the most common types of cancer worldwide and ranks sixth in cancer-related mortality. Accurate computer-assisted diagnosis of cancer progression can help physicians effectively customize personalized treatment plans. Currently, CT-based cancer diagnosis methods have received much attention for their comprehensive ability to examine patients' conditions. However, multi-… ▽ More

    Submitted 16 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: Early accepted to MICCAI 2024 (6/6/5)

  26. arXiv:2405.07260  [pdf

    cs.LG cs.AI eess.SP

    A Supervised Information Enhanced Multi-Granularity Contrastive Learning Framework for EEG Based Emotion Recognition

    Authors: Xiang Li, Jian Song, Zhigang Zhao, Chunxiao Wang, Dawei Song, Bin Hu

    Abstract: This study introduces a novel Supervised Info-enhanced Contrastive Learning framework for EEG based Emotion Recognition (SICLEER). SI-CLEER employs multi-granularity contrastive learning to create robust EEG contextual representations, potentiallyn improving emotion recognition effectiveness. Unlike existing methods solely guided by classification loss, we propose a joint learning model combining… ▽ More

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 5 pages, 3 figures, 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  27. arXiv:2405.06070  [pdf, other

    cs.RO eess.SY

    Narrow-Path, Dynamic Walking Using Integrated Posture Manipulation and Thrust Vectoring

    Authors: Kaushik Venkatesh Krishnamurthy, Chenghao Wang, Shreyansh Pitroda, Adarsh Salagame, Eric Sihite, Reza Nemovi, Alireza Ramezani, Morteza Gharib

    Abstract: This research concentrates on enhancing the navigational capabilities of Northeastern Universitys Husky, a multi-modal quadrupedal robot, that can integrate posture manipulation and thrust vectoring, to traverse through narrow pathways such as walking over pipes and slacklining. The Husky is outfitted with thrusters designed to stabilize its body during dynamic walking over these narrow paths. The… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.12586

  28. arXiv:2405.05715  [pdf, other

    eess.SP

    Shifting the ISAC Trade-Off with Fluid Antenna Systems

    Authors: Jiaqi Zou, Hao Xu, Chao Wang, Lvxin Xu, Songlin Sun, Kaitao Meng, Christos Masouros, Kai-Kit Wong

    Abstract: As an emerging antenna technology, a fluid antenna system (FAS) enhances spatial diversity to improve both sensing and communication performance by shifting the active antennas among available ports. In this letter, we study the potential of shifting the integrated sensing and communication (ISAC) trade-off with FAS. We propose the model for FAS-enabled ISAC and jointly optimize the transmit beamf… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 5 pages, 5 figures

  29. arXiv:2405.03729  [pdf

    eess.IV physics.optics quant-ph

    Computational ghost imaging with hybrid transforms by integrating Hadamard, discrete cosine, and Haar matrices

    Authors: Yi-Ning Zhao, Lin-Shan Chen, Liu-Ya Chen, Lingxin Kong, Chong Wang, Cheng Ren, Su-Heng Zhang, De-Zhong Cao

    Abstract: A scenario of ghost imaging with hybrid transform approach is proposed by integrating Hadamard, discrete cosine, and Haar matrices. The measurement matrix is formed by the Kronecker product of the two different transform matrices. The image information can be conveniently reconstructed by the corresponding inverse matrices. In experiment, six hybridization sets are performed in computational ghost… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 5 pages, 4 figures

  30. arXiv:2405.02180  [pdf, other

    cs.LG eess.SY

    A Flow-Based Model for Conditional and Probabilistic Electricity Consumption Profile Generation and Prediction

    Authors: Weijie Xia, Chenguang Wang, Peter Palensky, Pedro P. Vergara

    Abstract: Residential Load Profile (RLP) generation and prediction are critical for the operation and planning of distribution networks, especially as diverse low-carbon technologies (e.g., photovoltaic and electric vehicles) are increasingly adopted. This paper introduces a novel flow-based generative model, termed Full Convolutional Profile Flow (FCPFlow), which is uniquely designed for both conditional a… ▽ More

    Submitted 9 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  31. arXiv:2405.00577  [pdf

    cs.LG eess.SP q-bio.NC

    Discovering robust biomarkers of neurological disorders from functional MRI using graph neural networks: A Review

    Authors: Yi Hao Chan, Deepank Girish, Sukrit Gupta, **g Xia, Chockalingam Kasi, Yinan He, Conghao Wang, Jagath C. Rajapakse

    Abstract: Graph neural networks (GNN) have emerged as a popular tool for modelling functional magnetic resonance imaging (fMRI) datasets. Many recent studies have reported significant improvements in disorder classification performance via more sophisticated GNN designs and highlighted salient features that could be potential biomarkers of the disorder. In this review, we provide an overview of how GNN and… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

  32. arXiv:2405.00542  [pdf, other

    eess.IV cs.CV

    UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-scale Generation and Registration Enhancement

    Authors: Ruiquan Ge, Zhaojie Fang, Pengxue Wei, Zhanghao Chen, Hongyang Jiang, Ahmed Elazab, Wangting Li, Xiang Wan, Shaochong Zhang, Changmiao Wang

    Abstract: Fundus photography, in combination with the ultra-wide-angle fundus (UWF) techniques, becomes an indispensable diagnostic tool in clinical settings by offering a more comprehensive view of the retina. Nonetheless, UWF fluorescein angiography (UWF-FA) necessitates the administration of a fluorescent dye via injection into the patient's hand or elbow unlike UWF scanning laser ophthalmoscopy (UWF-SLO… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

  33. arXiv:2404.18096  [pdf, other

    eess.IV cs.CV

    Snake with Shifted Window: Learning to Adapt Vessel Pattern for OCTA Segmentation

    Authors: Xinrun Chen, Mei Shen, Haojian Ning, Mengzhan Zhang, Chengliang Wang, Shiying Li

    Abstract: Segmenting specific targets or structures in optical coherence tomography angiography (OCTA) images is fundamental for conducting further pathological studies. The retinal vascular layers are rich and intricate, and such vascular with complex shapes can be captured by the widely-studied OCTA images. In this paper, we thus study how to use OCTA images with projection vascular layers to segment reti… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  34. arXiv:2404.16544  [pdf, other

    eess.IV

    Image registration based automated lesion correspondence pipeline for longitudinal CT data

    Authors: Subrata Mukherjee, Thibaud Coroller, Craig Wang, Ravi K. Samala, Tingting Hu, Didem Gokcay, Nicholas Petrick, Berkman Sahiner, Qian Cao

    Abstract: Patients diagnosed with metastatic breast cancer (mBC) typically undergo several radiographic assessments during their treatment. mBC often involves multiple metastatic lesions in different organs, it is imperative to accurately track and assess these lesions to gain a comprehensive understanding of the disease's response to treatment. Computerized analysis methods that rely on lesion-level tracki… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  35. arXiv:2404.15585  [pdf, other

    cs.LG eess.IV

    Brain Storm Optimization Based Swarm Learning for Diabetic Retinopathy Image Classification

    Authors: Liang Qu, Cunze Wang, Yuhui Shi

    Abstract: The application of deep learning techniques to medical problems has garnered widespread research interest in recent years, such as applying convolutional neural networks to medical image classification tasks. However, data in the medical field is often highly private, preventing different hospitals from sharing data to train an accurate model. Federated learning, as a privacy-preserving machine le… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

  36. arXiv:2404.15469  [pdf, other

    cs.IT eess.SP

    NMBEnet: Efficient Near-field mmWave Beam Training for Multiuser OFDM Systems Using Sub-6 GHz Pilots

    Authors: Wang Liu, Cunhua Pan, Hong Ren, Cheng-Xiang Wang, Jiangzhou Wang, Xiaohu You

    Abstract: Combining millimetre-wave (mmWave) communications with an extremely large-scale antenna array (ELAA) presents a promising avenue for meeting the spectral efficiency demands of the future sixth generation (6G) mobile communications. However, beam training for mmWave ELAA systems is challenged by excessive pilot overheads as well as insufficient accuracy, as the huge near-field codebook has to be ac… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

  37. arXiv:2404.15009  [pdf, other

    cs.CV eess.IV

    The Brain Tumor Segmentation in Pediatrics (BraTS-PEDs) Challenge: Focus on Pediatrics (CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs)

    Authors: Anahita Fathi Kazerooni, Nastaran Khalili, Deep Gandhi, Xinyang Liu, Zhifan Jiang, Syed Muhammed Anwar, Jake Albrecht, Maruf Adewole, Udunna Anazodo, Hannah Anderson, Sina Bagheri, Ujjwal Baid, Timothy Bergquist, Austin J. Borja, Evan Calabrese, Verena Chung, Gian-Marco Conte, Farouk Dako, James Eddy, Ivan Ezhov, Ariana Familiar, Keyvan Farahani, Anurag Gottipati, Debanjan Haldar, Shuvanjan Haldar , et al. (51 additional authors not shown)

    Abstract: Pediatric tumors of the central nervous system are the most common cause of cancer-related death in children. The five-year survival rate for high-grade gliomas in children is less than 20%. Due to their rarity, the diagnosis of these entities is often delayed, their treatment is mainly based on historic treatment concepts, and clinical trials require multi-institutional collaborations. Here we pr… ▽ More

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2305.17033

  38. arXiv:2404.08805  [pdf, other

    eess.IV cs.CV cs.LG

    Real-time guidewire tracking and segmentation in intraoperative x-ray

    Authors: Baochang Zhang, Mai Bui, Cheng Wang, Felix Bourier, Heribert Schunkert, Nassir Navab

    Abstract: During endovascular interventions, physicians have to perform accurate and immediate operations based on the available real-time information, such as the shape and position of guidewires observed on the fluoroscopic images, haptic information and the patients' physiological signals. For this purpose, real-time and accurate guidewire segmentation and tracking can enhance the visualization of guidew… ▽ More

    Submitted 12 April, 2024; originally announced April 2024.

  39. arXiv:2404.06079  [pdf, other

    eess.AS cs.AI

    The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge

    Authors: Yiwei Guo, Chenrun Wang, Yifan Yang, Hankun Wang, Ziyang Ma, Chenpeng Du, Shuai Wang, Hanzheng Li, Shuai Fan, Hui Zhang, Xie Chen, Kai Yu

    Abstract: Discrete speech tokens have been more and more popular in multiple speech processing fields, including automatic speech recognition (ASR), text-to-speech (TTS) and singing voice synthesis (SVS). In this paper, we describe the systems developed by the SJTU X-LANCE group for the TTS (acoustic + vocoder), SVS, and ASR tracks in the Interspeech 2024 Speech Processing Using Discrete Speech Unit Challen… ▽ More

    Submitted 9 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: 5 pages, 3 figures. Report of a challenge

  40. arXiv:2404.03115  [pdf, other

    cs.LG eess.SY

    Deep Learning-Based Weather-Related Power Outage Prediction with Socio-Economic and Power Infrastructure Data

    Authors: Xuesong Wang, Nina Fatehi, Caisheng Wang, Masoud H. Nazari

    Abstract: This paper presents a deep learning-based approach for hourly power outage probability prediction within census tracts encompassing a utility company's service territory. Two distinct deep learning models, conditional Multi-Layer Perceptron (MLP) and unconditional MLP, were developed to forecast power outage probabilities, leveraging a rich array of input features gathered from publicly available… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted in 2024 IEEE PES General Meeting, Seattle, Washington (PES GM 2024)

  41. arXiv:2404.01082  [pdf, other

    eess.IV

    The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023

    Authors: Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Li** Zhang, Weitian Chen, Yidong Zhao , et al. (25 additional authors not shown)

    Abstract: Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation p… ▽ More

    Submitted 16 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 25 pages, 17 figures

  42. arXiv:2403.20025  [pdf, ps, other

    cs.IT eess.SP

    Secure Full-Duplex Communication via Movable Antennas

    Authors: **gze Ding, Zijian Zhou, Chenbo Wang, Wenyao Li, Lifeng Lin, Bingli Jiao

    Abstract: This paper investigates physical layer security (PLS) for a movable antenna (MA)-assisted full-duplex (FD) system. In this system, an FD base station (BS) with multiple MAs for transmission and reception provides services for an uplink (UL) user and a downlink (DL) user. Each user operates in half-duplex (HD) mode and is equipped with a single fixed-position antenna (FPA), in the presence of a sin… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: This paper has been submitted for possible publication

  43. arXiv:2403.16529  [pdf, other

    eess.SP

    Exploit High-Dimensional RIS Information to Localization: What Is the Impact of Faulty Element?

    Authors: Tuo Wu, Cunhua Pan, Kangda Zhi, Hong Ren, Maged Elkashlan, Cheng-Xiang Wang, Robert Schober, Xiaohu You

    Abstract: This paper proposes a novel localization algorithm using the reconfigurable intelligent surface (RIS) received signal, i.e., RIS information. Compared with BS received signal, i.e., BS information, RIS information offers higher dimension and richer feature set, thereby providing an enhanced capacity to distinguish positions of the mobile users (MUs). Additionally, we address a practical scenario w… ▽ More

    Submitted 28 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: 17 pages, Accepted by IEEE JSAC

  44. arXiv:2403.15466  [pdf

    cs.CV cs.LG eess.IV

    Using Super-Resolution Imaging for Recognition of Low-Resolution Blurred License Plates: A Comparative Study of Real-ESRGAN, A-ESRGAN, and StarSRGAN

    Authors: Ching-Hsiang Wang

    Abstract: With the robust development of technology, license plate recognition technology can now be properly applied in various scenarios, such as road monitoring, tracking of stolen vehicles, detection at parking lot entrances and exits, and so on. However, the precondition for these applications to function normally is that the license plate must be 'clear' enough to be recognized by the system with the… ▽ More

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Master's thesis

  45. Multi-modal Heart Failure Risk Estimation based on Short ECG and Sampled Long-Term HRV

    Authors: Sergio González, Abel Ko-Chun Yi, Wan-Ting Hsieh, Wei-Chao Chen, Chun-Li Wang, Victor Chien-Chia Wu, Shang-Hung Chang

    Abstract: Cardiovascular diseases, including Heart Failure (HF), remain a leading global cause of mortality, often evading early detection. In this context, accessible and effective risk assessment is indispensable. Traditional approaches rely on resource-intensive diagnostic tests, typically administered after the onset of symptoms. The widespread availability of electrocardiogram (ECG) technology and the… ▽ More

    Submitted 29 February, 2024; originally announced March 2024.

    Journal ref: S. González, A. K.-C. Yi, W.-T. Hsieh, W.-C. Chen, C.-L. Wang, V. C.-C. Wu, S.-H. Chang, Multi-modal heart failure risk estimation based on short ECG and sampled long-term HRV, Information Fusion 107 (2024) 102337

  46. arXiv:2403.14402  [pdf, other

    cs.SD cs.CL eess.AS

    XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

    Authors: HyoJung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang

    Abstract: Speech recognition and translation systems perform poorly on noisy inputs, which are frequent in realistic environments. Augmenting these systems with visual signals has the potential to improve robustness to noise. However, audio-visual (AV) data is only available in limited amounts and for fewer languages than audio-only resources. To address this gap, we present XLAVS-R, a cross-lingual audio-v… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

  47. arXiv:2403.12408  [pdf, other

    cs.CL cs.SD eess.AS

    MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation

    Authors: Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong

    Abstract: There have been emerging research interest and advances in speech-to-speech translation (S2ST), translating utterances from one language to another. This work proposes Multitask Speech Language Model (MSLM), which is a decoder-only speech language model trained in a multitask setting. Without reliance on text training data, our model is able to support multilingual S2ST with speaker style preserve… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  48. arXiv:2403.12402  [pdf, other

    cs.CL cs.SD eess.AS

    An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis

    Authors: Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong

    Abstract: Speech language models (LMs) are promising for high-quality speech synthesis through in-context learning. A typical speech LM takes discrete semantic units as content and a short utterance as prompt, and synthesizes speech which preserves the content's semantics but mimics the prompt's style. However, there is no systematic understanding on how the synthesized audio is controlled by the prompt and… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  49. arXiv:2403.11556  [pdf, other

    eess.IV cs.CV

    Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement

    Authors: Qianyu Zhang, Bolun Zheng, Xinying Chen, Quan Chen, Zhunjie Zhu, Can** Wang, Zongpeng Li, Chengang Yan

    Abstract: Video compression artifacts arise due to the quantization operation in the frequency domain. The goal of video quality enhancement is to reduce compression artifacts and reconstruct a visually-pleasant result. In this work, we propose a hierarchical frequency-based upsampling and refining neural network (HFUR) for compressed video quality enhancement. HFUR consists of two modules: implicit frequen… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  50. arXiv:2403.10064  [pdf, other

    eess.IV cs.CV

    Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI

    Authors: Chong Wang, Lanqing Guo, Yufei Wang, Hao Cheng, Yi Yu, Bihan Wen

    Abstract: Deep unfolding networks (DUN) have emerged as a popular iterative framework for accelerated magnetic resonance imaging (MRI) reconstruction. However, conventional DUN aims to reconstruct all the missing information within the entire null space in each iteration. Thus it could be challenging when dealing with highly ill-posed degradation, usually leading to unsatisfactory reconstruction. In this wo… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024