Showing 1–50 of 158 results for author: Hu, X

Searching in archive eess.
  1. arXiv:2406.18021  [pdf, other]

    cs.SD cs.LG eess.AS

    SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR

    Authors: Shuaishuai Ye, Shunfei Chen, Xinhui Hu, Xinkang Xu

    Abstract: In this work, we propose a Switch-Conformer-based MoE system named SC-MoE for unified streaming and non-streaming code-switching (CS) automatic speech recognition (ASR), where we design a streaming MoE layer consisting of three language experts, which correspond to Mandarin, English, and blank, respectively, and equipped with a language identification (LID) network with a Connectionist Temporal Cl…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted by InterSpeech 2024; 5 pages, 2 figures

  2. arXiv:2405.18435  [pdf, other]

    eess.IV cs.CV

    QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

    Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag, et al. (55 additional authors not shown)

    Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de…

    Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

    Comments: initial technical report

  3. arXiv:2405.16446  [pdf, ps, other]

    eess.SP

    A New Solution for MU-MISO Symbol-Level Precoding: Extrapolation and Deep Unfolding

    Authors: Mu Liang, Ang Li, Xiaoyan Hu, Christos Masouros

    Abstract: Constructive interference (CI) precoding, which converts the harmful multi-user interference into beneficial signals, is a promising and efficient interference management scheme in multi-antenna communication systems. However, CI-based symbol-level precoding (SLP) experiences high computational complexity as the number of symbol slots increases within a transmission block, rendering it unaffordabl…

    Submitted 26 May, 2024; originally announced May 2024.

  4. arXiv:2405.13166  [pdf, other]

    eess.AS cs.AI cs.CY

    FairLENS: Assessing Fairness in Law Enforcement Speech Recognition

    Authors: Yicheng Wang, Mark Cusick, Mohamed Laila, Kate Puech, Zhengping Ji, Xia Hu, Michael Wilson, Noah Spitzer-Williams, Bryan Wheeler, Yasser Ibrahim

    Abstract: Automatic speech recognition (ASR) techniques have become powerful tools, enhancing efficiency in law enforcement scenarios. To ensure fairness for demographic groups in different acoustic environments, ASR engines must be tested across a variety of speakers in realistic settings. However, describing the fairness discrepancies between models with confidence remains a challenge. Meanwhile, most pub…

    Submitted 28 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  5. arXiv:2405.05498  [pdf, other]

    cs.SD eess.AS

    The RoyalFlush Automatic Speech Diarization and Recognition System for In-Car Multi-Channel Automatic Speech Recognition Challenge

    Authors: Jingguang Tian, Shuaishuai Ye, Shunfei Chen, Yang Xiang, Zhaohui Yin, Xinhui Hu, Xinkang Xu

    Abstract: This paper presents our system submission for the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge, which focuses on speaker diarization and speech recognition in complex multi-speaker scenarios. To address these challenges, we develop end-to-end speaker diarization models that notably decrease the diarization error rate (DER) by 49.58% compared to the official baseline on t…

    Submitted 8 May, 2024; originally announced May 2024.

  6. arXiv:2405.03711  [pdf, other]

    cs.LG cs.AI cs.NE eess.SY

    Guidance Design for Escape Flight Vehicle Using Evolution Strategy Enhanced Deep Reinforcement Learning

    Authors: Xiao Hu, Tianshu Wang, Min Gong, Shaoshi Yang

    Abstract: Guidance commands of flight vehicles are a series of data sets with fixed time intervals, thus guidance design constitutes a sequential decision problem and satisfies the basic conditions for using deep reinforcement learning (DRL). In this paper, we consider the scenario where the escape flight vehicle (EFV) generates guidance commands based on DRL and the pursuit flight vehicle (PFV) generates g…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 13 pages, 13 figures, accepted to appear on IEEE Access, Mar. 2024

    Journal ref: IEEE Access, vol. 12, pp. 48210-48222, Mar. 2024

  7. arXiv:2405.03178  [pdf, other]

    cs.SD eess.AS

    POPDG: Popular 3D Dance Generation with PopDanceSet

    Authors: Zhenye Luo, Min Ren, Xuecai Hu, Yongzhen Huang, Li Yao

    Abstract: Generating dances that are both lifelike and well-aligned with music continues to be a challenging task in the cross-modal domain. This paper introduces PopDanceSet, the first dataset tailored to the preferences of young audiences, enabling the generation of aesthetically oriented dances. And it surpasses the AIST++ dataset in music genre diversity and the intricacy and depth of dance movements. M…

    Submitted 6 May, 2024; originally announced May 2024.

  8. arXiv:2404.17667  [pdf, other]

    eess.SP cs.LG

    SiamQuality: A ConvNet-Based Foundation Model for Imperfect Physiological Signals

    Authors: Cheng Ding, Zhicheng Guo, Zhaoliang Chen, Randall J Lee, Cynthia Rudin, Xiao Hu

    Abstract: Foundation models, especially those using transformers as backbones, have gained significant popularity, particularly in language and language-vision tasks. However, large foundation models are typically trained on high-quality data, which poses a significant challenge, given the prevalence of poor-quality real-world data. This challenge is more pronounced for developing foundation models for phys…

    Submitted 26 April, 2024; originally announced April 2024.

  9. arXiv:2404.15353  [pdf, other]

    eess.SP cs.AI cs.LG

    SQUWA: Signal Quality Aware DNN Architecture for Enhanced Accuracy in Atrial Fibrillation Detection from Noisy PPG Signals

    Authors: Runze Yan, Cheng Ding, Ran Xiao, Aleksandr Fedorov, Randall J Lee, Fadi Nahab, Xiao Hu

    Abstract: Atrial fibrillation (AF), a common cardiac arrhythmia, significantly increases the risk of stroke, heart disease, and mortality. Photoplethysmography (PPG) offers a promising solution for continuous AF monitoring, due to its cost efficiency and integration into wearable devices. Nonetheless, PPG signals are susceptible to corruption from motion artifacts and other factors often encountered in ambu…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 15 pages; 9 figures; 2024 Conference on Health, Inference, and Learning (CHIL)

  10. arXiv:2403.13996  [pdf, other]

    eess.IV cs.CV

    P-Count: Persistence-based Counting of White Matter Hyperintensities in Brain MRI

    Authors: Xiaoling Hu, Annabel Sorby-Adams, Frederik Barkhof, W Taylor Kimberly, Oula Puonti, Juan Eugenio Iglesias

    Abstract: White matter hyperintensities (WMH) are a hallmark of cerebrovascular disease and multiple sclerosis. Automated WMH segmentation methods enable quantitative analysis via estimation of total lesion load, spatial distribution of lesions, and number of lesions (i.e., number of connected components after thresholding), all of which are correlated with patient outcomes. While the two former measures ca…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: 11 pages, 4 figures

  11. arXiv:2403.03756  [pdf, ps, other]

    eess.SP

    Maximizing Energy Charging for UAV-assisted MEC Systems with SWIPT

    Authors: Xiaoyan Hu, Pengle Wen, Han Xiao, Wenjie Wang, Kai-Kit Wong

    Abstract: An unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) scheme with simultaneous wireless information and power transfer (SWIPT) is proposed in this paper. Unlike existing MEC-WPT schemes that disregard the downlink period for returning computing results to the ground equipment (GEs), our proposed scheme actively considers and capitalizes on this period. By leveraging the SWIPT techni…

    Submitted 6 March, 2024; originally announced March 2024.

  12. arXiv:2402.10226  [pdf, other]

    cs.RO eess.SY

    Simulation-based Analysis of a Novel Loop-based Road Topology for Autonomous Vehicles

    Authors: Stefan Ramdhan, Winnie Trandinh, Sathurshan Arulmohan, Xiayong Hu, Spencer Deevy, Victor Bandur, Vera Pantelic, Mark Lawford, Alan Wassyng

    Abstract: The challenges in implementing SAE Level 4/5 autonomous vehicles are manifold, with intersection navigation being a pervasive one. We analyze a novel road topology invented by a co-author of this paper, Xiayong Hu. The topology eliminates the need for traditional traffic control and cross-traffic at intersections, potentially improving the safety of autonomous driving systems. The topology, herein…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 8 pages, 10 figures, Submitted to IV2024

  13. TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion

    Authors: Samuel Pegg, Kai Li, Xiaolin Hu

    Abstract: Audio-visual speech separation has gained significant traction in recent years due to its potential applications in various fields such as speech recognition, diarization, scene analysis and assistive technologies. Designing a lightweight audio-visual speech separation network is important for low-latency applications, but existing methods often require higher computational costs and more paramete…

    Submitted 25 January, 2024; originally announced January 2024.

    Journal ref: 2023 13th International Conference on Information Science and Technology (ICIST), Cairo, Egypt, 2023, pp. 243-252

  14. arXiv:2401.08136  [pdf, other]

    eess.SY

    Bias-Compensated State of Charge and State of Health Joint Estimation for Lithium Iron Phosphate Batteries

    Authors: Baozhao Yi, Xinhao Du, Jiawei Zhang, Xiaogang Wu, Qiuhao Hu, Weiran Jiang, Xiaosong Hu, Ziyou Song

    Abstract: Accurate estimation of the state of charge (SOC) and state of health (SOH) is crucial for the safe and reliable operation of batteries. Voltage measurement bias highly affects state estimation accuracy, especially in Lithium Iron Phosphate (LFP) batteries, which are susceptible due to their flat open-circuit voltage (OCV) curves. This work introduces a bias-compensated algorithm to reliably estima…

    Submitted 12 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 9 pages and 8 figures

  15. arXiv:2401.05725  [pdf, ps, other]

    cs.IT eess.SP

    Energy-Efficient STAR-RIS Enhanced UAV-Enabled MEC Networks with Bi-Directional Task Offloading

    Authors: Han Xiao, Xiaoyan Hu, Weile Zhang, Wenjie Wang, Kai-Kit Wong, Kun Yang

    Abstract: This paper introduces a novel multi-user mobile edge computing (MEC) scheme facilitated by the simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) and the unmanned aerial vehicle (UAV). Unlike existing MEC approaches, the proposed scheme enables bidirectional offloading, allowing users to concurrently offload tasks to the MEC servers located at the ground base…

    Submitted 9 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  16. arXiv:2401.03122  [pdf, other]

    cs.CV eess.IV

    SAR Despeckling via Regional Denoising Diffusion Probabilistic Model

    Authors: Xuran Hu, Ziqiang Xu, Zhihan Chen, Zhengpeng Feng, Mingzhe Zhu, LJubisa Stankovic

    Abstract: Speckle noise poses a significant challenge in maintaining the quality of synthetic aperture radar (SAR) images, so SAR despeckling techniques have drawn increasing attention. Despite the tremendous advancements of deep learning in fixed-scale SAR image despeckling, these methods still struggle to deal with large-scale SAR images. To address this problem, this paper introduces a novel despeckling…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 5 pages, 5 figures

    ACM Class: I.4.4

  17. arXiv:2312.13310  [pdf, other]

    eess.IV cs.CV

    Computational Spectral Imaging with Unified Encoding Model: A Comparative Study and Beyond

    Authors: Xinyuan Liu, Lizhi Wang, Lingen Li, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan

    Abstract: Computational spectral imaging is drawing increasing attention owing to the snapshot advantage, and amplitude, phase, and wavelength encoding systems are three types of representative implementations. Fairly comparing and understanding the performance of these systems is essential, but challenging due to the heterogeneity in encoding design. To overcome this limitation, we propose the unified enco…

    Submitted 20 December, 2023; originally announced December 2023.

  18. arXiv:2312.12833  [pdf, other]

    eess.IV cs.CV

    Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence

    Authors: Hongyuan Wang, Lizhi Wang, Jiang Xu, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan

    Abstract: Spectral super-resolution that aims to recover hyperspectral image (HSI) from easily obtainable RGB image has drawn increasing interest in the field of computational photography. The crucial aspect of spectral super-resolution lies in exploiting the correlation within HSIs. However, two types of bottlenecks in existing Transformers limit performance improvement and practical applications. First, e…

    Submitted 18 March, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  19. arXiv:2312.09620  [pdf, other]

    eess.AS

    A Deep Representation Learning-based Speech Enhancement Method Using Complex Convolution Recurrent Variational Autoencoder

    Authors: Yang Xiang, Jingguang Tian, Xinhui Hu, Xinkang Xu, ZhaoHui Yin

    Abstract: Generally, the performance of deep neural networks (DNNs) heavily depends on the quality of data representation learning. Our preliminary work has emphasized the significance of deep representation learning (DRL) in the context of speech enhancement (SE) applications. Specifically, our initial SE algorithm employed a gated recurrent unit variational autoencoder (VAE) with a Gaussian distribution t…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by ICASSP 2024

  20. arXiv:2312.05256  [pdf, other]

    eess.IV cs.AI

    Holistic Evaluation of GPT-4V for Biomedical Imaging

    Authors: Zhengliang Liu, Hanqi Jiang, Tianyang Zhong, Zihao Wu, Chong Ma, Yiwei Li, Xiaowei Yu, Yutong Zhang, Yi Pan, Peng Shu, Yanjun Lyu, Lu Zhang, Junjie Yao, Peixin Dong, Chao Cao, Zhenxiang Xiao, Jiaqi Wang, Huan Zhao, Shaochen Xu, Yaonai Wei, Jingyuan Chen, Haixing Dai, Peilong Wang, Hao He, Zewei Wang, et al. (25 additional authors not shown)

    Abstract: In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and mor…

    Submitted 10 November, 2023; originally announced December 2023.

  21. arXiv:2312.02300  [pdf]

    cs.LG eess.SP

    Reconsideration on evaluation of machine learning models in continuous monitoring using wearables

    Authors: Cheng Ding, Zhicheng Guo, Cynthia Rudin, Ran Xiao, Fadi B Nahab, Xiao Hu

    Abstract: This paper explores the challenges in evaluating machine learning (ML) models for continuous health monitoring using wearable devices beyond conventional metrics. We state the complexities posed by real-world variability, disease dynamics, user-specific characteristics, and the prevalence of false notifications, necessitating novel evaluation strategies. Drawing insights from large-scale heart stu…

    Submitted 4 December, 2023; originally announced December 2023.

  22. arXiv:2311.16447  [pdf, other]

    eess.IV cs.CV

    TopoSemiSeg: Enforcing Topological Consistency for Semi-Supervised Segmentation of Histopathology Images

    Authors: Meilong Xu, Xiaoling Hu, Saumya Gupta, Shahira Abousamra, Chao Chen

    Abstract: In computational pathology, segmenting densely distributed objects like glands and nuclei is crucial for downstream analysis. To alleviate the burden of obtaining pixel-wise annotations, semi-supervised learning methods learn from large amounts of unlabeled data. Nevertheless, existing semi-supervised methods overlook the topological information hidden in the unlabeled images and are thus prone to…

    Submitted 6 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: 14 pages, 7 figures, fix Eq. (8) and Eq. (9)

  23. arXiv:2311.10656  [pdf, other]

    eess.AS

    LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement

    Authors: Zili Qi, Xinhui Hu, Wangping Zhou, Sheng Li, Hao Wu, Jian Lu, Xinkang Xu

    Abstract: Recently, researchers have shown an increasing interest in automatically predicting the subjective evaluation for speech synthesis systems. This prediction is a challenging task, especially on the out-of-domain test set. In this paper, we proposed a novel fusion model for MOS prediction that combines supervised and unsupervised approaches. In the supervised aspect, we developed an SSL-based predic…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: accepted in IEEE-ASRU2023

  24. arXiv:2311.07033  [pdf, other]

    eess.IV cs.CV

    TTMFN: Two-stream Transformer-based Multimodal Fusion Network for Survival Prediction

    Authors: Ruiquan Ge, Xiangyang Hu, Rungen Huang, Gangyong Jia, Yaqi Wang, Renshu Gu, Changmiao Wang, Elazab Ahmed, Linyan Wang, Juan Ye, Ye Li

    Abstract: Survival prediction plays a crucial role in assisting clinicians with the development of cancer treatment protocols. Recent evidence shows that multimodal data can help in the diagnosis of cancer disease and improve survival prediction. Currently, deep learning-based approaches have experienced increasing success in survival prediction by integrating pathological images and gene expression data. H…

    Submitted 12 November, 2023; originally announced November 2023.

  25. arXiv:2310.20289  [pdf]

    physics.optics eess.IV physics.app-ph

    C-Silicon-based metasurfaces for aperture-robust spectrometer/imaging with angle integration

    Authors: Weizhu Xu, Qingbin Fan, Peicheng Lin, Jiarong Wang, Hao Hu, Tao Yue, Xuemei Hu, Ting Xu

    Abstract: Compared with conventional grating-based spectrometers, reconstructive spectrometers based on spectrally engineered filtering have the advantage of miniaturization because of the less demand for dispersive optics and free propagation space. However, available reconstructive spectrometers fail to balance the performance on operational bandwidth, spectral diversity and angular stability. In this wor…

    Submitted 31 October, 2023; originally announced October 2023.

  26. arXiv:2310.19293  [pdf, other]

    eess.IV cs.CV

    FetusMapV2: Enhanced Fetal Pose Estimation in 3D Ultrasound

    Authors: Chaoyu Chen, Xin Yang, Yuhao Huang, Wenlong Shi, Yan Cao, Mingyuan Luo, Xindi Hu, Lei Zhue, Lequan Yu, Kejuan Yue, Yuanji Zhang, Yi Xiong, Dong Ni, Weijun Huang

    Abstract: Fetal pose estimation in 3D ultrasound (US) involves identifying a set of associated fetal anatomical landmarks. Its primary objective is to provide comprehensive information about the fetus through landmark connections, thus benefiting various critical applications, such as biometric measurements, plane localization, and fetal movement monitoring. However, accurately estimating the 3D fetal pose…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 16 pages, 11 figures, accepted by Medical Image Analysis (2023)

  27. arXiv:2310.17363  [pdf, ps, other]

    eess.SY

    Controllability of networked multiagent systems based on linearized Turing's model

    Authors: Tianhao Li, Ruichang Zhang, Zhixin Liu, Zhuo Zou, Xiaoming Hu

    Abstract: Turing's model has been widely used to explain how simple, uniform structures can give rise to complex, patterned structures during the development of organisms. However, it is very hard to establish rigorous theoretical results for the dynamic evolution behavior of Turing's model since it is described by nonlinear partial differential equations. We focus on controllability of Turing's model by li…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: 13 pages, 4 figures, submitted to Automatica

  28. arXiv:2310.14155  [pdf]

    eess.SP

    Photoplethysmography based atrial fibrillation detection: an updated review from July 2019

    Authors: Cheng Ding, Ran Xiao, Weijia Wang, Elizabeth Holdsworth, Xiao Hu

    Abstract: Atrial fibrillation (AF) is a prevalent cardiac arrhythmia associated with significant health ramifications, including an elevated susceptibility to ischemic stroke, heart disease, and heightened mortality. Photoplethysmography (PPG) has emerged as a promising technology for continuous AF monitoring for its cost-effectiveness and widespread integration into wearable devices. Our team previously co…

    Submitted 21 October, 2023; originally announced October 2023.

  29. arXiv:2310.09466  [pdf, ps, other]

    cs.IT eess.SP

    Robust Anti-jamming Communications with DMA-Based Reconfigurable Heterogeneous Array

    Authors: Kaizhi Huang, Wenyu Jiang, Yajun Chen, Liang Jin, Qingqing Wu, Xiaoling Hu

    Abstract: In the future commercial and military communication systems, anti-jamming remains a critical issue. Existing homogeneous or heterogeneous arrays with limited degrees of freedom (DoF) and high consumption are unable to meet the requirements of communication in rapidly changing and intense jamming environments. To address these challenges, we propose a reconfigurable heterogeneous array (RHA) arch…

    Submitted 13 October, 2023; originally announced October 2023.

  30. arXiv:2309.17189  [pdf, other]

    cs.SD cs.CV eess.AS

    RTFS-Net: Recurrent Time-Frequency Modelling for Efficient Audio-Visual Speech Separation

    Authors: Samuel Pegg, Kai Li, Xiaolin Hu

    Abstract: Audio-visual speech separation methods aim to integrate different modalities to generate high-quality separated speech, thereby enhancing the performance of downstream tasks such as speech recognition. Most existing state-of-the-art (SOTA) models operate in the time domain. However, their overly simplistic approach to modeling acoustic features often necessitates larger and more computationally in…

    Submitted 21 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted by The Twelfth International Conference on Learning Representations (ICLR) 2024, see https://openreview.net/forum?id=PEuDO2EiDr

  31. arXiv:2309.13585  [pdf, other]

    eess.SP

    Identification of Ghost Targets for Automotive Radar in the Presence of Multipath

    Authors: Le Zheng, Jiamin Long, Marco Lops, Fan Liu, Xueyao Hu

    Abstract: Colocated multiple-input multiple-output (MIMO) technology has been widely used in automotive radars as it provides accurate angular estimation of the objects with relatively small number of transmitting and receiving antennas. Since the Direction Of Departure (DOD) and the Direction Of Arrival (DOA) of line-of-sight targets coincide, MIMO signal processing allows forming a larger virtual array fo…

    Submitted 26 September, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: 13 pages, 10 figures

  32. arXiv:2309.03331  [pdf, other]

    cs.CV eess.IV

    Expert Uncertainty and Severity Aware Chest X-Ray Classification by Multi-Relationship Graph Learning

    Authors: Mengliang Zhang, Xinyue Hu, Lin Gu, Liangchen Liu, Kazuma Kobayashi, Tatsuya Harada, Ronald M. Summers, Yingying Zhu

    Abstract: Patients undergoing chest X-rays (CXR) often endure multiple lung diseases. When evaluating a patient's condition, due to the complex pathologies, subtle texture changes of different lung lesions in images, and patient condition differences, radiologists may make uncertain even when they have experienced long-term clinical training and professional guidance, which makes much noise in extracting di…

    Submitted 6 September, 2023; originally announced September 2023.

  33. arXiv:2309.01273  [pdf, other]

    cs.AR eess.SY

    WindMill: A Parameterized and Pluggable CGRA Implemented by DIAG Design Flow

    Authors: Haojia Hui, Jiangyuan Gu, Xunbo Hu, Yang Hu, Leibo Liu, Shaojun Wei, Shouyi Yin

    Abstract: With the cross-fertilization of applications and the ever-increasing scale of models, the efficiency and productivity of hardware computing architectures have become inadequate. This inadequacy further exacerbates issues in design flexibility, design complexity, development cycle, and development costs (4-d problems) in divergent scenarios. To address these challenges, this paper proposed a flexib…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: 7 pages, 10 figures

  34. arXiv:2308.13790  [pdf, other]

    eess.IV cs.CV

    FFPN: Fourier Feature Pyramid Network for Ultrasound Image Segmentation

    Authors: Chaoyu Chen, Xin Yang, Rusi Chen, Junxuan Yu, Liwei Du, Jian Wang, Xindi Hu, Yan Cao, Yingying Liu, Dong Ni

    Abstract: Ultrasound (US) image segmentation is an active research area that requires real-time and highly accurate analysis in many scenarios. The detect-to-segment (DTS) frameworks have been recently proposed to balance accuracy and efficiency. However, existing approaches may suffer from inadequate contour encoding or fail to effectively leverage the encoded results. In this paper, we introduce a novel F…

    Submitted 26 August, 2023; originally announced August 2023.

    Comments: 10 pages, 5 figures, Accepted by MLMI 2023

  35. arXiv:2308.12749  [pdf, other]

    cs.IT eess.SP

    Block-Level Interference Exploitation Precoding for MU-MISO: An ADMM Approach

    Authors: Yiran Wang, Yunsi Wen, Ang Li, Xiaoyan Hu, Christos Masouros

    Abstract: We study constructive interference based block-level precoding (CI-BLP) in the downlink of multi-user multiple-input single-output (MU-MISO) systems. Specifically, our aim is to extend the analysis on CI-BLP to the case where the considered number of symbol slots is smaller than that of the users. To this end, we mathematically prove the feasibility of using the pseudo-inverse to obtain the optima…

    Submitted 30 August, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

  36. arXiv:2308.08143  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation

    Authors: Kai Li, Runxuan Yang, Fuchun Sun, Xiaolin Hu

    Abstract: Recent research has made significant progress in designing fusion modules for audio-visual speech separation. However, they predominantly focus on multi-modal fusion at a single temporal scale of auditory and visual features without employing selective attention mechanisms, which is in sharp contrast with the brain. To address this issue, we propose a novel model called Intra- and Inter-Attention…

    Submitted 2 February, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: 18 pages, 6 figures

  37. arXiv:2308.06786  [pdf, other]

    eess.SY

    Challenges and Opportunities for Second-life Batteries: A Review of Key Technologies and Economy

    Authors: Xubo Gu, Hanyu Bai, Xiaofan Cui, Juner Zhu, Weichao Zhuang, Zhaojian Li, Xiaosong Hu, Ziyou Song

    Abstract: Due to the increasing volume of Electric Vehicles in automotive markets and the limited lifetime of onboard lithium-ion batteries (LIBs), the large-scale retirement of LIBs is imminent. The battery packs retired from Electric Vehicles still own 70%-80% of the initial capacity, thus having the potential to be utilized in scenarios with lower energy and power requirements to maximize the value of LI…

    Submitted 13 August, 2023; originally announced August 2023.

  38. arXiv:2308.06485  [pdf, other]

    eess.IV

    The Color Clifford Hardy Signal: Application to Color Edge Detection and Optical Flow

    Authors: Xiaoxiao Hu, Kit Ian Kou, Cuiming Zou, Dong Cheng

    Abstract: This paper introduces the idea of the color Clifford Hardy signal, which can be used to process color images. As a complex analytic function's high-dimensional analogue, the color Clifford Hardy signal inherits many desirable qualities of analyticity. A crucial tool for getting the color and structural data is the local feature representation of a color image in the color Clifford Hardy signal. By…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: 13 pages

  39. arXiv:2308.05987  [pdf, other]

    cs.SD eess.AS

    Large-Scale Learning on Overlapped Speech Detection: New Benchmark and New General System

    Authors: Zhaohui Yin, Jingguang Tian, Xinhui Hu, Xinkang Xu, Yang Xiang

    Abstract: Overlapped Speech Detection (OSD) is an important part of speech applications involving analysis of multi-party conversations. However, most of existing OSD systems are trained and evaluated on small datasets with limited application domains, which led to the robustness of them lacks benchmark for evaluation and the accuracy of them remains inadequate in realistic acoustic environments. To solve t…

    Submitted 7 September, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

  40. arXiv:2308.03764  [pdf]

    eess.SY cs.AI math.OC

    Deployment of Leader-Follower Automated Vehicle Systems for Smart Work Zone Applications with a Queuing-based Traffic Assignment Approach

    Authors: Qing Tang, Xianbiao Hu

    Abstract: The emerging technology of the Autonomous Truck Mounted Attenuator (ATMA), a leader-follower style vehicle system, utilizes connected and automated vehicle capabilities to enhance safety during transportation infrastructure maintenance in work zones. However, the speed difference between ATMA vehicles and general vehicles creates a moving bottleneck that reduces capacity and increases queue length…

    Submitted 23 July, 2023; originally announced August 2023.

  41. arXiv:2307.16418  [pdf, other]

    cs.CV cs.MM eess.IV

    DRAW: Defending Camera-shooted RAW against Image Manipulation

    Authors: Xiaoxiao Hu, Qichao Ying, Zhenxing Qian, Sheng Li, Xinpeng Zhang

    Abstract: RAW files are the initial measurement of scene radiance widely used in most cameras, and the ubiquitously-used RGB images are converted from RAW data through Image Signal Processing (ISP) pipelines. Nowadays, digital images are risky of being nefariously manipulated. Inspired by the fact that innate immunity is the first line of body defense, we propose DRAW, a novel scheme of defending images aga…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: To appear in ICCV 2023. The leading two authors contribute equally

  42. arXiv:2307.05385  [pdf, other

    eess.SP cs.AI cs.LG

    Learned Kernels for Sparse, Interpretable, and Efficient Medical Time Series Processing

    Authors: Sully F. Chen, Zhicheng Guo, Cheng Ding, Xiao Hu, Cynthia Rudin

    Abstract: Background: Rapid, reliable, and accurate interpretation of medical signals is crucial for high-stakes clinical decision-making. The advent of deep learning allowed for an explosion of new models that offered unprecedented performance in medical time series processing but at a cost: deep learning models are often compute-intensive and lack interpretability. Methods: We propose Sparse Mixture of…

    Submitted 2 April, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: 26 pages, 9 figures

  43. arXiv:2307.05339  [pdf, other

    eess.SP cs.LG

    A Self-Supervised Algorithm for Denoising Photoplethysmography Signals for Heart Rate Estimation from Wearables

    Authors: Pranay Jain, Cheng Ding, Cynthia Rudin, Xiao Hu

    Abstract: Smart watches and other wearable devices are equipped with photoplethysmography (PPG) sensors for monitoring heart rate and other aspects of cardiovascular health. However, PPG signals collected from such devices are susceptible to corruption from noise and motion artifacts, which cause errors in heart rate estimation. Typical denoising approaches filter or reconstruct the signal in ways that elim…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: 13 pages, 6 figures

  44. arXiv:2306.16197  [pdf, other

    cs.CV eess.IV

    Multi-IMU with Online Self-Consistency for Freehand 3D Ultrasound Reconstruction

    Authors: Mingyuan Luo, Xin Yang, Zhongnuo Yan, Junyu Li, Yuanji Zhang, Jiongquan Chen, Xindi Hu, Jikuan Qian, Jun Cheng, Dong Ni

    Abstract: Ultrasound (US) imaging is a popular tool in clinical diagnosis, offering safety, repeatability, and real-time capabilities. Freehand 3D US is a technique that provides a deeper understanding of scanned regions without increasing complexity. However, estimating elevation displacement and accumulation error remains challenging, making it difficult to infer the relative position using images alone.…

    Submitted 18 July, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: Accepted by MICCAI-2023

  45. arXiv:2306.13570  [pdf, other

    math.OC eess.SY

    Synchronous dynamic game on system observability considering one or two steps optimality

    Authors: Yueyue Xu, Xiaoming Hu, Lin Wang

    Abstract: This paper studies a system security problem in the context of observability based on a two-party non-cooperative asynchronous dynamic game. A system is assumed to be secure if it is not observable. Both the defender and the attacker have means to modify dimension of the unobservable subspace, which is set as the value function. Utilizing tools from geometric control, we construct the best respons…

    Submitted 23 June, 2023; originally announced June 2023.

  46. arXiv:2306.08532  [pdf, ps, other

    eess.SP

    On the Generalization and Advancement of Half-Sine-Based Pulse Shaping Filters for Constant Envelope OQPSK Modulation

    Authors: Pengcheng Mu, Yan Liu, Zihao Guo, Xiaoyan Hu, Kai-Kit Wong

    Abstract: The offset quadrature phase-shift keying (OQPSK) modulation is a key factor for the technique of ZigBee, which has been adopted in IEEE 802.15.4 for wireless communications of Internet of Things (IoT) and Internet of Vehicles (IoV), etc. In this paper, we propose the general conditions of pulse shaping filters (PSFs) with constant envelope (CE) property for OQPSK modulation, which can be easily le…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: 5 pages, 5 figures, journal paper

  47. arXiv:2306.07105  [pdf, ps, other

    cs.IT eess.SP

    STAR-RIS Assisted Covert Communications in NOMA Systems

    Authors: Han Xiao, Xiaoyan Hu, Tong-Xing Zheng, Kai-Kit Wong

    Abstract: Covert communications assisted by simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) in non-orthogonal multiple access (NOMA) systems have been explored in this paper. In particular, the access point (AP) transmitter adopts NOMA to serve a downlink covert user and a public user. The minimum detection error probability (DEP) at the warden is derived considering…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: text overlap with arXiv:2305.04930, arXiv:2305.03991

  48. arXiv:2306.04915  [pdf, ps, other

    cs.IT eess.SP

    Sensing-based Beamforming Design for Joint Performance Enhancement of RIS-Aided ISAC Systems

    Authors: Xiaowei Qian, Xiaoling Hu, Chenxi Liu, Mugen Peng, Caijun Zhong

    Abstract: Reconfigurable intelligent surface (RIS) has shown its great potential in facilitating device-based integrated sensing and communication (ISAC), where sensing and communication tasks are mostly conducted on different time-frequency resources. While the more challenging scenarios of simultaneous sensing and communication (SSC) have so far drawn little attention. In this paper, we propose a novel RI…

    Submitted 7 June, 2023; originally announced June 2023.

  49. arXiv:2306.02057  [pdf, ps, other

    eess.SP

    DataAI-6G: A System Parameters Configurable Channel Dataset for AI-6G Research

    Authors: Zibing Shen, Jianhua Zhang, Li Yu, Yuxiang Zhang, Zhen Zhang, Xidong Hu

    Abstract: With the acceleration of the commercialization of fifth generation (5G) mobile communication technology and the research for 6G communication systems, the communication system has the characteristics of high frequency, multi-band, high speed movement of users and large antenna array. These bring many difficulties to obtain accurate channel state information (CSI), which makes the performance of tr…

    Submitted 3 June, 2023; originally announced June 2023.

  50. arXiv:2306.00160  [pdf, other

    eess.AS cs.LG cs.SD

    Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model

    Authors: Héctor Martel, Julius Richter, Kai Li, Xiaolin Hu, Timo Gerkmann

    Abstract: We propose Audio-Visual Lightweight ITerative model (AVLIT), an effective and lightweight neural network that uses Progressive Learning (PL) to perform audio-visual speech separation in noisy environments. To this end, we adopt the Asynchronous Fully Recurrent Convolutional Neural Network (A-FRCNN), which has shown successful results in audio-only speech separation. Our architecture consists of an…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted by Interspeech 2023