Showing 1–24 of 24 results for author: Wan, X

Searching in archive eess.
  1. arXiv:2406.16150  [pdf, other]

    eess.IV cs.CV

    Intensity Confusion Matters: An Intensity-Distance Guided Loss for Bronchus Segmentation

    Authors: Haifan Gong, Wenhao Huang, Huan Zhang, Yu Wang, Xiang Wan, Hong Shen, Guanbin Li, Haofeng Li

    Abstract: Automatic segmentation of the bronchial tree from CT imaging is important, as it provides structural information for disease diagnosis. Despite the merits of previous automatic bronchus segmentation methods, they have paid less attention to the issue we term Intensity Confusion, wherein the intensity values of certain background voxels approach those of the foreground voxels within br…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: IEEE International Conference on Multimedia & Expo (ICME) 2024

  2. arXiv:2406.09950  [pdf, other]

    cs.SD cs.CL eess.AS

    An efficient text augmentation approach for contextualized Mandarin speech recognition

    Authors: Naijun Zheng, Xucheng Wan, Kai Liu, Ziqing Du, Zhou Huan

    Abstract: Although contextualized automatic speech recognition (ASR) systems are commonly used to improve the recognition of uncommon words, their effectiveness is hindered by the inherent limitations of speech-text data availability. To address this challenge, our study proposes to leverage extensive text-only datasets and contextualize pre-trained ASR models using a straightforward text-augmentation (TA)…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  3. arXiv:2405.03152  [pdf, other]

    eess.AS cs.SD

    MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

    Authors: Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

    Abstract: Despite notable advancements in automatic speech recognition (ASR), performance tends to degrade when faced with adverse conditions. Generative error correction (GER) leverages the exceptional text comprehension capabilities of large language models (LLMs), delivering impressive performance in ASR error correction, where N-best hypotheses provide valuable information for transcription prediction. H…

    Submitted 6 May, 2024; originally announced May 2024.

  4. arXiv:2405.00542  [pdf, other]

    eess.IV cs.CV

    UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-scale Generation and Registration Enhancement

    Authors: Ruiquan Ge, Zhaojie Fang, Pengxue Wei, Zhanghao Chen, Hongyang Jiang, Ahmed Elazab, Wangting Li, Xiang Wan, Shaochong Zhang, Changmiao Wang

    Abstract: Fundus photography, in combination with the ultra-wide-angle fundus (UWF) techniques, becomes an indispensable diagnostic tool in clinical settings by offering a more comprehensive view of the retina. Nonetheless, UWF fluorescein angiography (UWF-FA) necessitates the administration of a fluorescent dye via injection into the patient's hand or elbow, unlike UWF scanning laser ophthalmoscopy (UWF-SLO…

    Submitted 1 May, 2024; originally announced May 2024.

  5. arXiv:2404.11278  [pdf, other]

    physics.ins-det eess.IV

    Study on the static detection of ICF target based on muonic X-ray sphere encoded imaging

    Authors: Dikai Li, Jian Yu, Qian Chen, Chunhui Zhang, Xiangyu Wan, Leifeng Cao

    Abstract: Muon Induced X-ray Emission (MIXE) was discovered by Chinese physicist Zhang Wenyu as early as 1947, and it can conduct non-destructive elemental analysis inside samples. Research has shown that MIXE can retain the high efficiency of direct imaging while benefiting from the low noise of pinhole imaging through encoding holes. The related technology significantly improves the counting rate while ma…

    Submitted 17 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  6. arXiv:2311.15328  [pdf, other]

    eess.IV cs.CV

    BS-Diff: Effective Bone Suppression Using Conditional Diffusion Models from Chest X-Ray Images

    Authors: Zhanghao Chen, Yifei Sun, Wenjian Qin, Ruiquan Ge, Cheng Pan, Wenming Deng, Zhou Liu, Wenwen Min, Ahmed Elazab, Xiang Wan, Changmiao Wang

    Abstract: Chest X-rays (CXRs) are commonly utilized as a low-dose modality for lung screening. Nonetheless, the efficacy of CXRs is somewhat impeded, given that approximately 75% of the lung area overlaps with bone, which in turn hampers the detection and diagnosis of diseases. As a remedial measure, bone suppression techniques have been introduced. The current dual-energy subtraction imaging technique in t…

    Submitted 28 February, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: 5 pages, 2 figures, accepted by IEEE ISBI 2024

  7. arXiv:2311.04772  [pdf, other]

    eess.IV cs.CV

    GCS-ICHNet: Assessment of Intracerebral Hemorrhage Prognosis using Self-Attention with Domain Knowledge Integration

    Authors: Xuhao Shan, Xinyang Li, Ruiquan Ge, Shibin Wu, Ahmed Elazab, Jichao Zhu, Lingyan Zhang, Gangyong Jia, Qingying Xiao, Xiang Wan, Changmiao Wang

    Abstract: Intracerebral Hemorrhage (ICH) is a severe condition resulting from the rupture of damaged blood vessels in the brain, often leading to complications and fatalities. Timely and accurate prognosis and management are essential due to its high mortality rate. However, conventional methods heavily rely on subjective clinician expertise, which can lead to inaccurate diagnoses and delays in treatment. Artificial inte…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 6 pages, 3 figures, 5 tables, published to BIBM 2023

  8. arXiv:2310.14197  [pdf, other]

    eess.IV cs.CV

    Diffusion-based Data Augmentation for Nuclei Image Segmentation

    Authors: Xinyi Yu, Guanbin Li, Wei Lou, Siqi Liu, Xiang Wan, Yan Chen, Haofeng Li

    Abstract: Nuclei segmentation is a fundamental but challenging task in the quantitative analysis of histopathology images. Although fully-supervised deep learning-based methods have made significant progress, a large number of labeled images are required to achieve great segmentation performance. Considering that manually labeling all nuclei instances for a dataset is inefficient, obtaining a large-scale hu…

    Submitted 18 January, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: MICCAI 2023, released code: https://github.com/lhaof/Nudiff

  9. arXiv:2310.14172  [pdf, other]

    eess.IV cs.CV

    ASC: Appearance and Structure Consistency for Unsupervised Domain Adaptation in Fetal Brain MRI Segmentation

    Authors: Zihang Xu, Haifan Gong, Xiang Wan, Haofeng Li

    Abstract: Automatic tissue segmentation of fetal brain images is essential for the quantitative analysis of prenatal neurodevelopment. However, producing voxel-level annotations of fetal brain imaging is time-consuming and expensive. To reduce labeling costs, we propose a practical unsupervised domain adaptation (UDA) setting that adapts the segmentation labels of high-quality fetal brain atlases to unlabel…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: MICCAI 2023, released code: https://github.com/lhaof/ASC

  10. arXiv:2310.02629  [pdf, other]

    cs.SD eess.AS

    BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition

    Authors: Peikun Chen, Fan Yu, Yuhao Lian, Hongfei Xue, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

    Abstract: Mixture-of-experts based models, which use language experts to extract language-specific representations effectively, have been well applied in code-switching automatic speech recognition. However, there is still substantial room for improvement, as similar pronunciation across languages may result in ineffective multi-language modeling and inaccurate language boundary estimation. To eliminate these dr…

    Submitted 7 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted by ASRU 2023

  11. arXiv:2303.05023  [pdf, other]

    eess.AS cs.AI cs.SD

    X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion

    Authors: Kai Liu, Ziqing Du, Xucheng Wan, Huan Zhou

    Abstract: Target speech extraction (TSE) systems are designed to extract target speech from a multi-talker mixture. The popular training objective for most prior TSE networks is to enhance the reconstruction performance of the extracted speech waveform. However, it has been reported that a TSE system that delivers high reconstruction performance may still suffer from low-quality experience problems in practice. One such expe…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  12. arXiv:2301.06277  [pdf, ps, other]

    cs.SD cs.AI cs.LG eess.AS

    Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings

    Authors: Kai Liu, Xucheng Wan, Ziqing Du, Huan Zhou

    Abstract: As a practical alternative to speech separation, target speaker extraction (TSE) aims to extract the speech from the desired speaker using an additional speaker cue extracted from the speaker. Its main challenge lies in how to properly extract and leverage the speaker cue to benefit the extracted speech quality. The cue extraction method adopted in the majority of existing TSE studies is to directly utilize…

    Submitted 16 January, 2023; originally announced January 2023.

    Comments: Accepted by NCMMSC 2022

  13. arXiv:2301.04904  [pdf, other]

    eess.IV cs.CV

    Lesion-aware Dynamic Kernel for Polyp Segmentation

    Authors: Ruifei Zhang, Peiwen Lai, Xiang Wan, De-Jun Fan, Feng Gao, Xiao-Jian Wu, Guanbin Li

    Abstract: Automatic and accurate polyp segmentation plays an essential role in early colorectal cancer diagnosis. However, it has always been a challenging task due to 1) the diverse shape, size, brightness and other appearance characteristics of polyps, and 2) the tiny contrast between concealed polyps and their surrounding regions. To address these problems, we propose a lesion-aware dynamic network (LDNet) f…

    Submitted 12 January, 2023; originally announced January 2023.

    Comments: Accepted by MICCAI 2022

  14. arXiv:2209.13761  [pdf, other]

    eess.IV cs.CV cs.MM

    Image Compressed Sensing with Multi-scale Dilated Convolutional Neural Network

    Authors: Zhifeng Wang, Zhenghui Wang, Chunyan Zeng, Yan Yu, Xiangkui Wan

    Abstract: Deep Learning (DL) based Compressed Sensing (CS) has been applied for better performance of image reconstruction than traditional CS methods. However, most existing DL methods utilize the block-by-block measurement and each measurement block is restored separately, which introduces harmful blocking effects for reconstruction. Furthermore, the neuronal receptive fields of those methods are designed…

    Submitted 27 September, 2022; originally announced September 2022.

    Comments: 28 pages, 8 figures, MsDCNN for CS

  15. arXiv:2209.11906  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    Joint Speech Activity and Overlap Detection with Multi-Exit Architecture

    Authors: Ziqing Du, Kai Liu, Xucheng Wan, Huan Zhou

    Abstract: Overlapped speech detection (OSD) is critical for speech applications in multi-party conversation scenarios. Despite numerous research efforts and much progress, compared with speech activity detection (VAD), OSD remains an open challenge and its overall performance is far from satisfactory. The majority of prior research typically formulates the OSD problem as a standard classification problem, to…

    Submitted 23 September, 2022; originally announced September 2022.

  16. arXiv:2209.11905  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations

    Authors: Xucheng Wan, Kai Liu, Ziqing Du, Huan Zhou

    Abstract: To address the monaural speech enhancement problem, numerous research studies have been conducted to enhance speech via operations either in the time domain on the inner domain learned from the speech mixture or in the time-frequency domain on the fixed full-band short-time Fourier transform (STFT) spectrograms. Very recently, a few studies on sub-band based speech enhancement have been proposed. By enha…

    Submitted 23 September, 2022; originally announced September 2022.

  17. arXiv:2208.12753  [pdf, other]

    cs.SD cs.AI eess.AS

    Spatio-Temporal Representation Learning Enhanced Source Cell-phone Recognition from Speech Recordings

    Authors: Chunyan Zeng, Shixiong Feng, Zhifeng Wang, Xiangkui Wan, Yunfan Chen, Nan Zhao

    Abstract: The existing source cell-phone recognition method lacks the long-term feature characterization of the source device, resulting in inaccurate representation of the source cell-phone related features which leads to insufficient recognition accuracy. In this paper, we propose a source cell-phone recognition method based on spatio-temporal representation learning, which includes two main parts: extrac…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: 29 pages, 4 figures

  18. arXiv:2208.11920  [pdf]

    cs.SD eess.AS

    Digital Audio Tampering Detection Based on ENF Spatio-temporal Features Representation Learning

    Authors: Chunyan Zeng, Shuai Kong, Zhifeng Wang, Xiangkui Wan, Yunfan Chen

    Abstract: Most digital audio tampering detection methods based on electrical network frequency (ENF) only utilize the static spatial information of ENF, ignoring the variation of ENF in time series, which limits the ability of ENF feature representation and reduces the accuracy of tampering detection. This paper proposes a new method for digital audio tampering detection based on ENF spatio-temporal features…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: 19 pages, 6 figures

  19. arXiv:2206.08023  [pdf, other]

    eess.IV cs.CV cs.LG

    AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation

    Authors: Yuanfeng Ji, Haotian Bai, Jie Yang, Chongjian Ge, Ye Zhu, Ruimao Zhang, Zhen Li, Lingyan Zhang, Wanling Ma, Xiang Wan, Ping Luo

    Abstract: Despite the considerable progress in automatic abdominal multi-organ segmentation from CT/MRI scans in recent years, a comprehensive evaluation of the models' capabilities is hampered by the lack of a large-scale benchmark from diverse clinical scenarios. Constrained by the high cost of collecting and labeling 3D medical data, most of the deep learning models to date are driven by datasets with a l…

    Submitted 1 September, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

  20. arXiv:2107.04306  [pdf, other]

    eess.IV cs.CV

    Hepatocellular Carcinoma Segmentation from Digital Subtraction Angiography Videos using Learnable Temporal Difference

    Authors: Wenting Jiang, Yicheng Jiang, Lu Zhang, Changmiao Wang, Xiaoguang Han, Shuixing Zhang, Xiang Wan, Shuguang Cui

    Abstract: Automatic segmentation of hepatocellular carcinoma (HCC) in Digital Subtraction Angiography (DSA) videos can assist radiologists in efficient diagnosis of HCC and accurate evaluation of tumors in clinical practice. Few studies have investigated HCC segmentation from DSA videos. It is highly challenging due to motion artifacts in filming, ambiguous boundaries of tumor regions and high similarity…

    Submitted 16 September, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

    Comments: 10 pages; accepted to MICCAI 2021

  21. arXiv:2011.00694  [pdf, other]

    cs.CV eess.IV

    Multi-Modal Active Learning for Automatic Liver Fibrosis Diagnosis based on Ultrasound Shear Wave Elastography

    Authors: Lufei Gao, Ruisong Zhou, Changfeng Dong, Cheng Feng, Zhen Li, Xiang Wan, Li Liu

    Abstract: With the development of radiomics, noninvasive diagnosis like ultrasound (US) imaging plays a very important role in automatic liver fibrosis diagnosis (ALFD). Due to noisy data and expensive annotations of US images, the application of Artificial Intelligence (AI) assisting approaches encounters a bottleneck. Besides, the use of mono-modal US data limits further improvement of the classification…

    Submitted 1 November, 2020; originally announced November 2020.

  22. arXiv:2009.05436  [pdf, other]

    cs.CV cs.LG eess.IV

    Semi-Supervised Active Learning for COVID-19 Lung Ultrasound Multi-symptom Classification

    Authors: Lei Liu, Wentao Lei, Yongfang Luo, Cheng Feng, Xiang Wan, Li Liu

    Abstract: Ultrasound (US) is a non-invasive yet effective medical diagnostic imaging technique for the COVID-19 global pandemic. However, due to complex feature behaviors and expensive annotations of US images, it is difficult to apply Artificial Intelligence (AI) assisting approaches for lung's multi-symptom (multi-label) classification. To overcome these difficulties, we propose a novel semi-supervised Tw…

    Submitted 28 February, 2021; v1 submitted 9 September, 2020; originally announced September 2020.

  23. arXiv:1812.02339  [pdf, other]

    eess.AS cs.SD

    Generative Adversarial Network based Speaker Adaptation for High Fidelity WaveNet Vocoder

    Authors: Qiao Tian, Xucheng Wan, Shan Liu

    Abstract: Although state-of-the-art parallel WaveNet has addressed the issue of real-time waveform generation, there remain problems. Firstly, due to the noisy input signal of the model, there is still a gap between the quality of generated and natural waveforms. Secondly, a parallel WaveNet is trained under a distillation framework, which makes it tedious to adapt a well-trained model to a new speaker. To…

    Submitted 19 July, 2019; v1 submitted 5 December, 2018; originally announced December 2018.

    Comments: 5 pages, 4 figures, 1 table, 6 equations

  24. Exactly Decoupled Kalman Filtering for Multitarget State Estimation with Sensor Bias

    Authors: Jianxin Yi, Xianrong Wan, Deshi Li

    Abstract: The problem of multisensor multitarget state estimation in the presence of constant but unknown sensor biases is investigated. The classical approach to this problem is to augment the state vector to include the states of all the targets and the sensor biases, and then implement an augmented state Kalman filter (ASKF). In this paper, we propose a novel decoupled Kalman filtering algorithm. The dec…

    Submitted 14 October, 2019; v1 submitted 11 July, 2018; originally announced July 2018.

    Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 2019