Showing 1–47 of 47 results for author: Zhuang, X

Searching in archive eess.
  1. arXiv:2406.19043  [pdf]

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Yajing Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, Jing Qin, Xiahai Zhuang, Claudia Prieto, et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 2 tables

  2. arXiv:2406.09676  [pdf, other]

    eess.AS cs.CL

    Optimizing Byte-level Representation for End-to-end ASR

    Authors: Roger Hsiao, Liuhui Deng, Erik McDermott, Ruchir Travadi, Xiaodan Zhuang

    Abstract: We propose a novel approach to optimizing a byte-level representation for end-to-end automatic speech recognition (ASR). Byte-level representation is often used by large scale multilingual ASR systems when the character set of the supported languages is large. The compactness and universality of byte-level representation allow the ASR models to use smaller output vocabularies and therefore, provid…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 5 pages, 1 figure

  3. arXiv:2406.02430  [pdf, other]

    eess.AS cs.SD

    Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

    Authors: Philip Anastassiou, Jiawei Chen, Jitong Chen, Yuanzhe Chen, Zhuo Chen, Ziyi Chen, Jian Cong, Lelai Deng, Chuang Ding, Lu Gao, Mingqing Gong, Peisong Huang, Qingqing Huang, Zhiying Huang, Yuanyuan Huo, Dongya Jia, Chumin Li, Feiya Li, Hui Li, Jiaxin Li, Xiaoyang Li, Xingxing Li, Lin Liu, Shouda Liu, Sichao Liu, et al. (21 additional authors not shown)

    Abstract: We introduce Seed-TTS, a family of large-scale autoregressive text-to-speech (TTS) models capable of generating speech that is virtually indistinguishable from human speech. Seed-TTS serves as a foundation model for speech generation and excels in speech in-context learning, achieving performance in speaker similarity and naturalness that matches ground truth human speech in both objective and sub…

    Submitted 4 June, 2024; originally announced June 2024.

  4. arXiv:2404.01082  [pdf, other]

    eess.IV

    The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023

    Authors: Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Liping Zhang, Weitian Chen, Yidong Zhao, et al. (25 additional authors not shown)

    Abstract: Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation p…

    Submitted 16 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 25 pages, 17 figures

  5. arXiv:2311.18333  [pdf, other]

    math.NA eess.SP

    Spherical Designs for Function Approximation and Beyond

    Authors: Yuchen Xiao, Xiaosheng Zhuang

    Abstract: In this paper, we compare two optimization algorithms using full Hessian and approximation Hessian to obtain numerical spherical designs through their variational characterization. Based on the obtained spherical design point sets, we investigate the approximation of smooth and non-smooth functions by spherical harmonics with spherical designs. Finally, we use spherical framelets for denoising Wen…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 29 pages, 9 figures, 7 tables

    MSC Class: 42C05; 58C35; 65K10; 65D15; 65D32

  6. arXiv:2309.03537  [pdf, other]

    eess.SP cs.LG math.FA

    Data-Adaptive Graph Framelets with Generalized Vanishing Moments for Graph Signal Processing

    Authors: Ruigang Zheng, Xiaosheng Zhuang

    Abstract: In this paper, we propose a novel and general framework to construct tight framelet systems on graphs with localized supports based on hierarchical partitions. Our construction provides parametrized graph framelet systems with great generality based on partition trees, by which we are able to find the size of a low-dimensional subspace that best fits the low-rank structure of a family of signals.…

    Submitted 30 December, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    MSC Class: 43A99; 41A45; 94A11; 94A16

  7. arXiv:2304.08862  [pdf, other]

    cs.CL eess.AS

    Approximate Nearest Neighbour Phrase Mining for Contextual Speech Recognition

    Authors: Maurits Bleeker, Pawel Swietojanski, Stefan Braun, Xiaodan Zhuang

    Abstract: This paper presents an extension to train end-to-end Context-Aware Transformer Transducer (CATT) models by using a simple, yet efficient method of mining hard negative phrases from the latent space of the context encoder. During training, given a reference query, we mine a number of similar phrases using approximate nearest neighbour search. These sampled phrases are then used as negative exampl…

    Submitted 16 August, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: Accepted to Interspeech 2023. 5 pages, 2 figures, 2 tables

  8. arXiv:2302.03537  [pdf, other]

    eess.IV cs.CV

    Aligning Multi-Sequence CMR Towards Fully Automated Myocardial Pathology Segmentation

    Authors: Wangbin Ding, Lei Li, Junyi Qiu, Sihan Wang, Liqin Huang, Yinyin Chen, Shan Yang, Xiahai Zhuang

    Abstract: Myocardial pathology segmentation (MyoPS) is critical for the risk stratification and treatment planning of myocardial infarction (MI). Multi-sequence cardiac magnetic resonance (MS-CMR) images can provide valuable information. For instance, balanced steady-state free precession cine sequences present clear anatomical boundaries, while late gadolinium enhancement and T2-weighted CMR sequences visu…

    Submitted 7 February, 2023; originally announced February 2023.

  9. arXiv:2301.06043  [pdf, other]

    eess.IV cs.CV

    Unsupervised Cardiac Segmentation Utilizing Synthesized Images from Anatomical Labels

    Authors: Sihan Wang, Fuping Wu, Lei Li, Zheyao Gao, Byung-Woo Hong, Xiahai Zhuang

    Abstract: Cardiac segmentation is in great demand for clinical practice. Due to the enormous labor of manual delineation, unsupervised segmentation is desired. The ill-posed optimization problem of this task is inherently challenging, requiring well-designed constraints. In this work, we propose an unsupervised framework for multi-class segmentation with both intensity and shape constraints. Firstly, we ext…

    Submitted 15 January, 2023; originally announced January 2023.

  10. MyoPS-Net: Myocardial Pathology Segmentation with Flexible Combination of Multi-Sequence CMR Images

    Authors: Junyi Qiu, Lei Li, Sihan Wang, Ke Zhang, Yinyin Chen, Shan Yang, Xiahai Zhuang

    Abstract: Myocardial pathology segmentation (MyoPS) can be a prerequisite for the accurate diagnosis and treatment planning of myocardial infarction. However, achieving this segmentation is challenging, mainly due to the inadequate and indistinct information from an image. In this work, we develop an end-to-end deep neural network, referred to as MyoPS-Net, to flexibly combine five-sequence cardiac magnetic…

    Submitted 6 November, 2022; originally announced November 2022.

  11. arXiv:2211.01438  [pdf, other]

    eess.AS cs.CL cs.SD

    Variable Attention Masking for Configurable Transformer Transducer Speech Recognition

    Authors: Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang

    Abstract: This work studies the use of attention masking in transformer transducer based speech recognition for building a single configurable model for different deployment scenarios. We present a comprehensive set of experiments comparing fixed masking, where the same attention mask is applied at every frame, with chunked masking, where the attention mask for each frame is determined by chunk boundaries,…

    Submitted 18 April, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: To appear in ICASSP 2023

    Journal ref: International Conference on Acoustics, Speech, and Signal Processing, 2023

  12. arXiv:2208.12881  [pdf, other]

    eess.IV cs.CV

    Multi-Modality Cardiac Image Computing: A Survey

    Authors: Lei Li, Wangbin Ding, Liqun Huang, Xiahai Zhuang, Vicente Grau

    Abstract: Multi-modality cardiac imaging plays a key role in the management of patients with cardiovascular diseases. It allows a combination of complementary anatomical, morphological and functional information, increases diagnosis accuracy, and improves the efficacy of cardiovascular interventions and clinical outcomes. Fully-automated processing and quantitative analysis of multi-modality cardiac images…

    Submitted 26 August, 2022; originally announced August 2022.

    Comments: 30 pages

  13. arXiv:2206.05284  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Decoupling Predictions in Distributed Learning for Multi-Center Left Atrial MRI Segmentation

    Authors: Zheyao Gao, Lei Li, Fuping Wu, Sihan Wang, Xiahai Zhuang

    Abstract: Distributed learning has shown great potential in medical image analysis. It allows to use multi-center training data with privacy protection. However, data distributions in local centers can vary from each other due to different imaging vendors, and annotation protocols. Such variation degrades the performance of learning-based methods. To mitigate the influence, two groups of methods have been p…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: Accepted by MICCAI 2022

  14. arXiv:2206.04336  [pdf, other]

    eess.IV cs.CV

    Joint Modeling of Image and Label Statistics for Enhancing Model Generalizability of Medical Image Segmentation

    Authors: Shangqi Gao, Hangqi Zhou, Yibo Gao, Xiahai Zhuang

    Abstract: Although supervised deep-learning has achieved promising performance in medical image segmentation, many methods cannot generalize well on unseen data, limiting their real-world applicability. To address this problem, we propose a deep learning-based Bayesian framework, which jointly models image and label statistics, utilizing the domain-irrelevant contour of a medical image for segmentation. Spe…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022

  15. arXiv:2204.00623  [pdf, other]

    eess.IV cs.CV cs.LG

    Bayesian Image Super-Resolution with Deep Modeling of Image Statistics

    Authors: Shangqi Gao, Xiahai Zhuang

    Abstract: Modeling statistics of image priors is useful for image super-resolution, but little attention has been paid from the massive works of deep learning-based methods. In this work, we propose a Bayesian image restoration framework, where natural image statistics are modeled with the combination of smoothness and sparsity priors. Concretely, firstly we consider an ideal image as the sum of a smoothnes…

    Submitted 31 March, 2022; originally announced April 2022.

    Comments: 45 pages

    MSC Class: 62G ACM Class: I.5

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (2022)

  16. arXiv:2203.07684  [pdf, other]

    eess.AS

    FB-MSTCN: A Full-Band Single-Channel Speech Enhancement Method Based on Multi-Scale Temporal Convolutional Network

    Authors: Zehua Zhang, Lu Zhang, Xuyi Zhuang, Yukun Qian, Heng Li, Mingjiang Wang

    Abstract: In recent years, deep learning-based approaches have significantly improved the performance of single-channel speech enhancement. However, due to the limitation of training data and computational complexity, real-time enhancement of full-band (48 kHz) speech signals is still very challenging. Because of the low energy of spectral information in the high-frequency part, it is more difficult to dire…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: Accepted by ICASSP 2022, Deep Noise Suppression Challenge

  17. arXiv:2203.01475  [pdf, other]

    eess.IV cs.CV

    CycleMix: A Holistic Strategy for Medical Image Segmentation from Scribble Supervision

    Authors: Ke Zhang, Xiahai Zhuang

    Abstract: Curating a large set of fully annotated training data can be costly, especially for the tasks of medical image segmentation. Scribble, a weaker form of annotation, is more obtainable in practice, but training segmentation models from limited supervision of scribbles is still challenging. To address the difficulties, we propose a new framework for scribble learning-based medical image segmentation,…

    Submitted 11 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: 10 pages, 5 figures

  18. arXiv:2202.02000  [pdf, other]

    eess.IV cs.CV cs.LG

    Cross-Modality Multi-Atlas Segmentation via Deep Registration and Label Fusion

    Authors: Wangbin Ding, Lei Li, Xiahai Zhuang, Liqin Huang

    Abstract: Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation. Generally, MAS methods register multiple atlases, i.e., medical images with corresponding labels, to a target image; and the transformed atlas labels can be combined to generate target segmentation via label fusion schemes. Many conventional MAS methods employed the atlases from the same modality as the target…

    Submitted 28 March, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

  19. arXiv:2201.07890  [pdf, other]

    eess.SP cs.CV cs.LG math.FA

    Convolutional Neural Networks for Spherical Signal Processing via Spherical Haar Tight Framelets

    Authors: Jianfei Li, Han Feng, Xiaosheng Zhuang

    Abstract: In this paper, we develop a general theoretical framework for constructing Haar-type tight framelets on any compact set with a hierarchical partition. In particular, we construct a novel area-regular hierarchical partition on the 2-sphere and establish its corresponding spherical Haar tight framelets with directionality. We conclude by evaluating and illustrating the effectiveness of our area-regu…

    Submitted 17 January, 2022; originally announced January 2022.

  20. arXiv:2201.05344  [pdf, other]

    eess.IV cs.CV cs.LG

    AWSnet: An Auto-weighted Supervision Attention Network for Myocardial Scar and Edema Segmentation in Multi-sequence Cardiac Magnetic Resonance Images

    Authors: Kai-Ni Wang, Xin Yang, Juzheng Miao, Lei Li, Jing Yao, ** Zhou, Wufeng Xue, Guang-Quan Zhou, Xiahai Zhuang, Dong Ni

    Abstract: Multi-sequence cardiac magnetic resonance (CMR) provides essential pathology information (scar and edema) to diagnose myocardial infarction. However, automatic pathology segmentation can be challenging due to the difficulty of effectively exploring the underlying information from the multi-sequence CMR data. This paper aims to tackle the scar and edema segmentation from multi-sequence CMR with a n…

    Submitted 14 January, 2022; originally announced January 2022.

    Comments: 19 pages, 10 figures, accepted by Medical Image Analysis

  21. arXiv:2201.03186  [pdf, other]

    eess.IV cs.CV

    MyoPS: A Benchmark of Myocardial Pathology Segmentation Combining Three-Sequence Cardiac Magnetic Resonance Images

    Authors: Lei Li, Fuping Wu, Sihan Wang, Xinzhe Luo, Carlos Martin-Isla, Shuwei Zhai, Jianpeng Zhang, Yanfei Liu, Zhen Zhang, Markus J. Ankenbrand, Haochuan Jiang, Xiaoran Zhang, Linhong Wang, Tewodros Weldebirhan Arega, Elif Altunok, Zhou Zhao, Feiyan Li, Jun Ma, ** Yang, Elodie Puybareau, Ilkay Oksuz, Stephanie Bricq, Weisheng Li, Kumaradevan Punithakumar, Sotirios A. Tsaftaris, et al. (7 additional authors not shown)

    Abstract: Assessment of myocardial viability is essential in diagnosis and treatment management of patients suffering from myocardial infarction, and classification of pathology on myocardium is the key to this assessment. This work defines a new task of medical image analysis, i.e., to perform myocardial pathology segmentation (MyoPS) combining three-sequence cardiac magnetic resonance (CMR) images, which…

    Submitted 10 January, 2022; originally announced January 2022.

  22. arXiv:2111.04736  [pdf, other]

    eess.IV cs.CV

    Multi-Modality Cardiac Image Analysis with Deep Learning

    Authors: Lei Li, Fuping Wu, Sihang Wang, Xiahai Zhuang

    Abstract: Accurate cardiac computing, analysis and modeling from multi-modality images are important for the diagnosis and treatment of cardiac disease. Late gadolinium enhancement magnetic resonance imaging (LGE MRI) is a promising technique to visualize and quantify myocardial infarction (MI) and atrial scars. Automating quantification of MI and atrial scars can be challenging due to the low image quality…

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: Under review as a chapter of book 'Deep Learning for Medical Image Analysis, 2E'

  23. arXiv:2110.09121  [pdf, ps, other]

    cs.SD eess.AS

    KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke

    Authors: Xiaobin Zhuang, Huiran Yu, Weifeng Zhao, Tao Jiang, Peng Hu

    Abstract: An automatic pitch correction system typically includes several stages, such as pitch extraction, deviation estimation, pitch shift processing, and cross-fade smoothing. However, designing these components with strategies often requires domain expertise and they are likely to fail on corner cases. In this paper, we present KaraTuner, an end-to-end neural architecture that predicts pitch curve and…

    Submitted 26 June, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

    Comments: To be published in Proc. Interspeech 2022, Incheon, South Korea

  24. arXiv:2109.02171  [pdf, other]

    eess.IV cs.CV

    Right Ventricular Segmentation from Short- and Long-Axis MRIs via Information Transition

    Authors: Lei Li, Wangbin Ding, Liqun Huang, Xiahai Zhuang

    Abstract: Right ventricular (RV) segmentation from magnetic resonance imaging (MRI) is a crucial step for cardiac morphology and function analysis. However, automatic RV segmentation from MRI is still challenging, mainly due to the heterogeneous intensity, the complex variable shapes, and the unclear RV boundary. Moreover, current methods for the RV segmentation tend to suffer from performance degradation a…

    Submitted 5 September, 2021; originally announced September 2021.

  25. Exploring Retraining-Free Speech Recognition for Intra-sentential Code-Switching

    Authors: Zhen Huang, Xiaodan Zhuang, Daben Liu, Xiaoqiang Xiao, Yuchen Zhang, Sabato Marco Siniscalchi

    Abstract: In this paper, we present our initial efforts for building a code-switching (CS) speech recognition system leveraging existing acoustic models (AMs) and language models (LMs), i.e., no training required, and specifically targeting intra-sentential switching. To achieve such an ambitious goal, new mechanisms for foreign pronunciation generation and language model (LM) enrichment have been devised.…

    Submitted 27 August, 2021; originally announced September 2021.

    Journal ref: ICASSP2019 12-17 May 2019

  26. arXiv:2108.04016  [pdf, other]

    eess.IV cs.CV

    Deep Learning methods for automatic evaluation of delayed enhancement-MRI. The results of the EMIDEC challenge

    Authors: Alain Lalande, Zhihao Chen, Thibaut Pommier, Thomas Decourselle, Abdul Qayyum, Michel Salomon, Dominique Ginhac, Youssef Skandarani, Arnaud Boucher, Khawla Brahim, Marleen de Bruijne, Robin Camarasa, Teresa M. Correia, Xue Feng, Kibrom B. Girum, Anja Hennemuth, Markus Huellebrand, Raabid Hussain, Matthias Ivantsits, Jun Ma, Craig Meyer, Rishabh Sharma, Jixi Shi, Nikolaos V. Tsekos, Marta Varela, et al. (8 additional authors not shown)

    Abstract: A key factor for assessing the state of the heart after myocardial infarction (MI) is to measure whether the myocardium segment is viable after reperfusion or revascularization therapy. Delayed enhancement-MRI or DE-MRI, which is performed several minutes after injection of the contrast agent, provides high contrast between viable and nonviable myocardium and is therefore a method of choice to eva…

    Submitted 10 August, 2021; v1 submitted 9 August, 2021; originally announced August 2021.

    Comments: Submitted to Medical Image Analysis

  27. arXiv:2107.04232  [pdf, other]

    eess.AS

    Incorporating Multi-Target in Multi-Stage Speech Enhancement Model for Better Generalization

    Authors: Lu Zhang, Mingjiang Wang, Andong Li, Zehua Zhang, Xuyi Zhuang

    Abstract: Recent single-channel speech enhancement methods based on deep neural networks (DNNs) have achieved remarkable results, but there are still generalization problems in real scenes. Like other data-driven methods, DNN-based speech enhancement models produce significant performance degradation on untrained data. In this study, we make full use of the contribution of multi-target joint learning to the…

    Submitted 9 July, 2021; originally announced July 2021.

    Comments: Submitted to APSIPA-ASC 2021

  28. arXiv:2106.08752  [pdf, other]

    eess.IV cs.CV

    Unsupervised Domain Adaptation with Variational Approximation for Cardiac Segmentation

    Authors: Fuping Wu, Xiahai Zhuang

    Abstract: Unsupervised domain adaptation is useful in medical image segmentation. Particularly, when ground truths of the target images are not available, domain adaptation can train a target-specific model by utilizing the existing labeled images from other modalities. Most of the reported works mapped images of both the source and target domains into a common latent feature space, and then reduced their d…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: accepted by IEEE Transactions on Medical Imaging

  29. arXiv:2106.08727  [pdf, other]

    eess.IV cs.CV

    AtrialGeneral: Domain Generalization for Left Atrial Segmentation of Multi-Center LGE MRIs

    Authors: Lei Li, Veronika A. Zimmer, Julia A. Schnabel, Xiahai Zhuang

    Abstract: Left atrial (LA) segmentation from late gadolinium enhanced magnetic resonance imaging (LGE MRI) is a crucial step needed for planning the treatment of atrial fibrillation. However, automatic LA segmentation from LGE MRI is still challenging, due to the poor image quality, high variability in LA shapes, and unclear LA boundary. Though deep learning-based methods can provide promising LA segmentati…

    Submitted 4 July, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

    Comments: 10 pages, 4 figures, MICCAI2021

  30. arXiv:2106.04878  [pdf, other]

    eess.AS cs.SD

    Deep Interaction between Masking and Mapping Targets for Single-Channel Speech Enhancement

    Authors: Lu Zhang, Mingjiang Wang, Zehua Zhang, Xuyi Zhuang

    Abstract: The most recent deep neural network (DNN) models exhibit impressive denoising performance in the time-frequency (T-F) magnitude domain. However, the phase is also a critical component of the speech signal that is easily overlooked. In this paper, we propose a multi-branch dilated convolutional network (DCN) to simultaneously enhance the magnitude and phase of noisy speech. A causal and robust mona…

    Submitted 9 June, 2021; originally announced June 2021.

  31. arXiv:2105.07392  [pdf, other]

    eess.IV cs.CV

    Unsupervised Multi-Modality Registration Network based on Spatially Encoded Gradient Information

    Authors: Wangbin Ding, Lei Li, Xiahai Zhuang, Liqin Huang

    Abstract: Multi-modality medical images can provide relevant or complementary information for a target (organ, tumor or tissue). Registering multi-modality images to a common space can fuse these comprehensive information, and bring convenience for clinical application. Recently, neural networks have been widely investigated to boost registration methods. However, it is still challenging to develop a multi-…

    Submitted 29 August, 2021; v1 submitted 16 May, 2021; originally announced May 2021.

  32. arXiv:2012.08929  [pdf, other]

    eess.IV cs.CV cs.LG

    Learning-Based Algorithms for Vessel Tracking: A Review

    Authors: Dengqiang Jia, Xiahai Zhuang

    Abstract: Developing efficient vessel-tracking algorithms is crucial for imaging-based diagnosis and treatment of vascular diseases. Vessel tracking aims to solve recognition problems such as key (seed) point detection, centerline extraction, and vascular segmentation. Extensive image-processing techniques have been developed to overcome the problems of vessel tracking that are mainly attributed to the comp…

    Submitted 16 December, 2020; originally announced December 2020.

    Comments: 19 pages, 3 figures, 9 tables, accept by Computerized Medical Imaging and Graphics

  33. arXiv:2012.04094  [pdf]

    cs.CL cs.SD eess.AS

    Frame-level SpecAugment for Deep Convolutional Neural Networks in Hybrid ASR Systems

    Authors: Xinwei Li, Yuanyuan Zhang, Xiaodan Zhuang, Daben Liu

    Abstract: Inspired by SpecAugment -- a data augmentation method for end-to-end ASR systems, we propose a frame-level SpecAugment method (f-SpecAugment) to improve the performance of deep convolutional neural networks (CNN) for hybrid HMM based ASR systems. Similar to the utterance level SpecAugment, f-SpecAugment performs three transformations: time warping, frequency masking, and time masking. Instead of a…

    Submitted 7 December, 2020; originally announced December 2020.

    Comments: To appear in SLT 2021

  34. Rank-One Network: An Effective Framework for Image Restoration

    Authors: Shangqi Gao, Xiahai Zhuang

    Abstract: The principal rank-one (RO) components of an image represent the self-similarity of the image, which is an important property for image restoration. However, the RO components of a corrupted image could be decimated by the procedure of image denoising. We suggest that the RO property should be utilized and the decimation should be avoided in image restoration. To achieve this, we propose a new fra…

    Submitted 25 November, 2020; originally announced November 2020.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 2021

  35. arXiv:2011.08769  [pdf, other]

    eess.IV cs.CV

    Anatomy Prior Based U-net for Pathology Segmentation with Attention

    Authors: Yuncheng Zhou, Ke Zhang, Xinzhe Luo, Sihan Wang, Xiahai Zhuang

    Abstract: Pathological area segmentation in cardiac magnetic resonance (MR) images plays a vital role in the clinical diagnosis of cardiovascular diseases. Because of the irregular shape and small area, pathological segmentation has always been a challenging task. We propose an anatomy prior based framework, which combines the U-net segmentation network with the attention technique. Leveraging the fact that…

    Submitted 17 November, 2020; originally announced November 2020.

    Comments: 8 pages, 3 figures, to be published in STACOM 2020 (MICCAI Workshop)

    ACM Class: I.4.6

  36. arXiv:2011.08761  [pdf, other]

    eess.IV cs.CV

    Recognition and standardization of cardiac MRI orientation via multi-tasking learning and deep neural networks

    Authors: Ke Zhang, Xiahai Zhuang

    Abstract: In this paper, we study the problem of imaging orientation in cardiac MRI, and propose a framework to categorize the orientation for recognition and standardization via deep neural networks. The method uses a new multi-tasking strategy, where both the tasks of cardiac segmentation and orientation recognition are simultaneously achieved. For multiple sequences and modalities of MRI, we propose a tr…

    Submitted 17 November, 2020; originally announced November 2020.

    Comments: 10 pages, 2 figures, to be published in STACOM 2020 (MICCAI Workshop)

    ACM Class: I.4.6

  37. arXiv:2010.15647  [pdf, other]

    eess.IV cs.CV

    Brain Tumor Segmentation Network Using Attention-based Fusion and Spatial Relationship Constraint

    Authors: Chenyu Liu, Wangbin Ding, Lei Li, Zhen Zhang, Chenhao Pei, Liqin Huang, Xiahai Zhuang

    Abstract: Delineating the brain tumor from magnetic resonance (MR) images is critical for the treatment of gliomas. However, automatic delineation is challenging due to the complex appearance and ambiguous outlines of tumors. Considering that multi-modal MR images can reflect different tumor biological properties, we develop a novel multi-modal tumor segmentation network (MMTSN) to robustly segment brain tu…

    Submitted 31 October, 2020; v1 submitted 29 October, 2020; originally announced October 2020.

  38. arXiv:2008.12205  [pdf, other]

    cs.CV eess.IV

    Random Style Transfer based Domain Generalization Networks Integrating Shape and Spatial Information

    Authors: Lei Li, Veronika A. Zimmer, Wangbin Ding, Fuping Wu, Liqin Huang, Julia A. Schnabel, Xiahai Zhuang

    Abstract: Deep learning (DL)-based models have demonstrated good performance in medical image segmentation. However, the models trained on a known dataset often fail when performed on an unseen dataset collected from different centers, vendors and disease populations. In this work, we present a random style transfer network to tackle the domain generalization problem for multi-vendor and center cardiac imag…

    Submitted 3 September, 2020; v1 submitted 27 August, 2020; originally announced August 2020.

    Comments: 11 pages

  39. arXiv:2008.11966  [pdf, other]

    math.FA cs.LG eess.SP

    Adaptive directional Haar tight framelets on bounded domains for digraph signal representations

    Authors: Yuchen Xiao, Xiaosheng Zhuang

    Abstract: Based on hierarchical partitions, we provide the construction of Haar-type tight framelets on any compact set $K\subseteq \mathbb{R}^d$. In particular, on the unit block $[0,1]^d$, such tight framelets can be built to be with adaptivity and directionality. We show that the adaptive directional Haar tight framelet systems can be used for digraph signal representations. Some examples are provided to…

    Submitted 27 August, 2020; originally announced August 2020.

  40. arXiv:2008.04729  [pdf, other]

    eess.IV cs.CV

    AtrialJSQnet: A New Framework for Joint Segmentation and Quantification of Left Atrium and Scars Incorporating Spatial and Shape Information

    Authors: Lei Li, Veronika A. Zimmer, Julia A. Schnabel, Xiahai Zhuang

    Abstract: Left atrial (LA) and atrial scar segmentation from late gadolinium enhanced magnetic resonance imaging (LGE MRI) is an important task in clinical practice, to guide ablation therapy and predict treatment results for atrial fibrillation (AF) patients. The automatic segmentation is however still challenging, due to the poor image quality, the various LA shapes, the thin wall, and the surrounding…

    Submitted 12 November, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

    Comments: 12 pages

  41. arXiv:2006.13011  [pdf, other]

    eess.IV cs.CV

    Joint Left Atrial Segmentation and Scar Quantification Based on a DNN with Spatial Encoding and Shape Attention

    Authors: Lei Li, Xin Weng, Julia A. Schnabel, Xiahai Zhuang

    Abstract: We propose an end-to-end deep neural network (DNN) which can simultaneously segment the left atrial (LA) cavity and quantify LA scars. The framework incorporates the continuous spatial information of the target by introducing a spatially encoded (SE) loss based on the distance transform map. Compared to conventional binary label based loss, the proposed SE loss can reduce noisy patches in the resu…

    Submitted 23 June, 2020; originally announced June 2020.

    Comments: 10 pages

    Journal ref: MICCAI 2020

  42. arXiv:2006.12434  [pdf, other]

    eess.IV cs.CV

    Cardiac Segmentation on Late Gadolinium Enhancement MRI: A Benchmark Study from Multi-Sequence Cardiac MR Segmentation Challenge

    Authors: Xiahai Zhuang, Jiahang Xu, Xinzhe Luo, Chen Chen, Cheng Ouyang, Daniel Rueckert, Victor M. Campello, Karim Lekadir, Sulaiman Vesal, Nishant RaviKumar, Yashu Liu, Gongning Luo, Jingkun Chen, Hongwei Li, Buntheng Ly, Maxime Sermesant, Holger Roth, Wentao Zhu, Jiexiang Wang, Xinghao Ding, Xinyue Wang, Sen Yang, Lei Li

    Abstract: Accurate computing, analysis and modeling of the ventricles and myocardium from medical images are important, especially in the diagnosis and treatment management for patients suffering from myocardial infarction (MI). Late gadolinium enhancement (LGE) cardiac magnetic resonance (CMR) provides an important protocol to visualize MI. However, automated segmentation of LGE CMR is still challenging, d…

    Submitted 17 July, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: 14 pages

  43. arXiv:2004.12314  [pdf]

    cs.CV cs.LG eess.IV stat.ML

    A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance Imaging

    Authors: Zhaohan Xiong, Qing Xia, Zhiqiang Hu, Ning Huang, Cheng Bian, Yefeng Zheng, Sulaiman Vesal, Nishant Ravikumar, Andreas Maier, Xin Yang, Pheng-Ann Heng, Dong Ni, Caizi Li, Qianqian Tong, Weixin Si, Elodie Puybareau, Younes Khoudli, Thierry Geraud, Chen Chen, Wenjia Bai, Daniel Rueckert, Lingchao Xu, Xiahai Zhuang, Xinzhe Luo, Shuman Jia , et al. (19 additional authors not shown)

    Abstract: Segmentation of cardiac images, particularly late gadolinium-enhanced magnetic resonance imaging (LGE-MRI) widely used for visualizing diseased cardiac structures, is a crucial first step for clinical diagnosis and treatment. However, direct segmentation of LGE-MRIs is challenging due to its attenuated contrast. Since most clinical studies have relied on manual and labor-intensive approaches, auto…

    Submitted 7 May, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

  44. arXiv:1911.06553  [pdf, ps, other]

    physics.comp-ph cs.CE eess.SY math.NA

    A meshfree formulation for large deformation analysis of flexoelectric structures accounting for the surface effects

    Authors: Xiaoying Zhuang, S. S. Nanthakumar, Timon Rabczuk

    Abstract: In this work, we present a compactly supported radial basis function (CSRBF) based meshfree method to analyse geometrically nonlinear flexoelectric nanostructures considering surface effects. Flexoelectricity is the polarization of dielectric materials due to the gradient of strain, which is different from piezoelectricity in which polarization is dependent linearly on strain. The surface effects…

    Submitted 15 November, 2019; originally announced November 2019.

    Comments: 27 pages, 15 figures

  45. arXiv:1910.02241  [pdf, ps, other]

    cs.CV cs.LG eess.IV

    Self-supervised Feature Learning for 3D Medical Images by Playing a Rubik's Cube

    Authors: Xinrui Zhuang, Yuexiang Li, Yifan Hu, Kai Ma, Yujiu Yang, Yefeng Zheng

    Abstract: Witnessed the development of deep learning, increasing number of studies try to build computer aided diagnosis systems for 3D volumetric medical data. However, as the annotations of 3D medical data are difficult to acquire, the number of annotated 3D medical images is often not enough to well train the deep learning networks. The self-supervised learning deeply exploiting the information of raw da…

    Submitted 5 October, 2019; originally announced October 2019.

    Comments: MICCAI 2019

  46. arXiv:1910.01992  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS stat.ML

    SNDCNN: Self-normalizing deep CNNs with scaled exponential linear units for speech recognition

    Authors: Zhen Huang, Tim Ng, Leo Liu, Henry Mason, Xiaodan Zhuang, Daben Liu

    Abstract: Very deep CNNs achieve state-of-the-art results in both computer vision and speech recognition, but are difficult to train. The most popular way to train very deep CNNs is to use shortcut connections (SC) together with batch normalization (BN). Inspired by Self-Normalizing Neural Networks, we propose the self-normalizing deep CNN (SNDCNN) based acoustic model topology, by removing the SC/BN and r…

    Submitted 23 March, 2020; v1 submitted 4 October, 2019; originally announced October 2019.

  47. arXiv:1906.07347  [pdf]

    eess.IV cs.CV

    Cardiac Segmentation from LGE MRI Using Deep Neural Network Incorporating Shape and Spatial Priors

    Authors: Qian Yue, Xinzhe Luo, Qing Ye, Lingchao Xu, Xiahai Zhuang

    Abstract: Cardiac segmentation from late gadolinium enhancement MRI is an important task in clinics to identify and evaluate the infarction of myocardium. The automatic segmentation is however still challenging, due to the heterogeneous intensity distributions and indistinct boundaries in the images. In this paper, we propose a new method, based on deep neural networks (DNN), for fully automatic segmentatio…

    Submitted 25 June, 2019; v1 submitted 17 June, 2019; originally announced June 2019.

    Comments: Accepted for publication in MICCAI 2019