Search | arXiv e-print repository

Narrowing the semantic gaps in U-Net with learnable skip connections: The case of medical image segmentation

Authors: Haonan Wang, Peng Cao, Xiaoli Liu, Jinzhu Yang, Osmar Zaiane

Abstract: Most state-of-the-art methods for medical image segmentation adopt the encoder-decoder architecture. However, this U-shaped framework still has limitations in capturing the non-local multi-scale information with a simple skip connection. To solve the problem, we firstly explore the potential weakness of skip connections in U-Net on multiple segmentation tasks, and find that i) not all skip connections are useful, each skip connection has different contribution; ii) the optimal combinations of skip connections are different, relying on the specific datasets. Based on our findings, we propose a new segmentation framework, named UDTransNet, to solve three semantic gaps in U-Net. Specifically, we propose a Dual Attention Transformer (DAT) module for capturing the channel- and spatial-wise relationships to better fuse the encoder features, and a Decoder-guided Recalibration Attention (DRA) module for effectively connecting the DAT tokens and the decoder features to eliminate the inconsistency. Hence, both modules establish a learnable connection to solve the semantic gaps between the encoder and the decoder, which leads to a high-performance segmentation model for medical images. Comprehensive experimental results indicate that our UDTransNet produces higher evaluation scores and finer segmentation results with relatively fewer parameters over the state-of-the-art segmentation methods on different public datasets. Code: https://github.com/McGregorWwww/UDTransNet.

Submitted 23 December, 2023; originally announced December 2023.

arXiv:2310.12877 [pdf, other]

Perceptual Assessment and Optimization of HDR Image Rendering

Authors: Peibei Cao, Rafal K. Mantiuk, Kede Ma

Abstract: High dynamic range (HDR) rendering has the ability to faithfully reproduce the wide luminance ranges in natural scenes, but how to accurately assess the rendering quality is relatively underexplored. Existing quality models are mostly designed for low dynamic range (LDR) images, and do not align well with human perception of HDR image quality. To fill this gap, we propose a family of HDR quality metrics, in which the key step is employing a simple inverse display model to decompose an HDR image into a stack of LDR images with varying exposures. Subsequently, these decomposed images are assessed through well-established LDR quality metrics. Our HDR quality models present three distinct benefits. First, they directly inherit the recent advancements of LDR quality metrics. Second, they do not rely on human perceptual data of HDR image quality for re-calibration. Third, they facilitate the alignment and prioritization of specific luminance ranges for more accurate and detailed quality assessment. Experimental results show that our HDR quality metrics consistently outperform existing models in terms of quality assessment on four HDR image quality datasets and perceptual optimization of HDR novel view synthesis.

Submitted 16 June, 2024; v1 submitted 19 October, 2023; originally announced October 2023.
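
The decomposition step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, assuming a plain gamma curve as the inverse display model; the exposure values, the gamma of 2.2, the averaging of per-exposure scores, and all function names are hypothetical choices for the sketch, not the authors' implementation.

```python
def ldr_stack(hdr_luminance, exposures=(0.25, 1.0, 4.0), gamma=2.2):
    """Decompose linear HDR luminance values into several LDR exposures.

    Assumption: the inverse display model is scale -> clip to [0, 1] -> gamma encode.
    """
    stack = []
    for exposure in exposures:
        ldr = []
        for value in hdr_luminance:
            scaled = min(max(value * exposure, 0.0), 1.0)  # clip to displayable range
            ldr.append(scaled ** (1.0 / gamma))            # gamma-encode to pixel value
        stack.append(ldr)
    return stack


def mse(a, b):
    """Stand-in LDR metric; any established LDR quality metric could be plugged in."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def stack_quality(reference_hdr, test_hdr, ldr_metric=mse):
    """Score each exposure pair with an LDR metric and average (assumed pooling)."""
    ref_stack = ldr_stack(reference_hdr)
    test_stack = ldr_stack(test_hdr)
    scores = [ldr_metric(r, t) for r, t in zip(ref_stack, test_stack)]
    return sum(scores) / len(scores)
```

Because the metric is passed in as a function, a newer LDR metric can replace `mse` without touching the decomposition, which mirrors the first benefit the abstract lists.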

arXiv:2309.00514 [pdf]

A Machine Vision Method for Correction of Eccentric Error: Based on Adaptive Enhancement Algorithm

Authors: Fanyi Wang, Pin Cao, Yihui Zhang, Haotian Hu, Yongying Yang

Abstract: In the procedure of surface defects detection for large-aperture aspherical optical elements, it is of vital significance to adjust the optical axis of the element to be coaxial with the mechanical spin axis accurately. Therefore, a machine vision method for eccentric error correction is proposed in this paper. Focusing on the severe defocus blur of the reference crosshair image caused by the imaging characteristic of the aspherical optical element, which may lead to the failure of correction, an Adaptive Enhancement Algorithm (AEA) is proposed to strengthen the crosshair image. AEA consists of the existing Guided Filter Dark Channel Dehazing Algorithm (GFA) and the proposed lightweight Multi-scale Densely Connected Network (MDC-Net). The enhancement effect of GFA is excellent but time-consuming, and the enhancement effect of MDC-Net is slightly inferior but strongly real-time. As AEA will be executed dozens of times during each correction procedure, its real-time performance is very important. Therefore, by setting the empirical threshold of the definition evaluation function SMD2, GFA and MDC-Net are respectively applied to highly and slightly blurred crosshair images so as to ensure the enhancement effect while saving as much time as possible. AEA has certain robustness in time-consuming performance, taking an average time of 0.2721 s and 0.0963 s to execute GFA and MDC-Net, respectively, on ten 200 × 200 pixel Region of Interest (ROI) images with different degrees of blur. The eccentricity error can be reduced to within 10 μm by our method.

Submitted 1 September, 2023; originally announced September 2023.
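
The threshold-based dispatch between GFA and MDC-Net described in the abstract above can be sketched as below. This assumes the commonly cited form of SMD2 (the sum over pixels of the product of horizontal and vertical absolute gray-level differences); the threshold value is a hypothetical placeholder, since the abstract only says it is set empirically, and the function names are illustrative.

```python
def smd2(image):
    """Sharpness score for a 2D list of gray levels.

    Assumed definition: sum of |I(x,y) - I(x+1,y)| * |I(x,y) - I(x,y+1)|.
    Higher values indicate a sharper (less blurred) image.
    """
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows - 1):
        for x in range(cols - 1):
            dx = abs(image[y][x] - image[y][x + 1])
            dy = abs(image[y][x] - image[y + 1][x])
            total += dx * dy
    return total


def choose_enhancer(image, threshold):
    """Route slightly blurred images to the fast network, heavily blurred ones to GFA."""
    return "MDC-Net" if smd2(image) >= threshold else "GFA"
```

With this routing, the slower GFA path runs only on the images that score below the sharpness threshold, which is how the abstract describes trading enhancement quality against run time.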

arXiv:2301.06943 [pdf, other]

Self-supervised Domain Adaptation for Breaking the Limits of Low-quality Fundus Image Quality Enhancement

Authors: Qingshan Hou, Peng Cao, Jiaqi Wang, Xiaoli Liu, Jinzhu Yang, Osmar R. Zaiane

Abstract: Retinal fundus images have been applied for the diagnosis and screening of eye diseases, such as Diabetic Retinopathy (DR) or Diabetic Macular Edema (DME). However, both low-quality fundus images and style inconsistency potentially increase uncertainty in the diagnosis of fundus disease and even lead to misdiagnosis by ophthalmologists. Most of the existing image enhancement methods mainly focus on improving the image quality by leveraging the guidance of high-quality images, which is difficult to be collected in medical applications. In this paper, we tackle image quality enhancement in a fully unsupervised setting, i.e., neither paired images nor high-quality images. To this end, we explore the potential of the self-supervised task for improving the quality of fundus images without the requirement of high-quality reference images. Specifically, we construct multiple patch-wise domains via an auxiliary pre-trained quality assessment network and a style clustering. To achieve robust low-quality image enhancement and address style inconsistency, we formulate two self-supervised domain adaptation tasks to disentangle the features of image content, low-quality factor and style information by exploring intrinsic supervision signals within the low-quality images. Extensive experiments are conducted on EyeQ and Messidor datasets, and results show that our DASQE method achieves new state-of-the-art performance when only low-quality images are available.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2301.00592 [pdf, other]

Edge Enhanced Image Style Transfer via Transformers

Authors: Chiyu Zhang, Jun Yang, Zaiyan Dai, Peng Cao

Abstract: In recent years, arbitrary image style transfer has attracted more and more attention. Given a pair of content and style images, a stylized one is hoped that retains the content from the former while catching style patterns from the latter. However, it is difficult to simultaneously keep well the trade-off between the content details and the style features. To stylize the image with sufficient style patterns, the content details may be damaged and sometimes the objects of images can not be distinguished clearly. For this reason, we present a new transformer-based method named STT for image style transfer and an edge loss which can enhance the content details apparently to avoid generating blurred results for excessive rendering on style features. Qualitative and quantitative experiments demonstrate that STT achieves comparable performance to state-of-the-art image style transfer methods while alleviating the content leak problem.

Submitted 2 January, 2023; originally announced January 2023.

arXiv:2212.03357 [pdf, other]

Contactless Oxygen Monitoring with Gated Transformer

Authors: Hao He, Yuan Yuan, Ying-Cong Chen, Peng Cao, Dina Katabi

Abstract: With the increasing popularity of telehealth, it becomes critical to ensure that basic physiological signals can be monitored accurately at home, with minimal patient overhead. In this paper, we propose a contactless approach for monitoring patients' blood oxygen at home, simply by analyzing the radio signals in the room, without any wearable devices. We extract the patients' respiration from the radio signals that bounce off their bodies and devise a novel neural network that infers a patient's oxygen estimates from their breathing signal. Our model, called Gated BERT-UNet, is designed to adapt to the patient's medical indices (e.g., gender, sleep stages). It has multiple predictive heads and selects the most suitable head via a gate controlled by the person's physiological indices. Extensive empirical results show that our model achieves high accuracy on both medical and radio datasets.

Submitted 6 December, 2022; originally announced December 2022.

Comments: 19 pages, Workshop on Learning from Time Series for Health, NeurIPS 2022

arXiv:2211.16881 [pdf, other]

doi 10.1007/978-3-031-34344-5_28

Generalized Deep Learning-based Proximal Gradient Descent for MR Reconstruction

Authors: Guanxiong Luo, Mengmeng Kuang, Peng Cao

Abstract: The data consistency for the physical forward model is crucial in inverse problems, especially in MR imaging reconstruction. The standard way is to unroll an iterative algorithm into a neural network with a forward model embedded. The forward model always changes in clinical practice, so the learning component's entanglement with the forward model makes the reconstruction hard to generalize. The deep learning-based proximal gradient descent was proposed and uses a network as a regularization term that is independent of the forward model, which makes it more generalizable for different MR acquisition settings. This one-time pre-trained regularization is applied to different MR acquisition settings and was compared to conventional L1 regularization, showing ~3 dB improvement in the peak signal-to-noise ratio. We also demonstrated the flexibility of the proposed method in choosing different undersampling patterns.

Submitted 18 March, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

Comments: Keywords: MRI reconstruction, Deep Learning, Proximal gradient descent, Learned regularization term
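
The separation between forward model and regularizer that the abstract above relies on can be illustrated with a generic proximal gradient loop. This is a minimal sketch under stated assumptions: the forward model is a small dense matrix `A`, and soft-thresholding (the proximal operator of the conventional L1 penalty mentioned as the baseline) stands in for the regularization step. In the paper's setting a pre-trained network would take the place of `prox`; that network is not reproduced here, and all names are illustrative.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (the conventional L1 baseline)."""
    return [max(abs(v) - tau, 0.0) * (1.0 if v > 0 else -1.0) for v in x]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def proximal_gradient(A, y, prox, step, n_iter=200):
    """Minimize 0.5 * ||A x - y||^2 + R(x), where `prox` is the proximal map of step * R.

    The forward model A appears only in the gradient step, so `prox`
    can be swapped or reused across different A without retraining.
    """
    x = [0.0] * len(A[0])
    At = transpose(A)
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = matvec(At, residual)                         # data-consistency gradient
        x = prox([xi - step * g for xi, g in zip(x, grad)])  # regularization step
    return x
```

For example, with an identity forward model, `proximal_gradient([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.05], lambda z: soft_threshold(z, 0.1), step=1.0)` shrinks the large entry toward 0.9 and zeroes the small one.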

arXiv:2206.09146 [pdf, other]

A Perceptually Optimized and Self-Calibrated Tone Mapping Operator

Authors: Peibei Cao, Chenyang Le, Yuming Fang, Kede Ma

Abstract: With the increasing popularity and accessibility of high dynamic range (HDR) photography, tone mapping operators (TMOs) for dynamic range compression are practically demanding. In this paper, we develop a two-stage neural network-based TMO that is self-calibrated and perceptually optimized. In Stage one, motivated by the physiology of the early stages of the human visual system, we first decompose an HDR image into a normalized Laplacian pyramid. We then use two lightweight deep neural networks (DNNs), taking the normalized representation as input and estimating the Laplacian pyramid of the corresponding LDR image. We optimize the tone mapping network by minimizing the normalized Laplacian pyramid distance (NLPD), a perceptual metric aligning with human judgments of tone-mapped image quality. In Stage two, the input HDR image is self-calibrated to compute the final LDR image. We feed the same HDR image but rescaled with different maximum luminances to the learned tone mapping network, and generate a pseudo-multi-exposure image stack with different detail visibility and color saturation. We then train another lightweight DNN to fuse the LDR image stack into a desired LDR image by maximizing a variant of the structural similarity index for multi-exposure image fusion (MEF-SSIM), which has been proven perceptually relevant to fused image quality. The proposed self-calibration mechanism through MEF enables our TMO to accept uncalibrated HDR images, while being physiology-driven.
Extensive experiments show that our method produces images with consistently better visual quality. Additionally, since our method builds upon three lightweight DNNs, it is among the fastest local TMOs.

Submitted 25 August, 2023; v1 submitted 18 June, 2022; originally announced June 2022.

Comments: 15 pages,17 figures

arXiv:2109.04335 [pdf, other]

UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspective with Transformer

Authors: Haonan Wang, Peng Cao, Jiaqi Wang, Osmar R. Zaiane

Abstract: Most recent semantic segmentation methods adopt a U-Net framework with an encoder-decoder architecture. It is still challenging for U-Net with a simple skip connection scheme to model the global multi-scale context: 1) Not each skip connection setting is effective due to the issue of incompatible feature sets of encoder and decoder stage, even some skip connection negatively influence the segmentation performance; 2) The original U-Net is worse than the one without any skip connection on some datasets. Based on our findings, we propose a new segmentation framework, named UCTransNet (with a proposed CTrans module in U-Net), from the channel perspective with attention mechanism. Specifically, the CTrans module is an alternate of the U-Net skip connections, which consists of a sub-module to conduct the multi-scale Channel Cross fusion with Transformer (named CCT) and a sub-module Channel-wise Cross-Attention (named CCA) to guide the fused multi-scale channel-wise information to effectively connect to the decoder features for eliminating the ambiguity. Hence, the proposed connection consisting of the CCT and CCA is able to replace the original skip connection to solve the semantic gaps for an accurate automatic medical image segmentation. The experimental results suggest that our UCTransNet produces more precise segmentation performance and achieves consistent improvements over the state-of-the-art for semantic segmentation across different datasets and conventional architectures involving transformer or U-shaped framework.
Code: https://github.com/McGregorWwww/UCTransNet.

Submitted 24 January, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

Comments: Accepted by AAAI 2022. Code is available at https://github.com/McGregorWwww/UCTransNet

arXiv:2105.09930 [pdf, other]

Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries

Authors: Sukhdeep S. Sodhi, Ellie Ka-In Chio, Ambarish Jash, Santiago Ontañón, Ajit Apte, Ankit Kumar, Ayooluwakunmi Jeje, Dima Kuzmin, Harry Fung, Heng-Tze Cheng, Jon Effrat, Tarush Bali, Nitin Jindal, Pei Cao, Sarvjeet Singh, Senqiang Zhou, Tameen Khan, Amol Wankhede, Moustafa Alzantot, Allen Wu, Tushar Chandra

Abstract: As more and more online search queries come from voice, automatic speech recognition becomes a key component to deliver relevant search results. Errors introduced by automatic speech recognition (ASR) lead to irrelevant search results returned to the user, thus causing user dissatisfaction. In this paper, we introduce an approach, Mondegreen, to correct voice queries in text space without depending on audio signals, which may not always be available due to system constraints or privacy or bandwidth (for example, some ASR systems run on-device) considerations. We focus on voice queries transcribed via several proprietary commercial ASR systems. These queries come from users making internet, or online service search queries. We first present an analysis showing how different the language distribution coming from user voice queries is from that in traditional text corpora used to train off-the-shelf ASR systems. We then demonstrate that Mondegreen can achieve significant improvements in increased user interaction by correcting user voice queries in one of the largest search systems in Google. Finally, we see Mondegreen as complementing existing highly-optimized production ASR systems, which may not be frequently retrained and thus lag behind due to vocabulary drifts.

Submitted 20 May, 2021; originally announced May 2021.

Comments: Accepted in KDD 2021

arXiv:2008.13570 [pdf]

Deep Learning based Spectral CT Imaging

Authors: Weiwen Wu, Dianlin Hu, Chuang Niu, Lieza Vanden Broeke, Anthony P. H. Butler, Peng Cao, James Atlas, Alexander Chernoglazov, Varut Vardhanabhuti, Ge Wang

Abstract: Spectral computed tomography (CT) has attracted much attention in radiation dose reduction, metal artifacts removal, tissue quantification and material discrimination. The x-ray energy spectrum is divided into several bins, and each energy-bin-specific projection has a lower signal-to-noise ratio (SNR) than the current-integrating counterpart, which makes image reconstruction a unique challenge. Traditional wisdom is to use prior knowledge based iterative methods. However, this kind of method demands a great computational cost. Inspired by deep learning, here we first develop a deep learning based reconstruction method; i.e., U-net with $L_p^p$-norm, Total variation, Residual learning, and Anisotropic adaption (ULTRA). Specifically, we emphasize the Various Multi-scale Feature Fusion and Multichannel Filtering Enhancement with a denser connection encoding architecture for residual learning and feature fusion. To address the image deblurring problem associated with the $L_2^2$-loss, we propose a general $L_p^p$-loss, $p>0$. Furthermore, the images from different energy bins share similar structures of the same object, so the regularization characterizing correlations of different energy bins is incorporated into the $L_p^p$-loss function, which helps unify the deep learning based methods with traditional compressed sensing based methods. Finally, the anisotropically weighted total variation is employed to characterize the sparsity in the spatial-spectral domain to regularize the proposed network.
In particular, we validate our ULTRA networks on three large-scale spectral CT datasets, and obtain excellent results relative to the competing algorithms. In conclusion, our quantitative and qualitative results in numerical simulation and preclinical experiments demonstrate that our proposed approach is accurate, efficient and robust for high-quality spectral CT image reconstruction.

Submitted 25 August, 2021; v1 submitted 27 August, 2020; originally announced August 2020.

arXiv:2002.06330 [pdf, other]

Development of Conditional Random Field Insert for UNet-based Zonal Prostate Segmentation on T2-Weighted MRI

Authors: Peng Cao, Susan M. Noworolski, Olga Starobinets, Natalie Korn, Sage P. Kramer, Antonio C. Westphalen, Andrew P. Leynes, Valentina Pedoia, Peder Larson

Abstract: Purpose: A conventional 2D UNet convolutional neural network (CNN) architecture may result in ill-defined boundaries in segmentation output. Several studies imposed stronger constraints on each level of UNet to improve the performance of 2D UNet, such as SegNet. In this study, we investigated 2D SegNet and a proposed conditional random field insert (CRFI) for zonal prostate segmentation from clinical T2-weighted MRI data. Methods: We introduced a new methodology that combines SegNet and CRFI to improve the accuracy and robustness of the segmentation. CRFI has feedback connections that encourage the data consistency at multiple levels of the feature pyramid. On the encoder side of the SegNet, the CRFI combines the input feature maps and convolution block output based on their spatial local similarity, like a trainable bilateral filter. For all networks, 725 2D images (i.e., 29 MRI cases) were used in training; while, 174 2D images (i.e., 6 cases) were used in testing. Results: The SegNet with CRFI achieved the relatively high Dice coefficients (0.76, 0.84, and 0.89) for the peripheral zone, central zone, and whole gland, respectively. Compared with UNet, the SegNet+CRFIs segmentation has generally higher Dice score and showed the robustness in determining the boundaries of anatomical structures compared with the SegNet or UNet segmentation. The SegNet with a CRFI at the end showed the CRFI can correct the segmentation errors from SegNet output, generating smooth and consistent segmentation for the prostate.
Conclusion: UNet based deep neural networks demonstrated in this study can perform zonal prostate segmentation, achieving high Dice coefficients compared with those in the literature. The proposed CRFI method can reduce the fuzzy boundaries that affected the segmentation performance of baseline UNet and SegNet models.

Submitted 15 February, 2020; originally announced February 2020.
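
For readers unfamiliar with the Dice coefficient reported in the abstract above, a minimal sketch of the standard definition, 2|P ∩ T| / (|P| + |T|), for binary masks follows. The flat-list mask representation and the convention of returning 1.0 when both masks are empty are assumptions made for this illustration, not details taken from the paper.

```python
def dice(pred, truth):
    """Dice coefficient between two binary masks given as flat sequences of 0/1."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.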

arXiv:1911.12389 [pdf]

doi 10.1002/mp.13756

Simultaneous Segmentation and Relaxometry for MRI through Multitask Learning

Authors: Peng Cao, Jing Liu, Shuyu Tang, Andrew Leynes, Janine M. Lupo, Duan Xu, Peder E. Z. Larson

Abstract: Purpose: This study demonstrated an MR signal multitask learning method for 3D simultaneous segmentation and relaxometry of human brain tissues. Materials and Methods: A 3D inversion-prepared balanced steady-state free precession sequence was used for acquiring in vivo multi-contrast brain images. The deep neural network contained 3 residual blocks, and each block had 8 fully connected layers with sigmoid activation, layer norm, and 256 neurons in each layer. Online synthesized MR signal evolutions and labels were used to train the neural network batch-by-batch. Empirically defined ranges of T1 and T2 values for the normal gray matter, white matter and cerebrospinal fluid (CSF) were used as the prior knowledge. MRI brain experiments were performed on 3 healthy volunteers as well as animal (N=6) and prostate patient (N=1) experiments. Results: In animal validation experiment, the differences/errors (mean difference $\pm$ standard deviation of difference) between the T1 and T2 values estimated from the proposed method and the ground truth were 113 $\pm$ 486 and 154 $\pm$ 512 ms for T1, and 5 $\pm$ 33 and 7 $\pm$ 41 ms for T2, respectively. In healthy volunteer experiments (N=3), whole brain segmentation and relaxometry were finished within ~5 seconds. The estimated apparent T1 and T2 maps were in accordance with known brain anatomy, and not affected by coil sensitivity variation. Gray matter, white matter, and CSF were successfully segmented. The deep neural network can also generate synthetic T1 and T2 weighted images.
Conclusion: The proposed multitask learning method can directly generate brain apparent T1 and T2 maps, as well as synthetic T1 and T2 weighted images, in conjunction with segmentation of gray matter, white matter and CSF.

Submitted 27 November, 2019; originally announced November 2019.

Journal ref: Med Phys. 2019 Oct; 46(10):4610-4621. PMID: 31396973

arXiv:1909.01127 [pdf, other]

doi 10.1002/mrm.28274

MRI Reconstruction Using Deep Bayesian Estimation

Authors: GuanXiong Luo, Na Zhao, Wenhao Jiang, Edward S. Hui, Peng Cao

Abstract: Purpose: To develop a deep learning-based Bayesian inference for MRI reconstruction. Methods: We modeled the MRI reconstruction problem with Bayes's theorem, following the recently proposed PixelCNN++ method. The image reconstruction from incomplete k-space measurement was obtained by maximizing the posterior possibility. A generative network was utilized as the image prior, which was computationally tractable, and the k-space data fidelity was enforced by using an equality constraint. The stochastic backpropagation was utilized to calculate the descent gradient in the process of maximum a posterior, and a projected subgradient method was used to impose the equality constraint. In contrast to the other deep learning reconstruction methods, the proposed one used the likelihood of prior as the training loss and the objective function in reconstruction to improve the image quality. Results: The proposed method showed an improved performance in preserving image details and reducing aliasing artifacts, compared with GRAPPA, $\ell_1$-ESPRiT, and MODL, a state-of-the-art deep learning reconstruction method. The proposed method generally achieved more than 5 dB peak signal-to-noise ratio improvement for compressed sensing and parallel imaging reconstructions compared with the other methods. Conclusion: The Bayesian inference significantly improved the reconstruction performance, compared with the conventional $\ell_1$-sparsity prior in compressed sensing reconstruction tasks.
More importantly, the proposed reconstruction framework can be generalized for most MRI reconstruction scenarios.

Submitted 17 February, 2022; v1 submitted 3 September, 2019; originally announced September 2019.

arXiv:1907.11861 [pdf, ps, other]

Deep convolution neural network model for automatic risk assessment of patients with non-metastatic nasopharyngeal carcinoma

Authors: Richard Du, Peng Cao, Lujun Han, Qiyong Ai, Ann D. King, Varut Vardhanabhuti

Abstract: Nasopharyngeal Carcinoma (NPC) is an endemic cancer in south-east Asia. With the advent of intensity-modulated radiotherapy, excellent locoregional control is being achieved. Consequently, this has led to pretreatment clinical staging classification being less prognostic of outcomes such as recurrence after treatment. Alternative pretreatment strategies for prognosis of NPC after treatment are needed to provide better risk stratification for NPC. In this study we proposed a deep convolution neural network model based on contrast-enhanced T1 (T1C) and T2 weighted (T2) MRI scans to predict 3-year disease progression of NPC patients after primary treatment. We retrospectively obtained 596 non-metastatic NPC patients from four independent centres in Hong Kong and China. Our model first performs a segmentation of the primary NPC tumour to localise the tumour, and then uses the segmentation mask as prior knowledge along with the T1C and T2 scan to classify 3-year disease progression. For segmentation, we adapted and modified a VNet to encode both T1C and T2 scan and also encoding to classify T and overall stage classification. Our modified network performed better than baseline VNet with T1C and network with no T and overall classification. The classification result for 3-year disease progression achieved an AUC of 0.828 in the validation set but did not generalise well to the test set, which consists of 146 patients from a different centre to the training data (AUC = 0.69).
Our preliminary results show that deep learning may offer prognostication of disease progression of NPC patients after treatment. One advantage of our model is that it does not require manual segmentation of the region of interest, hence reducing clinician's burden. Further development in generalising to multicentre data sets is needed before clinical application of deep learning models in assessment of NPC.

Submitted 27 July, 2019; originally announced July 2019.

Comments: Medical Imaging with Deep Learning 2019 - Extended Abstract. MIDL 2019 [arXiv:1907.08612]

Report number: MIDL/2019/ExtendedAbstract/S1xEkdTpYN

arXiv:1806.09258 [pdf]

doi 10.1109/TNS.2019.2900657

Readout Electronics for CBM-TOF Super Module Quality Evaluation

Authors: Wei Jiang, Xiru Huang, ** Cao, Chao Li, Junru Wang, Jiawen Li, Jianhui Yuan, Qi An

Abstract: A super module assembled with MRPC detectors is a component of the TOF (Time of Flight) system for the Compressed Baryonic Matter (CBM) experiment. Before the super modules are deployed in CBM-TOF, their quality needs to be evaluated. The readout electronics face the challenge of transmitting data at a maximal speed of 6 Gbps. In this paper, the readout method for CBM-TOF super module quality evaluation is presented. A parallel architecture based on Gigabit Ethernet is designed to meet the data transmission rate requirement. First, the data are sent from the front-end electronics to four readout module groups via optical fibers at a maximal rate of 1.5 Gbps per fiber. Next, the data are further distributed to sixteen parallel daughter readout modules so that the maximal data throughput of each daughter readout module is 375 Mbps, within its capacity of 550 Mbps. Finally, the daughter readout modules send data to the data acquisition (DAQ) software through standard Gigabit Ethernet. The proposed readout method scales well, so it can meet different requirements for variable data rates. The preliminary test result shows that each of the four parallel readout groups can transmit data at 1.6 Gbps, which indicates that the overall readout system can meet the requirement for a maximal data-transmission speed of 6 Gbps.
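The throughput figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the function name `readout_budget` and its parameter names are hypothetical, and it assumes that the load is split evenly across groups and daughter modules and that 1 Gbps = 1000 Mbps.

```python
def readout_budget(total_gbps=6.0, n_groups=4, n_daughters=16,
                   daughter_capacity_mbps=550.0, measured_group_gbps=1.6):
    """Split a total data rate evenly across readout groups and daughter modules.

    Default values are the figures stated in the abstract. The even split and
    the 1 Gbps = 1000 Mbps conversion are assumptions of this sketch.
    """
    per_fiber_gbps = total_gbps / n_groups                # rate per optical fiber
    per_daughter_mbps = total_gbps * 1000 / n_daughters   # rate per daughter module
    return {
        "per_fiber_gbps": per_fiber_gbps,
        "per_daughter_mbps": per_daughter_mbps,
        # True if each daughter module stays within its stated capacity
        "fits_capacity": per_daughter_mbps <= daughter_capacity_mbps,
        # Aggregate rate implied by the measured per-group rate
        "measured_total_gbps": measured_group_gbps * n_groups,
    }


if __name__ == "__main__":
    for name, value in readout_budget().items():
        print(f"{name}: {value}")
```

With the abstract's numbers this gives 1.5 Gbps per fiber and 375 Mbps per daughter module, inside the 550 Mbps capacity, and four groups at the measured 1.6 Gbps together exceed the 6 Gbps requirement.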

Submitted 8 November, 2023; v1 submitted 24 June, 2018; originally announced June 2018.

arXiv:1806.09250 [pdf]

doi 10.1109/TNS.2019.2900480

Electronics of Time-of-flight Measurement for Back-n at CSNS

Authors: T. Yu, P. Cao, X. Y. Ji, L. K. Xie, X. R. Huang, Q. An, H. Y. Bai, J. Bao, Y. H. Chen, P. J. Cheng, Z. Q. Cui, R. R. Fan, C. Q. Feng, M. H. Gu, Z. J. Han, G. Z. He, Y. C. He, Y. F. He, H. X. Huang, W. L. Huang, X. L. Ji, H. Y. Jiang, W. Jiang, H. Y. **g, L. Kang , et al. (46 additional authors not shown)

Abstract: Back-n is a white neutron experimental facility at the China Spallation Neutron Source (CSNS). The time structure of the primary proton beam makes the TOF (time-of-flight) method well suited to measuring neutron energy. We implement the electronics of TOF measurement on the general-purpose readout electronics designed for all seven detectors in Back-n. The electronics are based on the PXIe (Peripheral Component Interconnect Express eXtensions for Instrumentation) platform, which is composed of FDM (Field Digitizer Modules), TCM (Trigger and Clock Module), and SCM (Signal Conditioning Module). The T0 signal, synchronous to the CSNS accelerator, represents the neutron emission from the target and serves as the start time stamp. The TCM receives, synchronizes and distributes the T0 signal to each FDM over the PXIe backplane bus. Meanwhile, the conditioned detector signals are fed into the FDMs for waveform digitizing. The first sample point of the signal is the stop time stamp. From the start and stop time stamps and the signal's time over threshold, the total TOF can be obtained. An FPGA-based (Field Programmable Gate Array) TDC is implemented on the TCM to accurately acquire the time interval between the asynchronous T0 signal and the global synchronous clock phase. There is also an FPGA-based TDC on the FDM to accurately acquire the time interval between T0 arriving at the FDM and the first sample point of the detector signal; the signal's over-threshold time is obtained offline.
This method for TOF measurement is efficient and requires no additional modules. Test results show that the TOF accuracy is sub-nanosecond and meets the requirement for Back-n at CSNS.
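The abstract states that TOF is used to measure neutron energy but does not give the conversion. As a minimal sketch, the standard relativistic relation E = (γ − 1)·m·c², with β = L/(c·t), can be coded as follows. The function name and the flight-path length used in the example are hypothetical and are not taken from the paper.

```python
import math

C_M_PER_S = 299_792_458.0        # speed of light (exact by definition)
NEUTRON_MASS_MEV = 939.56542     # neutron rest energy in MeV (rounded)


def neutron_kinetic_energy_mev(flight_path_m: float, tof_s: float) -> float:
    """Relativistic kinetic energy of a neutron from flight path and time of flight."""
    if flight_path_m <= 0 or tof_s <= 0:
        raise ValueError("flight path and TOF must be positive")
    beta = flight_path_m / (tof_s * C_M_PER_S)   # velocity as a fraction of c
    if beta >= 1.0:
        raise ValueError("TOF too short: implied speed is not below c")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma - 1.0) * NEUTRON_MASS_MEV


if __name__ == "__main__":
    # Hypothetical flight path, chosen only for illustration.
    path_m = 80.0
    for tof in (1e-3, 1e-5, 1e-6):
        energy = neutron_kinetic_energy_mev(path_m, tof)
        print(f"TOF = {tof:.0e} s -> E = {energy:.6g} MeV")
```

Because energy rises steeply as TOF shrinks, a fixed timing uncertainty matters most for the fastest neutrons, which is why sub-nanosecond accuracy is the quantity of interest here.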

Submitted 24 June, 2018; originally announced June 2018.

Comments: 4 pages, 13 figures, 21st IEEE Real Time Conference

arXiv:1806.09249 [pdf]

T0 Fan-out for Back-n White Neutron Facility at CSNS

Authors: X. Y. Ji, P. Cao, T. Yu, L. K. Xie, X. R. Huang, Q. An, H. Y. Bai, J. Bao, Y. H. Chen, P. J. Cheng, Z. Q. Cui, R. R. Fan, C. Q. Feng, M. H. Gu, Z. J. Han, G. Z. He, Y. C. He, Y. F. He, H. X. Huang, W. L. Huang, X. L. Ji, H. Y. Jiang, W. Jiang, H. Y. **g, L. Kang , et al. (46 additional authors not shown)

Abstract: The main physics goal of the Back-n white neutron facility at the China Spallation Neutron Source (CSNS) is to measure nuclear data. Neutron energy is one of the most important parameters for measuring nuclear data, and it is obtained with the time-of-flight (TOF) method. The time when proton bunches hit the thick tungsten target is taken as the start point of TOF. The T0 signal, generated by the CSNS accelerator, represents this start time. The T0 signal is also used as the gate control signal that triggers the readout electronics. The timing precision of T0 therefore directly affects the measurement precision of TOF and controls the running of the readout electronics. In this paper, the T0 fan-out for the Back-n white neutron facility at CSNS is proposed. The T0 signal travelling from the CSNS accelerator is fanned out to the two underground experiment stations over long cables. To guarantee the timing precision, the T0 signal is conditioned to have a good signal edge. Furthermore, signal pre-emphasis and equalization are used to improve signal quality after T0 is transmitted over long cables of about 100 m. Experiments show that the T0 fan-out works well: the T0 signal transmitted over 100 m retains a good time resolution, with a standard deviation of 25 ps, which meets the accuracy required for the TOF measurement.

Submitted 24 June, 2018; originally announced June 2018.

Comments: 3 pages, 6 figures, the 21st IEEE Real Time Conference

Showing 1–18 of 18 results for author: Cao, P