Search | arXiv e-print repository

Low-dose CT Denoising with Language-engaged Dual-space Alignment

Authors: Zhihao Chen, Tao Chen, Chenhui Wang, Chuang Niu, Ge Wang, Hongming Shan

Abstract: While various deep learning methods were proposed for low-dose computed tomography (CT) denoising, they often suffer from over-smoothing, blurring, and lack of explainability. To alleviate these issues, we propose a plug-and-play Language-Engaged Dual-space Alignment loss (LEDA) to optimize low-dose CT denoising models. Our idea is to leverage large language models (LLMs) to align denoised CT and normal dose CT images in both the continuous perceptual space and discrete semantic space, which is the first LLM-based scheme for low-dose CT denoising. LEDA involves two steps: the first is to pretrain an LLM-guided CT autoencoder, which can encode a CT image into continuous high-level features and quantize them into a token space to produce semantic tokens derived from the LLM's vocabulary; and the second is to minimize the discrepancy between the denoised CT images and normal dose CT in terms of both encoded high-level features and quantized token embeddings derived by the LLM-guided CT autoencoder. Extensive experimental results on two public LDCT denoising datasets demonstrate that our LEDA can enhance existing denoising models in terms of quantitative metrics and qualitative evaluation, and also provide explainability through language-level image understanding. Source code is available at https://github.com/hao1635/LEDA.

Submitted 10 March, 2024; originally announced March 2024.

Comments: 11 pages, 6 figures

arXiv:2402.16212 [pdf, other]

Photon-counting CT using a Conditional Diffusion Model for Super-resolution and Texture-preservation

Authors: Christopher Wiedeman, Chuang Niu, Mengzhou Li, Bruno De Man, Jonathan S Maltz, Ge Wang

Abstract: Ultra-high resolution images are desirable in photon counting CT (PCCT), but resolution is physically limited by interactions such as charge sharing. Deep learning is a possible method for super-resolution (SR), but sourcing paired training data that adequately models the target task is difficult. Additionally, SR algorithms can distort noise texture, which is important in many clinical diagnostic scenarios. Here, we train conditional denoising diffusion probabilistic models (DDPMs) for PCCT super-resolution, with the objective to retain textural characteristics of local noise. PCCT simulation methods are used to synthesize realistic resolution degradation. To preserve noise texture, we explore decoupling the noise and signal image inputs and outputs via deep denoisers, explicitly mapping to each during the SR process. Our experimental results indicate that our DDPM trained on simulated data can improve sharpness in real PCCT images. Additionally, the disentanglement of noise from the original image allows our model to more faithfully preserve noise texture.

Submitted 25 February, 2024; originally announced February 2024.

Comments: 5 pages, 4 figures

arXiv:2310.06949 [pdf, other]

Diffusion Prior Regularized Iterative Reconstruction for Low-dose CT

Authors: Wenjun Xia, Yongyi Shi, Chuang Niu, Wenxiang Cong, Ge Wang

Abstract: Computed tomography (CT) involves a patient's exposure to ionizing radiation. To reduce the radiation dose, we can either lower the X-ray photon count or down-sample projection views. However, either approach often compromises image quality. To address this challenge, here we introduce an iterative reconstruction algorithm regularized by a diffusion prior. Drawing on the exceptional imaging prowess of the denoising diffusion probabilistic model (DDPM), we merge it with a reconstruction procedure that prioritizes data fidelity. This fusion capitalizes on the merits of both techniques, delivering exceptional reconstruction results in an unsupervised framework. To further enhance the efficiency of the reconstruction process, we incorporate the Nesterov momentum acceleration technique. This enhancement facilitates superior diffusion sampling in fewer steps. As demonstrated in our experiments, our method offers a potential pathway to high-definition CT image reconstruction with minimized radiation.

Submitted 10 October, 2023; originally announced October 2023.
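
The abstract above names Nesterov momentum as its acceleration technique but gives no update rule. As a rough illustration only, the sketch below applies the textbook Nesterov update to a tiny least-squares data-fidelity term, 0.5 * ||A x - y||^2. The function names, the step size, the momentum value, and the toy matrix are all assumptions made for this example; the diffusion prior and the authors' actual sampling scheme are not modeled.

```python
def data_fidelity_grad(x, A, y):
    """Gradient of 0.5 * ||A x - y||^2, i.e. A^T (A x - y), in pure Python."""
    residual = [sum(A[i][j] * x[j] for j in range(len(x))) - y[i]
                for i in range(len(y))]
    return [sum(A[i][j] * residual[i] for i in range(len(y)))
            for j in range(len(x))]


def nesterov_minimize(A, y, step=0.1, momentum=0.9, iters=300):
    """Textbook Nesterov accelerated gradient (hypothetical helper).

    The gradient is evaluated at the look-ahead point x + momentum * v
    rather than at x itself; that is the only difference from classical
    heavy-ball momentum.
    """
    x = [0.0] * len(A[0])
    v = [0.0] * len(x)
    for _ in range(iters):
        look_ahead = [xi + momentum * vi for xi, vi in zip(x, v)]
        g = data_fidelity_grad(look_ahead, A, y)
        v = [momentum * vi - step * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


# Toy system (assumed for illustration): the exact solution is x = [1, 3].
solution = nesterov_minimize([[2.0, 0.0], [0.0, 1.0]], [2.0, 3.0])
```

On this toy problem the iterate converges to the exact least-squares solution; in a real reconstruction the operator `A` would be the CT forward projector and the step size would have to be chosen from its spectral norm.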

arXiv:2308.16863 [pdf, other]

Self-pruning Graph Neural Network for Predicting Inflammatory Disease Activity in Multiple Sclerosis from Brain MR Images

Authors: Chinmay Prabhakar, Hongwei Bran Li, Johannes C. Paetzold, Timo Loehr, Chen Niu, Mark Mühlau, Daniel Rueckert, Benedikt Wiestler, Bjoern Menze

Abstract: Multiple Sclerosis (MS) is a severe neurological disease characterized by inflammatory lesions in the central nervous system. Hence, predicting inflammatory disease activity is crucial for disease assessment and treatment. However, MS lesions can occur throughout the brain and vary in shape, size and total count among patients. The high variance in lesion load and locations makes it challenging for machine learning methods to learn a globally effective representation of whole-brain MRI scans to assess and predict disease. Technically it is non-trivial to incorporate essential biomarkers such as lesion load or spatial proximity. Our work represents the first attempt to utilize graph neural networks (GNN) to aggregate these biomarkers for a novel global representation. We propose a two-stage MS inflammatory disease activity prediction approach. First, a 3D segmentation network detects lesions, and a self-supervised algorithm extracts their image features. Second, the detected lesions are used to build a patient graph. The lesions act as nodes in the graph and are initialized with image features extracted in the first stage. Finally, the lesions are connected based on their spatial proximity and the inflammatory disease activity prediction is formulated as a graph classification task. Furthermore, we propose a self-pruning strategy to auto-select the most critical lesions for prediction. Our proposed method outperforms the existing baseline by a large margin (AUCs of 0.67 vs. 0.61 and 0.66 vs. 0.60 for one-year and two-year inflammatory disease activity, respectively). Finally, our proposed method enjoys inherent explainability by assigning an importance score to each lesion for the overall prediction. Code is available at https://github.com/chinmay5/ms_ida.git

Submitted 31 August, 2023; originally announced August 2023.

arXiv:2308.13002 [pdf, other]

Head-Neck Dual-energy CT Contrast Media Reduction Using Diffusion Models

Authors: Qing Lyu, Josh Tan, Megan E. Lipford, Chuang Niu, Micheal E. Zapadka, Christopher M. Lack, Jonathan D. Clemente, Christopher T. Whitlow, Ge Wang

Abstract: Iodinated contrast media is essential for dual-energy computed tomography (DECT) angiography. Previous studies show that iodinated contrast media may cause side effects, and the interruption of the supply chain in 2022 led to a severe contrast media shortage in the US. Both factors justify the necessity of contrast media reduction in relevant clinical applications. In this study, we propose a diffusion model-based deep learning framework to address this challenge. First, we simulate different levels of low contrast dosage DECT scans from the standard normal contrast dosage DECT scans using material decomposition. Conditional denoising diffusion probabilistic models are then trained to enhance the contrast media and create contrast-enhanced images. Our results demonstrate that the proposed methods can generate high-quality contrast-enhanced results even for images obtained with as low as 12.5% of the normal contrast dosage. Furthermore, our method outperforms selected competing methods in a human reader study.

Submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.12526 [pdf, other]

UNISOUND System for VoxCeleb Speaker Recognition Challenge 2023

Authors: Yu Zheng, Yajun Zhang, Chuanying Niu, Yibin Zhan, Yanhua Long, Dongxing Xu

Abstract: This report describes the UNISOUND submission for Track 1 and Track 2 of VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC 2023). We submit the same system on Track 1 and Track 2, which is trained with only VoxCeleb2-dev. Large-scale ResNet and RepVGG architectures are developed for the challenge. We propose a consistency-aware score calibration method, which leverages the stability of audio voiceprints in similarity score by a Consistency Measure Factor (CMF). CMF brings a huge performance boost in this challenge. Our final system is a fusion of six models and achieves the first place in Track 1 and second place in Track 2 of VoxSRC 2023. The minDCF of our submission is 0.0855 and the EER is 1.5880%.

Submitted 23 August, 2023; originally announced August 2023.

arXiv:2304.02649 [pdf, other]

Specialty-Oriented Generalist Medical AI for Chest CT Screening

Authors: Chuang Niu, Qing Lyu, Christopher D. Carothers, Parisa Kaviani, Josh Tan, Pingkun Yan, Mannudeep K. Kalra, Christopher T. Whitlow, Ge Wang

Abstract: Modern medical records include a vast amount of multimodal free text clinical data and imaging data from radiology, cardiology, and digital pathology. Fully mining such big data requires multitasking; otherwise, occult but important aspects may be overlooked, adversely affecting clinical management and population healthcare. Despite remarkable successes of AI in individual tasks with single-modal data, the progress in developing generalist medical AI remains relatively slow to combine multimodal data for multitasks because of the dual challenges of data curation and model architecture. The data challenge involves querying and curating multimodal structured and unstructured text, alphanumeric, and especially 3D tomographic scans on an individual patient level for real-time decisions and on a scale to estimate population health statistics. The model challenge demands a scalable and adaptable network architecture to integrate multimodal datasets for diverse clinical tasks. Here we propose the first-of-its-kind medical multimodal-multitask foundation model (M3FM) with application in lung cancer screening and related tasks. After we curated a comprehensive multimodal multitask dataset consisting of 49 clinical data types including 163,725 chest CT series and 17 medical tasks involved in LCS, we develop a multimodal question-answering framework as a unified training and inference strategy to synergize multimodal information and perform multiple tasks via free-text prompting. M3FM consistently outperforms the state-of-the-art single-modal task-specific models, identifies multimodal data elements informative for clinical tasks and flexibly adapts to new tasks with a small out-of-distribution dataset. As a specialty-oriented generalist medical AI model, M3FM paves the way for similar breakthroughs in other areas of medicine, closing the gap between specialists and the generalist.

Submitted 24 April, 2024; v1 submitted 3 April, 2023; originally announced April 2023.

arXiv:2303.12861 [pdf, other]

Parallel Diffusion Model-based Sparse-view Cone-beam Breast CT

Authors: Wenjun Xia, Hsin Wu Tseng, Chuang Niu, Wenxiang Cong, Xiaohua Zhang, Shaohua Liu, Ruola Ning, Srinivasan Vedantham, Ge Wang

Abstract: Breast cancer is the most prevalent cancer among women worldwide, and early detection is crucial for reducing its mortality rate and improving quality of life. Dedicated breast computed tomography (CT) scanners offer better image quality than mammography and tomosynthesis in general but at higher radiation dose. To enable breast CT for cancer screening, the challenge is to minimize the radiation dose without compromising image quality, according to the ALARA principle (as low as reasonably achievable). Over the past years, deep learning has shown remarkable successes in various tasks, including low-dose CT, especially few-view CT. Currently, the diffusion model presents the state of the art for CT reconstruction. To develop the first diffusion model-based breast CT reconstruction method, here we report innovations to address the large memory requirement for breast cone-beam CT reconstruction and high computational cost of the diffusion model. Specifically, in this study we transform the cutting-edge Denoising Diffusion Probabilistic Model (DDPM) into a parallel framework for sub-volume-based sparse-view breast CT image reconstruction in projection and image domains. This novel approach involves the concurrent training of two distinct DDPM models dedicated to processing projection and image data synergistically in the dual domains. Our experimental findings reveal that this method delivers competitive reconstruction performance at half to one-third of the standard radiation doses. This advancement demonstrates an exciting potential of diffusion-type models for volumetric breast reconstruction at high-resolution with much-reduced radiation dose and as such hopefully redefines breast cancer screening and diagnosis.

Submitted 28 January, 2024; v1 submitted 22 March, 2023; originally announced March 2023.

arXiv:2302.10630 [pdf, other]

doi 10.1109/TMI.2024.3351723

LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring

Authors: Zhihao Chen, Chuang Niu, Qi Gao, Ge Wang, Hongming Shan

Abstract: This paper studies 3D low-dose computed tomography (CT) imaging. Although various deep learning methods were developed in this context, typically they focus on 2D images and perform denoising due to low-dose and deblurring for super-resolution separately. To date, little work has been done on simultaneous in-plane denoising and through-plane deblurring, which is important to obtain high-quality 3D CT images with lower radiation and faster imaging speed. For this task, a straightforward method is to directly train an end-to-end 3D network. However, it demands much more training data and expensive computational costs. Here, we propose to link in-plane and through-plane transformers for simultaneous in-plane denoising and through-plane deblurring, termed LIT-Former, which can efficiently synergize in-plane and through-plane sub-tasks for 3D CT imaging and enjoy the advantages of both convolution and transformer networks. LIT-Former has two novel designs: efficient multi-head self-attention modules (eMSM) and efficient convolutional feedforward networks (eCFN). First, eMSM integrates in-plane 2D self-attention and through-plane 1D self-attention to efficiently capture global interactions of 3D self-attention, the core unit of transformer networks. Second, eCFN integrates 2D convolution and 1D convolution to extract local information of 3D convolution in the same fashion. As a result, the proposed LIT-Former synergizes these two subtasks, significantly reducing the computational complexity as compared to 3D counterparts and enabling rapid convergence. Extensive experimental results on simulated and clinical datasets demonstrate superior performance over state-of-the-art models. The source code is made available at https://github.com/hao1635/LIT-Former.

Submitted 7 January, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

Comments: 15 pages, 12 figures

Journal ref: IEEE Transactions on Medical Imaging, 2024
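
The LIT-Former abstract claims that combining in-plane 2D and through-plane 1D self-attention is cheaper than full 3D self-attention. A back-of-the-envelope count makes the scale of that saving concrete. The sketch below assumes plain dot-product attention over spatial tokens and counts query-key pairs only; the abstract does not say that eMSM is built this way, so the function names and the volume size are illustrative assumptions, not the authors' design.

```python
def attention_pairs_3d(depth, height, width):
    """Query-key pairs when every voxel attends to every other voxel."""
    tokens = depth * height * width
    return tokens * tokens


def attention_pairs_factorized(depth, height, width):
    """Pairs for 2D attention within each slice plus 1D attention along depth.

    in-plane:      each of `depth` slices runs attention over height*width tokens
    through-plane: each of height*width pixel columns runs attention over `depth` tokens
    """
    in_plane = depth * (height * width) ** 2
    through_plane = (height * width) * depth ** 2
    return in_plane + through_plane


# Assumed toy volume: 8 slices of 16 x 16 tokens.
full = attention_pairs_3d(8, 16, 16)                # 4194304
factorized = attention_pairs_factorized(8, 16, 16)  # 540672
```

For this toy volume the factorized count is roughly 7.8 times smaller, and the gap widens as the number of slices grows because the dominant 3D term scales with the square of the depth.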

arXiv:2207.11678 [pdf, other]

doi 10.1109/TMI.2024.3351722

Quad-Net: Quad-domain Network for CT Metal Artifact Reduction

Authors: Zilong Li, Qi Gao, Yaping Wu, Chuang Niu, Junping Zhang, Meiyun Wang, Ge Wang, Hongming Shan

Abstract: Metal implants and other high-density objects in patients introduce severe streaking artifacts in CT images, compromising image quality and diagnostic performance. Although various methods were developed for CT metal artifact reduction over the past decades, including the latest dual-domain deep networks, remaining metal artifacts are still clinically challenging in many cases. Here we extend the state-of-the-art dual-domain deep network approach into a quad-domain counterpart so that all the features in the sinogram, image, and their corresponding Fourier domains are synergized to eliminate metal artifacts optimally without compromising structural subtleties. Our proposed quad-domain network for MAR, referred to as Quad-Net, takes little additional computational cost since the Fourier transform is highly efficient, and works across the four receptive fields to learn both global and local features as well as their relations. Specifically, we first design a Sinogram-Fourier Restoration Network (SFR-Net) in the sinogram domain and its Fourier space to faithfully inpaint metal-corrupted traces. Then, we couple SFR-Net with an Image-Fourier Refinement Network (IFR-Net) which takes both an image and its Fourier spectrum to improve a CT image reconstructed from the SFR-Net output using cross-domain contextual information. Quad-Net is trained on clinical datasets to minimize a composite loss function. Quad-Net does not require precise metal masks, which is of great importance in clinical practice. Our experimental results demonstrate the superiority of Quad-Net over the state-of-the-art MAR methods quantitatively, visually, and statistically. The Quad-Net code is publicly available at https://github.com/longzilicart/Quad-Net.

Submitted 31 May, 2023; v1 submitted 24 July, 2022; originally announced July 2022.

Journal ref: IEEE Transactions on Medical Imaging, 2024

arXiv:2205.14833 [pdf, other]

Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning

Authors: Chengfei Lv, Chaoyue Niu, Renjie Gu, Xiaotang Jiang, Zhaode Wang, Bin Liu, Ziqi Wu, Qiulin Yao, Congyu Huang, Panos Huang, Tao Huang, Hui Shu, Jinde Song, Bin Zou, Peng Lan, Guohuan Xu, Fei Wu, Shaojie Tang, Fan Wu, Guihai Chen

Abstract: To break the bottlenecks of mainstream cloud-based machine learning (ML) paradigm, we adopt device-cloud collaborative ML and build the first end-to-end and general-purpose system, called Walle, as the foundation. Walle consists of a deployment platform, distributing ML tasks to billion-scale devices in time; a data pipeline, efficiently preparing task input; and a compute container, providing a cross-platform and high-performance execution environment, while facilitating daily task iteration. Specifically, the compute container is based on Mobile Neural Network (MNN), a tensor compute engine along with the data processing and model execution libraries, which are exposed through a refined Python thread-level virtual machine (VM) to support diverse ML tasks and concurrent task execution. The core of MNN is the novel mechanisms of operator decomposition and semi-auto search, sharply reducing the workload in manually optimizing hundreds of operators for tens of hardware backends and further quickly identifying the best backend with runtime optimization for a computation graph. The data pipeline introduces an on-device stream processing framework to enable processing user behavior data at source. The deployment platform releases ML tasks with an efficient push-then-pull method and supports multi-granularity deployment policies. We evaluate Walle in practical e-commerce application scenarios to demonstrate its effectiveness, efficiency, and scalability. Extensive micro-benchmarks also highlight the superior performance of MNN and the Python thread-level VM. Walle has been in large-scale production use in Alibaba, while MNN has been open source with a broad impact in the community.

Submitted 29 May, 2022; originally announced May 2022.

Comments: Accepted by OSDI 2022

arXiv:2203.13256 [pdf, ps, other]

doi 10.1016/j.physd.2022.133557

Power Network Uniqueness and Synchronization Stability from a Higher-order Structure Perspective

Authors: Hao Liu, Xin Chen, Long Huo, Chunming Niu

Abstract: Triadic subgraph analysis reveals the structural features in power networks based on higher-order connectivity patterns. Power networks have a unique triad significance profile (TSP) of the five unidirectional triadic subgraphs in comparison with the scale-free, small-world and random networks. Notably, the triadic closure has the highest significance in power networks. Thus, the unique TSP can serve as a structural identifier to differentiate power networks from other complex networks. Power networks form a network superfamily. Furthermore, synthetic power networks based on the random growth model grow up to be networks belonging to the superfamily with fewer transmission lines. The significance of triadic closures strongly correlates with the construction cost measured by network redundancy. The trade-off between the synchronization stability and the construction cost leads to the power network superfamily. The power network characterized by the unique TSP is essentially the consequence of this trade-off. The uniqueness of the power network superfamily tells an important fact that power networks.

Submitted 25 March, 2022; originally announced March 2022.
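
The abstract above relies on a triad significance profile (TSP) without stating how it is computed. A commonly used definition from the network-motif literature takes, for each subgraph type, a z-score of its count in the real network against counts in randomized networks, then normalizes the vector of z-scores to unit length. The sketch below follows that definition; whether the paper uses exactly this form, and the use of the population standard deviation, are assumptions, and all counts shown are made up for illustration.

```python
import math
import statistics


def z_score(real_count, randomized_counts):
    """Z-score of a subgraph count against an ensemble of randomized networks.

    Uses the population standard deviation; this choice is an assumption.
    """
    mean = statistics.mean(randomized_counts)
    std = statistics.pstdev(randomized_counts)
    if std == 0:
        raise ValueError("randomized counts have zero variance")
    return (real_count - mean) / std


def significance_profile(z_scores):
    """Normalize z-scores to unit length so networks of different size compare."""
    norm = math.sqrt(sum(z * z for z in z_scores))
    if norm == 0:
        return [0.0] * len(z_scores)
    return [z / norm for z in z_scores]


# Made-up example: two subgraph types with z-scores 3 and 4.
profile = significance_profile([3.0, 4.0])  # approximately [0.6, 0.8]
```

The normalization step is what lets profiles from grids of very different sizes be compared directly, since raw z-scores tend to grow with network size.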

arXiv:2203.13118 [pdf, other]

X-ray Dissectography Improves Lung Nodule Detection

Authors: Chuang Niu, Giridhar Dasegowda, Pingkun Yan, Mannudeep K. Kalra, Ge Wang

Abstract: Although radiographs are the most frequently used worldwide due to their cost-effectiveness and widespread accessibility, the structural superposition along the x-ray paths often renders suspicious or concerning lung nodules difficult to detect. In this study, we apply "X-ray dissectography" to dissect lungs digitally from a few radiographic projections, suppress the interference of irrelevant structures, and improve lung nodule detectability. For this purpose, a collaborative detection network is designed to localize lung nodules in 2D dissected projections and 3D physical space. Our experimental results show that our approach can significantly improve the average precision by 20+% in comparison with the common baseline that detects lung nodules from original projections using a popular detection network. Potentially, this approach could help re-design the current X-ray imaging protocols and workflows and improve the diagnostic performance of chest radiographs in lung diseases.

Submitted 24 March, 2022; originally announced March 2022.

arXiv:2203.11722 [pdf, other]

Convolutional Neural Network to Restore Low-Dose Digital Breast Tomosynthesis Projections in a Variance Stabilization Domain

Authors: Rodrigo de Barros Vimieiro, Chuang Niu, Hongming Shan, Lucas Rodrigues Borges, Ge Wang, Marcelo Andrade da Costa Vieira

Abstract: Digital breast tomosynthesis (DBT) exams should utilize the lowest possible radiation dose while maintaining sufficiently good image quality for accurate medical diagnosis. In this work, we propose a convolutional neural network (CNN) to restore low-dose (LD) DBT projections to achieve an image quality equivalent to a standard full-dose (FD) acquisition. The proposed network architecture benefits from priors in terms of layers that were inspired by traditional model-based (MB) restoration methods, considering a model-based deep learning approach, where the network is trained to operate in the variance stabilization transformation (VST) domain. To accurately control the network operation point, in terms of noise and blur of the restored image, we propose a loss function that minimizes the bias and matches residual noise between the input and the output. The training dataset was composed of clinical data acquired at the standard FD and low-dose pairs obtained by the injection of quantum noise. The network was tested using real DBT projections acquired with a physical anthropomorphic breast phantom. The proposed network achieved superior results in terms of the mean normalized squared error (MNSE), training time and noise spatial correlation compared with networks trained with traditional data-driven methods. The proposed approach can be extended to other medical imaging applications that require LD acquisitions.

Submitted 22 March, 2022; originally announced March 2022.

Comments: 12 pages, 9 figures
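
The abstract above trains its network in a variance stabilization transformation (VST) domain but does not name the transform. For pure Poisson (quantum) noise, one widely used VST is the Anscombe transform, which maps counts to values with approximately unit variance. The sketch below shows that transform and its simple algebraic inverse purely as an illustration; it is an assumption that something of this family is used, and real detector data would typically need a generalized version that also accounts for gain, offset, and electronic noise.

```python
import math


def anscombe(counts):
    """Anscombe transform: Poisson counts -> approximately unit-variance values."""
    return 2.0 * math.sqrt(counts + 3.0 / 8.0)


def inverse_anscombe(value):
    """Algebraic inverse of the Anscombe transform.

    This inverse is biased at low counts; unbiased inverses exist but are
    beyond the scope of this sketch.
    """
    return (value / 2.0) ** 2 - 3.0 / 8.0


# Round trip on an assumed pixel value of 100 detected photons.
stabilized = anscombe(100.0)
restored = inverse_anscombe(stabilized)
```

Working in the stabilized domain means a denoiser sees noise of roughly constant strength regardless of the local signal level, which is one plausible reason a restoration network would be trained there.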

arXiv:2108.12076 [pdf]

Stationary Multi-source AI-powered Real-time Tomography (SMART)

Authors: Weiwen Wu, Yaohui Tang, Tianling Lv, Chuang Niu, Cheng Wang, Yiyan Guo, Yunheng Chang, Ge Wang, Yan Xi

Abstract: Over the past decades, the development of CT technologies has been largely driven by the needs for cardiac imaging but the temporal resolution remains insufficient for clinical CT in difficult cases and rather challenging for preclinical micro-CT since small animals, as human cardiac disease models, have much higher heart rates than humans. To address this challenge, here we report a Stationary Multi-source AI-based Real-time Tomography (SMART) micro-CT system. This unique scanner features 29 source-detector pairs fixed on a circular track to collect x-ray signals in parallel, enabling instantaneous tomography in principle. Given the multi-source architecture, the field-of-view only covers a cardiac region. To solve this interior problem, an AI-empowered interior tomography approach is developed to synergize sparsity-based regularization and learning-based reconstruction. To demonstrate the performance and utilities of the SMART system, extensive results are obtained in physical phantom experiments and animal studies, including dead and live rats as well as live rabbits. The reconstructed volumetric images convincingly demonstrate the merits of the SMART system using the AI-empowered interior tomography approach, enabling cardiac micro-CT with the unprecedented temporal resolution of 30ms, which is an order of magnitude higher than the state of the art.

Submitted 7 February, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

Comments: 22 pages, 8 figures, 1 table, 33 references

arXiv:2106.09834 [pdf]

AI-Enabled Ultra-Low-Dose CT Reconstruction

Authors: Weiwen Wu, Chuang Niu, Shadi Ebrahimian, Hengyong Yu, Mannu Kalra, Ge Wang

Abstract: By the ALARA (As Low As Reasonably Achievable) principle, ultra-low-dose CT reconstruction is a holy grail to minimize cancer risks and genetic damage, especially for children. With the development of medical CT technologies, iterative algorithms are widely used to reconstruct decent CT images from a low-dose scan. Recently, artificial intelligence (AI) techniques have shown great promise in further reducing CT radiation dose to the next level. In this paper, we demonstrate that AI-powered CT reconstruction offers diagnostic image quality at an ultra-low-dose level comparable to that of radiography. Specifically, here we develop a Split Unrolled Grid-like Alternative Reconstruction (SUGAR) network, in which deep learning, physical modeling and image prior are integrated. The reconstruction results from clinical datasets show that excellent images can be reconstructed using SUGAR from 36 projections. This approach has the potential to change future healthcare.

Submitted 17 June, 2021; originally announced June 2021.

Comments: 19 pages, 10 figures, 1 table, 44 references

MSC Class: 68T07

arXiv:2103.13557 [pdf, other]

Task-Oriented Low-Dose CT Image Denoising

Authors: Jiajin Zhang, Hanqing Chao, Xuanang Xu, Chuang Niu, Ge Wang, Pingkun Yan

Abstract: The extensive use of medical CT has raised a public concern over the radiation dose to the patient. Reducing the radiation dose leads to increased CT image noise and artifacts, which can adversely affect not only the radiologists' judgement but also the performance of downstream medical image analysis tasks. Various low-dose CT denoising methods, especially the recent deep learning based approaches, have produced impressive results. However, the existing denoising methods are all downstream-task-agnostic and neglect the diverse needs of the downstream applications. In this paper, we introduce a novel Task-Oriented Denoising Network (TOD-Net) with a task-oriented loss leveraging knowledge from the downstream tasks. Comprehensive empirical analysis shows that the task-oriented loss complements other task-agnostic losses by steering the denoiser to enhance the image quality in the task-related regions of interest. Such enhancement in turn brings general boosts to the performance of various methods for the downstream task. The presented work may shed light on the future development of context-aware image denoising methods.

Submitted 10 July, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

Comments: Paper accepted by MICCAI-2021

arXiv:2102.09615 [pdf, other]

Noise Entangled GAN For Low-Dose CT Simulation

Authors: Chuang Niu, Ge Wang, Pingkun Yan, Juergen Hahn, Youfang Lai, Xun Jia, Arjun Krishna, Klaus Mueller, Andreu Badal, Kyle J. Myers, Rongping Zeng

Abstract: We propose a Noise Entangled GAN (NE-GAN) for simulating low-dose computed tomography (CT) images from a higher dose CT image. First, we present two schemes to generate a clean CT image and a noise image from the high-dose CT image. Then, given these generated images, an NE-GAN is proposed to simulate different levels of low-dose CT images, where the level of generated noise can be continuously controlled by a noise factor. NE-GAN consists of a generator and a set of discriminators, and the number of discriminators is determined by the number of noise levels during training. Compared with the traditional methods based on the projection data that are usually unavailable in real applications, NE-GAN can directly learn from the real and/or simulated CT images and may create low-dose CT images quickly without the need for raw data or other proprietary CT scanner information. The experimental results show that the proposed method has the potential to simulate realistic low-dose CT images.

Submitted 18 February, 2021; originally announced February 2021.

arXiv:2011.08297 [pdf, other]

doi 10.1109/ACCESS.2020.3046187

Clinical Micro-CT Empowered by Interior Tomography, Robotic Scanning, and Deep Learning

Authors: Mengzhou Li, Zheng Fang, Wenxiang Cong, Chuang Niu, Weiwen Wu, Josef Uher, James Bennett, Jay T. Rubinstein, Ge Wang

Abstract: While micro-CT systems are instrumental in preclinical research, clinical micro-CT imaging has long been desired with cochlear implantation as a primary example. The structural details of the cochlear implant and the temporal bone require a significantly higher image resolution than that (about 0.2 mm) provided by current medical CT scanners. In this paper, we propose a clinical micro-CT (CMCT) system design integrating conventional spiral cone-beam CT, contemporary interior tomography, deep learning techniques, and technologies of micro-focus X-ray source, photon-counting detector (PCD), and robotic arms for ultrahigh resolution localized tomography of a freely-selected volume of interest (VOI) at a minimized radiation dose level. The whole system consists of a standard CT scanner for a clinical CT exam and VOI specification, and a robotic-arm based micro-CT scanner for a local scan at much higher spatial and spectral resolution as well as much reduced radiation dose. The prior information from the global scan is also fully utilized for background compensation to improve interior tomography from local data for accurate and stable VOI reconstruction. Our results and analysis show that the proposed hybrid reconstruction algorithm delivers superior local reconstruction, being insensitive to the misalignment of the isocenter position and initial view angle in the data/image registration while the attenuation error caused by scale mismatch can be effectively addressed with bias correction. These findings demonstrate the feasibility of our system design.
We envision that deep learning techniques can be leveraged for optimized imaging performance. By synergistically combining high-resolution imaging, high dose efficiency, and low system cost, our proposed CMCT system has great potential in temporal bone imaging as well as various other clinical applications.

Submitted 16 November, 2020; originally announced November 2020.

Comments: 10 pages, 13 figures, 3 tables

arXiv:2011.03384 [pdf, other]

Suppression of Correlated Noise with Similarity-based Unsupervised Deep Learning

Authors: Chuang Niu, Mengzhou Li, Fenglei Fan, Weiwen Wu, Xiaodong Guo, Qing Lyu, Ge Wang

Abstract: Image denoising is a prerequisite for downstream tasks in many fields. Low-dose and photon-counting computed tomography (CT) denoising can optimize diagnostic performance at minimized radiation dose. Supervised deep denoising methods are popular but require paired clean or noisy samples that are often unavailable in practice. Limited by the independent noise assumption, current unsupervised denoising methods cannot process correlated noises as in CT images. Here we propose the first-of-its-kind similarity-based unsupervised deep denoising approach, referred to as Noise2Sim, that works in a nonlocal and nonlinear fashion to suppress not only independent but also correlated noises. Theoretically, Noise2Sim is asymptotically equivalent to supervised learning methods under mild conditions. Experimentally, Noise2Sim recovers intrinsic features from noisy low-dose CT and photon-counting CT images as effectively as or even better than supervised learning methods on practical datasets visually, quantitatively and statistically. Noise2Sim is a general unsupervised denoising approach and has great potential in diverse applications.

Submitted 5 January, 2022; v1 submitted 6 November, 2020; originally announced November 2020.

arXiv:2010.14105 [pdf, other]

Micro-CT Synthesis and Inner Ear Super Resolution via Generative Adversarial Networks and Bayesian Inference

Authors: Hongwei Li, Rameshwara G. N. Prasad, Anjany Sekuboyina, Chen Niu, Siwei Bai, Werner Hemmert, Bjoern Menze

Abstract: Existing medical image super-resolution methods rely on pairs of low- and high-resolution images to learn a mapping in a fully supervised manner. However, such image pairs are often not available in clinical practice. In this paper, we address the super-resolution problem in a real-world scenario using unpaired data and synthesize linearly eight times higher resolved Micro-CT images of temporal bone structure, which is embedded in the inner ear. We explore cycle-consistency generative adversarial networks for the super-resolution task and equip the translation approach with Bayesian inference. We further introduce the Hu Moment distance as an evaluation metric to quantify the shape of the temporal bone. We evaluate our method on a public inner ear CT dataset and have seen both visual and quantitative improvement over state-of-the-art deep-learning-based methods. In addition, we perform a multi-rater visual evaluation experiment and find that trained experts consistently give the proposed method the highest quality scores among all methods. Furthermore, we are able to quantify uncertainty in the unpaired translation task and the uncertainty map can provide structural information of the temporal bone.

Submitted 4 February, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: final version in ISBI 2021

arXiv:2008.13570 [pdf]

Deep Learning based Spectral CT Imaging

Authors: Weiwen Wu, Dianlin Hu, Chuang Niu, Lieza Vanden Broeke, Anthony P. H. Butler, Peng Cao, James Atlas, Alexander Chernoglazov, Varut Vardhanabhuti, Ge Wang

Abstract: Spectral computed tomography (CT) has attracted much attention in radiation dose reduction, metal artifacts removal, tissue quantification and material discrimination. The x-ray energy spectrum is divided into several bins, and each energy-bin-specific projection has a lower signal-to-noise ratio (SNR) than the current-integrating counterpart, which makes image reconstruction a unique challenge. Traditional wisdom is to use prior knowledge based iterative methods. However, such methods demand a great computational cost. Inspired by deep learning, here we first develop a deep learning based reconstruction method; i.e., U-net with $L_p^p$-norm, Total variation, Residual learning, and Anisotropic adaption (ULTRA). Specifically, we emphasize the Various Multi-scale Feature Fusion and Multichannel Filtering Enhancement with a denser connection encoding architecture for residual learning and feature fusion. To address the image blurring problem associated with the $L_2^2$-loss, we propose a general $L_p^p$-loss, $p>0$. Furthermore, since the images from different energy bins share similar structures of the same object, the regularization characterizing correlations of different energy bins is incorporated into the $L_p^p$-loss function, which helps unify the deep learning based methods with traditional compressed sensing based methods. Finally, the anisotropically weighted total variation is employed to characterize the sparsity in the spatial-spectral domain to regularize the proposed network.
In particular, we validate our ULTRA networks on three large-scale spectral CT datasets, and obtain excellent results relative to the competing algorithms. In conclusion, our quantitative and qualitative results in numerical simulation and preclinical experiments demonstrate that our proposed approach is accurate, efficient and robust for high-quality spectral CT image reconstruction.

Submitted 25 August, 2021; v1 submitted 27 August, 2020; originally announced August 2020.

arXiv:2008.01846 [pdf]

Stabilizing Deep Tomographic Reconstruction

Authors: Weiwen Wu, Dianlin Hu, Wenxiang Cong, Hongming Shan, Shaoyu Wang, Chuang Niu, Pingkun Yan, Hengyong Yu, Varut Vardhanabhuti, Ge Wang

Abstract: Tomographic image reconstruction with deep learning is an emerging field, but a recent landmark study reveals that several deep reconstruction networks are unstable for computed tomography (CT) and magnetic resonance imaging (MRI). Specifically, three kinds of instabilities were reported: (1) strong image artefacts from tiny perturbations, (2) small features missing in a deeply reconstructed image, and (3) decreased imaging performance with increased input data. On the other hand, compressed sensing (CS) inspired reconstruction methods do not suffer from these instabilities because of their built-in kernel awareness. For deep reconstruction to realize its full potential and become a mainstream approach for tomographic imaging, it is thus critically important to meet this challenge by stabilizing deep reconstruction networks. Here we propose an Analytic Compressed Iterative Deep (ACID) framework to address this challenge. ACID synergizes a deep reconstruction network trained on big data, kernel awareness from CS-inspired processing, and iterative refinement to minimize the data residual relative to real measurement. Our study demonstrates that the deep reconstruction using ACID is accurate and stable, and sheds light on the converging mechanism of the ACID iteration under a Bounded Relative Error Norm (BREN) condition. In particular, the study shows that ACID-based reconstruction is resilient against adversarial attacks, superior to classic sparsity-regularized reconstruction alone, and eliminates the three kinds of instabilities.
We anticipate that this integrative data-driven approach will help promote development and translation of deep tomographic image reconstruction networks into clinical applications.

Submitted 13 September, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

Comments: 78 pages, 30 figures, 149 references

arXiv:2007.03882 [pdf, other]

Low-dimensional Manifold Constrained Disentanglement Network for Metal Artifact Reduction

Authors: Chuang Niu, Wenxiang Cong, Fenglei Fan, Hongming Shan, Mengzhou Li, Jimin Liang, Ge Wang

Abstract: Deep neural network based methods have achieved promising results for CT metal artifact reduction (MAR), most of which use many synthesized paired images for training. As synthesized metal artifacts in CT images may not accurately reflect the clinical counterparts, an artifact disentanglement network (ADN) was proposed to learn from unpaired clinical images directly, producing promising results on clinical datasets. However, without sufficient supervision, it is difficult for ADN to recover structural details of artifact-affected CT images based on adversarial losses only. To overcome these problems, here we propose a low-dimensional manifold (LDM) constrained disentanglement network (DN), leveraging the image characteristics that the patch manifold is generally low-dimensional. Specifically, we design an LDM-DN learning algorithm to empower the disentanglement network through optimizing the synergistic network loss functions while constraining the recovered images to be on a low-dimensional patch manifold. Moreover, learning from both paired and unpaired data, an efficient hybrid optimization scheme is proposed to further improve the MAR performance on clinical datasets. Extensive experiments demonstrate that the proposed LDM-DN approach can consistently improve the MAR performance in paired and/or unpaired learning settings, outperforming competing methods on synthesized and clinical datasets.

Submitted 7 July, 2020; originally announced July 2020.

arXiv:2002.11863 [pdf, other]

GATCluster: Self-Supervised Gaussian-Attention Network for Image Clustering

Authors: Chuang Niu, Jun Zhang, Ge Wang, Jimin Liang

Abstract: We propose a self-supervised Gaussian ATtention network for image Clustering (GATCluster). Rather than extracting intermediate features first and then performing the traditional clustering algorithm, GATCluster directly outputs semantic cluster labels without further post-processing. Theoretically, we give a Label Feature Theorem to guarantee the learned features are one-hot encoded vectors, and the trivial solutions are avoided. To train the GATCluster in a completely unsupervised manner, we design four self-learning tasks with the constraints of transformation invariance, separability maximization, entropy analysis, and attention mapping. Specifically, the transformation invariance and separability maximization tasks learn the relationships between sample pairs. The entropy analysis task aims to avoid trivial solutions. To capture the object-oriented semantics, we design a self-supervised attention mechanism that includes a parameterized attention module and a soft-attention loss. All the guiding signals for clustering are self-generated during the training process. Moreover, we develop a two-step learning algorithm that is memory-efficient for clustering large-size images. Extensive experiments demonstrate the superiority of our proposed method in comparison with the state-of-the-art image clustering benchmarks. Our code has been made publicly available at https://github.com/niuchuangnn/GATCluster.

Submitted 6 June, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

arXiv:1809.06013 [pdf, other]

DASNet: Reducing Pixel-level Annotations for Instance and Semantic Segmentation

Authors: Chuang Niu, Shenghan Ren, Jimin Liang

Abstract: Pixel-level annotation demands expensive human effort and limits the performance of deep networks, which usually benefit from more such training data. In this work we aim to achieve high quality instance and semantic segmentation results over a small set of pixel-level mask annotations and a large set of box annotations. The basic idea is to explore detection models to simplify the pixel-level supervised learning task and thus reduce the required amount of mask annotations. Our architecture, named DASNet, consists of three modules: detection, attention, and segmentation. The detection module detects all classes of objects, the attention module generates multi-scale class-specific features, and the segmentation module recovers the binary masks. Our method demonstrates substantially improved performance compared to existing semi-supervised approaches on the PASCAL VOC 2012 dataset.

Submitted 31 January, 2020; v1 submitted 17 September, 2018; originally announced September 2018.

Showing 1–26 of 26 results for author: Niu, C