-
Enhancing Single-Slice Segmentation with 3D-to-2D Unpaired Scan Distillation
Authors:
Xin Yu,
Qi Yang,
Han Liu,
Ho Hin Lee,
Yucheng Tang,
Lucas W. Remedios,
Michael Kim,
Shunxing Bao,
Ann Xenobia Moore,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure. However, single-slice data necessitates the use of 2D networks for segmentation, but these networks often struggle to capture contextual information effectively. Consequently, even when trained on identical datasets, 3D networks typically achieve superior segmenta…
▽ More
2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure. However, single-slice data necessitates the use of 2D networks for segmentation, but these networks often struggle to capture contextual information effectively. Consequently, even when trained on identical datasets, 3D networks typically achieve superior segmentation results. In this work, we propose a novel 3D-to-2D distillation framework, leveraging pre-trained 3D models to enhance 2D single-slice segmentation. Specifically, we extract the prediction distribution centroid from the 3D representations, to guide the 2D student by learning intra- and inter-class correlation. Unlike traditional knowledge distillation methods that require the same data input, our approach employs unpaired 3D CT scans with any contrast to guide the 2D student model. Experiments conducted on 707 subjects from the single-slice Baltimore Longitudinal Study of Aging (BLSA) dataset demonstrate that state-of-the-art 2D multi-organ segmentation methods can benefit from the 3D teacher model, achieving enhanced performance in single-slice multi-organ segmentation. Notably, our approach demonstrates considerable efficacy in low-data regimes, outperforming the model trained with all available training subjects even when utilizing only 200 training subjects. Thus, this work underscores the potential to alleviate manual annotation burdens.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
Super-resolution multi-contrast unbiased eye atlases with deep probabilistic refinement
Authors:
Ho Hin Lee,
Adam M. Saunders,
Michael E. Kim,
Samuel W. Remedios,
Lucas W. Remedios,
Yucheng Tang,
Qi Yang,
Xin Yu,
Shunxing Bao,
Chloe Cho,
Louise A. Mawn,
Tonia S. Rex,
Kevin L. Schey,
Blake E. Dewey,
Jeffrey M. Spraggins,
Jerry L. Prince,
Yuankai Huo,
Bennett A. Landman
Abstract:
Purpose: Eye morphology varies significantly across the population, especially for the orbit and optic nerve. These variations limit the feasibility and robustness of generalizing population-wise features of eye organs to an unbiased spatial reference.
Approach: To tackle these limitations, we propose a process for creating high-resolution unbiased eye atlases. First, to restore spatial details…
▽ More
Purpose: Eye morphology varies significantly across the population, especially for the orbit and optic nerve. These variations limit the feasibility and robustness of generalizing population-wise features of eye organs to an unbiased spatial reference.
Approach: To tackle these limitations, we propose a process for creating high-resolution unbiased eye atlases. First, to restore spatial details from scans with a low through-plane resolution compared to a high in-plane resolution, we apply a deep learning-based super-resolution algorithm. Then, we generate an initial unbiased reference with an iterative metric-based registration using a small portion of subject scans. We register the remaining scans to this template and refine the template using an unsupervised deep probabilistic approach that generates a more expansive deformation field to enhance the organ boundary alignment. We demonstrate this framework using magnetic resonance images across four different tissue contrasts, generating four atlases in separate spatial alignments.
Results: For each tissue contrast, we find a significant improvement using the Wilcoxon signed-rank test in the average Dice score across four labeled regions compared to a standard registration framework consisting of rigid, affine, and deformable transformations. These results highlight the effective alignment of eye organs and boundaries using our proposed process.
Conclusions: By combining super-resolution preprocessing and deep probabilistic models, we address the challenge of generating an eye atlas to serve as a standardized reference across a largely variable population.
△ Less
Submitted 14 June, 2024; v1 submitted 5 January, 2024;
originally announced January 2024.
-
Predicting Age from White Matter Diffusivity with Residual Learning
Authors:
Chenyu Gao,
Michael E. Kim,
Ho Hin Lee,
Qi Yang,
Nazirah Mohd Khairi,
Praitayini Kanakaraj,
Nancy R. Newlin,
Derek B. Archer,
Angela L. Jefferson,
Warren D. Taylor,
Brian D. Boyd,
Lori L. Beason-Held,
Susan M. Resnick,
The BIOCARD Study Team,
Yuankai Huo,
Katherine D. Van Schaik,
Kurt G. Schilling,
Daniel Moyer,
Ivana IĆĄgum,
Bennett A. Landman
Abstract:
Imaging findings inconsistent with those expected at specific chronological age ranges may serve as early indicators of neurological disorders and increased mortality risk. Estimation of chronological age, and deviations from expected results, from structural MRI data has become an important task for develo** biomarkers that are sensitive to such deviations. Complementary to structural analysis,…
▽ More
Imaging findings inconsistent with those expected at specific chronological age ranges may serve as early indicators of neurological disorders and increased mortality risk. Estimation of chronological age, and deviations from expected results, from structural MRI data has become an important task for develo** biomarkers that are sensitive to such deviations. Complementary to structural analysis, diffusion tensor imaging (DTI) has proven effective in identifying age-related microstructural changes within the brain white matter, thereby presenting itself as a promising additional modality for brain age prediction. Although early studies have sought to harness DTI's advantages for age estimation, there is no evidence that the success of this prediction is owed to the unique microstructural and diffusivity features that DTI provides, rather than the macrostructural features that are also available in DTI data. Therefore, we seek to develop white-matter-specific age estimation to capture deviations from normal white matter aging. Specifically, we deliberately disregard the macrostructural information when predicting age from DTI scalar images, using two distinct methods. The first method relies on extracting only microstructural features from regions of interest. The second applies 3D residual neural networks (ResNets) to learn features directly from the images, which are non-linearly registered and warped to a template to minimize macrostructural variations. When tested on unseen data, the first method yields mean absolute error (MAE) of 6.11 years for cognitively normal participants and MAE of 6.62 years for cognitively impaired participants, while the second method achieves MAE of 4.69 years for cognitively normal participants and MAE of 4.96 years for cognitively impaired participants. We find that the ResNet model captures subtler, non-macrostructural features for brain age prediction.
△ Less
Submitted 21 January, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Inter-vendor harmonization of Computed Tomography (CT) reconstruction kernels using unpaired image translation
Authors:
Aravind R. Krishnan,
Kaiwen Xu,
Thomas Li,
Chenyu Gao,
Lucas W. Remedios,
Praitayini Kanakaraj,
Ho Hin Lee,
Shunxing Bao,
Kim L. Sandler,
Fabien Maldonado,
Ivana Isgum,
Bennett A. Landman
Abstract:
The reconstruction kernel in computed tomography (CT) generation determines the texture of the image. Consistency in reconstruction kernels is important as the underlying CT texture can impact measurements during quantitative image analysis. Harmonization (i.e., kernel conversion) minimizes differences in measurements due to inconsistent reconstruction kernels. Existing methods investigate harmoni…
▽ More
The reconstruction kernel in computed tomography (CT) generation determines the texture of the image. Consistency in reconstruction kernels is important as the underlying CT texture can impact measurements during quantitative image analysis. Harmonization (i.e., kernel conversion) minimizes differences in measurements due to inconsistent reconstruction kernels. Existing methods investigate harmonization of CT scans in single or multiple manufacturers. However, these methods require paired scans of hard and soft reconstruction kernels that are spatially and anatomically aligned. Additionally, a large number of models need to be trained across different kernel pairs within manufacturers. In this study, we adopt an unpaired image translation approach to investigate harmonization between and across reconstruction kernels from different manufacturers by constructing a multipath cycle generative adversarial network (GAN). We use hard and soft reconstruction kernels from the Siemens and GE vendors from the National Lung Screening Trial dataset. We use 50 scans from each reconstruction kernel and train a multipath cycle GAN. To evaluate the effect of harmonization on the reconstruction kernels, we harmonize 50 scans each from Siemens hard kernel, GE soft kernel and GE hard kernel to a reference Siemens soft kernel (B30f) and evaluate percent emphysema. We fit a linear model by considering the age, smoking status, sex and vendor and perform an analysis of variance (ANOVA) on the emphysema scores. Our approach minimizes differences in emphysema measurement and highlights the impact of age, sex, smoking status and vendor on emphysema quantification.
△ Less
Submitted 26 January, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
Deep conditional generative models for longitudinal single-slice abdominal computed tomography harmonization
Authors:
Xin Yu,
Qi Yang,
Yucheng Tang,
Riqiang Gao,
Shunxing Bao,
Leon Y. Cai,
Ho Hin Lee,
Yuankai Huo,
Ann Zenobia Moore,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
Two-dimensional single-slice abdominal computed tomography (CT) provides a detailed tissue map with high resolution allowing quantitative characterization of relationships between health conditions and aging. However, longitudinal analysis of body composition changes using these scans is difficult due to positional variation between slices acquired in different years, which leading to different or…
▽ More
Two-dimensional single-slice abdominal computed tomography (CT) provides a detailed tissue map with high resolution allowing quantitative characterization of relationships between health conditions and aging. However, longitudinal analysis of body composition changes using these scans is difficult due to positional variation between slices acquired in different years, which leading to different organs/tissues captured. To address this issue, we propose C-SliceGen, which takes an arbitrary axial slice in the abdominal region as a condition and generates a pre-defined vertebral level slice by estimating structural changes in the latent space. Our experiments on 2608 volumetric CT data from two in-house datasets and 50 subjects from the 2015 Multi-Atlas Abdomen Labeling Challenge dataset (BTCV) Challenge demonstrate that our model can generate high-quality images that are realistic and similar. We further evaluate our method's capability to harmonize longitudinal positional variation on 1033 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset, which contains longitudinal single abdominal slices, and confirmed that our method can harmonize the slice positional variance in terms of visceral fat area. This approach provides a promising direction for map** slices from different vertebral levels to a target slice and reducing positional variance for single-slice longitudinal analysis. The source code is available at: https://github.com/MASILab/C-SliceGen.
△ Less
Submitted 17 September, 2023;
originally announced September 2023.
-
Enhancing Hierarchical Transformers for Whole Brain Segmentation with Intracranial Measurements Integration
Authors:
Xin Yu,
Yucheng Tang,
Qi Yang,
Ho Hin Lee,
Shunxing Bao,
Yuankai Huo,
Bennett A. Landman
Abstract:
Whole brain segmentation with magnetic resonance imaging (MRI) enables the non-invasive measurement of brain regions, including total intracranial volume (TICV) and posterior fossa volume (PFV). Enhancing the existing whole brain segmentation methodology to incorporate intracranial measurements offers a heightened level of comprehensiveness in the analysis of brain structures. Despite its potentia…
▽ More
Whole brain segmentation with magnetic resonance imaging (MRI) enables the non-invasive measurement of brain regions, including total intracranial volume (TICV) and posterior fossa volume (PFV). Enhancing the existing whole brain segmentation methodology to incorporate intracranial measurements offers a heightened level of comprehensiveness in the analysis of brain structures. Despite its potential, the task of generalizing deep learning techniques for intracranial measurements faces data availability constraints due to limited manually annotated atlases encompassing whole brain and TICV/PFV labels. In this paper, we enhancing the hierarchical transformer UNesT for whole brain segmentation to achieve segmenting whole brain with 133 classes and TICV/PFV simultaneously. To address the problem of data scarcity, the model is first pretrained on 4859 T1-weighted (T1w) 3D volumes sourced from 8 different sites. These volumes are processed through a multi-atlas segmentation pipeline for label generation, while TICV/PFV labels are unavailable. Subsequently, the model is finetuned with 45 T1w 3D volumes from Open Access Series Imaging Studies (OASIS) where both 133 whole brain classes and TICV/PFV labels are available. We evaluate our method with Dice similarity coefficients(DSC). We show that our model is able to conduct precise TICV/PFV estimation while maintaining the 132 brain regions performance at a comparable level. Code and trained model are available at: https://github.com/MASILab/UNesT/tree/main/wholebrainSeg.
△ Less
Submitted 10 April, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Digital Modeling on Large Kernel Metamaterial Neural Network
Authors:
Quan Liu,
Hanyu Zheng,
Brandon T. Swartz,
Ho hin Lee,
Zuhayr Asad,
Ivan Kravchenko,
Jason G. Valentine,
Yuankai Huo
Abstract:
Deep neural networks (DNNs) utilized recently are physically deployed with computational units (e.g., CPUs and GPUs). Such a design might lead to a heavy computational burden, significant latency, and intensive power consumption, which are critical limitations in applications such as the Internet of Things (IoT), edge computing, and the usage of drones. Recent advances in optical computational uni…
▽ More
Deep neural networks (DNNs) utilized recently are physically deployed with computational units (e.g., CPUs and GPUs). Such a design might lead to a heavy computational burden, significant latency, and intensive power consumption, which are critical limitations in applications such as the Internet of Things (IoT), edge computing, and the usage of drones. Recent advances in optical computational units (e.g., metamaterial) have shed light on energy-free and light-speed neural networks. However, the digital design of the metamaterial neural network (MNN) is fundamentally limited by its physical limitations, such as precision, noise, and bandwidth during fabrication. Moreover, the unique advantages of MNN's (e.g., light-speed computation) are not fully explored via standard 3x3 convolution kernels. In this paper, we propose a novel large kernel metamaterial neural network (LMNN) that maximizes the digital capacity of the state-of-the-art (SOTA) MNN with model re-parametrization and network compression, while also considering the optical limitation explicitly. The new digital learning scheme can maximize the learning capacity of MNN while modeling the physical restrictions of meta-optic. With the proposed LMNN, the computation cost of the convolutional front-end can be offloaded into fabricated optical hardware. The experimental results on two publicly available datasets demonstrate that the optimized hybrid design improved classification accuracy while reducing computational latency. The development of the proposed LMNN is a promising step towards the ultimate goal of energy-free and light-speed AI.
△ Less
Submitted 21 July, 2023;
originally announced July 2023.
-
ISI-Mitigating Character Encoding for Molecular communications via Diffusion
Authors:
Haewoong Hyun Changmin Lee,
Miaowen Wen,
Sang-Hyo Kim,
Chan-Byoung Chae
Abstract:
This letter introduces a novel algorithm for generating codebooks in molecular communications (MC). The proposed algorithm utilizes character entropy to effectively mitigate inter-symbol interference (ISI) during MC via diffusion. Based on Huffman coding, the algorithm ensures that consecutive bit-1s are avoided in the resulting codebook. Additionally, the error-correction process at the receiver…
▽ More
This letter introduces a novel algorithm for generating codebooks in molecular communications (MC). The proposed algorithm utilizes character entropy to effectively mitigate inter-symbol interference (ISI) during MC via diffusion. Based on Huffman coding, the algorithm ensures that consecutive bit-1s are avoided in the resulting codebook. Additionally, the error-correction process at the receiver effectively eliminates ISI in the time slot immediately following a bit-1. We conduct an ISI analysis, which confirms that the proposed algorithm significantly reduces decoding errors. Through numerical analysis, we demonstrate that the proposed codebook exhibits superior performance in terms of character error rate compared to existing codebooks. Furthermore, we validate the performance of the algorithm through experimentation on a real-time testbed.
△ Less
Submitted 5 June, 2023;
originally announced June 2023.
-
Multi-Contrast Computed Tomography Atlas of Healthy Pancreas
Authors:
Yinchi Zhou,
Ho Hin Lee,
Yucheng Tang,
Xin Yu,
Qi Yang,
Shunxing Bao,
Jeffrey M. Spraggins,
Yuankai Huo,
Bennett A. Landman
Abstract:
With the substantial diversity in population demographics, such as differences in age and body composition, the volumetric morphology of pancreas varies greatly, resulting in distinctive variations in shape and appearance. Such variations increase the difficulty at generalizing population-wide pancreas features. A volumetric spatial reference is needed to adapt the morphological variability for or…
▽ More
With the substantial diversity in population demographics, such as differences in age and body composition, the volumetric morphology of pancreas varies greatly, resulting in distinctive variations in shape and appearance. Such variations increase the difficulty at generalizing population-wide pancreas features. A volumetric spatial reference is needed to adapt the morphological variability for organ-specific analysis. Here, we proposed a high-resolution computed tomography (CT) atlas framework specifically optimized for the pancreas organ across multi-contrast CT. We introduce a deep learning-based pre-processing technique to extract the abdominal region of interests (ROIs) and leverage a hierarchical registration pipeline to align the pancreas anatomy across populations. Briefly, DEEDs affine and non-rigid registration are performed to transfer patient abdominal volumes to a fixed high-resolution atlas template. To generate and evaluate the pancreas atlas template, multi-contrast modality CT scans of 443 subjects (without reported history of pancreatic disease, age: 15-50 years old) are processed. Comparing with different registration state-of-the-art tools, the combination of DEEDs affine and non-rigid registration achieves the best performance for the pancreas label transfer across all contrast phases. We further perform external evaluation with another research cohort of 100 de-identified portal venous scans with 13 organs labeled, having the best label transfer performance of 0.504 Dice score in unsupervised setting. The qualitative representation (e.g., average map**) of each phase creates a clear boundary of pancreas and its distinctive contrast appearance. The deformation surface renderings across scales (e.g., small to large volume) further illustrate the generalizability of the proposed atlas template.
△ Less
Submitted 2 June, 2023;
originally announced June 2023.
-
Longitudinal Multimodal Transformer Integrating Imaging and Latent Clinical Signatures From Routine EHRs for Pulmonary Nodule Classification
Authors:
Thomas Z. Li,
John M. Still,
Kaiwen Xu,
Ho Hin Lee,
Leon Y. Cai,
Aravind R. Krishnan,
Riqiang Gao,
Mirza S. Khan,
Sanja Antic,
Michael Kammer,
Kim L. Sandler,
Fabien Maldonado,
Bennett A. Landman,
Thomas A. Lasko
Abstract:
The accuracy of predictive models for solitary pulmonary nodule (SPN) diagnosis can be greatly increased by incorporating repeat imaging and medical context, such as electronic health records (EHRs). However, clinically routine modalities such as imaging and diagnostic codes can be asynchronous and irregularly sampled over different time scales which are obstacles to longitudinal multimodal learni…
▽ More
The accuracy of predictive models for solitary pulmonary nodule (SPN) diagnosis can be greatly increased by incorporating repeat imaging and medical context, such as electronic health records (EHRs). However, clinically routine modalities such as imaging and diagnostic codes can be asynchronous and irregularly sampled over different time scales which are obstacles to longitudinal multimodal learning. In this work, we propose a transformer-based multimodal strategy to integrate repeat imaging with longitudinal clinical signatures from routinely collected EHRs for SPN classification. We perform unsupervised disentanglement of latent clinical signatures and leverage time-distance scaled self-attention to jointly learn from clinical signatures expressions and chest computed tomography (CT) scans. Our classifier is pretrained on 2,668 scans from a public dataset and 1,149 subjects with longitudinal chest CTs, billing codes, medications, and laboratory tests from EHRs of our home institution. Evaluation on 227 subjects with challenging SPNs revealed a significant AUC improvement over a longitudinal multimodal baseline (0.824 vs 0.752 AUC), as well as improvements over a single cross-section multimodal scenario (0.809 AUC) and a longitudinal imaging-only scenario (0.741 AUC). This work demonstrates significant advantages with a novel approach for co-learning longitudinal imaging and non-imaging phenotypes with transformers. Code available at https://github.com/MASILab/lmsignatures.
△ Less
Submitted 29 June, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for Medical Image Segmentation
Authors:
Ho Hin Lee,
Quan Liu,
Shunxing Bao,
Qi Yang,
Xin Yu,
Leon Y. Cai,
Thomas Li,
Yuankai Huo,
Xenofon Koutsoukos,
Bennett A. Landman
Abstract:
With the inspiration of vision transformers, the concept of depth-wise convolution revisits to provide a large Effective Receptive Field (ERF) using Large Kernel (LK) sizes for medical image segmentation. However, the segmentation performance might be saturated and even degraded as the kernel sizes scaled up (e.g., $21\times 21\times 21$) in a Convolutional Neural Network (CNN). We hypothesize tha…
▽ More
With the inspiration of vision transformers, the concept of depth-wise convolution revisits to provide a large Effective Receptive Field (ERF) using Large Kernel (LK) sizes for medical image segmentation. However, the segmentation performance might be saturated and even degraded as the kernel sizes scaled up (e.g., $21\times 21\times 21$) in a Convolutional Neural Network (CNN). We hypothesize that convolution with LK sizes is limited to maintain an optimal convergence for locality learning. While Structural Re-parameterization (SR) enhances the local convergence with small kernels in parallel, optimal small kernel branches may hinder the computational efficiency for training. In this work, we propose RepUX-Net, a pure CNN architecture with a simple large kernel block design, which competes favorably with current network state-of-the-art (SOTA) (e.g., 3D UX-Net, SwinUNETR) using 6 challenging public datasets. We derive an equivalency between kernel re-parameterization and the branch-wise variation in kernel convergence. Inspired by the spatial frequency in the human visual system, we extend to vary the kernel convergence into element-wise setting and model the spatial frequency as a Bayesian prior to re-parameterize convolutional weights during training. Specifically, a reciprocal function is leveraged to estimate a frequency-weighted value, which rescales the corresponding kernel element for stochastic gradient descent. From the experimental results, RepUX-Net consistently outperforms 3D SOTA benchmarks with internal validation (FLARE: 0.929 to 0.944), external validation (MSD: 0.901 to 0.932, KiTS: 0.815 to 0.847, LiTS: 0.933 to 0.949, TCIA: 0.736 to 0.779) and transfer learning (AMOS: 0.880 to 0.911) scenarios in Dice Score.
△ Less
Submitted 5 June, 2023; v1 submitted 10 March, 2023;
originally announced March 2023.
-
Single Slice Thigh CT Muscle Group Segmentation with Domain Adaptation and Self-Training
Authors:
Qi Yang,
Xin Yu,
Ho Hin Lee,
Leon Y. Cai,
Kaiwen Xu,
Shunxing Bao,
Yuankai Huo,
Ann Zenobia Moore,
Sokratis Makrogiannis,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
Objective: Thigh muscle group segmentation is important for assessment of muscle anatomy, metabolic disease and aging. Many efforts have been put into quantifying muscle tissues with magnetic resonance (MR) imaging including manual annotation of individual muscles. However, leveraging publicly available annotations in MR images to achieve muscle group segmentation on single slice computed tomograp…
▽ More
Objective: Thigh muscle group segmentation is important for assessment of muscle anatomy, metabolic disease and aging. Many efforts have been put into quantifying muscle tissues with magnetic resonance (MR) imaging including manual annotation of individual muscles. However, leveraging publicly available annotations in MR images to achieve muscle group segmentation on single slice computed tomography (CT) thigh images is challenging.
Method: We propose an unsupervised domain adaptation pipeline with self-training to transfer labels from 3D MR to single CT slice. First, we transform the image appearance from MR to CT with CycleGAN and feed the synthesized CT images to a segmenter simultaneously. Single CT slices are divided into hard and easy cohorts based on the entropy of pseudo labels inferenced by the segmenter. After refining easy cohort pseudo labels based on anatomical assumption, self-training with easy and hard splits is applied to fine tune the segmenter.
Results: On 152 withheld single CT thigh images, the proposed pipeline achieved a mean Dice of 0.888(0.041) across all muscle groups including sartorius, hamstrings, quadriceps femoris and gracilis. muscles
Conclusion: To our best knowledge, this is the first pipeline to achieve thigh imaging domain adaptation from MR to CT. The proposed pipeline is effective and robust in extracting muscle groups on 2D single slice CT thigh images.The container is available for public use at https://github.com/MASILab/DA_CT_muscle_seg
△ Less
Submitted 30 November, 2022;
originally announced December 2022.
-
Reducing Positional Variance in Cross-sectional Abdominal CT Slices with Deep Conditional Generative Models
Authors:
Xin Yu,
Qi Yang,
Yucheng Tang,
Riqiang Gao,
Shunxing Bao,
LeonY. Cai,
Ho Hin Lee,
Yuankai Huo,
Ann Zenobia Moore,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
2D low-dose single-slice abdominal computed tomography (CT) slice enables direct measurements of body composition, which are critical to quantitatively characterizing health relationships on aging. However, longitudinal analysis of body composition changes using 2D abdominal slices is challenging due to positional variance between longitudinal slices acquired in different years. To reduce the posi…
▽ More
2D low-dose single-slice abdominal computed tomography (CT) slice enables direct measurements of body composition, which are critical to quantitatively characterizing health relationships on aging. However, longitudinal analysis of body composition changes using 2D abdominal slices is challenging due to positional variance between longitudinal slices acquired in different years. To reduce the positional variance, we extend the conditional generative models to our C-SliceGen that takes an arbitrary axial slice in the abdominal region as the condition and generates a defined vertebral level slice by estimating the structural changes in the latent space. Experiments on 1170 subjects from an in-house dataset and 50 subjects from BTCV MICCAI Challenge 2015 show that our model can generate high quality images in terms of realism and similarity. External experiments on 20 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset that contains longitudinal single abdominal slices validate that our method can harmonize the slice positional variance in terms of muscle and visceral fat area. Our approach provides a promising direction of map** slices from different vertebral levels to a target slice to reduce positional variance for single slice longitudinal analysis. The source code is available at: https://github.com/MASILab/C-SliceGen.
△ Less
Submitted 28 September, 2022;
originally announced September 2022.
-
UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation
Authors:
Xin Yu,
Qi Yang,
Yinchi Zhou,
Leon Y. Cai,
Riqiang Gao,
Ho Hin Lee,
Thomas Li,
Shunxing Bao,
Zhoubing Xu,
Thomas A. Lasko,
Richard G. Abramson,
Zizhao Zhang,
Yuankai Huo,
Bennett A. Landman,
Yucheng Tang
Abstract:
Transformer-based models, capable of learning better global dependencies, have recently demonstrated exceptional representation learning capabilities in computer vision and medical image analysis. Transformer reformats the image into separate patches and realizes global communication via the self-attention mechanism. However, positional information between patches is hard to preserve in such 1D se…
▽ More
Transformer-based models, capable of learning better global dependencies, have recently demonstrated exceptional representation learning capabilities in computer vision and medical image analysis. Transformer reformats the image into separate patches and realizes global communication via the self-attention mechanism. However, positional information between patches is hard to preserve in such 1D sequences, and loss of it can lead to sub-optimal performance when dealing with large amounts of heterogeneous tissues of various sizes in 3D medical image segmentation. Additionally, current methods are not robust and efficient for heavy-duty medical segmentation tasks such as predicting a large number of tissue classes or modeling globally inter-connected tissue structures. To address such challenges and inspired by the nested hierarchical structures in vision transformer, we proposed a novel 3D medical image segmentation method (UNesT), employing a simplified and faster-converging transformer encoder design that achieves local communication among spatially adjacent patch sequences by aggregating them hierarchically. We extensively validate our method on multiple challenging datasets, consisting of multiple modalities, anatomies, and a wide range of tissue classes, including 133 structures in the brain, 14 organs in the abdomen, 4 hierarchical components in the kidneys, inter-connected kidney tumors and brain tumors. We show that UNesT consistently achieves state-of-the-art performance and evaluate its generalizability and data efficiency. Particularly, the model achieves whole brain segmentation task complete ROI with 133 tissue classes in a single network, outperforming prior state-of-the-art method SLANT27 ensembled with 27 networks.
△ Less
Submitted 7 September, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Pseudo-Label Guided Multi-Contrast Generalization for Non-Contrast Organ-Aware Segmentation
Authors:
Ho Hin Lee,
Yucheng Tang,
Riqiang Gao,
Qi Yang,
Xin Yu,
Shunxing Bao,
James G. Terry,
J. Jeffrey Carr,
Yuankai Huo,
Bennett A. Landman
Abstract:
Non-contrast computed tomography (NCCT) is commonly acquired for lung cancer screening, assessment of general abdominal pain or suspected renal stones, trauma evaluation, and many other indications. However, the absence of contrast limits distinguishing organ in-between boundaries. In this paper, we propose a novel unsupervised approach that leverages pairwise contrast-enhanced CT (CECT) context t…
▽ More
Non-contrast computed tomography (NCCT) is commonly acquired for lung cancer screening, assessment of general abdominal pain or suspected renal stones, trauma evaluation, and many other indications. However, the absence of contrast limits distinguishing organ in-between boundaries. In this paper, we propose a novel unsupervised approach that leverages pairwise contrast-enhanced CT (CECT) context to compute non-contrast segmentation without ground-truth label. Unlike generative adversarial approaches, we compute the pairwise morphological context with CECT to provide teacher guidance instead of generating fake anatomical context. Additionally, we further augment the intensity correlations in 'organ-specific' settings and increase the sensitivity to organ-aware boundary. We validate our approach on multi-organ segmentation with paired non-contrast & contrast-enhanced CT scans using five-fold cross-validation. Full external validations are performed on an independent non-contrast cohort for aorta segmentation. Compared with current abdominal organs segmentation state-of-the-art in fully supervised setting, our proposed pipeline achieves a significantly higher Dice by 3.98% (internal multi-organ annotated), and 8.00% (external aorta annotated) for abdominal organs segmentation. The code and pretrained models are publicly available at https://github.com/MASILab/ContrastMix.
△ Less
Submitted 12 May, 2022;
originally announced May 2022.
-
ModDrop++: A Dynamic Filter Network with Intra-subject Co-training for Multiple Sclerosis Lesion Segmentation with Missing Modalities
Authors:
Han Liu,
Yubo Fan,
Hao Li,
Jiacheng Wang,
Dewei Hu,
Can Cui,
Ho Hin Lee,
Huahong Zhang,
Ipek Oguz
Abstract:
Multiple Sclerosis (MS) is a chronic neuroinflammatory disease and multi-modality MRIs are routinely used to monitor MS lesions. Many automatic MS lesion segmentation models have been developed and have reached human-level performance. However, most established methods assume the MRI modalities used during training are also available during testing, which is not guaranteed in clinical practice. Pr…
▽ More
Multiple Sclerosis (MS) is a chronic neuroinflammatory disease and multi-modality MRIs are routinely used to monitor MS lesions. Many automatic MS lesion segmentation models have been developed and have reached human-level performance. However, most established methods assume the MRI modalities used during training are also available during testing, which is not guaranteed in clinical practice. Previously, a training strategy termed Modality Dropout (ModDrop) has been applied to MS lesion segmentation to achieve the state-of-the-art performance with missing modality. In this paper, we present a novel method dubbed ModDrop++ to train a unified network adaptive to an arbitrary number of input MRI sequences. ModDrop++ upgrades the main idea of ModDrop in two key ways. First, we devise a plug-and-play dynamic head and adopt a filter scaling strategy to improve the expressiveness of the network. Second, we design a co-training strategy to leverage the intra-subject relation between full modality and missing modality. Specifically, the intra-subject co-training strategy aims to guide the dynamic head to generate similar feature representations between the full- and missing-modality data from the same subject. We use two public MS datasets to show the superiority of ModDrop++. Source code and trained models are available at https://github.com/han-liu/ModDropPlusPlus.
△ Less
Submitted 1 July, 2022; v1 submitted 7 March, 2022;
originally announced March 2022.
-
Characterizing Renal Structures with 3D Block Aggregate Transformers
Authors:
Xin Yu,
Yucheng Tang,
Yinchi Zhou,
Riqiang Gao,
Qi Yang,
Ho Hin Lee,
Thomas Li,
Shunxing Bao,
Yuankai Huo,
Zhoubing Xu,
Thomas A. Lasko,
Richard G. Abramson,
Bennett A. Landman
Abstract:
Efficiently quantifying renal structures can provide distinct spatial context and facilitate biomarker discovery for kidney morphology. However, the development and evaluation of the transformer model to segment the renal cortex, medulla, and collecting system remains challenging due to data inefficiency. Inspired by the hierarchical structures in vision transformer, we propose a novel method usin…
▽ More
Efficiently quantifying renal structures can provide distinct spatial context and facilitate biomarker discovery for kidney morphology. However, the development and evaluation of the transformer model to segment the renal cortex, medulla, and collecting system remains challenging due to data inefficiency. Inspired by the hierarchical structures in vision transformer, we propose a novel method using a 3D block aggregation transformer for segmenting kidney components on contrast-enhanced CT scans. We construct the first cohort of renal substructures segmentation dataset with 116 subjects under institutional review board (IRB) approval. Our method yields the state-of-the-art performance (Dice of 0.8467) against the baseline approach of 0.8308 with the data-efficient design. The Pearson R achieves 0.9891 between the proposed method and manual standards and indicates the strong correlation and reproducibility for volumetric analysis. We extend the proposed method to the public KiTS dataset, the method leads to improved accuracy compared to transformer-based approaches. We show that the 3D block aggregation transformer can achieve local communication between sequence representations without modifying self-attention, and it can serve as an accurate and efficient quantification tool for characterizing renal structures.
△ Less
Submitted 4 March, 2022;
originally announced March 2022.
-
Random Multi-Channel Image Synthesis for Multiplexed Immunofluorescence Imaging
Authors:
Shunxing Bao,
Yucheng Tang,
Ho Hin Lee,
Riqiang Gao,
Sophie Chiron,
Ilwoo Lyu,
Lori A. Coburn,
Keith T. Wilson,
Joseph T. Roland,
Bennett A. Landman,
Yuankai Huo
Abstract:
Multiplex immunofluorescence (MxIF) is an emerging imaging technique that produces the high sensitivity and specificity of single-cell map**. With a tenet of 'seeing is believing', MxIF enables iterative staining and imaging extensive antibodies, which provides comprehensive biomarkers to segment and group different cells on a single tissue section. However, considerable depletion of the scarce…
▽ More
Multiplex immunofluorescence (MxIF) is an emerging imaging technique that produces the high sensitivity and specificity of single-cell map**. With a tenet of 'seeing is believing', MxIF enables iterative staining and imaging extensive antibodies, which provides comprehensive biomarkers to segment and group different cells on a single tissue section. However, considerable depletion of the scarce tissue is inevitable from extensive rounds of staining and bleaching ('missing tissue'). Moreover, the immunofluorescence (IF) imaging can globally fail for particular rounds ('missing stain''). In this work, we focus on the 'missing stain' issue. It would be appealing to develop digital image synthesis approaches to restore missing stain images without losing more tissue physically. Herein, we aim to develop image synthesis approaches for eleven MxIF structural molecular markers (i.e., epithelial and stromal) on real samples. We propose a novel multi-channel high-resolution image synthesis approach, called pixN2N-HD, to tackle possible missing stain scenarios via a high-resolution generative adversarial network (GAN). Our contribution is three-fold: (1) a single deep network framework is proposed to tackle missing stain in MxIF; (2) the proposed 'N-to-N' strategy reduces theoretical four years of computational time to 20 hours when covering all possible missing stains scenarios, with up to five missing stains (e.g., '(N-1)-to-1', '(N-2)-to-2'); and (3) this work is the first comprehensive experimental study of investigating cross-stain synthesis in MxIF. Our results elucidate a promising direction of advancing MxIF imaging with deep image synthesis.
△ Less
Submitted 18 September, 2021;
originally announced September 2021.
-
Lung Cancer Risk Estimation with Incomplete Data: A Joint Missing Imputation Perspective
Authors:
Riqiang Gao,
Yucheng Tang,
Kaiwen Xu,
Ho Hin Lee,
Steve Deppen,
Kim Sandler,
Pierre Massion,
Thomas A. Lasko,
Yuankai Huo,
Bennett A. Landman
Abstract:
Data from multi-modality provide complementary information in clinical prediction, but missing data in clinical cohorts limits the number of subjects in multi-modal learning context. Multi-modal missing imputation is challenging with existing methods when 1) the missing data span across heterogeneous modalities (e.g., image vs. non-image); or 2) one modality is largely missing. In this paper, we a…
▽ More
Data from multi-modality provide complementary information in clinical prediction, but missing data in clinical cohorts limits the number of subjects in multi-modal learning context. Multi-modal missing imputation is challenging with existing methods when 1) the missing data span across heterogeneous modalities (e.g., image vs. non-image); or 2) one modality is largely missing. In this paper, we address imputation of missing data by modeling the joint distribution of multi-modal data. Motivated by partial bidirectional generative adversarial net (PBiGAN), we propose a new Conditional PBiGAN (C-PBiGAN) method that imputes one modality combining the conditional knowledge from another modality. Specifically, C-PBiGAN introduces a conditional latent space in a missing imputation framework that jointly encodes the available multi-modal data, along with a class regularization loss on imputed data to recover discriminative information. To our knowledge, it is the first generative adversarial model that addresses multi-modal missing imputation by modeling the joint distribution of image and non-image data. We validate our model with both the national lung screening trial (NLST) dataset and an external clinical validation cohort. The proposed C-PBiGAN achieves significant improvements in lung cancer risk estimation compared with representative imputation methods (e.g., AUC values increase in both NLST (+2.9\%) and in-house dataset (+4.3\%) compared with PBiGAN, p$<$0.05).
△ Less
Submitted 25 July, 2021;
originally announced July 2021.
-
Multi-Contrast Computed Tomography Healthy Kidney Atlas
Authors:
Ho Hin Lee,
Yucheng Tang,
Kaiwen Xu,
Shunxing Bao,
Agnes B. Fogo,
Raymond Harris,
Mark P. de Caestecker,
Mattias Heinrich,
Jeffrey M. Spraggins,
Yuankai Huo,
Bennett A. Landman
Abstract:
The construction of three-dimensional multi-modal tissue maps provides an opportunity to spur interdisciplinary innovations across temporal and spatial scales through information integration. While the preponderance of effort is allocated to the cellular level and explore the changes in cell interactions and organizations, contextualizing findings within organs and systems is essential to visualiz…
▽ More
The construction of three-dimensional multi-modal tissue maps provides an opportunity to spur interdisciplinary innovations across temporal and spatial scales through information integration. While the preponderance of effort is allocated to the cellular level and explore the changes in cell interactions and organizations, contextualizing findings within organs and systems is essential to visualize and interpret higher resolution linkage across scales. There is a substantial normal variation of kidney morphometry and appearance across body size, sex, and imaging protocols in abdominal computed tomography (CT). A volumetric atlas framework is needed to integrate and visualize the variability across scales. However, there is no abdominal and retroperitoneal organs atlas framework for multi-contrast CT. Hence, we proposed a high-resolution CT retroperitoneal atlas specifically optimized for the kidney across non-contrast CT and early arterial, late arterial, venous and delayed contrast enhanced CT. Briefly, we introduce a deep learning-based volume of interest extraction method and an automated two-stage hierarchal registration pipeline to register abdominal volumes to a high-resolution CT atlas template. To generate and evaluate the atlas, multi-contrast modality CT scans of 500 subjects (without reported history of renal disease, age: 15-50 years, 250 males & 250 females) were processed. We demonstrate a stable generalizability of the atlas template for integrating the normal kidney variation from small to large, across contrast modalities and populations with great variability of demographics. The linkage of atlas and demographics provided a better understanding of the variation of kidney anatomy across populations.
△ Less
Submitted 23 December, 2020; v1 submitted 22 December, 2020;
originally announced December 2020.
-
RAP-Net: Coarse-to-Fine Multi-Organ Segmentation with Single Random Anatomical Prior
Authors:
Ho Hin Lee,
Yucheng Tang,
Shunxing Bao,
Richard G. Abramson,
Yuankai Huo,
Bennett A. Landman
Abstract:
Performing coarse-to-fine abdominal multi-organ segmentation facilitates to extract high-resolution segmentation minimizing the lost of spatial contextual information. However, current coarse-to-refine approaches require a significant number of models to perform single organ refine segmentation corresponding to the extracted organ region of interest (ROI). We propose a coarse-to-fine pipeline, whi…
▽ More
Performing coarse-to-fine abdominal multi-organ segmentation facilitates to extract high-resolution segmentation minimizing the lost of spatial contextual information. However, current coarse-to-refine approaches require a significant number of models to perform single organ refine segmentation corresponding to the extracted organ region of interest (ROI). We propose a coarse-to-fine pipeline, which starts from the extraction of the global prior context of multiple organs from 3D volumes using a low-resolution coarse network, followed by a fine phase that uses a single refined model to segment all abdominal organs instead of multiple organ corresponding models. We combine the anatomical prior with corresponding extracted patches to preserve the anatomical locations and boundary information for performing high-resolution segmentation across all organs in a single model. To train and evaluate our method, a clinical research cohort consisting of 100 patient volumes with 13 organs well-annotated is used. We tested our algorithms with 4-fold cross-validation and computed the Dice score for evaluating the segmentation performance of the 13 organs. Our proposed method using single auto-context outperforms the state-of-the-art on 13 models with an average Dice score 84.58% versus 81.69% (p<0.0001).
△ Less
Submitted 23 December, 2020; v1 submitted 22 December, 2020;
originally announced December 2020.
-
Validation and Optimization of Multi-Organ Segmentation on Clinical Imaging Archives
Authors:
Yuchen Xu,
Olivia Tang,
Yucheng Tang,
Ho Hin Lee,
Yunqiang Chen,
Dashan Gao,
Shizhong Han,
Riqiang Gao,
Michael R. Savona,
Richard G. Abramson,
Yuankai Huo,
Bennett A. Landman
Abstract:
Segmentation of abdominal computed tomography(CT) provides spatial context, morphological properties, and a framework for tissue-specific radiomics to guide quantitative Radiological assessment. A 2015 MICCAI challenge spurred substantial innovation in multi-organ abdominal CT segmentation with both traditional and deep learning methods. Recent innovations in deep methods have driven performance t…
▽ More
Segmentation of abdominal computed tomography(CT) provides spatial context, morphological properties, and a framework for tissue-specific radiomics to guide quantitative Radiological assessment. A 2015 MICCAI challenge spurred substantial innovation in multi-organ abdominal CT segmentation with both traditional and deep learning methods. Recent innovations in deep methods have driven performance toward levels for which clinical translation is appealing. However, continued cross-validation on open datasets presents the risk of indirect knowledge contamination and could result in circular reasoning. Moreover, 'real world' segmentations can be challenging due to the wide variability of abdomen physiology within patients. Herein, we perform two data retrievals to capture clinically acquired deidentified abdominal CT cohorts with respect to a recently published variation on 3D U-Net (baseline algorithm). First, we retrieved 2004 deidentified studies on 476 patients with diagnosis codes involving spleen abnormalities (cohort A). Second, we retrieved 4313 deidentified studies on 1754 patients without diagnosis codes involving spleen abnormalities (cohort B). We perform prospective evaluation of the existing algorithm on both cohorts, yielding 13% and 8% failure rate, respectively. Then, we identified 51 subjects in cohort A with segmentation failures and manually corrected the liver and gallbladder labels. We re-trained the model adding the manual labels, resulting in performance improvement of 9% and 6% failure rate for the A and B cohorts, respectively. In summary, the performance of the baseline on the prospective cohorts was similar to that on previously published datasets. Moreover, adding data from the first cohort substantively improved performance when evaluated on the second withheld validation cohort.
△ Less
Submitted 10 February, 2020;
originally announced February 2020.
-
Contrast Phase Classification with a Generative Adversarial Network
Authors:
Yucheng Tang,
Ho Hin Lee,
Yuchen Xu,
Olivia Tang,
Yunqiang Chen,
Dashan Gao,
Shizhong Han,
Riqiang Gao,
Camilo Bermudez,
Michael R. Savona,
Richard G. Abramson,
Yuankai Huo,
Bennett A. Landman
Abstract:
Dynamic contrast enhanced computed tomography (CT) is an imaging technique that provides critical information on the relationship of vascular structure and dynamics in the context of underlying anatomy. A key challenge for image processing with contrast enhanced CT is that phase discrepancies are latent in different tissues due to contrast protocols, vascular dynamics, and metabolism variance. Pre…
▽ More
Dynamic contrast enhanced computed tomography (CT) is an imaging technique that provides critical information on the relationship of vascular structure and dynamics in the context of underlying anatomy. A key challenge for image processing with contrast enhanced CT is that phase discrepancies are latent in different tissues due to contrast protocols, vascular dynamics, and metabolism variance. Previous studies with deep learning frameworks have been proposed for classifying contrast enhancement with networks inspired by computer vision. Here, we revisit the challenge in the context of whole abdomen contrast enhanced CTs. To capture and compensate for the complex contrast changes, we propose a novel discriminator in the form of a multi-domain disentangled representation learning network. The goal of this network is to learn an intermediate representation that separates contrast enhancement from anatomy and enables classification of images with varying contrast time. Briefly, our unpaired contrast disentangling GAN(CD-GAN) Discriminator follows the ResNet architecture to classify a CT scan from different enhancement phases. To evaluate the approach, we trained the enhancement phase classifier on 21060 slices from two clinical cohorts of 230 subjects. Testing was performed on 9100 slices from 30 independent subjects who had been imaged with CT scans from all contrast phases. Performance was quantified in terms of the multi-class normalized confusion matrix. The proposed network significantly improved correspondence over baseline UNet, ResNet50 and StarGAN performance of accuracy scores 0.54. 0.55, 0.62 and 0.91, respectively. The proposed discriminator from the disentangled network presents a promising technique that may allow deeper modeling of dynamic imaging against patient specific anatomies.
△ Less
Submitted 14 November, 2019;
originally announced November 2019.
-
Semi-Supervised Multi-Organ Segmentation through Quality Assurance Supervision
Authors:
Ho Hin Lee,
Yucheng Tang,
Olivia Tang,
Yuchen Xu,
Yunqiang Chen,
Dashan Gao,
Shizhong Han,
Riqiang Gao,
Michael R. Savona,
Richard G. Abramson,
Yuankai Huo,
Bennett A. Landman
Abstract:
Human in-the-loop quality assurance (QA) is typically performed after medical image segmentation to ensure that the systems are performing as intended, as well as identifying and excluding outliers. By performing QA on large-scale, previously unlabeled testing data, categorical QA scores can be generatedIn this paper, we propose a semi-supervised multi-organ segmentation deep neural network consis…
▽ More
Human in-the-loop quality assurance (QA) is typically performed after medical image segmentation to ensure that the systems are performing as intended, as well as identifying and excluding outliers. By performing QA on large-scale, previously unlabeled testing data, categorical QA scores can be generatedIn this paper, we propose a semi-supervised multi-organ segmentation deep neural network consisting of a traditional segmentation model generator and a QA involved discriminator. A large-scale dataset of 2027 volumes are used to train the generator, whose 2-D montage images and segmentation mask with QA scores are used to train the discriminator. To generate the QA scores, the 2-D montage images were reviewed manually and coded 0 (success), 1 (errors consistent with published performance), and 2 (gross failure). Then, the ResNet-18 network was trained with 1623 montage images in equal distribution of all three code labels and achieved an accuracy 94% for classification predictions with 404 montage images withheld for the test cohort. To assess the performance of using the QA supervision, the discriminator was used as a loss function in a multi-organ segmentation pipeline. The inclusion of QA-loss function boosted performance on the unlabeled test dataset from 714 patients to 951 patients over the baseline model. Additionally, the number of failures decreased from 606 (29.90%) to 402 (19.83%). The contributions of the proposed method are threefold: We show that (1) the QA scores can be used as a loss function to perform semi-supervised learning for unlabeled data, (2) the well trained discriminator is learnt by QA score rather than traditional true/false, and (3) the performance of multi-organ segmentation on unlabeled datasets can be fine-tuned with more robust and higher accuracy than the original baseline method.
△ Less
Submitted 12 November, 2019;
originally announced November 2019.