-
Transformer-based segmentation of adnexal lesions and ovarian implants in CT images
Authors:
Aneesh Rangnekar,
Kevin M. Boehm,
Emily A. Aherne,
Ines Nikolovski,
Natalie Gangai,
Ying Liu,
Dimitry Zamarin,
Kara L. Roche,
Sohrab P. Shah,
Yulia Lakhman,
Harini Veeraraghavan
Abstract:
Two self-supervised pretrained transformer-based segmentation models (SMIT and Swin UNETR) fine-tuned on a dataset of ovarian cancer CT images provided reasonably accurate delineations of the tumors in an independent test dataset. Tumors in the adnexa were segmented more accurately by both transformers (SMIT and Swin UNETR) than the omental implants. AI-assisted labeling performed on 72 out of 245 omental implants resulted in a smaller manual editing effort of 39.55 mm, compared to 106.49 mm for full manual correction of partial labels, and improved overall accuracy. Neither SMIT nor Swin UNETR generated any false detections of omental metastases in the urinary bladder, and both generated relatively few false detections in the small bowel, averaging 2.16 cc for SMIT and 7.37 cc for Swin UNETR.
Submitted 25 June, 2024;
originally announced June 2024.
-
Self-supervised learning improves robustness of deep learning lung tumor segmentation to CT imaging differences
Authors:
Jue Jiang,
Aneesh Rangnekar,
Harini Veeraraghavan
Abstract:
Self-supervised learning (SSL) is an approach to extract useful feature representations from unlabeled data, and enable fine-tuning on downstream tasks with limited labeled examples. Self-pretraining is an SSL approach that uses the curated task dataset for both pretraining the networks and fine-tuning them. Availability of large, diverse, and uncurated public medical image sets provides the opportunity to apply SSL in the "wild" and potentially extract features robust to imaging variations. However, the benefit of wild- vs self-pretraining has not been studied for medical image analysis. In this paper, we compare robustness of wild versus self-pretrained transformer (vision transformer [ViT] and hierarchical shifted window [Swin]) models to computed tomography (CT) imaging differences for non-small cell lung cancer (NSCLC) segmentation. Wild-pretrained Swin models outperformed self-pretrained Swin for the various imaging acquisitions. ViT resulted in similar accuracy for both wild- and self-pretrained models. The masked image prediction pretext task, which forces networks to learn the local structure, resulted in higher accuracy than the contrastive task, which models global image information. Wild-pretrained models resulted in higher feature reuse at the lower-level layers and feature differentiation close to the output layer after fine-tuning. Hence, we conclude that wild-pretrained networks were more robust to the analyzed CT imaging differences for lung tumor segmentation than self-pretrained methods. The Swin architecture benefited from such pretraining more than ViT.
Submitted 14 May, 2024;
originally announced May 2024.
-
Deep learning classifier of locally advanced rectal cancer treatment response from endoscopy images
Authors:
Jorge Tapias Gomez,
Aneesh Rangnekar,
Hannah Williams,
Hannah Thompson,
Julio Garcia-Aguilar,
Joshua Jesse Smith,
Harini Veeraraghavan
Abstract:
We developed a deep learning classifier of rectal cancer response (tumor vs. no-tumor) to total neoadjuvant treatment (TNT) from endoscopic images acquired before, during, and following TNT. We further evaluated the network's ability in a near out-of-distribution (OOD) problem to identify local regrowth (LR) from follow-up endoscopy images acquired several months to years after completing TNT. We addressed endoscopic image variability by using optimal mass transport-based image harmonization. We evaluated multiple training regularization schemes to study the ResNet-50 network's in-distribution and near-OOD generalization ability. Test time augmentation resulted in the most considerable accuracy improvement. Image harmonization resulted in slight accuracy improvement for the near-OOD cases. Our results suggest that off-the-shelf deep learning classifiers can detect rectal cancer from endoscopic images at various stages of therapy for surveillance.
Submitted 6 May, 2024;
originally announced May 2024.
-
Trustworthiness of Pretrained Transformers for Lung Cancer Segmentation
Authors:
Aneesh Rangnekar,
Nishant Nadkarni,
Jue Jiang,
Harini Veeraraghavan
Abstract:
We assessed the trustworthiness of two self-supervision pretrained transformer models, Swin UNETR and SMIT, for fine-tuned lung cancer (LC) tumor segmentation using 670 CT and MRI scans. We measured segmentation accuracy on two public 3D-CT datasets, robustness on CT scans of patients with COVID-19, CT scans of patients with ovarian cancer and T2-weighted MRI of men with prostate cancer, and zero-shot generalization of LC for T2-weighted MRIs. Both models demonstrated high accuracy on in-distribution data (Dice 0.80 for SMIT and 0.78 for Swin UNETR). SMIT showed similar near-out-of-distribution performance on CT scans (AUROC 89.85% vs. 89.19%) but significantly better far-out-of-distribution accuracy on CT (AUROC 97.2% vs. 87.1%) and MRI (92.15% vs. 73.8%). SMIT outperformed Swin UNETR in zero-shot segmentation on MRI (Dice 0.78 vs. 0.69). We expect these findings to guide the safe development and deployment of current and future pretrained models in routine clinical use.
Submitted 19 March, 2024;
originally announced March 2024.
-
Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis
Authors:
Jue Jiang,
Aneesh Rangnekar,
Chloe Min Seo Choi,
Harini Veeraraghavan
Abstract:
Pretraining vision transformers (ViT) with attention guided masked image modeling (MIM) has been shown to increase downstream accuracy for natural image analysis. The hierarchical shifted window (Swin) transformer, often used in medical image analysis, cannot use attention guided masking as it lacks an explicit [CLS] token, needed for computing attention maps for selective masking. We thus enhanced Swin with semantic class attention. We developed a co-distilled Swin transformer that combines a noisy momentum updated teacher to guide selective masking for MIM. Our approach, called semantic attention guided co-distillation with noisy teacher regularized Swin TransFormer (SMARTFormer), was applied for analyzing 3D computed tomography datasets with lung nodules and malignant lung cancers (LC). We also analyzed the impact of semantic attention and noisy teacher on pretraining and downstream accuracy. SMARTFormer classified lesions (malignant from benign) with a high accuracy of 0.895 on 1000 nodules, predicted LC treatment response with accuracy of 0.74, and achieved high accuracies even in limited data regimes. Pretraining with semantic attention and noisy teacher improved ability to distinguish semantically meaningful structures such as organs in an unsupervised clustering task and localize abnormal structures like tumors. Code and models will be made available through GitHub upon paper acceptance.
Submitted 3 July, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Deep Learning Based Dominant Index Lesion Segmentation for MR-guided Radiation Therapy of Prostate Cancer
Authors:
Josiah Simeth,
Jue Jiang,
Anton Nosov,
Andreas Wibmer,
Michael Zelefsky,
Neelam Tyagi,
Harini Veeraraghavan
Abstract:
Dose escalation radiotherapy allows increased control of prostate cancer (PCa) but requires segmentation of dominant index lesions (DIL), motivating the development of automated methods for fast, accurate, and consistent segmentation of PCa DIL. We evaluated five deep-learning networks on apparent diffusion coefficient (ADC) MRI from 500 lesions in 365 patients arising from internal training Dataset 1 (1.5 Tesla GE MR with endorectal coil), external ProstateX Dataset 2 (3 Tesla Siemens MR), and internal inter-rater Dataset 3 (3 Tesla Philips MR). The networks include: multiple resolution residually connected network (MRRN) and MRRN regularized in training with deep supervision (MRRN-DS), Unet, Unet++, ResUnet, and fast panoptic segmentation (FPSnet) as well as fast panoptic segmentation with smoothed labels (FPSnet-SL). Models were evaluated by volumetric DIL segmentation accuracy using Dice similarity coefficient (DSC) and detection accuracy, as a function of lesion aggressiveness, size, and location (Datasets 1 and 2), and accuracy with respect to two raters (on Dataset 3). In general, MRRN-DS segmented tumors more accurately than other methods on the testing datasets. MRRN-DS significantly outperformed ResUnet in Dataset 2 (DSC of 0.54 vs. 0.44, p<0.001) and the Unet++ in Dataset 3 (DSC of 0.45 vs. p=0.04). FPSnet-SL was similarly accurate as MRRN-DS in Dataset 2 (p = 0.30), but MRRN-DS significantly outperformed FPSnet and FPSnet-SL in both Dataset 1 (0.60 vs 0.51 [p=0.01] and 0.54 [p=0.049] respectively) and Dataset 3 (0.45 vs 0.06 [p=0.002] and 0.24 [p=0.004] respectively). Finally, MRRN-DS produced slightly higher agreement with an experienced radiologist than two radiologists in Dataset 3 (DSC of 0.45 vs. 0.41).
Submitted 6 March, 2023;
originally announced March 2023.
-
Progressively refined deep joint registration segmentation (ProRSeg) of gastrointestinal organs at risk: Application to MRI and cone-beam CT
Authors:
Jue Jiang,
Jun Hong,
Kathryn Tringale,
Marsha Reyngold,
Christopher Crane,
Neelam Tyagi,
Harini Veeraraghavan
Abstract:
Method: ProRSeg was trained using 5-fold cross-validation with 110 T2-weighted MRI acquired at 5 treatment fractions from 10 different patients, taking care that same patient scans were not placed in training and testing folds. Segmentation accuracy was measured using Dice similarity coefficient (DSC) and Hausdorff distance at 95th percentile (HD95). Registration consistency was measured using coefficient of variation (CV) in displacement of OARs. Ablation tests and accuracy comparisons against multiple methods were done. Finally, applicability of ProRSeg to segment cone-beam CT (CBCT) scans was evaluated on 80 scans using 5-fold cross-validation. Results: ProRSeg processed 3D volumes (128 $\times$ 192 $\times$ 128) in 3 secs on an NVIDIA Tesla V100 GPU. Its segmentations were significantly more accurate ($p<0.001$) than compared methods, achieving a DSC of 0.94 $\pm$ 0.02 for liver, 0.88 $\pm$ 0.04 for large bowel, 0.78 $\pm$ 0.03 for small bowel and 0.82 $\pm$ 0.04 for stomach-duodenum from MRI. ProRSeg achieved a DSC of 0.72 $\pm$ 0.01 for small bowel and 0.76 $\pm$ 0.03 for stomach-duodenum from CBCT. ProRSeg registrations resulted in the lowest CV in displacement (stomach-duodenum $CV_{x}$: 0.75%, $CV_{y}$: 0.73%, and $CV_{z}$: 0.81%; small bowel $CV_{x}$: 0.80%, $CV_{y}$: 0.80%, and $CV_{z}$: 0.68%; large bowel $CV_{x}$: 0.71%, $CV_{y}$: 0.81%, and $CV_{z}$: 0.75%). ProRSeg based dose accumulation accounting for intra-fraction (pre-treatment to post-treatment MRI scan) and inter-fraction motion showed that the organ dose constraints were violated in 4 patients for stomach-duodenum and in 3 patients for small bowel. Study limitations include lack of independent testing and ground truth phantom datasets to measure dose accumulation accuracy.
Submitted 25 October, 2022;
originally announced October 2022.
-
Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)
Authors:
Jue Jiang,
Neelam Tyagi,
Kathryn Tringale,
Christopher Crane,
Harini Veeraraghavan
Abstract:
Vision transformers, with their ability to more efficiently model long-range context, have demonstrated impressive accuracy gains in several computer vision and medical image analysis tasks including segmentation. However, such methods need large labeled datasets for training, which are hard to obtain for medical image analysis. Self-supervised learning (SSL) has demonstrated success in medical image segmentation using convolutional networks. In this work, we developed a self-distillation learning with masked image modeling method to perform SSL for vision transformers (SMIT) applied to 3D multi-organ segmentation from CT and MRI. Our contribution is a dense pixel-wise regression within masked patches called masked image prediction, which we combined with masked patch token distillation as pretext task to pre-train vision transformers. We show our approach is more accurate and requires fewer fine-tuning datasets than other pretext tasks. Unlike prior medical image methods, which typically used image sets arising from disease sites and imaging modalities corresponding to the target tasks, we used 3,643 CT scans (602,708 images) arising from head and neck, lung, and kidney cancers as well as COVID-19 for pre-training and applied it to abdominal organs segmentation from MRI of pancreatic cancer patients as well as publicly available 13 different abdominal organs segmentation from CT. Our method showed clear accuracy improvement (average DSC of 0.875 from MRI and 0.878 from CT) with reduced requirement for fine-tuning datasets over commonly used pretext tasks. Extensive comparisons against multiple current SSL methods were done. Code will be made available upon acceptance for publication.
Submitted 20 May, 2022;
originally announced May 2022.
-
Wasserstein Image Local Analysis: Histogram of Orientations, Smoothing and Edge Detection
Authors:
Jiening Zhu,
Harini Veeraraghavan,
Larry Norton,
Joseph O. Deasy,
Allen Tannenbaum
Abstract:
The Histogram of Oriented Gradients (HOG) is a widely used image feature, which describes local image directionality based on numerical differentiation. Due to its ill-posed nature, small noise may lead to large errors. Conventional HOG may fail to produce meaningful directionality results in the presence of noise, which is common in medical radiographic imaging. We approach the directionality problem from a novel perspective by the use of the optimal transport map of a local image patch to a uni-color patch of its mean. We decompose the transport map into sub-work costs in different directions. We evaluated the ability of the optimal transport to quantify tumor heterogeneity from brain MRI images of patients with glioblastoma multiforme from the TCIA. By considering the entropy difference of the extracted local directionality within tumor regions, we found that patients with higher entropy in their images had statistically significant worse overall survival ($p = 0.008$), which indicates that tumors exhibiting flows in many directions may be more malignant, perhaps reflecting high tumor histologic grade, a reflection of histologic disorganization. We also explored the possibility of solving classical image processing problems such as smoothing and edge detection via optimal transport. By looking for a 2-color patch with minimum transport distance to a local patch, we derive a nonlinear shock filter, which preserves edges. Moreover, we found that the color difference of the computed 2-color patch indicates whether there is a large change in color, i.e., an edge in the given patch. In summary, we expand the usefulness of optimal transport as an image local analysis tool, to extract robust measures of imaging tumor heterogeneity for outcomes prediction as well as image pre-processing. Because of its robust nature, we find it offers several advantages over the classical approaches.
Submitted 11 May, 2022;
originally announced May 2022.
-
One shot PACS: Patient specific Anatomic Context and Shape prior aware recurrent registration-segmentation of longitudinal thoracic cone beam CTs
Authors:
Jue Jiang,
Harini Veeraraghavan
Abstract:
Image-guided adaptive lung radiotherapy requires accurate tumor and organ segmentation from during-treatment cone-beam CT (CBCT) images. Thoracic CBCTs are hard to segment because of low soft-tissue contrast, imaging artifacts, respiratory motion, and large treatment-induced intra-thoracic anatomic changes. Hence, we developed a novel Patient-specific Anatomic Context and Shape prior or PACS-aware 3D recurrent registration-segmentation network for longitudinal thoracic CBCT segmentation. Segmentation and registration networks were concurrently trained in an end-to-end framework and implemented with convolutional long-short term memory models. The registration network was trained in an unsupervised manner using pairs of planning CT (pCT) and CBCT images and produced a progressively deformed sequence of images. The segmentation network was optimized in a one-shot setting by combining progressively deformed pCT (anatomic context) and pCT delineations (shape context) with CBCT images. Our method, one-shot PACS, was significantly more accurate ($p<0.001$) for tumor (DSC of 0.83 $\pm$ 0.08, surface DSC [sDSC] of 0.97 $\pm$ 0.06, and Hausdorff distance at $95^{th}$ percentile [HD95] of 3.97 $\pm$ 3.02 mm) and the esophagus (DSC of 0.78 $\pm$ 0.13, sDSC of 0.90 $\pm$ 0.14, HD95 of 3.22 $\pm$ 2.02) segmentation than multiple methods. Ablation tests and comparative experiments were also done.
Submitted 26 January, 2022;
originally announced January 2022.
-
Unpaired cross-modality educed distillation (CMEDL) for medical image segmentation
Authors:
Jue Jiang,
Andreas Rimner,
Joseph O. Deasy,
Harini Veeraraghavan
Abstract:
Accurate and robust segmentation of lung cancers from CT, even those located close to the mediastinum, is needed to more accurately plan and deliver radiotherapy and to measure treatment response. Therefore, we developed a new cross-modality educed distillation (CMEDL) approach, using unpaired CT and MRI scans, whereby an informative teacher MRI network guides a student CT network to extract features that signal the difference between foreground and background. Our contribution eliminates two requirements of distillation methods: (i) paired image sets by using an image to image (I2I) translation and (ii) pre-training of the teacher network with a large training set by using concurrent training of all networks. Our framework uses an end-to-end trained unpaired I2I translation, teacher, and student segmentation networks. Architectural flexibility of our framework is demonstrated using 3 segmentation and 2 I2I networks. Networks were trained with 377 CT and 82 T2w MRI from different sets of patients, with independent validation (N=209 tumors) and testing (N=609 tumors) datasets. Network design, methods to combine MRI with CT information, distillation learning under informative (MRI to CT), weak (CT to MRI) and equal teacher (MRI to MRI), and ablation tests were performed. Accuracy was measured using Dice similarity (DSC), surface Dice (sDSC), and Hausdorff distance at the 95$^{th}$ percentile (HD95). The CMEDL approach was significantly (p $<$ 0.001) more accurate (DSC of 0.77 vs. 0.73) than non-CMEDL methods with an informative teacher for CT lung tumor, with a weak teacher (DSC of 0.84 vs. 0.81) for MRI lung tumor, and with equal teacher (DSC of 0.90 vs. 0.88) for MRI multi-organ segmentation. CMEDL also reduced inter-rater lung tumor segmentation variabilities.
Submitted 7 December, 2021; v1 submitted 16 July, 2021;
originally announced July 2021.
-
Nested-block self-attention for robust radiotherapy planning segmentation
Authors:
Harini Veeraraghavan,
Jue Jiang,
Sharif Elguindi,
Sean L. Berry,
Ifeanyirochukwu Onochie,
Aditya Apte,
Laura Cervino,
Joseph O. Deasy
Abstract:
Although deep convolutional networks have been widely studied for head and neck (HN) organs at risk (OAR) segmentation, their use for routine clinical treatment planning is limited by a lack of robustness to imaging artifacts, low soft tissue contrast on CT, and the presence of abnormal anatomy. In order to address these challenges, we developed a computationally efficient nested block self-attention (NBSA) method that can be combined with any convolutional network. Our method achieves computational efficiency by performing non-local calculations within memory blocks of fixed spatial extent. Contextual dependencies are captured by passing information in a raster scan order between blocks, as well as through a second attention layer that causes bi-directional attention flow. We implemented our approach on three different networks to demonstrate feasibility. Following training using 200 cases, we performed comprehensive evaluations using conventional and clinical metrics on a separate set of 172 test scans sourced from external and internal institution datasets without any exclusion criteria. NBSA required a similar number of computations (15.7 gflops) as the most efficient criss-cross attention (CCA) method and generated significantly more accurate segmentations for brain stem (Dice of 0.89 vs. 0.86) and parotid glands (0.86 vs. 0.84) than CCA. NBSA's segmentations were less variable than multiple 3D methods, including for small organs with low soft-tissue contrast such as the submandibular glands (surface Dice of 0.90).
Submitted 26 February, 2021;
originally announced February 2021.
-
Deep cross-modality (MR-CT) educed distillation learning for cone beam CT lung tumor segmentation
Authors:
Jue Jiang,
Sadegh Riyahi Alam,
Ishita Chen,
Perry Zhang,
Andreas Rimner,
Joseph O. Deasy,
Harini Veeraraghavan
Abstract:
Despite the widespread availability of in-treatment room cone beam computed tomography (CBCT) imaging, due to the lack of reliable segmentation methods, CBCT is only used for gross set up corrections in lung radiotherapies. Accurate and reliable auto-segmentation tools could potentiate volumetric response assessment and geometry-guided adaptive radiation therapies. Therefore, we developed a new deep learning CBCT lung tumor segmentation method. Methods: The key idea of our approach, called cross modality educed distillation (CMEDL), is to use magnetic resonance imaging (MRI) to guide a CBCT segmentation network to extract more informative features during training. We accomplish this by training an end-to-end network comprised of unpaired domain adaptation (UDA) and cross-domain segmentation distillation networks (SDN) using unpaired CBCT and MRI datasets. Feature distillation regularizes the student network to extract CBCT features that match the statistical distribution of MRI features extracted by the teacher network and obtain better differentiation of tumor from background. We also compared against an alternative framework that used UDA with MR segmentation network, whereby segmentation was done on the synthesized pseudo MRI representation. All networks were trained with 216 weekly CBCTs and 82 T2-weighted turbo spin echo MRI acquired from different patient cohorts. Validation was done on 20 weekly CBCTs from patients not used in training. Independent testing was done on 38 weekly CBCTs from patients not used in training or validation. Segmentation accuracy was measured using surface Dice similarity coefficient (SDSC) and Hausdorff distance at 95th percentile (HD95) metrics.
Submitted 20 April, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Unified cross-modality feature disentangler for unsupervised multi-domain MRI abdomen organs segmentation
Authors:
Jue Jiang,
Harini Veeraraghavan
Abstract:
Our contribution is a unified cross-modality feature disentangling approach for multi-domain image translation and multiple organ segmentation. Using CT as the labeled source domain, our approach learns to segment multi-modal (T1-weighted and T2-weighted) MRI having no labeled data. Our approach uses a variational auto-encoder (VAE) to disentangle the image content from style. The VAE constrains the style feature encoding to match a universal prior (Gaussian) that is assumed to span the styles of all the source and target modalities. The extracted image style is converted into a latent style scaling code, which modulates the generator to produce multi-modality images according to the target domain code from the image content features. Finally, we introduce a joint distribution matching discriminator that combines the translated images with task-relevant segmentation probability maps to further constrain and regularize image-to-image (I2I) translations. We performed extensive comparisons to multiple state-of-the-art I2I translation and segmentation methods. Our approach resulted in the lowest average multi-domain image reconstruction error of 1.34 $\pm$ 0.04. Our approach produced an average Dice similarity coefficient (DSC) of 0.85 for T1w and 0.90 for T2w MRI for multi-organ segmentation, which was highly comparable to a fully supervised MRI multi-organ segmentation network (DSC of 0.86 for T1w and 0.90 for T2w MRI).
Submitted 19 July, 2020;
originally announced July 2020.
-
PSIGAN: Joint probabilistic segmentation and image distribution matching for unpaired cross-modality adaptation based MRI segmentation
Authors:
Jue Jiang,
Yu Chi Hu,
Neelam Tyagi,
Andreas Rimner,
Nancy Lee,
Joseph O. Deasy,
Sean Berry,
Harini Veeraraghavan
Abstract:
We developed a new joint probabilistic segmentation and image distribution matching generative adversarial network (PSIGAN) for unsupervised domain adaptation (UDA) and multi-organ segmentation from magnetic resonance imaging (MRI) images. Our UDA approach models the co-dependency between images and their segmentation as a joint probability distribution using a new structure discriminator. The structure discriminator computes structure of interest focused adversarial loss by combining the generated pseudo MRI with probabilistic segmentations produced by a simultaneously trained segmentation sub-network. The segmentation sub-network is trained using the pseudo MRI produced by the generator sub-network. This leads to a cyclical optimization of both the generator and segmentation sub-networks that are jointly trained as part of an end-to-end network. Extensive experiments and comparisons against multiple state-of-the-art methods were done on four different MRI sequences totalling 257 scans for generating multi-organ and tumor segmentation. The experiments included (a) 20 T1-weighted (T1w) in-phase mDixon and (b) 20 T2-weighted (T2w) abdominal MRI for segmenting liver, spleen, left and right kidneys, (c) 162 T2-weighted fat suppressed head and neck MRI (T2wFS) for parotid gland segmentation, and (d) 75 T2w MRI for lung tumor segmentation. Our method achieved an overall average DSC of 0.87 on T1w and 0.90 on T2w for the abdominal organs, 0.82 on T2wFS for the parotid glands, and 0.77 on T2w MRI for lung tumors.
Submitted 18 July, 2021; v1 submitted 18 July, 2020;
originally announced July 2020.
-
Multiple resolution residual network for automatic thoracic organs-at-risk segmentation from CT
Authors:
Hyemin Um,
Jue Jiang,
Maria Thor,
Andreas Rimner,
Leo Luo,
Joseph O. Deasy,
Harini Veeraraghavan
Abstract:
We implemented and evaluated a multiple resolution residual network (MRRN) for multiple normal organs-at-risk (OAR) segmentation from computed tomography (CT) images for thoracic radiotherapy treatment (RT) planning. Our approach simultaneously combines feature streams computed at multiple image resolutions and feature levels through residual connections. The feature streams at each level are updated as the images are passed through various feature levels. We trained our approach using 206 thoracic CT scans of lung cancer patients with 35 scans held out for validation to segment the left and right lungs, heart, esophagus, and spinal cord. This approach was tested on 60 CT scans from the open-source AAPM Thoracic Auto-Segmentation Challenge dataset. Performance was measured using the Dice Similarity Coefficient (DSC). Our approach outperformed the best-performing method in the grand challenge for hard-to-segment structures like the esophagus and achieved comparable results for all other structures. Median DSC using our method was 0.97 (interquartile range [IQR]: 0.97-0.98) for the left and right lungs, 0.93 (IQR: 0.93-0.95) for the heart, 0.78 (IQR: 0.76-0.80) for the esophagus, and 0.88 (IQR: 0.86-0.89) for the spinal cord.
Submitted 31 May, 2020; v1 submitted 27 May, 2020;
originally announced May 2020.
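The Dice Similarity Coefficient (DSC) reported in the abstract above is twice the overlap between two masks divided by the sum of their sizes. The following is a minimal sketch of that formula only, not the authors' evaluation code; the function name `dice_coefficient`, the flat 0/1 mask representation, and the convention for two empty masks are all assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 voxel labels (a hypothetical
    representation chosen to keep the sketch dependency-free).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the two foreground sizes.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 3 foreground voxels each, 2 of them shared.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
print(dice_coefficient(pred, truth))  # 2 * 2 / (3 + 3)
```

A value of 1 means identical masks and 0 means no overlap, so a median DSC of 0.78 for the esophagus indicates noticeably less overlap than 0.97 for the lungs.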
-
Local block-wise self attention for normal organ segmentation
Authors:
Jue Jiang,
Elguindi Sharif,
Hyemin Um,
Sean Berry,
Harini Veeraraghavan
Abstract:
We developed a new and computationally simple local block-wise self attention based normal structures segmentation approach applied to head and neck computed tomography (CT) images. Our method uses the insight that normal organs exhibit regularity in their spatial location and inter-relation within images, which can be leveraged to simplify the computations required to aggregate feature information. We accomplish this by using local self attention blocks that pass information between each other to derive the attention map. We show that adding attention layers increases the contextual field and captures focused attention from relevant structures. We developed our approach using U-net and compared it against multiple state-of-the-art self attention methods. All models were trained on 48 internal head and neck CT scans and tested on 48 CT scans from the external Public Domain Database for Computational Anatomy dataset. Our method achieved the highest Dice similarity coefficient segmentation accuracy of 0.85$\pm$0.04 and 0.86$\pm$0.04 for left and right parotid glands, 0.79$\pm$0.07 and 0.77$\pm$0.05 for left and right submandibular glands, 0.93$\pm$0.01 for mandible and 0.88$\pm$0.02 for the brain stem, with the lowest increase of 66.7% in computing time per image and a 0.15% increase in model parameters compared with standard U-net. The best state-of-the-art method, point-wise spatial attention, achieved comparable accuracy but with a 516.7% increase in computing time and an 8.14% increase in parameters compared with standard U-net. Finally, we performed ablation tests and studied the impact of attention block size, overlap of the attention blocks, additional attention layers, and attention block placement on segmentation performance.
Submitted 11 September, 2019;
originally announced September 2019.
-
Integrating cross-modality hallucinated MRI with CT to aid mediastinal lung tumor segmentation
Authors:
Jue Jiang,
Jason Hu,
Neelam Tyagi,
Andreas Rimner,
Sean L. Berry,
Joseph O. Deasy,
Harini Veeraraghavan
Abstract:
Lung tumors, especially those located close to or surrounded by soft tissues like the mediastinum, are difficult to segment due to the low soft tissue contrast on computed tomography images. Magnetic resonance images contain superior soft-tissue contrast information that can be leveraged if both modalities are available for training. Therefore, we developed a cross-modality educed learning approach where MR information that is educed from CT is used to hallucinate MRI and improve CT segmentation. Our approach, called cross-modality educed deep learning segmentation (CMEDL), combines CT and pseudo MR produced from CT by aligning their features to obtain segmentation on CT. Features computed in the last two layers of CT and MR segmentation networks trained in parallel are aligned. We implemented this approach on U-net and dense fully convolutional networks (dense-FCN). Our networks were trained on unrelated cohorts consisting of open-source CT images from the Cancer Imaging Archive (N=377) and T2-weighted MR images from an internal archive (N=81), and evaluated using separate validation (N=304) and testing (N=333) CT-delineated tumors. Our approach using both networks was significantly more accurate (U-net $P <0.001$; denseFCN $P <0.001$) than CT-only networks and achieved an accuracy (Dice similarity coefficient) of 0.71$\pm$0.15 (U-net), 0.74$\pm$0.12 (denseFCN) on validation and 0.72$\pm$0.14 (U-net), 0.73$\pm$0.12 (denseFCN) on the testing sets. Our novel approach demonstrated that educing cross-modality information through learned priors enhances CT segmentation performance.
Submitted 10 September, 2019;
originally announced September 2019.
-
Comparison of Patch-Based Conditional Generative Adversarial Neural Net Models with Emphasis on Model Robustness for Use in Head and Neck Cases for MR-Only planning
Authors:
Peter Klages,
Ilyes Benslimane,
Sadegh Riyahi,
Jue Jiang,
Margie Hunt,
Joe Deasy,
Harini Veeraraghavan,
Neelam Tyagi
Abstract:
A total of twenty paired CT and MR images were used in this study to investigate two conditional generative adversarial networks, Pix2Pix and Cycle GAN, for generating synthetic CT (sCT) images for head and neck cancer cases. Ten of the patient cases were used for training and included such common artifacts as dental implants; the remaining ten cases were used for testing and included a larger range of image features commonly found in clinical head and neck cases. These features included strong metal artifacts from dental implants, one case with a metal implant, and one case with abnormal anatomy. The original CT images were deformably registered to the mDixon FFE MR images to minimize the effects of processing the MR images. The sCT generation accuracy and robustness were evaluated using Mean Absolute Error (MAE) based on the Hounsfield Units (HU) for three regions (whole body, bone, and air within the body), Mean Error (ME) to observe systematic average offset errors in the sCT generation, and dosimetric evaluation of all clinically relevant structures. For the test set, the MAE for the Pix2Pix and Cycle GAN models were 92.4 $\pm$ 13.5 HU and 100.7 $\pm$ 14.6 HU, respectively, for the body region, 166.3 $\pm$ 31.8 HU and 184 $\pm$ 31.9 HU, respectively, for the bone region, and 183.7 $\pm$ 41.3 HU and 185.4 $\pm$ 37.9 HU, respectively, for the air regions. The ME for Pix2Pix and Cycle GAN were 21.0 $\pm$ 11.8 HU and 37.5 $\pm$ 14.9 HU, respectively. Absolute Percent Mean/Max Dose Errors were less than 2% for the PTV and all critical structures for both models, and DRRs generated from these models looked qualitatively similar to CT-generated DRRs, showing these methods are promising for MR-only planning.
Submitted 27 February, 2019; v1 submitted 1 February, 2019;
originally announced February 2019.
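The abstract above reports Mean Absolute Error (MAE) and Mean Error (ME) in Hounsfield Units within a region. A minimal sketch of how those two quantities differ is given below; it is illustrative only and not the authors' code. The function name `mae_and_me`, the flat-list inputs, the sign convention (synthetic minus reference), and the toy values are assumptions.

```python
def mae_and_me(sct_hu, ct_hu, mask):
    """Return (MAE, ME) in HU over the voxels where mask is truthy.

    sct_hu: synthetic CT values, ct_hu: reference CT values,
    mask: region of interest (e.g. body, bone, or air), all flat
    sequences of equal length (a hypothetical representation).
    """
    if not (len(sct_hu) == len(ct_hu) == len(mask)):
        raise ValueError("inputs must have the same length")
    # Assumed sign convention: synthetic CT minus reference CT.
    diffs = [s - c for s, c, m in zip(sct_hu, ct_hu, mask) if m]
    if not diffs:
        raise ValueError("mask selects no voxels")
    mae = sum(abs(d) for d in diffs) / len(diffs)  # magnitude of error
    me = sum(diffs) / len(diffs)                   # systematic offset
    return mae, me


# Toy values: the last voxel lies outside the region and is ignored.
sct = [10, -990, 400, 55]
ct = [0, -1000, 500, 50]
region = [1, 1, 1, 0]
mae, me = mae_and_me(sct, ct, region)
print(mae, me)
```

Because ME keeps the sign of each difference, positive and negative errors can cancel, which is why it is read as a systematic offset while MAE measures overall error magnitude.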
-
Cross-modality (CT-MRI) prior augmented deep learning for robust lung tumor segmentation from small MR datasets
Authors:
Jue Jiang,
Yu-Chi Hu,
Neelam Tyagi,
Pengpeng Zhang,
Andreas Rimner,
Joseph O. Deasy,
Harini Veeraraghavan
Abstract:
Lack of large expert annotated MR datasets makes training deep learning models difficult. Therefore, a cross-modality (MR-CT) deep learning segmentation approach that augments training data using pseudo MR images produced by transforming expert-segmented CT images was developed. Eighty-one T2-weighted MRI scans from 28 patients with non-small cell lung cancers were analyzed. A cross-modality prior encoding the transformation of CT to pseudo MR images resembling T2w MRI was learned as a generative adversarial deep learning model. This model augmented training data arising from 6 expert-segmented T2w MR patient scans with 377 pseudo MRI from non-small cell lung cancer CT patient scans obtained from the Cancer Imaging Archive. A two-dimensional Unet implemented with batch normalization was trained to segment the tumors from T2w MRI. This method was benchmarked against (a) standard data augmentation and two state-of-the-art cross-modality pseudo MR-based augmentation methods and (b) two segmentation networks. Segmentation accuracy was computed using Dice similarity coefficient (DSC), Hausdorff distance metrics, and volume ratio. The proposed approach produced the lowest statistical variability in the intensity distribution between pseudo and T2w MR images, measured as a Kullback-Leibler divergence of 0.069. This method produced the highest segmentation accuracy with a DSC of 0.75 and the lowest Hausdorff distance on the test dataset. This approach produced highly similar estimations of tumor growth as an expert (P = 0.37). A novel deep learning MR segmentation was developed that overcomes the limitation of learning robust models from small datasets by leveraging learned cross-modality priors to augment training. The results show the feasibility of the approach and the corresponding improvement over the state-of-the-art methods.
Submitted 27 February, 2019; v1 submitted 31 January, 2019;
originally announced January 2019.
-
GBM Volumetry using the 3D Slicer Medical Image Computing Platform
Authors:
Jan Egger,
Tina Kapur,
Andriy Fedorov,
Steve Pieper,
James V. Miller,
Harini Veeraraghavan,
Bernd Freisleben,
Alexandra Golby,
Christopher Nimsky,
Ron Kikinis
Abstract:
Volumetric change in glioblastoma multiforme (GBM) over time is a critical factor in treatment decisions. Typically, the tumor volume is computed on a slice-by-slice basis using MRI scans obtained at regular intervals. (3D)Slicer, a free platform for biomedical research, provides an alternative to this manual slice-by-slice segmentation process that is significantly faster and requires less user interaction. In this study, 4 physicians segmented GBMs in 10 patients, once using the competitive region-growing based GrowCut segmentation module of Slicer, and once by drawing boundaries completely manually on a slice-by-slice basis. Furthermore, we provide a variability analysis for three physicians for 12 GBMs. The time required for GrowCut segmentation was on average 61% of the time required for a pure manual segmentation. A comparison of Slicer-based segmentation with manual slice-by-slice segmentation resulted in a Dice Similarity Coefficient of 88.43 +/- 5.23% and a Hausdorff Distance of 2.32 +/- 5.23 mm.
Submitted 5 March, 2013;
originally announced March 2013.