-
Uncovering the effects of model initialization on deep model generalization: A study with adult and pediatric Chest X-ray images
Authors:
Sivaramakrishnan Rajaraman,
Ghada Zamzmi,
Feng Yang,
Zhaohui Liang,
Zhiyun Xue,
Sameer Antani
Abstract:
Model initialization techniques are vital for improving the performance and reliability of deep learning models in medical computer vision applications. While much literature exists on non-medical images, the impacts on medical images, particularly chest X-rays (CXRs) are less understood. Addressing this gap, our study explores three deep model initialization techniques: Cold-start, Warm-start, an…
▽ More
Model initialization techniques are vital for improving the performance and reliability of deep learning models in medical computer vision applications. While much literature exists on non-medical images, the impacts on medical images, particularly chest X-rays (CXRs) are less understood. Addressing this gap, our study explores three deep model initialization techniques: Cold-start, Warm-start, and Shrink and Perturb start, focusing on adult and pediatric populations. We specifically focus on scenarios with periodically arriving data for training, thereby embracing the real-world scenarios of ongoing data influx and the need for model updates. We evaluate these models for generalizability against external adult and pediatric CXR datasets. We also propose novel ensemble methods: F-score-weighted Sequential Least-Squares Quadratic Programming (F-SLSQP) and Attention-Guided Ensembles with Learnable Fuzzy Softmax to aggregate weight parameters from multiple models to capitalize on their collective knowledge and complementary representations. We perform statistical significance tests with 95% confidence intervals and p-values to analyze model performance. Our evaluations indicate models initialized with ImageNet-pre-trained weights demonstrate superior generalizability over randomly initialized counterparts, contradicting some findings for non-medical images. Notably, ImageNet-pretrained models exhibit consistent performance during internal and external testing across different training scenarios. Weight-level ensembles of these models show significantly higher recall (p<0.05) during testing compared to individual models. Thus, our study accentuates the benefits of ImageNet-pretrained weight initialization, especially when used with weight-level ensembles, for creating robust and generalizable deep learning solutions.
△ Less
Submitted 20 September, 2023;
originally announced September 2023.
-
Semantically Redundant Training Data Removal and Deep Model Classification Performance: A Study with Chest X-rays
Authors:
Sivaramakrishnan Rajaraman,
Ghada Zamzmi,
Feng Yang,
Zhaohui Liang,
Zhiyun Xue,
Sameer Antani
Abstract:
Deep learning (DL) has demonstrated its innate capacity to independently learn hierarchical features from complex and multi-dimensional data. A common understanding is that its performance scales up with the amount of training data. Another data attribute is the inherent variety. It follows, therefore, that semantic redundancy, which is the presence of similar or repetitive information, would tend…
▽ More
Deep learning (DL) has demonstrated its innate capacity to independently learn hierarchical features from complex and multi-dimensional data. A common understanding is that its performance scales up with the amount of training data. Another data attribute is the inherent variety. It follows, therefore, that semantic redundancy, which is the presence of similar or repetitive information, would tend to lower performance and limit generalizability to unseen data. In medical imaging data, semantic redundancy can occur due to the presence of multiple images that have highly similar presentations for the disease of interest. Further, the common use of augmentation methods to generate variety in DL training may be limiting performance when applied to semantically redundant data. We propose an entropy-based sample scoring approach to identify and remove semantically redundant training data. We demonstrate using the publicly available NIH chest X-ray dataset that the model trained on the resulting informative subset of training data significantly outperforms the model trained on the full training set, during both internal (recall: 0.7164 vs 0.6597, p<0.05) and external testing (recall: 0.3185 vs 0.2589, p<0.05). Our findings emphasize the importance of information-oriented training sample selection as opposed to the conventional practice of using all available training data.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Does image resolution impact chest X-ray based fine-grained Tuberculosis-consistent lesion segmentation?
Authors:
Sivaramakrishnan Rajaraman,
Feng Yang,
Ghada Zamzmi,
Zhiyun Xue,
Sameer Antani
Abstract:
Deep learning (DL) models are state-of-the-art in segmenting anatomical and disease regions of interest (ROIs) in medical images. Particularly, a large number of DL-based techniques have been reported using chest X-rays (CXRs). However, these models are reportedly trained on reduced image resolutions for reasons related to the lack of computational resources. Literature is sparse in discussing the…
▽ More
Deep learning (DL) models are state-of-the-art in segmenting anatomical and disease regions of interest (ROIs) in medical images. Particularly, a large number of DL-based techniques have been reported using chest X-rays (CXRs). However, these models are reportedly trained on reduced image resolutions for reasons related to the lack of computational resources. Literature is sparse in discussing the optimal image resolution to train these models for segmenting the Tuberculosis (TB)-consistent lesions in CXRs. In this study, we investigated the performance variations using an Inception-V3 UNet model using various image resolutions with/without lung ROI crop** and aspect ratio adjustments, and (ii) identified the optimal image resolution through extensive empirical evaluations to improve TB-consistent lesion segmentation performance. We used the Shenzhen CXR dataset for the study which includes 326 normal patients and 336 TB patients. We proposed a combinatorial approach consisting of storing model snapshots, optimizing segmentation threshold and test-time augmentation (TTA), and averaging the snapshot predictions, to further improve performance with the optimal resolution. Our experimental results demonstrate that higher image resolutions are not always necessary, however, identifying the optimal image resolution is critical to achieving superior performance.
△ Less
Submitted 27 January, 2023; v1 submitted 10 January, 2023;
originally announced January 2023.
-
Generalizability of Deep Adult Lung Segmentation Models to the Pediatric Population: A Retrospective Study
Authors:
Sivaramakrishnan Rajaraman,
Feng Yang,
Ghada Zamzmi,
Zhiyun Xue,
Sameer Antani
Abstract:
Lung segmentation in chest X-rays (CXRs) is an important prerequisite for improving the specificity of diagnoses of cardiopulmonary diseases in a clinical decision support system. Current deep learning models for lung segmentation are trained and evaluated on CXR datasets in which the radiographic projections are captured predominantly from the adult population. However, the shape of the lungs is…
▽ More
Lung segmentation in chest X-rays (CXRs) is an important prerequisite for improving the specificity of diagnoses of cardiopulmonary diseases in a clinical decision support system. Current deep learning models for lung segmentation are trained and evaluated on CXR datasets in which the radiographic projections are captured predominantly from the adult population. However, the shape of the lungs is reported to be significantly different across the developmental stages from infancy to adulthood. This might result in age-related data domain shifts that would adversely impact lung segmentation performance when the models trained on the adult population are deployed for pediatric lung segmentation. In this work, our goal is to (i) analyze the generalizability of deep adult lung segmentation models to the pediatric population and (ii) improve performance through a stage-wise, systematic approach consisting of CXR modality-specific weight initializations, stacked ensembles, and an ensemble of stacked ensembles. To evaluate segmentation performance and generalizability, novel evaluation metrics consisting of mean lung contour distance (MLCD) and average hash score (AHS) are proposed in addition to the multi-scale structural similarity index measure (MS-SSIM), the intersection of union (IoU), Dice score, 95% Hausdorff distance (HD95), and average symmetric surface distance (ASSD). Our results showed a significant improvement (p < 0.05) in cross-domain generalization through our approach. This study could serve as a paradigm to analyze the cross-domain generalizability of deep segmentation models for other medical imaging modalities and applications.
△ Less
Submitted 25 May, 2023; v1 submitted 4 November, 2022;
originally announced November 2022.
-
Deep ensemble learning for segmenting tuberculosis-consistent manifestations in chest radiographs
Authors:
Sivaramakrishnan Rajaraman,
Feng Yang,
Ghada Zamzmi,
Peng Guo,
Zhiyun Xue,
Sameer K Antani
Abstract:
Automated segmentation of tuberculosis (TB)-consistent lesions in chest X-rays (CXRs) using deep learning (DL) methods can help reduce radiologist effort, supplement clinical decision-making, and potentially result in improved patient treatment. The majority of works in the literature discuss training automatic segmentation models using coarse bounding box annotations. However, the granularity of…
▽ More
Automated segmentation of tuberculosis (TB)-consistent lesions in chest X-rays (CXRs) using deep learning (DL) methods can help reduce radiologist effort, supplement clinical decision-making, and potentially result in improved patient treatment. The majority of works in the literature discuss training automatic segmentation models using coarse bounding box annotations. However, the granularity of the bounding box annotation could result in the inclusion of a considerable fraction of false positives and negatives at the pixel level that may adversely impact overall semantic segmentation performance. This study (i) evaluates the benefits of using fine-grained annotations of TB-consistent lesions and (ii) trains and constructs ensembles of the variants of U-Net models for semantically segmenting TB-consistent lesions in both original and bone-suppressed frontal CXRs. We evaluated segmentation performance using several ensemble methods such as bitwise AND, bitwise-OR, bitwise-MAX, and stacking. We observed that the stacking ensemble demonstrated superior segmentation performance (Dice score: 0.5743, 95% confidence interval: (0.4055,0.7431)) compared to the individual constituent models and other ensemble methods. To the best of our knowledge, this is the first study to apply ensemble learning to improve fine-grained TB-consistent lesion segmentation performance.
△ Less
Submitted 13 June, 2022;
originally announced June 2022.
-
Functional Parcellation of fMRI data using multistage k-means clustering
Authors:
Harshit Parmar,
Brian Nutter,
Rodney Long,
Sameer Antani,
Sunanda Mitra
Abstract:
Purpose: Functional Magnetic Resonance Imaging (fMRI) data acquired through resting-state studies have been used to obtain information about the spontaneous activations inside the brain. One of the approaches for analysis and interpretation of resting-state fMRI data require spatially and functionally homogenous parcellation of the whole brain based on underlying temporal fluctuations. Clustering…
▽ More
Purpose: Functional Magnetic Resonance Imaging (fMRI) data acquired through resting-state studies have been used to obtain information about the spontaneous activations inside the brain. One of the approaches for analysis and interpretation of resting-state fMRI data require spatially and functionally homogenous parcellation of the whole brain based on underlying temporal fluctuations. Clustering is often used to generate functional parcellation. However, major clustering algorithms, when used for fMRI data, have their limitations. Among commonly used parcellation schemes, a tradeoff exists between intra-cluster functional similarity and alignment with anatomical regions. Approach: In this work, we present a clustering algorithm for resting state and task fMRI data which is developed to obtain brain parcellations that show high structural and functional homogeneity. The clustering is performed by multistage binary k-means clustering algorithm designed specifically for the 4D fMRI data. The results from this multistage k-means algorithm show that by modifying and combining different algorithms, we can take advantage of the strengths of different techniques while overcoming their limitations. Results: The clustering output for resting state fMRI data using the multistage k-means approach is shown to be better than simple k-means or functional atlas in terms of spatial and functional homogeneity. The clusters also correspond to commonly identifiable brain networks. For task fMRI, the clustering output can identify primary and secondary activation regions and provide information about the varying hemodynamic response across different brain regions. Conclusion: The multistage k-means approach can provide functional parcellations of the brain using resting state fMRI data. The method is model-free and is data driven which can be applied to both resting state and task fMRI.
△ Less
Submitted 19 February, 2022;
originally announced February 2022.
-
Deep Cervix Model Development from Heterogeneous and Partially Labeled Image Datasets
Authors:
Anabik Pal,
Zhiyun Xue,
Sameer Antani
Abstract:
Cervical cancer is the fourth most common cancer in women worldwide. The availability of a robust automated cervical image classification system can augment the clinical care provider's limitation in traditional visual inspection with acetic acid (VIA). However, there are a wide variety of cervical inspection objectives which impact the labeling criteria for criteria-specific prediction model deve…
▽ More
Cervical cancer is the fourth most common cancer in women worldwide. The availability of a robust automated cervical image classification system can augment the clinical care provider's limitation in traditional visual inspection with acetic acid (VIA). However, there are a wide variety of cervical inspection objectives which impact the labeling criteria for criteria-specific prediction model development. Moreover, due to the lack of confirmatory test results and inter-rater labeling variation, many images are left unlabeled.
Motivated by these challenges, we propose a self-supervised learning (SSL) based approach to produce a pre-trained cervix model from unlabeled cervical images. The developed model is further fine-tuned to produce criteria-specific classification models with the available labeled images. We demonstrate the effectiveness of the proposed approach using two cervical image datasets. Both datasets are partially labeled and labeling criteria are different. The experimental results show that the SSL-based initialization improves classification performance (Accuracy: 2.5% min) and the inclusion of images from both datasets during SSL further improves the performance (Accuracy: 1.5% min). Further, considering data-sharing restrictions, we experimented with the effectiveness of Federated SSL and find that it can improve performance over the SSL model developed with just its images. This justifies the importance of SSL-based cervix model development. We believe that the present research shows a novel direction in develo** criteria-specific custom deep models for cervical image classification by combining images from different sources unlabeled and/or labeled with varying criteria, and addressing image access restrictions.
△ Less
Submitted 18 January, 2022;
originally announced January 2022.
-
Selective Synthetic Augmentation with HistoGAN for Improved Histopathology Image Classification
Authors:
Yuan Xue,
Jiarong Ye,
Qianying Zhou,
Rodney Long,
Sameer Antani,
Zhiyun Xue,
Carl Cornwell,
Richard Zaino,
Keith Cheng,
Xiaolei Huang
Abstract:
Histopathological analysis is the present gold standard for precancerous lesion diagnosis. The goal of automated histopathological classification from digital images requires supervised training, which requires a large number of expert annotations that can be expensive and time-consuming to collect. Meanwhile, accurate classification of image patches cropped from whole-slide images is essential fo…
▽ More
Histopathological analysis is the present gold standard for precancerous lesion diagnosis. The goal of automated histopathological classification from digital images requires supervised training, which requires a large number of expert annotations that can be expensive and time-consuming to collect. Meanwhile, accurate classification of image patches cropped from whole-slide images is essential for standard sliding window based histopathology slide classification methods. To mitigate these issues, we propose a carefully designed conditional GAN model, namely HistoGAN, for synthesizing realistic histopathology image patches conditioned on class labels. We also investigate a novel synthetic augmentation framework that selectively adds new synthetic image patches generated by our proposed HistoGAN, rather than expanding directly the training set with synthetic images. By selecting synthetic images based on the confidence of their assigned labels and their feature similarity to real labeled images, our framework provides quality assurance to synthetic augmentation. Our models are evaluated on two datasets: a cervical histopathology image dataset with limited annotations, and another dataset of lymph node histopathology images with metastatic cancer. Here, we show that leveraging HistoGAN generated images with selective augmentation results in significant and consistent improvements of classification performance (6.7% and 2.8% higher accuracy, respectively) for cervical histopathology and metastatic cancer datasets.
△ Less
Submitted 10 November, 2021;
originally announced November 2021.
-
A bone suppression model ensemble to improve COVID-19 detection in chest X-rays
Authors:
Sivaramakrishnan Rajaraman,
Gregg Cohen,
Lillian Spear,
Les folio,
Sameer Antani
Abstract:
Chest X-ray (CXR) is a widely performed radiology examination that helps to detect abnormalities in the tissues and organs in the thoracic cavity. Detecting pulmonary abnormalities like COVID-19 may become difficult due to that they are obscured by the presence of bony structures like the ribs and the clavicles, thereby resulting in screening/diagnostic misinterpretations. Automated bone suppressi…
▽ More
Chest X-ray (CXR) is a widely performed radiology examination that helps to detect abnormalities in the tissues and organs in the thoracic cavity. Detecting pulmonary abnormalities like COVID-19 may become difficult due to that they are obscured by the presence of bony structures like the ribs and the clavicles, thereby resulting in screening/diagnostic misinterpretations. Automated bone suppression methods would help suppress these bony structures and increase soft tissue visibility. In this study, we propose to build an ensemble of convolutional neural network models to suppress bones in frontal CXRs, improve classification performance, and reduce interpretation errors related to COVID-19 detection. The ensemble is constructed by (i) measuring the multi-scale structural similarity index (MS-SSIM) score between the sub-blocks of the bone-suppressed image predicted by each of the top-3 performing bone-suppression models and the corresponding sub-blocks of its respective ground truth soft-tissue image, and (ii) performing a majority voting of the MS-SSIM score computed in each sub-block to identify the sub-block with the maximum MS-SSIM score and use it in constructing the final bone-suppressed image. We empirically determine the sub-block size that delivers superior bone suppression performance. It is observed that the bone suppression model ensemble outperformed the individual models in terms of MS-SSIM and other metrics. A CXR modality-specific classification model is retrained and evaluated on the non-bone-suppressed and bone-suppressed images to classify them as showing normal lungs or other COVID-19-like manifestations. We observed that the bone-suppressed model training significantly outperformed the model trained on non-bone-suppressed images toward detecting COVID-19 manifestations.
△ Less
Submitted 4 December, 2021; v1 submitted 5 November, 2021;
originally announced November 2021.
-
Does deep learning model calibration improve performance in class-imbalanced medical image classification?
Authors:
Sivaramakrishnan Rajaraman,
Prasanth Ganesan,
Sameer Antani
Abstract:
In medical image classification tasks, it is common to find that the number of normal samples far exceeds the number of abnormal samples. In such class-imbalanced situations, reliable training of deep neural networks continues to be a major challenge. Under these circumstances, the predicted class probabilities may be biased toward the majority class. Calibration has been suggested to alleviate so…
▽ More
In medical image classification tasks, it is common to find that the number of normal samples far exceeds the number of abnormal samples. In such class-imbalanced situations, reliable training of deep neural networks continues to be a major challenge. Under these circumstances, the predicted class probabilities may be biased toward the majority class. Calibration has been suggested to alleviate some of these effects. However, there is insufficient analysis explaining when and whether calibrating a model would be beneficial in improving performance. In this study, we perform a systematic analysis of the effect of model calibration on its performance on two medical image modalities, namely, chest X-rays and fundus images, using various deep learning classifier backbones. For this, we study the following variations: (i) the degree of imbalances in the dataset used for training; (ii) calibration methods; and (iii) two classification thresholds, namely, default decision threshold of 0.5, and optimal threshold from precision-recall curves. Our results indicate that at the default operating threshold of 0.5, the performance achieved through calibration is significantly superior (p < 0.05) to using uncalibrated probabilities. However, at the PR-guided threshold, these gains are not significantly different (p > 0.05). This finding holds for both image modalities and at varying degrees of imbalance.
△ Less
Submitted 11 October, 2021; v1 submitted 29 September, 2021;
originally announced October 2021.
-
Multi-loss ensemble deep learning for chest X-ray classification
Authors:
Sivaramakrishnan Rajaraman,
Ghada Zamzmi,
Sameer Antani
Abstract:
Medical images commonly exhibit multiple abnormalities. Predicting them requires multi-class classifiers whose training and desired reliable performance can be affected by a combination of factors, such as, dataset size, data source, distribution, and the loss function used to train the deep neural networks. Currently, the cross-entropy loss remains the de-facto loss function for training deep lea…
▽ More
Medical images commonly exhibit multiple abnormalities. Predicting them requires multi-class classifiers whose training and desired reliable performance can be affected by a combination of factors, such as, dataset size, data source, distribution, and the loss function used to train the deep neural networks. Currently, the cross-entropy loss remains the de-facto loss function for training deep learning classifiers. This loss function, however, asserts equal learning from all classes, leading to a bias toward the majority class. In this work, we benchmark various state-of-the-art loss functions that are suitable for multi-class classification, critically analyze model performance, and propose improved loss functions. We select a pediatric chest X-ray (CXR) dataset that includes images with no abnormality (normal), and those exhibiting manifestations consistent with bacterial and viral pneumonia. We construct prediction-level and model-level ensembles, respectively, to improve classification performance. Our results show that compared to the individual models and the state-of-the-art literature, the weighted averaging of the predictions for top-3 and top-5 model-level ensembles delivered significantly superior classification performance (p < 0.05) in terms of MCC (0.9068, 95% confidence interval (0.8839, 0.9297)) metric. Finally, we performed localization studies to interpret model behaviors to visualize and confirm that the individual models and ensembles learned meaningful features and highlighted disease manifestations.
△ Less
Submitted 14 November, 2021; v1 submitted 29 September, 2021;
originally announced September 2021.
-
Trilateral Attention Network for Real-time Medical Image Segmentation
Authors:
Ghada Zamzmi,
Vandana Sachdev,
Sameer Antani
Abstract:
Accurate segmentation of medical images into anatomically meaningful regions is critical for the extraction of quantitative indices or biomarkers. The common pipeline for segmentation comprises regions of interest detection stage and segmentation stage, which are independent of each other and typically performed using separate deep learning networks. The performance of the segmentation stage highl…
▽ More
Accurate segmentation of medical images into anatomically meaningful regions is critical for the extraction of quantitative indices or biomarkers. The common pipeline for segmentation comprises regions of interest detection stage and segmentation stage, which are independent of each other and typically performed using separate deep learning networks. The performance of the segmentation stage highly relies on the extracted set of spatial features and the receptive fields. In this work, we propose an end-to-end network, called Trilateral Attention Network (TaNet), for real-time detection and segmentation in medical images. TaNet has a module for region localization, and three segmentation pathways: 1) handcrafted pathway with hand-designed convolutional kernels, 2) detail pathway with regular convolutional kernels, and 3) a global pathway to enlarge the receptive field. The first two pathways encode rich handcrafted and low-level features extracted by hand-designed and regular kernels while the global pathway encodes high-level context information. By jointly training the network for localization and segmentation using different sets of features, TaNet achieved superior performance, in terms of accuracy and speed, when evaluated on an echocardiography dataset for cardiac segmentation. The code and models will be made publicly available in TaNet Github page.
△ Less
Submitted 16 June, 2021;
originally announced June 2021.
-
Chest X-Ray Bone Suppression for Improving Classification of Tuberculosis-Consistent Findings
Authors:
Sivaramakrishnan Rajaraman,
Ghada Zamzmi,
Les Folio,
Philip Alderson,
Sameer Antani
Abstract:
Chest X-rays are the most commonly performed diagnostic examination to detect cardiopulmonary abnormalities. However, the presence of bony structures such as ribs and clavicles can obscure subtle abnormalities, resulting in diagnostic errors. This study aims to build a deep learning-based bone suppression model that identifies and removes these occluding bony structures in frontal CXRs to assist i…
▽ More
Chest X-rays are the most commonly performed diagnostic examination to detect cardiopulmonary abnormalities. However, the presence of bony structures such as ribs and clavicles can obscure subtle abnormalities, resulting in diagnostic errors. This study aims to build a deep learning-based bone suppression model that identifies and removes these occluding bony structures in frontal CXRs to assist in reducing errors in radiological interpretation, including DL workflows, related to detecting manifestations consistent with tuberculosis (TB). Several bone suppression models with various deep architectures are trained and optimized using the proposed combined loss function and their performances are evaluated in a cross-institutional test setting. The best-performing model is used to suppress bones in the publicly available Shenzhen and Montgomery TB CXR collections. A VGG-16 model is pretrained on a large collection of publicly available CXRs. The CXR-pretrained model is then fine-tuned individually on the non-bone-suppressed and bone-suppressed CXRs of Shenzhen and Montgomery TB CXR collections to classify them as showing normal lungs or TB manifestations. The performances of these models are compared using several performance metrics, analyzed for statistical significance, and their predictions are qualitatively interpreted through class-selective relevance maps. It is observed that the models trained on bone-suppressed CXRs significantly outperformed (p<0.05) the models trained on the non-bone-suppressed CXRs. Models trained on bone-suppressed CXRs improved detection of TB-consistent findings and resulted in compact clustering of the data points in the feature space signifying that bone suppression improved the model sensitivity toward TB classification.
△ Less
Submitted 7 May, 2021; v1 submitted 9 April, 2021;
originally announced April 2021.
-
Improved Semantic Segmentation of Tuberculosis-consistent findings in Chest X-rays Using Augmented Training of Modality-specific U-Net Models with Weak Localizations
Authors:
Sivaramakrishnan Rajaraman,
Les Folio,
Jane Dimperio,
Philip Alderson,
Sameer Antani
Abstract:
Deep learning (DL) has drawn tremendous attention in object localization and recognition for both natural and medical images. U-Net segmentation models have demonstrated superior performance compared to conventional handcrafted feature-based methods. Medical image modality-specific DL models are better at transferring domain knowledge to a relevant target task than those that are pretrained on sto…
▽ More
Deep learning (DL) has drawn tremendous attention in object localization and recognition for both natural and medical images. U-Net segmentation models have demonstrated superior performance compared to conventional handcrafted feature-based methods. Medical image modality-specific DL models are better at transferring domain knowledge to a relevant target task than those that are pretrained on stock photography images. This helps improve model adaptation, generalization, and class-specific region of interest (ROI) localization. In this study, we train chest X-ray (CXR) modality-specific U-Nets and other state-of-the-art U-Net models for semantic segmentation of tuberculosis (TB)-consistent findings. Automated segmentation of such manifestations could help radiologists reduce errors and supplement decision-making while improving patient care and productivity. Our approach uses the publicly available TBX11K CXR dataset with weak TB annotations, typically provided as bounding boxes, to train a set of U-Net models. Next, we improve the results by augmenting the training data with weak localizations, post-processed into an ROI mask, from a DL classifier that is trained to classify CXRs as showing normal lungs or suspected TB manifestations. Test data are individually derived from the TBX11K CXR training distribution and other cross-institutional collections including the Shenzhen TB and Montgomery TB CXR datasets. We observe that our augmented training strategy helped the CXR modality-specific U-Net models achieve superior performance with test data derived from the TBX11K CXR training distribution as well as from cross-institutional collections (p < 0.05).
△ Less
Submitted 26 March, 2021; v1 submitted 21 February, 2021;
originally announced February 2021.
-
Synthetic Sample Selection via Reinforcement Learning
Authors:
Jiarong Ye,
Yuan Xue,
L. Rodney Long,
Sameer Antani,
Zhiyun Xue,
Keith Cheng,
Xiaolei Huang
Abstract:
Synthesizing realistic medical images provides a feasible solution to the shortage of training data in deep learning based medical image recognition systems. However, the quality control of synthetic images for data augmentation purposes is under-investigated, and some of the generated images are not realistic and may contain misleading features that distort data distribution when mixed with real…
▽ More
Synthesizing realistic medical images provides a feasible solution to the shortage of training data in deep learning based medical image recognition systems. However, the quality control of synthetic images for data augmentation purposes is under-investigated, and some of the generated images are not realistic and may contain misleading features that distort data distribution when mixed with real images. Thus, the effectiveness of those synthetic images in medical image recognition systems cannot be guaranteed when they are being added randomly without quality assurance. In this work, we propose a reinforcement learning (RL) based synthetic sample selection method that learns to choose synthetic images containing reliable and informative features. A transformer based controller is trained via proximal policy optimization (PPO) using the validation classification accuracy as the reward. The selected images are mixed with the original training data for improved training of image recognition systems. To validate our method, we take the pathology image recognition as an example and conduct extensive experiments on two histopathology image datasets. In experiments on a cervical dataset and a lymph node dataset, the image classification performance is improved by 8.1% and 2.3%, respectively, when utilizing high-quality synthetic images selected by our RL framework. Our proposed synthetic sample selection method is general and has great potential to boost the performance of various medical image recognition systems given limited annotation.
△ Less
Submitted 25 August, 2020;
originally announced August 2020.
-
Feature based Sequential Classifier with Attention Mechanism
Authors:
Sudhir Sornapudi,
R. Joe Stanley,
William V. Stoecker,
Rodney Long,
Zhiyun Xue,
Rosemary Zuna,
Shelliane R. Frazier,
Sameer Antani
Abstract:
Cervical cancer is one of the deadliest cancers affecting women globally. Cervical intraepithelial neoplasia (CIN) assessment using histopathological examination of cervical biopsy slides is subject to interobserver variability. Automated processing of digitized histopathology slides has the potential for more accurate classification for CIN grades from normal to increasing grades of pre-malignanc…
▽ More
Cervical cancer is one of the deadliest cancers affecting women globally. Cervical intraepithelial neoplasia (CIN) assessment using histopathological examination of cervical biopsy slides is subject to interobserver variability. Automated processing of digitized histopathology slides has the potential for more accurate classification for CIN grades from normal to increasing grades of pre-malignancy: CIN1, CIN2 and CIN3. Cervix disease is generally understood to progress from the bottom (basement membrane) to the top of the epithelium. To model this relationship of disease severity to spatial distribution of abnormalities, we propose a network pipeline, DeepCIN, to analyze high-resolution epithelium images (manually extracted from whole-slide images) hierarchically by focusing on localized vertical regions and fusing this local information for determining Normal/CIN classification. The pipeline contains two classifier networks: 1) a cross-sectional, vertical segment-level sequence generator (two-stage encoder model) is trained using weak supervision to generate feature sequences from the vertical segments to preserve the bottom-to-top feature relationships in the epithelium image data; 2) an attention-based fusion network image-level classifier predicting the final CIN grade by merging vertical segment sequences. The model produces the CIN classification results and also determines the vertical segment contributions to CIN grade prediction. Experiments show that DeepCIN achieves pathologist-level CIN classification accuracy.
△ Less
Submitted 22 July, 2020;
originally announced July 2020.
-
Unified Representation Learning for Efficient Medical Image Analysis
Authors:
Ghada Zamzmi,
Sivaramakrishnan Rajaraman,
Sameer Antani
Abstract:
Medical image analysis typically includes several tasks such as enhancement, segmentation, and classification. Traditionally, these tasks are implemented using separate deep learning models for separate tasks, which is not efficient because it involves unnecessary training repetitions, demands greater computational resources, and requires a relatively large amount of labeled data. In this paper, w…
▽ More
Medical image analysis typically includes several tasks such as enhancement, segmentation, and classification. Traditionally, these tasks are implemented using separate deep learning models for separate tasks, which is not efficient because it involves unnecessary training repetitions, demands greater computational resources, and requires a relatively large amount of labeled data. In this paper, we propose a multi-task training approach for medical image analysis, where individual tasks are fine-tuned simultaneously through relevant knowledge transfer using a unified modality-specific feature representation (UMS-Rep). We explore different fine-tuning strategies to demonstrate the impact of the strategy on the performance of target medical image tasks. We experiment with different visual tasks (e.g., image denoising, segmentation, and classification) to highlight the advantages offered with our approach for two imaging modalities, chest X-ray and Doppler echocardiography. Our results demonstrate that the proposed approach reduces the overall demand for computational resources and improves target task generalization and performance. Further, our results prove that the performance of target tasks in medical images is highly influenced by the utilized fine-tuning strategy.
△ Less
Submitted 7 June, 2021; v1 submitted 19 June, 2020;
originally announced June 2020.
-
Iteratively Pruned Deep Learning Ensembles for COVID-19 Detection in Chest X-rays
Authors:
Sivaramakrishnan Rajaraman,
Jen Siegelman,
Philip O. Alderson,
Lucas S. Folio,
Les R. Folio,
Sameer K. Antani
Abstract:
We demonstrate use of iteratively pruned deep learning model ensembles for detecting pulmonary manifestation of COVID-19 with chest X-rays. This disease is caused by the novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) virus, also known as the novel Coronavirus (2019-nCoV). A custom convolutional neural network and a selection of ImageNet pretrained models are trained and evaluat…
▽ More
We demonstrate use of iteratively pruned deep learning model ensembles for detecting pulmonary manifestation of COVID-19 with chest X-rays. This disease is caused by the novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) virus, also known as the novel Coronavirus (2019-nCoV). A custom convolutional neural network and a selection of ImageNet pretrained models are trained and evaluated at patient-level on publicly available CXR collections to learn modality-specific feature representations. The learned knowledge is transferred and fine-tuned to improve performance and generalization in the related task of classifying CXRs as normal, showing bacterial pneumonia, or COVID-19-viral abnormalities. The best performing models are iteratively pruned to reduce complexity and improve memory efficiency. The predictions of the best-performing pruned models are combined through different ensemble strategies to improve classification performance. Empirical evaluations demonstrate that the weighted average of the best-performing pruned models significantly improves performance resulting in an accuracy of 99.01% and area under the curve of 0.9972 in detecting COVID-19 findings on CXRs. The combined use of modality-specific knowledge transfer, iterative model pruning, and ensemble learning resulted in improved predictions. We expect that this model can be quickly adopted for COVID-19 screening using chest radiographs.
△ Less
Submitted 5 March, 2021; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Selective Synthetic Augmentation with Quality Assurance
Authors:
Yuan Xue,
Jiarong Ye,
Rodney Long,
Sameer Antani,
Zhiyun Xue,
Xiaolei Huang
Abstract:
Supervised training of an automated medical image analysis system often requires a large amount of expert annotations that are hard to collect. Moreover, the proportions of data available across different classes may be highly imbalanced for rare diseases. To mitigate these issues, we investigate a novel data augmentation pipeline that selectively adds new synthetic images generated by conditional…
▽ More
Supervised training of an automated medical image analysis system often requires a large amount of expert annotations that are hard to collect. Moreover, the proportions of data available across different classes may be highly imbalanced for rare diseases. To mitigate these issues, we investigate a novel data augmentation pipeline that selectively adds new synthetic images generated by conditional Adversarial Networks (cGANs), rather than extending directly the training set with synthetic images. The selection mechanisms that we introduce to the synthetic augmentation pipeline are motivated by the observation that, although cGAN-generated images can be visually appealing, they are not guaranteed to contain essential features for classification performance improvement. By selecting synthetic images based on the confidence of their assigned labels and their feature similarity to real labeled images, our framework provides quality assurance to synthetic augmentation by ensuring that adding the selected synthetic images to the training set will improve performance. We evaluate our model on a medical histopathology dataset, and two natural image classification benchmarks, CIFAR10 and SVHN. Results on these datasets show significant and consistent improvements in classification performance (with 6.8%, 3.9%, 1.6% higher accuracy, respectively) by leveraging cGAN generated images with selective augmentation.
△ Less
Submitted 8 December, 2019;
originally announced December 2019.
-
Comparing Deep Learning Models for Multi-cell Classification in Liquid-based Cervical Cytology Images
Authors:
Sudhir Sornapudi,
G. T. Brown,
Zhiyun Xue,
Rodney Long,
Lisa Allen,
Sameer Antani
Abstract:
Liquid-based cytology (LBC) is a reliable automated technique for the screening of Papanicolaou (Pap) smear data. It is an effective technique for collecting a majority of the cervical cells and aiding cytopathologists in locating abnormal cells. Most methods published in the research literature rely on accurate cell segmentation as a prior, which remains challenging due to a variety of factors, e…
▽ More
Liquid-based cytology (LBC) is a reliable automated technique for the screening of Papanicolaou (Pap) smear data. It is an effective technique for collecting a majority of the cervical cells and aiding cytopathologists in locating abnormal cells. Most methods published in the research literature rely on accurate cell segmentation as a prior, which remains challenging due to a variety of factors, e.g., stain consistency, presence of clustered cells, etc. We propose a method for automatic classification of cervical slide images through generation of labeled cervical patch data and extracting deep hierarchical features by fine-tuning convolution neural networks, as well as a novel graph-based cell detection approach for cellular level evaluation. The results show that the proposed pipeline can classify images of both single cell and overlap** cells. The VGG-19 model is found to be the best at classifying the cervical cytology patch data with 95 % accuracy under precision-recall curve.
△ Less
Submitted 1 October, 2019;
originally announced October 2019.
-
Report of 2017 NSF Workshop on Multimedia Challenges, Opportunities and Research Roadmaps
Authors:
Shih-Fu Chang,
Alex Hauptmann,
Louis-Philippe Morency,
Sameer Antani,
Dick Bulterman,
Carlos Busso,
Joyce Chai,
Julia Hirschberg,
Ramesh Jain,
Ketan Mayer-Patel,
Reuven Meth,
Raymond Mooney,
Klara Nahrstedt,
Shri Narayanan,
Prem Natarajan,
Sharon Oviatt,
Balakrishnan Prabhakaran,
Arnold Smeulders,
Hari Sundaram,
Zhengyou Zhang,
Michelle Zhou
Abstract:
With the transformative technologies and the rapidly changing global R&D landscape, the multimedia and multimodal community is now faced with many new opportunities and uncertainties. With the open source dissemination platform and pervasive computing resources, new research results are being discovered at an unprecedented pace. In addition, the rapid exchange and influence of ideas across traditi…
▽ More
With the transformative technologies and the rapidly changing global R&D landscape, the multimedia and multimodal community is now faced with many new opportunities and uncertainties. With the open source dissemination platform and pervasive computing resources, new research results are being discovered at an unprecedented pace. In addition, the rapid exchange and influence of ideas across traditional discipline boundaries have made the emphasis on multimedia multimodal research even more important than before. To seize these opportunities and respond to the challenges, we have organized a workshop to specifically address and brainstorm the challenges, opportunities, and research roadmaps for MM research. The two-day workshop, held on March 30 and 31, 2017 in Washington DC, was sponsored by the Information and Intelligent Systems Division of the National Science Foundation of the United States. Twenty-three (23) invited participants were asked to review and identify research areas in the MM field that are most important over the next 10-15 year timeframe. Important topics were selected through discussion and consensus, and then discussed in depth in breakout groups. Breakout groups reported initial discussion results to the whole group, who continued with further extensive deliberation. For each identified topic, a summary was produced after the workshop to describe the main findings, including the state of the art, challenges, and research roadmaps planned for the next 5, 10, and 15 years in the identified area.
△ Less
Submitted 6 August, 2019;
originally announced August 2019.
-
Synthetic Augmentation and Feature-based Filtering for Improved Cervical Histopathology Image Classification
Authors:
Yuan Xue,
Qianying Zhou,
Jiarong Ye,
L. Rodney Long,
Sameer Antani,
Carl Cornwell,
Zhiyun Xue,
Xiaolei Huang
Abstract:
Cervical intraepithelial neoplasia (CIN) grade of histopathology images is a crucial indicator in cervical biopsy results. Accurate CIN grading of epithelium regions helps pathologists with precancerous lesion diagnosis and treatment planning. Although an automated CIN grading system has been desired, supervised training of such a system would require a large amount of expert annotations, which ar…
▽ More
Cervical intraepithelial neoplasia (CIN) grade of histopathology images is a crucial indicator in cervical biopsy results. Accurate CIN grading of epithelium regions helps pathologists with precancerous lesion diagnosis and treatment planning. Although an automated CIN grading system has been desired, supervised training of such a system would require a large amount of expert annotations, which are expensive and time-consuming to collect. In this paper, we investigate the CIN grade classification problem on segmented epithelium patches. We propose to use conditional Generative Adversarial Networks (cGANs) to expand the limited training dataset, by synthesizing realistic cervical histopathology images. While the synthetic images are visually appealing, they are not guaranteed to contain meaningful features for data augmentation. To tackle this issue, we propose a synthetic-image filtering mechanism based on the divergence in feature space between generated images and class centroids in order to control the feature quality of selected synthetic images for data augmentation. Our models are evaluated on a cervical histopathology image dataset with a limited number of patch-level CIN grade annotations. Extensive experimental results show a significant improvement of classification accuracy from 66.3% to 71.7% using the same ResNet18 baseline classifier after leveraging our cGAN generated images with feature-based filtering, which demonstrates the effectiveness of our models.
△ Less
Submitted 24 July, 2019;
originally announced July 2019.