Critical Evaluation of Artificial Intelligence as Digital Twin of Pathologist for Prostate Cancer Pathology
Authors:
Okyaz Eminaga,
Mahmoud Abbas,
Christian Kunder,
Yuri Tolkach,
Ryan Han,
James D. Brooks,
Rosalie Nolley,
Axel Semjonow,
Martin Boegemann,
Robert West,
** Long,
Richard Fan,
Olaf Bettendorf
Abstract:
Prostate cancer pathology plays a crucial role in clinical management but is time-consuming. Artificial intelligence (AI) shows promise in detecting prostate cancer and grading patterns. We tested an AI-based digital twin of a pathologist, vPatho, on 2,603 histology images of prostate tissue stained with hematoxylin and eosin. We analyzed various factors influencing tumor-grade disagreement between vPatho and six human pathologists. vPatho achieved performance in prostate cancer detection and tumor volume estimation comparable to that reported in the literature. We also examined concordance levels between vPatho and human pathologists. Moderate to substantial agreement was observed in identifying complementary histological features such as ductal and cribriform morphology, nerves, blood vessels, and lymph cell infiltrations. However, concordance in tumor grading was lower on prostatectomy specimens (kappa = 0.44) than on biopsy cores (kappa = 0.70). Adjusting the decision threshold for the secondary Gleason pattern from 5% to 10% improved the concordance between pathologists and vPatho for tumor grading on prostatectomy specimens (kappa increased from 0.44 to 0.64). Potential causes of grade discordance included the vertical extent of tumors toward the prostate boundary and the proportions of slides with prostate cancer. Gleason pattern 4 was particularly associated with discordance. Grade discordance with vPatho was not specific to any of the six pathologists involved in routine clinical grading. In conclusion, our study highlights the potential utility of AI in developing a digital twin of a pathologist. This approach can help uncover limitations in AI adoption and in the current grading system for prostate cancer pathology.
Submitted 23 August, 2023;
originally announced August 2023.
Plexus Convolutional Neural Network (PlexusNet): A novel neural network architecture for histologic image analysis
Authors:
Okyaz Eminaga,
Mahmoud Abbas,
Christian Kunder,
Andreas M. Loening,
Jeanne Shen,
James D. Brooks,
Curtis P. Langlotz,
Daniel L. Rubin
Abstract:
Different convolutional neural network (CNN) models have been tested for their application in histological image analyses. However, these models are prone to overfitting due to their large parameter capacity, requiring more data or valuable computational resources for model training. Given these limitations, we introduced a novel architecture (termed PlexusNet). We utilized 310 annotated hematoxylin and eosin (H&E)-stained histological images of prostate cancer cases from TCGA-PRAD and Stanford University and 398 H&E whole-slide images from the Camelyon 2016 challenge. Models derived from the PlexusNet architecture were compared to models derived from several existing "state of the art" architectures. We measured discrimination accuracy, calibration, and clinical utility. An ablation study was conducted to study the effect of each component of PlexusNet on model performance. A well-fitted PlexusNet-based model delivered comparable classification performance (AUC: 0.963) in distinguishing prostate cancer from healthy tissues, while being at least 23 times smaller and showing better model calibration and clinical utility than the comparison models. A separate, smaller PlexusNet model accurately detected slides with breast cancer metastases (AUC: 0.978); it reduced the number of slides to examine by 43.8% without consequences, although its parameter capacity was 200 times smaller than that of ResNet18. We found that the partitioning of the development set influences the model calibration for all models. However, with the PlexusNet architecture, we could achieve comparably well-calibrated models trained on different partitions. In conclusion, PlexusNet represents a novel model architecture for histological image analysis that achieves classification performance comparable to other models while providing orders-of-magnitude parameter reduction.
Submitted 3 June, 2020; v1 submitted 23 August, 2019;
originally announced August 2019.