-
Deep Learning-Based Prediction of Molecular Tumor Biomarkers from H&E: A Practical Review
Authors:
Heather D. Couture
Abstract:
Molecular and genomic properties are critical in selecting cancer treatments to target individual tumors, particularly for immunotherapy. However, the methods to assess such properties are expensive, time-consuming, and often not routinely performed. Applying machine learning to H&E images can provide a more cost-effective screening method. Dozens of studies over the last few years have demonstrat…
▽ More
Molecular and genomic properties are critical in selecting cancer treatments to target individual tumors, particularly for immunotherapy. However, the methods to assess such properties are expensive, time-consuming, and often not routinely performed. Applying machine learning to H&E images can provide a more cost-effective screening method. Dozens of studies over the last few years have demonstrated that a variety of molecular biomarkers can be predicted from H&E alone using the advancements of deep learning: molecular alterations, genomic subtypes, protein biomarkers, and even the presence of viruses. This article reviews the diverse applications across cancer types and the methodology to train and validate these models on whole slide images. From bottom-up to pathologist-driven to hybrid approaches, the leading trends include a variety of weakly supervised deep learning-based approaches, as well as mechanisms for training strongly supervised models in select situations. While results of these algorithms look promising, some challenges still persist, including small training sets, rigorous validation, and model explainability. Biomarker prediction models may yield a screening method to determine when to run molecular tests or an alternative when molecular tests are not possible. They also create new opportunities in quantifying intratumoral heterogeneity and predicting patient outcomes.
△ Less
Submitted 27 November, 2022;
originally announced November 2022.
-
Joint and individual analysis of breast cancer histologic images and genomic covariates
Authors:
Iain Carmichael,
Benjamin C. Calhoun,
Katherine A. Hoadley,
Melissa A. Troester,
Joseph Geradts,
Heather D. Couture,
Linnea Olsson,
Charles M. Perou,
Marc Niethammer,
Jan Hannig,
J. S. Marron
Abstract:
A key challenge in modern data analysis is understanding connections between complex and differing modalities of data. For example, two of the main approaches to the study of breast cancer are histopathology (analyzing visual characteristics of tumors) and genetics. While histopathology is the gold standard for diagnostics and there have been many recent breakthroughs in genetics, there is little…
▽ More
A key challenge in modern data analysis is understanding connections between complex and differing modalities of data. For example, two of the main approaches to the study of breast cancer are histopathology (analyzing visual characteristics of tumors) and genetics. While histopathology is the gold standard for diagnostics and there have been many recent breakthroughs in genetics, there is little overlap between these two fields. We aim to bridge this gap by develo** methods based on Angle-based Joint and Individual Variation Explained (AJIVE) to directly explore similarities and differences between these two modalities. Our approach exploits Convolutional Neural Networks (CNNs) as a powerful, automatic method for image feature extraction to address some of the challenges presented by statistical analysis of histopathology image data. CNNs raise issues of interpretability that we address by develo** novel methods to explore visual modes of variation captured by statistical algorithms (e.g. PCA or AJIVE) applied to CNN features. Our results provide many interpretable connections and contrasts between histopathology and genetics.
△ Less
Submitted 13 April, 2020; v1 submitted 1 December, 2019;
originally announced December 2019.
-
Deep Multi-View Learning via Task-Optimal CCA
Authors:
Heather D. Couture,
Roland Kwitt,
J. S. Marron,
Melissa Troester,
Charles M. Perou,
Marc Niethammer
Abstract:
Canonical Correlation Analysis (CCA) is widely used for multimodal data analysis and, more recently, for discriminative tasks such as multi-view learning; however, it makes no use of class labels. Recent CCA methods have started to address this weakness but are limited in that they do not simultaneously optimize the CCA projection for discrimination and the CCA projection itself, or they are linea…
▽ More
Canonical Correlation Analysis (CCA) is widely used for multimodal data analysis and, more recently, for discriminative tasks such as multi-view learning; however, it makes no use of class labels. Recent CCA methods have started to address this weakness but are limited in that they do not simultaneously optimize the CCA projection for discrimination and the CCA projection itself, or they are linear only. We address these deficiencies by simultaneously optimizing a CCA-based and a task objective in an end-to-end manner. Together, these two objectives learn a non-linear CCA projection to a shared latent space that is highly correlated and discriminative. Our method shows a significant improvement over previous state-of-the-art (including deep supervised approaches) for cross-view classification, regularization with a second view, and semi-supervised learning on real data.
△ Less
Submitted 17 July, 2019;
originally announced July 2019.
-
Multiple Instance Learning for Heterogeneous Images: Training a CNN for Histopathology
Authors:
Heather D. Couture,
J. S. Marron,
Charles M. Perou,
Melissa A. Troester,
Marc Niethammer
Abstract:
Multiple instance (MI) learning with a convolutional neural network enables end-to-end training in the presence of weak image-level labels. We propose a new method for aggregating predictions from smaller regions of the image into an image-level classification by using the quantile function. The quantile function provides a more complete description of the heterogeneity within each image, improvin…
▽ More
Multiple instance (MI) learning with a convolutional neural network enables end-to-end training in the presence of weak image-level labels. We propose a new method for aggregating predictions from smaller regions of the image into an image-level classification by using the quantile function. The quantile function provides a more complete description of the heterogeneity within each image, improving image-level classification. We also adapt image augmentation to the MI framework by randomly selecting cropped regions on which to apply MI aggregation during each epoch of training. This provides a mechanism to study the importance of MI learning. We validate our method on five different classification tasks for breast tumor histology and provide a visualization method for interpreting local image classifications that could lead to future insights into tumor heterogeneity.
△ Less
Submitted 13 June, 2018;
originally announced June 2018.