-
Semi-supervised variational autoencoder for cell feature extraction in multiplexed immunofluorescence images
Authors:
Piumi Sandarenu,
Julia Chen,
Iveta Slapetova,
Lois Browne,
Peter H. Graham,
Alexander Swarbrick,
Ewan K. A. Millar,
Yang Song,
Erik Meijering
Abstract:
Advancements in digital imaging technologies have sparked increased interest in using multiplexed immunofluorescence (mIF) images to visualise and identify the interactions between specific immunophenotypes and the tumour microenvironment at the cellular level. Current state-of-the-art mIF image analysis pipelines depend on cell feature representations characterised by morphological and stain intensity-based metrics generated using simple statistical and machine learning-based tools. However, these methods are not capable of generating complex representations of cells. We propose a deep learning-based cell feature extraction model using a variational autoencoder with supervision using a latent subspace to extract cell features in mIF images. We perform cell phenotype classification using a cohort of more than 44,000 mIF cell image patches extracted across 1,093 tissue microarray cores of breast cancer patients, to demonstrate the success of our model against current and alternative methods.
Submitted 27 June, 2024; v1 submitted 22 June, 2024;
originally announced June 2024.
-
MM-SurvNet: Deep Learning-Based Survival Risk Stratification in Breast Cancer Through Multimodal Data Fusion
Authors:
Raktim Kumar Mondol,
Ewan K. A. Millar,
Arcot Sowmya,
Erik Meijering
Abstract:
Survival risk stratification is an important step in clinical decision making for breast cancer management. We propose a novel deep learning approach for this purpose by integrating histopathological imaging, genetic and clinical data. It employs vision transformers, specifically the MaxViT model, for image feature extraction, and self-attention to capture intricate image relationships at the patient level. A dual cross-attention mechanism fuses these features with genetic data, while clinical data is incorporated at the final layer to enhance predictive accuracy. Experiments on the public TCGA-BRCA dataset show that our model, trained using the negative log likelihood loss function, can achieve superior performance with a mean C-index of 0.64, surpassing existing methods. This advancement facilitates tailored treatment strategies, potentially leading to improved patient outcomes.
Submitted 18 February, 2024;
originally announced February 2024.
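Note: the C-index (concordance index) quoted in the abstract above, and in the BioFusionNet and hist2RNA entries below, measures how often a model ranks pairs of patients in the correct order of survival. The following is a minimal sketch of Harrell's concordance index, not the authors' evaluation code; the function name `concordance_index`, its argument names, and the choice to skip pairs with tied survival times are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times  -- observed follow-up times
    events -- 1/True if the event was observed, 0/False if censored
    risks  -- predicted risk scores (higher = expected to fail sooner)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true ordering is unknown
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk assigned to the earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a mean C-index of 0.64 indicates a ranking moderately better than chance.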
-
BioFusionNet: Deep Learning-Based Survival Risk Stratification in ER+ Breast Cancer Through Multifeature and Multimodal Data Fusion
Authors:
Raktim Kumar Mondol,
Ewan K. A. Millar,
Arcot Sowmya,
Erik Meijering
Abstract:
Breast cancer is a significant health concern affecting millions of women worldwide. Accurate survival risk stratification plays a crucial role in guiding personalised treatment decisions and improving patient outcomes. Here we present BioFusionNet, a deep learning framework that fuses image-derived features with genetic and clinical data to obtain a holistic profile and achieve survival risk stratification of ER+ breast cancer patients. We employ multiple self-supervised feature extractors (DINO and MoCoV3) pretrained on histopathological patches to capture detailed image features. These features are then fused by a variational autoencoder and fed to a self-attention network generating patient-level features. A co-dual-cross-attention mechanism combines the histopathological features with genetic data, enabling the model to capture the interplay between them. Additionally, clinical data is incorporated using a feed-forward network, further enhancing predictive performance and achieving comprehensive multimodal feature integration. Furthermore, we introduce a weighted Cox loss function, specifically designed to handle imbalanced survival data, which is a common challenge. Our model achieves a mean concordance index of 0.77 and a time-dependent area under the curve of 0.84, outperforming state-of-the-art methods. It predicts risk (high versus low) with prognostic significance for overall survival in univariate analysis (HR=2.99, 95% CI: 1.88-4.78, p<0.005), and maintains independent significance in multivariate analysis incorporating standard clinicopathological variables (HR=2.91, 95% CI: 1.80-4.68, p<0.005).
Submitted 2 June, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Virchow: A Million-Slide Digital Pathology Foundation Model
Authors:
Eugene Vorontsov,
Alican Bozkurt,
Adam Casson,
George Shaikovski,
Michal Zelechowski,
Siqi Liu,
Kristen Severson,
Eric Zimmermann,
James Hall,
Neil Tenenholtz,
Nicolo Fusi,
Philippe Mathieu,
Alexander van Eck,
Donghun Lee,
Julian Viret,
Eric Robert,
Yi Kan Wang,
Jeremy D. Kunz,
Matthew C. H. Lee,
Jan Bernhard,
Ran A. Godrich,
Gerard Oakley,
Ewan Millar,
Matthew Hanna,
Juan Retamero
, et al. (6 additional authors not shown)
Abstract:
The use of artificial intelligence to enable precision medicine and decision support systems through the analysis of pathology images has the potential to revolutionize the diagnosis and treatment of cancer. Such applications will depend on models' abilities to capture the diverse patterns observed in pathology images. To address this challenge, we present Virchow, a foundation model for computational pathology. Using self-supervised learning empowered by the DINOv2 algorithm, Virchow is a vision transformer model with 632 million parameters trained on 1.5 million hematoxylin and eosin stained whole slide images from diverse tissue and specimen types, which is orders of magnitude more data than previous works. The Virchow model enables the development of a pan-cancer detection system with 0.949 overall specimen-level AUC across 17 different cancer types, while also achieving 0.937 AUC on 7 rare cancer types. The Virchow model sets the state-of-the-art on the internal and external image tile level benchmarks and slide level biomarker prediction tasks. The gains in performance highlight the importance of training on massive pathology image datasets, suggesting scaling up the data and network architecture can improve the accuracy for many high-impact computational pathology applications where limited amounts of training data are available.
Submitted 17 January, 2024; v1 submitted 14 September, 2023;
originally announced September 2023.
-
hist2RNA: An efficient deep learning architecture to predict gene expression from breast cancer histopathology images
Authors:
Raktim Kumar Mondol,
Ewan K. A. Millar,
Peter H Graham,
Lois Browne,
Arcot Sowmya,
Erik Meijering
Abstract:
Gene expression can be used to subtype breast cancer with improved prediction of risk of recurrence and treatment responsiveness over that obtained using routine immunohistochemistry (IHC). However, in the clinic, molecular profiling is primarily used for ER+ breast cancer, which is costly, tissue destructive, requires specialized platforms and takes several weeks to obtain a result. Deep learning algorithms can effectively extract morphological patterns in digital histopathology images to predict molecular phenotypes quickly and cost-effectively. We propose a new, computationally efficient approach called hist2RNA inspired by bulk RNA-sequencing techniques to predict the expression of 138 genes (incorporated from six commercially available molecular profiling tests), including luminal PAM50 subtype, from hematoxylin and eosin (H&E) stained whole slide images (WSIs). The training phase involves the aggregation of extracted features for each patient from a pretrained model to predict gene expression at the patient level using annotated H&E images from The Cancer Genome Atlas (TCGA, n = 335). We demonstrate successful gene prediction on a held-out test set (n = 160, corr = 0.82 across patients, corr = 0.29 across genes) and perform exploratory analysis on an external tissue microarray (TMA) dataset (n = 498) with known IHC and survival information. Our model is able to predict gene expression and luminal PAM50 subtype (Luminal A versus Luminal B) on the TMA dataset with prognostic significance for overall survival in univariate analysis (c-index = 0.56, hazard ratio = 2.16 (95% CI 1.12-3.06), p < 5 × 10^-3), and independent significance in multivariate analysis incorporating standard clinicopathological variables (c-index = 0.65, hazard ratio = 1.85 (95% CI 1.30-2.68), p < 5 × 10^-3).
Submitted 7 May, 2023; v1 submitted 10 April, 2023;
originally announced April 2023.
-
Breast Cancer Histopathology Image based Gene Expression Prediction using Spatial Transcriptomics data and Deep Learning
Authors:
Md Mamunur Rahaman,
Ewan K. A. Millar,
Erik Meijering
Abstract:
Tumour heterogeneity in breast cancer poses challenges in predicting outcome and response to therapy. Spatial transcriptomics technologies may address these challenges, as they provide a wealth of information about gene expression at the cell level, but they are expensive, hindering their use in large-scale clinical oncology studies. Predicting gene expression from hematoxylin and eosin stained histology images provides a more affordable alternative for such studies. Here we present BrST-Net, a deep learning framework for predicting gene expression from histopathology images using spatial transcriptomics data. Using this framework, we trained and evaluated 10 state-of-the-art deep learning models without utilizing pretrained weights for the prediction of 250 genes. To enhance the generalisation performance of the main network, we introduce an auxiliary network into the framework. Our methodology outperforms previous studies, with 237 genes identified with positive correlation, including 24 genes with a median correlation coefficient greater than 0.50. This is a notable improvement over previous studies, which could predict only 102 genes with positive correlation, with the highest correlation values ranging from 0.29 to 0.34.
Submitted 17 March, 2023;
originally announced March 2023.
-
The infrared spectrum of the Be star gamma Cassiopeiae
Authors:
S. Hony,
L. B. F. M. Waters,
P. A. Zaal,
A. de Koter,
J. M. Marlborough,
C. E. Millar,
N. R. Trams,
P. W. Morris,
Th. de Graauw
Abstract:
We present the 2.4-45 micrometer ISO-SWS spectrum of the Be star gamma Cas (B0.5 IVe). The spectrum is characterised by a thermal continuum which can be well fit by a power-law S_nu ~ nu^0.99 over the entire SWS wavelength range. For an isothermal disc of ionized gas with constant opening angle, this corresponds to a density gradient rho(r) ~ r^(-2.8). We report the detection of the Humphreys bound-free jump in emission at 3.4 micrometer. The size of the jump is sensitive to the electron temperature of the gas in the disc, and we find T~9000 K, i.e. much lower than the stellar effective temperature (25000-30000 K). The spectrum is dominated by numerous emission lines, mostly from H I, but also some He I lines are detected. Several spectral features cannot be identified. The line strengths of the H I emission lines do not follow case B recombination line theory. The line strengths and widths suggest that many lines are optically thick and come from an inner, high density region with radius 3-5 R_star and temperature above that of the bulk of the disc material. Only the alpha, beta and gamma transitions of the series lines contain a contribution from the outer regions. The level populations deviate significantly from LTE and are highly influenced by the optically thick, local (disc) continuum radiation field. The inner disc may be rotating more rapidly than the stellar photosphere.
Submitted 25 November, 1999;
originally announced November 1999.