Search | arXiv e-print repository

Benchmarking Pathology Feature Extractors for Whole Slide Image Classification

Authors: Georg Wölflein, Dyke Ferber, Asier R. Meneghetti, Omar S. M. El Nahhas, Daniel Truhn, Zunamys I. Carrero, David J. Harrison, Ognjen Arandjelović, Jakob Nikolas Kather

Abstract: Weakly supervised whole slide image classification is a key task in computational pathology, which involves predicting a slide-level label from a set of image patches constituting the slide. Constructing models to solve this task involves multiple design choices, often made without robust empirical or conclusive theoretical justification. To address this, we conduct a comprehensive benchmarking of… ▽ More Weakly supervised whole slide image classification is a key task in computational pathology, which involves predicting a slide-level label from a set of image patches constituting the slide. Constructing models to solve this task involves multiple design choices, often made without robust empirical or conclusive theoretical justification. To address this, we conduct a comprehensive benchmarking of feature extractors to answer three critical questions: 1) Is stain normalisation still a necessary preprocessing step? 2) Which feature extractors are best for downstream slide-level classification? 3) How does magnification affect downstream performance? Our study constitutes the most comprehensive evaluation of publicly available pathology feature extractors to date, involving more than 10,000 training runs across 14 feature extractors, 9 tasks, 5 datasets, 3 downstream architectures, 2 levels of magnification, and various preprocessing setups. Our findings challenge existing assumptions: 1) We observe empirically, and by analysing the latent space, that skip** stain normalisation and image augmentations does not degrade performance, while significantly reducing memory and computational demands. 2) We develop a novel evaluation metric to compare relative downstream performance, and show that the choice of feature extractor is the most consequential factor for downstream performance. 3) We find that lower-magnification slides are sufficient for accurate slide-level classification. Contrary to previous patch-level benchmarking studies, our approach emphasises clinical relevance by focusing on slide-level biomarker prediction tasks in a weakly supervised setting with external validation cohorts. Our findings stand to streamline digital pathology workflows by minimising preprocessing needs and informing the selection of feature extractors. △ Less

Submitted 21 June, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

Comments: For the conference version see: arXiv:2311.11772v4. For the longer journal version with additional experiments see arXiv:2311.11772v5

arXiv:2305.10552 [pdf, other]

Deep Multiple Instance Learning with Distance-Aware Self-Attention

Authors: Georg Wölflein, Lucie Charlotte Magister, Pietro Liò, David J. Harrison, Ognjen Arandjelović

Abstract: Traditional supervised learning tasks require a label for every instance in the training set, but in many real-world applications, labels are only available for collections (bags) of instances. This problem setting, known as multiple instance learning (MIL), is particularly relevant in the medical domain, where high-resolution images are split into smaller patches, but labels apply to the image as… ▽ More Traditional supervised learning tasks require a label for every instance in the training set, but in many real-world applications, labels are only available for collections (bags) of instances. This problem setting, known as multiple instance learning (MIL), is particularly relevant in the medical domain, where high-resolution images are split into smaller patches, but labels apply to the image as a whole. Recent MIL models are able to capture correspondences between patches by employing self-attention, allowing them to weigh each patch differently based on all other patches in the bag. However, these approaches still do not consider the relative spatial relationships between patches within the larger image, which is especially important in computational pathology. To this end, we introduce a novel MIL model with distance-aware self-attention (DAS-MIL), which explicitly takes into account relative spatial information when modelling the interactions between patches. Unlike existing relative position representations for self-attention which are discrete, our approach introduces continuous distance-dependent terms into the computation of the attention weights, and is the first to apply relative position representations in the context of MIL. We evaluate our model on a custom MNIST-based MIL dataset that requires the consideration of relative spatial information, as well as on CAMELYON16, a publicly available cancer metastasis detection dataset, where we achieve a test AUROC score of 0.91. On both datasets, our model outperforms existing MIL approaches that employ absolute positional encodings, as well as existing relative position representation schemes applied to MIL. Our code is available at https://anonymous.4open.science/r/das-mil. △ Less

Submitted 20 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

arXiv:2210.06909 [pdf, other]

HoechstGAN: Virtual Lymphocyte Staining Using Generative Adversarial Networks

Authors: Georg Wölflein, In Hwa Um, David J Harrison, Ognjen Arandjelović

Abstract: The presence and density of specific types of immune cells are important to understand a patient's immune response to cancer. However, immunofluorescence staining required to identify T cell subtypes is expensive, time-consuming, and rarely performed in clinical settings. We present a framework to virtually stain Hoechst images (which are cheap and widespread) with both CD3 and CD8 to identify T c… ▽ More The presence and density of specific types of immune cells are important to understand a patient's immune response to cancer. However, immunofluorescence staining required to identify T cell subtypes is expensive, time-consuming, and rarely performed in clinical settings. We present a framework to virtually stain Hoechst images (which are cheap and widespread) with both CD3 and CD8 to identify T cell subtypes in clear cell renal cell carcinoma using generative adversarial networks. Our proposed method jointly learns both staining tasks, incentivising the network to incorporate mutually beneficial information from each task. We devise a novel metric to quantify the virtual staining quality, and use it to evaluate our method. △ Less

Submitted 17 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

Comments: Accepted at IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

arXiv:2107.04388 [pdf, other]

Hoechst Is All You Need: Lymphocyte Classification with Deep Learning

Authors: Jessica Cooper, In Hwa Um, Ognjen Arandjelović, David J Harrison

Abstract: Multiplex immunofluorescence and immunohistochemistry benefit patients by allowing cancer pathologists to identify several proteins expressed on the surface of cells, enabling cell classification, better understanding of the tumour micro-environment, more accurate diagnoses, prognoses, and tailored immunotherapy based on the immune status of individual patients. However, they are expensive and tim… ▽ More Multiplex immunofluorescence and immunohistochemistry benefit patients by allowing cancer pathologists to identify several proteins expressed on the surface of cells, enabling cell classification, better understanding of the tumour micro-environment, more accurate diagnoses, prognoses, and tailored immunotherapy based on the immune status of individual patients. However, they are expensive and time consuming processes which require complex staining and imaging techniques by expert technicians. Hoechst staining is much cheaper and easier to perform, but is not typically used in this case as it binds to DNA rather than to the proteins targeted by immunofluorescent techniques, and it was not previously thought possible to differentiate cells expressing these proteins based only on DNA morphology. In this work we show otherwise, training a deep convolutional neural network to identify cells expressing three proteins (T lymphocyte markers CD3 and CD8, and the B lymphocyte marker CD20) with greater than 90% precision and recall, from Hoechst 33342 stained tissue only. Our model learns previously unknown morphological features associated with expression of these proteins which can be used to accurately differentiate lymphocyte subtypes for use in key prognostic metrics such as assessment of immune cell infiltration,and thereby predict and improve patient outcomes without the need for costly multiplex immunofluorescence. △ Less

Submitted 16 July, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

Comments: 15 pages, 4 figures

arXiv:2103.05108 [pdf, other]

Believe The HiPe: Hierarchical Perturbation for Fast, Robust, and Model-Agnostic Saliency Map**

Authors: Jessica Cooper, Ognjen Arandjelović, David J Harrison

Abstract: Understanding the predictions made by Artificial Intelligence (AI) systems is becoming more and more important as deep learning models are used for increasingly complex and high-stakes tasks. Saliency map** -- a popular visual attribution method -- is one important tool for this, but existing formulations are limited by either computational cost or architectural constraints. We therefore propose… ▽ More Understanding the predictions made by Artificial Intelligence (AI) systems is becoming more and more important as deep learning models are used for increasingly complex and high-stakes tasks. Saliency map** -- a popular visual attribution method -- is one important tool for this, but existing formulations are limited by either computational cost or architectural constraints. We therefore propose Hierarchical Perturbation, a very fast and completely model-agnostic method for interpreting model predictions with robust saliency maps. Using standard benchmarks and datasets, we show that our saliency maps are of competitive or superior quality to those generated by existing model-agnostic methods -- and are over 20 times faster to compute. △ Less

Submitted 11 April, 2022; v1 submitted 22 February, 2021; originally announced March 2021.

Comments: github.com/jessicamarycooper/Hierarchical-Perturbation

Showing 1–5 of 5 results for author: Harrison, D J