-
Open and reusable deep learning for pathology with WSInfer and QuPath
Authors:
Jakub R. Kaczmarzyk,
Alan O'Callaghan,
Fiona Inglis,
Tahsin Kurc,
Rajarsi Gupta,
Erich Bremer,
Peter Bankhead,
Joel H. Saltz
Abstract:
The field of digital pathology has seen a proliferation of deep learning models in recent years. Despite substantial progress, it remains rare for other researchers and pathologists to be able to access models published in the literature and apply them to their own images. This is due to difficulties in both sharing and running models. To address these concerns, we introduce WSInfer: a new, open-source software ecosystem designed to make deep learning for pathology more streamlined and accessible. WSInfer comprises three main elements: 1) a Python package and command line tool to efficiently apply patch-based deep learning inference to whole slide images; 2) a QuPath extension that provides an alternative inference engine through user-friendly and interactive software; and 3) a model zoo, which enables pathology models and metadata to be easily shared in a standardized form. Together, these contributions aim to encourage wider reuse, exploration, and interrogation of deep learning models for research purposes, by putting them into the hands of pathologists and eliminating the need for coding experience when accessed through QuPath. The WSInfer source code is hosted on GitHub and documentation is available at https://wsinfer.readthedocs.io.
Submitted 8 September, 2023;
originally announced September 2023.
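The patch-based inference mentioned in the abstract above starts by dividing a slide into fixed-size patches. The following is a minimal sketch of that tiling step only, assuming a regular grid that keeps full patches; it is not WSInfer's API, and the function name `patch_origins` and its parameters are hypothetical.

```python
from typing import Iterator, Tuple


def patch_origins(width: int, height: int, patch: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (x, y) pixel of every full patch on a regular grid.

    Hypothetical helper for illustration: patches that would extend past the
    slide boundary are skipped rather than padded.
    """
    if patch <= 0 or stride <= 0:
        raise ValueError("patch and stride must be positive")
    # Walk the grid row by row; the last origin still leaves room for a full patch.
    for y in range(0, height - patch + 1, stride):
        for x in range(0, width - patch + 1, stride):
            yield (x, y)


if __name__ == "__main__":
    # Toy dimensions, far smaller than a real whole slide image.
    for origin in patch_origins(width=10, height=6, patch=4, stride=4):
        print(origin)
```

Each origin would then be read from the slide and passed to a model in batches; that part depends on the slide reader and model and is left out here.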
-
Halcyon -- A Pathology Imaging and Feature analysis and Management System
Authors:
Erich Bremer,
Tammy DiPrima,
Joseph Balsamo,
Jonas Almeida,
Rajarsi Gupta,
Joel Saltz
Abstract:
Halcyon is a new pathology imaging analysis and feature management system based on W3C linked-data open standards. It is designed to scale with the voluminous production of features from deep-learning feature pipelines. Halcyon supports multiple users through a web-based UX, with access to all user data over a standards-based web API that allows integration with other processes and software systems. Identity management and data security are also provided.
Submitted 7 April, 2023;
originally announced April 2023.
-
ImageBox3: No-Server Tile Serving to Traverse Whole Slide Images on the Web
Authors:
Praphulla MS Bhawsar,
Erich Bremer,
Máire A Duggan,
Stephen Chanock,
Montserrat Garcia-Closas,
Joel Saltz,
Jonas S Almeida
Abstract:
Whole slide imaging (WSI) has become the primary modality for digital pathology data. However, due to the size and high-resolution nature of these images, they are generally only accessed in smaller sections or tiles via specialized platforms, most of which require extensive setup and/or costly infrastructure. These platforms typically also need a copy of the images to be locally available to them, potentially causing issues with data governance and provenance. To address these concerns, we developed ImageBox3, an in-browser tiling mechanism to enable zero-footprint traversal of remote WSI data. All computation is performed client-side without compromising user governance, operating on public and private images alike as long as the storage service supports HTTP range requests (standard in Cloud storage and most web servers). ImageBox3 thus removes significant hurdles to WSI operation and effective collaboration, allowing for the sort of democratized analytical tools needed to establish participative, FAIR digital pathology data commons.
Availability:
code - https://github.com/episphere/imagebox3;
fig1 (live) - https://episphere.github.io/imagebox3/demo/scriptTag ;
fig2 (live) - https://episphere.github.io/imagebox3/demo/serviceWorker ;
fig3 (live) - https://observablehq.com/@prafulb/imagebox3-in-observable .
Submitted 5 July, 2022; v1 submitted 4 July, 2022;
originally announced July 2022.
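The abstract above relies on HTTP range requests to read only part of a remote file. As a minimal sketch of that mechanism, the Python below builds a request for a byte span using the standard library. It is not ImageBox3's code; the function names, the example URL, and the offsets are hypothetical.

```python
from urllib.request import Request


def range_header(offset: int, length: int) -> str:
    """Return an HTTP Range header value for `length` bytes starting at `offset`.

    Byte ranges are inclusive on both ends, hence the `- 1`.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return f"bytes={offset}-{offset + length - 1}"


def build_range_request(url: str, offset: int, length: int) -> Request:
    """Build (but do not send) a GET request for one byte span of a remote file."""
    return Request(url, headers={"Range": range_header(offset, length)})


if __name__ == "__main__":
    # Hypothetical URL and offsets; a real client would take them from the
    # image file's tile index.
    req = build_range_request("https://example.org/slide.tiff", 1024, 1024)
    print(req.get_header("Range"))
```

Passing such a request to `urllib.request.urlopen` would send it; a server that supports range requests answers with status 206 (Partial Content) and only the requested bytes.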
-
AI and Pathology: Steering Treatment and Predicting Outcomes
Authors:
Rajarsi Gupta,
Jakub Kaczmarzyk,
Soma Kobayashi,
Tahsin Kurc,
Joel Saltz
Abstract:
The combination of data analysis methods, increasing computing capacity, and improved sensors enables quantitative, granular, multi-scale, cell-based analyses. We describe the rich set of application challenges related to tissue interpretation and survey AI methods currently used to address these challenges. We focus on a particular class of targeted human tissue analysis - histopathology - aimed at quantitative characterization of disease state, patient outcome prediction and treatment steering.
Submitted 15 June, 2022;
originally announced June 2022.
-
Evaluating histopathology transfer learning with ChampKit
Authors:
Jakub R. Kaczmarzyk,
Tahsin M. Kurc,
Shahira Abousamra,
Rajarsi Gupta,
Joel H. Saltz,
Peter K. Koo
Abstract:
Histopathology remains the gold standard for diagnosis of various cancers. Recent advances in computer vision, specifically deep learning, have facilitated the analysis of histopathology images for various tasks, including immune cell detection and microsatellite instability classification. The state of the art for each task often employs base architectures that have been pretrained for image classification on ImageNet. The standard approach to developing classifiers in histopathology tends to focus narrowly on optimizing models for a single task, without considering which modeling innovations improve generalization across tasks. Here we present ChampKit (Comprehensive Histopathology Assessment of Model Predictions toolKit): an extensible, fully reproducible benchmarking toolkit that consists of a broad collection of patch-level image classification tasks across different cancers. ChampKit provides a way to systematically document the performance impact of proposed improvements in models and methodology. ChampKit source code and data are freely accessible at https://github.com/kaczmarj/champkit .
Submitted 14 June, 2022;
originally announced June 2022.
-
A Novel Framework for Characterization of Tumor-Immune Spatial Relationships in Tumor Microenvironment
Authors:
Mahmudul Hasan,
Jakub R. Kaczmarzyk,
David Paredes,
Lyanne Oblein,
Jaymie Oentoro,
Shahira Abousamra,
Michael Horowitz,
Dimitris Samaras,
Chao Chen,
Tahsin Kurc,
Kenneth R. Shroyer,
Joel Saltz
Abstract:
Understanding the impact of tumor biology on the composition of nearby cells often requires characterizing the impact of biologically distinct tumor regions. Biomarkers have been developed to label biologically distinct tumor regions, but challenges arise because of differences in the spatial extent and distribution of differentially labeled regions. In this work, we present a framework for systematically investigating the impact of distinct tumor regions on cells near the tumor borders, accounting for their cross-spatial distributions. We apply the framework to multiplex immunohistochemistry (mIHC) studies of pancreatic cancer and show its efficacy in demonstrating how biologically different tumor regions impact the immune response in the tumor microenvironment. Furthermore, we show that the proposed framework can be extended to large-scale whole slide image analysis.
Submitted 1 May, 2022; v1 submitted 23 April, 2022;
originally announced April 2022.
-
A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study
Authors:
Sarah N Dudgeon,
Si Wen,
Matthew G Hanna,
Rajarsi Gupta,
Mohamed Amgad,
Manasi Sheth,
Hetal Marble,
Richard Huang,
Markus D Herrmann,
Clifford H. Szu,
Darick Tong,
Bruce Werness,
Evan Szu,
Denis Larsimont,
Anant Madabhushi,
Evangelos Hytopoulos,
Weijie Chen,
Rajendra Singh,
Steven N. Hart,
Joel Saltz,
Roberto Salgado,
Brandon D Gallas
Abstract:
Purpose: In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images (WSIs). We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor infiltrating lymphocytes (sTILs) in breast cancer. Methods: We digitized 64 glass slides of hematoxylin- and eosin-stained ductal carcinoma core biopsies prepared at a single clinical site. We created training materials and workflows to crowdsource pathologist image annotations on two modes: an optical microscope and two digital platforms. The workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and if appropriate, the sTIL density value for that ROI. Results: The pilot study yielded an abundant number of cases with nominal sTIL infiltration. Furthermore, we found that the sTIL densities are correlated within a case, and there is notable pathologist variability. Consequently, we outline plans to improve our ROI and case sampling methods. We also outline statistical methods to account for ROI correlations within a case and pathologist variability when validating an algorithm. Conclusion: We have built workflows for efficient data collection and tested them in a pilot study. As we prepare for pivotal studies, we will consider what it will take for the dataset to be fit for a regulatory purpose: study size, patient population, and pathologist training and qualifications. To this end, we will elicit feedback from the FDA via the Medical Device Development Tool program and from the broader digital pathology and AI community. Ultimately, we intend to share the dataset, statistical methods, and lessons learned.
Submitted 14 October, 2020;
originally announced October 2020.
-
Representing Whole Slide Cancer Image Features with Hilbert Curves
Authors:
Erich Bremer,
Jonas Almeida,
Joel Saltz
Abstract:
Regions of Interest (ROI) that contain morphological features in pathology whole slide images (WSI) are delimited with polygons[1]. These polygons are often represented in either a textual notation (with the array of edges) or in a binary mask form. Textual notations have the advantage of human readability and portability, whereas binary mask representations are more useful as the input and output of feature-extraction pipelines that employ deep learning methodologies. For any given whole slide image, more than a million cellular features can be segmented, generating a corresponding number of polygons. The corpus of these segmentations for all processed whole slide images creates various challenges for filtering specific areas of data for use in interactive real-time and multi-scale displays and analysis. Simple range queries of image locations do not scale and, instead, spatial indexing schemes are required. In this paper we propose using Hilbert Curves simultaneously for spatial indexing and as a polygonal ROI representation. This is achieved by using a series of Hilbert Curves[2], creating an efficient and inherently spatially-indexed machine-usable form. The distinctive property of Hilbert curves that enables both mask and polygon delimitation of ROIs is that the elements of the vector extracted to describe morphological features maintain their relative positions for different scales of the same image.
Submitted 13 May, 2020;
originally announced May 2020.
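The Hilbert-curve indexing idea in the abstract above can be illustrated with a minimal Python sketch. It uses the standard iterative mapping from a grid cell to its distance along the curve, not the authors' implementation; the function name `hilbert_index`, the grid order, and the sample centroids are assumptions made for illustration.

```python
def hilbert_index(order: int, x: int, y: int) -> int:
    """Return the distance of cell (x, y) along a Hilbert curve.

    The curve covers a square grid of side 2**order. Cells that are
    consecutive along the curve are always neighbours in the grid, which
    is what makes the index useful for spatial range queries.
    """
    n = 1 << order
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("cell lies outside the grid")
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Add the number of cells in the quadrants the curve visits first.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


if __name__ == "__main__":
    # Hypothetical polygon centroids on an 8 x 8 grid, ordered along the curve.
    centroids = [(5, 2), (0, 0), (7, 7), (1, 6)]
    for cell in sorted(centroids, key=lambda c: hilbert_index(3, *c)):
        print(cell, hilbert_index(3, *cell))
```

Storing features sorted by this single integer lets a one-dimensional index answer many two-dimensional proximity queries, since nearby curve positions correspond to nearby cells.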
-
Large-scale Analysis of Opioid Poisoning Related Hospital Visits in New York State
Authors:
Xin Chen,
Yu Wang,
Xiaxia Yu,
Elinor Schoenfeld,
Mary Saltz,
Joel Saltz,
Fusheng Wang
Abstract:
Opioid-related deaths have increased dramatically in recent years, and the opioid epidemic is worsening in the United States. Combating the opioid epidemic has become a high priority for both the U.S. government and local governments such as New York State. Analyzing patient-level opioid-related hospital visits provides a data-driven approach to discover both spatial and temporal patterns and identify potential causes of opioid-related deaths, which provides essential knowledge for government decision making. In this paper, we analyzed opioid poisoning related hospital visits using New York State SPARCS data, which provides diagnoses of patients in hospital visits. We identified all patients with a primary diagnosis of opioid poisoning from 2010-2014 for our main studies, and from 2003-2014 for temporal trend studies. We performed demographic studies and summarized the historical trends of opioid poisoning. We used frequent item mining to find co-occurrences of diagnoses for possible causes of poisoning or effects from poisoning. We provided zip code level spatial analysis to detect local spatial clusters, and studied potential correlations between opioid poisoning and demographic and socioeconomic factors.
Submitted 7 May, 2018; v1 submitted 15 November, 2017;
originally announced November 2017.
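The co-occurrence mining mentioned in the abstract above can be sketched in its simplest form: count how often each pair of diagnoses appears in the same visit and keep pairs that reach a minimum support. This is a toy illustration, not the authors' method or data; the function name `frequent_pairs` and the placeholder labels "A", "B", "C" are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def frequent_pairs(visits: Iterable[List[str]], min_support: int) -> Dict[Tuple[str, str], int]:
    """Return diagnosis pairs that co-occur in at least `min_support` visits.

    Each visit is a list of diagnosis labels. Duplicates within a visit are
    ignored, and each pair is stored in sorted order so (A, B) == (B, A).
    """
    counts: Counter = Counter()
    for codes in visits:
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}


if __name__ == "__main__":
    # Placeholder labels only; these are not real diagnosis codes or records.
    toy_visits = [["A", "B"], ["A", "B", "C"], ["B", "C"], ["A"]]
    print(frequent_pairs(toy_visits, min_support=2))
```

Full frequent itemset algorithms such as Apriori extend the same counting idea to larger sets while pruning candidates whose subsets are already infrequent.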