Search | arXiv e-print repository

doi 10.1038/s41379-022-01147-y

Recommendations on test datasets for evaluating AI solutions in pathology

Authors: André Homeyer, Christian Geißler, Lars Ole Schwen, Falk Zakrzewski, Theodore Evans, Klaus Strohmenger, Max Westphal, Roman David Bülow, Michaela Kargl, Aray Karjauv, Isidre Munné-Bertran, Carl Orge Retzlaff, Adrià Romero-López, Tomasz Sołtysiński, Markus Plass, Rita Carvalho, Peter Steinbach, Yu-Chia Lan, Nassim Bouteldja, David Haber, Mateo Rojas-Carulla, Alireza Vafaei Sadr, Matthias Kraft, Daniel Krüger, Rutger Fick , et al. (5 additional authors not shown)

Abstract: Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging and specific recom… ▽ More Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging and specific recommendations are missing. A committee of various stakeholders, including commercial AI developers, pathologists, and researchers, discussed key aspects and conducted extensive literature reviews on test datasets in pathology. Here, we summarize the results and derive general recommendations for the collection of test datasets. We address several questions: Which and how many images are needed? How to deal with low-prevalence subsets? How can potential bias be detected? How should datasets be reported? What are the regulatory requirements in different countries? The recommendations are intended to help AI developers demonstrate the utility of their products and to help regulatory agencies and end users verify reported performance measures. Further research is needed to formulate criteria for sufficiently representative test datasets so that AI solutions can operate with less user intervention and better support diagnostic workflows in the future. △ Less

Submitted 21 April, 2022; originally announced April 2022.

Journal ref: Mod Pathol (2022)

arXiv:2112.08837 [pdf]

Improving Unsupervised Stain-To-Stain Translation using Self-Supervision and Meta-Learning

Authors: Nassim Bouteldja, Barbara Mara Klinkhammer, Tarek Schlaich, Peter Boor, Dorit Merhof

Abstract: In digital pathology, many image analysis tasks are challenged by the need for large and time-consuming manual data annotations to cope with various sources of variability in the image domain. Unsupervised domain adaptation based on image-to-image translation is gaining importance in this field by addressing variabilities without the manual overhead. Here, we tackle the variation of different hist… ▽ More In digital pathology, many image analysis tasks are challenged by the need for large and time-consuming manual data annotations to cope with various sources of variability in the image domain. Unsupervised domain adaptation based on image-to-image translation is gaining importance in this field by addressing variabilities without the manual overhead. Here, we tackle the variation of different histological stains by unsupervised stain-to-stain translation to enable a stain-independent applicability of a deep learning segmentation model. We use CycleGANs for stain-to-stain translation in kidney histopathology, and propose two novel approaches to improve translational effectivity. First, we integrate a prior segmentation network into the CycleGAN for a self-supervised, application-oriented optimization of translation through semantic guidance, and second, we incorporate extra channels to the translation output to implicitly separate artificial meta-information otherwise encoded for tackling underdetermined reconstructions. The latter showed partially superior performances to the unmodified CycleGAN, but the former performed best in all stains providing instance-level Dice scores ranging between 78% and 92% for most kidney structures, such as glomeruli, tubules, and veins. However, CycleGANs showed only limited performance in the translation of other structures, e.g. arteries. Our study also found somewhat lower performance for all structures in all stains when compared to segmentation in the original stain. Our study suggests that with current unsupervised technologies, it seems unlikely to produce generally applicable simulated stains. △ Less

Submitted 17 May, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

Comments: Accepted for Journal of Pathology Informatics (JPI), 2022

arXiv:2101.01877 [pdf, other]

3D Convolutional Selective Autoencoder For Instability Detection in Combustion Systems

Authors: Tryambak Gangopadhyay, Vikram Ramanan, Adedotun Akintayo, Paige K Boor, Soumalya Sarkar, Satyanarayanan R Chakravarthy, Soumik Sarkar

Abstract: While analytical solutions of critical (phase) transitions in physical systems are abundant for simple nonlinear systems, such analysis remains intractable for real-life dynamical systems. A key example of such a physical system is thermoacoustic instability in combustion, where prediction or early detection of an onset of instability is a hard technical challenge, which needs to be addressed to b… ▽ More While analytical solutions of critical (phase) transitions in physical systems are abundant for simple nonlinear systems, such analysis remains intractable for real-life dynamical systems. A key example of such a physical system is thermoacoustic instability in combustion, where prediction or early detection of an onset of instability is a hard technical challenge, which needs to be addressed to build safer and more energy-efficient gas turbine engines powering aerospace and energy industries. The instabilities arising in combustion chambers of engines are mathematically too complex to model. To address this issue in a data-driven manner instead, we propose a novel deep learning architecture called 3D convolutional selective autoencoder (3D-CSAE) to detect the evolution of self-excited oscillations using spatiotemporal data, i.e., hi-speed videos taken from a swirl-stabilized combustor (laboratory surrogate of gas turbine engine combustor). 3D-CSAE consists of filters to learn, in a hierarchical fashion, the complex visual and dynamic features related to combustion instability. We train the 3D-CSAE on frames of videos obtained from a limited set of operating conditions. We select the 3D-CSAE hyper-parameters that are effective for characterizing hierarchical and multiscale instability structure evolution by utilizing the dynamic information available in the video. The proposed model clearly shows performance improvement in detecting the precursors of instability. The machine learning-driven results are verified with physics-based off-line measures. Advanced active control mechanisms can directly leverage the proposed online detection capability of 3D-CSAE to mitigate the adverse effects of combustion instabilities on the engine operating under various stringent requirements and conditions. △ Less

Submitted 6 January, 2021; originally announced January 2021.

Showing 1–3 of 3 results for author: Boor, P