-
DeepEdit: Deep Editable Learning for Interactive Segmentation of 3D Medical Images
Authors:
Andres Diaz-Pinto,
Pritesh Mehta,
Sachidanand Alle,
Muhammad Asad,
Richard Brown,
Vishwesh Nath,
Alvin Ihsani,
Michela Antonelli,
Daniel Palkovics,
Csaba Pinter,
Ron Alkalay,
Steve Pieper,
Holger R. Roth,
Daguang Xu,
Prerna Dogra,
Tom Vercauteren,
Andrew Feng,
Abood Quraini,
Sebastien Ourselin,
M. Jorge Cardoso
Abstract:
Automatic segmentation of medical images is a key step for diagnostic and interventional tasks. However, achieving this requires large amounts of annotated volumes, and producing them can be a tedious and time-consuming task for expert annotators. In this paper, we introduce DeepEdit, a deep learning-based method for volumetric medical image annotation that allows automatic and semi-automatic segmentation as well as click-based refinement. DeepEdit combines the power of two methods, a non-interactive one (e.g. automatic segmentation using nnU-Net, UNET or UNETR) and an interactive one (i.e. DeepGrow), into a single deep learning model. It allows easy integration of uncertainty-based ranking strategies (i.e. aleatoric and epistemic uncertainty computation) and active learning. We propose and implement a method for training DeepEdit by using standard training combined with user interaction simulation. Once trained, DeepEdit allows clinicians to quickly segment their datasets by using the algorithm in auto segmentation mode or by providing clicks via a user interface (e.g. 3D Slicer, OHIF). We show the value of DeepEdit through evaluation on the PROSTATEx dataset for prostate/prostatic lesions and the Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) dataset for abdominal CT segmentation, using state-of-the-art network architectures as baselines for comparison. DeepEdit could reduce the time and effort of annotating 3D medical images compared to DeepGrow alone. Source code is available at https://github.com/Project-MONAI/MONAILabel
Submitted 17 May, 2023;
originally announced May 2023.
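The DeepEdit abstract above mentions training with "standard training combined with user interaction simulation" but gives no details. The following is a minimal sketch of one plausible way to simulate clicks, not the authors' implementation: the function names (`simulate_click`, `make_training_input`), the probability `p_auto`, the flat-list mask representation and the "correct the larger error first" heuristic are all assumptions made for illustration.

```python
import random

def simulate_click(pred, target, rng):
    """Pick one voxel index where a hypothetical annotator would click.

    Returns (index, kind): kind is "foreground" for a missed voxel,
    "background" for a false positive. Returns None if the masks agree.
    """
    false_neg = [i for i, (p, t) in enumerate(zip(pred, target)) if t and not p]
    false_pos = [i for i, (p, t) in enumerate(zip(pred, target)) if p and not t]
    if not false_neg and not false_pos:
        return None
    # Assumed heuristic: correct the larger error region first.
    if len(false_neg) >= len(false_pos):
        return rng.choice(false_neg), "foreground"
    return rng.choice(false_pos), "background"

def make_training_input(image, pred, target, rng, p_auto=0.5):
    """Build a 3-channel input: image, foreground clicks, background clicks.

    With probability p_auto the click channels stay empty, mimicking the
    automatic mode; otherwise one simulated click is added.
    """
    fg = [0] * len(image)
    bg = [0] * len(image)
    if rng.random() >= p_auto:
        click = simulate_click(pred, target, rng)
        if click is not None:
            index, kind = click
            (fg if kind == "foreground" else bg)[index] = 1
    return [list(image), fg, bg]

# Toy 1D "volume": the prediction misses voxels 2 and 3 and adds voxel 4.
image = [0.1, 0.9, 0.8, 0.7, 0.2]
target = [0, 1, 1, 1, 0]
pred = [0, 1, 0, 0, 1]
channels = make_training_input(image, pred, target, random.Random(0), p_auto=0.0)
print(channels[1], channels[2])
```

Mixing empty and non-empty click channels during training is what would let a single model serve both the automatic and the click-refinement modes the abstract describes; the exact mixing ratio here is arbitrary.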
-
The Medical Segmentation Decathlon
Authors:
Michela Antonelli,
Annika Reinke,
Spyridon Bakas,
Keyvan Farahani,
Annette Kopp-Schneider,
Bennett A. Landman,
Geert Litjens,
Bjoern Menze,
Olaf Ronneberger,
Ronald M. Summers,
Bram van Ginneken,
Michel Bilello,
Patrick Bilic,
Patrick F. Christ,
Richard K. G. Do,
Marc J. Gollub,
Stephan H. Heckers,
Henkjan Huisman,
William R. Jarnagin,
Maureen K. McHugo,
Sandy Napel,
Jennifer S. Goli Pernicka,
Kawal Rhode,
Catalina Tobon-Gomez,
Eugene Vorontsov, et al. (34 additional authors not shown)
Abstract:
International challenges have become the de facto standard for comparative assessment of image analysis algorithms given a specific task. Segmentation is so far the most widely investigated medical image processing task, but the various segmentation challenges have typically been organized in isolation, such that algorithm development was driven by the need to tackle a single specific clinical problem. We hypothesized that a method capable of performing well on multiple tasks will generalize well to a previously unseen task and potentially outperform a custom-designed solution. To investigate the hypothesis, we organized the Medical Segmentation Decathlon (MSD), a biomedical image analysis challenge in which algorithms compete in a multitude of both tasks and modalities. The underlying data set was designed to explore the axis of difficulties typically encountered when dealing with medical images, such as small data sets, unbalanced labels, multi-site data and small objects. The MSD challenge confirmed that algorithms with consistently good performance on a set of tasks preserved their good average performance on a different set of previously unseen tasks. Moreover, by monitoring the MSD winner for two years, we found that this algorithm continued generalizing well to a wide range of other clinical problems, further confirming our hypothesis. Three main conclusions can be drawn from this study: (1) state-of-the-art image segmentation algorithms are mature, accurate, and generalize well when retrained on unseen tasks; (2) consistent algorithmic performance across multiple tasks is a strong surrogate of algorithmic generalizability; (3) the training of accurate AI segmentation models is now commoditized to non-AI experts.
Submitted 10 June, 2021;
originally announced June 2021.
-
Common Limitations of Image Processing Metrics: A Picture Story
Authors:
Annika Reinke,
Minu D. Tizabi,
Carole H. Sudre,
Matthias Eisenmann,
Tim Rädsch,
Michael Baumgartner,
Laura Acion,
Michela Antonelli,
Tal Arbel,
Spyridon Bakas,
Peter Bankhead,
Arriel Benis,
Matthew Blaschko,
Florian Buettner,
M. Jorge Cardoso,
Jianxu Chen,
Veronika Cheplygina,
Evangelia Christodoulou,
Beth Cimini,
Gary S. Collins,
Sandy Engelhardt,
Keyvan Farahani,
Luciana Ferrer,
Adrian Galdran,
Bram van Ginneken, et al. (68 additional authors not shown)
Abstract:
While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the automatic algorithms used, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. These are typically related to (1) the disregard of inherent metric properties, such as the behaviour in the presence of class imbalance or small target structures, (2) the disregard of inherent data set properties, such as the non-independence of the test cases, and (3) the disregard of the actual biomedical domain interest that the metrics should reflect. This dynamically updated living document aims to illustrate important limitations of performance metrics commonly applied in the field of image analysis. In this context, it focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.
Submitted 6 December, 2023; v1 submitted 12 April, 2021;
originally announced April 2021.
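The first pitfall named in the abstract above, metric behaviour for small target structures, can be illustrated with a short calculation. The sketch below is not taken from the paper; it assumes the Dice similarity coefficient as the metric and uses hypothetical one-dimensional masks to show that the same two-voxel localisation error is penalised very differently depending on structure size.

```python
def dice(pred, target):
    """Dice similarity coefficient between two binary masks (flat 0/1 lists)."""
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

def mask(n, start, length):
    """Flat binary mask of n voxels with `length` foreground voxels from `start`."""
    return [1 if start <= i < start + length else 0 for i in range(n)]

N = 1000
# Same absolute error (prediction shifted by 2 voxels), different structure sizes.
small_target, small_pred = mask(N, 100, 4), mask(N, 102, 4)
large_target, large_pred = mask(N, 100, 400), mask(N, 102, 400)

dice_small = dice(small_pred, small_target)
dice_large = dice(large_pred, large_target)
print(f"small structure: {dice_small:.3f}")  # 0.500
print(f"large structure: {dice_large:.3f}")  # 0.995
```

A shift of two voxels halves the score for the 4-voxel structure while barely affecting the 400-voxel one, so averaging Dice over cases with very different target sizes can hide or exaggerate localisation errors.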
-
Detection of Maternal and Fetal Stress from the Electrocardiogram with Self-Supervised Representation Learning
Authors:
Pritam Sarkar,
Silvia Lobmaier,
Bibiana Fabre,
Diego González,
Alexander Mueller,
Martin G. Frasch,
Marta C. Antonelli,
Ali Etemad
Abstract:
In the pregnant mother and her fetus, chronic prenatal stress results in entrainment of the fetal heartbeat by the maternal heartbeat, quantified by the fetal stress index (FSI). Deep learning (DL) is capable of pattern detection in complex medical data with high accuracy in noisy real-life environments, but little is known about DL's utility in non-invasive biometric monitoring during pregnancy. A recently established self-supervised learning (SSL) approach to DL provides emotional recognition from electrocardiogram (ECG). We hypothesized that SSL will identify chronically stressed mother-fetus dyads from the raw maternal abdominal electrocardiograms (aECG), containing fetal and maternal ECG. Chronically stressed mothers and controls matched at enrolment at 32 weeks of gestation were studied. We validated the chronic stress exposure by psychological inventory, maternal hair cortisol and FSI. We tested two variants of SSL architecture, one trained on the generic ECG features for emotional recognition obtained from public datasets and another transfer-learned on a subset of our data. Our DL models accurately detect the chronic stress exposure group (AUROC=0.982±0.002), the individual psychological stress score (R²=0.943±0.009) and FSI at 34 weeks of gestation (R²=0.946±0.013), as well as the maternal hair cortisol at birth reflecting chronic stress exposure (0.931±0.006). The best performance was achieved with the DL model trained on the public dataset and using maternal ECG alone. The present DL approach provides a novel source of physiological insights into complex multi-modal relationships between different regulatory systems exposed to chronic stress. The final DL model can be deployed in low-cost regular ECG biosensors as a simple, ubiquitous early stress detection and monitoring tool during pregnancy. This discovery should enable early behavioral interventions.
Submitted 5 May, 2021; v1 submitted 3 November, 2020;
originally announced November 2020.
-
A large annotated medical image dataset for the development and evaluation of segmentation algorithms
Authors:
Amber L. Simpson,
Michela Antonelli,
Spyridon Bakas,
Michel Bilello,
Keyvan Farahani,
Bram van Ginneken,
Annette Kopp-Schneider,
Bennett A. Landman,
Geert Litjens,
Bjoern Menze,
Olaf Ronneberger,
Ronald M. Summers,
Patrick Bilic,
Patrick F. Christ,
Richard K. G. Do,
Marc Gollub,
Jennifer Golia-Pernicka,
Stephan H. Heckers,
William R. Jarnagin,
Maureen K. McHugo,
Sandy Napel,
Eugene Vorontsov,
Lena Maier-Hein,
M. Jorge Cardoso
Abstract:
Semantic segmentation of medical images aims to associate a pixel with a label in a medical image without human initialization. The success of semantic segmentation algorithms is contingent on the availability of high-quality imaging data with corresponding labels provided by experts. We sought to create a large collection of annotated medical image datasets of various clinically relevant anatomies available under an open-source license to facilitate the development of semantic segmentation algorithms. Such a resource would allow: 1) objective assessment of general-purpose segmentation methods through comprehensive benchmarking and 2) open and free access to medical image data for any researcher interested in the problem domain. Through a multi-institutional effort, we generated a large, curated dataset representative of several highly variable segmentation tasks that was used in a crowd-sourced challenge, the Medical Segmentation Decathlon, held during the 2018 Medical Image Computing and Computer Aided Interventions Conference in Granada, Spain. Here, we describe these ten labeled image datasets so that these data may be effectively reused by the research community.
Submitted 24 February, 2019;
originally announced February 2019.