A Real-World Demonstration of Machine Learning Generalizability: Intracranial Hemorrhage Detection on Head CT
Authors: Hojjat Salehinejad, Jumpei Kitamura, Noah Ditkofsky, Amy Lin, Aditya Bharatha, Suradech Suthiphosuwan, Hui-Ming Lin, Jefferson R. Wilson, Muhammad Mamdani, Errol Colak
Abstract:
Machine learning (ML) holds great promise in transforming healthcare. While published studies have shown the utility of ML models in interpreting medical imaging examinations, these models are often evaluated only under laboratory settings. The importance of real-world evaluation is best illustrated by case studies that have documented successes and failures in the translation of these models into clinical environments. A key prerequisite for the clinical adoption of these technologies is demonstrating generalizable ML model performance under real-world circumstances. The purpose of this study was to demonstrate that ML model generalizability is achievable in medical imaging, with the detection of intracranial hemorrhage (ICH) on non-contrast computed tomography (CT) scans serving as the use case. An ML model was trained using 21,784 scans from the RSNA Intracranial Hemorrhage CT dataset, and generalizability was evaluated using an external validation dataset obtained from our busy trauma and neurosurgical center. This real-world external validation dataset consisted of every unenhanced head CT scan (n = 5,965) performed in our emergency department in 2019, without exclusion. The model demonstrated an AUC of 98.4%, sensitivity of 98.8%, and specificity of 98.0% on the test dataset. On external validation, the model demonstrated an AUC of 95.4%, sensitivity of 91.3%, and specificity of 94.1%. Evaluating the ML model using a real-world external validation dataset that is temporally and geographically distinct from the training dataset indicates that ML generalizability is achievable in medical imaging applications.
Submitted 9 February, 2021;
originally announced February 2021.
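The abstract above reports AUC, sensitivity, and specificity. As a reminder of how these three quantities relate to per-scan predictions, here is a minimal sketch in plain Python. The labels, scores, and the 0.5 decision threshold are made-up toy values chosen for illustration; they are not data from the study, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation). This is O(P * N),
    which is fine for a toy example but slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05]
threshold = 0.5  # assumed operating point
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.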
Deep Sequential Learning for Cervical Spine Fracture Detection on Computed Tomography Imaging
Authors: Hojjat Salehinejad, Edward Ho, Hui-Ming Lin, Priscila Crivellaro, Oleksandra Samorodova, Monica Tafur Arciniegas, Zamir Merali, Suradech Suthiphosuwan, Aditya Bharatha, Kristen Yeom, Muhammad Mamdani, Jefferson Wilson, Errol Colak
Abstract:
Fractures of the cervical spine are a medical emergency and may lead to permanent paralysis and even death. Accurate diagnosis by computed tomography (CT) in patients with suspected fractures is critical to patient management. In this paper, we propose a deep convolutional neural network (DCNN) with a bidirectional long short-term memory (BLSTM) layer for the automated detection of cervical spine fractures in axial CT images. We used an annotated dataset of 3,666 CT scans (729 positive and 2,937 negative cases) to train and validate the model. The validation results show classification accuracies of 70.92% and 79.18% on the balanced (104 positive and 104 negative cases) and imbalanced (104 positive and 419 negative cases) test datasets, respectively.
Submitted 5 February, 2021; v1 submitted 26 October, 2020;
originally announced October 2020.
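The abstract above reports different accuracies on a balanced and an imbalanced test set. One reason accuracy can shift with class mix is that it is a case-weighted average of sensitivity and specificity. The sketch below illustrates this using the test-set sizes quoted in the abstract; the sensitivity and specificity values are hypothetical placeholders, not figures from the paper, and the function name is an assumption.

```python
def accuracy_from_rates(sensitivity: float, specificity: float, n_pos: int, n_neg: int) -> float:
    """Overall accuracy implied by fixed sensitivity/specificity and a given class mix."""
    correct = sensitivity * n_pos + specificity * n_neg
    return correct / (n_pos + n_neg)


# Hypothetical operating point (NOT reported in the paper).
assumed_sensitivity = 0.60
assumed_specificity = 0.80

# Test-set sizes as stated in the abstract.
balanced = accuracy_from_rates(assumed_sensitivity, assumed_specificity, n_pos=104, n_neg=104)
imbalanced = accuracy_from_rates(assumed_sensitivity, assumed_specificity, n_pos=104, n_neg=419)

print(f"balanced accuracy estimate:   {balanced:.4f}")
print(f"imbalanced accuracy estimate: {imbalanced:.4f}")
```

Under these assumed rates, the negative-heavy set yields the higher accuracy simply because specificity carries more weight, so accuracy on an imbalanced set is best read alongside per-class metrics.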