-
Real-World Performance of Autonomously Reporting Normal Chest Radiographs in NHS Trusts Using a Deep-Learning Algorithm on the GP Pathway
Authors:
Jordan Smith,
Tom Naunton Morgan,
Paul Williams,
Qaiser Malik,
Simon Rasalingham
Abstract:
AIM To analyse the performance of a deep-learning (DL) algorithm currently deployed as diagnostic decision support software in two NHS Trusts used to identify normal chest x-rays in active clinical pathways.
MATERIALS AND METHODS A DL algorithm has been deployed in Somerset NHS Foundation Trust (SFT) since December 2022, and at Calderdale & Huddersfield NHS Foundation Trust (CHFT) since March 20…
▽ More
AIM To analyse the performance of a deep-learning (DL) algorithm currently deployed as diagnostic decision support software in two NHS Trusts used to identify normal chest x-rays in active clinical pathways.
MATERIALS AND METHODS A DL algorithm has been deployed in Somerset NHS Foundation Trust (SFT) since December 2022, and at Calderdale & Huddersfield NHS Foundation Trust (CHFT) since March 2023. The algorithm was developed and trained prior to deployment, and is used to assign abnormality scores to each GP-requested chest x-ray (CXR). The algorithm classifies a subset of examinations with the lowest abnormality scores as High Confidence Normal (HCN), and displays this result to the Trust. This two-site study includes 4,654 CXR continuous examinations processed by the algorithm over a six-week period.
RESULTS When classifying 20.0% of assessed examinations (930) as HCN, the model classified exams with a negative predictive value (NPV) of 0.96. There were 0.77% of examinations (36) classified incorrectly as HCN, with none of the abnormalities considered clinically significant by auditing radiologists. The DL software maintained fast levels of service to clinicians, with results returned to Trusts in a mean time of 7.1 seconds.
CONCLUSION The DL algorithm performs with a low rate of error and is highly effective as an automated diagnostic decision support tool, used to autonomously report a subset of CXRs as normal with high confidence. Removing 20% of all CXRs reduces workload for reporters and allows radiology departments to focus resources elsewhere.
△ Less
Submitted 28 June, 2023;
originally announced June 2023.
-
Robustness of an Artificial Intelligence Solution for Diagnosis of Normal Chest X-Rays
Authors:
Tom Dyer,
Jordan Smith,
Gaetan Dissez,
Nicole Tay,
Qaiser Malik,
Tom Naunton Morgan,
Paul Williams,
Liliana Garcia-Mondragon,
George Pearse,
Simon Rasalingham
Abstract:
Purpose: Artificial intelligence (AI) solutions for medical diagnosis require thorough evaluation to demonstrate that performance is maintained for all patient sub-groups and to ensure that proposed improvements in care will be delivered equitably. This study evaluates the robustness of an AI solution for the diagnosis of normal chest X-rays (CXRs) by comparing performance across multiple patient…
▽ More
Purpose: Artificial intelligence (AI) solutions for medical diagnosis require thorough evaluation to demonstrate that performance is maintained for all patient sub-groups and to ensure that proposed improvements in care will be delivered equitably. This study evaluates the robustness of an AI solution for the diagnosis of normal chest X-rays (CXRs) by comparing performance across multiple patient and environmental subgroups, as well as comparing AI errors with those made by human experts.
Methods: A total of 4,060 CXRs were sampled to represent a diverse dataset of NHS patients and care settings. Ground-truth labels were assigned by a 3-radiologist panel. AI performance was evaluated against assigned labels and sub-groups analysis was conducted against patient age and sex, as well as CXR view, modality, device manufacturer and hospital site.
Results: The AI solution was able to remove 18.5% of the dataset by classification as High Confidence Normal (HCN). This was associated with a negative predictive value (NPV) of 96.0%, compared to 89.1% for diagnosis of normal scans by radiologists. In all AI false negative (FN) cases, a radiologist was found to have also made the same error when compared to final ground-truth labels. Subgroup analysis showed no statistically significant variations in AI performance, whilst reduced normal classification was observed in data from some hospital sites.
Conclusion: We show the AI solution could provide meaningful workload savings by diagnosis of 18.5% of scans as HCN with a superior NPV to human readers. The AI solution is shown to perform well across patient subgroups and error cases were shown to be subjective or subtle in nature.
△ Less
Submitted 31 August, 2022;
originally announced September 2022.
-
Enhancing Early Lung Cancer Detection on Chest Radiographs with AI-assistance: A Multi-Reader Study
Authors:
Gaetan Dissez,
Nicole Tay,
Tom Dyer,
Matthew Tam,
Richard Dittrich,
David Doyne,
James Hoare,
Jackson J. Pat,
Stephanie Patterson,
Amanda Stockham,
Qaiser Malik,
Tom Naunton Morgan,
Paul Williams,
Liliana Garcia-Mondragon,
Jordan Smith,
George Pearse,
Simon Rasalingham
Abstract:
Objectives: The present study evaluated the impact of a commercially available explainable AI algorithm in augmenting the ability of clinicians to identify lung cancer on chest X-rays (CXR).
Design: This retrospective study evaluated the performance of 11 clinicians for detecting lung cancer from chest radiographs, with and without assistance from a commercially available AI algorithm (red dot,…
▽ More
Objectives: The present study evaluated the impact of a commercially available explainable AI algorithm in augmenting the ability of clinicians to identify lung cancer on chest X-rays (CXR).
Design: This retrospective study evaluated the performance of 11 clinicians for detecting lung cancer from chest radiographs, with and without assistance from a commercially available AI algorithm (red dot, Behold.ai) that predicts suspected lung cancer from CXRs. Clinician performance was evaluated against clinically confirmed diagnoses.
Setting: The study analysed anonymised patient data from an NHS hospital; the dataset consisted of 400 chest radiographs from adult patients (18 years and above) who had a CXR performed in 2020, with corresponding clinical text reports.
Participants: A panel of readers consisting of 11 clinicians (consultant radiologists, radiologist trainees and reporting radiographers) participated in this study.
Main outcome measures: Overall accuracy, sensitivity, specificity and precision for detecting lung cancer on CXRs by clinicians, with and without AI input. Agreement rates between clinicians and performance standard deviation were also evaluated, with and without AI input.
Results: The use of the AI algorithm by clinicians led to an improved overall performance for lung tumour detection, achieving an overall increase of 17.4% of lung cancers being identified on CXRs which would have otherwise been missed, an overall increase in detection of smaller tumours, a 24% and 13% increased detection of stage 1 and stage 2 lung cancers respectively, and standardisation of clinician performance.
Conclusions: This study showed great promise in the clinical utility of AI algorithms in improving early lung cancer diagnosis and promoting health equity through overall improvement in reader performances, without impacting downstream imaging resources.
△ Less
Submitted 31 August, 2022;
originally announced August 2022.