Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables
Authors:
Marcus A. Badgeley,
John R. Zech,
Luke Oakden-Rayner,
Benjamin S. Glicksberg,
Manway Liu,
William Gale,
Michael V. McConnell,
Beth Percha,
Thomas M. Snyder,
Joel T. Dudley
Abstract:
Hip fractures are a leading cause of death and disability among older adults. Hip fractures are also the most commonly missed diagnosis on pelvic radiographs. Computer-Aided Diagnosis (CAD) algorithms have shown promise for helping radiologists detect fractures, but the image features underpinning their predictions are notoriously difficult to understand. In this study, we trained deep learning models on 17,587 radiographs to classify fracture, five patient traits, and 14 hospital process variables. All 20 variables could be predicted from a radiograph (p < 0.05), with the best performances on scanner model (AUC=1.00), scanner brand (AUC=0.98), and whether the order was marked "priority" (AUC=0.79). Fracture was predicted moderately well from the image (AUC=0.78) and better when combining image features with patient data (AUC=0.86, p=2e-9) or patient data plus hospital process features (AUC=0.91, p=1e-21). The model performance on a test set with matched patient variables was significantly lower than on a random test set (AUC=0.67, p=0.003), and when the test set was matched on patient and image acquisition variables, the model performed randomly (AUC=0.52, 95% CI 0.46-0.58), indicating that these variables were the main source of the model's predictive ability overall. We also used Naive Bayes to combine evidence from image models with patient and hospital data and found their inclusion improved performance, but that this approach was nevertheless inferior to directly modeling all variables. If CAD algorithms are inexplicably leveraging patient and process variables in their predictions, it is unclear how radiologists should interpret their predictions in the context of other known patient data. Further research is needed to illuminate deep learning decision processes so that computers and clinicians can effectively cooperate.
Submitted 8 November, 2018;
originally announced November 2018.
Predicting Cardiovascular Risk Factors from Retinal Fundus Photographs using Deep Learning
Authors:
Ryan Poplin,
Avinash V. Varadarajan,
Katy Blumer,
Yun Liu,
Michael V. McConnell,
Greg S. Corrado,
Lily Peng,
Dale R. Webster
Abstract:
Traditionally, medical discoveries are made by observing associations and then designing experiments to test these hypotheses. However, observing and quantifying associations in images can be difficult because of the wide variety of features, patterns, colors, values, and shapes in real data. In this paper, we use deep learning, a machine learning technique that learns its own features, to discover new knowledge from retinal fundus images. Using models trained on data from 284,335 patients, and validated on two independent datasets of 12,026 and 999 patients, we predict cardiovascular risk factors not previously thought to be present or quantifiable in retinal images, such as age (within 3.26 years), gender (0.97 AUC), smoking status (0.71 AUC), HbA1c (within 1.39%), systolic blood pressure (within 11.23 mmHg), as well as major adverse cardiac events (0.70 AUC). We further show that our models used distinct aspects of the anatomy to generate each prediction, such as the optic disc or blood vessels, opening avenues of further research.
Submitted 21 September, 2017; v1 submitted 31 August, 2017;
originally announced August 2017.