Synthetically Enhanced: Unveiling Synthetic Data's Potential in Medical Imaging Research
Authors:
Bardia Khosravi,
Frank Li,
Theo Dapamede,
Pouria Rouzrokh,
Cooper U. Gamble,
Hari M. Trivedi,
Cody C. Wyles,
Andrew B. Sellergren,
Saptarshi Purkayastha,
Bradley J. Erickson,
Judy W. Gichoya
Abstract:
Chest X-rays (CXR) are the most common medical imaging study and are used to diagnose multiple medical conditions. This study examines the impact of synthetic data supplementation, using diffusion models, on the performance of deep learning (DL) classifiers for CXR analysis. We employed three datasets: CheXpert, MIMIC-CXR, and Emory Chest X-ray, training conditional denoising diffusion probabilistic models (DDPMs) to generate synthetic frontal radiographs. Our approach ensured that synthetic images mirrored the demographic and pathological traits of the original data. Evaluating the classifiers' performance on internal and external datasets revealed that synthetic data supplementation enhances model accuracy, particularly in detecting less prevalent pathologies. Furthermore, models trained on synthetic data alone approached the performance of those trained on real data. This suggests that synthetic data can potentially compensate for real data shortages in training robust DL models. However, despite promising outcomes, the superiority of real data persists.
Submitted 15 November, 2023;
originally announced November 2023.