-
Longitudinal assessment of demographic representativeness in the Medical Imaging and Data Resource Center Open Data Commons
Authors:
Heather M. Whitney,
Natalie Baughan,
Kyle J. Myers,
Karen Drukker,
Judy Gichoya,
Brad Bower,
Weijie Chen,
Nicholas Gruszauskas,
Jayashree Kalpathy-Cramer,
Sanmi Koyejo,
Rui C. Sá,
Berkman Sahiner,
Zi Zhang,
Maryellen L. Giger
Abstract:
Purpose: The Medical Imaging and Data Resource Center (MIDRC) open data commons was launched to accelerate the development of artificial intelligence (AI) algorithms to help address the COVID-19 pandemic. The purpose of this study was to quantify longitudinal representativeness of the demographic characteristics of the primary imaging dataset compared to the United States general population (US Census) and COVID-19 positive case counts from the Centers for Disease Control and Prevention (CDC). Approach: The Jensen-Shannon distance (JSD) was used to longitudinally measure the similarity of the distribution of (1) all unique patients in the MIDRC data to the 2020 US Census and (2) all unique COVID-19 positive patients in the MIDRC data to the case counts reported by the CDC. The distributions were evaluated in the demographic categories of age at index, sex, race, ethnicity, and the intersection of race and ethnicity. Results: Representativeness of the MIDRC data by ethnicity and by the intersection of race and ethnicity was impacted by the percentage of CDC case counts for which data in these categories are not reported. The distributions by sex and race have retained their level of representativeness over time. Conclusion: The representativeness of the open medical imaging datasets in the curated public data commons at MIDRC has evolved over time as both the number of contributing institutions and the overall number of subjects have grown. The use of metrics such as the JSD supports measurement of representativeness, one step needed for fair and generalizable AI algorithm development.
Submitted 18 March, 2023;
originally announced March 2023.
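The Jensen-Shannon distance named in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `jensen_shannon_distance`, the choice of base-2 logarithms (which bounds the distance between 0 and 1), and the example counts are all assumptions made for illustration.

```python
import math


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions.

    p and q are sequences of non-negative counts or proportions over the
    same ordered categories (for example, age bins). Base-2 logarithms are
    an assumption here; they keep the result in the range [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same categories")
    p_total, q_total = sum(p), sum(q)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each distribution must have positive total mass")

    # Normalize counts to probabilities.
    p = [x / p_total for x in p]
    q = [x / q_total for x in q]

    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # The distance is the square root of the divergence.
    return math.sqrt(max(divergence, 0.0))


if __name__ == "__main__":
    # Hypothetical category counts, not taken from MIDRC, the Census, or the CDC.
    dataset_counts = [120, 340, 410, 130]
    reference_counts = [250, 300, 280, 170]
    print(jensen_shannon_distance(dataset_counts, reference_counts))
```

A value of 0 means the two distributions are identical, and larger values mean the dataset departs further from the reference population. Computing the value at successive dates gives the kind of longitudinal view the abstract describes.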
-
Principles and Guidelines for Sharing Biomedical Data for Secondary Use: The University of Chicago Perspective
Authors:
Robert L. Grossman,
Maryellen L. Giger,
Julie A. Johnson,
Jeremy D. Marks,
Jessica P. Ridgway,
Julian Solway,
Walter M. Stadler
Abstract:
Academic medical centers are generating an increasing amount of biomedical data, and there is an increasing demand for biomedical data for research purposes by research projects, research consortia, companies, and other third parties. At the same time, as the number of patients grows and the amount of data per patient grows, there is an increasing possibility that some information about some patients may become available if the data is shared with third parties and the third parties have a data breach or violate the terms of the data use agreement. Balancing the importance of research that may result in improved patient outcomes with the importance of protecting patient data is challenging. The article discusses the principles, the considerations about risks and mitigating risks, and the guidelines used at the University of Chicago for making decisions about sharing biomedical data with third parties.
Submitted 5 February, 2023;
originally announced February 2023.
-
Transfer Learning in 4D for Breast Cancer Diagnosis using Dynamic Contrast-Enhanced Magnetic Resonance Imaging
Authors:
Qiyuan Hu,
Heather M. Whitney,
Maryellen L. Giger
Abstract:
Deep transfer learning using dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has shown strong predictive power in characterization of breast lesions. However, pretrained convolutional neural networks (CNNs) require 2D inputs, limiting the ability to exploit the rich 4D (volumetric and temporal) image information inherent in DCE-MRI that is clinically valuable for lesion assessment. Training 3D CNNs from scratch, a common method to utilize high-dimensional information in medical images, is computationally expensive and is not best suited for moderately sized healthcare datasets. Therefore, we propose a novel approach using transfer learning that incorporates the 4D information from DCE-MRI, where volumetric information is collapsed at feature level by max pooling along the projection perpendicular to the transverse slices and the temporal information is contained in second post-contrast subtraction images. Our methodology yielded an area under the receiver operating characteristic curve of 0.89 ± 0.01 on a dataset of 1161 breast lesions, significantly outperforming a previous approach that incorporates the 4D information in DCE-MRI by the use of maximum intensity projection (MIP) images.
Submitted 7 November, 2019;
originally announced November 2019.
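The feature-level collapse described in the abstract above can be sketched in a few lines of Python. This is an illustrative assumption, not the authors' implementation: the function name `max_pool_across_slices` is hypothetical, and the per-slice feature vectors are taken as already computed by some pretrained 2D CNN, which is not shown.

```python
def max_pool_across_slices(slice_features):
    """Collapse per-slice feature vectors into one lesion-level vector.

    slice_features is a list with one feature vector per transverse slice,
    each assumed to come from a pretrained 2D CNN (not shown here). The
    result keeps, for every feature, its maximum value over all slices.
    """
    if not slice_features:
        raise ValueError("at least one slice is required")
    length = len(slice_features[0])
    if any(len(vector) != length for vector in slice_features):
        raise ValueError("all slices must have the same number of features")

    # zip(*...) groups the i-th feature of every slice together.
    return [max(values) for values in zip(*slice_features)]


if __name__ == "__main__":
    # Hypothetical feature vectors for three slices of one lesion.
    features = [
        [0.10, 0.90, 0.30],
        [0.40, 0.20, 0.35],
        [0.05, 0.60, 0.80],
    ]
    print(max_pool_across_slices(features))
```

The design point is where the maximum is taken. A maximum intensity projection takes it over pixel intensities before any features are extracted, whereas this sketch takes it over features after each slice has been processed separately.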