-
Physics-informed motion registration of lung parenchyma across static CT images
Authors:
Sunder Neelakantan,
Tanmay Mukherjee,
Kyle J. Myers,
Rahim Rizi,
Reza Avazmohammadi
Abstract:
Lung injuries, such as ventilator-induced lung injury and radiation-induced lung injury, can lead to heterogeneous alterations in the biomechanical behavior of the lungs. While imaging methods, e.g., X-ray and static computed tomography (CT), can point to regional alterations in lung structure between healthy and diseased tissue, they fall short of delineating timewise kinematic variations between the former and the latter. Image registration has gained recent interest as a tool to estimate the displacement experienced by the lungs during respiration via regional deformation metrics such as volumetric expansion and distortion. However, successful image registration commonly relies on a temporal series of image stacks with small displacements in the lungs across succeeding image stacks, which remains limited in static imaging. In this study, we have presented a finite element (FE) method to estimate strains from static images acquired at the end-expiration (EE) and end-inspiration (EI) timepoints, i.e., images with a large deformation between the two distant timepoints. Physiologically realistic loads were applied to the geometry obtained at EE to deform this geometry to match the geometry obtained at EI. The results indicated that the simulation could minimize the error between the two geometries. Using four-dimensional (4D) dynamic CT in a rat, the strain at an isolated transverse plane estimated by our method showed sufficient agreement with that estimated through non-rigid image registration that used all the timepoints. Through the proposed method, we can estimate the lung deformation at any timepoint between EE and EI. The proposed method offers a tool to estimate timewise regional deformation in the lungs using only static images acquired at EE and EI.
Submitted 3 July, 2024;
originally announced July 2024.
-
Report on the AAPM Grand Challenge on deep generative modeling for learning medical image statistics
Authors:
Rucha Deshpande,
Varun A. Kelkar,
Dimitrios Gotsis,
Prabhat KC,
Rongping Zeng,
Kyle J. Myers,
Frank J. Brooks,
Mark A. Anastasio
Abstract:
The findings of the 2023 AAPM Grand Challenge on Deep Generative Modeling for Learning Medical Image Statistics are reported in this Special Report. The goal of this challenge was to promote the development of deep generative models (DGMs) for medical imaging and to emphasize the need for their domain-relevant assessment via the analysis of relevant image statistics. As part of this Grand Challenge, a training dataset was developed based on 3D anthropomorphic breast phantoms from the VICTRE virtual imaging toolbox. A two-stage evaluation procedure was developed: a preliminary check for memorization and image quality (based on the Fréchet Inception distance (FID)), followed by a second stage evaluating the reproducibility of image statistics corresponding to domain-relevant radiomic features. A summary measure was employed to rank the submissions. Additional analyses of submissions were performed to assess DGM performance specific to individual feature families, and to identify various artifacts. A total of 58 submissions from 12 unique users were received for this Challenge. The top-ranked submission employed a conditional latent diffusion model, whereas the joint runners-up employed a generative adversarial network, followed by another network for image superresolution. We observed that the overall ranking of the top 9 submissions according to our evaluation method (i) did not match the FID-based ranking, and (ii) differed with respect to individual feature families. Another important finding from our additional analyses was that different DGMs demonstrated similar kinds of artifacts. This Grand Challenge highlighted the need for domain-specific evaluation to further DGM design as well as deployment. It also demonstrated that the specification of a DGM may differ depending on its intended use.
Submitted 2 May, 2024;
originally announced May 2024.
-
Longitudinal assessment of demographic representativeness in the Medical Imaging and Data Resource Center Open Data Commons
Authors:
Heather M. Whitney,
Natalie Baughan,
Kyle J. Myers,
Karen Drukker,
Judy Gichoya,
Brad Bower,
Weijie Chen,
Nicholas Gruszauskas,
Jayashree Kalpathy-Cramer,
Sanmi Koyejo,
Rui C. Sá,
Berkman Sahiner,
Zi Zhang,
Maryellen L. Giger
Abstract:
Purpose: The Medical Imaging and Data Resource Center (MIDRC) open data commons was launched to accelerate the development of artificial intelligence (AI) algorithms to help address the COVID-19 pandemic. The purpose of this study was to quantify longitudinal representativeness of the demographic characteristics of the primary imaging dataset compared to the United States general population (US Census) and COVID-19 positive case counts from the Centers for Disease Control and Prevention (CDC). Approach: The Jensen-Shannon distance (JSD) was used to longitudinally measure the similarity of the distribution of (1) all unique patients in the MIDRC data to the 2020 US Census and (2) all unique COVID-19 positive patients in the MIDRC data to the case counts reported by the CDC. The distributions were evaluated in the demographic categories of age at index, sex, race, ethnicity, and the intersection of race and ethnicity. Results: Representativeness of the MIDRC data by ethnicity and the intersection of race and ethnicity was impacted by the percentage of CDC case counts for which data in these categories is not reported. The distributions by sex and race have retained their level of representativeness over time. Conclusion: The representativeness of the open medical imaging datasets in the curated public data commons at MIDRC has evolved over time as both the number of contributing institutions and the overall number of subjects have grown. The use of metrics such as the JSD supports measurement of representativeness, one step needed for fair and generalizable AI algorithm development.
Submitted 18 March, 2023;
originally announced March 2023.
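For reference, the Jensen-Shannon distance named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the choice of base-2 logarithms (which bounds the distance to [0, 1]), and the example counts are all hypothetical.

```python
import math


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions.

    p and q are sequences of non-negative counts or probabilities over
    the same categories. Base-2 logs are an assumption of this sketch;
    with them the result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same categories")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution needs positive total mass")

    # Normalize counts to probabilities.
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # The distance is the square root of the divergence.
    return math.sqrt(max(divergence, 0.0))


# Hypothetical category counts, for illustration only.
dataset_counts = [480, 520]
reference_counts = [492, 508]
distance = jensen_shannon_distance(dataset_counts, reference_counts)
```

A value of 0 means the two distributions are identical, and larger values indicate a dataset whose composition departs further from the reference population.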
-
Translating Radiology Reports into Plain Language using ChatGPT and GPT-4 with Prompt Learning: Promising Results, Limitations, and Potential
Authors:
Qing Lyu,
Josh Tan,
Michael E. Zapadka,
Janardhana Ponnatapura,
Chuang Niu,
Kyle J. Myers,
Ge Wang,
Christopher T. Whitlow
Abstract:
The large language model called ChatGPT has drawn extensive attention because of its human-like expression and reasoning abilities. In this study, we investigate the feasibility of using ChatGPT to translate radiology reports into plain language for patients and healthcare providers so that they are better educated for improved healthcare. Radiology reports from 62 low-dose chest CT lung cancer screening scans and 76 brain MRI metastases screening scans were collected in the first half of February for this study. According to the evaluation by radiologists, ChatGPT can successfully translate radiology reports into plain language with an average score of 4.27 in the five-point system with 0.08 places of information missing and 0.07 places of misinformation. The suggestions provided by ChatGPT are generally relevant, such as following up with doctors and closely monitoring any symptoms, and for about 37% of the 138 cases in total ChatGPT offers specific suggestions based on findings in the report. ChatGPT also presents some randomness in its responses, with occasionally over-simplified or neglected information, which can be mitigated using a more detailed prompt. Furthermore, ChatGPT results are compared with those of a newly released large model, GPT-4, showing that GPT-4 can significantly improve the quality of translated reports. Our results show that it is feasible to utilize large language models in clinical education, and further efforts are needed to address limitations and maximize their potential.
Submitted 28 March, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
Assessing the ability of generative adversarial networks to learn canonical medical image statistics
Authors:
Varun A. Kelkar,
Dimitrios S. Gotsis,
Frank J. Brooks,
Prabhat KC,
Kyle J. Myers,
Rongping Zeng,
Mark A. Anastasio
Abstract:
In recent years, generative adversarial networks (GANs) have gained tremendous popularity for potential applications in medical imaging, such as medical image synthesis, restoration, reconstruction, translation, as well as objective image quality assessment. Despite the impressive progress in generating high-resolution, perceptually realistic images, it is not clear if modern GANs reliably learn the statistics that are meaningful to a downstream medical imaging application. In this work, the ability of a state-of-the-art GAN to learn the statistics of canonical stochastic image models (SIMs) that are relevant to objective assessment of image quality is investigated. It is shown that although the employed GAN successfully learned several basic first- and second-order statistics of the specific medical SIMs under consideration and generated images with high perceptual quality, it failed to correctly learn several per-image statistics pertinent to these SIMs, highlighting the urgent need to assess medical image GANs in terms of objective measures of image quality.
Submitted 26 April, 2022; v1 submitted 25 April, 2022;
originally announced April 2022.
-
Evaluating Procedures for Establishing Generative Adversarial Network-based Stochastic Image Models in Medical Imaging
Authors:
Varun A. Kelkar,
Dimitrios S. Gotsis,
Frank J. Brooks,
Kyle J. Myers,
Prabhat KC,
Rongping Zeng,
Mark A. Anastasio
Abstract:
Modern generative models, such as generative adversarial networks (GANs), hold tremendous promise for several areas of medical imaging, such as unconditional medical image synthesis, image restoration, reconstruction and translation, and optimization of imaging systems. However, procedures for establishing stochastic image models (SIMs) using GANs remain generic and do not address specific issues relevant to medical imaging. In this work, canonical SIMs that simulate realistic vessels in angiography images are employed to evaluate procedures for establishing SIMs using GANs. The GAN-based SIM is compared to the canonical SIM based on its ability to reproduce those statistics that are meaningful to the particular medically realistic SIM considered. It is shown that evaluating GANs using classical metrics and medically relevant metrics may lead to different conclusions about the fidelity of the trained GANs. This work highlights the need for the development of objective metrics for evaluating GANs.
Submitted 7 April, 2022;
originally announced April 2022.
-
Deep neural networks-based denoising models for CT imaging and their efficacy
Authors:
Prabhat KC,
Rongping Zeng,
M. Mehdi Farhangi,
Kyle J. Myers
Abstract:
Most of the deep neural network (DNN)-based CT image denoising literature shows that DNNs outperform traditional iterative methods in terms of metrics such as the RMSE, the PSNR and the SSIM. In many instances, using the same metrics, the DNN results from low-dose inputs are also shown to be comparable to their high-dose counterparts. However, these metrics do not reveal whether the DNN results preserve the visibility of subtle lesions or whether they alter CT image properties such as the noise texture. Accordingly, in this work, we seek to examine the image quality of the DNN results from a holistic viewpoint for low-dose CT image denoising. First, we build a library of advanced DNN denoising architectures. This library comprises denoising architectures such as the DnCNN, U-Net, Red-Net, GAN, etc. Next, each network is modeled, as well as trained, such that it yields its best performance in terms of the PSNR and SSIM. To that end, data inputs (e.g. training patch-size, reconstruction kernel) and numeric-optimizer inputs (e.g. minibatch size, learning rate, loss function) are tuned accordingly. Finally, outputs from the networks thus trained are further subjected to a series of CT bench testing metrics such as the contrast-dependent MTF, the NPS and the HU accuracy. These metrics are employed to perform a more nuanced study of the resolution of the DNN outputs' low-contrast features, their noise textures, and their CT number accuracy to better understand the impact each DNN algorithm has on these underlying attributes of image quality.
Submitted 18 November, 2021;
originally announced November 2021.
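As a point of reference for the global metrics the abstract above argues are insufficient on their own, RMSE and PSNR can be sketched as below. This is a minimal illustration, not the authors' evaluation code: the function names, the flat-list image representation, and the `data_range` parameter (the assumed span of valid intensities) are choices made for this sketch.

```python
import math


def rmse(reference, test):
    """Root-mean-square error between two equally sized pixel lists."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    squared = sum((r - t) ** 2 for r, t in zip(reference, test))
    return math.sqrt(squared / len(reference))


def psnr(reference, test, data_range):
    """Peak signal-to-noise ratio in dB.

    data_range is the assumed span of valid intensities, supplied by
    the caller (a hypothetical parameter of this sketch).
    """
    error = rmse(reference, test)
    if error == 0:
        return math.inf  # identical images
    return 20 * math.log10(data_range / error)
```

Both quantities average the error over all pixels, so a small low-contrast lesion that is blurred away changes them very little. That is consistent with the abstract's point that bench tests such as the MTF and NPS are needed in addition.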
-
Objective task-based evaluation of artificial intelligence-based medical imaging methods: Framework, strategies and role of the physician
Authors:
Abhinav K. Jha,
Kyle J. Myers,
Nancy A. Obuchowski,
Ziping Liu,
Md Ashequr Rahman,
Babak Saboury,
Arman Rahmim,
Barry A. Siegel
Abstract:
Artificial intelligence (AI)-based methods are showing promise in multiple medical-imaging applications. Thus, there is substantial interest in clinical translation of these methods, requiring, in turn, that they be evaluated rigorously. In this paper, our goal is to lay out a framework for objective task-based evaluation of AI methods. We will also provide a list of tools available in the literature to conduct this evaluation. Further, we outline the important role of physicians in conducting these evaluation studies. The examples in this paper will be presented in the context of PET with a focus on neural-network-based methods. However, the framework is also applicable to evaluating other medical-imaging modalities and other types of AI methods.
Submitted 20 July, 2021; v1 submitted 9 July, 2021;
originally announced July 2021.