-
Report on the AAPM Grand Challenge on deep generative modeling for learning medical image statistics
Authors:
Rucha Deshpande,
Varun A. Kelkar,
Dimitrios Gotsis,
Prabhat Kc,
Rongping Zeng,
Kyle J. Myers,
Frank J. Brooks,
Mark A. Anastasio
Abstract:
The findings of the 2023 AAPM Grand Challenge on Deep Generative Modeling for Learning Medical Image Statistics are reported in this Special Report. The goal of this challenge was to promote the development of deep generative models (DGMs) for medical imaging and to emphasize the need for their domain-relevant assessment via the analysis of relevant image statistics. As part of this Grand Challenge, a training dataset was developed based on 3D anthropomorphic breast phantoms from the VICTRE virtual imaging toolbox. A two-stage evaluation procedure was developed: a preliminary check for memorization and image quality (based on the Fréchet Inception distance (FID)), followed by a second stage evaluating the reproducibility of image statistics corresponding to domain-relevant radiomic features. A summary measure was employed to rank the submissions. Additional analyses of submissions were performed to assess DGM performance specific to individual feature families, and to identify various artifacts. A total of 58 submissions from 12 unique users were received for this Challenge. The top-ranked submission employed a conditional latent diffusion model, whereas the joint runners-up employed a generative adversarial network, followed by another network for image superresolution. We observed that the overall ranking of the top 9 submissions according to our evaluation method (i) did not match the FID-based ranking, and (ii) differed with respect to individual feature families. Another important finding from our additional analyses was that different DGMs demonstrated similar kinds of artifacts. This Grand Challenge highlighted the need for domain-specific evaluation to further DGM design as well as deployment. It also demonstrated that the specification of a DGM may differ depending on its intended use.
Submitted 2 May, 2024;
originally announced May 2024.
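The Fréchet Inception distance named in the abstract above is the Fréchet distance between two Gaussians fitted to feature vectors of real and generated images. The following is a minimal sketch of that distance only, not the challenge's evaluation code: the function name `frechet_distance` is hypothetical, and the inputs are assumed to be arrays of already extracted features (in FID these come from an Inception network, which is not included here).

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Squared Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    Assumption: features are precomputed; no Inception network is used here.
    """
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # covariance matrices (tiny numerical deviations are clipped).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 4))           # stand-in for real-image features
    fake = rng.normal(loc=0.3, size=(500, 4))  # stand-in for generated-image features
    print(frechet_distance(real, fake))
```

A lower value means the two fitted Gaussians are closer. As the abstract notes, a ranking by this kind of score did not match the ranking obtained from domain-relevant radiomic features.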
-
Investigating the robustness of a learning-based method for quantitative phase retrieval from propagation-based x-ray phase contrast measurements under laboratory conditions
Authors:
Rucha Deshpande,
Ashish Avachat,
Frank J. Brooks,
Mark A. Anastasio
Abstract:
Quantitative phase retrieval (QPR) in propagation-based x-ray phase contrast imaging of heterogeneous and structurally complicated objects is challenging under laboratory conditions due to partial spatial coherence and polychromaticity. A learning-based method (LBM) provides a non-linear approach to this problem while not being constrained by restrictive assumptions about object properties and beam coherence. In this work, an LBM was assessed for its applicability under practical scenarios by evaluating its robustness and generalizability under typical experimental variations. Towards this end, an end-to-end LBM was employed for QPR under laboratory conditions and its robustness was investigated across various system and object conditions. The robustness of the method was tested by varying the propagation distance, and its generalizability with respect to object structure and experimental data was also tested. Although the LBM was stable under the studied variations, its successful deployment was found to be affected by choices pertaining to data pre-processing, network training considerations and system modeling. To our knowledge, this work demonstrated for the first time the potential applicability of an end-to-end learning-based quantitative phase retrieval method, trained on simulated data, to experimental propagation-based x-ray phase contrast measurements acquired under laboratory conditions. We considered conditions of polychromaticity, partial spatial coherence, and high noise levels, typical of laboratory conditions. This work further explored the robustness of this method to practical variations in propagation distances and object structure with the goal of assessing its potential for experimental use. Such an exploration of any LBM (irrespective of its network architecture) before practical deployment provides an understanding of its potential behavior under experimental settings.
Submitted 2 November, 2022;
originally announced November 2022.
-
Assessing the ability of generative adversarial networks to learn canonical medical image statistics
Authors:
Varun A. Kelkar,
Dimitrios S. Gotsis,
Frank J. Brooks,
Prabhat KC,
Kyle J. Myers,
Rongping Zeng,
Mark A. Anastasio
Abstract:
In recent years, generative adversarial networks (GANs) have gained tremendous popularity for potential applications in medical imaging, such as medical image synthesis, restoration, reconstruction, translation, as well as objective image quality assessment. Despite the impressive progress in generating high-resolution, perceptually realistic images, it is not clear if modern GANs reliably learn the statistics that are meaningful to a downstream medical imaging application. In this work, the ability of a state-of-the-art GAN to learn the statistics of canonical stochastic image models (SIMs) that are relevant to objective assessment of image quality is investigated. It is shown that although the employed GAN successfully learned several basic first- and second-order statistics of the specific medical SIMs under consideration and generated images with high perceptual quality, it failed to correctly learn several per-image statistics pertinent to these SIMs, highlighting the urgent need to assess medical image GANs in terms of objective measures of image quality.
Submitted 26 April, 2022; v1 submitted 25 April, 2022;
originally announced April 2022.
-
Evaluating Procedures for Establishing Generative Adversarial Network-based Stochastic Image Models in Medical Imaging
Authors:
Varun A. Kelkar,
Dimitrios S. Gotsis,
Frank J. Brooks,
Kyle J. Myers,
Prabhat KC,
Rongping Zeng,
Mark A. Anastasio
Abstract:
Modern generative models, such as generative adversarial networks (GANs), hold tremendous promise for several areas of medical imaging, such as unconditional medical image synthesis, image restoration, reconstruction and translation, and optimization of imaging systems. However, procedures for establishing stochastic image models (SIMs) using GANs remain generic and do not address specific issues relevant to medical imaging. In this work, canonical SIMs that simulate realistic vessels in angiography images are employed to evaluate procedures for establishing SIMs using GANs. The GAN-based SIM is compared to the canonical SIM based on its ability to reproduce those statistics that are meaningful to the particular medically realistic SIM considered. It is shown that evaluating GANs using classical metrics and medically relevant metrics may lead to different conclusions about the fidelity of the trained GANs. This work highlights the need for the development of objective metrics for evaluating GANs.
Submitted 7 April, 2022;
originally announced April 2022.