-
Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges
Authors:
Mahmoud Ibrahim,
Yasmina Al Khalil,
Sina Amirrajab,
Chang Sun,
Marcel Breeuwer,
Josien Pluim,
Bart Elen,
Gokhan Ertaylan,
Michel Dumontier
Abstract:
This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). Unlike previous narrowly focused reviews, our study encompasses a broad array of medical data modalities and explores various ge…
▽ More
This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). Unlike previous narrowly focused reviews, our study encompasses a broad array of medical data modalities and explores various generative models. Our search strategy queries databases such as Scopus, PubMed, and ArXiv, focusing on recent works from January 2021 to November 2023, excluding reviews and perspectives. This period emphasizes recent advancements beyond GANs, which have been extensively covered previously.
The survey reveals insights from three key aspects: (1) Synthesis applications and purpose of synthesis, (2) generation techniques, and (3) evaluation methods. It highlights clinically valid synthesis applications, demonstrating the potential of synthetic data to tackle diverse clinical requirements. While conditional models incorporating class labels, segmentation masks and image translations are prevalent, there is a gap in utilizing prior clinical knowledge and patient-specific context, suggesting a need for more personalized synthesis approaches and emphasizing the importance of tailoring generative approaches to the unique characteristics of medical data. Additionally, there is a significant gap in using synthetic data beyond augmentation, such as for validation and evaluation of downstream medical AI models. The survey uncovers that the lack of standardized evaluation methodologies tailored to medical images is a barrier to clinical application, underscoring the need for in-depth evaluation approaches, benchmarking, and comparative studies to promote openness and collaboration.
△ Less
Submitted 2 July, 2024; v1 submitted 27 June, 2024;
originally announced July 2024.
-
Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy?
Authors:
Nader Asadi,
Mahdi Beitollahi,
Yasser Khalil,
Yinchuan Li,
Guojun Zhang,
Xi Chen
Abstract:
Parameter-efficient fine-tuning stands as the standard for efficiently fine-tuning large language and vision models on downstream tasks. Specifically, the efficiency of low-rank adaptation has facilitated the creation and sharing of hundreds of custom LoRA modules, each trained on distinct data from various downstream tasks. In this paper, we explore the composability of LoRA modules, examining if…
▽ More
Parameter-efficient fine-tuning stands as the standard for efficiently fine-tuning large language and vision models on downstream tasks. Specifically, the efficiency of low-rank adaptation has facilitated the creation and sharing of hundreds of custom LoRA modules, each trained on distinct data from various downstream tasks. In this paper, we explore the composability of LoRA modules, examining if combining these pre-trained modules enhances generalization to unseen downstream tasks. Our investigation involves evaluating two approaches: (a) uniform composition, involving averaging upstream LoRA modules with equal weights, and (b) learned composition, where we learn the weights for each upstream module and perform weighted averaging. Our experimental results on both vision and language models reveal that in few-shot settings, where only a limited number of samples are available for the downstream task, both uniform and learned composition methods result in better transfer accuracy; outperforming full fine-tuning and training a LoRA from scratch. Moreover, in full-shot settings, learned composition performs comparably to regular LoRA training with significantly fewer number of trainable parameters. Our research unveils the potential of uniform composition for enhancing transferability in low-shot settings, without introducing additional learnable parameters.
△ Less
Submitted 23 February, 2024;
originally announced February 2024.
-
DFML: Decentralized Federated Mutual Learning
Authors:
Yasser H. Khalil,
Amir H. Estiri,
Mahdi Beitollahi,
Nader Asadi,
Sobhan Hemati,
Xu Li,
Guojun Zhang,
Xi Chen
Abstract:
In the realm of real-world devices, centralized servers in Federated Learning (FL) present challenges including communication bottlenecks and susceptibility to a single point of failure. Additionally, contemporary devices inherently exhibit model and data heterogeneity. Existing work lacks a Decentralized FL (DFL) framework capable of accommodating such heterogeneity without imposing architectural…
▽ More
In the realm of real-world devices, centralized servers in Federated Learning (FL) present challenges including communication bottlenecks and susceptibility to a single point of failure. Additionally, contemporary devices inherently exhibit model and data heterogeneity. Existing work lacks a Decentralized FL (DFL) framework capable of accommodating such heterogeneity without imposing architectural restrictions or assuming the availability of public data. To address these issues, we propose a Decentralized Federated Mutual Learning (DFML) framework that is serverless, supports nonrestrictive heterogeneous models, and avoids reliance on public data. DFML effectively handles model and data heterogeneity through mutual learning, which distills knowledge between clients, and cyclically varying the amount of supervision and distillation signals. Extensive experimental results demonstrate consistent effectiveness of DFML in both convergence speed and global accuracy, outperforming prevalent baselines under various conditions. For example, with the CIFAR-100 dataset and 50 clients, DFML achieves a substantial increase of +17.20% and +19.95% in global accuracy under Independent and Identically Distributed (IID) and non-IID data shifts, respectively.
△ Less
Submitted 2 February, 2024;
originally announced February 2024.
-
Biomedical image analysis competitions: The state of current participation practice
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Patrick Godau,
Veronika Cheplygina,
Michal Kozubek,
Sharib Ali,
Anubha Gupta,
Jan Kybic,
Alison Noble,
Carlos Ortiz de Solórzano,
Samiksha Pachade,
Caroline Petitjean,
Daniel Sage,
Donglai Wei,
Elizabeth Wilden,
Deepak Alapatt,
Vincent Andrearczyk,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano
, et al. (331 additional authors not shown)
Abstract:
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis,…
▽ More
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
△ Less
Submitted 12 September, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
sim2real: Cardiac MR Image Simulation-to-Real Translation via Unsupervised GANs
Authors:
Sina Amirrajab,
Yasmina Al Khalil,
Cristian Lorenz,
Jurgen Weese,
Josien Pluim,
Marcel Breeuwer
Abstract:
There has been considerable interest in the MR physics-based simulation of a database of virtual cardiac MR images for the development of deep-learning analysis networks. However, the employment of such a database is limited or shows suboptimal performance due to the realism gap, missing textures, and the simplified appearance of simulated images. In this work we 1) provide image simulation on vir…
▽ More
There has been considerable interest in the MR physics-based simulation of a database of virtual cardiac MR images for the development of deep-learning analysis networks. However, the employment of such a database is limited or shows suboptimal performance due to the realism gap, missing textures, and the simplified appearance of simulated images. In this work we 1) provide image simulation on virtual XCAT subjects with varying anatomies, and 2) propose sim2real translation network to improve image realism. Our usability experiments suggest that sim2real data exhibits a good potential to augment training data and boost the performance of a segmentation algorithm.
△ Less
Submitted 9 August, 2022;
originally announced August 2022.
-
Blockchain based AI-enabled Industry 4.0 CPS Protection against Advanced Persistent Threat
Authors:
Ziaur Rahman,
Xun Yi Ibrahim Khalil
Abstract:
Industry 4.0 is all about doing things in a concurrent, secure, and fine-grained manner. IoT edge-sensors and their associated data play a predominant role in today's industry ecosystem. Breaching data or forging source devices after injecting advanced persistent threats (APT) damages the industry owners' money and loss of operators' lives. The existing challenges include APT injection attacks tar…
▽ More
Industry 4.0 is all about doing things in a concurrent, secure, and fine-grained manner. IoT edge-sensors and their associated data play a predominant role in today's industry ecosystem. Breaching data or forging source devices after injecting advanced persistent threats (APT) damages the industry owners' money and loss of operators' lives. The existing challenges include APT injection attacks targeting vulnerable edge devices, insecure data transportation, trust inconsistencies among stakeholders, incompliant data storing mechanisms, etc. Edge-servers often suffer because of their lightweight computation capacity to stamp out unauthorized data or instructions, which in essence, makes them exposed to attackers. When attackers target edge servers while transporting data using traditional PKI-rendered trusts, consortium blockchain (CBC) offers proven techniques to transfer and maintain those sensitive data securely. With the recent improvement of edge machine learning, edge devices can filter malicious data at their end which largely motivates us to institute a Blockchain and AI aligned APT detection system. The unique contributions of the paper include efficient APT detection at the edge and transparent recording of the detection history in an immutable blockchain ledger. In line with that, the certificateless data transfer mechanism boost trust among collaborators and ensure an economical and sustainable mechanism after eliminating existing certificate authority. Finally, the edge-compliant storage technique facilitates efficient predictive maintenance. The respective experimental outcomes reveal that the proposed technique outperforms the other competing systems and models.
△ Less
Submitted 30 January, 2022;
originally announced January 2022.
-
XCAT-GAN for Synthesizing 3D Consistent Labeled Cardiac MR Images on Anatomically Variable XCAT Phantoms
Authors:
Sina Amirrajab,
Samaneh Abbasi-Sureshjani,
Yasmina Al Khalil,
Cristian Lorenz,
Juergen Weese,
Josien Pluim,
Marcel Breeuwer
Abstract:
Generative adversarial networks (GANs) have provided promising data enrichment solutions by synthesizing high-fidelity images. However, generating large sets of labeled images with new anatomical variations remains unexplored. We propose a novel method for synthesizing cardiac magnetic resonance (CMR) images on a population of virtual subjects with a large anatomical variation, introduced using th…
▽ More
Generative adversarial networks (GANs) have provided promising data enrichment solutions by synthesizing high-fidelity images. However, generating large sets of labeled images with new anatomical variations remains unexplored. We propose a novel method for synthesizing cardiac magnetic resonance (CMR) images on a population of virtual subjects with a large anatomical variation, introduced using the 4D eXtended Cardiac and Torso (XCAT) computerized human phantom. We investigate two conditional image synthesis approaches grounded on a semantically-consistent mask-guided image generation technique: 4-class and 8-class XCAT-GANs. The 4-class technique relies on only the annotations of the heart; while the 8-class technique employs a predicted multi-tissue label map of the heart-surrounding organs and provides better guidance for our conditional image synthesis. For both techniques, we train our conditional XCAT-GAN with real images paired with corresponding labels and subsequently at the inference time, we substitute the labels with the XCAT derived ones. Therefore, the trained network accurately transfers the tissue-specific textures to the new label maps. By creating 33 virtual subjects of synthetic CMR images at the end-diastolic and end-systolic phases, we evaluate the usefulness of such data in the downstream cardiac cavity segmentation task under different augmentation strategies. Results demonstrate that even with only 20% of real images (40 volumes) seen during training, segmentation performance is retained with the addition of synthetic CMR images. Moreover, the improvement in utilizing synthetic images for augmenting the real data is evident through the reduction of Hausdorff distance up to 28% and an increase in the Dice score up to 5%, indicating a higher similarity to the ground truth in all dimensions.
△ Less
Submitted 31 July, 2020; v1 submitted 27 July, 2020;
originally announced July 2020.
-
Dim Silicon and the Case for Improved DVFS Policies
Authors:
Mathias Gottschlag,
Yussuf Khalil,
Frank Bellosa
Abstract:
Due to thermal and power supply limits, modern Intel CPUs reduce their frequency when AVX2 and AVX-512 instructions are executed. As the CPUs wait for 670μs before increasing the frequency again, the performance of some heterogeneous workloads is reduced. In this paper, we describe parallels between this situation and dynamic power management as well as between the policy implemented by these CPUs…
▽ More
Due to thermal and power supply limits, modern Intel CPUs reduce their frequency when AVX2 and AVX-512 instructions are executed. As the CPUs wait for 670μs before increasing the frequency again, the performance of some heterogeneous workloads is reduced. In this paper, we describe parallels between this situation and dynamic power management as well as between the policy implemented by these CPUs and fixed-timeout device shutdown policies. We show that the policy implemented by Intel CPUs is not optimal and describe potential better policies. In particular, we present a mechanism to classify applications based on their likeliness to cause frequency reduction. Our approach takes either the resulting classification information or information provided by the application and generates hints for the DVFS policy. We show that faster frequency changes based on these hints are able to improve performance for a web server using the OpenSSL library.
△ Less
Submitted 4 May, 2020;
originally announced May 2020.