-
Recombination enables higher numbers of recessive genes, contributing to the emergence of sexual mating in complex organisms
Authors:
Luis A. La Rocca,
Konrad Gerischer,
Anton Bovier,
Peter M. Krawitz
Abstract:
The drift-barrier hypothesis states that random genetic drift constrains the refinement of a phenotype under natural selection. The influence of effective population size and the genome-wide deleterious mutation rate has been studied theoretically, and an inverse relationship between mutation rate and genome size has been observed for many species. However, the effect of the recessive gene count, an important feature of the genomic architecture, is unknown. In a Wright-Fisher model, we studied the mutation burden for a growing number $N$ of completely recessive and lethal disease genes. Diploid individuals are represented with a binary $2 \times N$ matrix denoting wild-type and mutated alleles. Analytic results for specific cases were complemented by simulations across a broad parameter regime for gene count, mutation and recombination rates. Simulations revealed transitions to higher mutation burden and prevalence within a few generations that were linked to the extinction of the wild-type haplotype (least-loaded class). This metastability, that is, phases of quasi-equilibrium with intermittent transitions, persists over $100\,000$ generations. The drift-barrier hypothesis is confirmed by a high mutation burden resulting in population collapse. Simulations showed the emergence of mutually exclusive haplotypes for a mutation rate above 0.02 lethal equivalents per generation for a genomic architecture and population size representing complex multicellular organisms such as humans. In such systems, recombination proves pivotal, preventing population collapse and maintaining a mutation burden below 10. This study advances our understanding of gene pool stability, and particularly the role of the number of recessive disorders. Insights into Muller's ratchet dynamics are provided, and the essential role of recombination in curbing mutation burden and stabilizing the gene pool is demonstrated.
Submitted 14 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
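The model described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the constant population size, the per-gene mutation probability, the choice between free recombination and none, and the resampling of non-viable offspring are all assumptions made here for illustration, as are the function names. Each individual is a pair of 0/1 lists, matching the binary $2 \times N$ matrix of the abstract.

```python
import random


def gamete(individual, recombine, rng):
    """Build one haplotype from a diploid parent (a pair of 0/1 lists)."""
    if recombine:
        # Assumed free recombination: each gene is drawn independently
        # from one of the two parental copies.
        n_genes = len(individual[0])
        return [individual[rng.randrange(2)][g] for g in range(n_genes)]
    # No recombination: one parental haplotype is passed on intact.
    return list(individual[rng.randrange(2)])


def mutate(haplotype, per_gene_rate, rng):
    """Turn wild-type alleles (0) into mutated alleles (1); no back mutation."""
    return [1 if a == 1 or rng.random() < per_gene_rate else 0 for a in haplotype]


def is_viable(individual):
    """Completely recessive lethals: fatal only if both copies of a gene are mutated."""
    return not any(a and b for a, b in zip(individual[0], individual[1]))


def next_generation(population, per_gene_rate, recombine, rng):
    """One Wright-Fisher step at constant size (assumed); offspring are
    resampled until enough viable ones exist."""
    offspring = []
    while len(offspring) < len(population):
        mother, father = rng.choice(population), rng.choice(population)
        child = (
            mutate(gamete(mother, recombine, rng), per_gene_rate, rng),
            mutate(gamete(father, recombine, rng), per_gene_rate, rng),
        )
        if is_viable(child):
            offspring.append(child)
    return offspring


def mutation_burden(population):
    """Average number of mutated alleles carried per individual."""
    return sum(sum(ind[0]) + sum(ind[1]) for ind in population) / len(population)


if __name__ == "__main__":
    rng = random.Random(1)
    pop = [([0] * 50, [0] * 50) for _ in range(100)]  # hypothetical sizes
    for _ in range(200):
        pop = next_generation(pop, 0.001, True, rng)
    print(mutation_burden(pop))
```

Because non-viable offspring are simply redrawn, this sketch keeps the population size fixed and therefore cannot show the population collapse that the abstract reports; it only illustrates how the burden is tracked under the two recombination settings.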
-
GestaltMML: Enhancing Rare Genetic Disease Diagnosis through Multimodal Machine Learning Combining Facial Images and Clinical Texts
Authors:
Da Wu,
Jingye Yang,
Cong Liu,
Tzung-Chien Hsieh,
Elaine Marchi,
Justin Blair,
Peter Krawitz,
Chunhua Weng,
Wendy Chung,
Gholson J. Lyon,
Ian D. Krantz,
Jennifer M. Kalish,
Kai Wang
Abstract:
Individuals with suspected rare genetic disorders often undergo multiple clinical evaluations, imaging studies, laboratory tests and genetic tests, to find a possible answer over a prolonged period of time. Addressing this "diagnostic odyssey" thus has substantial clinical, psychosocial, and economic benefits. Many rare genetic diseases have distinctive facial features, which can be used by artificial intelligence algorithms to facilitate clinical diagnosis, in prioritizing candidate diseases to be further examined by lab tests or genetic assays, or in helping the phenotype-driven reinterpretation of genome/exome sequencing data. Existing methods using frontal facial photos were built on conventional Convolutional Neural Networks (CNNs), rely exclusively on facial images, and cannot capture non-facial phenotypic traits and demographic information essential for guiding accurate diagnoses. Here we introduce GestaltMML, a multimodal machine learning (MML) approach solely based on the Transformer architecture. It integrates facial images, demographic information (age, sex, ethnicity), and clinical notes (optionally, a list of Human Phenotype Ontology terms) to improve prediction accuracy. Furthermore, we also evaluated GestaltMML on a diverse range of datasets, including 528 diseases from the GestaltMatcher Database, several in-house datasets of Beckwith-Wiedemann syndrome (BWS, overgrowth syndrome with distinct facial features), Sotos syndrome (overgrowth syndrome with overlapping features with BWS), NAA10-related neurodevelopmental syndrome, Cornelia de Lange syndrome (multiple malformation syndrome), and KBG syndrome (multiple malformation syndrome). Our results suggest that GestaltMML effectively incorporates multiple modalities of data, greatly narrowing candidate genetic diagnoses of rare diseases and may facilitate the reinterpretation of genome/exome sequencing data.
Submitted 21 April, 2024; v1 submitted 23 December, 2023;
originally announced December 2023.
-
GANonymization: A GAN-based Face Anonymization Framework for Preserving Emotional Expressions
Authors:
Fabio Hellmann,
Silvan Mertes,
Mohamed Benouis,
Alexander Hustinx,
Tzung-Chien Hsieh,
Cristina Conati,
Peter Krawitz,
Elisabeth André
Abstract:
In recent years, the increasing availability of personal data has raised concerns regarding privacy and security. One of the critical processes to address these concerns is data anonymization, which aims to protect individual privacy and prevent the release of sensitive information. This research focuses on the importance of face anonymization. Therefore, we introduce GANonymization, a novel face anonymization framework with facial expression-preserving abilities. Our approach is based on a high-level representation of a face, which is synthesized into an anonymized version based on a generative adversarial network (GAN). The effectiveness of the approach was assessed by evaluating its performance in removing identifiable facial attributes to increase the anonymity of the given individual face. Additionally, the performance of preserving facial expressions was evaluated on several affect recognition datasets and outperformed the state-of-the-art methods in most categories. Finally, our approach was analyzed for its ability to remove various facial traits, such as jewelry, hair color, and multiple others. Here, it demonstrated reliable performance in removing these attributes. Our results suggest that GANonymization is a promising approach for anonymizing faces while preserving facial expressions.
Submitted 14 November, 2023; v1 submitted 3 May, 2023;
originally announced May 2023.
-
Improving Deep Facial Phenotyping for Ultra-rare Disorder Verification Using Model Ensembles
Authors:
Alexander Hustinx,
Fabio Hellmann,
Ömer Sümer,
Behnam Javanmardi,
Elisabeth André,
Peter Krawitz,
Tzung-Chien Hsieh
Abstract:
Rare genetic disorders affect more than 6% of the global population. Reaching a diagnosis is challenging because rare disorders are very diverse. Many disorders have recognizable facial features that are hints for clinicians to diagnose patients. Previous work, such as GestaltMatcher, utilized representation vectors produced by a DCNN similar to AlexNet to match patients in high-dimensional feature space to support "unseen" ultra-rare disorders. However, the architecture and dataset used for transfer learning in GestaltMatcher have become outdated. Moreover, a way to train the model for generating better representation vectors for unseen ultra-rare disorders has not yet been studied. Because of the overall scarcity of patients with ultra-rare disorders, it is infeasible to directly train a model on them. Therefore, we first analyzed the influence of replacing GestaltMatcher DCNN with a state-of-the-art face recognition approach, iResNet with ArcFace. Additionally, we experimented with different face recognition datasets for transfer learning. Furthermore, we proposed test-time augmentation, and model ensembles that mix general face verification models and models specific for verifying disorders to improve the disorder verification accuracy of unseen ultra-rare disorders. Our proposed ensemble model achieves state-of-the-art performance on both seen and unseen disorders.
Submitted 12 November, 2022;
originally announced November 2022.
-
Few-Shot Meta Learning for Recognizing Facial Phenotypes of Genetic Disorders
Authors:
Ömer Sümer,
Fabio Hellmann,
Alexander Hustinx,
Tzung-Chien Hsieh,
Elisabeth André,
Peter Krawitz
Abstract:
Computer vision-based methods have valuable use cases in precision medicine, and recognizing facial phenotypes of genetic disorders is one of them. Many genetic disorders are known to affect faces' visual appearance and geometry. Automated classification and similarity retrieval aid physicians in decision-making to diagnose possible genetic conditions as early as possible. Previous work has addressed the problem as a classification problem and used deep learning methods. The challenging issue in practice is the sparse label distribution and huge class imbalances across categories. Furthermore, most disorders have few labeled samples in training sets, making representation learning and generalization essential to acquiring a reliable feature descriptor. In this study, we used a facial recognition model trained on a large corpus of healthy individuals as a pre-task and transferred it to facial phenotype recognition. Furthermore, we created simple baselines of few-shot meta-learning methods to improve our base feature descriptor. Our quantitative results on GestaltMatcher Database show that our CNN baseline surpasses previous works, including GestaltMatcher, and few-shot meta-learning strategies improve retrieval performance in frequent and rare classes.
Submitted 24 May, 2023; v1 submitted 23 October, 2022;
originally announced October 2022.
-
A lower prevalence for recessive disorders in a random mating population is a transient phenomenon during and after a growth phase
Authors:
Luis A. La Rocca,
Julia Frank,
Heidi Beate Bentzen,
Jean-Tori Pantel,
Konrad Gerischer,
Anton Bovier,
Peter M. Krawitz
Abstract:
Despite increasing data from population-wide sequencing studies, the risk for recessive disorders in consanguineous partnerships is still heavily debated. An important aspect that has not been sufficiently investigated theoretically is the influence of inbreeding on mutation load and incidence rates when the population size changes. We therefore developed a model to study these dynamics for a wide range of growth and mating conditions. In the phase of population expansion and shortly afterwards, our simulations show that there is a drop in the number of diseased individuals at the expense of an increasing mutation load for random mating, while both parameters remain almost constant in highly consanguineous partnerships. This explains the empirical observation in present times that a high degree of consanguinity is associated with an increased risk of autosomal recessive disorders. However, it also states that the higher frequency of severe recessive disorders with developmental delay in inbred populations is a transient phenomenon before a mutation-selection balance is reached again.
Submitted 22 December, 2020; v1 submitted 9 December, 2020;
originally announced December 2020.
-
DeepGestalt - Identifying Rare Genetic Syndromes Using Deep Learning
Authors:
Yaron Gurovich,
Yair Hanani,
Omri Bar,
Nicole Fleischer,
Dekel Gelbman,
Lina Basel-Salmon,
Peter Krawitz,
Susanne B Kamphausen,
Martin Zenker,
Lynne M. Bird,
Karen W. Gripp
Abstract:
Facial analysis technologies have recently measured up to the capabilities of expert clinicians in syndrome identification. To date, these technologies could only identify phenotypes of a few diseases, limiting their role in clinical settings where hundreds of diagnoses must be considered.
We developed a facial analysis framework, DeepGestalt, using computer vision and deep learning algorithms, that quantifies similarities to hundreds of genetic syndromes based on unconstrained 2D images. DeepGestalt is currently trained with over 26,000 patient cases from a rapidly growing phenotype-genotype database, consisting of tens of thousands of validated clinical cases, curated through a community-driven platform. DeepGestalt currently achieves 91% top-10-accuracy in identifying over 215 different genetic syndromes and has outperformed clinical experts in three separate experiments.
We suggest that this form of artificial intelligence is ready to support medical genetics in clinical and laboratory practices and will play a key role in the future of precision medicine.
Submitted 23 January, 2018;
originally announced January 2018.
-
Entropy of complex relevant components of Boolean networks
Authors:
P. Krawitz,
I. Shmulevich
Abstract:
Boolean network models of strongly connected modules are capable of capturing the high regulatory complexity of many biological gene regulatory circuits. We study numerically the previously introduced basin entropy, a parameter for the dynamical uncertainty or information storage capacity of a network, as well as the average transient time in random relevant components as a function of their connectivity. We also demonstrate that basin entropy can be estimated from time-series data and is therefore also applicable to non-deterministic network models.
Submitted 10 August, 2007;
originally announced August 2007.
-
Basin Entropy in Boolean Network Ensembles
Authors:
Peter Krawitz,
Ilya Shmulevich
Abstract:
The information processing capacity of a complex dynamical system is reflected in the partitioning of its state space into disjoint basins of attraction, with state trajectories in each basin flowing towards their corresponding attractor. We introduce a novel network parameter, the basin entropy, as a measure of the complexity of information that such a system is capable of storing. By studying ensembles of random Boolean networks, we find that the basin entropy scales with system size only in critical regimes, suggesting that the informationally optimal partition of the state space is achieved when the system is operating at the critical boundary between the ordered and disordered phases.
Submitted 5 February, 2007;
originally announced February 2007.
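As a rough illustration of the quantity this abstract introduces, the sketch below enumerates the full state space of a small synchronous Boolean network, assigns every state to the attractor it flows into, and computes the entropy of the resulting basin weights. It is a minimal sketch, not the authors' code: the random-network construction, the use of base-2 logarithms, and all function names are assumptions made here, and exhaustive enumeration is only feasible for small networks.

```python
import math
import random
from itertools import product


def random_boolean_network(n, k, rng):
    """Give each of n nodes k random inputs and a random truth table."""
    inputs = [tuple(rng.randrange(n) for _ in range(k)) for _ in range(n)]
    tables = [tuple(rng.randrange(2) for _ in range(2 ** k)) for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every node from the current state (a tuple of 0/1)."""
    nxt = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for src in node_inputs:
            index = (index << 1) | state[src]
        nxt.append(table[index])
    return tuple(nxt)


def basin_sizes(n, update):
    """Map each attractor (labelled by its smallest state) to its basin size."""
    attractor_of = {}
    for start in product((0, 1), repeat=n):
        path, seen_at, s = [], {}, start
        # Follow the trajectory until it hits a labelled state or closes a cycle.
        while s not in attractor_of and s not in seen_at:
            seen_at[s] = len(path)
            path.append(s)
            s = update(s)
        if s in attractor_of:
            label = attractor_of[s]
        else:
            label = min(path[seen_at[s]:])  # states on the newly found cycle
        for p in path:
            attractor_of[p] = label
    sizes = {}
    for label in attractor_of.values():
        sizes[label] = sizes.get(label, 0) + 1
    return sizes


def basin_entropy(sizes):
    """Shannon entropy (in bits, an assumed unit) of the basin weights."""
    total = sum(sizes.values())
    return -sum((c / total) * math.log2(c / total) for c in sizes.values())


if __name__ == "__main__":
    rng = random.Random(0)
    inputs, tables = random_boolean_network(10, 2, rng)  # hypothetical sizes
    sizes = basin_sizes(10, lambda s: step(s, inputs, tables))
    print(len(sizes), basin_entropy(sizes))
```

A network with a single basin has entropy zero, while a network whose states are all fixed points reaches the maximum of one bit per node under this convention.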