-
Mining Patents with Large Language Models Elucidates the Chemical Function Landscape
Authors:
Clayton W. Kosonocky,
Claus O. Wilke,
Edward M. Marcotte,
Andrew D. Ellington
Abstract:
The fundamental goal of small molecule discovery is to generate chemicals with target functionality. While this often proceeds through structure-based methods, we set out to investigate the practicality of orthogonal methods that leverage the extensive corpus of chemical literature. We hypothesize that a sufficiently large text-derived chemical function dataset would mirror the actual landscape of chemical functionality. Such a landscape would implicitly capture complex physical and biological interactions given that chemical function arises from both a molecule's structure and its interacting partners. To evaluate this hypothesis, we built a Chemical Function (CheF) dataset of patent-derived functional labels. This dataset, comprising 631K molecule-function pairs, was created using an LLM- and embedding-based method to obtain functional labels for approximately 100K molecules from their corresponding 188K unique patents. We carry out a series of analyses demonstrating that the CheF dataset contains a semantically coherent textual representation of the functional landscape congruent with chemical structural relationships, thus approximating the actual chemical function landscape. We then demonstrate that this text-based functional landscape can be leveraged to identify drugs with target functionality using a model able to predict functional profiles from structure alone. We believe that functional label-guided molecular discovery may serve as an orthogonal approach to traditional structure-based methods in the pursuit of designing novel functional molecules.
Submitted 18 December, 2023; v1 submitted 15 September, 2023;
originally announced September 2023.
-
Prompt Engineering for Transformer-based Chemical Similarity Search Identifies Structurally Distinct Functional Analogues
Authors:
Clayton W. Kosonocky,
Aaron L. Feller,
Claus O. Wilke,
Andrew D. Ellington
Abstract:
Chemical similarity searches are widely used in-silico methods for identifying new drug-like molecules. These methods have historically relied on structure-based comparisons to compute molecular similarity. Here, we use a chemical language model to create a vector-based chemical search. We extend implementations by creating a prompt engineering strategy that utilizes two different chemical string representation algorithms: one for the query and the other for the database. We explore this method by reviewing the search results from five drug-like query molecules (penicillin G, nirmatrelvir, zidovudine, lysergic acid diethylamide, and fentanyl) and three dye-like query molecules (acid blue 25, avobenzone, and 2-diphenylaminocarbazole). We find that this novel method identifies molecules that are functionally similar to the query, indicated by the associated patent literature, and that many of these molecules are structurally distinct from the query, making them unlikely to be found with traditional chemical similarity search methods. This method may aid in the discovery of novel structural classes of molecules that achieve target functionality.
Submitted 17 May, 2023;
originally announced May 2023.
-
colorspace: A Toolbox for Manipulating and Assessing Colors and Palettes
Authors:
Achim Zeileis,
Jason C. Fisher,
Kurt Hornik,
Ross Ihaka,
Claire D. McWhite,
Paul Murrell,
Reto Stauffer,
Claus O. Wilke
Abstract:
The R package colorspace provides a flexible toolbox for selecting individual colors or color palettes, manipulating these colors, and employing them in statistical graphics and data visualizations. In particular, the package provides a broad range of color palettes based on the HCL (Hue-Chroma-Luminance) color space. The three HCL dimensions have been shown to match those of the human visual system very well, thus facilitating intuitive selection of color palettes through trajectories in this space. Using the HCL color model general strategies for three types of palettes are implemented: (1) Qualitative for coding categorical information, i.e., where no particular ordering of categories is available. (2) Sequential for coding ordered/numeric information, i.e., going from high to low (or vice versa). (3) Diverging for coding ordered/numeric information around a central neutral value, i.e., where colors diverge from neutral to two extremes. To aid selection and application of these palettes the package also contains scales for use with ggplot2, shiny (and tcltk) apps for interactive exploration, visualizations of palette properties, accompanying manipulation utilities (like desaturation and lighten/darken), and emulation of color vision deficiencies.
Submitted 14 March, 2019;
originally announced March 2019.
-
Matrix-normal models for fMRI analysis
Authors:
Michael Shvartsman,
Narayanan Sundaram,
Mikio C. Aoi,
Adam Charles,
Theodore C. Wilke,
Jonathan D. Cohen
Abstract:
Multivariate analysis of fMRI data has benefited substantially from advances in machine learning. Most recently, a range of probabilistic latent variable models applied to fMRI data have been successful in a variety of tasks, including identifying similarity patterns in neural data (Representational Similarity Analysis and its empirical Bayes variant, RSA and BRSA; Intersubject Functional Connectivity, ISFC), combining multi-subject datasets (Shared Response Mapping; SRM), and mapping between brain and behavior (Joint Modeling). Although these methods share some underpinnings, they have been developed as distinct methods, with distinct algorithms and software tools. We show how the matrix-variate normal (MN) formalism can unify some of these methods into a single framework. In doing so, we gain the ability to reuse noise modeling assumptions, algorithms, and code across models. Our primary theoretical contribution shows how some of these methods can be written as instantiations of the same model, allowing us to generalize them to flexibly model structured noise covariances. Our formalism permits novel model variants and improved estimation strategies: in contrast to SRM, the number of parameters for MN-SRM does not scale with the number of voxels or subjects; in contrast to BRSA, the number of parameters for MN-RSA scales additively rather than multiplicatively in the number of voxels. We empirically demonstrate advantages of two new methods derived in the formalism: for MN-RSA, we show up to 10x improvement in runtime, up to 6x improvement in RMSE, and more conservative behavior under the null. For MN-SRM, our method grants a modest improvement to out-of-sample reconstruction while relaxing an orthonormality constraint of SRM. We also provide a software prototyping tool for MN models that can flexibly reuse noise covariance assumptions and algorithms across models.
Submitted 9 November, 2017; v1 submitted 8 November, 2017;
originally announced November 2017.
-
Dissecting the roles of local packing density and longer-range effects in protein sequence evolution
Authors:
Amir Shahmoradi,
Claus Wilke
Abstract:
What are the structural determinants of protein sequence evolution? A number of site-specific structural characteristics have been proposed, most of which are broadly related to either the density of contacts or the solvent accessibility of individual residues. Most importantly, there has been disagreement in the literature over the relative importance of solvent accessibility and local packing density for explaining site-specific sequence variability in proteins. We show here that this discussion has been confounded by the definition of local packing density. The most commonly used measures of local packing, such as the contact number and the weighted contact number, represent by definition the combined effects of local packing density and longer-range effects. As an alternative, we here propose a truly local measure of packing density around a single residue, based on the Voronoi cell volume. We show that the Voronoi cell volume, when calculated relative to the geometric center of amino-acid side chains, behaves nearly identically to the relative solvent accessibility, and both can explain, on average, approximately 34% of the site-specific variation in evolutionary rate in a data set of 209 enzymes. An additional 10% of variation can be explained by non-local effects that are captured in the weighted contact number. Consequently, evolutionary variation at a site is determined by the combined action of the immediate amino-acid neighbors of that site and of effects mediated by more distant amino acids. We conclude that instead of contrasting solvent accessibility and local packing density, future research should emphasize the relative importance of immediate contacts and longer-range effects on evolutionary variation.
Submitted 27 January, 2017; v1 submitted 29 July, 2015;
originally announced July 2015.
-
Predicting evolutionary site variability from structure in viral proteins: buriedness, packing, flexibility, and design
Authors:
Amir Shahmoradi,
Dariya K. Sydykova,
Stephanie J. Spielman,
Eleisha L. Jackson,
Eric T. Dawson,
Austin G. Meyer,
Claus O. Wilke
Abstract:
Several recent works have shown that protein structure can predict site-specific evolutionary sequence variation. In particular, sites that are buried and/or have many contacts with other sites in a structure have been shown to evolve more slowly, on average, than surface sites with few contacts. Here, we present a comprehensive study of the extent to which numerous structural properties can predict sequence variation. The quantities we considered include buriedness (as measured by relative solvent accessibility), packing density (as measured by contact number), structural flexibility (as measured by B factors, root-mean-square fluctuations, and variation in dihedral angles), and variability in designed structures. We obtained structural flexibility measures both from molecular dynamics simulations performed on 9 non-homologous viral protein structures and from variation in homologous variants of those proteins, where available. We obtained measures of variability in designed structures from flexible-backbone design in the Rosetta software. We found that most of the structural properties correlate with site variation in the majority of structures, though the correlations are generally weak (correlation coefficients of 0.1 to 0.4). Moreover, we found that buriedness and packing density were better predictors of evolutionary variation than was structural flexibility. Finally, variability in designed structures was a weaker predictor of evolutionary variability than was buriedness or packing density, but it was comparable in its predictive power to the best structural flexibility measures. We conclude that simple measures of buriedness and packing density are better predictors of evolutionary variation than are more complicated predictors obtained from dynamic simulations, ensembles of homologous structures, or computational protein design.
Submitted 21 July, 2014; v1 submitted 29 April, 2014;
originally announced April 2014.
-
Analyzing Machupo virus-receptor binding by molecular dynamics simulations
Authors:
Austin G. Meyer,
Sara L. Sawyer,
Andrew D. Ellington,
Claus O. Wilke
Abstract:
In many biological applications, we would like to be able to computationally predict mutational effects on affinity in protein-protein interactions. However, many commonly used methods to predict these effects perform poorly in important test cases. In particular, the effects of multiple mutations, non-alanine substitutions, and flexible loops are difficult to predict with available tools and protocols. We present here an existing method applied in a novel way to a new test case; we interrogate affinity differences resulting from mutations in a host-virus protein-protein interface. We use steered molecular dynamics (SMD) to computationally pull the machupo virus (MACV) spike glycoprotein (GP1) away from the human transferrin receptor (hTfR1). We then approximate affinity using the maximum applied force of separation and the area under the force-versus-distance curve. We find, even without the rigor and planning required for free energy calculations, that these quantities can provide novel biophysical insight into the GP1/hTfR1 interaction. First, with no prior knowledge of the system we can differentiate among wild type and mutant complexes. Moreover, we show that this simple SMD scheme correlates well with relative free energy differences computed via free energy perturbation. Second, although the static co-crystal structure shows two large hydrogen-bonding networks in the GP1/hTfR1 interface, our simulations indicate that one of them may not be important for tight binding. Third, one viral site known to be critical for infection may mark an important evolutionary suppressor site for infection-resistant hTfR1 mutants. Finally, our approach provides a framework to compare the effects of multiple mutations, individually and jointly, on protein-protein interactions.
Submitted 13 January, 2014; v1 submitted 27 February, 2013;
originally announced February 2013.
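The two affinity proxies named in the abstract above, the maximum applied force and the area under the force-versus-distance curve, can be sketched as follows. This is a minimal illustration only: the function name `smd_summary`, the trapezoidal integration rule, and the sample numbers are assumptions for this sketch, not the authors' analysis code.

```python
def smd_summary(distances, forces):
    """Return (max_force, area) for one pulling trajectory.

    distances, forces: equal-length sequences sampled along the pull.
    The area is estimated with the trapezoidal rule (an assumption here;
    the abstract does not state which integration scheme was used).
    """
    if len(distances) != len(forces) or len(distances) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    max_force = max(forces)
    area = 0.0
    for i in range(1, len(distances)):
        step = distances[i] - distances[i - 1]
        # Average the force at both ends of the interval, times its width.
        area += 0.5 * (forces[i] + forces[i - 1]) * step
    return max_force, area


# Hypothetical toy trajectory, units left unspecified.
peak, work = smd_summary([0.0, 1.0, 2.0], [0.0, 2.0, 0.0])
```

Comparing `peak` and `work` across wild-type and mutant trajectories would then give a relative ranking in the spirit of the abstract, not an absolute free energy.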
-
Membrane environment imposes unique selection pressures on transmembrane domains of G protein-coupled receptors
Authors:
Stephanie J. Spielman,
Claus O. Wilke
Abstract:
We have investigated the influence of the plasma membrane environment on the molecular evolution of G protein-coupled receptors (GPCRs), the largest receptor family in Metazoa. In particular, we have analyzed the site-specific rate variation across the two primary structural partitions, transmembrane (TM) and extramembrane (EM), of these membrane proteins. We find that transmembrane domains evolve more slowly than do extramembrane domains, though TM domains display increased rate heterogeneity relative to their EM counterparts. Although the majority of residues across GPCRs experience strong to weak purifying selection, many GPCRs experience positive selection at both TM and EM residues, albeit with a slight bias towards the EM. Further, a subset of GPCRs, chemosensory receptors (including olfactory and taste receptors), exhibit increased rates of evolution relative to other GPCRs, an effect which is more pronounced in their TM spans. Although it has been previously suggested that the TM's low evolutionary rate is caused by their high percentage of buried residues, we show that their attenuated rate seems to stem from the strong biophysical constraints of the membrane itself, or by functional requirements. In spite of the strong evolutionary constraints acting on the transmembrane spans of GPCRs, positive selection and high levels of evolutionary rate variability are common. Thus, biophysical constraints should not be presumed to preclude a protein's ability to evolve.
Submitted 27 December, 2012; v1 submitted 25 November, 2012;
originally announced November 2012.
-
Maximum allowed solvent accessibilites of residues in proteins
Authors:
Matthew Z. Tien,
Austin G. Meyer,
Dariya K. Sydykova,
Stephanie J. Spielman,
Claus O. Wilke
Abstract:
The relative solvent accessibility (RSA) of a residue in a protein measures the extent of burial or exposure of that residue in the 3D structure. RSA is frequently used to describe a protein's biophysical or evolutionary properties. To calculate RSA, a residue's solvent accessibility (ASA) needs to be normalized by a suitable reference value for the given amino acid; several normalization scales have previously been proposed. However, these scales do not provide tight upper bounds on ASA values frequently observed in empirical crystal structures. Instead, they underestimate the largest allowed ASA values, by up to 20%. As a result, many empirical crystal structures contain residues that seem to have RSA values in excess of one. Here, we derive a new normalization scale that does provide a tight upper bound on observed ASA values. We pursue two complementary strategies, one based on extensive analysis of empirical structures and one based on systematic enumeration of biophysically allowed tripeptides. Both approaches yield congruent results that consistently exceed published values. We conclude that previously published ASA normalization values were too small, primarily because the conformations that maximize ASA had not been correctly identified. As an application of our results, we show that empirically derived hydrophobicity scales are sensitive to accurate RSA calculation, and we derive new hydrophobicity scales that show increased correlation with experimentally measured scales.
Submitted 25 September, 2013; v1 submitted 18 November, 2012;
originally announced November 2012.
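The normalization described in the abstract above (RSA as a residue's ASA divided by a per-amino-acid reference value) can be sketched as below. The function name and the reference values in the usage example are hypothetical placeholders, not the scale derived in the paper; a real scale would be supplied by the caller.

```python
def relative_solvent_accessibility(asa, residue, max_asa):
    """Normalize a residue's ASA by the reference maximum for its amino acid.

    max_asa: mapping from residue name to a reference ASA value.
    Values above 1.0 indicate the reference is not a tight upper bound,
    which is the problem the abstract describes for earlier scales.
    """
    if residue not in max_asa:
        raise ValueError(f"no reference value for residue {residue!r}")
    reference = max_asa[residue]
    if reference <= 0:
        raise ValueError("reference ASA must be positive")
    return asa / reference


# Placeholder numbers for illustration only, NOT published values.
toy_scale = {"ALA": 100.0, "GLY": 80.0}
rsa = relative_solvent_accessibility(50.0, "ALA", toy_scale)  # 0.5
```

With a reference scale that underestimates the true maximum, the same call can return values above one for highly exposed residues.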
-
Defecting or not defecting: how to "read" human behavior during cooperative games by EEG measurements
Authors:
F. De Vico Fallani,
V. Nicosia,
R. Sinatra,
L. Astolfi,
F. Cincotti,
D. Mattia,
C. Wilke,
A. Doud,
V. Latora,
B. He,
F. Babiloni
Abstract:
Understanding the neural mechanisms responsible for human social interactions is difficult, since the brain activities of two or more individuals have to be examined simultaneously and correlated with the observed social patterns. We introduce the concept of hyper-brain network, a connectivity pattern representing at once the information flow among the cortical regions of a single brain as well as the relations among the areas of two distinct brains. Graph analysis of hyper-brain networks constructed from the EEG scanning of 26 couples of individuals playing the Iterated Prisoner's Dilemma reveals the possibility to predict non-cooperative interactions during the decision-making phase. The hyper-brain networks of two-defector couples have significantly less inter-brain links and overall higher modularity - i.e. the tendency to form two separate subgraphs - than couples playing cooperative or tit-for-tat strategies. The decision to defect can be "read" in advance by evaluating the changes of connectivity pattern in the hyper-brain network.
Submitted 27 January, 2011;
originally announced January 2011.
-
The look-ahead effect of phenotypic mutations
Authors:
Dion J. Whitehead,
Claus O. Wilke,
David Vernazobres,
Erich Bornberg-Bauer
Abstract:
The evolution of complex molecular traits such as disulphide bridges often requires multiple mutations. The intermediate steps in such evolutionary trajectories are likely to be selectively neutral or deleterious. Therefore, large populations and long times may be required to evolve such traits. We propose that errors in transcription and translation may allow selection for the intermediate mutations if the final trait provides a large enough selective advantage. We test this hypothesis using a population based model of protein evolution. If an individual acquires one of two mutations needed for a novel trait, the second mutation can be introduced into the phenotype due to transcription and translation errors. If the novel trait is advantageous enough, the allele with only one mutation will spread through the population, even though the gene sequence does not yet code for the complete trait. The first mutation then has a higher frequency than expected without phenotypic mutations giving the second mutation a higher probability of fixation. Thus, errors allow protein sequences to "look-ahead" for a more direct path to a complex trait.
Submitted 15 October, 2007;
originally announced October 2007.
-
The traveling wave approach to asexual evolution: Muller's ratchet and speed of adaptation
Authors:
Igor M. Rouzine,
Eric Brunet,
Claus O. Wilke
Abstract:
We use traveling-wave theory to derive expressions for the rate of accumulation of deleterious mutations under Muller's ratchet and the speed of adaptation under positive selection in asexual populations. Traveling-wave theory is a semi-deterministic description of an evolving population, where the bulk of the population is modeled using deterministic equations, but the class of the highest-fitness genotypes, whose evolution over time determines loss or gain of fitness in the population, is given proper stochastic treatment. We derive improved methods to model the highest-fitness class (the stochastic edge) for both Muller's ratchet and adaptive evolution, and calculate analytic correction terms that compensate for inaccuracies which arise when treating discrete fitness classes as a continuum. We show that traveling wave theory makes excellent predictions for the rate of mutation accumulation in the case of Muller's ratchet, and makes good predictions for the speed of adaptation in a very broad parameter range. We predict the adaptation rate to grow logarithmically in the population size until the population size is extremely large.
Submitted 10 October, 2007; v1 submitted 23 July, 2007;
originally announced July 2007.
-
The stochastic edge in adaptive evolution
Authors:
Eric Brunet,
Igor M. Rouzine,
Claus O. Wilke
Abstract:
In a recent article, Desai and Fisher (2007) proposed that the speed of adaptation in an asexual population is determined by the dynamics of the stochastic edge of the population, that is, by the emergence and subsequent establishment of rare mutants that exceed the fitness of all sequences currently present in the population. Desai and Fisher perform an elaborate stochastic calculation of the mean time $τ$ until a new class of mutants has been established, and interpret $1/τ$ as the speed of adaptation. As they note, however, their calculations are valid only for moderate speeds. This limitation arises from their method to determine $τ$: Desai and Fisher back-extrapolate the value of $τ$ from the best-fit class' exponential growth at infinite time. This approach is not valid when the population adapts rapidly, because in this case the best-fit class grows non-exponentially during the relevant time interval. Here, we substantially extend Desai and Fisher's analysis of the stochastic edge. We show that we can apply Desai and Fisher's method to high speeds by either exponentially back-extrapolating from finite time or using a non-exponential back-extrapolation. Our results are compatible with predictions made using a different analytical approach (Rouzine et al. 2003, 2007), and agree well with numerical simulations.
Submitted 18 December, 2007; v1 submitted 23 July, 2007;
originally announced July 2007.
-
Thermodynamics of Neutral Protein Evolution
Authors:
Jesse D Bloom,
Alpan Raval,
Claus O Wilke
Abstract:
Naturally evolving proteins gradually accumulate mutations while continuing to fold to thermodynamically stable native structures. This process of neutral protein evolution is an important mode of genetic change, and forms the basis for the molecular clock. Here we present a mathematical theory that predicts the number of accumulated mutations, the index of dispersion, and the distribution of stabilities in an evolving protein population from knowledge of the stability effects (ddG values) for single mutations. Our theory quantitatively describes how neutral evolution leads to marginally stable proteins, and provides formulae for calculating how fluctuations in stability cause an overdispersion of the molecular clock. It also shows that the structural influences on the rate of sequence evolution that have been observed in earlier simulations can be calculated using only the single-mutation ddG values. We consider both the case when the product of the population size and mutation rate is small and the case when this product is large, and show that in the latter case proteins evolve excess mutational robustness that is manifested by extra stability and increases the rate of sequence evolution. Our basic method is to treat protein evolution as a Markov process constrained by a minimal requirement for stable folding, enabling an evolutionary description of the proteins solely in terms of the experimentally measurable ddG values. All of our theoretical predictions are confirmed by simulations with model lattice proteins. Our work provides a mathematical foundation for understanding how protein biophysics helps shape the process of evolution.
Submitted 11 January, 2007; v1 submitted 24 May, 2006;
originally announced May 2006.
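The index of dispersion mentioned in the abstract above is conventionally the variance of the number of accumulated mutations divided by its mean; values above one indicate an overdispersed molecular clock. The sketch below estimates it from replicate counts. The function name, the use of the sample variance, and the toy counts are assumptions for illustration and are not taken from the paper.

```python
from statistics import mean, variance


def index_of_dispersion(mutation_counts):
    """Estimate Var(N) / E(N) from substitution counts across replicate lineages.

    A Poisson molecular clock gives a value near 1; larger values
    indicate overdispersion. Uses the sample (n - 1) variance.
    """
    if len(mutation_counts) < 2:
        raise ValueError("need at least two replicate counts")
    average = mean(mutation_counts)
    if average == 0:
        raise ValueError("mean count is zero; index is undefined")
    return variance(mutation_counts) / average


# Hypothetical counts from three replicate lineages.
ratio = index_of_dispersion([2, 4, 6])  # variance 4, mean 4
```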
-
Population genetics of translational robustness
Authors:
Claus O. Wilke,
D. Allan Drummond
Abstract:
Recent work has shown that expression level is the main predictor of a gene's evolutionary rate, and that more highly expressed genes evolve slower. A possible explanation for this observation is selection for proteins which fold properly despite mistranslation, in short selection for translational robustness. Translational robustness leads to the somewhat paradoxical prediction that highly expressed genes are extremely tolerant to missense substitutions but nevertheless evolve very slowly. Here, we study a simple theoretical model of translational robustness that allows us to gain analytic insight into how this paradoxical behavior arises.
Submitted 19 February, 2006; v1 submitted 23 September, 2005;
originally announced September 2005.
-
Quasispecies can exist under neutral drift at finite population sizes
Authors:
Robert Forster,
Christoph Adami,
Claus O. Wilke
Abstract:
We investigate the evolutionary dynamics of a finite population of RNA sequences adapting to a neutral fitness landscape. Despite the lack of differential fitness between viable sequences, we observe typical properties of adaptive evolution, such as increase of mean fitness over time and punctuated equilibrium transitions. We discuss the implications of these results for understanding evolution at high mutation rates, and extend the relevance of the quasispecies concept to finite populations and time scales. Our results imply that the quasispecies concept and neutral drift are not complementary concepts, and that the relative importance of each is determined by a combination of population size and mutation rate.
Submitted 31 August, 2005;
originally announced September 2005.
-
A single determinant for the rate of yeast protein evolution
Authors:
D. Allan Drummond,
Alpan Raval,
Claus O. Wilke
Abstract:
A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. To overcome these difficulties, we employ an alternative method, principal component regression, which is a multivariate regression of evolutionary rate against the principal components of the predictor variables. We carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network). Strikingly, our analysis reveals a single dominant component which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single determinant among the seven predictors. The dominant component explains nearly half the variation in the rate of synonymous and protein evolution. Our results support the hypothesis that selection against the cost of translation-error-induced protein misfolding governs the rate of synonymous and protein sequence evolution in yeast.
Submitted 8 June, 2005;
originally announced June 2005.
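The principal component regression described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `principal_component_regression`, the synthetic data, and the choice to standardize the predictors are all assumptions made for illustration. Because the principal component scores are mutually orthogonal, the variance in the response explained by each component can be read off separately.

```python
import numpy as np


def principal_component_regression(X, y, n_components):
    """Regress y on the leading principal components of X (illustrative sketch)."""
    # Standardize predictors so that no single variable dominates the PCA.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    # PCA via singular value decomposition; rows of Vt are the component loadings.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    # Ordinary least squares of the centered response on the component scores.
    coefs, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    # Orthogonal scores let the explained variance split additively by component.
    r2_per_component = coefs**2 * np.sum(scores**2, axis=0) / np.sum(yc**2)
    return coefs, r2_per_component, Vt[:n_components]


# Synthetic data (an assumption, not the yeast data): seven noisy predictors
# that all reflect one hidden factor, which also drives the response.
rng = np.random.default_rng(0)
n_genes, n_predictors = 500, 7
latent = rng.normal(size=n_genes)
X = latent[:, None] + 0.5 * rng.normal(size=(n_genes, n_predictors))
y = latent + 0.5 * rng.normal(size=n_genes)

coefs, r2, loadings = principal_component_regression(X, y, n_components=7)
print(np.round(r2, 3))  # variance in y explained by each component
```

In this synthetic setting the first component carries most of the explained variance, which mirrors the kind of "single dominant component" pattern the abstract reports, although the numbers here have no bearing on the actual result.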
-
Why highly expressed proteins evolve slowly
Authors:
D. Allan Drummond,
Jesse D. Bloom,
Christoph Adami,
Claus O. Wilke,
Frances H. Arnold
Abstract:
Much recent work has explored molecular and population-genetic constraints on the rate of protein sequence evolution. The best predictor of evolutionary rate is expression level, for reasons which have remained unexplained. Here, we hypothesize that selection to reduce the burden of protein misfolding will favor protein sequences with increased robustness to translational missense errors. Pressure for translational robustness increases with expression level and constrains sequence evolution. Using several sequenced yeast genomes, global expression and protein abundance data, and sets of paralogs traceable to an ancient whole-genome duplication in yeast, we rule out several confounding effects and show that expression level explains roughly half the variation in Saccharomyces cerevisiae protein evolutionary rates. We examine causes for expression's dominant role and find that genome-wide tests favor the translational robustness explanation over existing hypotheses that invoke constraints on function or translational efficiency. Our results suggest that proteins evolve at rates largely unrelated to their functions, and can explain why highly expressed proteins evolve slowly across the tree of life.
Submitted 12 August, 2005; v1 submitted 2 June, 2005;
originally announced June 2005.
-
Tradeoff between short-term and long-term adaptation in a changing environment
Authors:
Robert Forster,
Claus O. Wilke
Abstract:
We investigate the competition dynamics of two microbial or viral strains that live in an environment that switches periodically between two states. One of the strains is adapted to the long-term environment, but pays a short-term cost, while the other is adapted to the short-term environment and pays a cost in the long term. We explore the tradeoff between these alternative strategies in extensive numerical simulations, and present a simple analytic model that can predict the outcome of these competitions as a function of the mutation rate and the time scale of the environmental changes. Our model is relevant for arboviruses, which alternate between different host species on a regular basis.
Submitted 3 September, 2005; v1 submitted 15 December, 2004;
originally announced December 2004.
-
Control of the chaotic state caused by the current-driven ion acoustic instability and dynamical behavior using delayed feedback
Authors:
Takao Fukuyama,
Christian Wilke,
Yoshinobu Kawai
Abstract:
Controlling chaos caused by the current-driven ion acoustic instability is attempted using the delayed continuous feedback method, i.e., the time-delay auto synchronization (TDAS) method introduced by Pyragas [Phys. Lett. A 170 (1992) 421]. When the control is applied to the typical chaotic state, the chaotic orbit changes to a periodic one while the instability is maintained. The chaotic state is well controlled using the TDAS method. It is found that control is achieved when the delay time is chosen near the period of the unstable periodic orbit corresponding to the fundamental mode. Furthermore, when the delayed feedback is applied to a periodic nonlinear regime and an arbitrary time delay is chosen, the periodic state is led to various motions, including chaos. As a related topic, the synchronization between two instabilities of autonomous discharge tubes in a glow discharge is studied. Two tubes are set up independently and interact with each other through a coupler consisting of a variable resistor and a capacitor. When the resistance is varied to change the coupling strength, the coupled system shows states such as chaos synchronization.
Submitted 21 October, 2004;
originally announced October 2004.
-
Thermodynamic Prediction of Protein Neutrality
Authors:
Jesse D. Bloom,
Jonathan J. Silberg,
Claus O. Wilke,
D. Allan Drummond,
Christoph Adami,
Frances H. Arnold
Abstract:
We present a simple theory that uses thermodynamic parameters to predict the probability that a protein retains the wildtype structure after one or more random amino acid substitutions. Our theory predicts that for large numbers of substitutions the probability that a protein retains its structure will decline exponentially with the number of substitutions, with the severity of this decline determined by properties of the structure. Our theory also predicts that a protein can gain extra robustness to the first few substitutions by increasing its thermodynamic stability. We validate our theory with simulations on lattice protein models and by showing that it quantitatively predicts previously published experimental measurements on subtilisin and our own measurements on variants of TEM1 beta-lactamase. Our work unifies observations about the clustering of functional proteins in sequence space, and provides a basis for interpreting the response of proteins to substitutions in protein engineering applications.
Submitted 4 December, 2004; v1 submitted 13 September, 2004;
originally announced September 2004.
-
Replication at periodically changing multiplicity of infection promotes stable coexistence of competing viral populations
Authors:
Claus O. Wilke,
Daniel D. Reissig,
Isabel S. Novella
Abstract:
RNA viruses are a widely used tool to study evolution experimentally. Many standard protocols of virus propagation and competition are done at nominally low multiplicity of infection (m.o.i.), but lead during one passage to two or more rounds of infection, of which the later ones are at high m.o.i. Here, we develop a model of the competition between wild type (wt) and a mutant under a regime of alternating m.o.i. We assume that the mutant is deleterious when it infects cells on its own, but derives a selective advantage when rare and coinfecting with wt, because it can profit from superior protein products created by the wt. We find that, under these assumptions, replication at alternating low and high m.o.i. may lead to the stable coexistence of wt and mutant for a wide range of parameter settings. The predictions of our model are consistent with earlier observations of frequency-dependent selection in VSV and HIV-1. Our results suggest that frequency-dependent selection may be common in typical evolution experiments with viruses.
Submitted 4 February, 2004;
originally announced February 2004.
-
The Speed of Adaptation in Large Asexual Populations
Authors:
Claus O. Wilke
Abstract:
In large asexual populations, beneficial mutations have to compete with each other for fixation. Here, I derive explicit analytic expressions for the rate of substitution and the mean beneficial effect of fixed mutations, under the assumptions that the population size N is large, that the mean effect of new beneficial mutations is smaller than the mean effect of new deleterious mutations, and that new beneficial mutations are exponentially distributed. As N increases, the rate of substitution approaches a constant, which is equal to the mean effect of new beneficial mutations. The mean effect of fixed mutations continues to grow logarithmically with N. The speed of adaptation, measured as the change of log fitness over time, also grows logarithmically with N for moderately large N, and it grows double-logarithmically for extremely large N. Moreover, I derive a simple formula that determines whether at given N beneficial mutations are expected to compete with each other or go to fixation independently. Finally, I verify all results with numerical simulations.
Submitted 29 April, 2004; v1 submitted 4 February, 2004;
originally announced February 2004.
-
Stability and the Evolvability of Function in a Model Protein
Authors:
Jesse D Bloom,
Claus O Wilke,
Frances H Arnold,
Christoph Adami
Abstract:
Functional proteins must fold with some minimal stability to a structure that can perform a biochemical task. Here we use a simple model to investigate the relationship between the stability requirement and the capacity of a protein to evolve the function of binding to a ligand. Although our model contains no built-in tradeoff between stability and function, proteins evolved function more efficiently when the stability requirement was relaxed. Proteins with both high stability and high function evolved more efficiently when the stability requirement was gradually increased than when there was constant selection for high stability. These results show that in our model, the evolution of function is enhanced by allowing proteins to explore sequences corresponding to marginally stable structures, and that it is easier to improve stability while maintaining high function than to improve function while maintaining high stability. Our model also demonstrates that even in the absence of a fundamental biophysical tradeoff between stability and function, the speed with which function can evolve is limited by the stability requirement imposed on the protein.
Submitted 27 January, 2004;
originally announced January 2004.
-
Phenotypic mixing and hiding may contribute to memory in viral quasispecies
Authors:
Claus O. Wilke,
Isabel S. Novella
Abstract:
Background. In a number of recent experiments with foot-and-mouth disease virus, a deleterious mutant was found to avoid extinction and remain in the population for long periods of time. This observation was called quasispecies memory. The origin of quasispecies memory is not fully understood.
Results. We propose and analyze a simple model of complementation between the wild type virus and a mutant that has an impaired ability of cell entry. The mutant will go extinct unless it is recreated from the wild type through mutations. However, under phenotypic mixing-and-hiding as a mechanism of complementation, the time to extinction in the absence of mutations increases with increasing multiplicity of infection (m.o.i.). The mutant's frequency at equilibrium under selection-mutation balance also increases with increasing m.o.i. At high m.o.i., a large fraction of mutant genomes are encapsidated with wild-type protein, which enables them to infect cells as efficiently as the wild type virions, and thus increases their fitness to the wild-type level. Moreover, even at low m.o.i. the equilibrium frequency of the mutant is higher than predicted by the standard quasispecies model, because a fraction of mutant virions generated from wild-type parents will also be encapsidated by wild-type protein.
Conclusions. Our model predicts that phenotypic hiding will strongly influence the population dynamics of viruses, particularly at high m.o.i., and will also have important effects on the mutation-selection balance at low m.o.i. The delay in mutant extinction and increase in mutant frequencies at equilibrium may, at least in part, explain memory in quasispecies populations.
Submitted 25 June, 2003;
originally announced June 2003.
-
Modeling stochastic clonal interference
Authors:
Paulo R. A. Campos,
Christoph Adami,
Claus O. Wilke
Abstract:
We study the competition between several advantageous mutants in an asexual population (clonal interference) as a function of the time between the appearance of the mutants, their selective advantages, and the rate of deleterious mutations. We find that the overall probability of fixation (the probability that at least one of the mutants becomes the ancestor of the entire population) does not depend on the time interval between the appearance of these mutants, and equals the probability that a genotype bearing all of these mutations reaches fixation. This result holds also in the presence of deleterious mutations, and for an arbitrary number of competing mutants. We also show that if mutations interfere, an increase in the mean number of fixation events is associated with a decrease in the expected fitness gain of the population.
Submitted 30 April, 2003;
originally announced May 2003.
-
Compensatory mutations cause excess of antagonistic epistasis in RNA secondary structure folding
Authors:
Claus O Wilke,
Richard E Lenski,
Christoph Adami
Abstract:
Background: The rate at which fitness declines as an organism's genome accumulates random mutations is an important variable in several evolutionary theories. At an intuitive level, it might seem natural that random mutations should tend to interact synergistically, such that the rate of mean fitness decline accelerates as the number of random mutations is increased. However, in a number of recent studies, a prevalence of antagonistic epistasis (the tendency of multiple mutations to have a mitigating rather than reinforcing effect) has been observed.
Results: We studied in silico the net amount and form of epistatic interactions in RNA secondary structure folding by measuring the fraction of neutral mutants as a function of mutational distance d. We found a clear prevalence of antagonistic epistasis in RNA secondary structure folding. By relating the fraction of neutral mutants at distance d to the average neutrality at distance d, we showed that this prevalence derives from the existence of many compensatory mutations at larger mutational distances.
Conclusions: Our findings imply that the average direction of epistasis in simple fitness landscapes is directly related to the density with which fitness peaks are distributed in these landscapes.
Submitted 18 February, 2003;
originally announced February 2003.
-
Does the Red Queen reign in the kingdom of digital organisms?
Authors:
Claus O. Wilke
Abstract:
In competition experiments between two RNA viruses of equal or almost equal fitness, often both strains gain in fitness before one eventually excludes the other. This observation has been linked to the Red Queen effect, which describes a situation in which organisms have to constantly adapt just to keep their status quo. I carried out experiments with digital organisms (self-replicating computer programs) in order to clarify how the competing strains' location in fitness space influences the Red-Queen effect. I found that gains in fitness during competition were prevalent for organisms that were taken from the base of a fitness peak, but absent or rare for organisms that were taken from the top of a peak or from a considerable distance away from the nearest peak. In the latter two cases, either neutral drift and loss of the fittest mutants or the waiting time to the first beneficial mutation were more important factors. Moreover, I found that the Red-Queen dynamic in general led to faster exclusion than the other two mechanisms.
Submitted 13 February, 2003;
originally announced February 2003.
-
Probability of fixation of an advantageous mutant in a viral quasispecies
Authors:
Claus O. Wilke
Abstract:
The probability that an advantageous mutant rises to fixation in a viral quasispecies is investigated in the framework of multi-type branching processes. Whether fixation is possible depends on the overall growth rate of the quasispecies that will form if invasion is successful, rather than on the individual fitness of the invading mutant. The exact fixation probability can only be calculated if the fitnesses of all potential members of the invading quasispecies are known. Quasispecies fixation has two important characteristics: First, a sequence with negative selection coefficient has a positive fixation probability as long as it has the potential to grow into a quasispecies with an overall growth rate that exceeds that of the established quasispecies. Second, the fixation probabilities of sequences with identical fitnesses can nevertheless vary over many orders of magnitude. Two approximations for the probability of fixation are introduced. Both approximations require only partial knowledge about the potential members of the invading quasispecies. The performance of these two approximations is compared to the exact fixation probability on a network of RNA sequences with identical secondary structure.
Submitted 27 September, 2002;
originally announced September 2002.
-
Viral evolution under the pressure of an adaptive immune system - optimal mutation rates for viral escape
Authors:
Christel Kamp,
Claus O. Wilke,
Christoph Adami,
Stefan Bornholdt
Abstract:
Based on a recent model of evolving viruses competing with an adapting immune system [1], we study the conditions under which a viral quasispecies can maximize its growth rate. The range of mutation rates that allows viruses to thrive is limited from above due to genomic information deterioration, and from below by insufficient sequence diversity, which leads to a quick eradication of the virus by the immune system. The mutation rate that optimally balances these two requirements depends to first order on the ratio of the inverse of the virus' growth rate and the time the immune system needs to develop a specific answer to an antigen. We find that a virus is most viable if it generates exactly one mutation within the time it takes for the immune system to adapt to a new viral epitope. Experimental viral mutation rates, in particular for HIV (human immunodeficiency virus), seem to suggest that many viruses have achieved their optimal mutation rate. [1] C. Kamp and S. Bornholdt, Phys. Rev. Lett. 88, 068104 (2002)
Submitted 26 September, 2002;
originally announced September 2002.
-
Finite genome size can halt Muller's ratchet
Authors:
Tor Schoenmeyr,
Claus O. Wilke
Abstract:
We study the accumulation of deleterious mutations in a haploid, asexually reproducing population, using analytical models and computer simulations. We find that Muller's ratchet can come to a halt in small populations as a consequence of a finite genome size only, in the complete absence of backward or compensatory mutations, epistasis, or recombination. The origin of this effect lies in the fact that the number of loci at which mutations can create considerable damage decreases with every turn of the ratchet, while the total number of mutations per genome and generation remains constant. Whether the ratchet will come to a halt eventually depends on the ratio of the per-locus deleterious mutation rate $u$ and the selection strength $s$. For sufficiently small $u/s$, the ratchet halts after only a few clicks. We discuss the implications of our results for bacterial and virus evolution.
Submitted 21 September, 2001;
originally announced September 2001.
-
Optimal adaptive performance and delocalization in NK fitness landscapes
Authors:
Paulo R. A. Campos,
Christoph Adami,
Claus O. Wilke
Abstract:
We investigate the evolutionary dynamics of a finite population of sequences adapting to NK fitness landscapes. We find that, unlike in the case of an infinite population, the average fitness in a finite population is maximized at a small but finite, rather than vanishing, mutation rate. The highest local maxima in the landscape are visited for even larger mutation rates, close to a transition point at which the population delocalizes (i.e., leaves the fitness peak at which it was localized) and starts traversing the sequence space. If the mutation rate is increased even further, the population undergoes a second transition and loses all sensitivity to fitness peaks. This second transition corresponds to the standard error threshold transition first described by Eigen. We discuss the implications of our results for biological evolution and for evolutionary optimization techniques.
Submitted 7 September, 2001;
originally announced September 2001.
-
Maternal effects in molecular evolution
Authors:
Claus O. Wilke
Abstract:
We introduce a model of molecular evolution in which the fitness of an individual depends both on its own and on the parent's genotype. The model can be solved by means of a nonlinear mapping onto the standard quasispecies model. The dependency on the parental genotypes cancels from the mean fitness, but not from the individual sequence concentrations. For finite populations, the position of the error threshold is very sensitive to the influence from parent genotypes. In addition to biological applications, our model is important for understanding the dynamics of self-replicating computer programs.
Submitted 27 June, 2001;
originally announced June 2001.
-
Selection for Fitness vs. Selection for Robustness in RNA Secondary Structure Folding
Authors:
Claus O. Wilke
Abstract:
We investigate the competition between two quasispecies residing on two disparate neutral networks. Under the assumption that the two neutral networks have different topologies and fitness levels, it is the mutation rate that determines which quasispecies will eventually be driven to extinction. For small mutation rates, we find that the quasispecies residing on the neutral network with the lower replication rate will disappear. For higher mutation rates, however, the faster replicating sequences may be outcompeted by the slower replicating ones if the connection density on the second neutral network is sufficiently high. Our analytical results are in excellent agreement with flow-reactor simulations of replicating RNA sequences.
Submitted 8 March, 2001;
originally announced March 2001.
-
Adaptive evolution on neutral networks
Authors:
Claus O. Wilke
Abstract:
We study the evolution of large but finite asexual populations evolving in fitness landscapes in which all mutations are either neutral or strongly deleterious. We demonstrate that despite the absence of higher fitness genotypes, adaptation takes place as regions with more advantageous distributions of neutral genotypes are discovered. Since these discoveries are typically rare events, the population dynamics can be subdivided into separate epochs, with rapid transitions between them. Within one epoch, the average fitness in the population is approximately constant. The transitions between epochs, however, are generally accompanied by a significant increase in the average fitness. We verify our theoretical considerations with two analytically tractable bitstring models.
Submitted 3 January, 2001;
originally announced January 2001.
-
Dynamic fitness landscapes: Expansions for small mutation rates
Authors:
Claus Wilke,
Christopher Ronnewinkel
Abstract:
We study the evolution of asexual microorganisms with small mutation rate in fluctuating environments, and develop techniques that allow us to expand the formal solution of the evolution equations to first order in the mutation rate. Our method can be applied to both discrete time and continuous time systems. While the behavior of continuous time systems is dominated by the average fitness landscape for small mutation rates, in discrete time systems it is instead the geometric mean fitness that determines the system's properties. In both cases, we find that in situations in which the arithmetic (resp. geometric) mean of the fitness landscape is degenerate, regions in which the fitness fluctuates around the mean value present a selective advantage over regions in which the fitness stays at the mean. This effect is caused by the vanishing genetic diffusion at low mutation rates. In the absence of strong diffusion, a population can stay close to a fluctuating peak when the peak's height is below average, and take advantage of the peak when its height is above average.
Submitted 17 October, 2000;
originally announced October 2000.
-
Interaction between directional epistasis and average mutational effects
Authors:
Claus O. Wilke,
Christoph Adami
Abstract:
We investigate the relationship between the average fitness decay due to single mutations and the strength of epistatic interactions in genetic sequences. We observe that epistatic interactions between mutations are correlated with the average fitness decay, both in RNA secondary structure prediction and in digital organisms replicating in silico. This correlation implies that during adaptation, epistasis and average mutational effect cannot be optimized independently. In experiments with RNA sequences evolving on a neutral network, the selective pressure to decrease the mutational load then leads to a reduction in the number of sequences with strong antagonistic interactions between deleterious mutations in the population.
Submitted 27 June, 2001; v1 submitted 17 July, 2000;
originally announced July 2000.
-
Dynamic Fitness Landscapes in Molecular Evolution
Authors:
Claus O. Wilke,
Christopher Ronnewinkel,
Thomas Martinetz
Abstract:
We study self-replicating molecules under externally varying conditions. Changing conditions such as temperature variations and/or alterations in the environment's resource composition lead to both non-constant replication and decay rates of the molecules. In general, therefore, molecular evolution takes place in a dynamic rather than a static fitness landscape. We incorporate dynamic replication and decay rates into the standard quasispecies theory of molecular evolution, and show that for periodic time-dependencies, a system of evolving molecules enters a limit cycle for $t\to\infty$. For fast periodic changes, we show that molecules adapt to the time-averaged fitness landscape, whereas for slow changes they track the variations in the landscape arbitrarily closely. We derive a general approximation method that allows us to calculate the attractor of time-periodic landscapes, and demonstrate using several examples that the results of the approximation and the limiting cases of very slow and very fast changes are in perfect agreement. We also discuss landscapes with arbitrary time dependencies, and show that very fast changes again lead to a system that adapts to the time-averaged landscape. Finally, we analyze the dynamics of a finite population of molecules in a dynamic landscape, and discuss its relation to the infinite population limit.
Submitted 12 May, 2000; v1 submitted 4 December, 1999;
originally announced December 1999.
-
Genetic Algorithms in Time-Dependent Environments
Authors:
Christopher Ronnewinkel,
Claus O. Wilke,
Thomas Martinetz
Abstract:
The influence of time-dependent fitnesses on the infinite population dynamics of simple genetic algorithms (without crossover) is analyzed. Based on general arguments, a schematic phase diagram is constructed that allows one to characterize the asymptotic states as a function of the mutation rate and the time scale of changes. Furthermore, the notion of regular changes is introduced, for which the population can be shown to converge towards a generalized quasispecies. Based on this, error thresholds and an optimal mutation rate are approximately calculated for a generational genetic algorithm with a moving needle-in-the-haystack landscape. The phase diagram found in this way is fully consistent with our general considerations.
Submitted 4 November, 1999;
originally announced November 1999.
-
Molecular Evolution in Time Dependent Environments
Authors:
Claus O. Wilke,
Christopher Ronnewinkel,
Thomas Martinetz
Abstract:
The quasispecies theory is studied for dynamic replication landscapes. A meaningful asymptotic quasispecies is defined for periodic time dependencies. The quasispecies' composition is constantly changing over the oscillation period. The error threshold moves towards the position of the time averaged landscape for high oscillation frequencies and follows the landscape closely for low oscillation frequencies.
Submitted 17 November, 1999; v1 submitted 14 April, 1999;
originally announced April 1999.
-
Adaptive walks on time-dependent fitness landscapes
Authors:
Claus O. Wilke,
Thomas Martinetz
Abstract:
The idea of adaptive walks on fitness landscapes as a means of studying evolutionary processes on large time scales is extended to fitness landscapes that are slowly changing over time. The influence of ruggedness and of the amount of static fitness contributions is investigated for model landscapes derived from Kauffman's $NK$ landscapes. Depending on the amount of static fitness contributions in the landscape, the evolutionary dynamics can be divided into a percolating and a non-percolating phase. In the percolating phase, the walker performs a random walk over the regions of the landscape with high fitness.
Submitted 16 March, 1999;
originally announced March 1999.
-
Lifetimes of agents under external stress
Authors:
Claus O. Wilke,
Thomas Martinetz
Abstract:
An exact formula for the distribution of lifetimes in coherent-noise models and related models is derived. For certain stress distributions, this formula can be analytically evaluated and yields simple closed expressions. For those types of stress for which a closed expression is not available, a numerical evaluation can be done in a straightforward way. All results obtained are in perfect agreement with numerical experiments. The implications for the coherent-noise models' application to macroevolution are discussed.
Submitted 9 December, 1998;
originally announced December 1998.
-
Evolution in time-dependent fitness landscapes
Authors:
Claus O. Wilke
Abstract:
Evolution in changing environments is an important but little-studied aspect of the theory of evolution. The idea of adaptive walks in fitness landscapes has triggered a vast amount of research and has led to many important insights about the progress of evolution. Nevertheless, the small step to time-dependent fitness landscapes has most of the time not been taken. In this work, some elements of a theory of adaptive walks on changing fitness landscapes are proposed, and are subsequently applied to and tested on a simple family of time-dependent fitness landscapes, the oscillating NK landscapes, also introduced here. For these landscapes, the parameter governing the evolutionary dynamics is the fraction of static fitness contributions $f_S$. For small $f_S$, local optima are virtually non-existent, and the adaptive walk constantly encounters new genotypes, whereas for large $f_S$, the evolutionary dynamics reduces to that on static fitness landscapes. Evidence is presented that the transition between the two regimes is a second-order phase transition akin to a percolation transition. For $f_S$ close to the critical point, rich dynamics can be observed. The adaptive walk gets trapped in noisy limit cycles, and transitions from one noisy limit cycle to another occur sporadically.
Submitted 13 November, 1998;
originally announced November 1998.
-
How fast do structures emerge in hypercycle-systems?
Authors:
S. Altmeyer,
C. Wilke,
T. Martinetz
Abstract:
A general framework for the simulation of reaction-diffusion systems with probabilistic cellular automata is presented. The basic reaction probabilities of the chemical model translate directly into the transition rules of the automaton, thus allowing a clear comparison between simulation results and analytic calculations. This framework is then applied to simulations of hypercycle-systems in up to three dimensions. Furthermore, a new measurement quantity is introduced and applied to the hypercycle-systems in two and three dimensions. It can be shown that this quantity can be interpreted as a measure for the macroscopic order of the hypercycle systems.
Submitted 5 June, 1998;
originally announced June 1998.
-
Hierarchical noise in large systems of independent agents
Authors:
Claus Wilke,
Thomas Martinetz
Abstract:
A generalization of the coherent-noise models [M. E. J. Newman and K. Sneppen, Phys. Rev. E 54, 6226 (1996)] is presented where the agents in the model are subjected to a multitude of stresses, generated in a hierarchy of different contexts. The hierarchy is realized as a Cayley tree. Two different ways of stress propagation in the tree are considered. In both cases, coherence arises in large subsystems of the tree. Clear similarities between the behavior of the tree model and of the coherent-noise model can be observed. For one of the two methods of stress propagation, the behavior of the tree model can be approximated very well by an ensemble of coherent-noise models, where the sizes $k$ of the systems in the ensemble scale as $k^{-2}$. The results are found to be independent of the tree's structure for a large class of reasonable choices. Additionally, it is found that power-law distributed lifetimes of agents arise even in the complete absence of correlations between the stresses the agents feel.
Submitted 6 October, 1998; v1 submitted 22 May, 1998;
originally announced May 1998.
-
Large-scale evolution and extinction in a hierarchically structured environment
Authors:
C. Wilke,
S. Altmeyer,
T. Martinetz
Abstract:
A class of models for large-scale evolution and mass extinctions is presented. These models incorporate environmental changes on all scales, from influences on a single species to global effects. This is a step towards a unified picture of mass extinctions, which enables one to study coevolutionary effects and external abiotic influences with the same means. The generic features of such models are studied in a simple version, in which all environmental changes are generated at random and without feedback from other parts of the system.
Submitted 10 March, 1998;
originally announced March 1998.
-
Aftershocks in Coherent-Noise Models
Authors:
C. Wilke,
S. Altmeyer,
T. Martinetz
Abstract:
The decay pattern of aftershocks in the so-called 'coherent-noise' models [M. E. J. Newman and K. Sneppen, Phys. Rev. E 54, 6226 (1996)] is studied in detail. Analytical and numerical results show that the probability of finding a large event at time $t$ after an initial major event decreases as $t^{-\tau}$ for small $t$, with the exponent $\tau$ ranging from 0 to values well above 1. This is in contrast to Sneppen and Newman, who stated that the exponent is about 1, independent of the microscopic details of the simulation. Numerical simulations of an extended model [C. Wilke, T. Martinetz, Phys. Rev. E 56, 7128 (1997)] show that the power-law is only a generic feature of the original dynamics and does not necessarily appear in a more general context. Finally, the implications of the results for the modeling of earthquakes are discussed.
Submitted 12 March, 1998; v1 submitted 20 October, 1997;
originally announced October 1997.
-
A Simple Model of Evolution with Variable System Size
Authors:
C. Wilke,
T. Martinetz
Abstract:
A simple model of biological extinction with variable system size is presented that exhibits a power-law distribution of extinction event sizes. The model is a generalization of a model recently introduced by Newman (Proc. R. Soc. Lond. B265, 1605 (1996)). Both analytical and numerical analysis show that the exponent of the power-law distribution depends only marginally on the growth rate $g$ at which new species enter the system and is equal to that of the original model in the limit $g\to\infty$. A critical growth rate $g_c$ can be found below which the system dies out. Under these model assumptions, stable ecosystems can only exist if the regrowth of species is sufficiently fast.
Submitted 20 October, 1997; v1 submitted 7 May, 1997;
originally announced May 1997.
-
Axion cyclotron emissivity of magnetized white dwarfs and neutron stars
Authors:
M. Kachelriess,
C. Wilke,
G. Wunner
Abstract:
The energy loss rate of a magnetized electron gas emitting axions $a$ due to the process $e^- \to e^- + a$ is derived for arbitrary magnetic field strength $B$. Requiring that for a strongly magnetized neutron star the axion luminosity is smaller than the neutrino luminosity, we obtain the bound $g_{ae} \lesssim 10^{-10}$ for the axion-electron coupling constant. This limit is considerably weaker than the bound derived earlier by Borisov and Grishina using the same method. Applying a similar argument to magnetic white dwarf stars results in the more stringent bound $g_{ae} \lesssim 9 \times 10^{-13} (T/10^7\,{\rm K})^{5/4} (B/10^{10}\,{\rm G})^{-2}$, where $T$ is the internal temperature of the white dwarf.
Submitted 8 April, 1997; v1 submitted 12 January, 1997;
originally announced January 1997.
-
Photon splitting in strong magnetic fields: asymptotic approximation formulae vs. accurate numerical results
Authors:
C. Wilke,
G. Wunner
Abstract:
We present the results of a numerical calculation of the photon splitting rate below the electron-pair creation threshold ($\omega \le 2m$) in magnetic fields $B \gtrsim B_{\rm cr} = m^2/e = 4.414 \times 10^{9}$ T. Our results confirm asymptotic approximations derived in the low-field ($B < B_{\rm cr}$) and high-field ($B \gg B_{\rm cr}$) limit, and allow interpolating between the two asymptotic regions. Our expression for the photon splitting rate is a simplified version of a formula given by Mentzel et al. We also point out that, although the analytical formula is correct, the splitting rates calculated there are wrong due to an error in the numerical calculations.
Submitted 6 August, 1996; v1 submitted 8 May, 1996;
originally announced May 1996.