-
Predicting the binding of small molecules to proteins through invariant representation of the molecular structure
Authors:
R. Beccaria,
A. Lazzeri,
G. Tiana
Abstract:
We present a computational scheme for predicting the ligands that bind to a pocket of known structure. It is based on the generation of a general abstract representation of the molecules, which is invariant to rotations, translations and permutations of atoms, and has some degree of isometry with the space of conformations. We use these representations to train a non-deep machine learning algorithm to classify the binding between pockets and molecule pairs, and show that this approach has a better generalization capability than existing methods.
Submitted 8 May, 2024;
originally announced May 2024.
-
Effective model of protein--mediated interactions in chromatin
Authors:
Francesco Borando,
Guido Tiana
Abstract:
Protein-mediated interactions are ubiquitous in the cellular environment, and particularly in the nucleus, where they are responsible for the structuring of chromatin. We show through molecular--dynamics simulations of a polymer surrounded by binders that the strength of the binder-polymer interaction separates an equilibrium from a non-equilibrium regime. In the equilibrium regime, the system can be efficiently described by an effective model in which the binders are traced out. Even in this case, the polymer displays features that are different from those of a standard homopolymer interacting with two-body interactions. We then extend the effective model to deal with the case where binders cannot be regarded as in equilibrium and a new phenomenology appears, including local blobs in the polymer. Providing an effective description of the system can be useful in clarifying the fundamental mechanisms governing chromatin structuring.
Submitted 20 March, 2024;
originally announced March 2024.
-
Structure of the space of folding protein sequences defined by large language models
Authors:
A. Zambon,
R. Zecchina,
G. Tiana
Abstract:
Proteins populate a manifold in the high-dimensional sequence space whose geometrical structure guides their natural evolution. Leveraging recently-developed structure prediction tools based on transformer models, we first examine the protein sequence landscape as defined by the folding score function. This landscape shares characteristics with optimization challenges encountered in machine learning and constraint satisfaction problems. Our analysis reveals that natural proteins predominantly reside in wide, flat minima within this energy landscape. To investigate further, we employ statistical mechanics algorithms specifically designed to explore regions with high local entropy in relatively flat landscapes. Our findings indicate that these specialized algorithms can identify valleys with higher entropy compared to those found using traditional methods such as Monte Carlo Markov Chains. In a proof-of-concept case, we find that these highly entropic minima exhibit significant similarities to natural sequences, especially in critical key sites and local entropy. Additionally, evaluations through Molecular Dynamics suggest that the stability of these sequences closely resembles that of natural proteins. Our tool combines advancements in machine learning and statistical physics, providing new insights into the exploration of sequence landscapes where wide, flat minima coexist alongside a majority of narrower minima.
Submitted 10 November, 2023;
originally announced November 2023.
-
Locality of contacts determines the subdiffusion exponents in polymeric models of chromatin
Authors:
E. Marchi,
Y. Zhan,
G. Tiana
Abstract:
Loop extrusion by motor proteins mediates the attractive interactions in chromatin on the length scale of megabases, providing the polymer with a well-defined structure and at the same time determining its dynamics. The mean square displacement of chromatin loci varies from a Rouse-like scaling to a more constrained subdiffusion, depending on cell type, genomic region and time scale. With a simple polymeric model, we show that such Rouse-like dynamics occurs when the parameters of the model are chosen so that contacts are local along the chain, while in the presence of non-local contacts, we observe subdiffusion at short time scales with exponents smaller than 0.5. Such exponents are independent of the detailed choice of the parameters and form a master curve that depends only on the mean locality of the resulting contacts. We compare the loop-extrusion model with a polymeric model with static links, showing that in this case, too, only the presence of non-local contacts can produce low-exponent subdiffusion. We interpret these results in terms of a simple analytical model.
Submitted 14 June, 2023;
originally announced June 2023.
-
Key interaction patterns in proteins revealed by cluster expansion of the partition function
Authors:
M. Tajana,
A. Trovato,
G. Tiana
Abstract:
The native conformation of structured proteins is stabilized by a complex network of interactions. We analyzed the elementary patterns that constitute such a network and ranked them according to their importance in shaping protein sequence design. To achieve this goal, we employed a cluster expansion of the partition function in the space of sequences and evaluated numerically the statistical importance of each cluster. An important feature of this procedure is that it is applied to a dense, finite system. We found that patterns that contribute most to the partition function are cycles with even numbers of nodes, while cliques are typically detrimental. Each cluster also contributes to the sequence entropy, which is a measure of the evolutionary designability of a fold. We compared the entropies associated with different interaction patterns to their abundances in the native structures of real proteins.
Submitted 8 April, 2022;
originally announced April 2022.
-
Native state of natural proteins optimises local entropy
Authors:
Matteo Negri,
Guido Tiana,
Riccardo Zecchina
Abstract:
The differing ability of polypeptide conformations to act as the native state of proteins has long been rationalized in terms of differing kinetic accessibility or thermodynamic stability. Building on the successful applications of physical concepts and sampling algorithms recently introduced in the study of disordered systems, in particular artificial neural networks, we quantitatively explore how well a quantity known as the local entropy describes the native state of model proteins. In lattice models and all-atom representations of proteins, we are able to efficiently sample high local entropy states and to provide a proof of concept of enhanced stability and folding rate. Our methods are based on simple and general statistical--mechanics arguments, and thus we expect that they are of very general use.
Submitted 25 November, 2021;
originally announced November 2021.
-
Effective Model of Loop Extrusion Predicts Chromosomal Domains
Authors:
M. Crippa,
Y. Zhan,
G. Tiana
Abstract:
An active loop-extrusion mechanism is regarded as the main out--of--equilibrium mechanism responsible for the structuring of megabase-sized domains in chromosomes. We developed a model to study the dynamics of the chromosome fibre by solving the kinetic equations associated with the motion of the extruder. By averaging out the position of the extruder along the chain, we build an effective equilibrium model capable of reproducing experimental contact maps based solely on the positions of extrusion--blocking proteins. We assessed the quality of the effective model using numerical simulations of chromosomal segments and comparing the results with explicit-extruder models and experimental data.
Submitted 4 September, 2020; v1 submitted 14 May, 2020;
originally announced May 2020.
-
Molecular recognition between cadherins studied by a coarse-grained model interacting with a coevolutionary potential
Authors:
S. Terzoli,
G. Tiana
Abstract:
Studying the conformations involved in the dimerization of cadherins is highly relevant for understanding the development of tissue and its failure, which is associated with tumors and metastases. Experimental techniques, like X-ray crystallography, can usually report only the most stable conformations, missing minority states that could nonetheless be important for the recognition mechanism. Computer simulations could be a valid complement to the experimental approach. However, standard all-atom protein models in explicit solvent are computationally too demanding to search thoroughly the conformational space of multiple chains composed of several hundreds of amino acids. To reach this goal, we resorted to a coarse-grained model in implicit solvent. The standard problem with this kind of model is to find a realistic potential to describe its interactions. We used coevolutionary information from cadherin alignments, corrected by a statistical potential, to build an interaction potential which is agnostic of the experimental conformations of the protein. Using this model, we explored the conformational space of multi-chain systems and validated the results by comparing them with experimental data. We identified dimeric conformations that are sequence-specific and that can be useful to rationalize the mechanism of recognition between cadherins.
Submitted 25 February, 2020;
originally announced February 2020.
-
Prediction of native contacts in proteins from an out--of--equilibrium coevolutionary process
Authors:
D. Oriani,
M. Cagiada,
G. Tiana
Abstract:
The analysis of coevolution of residues in homologous proteins is a powerful tool to predict their native conformation. The standard framework in which coevolutionary analysis is usually worked out is that of equilibrium Potts models, assuming that proteins have evolved for enough time to reach thermodynamic equilibrium in sequence space. Here we propose a model to describe correlations in sequences based on an explicit description of the evolutionary kinetics of proteins. We show that this procedure improves the correct prediction of native contacts with respect to equilibrium--based models.
Submitted 9 February, 2020;
originally announced February 2020.
-
Bifractal nature of chromosome contact maps
Authors:
Simone Pigolotti,
Mogens H. Jensen,
Yinxiu Zhan,
Guido Tiana
Abstract:
Modern biological techniques such as Hi-C make it possible to measure the probabilities that different chromosomal regions are close in space. These probabilities can be visualised as matrices called contact maps. In this paper, we introduce a multifractal analysis of chromosomal contact maps. Our analysis reveals that Hi-C maps are bifractal, i.e. complex geometrical objects characterized by two distinct fractal dimensions. To rationalize this observation, we introduce a model that describes chromosomes as a hierarchical set of nested domains and we solve it exactly. The predicted multifractal spectrum is in excellent quantitative agreement with experimental data. Moreover, we show that our theory yields a more robust estimation of the scaling exponent of the contact probability than existing methods. By applying this method to experimental data, we detect subtle conformational changes among chromosomes during differentiation of human stem cells.
Submitted 13 October, 2020; v1 submitted 28 June, 2019;
originally announced June 2019.
-
Statistical mechanical properties of sequence space determine the efficiency of the various algorithms to predict interaction energies and native contacts from protein coevolution
Authors:
G. Franco,
M. Cagiada,
G. Bussi,
G. Tiana
Abstract:
Studying evolutionary correlations in alignments of homologous sequences by means of an inverse Potts model has proven useful to obtain residue-residue contact energies and to predict contacts in proteins. The quality of the results depends strongly on several choices of the detailed model and on the algorithms used. We built, in a very controlled way, synthetic alignments with statistical properties similar to those of real proteins, and used them to assess the performance of different inversion algorithms and of their variants. Realistic synthetic alignments display typical features of low--temperature phases of disordered systems, a feature that affects the inversion algorithms. We showed that a Boltzmann--learning algorithm is computationally feasible and performs well in predicting the energy of native contacts. However, all algorithms suffer from false positives to a similar extent, making the quality of the prediction of native contacts with the different algorithms strongly system--dependent.
Submitted 4 February, 2019;
originally announced February 2019.
-
Assessing the accuracy of direct-coupling analysis for RNA contact prediction
Authors:
Francesca Cuturello,
Guido Tiana,
Giovanni Bussi
Abstract:
Many non-coding RNAs are known to play a role in the cell directly linked to their structure. Structure prediction based on sequence alone is however a challenging task. On the other hand, thanks to the low cost of sequencing technologies, a very large number of homologous sequences are becoming available for many RNA families. In the protein community, the idea has emerged in the last decade of exploiting the covariance of mutations within a family to predict the protein structure using the direct-coupling-analysis (DCA) method. The application of DCA to RNA systems has been limited so far. We here perform an assessment of the DCA method on 17 riboswitch families, comparing it with the commonly used mutual information analysis and with the state-of-the-art R-scape covariance method. We also compare different flavors of DCA, including mean-field, pseudo-likelihood, and a proposed stochastic procedure (Boltzmann learning) for solving exactly the DCA inverse problem. Boltzmann learning outperforms the other methods in predicting contacts observed in high resolution crystal structures.
Submitted 29 November, 2019; v1 submitted 18 December, 2018;
originally announced December 2018.
-
A method for partitioning the information contained in a protein sequence between its structure and function
Authors:
A. Possenti,
M. Vendruscolo,
C. Camilloni,
G. Tiana
Abstract:
Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility to encode for such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We developed a computational algorithm to evaluate the amount of information necessary to specify the protein structure, taking into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding for its structure (the 'information gap') is very close to what is needed to encode for its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize designed protein sequences.
Submitted 23 May, 2018;
originally announced May 2018.
-
An implementation of the maximum-caliber principle by replica-averaged time-resolved restrained simulations
Authors:
Riccardo Capelli,
Guido Tiana,
Carlo Camilloni
Abstract:
Inferential methods can be used to integrate experimental information and molecular simulations. The maximum entropy principle provides a framework for using equilibrium experimental data and it has been shown that replica-averaged simulations, restrained using a static potential, are a practical and powerful implementation of this principle. Here we show that replica-averaged simulations restrained using a time-dependent potential are equivalent to the principle of maximum caliber, the dynamic version of the principle of maximum entropy, and thus may allow the integration of time-resolved data in molecular dynamics simulations. We provide an analytical proof of the equivalence as well as a computational validation making use of simple models and synthetic data. Some limitations and possible solutions are also discussed.
Submitted 24 April, 2018; v1 submitted 19 February, 2018;
originally announced February 2018.
-
Spontaneous domain formation in disordered copolymers as a mechanism for chromosome structuring
Authors:
Matteo Negri,
Marco Gherardi,
Guido Tiana,
Marco Cosentino Lagomarsino
Abstract:
Motivated by the problem of domain formation in chromosomes, we studied a co--polymer model where only a subset of the monomers feel attractive interactions. These monomers are displaced randomly from a regularly-spaced pattern, thus introducing some quenched disorder in the system. Previous work has shown that in the case of regularly-spaced interacting monomers this chain can fold into structures characterized by multiple distinct domains of consecutive segments. In each domain, attractive interactions are balanced by the entropy cost of forming loops. We show by advanced replica-exchange simulations that adding disorder in the position of the interacting monomers further stabilizes these domains. The model suggests that the partitioning of the chain into well-defined domains of consecutive monomers is a spontaneous property of heteropolymers. In the case of chromosomes, evolution could have acted on the spacing of interacting monomers to modulate in a simple way the underlying domains for functional reasons.
Submitted 12 February, 2018;
originally announced February 2018.
-
Complete coverage of space favors modularity of the grid system in the brain
Authors:
Alessandro Sanzeni,
Vijay Balasubramanian,
Guido Tiana,
Massimo Vergassola
Abstract:
Grid cells in the entorhinal cortex fire when animals that are exploring a certain region of space occupy the vertices of a triangular grid that spans the environment. Different neurons feature triangular grids that differ in their properties of periodicity, orientation and ellipticity. Taken together, these grids allow the animal to maintain an internal, mental representation of physical space. Experiments show that grid cells are modular, i.e. there are groups of neurons which have grids with similar periodicity, orientation and ellipticity. We use statistical physics methods to derive a relation between variability of the properties of the grids within a module and the range of space that can be covered completely (i.e. without gaps) by the grid system with high probability. Larger variability shrinks the range of representation, providing a functional rationale for the experimentally observed co-modularity of grid cell periodicity, orientation and ellipticity. We obtain a scaling relation between the number of neurons and the period of a module, given the variability and coverage range. Specifically, we predict how many more neurons are required at smaller grid scales than at larger ones.
Submitted 16 October, 2016;
originally announced October 2016.
-
Properties of low-dimensional collective variables in the molecular dynamics of biopolymers
Authors:
R. Meloni,
C. Camilloni,
G. Tiana
Abstract:
The description of the dynamics of a complex, high-dimensional system in terms of a low-dimensional set of collective variables Y can be fruitful if the low-dimensional representation satisfies a Langevin equation with drift and diffusion coefficients which depend only on Y. We present a computational scheme to evaluate whether a given collective variable provides a faithful low-dimensional representation of the dynamics of a high-dimensional system. The scheme is based on the framework of a finite-difference Langevin equation, similar to that used for molecular-dynamics simulations. This allows one to calculate the drift and diffusion coefficients at any point of the full-dimensional system. The width of the distribution of drift and diffusion coefficients in an ensemble of microscopic points at the same value of Y indicates to what extent the dynamics of Y is described by a simple Langevin equation. Using a simple protein model we show that collective variables often used to describe biopolymers display a non-negligible width both in the drift and in the diffusion coefficients. We also show that the associated effective force is compatible with the equilibrium free--energy calculated from a microscopic sampling, but results in markedly different dynamical properties.
Submitted 28 November, 2016; v1 submitted 1 September, 2016;
originally announced September 2016.
-
The looping probability of random heteropolymers helps to understand the scaling properties of biopolymers
Authors:
Y. Zhan,
L. Giorgetti,
G. Tiana
Abstract:
Random heteropolymers are a minimal description of biopolymers and can provide a theoretical framework to investigate the formation of loops in biophysical experiments. A two--state model provides a consistent and robust way to study the scaling properties of loop formation in polymers of the size of typical biological systems. Combining it with self--adjusting simulated--tempering simulations, we can calculate numerically the looping properties of several realizations of the random interactions within the chain. Unlike homopolymers, random heteropolymers display at different temperatures a continuous set of scaling exponents. The necessity of using self--averaging quantities makes finite--size effects dominant at low temperatures even for long polymers, shadowing the length--independent character of looping probability expected in analogy with homopolymeric globules. This could provide a simple explanation for the small scaling exponents found in experiments, for example in chromosome folding.
Submitted 9 August, 2016;
originally announced August 2016.
-
The effect of disorder in the contact probability of elongated conformations of biopolymers
Authors:
Guido Tiana
Abstract:
Biopolymers are characterized by heterogeneous interactions, and usually perform their biological tasks forming contacts within domains of limited size. Combining polymer theory with a replica approach, we study the scaling properties of the probability of contact formation in random heteropolymers as a function of their linear distance. It is found that close to or above the theta--point, it is possible to define a contact probability which is typical (i.e. "self-averaging") for different realizations of the heterogeneous interactions, and which displays an exponential cut--off, dependent on temperature and on the interaction range. In many cases this cut--off is comparable with the typical sizes of domains in biopolymers. While it is well known that disorder causes interesting effects at low temperature, the behavior elucidated in the present study is an example of a non--trivial effect at high temperature.
Submitted 24 June, 2015;
originally announced June 2015.
-
A many-body term improves the accuracy of effective potentials based on protein coevolutionary data
Authors:
A. Contini,
G. Tiana
Abstract:
The study of correlated mutations in alignments of homologous proteins proved to be successful not only in the prediction of their native conformation, but also in the development of a two-body effective potential between pairs of amino acids. In the present work we extend the effective potential, introducing a many--body term based on the same theoretical framework, making use of a principle of maximum entropy. The extended potential performs better than the two--body one in predicting the energetic effect of 308 mutations in 14 proteins (including membrane proteins). The average value of the parameters of the many-body term correlates with the degree of hydrophobicity of the corresponding residues, suggesting that this term partly reflects the effect of the solvent.
Submitted 8 June, 2015;
originally announced June 2015.
-
Iterative derivation of effective potentials to sample the conformational space of proteins at atomistic scale
Authors:
R. Capelli,
C. Paissoni,
P. Sormanni,
G. Tiana
Abstract:
The current capacity of computers makes it possible to perform simulations of small systems with portable, explicit-solvent potentials achieving a high degree of accuracy. However, simplified models must be employed to explore the behaviour of large systems or to perform systematic scans of smaller systems. While powerful algorithms are available to facilitate the sampling of the conformational space, successful applications of such models are hindered by the limited availability of potentials that are simple enough yet able to satisfactorily reproduce known properties of the system. We develop an interatomic potential to account for a number of properties of proteins in a computationally economical way. The potential is defined within an all-atom, implicit-solvent model by contact functions between the different atom types. The associated numerical values can be optimised by an iterative Monte Carlo scheme on any available experimental data, provided that they are expressible as thermal averages of some conformational properties. We test this model on three different proteins, for which we also perform a scan of all possible point mutations with explicit conformational sampling. The resulting models, optimised solely on a subset of native distances, not only reproduce the native conformations within a few Angstroms of the experimental ones, but also show the cooperative transition between native and denatured state and correctly predict the measured free-energy changes associated with point mutations. Moreover, unlike other structure-based models, our method leaves a residual degree of frustration, which is known to be present in protein molecules.
Submitted 3 February, 2014;
originally announced February 2014.
-
The network of stabilizing contacts in proteins studied by coevolutionary data
Authors:
Sara Lui,
Guido Tiana
Abstract:
The primary structure of proteins, that is their sequence, represents one of the most abundant sets of experimental data concerning biomolecules. The study of correlations in families of co-evolving proteins by means of an inverse Ising-model approach makes it possible to obtain information on their native conformation. Following up on a recent development along this line, we optimize the algorithm to calculate effective energies between the residues, validating the approach both by back-calculating interaction energies in a model system and by predicting the free energies associated with mutations in real systems. Making use of these effective energies, we study the network of interactions which stabilizes the native conformation of some well-studied proteins, showing that it displays properties different from those of the associated contact network.
Submitted 5 July, 2013;
originally announced July 2013.
-
Ratcheted molecular-dynamics simulations identify efficiently the transition state of protein folding
Authors:
Guido Tiana,
Carlo Camilloni
Abstract:
The atomistic characterization of the transition state is a fundamental step to improve the understanding of the folding mechanism and the function of proteins. From a computational point of view, the identification of the conformations that make up the transition state is particularly cumbersome, mainly because of the large computational cost of generating a statistically sound set of folding trajectories. Here we show that a biasing algorithm, based on the physics of the ratchet-and-pawl, can be used to identify the transition state efficiently. The basic idea is that the algorithmic ratchet exerts a force on the protein when it is climbing the free-energy barrier, while it is inactive when it is descending. The transition state can be identified as the point of the trajectory where the ratchet changes regime. Besides discussing this strategy in general terms, we test it within a protein model whose transition state can be studied independently by plain molecular dynamics simulations. Finally, we show its power in explicit-solvent simulations, obtaining and characterizing a set of transition-state conformations for ACBP and CI2.
Submitted 5 July, 2012;
originally announced July 2012.
-
Equilibrium properties of realistic random heteropolymers and their relevance for globular and naturally unfolded proteins
Authors:
Guido Tiana,
Ludovico Sutto
Abstract:
Random heteropolymers do not display the typical equilibrium properties of globular proteins, but are the starting point to understand the physics of proteins and, in particular, to describe their non-native states. So far, they have been studied only with mean-field models in the thermodynamic limit, or with computer simulations of very small chains on a lattice. After describing a self-adjusting parallel-tempering technique to sample efficiently the low-energy states of frustrated systems without the need to tune the system-dependent parameters of the algorithm, we apply it to random heteropolymers moving in continuous space. We show that if the mean interaction between monomers is negative, the usual description through the random energy model is nearly correct, provided that it is extended to account for non-compact conformations. If the mean interaction is positive, such a simple description breaks down and the system behaves in a way more similar to Ising spin glasses. The former case is a model for the denatured state of globular proteins, the latter for naturally unfolded proteins, whose equilibrium properties are thus qualitatively different.
Submitted 24 November, 2011;
originally announced November 2011.
-
Atomic-detailed milestones along the folding trajectory of protein G
Authors:
C. Camilloni,
G. Tiana,
R. A. Broglia
Abstract:
The high computational cost of carrying out molecular dynamics simulations of even small-size proteins is a major obstacle in the study, at atomic detail and in explicit solvent, of the physical mechanism which is at the basis of the folding of proteins. Making use of a biasing algorithm, based on the principle of the ratchet-and-pawl, we have been able to calculate eight folding trajectories (to an RMSD between 1.2 Å and 2.5 Å) of the B1 domain of protein G in explicit solvent without the need for high-performance computing. The simulations show that in the denatured state there is a complex network of cause-effect relationships among contacts, which results in a rather hierarchical folding mechanism. The network displays few local and nonlocal native contacts which are the cause of most of the others, in agreement with the NOE signals obtained in mildly denatured conditions. Nonnative contacts also play an active role in the folding kinetics. The set of conformations corresponding to the transition state displays phi-values with a correlation coefficient of 0.69 with the experimental ones. They are structurally quite homogeneous and topologically native-like, although some of the side chains and most of the hydrogen bonds are not in place.
Submitted 18 May, 2009;
originally announced May 2009.
-
Metadynamic sampling of the free energy landscapes of proteins coupled with a Monte Carlo algorithm
Authors:
F. Marini,
C. Camilloni,
D. Provasi,
R. A. Broglia,
G. Tiana
Abstract:
Metadynamics is a powerful computational tool to obtain the free energy landscape of complex systems. The Monte Carlo algorithm has proven useful to calculate thermodynamic quantities associated with simplified models of proteins, and thus to gain an ever-increasing understanding of the general principles underlying the mechanism of protein folding. We show that it is possible to couple metadynamics and Monte Carlo algorithms to obtain the free energy of model proteins in a way which is computationally very economical.
Submitted 3 October, 2007;
originally announced October 2007.
-
Exploring the Protein G Helix Free Energy Surface by Solute Tempering Metadynamics
Authors:
C. Camilloni,
D. Provasi,
G. Tiana,
R. A. Broglia
Abstract:
The free-energy landscape of the alpha-helix of protein G is studied by means of metadynamics coupled with a solute tempering algorithm. Metadynamics makes it possible to overcome large energy barriers, whereas solute tempering improves the sampling with an affordable computational effort. From the sampled free-energy surface we are able to reproduce a number of experimental observations, such as the fact that the lowest minimum corresponds to a globular conformation displaying some degree of beta-structure, and that the helical state is metastable and involves only 65% of the chain. The calculations also show that the system consistently populates a pi-helix state and that the hydrophobic staple motif is present only in the free-energy minimum associated with the helices, and contributes to their stabilization. The use of metadynamics coupled with solute tempering thus proves particularly suitable for providing the thermodynamics of a short peptide, and its computational efficiency makes it promising for dealing with larger proteins.
Submitted 9 July, 2007;
originally announced July 2007.
-
Oscillations and temporal signalling in cells
Authors:
G. Tiana,
S. Krishna,
S. Pigolotti,
M. H. Jensen,
K. Sneppen
Abstract:
The development of new techniques to quantitatively measure gene expression in cells has shed light on a number of systems that display oscillations in protein concentration. Here we review the different mechanisms which can produce oscillations in gene expression or protein concentration, using a framework of simple mathematical models. We focus on three eukaryotic genetic regulatory networks which show "ultradian" oscillations, with a time period of the order of hours, and involve, respectively, proteins important for development (Hes1), apoptosis (p53) and immune response (NFkB). We argue that underlying all three is a common design consisting of a negative feedback loop with time delay which is responsible for the oscillatory behaviour.
Submitted 22 March, 2007;
originally announced March 2007.
-
Use of the Metropolis algorithm to simulate the dynamics of protein chains
Authors:
G. Tiana,
L. Sutto,
R. A. Broglia
Abstract:
The Metropolis implementation of the Monte Carlo algorithm has been developed to study the equilibrium thermodynamics of many-body systems. When small trial moves are chosen, the trajectories obtained by applying this algorithm agree with those obtained by Langevin dynamics. Applying this procedure to a simplified protein model, it is possible to show that, by setting a threshold of 1 degree on the movement of the dihedrals of the protein backbone in a single Monte Carlo step, the mean quantities associated with the off-equilibrium dynamics (e.g., energy, RMSD, etc.) are well reproduced, while a good description of higher moments requires smaller moves. An important result is that the time duration of a Monte Carlo step depends linearly on the temperature, something which should be accounted for when doing simulations at different temperatures.
Submitted 20 February, 2007; v1 submitted 27 June, 2006;
originally announced June 2006.
-
Hierarchy of events in protein folding: beyond the Go model
Authors:
Ludovico Sutto,
Guido Tiana,
Ricardo A. Broglia
Abstract:
Simplified Go models, where only native contacts interact favorably, have proven useful to characterize some aspects of the folding of small proteins. The success of these models is limited by the fact that all residues interact in the same way, so that the folding features of a protein are determined only by the geometry of its native conformation. We present an extended version of a C-alpha based Go model where different residues interact with different energies. The model is used to calculate the thermodynamics of three small proteins (Protein G, SrcSH3 and CI2) and the effect of mutations on the wild-type sequence. The model makes it possible to investigate some of the most controversial areas in protein folding such as its earliest stages, a subject which has lately received particular attention. The picture which emerges for the three proteins under study is that of a hierarchical process, where local elementary structures (LES) (not necessarily coincident with elements of secondary structure) are formed at the early stages of the folding and drive the protein, through the transition state and the postcritical folding nucleus (FN), resulting from the docking of the LES, to the native conformation.
Submitted 26 January, 2006;
originally announced January 2006.
-
What thermodynamic features characterize good and bad folders? Results from a simplified off-lattice protein model
Authors:
A. Amatori,
J. Ferkinghoff-Borg,
G. Tiana,
R. A. Broglia
Abstract:
The thermodynamics of the small SH3 protein domain is studied by means of a simplified model where each bead-like amino acid interacts with the others through a contact potential controlled by a 20x20 random matrix. Good folding sequences, characterized by a low native energy, display three main thermodynamical phases, namely a coil-like phase, an unfolded globule and a folded phase (plus two other phases, namely frozen and random coil, populated only at extreme temperatures). Interestingly, the unfolded globule has some regions already structured. Poorly designed sequences, on the other hand, display a wide transition from the random coil to a frozen state. The comparison with the analytic theory of heteropolymers is discussed.
Submitted 12 December, 2005;
originally announced December 2005.
-
A folding inhibitor of the HIV-1 Protease
Authors:
R. A. Broglia,
D. Provasi,
F. Vasile,
G. Ottolina,
R. Longhi,
G. Tiana
Abstract:
Since the HIV-1 Protease (HIV-1-PR) is an essential enzyme in the viral life cycle, its inhibition can control AIDS. The folding of single domain proteins, like each of the monomers forming the HIV-1-PR homodimer, is controlled by local elementary structures (LES, folding units stabilized by strongly interacting, highly conserved, as a rule hydrophobic, amino acids). These LES have evolved over myriads of generations to recognize and strongly attract each other, so as to make the protein fold fast and be stable in its native conformation. Consequently, peptides displaying a sequence identical to those segments of the monomers associated with LES are expected to act as competitive inhibitors and thus destabilize the native structure of the enzyme. These inhibitors are unlikely to lead to escape mutants as they bind to the protease monomers through highly conserved amino acids which play an essential role in the folding process. The properties of one of the most promising inhibitors of the folding of the HIV-1-PR monomers found among these peptides are demonstrated with the help of spectrophotometric assays and CD spectroscopy.
Submitted 15 September, 2005;
originally announced September 2005.
-
Design of amino acid sequences to fold into C_alpha-model proteins
Authors:
A. Amatori,
G. Tiana,
L. Sutto,
J. Ferkinghoff-Borg,
A. Trovato,
R. A. Broglia
Abstract:
In order to extend the results obtained with minimal lattice models to more realistic systems, we study a model where proteins are described as a chain of 20 kinds of structureless amino acids moving in a continuum space and interacting through a contact potential controlled by a 20x20 quenched random matrix. The goal of the present work is to design and characterize amino acid sequences folding to the SH3 conformation, a 60-residue recognition domain common to many regulatory proteins. We show that a number of sequences can fold, starting from a random conformation, to within a distance root mean square deviation (dRMSD) of 2.6 Å from the native state. Good folders are those sequences displaying in the native conformation an energy lower than a sequence-independent threshold energy.
Submitted 16 May, 2005;
originally announced May 2005.
-
Design of HIV-1-PR inhibitors which do not create resistance: blocking the folding of single monomers
Authors:
R. A. Broglia,
G. Tiana,
L. Sutto,
D. Provasi,
F. Simona
Abstract:
One of the main problems of drug design is that of optimizing the drug-target interaction. In the case in which the target is a viral protein displaying a high mutation rate, a second problem arises, namely the eventual development of resistance. We wish to suggest a scheme for the design of non-conventional drugs which do not face either of these problems and apply it to the case of HIV-1 protease. It is based on the knowledge that the folding of single-domain proteins, such as each of the monomers forming the HIV-1-PR homodimer, is controlled by local elementary structures (LES), stabilized by local contacts among hydrophobic, strongly interacting and highly conserved amino acids which play a central role in the folding process. Because LES have evolved over myriads of generations to recognize and strongly interact with each other so as to make the protein fold fast as well as to avoid aggregation with other proteins, highly specific (and thus of low toxicity) as well as effective folding-inhibitor drugs suggest themselves: short peptides (or eventually their mimetic molecules), displaying the same amino acid sequence as the LES (p-LES). Aside from being specific and efficient, these inhibitors are expected not to induce resistance: in fact, mutations which successfully avoid their action imply the destabilization of one or more LES and thus should lead to protein denaturation. Making use of Monte Carlo simulations within the framework of a simple although not oversimplified model, which is able to reproduce the main thermodynamic as well as dynamic properties of monoglobular proteins, we first identify the LES of the HIV-1-PR and then show that the corresponding p-LES peptides act as effective inhibitors of the folding of the protease which do not create resistance.
Submitted 7 April, 2005;
originally announced April 2005.
-
Early events in insulin fibrillization studied by time-lapse atomic force microscopy
Authors:
A. Podestà,
G. Tiana,
P. Milani,
M. Manno
Abstract:
The importance of understanding the mechanism of protein aggregation into insoluble amyloid fibrils relies not only on its medical consequences, but also on its more basic properties of self-organization. The discovery that a large number of uncorrelated proteins can form, under proper conditions, structurally similar fibrils has suggested that the underlying mechanism is a general feature of polypeptide chains. In the present work, we address the early events preceding amyloid fibril formation in solutions of zinc-free human insulin incubated at low pH and high temperature. Aside from insulin being an easy-to-handle model for protein fibrillation, its subcutaneous aggregation after injection is a nuisance which affects patients with diabetes. Here, we show by time-lapse atomic force microscopy (AFM) that a steady-state distribution of protein oligomers with an exponential tail is reached within a few minutes after heating. This metastable phase lasts for a few hours until aggregation into fibrils suddenly occurs. A theoretical explanation of the oligomer pre-fibrillar distribution is given in terms of a simple coagulation-evaporation kinetic model, in which concentration plays the role of a critical parameter. Owing to the high resolution and sensitivity of the AFM technique, the observation of a long-lasting latency time should be considered an actual feature of the aggregation process, and not simply ascribed to instrumental inefficiency. These experimental facts, along with the kinetic model used, point to a critical role of thermal concentration fluctuations in the process of fibril nucleation.
Submitted 17 February, 2005;
originally announced February 2005.
-
Design of a folding inhibitor of the HIV-1 Protease
Authors:
R. A. Broglia,
G. Tiana,
D. Provasi,
F. Simona,
L. Sutto,
F. Vasile,
M. Zanotti
Abstract:
Since HIV-1-PR is an essential enzyme in the viral life cycle, its inhibition can control AIDS. Because the folding of single domain proteins, like HIV-1-PR, is controlled by local elementary structures (LES, folding units stabilized by strongly interacting, highly conserved amino acids) which have evolved over myriads of generations to recognize and strongly attract each other so as to make the protein fold fast, we suggest a novel type of HIV-1-PR inhibitor which interferes with the folding of the protein: short peptides displaying the same amino acid sequence as the LES. Theoretical and experimental evidence for the specificity and efficiency of such inhibitors is presented.
Submitted 16 August, 2004;
originally announced August 2004.
-
Simple models of protein folding and of non--conventional drug design
Authors:
R. A. Broglia,
G. Tiana
Abstract:
While all the information required for the folding of a protein is contained in its amino acid sequence, one has not yet learned how to extract this information to predict the three-dimensional, biologically active, native conformation of a protein whose sequence is known. Using insight obtained from simple model simulations of the folding of proteins, in particular the fact that this phenomenon is essentially controlled by conserved (native) contacts among (few) strongly interacting ("hot"), as a rule hydrophobic, amino acids, which also stabilize local elementary structures (LES, hidden, incipient secondary structures like $α$-helices and $β$-sheets) formed early in the folding process and leading to the postcritical folding nucleus (i.e., the minimum set of native contacts which bring the system past the highest free-energy barrier found in the whole folding process), it is possible to work out a successful strategy for reading the native structure of designed proteins from the knowledge of only their amino acid sequence and of the contact energies among the amino acids. Because LES have undergone millions of years of evolution to selectively dock to their complementary structures, small peptides made out of the same amino acids as the LES are expected to selectively attach to the newly expressed (unfolded) protein and inhibit its folding, or to the (fluctuating) native conformation and denature it. These peptides, or their mimetic molecules, can thus be used as effective non-conventional drugs, alternative to those already existing (and directed at neutralizing the active site of enzymes), displaying the advantage of not suffering from the rise of resistance.
Submitted 15 June, 2004;
originally announced June 2004.
-
Deriving amino acid contact potentials from their frequencies of occurence in proteins: a lattice model study
Authors:
G. Tiana,
M. Colombo,
D. Provasi,
R. A. Broglia
Abstract:
The possibility of deriving the contact potentials between amino acids from their frequencies of occurrence in proteins is discussed in evolutionary terms. This approach allows the use of traditional thermodynamics to describe such frequencies and, consequently, to develop a strategy to include in the calculations correlations due to the spatial proximity of the amino acids and to their overall tendency to be conserved in proteins. Making use of a lattice model to describe protein chains and defining a "true" potential, we test these strategies by selecting a database of folding model sequences, deriving the contact potentials from such sequences and comparing them with the "true" potential. Taking into account correlations allows for a markedly better prediction of the interaction potentials.
Submitted 15 June, 2004;
originally announced June 2004.
-
Thermodynamics of beta-amyloid fibril formation
Authors:
G. Tiana,
F. Simona,
R. A. Broglia,
G. Colombo
Abstract:
Amyloid fibers are aggregates of proteins. They are built out of a peptide called $β$-amyloid (A$β$) containing between 41 and 43 residues, produced by the action of an enzyme which cleaves a much larger protein known as the Amyloid Precursor Protein (APP). X-ray diffraction experiments have shown that these fibrils are rich in $β$-structures, whereas the shape of the peptide displays an $α$-helix structure within the APP in its biologically active conformation. A realistic model of fibril formation is developed based on the seventeen-residue A$β$12-28 amyloid peptide, which has been shown to form fibrils structurally similar to those of the whole A$β$ peptide. With the help of physical arguments and in keeping with experimental findings, the A$β$12-28 monomer is assumed to be in four possible states (i.e., native helix conformation, $β$-hairpin, globular low-energy state and unfolded state). Making use of these monomeric states, oligomers (dimers, tetramers and octamers) were constructed. With the help of short, detailed Molecular Dynamics (MD) calculations of the three monomers and of a variety of oligomers, energies for these structures were obtained. Making use of these results within the framework of a simple yet realistic model to describe the entropic terms associated with the variety of amyloid conformations, a phase diagram can be calculated of the whole many-body system, leading to a thermodynamical picture in overall agreement with the experimental findings. In particular, the existence of micellar metastable states seems to be a key issue in determining the thermodynamical properties of the system.
△ Less
Submitted 15 June, 2004;
originally announced June 2004.
-
Understanding the determinants of stability and folding of small globular proteins from their energetics
Authors:
G. Tiana,
F. Simona,
G. M. S. De Mori,
R. A. Broglia,
G. Colombo
Abstract:
The results of minimal model calculations suggest that the stability and the kinetic accessibility of the native state of small globular proteins are controlled by a few "hot" sites. By means of molecular dynamics simulations around the native conformation, which simulate the protein and the surrounding solvent at full--atom level, we generate an energetic map of the equilibrium state of the protein and simplify it with an eigenvalue decomposition. The components of the eigenvector associated with the lowest eigenvalue indicate which are the "hot" sites responsible for the stability and for the fast folding of the protein. Comparison of these predictions with the results of mutagenesis experiments, performed for five small proteins, shows excellent agreement.
Submitted 30 January, 2003;
originally announced January 2003.
-
Role of bulk and of interface contacts in the behaviour of model dimeric proteins
Authors:
G. Tiana,
D. Provasi,
R. A. Broglia
Abstract:
Some dimeric proteins first fold and then dimerize (three--state dimers) while others first dimerize and then fold (two--state dimers). Within the framework of a minimal lattice model, we can distinguish between sequences obeying one or the other mechanism on the basis of the partition of the ground state energy between bulk and interface contacts. The topology of contacts is very different for the bulk than for the interface: while the bulk displays a rich network of interactions, the dimer interface is built out of a set of essentially independent contacts. Consequently, the two sets of interactions play very different roles both in the folding and in the evolutionary history of the protein. Three--state dimers, where a large fraction of the energy is concentrated in a few contacts buried in the bulk, and where the relative contact energy of interface contacts is considerably smaller than that associated with bulk contacts, fold according to a hierarchical pathway controlled by local elementary structures, as also happens in the folding of single--domain monomeric proteins. On the other hand, two--state dimers display a relative contact energy of interface contacts which is larger than the corresponding quantity associated with the bulk. In this case, the assembly of the interface stabilizes the system and leads the two chains to fold. The specific properties of three--state dimers acquired through evolution are expected to be more robust than those of two--state dimers, a fact which has consequences for proteins connected with viral diseases.
Submitted 6 November, 2002;
originally announced November 2002.
-
Resistance proof, folding-inhibitor drugs
Authors:
R. Broglia,
G. Tiana,
R. Berera
Abstract:
Conventional drugs work, as a rule, by inhibiting the enzymatic activity of specific proteins, capping their active site. In this paper we present a model of non-conventional drug design based on the inhibiting effect that small peptides, obtained from segments of the protein itself, have on the folding ability of the system. Such peptides attach to the newly expressed (unfolded) protein and inhibit its folding, an inhibition which can be avoided only through mutations which in any case denature the enzyme. These peptides, or their mimetic molecules, can be used as effective alternatives to the drugs already available, with the advantage of not suffering from the emergence of resistance.
Submitted 9 October, 2002;
originally announced October 2002.
-
Time delay as a key to Apoptosis Induction in the p53 Network
Authors:
G. Tiana,
M. H. Jensen,
K. Sneppen
Abstract:
A feedback mechanism that involves the proteins p53 and mdm2 induces cell death as a controlled response to severe DNA damage. A minimal model for this mechanism demonstrates that the response may be dynamic and connected with the time needed to translate the mdm2 protein. The response takes place if the dissociation constant k between p53 and mdm2 varies from its normal value. Although it is widely believed that it is an increase in k that triggers the response, we show that the experimental behaviour is better described by a decrease in the dissociation constant. The response is quite robust upon changes in the parameters of the system, as required by any control mechanism, except for a few weak points, which could be connected with the onset of cancer.
Submitted 9 July, 2002;
originally announced July 2002.
-
Design and Folding of Dimeric Proteins
Authors:
G. Tiana,
R. A. Broglia
Abstract:
Just as the folding of single--domain proteins provides an important test in the study of self--organization, the folding of homodimers constitutes a basic challenge in the quest for the mechanisms which are at the basis of biological recognition. Dimerization is studied by following the evolution of two identical 20--letter amino acid chains within the framework of a lattice model and using Monte Carlo simulations. It is found that when design (evolution pressure) selects few, strongly interacting (conserved) amino acids to control the process, a three--state folding scenario follows, where the monomers first fold forming the halves of the eventual dimeric interface independently of each other, and then dimerize ("lock and key" kind of association). On the other hand, if design distributes the control of the folding process over a large number of (conserved) amino acids, a two--state folding scenario ensues, where dimerization takes place at the beginning of the process, resulting in an "induced type" of association. Making use of conservation patterns of families of analogous dimers, it is possible to compare the model predictions with the behaviour of real proteins. It is found that theory provides an overall account of the experimental findings.
Submitted 24 April, 2002;
originally announced April 2002.
-
Designability of lattice model heteropolymers
Authors:
G. Tiana,
R. A. Broglia,
D. Provasi
Abstract:
Protein folds are highly designable, in the sense that many sequences fold to the same conformation. In the present work we derive an expression for the designability in a 20-letter lattice model of proteins which, relying only on the Central Limit Theorem, has a generality which goes beyond the simple model used in its derivation. This expression displays an exponential dependence on the energy of the optimal sequence folding on the given conformation, measured with respect to the lowest energy of the conformationally dissimilar structures; this energy difference constitutes the only parameter controlling designability. Accordingly, the designability of a native conformation is intimately connected to the stability of the sequences folding to it.
Submitted 17 May, 2001;
originally announced May 2001.
-
Statistical Analysis of Native Contact Formation in the Folding of Designed Model Proteins
Authors:
G. Tiana,
R. A. Broglia
Abstract:
The time evolution of the formation probability of native bonds has been studied for designed sequences which fold fast into the native conformation.
From this analysis a clear hierarchy of bonds emerges: a) local, fast-forming, highly stable native bonds built by some of the most strongly interacting amino acids of the protein, b) non-local bonds formed late in the folding process, in coincidence with the folding nucleus, and involving essentially the same strongly interacting amino acids already participating in the fast bonds, c) the rest of the native bonds, whose behaviour is subordinated, to a large extent, to that of the local and non-local native contacts.
Submitted 25 October, 2000;
originally announced October 2000.
-
Reading the three-dimensional structure of a protein from its amino acid sequence
Authors:
R. A. Broglia,
G. Tiana
Abstract:
While all the information required for the folding of a protein is contained in its amino acid sequence, one has not yet learnt how to extract this information so as to predict the detailed, biologically active, three-dimensional structure of a protein whose sequence is known. This situation is not particularly satisfactory, in keeping with the fact that while linear sequencing of the amino acids specifying a protein is relatively simple to carry out, the determination of the folded (native) conformation can only be done by an elaborate X-ray diffraction analysis performed on crystals of the protein or, if the protein is very small, by nuclear magnetic resonance techniques. Using insight obtained from lattice model simulations of the folding of small proteins (fewer than 100 residues), in particular the fact that this phenomenon is essentially controlled by conserved contacts among strongly interacting amino acids, which also stabilize local elementary structures formed early in the folding process and leading to the (post-critical) folding core when they assemble together, we have worked out a successful strategy for reading the three-dimensional structure of a notional protein from its amino acid sequence.
Submitted 3 June, 2000; v1 submitted 9 May, 2000;
originally announced May 2000.
-
Hydrogen Bonds in Polymer Folding
Authors:
J. Borg,
M. H. Jensen,
K. Sneppen,
G. Tiana
Abstract:
The thermodynamics of a homopolymeric chain with both van der Waals and highly directional hydrogen bond interactions is studied. The effect of hydrogen bonds is to reduce dramatically the entropy of low-lying states and to give rise to long-range order and to conformations displaying secondary structures. For compact polymers a transition is found between helix-rich states and low-entropy sheet-dominated states. The consequences of this transition for protein folding and, in particular, for the problem of prions are discussed.
Submitted 17 March, 2000;
originally announced March 2000.
-
Why does a protein fold?
Authors:
R. A. Broglia,
G. Tiana
Abstract:
With the help of lattice Monte Carlo modelling of heteropolymers, we show that the necessary condition for a protein to fold on short call is to proceed through partially folded intermediates. These elementary structures are formed at an early stage in the folding process and contain, at the local level, essentially all of the amino acids found in the folding core (transition state) of the protein, providing the local guidance for its formation. The sufficient condition for the protein to fold is that the designed sequence has an energy, in the native conformation, below $E_c$ (the lowest energy of the structurally dissimilar compact conformations), where it does not have to compete with the bulk of misfolded conformations. Sequences with energy close to $E_c$ can display prion--like behaviour, folding to two structurally dissimilar conformations, one of them being the native.
Submitted 7 March, 2000;
originally announced March 2000.
-
Hiking in the energy landscape in sequence space: a bumpy road to good folders
Authors:
G. Tiana,
R. A. Broglia,
E. I. Shakhnovich
Abstract:
With the help of a simple 20-letter lattice model of heteropolymers, we investigate the energy landscape in the space of designed good-folder sequences. Low-energy sequences form clusters, interconnected via neutral networks, in the space of sequences. Residues which play a key role in the foldability of the chain and in the stability of the native state are highly conserved, even among the chains belonging to different clusters. If, according to the interaction matrix, some strong attractive interactions are almost degenerate (i.e. they can be realized by more than one type of amino acid contact), sequence clusters group into a few super-clusters. Sequences belonging to different super-clusters are dissimilar, displaying very small ($\approx 10\%$) similarity, and residues in key sites are, as a rule, not conserved. Similar behavior is observed in the analysis of real protein sequences.
Submitted 25 May, 1999; v1 submitted 20 May, 1999;
originally announced May 1999.