-
Autoencoders for dimensionality reduction in molecular dynamics: collective variable dimension, biasing and transition states
Authors:
Zineb Belkacemi,
Marc Bianciotto,
Herve Minoux,
Tony Lelievre,
Gabriel Stoltz,
Paraskevi Gkeka
Abstract:
The heat shock protein 90 (Hsp90) is a molecular chaperone that controls the folding and activation of client proteins using the free energy of ATP hydrolysis. The Hsp90 active site is in its N-terminal domain (NTD). Our goal is to characterize the dynamics of NTD using an autoencoder-learned collective variable (CV) in conjunction with adaptive biasing force (ABF) Langevin dynamics. Using dihedral analysis, we cluster all available experimental Hsp90 NTD structures into distinct native states. We then perform unbiased molecular dynamics (MD) simulations to construct a dataset that represents each state and use this dataset to train an autoencoder. Two autoencoder architectures are considered, with one and two hidden layers respectively, and bottlenecks of dimension $k$ ranging from 1 to 10. We demonstrate that the addition of an extra hidden layer does not significantly improve the performance, while it leads to complicated CVs that increase the computational cost of biased MD calculations. In addition, a 2D bottleneck can provide enough information about the different states, while the optimal bottleneck dimension is five. For the 2D bottleneck, the two-dimensional CV is directly used in biased MD simulations. For the 5D bottleneck, we perform an analysis of the latent CV space and identify the pair of CV coordinates that best separates the states of Hsp90. Interestingly, selecting a 2D CV out of the 5D CV space leads to better results than directly learning a 2D CV, and makes it possible to observe transitions between native states when running free energy biased dynamics.
Submitted 5 June, 2023;
originally announced June 2023.
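Illustrative sketch (not taken from the paper): the abstract above mentions identifying the pair of coordinates in a 5D latent CV space that best separates the Hsp90 states, without stating the criterion used. The snippet below shows one way such a selection could be done, assuming a simple Fisher-like ratio of between-state to within-state scatter. The function names `separation_score`, `best_pair` and `toy_latent`, the score itself, and the toy data are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def separation_score(points, labels, i, j):
    """Hypothetical Fisher-like score in the (i, j) plane:
    between-state scatter divided by within-state scatter."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append((p[i], p[j]))
    overall = (mean(p[i] for p in points), mean(p[j] for p in points))
    between = 0.0
    within = 0.0
    for members in groups.values():
        cx = mean(x for x, _ in members)
        cy = mean(y for _, y in members)
        between += len(members) * ((cx - overall[0]) ** 2 + (cy - overall[1]) ** 2)
        within += sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in members)
    # Small constant avoids division by zero for degenerate clusters.
    return between / (within + 1e-12)


def best_pair(points, labels):
    """Return the index pair (i, j) with the highest separation score."""
    k = len(points[0])
    return max(combinations(range(k), 2),
               key=lambda ij: separation_score(points, labels, *ij))


def toy_latent(state, n):
    """Toy 5D latent point (an assumption, not simulation data):
    deterministic jitter in [-0.1, 0.1] on every coordinate, with
    state 1 shifted along coordinate 1 and state 2 along coordinate 3."""
    z = [(((n * (c + 3)) % 11) - 5) / 50 for c in range(5)]
    z[1] += 2.0 if state == 1 else 0.0
    z[3] += 2.0 if state == 2 else 0.0
    return z


labels = [s for s in range(3) for _ in range(11)]
points = [toy_latent(s, n) for s in range(3) for n in range(11)]
pair = best_pair(points, labels)
print(pair)
```

In this toy setup only coordinates 1 and 3 carry state information, so the selected pair is the one that combines them. With real latent trajectories, the state labels would come from a prior clustering of the structures, and a different separation criterion may be more appropriate.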
-
In silico drug repositioning for COVID-19 using absolute binding free energy calculations
Authors:
Théau Debroise,
Rose Hoste,
Quentin Chamayou,
Hervé Minoux,
Bruno Filoche-Rommé,
Marc Bianciotto,
Jean-Philippe Rameau,
Laurent Schio,
Maximilien Levesque
Abstract:
Since the rise of the SARS-CoV-2 pandemic in the winter of 2019, the need for an affordable and efficient drug has not yet been met. Leveraging its unique, fast and precise binding free energy prediction technology, Aqemia screened and ranked FDA-approved molecules against the 3CLpro protein. This protease is key to the post-translational modification of two polyproteins produced by the viral genome. Among our top 10 predicted molecules, we propose some drugs or prodrugs that could be repurposed and used in the treatment of COVID-19 cases.
Submitted 22 September, 2021; v1 submitted 8 September, 2021;
originally announced September 2021.
-
Scaffold-constrained molecular generation
Authors:
Maxime Langevin,
Herve Minoux,
Maximilien Levesque,
Marc Bianciotto
Abstract:
One of the major applications of generative models for drug discovery targets the lead-optimization phase. During the optimization of a lead series, it is common to have scaffold constraints imposed on the structure of the molecules designed. Without enforcing such constraints, the probability of generating molecules with the required scaffold is extremely low and hinders the practicality of generative models for de-novo drug design. To tackle this issue, we introduce a new algorithm to perform scaffold-constrained in-silico molecular design. We build on the well-known SMILES-based Recurrent Neural Network (RNN) generative model, with a modified sampling procedure to achieve scaffold-constrained generation. We directly benefit from the associated reinforcement learning methods, which make it possible to design molecules optimized for different properties while exploring only the relevant chemical space. We showcase the method's ability to perform scaffold-constrained generation on various tasks: designing novel molecules around scaffolds extracted from SureChEMBL chemical series, generating novel active molecules on the Dopamine Receptor D2 (DRD2) target, and, finally, designing predicted actives on the MMP-12 series, an industrial lead-optimization project.
Submitted 5 October, 2020; v1 submitted 15 September, 2020;
originally announced September 2020.
-
Machine learning force fields and coarse-grained variables in molecular dynamics: application to materials and biological systems
Authors:
Paraskevi Gkeka,
Gabriel Stoltz,
Amir Barati Farimani,
Zineb Belkacemi,
Michele Ceriotti,
John Chodera,
Aaron R. Dinner,
Andrew Ferguson,
Jean-Bernard Maillet,
Hervé Minoux,
Christine Peter,
Fabio Pietrucci,
Ana Silveira,
Alexandre Tkatchenko,
Zofia Trstanova,
Rafal Wiewiora,
Tony Leliévre
Abstract:
Machine learning encompasses a set of tools and algorithms which are now becoming popular in almost all scientific and technological fields. This is true for molecular dynamics as well, where machine learning promises to extract valuable information from the enormous amounts of data generated by the simulation of complex systems. We provide here a review of our current understanding of the goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab-initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.
Submitted 15 April, 2020;
originally announced April 2020.