-
MiMiC: A High-Performance Framework for Multiscale Molecular Dynamics Simulations
Authors:
Andrej Antalík,
Andrea Levy,
Sonata Kvedaravičiūtė,
Sophia K. Johnson,
David Carrasco-Busturia,
Bharath Raghavan,
François Mouvet,
Angela Acocella,
Sambit Das,
Vikram Gavini,
Davide Mandelli,
Emiliano Ippoliti,
Simone Meloni,
Paolo Carloni,
Ursula Rothlisberger,
Jógvan Magnus Haugaard Olsen
Abstract:
MiMiC is a framework for performing multiscale simulations, where individual subsystems are handled at different resolutions and/or levels of theory by loosely coupled external programs. To make it highly efficient and flexible, we adopt an interoperable approach based on a multiple-program multiple-data paradigm, serving as an intermediary responsible for fast data exchange and interactions between the subsystems. The main goal of MiMiC is to avoid interfering with the underlying parallelization of the external programs, including the operability on hybrid architectures (e.g., CPU/GPU), and keep their setup and execution as close as possible to the original. At the moment, MiMiC offers an efficient implementation of electrostatic embedding QM/MM that has demonstrated unprecedented parallel scaling in simulations of large biomolecules using CPMD and GROMACS as QM and MM engines, respectively. However, as it is designed for high flexibility with general multiscale models in mind, it can be straightforwardly extended beyond QM/MM. In this article, we illustrate the software design and the features of the framework, which make it a compelling choice for multiscale simulations in the upcoming era of exascale high-performance computing.
Submitted 27 March, 2024;
originally announced March 2024.
-
Effective Data-Driven Collective Variables for Free Energy Calculations from Metadynamics of Paths
Authors:
Lukas Müllender,
Andrea Rizzi,
Michele Parrinello,
Paolo Carloni,
Davide Mandelli
Abstract:
A variety of enhanced sampling methods predict multidimensional free energy landscapes associated with biological and other molecular processes as a function of a few selected collective variables (CVs). The accuracy of these methods is crucially dependent on the ability of the chosen CVs to capture the relevant slow degrees of freedom of the system. For complex processes, finding such CVs is the real challenge. Machine learning (ML) CVs offer, in principle, a solution to handle this problem. However, these methods rely on the availability of high-quality datasets -- ideally incorporating information about physical pathways and transition states -- which are difficult to access, therefore greatly limiting their domain of application. Here, we demonstrate how these datasets can be generated by means of enhanced sampling simulations in trajectory space via the metadynamics of paths [arXiv:2002.09281] algorithm. The approach is expected to provide a general and efficient way to generate efficient ML-based CVs for the fast prediction of free energy landscapes in enhanced sampling simulations. We demonstrate our approach with two numerical examples, a two-dimensional model potential and the isomerization of alanine dipeptide, using deep targeted discriminant analysis as our ML-based CV of choice.
Submitted 8 April, 2024; v1 submitted 9 November, 2023;
originally announced November 2023.
-
Scalability of 3D-DFT by block tensor-matrix multiplication on the JUWELS Cluster
Authors:
Nitin Malapally,
Viacheslav Bolnykh,
Estela Suarez,
Paolo Carloni,
Thomas Lippert,
Davide Mandelli
Abstract:
The 3D Discrete Fourier Transform (DFT) is a technique used to solve problems in disparate fields. Nowadays, the commonly adopted implementation of the 3D-DFT is derived from the Fast Fourier Transform (FFT) algorithm. However, evidence indicates that the distributed memory 3D-FFT algorithm does not scale well due to its use of all-to-all communication. Here, building on the work of Sedukhin et al. [Proceedings of the 30th International Conference on Computers and Their Applications, CATA 2015 pp. 193-200 (01 2015)], we revisit the possibility of improving the scaling of the 3D-DFT by using an alternative approach that uses point-to-point communication, albeit at a higher arithmetic complexity. The new algorithm exploits tensor-matrix multiplications on a volumetrically decomposed domain via three specially adapted variants of Cannon's algorithm. It has here been implemented as a C++ library called S3DFT and tested on the JUWELS Cluster at the Jülich Supercomputing Center. Our implementation of the shared memory tensor-matrix multiplication attained 88% of the theoretical single node peak performance. One variant of the distributed memory tensor-matrix multiplication shows excellent scaling, while the other two show poorer performance, which can be attributed to their intrinsic communication patterns. A comparison of S3DFT with the Intel MKL and FFTW3 libraries indicates that currently iMKL performs best overall, followed in order by FFTW3 and S3DFT. This picture might change with further improvements of the algorithm and/or when running on clusters that use network connections with higher latency, e.g. on cloud platforms.
Submitted 23 March, 2023;
originally announced March 2023.
-
Multimap targeted free energy estimation
Authors:
Andrea Rizzi,
Paolo Carloni,
Michele Parrinello
Abstract:
We present a new method to compute free energies at a quantum mechanical (QM) level of theory from molecular simulations using cheap reference potential energy functions, such as force fields. To overcome the poor overlap between the reference and target distributions, we generalize targeted free energy perturbation (TFEP) to employ multiple configuration maps. While TFEP maps have been obtained before from an expensive training of a normalizing flow neural network (NN), our multimap estimator allows us to use the same set of QM calculations to both optimize the maps and estimate the free energy, thus removing almost completely the overhead due to training. A multimap extension of the multistate Bennett acceptance ratio estimator is also derived for cases where samples from two or more states are available. Furthermore, we propose a one-epoch learning policy that can be used to efficiently avoid overfitting when computing the loss function is expensive compared to generating data. Finally, we show how our multimap approach can be combined with enhanced sampling strategies to overcome the pervasive problem of poor convergence due to slow degrees of freedom. We test our method on the HiPen dataset of drug-like molecules and fragments, and we show that it can accelerate the calculation of the free energy difference of switching from a force field to a DFTB3 potential by about 3 orders of magnitude compared to standard FEP and by a factor of about 8 compared to previously published nonequilibrium calculations.
Submitted 23 August, 2023; v1 submitted 15 February, 2023;
originally announced February 2023.
-
Accelerating Deep Neural Networks for Real-time Data Selection for High-resolution Imaging Particle Detectors
Authors:
Yeon-Jae Jwa,
Giuseppe Di Guglielmo,
Luca P. Carloni,
Georgia Karagiorgi
Abstract:
This paper presents the custom implementation, optimization, and performance evaluation of convolutional neural networks on field programmable gate arrays, for the purposes of accelerating deep neural network inference on large, two-dimensional image inputs. The targeted application is that of data selection for high-resolution particle imaging detectors, and in particular liquid argon time projection chamber detectors, such as that employed by the future Deep Underground Neutrino Experiment. We motivate this particular application based on the excellent performance of deep neural networks on classifying simulated raw data from the DUNE LArTPC, combined with the need for power-efficient data processing in the case of remote, long-term, and limited-access operating detector conditions.
Submitted 12 January, 2022;
originally announced January 2022.
-
Targeted free energy perturbation revisited: Accurate free energies from mapped reference potentials
Authors:
Andrea Rizzi,
Paolo Carloni,
Michele Parrinello
Abstract:
We present an approach that extends the theory of targeted free energy perturbation (TFEP) to calculate free energy differences and free energy surfaces at an accurate quantum mechanical level of theory from a cheaper reference potential. The convergence is accelerated by a mapping function that increases the overlap between the target and the reference distributions. Building on recent work, we show that this map can be learned with a normalizing flow neural network, without requiring simulations with the expensive target potential but only a small number of single-point calculations, and, crucially, avoiding the systematic error that was found previously. We validate the method by numerically evaluating the free energy difference in a system with a double-well potential and by describing the free energy landscape of a simple chemical reaction in the gas phase.
Submitted 9 August, 2021; v1 submitted 17 June, 2021;
originally announced June 2021.
-
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices
Authors:
Farah Fahim,
Benjamin Hawks,
Christian Herwig,
James Hirschauer,
Sergo Jindariani,
Nhan Tran,
Luca P. Carloni,
Giuseppe Di Guglielmo,
Philip Harris,
Jeffrey Krupa,
Dylan Rankin,
Manuel Blanco Valentin,
Josiah Hester,
Yingyi Luo,
John Mamish,
Seda Orgrenci-Memik,
Thea Aarrestad,
Hamza Javed,
Vladimir Loncar,
Maurizio Pierini,
Adrian Alan Pol,
Sioni Summers,
Javier Duarte,
Scott Hauck,
Shih-Chieh Hsu
, et al. (5 additional authors not shown)
Abstract:
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends, including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.
Submitted 23 March, 2021; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Exhaustive Search of Ligand Binding Pathways via Volume-based Metadynamics
Authors:
Riccardo Capelli,
Paolo Carloni,
Michele Parrinello
Abstract:
Determining the complete set of ligands' binding/unbinding pathways is important for drug discovery and to rationally interpret mutation data. Here we have developed a metadynamics-based technique that addresses this issue and allows estimating affinities in the presence of multiple escape pathways. Our approach is demonstrated on a Lysozyme T4 variant in complex with the benzene molecule. The calculated binding free energy is in agreement with experimental data. Remarkably, not only were we able to find all the previously identified ligand binding pathways, but we also uncovered 3 new ones. These results were obtained at a small computational cost, making this approach valuable for practical applications, such as screening of small-compound libraries.
Submitted 24 April, 2019;
originally announced April 2019.
-
Open Boundary Simulations of Proteins and Their Hydration Shells by Hamiltonian Adaptive Resolution Scheme
Authors:
Thomas Tarenzi,
Vania Calandrini,
Raffaello Potestio,
Alejandro Giorgetti,
Paolo Carloni
Abstract:
The recently proposed Hamiltonian Adaptive Resolution Scheme (H-AdResS) allows molecular simulations to be performed in an open boundary framework. It allows the resolution of a specific subset of molecules (usually the solvent) to be changed on the fly, with these molecules free to diffuse between the atomistic region and the coarse-grained reservoir. So far, the method has been successfully applied to pure liquids. Coupling the H-AdResS methodology to hybrid models of proteins, such as the Molecular Mechanics/Coarse-Grained (MM/CG) scheme, is a promising approach for rigorous calculations of ligand binding free energies in low-resolution protein models. Towards this goal, here we apply H-AdResS for the first time to two atomistic proteins in dual-resolution solvent, proving its ability to reproduce structural and dynamic properties of both the proteins and the solvent, as obtained from atomistic simulations.
Submitted 15 November, 2017;
originally announced November 2017.
-
Statistical Analysis of $σ$-Holes: A Novel Complementary View on Halogen Bonding
Authors:
Michal H. Kolář,
Paolo Carloni,
Pavel Hobza
Abstract:
To contribute to the understanding of noncovalent binding of halogenated molecules with a biological activity, electrostatic potential (ESP) maps of more than 2,500 compounds were thoroughly analysed. A peculiar region of positive ESP, called a $σ$-hole, is a concept of central importance for halogen bonding. We aim to simplify the view of $σ$-holes and provide general trends in organic drug-like molecules. The results are in fair agreement with crystallographic surveys of small molecules as well as of biomolecular complexes and attempt to improve the intuition of chemists when dealing with halogenated compounds.
Submitted 25 September, 2017;
originally announced October 2017.
-
Proton Dynamics in Protein Mass Spectrometry
Authors:
Jinyu Li,
Wenping Lyu,
Giulia Rossetti,
Albert Konijnenberg,
Antonino Natalello,
Emiliano Ippoliti,
Modesto Orozco,
Frank Sobott,
Rita Grandori,
Paolo Carloni
Abstract:
Native electrospray ionization/ion mobility-mass spectrometry (ESI/IM-MS) allows an accurate determination of low-resolution structural features of proteins. Yet, the presence of proton dynamics, which we have already observed for DNA in the gas phase, and its impact on protein structural determinants, have not been investigated so far. Here, we address this issue by a multi-step simulation strategy on a pharmacologically relevant peptide, the N-terminal residues of amyloid-beta peptide (Abeta(1-16)). Our calculations reproduce the experimental maximum charge state from ESI-MS and are also in fair agreement with collision cross section (CCS) data measured here by ESI/IM-MS. Although the main structural features are preserved, subtle conformational changes do take place in the first ~0.1 ms of dynamics. In addition, intramolecular proton dynamics processes occur on the ps timescale in the gas phase, as emerges from quantum mechanics/molecular mechanics (QM/MM) simulations at the B3LYP level of theory. We conclude that proton transfer phenomena do occur frequently during flight time in ESI-MS experiments (typically on the ms timescale). However, the structural changes associated with the process do not significantly affect the structural determinants.
Submitted 25 February, 2017;
originally announced February 2017.
-
DNA like-charge attraction and overcharging by divalent counterions in the presence of divalent co-ions
Authors:
Nguyen Viet Duc,
Toan T. Nguyen,
Paolo Carloni
Abstract:
Strongly correlated electrostatics of DNA systems has drawn the interest of many groups, especially the condensation and overcharging of DNA by multivalent counterions. By adding counterions of different valencies and shapes, one can enhance or reduce DNA overcharging. In this paper, we focus on the effect of multivalent co-ions, specifically divalent co-ions such as SO$_4^{2-}$. A computational experiment of DNA condensation using Monte Carlo simulation in the grand canonical ensemble is carried out, where the DNA system is in equilibrium with a bulk solution containing a mixture of salts with co-ions of different valency. Compared to a system with purely monovalent co-ions, the influence of divalent co-ions shows up in multiple aspects. Divalent co-ions lead to an increase of monovalent salt in the DNA condensate. Because monovalent salts mostly participate in linear screening of electrostatic interactions in the system, the entry of more monovalent salt molecules into the condensate leads to screening out of the short-range DNA-DNA like-charge attraction and a weaker DNA condensation free energy. The overcharging of DNA by multivalent counterions is also reduced in the presence of divalent co-ions. Strong repulsions between DNA and divalent co-ions and among divalent co-ions themselves lead to a depletion of negative ions near the DNA surface as compared to the case without divalent co-ions. At large distance, the DNA-DNA repulsive interaction is stronger in the presence of divalent co-ions, suggesting that the role of divalent co-ions is not only that of simple stronger linear screening.
Submitted 24 May, 2017; v1 submitted 5 August, 2016;
originally announced August 2016.
-
RNA/peptide binding driven by electrostatics -- Insight from bi-directional pulling simulations
Authors:
Trang N. Do,
Paolo Carloni,
Gabriele Varani,
Giovanni Bussi
Abstract:
RNA/protein interactions play crucial roles in controlling gene expression. They are becoming important targets for pharmaceutical applications. Due to RNA flexibility and to the strength of electrostatic interactions, standard docking methods are insufficient. We here present a computational method which allows studying the binding of RNA molecules and charged peptides with atomistic, explicit-solvent molecular dynamics. In our method, a suitable estimate of the electrostatic interaction is used as an order parameter (collective variable) which is then accelerated using bi-directional pulling simulations. Since the electrostatic interaction is only used to enhance the sampling, the approximations used to compute it do not affect the final accuracy. The method is employed to characterize the binding of TAR RNA from HIV-1 and a small cyclic peptide. Our simulation protocol allows blindly predicting the binding pocket and pose as well as the binding affinity. The method is general and could be applied to study other electrostatics-driven binding events.
Submitted 21 July, 2013;
originally announced July 2013.
-
Many-Body meets QM/MM: Application to indole in water solution
Authors:
Adriano Mosca Conte,
Emiliano Ippoliti,
Rodolfo Del Sole,
Paolo Carloni,
Olivia Pulci
Abstract:
Spectral properties of chromophores are used to probe complex biological processes in vitro and in vivo, yet how the environment tunes their optical properties is far from being fully understood. Here we present a method to calculate such properties on large-scale systems, like biologically relevant molecules in aqueous solution. Our approach is based on many-body perturbation theory combined with a quantum-mechanics/molecular-mechanics (QM/MM) approach. We show here how to include quasi-particle and excitonic effects for the calculation of optical absorption spectra in a QM/MM scheme. We apply this scheme, together with the well-established TDDFT approach, to indole in water solution. Our calculations show that the solvent induces a redshift in the main spectral peak of indole, in quantitative agreement with the experiments, and point to the importance of performing averages over molecular dynamics configurations for calculating optical properties.
Submitted 25 May, 2008;
originally announced May 2008.
-
Convergent dynamics in the protease enzymatic superfamily
Authors:
Vincenzo Carnevale,
Simone Raugei,
Cristian Micheletti,
Paolo Carloni
Abstract:
Proteases regulate various aspects of the life cycle in all organisms by cleaving specific peptide bonds. Their action is so central for biochemical processes that at least 2% of any known genome encodes proteolytic enzymes. Here we show that selected protease pairs, despite differences in oligomeric state, catalytic residues and fold, share a common structural organization of functionally relevant regions, which are further shown to undergo similar concerted movements. The structural and dynamical similarities found pervasively across evolutionarily distant clans point to common mechanisms for peptide hydrolysis.
Submitted 15 September, 2006;
originally announced September 2006.
-
Serine Proteases: an Ab Initio Molecular Dynamics Study
Authors:
L. De Santis,
P. Carloni
Abstract:
In serine proteases (SPs), the H-bond between His-57 and Asp-102, and that between Gly-193 and the transition state intermediate, play a crucial role for enzymatic function. To shed light on the nature of these interactions, we have carried out ab initio molecular dynamics simulations on complexes representing adducts between the reaction intermediate and elastase (one protein belonging to the SP family). Our calculations indicate the presence of a low-barrier H-bond between His-57 and Asp-102, in complete agreement with NMR experiments on enzyme-transition state analog complexes. Comparison with an ab initio molecular dynamics simulation on a model of the substrate-enzyme adduct indicates that the Gly-193-induced strong stabilization of the intermediate is accomplished by charge/dipole interactions and not by H-bonding as previously suggested. Inclusion of the protein electric field in the calculations does not significantly affect the charge distribution.
Submitted 1 July, 1999;
originally announced July 1999.