
Molecular dynamics without molecules: searching the conformational space of proteins with generative neural networks

Authors: Gregory Schwing, Luigi L. Palese, Ariel Fernández, Loren Schwiebert, Domenico L. Gatti

Abstract: All-atom and coarse-grained molecular dynamics are two widely used computational tools to study the conformational states of proteins. Yet, these two simulation methods suffer from the fact that without access to supercomputing resources, the time and length scales at which these states become detectable are difficult to achieve. One alternative to such methods is based on encoding the atomistic trajectory of molecular dynamics as a shorthand version devoid of physical particles, and then learning to propagate the encoded trajectory through the use of artificial intelligence. Here we show that a simple textual representation of the frames of molecular dynamics trajectories as vectors of Ramachandran basin classes retains most of the structural information of the full atomistic representation of a protein in each frame, and can be used to generate equivalent atom-less trajectories suitable to train different types of generative neural networks. In turn, the trained generative models can be used to extend the atom-less dynamics indefinitely or to sample the conformational space of proteins from their representation in the models' latent space. We intuitively define this methodology as molecular dynamics without molecules, and show that it makes it possible to cover physically relevant states of proteins that are difficult to access with traditional molecular dynamics.

Submitted 8 June, 2022; originally announced June 2022.

Comments: 12 pages, 9 figures, 3 tables

arXiv:2103.00087 [pdf]

CXR-Net: An Artificial Intelligence Pipeline for Quick Covid-19 Screening of Chest X-Rays

Authors: Haikal Abdulah, Benjamin Huber, Sinan Lal, Hassan Abdallah, Luigi L. Palese, Hamid Soltanian-Zadeh, Domenico L. Gatti

Abstract: CXR-Net is a two-module Artificial Intelligence pipeline for the quick detection of SARS-CoV-2 from chest X-rays (CXRs). Module 1 was trained on a public dataset of 6395 CXRs with radiologist-annotated lung contours to generate masks of the lungs that overlap the heart and large vasa. Module 2 is a hybrid convnet in which the first convolutional layer with learned coefficients is replaced by a layer with fixed coefficients provided by the Wavelet Scattering Transform (WST). Module 2 takes as inputs the patients' CXRs and corresponding lung masks calculated by Module 1, and produces as outputs a class assignment (Covid vs. non-Covid) and high-resolution heat maps that identify the SARS-associated lung regions. Module 2 was trained on a dataset of CXRs from non-Covid and RT-PCR-confirmed Covid patients acquired at the Henry Ford Health System (HFHS) Hospital in Detroit. All non-Covid CXRs were from the pre-Covid era (2018-2019), and included images from both normal lungs and lungs affected by non-Covid pathologies. Training and test sets consisted of 2265 CXRs (1417 Covid negative, 848 Covid positive) and 1532 CXRs (945 Covid negative, 587 Covid positive), respectively. Six distinct cross-validation models, each trained on 1887 images and validated against 378 images, were combined into an ensemble model that was used to classify the CXR images of the test set with resulting Accuracy = 0.789, Precision = 0.739, Recall = 0.693, F1 score = 0.715, ROC(AUC) = 0.852.

Submitted 26 February, 2021; originally announced March 2021.

Comments: 16 pages, 14 figures. arXiv admin note: substantial text overlap with arXiv:2011.08655

MSC Class: 68Txx (Primary); 68T07 (Secondary); 92B20; 68T45

ACM Class: I.2; I.4; I.5; J.3
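
The figures quoted in the CXR-Net abstract can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the paper: it only assumes the standard definition of the F1 score as the harmonic mean of precision and recall, and the helper name `f1_score` is illustrative. It recomputes F1 from the reported precision and recall and confirms that the class counts and the per-fold split add up to the stated set sizes.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard definition)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract.
precision, recall = 0.739, 0.693
f1 = f1_score(precision, recall)

# Set sizes as quoted in the abstract.
train_total = 1417 + 848   # Covid negative + Covid positive (training set)
test_total = 945 + 587     # Covid negative + Covid positive (test set)
fold_total = 1887 + 378    # images trained on + images validated against, per model

print(f"F1 recomputed from precision and recall: {f1:.3f}")
print(f"training set: {train_total}, test set: {test_total}, per-fold total: {fold_total}")
```

The recomputed F1 rounds to the reported 0.715, and the per-fold total matches the 2265-image training set, which is consistent with a six-fold split of roughly one sixth of the images held out for validation in each model.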

arXiv:2011.08655 [pdf]

Lung Segmentation in Chest X-rays with Res-CR-Net

Authors: Haikal Abdulah, Benjamin Huber, Sinan Lal, Hassan Abdallah, Hamid Soltanian-Zadeh, Domenico L. Gatti

Abstract: Deep Neural Networks (DNN) are widely used to carry out segmentation tasks in biomedical images. Most DNNs developed for this purpose are based on some variation of the encoder-decoder U-Net architecture. Here we show that Res-CR-Net, a new type of fully convolutional neural network, which was originally developed for the semantic segmentation of microscopy images, and which does not adopt a U-Net architecture, is very effective at segmenting the lung fields in chest X-rays from either healthy patients or patients with a variety of lung pathologies.

Submitted 13 November, 2020; originally announced November 2020.

Comments: 8 pages, 5 figures

MSC Class: 68Txx (Primary); 68T07 (Secondary); 92B20; 68T45

ACM Class: I.2; I.4; J.3

arXiv:2004.08246 [pdf]

Res-CR-Net, a residual network with a novel architecture optimized for the semantic segmentation of microscopy images

Authors: Hassan Abdallah, Asiri Liyanaarachchi, Maranda Saigh, Samantha Silvers, Suzan Arslanturk, Douglas J. Taatjes, Lars Larsson, Bhanu P. Jena, Domenico L. Gatti

Abstract: Deep Neural Networks (DNN) have been widely used to carry out segmentation tasks in both electron and light microscopy. Most DNNs developed for this purpose are based on some variation of the encoder-decoder type U-Net architecture, in combination with residual blocks to increase ease of training and resilience to gradient degradation. Here we introduce Res-CR-Net, a type of DNN that features residual blocks with either a bundle of separable atrous convolutions with different dilation rates or a convolutional LSTM. The number of filters used in each residual block and the number of blocks are the only hyperparameters that need to be modified in order to optimize the network training for a variety of different microscopy images.

Submitted 14 April, 2020; originally announced April 2020.

arXiv:2001.00091 [pdf]

Hemoglobin Non-equilibrium Oxygen Dissociation Curve

Authors: Rosella Scrima, Sabino Fugetto, Nazzareno Capitanio, Domenico L. Gatti

Abstract: Abnormal hemoglobins can have major consequences for tissue delivery of oxygen. Correct diagnosis of hemoglobinopathies with altered oxygen affinity requires a determination of the hemoglobin oxygen dissociation curve (ODC), which relates the hemoglobin oxygen saturation to the partial pressure of oxygen in the blood. Determination of the ODC of human hemoglobin is typically carried out under conditions in which hemoglobin is in equilibrium with O2 at each partial pressure. However, in the human body, due to the fast transit of RBCs through tissues, hemoglobin oxygen exchanges occur under non-equilibrium conditions. We describe the determination of the non-equilibrium ODC, and show that under these conditions Hb cooperativity has two apparent components in the Adair, Perutz, and MWC models of Hb. The first component, which we call sequential cooperativity, accounts for ~70% of Hb cooperativity, and emerges from the constraint of sequential binding that is shared by the three models. The second component, which we call conformational cooperativity, accounts for ~30% of Hb cooperativity, and is due either to a conformational equilibrium between low affinity and high affinity tetramers (as in the MWC model), or to a conformational change from low to high affinity once two of the tetramer sites are occupied (Perutz model).

Submitted 31 December, 2019; originally announced January 2020.

ACM Class: I.6.5

arXiv:1404.6684 [pdf]

Multidimensional mutual information methods for the analysis of covariation in multiple sequence alignments

Authors: Greg W. Clark, Sharon H. Ackerman, Elisabeth R. Tillier, Domenico L. Gatti

Abstract: Several methods are available for the detection of covarying positions from a multiple sequence alignment (MSA). If the MSA contains a large number of sequences, information about the proximities between residues derived from covariation maps can be sufficient to predict a protein fold. If the structure is already known, information on the covarying positions can be valuable to understand the protein mechanism. In this study we have sought to determine whether a multivariate extension of traditional mutual information (MI) can be an additional tool to study covariation. The performance of two multidimensional MI (mdMI) methods, designed to remove the effect of ternary/quaternary interdependencies, was tested with a set of 9 MSAs each containing <400 sequences, and was shown to be comparable to that of methods based on maximum entropy/pseudolikelihood statistical models of protein sequences. However, while all the methods tested detected a similar number of covarying pairs among the residues separated by < 8 Å in the reference X-ray structures, there was on average less than 65% overlap between the top scoring pairs detected by methods that are based on different principles. We have also attempted to identify whether the difference in performance among methods is due to different efficiency in removing covariation originating from chains of structural contacts. We found that the reason why methods that derive partial correlation between the columns of an MSA provide a better recognition of close contacts is not because they remove chaining effects, but because they filter out the correlation between distant residues that originates from general fitness constraints. In contrast, we found that true chaining effects are an expression of real physical perturbations that propagate inside proteins, and therefore are not removed by the derivation of partial correlation between variables.

Submitted 26 April, 2014; originally announced April 2014.

Comments: 21 pages, 4 figures, 1 table, supporting information containing 2 additional figures is included at the end of the manuscript
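
For readers unfamiliar with the "traditional mutual information (MI)" that the abstract takes as its starting point, the sketch below computes plain pairwise MI between two columns of an alignment. It is an illustration only, not the authors' multidimensional mdMI method: the function name `mutual_information` and the four-sequence toy alignment are made up for the example, and no correction for gaps, sequence weighting, or small-sample bias is attempted.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Plain pairwise MI (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)               # residue counts in column a
    count_b = Counter(col_b)               # residue counts in column b
    count_ab = Counter(zip(col_a, col_b))  # joint residue-pair counts
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical toy alignment: columns 0 and 1 covary, column 2 is invariant.
msa = ["AKL", "AKL", "GEL", "GEL"]
columns = list(zip(*msa))  # transpose rows (sequences) into columns (positions)

mi_01 = mutual_information(columns[0], columns[1])
mi_02 = mutual_information(columns[0], columns[2])
print(f"MI(col 0, col 1) = {mi_01:.3f} bits")
print(f"MI(col 0, col 2) = {mi_02:.3f} bits")
```

In this toy case the perfectly covarying pair carries one bit of information and the pair involving the invariant column carries none. Pairwise MI of this kind cannot distinguish direct from indirect coupling, which is the limitation that the multivariate and partial-correlation approaches discussed in the abstract are meant to address.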

Showing 1–6 of 6 results for author: Gatti, D L