-
NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries
Authors:
Ewa M. Nowara,
Pedro O. Pinheiro,
Sai Pooja Mahajan,
Omar Mahmood,
Andrew Martin Watkins,
Saeed Saremi,
Michael Maser
Abstract:
We present NEBULA, the first latent 3D generative model for scalable generation of large molecular libraries around a seed compound of interest. Such libraries are crucial for scientific discovery, but it remains challenging to generate large numbers of high-quality samples efficiently. 3D-voxel-based methods have recently shown great promise for generating high-quality samples de novo from random noise (Pinheiro et al., 2023). However, sampling in 3D-voxel space is computationally expensive, and its use in library generation is prohibitively slow. Here, we instead perform neural empirical Bayes sampling (Saremi & Hyvärinen, 2019) in the learned latent space of a vector-quantized variational autoencoder. NEBULA generates large molecular libraries nearly an order of magnitude faster than existing methods without sacrificing sample quality. Moreover, NEBULA generalizes better to unseen drug-like molecules, as demonstrated on two public datasets and multiple recently released drugs. We expect the approach herein to be highly enabling for machine learning-based drug discovery. The code is available at https://github.com/prescient-design/nebula
Submitted 3 July, 2024;
originally announced July 2024.
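As a rough illustration of the neural empirical Bayes idea mentioned in the NEBULA abstract, the sketch below runs a "walk" (Langevin steps on a noise-smoothed density) followed by a "jump" (a single denoising step) on a one-dimensional toy problem. Everything here is an assumption made for illustration: the clean distribution is a standard Gaussian so the score is known in closed form, whereas NEBULA would use a learned model in the latent space of its autoencoder, and the names `score`, `walk`, `jump`, and `sample_around` are hypothetical and not taken from the NEBULA code.

```python
import math
import random

SIGMA = 1.0      # assumed noise level of the Gaussian smoothing kernel
PRIOR_VAR = 1.0  # toy clean distribution is N(0, PRIOR_VAR); an assumption


def score(y, sigma=SIGMA):
    """Score (gradient of log density) of the smoothed density N(0, PRIOR_VAR + sigma^2)."""
    return -y / (PRIOR_VAR + sigma ** 2)


def walk(y, n_steps, step_size, rng, sigma=SIGMA):
    """Unadjusted Langevin steps on the smoothed density (the 'walk')."""
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        y = y + step_size * score(y, sigma) + math.sqrt(2.0 * step_size) * noise
    return y


def jump(y, sigma=SIGMA):
    """Empirical Bayes denoising estimate x_hat = y + sigma^2 * score(y) (the 'jump')."""
    return y + sigma ** 2 * score(y, sigma)


def sample_around(seed_x, n_samples, rng, sigma=SIGMA, n_steps=50, step_size=0.05):
    """Noise a seed point once, then alternate walking and jumping to collect samples."""
    y = seed_x + sigma * rng.gauss(0.0, 1.0)
    samples = []
    for _ in range(n_samples):
        y = walk(y, n_steps, step_size, rng, sigma)
        samples.append(jump(y, sigma))
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_around(seed_x=0.5, n_samples=5, rng=rng))
```

In this toy setting the jump simply shrinks the noisy value toward the mean of the clean distribution. The point of the sketch is only the structure of the sampler: the chain is run on the smoothed variable, and clean samples are read off by denoising.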
-
Impact of maternal high fat on neurovascular unit of adult offspring
Authors:
Cheryl A. Hawkes,
Victoria Goss,
Elina Zotova,
Tual Monfort,
Anthony Postle,
Sumeet Mahajan,
James A. R. Nicoll,
Roy O. Weller,
Roxana O. Carare
Abstract:
Maternal obesity is associated with increased risk of diabetes, cardiovascular disease and hypertension in adult offspring. Midlife hypercholesterolemia and hypertension are risk factors for Alzheimer's disease, suggesting that the ageing brain may be impacted by early life environment. We found that exposure to a high fat diet during gestation and lactation induced changes in multiple components of the neurovascular unit, including a downregulation in apolipoprotein E and fibronectin, an upregulation in markers of astrocytes and perivascular macrophages and altered blood vessel morphology in the brains of adult mice. Feeding of high fat diet after weaning increased lipid droplets in the brain and influenced the fatty acid composition of phosphatidylcholine and phosphatidylethanolamine species, but did not affect the neurovascular unit. Sustained high fat diet over the entire lifespan resulted in additional decreases in levels of pericytes and collagen IV, changes in phospholipid composition and impaired perivascular clearance of Beta-amyloid (A-Beta) from the brain. In humans, vascular A-Beta load was significantly increased in the brains of aged individuals with a history of hypercholesterolemia. These results support a critical role for early dietary influence on the brain vasculature across the lifespan, with consequences for the development of age-related cerebrovascular and neurodegenerative diseases.
Submitted 9 January, 2024;
originally announced January 2024.
-
Excitation Properties of Photopigments and Their Possible Dependence on the Host Star
Authors:
Manasvi Lingam,
Amedeo Balbi,
Swadesh M. Mahajan
Abstract:
Photosynthesis is a plausible pathway for the sustenance of a substantial biosphere on an exoplanet. In fact, it is also anticipated to create distinctive biosignatures detectable by next-generation telescopes. In this work, we explore the excitation features of photopigments that harvest electromagnetic radiation by constructing a simple quantum-mechanical model. Our analysis suggests that the primary Earth-based photopigments for photosynthesis may not function efficiently at wavelengths $> 1.1$ $μ$m. In the context of (hypothetical) extrasolar photopigments, we calculate the potential number of conjugated $π$-electrons ($N_\star$) in the relevant molecules, which can participate in the absorption of photons. By hypothesizing that the absorption maxima of photopigments are close to the peak spectral photon flux of the host star, we utilize the model to estimate $N_\star$. As per our formalism, $N_\star$ is modulated by the stellar temperature, and is conceivably higher (lower) for planets orbiting stars cooler (hotter) than the sun; exoplanets around late-type M-dwarfs might require an $N_\star$ twice that of the Earth. We conclude the analysis with a brief exposition of how our model could be empirically tested by future observations.
Submitted 6 June, 2023;
originally announced June 2023.
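The photopigment abstract ties absorption maxima to the peak spectral photon flux of the host star. A minimal sketch of that peak is given below, assuming the star radiates as a blackbody and that the photon flux is measured per unit wavelength; neither assumption is stated in the abstract. Under those assumptions the peak satisfies $x = 4(1 - e^{-x})$ with $x = hc / (\lambda k_B T)$. The function names are hypothetical, and the sketch does not attempt to reproduce the authors' quantum-mechanical model or their estimate of $N_\star$.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m / s
K_B = 1.380649e-23   # Boltzmann constant, J / K


def photon_peak_x(tol=1e-12):
    """Solve x = 4 * (1 - exp(-x)) by fixed-point iteration.

    This is the stationarity condition for x^4 / (exp(x) - 1), i.e. the
    blackbody photon flux per unit wavelength written in terms of
    x = h c / (lambda k_B T).
    """
    x = 4.0
    while True:
        new_x = 4.0 * (1.0 - math.exp(-x))
        if abs(new_x - x) < tol:
            return new_x
        x = new_x


def peak_photon_wavelength(temperature_k):
    """Wavelength (in metres) at which the blackbody photon flux per unit wavelength peaks."""
    return H * C / (K_B * temperature_k * photon_peak_x())


if __name__ == "__main__":
    for label, temp in [("Sun-like star", 5772.0), ("cool M-dwarf (assumed 3000 K)", 3000.0)]:
        print(label, round(peak_photon_wavelength(temp) * 1e9, 1), "nm")
```

Cooler stars push the peak to longer wavelengths, which is the direction of the trend the abstract describes for stars cooler than the Sun.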
-
Interpretable (not just posthoc-explainable) heterogeneous survivor bias-corrected treatment effects for assignment of postdischarge interventions to prevent readmissions
Authors:
Hongjing Xia,
Joshua C. Chang,
Sarah Nowak,
Sonya Mahajan,
Rohit Mahajan,
Ted L. Chang,
Carson C. Chow
Abstract:
We used survival analysis to quantify the impact of postdischarge evaluation and management (E/M) services in preventing hospital readmission or death. Our approach avoids a specific pitfall of applying machine learning to this problem: an inflated estimate of the effect of interventions due to survivor bias, where the magnitude of inflation may be conditional on heterogeneous confounders in the population. This bias arises simply because, in order to receive an intervention after discharge, a person must not have been readmitted in the intervening period. After deriving an expression for this phantom effect, we controlled for this and other biases within an inherently interpretable Bayesian survival framework. We identified case management services as being the most impactful for reducing readmissions overall.
Submitted 3 August, 2023; v1 submitted 19 April, 2023;
originally announced April 2023.
-
Deep Learning in Protein Structural Modeling and Design
Authors:
Wenhao Gao,
Sai Pooja Mahajan,
Jeremias Sulam,
Jeffrey J. Gray
Abstract:
Deep learning is catalyzing a scientific revolution fueled by big data, accessible toolkits, and powerful computational resources, impacting many fields including protein structural modeling. Protein structural modeling, such as predicting structure from amino acid sequence and evolutionary information, designing proteins toward desirable functionality, or predicting properties or behavior of a protein, is critical to understand and engineer biological systems at the molecular level. In this review, we summarize the recent advances in applying deep learning techniques to tackle problems in protein structural modeling and design. We dissect the emerging approaches using deep learning techniques for protein structural modeling, and discuss advances and challenges that must be addressed. We argue for the central importance of structure, following the "sequence -> structure -> function" paradigm. This review is directed to help both computational biologists to gain familiarity with the deep learning methods applied in protein modeling, and computer scientists to gain perspective on the biologically meaningful problems that may benefit from deep learning techniques.
Submitted 16 July, 2020;
originally announced July 2020.
-
Feedbacks from the metabolic network to the genetic network reveal regulatory modules in E. coli and B. subtilis
Authors:
Santhust Kumar,
Saurabh Mahajan,
Sanjay Jain
Abstract:
The genetic regulatory network (GRN) plays a key role in controlling the response of the cell to changes in the environment. Although the structure of GRNs has been the subject of many studies, their large scale structure in the light of feedbacks from the metabolic network (MN) has received relatively little attention. Here we study the causal structure of the GRNs, namely the chain of influence of one component on the other, taking into account feedback from the MN. First we consider the GRNs of E. coli and B. subtilis without feedback from MN and illustrate their causal structure. Next we augment the GRNs with feedback from their respective MNs by including (a) links from genes coding for enzymes to metabolites produced or consumed in reactions catalyzed by those enzymes and (b) links from metabolites to genes coding for transcription factors whose transcriptional activity the metabolites alter by binding to them. We find that the inclusion of feedback from MN into GRN significantly affects its causal structure, in particular the number of levels and relative positions of nodes in the hierarchy, and the number and size of the strongly connected components (SCCs). We then study the functional significance of the SCCs. For this we identify condition specific feedbacks from the MN into the GRN by retaining only those enzymes that are essential for growth in specific environmental conditions simulated via the technique of flux balance analysis (FBA). We find that the SCCs of the GRN augmented by these feedbacks can be ascribed specific functional roles in the organism. Our algorithmic approach thus reveals relatively autonomous subsystems with specific functionality, or regulatory modules in the organism. This automated approach could be useful in identifying biologically relevant modules in other organisms for which network data is available, but whose biology is less well studied.
Submitted 9 March, 2018;
originally announced March 2018.
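The abstract above describes augmenting a genetic regulatory network with metabolic feedback links and then examining its strongly connected components (SCCs). The sketch below shows that effect on a made-up four-node graph, using Kosaraju's algorithm as one standard way to find SCCs; the abstract does not say which algorithm the authors used. The node names (`tf_gene_B`, `enzyme_gene_A`, `metabolite_X`, `gene_C`) and the function name are hypothetical.

```python
def strongly_connected_components(graph):
    """Return the SCCs of a directed graph given as {node: [successors]} (Kosaraju's algorithm)."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # First pass: record nodes in order of DFS completion.
    visited, order = set(), []

    def dfs(node):
        visited.add(node)
        for nxt in graph.get(node, []):
            if nxt not in visited:
                dfs(nxt)
        order.append(node)

    for node in sorted(nodes):
        if node not in visited:
            dfs(node)

    # Build the reversed graph.
    reverse = {node: [] for node in nodes}
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)

    # Second pass: explore the reversed graph in decreasing finish order.
    assigned, components = set(), []
    for node in reversed(order):
        if node in assigned:
            continue
        stack, component = [node], set()
        assigned.add(node)
        while stack:
            current = stack.pop()
            component.add(current)
            for prev in reverse[current]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


# Toy regulatory network without metabolic feedback: no cycles.
grn_only = {"tf_gene_B": ["enzyme_gene_A", "gene_C"]}

# Same network with the two kinds of feedback links described in the abstract:
# (a) enzyme-coding gene -> metabolite, (b) metabolite -> transcription-factor gene.
grn_with_feedback = {
    "tf_gene_B": ["enzyme_gene_A", "gene_C"],
    "enzyme_gene_A": ["metabolite_X"],
    "metabolite_X": ["tf_gene_B"],
}

if __name__ == "__main__":
    print(max(strongly_connected_components(grn_only), key=len))
    print(max(strongly_connected_components(grn_with_feedback), key=len))
```

Without feedback every node is its own component. Adding the two feedback links closes a cycle, so three of the four nodes merge into a single SCC, which is the kind of structural change the abstract reports at genome scale.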
-
Spatial and Temporal Sensing Limits of Microtubule Polarization in Neuronal Growth Cones by Intracellular Gradients and Forces
Authors:
Saurabh Mahajan,
Chaitanya A. Athale
Abstract:
Neuronal growth cones are the most sensitive amongst eukaryotic cells in responding to directional chemical cues. Although a dynamic microtubule cytoskeleton has been shown to be essential for growth cone turning, the precise nature of the coupling of the spatial cue with microtubule polarization is less well understood. Here we present a computational model of microtubule polarization in a turning neuronal growth cone (GC). We explore the limits of directional cues in modifying the spatial polarization of microtubules by testing the role of microtubule dynamics, gradients of regulators and retrograde forces along filopodia. We analyze the steady state and transition behavior of microtubules on being presented with a directional stimulus. The model makes novel predictions about the minimal angular spread of the chemical signal at the growth cone and the fastest polarization times. A regulatory reaction-diffusion network based on the cyclic phosphorylation-dephosphorylation of a regulator predicts that the receptor signal magnitude, and not feedback loops or amplifications in the network, can generate the maximal polarization of microtubules. Using both the phenomenological and network models, we have demonstrated some of the physical limits within which the microtubule (MT) polarization system works in a turning neuron.
Submitted 1 November, 2012;
originally announced November 2012.