-
A principled framework to assess information theoretical fitness of brain functional sub-circuits
Authors:
Duy Duong-Tran,
Nghi Nguyen,
Shizhuo Mu,
Jiong Chen,
Jingxuan Bao,
Frederick Xu,
Sumita Garai,
Jose Cadena-Pico,
Alan David Kaplan,
Tianlong Chen,
Yize Zhao,
Li Shen,
Joaquín Goñi
Abstract:
In systems and network neuroscience, many common practices in brain connectomic analysis are often not properly scrutinized. One such practice is mapping a predetermined set of sub-circuits, like functional networks (FNs), onto subjects' functional connectomes (FCs) without adequately assessing the information-theoretic appropriateness of the partition. Another practice that goes unchallenged is thresholding weighted FCs to remove spurious connections without justifying the chosen threshold. This paper leverages recent theoretical advances in Stochastic Block Models (SBMs) to formally define and quantify the information-theoretic fitness (e.g., prominence) of a predetermined set of FNs when mapped to individual FCs under different fMRI task conditions. Our framework allows for evaluating any combination of FC granularity, FN partition, and thresholding strategy, thereby optimizing these choices to preserve important topological features of the human brain connectomes. Our results pave the way for the proper use of predetermined FNs and thresholding methods and provide insights for future research in individualized parcellations.
Submitted 26 June, 2024;
originally announced June 2024.
-
Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models
Authors:
Songtao Liu,
Hanjun Dai,
Yue Zhao,
Peng Liu
Abstract:
Molecule synthesis through machine learning is one of the fundamental problems in drug discovery. Current data-driven strategies employ one-step retrosynthesis models and search algorithms to predict synthetic routes in a top-bottom manner. Despite their effective performance, these strategies face limitations in the molecule synthetic route generation due to a greedy selection of the next molecule set without any lookahead. Furthermore, existing strategies cannot control the generation of synthetic routes based on possible criteria such as material costs, yields, and step count. In this work, we propose a general and principled framework via conditional residual energy-based models (EBMs), that focus on the quality of the entire synthetic route based on the specific criteria. By incorporating an additional energy-based function into our probabilistic model, our proposed algorithm can enhance the quality of the most probable synthetic routes (with higher probabilities) generated by various strategies in a plug-and-play fashion. Extensive experiments demonstrate that our framework can consistently boost performance across various strategies and outperforms previous state-of-the-art top-1 accuracy by a margin of 2.5%. Code is available at https://github.com/SongtaoLiu0823/CREBM.
Submitted 4 June, 2024;
originally announced June 2024.
-
Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis
Authors:
Zeyu Zhang,
Yuanshen Zhao,
Jingxian Duan,
Yaou Liu,
Hairong Zheng,
Dong Liang,
Zhenyu Zhang,
Zhi-Cheng Li
Abstract:
The diagnosis and prognosis of cancer are typically based on multi-modal clinical data, including histology images and genomic data, due to the complex pathogenesis and high heterogeneity. Despite the advancements in digital pathology and high-throughput genome sequencing, establishing effective multi-modal fusion models for survival prediction and revealing the potential association between histopathology and transcriptomics remains challenging. In this paper, we propose Pathology-Genome Heterogeneous Graph (PGHG) that integrates whole slide images (WSI) and bulk RNA-Seq expression data with heterogeneous graph neural network for cancer survival analysis. The PGHG consists of biological knowledge-guided representation learning network and pathology-genome heterogeneous graph. The representation learning network utilizes the biological prior knowledge of intra-modal and inter-modal data associations to guide the feature extraction. The node features of each modality are updated through attention-based graph learning strategy. Unimodal features and bi-modal fused features are extracted via attention pooling module and then used for survival prediction. We evaluate the model on low-grade gliomas, glioblastoma, and kidney renal papillary cell carcinoma datasets from the Cancer Genome Atlas (TCGA) and the First Affiliated Hospital of Zhengzhou University (FAHZU). Extensive experimental results demonstrate that the proposed method outperforms both unimodal and other multi-modal fusion models. For demonstrating the model interpretability, we also visualize the attention heatmap of pathological images and utilize integrated gradient algorithm to identify important tissue structure, biological pathways and key genes.
Submitted 11 April, 2024;
originally announced April 2024.
-
Learnable Community-Aware Transformer for Brain Connectome Analysis with Token Clustering
Authors:
Yanting Yang,
Beidi Zhao,
Zhuohao Ni,
Yize Zhao,
Xiaoxiao Li
Abstract:
Neuroscientific research has revealed that the complex brain network can be organized into distinct functional communities, each characterized by a cohesive group of regions of interest (ROIs) with strong interconnections. These communities play a crucial role in comprehending the functional organization of the brain and its implications for neurological conditions, including Autism Spectrum Disorder (ASD) and biological differences, such as in gender. Traditional models have been constrained by the necessity of predefined community clusters, limiting their flexibility and adaptability in deciphering the brain's functional organization. Furthermore, these models were restricted by a fixed number of communities, hindering their ability to accurately represent the brain's dynamic nature. In this study, we present a token clustering brain transformer-based model ($\texttt{TC-BrainTF}$) for joint community clustering and classification. Our approach proposes a novel token clustering (TC) module based on the transformer architecture, which utilizes learnable prompt tokens with orthogonal loss where each ROI embedding is projected onto the prompt embedding space, effectively clustering ROIs into communities and reducing the dimensions of the node representation via merging with communities. Our results demonstrate that our learnable community-aware model $\texttt{TC-BrainTF}$ offers improved accuracy in identifying ASD and classifying genders through rigorous testing on ABIDE and HCP datasets. Additionally, the qualitative analysis on $\texttt{TC-BrainTF}$ has demonstrated the effectiveness of the designed TC module and its relevance to neuroscience interpretations.
Submitted 12 March, 2024;
originally announced March 2024.
-
Analysis of a Leslie-Gower model with Alle effects, cooperative hunting, and constant placement rates
Authors:
Yonghui Zhao
Abstract:
This paper investigates the dynamical properties of the Leslie-Gower model with Allee effects, cooperative hunting, and constant placement rates. The conditions for the existence of the triple equilibrium point of the model are first analyzed. Subsequently, normal form theory and the qualitative theory of planar systems are applied to show that, under different parameter conditions, the triple equilibrium point can be a node of codimension 2 or an equilibrium point of codimension 3. Finally, it is proved that the system undergoes a codimension-2 bifurcation in the vicinity of the node, with cooperative hunting and placement rate as bifurcation parameters.
Submitted 7 March, 2024;
originally announced March 2024.
-
Brain-inspired and Self-based Artificial Intelligence
Authors:
Yi Zeng,
Feifei Zhao,
Yuxuan Zhao,
Dongcheng Zhao,
Enmeng Lu,
Qian Zhang,
Yuwei Wang,
Hui Feng,
Zhuoya Zhao,
Jihang Wang,
Qingqun Kong,
Yinqian Sun,
Yang Li,
Guobin Shen,
Bing Han,
Yiting Dong,
Wenxuan Pan,
Xiang He,
Aorigele Bao,
Jin Wang
Abstract:
The question "Can machines think?" and the Turing Test, which assesses whether machines could achieve human-level intelligence, are among the roots of AI. With the philosophical argument "I think, therefore I am", this paper challenges the idea of a "thinking machine" supported by current AIs, since there is no sense of self in them. Current artificial intelligence is only seemingly intelligent information processing: it does not truly understand, is not subjectively aware of itself, and does not perceive the world with the self as human intelligence does. In this paper, we introduce a Brain-inspired and Self-based Artificial Intelligence (BriSe AI) paradigm. This BriSe AI paradigm is dedicated to coordinating various cognitive functions and learning strategies in a self-organized manner to build human-level AI models and robotic applications. Specifically, BriSe AI emphasizes the crucial role of the Self in shaping the future AI, rooted in a practical hierarchical Self framework, including Perception and Learning, Bodily Self, Autonomous Self, Social Self, and Conceptual Self. The hierarchical framework of the Self highlights self-based environment perception, self-bodily modeling, autonomous interaction with the environment, social interaction and collaboration with others, and even more abstract understanding of the Self. Furthermore, the positive mutual promotion and support among multiple levels of Self, as well as between Self and learning, enhance the BriSe AI's conscious understanding of information and flexible adaptation to complex environments, serving as a driving force propelling BriSe AI towards real Artificial General Intelligence.
Submitted 28 February, 2024;
originally announced February 2024.
-
Feedback Efficient Online Fine-Tuning of Diffusion Models
Authors:
Masatoshi Uehara,
Yulai Zhao,
Kevin Black,
Ehsan Hajiramezanali,
Gabriele Scalia,
Nathaniel Lee Diamant,
Alex M Tseng,
Sergey Levine,
Tommaso Biancalani
Abstract:
Diffusion models excel at modeling complex data distributions, including those of images, proteins, and small molecules. However, in many cases, our goal is to model parts of the distribution that maximize certain properties: for example, we may want to generate images with high aesthetic quality, or molecules with high bioactivity. It is natural to frame this as a reinforcement learning (RL) problem, in which the objective is to fine-tune a diffusion model to maximize a reward function that corresponds to some property. Even with access to online queries of the ground-truth reward function, efficiently discovering high-reward samples can be challenging: they might have a low probability in the initial distribution, and there might be many infeasible samples that do not even have a well-defined reward (e.g., unnatural images or physically impossible molecules). In this work, we propose a novel reinforcement learning procedure that efficiently explores on the manifold of feasible samples. We present a theoretical analysis providing a regret guarantee, as well as empirical validation across three domains: images, biological sequences, and molecules.
Submitted 27 February, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
DiscDiff: Latent Diffusion Model for DNA Sequence Generation
Authors:
Zehui Li,
Yuhao Ni,
William A V Beardall,
Guoxuan Xia,
Akashaditya Das,
Guy-Bart Stan,
Yiren Zhao
Abstract:
This paper introduces a novel framework for DNA sequence generation, comprising two key components: DiscDiff, a Latent Diffusion Model (LDM) tailored for generating discrete DNA sequences, and Absorb-Escape, a post-training algorithm designed to refine these sequences. Absorb-Escape enhances the realism of the generated sequences by correcting `round errors' inherent in the conversion process between latent and input spaces. Our approach not only sets new standards in DNA sequence generation but also demonstrates superior performance over existing diffusion models, in generating both short and long DNA sequences. Additionally, we introduce EPD-GenDNA, the first comprehensive, multi-species dataset for DNA generation, encompassing 160,000 unique sequences from 15 species. We hope this study will advance the generative modelling of DNA, with potential implications for gene therapy and protein production.
Submitted 17 April, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Empirical Evidence for the Fragment level Understanding on Drug Molecular Structure of LLMs
Authors:
Xiuyuan Hu,
Guoqing Liu,
Yang Zhao,
Hao Zhang
Abstract:
AI for drug discovery has been a research hotspot in recent years, and SMILES-based language models have been increasingly applied in drug molecular design. However, no work has explored whether and how language models understand the chemical spatial structure from 1D sequences. In this work, we pre-train a transformer model on chemical language and fine-tune it toward drug design objectives, and investigate the correspondence between high-frequency SMILES substrings and molecular fragments. The results indicate that language models can understand chemical structures from the perspective of molecular fragments, and the structural knowledge learned through fine-tuning is reflected in the high-frequency SMILES substrings generated by the model.
Submitted 15 January, 2024;
originally announced January 2024.
-
De novo Drug Design using Reinforcement Learning with Multiple GPT Agents
Authors:
Xiuyuan Hu,
Guoqing Liu,
Yang Zhao,
Hao Zhang
Abstract:
De novo drug design is a pivotal issue in pharmacology and a new area of focus in AI for science research. A central challenge in this field is to generate molecules with specific properties while also producing a wide range of diverse candidates. Although advanced technologies such as transformer models and reinforcement learning have been applied in drug design, their potential has not been fully realized. Therefore, we propose MolRL-MGPT, a reinforcement learning algorithm with multiple GPT agents for drug molecular generation. To promote molecular diversity, we encourage the agents to collaborate in searching for desirable molecules in diverse directions. Our algorithm has shown promising results on the GuacaMol benchmark and exhibits efficacy in designing inhibitors against SARS-CoV-2 protein targets. The codes are available at: https://github.com/HXYfighter/MolRL-MGPT.
Submitted 21 December, 2023;
originally announced January 2024.
-
SAPNet: a deep learning model for identification of single-molecule peptide post-translational modifications with surface enhanced Raman spectroscopy
Authors:
Mulusew W. Yaltaye,
Yingqi Zhao,
Eva Bozo,
Pei-Lin Xin,
Vahid Farrah,
Francesco De Angelis,
Jian-An Huang
Abstract:
Nanopore resistive pulse sensors are emerging technologies for single-molecule protein sequencing, but they can hardly detect small post-translational modifications (PTMs) such as hydroxylation at the single-molecule level. While a combination of surface enhanced Raman spectroscopy (SERS) with plasmonic nanopores can detect small PTMs, the blinking Raman peaks in the single-molecule SERS spectra pose a major challenge for data analysis and PTM identification. Herein, we developed and validated a one-dimensional convolutional neural network (1D-CNN), named Single Amino acid and Peptide Network (SAPNet), for identifying amino acids and peptides from their PTMs, including hydroxylation and phosphorylation, by their single-molecule SERS spectra. Our work combines cutting-edge plasmonic nanopore technology for SERS signal acquisition and deep learning for fully automated extraction of information from the SERS signals. The SAPNet model achieved an overall accuracy of 99.66% for the identification of amino acids from their modification, and 98.38% for the identification of peptides from their PTM translation. We also evaluated the model on out-of-sample examples, with good performance. Our work can be beneficial for early detection of diseases such as cancers and Alzheimer's disease.
Submitted 5 January, 2024;
originally announced January 2024.
-
GenoCraft: A Comprehensive, User-Friendly Web-Based Platform for High-Throughput Omics Data Analysis and Visualization
Authors:
Yingzhou Lu,
Minjie Shen,
Yue Zhao,
Chenhao Li,
Fan Meng,
Xiao Wang,
David Herrington,
Yue Wang,
Tim Fu,
Capucine Van Rechem
Abstract:
The surge in high-throughput omics data has reshaped the landscape of biological research, underlining the need for powerful, user-friendly data analysis and interpretation tools. This paper presents GenoCraft, a web-based comprehensive software solution designed to handle the entire pipeline of omics data processing. GenoCraft offers a unified platform featuring advanced bioinformatics tools, covering all aspects of omics data analysis. It encompasses a range of functionalities, such as normalization, quality control, differential analysis, network analysis, pathway analysis, and diverse visualization techniques. This software makes state-of-the-art omics data analysis more accessible to a wider range of users. With GenoCraft, researchers and data scientists have access to an array of cutting-edge bioinformatics tools under a user-friendly interface, making it a valuable resource for managing and analyzing large-scale omics data. The API with an interactive web interface is publicly available at https://genocraft.stanford.edu/. We also release all the code at https://github.com/futianfan/GenoCraft.
Submitted 21 December, 2023;
originally announced December 2023.
-
Learning High-Order Relationships of Brain Regions
Authors:
Weikang Qiu,
Huangrui Chu,
Selena Wang,
Haolan Zuo,
Xiaoxiao Li,
Yize Zhao,
Rex Ying
Abstract:
Discovering reliable and informative relationships among brain regions from functional magnetic resonance imaging (fMRI) signals is essential in phenotypic predictions. Most of the current methods fail to accurately characterize those interactions because they only focus on pairwise connections and overlook the high-order relationships of brain regions. We propose that these high-order relationships should be maximally informative and minimally redundant (MIMR). However, identifying such high-order relationships is challenging and under-explored due to the exponential search space and the absence of a tractable objective. In response to this gap, we propose a novel method named HYBRID which aims to extract MIMR high-order relationships from fMRI data. HYBRID employs a CONSTRUCTOR to identify hyperedge structures, and a WEIGHTER to compute a weight for each hyperedge, which avoids searching in exponential space. HYBRID achieves the MIMR objective through an innovative information bottleneck framework named multi-head drop-bottleneck with theoretical guarantees. Our comprehensive experiments demonstrate the effectiveness of our model. Our model outperforms the state-of-the-art predictive model by an average of 11.2%, regarding the quality of hyperedges measured by CPM, a standard protocol for studying brain connections.
Submitted 8 June, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
-
The physical origin of aneurysm growth, dissection, and rupture
Authors:
Tom Y. Zhao,
Jin-Tae Kim,
Min Cho,
Akhil Narang,
John A. Rogers,
Neelesh A. Patankar
Abstract:
Rupture of aortic aneurysms is by far the most fatal heart disease, with a mortality rate exceeding 80%. There are no reliable clinical protocols to predict growth, dissection, and rupture because the fundamental physics driving aneurysm progression is unknown. Here, via in-vitro experiments, we show that a blood-wall, fluttering instability manifests in synthetic arteries under pulsatile forcing. We establish a phase space to prove that the transition from stable flow to unstable aortic flutter is accurately predicted by a flutter instability parameter derived from first principles. Time resolved strain maps of the evolving system reveal the dynamical characteristics of aortic flutter that drive aneurysm progression. We show that low level instability can trigger permanent aortic growth, even in the absence of material remodeling. Sufficiently large flutter beyond a secondary threshold localizes strain in the walls to the length scale clinically observed in aortic dissection. Lastly, significant physical flutter beyond a tertiary threshold can ultimately induce aneurysm rupture via failure modes reported from necropsy. Resolving the fundamental physics of aneurysm progression directly leads to clinical protocols that forecast growth as well as intercept dissection and rupture by pinpointing their physical origin.
Submitted 1 November, 2023;
originally announced November 2023.
-
Joint Design of Protein Sequence and Structure based on Motifs
Authors:
Zhenqiao Song,
Yunlong Zhao,
Yufei Song,
Wenxian Shi,
Yang Yang,
Lei Li
Abstract:
Designing novel proteins with desired functions is crucial in biology and chemistry. However, most existing work focuses on protein sequence design, leaving protein sequence and structure co-design underexplored. In this paper, we propose GeoPro, a method to design protein backbone structure and sequence jointly. Our motivation is that a protein sequence and its backbone structure constrain each other, and thus joint design of both can not only avoid nonfolding and misfolding but also produce more diverse candidates with desired functions. To this end, GeoPro is powered by an equivariant encoder for three-dimensional (3D) backbone structure and a protein sequence decoder guided by 3D geometry. Experimental results on two biologically significant metalloprotein datasets, including $β$-lactamases and myoglobins, show that our proposed GeoPro outperforms several strong baselines on most metrics. Remarkably, our method discovers novel $β$-lactamases and myoglobins which are not present in the Protein Data Bank (PDB) and UniProt. These proteins exhibit stable folding and active site environments reminiscent of those of natural proteins, demonstrating their excellent potential to be biologically functional.
Submitted 3 October, 2023;
originally announced October 2023.
-
The bionic neural network for external simulation of human locomotor system
Authors:
Yue Shi,
Shuhao Ma,
Yihui Zhao
Abstract:
Muscle forces and joint kinematics estimated with musculoskeletal (MSK) modeling techniques offer useful metrics describing movement quality. Model-based computational MSK models can interpret the dynamic interaction between the neural drive to muscles, muscle dynamics, body and joint kinematics, and kinetics. Still, such a set of solutions suffers from high computational time and muscle recruitment problems, especially in complex modeling. In recent years, data-driven methods have emerged as a promising alternative due to the benefits of flexibility and adaptability. However, a large amount of labeled training data is not easy to acquire. This paper proposes a physics-informed deep learning method based on MSK modeling to predict joint motion and muscle forces. The MSK model is embedded into the neural network as an ordinary differential equation (ODE) loss function with physiological parameters of muscle activation dynamics and muscle contraction dynamics to be identified. These parameters are automatically estimated during the training process, which guides the prediction of muscle forces combined with the MSK forward dynamics model. Experimental validations on two groups of data, including one benchmark dataset and one self-collected dataset from six healthy subjects, are performed. The results demonstrate that the proposed deep learning method can effectively identify subject-specific MSK physiological parameters and that the trained physics-informed forward-dynamics surrogate yields accurate motion and muscle force predictions.
Submitted 11 September, 2023;
originally announced September 2023.
-
Will More Expressive Graph Neural Networks do Better on Generative Tasks?
Authors:
Xiandong Zou,
Xiangyu Zhao,
Pietro Liò,
Yiren Zhao
Abstract:
Graph generation poses a significant challenge as it involves predicting a complete graph with multiple nodes and edges based on simply a given label. This task also carries fundamental importance to numerous real-world applications, including de-novo drug and molecular design. In recent years, several successful methods have emerged in the field of graph generation. However, these approaches suffer from two significant shortcomings: (1) the underlying Graph Neural Network (GNN) architectures used in these methods are often underexplored; and (2) these methods are often evaluated on only a limited number of metrics. To fill this gap, we investigate the expressiveness of GNNs under the context of the molecular graph generation task, by replacing the underlying GNNs of graph generative models with more expressive GNNs. Specifically, we analyse the performance of six GNNs in two different generative frameworks -- autoregressive generation models, such as GCPN and GraphAF, and one-shot generation models, such as GraphEBM -- on six different molecular generative objectives on the ZINC-250k dataset. Through our extensive experiments, we demonstrate that advanced GNNs can indeed improve the performance of GCPN, GraphAF, and GraphEBM on molecular generation tasks, but GNN expressiveness is not a necessary condition for a good GNN-based generative model. Moreover, we show that GCPN and GraphAF with advanced GNNs can achieve state-of-the-art results across 17 other non-GNN-based graph generative approaches, such as variational autoencoders and Bayesian optimisation models, on the proposed molecular generative objectives (DRD2, Median1, Median2), which are important metrics for de-novo molecular design.
Submitted 20 February, 2024; v1 submitted 23 August, 2023;
originally announced August 2023.
-
A Data-Driven Approach to Morphogenesis under Structural Instability
Authors:
Yingjie Zhao,
Zhi** Xu
Abstract:
Morphological development into evolutionary patterns under structural instability is ubiquitous in living systems and often of vital importance for engineering structures. Here we propose a data-driven approach to understand and predict their spatiotemporal complexities. A machine-learning framework is proposed based on the physical modeling of morphogenesis triggered by internal or external forcing. Digital libraries of structural patterns are constructed from the simulation data, which are then used to recognize the abnormalities, predict their development, and assist in risk assessment and prognosis. The capabilities to identify the key bifurcation characteristics and predict the history-dependent development from the global and local features are demonstrated by examples of brain growth and aerospace structural design, which offer guidelines for disease diagnosis/prognosis and instability-tolerant design.
Submitted 22 August, 2023;
originally announced August 2023.
-
Acoustofluidic Engineering Functional Vessel-on-a-Chip
Authors:
Yue Wu,
Yuwen Zhao,
Khayrul Islam,
Yuyuan Zhou,
Saeed Omidi,
Yevgeny Berdichevsky,
Yaling Liu
Abstract:
Construction of in vitro vascular models is of great significance to various areas of biomedical research, such as pharmacokinetics and hemodynamics, and is thus an important direction in tissue engineering. In this work, a standing surface acoustic wave field was constructed to spatially arrange suspended endothelial cells into a designated pattern. After the acoustic field was withdrawn, the cell pattern was maintained by the solidified hydrogel. Then, interstitial flow was provided to activate vessel tube formation. Thus, a functional vessel-on-a-chip was engineered with specific vessel geometry. Vascular function, including perfusability and vascular barrier function, was characterized by bead loading and dextran diffusion, respectively. A computational atomistic simulation model was proposed to illustrate how solutes cross the vascular lipid bilayer. The reported acoustofluidic methodology is capable of facile and reproducible fabrication of functional vessel networks with specific geometry. It is promising for facilitating the development of both fundamental research and regenerative therapy.
Submitted 17 August, 2023; v1 submitted 11 August, 2023;
originally announced August 2023.
-
DNAGPT: A Generalized Pre-trained Tool for Versatile DNA Sequence Analysis Tasks
Authors:
Daoan Zhang,
Weitong Zhang,
Yu Zhao,
Jianguo Zhang,
Bing He,
Chenchen Qin,
Jianhua Yao
Abstract:
Pre-trained large language models demonstrate potential in extracting information from DNA sequences, yet adapting to a variety of tasks and data modalities remains a challenge. To address this, we propose DNAGPT, a generalized DNA pre-training model trained on over 200 billion base pairs from all mammals. By enhancing the classic GPT model with a binary classification task (DNA sequence order), a numerical regression task (guanine-cytosine content prediction), and a comprehensive token language, DNAGPT can handle versatile DNA analysis tasks while processing both sequence and numerical data. Our evaluation of genomic signal and region recognition, mRNA abundance regression, and artificial genomes generation tasks demonstrates DNAGPT's superior performance compared to existing models designed for specific downstream tasks, benefiting from pre-training using the newly designed model structure.
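The guanine-cytosine content named above as a regression target is the fraction of G and C bases in a sequence. A minimal sketch of that quantity follows; the helper name `gc_content` and the choice to skip ambiguity codes such as N are assumptions for illustration, not taken from DNAGPT.

```python
def gc_content(seq: str) -> float:
    """Fraction of G/C among the A, C, G, T bases of `seq` (hypothetical helper)."""
    # Assumption: ambiguity codes such as N are ignored rather than counted.
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)
```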
Submitted 30 August, 2023; v1 submitted 11 July, 2023;
originally announced July 2023.
-
Multi-omics Prediction from High-content Cellular Imaging with Deep Learning
Authors:
Rahil Mehrizi,
Arash Mehrjou,
Maryana Alegro,
Yi Zhao,
Benedetta Carbone,
Carl Fishwick,
Johanna Vappiani,
**g Bi,
Siobhan Sanford,
Hakan Keles,
Marcus Bantscheff,
Cuong Nguyen,
Patrick Schwab
Abstract:
High-content cellular imaging, transcriptomics, and proteomics data provide rich and complementary views on the molecular layers of biology that influence cellular states and function. However, the biological determinants through which changes in multi-omics measurements influence cellular morphology have not yet been systematically explored, and the degree to which cell imaging could potentially enable the prediction of multi-omics directly from cell imaging data is therefore currently unclear. Here, we address the question of whether it is possible to predict bulk multi-omics measurements directly from cell images using Image2Omics - a deep learning approach that predicts multi-omics in a cell population directly from high-content images of cells stained with multiplexed fluorescent dyes. We perform an experimental evaluation in gene-edited macrophages derived from human induced pluripotent stem cells (hiPSC) under multiple stimulation conditions and demonstrate that Image2Omics achieves significantly better performance in predicting transcriptomics and proteomics measurements directly from cell images than predictions based on the mean observed training set abundance. We observed significant predictability of abundances for 4927 (18.72%; 95% CI: 6.52%, 35.52%) and 3521 (13.38%; 95% CI: 4.10%, 32.21%) transcripts out of 26137 in M1 and M2-stimulated macrophages respectively and for 422 (8.46%; 95% CI: 0.58%, 25.83%) and 697 (13.98%; 95% CI: 2.41%, 32.83%) proteins out of 4986 in M1 and M2-stimulated macrophages respectively. Our results show that some transcript and protein abundances are predictable from cell imaging and that cell imaging may potentially, in some settings and depending on the mechanisms of interest and desired performance threshold, even be a scalable and resource-efficient substitute for multi-omics measurements.
Submitted 21 May, 2024; v1 submitted 15 June, 2023;
originally announced June 2023.
-
Genomic Interpreter: A Hierarchical Genomic Deep Neural Network with 1D Shifted Window Transformer
Authors:
Zehui Li,
Akashaditya Das,
William A V Beardall,
Yiren Zhao,
Guy-Bart Stan
Abstract:
Given the increasing volume and quality of genomics data, extracting new insights requires interpretable machine-learning models. This work presents Genomic Interpreter: a novel architecture for genomic assay prediction. This model outperforms the state-of-the-art models for genomic assay prediction tasks. Our model can identify hierarchical dependencies in genomic sites. This is achieved through the integration of 1D-Swin, a novel Transformer-based block designed by us for modelling long-range hierarchical data. Evaluated on a dataset containing 38,171 DNA segments of 17K base pairs, Genomic Interpreter demonstrates superior performance in chromatin accessibility and gene expression prediction and unmasks the underlying `syntax' of gene regulation.
Submitted 28 June, 2023; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains
Authors:
Matthew Dowling,
Yuan Zhao,
Il Memming Park
Abstract:
Latent Gaussian process (GP) models are widely used in neuroscience to uncover hidden state evolutions from sequential observations, mainly in neural activity recordings. While latent GP models provide a principled and powerful solution in theory, the intractable posterior in non-conjugate settings necessitates approximate inference schemes, which may lack scalability. In this work, we propose cvHM, a general inference framework for latent GP models leveraging Hida-Matérn kernels and conjugate computation variational inference (CVI). With cvHM, we are able to perform variational inference of latent neural trajectories with linear time complexity for arbitrary likelihoods. The reparameterization of stationary kernels using Hida-Matérn GPs helps us connect the latent variable models that encode prior assumptions through dynamical systems to those that encode trajectory assumptions through GPs. In contrast to previous work, we use bidirectional information filtering, leading to a more concise implementation. Furthermore, we employ the Whittle approximate likelihood to achieve highly efficient hyperparameter learning.
Submitted 1 June, 2023;
originally announced June 2023.
-
Real-Time Variational Method for Learning Neural Trajectory and its Dynamics
Authors:
Matthew Dowling,
Yuan Zhao,
Il Memming Park
Abstract:
Latent variable models have become instrumental in computational neuroscience for reasoning about neural computation. This has fostered the development of powerful offline algorithms for extracting latent neural trajectories from neural recordings. However, despite the potential of real time alternatives to give immediate feedback to experimentalists, and enhance experimental design, they have received markedly less attention. In this work, we introduce the exponential family variational Kalman filter (eVKF), an online recursive Bayesian method aimed at inferring latent trajectories while simultaneously learning the dynamical system generating them. eVKF works for arbitrary likelihoods and utilizes the constant base measure exponential family to model the latent state stochasticity. We derive a closed-form variational analogue to the predict step of the Kalman filter which leads to a provably tighter bound on the ELBO compared to another online variational method. We validate our method on synthetic and real-world data, and, notably, show that it achieves competitive performance
Submitted 18 May, 2023;
originally announced May 2023.
-
Establishing group-level brain structural connectivity incorporating anatomical knowledge under latent space modeling
Authors:
Selena Wang,
Yiting Wang,
Frederick H. Xu,
Li Shen,
Yize Zhao
Abstract:
Brain structural connectivity, capturing the white matter fiber tracts among brain regions inferred by diffusion MRI (dMRI), provides a unique characterization of brain anatomical organization. One fundamental question to address with structural connectivity is how to properly summarize and perform statistical inference for a group-level connectivity architecture, for instance, under different sex groups, or disease cohorts. Existing analyses commonly summarize group-level brain connectivity by a simple entry-wise sample mean or median across individual brain connectivity matrices. However, such a heuristic approach fully ignores the associations among structural connections and the topological properties of brain networks. In this project, we propose a latent space-based generative network model to estimate group-level brain connectivity. We name our method the attributes-informed brain connectivity (ABC) model, which compared with existing group-level connectivity estimations, (1) offers an interpretable latent space representation of the group-level connectivity, (2) incorporates the anatomical knowledge of nodes and tests its co-varying relationship with connectivity and (3) quantifies the uncertainty and evaluates the likelihood of the estimated group-level effects against chance. We devise a novel Bayesian MCMC algorithm to estimate the model. By applying the ABC model to study brain structural connectivity stratified by sex among Alzheimer's Disease (AD) subjects and healthy controls incorporating the anatomical attributes (volume, thickness and area) on nodes, our method shows superior predictive power on out-of-sample structural connectivity and identifies meaningful sex-specific network neuromarkers for AD.
Submitted 21 February, 2023;
originally announced April 2023.
-
Brain-inspired bodily self-perception model for robot rubber hand illusion
Authors:
Yuxuan Zhao,
Enmeng Lu,
Yi Zeng
Abstract:
At the core of bodily self-consciousness is the perception of the ownership of one's body. Recent efforts to gain a deeper understanding of the mechanisms behind the brain's encoding of the self-body have led to various attempts to develop a unified theoretical framework to explain related behavioral and neurophysiological phenomena. A central question to be explained is how body illusions such as the rubber hand illusion actually occur. Despite the conceptual descriptions of the mechanisms of bodily self-consciousness and the possible relevant brain areas, the existing theoretical models still lack an explanation of the computational mechanisms by which the brain encodes the perception of one's body and how our subjectively perceived body illusions can be generated by neural networks. Here we integrate the biological findings of bodily self-consciousness to propose a Brain-inspired bodily self-perception model, by which perceptions of bodily self can be autonomously constructed without any supervision signals. We successfully validated our computational model with six rubber hand illusion experiments and a disability experiment on platforms including an iCub humanoid robot and simulated environments. The experimental results show that our model can not only replicate the behavioral and neural data of monkeys in biological experiments well, but also reasonably explain the causes and results of the rubber hand illusion at the neuronal level owing to its advantages in biological interpretability, thus helping to reveal the computational and neural mechanisms underlying the occurrence of the rubber hand illusion.
Submitted 26 April, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Knowledge from Large-Scale Protein Contact Prediction Models Can Be Transferred to the Data-Scarce RNA Contact Prediction Task
Authors:
Yiren Jian,
Chongyang Gao,
Chen Zeng,
Yunjie Zhao,
Soroush Vosoughi
Abstract:
RNA, whose functionality is largely determined by its structure, plays an important role in many biological activities. The prediction of pairwise structural proximity between each nucleotide of an RNA sequence can characterize the structural information of the RNA. Historically, this problem has been tackled by machine learning models using expert-engineered features and trained on scarce labeled datasets. Here, we find that the knowledge learned by a protein-coevolution Transformer-based deep neural network can be transferred to the RNA contact prediction task. As protein datasets are orders of magnitude larger than those for RNA contact prediction, our findings and the subsequent framework greatly reduce the data scarcity bottleneck. Experiments confirm that RNA contact prediction through transfer learning using a publicly available protein model is greatly improved. Our findings indicate that the learned structural patterns of proteins can be transferred to RNAs, opening up potential new avenues for research.
Submitted 18 January, 2024; v1 submitted 13 February, 2023;
originally announced February 2023.
-
Kainate receptor modulation by NETO2
Authors:
Lingli He,
Jiahui Sun,
Yiwei Gao,
Bin Li,
Yuhang Wang,
Yanli Dong,
Weidong An,
Hang Li,
Bei Yang,
Yuhan Ge,
Xuejun Cai Zhang,
Yun Stone Shi,
Yan Zhao
Abstract:
Glutamate-gated kainate receptors (KARs) are ubiquitous in the central nervous system of vertebrates, mediate synaptic transmission at the post-synapse, and modulate transmitter release at the pre-synapse. In the brain, the trafficking, gating kinetics, and pharmacology of KARs are tightly regulated by Neuropilin and tolloid-like proteins (Netos). Here we report cryo-EM structures of homo-tetrameric GluK2 in complex with Neto2 at inhibited and desensitized states, illustrating variable stoichiometry of GluK2-Neto2 complexes, with one or two Neto2 subunits associated with the GluK2. We find that Neto2 accesses only two broad faces of KARs, intermolecularly crosslinking the lower-lobe of ATDA/C, upper-lobe of LBDB/D, and lower-lobe of LBDA/C, illustrating how Neto2 regulates receptor-gating kinetics. The transmembrane helix of Neto2 is positioned proximal to the selectivity filter and competes with the amphiphilic H1-helix after M4 for interacting with an ICD formed by the M1-M2 linkers of the receptor, revealing how rectification is regulated by Neto2.
Submitted 2 February, 2023;
originally announced February 2023.
-
Brain Model State Space Reconstruction Using an LSTM Neural Network
Authors:
Yueyang Liu,
Artemio Soto-Breceda,
Yun Zhao,
Phillipa Karoly,
Mark J. Cook,
David B. Grayden,
Daniel Schmidt,
Levin Kuhlmann
Abstract:
Objective
Kalman filtering has previously been applied to track neural model states and parameters, particularly at the scale relevant to EEG. However, this approach lacks a reliable method to determine the initial filter conditions and assumes that the distribution of states remains Gaussian. This study presents an alternative, data-driven method to track the states and parameters of neural mass models (NMMs) from EEG recordings using deep learning techniques, specifically an LSTM neural network.
Approach
An LSTM filter was trained on simulated EEG data generated by a neural mass model using a wide range of parameters. With an appropriately customised loss function, the LSTM filter can learn the behaviour of NMMs. As a result, it can output the state vector and parameters of NMMs given observation data as the input.
Main Results
Test results using simulated data yielded correlations with R squared of around 0.99 and verified that the method is robust to noise and can be more accurate than a nonlinear Kalman filter when the initial conditions of the Kalman filter are not accurate. As an example of real-world application, the LSTM filter was also applied to real EEG data that included epileptic seizures, and revealed changes in connectivity strength parameters at the beginnings of seizures.
Significance
Tracking the state vector and parameters of mathematical brain models is of great importance in the area of brain modelling, monitoring, imaging and control. This approach has no need to specify the initial state vector and parameters, which is very difficult to do in practice because many of the variables being estimated cannot be measured directly in physiological experiments. This method may be applied using any neural mass model and, therefore, provides a general, novel, efficient approach to estimate brain model variables that are often difficult to measure.
Submitted 19 January, 2023;
originally announced January 2023.
-
A Comparative Study of Compartmental Models for COVID-19 Transmission in Ontario, Canada
Authors:
Yuxuan Zhao,
Samuel W. K. Wong
Abstract:
The number of confirmed COVID-19 cases reached over 1.3 million in Ontario, Canada by June 4, 2022. The continued spread of the virus underlying COVID-19 has been spurred by the emergence of variants since the initial outbreak in December, 2019. Much attention has thus been devoted to tracking and modelling the transmission of COVID-19. Compartmental models are commonly used to mimic epidemic transmission mechanisms and are easy to understand. Their performance in real-world settings, however, needs to be more thoroughly assessed. In this comparative study, we examine five compartmental models -- four existing ones and an extended model that we propose -- and analyze their ability to describe COVID-19 transmission in Ontario from January 2022 to June 2022.
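For readers unfamiliar with compartmental models, the simplest member of the family is the classic SIR model, which moves individuals from susceptible to infectious to recovered. The sketch below integrates it with a forward Euler scheme. It is only an illustration of the model class: it is not one of the five models compared in the paper, and the function name and parameter values are assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the classic SIR model (illustrative only).

    beta  : transmission rate per day (assumed value supplied by the caller)
    gamma : recovery rate per day
    Returns one (S, I, R) tuple per day, starting with the initial state.
    """
    n = s0 + i0 + r0                      # total population, held constant
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history
```

More elaborate compartmental models add further compartments (for example exposed or vaccinated individuals) and flows between them, but follow the same bookkeeping.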
Submitted 24 October, 2022;
originally announced October 2022.
-
Pattern formation of parasite-host model induced by fear effect
Authors:
Yong Ye,
Yi Zhao,
Jiaying Zhou
Abstract:
In this paper, based on the epidemiological microparasite model, a parasite-host model is established by considering the fear effect of susceptible individuals on infectors. We explore pattern formation with the help of numerical simulation and analyze, to varying degrees, the effects of fear, infected host mortality, population diffusion rate, and the reduced reproduction ability of infected hosts on population activities. Theoretically, we give general conditions for the stability of the model without diffusion and for the Turing instability caused by diffusion. Our results indicate how fear affects the distribution of the uninfected and infected hosts in the habitat and quantify the influence of the fear factor on the spatiotemporal pattern of the population. In addition, we analyze the influence of natural death rate, reproduction ability of infected hosts, and diffusion level of uninfected (infected) hosts on the spatiotemporal pattern, respectively. The results show that the growth of the pattern induced by an intensified fear effect follows a certain rule: cold spots $\rightarrow$ cold spots-stripes $\rightarrow$ cold stripes $\rightarrow$ hot stripes $\rightarrow$ hot spots-stripes $\rightarrow$ hot spots. Interestingly, natural mortality and the fear effect have opposite effects on the growth order of the pattern. From the perspective of biological significance, we find that the degree of the fear effect can reshape the population distribution to follow the above rule.
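The textbook conditions for a diffusion-driven (Turing) instability in a generic two-species reaction-diffusion system can be checked from the Jacobian at the homogeneous steady state. The sketch below encodes those standard conditions; it is not the paper's specific parasite-host model, and the function name and arguments (Jacobian entries `a11`..`a22`, diffusion coefficients `d1`, `d2`) are assumptions for illustration.

```python
def turing_unstable(a11, a12, a21, a22, d1, d2):
    """Standard two-species Turing test on the Jacobian [[a11, a12], [a21, a22]].

    Returns True when the steady state is stable without diffusion but
    some spatial mode becomes unstable once diffusion (d1, d2 > 0) is added.
    """
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # Stability of the well-mixed (non-diffusive) system.
    stable_without_diffusion = trace < 0 and det > 0
    # det(J - k^2 D) = d1*d2*k^4 - h*k^2 + det must dip below zero for some k^2 > 0.
    h = d2 * a11 + d1 * a22
    diffusion_destabilises = h > 0 and h * h > 4 * d1 * d2 * det
    return stable_without_diffusion and diffusion_destabilises
```

With equal diffusion coefficients the second condition contradicts the negative trace, which is why patterns of this kind need the two populations to diffuse at different rates.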
Submitted 18 May, 2022;
originally announced May 2022.
-
Systematic conformation-to-phenotype mapping via limited deep-sequencing of proteins
Authors:
Eugene Serebryany,
Victor Y. Zhao,
Kibum Park,
Amir Bitran,
Sunia A. Trauger,
Bogdan Budnik,
Eugene I. Shakhnovich
Abstract:
Non-native conformations drive protein misfolding diseases, complicate bioengineering efforts, and fuel molecular evolution. No current experimental technique is well-suited for elucidating them and their phenotypic effects. Especially intractable are the transient conformations populated by intrinsically disordered proteins. We describe an approach to systematically discover, stabilize, and purify native and non-native conformations, generated in vitro or in vivo, and directly link conformations to molecular, organismal, or evolutionary phenotypes. This approach involves high-throughput disulfide scanning (HTDS) of the entire protein. To reveal which disulfides trap which chromatographically resolvable conformers, we devised a deep-sequencing method for double-Cys variant libraries of proteins that precisely and simultaneously locates both Cys residues within each polypeptide. HTDS of the abundant E. coli periplasmic chaperone HdeA revealed distinct classes of disordered hydrophobic conformers with variable cytotoxicity depending on where the backbone was cross-linked. HTDS can bridge conformational and phenotypic landscapes for many proteins that function in disulfide-permissive environments.
Submitted 29 January, 2023; v1 submitted 12 April, 2022;
originally announced April 2022.
-
Local vaccination and systemic tumor suppression via irradiation and manganese adjuvant in mice
Authors:
Chunyang Lu,
**g Qian,
Jianfeng Lv,
**tao Han,
Xiaoyi Sun,
Junyi Chen,
Siwei Ding,
Zhusong Mei,
Yulan Liang,
Yuqi Ma,
Ye Zhao,
Chen Lin,
Yanying Zhao,
Yixing Geng,
Wenjun Ma,
Yugang Wang,
Xueqing Yan,
Gen Yang
Abstract:
In the present study, 4T-1 luc cells were irradiated with protons at an ultra-high (FLASH) dose rate or with gamma rays at a conventional dose rate, and then subcutaneously vaccinated into mice three times, with or without a Mn immuno-enhancing adjuvant. One week later, we injected untreated 4T-1 luc cells on the other side of the vaccinated mice and found that these cells almost never grew into tumors (1/17), whereas all controls without previous vaccination grew tumors (18/18). The result is very interesting, and the findings may help to explore in situ tumor vaccination as well as new combined radiotherapy strategies to effectively ablate primary and disseminated tumors. To our limited knowledge, this is the first paper reporting the high-efficiency induction of systemic vaccination that suppresses the progression of metastasized/disseminated tumors.
Submitted 26 April, 2021;
originally announced April 2021.
-
COSINE: A Web Server for Clonal and Subclonal Structure Inference and Evolution in Cancer Genomics
Authors:
Xiguo Yuan,
Yuan Zhao,
Yang Guo,
Linmei Ge,
Wei Liu,
Shiyu Wen,
Qi Li,
Zhangbo Wan,
Peina Zheng,
Tao Guo,
Zhida Li,
Martin Peifer,
Yupeng Cun
Abstract:
Cancers evolve from mutation of a single cell with sequential clonal and subclonal expansion of somatic mutation acquisition. Inferring clonal and subclonal structures from bulk or single cell tumor genomic sequencing data has a huge impact on cancer evolution studies. Clonal state and mutational order can provide detailed insight into tumor origin and its future development. In the past decade, a variety of methods have been developed for subclonal reconstruction using bulk tumor sequencing data. As these methods have been developed in different programming languages and using different input data formats, their use and comparison can be problematic. Therefore, we established a web server for clonal and subclonal structure inference and evolution of cancer genomic data (COSINE), which included 12 popular subclonal reconstruction methods. We decomposed each method via a detailed workflow of single processing steps with a user-friendly interface. To the best of our knowledge, this is the first web server providing online subclonal inference, including the most popular subclonal reconstruction methods. COSINE is freely accessible at www.clab-cosine.net or http://bio.rj.run:48996/cun-web.
Submitted 28 March, 2021;
originally announced March 2021.
-
Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development
Authors:
Kexin Huang,
Tianfan Fu,
Wenhao Gao,
Yue Zhao,
Yusuf Roohani,
Jure Leskovec,
Connor W. Coley,
Cao Xiao,
Jimeng Sun,
Marinka Zitnik
Abstract:
Therapeutics machine learning is an emerging field with incredible opportunities for innovation and impact. However, advancement in this field requires formulation of meaningful learning tasks and careful curation of datasets. Here, we introduce Therapeutics Data Commons (TDC), the first unifying platform to systematically access and evaluate machine learning across the entire range of therapeutics. To date, TDC includes 66 AI-ready datasets spread across 22 learning tasks and spanning the discovery and development of safe and effective medicines. TDC also provides an ecosystem of tools and community resources, including 33 data functions and types of meaningful data splits, 23 strategies for systematic model evaluation, 17 molecule generation oracles, and 29 public leaderboards. All resources are integrated and accessible via an open Python library. We carry out extensive experiments on selected datasets, demonstrating that even the strongest algorithms fall short of solving key therapeutics challenges, including real dataset distributional shifts, multi-scale modeling of heterogeneous data, and robust generalization to novel data points. We envision that TDC can facilitate algorithmic and scientific advances and considerably accelerate machine-learning model development, validation and transition into biomedical and clinical implementation. TDC is an open-science initiative available at https://tdcommons.ai.
Submitted 28 August, 2021; v1 submitted 18 February, 2021;
originally announced February 2021.
-
Comparisons of Graph Neural Networks on Cancer Classification Leveraging a Joint of Phenotypic and Genetic Features
Authors:
David Oniani,
Chen Wang,
Yiqing Zhao,
Andrew Wen,
Hongfang Liu,
Feichen Shen
Abstract:
Cancer is responsible for millions of deaths worldwide every year. Although significant progress has been achieved in cancer medicine, many issues remain to be addressed for improving cancer therapy. Appropriate cancer patient stratification is the prerequisite for selecting an appropriate treatment plan, as cancer patients are of known heterogeneous genetic make-ups and phenotypic differences. In this study, built upon deep phenotypic characterizations extractable from Mayo Clinic electronic health records (EHRs) and genetic test reports for a collection of cancer patients, we evaluated various graph neural networks (GNNs) leveraging a joint of phenotypic and genetic features for cancer type classification. Models were applied and fine-tuned on the Mayo Clinic cancer disease dataset. The assessment was done through the reported accuracy, precision, recall, and F1 values as well as through F1 scores based on the disease class. Per our evaluation results, GNNs on average outperformed the baseline models with mean statistics always being higher than those of the baseline models (0.849 vs 0.772 for accuracy, 0.858 vs 0.794 for precision, 0.843 vs 0.759 for recall, and 0.843 vs 0.855 for F1 score). Among GNNs, ChebNet, GraphSAGE, and TAGCN showed the best performance, while GAT showed the worst. We applied and compared eight GNN models including AGNN, ChebNet, GAT, GCN, GIN, GraphSAGE, SGC, and TAGCN on the Mayo Clinic cancer disease dataset and assessed their performance as well as compared them with each other and with more conventional machine learning models such as decision tree, gradient boosting, multi-layer perceptron, naive Bayes, and random forest, which we used as the baselines.
Submitted 14 January, 2021;
originally announced January 2021.
-
Desires and Motivation: The Computational Rule, the Underlying Neural Circuitry, and the Relevant Clinical Disorders
Authors:
Yu Liu,
Yinghong Zhao,
Mo Chen
Abstract:
An organism is a dissipative system. The process from multiple desires to an exclusive motivation is of great importance among all sensory-action loops. In this paper we argue that a proper Desire-Motivation model should be a continuous dynamic mapping from the dynamic desire vector to the sparse motivation vector. Meanwhile, it should at least have specific stability and adjustability of motivation intensity. Besides, neuroscientific evidence suggests that the Desire-Motivation model should have dynamic information acquisition and should be a recurrent neural network. A five-equation model is built based on the above arguments, namely the Recurrent Gating Desire-Motivation (RGDM) model. Additionally, a heuristic speculation based on the RGDM model about the corresponding brain regions is carried out. We speculate that the tonic and phasic firing of ventral tegmental area dopamine neurons should execute the respective and collective feedback functions of recurrent processing. The analysis of the RGDM model yields expectations about individual personality along three dimensions, namely stability, intensity, and motivation decision speed. These three dimensions can be combined to create eight different personalities, which correspond to Jung's personality structure theorem. Furthermore, the RGDM model can be used to predict three different brand-new types of depressive disorder with different phenotypes. Moreover, it can also explain several other psychiatric disorders from new perspectives.
Submitted 11 November, 2020;
originally announced November 2020.
-
Tetracycline as an inhibitor to the coronavirus SARS-CoV-2
Authors:
Tom Y. Zhao,
Neelesh A. Patankar
Abstract:
The coronavirus SARS-CoV-2 remains an extant threat against public health on a global scale. Cell infection begins when the spike protein of SARS-CoV-2 binds with the cell receptor, angiotensin-converting enzyme 2 (ACE2). Here, we address the role of Tetracycline as an inhibitor for the receptor-binding domain (RBD) of the spike protein. Targeted molecular investigation shows that Tetracycline binds more favorably to the RBD (-9.40 kcal/mol) compared to Chloroquine (-6.31 kcal/mol) or Doxycycline (-8.08 kcal/mol) and inhibits attachment to ACE2 to a greater degree (binding efficiency of 2.98 $\frac{\text{kcal}}{\text{mol}\cdot \text{nm}^2}$ for Tetracycline-RBD, 5.59 $\frac{\text{kcal}}{\text{mol}\cdot \text{nm}^2}$ for Chloroquine-RBD, 5.16 $\frac{\text{kcal}}{\text{mol}\cdot \text{nm}^2}$ for Doxycycline-RBD). Stronger Tetracycline inhibition is verified with nonequilibrium PMF calculations, for which the Tetracycline-RBD complex exhibits the lowest free energy profile along the dissociation pathway from ACE2. Tetracycline appears to target viral residues that are usually involved in significant hydrogen bonding with ACE2; this inhibition of cellular infection complements the anti-inflammatory and cytokine-suppressing capability of Tetracycline, and may further reduce the duration of ICU stays and mechanical ventilation induced by the coronavirus SARS-CoV-2.
Submitted 13 August, 2020;
originally announced August 2020.
-
Scalable Bayesian Functional Connectivity Inference for Multi-Electrode Array Recordings
Authors:
Yun Zhao,
Richard Jiang,
Zhenni Xu,
Elmer Guzman,
Paul K. Hansma,
Linda Petzold
Abstract:
Multi-electrode arrays (MEAs) can record extracellular action potentials (also known as 'spikes') from hundreds or thousands of neurons simultaneously. Inference of a functional network from a spike train is a fundamental and formidable computational task in neuroscience. With the advancement of MEA technology, it has become increasingly crucial to develop statistical tools for analyzing multiple neuronal activity as a network. In this paper, we propose a scalable Bayesian framework for inference of functional networks from MEA data. Our framework makes use of the hierarchical structure of networks of neurons. We split the large scale recordings into smaller local networks for network inference, which not only eases the computational burden from Bayesian sampling but also provides useful insights on regional connections in organoids and brains. We speed up the expensive Bayesian sampling process by using parallel computing. Experiments on both synthetic datasets and large-scale real-world MEA recordings show the effectiveness and efficiency of the scalable Bayesian framework. Inference of networks from controlled experiments exposing neural cultures to cadmium presents distinguishable results and further confirms the utility of our framework.
Submitted 4 July, 2020;
originally announced July 2020.
-
Unsupervised Learning of Deep-Learned Features from Breast Cancer Images
Authors:
Sanghoon Lee,
Colton Farley,
Simon Shim,
Yanjun Zhao,
Wookjin Choi,
Wook-Sung Yoo
Abstract:
Detecting cancer manually in whole slide images is a laborious process that requires significant time and effort. Recent advances in whole slide image analysis have stimulated the growth and development of machine learning-based approaches that improve the efficiency and effectiveness of cancer diagnosis. In this paper, we propose an unsupervised learning approach for detecting cancer in breast invasive carcinoma (BRCA) whole slide images. The proposed method is fully automated and does not require human involvement during the unsupervised learning procedure. We demonstrate the effectiveness of the proposed approach for cancer detection in BRCA and show how the machine can choose the most appropriate clusters during the unsupervised learning procedure. Moreover, we present a prototype application that enables users to select relevant groups, mapping all regions related to the groups in whole slide images.
Submitted 21 June, 2020;
originally announced June 2020.
-
Validation of image-guided cochlear implant programming techniques
Authors:
Yiyuan Zhao,
Jianing Wang,
Rui Li,
Robert F. Labadie,
Benoit M. Dawant,
Jack H. Noble
Abstract:
Cochlear implants (CIs) are a standard treatment for patients who experience severe to profound hearing loss. Recent studies have shown that hearing outcome is correlated with intra-cochlear anatomy and electrode placement. Our group has developed image-guided CI programming (IGCIP) techniques that use image analysis methods to both segment the inner ear structures in pre- or post-implantation CT images and localize the CI electrodes in post-implantation CT images. This makes it possible to assist audiologists with CI programming by suggesting which contacts should be deactivated to reduce electrode interaction, which is known to affect outcomes. Clinical studies have shown that IGCIP can improve hearing outcomes for CI recipients. However, the sensitivity of IGCIP to the accuracy of its two major steps, electrode localization and intra-cochlear anatomy segmentation, is unknown. In this article, we create a ground truth dataset with conventional CT and micro-CT images of 35 temporal bone specimens to both rigorously characterize the accuracy of these two steps and assess how inaccuracies in these steps affect the overall results. Our study results show that when clinical pre- and post-implantation CTs are available, IGCIP produces results that are comparable to those obtained with the corresponding ground truth in 86.7% of the subjects tested. When only post-implantation CTs are available, this number is 83.3%. These results suggest that our current method is robust to errors in segmentation and localization but also that it can be improved upon.
Keywords: cochlear implant, ground truth, segmentation, validation
Submitted 13 July, 2020; v1 submitted 22 September, 2019;
originally announced September 2019.
-
The Channel Attention based Context Encoder Network for Inner Limiting Membrane Detection
Authors:
Hao Qiu,
Zaiwang Gu,
Lei Mou,
Xiaoqian Mao,
Liyang Fang,
Yitian Zhao,
Jiang Liu,
Jun Cheng
Abstract:
Optic disc segmentation is an important step in retinal image-based diagnosis of diseases such as glaucoma. The inner limiting membrane (ILM) is the first boundary in the OCT, which can help to extract the retinal pigment epithelium (RPE) through gradient edge information to locate the boundary of the optic disc. Thus, ILM layer segmentation is of great importance for optic disc localization. In this paper, we build a new optic disc centered dataset from 20 volunteers and manually annotate the ILM boundary in each OCT scan as ground truth. We also propose a channel attention based context encoder network, modified from the CE-Net, to segment the optic disc. It mainly contains three phases: the encoder module, the channel attention based context encoder module, and the decoder module. Finally, we demonstrate that our proposed method achieves state-of-the-art disc segmentation performance on the dataset mentioned above.
Submitted 9 August, 2019;
originally announced August 2019.
-
SERS discrimination of single amino acid residue in single peptide by plasmonic nanocavities
Authors:
Jian-An Huang,
Mansoureh Z. Mousavi,
Giorgia Giovannini,
Yingqi Zhao,
Aliaksandr Hubarevich,
Denis Garoli,
Francesco De Angelis
Abstract:
Surface-enhanced Raman spectroscopy (SERS) is a sensitive label-free optical method that can provide fingerprint Raman spectra of biomolecules such as DNA, amino acids and proteins. While SERS of a single DNA molecule has recently been demonstrated, Raman analysis of a single protein sequence was not possible because the SERS spectra of proteins are usually dominated by signals of aromatic amino acid residues. Here, we used an electroplasmonic approach to trap a single gold nanoparticle in a nanohole, generating a plasmonic nanocavity between the trapped nanoparticle and the nanopore wall. The giant field generated in the nanocavity was so sensitive and localized that it enabled SERS discrimination of 10 distinct amino acids at the single-molecule level. The obtained spectra are used to analyze the spectra of 2 biomarkers (Vasopressin and Oxytocin) made of a short sequence of 9 amino acids. Significantly, we demonstrated identification of single non-aromatic amino acid residues in a single short peptide chain as well as discrimination between two peptides whose sequences differ in 2 specific amino acids. Our results demonstrate the high sensitivity of our method in identifying a single amino acid residue in a protein chain and its potential for further applications in proteomics and single-protein sequencing.
Submitted 13 December, 2019; v1 submitted 9 August, 2019;
originally announced August 2019.
-
A Deep Learning Framework for Classification of in vitro Multi-Electrode Array Recordings
Authors:
Yun Zhao,
Elmer Guzman,
Morgane Audouard,
Zhuowei Cheng,
Paul K. Hansma,
Kenneth S. Kosik,
Linda Petzold
Abstract:
Multi-Electrode Arrays (MEAs) have been widely used to record neuronal activities, which could be used in the diagnosis of gene defects and drug effects. In this paper, we address the problem of classifying in vitro MEA recordings of mouse and human neuronal cultures from different genotypes, where there is no easy way to directly utilize raw sequences as inputs to train an end-to-end classification model. While carefully extracting some features by hand could partially solve the problem, this approach suffers from obvious drawbacks such as difficulty of generalizing. We propose a deep learning framework to address this challenge. Our approach correctly classifies neuronal culture data prepared from two different genotypes -- a mouse Knockout of the delta-catenin gene and human induced Pluripotent Stem Cell-derived neurons from Williams syndrome. By splitting the long recordings into short slices for training, and applying Consensus Prediction during testing, our deep learning approach improves the prediction accuracy by 16.69% compared with feature based Logistic Regression for mouse MEA recordings. We further achieve an accuracy of 95.91% using Consensus Prediction in one subset of mouse MEA recording data, which were all recorded at six days in vitro. As high-density MEA recordings become more widely available, this approach could be generalized for classification of neurons carrying different mutations and classification of drug responses.
Submitted 5 June, 2019;
originally announced June 2019.
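The slice-then-vote idea described in the abstract above can be sketched as follows. This is a minimal illustration only: the abstract does not define Consensus Prediction precisely, so the sketch assumes it is a simple majority vote over per-slice predictions. The names `split_into_slices`, `consensus_predict`, and `toy_classifier` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def split_into_slices(recording, slice_len):
    """Cut a long recording into consecutive, non-overlapping slices.

    A trailing remainder shorter than slice_len is dropped (an assumption).
    """
    last_start = len(recording) - slice_len
    return [recording[i:i + slice_len] for i in range(0, last_start + 1, slice_len)]


def consensus_predict(recording, slice_len, classify_slice):
    """Classify every slice, then return the most frequent label.

    Majority voting is assumed here; the paper may combine slice
    predictions differently.
    """
    slices = split_into_slices(recording, slice_len)
    if not slices:
        raise ValueError("recording is shorter than one slice")
    votes = Counter(classify_slice(s) for s in slices)
    return votes.most_common(1)[0][0]


def toy_classifier(slice_values):
    """Hypothetical stand-in for a trained per-slice model."""
    return "A" if sum(slice_values) > 0 else "B"


recording = [1, 1, 1, -1, -1, -1, 2, 2, 2, 0.5]
label = consensus_predict(recording, slice_len=3, classify_slice=toy_classifier)
```

In this toy run the three full slices vote A, B, A, so the recording-level label is "A". A trained network would take the place of `toy_classifier`.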
-
Critical slowing down and attractive manifold: a mechanism for dynamic robustness in yeast cell-cycle process
Authors:
Yao Zhao,
Dedi Wang,
Zhiwen Zhang,
Ying Lu,
Xiaojing Yang,
Qi Ouyang,
Chao Tang,
Fangting Li
Abstract:
Biological processes that execute multiple complex functions, such as the cell cycle, must ensure the order of sequential events and maintain dynamic robustness against various fluctuations. Here, we examine the dynamic mechanism and the fundamental structure that achieve these properties in the cell-cycle process of the budding yeast Saccharomyces cerevisiae. We show that the budding yeast cell-cycle process behaves like an excitable system containing three well-coupled saddle-node bifurcations to execute DNA replication and mitosis events. The yeast cell-cycle regulatory network can be separated into a G1/S phase module, an early M module and a late M phase module, where the positive feedbacks in each module and the interactions among the modules play an important role. If the cell-cycle process operates near the critical points of the saddle-node bifurcations, there is a critical slowing down or ghost effect. This can provide the cell-cycle process with a sufficient duration for each event and an attractive manifold for the state checking of the completion of DNA replication and mitosis; moreover, the fluctuation in the early module/event is prevented from being transmitted to the later module/event. Our results suggest both a fundamental structure of the cell-cycle regulatory network and a hint for the evolution of eukaryotic cell-cycle processes, from the dynamic checking mechanism to the molecule checkpoint pathway.
Submitted 23 November, 2018;
originally announced November 2018.
-
Cell Motility Dependence on Adhesive Wetting
Authors:
Yuansheng Cao,
Richa Karmakar,
Elisabeth Ghabache,
Edgar Gutierrez,
Yanxiang Zhao,
Alex Groisman,
Herbert Levine,
Brian A. Camley,
Wouter-Jan Rappel
Abstract:
Adhesive cell-substrate interactions are crucial for cell motility and are responsible for the necessary traction that propels cells. These interactions can also change the shape of the cell, analogous to liquid droplet wetting on adhesive substrates. To address how these shape changes affect cell migration and cell speed we model motility using deformable, 2D cross-sections of cells in which adhesion and frictional forces between cell and substrate can be varied separately. Our simulations show that increasing the adhesion results in increased spreading of cells and larger cell speeds. We propose an analytical model which shows that the cell speed is inversely proportional to an effective height of the cell and that increasing this height results in increased internal shear stress. The numerical and analytical results are confirmed in experiments on motile eukaryotic cells.
Submitted 4 February, 2019; v1 submitted 4 June, 2018;
originally announced June 2018.
-
Effect of time varying transmission rates on coupled dynamics of epidemic and awareness over multiplex network
Authors:
Vikram Sagar,
Yi Zhao,
Abhijit Sen
Abstract:
In the present work, a non-linear stochastic model is presented to study the effect of time variation of transmission rates on the co-evolution of an epidemic and its corresponding awareness over a two-layered multiplex network. In this model, the infection transmission rate of a given node in the epidemic layer depends upon its awareness probability in the awareness layer. Similarly, the infection information transmission rate of a node in the awareness layer depends upon its infection probability in the epidemic layer. The spread of disease resulting from physical contacts is described in terms of an SIS (Susceptible Infected Susceptible) process over the epidemic layer, and the spread of information about the disease outbreak is described in terms of a UAU (Unaware Aware Unaware) process over the virtual-interaction-mediated awareness layer. The time variation of the transmission rates and the resulting co-evolution of these mutually competing processes are studied in terms of a network-topology-dependent parameter (α). Using a second-order linear theory, it is shown that in the continuous time limit, the co-evolution of these processes can be described in terms of damped and driven harmonic oscillator equations. From the results of the Monte-Carlo simulation, it is shown that for a suitable choice of the parameter (α), the two processes can exhibit either sustained oscillatory or damped dynamics. The damped dynamics corresponds to the endemic state. Further, for the endemic state it is shown that the inclusion of the awareness layer significantly lowers the disease transmission rate and reduces the size of the epidemic. The endemic-state infection probability of a given node corresponding to the damped dynamics is found to depend on both transmission rates as well as on both the absolute intra-layer and the relative inter-layer degree of the individual nodes.
Submitted 31 May, 2018; v1 submitted 22 May, 2018;
originally announced May 2018.
-
An integration of fast alignment and maximum-likelihood methods for electron subtomogram averaging and classification
Authors:
Yixiu Zhao,
Xiangrui Zeng,
Qiang Guo,
Min Xu
Abstract:
Motivation: Cellular Electron CryoTomography (CECT) is an emerging 3D imaging technique that visualizes subcellular organization of single cells at submolecular resolution and in near-native state. CECT captures large numbers of macromolecular complexes of highly diverse structures and abundances. However, the structural complexity and imaging limits complicate the systematic de novo structural recovery and recognition of these macromolecular complexes. Efficient and accurate reference-free subtomogram averaging and classification represent the most critical tasks for such analysis. Existing subtomogram alignment based methods are prone to the missing wedge effects and low signal-to-noise ratio (SNR). Moreover, existing maximum-likelihood based methods rely on integration operations, which are in principle computationally infeasible for accurate calculation.
Results: Building on existing work, we propose an integrated method, the Fast Alignment Maximum Likelihood method (FAML), which uses fast subtomogram alignment to sample sub-optimal rigid transformations. The transformations are then used to approximate integrals for the maximum-likelihood update of subtomogram averages through the expectation-maximization algorithm. Our tests on simulated and experimental subtomograms showed that, compared to our previously developed fast alignment method (FA), FAML is significantly more robust to noise and missing wedge effects with moderate increases in computation cost. Besides, FAML performs well with significantly fewer input subtomograms when the FA method fails. Therefore, FAML can serve as a key component for improved construction of initial structural models from macromolecules captured by CECT.
Submitted 3 April, 2018;
originally announced April 2018.
-
Proteins at air-water and oil-water interfaces in an all-atom model
Authors:
Yani Zhao,
Marek Cieplak
Abstract:
We study the behavior of five proteins at the air-water and oil-water interfaces by all-atom molecular dynamics. The proteins are found to get distorted when pinned to the interface. This behavior is consistent with the phenomenological way of introducing the interfaces in a coarse-grained model through a force that depends on the hydropathy indices of the residues. Proteins couple to the oil-water interface more strongly than to the air-water one. They diffuse more slowly at the oil-water interface but do not depin from it, whereas depinning events are observed at the other interface. The reduction of the disulfide bonds slows the diffusion down.
Submitted 8 January, 2018;
originally announced January 2018.
-
Direct Information Reweighted by Contact Templates: Improved RNA Contact Prediction by Combining Structural Features
Authors:
Yiren Jian,
Chen Zeng,
Yunjie Zhao
Abstract:
It is acknowledged that co-evolutionary nucleotide-nucleotide interactions are essential for RNA structures and functions. Currently, direct coupling analysis (DCA) infers nucleotide contacts in a sequence from its homologous sequence alignment across different species. DCA and similar approaches that use sequence information alone usually yield low accuracy, especially when the available homologous sequences are limited. Here we present a new method that incorporates a Restricted Boltzmann Machine (RBM) to augment the information on sequence co-variations with structural patterns in contact inference. We thus name our method DIRECT, which stands for Direct Information REweighted by Contact Templates. Benchmark tests demonstrate that DIRECT produces a substantial enhancement of 13% in accuracy on average for contact prediction in comparison to the traditional DCA. These results suggest that DIRECT could be used for improving predictions of RNA tertiary structures and functions. The source code and dataset of DIRECT are available at http://zhao.phy.ccnu.edu.cn:8122/DIRECT/index.html.
Submitted 28 November, 2017;
originally announced November 2017.