Search | arXiv e-print repository

Efficient and Precise Force Field Optimization for Biomolecules Using DPA-2

Authors: Junhan Chang, Duo Zhang, Yuqing Deng, Hongrui Lin, Zhirong Liu, Linfeng Zhang, Hang Zheng, Xinyan Wang

Abstract: Molecular simulations are essential tools in computational chemistry, enabling the prediction and understanding of molecular interactions and thermodynamic properties of biomolecules. However, traditional force fields face significant challenges in accurately representing novel molecules and complex chemical environments due to the labor-intensive process of manually setting optimization parameters and the high computational cost of quantum mechanical calculations. To overcome these difficulties, we fine-tuned a high-accuracy DPA-2 pre-trained model and applied it to optimize force field parameters on-the-fly, significantly reducing computational costs. Our method combines this fine-tuned DPA-2 model with a node-embedding-based similarity metric, allowing seamless augmentation to new chemical species without manual intervention. We applied this process to the TYK2 inhibitor and PTP1B systems and demonstrated its effectiveness through the improvement of free energy perturbation calculation results. This advancement contributes valuable insights and tools for the computational chemistry community.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2404.11759 [pdf, other]

Modelling infectious disease transmission dynamics in conference environments: An individual-based approach

Authors: Xue Liu, Yue Deng, Jingying Huang, Yuhong Zhang, Jinzhi Lei

Abstract: The global public health landscape is perpetually challenged by the looming threat of infectious diseases. Central to addressing this concern is the imperative to prevent and manage disease transmission during pandemics, particularly in unique settings. This study addresses the transmission dynamics of infectious diseases within conference venues, presenting a computational model designed to simulate transmission processes within a condensed timeframe (one day), beginning with sporadic cases. Our model intricately captures the activities of individual attendees within the conference venue, encompassing meetings, rest intervals, and meal breaks. While meetings entail proximity seating, rest and lunch periods allow attendees to interact with diverse individuals. Moreover, the restroom environment poses an additional avenue for potential infection transmission. Employing an individual-based model, we meticulously replicated the transmission dynamics of infectious diseases, with a specific emphasis on close-contact interactions between infected and susceptible individuals. Through comprehensive analysis of model simulations, we elucidated the intricacies of disease transmission dynamics within conference settings and assessed the efficacy of control strategies to curb disease dissemination. Ultimately, our study proffers a numerical framework for assessing the risk of infectious disease transmission during short-duration conferences, furnishing conference organizers with valuable insights to inform the implementation of targeted prevention and control measures.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 25 pages; 8 figures

arXiv:2404.10573 [pdf, other]

AAVDiff: Experimental Validation of Enhanced Viability and Diversity in Recombinant Adeno-Associated Virus (AAV) Capsids through Diffusion Generation

Authors: Lijun Liu, Jiali Yang, Jianfei Song, Xinglin Yang, Lele Niu, Zeqi Cai, Hui Shi, Tingjun Hou, Chang-yu Hsieh, Weiran Shen, Yafeng Deng

Abstract: Recombinant adeno-associated virus (rAAV) vectors have revolutionized gene therapy, but their broad tropism and suboptimal transduction efficiency limit their clinical applications. To overcome these limitations, researchers have focused on designing and screening capsid libraries to identify improved vectors. However, the large sequence space and limited resources present challenges in identifying viable capsid variants. In this study, we propose an end-to-end diffusion model to generate capsid sequences with enhanced viability. Using publicly available AAV2 data, we generated 38,000 diverse AAV2 viral protein (VP) sequences, and evaluated 8,000 for viral selection. The results attested to the superiority of our model compared to traditional methods. Additionally, in the absence of AAV9 capsid data, apart from one wild-type sequence, we used the same model to directly generate a number of viable sequences with up to 9 mutations. We transferred the remaining 30,000 samples to the AAV9 domain. Furthermore, we conducted mutagenesis on AAV9 VP hypervariable regions VI and V, contributing to the continuous improvement of the AAV9 VP sequence. This research represents a significant advancement in the design and functional validation of rAAV vectors, offering innovative solutions to enhance specificity and transduction efficiency in gene therapy applications.

Submitted 17 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

arXiv:2401.11360 [pdf]

PepHarmony: A Multi-View Contrastive Learning Framework for Integrated Sequence and Structure-Based Peptide Encoding

Authors: Ruochi Zhang, Haoran Wu, Chang Liu, Huaping Li, Yuqian Wu, Kewei Li, Yifan Wang, Yifan Deng, Jiahui Chen, Fengfeng Zhou, Xin Gao

Abstract: Recent advances in protein language models have catalyzed significant progress in peptide sequence representation. Despite extensive exploration in this field, pre-trained models tailored for peptide-specific needs remain largely unaddressed due to the difficulty in capturing the complex and sometimes unstable structures of peptides. This study introduces a novel multi-view contrastive learning framework, PepHarmony, for the sequence-based peptide encoding task. PepHarmony innovatively combines both sequence- and structure-level information into a sequence-level encoding module through contrastive learning. We carefully select datasets from the Protein Data Bank (PDB) and AlphaFold database to encompass a broad spectrum of peptide sequences and structures. The experimental data highlight PepHarmony's exceptional capability in capturing the intricate relationship between peptide sequences and structures compared with the baseline and fine-tuned models. The robustness of our model is confirmed through extensive ablation studies, which emphasize the crucial roles of contrastive loss and strategic data sorting in enhancing predictive performance. The proposed PepHarmony framework serves as a notable contribution to peptide representations, and offers valuable insights for future applications in peptide drug discovery and peptide engineering. We have made all the source code utilized in this study publicly accessible via GitHub at https://github.com/zhangruochi/PepHarmony or http://www.healthinformaticslab.org/supp/.

Submitted 20 January, 2024; originally announced January 2024.

Comments: 25 pages, 5 figures, 3 tables

arXiv:2305.13343 [pdf]

A new sulfur bioconversion process development for energy- and space-efficient secondary wastewater treatment

Authors: Chu-Kuan Jiang, Yang-Fan Deng, Hongxiao Guo, Guang-Hao Chen, Di Wu

Abstract: Harvesting organic matter from wastewater is widely applied to maximize energy recovery; however, it limits the applicability of secondary treatment for acceptable effluent discharge into surface water bodies. To turn this bottleneck issue into an opportunity, this study developed oxygen-induced thiosulfatE production duRing sulfATe reductiOn (ERATO) to provide an efficient electron donor for wastewater treatment. Typical pretreated wastewater was synthesized with chemical oxygen demand of 110 mg/L, sulfate of 50 mg S/L, and varying dissolved oxygen (DO) and was fed into a moving-bed biofilm reactor (MBBR). The MBBR was operated continuously with a short hydraulic retention time of 40 min for 349 days. The formation rate of thiosulfate reached 0.12-0.18 g S/(m²·d) with a high produced thiosulfate-S/TdS-S ratio of 38-73% when influent DO was 2.7-3.6 mg/L. The sludge yield was 0.23-0.29 gVSS/gCOD, much lower than in conventional activated sludge processes. Then, batch tests and metabolism analysis were conducted to confirm the oxygen effect on thiosulfate formation, characterize the roles of sulfate and microbial activities, and explore the mechanism of oxygen-induced thiosulfate formation in ERATO. The results showed that oxygen supply raised the produced thiosulfate-S/TdS-S ratio from 4% to 24-26%, demonstrated that sulfate and microbial activities were critical for thiosulfate production, and indicated that oxygen induces thiosulfate formation through two pathways: 1) direct sulfide oxidation, and 2) indirect sulfide oxidation, in which sulfide is first oxidized to S0 (dominant), which then reacts with sulfite derived from oxygen-regulated biological sulfate reduction. The proposed compact ERATO process, featuring high thiosulfate production and low sludge production, supports space- and energy-efficient secondary wastewater treatment.

Submitted 21 May, 2023; originally announced May 2023.

Comments: Written by Chu-Kuan Jiang; edited by Yang-Fan Deng, Hongxiao Guo, Guang-Hao Chen, Di Wu; Corresponding authors: Guang-Hao Chen, Di Wu; Last author (team leader): Guang-Hao Chen

arXiv:2304.10494 [pdf]

Infinite Physical Monkey: Do Deep Learning Methods Really Perform Better in Conformation Generation?

Authors: Haotian Zhang, Jintu Zhang, Huifeng Zhao, Dejun Jiang, Yafeng Deng

Abstract: Conformation generation is a fundamental problem in drug discovery and cheminformatics. Organic molecule conformation generation, particularly in vacuum and protein pocket environments, is most relevant to drug design. Recently, with the development of geometric neural networks, data-driven schemes have been successfully applied in this field, both for molecular conformation generation (in vacuum) and binding pose generation (in protein pocket). The former beats the traditional ETKDG method, while the latter achieves accuracy similar to that of the widely used molecular docking software. Although these methods have shown promising results, some researchers have recently questioned, via a parameter-free method, whether deep learning (DL) methods perform better in molecular conformation generation. To our surprise, what they have designed is somewhat analogous to the famous infinite monkey theorem, with monkeys that are even equipped with a physics education. To discuss the feasibility of their proof, we constructed a real infinite stochastic monkey for molecular conformation generation, showing that even with a more stochastic sampler for geometry generation, the coverage of the benchmark QM-computed conformations is higher than that of most DL-based methods. By extending their physical monkey algorithm to binding pose prediction, we also discover that the successful docking rate achieves near-best performance among existing DL-based docking models. Thus, though their conclusions are right, their proof process warrants closer scrutiny.

Submitted 7 March, 2023; originally announced April 2023.

arXiv:2201.08443 [pdf]

Diversifying the Genomic Data Science Research Community

Authors: The Genomic Data Science Community Network, Rosa Alcazar, Maria Alvarez, Rachel Arnold, Mentewab Ayalew, Lyle G. Best, Michael C. Campbell, Kamal Chowdhury, Katherine E. L. Cox, Christina Daulton, Youping Deng, Carla Easter, Karla Fuller, Shazia Tabassum Hakim, Ava M. Hoffman, Natalie Kucher, Andrew Lee, Joslynn Lee, Jeffrey T. Leek, Robert Meller, Loyda B. Méndez, Miguel P. Méndez-González, Stephen Mosher, Michele Nishiguchi, Siddharth Pratap, et al. (13 additional authors not shown)

Abstract: Over the last 20 years, there has been an explosion of genomic data collected for disease association, functional analyses, and other large-scale discoveries. At the same time, there have been revolutions in cloud computing that enable computational and data science research, while making data accessible to anyone with a web browser and an internet connection. However, students at institutions with limited resources have received relatively little exposure to curricula or professional development opportunities that lead to careers in genomic data science. To broaden participation in genomics research, the scientific community needs to support students, faculty, and administrators at Underserved Institutions (UIs) including Community Colleges, Historically Black Colleges and Universities, Hispanic-Serving Institutions, and Tribal Colleges and Universities in taking advantage of these tools in local educational and research programs. We have formed the Genomic Data Science Community Network (http://www.gdscn.org/) to identify opportunities and support broadening access to cloud-enabled genomic data science. Here, we provide a summary of the priorities for faculty members at UIs, as well as administrators, funders, and R1 researchers to consider as we create a more diverse genomic data science community.

Submitted 9 June, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

Comments: 42 pages, 3 figures

arXiv:2110.07347 [pdf, other]

Improved Drug-target Interaction Prediction with Intermolecular Graph Transformer

Authors: Siyuan Liu, Yusong Wang, Tong Wang, Yifan Deng, Liang He, Bin Shao, Jian Yin, Nanning Zheng, Tie-Yan Liu

Abstract: The identification of active binding drugs for target proteins (termed as drug-target interaction prediction) is the key challenge in virtual screening, which plays an essential role in drug discovery. Although recent deep learning-based approaches achieved better performance than molecular docking, existing models often neglect certain aspects of the intermolecular information, hindering the performance of prediction. We recognize this problem and propose a novel approach named Intermolecular Graph Transformer (IGT) that employs a dedicated attention mechanism to model intermolecular information with a three-way Transformer-based architecture. IGT outperforms state-of-the-art approaches by 9.1% and 20.5% over the second best for binding activity and binding pose prediction respectively, and shows superior generalization ability to unseen receptor proteins. Furthermore, IGT exhibits promising drug screening ability against SARS-CoV-2 by identifying 83.1% active drugs that have been validated by wet-lab experiments with near-native predicted binding poses.

Submitted 15 October, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

arXiv:2011.13554 [pdf]

Towards decoding the coupled decision-making of metabolism and epithelial-mesenchymal transition in cancer

Authors: Dongya Jia, Jun Hyoung Park, Harsimran Kaur, Kwang Hwa Jung, Sukjin Yang, Shubham Tripathi, Madeline Galbraith, Youyuan Deng, Mohit Kumar Jolly, Benny Abraham Kaipparettu, Jose N. Onuchic, Herbert Levine

Abstract: Cancer cells have the plasticity to adjust their metabolic phenotypes for survival and metastasis. During metastasis, a developmental program known as the epithelial-mesenchymal transition (EMT) plays a critical role. There is extensive cross-talk between metabolism and EMT, but how this leads to coordinated physiological changes is still uncertain. The elusive connection between metabolism and EMT compromises the efficacy of metabolic therapies targeting metastasis. In this review, we aim to clarify the causation between metabolism and EMT based on recent experimental studies and propose integrated theoretical-experimental efforts to better understand the coupled decision-making of metabolism and EMT.

Submitted 26 November, 2020; originally announced November 2020.

Comments: 31 pages, 3 figures

arXiv:2004.07765 [pdf]

Cost-effectiveness Analysis of Antiepidemic Policies and Global Situation Assessment of COVID-19

Authors: Liyan Xu, Hongmou Zhang, Yuqiao Deng, Keli Wang, Fu Li, Qing Lu, Jie Yin, Qian Di, Tao Liu, Hang Yin, Zijiao Zhang, Qingyang Du, Hongbin Yu, Aihan Liu, Hezhishi Jiang, Jing Guo, Xiumei Yuan, Yun Zhang, Liu Liu, Yu Liu

Abstract: With a two-layer contact-dispersion model and data in China, we analyze the cost-effectiveness of three types of antiepidemic measures for COVID-19: regular epidemiological control, local social interaction control, and inter-city travel restriction. We find that: 1) inter-city travel restriction has minimal or even negative effect compared to the other two at the national level; 2) the time to reach the turning point is independent of the current number of cases, and only related to the enforcement stringency of epidemiological control and social interaction control measures; 3) strong enforcement at the early stage is the only opportunity to maximize both antiepidemic effectiveness and cost-effectiveness; 4) mediocre stringency of social interaction measures is the worst choice. Subsequently, we cluster countries/regions into four groups based on their control measures and provide situation assessment and policy suggestions for each group.

Submitted 23 April, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

arXiv:2002.09199 [pdf, ps, other]

Scaling features in the spreading of COVID-19

Authors: Ming Li, Jie Chen, Youjin Deng

Abstract: Since the outbreak of COVID-19, many data analyses have been done. Some of them are based on the classical epidemiological approach that assumes an exponential growth, but a few studies report that a power-law scaling may provide a better fit to the currently available data. Here, we examine the data in China (01/20/2020--02/24/2020), and indeed find that the growth closely follows power-law kinetics over a significantly wide time period. The exponents are $2.48(20)$, $2.21(6)$ and $4.26(12)$ for the number of confirmed infections, deaths and cured cases, respectively, indicating an underlying small-world network structure in the pandemic. While no obvious deviations from the power-law growth can be seen yet for the number of deaths and cured cases, negative deviations have clearly appeared in the number of infections, particularly that for the region outside Hubei. This suggests the beginning of the slowing-down of the virus spreading due to the huge containment effort. Meanwhile, we find that despite the dramatic difference in magnitudes, the growth kinetics of the infection number exhibits much similarity for Hubei province and the region outside Hubei. On this basis, in a log-log plot, we rescale the infection number for the region outside Hubei such that it overlaps as much as possible with the total infection number in China, from which an approximate extrapolation yields the maximum of the pandemic around March 3, 2020, with the number of infections about $83,000$. Further, by analyzing the kinetics of the mortality in log-log scale, we obtain a rough estimate that near March 3, the death rate of COVID-19 would be about $4.7\% \sim 5.0\%$ for Hubei province and $0.7\% \sim 1.0\%$ for the region outside Hubei. We emphasize that our predictions may be quantitatively unreliable, since the data analysis is purely empirical and various assumptions are used.

Submitted 25 February, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

Comments: 7 pages, 5 figures

arXiv:1911.06946 [pdf]

doi 10.1016/j.envint.2020.105664

Indoor microbiome, environmental characteristics and asthma among junior high school students in Johor Bahru, Malaysia

Authors: Xi Fu, Dan Norback, Qianqian Yuan, Yanling Li, Xunhua Zhu, Yiqun Deng, Jamal Hisham Hashim, Zailina Hashim, Yi-Wu Zheng, Xu-Xin Lai, Michael Dho Spangfort, Yu Sun

Abstract: Indoor microbial diversity and composition are suggested to affect the prevalence and severity of asthma. In this study, we collected floor dust and environmental characteristics from 21 classrooms, and health data related to asthma symptoms from 309 students, in junior high schools in Johor Bahru, Malaysia. Bacterial and fungal composition was characterized by sequencing the 16S rRNA gene and internal transcribed spacer (ITS) region, and the absolute microbial concentration was quantified by qPCR. In total, 326 bacterial and 255 fungal genera were characterized. Five bacterial (Sphingobium, Rhodomicrobium, Shimwellia, Solirubrobacter, Pleurocapsa) and two fungal (Torulaspora and Leptosphaeriaceae) taxa were protective for asthma severity. Two bacterial taxa, Izhakiella and Robinsoniella, were positively associated with asthma severity. Several protective bacterial taxa, including Rhodomicrobium, Shimwellia and Sphingobium, have been reported as protective microbes in previous studies, whereas the other taxa were reported for the first time. Environmental characteristics, such as age of building, size of textile curtain per room volume, occurrence of cockroaches, and concentration of house dust mite allergens transferred from homes by the occupants, were involved in shaping the overall microbial community but not asthma-associated taxa; whereas visible dampness and mold, which did not change the overall microbial community for floor dust, decreased the concentration of the asthma-protective bacterium Rhodomicrobium (β=-2.86, p=0.021), indicating complex interactions between microbes, environmental characteristics and asthma symptoms. Overall, this is the first indoor microbiome study to characterize the asthma-associated microbes and their environmental determinants in a tropical area, promoting the understanding of microbial exposure and respiratory health in this region.

Submitted 15 November, 2019; originally announced November 2019.

Comments: 56 pages, 1 figure, 3 supplemental figures, 9 supplemental tables

arXiv:1905.10705 [pdf, other]

Modeling treatment events in disease progression

Authors: Guanyang Wang, Yumeng Zhang, Yong Deng, Xuxin Huang, Łukasz Kidziński

Abstract: The ability to quantify and predict progression of a disease is fundamental for selecting an appropriate treatment. Many clinical metrics cannot be acquired frequently either because of their cost (e.g. MRI, gait analysis) or because they are inconvenient or harmful to a patient (e.g. biopsy, x-ray). In such scenarios, in order to estimate individual trajectories of disease progression, it is advantageous to leverage similarities between patients, i.e. the covariance of trajectories, and find a latent representation of progression. Most existing methods for estimating trajectories do not account for events in-between observations, which dramatically decreases their adequacy for clinical practice. In this study, we develop a machine learning framework named Coordinatewise-Soft-Impute (CSI) for analyzing disease progression from sparse observations in the presence of confounding events. CSI is guaranteed to converge to the global minimum of the corresponding optimization problem. Experimental results also demonstrate the effectiveness of CSI using both simulated and real datasets.

Submitted 25 May, 2019; originally announced May 2019.

arXiv:1904.07780 [pdf]

Quantifying cancer epithelial-mesenchymal plasticity and its association with stemness and immune response

Authors: Dongya Jia, Xuefei Li, Federico Bocci, Shubham Tripathi, Youyuan Deng, Mohit Kumar Jolly, Jose N. Onuchic, Herbert Levine

Abstract: Cancer cells can acquire a spectrum of stable hybrid epithelial/mesenchymal (E/M) states during epithelial-mesenchymal transition (EMT). Cells in these hybrid E/M phenotypes often combine epithelial and mesenchymal features and tend to migrate collectively, commonly as small clusters. Such collectively migrating cancer cells play a pivotal role in seeding metastases and their presence in cancer patients indicates an adverse prognostic factor. Moreover, cancer cells in hybrid E/M phenotypes tend to be more associated with stemness, which endows them with tumor-initiation ability and therapy resistance. Most recently, cells undergoing EMT have been shown to promote immune suppression for better survival. A systematic understanding of the emergence of hybrid E/M phenotypes and the connection of EMT with stemness and immune suppression would contribute to more effective therapeutic strategies. In this review, we first discuss recent efforts combining theoretical and experimental approaches to elucidate mechanisms underlying EMT multi-stability (i.e. the existence of multiple stable phenotypes during EMT) and the properties of hybrid E/M phenotypes. Following this, we discuss non-cell-autonomous regulation of EMT by cell cooperation and extracellular matrix. Afterwards, we discuss various metrics that can be used to quantify the EMT spectrum. We further describe possible mechanisms underlying the formation of clusters of circulating tumor cells. Last but not least, we summarize recent systems biology analysis of the role of EMT in the acquisition of stemness and immune suppression.

Submitted 16 April, 2019; originally announced April 2019.

Comments: 50 pages, 6 figures

arXiv:1709.05429 [pdf]

An Algorithmic Information Calculus for Causal Discovery and Reprogramming Systems

Authors: Hector Zenil, Narsis A. Kiani, Francesco Marabita, Yue Deng, Szabolcs Elias, Angelika Schmidt, Gordon Ball, Jesper Tegnér

Abstract: We demonstrate that the algorithmic information content of a system is deeply connected to its potential dynamics, thus affording an avenue for moving systems in the information-theoretic space and controlling them in the phase space. To this end we performed experiments and validated the results on (1) a very large set of small graphs, (2) a number of larger networks with different topologies, and (3) biological networks from a widely studied and validated genetic network (E. coli) as well as on a significant number of differentiating (Th17) and differentiated human cells from high quality databases (Harvard's CellNet) with results conforming to experimentally validated biological data. Based on these results we introduce a conceptual framework, a model-based interventional calculus and a reprogrammability measure with which to steer, manipulate, and reconstruct the dynamics of nonlinear dynamical systems from partial and disordered observations. The method consists in finding and applying a series of controlled interventions to a dynamical system to estimate how its algorithmic information content is affected when each of its elements is perturbed. The approach represents an alternative to numerical simulation and statistical approaches for inferring causal mechanistic/generative models and finding first principles. We demonstrate the framework's capabilities by reconstructing the phase space of some discrete dynamical systems (cellular automata) as a case study and reconstructing their generating rules. We thus advance tools for reprogramming artificial and living systems without full knowledge of or access to the system's actual kinetic equations or probability distributions, yielding a suite of universal and parameter-free algorithms of wide applicability ranging from causation and dimension reduction to feature selection and model generation.

Submitted 5 April, 2018; v1 submitted 15 September, 2017; originally announced September 2017.

Comments: 50 pages with Supplementary Information and Extended Figures. The Online Algorithmic Complexity Calculator implements the methods in this paper: http://complexitycalculator.com/ Animated video available at: https://youtu.be/ufzq2p5tVLI

arXiv:1706.01241 [pdf, other]

HiDi: An efficient reverse engineering schema for large scale dynamic regulatory network reconstruction using adaptive differentiation

Authors: Yue Deng, Hector Zenil, Jesper Tegnér, Narsis A. Kiani

Abstract: The use of ordinary differential equations (ODEs) is one of the most promising approaches to network inference. The success of ODE-based approaches has, however, been limited by the difficulty of estimating parameters and by their lack of scalability. Here we introduce a novel method and pipeline to reverse engineer gene regulatory networks from time-series and perturbation gene expression data, based upon an improved scheme for calculating the derivatives and a pre-filtration step to reduce the number of possible links. The method introduces a linear differential equation model with adaptive numerical differentiation that is scalable to extremely large regulatory networks. We demonstrate the ability of this method to outperform current state-of-the-art methods applied to experimental and synthetic data using test data from the DREAM4 and DREAM5 challenges. Our method displays greater accuracy and scalability. We benchmark the performance of the pipeline with respect to data set size and levels of noise. We show that the computation time is linear over various network sizes.

Submitted 7 June, 2017; v1 submitted 5 June, 2017; originally announced June 2017.

Comments: As accepted by the journal Bioinformatics (Oxford)

arXiv:1410.5665 [pdf, ps, other]

The Unreasonable Effectiveness of Blood Pressure Measurement: Molecular Communication in Biological Systems

Authors: Malcolm Egan, Adam Noel, Yansha Deng, Maged Elkashlan, Trung Q. Duong

Abstract: Arterial blood pressure is a key vital sign for the health of the human body. As such, accurate and reproducible measurement techniques are necessary for successful diagnosis. Blood pressure measurement is an example of molecular communication in regulated biological systems. In general, communication in regulated biological systems is difficult because the act of encoding information about the state of the system can corrupt the message itself. In this paper, we propose three strategies to cope with this problem to facilitate reliable molecular communication links: communicate from the outskirts; build it in; and leave a small footprint. Our strategies, inspired by communication in natural biological systems, provide a classification to guide the design of molecular communication mechanisms in synthetic biological systems. We illustrate our classification using examples of the first two strategies in natural systems. We then consider a molecular link within a model based on Michaelis-Menten kinetics. In particular, we compute the capacity of the link, which reveals the potential of communicating using our "leave a small footprint" strategy. This provides a way of identifying whether the molecular link can be improved without affecting the function, and a guide to the design of synthetic biological systems.

Submitted 21 October, 2014; originally announced October 2014.

Comments: Submitted to IEEE ICC 2015

arXiv:1406.0045 [pdf, ps, other]

A belief-based evolutionarily stable strategy

Authors: Xinyang Deng, Zhen Wang, Qi Liu, Yong Deng, Sankaran Mahadevan

Abstract: As an equilibrium refinement of the Nash equilibrium, the evolutionarily stable strategy (ESS) is a key concept in evolutionary game theory and has attracted growing interest. An ESS can be either a pure strategy or a mixed strategy. Even though randomness is allowed in a mixed strategy, the selection probability of a pure strategy within a mixed strategy may fluctuate due to the impact of many factors. The fluctuation can lead to more uncertainty. In this paper, such uncertainty involved in mixed strategies is further taken into consideration: a belief strategy is proposed in terms of Dempster-Shafer evidence theory. Furthermore, based on the proposed belief strategy, a belief-based ESS is developed. The belief strategy and belief-based ESS can reduce to the mixed strategy and mixed ESS, and they provide more realistic and powerful tools to describe interactions among agents.

Submitted 6 June, 2014; v1 submitted 30 May, 2014; originally announced June 2014.

Comments: 26 pages, 3 figures

arXiv:1212.6209 [pdf, other]

doi 10.1371/journal.pone.0065769

Efficient Multiple Object Tracking Using Mutually Repulsive Active Membranes

Authors: Yi Deng, Philip Coen, Mingzhai Sun, Joshua W. Shaevitz

Abstract: Studies of social and group behavior in interacting organisms require high-throughput analysis of the motion of a large number of individual subjects. Computer vision techniques offer solutions to specific tracking problems, and allow automated and efficient tracking with minimal human intervention. In this work, we adopt the open active contour model to track the trajectories of moving objects at high density. We add repulsive interactions between open contours to the original model, treat the trajectories as an extrusion in the temporal dimension, and show applications to two tracking problems. The walking behavior of Drosophila is studied at different population densities and gender compositions. We demonstrate that individual male flies have distinct walking signatures, and that the social interaction between flies in a mixed gender arena is gender specific. We also apply our model to studies of trajectories of gliding Myxococcus xanthus bacteria at high density. We examine the individual gliding behavioral statistics in terms of the gliding speed distribution. Using these two examples at very distinctive spatial scales, we illustrate the use of our algorithm on tracking both short rigid bodies (Drosophila) and long flexible objects (Myxococcus xanthus). Our repulsive active membrane model reaches error rates better than $5\times 10^{-6}$ per fly per second for Drosophila tracking and comparable results for Myxococcus xanthus.

Submitted 26 December, 2012; originally announced December 2012.

Comments: 18 pages, 6 figures, 1 table

arXiv:1104.1421 [pdf, ps, other]

doi 10.1103/PhysRevLett.107.158101

Direct measurement of cell wall stress-stiffening and turgor pressure in live bacterial cells

Authors: Yi Deng, Mingzhai Sun, Joshua W. Shaevitz

Abstract: We study intact and bulging Escherichia coli cells using atomic force microscopy to separate the contributions of the cell wall and turgor pressure to the overall cell stiffness. We find strong evidence of power-law stress-stiffening in the E. coli cell wall, with an exponent of $1.22 \pm 0.12$, such that the wall is significantly stiffer in intact cells ($E = 23 \pm 8$ MPa and $49 \pm 20$ MPa in the axial and circumferential directions) than in unpressurized sacculi. These measurements also indicate that the turgor pressure in living E. coli cells is $29 \pm 3$ kPa.

Submitted 13 July, 2011; v1 submitted 7 April, 2011; originally announced April 2011.

Comments: Main Text: 4 pages, 4 figures. Supplemental Materials: 9 pages

Showing 1–20 of 20 results for author: Deng, Y