Search | arXiv e-print repository

The association of domain-specific physical activity and sedentary activity with stroke: A prospective cohort study

Authors: Xinyi He, Shidi Wang, Yi Li, Jiucun Wang, Guangrui Yang, Jun Chen, Zixin Hu

Abstract: Background: The incidence of stroke places a heavy burden on both society and individuals. Activity is closely related to cardiovascular health. This study aimed to investigate the relationship between the varying domains of PA, like occupation-related Physical Activity (OPA), transportation-related Physical Activity (TPA), leisure-time Physical Activity (LTPA), and Sedentary Activity (SA) with stroke. Methods: Our analysis included 30,400 participants aged 20+ years from 2007 to 2018 National Health and Nutrition Examination Survey (NHANES). Stroke was identified based on the participant's self-reported diagnoses from previous medical consultations, and PA and SA were self-reported. Multivariable logistic and restricted cubic spline models were used to assess the associations. Results: Participants achieving PA guidelines (performing PA more than 150 min/week) were 35.7% less likely to have a stroke based on both the total PA (odds ratio [OR] 0.643, 95% confidence interval [CI] 0.523-0.790) and LTPA (OR 0.643, 95% CI 0.514-0.805), while OPA or TPA did not demonstrate lower stroke risk. Furthermore, participants with less than 7.5 h/day SA levels were 21.6% (OR 0.784, 95% CI 0.665-0.925) less likely to have a stroke. The intensities of total PA and LTPA exhibited nonlinear U-shaped associations with stroke risk. In contrast, those of OPA and TPA showed negative linear associations, while SA intensities were positively linearly correlated with stroke risk.
Conclusions: LTPA, but not OPA or TPA, was associated with a lower risk of stroke at any amount, suggesting that significant cardiovascular health would benefit from increased PA. Additionally, the positive association between SA and stroke indicated that prolonged sitting was detrimental to cardiovascular health. Overall, increased PA within a reasonable range reduces the risk of stroke, while increased SA elevates it.

Submitted 19 June, 2024; originally announced June 2024.
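
The percentages quoted in the stroke abstract above follow directly from its odds ratios. As a minimal sketch (the helper name `percent_lower_odds` is hypothetical, and strictly speaking the figure is a reduction in odds, which only approximates a reduction in risk when the outcome is rare):

```python
def percent_lower_odds(odds_ratio: float) -> float:
    """Percent reduction in odds implied by an odds ratio below 1."""
    return round((1.0 - odds_ratio) * 100.0, 1)


# Odds ratios taken from the abstract: total PA / LTPA, and SA < 7.5 h/day.
print(percent_lower_odds(0.643))
print(percent_lower_odds(0.784))
```

The two calls reproduce the 35.7% and 21.6% figures reported in the abstract.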

arXiv:2311.16207 [pdf, other]

The Graph Convolutional Network with Multi-representation Alignment for Drug Synergy Prediction

Authors: Xinxing Yang, Genke Yang, Jian Chu

Abstract: Drug combination refers to the use of two or more drugs to treat a specific disease at the same time. It is currently the mainstream way to treat complex diseases. Compared with single drugs, drug combinations have better efficacy and can better inhibit toxicity and drug resistance. The computational model based on deep learning concatenates the representation of multiple drugs and the corresponding cell line feature as input, and the output is whether the drug combination can have an inhibitory effect on the cell line. However, this strategy of concatenating multiple representations has the following defects: the alignment of drug representation and cell line representation is ignored, resulting in the synergistic relationship not being reflected positionally in the embedding space. Moreover, the alignment measurement function in deep learning cannot be suitable for drug synergy prediction tasks due to differences in input types. Therefore, in this work, we propose a graph convolutional network with multi-representation alignment (GCNMRA) for predicting drug synergy. In the GCNMRA model, we designed a multi-representation alignment function suitable for the drug synergy prediction task so that the positional relationship between drug representations and cell line representation is reflected in the embedding space. In addition, the vector modulus of drug representations and cell line representation is considered to improve the accuracy of calculation results and accelerate model convergence. Finally, many relevant experiments were run on multiple drug synergy datasets to verify the effectiveness of the above innovative elements and the excellence of the GCNMRA model.

Submitted 27 November, 2023; originally announced November 2023.

Comments: 14 pages;

arXiv:2309.04190 [pdf, other]

SegmentAnything helps microscopy images based automatic and quantitative organoid detection and analysis

Authors: Xiaodan Xing, Chunling Tang, Yunzhe Guo, Nicholas Kurniawan, Guang Yang

Abstract: Organoids are self-organized 3D cell clusters that closely mimic the architecture and function of in vivo tissues and organs. Quantification of organoid morphology helps in studying organ development, drug discovery, and toxicity assessment. Recent microscopy techniques provide a potent tool to acquire organoid morphology features, but manual image analysis remains a labor and time-intensive process. Thus, this paper proposes a comprehensive pipeline for microscopy analysis that leverages the SegmentAnything to precisely demarcate individual organoids. Additionally, we introduce a set of morphological properties, including perimeter, area, radius, non-smoothness, and non-circularity, allowing researchers to analyze the organoid structures quantitatively and automatically. To validate the effectiveness of our approach, we conducted tests on bright-field images of human induced pluripotent stem cells (iPSCs) derived neural-epithelial (NE) organoids. The results obtained from our automatic pipeline closely align with manual organoid detection and measurement, showcasing the capability of our proposed method in accelerating organoids morphology analysis.

Submitted 8 April, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: Replace Figure 4 with the correct version. The original version is wrong due to a column name mismatch
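
The organoid abstract above lists perimeter, area, radius and non-circularity among its morphological properties but does not define them. The following is a rough sketch under stated assumptions, not the paper's method: the contour is a closed polygon of (x, y) points, the radius is that of the circle with equal area, and non-circularity is taken as 1 - 4πA/P². The function name `morphology` is hypothetical.

```python
import math


def morphology(contour):
    """Area, perimeter, equivalent radius and non-circularity of a closed polygon.

    Assumed definitions (not from the paper): shoelace area, equal-area
    radius, and non-circularity = 1 - 4*pi*area / perimeter**2.
    """
    n = len(contour)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    radius = math.sqrt(area / math.pi)
    non_circularity = 1.0 - 4.0 * math.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter,
            "radius": radius, "non_circularity": non_circularity}


# A unit square as a toy contour; a perfect circle would give non-circularity 0.
print(morphology([(0, 0), (1, 0), (1, 1), (0, 1)]))
```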

arXiv:2308.06578 [pdf]

To reverse engineer an entire nervous system

Authors: Gal Haspel, Edward S Boyden, Jeffrey Brown, George Church, Netta Cohen, Christopher Fang-Yen, Steven Flavell, Miriam B Goodman, Anne C Hart, Oliver Hobert, Eduardo J Izquierdo, Konstantinos Kagias, Shawn Lockery, Yangning Lu, Adam Marblestone, Jordan Matelsky, Hanspeter Pfister, Horacio G Rotstein, Monika Scholz, Eli Shlizerman, Quilee Simeon, Michael A Skuhersky, Vineet Tiruvadi, Vivek Venkatachalam, Guangyu Robert Yang, et al. (3 additional authors not shown)

Abstract: A primary goal of neuroscience is to understand how nervous systems, or assemblies of neural circuits, generate and control behavior. Testing and refining our theories of neural control would be greatly facilitated if we could reliably simulate an entire nervous system so we could replicate the brain dynamics in response to any stimuli and different contexts. More fundamentally, reconstructing or modeling a system is an important milestone in understanding it, and so, simulating an entire nervous system is in itself one of the goals, indeed dreams, of systems neuroscience. To do so requires us to identify how each neuron's output depends on its inputs, within some nervous system. This deconstruction, understanding function from input-output pairs, falls into the realm of reverse engineering. Current efforts at reverse engineering the brain focus on the mammalian nervous system, but these brains are complex, allowing only recordings of tiny subsystems. Here we argue that the time is ripe to embark on a concerted effort to reverse engineer a smaller system and that the nematode C. elegans is the ideal candidate system. In particular, the established and growing toolkit of optophysiology techniques can non-invasively capture and control each neuron's activity and scale to hundreds of thousands of experiments, across a large population of animals. Data across populations and behaviors can be combined because across individuals neuronal identities are largely conserved in form and function. Modern machine-learning-based model training should then enable a simulation of C. elegans' impressive breadth of brain states and behaviors. The ability to reverse engineer an entire nervous system will benefit systems neuroscience as well as the design of artificial intelligence systems, enabling fundamental insights as well as new approaches for investigations of progressively larger nervous systems.

Submitted 9 December, 2023; v1 submitted 12 August, 2023; originally announced August 2023.

Comments: 23 pages, 2 figures, opinion paper

arXiv:2307.08989 [pdf, other]

GraphCL-DTA: a graph contrastive learning with molecular semantics for drug-target binding affinity prediction

Authors: Xinxing Yang, Genke Yang, Jian Chu

Abstract: Drug-target binding affinity prediction plays an important role in the early stages of drug discovery, which can infer the strength of interactions between new drugs and new targets. However, the performance of previous computational models is limited by the following drawbacks. The learning of drug representation relies only on supervised data, without taking into account the information contained in the molecular graph itself. Moreover, most previous studies tended to design complicated representation learning module, while uniformity, which is used to measure representation quality, is ignored. In this study, we propose GraphCL-DTA, a graph contrastive learning with molecular semantics for drug-target binding affinity prediction. In GraphCL-DTA, we design a graph contrastive learning framework for molecular graphs to learn drug representations, so that the semantics of molecular graphs are preserved. Through this graph contrastive framework, a more essential and effective drug representation can be learned without additional supervised data. Next, we design a new loss function that can be directly used to smoothly adjust the uniformity of drug and target representations. By directly optimizing the uniformity of representations, the representation quality of drugs and targets can be improved. The effectiveness of the above innovative elements is verified on two real datasets, KIBA and Davis. The excellent performance of GraphCL-DTA on the above datasets suggests its superiority to the state-of-the-art model.

Submitted 18 July, 2023; originally announced July 2023.

Comments: 13 pages, 4 figures, 5 tables

arXiv:2306.05707 [pdf, ps, other]

On the Mathematics of RNA Velocity II: Algorithmic Aspects

Authors: Tiejun Li, Yizhuo Wang, Guoguo Yang, Peijie Zhou

Abstract: In a previous paper [CSIAM Trans. Appl. Math. 2 (2021), 1-55], the authors proposed a theoretical framework for the analysis of RNA velocity, which is a promising concept in scRNA-seq data analysis to reveal the cell state-transition dynamical processes underlying snapshot data. The current paper is devoted to the algorithmic study of some key components in RNA velocity workflow. Four important points are addressed in this paper: (1) We construct a rational time-scale fixation method which can determine the global gene-shared latent time for cells. (2) We present an uncertainty quantification strategy for the inferred parameters obtained through the EM algorithm. (3) We establish the optimal criterion for the choice of velocity kernel bandwidth with respect to the sample size in the downstream analysis and discuss its implications. (4) We propose a temporal distance estimation approach between two cell clusters along the cellular development path. Some illustrative numerical tests are also carried out to verify our analysis. These results are intended to provide tools and insights in further development of RNA velocity type methods in the future.

Submitted 9 June, 2023; originally announced June 2023.

Comments: 32 pages, 5 figures

arXiv:2305.11772 [pdf, other]

Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes

Authors: Aran Nayebi, Rishi Rajalingham, Mehrdad Jazayeri, Guangyu Robert Yang

Abstract: Humans and animals have a rich and flexible understanding of the physical world, which enables them to infer the underlying dynamical trajectories of objects and events, plausible future states, and use that to plan and anticipate the consequences of actions. However, the neural mechanisms underlying these computations are unclear. We combine a goal-driven modeling approach with dense neurophysiological data and high-throughput human behavioral readouts to directly impinge on this question. Specifically, we construct and evaluate several classes of sensory-cognitive networks to predict the future state of rich, ethologically-relevant environments, ranging from self-supervised end-to-end models with pixel-wise or object-centric objectives, to models that future predict in the latent space of purely static image-based or dynamic video-based pretrained foundation models. We find strong differentiation across these model classes in their ability to predict neural and behavioral data both within and across diverse environments. In particular, we find that neural responses are currently best predicted by models trained to predict the future state of their environment in the latent space of pretrained foundation models optimized for dynamic scenes in a self-supervised manner. Notably, models that future predict in the latent space of video foundation models that are optimized to support a diverse range of sensorimotor tasks, reasonably match both human behavioral error patterns and neural dynamics across all environmental scenarios that we were able to test. Overall, these findings suggest that the neural mechanisms and behaviors of primate mental simulation are thus far most consistent with being optimized to future predict on dynamic, reusable visual representations that are useful for Embodied AI more generally.

Submitted 25 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: 20 pages, 10 figures, NeurIPS 2023 Camera Ready Version (spotlight)

arXiv:2206.00262 [pdf, other]

Self-supervised Learning for Label Sparsity in Computational Drug Repositioning

Authors: Xinxing Yang, Genke Yang, Jian Chu

Abstract: The computational drug repositioning aims to discover new uses for marketed drugs, which can accelerate the drug development process and play an important role in the existing drug discovery system. However, the number of validated drug-disease associations is scarce compared to the number of drugs and diseases in the real world. Too few labeled samples will make the classification model unable to learn effective latent factors of drugs, resulting in poor generalization performance. In this work, we propose a multi-task self-supervised learning framework for computational drug repositioning. The framework tackles label sparsity by learning a better drug representation. Specifically, we take the drug-disease association prediction problem as the main task, and the auxiliary task is to use data augmentation strategies and contrast learning to mine the internal relationships of the original drug features, so as to automatically learn a better drug representation without supervised labels. And through joint training, it is ensured that the auxiliary task can improve the prediction accuracy of the main task. More precisely, the auxiliary task improves drug representation and serves as additional regularization to improve generalization. Furthermore, we design a multi-input decoding network to improve the reconstruction ability of the autoencoder model. We evaluate our model using three real-world datasets. The experimental results demonstrate the effectiveness of the multi-task self-supervised learning framework, and its predictive ability is superior to the state-of-the-art model.

Submitted 1 June, 2022; originally announced June 2022.

Comments: 14 pages

arXiv:2205.13583 [pdf]

Harnessing Artificial Intelligence to Infer Novel Spatial Biomarkers for the Diagnosis of Eosinophilic Esophagitis

Authors: Ariel Larey, Eliel Aknin, Nati Daniel, Garrett A. Osswald, Julie M. Caldwell, Mark Rochman, Tanya Wasserman, Margaret H. Collins, Nicoleta C. Arva, Guang-Yu Yang, Marc E. Rothenberg, Yonatan Savir

Abstract: Eosinophilic esophagitis (EoE) is a chronic allergic inflammatory condition of the esophagus associated with elevated esophageal eosinophils. Second only to gastroesophageal reflux disease, EoE is one of the leading causes of chronic refractory dysphagia in adults and children. EoE diagnosis requires enumerating the density of esophageal eosinophils in esophageal biopsies, a somewhat subjective task that is time-consuming, thus reducing the ability to process the complex tissue structure. Previous artificial intelligence (AI) approaches that aimed to improve histology-based diagnosis focused on recapitulating identification and quantification of the area of maximal eosinophil density. However, this metric does not account for the distribution of eosinophils or other histological features, over the whole slide image. Here, we developed an artificial intelligence platform that infers local and spatial biomarkers based on semantic segmentation of intact eosinophils and basal zone distributions. Besides the maximal density of eosinophils (referred to as Peak Eosinophil Count [PEC]) and a maximal basal zone fraction, we identify two additional metrics that reflect the distribution of eosinophils and basal zone fractions. This approach enables a decision support system that predicts EoE activity and classifies the histological severity of EoE patients. We utilized a cohort that includes 1066 biopsy slides from 400 subjects to validate the system's performance and achieved a histological severity classification accuracy of 86.70%, sensitivity of 84.50%, and specificity of 90.09%. Our approach highlights the importance of systematically analyzing the distribution of biopsy features over the entire slide and paves the way towards a personalized decision support system that will assist not only in counting cells but can also potentially improve diagnosis and provide treatment prediction.

Submitted 26 May, 2022; originally announced May 2022.

Comments: AL, EA, and ND have contributed equally to this work and share first authorship. YS is the corresponding author, e-mail: [email protected]
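
The EoE abstract above reports accuracy, sensitivity and specificity. For readers less familiar with these metrics, here is a minimal sketch of how they are computed from a binary confusion matrix. The counts in the usage example are made up for illustration and are not the paper's data; the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # true-positive rate (recall)
        "specificity": tn / (tn + fp),   # true-negative rate
    }


# Hypothetical counts, chosen only to exercise the formulas.
print(binary_metrics(tp=80, fp=10, tn=90, fn=20))
```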

arXiv:2111.14696 [pdf, other]

The Computational Drug Repositioning without Negative Sampling

Authors: Xinxing Yang, Genke Yang, Jian Chu

Abstract: Computational drug repositioning technology is an effective tool to accelerate drug development. Although this technique has been widely used and successful in recent decades, many existing models still suffer from multiple drawbacks such as the massive number of unvalidated drug-disease associations and the inner product. The limitations of these works are mainly due to the following two reasons: firstly, previous works used negative sampling techniques to treat unvalidated drug-disease associations as negative samples, which is invalid in real-world settings; secondly, the inner product cannot fully take into account the feature information contained in the latent factor of drug and disease. In this paper, we propose a novel PUON framework for addressing the above deficiencies, which models the risk estimator of computational drug repositioning only using validated (Positive) and unvalidated (Unlabelled) drug-disease associations without employing negative sampling techniques. The PUON also proposed an Outer Neighborhood-based classifier for modeling the cross-feature information of the latent factor. For a comprehensive comparison, we considered 8 popular baselines. Extensive experiments in four real-world datasets showed that PUON model achieved the best performance based on 6 evaluation metrics.

Submitted 31 May, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: 12 pages, 10 figures

arXiv:2110.13976 [pdf, other]

Biological learning in key-value memory networks

Authors: Danil Tyulmankov, Ching Fang, Annapurna Vadaparty, Guangyu Robert Yang

Abstract: In neuroscience, classical Hopfield networks are the standard biologically plausible model of long-term memory, relying on Hebbian plasticity for storage and attractor dynamics for recall. In contrast, memory-augmented neural networks in machine learning commonly use a key-value mechanism to store and read out memories in a single step. Such augmented networks achieve impressive feats of memory compared to traditional variants, yet their biological relevance is unclear. We propose an implementation of basic key-value memory that stores inputs using a combination of biologically plausible three-factor plasticity rules. The same rules are recovered when network parameters are meta-learned. Our network performs on par with classical Hopfield networks on autoassociative memory tasks and can be naturally extended to continual recall, heteroassociative memory, and sequence learning. Our results suggest a compelling alternative to the classical Hopfield network as a model of biological long-term memory.

Submitted 26 October, 2021; originally announced October 2021.

Comments: NeurIPS 2021
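
The key-value abstract above contrasts attractor-based recall with memories that are stored and read out in a single step. The sketch below illustrates only that generic key-value mechanism, with a softmax readout over key similarities. It is an assumption-laden toy, not the paper's three-factor plasticity rule; the class name `KeyValueMemory` and the sharpness parameter `beta` are hypothetical.

```python
import math


class KeyValueMemory:
    """Toy key-value memory: one-shot writes, single-step softmax readout."""

    def __init__(self):
        self.keys = []
        self.values = []

    def write(self, key, value):
        # Store the pair as-is; no iterative learning is involved.
        self.keys.append(list(key))
        self.values.append(list(value))

    def read(self, query, beta=10.0):
        # Similarity of the query to every stored key (dot product, scaled by beta).
        scores = [beta * sum(q * k for q, k in zip(query, key)) for key in self.keys]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(weights)
        dim = len(self.values[0])
        # Weighted average of stored values, dominated by the best-matching key.
        return [sum(w * v[i] for w, v in zip(weights, self.values)) / total
                for i in range(dim)]


memory = KeyValueMemory()
memory.write([1.0, 0.0], [1.0, 0.0, 0.0])
memory.write([0.0, 1.0], [0.0, 1.0, 0.0])
print(memory.read([1.0, 0.0]))
```

Using different keys and values, as here, corresponds to heteroassociative recall; storing each pattern as both key and value would give the autoassociative case.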

arXiv:2104.12955 [pdf]

Local vaccination and systemic tumor suppression via irradiation and manganese adjuvant in mice

Authors: Chunyang Lu, Jing Qian, Jianfeng Lv, Jintao Han, Xiaoyi Sun, Junyi Chen, Siwei Ding, Zhusong Mei, Yulan Liang, Yuqi Ma, Ye Zhao, Chen Lin, Yanying Zhao, Yixing Geng, Wenjun Ma, Yugang Wang, Xueqing Yan, Gen Yang

Abstract: In this study, 4T-1 luc cells were irradiated with protons at an ultra-high (FLASH) dose rate or with gamma rays at a conventional dose rate, and then used for subcutaneous vaccination of mice, with or without a Mn immuno-enhancing adjuvant, three times. One week later, we injected untreated 4T-1 luc cells on the other side of the vaccinated mice, and found that the untreated 4T-1 luc cells injected later almost never grew tumors (1/17), while controls without previous vaccination all grew tumors (18/18). The result is very interesting and the findings may help to explore in situ tumor vaccination as well as new combined radiotherapy strategies to effectively ablate primary and disseminated tumors. To our limited knowledge, this is the first paper reporting the high-efficiency induction of systemic vaccination suppressing metastasized/disseminated tumor progression.

Submitted 26 April, 2021; originally announced April 2021.

Comments: 16 pages, 3 figures and 1 table

arXiv:2103.02015 [pdf]

PECNet: A Deep Multi-Label Segmentation Network for Eosinophilic Esophagitis Biopsy Diagnostics

Authors: Nati Daniel, Ariel Larey, Eliel Aknin, Garrett A. Osswald, Julie M. Caldwell, Mark Rochman, Margaret H. Collins, Guang-Yu Yang, Nicoleta C. Arva, Kelley E. Capocelli, Marc E. Rothenberg, Yonatan Savir

Abstract: Background. Eosinophilic esophagitis (EoE) is an allergic inflammatory condition of the esophagus associated with elevated numbers of eosinophils. Disease diagnosis and monitoring requires determining the concentration of eosinophils in esophageal biopsies, a time-consuming, tedious and somewhat subjective task currently performed by pathologists. Methods. Herein, we aimed to use machine learning to identify, quantitate and diagnose EoE. We labeled more than 100M pixels of 4345 images obtained by scanning whole slides of H&E-stained sections of esophageal biopsies derived from 23 EoE patients. We used this dataset to train a multi-label segmentation deep network. To validate the network, we examined a replication cohort of 1089 whole slide images from 419 patients derived from multiple institutions. Findings. PECNet segmented both intact and not-intact eosinophils with a mean intersection over union (mIoU) of 0.93. This segmentation was able to quantitate intact eosinophils with a mean absolute error of 0.611 eosinophils and classify EoE disease activity with an accuracy of 98.5%. Using whole slide images from the validation cohort, PECNet achieved an accuracy of 94.8%, sensitivity of 94.3%, and specificity of 95.14% in reporting EoE disease activity. Interpretation. We have developed a deep learning multi-label semantic segmentation network that successfully addresses two of the main challenges in EoE diagnostics and digital pathology, the need to detect several types of small features simultaneously and the ability to analyze whole slides efficiently. Our results pave the way for an automated diagnosis of EoE and can be utilized for other conditions with similar challenges.

Submitted 2 March, 2021; originally announced March 2021.
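
The PECNet abstract above reports a mean intersection over union (mIoU). As a minimal sketch of that metric on flattened per-pixel label lists (the toy labels are made up, and how the paper averages over classes or images is not stated in the abstract, so the per-class mean here is an assumption):

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two equal-length boolean masks."""
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, true_mask) if p or t)
    # Convention assumed here: a class absent from both masks scores 1.0.
    return intersection / union if union else 1.0


def mean_iou(pred_labels, true_labels, classes):
    """Average the per-class IoU over the given class ids."""
    per_class = [
        iou([p == c for p in pred_labels], [t == c for t in true_labels])
        for c in classes
    ]
    return sum(per_class) / len(per_class)


# Toy example with four pixels and three hypothetical classes.
print(mean_iou([0, 1, 1, 2], [0, 1, 2, 2], classes=[0, 1, 2]))
```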

arXiv:2007.15422 [pdf, ps, other]

doi 10.1093/bioinformatics/btaa671

Few shot domain adaptation for in situ macromolecule structural classification in cryo-electron tomograms

Authors: Liangyong Yu, Ran Li, Xiangrui Zeng, Hongyi Wang, Jie Jin, Ge Yang, Rui Jiang, Min Xu

Abstract: Motivation: Cryo-Electron Tomography (cryo-ET) visualizes structure and spatial organization of macromolecules and their interactions with other subcellular components inside single cells in the close-to-native state at sub-molecular resolution. Such information is critical for the accurate understanding of cellular processes. However, subtomogram classification remains one of the major challenges for the systematic recognition and recovery of the macromolecule structures in cryo-ET because of imaging limits and data quantity. Recently, deep learning has significantly improved the throughput and accuracy of large-scale subtomogram classification. However often it is difficult to get enough high-quality annotated subtomogram data for supervised training due to the enormous expense of labeling. To tackle this problem, it is beneficial to utilize another already annotated dataset to assist the training process. However, due to the discrepancy of image intensity distribution between source domain and target domain, the model trained on subtomograms in source domain may perform poorly in predicting subtomogram classes in the target domain. Results: In this paper, we adapt a few shot domain adaptation method for deep learning based cross-domain subtomogram classification. The essential idea of our method consists of two parts: 1) take full advantage of the distribution of plentiful unlabeled target domain data, and 2) exploit the correlation between the whole source domain dataset and few labeled target domain data. Experiments conducted on simulated and real datasets show that our method achieves significant improvement on cross domain subtomogram classification compared with baseline methods.

Submitted 30 July, 2020; originally announced July 2020.

Comments: This article has been accepted for publication in Bioinformatics, published by Oxford University Press

Journal ref: Bioinformatics 2020

arXiv:2006.01001 [pdf, other]

doi 10.1016/j.neuron.2020.09.005

Artificial neural networks for neuroscientists: A primer

Authors: Guangyu Robert Yang, Xiao-Jing Wang

Abstract: Artificial neural networks (ANNs) are essential tools in machine learning that have drawn increasing attention in neuroscience. Besides offering powerful techniques for data analysis, ANNs provide a new approach for neuroscientists to build models for complex behaviors, heterogeneous neural activity and circuit connectivity, as well as to explore optimization in neural systems, in ways that traditional models are not designed for. In this pedagogical Primer, we introduce ANNs and demonstrate how they have been fruitfully deployed to study neuroscientific questions. We first discuss basic concepts and methods of ANNs. Then, with a focus on bringing this mathematical framework closer to neurobiology, we detail how to customize the analysis, structure, and learning of ANNs to better address a wide range of challenges in brain research. To help the readers garner hands-on experience, this Primer is accompanied with tutorial-style code in PyTorch and Jupyter Notebook, covering major topics.

Submitted 24 September, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

Journal ref: Neuron, Volume 107, Issue 6, 23 September 2020, Pages 1048-1070

arXiv:2003.00163 [pdf]

doi 10.1093/bioinformatics/btaa645

COVID-19 Docking Server: A meta server for docking small molecules, peptides and antibodies against potential targets of COVID-19

Authors: Ren Kong, Guangbo Yang, Rui Xue, Ming Liu, Feng Wang, Jianping Hu, Xiaoqiang Guo, Shan Chang

Abstract: Motivation: The coronavirus disease 2019 (COVID-19), caused by a new type of coronavirus, has been emerging from China and has led to thousands of deaths globally since December 2019. Although many groups have engaged in studying the newly emerged virus and searching for treatments for COVID-19, understanding COVID-19 target-ligand interactions remains a key challenge. Herein, we introduce COVID-19 Docking Server, a web server that predicts the binding modes between COVID-19 targets and ligands including small molecules, peptides and antibodies. Results: Structures of proteins involved in the virus life cycle were collected or constructed based on the homologs of coronavirus, and prepared ready for docking. The meta platform provides a free and interactive tool for the prediction of COVID-19 target-ligand interactions and subsequent drug discovery for COVID-19.

Submitted 7 August, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

Journal ref: Bioinformatics, 2020, btaa645

arXiv:2001.00114 [pdf]

Expertise and Task Pressure in fNIRS-based brain Connectomes

Authors: F. Deligianni, H. Singh, H. N. Modi, S. Jahani, M. Yucel, A. Darzi, D. R. Leff, G. Z. Yang

Abstract: Acquisition of bimanual motor skills, critical in several applications ranging from robotic teleoperations to surgery, is associated with a protracted learning curve. Brain connectivity based on functional Near Infrared Spectroscopy (fNIRS) data has shown promising results in distinguishing experts from novice surgeons. However, it is less well understood how expertise-related disparity in brain connectivity is modulated by dynamic temporal demands experienced during a surgical task. In this study, we use fNIRS to examine the interplay between frontal and motor brain regions in a cohort of surgical residents of varying expertise performing a laparoscopic surgical task under temporal demand. The results demonstrate that prefrontal-motor connectivity in senior residents is more resilient to time pressure. Furthermore, certain global characteristics of brain connectomes, such as the small-world index, may be used to detect the presence of an underlying stressor.

Submitted 31 December, 2019; originally announced January 2020.

arXiv:1811.08763 [pdf, other]

doi 10.1109/BIBE.2019.00029

Comparison of Brain Networks based on Predictive Models of Connectivity

Authors: Fani Deligianni, Jonathan D. Clayden, Guang-Zhong Yang

Abstract: In this study we adopt predictive modelling to simultaneously identify commonalities and differences in multi-modal brain networks acquired within subjects. Typically, predictive modelling of functional connectomes from structural connectomes explores commonalities across multimodal imaging data. However, direct application of multivariate approaches such as sparse Canonical Correlation Analysis (sCCA) operates on the vectorised elements of functional connectivity across subjects, and it does not guarantee that the predicted models of functional connectivity are Symmetric Positive Definite (SPD) matrices. We suggest an elegant solution based on the transportation of the connectivity matrices on a Riemannian manifold, which notably improves the prediction performance of the model. Randomised lasso is used to alleviate the dependency of the sCCA on the lasso parameters and control the false positive rate. Subsequently, the binomial distribution is exploited to set a threshold statistic that reflects whether a connection is selected or rejected by chance. Finally, we estimate the sCCA loadings based on a de-noising approach that improves the estimation of the coefficients. We validate our approach based on resting-state fMRI and diffusion weighted MRI data. Quantitative validation of the prediction performance shows superior performance, whereas qualitative results of the identification process are promising.

Submitted 5 November, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

Comments: 7 pages, 4 figures

MSC Class: 97K30; ACM Class: G.3

Journal ref: 19th IEEE International Conference on Bioinformatics and Bioengineering (IEEE BIBE, 2019)

arXiv:1808.06047 [pdf]

Frequency spectrum of biological noise: a probe of reaction dynamics in living cells

Authors: Sanggeun Song, Gil-Suk Yang, Seong Jun Park, Ji-Hyun Kim, Jaeyoung Sung

Abstract: Even in the steady-state, the number of biomolecules in living cells fluctuates dynamically; and the frequency spectrum of this chemical fluctuation carries valuable information about the mechanism and the dynamics of the intracellular reactions creating these biomolecules. Although recent advances in single-cell experimental techniques enable the direct monitoring of the time-traces of the biological noise in each cell, the development of the theoretical tools needed to extract the information encoded in the stochastic dynamics of intracellular chemical fluctuation is still in its adolescence. Here, we present a simple and general equation that relates the power-spectrum of the product number fluctuation to the product lifetime and the reaction dynamics of the product creation process. By analyzing the time traces of the protein copy number using this theory, we can extract the power spectrum of the mRNA number, which cannot be directly measured by currently available experimental techniques. From the power spectrum of the mRNA number, we can further extract quantitative information about the transcriptional regulation dynamics. Our power spectrum analysis of gene expression noise is demonstrated for the gene network model of luciferase expression under the control of the Bmal 1a promoter in mouse fibroblast cells. Additionally, we investigate how the non-Poisson reaction dynamics and the cell-to-cell heterogeneity in transcription and translation affect the power-spectra of the mRNA and protein number.

Submitted 18 August, 2018; originally announced August 2018.

Comments: Main text: 29 pages, 4 figures; Supporting Information: 42 pages, 4 supplementary figures

arXiv:1712.07369 [pdf, other]

Improvement of Resting-state EEG Analysis Process with Spectrum Weight-Voting based on LES

Authors: Yumeng Ye, Haichun Liu, TianHong Zhang, Changchun Pan, Genke Yang, JiJun Wang, Robert C. Qiu

Abstract: EEG is a non-invasive technique for recording brain bioelectric activity, which has potential applications in various fields such as human-computer interaction and neuroscience. However, there are many difficulties in analyzing EEG data, including its complex composition, low amplitude and low signal-to-noise ratio. Some of the existing methods of analysis are based on feature extraction and machine learning to differentiate the phase of schizophrenia that samples belong to. However, medical research requires machine learning not only to give more accurate classification results, but also to give results that can be applied to pathological studies. The main purpose of this study is to obtain weight values that represent the influence of each frequency band on the classification of schizophrenia phases, on the basis of a more effective classification method using LES feature extraction; the weight values are then processed and applied to improve the accuracy of machine learning classification. We propose a method called weight-voting, which obtains the weights of sub-band features by using classification results to vote so as to fit the actual categories of EEG data, and then uses the weights for reclassification. Through this method, we can first obtain the influence of each band in distinguishing three schizophrenia phases, and analyze the effect of band features on the risk of schizophrenia, contributing to the study of psychopathology. Our results show that there is a high correlation between the change of weight of the low gamma band and the difference between HC, CHR and FES. If the features revised according to the weights are used for reclassification, the accuracy of the result improves compared with the original classifier, which confirms the role of the band weight distribution.

Submitted 17 January, 2018; v1 submitted 20 December, 2017; originally announced December 2017.

Comments: 9 pages, 6 figures

arXiv:1712.05289 [pdf, other]

A Data Driven Approach for Resting-state EEG signal Classification of Schizophrenia with Control Participants using Random Matrix Theory

Authors: Haichun Liu, TianHong Zhang, Yumeng Ye, Changchun Pan, Genke Yang, JiJun Wang, Robert C. Qiu

Abstract: Resting state electroencephalogram (EEG) abnormalities in clinically high-risk individuals (CHR), clinically stable first-episode patients with schizophrenia (FES), and healthy controls (HC) suggest alterations in neural oscillatory activity. However, few studies directly compare these anomalies among these groups. Therefore, this study investigated whether these electrophysiological characteristics differentiate clinical populations from one another, and from non-psychiatric controls. To address this question, resting EEG power and coherence were assessed in 40 clinically high-risk individuals (CHR), 40 first-episode patients with schizophrenia (FES), and 40 healthy controls (HC). These findings suggest that resting EEG can be a sensitive measure for differentiating between clinical disorders. This paper proposes a novel data-driven supervised learning method to identify the patient's mental status in schizophrenia research. According to the Marchenko-Pastur law, the distribution of the eigenvalues of EEG data is divided into a signal subspace and a noise subspace. A test statistic named LES that embodies the characteristics of all eigenvalues is adopted. Different classifiers and different features (LES test functions) are selected for experiments, and we show that using the von Neumann entropy as the LES test function combined with an SVM classifier obtains the best average classification accuracy in three-class classification among the HC, FES and CHR groups with EEG signals. It is worth noting that the highest classification accuracy with LES feature extraction is around 90% in two-class classification (HC compared with FES) and around 70% in three-class classification. A classification accuracy higher than 70% could be used to assist clinical diagnosis.

Submitted 17 January, 2018; v1 submitted 13 December, 2017; originally announced December 2017.

Comments: 9 pages, 5 figures. arXiv admin note: text overlap with arXiv:1503.08445 by other authors

arXiv:1708.00353 [pdf]

A 33-year NPP monitoring study in southwest China by the fusion of multi-source remote sensing and station data

Authors: Xiaobin Guan, Huanfeng Shen, Wenxia Gan, Gang Yang, Lunche Wang, Xinghua Li, Liangpei Zhang

Abstract: Knowledge of regional net primary productivity (NPP) is important for the systematic understanding of the global carbon cycle. In this study, multi-source data were employed to conduct a 33-year regional NPP study in southwest China, at a 1-km scale. A multi-sensor fusion framework was applied to obtain a new normalized difference vegetation index (NDVI) time series from 1982 to 2014, combining the respective advantages of the different remote sensing datasets. As another key parameter for NPP modeling, the total solar radiation was calculated by the improved Yang hybrid model (YHM), using meteorological station data. The verification described in this paper proved the feasibility of all the applied data processes, and a greatly improved accuracy was obtained for the NPP calculated with the final processed NDVI. The spatio-temporal analysis results indicated that 68.07% of the study area showed an increasing NPP trend over the past three decades. Significant heterogeneity was found in the correlation between NPP and precipitation at a monthly scale, specifically, the negative correlation in the growing season and the positive correlation in the dry season. The lagged positive correlation in the growing season and no lag in the dry season indicated the important impact of precipitation on NPP.

Submitted 1 August, 2017; originally announced August 2017.

Comments: 20 pages, 11 figures

arXiv:1701.08404 [pdf, other]

doi 10.1093/bioinformatics/btx230

Deep learning based subdivision approach for large scale macromolecules structure recovery from electron cryo tomograms

Authors: Min Xu, Xiaoqi Chai, Hariank Muthakana, Xiaodan Liang, Ge Yang, Tzviya Zeev-Ben-Mordehai, Eric Xing

Abstract: Motivation: Cellular Electron CryoTomography (CECT) enables 3D visualization of cellular organization at near-native state and in sub-molecular resolution, making it a powerful tool for analyzing structures of macromolecular complexes and their spatial organizations inside single cells. However, high degree of structural complexity together with practical imaging limitations make the systematic de novo discovery of structures within cells challenging. It would likely require averaging and classifying millions of subtomograms potentially containing hundreds of highly heterogeneous structural classes. Although it is no longer difficult to acquire CECT data containing such amount of subtomograms due to advances in data acquisition automation, existing computational approaches have very limited scalability or discrimination ability, making them incapable of processing such amount of data. Results: To complement existing approaches, in this paper we propose a new approach for subdividing subtomograms into smaller but relatively homogeneous subsets. The structures in these subsets can then be separately recovered using existing computation intensive methods. Our approach is based on supervised structural feature extraction using deep learning, in combination with unsupervised clustering and reference-free classification. Our experiments show that, compared to existing unsupervised rotation invariant feature and pose-normalization based approaches, our new approach achieves significant improvements in both discrimination ability and scalability. More importantly, our new approach is able to discover new structural classes and recover structures that do not exist in training data.

Submitted 29 January, 2017; originally announced January 2017.

Comments: 12 pages, 5 figures, 3 tables

arXiv:1612.08952 [pdf, ps, other]

doi 10.1103/PhysRevE.94.062316

Compensatory interactions to stabilize multiple steady states or mitigate the effects of multiple deregulations in biological networks

Authors: Gang Yang, Colin Campbell, Réka Albert

Abstract: Complex diseases can be modeled as damage to intracellular networks that results in abnormal cell behaviors. Network-based dynamic models such as Boolean models have been employed to model a variety of biological systems including those corresponding to disease. Previous work designed compensatory interactions to stabilize an attractor of a Boolean network after single node damage. We generalize this method to a multinode damage scenario and to the simultaneous stabilization of multiple steady state attractors. We classify the emergent situations, with a special focus on combinatorial effects, and characterize each class through simulation. We explore how the structural and functional properties of the network affect its resilience and its possible repair scenarios. We demonstrate the method's applicability to two intracellular network models relevant to cancer. This work has implications in designing prevention strategies for complex disease.

Submitted 28 December, 2016; originally announced December 2016.

Comments: 16 pages, 8 figures

Journal ref: Phys. Rev. E 94, 062316 (2016)

arXiv:1605.08415 [pdf, other]

doi 10.1073/pnas.1617387114

Structure-based control of complex networks with nonlinear dynamics

Authors: Jorge G. T. Zañudo, Gang Yang, Réka Albert

Abstract: What can we learn about controlling a system solely from its underlying network structure? Here we adapt a recently developed framework for control of networks governed by a broad class of nonlinear dynamics that includes the major dynamic models of biological, technological, and social processes. This feedback-based framework provides realizable node overrides that steer a system towards any of its natural long term dynamic behaviors, regardless of the specific functional forms and system parameters. We use this framework on several real networks, identify the topological characteristics that underlie the predicted node overrides, and compare its predictions to those of structural controllability in control theory. Finally, we demonstrate this framework's applicability in dynamic models of gene regulatory networks and identify nodes whose override is necessary for control in the general case, but not in specific model instances.

Submitted 5 July, 2017; v1 submitted 26 May, 2016; originally announced May 2016.

Comments: Includes main text and supporting information

arXiv:1602.03710 [pdf, other]

doi 10.1038/srep30845

Detecting the Collapse of Cooperation in Evolving Networks

Authors: Matteo Cavaliere, Guoli Yang, Vincent Danos, Vasilis Dakos

Abstract: The sustainability of structured biological, social, economic and ecological communities is often determined by the outcome of social conflicts between cooperative and selfish individuals (cheaters). Cheaters avoid the cost of contributing to the community and can occasionally spread in the population, leading to the complete collapse of cooperation. Although such a collapse often unfolds unexpectedly, bearing the traits of a critical transition, it is unclear whether one can detect the rising risk of cheater invasions and loss of cooperation in an evolving community. Here, we combine dynamical networks and evolutionary game theory to study the abrupt loss of cooperation as a critical transition. We estimate the risk of collapse of cooperation after the introduction of a single cheater under gradually changing conditions. We observe a systematic increase in the average time it takes for cheaters to be eliminated from the community as the risk of collapse increases. We detect this risk based on changes in community structure and composition. Nonetheless, reliable detection depends on the mechanism that governs how cheaters evolve in the community. Our results suggest possible avenues for detecting the loss of cooperation in evolving communities.

Submitted 11 February, 2016; originally announced February 2016.

Journal ref: Scientific Reports, 6, 30845, 2016

Showing 1–26 of 26 results for author: Yang, G