Search | arXiv e-print repository

Progress Towards Decoding Visual Imagery via fNIRS

Authors: Michel Adamic, Wellington Avelino, Anna Brandenberger, Bryan Chiang, Hunter Davis, Stephen Fay, Andrew Gregory, Aayush Gupta, Raphael Hotter, Grace Jiang, Fiona Leng, Stephen Polcyn, Thomas Ribeiro, Paul Scotti, Michelle Wang, Marley Xiong, Jonathan Xu

Abstract: We demonstrate the possibility of reconstructing images from fNIRS brain activity and start building a prototype to match the required specs. By training an image reconstruction model on downsampled fMRI data, we discovered that cm-scale spatial resolution is sufficient for image generation. We obtained 71% retrieval accuracy with 1-cm resolution, compared to 93% on the full-resolution fMRI, and 20% with 2-cm resolution. With simulations and high-density tomography, we found that time-domain fNIRS can achieve 1-cm resolution, compared to 2-cm resolution for continuous-wave fNIRS. Lastly, we share designs for a prototype time-domain fNIRS device, consisting of a laser driver, a single photon detector, and a time-to-digital converter system.

Submitted 22 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2405.16035 [pdf, other]

A dissimilarity measure for semidirected networks

Authors: Michael Maxfield, Jingcheng Xu, Cécile Ané

Abstract: Semidirected networks have received interest in evolutionary biology as the appropriate generalization of unrooted trees to networks, in which some but not all edges are directed. Yet these networks lack proper theoretical study. We define here a general class of semidirected phylogenetic networks, with a stable set of leaves, tree nodes and hybrid nodes. We prove that for these networks, if we locally choose the direction of one edge, then globally the set of paths starting by this edge is stable across all choices to root the network. We define an edge-based representation of semidirected phylogenetic networks and use it to define a dissimilarity between networks, which can be efficiently computed in near-quadratic time. Our dissimilarity extends the widely-used Robinson-Foulds distance on both rooted trees and unrooted trees. After generalizing the notion of tree-child networks to semidirected networks, we prove that our edge-based dissimilarity is in fact a distance on the space of tree-child semidirected phylogenetic networks.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.09699 [pdf]

Mapping Differential Protein-Protein Interaction Networks using Affinity Purification Mass Spectrometry

Authors: Prashant Kaushal, Manisha R. Ummadi, Gwendolyn M. Jang, Yennifer Delgado, Sara K. Makanani, Sophie F. Blanc, Decan M. Winters, Jiewei Xu, Benjamin Polacco, Yuan Zhou, Erica Stevenson, Manon Eckhardt, Lorena Zuliani-Alvarez, Robyn Kaake, Danielle L. Swaney, Nevan Krogan, Mehdi Bouhaddou

Abstract: Proteins congregate into complexes to perform fundamental cellular functions. Phenotypic outcomes, in health and disease, are often mechanistically driven by the remodeling of protein complexes by protein coding mutations or cellular signaling changes in response to molecular cues. Here, we present an affinity purification mass spectrometry (APMS) proteomics protocol to quantify and visualize global changes in protein-protein interaction (PPI) networks between pairwise conditions. We describe steps for expressing affinity tagged bait proteins in mammalian cells, identifying purified protein complexes, quantifying differential PPIs, and visualizing differential PPI networks. Specifically, this protocol details steps for designing affinity tagged bait gene constructs, transfection, affinity purification, mass spectrometry sample preparation, data acquisition, database search, data quality control, PPI confidence scoring, cross run normalization, statistical data analysis, and differential PPI visualization. Our protocol discusses caveats and limitations with applicability across cell types and biological areas.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 29 pages, 3 figures

arXiv:2404.05553 [pdf, other]

Alljoined1 -- A dataset for EEG-to-Image decoding

Authors: Jonathan Xu, Bruno Aristimunha, Max Emanuel Feucht, Emma Qian, Charles Liu, Tazik Shahjahan, Martyna Spyra, Steven Zifan Zhang, Nicholas Short, Jioh Kim, Paula Perdomo, Ricky Renfeng Mao, Yashvir Sabharwal, Michael Ahedor Moaz Shoura, Adrian Nestor

Abstract: We present Alljoined1, a dataset built specifically for EEG-to-Image decoding. Recognizing that an extensive and unbiased sampling of neural responses to visual stimuli is crucial for image reconstruction efforts, we collected data from 8 participants looking at 10,000 natural images each. We have currently gathered 46,080 epochs of brain responses recorded with a 64-channel EEG headset. The dataset combines response-based stimulus timing, repetition between blocks and sessions, and diverse image classes with the goal of improving signal quality. For transparency, we also provide data quality scores. We publicly release the dataset and all code at https://linktr.ee/alljoined1.

Submitted 14 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

Comments: 8 Pages, 6 Figures

ACM Class: I.5.1; I.6.3; I.2.6; K.3.2

arXiv:2403.11207 [pdf, other]

MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data

Authors: Paul S. Scotti, Mihir Tripathy, Cesar Kadir Torrico Villanueva, Reese Kneeland, Tong Chen, Ashutosh Narang, Charan Santhirasegaran, Jonathan Xu, Thomas Naselaris, Kenneth A. Norman, Tanishq Mathew Abraham

Abstract: Reconstructions of visual perception from brain activity have improved tremendously, but the practical utility of such methods has been limited. This is because such models are trained independently per subject where each subject requires dozens of hours of expensive fMRI training data to attain high-quality results. The present work showcases high-quality reconstructions using only 1 hour of fMRI training data. We pretrain our model across 7 subjects and then fine-tune on minimal data from a new subject. Our novel functional alignment procedure linearly maps all brain data to a shared-subject latent space, followed by a shared non-linear mapping to CLIP image space. We then map from CLIP space to pixel space by fine-tuning Stable Diffusion XL to accept CLIP latents as inputs instead of text. This approach improves out-of-subject generalization with limited training data and also attains state-of-the-art image retrieval and reconstruction metrics compared to single-subject approaches. MindEye2 demonstrates how accurate reconstructions of perception are possible from a single visit to the MRI facility. All code is available on GitHub.

Submitted 15 June, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

Comments: In Forty-first International Conference on Machine Learning, 2024. Code at https://github.com/MedARC-AI/MindEyeV2. Published as a conference paper at ICML 2024

arXiv:2402.11693 [pdf, other]

Identifying circular orders for blobs in phylogenetic networks

Authors: John A. Rhodes, Hector Banos, Jingcheng Xu, Cécile Ané

Abstract: Interest in the inference of evolutionary networks relating species or populations has grown with the increasing recognition of the importance of hybridization, gene flow and admixture, and the availability of large-scale genomic data. However, what network features may be validly inferred from various data types under different models remains poorly understood. Previous work has largely focused on level-1 networks, in which reticulation events are well separated, and on a general network's tree of blobs, the tree obtained by contracting every blob to a node. An open question is the identifiability of the topology of a blob of unknown level. We consider the identifiability of the circular order in which subnetworks attach to a blob, first proving that this order is well-defined for outer-labeled planar blobs. For this class of blobs, we show that the circular order information from 4-taxon subnetworks identifies the full circular order of the blob. Similarly, the circular order from 3-taxon rooted subnetworks identifies the full circular order of a rooted blob. We then show that subnetwork circular information is identifiable from certain data types and evolutionary models. This provides a general positive result for high-level networks, on the identifiability of the ordering in which taxon blocks attach to blobs in outer-labeled planar networks. Finally, we give examples of blobs with different internal structures which cannot be distinguished under many models and data types.

Submitted 18 February, 2024; originally announced February 2024.

MSC Class: 05C90; 60J95; 62B99; 92D15

arXiv:2401.13858 [pdf, other]

Graph Diffusion Transformer for Multi-Conditional Molecular Generation

Authors: Gang Liu, Jiaxin Xu, Tengfei Luo, Meng Jiang

Abstract: Inverse molecular design with diffusion models holds great potential for advancements in material and drug discovery. Despite success in unconditional molecule generation, integrating multiple properties such as synthetic score and gas permeability as condition constraints into diffusion models remains unexplored. We present the Graph Diffusion Transformer (Graph DiT) for multi-conditional molecular generation. Graph DiT has a condition encoder to learn the representation of numerical and categorical properties and utilizes a Transformer-based graph denoiser to achieve molecular graph denoising under conditions. Unlike previous graph diffusion models that add noise separately on the atoms and bonds in the forward diffusion process, we propose a graph-dependent noise model for training Graph DiT, designed to accurately estimate graph-related noise in molecules. We extensively validate the Graph DiT for multi-conditional polymer and small molecule generation. Results demonstrate our superiority across metrics from distribution learning to condition control for molecular properties. A polymer inverse design task for gas separation with feedback from domain experts further demonstrates its practical utility.

Submitted 6 May, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: 21 pages, 9 figures, 7 tables

arXiv:2310.10697 [pdf]

Synthetic IMU Datasets and Protocols Can Simplify Fall Detection Experiments and Optimize Sensor Configuration

Authors: Jie Tang, Bin He, Junkai Xu, Tian Tan, Zhipeng Wang, Yanmin Zhou, Shuo Jiang

Abstract: Falls represent a significant cause of injury among the elderly population. Extensive research has been devoted to the utilization of wearable IMU sensors in conjunction with machine learning techniques for fall detection. To address the challenge of acquiring costly training data, this paper presents a novel method that generates a substantial volume of synthetic IMU data with minimal real fall experiments. First, unmarked 3D motion capture technology is employed to reconstruct human movements. Subsequently, utilizing the biomechanical simulation platform Opensim and forward kinematic methods, an ample amount of training data from various body segments can be custom generated. An LSTM model is trained, achieving testing accuracies of 91.99% and 86.62% on two distinct datasets of actual fall-related IMU data, demonstrating performance comparable to that of models trained using genuine IMU data. Building upon the simulation framework, this paper further optimizes the single IMU attachment position and multiple IMU combinations for fall detection. The proposed method simplifies fall detection data acquisition experiments, provides a novel avenue for generating low-cost synthetic data in scenarios where acquiring data for machine learning is challenging, and paves the way for customizing machine learning configurations.

Submitted 16 October, 2023; originally announced October 2023.

Comments: 11 pages, 7 figures

arXiv:2310.02265 [pdf, other]

DREAM: Visual Decoding from Reversing Human Visual System

Authors: Weihao Xia, Raoul de Charette, Cengiz Öztireli, Jing-Hao Xue

Abstract: In this work we present DREAM, an fMRI-to-image method for reconstructing viewed images from brain activities, grounded on fundamental knowledge of the human visual system. We craft reverse pathways that emulate the hierarchical and parallel nature of how humans perceive the visual world. These tailored pathways are specialized to decipher semantics, color, and depth cues from fMRI data, mirroring the forward pathways from visual stimuli to fMRI recordings. To do so, two components mimic the inverse processes within the human visual system: the Reverse Visual Association Cortex (R-VAC) which reverses pathways of this brain region, extracting semantics from fMRI data; the Reverse Parallel PKM (R-PKM) component simultaneously predicting color and depth from fMRI signals. The experiments indicate that our method outperforms the current state-of-the-art models in terms of the consistency of appearance, structure, and semantics. Code will be made publicly available to facilitate further research in this field.

Submitted 10 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

Comments: Project Page: https://weihaox.github.io/DREAM

arXiv:2309.07178 [pdf]

CloudBrain-NMR: An Intelligent Cloud Computing Platform for NMR Spectroscopy Processing, Reconstruction and Analysis

Authors: Di Guo, Si** Li, Jun Liu, Zhangren Tu, Tianyu Qiu, Jingjing Xu, Liubin Feng, Donghai Lin, Qing Hong, Mei** Lin, Yanqin Lin, Xiaobo Qu

Abstract: Nuclear Magnetic Resonance (NMR) spectroscopy has served as a powerful analytical tool for studying molecular structure and dynamics in chemistry and biology. However, the processing of raw data acquired from NMR spectrometers and subsequent quantitative analysis involves various specialized tools, which necessitates comprehensive knowledge in programming and NMR. In particular, the emerging deep learning tools are hard to use widely in NMR due to the sophisticated setup of computation. Thus, NMR processing is not an easy task for chemists and biologists. In this work, we present CloudBrain-NMR, an intelligent online cloud computing platform designed for NMR data reading, processing, reconstruction, and quantitative analysis. The platform is conveniently accessed through a web browser, eliminating the need for any program installation on the user side. CloudBrain-NMR uses parallel computing with graphics processing units and central processing units, resulting in significantly shortened computation time. Furthermore, it incorporates state-of-the-art deep learning-based algorithms offering comprehensive functionalities that allow users to complete the entire processing procedure without relying on additional software. This platform has empowered NMR applications with advanced artificial intelligence processing. CloudBrain-NMR is openly accessible for free usage at https://csrc.xmu.edu.cn/CloudBrain.html

Submitted 12 September, 2023; originally announced September 2023.

Comments: 11 pages, 13 figures

arXiv:2307.11133 [pdf, other]

Contrastive Graph Pooling for Explainable Classification of Brain Networks

Authors: Jiaxing Xu, Qingtian Bian, Xinhang Li, Aihu Zhang, Yiping Ke, Miao Qiao, Wei Zhang, Wei Khang Jeremy Sim, Balázs Gulyás

Abstract: Functional magnetic resonance imaging (fMRI) is a commonly used technique to measure neural activation. Its application has been particularly important in identifying underlying neurodegenerative conditions such as Parkinson's, Alzheimer's, and Autism. Recent analysis of fMRI data models the brain as a graph and extracts features by graph neural networks (GNNs). However, the unique characteristics of fMRI data require a special design of GNN. Tailoring GNN to generate effective and domain-explainable features remains challenging. In this paper, we propose a contrastive dual-attention block and a differentiable graph pooling method called ContrastPool to better utilize GNN for brain networks, meeting fMRI-specific requirements. We apply our method to 5 resting-state fMRI brain network datasets of 3 diseases and demonstrate its superiority over state-of-the-art baselines. Our case study confirms that the patterns extracted by our method match the domain knowledge in neuroscience literature, and disclose direct and interesting insights. Our contributions underscore the potential of ContrastPool for advancing the understanding of brain networks and neurodegenerative conditions. The source code is available at https://github.com/AngusMonroe/ContrastPool.

Submitted 12 April, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

arXiv:2307.06344 [pdf, other]

The Whole Pathological Slide Classification via Weakly Supervised Learning

Authors: Qiehe Sun, Jiawen Li, ** Xu, Junru Cheng, Tian Guan, Yonghong He

Abstract: Due to its superior efficiency in utilizing annotations and addressing gigapixel-sized images, multiple instance learning (MIL) has shown great promise as a framework for whole slide image (WSI) classification in digital pathology diagnosis. However, existing methods tend to focus on advanced aggregators with different structures, often overlooking the intrinsic features of H&E pathological slides. To address this limitation, we introduced two pathological priors: nuclear heterogeneity of diseased cells and spatial correlation of pathological tiles. Leveraging the former, we proposed a data augmentation method that utilizes stain separation during extractor training via a contrastive learning strategy to obtain instance-level representations. We then described the spatial relationships between the tiles using an adjacency matrix. By integrating these two views, we designed a multi-instance framework for analyzing H&E-stained tissue images based on pathological inductive bias, encompassing feature extraction, filtering, and aggregation. Extensive experiments on the Camelyon16 breast dataset and TCGA-NSCLC Lung dataset demonstrate that our proposed framework can effectively handle tasks related to cancer detection and differentiation of subtypes, outperforming state-of-the-art medical image classification methods based on MIL. The code will be released later.

Submitted 12 July, 2023; originally announced July 2023.

arXiv:2306.07505 [pdf]

Deep learning radiomics for assessment of gastroesophageal varices in people with compensated advanced chronic liver disease

Authors: Lan Wang, Ruiling He, Lili Zhao, Jia Wang, Zhengzi Geng, Tao Ren, Guo Zhang, Peng Zhang, Kaiqiang Tang, Chaofei Gao, Fei Chen, Liting Zhang, Yonghe Zhou, Xin Li, Fanbin He, Hui Huan, Wenjuan Wang, Yunxiao Liang, Juan Tang, Fang Ai, Tingyu Wang, Liyun Zheng, Zhongwei Zhao, Jiansong Ji, Wei Liu, et al. (22 additional authors not shown)

Abstract: Objective: Bleeding from gastroesophageal varices (GEV) is a medical emergency associated with high mortality. We aim to construct an artificial intelligence-based model of two-dimensional shear wave elastography (2D-SWE) of the liver and spleen to precisely assess the risk of GEV and high-risk gastroesophageal varices (HRV). Design: A prospective multicenter study was conducted in patients with compensated advanced chronic liver disease. 305 patients were enrolled from 12 hospitals, and finally 265 patients were included, with 1136 liver stiffness measurement (LSM) images and 1042 spleen stiffness measurement (SSM) images generated by 2D-SWE. We leveraged deep learning methods to uncover associations between image features and patient risk, and thus constructed models to predict GEV and HRV. Results: A multi-modality Deep Learning Risk Prediction model (DLRP) was constructed to assess GEV and HRV, based on LSM and SSM images, and clinical information. Validation analysis revealed that the AUCs of DLRP were 0.91 for GEV (95% CI 0.90 to 0.93, p < 0.05) and 0.88 for HRV (95% CI 0.86 to 0.89, p < 0.01), which were significantly and robustly better than canonical risk indicators, including the value of LSM and SSM. Moreover, DLRP was better than the model using individual parameters, including LSM and SSM images. In HRV prediction, the 2D-SWE images of SSM outperform LSM (p < 0.01). Conclusion: DLRP shows excellent performance in predicting GEV and HRV over canonical risk indicators LSM and SSM. Additionally, the 2D-SWE images of SSM provided more information for better accuracy in predicting HRV than the LSM.

Submitted 12 June, 2023; originally announced June 2023.

arXiv:2303.04902 [pdf]

doi 10.1002/hbm.26672

Inter-brain substrates of role switching during mother-child interaction

Authors: Yamin Li, Saishuang Wu, Jiayang Xu, Haiwa Wang, Qi Zhu, Wen Shi, Yue Fang, Fan Jiang, Shanbao Tong, Yunting Zhang, Xiaoli Guo

Abstract: Mother-child interaction is highly dynamic and reciprocal. Switching roles in these back-and-forth interactions serves as a crucial feature of reciprocal behaviors while the underlying neural entrainment is still not well-studied. Here, we designed a role-controlled cooperative task with dual EEG recording to study how differently two brains interact when mothers and children hold different roles. When children were actors and mothers were observers, mother-child inter-brain synchrony emerged within the theta oscillations and the frontal lobe, which highly correlated with children's attachment to their mothers. When their roles were reversed, this synchrony was shifted to the alpha oscillations and the central area and associated with mothers' perception of their relationship with their children. The results suggested an observer-actor neural alignment within the actor's oscillations, which was modulated by the actor-toward-observer emotional bonding. Our findings contribute to the understanding of how inter-brain synchrony is established and dynamically changed during mother-child reciprocal interaction.

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2302.09151 [pdf]

doi 10.1016/j.biosystems.2023.105001

SBcoyote: An Extensible Python-Based Reaction Editor and Viewer

Authors: ** Xu, Gary Geng, Nhan D. Nguyen, Carmen Perena-Cortes, Claire Samuels, Herbert M. Sauro

Abstract: SBcoyote is an open-source cross-platform biochemical reaction viewer and editor released under the liberal MIT license. It is written in Python and uses wxPython to implement the GUI and the drawing canvas. It supports the visualization and editing of compartments, species, and reactions. It includes many options to stylize each of these components. For instance, species can be in different colors and shapes. Other core features include the ability to create alias nodes, alignment of groups of nodes, network zooming, as well as an interactive bird's-eye view of the network to allow easy navigation on large networks. A unique feature of the tool is the extensive Python plugin API, where third-party developers can include new functionality. To assist third-party plugin developers, we provide a variety of sample plugins, including random network generation, a simple auto layout tool, export to Antimony, export SBML, import SBML, etc. Of particular interest are the export and import SBML plugins since these support the SBML level 3 layout and render standard, which is exchangeable with other software packages. Plugins are stored in a GitHub repository, and an included plugin manager can retrieve and install new plugins from the repository on demand. Plugins have version metadata associated with them so that plugin updates can be installed. Availability: https://github.com/sys-bio/SBcoyote.

Submitted 14 August, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

arXiv:2212.10784 [pdf, other]

Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction?

Authors: Jiashu Xu, Mingyu Derek Ma, Muhao Chen

Abstract: Two key obstacles in biomedical relation extraction (RE) are the scarcity of annotations and the prevalence of instances without explicitly pre-defined labels due to low annotation coverage. Existing approaches, which treat biomedical RE as a multi-class classification task, often generalize poorly in low-resource settings and cannot make selective predictions on unknown cases, instead guessing from seen relations, which hinders the applicability of those approaches. We present NBR, which converts biomedical RE into a natural language inference formulation through indirect supervision. By converting relations to natural language hypotheses, NBR is capable of exploiting semantic cues to alleviate annotation scarcity. By incorporating a ranking-based loss that implicitly calibrates abstinent instances, NBR learns a clearer decision boundary and is instructed to abstain on uncertain instances. Extensive experiments on three widely-used biomedical RE benchmarks, namely ChemProt, DDI and GAD, verify the effectiveness of NBR in both full-set and low-resource regimes. Our analysis demonstrates that indirect supervision benefits biomedical RE even when a domain gap exists, and combining NLI knowledge with biomedical knowledge leads to the best performance gains.

Submitted 19 October, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

Comments: 16 pages; ACL 2023; code in https://github.com/luka-group/NLI_as_Indirect_Supervision

arXiv:2212.03447 [pdf]

Integration of Pre-trained Protein Language Models into Geometric Deep Learning Networks

Authors: Fang Wu, Lirong Wu, Dragomir Radev, Jinbo Xu, Stan Z. Li

Abstract: Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on 3D structures of large biomolecules is emerging as a distinct research area. However, its efficacy is largely constrained due to the limited quantity of structural data. Meanwhile, protein language models trained on substantial 1D sequences have shown burgeoning capabilities with scale in a broad range of applications. Several previous studies consider combining these different protein modalities to promote the representation power of geometric neural networks, but fail to present a comprehensive understanding of their benefits. In this work, we integrate the knowledge learned by well-trained protein language models into several state-of-the-art geometric networks and evaluate a variety of protein representation learning benchmarks, including protein-protein interface prediction, model quality assessment, protein-protein rigid-body docking, and binding affinity prediction. Our findings show an overall improvement of 20% over baselines. Strong evidence indicates that the incorporation of protein language models' knowledge enhances geometric networks' capacity by a significant margin and can be generalized to complex tasks.

Submitted 29 October, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

arXiv:2211.12421 [pdf, other]

Data-Driven Network Neuroscience: On Data Collection and Benchmark

Authors: Jiaxing Xu, Yunhan Yang, David Tse Jung Huang, Sophi Shilpa Gururajapathy, Yiping Ke, Miao Qiao, Alan Wang, Haribalan Kumar, Josh McGeown, Eryn Kwon

Abstract: This paper presents a comprehensive and quality collection of functional human brain network data for potential research in the intersection of neuroscience, machine learning, and graph analytics. Anatomical and functional MRI images have been used to understand the functional connectivity of the human brain and are particularly important in identifying underlying neurodegenerative conditions such as Alzheimer's, Parkinson's, and Autism. Recently, the study of the brain in the form of brain networks using machine learning and graph analytics has become increasingly popular, especially to predict the early onset of these conditions. A brain network, represented as a graph, retains rich structural and positional information that traditional examination methods are unable to capture. However, the lack of publicly accessible brain network data prevents researchers from data-driven explorations. One of the main difficulties lies in the complicated domain-specific preprocessing steps and the exhaustive computation required to convert the data from MRI images into brain networks. We bridge this gap by collecting a large amount of MRI images from public databases and a private source, working with domain experts to make sensible design choices, and preprocessing the MRI images to produce a collection of brain network datasets. The datasets originate from 6 different sources, cover 4 brain conditions, and consist of a total of 2,702 subjects.
We test our graph datasets on 12 machine learning models to provide baselines and validate the data quality on a recent graph analysis model. To lower the barrier to entry and promote the research in this interdisciplinary field, we release our brain network data and complete preprocessing details including codes at https://doi.org/10.17608/k6.auckland.21397377 and https://github.com/brainnetuoa/data_driven_network_neuroscience.

Submitted 29 October, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

Journal ref: Advances in Neural Information Processing Systems, 2023

arXiv:2211.07294 [pdf, other]

A universal DNA computing model for solving NP-hard subset problems

Authors: Enqiang Zhu, Xianhang Luo, Chanjuan Liu, Xiaolong Shi, Jin Xu

Abstract: DNA computing, a nontraditional computing mechanism, provides a feasible and effective method for solving NP-hard problems because of the vast parallelism and high-density storage of DNA molecules. Although DNA computing has been exploited to solve various intractable computational problems, such as the Hamiltonian path problem, SAT problem, and graph coloring problem, there has been little discussion of designing universal DNA computing-based models, which can solve a class of problems. In this paper, by leveraging the dynamic and enzyme-free properties of DNA strand displacement, we propose a universal model named DCMSubset for solving subset problems in graph theory. The model aims to find a minimum (or maximum) set satisfying given constraints. For each element x involved in a given problem, DCMSubset uses an exclusive single-stranded DNA molecule to model x as well as a specific DNA complex to model the relationship between x and other elements. Based on the proposed model, we conducted simulation and biochemical experiments on three kinds of subset problems: minimum dominating set, maximum independent set, and minimum vertex cover. We observed that DCMSubset can also be used to solve the graph coloring problem. Moreover, we extended DCMSubset to a model for solving the SAT problem. The experimental results showed the feasibility and universality of the proposed method. Our results highlighted the potential for DNA strand displacement to act as a computation tool to solve NP-hard problems.

Submitted 15 November, 2022; v1 submitted 14 November, 2022; originally announced November 2022.
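
Note on the subset problems named in this abstract: the sketch below states the three problems (minimum vertex cover, minimum dominating set, maximum independent set) as a conventional brute-force search in Python. It is only an illustration of what "find a minimum (or maximum) set satisfying given constraints" means on a small graph; it is not the DCMSubset model and says nothing about DNA strand displacement. All function names and the example path graph are assumptions introduced here.

```python
from itertools import combinations


def is_vertex_cover(edges, s):
    """Every edge has at least one endpoint in s."""
    return all(u in s or v in s for u, v in edges)


def is_independent_set(edges, s):
    """No edge has both endpoints in s."""
    return not any(u in s and v in s for u, v in edges)


def is_dominating_set(vertices, edges, s):
    """Every vertex is in s or adjacent to a member of s."""
    nbrs = {v: set() for v in vertices}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return all(v in s or nbrs[v] & s for v in vertices)


def smallest_subset(vertices, is_feasible):
    """Try subsets in order of increasing size; exponential time."""
    for k in range(len(vertices) + 1):
        for cand in combinations(vertices, k):
            if is_feasible(set(cand)):
                return set(cand)
    return None


def largest_subset(vertices, is_feasible):
    """Try subsets in order of decreasing size; exponential time."""
    for k in range(len(vertices), -1, -1):
        for cand in combinations(vertices, k):
            if is_feasible(set(cand)):
                return set(cand)
    return None


# Hypothetical example: the path graph 0-1-2-3-4.
V = [0, 1, 2, 3, 4]
E = [(0, 1), (1, 2), (2, 3), (3, 4)]

cover = smallest_subset(V, lambda s: is_vertex_cover(E, s))
dom = smallest_subset(V, lambda s: is_dominating_set(V, E, s))
indep = largest_subset(V, lambda s: is_independent_set(E, s))
```

On any graph, the complement of a vertex cover is an independent set, so the minimum vertex cover and the maximum independent set have sizes that sum to the number of vertices. The exhaustive search is exponential in the number of vertices, which is the cost that massively parallel approaches aim to sidestep.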

arXiv:2210.06538 [pdf]

doi 10.1039/D2LC00596D

D-CryptO: Deep learning-based analysis of colon organoid morphology from brightfield images

Authors: Lyan Abdul, Jocelyn Xu, Alexander Sotra, Abbas Chaudary, Jerry Gao, Shravanthi Rajasekar, Nicky Anvari, Hamidreza Mahyar, Boyang Zhang

Abstract: Stem cell-derived organoids are a promising tool to model native human tissues as they resemble human organs functionally and structurally compared to traditional monolayer cell-based assays. For instance, colon organoids can spontaneously develop crypt-like structures similar to those found in the native colon. While analyzing the structural development of organoids can be a valuable readout, using traditional image analysis tools makes it challenging because of the heterogeneities and the abstract nature of organoid morphologies. To address this limitation, we developed and validated a deep learning-based image analysis tool, named D-CryptO, for the classification of organoid morphology. D-CryptO can automatically assess the crypt formation and opacity of colorectal organoids from brightfield images to determine the extent of organoid structural maturity. To validate this tool, changes in organoid morphology were analyzed during organoid passaging and short-term forskolin stimulation. To further demonstrate the potential of D-CryptO for drug testing, organoid structures were analyzed following treatments with a panel of chemotherapeutic drugs. With D-CryptO, subtle variations in how colon organoids responded to the different chemotherapeutic drugs were detected, which suggest potentially distinct mechanisms of action. This tool could be expanded to other organoid types, like intestinal organoids, to facilitate 3D tissue morphological analysis.

Submitted 12 October, 2022; originally announced October 2022.

Comments: Lab on a Chip (2022)

arXiv:2208.13994 [pdf]

HiGNN: Hierarchical Informative Graph Neural Networks for Molecular Property Prediction Equipped with Feature-Wise Attention

Authors: Weimin Zhu, Yi Zhang, DuanCheng Zhao, Jianrong Xu, Ling Wang

Abstract: Elucidating and accurately predicting the druggability and bioactivities of molecules plays a pivotal role in drug design and discovery and remains an open challenge. Recently, graph neural networks (GNN) have made remarkable advancements in graph-based molecular property prediction. However, current graph-based deep learning methods neglect the hierarchical information of molecules and the relationships between feature channels. In this study, we propose a well-designed hierarchical informative graph neural networks framework (termed HiGNN) for predicting molecular property by utilizing a co-representation learning of molecular graphs and chemically synthesizable BRICS fragments. Furthermore, a plug-and-play feature-wise attention block is first designed in HiGNN architecture to adaptively recalibrate atomic features after the message passing phase. Extensive experiments demonstrate that HiGNN achieves state-of-the-art predictive performance on many challenging drug discovery-associated benchmark datasets. In addition, we devise a molecule-fragment similarity mechanism to comprehensively investigate the interpretability of HiGNN model at the subgraph level, indicating that HiGNN as a powerful deep learning tool can help chemists and pharmacists identify the key components of molecules for designing better molecules with desired properties or functions. The source code is publicly available at https://github.com/idruglab/hignn.

Submitted 30 August, 2022; originally announced August 2022.

arXiv:2205.03583 [pdf]

Scanning Electron Microscopy and Metabolite Measurement Revealed the Stress Mechanism of PS-COOH Microplastics on Rhodotorula mucilaginosa AN5

Authors: Jiahao Ma, Xiangfei Meng, Zixin Li, Lexian Li, Jiwen Xu, Guangfeng Kan

Abstract: Microplastics in the marine environment have received increasing attention from researchers, and the impact of these substances on marine microorganisms cannot be ignored. Studies have shown that PS-COOH microplastics are harmful to marine molluscs, algae and monads. This study explores the effect and mechanism of microplastics (80 nm PS-COOH) on the Antarctic marine yeast Rhodotorula mucilaginosa AN5 by bacterial count, scanning electron microscopy (SEM) and metabolite analysis. The results illustrate that a 50 mg/L concentration of PS-COOH could inhibit 36.15% growth of yeast cells and 10 mg/L inhibit 80.20%. Microplastics stress causes changes in the content of some oxidative stress substances, including reactive oxygen species (ROS) 42.86%, malondialdehyde (MDA) 54.06% content and the activities of antioxidant enzymes such as catalase (CAT) 36.00%, peroxidase (POD) 66.67% and superoxide dismutase (SOD) 25.40%. These results reveal the possible stress effect of microplastic pollution on marine yeast, which may affect the bottom layer of the marine ecosystem.

Submitted 13 September, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

arXiv:2204.12611 [pdf, ps, other]

doi 10.1093/bioinformatics/btac730

SBMLDiagrams: A python package to process and visualize SBML layout and render

Authors: Jin Xu, Jessie Jiang, Herbert M. Sauro

Abstract: Summary: The Systems Biology Markup Language (SBML) is an extensible standard format for exchanging biochemical models. One of the extensions for SBML is the SBML Layout and Render package. This allows modelers to describe a biochemical model as a pathway diagram. However, up to now there has been little support to help users easily add and retrieve such information from SBML. In this application note, we describe a new Python package called SBMLDiagrams. This package allows a user to add layout and render information or retrieve it using a straightforward Python API. The package uses skia-python to support the rendering of the diagrams, allowing export to common formats such as PNG or PDF. Availability: SBMLDiagrams is publicly available and licensed under the liberal MIT open-source license. The package is available for all major platforms. The source code has been deposited at GitHub (github.com/sys-bio/SBMLDiagrams). Users can install the package using the standard pip installation mechanism: pip install SBMLDiagrams. Contact: [email protected].

Submitted 14 November, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

arXiv:2203.08648 [pdf, other]

Artificial Intelligence Enables Real-Time and Intuitive Control of Prostheses via Nerve Interface

Authors: Diu Khue Luu, Anh Tuan Nguyen, Ming Jiang, Markus W. Drealan, Jian Xu, Tong Wu, Wing-kin Tam, Wenfeng Zhao, Brian Z. H. Lim, Cynthia K. Overstreet, Qi Zhao, Jonathan Cheng, Edward W. Keefer, Zhi Yang

Abstract: Objective: The next generation prosthetic hand that moves and feels like a real hand requires a robust neural interconnection between the human minds and machines. Methods: Here we present a neuroprosthetic system to demonstrate that principle by employing an artificial intelligence (AI) agent to translate the amputee's movement intent through a peripheral nerve interface. The AI agent is designed based on the recurrent neural network (RNN) and could simultaneously decode six degrees of freedom (DOF) from multichannel nerve data in real-time. The decoder's performance is characterized in motor decoding experiments with three human amputees. Results: First, we show the AI agent enables amputees to intuitively control a prosthetic hand with individual finger and wrist movements up to 97-98% accuracy. Second, we demonstrate the AI agent's real-time performance by measuring the reaction time and information throughput in a hand gesture matching task. Third, we investigate the AI agent's long-term uses and show the decoder's robust predictive performance over a 16-month implant duration. Conclusion & significance: Our study demonstrates the potential of AI-enabled nerve technology, underlying the next generation of dexterous and intuitive prosthetic hands.

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2203.01175 [pdf, other]

libRoadRunner 2.0: A High-Performance SBML Simulation and Analysis Library

Authors: Ciaran Welsh, Jin Xu, Lucian Smith, Matthias König, Kiri Choi, Herbert M. Sauro

Abstract: Motivation: This paper presents libRoadRunner 2.0, an extensible, high-performance, cross-platform, open-source software library for the simulation and analysis of models expressed using Systems Biology Markup Language (SBML). Results: libRoadRunner is a self-contained library, able to run both as a component inside other tools via its C++ and C bindings, and interactively through its Python or Julia interface. libRoadRunner uses a custom Just-In-Time (JIT) compiler built on the widely-used LLVM JIT compiler framework. It compiles SBML-specified models directly into native machine code for a large variety of processors, making it appropriate for solving extremely large models or repeated runs. libRoadRunner is flexible, supporting the bulk of the SBML specification (except for delay and nonlinear algebraic equations) and including several SBML extensions such as composition and distributions. It offers multiple deterministic and stochastic integrators, as well as tools for steady-state, sensitivity, stability analysis, and structural analysis of the stoichiometric matrix. Availability: libRoadRunner binary distributions are available for Mac OS X, Linux, and Windows. The library is licensed under the Apache License Version 2.0. libRoadRunner is also available for ARM-based computers such as the Raspberry Pi and can in principle be compiled on any system supported by LLVM-13. http://sys-bio.github.io/roadrunner/index.html provides online documentation, full build instructions, binaries, and a git source repository.

Submitted 25 February, 2022; originally announced March 2022.

arXiv:2202.00087 [pdf, other]

Holistic Fine-grained GGS Characterization: From Detection to Unbalanced Classification

Authors: Yuzhe Lu, Haichun Yang, Zuhayr Asad, Zheyu Zhu, Tianyuan Yao, Jiachen Xu, Agnes B. Fogo, Yuankai Huo

Abstract: Recent studies have demonstrated the diagnostic and prognostic values of global glomerulosclerosis (GGS) in IgA nephropathy, aging, and end-stage renal disease. However, the fine-grained quantitative analysis of multiple GGS subtypes (e.g., obsolescent, solidified, and disappearing glomerulosclerosis) is typically a resource extensive manual process. Very few automatic methods, if any, have been developed to bridge this gap for such analytics. In this paper, we present a holistic pipeline to quantify GGS (with both detection and classification) from a whole slide image in a fully automatic manner. In addition, we conduct the fine-grained classification for the sub-types of GGS. Our study releases the open-source quantitative analytical tool for fine-grained GGS characterization while tackling the technical challenges in unbalanced classification and integrating detection and classification.

Submitted 31 January, 2022; originally announced February 2022.

arXiv:2111.08183 [pdf, other]

Computational tools for assessing gene therapy under branching process models of mutation

Authors: Timothy C Stutz, Janet S. Sinsheimer, Mary Sehl, Jason Xu

Abstract: Multitype branching processes are ideal for studying the population dynamics of stem cell populations undergoing mutation accumulation over the years following transplant. In such stochastic models, several quantities are of clinical interest as insertional mutagenesis carries the potential threat of leukemogenesis following gene therapy with autologous stem cell transplantation. In this paper, we develop a three-type branching process model describing accumulations of mutations in a population of stem cells distinguished by their ability for long-term self-renewal. Our outcome of interest is the appearance of a double-mutant cell, which carries a high potential for leukemic transformation. In our model, a single-hit mutation carries a slight proliferative advantage over wild-type stem cells. We compute marginalized transition probabilities that allow us to capture important quantitative aspects of our model, including the probability of observing a double-hit mutant and relevant moments of a single-hit mutation population over time. We thoroughly explore the model behavior numerically, varying birth rates across the initial sizes and populations of wild type stem cells and single-hit mutants, and compare the probability of observing a double-hit mutant under these conditions.
We find that increasing the number of single-mutants over wild-type particles initially present has a large effect on the occurrence of a double-mutant, and that it is relatively safe for single-mutants to be quite proliferative, provided the lentiviral gene addition avoids creating single mutants in the original insertion process. Our approach is broadly applicable to an important set of questions in cancer modeling and other population processes involving multiple stages, compartments, or types.

Submitted 15 November, 2021; originally announced November 2021.

Comments: 19 pages, 5 figures
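
Note on the three-type model described in this abstract: the quantity of interest, the probability that a double-hit mutant appears by a given time, can also be approximated by direct stochastic simulation. The sketch below is a plain Gillespie-style Monte Carlo in Python under assumed dynamics (each cell divides or dies at a constant per-cell rate, and a newborn daughter mutates to the next type with a fixed probability). It is not the authors' method, which computes marginalized transition probabilities, and every parameter name and value is a placeholder introduced here.

```python
import random


def double_mutant_appears(n_wild, n_single, birth, death, mu, t_max, rng):
    """Simulate one trajectory; return True if a double mutant arises by t_max.

    birth, death: per-cell rates as (wild-type, single-mutant) pairs.
    mu: mutation probabilities at division as
        (wild -> single, single -> double).
    """
    counts = [n_wild, n_single]
    t = 0.0
    while True:
        rates = [
            counts[0] * birth[0],  # wild-type division
            counts[0] * death[0],  # wild-type death
            counts[1] * birth[1],  # single-mutant division
            counts[1] * death[1],  # single-mutant death
        ]
        total = sum(rates)
        if total == 0.0:
            return False  # both populations extinct (or all rates zero)
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_max:
            return False
        event = rng.choices(range(4), weights=rates)[0]
        if event == 0:
            # The new daughter is a single mutant with probability mu[0].
            if rng.random() < mu[0]:
                counts[1] += 1
            else:
                counts[0] += 1
        elif event == 1:
            counts[0] -= 1
        elif event == 2:
            # The new daughter is a double mutant with probability mu[1].
            if rng.random() < mu[1]:
                return True
            counts[1] += 1
        else:
            counts[1] -= 1


def estimate_probability(n_runs, seed=0, **kwargs):
    """Fraction of simulated trajectories in which a double mutant appears."""
    rng = random.Random(seed)
    hits = sum(double_mutant_appears(rng=rng, **kwargs) for _ in range(n_runs))
    return hits / n_runs


# Placeholder parameters: single mutants divide slightly faster than wild type.
p_hat = estimate_probability(
    n_runs=500,
    n_wild=10,
    n_single=0,
    birth=(1.0, 1.1),
    death=(1.0, 1.0),
    mu=(0.05, 0.05),
    t_max=5.0,
)
```

Changing `n_single` relative to `n_wild` in the call gives a rough way to explore how the initial mix of cell types affects the estimate, at the cost of Monte Carlo noise that an exact computation would avoid.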

arXiv:2110.11814 [pdf, other]

Identifiability of local and global features of phylogenetic networks from average distances

Authors: Jingcheng Xu, Cécile Ané

Abstract: Phylogenetic networks extend phylogenetic trees to model non-vertical inheritance, by which a lineage inherits material from multiple parents. The computational complexity of estimating phylogenetic networks from genome-wide data with likelihood-based methods limits the size of networks that can be handled. Methods based on pairwise distances could offer faster alternatives. We study here the information that average pairwise distances contain on the underlying phylogenetic network, by characterizing local and global features that can or cannot be identified. For general networks, we clarify that the root and edge lengths adjacent to reticulations are not identifiable, and then focus on the class of zipped-up semidirected networks. We provide a criterion to swap subgraphs locally, such as 3-cycles, resulting in indistinguishable networks. We propose the "distance split tree", which can be constructed from pairwise distances, and prove that it is a refinement of the network's tree of blobs, capturing the tree-like features of the network. For level-1 networks, this distance split tree is equal to the tree of blobs refined to separate polytomies from blobs, and we prove that the mixed representation of the network is identifiable. The information loss is localized around 4-cycles, for which the placement of the reticulation is unidentifiable. The mixed representation combines split edges for 4-cycles, regular tree and hybrid edges from the semidirected network, and edge parameters that encode all information identifiable from average pairwise distances.

Submitted 25 June, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

arXiv:2102.10971 [pdf, other]

doi 10.1109/TCSS.2021.3114504

Agent-Based Campus Novel Coronavirus Infection and Control Simulation

Authors: Pei Lv, Quan Zhang, Boya Xu, Ran Feng, Chaochao Li, Junxiao Xue, Bing Zhou, Mingliang Xu

Abstract: Corona Virus Disease 2019 (COVID-19), due to its extremely high infectivity, has been spreading rapidly around the world and bringing huge influence to socioeconomic development as well as people's daily life. Taking for example the virus transmission that may occur after college students return to school, we analyze the quantitative influence of the key factors on the virus spread, including crowd density and self-protection. A Campus Virus Infection and Control Simulation model (CVICS) of the novel coronavirus is proposed in this paper, fully considering the characteristics of repeated contact and strong mobility of crowds in a closed environment. Specifically, we build an agent-based infection model, introduce the mean field theory to calculate the probability of virus transmission, and micro-simulate the daily prevalence of infection among individuals. The experimental results show that the proposed model efficiently simulates how the virus spreads in a dense crowd in frequent contact in a closed environment. Furthermore, preventive and control measures such as self-protection, crowd decentralization and isolation during the epidemic can effectively delay the arrival of the infection peak and reduce the prevalence, and finally lower the risk of COVID-19 transmission after the students return to school.

Submitted 1 September, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

Comments: submitted to IEEE Transactions On Computational Social Systems

Journal ref: IEEE Transactions on Computational Social Systems, 2021
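
Note on the modeling approach described in this abstract: an agent-based infection model with a mean-field transmission probability can be sketched in a few lines. The Python below is a generic illustration, not the CVICS model: it assumes each susceptible agent meets `contacts` others per day, that a fraction `infected_fraction` of those are infectious, and that each infectious contact transmits independently with probability `beta * (1 - protection)`. All names, the recovery rule, and the parameter values are assumptions introduced here.

```python
import random


def daily_infection_probability(beta, contacts, infected_fraction, protection=0.0):
    """Mean-field chance that one susceptible agent is infected today."""
    per_contact = beta * (1.0 - protection)
    expected_infectious_contacts = contacts * infected_fraction
    # Probability of escaping every infectious contact, then the complement.
    return 1.0 - (1.0 - per_contact) ** expected_infectious_contacts


def simulate(n_agents, n_initial, beta, contacts, protection,
             infectious_days, n_days, seed=0):
    """Return the number of infectious agents at the start of each day."""
    rng = random.Random(seed)
    # Agent state: 0 = susceptible, k > 0 = infectious with k days left,
    # -1 = recovered.
    state = [infectious_days] * n_initial + [0] * (n_agents - n_initial)
    prevalence = []
    for _ in range(n_days):
        infected = sum(1 for s in state if s > 0)
        prevalence.append(infected)
        p = daily_infection_probability(
            beta, contacts, infected / n_agents, protection
        )
        next_state = []
        for s in state:
            if s == 0:
                next_state.append(infectious_days if rng.random() < p else 0)
            elif s > 0:
                next_state.append(s - 1 if s > 1 else -1)
            else:
                next_state.append(-1)
        state = next_state
    return prevalence


# Placeholder scenarios: a dense crowd with and without self-protection.
unprotected = simulate(1000, 5, beta=0.05, contacts=20, protection=0.0,
                       infectious_days=7, n_days=60)
protected = simulate(1000, 5, beta=0.05, contacts=20, protection=0.5,
                     infectious_days=7, n_days=60)
```

In this sketch, crowd density enters only through `contacts` and self-protection only through `protection`, so comparing the two prevalence curves gives a rough feel for how those two factors shift the infection peak.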

arXiv:2012.11889 [pdf]

Establishment of a diagnostic model to distinguish coronavirus disease 2019 from influenza A based on laboratory findings

Authors: Dongyang Xing, Suyan Tian, Yukun Chen, Jinmei Wang, Xuejuan Sun, Shanji Li, Jiancheng Xu

Abstract: Background: Coronavirus disease 2019 (COVID-19) and influenza A are common diseases caused by viral infection. The clinical symptoms and transmission routes of the two diseases are similar. However, there are no relevant studies on laboratory diagnostic models to discriminate COVID-19 and influenza A. This study aims at establishing a signature of laboratory findings to tell patients with COVID-19 apart from those with influenza A perfectly. Materials: In this study, 56 COVID-19 patients and 54 influenza A patients were included. Laboratory findings, epidemiological characteristics and demographic data were obtained from electronic medical record databases. Elastic network models, followed by a stepwise logistic regression model, were implemented to identify indicators capable of discriminating COVID-19 and influenza A. A nomogram is diagramed to show the resulting discriminative model. Results: The majority of hematological and biochemical parameters in COVID-19 patients were significantly different from those in influenza A patients. In the final model, albumin/globulin (A/G), total bilirubin (TBIL) and erythrocyte specific volume (HCT) were selected as predictors. Using an external dataset, the model was validated to perform well. Conclusion: A diagnostic model of laboratory findings was established, in which A/G, TBIL and HCT were included as highly relevant indicators for the segmentation of COVID-19 and influenza A, providing a complementary means for the precise diagnosis of these two diseases.

Submitted 22 December, 2020; originally announced December 2020.

Comments: 26 pages,3 figures

arXiv:2011.00034 [pdf, other]

Adaptive Semi-Supervised Intent Inferral to Control a Powered Hand Orthosis for Stroke

Authors: Jingxi Xu, Cassie Meeker, Ava Chen, Lauren Winterbottom, Michaela Fraser, Sangwoo Park, Lynne M. Weber, Mitchell Miya, Dawn Nilsen, Joel Stein, Matei Ciocarlie

Abstract: In order to provide therapy in a functional context, controls for wearable robotic orthoses need to be robust and intuitive. We have previously introduced an intuitive, user-driven, EMG-based method to operate a robotic hand orthosis, but the process of training a control that is robust to concept drift (changes in the input signal) places a substantial burden on the user. In this paper, we explore semi-supervised learning as a paradigm for controlling a powered hand orthosis for stroke subjects. To the best of our knowledge, this is the first use of semi-supervised learning for an orthotic application. Specifically, we propose a disagreement-based semi-supervision algorithm for handling intrasession concept drift based on multimodal ipsilateral sensing. We evaluate the performance of our algorithm on data collected from five stroke subjects. Our results show that the proposed algorithm helps the device adapt to intrasession drift using unlabeled data and reduces the training burden placed on the user. We also validate the feasibility of our proposed algorithm with a functional task; in these experiments, two subjects successfully completed multiple instances of a pick-and-handover task.

Submitted 1 March, 2022; v1 submitted 30 October, 2020; originally announced November 2020.

Comments: 7 pages; Accepted to International Conference on Robotics and Automation (ICRA) 2022

arXiv:2009.12728 [pdf]

Network analysis of ballast-mediated species transfer reveals important introduction and dispersal patterns in the Arctic

Authors: Mandana Saebi, Jian Xu, Salvatore R. Curasi, Erin K. Grey, Nitesh V. Chawla, David M. Lodge

Abstract: Rapid climate change has wide-ranging implications for the Arctic region, including sea ice loss, increased geopolitical attention, and expanding economic activity, including a dramatic increase in shipping activity. As a result, the risk of harmful non-native marine species being introduced into this critical region will increase unless policy and management steps are implemented in response. Using big data about shipping, ecoregions, and environmental conditions, we leverage network analysis and data mining techniques to assess, visualize, and project ballast water-mediated species introductions into the Arctic and dispersal of non-native species within the Arctic. We first identify high-risk connections between the Arctic and non-Arctic ports that could be sources of non-native species over 15 years (1997-2012) and observe the emergence of shipping hubs in the Arctic where the cumulative risk of non-native species introduction is increasing. We then consider how environmental conditions can constrain this Arctic introduction network for species with different physiological limits, thus providing a species-level tool for decision-makers. Next, we focus on within-Arctic ballast-mediated species dispersal where we use higher-order network analysis to identify critical shipping routes that may facilitate species dispersal within the Arctic. The risk assessment and projection framework we propose could inform risk-based assessment and management of ship-borne invasive species in the Arctic.

Submitted 26 September, 2020; originally announced September 2020.

arXiv:1910.04221 [pdf, other]

Likelihood-based Inference for Partially Observed Epidemics on Dynamic Networks

Authors: Fan Bu, Allison E. Aiello, Jason Xu, Alexander Volfovsky

Abstract: We propose a generative model and an inference scheme for epidemic processes on dynamic, adaptive contact networks. Network evolution is formulated as a link-Markovian process, which is then coupled to an individual-level stochastic SIR model, in order to describe the interplay between epidemic dynamics on a network and network link changes. A Markov chain Monte Carlo framework is developed for likelihood-based inference from partial epidemic observations, with a novel data augmentation algorithm specifically designed to deal with missing individual recovery times under the dynamic network setting. Through a series of simulation experiments, we demonstrate the validity and flexibility of the model as well as the efficacy and efficiency of the data augmentation inference scheme. The model is also applied to a recent real-world dataset on influenza-like-illness transmission with high-resolution social contact tracking records.

Submitted 5 April, 2020; v1 submitted 9 October, 2019; originally announced October 2019.

arXiv:1910.01726 [pdf, other]

A machine learning method correlating pulse pressure wave data with pregnancy

Authors: Jianhong Chen, Huang Huang, Wenrui Hao, Jinchao Xu

Abstract: Pulse feeling, representing the tactile arterial palpation of the heartbeat, has been widely used in traditional Chinese medicine (TCM) to diagnose various diseases. The quantitative relationship between the pulse wave and health conditions, however, has not been investigated in modern medicine. In this paper, we explored the correlation between pulse pressure wave (PPW), rather than the pulse key features in TCM, and pregnancy by using deep learning technology. This computational approach shows that the accuracy of pregnancy detection by the PPW is 84% with an AUC of 91%. Our study is a proof of concept of pulse diagnosis and will also motivate further sophisticated investigations on pulse waves.

Submitted 3 October, 2019; originally announced October 2019.

arXiv:1905.10923 [pdf]

doi 10.1371/journal.pone.0228036

Investigation of HIV-1 Gag binding with RNAs and Lipids using Atomic Force Microscopy

Authors: Shaolong Chen, Jun Xu, Mingyue Liu, A. L. N. Rao, Roya Zandi, Sarjeet S. Gill, Umar Mohideen

Abstract: Atomic Force Microscopy was utilized to study the morphology of Gag, ΨRNA, and their binding complexes with lipids in a solution environment with 0.1 Å vertical and 1 nm lateral resolution. TARpolyA RNA was used as an RNA control. The lipid used was phosphatidylinositol-(4,5)-bisphosphate (PI(4,5)P2). The morphologies of the specific complexes Gag-ΨRNA, Gag-TARpolyA RNA, Gag-PI(4,5)P2 and PI(4,5)P2-ΨRNA-Gag were studied. They were imaged on either positively or negatively charged mica substrates depending on the net charges carried. Gag and its complexes consist of monomers, dimers and tetramers, which was confirmed by gel electrophoresis. The addition of specific ΨRNA to Gag is found to increase Gag multimerization. Non-specific TARpolyA RNA was found not to lead to an increase in Gag multimerization. The addition of PI(4,5)P2 to Gag increases Gag multimerization, but to a lesser extent than ΨRNA. When both ΨRNA and PI(4,5)P2 are present, Gag undergoes conformational changes and an even higher degree of multimerization.

Submitted 26 May, 2019; originally announced May 2019.

arXiv:1905.07818 [pdf]

Magnetic resonance imaging of mean cell size in human breast tumors

Authors: Junzhong Xu, Xiaoyu Jiang, Hua Li, Lori R. Arlinghaus, Eliot T. McKinley, Sean P. Devan, Benjamin M. Hardy, Hakmook Kang, Anuradha B. Chakravarthy, John C. Gore

Abstract: Purpose: Cell size is a fundamental characteristic of all tissues, and changes in cell size in cancer reflect tumor status and response to treatments, such as apoptosis and cell cycle arrest. Unfortunately, cell size can only be obtained by pathologic evaluation of the tumor in the current standard of care. Previous imaging approaches can be implemented only on animal MRI scanners or require relatively long acquisition times that are undesirable for clinical imaging. There is a need to develop cell size imaging for clinics. Experimental Design: We propose a new method, IMPULSED (Imaging Microstructural Parameters Using Limited Spectrally Edited Diffusion), that can characterize mean cell sizes in solid tumors. We report the use of combined sequences with different gradient waveforms on human MRI and analytical equations that link DWI signals of real gradient waveforms and specific microstructural parameters such as cell size. We also describe comprehensive validations using computer simulations, cell experiments in vitro, and animal experiments in vivo and finally demonstrate applications in pre-operative breast cancer patients. Results: With fast acquisitions (~7 min), IMPULSED can provide high-resolution (1.3 mm in-plane) mapping of mean cell size of human tumors in vivo on currently available 3T MRI scanners. All validations suggest IMPULSED provides accurate and reliable measurements of mean cell size. Conclusion: The proposed IMPULSED method can assess cell size variations in the tumor of breast cancer patients, which may have the potential to assess early response to neoadjuvant therapy.

Submitted 19 May, 2019; originally announced May 2019.

arXiv:1902.09934 [pdf]

A Fully-Automatic Framework for Parkinson's Disease Diagnosis by Multi-Modality Images

Authors: Jiahang Xu, Fangyang Jiao, Yechong Huang, Xinzhe Luo, Qian Xu, Ling Li, Xueling Liu, Chuantao Zuo, Ping Wu, Xiahai Zhuang

Abstract: Background: Parkinson's disease (PD) is a prevalent long-term neurodegenerative disease. Though the diagnostic criteria of PD are relatively well defined, the current medical imaging diagnostic procedures are expertise-demanding, and thus call for a more highly integrated AI-based diagnostic algorithm. Methods: In this paper, we proposed an automatic, end-to-end, multi-modality diagnosis framework, including segmentation, registration, feature generation and machine learning, to process the information of the striatum for the diagnosis of PD. Multiple modalities, including T1-weighted MRI and 11C-CFT PET, were used in the proposed framework. The reliability of this framework was then validated on a dataset from the PET center of Huashan Hospital, which contains paired T1-MRI and CFT-PET images of 18 Normal (NL) subjects and 49 PD subjects. Results: We obtained an accuracy of 100% for the PD/NL classification task. In addition, we conducted several comparative experiments to validate the diagnosis ability of our framework. Conclusion: Through experiments we illustrate that (1) automatic segmentation has the same classification effect as manual segmentation, (2) multi-modality images generate a better prediction than single-modality images, and (3) the volume feature is shown to be irrelevant to PD diagnosis.

Submitted 26 February, 2019; originally announced February 2019.

Comments: 16 pages, 6 figures, 4 tables

arXiv:1811.12314 [pdf, ps, other]

Swift Two-sample Test on High-dimensional Neural Spiking Data

Authors: Zhi-Qin John Xu, Douglas Zhou, David Cai

Abstract: To understand how neural networks process information, it is important to investigate how neural network dynamics varies with respect to different stimuli. One challenging task is to design efficient statistical approaches to analyze multiple spike train data obtained from a short recording time. Thanks to the development of high-dimensional statistical methods, it is possible to deal with data whose dimension is much larger than the sample size. However, these methods often require statistically independent samples to start with, while neural data are correlated over consecutive sampling time bins. We develop an approach to pretreat neural data to become independent samples over time by transferring the correlation of dynamics for each neuron in different sampling time bins into the correlation of dynamics among different dimensions within each sampling time bin. We verify the method using simulation data generated from integrate-and-fire neuron network models and a large-scale network model of primary visual cortex within a short time, e.g., a few seconds. Our method may allow experimenters to take advantage of the development of statistical methods to analyze high-dimensional neural data.

Submitted 11 November, 2018; originally announced November 2018.

Comments: 10 pages, 6 figures

MSC Class: 62H15; 62H30; 92B15 ACM Class: G.3

arXiv:1811.03481 [pdf]

doi 10.1073/pnas.1821309116

Distance-based Protein Folding Powered by Deep Learning

Authors: Jinbo Xu

Abstract: Contact-assisted protein folding has made very good progress, but two challenges remain. One is accurate contact prediction for proteins lacking many sequence homologs and the other is that time-consuming folding simulation is often needed to predict good 3D models from predicted contacts. We show that the protein distance matrix can be predicted well by deep learning and then directly used to construct 3D models without folding simulation at all. Using distance geometry to construct 3D models from our predicted distance matrices, we successfully folded 21 of the 37 CASP12 hard targets with a median family size of 58 effective sequence homologs within 4 hours on a Linux computer of 20 CPUs. In contrast, contacts predicted by direct coupling analysis (DCA) cannot fold any of them in the absence of folding simulation and the best CASP12 group folded 11 of them by integrating predicted contacts into complex, fragment-based folding simulation. The rigorous experimental validation on 15 CASP13 targets shows that among the 3 hardest targets of new fold our distance-based folding servers successfully folded 2 large ones with <150 sequence homologs while the other servers failed on all three, and that our ab initio folding server also predicted the best, high-quality 3D model for a large homology modeling target. Further experimental validation in CAMEO shows that our ab initio folding server predicted the correct fold for a membrane protein of new fold with 200 residues and 229 sequence homologs while all the other servers failed. These results imply that deep learning offers an efficient and accurate solution for ab initio folding on a personal computer.

Submitted 11 November, 2018; v1 submitted 8 November, 2018; originally announced November 2018.

arXiv:1810.13041 [pdf, other]

doi 10.1039/C8SM02170H

Concurrent coupling of atomistic simulation and mesoscopic hydrodynamics for flows over soft multi-functional surfaces

Authors: Yuying Wang, Zhen Li, Junbo Xu, Chao Yang, George Em Karniadakis

Abstract: We develop an efficient parallel multiscale method that bridges the atomistic and mesoscale regimes, from nanometer to micron and beyond, via concurrent coupling of atomistic simulation and mesoscopic dynamics. In particular, we combine an all-atom molecular dynamics (MD) description for specific atomistic details in the vicinity of the functional surface, with a dissipative particle dynamics (DPD) approach that captures mesoscopic hydrodynamics in the domain away from the functional surface. In order to achieve a seamless transition in dynamic properties we endow the MD simulation with a DPD thermostat, which is validated against experimental results by modeling water at different temperatures. We then validate the MD-DPD coupling method for transient Couette and Poiseuille flows, demonstrating that the concurrent MD-DPD coupling can accurately resolve the continuum-based analytical solutions. Subsequently, we simulate shear flows over polydimethylsiloxane (PDMS)-grafted surfaces (polymer brushes) for various grafting densities, and investigate the slip flow as a function of the shear stress. We verify that a "universal" power law exists for the slip length, in agreement with published results. Having validated the MD-DPD coupling method, we simulate time-dependent flows past an endothelial glycocalyx layer (EGL) in a microchannel. Coupled simulation results elucidate the dynamics of EGL changing from an equilibrium state to a compressed state under shear by aligning the molecular structures along the shear direction. MD-DPD simulation results agree well with results of a single MD simulation, but with the former more than two orders of magnitude faster than the latter for system sizes above one micron.

Submitted 27 October, 2018; originally announced October 2018.

Comments: 11 pages, 12 figures

Journal ref: Soft Matter, 2019,15: 1747-1757

arXiv:1808.05766 [pdf]

The Function Transformation Omics - Funomics

Authors: Yongshuai Jiang, Jing Xu, Simeng Hu, Di Liu, Linna Zhao, Xu Zhou

Abstract: There are no two identical leaves in the world, so how to find effective markers or features to distinguish them is an important issue. Function transformation, such as f(x,y) and f(x,y,z), can transform two, three, or multiple input/observation variables (in biology, these generally refer to the observed/measured values of biomarkers, biological characteristics, or other indicators) into a new output variable (new characteristics or indicators). This gives us a chance to re-cognize objective things or relationships beyond the original measurements. For example, Body Mass Index, which transforms weight and height into a new indicator BMI=x/y^2 (where x is weight and y is height), is commonly used to gauge obesity. Here, we proposed a new system, Funomics (Function Transformation Omics), for understanding the world from a different perspective. Funome can be understood as a set of math functions consisting of basic elementary functions (such as power functions and exponential functions) and basic mathematical operations (such as addition, subtraction). By scanning the whole Funome, researchers can identify some special functions (called handsome functions) which can generate novel important output variables (characteristics or indicators). We also start "the Funome project" to develop novel methods, a function library and analysis software for Funome studies. The Funome project will accelerate the discovery of new useful indicators or characteristics, will improve the utilization efficiency of directly measured data, and will enhance our ability to understand the world. The analysis tools and data resources about the Funome project can be found gradually at http://www.funome.com.

Submitted 17 August, 2018; originally announced August 2018.

arXiv:1808.04499 [pdf, other]

doi 10.3390/e21010076

Dynamical and Coupling Structure of Pulse-Coupled Networks in Maximum Entropy Analysis

Authors: Zhi-Qin John Xu, Douglas Zhou, David Cai

Abstract: Maximum entropy principle (MEP) analysis with few non-zero effective interactions successfully characterizes the distribution of dynamical states of pulse-coupled networks in many experiments, e.g., in neuroscience. To better understand the underlying mechanism, we found a relation between the dynamical structure, i.e., effective interactions in MEP analysis, and the coupling structure of pulse-coupled networks to understand how a sparse coupling structure could lead to a sparse coding by effective interactions. This relation quantitatively displays how the dynamical structure is closely related to the coupling structure.

Submitted 13 August, 2018; originally announced August 2018.

Comments: 4 pages, 3 figures

MSC Class: 92B15; 92B20

arXiv:1808.03386 [pdf]

doi 10.3938/jkps.73.1908

Immunological recognition by artificial neural networks

Authors: Jin Xu, Junghyo Jo

Abstract: The binding affinity between the T-cell receptors (TCRs) and antigenic peptides mainly determines immunological recognition. It is not a trivial task for T cells to identify the digital sequences of peptide amino acids by simply relying on the integrated binding affinity between TCRs and antigenic peptides. To address this problem, we examine whether the affinity-based discrimination of peptide sequences is learnable and generalizable by artificial neural networks (ANNs) that process the digital experimental amino acid sequence information of receptors and peptides. A pair of TCR and peptide sequences corresponds to the input for ANNs, while the success or failure of the immunological recognition corresponds to the output. The output is obtained from both a theoretical model and experimental data. In either case, we confirmed that ANNs could learn the immunological recognition. We also found that a homogenized encoding of amino acid sequence was more effective for the supervised learning task.

Submitted 12 February, 2024; v1 submitted 9 August, 2018; originally announced August 2018.

arXiv:1802.01980 [pdf]

Layered structure and leveled function of a human brain

Authors: Shengyong Xu, Jingjing Xu, Rujun Dai

Abstract: The anatomically layered structure of a human brain results in leveled functions. In all these levels of different functions, comparison, feedback and imitation are the universal and crucial mechanisms. Languages, symbols and tools play key roles in the development of human brain and entire civilization.

Submitted 4 February, 2018; originally announced February 2018.

arXiv:1712.07244 [pdf, ps, other]

Real-value and confidence prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning

Authors: Yujuan Gao, Sheng Wang, Minghua Deng, Jinbo Xu

Abstract: Background. Protein dihedral angles provide a detailed description of protein local conformation. Predicted dihedral angles can be used to narrow down the conformational space of the whole polypeptide chain significantly, thus aiding protein tertiary structure prediction. However, direct angle prediction from sequence alone is challenging. Method. In this study, we present a novel method to predict real-valued angles by combining clustering and deep learning. That is, we first generate certain clusters of angles (each assigned a label) and then apply a deep residual neural network to predict the label posterior probability. Finally, we output real-valued prediction by a mixture of the clusters with their predicted probabilities. At the same time, we also estimate the bound of the prediction errors at each residue from the predicted label probabilities. Result. In this article, we present a novel method (named RaptorX-Angle) to predict real-valued angles by combining clustering and deep learning. Tested on a subset of PDB25 and the targets in the latest two Critical Assessment of protein Structure Prediction (CASP) experiments, our method outperforms the existing state-of-the-art method SPIDER2 in terms of Pearson Correlation Coefficient (PCC) and Mean Absolute Error (MAE). Our result also shows an approximately linear relationship between the real prediction errors and our estimated bounds. That is, the real prediction error can be well approximated by our estimated bounds. Conclusions. Our study provides an alternative and more accurate prediction of dihedral angles, which may facilitate protein structure prediction and functional study.

Submitted 19 December, 2017; originally announced December 2017.

Comments: 23 pages, 6 figures, has been accepted by The Sixteenth Asia Pacific Bioinformatics Conference

arXiv:1712.04633 [pdf, other]

doi 10.1016/j.jtbi.2019.01.025

Broad cross-reactivity of the T-cell repertoire achieves specific and sufficiently rapid target searching

Authors: Jin Xu, Junghyo Jo

Abstract: The molecular recognition of T-cell receptors is the hallmark of adaptive immunity. Given the finiteness of the T-cell repertoire, individual T-cell receptors must be cross-reactive to multiple antigenic peptides. In this study, we quantify the variability of the cross-reactivity by using a string model that estimates the binding affinity between two sequences of amino acids. We examine sequences of 10,000 human T-cell receptors and 10,000 antigenic peptides, and obtain a full spectrum of cross-reactivity of the receptor-peptide binding. Then, we find that the cross-reactivity spectrum is broad. Some T cells are reactive to 1,000 peptides, but some T cells are reactive to only one or two peptides. Since the degree of cross-reactivity has a correlation with the (un)binding affinity of receptors, we further investigate how the broad cross-reactivity affects the target searching of T cells. Highly cross-reactive T cells may not require many trials for searching correct targets, but they may spend a long time to unbind from incorrect targets. In contrast, low cross-reactive T cells may not spend a long time to ignore incorrect targets, but they require many trials for screening correct targets. We evaluate this hypothesis, and show that the broad cross-reactivity of the natural T-cell repertoire can balance the trade-off between the rapid screening and unbinding penalty.

Submitted 13 February, 2024; v1 submitted 13 December, 2017; originally announced December 2017.

arXiv:1711.05042 [pdf]

A memory mechanism based on two dimensional code of neurosome pattern

Authors: Shengyong Xu, Jingjing Xu

Abstract: We have recognized that 2D codes, i.e., a group of strongly connected neurosomes that can be simultaneously excited, are the basic data carriers for memory in a brain. An echoing mechanism between two neighboring layers of neurosomes is assumed to establish temporary memory, and repeating processes enhance the formation of long-term memory. Creation and degradation of memory information are statistical. The maximum capacity of memory storage in a human brain is estimated to be one billion 2D codes. By triggering one or more neurosomes in a neurosome-based 2D code, the whole strongly connected neurosome network is capable of exciting simultaneously and projecting its excitation onto an analysis layer of neurons in cortex, thus retrieving the stored memory data. The capability of comparing two 2D codes in the analysis layer is one of the major brain functions.

Submitted 14 November, 2017; originally announced November 2017.

Comments: 9 pages, 2 figures

arXiv:1708.08407 [pdf]

Folding membrane proteins by deep transfer learning

Authors: Sheng Wang, Zhen Li, Yizhou Yu, Jinbo Xu

Abstract: Computational elucidation of membrane protein (MP) structures is challenging partially due to lack of sufficient solved structures for homology modeling. Here we describe a high-throughput deep transfer learning method that first predicts MP contacts by learning from non-membrane proteins (non-MPs) and then predicts three-dimensional structure models using the predicted contacts as distance restraints. Tested on 510 non-redundant MPs, our method has contact prediction accuracy at least 0.18 better than existing methods, predicts correct folds for 218 MPs (TMscore at least 0.6), and generates three-dimensional models with RMSD less than 4 Angstrom and 5 Angstrom for 57 and 108 MPs, respectively. A rigorous blind test in the continuous automated model evaluation (CAMEO) project shows that our method predicted high-resolution three-dimensional models for two recent test MPs of 210 residues with RMSD close to 2 Angstrom. We estimated that our method could predict correct folds for between 1,345 and 1,871 reviewed human multi-pass MPs including a few hundred new folds, which shall facilitate the discovery of drugs targeting membrane proteins.

Submitted 28 August, 2017; originally announced August 2017.

arXiv:1704.07207 [pdf]

Predicting membrane protein contacts from non-membrane proteins by deep transfer learning

Authors: Zhen Li, Sheng Wang, Yizhou Yu, Jinbo Xu

Abstract: Computational prediction of membrane protein (MP) structures is very challenging, partly due to the lack of sufficient solved structures for homology modeling. Recently, direct evolutionary coupling analysis (DCA) has shed some light on protein contact prediction and, accordingly, contact-assisted folding, but DCA is effective only on some very large families since it uses information only in a single protein family. This paper presents a deep transfer learning method that can significantly improve MP contact prediction by learning contact patterns and the complex sequence-contact relationship from thousands of non-membrane proteins (non-MPs). Tested on 510 non-redundant MPs, our deep model (learned from only non-MPs) has a top L/10 long-range contact prediction accuracy of 0.69, better than our deep model trained by only MPs (0.63) and much better than a representative DCA method, CCMpred (0.47), and the CASP11 winner MetaPSICOV (0.55). The accuracy of our deep model can be further improved to 0.72 when trained by a mix of non-MPs and MPs. When only contacts in transmembrane regions are evaluated, our method has top L/10 long-range accuracy of 0.62, 0.57, and 0.53 when trained by a mix of non-MPs and MPs, by non-MPs only, and by MPs only, respectively, still much better than MetaPSICOV (0.45) and CCMpred (0.40). All these results suggest that the sequence-structure relationship learned by our deep model from non-MPs generalizes well to MP contact prediction. Improved contact prediction also leads to better contact-assisted folding. Using only top predicted contacts as restraints, our deep learning method can fold 160 and 200 of 510 MPs with TMscore>0.6 when trained by non-MPs only and by a mix of non-MPs and MPs, respectively, while CCMpred and MetaPSICOV can do so for only 56 and 77 MPs, respectively. Our contact-assisted folding also greatly outperforms homology modeling.

Submitted 24 April, 2017; originally announced April 2017.

arXiv:1611.05403 [pdf]

doi 10.1002/chem.201505173

Graphitic C3N4 Sensitized TiO2 Nanotube Layers: A Visible Light Activated Efficient Antimicrobial Platform

Authors: Jingwen Xu, Yan Li, Xuemei Zhou, Yuzhen Li, Zhi-Da Gao, Yan-Yan Song, Patrik Schmuki

Abstract: In this work, we introduce a facile procedure to graft a thin graphitic C3N4 (g-C3N4) layer on aligned TiO2 nanotube arrays (TiNT) by a one-step chemical vapor deposition (CVD) approach. This provides a platform to enhance the visible-light response of TiO2 nanotubes for antimicrobial applications. The formed g-C3N4/TiNT binary nanocomposite exhibits excellent bactericidal efficiency against E. coli as a visible-light-activated antibacterial coating.

Submitted 20 October, 2016; originally announced November 2016.

Journal ref: Chemistry - A European Journal, Volume 22, Issue 12, pages 3947-3951, March 14, 2016

Showing 1–50 of 86 results for author: Xu, J