-
BrainFounder: Towards Brain Foundation Models for Neuroimage Analysis
Authors:
Joseph Cox,
Peng Liu,
Skylar E. Stolte,
Yunchao Yang,
Kang Liu,
Kyle B. See,
Huiwen Ju,
Ruogu Fang
Abstract:
The burgeoning field of brain health research increasingly leverages artificial intelligence (AI) to interpret and analyze neurological data. This study introduces a novel approach towards the creation of medical foundation models by integrating a large-scale multi-modal magnetic resonance imaging (MRI) dataset derived from 41,400 participants in its own. Our method involves a novel two-stage pretraining approach using vision transformers. The first stage is dedicated to encoding anatomical structures in generally healthy brains, identifying key features such as shapes and sizes of different brain regions. The second stage concentrates on spatial information, encompassing aspects like location and the relative positioning of brain structures. We rigorously evaluate our model, BrainFounder, using the Brain Tumor Segmentation (BraTS) challenge and Anatomical Tracings of Lesions After Stroke v2.0 (ATLAS v2.0) datasets. BrainFounder demonstrates a significant performance gain, surpassing the achievements of the previous winning solutions using fully supervised learning. Our findings underscore the impact of scaling up both the complexity of the model and the volume of unlabeled training data derived from generally healthy brains, which enhances the accuracy and predictive capabilities of the model in complex neuroimaging tasks with MRI. The implications of this research provide transformative insights and practical applications in healthcare and make substantial steps towards the creation of foundation models for Medical AI. Our pretrained models and training code can be found at https://github.com/lab-smile/GatorBrain.
Submitted 14 June, 2024;
originally announced June 2024.
-
Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models
Authors:
Songtao Liu,
Hanjun Dai,
Yue Zhao,
Peng Liu
Abstract:
Molecule synthesis through machine learning is one of the fundamental problems in drug discovery. Current data-driven strategies employ one-step retrosynthesis models and search algorithms to predict synthetic routes in a top-down manner. Despite their effective performance, these strategies are limited in synthetic route generation because they greedily select the next molecule set without any lookahead. Furthermore, existing strategies cannot control the generation of synthetic routes based on criteria such as material costs, yields, and step count. In this work, we propose a general and principled framework via conditional residual energy-based models (EBMs) that focuses on the quality of the entire synthetic route based on the specific criteria. By incorporating an additional energy-based function into our probabilistic model, our proposed algorithm can enhance the quality of the most probable synthetic routes (with higher probabilities) generated by various strategies in a plug-and-play fashion. Extensive experiments demonstrate that our framework can consistently boost performance across various strategies and outperforms previous state-of-the-art top-1 accuracy by a margin of 2.5%. Code is available at https://github.com/SongtaoLiu0823/CREBM.
Submitted 4 June, 2024;
originally announced June 2024.
-
A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions
Authors:
Pengfei Liu,
Jun Tao,
Zhixiang Ren
Abstract:
The task of chemical reaction predictions (CRPs) plays a pivotal role in advancing drug discovery and material science. However, its effectiveness is constrained by the vast and uncertain chemical reaction space and challenges in capturing reaction selectivity, particularly due to existing methods' limitations in exploiting the data's inherent knowledge. To address these challenges, we introduce a data-curated self-feedback knowledge elicitation approach. This method starts from iterative optimization of molecular representations and facilitates the extraction of knowledge on chemical reaction types (RTs). Then, we employ adaptive prompt learning to infuse the prior knowledge into the large language model (LLM). As a result, we achieve significant enhancements: a 14.2% increase in retrosynthesis prediction accuracy, a 74.2% rise in reagent prediction accuracy, and an expansion in the model's capability for handling multi-task chemical reactions. This research offers a novel paradigm for knowledge elicitation in scientific research and showcases the untapped potential of LLMs in CRPs.
Submitted 15 April, 2024;
originally announced April 2024.
-
DeepCRE: Transforming Drug R&D via AI-Driven Cross-drug Response Evaluation
Authors:
Yushuai Wu,
Ting Zhang,
Hao Zhou,
Hainan Wu,
Hanwen Sunchu,
Lei Hu,
Xiaofang Chen,
Suyuan Zhao,
Gaochao Liu,
Chao Sun,
Jiahuan Zhang,
Yizhen Luo,
Peng Liu,
Zaiqing Nie,
Yushuai Wu
Abstract:
The fields of therapeutic application and drug research and development (R&D) both face substantial challenges, i.e., the therapeutic domain calls for more treatment alternatives, while numerous promising pre-clinical drugs have failed in clinical trials. One of the reasons is the inadequacy of Cross-drug Response Evaluation (CRE) during the late stages of drug R&D. Although in-silico CRE models bring a promising solution, existing methodologies are restricted to early stages of drug R&D, such as target and cell-line levels, offering limited improvement to clinical success rates. Herein, we introduce DeepCRE, a pioneering AI model designed to predict CRE effectively in the late stages of drug R&D. DeepCRE outperforms the existing best models by achieving an average performance improvement of 17.7% in patient-level CRE, and a 5-fold increase in indication-level CRE, facilitating more accurate personalized treatment predictions and better pharmaceutical value assessment for indications, respectively. Furthermore, DeepCRE has identified a set of six drug candidates that show significantly greater effectiveness than a comparator set of two approved drugs in 5/8 colorectal cancer organoids. This demonstrates the capability of DeepCRE to systematically uncover a spectrum of drug candidates with enhanced therapeutic effects, highlighting its potential to transform drug R&D.
Submitted 18 March, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
Impact of Domain Knowledge and Multi-Modality on Intelligent Molecular Property Prediction: A Systematic Survey
Authors:
Taojie Kuang,
Pengfei Liu,
Zhixiang Ren
Abstract:
The precise prediction of molecular properties is essential for advancements in drug development, particularly in virtual screening and compound optimization. The recent introduction of numerous deep learning-based methods has shown remarkable potential in enhancing molecular property prediction (MPP), especially improving accuracy and insights into molecular structures. Yet, two critical questions arise: does the integration of domain knowledge augment the accuracy of molecular property prediction and does employing multi-modal data fusion yield more precise results than unique data source methods? To explore these matters, we comprehensively review and quantitatively analyze recent deep learning methods based on various benchmarks. We discover that integrating molecular information significantly improves molecular property prediction (MPP) for both regression and classification tasks. Specifically, regression improvements, measured by reductions in root mean square error (RMSE), are up to 4.0%, while classification enhancements, measured by the area under the receiver operating characteristic curve (ROC-AUC), are up to 1.7%. We also discover that enriching 2D graphs with 1D SMILES boosts multi-modal learning performance for regression tasks by up to 9.1%, and augmenting 2D graphs with 3D information increases performance for classification tasks by up to 13.2%, with both enhancements measured using ROC-AUC. The two consolidated insights offer crucial guidance for future advancements in drug discovery.
Submitted 27 June, 2024; v1 submitted 11 February, 2024;
originally announced February 2024.
-
The Impact of Downgrading Protected Areas (PAD) on Biodiversity
Authors:
Yufei Li,
Lingling Hou,
Pengfei Liu
Abstract:
We quantitatively assess the impacts of Downgrading Protected Areas (PAD) on biodiversity in the U.S. Results show that PAD events significantly reduce biodiversity. Proximity to PAD events decreases biodiversity by 26.0% within 50 km compared with records of species further away from the PAD events. We observe an overall 32.3% decrease in abundance after the nearest PAD events are enacted. Abundance declines more in organisms living in contact with water and in non-mammals. Species abundance is more sensitive to the negative impacts in areas where PAD events were later reversed, as well as in areas close to protected areas belonging to the International Union for Conservation of Nature (IUCN) category. The PAD events enacted between 1903 and 2018 in the U.S. led to economic losses of approximately $689.95 million due to the decrease in abundance. Our results contribute to the understanding of the impact of environmental interventions such as PAD events on biodiversity change and provide important implications for biodiversity conservation policies.
Submitted 30 August, 2023;
originally announced August 2023.
-
GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text
Authors:
Pengfei Liu,
Yiming Ren,
Jun Tao,
Zhixiang Ren
Abstract:
Large language models have made significant strides in natural language processing, enabling innovative applications in molecular science by processing textual representations of molecules. However, most existing language models cannot capture the rich information with complex molecular structures or images. In this paper, we introduce GIT-Mol, a multi-modal large language model that integrates the Graph, Image, and Text information. To facilitate the integration of multi-modal molecular data, we propose GIT-Former, a novel architecture that is capable of aligning all modalities into a unified latent space. We achieve a 5%-10% accuracy increase in properties prediction and a 20.2% boost in molecule generation validity compared to the baselines. With the any-to-language molecular translation strategy, our model has the potential to perform more downstream tasks, such as compound name recognition and chemical reaction prediction.
Submitted 6 February, 2024; v1 submitted 13 August, 2023;
originally announced August 2023.
-
Emergent Bio-Functional Similarities in a Cortical-Spike-Train-Decoding Spiking Neural Network Facilitate Predictions of Neural Computation
Authors:
Tengjun Liu,
Yansong Chua,
Yiwei Zhang,
Yuxiao Ning,
Pengfu Liu,
Guihua Wan,
Zijun Wan,
Shaomin Zhang,
Weidong Chen
Abstract:
Despite their better bio-plausibility, goal-driven spiking neural networks (SNNs) have not achieved applicable performance for classifying biological spike trains, and have shown few bio-functional similarities compared to traditional artificial neural networks. In this study, we proposed the motorSRNN, a recurrent SNN topologically inspired by the neural motor circuit of primates. By employing the motorSRNN in decoding spike trains from the primary motor cortex of monkeys, we achieved a good balance between classification accuracy and energy consumption. The motorSRNN communicated with the input by capturing and cultivating more cosine-tuning, an essential property of neurons in the motor cortex, and maintained its stability during training. Such training-induced cultivation and persistency of cosine-tuning were also observed in our monkeys. Moreover, the motorSRNN produced additional bio-functional similarities at the single-neuron, population, and circuit levels, demonstrating biological authenticity. Furthermore, ablation studies on the motorSRNN suggested that long-term stable feedback synapses contribute to the training-induced cultivation in the motor cortex. Besides these novel findings and predictions, we offer a new framework for building authentic models of neural computation.
Submitted 14 March, 2023;
originally announced March 2023.
-
Community-developed checklists for publishing images and image analysis
Authors:
Christopher Schmied,
Michael Nelson,
Sergiy Avilov,
Gert-Jan Bakker,
Cristina Bertocchi,
Johanna Bischof,
Ulrike Boehm,
Jan Brocher,
Mariana Carvalho,
Catalin Chiritescu,
Jana Christopher,
Beth Cimini,
Eduardo Conde-Sousa,
Michael Ebner,
Rupert Ecker,
Kevin Eliceiri,
Julia Fernandez-Rodriguez,
Nathalie Gaudreault,
Laurent Gelman,
David Grunwald,
Tingting Gu,
Nadia Halidi,
Mathias Hammer,
Matthew Hartley,
Marie Held
, et al. (29 additional authors not shown)
Abstract:
Images document scientific discoveries and are prevalent in modern biomedical research. Microscopy imaging in particular is currently undergoing rapid technological advancements. However, for scientists wishing to publish the obtained images and image analysis results, there are to date no unified guidelines. Consequently, microscopy images and image data in publications may be unclear or difficult to interpret. Here we present community-developed checklists for preparing light microscopy images and image analysis for publications. These checklists offer authors, readers, and publishers key recommendations for image formatting and annotation, color selection, data availability, and for reporting image analysis workflows. The goal of our guidelines is to increase the clarity and reproducibility of image figures and thereby heighten the quality of microscopy data in publications.
Submitted 14 September, 2023; v1 submitted 14 February, 2023;
originally announced February 2023.
-
Heavy-tailed distributions of confirmed COVID-19 cases and deaths in spatiotemporal space
Authors:
Peng Liu,
Yanyan Zheng
Abstract:
This paper conducts a systematic statistical analysis of the characteristics of the geographical empirical distributions for the numbers of both cumulative and daily confirmed COVID-19 cases and deaths at county, city, and state levels over a time span from January 2020 to June 2022. The mathematical heavy-tailed distributions can be used for fitting the empirical distributions observed in different temporal stages and geographical scales. The estimations of the shape parameter of the tail distributions using the Generalized Pareto Distribution also support the observations of the heavy-tailed distributions. According to the characteristics of the heavy-tailed distributions, the evolution course of the geographical empirical distributions can be divided into three distinct phases, namely the power-law phase, the lognormal phase I, and the lognormal phase II. These three phases could serve as an indicator of the severity degree of the COVID-19 pandemic within an area. The empirical results suggest important intrinsic dynamics of a human infectious virus spread in the human interconnected physical complex network. The findings extend previous empirical studies and could provide more strict constraints for current mathematical and physical modeling studies, such as the SIR model and its variants based on the theory of complex networks.
Submitted 23 November, 2023; v1 submitted 14 August, 2022;
originally announced August 2022.
-
Temporal and spatial evolution of the distribution related to the number of COVID-19 pandemic
Authors:
Peng Liu,
Yanyan Zheng
Abstract:
This work systematically analyzes the numbers of both cumulative and daily confirmed COVID-19 cases and deaths from April 2020 to June 2022 for over 200 countries around the world. The analysis aims to reveal the temporal and spatial evolution of the country-level distributions observed in the COVID-19 pandemic, and yields the following results. (1) The distributions of the numbers of cumulative confirmed cases and deaths obey a power law in the early stages of COVID-19 and a stretched exponential function in the subsequent course. (2) The distributions of the numbers of daily confirmed cases and deaths obey a power law in the early and late stages of COVID-19 and a stretched exponential function in the middle stages. The crossover region between power-law and stretched exponential behaviour seems to depend on the evolution of the "infection" event and the "death" event. This observation implies an important symmetry related to the dynamics of COVID-19 spreading. (3) The distributions of the normalized numbers for each metric show a temporal scaling behaviour over the 2-year period, and are well described by a stretched exponential function. The observation of power-law and stretched exponential behaviour in such country-level distributions suggests underlying intrinsic dynamics of a virus spreading process in an interconnected human society. It is therefore important for understanding and mathematically modeling the COVID-19 pandemic.
Submitted 23 August, 2022; v1 submitted 8 April, 2022;
originally announced April 2022.
-
Comparing the topology of phylogenetic network generators
Authors:
Remie Janssen,
Pengyu Liu
Abstract:
Phylogenetic networks represent evolutionary history of species and can record natural reticulate evolutionary processes such as horizontal gene transfer and gene recombination. This makes phylogenetic networks a more comprehensive representation of evolutionary history compared to phylogenetic trees. Stochastic processes for generating random trees or networks are important tools in evolutionary analysis, especially in phylogeny reconstruction where they can be utilized for validation or serve as priors for Bayesian methods. However, as more network generators are developed, there is a lack of discussion or comparison for different generators. To bridge this gap, we compare a set of phylogenetic network generators by profiling topological summary statistics of the generated networks over the number of reticulations and comparing the topological profiles.
Submitted 12 June, 2021;
originally announced June 2021.
-
Drug cell line interaction prediction
Authors:
Pengfei Liu
Abstract:
Understanding the phenotypic drug response on cancer cell lines plays a vital role in anti-cancer drug discovery and re-purposing. The Genomics of Drug Sensitivity in Cancer (GDSC) database provides open data for researchers in phenotypic screening to test their models and methods. Previously, most research in these areas started from the fingerprints or features of drugs, instead of their structures. In this paper, we introduce a model for phenotypic screening, which is called twin Convolutional Neural Network for drugs in SMILES format (tCNNS). tCNNS comprises CNN input channels for drugs in SMILES format and for cancer cell lines, respectively. Our model achieves $0.84$ for the coefficient of determination ($R^2$) and $0.92$ for Pearson correlation ($R_p$), which are significantly better than previous works \cite{ammad2014integrative,haider2015copula,menden2013machine}. Besides these statistical metrics, tCNNS also provides some insights into phenotypic screening.
Submitted 28 December, 2018;
originally announced December 2018.
-
Entropy-Assisted Multi-Modal Emotion Recognition Framework Based on Physiological Signals
Authors:
Kuan Tung,
Po-Kang Liu,
Yu-Chuan Chuang,
Sheng-Hui Wang,
An-Yeu Wu
Abstract:
As a result of the growing importance of human-computer interface systems, understanding human emotional states has become an important ability for computers. This paper aims to improve the performance of emotion recognition by conducting a complexity analysis of physiological signals. Based on the AMIGOS dataset, we extracted several entropy-domain features: Refined Composite Multi-Scale Entropy (RCMSE) and Refined Composite Multi-Scale Permutation Entropy (RCMPE) from ECG and GSR signals, and Multivariate Multi-Scale Entropy (MMSE) and Multivariate Multi-Scale Permutation Entropy (MMPE) from EEG. The statistical results show that RCMSE in GSR has the dominant performance in arousal, while RCMPE in GSR is the best feature in valence. Furthermore, we selected an XGBoost model to predict emotion and obtained 68% accuracy in arousal and 84% in valence.
Submitted 22 September, 2018;
originally announced September 2018.
-
On the relative role of different age groups during epidemics associated with the respiratory syncytial virus
Authors:
Edward Goldstein,
Hieu H. Nguyen,
Patrick Liu,
Cecile Viboud,
Claudia A. Steiner,
Colin J. Worby,
Marc Lipsitch
Abstract:
Background: While RSV circulation results in high burden of hospitalization, particularly among infants, young children and the elderly, little is known about the role of different age groups in propagating annual RSV epidemics in the community.
Methods: During a communicable disease outbreak, some subpopulations may play a disproportionate role during the outbreak's ascent due to increased susceptibility and/or contact rates. Such subpopulations can be identified by considering the proportion that cases in a subpopulation represent among all cases in the population occurring before (Bp) and after the epidemic peak (Ap) to calculate the subpopulation's relative risk, RR=Bp/Ap. We estimated RR for several age groups using data on RSV hospitalizations in the US between 2001-2012 from the Healthcare Cost and Utilization Project (HCUP).
Results: Children aged 3-4y and 5-6y each had the highest RR estimate for 5/11 seasons in the data, with RSV hospitalization rates in infants being generally higher during seasons when children aged 5-6y had the highest RR estimates. Children aged 2y had the highest RR estimate during one season. RR estimates in infants and individuals aged 11y and older were mostly lower than in children aged 1-10y.
Conclusions: The RR estimates suggest that preschool and young school-age children have the leading relative roles during RSV epidemics. We hope that those results will aid in the design of RSV vaccination policies.
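The relative-risk statistic defined in the Methods, RR=Bp/Ap, can be illustrated with a minimal Python sketch. The function name, argument names, and the counts used below are hypothetical and chosen only for illustration; they are not taken from the paper or from the HCUP data.

```python
def relative_risk(group_before, total_before, group_after, total_after):
    """Return RR = Bp / Ap for one subpopulation.

    Bp: share of all pre-peak cases that fall in the subpopulation.
    Ap: share of all post-peak cases that fall in the subpopulation.
    All four arguments are case counts (hypothetical inputs).
    """
    if total_before <= 0 or total_after <= 0 or group_after <= 0:
        raise ValueError("counts must be positive for RR to be defined")
    bp = group_before / total_before  # proportion before the epidemic peak
    ap = group_after / total_after    # proportion after the epidemic peak
    return bp / ap


# Made-up example: the group accounts for 50 of 100 pre-peak cases
# and 25 of 100 post-peak cases.
rr = relative_risk(50, 100, 25, 100)
print(rr)
```

An RR above 1 means the group is over-represented among cases during the outbreak's ascent relative to its descent, which is the sense in which the abstract speaks of a leading relative role.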
Submitted 8 August, 2017;
originally announced August 2017.
-
Flexible Metal Oxide/Graphene Oxide Hybrid Neuromorphic Devices on Flexible Conducting Graphene Substrates
Authors:
Chang ** Wan,
Wei Wang,
Li Qiang Zhu,
Yang Hui Liu,
** Feng,
Zhao ** Liu,
Yi Shi,
Qing Wan
Abstract:
Flexible metal oxide/graphene oxide hybrid multi-gate neuron transistors were fabricated on flexible graphene substrates. Dendritic integrations in both spatial and temporal modes were successfully emulated, and spatiotemporal correlated logics were obtained. A proof-of-principle visual system model for emulating lobula giant motion detector neuron was investigated. Our results are of great interest for flexible neuromorphic cognitive systems.
Submitted 7 March, 2016;
originally announced September 2016.
-
Accurate, fully-automated NMR spectral profiling for metabolomics
Authors:
Siamak Ravanbakhsh,
Philip Liu,
Trent Bjorndahl,
Rupasri Mandal,
Jason R. Grant,
Michael Wilson,
Roman Eisner,
Igor Sinelnikov,
Xiaoyu Hu,
Claudio Luchinat,
Russell Greiner,
David S. Wishart
Abstract:
Many diseases cause significant changes to the concentrations of small molecules (aka metabolites) that appear in a person's biofluids, which means such diseases can often be readily detected from a person's "metabolic profile". This information can be extracted from a biofluid's NMR spectrum. Today, this is often done manually by trained human experts, which means this process is relatively slow, expensive and error-prone. This paper presents a tool, Bayesil, that can quickly, accurately and autonomously produce a complex biofluid's (e.g., serum or CSF) metabolic profile from a 1D 1H NMR spectrum. This requires first performing several spectral processing steps and then matching the resulting spectrum against a reference compound library, which contains the "signatures" of each relevant metabolite. Many of these steps are novel algorithms, and our matching step views spectral matching as an inference problem within a probabilistic graphical model that rapidly approximates the most probable metabolic profile. Our extensive studies on a diverse set of complex mixtures show that Bayesil can autonomously find the concentration of all NMR-detectable metabolites accurately (~90% correct identification and ~10% quantification error), in <5 minutes on a single CPU. These results demonstrate that Bayesil is the first fully-automatic publicly-accessible system that provides quantitative NMR spectral profiling effectively, with an accuracy that meets or exceeds the performance of trained experts. We anticipate this tool will usher in high-throughput metabolomics and enable a wealth of new applications of NMR in clinical settings. Available at http://www.bayesil.ca.
Submitted 7 September, 2014; v1 submitted 4 September, 2014;
originally announced September 2014.
-
Neural Mechanism of Language
Authors:
Peilei Liu,
Ting Wang
Abstract:
This paper is based on our previous work on neural coding, a self-organized model supported by existing evidence. We first briefly introduce this model, and then use it to explain the neural mechanism of language and reasoning. Moreover, we find that the position of an area determines its importance. Specifically, language-relevant areas are in the capital position of the cortical kingdom, and are therefore closely related to autonomous consciousness and working memory. In essence, language is a miniature of the real world. In short, this paper aims to bridge the gap between the molecular mechanisms of neurons and advanced functions such as language and reasoning.
Submitted 22 August, 2014;
originally announced August 2014.
-
Motor Learning Mechanism on the Neuron Scale
Authors:
Peilei Liu,
Ting Wang
Abstract:
Based on existing data, we put forward a biological model of the motor system on the neuron scale, and then indicate its implications for statistics and learning. Specifically, neuron firing frequency and synaptic strength are, in essence, probability estimates, and lateral inhibition also has statistical implications. From the standpoint of learning, dendritic competition through retrograde messengers is the foundation of the conditional reflex and of grandmother cell coding, which are the kernel mechanisms of motor learning and sensorimotor integration respectively. Finally, we compare the motor system with the sensory system. In short, we would like to bridge the gap between molecular evidence and computational models.
Submitted 18 July, 2014;
originally announced July 2014.
-
A Quantitative Neural Coding Model of Sensory Memory
Authors:
Peilei Liu,
Ting Wang
Abstract:
The coding mechanism of sensory memory on the neuron scale is one of the most important questions in neuroscience. We have put forward a quantitative neural network model that is self-organized, self-similar, and self-adaptive, just like an ecosystem following Darwinian theory. According to this model, neural coding is a many-to-one mapping from objects to neurons, and the whole cerebrum is a real-time statistical Turing machine with powerful representation and learning abilities. This model can reconcile some important disputes, such as temporal coding versus rate-based coding, grandmother cell versus population coding, and decay theory versus interference theory. It also provides explanations for some key questions such as memory consolidation, episodic memory, consciousness, and sentiment. Finally, its philosophical significance is indicated.
Submitted 25 June, 2014;
originally announced June 2014.
-
A Unified Quantitative Model of Vision and Audition
Authors:
Peilei Liu,
Ting Wang
Abstract:
We have put forward a unified quantitative framework of vision and audition, based on existing data and theories. According to this model, the retina is a feedforward network that is self-adaptive to inputs during a specific period. Once fully grown, cells become specialized detectors based on the statistics of their stimulus history. This model provides explanations for the perception mechanisms of colour, shape, depth and motion. Moreover, on this basis we put forward a bold conjecture that a single ear can detect sound direction. This is complementary to existing theories and provides better explanations for sound localization.
Submitted 23 June, 2014;
originally announced June 2014.
-
In Vivo Renal Clearance, Biodistribution, Toxicity of Gold nanoclusters
Authors:
Xiao-Dong Zhang,
Di Wu,
Xiu Shen,
Pei-Xun Liu,
Fei-Yue Fan,
Sai-Jun Fan
Abstract:
Gold nanoparticles have shown great promise in cancer diagnosis and therapy, but they cannot be metabolized and tend to accumulate in the liver and spleen due to their large size. Gold nanoclusters, with their small size, can penetrate kidney tissue and promise to decrease in vivo toxicity through renal clearance. In this work, we explore the in vivo renal clearance, biodistribution, and toxicity responses of BSA- and GSH-protected gold nanoclusters at 24 hours and 28 days. The BSA-protected gold nanoclusters have low-efficiency renal clearance, with only 1% of the gold cleared, whereas the GSH-protected gold nanoclusters have high-efficiency renal clearance, with 36% of the gold cleared after 24 hours. The biodistribution further reveals that 94% of the gold can be metabolized for the GSH-protected nanoclusters, but less than 5% of the gold can be metabolized for the BSA-protected nanoclusters after 28 days. Both the GSH- and BSA-protected gold nanoclusters cause acute infection, inflammation, and kidney function damage after 24 hours, but these toxicity responses are eliminated after 28 days for the GSH-protected gold nanoclusters. The immune system is also affected by both kinds of gold nanoclusters, but the immune response to the GSH-protected gold nanoclusters also recovers after 28 days. These findings show that the GSH-protected gold nanoclusters are small and can be metabolized by renal clearance, so their toxicity can be significantly decreased. The BSA-protected gold nanoclusters, however, can form large compounds and further accumulate in the liver and spleen, which can cause an irreparable toxicity response. Therefore, the GSH-protected gold nanoclusters have great potential for in vivo imaging and therapy, and the BSA-protected gold nanoclusters can be used as an agent for liver cancer therapy.
Submitted 27 September, 2012;
originally announced October 2012.