-
Fish Tracking, Counting, and Behaviour Analysis in Digital Aquaculture: A Comprehensive Review
Authors:
Meng Cui,
Xubo Liu,
Haohe Liu,
**zheng Zhao,
Daoliang Li,
Wenwu Wang
Abstract:
Digital aquaculture leverages advanced technologies and data-driven methods, providing substantial benefits over traditional aquaculture practices. Fish tracking, counting, and behaviour analysis are crucial components of digital aquaculture, which are essential for optimizing production efficiency, enhancing fish welfare, and improving resource management. Previous reviews have focused on single modalities, limiting their ability to address the diverse challenges encountered in these tasks comprehensively. This review provides a comprehensive analysis of the current state of aquaculture digital technologies, including vision-based, acoustic-based, and biosensor-based methods. We examine the advantages, limitations, and applications of these methods, highlighting recent advancements and identifying critical research gaps. The scarcity of comprehensive fish datasets and the lack of unified evaluation standards, which make it difficult to compare the performance of different technologies, are identified as major obstacles hindering progress in this field. To overcome current limitations and improve the accuracy, robustness, and efficiency of fish monitoring systems, we explore the potential of emerging technologies such as multimodal data fusion and deep learning. Additionally, we contribute to the field by providing a summary of existing datasets available for fish tracking, counting, and behaviour analysis. Future research directions are outlined, emphasizing the need for comprehensive datasets and evaluation standards to facilitate meaningful comparisons between technologies and promote their practical implementation in real-world aquaculture settings.
Submitted 20 June, 2024;
originally announced June 2024.
-
STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics
Authors:
Jiawen Chen,
Muqing Zhou,
Wenrong Wu,
**wei Zhang,
Yun Li,
Didong Li
Abstract:
Recent advances in multi-modal algorithms have driven and been driven by the increasing availability of large image-text datasets, leading to significant strides in various fields, including computational pathology. However, in most existing medical image-text datasets, the text typically provides high-level summaries that may not sufficiently describe sub-tile regions within a large pathology image. For example, an image might cover an extensive tissue area containing cancerous and healthy regions, but the accompanying text might only specify that this image is a cancer slide, lacking the nuanced details needed for in-depth analysis. In this study, we introduce STimage-1K4M, a novel dataset designed to bridge this gap by providing genomic features for sub-tile images. STimage-1K4M contains 1,149 images derived from spatial transcriptomics data, which captures gene expression information at the level of individual spatial spots within a pathology image. Specifically, each image in the dataset is broken down into smaller sub-image tiles, with each tile paired with 15,000-30,000 dimensional gene expressions. With 4,293,195 pairs of sub-tile images and gene expressions, STimage-1K4M offers unprecedented granularity, paving the way for a wide range of advanced research in multi-modal data analysis and innovative applications in computational pathology and beyond.
Submitted 20 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion
Authors:
Dongyang Li,
Chen Wei,
Shiying Li,
Jiachen Zou,
Quanying Liu
Abstract:
How to decode human vision through neural signals has attracted a long-standing interest in neuroscience and machine learning. Modern contrastive learning and generative models have improved the performance of fMRI-based visual decoding and reconstruction. However, the high cost and low temporal resolution of fMRI limit its applications in brain-computer interfaces (BCIs), creating a strong need for EEG-based visual reconstruction. In this study, we present an EEG-based visual reconstruction framework. It consists of a plug-and-play EEG encoder called the Adaptive Thinking Mapper (ATM), which is aligned with image embeddings, and a two-stage EEG guidance image generator that first transforms EEG features into image priors and then reconstructs the visual stimuli with a pre-trained image generator. Our approach allows EEG embeddings to achieve superior performance in image classification and retrieval tasks. Our two-stage image generation strategy vividly reconstructs images seen by humans. Furthermore, we analyzed the impact of signals from different time windows and brain regions on decoding and reconstruction. The versatility of our framework is demonstrated in the magnetoencephalogram (MEG) data modality. We report that EEG-based visual decoding achieves SOTA performance, highlighting the portability, low cost, and high temporal resolution of EEG, enabling a wide range of BCI applications. The code of ATM is available at https://github.com/dongyangli-del/EEG_Image_decode.
Submitted 4 April, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model
Authors:
Yuqi Chen,
Kan Ren,
Kaitao Song,
Yansen Wang,
Yifan Wang,
Dongsheng Li,
Lili Qiu
Abstract:
Self-supervised learning has emerged as a highly effective approach in the fields of natural language processing and computer vision. It is also applicable to brain signals such as electroencephalography (EEG) data, given the abundance of available unlabeled data that exist in a wide spectrum of real-world medical applications ranging from seizure detection to wave analysis. The existing works leveraging self-supervised learning on EEG modeling mainly focus on pretraining upon each individual dataset corresponding to a single downstream task, which cannot leverage the power of abundant data, and they may derive sub-optimal solutions with a lack of generalization. Moreover, these methods rely on end-to-end model learning, which is not easy for humans to understand. In this paper, we present a novel EEG foundation model, namely EEGFormer, pretrained on large-scale compound EEG data. The pretrained model can not only learn universal representations on EEG signals with adaptable performance on various downstream tasks but also provide interpretable outcomes of the useful patterns within the data. To validate the effectiveness of our model, we extensively evaluate it on various downstream tasks and assess the performance under different transfer settings. Furthermore, we demonstrate how the learned model exhibits transferable anomaly detection performance and provides valuable interpretability of the acquired patterns via self-supervised learning.
Submitted 11 January, 2024;
originally announced January 2024.
-
Seeing through the Brain: Image Reconstruction of Visual Perception from Human Brain Signals
Authors:
Yu-Ting Lan,
Kan Ren,
Yansen Wang,
Wei-Long Zheng,
Dongsheng Li,
Bao-Liang Lu,
Lili Qiu
Abstract:
Seeing is believing; however, the underlying mechanism of how human visual perceptions are intertwined with our cognitions is still a mystery. Thanks to the recent advances in both neuroscience and artificial intelligence, we have been able to record the visually evoked brain activities and mimic the visual perception ability through computational approaches. In this paper, we focus on visual stimuli reconstruction by reconstructing the observed images based on portably accessible brain signals, i.e., electroencephalography (EEG) data. Since EEG signals are dynamic, in time-series format, and notoriously noisy, processing them and extracting useful information requires more dedicated effort. We propose a comprehensive pipeline, named NeuroImagen, for reconstructing visual stimuli images from EEG signals. Specifically, we incorporate a novel multi-level perceptual information decoding to draw multi-grained outputs from the given EEG data. A latent diffusion model then leverages the extracted information to reconstruct the high-resolution visual stimuli images. The experimental results illustrate the effectiveness of image reconstruction and the superior quantitative performance of our proposed method.
Submitted 16 August, 2023; v1 submitted 27 July, 2023;
originally announced August 2023.
-
Delete: Deep Lead Optimization Enveloped in Protein Pocket through Unified Deleting Strategies and a Structure-aware Network
Authors:
Haotian Zhang,
Huifeng Zhao,
Xujun Zhang,
Qun Su,
Hongyan Du,
Chao Shen,
Zhe Wang,
Dan Li,
Peichen Pan,
Guangyong Chen,
Yu Kang,
Chang-yu Hsieh,
Tingjun Hou
Abstract:
Drug discovery is a highly complicated process, and it is unfeasible to fully commit it to the recently developed molecular generation methods. Deep learning-based lead optimization takes expert knowledge as a starting point, learning from numerous historical cases about how to modify the structure for better drug-forming properties. However, compared with the more established de novo generation schemes, lead optimization is still an area that requires further exploration. Previously developed models are often limited to resolving one (or few) certain subtask(s) of lead optimization, and most of them can only generate the two-dimensional structures of molecules while disregarding the vital protein-ligand interactions based on the three-dimensional binding poses. To address these challenges, we present a novel tool for lead optimization, named Delete (Deep lead optimization enveloped in protein pocket). Our model can handle all subtasks of lead optimization involving fragment growing, linking, and replacement through a unified deleting (masking) strategy, and is aware of the intricate pocket-ligand interactions through the geometric design of networks. Statistical evaluations and case studies conducted on individual subtasks demonstrate that Delete has a significant ability to produce molecules with superior binding affinities to protein targets and reasonable drug-likeness from given fragments or atoms. This feature may assist medicinal chemists in developing not only me-too/me-better products from existing drugs but also hit-to-lead for first-in-class drugs in a highly efficient manner.
Submitted 4 August, 2023;
originally announced August 2023.
-
Biological Factor Regulatory Neural Network
Authors:
Xinnan Dai,
Caihua Shan,
Jie Zheng,
Xiaoxiao Li,
Dongsheng Li
Abstract:
Genes are fundamental for analyzing biological systems and many recent works proposed to utilize gene expression for various biological tasks by deep learning models. Despite their promising performance, it is hard for deep neural networks to provide biological insights for humans due to their black-box nature. Recently, some works integrated biological knowledge with neural networks to improve the transparency and performance of their models. However, these methods can only incorporate partial biological knowledge, leading to suboptimal performance. In this paper, we propose the Biological Factor Regulatory Neural Network (BFReg-NN), a generic framework to model relations among biological factors in cell systems. BFReg-NN starts from gene expression data and is capable of merging most existing biological knowledge into the model, including the regulatory relations among genes or proteins (e.g., gene regulatory networks (GRN), protein-protein interaction networks (PPI)) and the hierarchical relations among genes, proteins and pathways (e.g., several genes/proteins are contained in a pathway). Moreover, BFReg-NN also has the ability to provide new biologically meaningful insights because of its white-box characteristics. Experimental results on different gene expression-based tasks verify the superiority of BFReg-NN compared with baselines. Our case studies also show that the key insights found by BFReg-NN are consistent with the biological literature.
Submitted 11 April, 2023;
originally announced April 2023.
-
Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection
Authors:
**chao Li,
Kaitao Song,
Junan Li,
Bo Zheng,
Dongsheng Li,
Xixin Wu,
Xunying Liu,
Helen Meng
Abstract:
With the global population aging rapidly, Alzheimer's disease (AD) is particularly prominent in older adults, which has an insidious onset and leads to a gradual, irreversible deterioration in cognitive domains (memory, communication, etc.). Speech-based AD detection opens up the possibility of widespread screening and timely disease intervention. Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations. This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features. Based on these features, the paper also proposes a novel task-oriented approach by modeling the relationship between the participants' description and the cognitive task. Experiments are carried out on the ADReSS dataset in a binary classification setup, and models are evaluated on the unseen test set. Results and comparison with recent literature demonstrate the efficiency and superior performance of proposed acoustic, linguistic and task-oriented methods. The findings also show the importance of semantic and syntactic information, and feasibility of automation and generalization with the promising audio-only and task-oriented methods for the AD detection task.
Submitted 14 March, 2023;
originally announced March 2023.
-
Mapping effective connectivity by virtually perturbing a surrogate brain
Authors:
Zixiang Luo,
Kaining Peng,
Zhichao Liang,
Shengyuan Cai,
Chenyu Xu,
Dan Li,
Yu Hu,
Changsong Zhou,
Quanying Liu
Abstract:
Effective connectivity (EC), indicative of the causal interactions between brain regions, is fundamental to understanding information processing in the brain. Traditional approaches, which infer EC from neural responses to stimulations, are not suited for mapping whole-brain EC in humans because they are invasive and the stimulations have limited spatial coverage. To address this gap, we present Neural Perturbational Inference (NPI), a data-driven framework designed to map EC across the entire brain. NPI employs an artificial neural network trained to learn large-scale neural dynamics as a computational surrogate of the brain. NPI maps EC by perturbing each region of the surrogate brain and observing the resulting responses in the remaining regions. NPI captures the directionality, strength, and excitatory/inhibitory properties of EC on a brain-wide scale. Our validation of NPI, using models with established EC, shows its superiority over Granger Causality and Dynamic Causal Modeling. Applying NPI to resting-state fMRI data from diverse datasets reveals consistent and structurally supported EC. Applications on a disease-specific dataset highlight the potential of using personalized EC as biomarkers for neurological diseases. By transitioning from correlational to causal understandings of brain functionality, NPI marks a stride in decoding the brain's functional architecture and can facilitate neuroscience research and clinical applications.
Submitted 14 March, 2024; v1 submitted 31 December, 2022;
originally announced January 2023.
-
STSC-SNN: Spatio-Temporal Synaptic Connection with Temporal Convolution and Attention for Spiking Neural Networks
Authors:
Chengting Yu,
Zheming Gu,
Da Li,
Gaoang Wang,
Aili Wang,
Er** Li
Abstract:
Spiking Neural Networks (SNNs), as one of the algorithmic models in neuromorphic computing, have gained a great deal of research attention owing to their temporal information processing capability, low power consumption, and high biological plausibility. Their potential to efficiently extract spatio-temporal features makes them suitable for processing event streams. However, existing synaptic structures in SNNs are almost all full connections or spatial 2D convolutions, neither of which can extract temporal dependencies adequately. In this work, we take inspiration from biological synapses and propose a spatio-temporal synaptic connection SNN (STSC-SNN) model to enhance the spatio-temporal receptive fields of synaptic connections, thereby establishing temporal dependencies across layers. Concretely, we incorporate temporal convolution and attention mechanisms to implement synaptic filtering and gating functions. We show that endowing synaptic models with temporal dependencies can improve the performance of SNNs on classification tasks. In addition, we investigate the impact on performance via varied spatial-temporal receptive fields and reevaluate the temporal modules in SNNs. Our approach is tested on neuromorphic datasets, including DVS128 Gesture (gesture recognition), N-MNIST, CIFAR10-DVS (image classification), and SHD (speech digit recognition). The results show that the proposed model outperforms the state-of-the-art accuracy on nearly all datasets.
Submitted 11 October, 2022;
originally announced October 2022.
-
A large dataset of software mentions in the biomedical literature
Authors:
Ana-Maria Istrate,
Donghui Li,
Dario Taraborelli,
Michaela Torkar,
Boris Veytsman,
Ivana Williams
Abstract:
We describe the CZ Software Mentions dataset, a new dataset of software mentions in biomedical papers. Plain-text software mentions are extracted with a trained SciBERT model from several sources: the NIH PubMed Central collection and from papers provided by various publishers to the Chan Zuckerberg Initiative. The dataset provides sources, context and metadata, and, for a number of mentions, the disambiguated software entities and links. We extract 1.12 million unique string software mentions from 2.4 million papers in the NIH PMC-OA Commercial subset, 481k unique mentions from the NIH PMC-OA Non-Commercial subset (both gathered in October 2021) and 934k unique mentions from 3 million papers in the Publishers' collection. There is variation in how software is mentioned in papers and extracted by the NER algorithm. We propose a clustering-based disambiguation algorithm to map plain-text software mentions into distinct software entities and apply it on the NIH PubMed Central Commercial collection. Through this methodology, we disambiguate 1.12 million unique strings extracted by the NER model into 97600 unique software entities, covering 78% of all software-paper links. We link 185000 of the mentions to a repository, covering about 55% of all software-paper links. We describe in detail the process of building the datasets, disambiguating and linking the software mentions, as well as opportunities and challenges that come with a dataset of this size. We make all data and code publicly available as a new resource to help assess the impact of software (in particular scientific open source projects) on science.
Submitted 27 September, 2022; v1 submitted 1 September, 2022;
originally announced September 2022.
-
Pasture Intake Protects Against Commercial Diet-induced Lipopolysaccharide Production Facilitated by Gut Microbiota through Activating Intestinal Alkaline Phosphatase Enzyme in Meat Geese
Authors:
Qasim Ali,
Sen Ma,
Umar Farooq,
Jiakuan Niu,
Fen Li,
Muhammad Abaidullah,
Boshuai Liu,
Shaokai La,
Defeng Li,
Zhichang Wang,
Hao Sun,
Yalei Cui,
Yinghua Shi
Abstract:
The in-house feeding system (IHF, a low dietary fiber source) may cause altered cecal microbiota composition and inflammatory responses in meat geese via increased endotoxemia (lipopolysaccharides, LPS) with reduced intestinal alkaline phosphatase (ALP) production. The effects of the artificial pasture grazing system (AGF, a high dietary fiber source) on modulating gut microbiota architecture and gut barrier functions have not been investigated in meat geese. Intestinal ALP, which functions to regulate gut microbial homeostasis and barrier function, appears to inhibit pro-inflammatory cytokines by reducing LPS-induced reactive oxygen species (ROS) production. The purpose of our study was to investigate whether this enzyme could play a critical role in attenuating ROS generation and, in turn, ROS-facilitated NF-κB pathway-induced systemic inflammation in meat geese. First, we assessed the impacts of IHF and AGF on gut microbial composition via 16S rRNA sequencing in meat geese. In the gut microbiota analysis, meat geese supplemented with pasture demonstrated a significant reduction in microbial richness and diversity compared to IHF meat geese, demonstrating the antimicrobial, antioxidation, and anti-inflammatory ability of the AGF system. Second, host markers were evaluated through protein expression analysis of serum and cecal tissues and quantitative PCR of cecal tissues. We confirmed a significant increase in the intestinal ALP-induced Nrf2 signaling pathway, representing LPS dephosphorylation-mediated, TLR4/MyD88-induced ROS reduction mechanisms in AGF meat geese. Further, the correlation analysis of the top 44 host markers with gut microbiota shows that artificial pasture intake induced gut barrier functions via reducing ROS-mediated NF-κB pathway-induced gut permeability, systemic inflammation, and aging phenotypes.
Submitted 29 August, 2022;
originally announced August 2022.
-
Subtype-Former: a deep learning approach for cancer subtype discovery with multi-omics data
Authors:
Hai Yang,
Yuhang Sheng,
Yi Jiang,
Xiaoyang Fang,
Dongdong Li,
**g Zhang,
Zhe Wang
Abstract:
Motivation: Cancer is heterogeneous, affecting the precise approach to personalized treatment. Accurate subtyping can lead to better survival rates for cancer patients. High-throughput technologies provide multiple omics data for cancer subtyping. However, precise cancer subtyping remains challenging due to the large amount and high dimensionality of omics data. Results: This study proposed Subtype-Former, a deep learning method based on MLP and Transformer Block, to extract the low-dimensional representation of the multi-omics data. K-means and Consensus Clustering are also used to achieve accurate subtyping results. We compared Subtype-Former with the other state-of-the-art subtyping methods across the TCGA 10 cancer types. We found that Subtype-Former can perform better on the benchmark datasets of more than 5000 tumors based on the survival analysis. In addition, Subtype-Former also achieved outstanding results in pan-cancer subtyping, which can help analyze the commonalities and differences across various cancer types at the molecular level. Finally, we applied Subtype-Former to the TCGA 10 types of cancers. We identified 50 essential biomarkers, which can be used to study targeted cancer drugs and promote the development of cancer treatments in the era of precision medicine.
Submitted 28 July, 2022;
originally announced July 2022.
-
Neural operator learning of heterogeneous mechanobiological insults contributing to aortic aneurysms
Authors:
Somdatta Goswami,
David S. Li,
Bruno V. Rego,
Marcos Latorre,
Jay D. Humphrey,
George Em Karniadakis
Abstract:
Thoracic aortic aneurysm (TAA) is a localized dilatation of the aorta resulting from compromised wall composition, structure, and function, which can lead to life-threatening dissection or rupture. Several genetic mutations and predisposing factors that contribute to TAA have been studied in mouse models to characterize specific changes in aortic microstructure and material properties that result from a wide range of mechanobiological insults. Assessment of TAA progression in vivo is largely limited to measurements of aneurysm size and growth rate. It has been shown that aortic geometry alone is not sufficient to predict the patient-specific progression of TAA, but computational modeling of the evolving biomechanics of the aorta could predict future geometry and properties from initiating insults. In this work, we present an integrated framework to train a deep operator network (DeepONet)-based surrogate model to identify contributing factors for TAA by using FE-based datasets of aortic growth and remodeling resulting from prescribed insults. For training data, we investigate multiple types of TAA risk factors and spatial distributions within a constrained mixture model to generate axial-azimuthal maps of aortic dilatation and distensibility. The trained network is then capable of predicting the initial distribution and extent of the insult from a given set of dilatation and distensibility information. Two DeepONet frameworks are proposed, one trained on sparse information and one on full-field grayscale images, to gain insight into a preferred neural operator-based approach. Performance of the surrogate models is evaluated through multiple simulations carried out on insult distributions varying from fusiform to complex. We show that the proposed approach can predict the patient-specific mechanobiological insult profile with high accuracy, particularly when based on full-field images.
Submitted 8 May, 2022;
originally announced May 2022.
-
Modeling COVID-19 vaccine-induced immunological memory development and its links to antibody level and infectiousness
Authors:
Xin Gao,
Jianwei Li,
Dianjie Li
Abstract:
COVID-19 vaccines have proven to be effective against SARS-CoV-2 infection. However, the dynamics of vaccine-induced immunological memory development and neutralizing antibody generation are not fully understood, limiting vaccine development and vaccination regimen determination. Herein, we constructed a mathematical model to characterize the vaccine-induced immune response based on fitting the viral infection and vaccination datasets. With the example of CoronaVac, we revealed the association between vaccine-induced immunological memory development and neutralizing antibody levels. The establishment of intact immunological memory requires more than 6 months after the first and second doses, after which a booster shot can induce high levels of neutralizing antibodies. By introducing the maximum viral load and recovery time after viral infection, we quantitatively studied the protective effect of vaccines against viral infection. Accordingly, we optimized the vaccination regimen, including dose and vaccination timing, and predicted the effect of the fourth dose. Last, by combining the viral transmission model, we showed the suppression of virus transmission by vaccination, which may be instructive for the development of public health policies.
Submitted 5 April, 2022;
originally announced April 2022.
-
Network resilience in the aging brain
Authors:
Tao Liu,
Shu Guo,
Hao Liu,
Rui Kang,
Mingyang Bai,
Jiyang Jiang,
Wei Wen,
Xing Pan,
Jun Tai,
Jianxin Li,
Jian Cheng,
Jing Jing,
Zhenzhou Wu,
Haijun Niu,
Haogang Zhu,
Zixiao Li,
Yongjun Wang,
Henry Brodaty,
Perminder Sachdev,
Daqing Li
Abstract:
Degeneration and adaptation are two competing sides of the same coin called resilience in the progressive processes of brain aging or diseases. Degeneration accumulates during brain aging and other cerebral activities, causing structural atrophy and dysfunction. At the same time, adaptation allows the brain network to reorganize to compensate for structural loss and maintain cognitive function. Although the hidden resilience mechanism is critical and fundamental to uncovering the laws of brain aging, it remains essentially unknown how these two processes interact dynamically across brain networks, owing to the lack of datasets and appropriate methodology. To quantitatively investigate this complex process, we analyze aging brains based on a 6-year follow-up multimodal neuroimaging database from 63 persons. We reveal the critical mechanism of network resilience: various perturbations may cause fast brain structural atrophy, and the brain can then reorganize its functional layout to lower its operational efficiency, which helps to slow down the structural atrophy and finally recover its functional efficiency equilibrium. This empirical finding could be explained by our theoretical model, suggesting one universal resilience dynamical function. This resilience is achieved in the brain functional network with evolving percolation and rich-club features. Our findings can help to understand the brain aging process and design possible mitigation methods to adjust the interaction between degeneration and adaptation from a resilience viewpoint.
Submitted 3 February, 2022;
originally announced February 2022.
-
PDBL: Improving Histopathological Tissue Classification with Plug-and-Play Pyramidal Deep-Broad Learning
Authors:
Jiatai Lin,
Guoqiang Han,
Xipeng Pan,
Hao Chen,
Danyi Li,
Xiping Jia,
Zhenwei Shi,
Zhizhen Wang,
Yanfen Cui,
Haiming Li,
Changhong Liang,
Li Liang,
Zaiyi Liu,
Chu Han
Abstract:
Histopathological tissue classification is a fundamental task in pathomics cancer research. Precisely differentiating tissue types benefits downstream research such as cancer diagnosis and prognosis. Existing works mostly leverage the popular classification backbones in computer vision to achieve histopathological tissue classification. In this paper, we propose a super lightweight plug-and-play module, named Pyramidal Deep-Broad Learning (PDBL), for any well-trained classification backbone to further improve the classification performance without a re-training burden. We mimic how pathologists observe pathology slides at different magnifications and construct an image pyramid for the input image in order to obtain pyramidal contextual information. For each level in the pyramid, we extract multi-scale deep-broad features with our proposed Deep-Broad block (DB-block). We equipped three popular classification backbones, ShuffleNetV2, EfficientNet-B0, and ResNet50, with PDBL to evaluate the effectiveness and efficiency of our proposed module on two datasets (the Kather Multiclass Dataset and the LC25000 Dataset). Experimental results demonstrate that the proposed PDBL can steadily improve the tissue-level classification performance of any CNN backbone, especially lightweight models given a small amount of training samples (less than 10%), which greatly saves computational time and annotation effort.
Submitted 4 November, 2021;
originally announced November 2021.
-
On a Class of Nonlocal SIR Models
Authors:
Li Guan,
Dong Li,
Ke Wang,
Kun Zhao
Abstract:
We revisit the classic Susceptible-Infected-Recovered (SIR) epidemic model and one of its nonlocal variations recently developed in \cite{Guan}. We introduce several new approaches to derive exact analytical solutions in the classical situation and analyze the corresponding effective approximations in the nonlocal setting. An interesting new feature of the nonlocal models, compared with the classic SIR model, is the appearance of multiple peak solutions for the infected population. We provide several rigorous results on the existence and non-existence of peak solutions with sharp asymptotics.
Submitted 23 June, 2021;
originally announced June 2021.
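For readers unfamiliar with the classic SIR model that the abstract above revisits, the sketch below integrates the standard equations dS/dt = -βSI, dI/dt = βSI - γI, dR/dt = γI with a forward-Euler step. It only illustrates the classical model; it is not the authors' analytical solution or their nonlocal variant. The function names, parameter values, and step size are assumptions chosen for the example.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt=0.01, steps=10000):
    """Forward-Euler integration of the classic SIR equations.

    S, I, R are population fractions; beta is the transmission rate and
    gamma the recovery rate. Returns a list of (t, S, I, R) tuples.
    """
    s, i, r = s0, i0, r0
    history = [(0.0, s, i, r)]
    for k in range(1, steps + 1):
        new_infections = beta * s * i * dt   # flow from S to I in this step
        new_recoveries = gamma * i * dt      # flow from I to R in this step
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((k * dt, s, i, r))
    return history


def count_peaks(values):
    """Count strict local maxima in a sequence of numbers."""
    return sum(
        1
        for prev, cur, nxt in zip(values, values[1:], values[2:])
        if cur > prev and cur > nxt
    )


if __name__ == "__main__":
    # Illustrative parameters only (assumed, not taken from the paper).
    trajectory = simulate_sir(beta=0.5, gamma=0.1, s0=0.99, i0=0.01, r0=0.0)
    infected = [row[2] for row in trajectory]
    print("peaks in I(t):", count_peaks(infected))
    print("final susceptible fraction:", round(trajectory[-1][1], 4))
```

In the classical setting the infected curve rises and falls at most once, which is the baseline against which the multiple-peak solutions of the nonlocal models are contrasted.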
-
All-Fibre Label-Free Nano-Sensor for Real-Time in situ Early Monitoring of Cellular Apoptosis
Authors:
Danran Li,
Nina Wang,
Tianyang Zhang,
Guangxing Wu,
Yifeng Xiong,
Qianqian Du,
Yunfei Tian,
Wei-wei Zhao,
Jiandong Ye,
Shulin Gu,
Yanqing Lu,
Dechen Jiang,
Fei Xu
Abstract:
The achievement of all-fibre functional nano-modules for subcellular label-free measurement has long been pursued due to the limitations of manufacturing techniques. In this paper, a compact all-fibre label-free nano-sensor composed of a fibre taper and zinc oxide nano-gratings is designed and applied for the early monitoring of apoptosis in single living cells. Because of its nanoscale dimensions, mechanical flexibility and minimal cytotoxicity to cells, the sensing module can be loaded in cells for long-term in situ tracking with high sensitivity. A gradual increase in the nuclear refractive index during the apoptosis process is observed, revealing the increase in molecular density and the decrease in cell volume. The strategy used in this study not only contributes to the understanding of internal environmental variations during cellular apoptosis but also provides a new platform for non-fluorescent all-fibre devices to investigate cellular events and to promote new progress in fundamental cell biochemical engineering.
Submitted 29 May, 2021;
originally announced May 2021.
-
Contrastive latent variable modeling with application to case-control sequencing experiments
Authors:
Andrew Jones,
F. William Townes,
Didong Li,
Barbara E. Engelhardt
Abstract:
High-throughput RNA-sequencing (RNA-seq) technologies are powerful tools for understanding cellular state. Often it is of interest to quantify and summarize changes in cell state that occur between experimental or biological conditions. Differential expression is typically assessed using univariate tests to measure gene-wise shifts in expression. However, these methods largely ignore changes in transcriptional correlation. Furthermore, there is a need to identify the low-dimensional structure of the gene expression shift to identify collections of genes that change between conditions. Here, we propose contrastive latent variable models designed for count data to create a richer portrait of differential expression in sequencing data. These models disentangle the sources of transcriptional variation in different conditions, in the context of an explicit model of variation at baseline. Moreover, we develop a model-based hypothesis testing framework that can test for global and gene subset-specific changes in expression. We test our model through extensive simulations and analyses with count-based gene expression data from perturbation and observational sequencing experiments. We find that our methods can effectively summarize and quantify complex transcriptional changes in case-control experimental sequencing data.
Submitted 12 February, 2021;
originally announced February 2021.
-
Model-based cellular kinetic analysis of SARS-CoV-2 infection: different immune response modes and treatment strategies
Authors:
Zhengqing Zhou,
Zhiheng Zhao,
Shuyu Shi,
Jianghua Wu,
Dianjie Li,
Jianwei Li,
Jingpeng Zhang,
Ke Gui,
Yu Zhang,
Heng Mei,
Yu Hu,
Qi Ouyang,
Fangting Li
Abstract:
The increasing number of global COVID-19 cases demands a mathematical model to analyze the interaction between the virus dynamics and the response of innate and adaptive immunity. Here, based on the assumption of a weak and delayed response of the innate and adaptive immunity in SARS-CoV-2 infection, we constructed a mathematical model to describe the dynamic processes of the immune system. Integrating theoretical results with clinical COVID-19 patients' data, we classified the COVID-19 development processes into three typical modes of immune response, correlated with the clinical classification of mild and moderate, severe, and critical patients. We found that the immune efficacy (the ability of the host to clear virus and kill infected cells) and the lymphocyte supply (the abundance and pool of naïve T and B cells) play important roles in the dynamic process and determine the clinical outcome, especially for severe and critical patients. Furthermore, we put forward possible treatment strategies for the three typical modes of immune response. We hope our results can help to understand the dynamical mechanism of the immune response against SARS-CoV-2 infection and be useful for treatment strategies and vaccine design.
Submitted 12 January, 2021;
originally announced January 2021.
-
A prognostic dynamic model applicable to infectious diseases providing easily visualized guides -- A case study of COVID-19 in the UK
Authors:
Yuxuan Zhang,
Chen Gong,
Dawei Li,
Zhi-Wei Wang,
Shengda D Pu,
Alex W Robertson,
Hong Yu,
John Parrington
Abstract:
A reasonable prediction of the transmission process of infectious diseases under different disease control strategies is an important reference point for policy makers. Here we established a dynamic transmission model in Python and realized comprehensive regulation of disease control measures. We classified government interventions into three categories and introduced three parameters to describe the key points in disease control: intraregional growth rate, interregional communication rate, and detection rate of infectors. Our simulation predicts that COVID-19 infection in the UK would be out of control in 73 days without any interventions; at the same time, herd immunity acquisition would begin from the epicentre. After we introduced government interventions, a single intervention was effective in disease control but at huge expense, while combined interventions were more efficient; among these, enhancing the detection number is crucial in the control strategy for COVID-19. In addition, we calculated requirements for the most effective vaccination strategy based on the infection number in the real situation. Our model was programmed with iterative algorithms and visualized via cellular automata. It can be applied to similar epidemics in other regions if the basic parameters are inputted, and it is able to synthetically mimic the effect of multiple factors in infectious disease control.
Submitted 22 February, 2021; v1 submitted 13 December, 2020;
originally announced December 2020.
-
Evaluating the effect of city lock-down on controlling COVID-19 propagation through deep learning and network science models
Authors:
Xiaoqi Zhang,
Zheng Ji,
Yanqiao Zheng,
Xinyue Ye,
Dong Li
Abstract:
The special epistemic characteristics of COVID-19, such as the long incubation period and infection through asymptomatic cases, pose a severe challenge to the containment of its outbreak. By the end of March 2020, China had successfully controlled the within-spreading of COVID-19 at the high cost of locking down most of its major cities, including the epicenter, Wuhan. Since the low accuracy of outbreak data before mid-February 2020 is a major technical concern for studies based on statistical inference from the early outbreak, we apply supervised learning techniques to identify and train an NP-Net-SIR model, which turns out to be robust under poor data quality. Using the trained model parameters, we analyze the connection between population flow and the cross-regional infection connection strength, based on which a set of counterfactual analyses is carried out to study the necessity of lock-down and the substitutability between lock-down and other containment measures. Our findings support the existence of non-lock-down measures that can reach the same containment outcome as the lock-down, and provide useful guidelines for the design of a more flexible containment strategy.
Submitted 4 September, 2020;
originally announced September 2020.
-
Network resilience
Authors:
Xueming Liu,
Daqing Li,
Manqing Ma,
Boleslaw K. Szymanski,
H Eugene Stanley,
Jianxi Gao
Abstract:
Many systems on our planet are known to shift abruptly and irreversibly from one state to another when they are forced across a "tipping point," such as mass extinctions in ecological networks, cascading failures in infrastructure systems, and social convention changes in human and animal networks. Such a regime shift demonstrates a system's resilience, which characterizes the ability of a system to adjust its activity to retain its basic functionality in the face of internal disturbances or external environmental changes. In the past 50 years, attention was almost exclusively given to low-dimensional systems and the calibration of their resilience functions and indicators of early warning signals, without consideration of the interactions between the components. Only in recent years, taking advantage of network theory and abundant real data sets, have network scientists directed their interest to real-world complex networked multidimensional systems and their resilience functions and early warning indicators. This report is devoted to a comprehensive review of the resilience function and regime shift of complex systems in different domains, such as ecology, biology, social systems, and infrastructure. We cover the related research on empirical observations, experimental studies, mathematical modeling, and theoretical analysis. We also discuss some ambiguous definitions, such as robustness, resilience, and stability.
Submitted 9 April, 2022; v1 submitted 26 July, 2020;
originally announced July 2020.
-
Statistical Issues and Recommendations for Clinical Trials Conducted During the COVID-19 Pandemic
Authors:
R. Daniel Meyer,
Bohdana Ratitch,
Marcel Wolbers,
Olga Marchenko,
Hui Quan,
Daniel Li,
Chrissie Fletcher,
Xin Li,
David Wright,
Yue Shentu,
Stefan Englert,
Wei Shen,
Jyotirmoy Dey,
Thomas Liu,
Ming Zhou,
Norman Bohidar,
Peng-Liang Zhao,
Michael Hale
Abstract:
The COVID-19 pandemic has had and continues to have major impacts on planned and ongoing clinical trials. Its effects on trial data create multiple potential statistical issues. The scale of impact is unprecedented, but when viewed individually, many of the issues are well defined and feasible to address. A number of strategies and recommendations are put forward to assess and address issues related to estimands, missing data, validity and modifications of statistical analysis methods, need for additional analyses, ability to meet objectives and overall trial interpretability.
Submitted 20 May, 2020;
originally announced May 2020.
-
Modeling Epidemic Spreading through Public Transit using Time-Varying Encounter Network
Authors:
Baichuan Mo,
Kairui Feng,
Yu Shen,
Clarence Tam,
Daqing Li,
Yafeng Yin,
Jinhua Zhao
Abstract:
Passenger contact in public transit (PT) networks can be a key mediator in the spreading of infectious diseases. This paper proposes a time-varying weighted PT encounter network to model the spreading of infectious diseases through PT systems. Social activity contacts at both local and global levels are also considered. We select the epidemiological characteristics of coronavirus disease 2019 (COVID-19) as a case study, along with smart card data from Singapore, to illustrate the model at the metropolitan level. A scalable and lightweight theoretical framework is derived to capture the time-varying and heterogeneous network structures, which makes it possible to solve the problem at the whole-population level with low computational costs. Different control policies from both the public health side and the transportation side are evaluated. We find that people's preventative behavior is one of the most effective measures to control the spreading of epidemics. From the transportation side, partial closure of bus routes helps to slow down but cannot fully contain the spreading of epidemics. Identifying "influential passengers" using the smart card data and isolating them at an early stage can also effectively reduce the epidemic spreading.
Submitted 21 November, 2020; v1 submitted 9 April, 2020;
originally announced April 2020.
-
Frontoparietal Connectivity Neurofeedback Training for Promotion of Working Memory: An fNIRS Study in Healthy Male Participants
Authors:
Meiyun Xia,
Pengfei Xu,
Yuanbin Yang,
Wenyu Jiang,
Zehua Wang,
Xiaolei Gu,
Mingxi Yang,
Deyu Li,
Shuyu Li,
Guijun Dong,
Ling Wang,
Daifa Wang
Abstract:
Neurofeedback cognitive training is a promising tool used to promote cognitive functions effectively and efficiently. In this study, we investigated a novel functional near-infrared spectroscopy (fNIRS)-based frontoparietal functional connectivity (FC) neurofeedback training paradigm related to working memory, involving healthy adults. Compared with conventional cognitive training studies, we chose the frontoparietal network, a key brain region for cognitive function modulation, as neurofeedback, yielding a strong targeting effect. In the experiment, 10 participants (test group) received three cognitive training sessions of 15 min using fNIRS-based frontoparietal FC as neurofeedback, and another 10 participants served as the control group. Frontoparietal FC was significantly increased in the test group (p = 0.03), and the cognitive functions (memory and attention) were significantly promoted compared with the control group (accuracy of 3-back test: p = 0.0005, reaction time of 3-back test: p = 0.0009). After additional validations on long-term training effect and on different patient populations, the proposed method exhibited considerable potential to be developed as a fast, effective, and widespread training approach for cognitive function enhancement.
Submitted 2 June, 2021; v1 submitted 31 March, 2020;
originally announced March 2020.
-
Spatio-temporal propagation of COVID-19 pandemics
Authors:
Bnaya Gross,
Zhiguo Zheng,
Shiyan Liu,
Xiaoqi Chen,
Alon Sela,
Jianxin Li,
Daqing Li,
Shlomo Havlin
Abstract:
The new coronavirus known as COVID-19 has spread worldwide since December 2019. Without any vaccination or medicine, the means of controlling it are limited to quarantine and social distancing. Here we study the spatio-temporal propagation of the first wave of the COVID-19 virus in China and compare it to other global locations. We provide a comprehensive picture of the spatial propagation from Hubei to other provinces in China in terms of distance, population size, and human mobility and their scaling relations. Since strict quarantine has usually been applied between cities, more insight into the temporal evolution of the disease can be obtained by analyzing the epidemic within cities, especially the time evolution of the infection, death, and recovery rates, which are affected by policies. We study and compare the infection rate in different cities in China and provinces in Italy and find that the disease spread is characterized by a two-stage process. At early times, on the order of a few days, the infection rate is close to constant, probably due to the lack of means to detect infected individuals before infection symptoms are observed. Then, at later times, it decays approximately exponentially due to quarantines. The time evolution of the death and recovery rates also distinguishes between these two stages and reflects the situation of the health system, which could be overloaded.
Submitted 9 July, 2020; v1 submitted 18 March, 2020;
originally announced March 2020.
-
The Alzheimer's Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up
Authors:
Razvan V. Marinescu,
Neil P. Oxtoby,
Alexandra L. Young,
Esther E. Bron,
Arthur W. Toga,
Michael W. Weiner,
Frederik Barkhof,
Nick C. Fox,
Arman Eshaghi,
Tina Toni,
Marcin Salaterski,
Veronika Lunina,
Manon Ansart,
Stanley Durrleman,
Pascal Lu,
Samuel Iddi,
Dan Li,
Wesley K. Thompson,
Michael C. Donohue,
Aviv Nahon,
Yarden Levy,
Dan Halbersberg,
Mariya Cohen,
Huiling Liao,
Tengfei Li
, et al. (71 additional authors not shown)
Abstract:
We present the findings of "The Alzheimer's Disease Prediction Of Longitudinal Evolution" (TADPOLE) Challenge, which compared the performance of 92 algorithms from 33 international teams at predicting the future trajectory of 219 individuals at risk of Alzheimer's disease. Challenge participants were required to make a prediction, for each month of a 5-year future time period, of three key outcomes: clinical diagnosis, Alzheimer's Disease Assessment Scale Cognitive Subdomain (ADAS-Cog13), and total volume of the ventricles. The methods used by challenge participants included multivariate linear regression, machine learning methods such as support vector machines and deep neural networks, as well as disease progression models. No single submission was best at predicting all three outcomes. For clinical diagnosis and ventricle volume prediction, the best algorithms strongly outperform simple baselines in predictive ability. However, for ADAS-Cog13 no single submitted prediction method was significantly better than random guesswork. Two ensemble methods based on taking the mean and median over all predictions, obtained top scores on almost all tasks. Better than average performance at diagnosis prediction was generally associated with the additional inclusion of features from cerebrospinal fluid (CSF) samples and diffusion tensor imaging (DTI). On the other hand, better performance at ventricle volume prediction was associated with inclusion of summary statistics, such as the slope or maxima/minima of biomarkers. TADPOLE's unique results suggest that current prediction algorithms provide sufficient accuracy to exploit biomarkers related to clinical diagnosis and ventricle volume, for cohort refinement in clinical trials for Alzheimer's disease. However, results call into question the usage of cognitive test scores for patient selection and as a primary endpoint in clinical trials.
Submitted 27 December, 2021; v1 submitted 9 February, 2020;
originally announced February 2020.
-
Statistical Data Assimilation: Formulation and Examples from Neurobiology
Authors:
Anna Miller,
Dawei Li,
Jason Platt,
Arij Daou,
Daniel Margoliash,
Henry Abarbanel
Abstract:
For the Research Topic Data Assimilation and Control: Theory and Applications in Life Sciences we first review the formulation of statistical data assimilation (SDA) and discuss algorithms for exploring variational approximations to the conditional expected values of biophysical aspects of functional neural circuits. Then we report on the application of SDA to (1) the exploration of properties of individual neurons in the HVC nucleus of the avian song system, and (2) characterizing individual neurons formulated as very large scale integration (VLSI) analog circuits with a goal of building functional, biophysically realistic, VLSI representations of functional nervous systems. Networks of neurons pose a substantially greater challenge, and we comment on formulating experiments to probe the properties, especially the functional connectivity, in song command circuits within HVC.
Submitted 13 September, 2018;
originally announced September 2018.
-
Observations and perspectives on the prebiotic sequence evolution
Authors:
Dirson Jian Li
Abstract:
The post-genomic era has brought opportunities to bridge traditionally separate fields of early history of life and brought new insight into origin and evolution of biodiversity. According to distributions of codons in genome sequences, I found a relationship between the genetic code and the tree of life. This remote and profound relationship involves the origin and evolution of the genetic code and the diversification and expansion of genomes. Here, a prebiotic picture of the triplex nucleic acid evolution is proposed to explain the origin of the genetic code, where the transition from disorder to order in the origin of life might be due to the increasing stabilities of triplex base pairs. The codon degeneracy can be obtained in detail based on the coevolution of the genetic code with amino acids, or equivalently, the coevolution of tRNAs with aaRSs. This theory is based on experimental data such as the stability of triplex base pairs and the statistical features of genomic codon distributions. Several experimentally testable proposals have been developed. This study should be regarded as an exploratory attempt to reveal the early evolution of life based on sequence information in a statistical manner.
Submitted 12 July, 2018;
originally announced July 2018.
-
Observations and perspectives on the diversification of genomes
Authors:
Dirson Jian Li
Abstract:
Rich information on the prebiotic evolution is still stored in contemporary genomic data. The statistical mechanism at the sequence level may play a significant role in the prebiotic evolution. Based on statistical analysis of genome sequences, it has been observed that there is a close relationship between the evolution of the genetic code and the organisation of genomes. A biodiversity space for species is constructed by comparing the distributions of codons in genomes for different species according to the recruitment order of codons in the prebiotic evolution, by which a close relationship between the evolution of the genetic code and the tree of life has been confirmed. On one hand, the three-domain tree of life can be reconstructed according to the distance matrix of species in this biodiversity space, which supports the three-domain tree rather than the eocyte tree. On the other hand, an evolutionary tree of codons can be obtained by comparing the distributions of the 64 codons in genomes, which agrees with the recruitment order of codons on the roadmap. This is a simple phylogenomic method to study the origins of metazoans, the evolution of primates, etc. This study should be regarded as an exploratory attempt to explain the diversification of the three domains of life by statistical mechanism in prebiotic sequence evolution. It is indicated that the number of bases in the triplet codons might be explained statistically by the number of strands in the triplex DNAs. The adaptation of life to the changing environment might be due to assembly of redundant genomes at the sequence level.
Submitted 10 July, 2018;
originally announced July 2018.
-
Observations and perspectives on the variation of biodiversity
Authors:
Dirson Jian Li
Abstract:
Based on statistical analysis of the complete genome sequences, a remote relationship has been observed between the evolution of the genetic code and the three-domain tree of life. The existence of such a remote relationship needs to be explained. The unity of the living system throughout the history of life relies on the common features of life: the homochirality, the genetic code and the universal genome format. The universal genome format has been observed in the genomic codon distributions as a common feature of life at the sequence level. A main aim of this article is to reconstruct and to explain the Phanerozoic biodiversity curve. It has been observed that the exponential growth rate of the Phanerozoic biodiversity curve is about equal to the exponential growth rate of genome size evolution. Hence it is strongly indicated that the expansion of genomes causes the exponential trend of the Phanerozoic biodiversity curve, where the conservative property during the evolution of life is guaranteed by the universal genome format at the sequence level. In addition, a consensus curve based on the climatic and eustatic data is obtained to explain the fluctuations of the Phanerozoic biodiversity curve. Thus, the reconstructed biodiversity curve based on genomic, climatic and eustatic data agrees with Sepkoski's curve based on fossil data. The five mass extinctions can be discerned in this reconstructed biodiversity curve, which indicates a tectonic cause of the mass extinctions. And the declining origination rate and extinction rate throughout the Phanerozoic eon might be due to the growth trend in genome size evolution.
Submitted 4 July, 2018;
originally announced July 2018.
-
Interrogating the Escherichia coli cell cycle by cell dimension perturbations
Authors:
Hai Zheng,
Po-Yi Ho,
Meiling Jiang,
Bin Tang,
Weirong Liu,
Deng** Li,
Xuefeng Yu,
Nancy E. Kleckner,
Ariel Amir,
Chenli Liu
Abstract:
Bacteria tightly regulate and coordinate the various events in their cell cycles to duplicate themselves accurately and to control their cell sizes. Growth of Escherichia coli, in particular, follows a relation known as Schaechter's growth law. This law says that the average cell volume scales exponentially with growth rate, with a scaling exponent equal to the time from initiation of a round of DNA replication to the cell division at which the corresponding sister chromosomes segregate. Here, we sought to test the robustness of the growth law to systematic perturbations in cell dimensions achieved by varying the expression levels of mreB and ftsZ. We found that decreasing the mreB level resulted in increased cell width, with little change in cell length, whereas decreasing the ftsZ level resulted in increased cell length. Furthermore, the time from replication termination to cell division increased with the perturbed dimension in both cases. Moreover, the growth law remained valid over a range of growth conditions and dimension perturbations. The growth law can be quantitatively interpreted as a consequence of a tight coupling of cell division to replication initiation. Thus, its robustness to perturbations in cell dimensions strongly supports models in which the timing of replication initiation governs that of cell division, and cell volume is the key phenomenological variable governing the timing of replication initiation. These conclusions are discussed in the context of our recently proposed adder-per-origin model, in which cells add a constant volume per origin between initiations and divide a constant time after initiation.
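The exponential scaling described in this abstract can be illustrated with a minimal sketch. It assumes the growth law takes the form V = V0 * exp(rate * T), where T is the time from replication initiation to division. The function name, the prefactor `unit_volume`, and the numbers are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def average_cell_volume(growth_rate, init_to_division_time, unit_volume=1.0):
    """Average cell volume under an assumed exponential growth law.

    growth_rate: exponential growth rate (1/min), i.e. ln(2) / doubling time.
    init_to_division_time: time from replication initiation to division (min).
    unit_volume: hypothetical prefactor V0 (arbitrary units).
    """
    return unit_volume * math.exp(growth_rate * init_to_division_time)


# Hypothetical numbers: 30 min doubling time, 60 min from initiation to division.
rate = math.log(2) / 30.0
volume = average_cell_volume(rate, 60.0)
print(volume)
```

With these illustrative inputs the exponent equals 2 ln 2, so the predicted average volume is four times the prefactor.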
Submitted 3 January, 2017;
originally announced January 2017.
-
Symmetry and size of membrane protein polyhedral nanoparticles
Authors:
Di Li,
Osman Kahraman,
Christoph A. Haselwandter
Abstract:
In recent experiments [T. Basta et al., Proc. Natl. Acad. Sci. U.S.A. 111, 670 (2014)] lipids and membrane proteins were observed to self-assemble into membrane protein polyhedral nanoparticles (MPPNs) with a well-defined polyhedral protein arrangement and characteristic size. We develop a model of MPPN self-assembly in which the preferred symmetry and size of MPPNs emerge from the interplay of protein-induced lipid bilayer deformations, topological defects in protein packing, and thermal effects. With all model parameters determined directly from experiments, our model correctly predicts the observed symmetry and size of MPPNs. Our model suggests how key lipid and protein properties can be modified to produce a range of MPPN symmetries and sizes in experiments.
Submitted 2 November, 2016;
originally announced November 2016.
-
Nonequilibrium and nonlinear kinetics as key determinants for bistability in fission yeast G2-M transition
Authors:
De Zhao,
Teng Wang,
Jian Zhao,
Dianjie Li,
Zhili Lin,
Zeyan Chen,
Qi Ouyang,
Hong Qian,
Yu V. Fu,
Fangting Li
Abstract:
A living cell is an open, nonequilibrium biochemical system where ATP hydrolysis serves as the energy source for a wide range of intracellular processes, possibly including the assurance for decision-making. In the fission yeast cell cycle, the transition from G2 to M phase is driven by the activation of Cdc13/Cdc2 and Cdc25 and the deactivation of Wee1 through phosphorylation-dephosphorylation cycles with feedback loops. Here, we present a kinetic description of the G2-M circuit which reveals that both cellular ATP level and ATP hydrolysis free energy critically control Cdc2 activation. Using fission yeast nucleoplasmic extract (YNPE), we experimentally verify that increased ATP level drives the activation of Cdc2 which exhibits bistability and hysteresis in response to changes in cellular ATP level and ATP hydrolysis energy. These findings suggest that cellular ATP level and ATP hydrolysis energy are determinants of the bistability and robustness of Cdc2 activation during G2-M transition.
Submitted 6 May, 2024; v1 submitted 30 October, 2016;
originally announced October 2016.
-
The qualitative analysis of the impact of media delay on the control of infectious disease
Authors:
Dongmei Li,
Yue Wu,
Panpan Wen,
Weihua Liu
Abstract:
In this paper, we consider the impact of the time delay caused by media on the control of the disease. We set up a class of SISM epidemic models with the time delay and the cumulative density of awareness caused by media. A sufficient condition for the global asymptotic stability of the disease-free equilibrium is proved. We obtain the global stability of the epidemic equilibrium and the existence conditions of Hopf bifurcation. Numerical simulations are presented to illustrate the analytical results. Finally, we analyze the influence of parameters on the control of infectious disease by using H1N1 data. By shortening the media time lag and increasing the transmission rate of media and the implementation rate of the media project, the spread of disease can be controlled effectively.
Submitted 11 September, 2016;
originally announced September 2016.
-
Phylogenomic Analyses of Large-scale Nuclear Genes Provide New Insights into the Evolutionary Relationships within the Rosids
Authors:
Lei Zhao,
Xia Li,
Ning Zhang,
Shu-Dong Zhang,
Ting-Shuang Yi,
Hong Ma,
Zhen-Hua Guo,
De-Zhu Li
Abstract:
The Rosids is one of the largest groups of flowering plants, with 140 families and ~70,000 species. Previous phylogenetic studies of the rosids have primarily utilized organelle genes that likely differ in evolutionary histories from nuclear genes. To better understand the evolutionary history of rosids, it is necessary to investigate their phylogenetic relationships using nuclear genes. Here, we employed large-scale phylogenomic datasets composed of nuclear genes, including 891 clusters of putative orthologous genes. Combined with comprehensive taxon sampling covering 63 species representing 14 out of the 17 orders, we reconstructed the rosids phylogeny with coalescence and concatenation methods, yielding similar tree topologies from all datasets. However, these topologies did not agree on the placement of Zygophyllales. Through comprehensive analyses, we found that missing data and gene tree heterogeneity were potential factors that may mislead concatenation methods, in particular, large amounts of missing data under high gene tree heterogeneity. Our results provided new insights into the deep phylogenetic relationships of the rosids, and demonstrated that coalescence methods may effectively resolve the phylogenetic relationships of the rosids with missing data under high gene tree heterogeneity.
Submitted 30 June, 2016;
originally announced June 2016.
-
MODA: MOdule Differential Analysis for weighted gene co-expression network
Authors:
Dong Li,
James B. Brown,
Luisa Orsini,
Zhisong Pan,
Guyu Hu,
Shan He
Abstract:
Gene co-expression network differential analysis is designed to help biologists understand gene expression patterns under different conditions. By comparing different gene co-expression networks we may find conserved parts as well as condition-specific sets of genes. Taking the network as a collection of modules, we use a sample-saving method to construct condition-specific gene co-expression networks, and identify differentially expressed subnetworks as conserved or condition-specific modules which may be associated with biological processes. We have implemented the method as an R package which establishes a pipeline from expression profile to biological explanations. The usefulness of the method is also demonstrated on synthetic data as well as Daphnia magna gene expression data under different environmental stresses.
Submitted 16 May, 2016;
originally announced May 2016.
-
Statistical properties and error threshold of quasispecies on single-peak Gaussian-distributed fitness landscapes
Authors:
Duo-Fang Li,
Tian-Guang Cao,
**-Peng Geng,
Jian-Zhong Gu,
Hai-Long An,
Yong Zhan
Abstract:
The stochastic Eigen model proposed by Feng et al. (Journal of theoretical biology, 246 (2007) 28) showed that error threshold is no longer a phase transition point but a crossover region whose width depends on the strength of the random fluctuation in an environment. The underlying cause of this phenomenon has not yet been well examined. In this article, we adopt a single peak Gaussian distributed fitness landscape instead of a constant one to investigate and analyze the change of the error threshold and the statistical property of the quasi-species population. We find a roughly linear relation between the width of the error threshold and the fitness fluctuation strength. For a given quasi-species, the fluctuation of the relative concentration has a minimum, with a normal distribution of the relative concentration, at the maximum of the averaged relative concentration; however, it has its largest value, with a bimodal distribution of the relative concentration, near the error threshold. The above results deepen our understanding of the quasispecies and error threshold and are heuristic for exploring practicable antiviral strategies.
Submitted 1 June, 2015;
originally announced June 2015.
-
Error Threshold of Fully Random Eigen Model
Authors:
Duo-Fang Li,
Tian-Guang Cao,
**-Peng Geng,
Li-Hua Qiao,
Jian-Zhong Gu,
Yong Zhan
Abstract:
Species evolution is essentially a random process of interaction between biological populations and their environments. As a result, some physical parameters in evolution models are subject to statistical fluctuations. In this paper, two important parameters in the Eigen model, the fitness and mutation rate, are treated as Gaussian distributed random variables simultaneously to examine the property of the error threshold. Numerical simulation results show that the error threshold in the fully random model appears as a crossover region instead of a phase transition point, and as the fluctuation strength increases the crossover region becomes smoother and smoother. Furthermore, it is shown that the randomization of the mutation rate plays a dominant role in changing the error threshold in the fully random model, which is consistent with the existing experimental data. The implication of the threshold change due to the randomization for antiviral strategies is discussed.
Submitted 1 June, 2015;
originally announced June 2015.
-
MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph
Authors:
Dinghua Li,
Chi-Man Liu,
Ruibang Luo,
Kunihiko Sadakane,
Tak-Wah Lam
Abstract:
MEGAHIT is an NGS de novo assembler for assembling large and complex metagenomics data in a time- and cost-efficient manner. It finished assembling a soil metagenomics dataset with 252 Gbps in 44.1 hours and 99.6 hours on a single computing node with and without a GPU, respectively. MEGAHIT assembles the data as a whole, i.e., it avoids pre-processing like partitioning and normalization, which might compromise result integrity. MEGAHIT generates a 3 times larger assembly, with longer contig N50 and average contig length than the previous assembly. 55.8% of the reads were aligned to the assembly, which is 4 times higher than for the previous assembly. The source code of MEGAHIT is freely available at https://github.com/voutcn/megahit under the GPLv3 license.
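The contig N50 and average contig length mentioned in this abstract are standard assembly summary statistics. The sketch below shows one common way to compute them from a list of contig lengths. It is an illustration only, not MEGAHIT's own code, and the function names and example lengths are hypothetical.

```python
def contig_n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def average_contig_length(lengths):
    """Return the mean contig length."""
    if not lengths:
        raise ValueError("no contigs given")
    return sum(lengths) / len(lengths)


# Hypothetical contig lengths (bp) for illustration.
example = [100, 80, 50, 30, 20, 20]
print(contig_n50(example), average_contig_length(example))
```

Sorting in descending order and accumulating until half of the total is reached keeps the computation at O(n log n) and avoids floating-point comparisons for the N50 itself.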
Submitted 23 December, 2014; v1 submitted 25 September, 2014;
originally announced September 2014.
-
Epidemic clones, oceanic gene pools and eco-LD in the free living marine pathogen Vibrio parahaemolyticus
Authors:
Yujun Cui,
Xianwei Yang,
Xavier Didelot,
Chenyi Guo,
Dongfang Li,
Yanfeng Yan,
Yiquan Zhang,
Yanting Yuan,
Huanming Yang,
Jian Wang,
Jun Wang,
Yajun Song,
Dongsheng Zhou,
Daniel Falush,
Ruifu Yang
Abstract:
We investigated global patterns of variation in 157 whole genome sequences of Vibrio parahaemolyticus, a free-living and seafood-associated marine bacterium. Pandemic clones, responsible for recent outbreaks of gastroenteritis in humans, have spread globally. However, there are oceanic gene pools, one located in the oceans surrounding Asia and another in the Mexican Gulf. Frequent recombination means that most isolates have acquired the genetic profile of their current location. We investigated the genetic structure in the Asian gene pool by calculating the effective population size in two different ways. Under standard neutral models, the two estimates should give similar answers but we found a thirty-fold difference. We propose that this discrepancy is caused by the subdivision of the species into a hundred or more ecotypes which are maintained stably in the population. To investigate the genetic factors involved, we used 51 unrelated isolates to conduct a genome-wide scan for epistatically interacting loci. We found a single example of strong epistasis between distant genome regions. A majority of strains had a type VI secretion system associated with bacterial killing. The remaining strains had genes associated with biofilm formation and regulated by c-di-GMP signaling. All strains had one or other of the two systems and no isolate had complete complements of both systems, although several strains had remnants. Further top-down analysis of patterns of linkage disequilibrium within frequently recombining species will allow a detailed understanding of how selection acts to structure the pattern of variation within natural bacterial populations.
Submitted 30 November, 2014; v1 submitted 30 June, 2014;
originally announced June 2014.
-
The tectonic cause of mass extinctions and the genomic contribution to biodiversification
Authors:
Dirson Jian Li
Abstract:
Despite numerous mass extinctions in the Phanerozoic eon, the overall trend in biodiversity evolution was not blocked and life has never been wiped out. Almost all possible catastrophic events (large igneous province, asteroid impact, climate change, regression and transgression, anoxia, acidification, sudden release of methane clathrate, multi-cause etc.) have been proposed to explain the mass extinctions. However, we should, above all, clarify at what timescale and at what levels the mass extinctions should be explained. Even though the mass extinctions occurred at short timescales and at the species level, we reveal that their cause should be explained in a broader context at the tectonic timescale and at both the molecular level and the species level. The main result in this paper is that the Phanerozoic biodiversity evolution has been explained by reconstructing the Sepkoski curve based on climatic, eustatic and genomic data. Consequently, we point out that the P-Tr extinction was caused by tectonically originated climate instability. We also clarify that the overall trend of biodiversification originated from the underlying genome size evolution, and that the fluctuation of biodiversity originated from the interactions among the earth's spheres. Evolution at the molecular level has played a significant role in the survival of life through environmental disasters.
Submitted 18 December, 2012;
originally announced December 2012.
-
Modeling branching effects on source-sink relationships of the cotton plant
Authors:
Dong Li,
Véronique Letort,
Yan Guo,
P. De Reffye,
Zhigang Zhan
Abstract:
Compared with classical process-based models, the functional-structural plant models provide more efficient tools to explore the impact of changes in plant structures on plant functioning. In this paper we investigated the effects of branches on the source-sink interaction for the cotton plant (Gossypium hirsutum L.) based on a two-treatment experiment conducted on cotton grown in the field: the single-stem plants and the plants with only two vegetative branches. It was observed that the branched cotton had more organs for the whole plant but the organs on the trunk were smaller than those on the single-stem cotton. The phytomer production of the branches was four or five growth cycles delayed compared with the main stem. The organs on the trunk had similar dynamics of expansion for both treatments. Effects of branches were evaluated by using the functional-structural model GREENLAB. It allowed estimating the coefficients of sink strength to differentiate the biomass acquisition abilities of organs between different physiological ages. We found that the presence of the two vegetative branches increased the ground projection area of plant leaves and had led to slight changes on the directly measured parameters; the potential relative sink strengths of organs were found similar for the two treatments.
Submitted 15 December, 2010;
originally announced December 2010.
-
Classification of life by the mechanism of genome size evolution
Authors:
Dirson Jian Li,
Shengli Zhang
Abstract:
The classification of life should be based upon the fundamental mechanism in the evolution of life. We found that the global relationships among species should be circular phylogeny, which is quite different from the common sense based upon phylogenetic trees. The genealogical circles can be observed clearly according to the analysis of protein length distributions of contemporary species. Thus, we suggest that domains can be defined by distinguished phylogenetic circles, which are global and stable characteristics of living systems. The mechanism in genome size evolution has been clarified; hence main component questions on C-value enigma can be explained. According to the correlations and quasi-periodicity of protein length distributions, we can also classify life into three domains.
Submitted 17 May, 2009; v1 submitted 19 November, 2008;
originally announced November 2008.
-
Genetic code evolution as an initial driving force for molecular evolution
Authors:
Dirson Jian Li,
Shengli Zhang
Abstract:
There is an intrinsic relationship between the molecular evolution in primordial period and the properties of genomes and proteomes of contemporary species. The genomic data may help us understand the driving force of evolution of life at molecular level. In absence of evidence, numerous problems in molecular evolution had to fall into a twilight zone of speculation and controversy in the past. Here we show that delicate structures of variations of genomic base compositions and amino acid frequencies resulted from the genetic code evolution. And the driving force of evolution of life also originated in the genetic code evolution. The theoretical results on the variations of amino acid frequencies and genomic base compositions agree with the experimental observations very well, not only in the variation trends but also in some fine structures. Inversely, the genomic data of contemporary species can help reconstruct the genetic code chronology and amino acid chronology in primordial period. Our results may shed light on the intrinsic mechanism of molecular evolution and the genetic code evolution.
Submitted 15 March, 2009; v1 submitted 10 July, 2008;
originally announced July 2008.
-
Prediction of genomic properties and classification of life by protein length distributions
Authors:
Dirson Jian Li,
Shengli Zhang
Abstract:
Much evolutionary information is stored in the fluctuations of protein length distributions. The genome size and non-coding DNA content can be calculated based only on the protein length distributions. So there is an intrinsic relationship between the coding DNA size and non-coding DNA size. According to the correlations and quasi-periodicity of protein length distributions, we can classify life into three domains. Strong evidence is found to support the order in the structures of protein length distributions.
Submitted 1 June, 2008;
originally announced June 2008.
-
The C-value enigma and timing of the Cambrian explosion
Authors:
Dirson Jian Li,
Shengli Zhang
Abstract:
The Cambrian explosion is a grand challenge to science today and involves multidisciplinary study. This event is generally believed to be a result of genetic innovations, environmental factors and ecological interactions, even though there are many conflicts on the nature and timing of metazoan origins. The crux of the matter is that an entire roadmap of the evolution is missing to discern the biological complexity transition and to evaluate the critical role of the Cambrian explosion in the overall evolutionary context. Here we calculate the time of the Cambrian explosion by an innovative and accurate "C-value clock"; our result (560 million years ago) fits the fossil record well. We clarify that the intrinsic reason of genome evolution determined the Cambrian explosion. A general formula for evaluating genome size of different species has been found, by which major questions of the C-value enigma can be solved and the genome size evolution can be illustrated. The Cambrian explosion is essentially a major transition of biological complexity, which corresponds to a turning point in genome size evolution. The observed maximum prokaryotic complexity is just a relic of the Cambrian explosion and it is supervised by the maximum information storage capability in the observed universe. Our results open a new prospect of studying metazoan origins and molecular evolution.
Submitted 31 May, 2008;
originally announced June 2008.
-
Holographic bound and protein linguistics
Authors:
Dirson Jian Li,
Shengli Zhang
Abstract:
The holographic bound in physics constrains the complexity of life. The finite storage capability of information in the observable universe requires the protein linguistics in the evolution of life. We find that the evolution of genetic code determines the variance of amino acid frequencies and genomic GC content among species. The elegant linguistic mechanism is confirmed by the experimental observations based on all known entire proteomes.
Submitted 10 April, 2007;
originally announced April 2007.