Showing 1–50 of 52 results for author: Zhu, J

Searching in archive q-bio.
  1. arXiv:2406.06969  [pdf, other]

    q-bio.GN cs.DM

    Data mining method of single-cell omics data to evaluate a pure tissue environmental effect on gene expression level

    Authors: Daigo Okada, Jianshen Zhu, Kan Shota, Yuuki Nishimura, Kazuya Haraguchi

    Abstract: While single-cell RNA-seq enables the investigation of the celltype effect on the transcriptome, the pure tissue environmental effect has not been well investigated. The bias in the combination of tissue and celltype in the body made it difficult to evaluate the effect of pure tissue environment by omics data mining. It is important to prevent statistical confounding among discrete variables such…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2406.05797  [pdf, other]

    q-bio.BM cs.AI cs.CE cs.CL cs.LG

    3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization

    Authors: Qizhi Pei, Lijun Wu, Kaiyuan Gao, Jinhua Zhu, Rui Yan

    Abstract: The integration of molecule and language has garnered increasing attention in molecular science. Recent advancements in Language Models (LMs) have demonstrated potential for the comprehensive modeling of molecule and language. However, existing works exhibit notable limitations. Most existing works overlook the modeling of 3D information, which is crucial for understanding molecular structures and…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: 18 pages

  3. arXiv:2405.00513  [pdf]

    q-bio.QM

    3D MR Fingerprinting for Dynamic Contrast-Enhanced Imaging of Whole Mouse Brain

    Authors: Yuran Zhu, Guanhua Wang, Yuning Gu, Walter Zhao, Jiahao Lu, Junqing Zhu, Christina J. MacAskill, Andrew Dupuis, Mark A. Griswold, Dan Ma, Chris A. Flask, Xin Yu

    Abstract: Quantitative MRI enables direct quantification of contrast agent concentrations in contrast-enhanced scans. However, the lengthy scan times required by conventional methods are inadequate for tracking contrast agent transport dynamically in mouse brain. We developed a 3D MR fingerprinting (MRF) method for simultaneous T1 and T2 mapping across the whole mouse brain with 4.3-min temporal resolution.…

    Submitted 1 May, 2024; originally announced May 2024.

  4. arXiv:2403.20261  [pdf, other]

    q-bio.BM cs.AI cs.LG

    FABind+: Enhancing Molecular Docking through Improved Pocket Prediction and Pose Generation

    Authors: Kaiyuan Gao, Qizhi Pei, Jinhua Zhu, Kun He, Lijun Wu

    Abstract: Molecular docking is a pivotal process in drug discovery. While traditional techniques rely on extensive sampling and simulation governed by physical principles, these methods are often slow and costly. The advent of deep learning-based approaches has shown significant promise, offering increases in both accuracy and efficiency. Building upon the foundational work of FABind, a model designed with…

    Submitted 7 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: 17 pages, 14 figures, 5 tables

  5. arXiv:2403.17513  [pdf, other]

    physics.chem-ph physics.bio-ph q-bio.BM

    A unified framework for coarse grained molecular dynamics of proteins

    Authors: Jinzhen Zhu, Jianpeng Ma

    Abstract: Understanding protein dynamics is crucial for elucidating their biological functions. While all-atom molecular dynamics (MD) simulations provide detailed information, coarse-grained (CG) MD simulations capture the essential collective motions of proteins at significantly lower computational cost. In this article, we present a unified framework for coarse-grained molecular dynamics simulation of pr…

    Submitted 4 June, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: 12 pages, 8 figures

  6. arXiv:2403.01528  [pdf, other]

    cs.CL cs.AI q-bio.BM

    Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey

    Authors: Qizhi Pei, Lijun Wu, Kaiyuan Gao, Jinhua Zhu, Yue Wang, Zun Wang, Tao Qin, Rui Yan

    Abstract: The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology. This approach leverages the rich, multifaceted descriptions of biomolecules contained within textual data sources to enhance our fundamental understanding and enable downstream computational tasks such as biomol…

    Submitted 5 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: Survey Paper. 25 pages, 9 figures, and 3 tables

  7. arXiv:2402.17810  [pdf, other]

    q-bio.QM cs.AI cs.CE cs.LG q-bio.BM

    BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning

    Authors: Qizhi Pei, Lijun Wu, Kaiyuan Gao, Xiaozhuan Liang, Yin Fang, Jinhua Zhu, Shufang Xie, Tao Qin, Rui Yan

    Abstract: Recent research trends in computational biology have increasingly focused on integrating text and bio-entity modeling, especially in the context of molecules and proteins. However, previous efforts like BioT5 faced challenges in generalizing across diverse tasks and lacked a nuanced understanding of molecular structures, particularly in their textual representations (e.g., IUPAC). This paper intro…

    Submitted 31 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 (Findings)

  8. arXiv:2402.12391  [pdf, other]

    q-bio.GN cs.AI cs.LG

    Toward a Team of AI-made Scientists for Scientific Discovery from Gene Expression Data

    Authors: Haoyang Liu, Yijiang Li, Jinglin Jian, Yuxuan Cheng, Jianrong Lu, Shuyi Guo, Jinglei Zhu, Mianchen Zhang, Miantong Zhang, Haohan Wang

    Abstract: Machine learning has emerged as a powerful tool for scientific discovery, enabling researchers to extract meaningful insights from complex datasets. For instance, it has facilitated the identification of disease-predictive genes from gene expression data, significantly advancing healthcare. However, the traditional process for analyzing such datasets demands substantial human effort and expertise…

    Submitted 20 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 18 pages, 2 figures; added contact

  9. arXiv:2402.06772  [pdf, other]

    q-bio.QM cs.AI cs.CE cs.LG

    Retrosynthesis Prediction via Search in (Hyper) Graph

    Authors: Zixun Lan, Binjie Hong, Jiajun Zhu, Zuo Zeng, Zhenfu Liu, Limin Yu, Fei Ma

    Abstract: Predicting reactants from a specified core product stands as a fundamental challenge within organic synthesis, termed retrosynthesis prediction. Recently, semi-template-based methods and graph-edits-based methods have achieved good performance in terms of both interpretability and accuracy. However, due to their mechanisms these methods cannot predict complex reactions, e.g., reactions with multip…

    Submitted 9 February, 2024; originally announced February 2024.

  10. arXiv:2401.10806  [pdf, ps, other]

    q-bio.BM

    DeepRLI: A Multi-objective Framework for Universal Protein–Ligand Interaction Prediction

    Authors: Haoyu Lin, Shiwei Wang, Jintao Zhu, Yibo Li, Jianfeng Pei, Luhua Lai

    Abstract: Protein (receptor)–ligand interaction prediction is a critical component in computer-aided drug design, significantly influencing molecular docking and virtual screening processes. Despite the development of numerous scoring functions in recent years, particularly those employing machine learning, accurately and efficiently predicting binding affinities for protein–ligand complexes remains a for…

    Submitted 19 January, 2024; originally announced January 2024.

  11. arXiv:2311.15201  [pdf, other]

    q-bio.BM

    DiffBindFR: An SE(3) Equivariant Network for Flexible Protein-Ligand Docking

    Authors: Jintao Zhu, Zhonghui Gu, Jianfeng Pei, Luhua Lai

    Abstract: Molecular docking, a key technique in structure-based drug design, plays pivotal roles in protein-ligand interaction modeling, hit identification and optimization, in which accurate prediction of protein-ligand binding mode is essential. Conventional docking approaches perform well in redocking tasks with known protein binding pocket conformation in the complex state. However, in real-world dockin…

    Submitted 19 December, 2023; v1 submitted 26 November, 2023; originally announced November 2023.

  12. arXiv:2310.13468  [pdf, other]

    q-bio.PE physics.soc-ph q-bio.QM

    EpiGeoPop: A Tool for Developing Spatially Accurate Country-level Epidemiological Models

    Authors: Lara Herriott, Henriette L. Capel, Isaac Ellmen, Nathan Schofield, Jiayuan Zhu, Ben Lambert, David Gavaghan, Ioana Bouros, Richard Creswell, Kit Gallagher

    Abstract: Mathematical models play a crucial role in understanding the spread of infectious disease outbreaks and influencing policy decisions. These models aid pandemic preparedness by predicting outcomes under hypothetical scenarios and identifying weaknesses in existing frameworks. However, their accuracy, utility, and comparability are being scrutinized. Agent-based models (ABMs) have emerged as a valua…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 16 pages, 6 figures, 3 supplementary figures

  13. arXiv:2310.07276  [pdf, other]

    cs.CL cs.AI cs.LG q-bio.BM

    BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations

    Authors: Qizhi Pei, Wei Zhang, Jinhua Zhu, Kehan Wu, Kaiyuan Gao, Lijun Wu, Yingce Xia, Rui Yan

    Abstract: Recent advancements in biological research leverage the integration of molecules, proteins, and natural language to enhance drug discovery. However, current models exhibit several limitations, such as the generation of invalid molecular SMILES, underutilization of contextual information, and equal treatment of structured and unstructured knowledge. To address these issues, we propose…

    Submitted 28 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted by Empirical Methods in Natural Language Processing 2023 (EMNLP 2023)

  14. arXiv:2310.06763  [pdf, other]

    cs.LG cs.AI q-bio.BM

    FABind: Fast and Accurate Protein-Ligand Binding

    Authors: Qizhi Pei, Kaiyuan Gao, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin, Kun He, Tie-Yan Liu, Rui Yan

    Abstract: Modeling the interaction between proteins and ligands and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advancements in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as two prominent approaches. However, these methods have notable limitations. Sampling-based meth…

    Submitted 8 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted by Neural Information Processing Systems 2023 (NeurIPS 2023)

  15. arXiv:2309.07165  [pdf]

    q-bio.PE

    Revive, Restore, Revitalize: An Eco-economic Methodology for Maasai Mara

    Authors: Yipeng Xu, He Sun, Junfeng Zhu

    Abstract: The Maasai Mara in Kenya, renowned for its biodiversity, is witnessing ecosystem degradation and species endangerment due to intensified human activities. Addressing this, we introduce a dynamic system harmonizing ecological and human priorities. Our agent-based model replicates the Maasai Mara savanna ecosystem, incorporating 71 animal species, 10 human classifications, and 2 natural resource typ…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 25 pages, 16 figures

  16. arXiv:2307.08576  [pdf]

    q-bio.NC cs.LG

    A Study on the Performance of Generative Pre-trained Transformer (GPT) in Simulating Depressed Individuals on the Standardized Depressive Symptom Scale

    Authors: Si** Cai, Nanfeng Zhang, Jiaying Zhu, Yanjie Liu, Yong** Zhou

    Abstract: Background: Depression is a common mental disorder with societal and economic burden. Current diagnosis relies on self-reports and assessment scales, which have reliability issues. Objective approaches are needed for diagnosing depression. Objective: Evaluate the potential of GPT technology in diagnosing depression. Assess its ability to simulate individuals with depression and investigate the inf…

    Submitted 17 July, 2023; originally announced July 2023.

  17. arXiv:2306.05445  [pdf, other]

    physics.chem-ph cs.LG q-bio.BM

    Towards Predicting Equilibrium Distributions for Molecular Systems with Deep Learning

    Authors: Shuxin Zheng, Jiyan He, Chang Liu, Yu Shi, Ziheng Lu, Weitao Feng, Fusong Ju, Jiaxi Wang, Jianwei Zhu, Yaosen Min, He Zhang, Shidi Tang, Hongxia Hao, Peiran Jin, Chi Chen, Frank Noé, Haiguang Liu, Tie-Yan Liu

    Abstract: Advances in deep learning have greatly improved structure prediction of molecules. However, many macroscopic observations that are important for real-world applications are not functions of a single molecular structure, but rather determined from the equilibrium distribution of structures. Traditional methods for obtaining these distributions, such as molecular dynamics simulation, are computation…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 80 pages, 11 figures

  18. arXiv:2304.01347  [pdf]

    q-bio.NC cs.LG cs.MM

    Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

    Authors: Cheng Zhu, Ying Tan, Shuqi Yang, Jiaqing Miao, Jiayi Zhu, Huan Huang, Dezhong Yao, Cheng Luo

    Abstract: The available evidence suggests that dynamic functional connectivity (dFC) can capture time-varying abnormalities in brain activity in resting-state cerebral functional magnetic resonance imaging (rs-fMRI) data and has a natural advantage in uncovering mechanisms of abnormal brain activity in schizophrenia (SZ) patients. Hence, an advanced dynamic brain network analysis model called the temporal br…

    Submitted 11 September, 2023; v1 submitted 30 March, 2023; originally announced April 2023.

  19. arXiv:2211.08406  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Incorporating Pre-training Paradigm for Antibody Sequence-Structure Co-design

    Authors: Kaiyuan Gao, Lijun Wu, Jinhua Zhu, Tianbo Peng, Yingce Xia, Liang He, Shufang Xie, Tao Qin, Haiguang Liu, Kun He, Tie-Yan Liu

    Abstract: Antibodies are versatile proteins that can bind to pathogens and provide effective protection for human body. Recently, deep learning-based computational antibody design has attracted popular attention since it automatically mines the antibody patterns from data that could be complementary to human experiences. However, the computational methods heavily rely on high-quality antibody structure data…

    Submitted 17 November, 2022; v1 submitted 26 October, 2022; originally announced November 2022.

  20. arXiv:2209.15408  [pdf, other]

    physics.chem-ph cs.LG q-bio.BM

    Equivariant Energy-Guided SDE for Inverse Molecular Design

    Authors: Fan Bao, Min Zhao, Zhongkai Hao, Peiyao Li, Chongxuan Li, Jun Zhu

    Abstract: Inverse molecular design is critical in material science and drug discovery, where the generated molecules should satisfy certain desirable properties. In this paper, we propose equivariant energy-guided stochastic differential equations (EEGSDE), a flexible framework for controllable 3D molecule generation under the guidance of an energy function in diffusion models. Formally, we show that EEGSDE…

    Submitted 28 February, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

  21. arXiv:2209.13527  [pdf, ps, other]

    q-bio.BM cs.LG math.OC

    Molecular Design Based on Integer Programming and Quadratic Descriptors in a Two-layered Model

    Authors: Jianshen Zhu, Naveed Ahmed Azam, Shengjuan Cao, Ryota Ido, Kazuya Haraguchi, Liang Zhao, Hiroshi Nagamochi, Tatsuya Akutsu

    Abstract: A novel framework has recently been proposed for designing the molecular structure of chemical compounds with a desired chemical property, where design of novel drugs is an important topic in bioinformatics and chemo-informatics. The framework infers a desired chemical graph by solving a mixed integer linear program (MILP) that simulates the computation process of a feature function defined by a t…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2108.10266, arXiv:2107.02381, arXiv:2109.02628

  22. arXiv:2208.06348  [pdf, other]

    q-bio.NC cs.AI cs.CL cs.LG

    Can Brain Signals Reveal Inner Alignment with Human Languages?

    Authors: William Han, Jielin Qiu, Jiacheng Zhu, Mengdi Xu, Douglas Weber, Bo Li, Ding Zhao

    Abstract: Brain Signals, such as Electroencephalography (EEG), and human languages have been widely explored independently for many downstream tasks, however, the connection between them has not been well explored. In this study, we explore the relationship and dependency between EEG and language. To study at the representation level, we introduced MTAM, a Multimodal Transformer…

    Submitted 4 May, 2024; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: EMNLP 2023 Findings

  23. arXiv:2206.09818  [pdf, other]

    q-bio.BM cs.AI cs.LG

    SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction

    Authors: Qizhi Pei, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin, Haiguang Liu, Tie-Yan Liu, Rui Yan

    Abstract: Accurate prediction of Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities. While wet experiments remain the most reliable method, they are time-consuming and resource-intensive, resulting in limited data availability that poses challenges for deep…

    Submitted 17 October, 2023; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: Accepted by Briefings in Bioinformatics 2023

  24. arXiv:2205.11016  [pdf, other]

    cs.CV q-bio.QM

    MolMiner: You only look once for chemical structure recognition

    Authors: Youjun Xu, Jinchuan Xiao, Chia-Han Chou, Jianhang Zhang, Jintao Zhu, Qiwan Hu, Hemin Li, Ningsheng Han, Bingyu Liu, Shuaipeng Zhang, Jinyu Han, Zhen Zhang, Shuhao Zhang, Weilin Zhang, Luhua Lai, Jianfeng Pei

    Abstract: Molecular structures are always depicted as 2D printed form in scientific documents like journal papers and patents. However, these 2D depictions are not machine-readable. Due to a backlog of decades and an increasing amount of these printed literature, there is a high demand for the translation of printed depictions into machine-readable formats, which is known as Optical Chemical Structure Recog…

    Submitted 22 May, 2022; originally announced May 2022.

    Comments: 19 pages, 4 figures

  25. arXiv:2204.11840  [pdf, other]

    cs.LG cs.AI eess.SP q-bio.NC

    Dynamic Ensemble Bayesian Filter for Robust Control of a Human Brain-machine Interface

    Authors: Yu Qi, Xinyun Zhu, Kedi Xu, Feixiao Ren, Hongjie Jiang, Junming Zhu, Jianmin Zhang, Gang Pan, Yueming Wang

    Abstract: Objective: Brain-machine interfaces (BMIs) aim to provide direct brain control of devices such as prostheses and computer cursors, which have demonstrated great potential for mobility restoration. One major limitation of current BMIs lies in the unstable performance in online control due to the variability of neural signals, which seriously hinders the clinical availability of BMIs. Method: To dea…

    Submitted 22 April, 2022; originally announced April 2022.

  26. arXiv:2107.02381  [pdf, ps, other]

    cs.LG math.OC q-bio.BM

    An Inverse QSAR Method Based on Linear Regression and Integer Programming

    Authors: Jianshen Zhu, Naveed Ahmed Azam, Kazuya Haraguchi, Liang Zhao, Hiroshi Nagamochi, Tatsuya Akutsu

    Abstract: Recently a novel framework has been proposed for designing the molecular structure of chemical compounds using both artificial neural networks (ANNs) and mixed integer linear programming (MILP). In the framework, we first define a feature vector $f(C)$ of a chemical graph $C$ and construct an ANN that maps $x=f(C)$ to a predicted value $η(x)$ of a chemical property $π$ to $C$. After this, we formu…

    Submitted 23 August, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

  27. arXiv:2106.10234  [pdf, other]

    q-bio.QM cs.LG

    Dual-view Molecule Pre-training

    Authors: Jinhua Zhu, Yingce Xia, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

    Abstract: Inspired by its success in natural language processing and computer vision, pre-training has attracted substantial attention in cheminformatics and bioinformatics, especially for molecule based tasks. A molecule can be represented by either a graph (where atoms are connected by bonds) or a SMILES sequence (where depth-first-search is applied to the molecular graph with specific rules). Existing wo…

    Submitted 12 October, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

    Comments: Add new results of retrosynthesis

  28. arXiv:2101.10643  [pdf, other]

    stat.ML cs.AI cs.LG q-bio.QM

    Causal inference for observational longitudinal studies using deep survival models

    Authors: Jie Zhu, Blanca Gallego

    Abstract: Causal inference for observational longitudinal studies often requires the accurate estimation of treatment effects on time-to-event outcomes in the presence of time-dependent patient history and time-dependent covariates. To tackle this longitudinal treatment effect estimation problem, we have developed a time-variant causal survival (TCS) model that uses the potential outcomes framework with an…

    Submitted 8 June, 2022; v1 submitted 26 January, 2021; originally announced January 2021.

  29. arXiv:2011.01002  [pdf, other]

    q-bio.QM eess.IV q-bio.TO stat.AP

    RRScell method for automated single-cell profiling of multiplexed immunofluorescence cancer tissue

    Authors: Alvason Zhenhua Li, Karsten Eichholz, Anton Sholukh, Daniel Stone, Michelle A. Loprieno, Keith R. Jerome, Khamsone Phasouk, Kurt Diem, Jia Zhu, Lawrence Corey

    Abstract: Multiplexed immuno-fluorescence tissue imaging, allowing simultaneous detection of molecular properties of cells, is an essential tool for characterizing the complex cellular mechanisms in translational research and clinical practice. New image analysis approaches are needed because tissue section stained with a mixture of protein, DNA and RNA biomarkers are introducing various complexities, inclu…

    Submitted 18 March, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

    Comments: 8 pages, 6 figures, markerUMAP cell clustering

  30. arXiv:2006.03226  [pdf]

    cs.NE cs.AI q-bio.NC

    Brain-inspired global-local learning incorporated with neuromorphic computing

    Authors: Yujie Wu, Rong Zhao, Jun Zhu, Feng Chen, Mingkun Xu, Guoqi Li, Sen Song, Lei Deng, Guanrui Wang, Hao Zheng, Jing Pei, Youhui Zhang, Mingguo Zhao, Luping Shi

    Abstract: Two main routes of learning methods exist at present including error-driven global learning and neuroscience-oriented local learning. Integrating them into one network may provide complementary learning capabilities for versatile learning scenarios. At the same time, neuromorphic computing holds great promise, but still needs plenty of useful algorithms and algorithm-hardware co-designs for exploi…

    Submitted 21 June, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: 5 figures, 6 tables

  31. arXiv:2004.02689  [pdf, other]

    q-bio.QM cs.IT eess.SP stat.ME stat.ML

    Noisy Pooled PCR for Virus Testing

    Authors: Junan Zhu, Kristina Rivera, Dror Baron

    Abstract: Fast testing can help mitigate the coronavirus disease 2019 (COVID-19) pandemic. Despite their accuracy for single sample analysis, infectious diseases diagnostic tools, like RT-PCR, require substantial resources to test large populations. We develop a scalable approach for determining the viral status of pooled patient samples. Our approach converts group testing to a linear inverse problem, wher…

    Submitted 6 April, 2020; originally announced April 2020.

    Comments: 5 pages, 3 figures; we welcome new collaborators to reach out and help improve this work!

  32. arXiv:2002.09283  [pdf]

    cs.DL cs.LG q-bio.NC

    MODMA dataset: a Multi-modal Open Dataset for Mental-disorder Analysis

    Authors: Hanshu Cai, Yiwen Gao, Shuting Sun, Na Li, Fuze Tian, Han Xiao, Jianxiu Li, Zhengwu Yang, Xiaowei Li, Qinglin Zhao, Zhenyu Liu, Zhijun Yao, Minqiang Yang, Hong Peng, Jing Zhu, Xiaowei Zhang, Guoping Gao, Fang Zheng, Rui Li, Zhihua Guo, Rong Ma, Jing Yang, Lan Zhang, Xiping Hu, Yumin Li, et al. (1 additional author not shown)

    Abstract: According to the World Health Organization, the number of mental disorder patients, especially depression patients, has grown rapidly and become a leading contributor to the global burden of disease. However, the present common practice of depression diagnosis is based on interviews and clinical scales carried out by doctors, which is not only labor-consuming but also time-consuming. One important…

    Submitted 4 March, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

    Journal ref: Sci Data 9, 178 (2022)

  33. arXiv:1910.08877  [pdf, other]

    stat.ME q-bio.QM stat.ML

    Targeted Estimation of Heterogeneous Treatment Effect in Observational Survival Analysis

    Authors: Jie Zhu, Blanca Gallego

    Abstract: The aim of clinical effectiveness research using repositories of electronic health records is to identify what health interventions 'work best' in real-world settings. Since there are several reasons why the net benefit of intervention may differ across patients, current comparative effectiveness literature focuses on investigating heterogeneous treatment effect and predicting whether an individua…

    Submitted 22 October, 2019; v1 submitted 19 October, 2019; originally announced October 2019.

    Journal ref: j.jbi.2020.103474

  34. arXiv:1906.11196  [pdf, other]

    q-bio.BM cs.LG stat.ML

    Seq-SetNet: Exploring Sequence Sets for Inferring Structures

    Authors: Fusong Ju, Jianwei Zhu, Guozheng Wei, Qi Zhang, Shiwei Sun, Dongbo Bu

    Abstract: Sequence set is a widely-used type of data source in a large variety of fields. A typical example is protein structure prediction, which takes a multiple sequence alignment (MSA) as input and aims to infer structural information from it. Almost all of the existing approaches exploit MSAs in an indirect fashion, i.e., they transform MSAs into position-specific scoring matrices (PSSM) that represen…

    Submitted 6 June, 2019; originally announced June 2019.

  35. arXiv:1810.02037  [pdf, other]

    stat.ME q-bio.GN

    A statistical normalization method and differential expression analysis for RNA-seq data between different species

    Authors: Yan Zhou, Jiadi Zhu, Tiejun Tong, Junhui Wang, Bingqing Lin, Jun Zhang

    Abstract: Background: High-throughput techniques bring novel tools but also statistical challenges to genomic research. Identifying genes with differential expression between different species is an effective way to discover evolutionarily conserved transcriptional responses. To remove systematic variation between different species for a fair comparison, the normalization procedure serves as a crucial pre-p…

    Submitted 3 October, 2018; originally announced October 2018.

  36. arXiv:1809.09553  [pdf]

    physics.med-ph q-bio.QM

    Prediction of Coronary Heart Disease Using Routine Blood Tests

    Authors: Ning Meng, Peng Zhang, Junfeng Li, Jun He, ** Zhu

    Abstract: Background --The objective of this study was to examine the association of routine blood test results with coronary heart disease (CHD) risk, to incorporate them into coronary prediction models and to compare the discrimination properties of this approach with other prediction functions. Methods and Results --This work was designed as a retrospective, single-center study of a hospital-based cohort…

    Submitted 11 September, 2018; originally announced September 2018.

  37. arXiv:1809.00083  [pdf, other]

    q-bio.BM cs.LG stat.ME

    Predicting protein inter-residue contacts using composite likelihood maximization and deep learning

    Authors: Haicang Zhang, Qi Zhang, Fusong Ju, Jianwei Zhu, Shiwei Sun, Yujuan Gao, Ziwei Xie, Minghua Deng, Shiwei Sun, Wei-Mou Zheng, Dongbo Bu

    Abstract: Accurate prediction of inter-residue contacts of a protein is important to calculating its tertiary structure. Analysis of co-evolutionary events among residues has been proved effective to inferring inter-residue contacts. The Markov random field (MRF) technique, although being widely used for contact prediction, suffers from the following dilemma: the actual likelihood function of MRF is acc…

    Submitted 31 August, 2018; originally announced September 2018.

  38. arXiv:1808.08662  [pdf, other]

    q-bio.PE

    Advances in Computational Methods for Phylogenetic Networks in the Presence of Hybridization

    Authors: R. A. L. Elworth, H. A. Ogilvie, J. Zhu, L. Nakhleh

    Abstract: Phylogenetic networks extend phylogenetic trees to allow for modeling reticulate evolutionary processes such as hybridization. They take the shape of a rooted, directed, acyclic graph, and when parameterized with evolutionary parameters, such as divergence times and population sizes, they form a generative process of molecular sequence evolution. Early work on computational methods for phylogeneti…

    Submitted 26 August, 2018; originally announced August 2018.

  39. arXiv:1805.03327  [pdf, other]

    q-bio.MN cs.LG cs.SI

    Network Enhancement: a general method to denoise weighted biological networks

    Authors: Bo Wang, Armin Pourshafeie, Marinka Zitnik, Junjie Zhu, Carlos D. Bustamante, Serafim Batzoglou, Jure Leskovec

    Abstract: Networks are ubiquitous in biology where they encode connectivity patterns at all scales of organization, from molecular to the biome. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper discovery of network patterns and dynamics. We propose Network Enhancement (NE), a method for improving the signal-to-noise rati…

    Submitted 1 June, 2018; v1 submitted 8 May, 2018; originally announced May 2018.

    Journal ref: Nature Communications, 9:3108, 2018

  40. arXiv:1706.02609  [pdf, other]

    cs.NE q-bio.NC stat.ML

    Spatio-Temporal Backpropagation for Training High-performance Spiking Neural Networks

    Authors: Yujie Wu, Lei Deng, Guoqi Li, Jun Zhu, Luping Shi

    Abstract: Compared with artificial neural networks (ANNs), spiking neural networks (SNNs) are promising to explore the brain-like behaviors since the spikes could encode more spatio-temporal information. Although pre-training from ANN or direct training based on backpropagation (BP) makes the supervised training of SNNs possible, these methods only exploit the networks' spatial domain information which lead…

    Submitted 12 September, 2017; v1 submitted 8 June, 2017; originally announced June 2017.

    Journal ref: Frontiers in neuroscience, 2018, 12

  41. arXiv:1703.07844  [pdf, other]

    q-bio.GN cs.LG q-bio.QM

    SIMLR: A Tool for Large-Scale Genomic Analyses by Multi-Kernel Learning

    Authors: Bo Wang, Daniele Ramazzotti, Luca De Sano, Junjie Zhu, Emma Pierson, Serafim Batzoglou

    Abstract: We here present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a sample-to-sample similarity measure from expression data observed for heterogenous samples. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of samples. SIMLR was benchmar…

    Submitted 18 January, 2018; v1 submitted 21 March, 2017; originally announced March 2017.

  42. arXiv:1611.10252  [pdf, other]

    q-bio.NC cs.AI cs.LG

    SeDMiD for Confusion Detection: Uncovering Mind State from Time Series Brain Wave Data

    Authors: Jingkang Yang, Haohan Wang, Jun Zhu, Eric P. Xing

    Abstract: Understanding how brain functions has been an intriguing topic for years. With the recent progress on collecting massive data and developing advanced technology, people have become interested in addressing the challenge of decoding brain wave data into meaningful mind states, with many machine learning models and algorithms being revisited and developed, especially the ones that handle time series…

    Submitted 29 November, 2016; originally announced November 2016.

    Comments: 11 pages, 2 figures, NIPS 2016 Time Series Workshop

  43. arXiv:1611.08310  [pdf]

    q-bio.NC

    White matter deficits underlie the loss of consciousness level and predict recovery outcome in disorders of consciousness

    Authors: Xuehai Wu, Jiaying Zhang, Zaixu Cui, Weijun Tang, Chunhong Shao, Jin Hu, Jianhong Zhu, Liangfu Zhou, Yao Zhao, Lu Lu, Gang Chen, Georg Northoff, Gaolang Gong, Ying Mao, Yong He

    Abstract: This study aimed to identify white matter (WM) deficits underlying the loss of consciousness in disorder of consciousness (DOC) patients using Diffusion Tensor Imaging (DTI) and to demonstrate the potential value of DTI parameters in predicting recovery outcomes of DOC patients. With 30 DOC patients (8 comatose, 8 unresponsive wakefulness syndrome/vegetative state, and 14 minimal conscious state)…

    Submitted 24 November, 2016; originally announced November 2016.

  44. arXiv:1611.02317  [pdf]

    q-bio.TO

    Renal Parenchymal Area and Kidney Collagen Content

    Authors: Jake A. Nieto, Janice Zhu, Bin Duan, Jingsong Li, Ping Zhou, Latha Paka, Michael A. Yamin, Itzhak D. Goldberg, Prakash Narayan

    Abstract: The extent of renal scarring in chronic kidney disease (CKD) can only be ascertained by highly invasive, painful and sometimes risky tissue biopsy. Interestingly, CKD-related abnormalities in kidney size can often be visualized using ultrasound. Nevertheless, not only does the ellipsoid formula used today underestimate true renal size but also the relation governing renal size and collagen content…

    Submitted 10 November, 2016; v1 submitted 7 November, 2016; originally announced November 2016.

    Comments: 17 pages, 6 figures, 3 equations

  45. arXiv:1606.07350  [pdf, other]

    q-bio.PE

    In the Light of Deep Coalescence: Revisiting Trees Within Networks

    Authors: Jiafan Zhu, Yun Yu, Luay Nakhleh

    Abstract: Phylogenetic networks model reticulate evolutionary histories. The last two decades have seen an increased interest in establishing mathematical results and developing computational methods for inferring and analyzing these networks. A salient concept underlying a great majority of these developments has been the notion that a network displays a set of trees and those trees can be used to infer, a…

    Submitted 23 June, 2016; originally announced June 2016.

  46. arXiv:1604.04913  [pdf, other]

    q-bio.TO math.OC q-bio.PE

    Optimized Treatment Schedules for Chronic Myeloid Leukemia

    Authors: Qie He, Junfeng Zhu, David Dingli, Jasmine Foo, Kevin Leder

    Abstract: Over the past decade, several targeted therapies (e.g. imatinib, dasatinib, nilotinib) have been developed to treat Chronic Myeloid Leukemia (CML). Despite an initial response to therapy, drug resistance remains a problem for some CML patients. Recent studies have shown that resistance mutations that preexist treatment can be detected in a substantial number of patients, and that this may be ass…

    Submitted 17 April, 2016; originally announced April 2016.

    Comments: 26 pages, 7 figures

  47. arXiv:1509.03434  [pdf, ps, other]

    q-bio.BM q-bio.QM

    Improving protein threading accuracy via combining local and global potential using TreeCRF model

    Authors: Haicang Zhang, Mingfu Shao, Chao Wang, Jianwei Zhu, Wei-Mou Zheng, Dongbo Bu

    Abstract: Protein structure prediction remains to be an open problem in bioinformatics. There are two main categories of methods for protein structure prediction: Free Modeling (FM) and Template Based Modeling (TBM). Protein threading, belonging to the category of template based modeling, identifies the most likely fold with the target by making a sequence-structure alignment between target protein and temp…

    Submitted 11 September, 2015; originally announced September 2015.

  48. arXiv:1507.03197  [pdf, ps, other]

    q-bio.OT

    TOPO: Improving remote homologue recognition via identifying common protein structure framework

    Authors: Jianwei Zhu, Haicang Zhang, Chao Wang, Bin Ling, Wei-Mou Zheng, Dongbo Bu

    Abstract: Protein structure prediction remains a challenge in the field of computational biology. Traditional protein structure prediction approaches include template-based modelling (say, homology modelling, and threading), and ab initio. A threading algorithm takes a query protein sequence as input, recognizes the most likely fold, and finally reports the alignments of the query sequence to structure-know…

    Submitted 12 July, 2015; originally announced July 2015.

  49. arXiv:1411.5624  [pdf, ps, other]

    q-bio.PE physics.bio-ph q-bio.GN

    Disorder and Power-law Tails of DNA Sequence Self-Alignment Concentrations in Molecular Evolution

    Authors: Kun Gao, HongGuang Sun, Jian-Zhou Zhu

    Abstract: The self-alignment concentrations, $c(x)$, as functions of the length, $x$, of the identically matching maximal segments in the genomes of a variety of species, typically present power-law tails extending to the largest scales, i.e., $c(x) \propto x^α$, with similar or apparently different negative $α$s ($<-2$). The relevant fundamental processes of molecular evolution are segmental duplication an…

    Submitted 19 December, 2014; v1 submitted 20 November, 2014; originally announced November 2014.

    Comments: a figure for the introductory discussion removed; less lengthy

  50. RADIA: RNA and DNA Integrated Analysis for Somatic Mutation Detection

    Authors: Amie J. Radenbaugh, Singer Ma, Adam Ewing, Joshua Stuart, Eric Collisson, Jingchun Zhu, David Haussler

    Abstract: The detection of somatic single nucleotide variants is a crucial component to the characterization of the cancer genome. Mutation calling algorithms thus far have focused on comparing the normal and tumor genomes from the same individual. In recent years, it has become routine for projects like The Cancer Genome Atlas (TCGA) to also sequence the tumor RNA. Here we present RADIA (RNA and DNA Integr…

    Submitted 4 February, 2014; originally announced February 2014.

    Comments: 25 pages, 3 figures, 4 tables, 8 supplementary figures, submitted to Bioinformatics