Showing 1–50 of 647 results for author: Yu

Searching in archive q-bio.
  1. arXiv:2407.09100  [pdf, other]

    q-bio.NC

    Retrospective for the Dynamic Sensorium Competition for predicting large-scale mouse primary visual cortex activity from videos

    Authors: Polina Turishcheva, Paul G. Fahey, Michaela Vystrčilová, Laura Hansel, Rachel Froebe, Kayla Ponder, Yongrong Qiu, Konstantin F. Willeke, Mohammad Bashiri, Ruslan Baikulov, Yu Zhu, Lei Ma, Shan Yu, Tiejun Huang, Bryan M. Li, Wolf De Wulf, Nina Kudryashova, Matthias H. Hennig, Nathalie L. Rochefort, Arno Onken, Eric Wang, Zhiwei Ding, Andreas S. Tolias, Fabian H. Sinz, Alexander S Ecker

    Abstract: Understanding how biological visual systems process information is challenging because of the nonlinear relationship between visual input and neuronal responses. Artificial neural networks allow computational neuroscientists to create predictive models that connect biological and machine vision. Machine learning has benefited tremendously from benchmarks that compare different models on the same ta…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.07930  [pdf]

    q-bio.BM cs.LG

    Token-Mol 1.0: Tokenized drug design with large language model

    Authors: Jike Wang, Rui Qin, Mingyang Wang, Meijing Fang, Yangyang Zhang, Yuchen Zhu, Qun Su, Qiaolin Gou, Chao Shen, Odin Zhang, Zhenxing Wu, Dejun Jiang, Xujun Zhang, Huifeng Zhao, Xiaozhe Wan, Zhourui Wu, Liwei Liu, Yu Kang, Chang-Yu Hsieh, Tingjun Hou

    Abstract: Significant interest has recently arisen in leveraging sequence-based large language models (LLMs) for drug design. However, most current applications of LLMs in drug discovery lack the ability to comprehend three-dimensional (3D) structures, thereby limiting their effectiveness in tasks that explicitly involve molecular conformations. In this study, we introduced Token-Mol, a token-only 3D drug…

    Submitted 10 July, 2024; originally announced July 2024.

  3. arXiv:2407.07357  [pdf, ps, other]

    cs.LG q-bio.MN

    A deep graph model for the signed interaction prediction in biological network

    Authors: Shuyi Jin, Mengji Zhang, Meijie Wang, Lun Yu

    Abstract: In pharmaceutical research, the strategy of drug repurposing accelerates the development of new therapies while reducing R&D costs. Network pharmacology lays the theoretical groundwork for identifying new drug indications, and deep graph models have become essential for their precision in mapping complex biological networks. Our study introduces an advanced graph model that utilizes graph convolut…

    Submitted 10 July, 2024; originally announced July 2024.

  4. arXiv:2407.06334  [pdf, other]

    cs.AI q-bio.QM

    Double-Ended Synthesis Planning with Goal-Constrained Bidirectional Search

    Authors: Kevin Yu, Jihye Roh, Ziang Li, Wenhao Gao, Runzhong Wang, Connor W. Coley

    Abstract: Computer-aided synthesis planning (CASP) algorithms have demonstrated expert-level abilities in planning retrosynthetic routes to molecules of low to moderate complexity. However, current search methods assume the sufficiency of reaching arbitrary building blocks, failing to address the common real-world constraint where using specific molecules is desired. To this end, we present a formulation of…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 10 pages main, 4 figures

  5. arXiv:2407.00984  [pdf]

    q-bio.NC cs.AI

    Individual brain parcellation: Review of methods, validations and applications

    Authors: Chengyi Li, Shan Yu, Yue Cui

    Abstract: Individual brains vary greatly in morphology, connectivity and organization. The applicability of group-level parcellations is limited by the rapid development of precision medicine today because they do not take into account the variation of parcels at the individual level. Accurate mapping of brain functional regions at the individual level is pivotal for a comprehensive understanding of the var…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 15 pages, 2 figures

  6. arXiv:2406.16995  [pdf, other]

    q-bio.QM cs.AI

    A large language model for predicting T cell receptor-antigen binding specificity

    Authors: Xing Fang, Chenpeng Yu, Shiye Tian, Hui Liu

    Abstract: The human immune response depends on the binding of T-cell receptors (TCRs) to antigens (pTCR), which triggers T cells to eliminate viruses, tumor cells, and other pathogens. The ability of the human immune system to respond to unknown viruses and bacteria stems from TCR diversity. However, this vast diversity poses challenges for TCR-antigen binding prediction methods. In this study, we p…

    Submitted 24 June, 2024; originally announced June 2024.

  7. arXiv:2406.12064  [pdf, other]

    q-bio.GN

    skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements

    Authors: Xiaolei Brian Zhang, Grace Oualline, Jim Shaw, Yun William Yu

    Abstract: Mobile genetic elements (MGEs) are as ubiquitous in nature as they are varied in type, ranging from viral insertions to transposons to incorporated plasmids. Horizontal transfer of MGEs across bacterial species may also pose a significant threat to global health due to their capability to harbour antibiotic resistance genes. However, despite cheap and rapid whole genome sequencing, the varied natu…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 9 pages, 6 figures

  8. arXiv:2406.11568  [pdf, other]

    cs.CL cs.SD eess.AS q-bio.NC

    Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models

    Authors: Sheng Feng, Heyang Liu, Yu Wang, Yanfeng Wang

    Abstract: In this paper, we introduce a groundbreaking end-to-end (E2E) framework for decoding invasive brain signals, marking a significant advancement in the field of speech neuroprosthesis. Our methodology leverages the comprehensive reasoning abilities of large language models (LLMs) to facilitate direct decoding. By fully integrating LLMs, we achieve results comparable to the state-of-the-art cascade m…

    Submitted 17 June, 2024; originally announced June 2024.

  9. arXiv:2406.06767  [pdf]

    stat.ME q-bio.QM stat.CO

    ULV: A robust statistical method for clustered data, with applications to multisubject, single-cell omics data

    Authors: Mingyu Du, Kevin Johnston, Veronica Berrocal, Wei Li, Xiangmin Xu, Zhaoxia Yu

    Abstract: Molecular and genomic technological advancements have greatly enhanced our understanding of biological processes by allowing us to quantify key biological variables such as gene expression, protein levels, and microbiome compositions. These breakthroughs have enabled us to achieve increasingly higher levels of resolution in our measurements, exemplified by our ability to comprehensively profile bi…

    Submitted 10 June, 2024; originally announced June 2024.

  10. arXiv:2406.05832  [pdf, other]

    q-bio.QM cs.LG q-bio.BM

    Improving Antibody Design with Force-Guided Sampling in Diffusion Models

    Authors: Paulina Kulytė, Francisco Vargas, Simon Valentin Mathis, Yu Guang Wang, José Miguel Hernández-Lobato, Pietro Liò

    Abstract: Antibodies, crucial for immune defense, primarily rely on complementarity-determining regions (CDRs) to bind and neutralize antigens, such as viruses. The design of these CDRs determines the antibody's affinity and specificity towards its target. Generative models, particularly denoising diffusion probabilistic models (DDPMs), have shown potential to advance the structure-based design of CDR regio…

    Submitted 9 June, 2024; originally announced June 2024.

  11. arXiv:2406.05540  [pdf, other]

    q-bio.QM cs.AI cs.CL cs.LG

    A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding

    Authors: Yiqing Shen, Zan Chen, Michail Mamalakis, Luhan He, Haiyang Xia, Tianbin Li, Yanzhou Su, Junjun He, Yu Guang Wang

    Abstract: The parallels between protein sequences and natural language in their sequential structures have inspired the application of large language models (LLMs) to protein understanding. Despite the success of LLMs in NLP, their effectiveness in comprehending protein sequences remains an open question, largely due to the absence of datasets linking protein sequences to descriptive text. Researchers have…

    Submitted 8 July, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  12. arXiv:2406.01636  [pdf]

    q-bio.QM cs.AI

    COVID-19: post infection implications in different age groups, mechanism, diagnosis, effective prevention, treatment, and recommendations

    Authors: Muhammad Akmal Raheem, Muhammad Ajwad Rahim, Ijaz Gul, Md. Reyad-ul-Ferdous, Liyan Le, Junguo Hui, Shuiwei Xia, Minjiang Chen, Dongmei Yu, Vijay Pandey, Peiwu Qin, Jiansong Ji

    Abstract: SARS-CoV-2, the highly contagious pathogen responsible for the COVID-19 pandemic, has persistent effects that begin four weeks after initial infection and last for an undetermined duration. These chronic effects are more harmful than acute ones. This review explores the long-term impact of the virus on various human organs, including the pulmonary, cardiovascular, neurological, reproductive, gastr…

    Submitted 2 June, 2024; originally announced June 2024.

  13. arXiv:2405.15544  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

    Authors: Zeyu Wang, Tianyi Jiang, Yao Lu, Xiaoze Bao, Shanqing Yu, Bin Wei, Qi Xuan

    Abstract: Recently, few-shot molecular property prediction (FSMPP) has garnered increasing attention. Despite impressive breakthroughs achieved by existing methods, they often overlook the inherent many-to-many relationships between molecules and properties, which limits their performance. For instance, similar substructures of molecules can inspire the exploration of new compounds. Additionally, the relati…

    Submitted 24 May, 2024; originally announced May 2024.

  14. arXiv:2405.12645  [pdf, other]

    q-bio.NC

    Implementing feature binding through dendritic networks of a single neuron

    Authors: Yuanhong Tang, Shanshan Jia, Tiejun Huang, Zhaofei Yu, Jian K. Liu

    Abstract: A single neuron receives an extensive array of synaptic inputs through its dendrites, raising the fundamental question of how these inputs undergo integration and summation, culminating in the initiation of spikes in the soma. Experimental and computational investigations have revealed various modes of integration operations that include linear, superlinear, and sublinear summation. Interestingly,…

    Submitted 21 May, 2024; originally announced May 2024.

  15. arXiv:2405.12519  [pdf, other]

    cs.LG cs.AI q-bio.QM

    MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation

    Authors: Zhaoning Yu, Hongyang Gao

    Abstract: Graph Neural Networks (GNNs) have shown remarkable success in molecular tasks, yet their interpretability remains challenging. Traditional model-level explanation methods like XGNN and GNNInterpreter often fail to identify valid substructures like rings, leading to questionable interpretability. This limitation stems from XGNN's atom-by-atom approach and GNNInterpreter's reliance on average graph…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2405.08419

  16. arXiv:2405.12144  [pdf]

    q-bio.NC

    Alterations of electrocortical activity during hand movements induced by motor cortex glioma

    Authors: Yihan Wu, Tao Chang, Siliang Chen, Xiaodong Niu, Yu Li, Yuan Fang, Lei Yang, Yixuan Zong, Yaoxin Yang, Yuehua Li, Mengsong Wang, Wen Yang, Yixuan Wu, Chen Fu, Xia Fang, Yuxin Quan, Xilin Peng, Qiang Sun, Marc M. Van Hulle, Yanhui Liu, Ning Jiang, Dario Farina, Yuan Yang, Jiayuan He, Qing Mao

    Abstract: Glioma cells can reshape functional neuronal networks by hijacking neuronal synapses, leading to partial or complete neurological dysfunction. These mechanisms have been previously explored for language functions. However, the impact of glioma on sensorimotor functions is still unknown. Therefore, we recruited a control group of patients with unaffected motor cortex and a group of patients with gl…

    Submitted 20 May, 2024; originally announced May 2024.

  17. arXiv:2405.06659  [pdf, other]

    q-bio.BM cs.AI cs.LG physics.chem-ph

    ControlMol: Adding Substruture Control To Molecule Diffusion Models

    Authors: Qi Zhengyang, Liu Zijing, Zhang Jiying, Cao He, Li Yu

    Abstract: Designing new molecules is an important task in the field of pharmaceuticals. Due to the vast design space of molecules, generating molecules conditioned on a specific sub-structure relevant to a particular function or therapeutic target is a crucial task in computer-aided drug design. In this paper, we present ControlMol, which adds sub-structure control to molecule generation with diffusion mode…

    Submitted 22 April, 2024; originally announced May 2024.

    Comments: 9 pages, 7 figures

  18. arXiv:2405.06658  [pdf, other]

    q-bio.BM cs.AI cs.LG

    ProteinEngine: Empower LLM with Domain Knowledge for Protein Engineering

    Authors: Yiqing Shen, Outongyi Lv, Houying Zhu, Yu Guang Wang

    Abstract: Large language models (LLMs) have garnered considerable attention for their proficiency in tackling intricate tasks, particularly leveraging their capacities for zero-shot and in-context learning. However, their utility has been predominantly restricted to general tasks due to an absence of domain-specific knowledge. This constraint becomes particularly pertinent in the realm of protein engineerin…

    Submitted 20 April, 2024; originally announced May 2024.

  19. arXiv:2405.06653  [pdf, other]

    q-bio.BM cs.LG

    A unified cross-attention model for predicting antigen binding specificity to both HLA and TCR molecules

    Authors: Chenpeng Yu, Xing Fang, Hui Liu

    Abstract: Immune checkpoint inhibitors have demonstrated promising clinical efficacy across various tumor types, yet the percentage of patients who benefit from them remains low. The binding affinity between antigens and HLA-I/TCR molecules plays a critical role in antigen presentation and T-cell activation. Some computational methods have been developed to predict antigen-HLA or antigen-TCR binding spe…

    Submitted 8 April, 2024; originally announced May 2024.

  20. arXiv:2405.06511  [pdf, other]

    q-bio.QM cs.AI

    Towards Less Biased Data-driven Scoring with Deep Learning-Based End-to-end Database Search in Tandem Mass Spectrometry

    Authors: Yonghan Yu, Ming Li

    Abstract: Peptide identification in mass spectrometry-based proteomics is crucial for understanding protein function and dynamics. Traditional database search methods, though widely used, rely on heuristic scoring functions, and statistical estimations have to be introduced to achieve a higher identification rate. Here, we introduce DeepSearch, the first deep learning-based end-to-end database search method for tan…

    Submitted 8 May, 2024; originally announced May 2024.

  21. arXiv:2405.05665  [pdf, other]

    cs.LG q-bio.QM

    SubGDiff: A Subgraph Diffusion Model to Improve Molecular Representation Learning

    Authors: Jiying Zhang, Zijing Liu, Yu Wang, Yu Li

    Abstract: Molecular representation learning has shown great success in advancing AI-based drug discovery. The core of many recent works is based on the fact that the 3D geometric structure of molecules provides essential information about their physical and chemical characteristics. Recently, denoising diffusion probabilistic models have achieved impressive performance in 3D molecular representation learnin…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 31 pages

  22. arXiv:2405.02853  [pdf]

    q-bio.OT

    Development and validation of a short form of the medication literacy scale for Chinese College Students

    Authors: Chen Zhenzhen, Ren Jiabao, Duan Tingyu, Chen Ke, Hou Ruyi, Li Yimiao, Zeng Leixiao, Meng Xiaoxuan, Wu Yibo, Liu Yu

    Abstract: Medication literacy is integral to health literacy, pivotal for medication safety and adherence. It denotes an individual's capacity to discern, comprehend, and convey medication-related information. Existing scales, however, are time-consuming and predominantly cater to patients and community dwellers, necessitating a more succinct instrument. This study presents the development of a brief Medica…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 25 pages, 3 figures, 3 tables

  23. arXiv:2405.02845  [pdf, other]

    cs.LG q-bio.MN

    Data-Efficient Molecular Generation with Hierarchical Textual Inversion

    Authors: Seojin Kim, Jaehyun Nam, Sihyun Yu, Younghoon Shin, Jinwoo Shin

    Abstract: Developing an effective molecular generation framework even with a limited number of molecules is often important for its practical deployment, e.g., drug discovery, since acquiring task-related molecular data requires expensive and time-consuming experimental costs. To tackle this issue, we introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecula…

    Submitted 5 May, 2024; originally announced May 2024.

  24. arXiv:2405.00513  [pdf]

    q-bio.QM

    3D MR Fingerprinting for Dynamic Contrast-Enhanced Imaging of Whole Mouse Brain

    Authors: Yuran Zhu, Guanhua Wang, Yuning Gu, Walter Zhao, Jiahao Lu, Junqing Zhu, Christina J. MacAskill, Andrew Dupuis, Mark A. Griswold, Dan Ma, Chris A. Flask, Xin Yu

    Abstract: Quantitative MRI enables direct quantification of contrast agent concentrations in contrast-enhanced scans. However, the lengthy scan times required by conventional methods are inadequate for tracking contrast agent transport dynamically in mouse brain. We developed a 3D MR fingerprinting (MRF) method for simultaneous T1 and T2 mapping across the whole mouse brain with 4.3-min temporal resolution.…

    Submitted 1 May, 2024; originally announced May 2024.

  25. arXiv:2404.18443  [pdf, other]

    cs.CL cs.AI cs.IR q-bio.QM

    BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers

    Authors: Ran Xu, Wenqi Shi, Yue Yu, Yuchen Zhuang, Yanqiao Zhu, May D. Wang, Joyce C. Ho, Chao Zhang, Carl Yang

    Abstract: Developing effective biomedical retrieval models is important for excelling at knowledge-intensive biomedical tasks but still challenging due to the deficiency of sufficient publicly annotated biomedical data and computational resources. We present BMRetriever, a series of dense retrievers for enhancing biomedical retrieval via unsupervised pre-training on large biomedical corpora, followed by ins…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Work in progress. The model and data will be uploaded to https://github.com/ritaranx/BMRetriever

  26. arXiv:2404.16880  [pdf, other]

    q-bio.QM cs.AI cs.CL

    Atomas: Hierarchical Alignment on Molecule-Text for Unified Molecule Understanding and Generation

    Authors: Yikun Zhang, Geyan Ye, Chaohao Yuan, Bo Han, Long-Kai Huang, Jianhua Yao, Wei Liu, Yu Rong

    Abstract: Molecule-and-text cross-modal representation learning has emerged as a promising direction for enhancing the quality of molecular representation, thereby improving performance in various scientific fields, including drug discovery and materials science. Existing studies adopt a global alignment approach to learn the knowledge from different modalities. These global alignment approaches fail to cap…

    Submitted 23 April, 2024; originally announced April 2024.

  27. arXiv:2404.16866  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Functional Protein Design with Local Domain Alignment

    Authors: Chaohao Yuan, Songyou Li, Geyan Ye, Yikun Zhang, Long-Kai Huang, Wenbing Huang, Wei Liu, Jianhua Yao, Yu Rong

    Abstract: The core challenge of de novo protein design lies in creating proteins with specific functions or properties, guided by certain conditions. Current models attempt to generate proteins using structural and evolutionary guidance, which only provide indirect conditions concerning functions and properties. However, textual annotations of proteins, especially the annotations for protein domains, which d…

    Submitted 27 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  28. arXiv:2404.14850  [pdf, other]

    cs.CL cs.LG q-bio.BM

    Simple, Efficient and Scalable Structure-aware Adapter Boosts Protein Language Models

    Authors: Yang Tan, Mingchen Li, Bingxin Zhou, Bozitao Zhong, Lirong Zheng, Pan Tan, Ziyi Zhou, Huiqun Yu, Guisheng Fan, Liang Hong

    Abstract: Fine-tuning pre-trained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. As a widely applied powerful technique in natural language processing, employing Parameter-Efficient Fine-Tuning techniques could potentially enhance the performance of PLMs. However, the direct transfe…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 30 pages, 4 figures, 8 tables

  29. arXiv:2404.11199  [pdf, other]

    q-bio.BM

    RiboDiffusion: Tertiary Structure-based RNA Inverse Folding with Generative Diffusion Models

    Authors: Han Huang, Ziqian Lin, Dongchen He, Liang Hong, Yu Li

    Abstract: RNA design shows growing applications in synthetic biology and therapeutics, driven by the crucial role of RNA in various biological processes. A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem. Computational approaches have emerged to address this problem based on secondary structures. However, designing RNA…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 15 pages

  30. arXiv:2404.08027  [pdf, other]

    cs.CV cs.AI cs.LG q-bio.QM

    SurvMamba: State Space Model with Multi-grained Multi-modal Interaction for Survival Prediction

    Authors: Ying Chen, Jiajing Xie, Yuxiang Lin, Yuhang Song, Wenxian Yang, Rongshan Yu

    Abstract: Multi-modal learning that combines pathological images with genomic data has significantly enhanced the accuracy of survival prediction. Nevertheless, existing methods have not fully utilized the inherent hierarchical structure within both whole slide images (WSIs) and transcriptomic data, from which better intra-modal representations and inter-modal integration could be derived. Moreover, many ex…

    Submitted 11 April, 2024; originally announced April 2024.

  31. arXiv:2404.06691  [pdf]

    q-bio.BM cs.LG cs.NE

    Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation

    Authors: Ningfeng Liu, Jie Yu, Siyu Xiu, Xinfang Zhao, Siyu Lin, Bo Qiang, Ruqiu Zheng, Hongwei Jin, Liangren Zhang, Zhenming Liu

    Abstract: Molecular generation, an essential method for identifying new drug structures, has been supported by advancements in machine learning and computational technology. However, challenges remain in multi-objective generation, model adaptability, and practical application in drug discovery. In this study, we developed a versatile 'plug-in' molecular generation model that incorporates multiple objective…

    Submitted 9 April, 2024; originally announced April 2024.

  32. arXiv:2404.00111  [pdf, other]

    q-bio.NC

    Variational design of sensory feedback for powerstroke-recovery systems

    Authors: Zhuojun Yu, Peter J. Thomas

    Abstract: Although the raison d'etre of the brain is the survival of the body, there are relatively few theoretical studies of closed-loop rhythmic motor control systems. In this paper we provide a unified framework, based on variational analysis, for investigating the dual goals of performance and robustness in powerstroke-recovery systems. We augment two previously published closed-loop motor control mode…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: 48 pages, 17 figures, 3 tables

  33. arXiv:2404.00044  [pdf, other]

    physics.chem-ph cs.AI cs.LG q-bio.QM

    UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment

    Authors: Kaipeng Zeng, Bo Yang, Xin Zhao, Yu Zhang, Fan Nie, Xiaokang Yang, Yaohui Jin, Yanyan Xu

    Abstract: Motivation: Retrosynthesis planning poses a formidable challenge in the organic chemical industry. Single-step retrosynthesis prediction, a crucial step in the planning process, has witnessed a surge in interest in recent years due to advancements in AI for science. Various deep learning-based methods have been proposed for this task in recent years, incorporating diverse levels of additional chem…

    Submitted 19 April, 2024; v1 submitted 24 March, 2024; originally announced April 2024.

  34. arXiv:2404.00014  [pdf]

    physics.chem-ph cs.AI q-bio.BM

    Deep Geometry Handling and Fragment-wise Molecular 3D Graph Generation

    Authors: Odin Zhang, Yufei Huang, Shichen Cheng, Mengyao Yu, Xujun Zhang, Haitao Lin, Yundian Zeng, Mingyang Wang, Zhenxing Wu, Huifeng Zhao, Zaixi Zhang, Chenqing Hua, Yu Kang, Sunliang Cui, Peichen Pan, Chang-Yu Hsieh, Tingjun Hou

    Abstract: Most earlier 3D structure-based molecular generation approaches follow an atom-wise paradigm, incrementally adding atoms to a partially built molecular fragment within protein pockets. These methods, while effective in designing tightly bound ligands, often overlook other essential properties such as synthesizability. The fragment-wise generation paradigm offers a promising solution. However, a co…

    Submitted 15 March, 2024; originally announced April 2024.

  35. arXiv:2403.14481  [pdf]

    q-bio.QM

    covSTATIS: a multi-table technique for network neuroscience

    Authors: Giulia Baracchini, Ju-Chi Yu, Jenny Rieck, Derek Beaton, Vincent Guillemot, Cheryl Grady, Herve Abdi, R. Nathan Spreng

    Abstract: Similarity analyses between multiple correlation or covariance tables constitute the cornerstone of network neuroscience. Here, we introduce covSTATIS, a versatile, linear, unsupervised multi-table method designed to identify structured patterns in multi-table data, and allow for the simultaneous extraction and interpretation of both individual and group-level features. With covSTATIS, multiple si…

    Submitted 14 April, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: The first two authors contributed equally to this work

  36. arXiv:2403.13862  [pdf, other]

    q-bio.MN math.OC

    A necessary condition for non-monotonic dose response, with an application to a kinetic proofreading model -- Extended version

    Authors: Polly Y. Yu, Eduardo D. Sontag

    Abstract: Steady state non-monotonic ("biphasic") dose responses are often observed in experimental biology, which raises the control-theoretic question of identifying which possible mechanisms might underlie such behaviors. It is well known that the presence of an incoherent feedforward loop (IFFL) in a network may give rise to a non-monotonic response. It has been conjectured that this condition is also n…

    Submitted 18 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: Appendix included

  37. arXiv:2403.13829  [pdf, other]

    q-bio.BM cs.LG

    DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization

    Authors: Xiangxin Zhou, Xiwei Cheng, Yuwei Yang, Yu Bao, Liang Wang, Quanquan Gu

    Abstract: Recently, 3D generative models have shown promising performances in structure-based drug design by learning to generate ligands given target binding sites. However, only modeling the target-ligand distribution can hardly fulfill one of the main goals in drug discovery -- designing novel ligands with desired properties, e.g., high binding affinity, easily synthesizable, etc. This challenge becomes…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024

  38. arXiv:2403.08192  [pdf, other]

    cs.CL q-bio.BM

    MoleculeQA: A Dataset to Evaluate Factual Accuracy in Molecular Comprehension

    Authors: Xingyu Lu, He Cao, Zijing Liu, Shengyuan Bai, Leqing Chen, Yuan Yao, Hai-Tao Zheng, Yu Li

    Abstract: Large language models are playing an increasingly significant role in molecular research, yet existing models often generate erroneous information, posing challenges to accurate molecular comprehension. Traditional evaluation metrics for generated content fail to assess a model's accuracy in molecular understanding. To rectify the absence of factual evaluation, we present MoleculeQA, a novel quest…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 19 pages, 8 figures

  39. arXiv:2403.07902  [pdf, other]

    q-bio.BM cs.LG

    DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design

    Authors: Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, Quanquan Gu

    Abstract: Designing 3D ligands within a target binding site is a fundamental task in drug discovery. Existing structure-based drug design methods treat all ligand atoms equally, which ignores different roles of atoms in the ligand for drug design and can be less efficient for exploring the large drug-like molecule space. In this paper, inspired by the convention in pharmaceutical practice, we decompose the…

    Submitted 26 February, 2024; originally announced March 2024.

    Comments: Accepted to ICML 2023

  40. arXiv:2403.06940  [pdf, other]

    eess.IV cs.LG q-bio.QM

    Conditional Score-Based Diffusion Model for Cortical Thickness Trajectory Prediction

    Authors: Qing Xiao, Siyeop Yoon, Hui Ren, Matthew Tivnan, Lichao Sun, Quanzheng Li, Tianming Liu, Yu Zhang, Xiang Li

    Abstract: Alzheimer's Disease (AD) is a neurodegenerative condition characterized by diverse progression rates among individuals, with changes in cortical thickness (CTh) closely linked to its progression. Accurately forecasting CTh trajectories can significantly enhance early diagnosis and intervention strategies, providing timely care. However, the longitudinal data essential for these studies often suffe…

    Submitted 11 March, 2024; originally announced March 2024.

  41. arXiv:2403.05762  [pdf, other]

    q-bio.NC

    Lateral Control of Brain-Controlled Vehicle Based on SVM Probability Output Model

    Authors: Hongguang Pan, Xinyu Yu, Yong Yang

    Abstract: The non-stationary characteristics of EEG signal and the individual differences of brain-computer interfaces (BCIs) lead to poor performance in the control process of the brain-controlled vehicles (BCVs). In this paper, by combining steady-state visual evoked potential (SSVEP) interactive interface, brain instructions generation module and vehicle lateral control module, a probabilistic output mod…

    Submitted 8 March, 2024; originally announced March 2024.

  42. arXiv:2403.04395  [pdf, other]

    q-bio.BM cs.CL

    SGNet: Folding Symmetrical Protein Complex with Deep Learning

    Authors: Zhaoqun Li, Jingcheng Yu, Qiwei Ye

    Abstract: Deep learning has made significant progress in protein structure prediction, advancing the development of computational biology. However, despite the high accuracy achieved in predicting single-chain structures, a significant number of large homo-oligomeric assemblies exhibit internal symmetry, posing a major challenge in structure determination. The performances of existing deep learning methods…

    Submitted 7 March, 2024; originally announced March 2024.

  43. arXiv:2403.03526  [pdf, other]

    eess.SP cs.LG q-bio.NC

    FingerNet: EEG Decoding of A Fine Motor Imagery with Finger-tapping Task Based on A Deep Neural Network

    Authors: Young-Min Go, Seong-Hyun Yu, Hyeong-Yeong Park, Minji Lee, Ji-Hoon Jeong

    Abstract: Brain-computer interface (BCI) technology facilitates communication between the human brain and computers, primarily utilizing electroencephalography (EEG) signals to discern human intentions. Although EEG-based BCI systems have been developed for paralysis individuals, ongoing studies explore systems for speech imagery and motor imagery (MI). This study introduces FingerNet, a specialized network…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 12 pages, 5 figures, and 2 tables

  44. arXiv:2403.01433  [pdf, other]

    cs.CE q-bio.NC

    BrainMass: Advancing Brain Network Analysis for Diagnosis with Large-scale Self-Supervised Learning

    Authors: Yanwu Yang, Chenfei Ye, Guinan Su, Ziyao Zhang, Zhikai Chang, Hairui Chen, Piu Chan, Yue Yu, Ting Ma

    Abstract: Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Due to the heterogeneity and hard-to-collect medical data, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines broad downstream tasks without the need for numerous costly annotations. However, there ha…

    Submitted 3 March, 2024; originally announced March 2024.

  45. arXiv:2403.00815  [pdf, other]

    cs.CL cs.AI cs.IR q-bio.OT

    RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records

    Authors: Ran Xu, Wenqi Shi, Yue Yu, Yuchen Zhuang, Bowen Jin, May D. Wang, Joyce C. Ho, Carl Yang

    Abstract: We present RAM-EHR, a Retrieval AugMentation pipeline to improve clinical predictions on Electronic Health Records (EHRs). RAM-EHR first collects multiple knowledge sources, converts them into text format, and uses dense retrieval to obtain information related to medical concepts. This strategy addresses the difficulties associated with complex names for the concepts. RAM-EHR then augments the loc…

    Submitted 4 June, 2024; v1 submitted 25 February, 2024; originally announced March 2024.

    Comments: ACL 2024

    Journal ref: ACL 2024

  46. arXiv:2402.18583  [pdf, other]

    q-bio.BM cs.LG

    Binding-Adaptive Diffusion Models for Structure-Based Drug Design

    Authors: Zhilin Huang, Ling Yang, Zaixi Zhang, Xiangxin Zhou, Yu Bao, Xiawu Zheng, Yuwei Yang, Yu Wang, Wenming Yang

    Abstract: Structure-based drug design (SBDD) aims to generate 3D ligand molecules that bind to specific protein targets. Existing 3D deep generative models including diffusion models have shown great promise for SBDD. However, it is complex to capture the essential protein-ligand interactions exactly in 3D space for molecular generation. To address this problem, we propose a novel framework, namely Binding-…

    Submitted 14 January, 2024; originally announced February 2024.

    Comments: Accepted by AAAI 2024. Project: https://github.com/YangLing0818/BindDM

  47. arXiv:2402.15515  [pdf]

    cs.AI q-bio.QM stat.AP

    Feasibility of Identifying Factors Related to Alzheimer's Disease and Related Dementia in Real-World Data

    Authors: Aokun Chen, Qian Li, Yu Huang, Yongqiu Li, Yu-neng Chuang, Xia Hu, Serena Guo, Yonghui Wu, Yi Guo, Jiang Bian

    Abstract: A comprehensive view of factors associated with AD/ADRD will significantly aid in studies to develop new treatments for AD/ADRD and identify high-risk populations and patients for prevention efforts. In our study, we summarized the risk factors for AD/ADRD by reviewing existing meta-analyses and review articles on risk and preventive factors for AD/ADRD. In total, we extracted 477 risk factors in…

    Submitted 3 February, 2024; originally announced February 2024.

  48. arXiv:2402.10387  [pdf, other]

    q-bio.BM cs.LG

    MFBind: a Multi-Fidelity Approach for Evaluating Drug Compounds in Practical Generative Modeling

    Authors: Peter Eckmann, Dongxia Wu, Germano Heinzelmann, Michael K Gilson, Rose Yu

    Abstract: Current generative models for drug discovery primarily use molecular docking to evaluate the quality of generated compounds. However, such models are often not useful in practice because even compounds with high docking scores do not consistently show experimental activity. More accurate methods for activity prediction exist, such as molecular dynamics based binding free energy calculations, but t…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: 9 pages, 4 figures

  49. arXiv:2402.06772  [pdf, other]

    q-bio.QM cs.AI cs.CE cs.LG

    Retrosynthesis Prediction via Search in (Hyper) Graph

    Authors: Zixun Lan, Binjie Hong, Jiajun Zhu, Zuo Zeng, Zhenfu Liu, Limin Yu, Fei Ma

    Abstract: Predicting reactants from a specified core product stands as a fundamental challenge within organic synthesis, termed retrosynthesis prediction. Recently, semi-template-based methods and graph-edits-based methods have achieved good performance in terms of both interpretability and accuracy. However, due to their mechanisms these methods cannot predict complex reactions, e.g., reactions with multip…

    Submitted 9 February, 2024; originally announced February 2024.

  50. arXiv:2402.04286  [pdf]

    q-bio.QM cs.AI cs.LG

    Progress and Opportunities of Foundation Models in Bioinformatics

    Authors: Qing Li, Zhihang Hu, Yixuan Wang, Lei Li, Yimin Fan, Irwin King, Le Song, Yu Li

    Abstract: Bioinformatics has witnessed a paradigm shift with the increasing integration of artificial intelligence (AI), particularly through the adoption of foundation models (FMs). These AI techniques have rapidly advanced, addressing historical challenges in bioinformatics such as the scarcity of annotated data and the presence of data noise. FMs are particularly adept at handling large-scale, unlabeled…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 27 pages, 3 figures, 2 tables

    MSC Class: cs.CL; 92-02

    ACM Class: I.2.1