Showing 1–13 of 13 results for author: Song, L

Searching in archive q-bio.
  1. arXiv:2406.05347  [pdf, other]

    q-bio.BM cs.AI cs.LG

    MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training

    Authors: Bo Chen, Zhilei Bei, Xingyi Cheng, Pan Li, Jie Tang, Le Song

    Abstract: Multiple Sequence Alignment (MSA) plays a pivotal role in unveiling the evolutionary trajectories of protein families. The accuracy of protein structure predictions is often compromised for protein sequences that lack sufficient homologous information to construct high quality MSA. Although various methods have been proposed to generate virtual MSA under these conditions, they fall short in compre…

    Submitted 10 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  2. arXiv:2405.10345  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Machine Learning Driven Biomarker Selection for Medical Diagnosis

    Authors: Divyagna Bavikadi, Ayushi Agarwal, Shashank Ganta, Yunro Chung, Lusheng Song, Ji Qiu, Paulo Shakarian

    Abstract: Recent advances in experimental methods have enabled researchers to collect data on thousands of analytes simultaneously. This has led to correlational studies that associated molecular measurements with diseases such as Alzheimer's, Liver, and Gastric Cancer. However, the use of thousands of biomarkers selected from the analytes is not practical for real-world medical diagnosis and is likely unde…

    Submitted 15 May, 2024; originally announced May 2024.

  3. arXiv:2402.04286  [pdf]

    q-bio.QM cs.AI cs.LG

    Progress and Opportunities of Foundation Models in Bioinformatics

    Authors: Qing Li, Zhihang Hu, Yixuan Wang, Lei Li, Yimin Fan, Irwin King, Le Song, Yu Li

    Abstract: Bioinformatics has witnessed a paradigm shift with the increasing integration of artificial intelligence (AI), particularly through the adoption of foundation models (FMs). These AI techniques have rapidly advanced, addressing historical challenges in bioinformatics such as the scarcity of annotated data and the presence of data noise. FMs are particularly adept at handling large-scale, unlabeled…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 27 pages, 3 figures, 2 tables

    MSC Class: cs.CL; 92-02

    ACM Class: I.2.1

  4. arXiv:2401.06199  [pdf, other]

    q-bio.QM cs.AI cs.LG

    xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein

    Authors: Bo Chen, Xingyi Cheng, Pan Li, Yangli-ao Geng, Jing Gong, Shen Li, Zhilei Bei, Xu Tan, Boyan Wang, Xin Zeng, Chiming Liu, Aohan Zeng, Yuxiao Dong, Jie Tang, Le Song

    Abstract: Protein language models have shown remarkable success in learning biological information from protein sequences. However, most existing models are limited by either autoencoding or autoregressive pre-training objectives, which makes them struggle to handle protein understanding and generation tasks concurrently. We propose a unified protein language model, xTrimoPGLM, to address these two types of…

    Submitted 11 January, 2024; originally announced January 2024.

  5. arXiv:2312.01186  [pdf, other]

    q-bio.BM

    Linker-Tuning: Optimizing Continuous Prompts for Heterodimeric Protein Prediction

    Authors: Shuxian Zou, Hui Li, Shentong Mo, Xingyi Cheng, Eric Xing, Le Song

    Abstract: Predicting the structure of interacting chains is crucial for understanding biological systems and developing new drugs. Large-scale pre-trained Protein Language Models (PLMs), such as ESM2, have shown impressive abilities in extracting biologically meaningful representations for protein structure prediction. In this paper, we show that ESMFold, which has been successful in computing accurate atom…

    Submitted 2 December, 2023; originally announced December 2023.

  6. arXiv:2311.15156  [pdf, other]

    cs.LG cs.AI q-bio.GN

    xTrimoGene: An Efficient and Scalable Representation Learner for Single-Cell RNA-Seq Data

    Authors: Jing Gong, Minsheng Hao, Xingyi Cheng, Xin Zeng, Chiming Liu, Jianzhu Ma, Xuegong Zhang, Taifeng Wang, Le Song

    Abstract: Advances in high-throughput sequencing technology have led to significant progress in measuring gene expressions at the single-cell level. The amount of publicly available single-cell RNA-seq (scRNA-seq) data is already surpassing 50M records for humans with each record measuring 20,000 genes. This highlights the need for unsupervised representation learning to fully ingest these data, yet classic…

    Submitted 24 February, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: Accepted by NeurIPS 2023

  7. arXiv:2301.05931  [pdf, other]

    cs.LG q-bio.QM

    Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning

    Authors: Zhihang Hu, Qinze Yu, Yucheng Guo, Taifeng Wang, Irwin King, Xin Gao, Le Song, Yu Li

    Abstract: Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation. However, identifying novel drug combinations through wet-lab experiments is resource intensive due to the vast combinatorial search space. Recently, computational approaches, specifically deep learning models have emerged as an efficient way to discover synergistic c…

    Submitted 14 January, 2023; originally announced January 2023.

  8. arXiv:2212.00735  [pdf, other]

    q-bio.QM cs.AI cs.CL cs.LG

    xTrimoABFold: De novo Antibody Structure Prediction without MSA

    Authors: Yining Wang, Xumeng Gong, Shaochuan Li, Bing Yang, YiWu Sun, Chuan Shi, Yangang Wang, Cheng Yang, Hui Li, Le Song

    Abstract: In the field of antibody engineering, an essential task is to design a novel antibody whose paratopes bind to a specific antigen with correct epitopes. Understanding antibody structure and its paratope can facilitate a mechanistic understanding of its function. Therefore, antibody structure prediction from its sequence alone has always been a highly valuable problem for de novo antibody design. Al…

    Submitted 4 May, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: 14 pages, 5 figures

  9. arXiv:2207.13921  [pdf, other]

    q-bio.BM cs.AI cs.LG q-bio.QM

    HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative

    Authors: Xiaomin Fang, Fan Wang, Lihang Liu, Jingzhou He, Dayong Lin, Yingfei Xiang, Xiaonan Zhang, Hua Wu, Hui Li, Le Song

    Abstract: AI-based protein structure prediction pipelines, such as AlphaFold2, have achieved near-experimental accuracy. These advanced pipelines mainly rely on Multiple Sequence Alignments (MSAs) as inputs to learn the co-evolution information from the homologous sequences. Nonetheless, searching MSAs from protein databases is time-consuming, usually taking dozens of minutes. Consequently, we attempt to ex…

    Submitted 21 February, 2023; v1 submitted 28 July, 2022; originally announced July 2022.

    Journal ref: Nature Machine Intelligence, 2023

  10. arXiv:1809.06676  [pdf]

    eess.SP q-bio.NC

    Reconfiguration of Brain Network between Resting-state and Oddball Paradigm

    Authors: Fali Li, Chanlin Yi, Yuanyuan Liao, Yuanling Jiang, Yajing Si, Limeng Song, Tao Zhang, Dezhong Yao, Yangsong Zhang, Zehong Cao, Peng Xu

    Abstract: The oddball paradigm is widely applied to the investigation of multiple cognitive functions. Prior studies have explored the cortical oscillation and power spectral differing from the resting-state conduction to oddball paradigm, but whether brain networks existing the significant difference is still unclear. Our study addressed how the brain reconfigures its architecture from a resting-state cond…

    Submitted 18 September, 2018; originally announced September 2018.

    Comments: This manuscript is submitting to IEEE Transactions on Cognitive and Developmental Systems

  11. arXiv:0901.0138  [pdf, other]

    q-bio.MN q-bio.QM stat.ML

    Time-Varying Networks: Recovering Temporally Rewiring Genetic Networks During the Life Cycle of Drosophila melanogaster

    Authors: Amr Ahmed, Le Song, Eric P. Xing

    Abstract: Due to the dynamic nature of biological systems, biological networks underlying temporal process such as the development of Drosophila melanogaster can exhibit significant topological changes to facilitate dynamic regulatory functions. Thus it is essential to develop methodologies that capture the temporal evolution of networks, which make it possible to study the driving forces underlying…

    Submitted 6 January, 2009; v1 submitted 31 December, 2008; originally announced January 2009.

    Comments: Correcting some figure formatting errors

    Report number: Amr Ahmed, Le Song, Eric Xing (2008). Time-Varying Networks: Reconstructing Temporally Rewiring Genetic Interactions During the Life Cycle of Drosophila melanogaster. CMU-MLD Technical Report CMU-ML-08-118

  12. arXiv:0901.0135  [pdf, ps, other]

    stat.ML q-bio.MN q-bio.QM stat.AP stat.ME

    A state-space mixed membership blockmodel for dynamic network tomography

    Authors: Eric P. Xing, Wenjie Fu, Le Song

    Abstract: In a dynamic social or biological environment, the interactions between the actors can undergo large and systematic changes. In this paper we propose a model-based approach to analyze what we will refer to as the dynamic tomography of such time-evolving networks. Our approach offers an intuitive but powerful tool to infer the semantic underpinnings of each actor, such as its social roles or biolog…

    Submitted 8 November, 2010; v1 submitted 31 December, 2008; originally announced January 2009.

    Comments: Published in at http://dx.doi.org/10.1214/09-AOAS311 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS311

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 2, 535-566

  13. arXiv:0812.5087  [pdf, ps, other]

    stat.ML q-bio.MN q-bio.QM stat.AP stat.ME

    Estimating time-varying networks

    Authors: Mladen Kolar, Le Song, Amr Ahmed, Eric P. Xing

    Abstract: Stochastic networks are a plausible representation of the relational information among entities in dynamic systems such as living cells or social communities. While there is a rich literature in estimating a static or temporally invariant network from observation data, little has been done toward estimating time-varying networks from time series of entity attributes. In this paper we present two n…

    Submitted 20 October, 2010; v1 submitted 30 December, 2008; originally announced December 2008.

    Comments: Published in at http://dx.doi.org/10.1214/09-AOAS308 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS308

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 1, 94-123