Showing 1–2 of 2 results for author: Ravuri, A

Search v0.5.6 released 2020-02-24

arXiv:2405.03879

stat.ML cs.LG q-bio.GN stat.AP

Scalable Amortized GPLVMs for Single Cell Transcriptomics Data

Authors: Sarah Zhao, Aditya Ravuri, Vidhi Lalchand, Neil D. Lawrence

Abstract: Dimensionality reduction is crucial for analyzing large-scale single-cell RNA-seq data. Gaussian Process Latent Variable Models (GPLVMs) offer an interpretable dimensionality reduction method, but current scalable models lack effectiveness in clustering cell types. We introduce an improved model, the amortized stochastic variational Bayesian GPLVM (BGPLVM), tailored for single-cell RNA-seq with specialized encoder, kernel, and likelihood designs. This model matches the performance of the leading single-cell variational inference (scVI) approach on synthetic and real-world COVID datasets and effectively incorporates cell-cycle and batch information to reveal more interpretable latent structures as we demonstrate on an innate immunity dataset.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2209.06716

cs.LG q-bio.GN stat.AP stat.ML

Modelling Technical and Biological Effects in scRNA-seq data with Scalable GPLVMs

Authors: Vidhi Lalchand, Aditya Ravuri, Emma Dann, Natsuhiko Kumasaka, Dinithi Sumanaweera, Rik G. H. Lindeboom, Shaista Madad, Sarah A. Teichmann, Neil D. Lawrence

Abstract: Single-cell RNA-seq datasets are growing in size and complexity, enabling the study of cellular composition changes in various biological/clinical contexts. Scalable dimensionality reduction techniques are in need to disentangle biological variation in them, while accounting for technical and biological confounders. In this work, we extend a popular approach for probabilistic non-linear dimensionality reduction, the Gaussian process latent variable model, to scale to massive single-cell datasets while explicitly accounting for technical and biological confounders. The key idea is to use an augmented kernel which preserves the factorisability of the lower bound allowing for fast stochastic variational inference. We demonstrate its ability to reconstruct latent signatures of innate immunity recovered in Kumasaka et al. (2021) with 9x lower training time. We further analyze a COVID dataset and demonstrate across a cohort of 130 individuals, that this framework enables data integration while capturing interpretable signatures of infection. Specifically, we explore COVID severity as a latent dimension to refine patient stratification and capture disease-specific gene expression.

Submitted 5 November, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

Comments: Machine Learning and Computational Biology Symposium (Oral), 2022

MSC Class: 92D99; 92C99; ACM Class: J.3; I.5
