-
Unraveling cell-cell communication with NicheNet by inferring active ligands from transcriptomics data
Authors:
Chananchida Sang-aram,
Robin Browaeys,
Ruth Seurinck,
Yvan Saeys
Abstract:
Ligand-receptor interactions constitute a fundamental mechanism of cell-cell communication and signaling. NicheNet is a well-established computational tool that infers ligand-receptor interactions that potentially regulate gene expression changes in receiver cell populations. Whereas the original publication delves into the algorithm and validation, this paper describes a best practices workflow cultivated over four years of experience and user feedback. Starting from the input single-cell expression matrix, we describe a "sender-agnostic" approach which considers ligands from the entire microenvironment, and a "sender-focused" approach which only considers ligands from cell populations of interest. As output, users will obtain a list of prioritized ligands and their potential target genes, along with multiple visualizations. In NicheNet v2, we have updated the data sources and implemented a downstream procedure for prioritizing cell-type-specific ligand-receptor pairs. Although a standard NicheNet analysis takes less than 10 minutes to run, users often invest additional time in making decisions about the approach and parameters that best suit their biological question. This paper serves to aid in this decision-making process by describing the most appropriate workflow for common experimental designs like case-control and cell differentiation studies. Finally, in addition to the step-by-step description of the code, we also provide wrapper functions that enable the analysis to be run in one line of code, thus tailoring the workflow to users at all levels of computational proficiency.
Submitted 25 April, 2024;
originally announced April 2024.
-
Essential guidelines for computational method benchmarking
Authors:
Lukas M. Weber,
Wouter Saelens,
Robrecht Cannoodt,
Charlotte Soneson,
Alexander Hapfelmeier,
Paul P. Gardner,
Anne-Laure Boulesteix,
Yvan Saeys,
Mark D. Robinson
Abstract:
In computational biology and other sciences, researchers are frequently faced with a choice between several computational methods for performing data analyses. Benchmarking studies aim to rigorously compare the performance of different methods using well-characterized benchmark datasets, to determine the strengths of each method or to provide recommendations regarding suitable choices of methods for an analysis. However, benchmarking studies must be carefully designed and implemented to provide accurate, unbiased, and informative results. Here, we summarize key practical guidelines and recommendations for performing high-quality benchmarking analyses, based on our experiences in computational biology.
Submitted 3 June, 2019; v1 submitted 3 December, 2018;
originally announced December 2018.
-
Interpretable Convolutional Neural Networks for Effective Translation Initiation Site Prediction
Authors:
Jasper Zuallaert,
Mijung Kim,
Yvan Saeys,
Wesley De Neve
Abstract:
Thanks to rapidly evolving sequencing techniques, the amount of genomic data at our disposal is growing ever larger. Determining gene structure is a fundamental requirement for effectively interpreting gene function and regulation. An important part of that process is the identification of translation initiation sites. In this paper, we propose a novel approach for automatic prediction of translation initiation sites, leveraging convolutional neural networks that allow for automatic feature extraction. Our experimental results demonstrate that we improve on state-of-the-art approaches, with a decrease of 75.2% in false positive rate and a decrease of 24.5% in error rate on the chosen datasets. Furthermore, an in-depth analysis of the decision-making process used by our predictive model shows that our neural network implicitly learns biologically relevant features from scratch, without any prior knowledge about the problem at hand; these features include the Kozak consensus sequence, the influence of stop and start codons in the sequence, and the presence of donor splice site patterns. In summary, our findings yield a better understanding of the internal reasoning of a convolutional neural network when it is applied to genomic data.
Submitted 27 November, 2017;
originally announced November 2017.
-
Validating module network learning algorithms using simulated data
Authors:
Tom Michoel,
Steven Maere,
Eric Bonnet,
Anagha Joshi,
Yvan Saeys,
Tim Van den Bulcke,
Koenraad Van Leemput,
Piet van Remortel,
Martin Kuiper,
Kathleen Marchal,
Yves Van de Peer
Abstract:
In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance. Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators.
Submitted 4 May, 2007;
originally announced May 2007.