SnakeLines: integrated set of computational pipelines for sequencing reads
Authors:
Jaroslav Budis,
Werner Krampl,
Marcel Kucharik,
Rastislav Hekel,
Adrian Goga,
Michal Lichvar,
David Smolak,
Miroslav Bohmer,
Andrej Balaz,
Frantisek Duris,
Juraj Gazdarica,
Katarina Soltys,
Jan Turna,
Jan Radvanszky,
Tomas Szemes
Abstract:
Background: With the rapid growth of massively parallel sequencing technologies, more and more laboratories are using sequenced DNA fragments for genomic analyses. Interpretation of sequencing data is, however, strongly dependent on bioinformatics processing, which is often too demanding for clinicians and researchers without a computational background. Another problem is the reproducibility of computational analyses across separate computing centers with inconsistent versions of installed libraries and bioinformatics tools.
Results: We propose an easily extensible set of computational pipelines, called SnakeLines, for processing sequencing reads, including mapping, assembly, variant calling, viral identification, transcriptomics, metagenomics, and methylation analysis. Individual steps of an analysis, along with methods and their parameters, can be readily modified in a single configuration file. The provided pipelines are embedded in virtual environments that ensure isolation of required resources from the host operating system, rapid deployment, and reproducibility of analysis across different Unix-based platforms.
Conclusion: SnakeLines is a powerful framework for the automation of bioinformatics analyses, with emphasis on a simple set-up, modifications, extensibility, and reproducibility.
Keywords: Computational pipeline, framework, massively parallel sequencing, reproducibility, virtual environment
Submitted 25 June, 2021;
originally announced June 2021.
Innovative method for reducing uninformative calls in non-invasive prenatal testing
Authors:
Jaroslav Budis,
Juraj Gazdarica,
Jan Radvanszky,
Gabor Szucs,
Marcel Kucharik,
Lucia Strieskova,
Iveta Gazdaricova,
Maria Harsanyova,
Frantisek Duris,
Gabriel Minarik,
Martina Sekelska,
Balint Nagy,
Jan Turna,
Tomas Szemes
Abstract:
Non-invasive prenatal testing (NIPT) is currently among the most researched topics in obstetric care. While current state-of-the-art NIPT solutions achieve high sensitivity and specificity, they still struggle with a considerable number of samples that cannot be concluded with certainty. Such uninformative results often require repeated blood sampling and re-analysis, usually after two weeks; this waiting period may cause stress to expectant mothers and increase the overall cost of the test. We propose a method supplementary to traditional z-scores that reduces the number of such uninformative calls. The method is based on a novel analysis of the length profile of circulating cell-free DNA, which compares the change in such profiles when random-based and length-based elimination of some fragments is performed. The proposed method is not as accurate as the standard z-score; however, our results suggest that a combination of these two independent methods correctly resolves a substantial portion of healthy samples with an uninformative result. Additionally, we discuss how the proposed method can be used to identify maternal aberrations, thus reducing the risk of false positive and false negative calls.
Keywords: Next-generation sequencing, Cell-free DNA, Uninformative result, Method, Trisomy, Prenatal testing
Submitted 22 June, 2018;
originally announced June 2018.
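Note on the z-score mentioned in the abstract above: the following is a minimal sketch of how a per-chromosome z-score and an "uninformative" zone are commonly computed in NIPT, not the authors' implementation. The function names, the reference fractions, and the cut-offs (2.5 and 4.0) are illustrative assumptions that do not come from the paper, and the length-profile method itself is not reproduced here because the abstract does not specify it.

```python
from statistics import mean, stdev


def chromosome_z_score(sample_fraction, reference_fractions):
    """Z-score of a sample's read fraction for one chromosome.

    sample_fraction: share of the sample's reads mapped to the chromosome.
    reference_fractions: the same share measured in euploid reference samples.
    """
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference fractions have zero variance")
    return (sample_fraction - mu) / sigma


def classify(z, low=2.5, high=4.0):
    """Three-way call; the cut-offs are hypothetical, chosen for illustration."""
    if z < low:
        return "negative"
    if z >= high:
        return "positive"
    return "uninformative"


# Hypothetical read fractions for one chromosome in five euploid references.
reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0130]
z = chromosome_z_score(0.0132, reference)
print(classify(z))
```

A supplementary method such as the one proposed in the abstract would be applied only to samples that fall into the middle zone, which is why the sketch keeps that zone as an explicit third outcome.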