-
SnakeLines: integrated set of computational pipelines for sequencing reads
Authors:
Jaroslav Budis,
Werner Krampl,
Marcel Kucharik,
Rastislav Hekel,
Adrian Goga,
Michal Lichvar,
David Smolak,
Miroslav Bohmer,
Andrej Balaz,
Frantisek Duris,
Juraj Gazdarica,
Katarina Soltys,
Jan Turna,
Jan Radvanszky,
Tomas Szemes
Abstract:
Background: With the rapid growth of massively parallel sequencing technologies, more and more laboratories are using sequenced DNA fragments for genomic analyses. Interpretation of sequencing data is, however, strongly dependent on bioinformatics processing, which is often too demanding for clinicians and researchers without a computational background. Another problem is the reproducibility of computational analyses across separate computational centers with inconsistent versions of installed libraries and bioinformatics tools.
Results: We propose an easily extensible set of computational pipelines, called SnakeLines, for processing sequencing reads, including mapping, assembly, variant calling, viral identification, transcriptomics, metagenomics, and methylation analysis. Individual steps of an analysis, along with methods and their parameters, can be readily modified in a single configuration file. The provided pipelines are embedded in virtual environments that ensure isolation of required resources from the host operating system, rapid deployment, and reproducibility of analysis across different Unix-based platforms.
Conclusion: SnakeLines is a powerful framework for the automation of bioinformatics analyses, with emphasis on a simple set-up, modifications, extensibility, and reproducibility.
Keywords: Computational pipeline, framework, massively parallel sequencing, reproducibility, virtual environment
Submitted 25 June, 2021;
originally announced June 2021.
-
WordPress on AWS: a Communication Framework
Authors:
Michael Soltys,
Katharine Soltys
Abstract:
Every organization needs to communicate with its audience, and social media is an attractive and inexpensive way to maintain dialogic communication. About one third of web pages on the Internet are powered by WordPress, and about a million companies have moved their IT infrastructure to the AWS cloud. Together, AWS and WordPress offer an attractive, effective, and inexpensive way for companies, both large and small, to maintain their presence on the web.
Submitted 3 July, 2020;
originally announced July 2020.
-
Scrabble is PSPACE-Complete
Authors:
Michael Lampis,
Valia Mitsou,
Karolina Sołtys
Abstract:
In this paper we study the computational complexity of the game of Scrabble. We prove the PSPACE-completeness of a derandomized model of the game, answering an open question of Erik Demaine and Robert Hearn.
Submitted 25 January, 2012;
originally announced January 2012.
-
Hierarchies of Inefficient Kernelizability
Authors:
Danny Hermelin,
Stefan Kratsch,
Karolina Sołtys,
Magnus Wahlström,
Xi Wu
Abstract:
The framework of Bodlaender et al. (ICALP 2008) and Fortnow and Santhanam (STOC 2008) allows us to exclude the existence of polynomial kernels for a range of problems under reasonable complexity-theoretical assumptions. However, there are also some issues that are not addressed by this framework, including the existence of Turing kernels such as the "kernelization" of Leaf Out Branching(k) into a disjunction over n instances of size poly(k). Observing that Turing kernels are preserved by polynomial parametric transformations, we define a kernelization hardness hierarchy, akin to the M- and W-hierarchy of ordinary parameterized complexity, by the PPT-closure of problems that seem likely to be fundamentally hard for efficient Turing kernelization. We find that several previously considered problems are complete for our fundamental hardness class, including Min Ones d-SAT(k), Binary NDTM Halting(k), Connected Vertex Cover(k), and Clique(k log n), the clique problem parameterized by k log n.
Submitted 5 October, 2011;
originally announced October 2011.
-
The hardness of Median in the synchronized bit communication model
Authors:
Karolina Sołtys
Abstract:
The synchronized bit communication model, defined recently by Impagliazzo and Williams in \emph{Communication complexity with synchronized clocks}, CCC '10, is a communication model which allows the participants to share a common clock. The main open problem posed in that paper was the following: does the synchronized bit model allow a logarithmic speed-up for all functions over the standard deterministic model of communication? We resolve this question in the negative by showing that the Median function, whose communication complexity is $O(\log n)$, does not admit a polytime synchronized bit protocol with communication complexity $O\left(\log^{1-ε} n\right)$ for any $ε > 0$. Our results follow from a new round-communication trade-off for the Median function in the standard model, which easily translates to its hardness in the synchronized bit model.
Submitted 6 February, 2011;
originally announced February 2011.