Showing 1–12 of 12 results for author: Koslicki, D

  1. arXiv:2212.01384  [pdf, other]

    q-bio.QM cs.LG

    KGML-xDTD: A Knowledge Graph-based Machine Learning Framework for Drug Treatment Prediction and Mechanism Description

    Authors: Chunyu Ma, Zhihan Zhou, Han Liu, David Koslicki

    Abstract: Background: Computational drug repurposing is a cost- and time-efficient approach that aims to identify new therapeutic targets or diseases (indications) of existing drugs/compounds. It is especially critical for emerging and/or orphan diseases due to its cheaper investment and shorter research cycle compared with traditional wet-lab drug discovery approaches. However, the underlying mechanisms of…

    Submitted 25 April, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

  2. arXiv:2003.00110  [pdf]

    q-bio.GN q-bio.QM

    Technology dictates algorithms: Recent developments in read alignment

    Authors: Mohammed Alser, Jeremy Rotman, Kodi Taraszka, Huwenbo Shi, Pelin Icer Baykal, Harry Taegyun Yang, Victor Xue, Sergey Knyazev, Benjamin D. Singer, Brunilda Balliu, David Koslicki, Pavel Skums, Alex Zelikovsky, Can Alkan, Onur Mutlu, Serghei Mangul

    Abstract: Massively parallel sequencing techniques have revolutionized biological and medical sciences by providing unprecedented insight into the genomes of humans, animals, and microbes. Modern sequencing platforms generate enormous amounts of genomic data in the form of nucleotide sequences or reads. Aligning reads onto reference genomes enables the identification of individual-specific genetic variants…

    Submitted 9 July, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

    Journal ref: Genome Biol. 2021 Aug 26;22(1):249

  3. arXiv:2001.08717  [pdf, other]

    q-bio.GN math.OC

    Finer Metagenomic Reconstruction via Biodiversity Optimization

    Authors: Simon Foucart, David Koslicki

    Abstract: When analyzing communities of microorganisms from their sequenced DNA, an important task is taxonomic profiling: enumerating the presence and relative abundance of all organisms, or merely of all taxa, contained in the sample. This task can be tackled via compressive-sensing-based approaches, which favor communities featuring the fewest organisms among those consistent with the observed DNA data.…

    Submitted 23 January, 2020; originally announced January 2020.

    MSC Class: 92D20; 90C26; 90C90

  4. arXiv:1911.11304  [pdf]

    q-bio.QM q-bio.GN

    Metagenomics for clinical diagnostics: technologies and informatics

    Authors: Caitlin Loeffler, Keylie M. Gibson, Lana Martin, Liz Chang, Jeremy Rotman, Ian V. Toma, Christopher E. Mason, Eleazar Eskin, Joseph P. Zackular, Keith A. Crandall, David Koslicki, Serghei Mangul

    Abstract: The human-associated microbiome is closely tied to human health and is of substantial clinical interest. Metagenomics-based tools are emerging for clinical diagnostics, tracking the spread of diseases, and surveillance of potential pathogens. In some cases, these tools are overcoming limitations of traditional clinical approaches. Metagenomics has limitations barring the tools from clinical valida…

    Submitted 7 August, 2020; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: 75 pages, 7 figures, 2 tables, 4 supplementary tables, review paper

  5. Substitution Markov chains and Martin boundaries

    Authors: David Koslicki, Manfred Denker

    Abstract: Substitution Markov chains have been introduced [7] as a new model to describe molecular evolution. In this note, we study the associated Martin boundaries from a probabilistic and topological viewpoint. An example is given that, although having a boundary homeomorphic to the well-known coin tossing process, has a metric description that differs significantly.

    Submitted 30 May, 2017; originally announced May 2017.

    Comments: 23 pages, 2 figures

    Journal ref: Rocky Mountain J. Math., Volume 46, Number 6 (2016), 1963-1985

  6. arXiv:1611.04634  [pdf, other]

    q-bio.QM

    EMDUnifrac: Exact Linear Time Computation of the Unifrac Metric and Identification of Differentially Abundant Organisms

    Authors: Jason McClelland, David Koslicki

    Abstract: Both the weighted and unweighted Unifrac distances have been very successfully employed to assess if two communities differ, but do not give any information about how two communities differ. We take advantage of recent observations that the Unifrac metric is equivalent to the so-called earth mover's distance (also known as the Kantorovich-Rubinstein metric) to develop an algorithm that not only co…

    Submitted 14 November, 2016; originally announced November 2016.

    MSC Class: 92-08

  7. arXiv:1610.07705  [pdf, other]

    q-bio.PE physics.soc-ph

    Exact probabilities for the indeterminacy of complex networks as perceived through press perturbations

    Authors: David Koslicki, Mark Novak

    Abstract: We consider the goal of predicting how complex networks respond to chronic (press) perturbations when characterizations of their network topology and interaction strengths are associated with uncertainty. Our primary result is the derivation of exact formulas for the expected number and probability of qualitatively incorrect predictions about a system's responses under uncertainties drawn from arb…

    Submitted 24 October, 2016; originally announced October 2016.

    Comments: 25 pages, 6 figures

    MSC Class: 92B05; 6008

    ACM Class: G.1.3; G.2.2; G.3

  8. arXiv:1602.05328  [pdf, other]

    q-bio.GN

    MetaPalette: A $k$-mer painting approach for metagenomic taxonomic profiling and quantification of novel strain variation

    Authors: David Koslicki, Daniel Falush

    Abstract: Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette which uses long $k$-mer sizes ($k=30, 50$) to fit a $k$-mer "palette" of a given sample to the $k$-mer palette of reference organisms. By m…

    Submitted 17 February, 2016; originally announced February 2016.

    Comments: 20 pages, 19 figures

    MSC Class: 92-08; 92B05

  9. SEK: Sparsity exploiting $k$-mer-based estimation of bacterial community composition

    Authors: Saikat Chatterjee, David Koslicki, Siyuan Dong, Nicolas Innocenti, Lu Cheng, Yueheng Lan, Mikko Vehkaperä, Mikael Skoglund, Lars K. Rasmussen, Erik Aurell, Jukka Corander

    Abstract: Motivation: Estimation of bacterial community composition from a high-throughput sequenced sample is an important task in metagenomics applications. Since the sample sequence data typically harbors reads of variable lengths and different levels of biological and technical noise, accurate statistical analysis of such data is challenging. Currently popular estimation methods are typically very time…

    Submitted 1 July, 2014; originally announced July 2014.

    Comments: 10 pages

  10. arXiv:1109.5999  [pdf, ps, other]

    math.DS physics.data-an q-bio.QM

    Coding Sequence Density Estimation Via Topological Pressure

    Authors: David Koslicki, Daniel J. Thompson

    Abstract: We give a new approach to coding sequence (CDS) density estimation in genomic analysis based on the topological pressure, which we develop from a well known concept in ergodic theory. Topological pressure measures the "weighted information content" of a finite word, and incorporates 64 parameters which can be interpreted as a choice of weight for each nucleotide triplet. We train the parameters so…

    Submitted 8 January, 2014; v1 submitted 27 September, 2011; originally announced September 2011.

    Comments: From v3, changes to typesetting only. The paper is accepted for publication in the Journal of Mathematical Biology

    MSC Class: 92D20; 37N25; 92-08; 37D35

  11. arXiv:1102.1897  [pdf, ps, other]

    q-bio.PE q-bio.QM

    Random Substitution-Insertion-Deletion (RSID) Model of Molecular Evolution with Alignment-free Parameter Estimation

    Authors: David Koslicki

    Abstract: We present a comprehensive new framework for handling biologically accurate models of molecular evolution. This model provides a systematic framework for studying models of molecular evolution that implement heterogeneous rates, conservation of reading frame, differing rates of insertion and deletion, customizable parametrization of the probabilities and types of substitutions, insertions, and del…

    Submitted 9 February, 2011; originally announced February 2011.

    Comments: 14 pages

  12. arXiv:1101.4636  [pdf, ps, other]

    q-bio.QM cond-mat.stat-mech

    Topological Entropy of DNA Sequences

    Authors: David Koslicki

    Abstract: Topological entropy has been one of the most difficult to implement of all the entropy-theoretic notions. This is primarily due to finite sample effects and high-dimensionality problems. In particular, topological entropy has been implemented in previous literature to conclude that entropy of exons is higher than of introns, thus implying that exons are more "random" than introns. We define a new…

    Submitted 24 January, 2011; originally announced January 2011.

    Comments: 16 pages, 9 figures, 2 tables