Showing 1–11 of 11 results for author: Zeh, N

Searching in archive cs.
  1. arXiv:2308.07809  [pdf, other]

    cs.DS

    Another virtue of wavelet forests?

    Authors: Christina Boucher, Travis Gagie, Aaron Hong, Yansong Li, Norbert Zeh

    Abstract: A wavelet forest for a text $T [1..n]$ over an alphabet of size $σ$ takes $n H_0 (T) + o (n \log σ)$ bits of space and supports access and rank on $T$ in $O (\log σ)$ time. Kärkkäinen and Puglisi (2011) implicitly introduced wavelet forests and showed that when $T$ is the Burrows-Wheeler Transform (BWT) of a string $S$, then a wavelet forest for $T$ occupies space bounded in terms of higher-order empirica…

    Submitted 15 August, 2023; originally announced August 2023.

  2. arXiv:2305.03240  [pdf, ps, other]

    cs.DS

    Sum-of-Local-Effects Data Structures for Separable Graphs

    Authors: Xing Lyu, Travis Gagie, Meng He, Yakov Nekrich, Norbert Zeh

    Abstract: It is not difficult to think of applications that can be modelled as graph problems in which placing some facility or commodity at a vertex has some positive or negative effect on the values of all the vertices out to some distance, and we want to be able to calculate quickly the cumulative effect on any vertex's value at any time or the list of the most beneficial or most detrimental effects on…

    Submitted 4 May, 2023; originally announced May 2023.

  3. arXiv:2211.00378  [pdf, ps, other]

    cs.DS

    A Near-Linear Kernel for Two-Parsimony Distance

    Authors: Elise Deen, Leo van Iersel, Remie Janssen, Mark Jones, Yuki Murakami, Norbert Zeh

    Abstract: The maximum parsimony distance $d_{\textrm{MP}}(T_1,T_2)$ and the bounded-state maximum parsimony distance $d_{\textrm{MP}}^t(T_1,T_2)$ measure the difference between two phylogenetic trees $T_1,T_2$ in terms of the maximum difference between their parsimony scores for any character (with $t$ a bound on the number of states in the character, in the case of $d_{\textrm{MP}}^t(T_1,T_2)$). While comp…

    Submitted 1 November, 2022; originally announced November 2022.

  4. arXiv:2001.11631  [pdf, ps, other]

    cs.IR cs.CL cs.LG

    Enhancement of Short Text Clustering by Iterative Classification

    Authors: Md Rashadul Hasan Rakib, Norbert Zeh, Magdalena Jankowska, Evangelos Milios

    Abstract: Short text clustering is a challenging task due to the lack of signal contained in such short texts. In this work, we propose iterative classification as a method to boost the clustering quality (e.g., accuracy) of short texts. Given a clustering of short texts obtained using an arbitrary clustering algorithm, iterative classification applies outlier removal to obtain outlier-free clusters. Then…

    Submitted 30 January, 2020; originally announced January 2020.

    Comments: 30 pages, 2 figures

  5. arXiv:1907.08474  [pdf, other]

    cs.DM math.CO q-bio.PE

    A Practical Fixed-Parameter Algorithm for Constructing Tree-Child Networks from Multiple Binary Trees

    Authors: Leo van Iersel, Remie Janssen, Mark Jones, Yukihiro Murakami, Norbert Zeh

    Abstract: We present the first fixed-parameter algorithm for constructing a tree-child phylogenetic network that displays an arbitrary number of binary input trees and has the minimum number of reticulations among all such networks. The algorithm uses the recently introduced framework of cherry picking sequences and runs in $O((8k)^k \mathrm{poly}(n, t))$ time, where $n$ is the number of leaves of every tre…

    Submitted 19 July, 2019; originally announced July 2019.

    Comments: Code available at https://github.com/nzeh/tree_child_code

    MSC Class: 92D15 05C90 05C85 56R10

  6. arXiv:1402.2136  [pdf, other]

    cs.DS

    Hybridization Number on Three Rooted Binary Trees is EPT

    Authors: Leo van Iersel, Steven Kelk, Nela Lekić, Chris Whidden, Norbert Zeh

    Abstract: Phylogenetic networks are leaf-labelled directed acyclic graphs that are used to describe non-treelike evolutionary histories and are thus a generalization of phylogenetic trees. The hybridization number of a phylogenetic network is the sum of all indegrees minus the number of nodes plus one. The Hybridization Number problem takes as input a collection of phylogenetic trees and asks to construct a…

    Submitted 31 May, 2016; v1 submitted 10 February, 2014; originally announced February 2014.

  7. QuPARA: Query-Driven Large-Scale Portfolio Aggregate Risk Analysis on MapReduce

    Authors: Andrew Rau-Chaplin, Blesson Varghese, Duane Wilson, Zhimin Yao, Norbert Zeh

    Abstract: Stochastic simulation techniques are used for portfolio risk analysis. Risk portfolios may consist of thousands of reinsurance contracts covering millions of insured locations. To quantify risk each portfolio must be evaluated in up to a million simulation trials, each capturing a different possible sequence of catastrophic events over the course of a contractual year. In this paper, we explore th…

    Submitted 16 August, 2013; originally announced August 2013.

    Comments: 9 pages, IEEE International Conference on Big Data (BigData), Santa Clara, USA, 2013

  8. arXiv:1305.0512  [pdf, other]

    cs.DS q-bio.PE

    Fixed-Parameter and Approximation Algorithms for Maximum Agreement Forests of Multifurcating Trees

    Authors: Chris Whidden, Robert G. Beiko, Norbert Zeh

    Abstract: We present efficient algorithms for computing a maximum agreement forest (MAF) of a pair of multifurcating (nonbinary) rooted trees. Our algorithms match the running times of the currently best algorithms for the binary case. The size of an MAF corresponds to the subtree prune-and-regraft (SPR) distance of the two trees and is intimately connected to their hybridization number. These distance meas…

    Submitted 2 May, 2013; originally announced May 2013.

    Comments: 28 pages, 7 figures

  9. arXiv:1108.2664  [pdf, other]

    q-bio.PE cs.DS

    Fixed-Parameter and Approximation Algorithms for Maximum Agreement Forests

    Authors: Chris Whidden, Robert G. Beiko, Norbert Zeh

    Abstract: We present new and improved fixed-parameter algorithms for computing maximum agreement forests (MAFs) of pairs of rooted binary phylogenetic trees. The size of such a forest for two trees corresponds to their subtree prune-and-regraft distance and, if the agreement forest is acyclic, to their hybridization number. These distance measures are essential tools for understanding reticulate evolution.…

    Submitted 2 May, 2013; v1 submitted 12 August, 2011; originally announced August 2011.

    Comments: 36 pages, 9 figures. Removed the Approximation and TBR sections and simplified the Hybridization section. To appear in SIAM Journal on Computing

  10. arXiv:0805.1661  [pdf, ps, other]

    cs.DS

    NAPX: A Polynomial Time Approximation Scheme for the Noah's Ark Problem

    Authors: G. Hickey, P. Carmi, A. Maheshwari, N. Zeh

    Abstract: The Noah's Ark Problem (NAP) is an NP-Hard optimization problem with relevance to ecological conservation management. It asks to maximize the phylogenetic diversity (PD) of a set of taxa given a fixed budget, where each taxon is associated with a cost of conservation and a probability of extinction. NAP has received renewed interest with the rise in availability of genetic sequence data, allowin…

    Submitted 27 October, 2008; v1 submitted 12 May, 2008; originally announced May 2008.

    ACM Class: J.3

  11. arXiv:0711.0114  [pdf, other]

    cs.CG

    Geometric Spanners With Small Chromatic Number

    Authors: Prosenjit Bose, Paz Carmi, Mathieu Couture, Anil Maheshwari, Michiel Smid, Norbert Zeh

    Abstract: Given an integer $k \geq 2$, we consider the problem of computing the smallest real number $t(k)$ such that for each set $P$ of points in the plane, there exists a $t(k)$-spanner for $P$ that has chromatic number at most $k$. We prove that $t(2) = 3$, $t(3) = 2$, $t(4) = \sqrt{2}$, and give upper and lower bounds on $t(k)$ for $k>4$. We also show that for any $ε>0$, there exists a $(1+ε)t(k)$-sp…

    Submitted 1 November, 2007; originally announced November 2007.

    Report number: TR-07-15