
Showing 1–2 of 2 results for author: Elworth, R A L

Searching in archive cs.
  1. arXiv:1910.04358

    q-bio.GN cs.IR

    Fast Processing and Querying of 170TB of Genomics Data via a Repeated And Merged BloOm Filter (RAMBO)

    Authors: Gaurav Gupta, Minghao Yan, Benjamin Coleman, Bryce Kille, R. A. Leo Elworth, Tharun Medini, Todd Treangen, Anshumali Shrivastava

    Abstract: DNA sequencing, especially of microbial genomes and metagenomes, has been at the core of recent research advances in large-scale comparative genomics. The data deluge has resulted in exponential growth in genomic datasets over the past years and has shown no sign of slowing down. Several recent attempts have been made to tame the computational burden of sequence search on these terabyte and petaby…

    Submitted 30 April, 2022; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: 9 pages

  2. arXiv:1910.02611

    cs.DS cs.IR

    RAMBO: Repeated And Merged BloOm Filter for Ultra-fast Multiple Set Membership Testing (MSMT) on Large-Scale Data

    Authors: Gaurav Gupta, Minghao Yan, Benjamin Coleman, R. A. Leo Elworth, Tharun Medini, Todd Treangen, Anshumali Shrivastava

    Abstract: Multiple Set Membership Testing (MSMT) is a well-known problem in a variety of search and query applications. Given a dataset of K different sets and a query q, it aims to find all of the sets containing the query. Trivially, an MSMT instance can be reduced to K membership testing instances, each with the same q, leading to O(K) query time with a simple array of Bloom Filters. We propose a data-st…

    Submitted 17 July, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: 14 pages, 5 figures
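
The abstract of entry 2 describes a trivial baseline for MSMT: keep one Bloom filter per set and probe all K of them with the same query q. The sketch below illustrates that baseline only, not the RAMBO structure itself, whose description is cut off in the listing. The class `BloomFilter`, the helpers `build_filters` and `msmt_query`, the filter size, and the salted SHA-256 hashing are all assumptions chosen for illustration; none of them come from the papers.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; size and hash count are illustrative choices."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True if every position is set; false positives are possible,
        # false negatives are not.
        return all(self.bits[pos] for pos in self._positions(item))


def build_filters(sets):
    """Build one Bloom filter per input set (K filters for K sets)."""
    filters = []
    for s in sets:
        bf = BloomFilter()
        for item in s:
            bf.add(item)
        filters.append(bf)
    return filters


def msmt_query(filters, q):
    """Return the indices of all sets whose filter reports q (K probes)."""
    return [k for k, bf in enumerate(filters) if q in bf]


if __name__ == "__main__":
    # Hypothetical toy data: three small sets of short sequences.
    sets = [{"ACGT", "TTGA"}, {"ACGT"}, {"GGCC"}]
    filters = build_filters(sets)
    print(msmt_query(filters, "ACGT"))
```

Because each query touches every filter, the cost grows linearly with K, which is the O(K) query time the abstract mentions. The returned indices always include every set that truly contains q, and may occasionally include extra sets because of Bloom filter false positives.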