Showing 1–18 of 18 results for author: Bar-Lev, D

Searching in archive cs.
  1. arXiv:2405.08625  [pdf, other]

    cs.IT

    Optimal Almost-Balanced Sequences

    Authors: Daniella Bar-Lev, Adir Kobovich, Orian Leitersdorf, Eitan Yaakobi

    Abstract: This paper presents a novel approach to address the constrained coding challenge of generating almost-balanced sequences. While strictly balanced sequences have been well studied in the past, the problem of designing efficient algorithms with small redundancy, preferably constant or even a single bit, for almost balanced sequences has remained unsolved. A sequence is $\varepsilon(n)$-almost balanc…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted to The IEEE International Symposium on Information Theory (ISIT) 2024

  2. arXiv:2405.08475  [pdf, ps, other]

    cs.IT

    Representing Information on DNA using Patterns Induced by Enzymatic Labeling

    Authors: Daniella Bar-Lev, Tuvi Etzion, Eitan Yaakobi, Zohar Yakhini

    Abstract: Enzymatic DNA labeling is a powerful tool with applications in biochemistry, molecular biology, biotechnology, medical science, and genomic research. This paper contributes to the evolving field of DNA-based data storage by presenting a formal framework for modeling DNA labeling in strings, specifically tailored for data storage purposes. Our approach involves a known DNA molecule as a template fo…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted to The IEEE International Symposium on Information Theory (ISIT) 2024

  3. arXiv:2401.15722  [pdf, ps, other]

    cs.IT math.CO

    Reducing Coverage Depth in DNA Storage: A Combinatorial Perspective on Random Access Efficiency

    Authors: Anina Gruica, Daniella Bar-Lev, Alberto Ravagnani, Eitan Yaakobi

    Abstract: We investigate the fundamental limits of the recently proposed random access coverage depth problem for DNA data storage. Under this paradigm, it is assumed that the user information consists of $k$ information strands, which are encoded into $n$ strands via some generator matrix $G$. In the sequencing process, the strands are read uniformly at random, since each strand is available in a large num…

    Submitted 28 January, 2024; originally announced January 2024.

  4. arXiv:2305.07992  [pdf, ps, other]

    cs.IT

    On the Capacity of DNA Labeling

    Authors: Dganit Hanania, Daniella Bar-Lev, Yevgeni Nogin, Yoav Shechtman, Eitan Yaakobi

    Abstract: DNA labeling is a powerful tool in molecular biology and biotechnology that allows for the visualization, detection, and study of DNA at the molecular level. Under this paradigm, a DNA molecule is labeled by k specific patterns and is then imaged. The resulting image is modeled as a (k + 1)-ary sequence in which any non-zero symbol indicates the appearance of the corresponding label…

    Submitted 22 January, 2024; v1 submitted 13 May, 2023; originally announced May 2023.

  5. arXiv:2305.05972  [pdf, other]

    cs.IT cs.DS

    Coding for IBLTs with Listing Guarantees

    Authors: Daniella Bar-Lev, Avi Mizrahi, Tuvi Etzion, Ori Rottenstreich, Eitan Yaakobi

    Abstract: The Invertible Bloom Lookup Table (IBLT) is a probabilistic data structure for set representation, with applications in network and traffic monitoring. It is known for its ability to list its elements, an operation that succeeds with high probability for a sufficiently large table. However, listing can fail even for relatively small sets. This paper extends recent work on the worst-case analysis of…

    Submitted 10 May, 2023; originally announced May 2023.

  6. arXiv:2305.05656  [pdf, other]

    cs.DM cs.IT math.PR

    Cover Your Bases: How to Minimize the Sequencing Coverage in DNA Storage Systems

    Authors: Daniella Bar-Lev, Omer Sabary, Ryan Gabrys, Eitan Yaakobi

    Abstract: Although the expenses associated with DNA sequencing have been rapidly decreasing, the current cost of sequencing information stands at roughly $120/GB, which is dramatically more expensive than reading from existing archival storage solutions today. In this work, we aim to reduce not only the cost but also the latency of DNA storage by initiating the study of the DNA coverage depth problem, which…

    Submitted 29 November, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  7. arXiv:2304.10391  [pdf, other]

    cs.IT

    DNA-Correcting Codes: End-to-end Correction in DNA Storage Systems

    Authors: Avital Boruchovsky, Daniella Bar-Lev, Eitan Yaakobi

    Abstract: This paper introduces a new solution to DNA storage that integrates all three steps of retrieval, namely clustering, reconstruction, and error correction. DNA-correcting codes are presented as a unique solution to the problem of ensuring that the output of the storage system is unique for any valid set of input strands. To this end, we introduce a novel distance metric to capture the unique behavi…

    Submitted 30 June, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: Extended version of the paper that appeared in ISIT 2023

  8. arXiv:2304.01317  [pdf, other]

    cs.IT

    Universal Framework for Parametric Constrained Coding

    Authors: Daniella Bar-Lev, Adir Kobovich, Orian Leitersdorf, Eitan Yaakobi

    Abstract: Constrained coding is a fundamental field in coding theory that tackles efficient communication through constrained channels. While channels with fixed constraints have a general optimal solution, there is increasing demand for parametric constraints that are dependent on the message length. Several works have tackled such parametric constraints through iterative algorithms, yet they require compl…

    Submitted 3 April, 2023; originally announced April 2023.

  9. arXiv:2212.13812  [pdf, other]

    cs.IT cs.DS

    Invertible Bloom Lookup Tables with Listing Guarantees

    Authors: Avi Mizrahi, Daniella Bar-Lev, Eitan Yaakobi, Ori Rottenstreich

    Abstract: The Invertible Bloom Lookup Table (IBLT) is a probabilistic concise data structure for set representation that supports a listing operation as the recovery of the elements in the represented set. Its applications can be found in network synchronization and traffic monitoring as well as in error-correction codes. IBLT can list its elements with probability affected by the size of the allocated memo…

    Submitted 28 December, 2022; originally announced December 2022.

  10. Generalized Unique Reconstruction from Substrings

    Authors: Yonatan Yehezkeally, Daniella Bar-Lev, Sagi Marcovich, Eitan Yaakobi

    Abstract: This paper introduces a new family of reconstruction codes which is motivated by applications in DNA data storage and sequencing. In such applications, DNA strands are sequenced by reading some subset of their substrings. While previous works considered two extreme cases in which all substrings of pre-defined lengths are read or substrings are read with no overlap for the single string case, this…

    Submitted 20 April, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: Author-submitted, peer-reviewed and accepted version (IEEE Trans. on Inform. Theory). arXiv admin note: text overlap with arXiv:2205.03933

  11. arXiv:2206.07995  [pdf, ps, other]

    cs.IT math.CO

    On the Size of Balls and Anticodes of Small Diameter under the Fixed-Length Levenshtein Metric

    Authors: Daniella Bar-Lev, Tuvi Etzion, Eitan Yaakobi

    Abstract: The rapid development of DNA storage has brought the deletion and insertion channel to the front line of research. When the number of deletions is equal to the number of insertions, the Fixed Length Levenshtein (FLL) metric is the right measure for the distance between two words of the same length. Similar to any other metric, the size of a ball is one of the most fundamental parameters. In this w…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2103.01681

  12. arXiv:2205.03933  [pdf, ps, other]

    cs.IT

    Reconstruction from Substrings with Partial Overlap

    Authors: Yonatan Yehezkeally, Daniella Bar-Lev, Sagi Marcovich, Eitan Yaakobi

    Abstract: This paper introduces a new family of reconstruction codes which is motivated by applications in DNA data storage and sequencing. In such applications, DNA strands are sequenced by reading some subset of their substrings. While previous works considered two extreme cases in which all substrings of some fixed length are read or substrings are read with no overlap, this work considers the set…

    Submitted 8 May, 2022; originally announced May 2022.

    Comments: 6 pages, 2 figures; conference submission

  13. arXiv:2205.03911  [pdf, other]

    cs.IT

    Codes for Constrained Periodicity

    Authors: Adir Kobovich, Orian Leitersdorf, Daniella Bar-Lev, Eitan Yaakobi

    Abstract: Reliability is an inherent challenge for the emerging nonvolatile technology of racetrack memories, and there exists a fundamental relationship between codes designed for racetrack memories and codes with constrained periodicity. Previous works have sought to construct codes that avoid periodicity in windows, yet have either only provided existence proofs or required high redundancy. This paper pr…

    Submitted 25 August, 2022; v1 submitted 8 May, 2022; originally announced May 2022.

    Comments: Accepted to The International Symposium on Information Theory and Its Applications (ISITA) 2022

  14. arXiv:2202.03024  [pdf, other]

    cs.IT

    The Input and Output Entropies of the $k$-Deletion/Insertion Channel

    Authors: Shubhransh Singhvi, Omer Sabary, Daniella Bar-Lev, Eitan Yaakobi

    Abstract: The channel output entropy of a transmitted word is the entropy of the possible channel outputs and, similarly, the input entropy of a received word is the entropy of all possible transmitted words. The goal of this work is to study these entropy values for the k-deletion and k-insertion channels, where exactly k symbols are deleted from, or inserted into, the transmitted word, respectively. If all possible w…

    Submitted 15 June, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

  15. Adversarial Torn-paper Codes

    Authors: Daniella Bar-Lev, Sagi Marcovich, Eitan Yaakobi, Yonatan Yehezkeally

    Abstract: We study the adversarial torn-paper channel. This problem is motivated by applications in DNA data storage where the DNA strands that carry information may break into smaller pieces which are received out of order. Our model extends the previously researched probabilistic setting to the worst-case. We develop code constructions for any parameters of the channel for which non-vanishing asymptotic r…

    Submitted 4 July, 2023; v1 submitted 26 January, 2022; originally announced January 2022.

    Comments: Author submitted, peer-reviewed version

  16. arXiv:2201.02466  [pdf, other]

    cs.IT

    On The Decoding Error Weight of One or Two Deletion Channels

    Authors: Omer Sabary, Daniella Bar-Lev, Yotam Gershon, Alexander Yucovich, Eitan Yaakobi

    Abstract: This paper tackles two problems that are relevant to coding for insertions and deletions. These problems are motivated by several applications, among them reconstructing strands in DNA-based storage systems. Under this paradigm, a word is transmitted over some fixed number of identical independent channels and the goal of the decoder is to output the transmitted word or some close approximation…

    Submitted 7 January, 2022; originally announced January 2022.

    Comments: arXiv admin note: text overlap with arXiv:2001.05582

  17. arXiv:2109.00031  [pdf]

    cs.IT cs.AI

    Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning

    Authors: Daniella Bar-Lev, Itai Orr, Omer Sabary, Tuvi Etzion, Eitan Yaakobi

    Abstract: DNA-based storage is an emerging technology that enables digital information to be archived in DNA molecules. This method enjoys major advantages over magnetic and optical storage solutions such as exceptional information density, enhanced data durability, and negligible power consumption to maintain data integrity. To access the data, an information retrieval process is employed, where some of th…

    Submitted 11 March, 2024; v1 submitted 31 August, 2021; originally announced September 2021.

  18. arXiv:2103.01681  [pdf, ps, other]

    cs.IT

    On Levenshtein Balls with Radius One

    Authors: Daniella Bar-Lev, Tuvi Etzion, Eitan Yaakobi

    Abstract: The rapid development of DNA storage has brought the deletion and insertion channel, once again, to the front line of research. When the number of deletions is equal to the number of insertions, the Fixed Length Levenshtein (FLL) metric is the right measure for the distance between two words of the same length. The size of a ball is one of the most fundamental parameters in any metric. The size of…

    Submitted 29 June, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: 6 pages, to be published in 2021 IEEE International Symposium on Information Theory (ISIT)