-
Improving Group Testing via Gradient Descent
Authors:
Sundara Rajan Srinivasavaradhan,
Pavlos Nikolopoulos,
Christina Fragouli,
Suhas Diggavi
Abstract:
We study the problem of group testing with non-identical, independent priors. So far, the pooling strategies that have been proposed in the literature take the following approach: a hand-crafted test design along with a decoding strategy is proposed, and guarantees are provided on how many tests are sufficient in order to identify all infections in a population. In this paper, we take a different, yet perhaps more practical, approach: we fix the decoder and the number of tests, and we ask, given these, what is the best test design one could use? We explore this question for the Definite Non-Defectives (DND) decoder. We formulate a (non-convex) optimization problem, where the objective function is the expected number of errors for a particular design. We find approximate solutions via gradient descent, which we further optimize with informed initialization. We illustrate through simulations that our method can achieve significant performance improvement over traditional approaches.
Submitted 28 January, 2022;
originally announced January 2022.
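The abstract above fixes the decoder to Definite Non-Defectives (DND) and scores a test design by its number of errors. As a rough illustration only (this is not the authors' code; the function names, the list-of-pools representation of a design, and the noiseless-test assumption are all hypothetical choices made here), a DND decoder and its error count could be sketched as:

```python
from typing import List, Sequence


def run_tests(design: Sequence[Sequence[int]], infected: Sequence[bool]) -> List[bool]:
    """Noiseless pooled tests (assumption): a test is positive iff its pool has an infected item."""
    return [any(infected[j] for j in pool) for pool in design]


def dnd_decode(design: Sequence[Sequence[int]], results: Sequence[bool], n: int) -> List[bool]:
    """DND rule: every item in a negative test is cleared; all remaining items are declared infected."""
    declared = [True] * n
    for pool, positive in zip(design, results):
        if not positive:
            for j in pool:
                declared[j] = False
    return declared


def count_errors(declared: Sequence[bool], infected: Sequence[bool]) -> int:
    """Number of items whose declared status differs from the true status."""
    return sum(d != t for d, t in zip(declared, infected))


# Hypothetical toy instance: 5 individuals, 2 pooled tests, individual 4 is never tested.
design = [[0, 1], [2, 3]]
infected = [True, False, False, False, False]

results = run_tests(design, infected)
declared = dnd_decode(design, results, n=5)
errors = count_errors(declared, infected)
```

With noiseless tests, DND never clears an infected individual, so every error in this sketch is a false positive: here individuals 1 and 4 are wrongly declared infected because neither appears in a negative test. Averaging `count_errors` over infection patterns drawn from the non-identical priors would give an estimate of the kind of objective the abstract describes.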
-
Trellis BMA: Coded Trace Reconstruction on IDS Channels for DNA Storage
Authors:
Sundara Rajan Srinivasavaradhan,
Sivakanth Gopi,
Henry D. Pfister,
Sergey Yekhanin
Abstract:
Sequencing a DNA strand, as part of the read process in DNA storage, produces multiple noisy copies which can be combined to produce better estimates of the original strand; this is called trace reconstruction. One can reduce the error rate further by introducing redundancy in the write sequence and this is called coded trace reconstruction. In this paper, we model the DNA storage channel as an insertion-deletion-substitution (IDS) channel and design both encoding schemes and low-complexity decoding algorithms for coded trace reconstruction.
We introduce Trellis BMA, a new reconstruction algorithm whose complexity is linear in the number of traces, and compare its performance to previous algorithms. Our results show that it reduces the error rate on both simulated and experimental data. The performance comparisons in this paper are based on a new dataset of traces that will be publicly released with the paper. Our hope is that this dataset will enable research progress by allowing objective comparisons between candidate algorithms.
Submitted 13 July, 2021;
originally announced July 2021.
-
Group testing for overlapping communities
Authors:
Pavlos Nikolopoulos,
Sundara Rajan Srinivasavaradhan,
Tao Guo,
Christina Fragouli,
Suhas Diggavi
Abstract:
In this paper, we propose algorithms that leverage a known community structure to make group testing more efficient. We consider a population organized in connected communities: each individual participates in one or more communities, and the infection probability of each individual depends on the communities (s)he participates in. Use cases include students who participate in several classes, and workers who share common spaces. Group testing reduces the number of tests needed to identify the infected individuals by pooling diagnostic samples and testing them together. We show that making testing algorithms aware of the community structure can significantly reduce the number of tests needed for both adaptive and non-adaptive group testing.
Submitted 16 March, 2021; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Community aware group testing
Authors:
Pavlos Nikolopoulos,
Tao Guo,
Sundara Rajan Srinivasavaradhan,
Christina Fragouli,
Suhas Diggavi
Abstract:
In this paper, we propose algorithms that leverage a known community structure to make group testing more efficient. We consider a population organized in disjoint communities: each individual participates in a community, and the infection probability of each individual depends on the community (s)he participates in. Use cases include families, students who participate in several classes, and workers who share common spaces. Group testing reduces the number of tests needed to identify the infected individuals by pooling diagnostic samples and testing them together. We show that if we design the testing strategy taking into account the community structure, we can significantly reduce the number of tests needed for adaptive and non-adaptive group testing, and can improve the reliability in cases where tests are noisy.
Submitted 16 March, 2021; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Algorithms for reconstruction over single and multiple deletion channels
Authors:
Sundara Rajan Srinivasavaradhan,
Michelle Du,
Suhas Diggavi,
Christina Fragouli
Abstract:
Recent advances in DNA sequencing technology and DNA storage systems have rekindled interest in deletion channels. Multiple recent works have looked at variants of sequence reconstruction over a single and over multiple deletion channels, a notoriously difficult problem due to its highly combinatorial nature. Although works in theoretical computer science have provided algorithms which guarantee perfect reconstruction with multiple independent observations from the deletion channel, they are only applicable in the large blocklength regime and, more restrictively, when the number of observations is also large. Indeed, with only a few observations, perfect reconstruction of the input sequence may not even be possible in most cases. In such situations, computing the maximum likelihood (ML) and maximum a posteriori (MAP) estimates for the deletion channels are natural questions that arise, and these have remained open to the best of our knowledge. In this work, we take steps to answer the two aforementioned questions. Specifically: 1. We show that solving for the ML estimate over the single deletion channel (which can be cast as a discrete optimization problem) is equivalent to solving its relaxation, a continuous optimization problem; 2. We exactly compute the symbolwise posterior distributions (under some assumptions on the priors) for both the single and the multiple deletion channels. As part of our contributions, we also introduce tools to visualize and analyze error events, which we believe could be useful in other related problems concerning deletion channels.
Submitted 29 May, 2020;
originally announced May 2020.
-
The Benefit of Being Flexible in Distributed Computation
Authors:
Linqi Song,
Sundara Rajan Srinivasavaradhan,
Christina Fragouli
Abstract:
In wireless distributed computing, networked nodes perform intermediate computations over data placed in their memory and exchange these intermediate values to calculate function values. In this paper we consider an asymmetric setting where each node has access to a random subset of the data, i.e., we cannot control the data placement. The paper makes a simple point: we can realize significant benefits if we are allowed to be "flexible", and decide which node computes which function, in our system. We make this argument in the case where each function depends on only two of the data messages, as is the case in similarity searches. We establish a percolation in the behavior of the system, where, depending on the amount of observed data, by being flexible, we may need no communication at all.
Submitted 8 August, 2017; v1 submitted 23 May, 2017;
originally announced May 2017.