-
Statistical and Computational Phase Transitions in Group Testing
Authors:
Amin Coja-Oghlan,
Oliver Gebhard,
Max Hahn-Klimroth,
Alexander S. Wein,
Ilias Zadik
Abstract:
We study the group testing problem where the goal is to identify a set of k infected individuals carrying a rare disease within a population of size n, based on the outcomes of pooled tests which return positive whenever there is at least one infected individual in the tested group. We consider two different simple random procedures for assigning individuals to tests: the constant-column design and Bernoulli design. Our first set of results concerns the fundamental statistical limits. For the constant-column design, we give a new information-theoretic lower bound which implies that the proportion of correctly identifiable infected individuals undergoes a sharp "all-or-nothing" phase transition when the number of tests crosses a particular threshold. For the Bernoulli design, we determine the precise number of tests required to solve the associated detection problem (where the goal is to distinguish between a group testing instance and pure noise), improving both the upper and lower bounds of Truong, Aldridge, and Scarlett (2020). For both group testing models, we also study the power of computationally efficient (polynomial-time) inference procedures. We determine the precise number of tests required for the class of low-degree polynomial algorithms to solve the detection problem. This provides evidence for an inherent computational-statistical gap in both the detection and recovery problems at small sparsity levels. Notably, our evidence is contrary to that of Iliopoulos and Zadik (2021), who predicted the absence of a computational-statistical gap in the Bernoulli design.
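To make the two pooling designs named in this abstract concrete, here is a minimal Python sketch. The function names and parameters (`p`, `delta`) are illustrative assumptions, not taken from the paper. In particular, the constant-column design below draws each individual's tests without replacement, which is a simplification chosen for this sketch.

```python
import random

def bernoulli_design(n, m, p, rng):
    """Each of n individuals joins each of m tests independently with probability p."""
    return [{i for i in range(n) if rng.random() < p} for _ in range(m)]

def constant_column_design(n, m, delta, rng):
    """Each individual joins exactly delta distinct tests chosen uniformly at random.

    Assumption of this sketch: tests are drawn without replacement.
    """
    pools = [set() for _ in range(m)]
    for i in range(n):
        for t in rng.sample(range(m), delta):
            pools[t].add(i)
    return pools

def pooled_outcomes(pools, infected):
    """A test is positive iff its pool contains at least one infected individual."""
    return [any(i in infected for i in pool) for pool in pools]
```

For example, `pooled_outcomes(constant_column_design(20, 10, 3, random.Random(0)), {0})` yields exactly three positive tests, because individual 0 sits in exactly three pools.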
Submitted 15 June, 2022;
originally announced June 2022.
-
On the Hierarchy of Distributed Majority Protocols
Authors:
Petra Berenbrink,
Amin Coja-Oghlan,
Oliver Gebhard,
Max Hahn-Klimroth,
Dominik Kaaser,
Malin Rau
Abstract:
We study the Consensus problem among $n$ agents, defined as follows. Initially, each agent holds one of two possible opinions. The goal is to reach a consensus configuration in which every agent shares the same opinion. To this end, agents randomly sample other agents and update their opinion according to a simple update function depending on the sampled opinions.
We consider two communication models: the gossip model and a variant of the population model. In the gossip model, agents are activated in parallel, synchronous rounds. In the population model, one agent is activated after the other in a sequence of discrete time steps. For both models we analyze the following natural family of majority processes called $j$-Majority: when activated, every agent samples $j$ other agents uniformly at random (with replacement) and adopts the majority opinion among the sample (breaking ties uniformly at random). As our main result we show a hierarchy among majority protocols: $(j+1)$-Majority (for $j > 1$) converges stochastically faster than $j$-Majority for any initial opinion configuration. In our analysis we use Strassen's Theorem to prove the existence of a coupling. This gives an affirmative answer for the case of two opinions to an open question asked by Berenbrink et al. [2017].
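The $j$-Majority update described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: opinions are encoded as 0 and 1, and all function names are hypothetical. It simulates the process and does not reproduce the coupling argument of the paper.

```python
import random

def j_majority_vote(opinions, agent, j, rng):
    """Sample j other agents uniformly with replacement and return the majority opinion.

    Ties (possible only for even j) are broken uniformly at random.
    """
    n = len(opinions)
    ones = 0
    for _ in range(j):
        other = rng.randrange(n - 1)
        if other >= agent:  # skip the activated agent itself
            other += 1
        ones += opinions[other]
    if 2 * ones == j:
        return rng.randrange(2)
    return 1 if 2 * ones > j else 0

def gossip_round(opinions, j, rng):
    """Gossip model: all agents update in parallel from the same old configuration."""
    return [j_majority_vote(opinions, a, j, rng) for a in range(len(opinions))]

def population_step(opinions, j, rng):
    """Population-style model: one uniformly chosen agent updates in place."""
    agent = rng.randrange(len(opinions))
    opinions[agent] = j_majority_vote(opinions, agent, j, rng)

def rounds_to_consensus(opinions, j, rng, max_rounds=10_000):
    """Number of gossip rounds until all agents agree, or None if the cap is hit."""
    for r in range(max_rounds + 1):
        if len(set(opinions)) == 1:
            return r
        opinions = gossip_round(opinions, j, rng)
    return None
```

Comparing `rounds_to_consensus` for $j$ and $j+1$ over many seeds gives an empirical view of the hierarchy, although such a simulation is of course no substitute for the stochastic-dominance proof.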
Submitted 17 May, 2022;
originally announced May 2022.
-
Note on the offspring distribution for group testing in the linear regime
Authors:
Oliver Gebhard,
Philipp Loick
Abstract:
The group testing problem is concerned with identifying a small set of $k$ infected individuals in a large population of $n$ people. At our disposal is a testing scheme that can test groups of individuals. A test comes back positive if and only if at least one individual is infected. In this note, we lay groundwork for analysing belief propagation for group testing when $k$ scales linearly in $n$. To this end, we derive the offspring distribution for different types of individuals. With these distributions at hand, one can employ the population dynamics algorithm to simulate the posterior marginal distribution resulting from belief propagation.
Submitted 24 March, 2021;
originally announced March 2021.
-
Improved bounds for noisy group testing with constant tests per item
Authors:
Oliver Gebhard,
Oliver Johnson,
Philipp Loick,
Maurice Rolvien
Abstract:
The group testing problem is concerned with identifying a small set of infected individuals in a large population. At our disposal is a testing procedure that allows us to test several individuals together. In an idealized setting, a test is positive if and only if at least one infected individual is included and negative otherwise. Significant progress was made in recent years towards understanding the information-theoretic and algorithmic properties in this noiseless setting. In this paper, we consider a noisy variant of group testing where test results are flipped with a certain probability, including the realistic scenario where sensitivity and specificity can take arbitrary values. Using a test design where each individual is assigned to a fixed number of tests, we derive explicit algorithmic bounds for two commonly considered inference algorithms and thereby naturally extend the results of Scarlett & Cevher (2016) and Scarlett & Johnson (2020). We provide improved performance guarantees for the efficient algorithms in these noisy group testing models -- indeed, for a large set of parameter choices the bounds provided in the paper are the strongest currently proved.
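The noise model mentioned in this abstract, where test results are flipped according to a sensitivity and a specificity, might be simulated as in the following Python sketch. The function name and signature are assumptions made for illustration. The sketch assumes the standard meaning of the two parameters: a truly positive test is reported positive with probability `sensitivity`, and a truly negative test is reported negative with probability `specificity`.

```python
import random

def noisy_outcomes(true_outcomes, sensitivity, specificity, rng):
    """Pass noiseless test outcomes through an independent, asymmetric flip channel."""
    noisy = []
    for positive in true_outcomes:
        if positive:
            # Reported positive with probability `sensitivity`.
            noisy.append(rng.random() < sensitivity)
        else:
            # Reported negative with probability `specificity`.
            noisy.append(not (rng.random() < specificity))
    return noisy
```

Setting both parameters to 1 recovers the noiseless model, and setting both to the same value $1 - q$ gives a symmetric channel that flips each result with probability $q$.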
Submitted 21 December, 2021; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Near optimal sparsity-constrained group testing: improved bounds and algorithms
Authors:
Oliver Gebhard,
Max Hahn-Klimroth,
Olaf Parczyk,
Manuel Penschuck,
Maurice Rolvien,
Jonathan Scarlett,
Nelvin Tan
Abstract:
Recent advances in noiseless non-adaptive group testing have led to a precise asymptotic characterization of the number of tests required for high-probability recovery in the sublinear regime $k = n^θ$ (with $θ\in (0,1)$), with $n$ individuals among which $k$ are infected. However, the required number of tests may increase substantially under real-world practical constraints, notably including bounds on the maximum number $Δ$ of tests an individual can be placed in, or the maximum number $Γ$ of individuals in a given test. While previous works have given recovery guarantees for these settings, significant gaps remain between the achievability and converse bounds. In this paper, we substantially or completely close several of the most prominent gaps. In the case of $Δ$-divisible items, we show that the definite defectives (DD) algorithm coupled with a random regular design is asymptotically optimal in dense scaling regimes, and optimal to within a factor of $e$ more generally; we establish this by strengthening both the best known achievability and converse bounds. In the case of $Γ$-sized tests, we provide a comprehensive analysis of the regime $Γ = Θ(1)$, and again establish a precise threshold proving the asymptotic optimality of SCOMP (a slight refinement of DD) equipped with a tailored pooling scheme. Finally, for each of these two settings, we provide near-optimal adaptive algorithms based on sequential splitting, and provably demonstrate gaps between the performance of optimal adaptive and non-adaptive algorithms.
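For readers unfamiliar with the definite defectives (DD) algorithm named in this abstract, here is a minimal Python sketch of the standard noiseless DD rule as it is usually described in the group testing literature. The data layout (pools as sets of integer indices) and the function name are assumptions of this sketch, not the paper's notation.

```python
def definite_defectives(pools, outcomes):
    """Standard noiseless DD rule.

    1. Everyone appearing in a negative test is certainly healthy.
    2. A positive test with exactly one remaining suspect pins that suspect
       down as infected.
    3. Everyone else is declared healthy (so DD may miss infected individuals,
       but never produces a false positive when tests are noiseless).
    """
    cleared = set()
    for pool, positive in zip(pools, outcomes):
        if not positive:
            cleared.update(pool)

    declared = set()
    for pool, positive in zip(pools, outcomes):
        if positive:
            suspects = [i for i in pool if i not in cleared]
            if len(suspects) == 1:
                declared.add(suspects[0])
    return declared
```

With pools `[{0, 1}, {1, 2}, {2, 3}, {0, 3}]` and outcomes `[True, False, False, True]`, the two negative tests clear individuals 1, 2 and 3, so both positive tests leave individual 0 as the only suspect and the function returns `{0}`.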
Submitted 22 December, 2021; v1 submitted 24 April, 2020;
originally announced April 2020.
-
Optimal group testing
Authors:
Amin Coja-Oghlan,
Oliver Gebhard,
Max Hahn-Klimroth,
Philipp Loick
Abstract:
In the group testing problem the aim is to identify a small set of $k\sim n^θ$ infected individuals out of a population of size $n$, $0<θ<1$. We avail ourselves of a test procedure capable of testing groups of individuals, with the test returning a positive result iff at least one individual in the group is infected. The aim is to devise a test design with as few tests as possible so that the set of infected individuals can be identified correctly with high probability. We establish an explicit sharp information-theoretic/algorithmic phase transition $m_{\mathrm{inf}}$ for non-adaptive group testing, where all tests are conducted in parallel. Thus, with more than $m_{\mathrm{inf}}$ tests the infected individuals can be identified in polynomial time with high probability, while learning the set of infected individuals is information-theoretically impossible with fewer tests. In addition, we develop an optimal adaptive scheme where the tests are conducted in two stages.
Submitted 18 April, 2020; v1 submitted 6 November, 2019;
originally announced November 2019.
-
On the Parallel Reconstruction from Pooled Data
Authors:
Oliver Gebhard,
Max Hahn-Klimroth,
Dominik Kaaser,
Philipp Loick
Abstract:
In the pooled data problem the goal is to efficiently reconstruct a binary signal from additive measurements. Given a signal $σ\in \{0,1\}^n$, we can query multiple entries at once and get the total number of non-zero entries in the query as a result. We assume that queries are time-consuming and therefore focus on the setting where all queries are executed in parallel. For the regime where the signal is sparse such that $\|σ\|_1 = o(n)$ our results are twofold: First, we propose and analyze a simple and efficient greedy reconstruction algorithm. Second, we derive a sharp information-theoretic threshold for the minimum number of queries required to reconstruct $σ$ with high probability. Our first result matches the performance guarantees of much more involved constructions (Karimi et al. 2019). Our second result extends a result of Alaoui et al. (2014) and Scarlett & Cevher (2017) who studied the pooled data problem for dense signals. Finally, our theoretical findings are complemented with empirical simulations. Our data not only confirm the information-theoretic thresholds but also hint at the practical applicability of our pooling scheme and the simple greedy reconstruction algorithm.
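The additive query model in this abstract, together with one plausible greedy decoder, can be illustrated with the Python sketch below. The decoder is an assumption of this sketch and not necessarily the algorithm analysed in the paper: it scores each entry by the sum of the results of the queries containing it and sets the $k$ highest-scoring entries to one. It also assumes that $k = \|σ\|_1$ is known and that every entry appears in roughly the same number of queries.

```python
def query_results(signal, queries):
    """Additive measurement: each query returns the number of ones it contains."""
    return [sum(signal[i] for i in q) for q in queries]

def greedy_reconstruct(n, k, queries, results):
    """Score entries by the total result of their queries and keep the top k.

    Hypothetical thresholding decoder for illustration; ties are resolved in
    favour of the smaller index because Python's sort is stable.
    """
    scores = [0] * n
    for q, r in zip(queries, results):
        for i in q:
            scores[i] += r
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    estimate = [0] * n
    for i in top:
        estimate[i] = 1
    return estimate
```

For the signal `[1, 0, 0, 0]` and queries `[[0, 1], [0, 2], [0, 3], [1, 2]]`, the results are `[1, 1, 1, 0]`, entry 0 collects a score of 3 against 1 for every other entry, and the decoder recovers the signal.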
Submitted 13 April, 2022; v1 submitted 4 May, 2019;
originally announced May 2019.
-
Information-theoretic and algorithmic thresholds for group testing
Authors:
Amin Coja-Oghlan,
Oliver Gebhard,
Max Hahn-Klimroth,
Philipp Loick
Abstract:
In the group testing problem we aim to identify a small number of infected individuals within a large population. We avail ourselves of a procedure that can test a group of multiple individuals, with the test result coming out positive iff at least one individual in the group is infected. With all tests conducted in parallel, what is the least number of tests required to identify the status of all individuals? In a recent test design [Aldridge et al. 2016] the individuals are assigned to test groups randomly, with every individual joining an equal number of groups. We pinpoint the sharp threshold for the number of tests required in this randomised design so that it is information-theoretically possible to infer the infection status of every individual. Moreover, we analyse two efficient inference algorithms. These results settle conjectures from [Aldridge et al. 2014, Johnson et al. 2019].
Submitted 30 September, 2020; v1 submitted 6 February, 2019;
originally announced February 2019.