Semiquantitative Group Testing in at Most Two Rounds
Mahdi Cheraghchi
Department of EECS, University of Michigan, Ann Arbor, MI. Email: [email protected]
Ryan Gabrys
Naval Information Warfare Center, San Diego, and UIUC, Urbana, IL. Email: [email protected]
Olgica Milenkovic
Department of ECE, University of Illinois, Urbana, IL. Email: [email protected]
Abstract: Semiquantitative group testing (SQGT) is a pooling method in which the test outcomes represent bounded intervals for the number of defectives. Alternatively, it may be viewed as an adder channel with quantized outputs. SQGT represents a natural choice for Covid-19 group testing as it allows for a straightforward interpretation of the cycle threshold values produced by polymerase chain reactions (PCR). Prior work on SQGT did not address the need for adaptive testing with a small number of rounds as required in practice. We propose conceptually simple methods for two-round and nonadaptive SQGT that significantly improve upon existing schemes by using ideas on nonbinary measurement matrices based on expander graphs and list-disjunct matrices.

I. INTRODUCTION
Group testing (GT) is a scheme designed to efficiently identify a small set of subjects with a particular property (standardly referred to as defectives) within a large population, first introduced by Dorfman [1] and further studied in many other works, including [2]–[4]. Group testing entails testing a collection of carefully selected subpopulations and reporting for each subgroup a binary answer: a positive answer is indicative of the existence of at least one defective in the subgroup, while a negative answer implies the absence of defectives. Given that screening protocols are extensively used in engineering and science, group testing has found widespread applications in communication theory, signal processing, computer science, and computational biology [3], [5].

Many different variants of group testing have been proposed in the literature [1], [3], [6]. These include threshold group testing proposed by Damaschke [7] and quantitative (additive) group testing studied by Lindström and Du and Hwang [6], [8], [9]. In the latter case, the test results report the exact number of defectives in the test subpool. In the former case, if the number of defectives in a test is smaller than a lower threshold, the test outcome is negative; if the number of defectives is larger than an upper threshold, the test outcome is positive; otherwise, the result is arbitrary (positive or negative). To bridge the two above-described paradigms, Emad and Milenkovic [10]–[12] introduced the notion of semiquantitative group testing (SQGT). SQGT represents a unifying framework for a number of testing protocols, including conventional, quantitative and gapless threshold group testing and the schemes by D’yachkov and Rykov [13], [14]. In SQGT, the result of a test is a nonbinary value that depends on the number of defectives through a fixed set of thresholds. The SQGT model may also be viewed as a quantitative group testing method followed by a quantizer. The original motivation for introducing SQGT models is genotyping; more recently, the model has been used by Gabrys et al. [15] to describe the test outcomes of a Covid-19 testing process known as real-time reverse-transcriptase polymerase chain reaction (PCR).

M. Cheraghchi’s research was partially supported by the National Science Foundation under Grant No. CCF-2006455.

In nonadaptive
SQGT, each subject is assigned a unique binary or nonbinary indicator word of length equal to the total number of tests. These indicators are arranged column-wise in a test matrix. Each coordinate in the codeword assigned to a subject corresponds to a test, and its value reflects the “concentration” of the sample corresponding to the given subject in the test. Note that the concentrations are nonnegative integers that usually correspond to the number of units of the genetic material of an individual subject. Two families of nonadaptive SQGT codes, SQ-disjunct and SQ-separable, were analyzed in [11], [12]. In the same work, a number of constructions for nonadaptive uniform and nonuniform (quantized) SQGT codes were presented but no results were reported for adaptive tests. The more recent work [15] introduced the first combinatorial and probabilistic adaptive SQGT (ASQGT) schemes, the former extending the work of Hwang [16] on generalized binary group testing. The proposed combinatorial ASQGT schemes involve what is referred to as parallel and deep search methods that lead to a relatively large number of testing rounds. This is an undesirable feature for practical implementations of SQGT in Covid-19 testing.

Here, we describe the first known combinatorial two-round adaptive SQGT (ASQGT) scheme for a special selection of (quantization) thresholds studied in [15]. The scheme uses $O\left(d\,\frac{\log\log\tau}{\log\tau}\,\log\frac{n}{d}\right)$ tests for $n$ subjects, $d$ defectives and $\tau$ SQGT thresholds. It builds upon the ideas of list-disjunct group testing [17] and, like the approach of [15], uses nonbinary test matrices obtained by careful linear combining of the rows of a binary disjunct matrix. The described two-round ASQGT protocol differs from the information-theoretic bound only by about a factor of $\log\log\tau$. We then proceed to improve existing nonadaptive protocols by extending the construction of Porat and Rothschild [18].

The paper is organized as follows. Section III describes our main result, the first known two-round ASQGT.
Section IV presents new nonadaptive SQGT schemes that significantly improve upon previous constructions [11], [12] and imply new upper bounds for nonadaptive SQGT.

II. TERMINOLOGY, GT BACKGROUND, AND BOUNDS
We start with some relevant terminology. All parameters are denoted by lowercase letters, while vectors and matrices are denoted by boldface lowercase and capitalized Latin letters, respectively. Entries of the vectors are indexed by subscripts while matrix entries are indexed by pairs of integers within parentheses. Unless stated otherwise, all logarithms are to base 2.

Assume that there are $n > 1$ test subjects labeled by elements in $[n] := \{1, \ldots, n\}$, among which $d < n$ are defective (i.e., infected). In conventional group testing, we summarize the set of tests through a binary matrix $\mathbf{B}_{m \times n}$ in which every column of the matrix uniquely characterizes an individual and each row represents a test. The $(i,j)$th entry of $\mathbf{B}$, $\mathbf{B}(i,j)$, equals 1 if and only if the individual labeled $j$ is included in the $i$th test. Let $\mathbf{t}_I \in \{0,1\}^m$ denote the binary vector that results from $m$ tests using $\mathbf{B}$, assuming that the set of infected individuals equals $I \subset [n]$, with $|I| \le d$. Whenever clear from the context, we omit the subscript $I$. In conventional group testing, $\mathbf{t}_I(l) = 1$ if and only if the $l$th test includes at least one element from $I$. Let $\mathbf{t}_L \in \{0,1\}^m$ be defined analogously for another set $L \subset [n]$. We say that a set $L$ is consistent with $I$ if $\mathbf{t}_L \le \mathbf{t}_I$ entrywise.

The matrix $\mathbf{B}_{m \times n}$ is termed $d$-disjunct if no vector $\mathbf{t}_I$ for $|I| \le d$ contains in its support a column of $\mathbf{B}$ not indexed by $I$. The disjunctness property ensures that the test results obtained from $\mathbf{B}$ uniquely identify the set of defectives. A matrix $\mathbf{B}$ is termed $(d,\ell)$-list-disjunct if the tests output a superset of the defectives of size at most $\ell + d$; for such a matrix, the size of any list $L$ consistent with $I$ is at most $\ell + d$. Clearly, a matrix $\mathbf{B}$ which is $d$-disjunct is equivalent to one which is $(d,0)$-list-disjunct. The notion of list-disjunct matrices was explicitly formulated (in an equivalent form) in [19] and is also essentially equivalent to what was defined earlier in [20].

We review the following known results pertaining to the existence of $(d,\ell)$-list-disjunct test matrices $\mathbf{B} \in \{0,1\}^{m \times n}$ with $\ell = O(d)$ and $m = O\left(d \log\frac{n}{d}\right)$. First, note that for a maximal $L$ one has $I \subseteq L$. Therefore, as noted in [21], the existence of a $(d,\ell)$-list-disjunct test matrix $\mathbf{B} \in \{0,1\}^{m \times n}$ with $\ell = O(d)$ naturally implies a two-round testing scheme: the first round of tests is governed by the rows of $\mathbf{B}$ while the second round involves individually testing subjects in $L$. Randomized and explicit constructions of list-disjunct matrices exist, particularly via expander graphs [14], [17], [19]–[21]. The best known construction which achieves an optimal number of rows and nearly linear time recovery (in the number of rows) is given by [22].

The best lower bound on the number of tests necessary for an adaptive SQGT (ASQGT) scheme was established in [15] via a simple counting argument, and the bound equals $\frac{d}{\log\tau}\log\frac{n}{d}$. In the next section, we establish the existence of a two-round scheme that differs from this lower bound by a factor of $\log\log\tau$ only. For the single-round setting, using a variation of the argument employed by Füredi [23] in the context of cover-free codes, one can show that the corresponding number of tests scales as $\frac{d^2}{(\log\tau)^2}\log_d n$, whereas the construction from Section IV implies the existence of a scheme that requires at most $\frac{d^2}{\log\tau}\log n$ tests. This lower bound applies not only to general nonadaptive SQGT, but in fact to the particular saturation model as well, which is the focus of this work. The derivation of the bound is relegated to the full version of the paper.

III. TWO-ROUND
ASQGT

Let $G$ be a bipartite graph with a vertex partition $\mathcal{P}$ (people) and $\mathcal{T}$ (tests) such that every vertex in $\mathcal{P}$ has degree $k$ (i.e., $k$ neighbors) and $|\mathcal{P}| = n$, $|\mathcal{T}| = m$. We say that $G$ is an $(\alpha,\beta)$-expander if every $P \subseteq \mathcal{P}$ of size at most $\alpha|\mathcal{P}|$ has at least $\beta|P|$ neighbors in $\mathcal{T}$. The values of the parameters $n$, $m$ are dictated by the expansion factors $\alpha$, $\beta$. It is also worth pointing out that explicit constructions of expander graphs with parameters of interest in our derivations may not be known, but their existence is guaranteed via probabilistic arguments.

We say that a set of vertices $T \subseteq \mathcal{T}$ is covered by a set $P \subseteq \mathcal{P}$ if for every vertex $t \in T$, there exists a vertex $p \in P$ which is connected to $t$. We say that a vertex $t \in \mathcal{T}$ is uniquely covered by (or is a unique neighbor of) $P$ if it is the neighbor of exactly one vertex $p \in P$. Henceforth, for a set of vertices $P$, let $N(P) \subseteq \mathcal{T}$ denote the neighbors of $P$ and let $N_u(P) \subseteq \mathcal{T}$ denote the set of unique neighbors of $P$. Furthermore, we say that a vertex $t \in \mathcal{T}$ is covered $h$ times by $P$ if it is connected to exactly $h$ different vertices in $P$. The next results may be obtained through a straightforward modification of existing results.

Lemma 1. [19] Suppose that $G$ is an $(\alpha,\beta)$-expander where every vertex in $\mathcal{P}$ has $k$ neighbors and $\beta > \frac{3k}{4}$. Let $I \subseteq \mathcal{P}$ be a subset of size $|I| \le d$. Then for any $P \subseteq \mathcal{P}$ such that $P \cap I = \emptyset$, $|P| \ge |I| + 1$ and $|P \cup I| \le \alpha|\mathcal{P}|$, we have
$$\left| N_u(P \cup I) \setminus N(I) \right| \ge \frac{k}{2}.$$

Thus, given the previous lemma, it follows that there exists at least one test in $N(P \cup I)$ that is not covered by an element of $I$. Using this observation, we construct the $m \times n$ binary matrix $\mathbf{B}$ as follows. Suppose that $G$ is an expander as previously described. We assume that the vertices in $\mathcal{P}$ and $\mathcal{T}$ are lexicographically ordered so that we can refer to the $i$th vertex in $\mathcal{T}$ as $i$ and the $j$th vertex in $\mathcal{P}$ as $j$. Then, for $i \in \mathcal{T}$ and $j \in \mathcal{P}$,
$$\mathbf{B}(i,j) = \begin{cases} 1, & \text{if an edge exists between } i \text{ and } j \text{ in } G, \\ 0, & \text{otherwise.} \end{cases} \tag{1}$$
Thus, as a result of the construction for $\mathbf{B}$, we can uniquely associate each column of $\mathbf{B}$ with a vertex in $\mathcal{P}$ and each row of $\mathbf{B}$ with a vertex in $\mathcal{T}$.

The next two results follow immediately from the previous discussion.

Corollary 2. Suppose we are given two sets $I, L \subseteq \mathcal{P}$ such that $I \subseteq L$. If $L$ is consistent with $I$ under $\mathbf{B}$ and $|L| \le \alpha|\mathcal{P}|$, then $|L| < 2|I| + 1$.

Lemma 3.
Suppose $\mathbf{B}$ is as defined in (1) and the set of infected individuals satisfies $|I| \le d$. Then, testing with $\mathbf{B}$ recovers a set $L \subseteq \mathcal{P}$ such that $|L| = O(d)$ and $I \subseteq L$.

The following lemma is known [3] and follows from a standard randomized construction:

Lemma 4.
Supposethat α “ d ` n andlet m “ e k α n ,where e is the base of the naturallogarithm.Then, for k “ O p log α q there exists an p α , β q -expander graph G with bipartition P , T suchthat | P | “ n and | T | “ m ,and β “ k .The previous result implies the following theorem. Theorem 5.
There exists a conventional two-stage GT scheme that requires at most $O\left(e\,d\log\frac{n}{d}\right)$ tests and can identify a set of infected individuals of size at most $d$ from a population of size $n$.

We remark that the best known explicit constructions of bipartite expanders are still inferior to the optimal bounds achieved by random expanders in Lemma 4. For example, using [24] one can get $O\left(d^{1+\alpha}(\log n)^{O(1/\alpha)}\right)$ tests for any fixed $\alpha > 0$, and [25] would achieve $O\left(d\exp\left((\log\log n)^{3}\right)\right)$ tests, similar to the derivation in [20].

We now discuss how to use the matrix $\mathbf{B}$ to design a specialized two-round SQGT testing scheme for $\tau > 2$.

We focus on a special case of uniform SQGT with saturation [15] for which we are given $\tau$ thresholds. The test outcome vector for a set $I$ of defectives is such that $s_I(l) = 0$ if the $l$th test includes no defectives, $s_I(l) = 1$ if the $l$th test includes 1 defective, ..., $s_I(l) = \tau - 2$ if the $l$th test includes $\tau - 2$ defectives, and $s_I(l) = \tau - 1$ if the number of defectives in the $l$th test exceeds $\tau - 2$. To simplify the notation, we assume that $\tau = (4\gamma)^{\gamma}$ for some positive integer $\gamma$.

We show the existence of a two-round testing scheme that differs from the information-theoretic lower bound from [15] by only a factor of roughly $\log\log\tau$. As discussed earlier, we only focus on the first round, since the second one is straightforward. The key idea used to construct the test matrix for the first round is to start with a list-disjunct expander-based binary test matrix and then merge the rows via specialized linear combinations to reduce the number of tests and increase the size of the alphabet used for the codebook.

We start by introducing two matrices $\mathbf{S}^{(1)}$ and $\mathbf{S}^{(2)}$ that will be subsequently concatenated into the “global” SQGT matrix $\mathbf{S} = \begin{bmatrix} \mathbf{S}^{(1)} \\ \mathbf{S}^{(2)} \end{bmatrix}$. Let $\mathbf{B}$ be as defined in (1) and, for simplicity, assume that $\gamma \mid m$.
Then, for $i \in [m/\gamma]$ and $j \in [n]$, we set
$$\mathbf{S}^{(1)}(i,j) = \mathbf{B}((i-1)\gamma+1, j) + (4\gamma)\,\mathbf{B}((i-1)\gamma+2, j) + (4\gamma)^{2}\,\mathbf{B}((i-1)\gamma+3, j) + \cdots + (4\gamma)^{\gamma-1}\,\mathbf{B}(i\gamma, j); \tag{2}$$
$$\mathbf{S}^{(2)}(i,j) = \mathbf{B}((i-1)\gamma+1, j) + \mathbf{B}((i-1)\gamma+2, j) + \mathbf{B}((i-1)\gamma+3, j) + \cdots + \mathbf{B}(i\gamma, j). \tag{3}$$
Note that both $\mathbf{S}^{(1)}$ and $\mathbf{S}^{(2)}$ are obtained as linear combinations of rows of $\mathbf{B}$, but the scaling factors are different. The SQGT test matrix $\mathbf{S}$ has $\frac{2m}{\gamma}$ rows and consequently the same number of tests. The tests involve taking an integer number of sample units dictated by the nonbinary entries in $\mathbf{S}$. The nonbinary (semiquantitative) test outcome vector will be denoted by $\mathbf{s}$.

Let $E(a)$ denote the $(4\gamma)$-ary expansion of the natural number $a$ in vector form. More precisely, if $a = a_0 + a_1 (4\gamma) + a_2 (4\gamma)^2 + \cdots + a_{\gamma-1}(4\gamma)^{\gamma-1}$, then $E(a) = (a_0, a_1, \ldots, a_{\gamma-1})$, where $a_i \in [0, 4\gamma-1]$. Our decoding procedure operates as follows. Suppose that $\mathbf{s}^{(1)} = (s^{(1)}_1, \ldots, s^{(1)}_{m/\gamma})$ represents the results of the (quantized) testing using the matrix (2). We apply the map $E$ to $\mathbf{s}^{(1)}$ entrywise. We then use an expander-based decoding procedure on this vector to recover a “noisy” set of test values; the “noise” is due to the fact that the matrix $\mathbf{S}^{(1)}$ can only handle fewer than $4\gamma$ defectives in each test of $\mathbf{B}$.

To this end, let $\bar{\mathbf{s}} = \left(E(s^{(1)}_1), E(s^{(1)}_2), \ldots, E(s^{(1)}_{m/\gamma})\right) = (\bar{s}_1, \bar{s}_2, \ldots, \bar{s}_m)$ and let
$$\hat{\mathbf{t}}^{(b)} = \left(\left\lceil \frac{\bar{s}_1}{\tau} \right\rceil, \ldots, \left\lceil \frac{\bar{s}_m}{\tau} \right\rceil\right) = \left(\hat{t}^{(b)}_1, \hat{t}^{(b)}_2, \ldots, \hat{t}^{(b)}_m\right) \in \{0,1\}^m.$$
Note that $\hat{t}^{(b)}_i = 1$ if $\bar{s}_i > 0$ and zero otherwise. For shorthand, we write $f_{\tau \to b}\left(\mathbf{s}^{(1)}\right) = \hat{\mathbf{t}}^{(b)}$. We have the following claim.
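As an illustration before the claim is stated, the row merging in (2) and (3) and the map $f_{\tau \to b}$ can be sketched in a few lines of Python. This is a minimal sketch and not the authors' implementation: the function names (`merge_rows`, `saturating_outcomes`, `expand`, `f_tau_to_b`), the toy matrix `B`, and the parameter choice $\gamma = 2$ with base $4\gamma = 8$ and $\tau = 64$ are assumptions made for the example, and a test outcome is modeled as the weighted number of defectives capped at $\tau - 1$.

```python
def merge_rows(B, gamma, base):
    """Merge consecutive blocks of gamma rows of B into rows of S1 (weighted) and S2 (plain sums)."""
    m, n = len(B), len(B[0])
    assert m % gamma == 0, "the sketch assumes gamma divides m"
    S1, S2 = [], []
    for i in range(m // gamma):
        block = B[i * gamma:(i + 1) * gamma]
        S1.append([sum(base ** v * block[v][j] for v in range(gamma)) for j in range(n)])
        S2.append([sum(block[v][j] for v in range(gamma)) for j in range(n)])
    return S1, S2


def saturating_outcomes(S, infected, tau):
    """Assumed saturation model: outcome = min(weighted number of defectives, tau - 1)."""
    return [min(sum(row[j] for j in infected), tau - 1) for row in S]


def expand(a, gamma, base):
    """E(a): the gamma least-significant digits of a in the given base."""
    digits = []
    for _ in range(gamma):
        digits.append(a % base)
        a //= base
    return digits


def f_tau_to_b(s1, gamma, base):
    """Binary estimate of the outcomes of B: one bit per digit, 1 iff the digit is positive."""
    return [1 if digit > 0 else 0 for a in s1 for digit in expand(a, gamma, base)]


# Toy data (hypothetical): 4 binary tests, 5 subjects, subjects 0 and 3 infected.
gamma, base = 2, 8          # base = 4 * gamma
tau = base ** gamma         # tau = (4 * gamma) ** gamma = 64
B = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0],
]
infected = [0, 3]

S1, S2 = merge_rows(B, gamma, base)
s1 = saturating_outcomes(S1, infected, tau)
t = [1 if any(row[j] for j in infected) else 0 for row in B]  # conventional binary outcomes
t_hat = f_tau_to_b(s1, gamma, base)
print(s1, t_hat, t)
```

In this toy instance no test of `B` contains $4\gamma$ or more infected subjects, so every base-$4\gamma$ digit is an exact count and `t_hat` coincides with `t`; the four binary tests are replaced by two nonbinary ones per merged matrix.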
Claim 6. Let $\mathbf{t} \in \{0,1\}^m$ denote the test output based on the binary matrix $\mathbf{B}$, let $\mathbf{s}^{(1)}$ be the test output generated via $\mathbf{S}^{(1)}$, and let $\hat{\mathbf{t}}^{(b)}$ be as defined above. Then,
$$d_H\left(\hat{\mathbf{t}}^{(b)}, \mathbf{t}\right) \le \frac{dk}{4}.$$

Proof: Let $f_{\tau \to b}\left(s^{(1)}_i\right) = \left(\hat{t}^{(b)}_{\gamma(i-1)+1}, \ldots, \hat{t}^{(b)}_{\gamma i}\right)$ be the mapping that corresponds to $s^{(1)}_i$. For some $j \in [\gamma]$, let vertex $(i-1)\gamma + j \in \mathcal{T}$ be covered at least $4\gamma$ times. Such a vertex may be in error (due to the use of the $(4\gamma)$-ary expansion). Since the set $I \subseteq \mathcal{P}$ has at most $|I|k$ neighbors in $\mathcal{T}$, it follows from an averaging argument that
$$\left|\left\{ (i,j) : \text{vertex } (i-1)\gamma + j \in \mathcal{T} \text{ is covered at least } 4\gamma \text{ times} \right\}\right| \le \frac{|I|k}{4\gamma}.$$
Let $(i-1)\gamma + \ell \in \mathcal{T}$ be a vertex which is covered at least $4\gamma$ times (if no such vertex exists, we are error-free and do not have to prove anything further). In this case we may have
$$f_{\tau \to b}\left(s^{(1)}_i\right) = \left(\hat{t}^{(b)}_{(i-1)\gamma+1}, \hat{t}^{(b)}_{(i-1)\gamma+2}, \ldots, \hat{t}^{(b)}_{i\gamma}\right) \ne \left(t_{(i-1)\gamma+1}, t_{(i-1)\gamma+2}, \ldots, t_{i\gamma}\right);$$
in the worst case, $\hat{t}^{(b)}_{(i-1)\gamma+1} \ne t_{(i-1)\gamma+1}, \ldots, \hat{t}^{(b)}_{i\gamma} \ne t_{i\gamma}$. This implies that for every $(i,\ell)$ there are at most $\gamma$ instances where $\hat{t}^{(b)}_v \ne t_v$, which gives the desired result.

As a result of the previous claim, it follows that we can recover a binary vector $\hat{\mathbf{t}}^{(b)}$ that is within Hamming distance $\frac{dk}{4}$ of the binary test result $\mathbf{t}$ based on $\mathbf{B}$. Thus, we have to recover the set of infected individuals given a noisy set of test outcomes. To correct errors, we make use of the test outcome generated by the matrix $\mathbf{S}^{(2)}$; this matrix renders the errors in $\mathbf{t}$ “asymmetric,” which simplifies the problem. Here, the term “asymmetric” refers to the fact, to be addressed in Claim 7, that $\mathbf{t}^{(b)} \ge \mathbf{t}$, so that in $\mathbf{t}$ a 0 can change to a 1 but not otherwise. More precisely, we use $\mathbf{S}^{(2)}$ to identify tests in $\mathbf{S}^{(1)}$ that contain at least $4\gamma$ defectives. Note that if at least $4\gamma$ infected individuals are present in some test pool $i$, then the entries indexed by $(i-1)\gamma+1, (i-1)\gamma+2, \ldots, i\gamma$ of $\hat{\mathbf{t}}^{(b)}$ may be in error.

Let $\mathbf{s}^{(2)} = \left(s^{(2)}_1, \ldots, s^{(2)}_{m/\gamma}\right) \in \{0, 1, \ldots, \tau-1\}^{m/\gamma}$ be the test outcomes of $\mathbf{S}^{(2)}$. Define a vector
$$t^{(b)}_j = \begin{cases} 1, & \text{if } s^{(2)}_{\lceil j/\gamma \rceil} \ge 4\gamma, \\ \hat{t}^{(b)}_j, & \text{otherwise.} \end{cases} \tag{4}$$
Similarly as before, for $\mathbf{s} = \left(\mathbf{s}^{(1)}, \mathbf{s}^{(2)}\right)$ we write $f_{\tau \to b}(\mathbf{s}) = \mathbf{t}^{(b)}$. The following straightforward claim follows from the previous discussion and the observations in Claim 6.
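The effect of (4) can be seen on a small overflow example before the claim is stated. The following Python sketch is illustrative only: the helper names `digits` and `asymmetric_estimate`, the toy matrix, and the parameters $\gamma = 2$ with base $4\gamma = 8$ are assumptions made for the example, and a pool is flagged when its outcome under $\mathbf{S}^{(2)}$ is at least the base.

```python
def digits(a, gamma, base):
    """The gamma least-significant digits of a in the given base."""
    return [(a // base ** v) % base for v in range(gamma)]


def asymmetric_estimate(s1, s2, gamma, base):
    """Sketch of (4): a pool whose S2 outcome reaches `base` is set to all ones;
    otherwise each bit is 1 iff the corresponding digit of the S1 outcome is positive."""
    t_b = []
    for weighted, total in zip(s1, s2):
        if total >= base:
            t_b.extend([1] * gamma)
        else:
            t_b.extend(1 if x > 0 else 0 for x in digits(weighted, gamma, base))
    return t_b


# Toy data (hypothetical): two binary tests merged into one pool.
gamma, base = 2, 8            # base = 4 * gamma
tau = base ** gamma           # 64 outcomes, saturating at tau - 1
B = [[1] * 8 + [0],           # test 1 contains subjects 0..7
     [0] * 8 + [1]]           # test 2 contains subject 8 only
infected = list(range(8))     # eight infected subjects, all in test 1

t = [1 if any(row[j] for j in infected) else 0 for row in B]
s1 = [min(sum(B[0][j] + base * B[1][j] for j in infected), tau - 1)]
s2 = [min(sum(B[0][j] + B[1][j] for j in infected), tau - 1)]

t_hat = [1 if x > 0 else 0 for x in digits(s1[0], gamma, base)]  # digit test alone
t_b = asymmetric_estimate(s1, s2, gamma, base)                   # digit test corrected by S2
print(t, t_hat, t_b)
```

Here test 1 is covered $4\gamma = 8$ times, so its digit overflows into the next position: the digit test alone misses test 1 and wrongly marks test 2. After the correction both entries are set to 1, so the remaining error is one-sided (a 0 turned into a 1), which is the asymmetry that the list-recovery step relies on.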
Claim 7. Let $\mathbf{t}^{(b)} = f_{\tau \to b}(\mathbf{s})$. Then, $\mathbf{t}^{(b)} \ge \mathbf{t}$, and
$$d_H\left(\mathbf{t}^{(b)}, \mathbf{t}\right) \le \frac{dk}{4}.$$

We next generate a list $L$ of potentially infected individuals consistent with the outcome of the tests $\mathbf{t}^{(b)}$. The next lemma, which uses the same ideas as Lemma 1, describes an upper bound on the size of $L$.

Lemma 8.
Suppose that $\mathbf{s} \in \{0, 1, \ldots, \tau-1\}^{2m/\gamma}$ is the result of the tests in (2) and (3) and $\mathbf{t}^{(b)} = f_{\tau \to b}(\mathbf{s})$. Then the size of any list of defectives from $\mathcal{P}$ consistent with $\mathbf{t}^{(b)}$ is at most $O(d)$.

Proof:
Recall that in our setup the graph G , which is usedto construct B and also S p q , is an p α , β q -expander. Hence,every vertex in P has k neighbors and β ą k . As before,let I Ď P denote the set of infected individuals such that | I | ď d . Let t I P t
0, 1 u m be the output of the tests dictatedby B . We show that given a S Ď P such that S X I “ H and | S | ě O p d q , S Y I cannot be consistent with t p b q under B .Let S “ S Y I Ď P . Using the same arguments as in theproof of Lemma 1, we can show that the number of uniqueneighbors of S satisfies N u p S q ě k | S | Let E “ t j : t p b q j ą t j u . Since N p I q ď dk and | E | ď dk fromClaim 7, it follows that ˇˇˇ N u p S qzp I Y E q ˇˇˇ ě k | S | ´ p dk ` dk q , which implies that if | S | ą d , then there exists a uniqueneighbor of S which is not in error and is also not alreadycovered by an element in I . This implies that N p S q is notconsistent with t p b q .The following theorem follows from the previous discussionand from Claim 6 and Lemma 8. Theorem 9.
There exists a nonbinary two-stage GT scheme that, given $\tau = (4\gamma)^{\gamma}$ thresholds, uses $O\left(\frac{e\,d}{\gamma}\log\frac{n}{d}\right)$ tests and can identify a set of infected individuals of size at most $d$ in a population of size $n$.

Proof: We prove the result by describing a simple method for recovering a set $L$ of size $O(d)$ which contains the set of defectives $I$. First, we generate the vector $\mathbf{t}^{(b)} = f_{\tau \to b}(\mathbf{s})$ from our nonbinary test outcomes. We initialize $L = \emptyset$. Then, for every $p \in \mathcal{P}$, if $L \cup \{p\}$ is consistent with $\mathbf{t}^{(b)}$, we update $L = L \cup \{p\}$. Otherwise, we do not change $L$. At the end of this process we have $I \subseteq L$. Furthermore, according to Lemma 8, $|L| = O(d)$. The result now follows from Theorem 5.

IV. NONADAPTIVE
SQGTWe describe next constructive nonadaptive testing schemes,which in the asymptotic regime require at most O p d γ log n q tests, with τ “ p γ q γ . Our approach builds upon the construc-tion by Porat and Rothschild (PR construction) [18], whichmakes use of non-binary error-correcting codes. Our key resultis described in Lemma 13.Let C P F m { qq be a q -ary linear error-correcting code, where q is an odd prime, of minimum distance δ mq , q “ O p d q , anddimension log q p n q . The PR construction works by uniquelyassociating each individual in the population of size n witha codeword in C . Under this setup, the test matrix B p PR q “p b p c , x q , j q c Pr mq s , x Pr q ´ s , j Pr n s is defined as b p c , x q , j “ if x p j q c “ x ,0, otherwise,where x p j q is the j -th codeword of C . In words, the test indexedby p c , v q contains the codewords (individuals) from C whose c -th coordinate equals x .Our approach for designing a nonadaptive testing schemeis similar to that for the adaptive setting. Each test can begenerated by taking a linear combination of γ rows of B p PR q .The total number of tests equals mq r q γ s “ O p m γ q , and oncegain the tests are represented by S “ „ S p q S p q , where S p q “ ´ s p qp c , r q , j ¯ c P mq , r Pr r q γ s ´ s , j Pr n s , is defined as follows: s p qp c , r q , j “ if x p j q c “ rp r ´ q γ , min t r γ ´ q ´ us ,0, otherwise.In words, the test in S p q indexed by p c , r q contains thecodewords (individuals) from C whose c -th coordinate has avalue between p r ´ q γ and the minimum of r γ ´ q ´ .Note that the reason for using the minimum in the previousrange of values is a consequence of the fact that we assumed q to be an odd prime. The tests in S p q are defined similarly:Suppose that x p j q c “ p r ´ q γ ` v where v P r γ ´ s . Then, s p qp c , r q , j “ p γ q v . For shorthand, we refer to the codewords in the p c , r q -th testin S p q as T p c , r q Ď C . Claim 10.
Suppose that the number of infected individuals inthetestindexedby p c , r q isatmost γ ´ sothat ˇˇˇ T p c , r q X I ˇˇˇ ď γ ´ Then, given the output of the test T p c , r q we can uniquelydetermine ˇˇˇ x P I : x c “ x (ˇˇˇ , for x P rp r ´ q γ , r γ ´ s .Let n denote the all-ones vector of length n . We assumethat our code C is such that n P C . Henceforth, let I “ ! y ` i ¨ n : y P I ) (5)for all i P t´ γ `
1, . . . , ´
1, 0, 1, . . . , γ ´ u . Claim 11.
Let x P C z I besuchthat x P T p c , r q .Supposethatforaninteger ℓ wehave ˇˇˇ z P I : x c “ z c (ˇˇˇ ď ℓ . Then, ˇˇˇ T p c , r q X I ˇˇˇ ď ℓ . Proof:
This follows since if x P T p c , r q , then x c “ r p γ ´ q ` v for some v P r γ ´ s . If z P T p c , r q X I , then z c “ r p γ ´ q ` v for some v P r γ ´ s . Since v , v P r γ ´ s , it follows that z c ` p v ´ v q “ r p γ ´ q ` v “ x c where p v ´ v q P t´ γ `
1, . . . , ´
1, 0, 1, . . . , γ ´ u . This in turnimplies that z c ` p v ´ v q is the value of component c of avector from the set I .We also need the following result. Claim 12.
Supposethat x P C z I issuchthat x P T p c , r q .Ifthereexistsanindex c P r n s satisfying ˇˇˇ z P I : x c “ z c (ˇˇˇ ď γ ´ (6) and ˇˇˇ z P I : x c “ z c (ˇˇˇ “ (7)then given the output of the tests dictated by S p q , S p q we candeterminethat x R I . Proof:
From Claim 11 and if (6) holds, we have that ˇˇˇ I X T p c , r q (ˇˇˇ ď γ ´ . Then from Claim 10, since the numberof infected individuals in T p c , r q is at most γ ´ , we have ˇˇˇ z P I : z c “ x c (ˇˇˇ “ using the test outputs of T p c , r q . Lemma 13. If C hasminimumdistance δ ą ´ d ,thetests S uniquelydeterminetheset I ofdefectives. Proof:
According to Claim 12, we need to show that (6)and (7) hold for any x P C z I . We start by showing that (7)holds. In particular, we show a stronger claim that there existsa set C p q Ď mq of size at least m q ` where for any c P C p q ,we have x c ‰ y c , (8)where y “ p y , . . . , y mq q P I . Note that this implies that thenumber of coordinates of x which agree in value with anelement of I is at most m { q ´ . Since any two elements in C can agree in at most p ´ δ q mq coordinates and δ ą ´ d , itfollows that ˇˇˇ c : x c “ y c , y P I (ˇˇˇ ď d p ´ δ q mq ă m q . Next, we show that for at least one coordinate in C p q , (6)holds as well. First, note that ˇˇˇ!` y , c ˘ P I ˆ mq : x c “ y c )ˇˇˇ ď γ d p ´ δ q mq ă γ mq , so that for a randomly chosen coordinate c P mq , E ”ˇˇˇ y P I : x c “ y c u ˇˇˇı ă γ . Invoking Markov’s inequality we getPr ´ˇˇ y P I : x c “ y c u ˇˇ ě γ ¯ ă
14 .
Therefore, it follows that there exists a set of coordinates C p q Ď mq of size at least m q such that for any c P C p q ˇˇˇ! y P I : x c “ y c )ˇˇˇ ă γ . Since | C p q | ě m q and C p q ě m q ` , it follows that | C p q X C p q | ě . Letting c ˚ P C p q X C p q we have ˇˇˇ y P I : x c ˚ “ y c ˚ (ˇˇˇ ď γ ´ and ˇˇˇ y P I : x c ˚ “ y c ˚ (ˇˇˇ “ . By Claim 12,we conclude that x R I . Open Problems.
Despite only a small gap remaining between the lower bound and the actual constructions for the saturation model, many other problems remain open and include:
• Extending the nonadaptive and two-round constructions for general quantization thresholds under the SQGT model;
• Deriving bounds and test strategies for consecutive defective models [26], [27], as these capture the order of arrivals into testing queues;
• Addressing generalized binomial SQGT algorithms [28].

REFERENCES

[1] R. Dorfman, “The detection of defective members of large populations,” Annals of Mathematical Statistics, vol. 14, pp. 436–440, 1943.
[2] W. Kautz and R. Singleton, “Nonrandom binary superimposed codes,” IEEE Transactions on Information Theory, vol. 10, pp. 363–377, 1964.
[3] D.-Z. Du and F.-K. Hwang, Pooling Designs and Nonadaptive Group Testing. World Scientific, 2006.
[4] H. A. Inan, P. Kairouz, M. Wootters, and A. Özgür, “On the optimality of the Kautz-Singleton construction in probabilistic group testing,” IEEE Transactions on Information Theory, vol. 65, no. 9, pp. 5592–5603, 2019.
[5] J. Wolf, “Born again group testing: Multiaccess communications,” IEEE Transactions on Information Theory, vol. 31, no. 2, pp. 185–191, 1985.
[6] A. Dyachkov, “Lectures on designing screening experiments,” 2004, Lecture Note Series 10.
[7] P. Damaschke, “Threshold group testing,” in General Theory of Information Transfer and Combinatorics, ser. Lecture Notes in Computer Science, vol. 4123, 2006, pp. 707–718.
[8] B. Lindstrom, “Determining subsets by unramified experiments,” A Survey of Statistical Design and Linear Models, 1975.
[9] D.-Z. Du and F. Hwang, Combinatorial Group Testing and its Applications, 2nd ed. World Scientific, 2000.
[10] A. Emad, J. Shen, and O. Milenkovic, “Symmetric group testing and superimposed codes,” in , 2011, pp. 20–24.
[11] A. Emad and O. Milenkovic, “Semiquantitative group testing,” IEEE Transactions on Information Theory, vol. 60, no. 8, pp. 4614–4636, 2014.
[12] A. Emad and O. Milenkovic, “Code construction and decoding algorithms for semi-quantitative group testing with nonuniform thresholds,” IEEE Transactions on Information Theory, vol. 62, no. 4, pp. 1674–1687, 2016.
[13] A. G. D’yachkov and V. V. Rykov, “A coding model for a multiple-access adder channel,” Probl. Peredachi Inform., pp. 26–32, 1981, in Russian.
[14] A. Dyachkov and V. Rykov, “A survey of superimposed code theory,” Problems of Control and Information Theory, vol. 12, no. 4, pp. 229–242, 1983.
[15] R. Gabrys, S. Pattabiraman, V. Rana, J. Ribeiro, M. Cheraghchi, V. Guruswami, and O. Milenkovic, “AC-DC: Amplification curve diagnostics for Covid-19 group testing,” 2020, arXiv:2011.05223.
[16] F. Hwang, “A generalized binomial group testing problem,” Journal of the American Statistical Association, vol. 70, no. 352, pp. 923–926, 1975.
[17] H. Q. Ngo, E. Porat, and A. Rudra, “Efficiently decodable error-correcting list disjunct matrices and applications,” in International Colloquium on Automata, Languages, and Programming. Springer, 2011, pp. 557–568.
[18] E. Porat and A. Rothschild, “Explicit nonadaptive combinatorial group testing schemes,” IEEE Transactions on Information Theory, vol. 57, no. 12, pp. 7982–7989, 2011.
[19] P. Indyk, H. Q. Ngo, and A. Rudra, “Efficiently decodable non-adaptive group testing,” in Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 2010, pp. 1126–1142.
[20] M. Cheraghchi, “Noise-resilient group testing: Limitations and constructions,” Discrete Applied Mathematics, vol. 161, no. 1, pp. 81–95, 2013, preliminary version in Proceedings of the FCT 2009. arXiv manuscript published in 2008.
[21] A. De Bonis, L. Gasieniec, and U. Vaccaro, “Optimal two-stage algorithms for group testing problems,” SIAM Journal on Computing, vol. 34, no. 5, pp. 1253–1270, 2005.
[22] M. Cheraghchi and V. Nakos, “Combinatorial group testing schemes with near-optimal decoding time,” in Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2020.
[23] Z. Füredi, “On r-cover-free families,” Journal of Combinatorial Theory, Series A, vol. 73, no. 1, pp. 172–173, 1996.
[24] V. Guruswami, C. Umans, and S. Vadhan, “Unbalanced expanders and randomness extractors from Parvaresh-Vardy codes,” Journal of the ACM, vol. 56, no. 4, 2009.
[25] M. Capalbo, O. Reingold, S. Vadhan, and A. Wigderson, “Randomness conductors and constant-degree expansion beyond the degree/2 barrier,” in Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), 2002, pp. 659–668.
[26] T. V. Bui, M. Cheraghchi, and T. D. Nguyen, “Improved algorithms for non-adaptive group testing with consecutive positives,” arXiv preprint arXiv:2101.11294, 2021.
[27] C. J. Colbourn, “Group testing for consecutive positives,” Annals of Combinatorics, vol. 3, no. 1, pp. 37–41, 1999.
[28] F. Hwang, “A generalized binomial group testing problem,”