Group Testing Enables Asymptomatic Screening for COVID-19 Mitigation: Feasibility and Optimal Pool Size Selection with Dilution Effects

Abstract

Repeated asymptomatic screening for SARS-CoV-2 promises to control spread of the virus but would require too many resources to implement at scale. Group testing is promising for screening more people with fewer test resources: multiple samples tested together in one pool can be excluded with one negative test result. Existing approaches to group testing design for SARS-CoV-2 asymptomatic screening, however, do not consider dilution effects: that false negatives become more common with larger pools. As a consequence, they may recommend pool sizes that are too large or misestimate the benefits of screening. Modeling dilution effects, we derive closed-form expressions for the expected number of tests and false negatives/positives per person screened under two popular group testing methods: the linear and square array methods. We find that test error correlation induced by a common viral load across an individual's samples results in many fewer false negatives than would be expected from less realistic but more widely assumed independent errors. This insight also suggests that false positives can be controlled through repeated tests without significantly increasing false negatives. Using these closed-form expressions to trace a Pareto frontier over error rates and tests, we design testing protocols for repeated asymptomatic screening of a large population. We minimize disease prevalence by optimizing time-varying pool sizes and screening frequency, constrained by daily test capacity and a false positive limit. This provides a testing protocol practitioners can use for mitigating COVID-19. In a case study, we demonstrate the effectiveness of this methodology in controlling spread.


GROUP TESTING DURING THE COVID-19 PANDEMIC: OPTIMAL GROUP SIZE SELECTION AND PREVALENCE CONTROL

Yifan Lin

Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

Yuxuan Ren

Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

Jingyuan Wan

Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

Enlu Zhou

Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

August 18, 2020

ABSTRACT

Group testing pools multiple samples together and tests the pooled samples to identify the infected ones. It greatly reduces the number of tests, but at the cost of a higher false negative rate caused by dilution of the viral load in pooled samples. It is therefore important to balance the trade-off between the number of tests and the false negative rate. We explore two popular group testing methods, namely the linear array (a.k.a. Dorfman’s procedure) and square array methods, and analyze the optimal group size of a pooled sample that minimizes the group false negative number under a constraint on testing capacity. Our analysis shows that when the testing capacity is reasonably large, the linear array method yields a smaller false negative number and hence is preferred. When the testing capacity is small, the square array method is more feasible and preferred. In addition, we consider testing a closed community over a period of time and determine the optimal testing cycle that minimizes the final prevalence rate of infection at the end of the time period. Finally, we provide a testing protocol for practitioners to use these group testing methods in the COVID-19 pandemic.

Group testing refers to the idea of pooling multiple samples together and performing tests on certain subsets of these samples to identify the infected ones. Because of the resources and time required to run the RT-qPCR test for COVID-19, it is nearly impossible to test everyone individually in a relatively large population. Group testing instead provides a promising way to save testing budget while still detecting the infected samples in a large population. Dating back to [1], many different group testing methods have been developed and analysed over the years, such as those based on compressed sensing (see [2], [3]) and information theory (see [4], [5]). The COVID-19 pandemic has reignited interest in group testing, both in academic research (e.g., [6], [7], [8], [9], [10], [11], [12], [13] and [14]) and among the public (e.g., [15] and [16]).

To illustrate group testing, let us first look at a simple method called binary search. Suppose there are eight people to test and only one person is infected (unknown to us). Since we do not know that there is only one infected person, individual testing would require testing all eight. In binary search, we first divide the people into two groups of four. For the first four people, we pool their samples together and carry out one test. Suppose it is negative; we then conclude that these four people are not infected, and conduct a group test for the second group of four people. Suppose the result for the second group is positive; we further divide these four people into two halves and repeat the above procedure. Here binary search uses only six tests, fewer than the eight tests required by individual testing.

There are two paradigms of group testing, as illustrated in [17]. Combinatorial group testing (CGT) assumes a known number of infected people among the tested population. Probabilistic group testing (PGT) assumes that people test positive independently with probability $p$. From the perspective of the sequence of tests, group testing can be classified as adaptive or non-adaptive. In adaptive testing, a group is first chosen randomly and tested, the outcome of this test determines the next group to test, and so on. In non-adaptive testing, a fixed number of tests is always performed irrespective of the number of infected samples present in the pool, so all the tests can run in parallel. In particular, [18] formulates it as a two-stage process. At the first stage, all individuals are arranged into several groups and each group gets tested. At the second stage, the positive/negative result for each individual is deduced.

However, the aforementioned work usually assumes that the testing result is accurate. In other words, when samples are pooled together and tested, if there is at least one infected sample, then the test result is always positive. Under this assumption, the main goal in developing group testing methods is naturally to minimize the total number of tests. [7] gives a scheme for comparing several different group testing methods, and shows that the prevalence rate is the major factor that determines the number of tests of a group testing method. In other words, the performance of group testing varies when the prevalence rate changes. [19], [20] discuss the relationship between these quantities. However, when individual samples are pooled together, the viral load of the infected samples gets diluted, which leads to false negatives (i.e., infected samples test negative). We take this dilution effect into consideration, and consider how to optimize the group size to balance the trade-off between the number of tests and the false negative rate. On the one hand, the larger the group size, the fewer tests are needed. On the other hand, a larger group size also causes a larger false negative rate. When the group size is too large, the false negative rate becomes so large that it renders the group testing method useless. Therefore, we aim to choose the optimal pool size that minimizes the false negative number under a constraint on test capacity.

In addition, we consider testing everyone in a large population in a testing cycle of multiple days to help control the prevalence rate. To model the dynamics of the system, we propose a testing-quarantine-infection model, where group testing is conducted at the very beginning of each day and people who test positive get quarantined, while the infection keeps spreading. This model requires that testing can be conducted in a short period of time, so we mainly consider non-adaptive group testing methods that can run in parallel. Specifically, we consider the linear array and square array group testing methods, which are easy to understand and implement in practice.

In summary, the contribution of this paper is two-fold:

• We take into consideration the dilution effect caused by pooling, balance the trade-off between the number of tests and the false negative number, and derive the optimal group size under a stochastic optimization formulation.
• We design a testing protocol such that we can test everyone in a testing cycle under a limited testing capacity and keep the prevalence rate low in the population.

The remainder of the paper is organized as follows: Section 2 briefly introduces the dilution effect in group testing. Section 3 studies the linear array and square array methods, and derives the optimal group size that minimizes the false negative number under a constraint on daily testing capacity. Section 4 proposes a testing-quarantine-infection model for group testing, and determines the optimal testing cycle length through simulation. Sensitivity analysis with regard to different parameters is also conducted in this section. Section 5 concludes the paper with the complete testing protocol for practitioners.
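As a minimal sketch of the eight-person binary search example from the introduction, the following Python snippet counts the pooled tests used. It assumes error-free tests (no dilution effect), and the function names are illustrative rather than taken from the paper.

```python
def split_and_test(infected, lo, hi):
    """Test the pool of people lo..hi-1, halving positive pools recursively.

    Returns (number of tests used, set of people identified as infected).
    Tests are assumed perfectly accurate, which ignores dilution.
    """
    tests = 1  # one test for the current pool
    if not any(lo <= i < hi for i in infected):
        return tests, set()          # negative pool: everyone is cleared
    if hi - lo == 1:
        return tests, {lo}           # positive pool of size one: infected
    mid = (lo + hi) // 2
    t_left, found_left = split_and_test(infected, lo, mid)
    t_right, found_right = split_and_test(infected, mid, hi)
    return tests + t_left + t_right, found_left | found_right


def binary_search(infected, n_people=8):
    """Start from two half-size groups, as in the eight-person example."""
    half = n_people // 2
    t1, f1 = split_and_test(infected, 0, half)
    t2, f2 = split_and_test(infected, half, n_people)
    return t1 + t2, f1 | f2


tests_used, found = binary_search({5})
print(tests_used, found)  # 6 {5}
```

With a single infected person among eight, this procedure uses six tests wherever that person sits, which matches the count quoted in the example.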

Even though group testing is efficient in detecting infected people with as few tests as possible, it comes at the cost of decreased sensitivity. In group testing, the false negative rate, defined as the probability that an infected person tests negative, comes from two sources.

• Swabbing. The swab used to take samples might not collect a sufficient amount of virus from an infected person.
• Pooling. When many samples are pooled into one, the infected samples are mixed with uninfected samples, which dilutes the viral load in the pooled sample and hence decreases the sensitivity of the test.

We focus on the second source, pooling dilution. The impact of pooling dilution has been studied in [21], [22], [23], etc. To the best of our knowledge, pooling dilution specifically for SARS-CoV-2 has been studied in [17] and [10] up to the time of completion of this manuscript. More specifically, [17] utilizes a dilution effect model derived for HIV testing, and argues that it is applicable to testing for other viruses, including SARS-CoV-2. Most recently, based on the mechanism of the RT-qPCR test and clinical data for the SARS-CoV-2 virus, [10] proposes a new statistical model for determining the false negative rate induced by pooling. The basic idea is to measure the false negative rate as the probability that the amount of virus contained in the pooled sample falls below the detection limit of the RT-qPCR test. Assuming exactly one infected individual is pooled with $N - 1$ uninfected individuals to form a group of size $N$, the false negative rate caused by pooling is estimated as

$$\gamma = 1 - \sum_{k=1}^{3} \pi_k \, \frac{F_{\mu_k,\sigma_k}(d_{\mathrm{cens}} - \log_2 N)}{F_{\mu_k,\sigma_k}(d_{\mathrm{cens}})},$$

where $F_{\mu_k,\sigma_k}(\cdot)$, $k = 1, 2, 3$, is the cumulative distribution function (CDF) of the normal distribution $\mathcal{N}(\mu_k, \sigma_k)$, and $d_{\mathrm{cens}}$ is a parameter called the detection limit. For more details, such as mathematical derivations and parameter settings, we refer the readers to Appendix D and [10]. Figure 1 shows the relationship between the pooling size $N$ and the false negative rate $\gamma$. Note that in this so-called "uncensored model" in [10], the false negative rate from the first source, swabbing, is neglected.

Figure 1: False negative rate $\gamma$ caused by pooling dilution effect.

Remark 1.

The results throughout the paper are based on the uncensored model in [10]. If more precise models for the pooling dilution effect become available, they can replace the current model in the approach proposed in this paper.
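To make the dilution formula for $\gamma$ concrete, here is a hedged Python sketch that evaluates it for one infected sample in a pool of size $N$. The mixture weights, means, standard deviations and detection limit below are placeholder values chosen only for illustration; they are not the fitted parameters of [10]. The base-2 logarithm is also an assumption of this sketch (one extra PCR cycle per two-fold dilution).

```python
from math import erf, log2, sqrt

# Placeholder parameters, NOT the fitted values from [10].
WEIGHTS = (0.3, 0.5, 0.2)     # pi_k, assumed to sum to one
MUS = (20.1, 29.4, 34.8)      # mu_k
SIGMAS = (3.6, 2.9, 1.3)      # sigma_k, used here as standard deviations
D_CENS = 35.6                 # detection limit d_cens


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and std. dev. sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def pooling_false_negative_rate(pool_size, weights=WEIGHTS, mus=MUS,
                                sigmas=SIGMAS, d_cens=D_CENS):
    """gamma for one infected sample pooled with pool_size - 1 clean ones."""
    shifted = d_cens - log2(pool_size)   # assumed base-2 dilution shift
    ratio = sum(
        w * normal_cdf(shifted, mu, s) / normal_cdf(d_cens, mu, s)
        for w, mu, s in zip(weights, mus, sigmas)
    )
    return 1.0 - ratio


for size in (1, 2, 4, 8, 16, 32, 64):
    print(size, round(pooling_false_negative_rate(size), 4))
```

Under these assumptions the rate is zero for an unpooled sample and grows with the pool size, which is the qualitative behaviour described for Figure 1.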

Among the various group testing methods, we focus on two: the linear array method and the square array method. Both are non-adaptive, parallelizable, and easy to understand and implement in practice. We first introduce these two methods, and then evaluate and compare their performance. Specifically, we compare the number of tests needed and the false negative number under different parameter regimes of testing capacity and initial prevalence rate.

In linear array group testing, suppose the total number of people to test is $N$ and the desired group size is $n$ ($n < N$). Note that the corresponding pool size is also $n$. We form $\lceil N/n \rceil$ linear array groups and perform a group test on each group. Figure 2 shows a linear array of size $n = 5$. Note that if $N$ is not divisible by $n$, $\lfloor N/n \rfloor$ of the groups are of size $n$ and the last group is of size $N - \lfloor N/n \rfloor n$. If a group tests positive, we test each member of the group individually. Without the follow-up diagnostic tests, the non-infected samples in the positive group would be deemed infected, resulting in false positives. Notice that the linear array method requires two samples from each person. If only one sample is collected, it is split into two sub-samples, one used for group testing and the other for a possible follow-up diagnostic test.

Figure 2: Placement of samples in a linear array of size 5.

Another group testing method is the square array group test used by [12]. Note that this method is almost the same as the double pooling proposed by [9]. In square array group testing, suppose the total number of people to test is $N$ and we consider square arrays of size $n \times n$. Pools of size $n$ are created from each row and each column. We form $\lfloor N/n^2 \rfloor$ square arrays of size $n \times n$. A sample is deemed suspicious if both its row pool and its column pool test positive. See Figure 3 for a toy example. Before we form each group, we need to swab three samples from each tested individual. If we only have one sample per individual, it is split into three sub-samples: one for row testing, one for column testing, and one for a possible follow-up diagnostic test when the person is deemed suspicious in the group testing phase. For the remaining $N - \lfloor N/n^2 \rfloor n^2$ people, we conduct individual tests. The follow-up diagnostic test is indispensable, since the square array test may produce false positives (the false positive number is defined as the number of non-infected samples that test positive). To illustrate, consider the case where only the samples labeled '1' and '13' are infected in the $5 \times 5$ square array in Figure 3. It is very likely that four pools will test positive, namely the first and third row pools and the first and third column pools. If we do not perform the follow-up diagnostic tests, the samples labeled '3' and '11' will be deemed infected, though neither of them is actually infected.

Figure 3: Placement of samples in a $5 \times 5$ square array.

We formally define the following notation:

• Total number of people to test: $N$.
• Initial prevalence rate: $p$. It measures the probability that an individual is infected.
• Pool size in group testing: $n$. It is the number of samples that are pooled together in one test.
• Maximum pool size: $\bar{n}$. For the linear array test, $\bar{n} = N$. For the square array test, $\bar{n} = \lfloor \sqrt{N} \rfloor$.
• Optimal pool size: $n^*$, with $n^* \le \bar{n}$. The pool size that minimizes the expected total false negative number, subject to the constraint on the expected total number of tests.
• Number of pools: $M_P(n)$. This is also the number of tests conducted in the group testing phase. For the linear array test, the number of pools is $\lceil N/n \rceil$. For the square array test, the number of pools is $2n \lfloor N/n^2 \rfloor$.
• Group size in group testing: $g(n)$. For the linear array test, $g = n$. For the square array test, $g = n \times n$.
• False negative rate caused by the dilution effect: $\gamma(n, d)$. It is a function of the pool size $n$ and the number of infected samples $d$ in a pooled sample.
• Testing capacity: $C$. It is the maximum number of tests we can conduct.
• Randomness of the sample placements and pool positiveness: $(\omega, \xi) \in \mathbb{R}^{N} \times \mathbb{R}^{M_P}$. The random variable $\omega$ represents the placement of infected samples; $\omega_i = 1$ means the $i$th sample is infected. The random variable $\xi$ represents the positiveness of each pool, and hence $\xi$ depends on $\omega$. We need $\xi$ because, given $\omega$, we still cannot tell whether a pool tests positive, owing to the dilution effect. $\xi_i = 1$ means pool $i$ tests positive, and $\xi_i = 0$ means pool $i$ tests negative.
• Number of tests conducted individually: $M_I(n, (\omega, \xi))$. Suppose we have a realization of the random variables $(\omega, \xi)$. For the linear array test, it tells us the numbering of the pools that test positive. For the square array test, it tells us the numbering of the samples at the intersection of positive rows and columns. Construct a suspect set $S$ that contains the above numbers. For the linear array test, $M_I$ is $|S|$. For the square array test, $M_I$ is $|S| + (N - \lfloor N/n^2 \rfloor n^2)$.
• Number of follow-up diagnostic tests following a single group test: $m(n, (\omega, \xi))$.
• Number of total tests: $M(n, (\omega, \xi))$, with $M = M_P + M_I$.
• Number of people that test positive: $I(n, (\omega, \xi))$. Given the set $S$ and $\omega$, $I = \sum_{i \in S} \mathbb{1}\{\omega_i = 1\}$.
• Number of infected people: $D$. $D$ follows a binomial distribution with parameters $N$ and $p$.
• False negative number: $f(n, (\omega, \xi)) = D - I$.
• Value-at-Risk: $\mathrm{VaR}_{(\cdot)}[\cdot]$. $\mathrm{VaR}_q[\cdot]$ can be used to measure the extreme quantile $q$: $\mathrm{VaR}_q[Z] := \inf\{t : P(Z \le t) \ge q\}$, where $0 < q < 1$.

In the following sections, we assume that the events that each individual gets infected are mutually independent.
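The square array decoding rule (a sample is suspicious when both its row pool and its column pool test positive) can be sketched in a few lines of Python. The sketch assumes error-free pool tests and row-by-row labeling of the samples starting at 1, which is how the '1' / '13' example reads; the function name is illustrative.

```python
def square_array_suspects(n, infected):
    """Labels of suspicious samples in an n x n array.

    Samples are labeled 1..n*n row by row (an assumption of this sketch).
    Pool tests are taken to be perfectly accurate, so a row or column pool
    is positive exactly when it contains an infected sample.
    """
    positive_rows = {(label - 1) // n for label in infected}
    positive_cols = {(label - 1) % n for label in infected}
    # A sample is suspicious if it sits on a positive row AND a positive column.
    return sorted(r * n + c + 1 for r in positive_rows for c in positive_cols)


print(square_array_suspects(5, {1, 13}))  # [1, 3, 11, 13]
```

The two extra labels, 3 and 11, are the false suspects that the follow-up diagnostic tests are meant to clear.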
We also suppress the randomness $(\omega, \xi)$ in the notation. We use the superscripts '$L$' for the linear array method, '$S$' for the square array method, and '$B$' for the benchmark of purely individual testing. For subscripts, '$I$' stands for individual test and '$G$' stands for group test. Finally, in order to find the optimal group size, we first find the optimal pool size $n^*$; the optimal group size is then $g(n^*)$. The following subsections focus on finding the optimal pool size.

In this subsection, we evaluate the expected number of tests and the expected total false negative number in the linear array group testing method. We leave the details of the derivation to Appendix A.

Denote by $\mathbb{E}[m^L(n)]$ the expected number of follow-up diagnostic tests for a single linear array group test of size $n$:

$$\mathbb{E}[m^L(n)] = n \sum_{d=1}^{n} \bigl(1 - \gamma(n, d)\bigr) \binom{n}{d} (1-p)^{n-d} p^{d}.$$

Denote by $\mathbb{E}[M^L(n)]$ the total expected number of tests for the linear array method. Note that $\mathbb{E}[M^L(n)]$ is lower-bounded by $\lceil N/n \rceil$, the total number of groups.

$$\mathbb{E}[M^L(n)] = \left\lceil \frac{N}{n} \right\rceil + \left\lfloor \frac{N}{n} \right\rfloor \mathbb{E}[m^L(n)] + \mathbb{E}\left[m^L\left(N - \left\lfloor \frac{N}{n} \right\rfloor n\right)\right] \tag{1}$$

Denote by $\mathbb{E}[f^L_G(n)]$ the expected false negative number for a single linear array group test of size $n$:

$$\mathbb{E}[f^L_G(n)] = \sum_{d=1}^{n} d \, \gamma(n, d) \binom{n}{d} (1-p)^{n-d} p^{d}.$$

For the follow-up diagnostic tests following a single linear array group test of size $n$, the expected false negative number is

$$\mathbb{E}[f^L_I(n)] = \gamma(1, 1) \sum_{d=1}^{n} d \, \bigl(1 - \gamma(n, d)\bigr) \binom{n}{d} (1-p)^{n-d} p^{d}.$$

The expected total false negative number, including all follow-up diagnostic tests due to $\lfloor N/n \rfloor$ arrays of size $n$ and one array of size $N - \lfloor N/n \rfloor n$, is

$$\mathbb{E}[f^L(n)] = \left\lfloor \frac{N}{n} \right\rfloor \Bigl( \mathbb{E}[f^L_I(n)] + \mathbb{E}[f^L_G(n)] \Bigr) + \left( \mathbb{E}\left[f^L_I\left(N - \left\lfloor \frac{N}{n} \right\rfloor n\right)\right] + \mathbb{E}\left[f^L_G\left(N - \left\lfloor \frac{N}{n} \right\rfloor n\right)\right] \right) \tag{2}$$

The closed-form expressions above give the expected values of the total number of tests and the false negative number. To see the variability of these quantities, we run simulations to plot the empirical distributions, as shown in Figure 4.

Figure 4: Distribution of total number of tests and false negative number in linear array group testing, when $n = 25$, $N = 10000$, $p = 0.001$, and $C = 600$. Simulations run 5000 times.

In this subsection, we evaluate the expected number of tests and the expected total false negative number in the square array group testing method. We leave the details of the derivation to Appendix B.

Denote by $\mathbb{E}[m^S(n)]$ the expected number of follow-up diagnostic tests for a single square array of size $n \times n$:

$$\mathbb{E}[m^S(n)] = n^2 \left( \left( \sum_{d=0}^{n-1} \bigl(1 - \gamma(n, d+1)\bigr) \binom{n-1}{d} (1-p)^{n-1-d} p^{d} \right)^{2} p + \left( \frac{\sum_{d=1}^{n-1} \bigl(1 - \gamma(n, d)\bigr) \binom{n-1}{d} (1-p)^{n-1-d} p^{d}}{1 - (1-p)^{n-1}} \right)^{2} \Bigl( (1-p) - 2(1-p)^{n} + (1-p)^{2n-1} \Bigr) \right).$$

As for the expected total number of tests, denote it by $\mathbb{E}[M^S(n)]$. Note that $\mathbb{E}[M^S(n)]$ is lower-bounded by $2n \lfloor N/n^2 \rfloor + (N - \lfloor N/n^2 \rfloor n^2)$, which is the number of tests for all row and column groups plus all individual tests for the remaining samples outside the arrays.

$$\mathbb{E}[M^S(n)] = \left\lfloor \frac{N}{n^2} \right\rfloor \Bigl( \mathbb{E}[m^S(n)] + 2n \Bigr) + \left( N - \left\lfloor \frac{N}{n^2} \right\rfloor n^2 \right) \tag{3}$$

Denote by $\mathbb{E}[f^S_G(n)]$ the expected false negative number for a single square array of size $n \times n$:

$$\mathbb{E}[f^S_G(n)] = n^2 p \left( 1 - \left( \sum_{d=0}^{n-1} \bigl(1 - \gamma(n, d+1)\bigr) \binom{n-1}{d} (1-p)^{n-1-d} p^{d} \right)^{2} \right).$$

Denote by $\mathbb{E}[f^S_I(n)]$ the expected false negative number for the follow-up diagnostic tests following a single square array of size $n \times n$:

$$\mathbb{E}[f^S_I(n)] = n^2 p \left( \sum_{d=0}^{n-1} \bigl(1 - \gamma(n, d+1)\bigr) \binom{n-1}{d} (1-p)^{n-1-d} p^{d} \right)^{2} \gamma(1, 1).$$

The expected total false negative number, including all tests due to the $\lfloor N/n^2 \rfloor$ square arrays and the remaining samples outside the arrays, is

$$\mathbb{E}[f^S(n)] = \left\lfloor \frac{N}{n^2} \right\rfloor \Bigl( \mathbb{E}[f^S_G(n)] + \mathbb{E}[f^S_I(n)] \Bigr) + \left( N - \left\lfloor \frac{N}{n^2} \right\rfloor n^2 \right) p \, \gamma(1, 1) \tag{4}$$

Figure 5 shows the empirical distribution of the total number of tests and of the false negative number for the square array method. Notice that in all the group testing methods, the total expected number of tests and the total expected false negative number depend on the pool size. To compare the methods, we consider their total false negative number under the same testing capacity, in a stochastic programming formulation.

Figure 5: Distribution of total number of tests and false negative number in square array group testing, when $n = 50$, $N = 10000$, $p = 0.001$, and $C = 600$. Simulations run 5000 times.
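As a rough illustration of how expressions (1) to (4) can be evaluated numerically, the Python sketch below takes the dilution model as a user-supplied function `gamma(n, d)`. Everything here is an assumption of the sketch rather than the authors' code: the function names, the `toy_gamma` placeholder (which is not the model of [10]), and the reading of the second term of $\mathbb{E}[m^S(n)]$, which is coded in the algebraically simplified form $(1-p)\,b^2$.

```python
from math import exp, lgamma, log, log1p, log2


def binom_pmf(n, d, p):
    """P(d infected among n), computed in log space; requires 0 < p < 1."""
    return exp(lgamma(n + 1) - lgamma(d + 1) - lgamma(n - d + 1)
               + d * log(p) + (n - d) * log1p(-p))


def linear_pool(n, p, gamma):
    """(E[m^L(n)], E[f^L_I(n)] + E[f^L_G(n)]) for one pool of size n."""
    m = f_group = f_indiv = 0.0
    for d in range(1, n + 1):
        w = binom_pmf(n, d, p)
        m += n * (1 - gamma(n, d)) * w
        f_group += d * gamma(n, d) * w
        f_indiv += gamma(1, 1) * d * (1 - gamma(n, d)) * w
    return m, f_group + f_indiv


def linear_totals(N, n, p, gamma):
    """Expressions (1) and (2): expected total tests and false negatives."""
    full, rest = divmod(N, n)
    m, f = linear_pool(n, p, gamma)
    m_rest, f_rest = linear_pool(rest, p, gamma)   # zero when rest == 0
    n_pools = full + (1 if rest else 0)            # ceil(N / n)
    return n_pools + full * m + m_rest, full * f + f_rest


def square_totals(N, n, p, gamma):
    """Expressions (3) and (4) for n x n arrays (intended for n >= 2)."""
    # a: P(pool tests positive | a given member of the pool is infected)
    a = sum((1 - gamma(n, d + 1)) * binom_pmf(n - 1, d, p) for d in range(n))
    # b: P(pool tests positive | a given member of the pool is NOT infected)
    b = sum((1 - gamma(n, d)) * binom_pmf(n - 1, d, p) for d in range(1, n))
    m = n * n * (p * a ** 2 + (1 - p) * b ** 2)
    f_group = n * n * p * (1 - a ** 2)
    f_indiv = n * n * p * a ** 2 * gamma(1, 1)
    arrays = N // (n * n)
    rest = N - arrays * n * n                      # tested individually
    tests = arrays * (m + 2 * n) + rest
    false_neg = arrays * (f_group + f_indiv) + rest * p * gamma(1, 1)
    return tests, false_neg


def toy_gamma(n, d):
    """Placeholder dilution model for illustration only (not from [10])."""
    return min(0.9, 0.06 * log2(n / d))


print(linear_totals(10_000, 25, 0.001, toy_gamma))
print(square_totals(10_000, 50, 0.001, toy_gamma))
```

With a perfect test (`gamma` identically zero) the linear array total reduces to the classical Dorfman count $\lceil N/n \rceil + N\,(1 - (1-p)^n)$ when $n$ divides $N$, which gives a quick sanity check on the implementation.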
In this subsection, we measure the performance of the group testing methods (linear array and square array) with respect to the total false negative number. The objective is to minimize the expected total false negative number, subject to the expected total number of tests not exceeding a given testing capacity. Recall that in Section 3, the total number of tests is $M(n)$ and the false negative number is $f(n)$. With the same notation, we formulate the problem of selecting the optimal pool size as follows:

$$\min_{n \in \{1, 2, \dots, \lfloor \bar{n} \rfloor\}} \mathbb{E}[f(n)] \quad \text{subject to} \quad \mathbb{E}[M(n)] \le C \tag{5}$$

To solve (5), we use the closed-form expressions for the expected total number of tests (i.e., (1), (3)) to find the feasible region of the decision variable $n$, and evaluate the objective function using the closed-form expressions for the expected total false negative number (i.e., (2), (4)).

Under the above formulation, we compare the two group testing methods against the benchmark of individual testing. Fix the population size to be $N = 10000$. Under different values of the initial prevalence rate $p$ and testing capacity $C$, we find the optimal pool size for each method and the corresponding total false negative number. Since the expected total number of infected samples is the same for both group testing methods, a higher expected false negative number also means a higher false negative rate. As a benchmark, we randomly select $C$ samples from the total $N$ samples for individual testing, ignoring the rest. The expected false negative number of benchmark individual testing can be approximated by $\mathbb{E}[f^B] \approx (N - C)p$. We say a group testing method is in its infeasible region if there is no pool size $n$ for which the expected number of tests required is at most $C$.

Figure 6: Comparison of false negative number between linear array group testing, square array group testing, and benchmark individual testing under different settings of $p$ and $C$.
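Because the decision variable in (5) is a single integer, the problem can be solved by plain enumeration. The sketch below is a minimal illustration, not the authors' implementation: `select_pool_size` is a hypothetical helper, and the two toy functions stand in for the closed-form expressions using a crude model in which every person sits in a pool of size $n$ and the dilution rate depends on $n$ only.

```python
from math import ceil, log2

N, P = 10_000, 0.001   # population size and prevalence used in the toy model


def select_pool_size(candidates, expected_tests, expected_fn, capacity):
    """Enumerate pool sizes; return the feasible one with the fewest
    expected false negatives, or None if no candidate is feasible."""
    feasible = [n for n in candidates if expected_tests(n) <= capacity]
    if not feasible:
        return None
    return min(feasible, key=expected_fn)


def toy_gamma(n):
    """Placeholder dilution rate (illustrative only, not the model of [10])."""
    return min(0.9, 0.06 * log2(n))


def toy_expected_tests(n):
    """Pools plus detected-pool follow-ups, ignoring the remainder pool."""
    return ceil(N / n) + N * (1 - toy_gamma(n)) * (1 - (1 - P) ** n)


def toy_expected_fn(n):
    """Expected infected people missed because their pool tested negative."""
    return N * P * toy_gamma(n)


for capacity in (100, 600, 1000):
    print(capacity, select_pool_size(range(1, N + 1), toy_expected_tests,
                                     toy_expected_fn, capacity))
```

In this toy model the false negative count grows with the pool size, so the optimum is simply the smallest feasible pool, and a capacity that is too small leaves no feasible pool size at all.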
Figure 6 shows the expected false negative number under the optimal pool size for all three methods, leaving the infeasible region blank. In summary, both methods perform better than the benchmark individual testing. The square array method has a larger feasible region than the linear array method, since it requires fewer tests on average. However, within its feasible region, the linear array method has a uniformly smaller false negative number, and thus a smaller false negative rate, than the square array method.

Figure 7: The relationship between the pool size and the total number of tests, as well as the total false negative number, for the linear array method and the square array method when $N = 10000$, $p = 0.001$.

To see how the testing capacity $C$ affects the performance of both group testing methods, we fix $p = 0.001$ and show in more detail how the test capacity affects the testing results for all methods. Figure 7 shows the relationship between the pool size and the total number of tests, as well as the relationship between the pool size and the total false negative number, for both methods. We obtain $\mathrm{VaR}_{0.95}[\cdot]$ (orange curves) by running simulations 5000 times, and obtain $\mathbb{E}[\cdot]$ (blue curves) from expressions (1) to (4). In the two graphs on the left side of Figure 7, the pool sizes whose test numbers lie below the testing capacity $C$ (i.e., the horizontal lines) are feasible. In the two graphs on the right side of Figure 7, we find the minimal total false negative number over the feasible pool sizes. Take the square array method for example. Setting $C = 300$ gives optimal pool size $n^* = 100$ and minimal mean total false negative number $f^* = 5.263$.

| $C$ | $n^*$ | $\mathbb{E}[M^L(n^*)]$ | $\mathrm{VaR}_{0.95}[M^L(n^*)]$ | $\mathbb{E}[f^L(n^*)]$ | $\mathrm{VaR}_{0.95}[f^L(n^*)]$ |
|---|---|---|---|---|---|
| 100 | infeasible | infeasible | infeasible | infeasible | infeasible |
| 200 | infeasible | infeasible | infeasible | infeasible | infeasible |
| 300 | infeasible | infeasible | infeasible | infeasible | infeasible |
| 400 | infeasible | infeasible | infeasible | infeasible | infeasible |
| 500 | infeasible | infeasible | infeasible | infeasible | infeasible |
| 600 | 25.0 | 598.798 | 726.0 | 2.027 | 5.0 |
| 700 | 19.0 | 681.863 | 774.0 | 1.814 | 4.0 |
| 800 | 15.0 | 792.052 | 862.0 | 1.636 | 4.0 |
| 900 | 13.0 | 879.649 | 939.0 | 1.529 | 4.0 |
| 1000 | 12.0 | 935.955 | 1002.0 | 1.474 | 4.0 |

Table 1: Linear array method: the impact of test capacity on the performance of the test. $p = 0.001$.

| $C$ | $n^*$ | $\mathbb{E}[M^S(n^*)]$ | $\mathrm{VaR}_{0.95}[M^S(n^*)]$ | $\mathbb{E}[f^S(n^*)]$ | $\mathrm{VaR}_{0.95}[f^S(n^*)]$ |
|---|---|---|---|---|---|
| 100 | infeasible | infeasible | infeasible | infeasible | infeasible |
| 200 | infeasible | infeasible | infeasible | infeasible | infeasible |
| 300 | 100.0 | 246.559 | 308.0 | 5.263 | 9.0 |
| 400 | 100.0 | 246.559 | 308.0 | 5.263 | 9.0 |
| 500 | 50.0 | 418.207 | 441.0 | 4.453 | 8.0 |
| 600 | 50.0 | 418.207 | 441.0 | 4.453 | 8.0 |
| 700 | 50.0 | 418.207 | 441.0 | 4.453 | 8.0 |
| 800 | 30.0 | 771.106 | 783.0 | 3.801 | 7.0 |
| 900 | 25.0 | 810.008 | 820.0 | 3.618 | 7.0 |
| 1000 | 25.0 | 810.008 | 820.0 | 3.618 | 7.0 |

Table 2: Square array method: the impact of test capacity on the performance of the test. $p = 0.001$.

| $C$ | $\mathbb{E}[f^B]$ | $\mathrm{VaR}_{0.95}[f^B]$ |
|---|---|---|
| 100 | 9.962 | 15 |
| 200 | 9.83 | 15 |
| 300 | 9.688 | 15 |
| 400 | 9.499 | 15 |
| 500 | 9.569 | 15 |
| 600 | 9.415 | 15 |
| 700 | 9.384 | 15 |
| 800 | 9.24 | 15 |
| 900 | 9.097 | 14 |
| 1000 | 9.003 | 14 |

Table 3: Benchmark individual testing. $p = 0.001$.

Table 1 and Table 2 summarize the testing results for the linear array group test and the square array group test, respectively. In general, increasing the testing capacity yields more feasible solutions and a smaller mean total false negative number. Again, we calculate the benchmark false negative number by randomly selecting $C$ of the $N$ samples for benchmark individual testing, ignoring the rest. Table 3 shows the benchmark result for comparison. Intuitively, given the same pool size for the two group testing methods, the square array method will have a larger false negative number because each sample in the square array method is tested in both the row pool and the column pool. Because of the dilution effect, the probability that an infected sample tests negative in the square array group test is therefore larger. In conclusion, if we are given a relatively large test capacity, we should use the linear array method. On the other hand, if we face a shortage of testing kits and only have a relatively small test capacity, we prefer the square array group test.

Remark 2.

We also notice a widely applied group testing method called general binary search (GBS). GBS is an adaptive group testing method that takes a long time to conduct, and the number of swab samples required from each individual is large and uncertain. Therefore, we leave the details of GBS to Appendix C for those who are interested.

Remark 3.

In case there are no closed-form expressions for the objective function and constraints, we can also apply sample average approximation (SAA) to both the constraints and the objective function, and solve the approximate deterministic problem in order to find the feasible region of the decision variable $n$ and evaluate the objective function to find the optimal pool size within the feasible region. We refer the readers to [24] and [25] for the application of SAA to stochastic programming.

Remark 4.

We also consider some other performance metrics for the total false negative number and the total number of tests, from a risk-averse perspective. For example, the following two formulations can be applied.

• $\min_{n \in \{1, 2, \dots, \bar{n}\}} \mathbb{E}[f(n)]$ subject to $\mathrm{VaR}_q[M(n)] \le C$, $q \in (0, 1)$
• $\min_{n \in \{1, 2, \dots, \bar{n}\}} \mathrm{VaR}_q[f(n)]$ subject to $\mathrm{VaR}_q[M(n)] \le C$, $q \in (0, 1)$

In this section, we consider testing a large population in a closed community such as a college or a nursing home. Due to limited daily testing capacity, we can only test the whole population over a testing cycle of multiple days. The daily model in Section 3 chooses the optimal group size to minimize the group false negative number in a day. The group false negative number affects the number of people quarantined, which in turn impacts the prevalence rate on the next day. This influence propagates and eventually impacts the final prevalence rate. Therefore, we propose a testing-quarantine-infection model for this scenario. Before we describe this model, we first make the following assumptions:

• Among the untested population, people are randomly chosen for testing.
• We use a simplified model for the infection process, where prevalence increases exponentially without intervention.
• We assume the tests are conducted in the morning, and results are revealed immediately. Following the test results, people who test positive are assumed to be quarantined, either by self-quarantine or hospitalization. Thus, those who test positive are removed from the population in the morning right after the testing.
• We assume that people are within a closed community with no infections being imported. As a consequence, once all infected individuals are quarantined, the pandemic ends.
• We assume that there are no false positives.

We introduce more sets of notations.
The first set of notations are the static quantities during the $T$-day testing:
• $N_{\text{total}}$: total number of individuals in the closed community.
• $T$: length of time period for group testing.
• $l$: testing cycle length. Note that $l \le T$.
• $\alpha$: daily growth rate of infection.
• $p_0$: initial prevalence rate at the beginning of the time period.
• $C$: daily testing capacity.
Note that the last two notations are the same as those defined in Section 3.

The next set of notations are quantities measured before the testing stage of each day:
• $N^{\text{test}}_t$: number of people to test in day $t$. For a testing cycle of $l$ days for $N_{\text{total}}$ people, we have
$$N^{\text{test}}_t = \begin{cases} \lceil N_{\text{total}}/l \rceil, & t = 1, 2, \cdots, l-1, \\ N_{\text{total}} - \lceil N_{\text{total}}/l \rceil (l-1), & t = l. \end{cases}$$
• $R^{I}_t$: infected ratio among those who have not been tested.

We consider testing $N_{\text{total}}$ people in $l$ days, and each day we test $N^{\text{test}}_t$ people. At the beginning of day $t$, we randomly choose $N^{\text{test}}_t$ people from those who have not been tested yet, and conduct group testing (linear array or square array methods) under a limited test capacity. After group testing, we mark those who have been tested, quarantine those who test positive, and return the rest of the people to the population. People who have been tested will not be tested again in this testing cycle. We divide one day into two stages, namely the testing stage and the infection stage. At the testing stage, people get tested and those who test positive are quarantined accordingly. At the infection stage, people who are infected continue to infect the susceptible people in the population.

The following set of notations are quantities measured before the testing stage of each day:
• $\gamma_t$: group false negative rate calculated in day $t$. It is calculated as the false negative number divided by the number of infected people in the tested population in day $t$.
• $y^{TI}_t$: increased number of individuals who are tested and infected at day $t$.
• $y^{TQ}_t$: increased number of individuals who are tested, infected and quarantined at day $t$.
• $y^{TNQ}_t$: increased number of individuals who are tested, infected but not quarantined at day $t$.
• $y^{TNI}_t$: increased number of individuals who are tested but not infected at day $t$.

The following set of notations are quantities measured after the testing stage of each day:
• $N_t$: remaining population size, i.e., number of individuals that have not been quarantined after day $t$. Note $N_0 = N_{\text{total}}$.
• $Y^{TI}_t$: number of individuals who are infected and have been tested at day $t$.
• $Y^{TNI}_t$: number of individuals who are not infected and have been tested at day $t$.
• $Y^{NTI}_t$: number of individuals who are infected and have not been tested at day $t$.
• $Y^{NTNI}_t$: number of individuals who are not infected and have not been tested at day $t$.

The following set of notations are quantities measured before the infection stage of each day:
• $R^{NI}_t$: newly infected ratio among those who have not been tested till day $t$.
• $N^{NI}_t$: number of individuals who are newly infected at day $t$.
• $N^{NTNI}_t$: number of individuals who are newly infected from the untested population at day $t$.

The following set of notations are quantities measured after the infection stage of each day:
• $p_t$: prevalence rate at the end of day $t$.
• $X^{TI}_t$: number of individuals who are infected and have been tested till the end of day $t$.
• $X^{TNI}_t$: number of individuals who are not infected and have been tested at the end of day $t$.
• $X^{NTI}_t$: number of individuals who are infected and have not been tested at the end of day $t$.
• $X^{NTNI}_t$: number of individuals who are not infected and have not been tested at the end of day $t$.

Note that for the above notations with subscript $t$, a notation with $t = 0$ denotes the corresponding quantity at initial status, if applicable.
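As a small illustration of the daily allocation $N^{\text{test}}_t$ defined above, the following Python sketch splits a population over one testing cycle. The function name `daily_test_counts` is hypothetical and not part of the paper; the sketch assumes the population is large relative to the cycle length, so that the last day's count stays non-negative.

```python
import math


def daily_test_counts(n_total: int, cycle_length: int) -> list[int]:
    """Number of people to test on each day of one cycle.

    Days 1..l-1 each test ceil(n_total / l) people; the last day
    tests whoever is left, so the counts always sum to n_total.
    Assumes n_total is large relative to cycle_length.
    """
    per_day = math.ceil(n_total / cycle_length)
    counts = [per_day] * (cycle_length - 1)
    counts.append(n_total - per_day * (cycle_length - 1))
    return counts


if __name__ == "__main__":
    # Illustrative values only: 10000 people over a 3-day cycle.
    print(daily_test_counts(10000, 3))
```

With 10000 people and a 3-day cycle, the first two days each take 3334 people and the last day takes the remaining 3332.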
The following algorithm shows the testing-quarantine-infection model, and outputs the optimal testing cycle length as well as the optimal pool size for each day within the cycle. Note that we use $B(N, p)$ to denote the binomial distribution with parameters $N$ and $p$. Details and analysis of the algorithm are presented in the following subsections.

Algorithm 1: Testing-quarantine-infection model.
input: total population $N_{\text{total}}$, initial prevalence rate $p_0$, infection rate $\alpha$, testing capacity $C$
output: optimal testing cycle length $l^*$, optimal pool size $n^*_t$, $t = 1, \cdots, l^*$
for $l \leftarrow 1$ to $T$ do
    for each simulation replication do
        $\hat{T} = T$;
        while $\hat{T} > 0$ do
            initialization: set $N_0 = N_{\text{total}}$, $X^{TI}_0 = X^{TNI}_0 = 0$, $X^{NTI}_0 \sim B(N_{\text{total}}, p_0)$, $X^{NTNI}_0 = N_{\text{total}} - X^{NTI}_0$;
            for $t \leftarrow 1$ to $\min(l, \hat{T})$ do
                before the testing stage:
                    solve (5) with input $p_{t-1}$, $N^{\text{test}}_t$, $C$;
                    find the optimal pool size $n^*_t$ and the optimal false negative rate $\gamma_t$;
                    infected ratio among those not tested: $R^{I}_t = X^{NTI}_{t-1} / (X^{NTI}_{t-1} + X^{NTNI}_{t-1})$;
                    tested, infected: $y^{TI}_t \sim B(N^{\text{test}}_t, R^{I}_t)$;
                    tested, infected, and quarantined: $y^{TQ}_t = y^{TI}_t (1 - \gamma_t)$;
                    tested, infected, but not quarantined: $y^{TNQ}_t = y^{TI}_t \gamma_t$;
                    tested, not infected: $y^{TNI}_t = N^{\text{test}}_t - y^{TI}_t$;
                after the testing stage:
                    tested, infected: $Y^{TI}_t = X^{TI}_{t-1} + y^{TNQ}_t$;
                    not tested, infected: $Y^{NTI}_t = X^{NTI}_{t-1} - (y^{TQ}_t + y^{TNQ}_t)$;
                    tested, not infected: $Y^{TNI}_t = X^{TNI}_{t-1} + y^{TNI}_t$;
                    not tested, not infected: $Y^{NTNI}_t = X^{NTNI}_{t-1} - y^{TNI}_t$;
                    not quarantined: $N_t = N_{t-1} - y^{TQ}_t$;
                before the infection stage:
                    newly infected: $N^{NI}_t = (Y^{TI}_t + Y^{NTI}_t) \times (\alpha - 1)$;
                    newly infected ratio among those not tested: $R^{NI}_t = Y^{NTNI}_t / (Y^{NTNI}_t + Y^{TNI}_t)$;
                    newly infected from the population that has not been tested yet: $N^{NTNI}_t \sim B(N^{NI}_t, R^{NI}_t)$;
                after the infection stage:
                    not tested, not infected: $X^{NTNI}_t = Y^{NTNI}_t - N^{NTNI}_t$;
                    tested, not infected: $X^{TNI}_t = Y^{TNI}_t - (N^{NI}_t - N^{NTNI}_t)$;
                    tested, infected: $X^{TI}_t = Y^{TI}_t + (N^{NI}_t - N^{NTNI}_t)$;
                    not tested, infected: $X^{NTI}_t = Y^{NTI}_t + N^{NTNI}_t$;
                    prevalence rate: $p_t = (X^{TI}_t + X^{NTI}_t) / N_t$;
            end
            if $t \le \hat{T}$ then
                update the initial prevalence rate $p_0$ with $(X^{TI}_t + X^{NTI}_t) / N_t$;
                $\hat{T} = \hat{T} - t$;
            end
        end
        record the final prevalence rate at the end of the given time period;
    end
    compute the average final prevalence rate at the end of the given time period;
end
return the optimal testing cycle length $l^*$ that yields the smallest average final prevalence rate;
return the optimal group size $n^*_t$, $t = 1, \cdots, l^*$, associated with the optimal testing cycle length;

4.2 Final Prevalence Rate: Analysis

Given all the pre-determined parameters, which are denoted precisely by the first set of notations in Section 4.1, we consider the factors that influence the prevalence rate. By the definition of $p_t$, we have $p_t = (X^{TI}_t + X^{NTI}_t) / N_t$. With the testing-quarantine-infection model described in Section 4.1, we can analytically compute the final prevalence rate $p_T$ as follows:
$$p_T = \frac{\alpha^T N_{\text{total}} \cdot p_0}{N_T} - \frac{\sum_{t=1}^{T} \alpha^t \, y^{TQ}_{T-t+1}}{N_T} \qquad (6)$$
Eq. 6 shows that an increase in quarantining infected individuals, i.e., in $y^{TQ}_t$ for any $t = 1, 2, \cdots, T$, leads to a decrease in $N_T$. Hence, both the first and the second terms in Eq. 6 will increase. However, it can be shown that an increase in $y^{TQ}_t$ will actually lead to a decrease in the final prevalence rate. To formalize this, let $\tilde{y}^{TQ}_t = y^{TQ}_t + k_t$, $t = 1, 2, \cdots, T$, where $k_t \in \mathbb{N}$, and let $\tilde{p}_T$ denote the corresponding final prevalence rate. Then we have
$$\tilde{p}_T \le p_T. \qquad (7)$$
We leave the details of the proof of Eq. 7 to Appendix E. Eq. 7 implies that it suffices to focus on $y^{TQ}_t$, $t = 1, 2, \cdots, T$, when considering which factors affect the final prevalence rate $p_T$.
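The closed form in Eq. 6 and the monotonicity claimed in Eq. 7 can be checked numerically. The Python sketch below is an illustration, not the authors' code. It treats the daily quarantine counts as given numbers and uses the deterministic recursion implied by the model, in which the infected count after day $t$ is $\alpha$ times the infected count left after that day's quarantine. The function names and all parameter values are assumptions chosen for illustration.

```python
def final_prevalence_recursive(n_total, p0, alpha, quarantined):
    """Day-by-day recursion: quarantine first, then infection grows by alpha."""
    infected = n_total * p0
    remaining = n_total
    for y in quarantined:          # y is the number quarantined on that day
        infected = alpha * (infected - y)
        remaining -= y
    return infected / remaining


def final_prevalence_closed_form(n_total, p0, alpha, quarantined):
    """Eq. 6: p_T = alpha^T N_total p0 / N_T - sum_t alpha^t y_{T-t+1} / N_T."""
    horizon = len(quarantined)
    n_final = n_total - sum(quarantined)
    first = alpha**horizon * n_total * p0 / n_final
    # quarantined[horizon - t] is y_{T-t+1} with 1-indexed days
    second = sum(
        alpha**t * quarantined[horizon - t] for t in range(1, horizon + 1)
    ) / n_final
    return first - second


if __name__ == "__main__":
    # Hypothetical inputs, not the paper's case-study values.
    y = [5, 4, 3, 2, 1, 1, 1]
    print(final_prevalence_recursive(10000, 0.01, 1.1, y))
    print(final_prevalence_closed_form(10000, 0.01, 1.1, y))
```

Both functions return the same value up to floating-point error, and raising any entry of `quarantined` lowers the result, in line with Eq. 7.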
From the above argument, we know that the larger the number of quarantined individuals, the lower the final prevalence rate will be. This result is consistent with common sense, in that the most effective approach for controlling the spread of the virus is to quarantine the infected. Note that if the testing cycle length $l$ becomes smaller, the daily test number will have to increase, but the false negative rate $\gamma_t$ will increase as well. An increase in the testing number is likely to increase $y^{TQ}_t$, but an increase in the false negative rate decreases $y^{TQ}_t$. Formally, in one testing cycle for $N_{t_0}$ people ($t_0 = 0, l, 2l, \cdots, \lfloor T/l \rfloor l$):
$$\begin{aligned} E[y^{TQ}_t] &= \lceil N_{t_0}/l \rceil \cdot p_{t-1} \cdot (1 - \gamma_t), \quad t = t_0 + 1, t_0 + 2, \cdots, t_0 + l - 1, \\ E[y^{TQ}_{t_0+l}] &= \left( N_{t_0} - \lceil N_{t_0}/l \rceil (l-1) \right) \cdot p_{t_0+l-1} \cdot (1 - \gamma_{t_0+l}). \end{aligned} \qquad (8)$$
Due to this trade-off between the number of tests and the false negative rate, it is important to select the optimal cycle length $l$ to minimize the final prevalence rate. To this end, we compare different cycle lengths ($l = 1, 2, \cdots, T$) by simulating the above test-quarantine-infection model over a time period of $T$ days. The details and results are described in the next subsection.

In this subsection, we use the square array method, since it uses fewer tests and has a relatively higher accuracy, according to our findings in Section 3.3. We also assume that before the testing, people go through pre-screening, such that the initial prevalence rate $p_0$ is kept low (for example, …).

To find the optimal testing cycle length, we compare the final prevalence rates of different testing cycle lengths by simulating the test-quarantine-infection model till the end of $T$ days. When the testing cycle length $l$ is less than $T$, we run multiple cycles till $T$ days. For every cycle length $l$, we run multiple simulation replications and average the final prevalence rates over these replications.
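The comparison across cycle lengths can be sketched with a deterministic, expected-value version of the simulation loop. The sketch below is not the authors' implementation and simplifies Algorithm 1 in three ways, all of them assumptions: binomial draws are replaced by their means, the false negative rate `gamma` is a fixed input rather than the output of the pool-size optimization, and the remaining population is carried over between cycles instead of being re-initialized. The function name and parameter values are hypothetical.

```python
import math


def simulate_mean_field(n_total, p0, alpha, horizon, cycle_length, gamma):
    """Expected-value testing-quarantine-infection loop; returns p_T."""
    remaining = float(n_total)
    x_ti, x_tni = 0.0, 0.0              # tested: infected / not infected
    x_nti = n_total * p0                # not tested, infected
    x_ntni = n_total - x_nti            # not tested, not infected
    for day in range(horizon):
        pos = day % cycle_length
        if pos == 0:                    # new cycle: everyone is untested again
            x_nti, x_ntni = x_nti + x_ti, x_ntni + x_tni
            x_ti, x_tni = 0.0, 0.0
        untested = x_nti + x_ntni
        if pos < cycle_length - 1:
            n_test = min(untested, math.ceil(remaining / cycle_length))
        else:                           # last day of the cycle tests the rest
            n_test = untested
        # Testing stage: a share (1 - gamma) of tested infected is quarantined.
        ratio = x_nti / untested if untested > 0 else 0.0
        y_ti = n_test * ratio
        y_tq, y_tnq = y_ti * (1 - gamma), y_ti * gamma
        y_tni = n_test - y_ti
        big_ti, big_nti = x_ti + y_tnq, x_nti - y_ti
        big_tni, big_ntni = x_tni + y_tni, x_ntni - y_tni
        remaining -= y_tq
        # Infection stage: infected count grows by the factor alpha.
        healthy = big_tni + big_ntni
        new_inf = min((big_ti + big_nti) * (alpha - 1), healthy)
        share_nt = big_ntni / healthy if healthy > 0 else 0.0
        new_nt = new_inf * share_nt
        x_ntni, x_tni = big_ntni - new_nt, big_tni - (new_inf - new_nt)
        x_ti, x_nti = big_ti + (new_inf - new_nt), big_nti + new_nt
    return (x_ti + x_nti) / remaining


if __name__ == "__main__":
    # Hypothetical inputs: same gamma for every cycle length.
    for l in (2, 3, 7):
        print(l, simulate_mean_field(10000, 0.01, 1.1, 7, l, 0.2))
```

Holding `gamma` fixed isolates the effect of testing more people per day. In the full model a shorter cycle also raises the false negative rate, which is the trade-off the simulation study examines.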
In an iteration of this $T$-day time period, if on some day the test capacity constraint cannot be satisfied for any possible group size, then this iteration is not recorded.

We set $T = 7$, $N_{\text{total}} = 10000$, $p_0 = 0.…$, $\alpha = 1.…$, $C = 300$. We consider testing cycle lengths $l$ ranging from 1 to 7 to make sure at least one complete testing cycle can be done, which means everyone in the population is tested at least once. Given values of the parameters $N_{\text{total}}$, $\alpha$, $p_0$, $C$ and $l$, the simulation is carried out in two stages for each day as mentioned before: the testing stage and the infection stage. We initialize each simulation run as follows: $X^{TI}_0 = X^{TNI}_0 = 0$, $X^{NTI}_0 = N_{\text{total}} \times p_0$, $X^{NTNI}_0 = N_{\text{total}} \times (1 - p_0)$, $N_0 = N_{\text{total}}$. During day $t$, $1 \le t \le 7$, we first randomly select $N^{\text{test}}_t$ individuals who have not been tested so far, then we apply square array tests on these $N^{\text{test}}_t$ individuals with different pool sizes to obtain the optimal pool size $n^*$, which minimizes the expected group false negative detection (as expressed in Eq. 4) under the daily testing capacity $C$. The corresponding average number of total tests for each pool size is calculated according to Eq. 3. Based on the false negative rate, we obtain values of $y^{TQ}_t$, $y^{TNQ}_t$, $y^{TNI}_t$, $Y^{TI}_t$, $Y^{TNI}_t$, $Y^{NTI}_t$, $Y^{NTNI}_t$ and $N_t$. In the infection stage, we sample newly infected individuals from those who have not been infected yet, based on the infection rate $\alpha$. Then we update $X^{TI}_t$, $X^{TNI}_t$, $X^{NTI}_t$, $X^{NTNI}_t$ and finally $p_t$. As a benchmark, we also simulate the individual testing method, which randomly selects and tests $C$ individuals each day from people who have not been tested so far.

We keep track of the prevalence rate each day within the 7-day time period. Figure 8 shows the prevalence rate under different cycle lengths $l$ and that of the individual testing. The cycle length $l = 1$ is not feasible due to the daily testing capacity, so it is not included in Figure 8. The estimated final prevalence rates, total number of quarantined individuals and total number of tests are listed in Table 4. The optimal pool size and the number of tests each day are shown in Figure 9.

Figure 8: Prevalence rates of the square array method with different testing cycle length $l$ and of the individual testing, $p_0 = 0.…$, $C = 300$, $N_{\text{total}} = 10000$.

Test cycle length    Final prevalence rate    Total number of quarantined    Total number of tests
1                    NAN                      NAN                            NAN
2                    0.00022                  13.02                          1609
3                    0.00048                  12.97                          1885
4                    0.00084                  12.55                          1577
5                    0.00098                  12.35                          1875
6                    0.00122                  12.03                          1589
7                    0.00132                  11.9                           1937
individual           0.00328                  4.27                           2100

Table 4: The estimated final prevalence rate, total number of quarantined, and total number of tests for square array group testing with different cycle length $l$ and the benchmark individual testing.

Figure 9: (Left) Optimal pool size each day; (Right) Number of tests each day.

From Figure 9, we can see that the square array method leads to a much lower final prevalence rate than the individual testing. Group testing with cycle length $l = 2$ or … lowers the prevalence rate over time, while cycle length …−… keeps the prevalence rate almost steady. In contrast, the prevalence rate gets out of control when individual testing is used. In the current parameter setting, the optimal testing cycle length is 2 days, leading to a final prevalence rate of 0.00022 at the end of 7 days, which is about … lower than the initial.

When $l$ increases, the number of people tested each day, $N^{\text{test}}_t = N_{\text{total}} / l$, decreases.
Therefore, the optimal pool size decreases as $l$ increases, as the left panel of Figure 9 suggests. Compared with the individual testing, group testing uses far fewer tests each day. Table 4 shows that the number of tests is minimized when $l = 4$. We notice that there are sudden increases in the number of tests at day … for $l = 2$, and at day … for $l = 4$. These two jumps correspond to an increase of the optimal pool size. The reason for this kind of jump is that when determining the optimal pool size $n^*$ under a given test capacity, the feasible region for the pool size $n$ contains discrete integers. Hence the optimal pool size may jump from one integer to another due to some tiny change of the prevalence rate from day to day.

Figure 10: (Left) Number of the newly quarantined each day; (Right) False negative rate each day.

In the left panel of Figure 10 and Table 4, we see that the number of newly quarantined individuals each day decreases as the testing cycle length $l$ increases. Recalling that the final prevalence rate is also monotonically increasing with respect to $l$, we confirm our analysis in Section 4.2 that an increase in the number of quarantined each day, $y^{TQ}_t$, leads to a decrease in the final prevalence rate. Furthermore, it is interesting to observe that the optimal cycle length of 2 days initially has the largest false negative rate because of the large pool size, but the false negative rate drops quickly over time and becomes comparable with other cycle lengths. Under the current parameter setting, it turns out that when $l$ is small, the larger tested population offsets the disadvantage brought by the higher false negative rate, so that more infected individuals get quarantined and hence fewer virus carriers remain in the population.

Figure 11: (Left) Prevalence rate when $p_0 = 0.…$; (Right) Prevalence rate when $p_0 = 0.…$.

We conduct sensitivity analysis on the simulation output with respect to three key parameters $p_0$, $\alpha$, $C$, and our decision, the optimal cycle length $l$. We test for the cases $p_0 = 0.…, 0.…$, $\alpha = 1.…, 1.…$, and $C = 200, 400$, respectively. The results for final prevalence rates are shown in Figures 11, 12 and 13.

Figure 12: (Left) Prevalence rate when $\alpha = 1.…$; (Right) Prevalence rate when $\alpha = 1.…$.

It turns out that small changes of $p_0$ and $\alpha$ bring significant changes to the final prevalence rate. In contrast, it seems that varying $C$ slightly only has a small impact on the prevalence rate. This is because there exist several ranges of values of $C$ such that we have the same optimal pool size within the range, as shown in Table 2. (Footnote: The difference of the optimal pool size $n^*$ between Table 2 and our simulation output in Section 4.3 is because in Table 2 we consider one-day testing with $N_{\text{total}} = 10000$ while in Section 4.3 we consider an $l$-day testing cycle for testing $N_{\text{total}} = 10000$ individuals.)

Figure 13: (Left) Prevalence rate when $C = 200$; (Right) Prevalence rate when $C = 400$.

From the above results, it is interesting to see that the optimal cycle length seems to be robust with respect to the parameters $p_0$, $\alpha$, $C$. In all scenarios that we simulate, it turns out that $l = 2$ is always the optimal cycle length for controlling the prevalence rate. Moreover, among all the feasible cycle lengths, a smaller cycle length always performs better at controlling the virus than a larger one.

Based on these observations, we have two remarks for the robustness of the optimal pool size with respect to $p_0$, $\alpha$, $C$.

Remark 5. Though the false negative rate $\gamma_t$ is relatively high when $l = 2$, the testing population size $N^{\text{test}}_t$ each day is large. The large testing population size dominates the number of quarantined each day. In Figure 14, although the number of newly quarantined when $l = 2$ is not always the largest, it is common that the number of newly quarantined in the first couple of days is the highest among all scenarios, and more quarantined individuals in the beginning helps to control the spread of the virus.

Figure 14: (upper left) $\alpha = 1.…$; (upper right) $\alpha = 1.…$; (lower left) $V = 0.…$; (lower right) $V = 0.…$.

Remark 6. We note that even though the false negative rate is higher when $l = 2$, the infected individuals that are deemed to be negative have more opportunities to be correctly diagnosed later on. Specifically, if an infected individual tests negative in the first run, they have the chance of being retested three times when $l = 2$. However, they have at most two more tests when $l = 3$. In the worst case, they have no further opportunity to be tested when $l = 7$. Hence, with a smaller testing cycle length, the false negatives have a higher chance of being corrected in later runs of the testing procedure, which leads to more quarantined individuals and hence a lower final prevalence rate.

We consider group testing for a relatively large community during the COVID-19 pandemic. In particular, two non-adaptive group testing methods are considered, namely the linear array and square array methods. We take into consideration the dilution effect that increases the false negative rate in group testing, and derive the optimal pool size that minimizes the daily total false negative number, under a constraint on testing capacity. In addition, we incorporate the daily model into a testing cycle, propose a testing-quarantine-infection model, and compute the optimal testing cycle length that minimizes the final prevalence rate at the end of the given time period. We find that under certain parameter settings, the shorter the testing cycle length is, the more infected people are quarantined, which leads to a lower final prevalence rate in spite of the increased false negative number.
The sensitivity analysis shows that the simulation output is sensitive to the infection rate and the initial prevalence rate, while less sensitive to the testing capacity within a certain range.

The testing protocol is summarized as follows.

Algorithm 2: Testing protocol.
Input: total population $N_{\text{total}}$, estimate of initial prevalence rate $p_0$, estimate of infection rate $\alpha$, testing capacity $C$;
Run the testing-quarantine-infection model (i.e., Algorithm 1);
Output: optimal testing cycle length $l^*$, optimal pool size $n^*_t$, $t = 1, \cdots, l^*$;
Test the total population in $l^*$ days, hence test $N_{\text{total}} / l^*$ people each day;
while $t \le l^*$ do
    On day $t$, take three swabs from each individual;
    Form $\lfloor N_{\text{total}} / (l^* (n^*_t)^2) \rfloor$ square arrays of size $n^*_t \times n^*_t$. Pools of size $n^*_t$ are created from each row and each column;
    Conduct an RT-qPCR test for each pool;
    Conduct individual tests for the remaining $N_{\text{total}} / l^* - \lfloor N_{\text{total}} / (l^* (n^*_t)^2) \rfloor (n^*_t)^2$ people;
    Conduct individual tests for samples at the intersection of a positive row and a positive column;
    Quarantine people who test positive, and return people who test negative to the community;
    Set $t = t + 1$.
end

Acknowledgement

The authors gratefully acknowledge the support by the National Science Foundation under Grant CAREER CMMI-1453934 and the Air Force Office of Scientific Research under Grant FA9550-19-1-0283. The authors would like to thank Professor Peter Frazier from Cornell University for insightful discussions and feedback.

References

[1] Robert Dorfman. The detection of defective members of large populations. Ann. Math. Statist., 14(4):436–440, 12 1943.
[2] G. K. Atia and V. Saligrama. Boolean compressed sensing and noisy group testing. IEEE Transactions on Information Theory, 58(3):1880–1901, 2012.
[3] D. Malioutov and M. Malyutov. Boolean compressed sensing: LP relaxation for group testing. In …, pages 3305–3308, 2012.
[4] Mikhail Malyutov. Search for sparse active inputs: A review. In Harout Aydinian, Ferdinando Cicalese, and Christian Deppe, editors, Information Theory, Combinatorics, and Search Theory: In Memory of Rudolf Ahlswede, pages 609–647. Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.
[5] Matthew Aldridge, Oliver Johnson, and Jonathan Scarlett. Group testing: An information theory perspective. Foundations and Trends® in Communications and Information Theory, 15(3-4):196–392, 2019.
[6] Cassidy Mentus, Martin Romeo, and Christian DiPaola. Analysis and applications of adaptive group testing methods for COVID-19. medRxiv, …
[7] … arXiv:2005.03051 [stat.ME], 2020. https://arxiv.org/abs/2005.03051.
[8] Olivier Gossner. Group testing against COVID-19. Working Papers 2020-02, Center for Research in Economics and Statistics, March 2020.
[9] Andrei Z. Broder and Ravi Kumar. A note on double pooling tests. arXiv:2004.01684 [cs.DM], 2020. https://arxiv.org/abs/2004.01684.
[10] Vincent Brault, Bastien Mallein, and Jean-Francois Rupprecht. Group testing as a strategy for the epidemiologic monitoring of COVID-19. arXiv:2005.06776 [q-bio.QM], 2020. https://arxiv.org/abs/2005.06776.
[11] Inés Armendáriz, Pablo A. Ferrari, Daniel Fraiman, and Silvina Ponce Dawson. Group testing with nested pools. arXiv:2005.13650 [math.ST], 2020. https://arxiv.org/abs/2005.13650.
[12] J. Massey Cashore, Ning Duan, Alyf Janmohamed, Jiayue Wan, Yujia Zhang, Shane Henderson, David Shmoys, and Peter Frazier. COVID-19 mathematical modeling for Cornell's fall semester, 2020. https://people.orie.cornell.edu/pfrazier/COVID_19_Modeling_Jun15.pdf.
[13] Stefan Lohse, Thorsten Pfuhl, Barbara Berkó-Göttel, Jürgen Rissland, Tobias Geißler, Barbara Gärtner, Sören L. Becker, Sophie Schneitler, and Sigrun Smola. Pooling of samples for testing for SARS-CoV-2 in asymptomatic people. The Lancet Infectious Diseases, April 2020.
[14] Krishna R. Narayanan, Anoosheh Heidarzadeh, and Ramanan Laxminarayan. On accelerated testing for COVID-19 using group testing. arXiv:2004.04785 [cs.IT], …
[15]–[17] … arXiv:2004.06306 [stat.ME], 2020. https://arxiv.org/abs/2004.06306.
[18] Matthew Aldridge. Conservative two-stage group testing. arXiv:2005.06617 [stat.AP], 2020. https://arxiv.org/abs/2005.06617.
[19] P. Fischer, N. Klasner, and I. Wegener. On the cut-off point for combinatorial group testing. Discrete Applied Mathematics, 91(1):83–92, 1999.
[20] Matthew Aldridge. Individual testing is optimal for nonadaptive group testing in the linear regime. IEEE Transactions on Information Theory, PP(99), 2018.
[21] F. K. Hwang. Group testing with a dilution effect. Biometrika, 63(3):671–680, 12 1976.
[22] Jos Weusten, Marion Vermeulen, Harry van Drimmelen, and Nico Lelie. Refinement of a viral transmission risk model for blood donations screened by NAT in different pool sizes and repeat test algorithms. Transfusion, 51:203–15, 01 2011.
[23] Ngoc T. Nguyen, Hrayer Aprahamian, Ebru K. Bish, and Douglas R. Bish. A methodology for deriving the sensitivity of pooled testing, based on viral load progression and pooling dilution. Journal of Translational Medicine, 2019.
[24] Anton J. Kleywegt, Alexander Shapiro, and Tito Homem-de-Mello. The sample average approximation method for stochastic discrete optimization. SIAM J. on Optimization, 12(2):479–502, February 2002.
[25] Wei Wang and Shabbir Ahmed. Sample average approximation of expected value constrained stochastic programs. Oper. Res. Lett., 36(5):515–519, September 2008.
[26] Terry C Jones, Barbara Mühlemann, Talitha Veith, Guido Biele, Marta Zuchowski, Jörg Hoffmann, Angela Stein, Anke Edelmann, Victor Max Corman, and Christian Drosten. An analysis of SARS-CoV-2 viral load by patient age. medRxiv, …

A Derivation for the Linear Array Group Test

A.1 Expected Number of Follow-up Diagnostic Tests

The average number of tests for a single group of size $n$ is $E[m^{L}(n)]$.
$$\begin{aligned}
E[m^{L}(n)] &= E[\text{number of individual tests for one group}] \\
&= n\,P(\text{group tests positive}) + 0 \cdot P(\text{group tests negative}) \\
&= n\,P(\text{group tests positive}) \\
&= n \sum_{d=0}^{n} P(\text{group tests positive},\ d \text{ of the group contain virus}) \\
&= n \sum_{d=0}^{n} P(\text{group tests positive} \mid d \text{ of the group contain virus})\, P(d \text{ of the group contain virus}) \\
&= n \sum_{d=1}^{n} P(\text{group tests positive} \mid d \text{ of the group contain virus})\, P(d \text{ of the group contain virus}) \\
&= n \sum_{d=1}^{n} \bigl(1 - \gamma(n, d)\bigr) \binom{n}{d} (1 - p_0)^{n-d} p_0^{d}
\end{aligned}$$
The total expected number of tests for the linear array group test is:
$$E[M^{L}(n)] = \lceil N/n \rceil + \lfloor N/n \rfloor\, E[m^{L}(n)] + E\bigl[m^{L}\bigl(N - \lfloor N/n \rfloor n\bigr)\bigr]$$

A.2 Expected False Negative Number

The average number of false negatives for a single group test of size $n$ is $E[f^{L}_{G}(n)]$.
$$\begin{aligned}
E[f^{L}_{G}(n)] &= \sum_{d=0}^{n} d\, P(\text{group tests negative},\ d \text{ of the group contain virus}) \\
&= \sum_{d=0}^{n} d\, P(\text{group tests negative} \mid d \text{ of the group contain virus})\, P(d \text{ of the group contain virus}) \\
&= \sum_{d=1}^{n} d\, \gamma(n, d) \binom{n}{d} (1 - p_0)^{n-d} p_0^{d}
\end{aligned}$$
The expected number of false negatives for the individual tests resulting from a single group test of size $n$ is:
$$\begin{aligned}
E[f^{L}_{I}(n)] &= \gamma(1, 1) \sum_{d=0}^{n} d\, P(\text{group tests positive},\ d \text{ of the group contain virus}) \\
&= \gamma(1, 1) \sum_{d=0}^{n} d\, P(\text{group tests positive} \mid d \text{ of the group contain virus})\, P(d \text{ of the group contain virus}) \\
&= \gamma(1, 1) \sum_{d=1}^{n} d\, \bigl(1 - \gamma(n, d)\bigr) \binom{n}{d} (1 - p_0)^{n-d} p_0^{d}
\end{aligned}$$
The total expected number of false negatives for the linear array group test is:
$$E[f^{L}_{\text{total}}(n)] = \lfloor N/n \rfloor \Bigl( E[f^{L}_{I}(n)] + E[f^{L}_{G}(n)] \Bigr) + \Bigl( E\bigl[f^{L}_{I}\bigl(N - \lfloor N/n \rfloor n\bigr)\bigr] + E\bigl[f^{L}_{G}\bigl(N - \lfloor N/n \rfloor n\bigr)\bigr] \Bigr)$$

B Derivation for the Square Array Group Test

B.1 Expected Number of Follow-up Diagnostic Tests

The average number of tests for a single group of size n is E (cid:2) m S ( n ) (cid:3) . E (cid:2) m S ( n ) (cid:3) = E [ number of individual test ]= E  n (cid:88) i =1 n (cid:88) j =1 { i th row and j th column test positive }  = n (cid:88) i =1 n (cid:88) j =1 E (cid:2) { i th row and j th column test positive } (cid:3) = n (cid:88) i =1 n (cid:88) j =1 P ( i th row and j th column test positive )= n P ( i th row and j th column test positive )= n P ( A : ,j tests positive and A i, : tests positive )= n { P ( A : ,j tests positive and A i, : tests positive, A : ,j contains virus and A i, : contains virus )+ P ( A : ,j tests positive and A i, : tests positive, A : ,j contains no virus or A i, : contains no virus ) } = n P ( A : ,j tests positive and A i, : tests positive, A : ,j contains virus and A i, : contains virus )= n P ( A : ,j tests positive and A i, : tests positive | A : ,j contains virus and A i, : contains virus ) P ( A : ,j contains virus and A i, : contains virus )= n y P ( A : ,j contains virus and A i, : contains virus )= n y P ( A : ,j contains virus | A i, : contains virus ) P ( A i, : contains virus )= n y (1 − (1 − p ) n )[ p / (1 − (1 − p ) n ) + (1 − p / (1 − (1 − p ) n ))(1 − (1 − p ) n − )]= n y (1 − − p ) n + (1 − p ) n − )= n (cid:40)(cid:20) n − (cid:88) d =0 (1 − γ ( n, d + 1) (cid:0) n − d (cid:1) (1 − p ) n − − d ( p ) d (cid:21) p + (cid:20) n − (cid:88) d =1 (1 − γ ( n, d )) (cid:0) n − d (cid:1) (1 − p ) n − − d ( p ) d − (1 − p ) n − (cid:21) ((1 − p ) − − p ) n + (1 − p ) n − ) (cid:41) Where: P ( i th row contains no virus ) = P ( individual contains no virus ) n = (1 − p ) n P ( A i,j contains virus | A i, : contains virus )= P ( A i,j contains virus ) / P ( A i, : contains virus )= p / (1 − (1 − p ) n ) ( A : ,j contains virus | A i, : contains virus )= P ( A : ,j contains virus, A i,j contains virus | A i, : contains virus )+ P ( A : ,j contains virus, A i,j contains no virus | A i, : contains virus )= P ( A i,j 
contains virus | A i, : contains virus ) + P ( A i,j contains no virus | A i, : contains virus ) P ( A : ,j contains virus, | A i,j contains no virus, A i, : contains virus )= P ( A i,j contains virus | A i, : contains virus ) + P ( A i,j contains no virus | A i, : contains virus ) P ( A : ,j contains virus | A i,j contains no virus )= p / (1 − (1 − p ) n ) + [1 − p / (1 − (1 − p ) n )](1 − (1 − p ) n − ) y = P ( A : ,j tests positive and A i, : tests positive | A : ,j contains virus and A i, : contains virus )= P ( A : ,j tests positive, A i, : tests positive, A i,j contains virus | A : ,j contains virus and A i, : contains virus )+ P ( A : ,j tests positive, A i, : tests positive, A i,j contains no virus | A : ,j contains virus and A i, : contains virus )= P ( A : ,j tests positive, A i, : tests positive | A i,j contains virus, A : ,j contains virus and A i, : contains virus ) P ( A i,j contains virus | A : ,j contains virus and A i, : contains virus )+ P ( A : ,j tests positive, A i, : tests positive | A i,j contains no virus, A : ,j contains virus and A i, : contains virus ) P ( A i,j contains no virus | A : ,j contains virus and A i, : contains virus )= P ( A : ,j tests positive, A i, : tests positive | A i,j contains virus, A : ,j contains virus and A i, : contains virus ) p / (1 − − p ) n + (1 − p ) n − )+ P ( A : ,j tests positive, A i, : tests positive | A i,j contains no virus, A : ,j contains virus and A i, : contains virus )((1 − p ) − − p ) n + (1 − p ) n − ) / (1 − − p ) n + (1 − p ) n − )= P ( A : ,j tests positive | A i,j contains virus, A : ,j contains virus and A i, : contains virus ) P ( A i, : tests positive | A i,j contains virus, A : ,j contains virus and A i, : contains virus ) p / (1 − − p ) n + (1 − p ) n − )+ P ( A : ,j tests positive | A i,j contains no virus, A : ,j contains virus and A i, : contains virus ) P ( A i, : tests positive | A i,j contains no virus, A : ,j contains virus and A i, : contains virus )((1 − p ) − − p ) n + (1 − p 
) n − ) / (1 − − p ) n + (1 − p ) n − )

$$
\begin{aligned}
&= P(A_{:,j} \text{ tests positive} \mid A_{i,j} \text{ contains virus},\ A_{:,j} \text{ contains virus and } A_{i,:} \text{ contains virus})\,\frac{p}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&\quad + P(A_{:,j} \text{ tests positive} \mid A_{i,j} \text{ contains no virus},\ A_{:,j} \text{ contains virus and } A_{i,:} \text{ contains virus})\,\frac{(1-p)-2(1-p)^{n}+(1-p)^{2n-1}}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&= P(A_{:,j} \text{ tests positive} \mid A_{i,j} \text{ contains virus},\ A_{:,j} \text{ contains virus})\,\frac{p}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&\quad + P(A_{:,j} \text{ tests positive} \mid A_{i,j} \text{ contains no virus},\ A_{:,j} \text{ contains virus})\,\frac{(1-p)-2(1-p)^{n}+(1-p)^{2n-1}}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&= \left(\sum_{d=0}^{n-1} P(A_{:,j} \text{ tests positive},\ d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid A_{i,j} \text{ contains virus},\ A_{:,j} \text{ contains virus})\right)\frac{p}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&\quad + \left(\sum_{d=1}^{n-1} P(A_{:,j} \text{ tests positive},\ d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid A_{i,j} \text{ contains no virus},\ A_{:,j} \text{ contains virus})\right)\frac{(1-p)-2(1-p)^{n}+(1-p)^{2n-1}}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&= \left[\sum_{d=0}^{n-1} P(A_{:,j} \text{ tests positive} \mid d \text{ of } A_{:,k \neq j} \text{ contain virus},\ A_{i,j} \text{ contains virus},\ A_{:,j} \text{ contains virus})\,P(d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid A_{i,j} \text{ contains virus},\ A_{:,j} \text{ contains virus})\right]\frac{p}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&\quad + \left[\sum_{d=1}^{n-1} P(A_{:,j} \text{ tests positive} \mid d \text{ of } A_{:,k \neq j} \text{ contain virus},\ A_{i,j} \text{ contains no virus},\ A_{:,j} \text{ contains virus})\,P(d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid A_{i,j} \text{ contains no virus},\ A_{:,j} \text{ contains virus})\right]\frac{(1-p)-2(1-p)^{n}+(1-p)^{2n-1}}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&= \left[\sum_{d=0}^{n-1} P(A_{:,j} \text{ tests positive} \mid d+1 \text{ of } A_{:,j} \text{ contain virus})\,P(d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid A_{i,j} \text{ contains virus},\ A_{:,j} \text{ contains virus})\right]\frac{p}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&\quad + \left[\sum_{d=1}^{n-1} P(A_{:,j} \text{ tests positive} \mid d \text{ of } A_{:,j} \text{ contain virus})\,P(d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid d \ge 1)\right]\frac{(1-p)-2(1-p)^{n}+(1-p)^{2n-1}}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&= \left[\sum_{d=0}^{n-1} P(A_{:,j} \text{ tests positive} \mid d+1 \text{ of } A_{:,j} \text{ contain virus})\binom{n-1}{d}(1-p)^{n-1-d}p^{d}\right]\frac{p}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&\quad + \left[\sum_{d=1}^{n-1} P(A_{:,j} \text{ tests positive} \mid d \text{ of } A_{:,j} \text{ contain virus})\,\frac{\binom{n-1}{d}(1-p)^{n-1-d}p^{d}}{1-(1-p)^{n-1}}\right]\frac{(1-p)-2(1-p)^{n}+(1-p)^{2n-1}}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&= \left[\sum_{d=0}^{n-1} (1-\gamma(n,d+1))\binom{n-1}{d}(1-p)^{n-1-d}p^{d}\right]\frac{p}{1-2(1-p)^{n}+(1-p)^{2n-1}} \\
&\quad + \left[\sum_{d=1}^{n-1} (1-\gamma(n,d))\,\frac{\binom{n-1}{d}(1-p)^{n-1-d}p^{d}}{1-(1-p)^{n-1}}\right]\frac{(1-p)-2(1-p)^{n}+(1-p)^{2n-1}}{1-2(1-p)^{n}+(1-p)^{2n-1}}
\end{aligned}
$$

where

$$
\begin{aligned}
P(A_{i,j} \text{ contains virus} \mid A_{:,j} \text{ contains virus and } A_{i,:} \text{ contains virus})
&= \frac{P(A_{i,j} \text{ contains virus},\ A_{:,j} \text{ contains virus and } A_{i,:} \text{ contains virus})}{P(A_{:,j} \text{ contains virus and } A_{i,:} \text{ contains virus})} \\
&= \frac{P(A_{i,j} \text{ contains virus})}{P(A_{:,j} \text{ contains virus and } A_{i,:} \text{ contains virus})} \\
&= \frac{p}{1-2(1-p)^{n}+(1-p)^{2n-1}}, \\
P(A_{:,j} \text{ tests positive} \mid d \text{ of } A_{:,j} \text{ contain virus}) &= 1-\gamma(n,d).
\end{aligned}
$$

For the total $N$ subjects, we need $\lfloor N/n^2 \rfloor$ array tests and $(N - \lfloor N/n^2 \rfloor n^2)$ individual tests. Therefore, the average total number of tests needed for the array test is:

$$
E\left[M^{S}(n)\right] = \lfloor N/n^2 \rfloor \left(E\left[m^{S}(n)\right] + 2n\right) + \left(N - \lfloor N/n^2 \rfloor n^2\right)
$$

.2 Expected False Negative Number

The average number of false negatives for a single group test of size $n$ is

$$
\begin{aligned}
E\left[f^{S}_{G}(n)\right]
&= E[\text{number of group test false negatives}] \\
&= E\left[\sum_{i=1}^{n}\sum_{j=1}^{n} \mathbb{1}\{A_{i,j} \text{ contains virus and } (A_{:,j} \text{ tests negative or } A_{i,:} \text{ tests negative})\}\right] \\
&= E\left[\sum_{i=1}^{n}\sum_{j=1}^{n} \mathbb{1}\{A_{i,j} \text{ contains virus}\} - \mathbb{1}\{A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive}\}\right] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} E\left[\mathbb{1}\{A_{i,j} \text{ contains virus}\} - \mathbb{1}\{A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive}\}\right] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \left[P(A_{i,j} \text{ contains virus}) - P(A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive})\right] \\
&= n^2 \left[P(A_{i,j} \text{ contains virus}) - P(A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive})\right] \\
&= n^2 P(A_{i,j} \text{ contains virus}) \left[1 - P(A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive} \mid A_{i,j} \text{ contains virus})\right] \\
&= n^2 P(A_{i,j} \text{ contains virus}) \left[1 - P(A_{:,j} \text{ tests positive} \mid A_{i,j} \text{ contains virus})\,P(A_{i,:} \text{ tests positive} \mid A_{i,j} \text{ contains virus})\right] \\
&= n^2 P(A_{i,j} \text{ contains virus}) \left[1 - P(A_{:,j} \text{ tests positive} \mid A_{i,j} \text{ contains virus})^{2}\right] \\
&= n^2 P(A_{i,j} \text{ contains virus}) \left[1 - \left(\sum_{d=0}^{n-1} P(A_{:,j} \text{ tests positive},\ d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid A_{i,j} \text{ contains virus})\right)^{2}\right] \\
&= n^2 P(A_{i,j} \text{ contains virus}) \left[1 - \left(\sum_{d=0}^{n-1} P(A_{:,j} \text{ tests positive} \mid d \text{ of } A_{:,k \neq j} \text{ contain virus},\ A_{i,j} \text{ contains virus})\,P(d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid A_{i,j} \text{ contains virus})\right)^{2}\right] \\
&= n^2 P(A_{i,j} \text{ contains virus}) \left[1 - \left(\sum_{d=0}^{n-1} P(A_{:,j} \text{ tests positive} \mid d+1 \text{ of } A_{:,j} \text{ contain virus})\,P(d \text{ of } A_{:,k \neq j} \text{ contain virus} \mid A_{i,j} \text{ contains virus})\right)^{2}\right] \\
&= n^2 P(A_{i,j} \text{ contains virus}) \left[1 - \left(\sum_{d=0}^{n-1} P(A_{:,j} \text{ tests positive} \mid d+1 \text{ of } A_{:,j} \text{ contain virus})\binom{n-1}{d}(1-p)^{n-1-d}p^{d}\right)^{2}\right] \\
&= n^2 P(A_{i,j} \text{ contains virus}) \left[1 - \left(\sum_{d=0}^{n-1} (1-\gamma(n,d+1))\binom{n-1}{d}(1-p)^{n-1-d}p^{d}\right)^{2}\right] \\
&= n^2 p \left[1 - \left(\sum_{d=0}^{n-1} (1-\gamma(n,d+1))\binom{n-1}{d}(1-p)^{n-1-d}p^{d}\right)^{2}\right]
\end{aligned}
$$

The average number of false negatives in the individual tests that follow a single group test of size $n$ is:

$$
\begin{aligned}
E\left[f^{S}_{I}(n)\right]
&= E[\text{number of false negatives among the } m \text{ people tested individually}] \\
&= E\left[\sum_{i=1}^{n}\sum_{j=1}^{n} \mathbb{1}\{A_{i,j} \text{ tests negative and } A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive}\}\right] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} E\left[\mathbb{1}\{A_{i,j} \text{ tests negative and } A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive}\}\right] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} P(A_{i,j} \text{ tests negative and } A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive}) \\
&= n^2 P(A_{i,j} \text{ tests negative and } A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive}) \\
&= n^2 P(A_{i,j} \text{ tests negative} \mid A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive})\,P(A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive}) \\
&= \gamma(1,1)\, n^2 P(A_{i,j} \text{ contains virus and } A_{:,j} \text{ tests positive and } A_{i,:} \text{ tests positive}) \\
&= \gamma(1,1)\, n^2 p \left(\sum_{d=0}^{n-1} (1-\gamma(n,d+1))\binom{n-1}{d}(1-p)^{n-1-d}p^{d}\right)^{2}
\end{aligned}
$$

The total average number of false negatives for the square array test is:

$$
E\left[f^{S}_{total}(n)\right] = \lfloor N/n^2 \rfloor \left(E\left[f^{S}_{G}(n)\right] + E\left[f^{S}_{I}(n)\right]\right) + \left(N - \lfloor N/n^2 \rfloor n^2\right) p\, \gamma(1,1)
$$

General Binary Search
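The closed-form false negative expressions for the square array test, derived in the preceding section, can be evaluated numerically. The Python sketch below is illustrative only: the function names and the callable `gamma(n, d)` (the false negative rate of a pool of size n containing d infected samples) are assumptions introduced here, and any constant rate passed in is a placeholder rather than the dilution model.

```python
from math import comb


def column_positive_given_infected(n, p, gamma):
    """P(A_{:,j} tests positive | A_{i,j} contains virus) for an n x n array."""
    return sum(
        (1.0 - gamma(n, d + 1)) * comb(n - 1, d) * (1 - p) ** (n - 1 - d) * p ** d
        for d in range(n)  # d = number of other infected samples in the column
    )


def square_array_false_negatives(n, p, gamma):
    """Return (E[f_G^S(n)], E[f_I^S(n)]) for one n x n array."""
    q = column_positive_given_infected(n, p, gamma)
    group_fn = n ** 2 * p * (1.0 - q ** 2)               # missed by row or column test
    individual_fn = gamma(1, 1) * n ** 2 * p * q ** 2    # missed by the follow-up test
    return group_fn, individual_fn


def total_false_negatives(N, n, p, gamma):
    """E[f_total^S(n)]: full arrays plus leftover subjects tested individually."""
    arrays = N // n ** 2
    group_fn, individual_fn = square_array_false_negatives(n, p, gamma)
    leftover = N - arrays * n ** 2
    return arrays * (group_fn + individual_fn) + leftover * p * gamma(1, 1)
```

For example, `total_false_negatives(100, 4, 0.05, lambda n, d: 0.1)` evaluates the total for a hypothetical constant false negative rate of 0.1; replacing the lambda with a dilution-aware `gamma` gives the quantity used in the text.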

General binary search (GBS) generalizes the binary splitting procedure (BSP). BSP aims to identify exactly one infected sample among the N samples; the BSP algorithm is given in Appendix C.1. GBS performs BSP up to δ times to identify at most δ infected samples in a population of size N. Note that δ plays a role similar to that of the pool size n in the square array test. If δ is too small, we may underestimate the number of infected samples: GBS stops searching after finding δ infected samples, which causes false negatives in the group test. We therefore recommend testing the remaining group at the end of GBS. If δ infected samples have already been found and the remaining group tests positive, the remaining group should be tested individually. If δ is too large, the pools in GBS are small, which makes the binary search inefficient but also lowers the false negative rate of the group tests. Because a closed-form expression for the expected total number of tests under GBS is hard to derive, we resort to Monte Carlo simulation. The distributions of the total number of tests and of the number of false negatives are shown in the first two graphs of Figure 15. The relationship between δ and the total number of tests, as well as the total number of false negatives, is shown in the last two graphs of Figure 15. As δ increases, the total number of tests increases and the number of false negatives decreases, since the pools become smaller.

Figure 15: Distribution of the total number of tests and of the number of false negatives in GBS under group size δ = 83, N = 10000, p = 0. , C = 600, and the relationship between the pool size and the total number of tests, as well as the total number of false negatives, for GBS. Simulations are run 5000 times.

GBS performs badly in our setting because of the high splitting dilution effect: each sample must be split into C sub-samples (it is impossible to collect C samples from a subject all at once). In addition, GBS is adaptive, so each test must wait for the previous result before the procedure can continue. The long time needed to conduct a GBS group test makes it hard to implement in practice.

C.1 BSP Algorithm

Algorithm 3: BSP

Result: location of the infected sample in the sample pool, number of tests conducted
Input: number of samples N and sample pool A = {A(1), A(2), ..., A(N)};
Set testcount = 0, location = 0;
while N ≥ 2 do
    Select N' = ⌊N/2⌋;
    Test group G = {A(j) : j ≤ N'};
    Update testcount = testcount + 1;
    if test outcome is positive then
        Update A = G;
        Update N = N';
    else
        Update A = {A(j) : j > N'};
        Update N = N − N';
        Update location = location + N';
    end
end

C.2 GBS Algorithm

Algorithm 4: GBS
Input: N, δ and sample pool P = {1, 2, ..., N};
while N ≥ δ − and δ > 0 do
    Choose a group G of size ⌊ log N − δ − δ ⌋;
    Test group G ⊆ P;
    if test outcome is positive then
        Identify an infected sample in G with BSP (since the group tested positive, it must contain at least one infected sample);
        Update N = N − 1 − g (where g is the number of uninfected items diagnosed by BSP; remove these from the pool);
        Update δ = δ − 1 (remove the identified infected sample from the pool);
    else
        Update N = N − |G| (|G| is the number of uninfected items in G; remove these from the pool);
    end
end
if N > 0 and δ > 0 then
    Test the N samples individually;
end

Pooling Dilution Effect Model
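Algorithms 3 and 4 of the preceding appendix can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the test is an error-free oracle `test(group)`, so dilution is ignored; `gbs` follows the textbook generalized binary splitting rule (test individually once N ≤ 2δ − 2, otherwise test a group of size 2^⌊log2((N − δ + 1)/δ)⌋), whose constants may differ from those of Algorithm 4; the function names are hypothetical.

```python
def bsp(pool, test):
    """Binary splitting: locate one infected sample in a pool known to contain one.

    Returns (offset of the infected sample in `pool`, number of tests used).
    """
    test_count, location = 0, 0
    while len(pool) >= 2:
        half = len(pool) // 2
        group = pool[:half]
        test_count += 1
        if test(group):
            pool = group              # an infected sample is in the first half
        else:
            pool = pool[half:]        # first half is clean, skip past it
            location += half
    return location, test_count


def gbs(samples, delta, test):
    """Generalized binary splitting with at most `delta` infected samples assumed.

    Returns (indices identified as infected, number of tests used).
    """
    remaining = list(range(len(samples)))
    found, test_count = [], 0
    while delta > 0 and len(remaining) > 2 * delta - 2:
        ratio = (len(remaining) - delta + 1) // delta
        size = 1 << (ratio.bit_length() - 1)   # 2 ** floor(log2(ratio))
        group = remaining[:size]
        values = [samples[i] for i in group]
        test_count += 1
        if test(values):
            offset, used = bsp(values, test)
            test_count += used
            found.append(group[offset])
            # drop the infected sample and the clean items BSP skipped before it
            remaining = remaining[offset + 1:]
            delta -= 1
        else:
            remaining = remaining[size:]       # the whole group is clean
    if delta > 0:
        for i in remaining:                    # few samples left: test one by one
            test_count += 1
            if test([samples[i]]):
                found.append(i)
    return found, test_count
```

With `samples = [i in (3, 10) for i in range(16)]` and the oracle `any`, `gbs(samples, 2, any)` identifies both infected indices. If δ is smaller than the true number of infected samples, the search stops early, which is the source of group-test false negatives discussed for GBS.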

In Section 2 we mentioned a recently proposed model [10] of the false negatives induced by pooling. Here we briefly describe this model in more detail. First, they examine how the RT-qPCR test is conducted.

• The sample is treated to transcribe the target RNA sequence into a DNA sequence.
• The sample is placed in a PCR machine, which can measure the concentration by making the target DNA fluorescent.
• A reaction is run that approximately doubles the DNA sequence. This is called a cycle of amplification.
• A time series of the DNA concentration is recorded, and a linear regression is applied to these data. The initial DNA concentration is the linear regression value at the origin.
• The return value of the test corresponds to $-\log_2$ of the initial number of viral DNA in the sample.

Denote by $C_t$ the $-\log_2$ of $C$, the number of viral DNA contained in a sample; $C_t$ is interpreted as the number of amplification cycles needed for the intensity to reach a threshold. In addition, there is a parameter $d_{cens}$, called the limit of detection: a positive sample will not be detected if its $C_t$ value is no less than $d_{cens}$. Compared with the single-individual case, the intensity curve of the pooled case is shifted to the right. Consequently, more amplification cycles are needed to reach the threshold, so the limit of detection is more likely to be exceeded. This is why, from a methodological perspective, pooling increases the false negative rate.

Next, $C_t$ is modeled as a random variable $X$. Based on re-simulated data derived from the clinical data in [26], the distribution of $X$ is given by the following mixture model:

$$
f_X(x) = \sum_{k=1}^{3} \pi_k \, \frac{f_{\mu_k,\sigma_k}(x)}{F_{\mu_k,\sigma_k}(d_{cens})} \, \mathbb{1}\{x \le d_{cens}\}
$$

where the parameters are estimated as $d_{cens}$ = 37. , $\pi_1$ = 0. , $\pi_2$ = 0. , $\pi_3$ = 0. , $\mu_1$ = 20. , $\mu_2$ = 29. , $\mu_3$ = 34. , $\sigma_1$ = 3. , $\sigma_2$ = 3. , $\sigma_3$ = 1. . Here $\{f_{\mu_k,\sigma_k}(x), k = 1, 2, 3\}$ are the probability density functions (PDFs) of the Gaussian distributions $N(\mu_k, \sigma_k)$, and $\{F_{\mu_k,\sigma_k}(x), k = 1, 2, 3\}$ are the corresponding cumulative distribution functions (CDFs). Note that $d_{cens}$ = 37. corresponds to the maximum of the re-simulated $C_t$ data, meaning that the false negative rate caused by taking the swab (i.e., the first source) is neglected.

With the model for the random variable $X$, the false negative rate induced by pooling $N$ individuals can be deduced. It is assumed that one infected individual is pooled with $N - 1$ negative individuals. The sample of the infected individual must carry at least $N 2^{-d_{cens}}$ viral DNA to be detected even when it is diluted in a pool of size $N$. Denoting by $\gamma$ the false negative rate, it is estimated as

$$
\begin{aligned}
\gamma &= 1 - P\left(C \ge N 2^{-d_{cens}}\right) \\
&= 1 - P\left(-\log_2 C \le d_{cens} - \log_2 N\right) \\
&= 1 - P\left(C_t \le d_{cens} - \log_2 N\right) \\
&= 1 - \int_{-\infty}^{d_{cens} - \log_2 N} f_X(x)\, dx \\
&= 1 - \sum_{k=1}^{3} \pi_k \, \frac{F_{\mu_k,\sigma_k}(d_{cens} - \log_2 N)}{F_{\mu_k,\sigma_k}(d_{cens})}.
\end{aligned}
$$

For a given group test, if the number of infected individuals is greater than 1, we re-scale the pool size and treat the pool as if it had only one infected individual. For instance, a pool of size $N$ with $\delta$ infected individuals is treated as one infected individual among $N/\delta$ individuals. Re-scaling is justified because, in the RT-qPCR test, the total volume of sample that is tested is a pre-determined constant: no matter how large the group is and how many infected individuals it contains, the tested volume is unchanged. From the analysis above, $C_t$ is determined by the viral load in the tested sample. Therefore, different test pools have the same viral load as long as the ratio of infected individuals to pool size is the same.

Analysis for p T with respect to y T Qt
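The false negative rate γ of the pooling dilution model in the preceding section reduces to a short computation once the mixture parameters are fixed. The sketch below is a hedged illustration: the function names are hypothetical, and the values in `PLACEHOLDER_PARAMS` are made-up placeholders chosen only so the example runs, not the estimates reported in [10]. The re-scaling to one infected individual among N/δ is applied through the `infected` argument.

```python
from math import erf, log2, sqrt


def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def dilution_false_negative_rate(pool_size, infected, d_cens, weights, means, sds):
    """gamma = 1 - sum_k pi_k F_k(d_cens - log2(N_eff)) / F_k(d_cens).

    N_eff = pool_size / infected implements the re-scaling to a pool
    with a single infected individual.
    """
    effective_size = pool_size / infected
    threshold = d_cens - log2(effective_size)
    detected = sum(
        w * normal_cdf(threshold, m, s) / normal_cdf(d_cens, m, s)
        for w, m, s in zip(weights, means, sds)
    )
    return 1.0 - detected


# Placeholder parameters (assumed for illustration only).
PLACEHOLDER_PARAMS = dict(
    d_cens=37.0,
    weights=(0.3, 0.5, 0.2),
    means=(20.0, 29.0, 34.0),
    sds=(3.0, 3.0, 1.0),
)

if __name__ == "__main__":
    for size in (1, 2, 4, 8, 16):
        rate = dilution_false_negative_rate(size, 1, **PLACEHOLDER_PARAMS)
        print(size, round(rate, 4))
```

Under this model an individual test (pool size 1) has γ equal to zero, consistent with the swab-related false negatives being neglected, and γ grows with the effective pool size.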