Active pooling design in group testing based on Bayesian posterior prediction
Ayaka Sakata∗
Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
Department of Statistical Science, The Graduate University for Advanced Studies (SOKENDAI), Hayama-cho, Kanagawa 240-0193, Japan
JST PRESTO, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
(Dated: August 20, 2020)

For identifying infected patients in a population, group testing is an effective method to reduce the number of tests and to correct test errors. In group testing, tests are performed on pools of specimens collected from patients, where the number of pools is lower than that of patients. The performance of group testing considerably depends on the design of the pools and on the algorithm used for inferring the infected patients from the test outcomes. In this paper, an adaptive design method for pools based on the predictive distribution is proposed in the framework of Bayesian inference. The proposed method, executed using a belief propagation algorithm, results in more accurate identification of the infected patients compared with group testing performed on random pools determined in advance.
I. INTRODUCTION
Identification of infected patients in a large population using clinical tests, such as blood tests and polymerase chain reaction tests, requires significant operating costs. Group testing is one of the approaches to reduce such costs by performing tests on pools of specimens obtained from patients [1, 2]. When the fraction of infected patients in a population is sufficiently small, the infected patients can be identified from tests on pools whose number is smaller than that of the patients. Group testing was originally developed for blood testing by Dorfman and is now applied in various fields, such as quality control in product testing [3] and multiple-access communication [4].

Group testing is roughly classified into non-adaptive and adaptive testing. In non-adaptive group testing, all pools are determined in advance and fixed during all tests. In adaptive group testing, pools are designed sequentially, depending on the previous test outcomes. Dorfman's original study considered the simplest adaptive procedure, the so-called two-stage testing: in the first round, tests are performed on pools designed in advance, and all patients belonging to a positive pool are individually tested in the subsequent stage. A generalization of the two-stage testing is known as the binary splitting method [5, 6], where a positive pool from the previous stage is split into two subpools. Tests in the subsequent stage are performed on the subpools until the infected patients are identified. Further, splitting the positive pools into more than two subsets sometimes reduces the number of tests required for identifying the infected patients [7]. These splitting-based methods are effective when the number of infected patients is sufficiently small.
However, the splitting-based methods exhibit a limitation in the correction of false-negative results, because patients in the negative pools are never tested again, even when the negative result is false.

Apart from the splitting-based design, the active design of data sampling has been studied in statistics and machine learning, known as experimental design [8, 9], active learning [10, 11], and Bayesian optimization [12, 13].

∗ [email protected]

In these approaches, optimal methods to select training data for efficient learning are developed by considering criteria that quantify the informativeness of the unknown data. The active design of data sampling improves the performance of algorithms in several fields, such as text classification [10], semi-supervised learning [14], and support vector machines [15]. Active data sampling is particularly effective when the data possess uncertainty due to a noisy generative process and there is a limitation on the number of data samplings. In the context of group testing, active sampling of data corresponds to the active design of the pools for the subsequent stage. Tests are noisy and the number of tests should be reduced; active sampling makes a significant contribution by addressing these issues.

In this paper, we propose an active pooling design method employing Bayesian inference for the efficient identification of infected patients using group testing. Bayesian modeling can account for the finite false probabilities of the test and provides a measure to quantify the uncertainty, the posterior predictive distribution. We sequentially design pools based on the predictive distribution in adaptive group testing. The procedure is executed using a statistical-physics-based algorithm, belief propagation (BP) [16-19], which achieves a reasonable approximation of the estimates at a feasible computational cost [20].
We demonstrate that, compared with the approach that uses randomly generated pools, the proposed pooling method effectively corrects errors with a smaller number of tests.

II. MATHEMATICAL FORMULATION
Let us denote the true state of the N patients by X^{(0)} ∈ {0,1}^N, where X^{(0)}_i = 1 and X^{(0)}_i = 0 indicate that the i-th patient is infected and not infected, respectively. The pooling of the patients is determined by a matrix F ∈ {0,1}^{M×N}, where M (< N) is the number of pools, and F_{μi} = 1 and F_{μi} = 0 indicate that the i-th patient is and is not in the μ-th pool, respectively. The true state of the μ-th pool, denoted by T(X^{(0)}, F̃_μ), where F̃_μ is the μ-th row vector of F, is given by T(X^{(0)}, F̃_μ) = ∨_{i=1}^{N} F_{μi} X^{(0)}_i, where ∨_{i=1}^{N} f_i = f_1 ∨ f_2 ∨ ··· ∨ f_N denotes the logical sum of N components. Namely, when the μ-th pool contains at least one infected patient, the state of the μ-th pool is 1 (positive); otherwise, it is 0 (negative).

The test error is modeled by a function C(·) that returns 0 or 1 according to the probability conditioned on the input as

P(C(a) = 1 | a = 1) = p_TP,  P(C(a) = 0 | a = 1) = 1 − p_TP,   (1)
P(C(a) = 1 | a = 0) = p_FP,  P(C(a) = 0 | a = 0) = 1 − p_FP,   (2)

where p_TP and p_FP correspond to the true-positive (TP) and false-positive (FP) probabilities of the test, respectively [18, 20]. We assume that the test errors are independent of each other; further, from the property of C(·), the generative model of Y is given by P_gen(Y | X^{(0)}, F) = ∏_{μ=1}^{M} P_gen(Y_μ | X^{(0)}, F̃_μ), where

P_gen(Y_μ | X^{(0)}, F̃_μ) = {p_TP Y_μ + (1 − p_TP)(1 − Y_μ)} T(X^{(0)}, F̃_μ) + {p_FP Y_μ + (1 − p_FP)(1 − Y_μ)} (1 − T(X^{(0)}, F̃_μ))   (3)

is a Bernoulli distribution conditioned on X^{(0)} and F.

Our aim is to infer the true states of the patients, X^{(0)}, from the observation Y. To this end, the Bayes formula is considered. Further, we introduce the prior distribution of the patient states P_pri(X_i) ∼ Bernoulli(ρ), where ρ ∈ [0,1] is the assumed infection probability.
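As a concrete illustration of the generative model (1)-(3), the following Python sketch simulates noisy tests on given pools. It is a minimal sketch under our own naming (the function `run_tests` and its arguments are not from the paper): each pool outcome is the logical sum of the states of its members, flipped according to p_TP and p_FP.

```python
import numpy as np

def run_tests(X, F, p_TP, p_FP, rng):
    """Simulate noisy pooled tests, eqs. (1)-(3): a pool is truly positive
    iff it contains an infected patient (logical OR), and the observed
    outcome is 1 with probability p_TP (true pool positive) or p_FP
    (true pool negative)."""
    T = (F @ X > 0).astype(int)               # true pool states T(X^(0), F_mu)
    p_positive = np.where(T == 1, p_TP, p_FP) # P(Y_mu = 1) for each pool
    return (rng.random(len(T)) < p_positive).astype(int)

rng = np.random.default_rng(0)
X = np.array([0, 1, 0, 0, 0])                 # only patient 1 is infected
F = np.array([[1, 1, 0, 0, 0],                # pool 0 contains patient 1
              [0, 0, 1, 1, 1]])               # pool 1 does not
Y = run_tests(X, F, p_TP=1.0, p_FP=0.0, rng=rng)  # error-free tests
# with p_TP = 1, p_FP = 0 the outcomes equal the true pool states: [1, 0]
```

With p_TP < 1 or p_FP > 0, repeated calls produce the independent Bernoulli errors assumed in the text.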
Following the Bayes rule, the posterior distribution is given by P_post(X | Y) ∝ P_gen(Y | X) ∏_i P_pri(X_i | ρ). The i-th patient's state is identified on the basis of the marginal distribution P_post(X_i | Y) = Σ_{X\X_i} P_post(X | Y), where X\X_i denotes the components of X other than X_i. As the variable X_i is binary, we can represent the marginal distribution using a Bernoulli probability θ_i as

P_post(X_i | Y) = θ_i(Y) X_i + (1 − θ_i(Y))(1 − X_i),   (4)

where θ_i(Y) corresponds to the infection probability estimated under the test result Y, namely, the probability that X_i = 1. We have to convert the returned probability to a binary value for the identification of the patients' states. The simplest estimate of X^{(0)}_i is the maximum a posteriori (MAP) estimator given by

X^{(MAP)}_i = I(θ_i > 0.5),   (5)

where I(a) is the indicator function whose value is 1 when a is true, and 0 otherwise.

III. ADAPTIVE DESIGN OF POOLS
Here, we divide the M tests into M_ini tests performed on pools fixed in advance (the initial stage) and M_ada tests sequentially performed on actively designed pools (the adaptive stage); hence, M = M_ini + M_ada. We denote the index set of the patients in the ν-th pool as π(ν), where F_{νi} = 1 if i ∈ π(ν) and 0 otherwise. We consider the determination of π(ν+1) (ν ≥ M_ini) among the possible pools, denoted by P, based on the 1, ..., ν-th test outcomes Y^{(ν)} = [Y_1, Y_2, ..., Y_ν]^T, which are performed on the pools π(1), ..., π(ν). The predictive distribution for the unknown result Y ∈ {0,1} of a test that will be performed on a certain pool π ∈ P is defined as

P_pre(Y | Y^{(ν)}, π) = Σ_{X_π} P_gen(Y | X_π) P_post(X_π | Y^{(ν)}),   (6)

where X_π = {X_i | i ∈ π} and P_post(X_π | Y^{(ν)}) = Σ_{X\X_π} P_post(X | Y^{(ν)}). By setting P_post(X_π = 0 | Y^{(ν)}) = q(π; Y^{(ν)}), which is the estimated probability under given Y^{(ν)} that no patient in the pool π is infected, the predictive distribution is expressed as

P_pre(Y | Y^{(ν)}, π) = {p_TP Y + (1 − p_TP)(1 − Y)}(1 − q(π; Y^{(ν)})) + {p_FP Y + (1 − p_FP)(1 − Y)} q(π; Y^{(ν)}).   (7)

The predictive distribution measures the adequacy of the posterior distribution for describing unknown data, and is used as a modeling criterion in Bayesian inference [21]. We use the predictive distribution for the active design of pools. For an intuitive discussion, let us consider the case where P_pre(Y = 1 | Y^{(ν)}, π) and P_pre(Y = 0 | Y^{(ν)}, π) are significantly different; say, they are close to 1 and close to 0, respectively. This means that the posterior distribution is consistent with the new observation performed on the pool π, in the sense that the current posterior already predicts the new test result Y = 1 with high confidence, and such a test adds little information. Instead, we select a pool π that gives comparable P_pre(Y = 1 | Y^{(ν)}, π) and P_pre(Y = 0 | Y^{(ν)}, π), for which the posterior at step ν cannot explain the test result performed on the pool π; hence, the test result is expected to correct the posterior to explain it.

This strategy can be expressed as the maximization of the predictive entropy at step ν + 1,

S(Y^{(ν)}, π) = − Σ_Y P_pre(Y | Y^{(ν)}, π) ln P_pre(Y | Y^{(ν)}, π),   (8)

which is maximized when P_pre(Y = 1 | Y^{(ν)}, π) = P_pre(Y = 0 | Y^{(ν)}, π) = 0.5. From eq. (7), P_pre(Y = 1 | Y^{(ν)}, π) = p_TP (1 − q(π; Y^{(ν)})) + p_FP q(π; Y^{(ν)}). Regarding the predictive entropy as a function of q ∈ [0,1], the maximum of the predictive entropy is achieved at q = q* given by

q* = (p_TP − 0.5) / (p_TP − p_FP),   (9)

where p_FP < 0.5 ≤ p_TP is assumed. We determine the (ν+1)-th pool as

π(ν+1) = argmin_{π∈P} |q(π; Y^{(ν)}) − q*|.   (10)

The remaining task is the calculation of q(π; Y^{(ν)}) for the possible π under the given test results Y^{(ν)}. The mathematical form of q(π; Y^{(ν)}) depends on the size of π, denoted by |π|. When |π| = 1, we obtain

q(π; Y^{(ν)}) = 1 − θ_π(Y^{(ν)}).   (11)

For larger pools, the correlation between the patients in the pool should be considered for the exact evaluation of q(π; Y^{(ν)}). For example, when π = {i, j} (|π| = 2), we obtain

q(π; Y^{(ν)}) = χ_ij(Y^{(ν)}) + (1 − θ_i(Y^{(ν)}))(1 − θ_j(Y^{(ν)})),   (12)

where χ_ij(Y^{(ν)}) = E_{post|Y^{(ν)}}[X_i X_j] − θ_i(Y^{(ν)}) θ_j(Y^{(ν)}) is the susceptibility, and E_{post|Y^{(ν)}}[·] denotes the average according to the posterior distribution P_post(X | Y^{(ν)}).

Next, we discuss the relationship between q*, p_TP, and p_FP. From the definition of q*, eq. (9), if p_TP > 1 − p_FP, then q* < 0.5. This indicates that pools with q(π; Y^{(ν)}) < 0.5 tend to be chosen when p_TP > 1 − p_FP. In other words, when the probability that at least one patient in a pool is infected is larger than the probability that no one is infected, the pool tends to be chosen. This can be understood as follows. Introducing the false-negative probability p_FN = 1 − p_TP, the condition p_TP > 1 − p_FP is equivalent to p_FN < p_FP. This means that false test results are mainly contained in the positive results. Hence, pools with q(π; Y^{(ν)}) < 0.5, which are likely to return positive results, are more informative than pools with q(π; Y^{(ν)}) > 0.5. Therefore, in the active pooling design based on uncertainty, pools with q(π; Y^{(ν)}) < 0.5 are preferred when p_TP > 1 − p_FP. Following the same logic, pools with q(π; Y^{(ν)}) > 0.5 are preferred when p_TP < 1 − p_FP.

IV. IMPLEMENTATION BY BELIEF PROPAGATION
The computation of the marginal distribution requires sums of exponential order and is thus intractable. We approximately calculate the marginal distribution using the BP algorithm [17-19]. A comparison of the BP approximation with the exact calculation at small sizes shows that the BP algorithm has sufficient approximation performance when applied to group testing [20]. In this study, we use the BP algorithm as a reasonable method owing to its approximation accuracy and computational time. In Appendix A, the BP algorithm for calculating the infection probability given by the posterior distribution is summarized. We denote the obtained estimate of θ_i and the corresponding MAP estimator as θ̂_i and X̂^{(MAP)}_i = I(θ̂_i > 0.5), respectively. We measure the accuracy of the MAP estimator by the TP and FP rates given by

TP = Σ_i X^{(0)}_i X̂^{(MAP)}_i / Σ_i X^{(0)}_i,   FP = Σ_i (1 − X^{(0)}_i) X̂^{(MAP)}_i / Σ_i (1 − X^{(0)}_i),   (13)

respectively. A TP value larger than p_TP and an FP value smaller than p_FP indicate that the BP-based identification has better performance than the parallel individual testing of the N patients.

To apply the BP algorithm to an adaptive test, we need to obtain q(π; Y^{(ν)}) for each ν (> M_ini). Its exact computation requires multibody correlations between the patients except when |π| = 1, whereas the BP algorithm returns one-body information. In this study, we use the simplest approximation provided by the BP algorithm, q̂(π; Y^{(ν)}) ≡ ∏_{i=1}^{|π|} (1 − θ̂_{π_i}), where π_i (i = 1, ..., |π|) is the i-th component of the pool π, to avoid the increase in computational time required for the calculation of the multibody correlations. Further, to reduce the computation time of q(π; Y^{(ν)}) over all possible π, we focus on the subspaces of pools P_1 = {π : |π| = 1, π ∈ P} and P_2 = {π : |π| ≤ 2, π ∈ P}; hence, P_1 ⊂ P_2 ⊂ P. In principle, BP can approximately compute the correlations between patients by deriving conditional posterior expectations, which requires additional computations of the order of O(N!/(N − |π|)!) according to the product rule of conditional joint distributions. As an example, we calculate the susceptibility using the BP algorithm and implement the active pooling design on the basis of eq. (12) for |π| = 2 in Appendix B; otherwise, we use q̂(π; Y^{(ν)}) throughout the study.

Algorithm 1
Group testing with active pooling design using the belief propagation (BP) algorithm
Input: Y^{(M_ini)} ∈ {0,1}^{M_ini} and F^{(M_ini)} ∈ {0,1}^{M_ini × N}
Output: θ̂ ∈ [0,1]^N
θ̂ ← BP(Y^{(M_ini)}, F^{(M_ini)})
for ν = M_ini + 1, ..., M do
    q̂(π; Y^{(ν−1)}) ← ∏_{i=1}^{|π|} (1 − θ̂_{π_i}) for π ∈ P
    π(ν) ← argmin_{π∈P} |q̂(π; Y^{(ν−1)}) − q*|
    F̃_ν ← [F_{ν1}, ..., F_{νN}], where F_{νi} = 1 if i ∈ π(ν), otherwise 0
    Y_ν ∼ P_gen(Y | X^{(0)}, F̃_ν)    ▷ Test result performed on π(ν)
    Y^{(ν)} ← [Y^{(ν−1)}; Y_ν]
    F^{(ν)} ← [F^{(ν−1)}; F̃_ν]
    θ̂ ← BP(Y^{(ν)}, F^{(ν)})
end for

The setting of the numerical simulation described in this section is as follows. Let us denote the longitudinal coupling of matrices or vectors a and b that have the same number of columns as [a; b]; hence, F = [F̃_1; F̃_2; ···; F̃_M]. The submatrix of F given by the 1-st to the ν-th row vectors is denoted by F^{(ν)} = [F̃_1; ···; F̃_ν]; hence, F^{(ν+1)} = [F^{(ν)}; F̃_{ν+1}]. The pooling matrix for the initial stage, F^{(M_ini)}, is randomly generated under the constraint that the number of patients in each pool and the number of pools each patient belongs to are fixed at N_G (≪ N) and N_O (≪ N), respectively. Hence, Σ_{i=1}^{N} F_{μi} = N_G for μ ≤ M_ini and Σ_{μ=1}^{M_ini} F_{μi} = N_O for all i hold, and the relationship N_O = N_G M_ini / N holds. The corresponding test result in the initial stage, Y^{(M_ini)}, is generated as Y^{(M_ini)} ∼ P_gen(Y | X^{(0)}, F^{(M_ini)}). The posterior distribution under given Y^{(M_ini)} and F^{(M_ini)} is approximately calculated using the BP algorithm. For the subsequent adaptive stage, we actively choose π(M_ini + 1) among P_1 or P_2 based on the predictive entropy given by the posterior distribution of the initial stage. Next, we construct F̃_{M_ini+1} so that F_{M_ini+1, i} = 1 if i ∈ π(M_ini + 1); otherwise, 0.
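To make the selection step of Algorithm 1 concrete, the following Python sketch picks the pool in the space P_2 (all pools of size at most 2) whose approximated all-negative probability q̂ is closest to q* of eq. (9). This is our own minimal illustration under assumed names (`select_next_pool`, `theta_hat`); it takes BP marginal estimates as given rather than computing them.

```python
import numpy as np
from itertools import combinations

def select_next_pool(theta_hat, p_TP, p_FP, max_size=2):
    """One selection step of Algorithm 1: among candidate pools of size
    <= max_size, return the pool pi minimizing |q_hat(pi) - q*|, where
    q_hat(pi) = prod_{i in pi} (1 - theta_hat_i) is the independence
    approximation of the probability that no one in pi is infected."""
    q_star = (p_TP - 0.5) / (p_TP - p_FP)     # entropy-maximizing q, eq. (9)
    N = len(theta_hat)
    candidates = []                            # pooling space P_2
    for size in range(1, max_size + 1):
        candidates.extend(combinations(range(N), size))
    q_hat = lambda pi: np.prod([1.0 - theta_hat[i] for i in pi])
    best = min(candidates, key=lambda pi: abs(q_hat(pi) - q_star))
    return best, q_star

theta_hat = np.array([0.05, 0.45, 0.30, 0.02])   # BP marginal estimates
pool, q_star = select_next_pool(theta_hat, p_TP=0.9, p_FP=0.05)
# here q* = 0.4/0.85 ~ 0.471, and the pair {0, 1} with q_hat ~ 0.52 is chosen
```

Enumerating P_2 costs O(N^2) per step, which is the reason the main text restricts the pooling space rather than searching all 2^N − 1 pools.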
The test result is generated as Y_{M_ini+1} ∼ P_gen(Y | X^{(0)}, F̃_{M_ini+1}), and we obtain the posterior distribution under F^{(M_ini+1)} = [F^{(M_ini)}; F̃_{M_ini+1}] and Y^{(M_ini+1)} = [Y^{(M_ini)}; Y_{M_ini+1}] using the BP algorithm. This adaptive test procedure is repeated M_ada times, where M = M_ini + M_ada, and the state of the patients is determined by the MAP estimator corresponding to θ̂(Y, F), where Y = [Y^{(M_ini)}; Y_{M_ini+1}; ...; Y_M] and F = [F^{(M_ini)}; F̃_{M_ini+1}; ...; F̃_M]. The pseudocode is summarized in Algorithm 1, where BP(Y, F) indicates the calculation of the infection probability using the BP algorithm under the input Y and F (see Appendix A).

The true state of patients X^{(0)} is randomly generated under the constraint that Σ_i X^{(0)}_i = Nρ. Here, we assume that the correct parameters ρ, p_TP, and p_FP are known in advance. For more general cases where the estimation of unknown parameters is required, we can construct their estimators by combining the BP algorithm with the expectation-maximization method, or by introducing a hierarchical Bayes model [20].

FIG. 1. ρ-dependence of (a) the true-positive and (b) the false-positive rate at M_ini = 300 and M_ada = 100; the group size N_G = 10 is used, and the false-positive probability is fixed at p_FP = 0.05. The horizontal dashed line in (a) represents p_TP. For comparison, a random test with M = 400 is shown; P_1 and P_2 denote the P_1 and P_2 cases, respectively.

FIG. 2. M-dependence of (a) the true-positive and (b) the false-positive rate at ρ = 0.02. The adaptive tests are performed 100 times after the initial stage with M_ini = 300, 400, and 500, and N_G = 10. The error probabilities of the test are fixed at p_TP = 0.9 and p_FP = 0.05. The horizontal dashed line in (a) represents p_TP.

Fig. 1 shows the ρ-dependence of (a) TP and (b) FP at M = 400 with M_ini = 300 and M_ada = 100, p_FP = 0.05, and the group size in the initial stage N_G = 10. P_1 and P_2 in the figure denote the results of the active pooling in the spaces P_1 and P_2, respectively [24]. For comparison, the results of random pooling are shown, where the tests in the M_ada steps are performed on random pools generated by the same rule as the initial M_ini tests. Each data point represents the value averaged over 100 realizations of Y^{(M_ini)}, F^{(M_ini)}, and X^{(0)}. For any region of ρ, TP under a random test cannot exceed p_TP, which is indicated by the horizontal line in Fig. 1(a). The adaptive test improves TP and achieves TP > p_TP when ρ < 0.02 for the P_1 case and ρ < 0.04 for the P_2 case. As shown in Fig. 1(b), FP is smaller than p_FP even when the pooling is randomly determined, but the adaptive test can further decrease FP.

The performance of the adaptive test depends on the number of initial random tests, M_ini. Fig. 2 shows the M_ini-dependence of (a) TP and (b) FP at ρ = 0.02, p_TP = 0.9, and p_FP = 0.05. The pool size at the initial stage is N_G = 10. The figure presents the results for M_ini = 300, 400, and 500. As M_ini increases, a high TP close to 1 is obtained via the adaptive test. Moreover, for a large M_ini, such as M_ini = 500, the result of TP depends on the pooling space, and more accurate identification is achieved by P_2. The test results at the initial stage have large uncertainties when M_ini is small, and hence it is considered that the larger pooling space is required for the effective sampling of the uncertain pools.

As shown in Fig. 2(a), to achieve a high TP, the active pooling method requires a smaller number of tests than that required by the random pooling method. For instance, the active pooling in P_2 after an M_ini = 300 initial stage results in a high TP > p_TP after an M_ada = 40 adaptive stage, namely M = 340 tests in total, whereas the random pooling achieves TP > p_TP only with almost M = 500 tests. With regard to the improvement of TP, the adaptive method helps effectively identify infected patients with a small number of tests.

The active pooling is robust to errors in the test, compared with the random pooling. Fig. 3 shows (a) the p_TP-dependence of TP for p_FP = 0.05 and (b) the p_FP-dependence of TP for p_TP = 0.95 at M_ini = 300, M_ada = 100, and ρ = 0.02. The random tests in the initial stage are performed on pools of size N_G = 10. For the random pooling case, TP > p_TP is achieved only when p_FP is sufficiently small, such as p_FP < 0.02. The adaptive test improves TP, and the parameter region where TP > p_TP holds is extended, in particular for the case P_2.

FIG. 3. (a) p_TP-dependence of TP for p_FP = 0.05 and (b) p_FP-dependence of TP for p_TP = 0.95 at ρ = 0.02. The adaptive tests are conducted 100 times after the initial stage, with M = 400 tests in total. The pool size in the random tests is N_G = 10. The horizontal dashed lines represent TP = p_TP.

These results indicate the efficiency of the active pooling design based on the predictive distribution in group testing. However, a limitation of this approach is that its computational cost is higher than that of the non-adaptive approach. The estimation of the infection probability via the BP algorithm is performed again at each of the M_ada steps; hence, the computational cost of the adaptive approach is approximately M_ada times larger than that of the non-adaptive approach. Nevertheless, the adaptive approach achieves accurate estimation using a small number of tests. The trade-off between the reduction of the operating cost of the tests and the increase in the computation time of the inference should be considered for the practical application of the adaptive approach.

V. SUMMARY AND DISCUSSION
In this paper, we proposed an active pooling design for adaptive group testing, where the pool for the subsequent stage is determined based on the Bayesian posterior predictive distribution under the test outcomes of the previous stage. The proposed method was implemented using the BP algorithm, and the identification of infected patients using adaptive tests was demonstrated to be more accurate than that using randomly designed pools. In particular, the active pooling design reduced the number of tests required to achieve TP > p_TP. Further, the proposed method is robust to test errors, and TP > p_TP holds at smaller p_TP and larger p_FP, compared with the approach that uses randomly designed pools.

In the current study, we restricted the possible pooling space to P_1 and P_2. Mathematically, more uncertain pools can be considered by removing this restriction, and further improvement in the TP and FP rates is expected. However, the straightforward calculation of the predictive entropy for all possible π is computationally intractable; hence, some approximation will be required. An efficient sampling method over π ∈ P for finding the uncertain pools, such as the Markov chain Monte Carlo method, should be developed.

We focused on the MAP estimator to convert the estimated infection probability, a [0,1] variable, into the state of a patient, a {0,1} variable, because of its simplicity; however, changing the decision threshold from 0.5 can result in improvements in the TP rate. For example, an estimate using the confidence interval constructed by the bootstrap method has been obtained, and the TP rate of this method is higher than that of the MAP estimator [20]; however, its computational cost is too high to accompany the active pooling procedure. The receiver operating characteristic (ROC) analysis is a promising method to understand the appropriate decision threshold [25, 26].
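To illustrate the threshold discussion, here is a minimal sketch of our own (the helper `tp_fp_rates` is not from the paper) showing how the TP and FP rates of eq. (13) change when the decision threshold applied to the estimated probabilities deviates from the MAP value of 0.5:

```python
import numpy as np

def tp_fp_rates(theta_hat, X_true, threshold=0.5):
    """TP and FP rates of eq. (13) for the estimator I(theta_hat > threshold).
    threshold = 0.5 corresponds to the MAP estimator of eq. (5)."""
    X_est = (theta_hat > threshold).astype(int)
    tp = np.sum(X_true * X_est) / np.sum(X_true)
    fp = np.sum((1 - X_true) * X_est) / np.sum(1 - X_true)
    return tp, fp

theta_hat = np.array([0.9, 0.4, 0.1, 0.6, 0.2])   # estimated probabilities
X_true = np.array([1, 1, 0, 0, 0])                 # true states (toy example)
tp_map, fp_map = tp_fp_rates(theta_hat, X_true, threshold=0.5)
tp_low, fp_low = tp_fp_rates(theta_hat, X_true, threshold=0.3)
# lowering the threshold can only raise (or keep) both the TP and FP rates
```

Sweeping the threshold over [0, 1] and plotting the resulting (FP, TP) pairs traces exactly the ROC curve referred to above.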
Along with the ROC analysis, the mathematical background of the active pooling proposed in this paper is expected to be established.

ACKNOWLEDGMENTS
This work was accomplished thanks to the author's pleasant discussions with Yukito Iba. Further, the author thanks Koji Hukushima, Yoshiyuki Kabashima, and Satoshi Takabe for their helpful comments and discussions. This research was partially supported by Grant-in-Aid for Scientific Research 19K20363 from the Japan Society for the Promotion of Science (JSPS) and JST PRESTO Grant Number JPMJPR19M2, Japan.

FIG. 4. (a) Examples of the susceptibility calculated using the BP algorithm and by exact computation at N = 20, N_G = 10, N_O = 5, p_TP = 0.95, and p_FP = 0.05; χ_{ij} for j = 20 is shown for two different realizations of Y, F, and X^{(0)}. (b) Quantification of the difference between the susceptibilities given by the BP algorithm and the exact calculations at N = 20 and N_G = 10, measured by ε as a function of ρ, for (α, p_TP, p_FP) = (0.5, 0.9, 0.1), (0.5, 0.9, 0.05), and (0.4, 0.9, 0.1), where α ≡ M/N.

Appendix A: BP algorithm for group testing
We denote by π(μ) and G(i) the indices of the patients in the μ-th pool and the indices of the pools in which the i-th patient is included, respectively. For the edge that connects the μ-th factor (test) and the i-th variable (patient), two types of messages, m_{i→μ}(X_i) and m̃_{μ→i}(X_i), are defined. Intuitively, the messages m_{i→μ}(X_i) and m̃_{μ→i}(X_i) represent the marginal distributions of X_i before and after the μ-th test is performed, respectively. The variable X_i is binary; hence, the messages can be expressed by the Bernoulli probabilities θ_{i→μ} and θ̃_{μ→i} given by

θ_{i→μ} = ρ ∏_{ν∈G(i)\μ} θ̃_{ν→i} / Z_{i→μ},   θ̃_{μ→i} = U_μ / Z̃_{μ→i},   (A1)

where U_μ = p_TP Y_μ + (1 − p_TP)(1 − Y_μ), W_μ = p_FP Y_μ + (1 − p_FP)(1 − Y_μ), and

Z̃_{μ→i} = U_μ + U_μ (1 − ∏_{j∈π(μ)\i} (1 − θ_{j→μ})) + W_μ ∏_{j∈π(μ)\i} (1 − θ_{j→μ}),   (A2)

Z_{i→μ} = ρ ∏_{ν∈G(i)\μ} θ̃_{ν→i} + (1 − ρ) ∏_{ν∈G(i)\μ} (1 − θ̃_{ν→i}).   (A3)

The BP algorithm consists of the recursive update of θ_{i→μ} and θ̃_{μ→i}, and at the fixed point, the infection probability is given by [17, 18, 20]

θ̂_i = ρ ∏_{μ∈G(i)} θ̃_{μ→i} / [ρ ∏_{μ∈G(i)} θ̃_{μ→i} + (1 − ρ) ∏_{μ∈G(i)} (1 − θ̃_{μ→i})].   (A4)

Appendix B: Calculation of susceptibility using the BP algorithm
A widely used method to calculate the susceptibility in the framework of the BP algorithm is susceptibility propagation [27, 28], where a recursive update of tensors that give the susceptibility is introduced on the basis of linear-response theory. In the current problem setting, the variables to be estimated obey Bernoulli probabilities; hence, we can compute the susceptibility in a simpler way.

Let us denote the expectation of X_j under the posterior conditioned on X_i = 1 as θ^{X_i=1}_j ≡ E_{post|Y}[X_j | X_i = 1] (i ≠ j). This expectation value is evaluated using the BP algorithm by fixing θ_{i→μ} = θ̃_{μ→i} = 1 for all μ ∈ G(i). The conditional expectation value obtained using the BP algorithm is denoted as θ̂^{X_i=1}_j. Thus, the susceptibility is given by χ̂_ij = θ̂_i θ̂^{X_i=1}_j − θ̂_i θ̂_j. We can show that the symmetry θ̂_i θ̂^{X_i=1}_j = θ̂_j θ̂^{X_j=1}_i holds.

To check the accuracy of the susceptibility derived using the BP algorithm, we compute the exact posterior distribution by summing over all configurations in {0,1}^N. Examples of the exact susceptibility and the approximated one are shown in Fig. 4(a) at N = 20, p_TP = 0.95, and p_FP = 0.05, where the i-dependence of χ_{i,20} is shown for two different realizations of Y, F, and X^{(0)}. Here, the pooling matrix is randomly generated with N_G = 10 and N_O = 5. The difference between χ and χ̂ is quantified by ε ≡ Σ_{i<j} (χ_ij − χ̂_ij)² / {N(N − 1)/2}, whose behavior is shown in Fig. 4(b) at N = 20 for different values of α ≡ M/N, p_TP, and p_FP. For any parameter region examined, ε remains small. Therefore, we consider that the BP algorithm provides a reasonable approximation of the susceptibility, and we expect that it is also applicable for larger N.

In Fig. 5, (a) TP and (b) FP are shown for the cases where the susceptibility is considered (denoted by 'P_2: with χ') and not considered (denoted by 'P_2: without χ'); namely, eq. (12) is used to determine the pool in the subsequent stage by substituting the χ̂ calculated by BP into χ, at p_TP = 0.9 and p_FP = 0.1. Each data point is averaged over 50 samples of F, X^{(0)}, and Y. The initial stage consists of M_ini = 80 random tests with N_G = 10 and N_O = 4. The P_1 case is compared with the random case with the same tests at the initial stage. Considering the susceptibility, a slight improvement in TP is observed.

FIG. 5. Comparison of (a) the true-positive and (b) the false-positive rate, as functions of M_ada, for the P_2 case considering the susceptibility, the P_2 case without considering the susceptibility, and the P_1 case, at M_ini = 80, p_TP = 0.9, and p_FP = 0.1. The tests in the initial stage are performed on the randomly designed pools with N_G = 10 and N_O = 4.

Following the same procedure, we can in principle compute higher-order correlations. For example, E_{post|Y}[X_i | X_j = 1, X_k = 1] is obtained by fixing θ_{j→μ} = θ_{k→ν} = 1 and θ̃_{μ→j} = θ̃_{ν→k} = 1 for all μ ∈ G(j) and ν ∈ G(k).

[1] R. Dorfman, Ann. Math. Statist., 436 (1943).
[2] D.-Z. Du and F. K. Hwang, Combinatorial Group Testing and Its Applications (World Scientific, 2000).
[3] M. Sobel and P. A. Groll, Bell System Tech. J., 1179 (1959).
[4] J. K. Wolf, IEEE Transactions on Information Theory, 185 (1985).
[5] M. Sobel and P. A. Groll, Bell Labs Technical Journal, 1179 (1959).
[6] M. Sobel and P. A. Groll, Technometrics, 631 (1966).
[7] F. K. Hwang, Journal of the American Statistical Association, 605 (1972).
[8] V. V. Fedorov, Theory of Optimal Experiments (Academic Press, New York, 1972).
[9] F. Pukelsheim, Optimal Design of Experiments (Academic Press, New York, 1972).
[10] D. A. Cohn, Z. Ghahramani, and M. I. Jordan, Journal of Artificial Intelligence Research, 129 (1996).
[11] B. Settles, Active Learning Literature Survey, Tech. Rep. (University of Wisconsin-Madison, Department of Computer Sciences, 2009).
[12] E. Brochu, V. M. Cora, and N. de Freitas, A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning, https://arxiv.org/abs/1012.2599 (2010).
[13] B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas, Proceedings of the IEEE, 148 (2015).
[14] X. Zhu, J. Lafferty, and Z. Ghahramani, in ICML-2003 Workshop on the Continuum from Labeled to Unlabeled Data (2003), pp. 58-65.
[15] S. Tong and D. Koller, Journal of Machine Learning Research, 45 (2001).
[16] M. Mézard and A. Montanari, Information, Physics, and Computation (Oxford University Press, 2009).
[17] M. Mézard, M. Tarzia, and C. Toninelli, Journal of Physics: Conference Series, 012019 (2008).
[18] D. Sejdinovic and O. Johnson, in (IEEE, 2010), pp. 998-1003.
[19] T. Kanamori, H. Uehara, and M. Jimbo, Journal of Statistical Theory and Practice, 220 (2012).
[20] A. Sakata, Journal of the Physical Society of Japan, 084001 (2020).
[21] G. Kitagawa, Communications in Statistics - Theory and Methods, 2223 (1997).
[22] D. D. Lewis and W. A. Gale, in ACM SIGIR Conference on Research and Development in Information Retrieval (ACM/Springer, 1994), pp. 3-12.
[23] D. D. Lewis and J. Catlett, in International Conference on Machine Learning (ICML) (Morgan Kaufmann, 1994), pp. 148-156.
[24] We note the heuristics used in the simulation. When ρ is sufficiently small, certain pools can be selected several times in the adaptive stage. It is known that the resulting rank deficiency can cause instability of BP. To avoid this problem, we exclude the already existing pools from the candidates in the subsequent stage for the small-ρ case.
[25] R. Kumar and A. Indrayan, Indian Pediatr, 277 (2011).
[26] K. Hajian-Tilaki, Caspian J Intern Med, 627 (2013).
[27] M. Mézard and T. Mora, J. Physiol. Paris, 107 (2009).
[28] M. Yasuda and K. Tanaka, Physical Review E 87.