Almost Separable Matrices
Matthew Aldridge∗, Leonardo Baldassini† and Karen Gunderson‡

Heilbronn Institute for Mathematics Research, School of Mathematics, University of Bristol, Bristol, UK
School of Mathematics, University of Bristol, Bristol, UK

∗[email protected]  †[email protected]  ‡[email protected]

July 26, 2018
Abstract

An m × n matrix A with column supports {S_i} is k-separable if the disjunctions ⋃_{i∈K} S_i are all distinct over all sets K of cardinality k. While a simple counting bound shows that m > k log₂(n/k) rows are required for a separable matrix to exist, in fact it is necessary for m to be about a factor of k more than this. In this paper, we consider a weaker definition of 'almost k-separability', which requires that the disjunctions are 'mostly distinct'. We show using a random construction that these matrices exist with m = O(k log n) rows, which is optimal for k = O(n^{1−β}). Further, by calculating explicit constants, we show how almost separable matrices give new bounds on the rate of nonadaptive group testing.

1 Introduction

Let A ∈ {0, 1}^{m×n} be an m × n binary matrix, and write S_i for the support of its i-th column (that is, the locations of the 1s). Then A is said to be k-separable if the sets ⋃_{i∈K} S_i are all distinct over all sets K ⊆ {1, 2, . . . , n} of cardinality k (see Definition 1, to come).

Separable matrices were first introduced by Erdős and Moser in 1970 [9] and have since been studied in different contexts, including coding theory, combinatorics and, as we discuss later, group testing, where they play a very important role.

Separable matrices are often studied through the slightly stronger concept of disjunct matrices (see Definition 3). Disjunct matrices were first introduced by Kautz and Singleton [11] and, just like separable matrices, they have been extensively studied in coding theory, combinatorics and group testing [5, 7, 8, 10, 18].

A central question in the study of both separable and disjunct matrices is the following: given n and k, how large must m be for there to exist an m × n k-separable or disjunct matrix? In this paper, we investigate the asymptotics for separability as n → ∞, where k may grow with n.

A simple counting bound (Theorem 2) shows that m = Ω(k log(n/k)) rows are required. Disappointingly, when k = o(n) this bound is not tight, and we require roughly a factor of k more than this: in fact it has been shown [7, 5] that m = Ω(k² log n / log k) is needed. This lower bound is motivated by the connection between disjunctness and separability, as we discuss in Section 2. Notice that when k grows linearly with n, taking the identity matrix is order optimal – for this reason, we consider only k = o(n) in this paper.

In order to meet the lower bound m = Ω(k log(n/k)), we consider a relaxation of the requirement of k-separability to almost k-separability. Roughly speaking, a matrix is almost k-separable if the sets ⋃_{i∈K} S_i are 'usually' distinct – see Definition 4 for a formal definition.

Our main result shows that it is possible to achieve almost separability with only O(k log n) rows (Theorem 7). When k = O(n^{1−β}), for any β ∈ (0, 1), this is order optimal. We also pay close attention to the constants in our bounds on m – a goal motivated by the study of the rate of group testing algorithms.

Group testing is an old and well-studied search problem, first considered by Dorfman [6], where the goal is to recover a sparse subset of k defective elements spread among n otherwise identical items. Instead of testing each item for defectiveness individually, classic group testing algorithms test items in batches. In the noiseless binary model we consider, tests can only reveal whether a given set contains at least one defective (a positive test) or no defectives (a negative test).
The connection between separable matrices and nonadaptive group testing is well known, and we discuss it in Section 5. For the moment, we just observe that a sequence of tests designed a priori (nonadaptive group testing) has a natural binary-matrix representation: each length-n row represents a test, with entries being 1 if the corresponding item is included in the test. A matrix being k-separable is equivalent to having zero probability of error for nonadaptive group testing, while a matrix being almost k-separable is equivalent to having a small probability of error. The 'arbitrarily small probability of error' criterion we consider here is the same as that in Shannon's theory of channel coding.

With this comparison in mind, we consider the concept of rate of group testing (Definition 10) for k = n^{1−β} defective items in a population of size n, which can be thought of as the amount of information conveyed by each test. Using a separable matrix with m = Ω(k² log n / log k) rows leads to a group testing rate of 0. However, using an almost separable matrix with m = O(k log n) rows gives a strictly positive rate, with the rate depending on the constant implied by the big-O. Hence, here we are interested in getting good constants for m, not only in order-wise results.

In Theorem 11, we show that our results meet previous results in the limiting regime where k is fixed as n → ∞, and improve over the previous best known bounds for larger values of the sparsity parameter β ∈ [0, 1] in the k = n^{1−β} regime frequently considered in the group testing literature.

2 Separable matrices
We begin by recalling the definition of a separable matrix.
Definition 1.
Given an m × n binary matrix A = (a_ij) ∈ {0, 1}^{m×n}, we shall write S_i := {j : a_ij = 1} for the support of column i, and for K ⊆ {1, 2, . . . , n} also write S(K) := ⋃_{i∈K} S_i for the support of a disjunction of columns.

The matrix A is called k-separable if, for all sets K of size k, there is no other set L also of size k with S(L) = S(K).

The case k = 0 is trivial, so we assume k ≥ 1. We will sometimes also assume that k ≤ n/2, which will be no restriction in the limiting regimes we study.

The following counting bound is described by Chen and Hwang as "simple-minded" [5].
Theorem 2.
Let M(n, k) be the smallest m such that an m × n k-separable matrix exists. Then

M(n, k) ≥ log₂ \binom{n}{k}.

Proof.
Clearly |{S(K) : |K| = k}| ≤ |P({1, 2, . . . , m})| = 2^m, where P denotes the power set. Hence for A to be k-separable we require 2^m ≥ \binom{n}{k}, and taking logarithms gives the result.

Using the lower bound of

(n/k)^k ≤ \binom{n}{k} ≤ (en/k)^k    (1)

(which we shall use many times in this paper), we see that a k-separable matrix must have at least m ≥ k log₂(n/k) = Ω(k log(n/k)) rows.
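To make the counting bound concrete, here is a small illustrative Python sketch (ours, not from the paper; the function names are our own) that evaluates the lower bound of Theorem 2 and checks k-separability of a small matrix by brute force, directly following Definition 1.

```python
import math
from itertools import combinations

def counting_bound(n, k):
    """Smallest m allowed by Theorem 2: m >= log2(C(n, k))."""
    return math.ceil(math.log2(math.comb(n, k)))

def is_k_separable(A, k):
    """Brute-force check of Definition 1: all unions of k column supports are distinct.

    A is a list of m rows, each a list of n entries in {0, 1}.
    """
    m, n = len(A), len(A[0])
    supports = [frozenset(j for j in range(m) if A[j][i] == 1) for i in range(n)]
    seen = set()
    for K in combinations(range(n), k):
        union = frozenset().union(*(supports[i] for i in K))
        if union in seen:
            return False          # two different k-sets share the same disjunction
        seen.add(union)
    return True

if __name__ == "__main__":
    n, k = 8, 2
    print("counting bound on m:", counting_bound(n, k))   # log2 C(8,2) = log2 28, so 5
    # The n x n identity matrix is k-separable for every k <= n (here m = n = 8).
    I = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
    print("identity matrix k-separable:", is_k_separable(I, k))
```

The brute-force check is of course only feasible for toy sizes; it is included purely to fix the definitions.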
As we anticipated, separable matrices are tightly related to another class of matrices, namely that of disjunct matrices.

Definition 3. With the notation of Definition 1, A is k-disjunct if for all sets K of cardinality |K| = k, there does not exist i ∉ K such that S_i ⊆ S(K).

In the language of set systems, a matrix A being k-separable is equivalent to the family {S_i}_{i=1}^{n} being k-union-free, and A being k-disjunct is equivalent to {S_i}_{i=1}^{n} being k-cover-free.

It's easy to see that k-disjunctness implies k-separability (see, for example, [11], [7, Section 7.2], or the special case ε = 0 of Lemma 6 below). On the other hand, Chen and Hwang [5, Theorem 2] have shown that it is possible to construct a k-disjunct matrix from a 2k-separable matrix by adding at most one row to it, which means that disjunct and separable matrices share the same order-wise asymptotics. Dyachkov and Rykov have quantified these asymptotics by showing that m = Ω(k² log n / log k) rows are necessary for a matrix to be k-disjunct [8] – similar results appear elsewhere [18, 10] [7, Theorem 7.2.14]. This means that it is not possible to create a k-separable matrix with m = O(k log n) rows.

As disjunctness is a stronger (and, in some ways, simpler) property than separability, efforts to derive upper bounds on m for separable matrices have often proceeded via the construction of disjunct matrices. In their seminal paper [11], Kautz and Singleton give a probabilistic existence theorem for k-disjunct matrices with m = O(k² log n) rows. In the group testing literature there exist explicit constructions of testing schemes with O(k² log n) rows, see for example Porat and Rothschild [17].

3 Almost separable matrices

Since separable matrices cannot meet the counting bound, it would be of interest if a matrix could be close to being separable using only O(k log n) rows. Such a matrix would be order-optimal. With this in mind, we define the concept of an almost separable matrix in a similar manner to Definition 1.

Definition 4.
With the notation of Definition 1, A ∈ {0, 1}^{m×n} is ε-almost k-separable if for at most ε\binom{n}{k} sets K of size k does there exist another set L of size k with S(L) = S(K).

An analogous definition is present in, for example, [22], where almost separable matrices are called weakly separating designs. Note that setting ε = 0 gives the definition of a separable matrix.

The main result of this paper is to show the existence of ε-almost k-separable matrices with m = O(k log n) rows (see Theorem 7 below). We also examine the implicit constants for the case when k = n^{1−β} grows polynomially in n.

Malyutov [14] effectively showed that ε-almost k-separable matrices exist with m = (k + o(1)) log₂ n rows in the regime where k is fixed as n → ∞. This is a special case of a more general result Malyutov proved using an information theoretic argument – this and similar work is reviewed in [15]. Sebő showed effectively the same result [19], again for fixed k, by analysing a concrete bound on the probability that there are two different sets of size k whose disjunctions coincide – we follow a similar route here later. The same result for k fixed and n → ∞ was rediscovered by Zhigljavsky [22, Theorem 5.5]. Although technically different from Sebő's argument, Zhigljavsky's proof is morally similar: given two sets K and L of k columns each, Zhigljavsky counts how many rows it is possible to construct that would produce the same value for both S(K) and S(L). He calls this number a Rényi coefficient and only considers designs with fixed- or bounded-size tests.

Our result improves on these by allowing k to vary arbitrarily with n, subject to k = o(n). In our discussion of group testing in Section 5 we show how, in some regimes, this work also improves on recent results on nonadaptive group testing giving bounds of the form m = O(k log n).

The definition of a disjunct matrix (Definition 3) can similarly be weakened to give an almost disjunct matrix. (This definition also appears in [16] and, previously, in [12].)

Definition 5.
With the notation of Definition 1, A is ε-almost k-disjunct if for at most ε\binom{n}{k} sets K of size k does there exist a column i ∉ K with S_i ⊆ S(K).

Note again that ε = 0 corresponds to a disjunct matrix. Unsurprisingly, almost disjunctness implies almost separability.

Lemma 6.
Let A be an ε-almost k-disjunct matrix. Then A is ε-almost k-separable (with the same ε and k).

Proof. We prove the contrapositive. Suppose A is not ε-almost k-separable. Then there are more than ε\binom{n}{k} sets of size k breaking separability. Let K be one of these sets, so there is another set L of size k with S(K) = S(L). Letting i ∈ L \ K, we have S_i ⊆ S(L) = S(K) with i ∉ K, breaking disjunctness. Hence there are more than ε\binom{n}{k} sets breaking disjunctness, and A is not ε-almost k-disjunct.

Mazumdar [16] shows that there exist almost k-disjunct matrices with m = O(k^{3/2} √(log n)) rows in the regime k ∼ n^δ, δ > 0, which is the same as that we consider for group testing. Mazumdar's construction is similar to those of Kautz and Singleton [11] and Porat and Rothschild [17]. In particular, [11] shows how to build fully disjunct matrices with O(k² log²_k n) rows by mapping the symbols of a q-ary Reed–Solomon code to unit-weight binary vectors of length q, while [17] improves on this scheme by replacing the RS code with a linear q-ary code achieving the Gilbert–Varshamov bound, producing fully disjunct matrices with O(k² log n) rows. Mazumdar's bound improves on the Ω(k² log n / log k) required for full disjunctness or separability, while being less good than the O(k log n) we achieve for almost separability here.

Our main result is then the following.

Theorem 7.
For any sequence k = k(n) = o(n) and ε > 0, there exist ε-almost k-separable matrices with m = O(k log n) rows.

More precisely, for α ∈ [ln 2, 1], define

M_1(n, k, α) = k ln(n/k) / (−ln(1 − 2e^{−α} + 2e^{−2α})),
M_2(n, k, α) = ln(nk) / (−ln(1 − 2e^{−α} + 2e^{−α(1+1/k)})),    (2)
M(n, k) = min_{α ∈ [ln 2, 1]} max{M_1(n, k, α), M_2(n, k, α)}.

Then for any ε, δ > 0, for n sufficiently large and m > (1 + δ)M(n, k), there exists an m × n ε-almost k-separable matrix.

Consider the special case α = ln 2. It is possible to see that M_2 dominates, and hence that there exist almost separable matrices with m = (1 + δ)k log₂(nk) rows. Note that this is sufficient to show the m = O(k log n) result – and comes with a slightly easier proof than the general case (see below). This bound also meets the Malyutov–Sebő result of m ∼ k log₂ n for k constant. However, it is possible to get slightly better constants for most k = k(n) by allowing different values of α. In particular, taking α = 1 gives the best result in many regimes. In Section 5 we discuss the constants in more detail in the regime k = n^{1−β} for β ∈ (0, 1].
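The quantities in (2) are straightforward to evaluate numerically. The following sketch (ours, not from the paper; the grid search over α and the example parameters are arbitrary choices) computes M_1, M_2 and M(n, k), so that the α = ln 2 and α = 1 choices can be compared for a given n and k.

```python
import math

def M1(n, k, alpha):
    # M_1(n, k, alpha) = k ln(n/k) / (-ln(1 - 2e^{-a} + 2e^{-2a}))
    return k * math.log(n / k) / -math.log(1 - 2 * math.exp(-alpha) + 2 * math.exp(-2 * alpha))

def M2(n, k, alpha):
    # M_2(n, k, alpha) = ln(nk) / (-ln(1 - 2e^{-a} + 2e^{-a(1+1/k)}))
    return math.log(n * k) / -math.log(1 - 2 * math.exp(-alpha) + 2 * math.exp(-alpha * (1 + 1 / k)))

def M(n, k, steps=1000):
    """min over alpha in [ln 2, 1] of max(M_1, M_2), by simple grid search."""
    alphas = [math.log(2) + i * (1 - math.log(2)) / steps for i in range(steps + 1)]
    return min(max(M1(n, k, a), M2(n, k, a)) for a in alphas)

if __name__ == "__main__":
    n, k = 10**6, 100
    print("alpha = ln 2 :", max(M1(n, k, math.log(2)), M2(n, k, math.log(2))))
    print("alpha = 1    :", max(M1(n, k, 1.0), M2(n, k, 1.0)))
    print("optimised    :", M(n, k))
```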
4 Proof of main result

We proceed to prove Theorem 7 as follows. Fix n and k. We will choose A to be an m × n matrix (where m will be determined later) with each entry independently 1 with probability p and 0 with probability q = 1 − p, for some p also to be chosen later. We aim to show that there is a choice of m and p so that, with positive probability, A is ε-almost k-separable, and hence that such a matrix exists.

The following bound will be important, and is fairly well known – see for example Sebő [19], who analyses its asymptotics for fixed k as n → ∞.

Lemma 8.
Let A be a randomly chosen matrix in {0, 1}^{m×n} with each entry independently 1 with probability p. For any set K of size k ≤ n/2, we have

P(∃ L ≠ K with |L| = k, S(L) = S(K)) ≤ ∑_{b=0}^{k−1} \binom{k}{b} \binom{n−k}{k−b} (1 − 2q^k + 2q^{2k−b})^m.    (3)

Proof.
Say that an overlap occurs if there exists a set L ≠ K with |L| = k and S(L) = S(K). Take two distinct sets K, L, both of size k, that have b = |K ∩ L| elements in common. Then a row j of A could distinguish between K and L in two ways: either we have j ∈ S(K) while j ∉ S(L), or the other way round: j ∈ S(L) while j ∉ S(K).

If the entries of the row a_j are IID Bernoulli(p), these two events each occur with probability q^k(1 − q^{k−b}) = q^k − q^{2k−b}. Hence, row j fails to distinguish between K and L with probability 1 − 2q^k(1 − q^{k−b}) = 1 − 2q^k + 2q^{2k−b}. Since the rows of A are IID, the whole matrix fails to distinguish between K and L with probability (1 − 2q^k + 2q^{2k−b})^m.

The result then follows by a union bound over L, noting that the number of sets of size k sharing b elements with K is precisely \binom{k}{b}\binom{n−k}{k−b}.

The main work in this paper is a careful asymptotic analysis of the overlap probability (3), showing for which m it can be made arbitrarily small.
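For intuition, the union bound (3) can be compared against simulation at toy sizes. The sketch below is ours rather than the paper's; the parameter values are arbitrary, and the Monte Carlo check enumerates all alternative sets L, so it is only practical for very small n and k.

```python
import math
import random
from itertools import combinations

def rhs_bound(n, k, m, p):
    """Right-hand side of (3): sum_b C(k,b) C(n-k,k-b) (1 - 2q^k + 2q^(2k-b))^m."""
    q = 1 - p
    return sum(math.comb(k, b) * math.comb(n - k, k - b)
               * (1 - 2 * q**k + 2 * q**(2 * k - b))**m
               for b in range(k))

def overlap_probability(n, k, m, p, trials=2000, seed=1):
    """Monte Carlo estimate of P(exists L != K with S(L) = S(K)) for K = {0, ..., k-1}."""
    rng = random.Random(seed)
    K = tuple(range(k))
    hits = 0
    for _ in range(trials):
        cols = [frozenset(j for j in range(m) if rng.random() < p) for _ in range(n)]
        s_K = frozenset().union(*(cols[i] for i in K))
        if any(frozenset().union(*(cols[i] for i in L)) == s_K
               for L in combinations(range(n), k) if L != K):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    n, k, m = 12, 2, 14
    p = 1 - 2 ** (-1 / k)            # the choice of p used in the proof of Lemma 9
    print("simulated P(overlap):", overlap_probability(n, k, m, p))
    print("bound (3):           ", rhs_bound(n, k, m, p))
```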
Lemma 9. For every sequence k = k(n) = o(n) and every ε, δ > 0, there exists n₀ so that if n > n₀ and m > (1 + δ)M(n, k), with M(n, k) as in Theorem 7, then P(overlap) < ε.

Proof. We first prove that it suffices to have m > (1 + δ)M(n, k, ln 2), where M(n, k, ln 2) := max{M_1(n, k, ln 2), M_2(n, k, ln 2)} = k log₂(nk). This is simpler to prove than the full result and illustrates the main techniques.

Here, we take p = 1 − 2^{−1/k}, as does Sebő [19], so that q = 2^{−1/k}. This is a special case of the general value of p used in the appendix, p = 1 − e^{−α/k}, by taking α = ln 2. Note that, in group testing parlance, this is the value of p that gives a 50:50 chance of a test being positive. The bound (3) then becomes

P(overlap) ≤ ∑_{b=0}^{k−1} \binom{k}{b} \binom{n−k}{k−b} ((1/2) · 2^{b/k})^m.

It will be convenient to write c = k − b for the number of nonoverlapping items, to get

P(overlap) ≤ ∑_{c=1}^{k} \binom{k}{k−c} \binom{n−k}{c} ((1/2) · 2^{(k−c)/k})^m = ∑_{c=1}^{k} \binom{k}{c} \binom{n−k}{c} 2^{−cm/k}.

When m > (1 + δ)k log₂(nk), the terms in the above sum are decreasing, since

\binom{k}{c+1}\binom{n−k}{c+1} 2^{−(c+1)m/k} / (\binom{k}{c}\binom{n−k}{c} 2^{−cm/k})
  = (k − c)(n − k − c) 2^{−m/k} / (c + 1)²
  ≤ (c² − nc + k(n − k)) / (nk(c² + 2c + 1))    (since 2^{−m/k} ≤ 1/(nk))
  ≤ 1/2,

for n > k and k ≥ 2. Thus, the probability of an overlap can be estimated by the largest term, with

P(overlap) ≤ k(n − k) 2^{−m/k} ∑_{c=1}^{k} (1/2)^{c−1} ≤ 2kn · 2^{−(1+δ) log₂(nk)} = 2kn(nk)^{−(1+δ)} ≤ 2(nk)^{−δ},

which, for fixed δ > 0, can be made arbitrarily small for n sufficiently large. Further, since log₂(nk) ≤ 2 log₂ n, the requirement m > (1 + δ)k log₂(nk) can be met with m = O(k log n).

We can get the more general result that it suffices to have m > (1 + δ)M(n, k), with M(n, k) as in (2), by instead taking p = 1 − e^{−α/k}, and then optimising over α. The analysis is very similar to that above, but somewhat more long-winded. The interested reader is directed to the appendix for the details.

Proving our main result is now straightforward.

Proof of Theorem 7.
Choose the matrix A at random as above, with m and n as in Lemma 9, so that the overlap probability is at most ε/2. Write X for the number of sets K of size k that experience an overlap. It is clear that A will be ε-almost k-separable provided that X ≤ ε\binom{n}{k}. Then we have

P(X > ε\binom{n}{k}) ≤ E X / (ε\binom{n}{k}),

by the Markov inequality. But this expectation is, by Lemma 9,

E X = ∑_{|K|=k} P(K has an overlap) = \binom{n}{k} P(overlap) ≤ \binom{n}{k} ε/2.

Hence A is ε-almost k-separable with probability at least 1/2, so such matrices must exist.
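To see the random construction in action, the following sketch (ours, with arbitrary toy parameters) draws A with i.i.d. Bernoulli(p) entries, p = 1 − 2^{−1/k}, and measures the fraction of k-sets that suffer an overlap, which is the smallest ε for which the realised matrix is ε-almost k-separable in the sense of Definition 4.

```python
import random
from itertools import combinations

def overlap_fraction(A, k):
    """Fraction of k-sets K for which another k-set L has S(L) = S(K) (Definition 4)."""
    m, n = len(A), len(A[0])
    supports = [frozenset(j for j in range(m) if A[j][i]) for i in range(n)]
    unions = {}
    for K in combinations(range(n), k):
        u = frozenset().union(*(supports[i] for i in K))
        unions.setdefault(u, []).append(K)
    clashing = sum(len(group) for group in unions.values() if len(group) > 1)
    total = sum(len(group) for group in unions.values())   # equals C(n, k)
    return clashing / total

if __name__ == "__main__":
    random.seed(0)
    n, k, m = 40, 2, 30                  # small toy instance
    p = 1 - 2 ** (-1 / k)                # q^k = 1/2, as in Section 4
    A = [[1 if random.random() < p else 0 for _ in range(n)] for _ in range(m)]
    print("fraction of k-sets with an overlap:", overlap_fraction(A, k))
```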
5 Group testing

In this section, we show how the use of almost separable matrices can give new results on the rate of nonadaptive group testing.

As we outlined in the introduction, in a nonadaptive group testing procedure we aim to find a subset K of k defective items within a population of n otherwise identical items, using m pooled tests. Recall that the outcome of a test j is positive if one or more of the defective items is in the test pool, and negative if none of them are. We summarise our testing procedure by a matrix A = (a_ij), where a_ij = 1 denotes that item i is in the pool for test j, and a_ij = 0 denotes that it is not. Recalling the notation of Definition 1, the set of positive tests for a defective set K is precisely S(K).

The aim is, given the outcomes S(K) and the matrix A, to identify the defective set K. Clearly if there is no other L with S(K) = S(L), then we can find K (at least theoretically: for study of practical algorithms for this, see, for example, [1, 4, 20, 13, 21]). Conversely, if there is an L with S(K) = S(L), then our error probability is at least 1/2. Hence, if any of the \binom{n}{k} possible sets of k items could be the defective set, then a testing matrix will allow us to find the defective set with certainty if and only if it is k-separable. The advantages of using what we call almost separability for group testing in the fixed-k regime have also been discussed in [22].

While separable matrices allow detection with zero probability of error, the study of group testing within the scope of information theory and the need for efficient algorithms generated an interest in nonadaptive group testing with low – but not necessarily zero – probability of error, a situation which has gained considerable attention [1, 4, 20, 14, 13, 2, 21]. Here the probability of error is defined as an average over all possible defective sets of size k; that is,

P(error) = (1/\binom{n}{k}) ∑_{|K|=k} P(error | K).

Baldassini, Johnson and Aldridge [3] introduced a concept of the rate of group testing to quantify how well a group testing design works. (An earlier definition of rate for the fixed-k regime had been introduced by Malyutov [15].) The rate is the ratio of the counting bound log₂ \binom{n}{k} to the number of tests. If we interpret the counting bound as a binary labelling of all possible defective sets of size k, the rate can be considered as the number of bits learned per test by the group testing procedure.

Definition 10. Consider a group testing problem with n items of which k are defective. A design with m tests is said to have rate R = log₂ \binom{n}{k} / m.

Given a sequence of group testing problems for n items of which k = k(n) are defective, a rate R is said to be achievable for a design A if, for any ε > 0 and n sufficiently large, the design has error probability at most ε with rate at least R.

We follow Baldassini et al. [3, 1] and study achievable rates in regimes where k = k(n) = n^{1−β} for different values of the sparsity parameter β ∈ (0, 1]. As noted above, a k-separable matrix requires m = Ω(k² log n / log k) tests, and so gives rate 0 for all values of β < 1. The best previously known nonadaptive bound comes from the DD algorithm of Aldridge, Baldassini and Johnson [1], which has a lower bound on the maximum achievable rate of

R_DD(β) = (1/(e ln 2)) min{β/(1 − β), 1} ≈ 0.53 min{β/(1 − β), 1},    (4)

together with the Malyutov–Sebő result that R = 1 can be achieved in the fixed-k regime.

Baldassini, Johnson and Aldridge [3] also showed that for adaptive group testing, the generalized binary splitting algorithm of Hwang [7] gives a rate of 1 (the best possible) for all β ∈ (0, 1].

Using an ε-almost k-separating matrix will find the defective set with error probability at most ε, since the sets K without overlaps can by definition be recovered with certainty. Hence, the number of rows of an almost separating matrix gives bounds on the rate. Therefore, using our above results, we have the following:

Theorem 11.
For β ∈ (0, 1] and k = n^{1−β}, the maximum achievable rate of nonadaptive group testing with n items of which k are defective is bounded below by

R ≥ (1/ln 2) max_{α ∈ [ln 2, 1]} min{ 2αe^{−α} β/(2 − β), −ln(1 − 2e^{−α} + 2e^{−2α}) }.    (5)

Figure 1 illustrates the result of Theorem 11. Note that our result improves over the best known result for β > 2/3, and meets the Malyutov–Sebő point as β → 1.

Proof.
Following directly from Theorem 7 and the definition of rate, we have

R ≥ (1/ln 2) max_{α ∈ [ln 2, 1]} min{ −ln(1 − 2e^{−α} + 2e^{−α(1+1/k)}) · k · β/(2 − β), −ln(1 − 2e^{−α} + 2e^{−2α}) },

noting that, when k = n^{1−β}, we have k ln(nk) = ((2 − β)/β) k ln(n/k).

When β = 1, the second term is the minimum. When β < 1, since we have that k → ∞, we can take limits in the first minimand. We have

−ln(1 − 2e^{−α} + 2e^{−α(1+1/k)}) · k = −ln(1 − 2e^{−α} + 2e^{−α} e^{−α/k}) · k
  = −ln(1 − 2e^{−α} + 2e^{−α}(1 − α/k + o(1/k))) · k
  = −ln(1 − 2e^{−α} α/k + o(1/k)) · k
  = (2e^{−α} α/k + o(1/k)) · k
  → 2αe^{−α}.

The result follows.

Note that our 'simpler' result with α = ln 2 gives a bound almost as good as the general case, namely

R(ln 2) = β/(2 − β).

In particular, this choice of α = ln 2 is optimal at β = 1.

Note also that for all but the sparsest cases, we get the bound by taking α = 1. Specifically, for β ≤ β*, where

β* = −2 ln(1 − 2e^{−1} + 2e^{−2}) / (2e^{−1} − ln(1 − 2e^{−1} + 2e^{−2})) ≈ 0.92,

the best value of the bound is

R(1) = (1/ln 2) min{ 2e^{−1} β/(2 − β), −ln(1 − 2e^{−1} + 2e^{−2}) } = (1/ln 2) · 2e^{−1} β/(2 − β) ≈ 1.06 β/(2 − β).

For β ∈ (β*, 1], the best choice of α is that which achieves the maximum in (5). It's easy to see for β ≥ β* that the maximum over α is achieved when the two terms in the minimum are equal, and this is simple to solve numerically. However, here we also provide some closed form approximations to this which could be useful.

[Figure: rate plotted against the sparsity parameter β, with curves for the bound of Theorem 11 with the optimal α, with α = 1 and with α = ln 2, the approximation of Corollary 12, the counting bound, the Malyutov–Sebő point and the threshold β*.]

Figure 2: Bounds on rates of group testing for large β, showing Theorem 11 for different values of α and the approximation of Corollary 12.

Corollary 12.
For 0 < β < 1 and k = n^{1−β}, the maximum achievable rate of nonadaptive group testing with n items, of which k are defective, is bounded from below by

R ≥ 1 − log₂(1 + (2(1 − β) ln 2 / (β(1 − ln 2)))²).

This is illustrated in Figure 2. From this, we see that the bound of Corollary 12 is very good for β ≈ 1, but that when β is not much above β*, then the bound of simply taking α = 1 is better. Hence, setting

β** = 2 ln 2 / (1 − 2e^{−1} + ln 2 + 2e^{−1} ln 2) ≈ 0.94,

and β* as above, we get the following bound:

Corollary 13. For β ∈ (0, 1] and k = n^{1−β}, the maximum achievable rate of nonadaptive group testing with n items, of which k are defective, is bounded from below by

R ≥ (2/(e ln 2)) β/(2 − β)                                 if β ≤ β*,
R ≥ (1/ln 2)(−ln(1 − 2e^{−1} + 2e^{−2}))                   if β* < β ≤ β**,
R ≥ 1 − log₂(1 + (2(1 − β) ln 2 / (β(1 − ln 2)))²)         if β > β**.

The proofs of these statements can be found in Appendix B.
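The bound of Theorem 11 and the closed forms above are easy to evaluate numerically. The sketch below is ours (not from the paper); it uses a simple grid search over α and the expressions for β* and β** given above, with arbitrary sample values of β.

```python
import math

LN2 = math.log(2)

def rate_theorem_11(beta, steps=2000):
    """Lower bound (5): (1/ln 2) max over alpha in [ln 2, 1] of the minimum of the two terms."""
    best = 0.0
    for i in range(steps + 1):
        a = LN2 + i * (1 - LN2) / steps
        term1 = 2 * a * math.exp(-a) * beta / (2 - beta)
        term2 = -math.log(1 - 2 * math.exp(-a) + 2 * math.exp(-2 * a))
        best = max(best, min(term1, term2))
    return best / LN2

def rate_corollary_12(beta):
    y = 2 * (1 - beta) * LN2 / (beta * (1 - LN2))
    return 1 - math.log2(1 + y**2)

BETA_STAR = (-2 * math.log(1 - 2 / math.e + 2 / math.e**2)
             / (2 / math.e - math.log(1 - 2 / math.e + 2 / math.e**2)))   # ~ 0.92
BETA_2STAR = 2 * LN2 / (1 - 2 / math.e + LN2 + 2 * LN2 / math.e)          # ~ 0.94

if __name__ == "__main__":
    print("beta* ~ %.3f, beta** ~ %.3f" % (BETA_STAR, BETA_2STAR))
    for beta in (0.7, 0.9, 0.95, 0.99):
        print(beta, round(rate_theorem_11(beta), 4), round(rate_corollary_12(beta), 4))
```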
6 Conclusions

We have explored the asymptotics of almost separability and we have shown that almost separable matrices exist with O(k log n) rows. Furthermore, we have proved that the use of almost separable matrices can improve the lower bounds on the rate of nonadaptive group testing in the very sparse regime.

Several interesting questions, however, still remain open, and provide scope for future research. Most notably, while we have given new achievable rates, the maximum rate of nonadaptive group testing is still unknown. In particular, we know of no upper bounds beyond the trivial counting bound.

As discussed in Section 2, Chen and Hwang [5] have proved that disjunct and separable matrices share the same asymptotics by showing how to construct a k-disjunct matrix out of a 2k-separable matrix by adding at most one row to it. Unlike its converse (disjunctness implying separability), this statement doesn't naturally carry through to the case of almost separability/disjunctness.

Another problem is to extend the existing results to regimes other than the k = n^{1−β} for β ∈ (0, 1] considered here. Of particular interest is the case where k = cn grows like a constant proportion of n, as in recent work by Wadayama [21]. Note that the counting bound now gives a lower bound of order n, while, for coupon-collector reasons, the IID random approach here inevitably leads to the suboptimal m = Ω(n log n).
A Asymptotic analysis of the overlap probability

We now show the full result of Lemma 9.

We use the same random construction as the special case described in Section 4, but now take p = 1 − e^{−α/k}, so q = e^{−α/k}, where α is a parameter to be chosen later (simply taking α = ln 2 as in Section 4 gives p = 1 − 2^{−1/k}). Within the group testing literature, different values of p have also been considered. For example, the value p = 1/k (which gives an average of one defective per test) has been considered before by many authors [1, 4, 2, 22], while Sejdinovic and Johnson [20] consider the more general α/k for noisy group testing. The same value can be obtained asymptotically in this context, as p ∼ α/k if k → ∞ as n → ∞.

Proof of Lemma 9. We wish to find values of m such that P(overlap) can be made arbitrarily small. It will be convenient to write

s = 1 − 2q^k = 1 − 2e^{−α},    t = 2q^{2k} = 2e^{−2α},    u = 1/q = e^{α/k},

allowing us to rewrite the bound (3) as

P(overlap) ≤ ∑_{b=0}^{k−1} \binom{k}{b} \binom{n−k}{k−b} (s + tu^b)^m.

As before, it will be more convenient to deal with c = k − b, which gives

P(overlap) ≤ ∑_{c=1}^{k} \binom{k}{c} \binom{n−k}{c} (s + tu^{k−c})^m.    (6)

Now, we expand out (s + tu^{k−c})^m in (6) using the binomial theorem and reverse the order of summation to get

P(overlap) ≤ ∑_{c=1}^{k} \binom{k}{c} \binom{n−k}{c} ∑_{j=0}^{m} \binom{m}{j} s^{m−j} t^j u^{(k−c)j}
  = ∑_{j=0}^{m} \binom{m}{j} s^{m−j} t^j ∑_{c=1}^{k} \binom{k}{c} \binom{n−k}{c} u^{(k−c)j}
  = ∑_{j=0}^{m} \binom{m}{j} s^{m−j} t^j u^{jk} ∑_{c=1}^{k} \binom{k}{c} \binom{n−k}{c} q^{cj}.    (7)

Consider the inner sum of (7). It is possible to approximate it by its largest term, which will depend on the value of j. To start with, the following bound holds:

\binom{k}{c} \binom{n−k}{c} q^{cj} ≤ (e²knq^j / c²)^c.    (8)

Note that for any a, the function (a/x²)^x attains its maximum at x = √a/e, and further is increasing for x < √a/e and decreasing for x > √a/e. In (8), the maximum corresponds to c = √(knq^j). Now, 1 < √(knq^j) < k when

(1/(−ln q)) ln(n/k) < j < (1/(−ln q)) ln(nk),

or, since q = e^{−α/k},

(k/α) ln(n/k) < j < (k/α) ln(nk).

Then, in light of the above, we will split into three cases: first, j ≤ (k/α) ln(n/k); second, (k/α) ln(n/k) < j < (k/α) ln(nk); and third, j ≥ (k/α) ln(nk).

For the first case, j ≤ (k/α) ln(n/k), the maximum of (8) is attained at c = k, giving the bound

(e²knq^j / k²)^k = e^{2k} (n/k)^k q^{jk}.

Summing over c and j yields

∑_{j=0}^{(k/α) ln(n/k)} \binom{m}{j} s^{m−j} t^j u^{jk} · k e^{2k} (n/k)^k q^{jk}
  = k e^{2k} (n/k)^k ∑_{j=0}^{(k/α) ln(n/k)} \binom{m}{j} s^{m−j} t^j
  ≤ k e^{2k} (n/k)^k ∑_{j=0}^{m} \binom{m}{j} s^{m−j} t^j
  = k e^{2k} (n/k)^k (s + t)^m
  = k exp(2k + k ln(n/k) + m ln(s + t)).

Provided that

m > (1 + δ) · k ln(n/k) / (−ln(s + t)) = (1 + δ) · k ln(n/k) / (−ln(1 − 2e^{−α} + 2e^{−2α})) = (1 + δ) M_1(n, k, α),    (9)

for some δ > 0, this can be made arbitrarily small for n sufficiently large.

For the second case, (k/α) ln(n/k) < j < (k/α) ln(nk), the maximum of (8) is attained at c = √(knq^j), giving the bound

(e²knq^j / (knq^j))^{√(knq^j)} = exp(2√(knq^j)) ≤ exp(2√(kn q^{(k/α) ln(n/k)})) = exp(2√(kn · k/n)) = exp(2k).

Then we have that

∑_{j=(k/α) ln(n/k)}^{(k/α) ln(nk)} \binom{m}{j} s^{m−j} t^j u^{jk} ∑_{c=1}^{k} \binom{k}{c} \binom{n−k}{c} q^{jc}
  ≤ ∑_{j=(k/α) ln(n/k)}^{(k/α) ln(nk)} \binom{m}{j} s^{m−j} (tu^k)^j · k e^{2k}
  = k e^{2k} P((k/α) ln(n/k) < X ≤ (k/α) ln(nk)),

where we write X ∼ Bin(m, tu^k), and we have used that s = 1 − tu^k. Then as long as

E X = m t u^k > (1 + δ) (k/α) ln(nk),    (10)

we have by the Azuma–Hoeffding inequality that

k e^{2k} P((k/α) ln(n/k) < X ≤ (k/α) ln(nk)) ≤ k e^{2k} P(X ≤ (k/α) ln(nk))
  ≤ k e^{2k} exp(−(2/m)(m t u^k − (k/α) ln(nk))²)
  = k exp(2k − 2m(t u^k − k ln(nk)/(αm))²).

Given (10), this can be made arbitrarily small for n sufficiently large. We can rewrite (10) as

m > (1 + δ) (1/(α t u^k)) k ln(nk) = (1 + δ) (e^α/(2α)) k ln(nk).    (11)

Now for the final case, when j ≥ (k/α) ln(nk). Note that for j ≥ (k/α) ln(nk),

q^j ≤ q^{(k/α) ln(nk)} = e^{−ln(nk)} = 1/(nk),

hence nkq^j ≤ 1. Then, splitting up c = 1, c = 2 and c ≥ 3, and noting that e²/9 < 1, we have by (8)

∑_{c=1}^{k} \binom{k}{c} \binom{n−k}{c} q^{cj} ≤ ∑_{c=1}^{k} (e²knq^j / c²)^c
  ≤ e²knq^j (1 + e²knq^j/16 + ∑_{c=3}^{k} (1/c²)(e²knq^j/c²)^{c−1})
  ≤ e²knq^j (1 + e²/16 + (1/9) ∑_{c=3}^{∞} (e²/9)^{c−1})
  ≤ 5e²knq^j.

Thus,

∑_{j=(k/α) ln(nk)}^{m} \binom{m}{j} s^{m−j} t^j u^{jk} ∑_{c=1}^{k} \binom{k}{c} \binom{n−k}{c} q^{cj}
  ≤ ∑_{j=(k/α) ln(nk)}^{m} \binom{m}{j} s^{m−j} t^j u^{jk} · 5e²knq^j
  ≤ 5e²kn ∑_{j=0}^{m} \binom{m}{j} s^{m−j} (tu^{k−1})^j
  = 5e²kn (s + tu^{k−1})^m
  = 5 exp(2 + ln(nk) + m ln(s + tu^{k−1})).

To make this small requires

m > (1 + δ) ln(nk) / (−ln(s + tu^{k−1})) = (1 + δ) M_2(n, k, α).    (12)

In order to compare the condition in (12) with condition (11), note that for any x, y ∈ (0, 1],

−ln(1 − x(1 − e^{−y})) ≤ xy,

since, for fixed y, the function f_y(x) = xy + ln(1 − x(1 − e^{−y})) is concave for x ∈ [0, 1] with f_y(0) = 0 = f_y(1). Thus, since s + tu^{k−1} = 1 − 2e^{−α}(1 − e^{−α/k}), we have

−ln(s + tu^{k−1}) = −ln(1 − 2e^{−α}(1 − e^{−α/k})) ≤ 2αe^{−α}/k.

Thus, condition (12) is always stronger than (11), and one can see that when k tends to infinity the two conditions are asymptotically equal.

Hence, from (9), (11) and (12), our requirements are

m > (1 + δ) M_1(n, k, α)    and    m > (1 + δ) M_2(n, k, α).

From the above, we can optimise this result over α. Noting that M_1 is minimised at α = ln 2 and M_2 is (asymptotically) minimised at α = 1, it is sufficient to consider only α ∈ [ln 2, 1].
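As a quick sanity check on the comparison between (11) and (12), the following sketch (ours, with an arbitrary parameter grid) verifies numerically that M_2(n, k, α) is at least (e^α/(2α)) k ln(nk), which is the content of the concavity argument above.

```python
import math

def M2(n, k, alpha):
    # Same quantity as in (2), written as ln(nk) / (-ln(1 - 2e^{-a}(1 - e^{-a/k}))).
    return math.log(n * k) / -math.log(1 - 2 * math.exp(-alpha) * (1 - math.exp(-alpha / k)))

def rhs_11(n, k, alpha):
    # The requirement (11): (e^a / 2a) k ln(nk).
    return math.exp(alpha) / (2 * alpha) * k * math.log(n * k)

if __name__ == "__main__":
    ok = True
    for k in (2, 5, 10, 100, 1000):
        for alpha in (math.log(2), 0.8, 1.0):
            n = 100 * k
            ok &= M2(n, k, alpha) >= rhs_11(n, k, alpha) * (1 - 1e-12)
    print("condition (12) at least as strong as (11) on this grid:", ok)
```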
B Explicit bounds on rate

Here we give the proofs of Corollaries 12 and 13.
Proof of Corollary 12.
The bound on R follows from Theorem 11 by a careful choice of α in terms of β.

In order to simplify some of the expressions that follow, define y = y(α) = 1 − 2e^{−α} and t = 1 − β/(2 − β). Then, for α ∈ [ln 2, 1] we have y ∈ [0, 1 − 2/e], and as β tends to 1, t tends to 0. Further, the expressions in Theorem 11 can be simplified as

−ln(1 − 2e^{−α} + 2e^{−2α}) = −ln((1 + y²)/2) = ln 2 − ln(1 + y²)

and

2αe^{−α} β/(2 − β) = (1 − y)(−ln((1 − y)/2))(1 − t) = (1 − y)(ln 2 − ln(1 − y))(1 − t).

Thus, the result of Theorem 11 can be restated as

R ≥ (1/ln 2) max_{y ∈ [0, 1−2/e]} min{ ln 2 − ln(1 + y²), (1 − t)(1 − y)(ln 2 − ln(1 − y)) }.    (13)

The desired result then follows from equation (13) by choosing

y = (ln 2/(1 − ln 2)) · t/(1 − t).    (14)

Note that, by the definition of t, t/(1 − t) = 2(1 − β)/β.

What remains is to show that for y given by equation (14),

ln 2 − ln(1 + y²) ≤ (1 − y)(1 − t)(ln 2 − ln(1 − y)).    (15)

For y given by equation (14), the right-hand side of equation (15) is

(1 − y)(1 − t)(ln 2 − ln(1 − y))
  = (1 − y)(1 − t)(ln 2 + y) − (1 − y)(1 − t)(y + ln(1 − y))
  = (1 − t) ln 2 + y(1 − t)(1 − ln 2) − (1 − t)y² − (1 − y)(1 − t)(y + ln(1 − y))
  = (1 − t) ln 2 + t ln 2 − (1 − t)y² − (1 − y)(1 − t)(y + ln(1 − y))    (by eq. (14))
  = ln 2 − (1 − t)(y² + (1 − y)y + (1 − y) ln(1 − y))
  = ln 2 − (1 − t)(y + (1 − y) ln(1 − y))
  = ln 2 − (ln 2/(ln 2 + y(1 − ln 2)))(y + (1 − y) ln(1 − y))    (by eq. (14)).

Thus, in order to show that the inequality in (15) holds, it suffices to show that for all y ∈ [0, 1],

y + (1 − y) ln(1 − y) ≤ (1 + y(1 − ln 2)/ln 2) ln(1 + y²).    (16)

The inequality in (16) is shown by considering separately the cases y ≤ 1/2 and y > 1/2.

Suppose first that y ≤ 1/2. Using the facts that ln(1 − y) < −y and that ln(1 + y²) ≥ y² − y⁴/2 = y²(1 − y²/2), we have y + (1 − y) ln(1 − y) < y − y(1 − y) = y², and for all y ∈ [0, 1/2],

1 ≤ (1 + y(1 − ln 2)/ln 2)(1 − y²/2).

Thus, for y ≤ 1/2,

y + (1 − y) ln(1 − y) ≤ y² ≤ y²(1 + y(1 − ln 2)/ln 2)(1 − y²/2) ≤ (1 + y(1 − ln 2)/ln 2) ln(1 + y²).

Consider now the inequality from (16) in the case y ≥ 1/2. Note that for all y ∈ [0, 1],

ln(1 + y²) ≥ ln 2 − (1 − y).

This inequality holds since ln(1 + y²) is convex on [0, 1] and the right-hand side is its tangent line at y = 1. Thus, in order to prove the inequality in (16), it suffices to show that for y ∈ [1/2, 1],

y + (1 − y) ln(1 − y) ≤ (1 + y(1 − ln 2)/ln 2)(ln 2 − (1 − y)).    (17)

Again, the inequality in equation (17) can be seen to be true since it holds for y = 1/2 and y = 1, and the function

(1 + y(1 − ln 2)/ln 2)(ln 2 − (1 − y)) − y − (1 − y) ln(1 − y)
  = (ln 2 − 1) + y(1 − ln 2)(1 − 1/ln 2) + y²(1 − ln 2)/ln 2 − (1 − y) ln(1 − y)

is concave for y ∈ [0, 1].

Proof of Corollary 13.
For β ≤ β**, the result follows from Theorem 11 by substituting α = 1 and noting that the inequality

2e^{−1} β/(2 − β) ≤ −ln(1 − 2e^{−1} + 2e^{−2})

holds exactly when β ≤ β*. For β > β**, the result follows from Corollary 12, noting that β** > β*.

In Corollaries 12 and 13, a better bound for the case β > β* can be obtained by substituting into Theorem 11 the value of α chosen so that

1 − 2e^{−α} = (−β(1 − ln 2) + √(β²(1 − ln 2)² + 4(1 − β)(4 − β) ln 2)) / (4 − β),

but the expression obtained does not seem simpler than the statement of Theorem 11 itself.

References
[1] M. Aldridge, L. Baldassini, and O. Johnson. Group testing algorithms: bounds and simulations. IEEE Transactions on Information Theory, 60(6):3671–3687, 2014.
[2] G. K. Atia and V. Saligrama. Boolean compressed sensing and noisy group testing. IEEE Transactions on Information Theory, 58(3):1880–1901, 2012.
[3] L. Baldassini, O. Johnson, and M. Aldridge. The capacity of adaptive group testing. Proceedings of the 2013 IEEE International Symposium on Information Theory, 2676–2680, 2013.
[4] C. L. Chan, S. Jaggi, V. Saligrama, and S. Agnihotri. Non-adaptive group testing: explicit bounds and novel algorithms. IEEE Transactions on Information Theory, 60(5):3019–3035, 2014.
[5] H.-B. Chen and F. K. Hwang. Exploring the missing link among d-separable, d̄-separable and d-disjunct matrices. Discrete Applied Mathematics, 155(5):662–664, 2007.
[6] R. Dorfman. The detection of defective members of large populations. The Annals of Mathematical Statistics, 14(4):436–440, 1943.
[7] D.-Z. Du and F. K. Hwang. Combinatorial Group Testing and Applications, second edition. Series on Applied Mathematics, vol. 12, World Scientific, 2000.
[8] A. G. D'yachkov and V. V. Rykov. Bounds on the length of disjunctive codes. Problems of Information Transmission, 18(3):166–171, 1982.
[9] P. Erdős and L. Moser. Problem 35. Proceedings on the Conference of Combinatorial Structures and their Applications, Gordon and Breach, 1970.
[10] Z. Füredi. On r-cover-free families. Journal of Combinatorial Theory, Series A, 73(1):172–173, 1996.
[11] W. H. Kautz and R. C. Singleton. Nonrandom binary superimposed codes. IEEE Transactions on Information Theory, 10(4):363–377, 1964.
[12] A. Macula, V. Rykov, and S. Yekhanin. Trivial two-stage group testing for complexes using almost disjunct matrices. Discrete Applied Mathematics, 137(1):97–107, 2004.
[13] D. Malioutov and M. Malyutov. Boolean compressed sensing: LP relaxation for group testing. Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 3305–3308, 2012.
[14] M. B. Malyutov. The separating property of random matrices. Mathematical Notes of the Academy of Sciences of the USSR, 23(1):84–91, 1978.
[15] M. Malyutov. Search for sparse active inputs: a review. In H. Aydinian, F. Cicalese, and C. Deppe (Eds), Information Theory, Combinatorics and Search Theory, Lecture Notes in Computer Science, vol. 7777, Springer, 609–647, 2013.
[16] A. Mazumdar. On almost disjunct matrices for group testing. Algorithms and Computation, Lecture Notes in Computer Science, vol. 7676, 649–658, 2012.
[17] E. Porat and A. Rothschild. Explicit non-adaptive combinatorial group testing schemes. In L. Aceto, I. Damgård, L. A. Goldberg, M. M. Halldórsson, A. Ingólfsdóttir, and I. Walukiewicz (Eds), ICALP 2008, Lecture Notes in Computer Science, vol. 5125, 748–759, 2008.
[18] M. Ruszinkó. On the upper bound of the size of r-cover-free families. Journal of Combinatorial Theory, Series A, 66(2):302–310, 1994.
[19] A. Sebő. On two random search problems. Journal of Statistical Planning and Inference, 11(1):23–31, 1985.
[20] D. Sejdinovic and O. T. Johnson. Note on noisy group testing: asymptotic bounds and belief propagation reconstruction. Proceedings of the 48th Annual Allerton Conference on Communication, Control and Computing, 998–1003, 2010.
[21] T. Wadayama. An analysis on non-adaptive group testing based on sparse pooling graphs. Proceedings of the 2013 IEEE International Symposium on Information Theory, 2681–2685, 2013.
[22] A. Zhigljavsky. Probabilistic existence theorems in group testing. Journal of Statistical Planning and Inference, 115(1):1–43, 2003.