EXACT LOWER BOUND ON AN “EXACTLY ONE” PROBABILITY
IOSIF PINELIS
Abstract.
The exact lower bound on the probability of the occurrence of exactly one of $n$ random events each of probability $p$ is obtained.

Key words and phrases. exact lower bound; probability inequalities; independence; pairwise independence.

arXiv [math.PR]

1. Introduction, summary, and discussion

Suppose $A_1,\dots,A_n$ are random events each of probability $p$. Let $E$ denote the event that exactly one of the events $A_1,\dots,A_n$ occurs.

If the $A_i$’s are independent then, by the binomial probability mass function formula (see e.g. [2, Section 1.3]), $P(E) = npq^{n-1}$, where $q := 1-p$. So, in the “independent” case, $P(E)$ attains its maximum, equal to $(1-1/n)^{n-1}$ (which tends to $1/e$ as $n\to\infty$), at $p = 1/n$.

What will happen with $P(E)$ when the $A_i$’s are only assumed to be pairwise independent? Of course, already for $n = 3$, the pairwise independence of the $A_i$’s does not imply their “complete” independence. Feller [3, page 126] wrote: “Actually such occurrences [of pairwise independence but not “complete” independence] are so rare that their possibility passed unnoticed until S. Bernstein constructed an artificial example. It still takes some search to find a plausible natural example.” This is followed ([3, page 127]) by an example of three pairwise independent events that are not “completely” independent. Another such example, [2, Example 2.3.3], ascribed in [2] to Bernstein, actually appears more common and natural than the example on page 127 of [3].

One may want to dispute the assertion that occurrences of pairwise independence without “complete” independence are rare. Indeed, the definition of the independence of three events $A, B, C$ consists of the following four equations: $P(A\cap B) = P(A)P(B)$, $P(B\cap C) = P(B)P(C)$, $P(A\cap C) = P(A)P(C)$, and $P(A\cap B\cap C) = P(A)P(B)P(C)$. The first three of these four equations define the pairwise independence. The probabilities of the events $A, B, C$ and of their pairwise and triple intersections can all be expressed as the sums of the probabilities of certain pieces of the partition of the sample space (say $\Omega$) generated by the events $A, B, C$. There are $2^3 = 8$ pieces of this partition, with 8 corresponding probabilities, which may be considered as nonnegative real variables tied just by one equation, stating that the sum of these 8 probabilities is 1. Thus, we have $4 + 1 = 5$ equations with 8 unknowns, which leaves us $8 - 5 = 3$ […] $A, B, C$ may be dropped without altering the notion of independence. In particular, this way it is easy to see that the pairwise independence does not imply the “complete” independence. Moreover, now it seems plausible that – in the case of three events $A, B, C$ – the dimension of the semi-algebraic set [1] (in the mentioned 8 variables) corresponding to the “complete” independence is less by 1 than the dimension of the semi-algebraic set corresponding to the pairwise independence.

More generally, for any natural number $n$ of events $A_1,\dots,A_n$, the difference between the corresponding dimensions appears to be $2^n - 1 - n - n(n-1)/2 \sim 2^n$ (as $n\to\infty$). Here, $2^n$ appears as the number of all equations of the form $P\big(\bigcap_{j\in J} A_j\big) = \prod_{j\in J} P(A_j)$ for $J\subseteq[n] := \{1,\dots,n\}$. Of these $2^n$ equations, $1 + n$ equations – for the sets $J\subseteq[n]$ of cardinalities 0 and 1 – are trivial; and $n(n-1)/2$ of the $2^n$ equations define the pairwise independence of the $n$ events. Thus, there are $2^n - 1 - n - n(n-1)/2$ equations needed for the “complete” independence of the $n$ events, in addition to the $n(n-1)/2$ equations defining their pairwise independence.

As the following theorem shows, $P(E)$ may be quite sensitive to the distinction between the pairwise independence and the “complete” independence:

Theorem 1.
For each natural $n$ and each $p\in[0,1]$,
$$\min P(E) = P_{n,p} := np\,\big(1-(n-1)p\big)_+, \tag{1}$$
where the minimum is taken over all pairwise independent events $A_1,\dots,A_n$ each of probability $p$, and $x_+ := \max(0,x)$ for real $x$.

We see that, in contrast with the “completely independent” case, for just pairwise independent events $A_1,\dots,A_n$ the probability $P(E)$ can be 0 for any $n\ge2$, namely whenever $p\ge1/(n-1)$. If we take the value $1/n$ of $p$ – at which, as was noted, $P(E)$ attains its maximum value $(1-1/n)^{n-1}\approx1/e$ in the “completely independent” case – then for just pairwise independent events $A_1,\dots,A_n$ we have $\min P(E) = 1/n\to0$. However, if e.g. $p = c/n$ with a fixed $c\in(0,1)$, then $\min P(E)$ stays away from 0. So, $P(E)$ will necessarily be of the same order of magnitude (for large $n$) in both cases only if $p$ is small – more specifically, if $p$ stays below $c/n$ for some fixed $c\in(0,1)$. This is illustrated in Figure 1, which shows the values of $P(E)$ (the vertical axis) in the “completely independent” case (circles) and in the “pairwise independent” case (triangles) for a range of values of $n$ (the horizontal axis), $p = c/n$, and several values of $c$.

2. Proof of Theorem 1
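Before the proof, the comparison made in Section 1 between $npq^{n-1}$ and $P_{n,p}$ can be tabulated directly. The following is a minimal sketch in exact rational arithmetic; the function names and the particular values of $c$ and $n$ are illustrative assumptions and are not the ones used for Figure 1.

```python
from fractions import Fraction


def exactly_one_independent(n, p):
    """P(E) = n p q^(n-1) with q = 1 - p, for independent events."""
    return n * p * (1 - p) ** (n - 1)


def exactly_one_pairwise_min(n, p):
    """P_{n,p} = n p (1 - (n-1) p)_+, the minimum stated in (1)."""
    return n * p * max(0, 1 - (n - 1) * p)


if __name__ == "__main__":
    # Illustrative choices of c and n (assumptions, not the Figure 1 values).
    for c in (Fraction(1, 2), Fraction(1), Fraction(3, 2)):
        for n in (5, 50, 500):
            p = c / n
            ind = float(exactly_one_independent(n, p))
            pw = float(exactly_one_pairwise_min(n, p))
            print(f"c={c}, n={n}: independent={ind:.4f}, pairwise min={pw:.4f}")
```

For $c < 1$ both columns stay bounded away from 0 as $n$ grows, whereas for $c = 1$ the pairwise minimum equals $1/n$ and for $c > 1$ it is 0 once $(n-1)c/n \ge 1$.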
Figure 1. Graphs of the values of $P(E)$ for $p = c/n$.

For $n = 1$, Theorem 1 is trivial. So, in what follows assume $n\ge2$. For each $j\in[n]$, let $X_j := 1_{A_j}$, the indicator of the event $A_j$. Let $N := X_1+\dots+X_n$, the number of the events $A_1,\dots,A_n$ that occurred. Then
$$E = \{N = 1\}. \tag{2}$$
Note that $\mathsf{E}X_j = p$ and (by the pairwise independence) $\mathsf{E}X_jX_k = p^2 + pq\,1(j=k)$ for all $j$ and $k$ in $[n]$, where $\mathsf{E}$ denotes the expectation. Now we have a perhaps unexpected use of the Chebyshev–Markov inequality (see e.g. [2, Theorem 4.7.4]):
$$\begin{aligned}
P(N\ne1) = P\big((N-1)^2\ge1\big) &\le \mathsf{E}(N-1)^2 = \mathsf{E}N^2 - 2\,\mathsf{E}N + 1\\
&= \sum_{j,k\in[n]}\mathsf{E}X_jX_k - 2\sum_{j\in[n]}\mathsf{E}X_j + 1\\
&= n^2p^2 + npq - 2np + 1\\
&= 1 - np\,\big(1-(n-1)p\big).
\end{aligned}$$
Therefore and because $P(N=1)\ge0$, we see that
$$P(N=1) \ge np\,\big(1-(n-1)p\big)_+ = P_{n,p}.$$
So, in view of (2), $P_{n,p}$ is a lower bound on $P(E)$; cf. (1).
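The moment computation above can be checked on a small concrete case. The sketch below uses a Bernstein-type example chosen here for illustration (it is an assumption of this sketch, not taken from the text): two fair coin flips and the three events “first flip is heads”, “second flip is heads”, and “the two flips differ”, each of probability $1/2$. It verifies pairwise independence, the identity $\mathsf{E}(N-1)^2 = 1 - np(1-(n-1)p)$, and that in this example $P(N=1)$ equals the lower bound $P_{3,1/2} = 0$.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space: two fair coin flips; each outcome has probability 1/4.
omega = list(product("HT", repeat=2))
prob = {w: Fraction(1, 4) for w in omega}

# Three events of probability 1/2 each (an illustrative choice).
events = [
    lambda w: w[0] == "H",   # A_1: first flip is heads
    lambda w: w[1] == "H",   # A_2: second flip is heads
    lambda w: w[0] != w[1],  # A_3: the two flips differ
]
n, p = len(events), Fraction(1, 2)


def P(pred):
    """Probability of the set of outcomes satisfying pred."""
    return sum((prob[w] for w in omega if pred(w)), Fraction(0))


def N(w):
    """Number of the events A_1, ..., A_n that occur at outcome w."""
    return sum(1 for a in events if a(w))


# Pairwise independence: P(A_i and A_j) = p^2 for all i < j.
pairwise_ok = all(
    P(lambda w, i=i, j=j: events[i](w) and events[j](w)) == p * p
    for i, j in combinations(range(n), 2)
)
# "Complete" independence would also need P(A_1 and A_2 and A_3) = p^3.
completely_independent = P(lambda w: all(a(w) for a in events)) == p ** 3

second_moment = sum(prob[w] * (N(w) - 1) ** 2 for w in omega)  # E(N-1)^2
formula = 1 - n * p * (1 - (n - 1) * p)                        # value derived above
p_exactly_one = P(lambda w: N(w) == 1)                         # P(E)
lower_bound = n * p * max(0, 1 - (n - 1) * p)                  # P_{n,p}

print(pairwise_ok, completely_independent, second_moment, formula,
      p_exactly_one, lower_bound)
```

Here $N$ takes only the values 0 and 2, so exactly one of the three events never occurs, even though the events are pairwise independent.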
It remains to show that this lower bound is attained, for each natural $n\ge2$ and each $p\in[0,1]$. Let
$$C_J := \Big(\bigcap_{j\in J}A_j\Big)\cap\Big(\bigcap_{j\in[n]\setminus J}(\Omega\setminus A_j)\Big)$$
for $J\subseteq[n]$. These events constitute a partition of the sample space $\Omega$. Moreover, for each $m\in\{0\}\cup[n]$,
$$\{N=m\} = \bigcup_{J\subseteq[n],\,|J|=m}C_J, \tag{3}$$
where $|J|$ denotes the cardinality of the set $J$. Also,
$$A_1 = \bigcup_{J\subseteq[n],\,J\supseteq\{1\}}C_J \quad\text{and}\quad A_1\cap A_2 = \bigcup_{J\subseteq[n],\,J\supseteq\{1,2\}}C_J. \tag{4}$$
For each $m\in\{0\}\cup[n]$, let us assign the same probability, say $x_m$, to each event $C_J$ with $J\subseteq[n]$ such that $|J| = m$; then, by (3),
$$P(N=m) = \binom{n}{m}x_m. \tag{5}$$
So, there will exist a probability space supporting such an assignment of probabilities to the $C_J$’s if and only if $x_m\ge0$ for all $m\in\{0\}\cup[n]$ and
$$\sum_{m=0}^n\binom{n}{m}x_m = 1; \tag{6}$$
this follows because the set of values of the random variable $N$ is the set $\{0\}\cup[n]$. Then, in view of (4), we also have
$$P(A_1) = \sum_{m=1}^n\ \sum_{J\subseteq[n],\,J\supseteq\{1\},\,|J|=m}P(C_J) = \sum_{m=1}^n\binom{n-1}{m-1}x_m$$
(which is actually the value of $P(A_j)$ for all $j\in[n]$) and
$$P(A_1\cap A_2) = \sum_{m=1}^n\ \sum_{J\subseteq[n],\,J\supseteq\{1,2\},\,|J|=m}P(C_J) = \sum_{m=1}^n\binom{n-2}{m-2}x_m$$
(which is actually the value of $P(A_i\cap A_j)$ for all distinct $i$ and $j$ in the set $[n]$; here the term with $m=1$ is zero). Now the conditions that $P(A_j) = p$ for all $j\in[n]$ and the $A_j$’s are pairwise independent can be rewritten as
$$\sum_{m=1}^n\binom{n-1}{m-1}x_m = p \quad\text{and}\quad \sum_{m=1}^n\binom{n-2}{m-2}x_m = p^2. \tag{7}$$
Now take any $p\in[0,1]$. Then there is some $k\in[n-1]$ such that
$$\frac{k-1}{n-1}\le p\le\frac{k}{n-1}. \tag{8}$$
For such a number $k\in[n-1]$, let
$$x_m := \begin{cases}
\dfrac{np}{k}\,\big(k-(n-1)p\big)\Big/\dbinom{n}{k} & \text{if } m = k,\\[2ex]
\dfrac{np}{k+1}\,\big((n-1)p-(k-1)\big)\Big/\dbinom{n}{k+1} & \text{if } m = k+1,\\[2ex]
0 & \text{if } m\in[n]\setminus\{k,k+1\}.
\end{cases} \tag{9}$$
Then, in view of condition (8), $x_m\ge0$ for all $m\in[n]$. Also, then straightforward calculations show that conditions (7) hold and
$$s := \sum_{m=1}^n\binom{n}{m}x_m = \frac{np\,\big(2k-(n-1)p\big)}{k(k+1)}\le1. \tag{10}$$
(The latter inequality is elementary. To prove it, one may first note that the maximum in $p$ of the ratio in (10) is $\frac{kn}{(k+1)(n-1)}$, which increases in $k\in[n-1]$ to 1.) Therefore, one can satisfy condition (6) by letting $x_0 := 1-s\ge0$, so that the condition $x_m\ge0$ for all $m\in\{0\}\cup[n]$ holds as well.

Furthermore, it follows from (2), (5), (8), (9), and the definition of $P_{n,p}$ in (1) that
$$P(E) = P(N=1) = nx_1 = \begin{cases}
np\,\big(1-(n-1)p\big) & \text{if } 0\le p\le\frac{1}{n-1},\\[1ex]
0 & \text{if } \frac{1}{n-1}\le p\le1
\end{cases}
\ = np\,\big(1-(n-1)p\big)_+ = P_{n,p}.$$
This shows that the lower bound $P_{n,p}$ on $P(E)$ is indeed attained, which completes the proof of Theorem 1. □

We have the following easy corollary of Theorem 1:
Corollary 2.
In the conditions of Theorem 1, the best lower bound on $P(N = n-1)$ is $P_{n,q}$ (cf. (2)).

To see why this corollary holds, switch from the “successes” $A_j$ to the “failures” $\Omega\setminus A_j$, and also interchange the roles of $p$ and $q = 1-p$.

There are a number of further questions that one may ask concerning Theorem 1, including the following:

1. Assuming still that $A_1,\dots,A_n$ are pairwise independent events each of probability $p$, what is the best upper bound on $P(E) = P(N=1)$? More generally, for each $m\in\{0\}\cup[n]$, under the same conditions on the $A_j$’s, what are the best lower and upper bounds on $P(N=m)$?
2. The same questions as above, but assuming, more generally, that the $A_j$’s are $r$-independent for some $r\in\{2,\dots,n-1\}$, i.e., assuming that for any $J\subseteq[n]$ with $|J| = r$ the family $(A_j)_{j\in J}$ is independent.
3. The same questions as above, but assuming, more generally, that the probabilities $P(A_j)$ have possibly different prescribed values $p_j$, for $j\in[n]$.
4. Yet more generally, let $\mathcal{B}$ be any subset of the algebra (say $\mathcal{A}$) generated by events $A_1,\dots,A_n$. Suppose that the probabilities $P(B)$ have prescribed values, say $p_B$, for all $B\in\mathcal{B}$. Take any $A\in\mathcal{A}$. What are the best lower and upper bounds on $P(A)$ in terms of the $p_B$’s?
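Returning to the construction used in the proof of Theorem 1, the “straightforward calculations” behind (9) and (10) can be spot-checked with exact rational arithmetic. The sketch below is only a numerical sanity check on a grid of values of $n$ and $p$ chosen here for illustration; the helper names `extremal_x` and `check` are hypothetical, and $k$ is taken as the smallest index satisfying (8).

```python
from fractions import Fraction
from math import ceil, comb


def extremal_x(n, p):
    """Return [x_0, ..., x_n] as in (9), with x_0 := 1 - s as in the proof."""
    p = Fraction(p)
    # Smallest k in [n-1] with (k-1)/(n-1) <= p <= k/(n-1); k = 1 when p = 0.
    k = max(1, ceil((n - 1) * p))
    x = [Fraction(0)] * (n + 1)
    x[k] = n * p / k * (k - (n - 1) * p) / comb(n, k)
    x[k + 1] = n * p / (k + 1) * ((n - 1) * p - (k - 1)) / comb(n, k + 1)
    x[0] = 1 - sum(comb(n, m) * x[m] for m in range(1, n + 1))  # x_0 = 1 - s
    return x


def check(n, p):
    """Check nonnegativity, (6), (7), and P(N = 1) = P_{n,p} for n >= 2."""
    p = Fraction(p)
    x = extremal_x(n, p)
    total = sum(comb(n, m) * x[m] for m in range(n + 1))                # (6)
    p_single = sum(comb(n - 1, m - 1) * x[m] for m in range(1, n + 1))  # (7), first sum
    p_pair = sum(comb(n - 2, m - 2) * x[m] for m in range(2, n + 1))    # (7), second sum
    bound = n * p * max(0, 1 - (n - 1) * p)                             # P_{n,p} from (1)
    return (min(x) >= 0 and total == 1 and p_single == p
            and p_pair == p * p and n * x[1] == bound)


if __name__ == "__main__":
    # Illustrative grid: n = 2, ..., 8 and p = 0, 1/20, ..., 1.
    print(all(check(n, Fraction(i, 20)) for n in range(2, 9) for i in range(21)))
```

For $n = 3$ and $p = 1/2$ this gives $x_0 = x_2 = 1/4$ and $x_1 = x_3 = 0$, so that $N$ takes only the values 0 and 2.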
Looking back at the proof of Theorem 1 and recalling the discussion in Section 1, one can see that all the further problems listed above are ones of linear programming in a space of dimension exponentially growing with $n$, with the values of the $P(C_J)$’s for $J\subseteq[n]$ as the variables. Therefore and because the above proof of Theorem 1, with all its parts fitting together quite tightly, already was not easy to devise, all these problems seem hard to tackle theoretically or even computationally.

3. Conclusion
As we saw in Section 1, the condition of the “complete” independence of $n$ events, oftentimes assumed quite casually, actually involves $\sim2^n$ equations, which are practically impossible to test well even for rather moderate values of $n$, such as $n = 40$. In contrast, the pairwise independence of $n$ events involves only $n(n-1)/2$ equations […] $r$-independence for some $r\in\{2,\dots,n-1\}$. Also, perhaps some of the further questions enumerated at the end of Section 2 will attract the attention of other researchers. Finally, the methods presented in this note might turn out to be of use in other optimization problems in probability, statistics, and perhaps elsewhere, especially where the “complete” independence is in doubt.

References
1. Jacek Bochnak, Michel Coste, and Marie-Françoise Roy, Real algebraic geometry, Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], vol. 36, Springer-Verlag, Berlin, 1998, Translated from the 1987 French original, Revised by the authors. MR 1659509
2. Alexandr A. Borovkov, Probability theory, Universitext, Springer, London, 2013, Translated from the 2009 Russian fifth edition by O. B. Borovkova and P. S. Ruzankin, Edited by K. A. Borovkov. MR 3086572
3. William Feller, An introduction to probability theory and its applications. Vol. I, Third edition, John Wiley & Sons, Inc., New York-London-Sydney, 1968. MR 0228020
4. Terence Tao, The strong law of large numbers, Nov 2008, https://terrytao.wordpress.com/2008/06/18/the-strong-law-of-large-numbers/.

Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan 49931