A Strong XOR Lemma for Randomized Query Complexity

Joshua Brody, Jae Tak Kim, Peem Lerdputtipongporn, and Hariharan Srinivasulu

Swarthmore College
Abstract
We give a strong direct sum theorem for computing XOR_k ◦ g. Specifically, we show that for every function g and every k ≥ 2, the randomized query complexity of computing the XOR of k instances of g satisfies R̄_ε(XOR_k ◦ g) = Θ(k · R̄_{ε/k}(g)). This matches the naive success amplification upper bound and answers a conjecture of Blais and Brody [7]. As a consequence of our strong direct sum theorem, we give a total function g for which R(XOR_k ◦ g) = Θ(k log(k) · R(g)), answering an open question from Ben-David et al. [5].

1 Introduction

We show that XOR admits a strong direct sum theorem for randomized query complexity. Generally, the direct sum problem asks how the cost of computing a function g scales with the number k of instances of the function that we need to compute. This is a foundational computational problem that has received considerable attention [9, 2, 13, 14, 10, 6, 8, 7, 3, 4, 5], including recent work of Blais and Brody [7], which showed that average-case randomized query complexity obeys a direct sum theorem in a strong sense: computing k copies of a function g with overall error ε requires k times the cost of computing g on one input with very low (ε/k) error. This matches the naive success amplification algorithm, which runs an ε/k-error algorithm for g once on each of k inputs and applies a union bound to get an overall error guarantee of ε.

What happens if we don't need to compute g on all instances, but only a function f ◦ g of those instances? Clearly the same success amplification trick (compute g on each input with low error, then apply f to the answers) works for computing f ◦ g; however, in principle, computing f ◦ g can be easier than computing each instance of g individually. When a function f ◦ g requires success amplification for all g, we say that f admits a strong direct sum theorem. Our main result shows that XOR admits a strong direct sum theorem.

1.1 Query Complexity

A query algorithm (also known as a decision tree) computing f is an algorithm A that takes an input x to f, examines (or queries) bits of x, and outputs an answer for f(x). A leaf of A is a bit string q ∈ {0,1}* representing the answers to the queries made by A on input x. Naturally, our general goal is to minimize the length of q, i.e., to minimize the number of queries needed to compute f.

A randomized algorithm A computes a function f : {0,1}^n → {0,1} with error ε ≥ 0 if, for every x ∈ {0,1}^n, the algorithm outputs the value f(x) with probability at least 1 − ε. The query cost of A is the maximum number of bits of x that it queries, with the maximum taken over both the choice of input x and the internal randomness of A. The ε-error (worst-case) randomized query complexity of f (also known as the randomized decision tree complexity of f) is the minimum query cost of an algorithm A that computes f with error at most ε. We denote this complexity by R_ε(f), and we write R(f) := R_{1/3}(f) to denote the 1/3-error randomized query complexity of f.

Another natural measure for the query cost of a randomized algorithm A is the expected number of coordinates of an input x that it queries. Taking the maximum expected number of coordinates queried by A over all inputs yields the average query cost of A. The minimum average query cost of an algorithm A that computes a function f with error at most ε is the average ε-error query complexity of f, which we denote by R̄_ε(f). We again write R̄(f) := R̄_{1/3}(f). Note that R̄_0(f) corresponds to the standard notion of zero-error randomized query complexity of f.

Our main result is a strong direct sum theorem for XOR, matching the naive amplification strategy sketched below.
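To make the upper bound concrete, here is a minimal Python sketch of the naive amplification strategy. It is illustrative only: `g_alg` stands for a hypothetical ε/k-error subroutine for g (obtainable, e.g., by repeating a 1/3-error algorithm O(log(k/ε)) times and taking majority votes); none of these names appear in the paper.

```python
def xor_of_g_amplified(instances, g_alg):
    """Naive success amplification for XOR_k ∘ g.

    `instances` is a list of k inputs to g, and `g_alg` is any
    randomized algorithm computing g with error at most eps/k.
    By a union bound, all k calls are simultaneously correct with
    probability at least 1 - eps, so the XOR of the answers is
    correct with probability at least 1 - eps.
    """
    parity = 0
    for x in instances:
        parity ^= g_alg(x)  # one low-error call per instance
    return parity
```

Theorem 1 shows that, up to constant factors, no algorithm for XOR_k ◦ g can do better than this strategy.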
Theorem 1.
For every function g : {0,1}^n → {0,1}, every k ≥ 2, and all ε > 0, we have

    R̄_ε(XOR_k ◦ g) = Ω(k · R̄_{ε/k}(g)).

This answers Conjecture 1 of Blais and Brody [7] in the affirmative.

We prove Theorem 1 by proving an analogous result in distributional query complexity. We also allow our algorithms to abort with constant probability. Let D^µ_{δ,ε}(f) denote the minimal query cost of a deterministic query algorithm that aborts with probability at most δ and errs with probability at most ε, where the probability is taken over inputs X ∼ µ. Similarly, let R̄_{δ,ε}(f) denote the minimal query cost of a randomized algorithm that computes f with abort probability at most δ and error probability at most ε (here the probabilities are taken over the internal randomness of the algorithm).

Our main technical result is the following strong direct sum result for XOR_k ◦ g for distributional algorithms.

Lemma 1 (Main Technical Lemma, informally stated). For every function g : {0,1}^n → {0,1}, every distribution µ, and every small enough δ, ε > 0, we have

    D^{µ^k}_{δ,ε}(XOR_k ◦ g) = Ω(k · D^µ_{δ',ε'}(g)),

for δ' = Θ(1) and ε' = Θ(ε/k).

In [7], Blais and Brody also gave a total function g : {0,1}^n → {0,1} whose average ε-error query complexity satisfies R̄_ε(g) = Ω(R(g) · log(1/ε)). We use our strong XOR lemma together with this function to show the following.

Corollary 1.
There exists a total function g : {0,1}^n → {0,1} such that R(XOR_k ◦ g) = Ω(k log(k) · R(g)).

Proof. Let g : {0,1}^n → {0,1} be a function guaranteed by [7]. Then, we have

    R(XOR_k ◦ g) ≥ R̄(XOR_k ◦ g) ≥ Ω(k · R̄_{1/(3k)}(g)) ≥ Ω(k · R(g) · log(3k)) = Ω(k log(k) · R(g)),

where the second inequality is by Theorem 1 and the third inequality is from the query complexity guarantee of g. ∎

This answers Open Question 1 from recent work of Ben-David et al. [5].

1.2 Related Work

Jain et al. [10] gave direct sum theorems for deterministic and randomized query complexity. While their direct sum result holds for worst-case randomized query complexity, they incur an increase in error (R_ε(f^k) ≥ δ · k · R_{ε+δ}(f)) when computing a single copy of f. Shaltiel [14] gave a counterexample function for which direct sum fails to hold for distributional complexity. Drucker [8] gave a strong direct product theorem for randomized query complexity.

Our work is most closely related to that of Blais and Brody [7], who give a strong direct sum theorem R̄_ε(f^k) = Ω(k · R̄_{ε/k}(f)) and explicitly conjecture that XOR admits a strong direct sum theorem. Both [7] and our work use techniques similar to those of Molinaro et al. [11, 12], who give strong direct sum theorems for communication complexity.

Our strong direct sum for XOR is an example of a composition theorem: a lower bound on the query complexity of functions of the form f ◦ g. Several very recent works studied composition theorems in query complexity. Bassilakis et al. [1] show that R(f ◦ g) = Ω(fbs(f) · R(g)), where fbs(f) is the fractional block sensitivity of f. Ben-David and Blais [3, 4] give a tight lower bound on R(f ◦ g) as the product of R(g) and a new measure they define called noisyR(f), which measures the complexity of computing f on noisy inputs; they also characterize noisyR(f) in terms of the gap-majority function. Ben-David et al. [5] explicitly consider strong direct sum theorems for composed functions in randomized query complexity, asking whether the naive success amplification algorithm is necessary to compute f ◦ g. They give a partial strong direct sum theorem, showing that there exists a partial function g such that computing XOR_k ◦ g requires success amplification, even in a model (called PostBPP) where the query algorithm may abort with any probability strictly less than 1 and, conditioned on not aborting, must output the correct value with probability at least 1 − ε. Ben-David et al. explicitly ask whether there exists a total function g such that R(XOR_k ◦ g) = Ω(k log(k) · R(g)).

1.3 Our Techniques

Our technique most closely follows the strong direct sum theorem of Blais and Brody. We start with a query algorithm that computes XOR_k ◦ g and use it to build a query algorithm for computing g with low error. To do this, we take an input for g and embed it into an input for XOR_k ◦ g. Given x ∈ {0,1}^n, i ∈ [k], and y ∈ {0,1}^{n×k}, let y^{(i←x)} := (y^{(1)}, ..., y^{(i−1)}, x, y^{(i+1)}, ..., y^{(k)}) denote the input obtained from y by replacing the i-th coordinate y^{(i)} with x. Note that if x ∼ µ and y ∼ µ^k, then y^{(i←x)} ∼ µ^k for all i ∈ [k]. (We use µ^k to denote the distribution on k-tuples where each coordinate is independently distributed according to µ.)

We require the following observation of Drucker [8].

Lemma 2 ([8], Lemma 3.2). Let y ∼ µ^k be an input for a query algorithm A, and consider any execution of queries by A. The distribution of the coordinates of y, conditioned on the queries made by A, remains a product distribution.

In particular, the answers g(y^{(i)}) remain independent bits conditioned on any set of queries made by the query algorithm.
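The embedding y^{(i←x)} is straightforward to realize in code. A minimal sketch (a hypothetical helper, not from the paper, representing y as a Python list of k blocks):

```python
def embed(y, i, x):
    """Return y^(i<-x): a copy of y with its i-th block replaced by x.

    y is a list of k blocks (each an n-bit input to g), i is a
    0-indexed coordinate, and x is an n-bit input to g. If the blocks
    of y are drawn i.i.d. from µ and x ~ µ independently, then the
    result is again distributed according to µ^k.
    """
    y2 = list(y)   # shallow copy; the other blocks are unchanged
    y2[i] = x
    return y2
```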
Our first observation is that in order to compute XOR_k ◦ g(y) with high probability, we must be able to compute g(y^{(i)}) with very high probability for many i's. The intuition behind this observation is captured by the following simple fact about the XOR of independent random bits.

Define the bias of a random bit X ∈ {0,1} as r(X) := max_{b∈{0,1}} Pr[X = b], and define the advantage of X as adv(X) := 2r(X) − 1. Note that when adv(X) = δ, then r(X) = (1 + δ)/2.

Fact 1.
Let X_1, ..., X_k be independent random bits, and let a_i be the advantage of X_i. Then,

    adv(X_1 ⊕ ··· ⊕ X_k) = ∏_{i=1}^k adv(X_i).

For completeness, we provide a proof of Fact 1 in Appendix A.

Given an algorithm for XOR_k ◦ g that has error ε, it follows that for typical leaves the advantage of computing XOR_k ◦ g is ≳ 1 − ε. Fact 1 shows that for such leaves, the advantage of computing g(y^{(i)}) for most coordinates i is ≳ (1 − ε)^{1/k} = 1 − Θ(ε/k). Thus, conditioned on reaching such a leaf of the query algorithm, we could compute g(y^{(i)}) with very high probability. We'd like to fix a coordinate i* such that for most leaves, our advantage in computing g on coordinate i* is 1 − O(ε/k). There are other complications, namely that (i) our construction needs to handle aborts gracefully and (ii) our construction must ensure that the algorithm for XOR_k ◦ g doesn't query the i*-th coordinate too many times. Our construction identifies a coordinate i* and a string z ∈ {0,1}^{n×k}; on input x ∈ {0,1}^n it emulates a query algorithm for XOR_k ◦ g on input z^{(i*←x)} and outputs our best guess for g(x) (which is now g evaluated on coordinate i* of z^{(i*←x)}), aborting when needed, e.g., when the algorithm for XOR_k ◦ g aborts or when it queries too many bits of x. We defer the full details of the proof to Section 2.

1.4 Preliminaries and Notation

Suppose that f is a Boolean function on domain {0,1}^n and that µ is a distribution on {0,1}^n. Let µ^k denote the distribution on k-tuples of {0,1}^n obtained by sampling each coordinate independently according to µ.

An algorithm A is a [q, δ, ε, µ]-distributional query algorithm for f if A is a deterministic algorithm with query cost q that computes f with error probability at most ε and abort probability at most δ when the input x is drawn from µ. We write A(x) = ⊥ to denote that A aborts on input x.

Our main theorem is a direct sum result for XOR_k ◦ g for average-case randomized query complexity; however, Lemma 1 uses distributional query complexity. The following results from Blais and Brody [7] connect the query complexities in the randomized, average-case randomized, and distributional query models.

Fact 2 ([7], Proposition 14). For every function f : {0,1}^n → {0,1}, every 0 ≤ ε < 1/2, and every 0 < δ < 1,

    δ · R̄_{δ,ε}(f) ≤ R̄_ε(f) ≤ (1/(1−δ)) · R̄_{δ,(1−δ)ε}(f).

Fact 3 ([7], Lemma 15). For any α, β > 0 such that α + β ≤ 1, we have

    max_µ D^µ_{δ/α, ε/β}(f) ≤ R̄_{δ,ε}(f) ≤ max_µ D^µ_{αδ, βε}(f).

We'll also use the following convenient facts about probability and expectation. For completeness, we provide proofs in Appendix A.
Fact 4.
Let S, T be random variables, let E = E(S, T) and A be events, and for any s, let µ_s denote the distribution of T conditioned on S = s. Then,

    Pr_{S,T}[E | A] = E_S[ Pr_{T∼µ_S}[E(S, T) | A] ].

Fact 5 (Markov's Inequality for Bounded Variables). Let X be a real-valued random variable with 0 ≤ X ≤ 1, and suppose that E[X] ≥ 1 − ε. Then, for any T > 1 it holds that

    Pr[X < 1 − Tε] < 1/T.

For example, taking T = 100 shows that X ≥ 1 − 100ε with probability at least 0.99; we use Fact 5 in exactly this form repeatedly below.
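Fact 5 is elementary, but since we apply it twice with specific constants, a quick empirical check may be reassuring. The following script is illustrative only (not from the paper); it tests the bound on random discrete distributions supported on [0, 1]:

```python
import random

def fact5_check(trials=1000):
    """Randomized sanity check of Fact 5.

    Samples small random distributions on [0, 1], sets eps := 1 - E[X],
    and verifies Pr[X < 1 - T*eps] < 1/T for several values of T.
    """
    for _ in range(trials):
        vals = [random.random() for _ in range(5)]
        weights = [random.random() + 1e-9 for _ in range(5)]
        total = sum(weights)
        probs = [w / total for w in weights]
        eps = 1 - sum(v * p for v, p in zip(vals, probs))
        for T in (2, 3, 100):
            tail = sum(p for v, p in zip(vals, probs) if v < 1 - T * eps)
            assert tail < 1 / T
    print("Fact 5 held on all sampled distributions")

fact5_check()
```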
2 Proof of the Main Result

In this section, we prove our main result.
Lemma 3 (Formal Restatement of Lemma 1). For every function g : {0,1}^n → {0,1}, every distribution µ on {0,1}^n, every 0 ≤ δ ≤ 1/10, and every sufficiently small ε > 0, we have

    D^{µ^k}_{δ,ε}(XOR_k ◦ g) = Ω(k · D^µ_{δ',ε'}(g)),

for δ' = 0.34 + 4δ and ε' = 320000ε/k.

Proof. Let q := D^{µ^k}_{δ,ε}(XOR_k ◦ g), and suppose that A is a [q, δ, ε, µ^k]-distributional query algorithm for XOR_k ◦ g. Our goal is to construct an [O(q/k), δ', ε', µ]-distributional query algorithm A' for g. Towards that end, for each leaf ℓ of A define

    b_ℓ := argmax_{b∈{0,1}} Pr_{x∼µ^k}[XOR_k ◦ g(x) = b | leaf(A, x) = ℓ],
    r_ℓ := Pr_{x∼µ^k}[XOR_k ◦ g(x) = b_ℓ | leaf(A, x) = ℓ],
    a_ℓ := 2r_ℓ − 1.

We call a_ℓ the advantage of A on leaf ℓ.

The purpose of A is to compute XOR_k ◦ g; however, we'll show that A must additionally be able to compute g reasonably well on many coordinates of x. For any i ∈ [k] and any leaf ℓ, define

    b_{i,ℓ} := argmax_{b∈{0,1}} Pr_{x∼µ^k}[b = g(x^{(i)}) | leaf(A, x) = ℓ],
    r_{i,ℓ} := Pr_{x∼µ^k}[b_{i,ℓ} = g(x^{(i)}) | leaf(A, x) = ℓ],
    a_{i,ℓ} := 2r_{i,ℓ} − 1.

If A reaches leaf ℓ on input y, then write A(y)_i := b_{i,ℓ}; A(y)_i represents A's best guess for g(y^{(i)}).

Next, we define some structural characteristics of leaves that we'll need to complete the proof.

Definition 1 (Good leaves, good coordinates).
• Call a leaf ℓ good if r_ℓ ≥ 1 − 400ε.
• Call a leaf ℓ good for i if a_{i,ℓ} ≥ 1 − 160000ε/k.
• Call coordinate i good if Pr_{x∼µ^k}[leaf(A, x) is good for i | A(x) doesn't abort] ≥ 0.94.

When a leaf is good for i, then A, conditioned on reaching this leaf, computes g(x^{(i)}) with very high probability. When a coordinate i is good, then with high probability A reaches a leaf that is good for i. To make our embedding work, we need to fix a good coordinate i* such that A makes only O(q/k) queries on this coordinate. The following claim shows that most coordinates are good.

Claim 1. At least 2k/3 indices i ∈ [k] are good.

We defer the proof of Claim 1 to the following subsection. Next, for each i ∈ [k], let q_i(x) denote the number of queries that A makes to x^{(i)} on input x. The query cost of A guarantees that for each input x, ∑_{1≤i≤k} q_i(x) ≤ q. Therefore ∑_{i∈[k]} E_{x∼µ^k}[q_i(x)] ≤ q, and so at least 2k/3 indices i ∈ [k] satisfy

    E_{x∼µ^k}[q_i(x)] ≤ 3q/k.    (1)

Since each of these two sets contains at least 2k/3 of the k indices, there exists i* which satisfies both Claim 1 and inequality (1). Fix such an i*. For inputs y ∈ {0,1}^{n×k} and x ∈ {0,1}^n, let y^{(i*←x)} := (y^{(1)}, ..., y^{(i*−1)}, x, y^{(i*+1)}, ..., y^{(k)}) denote the input obtained from y by replacing y^{(i*)} with x. Note that if y ∼ µ^k and x ∼ µ, then y^{(i←x)} ∼ µ^k for all i ∈ [k]. With this notation and using Fact 4, the conditions from inequality (1) and Claim 1 satisfied by i* can be rewritten as

    E_{y∼µ^k}[ E_{x∼µ}[ q_{i*}(y^{(i*←x)}) ] ] ≤ 3q/k, and
    E_{y∼µ^k}[ Pr_{x∼µ}[ leaf(A, y^{(i*←x)}) is bad for i* | A(y^{(i*←x)}) doesn't abort ] ] ≤ 3/50.

Since A has abort probability at most δ, we have

    E_{y∼µ^k}[ Pr_{x∼µ}[ A(y^{(i*←x)}) = ⊥ ] ] ≤ δ.

Finally, for any leaf ℓ that is good for i*, we have a_{i*,ℓ} ≥ 1 − 160000ε/k. Hence

    E_{y∼µ^k}[ Pr_{x∼µ}[ A(y^{(i*←x)})_{i*} ≠ g(x) | leaf(A, y^{(i*←x)}) is good for i* ] ] ≤ 80000ε/k.

Applying Markov's inequality to each of these four conditions and taking a union bound, there exists z ∈ {0,1}^{n×k} such that

    E_{x∼µ}[ q_{i*}(z^{(i*←x)}) ] ≤ 12q/k,    (2)
    Pr_{x∼µ}[ leaf(A, z^{(i*←x)}) is bad for i* | A(z^{(i*←x)}) ≠ ⊥ ] ≤ 6/25,    (3)
    Pr_{x∼µ}[ A(z^{(i*←x)}) = ⊥ ] ≤ 4δ, and    (4)
    Pr_{x∼µ}[ A(z^{(i*←x)})_{i*} ≠ g(x) | leaf(A, z^{(i*←x)}) is good for i* ] ≤ 320000ε/k.    (5)

Fix this z. Now that i* and z are fixed, we are ready to describe our algorithm.
Algorithm 1: A'_{z,i*}(x)
1. y ← z^{(i*←x)}.
2. Emulate algorithm A on input y. Abort if A aborts, if A queries more than 120q/k bits of x, or if A reaches a leaf that is bad for i*.
3. Otherwise, output A(y)_{i*}.

Note that the emulation is possible since whenever A queries the j-th bit of y^{(i*)}, we can query x_j, and we can emulate A querying a bit of y^{(i)} for i ≠ i* directly since z is fixed. It remains to show that A' is a [120q/k, 0.34 + 4δ, 320000ε/k, µ]-distributional query algorithm for g.

First, note that A' makes at most 120q/k queries, since it aborts instead of making more queries. Next, consider the abort probability of A'. Our algorithm aborts if A aborts, if A probes more than 120q/k bits of x, or if A reaches a bad leaf. By inequality (4), A aborts with probability at most 4δ. By inequality (2) and Markov's inequality, the probability that A probes more than 120q/k bits of x is at most 1/10. By inequality (3), we have Pr_{x∼µ}[A reaches a bad leaf] ≤ 6/25. Hence, A' aborts with probability at most 4δ + 1/10 + 6/25 = 0.34 + 4δ.

Finally, note that if A' doesn't abort, then A reaches a leaf which is good for i*. By inequality (5), A' errs with probability at most 320000ε/k in this case.

We have constructed an algorithm A' for g that makes at most 120q/k queries and that, when the input x ∼ µ, aborts with probability at most δ' and errs with probability at most ε'. Hence D^µ_{δ',ε'}(g) ≤ 120q/k. Rearranging terms and recalling that q = D^{µ^k}_{δ,ε}(XOR_k ◦ g), we get

    D^{µ^k}_{δ,ε}(XOR_k ◦ g) ≥ (k/120) · D^µ_{δ',ε'}(g),

completing the proof. ∎
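In code, the reduction has the following shape. This is a sketch under stated assumptions: `run_A`, `good_for_istar`, and `best_guess` are hypothetical helpers standing in for the fixed decision tree A, the set of leaves that are good for i*, and the leaf-to-guess map b_{i*,ℓ} computed in the proof.

```python
def make_A_prime(z, i_star, run_A, q, k, good_for_istar, best_guess):
    """Build the reduction algorithm A'_{z,i*} from Lemma 3 (sketch).

    run_A(y, watch) emulates the decision tree A on input y and returns
    (leaf, aborted, watched_queries), where watched_queries counts the
    queries A made into coordinate `watch`.
    """
    def A_prime(x):
        y = list(z)
        y[i_star] = x                       # y = z^(i* <- x)
        leaf, aborted, queries_to_x = run_A(y, watch=i_star)
        if aborted or queries_to_x > 120 * q // k:
            return None                     # abort (⊥)
        if leaf not in good_for_istar:
            return None                     # abort on a bad leaf
        return best_guess(leaf)             # output b_{i*, leaf}
    return A_prime
```

In the actual reduction the query budget is enforced online: A' aborts the moment A makes its (120q/k + 1)-th query into x, rather than after the emulation finishes, which is what caps the query cost of A' at 120q/k.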
Proof of Claim 1. Let I be uniform on [k]. We want to show that Pr[I is good] ≥ 2/3. Conditioned on A not aborting, A outputs the correct value of XOR_k ◦ g(x) with probability at least 1 − ε/(1 − δ) ≥ 1 − 4ε. We first analyze this error probability by conditioning on which leaf is reached. Let ν be the distribution of leaf(A, x) when x ∼ µ^k, conditioned on A not aborting, and let L ∼ ν. Then, we have

    1 − 4ε ≤ Pr_{x∼µ^k}[A(x) = XOR_k ◦ g(x) | A doesn't abort]
           = ∑_{leaf ℓ} Pr_{L∼ν}[L = ℓ] · Pr[A(x) = XOR_k ◦ g(x) | L = ℓ]
           = ∑_ℓ Pr[L = ℓ] · r_ℓ
           = E_L[r_L].

Hence E[r_L] ≥ 1 − 4ε. Recalling that ℓ is good if r_ℓ ≥ 1 − 400ε and using Fact 5 with T = 100, L is good with probability at least 0.99. Note also that when ℓ is good, then a_ℓ ≥ 1 − 800ε. Let β_ℓ := Pr_I[ℓ is bad for I]. By Lemma 2 and Fact 1, a_ℓ = ∏_{i=1}^k a_{i,ℓ}, and each of the kβ_ℓ coordinates for which ℓ is bad contributes a factor less than 1 − 160000ε/k. Using 1 + x ≤ e^x and e^{−x} ≤ 1 − x/2 (which holds for all 0 ≤ x ≤ 1), for every good leaf ℓ we have

    1 − 800ε ≤ a_ℓ = ∏_{i=1}^k a_{i,ℓ} ≤ (1 − 160000ε/k)^{kβ_ℓ} ≤ e^{−160000ε β_ℓ} ≤ 1 − 80000ε β_ℓ.

Rearranging terms, we see that β_ℓ ≤ 0.01.

We've just shown that a random leaf ℓ is good with high probability, and that when ℓ is good, it is good for many i. We need to show that there are many i such that most leaves are good for i. Towards that end, let δ_{i,ℓ} := 1 if ℓ is good for i, and δ_{i,ℓ} := 0 otherwise. Then

    E_I[ Pr_{x∼µ^k}[leaf(A, x) good for I | A doesn't abort] ]
        = E_I[ ∑_ℓ Pr[L = ℓ] · δ_{I,ℓ} ]
        = ∑_ℓ Pr[L = ℓ] · E_I[δ_{I,ℓ}]
        = ∑_ℓ Pr[L = ℓ] · Pr_I[ℓ good for I]
        ≥ ∑_{good ℓ} Pr[L = ℓ] · (1 − β_ℓ)
        ≥ Pr_L[L is good] · (1 − 0.01)
        ≥ 0.99 · 0.99 > 0.98.

Thus, E_I[Pr_{x∼µ^k}[leaf(A, x) good for I | A doesn't abort]] ≥ 1 − 0.02. Recalling that i is good if Pr[leaf(A, x) good for i | A(x) doesn't abort] ≥ 0.94 = 1 − 3 · 0.02, and using Fact 5 with T = 3, it follows that Pr_I[I is good] ≥ 2/3. ∎
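To see the exponential-decay step numerically, the following short script (illustrative only; it simply inverts the inequality chain above, using the constants from this proof) computes the largest fraction β of bad coordinates consistent with a good leaf:

```python
import math

def max_bad_fraction(eps, k, c_leaf=800, c_coord=160000):
    """Largest beta consistent with a good leaf, i.e. the beta solving
    (1 - c_coord*eps/k)^(k*beta) = 1 - c_leaf*eps. Any good leaf must
    satisfy 1 - c_leaf*eps <= (1 - c_coord*eps/k)^(k*beta), so its
    fraction of bad coordinates is at most this value."""
    return math.log(1 - c_leaf * eps) / (k * math.log(1 - c_coord * eps / k))

# e.g. eps = 1e-6 and k = 100:
print(max_bad_fraction(1e-6, 100))   # ≈ 0.005, consistent with β_ℓ ≤ 0.01
```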
Proof of Theorem 1. Define ε' := 640000ε. Let µ be the input distribution for g achieving max_µ D^µ_{1/2, ε'/k}(g), and let µ^k be the k-fold product distribution of µ. By the first inequality of Fact 2 (with δ = 1/50) and the first inequality of Fact 3 (with α = β = 1/2), we have

    R̄_ε(XOR_k ◦ g) ≥ (1/50) · R̄_{1/50, ε}(XOR_k ◦ g) ≥ (1/50) · D^{µ^k}_{1/25, 2ε}(XOR_k ◦ g).

Additionally, by Lemma 3 (applied with abort probability 1/25 and error 2ε, so that δ' = 0.34 + 4/25 = 1/2 and the error parameter is 320000 · 2ε/k = ε'/k) and the second inequalities of Facts 2 and 3 (with α = 3/4, β = 1/4 in Fact 3 and δ = 2/3 in Fact 2), we have

    D^{µ^k}_{1/25, 2ε}(XOR_k ◦ g) ≥ (k/120) · D^µ_{1/2, ε'/k}(g) ≥ (k/120) · R̄_{2/3, 4ε'/k}(g) ≥ (k/360) · R̄_{12ε'/k}(g).

Thus, we have R̄_ε(XOR_k ◦ g) = Ω(D^{µ^k}_{1/25, 2ε}(XOR_k ◦ g)) and D^{µ^k}_{1/25, 2ε}(XOR_k ◦ g) = Ω(k · R̄_{12ε'/k}(g)). By standard success amplification, R̄_{12ε'/k}(g) = Θ(R̄_{ε/k}(g)), since 12ε' = Θ(ε). Putting these together yields

    R̄_ε(XOR_k ◦ g) = Ω(D^{µ^k}_{1/25, 2ε}(XOR_k ◦ g)) = Ω(k · R̄_{12ε'/k}(g)) = Ω(k · R̄_{ε/k}(g)),

completing the proof. ∎

Acknowledgments
The authors thank Runze Wang for several helpful discussions.

References

[1] Andrew Bassilakis, Andrew Drucker, Mika Göös, Lunjia Hu, Weiyun Ma, and Li-Yang Tan. The power of many samples in query complexity. In Proceedings of the 47th International Colloquium on Automata, Languages, and Programming (ICALP), 2020.
[2] Yosi Ben-Asher and Ilan Newman. Decision trees with AND, OR queries. In Proceedings of the 10th Annual Structure in Complexity Theory Conference, pages 74–81, 1995.
[3] Shalev Ben-David and Eric Blais. A new minimax theorem for randomized algorithms, 2020.
[4] Shalev Ben-David and Eric Blais. A tight composition theorem for the randomized query complexity of partial functions, 2020.
[5] Shalev Ben-David, Mika Göös, Robin Kothari, and Thomas Watson. When is amplification necessary for composition in randomized query complexity? CoRR, abs/2006.10957, 2020.
[6] Shalev Ben-David and Robin Kothari. Randomized query complexity of sabotaged and composed functions. Theory of Computing, 14(1):1–27, 2018.
[7] Eric Blais and Joshua Brody. Optimal separation and strong direct sum for randomized query complexity. CoRR, abs/1908.01020, 2019. (Originally appeared in CCC 2019.)
[8] Andrew Drucker. Improved direct product theorems for randomized query complexity. Computational Complexity, 21(2):197–244, 2012.
[9] Russell Impagliazzo, Ran Raz, and Avi Wigderson. A direct product theorem. In Proceedings of the 9th Annual Structure in Complexity Theory Conference, pages 88–96, 1994.
[10] Rahul Jain, Hartmut Klauck, and Miklos Santha. Optimal direct sum results for deterministic and randomized decision tree complexity. Information Processing Letters, 110(20):893–897, 2010.
[11] Marco Molinaro, David P. Woodruff, and Grigory Yaroslavtsev. Beating the direct sum theorem in communication complexity with implications for sketching. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1738–1756. SIAM, 2013.
[12] Marco Molinaro, David P. Woodruff, and Grigory Yaroslavtsev. Amplification of one-way information complexity via codes and noise sensitivity. In International Colloquium on Automata, Languages, and Programming (ICALP), pages 960–972. Springer, 2015.
[13] Noam Nisan, Steven Rudich, and Michael E. Saks. Products and help bits in decision trees. SIAM Journal on Computing, 28(3):1035–1050, 1999.
[14] Ronen Shaltiel. Towards proving strong direct product theorems. Computational Complexity, 12(1-2):1–22, 2003.
A Proofs of Technical Lemmas
Proof of Fact 1.
For each i, let b_i := argmax_{b∈{0,1}} Pr[X_i = b] and δ_i := adv(X_i). Then Pr[X_i = b_i] = (1 + δ_i)/2. We prove Fact 1 by induction on k. When k = 1, there is nothing to prove. For k = 2, note that

    Pr[X_1 ⊕ X_2 = b_1 ⊕ b_2] = (1/2)(1 + δ_1) · (1/2)(1 + δ_2) + (1/2)(1 − δ_1) · (1/2)(1 − δ_2)
                              = (1/4)(1 + δ_1 + δ_2 + δ_1δ_2) + (1/4)(1 − δ_1 − δ_2 + δ_1δ_2)
                              = (1/2)(1 + δ_1δ_2).

Hence X_1 ⊕ X_2 has advantage δ_1δ_2, and the claim holds for k = 2. For an induction hypothesis, suppose that the claim holds for X_1 ⊕ ··· ⊕ X_{k−1}. Then, setting Y := X_1 ⊕ ··· ⊕ X_{k−1}, by the induction hypothesis we have adv(Y) = ∏_{i=1}^{k−1} adv(X_i). Moreover, X_1 ⊕ ··· ⊕ X_k = Y ⊕ X_k, and by the k = 2 case,

    adv(X_1 ⊕ ··· ⊕ X_k) = adv(Y ⊕ X_k) = adv(Y) · adv(X_k) = ∏_{i=1}^k adv(X_i). ∎
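As a sanity check (not part of the paper), the following script verifies Fact 1 by brute force on random biased bits:

```python
import itertools, random

def adv(p_one):
    """Advantage of a bit that equals 1 with probability p_one."""
    return abs(2 * p_one - 1)

def adv_of_xor(ps):
    """Exact advantage of the XOR of independent bits, where bit i
    equals 1 with probability ps[i], by summing over all 2^k outcomes."""
    p_xor_one = 0.0
    for bits in itertools.product([0, 1], repeat=len(ps)):
        pr = 1.0
        for b, p in zip(bits, ps):
            pr *= p if b == 1 else 1 - p
        if sum(bits) % 2 == 1:
            p_xor_one += pr
    return adv(p_xor_one)

ps = [random.random() for _ in range(6)]
lhs = adv_of_xor(ps)
rhs = 1.0
for p in ps:
    rhs *= adv(p)
assert abs(lhs - rhs) < 1e-9   # Fact 1: adv(⊕ X_i) = ∏ adv(X_i)
```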
Proof of Fact 4.

We condition Pr_{S,T}[E(S, T) | A] on S:

    Pr_{S,T}[E | A] = ∑_s Pr[S = s | A] · Pr_T[E(S, T) | A, S = s]
                    = ∑_s Pr[S = s | A] · Pr_{T∼µ_s}[E(S, T) | A]
                    = E_S[ Pr_{T∼µ_S}[E(S, T) | A] ]. ∎

Proof of Fact 5.
Let Y := 1 − X. Then Y ≥ 0 and E[Y] ≤ ε. By Markov's inequality we have

    Pr[X < 1 − Tε] = Pr[Y > Tε] ≤ E[Y]/(Tε) ≤ 1/T.

In fact the inequality is strict: if Pr[Y > Tε] ≥ 1/T, then E[Y] > Tε · (1/T) = ε, a contradiction. ∎