A Direct Product Theorem for One-Way Quantum Communication

Abstract

We prove a direct product theorem for the one-way entanglement-assisted quantum communication complexity of a general relation $f \subseteq \mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$. For any $\varepsilon, \zeta > 0$ and any $k \geq 1$, we show that
$$Q^1_{1-(1-\varepsilon)^{\Omega(\zeta^6 k/\log|\mathcal{Z}|)}}(f^k) = \Omega\left(k\left(\zeta^5\cdot Q^1_{\varepsilon+12\zeta}(f) - \log\log(1/\zeta)\right)\right),$$
where $Q^1_\varepsilon(f)$ represents the one-way entanglement-assisted quantum communication complexity of $f$ with worst-case error $\varepsilon$ and $f^k$ denotes $k$ parallel instances of $f$. As far as we are aware, this is the first direct product theorem for quantum communication. Our techniques are inspired by the parallel repetition theorems for the entangled value of two-player non-local games, under product distributions due to Jain, Pereszlényi and Yao, and under anchored distributions due to Bavarian, Vidick and Yuen, as well as message-compression for quantum protocols due to Jain, Radhakrishnan and Sen. Our techniques also work for entangled non-local games which have input distributions anchored on any one side. In particular, we show that for any game $G = (q, \mathcal{X}\times\mathcal{Y}, \mathcal{A}\times\mathcal{B}, V)$ where $q$ is a distribution on $\mathcal{X}\times\mathcal{Y}$ anchored on any one side with anchoring probability $\zeta$,
$$\omega^*(G^k) = \left(1-(1-\omega^*(G))^5\right)^{\Omega\left(\frac{\zeta^2 k}{\log(|\mathcal{A}|\cdot|\mathcal{B}|)}\right)},$$
where $\omega^*(G)$ represents the entangled value of the game $G$. This is a generalization of the result of Bavarian, Vidick and Yuen, who proved a parallel repetition theorem for games anchored on both sides, and potentially a simplification of their proof.



Rahul Jain ∗, Srijita Kundu †

Abstract

We prove a direct product theorem for the one-way entanglement-assisted quantum communication complexity of a general relation $f \subseteq \mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$. For any $\varepsilon, \zeta > 0$ and any $k \geq 1$, we show that
$$Q^1_{1-(1-\varepsilon)^{\Omega(\zeta^6 k/\log|\mathcal{Z}|)}}(f^k) = \Omega\left(k\left(\zeta^5\cdot Q^1_{\varepsilon+12\zeta}(f) - \log\log(1/\zeta)\right)\right),$$
where $Q^1_\varepsilon(f)$ represents the one-way entanglement-assisted quantum communication complexity of $f$ with worst-case error $\varepsilon$ and $f^k$ denotes $k$ parallel instances of $f$.

As far as we are aware, this is the first direct product theorem for quantum communication; direct sum theorems were previously known for one-way quantum protocols. Our techniques are inspired by the parallel repetition theorems for the entangled value of two-player non-local games, under product distributions due to Jain, Pereszlényi and Yao [JPY14], and under anchored distributions due to Bavarian, Vidick and Yuen [BVY17], as well as message-compression for quantum protocols due to Jain, Radhakrishnan and Sen [JRS05]. In particular, we show that a direct product theorem holds for the distributional one-way quantum communication complexity of $f$ under any distribution $q$ on $\mathcal{X}\times\mathcal{Y}$ that is anchored on one side, i.e., there exists a $y^*$ such that $q(y^*)$ is constant and $q(x|y^*) = q(x)$ for all $x$. This allows us to show a direct product theorem for general distributions, since for any relation $f$ and any distribution $p$ on its inputs, we can define a modified relation $\tilde{f}$ which has an anchored distribution $q$ close to $p$, such that a protocol that fails with probability at most $\varepsilon$ for $\tilde{f}$ under $q$ can be used to give a protocol that fails with probability at most $\varepsilon + \zeta$ for $f$ under $p$.

Our techniques also work for entangled non-local games which have input distributions anchored on any one side, i.e., either there exists a $y^*$ as previously specified, or there exists an $x^*$ such that $q(x^*)$ is constant and $q(y|x^*) = q(y)$ for all $y$. In particular, we show that for any game $G = (q, \mathcal{X}\times\mathcal{Y}, \mathcal{A}\times\mathcal{B}, V)$ where $q$ is a distribution on $\mathcal{X}\times\mathcal{Y}$ anchored on any one side with anchoring probability $\zeta$,
$$\omega^*(G^k) = \left(1-(1-\omega^*(G))^5\right)^{\Omega\left(\frac{\zeta^2 k}{\log(|\mathcal{A}|\cdot|\mathcal{B}|)}\right)},$$
where $\omega^*(G)$ represents the entangled value of the game $G$. This is a generalization of the result of [BVY17], who proved a parallel repetition theorem for games anchored on both sides, i.e., where both a special $x^*$ and a special $y^*$ exist, and potentially a simplification of their proof.

∗ Centre for Quantum Technologies and Department of Computer Science, National University of Singapore and MajuLab, UMI 3654, Singapore.
† Centre for Quantum Technologies, National University of Singapore, Singapore.

arXiv [cs.CC], Aug

Introduction

A fundamental question in complexity theory is: given $k$ independent instances of a function or relation, does computing them require $k$ times the amount of resources required to compute a single instance of the function or relation? Suppose solving one instance of some problem with success probability at least $p$ requires $c$ units of some resource. A natural way to solve $k$ independent instances of this problem would be to solve them independently, which requires $ck$ units of the resource. A direct sum theorem for this problem would state that any algorithm for solving $k$ instances which uses $o(ck)$ units of resource has success probability at most $O(p)$. A direct product theorem for the problem would state that any algorithm for solving $k$ instances that uses $o(ck)$ units of resource has success probability at most $p^{\Omega(k)}$. Hence a direct product theorem is the stronger result of the two.

In this paper, we deal with direct product theorems in the model of communication complexity. In this model, there are two parties Alice and Bob, who receive inputs $x$ and $y$ respectively, and wish to jointly compute a relation $f$. In the classical model, they can use local computation and public coins, and communicate with each other using classical messages; in the quantum model, they can use local unitaries and shared entanglement, and communicate with each other using quantum messages. The resource of interest is the number of bits/qubits communicated; so the parties are allowed to share an arbitrary amount of randomness or entanglement, and perform local operations of arbitrary complexity.

Direct product theorems in communication are related to parallel repetition theorems for non-local games. In a non-local game, two parties Alice and Bob are given inputs $x$ and $y$ respectively from some specified distribution, and without communicating with each other, they are required to give answers $a$ and $b$ respectively to a referee. They are considered to win the game if $V(a, b, x, y)$ holds for a specified predicate $V$.
In the classical model, the players are allowed to share randomness, and in the quantum model they are allowed to share entanglement. A parallel repetition theorem shows that the maximum probability of winning $k$ independent instances of a non-local game is $p^{\Omega(k)}$, if the maximum probability of winning a single instance of it is $p$, regardless of the amount of shared randomness or entanglement used. Direct product theorems in communication are often proved by combining techniques used to prove direct sum theorems in communication, which require message-compression, and parallel repetition theorems for games.

In classical communication complexity, there is a long line of works on direct sum and direct product theorems including [Raz92, CSWY01, BYJKS02, Sha03, JRS03a, JRS03b, JSR08, KvdW07, BARdW08, LSv08, VW08, JKN08, JK09, HJMR10, Kla10, JY12, She12, BBCR13, BRWY13b, BRWY13a, BR14, Bra15, BW15, Jai15, JPY16, Kol16, BK18, She18]. A parallel repetition theorem for the classical value of general two-player non-local games was first shown by Raz [Raz95], and the proof was subsequently simplified by Holenstein [Hol07].

In quantum communication complexity, a direct sum theorem is known for the entanglement-assisted one-way model [JSR08], and for the simultaneous-message-passing (SMP) model in both its entanglement-assisted [JSR08] and unassisted [JK09] versions. A strong parallel repetition theorem for the quantum value of a general two-player non-local game is not known.
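To make the notion of the value of a non-local game concrete, the following minimal Python sketch brute-forces the classical value of a small game by enumerating deterministic strategies. Shared randomness cannot beat the best deterministic strategy, since a shared random string only mixes deterministic strategies. The function name `classical_value`, the dictionary encoding of the input distribution, and the choice of the CHSH game as a test case are illustrative assumptions and do not come from this paper; the entangled value $\omega^*$ is not computed here and would need different tools.

```python
from itertools import product


def classical_value(q, X, Y, A, B, V):
    """Maximum winning probability over deterministic strategies.

    q maps input pairs (x, y) to probabilities; V(a, b, x, y) is the
    referee's predicate. Hypothetical helper, for illustration only.
    """
    best = 0.0
    # A deterministic strategy assigns one answer to each possible input.
    for alice in product(A, repeat=len(X)):
        for bob in product(B, repeat=len(Y)):
            win = sum(
                q[(x, y)]
                for (i, x), (j, y) in product(enumerate(X), enumerate(Y))
                if V(alice[i], bob[j], x, y)
            )
            best = max(best, win)
    return best


# CHSH game: uniform inputs, win iff a XOR b equals x AND y.
bits = (0, 1)
chsh_q = {(x, y): 0.25 for x in bits for y in bits}
chsh_value = classical_value(
    chsh_q, bits, bits, bits, bits, lambda a, b, x, y: (a ^ b) == (x & y)
)
print(chsh_value)  # 0.75
```

The enumeration is exponential in the number of inputs, so it is only meant for toy games; a parallel repetition theorem is a statement about how this kind of value decays for $G^k$ without having to enumerate strategies.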
Parallel repetition theorems were shown for special classes of games such as XOR games [CSUU08], unique games [KRT10] and projection games [DSV15]. When the type of game is not restricted but the input distribution is, parallel repetition theorems have been shown under product distributions [JPY14] and anchored distributions [BVY17, BVY15]. For general games under general distributions, the best current result is due to Yuen [Yue16], which shows that the quantum value of $k$ parallel instances of a general game goes down polynomially in $k$, if the quantum value of the original game is strictly less than 1. No direct product theorems for quantum communication have so far been shown.

Using ideas from Jain, Pereszlényi and Yao [JPY14] and the message-compression scheme from Jain, Radhakrishnan and Sen [JSR08], a strong direct product theorem for one-way quantum communication under product distributions can be shown. To deal with non-product distributions, we borrow the idea of anchored distributions due to Bavarian, Vidick and Yuen [BVY17, BVY15].

Let $Q^1_\varepsilon(f)$ denote the one-way entanglement-assisted quantum communication complexity of a relation $f$, with worst-case error $\varepsilon$. Let $f^k$ denote $k$ parallel instances of $f$. Our strong direct product theorem is as follows.

Theorem 1.

For any relation $f \subseteq \mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$, and any $\varepsilon, \zeta > 0$,
$$Q^1_{1-(1-\varepsilon)^{\Omega(\zeta^6 k/\log|\mathcal{Z}|)}}(f^k) = \Omega\left(k\left(\zeta^5\cdot Q^1_{\varepsilon+12\zeta}(f) - \log\log(1/\zeta)\right)\right).$$

Let $\omega^*(G)$ represent the entangled value of a two-player non-local game $G$, and let $G^k$ denote $k$ parallel instances of $G$. We call a distribution $q$ on $\mathcal{X}\times\mathcal{Y}$ anchored on one side with anchoring probability $\zeta$ if one of the following conditions holds:
(i) There exists an $x^* \in \mathcal{X}$ such that $q(x^*) = \zeta$ and $q(y|x^*) = q(y)$ for all $y \in \mathcal{Y}$,
(ii) There exists a $y^* \in \mathcal{Y}$ such that $q(y^*) = \zeta$ and $q(x|y^*) = q(x)$ for all $x \in \mathcal{X}$.
The game will be called anchored on both sides with anchoring probability $\zeta$ if both conditions hold instead.

Then our parallel repetition theorem is stated as follows.

Theorem 2.

For a two-player non-local game $G = (q, \mathcal{X}\times\mathcal{Y}, \mathcal{A}\times\mathcal{B}, V)$ such that $q$ is a distribution anchored on one side with anchoring probability $\zeta$,
$$\omega^*(G^k) = \left(1-(1-\omega^*(G))^5\right)^{\Omega\left(\frac{\zeta^2 k}{\log(|\mathcal{A}|\cdot|\mathcal{B}|)}\right)}.$$

One can get a game anchored on one side (say the $\mathcal{Y}$ side) from a general game in the following way: in the anchored game, the referee chooses $(x, y)$ from the original probability distribution, and with probability $\zeta$ replaces $y$ with a new input $y^*$. If Bob's input is $y^*$, then the referee accepts any answer from the players. In a game anchored on both sides, the referee must instead replace $x$ with $x^*$ and $y$ with $y^*$ independently with probability $\zeta$, and accept if either Alice's input is $x^*$ or Bob's input is $y^*$. It is clear that anchoring makes the game easier. In this light, a parallel repetition theorem for anchored games can be thought of as follows: for a general game $G$, there exists a simple transformation taking it to another game $\tilde{G}$ such that
1. If $\omega^*(G) = 1$, then $\omega^*(\tilde{G}^k) = 1$.
2. If $\omega^*(G) < 1$, then $\omega^*(\tilde{G}^k) = \exp(-\Omega(k))$.
The merit of our result here is that the transformation involved in anchoring on one side changes the game less than the transformation involved in anchoring it on both sides.

We note that the definition of anchoring used in [BVY17, BVY15] is more general: instead of single inputs $x^*, y^*$, they consider anchoring sets $\mathcal{X}^* \subseteq \mathcal{X}$ and $\mathcal{Y}^* \subseteq \mathcal{Y}$, such that $q(\mathcal{X}^*), q(\mathcal{Y}^*) \geq \zeta$, and whenever $x \in \mathcal{X}^*$ or $y \in \mathcal{Y}^*$, $q(x, y) = q(x)q(y)$. However, it appears this generalized definition is not more useful from the perspective of anchoring transformations. While our technique could go through for the one-sided version of this definition of anchoring, we do not state or prove it as such for the sake of simplicity.

Unlike in the case of communication, worst-case success probability is usually not considered for non-local games. But one could define a game $G_{\mathrm{wc}} = (\mathcal{X}\times\mathcal{Y}, \mathcal{A}\times\mathcal{B}, V)$ without an associated distribution, and consider its worst-case winning probability $\omega^*_{\mathrm{wc}}$ over all inputs. As long as Alice and Bob are allowed to share randomness (which they are, in the quantum case), Yao's lemma [Yao79] holds just like in the case of communication, relating the worst-case winning probability to distributional winning probability. Hence, by choosing $\zeta = (1-\omega^*_{\mathrm{wc}}(G_{\mathrm{wc}}))/$ …

Corollary 3.

For any two-player non-local game $G_{\mathrm{wc}} = (\mathcal{X}\times\mathcal{Y}, \mathcal{A}\times\mathcal{B}, V)$,
$$\omega^*_{\mathrm{wc}}(G^k_{\mathrm{wc}}) = \left(1-(1-\omega^*_{\mathrm{wc}}(G_{\mathrm{wc}}))^{\cdots}\right)^{\Omega\left(\frac{k}{\log(|\mathcal{A}|\cdot|\mathcal{B}|)}\right)}.$$

We use the information theoretic framework for parallel repetition and direct product theorems established by [Raz95] and [Hol07]. The broad idea is as follows: for a given relation $f \subseteq \mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$, let the one-way quantum communication required to compute a single copy with constant success be $c$. Now consider a one-way quantum protocol $\mathcal{P}$ for $f^k$ which has communication $o(ck)$, in which we can condition on the success of some $t$ coordinates. If the success probability in these $t$ coordinates is already as small as we want, then we are done. Otherwise, we exhibit a $(t+1)$-th coordinate $i$, such that conditioned on the success on the $t$ coordinates, the success of $i$ in $\mathcal{P}$ is bounded away from 1. This is done by showing that if the success probability in the $t$ coordinates is not too small, then we can give a protocol $\mathcal{P}'$ for $f$ whose communication is $o(c)$ and whose success probability is constant, which is a contradiction.

$\mathcal{P}'$ works by embedding its input into the $i$-th coordinate of a shared quantum state representing the final input, output, message and discarded registers of $\mathcal{P}$, conditioned on the success event in the $t$ coordinates, which we denote by $\mathcal{E}$. Suppose the quantum state conditioned on $\mathcal{E}$, when Alice and Bob's inputs are $x_i$ and $y_i$ respectively at the $i$-th coordinate, is $|\varphi\rangle_{x_iy_i}$. On input $(x_i, y_i)$ in $\mathcal{P}'$, Alice and Bob will by means of local unitaries and communication try to get the shared state close to $|\varphi\rangle_{x_iy_i}$, on which Bob can perform a measurement to get an outcome $z_i$.
The state $|\varphi\rangle_{x_iy_i}$ is such that the resulting probability distribution $P_{X_iY_iZ_i}$ is the distribution of $X_iY_iZ_i$ in $\mathcal{P}$ conditioned on $\mathcal{E}$.

The proof technique for a parallel repetition theorem is the same, except one cannot, and need not, use communication to get the shared state $|\varphi\rangle_{x_iy_i}$ there. In order to motivate our techniques, we shall briefly describe the techniques used in [JPY14] and [BVY15] to get $|\varphi\rangle_{x_iy_i}$.

• In [JPY14] the following three states are considered: $|\varphi\rangle_{x_i}$ which is the superposition of $|\varphi\rangle_{x_iy_i}$ over the distribution of $Y_i$, $|\varphi\rangle_{y_i}$ which is the superposition over the distribution of $X_i$, and $|\varphi\rangle$ which is the superposition over both. In this setting, $X_1 \ldots X_k$ are initially in product with all of Bob's registers and $Y_1 \ldots Y_k$ are in product with all of Alice's registers. If the probability of $\mathcal{E}$ is large, then conditioning on it, the following can be shown:
  1. By the chain rule of mutual information, there is an $X_i$ whose mutual information with Bob's registers in $|\varphi\rangle$ is small. Hence by Uhlmann's theorem, there exist unitaries $U_{x_i}$ acting on Alice's registers that take $|\varphi\rangle$ close to $|\varphi\rangle_{x_i}$.
  2. Similarly, the mutual information between $Y_i$ and Alice's registers in $|\varphi\rangle$ is small, and hence there exist unitaries $U_{y_i}$ acting on Bob's registers that take $|\varphi\rangle$ close to $|\varphi\rangle_{y_i}$.
  3. Since $U_{x_i}$ and $U_{y_i}$ act on disjoint registers, using a commuting argument and the monotonicity of trace distance under quantum operations, $U_{x_i} \otimes U_{y_i}$ takes $|\varphi\rangle$ close to $|\varphi\rangle_{x_iy_i}$.
  Alice and Bob can thus share $|\varphi\rangle$ as entanglement, and get close to $|\varphi\rangle_{x_iy_i}$ by local operations.

• In [BVY15], $X_1 \ldots X_k$ are not initially in product with $Y_1 \ldots Y_k$, hence they need to use what are known as correlation-breaking variables. For each $i$, correlation-breaking variables $D_iG_i$ are such that conditioned on $D_iG_i$, $X_i$ and $Y_i$ are independent. In particular, $D_i$ is a uniformly distributed bit, and $G_i$ takes values in either $\mathcal{X}$ or $\mathcal{Y}$ depending on whether $D_i$ is 0 or 1, and is highly correlated with either $X_i$ or $Y_i$ in the respective cases. This means that conditioned on $D_i = 0$, $G_i = x^*$ with probability $\Omega(\zeta)$ and conditioned on $D_i = 1$, $G_i = y^*$ with probability $\Omega(\zeta)$.
  1. The mutual information between $X_i$ and Bob's registers in $|\varphi\rangle$ conditioned on $D_i = 1$ and $G_i$ is small. Further conditioning on $G_i = y^*$ (which happens with constant probability), the mutual information between $X_i$ and Bob's registers in $|\varphi\rangle_{y^*}$ is small. Hence by Uhlmann's theorem, there exist unitaries $U_{x_i}$ on Alice's registers, taking $|\varphi\rangle_{x^*y^*}$ close to $|\varphi\rangle_{x_iy^*}$.
  2. Similarly, the mutual information between $Y_i$ and Alice's registers in $|\varphi\rangle$ conditioned on $D_i = 0$ and $G_i = x^*$ is small, which means there exist unitaries $U_{y_i}$ on Bob's registers, taking $|\varphi\rangle_{x^*y^*}$ close to $|\varphi\rangle_{x^*y_i}$.
  3. Using an involved argument, it is possible to show that $U_{x_i} \otimes U_{y_i}$ takes $|\varphi\rangle_{x^*y^*}$ close to $|\varphi\rangle_{x_iy_i}$.
  Alice and Bob can thus share $|\varphi\rangle_{x^*y^*}$ in this case, and get close to $|\varphi\rangle_{x_iy_i}$ by local operations.

In our direct product proof, since the distribution is anchored on one side, we use correlation-breaking variables that are identical to those in [BVY15] in the $D_i = 1$ case, but in the $D_i = 0$ case we consider a simpler distribution where $G_i$ is perfectly correlated with $X_i$. Here we also clarify what we mean by $G_i$ and $Y_i$ being highly correlated when $D_i = 1$: if $G_i = y^*$, then $Y_i$ is always $y^*$; but if $G_i = y_i$ for $y_i \neq y^*$, then $Y_i$ still takes value $y^*$ with probability $\Omega(\zeta)$, and is $y_i$ otherwise. The distribution of $X_i$ conditioned on $G_i = y^*$ is the marginal distribution of $X_i$, while conditioned on $G_i = y_i$, it is the same as the distribution of $X_i$ conditioned on $Y_i = y_i$ (potentially different from the marginal distribution of $X_i$). Our use of these correlation-breaking variables is quite different from that in [BVY15], however.

We note that in a communication protocol where Alice sends the message, we cannot hope to show that the mutual information between $X_i$ and Bob's registers is small even conditioned on the correlation-breaking variables, since the final state on Bob's side includes the message from Alice, which can potentially be fully correlated with Alice's inputs. Since Bob does not communicate, however, the same does not apply to him. Hence we can show the following:
1. If the message size is $o(ck)$, by the chain rule of mutual information, the mutual information between $X_i$ and Bob's registers in $|\varphi\rangle$ is $o(c)$, conditioned on $D_i = 1, G_i = y^*$. Since the distribution is anchored on Bob's side, this means that the mutual information between $X_i$ and Bob's registers in $|\varphi\rangle_{y^*}$ is $o(c)$. Using a result from [JRS02, JSR08], there then exist projectors $\Pi_{x_i}$ acting on Alice's registers, which succeed with probability $2^{-o(c)}$ on $|\varphi\rangle_{y^*}$, and on success take it close to $|\varphi\rangle_{x_iy^*}$.
2. The mutual information between $Y_i$ and Alice's registers conditioned on $D_i = 1, G_i \neq y^*$ is small. For each value of $G_i \neq y^*$, there exist only two possible values of $Y_i$: $y_i$ and $y^*$, and hence Alice's registers in $|\varphi\rangle_{y_i}$ and $|\varphi\rangle_{y^*}$ must be close on average. By Uhlmann's theorem, there exist unitaries $U_{y_i}$ acting on Bob's registers, taking $|\varphi\rangle_{y^*}$ close to $|\varphi\rangle_{y_i}$.
3. Since the marginal distribution of $X_i$ conditioned on $G_i = y_i$ is approximately the same as the marginal distribution of $X_i$ conditioned on $Y_i = y_i$, we can show by the same argument as in [JSR08, JPY14], that conditioned on success of $\Pi_{x_i}$, $\Pi_{x_i} \otimes U_{y_i}$ takes $|\varphi\rangle_{y^*}$ close to $|\varphi\rangle_{x_iy_i}$.

Hence there is a communication protocol with prior shared entanglement which allows Alice and Bob to obtain a state close to $|\varphi\rangle_{x_iy_i}$ as a shared state on input $(x_i, y_i)$: Alice and Bob share $2^{o(c)}$ copies of $|\varphi\rangle_{y^*}$ as entanglement; Alice performs the $\Pi_{x_i}$ measurement on all these copies, and succeeds on at least one copy with high probability. She sends the index of the copy on which she succeeds to Bob, who performs $U_{y_i}$ on the same copy. This protocol has communication $o(c)$, since that is how many classical bits Alice needs in order to encode the index of the successful copy out of $2^{o(c)}$ copies. This completes the proof of the direct product theorem.

Our parallel repetition proof is the same as above, except no communication is necessary, since there was no communication in the original protocol. Instead of a projector on Alice's registers taking $|\varphi\rangle_{y^*}$ close to $|\varphi\rangle_{x_iy^*}$, in this case we will have a unitary $U_{x_i}$ doing it. We can argue identically to the direct product proof that there exist $U_{y_i}$ taking $|\varphi\rangle_{y^*}$ close to $|\varphi\rangle_{y_i}$, and $U_{x_i} \otimes U_{y_i}$ takes $|\varphi\rangle_{y^*}$ close to $|\varphi\rangle_{x_iy_i}$. The last part, indicated as step 3 above, is arguably simpler in our proof compared to [BVY15].

Preliminaries

We shall denote the probability distribution of a random variable $X$ on some set $\mathcal{X}$ by $P_X$. For any event $\mathcal{E}$ on $\mathcal{X}$, the distribution of $X$ conditioned on $\mathcal{E}$ will be denoted by $P_{X|\mathcal{E}}$. For joint random variables $XY$, $P_{X|Y=y}(x)$ is the conditional distribution of $X$ given $Y = y$; when it is clear from context which variable's value is being conditioned on, we shall often shorten this to $P_{X|y}$. We shall use $P_{XY}P_{Z|X}$ to refer to the distribution
$$(P_{XY}P_{Z|X})(x, y, z) = P_{XY}(x, y)\cdot P_{Z|X=x}(z).$$
For two distributions $P_X$ and $P_{X'}$ on the same set $\mathcal{X}$, the $\ell_1$ distance between them is defined as
$$\|P_X - P_{X'}\|_1 = \sum_{x\in\mathcal{X}} |P_X(x) - P_{X'}(x)|.$$

Fact 4. For joint distributions $P_{XY}$ and $P_{X'Y'}$ on the same sets,
$$\|P_X - P_{X'}\|_1 \leq \|P_{XY} - P_{X'Y'}\|_1.$$

Fact 5. For two distributions $P_X$ and $P_{X'}$ on the same set and an event $\mathcal{E}$ on the set,
$$|P_X(\mathcal{E}) - P_{X'}(\mathcal{E})| \leq \frac{1}{2}\|P_X - P_{X'}\|_1.$$

Fact 6. For two distributions $P_X$ and $P_{X'}$ on the same set, and any joint distribution $P_{XX'}$ whose marginals are $P_X$ and $P_{X'}$ respectively, we have
$$\|P_X - P_{X'}\|_1 \leq 2P_{XX'}(X \neq X').$$

Fact 7. Suppose probability distributions $P_X, P_{X'}$ satisfy $\|P_X - P_{X'}\|_1 \leq \varepsilon$, and an event $\mathcal{E}$ satisfies $P_X(\mathcal{E}) \geq \alpha$, where $\alpha > \varepsilon$. Then,
$$\|P_{X|\mathcal{E}} - P_{X'|\mathcal{E}}\|_1 \leq \frac{2\varepsilon}{\alpha}.$$

Proof.

From Fact 5, $\alpha - \varepsilon/2 \leq P_{X'}(\mathcal{E}) \leq \alpha + \varepsilon/2$. By definition, there exists an event $\mathcal{E}'$ such that $2(P_{X|\mathcal{E}}(\mathcal{E}') - P_{X'|\mathcal{E}}(\mathcal{E}')) = \|P_{X|\mathcal{E}} - P_{X'|\mathcal{E}}\|_1$. Now, $P_X(\mathcal{E}\wedge\mathcal{E}') = P_X(\mathcal{E})P_{X|\mathcal{E}}(\mathcal{E}') \geq \alpha P_{X|\mathcal{E}}(\mathcal{E}')$. Similarly,
$$P_{X'}(\mathcal{E}\wedge\mathcal{E}') \leq (\alpha + \varepsilon/2)P_{X'|\mathcal{E}}(\mathcal{E}') \leq \alpha P_{X'|\mathcal{E}}(\mathcal{E}') + \frac{1}{2}\|P_X - P_{X'}\|_1.$$
Now,
$$\|P_X - P_{X'}\|_1 \geq 2(P_X(\mathcal{E}\wedge\mathcal{E}') - P_{X'}(\mathcal{E}\wedge\mathcal{E}')) \geq 2\alpha(P_{X|\mathcal{E}}(\mathcal{E}') - P_{X'|\mathcal{E}}(\mathcal{E}')) - \|P_X - P_{X'}\|_1 \geq \alpha\|P_{X|\mathcal{E}} - P_{X'|\mathcal{E}}\|_1 - \|P_X - P_{X'}\|_1,$$
which gives the required result.

Fact 8 ([BVY15], Lemma 16). Suppose $XYZ$ are random variables satisfying $P_{XY}(x, y^*) = \alpha\cdot P_X(x)$ for all $x$. Then,
$$\|P_{XYZ} - P_{XY}P_{Z|X,y^*}\|_1 \leq \frac{2}{\alpha}\|P_{XYZ} - P_{XY}P_{Z|X}\|_1.$$

Corollary 9. Suppose $P_{XY}$ and $P_{X'Y'Z'}$ are distributions such that $\|P_{XY} - P_{X'Y'}\|_1 \leq \varepsilon$, and $P_{XY}(x, y^*) = \alpha\cdot P_X(x)$ for all $x$. Then,
$$\|P_{X'Z'|y^*} - P_{X'Z'}\|_1 \leq \frac{11}{\alpha}\|P_{X'Y'Z'} - P_{XY}P_{Z'|X'}\|_1.$$

Proof.

Let $\|P_{X'Y'Z'} - P_{XY}P_{Z'|X'}\|_1 = \varepsilon$. Note that $\|P_{X|y^*} - P_{X'|y^*}\|_1 \leq \frac{2\varepsilon}{\alpha}$ by Fact 7. Let $P_{XYZ''}$ denote the distribution $P_{XY}P_{Z'|X'Y'}$. Then
$$\|P_{X'Z'} - P_{XZ''}\|_1 = \sum_{x,z}\left|P_{X'}(x)\sum_y P_{Y'|x}(y)P_{Z'|xy}(z) - P_X(x)\sum_y P_{Y|x}(y)P_{Z'|xy}(z)\right| \leq \sum_{x,y,z}\left|P_{X'}(x)P_{Y'|x}(y) - P_X(x)P_{Y|x}(y)\right|P_{Z'|xy}(z) = \|P_{X'Y'} - P_{XY}\|_1 \leq \varepsilon.$$
Further,
$$\|P_{XYZ''} - P_{XY}P_{Z''|X}\|_1 \leq \|P_{XYZ''} - P_{X'Y'Z'}\|_1 + \|P_{X'Y'Z'} - P_{XY}P_{Z'|X'}\|_1 + \|P_{XY}P_{Z'|X'} - P_{XY}P_{Z''|X}\|_1$$
$$= \|P_{XY} - P_{X'Y'}\|_1 + \|P_{X'Y'Z'} - P_{XY}P_{Z'|X'}\|_1 + \sum_{x,y}P_{XY}(x, y)\|P_{Z'|x} - P_{Z''|x}\|_1$$
$$\leq 2\varepsilon + \sum_x P_X(x)\sum_{y,z}|P_{Y|x}(y) - P_{Y'|x}(y)|P_{Z'|xy}(z)$$
$$\leq 2\varepsilon + \sum_{x,y}|P_X(x)P_{Y|x}(y) - P_{X'}(x)P_{Y'|x}(y)| + \sum_{x,y}|P_{X'}(x) - P_X(x)|P_{Y'|x}(y)$$
$$\leq 2\varepsilon + 2\|P_{XY} - P_{X'Y'}\|_1 \leq 4\varepsilon.$$
Combining all this,
$$\|P_{X'Z'|y^*} - P_{X'Z'}\|_1 \leq \|P_{X'Z'|y^*} - P_{XZ''|y^*}\|_1 + \|P_{XZ''|y^*} - P_{XZ''}\|_1 + \|P_{XZ''} - P_{X'Z'}\|_1$$
$$\leq \|P_{X|y^*} - P_{X'|y^*}\|_1 + \|P_{XZ''|y^*} - P_{XZ''}\|_1 + \|P_{XZ''} - P_{X'Z'}\|_1$$
$$\leq \frac{2\varepsilon}{\alpha} + \frac{2}{\alpha}\|P_{XYZ''} - P_{XY}P_{Z''|X}\|_1 + \varepsilon \leq \frac{2\varepsilon}{\alpha} + \frac{8\varepsilon}{\alpha} + \varepsilon \leq \frac{11\varepsilon}{\alpha},$$
where we have used Fact 8 in the third inequality.

Fact 10 ([Hol07], Corollary 6). Let $P_{TU_1\ldots U_kV} = P_TP_{U_1|T}P_{U_2|T}\ldots P_{U_k|T}P_{V|TU_1\ldots U_k}$ be a probability distribution over $\mathcal{T}\times\mathcal{U}^k\times\mathcal{V}$, and let $\mathcal{E}$ be any event. Then,
$$\sum_{i=1}^k\|P_{TU_iV|\mathcal{E}} - P_{TV|\mathcal{E}}P_{U_i|T}\|_1 \leq \sqrt{k\left(\log(|\mathcal{V}|) + \log\left(\frac{1}{\Pr[\mathcal{E}]}\right)\right)}.$$

Definition 1 ([Hol07]). For two distributions $P_{XY}$ and $P_{X'Y'ST}$, we say $(X, Y)$ is $(1-\varepsilon)$-embeddable in $(X'S, Y'T)$ if there exists a random variable $R$ on a set $\mathcal{R}$ independent of $XY$ and functions $f_A : \mathcal{X}\times\mathcal{R}\to\mathcal{S}$ and $f_B : \mathcal{Y}\times\mathcal{R}\to\mathcal{T}$, such that
$$\|P_{XYf_A(X,R)f_B(Y,R)} - P_{X'Y'ST}\|_1 \leq \varepsilon.$$

Fact 11 ([Hol07, JPY16]). If two distributions $P_{XY}$ and $P_{X'Y'R'}$ satisfy
$$\|P_{X'Y'R'} - P_{XY}P_{R'|X'}\|_1 \leq \varepsilon, \qquad \|P_{X'Y'R'} - P_{XY}P_{R'|Y'}\|_1 \leq \varepsilon,$$
then $(X, Y)$ is $(1-\varepsilon)$-embeddable in $(X'R', Y'R')$.

The $\ell_1$ distance between two quantum states $\rho$ and $\sigma$ is given by
$$\|\rho - \sigma\|_1 = \mathrm{Tr}\sqrt{(\rho-\sigma)^\dagger(\rho-\sigma)} = \mathrm{Tr}|\rho - \sigma|.$$
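The classical $\ell_1$ facts above can be sanity-checked numerically. The following Python sketch does this for Facts 4, 5 and 7 on one pair of toy joint distributions, assuming the unnormalised definition $\|P - Q\|_1 = \sum |P - Q|$, under which the bounds read $\frac{1}{2}\|\cdot\|_1$ in Fact 5 and $2\varepsilon/\alpha$ in Fact 7. The helper names and the specific distributions are hypothetical and chosen only for illustration; a single example is of course not a proof. For diagonal density matrices the quantum $\ell_1$ distance reduces to this classical quantity.

```python
def l1(p, q):
    """Unnormalised l1 distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def marginal_x(p_xy):
    """Marginal on the first coordinate of a joint distribution."""
    out = {}
    for (x, _y), w in p_xy.items():
        out[x] = out.get(x, 0.0) + w
    return out


def prob(p, event):
    """Probability of an event (a set of outcomes) under p."""
    return sum(w for k, w in p.items() if k in event)


def condition(p, event):
    """Distribution p conditioned on the event."""
    mass = prob(p, event)
    return {k: w / mass for k, w in p.items() if k in event}


# Toy joint distributions P_XY and P_X'Y' (illustrative values only).
P = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}
Q = {(0, 0): 0.20, (0, 1): 0.25, (1, 0): 0.15, (1, 1): 0.40}
E = {(0, 0), (1, 1)}  # an event on the joint outcome space
TOL = 1e-12           # slack for floating-point rounding

eps = l1(P, Q)        # plays the role of epsilon in Fact 7
alpha = prob(P, E)    # plays the role of alpha in Fact 7 (alpha > eps here)

fact4_holds = l1(marginal_x(P), marginal_x(Q)) <= eps + TOL
fact5_holds = abs(prob(P, E) - prob(Q, E)) <= eps / 2 + TOL
fact7_holds = l1(condition(P, E), condition(Q, E)) <= 2 * eps / alpha + TOL
print(fact4_holds, fact5_holds, fact7_holds)
```

On this example Fact 5 holds with equality up to rounding, which is why a small tolerance is used in the comparisons.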
The fidelity between two quantum states is given by
$$F(\rho, \sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1.$$
$\ell_1$ distance and fidelity are related in the following way.

Fact 12 (Fuchs-van de Graaf inequality). For any pair of quantum states $\rho$ and $\sigma$,
$$2(1 - F(\rho, \sigma)) \leq \|\rho - \sigma\|_1 \leq 2\sqrt{1 - F(\rho, \sigma)^2}.$$
For two pure states $|\psi\rangle$ and $|\phi\rangle$, we have
$$\||\psi\rangle\langle\psi| - |\phi\rangle\langle\phi|\|_1 = 2\sqrt{1 - F(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|)^2} = 2\sqrt{1 - |\langle\psi|\phi\rangle|^2}.$$

Fact 13 (Uhlmann's theorem). Suppose $\rho$ and $\sigma$ are mixed states on register $X$ which are purified to $|\rho\rangle$ and $|\sigma\rangle$ on registers $XY$. Then it holds that
$$F(\rho, \sigma) = \max_U |\langle\rho|\mathbb{1}_X\otimes U|\sigma\rangle|,$$
where the maximization is over unitaries acting only on register $Y$.

Footnote to Fact 11: Fact 11 is equivalent to Lemma 2.11 in [JPY16], although that lemma is stated in terms of relative entropies instead of trace distances between the various distributions. In the proof of the lemma, the relative entropies are converted to the same trace distances as we consider, using Pinsker's inequality. This justifies our statement of the fact, which is tailored towards our application.

Fact 14. For a quantum channel $\mathcal{E}$ and states $\rho$ and $\sigma$,
$$\|\mathcal{E}(\rho) - \mathcal{E}(\sigma)\|_1 \leq \|\rho - \sigma\|_1 \quad\text{and}\quad F(\mathcal{E}(\rho), \mathcal{E}(\sigma)) \geq F(\rho, \sigma).$$

The entropy of a quantum state $\rho$ on a register $Z$ is given by
$$S(\rho) = -\mathrm{Tr}(\rho\log\rho).$$
The relative entropy between two states $\rho$ and $\sigma$ of the same dimensions is given by
$$S(\rho\|\sigma) = \mathrm{Tr}(\rho\log\rho) - \mathrm{Tr}(\rho\log\sigma).$$
The relative min-entropy between $\rho$ and $\sigma$ is defined as
$$S_\infty(\rho\|\sigma) = \min\{\lambda : \rho \leq 2^\lambda\sigma\}.$$
It is easy to see that $S(\rho\|\sigma)$ and $S_\infty(\rho\|\sigma)$ only take finite values when the support of $\rho$ is contained in the support of $\sigma$.
Moreover, clearly 0 ≤ S ( ρ (cid:107) σ ) ≤ S ∞ ( ρ (cid:107) σ ) for all ρ and σ .The ε -smooth relative min-entropy between ρ and σ is deﬁned as S ε ∞ ( ρ (cid:107) σ ) = inf ρ (cid:48) : (cid:107) ρ − ρ (cid:48) (cid:107) ≤ ε S ( ρ (cid:48) (cid:107) σ ) . S ε ∞ ( ρ (cid:107) σ ) can take a ﬁnite value even if the support of ρ is not contained in the support of σ , forexample if ρ is ε -close to a state contained within the support of σ . S ∞ ( ρ (cid:107) σ ) cannot be upperbounded by S ( ρ (cid:107) σ ), but S ε ∞ ( ρ (cid:107) σ ) can be, due to the Quantum Substate Theorem. Fact 15 (Quantum Substate Theorem, [JRS09, JN12]) . For any two states ρ and σ such that thesupport of ρ is contained in the support of σ , and any ε > , S ε ∞ ( ρ (cid:107) σ ) ≤ S ( ρ (cid:107) σ ) ε + log (cid:18) − ε / (cid:19) . Fact 16 (Pinsker’s Inequality) . For any two states ρ and σ , (cid:107) ρ − σ (cid:107) ≤ (cid:112) S ( ρ (cid:107) σ ) . Fact 17. If σ = ερ + (1 − ε ) ρ (cid:48) , then S ∞ ( ρ (cid:107) σ ) ≤ log( /ε ) . Fact 18.

For any three quantum states $\rho, \sigma, \varphi$ such that $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\varphi) \subseteq \mathrm{supp}(\sigma)$,
$$S_\infty(\rho\|\sigma) \leq S_\infty(\rho\|\varphi) + S_\infty(\varphi\|\sigma).$$

Fact 19.

For any unitary $U$,
$$S_\infty(U\rho U^\dagger\|U\sigma U^\dagger) = S_\infty(\rho\|\sigma).$$

A state of the form
$$\rho_{XY} = \sum_x P_X(x)|x\rangle\langle x|_X\otimes\rho_{Y|x}$$
is called a CQ (classical-quantum) state, with $X$ being the classical register and $Y$ being quantum. We shall use $X$ to refer to both the classical register and the classical random variable with the associated distribution. As in the classical case, here we are using $\rho_{Y|x}$ to denote the state of the register $Y$ conditioned on $X = x$, or in other words the state of the register $Y$ when a measurement is done on the $X$ register and the outcome is $x$. Hence $\rho_{XY|x} = |x\rangle\langle x|_X\otimes\rho_{Y|x}$. When the registers are clear from context we shall often write simply $\rho_x$.

The mutual information between $Y$ and $Z$ with respect to a state $\rho$ on $YZ$ is defined as
$$I(Y : Z)_\rho = S(\rho_{YZ}\|\rho_Y\otimes\rho_Z).$$
The $\varepsilon$-smooth max-information between $Y$ and $Z$ with respect to $\rho$ is defined as
$$I^\varepsilon_{\max}(Y : Z)_\rho = \inf_{\sigma_Z} S^\varepsilon_\infty(\rho_{YZ}\|\rho_Y\otimes\sigma_Z).$$
The conditional mutual information between $Y$ and $Z$ conditioned on a classical register $X$ is defined as
$$I(Y : Z|X) = \mathbb{E}_{P_X}[I(Y : Z)_{\rho_x}].$$
Mutual information can be seen to satisfy the chain rule
$$I(XY : Z)_\rho = I(X : Z)_\rho + I(Y : Z|X)_\rho.$$

Fact 20 ([BCR11], Lemma B.7). For any quantum state $\rho_{YZ}$,
$$\inf_{\sigma_Z} S_\infty(\rho_{YZ}\|\rho_Y\otimes\sigma_Z) \leq 2\min\{\log|\mathcal{Y}|, \log|\mathcal{Z}|\}.$$

Fact 21.

Suppose σ XY Z and ρ XY Z are CQ states deﬁned as follows σ XY Z = (cid:88) x,y P XY ( x, y ) | x, y (cid:105)(cid:104) x, y | ⊗ σ Z | xy ρ XY Z = (cid:88) x,y P X (cid:48) Y (cid:48) ( x, y ) | x, y (cid:105)(cid:104) x, y | ⊗ σ Z | xy , where (cid:107) P XY − P X (cid:48) Y (cid:48) (cid:107) ≤ δ < . Let I ( Y : Z | X ) σ ≤ c . Then, for any δ < ε < , P X (cid:48) (cid:18) I ε +7 δ/ε max ( Y : Z ) ρ x > c + 1 ε (cid:19) ≤ ε + δ . Proof.

2, which givesus the desired result. 12 act 23 (Quantum Raz’s Lemma, [BVY15]) . Let ρ XY and σ XY be two CQ states with X = X . . . X k being classical, and σ being product across all registers. Then, k (cid:88) i =1 I ( X i : Y ) ρ ≤ S ( ρ XY (cid:107) σ XY ) . Fact 24 ([JRS05], Lemma 2) . Suppose the state | σ (cid:105) X ˜ XAB = (cid:88) x (cid:112) P X ( x ) | xx (cid:105) X ˜ X | σ (cid:105) AB | x satisﬁes I δ max ( X : B ) σ ≤ k for some δ > . Then there is a family of measurement operators { Π x } x acting only on X ˜ XA such that:(i) Each Π x succeeds with probability α = 2 − k/δ on | σ (cid:105) X ˜ XAB ,(ii) (Π x ⊗ B ) | σ (cid:105)(cid:104) σ | (Π x ⊗ B ) is of the form | xx (cid:105)(cid:104) xx | ⊗ ρ x , for some state ρ x on AB , and E P X (cid:13)(cid:13)(cid:13)(cid:13) α (Π x ⊗ B ) | σ (cid:105)(cid:104) σ | X ˜ XAB (Π x ⊗ B ) − | xx (cid:105)(cid:104) xx | X ˜ X ⊗ | σ (cid:105)(cid:104) σ | AB | x (cid:13)(cid:13)(cid:13)(cid:13) ≤ δ. We brieﬂy describe a quantum communication protocol P for computing a relation f ⊆ X × Y ×Z , between two parties Alice and Bob sharing prior entanglement, with inputs x and y respectively.In each round, either Alice or Bob will apply a unitary on their classical input register, alongwith the quantum register they received as a message from the other party in the last round, andmemory registers they may have kept from previous rounds; after the unitary they will keep someregisters as memory and send the rest to the other party as the message for that round. We canalways assume that players make ‘safe’ copies of their inputs using CNOT gates in such protocols,so that the input registers come out as is after each round. We also note that though in generalwe need not consider shared classical randomness in quantum communication protocols, protocolswith shared randomness fall under the shared entanglement framework we have described. 
This is because shared randomness can be obtained by sharing entanglement and then both parties measuring in the same basis.

In a one-way, i.e., single-round, protocol, the memory from previous rounds is replaced by Alice's part of the shared entangled state (we take Alice to be the party sending the single message), and any register she does not send as a message is simply discarded. After Alice's message, Bob performs a projective measurement on his input register, his part of the shared entanglement, and Alice's message, and gives the outcome of this measurement as the output of the protocol, which we denote by $\mathcal{P}(x, y)$. We can of course think of this measurement as Bob performing a unitary on the three registers, and then measuring in the computational basis some $\log|\mathcal{Z}|$ qubits which are designated for the output.

Footnote (to Fact 24): The version of this fact stated here is slightly different from the original statement in [JRS05], in order to suit our application. In the original statement, $I(X:B)$ is used instead of $I^\delta_{\max}(X:B)$, and the superposition state lacks the $\tilde{X}$ register. However, in the proof of the fact in [JRS05], $I(X:B)$ is converted to $I^\delta_{\max}(X:B)$ anyway, so the first change makes no difference. The second change also makes no difference, as the same projector that takes the superposition state without the $\tilde{X}$ register to $|x\rangle\langle x| \otimes |\sigma\rangle\langle\sigma|_{AB|x}$ takes the superposition state with the $\tilde{X}$ register to $|xx\rangle\langle xx| \otimes |\sigma\rangle\langle\sigma|_{AB|x}$.

Definition 2. The one-way entanglement-assisted quantum communication complexity, with error $0 < \varepsilon < 1$, of a relation $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$, denoted by $\mathrm{Q}^1_\varepsilon(f)$, is the minimum message size, i.e., number of qubits sent, in a one-way entanglement-assisted quantum protocol $\mathcal{P}$ such that for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$,
$$\Pr[\mathcal{P}(x, y) \in f(x, y)] \ge 1 - \varepsilon,$$
where the probability is taken over the inherent randomness in the protocol.

Definition 3.

For a probability distribution $p$ on $\mathcal{X} \times \mathcal{Y}$, the distributional one-way entanglement-assisted quantum communication complexity of a relation $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$, with error $0 < \varepsilon < 1$ with respect to $p$, denoted by $\mathrm{Q}^1_{p,\varepsilon}(f)$, is defined as the minimum message size of a one-way entanglement-assisted quantum protocol $\mathcal{P}$ such that
$$\Pr[\mathcal{P}(x, y) \in f(x, y)] \ge 1 - \varepsilon,$$
where the probability is taken over the distribution $p$ on $(x, y)$ as well as the inherent randomness in the protocol.

Fact 25 (Yao's lemma, [Yao79]). For any $0 < \varepsilon < 1$, and any relation $f$,
$$\mathrm{Q}^1_\varepsilon(f) = \max_p \mathrm{Q}^1_{p,\varepsilon}(f).$$

A two-player non-local game $G$ is described as $(q, \mathcal{X} \times \mathcal{Y}, \mathcal{A} \times \mathcal{B}, V)$ where $q$ is a distribution over the input set $\mathcal{X} \times \mathcal{Y}$, $\mathcal{A} \times \mathcal{B}$ is the output set, and $V : \mathcal{X} \times \mathcal{Y} \times \mathcal{A} \times \mathcal{B} \to \{0, 1\}$ is a predicate. It is played as follows: a referee selects inputs $(x, y)$ according to $q$, sends $x$ to Alice and $y$ to Bob. If Alice and Bob are allowed to share entanglement, they perform measurements on their respective halves of the entangled state along with their respective input registers (which we model as performing unitaries and then measuring in the computational basis on some $\log|\mathcal{A}|$ and $\log|\mathcal{B}|$ qubits designated for outputs respectively), and send their outputs $(a, b)$ back to the referee. The referee accepts, and Alice and Bob win the game, iff $V(x, y, a, b) = 1$.

Definition 4.

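The entangled value of a game $G = (q, \mathcal{X} \times \mathcal{Y}, \mathcal{A} \times \mathcal{B}, V)$, denoted by $\omega^*(G)$, is the maximum winning probability of Alice and Bob, averaged over the distribution $q$ as well as inherent randomness in the strategy, over all shared entanglement strategies for $G$.

In this section, we prove Theorem 1, whose statement we recall below.

Theorem 1. For any relation $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$, and any $\varepsilon, \zeta > 0$,
$$\mathrm{Q}^1_{1-(1-\varepsilon)^{\Omega(\zeta^6 k/\log|\mathcal{Z}|)}}(f^k) = \Omega\Big(k\big(\zeta^5 \cdot \mathrm{Q}^1_{\varepsilon+12\zeta}(f) - \log\log(1/\zeta)\big)\Big).$$

Let $p$ be the hard distribution on $\mathcal{X} \times \mathcal{Y}$ for $\mathrm{Q}^1_{\varepsilon+12\zeta}(f)$ from Yao's lemma, i.e., $\mathrm{Q}^1_{\varepsilon+12\zeta}(f) = \mathrm{Q}^1_{p,\varepsilon+12\zeta}(f)$. Consider the relation $\tilde{f} \subseteq \mathcal{X} \times (\mathcal{Y} \cup \{y^*\}) \times \mathcal{Z}$ which is the same as $f$ on $\mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ and additionally satisfies $(x, y^*, z) \in \tilde{f}$ for all $x \in \mathcal{X}$ and all $z \in \mathcal{Z}$.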
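As an illustration of the extended relation (not part of the original proof), the following Python sketch models a relation as a function returning the set of valid answers and adds the extra input $y^*$ on which every answer is accepted. The names `STAR`, `extend_relation`, `lift_protocol` and the toy equality relation are assumptions made for this example. The deterministic classical function is only a stand-in for a protocol, used to show why substituting an arbitrary input for $y^*$ never hurts correctness.

```python
STAR = "y*"  # hypothetical label for the extra input y*


def extend_relation(f):
    """Turn f(x, y) -> set of valid answers into a predicate for the extended relation.

    The extended relation agrees with f on ordinary inputs and accepts
    every answer z when Bob's input is STAR.
    """
    def f_tilde(x, y, z):
        if y == STAR:
            return True
        return z in f(x, y)
    return f_tilde


def lift_protocol(protocol, default_y):
    """Run a protocol for f on inputs of the extended relation.

    When Bob's input is STAR he pretends it is default_y instead.
    """
    def lifted(x, y):
        return protocol(x, default_y if y == STAR else y)
    return lifted


# Toy example (an assumption for illustration): the equality relation on bits.
def eq_relation(x, y):
    return {int(x == y)}


def eq_protocol(x, y):
    # Classical stand-in for a protocol that computes equality exactly.
    return int(x == y)


f_tilde = extend_relation(eq_relation)
lifted = lift_protocol(eq_protocol, default_y=0)

if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1, STAR):
            print(x, y, f_tilde(x, y, lifted(x, y)))
```

Any choice of `default_y` works here, because the extended relation places no constraint on the answer when Bob holds $y^*$.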

14e can think of p as a distribution on X × ( Y ∪ { y ∗ } ) as well, which has p ( y ∗ ) = 0. Clearly,Q p,γ ( ˜ f ) = Q p,γ ( f ) (1)for any error γ , since p has no support on the extra inputs on which ˜ f is deﬁned. We also note thatQ γ ( f k ) ≥ Q γ ( ˜ f k ) (2)for any γ . This is because any protocol for f k is also a protocol for ˜ f k : on the indices where Bob’sinput is y ∗ instead of an element of Y , he pretends he has gotten an input from Y , runs the protocolwith this input and gives the answer accordingly. This gives a correct output if the original protocolgives a correct output, since any output is correct when Bob’s input in y ∗ .For a distribution q related to p , we shall show thatQ q k , − (1 − ε ) Ω( ζ k/ log |Z| ) ( ˜ f k ) ≥ ζ k · Q p,ε +12 ζ ( ˜ f ) − k log log (cid:18) ζ (cid:19) . (3)Since Q γ ( ˜ f k ) ≥ Q q k ,γ ( ˜ f k ), (1), (2) and (3) imply the theorem. The distribution q is deﬁned asfollows q ( x, y ) = (1 − ζ ) · p ( x, y ) ∀ x ∈ X , y ∈ Y q ( x, y ∗ ) = ζ · p ( x ) ∀ x ∈ X . Clearly, q ( x, y ∗ ) = q ( x ) q ( y ∗ ) for all x , and (cid:107) p ( x, y ) − q ( x, y ) (cid:107) ≤ ζ. (4)Following [BVY15], for each i ∈ [ k ], we shall deﬁne a joint distribution P X i Y i D i G i , where themarginal on X i Y i is q ( x, y ), and D i G i are correlation-breaking variables such that conditioned on D i G i = d i g i , X i and Y i are independent. Each X i Y i D i G i is distributed independently of the rest.Each D i is distributed uniformly in { , } . Depending on the value of D i , G i is distributed in thefollowing way: G i =  x w.p. p ( x ) if D i = 0 y ∗ w.p. 1 − (1 − ζ ) / if D i = 1 y w.p. (1 − ζ ) / · p ( y ) if D i = 1Now depending on the value of D i G i , X i Y i is distributed in the following way: X i Y i =  ( x, y ∗ ) w.p. ζ if D i = 0 , G i = x ( x, y ) w.p. (1 − ζ ) · p ( y | x ) if D i = 0 , G i = x ( x, y ∗ ) w.p. p ( x ) if D i = 1 , G i = y ∗ ( x, y ∗ ) w.p. 
(cid:0) − (1 − ζ ) / (cid:1) · p ( x | y ) if D i = 1 , G i = y ( x, y ) w.p. (1 − ζ ) / · p ( x | y ) if D i = 1 , G i = y. The following lemma is similar to Claim 18 from [BVY15]; we provide a proof for completeness.
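The construction of the correlation-breaking variables can be sanity-checked numerically. The sketch below is an illustration only: the function names, the random choice of $p$ and the value $\zeta = 0.2$ are assumptions made for this example. It builds $q$ from $p$, sums out $G_i$ for each value of $D_i$ using the probabilities $(1-\zeta)^{1/2}$ and $1-(1-\zeta)^{1/2}$ from the definition, and compares the resulting marginal on $X_iY_i$ with $q$. It also computes the probability that $Y_i \neq G_i$ conditioned on $D_i = 1$.

```python
import math
import random

STAR = "y*"  # label for the anchor input y*


def random_joint(nx, ny, seed=0):
    """A strictly positive random distribution p on {0..nx-1} x {0..ny-1}."""
    rng = random.Random(seed)
    w = {(x, y): rng.random() + 0.05 for x in range(nx) for y in range(ny)}
    total = sum(w.values())
    return {xy: v / total for xy, v in w.items()}


def anchored_q(p, zeta):
    """q(x, y) = (1 - zeta) p(x, y) for y in Y, and q(x, y*) = zeta p(x)."""
    q = {(x, y): (1 - zeta) * v for (x, y), v in p.items()}
    for (x, _), v in p.items():
        q[(x, STAR)] = q.get((x, STAR), 0.0) + zeta * v
    return q


def xy_given_d(p, zeta, d):
    """Distribution of (X_i, Y_i) given D_i = d, with G_i summed out.

    Also returns Pr[Y_i != G_i | D_i = d] (only meaningful for d = 1).
    """
    px, py = {}, {}
    for (x, y), v in p.items():
        px[x] = px.get(x, 0.0) + v
        py[y] = py.get(y, 0.0) + v
    s = math.sqrt(1 - zeta)
    out, mismatch = {}, 0.0
    if d == 0:
        # G_i = x w.p. p(x); then Y_i = y* w.p. zeta, or y w.p. (1 - zeta) p(y|x).
        for x, pxv in px.items():
            out[(x, STAR)] = pxv * zeta
        for (x, y), v in p.items():
            out[(x, y)] = px[x] * (1 - zeta) * (v / px[x])
    else:
        # G_i = y* w.p. 1 - s, in which case (X_i, Y_i) = (x, y*) w.p. p(x).
        for x, pxv in px.items():
            out[(x, STAR)] = (1 - s) * pxv
        # G_i = y w.p. s p(y); then Y_i = y* w.p. 1 - s, or Y_i = y w.p. s.
        for (x, y), v in p.items():
            cond = v / py[y]  # p(x | y)
            out[(x, STAR)] += s * py[y] * (1 - s) * cond
            out[(x, y)] = s * py[y] * s * cond
            mismatch += s * py[y] * (1 - s) * cond
    return out, mismatch


if __name__ == "__main__":
    zeta = 0.2
    p = random_joint(3, 4, seed=1)
    q = anchored_q(p, zeta)
    for d in (0, 1):
        marg, mismatch = xy_given_d(p, zeta, d)
        err = max(abs(marg[k] - q[k]) for k in q)
        print("D =", d, "max deviation from q:", err, "Pr[Y != G]:", mismatch)
```

Because the conditional marginal equals $q$ for both values of $D_i$, the unconditioned marginal equals $q$ as well, whatever the distribution of $D_i$.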

Lemma 26. For all $(x, y) \in \mathcal{X} \times (\mathcal{Y} \cup \{y^*\})$, $P_{X_iY_i}(x, y) = q(x, y)$.

Proof. It is trivial to see that $P_{G_iY_i|D_i=0}(x, y) = P_{X_iY_i|D_i=0}(x, y) = q(x, y)$, since $G_i = X_i$ conditioned on $D_i = 0$. We now prove the $D_i = 1$ case. First consider a $y \in \mathcal{Y}$. $Y_i$ can only take value $y$ if $G_i$ takes value $y$. Hence,
\begin{align*}
P_{X_iY_i|D_i=1}(x, y) &= P_{G_i|D_i=1}(y) \cdot P_{X_iY_i|D_i=1,G_i=y}(x, y) \\
&= (1-\zeta)^{1/2}\, p(y) \cdot (1-\zeta)^{1/2}\, p(x|y) \\
&= (1-\zeta) \cdot p(x, y) = q(x, y).
\end{align*}
On the other hand, $Y_i$ can take value $y^*$ when $G_i = y^*$ or when $G_i = y$ for any $y \in \mathcal{Y}$. Hence,
\begin{align*}
P_{X_iY_i|D_i=1}(x, y^*) &= P_{G_i|D_i=1}(y^*) \cdot P_{X_iY_i|D_i=1,G_i=y^*}(x, y^*) + \sum_{y \in \mathcal{Y}} P_{G_i|D_i=1}(y) \cdot P_{X_iY_i|D_i=1,G_i=y}(x, y^*) \\
&= \big(1-(1-\zeta)^{1/2}\big) \cdot p(x) + (1-\zeta)^{1/2}\big(1-(1-\zeta)^{1/2}\big) \sum_{y \in \mathcal{Y}} p(y) \cdot p(x|y) \\
&= \big(1-(1-\zeta)^{1/2}\big) \cdot p(x) + \big((1-\zeta)^{1/2} - (1-\zeta)\big) \cdot p(x) \\
&= \zeta \cdot p(x) = q(x, y^*).
\end{align*}

In particular the lemma means $P_{X_iY_i}(x, y^*) = P_{X_i}(x) P_{Y_i}(y^*)$. We also note
$$P_{Y_iG_i|D_i=1}(Y_i \neq G_i) = (1-\zeta)^{1/2}\big(1-(1-\zeta)^{1/2}\big) \le (1-\zeta/2) - (1-\zeta) = \zeta/2. \tag{5}$$

To prove (3), let $\mathcal{P}$ be any quantum one-way protocol between Alice and Bob, for $\tilde{f}^k \subseteq \mathcal{X}^k \times (\mathcal{Y} \cup \{y^*\})^k \times \mathcal{Z}^k$. $\mathcal{P}$ is depicted in Figure 1. Alice and Bob's inputs are in registers $X = X_1 \ldots X_k$ and $Y = Y_1 \ldots Y_k$, and they share an entangled pure state, uncorrelated with the inputs, on registers $E_AE_B$, with Alice holding $E_A$ and Bob holding $E_B$. Alice applies a unitary $V^{\mathrm{Alice}}$ on $XE_A$, to get the message register $M$, and the register $A$ to be discarded. We shall use $|\theta\rangle_{AME_B|x}$ to refer to the pure state in $AME_B$ in the protocol after Alice's unitary, for inputs $xy$ (the state $|\theta\rangle_x$ depends on the input pair only through $x$).
When Alice and Bob’s inputs are distributed according to P XY , the state of the protocol afterAlice’s message, will be given by the following CQ state: θ XY AME B = (cid:88) xy P XY ( xy ) | xy (cid:105)(cid:104) xy | XY ⊗ | θ (cid:105)(cid:104) θ | AME B | x . We shall also consider the following puriﬁcation of it, with the purifying registers ˜ X and ˜ Y : | θ (cid:105) X ˜ XY ˜ Y AME B = (cid:88) xy (cid:112) P XY ( xy ) | xxyy (cid:105) X ˜ XY ˜ Y | θ (cid:105) AME B | x . After receiving Alice’s message, Bob applies a unitary V Bob to Y M E B , after which M E B getsconverted to BZ , where Z = Z . . . Z k are the answer registers. We shall use | ρ (cid:105) X ˜ XY ˜ Y ABZ torefer to | θ (cid:105) X ˜ XY ˜ Y AME B after V Bob . We shall use P XY DGZ to refer to the joint distribution ofthese variables in | ρ (cid:105) , where the Z distribution is obtained by measuring the Z register in thecomputational basis.We shall show that if the communication cost of P is < ζ k · Q p,ε +12 ζ ( ˜ f ) − k log log(24 / ζ ), thenthe success probability of P is (1 − ε ) Ω( ζ k/ log |Z| ) . This is implied by the following claim, whichthe rest of the proof will show. 16 E A E B Y V

Figure 1: One-way quantum protocol $\mathcal{P}$.

Lemma 27.

Let δ = ζ and δ (cid:48) = ζ |Z| . For i ∈ [ k ] , let T i be the random variable whichtakes value 1 if P computes f ( X i , Y i ) correctly, and value 0 otherwise. If the communication costof P is < ζ k · Q p,ε +12 ζ ( ˜ f ) − k log log(24 / ζ ) , then there exist (cid:98) δ (cid:48) k (cid:99) coordinates { i , . . . , i (cid:98) δ (cid:48) k (cid:99) } ⊆ [ k ] ,such that for all ≤ r ≤ (cid:98) δ (cid:48) k (cid:99) − , at least one of the following two conditions holds(i) Pr (cid:104)(cid:81) rj =1 T i j = 1 (cid:105) ≤ (1 − ε ) δk (ii) Pr (cid:104) T i r +1 = 1 (cid:12)(cid:12)(cid:12)(cid:81) rj =1 T i j = 1 (cid:105) ≤ − ε . Lemma 27 can be proved inductively. Suppose we have already identiﬁed 1 ≤ t ≤ (cid:98) δ (cid:48) k (cid:99) coordinates in C = { i , . . . i t } , such that for all 1 ≤ r ≤ t −

1, Pr (cid:104) T i r +1 = 1 | (cid:81) rj =1 T i j = 1 (cid:105) ≤ − ε .Let E refer to the event (cid:81) i ∈ C T i = 1. If Pr[ E ] ≤ (1 − ε ) δk , then we are already done. If not, then weshall show how to identify the ( t + 1)-th coordinate i such that Pr [ T i = 1 |E ] ≤ − ε . The process ofidentifying the ﬁrst coordinate is also similar, except in that case the conditioning event is empty.Since we only use the lower bound (1 − ε ) δk on the probability of the conditioning event in ourproof, the proof goes through for that case as well.We shall use the state | ϕ (cid:105) , which is | ρ (cid:105) X ˜ XY ˜ Y ABZ conditioned on E , for the proof of Lemma 27.For any value DG = dg , | ϕ (cid:105) X ˜ XY ˜ Y ABZ | dg is deﬁned as: | ϕ (cid:105) X ˜ XY ˜ Y ABZ | dg = 1 √ γ dg (cid:88) xy (cid:113) P XY | dg ( xy ) | xxyy (cid:105) X ˜ XY ˜ Y ⊗ (cid:88) z C :( x C ,y C ,z C ) ∈ ˜ f t | z C (cid:105) Z C | ˜ ϕ (cid:105) ABZ ¯ C | xyz C . Here | ˜ ϕ (cid:105) xyz C is a subnormalized state with (cid:107)| ˜ ϕ (cid:105) ABZ ¯ C | xyz C (cid:107) = P Z C | xy ( z C ). The overall normaliza-tion factor γ dg is the probability of E conditioned on dg , and satisﬁes (cid:88) dg P DG ( dg ) · γ dg = Pr[ E ] .

It is clear that the distribution of

XY Z in | ϕ (cid:105) X ˜ XY ˜ Y ABZ | dg is P XY Z |E ,dg . Note that we are usingthe notation | ϕ (cid:105) dg without explicitly considering registers DG on which a measurement is done toobtain | ϕ (cid:105) dg . We shall also sometimes use | ϕ (cid:105) d − i g − i in which the xy distributions are conditioned on d − i g − i instead, which changes the normalization factor to some γ d − i g − i , everything else remainingthe same. ϕ x i y i d − i g − i refers as usual to the state obtained when a measurement done on the X i Y i registers (which are actually present in | ϕ (cid:105) ) in | ϕ (cid:105) d − i g − i . For i / ∈ ¯ C , we shall use the states | ϕ (cid:105) X ¯ C ˜ X ¯ C Y ¯ C ˜ Y ¯ C ABZ ¯ C | x i y i x C y C z C d − i g − i in our proof, which we note are pure states.Lemma 27 will be proved with the help of the following lemma, whose proof we give later. Lemma 28. If Pr[ E ] ≥ (1 − ε ) δk , then there exist a coordinate i ∈ ¯ C , a random variable R i = X C Y C Z C D − i G − i and for each R i = r i a state | ϕ (cid:48) (cid:105) X ¯ C ˜ X ¯ C Y ¯ C ˜ Y ¯ C ABZ ¯ C | y ∗ r i such that the followingconditions hold:(i) (cid:107) P X i Y i R i |E − P X i Y i P R i |E ,X i (cid:107) ≤ ζ (ii) (cid:107) P X i Y i R i |E − P X i Y i P R i |E ,Y i (cid:107) ≤ ζ .There exist projectors { Π x i r i } x i r i acting only on registers X ¯ C ˜ X ¯ C A and unitaries { U y i r i } y i r i actingonly on Y ¯ C ˜ Y ¯ C BZ ¯ C , such that each Π x i r i succeeds on | ϕ (cid:48) (cid:105) r i with probability α r i = 2 − c (cid:48) ri , and(iii) E P Ri |E c (cid:48) r i ≤ cζ (iv) E P XiYiRi |E (cid:13)(cid:13)(cid:13) α ri (Π x i r i ⊗ U y i r i ) | ϕ (cid:48) (cid:105)(cid:104) ϕ (cid:48) | y ∗ r i (Π x i r i ⊗ U † y i r i ) − | ϕ (cid:105)(cid:104) ϕ | x i y i r i (cid:13)(cid:13)(cid:13) ≤ ζ. Proof of Lemma 27.

We give a one-way quantum protocol P (cid:48) for ˜ f , whose inputs are distributedaccording to P X i Y i , i.e., q , by embedding Alice and Bob’s inputs into the i -th coordinate of | ϕ (cid:105) x i y i r i ,as follows: • Alice and Bob have r according to the distribution required by Fact 11 as shared randomness,and 2 c/ζ log(24 / ζ ) copies of | ϕ (cid:48) (cid:105) y ∗ r i as shared entanglement, with Alice holding registers X ¯ C ˜ X ¯ C A and Bob holding registers Y ¯ C ˜ Y ¯ C BZ ¯ C of each copy. • On input ( x i , y i ) from P X i Y i , using items (i), (ii) of Lemma 28, their shared randomness, andthe protocol from Fact 11, Alice and Bob generate random variables R Alice i R Bob i such that (cid:107) P X i Y i R Alice i R Bob i − P X i Y i R i R i |E (cid:107) ≤ ζ . where R i R i denotes two perfectly correlated copies of R i in P X i Y i R i R i |E . • Alice applies the { Π x i r A i , − Π x i r A i } measurement according to her input and R Alice i on herregisters for each copy of the shared entangled state. If the Π x i r A i measurement does notsucceed on any copy, then she aborts. Otherwise, she sends to Bob a ( cζ + log log(24 / ζ ))-bit message indicating an index where Π x i r A i measurement succeeded.18 Bob applies the unitary U y i r B i according to his input and R Bob i on the copy of the sharedentangled state whose index Alice has sent, and measures the Z i register of the resultingstate to give her output.To analyze the success of this protocol, ﬁrst note that E P XiYiRi |E Pr[Result of Z i measurement on | ϕ (cid:105) x i y i r i ∈ ˜ f ( x i , y i )] = Pr[ T i = 1 |E ] . Let us ﬁrst assume Alice and Bob have ( x i , y i , r A i , r B i ) distributed exactly according to P X i Y i R i R i |E – we shall denote both r A i and r B i by r i in this case. Alice aborts the protocol if none of hermeasurements succeed. 
On expectation, this happens with probability E P Ri |E (cid:104) (1 − − c (cid:48) ri ) c/ζ log(24 / ζ ) (cid:105) ≤ (cid:16) − − E P Ri |E c (cid:48) ri (cid:17) c/ζ log(24 / ζ ) ≤ ζ √ α ri (Π x i r i ⊗ U y i r i | ϕ (cid:48) (cid:105) y ∗ r i . From (iv), the expected probability of the Z i measurement on this state giving ananswer ∈ ˜ f ( x i , y i ) is at least Pr[ T i = 1 |E ] − ζ . Hence, if Alice and Bob had ( x i , y i , r A i , r B i )distributed according to P X i Y i R i R i |E , then their expected success probability would have been atleast Pr[ T i = 1 |E ] − ζ − ζ . Since Alice and Bob have ( x i , y i , r A i , r B i ) according to P X i Y i R Alice i R Bob i instead, their expected success probability is at leastPr[ T i = 1 |E ] − ζ − ζ − ζ ≥ Pr[ T i = 1 |E ] − ζ. Since (cid:107) q ( x, y ) − p ( x, y ) (cid:107) ≤ ζ , when the same protocol is run on X i Y i distributed according to p instead, it must succeed with probability at least Pr[ T i = 1 |E ] − ζ . Since the communication in P (cid:48) is at most ( cζ + log log(24 / ζ )) < Q p,ε +12 ζ ( ˜ f ), Pr[ T i = 1 |E ] ≥ − ε gives the error probabilityof P (cid:48) to be ≤ ε + 12 ζ , which is a contradiction. Hence we must have Pr[ T i = 1 |E ] ≤ − ε . Thedesired result thus follows by setting i t +1 = i . Proof of Lemma 28.

Applying Fact 10 with T and V being trivial and U i = X i Y i D i G i for i ∈ ¯ C ,we get, E i ∈ ¯ C (cid:107) P X i Y i D i G i |E − P X i Y i D i G i (cid:107) ≤ k − t (cid:113) k · log((1 − ε ) δk ) ≤ √ δ. (6)In particular, due to (5), this means E i ∈ ¯ C P Y i G i |E ,D i =1 ( Y i = G i ) ≥ − ζ/ − √ δ. (7)And since P G i | D i =1 ( y ∗ ) = 1 − (1 − ζ ) / , P Y i | D i =1 ,G i = y ( y i ) = (1 − ζ ) / for y i ∈ Y , we have ζ + √ δ ≥ − (1 − ζ ) / + √ δ ≥ E i ∈ ¯ C P G i |E ,D i =1 ( y ∗ ) ≥ − (1 − ζ ) / − √ δ ≥ ζ/ − √ δ (8)(1 − ζ/ √ δ ) · E i ∈ ¯ C P G i |E ,D i =1 ( y i ) ≥ E i ∈ ¯ C P Y i G i |E ,D i =1 ( y i , y i ) ≥ (1 − ζ − √ δ ) · E i ∈ ¯ C P G i |E ,D i =1 ( y i ) . (9)19act 10 can again be applied with U i = X i Y i , T = X C Y C DG and V = Z C . Let δ = δ + δ (cid:48) log |Z| = ζ . Then we have, (cid:112) δ ≥ E i ∈ ¯ C (cid:107) P X i Y i X C Y C Z C DG |E − P X C Y C Z C DG |E P X i Y i | X C Y C DG (cid:107) = E i ∈ ¯ C (cid:107) P X i Y i X C Y C Z C DG |E − P X C Y C Z C DG |E P X i Y i | D i G i (cid:107) = E i ∈ ¯ C (cid:107) P X i Y i D i G i R i |E − P D i G i R i |E P X i Y i | D i G i (cid:107) . (10)We note that D i takes value uniformly in { , } even conditioned on E . Hence from (10), (cid:112) δ ≥ E i ∈ ¯ C (cid:107) P X i Y i G i R i |E ,D i =0 − P G i R i |E ,D i =0 P X i Y i | G i ,D i =0 (cid:107) = 12 E i ∈ ¯ C (cid:107) P X i Y i R i |E − P X i R i |E P Y i | X i (cid:107) where we have used the fact that X i = G i conditioned on D i = 0. Combining this with the factthat E i ∈ ¯ C (cid:107) P X i |E − P X i (cid:107) ≤ √ δ , we have, E i ∈ ¯ C (cid:107) P X i Y i R i |E − P X i Y i P R i |E ,X i (cid:107) ≤ (cid:112) δ < ζ . (11)Due to Corollary 9 we also have from (11), E i ∈ ¯ C (cid:107) P X i R i |E ,y ∗ − P X i R i |E (cid:107) ≤ √ δ ζ . (12)Let F i denote the event Y i = G i . We know E i ∈ ¯ C P X i Y i G i | D i =1 ( F i ) ≥ − ζ/ −√ δ , from (7). 
Hence,using Fact 7, E i ∈C (cid:107) P X i Y i R i |E − P Y i R i |E P X i | Y i (cid:107) = E i ∈ ¯ C (cid:107) P X i Y i G i R i |E ,D i =1 , F i − P G i R i |E ,D i =1 , F i P X i Y i | G i D i =1 , F i (cid:107) ≤ E i ∈ ¯ C (cid:107) P X i Y i D i R i |E ,G i =1 − P G i R i |E ,D i =1 P X i Y i | G i D i =1 (cid:107) ≤ (cid:112) δ . Using E i ∈ ¯ C (cid:107) P Y i |E − P Y i (cid:107) ≤ √ δ , we have as before, E i ∈ ¯ C (cid:107) P X i Y i R i |E − P X i Y i P R i |E ,Y i (cid:107) ≤ (cid:112) δ = 7 ζ . (13)Let M be ck qubits. By Fact 20, for any value DG = dg , there exists some state σ M | dg suchthat S ∞ ( θ XY ˜ Y E B M | dg (cid:107) θ XY ˜ Y E B | dg ⊗ σ M | dg ) ≤ ck. By Fact 19 we have, S ∞ (cid:16) ρ XY ˜ Y BZ | dg (cid:107) V Bob ( θ XY ˜ Y E B | dg ⊗ σ M | dg )( V Bob ) † (cid:17) ≤ ck. Let ψ X ¯ C Y ¯ C ˜ Y ¯ C BZ ¯ C | dg = Tr Z C ( V Bob ( θ XY E B | dg ⊗ σ M | x C y C dg )( V Bob ) † ). Note that θ XY ˜ Y E B | dg ⊗ σ M | dg is product across X and the other registers, and V Bob does not act on X . Hence ψ X ¯ C Y ¯ C ˜ Y ¯ C BZ ¯ C | dg is20lso product across X and the other registers, and moreover, all the X i -s are in product with eachother as well. We have, S ∞ (cid:16) ρ XY ˜ Y BZ ¯ C | dg (cid:107) ψ XY ˜ Y BZ ¯ C | dg (cid:17) ≤ ck. 
Using Facts 21 and 18, this gives us E P XCYCZCDG |E (cid:104) S (cid:16) ϕ X ¯ C Y ¯ C ˜ Y ¯ C BZ ¯ C | x C y C z C dg (cid:107) ψ X ¯ C Y ¯ C ˜ Y ¯ C BZ ¯ C | x C y C dg (cid:17)(cid:105) ≤ E P ZCDG |E (cid:104) S (cid:16) ϕ XY ˜ Y BZ ¯ C | z C dg (cid:107) ψ XY ˜ Y BZ ¯ C | dg (cid:17)(cid:105) ≤ E P ZCDG |E (cid:104) S ∞ (cid:16) ϕ XY ˜ Y BZ ¯ C | z C dg (cid:107) ψ XY ˜ Y BZ ¯ C | dg (cid:17)(cid:105) ≤ E P ZCDG |E (cid:104) S ∞ (cid:16) ϕ XY ˜ Y BZ ¯ C | z C dg (cid:107) ϕ XY ˜ Y BZ ¯ C | dg (cid:17) + S ∞ (cid:16) ϕ XY ˜ Y BZ ¯ C | dg (cid:107) ρ XY ˜ Y BZ ¯ C | dg (cid:17) + S ∞ (cid:16) ρ XY ˜ Y BZ ¯ C | dg (cid:107) ψ XY ˜ Y BZ ¯ C | dg (cid:17)(cid:105) ≤ E P ZCDG |E (cid:2) log(1 / P Z C |E ( z C )) + log(1 / Pr[ E ]) + 2 ck (cid:3) ≤ | C | log |Z| + δk + 2 ck ≤ ( δ + 2 c ) k. By Quantum Raz’s Lemma,4 c + 2 δ ≥ E i ∈ ¯ C E P XCYCZCDG |E I ( X i : Y ¯ C ˜ Y ¯ C BZ ¯ C ) ϕ xCyCzCdg = E i ∈ ¯ C E P DiGiRi |E I ( X i : Y ¯ C ˜ Y ¯ C BZ ¯ C ) ϕ digiri ≥ E i ∈ ¯ C P G i |E ,D i =1 ( y ∗ ) E P Ri |E ,Di =1 ,Gi = y ∗ I ( X i : Y ¯ C ˜ Y ¯ C BZ ¯ C ) ϕ ri | Di =1 ,Gi = y ∗ ≥ E i ∈ ¯ C

12 (2 ζ/ − √ δ ) E P Ri |E ,Di =1 ,Gi = y ∗ I ( X i : Y ¯ C ˜ Y ¯ C BZ ¯ C ) ϕ ri,Di =1 ,Gi = y ∗ (14)where we have used (8) in the last inequality.Note that ϕ X ¯ C ˜ X ¯ C Y ¯ C ˜ Y ¯ C ABZ ¯ C | x i r i ,D i =1 ,G i = y ∗ is the same state as ϕ X ¯ C ˜ X ¯ C Y ¯ C ˜ Y ¯ C ABZ ¯ C | x i y ∗ r i , wherethe value of Y i is being conditioned on, instead of G i . | ϕ (cid:105) r i ,D i =1 ,G i = y ∗ is the superposition over X i of | ϕ (cid:105) x i r i ,D i =1 ,G i = y ∗ , with the X i distribution being P X i |E ,r i ,D i =1 ,G i = y ∗ . The only diﬀerence between | ϕ (cid:105) y ∗ r i and | ϕ (cid:105) r i ,D i =1 ,G i = y ∗ is the X i distribution, which in the former is P X i |E ,y ∗ r i instead. We shallrefer to | ϕ (cid:105) r i ,D i =1 ,G i = y ∗ as simply | ϕ (cid:105) r i , ,y ∗ as now on – note that there is no ambiguity betweenthis and | ϕ (cid:105) y ∗ r i . The same goes for the distributions P X i R i |E , ,y ∗ and P X i R i |E ,y ∗ . P X i | ,y ∗ is the same distribution as P X i | y ∗ and P R i |E ,x i , ,y ∗ is the same distribution as P R i |E ,x i y ∗ for any x i . Hence, E i ∈ ¯ C (cid:107) P X i R i |E ,y ∗ − P X i R i |E , ,y ∗ (cid:107) ≤ E i ∈ ¯ C (cid:2) (cid:107) P X i R i |E ,y ∗ − P X i | y ∗ P R i |E ,X i ,y ∗ (cid:107) + (cid:107) ( P X i | ,y ∗ − P X i |E , ,y ∗ ) P R i |E ,X i ,y ∗ (cid:107) (cid:3) ≤ E i ∈ ¯ C (cid:20) (cid:107) P X i R i |E − P X i P R i |E ,X i (cid:107) ζ/ − √ δ + (cid:107) P X i |E − P X i (cid:107) ζ/ − √ δ (cid:21) √ δ ζ where we have used (8) in the second inequality. Using the above computation and (12), we get, E i ∈ ¯ C (cid:107) P X i R i |E − P X i R i |E , ,y ∗ (cid:107) ≤ √ δ ζ . Let | ϕ (cid:48) (cid:105) y ∗ r i denote the pure state where the distribution of X i is unconditioned on Y i = y ∗ , buteverything else is conditioned. 
From (14) and Fact 22, we then have that, E i ∈ ¯ C P R i |E (cid:18) I ζ +280 √ δ /ζ max ( X i : Y ¯ C ˜ Y ¯ CBZ ¯ C ) ϕ (cid:48) y ∗ ri > c + δ ) + 1 ζ (cid:19) ≤ ζ + 20 √ δ ζ . Hence by Fact 24, there exist projectors Π x i r i acting on registers X ¯ C ˜ X ¯ C A , such that Π x i r i succeedswith probability α r i = 2 − c (cid:48) ri on | ϕ (cid:48) (cid:105) X ¯ C ˜ X ¯ C Y ¯ C ˜ Y ¯ C ABZ ¯ C | y ∗ r i , where E i ∈ ¯ C E P Ri |E c (cid:48) r i ≤ ζ · c + δ ) + 1 ζ ≤ cζ (15) E ∈ ¯ C E P XiRi |E (cid:13)(cid:13)(cid:13)(cid:13) α r i (Π x i r i ⊗ ) | ϕ (cid:48) (cid:105)(cid:104) ϕ (cid:48) | y ∗ r i (Π x i r i ⊗ ) − | ϕ (cid:105)(cid:104) ϕ | x i y ∗ r i (cid:13)(cid:13)(cid:13)(cid:13) ≤ ζ + 300 √ δ ζ ≤ ζ . (16)By similar arguments as the ones leading to (14) on Bob’s side (except the ﬁrst step where weconsider the information due to the message sent by Alice to Bob, which does not apply here), wecan alo upper bound E P XCYCZCDG |E (cid:104) S (cid:16) ϕ Y ¯ C X ¯ C ˜ X ¯ C A | x C y C z C dg (cid:107) ρ Y ¯ C X ¯ C ˜ X ¯ C A | x C y C dg (cid:17)(cid:105) . Hence by Raz’slemma again,2 δ ≥ E i ∈ ¯ C E P DiGiRi |E I ( Y i : X ¯ C ˜ X ¯ C A ) ϕ digiri ≥ E i ∈ ¯ C

12 (1 − ζ − √ δ ) E P RiGi |E ,Di =1 ,Gi (cid:54) = y ∗ I ( Y i : X ¯ C ˜ X ¯ C A ) ϕ ri,Di =1 ,gi = E i ∈ ¯ C

12 (1 − ζ − √ δ ) E P RiGiYi |E ,Di =1 ,Gi (cid:54) = y ∗ (cid:104) S (cid:16) ϕ X ¯ C ˜ X ¯ C A | y i ,D i =1 ,g i (cid:107) ϕ X ¯ C ˜ X ¯ C A | D i =1 ,g i (cid:17)(cid:105) ≥ E i ∈ ¯ C

12 (1 − ζ − √ δ ) (cid:88) y i ∈Y E P Ri |E ,Di =1 ,Gi = yi P G i |E ,D i =1 ( y i ) · (cid:104) (1 − ζ − √ δ ) (cid:107) ϕ X ¯ C ˜ X ¯ C A | y i ,r i ,D i =1 ,G i = y i − ϕ X ¯ C ˜ X ¯ C A | r i ,D i =1 ,G i = y i (cid:107) +( ζ/ − √ δ ) (cid:107) ϕ X ¯ C ˜ X ¯ C A | y ∗ ,r i ,D i =1 ,G i = y i − ϕ X ¯ C ˜ X ¯ C A | r i ,D i =1 ,G i = y i (cid:107) (cid:105) . where we have used (9) and Pinsker’s inequality in the last line. Hence by triangle inequality wehave, E i ∈ ¯ C (cid:88) y i ∈Y E P Ri |E , ,yi P G i |E , ( y i ) (cid:107) ϕ X ¯ C ˜ X ¯ C A | y i r i , ,y i − ϕ X ¯ C ˜ X ¯ C A | y ∗ r i , ,y i (cid:107) ≤ δ ζ .

10 (18)where we have bounded the last term in the ﬁrst inequality by applying Fact 14 on (17) with O X i .Notice that we have also removed the conditioning G i (cid:54) = y ∗ , since for G i = y ∗ , the correspondingstates are both | ϕ (cid:105) x i y ∗ r i .From (16) and (18) we get, E i ∈ ¯ C E P XiYiRi |E (cid:13)(cid:13)(cid:13)(cid:13) α r i (Π x i r i ⊗ U y i r i ) | ϕ (cid:48) (cid:105)(cid:104) ϕ (cid:48) | y ∗ r i (Π x i r i ⊗ U † y i r i ) − | ϕ (cid:105)(cid:104) ϕ | x i y i r i (cid:13)(cid:13)(cid:13)(cid:13) ≤ E i ∈ ¯ C E P XiYiRi |E (cid:20) (cid:13)(cid:13)(cid:13)(cid:13) α r i (Π x i r i ⊗ U y i r i ) | ϕ (cid:48) (cid:105)(cid:104) ϕ (cid:48) | y ∗ r i (Π x i r i ⊗ U † y i r i ) − ( ⊗ U y i r i ) | ϕ (cid:105)(cid:104) ϕ | x i y ∗ r i ( ⊗ U † y i r i ) (cid:13)(cid:13)(cid:13)(cid:13) + (cid:13)(cid:13)(cid:13) ( ⊗ U y i r i ) | ϕ (cid:105)(cid:104) ϕ | x i y ∗ r i ( ⊗ U † y i r i ) − | ϕ (cid:105)(cid:104) ϕ | x i y i r i (cid:13)(cid:13)(cid:13) (cid:21) = E i ∈ ¯ C E P XiYiRi |E (cid:20) (cid:13)(cid:13)(cid:13)(cid:13) α r i (Π x i r i ⊗ ) | ϕ (cid:48) (cid:105)(cid:104) ϕ (cid:48) | y ∗ r i (Π x i r i ⊗ ) − | ϕ (cid:105)(cid:104) ϕ | x i y ∗ r i (cid:13)(cid:13)(cid:13)(cid:13) + (cid:13)(cid:13)(cid:13) ( ⊗ U y i r i ) | ϕ (cid:105)(cid:104) ϕ | x i y ∗ r i ( ⊗ U † y i r i ) − | ϕ (cid:105)(cid:104) ϕ | x i y i r i (cid:13)(cid:13)(cid:13) (cid:21) ≤ ζ ζ

10 = 21 ζ . (19)Using Markov’s inequality on (11), (13), (15) and (19), we get an index i ∈ ¯ C such that theconditions (i)-(iv) for Lemma 28 hold. In this section we prove Theorem 2, whose statement is recalled below.

Theorem 2. For a two-player non-local game $G = (q, \mathcal{X} \times \mathcal{Y}, \mathcal{A} \times \mathcal{B}, V)$ such that $q$ is a distribution anchored on one side with anchoring probability $\zeta$,
$$\omega^*(G^k) = \big(1 - (1 - \omega^*(G))^5\big)^{\Omega\left(\frac{\zeta^2 k}{\log(|\mathcal{A}| \cdot |\mathcal{B}|)}\right)}.$$

The proof of this theorem is very similar to that of the direct product theorem, so we shall only highlight points of difference. Whereas in the communication case we started with an arbitrary distribution $p$ and defined a distribution $q$, anchored on one side and close to $p$, here we start with an already anchored distribution. To preserve similarity with the direct product proof, we shall consider $q$ to be anchored on the $\mathcal{Y}$ side here as well, but the proof goes through analogously for a distribution anchored on the $\mathcal{X}$ side. We define the correlation-breaking variables and the joint distribution $P_{XYDG}$ exactly as before.

We consider an entangled strategy $\mathcal{S}$ for $G^k$, where Alice and Bob, with input registers $X = X_1 \ldots X_k$ and $Y = Y_1 \ldots Y_k$, initially share an entangled state, and perform unitaries $V^{\mathrm{Alice}}$ and $V^{\mathrm{Bob}}$ respectively on their parts of the entangled state and their input registers. As before, conditioned on any value $DG = dg$, we define the following pure state representing $\mathcal{S}$ after these unitaries:
$$|\theta\rangle_{X\tilde{X}Y\tilde{Y}ABE'_AE'_B|dg} = \sum_{xy} \sqrt{P_{XY|dg}(xy)}\, |xxyy\rangle_{X\tilde{X}Y\tilde{Y}} \otimes |\theta\rangle_{ABE'_AE'_B|xy}$$
where $AB$ are the answer registers, which are measured in the computational basis by Alice and Bob to obtain their answers $(a, b)$, and $E'_AE'_B$ are some additional registers which are discarded. We shall use $P_{XYAB|dg}$ to denote the distribution of $XYAB$ in $|\theta\rangle_{dg}$; $P_{XYDGAB}$ is obtained by averaging over $dg$.

Let the winning probability $\omega^*(G)$ of $G$ be $1 - \varepsilon$ for an appropriate $\varepsilon$. We shall prove the following lemma, which is analogous to the direct product case.
It is clear that the lemma implies ω ∗ ( G k ) ≤ (1 − ε ) ζ ε k log( |A|·|B| ) = (cid:0) − (1 − ω ∗ ( G )) (cid:1) Ω (cid:18) ζ k log( |A|·|B| ) (cid:19) . Lemma 29.

Let δ = ζ ε and δ (cid:48) = ζ ε |A|·|B| ) . For i ∈ [ k ] , let T i denote the random vari-able V ( X i , Y i , A i , B i ) , where X i Y i A i B i are according to P XY AB . Then there exist (cid:98) δ (cid:48) k (cid:99) coordinates { i , . . . , i (cid:98) δ (cid:48) k (cid:99) } ⊆ [ k ] , such that for all ≤ r ≤ (cid:98) δ (cid:48) k (cid:99) − , at least one of the conditions holds(i) Pr (cid:104)(cid:81) rj =1 T i j = 1 (cid:105) ≤ (1 − ε ) δk (ii) Pr (cid:104) T i r +1 = 1 (cid:12)(cid:12)(cid:12)(cid:81) rj =1 T i j = 1 (cid:105) ≤ − ε . As before, we shall consider that we have identiﬁed a set of coordinates C = { i , . . . , i t } such thatfor all 1 ≤ r ≤ t −

1, Pr (cid:104) T i r +1 = 1 | (cid:81) rj =1 T i j = 1 (cid:105) ≤ − ε and Pr[ E ] = Pr (cid:104)(cid:81) tj =1 T i j = 1 (cid:105) ≥ (1 − ε ) δk ,and identify a ( t + 1)-th coordinate i . Let E A and E B to denote A ¯ C E (cid:48) A and B ¯ C E (cid:48) B respectively.We deﬁne the following state, which is | θ (cid:105) dg conditioned on success in C : | ϕ (cid:105) X ˜ XY ˜ Y A C B C BE A E B | dg = 1 √ γ dg (cid:88) xy (cid:113) P XY | dg ( xy ) | xxyy (cid:105) X ˜ XY ˜ Y ⊗ (cid:88) a C b C : V t ( x C ,y C ,a C ,b C )=1 | a C b C (cid:105) A C B C | ˜ ϕ (cid:105) E A E B | xya C b C . Here | ˜ ϕ (cid:105) E A E B | xya C b C is a subnormalized state satisfying (cid:107)| ˜ ϕ (cid:105) E A E B | xya C b C (cid:107) = P A C B C | xy ( a C b C ).The following lemma is the analog of Lemma 28, which we shall use to prove Lemma 29. The deﬁnition of P X i Y i D i G i in the previous section makes references to p ( x, y ). Since there is no p in the presentcase, p ( x, y ) can simply be replaced by q ( x, y | y (cid:54) = y ∗ ). emma 30. If Pr[ E ] ≥ (1 − ε ) δk , then there exist a coordinate i ∈ ¯ C , a random variable R i = X C Y C A C B C D − i G − i , such that the following conditions hold:(i) (cid:107) P X i Y i R i |E − P X i Y i P R i |E ,X i (cid:107) ≤ ε (ii) (cid:107) P X i Y i R i |E − P X i Y i P R i |E ,Y i (cid:107) ≤ ε (iii) There exist unitaries { U x i r i } x i r i and { U y i r i } y i r i respectively acting only on X ¯ C ˜ X ¯ C E A and Y ¯ C ˜ Y ¯ C E B , such that E P XiYiRi |E (cid:13)(cid:13)(cid:13) ( U x i r i ⊗ U y i r i ) | ϕ (cid:105)(cid:104) ϕ | y ∗ r i ( U † x i r i ⊗ U † y i r i ) − | ϕ (cid:105)(cid:104) ϕ | x i y i r i (cid:13)(cid:13)(cid:13) ≤ ε . It is easy to see how this lemma implies Lemma 29. 
As in the direct product case, Alice and Bob share $|\varphi\rangle_{y^*r_i}$ as entanglement (though in this case only one copy), as well as classical randomness with which they can produce $R_i^{\mathrm{Alice}}R_i^{\mathrm{Bob}}$ satisfying

$$\left\| P_{X_iY_iR_i^{\mathrm{Alice}}R_i^{\mathrm{Bob}}} - P_{X_iY_iR_iR_i|\mathcal{E}} \right\| \le \varepsilon.$$

Alice and Bob apply $U_{x_ir_i^{A}}$ and $U_{y_ir_i^{B}}$, chosen according to their inputs and to $R_i^{\mathrm{Alice}}$ and $R_i^{\mathrm{Bob}}$ respectively, on their registers $E_A$ and $E_B$ of $|\varphi\rangle_{y^*r_i}$. They then measure the $A_iB_i$ registers of the resulting state in the computational basis to obtain their outcomes $(a_i, b_i)$. $\Pr[T_i = 1 \mid \mathcal{E}] \ge 1 - \varepsilon$ implies that the resulting strategy for $G$ has success probability $> (1 - \varepsilon)$, a contradiction which lets us identify $i$ as the $(t+1)$-th coordinate.

The rest of the proof will be dedicated to showing Lemma 30.

Proof of Lemma 30.

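We can prove

$$\mathbb{E}_{i \in \bar{C}} \left\| P_{X_iY_iR_i|\mathcal{E}} - P_{X_iY_i}P_{R_i|\mathcal{E},X_i} \right\| \le \frac{\varepsilon}{600} \tag{20}$$

$$\mathbb{E}_{i \in \bar{C}} \left\| P_{X_iY_iR_i|\mathcal{E}} - P_{X_iY_i}P_{R_i|\mathcal{E},Y_i} \right\| \le \frac{\varepsilon}{600} \tag{21}$$

$$\mathbb{E}_{i \in \bar{C}}\, \mathbb{E}_{P_{X_iY_iR_i|\mathcal{E}}} \left\| |\varphi\rangle\langle\varphi|_{x_iy_ir_i} - (\mathbb{1} \otimes U_{y_ir_i})\, |\varphi\rangle\langle\varphi|_{x_iy^*r_i}\, (\mathbb{1} \otimes U^\dagger_{y_ir_i}) \right\| \le \varepsilon \tag{22}$$

Compared with the direct product case, conditioning on $z_C$ is replaced by conditioning on $a_Cb_C$, which leads to the factor of $\log(|\mathcal{A}|\cdot|\mathcal{B}|)$. The rest of the proof will hence be spent getting Alice's unitaries $U_{x_ir_i}$.

Letting $\delta = \delta + \delta'\log(|\mathcal{A}|\cdot|\mathcal{B}|)$, the following is derived analogously to the direct product case, except for the extra factor in the mutual information bound due to communication:

$$\mathbb{E}_{i \in \bar{C}}\, \mathbb{E}_{R_i|\mathcal{E}, D_i=1, G_i=y^*}\, I\left(X_i : Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B\right)_{\varphi_{r_i, D_i=1, G_i=y^*}} \le \frac{\delta}{\zeta} \tag{23}$$

$$\mathbb{E}_{i \in \bar{C}} \left\| P_{X_iR_i|\mathcal{E},y^*} - P_{X_iR_i|\mathcal{E},1,y^*} \right\| \le \frac{\sqrt{\delta}}{\zeta} \tag{24}$$

$$\mathbb{E}_{i \in \bar{C}} \left\| P_{X_iR_i|\mathcal{E}} - P_{X_iR_i|\mathcal{E},1,y^*} \right\| \le \frac{\sqrt{\delta}}{\zeta}. \tag{25}$$

From (23), by applying Pinsker's inequality, we get

$$\mathbb{E}_{i \in \bar{C}}\, \mathbb{E}_{P_{X_iR_i|\mathcal{E},1,y^*}} \left\| \varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|x_ir_i,1,y^*} - \varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|r_i,1,y^*} \right\| \le \left(\frac{\delta}{\zeta}\right)^{1/2}.$$

Note that $\varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|x_ir_i,1,y^*}$ is the same state as $\varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|x_iy^*r_i}$. But $\varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|r_i,1,y^*}$ is not the same state as $\varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|y^*r_i}$, because the averaging over $X_i$ is done with respect to $P_{X_i|\mathcal{E},r_i,1,y^*}$ in one and with respect to $P_{X_i|\mathcal{E},y^*r_i}$ in the other. However, due to (24) we can say

$$\begin{aligned}
\mathbb{E}_{i \in \bar{C}}\, \mathbb{E}_{P_{X_iR_i|\mathcal{E},1,y^*}} \left\| \varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|x_iy^*r_i} - \varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|y^*r_i} \right\|
&\le \left(\frac{\delta}{\zeta}\right)^{1/2} + \mathbb{E}_{i \in \bar{C}} \left\| P_{X_iR_i|\mathcal{E},1,y^*} - P_{R_i|\mathcal{E},1,y^*}P_{X_i|\mathcal{E},R_i,y^*} \right\| \\
&\le \left(\frac{\delta}{\zeta}\right)^{1/2} + \mathbb{E}_{i \in \bar{C}} \left\| P_{X_iR_i|\mathcal{E},y^*} - P_{X_iR_i|\mathcal{E},1,y^*} \right\| \\
&\le \frac{\sqrt{\delta}}{\zeta}.
\end{aligned}$$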
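For reference, the Pinsker step used in this part of the proof can be sketched in a generic form. The conventions below (entropies measured in bits, $\|\cdot\|_1$ the trace norm, a classical register $X$ and a quantum register $B$ with conditional states $\rho^B_x$) are assumptions made for this sketch; the paper's own normalization and constants may differ.

```latex
% Assumed conventions: S is relative entropy in bits, \|\cdot\|_1 is the trace norm,
% X is classical and B is quantum, with \rho^B = \mathbb{E}_x \rho^B_x.
I(X:B) = \mathbb{E}_{x}\, S\!\left(\rho^{B}_{x} \,\middle\|\, \rho^{B}\right),
\qquad
\|\rho - \sigma\|_1^{2} \le 2\ln 2 \cdot S(\rho\|\sigma)
\quad \text{(Pinsker's inequality)}.
% Jensen's inequality (concavity of the square root) then gives
\mathbb{E}_{x}\,\bigl\|\rho^{B}_{x} - \rho^{B}\bigr\|_1
\le \sqrt{\mathbb{E}_{x}\,\bigl\|\rho^{B}_{x} - \rho^{B}\bigr\|_1^{2}}
\le \sqrt{2\ln 2 \cdot I(X:B)}.
```

In words: a bound on the mutual information between $X_i$ and Bob's registers turns into a bound, of square-root order, on the average distance between Bob's state conditioned on $x_i$ and his state averaged over $x_i$.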
Since $|\varphi\rangle_{X_{\bar{C}}\tilde{X}_{\bar{C}}Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_AE_B|y^*r_i}$ is a purification of $\varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|y^*r_i}$ and $|\varphi\rangle_{X_{\bar{C}}\tilde{X}_{\bar{C}}Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_AE_B|x_iy^*r_i}$ is a purification of $\varphi_{Y_{\bar{C}}\tilde{Y}_{\bar{C}}E_B|x_iy^*r_i}$, by the Fuchs-van de Graaf inequality and Uhlmann's theorem we can say that there exist unitaries $U_{x_ir_i}$ on $X_{\bar{C}}\tilde{X}_{\bar{C}}E_A$ such that

$$\mathbb{E}_{i \in \bar{C}}\, \mathbb{E}_{P_{X_iR_i|\mathcal{E},1,y^*}} \left\| |\varphi\rangle\langle\varphi|_{x_iy^*r_i} - (U_{x_ir_i} \otimes \mathbb{1})\, |\varphi\rangle\langle\varphi|_{y^*r_i}\, (U^\dagger_{x_ir_i} \otimes \mathbb{1}) \right\| \le \left(\frac{\sqrt{\delta}}{\zeta}\right)^{1/2}$$

and by (25) again,

$$\mathbb{E}_{i \in \bar{C}}\, \mathbb{E}_{P_{X_iR_i|\mathcal{E}}} \left\| |\varphi\rangle\langle\varphi|_{x_iy^*r_i} - (U_{x_ir_i} \otimes \mathbb{1})\, |\varphi\rangle\langle\varphi|_{y^*r_i}\, (U^\dagger_{x_ir_i} \otimes \mathbb{1}) \right\| \le \left(\frac{\sqrt{\delta}}{\zeta}\right)^{1/2} + \frac{40\sqrt{\delta}}{\zeta} \le \left(\frac{\delta}{\zeta}\right)^{1/4} \le \varepsilon. \tag{26}$$

Combining (26) and (22) we get

$$\mathbb{E}_{i \in \bar{C}}\, \mathbb{E}_{P_{X_iY_iR_i|\mathcal{E}}} \left\| (U_{x_ir_i} \otimes U_{y_ir_i})\, |\varphi\rangle\langle\varphi|_{y^*r_i}\, (U^\dagger_{x_ir_i} \otimes U^\dagger_{y_ir_i}) - |\varphi\rangle\langle\varphi|_{x_iy_ir_i} \right\| \le \varepsilon.$$

The result then follows by Markov's inequality.
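The closing appeal to Markov's inequality is an averaging argument: several non-negative quantities are each small on average over the coordinates $i \in \bar{C}$, so some single coordinate makes all of them small at once, up to a constant factor. The sketch below illustrates this on toy numbers. The function name `good_coordinate`, the blow-up factor `slack`, and the data are hypothetical choices for illustration and do not reproduce the constants of the proof.

```python
def good_coordinate(costs, bounds, slack=4):
    """Return the first index i with costs[m][i] <= slack * bounds[m] for every m.

    costs:  equal-length lists of non-negative numbers, one list per condition
    bounds: non-negative numbers, assumed to satisfy mean(costs[m]) <= bounds[m]
    slack:  blow-up factor (illustrative); by Markov's inequality fewer than a
            1/slack fraction of indices violate any one condition, so a union
            bound guarantees a good index whenever slack >= len(costs)
    """
    n = len(costs[0])
    for i in range(n):
        if all(row[i] <= slack * b for row, b in zip(costs, bounds)):
            return i
    return None  # can only happen if the averaging assumptions fail


# Toy data: three conditions over six coordinates, each with average exactly 1.
a = [0, 0, 6, 0, 0, 0]
b = [6, 0, 0, 0, 0, 0]
c = [0, 6, 0, 0, 0, 0]
print(good_coordinate([a, b, c], bounds=[1, 1, 1]))  # 3
```

Coordinates 0, 1 and 2 each violate one of the three conditions, so the first coordinate satisfying all of them is 3. In the proof the three averaged quantities play the role of (20), (21) and the combined bound, with the blow-up factor absorbed into the constants of Lemma 30.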

Acknowledgements

This work is supported by the National Research Foundation, including under NRF RF Award No. NRF-NRFF2013-13, the Prime Minister's Office, Singapore and the Ministry of Education, Singapore, under the Research Centres of Excellence program and by Grant No. MOE2012-T3-1-009 and in part by the NRF2017-NRF-ANR004 VanQuTe
Grant.

References

[BARdW08] Avraham Ben-Aroya, Oded Regev, and Ronald de Wolf. A Hypercontractive Inequality for Matrix-Valued Functions with Applications to Quantum Computing and LDCs. In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’08, pages 477–486, 2008.

[BBCR13] Boaz Barak, Mark Braverman, Xi Chen, and Anup Rao. How to Compress Interactive Communication. SIAM Journal on Computing, 42(3):1327–1363, 2013.

[BCR11] Mario Berta, Matthias Christandl, and Renato Renner. The Quantum Reverse Shannon Theorem Based on One-Shot Information Theory. Communications in Mathematical Physics, 306(3):579–615, 2011.

[BK18] Mark Braverman and Gillat Kol. Interactive Compression to External Information. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC ’18, pages 964–977, 2018.

[BR14] Mark Braverman and Anup Rao. Information Equals Amortized Communication. IEEE Transactions on Information Theory, 60(10):6058–6069, 2014.

[Bra15] Mark Braverman. Interactive information complexity. SIAM Journal on Computing, 44(6):1698–1739, 2015.

[BRWY13a] Mark Braverman, Anup Rao, Omri Weinstein, and Amir Yehudayoff. Direct Product via Round-Preserving Compression. In Automata, Languages, and Programming, volume 7965 of Lecture Notes in Computer Science, pages 232–243. 2013.

[BRWY13b] Mark Braverman, Anup Rao, Omri Weinstein, and Amir Yehudayoff. Direct Products in Communication Complexity. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’13, pages 746–755, 2013.

[BVY15] Mohammad Bavarian, Thomas Vidick, and Henry Yuen. Anchoring Games for Parallel Repetition. https://arxiv.org/abs/1509.07466, 2015.

[BVY17] Mohammad Bavarian, Thomas Vidick, and Henry Yuen. Hardness Amplification for Entangled Games via Anchoring. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC ’17, pages 303–316, 2017.

[BW15] Mark Braverman and Omri Weinstein. An Interactive Information Odometer and Applications. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, pages 341–350, 2015.

[BYJKS02] Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, and D. Sivakumar. An Information Statistics Approach to Data Stream and Communication Complexity. In Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science, FOCS ’02, pages 209–218, 2002.

[CSUU08] Richard Cleve, William Slofstra, Falk Unger, and Sarvagya Upadhyay. Perfect Parallel Repetition Theorem for Quantum XOR Proof Systems. Computational Complexity, 17(2):282–299, 2008.

[CSWY01] Amit Chakrabarti, Yaoyun Shi, Anthony Wirth, and Andrew Yao. Informational Complexity and the Direct Sum Problem for Simultaneous Message Complexity. In Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science, FOCS ’01, pages 270–278, 2001.

[DSV15] Irit Dinur, David Steurer, and Thomas Vidick. A Parallel Repetition Theorem for Entangled Projection Games. Computational Complexity, 24(2):201–254, 2015.

[HJMR10] Prahladh Harsha, Rahul Jain, David McAllester, and Jaikumar Radhakrishnan. The Communication Complexity of Correlation. IEEE Transactions on Information Theory, 56(1):438–449, 2010.

[Hol07] Thomas Holenstein. Parallel Repetition: Simplifications and the No-Signaling Case. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’07, pages 411–419, 2007.

[Jai15] Rahul Jain. New Strong Direct Product Results in Communication Complexity. Journal of the ACM, 62(3), 2015.

[JK09] Rahul Jain and Hartmut Klauck. New Results in the Simultaneous Message Passing Model via Information Theoretic Techniques. In Proceedings of the 24th Annual IEEE Conference on Computational Complexity, CCC ’09, pages 369–378, 2009.

[JKN08] Rahul Jain, Hartmut Klauck, and Ashwin Nayak. Direct Product Theorems for Classical Communication Complexity via Subdistribution Bounds: Extended Abstract. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, STOC ’08, pages 599–608, 2008.

[JN12] Rahul Jain and Ashwin Nayak. Short Proofs of the Quantum Substate Theorem. IEEE Transactions on Information Theory, 58(6):3664–3669, 2012.

[JPY14] Rahul Jain, Attila Pereszlényi, and Penghui Yao. A Parallel Repetition Theorem for Entangled Two-Player One-Round Games under Product Distributions. In , pages 209–216, 2014.

[JPY16] Rahul Jain, Attila Pereszlényi, and Penghui Yao. A Direct Product Theorem for Two-Party Bounded-Round Public-Coin Communication Complexity. Algorithmica, 76(3):720–748, 2016.

[JRS02] Rahul Jain, Jaikumar Radhakrishnan, and Pranab Sen. The Quantum Communication Complexity of the Pointer Chasing Problem: The Bit Version. In FSTTCS 2002: Foundations of Software Technology and Theoretical Computer Science, volume 2556 of Lecture Notes in Computer Science, pages 218–229, 2002.

[JRS03a] Rahul Jain, Jaikumar Radhakrishnan, and Pranab Sen. A Direct Sum Theorem in Communication Complexity via Message Compression. In Automata, Languages and Programming, volume 2719 of Lecture Notes in Computer Science, pages 300–315. 2003.

[JRS03b] Rahul Jain, Jaikumar Radhakrishnan, and Pranab Sen. A Lower Bound for the Bounded Round Quantum Communication Complexity of Set Disjointness. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’03, pages 220–229, 2003.

[JRS05] Rahul Jain, Jaikumar Radhakrishnan, and Pranab Sen. Prior Entanglement, Message Compression and Privacy in Quantum Communication. In , pages 285–296, 2005.

[JRS09] Rahul Jain, Jaikumar Radhakrishnan, and Pranab Sen. A Property of Quantum Relative Entropy with an Application to Privacy in Quantum Communication. Journal of the ACM, 56(6), 2009.

[JSR08] Rahul Jain, Pranab Sen, and Jaikumar Radhakrishnan. Optimal Direct Sum and Privacy Trade-off Results for Quantum and Classical Communication Complexity. http://arxiv.org/abs/0807.1267, 2008.

[JY12] Rahul Jain and Penghui Yao. A Strong Direct Product Theorem in Terms of the Smooth Rectangle Bound. http://arxiv.org/abs/1209.0263, 2012.

[Kla10] Hartmut Klauck. A Strong Direct Product Theorem for Disjointness. In Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC ’10, pages 77–86, 2010.

[Kol16] Gillat Kol. Interactive Compression for Product Distributions. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’16, pages 987–998, 2016.

[KRT10] Julia Kempe, Oded Regev, and Ben Toner. Unique Games with Entangled Provers are Easy. SIAM Journal on Computing, 39(7):3207–3229, 2010.

[KvdW07] Hartmut Klauck, Robert Špalek, and Ronald de Wolf. Quantum and Classical Strong Direct Product Theorems and Optimal Time-Space Tradeoffs. SIAM Journal on Computing, 36(5):1472–1493, 2007.

[LSv08] Troy Lee, Adi Shraibman, and Robert Špalek. A Direct Product Theorem for Discrepancy. In Proceedings of the 23rd Annual IEEE Conference on Computational Complexity, CCC ’08, pages 71–80, 2008.

[Raz92] Alexander A. Razborov. On the Distributional Complexity of Disjointness. Theoretical Computer Science, 106(2):385–390, 1992.

[Raz95] Ran Raz. A Parallel Repetition Theorem. In Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing, pages 447–456, 1995.

[Sha03] Ronen Shaltiel. Towards Proving Strong Direct Product Theorems. Computational Complexity, 12(1-2):1–22, 2003.

[She12] Alexander A. Sherstov. Strong Direct Product Theorems for Quantum Communication and Query Complexity. SIAM Journal on Computing, 41(5):1122–1165, 2012.

[She18] Alexander A. Sherstov. Compressing Interactive Communication Under Product Distributions. SIAM Journal on Computing, 47(2):367–419, 2018.

[VW08] Emanuele Viola and Avi Wigderson. Norms, XOR Lemmas, and Lower Bounds for Polynomials and Protocols. Theory of Computing, 4(7):137–168, 2008.

[Yao79] Andrew C.-C. Yao. Some complexity questions related to distributive computing (preliminary report). In Proceedings of the 11th Annual ACM Symposium on Theory of Computing, STOC ’79, pages 209–213, 1979.

[Yue16] Henry Yuen. A Parallel Repetition Theorem for All Entangled Games. In , volume 55 of