Non-locality and Communication Complexity

Abstract

Quantum information processing is the emerging field that defines and realizes computing devices that make use of quantum mechanical principles, like the superposition principle, entanglement, and interference. In this review we study the information counterpart of computing. The abstract form of the distributed computing setting is called communication complexity. It studies the amount of information, in terms of bits or in our case qubits, that two spatially separated computing devices need to exchange in order to perform some computational task. Surprisingly, quantum mechanics can be used to obtain dramatic advantages for such tasks. We review the area of quantum communication complexity, and show how it connects the foundational physics questions regarding non-locality with those of communication complexity studied in theoretical computer science. The first examples exhibiting the advantage of the use of qubits in distributed information-processing tasks were based on non-locality tests. However, by now the field has produced strong and interesting quantum protocols and algorithms of its own that demonstrate that entanglement, although it cannot be used to replace communication, can be used to reduce the communication exponentially. In turn, these new advances yield a new outlook on the foundations of physics, and could even yield new proposals for experiments that test the foundations of physics.


Harry Buhrman ∗, Richard Cleve †, Serge Massar ‡, Ronald de Wolf §

July 21, 2009

Abstract

Quantum information processing is the emerging field that defines and realizes computing devices that make use of quantum mechanical principles, like the superposition principle, entanglement, and interference. Until recently the common notion of computing was based on classical mechanics, and did not take into account all the possibilities that physically-realizable computing devices offer in principle. The field gained momentum after Peter Shor developed an efficient algorithm for factoring numbers, demonstrating the potential computing powers that quantum computing devices can unleash.

In this review we study the information counterpart of computing. It was realized early on by Holevo that quantum bits, the quantum mechanical counterpart of classical bits, cannot be used for efficient transformation of information, in the sense that arbitrary k-bit messages cannot be compressed into messages of k − 1 qubits. The abstract form of the distributed computing setting is called communication complexity. It studies the amount of information, in terms of bits or in our case qubits, that two spatially separated computing devices need to exchange in order to perform some computational task. Surprisingly, quantum mechanics can be used to obtain dramatic advantages for such tasks.

We review the area of quantum communication complexity, and show how it connects the foundational physics questions regarding non-locality with those of communication complexity studied in theoretical computer science. The first examples exhibiting the advantage of the use of qubits in distributed information-processing tasks were based on non-locality tests. However, by now the field has produced strong and interesting quantum protocols and algorithms of its own that demonstrate that entanglement, although it cannot be used to replace communication, can be used to reduce the communication exponentially. In turn, these new advances yield a new outlook on the foundations of physics, and could even yield new proposals for experiments that test the foundations of physics.
∗ CWI and University of Amsterdam. Partially supported by a Vici grant from the Netherlands Organization for Scientific Research (NWO), and by the European Commission under the Integrated Project Qubit Applications (QAP) funded by the IST directorate as Contract Number 015848.
† Institute for Quantum Computing and School of Computer Science, University of Waterloo, and Perimeter Institute for Theoretical Physics. Partially supported by Canada’s NSERC, CIFAR, QuantumWorks, MITACS, and the U.S. ARO.
‡ Laboratoire d’Information Quantique, CP 225, Université Libre de Bruxelles (U.L.B.), Boulevard du Triomphe, B-1050 Bruxelles, Belgium. Partially supported by the Interuniversity Attraction Poles Programme - Belgian State - Belgian Science Policy under grant IAP6-10 and by the EU project QAP contract 015848.
§ CWI Amsterdam. Partially supported by a Vidi grant from the Netherlands Organization for Scientific Research (NWO), and by the European Commission under the Integrated Project Qubit Applications (QAP) funded by the IST directorate as Contract Number 015848.

Contents

A Nayak’s Proof of a Consequence of Holevo’s Bound
B Rectangles and the Lower Bound for Distributed Deutsch-Jozsa
  B.1 Rectangles
  B.2 Randomized protocols
  B.3 Discrepancy of the inner product function
  B.4 The lower bound for the Distributed Deutsch-Jozsa problem
C Razborov’s Lower Bound for the Quantum Communication Complexity of Intersection
  C.1 The Kremer-Razborov-Yao lemma and its consequences
  C.2 Translation from protocols to polynomials
  C.3 The quantum lower bound for Intersection
D Asymmetric Detection Efficiency

During the last decades of the twentieth century it was realized that information processing at the quantum level could offer tremendous advantages over conventional “classical” information processing. Quantum information admits extremely efficient algorithms, such as Shor’s factoring algorithm [123], and qualitatively superior cryptographic protocols, such as the BB84 key distribution protocol [17]. Many other works helped put this field on solid foundations. Quantum error-correcting codes and fault-tolerant quantum computation showed that these beautiful ideas could in principle be realized experimentally. These codes, combined with Holevo’s Theorem, Schumacher compression, and entanglement distillation (which are analogs of Shannon’s noiseless coding theorem) gave us the foundations of an information theory pertaining to quantum systems in terms of quantum bits, or qubits, and entanglement that is measured (in the bipartite case) in entanglement bits, or ebits. These discoveries generated huge excitement. By now quantum information has become a well-established field, and there are many reviews and textbooks to which we refer the reader for background information. See for example [103].

In view of the advantages that quantum information offers for computation and cryptography, it is natural to enquire whether quantum information is also a superior medium for efficient communication. In this article we will review progress on this specific question, and its relation to the problem of quantum non-locality which has fascinated physicists for decades.

On the face of it, there are important reasons for doubting that quantum information provides such a communication efficiency advantage. Many years before the “quantum information” discipline took hold on a large scale, Holevo [75] proved an important theorem about the classical information capacity of quantum channels.
Holevo’s Theorem, as it is now called, states that, for any classical message, the cost of transmitting it from one party (Alice) to another party (Bob) in terms of quantum bits (qubits) is the same as the cost of transmitting it in terms of classical bits. If the task requires k bits on average, then it also requires k qubits on average. The latter consequence of Holevo’s Theorem can be proven quite simply using a different approach [99], and this proof is reproduced in Appendix A. Thus one would naively expect that quantum information cannot provide a communication efficiency advantage. This intuition turns out to be wrong. Tremendous communication savings are possible with the use of quantum information, as explained in the next section.

To understand why quantum information can provide a communication advantage without contradicting Holevo’s Theorem, it is necessary to consider more precisely the various scenarios that can be associated with “communication”.

The simplest scenario, corresponding to the case covered by Holevo’s Theorem, is illustrated in Fig. 1. There are two parties that we refer to as Alice and Bob. Alice has an n-bit string x that she would like to convey to Bob by sending one message. Here it is indeed true, by Holevo’s Theorem [75], that quantum messages are no more efficient than classical messages. Alice must send n qubits to accomplish this specific task.

Figure 1: The basic communication scenario: Alice receives an n-bit string x ∈ {0,1}^n as input and sends one message to Bob, who must output x. For this task, a quantum message is no more efficient than a classical message.

A variant of the communication scenario is where Bob’s goal is not to determine Alice’s data x, but to determine some information that is a function of x in a way that may depend on other data y that resides with Bob (while y is unknown to Alice).
Such a scenario could occur when Alice and Bob each begin with n-bit strings, x and y, respectively (Alice knows x but not y and Bob knows y but not x), and the goal is for Bob to determine the value of some function f(x, y) (where f is known to both parties). An example where such a scenario could arise is where Alice and Bob are interested in scheduling an appointment. Alice’s schedule could be represented by x and Bob’s by y: if there are n time-slots, then we can set the i-th bit of x to 1 if Alice is available in time-slot i, and similarly for y. How much communication is required for Bob to find a time when they are both available (i.e., an i such that x_i = y_i = 1)? We shall see that, for this communication scenario, quantum information enables Alice and Bob to accomplish the task with less (asymptotically less in the number of time-slots) qubit communication than would be required by any protocol that is restricted to classical bit communication.

Figure 2: The basic communication complexity scenario: Alice and Bob receive n-bit strings, x ∈ {0,1}^n and y ∈ {0,1}^n respectively, as input and their goal is to compute some function of these values, f(x, y), as Bob’s output. There are tasks of this form where communication in terms of quantum messages is much more efficient than communication in terms of classical messages. The number of qubits can be exponentially smaller than the number of bits. Note that in this framework we do not take into account the time and other resources that Alice and Bob spend locally (although in practice it turns out that their local computations are almost always efficient).

This kind of scenario, illustrated in Fig. 2 (for general functions or relations f on {0,1}^n × {0,1}^n), is known as communication complexity. It has been extensively studied in the classical case.
Indeed, whereas the trivial solution to this problem is for Alice to send Bob her input x, and for Bob to compute f(x, y), it is often possible for Bob to compute f with much less than n bits of classical communication. These savings in classical communication are very interesting both from a practical and a conceptual point of view. Section 3 outlines several of the key results in the area, and we refer the reader to the textbooks [83, 77] for further information.

When Alice and Bob can communicate qubits, further reductions in the amount of communication are possible, sometimes even exponential reductions. This remarkable situation is clearly worthy of further study. It is one of the main subjects covered by the present review, and we will see many examples later.

Long before the work on quantum communication complexity mentioned in the previous section, physicists investigating the foundations of quantum mechanics studied the scenario where local measurements are carried out on two entangled particles. Such entangled states can (at least in principle) be easily produced by having the particles interact together for some time, and then sending the particles away to far-off locations. Local measurements are then carried out on the particles. This scenario was first studied by Einstein, Podolsky, and Rosen [60] and immediately afterwards by Schrödinger [118, 119] (who coined the word entanglement). In these works it was realized that the results of the local measurements would exhibit very interesting correlations. For instance, for some pairs of the measurements, the results may be always the same; for other pairs of measurements, the results may be always opposite, etc.

Nevertheless, one can easily show (this follows immediately from the structure of quantum mechanics) that the parties carrying out the measurements cannot use the entangled particles to communicate to each other.
More precisely, if two physically separated parties, Alice and Bob, initially possess entangled particles and then Alice is given an arbitrary bit x, there is no way for Alice to manipulate her particles in order to convey any information about x to Bob when he performs measurements on his particles.

Given that these correlations cannot be used for communication, one would naively expect that if a (quantum or classical) model can reproduce these correlations, then it is not necessary for that model to use communication. This is indeed the case in the quantum scenario where, having established the entanglement through some interaction in the past, no communication is needed at the time of the measurement. But if one wants to reproduce these correlations in a purely classical model, then classical communication between the parties is required at the moment of the measurements! This situation is even more surprising if the particles are widely separated from each other and the measurements take place during a very short time interval, so short that the two measurement events are space-like separated. In this case the communication would have to occur faster than the speed of light!

This remarkable feature of quantum mechanics was discovered by Bell [13], and is now known as “quantum non-locality”. It has been the subject of much further theoretical and experimental study since. Indeed it is one of the most surprising and counter-intuitive features of quantum mechanics. Bell’s Theorem shows that Einstein’s program of trying to rationalize quantum mechanics by reducing it to classical mechanics is doomed to failure, as it cannot be done without giving up another cornerstone of twentieth century physics (discovered by Einstein himself), namely the fact that information cannot travel faster than the speed of light. More recently, another reason why such a reduction is doomed emerged through the study of quantum information.
Namely, we expect any such classical description of quantum mechanics to be exponentially inefficient, i.e., to use exponentially more resources than the quantum theory. We will discuss quantum non-locality extensively in the present review, focusing on its connection to communication complexity.

The reason why in this review we deal with quantum communication complexity and quantum non-locality together is that these two topics are intimately related. Indeed they can be formulated in a unified way, and furthermore many questions can be mapped from one topic to the other. In fact, during the past dozen years an intense cross-fertilization has occurred between these two fields, which has considerably enriched both of them.

To see the unity between the two subjects, recall that in both cases the parties, Alice and Bob, are given some inputs, x and y. In one case these inputs correspond to the arguments of the function that must be computed. In the other case these inputs correspond to a description of the measurements that must be carried out on the particles (the “measurement settings”). And in both cases Alice and Bob must provide an output, a and b. In communication complexity we require that b = f(x, y) and a is irrelevant; in non-locality we are interested in the correlations between a, b and x, y (for instance we request that a = b when x and y have certain values and that a ≠ b when x and y have some other values). We can unify these descriptions by saying that the aim in both cases is to produce a joint probability distribution P(a, b | x, y) of the outputs given the inputs, such that P(a, b | x, y) has certain desirable properties.

In both communication and non-locality, the basic question one wants to answer is: what is the minimum amount of resources necessary to reproduce the distribution P(a, b | x, y), and how does this amount change when one changes the model, i.e., when one changes the type of resource that can be used.
There are in fact many different types of resources that can be compared, and we now briefly review them. We will come back to them in more detail in the body of the review.

• Quantum communication. The parties are allowed to send each other quantum states. One quantifies the amount of communication by the number of qubits sent.

• Classical communication. The parties are allowed to send each other classical communication. One quantifies the amount of communication by the number of bits sent.

• Entanglement. The parties share entangled states. One quantifies the amount of entanglement by the number of qubits that the state locally consists of. For example we frequently use maximally entangled states of 2 qubits, called ebits (also known as EPR pairs after [60]), (1/√2)(|0⟩|0⟩ + |1⟩|1⟩) or something that can be obtained from this with local operations.

• Shared randomness. The parties have randomness, i.e., they are allowed to toss coins. In the case of shared randomness, the parties both share the same string of coins. This could for instance be implemented by having the parties toss the coins beforehand, at some earlier time when they are together, and then use the coins later when they need to solve the communication complexity problem.

• Local randomness. The parties have randomness, i.e., they are allowed to toss coins. In the case of local randomness the coins are tossed locally, and the string of outcomes of the coins for Alice is independent of the string of outcomes of the coins for Bob.

The rationale for measuring classical information in terms of bits is Shannon’s noiseless coding theorem [121], which states that, asymptotically, the information produced by a stochastic source can be encoded in a number of bits equal to the entropy of the source. This is paralleled in the quantum case by Schumacher compression [120], which states that, asymptotically, the information produced by a stochastic quantum source can be encoded into a number of qubits equal to the von Neumann entropy of the source. And it is paralleled in the case of entanglement by entanglement distillation, namely the fact that pure two-party entangled states can, asymptotically in the number of copies of the state, be converted into a number of ebits equal to the von Neumann entropy of the reduced density matrix of each party [16]. In the context of communication complexity, however, we are not dealing with the asymptotic limit of large amounts of communication or large amounts of entanglement. Thus whereas in most cases we will keep the basic concepts of bits, qubits and ebits, it could be relevant in specific cases to consider variants on these resources, such as trits, non-maximally entangled states, etc.

The above resources have been ordered (more or less) from the strongest to the weakest. Indeed most of these resources imply the ones below them. For instance one can send classical information using qubits; one can use quantum communication to distribute entanglement; one can measure the entangled particles to produce shared randomness, etc. The only case where the ordering is not so clear is between classical communication and entanglement. Indeed if two parties share an entangled state, they cannot use it to communicate (as discussed above).
But on the other hand (as discussed below) sharing n ebits may allow one to save an exponentially large (in n) amount of bits in some communication scenarios (whereas in all other cases, n uses of one resource allows one to implement n uses of the resources below it).

There are also a number of nontrivial ways in which these resources can be substituted one for the other. Quantum teleportation allows one to substitute one ebit and two bits of classical communication for one qubit of quantum communication [14]. Dense coding shows that sharing one ebit and then communicating one qubit allows one to communicate two bits [15].
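The dense coding substitution just described can be checked with a small simulation. The following is a minimal sketch in plain Python, not taken from the review: it assumes the standard textbook protocol (Alice applies X and/or Z to her half of the ebit, Bob decodes with a CNOT followed by a Hadamard), and the helper names `apply_to_alice`, `cnot_alice_controls_bob`, and `dense_coding` are hypothetical.

```python
import math

# Two-qubit states are lists of 4 real amplitudes indexed by 2*a + b,
# where a is Alice's qubit and b is Bob's qubit.
S = 1 / math.sqrt(2)
EBIT = [S, 0.0, 0.0, S]  # (|00> + |11>) / sqrt(2)

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
H = [[S, S], [S, -S]]

def apply_to_alice(gate, state):
    """Apply a 2x2 gate to Alice's qubit (the high-order bit)."""
    out = [0.0] * 4
    for a in range(2):
        for b in range(2):
            for a_new in range(2):
                out[2 * a_new + b] += gate[a_new][a] * state[2 * a + b]
    return out

def cnot_alice_controls_bob(state):
    """CNOT with Alice's qubit as control: swaps the |10> and |11> amplitudes."""
    return [state[0], state[1], state[3], state[2]]

def dense_coding(bit1, bit2):
    """Alice encodes two bits in her half of one ebit; Bob decodes both bits."""
    state = EBIT
    if bit2:
        state = apply_to_alice(X, state)
    if bit1:
        state = apply_to_alice(Z, state)
    # Alice sends her single qubit to Bob, who now holds both qubits.
    state = cnot_alice_controls_bob(state)
    state = apply_to_alice(H, state)
    probs = [amp * amp for amp in state]
    outcome = max(range(4), key=lambda i: probs[i])
    return outcome >> 1, outcome & 1, probs[outcome]

if __name__ == "__main__":
    for b1 in range(2):
        for b2 in range(2):
            print((b1, b2), "->", dense_coding(b1, b2))
```

In this sketch the only transmitted object is Alice's one qubit, yet Bob's measurement outcome determines both of her bits with probability 1 (up to floating-point rounding).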

Newman’s Theorem states that in the context of communication complexity, having shared randomness can save only a small amount of communication compared to having local randomness [101].

In addition we will at some points in this review consider additional (more specialized or more exotic) resources. For instance one can consider

• One-way classical or quantum communication. Alice is allowed to communicate to Bob, but Bob is not allowed to communicate back to Alice.

• Simultaneous Message Passing model. In this model there is a third party, called the Referee, and messages are only allowed from Alice to the Referee and from Bob to the Referee. It is the Referee who has to compute the value of the function f(x, y).

• Multipartite entanglement. Sometimes one is interested in non-locality or communication complexity between more than two parties. Contrary to bipartite entanglement where it is sufficient to consider ebits, there are many kinds of multiparticle entanglement (such as GHZ states, W states, etc.) which could be useful for solving different communication problems.

• Non-local (or PR) boxes. This exotic resource is intermediate between an ebit and a bit. Indeed, it is a resource which does not enable the parties to communicate (in the same way that entanglement does not allow communication). But to be produced physically it requires a bit of communication between the parties at the moment it is used (contrary to entanglement which once established requires no more communication). Its study provides a deeper understanding of the power and limitations of quantum entanglement in communication complexity.
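The two properties attributed to non-local boxes in the last item can be illustrated numerically. The sketch below is an illustration under an assumption not spelled out in the list above: it uses the standard definition of a PR box, whose outputs a, b are uniformly random subject to a ⊕ b = x ∧ y. The function names are hypothetical. It checks that this distribution P(a, b | x, y) does not allow signalling, and that no pair of deterministic local strategies (without communication) reproduces it on all four inputs.

```python
from fractions import Fraction
from itertools import product

def pr_box(a, b, x, y):
    """P(a, b | x, y) for a PR box: uniform over outputs with a XOR b == x AND y."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)

def is_no_signalling(p):
    """True if each party's marginal does not depend on the other party's input."""
    for a, x in product(range(2), repeat=2):
        marginals = {sum(p(a, b, x, y) for b in range(2)) for y in range(2)}
        if len(marginals) != 1:
            return False
    for b, y in product(range(2), repeat=2):
        marginals = {sum(p(a, b, x, y) for a in range(2)) for x in range(2)}
        if len(marginals) != 1:
            return False
    return True

def best_local_deterministic_score():
    """Largest number of inputs (x, y), out of 4, on which local deterministic
    outputs a = f(x), b = g(y) satisfy a XOR b == x AND y."""
    best = 0
    for a0, a1, b0, b1 in product(range(2), repeat=4):
        f, g = (a0, a1), (b0, b1)
        score = sum((f[x] ^ g[y]) == (x & y)
                    for x, y in product(range(2), repeat=2))
        best = max(best, score)
    return best

if __name__ == "__main__":
    print("no-signalling:", is_no_signalling(pr_box))
    print("best local score out of 4:", best_local_deterministic_score())
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison of marginals is not affected by rounding.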

The basic question asked in communication complexity and quantum non-locality is to understand how much of these resources are required in different situations.

Thus classical communication complexity [83] is basically concerned with understanding how much classical communication is required to compute the value of a function f(x, y), possibly using (shared or local) randomness.

In quantum communication complexity the parties are trying to compute the value of f, but may now use quantum resources. In the quantum communication model, introduced by Yao [137], they can communicate qubits, and in the entanglement model, introduced by Cleve and Buhrman [45], the parties share entangled particles and are allowed to communicate classical bits. When one extends the quantum communication model of Yao such that the parties also share entangled particles, quantum teleportation shows that these two models are essentially equivalent: one qubit in the first model can be replaced by two bits and one ebit in the entanglement model, and conversely one bit can be simulated by one qubit. It is, however, a challenging open problem whether the quantum communication model, without shared entanglement, is essentially equivalent to the entanglement model.

Non-locality, although at first sight a very different topic, is also concerned with comparing resources. Indeed the basic question in this area is to compare:

• The correlations that can be obtained if the parties share entanglement and carry out local measurements on their particles, but are not allowed any communication.

• The correlations that can be obtained if the parties have shared randomness, but are not allowed any communication. This is known in the physics literature as a local hidden variable model.

Bell’s Theorem states that these two scenarios are not equivalent: shared randomness alone is not sufficient to reproduce the quantum correlations.

Thus quantum communication complexity, classical communication complexity, and non-locality can be put in a unified framework in which similar kinds of resources are compared. In addition, in some cases there exist mappings between quantum communication complexity scenarios and non-locality scenarios.

The simplest such mapping occurs in the entanglement model if the parties can solve the communication complexity problem more efficiently using entanglement than without entanglement, and if this can be done by measuring their entangled particles before they communicate to each other. Then it immediately follows that the correlations obtained by measuring their entangled particles (but without communicating) cannot be realized in a local hidden variable model.

Conversely it is possible to map any non-locality experiment to a communication complexity problem in the entanglement model. This was the approach used in the original paper [45]. It mapped the non-local correlations that arise in the GHZ paradox to a communication complexity problem. This approach has since been generalized [26], although in the resulting communication complexity problem the function f(x, y) is only computed successfully by the parties with non-zero probability.

Another mapping can occur in the quantum communication model when one-way quantum communication from Alice to Bob is more efficient than classical communication. Then it is often possible to construct from the communication complexity problem a nontrivial non-locality scenario. This approach has yielded some very interesting non-locality scenarios which we will describe in detail below.

1.8 Summary of the review

In this review we will present some of the main results obtained so far in the field of quantum communication complexity. We start by introducing quantum non-locality in Section 2, focusing on its relation with communication complexity.
We present simple examples such as the GHZ paradox, the CHSH example, and the magic square game, rephrasing them in the language of data processing. Next we present quantum communication complexity in Section 3, illustrating it with examples such as the distributed Deutsch-Jozsa problem, the intersection problem, Raz’s problem, and the hidden matching problem. In Section 4 we unite these two approaches, showing how some of the examples from quantum communication complexity can be used to derive new non-locality games. In Section 5 we discuss another model of communication complexity, the simultaneous message passing model, and show how classical communication, entanglement, and quantum communication can be traded one for the other in this model. In Section 6 we discuss several additional aspects of quantum non-locality, such as non-local boxes, Tsirelson bounds, and simulation of quantum correlations using classical resources. Finally we consider in Section 7 experimental issues, in particular the detection loophole, and present the outlook for future experiments. We conclude by discussing some open questions in the field. The interested reader can also consult the earlier review [20] which covers some of the material presented here.

The idea of non-locality was originally concerned with the possibility that quantum mechanics is actually a classical theory that depends on “hidden variables” whose values might be discovered in the future as part of some successor theory to quantum mechanics. Bell [13] proposed a hypothetical experiment for ruling out such classical theories under the assumption that measurements of quantum systems can occur at different points in space-time, and information cannot be transmitted faster than the speed of light.

Another way of interpreting Bell’s experiment is as a method for two (or more) cooperating distributed parties to compute some sort of input-output relation, where each party receives input data and must produce output data consistent with the relation. In Bell’s experiment, there is such a task that cannot be accomplished in a setting where the information processing resources are all classical. In contrast, the task can be accomplished if the parties share prior entanglement.

Since Bell’s seminal work, the concept of quantum non-locality has been extensively studied, by physicists, philosophers, and more recently by computer scientists. Some of the important early advances have been the Clauser-Horne-Shimony-Holt (CHSH) inequality [44], which allows Bell’s surprising predictions to be tested even in the presence of noise; and the GHZ-Mermin scenario [72, 94], which was the first “pseudo-telepathy” game.
More recently there has been a more or less systematic enumeration of Bell inequalities for small numbers of settings and/or outcomes (see, e.g., [49, 48, 134, 141]); the study of the statistical power of non-locality tests [52]; an understanding of the limits to quantum non-locality (Tsirelson-type bounds) [43] as compared to the larger world of correlations obeying only the no-signalling conditions (e.g., non-local boxes); investigations of the power of non-locality in cryptographic settings [11], etc.

In the next paragraphs we review various non-locality scenarios, casting them in the language of data processing. The reader wishing to complement this overview could consult two recent reviews, written more from physics [135] and computer science [21] perspectives.

2.1 GHZ: Greenberger-Horne-Zeilinger and Mermin

The following scenario essentially underlies those of [72, 94], but is cast in the language of data processing. The basic structure is illustrated in Fig. 3. Three physically separated parties (call them Alice, Bob, and Carol) receive input bits s, t, and u, respectively, which are arbitrary subject to the condition that s ⊕ t ⊕ u = 0 (⊕ denotes exclusive or, which is the sum of its arguments in modulo 2 arithmetic). Once they receive their input data, they are forbidden from having any communication between them. Their goal is to produce output bits a, b, and c, respectively, such that

$$a \oplus b \oplus c = \begin{cases} 0 & \text{if } stu = 000 \\ 1 & \text{if } stu \in \{011, 101, 110\}. \end{cases} \tag{1}$$

Figure 3: The general form of a non-locality scenario involving three parties: Alice, Bob, and Carol receive inputs s, t, u respectively, and are required to produce outputs a, b, c, respectively, satisfying certain conditions. Once the inputs are received, no communication is permitted between the parties. For the specific GHZ scenario, it is possible to accomplish the task if the parties are in possession of a tripartite entangled state. Without the prior entanglement, it is impossible to accomplish the task.

Note that the task that the three parties are trying to accomplish is the computation of a relation, where there are three input bits (stu) and three output bits (abc). The task is nontrivial in light of the fact that the input bits are distributed among the parties so that each party is given the value of only one of them; the output bits are also distributed.

The first observation is that with classical resources there must be communication among the three parties to succeed. To see why this is so, first consider deterministic strategies (later we will analyze the case of probabilistic strategies, where the parties behave stochastically, i.e., they can flip coins). Since Alice cannot receive any information from Bob or Carol, her output bit a can depend only on the value of her input bit s. Let a_0 (respectively a_1) be Alice’s output when her input bit is 0 (respectively 1). Similarly, let b_0, b_1 and c_0, c_1 be Bob and Carol’s outputs for their respective input values.
Note that the six bits $a_0$, $a_1$, $b_0$, $b_1$, $c_0$, $c_1$ completely characterize any deterministic strategy of Alice, Bob, and Carol. The conditions of the problem translate into the equations

\[ \begin{aligned} a_0 \oplus b_0 \oplus c_0 &= 0, \\ a_0 \oplus b_1 \oplus c_1 &= 1, \\ a_1 \oplus b_0 \oplus c_1 &= 1, \\ a_1 \oplus b_1 \oplus c_0 &= 1. \end{aligned} \tag{2} \]

It is impossible to satisfy all four equations simultaneously. This is because summing the four equations modulo two yields $0 = 1$: each of the six bits appears exactly twice on the left-hand side (recall that $1 + 1 = 0$ modulo 2), while the right-hand sides sum to 1. Therefore, for any strategy, there exists an input configuration $stu \in \{000, 011, 101, 110\}$ for which it fails. Note however that for any three out of the four equations from (2) there is a strategy that satisfies these three equations perfectly.

To see why probabilistic strategies cannot succeed either, note that any such strategy can be modeled as a deterministic strategy where Alice, Bob, and Carol have access to a random variable $r$ (for example, $r$ could be the outcomes of a sequence of uniformly distributed random bits). This $r$ is sometimes referred to as a "local hidden variable". It is assumed that the testing procedure does not have access to $r$, so that the input bits ($stu$) are uncorrelated with $r$. The intuitive way of thinking about this scenario is that the three parties get together before the game starts, randomly select $r$, and then each party secretly keeps a copy of this information. An example of a probabilistic strategy is for $r \in \{00, 01, 10, 11\}$ to be two uniformly random bits that specify which three of the four equations in (2) are satisfied. This probabilistic strategy succeeds with probability 3/4. To see that no probabilistic strategy does better, suppose that the input $stu$ is uniformly distributed over $\{000, 011, 101, 110\}$.
Then the success probability that any randomized protocol achieves is

\[ \sum_r q_r \, \frac{1}{4} \sum_{s,t,u} P(s, t, u, r), \tag{3} \]

where $q_r$ is the probability (of the shared randomness) that the parties flip $r$, and $P(s, t, u, r) = 1$ if the deterministic protocol corresponding to $r$ is correct on input $stu$ and $P(s, t, u, r) = 0$ otherwise. Clearly this is bounded above by

\[ \max_r \frac{1}{4} \sum_{s,t,u} P(s, t, u, r), \tag{4} \]

which by the above discussion is at most 3/4.

Now suppose that the three parties each hold one qubit of the entangled state

\[ \tfrac{1}{2} \left( |000\rangle - |011\rangle - |101\rangle - |110\rangle \right). \tag{5} \]

(This is an entangled state that is equivalent, under local unitary operations, to the so-called GHZ state $\tfrac{1}{\sqrt{2}} |000\rangle + \tfrac{1}{\sqrt{2}} |111\rangle$.) The parties are allowed to apply unitary transformations and perform measurements on their individual qubits, but communication between the parties is still forbidden. It turns out that now the parties can produce $a$, $b$, $c$ satisfying Eq. (1). This is achieved by the procedure that follows.

The procedure for Alice is to measure her qubit in the computational basis (consisting of $|0\rangle$ and $|1\rangle$) if her input bit $s$ is 0, and to measure her qubit in the Hadamard basis (consisting of $H|0\rangle = \tfrac{1}{\sqrt{2}} (|0\rangle + |1\rangle)$ and $H|1\rangle = \tfrac{1}{\sqrt{2}} (|0\rangle - |1\rangle)$) if her input bit is 1. In either case, she sets her output bit $a$ to the outcome of her measurement. The procedures for Bob and Carol are similar to that of Alice, but with Bob's bits being $t$ and $b$, and Carol's bits being $u$ and $c$.

To see why the described procedure always produces output bits $abc$ satisfying Eq. (1), consider the various cases of the input possibilities $stu$. In the case where $stu = 000$, the state is measured in the computational basis, so clearly the outcomes are from $\{000, 011, 101, 110\}$, and hence satisfy $a \oplus b \oplus c = 0$.
The case where $stu = 011$ can be analyzed by assuming that a Hadamard transform is applied to the last two qubits of the state prior to a measurement in the computational basis. Since

\[ \begin{aligned} (I \otimes H \otimes H) \, \tfrac{1}{2} \left( |000\rangle - |011\rangle - |101\rangle - |110\rangle \right) &= (I \otimes H \otimes H) \, \tfrac{1}{2} \left( |0\rangle (|00\rangle - |11\rangle) - |1\rangle (|01\rangle + |10\rangle) \right) \\ &= \tfrac{1}{2} \left( |0\rangle (|01\rangle + |10\rangle) - |1\rangle (|00\rangle - |11\rangle) \right) \\ &= \tfrac{1}{2} \left( |001\rangle + |010\rangle - |100\rangle + |111\rangle \right), \end{aligned} \tag{6} \]

every possible outcome has odd parity, so $a \oplus b \oplus c = 1$, as required, in this case. The remaining cases where $stu = 101$ and $110$ are similar by the symmetry of the entangled state and protocol.

We have shown that the entangled state enables the three parties to correlate their output bits with their input bits in a manner that is impossible to achieve with classical resources, unless there is communication among the parties. It should be noted that, in accomplishing this task using the entangled state, no actual communication occurs among the parties. In particular, the output bits $a$, $b$, and $c$ individually contain no information about $stu$; they are uniformly distributed in all cases. It is only the trivariate correlations among $a$, $b$, and $c$ that are related to the input data $stu$.

The following scenario essentially underlies that of [44] but is cast in the language of data processing. The basic structure is illustrated in Fig. 4. Alice and Bob receive input bits $s$ and $t$, respectively, and, after this, they are forbidden from communicating with each other. Their goal is to produce output bits $a$ and $b$, respectively, such that

\[ a \oplus b = s \wedge t, \tag{7} \]

('$\wedge$' is the logical and, which is 1 if all its arguments are 1, and which is 0 otherwise) or, failing that, to satisfy this condition with as high a probability as possible.

[Figure 4: The non-locality scenario involving two parties: Alice and Bob receive inputs $s$ and $t$ respectively, and are required to produce outputs $a$ and $b$ respectively, satisfying certain conditions. Once the inputs are received, no communication is permitted between the parties. For the specific CHSH scenario, it is possible to accomplish the task with probability $\cos^2(\pi/8) = 0.853\ldots$ if the parties are in possession of an ebit. Without the prior entanglement, the highest possible success probability is 3/4.]

To analyze the situation in terms of classical information, first again consider the case of deterministic strategies. For these, Alice's output bit depends solely on her input bit $s$, and similarly for Bob. Let $a_0$, $a_1$ be the two possibilities for Alice and $b_0$, $b_1$ be the two possibilities for Bob. These four bits completely characterize any deterministic strategy. Condition (7) translates into the equations

\[ \begin{aligned} a_0 \oplus b_0 &= 0, \\ a_0 \oplus b_1 &= 0, \\ a_1 \oplus b_0 &= 0, \\ a_1 \oplus b_1 &= 1. \end{aligned} \tag{8} \]

It is impossible to satisfy all four equations simultaneously (since summing them modulo 2 yields $0 = 1$). Therefore it is impossible to satisfy Condition (7) absolutely. By using a probabilistic strategy, Alice and Bob can satisfy Condition (7) with probability 3/4. For such a strategy, we allow Alice and Bob to have a priori classical random variables, whose distribution is independent of that of the inputs $s$ and $t$. Note that any three of the four equations of (8) can be simultaneously satisfied. The probabilistic classical strategy works as follows. Alice and Bob have uniformly distributed random bits that are used to specify which of the four equations of (8) is violated, and then play the strategy that satisfies the other three perfectly. It is easy to see that (a) for any input $st$, the resulting outputs satisfy Condition (7) with probability 3/4, and (b) this is optimal in that no probabilistic strategy can attain a success probability greater than 3/4.

Now suppose that Alice and Bob each hold one qubit of the entangled state

\[ \tfrac{1}{\sqrt{2}} \left( |00\rangle - |11\rangle \right). \tag{9} \]

It turns out that now the parties can produce data that satisfies Condition (7) with probability $\cos^2(\pi/8) = 0.853\ldots$, which is higher than what is possible in the classical case. This is achieved by the following procedures. Denote the unitary operation that rotates the qubit by angle $\theta$ by

\[ R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]

(where we have written it out in the computational basis). Alice applies one of two rotations on her qubit, depending on her input bit $s$: if $s = 0$ the rotation is $R(-\pi/16)$; if $s = 1$ the rotation is $R(3\pi/16)$. She then measures her qubit in the computational basis and sets her output bit $a$ to the result. Bob's procedure is the same, depending on his input bit $t$. It is straightforward to calculate that, if Alice rotates by $\theta_1$ and Bob rotates by $\theta_2$, then the entangled state becomes

\[ \tfrac{1}{\sqrt{2}} \left( \cos(\theta_1 + \theta_2) (|00\rangle - |11\rangle) + \sin(\theta_1 + \theta_2) (|01\rangle + |10\rangle) \right). \tag{10} \]

After the measurements, the probability that $a \oplus b = 0$ is $\cos^2(\theta_1 + \theta_2)$. It is now a straightforward exercise to verify that Condition (7) is satisfied with probability $\cos^2(\pi/8)$ for all four input possibilities: the angle sum $\theta_1 + \theta_2$ is $-\pi/8$ for $st = 00$, $\pi/8$ for $st = 01$ and $10$, and $3\pi/8$ for $st = 11$, where $\sin^2(3\pi/8) = \cos^2(\pi/8)$.

.3 Tsirelson's upper bound for CHSH
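Before turning to the bound itself, the two CHSH values being compared can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the shared state $\tfrac{1}{\sqrt{2}}(|00\rangle - |11\rangle)$ and the rotation angles $-\pi/16$ and $3\pi/16$ of the protocol in the previous subsection. It enumerates the 16 deterministic classical strategies and simulates the rotation protocol directly on the four amplitudes of the state.

```python
import math
from itertools import product


def classical_best():
    """Best success probability over all deterministic strategies, uniform inputs."""
    best = 0.0
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        a, b = (a0, a1), (b0, b1)
        wins = sum((a[s] ^ b[t]) == (s & t) for s, t in product((0, 1), repeat=2))
        best = max(best, wins / 4)
    return best


def rotation(theta):
    """The matrix R(theta) in the computational basis."""
    return ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta), math.cos(theta)))


def angle(bit):
    """Rotation angle used by either party: -pi/16 on input 0, 3*pi/16 on input 1."""
    return -math.pi / 16 if bit == 0 else 3 * math.pi / 16


def quantum_success(s, t):
    """Probability that a XOR b = s AND t when both parties rotate and then measure."""
    r1, r2 = rotation(angle(s)), rotation(angle(t))
    start = {(0, 0): 1 / math.sqrt(2), (1, 1): -1 / math.sqrt(2)}
    prob = 0.0
    for a, b in product((0, 1), repeat=2):
        # Amplitude of |ab> after applying R(theta_1) tensor R(theta_2).
        amp = sum(c * r1[a][i] * r2[b][j] for (i, j), c in start.items())
        if (a ^ b) == (s & t):
            prob += amp ** 2
    return prob


if __name__ == "__main__":
    print("classical optimum:", classical_best())
    for s, t in product((0, 1), repeat=2):
        print(f"input {s}{t}: quantum success {quantum_success(s, t):.6f}")
    print("cos^2(pi/8)      :", math.cos(math.pi / 8) ** 2)
```

Simulating the amplitudes, rather than plugging angles into Eq. (10), gives an independent check of that formula as well.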

Although the protocol in the previous subsection using entanglement has a higher success probability ($\cos^2(\pi/8) = 0.853\ldots$) than any classical protocol (3/4), it still falls short of 1. In fact no protocol using entanglement can exceed success probability $\cos^2(\pi/8)$, as we now show. Let the entangled state shared by Alice and Bob be an arbitrary state $|\psi\rangle_{AB}$. An arbitrary strategy for Alice that uses this entangled state can be represented by two observables $A_0$ and $A_1$, each with eigenvalues in $\{+1, -1\}$. (An observable is a Hermitian operator. One associates to an observable a projective measurement, with one projector for each of the eigenspaces of the observable.) When Alice's input bit is 0, she obtains her output bit by applying the projective measurement corresponding to the eigenspaces of $A_0$ to the component of $|\psi\rangle_{AB}$ in her possession. The $+1$-eigenspace of $A_0$ corresponds to output bit 0, while the $-1$-eigenspace corresponds to output bit 1. When her input bit is 1, she does the same with $A_1$. Similarly, an arbitrary strategy for Bob can be represented by two observables $B_0$ and $B_1$.

At this point, the reader might object that $|\psi\rangle_{AB}$, $A_0$, $A_1$, $B_0$, and $B_1$ do not capture every possible strategy of Alice and Bob, since they need not be limited to applying projective measurements. Although non-projective measurements may be used, such measurements can always be simulated by projective measurements in a larger Hilbert space. Thus, no generality has been lost because any strategy can be converted to the above form.

Since the observables have eigenvalues in $\{+1, -1\}$ rather than $\{0, 1\}$, it is more convenient here to think of Alice and Bob's output bits in these terms as $a' = (-1)^a$ and $b' = (-1)^b$, respectively. Then the protocol succeeds on input $st$ if and only if $(-1)^{s \wedge t} \cdot a' \cdot b' = 1$.

If $s$ and $t$ are randomly chosen according to the uniform distribution, then the expected value of $(-1)^{s \wedge t} \cdot a' \cdot b'$ is

\[ \langle\psi|_{AB} \left( \tfrac{1}{4} A_0 \otimes B_0 + \tfrac{1}{4} A_0 \otimes B_1 + \tfrac{1}{4} A_1 \otimes B_0 - \tfrac{1}{4} A_1 \otimes B_1 \right) |\psi\rangle_{AB}, \tag{11} \]

and is therefore upper bounded by the largest eigenvalue of

\[ M = \tfrac{1}{4} A_0 \otimes B_0 + \tfrac{1}{4} A_0 \otimes B_1 + \tfrac{1}{4} A_1 \otimes B_0 - \tfrac{1}{4} A_1 \otimes B_1. \tag{12} \]

Using $A_0^2 = A_1^2 = I$ and $B_0^2 = B_1^2 = I$, it is straightforward to calculate that

\[ M^2 = \tfrac{1}{4} I - \tfrac{1}{16} (A_0 A_1) \otimes (B_0 B_1) + \tfrac{1}{16} (A_0 A_1) \otimes (B_1 B_0) + \tfrac{1}{16} (A_1 A_0) \otimes (B_0 B_1) - \tfrac{1}{16} (A_1 A_0) \otimes (B_1 B_0), \tag{13} \]

from which we can upper bound the maximum eigenvalue of $M^2$ by the sum of the maximum eigenvalue in each term, obtaining $\tfrac{1}{4} + \tfrac{1}{16} + \tfrac{1}{16} + \tfrac{1}{16} + \tfrac{1}{16} = \tfrac{1}{2}$. It follows that the largest eigenvalue of $M$ itself is at most $1/\sqrt{2}$, which therefore upper bounds the expected value of $(-1)^{s \wedge t} \cdot a' \cdot b'$. Since this expected value equals the success probability minus the failure probability, this translates into an upper bound of $(1 + 1/\sqrt{2})/2 = \cos^2(\pi/8)$ for the success probability of the actual protocol (where Alice and Bob output bits $a$ and $b$). This completes the proof of Tsirelson's upper bound for CHSH.

.4 Magic square game

In one respect the GHZ example is more striking than the CHSH example: in the former case, the protocol with entanglement always succeeds, while in the latter case the protocol with entanglement merely succeeds with higher probability. However, the GHZ example involves three parties, whereas the CHSH example only involves two. Is there a two-party scenario where the quantum protocol always succeeds, whereas the best classical success probability is bounded below 1? The answer is affirmative, see for instance [38, 37, 39]. A particularly elegant example is the following game, which has been referred to as the magic square game [5].

To define this game, consider the problem of labeling the entries of a $3 \times 3$ matrix with bits so that every row has even parity and every column has odd parity. This is impossible. (As before, we can express a valid solution in terms of equations, in this case six of them, where arithmetic is modulo 2. Writing $m_{ij}$ for the entry in row $i$ and column $j$: $m_{11} + m_{12} + m_{13} = 0$, $m_{21} + m_{22} + m_{23} = 0$, $m_{31} + m_{32} + m_{33} = 0$, $m_{11} + m_{21} + m_{31} = 1$, $m_{12} + m_{22} + m_{32} = 1$, $m_{13} + m_{23} + m_{33} = 1$. Adding these equations modulo 2 yields $0 = 1$.) The two matrices

\[ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 0 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix} \]

each satisfy five out of the six constraints. For the first matrix, all rows have even parity, but only the first two columns have odd parity. For the second matrix, the first two rows have even parity, and all columns have odd parity.

Bearing the above in mind, consider the game where Alice receives $s \in \{1, 2, 3\}$ as input (specifying the number of a row), and Bob receives $t \in \{1, 2, 3\}$ as input (specifying the number of a column). Their goal is to each produce 3-bit outputs, $a_1 a_2 a_3$ for Alice and $b_1 b_2 b_3$ for Bob, with these properties:

1. They satisfy the row/column parity constraints. Namely, $a_1 \oplus a_2 \oplus a_3 = 0$ and $b_1 \oplus b_2 \oplus b_3 = 1$.
2. They are consistent where the row intersects the column. Namely, $a_t = b_s$.

(In fact, the game can be simplified so that Alice and Bob each output just two bits, since the parity constraint determines the third bit.) As usual, Alice and Bob are forbidden from communicating once the game starts, so Alice does not know what $t$ is and Bob does not know what $s$ is.
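The rules of the game fit in a few lines of code, and the deterministic classical strategies are few enough to search exhaustively. The following sketch is illustrative (the names `wins` and `best_classical` are not from the original text). It restricts both players to outputs that already satisfy their own parity constraint, which loses nothing because an output violating its parity constraint can never win.

```python
from itertools import product

# Legal 3-bit outputs: even parity for Alice's row, odd parity for Bob's column.
EVEN_ROWS = [r for r in product((0, 1), repeat=3) if sum(r) % 2 == 0]
ODD_COLS = [c for c in product((0, 1), repeat=3) if sum(c) % 2 == 1]


def wins(s, t, a, b):
    """Referee's check for inputs s, t in {1, 2, 3} and 3-bit outputs a, b."""
    parity_ok = sum(a) % 2 == 0 and sum(b) % 2 == 1
    consistent = a[t - 1] == b[s - 1]
    return parity_ok and consistent


def score(alice, bob):
    """Number of the nine input pairs (s, t) on which the strategy pair wins."""
    return sum(wins(s, t, alice[s - 1], bob[t - 1])
               for s in (1, 2, 3) for t in (1, 2, 3))


def best_classical():
    """Maximum score over all deterministic strategies (alice[s-1], bob[t-1])."""
    return max(score(alice, bob)
               for alice in product(EVEN_ROWS, repeat=3)
               for bob in product(ODD_COLS, repeat=3))


if __name__ == "__main__":
    print(f"best deterministic strategy wins {best_classical()} of 9 input pairs")
```

The search reports 8 winning input pairs out of 9, which is the classical value 8/9 discussed next; shared randomness cannot help, since a randomized strategy is a mixture of deterministic ones.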
We shall observe that, classically, the best success probability possible is 8/9, whereas there is a quantum strategy that always succeeds.

An example of a strategy that attains success probability 8/9 (when the input $st$ is uniformly distributed) is where Alice plays according to the rows of the first matrix above and Bob plays according to the columns of the second matrix above. This succeeds in all cases, except where $s = t = 3$. To see why this is optimal, note that any other deterministic classical strategy can be represented as two matrices as above but with different entries. Alice plays according to the rows of the first matrix and Bob plays according to the columns of the second matrix. We can assume that the rows of Alice's matrix all have even parity; if she outputs a row with odd parity then they immediately lose, regardless of Bob's output. Similarly, we can assume that all columns of Bob's matrix have odd parity. Considering such a pair of matrices, the players lose at each entry where they differ. There must be such an entry, since otherwise it would be possible to have all rows even and all columns odd with one matrix. Thus, when the input $st$ is chosen uniformly from $\{1, 2, 3\} \times \{1, 2, 3\}$, the success probability is at most 8/9.

We now turn to the quantum strategy. Let $I$, $X$, $Y$, $Z$ denote the $2 \times 2$ Pauli matrices:

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \text{and} \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{14} \]

Each is an observable with eigenvalues in $\{+1, -1\}$.
Consider the following table of two-qubit observables that are each a tensor product of two Pauli matrices:

\[ \begin{array}{ccc} X \otimes X & Y \otimes Z & Z \otimes Y \\ Y \otimes Y & Z \otimes X & X \otimes Z \\ Z \otimes Z & X \otimes Y & Y \otimes X \end{array} \]

For our present purposes, the noteworthy property is that the observables along each row commute and their product is $I \otimes I$, and the observables along each column commute and their product is $-I \otimes I$. This implies that, for any two-qubit state, performing the three measurements along any row results in three $\{+1, -1\}$-valued bits whose product is $+1$. Also, performing the three measurements along any column results in three $\{+1, -1\}$-valued bits whose product is $-1$. This can be seen more easily when one simultaneously diagonalizes the three commuting observables. They will have $1$ and $-1$ on the diagonal, and in each diagonal position the three entries multiply to $+1$ when the product of the observables is $I \otimes I$, and to $-1$ when it is $-I \otimes I$.

We can now describe the quantum protocol. It uses two pairs of entangled qubits, each of which is in initial state $\tfrac{1}{\sqrt{2}} (|01\rangle - |10\rangle)$; Alice holds the first qubit of each pair and Bob the second. Alice, on input $s$, applies three two-qubit measurements corresponding to the observables in row $s$ of the above table. For each measurement, if the result is $+1$, she outputs 0 and if the result is $-1$, she outputs 1. Similarly, Bob, on input $t$, applies the measurements corresponding to the observables in column $t$, and converts the outcomes into bits in the same manner.

We have already established that Alice and Bob's output bits satisfy the required parity constraints. It remains to show that Alice and Bob's output bits that correspond to where the row meets the column are the same. For that measurement, Alice and Bob are measuring with respect to the same observable in the above table. Because all the observables in each row and in each column commute, we may assume that the place where they intersect is the first observable applied. Those bits are obtained by Alice and Bob each measuring $\tfrac{1}{2} (|01\rangle - |10\rangle)(|01\rangle - |10\rangle)$ with respect to the observable in entry $(s, t)$ of the table. To show that their measurements will agree for all cases of $st$, we consider the individual Pauli measurements on the individual entangled pairs of the form $\tfrac{1}{\sqrt{2}} (|01\rangle - |10\rangle)$. Let $a'$ and $b'$ denote the outcomes of the first measurement (in terms of bits), and $a''$ and $b''$ denote the outcomes of the second. Since the measurement associated with the tensor product of two observables is operationally equivalent to measuring each individual observable and taking the product of the results, we have that $a_t = a' \oplus a''$ and $b_s = b' \oplus b''$. It is straightforward to verify that if the same measurement from $\{X, Y, Z\}$ is applied to each qubit of $\tfrac{1}{\sqrt{2}} (|01\rangle - |10\rangle)$ then the outcomes will be distinct. Therefore, $a' \oplus b' = 1$ and $a'' \oplus b'' = 1$, from which it follows that

\[ \begin{aligned} a_t \oplus b_s &= (a' \oplus a'') \oplus (b' \oplus b'') \\ &= (a' \oplus b') \oplus (a'' \oplus b'') \\ &= 1 \oplus 1 \\ &= 0, \end{aligned} \tag{15} \]

so $a_t = b_s$. This completes the analysis of the magic square game.

In the last section we considered scenarios without communication. Here we will extend the non-locality setting to one where the parties (Alice and Bob) are allowed to send information to each other in the form of bits or qubits. They can still have shared randomness and may share an entangled quantum state. We are now interested in the minimum number of bits or qubits that are needed in order to compute a function that depends on the inputs of all the parties.

The ability to send information to each other departs from the setting of non-locality. We will see that entanglement can be used to reduce (for certain functions) the communication drastically compared to when the parties share just classical resources. Accordingly, while entanglement cannot be used for signalling, it can be used to significantly reduce the communication needed for certain tasks. In later sections we will see how some of the ideas and protocols developed in the setting of communication complexity can be used to formulate new non-locality games.

Communication complexity has been studied extensively in the area of theoretical computer science and has deep connections with seemingly unrelated areas, such as VLSI design, circuit lower bounds, lower bounds on branching programs, size of data structures, and bounds on the length of logical proof systems, to name just a few. We refer to the textbooks [83, 77] for more details.

First we sketch the setting for classical communication complexity. Alice and Bob want to compute some function $f : D \to \{0, 1\}$, where $D \subseteq X \times Y$. If the domain $D$ equals $X \times Y$ then $f$ is called a total function, otherwise it is a promise function. Alice receives input $x \in X$, Bob receives input $y \in Y$, with $(x, y) \in D$. A typical situation, illustrated in Fig. 2, is where $X = Y = \{0, 1\}^n$, so both Alice and Bob receive an $n$-bit input string. As the value $f(x, y)$ will generally depend on both $x$ and $y$, some communication between Alice and Bob is required in order for them to be able to compute $f(x, y)$. We are interested in the minimal amount of communication they need.

A communication protocol is a distributed algorithm where first Alice does some individual computation, and then sends a message (of one or more bits) to Bob, then Bob does some computation and sends a message to Alice, etc. Each message is called a round. After one or more rounds the protocol terminates and outputs some value, which must be known to both players. The cost of a protocol is the total number of bits communicated on the worst-case input. A deterministic protocol for $f$ always has to output the right value $f(x, y)$ for all $(x, y) \in D$. In a bounded-error protocol, Alice and Bob may flip coins and the protocol has to output the right value $f(x, y)$ with probability $\geq 2/3$ for all $(x, y) \in D$. We could either allow Alice and Bob to toss coins individually (local randomness, or "private coin") or jointly (shared randomness, or "public coin"). The latter is analogous to the local hidden variables in non-locality games. A public coin can simulate a private coin and is potentially more powerful. However, Newman's theorem [101] says that having a public coin can save at most $O(\log n)$ bits of communication, compared to a protocol with a private coin.

Some often studied functions are:

• Equality: $\mathrm{EQ}(x, y) = 1$ if $x = y$, and $\mathrm{EQ}(x, y) = 0$ otherwise.

• Inner product: $\mathrm{IP}(x, y) = \sum_{i=1}^{n} x_i y_i \pmod 2$ (for $x, y \in \{0, 1\}^n$, $x_i$ is the $i$th bit of $x$).

• Intersection: $\mathrm{INT}(x, y) = 1$ if there is an $i$ where $x_i = y_i = 1$, and $\mathrm{INT}(x, y) = 0$ otherwise (viewing $x$ as corresponding to the set $\{i : x_i = 1\}$ and similarly for $y$, $\mathrm{INT}(x, y)$ says whether the sets $x$ and $y$ intersect). A variant of this problem asks to actually find an $i$ where $x_i = y_i = 1$, or to output that no such $i$ exists.

Let us first consider the equality problem, which will recur throughout the text. The goal for Alice is to determine whether her $n$-bit input is the same as Bob's or not. It is not hard to show that in the deterministic case, $n$ bits of communication are needed (see Section B.1 of the appendix for a proof), so Bob might as well send his string to Alice, after which Alice announces the answer to Bob with one more bit.

To illustrate the power of randomness, let us give a simple yet efficient bounded-error protocol for the equality problem. Alice and Bob jointly toss a random string $r \in \{0, 1\}^n$. Alice sends the bit $a = x \cdot r$ to Bob (where '$\cdot$' is inner product mod 2). Bob computes $b = y \cdot r$ and compares this with $a$. If $x = y$ then $a = b$, but if $x \neq y$ then $a \neq b$ with probability 1/2. Repeating this a few times, Alice and Bob can decide equality with small error using $O(n)$ public coin flips and a constant amount of communication.

This protocol uses public coins, but note that Newman's theorem implies that there exists an $O(\log n)$-bit protocol that uses a private coin. Let us explicitly describe such a protocol. Alice views her $n$ bits as the coefficients of a polynomial $p_x$ over some finite field $F$ of about $3n$ elements: $p_x(t) = \sum_{i=1}^{n} x_i t^{i-1}$. (For those not familiar with finite fields: it suffices to choose a prime number $p \approx 3n$ and do all additions and multiplications modulo this $p$.) She picks a random element $a \in F$, and sends Bob the pair $a, p_x(a)$, which she can do using $2 \log(3n)$ bits. Bob computes $p_y(a)$ and outputs 1 if $p_x(a) = p_y(a)$, and outputs 0 otherwise. Clearly, if $x = y$ then Bob always outputs the correct answer 1. However, if $x \neq y$ then the polynomial $p_x(t) - p_y(t)$ is a nonzero polynomial in $t$ of degree at most $n - 1$, and hence has at most $n - 1$ roots in $F$. Hence with probability at least 2/3, the field element $a$ that Alice chose satisfies $p_x(a) \neq p_y(a)$, and Bob will give the correct output 0 also in this case.

Now what happens if we give Alice and Bob a quantum computer and allow them to send each other qubits and/or to make use of ebits that they share at the start of the protocol?

Formally speaking, we can model a quantum protocol as follows. The total state consists of 3 parts: Alice's private space, the channel, and Bob's private space. The starting state is $|x\rangle |0\rangle |y\rangle$: Alice gets $x$, the channel is initialized to 0, and Bob gets $y$. Now Alice applies a unitary transformation to her space and the channel. This corresponds to her private computation as well as to putting a message on the channel (the length of this message is the number of channel qubits affected by Alice's operation). Then Bob applies a unitary transformation to his space and the channel, etc. At the end of the protocol Alice or Bob makes a measurement to determine the output of the protocol. This model was introduced by Yao [137].

In the second model, introduced by Cleve and Buhrman [45], Alice and Bob share an unlimited number of ebits at the start of the protocol, but now they communicate via a classical channel: the channel has to be in a classical state throughout the protocol. We only count the communication, not the number of ebits used. Protocols of this kind can simulate protocols of the first kind with only a factor 2 overhead: using teleportation, the parties can send each other a qubit using an ebit and two classical bits of communication. Hence the qubit protocols that we describe below also immediately yield protocols that work with entanglement and a classical channel. Note that an ebit can simulate a public coin toss: if Alice and Bob each measure their half of the pair of qubits, they get the same random bit.

The third variant combines the strengths of the other two: here Alice and Bob start out with an unlimited number of ebits and they are allowed to communicate qubits. This third kind of communication complexity is in fact equivalent to the second, up to a factor of 2, again by teleportation.

Before continuing to study this model, we first have to face an important question, already mentioned in the introduction: is there anything to be gained here?

At first sight, the following argument seems to rule out any significant gain. Suppose that in the classical world $k$ bits have to be communicated in order to compute $f$. Since Holevo's theorem says that $k$ qubits cannot contain more information than $k$ classical bits, it seems that the quantum communication complexity should be roughly $k$ qubits as well (maybe $k/2$ to account for superdense coding, but not less). Surprisingly, this argument is false, and quantum communication can sometimes be much less than classical communication. The information-theoretic argument via Holevo's theorem fails because Alice and Bob do not need to communicate the information in the $k$ bits of the classical protocol; they are only interested in the value $f(x, y)$, which is just 1 bit. Below we will survey some of the main examples that have so far been found of differences between quantum and classical communication complexity.

Quantum communication complexity was introduced by Yao [137] and studied by Kremer [82], but neither showed any advantages of quantum over classical communication. Cleve and Buhrman [45] introduced the variant with classical communication and prior entanglement, and exhibited the first quantum protocol provably better than any classical protocol. It uses quantum entanglement to save 1 bit of classical communication. This gap was extended by Buhrman, Cleve, and van Dam [31] and, for arbitrary $k$ parties, by Buhrman, van Dam, Høyer, and Tapp [34].

The first impressively large gaps between quantum and classical communication complexity were exhibited by Buhrman, Cleve, and Wigderson [33]. Their protocols are distributed versions of known quantum query algorithms, like the Deutsch-Jozsa [56] and Grover [74] algorithms.

Let us start with the first one. It is actually explained most easily in a direct way, without reference to the Deutsch-Jozsa algorithm (though that is where the idea came from). The problem deals with a promise version of the equality problem. Suppose the $n$-bit inputs $x$ and $y$ are restricted to the following case:

DJ promise: either $x = y$, or $x$ and $y$ differ in exactly $n/2$ positions.

Note that this promise only makes sense if $n$ is an even number, otherwise $n/2$ would not be an integer. In fact it will be convenient to assume $n$ is a power of 2. Here is a simple quantum protocol to solve this promise version of equality using only $\log n$ qubits:

1. Alice sends Bob the $\log n$-qubit state $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} (-1)^{x_i} |i\rangle$, which she can prepare unitarily from $x$ and $\log n$ $|0\rangle$-qubits.
2. Bob applies the unitary map $|i\rangle \mapsto (-1)^{y_i} |i\rangle$ to the state, applies a Hadamard transform to each qubit (for this it is convenient to view $i$ as a $\log n$-bit string), and measures the resulting $\log n$-qubit state.
3. Bob outputs 1 if the measurement gave $|0^{\log n}\rangle$ and outputs 0 otherwise.

It is clear that this protocol only communicates $\log n$ qubits, but why does it work? Note that the state that Bob measures is

\[ H^{\otimes \log n} \left( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (-1)^{x_i + y_i} |i\rangle \right) = \frac{1}{n} \sum_{i=1}^{n} (-1)^{x_i + y_i} \sum_{j \in \{0,1\}^{\log n}} (-1)^{i \cdot j} |j\rangle. \]

This superposition looks rather unwieldy, but consider the amplitude of the $|0^{\log n}\rangle$ basis state. It is $\frac{1}{n} \sum_{i=1}^{n} (-1)^{x_i + y_i}$, which is 1 if $x = y$ and 0 otherwise, because the promise now guarantees that $x$ and $y$ differ in exactly $n/2$ of the positions if they are not equal. Hence Bob always gives the correct answer.

What about efficient classical protocols (without entanglement) for this problem? Proving lower bounds on communication complexity often requires a very technical combinatorial analysis. Buhrman, Cleve, and Wigderson used a deep combinatorial result of Frankl and Rödl [62] to prove that every classical errorless protocol for this problem needs to send at least $0.007n$ bits. We give the details in Appendix B.4.

This $\log n$-qubits-vs-$0.007n$-bits example was the first exponentially large separation of quantum and classical communication complexity. Notice, however, that the difference disappears if we move to the bounded-error setting, allowing the protocol to have some small error probability. We can use the randomized protocol for equality discussed above or even simpler: Alice can just send a few $(i, x_i)$ pairs to Bob, who then compares the $x_i$'s with his $y_i$'s. If $x = y$ he will not see a difference, but if $x$ and $y$ differ in $n/2$ positions, then Bob will probably detect this. Hence $O(\log n)$ classical bits of communication suffice in the bounded-error setting, in sharp contrast to the errorless setting.
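The three-step protocol above is small enough to simulate classically for modest $n$. The sketch below is an illustration under stated assumptions rather than anything from the original text: the function names are hypothetical, positions are indexed from 0 to $n-1$ instead of 1 to $n$, and the state is stored as an explicit list of $n$ amplitudes (so the simulation itself is of course not communication-efficient). It computes the probability that Bob measures the all-zero string.

```python
import math


def hadamard_transform(amps):
    """Apply H to every qubit of a state given as a list of 2**k amplitudes."""
    amps = list(amps)
    n = len(amps)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                u, v = amps[i], amps[i + h]
                amps[i] = (u + v) / math.sqrt(2)
                amps[i + h] = (u - v) / math.sqrt(2)
        h *= 2
    return amps


def prob_bob_outputs_one(x, y):
    """Probability that Bob measures the all-zero string, i.e. outputs 1."""
    n = len(x)
    if n != len(y) or n == 0 or n & (n - 1) != 0:
        raise ValueError("x and y must have equal length n, a power of 2")
    # Alice's message followed by Bob's phase map |i> -> (-1)^{y_i} |i>.
    state = [(-1) ** (xi + yi) / math.sqrt(n) for xi, yi in zip(x, y)]
    return hadamard_transform(state)[0] ** 2


if __name__ == "__main__":
    x = [0, 1, 1, 0, 1, 0, 0, 1]
    y_far = [1, 0, 0, 1, 1, 0, 0, 1]  # differs from x in exactly n/2 = 4 positions
    print("x = y       :", prob_bob_outputs_one(x, x))
    print("distance n/2:", prob_bob_outputs_one(x, y_far))
```

Up to floating-point rounding, the first probability is 1 and the second is 0, matching the amplitude $\frac{1}{n} \sum_i (-1)^{x_i + y_i}$ computed in the text.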
Now consider the Intersection function, which is 1 if $x_i = y_i = 1$ for at least one $i$. Note that this is a decision version of the appointment-scheduling problem mentioned in the introduction. Buhrman, Cleve, and Wigderson [33] also presented an efficient quantum protocol for this. Their protocol is based on Lov Grover's famous quantum search algorithm [74], which we will briefly sketch here.

Suppose there is some $n$-bit string $z$ and we would like to find an index $i$ such that $z_i = 1$. We cannot "look" at $z$ directly, but we can apply the following unitary map:

\[ O_z : |i\rangle \mapsto (-1)^{z_i} |i\rangle. \]

Grover's algorithm starts in a uniform superposition $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} |i\rangle$ and then repeatedly applies the following unitary Grover iterate to the state:

\[ G = H^{\otimes \log n} \, O_0 \, H^{\otimes \log n} \, O_z, \]

where $H^{\otimes \log n}$ is the $\log n$-qubit Hadamard transform, and $O_0$ is the unitary that puts a '$-$' in front of the all-0 state. Suppose there are exactly $t$ solutions: $t$ indices $i$ where $z_i = 1$. We will not give the analysis here (see for instance [24]), but one can show that after about $\frac{\pi}{4} \sqrt{n/t}$ Grover iterations, most of the amplitude of the state sits on such solutions. Measuring the state will now with high probability give us a solution. Of course we may not know $t$ in advance, but there is a way to find a solution with high probability using $O(\sqrt{n})$ Grover iterates even in that case.

Now what about the Intersection problem? Note that we just want to find a solution for the string $z = x \wedge y$, which is the bit-wise AND of $x$ and $y$, since $z_i = 1$ whenever both $x_i = 1$ and $y_i = 1$. The idea is now to let Alice run Grover's algorithm to search for such a solution. Clearly, she can prepare the uniform starting state herself. She can also apply $H$ and $O_0$ herself. The only thing where she needs Bob's help is in implementing $O_z$. This they do as follows. Whenever Alice wants to apply $O_z$ to a state

\[ |\phi\rangle = \sum_{i=1}^{n} \alpha_i |i\rangle, \]

she tags on her $x_i$ in an extra qubit and sends Bob the state

\[ \sum_{i=1}^{n} \alpha_i |i\rangle |x_i\rangle. \]
Bob applies the unitary map

\[ |i\rangle |x_i\rangle \mapsto (-1)^{x_i \wedge y_i} |i\rangle |x_i\rangle \]

and sends back the result. Alice sets the last qubit back to $|0\rangle$ (which she can do unitarily because she has $x$), and now she has the state $O_z |\phi\rangle$! Thus we can simulate $O_z$ using 2 messages of $\log(n) + 1$ qubits each. Hence Alice and Bob can run Grover's algorithm to find an intersection, using $O(\sqrt{n})$ messages of $O(\log n)$ qubits each, for total communication of $O(\sqrt{n} \log n)$ qubits. Later Aaronson and Ambainis [1] gave a more complicated protocol that uses $O(\sqrt{n})$ qubits of communication.

What about lower bounds? It is a well-known result of classical communication complexity that classical bounded-error protocols for the Intersection problem need about $n$ bits of communication [78, 111]. Thus we have a quadratic quantum-classical separation for this problem. Could the separation be even bigger than quadratic? This question was open for quite a few years after [33] appeared, until finally Razborov [112] showed that any bounded-error quantum protocol for Intersection needs to communicate about $\sqrt{n}$ qubits. His proof is beautiful but deep and complicated. We sketch it in Appendix C.

.6 Raz's problem

Notice the contrast between the examples of the last two sections. For the Distributed Deutsch-Jozsa problem we get an exponential quantum-classical separation, but the separation only holds if we require the classical protocol to be errorless. On the other hand, the gap for the Intersection (disjointness) function is only quadratic, but it holds even if we allow classical protocols to have some error probability.

Raz [110] exhibited a function where the quantum-classical separation has both features: the quantum protocol is exponentially better than the classical protocol, even if the latter is allowed some error probability.
Consider the following promise problem P:

Alice receives a unit vector $v \in \mathbb{R}^m$ and a decomposition of the corresponding space in two orthogonal subspaces $H^{(0)}$ and $H^{(1)}$.
Bob receives an $m \times m$ unitary transformation $U$.
Promise: $Uv$ is either "close" to $H^{(0)}$ or to $H^{(1)}$ (more precisely, letting $P$ be the projector on subspace $H$, a vector $v$ is close to $H$ if $\|Pv\|^2 \geq 2/3$).
Question: which of the two?

As stated, this is a problem with continuous input, but it can be discretized by approximating each real number by $O(\log m)$ bits. Alice and Bob's input is now $n = O(m^2 \log m)$ bits long. There is a simple yet efficient 2-round quantum protocol for this problem: Alice views $v$ as a $\log m$-qubit vector and sends this to Bob. Bob applies $U$ and sends back the result. Alice then measures in which subspace $H^{(i)}$ the vector $Uv$ lies and outputs the resulting $i$. This takes only $2\log m = O(\log n)$ qubits of communication.

The efficiency of this protocol comes from the fact that an $m$-dimensional unit vector can be "compressed" or "represented" as a $\log m$-qubit state. Similar compression is not possible with classical bits, which suggests that any classical protocol for P will have to send the vector $v$ more or less literally and hence will require a lot of communication. This turns out to be true but the proof (given in [110]) is surprisingly hard. It shows that any bounded-error classical protocol for P needs to send at least about $n^{1/4}/\log n$ bits.

Consider the following promise problem HM from [9], for even integer $n$:

Alice receives a string $x \in \{0,1\}^n$.
Bob receives a perfect matching $M$ on $\{1,\ldots,n\}$ (i.e., a partition into $n/2$ disjoint pairs $M = \{(i_1,j_1),\ldots,(i_{n/2},j_{n/2})\}$).
Question: output a triple $(i,j,x_i \oplus x_j)$ for some $(i,j) \in M$.

This communication problem is not a function, but a relation: for each input pair $x, M$ there are $n/2$ correct outputs, since $(i,j,x_i \oplus x_j)$ is correct for each $(i,j) \in M$. We consider one-way protocols here, where Alice sends one message to Bob and then Bob should produce a triple $(i,j,x_i \oplus x_j)$.

We now describe a quantum protocol where Alice sends only $O(\log n)$ qubits and Bob gives one of the correct answers with probability 1 [9]. Alice sends Bob the following $\log n$-qubit message:

$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} (-1)^{x_i} |i\rangle.$$

Bob views $M$ as an orthogonal decomposition of the space $\mathbb{C}^n$ into $n/2$ two-dimensional subspaces: the projector corresponding to the pair $(i,j) \in M$ would be $P_{ij} = |i\rangle\langle i| + |j\rangle\langle j|$. Bob applies this measurement on the state he received, and obtains the label of some random $(i,j) \in M$ as well as the projected state

$$\frac{1}{\sqrt{2}} \left( (-1)^{x_i} |i\rangle + (-1)^{x_j} |j\rangle \right).$$

An appropriate measurement on this state (in the basis $(|i\rangle \pm |j\rangle)/\sqrt{2}$) will give Bob the bit $x_i \oplus x_j$ with certainty, and he can output the correct answer $(i,j,x_i \oplus x_j)$.

What about classical protocols? First note that the HM problem can be solved by a short classical message from Bob to Alice: Bob sends Alice a pair $(i,j) \in M$ using $2\log n$ bits, which allows Alice to compute $x_i \oplus x_j$. But the situation is radically different if we consider classical one-way communication from Alice to Bob only. One can show that if Alice sends Bob pairs $(i,x_i)$ for $O(\sqrt{n})$ randomly chosen $i$'s, then Bob probably received both points from at least one pair in $M$. This allows him to output a correct answer. On the other hand, Bar-Yossef, Jayram, and Kerenidis [9] proved that any classical protocol solving the Hidden Matching problem, even with small error probability and involving only one-way communication from Alice to Bob, needs messages of length at least about $\sqrt{n}$.
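The quantum one-way protocol for Hidden Matching described above can be checked numerically for small $n$. The following is a minimal sketch, not the authors' code: it tracks the $n$ real amplitudes classically, indexes positions from 0 rather than 1, and the helper names `hidden_matching_quantum` and `random_perfect_matching` are hypothetical. It assumes Bob's final measurement is in the basis $(|i\rangle \pm |j\rangle)/\sqrt{2}$.

```python
import math
import random


def hidden_matching_quantum(x, matching, rng):
    """Classically simulate the one-way quantum protocol for Hidden Matching.

    x        -- list of n bits (Alice's input), positions indexed from 0
    matching -- list of n/2 disjoint pairs (i, j) covering range(n) (Bob's input)
    rng      -- random.Random instance used to sample measurement outcomes
    """
    n = len(x)
    # Alice's message: amplitude (-1)^{x_i} / sqrt(n) on basis state |i>.
    amp = [(-1) ** x[i] / math.sqrt(n) for i in range(n)]

    # Bob's projective measurement {P_ij}: outcome (i, j) occurs with
    # probability amp[i]^2 + amp[j]^2 = 2/n, so all pairs are equally likely.
    weights = [amp[i] ** 2 + amp[j] ** 2 for (i, j) in matching]
    i, j = rng.choices(matching, weights=weights)[0]

    # Projected and renormalised state on span{|i>, |j>}.
    norm = math.sqrt(amp[i] ** 2 + amp[j] ** 2)
    a_i, a_j = amp[i] / norm, amp[j] / norm

    # Measure in the basis (|i> + |j>)/sqrt(2), (|i> - |j>)/sqrt(2).
    # Outcome "+" corresponds to x_i XOR x_j = 0, outcome "-" to 1.
    p_plus = (a_i + a_j) ** 2 / 2
    p_minus = (a_i - a_j) ** 2 / 2
    bit = rng.choices([0, 1], weights=[p_plus, p_minus])[0]
    return i, j, bit


def random_perfect_matching(n, rng):
    """Return a uniformly shuffled pairing of range(n); n must be even."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [(idx[k], idx[k + 1]) for k in range(0, n, 2)]
```

In this simulation one of `p_plus`, `p_minus` is always zero, which is the numerical counterpart of the statement that Bob learns $x_i \oplus x_j$ with certainty.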
Thus we have an exponential separation between classical one-way protocols and quantum one-way protocols.

Footnote (on the $O(\sqrt{n})$ random samples in the classical one-way protocol): This is due to an effect called the "birthday paradox" or "birthday problem". It states that if we throw roughly $\sqrt{n}$ balls into $n$ bins at random, then probably there will be a bin containing at least two balls.

Variants of the Hidden Matching problem have been used recently to obtain other quantum-classical separations. For example, Gavinsky et al. [67] showed a $\log n$-qubits-versus-$\sqrt{n}$-classical-bits separation for one-way protocols for a Boolean function derived from the Hidden Matching problem (while HM itself is a relational problem). Gavinsky [65] used another variant of HM to exhibit a relational problem where quantum one-way protocols are exponentially more efficient than classical two-way protocols.

In the previous sections we gave examples of quantum-classical separations. The parameters were different, but in each case we showed that there was a quantum protocol for the problem at hand that required far less communication than the best classical protocols. Could this always be the case? Could quantum communication complexity be much more efficient for every communication complexity problem? The answer to this is negative: in fact, for most communication complexity problems, quantum communication does not help much.

An important example is the inner product function

$$\mathrm{IP}(x,y) = x \cdot y = \sum_{i=1}^{n} x_i y_i \pmod 2.$$

All protocols, both classical and quantum, need to send about $n$ bits/qubits to solve this. We will sketch the proof of [46] here for the case of errorless quantum protocols with qubit communication and without entanglement; the proof for the more general case with entanglement is slightly more complicated. The proof uses the IP-protocol to communicate Alice's $n$-bit input to Bob, and then invokes Holevo's theorem to conclude that many qubits must have been communicated in order to achieve this. Suppose Alice and Bob have some protocol $P$ for IP. They can use this to compute the following mapping:

$$|x\rangle|y\rangle \mapsto |x\rangle (-1)^{x \cdot y} |y\rangle. \qquad (16)$$

Footnote (on obtaining this mapping): This is an oversimplification of matters: in order to get the map of Eq. (16) one first needs to construct a new protocol $P^{-1}$ which is the reverse of the original communication protocol $P$. This can be done without error because the original protocol is without error. Combining protocols $P$ and $P^{-1}$ one can obtain map (16). If protocol $P$ uses $c$ qubits of communication, protocol $P^{-1}$ also uses $c$ qubits, and the protocol for obtaining the map (16) uses $2c$ qubits. But the crucial point is that still at most $c$ qubits are sent from Alice to Bob, since $P^{-1}$ is the reverse of $P$. Holevo's theorem lower bounds the communication from Alice to Bob, and hence we get a lower bound of $n$ qubits on $c$.

Now suppose Alice starts with the classical $n$-bit state $|x\rangle$ and Bob starts with the uniform superposition $\frac{1}{\sqrt{2^n}} \sum_{y \in \{0,1\}^n} |y\rangle$. If they apply the above mapping, the final state becomes

$$|x\rangle \frac{1}{\sqrt{2^n}} \sum_{y \in \{0,1\}^n} (-1)^{x \cdot y} |y\rangle.$$

If Bob applies a Hadamard transform to each of his $n$ qubits, then he obtains the basis state $|x\rangle$, so Alice's $n$ classical bits have been communicated to Bob. Holevo's theorem now implies that the IP-protocol must communicate $n$ qubits (which can trivially be achieved). The same argument can, with a minor modification, be made to work even if Alice and Bob share unlimited prior entanglement, yielding a lower bound of $n/2$ qubits. A lower bound linear in $n$ also holds for $\epsilon$-error protocols [46]. The constant factor in this bound was subsequently improved to the optimal one by Nayak and Salzman [100].

In Section 2 we introduced several simple non-locality scenarios. Then in Section 3 we introduced communication complexity, and gave several problems for which there are large, sometimes exponential, separations between the classical and quantum communication complexity. In this section we shall put together these two approaches, and derive from the communication complexity problems new non-locality problems which are very hard, sometimes exponentially hard, to solve in a classical model. In particular we shall present non-locality problems based on the Distributed Deutsch-Jozsa problem and on the Hidden Matching problem. In Section 7 we shall come back to these non-locality problems, and will discuss these newly developed tests in the context of experimental errors.

In this section we shall use the following mapping which, when applicable, is very powerful.

Mapping one-way quantum communication complexity to non-locality.

Consider a communication complexity problem where the number $q$ of qubits exchanged in the quantum communication model with one-way communication from Alice to Bob is less than the number $c$ of bits required to solve the problem classically when the parties have shared randomness. Suppose further that, due to some symmetry of the problem, it can be solved if Alice starts with an arbitrary basis state $|k\rangle$ (the value of $k$ being known beforehand to both Alice and Bob) as follows: she carries out a transformation $U_A(x)$ on this state (that depends on her input $x$ but does not depend on $k$) and sends it to Bob, who carries out a transformation $U_B(y)$ (that depends on his input $y$ but does not depend on $k$) and then measures in the computational basis. The probability of finding result $\ell$ is thus $|\langle \ell | U_B(y) U_A(x) | k \rangle|^2$. From the knowledge of $\ell$, $k$, and $y$, Bob can find the value of the function $f(x,y)$.

In that case the protocol can be converted as follows. Alice and Bob share the maximally entangled state $|\psi\rangle = 2^{-q/2} \sum_{i=0}^{2^q-1} |i\rangle|i\rangle$; Alice carries out a local transformation $U_A(x)^T$ (where '$T$' means transposition in the $|i\rangle$-basis); she measures in the computational basis. Bob carries out the transformation $U_B(y)$; he measures in the computational basis. Suppose that Alice obtains outcome $k$ and Bob obtains outcome $\ell$. The probability of finding these joint outcomes is

$$P(k,\ell|x,y) = |\langle \ell|\langle k|\, U_B(y)\, U_A(x)^T |\psi\rangle|^2 = 2^{-q}\, |\langle \ell| U_B(y) U_A(x) |k\rangle|^2$$

(the last equality is easy to check). If Alice now sends to Bob the outcome $k$ of her measurement (which requires $q$ bits), then Bob can compute $f(x,y)$. Thus this constitutes a solution of the communication complexity problem in the entanglement model with half the communication that would be required if they had used the trivial mapping based on teleportation. More importantly, the correlations $P(k,\ell|x,y)$ are non-local, since they could not be obtained in a classical model with shared randomness without at least $c - q > 0$ bits of communication.

The above mapping can be applied to the Distributed Deutsch-Jozsa problem from Section 3.4. We describe here the result of the mapping.

Non-local DJ problem:

Alice and Bob receive $n$-bit inputs $x$ and $y$ that satisfy the DJ promise: either $x = y$, or $x$ and $y$ differ in exactly $n/2$ positions. The task is for Alice and Bob to provide outputs $a, b \in \{0,1\}^{\log n}$ such that when $x = y$ then $a = b$, and when $x$ and $y$ differ in exactly $n/2$ positions then $a \neq b$.

They achieve this as follows:

1. Alice and Bob share the maximally entangled state $\frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} |i\rangle|i\rangle$.
2. Alice and Bob both apply locally a conditional phase to obtain: $\frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} (-1)^{x_i} (-1)^{y_i} |i\rangle|i\rangle$.
3. Alice and Bob both apply a Hadamard transform: $\frac{1}{n\sqrt{n}} \sum_{a=0}^{n-1} \sum_{b=0}^{n-1} \left( \sum_{i=0}^{n-1} (-1)^{x_i + y_i + i \cdot (a \oplus b)} \right) |a\rangle|b\rangle$.
4. Alice and Bob measure in the computational basis.

For every $a$, the probability that both Alice and Bob obtain the same result $a$ is

$$\left| \frac{1}{n\sqrt{n}} \sum_{i=0}^{n-1} (-1)^{x_i + y_i} \right|^2,$$

which is $1/n$ if $x = y$ and 0 otherwise. Hence this solves the problem.

Note that if Alice then communicated the result of her measurement to Bob (using $\log n$ bits), he could solve the Distributed Deutsch-Jozsa problem since he could then check whether $a = b$ or $a \neq b$. But we know that solving the Distributed Deutsch-Jozsa problem classically without error requires at least $0.007n$ bits. Thus we have a non-locality problem that can be solved if Alice and Bob share $\log n$ ebits, but which requires about $0.007n$ bits to be solved in a classical model with shared randomness and classical communication. Note that this very large lower bound on the amount of classical communication would disappear in the bounded-error setting where we allow the correlations $P(a,b|x,y)$ to differ slightly from the ideal correlations.

The same mapping can be applied to the Hidden Matching problem to yield a non-locality problem.
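Before turning to Hidden Matching, the non-local DJ correlations can be tabulated directly from the amplitude in step 3 for small $n$. This is a sketch under stated assumptions: $n$ is a power of two, indices $i$, $a$, $b$ are represented as integers whose binary digits are the $\log n$-bit strings, and the function names are hypothetical.

```python
def dot_mod2(u, v):
    """Inner product mod 2 of the binary expansions of integers u and v."""
    return bin(u & v).count("1") % 2


def dj_correlations(x, y):
    """Return {(a, b): P(a, b | x, y)} for the non-local DJ protocol.

    The amplitude of |a>|b> is (1 / (n sqrt(n))) * sum_i (-1)^(x_i + y_i + i.(a xor b)),
    so the probability is the squared sum divided by n^3.
    """
    n = len(x)
    table = {}
    for a in range(n):
        for b in range(n):
            s = sum((-1) ** (x[i] + y[i] + dot_mod2(i, a ^ b)) for i in range(n))
            table[(a, b)] = s * s / n ** 3
    return table


def prob_equal_outputs(x, y):
    """Total probability that Alice's and Bob's outcomes coincide."""
    return sum(p for (a, b), p in dj_correlations(x, y).items() if a == b)
```

For inputs satisfying the promise, `prob_equal_outputs` should come out as 1 when $x = y$ and as 0 when $x$ and $y$ differ in exactly $n/2$ positions, matching the calculation above.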

Non-local HM problem:

Assume that $n = 2^m$, so we can index the numbers between 1 and $n$ with $m$-bit strings.

Alice receives a string $x \in \{0,1\}^n$. Bob receives a perfect matching $M$ on $\{1,\ldots,n\}$ (i.e. a partition into $n/2$ disjoint pairs).
Alice must give as output a string $k \in \{0,1\}^m$. Bob must give as output a pair $(i,j) \in M$ and a string $\ell \in \{0,1\}^m$.
Alice and Bob's output must satisfy

$$i \cdot (k \oplus \ell) + j \cdot (k \oplus \ell) = x_i + x_j \pmod 2$$

(recall that $a \cdot b = \sum_i a_i b_i$ is the inner product between bitstrings $a$ and $b$, and $a \oplus b$ is the bitwise XOR of $a$ and $b$: the $i$th bit of $a \oplus b$ is $a_i \oplus b_i$).

Note that if at the end of the protocol, Alice sends $k$ to Bob at a cost of $m = \log n$ classical bits, then Bob has enough information to compute the triple $(i,j,x_i \oplus x_j)$, i.e., to solve the Hidden Matching problem as defined in Section 3.7. But we know that classical one-way communication from Alice to Bob needs about $\sqrt{n}$ bits to solve the Hidden Matching problem. Therefore the correlations in the non-local HM problem themselves can only be reproduced if Alice sends Bob at least about $\sqrt{n}$ bits of communication (if we are restricted to one-way).

Let us show that Alice and Bob can obtain the correlations of the non-local HM problem using local measurements on $m = \log n$ ebits. The initial state is:

$$\frac{1}{\sqrt{n}} \sum_{i \in \{0,1\}^m} |i\rangle|i\rangle.$$

Alice adds the phases $(-1)^{x_i}$. Bob views $M$ as an orthogonal decomposition of the space $\mathbb{C}^n$ into $n/2$ two-dimensional subspaces: the projector corresponding to the pair $(i,j) \in M$ would be $P_{ij} = |i\rangle\langle i| + |j\rangle\langle j|$. Bob applies this measurement on his half of the state, and obtains the label of some random $(i,j) \in M$. This projects the joint state to

$$\frac{1}{\sqrt{2}} \left( (-1)^{x_i} |i\rangle|i\rangle + (-1)^{x_j} |j\rangle|j\rangle \right).$$

Now they both apply Hadamard transforms to each of their qubits. This gives the state

$$\frac{1}{\sqrt{2}} \left( (-1)^{x_i} \frac{1}{n} \sum_{k,\ell \in \{0,1\}^m} (-1)^{i \cdot k + i \cdot \ell} |k\rangle|\ell\rangle + (-1)^{x_j} \frac{1}{n} \sum_{k,\ell \in \{0,1\}^m} (-1)^{j \cdot k + j \cdot \ell} |k\rangle|\ell\rangle \right) = \frac{1}{n\sqrt{2}} \sum_{k,\ell \in \{0,1\}^m} \left( (-1)^{x_i + i \cdot (k \oplus \ell)} + (-1)^{x_j + j \cdot (k \oplus \ell)} \right) |k\rangle|\ell\rangle.$$

Both parties measure their half of the state in the computational basis. They obtain $m$-bit strings $k$ and $\ell$, respectively, satisfying $x_i + i \cdot (k \oplus \ell) = x_j + j \cdot (k \oplus \ell)$ (modulo 2), since the other $k,\ell$-pairs have amplitude 0. This gives: $i \cdot (k \oplus \ell) + j \cdot (k \oplus \ell) = x_i + x_j$ (modulo 2).

5 Quantum Fingerprinting and the Simultaneous Message Passing Model

We now describe a model, called the simultaneous message passing (SMP) model, that is neither a non-locality test nor the full-fledged communication complexity scenario, yet that is relevant to both. The basic structure is illustrated in Fig. 5. Alice and Bob each receive an $n$-bit input ($x$ and $y$, respectively). In this scenario, they do not have any shared resources like shared randomness or an entangled state, but they do have local randomness. They each are required to send a single message to a third party, called the Referee. The Referee, upon receiving message $m_A$ from Alice and $m_B$ from Bob, should output the value of some (Boolean) function $f(x,y)$. The goal is to compute $f(x,y)$ with a minimum amount of communication from Alice and Bob to the Referee.

Figure 5: The simultaneous message passing variant of the communication complexity scenario: Alice and Bob receive $n$-bit strings, $x$ and $y$ respectively, as input and their communication is restricted to each sending one message to a third party, called the Referee. From these messages, the Referee computes some function $f(x,y)$ as the output of the protocol. There are tasks of this form where communication in terms of quantum messages is exponentially more efficient than communication in terms of classical messages.

This scenario was introduced by Yao [136] for the setting where $m_A$ and $m_B$ are classical messages consisting of bits. We compare this classical model to the corresponding quantum version, where $m_A$ and $m_B$ consist of qubits. We will see that for the very natural problem of equality, where $f(x,y) = 1$ if and only if $x = y$, there is an exponential savings in communication when qubits are used instead of classical bits. Classically, the problem of the bounded-error communication complexity of equality in the SMP model was open for almost twenty years, until Newman and Szegedy [102] exhibited a lower bound of about $\sqrt{n}$ bits. This is tight, since Ambainis [4] constructed a bounded-error protocol for this problem where the messages are $O(\sqrt{n})$ bits long (we describe a slightly less efficient classical protocol in Section 5.2). In contrast, Buhrman, Cleve, Watrous, and de Wolf [32] showed that in the quantum setting this problem can be solved with very little communication: only $O(\log n)$ qubits suffice.

5.1 Quantum fingerprints

In order to construct the efficient quantum SMP protocol for equality, we need to borrow ideas from the efficient classical randomized communication complexity protocol for equality from Section 3.1. Recall that in that protocol, Alice interprets her input $x$ as a polynomial $p_x(t) = \sum_{i=1}^{n} x_i t^{i-1}$ over some finite field $F$ of size $m$ (about $3n$), and then she picks a random point $a \in F$ and sends $a$ and $p_x(a)$ to Bob. The pair $a, p_x(a)$ is called a "fingerprint" of $x$, since it describes characteristics of $x$ that can aid in identifying it. Carrying out this fingerprinting procedure in superposition results in a quantum fingerprint of $x$:

$$|F_x\rangle = \frac{1}{\sqrt{m}} \sum_{a \in F} |a\rangle|p_x(a)\rangle.$$

Note that $|F_x\rangle$ consists of only $2\log m = 2\log n + O(1)$ qubits.

A nearly optimal $O(\sqrt{n}\log n)$ classical protocol for equality in the SMP model goes as follows. Alice produces a list of $k = O(\sqrt{n})$ random points $a_1, \ldots, a_k$ in $F$ and sends the list $\{(a_i, p_x(a_i))\}_{i=1}^{k}$ to the Referee. Bob does the same with respect to $y$, sending $\{(b_i, p_y(b_i))\}_{i=1}^{k}$ to the Referee. By the birthday paradox (see the footnote in Section 3.7), with constant probability there exist $i$ and $j$ such that both $a_i$ and $b_j$ equal the same field element $d$. In this case the Referee can compare $p_x(d)$ with $p_y(d)$. If $x = y$ then $p_x = p_y$, and hence $p_x(d) = p_y(d)$.
On the other hand, if $x \neq y$, then since $p_x$ and $p_y$ are different polynomials of degree at most $n-1$, with probability $\geq 2/3$ we have $p_x(d) \neq p_y(d)$. The protocol for the Referee is now clear: if the lists of Alice and Bob have a point $d$ in common, then the Referee outputs 1 if and only if $p_x(d) = p_y(d)$. If there is no point in common (which happens only with small probability) or if $p_x(d) \neq p_y(d)$, then the Referee outputs 0.

We now have everything in place to describe the quantum protocol for equality. Alice sends state $|F_x\rangle$ to the Referee and Bob sends $|F_y\rangle$. Note that if the Referee now measures $|F_x\rangle$ in the computational basis, then he will find a random point $a$ and the value $p_x(a)$, just like in the classical protocol described above. The Referee thus needs to do something smarter. The key observation is the following about the inner products between fingerprints:

$$\langle F_x | F_y \rangle \begin{cases} = 1 & \text{if } x = y \\ \leq 1/3 & \text{if } x \neq y \end{cases} \qquad (17)$$

If $x = y$ then clearly $\langle F_x | F_y \rangle = 1$. If $x \neq y$ then

$$\langle F_x | F_y \rangle = \frac{1}{m} \sum_{i,j \in F} \langle p_x(i)| \langle i | j \rangle |p_y(j)\rangle = \frac{1}{m} \sum_{i \in F} \langle p_x(i) | p_y(i) \rangle.$$

Since $p_x$ and $p_y$ are different polynomials of degree at most $n-1$, they have the same value $p_x(i) = p_y(i)$ for at most $n-1$ values of $i$. Hence the inner product is at most $\frac{n-1}{m} \leq \frac{1}{3}$.

Footnote (on the $O(\sqrt{n}\log n)$ classical SMP protocol): Ambainis's protocol from [4] gets rid of the $\log n$ factor.
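The bound on the fingerprint inner product can be verified by brute force for small inputs. The sketch below makes one assumption not fixed by the text: it takes $F$ to be the prime field $\mathbb{Z}_m$ with $m$ the smallest prime $\geq 3n$. The inner product $\langle F_x | F_y \rangle$ then equals the fraction of field elements on which $p_x$ and $p_y$ agree. The helper names are hypothetical.

```python
def next_prime(k):
    """Smallest prime >= k (trial division; adequate for small k)."""
    def is_prime(v):
        if v < 2:
            return False
        d = 2
        while d * d <= v:
            if v % d == 0:
                return False
            d += 1
        return True

    while not is_prime(k):
        k += 1
    return k


def poly_eval(x, a, m):
    """Evaluate p_x(t) = sum_i x_i t^(i-1) at t = a in Z_m, using Horner's rule."""
    acc = 0
    for coeff in reversed(x):
        acc = (acc * a + coeff) % m
    return acc


def fingerprint_overlap(x, y, m):
    """<F_x|F_y> = (number of a in Z_m with p_x(a) == p_y(a)) / m."""
    agree = sum(1 for a in range(m) if poly_eval(x, a, m) == poly_eval(y, a, m))
    return agree / m
```

With $n = 5$ and $m = 17$, for instance, every pair of distinct inputs should give an overlap of at most $4/17$, comfortably below $1/3$.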

Figure 6: Quantum circuit to test if $|\phi\rangle = |\psi\rangle$ or $|\langle \phi | \psi \rangle| \leq 1/3$.

This circuit first applies a Hadamard transform to a qubit that is initially $|0\rangle$, then SWAPs the other two registers conditioned on the value of the first qubit being $|1\rangle$, then applies another Hadamard transform to the first qubit and measures it. Here SWAP is the operation that swaps the states $|\phi\rangle$ and $|\psi\rangle$: $|\phi\rangle|\psi\rangle \mapsto |\psi\rangle|\phi\rangle$. The Referee receives $|\phi\rangle$ from Alice and $|\psi\rangle$ from Bob and applies the test to these two states. An easy calculation reveals that the outcome of the measurement is 1 with probability $(1 - |\langle \phi | \psi \rangle|^2)/2$. Hence if $|\phi\rangle = |\psi\rangle$ then we observe a 1 with probability 0, but if $|\langle \phi | \psi \rangle| \leq 1/3$ then this probability is $\geq 4/9$. Repeating this procedure with several individual fingerprints can make the error probability arbitrarily close to 0.

After the quantum fingerprinting scheme showed the power of quantum communication in the SMP model, a number of further results appeared. Yao [138] exhibited an efficient protocol for testing if the inputs $x$ and $y$ are at some constant Hamming distance $d$, while Gavinsky et al. [69] related quantum fingerprinting to a technique from machine learning which brings out its weaknesses. One can also study the variant of the SMP model where Alice and Bob start with a shared entangled state, but can only send classical messages to the Referee. Gavinsky et al. [68] exhibited a problem based on the Hidden Matching problem and a quantum protocol that solves it with $O(\log n)$ ebits and $O(\log n)$ classical bits of communication, while any quantum SMP protocol without prior entanglement needs to send at least about $(n/\log n)^{1/3}$ qubits. This shows that entanglement can reduce communication (even quantum communication!) exponentially, at least for relational problems in the SMP model. Finally, Gavinsky, Regev, and de Wolf [70] showed that if Alice's message to the Referee is allowed to be quantum, while Bob's message can only be classical, then the quantum advantages over purely classical protocols mostly disappear. In particular, the equality problem requires communication at least $\sqrt{n/\log n}$ in this hybrid case.

Footnote (on the exponential saving from entanglement in the SMP model): Recently, Gavinsky [66] extended this to a similar separation in the more standard two-way model.

6 Other Aspects of Quantum Non-Locality

In previous sections we studied a hierarchy of resources. In particular, we discussed and compared the correlations $P(a,b|x,y)$ that can be obtained using only shared randomness, by local measurements on entangled states, and finally those that can be obtained if communication between the parties is allowed. In this section we discuss an interesting set of correlations that lie between the last two classes.

To understand these new correlations, let us note that any correlations $P(a,b|x,y)$ obtained in a local hidden variable model or by local measurements on an entangled state must obey the following properties:

Positivity: $P(a,b|x,y) \geq 0$; (18)

Normalization: $\sum_{a,b} P(a,b|x,y) = 1$; (19)

No Signalling: $\sum_b P(a,b|x,y) = P(a|x)$ is independent of $y$, and $\sum_a P(a,b|x,y) = P(b|y)$ is independent of $x$. (20)

The last condition expresses the fact that Bob cannot transmit any information about his input $y$ to Alice, and similarly Alice cannot communicate to Bob any information about her input $x$. We are interested here in correlations that obey the above three conditions, but that cannot be obtained from local measurements on entangled states.

To illustrate this idea, suppose that Alice and Bob each have some kind of device (introduced independently in [79] and in [108]) such that Alice can provide an input $x \in \{0,1\}$ to her device and obtain an output $a \in \{0,1\}$; and Bob can provide an input $y \in \{0,1\}$ to his device and obtain an output $b \in \{0,1\}$, and such that the probabilities of the outputs given the inputs obey

$$P(a,b|x,y) = \begin{cases} 1/2 & \text{if } a \oplus b = x \wedge y \\ 0 & \text{otherwise.} \end{cases} \qquad (21)$$

Such a device is called a non-local (NL) box (other terminology in use is Popescu-Rohrlich (PR) box, in reference to [108]).

With this device Alice and Bob always obtain $a \oplus b = x \wedge y$, whereas we know that for local measurements on entangled quantum states this relation can only be satisfied with probability at most $\cos^2(\pi/8)$ under the uniform distribution on the inputs $x$ and $y$ (see Section 2.3 for a proof). Thus this is an "imaginary" device in the sense that it cannot be realized physically without Alice and Bob's devices being connected by some kind of communication channel. It is, however, an interesting resource to consider, since it is "stronger" than correlations that can be obtained from local measurements on entangled states, but "weaker" than actual communication.

A systematic study of the properties of correlations obeying the above three conditions was initiated in [12], and it was shown that they obey properties that one thinks of as genuinely quantum, such as monogamy and no-cloning [86]. They also allow for secure key distribution [11].

Because of the apparent "reasonableness" of the non-local box, Popescu and Rohrlich raised the question (in [108], and in fact well before this) why such correlations cannot be realized in nature without communication between the parties. The most straightforward answer is the technical proof in Section 2.3; however, one might seek a more intuitive or philosophical explanation. One possible approach is provided by communication complexity. It was shown by van Dam [50, 51], and also noted by one of the authors of the present review (Cleve), that if Alice and Bob have an unlimited amount of non-local boxes then all communication complexity problems become trivial:

Suppose Alice and Bob have an unlimited supply of non-local boxes, as described in Eq. (21). Suppose Alice receives input $x \in \{0,1\}^n$ and Bob receives input $y \in \{0,1\}^n$. Then communication complexity becomes trivial, in the sense that the value of any Boolean function $f(x,y) \in \{0,1\}$ can be computed with certainty with a single bit of communication from Alice to Bob.

To prove this, consider an arbitrary function $f : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$. It can be expressed as a Boolean circuit consisting of NOT and $\wedge$ (AND) gates, with inputs $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$. The idea is to represent the value of each gate of this circuit in terms of two shares, one possessed by Alice and the other by Bob. For a bit $a$, its representation as shares is any $(a', a'')$ where $a = a' \oplus a''$. Until the end of the protocol, Alice's information about each gate will be just the first bit of its share and Bob's information will be the second bit. They start by constructing shares of the input bits: $(x_i, 0)$ for each of Alice's input bits $x_i$ (Bob does not need to know $x_i$ to construct his share 0); and similarly $(0, y_i)$ for each of Bob's input bits $y_i$. For each gate in the circuit, if Alice and Bob collectively know the input bits as shares then they can produce the shares for the output bit without any communication. For each NOT gate, Alice merely negates her share (and Bob does nothing to his share). For each $\wedge$ gate, assume that the shares of inputs are $(a', a'')$ and $(b', b'')$. The shares of the output should be $(c', c'')$ such that

$$c' \oplus c'' = (a' \oplus a'') \wedge (b' \oplus b'') = (a' \wedge b') \oplus (a' \wedge b'') \oplus (a'' \wedge b') \oplus (a'' \wedge b''). \qquad (22)$$

Consider the four terms arising above. Since Alice possesses $a'$ and $b'$, she can easily compute $a' \wedge b'$, and similarly Bob can compute $a'' \wedge b''$. The difficult terms are $a' \wedge b''$ and $a'' \wedge b'$ because they contain bits that are spread between Alice and Bob, and this is where the non-local boxes are used. Alice and Bob use one non-local box to obtain bits $d'$ and $d''$ so that $d' \oplus d'' = a' \wedge b''$. They use a second non-local box to obtain $e'$ and $e''$ so that $e' \oplus e'' = a'' \wedge b'$. Then Alice sets her share to $c' = (a' \wedge b') \oplus d' \oplus e'$ and Bob sets his share to $c'' = (a'' \wedge b'') \oplus d'' \oplus e''$. Clearly,

$$c' \oplus c'' = (a' \wedge b') \oplus (d' \oplus d'') \oplus (e' \oplus e'') \oplus (a'' \wedge b'') = (a' \wedge b') \oplus (a' \wedge b'') \oplus (a'' \wedge b') \oplus (a'' \wedge b''), \qquad (23)$$

as required. At the end, Alice and Bob possess shares for the value of $f$, and Alice sends her one-bit share to Bob, enabling him to compute the value of $f$.

Is this result specific to the non-local boxes of the form Eq. (21) (in which case it could be viewed as some kind of anomaly in the space of all possible no-signalling correlations), or does it hold for other no-signalling correlations? In particular, does it hold for noisy correlations?
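As an aside before the noisy case, the share-based protocol for ideal non-local boxes can be sketched in a few lines. This is an illustrative simulation, not part of the original text: both parties are simulated in one process, the function names are hypothetical, and the example circuit computes the Intersection function using only NOT and AND gates (OR is obtained via De Morgan). In each pair of shares the first component is Alice's and the second is Bob's; every step uses only a party's own shares and its own box output.

```python
import random


def nl_box(x, y, rng):
    """Ideal non-local box of Eq. (21): uniformly random a, b with a XOR b = x AND y."""
    a = rng.randrange(2)
    return a, a ^ (x & y)


def share_not(s):
    """NOT gate: Alice negates her share, Bob leaves his unchanged."""
    return (1 - s[0], s[1])


def share_and(s, t, rng):
    """AND gate on shares s = (a', a'') and t = (b', b''), using two boxes."""
    a1, a2 = s
    b1, b2 = t
    d1, d2 = nl_box(a1, b2, rng)  # d' XOR d'' = a' AND b''
    e1, e2 = nl_box(b1, a2, rng)  # e' XOR e'' = b' AND a''
    c1 = (a1 & b1) ^ d1 ^ e1      # computed by Alice alone
    c2 = (a2 & b2) ^ d2 ^ e2      # computed by Bob alone
    return (c1, c2)


def share_or(s, t, rng):
    """OR built from NOT and AND gates (De Morgan)."""
    return share_not(share_and(share_not(s), share_not(t), rng))


def intersection_with_boxes(x, y, rng):
    """Compute OR_i (x_i AND y_i) with a single bit sent from Alice to Bob."""
    acc = (0, 0)  # shares of the constant 0
    for xi, yi in zip(x, y):
        term = share_and((xi, 0), (0, yi), rng)
        acc = share_or(acc, term, rng)
    alice_share, bob_share = acc
    # The only communication: Alice sends alice_share; Bob XORs it with his share.
    return alice_share ^ bob_share
```

The individual shares are random, but their XOR is always the correct gate value, which is why one bit of communication at the very end suffices.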
It was shown in [22] that the latter is the case, if one slightly adapts the definition of what it means for communication complexity to be trivial:

Suppose Alice and Bob have an unlimited supply of noisy non-local boxes whose outputs satisfy Eq. (21) with probability $p \geq \frac{3+\sqrt{6}}{6} \approx 0.908$. Then there exists a $q > 1/2$ (depending on $p$, but on no other parameter) such that, for any $n \geq 0$, if Alice receives input $x \in \{0,1\}^n$ and Bob receives input $y \in \{0,1\}^n$, then they can find with probability at least $q$ the value of any Boolean function $f(x,y) \in \{0,1\}$ with a single bit of communication from Alice to Bob.

Note that this result does not hold if Alice and Bob share entangled states instead of (noisy) non-local boxes. Indeed this follows from the result of [46], discussed in Section 3.8, that computing the inner product of two $n$-bit strings with success probability $q > 1/2$ requires $\Omega(n)$ bits of communication, even if Alice and Bob have an unlimited supply of entangled particles.

Thus the fact that communication complexity is not trivial (i.e., that some communication complexity problems are hard whereas others are easy) can be viewed as a partial characterization of the non-local correlations that can be obtained by local measurements on entangled particles. Is this a complete characterization? In particular, what is the exact noise threshold $p$ where non-local boxes with noise $p$ render communication complexity trivial? The current bounds on $p$ are:

$$85.4\% \approx \frac{2+\sqrt{2}}{4} \leq p \leq \frac{3+\sqrt{6}}{6} \approx 90.8\%.$$

A related question is: given noisy non-local boxes with success probability $p$, is it possible to produce, using only local operations, a non-local box with a success probability greater than $p$? For a first step in this direction, see [61].

As discussed in the previous section, there are correlations, such as the non-local box, that cannot be reproduced by local measurements on entangled particles, but that nevertheless obey the conditions of positivity, normalisation and no-signalling, Eqs. (18, 19, 20). More generally, we would like to understand within the space of all possible correlations $P = \{P(a,b|x,y)\}$ which ones can be obtained by using only shared randomness (i.e., by local hidden variable models), which ones can be realised by carrying out local measurements on entangled particles, and what are the ultimate limits set by Eqs. (18, 19, 20).

Answering this question would address the question raised by Popescu and Rohrlich mentioned above, and would give us basic insights into communication complexity. Indeed it would allow us to understand quantitatively the differences between shared randomness, shared entanglement, and non-local correlations, each of which can be viewed as a different resource for communication complexity. For instance, answering this question can have immediate implications for communication complexity in the entanglement model, at least in the case where Alice and Bob use only one round of communication.

Before addressing this question it is useful to understand better the geometry of non-local correlations. To this end we introduce Bell expressions, that is, linear combinations of the correlations

$$C(P) = \sum_{abxy} c_{abxy} P(a,b|x,y), \qquad (24)$$

where $c_{abxy}$ are real numbers. It is easy to show that the space of correlations that can be reproduced using local hidden variables (i.e., using only shared randomness) is a polytope. That is, it can be characterised by a finite number of inequalities, called Bell inequalities, of the form

$$C(P) \leq C_{lhv}. \qquad (25)$$

To compute the maximum value allowed by local hidden variable (LHV) models, we can restrict ourselves to deterministic models, where $a = a(x)$ is a function of input $x$, and $b = b(y)$ is a function of $y$. We then have

$$C_{lhv} = \max_{a(x), b(y)} \sum_{xy} c_{a(x)b(y)xy}.$$

If we consider local measurements on entangled quantum states, then we have bounds of the form

$$C(P) \leq C_{qm}, \qquad (26)$$

where

$$C_{qm} = \max \sum_{abxy} c_{abxy} \langle \psi | \Pi_a(x) \otimes \Pi_b(y) | \psi \rangle$$

and the maximum is taken over all states $|\psi\rangle$, and over all projective measurements $\{\Pi_a(x)\}$ (depending only on $x$) and projective measurements $\{\Pi_b(y)\}$ (depending only on $y$). (By projective measurements, we mean a set of projectors $\Pi_a = \Pi_a^2$ that sum to the identity, $\sum_a \Pi_a = I$.)
Recentlyit has been shown how the quantum value C lhv could be bounded by a hierarchy of semideﬁniteprograms [97, 98, 57], although the issue of whether this hierarchy converges remains open [117].If we impose only the no-signalling conditions, then we will have C ( P ) ≤ C no-signalling . (27)where the right hand side is the maximum of Eq. (24) subject to Eqs. (18, 19, 20). Note thatEqs. (18, 19, 20) deﬁne another polytope, the no-signalling polytope, and the maximum value of C ( P ) will be attained at a vertex of the polytope.Let us illustrate the above concepts by a speciﬁc kind of Bell expression, called XOR non-localgames [47]. In this particular case, the outputs a, b ∈ { , } are bits and we wish them to come asclose as possible to satisfying a condition of the form a ⊕ b = f ( x, y ) (28)for all x, y . The most celebrated example is the CHSH case, where x, y are also bits and thecondition is a ⊕ b = x ∧ y , see Eq. (7).In the case of XOR games, we take the constants c abxy in Eq. (24) to have the form: c abxy = w xy ( − a ⊕ b ⊕ f ( x,y ) = m xy ( − a ⊕ b (29)where w xy ≥ x, y , and m xy = w xy ( − f ( x,y ) . In the particular case of the CHSH expression, we take m xy = ( − x ∧ y , resulting inthe famous CHSH inequality.When considering LHV theories, it is convenient to deﬁne new variables A x = ( − a ( x ) and B y = ( − b ( y ) , whereupon the maximum value of the Bell expression reachable by LHV theories is C lhv = max A x ,B y ∈{ +1 , − } X xy m xy A x B y

In the case of local measurements carried out on entangled quantum states, we can write

$$\sum_{a,b} P(a,b|x,y)\,(-1)^{a}(-1)^{b} = \langle\psi|\, A_x \otimes B_y\, |\psi\rangle$$

where $|\psi\rangle$ is the quantum state shared by Alice and Bob, and $A_x$, $B_y$ are Hermitian operators with eigenvalues in $\{+1, -1\}$. We now use the following result of Tsirelson [129]:

Suppose Alice and Bob measure observables $A_x$ and $B_y$, both with eigenvalues in $\{+1, -1\}$, on a pure quantum state $|\psi\rangle \in \mathbb{C}^d \otimes \mathbb{C}^d$. Then there are real unit vectors $\alpha(x), \beta(y) \in \mathbb{R}^n$ (with $n$ depending only on $d$) such that for all $x$ and $y$, $\langle\psi|\, A_x \otimes B_y\, |\psi\rangle = \alpha(x) \cdot \beta(y)$.

Thus we can re-express the maximal value of $C$ attainable by quantum mechanics as

$$C_{\mathrm{qm}} = \max_{\alpha(x),\, \beta(y) \in \mathbb{R}^n} \sum_{xy} m_{xy}\, \alpha(x) \cdot \beta(y).$$

If we impose only the no-signalling conditions, then it is possible to satisfy Eq. (28) for all $x, y$ by choosing $P(ab|xy) = 1/2$ if $a \oplus b = f(x, y)$, and $P(ab|xy) = 0$ if $a \oplus b \neq f(x, y)$. Hence the maximum value of the game is

$$C_{\text{no-signalling}} = \sum_{xy} |m_{xy}|.$$

As illustration, in the case of the CHSH inequality, the results of Section 2.2 can be re-expressed as stating that $C_{\mathrm{lhv}} = 2$ and $C_{\mathrm{qm}} = 2\sqrt{2}$, and $C_{\text{no-signalling}} = 4$.

Interestingly, the ratio between the LHV value and the quantum value can be bounded independently of the number of inputs $x, y$ and the choice of matrix $m_{xy}$ by Grothendieck's constant $K_G$, as first noted by Tsirelson [129]:

$$C_{\mathrm{qm}} \le K_G\, C_{\mathrm{lhv}}.$$

A recent development of this line of work is the realisation that for certain Bell inequalities, a violation larger than a critical value, $C(P) > C_d$, guarantees that if the correlations are obtained by local measurements on an entangled quantum state, then the state belongs to a Hilbert space of dimension at least $d$ (i.e., Alice's and Bob's spaces each have dimension at least $d$) [30, 131, 25]. These Bell inequalities can thus be thought of as "dimension witnesses".
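The three CHSH values quoted above can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names and the particular choice of planar unit vectors are assumptions made for illustration. It brute-forces the deterministic LHV strategies, evaluates the vector form of the quantum value for one standard choice of vectors (which gives a lower bound on $C_{\mathrm{qm}}$ that happens to match $2\sqrt{2}$), and sums $|m_{xy}|$ for the no-signalling value.

```python
from itertools import product
from math import cos, sin, pi, sqrt, isclose

# CHSH coefficients m_xy = (-1)^(x AND y) for x, y in {0, 1}
m = {(x, y): (-1) ** (x & y) for x, y in product((0, 1), repeat=2)}


def lhv_value(m):
    """Brute-force maximum over deterministic strategies A_x, B_y in {+1, -1}."""
    best = float("-inf")
    for A in product((1, -1), repeat=2):
        for B in product((1, -1), repeat=2):
            best = max(best, sum(m[x, y] * A[x] * B[y] for (x, y) in m))
    return best


def vector_value(m, alpha, beta):
    """Evaluate sum_xy m_xy alpha(x).beta(y) for given real unit vectors."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))
    return sum(m[x, y] * dot(alpha[x], beta[y]) for (x, y) in m)


def unit(theta):
    """Unit vector in the plane at angle theta (illustrative helper)."""
    return (cos(theta), sin(theta))


# One standard choice of vectors (an assumption of this sketch)
alpha = {0: unit(0.0), 1: unit(pi / 2)}
beta = {0: unit(pi / 4), 1: unit(-pi / 4)}

c_lhv = lhv_value(m)                         # best deterministic LHV strategy
c_qm_lower = vector_value(m, alpha, beta)    # lower bound on C_qm from these vectors
c_ns = sum(abs(v) for v in m.values())       # no-signalling value, sum of |m_xy|
```

For other XOR games only the dictionary `m` needs to change; the brute-force search is exponential in the number of inputs, so it is practical only for small games.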
Consider a non-locality experiment in which Alice and Bob share an entangled quantum state and carry out local measurements on this state; or consider a quantum communication protocol in which Alice and Bob carry out several rounds of quantum communication and then carry out measurements on the quantum states. How many classical resources are required to reproduce these quantum experiments? The results from Sections 3 and 4 show that the classical resources must sometimes be larger, even exponentially larger, than the quantum resources. Is this the worst one can expect? What are good protocols to simulate the quantum experiments with classical resources? In this section we review progress on these questions. Note that we are of course not claiming that Nature works as in these simulations, but rather we are studying how one could mimic Nature with these alternative resources.

.3.1 When no communication is needed.

When states are very noisy, it may be possible to simulate local measurements on them using only shared randomness, even though the states are entangled. Werner's discovery of a family of states, now known as Werner states, for which such a simulation is possible [133] is one of the results of quantum information. Werner's model was restricted to local projective measurements. Later improvements include [3], and [10], where it was shown that simulations using only shared randomness can also exist when considering the more general case of local Positive Operator Valued Measures (POVMs), which are the most general kind of measurement allowed by quantum mechanics.

Let us first consider the very simple scenario where Alice wants to communicate a single qubit to Bob and Bob wants to carry out a projective measurement on the qubit. We can formalise this simple scenario as follows:

Simulation of one-way communication of a single qubit and subsequent projective measurement. Alice receives as input a normalized vector $\vec{x} \in \mathbb{R}^3$, with length $\|\vec{x}\| = 1$, which describes the quantum state $\rho = (I + \vec{x} \cdot \vec{\sigma})/2$, where $\vec{\sigma} = (X, Y, Z)$ is the vector of non-trivial Pauli matrices from Eq. (14); Bob receives as input a normalized vector $\vec{y} \in \mathbb{R}^3$, which describes his projective measurement $\vec{y} \cdot \vec{\sigma}$. Bob must output a bit $b$, with probabilities satisfying $P(b = 0\,|\,\vec{x}, \vec{y}) - P(b = 1\,|\,\vec{x}, \vec{y}) = \mathrm{Tr}(\rho\, \vec{y} \cdot \vec{\sigma})$.

We can generalise this to the case where Alice sends $n$ qubits to Bob, and Bob carries out a POVM on the $n$ qubits:

Simulation of one-way communication of $n$ qubits. Alice receives as input the classical description of a quantum state $|\psi\rangle$, for instance by giving her the values of the coefficients $c_i$ of the state in a standard basis, $|\psi\rangle = \sum_i c_i |i\rangle$. Bob is given the classical description of a measurement, for instance by giving him the matrix elements of the POVM elements $A_k$ in the standard basis. The task is for Bob to provide an outcome $k$, such that the probability of outcome $k$ occurring is $P(k|\psi) = \langle\psi| A_k |\psi\rangle$.

These are communication complexity scenarios where Alice and Bob's inputs are infinite-dimensional. If one allows for slight imperfections in the simulation, then one can truncate the description of the matrix elements of $|\psi\rangle$ and $A_k$, and make the number of input bits finite. For instance on Alice's side, if $|\psi\rangle$ corresponds to the quantum state of $n$ qubits, then one can truncate the number of inputs to $O(n 2^n)$ bits (by describing each coefficient $c_i$ with $O(n)$ bits of precision). If Alice then sends her truncated input to Bob, then we have, up to a small error, a classical simulation (using $O(n 2^n)$ bits) of any one-way quantum communication protocol in which $n$ qubits are sent from Alice to Bob.
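For the single-qubit scenario, the target statistics reduce to a dot product: since the Pauli matrices are traceless and $\mathrm{Tr}[(\vec{x} \cdot \vec{\sigma})(\vec{y} \cdot \vec{\sigma})] = 2\, \vec{x} \cdot \vec{y}$, one has $\mathrm{Tr}(\rho\, \vec{y} \cdot \vec{\sigma}) = \vec{x} \cdot \vec{y}$. The sketch below checks this numerically with hand-rolled 2×2 complex matrices; the helper names are illustrative assumptions, not notation from the paper.

```python
import math

# Identity and Pauli matrices as 2x2 nested lists
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def lin_comb(coeffs, mats):
    """Entry-wise linear combination sum_k coeffs[k] * mats[k]."""
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(2)]
            for i in range(2)]


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def normalize(v):
    """Rescale a non-zero 3-vector to unit length."""
    n = math.sqrt(sum(t * t for t in v))
    return tuple(t / n for t in v)


def bias(x, y):
    """P(b=0) - P(b=1) = Tr(rho * y.sigma) for rho = (I + x.sigma)/2."""
    rho = lin_comb((0.5, 0.5 * x[0], 0.5 * x[1], 0.5 * x[2]), (I2, X, Y, Z))
    observable = lin_comb(y, (X, Y, Z))
    product_matrix = matmul(rho, observable)
    return (product_matrix[0][0] + product_matrix[1][1]).real


x = normalize((1.0, 2.0, 2.0))
y = normalize((0.0, 3.0, 4.0))
expected = sum(a * b for a, b in zip(x, y))   # the dot product x.y
computed = bias(x, y)
```

Any classical simulation of this scenario therefore has to make Bob's output bit biased by exactly $\vec{x} \cdot \vec{y}$, for every pair of unit vectors.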
One cannot hope to do much better than this, since the HM problem of Section 3.7 exhibits an $n$ versus $2^{\Omega(\sqrt{n})}$ gap between the quantum and classical one-way communication complexity (and this was further strengthened to two-way classical communication complexity in [65]).

A Positive Operator Valued Measure (POVM) is a set $\{A_k\}$ of positive-semidefinite matrices that sum to identity: $\sum_k A_k = I$. When applied to a quantum system in state $\rho$, the probability of obtaining measurement outcome $k$ is $\mathrm{Tr}(A_k \rho)$.

.3.3 Entanglement simulation

We can also consider the case where Alice and Bob want to simulate local measurements on entangled quantum particles. The simplest non-locality scenario occurs when Alice and Bob carry out projective measurements on a single ebit:

Simulation of projective measurements on a single ebit. Alice and Bob each receive as input a normalized vector in $\mathbb{R}^3$, $\vec{x}$, $\vec{y}$ with $\|\vec{x}\| = \|\vec{y}\| = 1$, which describe their projective measurements $\vec{x} \cdot \vec{\sigma}$, $\vec{y} \cdot \vec{\sigma}$. Alice and Bob must each output a bit ($a$, $b$, respectively) such that the correlations obey

$$P(a = b\,|\,\vec{x}, \vec{y}) - P(a \neq b\,|\,\vec{x}, \vec{y}) = -\vec{x} \cdot \vec{y} = \langle\psi^-|\, \vec{x} \cdot \vec{\sigma} \otimes \vec{y} \cdot \vec{\sigma}\, |\psi^-\rangle,$$

where $|\psi^-\rangle = (|0\rangle|1\rangle - |1\rangle|0\rangle)/\sqrt{2}$, and such that the marginals, $P(a\,|\,\vec{x}, \vec{y})$ and $P(b\,|\,\vec{x}, \vec{y})$, are uniform (i.e., $P(a = 0\,|\,\vec{x}, \vec{y}) = P(a = 1\,|\,\vec{x}, \vec{y}) = 1/2$, etc.).

This can be generalized to the case where Alice and Bob carry out POVMs on arbitrary entangled states of $n$ qubits:

Simulation of entangled states of dimension $2^n$. Alice and Bob share a classical description of a pure entangled quantum state $|\psi\rangle_{AB}$, where Alice and Bob's systems are each of dimension $2^n$. Alice and Bob receive as inputs $x, y$ the classical (infinite-dimensional) descriptions of the measurements they should do (for instance the inputs could consist of the matrix elements of the POVM elements in a standard basis). Alice and Bob must provide outputs $a, b$ such that the joint probability $P(a, b|x, y)$ equals the probability of getting measurement outcomes $a$ and $b$ when measurements $x$ and $y$ are carried out on state $|\psi\rangle_{AB}$.

If we have a simulation of one-way quantum communication, then we can transform it into a simulation of entanglement. To see this, note that one can rewrite the joint probabilities as $P(a, b|x, y) = P(a|x)\, P(b|x, y, a)$. The simulation is then as follows: Alice chooses $a$ according to the probability distribution $P(a|x)$; she then sends Bob sufficient information so that he can choose an output $b$ distributed according to $P(b|x, y, a)$. It is easy to show that for this second task (producing $b$ distributed according to $P(b|x, y, a)$) it suffices for Alice to send Bob the measurement outcome, and to describe to him the state onto which his system is projected after Alice's measurement. Using this correspondence, we thus have a protocol which provides, up to a small error, a classical simulation (using $O(n 2^n)$ bits of one-way communication) of any measurement on entangled states of $n$ qubits.

Remarkably it is also possible, at least in some cases, to perfectly simulate the quantum communication or quantum entanglement scenarios with finite classical communication. In such perfect simulations we do not tolerate any error.
Of course such exact simulations are in principle not necessary if one wants to interpret the results of real experiments, as any real experiment will always have small imperfections. But these exact simulations are interesting for at least two reasons. On [...]

We can assume without loss of generality that Alice's POVM elements all have rank 1, which implies that conditional on the measurement outcome, Bob's state is pure.

[...] average amount of classical communication is bounded (but in the worst case the amount of classical communication may be infinite). This model was first used in [92, 124] in the context of classical simulation of a single ebit. In [89] this approach was generalized to the simulation of communicating $n$ qubits, or the simulation of POVM measurements on $n$ ebits, using $O(n 2^n)$ bits of two-way classical communication on average.

A stronger and more interesting model is when the amount of classical communication is bounded (even in the worst case). This model was introduced in [23]. The simulations were improved, and in [127] it was shown that the classical simulation of projective measurements on a single ebit could be realized with a single bit of classical communication from Alice to Bob, and the communication of a single qubit could be simulated with 2 bits of communication. Note that these simulations use an infinite amount of shared randomness, a requirement that was shown in [89] to be necessary when the amount of communication is bounded (in the worst case).

An even stronger model for the simulation of entanglement is for Alice and Bob to use as a resource non-local boxes, rather than classical communication. Indeed, as discussed in Section 6.1, one bit of classical communication can be used to realize a non-local box, but a non-local box cannot be used to communicate. It was shown in [42] that simulating projective measurements on a single ebit could be carried out with the use of a single non-local box.
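To make the one-bit simulation of a single ebit concrete, here is a Monte Carlo sketch of a protocol of this kind, following the construction commonly attributed to Toner and Bacon. Whether it coincides in every detail with the protocol of [127] is an assumption of this sketch. The shared randomness is a pair of independent uniform unit vectors $\lambda_1, \lambda_2$; Alice outputs $-\mathrm{sgn}(\vec{x} \cdot \lambda_1)$ and sends the bit $c = \mathrm{sgn}(\vec{x} \cdot \lambda_1)\,\mathrm{sgn}(\vec{x} \cdot \lambda_2)$; Bob outputs $\mathrm{sgn}(\vec{y} \cdot (\lambda_1 + c\,\lambda_2))$. Outputs are written as $\pm 1$ rather than bits, so the average of the product estimates $P(a = b) - P(a \neq b)$.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform point on the unit sphere in R^3 (normalized Gaussian vector)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(t * t for t in v))
        if n > 1e-12:
            return [t / n for t in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sgn(t):
    return 1 if t >= 0 else -1


def one_round(x, y, rng):
    """One run of the protocol; returns outputs (a, b) in {+1, -1}."""
    lam1 = random_unit_vector(rng)   # shared randomness
    lam2 = random_unit_vector(rng)   # shared randomness
    a = -sgn(dot(x, lam1))
    c = sgn(dot(x, lam1)) * sgn(dot(x, lam2))   # the single communicated bit
    b = sgn(dot(y, [l1 + c * l2 for l1, l2 in zip(lam1, lam2)]))
    return a, b


def estimate_correlation(x, y, rounds=100_000, seed=0):
    """Monte Carlo estimate of E[a*b]; the target value is -x.y."""
    rng = random.Random(seed)
    total = 0
    for _ in range(rounds):
        a, b = one_round(x, y, rng)
        total += a * b
    return total / rounds
```

The estimate carries statistical error of order $1/\sqrt{\text{rounds}}$, so it illustrates the protocol rather than proving that the simulation is exact.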
A unified approach to protocols simulating a single ebit with one bit of communication or with one non-local box was presented in [54].

In this section we put a constraint on the quantum model. We will suppose that any measurement on a quantum system gives the results predicted by quantum mechanics with probability $\eta$, and does not give any result with probability $1 - \eta$.

The motivation for considering this model is that most quantum communication experiments use photons. Photons are very practical because they can be quite easily produced, manipulated, transmitted over long distances, and measured. Unfortunately photons get absorbed during transmission (in commercial optical fibers, photons have approximately 50% probability of being absorbed after travelling 15 km), and single-photon detectors have limited efficiency: they will sometimes not detect a photon even though it is present. These effects can be described by the above model.

In most experiments to date, the detector efficiency $\eta$ was so low that the correlations could be explained by a classical model using shared randomness and no communication (a local hidden variable model). This is called the Detection Loophole [106]. Thus for instance in the CHSH experiment, the correlations can be explained by a local hidden variable model if $\eta \le 2/(1 + \sqrt{2}) \simeq 0.83$.

Communication complexity suggests that by increasing the dimension $d$ of the entangled system under study, one can decrease exponentially (in $d$) the required efficiency of the detectors. Indeed, it appears that in many cases the minimum number $c$ of bits of classical communication required to reproduce the quantum correlations is related to the minimum efficiency of the detectors required for the correlations to be non-local by $\eta \ge 2^{-O(c)}$.
That there should be a relation between $c$ and $\eta$ was first noted in [71] and further studied in [87, 35, 36].

To understand this relation we will compare two classical schemes:

• In the first scheme, which was discussed at length in Sections 2 and 4, the detectors have 100% efficiency, the parties have shared randomness and may exchange up to $c$ bits of classical communication.

• In the second scheme, the parties have shared randomness, and each party has a detector of efficiency $\eta$. This means that each party will with probability $\eta$ give an output, and with probability $1 - \eta$ produce no output. The detectors are assumed to be independent, so that the probability that both detectors give an output is $\eta^2$. In the physics terminology this would be called a local hidden variable model with detector efficiency $\eta$. (We will also consider below the case where one of the detectors has efficiency $\eta$, and the other always gives a result, i.e., is 100% efficient.)

These two schemes can be related in a number of ways. The simplest relation is:

Any classical protocol with $c$ bits of communication can be mapped into a classical protocol with no communication but with detector efficiency $\eta = 2^{-c}$.

This mapping is very simple: Alice and Bob use shared randomness $r$ which is uniformly distributed over all possible conversations. Each party checks whether $r$ is a conversation that is consistent with their input. If it is, then they give the corresponding output; if it is not, then they don't give any output. The probability that both Alice and Bob give an output is at least $2^{-c}$.

This protocol is not perfect since the probability that the parties give an output may differ from one party to the other, or from one input to the other.
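The mapping above can be sketched on a toy example. The protocol chosen here is an illustrative assumption, not one taken from the paper: a one-bit protocol ($c = 1$) that wins the CHSH game $a \oplus b = x \wedge y$, in which Alice sends her input $x$ and outputs $a = 0$, and Bob outputs $b = x \wedge y$. In the communication-free version, the shared random bit $r$ plays the role of the guessed conversation, and `None` models a detector that does not click.

```python
from fractions import Fraction
from itertools import product


def alice_with_communication(x):
    """Alice sends the message m = x and outputs a = 0."""
    return x, 0


def bob_with_communication(y, message):
    """Bob outputs b = message AND y, so that a XOR b = x AND y."""
    return message & y


def alice_no_communication(x, r):
    """Alice outputs only if the guessed conversation r is what she would have sent."""
    message, a = alice_with_communication(x)
    return a if message == r else None   # None models 'no output'


def bob_no_communication(y, r):
    """Every received bit is consistent with Bob's input, so he always outputs."""
    return bob_with_communication(y, r)


def statistics(x, y):
    """Exact P(both parties output) and P(win | both output) for inputs x, y."""
    conversations = (0, 1)   # all 2^c conversations for c = 1
    both = wins = 0
    for r in conversations:
        a = alice_no_communication(x, r)
        b = bob_no_communication(y, r)
        if a is not None and b is not None:
            both += 1
            wins += int((a ^ b) == (x & y))
    return Fraction(both, len(conversations)), Fraction(wins, both)


results = {(x, y): statistics(x, y) for x, y in product((0, 1), repeat=2)}
```

In this toy case both parties output with probability exactly $2^{-c} = 1/2$, and the game is always won when they do. It also shows the imperfection noted above: Bob outputs every time while Alice outputs only half the time.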
What is interesting is that in a number ofcases the converse holds: if the quantum correlations cannot be reproduced with less than c bits of39ommunication, then they can be reproduced without communication only if the detector eﬃciency η is less than 2 − Ω( c ) .A ﬁrst example where this converse occurs, is when bounds on c and on the minimum detectioneﬃciency η can be obtained from the size of monochromatic rectangles (see Appendix B for abrief presentation of this notion). This approach was implicit in [87] where it was shown that thecorrelations of the distributed Deutsch-Jozsa problem could not be reproduced by a local hiddenvariable model if η ≥ O ( n / )2 − . n when the inputs consist of n -bit strings, and hence the partiesuse a maximally entangled system of dimension n . Using the size of monochromatic rectangles wasexploited more fully in [35] in the context of a multipartite communication complexity problem, andthen extended in [36] to take into account the possibility of errors. In particular, in [36] it was shownhow one could obtain a lower bound c ≥ B R on the minimum amount of communication required toreproduce the correlations, where B R is a function of the size and discrepancy of rectangles. It thenfollowed that the correlations could be obtained by a local hidden variable model with detectors ofeﬃciency η only if η ≤ − B R /n (where n is the number of parties). If the rectangle lower bound on c is close to tight, then this implies the relation we mentioned above between c and η . Another interesting example arises if we suppose that Alice’s detector is ineﬃcient, but that Bob’sdetector is perfect. This situation is motivated by the experimental situation reported in [96], wherean ion is entangled with a photon. As discussed above, the measurements on the ion can be donewith 100% eﬃciency, whereas those on the photon will be ineﬃcient. 
The problem in which Alice's detector is inefficient and Bob's detector is perfect was previously investigated from the point of view of the detection loophole in [40, 29] for entangled systems of dimension 2.

We prove in Appendix D that the Hidden Matching problem is particularly well adapted to this asymmetric scenario. Namely we show that:

Suppose Alice and Bob try to implement the Hidden Matching problem using $\log n$ ebits, as discussed in Section 3.7. Suppose that Alice's detector has efficiency $\eta$ whereas Bob's detector has 100% efficiency. Then the correlations obtained by measuring the ebits cannot be reproduced by a classical model without communication if $\eta \ge 2^{-\Omega(\sqrt{n})}$, even allowing for a small error probability.

To our knowledge, this is the first time it is shown that an exponentially small detection efficiency can be tolerated when allowing for a small error probability.

During the past decades there have been many experiments that studied the correlations exhibited by measurements on entangled quantum particles. Their main aim was to test quantum mechanics by comparing its predictions with those of hidden variable models. The short result is that the predictions of quantum mechanics have always been verified to very high precision. However, up to now some "loopholes" have always been left open, which allow the possibility of explaining the data with (admittedly contrived) local hidden variable models.

We very briefly review how experiments on quantum non-locality have been improved during the past decades. We then discuss how the insights from communication complexity suggest new experimental challenges. We also discuss experimental realizations of quantum communication complexity.

After the initial experiment by Freedman and Clauser [63] on the correlations exhibited by entangled photons, the first qualitative advance was the experiments of Aspect that used time-varying analyzers in order to close the locality loophole.
Indeed in previous experiments the measurements were kept fixed for long periods of time while experimental results were accumulated, then the measurements were changed and a new set of data was acquired for the new measurement setting. In Aspect's experiment [6] the measurement settings changed periodically in time. In the later experiment of Weihs et al. [132], the measurement settings were chosen at random using a quantum random number generator.

Another important advance of the experiments of Aspect et al. [7, 8] was a very precise check that the measured correlations coincide with the correlations $P_{QM}(ab|xy)$ predicted by quantum mechanics for local measurements on a maximally entangled state of two particles (earlier experiments were much more imprecise).

Some other noteworthy advances:

• Non-locality experiments in which the two particles were separated by a large distance of 10 km [126] and 50 km [85];

• Non-locality experiments on bipartite entangled systems of dimension 3 [130, 125];

• Non-locality experiments on entangled states of three [104, 109] and four particles [116, 140].

In all the above experiments the detection loophole was not closed. This means that the raw data acquired during the experiment could be explained by a local hidden variable model. It was only by making the (physically very reasonable) assumption that the events in which the detector gives a click are independent of the measurement settings and measurement results (known in the physics literature as the "fair sampling assumption") that these experiments could be assumed to be in contradiction with local hidden variable models.

As mentioned above, there have now been two experiments involving ions in which the detection loophole has been closed. In the first, the two entangled ions were separated by about 3 µm [115]; in the second, presented in more detail in Fig. 7, the two entangled ions were separated by about a meter [91].
In view of these advances, closing both the locality and detection loopholes simultaneously does not seem out of reach.

From the point of view of communication complexity, closing the detection loophole is more important than closing the locality loophole. Indeed, if the detection loophole is not closed, it means that the raw data can be explained by a model without communication. On the other hand, if the detection loophole is closed, then, by sharing the entanglement, the parties have a resource that could only be reproduced classically by communication between the parties. The same is true in other applications of quantum non-locality: closing the detection loophole (but not necessarily the locality loophole) allows one to increase the security of quantum key distribution [2].

The progress in quantum communication complexity points the way towards new tests of quantum non-locality which use not one ebit, as in the CHSH test, but many ebits. Ideas for these new tests come from the entanglement-based Deutsch-Jozsa problem discussed in Section 4, the entanglement-based Hidden Matching problem discussed in Section 3.7, recent work of Gavinsky [66], and also the (non-constructive) results on three-party correlations reported in [107]. There are at least two motivations for such experiments. First of all they could be more robust against experimental imperfections (such as the detection loophole or errors) than non-locality tests used at present. Second they could illustrate the efficiency of quantum mechanics over classical mechanics, as experiments on a small number $e$ of ebits could only be reproduced classically using an exponentially large (in $e$) amount $c$ of classical communication.

Figure 7: Bell inequality with two remote atomic qubits. The left-hand side is a schematic description of the experiment reported in [91] in which the internal states of two ions separated by about one meter were entangled. Measurements on the two ions then allowed the violation of the CHSH inequality with the detection loophole closed. A series of laser pulses simultaneously excite both Yb+ ions in such a way that when they deexcite, they emit a photon whose polarization is entangled with the ion. A lens is used to couple the photons into optical fibers. The wave plate (λ/4) is used for convenience to convert circular polarization into linear polarization. The two photons interfere on a Beam Splitter (BS) and are detected by Photo Multiplier Tubes (PMT). Simultaneous detection of a photon by the two PMTs signals that the photons were in a Bell state, thereby realizing entanglement swapping: the two ions are now entangled. The internal states of the ions are then measured, enabling a violation of the CHSH inequality. Note that there are many inefficiencies in this experiment: only a fraction of emitted photons are coupled into the optical fibers, and only a fraction of the photons reaching the PMTs are detected. But when two photons are detected, one knows with certainty that the two ions are entangled. The right-hand side is a photograph of one of the ion traps. The other trap is similar, and located about one meter away on the same optical table. (Both figures courtesy of S. Monroe and D. Matsukevich; left-hand side panel copyright American Physical Society).

These non-locality experiments on many ebits can be characterized by several parameters. In particular these would include the number $e$ of ebits involved, or equivalently the dimension $d = 2^e$ of the entangled quantum system; the minimum detector efficiency $\eta$ required for the correlations to be non-local; the amount $\epsilon$ of errors that can be tolerated; and the amount $c$ of classical communication that would be required to reproduce the quantum correlations.
In general,for any given non-locality test, we can expect tradeoﬀs between η , ǫ and c .An important point to note is that the proposals inspired by communication complexity typicallyare asymptotic results that deal with the limit where the number of ebits tends towards inﬁnity: e → ∞ . However real experiments will deal with small values of e . For instance, if we think ofthe detection loophole, one should recall that this is only a problem for experiments dealing withentangled photons. On the other hand, the Hilbert space of a single photon can be larger than 2.One can thus eﬀectively manipulate more than one qubit, while manipulating only a single photon.This is potentially an interesting opportunity. Indeed it would be very interesting to devise non-locality experiments that tolerate ineﬃcient detectors (say η < d = 10). If one could devise such a non-locality experiment, there would be a strongincentive to realize it experimentally. Indeed whereas experiments involving entangled atoms orions may be the short-term solution to solving the detection loophole, such experiments are muchslower and much more expensive than experiments involving photons only. Numerical searches forsuch a non-locality experiment have been undertaken, but unsuccessfully so far [90].In summary, quantum communication complexity suggests the possibility of new non-localityexperiments on a moderate number of ebits that either are very resistant to imperfections, or requirevery large amounts of classical resources to reproduce classically. Realizing such experiments willrequire further progress on the theoretical and experimental side. The experimental situation concerning communication complexity proper is less advanced. Indeed,in order to carry out any nontrivial experimental demonstration of communication complexity, oneneeds to take into account the limited eﬃciency of detectors which has been such a plague for non-locality experiments. 
In this respect, the first convincing communication complexity experiment to date is that reported in [128] in which 6 parties, materialized by waveplates along a beam on an optical table, carried out the communication complexity problem proposed in [45, 34, 31], but in the version proposed in [64], which does not use entanglement. In this experiment the limited efficiency of detectors was explicitly taken into account. Experiments that studied the entanglement-based version of this problem while explicitly taking into account the limited efficiency of the detectors have also been reported [139], based on the proposal of [41].

Another protocol which has been studied experimentally is quantum fingerprinting, which in the SMP model performs exponentially better than classical protocols (see Section 5.3). The possibility of realizing such an experiment at a small scale involving one or a few photons has been discussed in [53, 88], and later performed using photons [76] and in NMR [58].

In the future we may expect further proof-of-principle experiments of quantum communication complexity involving the exchange of more qubits and larger distance between the parties. Good candidates for such experiments are Raz's communication complexity problem, the Hidden Matching problem and its extensions, and quantum fingerprinting.

Quantum communication complexity and quantum non-locality are by now mature fields. But many questions remain open. Here we collect a few.

1. Additional natural problems in quantum communication complexity. Find additional problems (if possible natural problems that could have potential applications) for which quantum communication is much more efficient than classical communication.

2. How much entanglement is needed to get a reduction of communication: equivalence of quantum communication and entanglement models of communication complexity. In the entanglement model of communication complexity, the parties have an unlimited supply of entanglement and use it to reduce the amount of classical communication. How much entanglement is really needed? In classical communication complexity with shared randomness, Newman's Theorem [101] states that, if we allow a small increase in the error probability, the parties need only have $O(\log n)$ shared random bits (where $n$ is the size of the inputs). Does something similar hold when we replace shared randomness by entanglement? Answering this question would essentially establish whether the quantum communication and the entanglement models of communication complexity are equivalent.

3. Are most quantum states useful for communication complexity? It was recently shown in [73] that most $n$-qubit states (with respect to the uniform measure) are not useful (they are typically too entangled) in the measurement-based version of quantum computation. Are most states useful for communication complexity? For two parties the answer is yes, as they can work in the Schmidt basis. But consider three parties sharing a random state of $3n$ qubits (each party having $n$ qubits). How useful are most states for communication complexity (asymptotically as $n$ tends to infinity)?

4. Find new non-local games, qualitatively different from existing ones. In particular consider the following more specific subquestions:

• For two-party XOR games, the ratio between the classical and the quantum value of the game is bounded by a constant. However in [107] it was shown, using a non-constructive proof, that this is not the case for three-party games. Can one exhibit an explicit example of this type?

• Find Bell inequalities involving rather small systems, say where the dimension of each party's Hilbert space is less than $d = 10$, which allow for very small detector efficiencies.

5. Non-local boxes and communication complexity. As discussed in Section 6.1, non-local boxes are an interesting resource to consider from the point of view of communication complexity. In this regard, two interesting questions are:

• First, what is the noise threshold below which non-local boxes make communication complexity trivial (see [22] for a formulation of this problem)? Is this threshold the maximum value $p = (2 + \sqrt{2})/4$?

• Second, is it possible to amplify non-local correlations, in the sense that given a large number of devices that will produce correlations $P(ab|xy)$ corresponding to PR boxes with noise $p$, is it possible to use the devices in such a way as to produce correlations with a lower value of $p$? A first result in this direction can be found in [61].

6. Simulation of quantum correlations and quantum communication. In this context, some questions that come to mind are:

• Exact simulation of more than one qubit or ebit using bounded classical communication (in the worst case) or non-local boxes. Some preliminary results on this topic have been obtained in the particular case where Alice and Bob carry out measurements with binary outcomes [55, 113].

• The simulation of non-maximally entangled states using non-local boxes. This appears to be much harder than the simulation of maximally entangled states; see [28, 27] for some first results.

• The simulation of multipartite non-local correlations.

Communication complexity is a task for which quantum information can beat classical information. Such tasks are rare, and finding more potential applications of quantum information is very important.

Unfortunately most quantum communication complexity problems are either extremely sensitive to noise, highly contrived, or do not offer exponential gains over the best classical protocols (in which case the advantages of quantum communication will probably be more than offset by the lower cost and higher speed of classical communication). The most interesting proposal so far is maybe the SMP model without shared randomness (a somewhat contrived model) where equality (a very natural problem) can be solved exponentially more efficiently using quantum communication. Thus there is the tantalizing possibility that some time in the future, quantum communication complexity could be used in practical applications.

Independently of whether quantum communication complexity ever finds some real-world applications, the results obtained so far have important conceptual implications. First of all they offer new insights into the power of quantum information, and in particular of quantum computing. Indeed the basic aim of computer science, taken in a wide sense, is to accomplish a task by using the minimum amount of resources. In the usual formulation, the resource that we want to minimize is the running time of the computer. This is the most important application of quantum computing, as Shor's algorithm suggests that a quantum computer would allow exponential speedups. But in this context it is very difficult, if not impossible, to prove that quantum computers are more powerful than classical computers. The advantage of quantum computation can however be proven in simpler contexts such as the black-box model of quantum computing, where the resource that is quantified is the number of calls to an oracle; or communication complexity, where the resource that is quantified is the amount of communication.
The existence of these models where it can be45igorously shown that quantum information oﬀers important advantages over classical informationreinforces our conﬁdence that quantum computers are much more powerful than classical computersfor certain tasks.Second, the study of quantum communication complexity has led to the proposal of new tests ofquantum mechanics. Indeed from Bell onwards it was known that if one wants to replace quantummechanics by a classical model, this classical model would have to use faster than light signalling.The discovery of fast quantum algorithms suggested that such a classical model would use anexponentially large number of resources. Quantum communication complexity has now advancedto the point where it may be possible to propose experiments in which one can prove that a classicalsimulation would require exponentially more resources than are used quantum mechanically.In summary, quantum communication complexity is now a mature ﬁeld that has led to somefundamental insights into the nature of computation and the foundations of physics.

References

[1] S. Aaronson and A. Ambainis. Quantum search of spatial regions. In Proceedings of 44th IEEE FOCS, pages 200–209, 2003. quant-ph/0303041.
[2] A. Acin, N. Brunner, N. Gisin, S. Massar, S. Pironio, and V. Scarani. Device-independent security of quantum cryptography against collective attacks. Physical Review Letters, 98:230501, 2007.
[3] A. Acín, N. Gisin, and B. Toner. Grothendieck’s constant and local models for noisy entangled quantum states. Physical Review A, 73:062105, 2006.
[4] A. Ambainis. Communication complexity in a 3-computer model. Algorithmica, 16(3):298–301, 1996.
[5] P. K. Aravind. A simple demonstration of Bell’s theorem involving two observers and no probabilities or inequalities. quant-ph/0206070, 2002.
[6] A. Aspect, J. Dalibard, and G. Roger. Experimental test of Bell’s inequalities using time-varying analyzers. Physical Review Letters, 49:1804, 1982.
[7] A. Aspect, Ph. Grangier, and G. Roger. Experimental tests of realistic local theories via Bell’s theorem. Physical Review Letters, 47:460, 1981.
[8] A. Aspect, Ph. Grangier, and G. Roger. Experimental realization of Einstein-Podolsky-Rosen-Bohm Gedankenexperiment: A new violation of Bell’s inequalities. Physical Review Letters, 49:91, 1982.
[9] Z. Bar-Yossef, T. S. Jayram, and I. Kerenidis. Exponential separation of quantum and classical one-way communication complexity. In Proceedings of 36th ACM STOC, pages 128–137, 2004.
[10] J. Barrett. Nonsequential positive-operator-valued measurements on entangled mixed states do not always violate a Bell inequality. Physical Review A, 65(4):042302, Mar 2002.
[11] J. Barrett, L. Hardy, and A. Kent. No signaling and quantum key distribution. Physical Review Letters, 95(1):010503, Jun 2005.
[12] J. Barrett, N. Linden, S. Massar, S. Pironio, S. Popescu, and D. Roberts. Nonlocal correlations as an information-theoretic resource. Physical Review A, 71:022101, 2005.
[13] J. S. Bell. On the Einstein-Podolsky-Rosen paradox. Physics, 1:195–200, 1965.
[14] C. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, and W. Wootters. Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Physical Review Letters, 70:1895–1899, 1993.
[15] C. Bennett and S. Wiesner. Communication via one- and two-particle operators on Einstein-Podolsky-Rosen states. Physical Review Letters, 69:2881–2884, 1992.
[16] C. H. Bennett, H. J. Bernstein, S. Popescu, and B. Schumacher. Concentrating partial entanglement by local operations. Physical Review A, 53(4):2046–2052, Apr 1996.
[17] C. H. Bennett and G. Brassard. Quantum cryptography: Public key distribution and coin tossing. In Proceedings of the IEEE International Conference on Computers, Systems and Signal Processing, pages 175–179, 1984.
[18] C. H. Bennett, G. Brassard, S. Popescu, B. Schumacher, J. A. Smolin, and W. K. Wootters. Purification of noisy entanglement and faithful teleportation via noisy channels. Physical Review Letters, 76(5):722–725, Jan 1996.
[19] R. Bhatia. Matrix Analysis. Number 169 in Graduate Texts in Mathematics. Springer-Verlag, New York, 1997.
[20] G. Brassard. Quantum communication complexity. Foundations of Physics, 33(11):1593–1616, 2003. quant-ph/0101005.
[21] G. Brassard, A. Broadbent, and A. Tapp. Quantum pseudo-telepathy. Foundations of Physics, 35(11):1877–1907, 2005.
[22] G. Brassard, H. Buhrman, N. Linden, A. A. Méthot, A. Tapp, and F. Unger. Limit on nonlocality in any world in which communication complexity is not trivial. Physical Review Letters, 96:250401, 2006.
[23] G. Brassard, R. Cleve, and A. Tapp. Cost of exactly simulating quantum entanglement with classical communication. Physical Review Letters, 83(9):1874–1877, Aug 1999. arXiv:quant-ph/9901035.
[24] G. Brassard, P. Høyer, M. Mosca, and A. Tapp. Quantum amplitude amplification and estimation. In Quantum Computation and Quantum Information: A Millennium Volume, volume 305 of AMS Contemporary Mathematics Series, pages 53–74. 2002. quant-ph/0005055.
[25] J. Briët, H. Buhrman, and B. Toner. A generalized Grothendieck inequality and entanglement in XOR games. Arxiv preprint arXiv:0901.2009, 2009.
[26] Č. Brukner, M. Żukowski, J.-W. Pan, and A. Zeilinger. Bell’s inequalities and quantum communication complexity. Physical Review Letters, 92(12):127901, Mar 2004.
[27] N. Brunner, N. Gisin, S. Popescu, and V. Scarani. Simulation of partial entanglement with no-signaling resources. arXiv:0803.2359, 2008.
[28] N. Brunner, N. Gisin, and V. Scarani. Entanglement and non-locality are different resources. New Journal of Physics, 7:88, 2005.
[29] N. Brunner, N. Gisin, V. Scarani, and C. Simon. Detection loophole in asymmetric Bell experiments. Physical Review Letters, 98:220403, 2007.
[30] N. Brunner, S. Pironio, A. Acin, N. Gisin, A. A. Méthot, and V. Scarani. Testing the dimension of Hilbert spaces. Physical Review Letters, 100(21):210503, 2008.
[31] H. Buhrman, R. Cleve, and W. van Dam. Quantum entanglement and communication complexity. SIAM Journal on Computing, 30(8):1829–1841, 2001. quant-ph/9705033.
[32] H. Buhrman, R. Cleve, J. Watrous, and R. de Wolf. Quantum fingerprinting. Physical Review Letters, 87(16), September 26, 2001. quant-ph/0102001.
[33] H. Buhrman, R. Cleve, and A. Wigderson. Quantum vs. classical communication and computation. In Proceedings of 30th ACM STOC, pages 63–68, 1998. quant-ph/9802040.
[34] H. Buhrman, W. van Dam, P. Høyer, and A. Tapp. Multiparty quantum communication complexity. Physical Review A, 60(4):2737–2741, 1999. quant-ph/9710054.
[35] H. Buhrman, P. Høyer, S. Massar, and H. Röhrig. Combinatorics and quantum nonlocality. Physical Review Letters, 91(047903), 2003. quant-ph/0209052.
[36] H. Buhrman, P. Høyer, S. Massar, and H. Röhrig. Multipartite nonlocal quantum correlations resistant to imperfections. Physical Review A, 73(012321), 2006. quant-ph/0410139.
[37] A. Cabello. All versus Nothing inseparability for two observers. Physical Review Letters, 87(1):010403, Jun 2001.
[38] A. Cabello. Bell’s theorem without inequalities and without probabilities for two observers. Physical Review Letters, 86(10):1911–1914, Mar 2001.
[39] A. Cabello. Stronger two-observer all-versus-nothing violation of local realism. Physical Review Letters, 95(21):210401, 2005.
[40] A. Cabello and J.-A. Larsson. Minimum detection efficiency for a loophole-free atom-photon Bell experiment. Physical Review Letters, 98:220402, 2007.
[41] A. Cabello and A. J. López-Tarrida. Proposed experiment for the quantum guess my number protocol. Physical Review A, 71:020301(R), 2005.
[42] N. J. Cerf, N. Gisin, S. Massar, and S. Popescu. Simulating maximal quantum entanglement without communication. Physical Review Letters, 94:220403, 2005.
[43] B. S. Tsirelson (Cirel’son). Quantum generalizations of Bell’s inequality. Letters in Mathematical Physics, 4(2):93–100, 1980.
[44] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt. Proposed experiment to test local hidden-variable theories. Physical Review Letters, 23(15):880–884, 1969.
[45] R. Cleve and H. Buhrman. Substituting quantum entanglement for communication. Physical Review A, 56(2):1201–1204, 1997. quant-ph/9704026.
[46] R. Cleve, W. van Dam, M. Nielsen, and A. Tapp. Quantum entanglement and the communication complexity of the inner product function. In Proceedings of 1st NASA QCQC conference, volume 1509 of Lecture Notes in Computer Science, pages 61–74. Springer, 1998. quant-ph/9708019.
[47] R. Cleve, P. Høyer, B. Toner, and J. Watrous. Consequences and limits of nonlocal strategies. In , pages 236–249, 2004.
[48] D. Collins and N. Gisin. A relevant two qubit Bell inequality inequivalent to the CHSH inequality. Journal of Physics A-Mathematical and General, 37(5):1775–1788, 2004.
[49] D. Collins, N. Gisin, N. Linden, S. Massar, and S. Popescu. Bell inequalities for arbitrarily high-dimensional systems. Physical Review Letters, 88(4):40404, 2002.
[50] W. van Dam. Nonlocality & Communication Complexity. PhD thesis, University of Oxford, Department of Physics, 2000.
[51] W. van Dam. Implausible consequences of superstrong nonlocality, 2005. arXiv:quant-ph/0501159v1.
[52] W. van Dam, R. D. Gill, and P. D. Grünwald. The statistical strength of nonlocality proofs. IEEE Transactions on Information Theory, 51(8):2812–2835, 2005.
[53] J. Niel de Beaudrap. One-qubit fingerprinting schemes. Physical Review A, 69(2):022307, Feb 2004.
[54] J. Degorre, S. Laplante, and J. Roland. Simulating quantum correlations as a distributed sampling problem. Physical Review A, 72:062314, 2005.
[55] J. Degorre, S. Laplante, and J. Roland. Classical simulation of traceless binary observables on any bipartite quantum state. Physical Review A, 75:012309, 2007.
[56] D. Deutsch and R. Jozsa. Rapid solution of problems by quantum computation. In Proceedings of the Royal Society of London, volume A439, pages 553–558, 1992.
[57] A. C. Doherty, Y. C. Liang, B. Toner, and S. Wehner. The quantum moment problem and bounds on entangled multi-prover games. In Proceedings of 23rd IEEE Conference on Computational Complexity, pages 199–210, 2008.
[58] J. Du, P. Zou, X. Peng, D. K. Oi, L. C. Kwek, and A. Ekert. Experimental quantum multimeter and one-qubit fingerprinting. Physical Review A, 74:042319, 2006.
[59] H. Ehlich and K. Zeller. Schwankung von Polynomen zwischen Gitterpunkten. Mathematische Zeitschrift, 86:41–44, 1964.
[60] A. Einstein, B. Podolsky, and N. Rosen. Can quantum-mechanical description of physical reality be considered complete? Physical Review, 47:777–780, 1935.
[61] M. Forster, S. Winkler, and S. Wolf. Distilling Non-Locality. Physical Review Letters, 102:120401, 2009.
[62] P. Frankl and V. Rödl. Forbidden intersections. Transactions of the American Mathematical Society, 300(1):259–286, 1987.
[63] S. J. Freedman and J. F. Clauser. Experimental test of local hidden-variable theories. Physical Review Letters, 28(14):938–941, Apr 1972.
[64] E. F. Galvão. Feasible quantum communication complexity protocol. Physical Review A, 65(1):012318, Dec 2001.
[65] D. Gavinsky. Classical interaction cannot replace a quantum message. In Proceedings of 40th ACM STOC, pages 95–102, 2008. quant-ph/0703215.
[66] D. Gavinsky. Classical interaction cannot replace quantum nonlocality. arXiv:0901.0956, 2008.
[67] D. Gavinsky, J. Kempe, I. Kerenidis, R. Raz, and R. de Wolf. Exponential separations for one-way quantum communication complexity, with applications to cryptography. In Proceedings of 39th ACM STOC, pages 516–525, 2007. quant-ph/0611209.
[68] D. Gavinsky, J. Kempe, O. Regev, and R. de Wolf. Bounded-error quantum state identification and exponential separations in communication complexity. In Proceedings of 38th ACM STOC, pages 594–603, 2006. quant-ph/0511013.
[69] D. Gavinsky, J. Kempe, and R. de Wolf. Strengths and weaknesses of quantum fingerprinting. In Proceedings of 21st IEEE Conference on Computational Complexity, pages 288–295, 2006. quant-ph/0603173.
[70] D. Gavinsky, O. Regev, and R. de Wolf. Simultaneous communication protocols with quantum and classical messages. Chicago Journal of Theoretical Computer Science, 7, 2008. quant-ph/0807.2758.
[71] N. Gisin and B. Gisin. A local hidden variable model of quantum correlation exploiting the detection loophole. Physics Letters A, 260:323–327, 1999.
[72] D. M. Greenberger, M. Horne, and A. Zeilinger. Going beyond Bell’s theorem. In M. Kafatos, editor, Bell’s Theorem, Quantum Theory, and Conceptions of the Universe, pages 69–72. Kluwer Academic, 1989.
[73] D. Gross, S. T. Flammia, and J. Eisert. Most quantum states are too entangled to be useful as computational resources. Physical Review Letters, 102:190501, 2009.
[74] L. K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of 28th ACM STOC, pages 212–219, 1996. quant-ph/9605043.
[75] A. S. Holevo. Bounds for the quantity of information transmitted by a quantum communication channel. Problemy Peredachi Informatsii, 9(3):3–11, 1973. English translation in Problems of Information Transmission, 9:177–183, 1973.
[76] R. T. Horn, S. A. Babichev, K.-P. Marzlin, A. I. Lvovsky, and B. C. Sanders. Single-qubit optical quantum fingerprinting. Physical Review Letters, 95:150502, 2005.
[77] Juraj Hromkovič. Communication complexity and parallel computing. Springer-Verlag, New York, 1997.
[78] B. Kalyanasundaram and G. Schnitger. The probabilistic communication complexity of set intersection. SIAM Journal on Discrete Mathematics, 5(4):545–557, 1992. Earlier version in Structures’87.
[79] L. A. Khalfi and B. S. Tsirelson. Quantum and quasi-classical analogs of Bell inequalities. In P. Lahti and P. Mittelstaedt, editors, Symposium on the Foundations of Modern Physics, pages 441–460. World Scientific, Singapore, 1985.
[80] H. Klauck, R. Špalek, and R. de Wolf. Quantum and classical strong direct product theorems and optimal time-space tradeoffs. SIAM Journal on Computing, 36(5):1472–1493, 2007. Earlier version in FOCS’03. quant-ph/0402123.
[81] D. E. Knuth. Combinatorial matrices. In Selected Papers on Discrete Mathematics, volume 106 of CSLI Lecture Notes. Stanford University, 2003.
[82] I. Kremer. Quantum communication. Master’s thesis, Hebrew University, Computer Science Department, 1995.
[83] E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge University Press, 1997.
[84] N. Linial and A. Shraibman. Lower bounds in communication complexity based on factorization norms. In Proceedings of 39th ACM STOC, pages 699–708, 2007.
[85] I. Marcikic, H. de Riedmatten, W. Tittel, H. Zbinden, M. Legré, and N. Gisin. Distribution of time-bin entangled qubits over 50 km of optical fiber. Physical Review Letters, 93:180502, 2004.
[86] Ll. Masanes, A. Acin, and N. Gisin. General properties of nonsignaling theories. Physical Review A, 73:012112, 2006.
[87] S. Massar. Nonlocality, closing the detection loophole, and communication complexity. Physical Review A, 65:032121, 2002.
[88] S. Massar. Quantum fingerprinting with a single particle. Physical Review A, 71:012310, 2005.
[89] S. Massar, D. Bacon, N. J. Cerf, and R. Cleve. Classical simulation of quantum entanglement without local hidden variables. Physical Review A, 63(5):052305, Apr 2001.
[90] S. Massar, S. Pironio, J. Roland, and B. Gisin. Bell inequalities resistant to detector inefficiency. Physical Review A, 66(5):052112, Nov 2002.
[91] D. N. Matsukevich, P. Maunz, D. L. Moehring, S. Olmschenk, and C. Monroe. Bell inequality violation with two remote atomic qubits. Physical Review Letters, 100:150404, 2008.
[92] T. Maudlin. Bell’s inequality, information transmission, and prism models. In D. Hull, M. Forbes, and K. Okruhlik, editors, PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, volume 1, pages 404–417, 1992.
[93] N. D. Mermin. Simple unified form for the major no-hidden-variables theorems. Physical Review Letters, 65:3373–3376, 1990.
[94] N. D. Mermin. What’s wrong with these elements of reality? Physics Today, 43:9–11, 1990.
[95] N. D. Mermin. Hidden variables and the two theorems of John Bell. Reviews of Modern Physics, 65(3):803–815, 1993.
[96] D. L. Moehring, M. J. Madsen, B. B. Blinov, and C. Monroe. Experimental Bell inequality violation with an atom and a photon. Physical Review Letters, 93:090410, 2004.
[97] M. Navascues, S. Pironio, and A. Acin. Bounding the set of quantum correlations. Physical Review Letters, 98(1):10401, 2007.
[98] M. Navascues, S. Pironio, and A. Acín. A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations. New Journal of Physics, 10(073013):073013, 2008.
[99] A. Nayak. Optimal lower bounds for quantum automata and random access codes. In Proceedings of 40th IEEE FOCS, pages 369–376, 1999. quant-ph/9904093.
[100] A. Nayak and J. Salzman. On communication over an entanglement-assisted quantum channel. In Proceedings of 34th ACM STOC, pages 698–704, 2002.
[101] I. Newman. Private vs. common random bits in communication complexity. Information Processing Letters, 39(2):67–71, 1991.
[102] I. Newman and M. Szegedy. Public vs. private coin flips in one round communication games. In Proceedings of 28th ACM STOC, pages 561–570, 1996.
[103] M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 2000.
[104] J.-W. Pan, D. Bouwmeester, M. Daniell, H. Weinfurter, and A. Zeilinger. Experimental test of quantum nonlocality in three-photon Greenberger-Horne-Zeilinger entanglement. Nature, 403:515, 2000.
[105] R. Paturi. On the degree of polynomials that approximate symmetric Boolean functions. In Proceedings of 24th ACM STOC, pages 468–474, 1992.
[106] P. M. Pearle. Hidden-variable example based upon data rejection. Physical Review D, 2:1418, 1970.
[107] D. Perez-Garcia, M. M. Wolf, C. Palazuelos, I. Villanueva, and M. Junge. Unbounded violation of tripartite Bell inequalities. Communications in Mathematical Physics, 279:455, 2008. arXiv:quant-ph/0702189.
[108] S. Popescu and D. Rohrlich. Quantum nonlocality as an axiom. Foundations of Physics, 24:379, 1994.
[109] A. Rauschenbeutel, G. Nogues, S. Osnaghi, P. Bertet, M. Brune, J.-M. Raimond, and S. Haroche. Step-by-step engineered multiparticle entanglement. Science, 288:5473, 2000.
[110] R. Raz. Exponential separation of quantum and classical communication complexity. In Proceedings of 31st ACM STOC, pages 358–367, 1999.
[111] A. Razborov. On the distributional complexity of disjointness. Theoretical Computer Science, 106(2):385–390, 1992.
[112] A. Razborov. Quantum communication complexity of symmetric predicates. Izvestiya of the Russian Academy of Sciences, mathematics, 67(1):159–176, 2003. quant-ph/0204025.
[113] O. Regev and B. Toner. Simulating quantum correlations with finite communication. In Proceedings of 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS’07), pages 384–394, 2007.
[114] T. J. Rivlin and E. W. Cheney. A comparison of uniform approximations on an interval and a finite subset thereof. SIAM Journal on Numerical Analysis, 3(2):311–320, 1966.
[115] M. A. Rowe, D. Kielpinski, V. Meyer, C. A. Sackett, W. M. Itano, C. Monroe, and D. J. Wineland. Experimental violation of a Bell’s inequality with efficient detection. Nature, 409:791, 2001.
[116] C. A. Sackett, D. Kielpinski, B. E. King, C. Langer, V. Meyer, C. J. Myatt, M. Rowe, Q. A. Turchette, W. M. Itano, D. J. Wineland, and C. Monroe. Experimental entanglement of four particles. Nature, 404:256, 2000.
[117] V. B. Scholz and R. F. Werner. Tsirelson’s Problem. Arxiv preprint arXiv:0812.4305, 2008.
[118] E. Schrödinger. Discussion of probability relations between separated systems. Proceedings of the Cambridge Philosophical Society, 31:555–563, 1935.
[119] E. Schrödinger. Probability relations between separated systems. Proceedings of the Cambridge Philosophical Society, 32:446–451, 1936.
[120] B. Schumacher. Quantum coding. Physical Review A, 51(4):2738–2747, Apr 1995.
[121] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 623–656, 1948.
[122] A. Sherstov. The pattern matrix method for lower bounds on quantum communication. In Proceedings of 40th ACM STOC, pages 85–94, 2008.
[123] P. W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing, 26(5):1484–1509, 1997. Earlier version in FOCS’94. quant-ph/9508027.
[124] M. Steiner. Towards quantifying non-local information transfer: finite-bit non-locality. Physics Letters A, 270:239–244, 2000. arXiv:quant-ph/9902014.
[125] R. T. Thew, A. Acín, H. Zbinden, and N. Gisin. Bell-type test of energy-time entangled qutrits. Physical Review Letters, 93(1):010503, Jul 2004.
[126] W. Tittel, J. Brendel, H. Zbinden, and N. Gisin. Violation of Bell inequalities by photons more than 10 km apart. Physical Review Letters, 81:3563, 1998.
[127] B. F. Toner and D. Bacon. Communication cost of simulating Bell correlations. Physical Review Letters, 91(18):187904, Oct 2003.
[128] P. Trojek, C. Schmid, M. Bourennane, C. Brukner, M. Żukowski, and H. Weinfurter. Experimental quantum communication complexity. Physical Review A, 75:050305(R), 2005.
[129] B. S. Tsirelson. Quantum analogues of the Bell inequalities. The case of two spatially separated domains. Journal of Soviet Mathematics, 36:557–570, 1987.
[130] A. Vaziri, G. Weihs, and A. Zeilinger. Experimental two-photon, three-dimensional entanglement for quantum communication. Physical Review Letters, 89(24):240401, Nov 2002.
[131] S. Wehner, M. Christandl, and A. C. Doherty. Lower bound on the dimension of a quantum system given measured data. Physical Review A, 78(6), 2008.
[132] G. Weihs, T. Jennewein, C. Simon, H. Weinfurter, and A. Zeilinger. Violation of Bell’s inequality under strict Einstein locality conditions. Physical Review Letters, 81:5039, 1998.
[133] R. F. Werner. Quantum states with Einstein-Podolsky-Rosen correlations admitting a hidden-variable model. Physical Review A, 40(8):4277–4281, Oct 1989.
[134] R. F. Werner and M. M. Wolf. All-multipartite Bell-correlation inequalities for two dichotomic observables per site. Physical Review A, 64:032112, 2001.
[135] R. F. Werner and M. M. Wolf. Bell inequalities and entanglement. Quantum Information and Computation, 1(3):1–25, 2001. Arxiv preprint quant-ph/0107093.
[136] A. C-C. Yao. Some complexity questions related to distributive computing. In Proceedings of 11th ACM STOC, pages 209–213, 1979.
[137] A. C-C. Yao. Quantum circuit complexity. In Proceedings of 34th IEEE FOCS, pages 352–360, 1993.
[138] A. C-C. Yao. On the power of quantum fingerprinting. In Proceedings of 35th ACM STOC, pages 77–81, 2003.
[139] J. Zhang, X.-H. Bao, T.-Y. Chen, T. Yang, A. Cabello, and J.-W. Pan. Experimental quantum guess my number protocol using multiphoton entanglement. Physical Review A, 75:022302, 2007.
[140] Z. Zhao, T. Yang, Y.-A. Chen, A.-N. Zhang, M. Żukowski, and J.-W. Pan. Experimental violation of local realism by four-photon Greenberger-Horne-Zeilinger entanglement. Physical Review Letters, 91(18):180401, Oct 2003.
[141] M. Żukowski and Č. Brukner. Bell’s theorem for general n-qubit states. Physical Review Letters, 88(21):210401, May 2002.

A Nayak’s Proof of a Consequence of Holevo’s Bound
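
The bound established in this appendix, namely that encoding $n$ bits in $d$-dimensional states gives average recovery probability at most $d/2^n$, can be sanity-checked numerically. The following Python sketch is an illustration only and not part of the original argument. It assumes a restricted setting: pure states with real amplitudes, and a measurement in the computational basis followed by a decoding map, so that $E_x$ is the projector onto the basis outcomes decoded as $x$. The helper names `random_pure_state`, `average_success`, and `decode` are hypothetical.

```python
import math
import random

def random_pure_state(d, rng):
    """Random real unit vector in R^d, standing in for a pure state |psi>."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def average_success(states, decode):
    """Average of Tr(E_x rho_x) over a uniformly random message x.

    states[x] is the unit vector encoding message x (rho_x = |psi_x><psi_x|).
    decode[j] is the message announced on basis outcome j, so that
    E_x = sum over {j : decode[j] == x} of |j><j| (assumed measurement model).
    """
    total = 0.0
    for j, x in enumerate(decode):
        total += states[x][j] ** 2  # <j| rho_x |j>
    return total / len(states)

if __name__ == "__main__":
    n, d = 3, 4                      # 3 bits encoded in a 4-dimensional system
    rng = random.Random(0)
    states = [random_pure_state(d, rng) for _ in range(2 ** n)]
    decode = [rng.randrange(2 ** n) for _ in range(d)]
    print("average success:", average_success(states, decode))
    print("bound d / 2^n  :", d / 2 ** n)
```

In this restricted model the bound is attained by encoding $d$ of the messages as distinct basis states and decoding each basis outcome to its own message, which matches the equality case $d = 2^n$ needed for exact recovery.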

Here we prove that if we are encoding $n$ bits in $d$-dimensional quantum states, then the average recovery probability is at most $d/2^n$. Therefore, an exact procedure requires $d \geq 2^n$, and thus at least $n$ qubits.

Let $\rho_0, \ldots, \rho_{2^n-1}$ be the $d$-dimensional states that encode the elements of $\{0,1\}^n$ (which we identify with $\{0, 1, \ldots, 2^n-1\}$ in the obvious way). Let $E_0, \ldots, E_{2^n-1}$ be the measurement operators applied for decoding (they sum to the $d$-dimensional identity). The probability of successfully recovering $x \in \{0,1\}^n$ from its encoding is $\mathrm{Tr}(E_x \rho_x)$. Therefore, we can bound the success probability for a uniformly random $x \in \{0,1\}^n$ by

\[
\frac{1}{2^n} \sum_{x=0}^{2^n-1} \mathrm{Tr}(E_x \rho_x) \leq \frac{1}{2^n} \sum_{x=0}^{2^n-1} \mathrm{Tr}(E_x) = \frac{1}{2^n} \mathrm{Tr}\left( \sum_{x=0}^{2^n-1} E_x \right) = \frac{1}{2^n} \mathrm{Tr}(I) = \frac{d}{2^n}. \tag{30}
\]

The first inequality follows because the density operator $\rho_x$ is positive semi-definite and has trace 1, therefore it can be unitarily diagonalized: $U^* \rho_x U = D$, where $D$ is diagonal with diagonal entries that are non-negative and sum to 1. Because the trace is invariant under cyclic permutations of the matrices, we now have $\mathrm{Tr}(E_x \rho_x) = \mathrm{Tr}(U^* E_x U U^* \rho_x U) = \mathrm{Tr}(U^* E_x U D) \leq \mathrm{Tr}(U^* E_x U I) = \mathrm{Tr}(E_x)$. The inequality in this chain holds because $U^* E_x U$ is positive semi-definite, so its diagonal entries are non-negative, while the diagonal entries of $D$ are at most 1.

B Rectangles and the Lower Bound for Distributed Deutsch-Jozsa

Separations between quantum and classical communication complexity always require two things: an efficient quantum protocol for some problem, and a lower bound on the communication of all classical protocols solving that same problem. In this appendix we will give some tools for lower bounding classical communication complexity, leading eventually to the lower bound on classical protocols for the Distributed Deutsch-Jozsa problem that we mentioned in Section 3.4.

B.1 Rectangles

Consider some communication complexity problem $f : X \times Y \to \{0,1\}$, where Alice starts with an input $x \in X$ and Bob starts with an input $y \in Y$. We start by introducing the crucial combinatorial notion for classical lower bounds. A rectangle is a set $R \subseteq X \times Y$ that is of the form $R = A \times B$ with $A \subseteq X$ and $B \subseteq Y$. For example, if $n = 2$, $A = \{00, 01\}$ and $B = \{b_1, b_2\}$ for two strings $b_1, b_2 \in \{0,1\}^2$, then $R = A \times B = \{(00, b_1), (00, b_2), (01, b_1), (01, b_2)\}$ is a rectangle. The following result is a fundamental property of classical deterministic protocols.

Lemma 1. If a deterministic protocol has communication $c$, then there exist $2^c$ rectangles $R_1, \ldots, R_{2^c}$ that partition $X \times Y$, such that the protocol gives the same output $a_i$ for each $(x, y) \in R_i$.

We omit the easy proof of this lemma, which is by induction on $c$. For example, suppose there is only one $k$-bit message $m$ going from Alice to Bob and then Bob returns the 1-bit output. Then the $2^{k+1}$ rectangles would be of the form $R_{m,a} = A_m \times Y_{m,a}$, with $m \in \{0,1\}^k$ and $a \in \{0,1\}$, where $A_m$ is the set of $x$’s for which Alice sends $k$-bit message $m$, and $Y_{m,a}$ is the set of $y$’s for which Bob returns output $a$ when receiving message $m$. Note that if our protocol computes $f$ correctly, then the rectangles are “monochromatic”: the protocol returns the same answer $f(x, y)$ for all $(x, y) \in R_i$.

As a simple application of this we prove the so-called “rank lower bound”. Consider some communication complexity problem $f : X \times Y \to \{0,1\}$. Let $M_f$ be the $|X| \times |Y|$ matrix whose entries are defined by $M_f(x, y) = f(x, y)$. This is called the communication matrix of $f$. It can be viewed as a 2-dimensional truth table. We use $\mathrm{rank}(f)$ to denote the rank of this matrix over the field of real numbers. For example, the communication matrix for the equality function is the $2^n \times 2^n$ identity matrix, which has 1s on its diagonal and 0s elsewhere. Hence $\mathrm{rank}(\mathrm{EQ}) = 2^n$.

Suppose we have some $c$-bit deterministic protocol that computes $f$. We know that this partitions the input space $X \times Y$ into rectangles $R_1, \ldots, R_{2^c}$. Since each 1-input $(x, y)$ occurs in exactly one 1-rectangle, we have

\[
M_f = \sum_{i : a_i = 1} R_i,
\]

where we view $R_i$ as a $2^n \times 2^n$ matrix with 1s on its elements and 0s elsewhere. Note that $R_i$ is a matrix of rank 1. Hence, using $\mathrm{rank}(A + B) \leq \mathrm{rank}(A) + \mathrm{rank}(B)$, we get

\[
\mathrm{rank}(M_f) = \mathrm{rank}\left( \sum_{i : a_i = 1} R_i \right) \leq \sum_{i : a_i = 1} \mathrm{rank}(R_i) = \sum_{i : a_i = 1} 1 \leq 2^c.
\]
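
The rank inequality above can be checked on small instances. The following Python sketch is an illustration under stated assumptions and not part of the original text: it takes $X = Y = \{0,1\}^n$, computes the rank exactly with rational arithmetic, and reports the smallest integer $c$ with $2^c \geq \mathrm{rank}(M_f)$. The helper names `rank`, `communication_matrix`, `rank_lower_bound`, and `equality` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def rank(matrix):
    """Rank over the rationals (hence over the reals), by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def communication_matrix(f, n):
    """M_f with rows indexed by Alice's x and columns by Bob's y, both in {0,1}^n."""
    inputs = list(product((0, 1), repeat=n))
    return [[f(x, y) for y in inputs] for x in inputs]

def rank_lower_bound(f, n):
    """Smallest integer c with 2^c >= rank(M_f), i.e. ceil(log2(rank))."""
    rk = max(rank(communication_matrix(f, n)), 1)
    return (rk - 1).bit_length()

def equality(x, y):
    return int(x == y)

if __name__ == "__main__":
    n = 3
    print("rank(EQ) =", rank(communication_matrix(equality, n)))
    print("c >=", rank_lower_bound(equality, n))
```

For the equality function on $n = 3$ bits the communication matrix is the $8 \times 8$ identity, so the sketch reports rank 8 and $c \geq 3$. A single nonempty rectangle, viewed as a 0/1 matrix, has rank 1, which is the fact used in the last step of the inequality.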
But that means that a lower bound on the rank of $M_f$ implies a lower bound on the communication: $c \geq \log \mathrm{rank}(M_f)$. In particular, it follows that for the equality problem, the communication $c$ needs to be at least $n$.

B.2 Randomized protocols

In a randomized protocol, Alice and Bob may flip coins and the protocol has to output the right value $f(x, y)$ with probability $\geq 2/3$ for every $(x, y) \in D$. We can fix these coins to obtain a deterministic protocol. Suppose randomized protocol $A$ uses $c$ bits of communication and has success probability $2/3$. Define $A(x, y, r_A, r_B) = 1$ if the protocol gives the correct output $f(x, y)$ on input $x, y$ using coin flips $r_A$ for Alice and $r_B$ for Bob, and $A(x, y, r_A, r_B) = 0$ otherwise. For each $x, y$, a good randomized protocol satisfies

\[
E_{r_A, r_B}[A(x, y, r_A, r_B)] \geq 2/3,
\]

where the expectation is taken over uniformly chosen strings $r_A$ and $r_B$. Now let $\mu : \{0,1\}^n \times \{0,1\}^n \to [0, 1]$ be an input distribution. Then also

\[
E_{\mu, r_A, r_B}[A(x, y, r_A, r_B)] \geq 2/3,
\]

where the expectation is now over uniform $r_A$, $r_B$, and $x, y$ picked according to $\mu$. By the averaging principle, there exists a way to fix $r_A$ and $r_B$ such that the success probability (under $\mu$) of the resulting deterministic protocol is at least $2/3$. Accordingly, if we want to lower bound the randomized communication complexity of a function, it suffices to find some “hard” input distribution $\mu$, and to show that all deterministic protocols that have error at most $1/3$ under $\mu$ need a lot of communication.

Suppose all “large” rectangles are roughly balanced between 0s and 1s (weighted according to $\mu$). Then the protocol will make a large error on all large rectangles. Conversely, if we know the protocol does not make a large error, most of its rectangles must have been “small”. But that can only be if there are many rectangles. Since the number of rectangles is $2^c$, the communication $c$ must have been large. This idea leads to the following lower bound method. The discrepancy of rectangle $R = A \times B$ under $\mu$ is the difference between the weight of the 0s and the 1s in that rectangle:

\[
\delta_\mu(R) = \left| \mu(R \cap f^{-1}(1)) - \mu(R \cap f^{-1}(0)) \right|.
\]

The discrepancy of $f$ under $\mu$ is the maximum over all rectangles:

\[
\delta_\mu(f) = \max_R \delta_\mu(R).
\]

If $f$ has small discrepancy, that means that all “large” rectangles are roughly balanced. Suppose a deterministic protocol partitions the input space into rectangles $R_1, \ldots, R_{2^c}$. Suppose it has success probability $1/2 + \epsilon$. The best bias (difference between success and failure probabilities) that the protocol can achieve on rectangle $R_i$ is $\delta_\mu(R_i)$, by giving the output with highest weight in that rectangle. The success probability is $\sum_i \mu(R_i \cap f^{-1}(a_i))$ and the error probability is $\sum_i \mu(R_i \cap f^{-1}(1 - a_i))$, where $a_i$ is the majority value of $f$ on the pairs $(x, y) \in R_i$, weighted according to $\mu$. Hence we have

\[
2\epsilon \leq \sum_{i=1}^{2^c} \mu(R_i \cap f^{-1}(a_i)) - \sum_{i=1}^{2^c} \mu(R_i \cap f^{-1}(1 - a_i)) \leq \sum_{i=1}^{2^c} \delta_\mu(R_i) \leq 2^c \, \delta_\mu(f).
\]

This is a lower bound on the communication: $c \geq \log(2\epsilon / \delta_\mu(f))$. Accordingly, a distribution $\mu$ where $\delta_\mu(f)$ is small gives a lower bound on the communication of deterministic protocols for $f$ under $\mu$, and then the same lower bound applies to randomized protocols.

B.3 Discrepancy of the inner product function
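
For very small $n$ the discrepancy defined in Section B.2 can be computed by exhaustive search, which gives a concrete reference point for the bound $\delta_\mu(\mathrm{IP}) \leq 2^{-n/2}$ derived in this section. The Python sketch below is illustrative only and not part of the original text. It assumes the uniform distribution on $\{0,1\}^n \times \{0,1\}^n$, its running time is exponential in $2^n$, and the function names are hypothetical. For a fixed row set $A$ it uses the observation that the best column set $B$ consists either of all columns with positive sum or of all columns with negative sum.

```python
from itertools import product
import math

def inner_product(x, y):
    """IP(x, y) = x . y mod 2 for bit tuples x and y."""
    return sum(a * b for a, b in zip(x, y)) % 2

def discrepancy_uniform(f, n):
    """max over rectangles A x B of |mu(R ∩ f^-1(1)) - mu(R ∩ f^-1(0))|, mu uniform."""
    inputs = list(product((0, 1), repeat=n))
    size = len(inputs)
    # +1 on 1-inputs and -1 on 0-inputs; only absolute values of sums matter.
    sign = [[1 if f(x, y) else -1 for y in inputs] for x in inputs]
    best = 0
    for a_mask in range(1, 2 ** size):           # every nonempty row set A
        rows = [i for i in range(size) if (a_mask >> i) & 1]
        col = [sum(sign[i][j] for i in rows) for j in range(size)]
        pos = sum(c for c in col if c > 0)       # B = columns with positive sum
        neg = -sum(c for c in col if c < 0)      # B = columns with negative sum
        best = max(best, pos, neg)
    return best / size ** 2                      # each input has weight 1 / 2^(2n)

def communication_bound(eps, delta):
    """The bound c >= log2(2 * eps / delta) from Section B.2."""
    return math.log2(2 * eps / delta)

if __name__ == "__main__":
    for n in (1, 2, 3):
        delta = discrepancy_uniform(inner_product, n)
        print(n, delta, 2 ** (-n / 2), delta <= 2 ** (-n / 2))
```

The exhaustive search is only feasible for a handful of bits, so the sketch serves as a check of the definitions rather than as a way of obtaining lower bounds for large $n$.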

To illustrate the discrepancy lower bound technique, we now consider the inner product function, defined by $\mathrm{IP}(x, y) = x \cdot y \pmod 2$. We will show that its discrepancy under the uniform distribution is very small. We analyze the $2^n \times 2^n$ matrix $M$ whose $(x, y)$ entry is $(-1)^{x \cdot y}$. This is just the communication matrix for IP, with 0s replaced by 1s, and 1s replaced by $-1$s. The rectangles of $M$ are quite balanced:

Lemma 2 (Lindsey). For every rectangle $R = A \times B$, the absolute value of the sum of the $M$-entries in that rectangle is at most $\sqrt{|A| \cdot |B| \cdot 2^n}$.

Proof: It is easy to see that $M$ is symmetric and $M^2 = 2^n I$. This implies, for any vector $v$,

\[
\|Mv\|^2 = v^T M^T M v = 2^n v^T v = 2^n \|v\|^2,
\]

where the norm is the usual Euclidean vector length. Let $v_A \in \{0,1\}^{2^n}$ and $v_B \in \{0,1\}^{2^n}$ be the characteristic (column) vectors of the sets $A$ and $B$. The sum of the $M$-entries in $R$ is $\sum_{a \in A, b \in B} M_{ab} = v_A^T M v_B$. We can bound this using Cauchy-Schwarz:

\[
|v_A^T M v_B| \leq \|v_A\| \cdot \|M v_B\| = \|v_A\| \cdot \sqrt{2^n} \, \|v_B\| = \sqrt{|A| \cdot |B| \cdot 2^n}.
\]

Let $\mu(x, y) = 1/2^{2n}$ be the uniform input distribution. Note that the discrepancy of the rectangle $R$ under $\mu$ is exactly the difference of $+1$’s and $-1$’s in $R$, divided by $2^{2n}$. By Lindsey’s lemma, this is $\delta_\mu(R) \leq \sqrt{|A| \cdot |B|} / 2^{3n/2}$. Because $|A|, |B| \leq 2^n$, it follows that the discrepancy of the inner product function under the uniform distribution is $\delta_\mu(\mathrm{IP}) \leq 2^{-n/2}$. Hence, by the bound $c \geq \log(2\epsilon / \delta_\mu(f))$ of Section B.2, we get an $n/2 - O(1)$ lower bound on the randomized communication complexity of the inner product function.

B.4 The lower bound for the Distributed Deutsch-Jozsa problem

Recall the Distributed Deutsch-Jozsa problem from Section 3.4. Buhrman, Cleve, and Wigderson [33] used a combinatorial result of Frankl and Rödl [62] to prove the following classical lower bound:

Theorem 3. Every deterministic classical protocol that solves the Distributed Deutsch-Jozsa problem needs to communicate at least $0.007n$ bits.

Proof: Suppose there is a $c$-bit deterministic classical protocol for the problem. Each $c$-bit conversation corresponds to a rectangle $R = A \times B$, with $A, B \subseteq \{0,1\}^n$, such that the protocol has the same conversation and output if, and only if, $(x, y) \in R$. Since there are at most $2^c$ possible conversations, the protocol partitions $\{0,1\}^n \times \{0,1\}^n$ into at most $2^c$ different such rectangles. Now consider all $n$-bit strings $x$ with Hamming weight $n/2$. There are $\binom{n}{n/2} \approx 2^n / \sqrt{n}$ of those. Since every $(x, x)$-pair must occur in some rectangle and there are only $2^c$ rectangles, there is a rectangle $R = A \times B$ that contains at least $2^n / (\sqrt{n}\, 2^c)$ different such $(x, x)$-pairs. Let $S = \{x : |x| = n/2, (x, x) \in R\}$ be the set of such $x$. Since $R$ contains some $(x, x)$-pairs (on which the protocol outputs 1) and the protocol has the same output for all inputs in $R$, $R$ cannot contain any 0-inputs. This implies that the Hamming distance of every pair $x, y \in S$ is different from $n/2$, for otherwise $(x, y)$ would be a 0-input in $R$. Viewing the strings $x$ in $S$ as characteristic vectors of sets, it is easy to see that the size of the intersection of $x, y \in S$ is never $n/4$. Thus we have a set system $S$ of at least $2^n / (\sqrt{n}\, 2^c)$ sets over an $n$-element universe, such that the size of the intersection of any two sets in $S$ is not $n/4$. However, by Corollary 1.2 of [62], such a set system can have at most $1.99^n$ elements, so we have

$$\frac{2^n}{\sqrt{n}\, 2^c} \le |S| \le 1.99^n.$$

This implies $c \ge \log(2^n / (\sqrt{n}\, 1.99^n)) \ge 0.007n$.

C Razborov’s Lower Bound for the Quantum Communication Complexity of Intersection

While the previous section discussed some basic methods for lower bounding classical communication complexity, here we focus on methods to lower bound quantum communication complexity (sometimes with prior entanglement).

C.1 The Kremer-Razborov-Yao lemma and its consequences

The following lemma is due to Razborov [112, Proposition 3.3] and is similar to earlier statements by Yao [137] and Kremer [82]. It can intuitively be viewed as a quantum analogue of the rectangle-decomposition of classical protocols that we explained in Section B.1. We skip the easy proof, which is by induction on $q$.

Lemma 4 (Kremer-Razborov-Yao). Let $|\Psi\rangle$ denote the (possibly entangled) starting state of a quantum protocol that communicates $q$ qubits and has binary output. For all inputs $x$ of Alice and $y$ of Bob, there exist linear operators $A_h(x)$, $B_h(y)$, for all $h \in \{0,1\}^{q-1}$, each with operator norm (i.e., largest singular value) at most 1, such that the acceptance probability (i.e., probability of output ‘1’) of the protocol is

$$P(x, y) = \left\| \sum_{h \in \{0,1\}^{q-1}} (A_h(x) \otimes B_h(y)) |\Psi\rangle \right\|^2,$$

where the norm is the usual Euclidean vector length.

Consider the special case where the protocol starts without entanglement, so we can write $|\Psi\rangle = |\Psi_A\rangle |\Psi_B\rangle$. In this case we can rewrite the acceptance probabilities as

$$\begin{aligned}
P(x, y) &= \left\| \sum_{h \in \{0,1\}^{q-1}} (A_h(x) \otimes B_h(y)) |\Psi_A\rangle |\Psi_B\rangle \right\|^2 \\
&= \langle\Psi_A| \langle\Psi_B| \left( \sum_{h \in \{0,1\}^{q-1}} (A_h(x) \otimes B_h(y)) \right)^{*} \cdot \left( \sum_{h' \in \{0,1\}^{q-1}} (A_{h'}(x) \otimes B_{h'}(y)) \right) |\Psi_A\rangle |\Psi_B\rangle \\
&= \sum_{h, h' \in \{0,1\}^{q-1}} \langle\Psi_A| A_h(x)^{*} A_{h'}(x) |\Psi_A\rangle \cdot \langle\Psi_B| B_h(y)^{*} B_{h'}(y) |\Psi_B\rangle .
\end{aligned}$$

Let $a(x)$ be the $2^{2q-2}$-dimensional row vector with $(h, h')$-entry equal to $\langle\Psi_A| A_h(x)^{*} A_{h'}(x) |\Psi_A\rangle$, and similarly define the column vector $b(y)$ with entries $\langle\Psi_B| B_h(y)^{*} B_{h'}(y) |\Psi_B\rangle$; then the last expression is just the scalar product $a(x) b(y)$.
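The rewriting above is a purely algebraic identity, so it can be checked numerically. The sketch below is an illustration added here, not taken from the original text. It draws random operators and random product states (these do not come from an actual protocol, so the resulting number need not be a valid probability) and compares the squared norm with the scalar product $a(x) b(y)$. All function names and the local dimension `d` are hypothetical choices.

```python
import random


def rand_state(d, rng):
    """Random unit vector in C^d."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
    norm = sum(abs(z) ** 2 for z in v) ** 0.5
    return [z / norm for z in v]


def rand_operator(d, rng):
    """Random d x d matrix with Frobenius norm 1, hence operator norm <= 1."""
    m = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
         for _ in range(d)]
    fro = sum(abs(z) ** 2 for row in m for z in row) ** 0.5
    return [[z / fro for z in row] for row in m]


def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def inner(u, v):
    """<u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def acceptance_two_ways(q=3, d=2, seed=0):
    """Return (squared norm, a.b, length of a) for random A_h, B_h."""
    rng = random.Random(seed)
    num_h = 2 ** (q - 1)                     # h ranges over {0,1}^(q-1)
    psi_a, psi_b = rand_state(d, rng), rand_state(d, rng)
    A = [rand_operator(d, rng) for _ in range(num_h)]
    B = [rand_operator(d, rng) for _ in range(num_h)]
    ua = [matvec(A[h], psi_a) for h in range(num_h)]   # A_h |Psi_A>
    ub = [matvec(B[h], psi_b) for h in range(num_h)]   # B_h |Psi_B>

    # Direct evaluation: squared norm of sum_h (A_h|Psi_A>) tensor (B_h|Psi_B>).
    total = [0j] * (d * d)
    for h in range(num_h):
        for i in range(d):
            for j in range(d):
                total[i * d + j] += ua[h][i] * ub[h][j]
    direct = sum(abs(z) ** 2 for z in total)

    # Factored evaluation: vectors indexed by pairs (h, h').
    a = [inner(ua[h], ua[hp]) for h in range(num_h) for hp in range(num_h)]
    b = [inner(ub[h], ub[hp]) for h in range(num_h) for hp in range(num_h)]
    factored = sum(x * y for x, y in zip(a, b))
    return direct, factored, len(a)
```

The third return value is the length of $a(x)$, which equals $2^{2q-2}$ because the entries are indexed by pairs $(h, h')$.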
If we now define $A$ to be the $|X| \times 2^{2q-2}$ matrix with rows $a(x)$, and $B$ the $2^{2q-2} \times |Y|$ matrix with columns $b(y)$, then we have proved the following lemma.

Lemma 5. Consider a quantum communication protocol (without prior entanglement) on input-set $X \times Y$, that communicates $q$ qubits, with acceptance probabilities denoted by $P(x, y)$, and $P$ the corresponding $|X| \times |Y|$ matrix. There exist an $|X| \times 2^{2q-2}$ matrix $A$ and a $2^{2q-2} \times |Y|$ matrix $B$, both with entries of absolute value at most 1, such that $P = AB$.

In particular, the rank of $P$ is at most $2^{2q-2}$, since $\mathrm{rank}(AB) \le \min(\mathrm{rank}(A), \mathrm{rank}(B))$. This allows us to generalize the classical rank lower bound from Section B.1 to the quantum domain. If we have a $q$-qubit protocol that computes some function $f : X \times Y \to \{0,1\}$ with success probability 1, then $P(x, y)$ equals $f(x, y)$, and the $|X| \times |Y|$ matrix $P$ is actually the communication matrix $M_f$, whose $(x, y)$ entry is $f(x, y)$. Hence we obtain a lower bound

$$q \ge \frac{\log \mathrm{rank}(P)}{2} + 1 = \frac{\log \mathrm{rank}(M_f)}{2} + 1$$

on the quantum communication of protocols with success probability 1. Similarly, one can obtain lower bounds on the bounded-error quantum communication complexity by lower bounding the rank needed for a matrix $P$ that is close to the matrix of function values at each entry (since an $\epsilon$-error protocol satisfies $|P(x, y) - f(x, y)| \le \epsilon$ for all inputs).

Finally, let us note without proof that one can also use the discrepancy method (Section B.2) to lower bound quantum communication complexity [82], even for protocols with prior entanglement [84]. Since the Inner Product function has very small discrepancy (Section B.3), we thus have another way of showing a linear lower bound for it, different from the one explained in Section 3.8.

C.2 Translation from protocols to polynomials

The following key lemma is implicit in Razborov’s paper [112]; the presentation we give here is taken from [80]. It allows us to translate the average acceptance probability of a $q$-qubit protocol (as a function of the intersection size $i$ of the inputs $x$ and $y$, viewed as subsets of $\{1, \ldots, n\}$) to a polynomial in $i$ of degree roughly $q$. Accordingly, efficient protocols give low-degree polynomials.

Razborov’s proof relies on the following linear algebraic notions. The operator norm $\|A\|$ of a matrix $A$ is its largest singular value $\sigma_1$ (not to be confused with the Euclidean vector norm of Lemma 4). The trace inner product (also known as Hilbert–Schmidt inner product) between $A$ and $B$ is $\langle A, B \rangle = \mathrm{Tr}(A^{*} B)$. The trace norm is $\|A\|_{tr} = \max\{ |\langle A, B \rangle| : \|B\| = 1 \} = \sum_i \sigma_i$, the sum of all singular values of $A$. The Frobenius norm is $\|A\|_F = \sqrt{\sum_{ij} |A_{ij}|^2} = \sqrt{\sum_i \sigma_i^2}$.

Lemma 6. Consider a quantum communication protocol (without prior entanglement) on $n$-bit inputs $x$ and $y$, that communicates $q$ qubits, with acceptance probabilities denoted by $P(x, y)$. Define

$$P(i) = \mathbb{E}_{|x| = |y| = n/4,\ |x \wedge y| = i} [P(x, y)],$$

where the expectation is taken uniformly over all $x, y$ that each have weight $n/4$ and that have intersection $i$. For every $d \le n/4$ there exists a degree-$d$ polynomial $q$ such that $|P(i) - q(i)| \le 2^{2q - d/4}$ for all $i \in \{0, \ldots, n/8\}$.

Proof: We only consider the $N = \binom{n}{n/4}$ strings of weight $n/4$. Let $P$ denote the $N \times N$ matrix of the acceptance probabilities on these inputs. By Lemma 5, we can write $P = AB$, where $A$ is an $N \times 2^{2q-2}$ matrix with each entry at most 1 in absolute value, and similarly for $B$. Note that $\|A\|_F, \|B\|_F \le \sqrt{N 2^{2q-2}}$. By the Cauchy-Schwarz inequality for unitarily invariant norms [19, p. 95], we have

$$\|P\|_{tr} \le \|A\|_F \cdot \|B\|_F \le N 2^{2q-2}.$$

Let $\mu_i$ denote the $N \times N$ matrix corresponding to the uniform probability distribution on $\{(x, y) : |x \wedge y| = i\}$. These “combinatorial matrices” have been well studied [81]. Note that $\langle P, \mu_i \rangle$ is the expected acceptance probability $P(i)$ of the protocol under that distribution. One can show that the different $\mu_i$ commute; thus they have the same eigenspaces $E_0, \ldots, E_{n/4}$ and can be simultaneously diagonalized by some orthogonal matrix $U$. For $t \in \{0, \ldots, n/4\}$, let $(U P U^T)_t$ denote the block of $U P U^T$ corresponding to $E_t$, and let $a_t = \mathrm{Tr}((U P U^T)_t)$ be its trace. Then we have

$$\sum_{t=0}^{n/4} |a_t| \le \sum_{j=1}^{N} \left| (U P U^T)_{jj} \right| \le \left\| U P U^T \right\|_{tr} = \|P\|_{tr} \le N 2^{2q-2},$$

where the second inequality is a property of the trace norm.

Let $\lambda_{it}$ be the eigenvalue of $\mu_i$ in eigenspace $E_t$. Knuth [81] gives an exact combinatorial expression for $\lambda_{it}$. We will not state this explicitly here, but just note that $\lambda_{it}$ is a degree-$t$ polynomial in $i$, and that $|\lambda_{it}| \le 2^{-t/4} / N$ for $i \le n/8$. Now consider the high-degree polynomial $p$ defined by

$$p(i) = \sum_{t=0}^{n/4} a_t \lambda_{it}.$$

This satisfies

$$p(i) = \sum_{t=0}^{n/4} \mathrm{Tr}((U P U^T)_t)\, \lambda_{it} = \langle U P U^T, U \mu_i U^T \rangle = \langle P, \mu_i \rangle = P(i).$$

Let $q$ be the degree-$d$ polynomial obtained by removing the high-degree parts of $p$:

$$q(i) = \sum_{t=0}^{d} a_t \lambda_{it}.$$

Then $P$ and $q$ are close on all integers $i$ between 0 and $n/8$:

$$|P(i) - q(i)| = |p(i) - q(i)| = \left| \sum_{t=d+1}^{n/4} a_t \lambda_{it} \right| \le \frac{2^{-d/4}}{N} \sum_{t=0}^{n/4} |a_t| \le 2^{-d/4 + 2q}.$$

C.3 The quantum lower bound for Intersection

Now suppose we have a $q$-qubit protocol for the Intersection problem, say with error probability at most $1/3$ on all inputs $x, y$. Our goal is to show that $q$ is at least about $\sqrt{n}$. Since the protocol outputs 1 with high probability if, and only if, $x$ and $y$ intersect in at least one point, we know the following about the quantity $P(i) = \mathbb{E}_{|x| = |y| = n/4,\ |x \wedge y| = i}[P(x, y)]$: $P(0) \in [0, 1/3]$ and $P(i) \in [2/3, 1]$ for all $i \in \{1, \ldots, n/4\}$.

This $P(i)$ is only defined on integers, but by Lemma 6 we can approximate it up to some small additive error $\epsilon$ using a polynomial $q$ of degree $d = 8q + \lceil 4 \log(1/\epsilon) \rceil$. Then we know $q(0) \in [-\epsilon, 1/3 + \epsilon]$ and $q(i) \in [2/3 - \epsilon, 1 + \epsilon]$ for all $i \in \{1, \ldots, n/8\}$. However, the following result of Ehlich and Zeller [59] and Rivlin and Cheney [114] says that such a polynomial $q$ must have degree about $\sqrt{n}$:

Theorem 7 (Ehlich & Zeller; Rivlin & Cheney). Let $p : \mathbb{R} \to \mathbb{R}$ be a polynomial such that $b_1 \le p(i) \le b_2$ for every integer $0 \le i \le N$, and the derivative $p'$ satisfies $|p'(x)| \ge c$ for some real $0 \le x \le N$. Then the degree of $p$ is at least $\sqrt{cN / (c + b_2 - b_1)}$.
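To see how Theorem 7 turns into a bound on the number of qubits, one can plug in the parameters of the approximating polynomial. The sketch below is an illustration under stated assumptions, not the authors' calculation: it takes $N = n/8$, $b_1 = -\epsilon$, $b_2 = 1 + \epsilon$, and $c = 1/3 - 2\epsilon$ (the jump between $q(0)$ and $q(1)$ forces a derivative of at least this size somewhere in $[0, 1]$ by the mean value theorem, assuming $\epsilon < 1/6$). It also assumes the logarithm in $d = 8q + \lceil 4 \log(1/\epsilon) \rceil$ is base 2. The function names are hypothetical.

```python
from math import ceil, log2, sqrt


def degree_lower_bound(c, N, b1, b2):
    """Degree bound of Theorem 7: sqrt(c * N / (c + b2 - b1))."""
    return sqrt(c * N / (c + b2 - b1))


def qubit_lower_bound(n, eps):
    """Lower bound on q implied by d = 8q + ceil(4 log2(1/eps)) >= degree bound.

    Assumes 0 < eps < 1/6 so that the derivative bound c is positive.
    """
    N = n // 8                       # the polynomial is controlled on {0, ..., n/8}
    b1, b2 = -eps, 1 + eps           # range of the polynomial on those integers
    c = (2 / 3 - eps) - (1 / 3 + eps)  # gap between its values at 1 and at 0
    min_degree = degree_lower_bound(c, N, b1, b2)
    return (min_degree - ceil(4 * log2(1 / eps))) / 8
```

Because the degree bound grows like $\sqrt{N}$, quadrupling `n` roughly doubles the value returned by `qubit_lower_bound` once the additive $\log(1/\epsilon)$ term is negligible.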

It thus follows that the original protocol must have communicated at least about $\sqrt{n}$ qubits.

In his paper, Razborov gives essentially tight lower bounds not just for the Intersection problem, but for any communication problem that depends only on the size of the intersection of the inputs $x$ and $y$. This combines Lemma 6 with a polynomial degree lower bound due to Paturi [105]. The lower bound proof we gave here only applies to quantum protocols that do not start with an entangled state, but Razborov showed the same lower bound for protocols with prior entanglement, at the expense of some more technical complication. Recently, an alternative proof was obtained by Sherstov [122].

D Asymmetric Detection Efficiency

Here we prove the results stated in Section 7.1.3 concerning the connection between asymmetric experiments where a single detector is inefficient, and classical protocols with perfect detectors that use one-way communication, i.e., where all the communication takes place from Alice to Bob.

Let us suppose that in order to reproduce the quantum correlations using one-way communication from Alice to Bob and shared randomness, $c_{\epsilon'}$ bits of communication are required to reproduce the correlations with error $\epsilon'$. More precisely, the error is measured as the total variational distance between the predictions of quantum theory $P_{QM}(ab|xy)$ and the output $P_{class}(ab|xy)$ of the classical protocol:

$$\mathrm{error} = \max_{xy} \sum_{ab} \left| P_{class}(ab|xy) - P_{QM}(ab|xy) \right|.$$

Let us also suppose that there exists a protocol that uses only shared randomness (a local hidden variable model) in which Alice’s detector has efficiency $\eta_\epsilon$ and Bob’s detector is perfect, and that reproduces the quantum correlations with error $\epsilon$. More precisely, the fact that Alice’s detector has efficiency $\eta$ means that $P(\perp b|xy) = \eta$ independently of $b, x, y$, where $\perp$ corresponds to Alice’s detector not giving a result. The error is measured as the total variational distance between the predictions of quantum theory $P_{QM}(ab|xy)$ (when the detectors are 100% efficient) and the predictions $P_{LHV}(ab|xy)$ of the LHV model. We divide the latter by $\eta$ to take into account that Alice’s detector gives a result with probability $\eta$:

$$\mathrm{error} = \max_{xy} \sum_{ab} \left| \frac{P_{LHV}(ab|xy)}{\eta} - P_{QM}(ab|xy) \right|.$$

Then we have:

Theorem 8. With the above hypothesis, we have $\eta_\epsilon \le O((-\ln \epsilon)\, 2^{-c_{2\epsilon}})$.

To prove this, we use the local hidden variable (LHV) model with detection efficiency $\eta_\epsilon$ to construct a classical protocol with communication. The LHV uses shared randomness $r$. Alice and Bob share $k$ independently chosen instances of the shared randomness $r_1, r_2, \ldots, r_k$. Alice checks whether she should give an output for at least one value of the shared randomness. This occurs with probability $1 - (1 - \eta)^k$. If so, she sends Bob the index $j$ of the shared randomness $r_j$ for which she gives an output (using $\log k$ bits of communication), and they give the corresponding output. If there is no instance of the shared randomness for which Alice should give an output in the LHV model, Alice gives a random output and sends Bob a random index $j$. This occurs with probability $(1 - \eta)^k$, and in this case Alice and Bob’s results will most likely be completely different from those predicted by quantum mechanics. The error probability in the model with communication is thus $P(\mathrm{error}) \le (1 - (1 - \eta)^k)\epsilon + (1 - \eta)^k \le \epsilon + (1 - \eta)^k$. Let us take $k = \frac{\ln \epsilon}{\ln(1 - \eta)}$; then the error is bounded by $P(\mathrm{error}) \le \epsilon + (1 - \eta)^{\ln \epsilon / \ln(1 - \eta)} = 2\epsilon$. But we know that to produce the correlations with error $2\epsilon$ we need at least $c_{2\epsilon}$ bits of one-way communication, hence $k \ge 2^{c_{2\epsilon}}$. Therefore $-\ln(1 - \eta) \le (-\ln \epsilon)\, 2^{-c_{2\epsilon}}$, which implies the result.

(Note that the above mapping does not hold when both Alice and Bob’s detectors are inefficient, since if they try the above procedure, they will need to find a value of the shared randomness $r_j$ for which both their detectors produce an output, i.e., solve an instance of the Intersection problem.)

Let us apply this result to the Hidden Matching problem.
As mentioned in Section 3.7, this problem can be solved using $\log n$ ebits and $\log n$ bits of classical communication from Alice to Bob; but if only classical communication from Alice to Bob is considered, then at least $\Omega(\sqrt{n})$ bits of communication are required, even allowing for a small error probability. This implies that the correlations obtained by measuring the ebits can only be reproduced using at least $\Omega(\sqrt{n})$ bits of classical communication from Alice to Bob, even allowing for a small error probability. The above result then shows that these correlations remain non-local (i.e., cannot be reproduced by a classical model without communication) if Bob’s detector has 100% efficiency and Alice’s detector has efficiency $\eta \ge 2^{-\Omega(\sqrt{n})}$