Full randomness from arbitrarily deterministic events
Rodrigo Gallego, Lluis Masanes, Gonzalo de la Torre, Chirag Dhara, Leandro Aolita, Antonio Acin
1 ICFO-Institut de Ciencies Fotoniques, Av. Carl Friedrich Gauss, 3, 08860 Castelldefels, Barcelona, Spain
2 ICREA-Institució Catalana de Recerca i Estudis Avançats, Lluís Companys 23, 08010 Barcelona, Spain
Do completely unpredictable events exist in nature? Classical theory, being fully deterministic, completely excludes fundamental randomness. On the contrary, quantum theory allows for randomness within its axiomatic structure. Yet, the fact that a theory makes predictions only in probabilistic terms does not imply the existence of any form of randomness in nature. The question then remains whether one can certify randomness independently of the physical framework used. While standard Bell tests [1] approach this question from this perspective, they require prior perfect randomness, which renders the approach circular. Recently, it has been shown that it is possible to certify full randomness using almost perfect random bits [2]. Here, we prove that full randomness can indeed be certified using quantum non-locality under the minimal possible assumptions: the existence of a source of arbitrarily weak (but non-zero) randomness and the impossibility of instantaneous signalling. Thus we are left with a strict dichotomic choice: either our world is fully deterministic or there exist in nature events that are fully random. Apart from the foundational implications, our results represent a quantum protocol for full randomness amplification, an information task known to be impossible classically [3]. Finally, they open a new path for device-independent protocols under minimal assumptions.
Understanding whether nature is deterministically pre-determined or there are intrinsically random processes is a fundamental question that has attracted the interest of multiple thinkers, ranging from philosophers and mathematicians to physicists or neuroscientists. Nowadays this question is also important from a practical perspective, as random bits constitute a valuable resource for applications such as cryptographic protocols, gambling, or the numerical simulation of physical and biological systems.

Classical physics is a deterministic theory. Perfect knowledge of the positions and velocities of a system of classical particles at a given time, as well as of their interactions, allows one to predict their future (and also past) behavior with total certainty [4]. Thus, any randomness observed in classical systems is not intrinsic to the theory but just a manifestation of our imperfect description of the system.

The advent of quantum physics put into question this deterministic viewpoint, as there exist experimental situations for which quantum theory gives predictions only in probabilistic terms, even if one has a perfect description of the preparation and interactions of the system. A possible solution to this classically counterintuitive fact was proposed in the early days of quantum physics: Quantum mechanics had to be incomplete [5], and there should be a complete theory capable of providing deterministic predictions for all conceivable experiments. There would thus be no room for intrinsic randomness, and any apparent randomness would again be a consequence of our lack of control over hypothetical "hidden variables" not contemplated by the quantum formalism.

Bell's no-go theorem [1], however, implies that hidden-variable theories are inconsistent with quantum mechanics. Therefore, none of these could ever render a deterministic completion to the quantum formalism.
More precisely, all hidden-variable theories compatible with a local causal structure predict that any correlations among space-like separated events satisfy a series of inequalities, known as Bell inequalities. Bell inequalities, in turn, are violated by some correlations among quantum particles. This form of correlations defines the phenomenon of quantum non-locality.

Now, it turns out that quantum non-locality does not necessarily imply the existence of fully unpredictable processes in nature. The reasons behind this are subtle. First of all, unpredictable processes could be certified only if the no-signalling principle holds. This states that no instantaneous communication is possible, which imposes in turn a local causal structure on events, as in Einstein's special relativity. In fact, Bohm's theory is both deterministic and able to reproduce all quantum predictions [6], but it is incompatible with no-signalling. Thus, we assume throughout the validity of the no-signalling principle. Yet, even within the no-signalling framework, it is still not possible to infer the existence of fully random processes only from the mere observation of non-local correlations. This is due to the fact that Bell tests require measurement settings chosen at random, but the actual randomness in such choices can never be certified. The extremal example is given when the settings are determined in advance. Then, any Bell violation can easily be explained in terms of deterministic models.
As a matter of fact, super-deterministic models, which postulate that all phenomena in the universe, including our own mental processes, are fully pre-programmed, are by definition impossible to rule out.

These considerations imply that the strongest result on the existence of randomness one can hope for using quantum non-locality is stated by the following possibility: Given a source that produces an arbitrarily small but non-zero amount of randomness, can one still certify the existence of completely random processes? The main result of this work is to provide an affirmative answer to this question. Our results, then, imply that the existence of correlations as those predicted by quantum physics forces us into a dichotomic choice: Either we postulate super-deterministic models in which all events in nature are fully pre-determined, or we accept the existence of fully unpredictable events.

FIG. 1: Local causal structure and randomness amplification. A source $S$ produces a sequence $x_1, x_2, \ldots, x_j, \ldots$ of imperfect random bits. The goal of randomness amplification is to produce a new source $S_f$ of perfect random bits, that is, to process the initial bits to get a final bit $k$ fully uncorrelated (free) from any potential cause of it. All space-time events outside the future light-cone of $k$ may have been in its past light-cone before and therefore constitute a potential cause of it. Any such event can be modeled by a measurement $z$, with an outcome $e$, on some physical system. This system may be under the control of an adversary Eve, interested in predicting the value of $k$.

Besides the philosophical and physics-foundational implications, our results provide a protocol for perfect randomness amplification using quantum non-locality. Randomness amplification is an information-theoretic task whose goal is to use an input source $S$ of imperfectly random bits to produce perfect random bits that are arbitrarily uncorrelated from all the events that may have been a potential cause of them, i.e. arbitrarily free. In general, $S$ produces a sequence of bits $x_1, x_2, \ldots, x_j, \ldots$, with $x_j = 0$ or $1$ for all $j$, see Fig. 1. Each bit $x_j$ contains some randomness, in the sense that the probability $P(x_j|e)$ that it takes a given value $x_j$, conditioned on any pre-existing variable $e$, is such that
$$\epsilon \le P(x_j|e) \le 1 - \epsilon \tag{1}$$
for all $j$ and $e$, where $0 < \epsilon \le 1/2$. The variable $e$ can correspond to any event that could be a possible cause of bit $x_j$. Therefore, $e$ represents events contained in the space-time region lying outside the future light-cone of $x_j$. Free random bits correspond to $\epsilon = 1/2$, while deterministic ones, i.e. those predictable with certainty by an observer with access to $e$, correspond to $\epsilon = 0$. More precisely, when $\epsilon = 0$ the bound (1) is trivial and no randomness can be certified. We refer to $S$ as an $\epsilon$-source, and to any bit satisfying (1) as an $\epsilon$-free bit.
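To make the definition of an $\epsilon$-source concrete, the following Python sketch models one possible adversarial source. It is only an illustration under stated assumptions: the pre-existing variable $e$ is reduced to the history of earlier bits, and the names `clip_bias`, `copy_previous_bias`, `eps_source_bits` and `string_probability` are hypothetical helpers, not part of the protocol. Since every conditional factor lies in $[\epsilon, 1-\epsilon]$, the probability of any $n$-bit string lies between $\epsilon^n$ and $(1-\epsilon)^n$.

```python
import itertools
import random


def clip_bias(p_one, eps):
    """Force P(bit = 1 | history) into [eps, 1 - eps], as condition (1) demands."""
    return min(max(p_one, eps), 1.0 - eps)


def copy_previous_bias(history):
    """Hypothetical adversarial rule: push each bit towards the previous one."""
    if not history:
        return 0.5
    return 1.0 if history[-1] == 1 else 0.0


def eps_source_bits(n, eps, bias_rule, rng):
    """Sample n bits whose conditional biases never leave [eps, 1 - eps]."""
    bits = []
    for _ in range(n):
        p_one = clip_bias(bias_rule(bits), eps)
        bits.append(1 if rng.random() < p_one else 0)
    return bits


def string_probability(bits, eps, bias_rule):
    """Probability of a given bit string, via the chain rule over the history."""
    prob = 1.0
    for j, bit in enumerate(bits):
        p_one = clip_bias(bias_rule(list(bits[:j])), eps)
        prob *= p_one if bit == 1 else 1.0 - p_one
    return prob
```

For example, `eps_source_bits(20, 0.1, copy_previous_bias, random.Random(0))` returns a strongly correlated string, yet no bit in it can be predicted with certainty from the earlier ones.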
The aim of randomness amplification is then to generate, from arbitrarily many uses of $S$, a final source $S_f$ of $\epsilon_f$ arbitrarily close to $1/2$. If this is possible, no cause $e$ can be assigned to the bits produced by $S_f$, which are then fully unpredictable. Note that efficiency issues, such as the rate of uses of $S$ required per final bit generated by $S_f$, do not play any role in randomness amplification. The relevant figure of merit is just the quality, measured by $\epsilon_f$, of the final bits. Thus, without loss of generality, we restrict our analysis to the problem of generating a single final free random bit $k$.

Santha and Vazirani proved that randomness amplification is impossible using classical resources [3]. This is in a sense intuitive, in view of the absence of any intrinsic randomness in classical physics. In the quantum regime, randomness amplification has been recently studied by Colbeck and Renner [2]. There, $S$ is used to choose the measurement settings by two distant observers, Alice and Bob, in a Bell test [7] involving two entangled quantum particles. The measurement outcome obtained by one of the observers, say Alice, in one of the experimental runs (also chosen with $S$) defines the output random bit. Colbeck and Renner proved how input bits with very high randomness, of $0.442 < \epsilon \le 0.5$, can be mapped into arbitrarily free random bits of $\epsilon_f \to 1/2$, and conjectured that randomness amplification should be possible for any initial randomness [2]. Our results also solve this conjecture, as we show that quantum non-locality can be exploited to attain full randomness amplification, i.e. that $\epsilon_f$ can be made arbitrarily close to $1/2$ for any $0 < \epsilon \le 1/2$.

Before presenting the ingredients of our proof, it is worth commenting on previous works on randomness in connection with quantum non-locality. In [8] it was shown how to bound the intrinsic randomness generated in a Bell test.
These bounds can be used for device-independent randomness expansion, following a proposal by Colbeck [9], and to achieve a quadratic expansion of the amount of random bits [8] (see [10–13] for further works on device-independent randomness expansion). Note however that, in randomness expansion, one assumes instead, from the very beginning, the existence of an input seed of free random bits, and the main goal is to expand this into a larger sequence. The figure of merit there is the ratio between the length of the final and initial strings of free random bits. Finally, other recent works have analyzed how a lack of randomness in the measurement choices affects a Bell test [14–16] and the randomness generated in it [17].

Let us now sketch the realization of our final source $S_f$. We use the input $\epsilon$-source $S$ to choose the measurement settings in a multipartite Bell test involving a number of observers that depends both on the input $\epsilon$ and the target $\epsilon_f$. After verifying that the expected Bell violation is obtained, the measurement outcomes are combined to define the final bit $k$. For pedagogical reasons, we adopt a cryptographic perspective and assume the worst-case scenario where all the devices we use may have been prepared by an adversary Eve equipped with arbitrary non-signalling resources, possibly even supra-quantum ones. In the preparation, Eve may have also had access to $S$ and correlated the bits it produces with some physical system at her disposal, represented by a black box in Fig. 1. Without loss of generality, we can assume that Eve can reveal the value of $e$ at any stage of the protocol by measuring this system. Full randomness amplification is then equivalent to proving that Eve's correlations with $k$ can be made arbitrarily small.

FIG. 2: Protocol for full randomness amplification based on quantum non-locality. In the first two steps, all $N$ quintuplets measure their devices, where the choice of measurement is made using the $\epsilon$-source $S$; the quintuplets whose settings do not appear in the five-party Mermin inequality are discarded (in red). In steps 3 and 4, the remaining quintuplets are grouped into blocks. One of the blocks is chosen as the distillation block, using again $S$, while the others are used to check the Bell violation. In the fifth step, the random bit $k$ is extracted from the distillation block.

Bell tests for which quantum correlations achieve the maximal non-signalling violation, also known as Greenberger-Horne-Zeilinger (GHZ) paradoxes [18], are necessary for randomness amplification. This is due to the fact that unless the maximal non-signalling violation is attained, for sufficiently small $\epsilon$, Eve may fake the observed correlations with classical deterministic resources. This attack ceases to be possible when the maximal non-signalling violation is observed, as Eve is forced to prepare only those non-local correlations attaining the maximal violation. GHZ paradoxes are however not sufficient. Consider for instance the GHZ paradox given by the tripartite Mermin Bell inequality [19]. One can see that Eve can predict with certainty any function of the measurement outcomes and still deliver the maximal violation, for all $\epsilon \le 1/2$ (see Appendix B).

For more parties though, the latter happens not to hold any longer. In fact, consider any correlations attaining the maximal violation of the five-party Mermin inequality. Take the bit corresponding to the majority-vote function of the outcomes of any subset of three out of the five observers, say the first three. This function is equal to zero if at least two of the three bits are equal to zero, and equal to one otherwise. We show in Appendix B that Eve's predictability on this bit is at most 3/4. This is our first result:

Result 1.
Given an $\epsilon$-source with any $0 < \epsilon \le 1/2$, and quantum five-party non-local resources, an intermediate $\epsilon_i$-source of $\epsilon_i = 1/4$ can be obtained.

The partial unpredictability in the five-party Mermin Bell test is the building block of our protocol. To complete it, we must equip it with two essential components: (i) an estimation procedure that verifies that the untrusted devices do yield the required Bell violation; and (ii) a distillation procedure that, from sufficiently many $\epsilon_i$-bits generated in the 5-party Bell experiment, distills a single final $\epsilon_f$-source of $\epsilon_f \to 1/2$. To these ends, we consider a more complex Bell test involving $N$ groups of five observers (quintuplets) each, as depicted in Fig. 2. The steps in the protocol are described in Box 1.

In the appendices we prove using techniques from [20] that, if the protocol is not aborted, the final bit produced by the protocol is indistinguishable from an ideal random bit uncorrelated to the eavesdropper. Thus, the output free random bits satisfy universally-composable security [21], the highest standard of cryptographic security, and could be used as seed for randomness expansion or any other protocol.

Box 1: Protocol for Randomness Amplification
1. Every observer measures his device in one of two settings chosen at random by the input $\epsilon$-source $S$.
2. Every quintuplet whose settings combination does not appear in the five-party Mermin Bell test is discarded. If the quintuplets left are fewer than $N/3$, abort.
3. Group the quintuplets left into $N_b$ blocks of equal size $N_d$. Choose a distillation block at random with $S$.
4. If the outcomes of any quintuplet not in the distillation block are inconsistent with the maximal violation of the five-party Mermin Bell test, abort.
5. Distill the final bit from the distillation block. This is done in the following way. The majority vote $\mathrm{maj}(\mathbf{a})$ among, for instance, the outcomes $a_1$, $a_2$ and $a_3$ of the first three users is computed for each quintuplet. Then, a function $f$ maps the resulting $N_d$ bits into the final bit $k$.

Finally, we must show that quantum resources can indeed successfully implement our protocol. It is immediate to see that the qubit measurements $X$ or $Y$ on the quantum state $|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|00000\rangle + |11111\rangle\right)$, with $|0\rangle$ and $|1\rangle$ the eigenstates of the $Z$ qubit basis, yield correlations that maximally violate the five-partite Mermin inequality in question. This completes our main result.

Result 2 (Main Result).
Given an $\epsilon$-source with any $0 < \epsilon \le 1/2$, a perfect free random bit $k$ can be obtained using quantum non-local correlations.

In summary, we have presented a protocol that, using quantum non-local resources, attains full randomness amplification. This task is impossible classically and was not known to be possible in the quantum regime. As our goal was to prove full randomness amplification, our analysis focuses on the noise-free case. In fact, the noisy case only makes sense if one does not aim at perfect random bits and bounds the amount of randomness in the final bit. Then, it should be possible to adapt our protocol in order to get a bound on the noise it tolerates. Other open questions that naturally follow from our results include the study of randomness amplification against quantum eavesdroppers and the search for protocols in the bipartite scenario.

From a more fundamental perspective, our results imply that there exist experiments whose outcomes are fully unpredictable. The only two assumptions for this conclusion are the existence of events with an arbitrarily small but non-zero amount of randomness and the validity of the no-signalling principle. Dropping the former implies accepting a super-deterministic view where no randomness exists, so that we experience a fully pre-determined reality. This possibility is uninteresting from a scientific perspective, and even uncomfortable from a philosophical one. Dropping the latter, in turn, implies abandoning a local causal structure for events in space-time. However, this is one of the most fundamental notions of special relativity, without which even the very meaning of randomness or predictability would be unclear, as these concepts implicitly rely on the cause-effect principle.

Acknowledgements
We acknowledge support from the ERC Starting Grant PERCENT, the EU Projects Q-Essence and QCS, the Spanish MICIIN through a Juan de la Cierva grant and projects FIS2010-14830, Explora-Intrinqra and CHIST-ERA DIQIP, an FI Grant of the Generalitat de Catalunya, Catalunya-Caixa, and Fundació Privada Cellex, Barcelona.

[1] J. S. Bell, Physics 1, 195 (1964); Speakable and unspeakable in quantum mechanics, Cambridge University Press (Cambridge, 1987).
[2] R. Colbeck and R. Renner, Free randomness can be amplified, Nature Phys. 8, 450 (2012).
[3] M. Santha and U. V. Vazirani, in Proc. 25th IEEE Symposium on Foundations of Computer Science (FOCS-84), 434 (IEEE Computer Society, 1984).
[4] P. S. Laplace, A Philosophical Essay on Probabilities, Paris (1840).
[5] A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777-780 (1935).
[6] D. Bohm, Phys. Rev. 85, 166-179 (1952); Phys. Rev. 85, 180-193 (1952).
[7] S. L. Braunstein and C. M. Caves, Wringing out better Bell inequalities, Ann. Phys. 202, 22 (1990).
[8] S. Pironio et al., Random numbers certified by Bell's theorem, Nature 464, 1021 (2010).
[9] R. Colbeck, Quantum and Relativistic Protocols for Secure Multi-Party Computation, PhD dissertation, Univ. Cambridge (2007).
[10] A. Acín, S. Massar and S. Pironio, Phys. Rev. Lett. 108, 100402 (2012).
[11] S. Pironio and S. Massar, arXiv:1111.6056.
[12] S. Fehr, R. Gelles and C. Schaffner, arXiv:1111.6052.
[13] U. V. Vazirani and T. Vidick, Proceedings of the ACM Symposium on the Theory of Computing (2012).
[14] J. Kofler, T. Paterek, and C. Brukner, Experimenter's freedom in Bell's theorem and quantum cryptography, Phys. Rev. A 73, 022104 (2006).
[15] J. Barrett and N. Gisin, How much measurement independence is needed to demonstrate nonlocality?, Phys. Rev. Lett. 106, 100406 (2011).
[16] M. J. W. Hall, Local deterministic model of singlet state correlations based on relaxing measurement independence, Phys. Rev. Lett. 105, 250404 (2010).
[17] D. E. Koh, M. J. W. Hall, Setiawan, J. E. Pope, C. Marletto, A. Kay, V. Scarani, and A. Ekert, The effects of reduced "free will" on Bell-based randomness expansion, arXiv:1202.3571.
[18] D. M. Greenberger, M. A. Horne, and A. Zeilinger, in Bell's Theorem, Quantum Theory, and Conceptions of the Universe (Kluwer, Dordrecht), p. 69 (1989).
[19] N. D. Mermin, Simple unified form for the major no-hidden-variables theorems, Phys. Rev. Lett. 65, 3373 (1990).
[20] L. Masanes, Universally-composable privacy amplification from causality constraints, Phys. Rev. Lett. 102, 140501 (2009).
[21] R. Canetti, Proc. 42nd IEEE Symposium on Foundations of Computer Science (FOCS), 136 (2001).
Appendix A: Mermin inequalities
The 5-party Mermin inequality [3] plays a central role in our construction. In each run of this Bell test, measurements (inputs) $\mathbf{x} = (x_1, \ldots, x_5)$ on five distant black boxes generate 5 outcomes (outputs) $\mathbf{a} = (a_1, \ldots, a_5)$, distributed according to a non-signaling conditional probability distribution $P(\mathbf{a}|\mathbf{x})$. Both inputs and outputs are bits, as they can take two possible values, $x_i, a_i \in \{0, 1\}$ with $i = 1, \ldots, 5$. The inequality can be written as
$$\sum_{\mathbf{a},\mathbf{x}} I(\mathbf{a},\mathbf{x})\, P(\mathbf{a}|\mathbf{x}) \ge 6, \tag{A1}$$
with coefficients
$$I(\mathbf{a},\mathbf{x}) = (a_1 \oplus a_2 \oplus a_3 \oplus a_4 \oplus a_5)\, \delta_{\mathbf{x} \in \mathcal{X}_0} + (a_1 \oplus a_2 \oplus a_3 \oplus a_4 \oplus a_5 \oplus 1)\, \delta_{\mathbf{x} \in \mathcal{X}_1}, \tag{A2}$$
where
$$\delta_{\mathbf{x} \in \mathcal{X}} = \begin{cases} 1 & \text{if } \mathbf{x} \in \mathcal{X} \\ 0 & \text{if } \mathbf{x} \notin \mathcal{X} \end{cases}$$
and
$$\mathcal{X}_0 = \{(10000), (01000), (00100), (00010), (00001), (11111)\},$$
$$\mathcal{X}_1 = \{(00111), (01011), (01101), (01110), (10011), (10101), (10110), (11001), (11010), (11100)\}.$$
That is, only half of all possible combinations of inputs, namely those in $\mathcal{X} = \mathcal{X}_0 \cup \mathcal{X}_1$, appear in the Bell inequality.

The maximal, non-signalling and algebraic, violation of the inequality corresponds to the situation in which the left-hand side of (A1) is zero. The key property of inequality (A1) is that its maximal violation can be attained by quantum correlations. In fact, Mermin inequalities are defined for an arbitrary number of parties and quantum correlations attain the maximal non-signalling violation for any odd number of parties [4]. This violation is always attained by performing local measurements on a GHZ quantum state.

Appendix B: Partial unpredictability in the five-party Mermin inequality
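The five-party Mermin inequality (A1), with the coefficients (A2), lends itself to a direct numerical check. The Python sketch below evaluates the left-hand side of (A1) for the five-qubit GHZ state and, by brute force, for all local deterministic strategies. Two points are our own assumptions rather than statements of the text: input $x_i = 1$ is mapped to the measurement $X$ and $x_i = 0$ to $Y$, and the helper names (`mermin_coeff`, `ghz_probability`, `bell_value`, `classical_minimum`) are purely illustrative.

```python
from itertools import product

PARTIES = 5
ALL_BITS = list(product((0, 1), repeat=PARTIES))
# Inputs appearing in (A1): weight 1 or 5 (set X_0) and weight 3 (set X_1).
X0 = [x for x in ALL_BITS if sum(x) in (1, 5)]
X1 = [x for x in ALL_BITS if sum(x) == 3]


def mermin_coeff(a, x):
    """Coefficient I(a, x) of Eq. (A2); zero for inputs outside X_0 and X_1."""
    parity = sum(a) % 2
    if sum(x) in (1, 5):
        return parity
    if sum(x) == 3:
        return parity ^ 1
    return 0


def ghz_probability(a, x):
    """P(a|x) for (|00000> + |11111>)/sqrt(2).

    Assumption: x_i = 1 selects X and x_i = 0 selects Y; outcome a_i labels
    the eigenvalue (-1)**a_i of the measured Pauli operator.
    """
    overlap = 1 + 0j
    for a_i, x_i in zip(a, x):
        sign = 1 - 2 * a_i
        overlap *= sign if x_i == 1 else -1j * sign
    return abs(1 + overlap) ** 2 / 64


def bell_value(prob):
    """Left-hand side of (A1) for a behaviour prob(a, x)."""
    return sum(mermin_coeff(a, x) * prob(a, x) for x in X0 + X1 for a in ALL_BITS)


def classical_minimum():
    """Smallest left-hand side of (A1) over all local deterministic strategies."""
    best = len(X0) + len(X1)
    for table in product((0, 1), repeat=2 * PARTIES):
        # table[2 * i + x_i] is the fixed output of party i on input x_i.
        violations = 0
        for x in X0 + X1:
            a = tuple(table[2 * i + x[i]] for i in range(PARTIES))
            violations += mermin_coeff(a, x)
        best = min(best, violations)
    return best
```

Under these assumptions `bell_value(ghz_probability)` vanishes, which is the maximal violation, while `classical_minimum()` reproduces the local bound on the right-hand side of (A1).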
Our interest in Mermin inequalities comes from the fact that, for an odd number of parties, they can be maximally violated by quantum correlations. These correlations, then, define a GHZ paradox, which, as explained in the main text, is necessary for full randomness amplification. As also mentioned in the main text, GHZ paradoxes are however not sufficient. In fact, it is always possible to find non-signalling correlations that (i) maximally violate the 3-party Mermin inequality but (ii) assign a deterministic value to any function of the measurement outcomes. This observation can be checked for all unbiased functions mapping $\{0,1\}^3$ to $\{0,1\}$ (there are $\binom{8}{4}$ of those) through a linear program analogous to the one used to prove the next Theorem. For a larger number of parties, however, some functions cannot be deterministically fixed to a specific value while maximally violating a Mermin inequality, as implied by the following Theorem.

Theorem 1.
Let $P(\mathbf{a}|\mathbf{x})$ be a five-party non-signaling conditional probability distribution in which inputs $\mathbf{x} = (x_1, \ldots, x_5)$ and outputs $\mathbf{a} = (a_1, \ldots, a_5)$ are bits. Consider the bit $\mathrm{maj}(\mathbf{a}) \in \{0, 1\}$ defined by the majority-vote function of any subset consisting of three of the five measurement outcomes, say the first three, $a_1$, $a_2$ and $a_3$. Then, all non-signalling correlations attaining the maximal violation of the 5-party Mermin inequality are such that the probability that $\mathrm{maj}(\mathbf{a})$ takes a given value, say 0, is bounded by
$$1/4 \le P(\mathrm{maj}(\mathbf{a}) = 0) \le 3/4. \tag{B1}$$

Proof.
This result was obtained by solving a linear program. Therefore, the proof is numeric, but exact. Formally, let $P(\mathbf{a}|\mathbf{x})$ be a 5-partite no-signaling probability distribution. For a given input $\mathbf{x} = \mathbf{x}_0 \in \mathcal{X}$, we performed the maximization
$$P_{\max} = \max_{P}\; P(\mathrm{maj}(\mathbf{a}) = 0\,|\,\mathbf{x}_0) \quad \text{subject to} \quad I(\mathbf{a},\mathbf{x}) \cdot P(\mathbf{a}|\mathbf{x}) = 0, \tag{B2}$$
which yields the value $P_{\max} = 3/4$. Since the same result holds for $P(\mathrm{maj}(\mathbf{a}) = 1\,|\,\mathbf{x}_0)$, we get the bound $1/4 \le P(\mathrm{maj}(\mathbf{a}) = 0) \le 3/4$.

As a further remark, note that a lower bound to $P_{\max}$ can easily be obtained by noticing that one can construct conditional probability distributions $P(\mathbf{a}|\mathbf{x})$ that maximally violate the 5-partite Mermin inequality (A1) and for which at most one of the output bits (say $a_1$) is deterministically fixed to either 0 or 1. If the other two output bits ($a_2$, $a_3$) were to be completely random, the majority-vote of the three of them, $\mathrm{maj}(a_1, a_2, a_3)$, could be guessed with a probability of $3/4$. Our numerical results say that this turns out to be an optimal strategy.

Theorem 1 implies Result 1 in the main text. Moreover, it constitutes the simplest GHZ paradox in which some randomness can be certified. This paradox is the building block of our randomness amplification protocol, presented in the next section.

Appendix C: Protocol for full randomness amplification
In this section, we describe in more detail the protocol summarized in Box 1 of the main text. The protocol uses as resources the $\epsilon$-source $S$ and $5N$ quantum systems. Recall that the bits produced by the source $S$ are such that the probability $P(x_j|e)$ that bit $j$ takes a given value $x_j$, conditioned on any pre-existing variable $e$, is bounded by
$$\epsilon \le P(x_j|e) \le 1 - \epsilon, \tag{C1}$$
for all $j$ and $e$, where $0 < \epsilon \le 1/2$. The bound, when applied to $n$-bit strings produced by the $\epsilon$-source, implies that
$$\epsilon^n \le P(x_1, \ldots, x_n|e) \le (1 - \epsilon)^n. \tag{C2}$$
Each of the quantum systems is abstractly modeled by a black box with binary input $x$ and output $a$. The protocol processes classically the bits generated by $S$ and by the quantum boxes. The result of the protocol is a classical symbol $k$, associated to an abort/no-abort decision. If the protocol is not aborted, $k$ encodes the final output bit, with possible values 0 or 1. When the protocol is aborted, instead, no numerical value is assigned to $k$ but the symbol $\emptyset$, representing the fact that the bit is empty. The formal steps of the protocol are:

1. $S$ is used to generate $N$ quintuple-bits $\mathbf{x}_1, \ldots, \mathbf{x}_N$, which constitute the inputs for the $5N$ boxes. The boxes then provide $N$ output quintuple-bits $\mathbf{a}_1, \ldots, \mathbf{a}_N$.

2. The quintuplets such that $\mathbf{x} \notin \mathcal{X}$ are discarded. The protocol is aborted if the number of remaining quintuplets is less than $N/3$.

3. The quintuplets left after step 2 are organized in $N_b$ blocks, each one having $N_d$ quintuplets. The number $N_b$ of blocks is chosen to be a power of 2. For the sake of simplicity, we relabel the index running over the remaining quintuplets, namely inputs $\mathbf{x}_1, \ldots, \mathbf{x}_{N_b N_d}$ and outputs $\mathbf{a}_1, \ldots, \mathbf{a}_{N_b N_d}$. The input and output of the $j$-th block are defined as $y_j = (\mathbf{x}_{(j-1)N_d+1}, \ldots, \mathbf{x}_{(j-1)N_d+N_d})$ and $b_j = (\mathbf{a}_{(j-1)N_d+1}, \ldots, \mathbf{a}_{(j-1)N_d+N_d})$ respectively, with $j \in \{1, \ldots, N_b\}$. The random variable $l \in \{1, \ldots, N_b\}$ is generated by using $\log_2 N_b$ further bits from $S$. The value of $l$ specifies which block $(b_l, y_l)$ is chosen to generate $k$, i.e. the distilling block. We define $(\tilde{b}, \tilde{y}) = (b_l, y_l)$. The other $N_b - 1$ blocks are used to check the Bell violation.

4. The function
$$r[b, y] = \begin{cases} 1 & \text{if } I(\mathbf{a}_1, \mathbf{x}_1) = \cdots = I(\mathbf{a}_{N_d}, \mathbf{x}_{N_d}) = 0 \\ 0 & \text{otherwise} \end{cases} \tag{C3}$$
tells whether block $(b, y)$ features the right correlations ($r = 1$) or the wrong ones ($r = 0$), in the sense of being compatible with the maximal violation of inequality (A1). This function is computed for all blocks but the distilling one. The protocol is aborted unless all of them give the right correlations,
$$g = \prod_{j=1,\, j \ne l}^{N_b} r[b_j, y_j] = \begin{cases} 1 & \text{not abort} \\ 0 & \text{abort.} \end{cases} \tag{C4}$$
Note that the abort/no-abort decision is independent of whether the distilling block $l$ is right or wrong.

5. If the protocol is not aborted then $k$ is assigned a bit generated from $b_l = (\mathbf{a}_1, \ldots, \mathbf{a}_{N_d})$ as
$$k = f\left(\mathrm{maj}(\mathbf{a}_1), \ldots, \mathrm{maj}(\mathbf{a}_{N_d})\right). \tag{C5}$$
Here $f : \{0,1\}^{N_d} \to \{0,1\}$ is a function characterized in Lemma 4 below, while $\mathrm{maj}(\mathbf{a}_i) \in \{0,1\}$ is the majority-vote among the first three bits of the quintuple string $\mathbf{a}_i$. If the protocol is aborted it sets $k = \emptyset$.

At the end of the protocol, $k$ is potentially correlated with the settings of the distilling block $\tilde{y} = y_l$, the bit $g$ in (C4), and the bits
$$t = [l, (b_1, y_1), \ldots, (b_{l-1}, y_{l-1}), (b_{l+1}, y_{l+1}), \ldots, (b_{N_b}, y_{N_b})].$$
Additionally, an eavesdropper Eve might have a physical system correlated with $k$, which she may measure at any instance of the protocol. This system is not necessarily classical or quantum; the only assumption about it is that measuring it does not produce instantaneous signaling anywhere else. We label all possible measurements Eve can perform with the classical variable $z$, and with $e$ the corresponding outcome.
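The classical post-processing in steps 2 to 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function and parameter names (`distill_bit`, `min_kept`, `choice_bits`) are hypothetical, block indices run from 0 instead of 1, the abort threshold of step 2 is passed in explicitly as `min_kept`, and the distillation function $f$ must be supplied by the caller. The `xor_placeholder` shown here is only a stand-in and is not the function $f$ of the security proof.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Quint = Tuple[int, int, int, int, int]


def in_inequality(x: Quint) -> bool:
    """True if the input combination belongs to X = X_0 u X_1 (weight 1, 3 or 5)."""
    return sum(x) in (1, 3, 5)


def mermin_coeff(a: Quint, x: Quint) -> int:
    """I(a, x) of Eq. (A2) for x in X; 0 means consistent with maximal violation."""
    parity = sum(a) % 2
    return parity if sum(x) in (1, 5) else parity ^ 1


def maj(a: Quint) -> int:
    """Majority vote among the first three output bits."""
    return 1 if a[0] + a[1] + a[2] >= 2 else 0


def xor_placeholder(bits: Sequence[int]) -> int:
    """Stand-in for f (an assumption; NOT the function used in the proof)."""
    return sum(bits) % 2


def distill_bit(
    inputs: Sequence[Quint],
    outputs: Sequence[Quint],
    n_blocks: int,
    block_size: int,
    min_kept: int,
    choice_bits: Sequence[int],
    f: Callable[[List[int]], int],
) -> Optional[int]:
    """Return the final bit k, or None (the empty symbol) if the protocol aborts."""
    if 1 << len(choice_bits) != n_blocks:
        raise ValueError("need log2(n_blocks) source bits to choose the block")
    # Step 2: discard quintuplets whose inputs do not appear in the inequality.
    kept = [(a, x) for a, x in zip(outputs, inputs) if in_inequality(x)]
    # The second condition is our own guard: there must be enough data to fill the blocks.
    if len(kept) < min_kept or len(kept) < n_blocks * block_size:
        return None
    # Step 3: group into blocks and pick the distilling block with bits from S.
    blocks = [kept[j * block_size:(j + 1) * block_size] for j in range(n_blocks)]
    chosen = 0
    for bit in choice_bits:
        chosen = 2 * chosen + bit
    # Step 4: every other block must be consistent with the maximal violation.
    for j, block in enumerate(blocks):
        if j != chosen and any(mermin_coeff(a, x) for a, x in block):
            return None
    # Step 5: majority votes of the distilling block, then the function f.
    return f([maj(a) for a, _ in blocks[chosen]])
```

Note that, as in step 4, the distilling block itself is never tested: a wrong quintuplet there does not trigger an abort.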
In summary, after the performance of the protocol all the relevant information is $k, \tilde{y}, t, g, e, z$, with statistics described by an unknown conditional probability distribution $P(k, \tilde{y}, t, g, e|z)$.

To assess the security of our protocol for full randomness amplification, we have to show that the distribution describing the protocol when not aborted is indistinguishable from the distribution $P_{\mathrm{ideal}}(k, \tilde{y}, t, e|z, g=1) = \frac{1}{2} P(\tilde{y}, t, e|z, g=1)$ describing an ideal free random bit. For later purposes, it is convenient to cover the case when the protocol is aborted with an equivalent notation: if the protocol is aborted, we define $P(k, \tilde{y}, t, e|z, g=0) = \delta_k^{\emptyset}\, P(\tilde{y}, t, e|z, g=0)$ and $P_{\mathrm{ideal}}(k, \tilde{y}, t, e|z, g=0) = \delta_k^{\emptyset}\, P(\tilde{y}, t, e|z, g=0)$, where $\delta_k^{k'}$ is a Kronecker delta. In this case, it is immediate that $P = P_{\mathrm{ideal}}$, as the locally generated symbol $\emptyset$ is always uncorrelated to the environment. To quantify the indistinguishability between $P$ and $P_{\mathrm{ideal}}$, we consider the scenario in which an observer, having access to all the information $k, \tilde{y}, t, g, e, z$, has to correctly distinguish between these two distributions. We denote by $P(\text{guess})$ the optimal probability of correctly guessing between the two distributions. This probability reads
$$P(\text{guess}) = \frac{1}{2} + \frac{1}{4} \sum_{k, \tilde{y}, t, g} \max_z \sum_e \left| P(k, \tilde{y}, t, g, e|z) - P_{\mathrm{ideal}}(k, \tilde{y}, t, g, e|z) \right|, \tag{C6}$$
where the second term can be understood as (one fourth of) the variational distance between $P$ and $P_{\mathrm{ideal}}$ generalized to the case when the distributions are conditioned on an input $z$ [6]. If the protocol is such that this guessing probability can be made arbitrarily close to 1/2, it generates a distribution $P$ that is basically indistinguishable from the ideal one. This is known as "universally-composable security", and accounts for the strongest notion of cryptographic security (see [5] and [6]).
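As a toy illustration of the guessing probability (C6), the Python sketch below evaluates it for small, explicitly tabulated distributions. The table layout is our own simplification: each table maps a triple `(public, e, z)` to a probability, where `public` stands for the tuple $(k, \tilde{y}, t, g)$ with $k$ as its first entry, and `ideal_counterpart` covers only the non-aborted case, replacing $k$ by a uniform bit independent of everything else. Both function names are hypothetical.

```python
def guessing_probability(p_real, p_ideal):
    """Eq. (C6) for finite tables mapping (public, e, z) -> probability.

    Each table is assumed to be normalised separately for every value of z.
    """
    keys = set(p_real) | set(p_ideal)
    publics = {pub for pub, _, _ in keys}
    outcomes = {e for _, e, _ in keys}
    settings = {z for _, _, z in keys}
    total = 0.0
    for pub in publics:
        # Eve picks, for each public record, the measurement z that helps most.
        total += max(
            sum(
                abs(p_real.get((pub, e, z), 0.0) - p_ideal.get((pub, e, z), 0.0))
                for e in outcomes
            )
            for z in settings
        )
    return 0.5 + 0.25 * total


def ideal_counterpart(p_real):
    """Ideal table for the non-aborted case: k uniform and independent of the rest."""
    marginal = {}
    for (pub, e, z), p in p_real.items():
        rest = pub[1:]  # everything in the public record except k
        marginal[(rest, e, z)] = marginal.get((rest, e, z), 0.0) + p
    return {
        ((k,) + rest, e, z): 0.5 * p
        for (rest, e, z), p in marginal.items()
        for k in (0, 1)
    }
```

For instance, if Eve's outcome $e$ always equals $k$, the tables `{((0,), 0, 0): 0.5, ((1,), 1, 0): 0.5}` and its ideal counterpart give a guessing probability of 3/4, whereas a bit that is uniform and independent of $e$ gives exactly 1/2.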
Universally-composable security implies that the protocol produces a random bit that is secure (free) in any context. In particular, it remains secure even if the adversary Eve has access to $\tilde{y}$, $t$ and $g$.

Our main result, namely the security of our protocol for full randomness amplification, follows from the following Theorem.

Theorem 2 (Main Theorem).
Consider the previous protocol for randomness amplification and the conditional probability distribution $P(k, \tilde{y}, t, g, e|z)$ describing the statistics of the bits $k, \tilde{y}, t, g$ generated during its execution and any possible system with input $z$ and output $e$ correlated to them. The probability $P(\text{guess})$ of correctly guessing between this distribution and the ideal distribution $P_{\mathrm{ideal}}(k, \tilde{y}, t, g, e|z)$ is such that
$$P(\text{guess}) \le \frac{1}{2} + 3\sqrt{N_d} \left[ \alpha^{N_d} + 2 N_b^{\log_2(1-\epsilon)} \left( 32 \beta \epsilon^{-5} \right)^{N_d} \right], \tag{C7}$$
where $\alpha$ and $\beta$ are real numbers such that $0 < \alpha < 1 < \beta$.

The right-hand side of (C7) can be made arbitrarily close to $1/2$, for instance by setting $N_b = \left( 32 \beta \epsilon^{-5} \right)^{2 N_d / |\log_2(1-\epsilon)|}$ and increasing $N_d$ subject to the fulfillment of the condition $N_d N_b \ge N/3$. [Note that $\log_2(1-\epsilon) < 0$.] In the limit $P(\text{guess}) \to 1/2$, the bit $k$ generated by the protocol is indistinguishable from an ideal free random bit.

The proof of Theorem 2 is provided in the next section. Before moving to it, we would like to comment on the main intuitions behind our protocol. As mentioned, the protocol builds on the 5-party Mermin inequality because it is the simplest GHZ paradox allowing some randomness certification. The estimation part, given by step 4, is rather standard and inspired by estimation techniques introduced in [7], which were also used in [2] in the context of randomness amplification. The most subtle part is the distillation of the final bit in step 5. Naively, and leaving aside estimation issues, one could argue that it is nothing but a classical processing, by means of the function $f$, of the imperfect random bits obtained via the $N_d$ quintuplets. But this seems in contradiction with the result by Santha and Vazirani proving that it is impossible to extract by classical means a perfect free random bit from imperfect ones [1]. This intuition is however wrong. The reason is that in our protocol the randomness of the imperfect bits is certified by a Bell violation, which is impossible classically. Indeed, the Bell certification allows applying techniques similar to those obtained in Ref. [6] in the context of privacy amplification against non-signalling eavesdroppers. There, it was shown how to amplify the privacy, that is the unpredictability, of one of the measurement outcomes of bipartite correlations violating a Bell inequality.
The key point is that the amplification, or distillation, was attained in a deterministic manner. That is, contrary to standard approaches, the privacy amplification process described in [6] does not consume any randomness. Clearly, these deterministic techniques are extremely convenient for our randomness amplification scenario. In fact, the distillation part in our protocol can be seen as the translation of the privacy amplification techniques of Ref. [6] to our more complex scenario, involving now 5-party non-local correlations and a function of three of the measurement outcomes.

Appendix D: Proof of Theorem 2
Before entering the details of the proof of Theorem 2, let us introduce a convenient notation. In what follows, we sometimes treatconditional probability distributions as vectors. To avoid ambiguities, we explicitly label the vectors describing probability distributionswith the arguments of the distributions in upper case. Thus, for example, we denote by P ( A | X ) the (2 × ) -dimensional vector withcomponents P ( a | x ) for all a , x ∈ { , } . We also denote by I the vector with components I ( a , x ) given in (A2). With this notation,inequality (A1) can be written as the scalar product I · P ( A | X ) = (cid:88) a , x I ( a , x ) P ( a | x ) ≥ . Any probability distribution P ( a | x ) satisfies C · P ( A | X ) = 1 , where C is the vector with components C ( a , x ) = 2 − . We also usethis scalar-product notation for full blocks, as in I ⊗ N d · P ( B | Y ) = (cid:88) a ,... a Nd (cid:88) x ,... x Nd (cid:34) N d (cid:89) i =1 I ( a i , x i ) (cid:35) P ( a , . . . a N d | x , . . . x N d ) . Following our upper/lower-case convention, the vector P ( B | Y, e, z ) has components P ( b | y, e, z ) for all b, y but fixed e, z .The proof of Theorem 2 relies on two crucial lemmas, which are stated and proven in Sections D 1 and D 2, respectively. The firstlemma bounds the distinguishability between the distribution distilled from a block of N d quintuplets and the ideal free random bit asfunction of the Bell violation (A1) in each quintuplet. In particular, it guarantees that, if the correlations of all quintuplets in a givenblock violate inequality (A1) sufficiently much, the bit distilled from the block will be indistinguishable from an ideal free randombit. The second lemma is required to guarantee that, if the statistics observed in all blocks but the distilling one are consistent with amaximal violation of inequality (A1), the violation of the distilling block will be arbitrarily large. Proof of Theorem 2.
We begin with the identity
\[
P(\text{guess}) = P(g=0)\,P(\text{guess}|g=0) + P(g=1)\,P(\text{guess}|g=1). \tag{D1}
\]
As discussed, when the protocol is aborted ($g = 0$) the distribution generated by the protocol and the ideal one are indistinguishable. In other words,
\[
P(\text{guess}|g=0) = \frac{1}{2}. \tag{D2}
\]
If $P(g=0) = 1$ then the protocol is secure, though in a trivial fashion. Next we address the non-trivial case where $P(g=1) > 0$. From formula (C6), we have
\begin{align}
P(\text{guess}|g=1) &= \frac{1}{2} + \frac{1}{4}\sum_{k,\tilde y,t}\max_z\sum_e\left|P(k,\tilde y,t,e|z,g=1) - \frac{1}{2}P(\tilde y,t,e|z,g=1)\right| \nonumber\\
&= \frac{1}{2} + \frac{1}{4}\sum_{\tilde y,t}P(\tilde y,t|g=1)\sum_k\max_z\sum_e\left|P(k,e|z,\tilde y,t,g=1) - \frac{1}{2}P(e|z,\tilde y,t,g=1)\right| \nonumber\\
&\le \frac{1}{2} + \frac{1}{4}\sum_{\tilde y,t}P(\tilde y,t|g=1)\,6\sqrt{N_d}\,(\alpha C+\beta I)^{\otimes N_d}\cdot P(\tilde B|\tilde Y,t,g=1) \nonumber\\
&= \frac{1}{2} + \frac{3\sqrt{N_d}}{2}\,(\alpha C+\beta I)^{\otimes N_d}\cdot\sum_{\tilde y,t}P(\tilde y,t|g=1)\,P(\tilde B|\tilde Y,t,g=1) \nonumber\\
&= \frac{1}{2} + \frac{3\sqrt{N_d}}{2}\,(\alpha C+\beta I)^{\otimes N_d}\cdot\sum_{t}P(t|g=1)\,P(\tilde B|\tilde Y,t,g=1) \nonumber\\
&= \frac{1}{2} + \frac{3\sqrt{N_d}}{2}\,(\alpha C+\beta I)^{\otimes N_d}\cdot\sum_{t}P(\tilde B,t|\tilde Y,g=1) \nonumber\\
&= \frac{1}{2} + \frac{3\sqrt{N_d}}{2}\,(\alpha C+\beta I)^{\otimes N_d}\cdot P(\tilde B|\tilde Y,g=1), \tag{D3}
\end{align}
where the inequality is due to Lemma 1 in Section D 1, we have used the no-signalling condition through $P(\tilde y,t|z,g=1) = P(\tilde y,t|g=1)$ in the second equality, and Bayes rule in the second and sixth equalities. From (D3) and Lemma 2 in Section D 2, we obtain
\[
P(\text{guess}|g=1) \le \frac{1}{2} + \frac{3\sqrt{N_d}}{2}\left[\alpha^{N_d} + \frac{2\,N_b^{\log_2(1-\epsilon)}}{P(g=1)}\left(32\beta\epsilon^{-5}\right)^{N_d}\right]. \tag{D4}
\]
Finally, substituting bound (D4) and equality (D2) into (D1), we obtain
\[
P(\text{guess}) \le \frac{1}{2} + \frac{3\sqrt{N_d}}{2}\left[P(g=1)\,\alpha^{N_d} + 2\,N_b^{\log_2(1-\epsilon)}\left(32\beta\epsilon^{-5}\right)^{N_d}\right], \tag{D5}
\]
which, together with $P(g=1) \le 1$, implies (C7).
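To get a feeling for how the bound behaves, the following minimal sketch evaluates the right-hand side of (C7) numerically. It assumes the bound has the form $\frac12 + \frac{3\sqrt{N_d}}{2}[\alpha^{N_d} + 2N_b^{\log_2(1-\epsilon)}(32\beta\epsilon^{-5})^{N_d}]$ and that $N_b$ is chosen as $(32\beta\epsilon^{-5})^{2N_d/|\log_2(1-\epsilon)|}$, in which case the second term collapses to $2(32\beta\epsilon^{-5})^{-N_d}$. The function name `guess_bound` and the numerical values of $\alpha$, $\beta$, $\epsilon$ used in the usage lines are purely illustrative and are not taken from the paper.

```python
import math

def guess_bound(alpha, beta, eps, n_d):
    """Right-hand side of (C7), assuming N_b = (32*beta*eps**-5)**(2*n_d/|log2(1-eps)|).

    Hypothetical helper for illustration; requires 0 < alpha < 1 < beta, 0 < eps < 1.
    """
    # log of c = 32 * beta * eps^-5, computed in log-space to avoid overflow
    log_c = math.log(32.0 * beta) - 5.0 * math.log(eps)
    # With the assumed N_b, N_b**log2(1-eps) * c**n_d equals c**(-n_d)
    alpha_term = math.exp(n_d * math.log(alpha))
    bell_term = math.exp(math.log(2.0) - n_d * log_c)
    return 0.5 + 1.5 * math.sqrt(n_d) * (alpha_term + bell_term)

# Illustrative values only: the bound is vacuous for small N_d
# and approaches 1/2 as N_d grows.
for n_d in (1, 100, 500, 2000):
    print(n_d, guess_bound(0.9, 1.2, 0.1, n_d))
```

The sketch makes visible that the $\alpha^{N_d}$ term dominates the convergence once $N_b$ is chosen as above, since the polynomial prefactor $\sqrt{N_d}$ is eventually beaten by the exponential decay.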
1. Statement and proof of Lemma 1
As mentioned, Lemma 1 provides a bound on the distinguishability between the probability distribution obtained after distilling a block of $N_d$ quintuplets and an ideal free random bit in terms of the Bell violation (A1) in each quintuplet. The proof of Lemma 1, in turn, requires two more lemmas, Lemma 3 and Lemma 4, stated and proven in Section D 3.

Lemma 1.
For each integer $N_d \ge 130$ there exists a function $f : \{0,1\}^{N_d} \to \{0,1\}$ such that, for any given $(5N_d+1)$-partite non-signaling distribution $P(\mathbf{a}_1,\ldots\mathbf{a}_{N_d}, e|\mathbf{x}_1,\ldots\mathbf{x}_{N_d}, z) = P(b,e|y,z)$, the random variable $k = f(\mathrm{maj}(\mathbf{a}_1),\ldots\mathrm{maj}(\mathbf{a}_{N_d}))$ satisfies
\[
\sum_k\max_z\sum_e\left|P(k,e|y,z) - \frac{1}{2}P(e|y,z)\right| \le 6\sqrt{N_d}\,(\alpha C+\beta I)^{\otimes N_d}\cdot P(B|Y) \tag{D6}
\]
for all inputs $y = (\mathbf{x}_1,\ldots\mathbf{x}_{N_d}) \in X^{N_d}$, where $\alpha$ and $\beta$ are real numbers such that $0 < \alpha < 1 < \beta$.

Proof of Lemma 1.
For any $\mathbf{x}_0 \in X$ let $M^{\mathbf{x}_0}_w$ be the vector with components $M^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x}) = \delta^{w}_{\mathrm{maj}(\mathbf{a})}\,\delta^{\mathbf{x}_0}_{\mathbf{x}}$. The probability of getting $\mathrm{maj}(\mathbf{a}) = w$ when using $\mathbf{x}_0$ as input can be written as $P(w|\mathbf{x}_0) = M^{\mathbf{x}_0}_w\cdot P(A|X)$. Note that this probability can also be written as $P(w|\mathbf{x}_0) = \Gamma^{\mathbf{x}_0}_w\cdot P(A|X)$, where $\Gamma^{\mathbf{x}_0}_w = M^{\mathbf{x}_0}_w + \Lambda^{\mathbf{x}_0}_w$ and $\Lambda^{\mathbf{x}_0}_w$ is any vector orthogonal to the no-signaling subspace, that is, such that $\Lambda^{\mathbf{x}_0}_w\cdot P(A|X) = 0$ for every no-signaling distribution $P(A|X)$. We can then write the left-hand side of (D6) as
\begin{align}
\sum_k\max_z\sum_e\left|P(k,e|y,z) - \frac{1}{2}P(e|y,z)\right|
&= \sum_k\max_z\sum_e P(e|y,z)\left|\sum_{w}\left(\delta^{k}_{f(w)} - \frac{1}{2}\right)P(w|y,e,z)\right| \nonumber\\
&= \sum_k\max_z\sum_e P(e|z)\left|\sum_{w}\left(\delta^{k}_{f(w)} - \frac{1}{2}\right)\left(\bigotimes_{i=1}^{N_d}\Gamma^{\mathbf{x}_i}_{w_i}\right)\cdot P(B|Y,e,z)\right|, \tag{D7}
\end{align}
where in the last equality we have used no-signaling through $P(e|y,z) = P(e|z)$ and the fact that the probability of obtaining the string of majorities $w$ when inputting $y = (\mathbf{x}_1,\ldots\mathbf{x}_{N_d}) \in X^{N_d}$ can be written as
\[
P(w|y) = \left(\bigotimes_{i=1}^{N_d}\Gamma^{\mathbf{x}_i}_{w_i}\right)\cdot P(B|Y). \tag{D8}
\]
In what follows, the absolute value of vectors is understood to be component-wise.
Expression (D7) can be bounded as
\begin{align}
\sum_k\max_z\sum_e\left|P(k,e|y,z) - \frac{1}{2}P(e|y,z)\right|
&\le \sum_k\max_z\sum_e P(e|z)\left|\sum_{w}\left(\delta^{k}_{f(w)} - \frac{1}{2}\right)\bigotimes_{i=1}^{N_d}\Gamma^{\mathbf{x}_i}_{w_i}\right|\cdot P(B|Y,e,z) \nonumber\\
&= \sum_k\max_z\left|\sum_{w}\left(\delta^{k}_{f(w)} - \frac{1}{2}\right)\bigotimes_{i=1}^{N_d}\Gamma^{\mathbf{x}_i}_{w_i}\right|\cdot\left(\sum_e P(e|z)\,P(B|Y,e,z)\right) \nonumber\\
&= \sum_k\left|\sum_{w}\left(\delta^{k}_{f(w)} - \frac{1}{2}\right)\bigotimes_{i=1}^{N_d}\Gamma^{\mathbf{x}_i}_{w_i}\right|\cdot P(B|Y), \tag{D9}
\end{align}
where the inequality follows from the fact that all the components of the vector $P(B|Y,e,z)$ are non-negative, and no-signalling has been used again through $P(B|Y,z) = P(B|Y)$ in the last equality. The bound applies to any function $f$ and holds for any choice of vectors $\Lambda^{\mathbf{x}_i}_w$ in $\Gamma^{\mathbf{x}_i}_w$. In what follows, we compute this bound for a specific choice of these vectors and function $f$.

Take $\Lambda^{\mathbf{x}_i}_w$ to be equal to the vectors $\Lambda^{\mathbf{x}_0}_w$ in Lemma 3. These vectors then satisfy the bounds (D20) and (D21) in the same Lemma. Take $f$ to be equal to the function whose existence is proven in Lemma 4. Note that the conditions needed for this Lemma to apply are satisfied because of bound (D21) in Lemma 3, and because the free parameter $N_d \ge 130$ satisfies $\left(3\sqrt{N_d}\right)^{-1/N_d} \ge \gamma = 0.9732$.

With this choice of $f$ and $\Lambda^{\mathbf{x}_i}_w$, bound (D9) becomes
\[
\sum_k\max_z\sum_e\left|P(k,e|y,z) - \frac{1}{2}P(e|y,z)\right| \le \sum_k 3\sqrt{N_d}\left(\bigotimes_{i=1}^{N_d}\Omega^{\mathbf{x}_i}\right)\cdot P(B|Y) \le 6\sqrt{N_d}\,(\alpha C+\beta I)^{\otimes N_d}\cdot P(B|Y), \tag{D10}
\]
where we have used $\Omega^{\mathbf{x}_i} = \sqrt{(\Gamma^{\mathbf{x}_i}_0)^2 + (\Gamma^{\mathbf{x}_i}_1)^2}$, $\sum_k 1 = 2$, bound (D20) in Lemma 3 and bound (D29) in Lemma 4.
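The condition linking the block size $N_d$ to the constant $\gamma$ can be explored numerically. The sketch below assumes the condition reads $(3\sqrt{N_d})^{-1/N_d} \ge \gamma$, as in the premise (D28) of Lemma 4, and searches for the smallest block size that satisfies it. The helper names `lemma4_threshold` and `smallest_block_size` are hypothetical. Since the left-hand side increases monotonically with $N_d$, a linear scan suffices.

```python
import math

def lemma4_threshold(n_d):
    """Return (3*sqrt(n_d))**(-1/n_d), the factor assumed in condition (D28)."""
    return math.exp(-math.log(3.0 * math.sqrt(n_d)) / n_d)

def smallest_block_size(gamma, n_max=10**6):
    """Smallest n_d with lemma4_threshold(n_d) >= gamma, or None if none is <= n_max."""
    for n_d in range(1, n_max + 1):
        if lemma4_threshold(n_d) >= gamma:
            return n_d
    return None

# Toy value of gamma, chosen only to keep the example small.
print(smallest_block_size(0.5))
```

Because a value of $\gamma$ obtained numerically is typically quoted to a few digits, the integer returned by such a scan is sensitive to rounding and may shift by one near the threshold.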
2. Statement and proof of Lemma 2
In this section we prove Lemma 2. This Lemma bounds the Bell violation in the distillation block in terms of the probability of not aborting the protocol in step 4 and the number and size of the blocks, $N_b$ and $N_d$.

Lemma 2.
Let $P(b_1,\ldots b_{N_b}|y_1,\ldots y_{N_b})$ be a $(5N_dN_b)$-partite no-signaling distribution, $y_1,\ldots y_{N_b}$ and $l$ the variables generated in steps 2 and 3 of the protocol, respectively, and $\alpha$ and $\beta$ real numbers such that $0 < \alpha < 1 < \beta$; then
\[
(\alpha C+\beta I)^{\otimes N_d}\cdot P(\tilde B|\tilde Y,g=1) \le \alpha^{N_d} + \frac{2\,N_b^{\log_2(1-\epsilon)}}{P(g=1)}\left(32\beta\epsilon^{-5}\right)^{N_d}. \tag{D11}
\]

Proof of Lemma 2.
According to definition (C3) we have $I(\mathbf{a}_i,\mathbf{x}_i) \le \delta^{0}_{r[b,y]}$ for all values of $b = (\mathbf{a}_1,\ldots\mathbf{a}_{N_d})$ and $y = (\mathbf{x}_1,\ldots\mathbf{x}_{N_d})$. This also implies $I(\mathbf{a}_i,\mathbf{x}_i)\,I(\mathbf{a}_j,\mathbf{x}_j) \le \delta^{0}_{r[b,y]}$ and so on. Due to the property $0 < \alpha < 1 < \beta$, one has that $(\alpha 2^{-4})^{N_d-i}\beta^{i} \le \beta^{N_d}$ for any $i = 1,\ldots N_d$. All this in turn implies
\begin{align}
\prod_{i=1}^{N_d}\left[\alpha 2^{-4} + \beta I_i\right]
&= \left(\alpha 2^{-4}\right)^{N_d} + \left(\alpha 2^{-4}\right)^{N_d-1}\beta\sum_i I_i + \left(\alpha 2^{-4}\right)^{N_d-2}\beta^2\sum_{i<j}I_iI_j + \cdots \nonumber\\
&\le \left(\alpha 2^{-4}\right)^{N_d} + \beta^{N_d}\left[\sum_i I_i + \sum_{i<j}I_iI_j + \cdots\right] \nonumber\\
&\le \left(\alpha 2^{-4}\right)^{N_d} + \beta^{N_d}\left[\sum_i \delta^{0}_{r[b,y]} + \sum_{i<j}\delta^{0}_{r[b,y]} + \cdots\right] \nonumber\\
&\le \left(\alpha 2^{-4}\right)^{N_d} + \beta^{N_d}\left(2^{N_d}-1\right)\delta^{0}_{r[b,y]} \nonumber\\
&\le \left(\alpha 2^{-4}\right)^{N_d} + (2\beta)^{N_d}\,\delta^{0}_{r[b,y]}, \tag{D12}
\end{align}
where $I_i = I(\mathbf{a}_i,\mathbf{x}_i)$. This implies that
\begin{align}
(\alpha C+\beta I)^{\otimes N_d}\cdot P(B|Y,g=1)
&= \sum_{\mathbf{a}_1,\ldots\mathbf{a}_{N_d}}\sum_{\mathbf{x}_1,\ldots\mathbf{x}_{N_d}}\prod_{i=1}^{N_d}\left[\alpha 2^{-4} + \beta I(\mathbf{a}_i,\mathbf{x}_i)\right]P(\mathbf{a}_1,\ldots\mathbf{a}_{N_d}|\mathbf{x}_1,\ldots\mathbf{x}_{N_d},g=1) \nonumber\\
&\le \sum_{b,y}\left[\left(\alpha 2^{-4}\right)^{N_d} + (2\beta)^{N_d}\,\delta^{0}_{r[b,y]}\right]P(b|y,g=1) \nonumber\\
&= \alpha^{N_d}\sum_y 2^{-4N_d} + (2\beta)^{N_d}\sum_y P(r=0|y,g=1) \nonumber\\
&= \alpha^{N_d} + (2\beta)^{N_d}\sum_y P(r=0|y,g=1) \nonumber\\
&= \alpha^{N_d} + (2\beta)^{N_d}\sum_y\frac{P(r=0,y|g=1)}{P(y|g=1)}. \tag{D13}
\end{align}
We can now bound $P(y|g=1)$ taking into account that $y$ denotes a $5N_d$-bit string generated by the $\epsilon$-source $S$ that remains after step 2 in the protocol. Note that only half of the 32 possible 5-bit inputs $\mathbf{x}$ generated by the source belong to $X$ and remain after step 2. Thus, $P((\mathbf{x}_1,\ldots,\mathbf{x}_{N_d}) \in X^{N_d}|g=1) \le 16^{N_d}(1-\epsilon)^{5N_d}$, where we used (C2). This, together with $P((\mathbf{x}_1,\ldots,\mathbf{x}_{N_d})|g=1) \ge \epsilon^{5N_d}$, implies that
\[
P(y|g=1) \ge \left(\frac{\epsilon^5}{16(1-\epsilon)^5}\right)^{N_d}. \tag{D14}
\]
Substituting this bound in (D13), and summing over $y$, gives
\[
(\alpha C+\beta I)^{\otimes N_d}\cdot P(B|Y,g=1) \le \alpha^{N_d} + (2\beta)^{N_d}\left(\frac{16(1-\epsilon)^5}{\epsilon^5}\right)^{N_d}P(r=0|g=1). \tag{D15}
\]
In what follows we use the notation
\[
P(1,0,1,1,\ldots) = P(r[b_1,y_1]=1,\ r[b_2,y_2]=0,\ r[b_3,y_3]=1,\ r[b_4,y_4]=1,\ldots),
\]
and attach to each value a subscript indicating the block it refers to. According to (C4), the protocol aborts ($g = 0$) if there is at least one "not right" block ($r[b_j,y_j] = 0$ for some $j \ne l$). While abortion also happens if there is more than one "not right" block, in what follows we lower-bound $P(g=0)$ by the probability that there is only one "not right" block:
\begin{align}
1 \ge P(g=0) &\ge \sum_{l=1}^{N_b}P(l)\sum_{l'=1,\,l'\ne l}^{N_b}P(1_1,\ldots 1_{l-1},1_{l+1},\ldots 1_{l'-1},0_{l'},1_{l'+1},\ldots 1_{N_b}) \nonumber\\
&\ge \sum_{l}P(l)\sum_{l'\ne l}P(1_1,\ldots 1_{l-1},1_{l},1_{l+1},\ldots 1_{l'-1},0_{l'},1_{l'+1},\ldots 1_{N_b}) \nonumber\\
&= \sum_{l'}\Big[\sum_{l\ne l'}P(l)\Big]P(1_1,\ldots 1_{l-1},1_{l},1_{l+1},\ldots 1_{l'-1},0_{l'},1_{l'+1},\ldots 1_{N_b}) \nonumber\\
&= \sum_{l'}\left[1-P(l')\right]P(1_1,\ldots 1_{l'-1},0_{l'},1_{l'+1},\ldots 1_{N_b}), \tag{D16}
\end{align}
where, when performing the sum over $l$, we have used that $P(1_1,\ldots 1_{l-1},1_{l},1_{l+1},\ldots 1_{l'-1},0_{l'},1_{l'+1},\ldots 1_{N_b}) \equiv P(1_1,\ldots 1_{l'-1},0_{l'},1_{l'+1},\ldots 1_{N_b})$ does not depend on $l$. Bound (C2) implies
\[
\frac{1-P(l)}{P(l)} \ge \frac{1-(1-\epsilon)^{\log_2 N_b}}{(1-\epsilon)^{\log_2 N_b}} = N_b^{\log_2\frac{1}{1-\epsilon}} - 1 \ge \frac{1}{2}N_b^{\log_2\frac{1}{1-\epsilon}}, \tag{D17}
\]
where the last inequality holds for sufficiently large $N_b$. Using this and (D16), we obtain
\[
1 \ge \sum_{l'}\frac{1}{2}N_b^{\log_2\frac{1}{1-\epsilon}}\,P(l')\,P(1_1,\ldots 1_{l'-1},0_{l'},1_{l'+1},\ldots 1_{N_b}) \ge \frac{1}{2}N_b^{\log_2\frac{1}{1-\epsilon}}\,P(\tilde r=0,g=1), \tag{D18}
\]
where $\tilde r = r[b_l,y_l]$.
This together with (D15) implies
\begin{align}
(\alpha C+\beta I)^{\otimes N_d}\cdot P(\tilde B|\tilde Y,g=1)
&\le \alpha^{N_d} + (2\beta)^{N_d}\left(\frac{16(1-\epsilon)^5}{\epsilon^5}\right)^{N_d}P(\tilde r=0|g=1) \nonumber\\
&\le \alpha^{N_d} + \frac{2}{P(g=1)}\left(\frac{32\beta(1-\epsilon)^5}{\epsilon^5}\right)^{N_d}N_b^{\log_2(1-\epsilon)}, \tag{D19}
\end{align}
where, in the second inequality, Bayes rule was again invoked. Inequality (D19), in turn, implies (D11), since $(1-\epsilon)^5 \le 1$.
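For readers who want the intermediate algebra spelled out, the following worked steps sketch how the power of $N_b$ arises and how Bayes rule enters the last inequality. They assume that $l$ consists of $\log_2 N_b$ bits from the $\epsilon$-source, so that $P(l) \le (1-\epsilon)^{\log_2 N_b}$, and that (D18) carries a factor $1/2$; both are read off from the surrounding derivation rather than stated independently.

```latex
\begin{align}
(1-\epsilon)^{\log_2 N_b}
  &= 2^{\log_2(1-\epsilon)\,\log_2 N_b}
   = N_b^{\log_2(1-\epsilon)}, \\
P(\tilde r = 0,\, g = 1)
  &\le 2\,N_b^{-\log_2\frac{1}{1-\epsilon}}
   = 2\,N_b^{\log_2(1-\epsilon)}, \\
P(\tilde r = 0 \,|\, g = 1)
  &= \frac{P(\tilde r = 0,\, g = 1)}{P(g = 1)}
   \le \frac{2\,N_b^{\log_2(1-\epsilon)}}{P(g = 1)} .
\end{align}
```

Since $\log_2(1-\epsilon) < 0$, the right-hand side decays polynomially in the number of blocks $N_b$, which is what allows the second term of (D11) to be suppressed by taking $N_b$ large.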
3. Statement and proof of the additional Lemmas

Lemma 3.
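For each $\mathbf{x}_0 \in X$ there are three vectors $\Lambda^{\mathbf{x}_0}_0, \Lambda^{\mathbf{x}_0}_1, \Lambda^{\mathbf{x}_0}_2$ orthogonal to the non-signaling subspace such that for all $w \in \{0,1\}$ and $\mathbf{a},\mathbf{x} \in \{0,1\}^5$ they satisfy
\[
\sqrt{\left[M^{\mathbf{x}_0}_0(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_0(\mathbf{a},\mathbf{x})\right]^2 + \left[M^{\mathbf{x}_0}_1(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_1(\mathbf{a},\mathbf{x})\right]^2} \le \alpha C(\mathbf{a},\mathbf{x}) + \beta I(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_2(\mathbf{a},\mathbf{x}) \tag{D20}
\]
and
\[
\left|M^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x})\right| \le \gamma\sqrt{\left[M^{\mathbf{x}_0}_0(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_0(\mathbf{a},\mathbf{x})\right]^2 + \left[M^{\mathbf{x}_0}_1(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_1(\mathbf{a},\mathbf{x})\right]^2} \tag{D21}
\]
where $\alpha = 0.8842$, $\beta = 1.260$ and $\gamma = 0.9732$.

Proof of Lemma 3.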
The proof of this lemma is numeric but rigorous. It is based on two linear-programming minimization problems, which are carried out for each value of $\mathbf{x}_0 \in X$. We have repeated this process for different values of $\gamma$, finding that $\gamma = 0.9732$ is roughly the smallest value for which the linear programs described below are feasible.

The fact that the vectors $\Lambda^{\mathbf{x}_0}_0, \Lambda^{\mathbf{x}_0}_1, \Lambda^{\mathbf{x}_0}_2$ are orthogonal to the non-signaling subspace can be written as linear equalities
\[
D\cdot\Lambda^{\mathbf{x}_0}_w = \mathbf{0} \tag{D22}
\]
for $w \in \{0,1,2\}$, where $\mathbf{0}$ is the zero vector and $D$ is a matrix whose rows constitute a basis of non-signaling probability distributions. A geometrical interpretation of constraint (D20) is that the point in the plane with coordinates $\left[M^{\mathbf{x}_0}_0(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_0(\mathbf{a},\mathbf{x}),\ M^{\mathbf{x}_0}_1(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_1(\mathbf{a},\mathbf{x})\right] \in \mathbb{R}^2$ is inside a circle of radius $\alpha C(\mathbf{a},\mathbf{x}) + \beta I(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_2(\mathbf{a},\mathbf{x})$ centered at the origin. All points inside an octagon inscribed in this circle also satisfy constraint (D20). The points of such an inscribed octagon are the ones satisfying the following set of linear constraints:
\[
\left[M^{\mathbf{x}_0}_0(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_0(\mathbf{a},\mathbf{x})\right]\eta\cos\theta + \left[M^{\mathbf{x}_0}_1(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_1(\mathbf{a},\mathbf{x})\right]\eta\sin\theta \le \alpha C(\mathbf{a},\mathbf{x}) + \beta I(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_2(\mathbf{a},\mathbf{x}), \tag{D23}
\]
for all $\theta \in \left\{\frac{\pi}{8}, \frac{3\pi}{8}, \frac{5\pi}{8}, \frac{7\pi}{8}, \frac{9\pi}{8}, \frac{11\pi}{8}, \frac{13\pi}{8}, \frac{15\pi}{8}\right\}$, where $\eta = \left(\cos\frac{\pi}{8}\right)^{-1} \approx 1.0824$. In other words, the eight conditions (D23) imply constraint (D20). From now on, we only consider these eight linear constraints (D23). With a bit of algebra, one can see that inequality (D21) is equivalent to the two almost linear inequalities
\[
\pm\left[M^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x})\right] \le \sqrt{\frac{\gamma^2}{1-\gamma^2}}\ \left|M^{\mathbf{x}_0}_{\bar w}(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_{\bar w}(\mathbf{a},\mathbf{x})\right|, \tag{D24}
\]
for all $w \in \{0,1\}$, where $\bar w = 1 - w$. Clearly, the problem is not linear because of the absolute values.
The computation described in what follows constitutes a trick to make a good guess for the signs of the terms in the absolute value of (D24), so that the problem can be made linear by adding extra constraints.

The first computational step consists of a linear-programming minimization of $\alpha$ subject to the constraints (D22), (D23), where the minimization is performed over the variables $\alpha, \beta, \Lambda^{\mathbf{x}_0}_0, \Lambda^{\mathbf{x}_0}_1, \Lambda^{\mathbf{x}_0}_2$. This step serves to guess the signs
\[
\sigma_w(\mathbf{a},\mathbf{x}) = \mathrm{sign}\left[M^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x})\right], \tag{D25}
\]
for all $w, \mathbf{a}, \mathbf{x}$, where the value of $\Lambda^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x})$ corresponds to the solution of the above minimization. Once we have identified all these signs, we can write the inequalities (D24) in a linear fashion:
\[
\sigma_w(\mathbf{a},\mathbf{x})\left[M^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x})\right] \ge 0, \tag{D26}
\]
\[
\sigma_w(\mathbf{a},\mathbf{x})\left[M^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_w(\mathbf{a},\mathbf{x})\right] \le \sqrt{\frac{\gamma^2}{1-\gamma^2}}\ \sigma_{\bar w}(\mathbf{a},\mathbf{x})\left[M^{\mathbf{x}_0}_{\bar w}(\mathbf{a},\mathbf{x}) + \Lambda^{\mathbf{x}_0}_{\bar w}(\mathbf{a},\mathbf{x})\right], \tag{D27}
\]
for all $w \in \{0,1\}$.

The second computational step consists of a linear-programming minimization of $\alpha$ subject to the constraints (D22), (D23), (D26), (D27), over the variables $\alpha, \beta, \Lambda^{\mathbf{x}_0}_0, \Lambda^{\mathbf{x}_0}_1, \Lambda^{\mathbf{x}_0}_2$. Clearly, any solution to this problem is also a solution to the original formulation of the Lemma. The minimization was performed for every $\mathbf{x}_0 \in X$ and the values of $\alpha, \beta$ turned out to be independent of $\mathbf{x}_0 \in X$. These numerical values are the ones appearing in the formulation of the Lemma.

Note that Lemma 3 allows one to bound the predictability of $\mathrm{maj}(\mathbf{a})$ by a linear function of the 5-party Mermin violation. This can be seen by computing $\Gamma^{\mathbf{x}_0}_w\cdot P(A|X)$ and applying the bounds in the Lemma. In principle, one expects this bound to exist, as the predictability is smaller than one at the point of maximal violation, as proven in Theorem 1, and equal to one at the point of no violation. However, we were unable to find it analytically. This is why we had to resort to the linear optimization technique given above, which moreover provides the bounds (D20) and (D21) necessary for the security proof.
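The linearization of the circular constraint can be checked with a small numerical experiment. The sketch below assumes the eight directions are $\theta = (2k+1)\pi/8$ for $k = 0,\ldots,7$ and $\eta = 1/\cos(\pi/8)$, which is the standard description of a regular octagon inscribed in a circle. It samples points, keeps those passing the eight linear tests, and verifies that each of them lies inside the disc. The function names and the tolerance values are illustrative choices, not part of the original computation.

```python
import math
import random

ETA = 1.0 / math.cos(math.pi / 8)                        # scaling factor eta
THETAS = [(2 * k + 1) * math.pi / 8 for k in range(8)]   # pi/8, 3pi/8, ..., 15pi/8

def in_octagon(u, v, radius):
    """True if (u, v) satisfies all eight linear constraints of the assumed form."""
    return all(
        ETA * (u * math.cos(t) + v * math.sin(t)) <= radius + 1e-12
        for t in THETAS
    )

def octagon_implies_disc(samples=20000, radius=1.0, seed=0):
    """Sample points in a square; every accepted point must lie in the disc.

    Returns (all_inside, number_of_accepted_points).
    """
    rng = random.Random(seed)
    accepted = 0
    for _ in range(samples):
        u = rng.uniform(-radius, radius)
        v = rng.uniform(-radius, radius)
        if in_octagon(u, v, radius):
            accepted += 1
            if math.hypot(u, v) > radius + 1e-9:
                return False, accepted
    return True, accepted

print(octagon_implies_disc())
```

The converse does not hold: points of the disc close to the circle between two vertices fail the linear tests, which is the price paid for replacing the quadratic constraint by linear ones.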
Lemma 4.
Let $N_d$ be a positive integer and let $\Gamma^{i}_{w}(\mathbf{a},\mathbf{x})$ be a given set of real coefficients such that for all $i \in \{1,\ldots N_d\}$, $w \in \{0,1\}$ and $\mathbf{a},\mathbf{x} \in \{0,1\}^5$ they satisfy
\[
\left|\Gamma^{i}_{w}(\mathbf{a},\mathbf{x})\right| \le \left(3\sqrt{N_d}\right)^{-1/N_d}\,\Omega_i(\mathbf{a},\mathbf{x}), \tag{D28}
\]
where $\Omega_i(\mathbf{a},\mathbf{x}) = \sqrt{\Gamma^{i}_{0}(\mathbf{a},\mathbf{x})^2 + \Gamma^{i}_{1}(\mathbf{a},\mathbf{x})^2}$. There exists a function $f : \{0,1\}^{N_d} \to \{0,1\}$ such that for each sequence $(\mathbf{a}_1,\mathbf{x}_1),\ldots(\mathbf{a}_{N_d},\mathbf{x}_{N_d})$ we have
\[
\left|\sum_{w}\left(\delta^{k}_{f(w)} - \frac{1}{2}\right)\prod_{i=1}^{N_d}\Gamma^{i}_{w_i}(\mathbf{a}_i,\mathbf{x}_i)\right| \le 3\sqrt{N_d}\prod_{i=1}^{N_d}\Omega_i(\mathbf{a}_i,\mathbf{x}_i), \tag{D29}
\]
where the sum runs over all $w = (w_1,\ldots w_{N_d}) \in \{0,1\}^{N_d}$.

Proof of Lemma 4.
First, note that for a sequence $(\mathbf{a}_1,\mathbf{x}_1),\ldots(\mathbf{a}_{N_d},\mathbf{x}_{N_d})$ for which there is at least one value of $i \in \{1,\ldots N_d\}$ satisfying $\Gamma^{i}_{0}(\mathbf{a}_i,\mathbf{x}_i) = \Gamma^{i}_{1}(\mathbf{a}_i,\mathbf{x}_i) = 0$, both the left-hand side and the right-hand side of (D29) are equal to zero; hence, inequality (D29) is satisfied independently of the function $f$. Therefore, in what follows, we only consider sequences $(\mathbf{a}_1,\mathbf{x}_1),\ldots(\mathbf{a}_{N_d},\mathbf{x}_{N_d})$ for which either $\Gamma^{i}_{0}(\mathbf{a}_i,\mathbf{x}_i) \ne 0$ or $\Gamma^{i}_{1}(\mathbf{a}_i,\mathbf{x}_i) \ne 0$, for all $i = 1,\ldots N_d$. Or, equivalently, we consider sequences such that
\[
\prod_{i=1}^{N_d}\Omega_i(\mathbf{a}_i,\mathbf{x}_i) > 0. \tag{D30}
\]
The existence of the function $f$ satisfying (D29) for all such sequences is shown with a probabilistic argument. We consider the situation where $f$ is picked from the set of all functions mapping $\{0,1\}^{N_d}$ to $\{0,1\}$ with uniform probability, and upper-bound the probability that the chosen function does not satisfy the constraint (D29) for all $k$ and all sequences $(\mathbf{a}_1,\mathbf{x}_1),\ldots(\mathbf{a}_{N_d},\mathbf{x}_{N_d})$ satisfying (D30). This upper bound is shown to be smaller than one.
Therefore there must exist at least one function satisfying (D29).

For each $w \in \{0,1\}^{N_d}$ consider the random variable $F_w = \left(\delta^{0}_{f(w)} - \frac{1}{2}\right) \in \left\{\frac{1}{2}, -\frac{1}{2}\right\}$, where $f$ is picked from the set of all functions mapping $\{0,1\}^{N_d} \to \{0,1\}$ with uniform distribution. This is equivalent to saying that the $2^{N_d}$ random variables $\{F_w\}_w$ are independent and identically distributed according to $\Pr\{F_w = \pm\frac{1}{2}\} = \frac{1}{2}$. For ease of notation, let us fix a sequence $(\mathbf{a}_1,\mathbf{x}_1),\ldots(\mathbf{a}_{N_d},\mathbf{x}_{N_d})$ satisfying (D30) and use the short-hand notation $\Gamma^{i}_{w_i} = \Gamma^{i}_{w_i}(\mathbf{a}_i,\mathbf{x}_i)$. We proceed using the same ideas as in the derivation of the exponential Chebyshev inequality. For any $\mu, \nu \ge 0$, we have
\begin{align}
\Pr\left\{\sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i} \ge \mu\right\}
&= \Pr\left\{\nu\left(-\mu + \sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i}\right) \ge 0\right\} \nonumber\\
&= \Pr\left\{\exp\left(-\nu\mu + \nu\sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i}\right) \ge 1\right\} \nonumber\\
&\le E\left[\exp\left(-\nu\mu + \nu\sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i}\right)\right] \tag{D31}\\
&= E\left[e^{-\nu\mu}\prod_w\exp\left(\nu F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i}\right)\right] \nonumber\\
&= e^{-\nu\mu}\prod_w E\left[\exp\left(\nu F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i}\right)\right] \tag{D32}\\
&\le e^{-\nu\mu}\prod_w E\left[1 + \nu F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i} + \left(\nu F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i}\right)^2\right]. \tag{D33}
\end{align}
Here $E$ stands for the average over all $F_w$. In (D31) we have used that any positive random variable $X$ satisfies $\Pr\{X \ge 1\} \le E[X]$. In (D32) we have used that the $\{F_w\}_w$ are independent. Finally, in (D33) we have used that $e^{\eta} \le 1 + \eta + \eta^2$, which is only valid if $\eta \le 1$. Since $|F_w| = \frac{1}{2}$, we must therefore show that
\[
\left|\nu\prod_{i=1}^{N_d}\Gamma^{i}_{w_i}\right| \le 2, \tag{D34}
\]
which is done below, when setting the value of $\nu$.
In what follows we use the chain of inequalities (D33), the fact that $E[F_w] = 0$ and $E[F_w^2] = 1/4$, the bound $1 + \eta \le e^{\eta}$ for $\eta \ge 0$, and the definition $\Omega_i^2 = (\Gamma^{i}_{0})^2 + (\Gamma^{i}_{1})^2$:
\begin{align}
\Pr\left\{\sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i} \ge \mu\right\}
&\le e^{-\nu\mu}\prod_w\left(1 + E[F_w]\,\nu\prod_{i=1}^{N_d}\Gamma^{i}_{w_i} + E[F_w^2]\,\nu^2\prod_{i=1}^{N_d}\left(\Gamma^{i}_{w_i}\right)^2\right) \nonumber\\
&= e^{-\nu\mu}\prod_w\left(1 + \frac{\nu^2}{4}\prod_{i=1}^{N_d}\left(\Gamma^{i}_{w_i}\right)^2\right) \nonumber\\
&\le e^{-\nu\mu}\prod_w\exp\left(\frac{\nu^2}{4}\prod_{i=1}^{N_d}\left(\Gamma^{i}_{w_i}\right)^2\right) \nonumber\\
&= \exp\left(-\nu\mu + \sum_w\frac{\nu^2}{4}\prod_{i=1}^{N_d}\left(\Gamma^{i}_{w_i}\right)^2\right) \nonumber\\
&= \exp\left(-\nu\mu + \frac{\nu^2}{4}\prod_{i=1}^{N_d}\Omega_i^2\right). \tag{D35}
\end{align}
In order to optimize this upper bound, we minimize the exponent over $\nu$. This is done by differentiating with respect to $\nu$ and equating to zero, which gives
\[
\nu = 2\mu\prod_{i=1}^{N_d}\Omega_i^{-2}. \tag{D36}
\]
Note that constraint (D30) implies that the inverse of $\Omega_i$ exists. Since we assume $\mu \ge 0$, the initial assumption $\nu \ge 0$ is satisfied by the solution (D36). By substituting (D36) in (D35) and rescaling the free parameter $\mu$ as
\[
\tilde\mu = \frac{\mu}{\prod_{i=1}^{N_d}\Omega_i}, \tag{D37}
\]
we obtain
\[
\Pr\left\{\sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i} \ge \tilde\mu\prod_{i=1}^{N_d}\Omega_i\right\} \le e^{-\tilde\mu^2}, \tag{D38}
\]
for any $\tilde\mu \ge 0$ consistent with condition (D34). We now choose $\tilde\mu = 3\sqrt{N_d}$, see Eq. (D29), getting
\[
\Pr\left\{\sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i} \ge 3\sqrt{N_d}\prod_{i=1}^{N_d}\Omega_i\right\} \le e^{-9N_d}. \tag{D39}
\]
With this assignment, and using (D36) and (D37), condition (D34), yet to be fulfilled, becomes
\[
6\sqrt{N_d}\prod_{i=1}^{N_d}\frac{|\Gamma^{i}_{w_i}|}{\Omega_i} \le 2, \tag{D40}
\]
which now holds because of the initial premise (D28).

Bound (D39) applies to each of the sequences $(\mathbf{a}_1,\mathbf{x}_1),\ldots(\mathbf{a}_{N_d},\mathbf{x}_{N_d})$ satisfying (D30), and there are at most $2^{10N_d}$ of them.
Hence, the probability that the random function $f$ satisfies
\[
\sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i} \ge 3\sqrt{N_d}\prod_{i=1}^{N_d}\Omega_i \tag{D41}
\]
for at least one of such sequences, and thereby fails the desired bound, is at most $2^{10N_d}e^{-9N_d}$, which is smaller than $1/2$ for any value of $N_d$. A similar argument proves that the probability that the random function $f$ satisfies
\[
\sum_w F_w\prod_{i=1}^{N_d}\Gamma^{i}_{w_i} \le -3\sqrt{N_d}\prod_{i=1}^{N_d}\Omega_i \tag{D42}
\]
for at least one sequence satisfying (D30) is also smaller than $1/2$. The lemma now easily follows from these two results.

Appendix E: Final remarks
The main goal of our work was to prove full randomness amplification. In these appendices, we have shown how our protocol, based on quantum non-local correlations, achieves this task. Unfortunately, we are not able to provide an explicit description of the function $f : \{0,1\}^{N_d} \to \{0,1\}$ which maps the outcomes of the black boxes to the final random bit $k$; we merely show its existence. Such a function may be obtained through an algorithm that searches over the set of all functions until it finds one satisfying (D29). The problem with this method is that the set of all functions has size $2^{2^{N_d}}$, which makes the search computationally costly. However, this problem can be fixed by noticing that the random choice of $f$ in the proof of Lemma 4 can be restricted to a four-universal family of functions, with size polynomial in $N_d$. This observation will be developed in future work.

A more direct approach could consist of studying how the randomness in the measurement outcomes for correlations maximally violating the Mermin inequality increases with the number of parties. We solved linear optimization problems similar to those used in Theorem 1, which showed that for 7 parties Eve's predictability is $2/3$ for a function of 5 bits defined by $f(00000) = 0$, $f(01111) = 0$, $f(00111) = 0$ and $f(x) = 1$ otherwise. Note that this value is lower than the earlier $3/4$ and also that the function is different from the majority vote. We were however unable to generalize these results to an arbitrary number of parties, which forced us to adopt a less direct approach. Note in fact that our protocol can be interpreted as a huge multipartite Bell test from which a random bit is extracted by classical processing of some of the measurement outcomes.

We conclude by stressing again that the reason why randomness amplification becomes possible using non-locality is that the randomness certification is achieved by a Bell inequality violation.
There already exist several protocols, both in classical and quantum information theory, in which imperfect randomness is processed to generate perfect (or arbitrarily close to perfect) randomness. However, all these protocols, e.g. two-universal hashing or randomness extractors, always require additional good-quality randomness to perform such distillation. On the contrary, if the initial imperfect randomness has been certified by a Bell inequality violation, the distillation procedure can be done with a deterministic hash function (see [6] or Lemma 1 above). This property makes Bell-certified randomness fundamentally different from any other form of randomness, and is the key for the success of our protocol.

[1] M. Santha and U. V. Vazirani, in Proc. 25th IEEE Symposium on Foundations of Computer Science (FOCS-84), 434 (IEEE Computer Society, 1984).
[2] R. Colbeck and R. Renner, Free randomness can be amplified, Nature Phys. 8, 450 (2012).
[3] N. D. Mermin, Extreme quantum entanglement in a superposition of macroscopically distinct states, Phys. Rev. Lett. 65, 1838 (1990).
[4] D. N. Klyshko, Phys. Lett. A 172, 399 (1993); A. V. Belinskii and D. N. Klyshko, Physics - Uspekhi 36, 653 (1993); N. Gisin and H. Bechmann-Pasquinucci, Phys. Lett. A 246, 1-6 (1998).
[5] R. Canetti, Proc. 42nd IEEE Symposium on Foundations of Computer Science (FOCS), 136 (2001).
[6] L. Masanes, Universally-composable privacy amplification from causality constraints, Phys. Rev. Lett. 102, 140501 (2009).
[7] J. Barrett, L. Hardy and A. Kent,