Entanglement verification with finite data
Robin Blume-Kohout, Jun O.S. Yin, S. J. van Enk
Perimeter Institute for Theoretical Physics, Waterloo ON N2L2Y5
Department of Physics and Oregon Center for Optics, University of Oregon, Eugene, OR 97403
Institute for Quantum Information, California Institute of Technology, Pasadena, CA 91125
(Dated: October 29, 2018)

Suppose an experimentalist wishes to verify that his apparatus produces entangled quantum states. A finite amount of data cannot conclusively demonstrate entanglement, so drawing conclusions from real-world data requires statistical reasoning. We propose a reliable method to quantify the weight of evidence for (or against) entanglement, based on a likelihood ratio test. Our method is universal in that it can be applied to any sort of measurements. We demonstrate the method by applying it to two simulated experiments on two qubits. The first measures a single entanglement witness, while the second performs a tomographically complete measurement.
Entanglement is an essential resource for quantum information processing, and producing and verifying entangled states is considered a benchmark for quantum experiments (for a sample from the most recent experiments on a wide variety of physical systems, see [1]). Several methods for verifying entanglement have been developed (for overviews, see [2, 3]). A bipartite state is entangled if it is not separable, and data $D$ demonstrate entanglement if there is no separable state that could have generated them. As the number of data $N \to \infty$, the data are unambiguous, but for finite $N$, only probabilistic conclusions can be drawn. In this Letter, we quantify exactly what can be concluded from finite or small data sets, using a simple and efficient likelihood ratio test.

We demonstrate the method using two simulated experiments on two-qubit systems [12]. The first measures just one observable, an entanglement witness [4]. The other performs a tomographically complete measurement. In both cases, we use likelihood ratios to draw direct conclusions about entanglement, rather than estimating the quantum state as an intermediate step. A related technique for testing violation of local realism, based on empirical relative entropy instead of the likelihood ratio, was proposed by van Dam et al. [5] and applied by Zhang et al. [6].

Likelihood Ratios:
Data $D$ could have been generated by any one of many i.i.d. states $\rho^{\otimes N}$. Each state $\rho$ represents a theory about the system, and the relative plausibility of different states is measured by their likelihood $\mathcal{L}(\rho)$. A state's likelihood is simply the probability of the observed data given that state,

$$\mathcal{L}(\rho) \equiv \Pr(D|\rho), \tag{1}$$

and states with higher likelihood are more plausible. If the most likely state is separable, the data clearly do not support entanglement. If it is entangled, then we need to ask how convincing the data are: specifically, whether some separable state is almost as plausible.

[Figure 1, panels (a)-(d). Labels: $\hat\rho_{\rm MLE}$; most likely separable state; separable states; entangled states; randomly distributed MLE estimates.]

FIG. 1: General schema of a likelihood ratio test. The separable states $\mathcal{S}$ (cyan) are a convex subset of all states, surrounded by entangled states (red). Data from an experiment on a state $\rho$ yield a quasiconcave likelihood function [(a)] with a unique maximum ($\hat\rho_{\rm MLE}$). $\hat\rho_{\rm MLE}$ is randomly distributed around $\rho$, at a typical length scale $\delta = O(1/\sqrt{N})$. If $\hat\rho_{\rm MLE}$ is separable then there is no evidence for entanglement, but if it is entangled (as shown), then the relative likelihoods of $\hat\rho_{\rm MLE}$ and the most likely separable state determine the weight of evidence. Data are "convincing" if they are very unlikely to have been produced by a borderline separable state. Typical likelihood ratios for such states depend on the shape of $\mathcal{S}$. In (b)-(d) we show three possible cases: in (b) $\mathcal{S}$ is smaller than $\delta$ and behaves like a point; in (c) it is of size $\delta$ and its behavior is hard to characterize; in (d) it is much bigger than $\delta$ and behaves like a half-space.

To judge whether there is (even just one) separable state that fits the data, we compare the likelihoods of (i) the most likely separable state, and (ii) the most likely of all states. Letting $\mathcal{S}$ be the set of separable states, we define

$$\Lambda \equiv \frac{\max_{\rho \in \mathcal{S}} \mathcal{L}(\rho)}{\max_{\text{all } \rho} \mathcal{L}(\rho)}. \tag{2}$$

$\Lambda$ is a likelihood ratio, and

$$\lambda = -2 \log \Lambda \tag{3}$$

represents the weight of evidence in favor of entanglement [13]. To demonstrate entanglement convincingly, an experiment must yield a sufficiently large value for $\lambda$.

A likelihood ratio does not assign a probability to "$\rho$ is entangled". Instead, it yields a confidence level. We can determine what values of $\lambda$ typically result from measurements on $\rho^{\otimes N}$, and how their distribution depends on whether $\rho$ is entangled or separable. If we measure $\lambda = \lambda_{\rm exp}$, and no separable state produces $\lambda \geq \lambda_{\rm exp}$ with probability higher than $\epsilon$, then we have demonstrated entanglement at the $1 - \epsilon$ confidence level.
If an experimentalist plans (before taking data) to calculate $\lambda$ and report "$\rho$ is entangled" only when the data imply $1 - \epsilon$ confidence, then the probability that he erroneously reports entanglement [14] is at most $\epsilon$.

So, $\rho$ may be (i) entangled, (ii) separable, or (iii) on the boundary. Boundary states are still separable, and they are the hardest separable states to rule out. To demonstrate entanglement at the $1 - \epsilon$ confidence level, we must show that there is no boundary state for which $\Pr(\lambda \geq \lambda_{\rm exp}) \geq \epsilon$. It is difficult to make rigorous probabilistic statements about $\lambda$ for small $N$. But as $N \to \infty$, the following analysis becomes exact, and is generally thought to be reliable for $N \gtrsim 30$ [7].
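As a minimal sketch of the confidence-level reasoning above, consider a hypothetical two-outcome measurement in which every separable state gives a "yes" probability $p \leq p_{\rm sep}$ (the value 1/2 used below is an assumption for illustration). The function names are invented for this sketch. For such a one-parameter model, $\lambda$ follows directly from Eqs. (2) and (3), and $\Pr(\lambda \geq \lambda_{\rm exp})$ for the boundary state can be summed exactly over the binomial distribution instead of relying on the large-$N$ analysis.

```python
import math


def xlogy(x, y):
    """Return x*log(y), using the convention 0*log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def log_likelihood(n_yes, n, p):
    """log L(p) for n_yes 'yes' outcomes in n i.i.d. two-outcome trials."""
    return xlogy(n_yes, p) + xlogy(n - n_yes, 1.0 - p)


def weight_of_evidence(n_yes, n, p_sep=0.5):
    """lambda = -2 log(Lambda), assuming separable states obey p <= p_sep."""
    f = n_yes / n
    if f <= p_sep:
        return 0.0  # the unconstrained maximum is itself separable
    log_l_sep = log_likelihood(n_yes, n, p_sep)  # most likely separable state
    log_l_max = log_likelihood(n_yes, n, f)      # most likely of all states
    return -2.0 * (log_l_sep - log_l_max)


def boundary_tail(lam_exp, n, p_sep=0.5):
    """Exact Pr(lambda >= lam_exp) when the true state has p = p_sep."""
    total = 0.0
    for k in range(n + 1):
        # Small tolerance so that lam_exp itself is counted despite rounding.
        if weight_of_evidence(k, n, p_sep) >= lam_exp - 1e-12:
            total += math.comb(n, k) * p_sep**k * (1.0 - p_sep)**(n - k)
    return total


# Hypothetical data: 16 "yes" outcomes in N = 20 trials.
lam_exp = weight_of_evidence(16, 20)
eps = boundary_tail(lam_exp, 20)
print(f"lambda_exp = {lam_exp:.3f}, Pr(lambda >= lambda_exp) = {eps:.4f}")
```

In this toy model the tail probability is largest for the boundary value $p = p_{\rm sep}$, so the printed probability plays the role of $\epsilon$; for measurements with more parameters the maximization over boundary states is not this simple.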
The distribution of $\lambda$: The set of quantum states $\rho$ is a convex subset of the vector space of trace-1 $d \times d$ Hermitian operators, $\mathbb{R}^{d^2-1}$. An entanglement-verification measurement is represented by a POVM (positive operator-valued measure) $\mathcal{M} = \{E_1 \ldots E_m\}$, in which each operator $E_k$ represents an event that occurs with probability $p_k = \mathrm{Tr}\, E_k \rho$ (Born's rule), and each $\rho$ defines a probability distribution $\vec{p} = \{p_1 \ldots p_m\}$. Data in which $E_k$ appeared $n_k$ times define empirical frequencies $\vec{f} = \{f_1 \ldots f_m\}$, where $f_k \equiv n_k/N$. Both $\vec{p}$ and $\vec{f}$ can be represented as elements of an $m$-simplex embedded in a vector space $\mathbb{R}^{m-1}$. The probabilities in $\vec{p}$ may be linearly dependent (e.g., if $E_j + E_k = \mathbb{1}$, then $p_j + p_k = 1$ for all $\rho$), and at most $d^2 - 1$ of them can be independent, because $\rho$ contains only $d^2 - 1$ independent parameters. We define $\dim(\mathcal{M})$ as the number of independent probabilities.

So Born's rule defines a linear mapping from the operator space containing quantum states into the probability space for measurement $\mathcal{M}$. If $\dim(\mathcal{M}) < d^2 - 1$, then the mapping from states to $\vec{p}$-vectors is many-to-one, and the experiment is completely insensitive to some parameters of $\rho$. Ignoring these irrelevant parameters makes $\rho$ an (effectively) $\dim(\mathcal{M})$-dimensional parameter. Separable states form a convex subset of all states (see Fig. 1). These sets' images in probability space are also nested convex sets (although if $\dim(\mathcal{M}) < d^2 - 1$, then some entangled states will be indistinguishable from separable ones in this experiment).

Suppose that $N$ copies of a state $\rho$ are measured, yielding a likelihood function $\mathcal{L}(\rho)$. $\mathcal{L}(\rho)$ has a unique global maximum $\hat\rho_{\rm MLE}$. As $N \to \infty$, the distribution of $\hat\rho_{\rm MLE}$ approaches a Gaussian around $\rho$ with covariance tensor $\Delta$. $\mathcal{L}(\rho)$ itself is a Gaussian function with the same covariance matrix $\Delta$ (see note [15]). This defines a characteristic length scale $\delta = |\Delta|$ that scales as $\delta = O(1/\sqrt{N})$. We can use $\Delta$ to define a stretched Euclidean metric

$$d(\rho, \rho') = \sqrt{\mathrm{Tr}\left[(\rho - \rho')\, \Delta^{-1} (\rho - \rho')\right]}. \tag{4}$$

Using this metric, $\hat\rho_{\rm MLE}$ is univariate Gaussian distributed around $\rho$, and (up to an additive constant)

$$\log \mathcal{L}(\rho) = -\tfrac{1}{2}\, d(\rho, \hat\rho_{\rm MLE})^2. \tag{5}$$

Thus, $\lambda$ is determined entirely by $d(\hat\rho_{\rm MLE}, \mathcal{S})$, the distance from $\hat\rho_{\rm MLE}$ to the separable set $\mathcal{S}$. If $\rho$ is demonstrably entangled, then $\lambda$ will grow in proportion to $N$, but if it is indistinguishable from a separable state, then $\lambda$ will converge almost certainly to zero (see Figure 2).

When $\rho$ is on the boundary, $\lambda$ neither grows with $N$ nor converges to zero, but continues to fluctuate as $N \to \infty$. Its distribution is controlled by the shape and radius of $\mathcal{S}$, e.g.:

1. If $\mathcal{S}$ is small with respect to $\delta$, it behaves like a point (see Figure 1b). Then $d(\hat\rho_{\rm MLE}, \mathcal{S}) \approx d(\hat\rho_{\rm MLE}, \rho)$, $\lambda = -2 \log\left(\mathcal{L}(\rho)/\mathcal{L}_{\max}\right) = d(\rho, \hat\rho_{\rm MLE})^2$, and so $\lambda$ is a $\chi^2$ random variable with $\dim(\mathcal{M})$ degrees of freedom (a.k.a. a $\chi^2_{\dim(\mathcal{M})}$ variable).

2. If $\mathcal{S}$ is much larger than $\delta$, then it behaves like a half-space (see Figure 1d and note [16]). If $\mathcal{S}$ were a $k$-dimensional hyperplane, $\lambda$ would be a $\chi^2_{\dim(\mathcal{M}) - k}$ variable. A half-space behaves like a hyperplane of dimension $(\dim(\mathcal{M}) - 1)$, except that half the time, $\hat\rho_{\rm MLE}$ is separable. Thus, $\lambda$ is what we will call a semi-$\chi^2_1$ variable: it equals zero with probability $\tfrac{1}{2}$, and is $\chi^2_1$-distributed otherwise.

As $N \to \infty$, case (2) applies. For small $N$, however, the real situation is somewhere in between (see Figure 1c). $\mathcal{S}$ may be small, and its boundary may be sharply curved, increasing $\lambda$.
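The two limiting distributions in cases (1) and (2) can be compared numerically. The following is a sketch using only the Python standard library; the function names, the bisection helper, and the choice $\dim(\mathcal{M}) = 15$ (a tomographically complete two-qubit measurement, $d^2 - 1$ with $d = 4$) are assumptions made for illustration. It evaluates the complementary cumulative distributions $\Pr(\lambda > x)$ for a $\chi^2_k$ variable and for a semi-$\chi^2_1$ variable, and finds the value of $\lambda$ that each one would require for 95% confidence.

```python
import math


def chi2_sf(x, k):
    """Pr(X > x) for a chi-squared variable X with integer k >= 1 degrees of freedom."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Closed forms of the regularized upper incomplete gamma function Q(a, h):
    # Q(1/2, h) = erfc(sqrt(h)) and Q(1, h) = exp(-h).
    if k % 2 == 1:
        q, a = math.erfc(math.sqrt(h)), 0.5
    else:
        q, a = math.exp(-h), 1.0
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < k / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def semi_chi2_sf(x, k=1):
    """Pr(lambda > x) if lambda is 0 with probability 1/2 and chi2_k otherwise."""
    if x < 0:
        return 1.0
    return 0.5 * chi2_sf(x, k)


def threshold(sf, eps, hi=1e3):
    """Approximate smallest x with sf(x) <= eps, by bisection on a decreasing sf."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sf(mid) > eps:
            lo = mid
        else:
            hi = mid
    return hi


dim_m = 15  # assumed: tomographically complete measurement on two qubits
print("chi2 bound, case (1):     ", threshold(lambda x: chi2_sf(x, dim_m), 0.05))
print("semi-chi2 ansatz, case (2):", threshold(semi_chi2_sf, 0.05))
```

The gap between the two thresholds is the price of not knowing whether $N$ is large enough for the half-space picture to hold.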
In the absence of a detailed understanding of $\mathcal{S}$'s shape, case (1) provides the best rigorous upper bound on $\lambda$. Its complementary cumulative distribution is upper bounded by that of a $\chi^2_{\dim(\mathcal{M})}$ variable, i.e., $\Pr(\lambda > x)$ is no greater than it would be if $\lambda$ were a $\chi^2_{\dim(\mathcal{M})}$ variable. As $N \to \infty$, the more optimistic semi-$\chi^2_1$ ansatz is valid, but only if we know that $N$ is "large enough".

A $\chi^2_k$ variable has expected value $k$, and higher values are exponentially suppressed. So $\lambda \gg \dim(\mathcal{M})$ is sufficient to demonstrate entanglement at a high confidence level. This implies a tradeoff between an experiment's power (ability to identify many entangled states) and its efficiency (ability to do so rapidly). Powerful experiments have large dimension, e.g., a tomographically complete measurement can identify any entangled state, but has $\dim(\mathcal{M}) = d^2 - 1$. This comes at a price; experiments with large dimension are potentially much more prone to spurious large values of $\lambda$, so more data are required to achieve conclusive results [$\lambda \gg \dim(\mathcal{M})$]. Conversely, an entanglement witness (see below) is targeted at a particular state, but it can rapidly and conclusively demonstrate entanglement.

[Figure 2: $\lambda$ versus $N$. Legend: q = 0.50, q = 0.35, q = 1/3.]

FIG. 2: Loglikelihood ratios ($\lambda$) behave dramatically differently for different states. $\lambda$ is shown for each trial (points), and averaged over all 1000 trials (solid lines). For small $N$ the experiment cannot reliably distinguish them. As $N$ grows, it resolves shorter distances in the state space. For entangled states, typical values of $\lambda$ increase linearly with $N$, whereas the separable state almost certainly yields $\lambda = 0$ [not visible in these plots; for $\rho_{q=0.\ldots}$ (black), all trials with more than $N \sim \ldots$ measurements yielded $\lambda = 0$, and the average (dashed line) plunges off the graph]. For barely separable states, $\lambda$ behaves as a semi-$\chi^2_k$ variable with $k = 1$ as $N \to \infty$ (see Fig. 3).

Implementation:
Computing $\lambda$ involves maximizing $\mathcal{L}(\rho)$ over two convex sets (the set of all states, and the set $\mathcal{S}$ of separable states). $\mathcal{L}(\rho)$ is log-concave, so in principle this is a convex program.

Testing separability is NP-hard, so efficient maximization over $\rho \in \mathcal{S}$ is intractable in general. But for two qubits, the positive partial transpose (PPT) criterion perfectly characterizes entanglement, and $\lambda$ can be calculated easily (see examples below). For larger systems, $\mathcal{S}$ can be bounded by simpler convex sets, as $\mathcal{S}_- \subset \mathcal{S} \subset \mathcal{S}_+$ (e.g., $\mathcal{S}_+$ = PPT states, and $\mathcal{S}_-$ = convex combinations of specific product states). Maximizing $\mathcal{L}(\rho)$ over $\mathcal{S}_+$ and $\mathcal{S}_-$ yields bounds on $\max_{\rho \in \mathcal{S}} \mathcal{L}(\rho)$, which may (depending on how wisely the bounding sets were chosen) be tight enough to confirm or deny entanglement.

Examples:
To demonstrate the likelihood ratio test, we simulate two different experiments on two qubits. We imagine an experimentalist trying to produce the singlet state $|\Psi\rangle$, and producing instead a Werner state [8],

$$\rho_q = q\, \Pi_{\rm singlet} + (1 - q)\, \mathbb{1}/4, \tag{6}$$

where $\Pi_{\rm singlet} = |\Psi\rangle\langle\Psi|$. Werner states are separable when $q \leq 1/3$, and entangled otherwise. The experimentalist's repeated preparations are assumed to be independently and identically distributed (i.i.d.) [9].

[Figure 3: CCDF($\lambda$) versus $\lambda$, with one curve per dataset size $N$ (six powers of 10) and reference curves for the semi-$\chi^2$ and $\chi^2$ distributions.]

FIG. 3: Distribution of $\lambda$ for a SIC-POVM experiment. We show the empirical complementary cumulative distribution function of $\lambda$, ${\rm CCDF}(\lambda_c) = \Pr(\lambda > \lambda_c)$, for the state $\rho_{q=1/3}$ and simulated datasets of size $N = \{\ldots\}$. The CCDF is used to compute confidence levels, e.g., to report entanglement at the 95% confidence level, it is necessary to observe $\lambda$ such that ${\rm CCDF}(\lambda) < 0.05$. For this particular state, the chance of a zero $\lambda$ approaches 50% as $N$ increases. For each $N$, ${\rm CCDF}(\lambda_c)$ was based on roughly 10 data points from independent trials, each of which generated a value of $\lambda$ from $N$ tomographically complete measurements on $\rho_{q=1/3}$. We also show CCDFs for a semi-$\chi^2_1$ variable and a $\chi^2_{\dim(\mathcal{M})} = \chi^2_{15}$ variable. The semi-$\chi^2_1$ ansatz is good for large $N$, but unreliable for small $N$ (yielding too many false positives), while the $\chi^2_{15}$ ansatz is very conservative.

Witness data:
The simplest way to test for entanglement is to repeatedly measure a single entanglement witness [2, 4]. An optimal witness for Werner states is $W = \mathbb{1}/2 - \Pi_{\rm singlet}$. Measuring $W$ yields one of two outcomes, "yes" or "no", corresponding to POVM (positive-operator valued measure) elements $\{\Pi_{\rm singlet}, \mathbb{1} - \Pi_{\rm singlet}\}$. The probability of a "yes" outcome is given by Born's rule as $p = \mathrm{Tr}\, \rho\, \Pi_{\rm singlet}$, so $p$ completely characterizes a state $\rho$ for the purposes of this experiment. The data from $N$ measurements are fully characterized by the frequency of "yes" results, $f = n_{\text{yes}}/N$. As $N \to \infty$, $f > \tfrac{1}{2}$ represents definitive proof that $\langle W \rangle < 0$, and therefore that $\rho$ is entangled. For finite $N$, $f \leq \tfrac{1}{2}$ means that a separable state fits as well as any other, so there is no case for entanglement. When $f > \tfrac{1}{2}$, our likelihood ratio quantifies the weight of the evidence for entanglement. The likelihood function depends only on $p$, as

$$\mathcal{L}(\rho) = \mathcal{L}(p) = \Pr(\vec{f}\,|\,p) = p^{Nf} (1 - p)^{N(1 - f)} = e^{-N\left(-f \log p - (1 - f) \log(1 - p)\right)}, \tag{7}$$

making this a single-parameter problem. The maximum likelihood, attained at $p = f$, is $\mathcal{L}_{\max} = e^{-N H(f)}$, expressed in terms of the data's empirical entropy,

$$H(f) = -f \log f - (1 - f) \log(1 - f). \tag{8}$$

If $f > \tfrac{1}{2}$, the most likely separable state has $p = \tfrac{1}{2}$, so that $\mathcal{L}_{\rm sep} = 2^{-N}$, which yields

$$\lambda = -2 \log \frac{\mathcal{L}_{\rm sep}}{\mathcal{L}_{\max}} = 2N\left[\log(2) - H(f)\right]. \tag{9}$$

Our numerical explorations (not shown here) confirm that for a barely-separable Werner state, $\lambda$ behaves as a semi-$\chi^2_1$ variable, even for $N$ as low as 20.

Tomographically complete data:
Many entanglement-verification experiments measure a tomographically complete set of observables on a finite-dimensional system (with a heroic example being tomography on 8 ions in an ion trap [11]). Such data identify $\rho$ uniquely as $N \to \infty$, so one can determine with certainty whether $\rho$ is entangled (modulo the computational difficulties in determining whether a specified $\rho$ is separable). Analyzing finite data is more complicated than in the witness example, for the data constrain a multidimensional parameter space. Ad-hoc techniques are unreliable, and the likelihood ratio test comes into its own.

We consider an apparatus that applies a SIC (symmetric informationally complete)-POVM [10] to each of our two qubits, independently. This measurement (not to be confused with a 4-dimensional SIC-POVM) is tomographically complete and has $4 \times 4 = 16$ outcomes. Unlike the witness $W$, it has no special relationship to Werner states, so any entangled $\rho$ will yield overwhelmingly convincing data as $N \to \infty$.

We repeatedly simulated $N = 10 \ldots$ measurements on a barely-separable Werner state ($\rho_{q=1/3}$), and compared the empirical distribution of $\lambda$ to those of semi-$\chi^2_1$ and $\chi^2_{15}$ random variables (see Figure 3). As $N$ gets large, $\lambda$ becomes indistinguishable from a semi-$\chi^2_1$ variable. For smaller $N$, this ansatz is too optimistic (and would produce excessive false positives), but the $\chi^2_{d^2-1}$ ansatz is wildly overcautious. We found that for small $N$, $\lambda$ behaves like a semi-$\chi^2_D$ variable, with $D$ a bit larger than 1 (e.g. $D \approx \ldots$ for $N = 100$).

Conclusions:
Entanglement verification is easy when $N \to \infty$. In practice, $N$ is finite and data are never conclusive. Likelihood ratios provide a simple, reliable test of significance that can be applied to any experimental data. Large values of $\lambda$ are very unlikely to be generated by any separable state, but the hardest separable states to rule out are on the boundary. For such states, theory predicts (and our numerics confirm) that $\lambda$ behaves like a semi-$\chi^2_1$ random variable. If the underlying state is separable, $\Pr(\lambda > x)$ can be upper bounded using a $\chi^2_{\dim(\mathcal{M})}$ distribution, scaling as $e^{-x}$ for large $x$. For entangled states, $\lambda$ grows linearly with $N$, and will thus rapidly become distinguishable from any separable state.

[1] L. Hofstetter et al., Nature , 960 (2009); M. Ansmann et al., Nature , 504 (2009); E. Amselem and M. Bourennane, Nature Physics , 748 (2009); P. Bohi et al., Nature Physics , 592 (2009); L. DiCarlo et al., Nature , 240 (2009); J. Janousek et al., Nature Photonics , 399 (2009); J.D. Jost et al., Nature , 683 (2009); J.C.F. Matthews et al., Nature Photonics , 346 (2009); A. Fedrizzi et al., Nature Physics , 389 (2009); A. Ourjoumtsev et al., Nature Physics , 189 (2009); A. S. Coelho et al., Science , 823 (2009); Scott B. Papp et al., Science , 764 (2009); Ryo Okamoto et al., Science , 483 (2009); S. Olmschenk et al., Science , 486 (2009).
[2] O. Gühne and G. Toth, Physics Reports , 1 (2009).
[3] S.J. van Enk, N. Lutkenhaus, and H.J. Kimble, Phys. Rev. A , 052318 (2007).
[4] M. Horodecki, P. Horodecki, and R. Horodecki, Phys. Lett. A , 1 (1996); B. Terhal, Physics Letters A , 319 (2000); P. Hyllus et al., Phys. Rev. A , 012321 (2005).
[5] W. van Dam, R. D. Gill, and P. D. Grunwald, IEEE Trans. Inf. Th. , 2812 (2005).
[6] Y. Zhang, E. Knill, and S. Glancy, arxiv:1001.1750 (2010).
[7] J. F. Geweke and K. J. Singleton, J. Am. Stat. Assoc. , 133 (1980).
[8] R.F. Werner, Phys. Rev. A , 4277 (1989).
[9] R. Renner, Nature Physics , 645 (2007).
[10] J. M. Renes, R. Blume-Kohout, A. J. Scott, and C. M. Caves, Journal of Mathematical Physics , 2171 (2004).
[11] H. Häffner et al., Nature , 643 (2005).
[12] For larger systems, determining whether a given state is entangled is an NP-hard problem. In multi-partite systems, different classes of entanglement exist, but their classification is still an open problem. Our likelihood ratio method applies to any case where the decision is binary: do the data demonstrate entanglement in a particular class or not? So in this paper, the word "separable" can be generalized to "not in the desired entanglement class."
[13] The factor of $-2$ is chosen so that $\lambda$ (as defined) is in many circumstances a $\chi^2$ random variable (see text, below).
[14] Statisticians call this a "Type I error". Erroneously rejecting entanglement, even though the experiment is capable of demonstrating entanglement (which is not the same as reporting separability), is a "Type II error". In entanglement verification one tries to avoid Type I errors and is merely mildly unenthusiastic about Type II errors.
[15] Technically, this Gaussian ansatz is true only when $\rho$ is full rank, i.e., not on the boundary of the state set. If $\rho$ is rank-deficient, then both the distribution of $\hat\rho_{\rm MLE}$ and $\mathcal{L}(\rho)$ itself are typically truncated by the boundary. However, the analysis remains valid (as $N \to \infty$) except if $\rho$ is simultaneously rank-deficient and on the boundary between separable and entangled states.
[16] As long as the boundary of $\mathcal{S}$ is differentiable at ρ0