Correlation Distance and Bounds for Mutual Information
Michael J. W. Hall ∗

Abstract
The correlation distance quantifies the statistical independence of two classical or quantum systems, via the distance from their joint state to the product of the marginal states. Tight lower bounds are given for the mutual information between pairs of two-valued classical variables and quantum qubits, in terms of the corresponding classical and quantum correlation distances. These bounds are stronger than the Pinsker inequality (and refinements thereof) for relative entropy. The classical lower bound may be used to quantify properties of statistical models that violate Bell inequalities. Entangled qubits can have a lower mutual information than can any two-valued classical variables having the same correlation distance. The qubit correlation distance also provides a direct entanglement criterion, related to the spin covariance matrix. Connections of results with classically-correlated quantum states are briefly discussed.
The relative entropy between two probability distributions has many applications in classical and quantum information theory. A number of these applications, including the conditional limit theorem [1], and secure random number generation and communication [2, 3], make use of lower bounds on the relative entropy in terms of a suitable distance between the two distributions. The best known such bound is the so-called Pinsker inequality [4]

$$ H(P\|Q) := \sum_j P(j)\,[\log P(j) - \log Q(j)] \;\geq\; \tfrac{1}{2}\, D(P,Q)^2 \log e, \tag{1} $$

where $D(P,Q) := \|P-Q\|_1 = \sum_j |P(j)-Q(j)|$ is the variational or L1 distance between distributions $P$ and $Q$. Note that the choice of logarithm base is left open throughout this paper, corresponding to a choice of units. There are a number of such bounds [4], all of which easily generalise to the case of quantum probabilities [5, 6].

However, in a number of applications of the Pinsker inequality and its quantum analog, a lower bound is in fact only needed for the special case that the relative entropy quantifies the mutual information between two systems. Such applications include, for example, secure random number generation and coding [2, 3] (both classical and quantum), and quantum de Finetti theorems [7]. Since mutual information is a special case of relative entropy, it follows that it may be possible to find strictly stronger lower bounds for mutual information.

Surprisingly little attention appears to have been paid to this possibility of better lower bounds (although upper bounds for mutual information have been investigated [8]).
The results of preliminary investigations are given here, with explicit tight lower bounds being obtained for pairs of two-valued classical random variables, and for pairs of quantum qubits with maximally-mixed reduced states.

In the context of mutual information, the corresponding variational distance reduces to the distance between the joint state of the systems and the product of their marginal states, referred to here as the ‘correlation distance’. It is shown that both the classical and quantum correlation distances are relevant to quantifying properties of quantum entanglement: the former with respect to the classical resources required to simulate entanglement, and the latter as providing a criterion for qubit entanglement. In the quantum case, it is also shown that the minimum value of the mutual information can only be achieved by entangled qubits if the correlation distance is more than ≈ 0.7265.

∗ Centre for Quantum Computation and Communication Technology (Australian Research Council), Centre for Quantum Dynamics, Griffith University, Brisbane, QLD 4111, Australia (arXiv preprint, quant-ph, July).

The main results are given in the following section. Lower bounds on classical and quantum mutual information for two-level systems are derived in sections 3 and 5, and an entanglement criterion for qubits in terms of the quantum correlation distance is obtained in section 4. Connections with classically-correlated quantum states are briefly discussed in section 6, and conclusions presented in section 7.
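As a quick numerical illustration of the Pinsker inequality (1), the following minimal Python sketch evaluates the relative entropy and the variational distance for a pair of three-outcome distributions and compares the two sides. The distributions `P` and `Q` and all function names are illustrative choices, not taken from the paper; natural logarithms are used, so that log e = 1.

```python
import math

def relative_entropy(p, q):
    """H(P||Q) in nats; terms with P(j) = 0 contribute zero.

    Assumes Q(j) > 0 wherever P(j) > 0.
    """
    return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)

def variational_distance(p, q):
    """L1 distance D(P, Q) = sum_j |P(j) - Q(j)|."""
    return sum(abs(pj - qj) for pj, qj in zip(p, q))

def pinsker_bound(p, q):
    """Right-hand side of the Pinsker inequality in nats: D(P, Q)^2 / 2."""
    return 0.5 * variational_distance(p, q) ** 2

# Hypothetical example distributions (not from the paper).
P = [0.5, 0.3, 0.2]
Q = [0.2, 0.3, 0.5]

print("relative entropy :", relative_entropy(P, Q))
print("Pinsker bound    :", pinsker_bound(P, Q))
```

For this particular pair the relative entropy exceeds the bound with room to spare, which is typical: the Pinsker inequality is only tight in the limit of nearly equal distributions.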
For two classical random variables $A$ and $B$, with joint probability distribution $P_{AB}(a,b)$ and marginal distributions $P_A(a)$ and $P_B(b)$, the Shannon mutual information and the classical correlation distance are defined respectively by

$$ I(P_{AB}) := H(P_{AB}\|P_A P_B) = H(P_A) + H(P_B) - H(P_{AB}), $$
$$ C(P_{AB}) := \|P_{AB} - P_A P_B\|_1 = \sum_{a,b} |P_{AB}(a,b) - P_A(a)P_B(b)|, $$

where $H(P) := -\sum_j P(j)\log P(j)$ denotes the Shannon entropy of distribution $P$. The term ‘correlation distance’ is used for $C(P_{AB})$, since it inherits all the properties of a distance from the more general variational distance, and clearly vanishes for uncorrelated $A$ and $B$.

For two quantum systems $A$ and $B$ described by density operator $\rho_{AB}$ and reduced density operators $\rho_A$ and $\rho_B$, the corresponding quantum mutual information and quantum correlation distance are analogously defined by

$$ I(\rho_{AB}) := S(\rho_A) + S(\rho_B) - S(\rho_{AB}), $$
$$ C(\rho_{AB}) := \|\rho_{AB} - \rho_A\otimes\rho_B\|_1 = \mathrm{tr}\,|\rho_{AB} - \rho_A\otimes\rho_B|, $$

where $S(\rho) := -\mathrm{tr}[\rho\log\rho]$ denotes the von Neumann entropy of density operator $\rho$.

In both the classical and quantum cases, one has the lower bound

$$ I \geq \tfrac{1}{2}\, C^2 \log e \tag{2} $$

for mutual information, as a direct consequence of the Pinsker inequality (1) for classical relative entropies [4, 5, 6]. However, better bounds for mutual information can be obtained, which are stronger than any general inequality for relative entropy and variational distance.

For example, for two-valued classical random variables $A$ and $B$ one has the tight lower bound

$$ I(P_{AB}) \geq \log 2 - H\!\left(\frac{1+C(P_{AB})}{2}, \frac{1-C(P_{AB})}{2}\right) \tag{3} $$

for classical mutual information. This inequality has been previously stated without proof in Ref. [9], where it was used to bound the shared information required to classically simulate entangled quantum systems. It is proved in section 3 below.

In contrast to Pinsker-type inequalities such as Eq.
(2), the quantum generalisation of Eq. (3) is not straightforward. In particular, note for a two-qubit system that one cannot simply replace $P_{AB}$ by $\rho_{AB}$ in Eq. (3), as the right hand side would be undefined for $C(\rho_{AB}) > 1$. As shown in section 4, $C(\rho_{AB}) > 1$ implies that the qubits are entangled, as does the stronger criterion

$$ C(\rho_{AB}) > 2\sqrt{(1-\mathrm{tr}[\rho_A^2])\,(1-\mathrm{tr}[\rho_B^2])}. \tag{4} $$

An explicit expression for the quantum correlation distance for two qubits, in terms of the spin covariance matrix, is also given in section 4.

It is shown in section 5 that the quantum equivalent of Eq. (3), i.e., a tight lower bound for the quantum mutual information shared by two qubits, is

$$ I(\rho_{AB}) \geq \begin{cases} \log 2 - H\!\left(\frac{1+C(\rho_{AB})}{2}, \frac{1-C(\rho_{AB})}{2}\right), & C(\rho_{AB}) \leq C_0, \\[4pt] \log 4 - H\!\left(\frac{1}{4}+\frac{C(\rho_{AB})}{2}, \frac{1}{4}-\frac{C(\rho_{AB})}{6}, \frac{1}{4}-\frac{C(\rho_{AB})}{6}, \frac{1}{4}-\frac{C(\rho_{AB})}{6}\right), & C(\rho_{AB}) > C_0, \end{cases} \tag{5} $$

when the reduced density operators are maximally mixed, where $C_0 \approx 0.7265$ is the value at which the two branches coincide. For $C(\rho_{AB}) > C_0$ this lower bound can only be achieved by entangled states, and cannot be achieved by any classical distribution $P_{AB}$ having the same correlation distance. It is also shown that, for $C(\rho_{AB}) > C_0$, the bound is also tight if only one of the reduced states is maximally mixed. Support is given for the conjecture that the bound in Eq. (5) in fact holds for all two-qubit states.

In section 6 the natural role of ‘classically-correlated’ quantum states, in comparing classical and quantum correlations, is briefly discussed. Such states have the general form $\rho_{AB} = \sum_{j,k} P(j,k)\,|j,k\rangle\langle j,k|$ [10], where $P(j,k)$ is a classical joint probability distribution and $\{|j\rangle\}$ and $\{|k\rangle\}$ are orthonormal basis sets for the two quantum systems. The lower bound in Eq. (5) can be saturated by a classically-correlated state if and only if $C \leq C_0$.

The tight lower bound in Eq. (3) is derived here. The bound is plotted in Figure 1 below [top curve]. Also plotted for comparison are the Pinsker lower bound in Eq. (2) [bottom curve], and the lower bound following from the best possible generic inequality for relative entropy and variational distance, given in parametric form in Ref. [4] [intermediate curve].
Figure 1: Lower bounds for the classical mutual information between two-valued variables (in bits).
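The two closed-form curves in Figure 1 can be tabulated directly. The sketch below, in Python with illustrative function names, evaluates the Pinsker bound of Eq. (2) and the tight bound of Eq. (3) in bits, and spot-checks Eq. (3) against randomly sampled joint distributions of two two-valued variables. It assumes Eq. (2) has the standard Pinsker form I ≥ (C²/2) log e; the random sampling is a sanity check only, not a proof, and the intermediate curve of Ref. [4] is not reproduced.

```python
import math
import random

def H(probs):
    """Shannon entropy in bits, with 0 log 0 taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def pinsker_bound_bits(c):
    # Eq. (2) in bits: (C^2 / 2) * log2(e)
    return 0.5 * c * c * math.log2(math.e)

def tight_bound_bits(c):
    # Eq. (3) in bits: 1 - H((1 + C)/2, (1 - C)/2)
    return 1.0 - H([(1 + c) / 2, (1 - c) / 2])

def mutual_information_and_distance(joint):
    """joint[a][b] is a 2x2 probability table; returns (I in bits, C)."""
    pa = [sum(row) for row in joint]
    pb = [joint[0][b] + joint[1][b] for b in range(2)]
    flat = [joint[a][b] for a in range(2) for b in range(2)]
    info = H(pa) + H(pb) - H(flat)
    dist = sum(abs(joint[a][b] - pa[a] * pb[b])
               for a in range(2) for b in range(2))
    return info, dist

def random_joint(rng):
    """Random 2x2 joint distribution (normalised uniform weights)."""
    w = [rng.random() for _ in range(4)]
    s = sum(w)
    return [[w[0] / s, w[1] / s], [w[2] / s, w[3] / s]]

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = (mutual_information_and_distance(random_joint(rng)) for _ in range(2000))
worst_gap = min(info - tight_bound_bits(dist) for info, dist in samples)

for c in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"C={c:.2f}  Pinsker={pinsker_bound_bits(c):.4f}  tight={tight_bound_bits(c):.4f}")
print("smallest I - bound over random samples:", worst_gap)
```

The smallest gap should be non-negative up to rounding error. Distributions with uniform marginals, such as `[[0.4, 0.1], [0.1, 0.4]]`, sit exactly on the tight curve.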
To derive the bound in Eq. (3), it is convenient to label the two possible values of $A$ and $B$ by $\pm 1$. Defining $R(a,b) := 4[P_{AB}(a,b) - P_A(a)P_B(b)]$, it follows by summing over each of $a$ and $b$ that $R(a,b) = abr$ for some number $r$, and hence that $C(P_{AB}) = |r|$. Further, writing $P_A(a) = (1+ax)/2$, $P_B(b) = (1+by)/2$, for suitable $x, y \in [-1,1]$, the positivity condition $P_{AB}(a,b) \geq 0$ is equivalent to

$$ |x+y| - 1 \leq r + xy \leq 1 - |x-y|. \tag{6} $$

Now, Eq. (3) is equivalent to

$$ f(r) := I(P_{AB}) - \log 2 + H\!\left(\frac{1+r}{2}, \frac{1-r}{2}\right) \geq 0. \tag{7} $$

It is easy to check that this inequality is always saturated for the case of maximally-random marginals, i.e., when $x = y = 0$. In all other cases, the inequality may be proved by showing that $f(r)$ has a unique global minimum value of 0 at $r = 0$.

In particular, note first that $f(0) = 0$ (one has $P_{AB} = P_A P_B$ in this case, so that the mutual information vanishes). Further, using $P_{AB}(a,b) = [(1+ax)(1+by) + abr]/4$, one easily calculates that, using logarithm base $e$ for convenience,

$$ f'(r) = \frac{1}{4}\sum_{a,b} ab \log P_{AB}(a,b) - \frac{1}{2}\sum_a a \log\frac{1+ar}{2} = \frac{1}{4}\log\frac{P_{AB}(+,+)\,P_{AB}(-,-)\,(1-r)^2}{P_{AB}(+,-)\,P_{AB}(-,+)\,(1+r)^2}. $$

Hence, $f'(r) = 0$ if and only if the argument of the logarithm is unity, i.e., if and only if

$$ [(1+x)(1+y)+r]\,[(1-x)(1-y)+r]\,(1-r)^2 = [(1+x)(1-y)-r]\,[(1-x)(1+y)-r]\,(1+r)^2. $$

Expanding and simplifying yields two possible solutions: $r = 0$, or $r = (x^2 + y^2 - x^2y^2)/(2xy)$. However, in the latter case one has

$$ |r + xy| = \frac{x^2 + y^2 + x^2y^2}{2|xy|} = \frac{\alpha}{\gamma} + \frac{\gamma}{2} \geq 1 + \frac{\gamma}{2} \geq 1, $$

where $\alpha$ and $\gamma$ denote the arithmetic mean and geometric mean, respectively, of $x^2$ and $y^2$ (hence $\alpha \geq \gamma$). This is clearly inconsistent with the positivity condition (6), which requires $|r + xy| \leq 1$ (unless $x = y = 0$, which trivially saturates Eq. (7) for all $r$ as noted above). The only remaining solution to $f'(r) = 0$ is then $r = 0$, implying $f(r)$ has a unique maximum or minimum value at $r = 0$. Finally, it is easily checked that it is a minimum, since

$$ f''(0) = \frac{1}{16}\sum_{a,b}\frac{1}{P_A(a)P_B(b)} - 1 = \frac{1}{16\,P_A(+)P_A(-)P_B(+)P_B(-)} - 1 = \frac{1}{(1-x^2)(1-y^2)} - 1 \geq 0 $$

(with equality only for $x = y = 0$). Thus, $f(r) \geq f(0) = 0$ as required.

The hallmark feature of quantum correlations is that they cannot be explained by any underlying statistical model that satisfies three physically very plausible properties: (i) no signaling faster than the speed of light, (ii) free choice of measurement settings, and (iii) independence of local outcomes. Various interpretations of quantum mechanics differ in regard to which of these properties should be given up. It is of interest to consider by how much they must be given up, in terms of the information-theoretic resources required to simulate a given quantum correlation.
For example, how many bits of communication, or bits of correlation between the source and the measurement settings, or bits of correlation between the outcomes, are required? The lower bound for classical mutual information in Eq. (3) is relevant to the last of these questions.

In more detail, if $P_{AB}(a,b)$ denotes the joint probability of outcomes $a$ and $b$, for measurements of variables $A$ and $B$ on respective spacelike-separated systems, and $\lambda$ denotes any underlying variables relevant to the correlations, then Bayes theorem implies that

$$ P_{AB}(a,b) = \sum_\lambda p_{AB}(\lambda)\, P_{AB}(a,b|\lambda), $$

where summation is replaced by integration over any continuous values of $\lambda$. The no-signaling property requires that the underlying marginal distribution of $A$, $p_A(a|\lambda)$, is independent of whether $B$ or $B'$ was measured on the second system (and vice versa), while the free-choice property requires that $\lambda$ is independent of the choice of the measured variables $A$ and $B$, i.e., that $p_{AB}(\lambda) = p_{A'B'}(\lambda)$ for any $A, A', B, B'$. Finally, the outcome independence property requires that any observed correlation between $A$ and $B$ arises from ignorance of the underlying variable, i.e., that $P_{AB}(a,b|\lambda) = P_A(a|\lambda)P_B(b|\lambda)$ for all $A$, $B$ and $\lambda$. Thus the correlation distance of $P_{AB}(a,b|\lambda)$ vanishes identically:

$$ C(P_{AB|\lambda}) \equiv 0. \tag{8} $$

As is well known, the assumption of all three properties implies that two-valued random variables with values $\pm 1$ must satisfy the Bell inequality

$$ \langle AB\rangle + \langle AB'\rangle + \langle A'B\rangle - \langle A'B'\rangle \leq 2, \tag{9} $$

whereas quantum correlations can violate this inequality by as much as a factor of $\sqrt{2}$. It follows that quantum correlations can only be modeled by relaxing one or more of the above properties, as has recently been reviewed in detail in Ref. [9].

For example, assuming that no-signaling and measurement independence hold (as they do in the standard Copenhagen interpretation of quantum mechanics), and defining $C_{\max}$ to be the maximum value of $C(P_{AB|\lambda})$ over all $A$, $B$ and $\lambda$, it can be shown that Eq. (9) generalises to the tight bound [9]

$$ \langle AB\rangle + \langle AB'\rangle + \langle A'B\rangle - \langle A'B'\rangle \leq \frac{4}{2 - C_{\max}}. \tag{10} $$

It follows that to simulate a Bell inequality violation $\langle AB\rangle + \langle AB'\rangle + \langle A'B\rangle - \langle A'B'\rangle = 2 + V$, for some $V > 0$, one requires $C_{\max} \geq 2V/(2+V)$. Hence, using the classical lower bound Eq. (3) (stated without proof in Ref. [9]), the observers must share a minimum mutual information of

$$ I_{\min} = \log 2 - H\!\left(\frac{1+C_{\max}}{2}, \frac{1-C_{\max}}{2}\right) \geq \log 2 - H\!\left(\frac{2+3V}{4+2V}, \frac{2-V}{4+2V}\right). \tag{11} $$

Note this reduces to zero in the limit of no violation of Bell inequality (9), i.e., when $V = 0$, and reaches a maximum of 1 bit of information in the limit of the maximum possible violation, $V = 2$.

The positivity condition (6) may be used to show that the classical correlation distance between any pair of two-valued random variables is never greater than unity, i.e., that $C(P_{AB}) = |r| \leq 1$. In contrast, for two qubits one has $C(\rho_{AB}) \leq 3/2$. More generally, one has

$$ C(P_{AB}) \leq 2(n-1)/n, \qquad C(\rho_{AB}) \leq 2(n^2-1)/n^2 \tag{12} $$

for pairs of $n$-valued random variables and $n$-level quantum systems, with saturation corresponding to maximal correlation and maximal entanglement respectively. Thus, quantum correlations have a quadratic advantage with respect to correlation distance (this is also the case for mutual information, for which one has $I(P_{AB}) \leq \log n$ and $I(\rho_{AB}) \leq \log n^2$).

Nonclassical values of the quantum correlation distance are closely related to the quintessential nonclassical feature of quantum mechanics: entanglement. In particular, $C(\rho_{AB}) > 1$ implies that two qubits are entangled.

Recall that the density operator $\rho_{AB}$ of two qubits may always be written in the Fano form [12]

$$ \rho_{AB} = \frac{1}{4}\Big[ I\otimes I + u.\sigma\otimes I + I\otimes v.\sigma + \sum_{j,k}\langle\sigma_j\otimes\sigma_k\rangle\,\sigma_j\otimes\sigma_k \Big] = \rho_A\otimes\rho_B + \frac{1}{4}\sum_{j,k} T_{jk}\,\sigma_j\otimes\sigma_k. \tag{13} $$

Here $I$ is the unit operator; $\{\sigma_j\}$ denotes the set of Pauli spin observables on each qubit Hilbert space; the components of the 3-vectors $u$ and $v$ are the spin expectation values $u_j := \langle\sigma_j\otimes I\rangle$ and $v_k := \langle I\otimes\sigma_k\rangle$, for $A$ and $B$ respectively; and $T$ denotes the $3\times 3$ spin covariance matrix with coefficients $T_{jk} := \langle\sigma_j\otimes\sigma_k\rangle - \langle\sigma_j\otimes I\rangle\langle I\otimes\sigma_k\rangle$. It immediately follows from Eq. (13) that the quantum correlation distance may be expressed in terms of the spin covariance matrix as

$$ C(\rho_{AB}) = \frac{1}{4}\,\mathrm{tr}\,\Big|\sum_{j,k} T_{jk}\,\sigma_j\otimes\sigma_k\Big|. \tag{14} $$

This expression will be further simplified in subsection 4.2.

Now consider the case where $\rho_{AB}$ is a separable state, i.e., of the unentangled form

$$ \rho_{AB} = \sum_\lambda p(\lambda)\,\tau_A(\lambda)\otimes\omega_B(\lambda), $$

for some probability distribution $p(\lambda)$ and local density operators $\{\tau_A(\lambda)\}$, $\{\omega_B(\lambda)\}$.
Defining $u_j(\lambda) := \mathrm{tr}[\tau_A(\lambda)\sigma_j]$, $v_k(\lambda) := \mathrm{tr}[\omega_B(\lambda)\sigma_k]$ implies $u = \sum_\lambda p(\lambda)u(\lambda)$ and $v = \sum_\lambda p(\lambda)v(\lambda)$, and substitution into Eq. (14) then yields

$$ \begin{aligned} C(\rho_{AB}) &= \frac{1}{4}\Big\|\sum_\lambda p(\lambda)\,[u(\lambda)-u].\sigma\otimes[v(\lambda)-v].\sigma\Big\|_1 \\ &\leq \frac{1}{4}\sum_\lambda p(\lambda)\,\big\|[u(\lambda)-u].\sigma\big\|_1\,\big\|[v(\lambda)-v].\sigma\big\|_1 \\ &= \sum_\lambda p(\lambda)\,|u(\lambda)-u|\,|v(\lambda)-v| \\ &\leq \Big[\sum_\lambda p(\lambda)\,|u(\lambda)-u|^2\Big]^{1/2}\Big[\sum_\lambda p(\lambda)\,|v(\lambda)-v|^2\Big]^{1/2} \\ &= \Big[\sum_\lambda p(\lambda)\,|u(\lambda)|^2 - |u|^2\Big]^{1/2}\Big[\sum_\lambda p(\lambda)\,|v(\lambda)|^2 - |v|^2\Big]^{1/2} \\ &\leq \sqrt{(1-u.u)(1-v.v)}. \end{aligned} \tag{15} $$

Note that the second line follows from the properties $\|X+Y\| \leq \|X\| + \|Y\|$ and $\|XY\| \leq \|X\|\,\|Y\|$ of the trace norm; the third line using $\|X\| = \mathrm{tr}[\sqrt{X^\dagger X}]$ (so that $\|w.\sigma\|_1 = 2|w|$ for any real 3-vector $w$) and the fourth line using the Schwarz inequality; and the last line via $|u(\lambda)|, |v(\lambda)| \leq 1$. Thus a nonclassical value of the quantum correlation distance, $C(\rho_{AB}) > 1$, immediately implies that the qubits must be entangled. More generally, noting that $\rho_A = \frac{1}{2}(I + u.\sigma)$ and $\rho_B = \frac{1}{2}(I + v.\sigma)$, one has $\mathrm{tr}[\rho_A^2] = (1+u.u)/2$, $\mathrm{tr}[\rho_B^2] = (1+v.v)/2$, and the stronger entanglement criterion (4) immediately follows from Eq. (15).

The fact that entanglement is required between two qubits, for $C(\rho_{AB})$ to be greater than the maximum possible value of $C(P_{AB})$ for two-valued classical variables, is a nice distinction between quantum and classical correlation distances. It would be of interest to determine whether this result generalises to $n$-level systems. This would follow from the validity of Eq. (4) for arbitrary quantum systems.

$C(\rho_{AB})$

To explicitly evaluate $C(\rho_{AB})$ in Eq. (14), let $T = KDL^{\top}$ denote a singular value decomposition of the spin covariance matrix. Thus, $K$ and $L$ are real orthogonal matrices and $D = \mathrm{diag}[t_1, t_2, t_3]$, with the singular values $t_1 \geq t_2 \geq t_3 \geq 0$ corresponding to the square roots of the eigenvalues of $T^{\top}T$. Noting that any $3\times 3$ orthogonal matrix is either a rotation or the product of a rotation with $-I$, one therefore always has a decomposition of the form $T = \pm KDL^{\top}$ where $K$ and $L$ are now restricted to be rotation matrices. Hence, defining unitary operators $U$ and $V$ corresponding to rotations $K$ and $L$, via $U\sigma_jU^\dagger = \sum_{j'} K_{jj'}\sigma_{j'}$ and $V\sigma_jV^\dagger = \sum_{j'} L_{jj'}\sigma_{j'}$, and using the invariance of the trace norm under unitary transformations, the quantum correlation distance in Eq. (14) can be rewritten as

$$ C(\rho_{AB}) = \frac{1}{4}\,\mathrm{tr}\,\Big|\pm\sum_j t_j\,U\sigma_jU^\dagger\otimes V\sigma_jV^\dagger\Big| = \frac{1}{4}\,\mathrm{tr}\,\Big|\sum_j t_j\,\sigma_j\otimes\sigma_j\Big|. $$

Determining the eigenvalues of the Hermitian operator $\sum_j t_j\,\sigma_j\otimes\sigma_j$ is a straightforward $4\times 4$ matrix calculation, yielding the explicit expression

$$ C(\rho_{AB}) = \frac{1}{4}\big[\,|t_1+t_2+t_3| + |t_1+t_2-t_3| + |t_1-t_2+t_3| + |-t_1+t_2+t_3|\,\big] = \frac{1}{2}\max\{t_1+t_2+t_3,\; 2t_1\} \tag{16} $$

for the quantum correlation distance, in terms of the singular values of the spin covariance matrix.
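The two forms of the explicit expression for the correlation distance in terms of the singular values can be compared numerically. The following Python sketch (function names are illustrative) evaluates one quarter of the sum of absolute eigenvalues and the closed form (1/2) max{t1 + t2 + t3, 2 t1} for randomly chosen ordered triples t1 ≥ t2 ≥ t3 ≥ 0. It does not check that a given triple corresponds to a physical two-qubit state.

```python
import random

def distance_from_eigenvalues(t1, t2, t3):
    """One quarter of the sum of absolute eigenvalues of sum_j t_j sigma_j (x) sigma_j."""
    terms = (t1 + t2 + t3, t1 + t2 - t3, t1 - t2 + t3, -t1 + t2 + t3)
    return sum(abs(v) for v in terms) / 4

def distance_closed_form(t1, t2, t3):
    """Closed form (1/2) max{t1 + t2 + t3, 2 t1}; assumes t1 >= t2 >= t3 >= 0."""
    return 0.5 * max(t1 + t2 + t3, 2 * t1)

rng = random.Random(1)  # fixed seed so the run is reproducible
max_diff = 0.0
for _ in range(1000):
    # draw three values in [0, 1) and sort them in decreasing order
    t1, t2, t3 = sorted((rng.random() for _ in range(3)), reverse=True)
    diff = abs(distance_from_eigenvalues(t1, t2, t3) - distance_closed_form(t1, t2, t3))
    max_diff = max(max_diff, diff)

print("largest discrepancy:", max_diff)
```

The discrepancy should vanish up to rounding error. The maximum switches branch when t1 = t2 + t3, which is where the fourth eigenvalue changes sign.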
For example, for the Werner state $\rho_{AB} = p\,|\psi\rangle\langle\psi| + \frac{1-p}{4}\, I\otimes I$, where $|\psi\rangle$ is the singlet state and $-1/3 \leq p \leq 1$, one finds that $T = -pI$ and hence that $t_1 = t_2 = t_3 = |p|$. The corresponding correlation distance is therefore $3|p|/2$, which is greater than the classical maximum of unity for $p > 2/3$. Since Eq. (16) implies $C(\rho_{AB}) \leq t_1+t_2+t_3$, the entanglement criterion (4) is weaker than the criterion

$$ t_1 + t_2 + t_3 > 2\sqrt{(1-\mathrm{tr}[\rho_A^2])\,(1-\mathrm{tr}[\rho_B^2])}. \tag{17} $$

For the above Werner state this criterion is tight, indicating entanglement for $p > 1/3$. Hence, the main interest in weaker entanglement criteria based on quantum correlation distance lies in their direct connection with nonclassical values of the classical correlation distance.
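The Werner-state thresholds quoted above can be reproduced in a few lines. This Python sketch assumes that criterion (4) reads C > 2 sqrt((1 − tr ρ_A²)(1 − tr ρ_B²)) and that criterion (17) has the same right-hand side with t1 + t2 + t3 on the left, which is consistent with the thresholds p > 2/3 and p > 1/3 stated in the text. The function name is illustrative.

```python
import math

def werner_entanglement_tests(p):
    """Return (C, criterion (4) satisfied, criterion (17) satisfied) for singlet weight p."""
    t = abs(p)                    # singular values t1 = t2 = t3 = |p|
    c = 0.5 * max(3 * t, 2 * t)   # correlation distance from the singular values, 3|p|/2
    purity = 0.5                  # tr[rho_A^2] = tr[rho_B^2] = 1/2 for maximally mixed marginals
    rhs = 2 * math.sqrt((1 - purity) * (1 - purity))  # evaluates to 1
    return c, c > rhs, 3 * t > rhs

for p in (0.2, 0.4, 0.6, 0.7, 0.9):
    c, by_distance, by_singular_values = werner_entanglement_tests(p)
    print(f"p={p:.1f}  C={c:.2f}  criterion(4)={by_distance}  criterion(17)={by_singular_values}")
```

The weights between 1/3 and 2/3 (here 0.4 and 0.6) are flagged only by the singular-value criterion, which illustrates the sense in which the correlation-distance criterion is the weaker of the two.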
Here Eq. (5) is derived for the case $\rho_A = \rho_B = \frac{1}{2}I$. Evidence is provided for the conjecture that Eq. (5) in fact holds for all two-qubit states, including a partial generalisation of Eq. (5) when only one of $\rho_A$ and $\rho_B$ is maximally-mixed.

$\rho_A$ and $\rho_B$

The tight lower bound for quantum mutual information in Eq. (5), for maximally-mixed reduced states, is plotted in Figure 2 below [top solid curve]. Also plotted for comparison are the Pinsker lower bound in Eq. (2) [bottom solid curve], and classical lower bound in Eq. (3) [dashed curve]. The dotted vertical line indicates the value of $C_0 \approx 0.7265$. The quantum lower bound lies below the classical lower bound for correlation distances between $C_0$ and 1.

Figure 2: Lower bounds for the quantum mutual information between two qubits (in bits).
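The crossover value of the correlation distance and the top solid curve of Figure 2 can be estimated numerically. The Python sketch below assumes the two branches of Eq. (5) are log 4 minus the four-outcome entropies H1(C) = H((1−C)/4, (1−C)/4, (1+C)/4, (1+C)/4) and H3(C) = H(1/4 + C/2, 1/4 − C/6, 1/4 − C/6, 1/4 − C/6), and locates the crossover by bisection on H3 − H1. Function names, the bracketing interval and the tolerance are illustrative choices.

```python
import math

def H(probs):
    """Shannon entropy in bits, with 0 log 0 taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def H1(c):
    # entropy for the first branch; defined for 0 <= C <= 1
    return H([(1 - c) / 4, (1 - c) / 4, (1 + c) / 4, (1 + c) / 4])

def H3(c):
    # entropy for the second (Werner-state) branch; defined for 0 <= C <= 3/2
    return H([0.25 + c / 2] + [0.25 - c / 6] * 3)

def crossover(lo=0.5, hi=1.0, tol=1e-12):
    """Bisection for H3(C) = H1(C), using H3 < H1 at lo and H3 > H1 at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if H3(mid) > H1(mid):
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def quantum_bound_bits(c, c0):
    """Piecewise lower bound on the quantum mutual information, in bits."""
    return 2.0 - (H1(c) if c <= c0 else H3(c))

C0 = crossover()
print("crossover C0 ~", round(C0, 4))
for c in (0.25, 0.5, 0.9, 1.2, 1.5):
    print(f"C={c:.2f}  quantum bound={quantum_bound_bits(c, C0):.4f} bits")
```

At the maximum correlation distance C = 3/2 the bound reaches 2 bits, the mutual information of a maximally entangled pair of qubits.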
To derive Eq. (5) for $\rho_A = \rho_B = \frac{1}{2}I$, note first that Eq. (13) reduces to $\rho_{AB} = \frac{1}{4}[I\otimes I + \sum_{j,k} T_{jk}\,\sigma_j\otimes\sigma_k]$. By the same argument given in section 4.2, this can be transformed via local unitary transformations to the state

$$ \tilde\rho_{AB} = \frac{1}{4}\Big[ I\otimes I + \sum_j r_j\,\sigma_j\otimes\sigma_j \Big], \tag{18} $$

where $r_j = \alpha t_j$, $\alpha = \pm 1$, and $t_1 \geq t_2 \geq t_3 \geq 0$ are the singular values of the spin covariance matrix $T$. Since the quantum mutual information and quantum correlation distance are invariant under local unitary transformations, one has $I(\rho_{AB}) = I(\tilde\rho_{AB})$ and $C(\rho_{AB}) = C(\tilde\rho_{AB})$. Hence Eq. (5) only needs to be demonstrated for $\tilde\rho_{AB}$.

The mutual information of $\tilde\rho_{AB}$ is easily evaluated as

$$ I(\tilde\rho_{AB}) = S(\tilde\rho_A) + S(\tilde\rho_B) - S(\tilde\rho_{AB}) = \log 4 - H(p_0, p_1, p_2, p_3), \tag{19} $$

where $p_0 = \frac{1}{4}(1 - r_1 - r_2 - r_3)$, $p_1 = \frac{1}{4}(1 - r_1 + r_2 + r_3)$, $p_2 = \frac{1}{4}(1 + r_1 - r_2 + r_3)$, $p_3 = \frac{1}{4}(1 + r_1 + r_2 - r_3)$ are the eigenvalues of $\tilde\rho_{AB}$. Inverting the relation between the $r_j$ and $p_j$ further yields

$$ t_j = \alpha r_j = \alpha[1 - 2(p_0 + p_j)], \qquad t_1 + t_2 + t_3 = \alpha(1 - 4p_0), \tag{20} $$

and hence the correlation distance follows from Eq. (16) as

$$ C(\tilde\rho_{AB}) = C := \frac{1}{2}\max\{\alpha(1 - 4p_0),\; \alpha(1 - 4p_0 + 1 - 4p_1)\}. \tag{21} $$

Equation (19) implies that a tight lower bound for $I(\tilde\rho_{AB})$ corresponds to a tight upper bound for $H(p_0, p_1, p_2, p_3)$. To determine the maximum value of $H(p_0, p_1, p_2, p_3)$, for a fixed correlation distance $C$, consider first the case $\alpha = 1$. The ordering and positivity conditions on $t_j$ then require $p_1 \leq p_2 \leq p_3$, $p_0 + p_j \leq \frac{1}{2}$ for $j = 1, 2, 3$, and $p_0 \leq 1/4$, while Eq. (21) becomes $C = \max\{\frac{1}{2} - 2p_0,\; 1 - 2p_0 - 2p_1\}$. Hence, if $p_1 \leq 1/4$, then $C = 1 - 2(p_0 + p_1) \leq 1$, implying the constraint $p_0 + p_1 = (1 - C)/2$. Noting the concavity of entropy, the maximum possible entropy under this constraint corresponds to equal values $p_0 = p_1 = (1 - C)/4$, and $p_2 = p_3 = (1 + C)/4$ (which are compatible with the above conditions on the $p_j$). Conversely, if $p_1 \geq 1/4$, then $C = (1 - 4p_0)/2 \leq 1/2$, and hence $p_0 = 1/4 - C/2$, with the maximum possible entropy corresponding to $p_1 = p_2 = p_3 = 1/4 + C/6$ (again compatible with the conditions on the $p_j$). It follows that the maximum possible entropy is (i) the maximum of the entropies $H_1(C) = H((1-C)/4, (1-C)/4, (1+C)/4, (1+C)/4)$ and $H_2(C) = H(1/4 - C/2, 1/4 + C/6, 1/4 + C/6, 1/4 + C/6)$ for $C \leq 1/2$, and (ii) $H_1(C)$ for $1/2 < C \leq 1$. However, it is straightforward to show that $H_1(C) \geq H_2(C)$ over their overlapping range. Hence the maximum possible entropy is always $H_1(C)$ for the case $\alpha = 1$.

For the case $\alpha = -1$, the conditions on $t_j$ require that $p_1 \geq p_2 \geq p_3$, $p_0 + p_j \geq \frac{1}{2}$ for $j = 1, 2, 3$, and $p_0 \geq 1/4$, while $C = \max\{2p_0 - \frac{1}{2},\; 2p_0 + 2p_1 - 1\}$. Carrying out a similar analysis to the above, one finds that the maximum possible entropy is (i) the maximum of the entropies $H_1(C)$ and $H_3(C) = H(1/4 + C/2, 1/4 - C/6, 1/4 - C/6, 1/4 - C/6)$ for $C \leq 1$, and (ii) $H_3(C)$ for $1 < C \leq 3/2$. Further, $H_3(C) > H_1(C)$ for $C > C_0 \approx 0.7265$, and $H_3(C) \leq H_1(C)$ otherwise. Hence, from Eq. (19) one has the tight lower bound

$$ I(\tilde\rho_{AB}) \geq \begin{cases} \log 4 - H_1(C), & C \leq C_0, \\ \log 4 - H_3(C), & C > C_0. \end{cases} \tag{22} $$

Since $H_1(C) = \log 2 + H((1-C)/2, (1+C)/2)$, Eq. (5) follows for $\tilde\rho_{AB}$ in Eq. (18), and hence for all qubit states with maximally-mixed reduced density operators, as claimed.

The states saturating the lower bound in Eqs. (5) and (22) are easily constructed from the above derivation. In particular, they are given by

$$ \rho(C) := \begin{cases} \frac{1}{4}\big[ I\otimes I + C\,\sigma_1\otimes\sigma_1 \big], & C \leq C_0, \\[4pt] \frac{1}{4}\big[ I\otimes I - (2C/3)\sum_j \sigma_j\otimes\sigma_j \big], & C > C_0, \end{cases} \tag{23} $$

and any local unitary transformations thereof, where the quantum correlation distance of $\rho(C)$ is $C$ by construction.

Note that $\rho(C)$ is unentangled for $C \leq C_0$ (it can be written as a mixture of $(1/4)\,I\otimes I$, $|+\rangle\langle+|\otimes|+\rangle\langle+|$ and $|-\rangle\langle-|\otimes|-\rangle\langle-|$, where $\sigma_1|\pm\rangle = \pm|\pm\rangle$). Conversely, $\rho(C)$ is an entangled Werner state for $C > C_0$ (with singlet state weighting $p = 2C/3 > 1/3$). Hence the lower bound can only be achieved by entangled states for $C > C_0$, and cannot be achieved by any two-valued classical random variables.

It is conjectured that Eq. (5) is in fact a tight lower bound for any two-qubit state. This conjecture would follow immediately if it could be shown that

$$ I(\rho_{AB}) \geq I(\rho'_{AB}) \tag{24} $$

for arbitrary $\rho_{AB}$, where $\rho'_{AB} := \rho_{AB} - \rho_A\otimes\rho_B + (1/4)\,I\otimes I$. This is because $\rho'_{AB}$ is of the form of $\tilde\rho_{AB}$ in Eq. (18), and hence $I(\rho'_{AB})$ satisfies Eq. (22).

Partial support for Eq. (24), and hence for the conjecture, is given by noting that any $\rho_{AB}$ and corresponding $\rho'_{AB}$ can be brought to the respective forms

$$ \rho_{AB} = \rho_A\otimes\rho_B + \frac{1}{4}\sum_j r_j\,\sigma_j\otimes\sigma_j, \qquad \rho'_{AB} = \frac{1}{4}\Big[ I\otimes I + \sum_j r_j\,\sigma_j\otimes\sigma_j \Big] $$

via suitable local unitary transformations, similarly to the argument in section 4.2.
Defining the function $F(r_1, r_2, r_3) := I(\rho_{AB}) - I(\rho'_{AB})$, it is straightforward to show that $F = 0$ and $\partial F/\partial r_j = 0$ for $r_1 = r_2 = r_3 = 0$, consistent with $F \geq 0$. […] $\partial F/\partial r_j$ does not vanish for other physically possible values of the $r_j$ (other than for the trivially saturating case $\rho_A = \rho_B = (1/2)I$).

The above conjecture is further supported by the generalisation of Eq. (5) in the following section.

$\rho_A$ or $\rho_B$

It is straightforward to show that the lower bound on quantum mutual information is tight for $C \geq C_0$ when just one of the reduced density operators is maximally mixed, i.e., if $\rho_A$ or $\rho_B$ is equal to $(1/2)I$.

First, since $(1/2)I$ is invariant under unitary transformations, the same argument as in section 4.2 implies the state can always be transformed by local unitary transformations to the generalised form

$$ \tilde\rho_{AB} = \tilde\rho_A\otimes\tilde\rho_B + \frac{\alpha}{4}\sum_j t_j\,\sigma_j\otimes\sigma_j $$

of Eq. (18), where either $\tilde\rho_A$ or $\tilde\rho_B$ equals $(1/2)I$ and $\alpha = \pm 1$.

Next, let $\mathcal{T}$ denote the ‘twirling’ operation, corresponding to applying a random unitary transformation of the form $U\otimes U$ [15]. It is easy to check that by definition $\mathcal{T}(I\otimes I) = I\otimes I$, $\mathcal{T}(I\otimes\sigma_j) = 0 = \mathcal{T}(\sigma_j\otimes I)$, and $\mathcal{T}(\sigma_j\otimes\sigma_j) = \mathcal{T}(\sigma_k\otimes\sigma_k)$, for any $j$ and $k$. Since Werner states are invariant under twirling [13, 15], it follows that $\mathcal{T}(\sigma_j\otimes\sigma_j) = (1/3)\sum_k \sigma_k\otimes\sigma_k$. Using these properties, one finds that $\mathcal{T}(\tilde\rho_A\otimes\tilde\rho_B) = (1/4)\,I\otimes I$ if one of $\tilde\rho_A$ or $\tilde\rho_B$ is maximally mixed, and hence that

$$ \mathcal{T}(\tilde\rho_{AB}) = \frac{1}{4}\Big[ I\otimes I + \alpha\bar{t}\sum_j \sigma_j\otimes\sigma_j \Big] = \rho(-3\alpha\bar{t}/2), $$

where $\bar{t} := (t_1 + t_2 + t_3)/3$, and the second equality holds for $\alpha = -1$ and $3\bar{t}/2 \geq C_0$ (but not otherwise), with $\rho(C)$ defined as per Eq. (23). Further, from Eq. (16) one has

$$ C(\mathcal{T}(\tilde\rho_{AB})) = C = \frac{1}{2}\max\{3\bar{t},\; 2\bar{t}\} = 3\bar{t}/2. $$

Recalling that $\rho(C)$ saturates Eq. (22), an analysis similar to section 5.1 shows for $C \geq C_0$ that

$$ I(\mathcal{T}(\tilde\rho_{AB})) = \log 4 - H_3(-\alpha C) \geq \log 4 - H_3(C), $$

with equality for $\alpha = -1$. Finally, using $\mathcal{T}(\tilde\rho_A\otimes\tilde\rho_B) = (1/4)\,I\otimes I$, and the property that the relative entropy is non-increasing under the twirling operation, it follows that

$$ I(\tilde\rho_{AB}) = S(\tilde\rho_{AB}\|\tilde\rho_A\otimes\tilde\rho_B) \geq S(\mathcal{T}(\tilde\rho_{AB})\|\mathcal{T}(\tilde\rho_A\otimes\tilde\rho_B)) = I(\mathcal{T}(\tilde\rho_{AB})) \geq \log 4 - H_3(C) \tag{25} $$

for $C \geq C_0$. Since Werner states are invariant under twirling, this inequality is tight for $\alpha = -1$, being saturated by the choice $\tilde\rho_{AB} = \rho(C)$. Recalling that mutual information and correlation distance are invariant under local unitary operations, the inequality is therefore tight for any $\rho_{AB}$ for which one of $\rho_A$ and $\rho_B$ is maximally mixed, as claimed.

Classically-Correlated Quantum States
It is well known that a quantum system behaves classically if the state and the observables of interest all commute, i.e., if they can be simultaneously diagonalised in some basis. Hence, a joint state will behave classically if the relevant observables of each system commute with each other and the state. It is therefore natural to define $\rho_{AB}$ to be classically correlated if and only if it can be diagonalised in a joint basis [10], i.e., if and only if

$$ \rho_{AB} = \sum_{j,k} P(j,k)\,|j\rangle\langle j|\otimes|k\rangle\langle k| \tag{26} $$

for some distribution $P(j,k)$ and orthonormal basis set $\{|j\rangle\otimes|k\rangle\}$. Classical correlation is preserved by tensor products, and by mixtures of commuting states.

While, strictly speaking, a classically-correlated quantum state only behaves classically with respect to observables that are diagonal with respect to $|j\rangle\otimes|k\rangle$, such states also have a number of classical correlation properties with respect to general observables [10, 16], briefly noted here.

First, $\rho_{AB}$ above is separable by construction, and hence is unentangled. Second, since it is diagonal in the basis $\{|j\rangle\otimes|k\rangle\}$, the mutual information and correlation distance are easily calculated as

$$ I(\rho_{AB}) = I(P), \qquad C(\rho_{AB}) = C(P), \tag{27} $$

and hence can only take classical values.

Third, if $M$ and $N$ denote any observables for systems $A$ and $B$ respectively, then their joint statistics are given by

$$ P_{MN}(m,n) = \sum_{j,k} p(m|j)\,p(n|k)\,P(j,k) = \sum_{j,k} S_{m,n;j,k}\,P(j,k), $$

where $S_{m,n;j,k} = p(m|j)\,p(n|k)$ is a stochastic matrix with respect to its first and second pairs of indices. Similarly, one finds

$$ P_M(m)P_N(n) = \sum_{j,k} S_{m,n;j,k}\,P(j)P(k) $$

for the product of the marginals.
Since the classical relative entropy and variational distance cannot increase under the action of a stochastic matrix, it follows that one has the tight inequalities [10, 16]

$$ I(P_{MN}) \leq I(P) = I(\rho_{AB}), \qquad C(P_{MN}) \leq C(P) = C(\rho_{AB}), \tag{28} $$

with saturation for $M$ and $N$ diagonal in the bases $\{|j\rangle\}$ and $\{|k\rangle\}$ respectively. Maximising the first of these inequalities over $M$ or $N$ immediately implies that classically-correlated states have zero quantum discord.

Finally, for two-qubit systems, Eq. (26) implies that $\rho_{AB}$ is classically correlated if and only if it is equivalent under local unitary transformations to a state of the form

$$ \rho'_{AB} = \frac{1}{4}\big[ (1 + x\sigma_1)\otimes(1 + y\sigma_1) + r\,\sigma_1\otimes\sigma_1 \big], $$

where $x, y \in [-1, 1]$ and $r$ satisfies Eq. (6). Hence, the mutual information is bounded by the classical lower bound in Eq. (3), and $\rho(C)$ in Eq. (23) is classically correlated for $C \leq C_0$. It follows that the lower bound for quantum mutual information in Eq. (5) can be attained by classically-correlated states if $C \leq C_0$. Conversely, the minimum possible bound cannot be reached by any classically-correlated two-qubit state if $C > C_0$.

Lower bounds for mutual information have been obtained that are stronger than those obtainable from general bounds for relative entropy and variational distance. Unlike the Pinsker inequality in Eq. (2), the quantum form of these bounds is not a simple generalisation of the classical form.

Similarly to the case of upper bounds for (classical) mutual information [8], the tight lower bounds obtained here depend on the dimension of the systems. The results of this paper represent a preliminary investigation largely confined to two-valued classical variables and qubits. It would be of interest to generalise both the classical and quantum cases, and to further investigate connections between them.

Open questions include whether a quantum correlation distance greater than the corresponding maximum classical correlation distance is a signature of entanglement for higher-dimensional systems, and whether the related qubit entanglement criterion in Eq. (4) holds more generally. The conjecture in section 5.2, as to whether the quantum lower bound in Eq. (5) is valid for all two-qubit states, also remains to be settled. Finally, it would be of interest to generalise and to better understand the role of the transition from classically-correlated states to entangled states in saturating information bounds, in the light of Eq. (23) for qubits.

Acknowledgements: This research was supported by the ARC Centre of Excellence CE110001027.
References
1. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd edition; John Wiley & Sons: Hoboken, U.S.A., 2006; chap. 11.
2. Hayashi, M. Large deviation analysis for classical and quantum security via approximate smoothing. Eprint: arXiv:1202.0322v5 [quant-ph].
3. He, X.; Yener, A. Strong secrecy and reliable Byzantine detection in the presence of an untrusted relay. IEEE Trans. Inf. Theory, 177-192.
4. Fedotov, A.A.; Harremoës, P.; Topsøe, F. Refinements of Pinsker’s inequality. IEEE Trans. Inf. Theory, 1491-1498.
5. Hiai, F.; Ohya, M.; Tsukada, M. Sufficiency, KMS conditions and relative entropy in von Neumann algebras. Pacific J. Math., 99-109.
6. Rastegin, A.E. Fano type quantum inequalities in terms of q-entropies. Quantum Inf. Process., 1895-1910.
7. Brandão, F.G.S.L.; Harrow, A.W. Quantum de Finetti theorems under local measurements with applications. Eprint: arXiv:1210.6367v3 [quant-ph].
8. Zhang, Z. Estimating mutual information via Kolmogorov distance. IEEE Trans. Inf. Theory, 3280-3282.
9. Hall, M.J.W. Relaxed Bell inequalities and Kochen-Specker theorems. Phys. Rev. A.
10. […] Phys. Rev. Lett., 090502.
11. Clauser, J.F.; Horne, M.A.; Shimony, A.; Holt, R.A. Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett., 880-884.
12. Fano, U. Pairs of two-level systems. Rev. Mod. Phys., 855-874.
13. Werner, R.F. Quantum states with Einstein-Podolsky-Rosen correlations admitting a hidden-variable model. Phys. Rev. A, 4277-4281.
14. Zhang, C.J.; Zhang, Y.S.; Zhang, S.; Guo, G.C. Entanglement detection beyond the computable cross-norm or realignment criterion. Phys. Rev. A, 060301(R).
15. Bennett, C.H.; DiVincenzo, D.P. Mixed-state entanglement and quantum error correction. Phys. Rev. A, 3824-3851.
16. Wu, S.; Poulsen, U.V.; Mølmer, K. Correlations in local measurements on a quantum state, and complementarity as an explanation of nonclassicality. Phys. Rev. A, 032319.