Modulus of convexity for operator convex functions
Isaac H. Kim^{1,2}

^1 Perimeter Institute for Theoretical Physics, Waterloo ON N2L 2Y5, Canada
^2 Institute for Quantum Information and Matter, Pasadena CA 91125, USA

(Dated: October 30, 2018)

Given an operator convex function f(x), we obtain an operator-valued lower bound for cf(x) + (1-c)f(y) - f(cx + (1-c)y), c ∈ [0,1].

I. INTRODUCTION
Formally, a function f(x) is operator convex if it satisfies the following inequality for self-adjoint operators A and B:

    cf(A) + (1-c)f(B) - f(cA + (1-c)B) \ge 0, \quad c ∈ [0,1].   (1)

Most of the interesting examples deal with operators that are positive semi-definite. We shall follow the same convention in this paper. A short numerical check of Eq. (1) is given below.

Operator convex functions are known to satisfy a number of interesting properties. An important discovery was made by Hansen and Pedersen, who used Eq. (1) to obtain an operator generalization of the Jensen inequality [1]. Recently, Effros provided an elegant proof of the strong subadditivity of entropy (SSA) [2] by (i) defining an operator generalization of the perspective function and (ii) using the aforementioned operator generalization of the Jensen inequality. These results show that the fundamental inequality of quantum information theory, the strong subadditivity of entropy [3], can essentially be derived from the operator convexity of a certain matrix-valued function.

An important open question in quantum information theory concerns the structure of states that are approximately conditionally independent, i.e., the structure of states that have a small yet nonzero quantum conditional mutual information [15]. The motivation comes from the fact that states satisfying the equality condition of SSA form a quantum Markov chain [4]. One natural speculation along this line is that a quantum state with a small conditional mutual information is close to some quantum Markov chain state. While this intuition is correct for classical states, its obvious quantum generalization is known to be false [5].

There are several ways to circumvent this issue. The predominant approach in the literature is to replace the set of quantum Markov chain states with a larger set of states, namely the separable states. The first result in this direction was obtained by Brandão et al. [6]; their result was subsequently strengthened by Li and Winter [7].

However, there is another possibility that is not necessarily precluded by the counterexamples of Ibinson et al. [5]. The quantum Markov chain property derived in Ref. [4] is a consequence of Petz's theorem [8] and the Koashi-Imoto theorem [9]. Petz's theorem asserts that, if the relative entropy between two quantum states does not decrease under a quantum channel, there exists a canonical recovery operation that can perfectly reverse the action of the channel. The Koashi-Imoto theorem concerns the structure of states that are invariant under certain quantum channels. The quantum Markov chain property is derived in Ref. [4] by applying the Koashi-Imoto theorem to Petz's canonical recovery operation.

Therefore, one may consider an alternative possibility: there might exist a canonical recovery operation analogous to Petz's recovery channel, whose performance is determined by the conditional mutual information. Such a result need not contradict the counterexamples of Ibinson et al., since one cannot directly apply the Koashi-Imoto theorem to such channels when the recovery operation is not perfect. Since Petz's theorem is based on fundamental results about operator convex and operator monotone functions, a strengthening of these results may lead to new insights into the structure of states that have a small but nonzero amount of conditional mutual information.
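As a quick sanity check of Eq. (1), one can evaluate its left-hand side on random positive definite matrices. The snippet below is our own illustration, not part of the original paper; the helper names matfunc and rand_pd are ours and are reused by the later snippets. For f(x) = x^2, which is operator convex, the smallest eigenvalue stays nonnegative up to roundoff, while for f(x) = x^4, which is convex but not operator convex, a random search typically turns up violations.

    import numpy as np

    def matfunc(f, X):
        # Apply the scalar function f to the Hermitian matrix X
        # via its eigendecomposition.
        w, V = np.linalg.eigh(X)
        return (V * f(w)) @ V.conj().T

    def rand_pd(n, rng):
        # Random strictly positive definite matrix.
        X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        return X @ X.conj().T + 0.1 * np.eye(n)

    def modulus_of_convexity(f, A, B, c):
        # Left-hand side of Eq. (1).
        return c * matfunc(f, A) + (1 - c) * matfunc(f, B) \
            - matfunc(f, c * A + (1 - c) * B)

    rng = np.random.default_rng(0)
    for f, label in [(lambda x: x**2, "x^2 (operator convex)"),
                     (lambda x: x**4, "x^4 (convex, not operator convex)")]:
        worst = min(np.linalg.eigvalsh(
            modulus_of_convexity(f, rand_pd(3, rng), rand_pd(3, rng), 0.5)).min()
            for _ in range(1000))
        print(label, "-> most negative eigenvalue:", worst)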
Also, obtaining such a strengthening might be interesting in its own right; it might be useful in extending preexisting results that are based on the properties of operator convex functions.

Motivated by these observations, we obtain a strengthening of Eq. (1). More precisely, the right-hand side of Eq. (1) shall be replaced by an operator-valued function that is always nonnegative. This operator, up to a constant that depends on c, is the matrix Bregman divergence [10]. The matrix Bregman divergence is a natural matrix generalization of the classical Bregman divergence [11]. Given a convex function f(x) and two probability distributions p(x), q(x) over the same domain, the Bregman divergence can be defined as

    B_f(p‖q) := f(p) - f(q) - \lim_{c \to 0} \frac{f(q + (p-q)c) - f(q)}{c}.

Petz noted that a similar treatment can be carried out even when p and q are promoted to operators, provided that the function f is operator convex. The resulting matrix-valued divergence is the matrix Bregman divergence (a worked classical example is given at the end of this section). Interestingly, the strengthening is only applicable to operator convex functions; the inequality is false for a convex function that is not operator convex.

As an application of this result, we prove an inequality that extends Pinsker's inequality. Recall that Pinsker's inequality asserts that the relative entropy D(ρ‖σ) := Tr(ρ(log ρ - log σ)) between two normalized positive semidefinite operators ρ and σ is lower bounded by their trace distance:

    D(ρ‖σ) \ge \frac{1}{2} ‖ρ - σ‖_1^2.

A simple corollary of our main result is the following inequality:

    S(cρ + (1-c)σ) - cS(ρ) - (1-c)S(σ) \ge \frac{c(1-c)}{2} ‖ρ - σ‖_1^2, \quad c ∈ [0,1],

where the underlying Hilbert space is finite-dimensional.

The rest of the paper starts by describing the main result in Section II. We shall also show, by a simple argument, that the main result cannot be generalized to convex functions. Section III describes the key technical result of this paper, which is a strengthening of the well-known arithmetic-harmonic inequality. Using this strengthening, we prove the main result in Section IV.
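As promised, here is a standard worked example (ours, not the paper's) of the classical definition: for f(x) = x log x, summing the defining expression over the domain recovers the Kullback-Leibler divergence. Since \lim_{c \to 0} [f(q + (p-q)c) - f(q)]/c = (p-q)(\log q + 1) pointwise,

    B_f(p‖q) = \sum_x \left[ p\log p - q\log q - (p-q)(\log q + 1) \right]
             = \sum_x \left[ p\log\frac{p}{q} - p + q \right]
             = \sum_x p\log\frac{p}{q},

where the last equality uses the normalization \sum_x p = \sum_x q = 1.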
II. AN INEQUALITY BETWEEN THE MODULUS OF CONVEXITY AND BREGMAN DIVERGENCE FOR OPERATOR CONVEX FUNCTIONS

In order to describe the main result, we first fix the notation.
Definition 1.
The modulus of convexity of a function f(x) is

    C_c^f(A, B) := c f(A) + (1-c) f(B) - f(cA + (1-c)B), \quad c ∈ [0,1].

Note that the modulus of convexity of an operator convex function is always nonnegative, as long as A and B are self-adjoint operators whose spectra lie in the domain of f. The matrix Bregman divergence is nonnegative for the same reason. Following Petz [10], we define the matrix Bregman divergence as follows.

Definition 2.
The Bregman divergence D_f(A, B) is

    D_f(A, B) := f(A) - f(B) - \lim_{t \to 0} t^{-1} \big( f(B + t(A-B)) - f(B) \big).
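Definitions 1 and 2 are easy to explore numerically. The sketch below is our own illustration (the step size is an arbitrary choice; matfunc and rand_pd are from the snippet in Section I). It approximates the directional derivative in Definition 2 by a central difference and checks that D_f(A, B) is positive semidefinite for the operator convex function f(x) = x log x.

    import numpy as np

    def bregman(f, A, B, t=1e-4):
        # D_f(A,B) = f(A) - f(B) - (d/ds) f(B + s(A-B)) |_{s=0},
        # with the derivative approximated by a central difference.
        H = A - B
        deriv = (matfunc(f, B + t * H) - matfunc(f, B - t * H)) / (2 * t)
        return matfunc(f, A) - matfunc(f, B) - deriv

    rng = np.random.default_rng(1)
    f = lambda x: x * np.log(x)
    A, B = rand_pd(4, rng), rand_pd(4, rng)
    print("min eigenvalue of D_f(A, B):",
          np.linalg.eigvalsh(bregman(f, A, B)).min())  # nonnegative up to ~1e-8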
The main result asserts that the Bregman divergence provides an operator-valued lower bound for the modulus of convexity.

Theorem 1.
For A, B > 0, if f(x) on [0, ∞) is operator convex and 0 < c < 1, then

    C_c^f(A, B) \ge \begin{cases} \frac{c(1-c)}{(1-2c)^2}\, D_f\big(M(1-c), M(c)\big), & c \ne \frac{1}{2}, \\[4pt] \frac{1}{8}\, \frac{d^2}{dx^2} f\big(M(\tfrac{1}{2}+x)\big)\Big|_{x=0}, & c = \frac{1}{2}, \end{cases}   (2)

where M(c) := cA + (1-c)B.
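Theorem 1 is easy to test numerically. The sketch below is our own check (it reuses matfunc, rand_pd, and modulus_of_convexity from the Section I snippet and bregman from the snippet above) and verifies, for f(x) = x log x and c away from 1/2, that the left-hand side of Eq. (2) dominates the right-hand side as an operator.

    import numpy as np

    # Assumes matfunc, rand_pd, modulus_of_convexity, bregman are in scope.
    f = lambda x: x * np.log(x)
    rng = np.random.default_rng(2)
    worst = np.inf
    for _ in range(200):
        A, B = rand_pd(3, rng), rand_pd(3, rng)
        c = rng.uniform(0.05, 0.95)
        if abs(c - 0.5) < 0.05:
            continue  # the c = 1/2 branch would need a second difference instead
        M = lambda s: s * A + (1 - s) * B
        bound = c * (1 - c) / (1 - 2 * c)**2 * bregman(f, M(1 - c), M(c))
        gap = modulus_of_convexity(f, A, B, c) - bound
        worst = min(worst, np.linalg.eigvalsh(gap).min())
    print("most negative eigenvalue of C - bound:", worst)
    # Expected: nonnegative up to the finite-difference error, roughly -1e-7.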
A. Convex vs. operator convex functions

A natural question is whether Theorem 1 can be extended to operator convex functions of order n. We will give a simple argument that such an extension cannot exist for n = 1. Recall that operator convex functions of order 1 are precisely the ordinary convex functions.

One can easily check that the function g(x) = x^2 - (1+x) log(1+x) is convex for x > 0, although it is not operator convex. Suppose that Eq. (2) holds for f(x) = g(x). This implies that Eq. (2) holds for f(x) = -(1+x) log(1+x) as well, by the simple observation that Eq. (2) holds with equality if f(x) = x^2. Since f(x) = -(1+x) log(1+x) as well as f(x) = (1+x) log(1+x) would then satisfy Eq. (2), and since both sides of Eq. (2) are linear in f, one concludes that the inequality in Eq. (2) must be satisfied with equality for such a function. Clearly, this is not the case, and we arrive at a contradiction. Therefore, Eq. (2) cannot be extended to operator convex functions of order 1.
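The pivot of this argument is that Eq. (2) holds with equality for f(x) = x^2: a direct computation gives C_c^f(A,B) = c(1-c)(A-B)^2 and D_f(M(1-c), M(c)) = (1-2c)^2 (A-B)^2. The snippet below (our own check, reusing the helpers from the earlier snippets) confirms this numerically; note that the central difference inside bregman is exact for quadratics, so the deviation is pure roundoff.

    import numpy as np

    # Assumes matfunc, rand_pd, modulus_of_convexity, bregman are in scope.
    f = lambda x: x**2
    rng = np.random.default_rng(3)
    A, B = rand_pd(3, rng), rand_pd(3, rng)
    c = 0.3
    M = lambda s: s * A + (1 - s) * B
    lhs = modulus_of_convexity(f, A, B, c)
    rhs = c * (1 - c) / (1 - 2 * c)**2 * bregman(f, M(1 - c), M(c))
    print("max deviation from equality:", np.abs(lhs - rhs).max())  # ~1e-11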
III. STRENGTHENING OF THE ARITHMETIC-HARMONIC INEQUALITY

In this section, we prove a strengthening of the well-known arithmetic-harmonic (AH) inequality. The AH inequality states that

    \frac{1}{2}\left( \frac{1}{A} + \frac{1}{B} \right) \ge \frac{2}{A+B}   (3)

for positive definite matrices A and B. It is well known that any operator convex function admits a unique integral representation in terms of resolvents, to which Eq. (3) can be applied. For example, the following theorem was recently proved by Hiai et al. [12].

Theorem 2. [12] A continuous real function f on [0, ∞) is operator convex if and only if there exist a real number a, a nonnegative number b, and a nonnegative measure µ on [0, ∞), satisfying

    \int_0^\infty \frac{1}{(1+λ)^2} \, dµ(λ) < \infty,

such that

    f(x) = f(0) + ax + bx^2 + \int_0^\infty \left( \frac{x}{1+λ} - 1 + \frac{λ}{x+λ} \right) dµ(λ), \quad x ∈ [0, ∞).   (4)

Moreover, the numbers a, b and the measure µ are uniquely determined by f.

The existence of this canonical form for operator convex functions is the main motivation behind the strengthening of the AH inequality.
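As a concrete illustration (ours, and easy to verify by direct integration under the representation as stated in Eq. (4)), f(x) = x log x requires no linear or quadratic part at all:

    \int_0^\infty \left( \frac{x}{1+λ} - 1 + \frac{λ}{x+λ} \right) dλ
      = \int_0^\infty \left( \frac{x}{1+λ} - \frac{x}{x+λ} \right) dλ
      = \lim_{L \to \infty} x \log \frac{(1+L)\,x}{x+L} = x \log x,

so f(x) = x log x corresponds to f(0) = 0, a = 0, b = 0, and dµ(λ) = dλ, which indeed satisfies \int_0^\infty (1+λ)^{-2} dλ = 1 < \infty.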
Our key lemma is the following.

Lemma 1.
For A, B > 0,

    \frac{1}{2}\left( \frac{1}{A} + \frac{1}{B} \right) - \frac{2}{A+B} \;\ge\; \frac{2}{A+B}\,(A-B)\,\frac{1}{A+B}\,(A-B)\,\frac{1}{A+B}.   (5)

Proof.
Define C = A^{-1/2} B A^{-1/2}. Multiplying both sides of Eq. (5) by A^{1/2} from the left and from the right, the left-hand side can be expressed as

    \frac{1}{2}\left(1 + C^{-1}\right) - \frac{2}{1+C} = \frac{(1-C)^2}{2C(1+C)},

while the right-hand side can be expressed as

    \frac{2}{1+C}(1-C)\frac{1}{1+C}(1-C)\frac{1}{1+C} = \frac{2(1-C)^2}{(1+C)^3}.

Since every operator appearing here is a function of C, they all commute, and the claim reduces to a scalar inequality. Using the fact that (1+C)^2 \ge 4C, one can establish the inequality.
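Lemma 1 can likewise be checked numerically. The sketch below is our own (rand_pd is from the Section I snippet):

    import numpy as np

    rng = np.random.default_rng(4)
    worst = np.inf
    for _ in range(500):
        A, B = rand_pd(4, rng), rand_pd(4, rng)
        iAB = np.linalg.inv(A + B)
        lhs = 0.5 * (np.linalg.inv(A) + np.linalg.inv(B)) - 2 * iAB
        rhs = 2 * iAB @ (A - B) @ iAB @ (A - B) @ iAB
        worst = min(worst, np.linalg.eigvalsh(lhs - rhs).min())
    print("most negative eigenvalue of LHS - RHS:", worst)  # >= -1e-10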
IV. PROOF OF THE MAIN RESULT

A well-known approach for proving the operator Jensen inequality involves (i) proving the inequality at the midpoint and (ii) making a judicious choice of matrices in an enlarged Hilbert space [1]. We shall follow a similar approach.

Under elementary manipulations, one can show that Theorem 1 for a general operator convex function follows by proving it for a special family of functions, namely f_λ(x) = x^2/(λ+x): both C_c^f(A,B) and D_f(A,B) are unchanged when an affine function is added to f, and, up to an affine function, the integrand of Eq. (4) is a positive multiple of f_λ. Without loss of generality, we shall prove Theorem 1 for f(x) = x^2/(λ+x); up to affine terms, this function equals λ^2/(λ+x), so only resolvents appear below. The cases of the linear and quadratic terms are trivial, so we omit the proof for them.

First, we consider the c = 1/2 case of Theorem 1. Writing the resolvent part of Eq. (4) as an integral against λ dµ(λ) and applying Lemma 1 with A → A + λ and B → B + λ,

    \frac{f(A)+f(B)}{2} - f\left(\frac{A+B}{2}\right)
      = \int_0^\infty λ \left( \frac{1}{2}\left(\frac{1}{A+λ} + \frac{1}{B+λ}\right) - \frac{2}{A+B+2λ} \right) dµ(λ)
      \ge \int_0^\infty λ\, \frac{2}{A+B+2λ}(A-B)\frac{1}{A+B+2λ}(A-B)\frac{1}{A+B+2λ}\, dµ(λ)
      = \frac{1}{4} \int_0^\infty λ\, \frac{1}{\frac{A+B}{2}+λ}(A-B)\frac{1}{\frac{A+B}{2}+λ}(A-B)\frac{1}{\frac{A+B}{2}+λ}\, dµ(λ)
      = \frac{1}{8} \int_0^\infty λ\, \frac{d^2}{dx^2}\, \frac{1}{\frac{A+B}{2}+x(A-B)+λ}\Bigg|_{x=0} dµ(λ)
      = \frac{1}{8}\, \frac{d^2}{dx^2} f\left(\frac{A+B}{2}+x(A-B)\right)\Bigg|_{x=0}.   (6)

Away from the midpoint, we use the following choice of operators [13]:

    W = \begin{pmatrix} \sqrt{c}\, I & -\sqrt{1-c}\, I \\ \sqrt{1-c}\, I & \sqrt{c}\, I \end{pmatrix}, \qquad T = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}.

Each of the entries in these matrices corresponds to a block of square matrices of the same dimension. Setting T_1 = W T W^\dagger and T_2 = W^\dagger T W, and applying the midpoint convexity result to T_1 and T_2, one can obtain the desired result. More precisely, the convexity at the midpoint is the following:

    \frac{f(T_1)+f(T_2)}{2} - f\left(\frac{T_1+T_2}{2}\right)
      = \begin{pmatrix} cf(A)+(1-c)f(B) & 0 \\ 0 & cf(B)+(1-c)f(A) \end{pmatrix} - f\begin{pmatrix} cA+(1-c)B & 0 \\ 0 & cB+(1-c)A \end{pmatrix}.

One can also easily check the following facts:

    \frac{T_1+T_2}{2} = \begin{pmatrix} cA+(1-c)B & 0 \\ 0 & cB+(1-c)A \end{pmatrix}, \qquad
    \frac{T_1-T_2}{2} = \begin{pmatrix} 0 & \sqrt{c(1-c)}(A-B) \\ \sqrt{c(1-c)}(A-B) & 0 \end{pmatrix}.

By taking one of the diagonal blocks,

    C_c^f(A,B) \ge c(1-c) \int_0^\infty λ\, \frac{1}{M(c)+λ}(A-B)\frac{1}{M(1-c)+λ}(A-B)\frac{1}{M(c)+λ}\, dµ(λ)
      = \frac{c(1-c)}{(2c-1)^2} \int_0^\infty λ\, \frac{1}{M(c)+λ}\big(M(c)-M(1-c)\big)\frac{1}{M(1-c)+λ}\big(M(c)-M(1-c)\big)\frac{1}{M(c)+λ}\, dµ(λ).   (7)

One can check that, up to the trivial quadratic contribution,

    D_f(A,B) = \int_0^\infty λ\, \frac{1}{B+λ}(A-B)\frac{1}{A+λ}(A-B)\frac{1}{B+λ}\, dµ(λ),   (8)

completing the proof.
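The block-matrix bookkeeping above is easy to verify numerically. The following sketch (ours; rand_pd is from the Section I snippet) confirms that W is unitary and that (T_1+T_2)/2 and (T_1-T_2)/2 have the stated block forms.

    import numpy as np

    rng = np.random.default_rng(5)
    n, c = 3, 0.3
    A, B = rand_pd(n, rng), rand_pd(n, rng)
    I, Z = np.eye(n), np.zeros((n, n))
    W = np.block([[np.sqrt(c) * I, -np.sqrt(1 - c) * I],
                  [np.sqrt(1 - c) * I, np.sqrt(c) * I]])
    T = np.block([[A, Z], [Z, B]])
    T1, T2 = W @ T @ W.conj().T, W.conj().T @ T @ W
    mean_expected = np.block([[c * A + (1 - c) * B, Z],
                              [Z, c * B + (1 - c) * A]])
    half_diff_expected = np.block([[Z, np.sqrt(c * (1 - c)) * (A - B)],
                                   [np.sqrt(c * (1 - c)) * (A - B), Z]])
    print(np.allclose(W @ W.conj().T, np.eye(2 * n)),
          np.allclose((T1 + T2) / 2, mean_expected),
          np.allclose((T1 - T2) / 2, half_diff_expected))  # True True True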
V. APPLICATION TO THE VON NEUMANN ENTROPY

Now we discuss an application of Theorem 1.
Corollary 1.
For density matrices ρ, σ on a finite-dimensional Hilbert space,

    S(cρ + (1-c)σ) - cS(ρ) - (1-c)S(σ) \ge \frac{c(1-c)}{2} ‖ρ - σ‖_1^2.   (9)

Proof.
For f(x) = x log x, Petz showed that

    Tr(D_f(A, B)) = D(A‖B),   (10)

where D(A‖B) = Tr(A(log A - log B)) is the relative entropy between A and B [10]. Hence, the following inequality immediately follows from Theorem 1 and the fact that S(ρ) = -Tr(ρ log ρ):

    S(cρ + (1-c)σ) - cS(ρ) - (1-c)S(σ) \ge \frac{c(1-c)}{(1-2c)^2}\, D\big(cσ + (1-c)ρ \,\big\|\, cρ + (1-c)σ\big)   (11)

for c ≠ 1/2. Applying Pinsker's inequality, together with ‖cσ + (1-c)ρ - (cρ + (1-c)σ)‖_1 = |1-2c|\, ‖ρ - σ‖_1, we obtain

    S(cρ + (1-c)σ) - cS(ρ) - (1-c)S(σ) \ge \frac{c(1-c)}{2} ‖ρ - σ‖_1^2.   (12)

Eq. (12) should be true for c = 1/2 as well by a continuity argument, which is discussed below.

Recall that Fannes' inequality asserts that

    |S(ρ) - S(σ)| \le ε \log d - ε \log ε,   (13)

where ε = ‖ρ - σ‖_1 and d is the dimension of the Hilbert space [14]. Define S(c) as

    S(c) := S(cρ + (1-c)σ) - cS(ρ) - (1-c)S(σ).

Using Fannes' inequality,

    |S(1/2) - S(1/2 + δ)| \le εδ (\log d - \log(εδ)) + 2δ \log d.   (14)

Denoting the right-hand side of Eq. (14) as ∆(δ, ε, d), we have

    S(1/2) \ge S(1/2 + δ) - ∆(δ, ε, d) \ge \frac{1}{8} ‖ρ - σ‖_1^2 - \frac{δ^2}{2} ‖ρ - σ‖_1^2 - ∆(δ, ε, d).

Taking the δ → 0 limit,

    S(1/2) \ge \frac{1}{8} ‖ρ - σ‖_1^2,

which is the c = 1/2 case of Eq. (9).
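Both Eq. (10) and Corollary 1 are straightforward to check numerically. The sketch below is ours (it reuses matfunc, rand_pd, and bregman from the earlier snippets); it draws random density matrices and verifies the trace identity and the entropy inequality.

    import numpy as np

    # Assumes matfunc, rand_pd, bregman are in scope.
    def rand_density(n, rng):
        X = rand_pd(n, rng)
        return X / np.trace(X).real

    def entropy(rho):
        w = np.linalg.eigvalsh(rho)
        return float(-(w * np.log(w)).sum())

    rng = np.random.default_rng(6)
    f = lambda x: x * np.log(x)
    rho, sigma = rand_density(4, rng), rand_density(4, rng)
    c = 0.3

    # Eq. (10): Tr D_f equals the relative entropy for density matrices.
    lhs10 = np.trace(bregman(f, rho, sigma)).real
    rhs10 = np.trace(rho @ (matfunc(np.log, rho) - matfunc(np.log, sigma))).real
    print("Eq. (10):", lhs10, "vs", rhs10)

    # Eq. (9): the entropy gap dominates c(1-c)/2 times the squared trace distance.
    gap = entropy(c * rho + (1 - c) * sigma) - c * entropy(rho) - (1 - c) * entropy(sigma)
    td = np.abs(np.linalg.eigvalsh(rho - sigma)).sum()  # trace norm of a Hermitian matrix
    print("Eq. (9):", gap, ">=", c * (1 - c) / 2 * td**2)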
We have obtained a lower bound on the modulus of convexity of operator convex functions, which can be expressed in terms of the matrix Bregman divergence. We also gave a simple argument showing that the inequality cannot be extended to general convex functions. For the operator convex function f(x) = x log x, the trace of the matrix Bregman divergence reduces to the quantum relative entropy; in this case, the inequality reduces to the strict concavity of the von Neumann entropy. It would be interesting to find further applications of this inequality. Another important question is to find a strengthening of the operator Jensen inequality. Since many of the nontrivial results in quantum information theory can be essentially derived from the operator Jensen inequality, such a strengthening would undoubtedly be useful in many contexts.

Acknowledgments
I would like to thank Andreas Winter and Alexei Kitaev for many helpful discussions which motivated this work. I would also like to thank Jon Tyson, Mary Beth Ruskai, and Fernando Brandão for helpful discussions, and Lin Zhang for pointing out an error in the original manuscript. Lastly, I thank the anonymous referee who suggested investigating whether the main result holds for operator convex functions of finite order. This research was supported in part by NSF under Grant No. PHY-0803371, by ARO Grant No. W911NF-09-1-0442, and by DOE Grant No. DE-FG03-92-ER40701. Research at Perimeter Institute is supported by the Government of Canada through Industry Canada and by the Province of Ontario through the Ministry of Economic Development and Innovation.

[1] F. Hansen and G. K. Pedersen, Math. Ann. 258, 229 (1982).
[2] E. G. Effros, Proc. Natl. Acad. Sci. USA 106, 1006 (2009).
[3] E. H. Lieb and M. B. Ruskai, J. Math. Phys. 14, 1938 (1973).
[4] P. Hayden, R. Jozsa, D. Petz, and A. Winter, Commun. Math. Phys. 246, 359 (2004).
[5] B. Ibinson, N. Linden, and A. Winter, Commun. Math. Phys. 277, 289 (2008), quant-ph/0611057.
[6] F. G. S. L. Brandão, M. Christandl, and J. Yard, Commun. Math. Phys. 306, 805 (2011), 1010.1750.
[7] K. Li and A. Winter (2012), 1210.3181.
[8] D. Petz, Quart. J. Math. Oxford 39, 97 (1988).
[9] M. Koashi and N. Imoto, Phys. Rev. A 66, 022318 (2002).
[10] D. Petz, Acta Math. Hungar. 116, 127 (2007).
[11] L. M. Bregman, USSR Comput. Math. Math. Phys. 7, 200 (1967).
[12] F. Hiai, M. Mosonyi, D. Petz, and C. Bény, Rev. Math. Phys. 23, 691 (2011), 1008.2529.
[13] R. Bhatia, Matrix Analysis, Graduate Texts in Mathematics (Springer, 1997), ISBN 9783540948469.
[14] M. Fannes, Commun. Math. Phys. 31, 291 (1973).