Bounds for the Wasserstein mean with applications to the Lie-Trotter mean

Abstract

As the least squares mean for the Riemannian trace metric on the cone of positive definite matrices, the Riemannian mean has been widely studied from both computational and theoretical viewpoints. Recently, a new metric and its least squares mean on the cone of positive definite matrices, called the Wasserstein metric and the Wasserstein mean, respectively, have been introduced. In this paper, we explore some properties of the Wasserstein mean, such as a determinantal inequality, and find bounds for the Wasserstein mean. Using these bounds, we verify that the Wasserstein mean is a multivariate Lie-Trotter mean.


arXiv preprint [math.FA]

JINMI HWANG AND SEJONG KIM


Keywords: Wasserstein mean, Riemannian mean, Lie-Trotter mean

1. Introduction

The open convex cone $\mathbb{P}_m$ of all $m \times m$ positive definite Hermitian matrices, with the inner product $\langle X, Y \rangle_A = \operatorname{tr}(A^{-1} X A^{-1} Y)$ on the tangent space $T_A(\mathbb{P}_m)$ at each point $A \in \mathbb{P}_m$, carries a Riemannian structure. Indeed, $\mathbb{P}_m$ is a Cartan-Hadamard manifold, that is, a simply connected complete Riemannian manifold with non-positive sectional curvature, and also a Hadamard space. The Riemannian trace metric between $A$ and $B$ is given by $\delta(A, B) = \| \log A^{-1/2} B A^{-1/2} \|_2$, where $\| \cdot \|_2$ denotes the Frobenius norm. The natural and canonical mean on a Hadamard space is the least squares mean, called the Cartan mean or Riemannian mean. For a positive probability vector $\omega = (w_1, \dots, w_n)$, the Riemannian mean of $A_1, \dots, A_n \in \mathbb{P}_m$ is defined as
\[
\Lambda(\omega; A_1, \dots, A_n) = \underset{X \in \mathbb{P}_m}{\arg\min} \sum_{j=1}^{n} w_j \, \delta^2(X, A_j). \tag{1.1}
\]
The Riemannian mean has been widely studied from both computational and theoretical viewpoints. One of its important properties is the arithmetic-geometric-harmonic mean inequalities:
\[
\Big[ \sum_{j=1}^{n} w_j A_j^{-1} \Big]^{-1} \le \Lambda(\omega; A_1, \dots, A_n) \le \sum_{j=1}^{n} w_j A_j. \tag{1.2}
\]
Using (1.2) it has been verified in [7] that the Riemannian mean is a multivariate Lie-Trotter mean: for any differentiable curves $\gamma_1, \dots, \gamma_n$ on $\mathbb{P}_m$ with $\gamma_i(0) = I$ for all $i$,
\[
\lim_{s \to 0} \Lambda(\omega; \gamma_1(s), \gamma_2(s), \dots, \gamma_n(s))^{1/s} = \exp\Big[ \sum_{i=1}^{n} w_i \gamma_i'(0) \Big].
\]

Bhatia, Jain, and Lim [3] have recently introduced a new metric and its least squares mean on $\mathbb{P}_m$, called the Wasserstein metric and the Wasserstein mean, respectively. For given $A, B \in \mathbb{P}_m$, the Wasserstein metric $d(A, B)$ is given by
\[
d(A, B) = \Big[ \operatorname{tr}\Big( \frac{A + B}{2} \Big) - \operatorname{tr}\big( A^{1/2} B A^{1/2} \big)^{1/2} \Big]^{1/2}. \tag{1.3}
\]
In quantum information theory, the value $\operatorname{tr}(A^{1/2} B A^{1/2})^{1/2}$ is known as the fidelity, and the Wasserstein metric is known as the Bures distance of density matrices. The geodesic passing from $A$ to $B$ is given by
\[
\gamma(t) = (1-t)^2 A + t^2 B + t(1-t)\big[ (AB)^{1/2} + (BA)^{1/2} \big] = A \diamond_t B
\]
for $t \in [0, 1]$. The Wasserstein mean, denoted by $\Omega(\omega; A_1, \dots, A_n)$, is defined by
\[
\Omega(\omega; A_1, \dots, A_n) = \underset{X \in \mathbb{P}_m}{\arg\min} \sum_{j=1}^{n} w_j \, d^2(X, A_j), \tag{1.4}
\]
and it coincides with the unique solution $X \in \mathbb{P}_m$ of the equation
\[
X = \sum_{j=1}^{n} w_j \big( X^{1/2} A_j X^{1/2} \big)^{1/2}. \tag{1.5}
\]
Note that $\Omega(1-t, t; A, B) = A \diamond_t B$, and it has been shown that the Wasserstein mean satisfies the arithmetic-Wasserstein mean inequality. On the other hand, the Wasserstein mean satisfies neither monotonicity nor the Wasserstein-harmonic mean inequality: see [3, Section 5]. So it is a natural question whether the Wasserstein mean is a multivariate Lie-Trotter mean. The main goals of this paper are to provide some properties of the Wasserstein mean and to verify that the Wasserstein mean is a multivariate Lie-Trotter mean by finding a lower bound for the Wasserstein mean.

We recall in Section 2 the Wasserstein metric with its geodesic and estimate the Wasserstein distance between $A \diamond_t B$ and $A \diamond_t C$ for $A, B, C \in \mathbb{P}_m$ and $t \in [0, 1]$. [...] Finally, we show that the nonlinear equation characterizing the Wasserstein mean, for two positive definite operators $A$ and $B$ and the positive probability vector $\omega = (1-t, t)$, has the unique solution $X = A \diamond_t B$. This naturally gives an open problem: to extend the Wasserstein mean of positive definite matrices to positive invertible operators.

2. Wasserstein metric and geodesics
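As a small numerical illustration of the distance (1.3) and the geodesic $A \diamond_t B$, the following sketch works with $2 \times 2$ real symmetric positive definite matrices only, where a square root can be written in closed form via the Cayley-Hamilton theorem. The helper names (`sqrt2x2`, `wasserstein_distance`, `geodesic`) and the sample matrices are assumptions made for this illustration and do not come from the paper. It checks the constant-speed property $d(A, A \diamond_t B) = t \, d(A, B)$.

```python
import math

# 2x2 real matrices stored as nested tuples ((a, b), (c, d)).

def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def trace(X):
    return X[0][0] + X[1][1]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def sqrt2x2(M):
    """Principal square root of a 2x2 matrix with positive eigenvalues.

    By Cayley-Hamilton, (M + s I)^2 = (tr M + 2 s) M with s = sqrt(det M).
    """
    s = math.sqrt(det(M))
    tau = math.sqrt(trace(M) + 2.0 * s)
    return (
        ((M[0][0] + s) / tau, M[0][1] / tau),
        (M[1][0] / tau, (M[1][1] + s) / tau),
    )

def wasserstein_distance(A, B):
    """d(A, B) from (1.3); tr (A^{1/2} B A^{1/2})^{1/2} equals tr (AB)^{1/2}."""
    fidelity = trace(sqrt2x2(matmul(A, B)))
    return math.sqrt(max(0.0, 0.5 * (trace(A) + trace(B)) - fidelity))

def geodesic(A, B, t):
    """A <>_t B = (1-t)^2 A + t^2 B + t(1-t)[(AB)^{1/2} + (BA)^{1/2}]."""
    R = sqrt2x2(matmul(A, B))  # (AB)^{1/2}; its transpose is (BA)^{1/2}
    return tuple(
        tuple(
            (1 - t) ** 2 * A[i][j] + t ** 2 * B[i][j]
            + t * (1 - t) * (R[i][j] + R[j][i])
            for j in range(2)
        )
        for i in range(2)
    )

# Sample data (illustrative only).
A = ((2.0, 1.0), (1.0, 2.0))
B = ((3.0, -1.0), (-1.0, 1.0))
t = 0.3
G = geodesic(A, B, t)

print("d(A, B)            =", wasserstein_distance(A, B))
print("d(A, A <>_t B)     =", wasserstein_distance(A, G))
print("t * d(A, B)        =", t * wasserstein_distance(A, B))
```

The last two printed values should agree up to rounding, reflecting that $t \mapsto A \diamond_t B$ is a geodesic parametrized proportionally to arc length.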

Let $M_m$ be the set of all $m \times m$ matrices with complex entries. Let $\mathbb{H}_m$ be the real vector space of all $m \times m$ Hermitian matrices, and let $\mathbb{P}_m \subset \mathbb{H}_m$ be the open convex cone of all positive definite matrices. For $A, B \in \mathbb{H}_m$ we write $A \le B$ if and only if $B - A$ is positive semi-definite, and $A < B$ if and only if $B - A$ is positive definite.

The Frobenius norm $\| \cdot \|_2$ gives rise to the Riemannian structure on the open convex cone $\mathbb{P}_m$ with $\langle X, Y \rangle_A = \operatorname{Tr}(A^{-1} X A^{-1} Y)$, where $A \in \mathbb{P}_m$ and $X, Y \in T_A(\mathbb{P}_m) = \mathbb{H}_m$. Then $\mathbb{P}_m$ is a Cartan-Hadamard Riemannian manifold, a simply connected complete Riemannian manifold with non-positive sectional curvature (the canonical 2-tensor is non-negative). The Riemannian trace metric between $A$ and $B$ is given by
\[
\delta(A, B) = \| \log A^{-1/2} B A^{-1/2} \|_2,
\]
and the unique (up to parametrization) geodesic curve on $\mathbb{P}_m$ connecting $A$ to $B$ is
\[
[0, 1] \ni t \mapsto A \#_t B := A^{1/2} \big( A^{-1/2} B A^{-1/2} \big)^t A^{1/2},
\]
which is called the weighted geometric mean of $A$ and $B$. Note that $A \# B = A \#_{1/2} B$ is the unique midpoint of $A$ and $B$ with respect to the Riemannian trace metric, and is the unique solution $X \in \mathbb{P}_m$ of the Riccati equation $X A^{-1} X = B$. See [2] for more information.

Lemma 2.1.

Let $A, B, C, D \in \mathbb{P}_m$ and let $t \in [0, 1]$. Then the following are satisfied.

(1) $A \#_t B = A^{1-t} B^t$ if $A$ and $B$ commute.
(2) $(aA) \#_t (bB) = a^{1-t} b^t (A \#_t B)$ for any $a, b > 0$.
(3) $A \#_t B = B \#_{1-t} A$.
(4) $A \#_t B \le C \#_t D$ whenever $A \le C$ and $B \le D$.
(5) The map $[0, 1] \times \mathbb{P}_m \times \mathbb{P}_m \to \mathbb{P}_m$, $(t, A, B) \mapsto A \#_t B$ is continuous.
(6) $X (A \#_t B) X^* = (X A X^*) \#_t (X B X^*)$ for any nonsingular matrix $X$.
(7) $(A \#_t B)^{-1} = A^{-1} \#_t B^{-1}$.
(8) $[(1-\lambda) A + \lambda B] \#_t [(1-\lambda) C + \lambda D] \ge (1-\lambda)(A \#_t C) + \lambda (B \#_t D)$ for any $\lambda \in [0, 1]$.
(9) $\det(A \#_t B) = \det(A)^{1-t} \det(B)^t$.
(10) $[(1-t) A^{-1} + t B^{-1}]^{-1} \le A \#_t B \le (1-t) A + t B$.

Bhatia, Jain, and Lim [3] have introduced a new metric on $\mathbb{P}_m$, called the Wasserstein metric, and have established that it comes from a Riemannian metric and admits an explicit formula for the geodesic curve. For given $A, B \in \mathbb{P}_m$ the Wasserstein metric $d(A, B)$ is given by
\[
d(A, B) = \Big[ \operatorname{tr}\Big( \frac{A + B}{2} \Big) - \operatorname{tr}\big( A^{1/2} B A^{1/2} \big)^{1/2} \Big]^{1/2}.
\]
This metric has been of interest in quantum information, where it is called the Bures distance, and in statistics and the theory of optimal transport, where it is called the Wasserstein metric. It is the matrix version of the Hellinger distance between probability distributions: for probability vectors $p = (p_1, \dots, p_m)$ and $q = (q_1, \dots, q_m)$,
\[
d(p, q) = \Big[ \frac{1}{2} \sum_{i=1}^{m} \big( \sqrt{p_i} - \sqrt{q_i} \big)^2 \Big]^{1/2}.
\]
The Wasserstein metric is related to the solution of an extremal problem. Let $U_m$ be the compact set of all $m \times m$ unitary matrices. For given $A \in \mathbb{P}_m$ we define the set $\mathcal{F}(A)$ as
\[
\mathcal{F}(A) = \{ M \in M_m : A = M M^* \} = \{ A^{1/2} U : U \in U_m \}.
\]

Theorem 2.2. [3, Theorem 1]

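For any $A, B \in \mathbb{P}_m$,
\[
d(A, B) = \frac{1}{\sqrt{2}} \min_{M \in \mathcal{F}(A),\, N \in \mathcal{F}(B)} \| M - N \|_2 = \frac{1}{\sqrt{2}} \min_{U \in U_m} \| A^{1/2} - B^{1/2} U \|_2.
\]
The minimum in the second expression is attained at a unitary matrix $U$ occurring in the polar decomposition of $B^{1/2} A^{1/2}$:
\[
B^{1/2} A^{1/2} = U \, | B^{1/2} A^{1/2} | = U \big( A^{1/2} B A^{1/2} \big)^{1/2}.
\]

Remark 2.3.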

We check that $d(A, B)$ is indeed a metric on $\mathbb{P}_m$ by using Theorem 2.2.

(i) Obviously, $d(A, B) \ge 0$.

(ii) Assume that $A = B$. Then $U = I$ attains the minimum of $\| A^{1/2} - B^{1/2} U \|_2$ over $U \in U_m$, so the minimum value is $0 = d(A, B)$. Conversely, assume that $d(A, B) = 0$. Then $\| A^{1/2} - B^{1/2} U \|_2 = 0$ when $U = B^{1/2} A^{1/2} (A^{1/2} B A^{1/2})^{-1/2}$. So
\[
A^{1/2} = B^{1/2} U = B^{1/2} B^{1/2} A^{1/2} \big( A^{1/2} B A^{1/2} \big)^{-1/2} = B A^{1/2} \big( A^{-1/2} B^{-1} A^{-1/2} \big)^{1/2} A^{1/2} A^{-1/2} = B (A \# B^{-1}) A^{-1/2}.
\]
Set $X = A \# B^{-1}$. Then $X = B^{-1} A$. By the Riccati equation $X A^{-1} X = B^{-1}$, and so $B^{-2} A = B^{-1}$. Thus, $A = B$.

(iii) The Frobenius norm $\| \cdot \|_2$ is unitarily invariant: see [6, Chapter 5]. So
\[
\| A^{1/2} - B^{1/2} U \|_2 = \| A^{1/2} U^* - B^{1/2} \|_2 = \| B^{1/2} - A^{1/2} V \|_2,
\]
where $V = U^* \in U_m$. Hence, $d(A, B) = d(B, A)$.

(iv) Let $A, B, C \in \mathbb{P}_m$. By the triangle inequality for the Frobenius norm, for any $U, V \in U_m$,
\[
d(A, C) \le \frac{1}{\sqrt{2}} \| A^{1/2} - C^{1/2} U \|_2 \le \frac{1}{\sqrt{2}} \big( \| A^{1/2} - B^{1/2} V \|_2 + \| B^{1/2} V - C^{1/2} U \|_2 \big) = \frac{1}{\sqrt{2}} \big( \| A^{1/2} - B^{1/2} V \|_2 + \| B^{1/2} - C^{1/2} W \|_2 \big),
\]
where $W = U V^* \in U_m$. So taking the minimum over all $U, V \in U_m$, we see that $d(A, C) \le d(A, B) + d(B, C)$.

At this stage we recall a theorem from Riemannian geometry. Let $(\mathcal{M}, g)$ and $(\mathcal{N}, h)$ be Riemannian manifolds with Riemannian metrics $g$ and $h$. A differentiable map $\pi : \mathcal{M} \to \mathcal{N}$ is said to be a smooth submersion if its differential $D\pi(p) : T_p \mathcal{M} \to T_{\pi(p)} \mathcal{N}$ is surjective at every point $p \in \mathcal{M}$. Let $T_p \mathcal{M} = \mathcal{V}_p \oplus \mathcal{H}_p$ be a decomposition of the tangent space $T_p \mathcal{M}$, where $\mathcal{V}_p = \ker D\pi(p)$ and $\mathcal{H}_p = (\ker D\pi(p))^{\perp}$ are called the vertical and horizontal spaces at $p$. Then $\pi$ is called a Riemannian submersion if it is a smooth submersion and the map $D\pi(p) : \mathcal{H}_p \to T_{\pi(p)} \mathcal{N}$ is isometric for all $p \in \mathcal{M}$.

Theorem 2.4. [5]

Let $(\mathcal{M}, g)$ be a Riemannian manifold with Riemannian metric $g$. Let $G$ be a compact Lie group of isometries of $(\mathcal{M}, g)$ acting freely on $\mathcal{M}$. Let $\mathcal{N} = \mathcal{M}/G$, and let $\pi : \mathcal{M} \to \mathcal{N}$ be the quotient map. Then there exists a unique Riemannian metric $h$ on $\mathcal{N}$ for which $\pi : (\mathcal{M}, g) \to (\mathcal{N}, h)$ is a Riemannian submersion.

Note that the general linear group $GL_m$ is a Riemannian manifold with the metric induced by the Frobenius inner product. The group $U_m$ of unitary matrices is a compact Lie group of isometries for this metric. The quotient space $GL_m / U_m$ is $\mathbb{P}_m$, and the metric inherited by the quotient space $\mathbb{P}_m$ is (up to a constant factor)
\[
\min_{U \in U_m} \| A^{1/2} - B^{1/2} U \|_2 = \sqrt{2} \, d(A, B).
\]
The map $\pi : GL_m \to \mathbb{P}_m$, $\pi(M) = M M^*$, is a smooth submersion, and by Theorem 2.4 there is a unique Riemannian metric on $\mathbb{P}_m$, which is the Wasserstein metric $d$. From this point of view, the geodesic on $\mathbb{P}_m$ joining $A$ and $B$ has been derived in [3]. The straight line segment
\[
Z(t) = (1-t) A^{1/2} + t B^{1/2} U \quad \text{for } 0 \le t \le 1, \qquad U = B^{1/2} A^{1/2} \big( A^{1/2} B A^{1/2} \big)^{-1/2},
\]
is a geodesic in $GL_m$, and by Theorem 2.4 $\gamma(t) = \pi(Z(t)) = Z(t) Z(t)^*$ is a geodesic in $\mathbb{P}_m$:
\[
\gamma(t) = (1-t)^2 A + t^2 B + t(1-t)\big[ A (A^{-1} \# B) + (A^{-1} \# B) A \big] = (1-t)^2 A + t^2 B + t(1-t)\big[ (AB)^{1/2} + (BA)^{1/2} \big]. \tag{2.6}
\]
We denote $\gamma(t) =: A \diamond_t B$ for $t \in [0, 1]$. Since $\gamma(t)$ is the natural parametrization of the geodesic joining $A$ and $B$, it satisfies the affine property of parameters: for any $s, t, u \in [0, 1]$,
\[
(A \diamond_s B) \diamond_u (A \diamond_t B) = A \diamond_{(1-u)s + ut} B.
\]

Lemma 2.5.

For any $A, B, C \in \mathbb{P}_m$ and $t \in [0, 1]$,
\[
d(A \diamond_t B, A \diamond_t C) \le t \sqrt{\frac{\lambda_1}{2}} \, \| A^{-1} \# B - A^{-1} \# C \|_2,
\]
where $\lambda_1 := \lambda_1(A)$ is the largest eigenvalue of $A$.

Proof. Note that $A \diamond_t B = Z(t) Z(t)^*$, where $Z(t) = (1-t) A^{1/2} + t B^{1/2} U$ for $0 \le t \le 1$ and $U = B^{1/2} A^{1/2} (A^{1/2} B A^{1/2})^{-1/2}$. So $Z(t) = (1-t) A^{1/2} + t (A^{-1} \# B) A^{1/2} \in \mathcal{F}(A \diamond_t B)$, since
\[
U = B^{1/2} A^{1/2} \big( A^{-1/2} B^{-1} A^{-1/2} \big)^{1/2} = B^{1/2} (A \# B^{-1}) A^{-1/2},
\]
and so
\[
B^{1/2} U = B (A \# B^{-1}) A^{-1/2} = (A \# B^{-1})^{-1} A^{1/2} = (A^{-1} \# B) A^{1/2}
\]
by the Riccati equation and Lemma 2.1 (7). Similarly, $A \diamond_t C = Y(t) Y(t)^*$, where $Y(t) = (1-t) A^{1/2} + t (A^{-1} \# C) A^{1/2} \in \mathcal{F}(A \diamond_t C)$ for $0 \le t \le 1$. Therefore, from the first expression in Theorem 2.2,
\[
d(A \diamond_t B, A \diamond_t C) \le \frac{1}{\sqrt{2}} \| Z(t) - Y(t) \|_2 = \frac{t}{\sqrt{2}} \big\| (A^{-1} \# B) A^{1/2} - (A^{-1} \# C) A^{1/2} \big\|_2 \le \frac{t}{\sqrt{2}} \| A^{1/2} \| \cdot \| A^{-1} \# B - A^{-1} \# C \|_2 = t \sqrt{\frac{\lambda_1}{2}} \, \| A^{-1} \# B - A^{-1} \# C \|_2.
\]
The second inequality follows from the bound $\| X Y \|_2 \le \| X \|_2 \| Y \|$, where $\| Y \|$ denotes the operator norm (see Section 5.6 in [6]), and the last equality follows from the fact that $\| A^{1/2} \| = \lambda_1(A)^{1/2}$, where $\lambda_1(A) \ge \cdots \ge \lambda_m(A)$ are the positive eigenvalues of $A$ in decreasing order. $\square$

3. Wasserstein mean

Let $\mathbb{A} = (A_1, \dots, A_n) \in \mathbb{P}_m^n$, and let $\omega = (w_1, \dots, w_n) \in \Delta_n$, the simplex of all positive probability vectors in $\mathbb{R}^n$. We consider the following minimization problem:
\[
\underset{X \in \mathbb{P}_m}{\arg\min} \sum_{j=1}^{n} w_j \, d^2(X, A_j). \tag{3.7}
\]
By using tools from non-smooth analysis, convex duality, and optimal transport theory, it has been proved in Theorem 6.1 of [1] that the above minimization problem has a unique solution in $\mathbb{P}_m$. On the other hand, it has been shown in [3] that the objective function $f(X) = \sum_{j=1}^{n} w_j \, d^2(X, A_j)$ is strictly convex, by applying the strict concavity of the map $h : \mathbb{P}_m \to \mathbb{R}$, $h(X) = \operatorname{Tr}(X^{1/2})$. Therefore, we define the unique minimizer of (3.7) as the Wasserstein mean, denoted by $\Omega(\omega; \mathbb{A})$. That is,
\[
\Omega(\omega; \mathbb{A}) = \underset{X \in \mathbb{P}_m}{\arg\min} \sum_{j=1}^{n} w_j \, d^2(X, A_j). \tag{3.8}
\]
To find the unique minimum of the objective function $f : \mathbb{P}_m \to \mathbb{R}$, we evaluate the derivative $Df(X)$ and set it equal to zero. By using matrix differential calculus, we have the following.

Theorem 3.1. [3, Theorem 8]

The Wasserstein mean $\Omega(\omega; \mathbb{A})$ is the unique solution $X \in \mathbb{P}_m$ of the nonlinear matrix equation
\[
I = \sum_{j=1}^{n} w_j \big( A_j \# X^{-1} \big), \tag{3.9}
\]
equivalently,
\[
X = \sum_{j=1}^{n} w_j \big( X^{1/2} A_j X^{1/2} \big)^{1/2}.
\]

We now collect some properties of the Wasserstein mean. For given $\mathbb{A} = (A_1, \dots, A_n) \in \mathbb{P}_m^n$, any permutation $\sigma$ on $\{1, \dots, n\}$, and any $M \in GL_m$, we denote
\[
\mathbb{A}_\sigma = (A_{\sigma(1)}, \dots, A_{\sigma(n)}) \in \mathbb{P}_m^n, \qquad
M \mathbb{A} M^* = (M A_1 M^*, \dots, M A_n M^*) \in \mathbb{P}_m^n, \qquad
\mathbb{A}^k = (A_1, \dots, A_n, \dots, A_1, \dots, A_n) \in \mathbb{P}_m^{nk},
\]
where the number of blocks in the last expression is $k$. For given $\omega = (w_1, \dots, w_n) \in \Delta_n$, we also denote
\[
\omega_\sigma = (w_{\sigma(1)}, \dots, w_{\sigma(n)}) \in \Delta_n, \qquad
\omega^k = \frac{1}{k} (w_1, \dots, w_n, \dots, w_1, \dots, w_n) \in \Delta_{nk}.
\]

Proposition 3.2.

Let $\mathbb{A} = (A_1, \dots, A_n) \in \mathbb{P}_m^n$, and let $\omega = (w_1, \dots, w_n) \in \Delta_n$. Then the following are satisfied.

(1) (Homogeneity) $\Omega(\omega; \alpha \mathbb{A}) = \alpha \, \Omega(\omega; \mathbb{A})$ for any $\alpha > 0$.
(2) (Permutation invariancy) $\Omega(\omega_\sigma; \mathbb{A}_\sigma) = \Omega(\omega; \mathbb{A})$ for any permutation $\sigma$ on $\{1, \dots, n\}$.
(3) (Repetition invariancy) $\Omega(\omega^k; \mathbb{A}^k) = \Omega(\omega; \mathbb{A})$ for any $k \in \mathbb{N}$.
(4) (Unitary congruence invariancy) $\Omega(\omega; U \mathbb{A} U^*) = U \, \Omega(\omega; \mathbb{A}) \, U^*$ for any $U \in U_m$.

Proof. Items (2) and (3) follow from the definition (3.8) of the Wasserstein mean.

(1) Let $X = \Omega(\omega; \alpha \mathbb{A})$ for $\alpha > 0$. By Theorem 3.1 and Lemma 2.1 (2),
\[
I = \sum_{j=1}^{n} w_j \big( (\alpha A_j) \# X^{-1} \big) = \sum_{j=1}^{n} w_j \big( A_j \# (\alpha^{-1} X)^{-1} \big).
\]
By Theorem 3.1, $\alpha^{-1} X = \Omega(\omega; \mathbb{A})$, which implies the desired identity.

(4) Let $X = \Omega(\omega; U \mathbb{A} U^*)$ for $U \in U_m$. By Theorem 3.1, $I = \sum_{j=1}^{n} w_j \big( (U A_j U^*) \# X^{-1} \big)$. Taking the congruence transformation by $U^* \in U_m$ on both sides and applying Lemma 2.1 (6),
\[
I = \sum_{j=1}^{n} w_j \big( A_j \# (U^* X^{-1} U) \big) = \sum_{j=1}^{n} w_j \big( A_j \# (U^* X U)^{-1} \big).
\]
By Theorem 3.1, we obtain $U^* X U = \Omega(\omega; \mathbb{A})$, that is, $\Omega(\omega; U \mathbb{A} U^*) = U \, \Omega(\omega; \mathbb{A}) \, U^*$. $\square$

Remark 3.3.

Let
\[
A = [\,\cdots\,], \qquad B = [\,\cdots\,].
\]
One can easily see that $A, B$ are positive definite and $AB \ne BA$. The Wasserstein mean $\Omega\big(\tfrac{1}{2}, \tfrac{1}{2}; A, B\big) = A \diamond_{1/2} B$ and the Riemannian mean $\Lambda\big(\tfrac{1}{2}, \tfrac{1}{2}; A, B\big) = A \# B$ of the positive definite matrices $A$ and $B$ are, respectively,
\[
\Omega\Big(\frac{1}{2}, \frac{1}{2}; A, B\Big) = \frac{1}{4} [\,\cdots\,], \qquad \Lambda\Big(\frac{1}{2}, \frac{1}{2}; A, B\Big) = [\,\cdots\,].
\]
Then their determinants satisfy
\[
\det\Big[ \Omega\Big(\frac{1}{2}, \frac{1}{2}; A, B\Big) \Big] = 2.\ldots > \det\Big[ \Lambda\Big(\frac{1}{2}, \frac{1}{2}; A, B\Big) \Big].
\]
In general, $\det \Omega(\omega; \mathbb{A}) \ne \prod_{j=1}^{n} (\det A_j)^{w_j} = \det \Lambda(\omega; \mathbb{A})$. The following shows the inequality between the determinants of the Wasserstein mean and the Cartan mean.

It is known from Theorem 7.6.6 in [6] that the map $f : \mathbb{P}_m \to \mathbb{R}$, $f(A) = \log \det A$, is strictly concave: for any $A, B \in \mathbb{P}_m$ and $t \in [0, 1]$,
\[
\log \det \big( (1-t) A + t B \big) \ge (1-t) \log \det A + t \log \det B,
\]
where equality holds if and only if $A = B$. By induction together with this property, we have

Lemma 3.4.

Let $A_1, \dots, A_n \in \mathbb{P}_m$, and let $\omega = (w_1, \dots, w_n) \in \Delta_n$. Then
\[
\log \det \Big[ \sum_{j=1}^{n} w_j A_j \Big] \ge \sum_{j=1}^{n} w_j \log \det A_j,
\]
where equality holds if and only if $A_1 = \cdots = A_n$.

The following shows the determinantal inequality between the Wasserstein mean and the Riemannian mean.

Theorem 3.5.

Let $\mathbb{A} = (A_1, \dots, A_n) \in \mathbb{P}_m^n$, and let $\omega = (w_1, \dots, w_n) \in \Delta_n$. Then
\[
\det \Omega(\omega; \mathbb{A}) \ge \prod_{j=1}^{n} (\det A_j)^{w_j}, \tag{3.10}
\]
where equality holds if and only if $A_1 = \cdots = A_n$.

Proof. Let $X = \Omega(\omega; \mathbb{A})$. Then by Theorem 3.1, $I = \sum_{j=1}^{n} w_j (A_j \# X^{-1})$, and by Lemma 3.4,
\[
0 = \log \det \Big[ \sum_{j=1}^{n} w_j (A_j \# X^{-1}) \Big] \ge \sum_{j=1}^{n} w_j \log \det (A_j \# X^{-1}) = \frac{1}{2} \sum_{j=1}^{n} w_j \log \det A_j - \frac{1}{2} \log \det X.
\]
The last equality follows from Lemma 2.1 (9). It implies
\[
\log \det X \ge \sum_{j=1}^{n} w_j \log \det A_j = \log \Big[ \prod_{j=1}^{n} (\det A_j)^{w_j} \Big].
\]
Taking the exponential function on both sides and using the fact that the exponential function from $\mathbb{R}$ to $(0, \infty)$ is monotone increasing, we obtain the desired inequality.

Moreover, equality in (3.10) holds if and only if $A_i \# X^{-1} = A_j \# X^{-1}$ for all $i$ and $j$. By Lemma 2.1 (3), $X^{-1} \# A_i = X^{-1} \# A_j$, and by the definition of the geometric mean this is equivalent to $A_i = A_j$ for all $i$ and $j$. $\square$

4. Bounds for the Wasserstein mean

The Wasserstein mean satisﬁes the arithmetic-Wasserstein mean inequality.

Theorem 4.1. [3, Theorem 9]

Let $\mathbb{A} = (A_1, \dots, A_n) \in \mathbb{P}_m^n$ and let $\omega = (w_1, \dots, w_n) \in \Delta_n$. Then
\[
\Omega(\omega; \mathbb{A}) \le \sum_{j=1}^{n} w_j A_j.
\]

Proposition 4.2.

Let $\mathbb{A} = (A_1, \dots, A_n) \in \mathbb{P}_m^n$, and let $\omega = (w_1, \dots, w_n) \in \Delta_n$. Then for the operator norm $\| \cdot \|$,
\[
\| \Omega(\omega; \mathbb{A}) \| \le \Big[ \sum_{j=1}^{n} w_j \| A_j \|^{1/2} \Big]^2.
\]

Proof.

Let $X = \Omega(\omega; \mathbb{A})$. Then by Theorem 3.1, by the triangle inequality for the operator norm, by the fact that $\| A^t \| = \| A \|^t$ for any $A \in \mathbb{P}_m$ and $t \ge 0$, and by the sub-multiplicativity of the operator norm in [6, Section 5.6],
\[
\| X \| = \Big\| \sum_{j=1}^{n} w_j \big( X^{1/2} A_j X^{1/2} \big)^{1/2} \Big\| \le \sum_{j=1}^{n} \Big\| w_j \big( X^{1/2} A_j X^{1/2} \big)^{1/2} \Big\| = \sum_{j=1}^{n} w_j \big\| X^{1/2} A_j X^{1/2} \big\|^{1/2} \le \sum_{j=1}^{n} w_j \| X \|^{1/2} \| A_j \|^{1/2}.
\]
Dividing both sides by $\| X \|^{1/2}$ and squaring, we obtain
\[
\| \Omega(\omega; \mathbb{A}) \| = \| X \| \le \Big[ \sum_{j=1}^{n} w_j \| A_j \|^{1/2} \Big]^2. \qquad \square
\]

Remark 4.3.

By Theorem 4.1 we have
\[
\| \Omega(\omega; \mathbb{A}) \| \le \Big\| \sum_{j=1}^{n} w_j A_j \Big\| \le \sum_{j=1}^{n} w_j \| A_j \|.
\]
Since the square map $\mathbb{R} \ni t \mapsto t^2 \in [0, \infty)$ is convex,
\[
\Big[ \sum_{j=1}^{n} w_j \| A_j \|^{1/2} \Big]^2 \le \sum_{j=1}^{n} w_j \| A_j \|.
\]
Thus, Proposition 4.2 gives a sharper upper bound for the operator norm of the Wasserstein mean than the bound $\sum_{j=1}^{n} w_j \| A_j \|$ obtained from Theorem 4.1.

Unfortunately, the Wasserstein mean does not satisfy the Wasserstein-harmonic mean inequality: see Section 5 in [3]. However, we give a different lower bound for the Wasserstein mean.

Theorem 4.4.

Let $\omega = (w_1, \dots, w_n) \in \Delta_n$ and $\mathbb{A} = (A_1, \dots, A_n) \in \mathbb{P}_m^n$. Then
\[
\Omega(\omega; \mathbb{A}) \ge 2I - \sum_{j=1}^{n} w_j A_j^{-1}.
\]

Proof.

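Let $\Omega = \Omega(\omega; \mathbb{A})$. By Theorem 3.1 and the geometric-harmonic mean inequality in Lemma 2.1 (10),
\[
I = \sum_{j=1}^{n} w_j \big( A_j \# \Omega^{-1} \big) \ge \sum_{j=1}^{n} w_j \Big( \frac{A_j^{-1} + \Omega}{2} \Big)^{-1}.
\]
Taking inverses on both sides and applying the convexity of the inversion map in (1.33) of [2] yield
\[
I \le \Big[ \sum_{j=1}^{n} w_j \Big( \frac{A_j^{-1} + \Omega}{2} \Big)^{-1} \Big]^{-1} \le \sum_{j=1}^{n} w_j \Big( \frac{A_j^{-1} + \Omega}{2} \Big) = \frac{1}{2} \sum_{j=1}^{n} w_j A_j^{-1} + \frac{1}{2} \Omega.
\]
Multiplying by $2$ and rearranging, we obtain the desired inequality. $\square$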
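The two-sided estimate $2I - \sum_j w_j A_j^{-1} \le \Omega(\omega; \mathbb{A}) \le \sum_j w_j A_j$ from Theorems 4.1 and 4.4 can be checked numerically on small examples. The sketch below is restricted to $2 \times 2$ real symmetric matrices and approximates $\Omega$ with the fixed-point iteration $X \leftarrow X^{-1/2} \big( \sum_j w_j (X^{1/2} A_j X^{1/2})^{1/2} \big)^2 X^{-1/2}$, which is known from the literature on Wasserstein barycenters of Gaussian measures. That iteration is an assumption of this illustration and is not taken from the paper, and all helper names and sample data are hypothetical.

```python
import math

# 2x2 real matrices as nested tuples; all helper names here are illustrative.

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inv(X):
    d = det(X)
    return ((X[1][1] / d, -X[0][1] / d), (-X[1][0] / d, X[0][0] / d))

def sqrtm(M):
    # Principal square root of a 2x2 matrix with positive eigenvalues:
    # by Cayley-Hamilton, (M + s I)^2 = (tr M + 2 s) M with s = sqrt(det M).
    s = math.sqrt(det(M))
    tau = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)
    return (((M[0][0] + s) / tau, M[0][1] / tau),
            (M[1][0] / tau, (M[1][1] + s) / tau))

def wsum(ws, Ms):
    return tuple(tuple(sum(w * M[i][j] for w, M in zip(ws, Ms)) for j in range(2))
                 for i in range(2))

def sub(X, Y):
    return tuple(tuple(X[i][j] - Y[i][j] for j in range(2)) for i in range(2))

def min_eig(S):
    # Smallest eigenvalue of a symmetric 2x2 matrix.
    m = 0.5 * (S[0][0] + S[1][1])
    return m - math.sqrt(max(0.0, m * m - det(S)))

def rhs(ws, As, X):
    # Right-hand side of X = sum_j w_j (X^{1/2} A_j X^{1/2})^{1/2}.
    R = sqrtm(X)
    return wsum(ws, [sqrtm(mul(mul(R, A), R)) for A in As])

def wasserstein_mean(ws, As, iters=200):
    X = wsum(ws, As)                      # start from the arithmetic mean
    for _ in range(iters):
        S, Rinv = rhs(ws, As, X), inv(sqrtm(X))
        X = mul(mul(Rinv, mul(S, S)), Rinv)
        off = 0.5 * (X[0][1] + X[1][0])   # re-symmetrise against round-off
        X = ((X[0][0], off), (off, X[1][1]))
    return X

I2 = ((1.0, 0.0), (0.0, 1.0))
As = [((2.0, 1.0), (1.0, 2.0)), ((3.0, -1.0), (-1.0, 1.0)), ((1.0, 0.0), (0.0, 4.0))]
ws = [0.2, 0.3, 0.5]

X = wasserstein_mean(ws, As)
residual = max(abs(e) for row in sub(X, rhs(ws, As, X)) for e in row)
lower = sub(wsum([2.0], [I2]), wsum(ws, [inv(A) for A in As]))  # 2I - sum w_j A_j^{-1}
upper = wsum(ws, As)                                            # sum w_j A_j

print("fixed-point residual      :", residual)
print("min eig of (Omega - lower):", min_eig(sub(X, lower)))
print("min eig of (upper - Omega):", min_eig(sub(upper, X)))
```

A tiny residual indicates that the computed `X` solves the fixed-point form of (3.9), and the two smallest eigenvalues being nonnegative (up to rounding) reflects the two Loewner-order inequalities.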

Remark 4.5.

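Note that
\[
2I - \sum_{j=1}^{n} w_j A_j^{-1} \le \Big[ \sum_{j=1}^{n} w_j A_j^{-1} \Big]^{-1}.
\]
Indeed, by the arithmetic-geometric mean inequality in Lemma 2.1 (10),
\[
I = \Big[ \sum_{j=1}^{n} w_j A_j^{-1} \Big] \# \Big[ \sum_{j=1}^{n} w_j A_j^{-1} \Big]^{-1} \le \frac{1}{2} \Big[ \sum_{j=1}^{n} w_j A_j^{-1} + \Big( \sum_{j=1}^{n} w_j A_j^{-1} \Big)^{-1} \Big].
\]

We give another upper bound for the Wasserstein mean different from the arithmetic mean.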

Remark 4.6.

Assume that $\sum_{j=1}^{n} w_j A_j < 2I$. Let $\Omega = \Omega(\omega; \mathbb{A})$. By Theorem 3.1 and the arithmetic-geometric mean inequality in Lemma 2.1 (10),
\[
I = \sum_{j=1}^{n} w_j \big( A_j \# \Omega^{-1} \big) \le \sum_{j=1}^{n} w_j \Big( \frac{A_j + \Omega^{-1}}{2} \Big).
\]
By a simple calculation, we have $0 < 2I - \sum_{j=1}^{n} w_j A_j \le \Omega^{-1}$, and so
\[
\Omega \le \Big[ 2I - \sum_{j=1}^{n} w_j A_j \Big]^{-1}.
\]
This means that $\big[ 2I - \sum_{j=1}^{n} w_j A_j \big]^{-1}$ is an upper bound for $\Omega(\omega; A_1, \dots, A_n)$.

On the other hand, note that
\[
\Big[ 2I - \sum_{j=1}^{n} w_j A_j \Big]^{-1} \ge \sum_{j=1}^{n} w_j A_j.
\]
Indeed,
\[
I = \Big[ \sum_{j=1}^{n} w_j A_j \Big] \# \Big[ \sum_{j=1}^{n} w_j A_j \Big]^{-1} \le \frac{1}{2} \Big[ \sum_{j=1}^{n} w_j A_j + \Big( \sum_{j=1}^{n} w_j A_j \Big)^{-1} \Big].
\]
Then $2I - \sum_{j=1}^{n} w_j A_j \le \big( \sum_{j=1}^{n} w_j A_j \big)^{-1}$, so $\big[ 2I - \sum_{j=1}^{n} w_j A_j \big]^{-1} \ge \sum_{j=1}^{n} w_j A_j$.

5. Applications to the Lie-Trotter mean

In this section we apply the lower bound for the Wasserstein mean in Theorem 4.4 to the notion of Lie-Trotter means. A weighted $n$-mean $G_n$ on $\mathbb{P}_m$ for $n \ge 2$ is a map $G_n(\omega; \cdot) : \mathbb{P}_m^n \to \mathbb{P}_m$ that is idempotent, in the sense that $G_n(\omega; X, \dots, X) = X$ for all $X \in \mathbb{P}_m$. A weighted $n$-mean $G_n(\omega; \cdot) : \mathbb{P}_m^n \to \mathbb{P}_m$ is called a multivariable Lie-Trotter mean if it is differentiable and satisfies
\[
\lim_{s \to 0} G_n(\omega; \gamma_1(s), \gamma_2(s), \dots, \gamma_n(s))^{1/s} = \exp\Big[ \sum_{i=1}^{n} w_i \gamma_i'(0) \Big], \tag{5.11}
\]
where for $\epsilon > 0$, $\gamma_i : (-\epsilon, \epsilon) \to \mathbb{P}_m$ are differentiable curves with $\gamma_i(0) = I$ for all $i = 1, \dots, n$. See [7] for more details and information.

Lemma 5.1.

Let $\Omega_\omega := \Omega(\omega; \cdot) : \mathbb{P}_m^n \to \mathbb{P}_m$ be the Wasserstein mean for a given probability vector $\omega = (w_1, \dots, w_n)$. Then it is differentiable at $\mathbb{I} = (I, \dots, I)$ with
\[
D\Omega_\omega(\mathbb{I})(X_1, \dots, X_n) = \sum_{j=1}^{n} w_j X_j.
\]

Proof.

Let $X_1, \dots, X_n \in \mathbb{H}_m$. If $X_1 = \cdots = X_n = 0$, then the statement holds obviously. Without loss of generality, we assume that at least one of $X_1, \dots, X_n$ is not zero. Set
\[
\rho := \max_{1 \le j \le n} \sigma(X_j),
\]
where $\sigma(X)$ is the spectral radius of $X$. Then $\rho > 0$. Define
\[
f(t) = 2I - \sum_{j=1}^{n} w_j (I + t X_j)^{-1} \quad \text{on } \Big( -\frac{1}{\rho}, \frac{1}{\rho} \Big).
\]
Then
\[
\lambda(I + t X_j) = 1 + t \lambda(X_j) \ge 1 - |t| \, |\lambda(X_j)| \ge 1 - \rho |t| > 0,
\]
where $\lambda(X)$ denotes an eigenvalue of $X$. So $I + t X_j \in \mathbb{P}_m$ for any $t \in (-1/\rho, 1/\rho)$. Thus $f$ is well-defined in the neighborhood $(-1/\rho, 1/\rho)$ of $0$, and $f(0) = 2I - \sum_{j=1}^{n} w_j I^{-1} = I$. Since the derivative of the map $t \mapsto (tX + I)^{-1}$ at $t = 0$ is $-X$, we have
\[
f'(0) = \lim_{t \to 0} \frac{I - \sum_{j=1}^{n} w_j (I + t X_j)^{-1}}{t} = \lim_{t \to 0} \sum_{j=1}^{n} w_j (I + t X_j)^{-1} X_j (I + t X_j)^{-1} = \sum_{j=1}^{n} w_j X_j.
\]
Then by Theorems 4.1 and 4.4,
\[
\frac{\big[ 2I - \sum_{j=1}^{n} w_j (I + t X_j)^{-1} \big] - I}{t} \le \frac{\Omega(\omega; I + t X_1, \dots, I + t X_n) - I}{t} \le \frac{\sum_{j=1}^{n} w_j (I + t X_j) - I}{t} = \sum_{j=1}^{n} w_j X_j
\]
for any sufficiently small $t > 0$. Since $\Omega(\omega; I, \dots, I) = I$, we have
\[
\lim_{t \to 0^+} \frac{\Omega(\omega; I + t X_1, \dots, I + t X_n) - \Omega(\omega; I, \dots, I)}{t} = \sum_{j=1}^{n} w_j X_j.
\]
Similarly, for $t < 0$,
\[
\lim_{t \to 0^-} \frac{\Omega(\omega; I + t X_1, \dots, I + t X_n) - \Omega(\omega; I, \dots, I)}{t} = \sum_{j=1}^{n} w_j X_j.
\]
We conclude that $\Omega_\omega$ is differentiable at $\mathbb{I}$ with $D\Omega_\omega(\mathbb{I})(X_1, \dots, X_n) = \sum_{j=1}^{n} w_j X_j$. $\square$

Theorem 5.2.

The Wasserstein mean is a multivariate Lie-Trotter mean, that is, for given $\omega = (w_1, \dots, w_n) \in \Delta_n$,
\[
\lim_{s \to 0} \Omega(\omega; \gamma_1(s), \gamma_2(s), \dots, \gamma_n(s))^{1/s} = \exp\Big[ \sum_{j=1}^{n} w_j \gamma_j'(0) \Big],
\]
where for $\epsilon > 0$, $\gamma_j : (-\epsilon, \epsilon) \to \mathbb{P}_m$ are differentiable curves with $\gamma_j(0) = I$ for all $j = 1, \dots, n$.

Proof. Let $\omega = (w_1, \dots, w_n) \in \Delta_n$ and let $\gamma_1, \dots, \gamma_n : (-\epsilon, \epsilon) \to \mathbb{P}_m$ be any differentiable curves with $\gamma_j(0) = I$ for all $j = 1, \dots, n$. Then by Theorems 4.1 and 4.4,
\[
2I - \sum_{j=1}^{n} w_j \gamma_j(s)^{-1} \le \Omega(\omega; \gamma_1(s), \dots, \gamma_n(s)) \le \sum_{j=1}^{n} w_j \gamma_j(s).
\]
For $|s|$ sufficiently small the left-hand side is positive definite, since it equals $I$ at $s = 0$. Taking logarithms and using the fact that the logarithm function is operator monotone, we have
\[
\log \Big[ 2I - \sum_{j=1}^{n} w_j \gamma_j(s)^{-1} \Big] \le \log \Omega(\omega; \gamma_1(s), \dots, \gamma_n(s)) \le \log \sum_{j=1}^{n} w_j \gamma_j(s).
\]
For $s > 0$, multiplying all terms by $1/s$, we get
\[
\frac{1}{s} \log \Big[ 2I - \sum_{j=1}^{n} w_j \gamma_j(s)^{-1} \Big] \le \log \Omega(\omega; \gamma_1(s), \dots, \gamma_n(s))^{1/s} \le \frac{1}{s} \log \sum_{j=1}^{n} w_j \gamma_j(s).
\]
Taking the limit $s \to 0^+$ and using l'Hôpital's theorem, we obtain
\[
\lim_{s \to 0^+} \log \Omega(\omega; \gamma_1(s), \dots, \gamma_n(s))^{1/s} = \sum_{j=1}^{n} w_j \gamma_j'(0).
\]
Since the logarithm map $\log : \mathbb{P}_m \to \mathbb{H}_m$ is diffeomorphic,
\[
\lim_{s \to 0^+} \Omega(\omega; \gamma_1(s), \dots, \gamma_n(s))^{1/s} = \exp\Big[ \sum_{j=1}^{n} w_j \gamma_j'(0) \Big].
\]
For $s < 0$, we obtain
\[
\lim_{s \to 0^-} \Omega(\omega; \gamma_1(s), \dots, \gamma_n(s))^{1/s} = \exp\Big[ \sum_{j=1}^{n} w_j \gamma_j'(0) \Big]
\]
by similar steps. $\square$

Taking $\gamma_i(s) = A_i^s$ for each $i$ and some $A_i \in \mathbb{P}_m$, we obtain from Theorem 5.2

Corollary 5.3.

Let $A_1, \dots, A_n \in \mathbb{P}_m$ and $\omega = (w_1, \dots, w_n) \in \Delta_n$. Then
\[
\lim_{s \to 0} \Omega(\omega; A_1^s, \dots, A_n^s)^{1/s} = \exp\Big[ \sum_{i=1}^{n} w_i \log A_i \Big].
\]

6. Final remarks

It is a natural question whether the Wasserstein mean can be defined in the setting $\mathbb{P}$ of positive definite operators. Since one cannot have the Wasserstein metric on $\mathbb{P}$, the definition (3.8) may not be available. One possible approach to defining the operator Wasserstein mean is to show the existence and uniqueness of the solution of the equation (3.9). On the other hand, one cannot find an explicit form of the solution of the nonlinear equation (3.9) in general, but we have seen that the solution of (3.9) for two positive definite matrices $A$ and $B$ coincides with the geodesic $\gamma(t) = A \diamond_t B$ in (2.6) with respect to the Wasserstein metric. We directly solve the nonlinear equation (3.9) for $n = 2$ by using the properties of the geometric mean of positive definite operators.

For positive definite operators $A, B \in \mathbb{P}$ and $t \in [0, 1]$ the weighted geometric mean of $A$ and $B$ is defined by
\[
A \#_t B = A^{1/2} \big( A^{-1/2} B A^{-1/2} \big)^t A^{1/2}.
\]
Note that $A \# B = A \#_{1/2} B$ is the unique positive definite solution $X \in \mathbb{P}$ of the Riccati equation $X A^{-1} X = B$. Moreover, it satisfies most of the properties in Lemma 2.1, and we list those that are useful for our goal. See [4, 8, 9].

Lemma 6.1.

Let $A, B, C, D \in \mathbb{P}$ and let $t \in [0, 1]$. Then the following are satisfied.

(1) $A \#_t B = B \#_{1-t} A$.
(2) $X (A \#_t B) X^* = (X A X^*) \#_t (X B X^*)$ for any invertible operator $X$.
(3) $(A \#_t B)^{-1} = A^{-1} \#_t B^{-1}$.

Theorem 6.2.

Let $A, B \in \mathbb{P}$ and $t \in [0, 1]$. Then the nonlinear equation
\[
I = (1-t)\big( A \# X^{-1} \big) + t \big( B \# X^{-1} \big) \tag{6.12}
\]
has a unique positive definite solution $X = A \diamond_t B$.

Proof. Pre- and post-multiplying all terms by $A^{-1/2}$ for $A > 0$, we get
\[
A^{-1} = (1-t)\big( A^{-1/2} X^{-1} A^{-1/2} \big)^{1/2} + t \big( (A^{-1/2} B A^{-1/2}) \# (A^{-1/2} X^{-1} A^{-1/2}) \big).
\]
Let $Y = A^{-1/2} X^{-1} A^{-1/2}$ and $Z = A^{-1/2} B A^{-1/2}$. Then we have
\[
A^{-1} = (1-t) Y^{1/2} + t (Z \# Y).
\]
By using the Riccati equation, we get
\[
\frac{1}{t^2} \big[ A^{-1} - (1-t) Y^{1/2} \big] Y^{-1} \big[ A^{-1} - (1-t) Y^{1/2} \big] = Z.
\]
Pre- and post-multiplying all terms by $A$ for $A > 0$, we get
\[
\big[ Y^{-1/2} - (1-t) A \big]^2 = t^2 A Z A.
\]
Taking square roots on both sides, we obtain $Y^{-1/2} = (1-t) A + t (A Z A)^{1/2}$. By the definitions of $Y$ and $Z$, we have
\[
\big( A^{1/2} X A^{1/2} \big)^{1/2} = (1-t) A + t \big( A^{1/2} B A^{1/2} \big)^{1/2}.
\]
Squaring both sides and then pre- and post-multiplying all terms by $A^{-1/2}$ for $A > 0$, we obtain
\[
X = A^{-1/2} \big[ (1-t) A + t (A^{1/2} B A^{1/2})^{1/2} \big]^2 A^{-1/2} = A \diamond_t B. \qquad \square
\]
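For matrices, Theorem 6.2 can be sanity-checked numerically. The sketch below, again limited to $2 \times 2$ real symmetric positive definite matrices, forms $X = A \diamond_t B$ from (2.6) and evaluates the right-hand side of (6.12), using the identity $P \# Q = P (P^{-1} Q)^{1/2}$ for the geometric mean. Function names and sample matrices are assumptions chosen for illustration only.

```python
import math

# 2x2 real matrices as nested tuples; helper names are illustrative only.

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inv(X):
    d = det(X)
    return ((X[1][1] / d, -X[0][1] / d), (-X[1][0] / d, X[0][0] / d))

def sqrtm(M):
    # Principal square root of a 2x2 matrix with positive eigenvalues
    # (Cayley-Hamilton: (M + s I)^2 = (tr M + 2 s) M with s = sqrt(det M)).
    s = math.sqrt(det(M))
    tau = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)
    return (((M[0][0] + s) / tau, M[0][1] / tau),
            (M[1][0] / tau, (M[1][1] + s) / tau))

def geometric_mean(P, Q):
    # P # Q = P (P^{-1} Q)^{1/2}, equivalent to
    # P^{1/2} (P^{-1/2} Q P^{-1/2})^{1/2} P^{1/2}.
    return mul(P, sqrtm(mul(inv(P), Q)))

def diamond(A, B, t):
    # A <>_t B = (1-t)^2 A + t^2 B + t(1-t)[(AB)^{1/2} + (BA)^{1/2}].
    R = sqrtm(mul(A, B))  # (AB)^{1/2}; its transpose is (BA)^{1/2}
    return tuple(
        tuple((1 - t) ** 2 * A[i][j] + t ** 2 * B[i][j]
              + t * (1 - t) * (R[i][j] + R[j][i]) for j in range(2))
        for i in range(2)
    )

# Sample data (illustrative only).
A = ((2.0, 1.0), (1.0, 2.0))
B = ((3.0, -1.0), (-1.0, 1.0))
t = 0.3

X = diamond(A, B, t)
Xinv = inv(X)
GA = geometric_mean(A, Xinv)
GB = geometric_mean(B, Xinv)

# Right-hand side of (6.12); it should be close to the identity matrix.
rhs = tuple(tuple((1 - t) * GA[i][j] + t * GB[i][j] for j in range(2))
            for i in range(2))
residual = max(abs(rhs[i][j] - (1.0 if i == j else 0.0))
               for i in range(2) for j in range(2))
print("max |(1-t)(A # X^-1) + t(B # X^-1) - I| =", residual)
```

A residual at the level of rounding error is what one would expect if $X = A \diamond_t B$ solves (6.12).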

Open question. For positive definite operators $A_1, \dots, A_n$ and a positive probability vector $(w_1, \dots, w_n)$, does the nonlinear equation
\[
I = \sum_{j=1}^{n} w_j \big( A_j \# X^{-1} \big)
\]
have a unique positive definite solution $X$ in the setting $\mathbb{P}$ of positive definite operators? This is an interesting and challenging problem, and Theorem 6.2 gives a positive answer for $n = 2$.

Acknowledgement

The work of S. Kim was supported by the National Research Foundation of Korea (NRF)grant funded by the Korea government (MIST) (No. NRF-2018R1C1B6001394).

References

[1] M. Agueh and G. Carlier, Barycenters in the Wasserstein space, SIAM J. Math. Anal. Appl. (2011), 904-924.
[2] R. Bhatia, Positive Definite Matrices, Princeton Series in Applied Mathematics, Princeton, 2007.
[3] R. Bhatia, T. Jain and Y. Lim, On the Bures-Wasserstein distance between positive definite matrices, to appear in Expositiones Mathematicae.
[4] G. Corach, H. Porta, and L. Recht, Convexity of the geodesic distance on spaces of positive operators, Illinois J. Math. (1994), 87-94.
[5] S. Gallot, D. Hulin, and J. Lafontaine, Riemannian Geometry, Springer, 2004.
[6] R. A. Horn and C. R. Johnson, Matrix Analysis, 2nd edition, Cambridge University Press, 2013.
[7] J. Hwang and S. Kim, Lie-Trotter means of positive definite operators, Linear Algebra Appl. (2017), 268-280.
[8] F. Kubo and T. Ando, Means of positive linear operators, Math. Ann. (1979/80), no. 3, 205-224.
[9] J. Lawson and Y. Lim, Metric convexity of symmetric cones, Osaka J. Math. (2007), 795-816.

Jinmi Hwang, Department of Mathematics, Chungbuk National University, Cheongju 28644, Korea

Sejong Kim, Department of Mathematics, Chungbuk National University, Cheongju 28644, Korea