Finsler geometries on strictly accretive matrices
Axel Ringh ∗ and Li Qiu ∗

Abstract
In this work we study the set of strictly accretive matrices, that is, the set of matrices with positive definite Hermitian part, and show that the set can be interpreted as a smooth manifold. Using the recently proposed symmetric polar decomposition for sectorial matrices, we show that this manifold is diffeomorphic to a direct product of the manifold of (Hermitian) positive definite matrices and the manifold of strictly accretive unitary matrices. Utilizing this decomposition, we introduce a family of Finsler metrics on the manifold and characterize their geodesics and geodesic distance. Finally, we apply the geodesic distance to a matrix approximation problem, and also give some comments on the relation between the introduced geometry and the geometric mean for strictly accretive matrices as defined by S. Drury in [S. Drury, Linear Multilinear Algebra. 2015 63(2):296–301].
Key words:
Accretive matrices; matrix manifolds; Finsler geometry; numerical range; geometric mean
Given a complex number $z \in \mathbb{C}$ we can write it in its Cartesian form $z = a + ib$, where $a = \Re(z)$ is the real part and $b = \Im(z)$ is the imaginary part, or we can write it in its polar form as $z = re^{i\theta}$, where $r = |z|$ is the magnitude and $\theta = \angle z$ is the phase. The standard metric on $\mathbb{C}$ defines the (absolute) distance between $z_1$ and $z_2$ as $|z_1 - z_2| = \sqrt{\Re(z_1 - z_2)^2 + \Im(z_1 - z_2)^2}$, which is efficiently computed using the Cartesian form as $\sqrt{(a_1 - a_2)^2 + (b_1 - b_2)^2}$. However, sometimes a logarithmic (relative) distance between the numbers contains information that is more relevant for the problem at hand. One such distance is given by $\sqrt{\log(r_1/r_2)^2 + [(\theta_1 - \theta_2) \bmod 2\pi]^2}$, and in this distance measure the point $1$ is as close to $10e^{i\theta}$ as it is to $0.1e^{-i\theta}$. This type of distance has wide application in engineering problems, e.g., as demonstrated in the use of Bode plots and Nichols charts in control theory [41]. Moreover, this type of logarithmic metric has been generalized to (Hermitian) positive definite matrices, with plenty of applications, for example in computing geometric means between such matrices [37], [11, Chp. 6], [29, Chp. XII]. This generalization can be done by identifying the set of positive definite matrices as a smooth manifold and introducing a Riemannian or Finsler metric on it. Here, we follow a similar path and extend this type of logarithmic metric to so-called strictly accretive matrices. More specifically, the outline of the paper is as follows: in Section 2 we review relevant background material and set up the notation used in the paper. Section 3 is devoted to showing that the set of strictly accretive matrices can be interpreted as a smooth manifold, and that this manifold is diffeomorphic to a direct product of the smooth manifold of positive definite matrices and the smooth manifold of strictly accretive unitary matrices.
The latter is done using the newly introduced symmetric polar decomposition for sectorial matrices [44]. In Section 4 we introduce a family of Finsler metrics on the manifold, by means of the decomposition from the previous section and so-called (Minkowskian) product functions [39]. In particular, this allows us to characterize the corresponding geodesics and compute the geodesic distance. Finally, in Section 5 we give an application of the metric to a matrix approximation problem and also give some comments on the relation between the geodesic midpoint and the geometric mean between strictly accretive matrices as introduced in [16].

∗ Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China. Email: [email protected] (A. Ringh), [email protected] (L. Qiu)
† This work was supported by the Knut and Alice Wallenberg foundation, Stockholm, Sweden, under grant KAW 2018.0349, and the Hong Kong Research Grants Council, Hong Kong, China, under project GRF 16200619.
arXiv [math.MG]

In this section we introduce some background material needed for the rest of the paper. At the same time, this section is also used to set up the notation used throughout. To this end, let $\mathbb{M}_n$ denote the set of $n \times n$ matrices over the field $\mathbb{C}$ of complex numbers. For $A \in \mathbb{M}_n$, let $A^*$ denote its complex conjugate transpose, let $H(A) := \frac{1}{2}(A + A^*)$ denote its Hermitian part, and let $S(A) := \frac{1}{2}(A - A^*)$ denote its skew-Hermitian part.
Moreover, by $I$ we denote the identity matrix, and for $A \in \mathbb{M}_n$ by $\lambda(A)$ we denote its spectrum, i.e., $\lambda(A) := \{\lambda \in \mathbb{C} \mid \det(\lambda I - A) = 0\}$, and by $\sigma(A)$ we denote its singular values, i.e., $\sigma(A) = \sqrt{\lambda(A^*A)}$.

The following is a number of different sets of matrices that will be used throughout: $\mathbb{GL}_n$ denotes the set of invertible matrices, $\mathbb{U}_n$ denotes the set of unitary matrices, $\mathbb{H}_n$ denotes the set of Hermitian matrices, $\mathbb{P}_n$ denotes the set of positive definite matrices, i.e., $A \in \mathbb{H}_n$ s.t. $\lambda(A) \subset \mathbb{R}_+ \setminus \{0\}$, $\mathbb{S}_n$ denotes the set of skew-Hermitian matrices, and $\mathbb{A}_n$ denotes the set of strictly accretive matrices, i.e., $A \in \mathbb{A}_n$ if and only if $H(A) \in \mathbb{P}_n$. Two matrices $A, B \in \mathbb{M}_n$ are said to be congruent if there exists a matrix $C \in \mathbb{GL}_n$ such that $A = C^*BC$. For matrices $A, B \in \mathbb{M}_n$ we define the inner product $\langle A, B \rangle := \mathrm{tr}(A^*B)$, which gives the Frobenius norm $\|A\| := \sqrt{\langle A, A \rangle} = \sqrt{\sum_{j=1}^n \sigma_j(A)^2}$. By $\|\cdot\|_{sp}$ we denote the spectral norm, i.e., $\|A\|_{sp} = \sup_{x \in \mathbb{C}^n \setminus \{0\}} \|Ax\|_2 / \|x\|_2 = \sigma_{\max}(A)$, the largest singular value of $A$.

Next, a function $\Phi : \mathbb{R}^n \to \mathbb{R}$ is called a symmetric gauge function if for all $x, y \in \mathbb{R}^n$ and all $\beta \in \mathbb{R}$
i) $\Phi(x) > 0$ if $x \neq 0$,
ii) $\Phi(\beta x) = |\beta| \Phi(x)$,
iii) $\Phi(x + y) \leq \Phi(x) + \Phi(y)$, and
iv) $\Phi(x) = \Phi(\tilde{x})$ for all $\tilde{x} = [\pm x_{\alpha(i)}]_{i=1}^n$ where $\alpha$ is any permutation of $\{1, \ldots, n\}$
[36], [34, Sec. 3.I.1]. For any unitarily invariant norm, i.e., norms $|||\cdot|||$ such that $|||UAV||| = |||A|||$ for all $A \in \mathbb{M}_n$ and all $U, V \in \mathbb{U}_n$, there exists a symmetric gauge function $\Phi$ such that $|||A||| = \Phi(\sigma(A))$ [10, Thm IV.2.1], [34, Thm. 10.A.1]. For this reason we will henceforth denote such norms $\|\cdot\|_\Phi$. Moreover, we will call a symmetric gauge function, and the corresponding norm, smooth if it is smooth outside of the origin, cf. [32, Thm. 8.5].

For a vector $x \in \mathbb{R}^n$, by $x^\downarrow$ we denote the vector obtained by sorting the elements in $x$ in nonincreasing order. More precisely, $x^\downarrow$ is obtained by permuting the elements of $x$ such that $x^\downarrow = [x^\downarrow_k]_{k=1}^n$ where $x^\downarrow_1 \geq x^\downarrow_2 \geq \ldots \geq x^\downarrow_n$. For two vectors $x, y \in \mathbb{R}^n$, we say that $x$ is submajorized (weakly submajorized) by $y$ if $\sum_{k=1}^\ell x^\downarrow_k \leq \sum_{k=1}^\ell y^\downarrow_k$ for $\ell = 1, \ldots, n-1$ and $\sum_{k=1}^n x^\downarrow_k = (\leq) \sum_{k=1}^n y^\downarrow_k$ [34, p. 12]. Submajorization (weak submajorization) is a preorder on $\mathbb{R}^n$, and we write $x \prec (\prec_w)\, y$.
On the equivalence classes of vectors sorted in nonincreasing order it is a partial ordering [34, p. 19].

Footnote: Note that this is equivalent to the Toeplitz decomposition since if $A = \Re(A) + i\Im(A)$, then $\Re(A) = H(A)$ and $\Im(A) = \frac{1}{i}S(A)$, see [23, p. 7].
Footnote: The naming used here is the same as in [28, p. 281], in contrast to [8].

2.1 Sectorial matrices and the phases of a matrix

Given a matrix $A \in \mathbb{M}_n$, we define the numerical range (field of values) as
$$W(A) := \{ z \in \mathbb{C} \mid z = x^*Ax,\ x \in \mathbb{C}^n \text{ and } \|x\|_2^2 = x^*x = 1 \}.$$
Using the numerical range, we can define the set of so-called sectorial matrices as
$$\mathbb{W}_n := \{ A \in \mathbb{M}_n \mid 0 \notin W(A) \}.$$
The name comes from the fact that the numerical range of a matrix $A \in \mathbb{W}_n$ is contained in a sector of opening angle less than $\pi$. The latter can be seen from the Toeplitz-Hausdorff theorem, which states that for any matrix $A \in \mathbb{M}_n$, $W(A)$ is a convex set [45, Thm. 4.1], [20, Thm. 1.1-2]. Recently, sectorial matrices have received considerable attention in the literature, see, e.g., [8, 35, 17, 33, 46, 44, 12].

Sectorial matrices have several interesting properties. In particular, if $A$ is sectorial it is congruent to a unitary diagonal matrix $D$, i.e., $A = T^*DT$ for some $T \in \mathbb{GL}_n$ [21, 15, 19, 26, 24]. Although the decomposition is not unique, the elements in $D$ are unique up to permutation, and any such decomposition is called a sectorial decomposition [46]. Using this decomposition, we define the phases of $A$, denoted $\phi_1(A), \phi_2(A), \ldots, \phi_n(A)$, as the phases of the eigenvalues of $D$ [19, 44, 12]; by convention we define them to belong to an interval of length strictly less than $\pi$. With this definition we have, e.g., that $A \in \mathbb{A}_n$ if and only if $A \in \mathbb{W}_n$ and $H(A) \in \mathbb{P}_n$, which is true if and only if $(\phi_1(A), \ldots, \phi_n(A)) \subset (-\pi/2, \pi/2)$. Note that the phases of $A$ are different from the angles of the eigenvalues, i.e., in general $\phi(A) \neq \varphi(A)$ where $\varphi(A) := \angle\lambda(A)$.
More precisely, equality holds for normal matrices. The phases have a number of desirable properties that the angles of the eigenvalues do not, see [44].

Another decomposition of sectorial matrices, which will in fact be central to this work, is the so-called symmetric polar decomposition [44, Thm. 3.1]: for $A \in \mathbb{W}_n$ there is a unique decomposition given by
$$A = PUP,$$
where $P \in \mathbb{P}_n$ and $U \in \mathbb{WU}_n := \mathbb{U}_n \cap \mathbb{W}_n$. The latter is the set of sectorial unitary matrices. The phases of $A$ are given by the phases of $U$, which are in fact the phases of the eigenvalues of $U$. Therefore, we have that $A \in \mathbb{A}_n$ if and only if it has a symmetric polar decomposition such that $U \in \mathbb{AU}_n := \{ U \in \mathbb{WU}_n \mid H(U) \in \mathbb{P}_n \}$, i.e., the set of strictly accretive unitary matrices.

Smooth manifolds are important mathematical objects which show up in such diverse fields as theoretical physics [40], robotics [38], and statistics and information theory [2]. Intuitively, they can be thought of as spaces that locally look like the Euclidean space, and on these spaces one can introduce geometric concepts such as curves and metrics. In particular, all smooth manifolds admit a so-called Riemannian metric [27, Thm. 1.4.1], [30, Prop. 13.3], and Riemannian geometry is a well-studied subject, see, e.g., one of the monographs [40, 29, 27, 30, 31]. A relaxation of Riemannian geometry leads to so-called Finsler geometry [14]; loosely expressed it can be interpreted as changing the tangent space from a Hilbert space to a Banach space. For an introduction to Finsler geometry, see, e.g., [9, 42, 13].

More specifically, given a smooth manifold $\mathcal{M}$, for $x \in \mathcal{M}$ we denote the tangent space by $T_x\mathcal{M}$ and the tangent bundle by $T\mathcal{M} := \cup_{x \in \mathcal{M}} \{x\} \times T_x\mathcal{M}$. A Riemannian metric is induced by an inner product $\langle \cdot, \cdot \rangle_x : T_x\mathcal{M} \times T_x\mathcal{M} \to \mathbb{R}$ that varies smoothly with the base point $x$.

Footnote: In [19], these were called canonical angles.
Using this inner product, one defines the norm $\sqrt{\langle \cdot, \cdot \rangle_x}$, which in fact defines a smooth function on the slit tangent bundle $T\mathcal{M} \setminus \cup_{x \in \mathcal{M}} (x, 0)$. A Finsler structure is a function $F : T\mathcal{M} \to \mathbb{R}_+$, $(x, X) \mapsto \|X\|_x$, where $\|\cdot\|_x$ is a norm on $T_x\mathcal{M}$ which is not necessarily induced by an inner product, and such that $F$ is smooth on the slit tangent bundle $T\mathcal{M} \setminus \cup_{x \in \mathcal{M}} (x, 0)$. Given a piece-wise smooth curve $\gamma : [0, 1] \to \mathcal{M}$, the arc length on the manifold is defined using this Finsler structure. More precisely, it is defined as
$$L(\gamma) := \int_0^1 F(\gamma(t), \dot{\gamma}(t))\, dt,$$
where $\dot{\gamma}(t)$ is the derivative of $\gamma$ with respect to $t$. Using arc length, the geodesic distance between two points $x, y \in \mathcal{M}$ is defined as
$$\delta(x, y) := \inf_\gamma \{ L(\gamma) : \gamma \text{ is a piece-wise smooth curve such that } \gamma(0) = x,\ \gamma(1) = y \},$$
and a minimizing curve (if one exists) is called a geodesic. A final notion we need is that of diffeomorphic manifolds. More precisely, two smooth manifolds $\mathcal{M}$ and $\mathcal{N}$ are said to be diffeomorphic if there exists a diffeomorphism $f : \mathcal{M} \to \mathcal{N}$, i.e., a function $f$ which is a smooth bijection with a smooth inverse. In this case we write $\mathcal{M} \cong \mathcal{N}$.

Next, we summarize some results regarding two matrix manifolds, namely $\mathbb{P}_n$ and $\mathbb{U}_n$, together with specific Finsler structures. These will be needed later.

$\mathbb{P}_n$ and their geodesics

Riemannian and Finsler geometry on $\mathbb{P}_n$ is a well-studied subject, and we refer the reader to, e.g., [37], [11, Chp. 6] or [29, Chp. XII]. Here, we summarize some of the results we will need later. To this end, note that the tangent space at $P \in \mathbb{P}_n$ is $\mathbb{H}_n$, and given $P \in \mathbb{P}_n$, $X, Y \in \mathbb{H}_n$ we can introduce the inner product on the tangent space as
$$\langle X, Y \rangle_P = \mathrm{tr}\big( (P^{-1/2}XP^{-1/2})^* (P^{-1/2}YP^{-1/2}) \big), \tag{2.1}$$
with corresponding norm $F_{\mathbb{P}_n}(P, X) = \|X\|_P = \sqrt{\sum_{j=1}^n \sigma_j(P^{-1/2}XP^{-1/2})^2}$.
The geodesic between $P, Q \in \mathbb{P}_n$ in the induced Riemannian metric is given by
$$\gamma_{\mathbb{P}_n}(t) = P^{1/2}(P^{-1/2}QP^{-1/2})^t P^{1/2} = Pe^{t\log(P^{-1}Q)} \tag{2.2}$$
and the length of the curve, i.e., the Riemannian distance between $P$ and $Q$, is given by $\delta_{\mathbb{P}_n}(P, Q) = \|\log(P^{-1/2}QP^{-1/2})\|$. Interestingly, if the norm $\|\cdot\|_P$ on $T_P\mathbb{P}_n$ is changed to any other unitarily invariant matrix norm
$$\|\cdot\|_{\Phi,P} = \Phi(\sigma(P^{-1/2} \cdot P^{-1/2})),$$
the expression for a geodesic between two matrices remains unchanged and the corresponding distance is given by $\delta^\Phi_{\mathbb{P}_n}(P, Q) = \|\log(P^{-1/2}QP^{-1/2})\|_\Phi$ [11, Sec. 6.4]. However, the geodesic (2.2) might no longer be the unique shortest curve [11, p. 223].

An alternative expression for the geodesic (2.2) is given by the following proposition.

Proposition 2.1. Let $\|\cdot\|_\Phi$ be any smooth unitarily invariant norm and consider the Finsler structure given by $F^\Phi_{\mathbb{P}_n} : T\mathbb{P}_n \to \mathbb{R}_+$, $F^\Phi_{\mathbb{P}_n} : (P, X) \mapsto \|P^{-1/2}XP^{-1/2}\|_\Phi$. For $P, Q \in \mathbb{P}_n$, a geodesic between them can be written as $\gamma(t) = S\Lambda^t S^*$ where $P = SS^*$, $Q = S\Lambda S^*$ is a simultaneous diagonalization by congruence of $P$ and $Q$, i.e., $S \in \mathbb{GL}_n$ and $\Lambda$ is diagonal with positive elements on the diagonal. Moreover, the geodesic distance is
$$\delta^\Phi_{\mathbb{P}_n}(P, Q) = \|\log(P^{-1/2}QP^{-1/2})\|_\Phi = \Phi\big(\log(\lambda(P^{-1}Q))\big) = \|\log(\Lambda)\|_\Phi. \tag{2.3}$$

Proof.
To show the second equality in (2.3), note that
$$\|\log(P^{-1/2}QP^{-1/2})\|_\Phi = \Phi\big(\sigma(\log(P^{-1/2}QP^{-1/2}))\big) = \Phi\big(|\lambda(\log(P^{-1/2}QP^{-1/2}))|\big) = \Phi\big(\lambda(\log(P^{-1/2}QP^{-1/2}))\big) = \Phi\big(\log(\lambda(P^{-1/2}QP^{-1/2}))\big) = \Phi\big(\log(\lambda(P^{-1}Q))\big),$$
where the second equality holds because the singular values of a Hermitian matrix, here the matrix $\log(P^{-1/2}QP^{-1/2})$, are the absolute values of its eigenvalues, the third equality follows since the symmetric gauge function is invariant under sign changes, the fourth can be seen by using a unitary diagonalization of $P^{-1/2}QP^{-1/2}$, and the fifth equality holds because the spectrum is invariant under similarity. Next, since both $P, Q \in \mathbb{P}_n$, by [23, Thm. 7.6.4] we can simultaneously diagonalize $P$ and $Q$ by congruence, i.e., there exists an $S \in \mathbb{GL}_n$ such that $P = SS^*$ and $Q = S\Lambda S^*$, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and where $\lambda_1, \ldots, \lambda_n$ are all strictly larger than $0$. In fact, $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $P^{-1}Q$, which means that $\log(\lambda(P^{-1}Q)) = \log(\lambda(\Lambda)) = \log(\Lambda)$, which in turn gives the last equality in (2.3). Finally, this also gives that $\gamma(t) = Pe^{t\log(P^{-1}Q)} = SS^*S^{-*}e^{t\log(\Lambda)}S^* = S\Lambda^t S^*$.

$\mathbb{U}_n$ and their geodesics

The set of unitary matrices is a Lie group, and results related to Riemannian and Finsler geometry on $\mathbb{U}_n$ can be found in, e.g., [3, 5, 4, 7, 6]. Again, we here summarize some of the results that we will need later. To this end, note that the tangent space at $U \in \mathbb{U}_n$ is $\mathbb{S}_n$ and given $U \in \mathbb{U}_n$, $X, Y \in \mathbb{S}_n$ we can introduce the inner product on the tangent space as
$$\langle X, Y \rangle_U = \mathrm{tr}(X^*Y), \tag{2.4}$$
with corresponding induced norm $F_{\mathbb{U}_n}(U, X) = \|X\|_U = \sqrt{\langle X, X \rangle_U}$.
The induced Riemannian metric has shortest curves between $U, V \in \mathbb{U}_n$ given by
$$\gamma_{\mathbb{U}_n}(t) = Ue^{itZ}, \tag{2.5}$$
where $V = Ue^{iZ}$, and where $Z \in \mathbb{H}_n$ is such that $\|Z\|_{sp} \leq \pi$. Moreover, the geodesic distance is
$$\delta_{\mathbb{U}_n}(U, V) = \|Z\|, \tag{2.6}$$
and the geodesic is unique if $\|Z\|_{sp} < \pi$. Similarly to the results for the smooth manifold $\mathbb{P}_n$, if the norm $\|\cdot\|_U$ on $T_U\mathbb{U}_n$ is changed to any other unitarily invariant matrix norm $\|\cdot\|_{\Phi,U}$, the expression for a geodesic in (2.5) is unchanged and the expression for the geodesic distance (2.6) is only changed to using the corresponding norm $\|\cdot\|_\Phi$ [4, 7]. However, even if $\|Z\|_{sp} < \pi$ the geodesic (2.5) might not be unique in this case [7, Sec. 3.2].

3 The smooth manifold of strictly accretive matrices
In this section we prove that $\mathbb{A}_n$ is a smooth manifold diffeomorphic to $\mathbb{P}_n \times \mathbb{AU}_n$. The latter is fundamental for the introduction of the Finsler structures in the following section. For improved readability, some of the technical results of this section are deferred to Appendix A. To this end, we start by proving the following.

Theorem 3.1. $\mathbb{A}_n$ is a connected smooth manifold, and at a point $A \in \mathbb{A}_n$ the tangent space is $T_A\mathbb{A}_n = \mathbb{M}_n$.

Proof. This follows by Lemma A.1, A.2, and A.3, and applying [30, Ex. 1.26].

Next, we prove the following characterization of the manifold.
Theorem 3.2. $\mathbb{A}_n \cong \mathbb{P}_n \times \mathbb{AU}_n$, where $\mathbb{P}_n$ and $\mathbb{AU}_n$ are embedded submanifolds.

This theorem follows as a corollary to the following proposition.
Proposition 3.3.
For $A \in \mathbb{A}_n$, let $A = PUP$ be the symmetric polar decomposition. Then the mapping $A \mapsto (P^2, U)$ is a diffeomorphism between the smooth manifolds $\mathbb{A}_n$ and $\mathbb{P}_n \times \mathbb{AU}_n$.

Proof. Since $\mathbb{P}_n$ and $\mathbb{AU}_n$ are smooth manifolds (see [11, Chp. 6] and Lemma A.4, respectively), $\mathbb{P}_n \times \mathbb{AU}_n$ is also a smooth manifold [30, Ex. 1.34]. Next, note that the matrix square root is a diffeomorphism of $\mathbb{P}_n$ to itself, with the matrix square as the inverse, cf. [23, Thm. 7.2.6]. Therefore, it suffices to show that the mapping $A \mapsto (P, U)$ is a diffeomorphism between $\mathbb{A}_n$ and $\mathbb{P}_n \times \mathbb{AU}_n$. To this end, first observe that the latter is a bijection due to the existence and uniqueness of a symmetric polar decomposition [44, Thm. 3.1]. Moreover, that the inverse is smooth follows since the components in $A$ are polynomial in the components in $P$ and $U$.

To show that $P$ and $U$ are smooth in $A$, we note that since $A$ is strictly accretive, $H(A) \succ 0$ and hence
$$A = H(A) + i\big(\tfrac{1}{i}S(A)\big) = H(A)^{1/2}\big(I + iH(A)^{-1/2}\,\tfrac{1}{i}S(A)\,H(A)^{-1/2}\big)H(A)^{1/2} = H(A)^{1/2}KH(A)^{1/2},$$
where $K := I + iH(A)^{-1/2}\,\tfrac{1}{i}S(A)\,H(A)^{-1/2}$ is a normal matrix (cf. [46, Proof of Cor. 2.5]) which by construction depends smoothly on $A$. Now, let $K = V_KQ_K$ be the polar decomposition of $K$. Since $K$ depends smoothly on $A$, and since the polar decomposition is smooth in the matrix (Lemma A.5), $V_K$ and $Q_K$ are smooth in $A$. Moreover, since $K$ is normal, $V_K$ and $Q_K$ commute [45, Thm. 9.1], and thus $V_K$ and $Q_K^{1/2}$ commute. Therefore, $A = H(A)^{1/2}Q_K^{1/2}V_KQ_K^{1/2}H(A)^{1/2}$, where all components depend smoothly on $A$. Now, let $L := Q_K^{1/2}H(A)^{1/2}$ and let $L = V_LQ_L$ be the polar decomposition of $L$. Similar to before, $V_L$ and $Q_L$ are thus both smooth in $A$. Finally, we thus have that $A = Q_LV_L^*V_KV_LQ_L$, and since $V_L$ and $V_K$ are unitary so is $V_L^*V_KV_L$. By the uniqueness of the symmetric polar decomposition [44, Thm. 3.1] it follows that $P = Q_L$ and $U = V_L^*V_KV_L$, which are both smooth in $A$.
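The proof above is constructive, so its steps can be followed numerically. The following is a minimal NumPy sketch under stated assumptions, not code from the paper: the helper names `herm_sqrt`, `polar`, and `symmetric_polar` are hypothetical, the input is assumed to be strictly accretive, and the polar decomposition is computed from an SVD rather than by any method the authors prescribe.

```python
import numpy as np


def herm_sqrt(M):
    """Principal square root of a Hermitian positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.conj().T


def polar(M):
    """Polar decomposition M = V Q (V unitary, Q positive definite), via SVD."""
    W, s, Zh = np.linalg.svd(M)
    V = W @ Zh
    Q = (Zh.conj().T * s) @ Zh
    return V, Q


def symmetric_polar(A):
    """Return (P, U) with A = P U P, following the steps of the proof.

    Assumes A is strictly accretive, i.e., H(A) is positive definite.
    """
    H = (A + A.conj().T) / 2              # Hermitian part H(A)
    H_half = herm_sqrt(H)
    H_inv_half = np.linalg.inv(H_half)
    K = H_inv_half @ A @ H_inv_half       # normal matrix K
    V_K, Q_K = polar(K)                   # K = V_K Q_K, factors commute
    L = herm_sqrt(Q_K) @ H_half           # L = Q_K^{1/2} H(A)^{1/2}
    V_L, Q_L = polar(L)                   # L = V_L Q_L
    P = Q_L
    U = V_L.conj().T @ V_K @ V_L
    return P, U
```

For a strictly accretive input, one would expect `P @ U @ P` to reproduce `A` up to rounding, with `P` Hermitian positive definite and `U` unitary with positive definite Hermitian part.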
Proof of Theorem 3.2.
By Proposition 3.3, $\mathbb{A}_n \cong \mathbb{P}_n \times \mathbb{AU}_n$, and by [30, Prop. 5.3], both $\mathbb{P}_n \times \{I\} \cong \mathbb{P}_n$ and $\{I\} \times \mathbb{AU}_n \cong \mathbb{AU}_n$ are embedded submanifolds of $\mathbb{P}_n \times \mathbb{AU}_n$.

Footnote: To see this, note that a matrix $A$ is normal if and only if $H(A)$ and $\frac{1}{i}S(A)$ commute, see, e.g., [45, Thm. 9.1].

Remark 3.4. The results in Theorem 3.1 and 3.2 can easily be generalized to other subsets of sectorial matrices, namely any subset $\tilde{\mathbb{A}}_n \subset \mathbb{W}_n$ of all matrices $A$ such that there exist $\alpha, \beta \in \mathbb{R}$, $\alpha < \beta$, and $\beta - \alpha = \pi$, for which $\min_{k=1,\ldots,n} \phi_k(A) > \alpha$ and $\max_{k=1,\ldots,n} \phi_k(A) < \beta$. To see this, note that for $\tilde{A} = \tilde{P}\tilde{U}\tilde{P} \in \tilde{\mathbb{A}}_n$ we have that $A = \tilde{P}(e^{-i(\beta+\alpha)/2}\tilde{U})\tilde{P} \in \mathbb{A}_n$, i.e., that $e^{-i(\beta+\alpha)/2}\tilde{\mathbb{A}}_n = \mathbb{A}_n$ with a diffeomorphism between the components in the symmetric polar decomposition. Examples of such sets of matrices are the set of strictly dissipative matrices, i.e., matrices such that $\Re(A) \prec 0$ [28, p. 279], and matrices such that $\Im(A) \succ 0$.

$\mathbb{A}_n$ and their geodesics

In this section we introduce a family of Finsler structures on $\mathbb{A}_n$ and in particular we will characterize the geodesics and geodesic distances corresponding to these structures. To this end, by Theorem 3.2 we have that $\mathbb{A}_n \cong \mathbb{P}_n \times \mathbb{AU}_n$. Moreover, $\mathbb{P}_n$ and $\mathbb{U}_n$ are smooth manifolds that are well-studied in the literature, and since $\mathbb{P}_n$ and $\mathbb{AU}_n$ are embedded submanifolds of $\mathbb{A}_n$, a desired property would be that when restricted to either of the two embedded submanifolds the introduced Finsler structure yields the corresponding known Finsler structure. To this end, we first characterize the geodesics and the geodesic distance on $\mathbb{AU}_n$.

Proposition 4.1.
Let $\|\cdot\|_\Phi$ be any smooth unitarily invariant norm and consider the Finsler structure given by $F^\Phi_{\mathbb{AU}_n} : T\mathbb{AU}_n \to \mathbb{R}_+$, $F^\Phi_{\mathbb{AU}_n} : (U, X) \mapsto \|X\|_\Phi$. A geodesic between $U \in \mathbb{AU}_n$ and $V \in \mathbb{AU}_n$ is given by
$$\gamma_{\mathbb{AU}_n}(t) = Ue^{t\log(U^{-1}V)} = U^{1/2}(U^{-1/2}VU^{-1/2})^t U^{1/2}.$$
Moreover, the geodesic distance is given by
$$\delta^\Phi_{\mathbb{AU}_n}(U, V) = \|\log(U^{-1}V)\|_\Phi = \|\log(U^{-1/2}VU^{-1/2})\|_\Phi. \tag{4.1}$$

Proof.
Let $U, V \in \mathbb{AU}_n$. The proposition follows if we can show that a geodesic on $\mathbb{U}_n$ between $U$ and $V$, given by (2.5), remains in $\mathbb{AU}_n$ for $t \in [0, 1]$, and that $Z$ in (2.5) and (2.6) takes the form $Z = -i\log(U^*V)$. To show the latter, note that $e^{iZ} = U^{-1}V = U^*V$, where $\lambda(U^*V) \cap \mathbb{R}_- = \emptyset$ since both $U^*$ and $V$ are strictly accretive, see [43], [44, Thm. 6.2]. Therefore we can use the principal branch of the logarithm, which gives $Z = -i\log(U^*V)$. Next, to show that $\gamma_{\mathbb{U}_n}(t) \in \mathbb{AU}_n$ for $t \in [0, 1]$, note that $\gamma_{\mathbb{U}_n}(1/2) = U^{1/2}(U^{-1/2}VU^{-1/2})^{1/2}U^{1/2}$. By [16, Prop. 3.1 and Thm. 3.4], $\gamma_{\mathbb{U}_n}(1/2)$ is strictly accretive, and a repeated argument now gives that $\gamma_{\mathbb{U}_n}(t)$ is strictly accretive for a dense set of $t \in [0, 1]$. By continuity of the map $t \mapsto \gamma_{\mathbb{U}_n}(t)$, the result follows.

Next, let $\Phi_1$ and $\Phi_2$ be two smooth symmetric gauge functions and consider the Finsler manifolds $(\mathbb{P}_n, F^{\Phi_1}_{\mathbb{P}_n})$ and $(\mathbb{AU}_n, F^{\Phi_2}_{\mathbb{AU}_n})$ as defined in Proposition 2.1 and Proposition 4.1, respectively. In the Riemannian case, there is a canonical way to introduce a metric on $\mathbb{P}_n \times \mathbb{AU}_n$, namely the product metric [30, Ex. 13.2]. However, in the case of products of Finsler manifolds there is no canonical way to introduce a Finsler structure on a product space, cf. [9, Ex. 11.1.6], [39]. Here, we consider so-called (Minkowskian) product manifolds [39] and to this end we next define so-called (Minkowskian) product functions.

Definition 4.2 ([39]). A function
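$\Psi : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$ is called a product function if it satisfies the following conditions:
i) $\Psi(x_1, x_2) = 0$ if and only if $(x_1, x_2) = (0, 0)$,
ii) $\Psi(\alpha x_1, \alpha x_2) = \alpha\Psi(x_1, x_2)$ for all $(x_1, x_2) \in \mathbb{R}_+ \times \mathbb{R}_+$ and all $\alpha \in \mathbb{R}_+$,
iii) $\Psi$ is smooth on $\mathbb{R}_+ \times \mathbb{R}_+ \setminus \{(0, 0)\}$,
iv) $\partial_{x_\ell}\Psi \neq 0$ on $\mathbb{R}_+ \times \mathbb{R}_+ \setminus \{(0, 0)\}$, for $\ell = 1, 2$,
v) $\partial_{x_1}\Psi\, \partial_{x_2}\Psi - \partial_{x_1}\partial_{x_2}\Psi \neq 0$ on $\mathbb{R}_+ \times \mathbb{R}_+ \setminus \{(0, 0)\}$.

Footnote: Note that the latter has unfortunately also been termed dissipative in the literature, see [28, p. 279].

For any product function $\Psi$, $\big(\mathbb{P}_n \times \mathbb{AU}_n, \sqrt{\Psi((F^{\Phi_1}_{\mathbb{P}_n})^2, (F^{\Phi_2}_{\mathbb{AU}_n})^2)}\big)$ is a Finsler manifold [39], and we therefore define the Finsler manifold $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ as follows.

Definition 4.3.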
Let $\Phi_1$ and $\Phi_2$ be two smooth symmetric gauge functions, let $(\mathbb{P}_n, F^{\Phi_1}_{\mathbb{P}_n})$ and $(\mathbb{AU}_n, F^{\Phi_2}_{\mathbb{AU}_n})$ be the Finsler manifolds as defined in Proposition 2.1 and Proposition 4.1, respectively, and let $\Psi$ be a product function. The Finsler manifold $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ is defined via the diffeomorphism in Proposition 3.3 as
$$(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}) := \Big(\mathbb{P}_n \times \mathbb{AU}_n, \sqrt{\Psi\big((F^{\Phi_1}_{\mathbb{P}_n})^2, (F^{\Phi_2}_{\mathbb{AU}_n})^2\big)}\Big).$$

One particular example of a product function is $\Psi : (x_1, x_2) \mapsto x_1 + x_2$, which in [39] was called “the Euclidean product”, and in the Riemannian case this leads to the canonical product manifold. Moreover, the geodesics and geodesic distance can be characterized for general product functions $\Psi$. This leads to the following result.

Theorem 4.4.
Let $A, B \in \mathbb{A}_n$, and let $A = P_AU_AP_A$ and $B = P_BU_BP_B$ be the corresponding symmetric polar decompositions. On the Finsler manifold $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$, a geodesic from $A$ to $B$ is given by
$$\gamma_{\mathbb{A}_n}(t) = \gamma_{\mathbb{P}_n}(t)^{1/2} \cdot \gamma_{\mathbb{AU}_n}(t) \cdot \gamma_{\mathbb{P}_n}(t)^{1/2}, \tag{4.2a}$$
where
$$\gamma_{\mathbb{P}_n}(t) := P_A(P_A^{-1}P_BP_BP_A^{-1})^t P_A = P_A^2 e^{t\log(P_A^{-2}P_B^2)}, \tag{4.2b}$$
$$\gamma_{\mathbb{AU}_n}(t) := U_A^{1/2}(U_A^{-1/2}U_BU_A^{-1/2})^t U_A^{1/2} = U_Ae^{t\log(U_A^*U_B)}. \tag{4.2c}$$
Moreover, the geodesic distance from $A$ to $B$ is given by
$$\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A, B) = \sqrt{\Psi\big(\delta^{\Phi_1}_{\mathbb{P}_n}(P_AP_A, P_BP_B)^2,\ \delta^{\Phi_2}_{\mathbb{AU}_n}(U_A, U_B)^2\big)} = \sqrt{\Psi\big(\|\log(P_A^{-1}P_BP_BP_A^{-1})\|_{\Phi_1}^2,\ \|\log(U_A^*U_B)\|_{\Phi_2}^2\big)} \tag{4.3}$$
$$= \sqrt{\Psi\big(\Phi_1(\lambda(\log(P_A^{-1}P_BP_BP_A^{-1})))^2,\ \Phi_2(\varphi(U_A^{-1}U_B))^2\big)},$$
where $\varphi(\cdot)$ denotes the angles of the eigenvalues.

Footnote: In [39], the convention is that the Finsler structure is squared compared to the one in [9]. We follow the convention of the latter.
Footnote: Note that functions $\Psi$ fulfilling i)-v) are not necessarily symmetric gauge functions. As an example, consider $\Psi(x, y) = (x^2 + 3xy + y^2)^{1/2}$ [39, Rem. 6]; this is not a symmetric gauge function since in general $\Psi(x, y) \neq \Psi(x, -y)$. Conversely, symmetric gauge functions do not in general fulfill i)-v). As an example, consider $\Phi(x, y) = \max\{|x|, |y|\}$ [34, p. 138]; at any point $(x, y) \in \mathbb{R}_+ \times \mathbb{R}_+$ such that $x > y$, $\partial_y\Phi = 0$ and hence condition iv) is not fulfilled for this function.

Proof. That $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ is a Finsler manifold follows from the discussion leading up to the theorem; see [39]. Moreover, that (4.2) is a geodesic follows (by construction) by using [39, Thm. 3] together with Proposition 2.1 and Proposition 4.1; this also gives the first two equalities in (4.3). To prove the last equality, first observe that $P_A^{-1}P_BP_BP_A^{-1}$ is the geometric mean of $P_A$ and $P_B$ and hence positive definite [11, Thm. 4.1.3], [16, Sec. 3].
Therefore, $\log(P_A^{-1}P_BP_BP_A^{-1})$ is Hermitian and thus $\sigma(\log(P_A^{-1}P_BP_BP_A^{-1})) = |\lambda(\log(P_A^{-1}P_BP_BP_A^{-1}))|$. Similarly, $U_A^{-1}U_B$ is unitary and thus $\log(U_A^{-1}U_B)$ is skew-Hermitian. Therefore, $\lambda(\log(U_A^{-1}U_B)) = i\varphi(U_A^{-1}U_B)$ and hence $\sigma(\log(U_A^{-1}U_B)) = |\lambda(\log(U_A^{-1}U_B))| = |\varphi(U_A^{-1}U_B)|$. Finally, for unitarily invariant norms we have that $\|\cdot\|_\Phi = \Phi(\sigma(\cdot))$, and since for any symmetric gauge function $\Phi(|x|) = \Phi(x)$, the last equality follows.

Next, we derive some properties of the geodesic distance in Theorem 4.4.

Proposition 4.5.
For matrices $A, B \in (\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ we have that
1) $\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A^{-1}, B^{-1}) = \delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A, B)$,
2) $\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A^*, B^*) = \delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A, B)$,
3) $\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A^{-1}, A) = 2\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(I, A) = 2\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(I, A^{-1})$, and the geodesic midpoint between $A^{-1}$ and $A$ is $\gamma_{\mathbb{A}_n}(1/2) = I$,
4) for any $U \in \mathbb{U}_n$ we have that $\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(U^*AU, U^*BU) = \delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A, B)$.

Proof. To prove the statements, first note that $A^{-1} = P_A^{-1}U_A^{-1}P_A^{-1}$, and that $A^* = P_AU_A^{-1}P_A$. To prove 1), we observe that
$$\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A^{-1}, B^{-1}) = \sqrt{\Psi\big(\Phi_1(\lambda(\log(P_AP_B^{-1}P_B^{-1}P_A)))^2,\ \Phi_2(\varphi(U_AU_B^{-1}))^2\big)}.$$
For the positive definite part, we have that
$$\lambda(\log(P_AP_B^{-1}P_B^{-1}P_A)) = \lambda(-\log((P_AP_B^{-1}P_B^{-1}P_A)^{-1})) = -\lambda(\log(P_A^{-1}P_BP_BP_A^{-1})),$$
and since the symmetric gauge function $\Phi_1$ is invariant under sign changes the distance corresponding to the positive definite part is equal. Similarly, for the strictly accretive unitary part we have that $|\varphi(U_AU_B^{-1})| = |\varphi((U_AU_B^{-1})^*)| = |\varphi(U_BU_A^{-1})| = |\varphi(U_A^{-1}U_B)|$, where the first equality holds because the absolute values of the angles of the eigenvalues of a unitary matrix are invariant under the operation of taking the conjugate transpose, and the last equality follows since the angles of the eigenvalues are invariant under unitary congruence. Statement 2) follows by a similar argument. To prove statement 3),
$$\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A^{-1}, A) = \sqrt{\Psi\big(\Phi_1(\lambda(\log(P_A^{-4})))^2,\ \Phi_2(\varphi(U_A^{-2}))^2\big)} = \sqrt{\Psi\big(\Phi_1(-\lambda(\log(P_A^4)))^2,\ \Phi_2(-\varphi(U_A^2))^2\big)} = 2\sqrt{\Psi\big(\Phi_1(\lambda(\log(P_A^2)))^2,\ \Phi_2(\varphi(U_A))^2\big)} = 2\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(I, A),$$
where the second equality follows by an argument similar to previous ones, and the third equality follows from property ii) for symmetric gauge functions and property ii) for product functions. The second equality in 3) now follows from 1), and that $\gamma_{\mathbb{A}_n}(1/2) = I$ follows by a direct calculation using (4.2). Finally, to prove 4), simply note that $U^*AU = U^*P_AU_AP_AU = U^*P_AUU^*U_AUU^*P_AU$, i.e., the same unitary congruence transformation is applied to $P_A$ and $U_A$ individually. A direct calculation, using the unitary invariance of eigenvalues, the matrix logarithm, and the norms, gives the result.

Remark 4.6.
Note that $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ is in general not a complete metric space. In particular, in the Riemannian case, i.e., with symmetric gauge functions $\Phi_\ell : x \in \mathbb{R}^n \mapsto \sqrt{\sum_{k=1}^n x_k^2}$, $\ell = 1, 2$, and product function $\Psi : (x_1, x_2) \mapsto x_1 + x_2$, $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ is not a complete metric space since it is not geodesically complete [31, Thm. 6.19]. The latter is due to the fact that $(\mathbb{AU}_n, F^{\Phi_2}_{\mathbb{AU}_n})$ is not geodesically complete; for $(\mathbb{AU}_n, F^{\Phi_2}_{\mathbb{AU}_n})$ geodesics are not defined for all $t \in \mathbb{R}_+$ since they will reach the boundary. For an example of a Cauchy sequence that does not converge to an element in $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$, consider the sequence $(A_\ell)_{\ell=1}^\infty$, where $A_\ell = e^{i(\pi/2 - \pi/(2\ell))}I \in \mathbb{A}_n$ for all $\ell$. In this case, the geodesic distance between $A_\ell$ and $A_k$ is given by
$$\delta_{\mathbb{A}_n}(A_\ell, A_k) = \|Z^{(\ell,k)}\| = \sqrt{n}\,\frac{\pi}{2}\left|\frac{1}{\ell} - \frac{1}{k}\right|,$$
since $Z^{(\ell,k)} := -i\log(A_\ell^*A_k) = -i\log(e^{i(\pi/(2\ell) - \pi/(2k))}I) = \frac{\pi}{2}(1/\ell - 1/k)I$. Thus $(A_\ell)_{\ell=1}^\infty$ is a Cauchy sequence, however $\lim_{\ell\to\infty} A_\ell = e^{i\pi/2}I \notin \mathbb{A}_n$. Since the Hopf-Rinow theorem [31, Thm. 6.19] also carries over to Finsler geometry [9, Thm. 6.6.1], similar statements are true also in the general case.

Remark 4.7.
Using Remark 3.4, the above results can easily be generalized to the same subsets $\tilde{\mathbb{A}}_n$ of sectorial matrices. In fact, a direct calculation shows that all the algebraic expressions in Theorem 4.4 still hold in this case. However, statements 1)-3) of Proposition 4.5 use the fact that if $A \in \mathbb{A}_n$ then $A^{-1}, A^* \in \mathbb{A}_n$. This is in general not true for other sets $\tilde{\mathbb{A}}_n$.

By construction, on the Finsler manifold $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ the question “given $A \in \mathbb{A}_n$, which matrix $B \in \mathbb{P}_n$ is closest to $A$” has the answer “$B = P^2$, where $A = PUP$ is the symmetric polar decomposition.” Similarly, the corresponding question “which matrix $B \in \mathbb{AU}_n$ is closest to $A$” has the answer “$B = U$”. In this section we consider an application of the distance on the Finsler manifold $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ to another matrix approximation problem, namely finding the closest matrix of bounded log-rank, the definition of which is given in Section 5.1. Moreover, in Section 5.2 we consider the relation between the midpoint of geodesics on $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ and the geometric mean of strictly accretive matrices as introduced in [16].

For a positive definite matrix $P$, we define the log-rank as the rank of the matrix logarithm of $P$. This is equivalent to the number of eigenvalues of $P$ that are different from $1$. We denote this $\text{log-rank}_{\mathbb{P}_n}(\cdot)$. Analogously, for a unitary matrix $U$ the log-rank can be defined as the rank of the matrix logarithm of $U$, which is equivalent to the number of eigenvalues of $U$ with phase different from $0$. We denote this $\text{log-rank}_{\mathbb{U}_n}(\cdot)$. For strictly accretive matrices we define the log-rank as follows.

Definition 5.1. For $A \in \mathbb{A}_n$ with symmetric polar decomposition $A = PUP$, we define the log-rank of $A$ as
$$\text{log-rank}_{\mathbb{A}_n}(A) := \max\{\text{log-rank}_{\mathbb{P}_n}(P),\ \text{log-rank}_{\mathbb{U}_n}(U)\}.$$

We now consider the log-rank approximation problem: given $A \in \mathbb{A}_n$ find $A_r \in \mathbb{A}_n$, the latter with log-rank bounded by $r$, that is closest to $A$ in the geodesic distance $\delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}$.
This can be formulated as the optimization problem
\begin{align}
\inf_{A_r \in \mathbb{A}_n} \quad & \delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A_r, A) \tag{5.1a} \\
\text{subject to} \quad & \operatorname{log-rank}_{\mathbb{A}_n}(A_r) \leq r. \tag{5.1b}
\end{align}
Let $A_r = P_r U_r P_r$ be the symmetric polar decomposition. By properties i)-iv) in Definition 4.2 of product functions $\Psi$, for each such function the distance is nondecreasing in each argument separately. Therefore, by the form of the geodesic distance (4.3) and the definition of log-rank on $\mathbb{A}_n$, it follows that (5.1) splits into two separate problems over $\mathbb{P}_n$ and $\mathbb{AU}_n$, namely
\begin{align}
\inf_{P_r \in \mathbb{P}_n} \quad & \| \log(P_r^{-1} P^2 P_r^{-1}) \|_{\Phi_1} \tag{5.2a} \\
\text{subject to} \quad & \operatorname{log-rank}_{\mathbb{P}_n}(P_r) \leq r, \tag{5.2b}
\end{align}
and
\begin{align}
\inf_{U_r \in \mathbb{AU}_n} \quad & \| \log(U_r^* U) \|_{\Phi_2} \tag{5.3a} \\
\text{subject to} \quad & \operatorname{log-rank}_{\mathbb{U}_n}(U_r) \leq r. \tag{5.3b}
\end{align}
In fact, this gives the following theorem.

Theorem 5.2.
Assume that $\hat{P}_r$ and $\hat{U}_r$ are optimal solutions to (5.2) and (5.3), respectively. Then an optimal solution to (5.1) is given by $\hat{A}_r = \hat{P}_r \hat{U}_r \hat{P}_r$. Conversely, if (5.1) has an optimal solution $\hat{A}_r = \hat{P}_r \hat{U}_r \hat{P}_r$, then $\hat{P}_r$ and $\hat{U}_r$ are optimal solutions to (5.2) and (5.3), respectively.

In [47, Thm. 3] it was shown that (5.3) always has an optimal solution, and that it is the same for all symmetric gauge functions $\Phi_2$. More precisely, the optimal solution $\hat{U}_r$ is obtained from $U$ by setting the $n - r$ phases of $U$ with smallest absolute value equal to 0. That is, let $U = V^* D V$ be a diagonalization of $U$ where $D = \operatorname{diag}([e^{i \tilde{\phi}_k(U)}]_{k=1}^n)$ and where $[\tilde{\phi}_k(U)]_{k=1}^n$ are the phases of $U$ ordered so that $|\tilde{\phi}_1(U)| \geq |\tilde{\phi}_2(U)| \geq \ldots \geq |\tilde{\phi}_n(U)|$. Then
\[
\hat{U}_r = V^* \operatorname{diag}(e^{i \tilde{\phi}_1(U)}, \ldots, e^{i \tilde{\phi}_r(U)}, 1, \ldots, 1) V
\]
is the optimal solution to (5.3). In the same spirit, the optimal solution to (5.2) can be characterized as follows.

Proposition 5.3.
Let $P \in \mathbb{P}_n$, and let $P = V^* \operatorname{diag}(\lambda_1, \ldots, \lambda_n) V$ be a diagonalization of $P$ where the eigenvalues are ordered so that $|\log(\lambda_1)| \geq |\log(\lambda_2)| \geq \ldots \geq |\log(\lambda_n)|$. Then $\hat{P}_r = V^* \operatorname{diag}(\lambda_1, \ldots, \lambda_r, 1, \ldots, 1) V$ is a minimizer to (5.2) for all symmetric gauge functions $\Phi_1$.

Proof. Clearly, $\operatorname{log-rank}_{\mathbb{P}_n}(\hat{P}_r) \leq r$ and hence $\hat{P}_r$ is feasible to (5.2). Next, $\| \log(P_r^{-1} P^2 P_r^{-1}) \|_{\Phi_1} = \Phi_1(\lambda(\log(P_r^{-1} P^2 P_r^{-1})))$. Moreover, since $P_r^{-1} P^2 P_r^{-1} \in \mathbb{P}_n$, and hence is unitarily diagonalizable and has positive eigenvalues, we have that $\lambda(\log(P_r^{-1} P^2 P_r^{-1})) = \log(\lambda(P_r^{-1} P^2 P_r^{-1}))$. Now, to show that $\hat{P}_r$ is the minimizer to (5.2) for all symmetric gauge functions $\Phi_1$, it is equivalent to show that
\[
|\log(\lambda(\hat{P}_r^{-1} P^2 \hat{P}_r^{-1}))| \prec_w |\log(\lambda(P_r^{-1} P^2 P_r^{-1}))|
\]
for all $P_r$ such that $\operatorname{log-rank}_{\mathbb{P}_n}(P_r) \leq r$; see, e.g., [18, Thm. 4], [36, Thm. 1], [22, Sec. 3.5], [34, Prop. 4.B.6], [45, Thm. 10.35]. To this end, using [34, Thm. 9.H.1.f] (or [10, Cor. III.4.6], [45, Thm. 10.30]) we have that $\log(\lambda(P^2))^\downarrow - \log(\lambda(P_r^2))^\downarrow \prec \log(\lambda(P_r^{-1} P^2 P_r^{-1}))$, and by [10, Ex. II.3.5] we therefore have that $|\log(\lambda(P^2))^\downarrow - \log(\lambda(P_r^2))^\downarrow| \prec_w |\log(\lambda(P_r^{-1} P^2 P_r^{-1}))|$. Moreover, by a direct calculation it can be verified that $\log(\lambda(P^2))^\downarrow - \log(\lambda(\hat{P}_r^2))^\downarrow = \log(\lambda(\hat{P}_r^{-1} P^2 \hat{P}_r^{-1}))^\downarrow$ holds. Therefore, if we can show that
\[
|\log(\lambda(P^2))^\downarrow - \log(\lambda(\hat{P}_r^2))^\downarrow| \prec_w |\log(\lambda(P^2))^\downarrow - \log(\lambda(P_r^2))^\downarrow| \quad \text{for all } P_r \text{ such that } \operatorname{log-rank}_{\mathbb{P}_n}(P_r) \leq r, \tag{5.4}
\]
we would have that for all such $P_r$,
\begin{align*}
|\log(\lambda(\hat{P}_r^{-1} P^2 \hat{P}_r^{-1}))^\downarrow| &= |\log(\lambda(P^2))^\downarrow - \log(\lambda(\hat{P}_r^2))^\downarrow| \\
&\prec_w |\log(\lambda(P^2))^\downarrow - \log(\lambda(P_r^2))^\downarrow| \\
&\prec_w |\log(\lambda(P_r^{-1} P^2 P_r^{-1}))|,
\end{align*}
and by transitivity of preorders the result follows. To show (5.4), we formulate the following equivalent optimization problem: let $a = \log(\lambda(P^2))^\downarrow$ and consider
\begin{align*}
\min\nolimits_{\prec_w} \quad & |a - x| \\
\text{subject to} \quad & x \in \mathbb{R}^n, \ x_1 \geq x_2 \geq \ldots \geq x_n, \\
& \text{at most } r \text{ elements of } x \text{ are nonzero},
\end{align*}
where $\min_{\prec_w}$ is minimizing with respect to the preordering $\prec_w$. The solution to the latter is to take $x_i = a_i$ for the $r$ elements of $a$ with largest absolute value, and $x_i = 0$ otherwise.

The geometric mean of strictly accretive matrices, denoted by $A \# B$, was introduced in [16] as a generalization of the geometric mean for positive definite matrices [11, Chp. 4 and 6], [37]. In particular, in [16] it was shown that for $A, B \in \mathbb{A}_n$ there is a unique solution $G \in \mathbb{A}_n$ to the equation $G A^{-1} G = B$. This solution is given explicitly as
\[
A \# B := G = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2},
\]
which is the same algebraic expression as for the geometric mean of positive definite matrices. The geometric mean for positive definite matrices can also be interpreted as the midpoint on the geodesic connecting the matrices [11, Sec. 6.1.7]. With the Finsler geometry $(\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$, we can therefore get an alternative definition of the geometric mean between strictly accretive matrices as the geodesic midpoint. However, for $A, B \in (\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ we in general have that $\gamma_{\mathbb{A}_n}(1/2) \neq A \# B$. This can be seen by the following simple example.

Example 5.4.
Let $A = I$ and let $B = PUP \in (\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$. Then we have that $A \# B = B^{1/2} = (PUP)^{1/2}$ and $\gamma_{\mathbb{A}_n}(1/2) = P^{1/2} U^{1/2} P^{1/2}$. Thus, in general $A \# B \neq \gamma_{\mathbb{A}_n}(1/2)$. In fact, equality holds in this case if and only if $P$ and $U$ commute.

Footnote (to the majorization step based on [34, Thm. 9.H.1.f] in the proof of Proposition 5.3): take $V = P_r^{-1} P^2 P_r^{-1} \in \mathbb{P}_n$ and $U = P_r^2$ in [34, Thm. 9.H.1.f], and use the fact that the eigenvalues of $UV = P_r P^2 P_r^{-1}$ are invariant under the similarity transform $P_r^{-1} \cdot P_r$.

The geodesic midpoint between $A, B \in (\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$ can be expressed using the geometric mean as
\[
\gamma_{\mathbb{A}_n}(1/2) = (P_A^2 \# P_B^2)^{1/2} (U_A \# U_B) (P_A^2 \# P_B^2)^{1/2}, \tag{5.5}
\]
which follows directly from (4.2). Using this representation, we can characterize when $A \# B = \gamma_{\mathbb{A}_n}(1/2)$.

Lemma 5.5.
Let $A \in \mathbb{W}_n$, and let $A = VQ$ be its polar decomposition, where $V \in \mathbb{U}_n$ and $Q \in \mathbb{P}_n$. Then $A$ is normal if and only if $A = Q^{1/2} V Q^{1/2}$ is the symmetric polar decomposition of $A$.

Proof. First, using [21, Lem. 9] we conclude that since $A$ is sectorial, $V$ is also sectorial. Now, $A$ is normal if and only if $V$ and $Q$ commute [45, Thm. 9.1], which is true if and only if $V$ and $Q^{1/2}$ commute. Hence $A$ is normal if and only if $A = VQ = Q^{1/2} V Q^{1/2}$, and by the existence and uniqueness of the symmetric polar decomposition the result follows.

Lemma 5.6.
Let $A, B \in \mathbb{A}_n$ and $G = A \# B$. For all $X \in \mathbb{GL}_n$, the unique strictly accretive solution to $H (X^* A X)^{-1} H = X^* B X$ is $H = X^* G X$.

Proof. That $H = X^* G X$ solves the equation is easily verified by substitution. Moreover, that $H$ is unique follows from the uniqueness of the geometric mean for strictly accretive matrices [16, Sec. 3] and the fact that for any $X \in \mathbb{GL}_n$ we have that $X^* A X, X^* B X \in \mathbb{A}_n$.

Proposition 5.7.
For $A, B \in (\mathbb{A}_n, F^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n})$, let $A = P_A U_A P_A$ and $B = P_B U_B P_B$ be the corresponding symmetric polar decompositions. We have that $A \# B = \gamma_{\mathbb{A}_n}(1/2)$ if one of the following holds:
i) $U_A = U_B = I$,
ii) $P_A = P_B$,
iii) $A$ and $B$ are commuting normal matrices.

Proof. Using (5.5), the first statement follows immediately. To prove the second statement, let $A = P U_A P$ and $B = P U_B P$. From (5.5) it therefore follows that $\gamma_{\mathbb{A}_n}(1/2) = P (U_A \# U_B) P$. Using Lemma 5.6 with $G = U_A \# U_B$ and $X = P$, we therefore have that
\[
A \# B = (P U_A P) \# (P U_B P) = P (U_A \# U_B) P = \gamma_{\mathbb{A}_n}(1/2).
\]
To prove the third statement, by Lemma 5.5 we have that $P_A$ and $U_A$ commute, and that $P_B$ and $U_B$ commute. Moreover, since commuting normal matrices are simultaneously unitarily diagonalizable [23, Thm. 2.5.5], and since a unitary diagonalization is unique up to permutation of the eigenvalues and eigenvectors, it follows that $P_A$, $U_A$, $P_B$ and $U_B$ all commute. Using this together with (5.5), a direct calculation gives the result.

As noted in the above proof, if $A$ and $B$ are normal and commute they are also simultaneously unitarily diagonalizable [23, Thm. 2.5.5], i.e., $A = V^* \Lambda_A V$ and $B = V^* \Lambda_B V$ for some $V \in \mathbb{U}_n$. In this case, using Lemma 5.6 we have that $A \# B = V^* (\Lambda_A \# \Lambda_B) V$, and the geometric mean between $A$ and $B$ can thus be interpreted as the (independent) geometric mean between the corresponding pairs of eigenvalues. In fact, the latter observation can be generalized to all pairs of matrices that can be simultaneously diagonalized by congruence, although the elements of the diagonal matrices are not necessarily eigenvalues in this case (cf. Proposition 2.1).

Proposition 5.8.
Let $A, B \in \mathbb{A}_n$ and assume that $A = T^* D_A T$ and $B = T^* D_B T$, where $T \in \mathbb{GL}_n$ and where $D_A$ and $D_B$ are diagonal matrices. Then $A \# B = T^* (D_A \# D_B) T$.

Proof. A direct application of Lemma 5.6, with $G = D_A \# D_B$ and $X = T$, gives the result.

As a final remark, note that if $D_A$ and $D_B$ in Proposition 5.8 are unitary, then $A = T^* D_A T$ and $B = T^* D_B T$ are sectorial decompositions of $A$ and $B$, respectively. Now, let $T = VP$ be the polar decomposition of $T$, with $V \in \mathbb{U}_n$ and $P \in \mathbb{P}_n$. Hence we have that $A = P V^* D_A V P = P U_A P$ and $B = P V^* D_B V P = P U_B P$, i.e., $P$ is the positive definite part and $V^* D_A V$ and $V^* D_B V$ are the strictly accretive unitary parts in the symmetric polar decompositions of $A$ and $B$, respectively. From Proposition 5.7.ii) we therefore have that $A \# B = \gamma_{\mathbb{A}_n}(1/2)$ in this case.
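As a small numerical sanity check of the simplest commuting case, the following sketch compares the geometric mean $A \# B$ with the geodesic midpoint for diagonal strictly accretive matrices, which are commuting normal matrices as in Proposition 5.7.iii). For diagonal matrices both quantities act entrywise, so only scalars with positive real part are needed. The function names (`scalar_mean`, `scalar_midpoint`, `diagonal_check`) and the test values are hypothetical and chosen for illustration only; the midpoint is computed under the assumption that, for a scalar $a = p\,u\,p$, the positive part satisfies $p^2 = |a|$ and $u = e^{i \angle a}$.

```python
import cmath
import math


def scalar_mean(a, b):
    """Scalar case of A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    root_a = cmath.sqrt(a)  # principal square root, stays in the right half-plane
    # b / a has phase in (-pi, pi), so the principal square root is the right branch
    return root_a * cmath.sqrt(b / a) * root_a


def scalar_midpoint(a, b):
    """Scalar case of the geodesic midpoint written via geometric means."""
    p_a_sq, p_b_sq = abs(a), abs(b)                 # squared positive parts
    theta_a, theta_b = cmath.phase(a), cmath.phase(b)
    p_mid = math.sqrt(math.sqrt(p_a_sq * p_b_sq))   # square root of the mean of the squares
    u_mid = cmath.exp(0.5j * (theta_a + theta_b))   # mean of the two unitary parts
    return p_mid * u_mid * p_mid


def diagonal_check(diag_a, diag_b, tol=1e-12):
    """Compare both means entrywise for diagonal strictly accretive matrices."""
    for a, b in zip(diag_a, diag_b):
        if a.real <= 0 or b.real <= 0:
            raise ValueError("diagonal entries must have positive real part")
        if abs(scalar_mean(a, b) - scalar_midpoint(a, b)) > tol:
            return False
    return True


# Hypothetical diagonals of two strictly accretive matrices.
diag_a = [1 + 2j, 0.5 - 0.1j, 3 + 0j]
diag_b = [2 - 1j, 4 + 3j, 0.2 + 0.7j]
print(diagonal_check(diag_a, diag_b))
```

This check covers only commuting pairs. For non-commuting $A$ and $B$ the two quantities differ in general (Example 5.4), and a comparison would require matrix square roots rather than the scalar routines used here.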
In this work we have shown that the set of strictly accretive matrices is a smooth manifold that is diffeomorphic to a direct product of the smooth manifold of positive definite matrices and the smooth manifold of strictly accretive unitary matrices. Using this decomposition, we introduced a family of Finsler metrics and studied their geodesics and geodesic distances. Finally, we considered the matrix approximation problem of finding the closest strictly accretive matrix of bounded log-rank, and also discussed the relation between the geodesic midpoint and the previously introduced geometric mean between strictly accretive matrices.

There are several interesting ways in which these results can be extended. For example, in the case of positive definite matrices the geometric framework offered by the Riemannian manifold construction gives yet another interpretation of the geometric mean. In fact, the geometric mean between two positive definite matrices $A$ and $B$ is also the (unique) solution to the variational problem $\min_{G \in \mathbb{P}_n} \delta_{\mathbb{P}_n}(A, G)^2 + \delta_{\mathbb{P}_n}(B, G)^2$ [11, Sec. 6.2.8], [37, Prop. 3.5], and this interpretation can be used to extend the geometric mean to a mean between several matrices [37]. In a similar way, a geometric mean between the strictly accretive matrices $A_1, \ldots, A_N$ can be defined as the solution to
\[
\min_{G \in \mathbb{A}_n} \ \sum_{i=1}^N \delta^{\Phi_1,\Phi_2,\Psi}_{\mathbb{A}_n}(A_i, G)^2 ;
\]
however, such a generalization would need more investigation. For example, even in the case of the Riemannian metric on the manifold of positive definite matrices, analytically computing the geometric mean between several matrices is nontrivial [37, Prop. 3.4]. Nevertheless, there are efficient numerical algorithms for solving the latter problem, see, e.g., the survey [25] or the monograph [1] and references therein.

The idea of this work was to introduce a metric that separates the "magnitudes" and the "phases" of strictly accretive matrices. However, the similarities between the manifold of positive definite matrices and the manifold of unitary matrices raise a question about another potential geometry on $\mathbb{A}_n$ that does not explicitly use the product structure. More precisely, note that since all strictly accretive matrices have a unique, strictly accretive square root, the inner product on the tangent space $T_U \mathbb{AU}_n$, given by (2.4), can be defined analogously to the one on $\mathbb{P}_n$, given by (2.1), namely as
\[
\langle X, Y \rangle_U = \operatorname{tr}\big((U^{-1/2} X U^{-1/2})^* (U^{-1/2} Y U^{-1/2})\big) = \operatorname{tr}(X^* Y),
\]
for $U \in \mathbb{AU}_n$ and $X, Y \in T_U \mathbb{AU}_n$. Based on the similarities between the inner products, and the corresponding geodesics and geodesic distances, we ask the following question: for $A, B \in \mathbb{A}_n$ and $X, Y \in T_A \mathbb{A}_n$, if we define the inner product on $T_A \mathbb{A}_n$ as
\[
\langle X, Y \rangle_A = \operatorname{tr}\big((A^{-1/2} X A^{-1/2})^* (A^{-1/2} Y A^{-1/2})\big),
\]
what is the form of the geodesics and the geodesic distance?

Acknowledgments
The authors would like to thank Wei Chen, Dan Wang, Xin Mao, Di Zhao, and Chao Chen forvaluable discussions.
Appendix A Technical results from Section 3
The following lemmata are used in the proofs of Theorems 3.1 and 3.2.
Lemma A.1. $\mathbb{A}_n$ is an open set in $\mathbb{GL}_n$.

Proof. To show that $\mathbb{A}_n$ is open in $\mathbb{GL}_n$, note that since $\mathbb{H}_n \perp \mathbb{S}_n$, cf. [34, Thm. 10.B.1 and 10.B.2], [45, Prob. 10.7.20], we have that $A + B = H(A + B) + S(A + B)$. Moreover, $H(A + B) = H(A) + H(B)$, and since the set $\mathbb{P}_n$ is open in $\mathbb{H}_n$, $\mathbb{A}_n = \mathbb{P}_n \oplus \mathbb{S}_n$ is open in $\mathbb{GL}_n$.

Lemma A.2. $\mathbb{A}_n$ is connected.

Proof. By [30, Prop. 1.11], $\mathbb{A}_n$ is connected if and only if it is path-connected. To show the latter, it suffices to show that any $A \in \mathbb{A}_n$ is path-connected to $I$. To this end, let $A = T^* D T$ be the sectorial decomposition of $A$. A piecewise smooth path connecting $A$ and $I$ is given by
\[
\gamma(t) :=
\begin{cases}
T^* D^{1 - 2t} T, & \text{for } t \in [0, 1/2), \\
(T^* T)^{2 - 2t}, & \text{for } t \in [1/2, 1), \\
I, & \text{for } t = 1,
\end{cases}
\]
and hence $\mathbb{A}_n$ is connected.

Lemma A.3.
The tangent space at an $A \in \mathbb{A}_n$ is given by $T_A \mathbb{A}_n = \mathbb{M}_n$.

Proof. This follows since $\mathbb{A}_n$ is an open subset of $\mathbb{M}_n$.

Lemma A.4. $\mathbb{AU}_n$ is a connected smooth manifold and at a point $U \in \mathbb{AU}_n$ the tangent space is $T_U \mathbb{AU}_n = \mathbb{S}_n$.

Proof. Since $\mathbb{A}_n$ is open in $\mathbb{GL}_n$ (Lemma A.1), $\mathbb{AU}_n = \mathbb{U}_n \cap \mathbb{A}_n$ is open in $\mathbb{U}_n$ in the relative topology with respect to $\mathbb{GL}_n$. Thus it is a smooth manifold [30, Ex. 1.26]. Moreover, the proof of Lemma A.2 holds, mutatis mutandis, showing that it is connected. Finally, since it is open in $\mathbb{U}_n$, the tangent space at $U \in \mathbb{AU}_n$ is $T_U \mathbb{AU}_n = \mathbb{S}_n$, cf. [30, Prob. 8.29].

Lemma A.5 (Cf. [29, Prop. VII.2.5]). For $A \in \mathbb{GL}_n$, let $A = VQ$, where $V \in \mathbb{U}_n$ and $Q \in \mathbb{P}_n$, be the polar decomposition of $A$. The mapping $A \mapsto (V, Q)$ is a diffeomorphism between the manifolds $\mathbb{GL}_n$ and $\mathbb{U}_n \times \mathbb{P}_n$.

Proof. First, since $\mathbb{U}_n$ and $\mathbb{P}_n$ are smooth manifolds so is $\mathbb{U}_n \times \mathbb{P}_n$ [30, Ex. 1.34]. Next, for each $A \in \mathbb{GL}_n$ the polar decomposition is unique [23, Thm. 7.3.1], [45, Prob. 3.2.20], and for each pair of matrices $(V, Q) \in \mathbb{U}_n \times \mathbb{P}_n$ we have that $VQ \in \mathbb{GL}_n$; thus the mapping is bijective. Now, $A$ is smooth in $V$ and $Q$ since it is polynomial in the coefficients, i.e., the inverse mapping is smooth. To prove the converse, note that the components in the polar decomposition $A = VQ$ are given by $Q = (A^* A)^{1/2}$ and $V = A (A^* A)^{-1/2}$, cf. [23, p. 449], [45, p. 288]. Since $A \in \mathbb{GL}_n$, we have $A^* A \in \mathbb{P}_n$, and the matrix square root is a smooth function on $\mathbb{P}_n$, cf. [23, Thm. 7.2.6]. Therefore, since both $Q$ and $V$ are compositions of smooth functions of $A$, they both depend smoothly on the components of $A$.

References
[1] Pierre-Antoine Absil, Robert Mahony, and Rodolphe Sepulchre.
Optimization algorithms on matrix manifolds. Princeton University Press, Princeton, NJ, 2008.
[2] Shun-ichi Amari and Hiroshi Nagaoka. Methods of information geometry. American Mathematical Society, Providence, RI, 2000.
[3] Esteban Andruchow. Short geodesics of unitaries in the $L^2$ metric. Canadian Mathematical Bulletin, 48(3):340–354, 2005.
[4] Esteban Andruchow, Gabriel Larotonda, and Lázaro A. Recht. Finsler geometry and actions of the $p$-Schatten unitary groups. Transactions of the American Mathematical Society, 362(1):319–344, 2010.
[5] Esteban Andruchow and Lázaro A. Recht. Geometry of unitaries in a finite algebra: variation formulas and convexity. International Journal of Mathematics, 19(10):1223–1246, 2008.
[6] Jorge Antezana, Eduardo Ghiglioni, and Demetrio Stojanoff. Minimal curves in $U(n)$ and $Gl(n)^+$ with respect to the spectral and the trace norms. Journal of Mathematical Analysis and Applications, 483(2):123632, 2020.
[7] Jorge Antezana, Gabriel Larotonda, and Alejandro Varela. Optimal paths for symmetric actions in the unitary group. Communications in Mathematical Physics, 328(2):481–497, 2014.
[8] Charles S. Ballantine and Charles R. Johnson. Accretive matrix products. Linear and Multilinear Algebra, 3(3):169–185, 1975.
[9] David Bao, Shiing-Shen Chern, and Zhongmin Shen. An introduction to Riemann-Finsler geometry. Springer, New York, NY, 2000.
[10] Rajendra Bhatia. Matrix analysis. Springer, New York, NY, 1997.
[11] Rajendra Bhatia. Positive definite matrices. Princeton University Press, Princeton, NJ, 2007.
[12] Wei Chen, Dan Wang, Sei Zhen Khong, and Li Qiu. Phase analysis of MIMO LTI systems. In , pages 6062–6067. IEEE, 2019.
[13] Xinyue Cheng and Zhongmin Shen. Finsler geometry. Springer, Berlin, Heidelberg, 2012.
[14] Shiing-Shen Chern. Finsler geometry is just Riemannian geometry without the quadratic restriction. Notices of the American Mathematical Society, 43(9):959–963, 1996.
[15] Charles R. DePrima and Charles R. Johnson. The range of $A^{-1}A^*$ in GL(n, C). Linear Algebra and its Applications, 9:209–222, 1974.
[16] Stephen Drury. Principal powers of matrices with positive definite real part. Linear and Multilinear Algebra, 63(2):296–301, 2015.
[17] Stephen Drury and Minghua Lin. Singular value inequalities for matrices with numerical ranges in a sector. Operators and Matrices, 8(4):1143–1148, 2014.
[18] Ky Fan. Maximum properties and inequalities for the eigenvalues of completely continuous operators. Proceedings of the National Academy of Sciences of the United States of America, 37(11):760, 1951.
[19] Susana Furtado and Charles R. Johnson. Spectral variation under congruence. Linear and Multilinear Algebra, 49(3):243–259, 2001.
[20] Karl E. Gustafson and Duggirala K.M. Rao. Numerical range. Springer, New York, NY, 1997.
[21] Alfred Horn and Robert Steinberg. Eigenvalues of the unitary part of a matrix. Pacific Journal of Mathematics, 9(2):541–550, 1959.
[22] Roger A. Horn and Charles R. Johnson. Topics in matrix analysis. Cambridge University Press, New York, NY, 1994.
[23] Roger A. Horn and Charles R. Johnson. Matrix analysis. Cambridge University Press, New York, NY, 2013.
[24] Roger A. Horn and Vladimir V. Sergeichuk. Canonical forms for complex matrix congruence and *congruence. Linear Algebra and its Applications, 416(2-3):1010–1032, 2006.
[25] Ben Jeuris, Raf Vandebril, and Bart Vandereycken. A survey and comparison of contemporary algorithms for computing the matrix geometric mean. Electronic Transactions on Numerical Analysis, 39:379–402, 2012.
[26] Charles R. Johnson and Susana Furtado. A generalization of Sylvester’s law of inertia. Linear Algebra and its Applications, 338(1-3):287–290, 2001.
[27] Jürgen Jost. Riemannian geometry and geometric analysis. Springer, Berlin, Heidelberg, 2008.
[28] Tosio Kato. Perturbation theory for linear operators. Springer, Berlin, Heidelberg, 1995.
[29] Serge Lang. Fundamentals of differential geometry. Springer, New York, NY, 1999.
[30] John M. Lee. Introduction to Smooth Manifolds. Springer, New York, NY, 2013.
[31] John M. Lee. Introduction to Riemannian Manifolds. Springer, Cham, 2018.
[32] Adrian S. Lewis. Group invariance and convex matrix analysis. SIAM Journal on Matrix Analysis and Applications, 17(4):927–949, 1996.
[33] Chi-Kwong Li and Nung-Sing Sze. Determinantal and eigenvalue inequalities for matrices with numerical ranges in a sector. Journal of Mathematical Analysis and Applications, 410(1):487–491, 2014.
[34] Albert W. Marshall, Ingram Olkin, and Barry C. Arnold. Inequalities: theory of majorization and its applications. Springer, New York, NY, 2nd edition, 2011.
[35] Roy Mathias. Matrices with positive definite Hermitian part: Inequalities and linear systems. SIAM Journal on Matrix Analysis and Applications, 13(2):640–654, 1992.
[36] Leon Mirsky. Symmetric gauge functions and unitarily invariant norms. The Quarterly Journal of Mathematics, 11(1):50–59, 1960.
[37] Maher Moakher. A differential geometric approach to the geometric mean of symmetric positive-definite matrices. SIAM Journal on Matrix Analysis and Applications, 26(3):735–747, 2005.
[38] Richard M. Murray, Zexiang Li, and S. Shankar Sastry. A mathematical introduction to robotic manipulation. CRC Press, Boca Raton, FL, 1994.
[39] Tsutomu Okada. Minkowskian product of Finsler spaces and Berwald connection. Journal of Mathematics of Kyoto University, 22(2):323–332, 1982.
[40] Barrett O’Neill. Semi-Riemannian geometry with applications to relativity. Academic Press, San Diego, CA, 1983.
[41] Karl Johan Åström and Richard M. Murray. Feedback systems. Princeton University Press, Princeton, NJ, 2008.
[42] Hideo Shimada and Vasile Sorin Sabău. Finsler geometry. In P.L. Antonelli, editor, Finslerian Geometries: A Meeting of Minds, pages 15–24. Springer, Dordrecht, 2000.
[43] Robert C. Thompson. On the eigenvalues of a product of unitary matrices I. Linear and Multilinear Algebra, 2(1):13–24, 1974.
[44] Dan Wang, Wei Chen, Sei Zhen Khong, and Li Qiu. On the phases of a complex matrix. Linear Algebra and its Applications, 593:152–179, 2020.
[45] Fuzhen Zhang. Matrix theory: basic results and techniques. Springer, New York, NY, 2011.
[46] Fuzhen Zhang. A matrix decomposition and its applications. Linear and Multilinear Algebra, 63(10):2033–2042, 2015.
[47] Di Zhao, Axel Ringh, Li Qiu, and Sei Zhen Khong. Low phase-rank approximation.