Orthogonality of matrices in the Ky Fan k-norms

Priyanka Grover*†
Department of Mathematics, School of Natural Sciences, Shiv Nadar University, Gautam Buddha Nagar, Uttar Pradesh 201314, India

*Email: [email protected]
†The author is supported by the research grant of the INSPIRE Faculty Award of the Department of Science and Technology, India.
Abstract
We obtain necessary and sufficient conditions for a matrix $A$ to be Birkhoff-James orthogonal to another matrix $B$ in the Ky Fan $k$-norms. A characterization for $A$ to be Birkhoff-James orthogonal to any subspace $\mathcal{W}$ of $M(n)$ is also obtained.

AMS classification:
Keywords:
Birkhoff-James orthogonality, Subdifferential, Singular value decomposition, Ky Fan norms, $k$-numerical range, Hausdorff-Toeplitz Theorem, Separating Hyperplane Theorem, Norm parallelism

1 Introduction

Let $M(n)$ be the space of $n \times n$ complex matrices and let $\|\cdot\|$ be any norm on $M(n)$. Let $A, B \in M(n)$. Then $A$ is said to be (Birkhoff-James) orthogonal to $B$ in $\|\cdot\|$ if
$$\|A + \lambda B\| \geq \|A\| \quad \text{for all } \lambda \in \mathbb{C}. \qquad (1.1)$$
In [5], Bhatia and Šemrl obtained a characterization for $A$ to be orthogonal to $B$ in the operator norm (also known as the spectral norm) $\|\cdot\|_\infty$. They showed that $A$ is orthogonal to $B$ in $\|\cdot\|_\infty$ if and only if there exists a unit vector $x \in \mathbb{C}^n$ such that $\|Ax\| = \|A\|_\infty$ and $\langle Ax, Bx \rangle = 0$. (All inner products in this note are conjugate linear in the first component and linear in the second component.) Different proofs of this result have been given in [7, 11, 12]. The result can be restated as follows. If $A = U|A|$ is a polar decomposition of $A$, then $A$ is orthogonal to $B$ in $\|\cdot\|_\infty$ if and only if there exists a unit vector $x \in \mathbb{C}^n$ such that $|A|x = \|A\|_\infty x$ and $\langle x, U^*Bx \rangle = 0$. In [5], it was also shown that if $\operatorname{tr} U^*B = 0$ then $A$ is orthogonal to $B$ in the trace norm $\|\cdot\|_1$, and that the converse is true if $A$ is taken to be invertible. Later, Li and Schneider [12] gave a characterization of orthogonality in $\|\cdot\|_1$ when $A$ need not be invertible. Let the rank of $A$ be $\ell$ and let $A = USV^*$ be a singular value decomposition of $A$. Let
$$B = U \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} V^*, \quad \text{where } B_{11} \in M(\ell),\ B_{22} \in M(n-\ell).$$
Then $\|A + \lambda B\|_1 \geq \|A\|_1$ for all $\lambda \in \mathbb{C}$ if and only if $|\operatorname{tr} B_{11}| \leq \|B_{22}\|_1$.

The trace norm and the operator norm are special cases of two classes of norms, namely the Schatten $p$-norms $\|\cdot\|_p$ and the Ky Fan $k$-norms $\|\cdot\|_{(k)}$. In [5] and [12], the authors investigated necessary and sufficient conditions for orthogonality of matrices in $\|\cdot\|_p$, $1 \leq p \leq \infty$. In this note, we obtain characterizations of orthogonality of matrices in $\|\cdot\|_{(k)}$, $1 \leq k \leq n$. Let $s_1(A) \geq s_2(A) \geq \cdots \geq s_n(A) \geq 0$ be the singular values of $A$. Then $\|A\|_{(k)}$ is defined as
$$\|A\|_{(k)} = s_1(A) + s_2(A) + \cdots + s_k(A). \qquad (1.2)$$
The cases $k = 1$ and $k = n$ correspond to the operator norm $\|\cdot\|_\infty$ and the trace norm $\|\cdot\|_1$, respectively.
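As a quick numerical illustration of (1.2) (our own sketch, not part of the original text): the norm is a partial sum of singular values, so it is one line in Python/numpy; the helper name `ky_fan` is ours.

```python
import numpy as np

def ky_fan(A, k):
    """Ky Fan k-norm: sum of the k largest singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)  # singular values, in decreasing order
    return s[:k].sum()

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

# k = 1 gives the operator (spectral) norm, k = n gives the trace norm.
assert np.isclose(ky_fan(A, 1), np.linalg.norm(A, 2))
assert np.isclose(ky_fan(A, 5), np.linalg.norm(A, 'nuc'))
```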
We show the following.

Theorem 1.1. Let $A = U|A|$ be a polar decomposition of $A$. If there exist $k$ orthonormal vectors $u_1, u_2, \ldots, u_k$ such that
$$|A|u_i = s_i(A)u_i \quad \text{for all } 1 \leq i \leq k \qquad (1.3)$$
and
$$\sum_{i=1}^{k} \langle u_i, U^*Bu_i \rangle = 0, \qquad (1.4)$$
then $A$ is orthogonal to $B$ in $\|\cdot\|_{(k)}$. If $s_k(A) > 0$, then the converse is also true.
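The sufficient condition in Theorem 1.1 is easy to test numerically: build the $u_i$ from a singular value decomposition of $A$ and force (1.4) by subtracting a suitable correction from a random $B$. A minimal sketch (ours), assuming the generic setup where the unitary polar factor is obtained from a full SVD:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 3

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W, s, Vh = np.linalg.svd(A)
U = W @ Vh                 # unitary polar factor: A = U|A|, |A| = V diag(s) V*
V = Vh.conj().T            # columns u_i satisfy |A| u_i = s_i(A) u_i

# Enforce condition (1.4) on a random B by removing the offending component.
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
P = V[:, :k] @ V[:, :k].conj().T
c = np.trace(V[:, :k].conj().T @ U.conj().T @ B @ V[:, :k])
B -= (c / k) * (U @ P)
assert abs(np.trace(V[:, :k].conj().T @ U.conj().T @ B @ V[:, :k])) < 1e-10

def ky_fan(X, k):
    return np.linalg.svd(X, compute_uv=False)[:k].sum()

# Sufficiency: ||A + lambda B||_(k) >= ||A||_(k) for sampled complex lambda.
for lam in rng.standard_normal(200) + 1j * rng.standard_normal(200):
    assert ky_fan(A + lam * B, k) >= ky_fan(A, k) - 1e-10
```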
The next theorem gives a more general characterization.

Theorem 1.2.
Let $A = USV^*$ be a singular value decomposition of $A$. Let the multiplicity of $s_k(A)$ be $r + q$, where $r \geq 0$ and $q \geq 1$, such that $s_{k-q+1}(A) = \cdots = s_{k+r}(A)$. Let
$$B = U \begin{pmatrix} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{pmatrix} V^*, \quad \text{where } B_{11} \in M(k-q),\ B_{22} \in M(r+q),\ B_{33} \in M(n-k-r).$$

(a) Let $s_k(A) > 0$. Then $A$ is orthogonal to $B$ in $\|\cdot\|_{(k)}$ if and only if there exists a positive semidefinite matrix $T \in M(r+q)$ with $\lambda_1(T) \leq 1$ and $\sum_{j=1}^{r+q} \lambda_j(T) = q$ such that $\operatorname{tr} B_{11} + \operatorname{tr}(T^*B_{22}) = 0$.

(b) Let $s_k(A) = 0$. Then $A$ is orthogonal to $B$ in $\|\cdot\|_{(k)}$ if and only if there exists $T \in M(n-k+q, r+q)$ with $s_1(T) \leq 1$ and $\sum_{j=1}^{r+q} s_j(T) \leq q$ such that
$$\operatorname{tr} B_{11} + \operatorname{tr}\left(T^* \begin{pmatrix} B_{22} \\ B_{32} \end{pmatrix}\right) = 0.$$

Let $\mathcal{W}$ be any subspace of $M(n)$. Then $A$ is said to be orthogonal to $\mathcal{W}$ (in the Birkhoff-James sense) in a given norm $\|\cdot\|$ on $M(n)$ if
$$\|A + W\| \geq \|A\| \quad \text{for all } W \in \mathcal{W}. \qquad (1.5)$$
In [10], we obtained a necessary and sufficient condition for $A$ to be orthogonal to $\mathcal{W}$ in the operator norm. Our next theorem gives a characterization for $A$ to be orthogonal to $\mathcal{W}$ in $\|\cdot\|_{(k)}$.
Theorem 1.3. Let $A = U|A|$ be a polar decomposition of $A$. Let $\mathcal{W}$ be any subspace of $M(n)$. If there exist density matrices $P_1, P_2, \ldots, P_k$ such that $\|\sum_{i=1}^{k} P_i\|_\infty \leq 1$, $|A|P_i = s_i(A)P_i$ $(1 \leq i \leq k)$ and $U\sum_{i=1}^{k} P_i \in \mathcal{W}^\perp$, then $A$ is orthogonal to $\mathcal{W}$ in $\|\cdot\|_{(k)}$. If $s_k(A) > 0$, then the converse is also true.

If $m_i(A)$ is the multiplicity of $s_i(A)$, then the condition $|A|P_i = s_i(A)P_i$ implies that the range of $P_i$ is a subspace of the eigenspace of $|A|$ corresponding to $s_i(A)$. So $\operatorname{rank} P_i$ is at most $m_i(A)$.

The problem of finding characterizations of orthogonality of a matrix to a subspace $\mathcal{W}$ of $M(n)$ is closely related to best approximation problems [18]. A specific question is: when is the zero matrix a best approximation to $A$ from $\mathcal{W}$? This is the same as asking when $A$ is orthogonal to $\mathcal{W}$.

In [12], the authors studied a characterization of orthogonality in the induced matrix norms. Benítez, Fernández and Soriano [6] showed that a necessary and sufficient condition for the norm of a real finite dimensional normed space $X$ to be induced by an inner product is that for any bounded linear operators $A, B$ from $X$ into itself, $A$ is orthogonal to $B$ if and only if there exists a unit vector $x \in X$ such that $\|Ax\| = \|A\|$ and $\langle Ax, Bx \rangle = 0$. More results in this direction have been obtained recently in [15, 16]. Characterizations of orthogonality on Hilbert $C^*$-modules have been studied in [1, 2, 3, 7].

To obtain the proofs of the above theorems, we use methods that we had introduced in [7] and [10]. We first obtain some new expressions for the subdifferential of the map taking a matrix $A$ to its Ky Fan $k$-norm $\|A\|_{(k)}$ in Section 2. The proofs of the above theorems are given in Section 3, followed by some remarks in Section 4.
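Before turning to the subdifferential computations, here is a small numerical sketch of the sufficient condition in Theorem 1.3 (our own illustration: we take rank-one $P_i = u_iu_i^*$, and, as an assumed example of a subspace, let $\mathcal{W}$ be the hyperplane of matrices trace-orthogonal to $U(P_1 + \cdots + P_k)$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 5, 3

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W_, s, Vh = np.linalg.svd(A)
U, V = W_ @ Vh, Vh.conj().T

# Rank-one density matrices P_i = u_i u_i^*: |A|P_i = s_i(A)P_i, ||sum P_i||_inf = 1.
Q = V[:, :k] @ V[:, :k].conj().T      # Q = P_1 + ... + P_k (an orthogonal projection)
G = U @ Q                             # the theorem requires U Q to lie in W-perp

def ky_fan(X, k):
    return np.linalg.svd(X, compute_uv=False)[:k].sum()

# Sample members of W by projecting out the G-component (trace inner product).
for _ in range(200):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Wmat = X - (np.trace(G.conj().T @ X) / np.trace(G.conj().T @ G)) * G
    assert ky_fan(A + Wmat, k) >= ky_fan(A, k) - 1e-10   # inequality (1.5)
```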
2 The subdifferential of the Ky Fan k-norm

Let $X$ be a Banach space and let $f : X \to \mathbb{R}$ be a convex function.

Definition 2.1. A subgradient of $f$ at $a \in X$ is an element $\varphi$ of the dual space $X^*$ such that
$$f(y) - f(a) \geq \operatorname{Re} \varphi(y - a) \quad \text{for all } y \in X. \qquad (2.1)$$
The subdifferential of $f$ at $a$ is the set of bounded linear functionals $\varphi \in X^*$ satisfying (2.1) and is denoted by $\partial f(a)$. It is a non-empty weak* compact convex subset of $X^*$. For more details, see [9, Chapter D] and [21, Chapter 2].

The following proposition is a direct consequence of the definition of the subdifferential. It is one of the most useful tools that we require in Section 3.

Proposition 2.2. A continuous convex function $f : X \to \mathbb{R}$ attains its minimum value at $a$ if and only if $0 \in \partial f(a)$.

An equivalent definition of the subdifferential of a continuous convex function can be given in terms of $f'_+(a, x)$, the right directional derivative of $f$ at $a$ in the direction $x$:
$$\partial f(a) = \{\varphi \in X^* : \operatorname{Re} \varphi(x) \leq f'_+(a, x) \text{ for all } x \in X\}. \qquad (2.2)$$
Moreover, for each $x \in X$,
$$f'_+(a, x) = \max\{\operatorname{Re} \varphi(x) : \varphi \in \partial f(a)\}. \qquad (2.3)$$
The following rule of subdifferential calculus will be helpful in our analysis later.
Proposition 2.3. Let $X$ and $Y$ be Banach spaces. Let $S : X \to Y$ be a bounded linear map and let $L : X \to Y$ be the continuous affine map defined by $L(x) = S(x) + y$, for some $y \in Y$. Let $g : Y \to \mathbb{R}$ be a continuous convex function. Then
$$\partial(g \circ L)(a) = S^*\partial g(L(a)) \quad \text{for all } a \in X, \qquad (2.4)$$
where $S^*$ denotes the real or complex adjoint of $S$ (depending on whether $X$ and $Y$ are both real or both complex Banach spaces).

For any norm $\|\cdot\|$ on the space $M(n)$, it is well known that
$$\partial\|A\| = \{G \in M(n) : \|A\| = \operatorname{Re} \operatorname{tr}(G^*A),\ \|G\|^* \leq 1\}, \qquad (2.5)$$
where $\|\cdot\|^*$ is the dual norm of $\|\cdot\|$, and
$$\|T\| = \sup_{\|X\|^* = 1} |\operatorname{tr}(T^*X)| = \sup_{\|X\|^* = 1} \operatorname{Re} \operatorname{tr}(T^*X). \qquad (2.6)$$
The subdifferentials of some classes of matrix norms, namely unitarily invariant norms and induced norms, have been computed by Watson [19]. The following expression for the subdifferential of the Ky Fan $k$-norms was also given by him in [20].
Let $1 \leq k \leq n$. Let the multiplicity of $s_k(A)$ be $r + q$, where $r \geq 0$ and $q \geq 1$, such that $s_{k-q+1}(A) = \cdots = s_{k+r}(A)$. Let $g : M(n) \to \mathbb{R}$ be the function defined as $g(A) = \|A\|_{(k)}$.

Theorem 2.4 ([20]). Let $A = USV^*$ be a singular value decomposition of $A$ and let the matrices $U, V$ be partitioned as $U = [U_1 : U_2 : U_3]$ and $V = [V_1 : V_2 : V_3]$, where $U_1, V_1 \in M(n, k-q)$; $U_2, V_2 \in M(n, r+q)$; $U_3, V_3 \in M(n, n-k-r)$. If $s_k(A) > 0$, then $G \in \partial g(A)$ if and only if there exists a positive semidefinite matrix $T \in M(r+q)$ with $\lambda_1(T) \leq 1$ and $\sum_{j=1}^{r+q} \lambda_j(T) = q$ such that $G = U_1V_1^* + U_2TV_2^*$. If $s_k(A) = 0$, then $G \in \partial g(A)$ if and only if there exists $T \in M(n-k+q, r+q)$ with $s_1(T) \leq 1$ and $\sum_{j=1}^{r+q} s_j(T) \leq q$ such that $G = U_1V_1^* + [U_2 : U_3]TV_2^*$.
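Theorem 2.4 can be sanity-checked against the defining inequality (2.1). In the sketch below (ours), $A$ is diagonal with $s_2(A) = s_3(A)$, so for $k = 2$ we have $q = 1$, $r = 1$ and may take $U = V = I$; the particular choice $T = \frac{q}{r+q}I$ is positive semidefinite with $\lambda_1(T) \leq 1$ and trace $q$.

```python
import numpy as np

def ky_fan(X, k):
    return np.linalg.svd(X, compute_uv=False)[:k].sum()

n, k = 4, 2
A = np.diag([3.0, 2.0, 2.0, 1.0])   # s_2 = s_3 = 2, so q = 1, r = 1; U = V = I
q, r = 1, 1

G = np.zeros((n, n))
G[0, 0] = 1.0                            # U_1 V_1^* block
G[1:3, 1:3] = (q / (r + q)) * np.eye(2)  # U_2 T V_2^* block with T = (q/(r+q)) I

rng = np.random.default_rng(3)
for _ in range(500):
    Y = rng.standard_normal((n, n))
    # Subgradient inequality (2.1) for g = ||.||_(k):
    assert ky_fan(Y, k) >= ky_fan(A, k) + np.trace(G.T @ (Y - A)) - 1e-10
```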
We obtain new formulas for $\partial g(A)$ that can be used more easily in our problem. The computations are similar to the ones in [19]. To do so, we first calculate $g'_+(A, \cdot)$. For this, an important thing to observe is that the Ky Fan $k$-norm of a matrix $A$ is also given by
$$\|A\|_{(k)} = \max_{\substack{U, V \in M(n,k) \\ U^*U = V^*V = I_k}} \operatorname{Re} \operatorname{tr} U^*AV = \max_{\substack{U, V \in M(n,k) \\ U^*U = V^*V = I_k}} |\operatorname{tr} U^*AV|. \qquad (2.7)$$
(See [13, p. 791].) If $A$ is positive semidefinite, then
$$\|A\|_{(k)} = \max_{\substack{U \in M(n,k) \\ U^*U = I_k}} \operatorname{tr} U^*AU. \qquad (2.8)$$
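Formula (2.7) can itself be checked numerically: the frames formed by the top $k$ left and right singular vectors attain the maximum, while random isometries never exceed it. A short sketch (ours):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 6, 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W, s, Vh = np.linalg.svd(A)

ky = s[:k].sum()
# The top-k singular vector frames attain the maximum in (2.7):
assert np.isclose(np.real(np.trace(W[:, :k].conj().T @ A @ Vh.conj().T[:, :k])), ky)

# Random isometries U, V with U*U = V*V = I_k stay below it:
for _ in range(200):
    U = np.linalg.qr(rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k)))[0]
    V = np.linalg.qr(rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k)))[0]
    assert abs(np.trace(U.conj().T @ A @ V)) <= ky + 1e-10
```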
Theorem 2.5.
For $X \in M(n)$,
$$g'_+(A, X) = \max_{\substack{u_1, \ldots, u_k \text{ o.n.},\ v_1, \ldots, v_k \text{ o.n.} \\ Av_i = s_i(A)u_i}} \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xv_i \rangle. \qquad (2.9)$$

Proof.
From (2.7), we have
$$\|A\|_{(k)} = \max_{\substack{u_1, \ldots, u_k \text{ o.n.} \\ v_1, \ldots, v_k \text{ o.n.}}} \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Av_i \rangle. \qquad (2.10)$$
For any sets of $k$ orthonormal vectors $u_1, \ldots, u_k$ and $v_1, \ldots, v_k$ satisfying $Av_i = s_i(A)u_i$, $1 \leq i \leq k$, we have
$$\|A + tX\|_{(k)} \geq \sum_{i=1}^{k} \operatorname{Re} \langle u_i, (A + tX)v_i \rangle = \sum_{i=1}^{k} s_i(A) + t\sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xv_i \rangle = \|A\|_{(k)} + t\sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xv_i \rangle.$$
This gives, for $t > 0$,
$$\frac{\|A + tX\|_{(k)} - \|A\|_{(k)}}{t} \geq \max_{\substack{u_1, \ldots, u_k \text{ o.n.},\ v_1, \ldots, v_k \text{ o.n.} \\ Av_i = s_i(A)u_i}} \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xv_i \rangle. \qquad (2.11)$$
Now for any sets of $k$ orthonormal vectors $u_1(t), \ldots, u_k(t)$ and $v_1(t), \ldots, v_k(t)$ satisfying
$$(A + tX)v_i(t) = s_i(A + tX)u_i(t), \quad 1 \leq i \leq k, \qquad (2.12)$$
we have
$$\|A\|_{(k)} \geq \sum_{i=1}^{k} \operatorname{Re} \langle u_i(t), Av_i(t) \rangle = \sum_{i=1}^{k} s_i(A + tX) - t\sum_{i=1}^{k} \operatorname{Re} \langle u_i(t), Xv_i(t) \rangle = \|A + tX\|_{(k)} - t\sum_{i=1}^{k} \operatorname{Re} \langle u_i(t), Xv_i(t) \rangle.$$
So for each $t > 0$, we obtain
$$\frac{\|A + tX\|_{(k)} - \|A\|_{(k)}}{t} \leq \sum_{i=1}^{k} \operatorname{Re} \langle u_i(t), Xv_i(t) \rangle. \qquad (2.13)$$
Consider a sequence $\{t_n\}$ of positive real numbers converging to zero as $n \to \infty$. Since the unit ball in $\mathbb{C}^n$ is compact, there exists a subsequence $\{t_{n_m}\}$ of $\{t_n\}$ such that for each $1 \leq i \leq k$, there exist $u'_i$ and $v'_i$ such that $\{u_i(t_{n_m})\}$ and $\{v_i(t_{n_m})\}$ converge to $u'_i$ and $v'_i$, respectively, as $m \to \infty$. Then the sets of vectors $u'_1, \ldots, u'_k$ and $v'_1, \ldots, v'_k$ are orthonormal. By continuity of singular values, we also know that
$$s_i(A + t_{n_m}X) \to s_i(A) \quad \text{as } m \to \infty. \qquad (2.14)$$
Hence we obtain $Av'_i = s_i(A)u'_i$ for all $1 \leq i \leq k$. By (2.13), we get that
$$g'_+(A, X) = \lim_{m \to \infty} \frac{\|A + t_{n_m}X\|_{(k)} - \|A\|_{(k)}}{t_{n_m}} \leq \max_{\substack{u_1, \ldots, u_k \text{ o.n.},\ v_1, \ldots, v_k \text{ o.n.} \\ Av_i = s_i(A)u_i}} \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xv_i \rangle. \qquad (2.15)$$
Combining this with (2.11), we obtain the required result.

The above proof works equally well if the maximum in (2.9) is taken over the sets of orthonormal vectors $u_1, \ldots, u_k$ and $v_1, \ldots, v_k$ such that for each $1 \leq i \leq k$, $u_i$ and $v_i$ are left and right singular vectors of $A$, respectively, corresponding to the $i$th singular value $s_i(A)$ of $A$. We note here that for each $t > 0$, if along with (2.12) we also have $(A + tX)^*u_i(t) = s_i(A + tX)v_i(t)$, then by passing onto a subsequence $\{t_{n_m}\}$ as in the above proof, and taking the limit as $m \to \infty$, we obtain $A^*u'_i = s_i(A)v'_i$. So for each $X \in M(n)$, we get
$$g'_+(A, X) = \max_{\substack{u_1, \ldots, u_k \text{ o.n.},\ v_1, \ldots, v_k \text{ o.n.} \\ Av_i = s_i(A)u_i,\ A^*u_i = s_i(A)v_i}} \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xv_i \rangle. \qquad (2.16)$$

Corollary 2.6.
Let $A$ be positive semidefinite. Let $\lambda_1(A) \geq \cdots \geq \lambda_n(A) \geq 0$ be the eigenvalues of $A$, with $\lambda_k(A) > 0$. Then
$$g'_+(A, X) = \max_{\substack{u_1, \ldots, u_k \text{ o.n.} \\ Au_i = \lambda_i(A)u_i}} \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xu_i \rangle. \qquad (2.17)$$

Proof.
We know that if $Av = \lambda u$ and $Au = \lambda v$, where $\lambda > 0$, then $u = v$. Using this, the required result follows from (2.16).
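For positive semidefinite $A$ with distinct eigenvalues the maximizing $u_i$ in (2.17) are just the top $k$ eigenvectors, so the corollary can be compared against a one-sided difference quotient. A rough numerical sketch (ours; the tolerance allows for the first-order error of the difference quotient):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 5, 3
lam = np.array([5.0, 4.0, 3.0, 2.0, 1.0])   # distinct eigenvalues, lambda_k(A) > 0
Q = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))[0]
A = Q @ np.diag(lam) @ Q.conj().T           # positive semidefinite

X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def ky_fan(M, k):
    return np.linalg.svd(M, compute_uv=False)[:k].sum()

# Formula (2.17): with distinct eigenvalues, the maximum is attained at the eigenvectors.
formula = sum(np.real(Q[:, i].conj() @ X @ Q[:, i]) for i in range(k))

t = 1e-6
fd = (ky_fan(A + t * X, k) - ky_fan(A, k)) / t   # one-sided difference quotient
assert abs(fd - formula) < 1e-4
```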
Theorem 2.7. Let $A \in M(n)$. Then
$$\partial g(A) = \operatorname{conv}\Big\{\sum_{i=1}^{k} u_iv_i^* : u_1, \ldots, u_k, v_1, \ldots, v_k \in \mathbb{C}^n,\ u_1, \ldots, u_k \text{ o.n.},\ v_1, \ldots, v_k \text{ o.n.},\ Av_i = s_i(A)u_i \text{ for all } 1 \leq i \leq k\Big\} \qquad (2.18)$$
$$= \operatorname{conv}\Big\{\sum_{i=1}^{k} u_iv_i^* : u_1, \ldots, u_k, v_1, \ldots, v_k \in \mathbb{C}^n,\ u_1, \ldots, u_k \text{ o.n.},\ v_1, \ldots, v_k \text{ o.n.},\ Av_i = s_i(A)u_i,\ A^*u_i = s_i(A)v_i \text{ for all } 1 \leq i \leq k\Big\}. \qquad (2.19)$$

Proof.
Denote the set on the right hand side of (2.18) by $H(A)$. Let $G \in H(A)$. Then $G = \sum_{i=1}^{k} u_iv_i^*$, where $u_1, \ldots, u_k$ and $v_1, \ldots, v_k$ are orthonormal sets of vectors such that $Av_i = s_i(A)u_i$ for all $1 \leq i \leq k$. So
$$\operatorname{Re} \operatorname{tr}(G^*A) = \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Av_i \rangle = \sum_{i=1}^{k} s_i(A) = \|A\|_{(k)},$$
and for any $X \in M(n)$,
$$\operatorname{Re} \operatorname{tr}(G^*X) = \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xv_i \rangle \leq \|X\|_{(k)}.$$
Thus $\|G\|^* \leq 1$. So we get by (2.5) that $H(A) \subseteq \partial g(A)$, and therefore $\operatorname{conv} H(A) \subseteq \partial g(A)$.

Now let $G \in \partial g(A)$. Suppose $G \notin \operatorname{conv} H(A)$. The set $H(A)$ is compact, and so is its convex hull. By the Separating Hyperplane Theorem, there exists $X \in M(n)$ such that for all sets of $k$ orthonormal vectors $u_1, \ldots, u_k$ and $v_1, \ldots, v_k$ satisfying $Av_i = s_i(A)u_i$ for $1 \leq i \leq k$, we have
$$\operatorname{Re} \operatorname{tr}\Big(X^*\Big(\sum_{i=1}^{k} u_iv_i^* - G\Big)\Big) < 0.$$
This implies
$$\max_{\substack{u_1, \ldots, u_k \text{ o.n.},\ v_1, \ldots, v_k \text{ o.n.} \\ Av_i = s_i(A)u_i}} \sum_{i=1}^{k} \operatorname{Re} \langle u_i, Xv_i \rangle < \max_{G' \in \partial g(A)} \operatorname{Re} \operatorname{tr}(X^*G').$$
By (2.3), the right hand side is $g'_+(A, X)$. By (2.9), it should be equal to the left hand side. This gives a contradiction. Thus we obtain (2.18). The expression (2.19) can be proved similarly, by using (2.16) instead of (2.9).
Corollary 2.8. Let $A$ be a positive semidefinite matrix, with eigenvalues $\lambda_1(A) \geq \cdots \geq \lambda_n(A) \geq 0$ and $\lambda_k(A) > 0$. Then
$$\partial g(A) = \operatorname{conv}\Big\{\sum_{i=1}^{k} u_iu_i^* : u_1, \ldots, u_k \in \mathbb{C}^n,\ u_1, \ldots, u_k \text{ o.n.},\ Au_i = \lambda_i(A)u_i \text{ for all } 1 \leq i \leq k\Big\}. \qquad (2.20)$$

3 Proofs of the theorems

To prove Theorem 1.1, we require the following lemma.
Lemma 3.1.
Let $X, Y \in M(n)$ and let $Y$ be positive semidefinite. Let $\lambda_1(Y) \geq \cdots \geq \lambda_n(Y) \geq 0$ be the eigenvalues of $Y$. For $1 \leq r \leq n$, let
$$W(X, Y) = \Big\{\sum_{i=1}^{r} \langle u_i, Xu_i \rangle : u_1, \ldots, u_r \in \mathbb{C}^n,\ u_1, \ldots, u_r \text{ o.n.},\ Yu_i = \lambda_i(Y)u_i \text{ for all } 1 \leq i \leq r\Big\}.$$
Then $W(X, Y)$ is a convex set.

Proof.
Let the number of distinct eigenvalues of $Y$ be $\ell$ and let $H_1, \ldots, H_\ell$ be the respective eigenspaces. Let $m_1, \ldots, m_\ell$ be the dimensions of $H_1, \ldots, H_\ell$, respectively. Let $1 \leq \ell' \leq \ell$ be such that $m_1 + \cdots + m_{\ell'-1} < r \leq m_1 + \cdots + m_{\ell'}$. Let $m = r - (m_1 + \cdots + m_{\ell'-1})$. Set
$$W_j(X) = \Big\{\sum_{i=1}^{m_j} \langle u_i, Xu_i \rangle : u_1, \ldots, u_{m_j} \in H_j,\ u_1, \ldots, u_{m_j} \text{ o.n.}\Big\}$$
for $1 \leq j \leq \ell' - 1$, and
$$W_{\ell'}(X) = \Big\{\sum_{i=1}^{m} \langle u_i, Xu_i \rangle : u_1, \ldots, u_m \in H_{\ell'},\ u_1, \ldots, u_m \text{ o.n.}\Big\}.$$
Since $H_1, \ldots, H_\ell$ are mutually orthogonal, we have
$$W(X, Y) = \sum_{j=1}^{\ell'} W_j(X). \qquad (3.1)$$
Note that $W_j(X)$ is a singleton set for $1 \leq j \leq \ell' - 1$. Hence it is sufficient to show that $W_{\ell'}(X)$ is convex. Let $P_{\ell'}$ be the orthogonal projection from $\mathbb{C}^n$ onto $H_{\ell'}$, and let $\iota_{\ell'}$ denote its adjoint (which is the inclusion map of $H_{\ell'}$ into $\mathbb{C}^n$). Then $W_{\ell'}(X)$ is the $m$-numerical range of $P_{\ell'}X\iota_{\ell'}$, which is convex (see [8, p. 315]).

We now state and prove a real version of Theorem 1.1.

Theorem 3.2.
Let $A = U|A|$ be a polar decomposition of $A$. If there exist $k$ orthonormal vectors $u_1, u_2, \ldots, u_k$ such that
$$|A|u_i = s_i(A)u_i \quad \text{for all } 1 \leq i \leq k \qquad (3.2)$$
and
$$\sum_{i=1}^{k} \operatorname{Re} \langle u_i, U^*Bu_i \rangle = 0, \qquad (3.3)$$
then
$$\|A + tB\|_{(k)} \geq \|A\|_{(k)} \quad \text{for all } t \in \mathbb{R}. \qquad (3.4)$$
If $s_k(A) > 0$, then the converse is also true.

Proof. First suppose that there exist $k$ orthonormal vectors $u_1, u_2, \ldots, u_k$ such that $|A|u_i = s_i(A)u_i$ for all $1 \leq i \leq k$ and $\sum_{i=1}^{k} \operatorname{Re} \langle u_i, U^*Bu_i \rangle = 0$. We have $\|A + tB\|_{(k)} = \||A| + tU^*B\|_{(k)}$ and, by (2.7),
$$\||A| + tU^*B\|_{(k)} \geq \sum_{i=1}^{k} \operatorname{Re} \langle u_i, (|A| + tU^*B)u_i \rangle.$$
So we get
$$\|A + tB\|_{(k)} \geq \sum_{i=1}^{k} \langle u_i, |A|u_i \rangle + t\sum_{i=1}^{k} \operatorname{Re} \langle u_i, U^*Bu_i \rangle = \sum_{i=1}^{k} s_i(A) = \|A\|_{(k)}.$$
Now suppose that $s_k(A) > 0$ and $\|A + tB\|_{(k)} \geq \|A\|_{(k)}$ for all $t \in \mathbb{R}$. This can also be written as
$$\||A| + tU^*B\|_{(k)} \geq \||A|\|_{(k)} \quad \text{for all } t \in \mathbb{R}. \qquad (3.5)$$
Let $S : \mathbb{R} \to M(n)$ be the map given by $S(t) = tU^*B$, let $L : \mathbb{R} \to M(n)$ be the map defined as $L(t) = |A| + tU^*B$ and let $g : M(n) \to \mathbb{R}_+$ be the map defined by $g(X) = \|X\|_{(k)}$. Then we have that $g \circ L$ attains its minimum at zero. By Proposition 2.2, we obtain that $0 \in \partial(g \circ L)(0)$. Using Proposition 2.3, we obtain
$$0 \in S^*\partial g(|A|). \qquad (3.6)$$
By Corollary 2.8, this is equivalent to saying that
$$0 \in \operatorname{conv}\Big\{\operatorname{Re} \sum_{i=1}^{k} \langle u_i, U^*Bu_i \rangle : u_1, \ldots, u_k \in \mathbb{C}^n,\ u_1, \ldots, u_k \text{ o.n.},\ |A|u_i = \lambda_i(|A|)u_i \text{ for all } 1 \leq i \leq k\Big\}.$$
The set in the above equation is $\operatorname{conv}(\operatorname{Re} W(U^*B, |A|))$. By Lemma 3.1, $\operatorname{Re} W(U^*B, |A|)$ is a convex set. So there exist $k$ orthonormal vectors $u_1, \ldots, u_k$ such that $|A|u_i = s_i(A)u_i$ and $\operatorname{Re} \sum_{i=1}^{k} \langle u_i, U^*Bu_i \rangle = 0$.
Proof of Theorem 1.1

Suppose that there exist $k$ orthonormal vectors $u_1, u_2, \ldots, u_k$ satisfying (1.3) and (1.4). Let $\lambda \in \mathbb{C}$. Then, similar to the argument in the proof of Theorem 3.2, we get
$$\|A + \lambda B\|_{(k)} = \||A| + \lambda U^*B\|_{(k)} \geq \Big|\sum_{i=1}^{k} \langle u_i, (|A| + \lambda U^*B)u_i \rangle\Big| = \Big|\sum_{i=1}^{k} \langle u_i, |A|u_i \rangle + \lambda\sum_{i=1}^{k} \langle u_i, U^*Bu_i \rangle\Big| = \sum_{i=1}^{k} s_i(A) = \|A\|_{(k)}.$$
So $A$ is orthogonal to $B$ in $\|\cdot\|_{(k)}$.

Conversely, let $s_k(A) > 0$ and suppose $A$ is orthogonal to $B$ in $\|\cdot\|_{(k)}$. So
$$\||A| + re^{i\theta}U^*B\|_{(k)} \geq \|A\|_{(k)} \quad \text{for all } r, \theta \in \mathbb{R}.$$
For $\theta \in \mathbb{R}$, let $B(\theta) = e^{i\theta}B$. Then we get $\||A| + rU^*B(\theta)\|_{(k)} \geq \|A\|_{(k)}$ for all $r \in \mathbb{R}$. By Theorem 3.2, there exist $k$ orthonormal vectors $u_1^{(\theta)}, \ldots, u_k^{(\theta)}$ such that $|A|u_j^{(\theta)} = s_j(A)u_j^{(\theta)}$ for all $1 \leq j \leq k$ and
$$\operatorname{Re} \sum_{j=1}^{k} \langle u_j^{(\theta)}, U^*B(\theta)u_j^{(\theta)} \rangle = 0, \quad \text{that is,} \quad \operatorname{Re}\Big(e^{i\theta}\sum_{j=1}^{k} \langle u_j^{(\theta)}, U^*Bu_j^{(\theta)} \rangle\Big) = 0. \qquad (3.7)$$
Now by Lemma 3.1, the set $W(U^*B, |A|)$ is convex in $\mathbb{C}$. It is also compact in $\mathbb{C}$. If $0 \notin W(U^*B, |A|)$, then by the Separating Hyperplane Theorem, there exists a $\theta$ such that
$$\operatorname{Re}\Big(e^{i\theta}\sum_{j=1}^{k} \langle u_j, U^*Bu_j \rangle\Big) > 0 \quad \text{for all } u_1, \ldots, u_k \text{ o.n. with } |A|u_j = s_j(A)u_j,\ 1 \leq j \leq k.$$
This is a contradiction to (3.7). Thus $0 \in W(U^*B, |A|)$, and so there exist $k$ orthonormal vectors $u_1, \ldots, u_k$ such that $|A|u_i = s_i(A)u_i$ for all $1 \leq i \leq k$ and $\sum_{i=1}^{k} \langle u_i, U^*Bu_i \rangle = 0$.
Proof of Theorem 1.2

Let $S, L : \mathbb{C} \to M(n)$ and $g : M(n) \to \mathbb{R}_+$ be the maps defined as $S(\lambda) = \lambda B$, $L(\lambda) = A + \lambda B$ and $g(X) = \|X\|_{(k)}$. Then we get $\|A + \lambda B\|_{(k)} \geq \|A\|_{(k)}$ for all $\lambda \in \mathbb{C}$ if and only if $g \circ L$ attains its minimum at $0$. By Proposition 2.2 and Proposition 2.3, a necessary and sufficient condition for this is that $0 \in S^*\partial g(A)$. Let the matrices $U, V$ be partitioned as $U = [U_1 : U_2 : U_3]$ and $V = [V_1 : V_2 : V_3]$, where $U_1, V_1 \in M(n, k-q)$; $U_2, V_2 \in M(n, r+q)$; $U_3, V_3 \in M(n, n-k-r)$. If $s_k(A) > 0$, then by Theorem 2.4, we get that $0 \in S^*\partial g(A)$ if and only if there exists a positive semidefinite matrix $T \in M(r+q)$ with $\lambda_1(T) \leq 1$ and $\sum_{j=1}^{r+q} \lambda_j(T) = q$ such that $\operatorname{tr} B^*(U_1V_1^* + U_2TV_2^*) = 0$. Similarly, when $s_k(A) = 0$, we get that $0 \in S^*\partial g(A)$ if and only if there exists $T \in M(n-k+q, r+q)$ with $s_1(T) \leq 1$ and $\sum_{j=1}^{r+q} s_j(T) \leq q$ such that $\operatorname{tr} B^*(U_1V_1^* + [U_2 : U_3]TV_2^*) = 0$. A calculation shows that
$$\operatorname{tr} B^*(U_1V_1^* + U_2TV_2^*) = \operatorname{tr} B_{11}^* + \operatorname{tr}(B_{22}^*T)$$
and
$$\operatorname{tr} B^*(U_1V_1^* + [U_2 : U_3]TV_2^*) = \operatorname{tr} B_{11}^* + \operatorname{tr}([B_{22}^* : B_{32}^*]T).$$
This gives the required result.
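In the simplest instance of part (a), where the singular values of $A$ are distinct (so $q = 1$, $r = 0$), the matrix $T$ is forced to be the $1 \times 1$ matrix $[1]$, and the condition reduces to the vanishing of the sum of the first $k$ diagonal entries of $U^*BV$. The following sketch (ours) enforces this on a random $B$ and tests (1.1) on sampled $\lambda$:

```python
import numpy as np

def ky_fan(X, k):
    return np.linalg.svd(X, compute_uv=False)[:k].sum()

rng = np.random.default_rng(6)
n, k = 5, 3
A = np.diag([9.0, 7.0, 5.0, 3.0, 1.0])   # distinct singular values: q = 1, r = 0, U = V = I

# With T = [1] forced, part (a) reads: sum of the first k diagonal entries of B is zero.
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
c = np.trace(B[:k, :k]) / k
B[np.arange(k), np.arange(k)] -= c

for lam in rng.standard_normal(200) + 1j * rng.standard_normal(200):
    assert ky_fan(A + lam * B, k) >= ky_fan(A, k) - 1e-10
```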
Proof of Theorem 1.3
First suppose that there exist density matrices $P_1, \ldots, P_k$ such that $\|\sum_{i=1}^{k} P_i\|_\infty \leq 1$,
$$|A|P_i = s_i(A)P_i \quad \text{for all } 1 \leq i \leq k \qquad (3.8)$$
and $U\sum_{i=1}^{k} P_i \in \mathcal{W}^\perp$. Let $Q = \sum_{i=1}^{k} P_i$. Then $Q$ is a positive semidefinite matrix such that $\|Q\|_\infty \leq 1$, $\frac{1}{k}\|Q\|_1 = \frac{1}{k}\sum_{i=1}^{k} \operatorname{tr} P_i = 1$ and
$$\operatorname{tr}(W^*UQ) = 0 \quad \text{for all } W \in \mathcal{W}. \qquad (3.9)$$
So by using (2.6) and the fact that $\|X\|^*_{(k)} = \max\{\|X\|_\infty, \frac{1}{k}\|X\|_1\}$ [4, Ex. IV.2.12], we get that for any $W \in \mathcal{W}$,
$$\|A + W\|_{(k)} = \||A| + U^*W\|_{(k)} \geq \operatorname{Re} \operatorname{tr}\big((|A| + U^*W)^*Q\big) = \operatorname{tr}(|A|Q) + \operatorname{Re} \operatorname{tr}(W^*UQ) = \operatorname{tr}(|A|Q) \quad \text{(by (3.9))} = \sum_{i=1}^{k} \operatorname{tr}|A|P_i = \|A\|_{(k)} \quad \text{(by (3.8))}.$$

Conversely, suppose $A$ is orthogonal to $\mathcal{W}$ in $\|\cdot\|_{(k)}$ and $s_k(A) > 0$. Define $S : \mathcal{W} \to M(n)$ as $S(W) = U^*W$. Then $S^* : M(n) \to \mathcal{W}$ is given by $S^*(T) = P_{\mathcal{W}}(UT)$, where $P_{\mathcal{W}}$ is the orthogonal projection onto the subspace $\mathcal{W}$. Let $L : \mathcal{W} \to M(n)$ be the map defined as $L(W) = |A| + U^*W$ and let $g : M(n) \to \mathbb{R}_+$ be the map defined as $g(X) = \|X\|_{(k)}$. Then by Proposition 2.2 and Proposition 2.3, we have that $\|A + W\|_{(k)} \geq \|A\|_{(k)}$ for all $W \in \mathcal{W}$ if and only if $0 \in S^*\partial g(|A|)$. By Corollary 2.8, there exist numbers $t_1, \ldots, t_m$ such that $0 \leq t_j \leq 1$, $\sum_{j=1}^{m} t_j = 1$ and for each $1 \leq j \leq m$, there exist $k$ orthonormal vectors $u_1^{(j)}, \ldots, u_k^{(j)}$ such that
$$|A|u_i^{(j)} = s_i(A)u_i^{(j)} \quad \text{for all } 1 \leq i \leq k \qquad (3.10)$$
and
$$S^*\Big(\sum_{i=1}^{k}\sum_{j=1}^{m} t_j u_i^{(j)}u_i^{(j)*}\Big) = 0. \qquad (3.11)$$
Let $P_i = \sum_{j=1}^{m} t_j u_i^{(j)}u_i^{(j)*}$. Then each $P_i$ is a density matrix. Also, by (3.10), we get $|A|P_i = s_i(A)P_i$. Equation (3.11) says that $S^*(\sum_{i=1}^{k} P_i) = 0$, which is equivalent to saying that $U\sum_{i=1}^{k} P_i \in \mathcal{W}^\perp$. For each $1 \leq j \leq m$, the matrix $\sum_{i=1}^{k} u_i^{(j)}u_i^{(j)*}$ is an orthogonal projection of rank $k$ onto the linear span of $\{u_i^{(j)} : 1 \leq i \leq k\}$. In particular, $\|\sum_{i=1}^{k} u_i^{(j)}u_i^{(j)*}\|_\infty \leq 1$. Thus
$$\Big\|\sum_{i=1}^{k} P_i\Big\|_\infty = \Big\|\sum_{j=1}^{m} t_j\sum_{i=1}^{k} u_i^{(j)}u_i^{(j)*}\Big\|_\infty \leq \sum_{j=1}^{m} t_j\Big\|\sum_{i=1}^{k} u_i^{(j)}u_i^{(j)*}\Big\|_\infty \leq 1.$$
4 Remarks

1. Another necessary and sufficient condition for $A$ to be orthogonal to $B$ in $\|\cdot\|_1$, given in [12], is that there exists a matrix $G \in M(n)$ such that $\|G\|_\infty \leq 1$, $\operatorname{tr}(G^*A) = \|A\|_1$ and $\operatorname{tr}(G^*B) = 0$. One can derive an analogous characterization for orthogonality in $\|\cdot\|_{(k)}$ using (2.5). We can show that $A$ is orthogonal to $B$ in $\|\cdot\|_{(k)}$ if and only if there exists a matrix $G \in M(n)$ such that $\|G\|_\infty \leq 1$, $\|G\|_1 \leq k$, $\operatorname{tr}(G^*A) = \|A\|_{(k)}$ and $\operatorname{tr}(G^*B) = 0$. Let $S, L, g$ be the maps as defined above in the proof of Theorem 1.2. Then Proposition 2.2, Proposition 2.3 and (2.5) give that $A$ is orthogonal to $B$ in $\|\cdot\|_{(k)}$ if and only if there exists a matrix $G \in M(n)$ such that $\|G\|_\infty \leq 1$, $\|G\|_1 \leq k$, $\operatorname{Re} \operatorname{tr}(G^*A) = \|A\|_{(k)}$ and $\operatorname{tr}(G^*B) = 0$. We observe that if $\|G\|^*_{(k)} \leq 1$, then $\operatorname{Re} \operatorname{tr}(G^*A) = \|A\|_{(k)}$ if and only if $\operatorname{tr}(G^*A) = \|A\|_{(k)}$. This is because if $\operatorname{Re} \operatorname{tr}(G^*A) = \|A\|_{(k)}$, then
$$\|A\|_{(k)} \leq |\operatorname{tr}(G^*A)| \leq \|G\|^*_{(k)}\|A\|_{(k)} \leq \|A\|_{(k)}.$$
So $\operatorname{Im} \operatorname{tr}(G^*A) = 0$ and hence $\operatorname{tr}(G^*A) = \operatorname{Re} \operatorname{tr}(G^*A) = \|A\|_{(k)}$. Thus we obtain the required result.

2. The characterizations of Birkhoff-James orthogonality are closely related to the recent work on norm parallelism [17, 22, 14]. In a normed linear space, an element $x$ is said to be norm-parallel to another element $y$ (denoted as $x \,\|\, y$) if $\|x + \lambda y\| = \|x\| + \|y\|$ for some $\lambda \in \mathbb{C}$, $|\lambda| = 1$. Let $A = U|A|$ be a polar decomposition of $A$ and let $s_k(A) > 0$. Then by Theorem 2.4 in [14] and Theorem 1.1, we get that $A \,\|\, B$ in $\|\cdot\|_{(k)}$ if and only if there exist $\lambda \in \mathbb{C}$ with $|\lambda| = 1$ and $k$ orthonormal vectors $u_1, u_2, \ldots, u_k$ such that $|A|u_i = s_i(A)u_i$ for all $1 \leq i \leq k$ and $\sum_{i=1}^{k} \langle u_i, U^*(\|B\|_{(k)}A + \lambda\|A\|_{(k)}B)u_i \rangle = 0$. Simplifying the expressions and using the fact that $|\lambda| = 1$, we obtain that $A \,\|\, B$ in $\|\cdot\|_{(k)}$ if and only if there exist $k$ orthonormal vectors $u_1, u_2, \ldots, u_k$ such that $|A|u_i = s_i(A)u_i$ for all $1 \leq i \leq k$ and $|\sum_{i=1}^{k} \langle u_i, U^*Bu_i \rangle| = \|B\|_{(k)}$. For $k = 1$, this is just Corollary 2.15 of [14].
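The certificate in Remark 1 can be exhibited explicitly: $G = \sum_{i=1}^{k} u_iv_i^*$ built from the top $k$ singular pairs of $A$ satisfies $\|G\|_\infty = 1$, $\|G\|_1 = k$ and $\operatorname{tr}(G^*A) = \|A\|_{(k)}$, so any $B$ with $\operatorname{tr}(G^*B) = 0$ is certified orthogonal to $A$. A numerical sketch (our construction, for illustration):

```python
import numpy as np

def ky_fan(X, k):
    return np.linalg.svd(X, compute_uv=False)[:k].sum()

rng = np.random.default_rng(7)
n, k = 5, 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W, s, Vh = np.linalg.svd(A)

G = W[:, :k] @ Vh[:k, :]          # G = sum of u_i v_i^* over the top k singular pairs
assert np.isclose(np.linalg.norm(G, 2), 1.0)                # ||G||_inf = 1
assert np.isclose(np.linalg.norm(G, 'nuc'), k)              # ||G||_1 = k
assert np.isclose(np.trace(G.conj().T @ A), ky_fan(A, k))   # tr(G*A) = ||A||_(k)

# Any B with tr(G*B) = 0 is then certified: A is orthogonal to B in ||.||_(k).
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B -= (np.trace(G.conj().T @ B) / k) * G   # tr(G*G) = k, so this kills tr(G*B)
for lam in rng.standard_normal(200) + 1j * rng.standard_normal(200):
    assert ky_fan(A + lam * B, k) >= ky_fan(A, k) - 1e-10
```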
Acknowledgement

I would like to thank the referee for several valuable comments and suggestions.
References

[1] L. Arambašić, R. Rajić, The Birkhoff-James orthogonality in Hilbert C*-modules, Linear Algebra Appl. 437 (2012) 1913–1929.
[2] L. Arambašić, R. Rajić, A strong version of the Birkhoff-James orthogonality in Hilbert C*-modules, Ann. Funct. Anal.
[3] …C*-modules, Linear Multilinear Algebra 63 (2015) 1485–1500.
[4] R. Bhatia, Matrix Analysis, Springer, New York, 1997.
[5] R. Bhatia, P. Šemrl, Orthogonality of matrices and some distance problems, Linear Algebra Appl. 287 (1999) 77–86.
[6] C. Benítez, M. Fernández, M.L. Soriano, Orthogonality of matrices, Linear Algebra Appl. 422 (2007) 155–163.
[7] T. Bhattacharyya, P. Grover, Characterization of Birkhoff-James orthogonality, J. Math. Anal. Appl. 407 (2013) 350–358.
[8] P.R. Halmos, A Hilbert Space Problem Book, Narosa Publishing House, 1978.
[9] J.B. Hiriart-Urruty, C. Lemaréchal, Fundamentals of Convex Analysis, Springer, 2000.
[10] P. Grover, Orthogonality to matrix subspaces, and a distance formula, Linear Algebra Appl. 445 (2014) 280–288.
[11] D.J. Kečkić, Gateaux derivative of B(H) norm, Proc. Amer. Math. Soc.
[12] C.-K. Li, H. Schneider, Orthogonality of matrices, Linear Algebra Appl. 347 (2002) 115–122.
[13] A.W. Marshall, I. Olkin, B.C. Arnold, Inequalities: Theory of Majorization and Its Applications, Springer, 2011.
[14] M.S. Moslehian, A. Zamani, Norm-parallelism in the geometry of Hilbert C*-modules, Indag. Math. 27 (2016) 266–281.
[15] D. Sain, K. Paul, Operator norm attainment and inner product spaces, Linear Algebra Appl. 439 (2013) 2448–2452.
[16] D. Sain, K. Paul, S. Hait, Operator norm attainment and Birkhoff-James orthogonality, Linear Algebra Appl. 476 (2015) 85–97.
[17] A. Seddik, Rank one operators and norm of elementary operators, Linear Algebra Appl. 424 (2007) 177–183.
[18] I. Singer, Best Approximation in Normed Linear Spaces by Elements of Linear Subspaces, Springer, 1970.
[19] G.A. Watson, Characterization of the subdifferential of some matrix norms, Linear Algebra Appl. 170 (1992) 33–45.
[20] G.A. Watson, On matrix approximation problems with Ky Fan k norms, Numer. Algorithms.
[21] C. Zălinescu, Convex Analysis in General Vector Spaces, World Scientific, Singapore, 2002.
[22] A. Zamani, M.S. Moslehian, Exact and approximate operator parallelism.