Unitary Similarity of Nonderogatory Matrices

Abstract

This paper is devoted to the problem of verifying whether two matrices are unitarily similar. For nonderogatory matrices, we present a new solution to this problem based on a geometric approach. The main advantage of this approach is its stability with respect to errors in the initial upper triangular matrix. Since an upper triangular form is usually obtained by approximate methods (e.g., the QR algorithm), this advantage is all the more significant and allows us to propose a numerically stable and efficient method for verifying unitary similarity.


arXiv [math.NA]

YU. NESTERENKO


Contents

1. Introduction
2. Preliminary constructions
3. Algorithm for constructing the canonical family
4. Numerical stability
References

Date: July 6, 2018.

Key words and phrases. Canonical forms, unitary similarity.

1. Introduction

Matrices $A, B \in \mathbb{C}^{n \times n}$ are unitarily similar if a similarity transformation between them can be implemented by a unitary matrix $U$:

$$B = U A U^*. \tag{1.1}$$

A matrix $A \in \mathbb{C}^{n \times n}$ is called nonderogatory if its Jordan blocks have distinct eigenvalues. Equivalently, $A$ is nonderogatory if and only if its characteristic polynomial and minimal polynomial coincide.

This paper concerns the verification of unitary similarity of matrices. In the literature on this problem, two basic approaches can be identified.

In the first, a complete system of matrix invariants under unitary similarity transformations is constructed. In a sense, the final result in this direction is the Specht-Pearcy criterion (see [1, 2]), which reduces the question to verifying conditions of the form

$$\operatorname{tr} W(A, A^*) = \operatorname{tr} W(B, B^*) \tag{1.2}$$

for all words $W(s, t)$ of length at most $2n^2$. However, it seems that the number of words to be verified is strongly overestimated (see [3–6]). Moreover, this method cannot find a matrix implementing a given unitary similarity.

The second approach is free of this shortcoming and consists of constructing a canonical form of matrices with respect to unitary similarity transformations. Inductive definitions of the canonical form of a matrix were proposed in [7–9], but it is hard to visualize the final canonical form. In the more recent work [10], the authors considered the set of nonderogatory matrices and constructed a more transparent canonical form.

In this work we propose a geometric approach to the problem for nonderogatory matrices. Given an arbitrary nonderogatory matrix, we construct a finite family of matrices unitarily similar to it (this family is called canonical). Whether two matrices are unitarily similar can then be decided by checking whether their canonical families intersect.
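The trace conditions (1.2) of the Specht-Pearcy criterion can be illustrated with a short sketch. It is only a brute-force check over all words up to a chosen length, written in plain Python; the helper names (`matmul`, `adjoint`, `word_traces`, `traces_agree`), the tolerance, the 2×2 example matrices, and the truncation to `max_len=4` are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose A*."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def word_traces(a, max_len):
    """Map each word in the letters (A, A*) of length 1..max_len to tr W(A, A*)."""
    letters = (a, adjoint(a))
    traces = {}
    for length in range(1, max_len + 1):
        for word in product((0, 1), repeat=length):
            m = letters[word[0]]
            for letter in word[1:]:
                m = matmul(m, letters[letter])
            traces[word] = sum(m[i][i] for i in range(len(m)))
    return traces

def traces_agree(a, b, max_len, tol=1e-9):
    """Check the trace conditions (1.2) for all words of length up to max_len."""
    ta, tb = word_traces(a, max_len), word_traces(b, max_len)
    return all(abs(ta[w] - tb[w]) <= tol for w in ta)

# Hypothetical example: B = U A U* for a rotation U; C shares the eigenvalues of A.
A = [[1, 2], [0, 3]]
c, s = math.cos(0.4), math.sin(0.4)
U = [[c, -s], [s, c]]
B = matmul(matmul(U, A), adjoint(U))
C = [[1, 5], [0, 3]]

print(traces_agree(A, B, max_len=4))  # unitarily similar pair
print(traces_agree(A, C, max_len=4))  # same eigenvalues, but tr(A A*) differs
```

The number of words grows as $2^{\text{length}}$, which is one practical reason the criterion is costly, and a positive answer gives no unitary matrix relating the two inputs.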
The proposed method of unitary similarity verification has a significant advantage over the method of [10], which is based on constructing a canonical form: it is stable with respect to errors in the initial matrix. This aspect is discussed at the end of the paper.

In constructing a canonical family, we start from an upper triangular form of the matrix. Specifically, by the Schur theorem, any matrix $A \in \mathbb{C}^{n \times n}$ can be reduced to such a form by a unitary similarity transformation:

$$\Delta = \begin{pmatrix} \lambda_1 & \Delta_{12} & \Delta_{13} & \dots & \Delta_{1n} \\ & \lambda_2 & \Delta_{23} & \dots & \Delta_{2n} \\ & & \lambda_3 & \dots & \Delta_{3n} \\ & & & \ddots & \vdots \\ & & & & \lambda_n \end{pmatrix}, \tag{1.3}$$

where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of the matrix, counted with multiplicity, in some fixed order. The next statement lets us restrict the set of unitary transformations when working with nonderogatory triangular matrices.

Lemma 1.1. Let $A$ be a nonderogatory complex $n \times n$ matrix, and let $\Delta$ be its upper triangular form with eigenvalues $\lambda_1, \dots, \lambda_n$ on the diagonal in some fixed order. Then the magnitudes of the elements $\Delta_{ij}$, where $i < j$, are uniquely determined.

Proof. Since the analogous proposition for matrices with simple eigenvalues is known [8], we consider the case of a nonderogatory matrix with a single eigenvalue $\lambda$. Let $\Delta$ be obtained from $A$ by the unitary similarity transformation

$$\Delta = Q^* A Q, \tag{1.4}$$

where $Q = (q_1 \; q_2 \; \dots \; q_n)$ is a unitary matrix. Rewriting equation (1.4) as $AQ = Q\Delta$, one sees that $q_1$ is a normalized eigenvector of $A$ corresponding to the eigenvalue $\lambda$. Further,

$$A q_2 = \Delta_{12} q_1 + \lambda q_2, \tag{1.5}$$

hence $(A - \lambda E)^2 q_2 = 0$, i.e., $q_2$ is a generalized eigenvector of $A$. Adding the condition that the pair $q_1, q_2$ is orthonormal, one obtains that $q_2$ is uniquely determined up to multiplication by a scalar of unit modulus. Continuing in the same vein, we see that the matrix $Q$ is uniquely determined up to multiplication by a diagonal unitary matrix, and such transformations preserve the magnitudes of the off-diagonal elements of the upper triangular form $\Delta$. Thus the lemma is proved.

Using the lemma, we can restrict attention to the action of the group of unitary similarity transformations with diagonal matrices on the set of upper triangular matrices:

$$\Delta \mapsto X \Delta X^*, \qquad X = \operatorname{diag}(e^{i\psi_1}, \dots, e^{i\psi_{n-1}}, 1), \tag{1.6}$$

$$\{X \Delta X^*\}_{ij} = \begin{cases} \Delta_{ij}\, e^{i(\psi_i - \psi_j)}, & i < j < n, \\ \Delta_{ij}\, e^{i\psi_i}, & i < j = n, \\ \lambda_i, & i = j, \\ 0, & i > j \end{cases} \tag{1.7}$$

(by setting the last diagonal entry of $X$ to 1, we remove a scalar factor from $X$).

2. Preliminary constructions
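As a warm-up for the constructions in this section, the diagonal action (1.6)-(1.7) can be checked numerically: conjugation by $X$ leaves the diagonal and all magnitudes $|\Delta_{ij}|$ unchanged and shifts the argument of $\Delta_{ij}$ by $\psi_i - \psi_j$ (with $\psi_n = 0$). The sketch below uses plain Python; the function name `conjugate_by_diagonal` and the sample matrix are hypothetical choices for illustration.

```python
import cmath

def conjugate_by_diagonal(delta, psi):
    """Return X delta X* for X = diag(exp(i psi_1), ..., exp(i psi_{n-1}), 1)."""
    n = len(delta)
    x = [cmath.exp(1j * p) for p in psi] + [1 + 0j]
    return [[x[i] * delta[i][j] * x[j].conjugate() for j in range(n)]
            for i in range(n)]

# Hypothetical nonderogatory example with the single eigenvalue 2.
delta = [[2, 1 + 1j, 3], [0, 2, -1j], [0, 0, 2]]
psi = [0.7, -1.2]
moved = conjugate_by_diagonal(delta, psi)

# For each entry above the diagonal, print its index, its (unchanged) magnitude,
# and the shift of its argument, which equals psi_i - psi_j with psi_3 = 0.
for i in range(3):
    for j in range(i + 1, 3):
        shift = cmath.phase(moved[i][j] / delta[i][j])
        print((i + 1, j + 1), abs(moved[i][j]), shift)
```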

Let $M$ denote the range of the parameters of the matrix,

$$M = \{ (r_{12}, \dots, r_{n-1,n};\ \varphi_{12}, \dots, \varphi_{n-1,n}),\ r_{ij}, \varphi_{ij} \in \mathbb{R} \}, \tag{2.1}$$

and let $M_r$ denote its restriction for fixed $r_{ij}$:

$$M_r = \{ (r; \varphi) \in M : r_{ij} \text{ are fixed} \}. \tag{2.2}$$

The indices $i$ and $j$ run over the values $1 \le i < j \le n$ and are ordered lexicographically. For elements of $M$ and $M_r$, several equivalent forms of notation are used:

$$(r_{12}, \dots, r_{n-1,n};\ \varphi_{12}, \dots, \varphi_{n-1,n}) = (r;\ \varphi_{12}, \dots, \varphi_{n-1,n}) = (r; \varphi). \tag{2.3}$$

Looking ahead, $r_{ij}$ and $\varphi_{ij}$ will play the role of the absolute values and arguments of the off-diagonal elements of $\Delta$. Despite this geometric interpretation, no constraints are as yet imposed on $r_{ij}$ and $\varphi_{ij}$, and the indeterminacy of $\varphi_{ij}$ at $r_{ij} = 0$ is ignored. At this stage, we work with the formally defined range $M$.

On $M$ we introduce the family of transformations

$$X_\psi : (r; \varphi) \mapsto (r; \tilde\varphi), \tag{2.4}$$

$$\tilde\varphi_{ij} = \varphi_{ij} + \psi_i - \psi_j, \quad 1 \le i < j \le n-1, \qquad \tilde\varphi_{in} = \varphi_{in} + \psi_i, \quad 1 \le i \le n-1. \tag{2.5}$$

Each such transformation is defined by a parameter vector $\psi = (\psi_1, \dots, \psi_{n-1}) \in \mathbb{R}^{n-1}$.

Consider a subset $K \subset M$ whose elements satisfy the system of equations

$$-\sum_{k=1}^{s-1} r_{ks} \varphi_{ks} + \sum_{k=s+1}^{n} r_{sk} \varphi_{sk} = 0, \quad s = 1, \dots, n-1. \tag{2.6}$$

The summation indices in (2.6) are visualized by the diagram

$$\begin{pmatrix} \lambda_1 & & * & & & \\ & \ddots & \vdots & & & \\ & & * & & & \\ & & \lambda_s & * & \cdots & * \\ & & & & \ddots & \\ & & & & & \lambda_n \end{pmatrix}. \tag{2.7}$$

Reducing an arbitrary element of $M$ to the form $K$ by a transformation $X_\psi$ amounts to finding the parameters $\psi = (\psi_1, \dots, \psi_{n-1})$ of this transformation by solving the system of linear equations

$$R(r)\psi = -b(r, \varphi) \tag{2.8}$$

with the symmetric matrix

$$\{R(r)\}_{ij} = \begin{cases} -r_{ij}, & i < j, \\ \sum_{k=1}^{i-1} r_{ki} + \sum_{k=i+1}^{n} r_{ik}, & i = j, \\ -r_{ji}, & i > j \end{cases} \tag{2.9}$$

and with a right-hand side that is linear in each of $r$ and $\varphi$:

$$b(r, \varphi) = (b_1(r, \varphi), \dots, b_{n-1}(r, \varphi)), \qquad b_s(r, \varphi) = -\sum_{k=1}^{s-1} r_{ks} \varphi_{ks} + \sum_{k=s+1}^{n} r_{sk} \varphi_{sk}, \quad s = 1, \dots, n-1. \tag{2.10}$$

Indeed, substituting (2.5) into (2.6) gives $b_s(r, \tilde\varphi) = b_s(r, \varphi) + (R(r)\psi)_s$ for each $s$.

System (2.8) has some remarkable properties.

Theorem 2.1. (i) For any $\varphi_{ij}$ and nonnegative $r_{ij}$, system (2.8) has a solution; i.e.,

$$-b(r, \varphi) \in \operatorname{Im} R(r), \quad \forall \varphi_{ij},\ \forall r_{ij} \ge 0. \tag{2.11}$$

(ii) For all $r_{ij} \ge 0$, the determinant $\det R(r)$ is nonzero if and only if the indices of the nonzero elements $r_{ij} > 0$ contain a collection $(ij)_1, \dots, (ij)_{n-1}$ such that the set

$$\{ \psi_{i_p} - \psi_{j_p}\ (\text{respectively } \psi_{i_p} \text{ if } j_p = n),\ p = 1, \dots, n-1 \}$$

forms a linearly independent system of functions of the variables $(\psi_1, \dots, \psi_{n-1})$.

(iii) Even if $\det R(r) = 0$ for some $r_{ij} \ge 0$, any solution $\psi = (\psi_1, \dots, \psi_{n-1})$ of the system is such that the quantities $r_{ij}(\psi_i - \psi_j)$, $1 \le i < j \le n-1$, and $r_{in}\psi_i$, $1 \le i \le n-1$, are uniquely defined. This means that nonuniqueness in the definition of $\psi_i - \psi_j$ occurs if and only if $r_{ij} = 0$.

Proof.

On the set $M_r$, we introduce the natural structure of a Euclidean space:

$$(r; \varphi^{(1)}) + (r; \varphi^{(2)}) = (r;\ \varphi^{(1)}_{12} + \varphi^{(2)}_{12}, \dots, \varphi^{(1)}_{n-1,n} + \varphi^{(2)}_{n-1,n}), \tag{2.13}$$

$$\alpha (r; \varphi) = (r;\ \alpha\varphi_{12}, \dots, \alpha\varphi_{n-1,n}), \quad \alpha \in \mathbb{R}, \tag{2.14}$$

$$\langle (r; \varphi^{(1)}), (r; \varphi^{(2)}) \rangle = \sum_{1 \le i < j \le n} [\ldots] \tag{2.15}$$

[…]

The dimensions of $K_r$ and $G_{r,\varphi}$ depend on $r$, but their sum is a constant:

$$\dim K_r + \dim G_{r,\varphi} = \dim M_r. \tag{2.18}$$

In other words, in the Euclidean space $M_r$, the linear subspace $K_r$ and the linear manifold $G_{r,\varphi}$ are mutually orthogonal, and their dimensions sum to the full dimension. This implies that they have a unique intersection point $(r; \varphi') = K_r \cap G_{r,\varphi}$. This intersection condition corresponds to the system of equations

$$R(r)\psi = -b(r, \varphi), \tag{2.19}$$

where $r$ denotes the vector

$$r = (r_{12}, \dots, r_{n-1,n}). \tag{2.20}$$

In terms of $\psi$, the existence and uniqueness of an intersection point $(r; \varphi')$ means that system (2.19) is solvable for arbitrary $\varphi_{ij}$ and $r_{ij}$ and that the values

$$r_{ij}(\psi_i - \psi_j) = f_{ij}, \quad 1 \le i < j \le n-1, \qquad r_{in}\psi_i = f_{in}, \quad 1 \le i \le n-1, \tag{2.21}$$

are uniquely determined. Suppose now that there exists a collection of indices $(ij)_1, \dots, (ij)_{n-1}$ corresponding to the nonzero elements of $R(r)$ such that the set $\{ \psi_{i_p} - \psi_{j_p}\ (\text{respectively } \psi_{i_p} \text{ if } j_p = n),\ p = 1, \dots, n-1 \}$ forms a linearly independent system of functions of the variables $(\psi_1, \dots, \psi_{n-1})$. Then a nondegenerate system of linear equations can be composed of relations (2.21), and $\psi = (\psi_1, \dots, \psi_{n-1})$ can be uniquely determined. Thus, under the conditions formulated, system (2.19) has a unique solution and, hence, $\det R(r) \ne 0$. The converse can be proved by contradiction.

The above results are extended to system (2.8) by making the substitution $r'_{ij} = r_{ij} \ge 0$. The proof is complete.

Returning to the matrix $\Delta$, we use Theorem 2.1 to construct the family of matrices that are unitarily similar to $\Delta$.

From the elements of $\Delta$, we set up the system of linear equations

$$R(r)\psi = -b(r, \varphi + 2\pi m), \tag{2.22}$$

where $r$, $\varphi$, and $m$ are defined as

$$r_{ij} = |\Delta_{ij}|, \qquad \varphi_{ij} = \arg \Delta_{ij} - \pi, \qquad m_{ij} \in \mathbb{Z}. \tag{2.23}$$

Note that, despite the indeterminacy of $\varphi_{ij}$ at $r_{ij} = 0$, the system of equations is uniquely defined.

Solving this system for $\psi = (\psi_1, \dots, \psi_{n-1})$, we construct the matrix

$$X \Delta X^*, \qquad X = \operatorname{diag}(e^{i\psi_1}, \dots, e^{i\psi_{n-1}}, 1).$$

If $\psi$ is not determined uniquely from system (2.22), then, by Theorem 2.1, this nonuniqueness is such that the matrix $X \Delta X^*$ is uniquely determined.
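The procedure just described (form $r$ and $\varphi$ from $\Delta$ as in (2.23), assemble $R(r)$ and $b$ from (2.9)-(2.10), solve (2.22), and conjugate by $X$) can be sketched in plain Python. This is a minimal sketch under explicit assumptions: it handles only the case $\det R(r) \ne 0$, it takes $\arg$ in $[0, 2\pi)$ so that $\varphi_{ij} \in [-\pi, \pi)$, and the names `solve_linear`, `canonical_representative`, the dictionary format of `m`, and the test matrix are all hypothetical.

```python
import cmath
import math

def solve_linear(a, rhs):
    """Gaussian elimination with partial pivoting; a is assumed nonsingular."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("R(r) is singular; this sketch assumes det R(r) != 0")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def canonical_representative(delta, m=None):
    """Solve (2.22) for psi and return (X delta X*, psi); indices are 0-based."""
    n = len(delta)
    size = n - 1
    big_r = [[0.0] * size for _ in range(size)]
    b = [0.0] * size
    for i in range(n):
        for j in range(i + 1, n):
            r = abs(delta[i][j])
            arg = cmath.phase(delta[i][j]) % (2 * math.pi)   # arg in [0, 2 pi)
            phi = arg - math.pi + 2 * math.pi * (m[(i, j)] if m else 0)
            b[i] += r * phi            # term r_sk * phi_sk with s = i
            big_r[i][i] += r
            if j < size:               # column n carries no psi (psi_n = 0)
                b[j] -= r * phi        # term -r_ks * phi_ks with s = j
                big_r[j][j] += r
                big_r[i][j] -= r
                big_r[j][i] -= r
    psi = solve_linear(big_r, [-v for v in b])
    x = [cmath.exp(1j * p) for p in psi] + [1 + 0j]
    k = [[x[i] * delta[i][j] * x[j].conjugate() for j in range(n)] for i in range(n)]
    return k, psi

# Hypothetical check: a small diagonal rotation of delta yields the same result.
delta = [[2, -1 + 1j, 2j], [0, 2, -3], [0, 0, 2]]
theta = [0.3, -0.2]
y = [cmath.exp(1j * t) for t in theta] + [1 + 0j]
rotated = [[y[i] * delta[i][j] * y[j].conjugate() for j in range(3)] for i in range(3)]
k1, psi1 = canonical_representative(delta)
k2, psi2 = canonical_representative(rotated)
print(max(abs(k1[i][j] - k2[i][j]) for i in range(3) for j in range(3)))
```

In this example the rotation is small enough that no argument crosses the branch cut of $\arg$; when one does, the two inputs are matched by a different integer vector $m$ rather than by $m = 0$.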

The matrix generated by this procedure from $\Delta$ with the parameter vector $m = (m_{12}, \dots, m_{n-1,n})$ is denoted by $K(\Delta, m)$.

3. Algorithm for constructing the canonical family

Now we consider two nonderogatory upper triangular matrices $\Delta^{(1)}$ and $\Delta^{(2)}$ with identical sets of eigenvalues. The eigenvalues are assumed to be identically ordered on the matrix diagonals. For these matrices, we introduce $r^{(1)}_{ij}, \varphi^{(1)}_{ij}$ and $r^{(2)}_{ij}, \varphi^{(2)}_{ij}$ as in (2.23). The matrices $\Delta^{(1)}$ and $\Delta^{(2)}$ are related by a unitary similarity transformation if and only if

(i) $r^{(1)}_{ij} = r^{(2)}_{ij}$, $1 \le i < j \le n$, and

(ii) there exist a vector $(\psi_1, \dots, \psi_{n-1}) \in \mathbb{R}^{n-1}$ and integer vectors $k^{(1)}, k^{(2)} \in \mathbb{Z}^{n(n-1)/2}$ such that, for the indices $(ij)$ with $r^{(1)}_{ij} = r^{(2)}_{ij} > 0$, we have

$$\varphi^{(1)}_{ij} + 2\pi k^{(1)}_{ij} = \varphi^{(2)}_{ij} + 2\pi k^{(2)}_{ij} + \psi_i - \psi_j \tag{3.1}$$

(with $\psi_n = 0$).

This implies that unitary similarity of $\Delta^{(1)}$ and $\Delta^{(2)}$ is equivalent to $K(\Delta^{(1)}, k^{(1)}) = K(\Delta^{(2)}, k^{(2)})$ for some integer parameter vectors $k^{(1)} = (k^{(1)}_{12}, \dots, k^{(1)}_{n-1,n})$ and $k^{(2)} = (k^{(2)}_{12}, \dots, k^{(2)}_{n-1,n})$.

Let us represent the above criterion in an effective form. Define a subset $I \subset \mathbb{Z}^{n(n-1)/2}$:

$$I = \{ k \in \mathbb{Z}^{n(n-1)/2} : k_{ij} = 0, \pm 1,\ 1 \le i < j \le n-1;\ k_{in} = 0,\ 1 \le i \le n-1 \}. \tag{3.2}$$

Theorem 3.1. The matrices $\Delta^{(1)}$ and $\Delta^{(2)}$ are unitarily similar if and only if there exist vectors $k^{(1)}, k^{(2)} \in I$ such that $K(\Delta^{(1)}, k^{(1)}) = K(\Delta^{(2)}, k^{(2)})$.

Proof.

Let $\Delta^{(1)}$ and $\Delta^{(2)}$ be unitarily similar and let all their elements above the diagonal be nonzero. Then, as was shown above, there exist vectors $(\psi_1, \dots, \psi_{n-1}) \in \mathbb{R}^{n-1}$ and $k^{(1)}, k^{(2)} \in \mathbb{Z}^{n(n-1)/2}$ such that equalities (3.1) hold for all $(ij)$. We use them to form the following linear combinations:

$$\varphi^{(1)}_{ij} - \varphi^{(1)}_{in} + \varphi^{(1)}_{jn} + 2\pi (k^{(1)}_{ij} - k^{(1)}_{in} + k^{(1)}_{jn}) = \varphi^{(2)}_{ij} - \varphi^{(2)}_{in} + \varphi^{(2)}_{jn} + 2\pi (k^{(2)}_{ij} - k^{(2)}_{in} + k^{(2)}_{jn}). \tag{3.3}$$

One sees that the $\psi$-dependent terms have canceled out. A feature of these linear combinations is that they are invariant under the action of the transformations $X_\psi$ on the linear space of vectors $\varphi = (\varphi_{12}, \dots, \varphi_{n-1,n})$. Moreover, these combinations form a basis in the subspace of linear functionals invariant under $X_\psi$.

Note that the conditions $\varphi^{(s)}_{ij} \in [-\pi, \pi)$ imply $\varphi^{(s)}_{ij} - \varphi^{(s)}_{in} + \varphi^{(s)}_{jn} \in (-3\pi, 3\pi)$, which in turn implies the following constraints on $k^{(1)}$ and $k^{(2)}$:

$$(k^{(1)}_{ij} - k^{(1)}_{in} + k^{(1)}_{jn}) - (k^{(2)}_{ij} - k^{(2)}_{in} + k^{(2)}_{jn}) = 0, \pm 1, \pm 2. \tag{3.4}$$

At the same time, the algorithm for deriving the matrix $K(\Delta, 0)$ shows that the arguments of its elements are linearly expressed in terms of $\varphi_{ij}$:

$$\tilde\varphi_{ij} \in L(\varphi_{12}, \dots, \varphi_{n-1,n}). \tag{3.5}$$

Moreover, these linear combinations must be invariant under $X_\psi$, so their form can be refined:

$$\tilde\varphi_{ij} \in L(\{ \varphi_{ij} - \varphi_{in} + \varphi_{jn} \},\ 1 \le i < j \le n-1). \tag{3.6}$$

Combining this with (3.4), we obtain that it suffices to verify the equalities $K(\Delta^{(1)}, k^{(1)}) = K(\Delta^{(2)}, k^{(2)})$ for $k^{(1)}, k^{(2)} \in I$.

In the presence of zero elements above the diagonal of $\Delta^{(1)}$ and $\Delta^{(2)}$, the proposition is proved with slight modifications.

The finite set of matrices $K(\Delta, k)$, $k \in I$, that are unitarily similar to $\Delta$ is called the canonical family of the given matrix.

Thus, the following algorithm is proposed for verifying unitary similarity between nonderogatory matrices $A$ and $B$ with the same set of eigenvalues:

(i) Reduce these matrices to upper triangular form with identically ordered eigenvalues on the diagonal to obtain matrices $\Delta^{(1)}$ and $\Delta^{(2)}$:

$$\Delta^{(1)} = U_1 A U_1^*, \qquad \Delta^{(2)} = U_2 B U_2^*. \tag{3.7}$$

(ii) For $\Delta^{(1)}$ and $\Delta^{(2)}$, construct their canonical families $K(\Delta^{(1)}, k^{(1)})$ and $K(\Delta^{(2)}, k^{(2)})$, $k^{(1)}, k^{(2)} \in I$.

(iii) If these families intersect for some $k^{(1)}, k^{(2)} \in I$ and

$$K(\Delta^{(1)}, k^{(1)}) = X_1 \Delta^{(1)} X_1^*, \qquad K(\Delta^{(2)}, k^{(2)}) = X_2 \Delta^{(2)} X_2^*, \tag{3.8}$$

then the original matrices are unitarily similar and

$$B = U A U^*, \qquad U = U_2^* X_2^* X_1 U_1. \tag{3.9}$$

Otherwise, they are not unitarily similar.

4. Numerical stability
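The instability that this section attributes to normalizations producing positive entries above the diagonal can be reproduced on a small toy case. The sketch below is not the example (4.1) of the paper and not the canonical form of [8] or [10]; it is a deliberately simplified normalization, assumed here only for illustration, that makes every nonzero superdiagonal entry positive by a diagonal unitary conjugation. The names `positive_superdiagonal_form` and `toy_matrix` are hypothetical.

```python
import cmath

def positive_superdiagonal_form(delta):
    """Toy normalization (an assumption of this sketch): choose
    X = diag(e^{i psi_k}) with the last phase fixed to 0 so that every nonzero
    superdiagonal entry of X delta X* is positive; a zero entry leaves the
    corresponding phase unchanged."""
    n = len(delta)
    psi = [0.0] * n
    for i in range(n - 2, -1, -1):
        entry = delta[i][i + 1]
        # Need arg(entry) + psi[i] - psi[i+1] = 0 whenever entry != 0.
        psi[i] = psi[i + 1] - (cmath.phase(entry) if entry != 0 else 0.0)
    x = [cmath.exp(1j * p) for p in psi]
    return [[x[i] * delta[i][j] * x[j].conjugate() for j in range(n)]
            for i in range(n)]

def toy_matrix(eps):
    """Hypothetical upper triangular matrix with distinct eigenvalues 1, 2, 3."""
    return [[1, eps, 1], [0, 2, 1], [0, 0, 3]]

eps = 1e-9 * cmath.exp(2.0j)
form_eps = positive_superdiagonal_form(toy_matrix(eps))
form_zero = positive_superdiagonal_form(toy_matrix(0))

# The inputs differ by |eps| in the Frobenius norm, yet the (1,3) entries
# of the normalized forms differ by |e^{-2i} - 1|, which is of order one.
print(abs(eps), abs(form_eps[0][2] - form_zero[0][2]))
```

The jump comes from the argument of the small entry: as it tends to zero along different directions, the phase used to normalize it takes arbitrary values and is transferred to the other entries in the same row.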

The approach presented above differs significantly from earlier approaches to the problem. As a rule, earlier approaches based on the Schur upper triangular form (e.g., [8, 10]) tried to create as many positive elements above the diagonal as possible. But such a property of the desired canonical form inevitably leads to a form that is unstable with respect to errors in the initial triangular form. This effect can be observed in the following example:

(4.1) $A(\varepsilon)$ = [matrix with entries $i$, $i$, $i$, $i$, $i$, $\varepsilon$; layout not recoverable],

where $\varepsilon$ is a complex number. If the initial "strategy" for obtaining the greatest possible number of positive off-diagonal elements is to start with the superdiagonal elements, then one can choose $A(\varepsilon)$ arbitrarily close (e.g., in the Frobenius norm) to $A(0)$, while their canonical forms are not close to each other.

The stability property is all the more significant because an upper triangular form of a matrix is usually obtained by approximate methods (e.g., the QR algorithm).

From the geometric point of view, the constructed canonical family is a finite set of ruled surfaces such that the orbit of each nonderogatory matrix intersects each of them in a single point. The stability of this set of intersection points follows from the continuity of the quantities (2.21) determined from system (2.22). This property is of special interest in the context of the result obtained in [11].

Many ideas used by the author were taken from [12]. Specifically, a minimal continuous extension of the canonical Jordan form was constructed in [12]. Some of the results presented above are reflected in [13].

I am deeply grateful to Professor Kh. D. Ikramov for his interest in this work and helpful discussions.

References

[1] Specht W. Zur Theorie der Matrizen. II, Jahresber. Deutsch. Math.-Verein. (1940), 19–23.
[2] Pearcy C. A complete set of unitary invariants for operators generating finite W*-algebras of type I, Pacific J. Math. (1962), 1405–1416.
[3] Murnaghan F. D. On the unitary invariants of a square matrix, Proc. Nat. Acad. Sci. U.S.A. (1932), 85–189.
[4] Sibirskiy K. S. Unitary and orthogonal invariants of matrices, Soviet Math Dokl. (1967), 36–40.
[5] Laffey T. J. Simultaneous reduction of sets of matrices under similarity, Linear Algebra Appl. (1986), 123–138.
[6] Bhattacharya R. On the Unitary Invariants of an n × n matrix, Ph.D. Thesis, Indian Statistical Inst. New Delhi, (1987).
[7] Brenner J. The problem of unitary equivalence, Acta Math. (1951), 297–308.
[8] Littlewood D. E. On unitary equivalence, J. London Math. Soc. (1953), 314–322.
[9] Radjavi H. On unitary equivalence of arbitrary matrices, Trans. Am. Math. Soc. (1962), 363–373.
[10] Futorny V., Horn R. A., Sergeichuk V. V. A canonical form for nonderogatory matrices under unitary similarity, Linear Algebra Appl. (2011), 830–841.
[11] Paulsen V. Continuous canonical forms for matrices under unitary equivalence, Pacific J. Math. (1978), 129–142.
[12] Arnold V. I. Matrices depending on parameters, Usp. Mat. Nauk, (1971), no. 2(158), 101–114.
[13] Nesterenko Yu. R. Unitary similarity of matrices with simple eigenvalues, Doklady Mathematics, (2011), no. 3, 795–798.

Yu. Nesterenko, Faculty of Computational Mathematics and Cybernetics, Moscow State University, Moscow, Russia