FACTORIZATION OF MATRICES OF QUATERNIONS
TERRY A. LORING
Abstract.
We review known factorization results for quaternion matrices. Specifically, we derive the Jordan canonical form, the polar decomposition, the singular value decomposition, and the QR factorization. We prove there is a Schur factorization for commuting matrices, and from this derive the spectral theorem. We do not consider algorithms, but do point to some of the numerical literature. Rather than work directly with matrices of quaternions, we work with complex matrices with a specific symmetry based on the dual operation. We discuss related results regarding complex matrices that are self-dual or symmetric, but perhaps not Hermitian.

The quaternionic condition
It is possible to prove many factorization results for matrices of quaternions by deriving them from their familiar complex counterparts. The amount of additional work is surprisingly small.

There are more abstract factorization theorems that apply to real $C^*$-algebras, along the lines of the delightful paper by Pedersen [3] on factorization in (complex) $C^*$-algebras. We are dealing here with more basic questions, involving only finite-dimensional linear algebra. These are never (hardly ever?) addressed in basic linear algebra texts, creating the impression that linear algebra over the quaternions is more difficult than it really is.

There are serious hazards within linear algebra over the quaternions. One can be lured to difficult questions of determinants and handedness of the spectrum, leaving perhaps victorious, but with the impression that all of linear algebra over the quaternions is going to be difficult. We mathematicians are well-advised, when upon such uneven ground, to seek guidance from physicists. Such guidance helped select the topics, and helped suggest notation.
Key words and phrases. Kramers pair, matrix decompositions, dual operation, quaternions.
One topic was included for reasons of elegance, the Jordan canonical form. For practical purposes, the Schur decomposition will generally suffice. We prove that as well.

It is assumed the reader is familiar with such things as the spectral theorem and the functional calculus for normal matrices. It is not assumed that the reader knows about $C^*$-algebras or physics, but these topics are mentioned incidentally.

Let $\mathbb{H}$ denote the algebra of quaternions
$$\mathbb{H} = \left\{ a + b\hat{i} + c\hat{j} + d\hat{k} \;\middle|\; a, b, c, d \in \mathbb{R} \right\}.$$
This is an algebra over $\mathbb{R}$. The canonical embedding $\mathbb{C} \hookrightarrow \mathbb{H}$ sending $1$ to $1$ and $i$ to $\hat{i}$ does not make $\mathbb{H}$ into an algebra over $\mathbb{C}$. The trouble is that the embedding is not central. An additional algebraic operation is the involution
$$\left( a + b\hat{i} + c\hat{j} + d\hat{k} \right)^* = a - b\hat{i} - c\hat{j} - d\hat{k}$$
which satisfies several axioms, including the following:
$$\alpha \in \mathbb{R},\ x \in \mathbb{H} \implies (\alpha x)^* = \alpha x^*,$$
$$x^* x = 0 \implies x = 0,$$
$$(xy)^* = y^* x^*.$$
Turning to matrices, the algebra $M_N(\mathbb{H})$ has the expected structure of a unital $\mathbb{R}$-algebra, plus the involution
$$[a_{ij}]^* = \left[ a_{ji}^* \right]$$
of conjugate-transpose. We would like to know
$$A^* A = 0 \implies A = 0$$
and
$$AB = I \implies BA = I.$$
We will quickly prove these after we consider a representation of $M_N(\mathbb{H})$ on $\mathbb{C}^{2N}$.

We have an obvious representation of $M_N(\mathbb{H})$ on $\mathbb{H}^N$ and quickly notice that, since left and right scalar multiplication on $\mathbb{H}^N$ disagree, we have both left-eigenvalues $\lambda \in \mathbb{H}$ solving $A\mathbf{v} = \lambda\mathbf{v}$ and right-eigenvalues $\mu \in \mathbb{H}$ solving $A\mathbf{v} = \mathbf{v}\mu$. A glance at the survey [13] reveals that many difficulties arise.
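The scalar identities above are easy to experiment with. The following sketch is ours, not from the paper: it realizes a quaternion $a + b\hat{i} + c\hat{j} + d\hat{k}$ as a 2-by-2 complex matrix (the helper names `quat` and `star` are our own), under which the involution becomes the conjugate transpose.

```python
import numpy as np

def quat(a, b, c, d):
    """Represent a + b i^ + c j^ + d k^ as a 2x2 complex matrix."""
    alpha, beta = complex(a, b), complex(c, d)
    return np.array([[alpha, beta], [-beta.conjugate(), alpha.conjugate()]])

def star(x):
    """The quaternion involution, realized as the conjugate transpose."""
    return x.conj().T

x = quat(1.0, 2.0, -1.0, 0.5)
y = quat(0.0, 1.0, 3.0, -2.0)

# the involution reverses products: (xy)* = y* x*
assert np.allclose(star(x @ y), star(y) @ star(x))

# x* x = (a^2 + b^2 + c^2 + d^2) I, so x* x = 0 forces x = 0
norm_sq = 1.0**2 + 2.0**2 + (-1.0)**2 + 0.5**2
assert np.allclose(star(x) @ x, norm_sq * np.eye(2))
```

In this picture $\hat{i}$, $\hat{j}$, $\hat{k}$ are `quat(0,1,0,0)`, `quat(0,0,1,0)`, `quat(0,0,0,1)`, and one can check relations such as $\hat{i}\hat{j} = \hat{k}$ directly.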
Ponder the situation for real matrices, in $M_N(\mathbb{R})$. To the chagrin of undergraduates, it is most natural to consider the representation of $M_N(\mathbb{R})$ on $\mathbb{C}^N$. Given a real orthogonal matrix $O$ we get the full picture of why it might not diagonalize in $M_N(\mathbb{R})$ when we look at the complex eigenvalues. It is at this point that we are implicitly letting $M_N(\mathbb{R})$ act on $\mathbb{C}^N$.

We do well in the case of the quaternions to regard $M_N(\mathbb{H})$ as represented on $\mathbb{C}^{2N}$, or in more modern terms that de-emphasize the role of vectors, via a certain embedding
$$\chi : M_N(\mathbb{H}) \to M_{2N}(\mathbb{C}).$$
This is a very old trick.

Definition 1.1.
Given two complex $N$-by-$N$ matrices $A$ and $B$ we set
$$\chi\left( A + B\hat{j} \right) = \begin{bmatrix} A & B \\ -\bar{B} & \bar{A} \end{bmatrix}.$$
We generally study complex matrices in the image of $\chi$, and only at the end of our calculations do we draw conclusions about $M_N(\mathbb{H})$. This is in keeping with applications in quantum mechanics, where Hilbert space is always complex and time-reversal symmetry will often be incorporated in the conjugate linear operation $\mathcal{T}$. We define $\mathcal{T} : \mathbb{C}^{2N} \to \mathbb{C}^{2N}$ by
$$\mathcal{T}\left( \begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix} \right) = \begin{bmatrix} -\bar{\mathbf{w}} \\ \bar{\mathbf{v}} \end{bmatrix}. \tag{1.1}$$
See [9] for how $\mathcal{T}$ is relevant to time reversal symmetry.

A less generous description of our approach is that we plan to study complex matrices with a certain symmetry that is useful in physics, and then sell the same results to pure mathematicians by re-branding them as theorems about matrices of quaternions.

The operator $\mathcal{T}$ is relatively well behaved, despite being only conjugate linear. It preserves orthogonality, and indeed
$$\langle \mathcal{T}\xi, \mathcal{T}\eta \rangle = \overline{\langle \xi, \eta \rangle}. \tag{1.2}$$
Also
$$\xi \perp \mathcal{T}\xi, \tag{1.3}$$
which is another one-line calculation.

Lemma 1.2.
The mapping $\chi$ is well-defined, is an $\mathbb{R}$-algebra homomorphism, is one-to-one, and satisfies $(\chi(Y))^* = \chi(Y^*)$.
Proof.
Notice that every quaternion $q$ can be written as $\alpha + \beta\hat{j}$ with $\alpha$ and $\beta$ in $\mathbb{C}$. This makes the map well-defined. It is clearly one-to-one and $\mathbb{R}$-linear. Notice that $\beta\hat{j} = \hat{j}\bar{\beta}$ for a complex number $\beta$, and so $B\hat{j} = \hat{j}\bar{B}$. Therefore
$$\chi\left( A + B\hat{j} \right)\chi\left( C + D\hat{j} \right) = \begin{bmatrix} A & B \\ -\bar{B} & \bar{A} \end{bmatrix}\begin{bmatrix} C & D \\ -\bar{D} & \bar{C} \end{bmatrix} = \begin{bmatrix} AC - B\bar{D} & AD + B\bar{C} \\ -\bar{B}C - \bar{A}\bar{D} & -\bar{B}D + \bar{A}\bar{C} \end{bmatrix} = \chi\left( \left( AC - B\bar{D} \right) + \left( AD + B\bar{C} \right)\hat{j} \right) = \chi\left( \left( A + B\hat{j} \right)\left( C + D\hat{j} \right) \right)$$
and
$$\left( \chi\left( A + B\hat{j} \right) \right)^* = \begin{bmatrix} A & B \\ -\bar{B} & \bar{A} \end{bmatrix}^* = \chi\left( A^* - B^T\hat{j} \right) = \chi\left( \left( A + B\hat{j} \right)^* \right). \qquad \Box$$

We can describe the image of $\chi$ in several useful ways. We will need several unary operations on complex matrices. For starters we need the transpose $A^T$, the pointwise conjugate $\bar{A}$, and the conjugate-transpose, or adjoint, $A^* = \overline{A^T} = \bar{A}^T$. Finally, we need a twisted transpose that is useful in physics. This is a "generalized involution" $X^\sharp$, called the dual operation, that is defined only for $X$ in $M_{2N}(\mathbb{C})$.

Definition 1.3.
For $A, B, C$ and $D$ all complex $N$-by-$N$ matrices, define
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^\sharp = \begin{bmatrix} D^T & -B^T \\ -C^T & A^T \end{bmatrix}.$$
Alternatively, we set $X^\sharp = -Z X^T Z$ where
$$Z = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix}. \tag{1.4}$$
Notice that $Z$ could be replaced by a matrix with similar properties to define a variation on the dual operation. Indeed, the choice of $Z$ is not standardized.

Lemma 1.4.
For $X$ in $M_{2N}(\mathbb{C})$ the following are equivalent:
(1) $X$ is in the image of $\chi$;
(2) $X^* = X^\sharp$;
(3) $X = -\mathcal{T} \circ X \circ \mathcal{T}$, meaning $X\xi = -\mathcal{T}(X\mathcal{T}(\xi))$ for every vector $\xi$ in $\mathbb{C}^{2N}$.
Proof.
Assume
$$X = \begin{bmatrix} A & B \\ -\bar{B} & \bar{A} \end{bmatrix}.$$
Then
$$X^* = \begin{bmatrix} A^* & -B^T \\ B^* & A^T \end{bmatrix} = X^\sharp.$$
Conversely, $X^* = X^\sharp$ translates into block form as
$$\begin{bmatrix} A^* & C^* \\ B^* & D^* \end{bmatrix} = \begin{bmatrix} D^T & -B^T \\ -C^T & A^T \end{bmatrix},$$
so we have proven (1) $\iff$ (2).

We compute $-\mathcal{T} \circ X \circ \mathcal{T}$ for $X$ again in block form, and find
$$-\mathcal{T} \circ X \circ \mathcal{T} \begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix} = -\mathcal{T}\left( \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} -\bar{\mathbf{w}} \\ \bar{\mathbf{v}} \end{bmatrix} \right) = -\mathcal{T}\left( \begin{bmatrix} -A\bar{\mathbf{w}} + B\bar{\mathbf{v}} \\ -C\bar{\mathbf{w}} + D\bar{\mathbf{v}} \end{bmatrix} \right) = \begin{bmatrix} -\bar{C}\mathbf{w} + \bar{D}\mathbf{v} \\ \bar{A}\mathbf{w} - \bar{B}\mathbf{v} \end{bmatrix} = \begin{bmatrix} \bar{D} & -\bar{C} \\ -\bar{B} & \bar{A} \end{bmatrix} \begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix},$$
so the matrix for the linear operator $-\mathcal{T} \circ X \circ \mathcal{T}$ is $\left( X^\sharp \right)^*$. Therefore (2) $\iff$ (3). $\Box$

Definition 1.5.
Let us call $X^* = X^\sharp$ the quaternionic condition, and such a matrix $X$ we will call a quaternionic matrix.

We now dispose of the implications
$$A^* A = 0 \implies A = 0$$
and
$$AB = I \implies BA = I$$
for matrices of quaternions. These are true implications for complex matrices, and in particular for quaternionic matrices, and therefore true for matrices of quaternions.

We pause to note some axioms of the dual operation. It behaves a lot like the transpose. It is linear,
$$(X + \alpha Y)^\sharp = X^\sharp + \alpha Y^\sharp,$$
which is true even for complex $\alpha$. Here $X$ and $Y$ are any $2N$-by-$2N$ complex matrices. The dual reverses multiplication,
$$(XY)^\sharp = Y^\sharp X^\sharp.$$
It undoes itself,
$$\left( X^\sharp \right)^\sharp = X,$$
and commutes with the adjoint,
$$\left( X^\sharp \right)^* = \left( X^* \right)^\sharp.$$

Lemma 1.6.
Every matrix $G$ in $M_{2N}(\mathbb{C})$ can be expressed in a unique way as $G = X + iY$ with $X$ and $Y$ being quaternionic matrices.

Proof. Given $G$ we set
$$X = \frac{1}{2} G^{\sharp *} + \frac{1}{2} G$$
and
$$Y = \frac{i}{2} G^{\sharp *} - \frac{i}{2} G. \qquad \Box$$

Kramers degeneracy and Schur factorization
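Before developing Kramers pairs, the dual operation and the decomposition of Lemma 1.6 can be checked numerically. The sketch below is ours, not the paper's (it assumes numpy, and the helper names `dual` and `phi` are our own):

```python
import numpy as np

N = 3
rng = np.random.default_rng(0)
I, O = np.eye(N), np.zeros((N, N))
Z = np.block([[O, I], [-I, O]])          # the matrix from (1.4)

def dual(X):
    """The dual operation X^# = -Z X^T Z."""
    return -Z @ X.T @ Z

G = rng.standard_normal((2 * N, 2 * N)) + 1j * rng.standard_normal((2 * N, 2 * N))
H = rng.standard_normal((2 * N, 2 * N)) + 1j * rng.standard_normal((2 * N, 2 * N))

# the dual reverses products, undoes itself, and commutes with the adjoint
assert np.allclose(dual(G @ H), dual(H) @ dual(G))
assert np.allclose(dual(dual(G)), G)
assert np.allclose(dual(G).conj().T, dual(G.conj().T))

# Lemma 1.6: G = X + iY with X and Y quaternionic
phi = dual(G).conj().T                   # the conjugate-linear map G -> G^{#*}
X = (phi + G) / 2
Y = 1j * (phi - G) / 2
assert np.allclose(X + 1j * Y, G)
assert np.allclose(X.conj().T, dual(X))  # the quaternionic condition for X
assert np.allclose(Y.conj().T, dual(Y))  # and for Y
```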
Eigenvalue doubling is a key feature of self-dual, self-adjoint matrices. In physics this is called the Kramers degeneracy theorem, or the theory of Kramers pairs [6, 12, 9]. This generalizes in two ways, to give eigenvalue doubling given the symmetry $X = X^\sharp$, and conjugate-pairing of eigenvalues given the symmetry $X^* = X^\sharp$.

Such a collection of paired eigenvectors will, in good situations, form a unitary matrix. If $U$ is a unitary matrix that satisfies the quaternionic condition then it satisfies another symmetry making it symplectic, specifically $U^T Z U = Z$.

Lemma 2.1.
Suppose $U$ is a unitary $2N$-by-$2N$ matrix. The following are equivalent:
(1) $U$ is symplectic;
(2) $U^* = U^\sharp$;
(3) $Z\bar{U} = UZ$;
(4) $U \circ \mathcal{T} = \mathcal{T} \circ U$;
(5) if $\mathbf{v}$ is column $j$ of $U$ for $j \leq N$ then column $N + j$ of $U$ is $\mathcal{T}(\mathbf{v})$.

Proof. This follows easily from Lemma 1.4. $\Box$
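A concrete way to produce symplectic unitaries, and to see the equivalences of Lemma 2.1 in action, is to apply a Cayley transform to a quaternionic skew-Hermitian matrix. This construction is ours, not the paper's; a sketch assuming numpy:

```python
import numpy as np

N = 3
rng = np.random.default_rng(1)
I, O = np.eye(N), np.zeros((N, N))
Z = np.block([[O, I], [-I, O]])

def chi(A, B):
    """The embedding chi(A + B j^) of Definition 1.1."""
    return np.block([[A, B], [-B.conj(), A.conj()]])

def dual(X):
    return -Z @ X.T @ Z

def T(xi):
    """The conjugate-linear operator of (1.1)."""
    v, w = xi[:N], xi[N:]
    return np.concatenate([-w.conj(), v.conj()])

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = chi(A, B)

K = X - X.conj().T                       # quaternionic and skew-Hermitian
U = (np.eye(2 * N) - K) @ np.linalg.inv(np.eye(2 * N) + K)  # Cayley transform

assert np.allclose(U.conj().T @ U, np.eye(2 * N))  # unitary
assert np.allclose(U.T @ Z @ U, Z)                 # (1) symplectic
assert np.allclose(U.conj().T, dual(U))            # (2) quaternionic
assert np.allclose(Z @ U.conj(), U @ Z)            # (3)
for j in range(N):                                 # (5) columns pair under T
    assert np.allclose(U[:, N + j], T(U[:, j]))
```

The Cayley transform of a skew-Hermitian matrix is always unitary, and since the quaternionic matrices form a real algebra closed under inverses, $U$ inherits the quaternionic condition from $K$.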
We need to know that the group of symplectic unitaries acts transitively on the unit sphere of $\mathbb{C}^{2N}$.

Lemma 2.2. If $\mathbf{v}$ is any unit vector in $\mathbb{C}^{2N}$ then there is a symplectic unitary $U$ with $U\mathbf{e}_1 = \mathbf{v}$.

Proof. Let $\mathbf{v}_1 = \mathbf{v}$. We use (1.2) and (1.3) to select, in order, vectors that form an orthonormal basis for $\mathbb{C}^{2N}$ of the form
$$\mathbf{v}_1, \mathcal{T}\mathbf{v}_1, \mathbf{v}_2, \mathcal{T}\mathbf{v}_2, \ldots, \mathbf{v}_N, \mathcal{T}\mathbf{v}_N.$$
If we reorder to
$$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N, \mathcal{T}\mathbf{v}_1, \mathcal{T}\mathbf{v}_2, \ldots, \mathcal{T}\mathbf{v}_N$$
we have the columns of the desired symplectic unitary. $\Box$

The most basic result in this realm is an ugly lemma that says that $\mathcal{T}$ maps eigenvectors of $X$ to left eigenvectors of $X^\sharp$, with the same eigenvalue. The second part of this lemma is more elegant, if less general. It specifies how every quaternionic matrix has a conjugate symmetry in its spectral decomposition.

Lemma 2.3.
Suppose $X$ is in $M_{2N}(\mathbb{C})$.
(1) If $X\xi = \lambda\xi$ then $(\mathcal{T}\xi)^* X^\sharp = \lambda (\mathcal{T}\xi)^*$.
(2) If $X^* = X^\sharp$ and $X\xi = \lambda\xi$ then $X(\mathcal{T}\xi) = \bar{\lambda}(\mathcal{T}\xi)$.

Proof. (1) Starting with
$$X = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \quad \text{and} \quad \xi = \begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix}$$
we find that $X\xi = \lambda\xi$ translates to
$$A\mathbf{v} + B\mathbf{w} = \lambda\mathbf{v}, \qquad C\mathbf{v} + D\mathbf{w} = \lambda\mathbf{w},$$
and $(\mathcal{T}\xi)^* X^\sharp = \lambda(\mathcal{T}\xi)^*$ translates to
$$-\mathbf{w}^T D^T - \mathbf{v}^T C^T = -\lambda\mathbf{w}^T, \qquad \mathbf{w}^T B^T + \mathbf{v}^T A^T = \lambda\mathbf{v}^T,$$
so these are equivalent conditions.

(2) follows from (1) by taking adjoints. $\Box$
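Lemma 2.3(2) says the eigenvalues of a quaternionic matrix come in conjugate pairs, with $\mathcal{T}$ carrying one eigenvector to the other, and this is easy to observe directly. A sketch of ours, assuming numpy:

```python
import numpy as np

N = 3
rng = np.random.default_rng(2)
I, O = np.eye(N), np.zeros((N, N))
Z = np.block([[O, I], [-I, O]])

def chi(A, B):
    return np.block([[A, B], [-B.conj(), A.conj()]])

def dual(X):
    return -Z @ X.T @ Z

def T(xi):
    v, w = xi[:N], xi[N:]
    return np.concatenate([-w.conj(), v.conj()])

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = chi(A, B)
assert np.allclose(X.conj().T, dual(X))   # X satisfies the quaternionic condition

lam, vecs = np.linalg.eig(X)
xi = vecs[:, 0]
# T xi is an eigenvector for the conjugate eigenvalue
assert np.allclose(X @ T(xi), lam[0].conjugate() * T(xi))
# and xi is orthogonal to T xi, as in (1.3)
assert abs(np.vdot(xi, T(xi))) < 1e-8
```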
Now we are able to extend Kramers degeneracy to a variety of situations, starting with a block diagonalization for commuting quaternionic matrices. Part (2) of Theorem 2.4 appeared in [11].
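In the Hermitian quaternionic case the conjugate pairing forces genuine eigenvalue doubling, Kramers degeneracy itself, since the spectrum is real. A quick numerical illustration (ours, assuming numpy):

```python
import numpy as np

N = 4
rng = np.random.default_rng(3)

def chi(A, B):
    return np.block([[A, B], [-B.conj(), A.conj()]])

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = chi(A, B)

H = (X + X.conj().T) / 2       # quaternionic and Hermitian, hence self-dual
evals = np.linalg.eigvalsh(H)  # sorted ascending
# every eigenvalue appears with even multiplicity, so sorted values pair up
assert np.allclose(evals[0::2], evals[1::2])
```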
Theorem 2.4.
Suppose $X_1, \ldots, X_k$ in $M_{2N}(\mathbb{C})$ commute pairwise and $X_j^* = X_j^\sharp$ for all $j$.
(1) There is a single symplectic unitary $U$ so that, for all $j$,
$$U^* X_j U = \begin{bmatrix} T_j & S_j \\ -\bar{S}_j & \bar{T}_j \end{bmatrix}$$
with $T_j$ upper-triangular and $S_j$ strictly upper-triangular.
(2) If, in addition, the $X_j$ are normal then there is a single symplectic unitary $U$ so that, for all $j$,
$$U^* X_j U = \begin{bmatrix} D_j & 0 \\ 0 & \bar{D}_j \end{bmatrix}$$
with $D_j$ diagonal.
(3) Every symplectic unitary has determinant one.

Proof. (1) A finite set of commuting matrices will have a common eigenvector, so let $\mathbf{v}$ be a unit vector so that $X_j\mathbf{v} = \lambda_j\mathbf{v}$. We know then that $\mathcal{T}\mathbf{v}$ is an eigenvector for $X_j$ with eigenvalue $\bar{\lambda}_j$. There is a symplectic unitary $U$ so that $U\mathbf{e}_1 = \mathbf{v}$, and then $U\mathbf{e}_{N+1} = \mathcal{T}\mathbf{v}$ by Lemma 2.1. Let $Y_j = U^* X_j U$. Then $Y_j\mathbf{e}_1 = \lambda_j\mathbf{e}_1$ and $Y_j\mathbf{e}_{N+1} = \bar{\lambda}_j\mathbf{e}_{N+1}$. This means that column $1$ and column $N+1$ are all but zeroed out,
$$Y_j = \begin{bmatrix} \lambda_j & * & 0 & * \\ 0 & A_j & 0 & C_j \\ 0 & * & \bar{\lambda}_j & * \\ 0 & B_j & 0 & D_j \end{bmatrix}.$$
Up to a non-symplectic change of basis we are looking at block upper-triangular matrices that commute, so the lower-right corners in that basis commute. This means that the
$$Z_j = \begin{bmatrix} A_j & C_j \\ B_j & D_j \end{bmatrix} \tag{2.1}$$
all commute, and each must satisfy $Z_j^* = Z_j^\sharp$. By induction, we have proven the first claim.
For (2) we modify the proof just a little. Starting with the $X_j$ normal, we find the $Y_j$ are also normal and so
$$Y_j = \begin{bmatrix} \lambda_j & 0 & 0 & 0 \\ 0 & A_j & 0 & C_j \\ 0 & 0 & \bar{\lambda}_j & 0 \\ 0 & B_j & 0 & D_j \end{bmatrix}.$$
This, after an appropriate basis change, would be block-diagonal, and from this we can conclude that the matrix in (2.1) is normal. The induction proceeds as before, with the stronger conclusion that $B_j = C_j = 0$ and $A_j = \bar{D}_j$ is diagonal.

(3) Applying (2) to a symplectic unitary $W$ we find
$$W = U \begin{bmatrix} D & 0 \\ 0 & \bar{D} \end{bmatrix} U^*$$
where $D$ is a diagonal unitary. Therefore
$$\det(W) = \det(D)\det(\bar{D}) = \det(D)\overline{\det(D)} \geq 0,$$
and a nonnegative number of modulus one equals $1$. $\Box$

Corollary 2.5.
Every matrix in $M_N(\mathbb{H})$ is unitarily equivalent to an upper-triangular matrix in $M_N(\mathbb{H})$ that has complex numbers on the diagonal.

There is an algorithm [1] for the Schur decomposition of quaternionic matrices.
Corollary 2.6.
Every normal matrix in $M_N(\mathbb{H})$ is unitarily equivalent to a diagonal matrix in $M_N(\mathbb{C})$. Every Hermitian matrix in $M_N(\mathbb{H})$ is unitarily equivalent to a diagonal matrix in $M_N(\mathbb{R})$.

Corollary 2.7.
Every Hermitian self-dual matrix $X$ in $M_{2N}(\mathbb{C})$ is of the form
$$X = U \begin{bmatrix} D & 0 \\ 0 & D \end{bmatrix} U^*$$
for some symplectic unitary $U$ and a diagonal real matrix $D$.

These corollaries cause us to reconsider the concept of left and right eigenvalues in $\mathbb{H}$ and focus on just those that are in $\mathbb{C}$. For details on left eigenvalues and non-complex right eigenvalues, consult [4]. We put the complex right eigenvalues in a simple context with the following.

Lemma 2.8.
Suppose $\mathbf{v}, \mathbf{w} \in \mathbb{C}^N$ and $\lambda \in \mathbb{C}$ and $A, B \in M_N(\mathbb{C})$. Then
$$\begin{bmatrix} A & B \\ -\bar{B} & \bar{A} \end{bmatrix} \begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix} = \lambda \begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix}$$
if and only if
$$\left( A + B\hat{j} \right)\left( \mathbf{v} - \hat{j}\mathbf{w} \right) = \left( \mathbf{v} - \hat{j}\mathbf{w} \right)\lambda.$$
Therefore $\lambda \in \mathbb{C}$ is a right eigenvalue of $X \in M_N(\mathbb{H})$ if and only if $\lambda$ is an eigenvalue of $\chi(X)$.

Proof. This is a short, direct calculation. $\Box$

Jordan canonical form
Kramers degeneracy extends to the generalized eigenvectors used to find the Jordan canonical form.
Lemma 3.1.
Suppose $X^\sharp = X^*$. If $(X - \lambda I)^r \mathbf{v} = 0$ and $(X - \lambda I)^{r-1}\mathbf{v} \neq 0$ then
$$\left( X - \bar{\lambda}I \right)^r \mathcal{T}\mathbf{v} = 0 \quad \text{and} \quad \left( X - \bar{\lambda}I \right)^{r-1}\mathcal{T}\mathbf{v} \neq 0.$$

Proof.
For any $Y$ we have
$$\| Y\mathbf{v} \| = \left\| \bar{Y}\bar{\mathbf{v}} \right\| = \left\| Z\bar{Y}ZZ\bar{\mathbf{v}} \right\|,$$
proving $\| Y\mathbf{v} \| = \left\| Y^{\sharp *}\mathcal{T}\mathbf{v} \right\|$. Since
$$\left( (X - \lambda I)^k \right)^{\sharp *} = \left( X - \bar{\lambda}I \right)^k$$
the result follows. $\Box$

We see the general idea of a proof [14] of the Jordan decomposition for a quaternionic matrix. When we build a Jordan basis we need to respect $\mathcal{T}$ in two ways. Whatever basis we pick for the subspace corresponding to $\lambda$ with positive imaginary part, we apply $\mathcal{T}$ to get the basis for the subspace corresponding to $\bar{\lambda}$. When $\lambda$ is real we need to pick generalized eigenvectors in pairs.

The Jordan form of a quaternionic matrix is not so elegant, as each Jordan block larger than 2-by-2 gets spread around to all four quadrants of the matrix. We work directly with a Jordan basis and then make our final conclusion in terms of quaternions.

Theorem 3.2.
Suppose $X^\sharp = X^*$. There is a Jordan basis for $X$ consisting of pairs of the form $\mathbf{v}, \mathcal{T}\mathbf{v}$.
Proof.
Let $N_\lambda$ denote the subspace of all generalized eigenvectors for $\lambda$, together with the zero vector. Recall that, just treating $X$ as a complex matrix, the procedure to select a Jordan basis involves selecting, for each $\lambda$, a basis for $N_\lambda$ with the following property: whenever $\mathbf{b}$ is in this basis, then $(X - \lambda I)\mathbf{b}$ is either zero or back in this basis.

For $\lambda$ in the spectrum with positive imaginary part we make such a choice and then apply $\mathcal{T}$ to get a set of vectors in $N_{\bar{\lambda}}$ that has the correct number of elements to be a basis of $N_{\bar{\lambda}}$. It will also be a linearly independent set, since $Z$ and conjugation both preserve linear independence. Since
$$\left( X - \bar{\lambda}I \right)\mathcal{T}\mathbf{b} = \mathcal{T}(X - \lambda I)\mathbf{b},$$
this basis of $N_{\bar{\lambda}}$ has the desired property.

For $\lambda$ in the spectrum that is real we need to modify the procedure for selecting the basis of $N_\lambda$. A common procedure selects a basis $\mathbf{b}_{r,1}, \ldots, \mathbf{b}_{r,m_r}$ for
$$\ker\left( (X - \lambda)^r \right) \cap \left( \ker\left( (X - \lambda)^{r-1} \right) \right)^\perp \cap \left( \operatorname{im}(X - \lambda) \right)^\perp \tag{3.1}$$
and constructs for $N_\lambda$ the Jordan basis
$$\left\{ (X - \lambda)^j \mathbf{b}_{r,k} \;\middle|\; 1 \leq r \leq r_{\max},\ 0 \leq j \leq r - 1,\ k = 1, \ldots, m_r \right\}.$$
Since
$$\mathcal{T}(X - \lambda)^j \mathbf{b}_{r,k} = (X - \lambda)^j \mathcal{T}\mathbf{b}_{r,k},$$
we get the desired structure if the subspaces (3.1) are $\mathcal{T}$-invariant. This follows from the next two lemmas and the equality
$$\left( \operatorname{im}(X - \lambda) \right)^\perp = \ker\left( X^* - \lambda \right).$$
We can now assemble the bases of the various $N_\lambda$ to get a Jordan basis built from Kramers pairs. $\Box$

Lemma 3.3. If $X^\sharp = X^*$ then $\ker(X)$ is $\mathcal{T}$-invariant.

Proof. This is a restatement of the $\lambda = 0$ case of Lemma 2.3(2). $\Box$

Lemma 3.4.
If a subspace $H$ is $\mathcal{T}$-invariant then $H^\perp$ is also $\mathcal{T}$-invariant.

Proof. If $\mathbf{v}$ is in $H^\perp$ then for every $\mathbf{w} \in H$ we have
$$\langle \mathcal{T}\mathbf{v}, \mathbf{w} \rangle = -\langle \mathcal{T}\mathbf{v}, \mathcal{T}\mathcal{T}\mathbf{w} \rangle = -\overline{\langle \mathbf{v}, \mathcal{T}\mathbf{w} \rangle} = 0. \qquad \Box$$

Corollary 3.5.
Suppose $X \in M_n(\mathbb{H})$. There is an invertible matrix $S \in M_n(\mathbb{H})$ and a complex matrix $J$ in Jordan form such that $X = S^{-1}JS$.

Norms
There are two operator norms to consider on $X$ in $M_N(\mathbb{H})$: that induced by quaternionic Hilbert space, and that induced by complex Hilbert space on $\chi(X)$. They end up identical.

Theorem 4.1.
Suppose $X$ is in $M_N(\mathbb{H})$. Then using the norms
$$\| \mathbf{v} \| = \left( \sum_{j=1}^N v_j^* v_j \right)^{1/2}$$
on $\mathbb{H}^N$ and
$$\| \mathbf{w} \| = \left( \sum_{j=1}^{2N} \bar{w}_j w_j \right)^{1/2}$$
on $\mathbb{C}^{2N}$, we have
$$\sup_{\mathbf{v} \neq 0} \frac{\| X\mathbf{v} \|}{\| \mathbf{v} \|} = \sup_{\mathbf{w} \neq 0} \frac{\| \chi(X)\mathbf{w} \|}{\| \mathbf{w} \|}.$$

Proof.
Utilizing also the norm on $\mathbb{C}^N$, we calculate the four relevant norms:
$$\left\| \begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix} \right\|^2 = \| \mathbf{v} \|^2 + \| \mathbf{w} \|^2,$$
$$\left\| \mathbf{v} - \hat{j}\mathbf{w} \right\|^2 = \| \mathbf{v} \|^2 + \| \mathbf{w} \|^2,$$
$$\left\| \begin{bmatrix} A & B \\ -\bar{B} & \bar{A} \end{bmatrix} \begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix} \right\|^2 = \| A\mathbf{v} + B\mathbf{w} \|^2 + \left\| \bar{A}\mathbf{w} - \bar{B}\mathbf{v} \right\|^2,$$
and
$$\left\| \left( A + B\hat{j} \right)\left( \mathbf{v} - \hat{j}\mathbf{w} \right) \right\|^2 = \left\| A\mathbf{v} + B\mathbf{w} + \hat{j}\left( \bar{B}\mathbf{v} - \bar{A}\mathbf{w} \right) \right\|^2 = \| A\mathbf{v} + B\mathbf{w} \|^2 + \left\| \bar{B}\mathbf{v} - \bar{A}\mathbf{w} \right\|^2.$$
The result follows. $\Box$

Singular value decomposition
As in the complex case, a slick way to prove there is a singular value decomposition is to work out the polar decomposition and then use the spectral theorem on the positive part.

In [13] it is stated that "a little more work is needed for the singular case" when discussing the polar decomposition. The extra work involves padding out a quaternionic partial isometry to be a quaternionic unitary (so a symplectic unitary).

We remind the reader that $U$ is a partial isometry when $(U^*U)^2 = U^*U$, or equivalently $UU^*U = U$, or $U^*UU^* = U^*$, or $(UU^*)^2 = UU^*$.
If we restrict the domain and range of $U$ we find it is an isometry from $(\ker U)^\perp$ to $(\ker U^*)^\perp$.

Lemma 5.1.
Suppose $U^* = U^\sharp$ and $U$ is a partial isometry in $M_{2N}(\mathbb{C})$. There is a symplectic unitary $W$ in $M_{2N}(\mathbb{C})$ so that $W\xi = U\xi$ for all $\xi \perp \ker(U)$.

Proof. If $\mathbf{v}$ is in $\ker(U)$ then by Lemma 2.3, $\mathcal{T}\mathbf{v}$ is also in $\ker(U)$. Since $\mathbf{v}$ and $\mathcal{T}\mathbf{v}$ are orthogonal, we can show that $\ker(U)$ has even dimension $2m$ and that it has a basis of the form
$$\mathbf{v}_1, \ldots, \mathbf{v}_m, \mathcal{T}\mathbf{v}_1, \ldots, \mathcal{T}\mathbf{v}_m.$$
We are working in finite dimensions, so the dimension of $\ker(U^*)$ is also $2m$ and we select for it a basis
$$\mathbf{w}_1, \ldots, \mathbf{w}_m, \mathcal{T}\mathbf{w}_1, \ldots, \mathcal{T}\mathbf{w}_m.$$
We can define $W$ to agree with $U$ on $(\ker(U))^\perp$ and to send $\mathbf{v}_j$ to $\mathbf{w}_j$ and $\mathcal{T}\mathbf{v}_j$ to $\mathcal{T}\mathbf{w}_j$, and so get a unitary that commutes with $\mathcal{T}$, which means it is symplectic. $\Box$

Lemma 5.2.
Suppose $X^* = X^\sharp$ in $M_{2N}(\mathbb{C})$. Then there is a unitary $U$ and a positive semidefinite $P$ with $U^* = U^\sharp$ and $P^* = P^\sharp$ and $X = UP$.

Proof.
Let $f$ be a continuous function on the nonnegative reals with $f(0) = 0$ and $f(\lambda) = \lambda^{-1/2}$ for every nonzero eigenvalue of $X^*X$. Then let
$$W = Xf(X^*X).$$
The usual calculations in functional calculus tell us $W$ is a partial isometry and that $X = WP$ for $P = (X^*X)^{1/2}$. Working with monomials, then polynomials, and then taking limits, we can show
$$(f(Y))^\sharp = f\left( Y^\sharp \right)$$
for any positive operator $Y$, and so
$$W^\sharp = f\left( X^\sharp X^{*\sharp} \right) X^\sharp = f(X^*X)X^* = W^*.$$
Also $P^\sharp = P = P^*$.

We have just shown that the matrices in the minimal polar decomposition are quaternionic. We use Lemma 5.1 to finish the argument. $\Box$

As expected, the polar decomposition leads to a singular value decomposition.
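For a generic (hence invertible) quaternionic $X$, the factors above can be computed explicitly: $P = (X^*X)^{1/2}$ and $U = XP^{-1}$, both satisfying the quaternionic condition, and by Kramers degeneracy of $X^*X$ the singular values double up. A sketch of ours, assuming numpy:

```python
import numpy as np

N = 3
rng = np.random.default_rng(4)
I, O = np.eye(N), np.zeros((N, N))
Z = np.block([[O, I], [-I, O]])

def chi(A, B):
    return np.block([[A, B], [-B.conj(), A.conj()]])

def dual(X):
    return -Z @ X.T @ Z

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = chi(A + 4 * I, B)            # the shift keeps X comfortably invertible

s = np.linalg.svd(X, compute_uv=False)
assert np.allclose(s[0::2], s[1::2])          # singular values come in pairs

w, V = np.linalg.eigh(X.conj().T @ X)
P = V @ np.diag(np.sqrt(w)) @ V.conj().T      # P = (X* X)^{1/2}
U = X @ np.linalg.inv(P)                      # generically P is invertible
assert np.allclose(U @ P, X)
assert np.allclose(U.conj().T @ U, np.eye(2 * N))  # U is unitary
assert np.allclose(P.conj().T, dual(P))       # P is quaternionic
assert np.allclose(U.conj().T, dual(U))       # U is quaternionic
```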
Theorem 5.3.
Suppose $X^* = X^\sharp$ in $M_{2N}(\mathbb{C})$. There are symplectic unitary matrices $U$ and $V$ and a diagonal matrix $D$ with nonnegative real entries and $D^\sharp = D^*$ so that $X = UDV$.
Proof.
We take a quaternionic polar decomposition $X = WP$. Since $P$ is positive, we apply Theorem 2.4 to get symplectic unitary matrices $Q$ and $V$ and a diagonal matrix $D$ so that $P = QDV$. The eigenvalues of $P$ are nonnegative, so the same is true for the diagonal elements of $D$, and we have the needed factorization $X = (WQ)DV$. $\Box$

QR factorization
It is easy to use Lemma 2.2 to get a QR factorization theorem over $\mathbb{H}$. Notice that upper triangular matrices are sent by $\chi$ to matrices that are block upper-triangular.

Theorem 6.1. (1) If $X^* = X^\sharp$ in $M_{2N}(\mathbb{C})$ then there is a symplectic unitary $Q$ and a matrix $R$ of the form
$$R = \begin{bmatrix} A & B \\ -\bar{B} & \bar{A} \end{bmatrix}$$
with $A$ and $B$ upper triangular, so that $X = QR$.
(2) If $X \in M_N(\mathbb{H})$ then there is a unitary $Q$ and an upper triangular matrix $R$ in $M_N(\mathbb{H})$ so that $X = QR$.

Proof. (2) follows directly from (1), so we prove (1). We apply Lemma 2.2 to the first column of $X$ and find $X = Q_1R_1$ where
$$R_1 = \begin{bmatrix} A & B \\ -\bar{B} & \bar{A} \end{bmatrix},$$
where $A$ and $B$ have zeros in their first columns, except perhaps in the top position. As we did earlier, we can proceed with a proof by induction. $\Box$

Self-dual matrices
If we study matrices with $X^\sharp = X$ then we are no longer working directly with quaternionic matrices but, as we discuss below, there is a connection. We discuss a Schur factorization and a structured polar decomposition for self-dual matrices. The latter is a bit tricky, so we warm up with a structured polar decomposition for symmetric complex matrices.

For dealing with a single self-dual matrix, there is the efficient Paige / Van Loan algorithm [5, 10] to implement the following theorem.

Theorem 7.1.
Given $X_1, \ldots, X_k$ in $M_{2N}(\mathbb{C})$ that commute pairwise and are self-dual, there is a single symplectic unitary $U$ so that, for all $j$,
$$U^* X_j U = \begin{bmatrix} T_j & C_j \\ 0 & T_j^T \end{bmatrix}$$
with $T_j$ upper-triangular and the $C_j$ skew-symmetric.
Proof.
Let $\mathbf{v}$ be a unit vector so that $X_j\mathbf{v} = \lambda_j\mathbf{v}$ for all $j$. By Lemma 2.3,
$$X_j^*(\mathcal{T}\mathbf{v}) = \bar{\lambda}_j(\mathcal{T}\mathbf{v}).$$
There is a symplectic unitary $U$ so that $U\mathbf{e}_1 = \mathbf{v}$ and $U\mathbf{e}_{N+1} = \mathcal{T}\mathbf{v}$. Let $Y_j = U^* X_j U$. Then $Y_j\mathbf{e}_1 = \lambda_j\mathbf{e}_1$ and $Y_j^*\mathbf{e}_{N+1} = \bar{\lambda}_j\mathbf{e}_{N+1}$. Since $\mathbf{e}_{N+1}$ is real, we take adjoints and discover
$$\mathbf{e}_{N+1}^T Y_j = \lambda_j \mathbf{e}_{N+1}^T.$$
This means that column $1$ and row $N+1$ are all but zeroed out,
$$Y_j = \begin{bmatrix} \lambda_j & * & * & * \\ 0 & A_j & * & C_j \\ 0 & 0 & \lambda_j & 0 \\ 0 & B_j & * & D_j \end{bmatrix}.$$
Basic facts about block triangular matrices show that the
$$Z_j = \begin{bmatrix} A_j & C_j \\ B_j & D_j \end{bmatrix}$$
form a commuting family of matrices, and since $U$ was chosen to be symplectic, the $Z_j$ will be self-dual. A simple induction now finishes the proof. $\Box$

A promising numerical technique for the joint diagonalization of two commuting self-dual self-adjoint matrices $H$ and $K$ would be to form the normal self-dual matrix $X = H + iK$, apply Paige / Van Loan to reduce to block diagonal form, and then apply the ordinary Schur decomposition. This technique was used in [5, §9] to diagonalize matrices that were exactly self-dual and approximately unitary. This idea is mentioned in [2], section 6.5.

It is not hard to show that the minimal polar decomposition, the one that is unique and can involve a partial isometry, preserves in some way just about any symmetry thrown at it. This is because the functional calculus interacts well with the dual operation [8], as well as with the transpose. It is a bit harder to figure out what happens for the maximal polar decomposition, meaning the factorization that involves a unitary.
We begin with the easier result about the polar decomposition of complex symmetric matrices.
Theorem 7.2. If $X$ in $M_n(\mathbb{C})$ satisfies $X^T = X$ then there is a unitary $U$ so that $U^T = U$ and $X = U|X|$.

Proof. Again choose $f$ with $f(0) = 0$ and $f(\lambda) = \lambda^{-1/2}$ for every nonzero eigenvalue of $X^*X$, and let
$$W = Xf(X^*X).$$
As always, $W$ is a partial isometry and $X = WP$ for $P = (X^*X)^{1/2} = |X|$. Now we discover
$$W^T = f\left( (X^*X)^T \right)X^T = f(XX^*)X = Xf(X^*X) = W.$$
To create a unitary $U$ with $X = U|X|$ we must extend $W$ to map $\ker(W)$ to $\ker(W^*)$. We can arrange $U^T = U$ as follows. Let $\mathbf{v}_1, \ldots, \mathbf{v}_m$ be an orthonormal basis of $\ker(W)$. Then $\bar{\mathbf{v}}_1, \ldots, \bar{\mathbf{v}}_m$ will be an orthonormal basis of $\ker(W^*) = \overline{\ker(W)}$, and we define $V$ to be zero on $(\ker(W))^\perp$ and $V\mathbf{v}_j = \bar{\mathbf{v}}_j$. Thus $V^*$ will be zero on $(\ker(W^*))^\perp$ and $V^*\bar{\mathbf{v}}_j = \mathbf{v}_j$, but the same can be said about $\bar{V}$. That means $V^T = V$, and so $U = W + V$ will be the required symmetric unitary.

Notice there is no structure on $|X|$, but considered together with $|X^*|$ we get the formula
$$|X^*| = |X|^T. \tag{7.1}$$
$\Box$

For the self-dual situation, we shall see that a similar construction works so long as we respect Kramers degeneracy.
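Theorem 7.2 is easy to test for a generic invertible symmetric $X$, where $U = X|X|^{-1}$. A sketch of ours, assuming numpy:

```python
import numpy as np

n = 4
rng = np.random.default_rng(5)
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = M + M.T + 4 * np.eye(n)     # complex symmetric; the shift keeps it invertible

w, V = np.linalg.eigh(X.conj().T @ X)
P = V @ np.diag(np.sqrt(w)) @ V.conj().T    # |X| = (X* X)^{1/2}
U = X @ np.linalg.inv(P)

assert np.allclose(U @ P, X)
assert np.allclose(U.conj().T @ U, np.eye(n))  # U is unitary
assert np.allclose(U, U.T)                     # and symmetric, as the theorem asserts
```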
Proposition 7.3.
If a partial isometry $W$ in $M_{2N}(\mathbb{C})$ is self-dual, then the initial space of $W$ will have even dimension and $\mathcal{T}$ will map the initial space isometrically onto the final space of $W$.

Proof. Applying Lemma 2.3 to the self-adjoint matrix $W^*W$ we find
$$W^*W\mathbf{v} = \mathbf{v} \implies (W^*W)^\sharp \mathcal{T}\mathbf{v} = \mathcal{T}\mathbf{v} \implies (WW^*)\mathcal{T}\mathbf{v} = \mathcal{T}\mathbf{v}.$$
That is, when $\mathbf{v} \in (\ker W)^\perp$ we have $\mathcal{T}\mathbf{v} \in (\ker W^*)^\perp$ and $\| W^*\mathcal{T}\mathbf{v} \| = \| \mathbf{v} \|$. This is useful because $\mathcal{T}$ satisfies $\langle \mathbf{a}, \mathcal{T}\mathbf{b} \rangle = -\langle \mathbf{b}, \mathcal{T}\mathbf{a} \rangle$, while self-duality of $W$ gives $\mathcal{T}W = W^*\mathcal{T}$, so
$$\langle \mathbf{v}, W^*\mathcal{T}\mathbf{v} \rangle = \langle \mathbf{v}, \mathcal{T}W\mathbf{v} \rangle = -\langle W\mathbf{v}, \mathcal{T}\mathbf{v} \rangle = -\langle \mathbf{v}, W^*\mathcal{T}\mathbf{v} \rangle,$$
which means $\mathbf{v}$ and $W^*\mathcal{T}\mathbf{v}$ are orthogonal, and
$$W^*\mathcal{T}W^*\mathcal{T}\mathbf{v} = W^*ZW^TZ\mathbf{v} = -W^*W^\sharp\mathbf{v} = -W^*W\mathbf{v} = -\mathbf{v}.$$
If we start with a unit vector $\mathbf{v}$ in $(\ker W)^\perp$ then we end up with an orthogonal pair of unit vectors $\mathbf{v}$ and $\mathbf{w}$ with $W^*\mathcal{T}\mathbf{v} = \mathbf{w}$ and $W^*\mathcal{T}\mathbf{w} = -\mathbf{v}$.

If $\mathbf{q}$ is a vector in $(\ker W)^\perp$ that is orthogonal to both $\mathbf{v}$ and $\mathbf{w}$ then $W^*\mathcal{T}\mathbf{q}$ will be orthogonal to both $W^*\mathcal{T}\mathbf{v}$ and $W^*\mathcal{T}\mathbf{w}$, since $\mathcal{T}$ preserves orthogonality everywhere and $W^*$ preserves the orthogonality of vectors in $(\ker W^*)^\perp$. Thus we can create a basis of $(\ker W)^\perp$ out of pairs
$$\mathbf{v}_1, W^*\mathcal{T}\mathbf{v}_1, \ldots, \mathbf{v}_m, W^*\mathcal{T}\mathbf{v}_m. \qquad \Box$$

We need some examples of self-dual partial isometries. Treating vectors as $2N$-by-$1$ matrices, if we set
$$V = (\mathcal{T}\mathbf{v})\mathbf{w}^* - (\mathcal{T}\mathbf{w})\mathbf{v}^*$$
then this rank two (at most) matrix is self-dual, since
$$V^\sharp = -Z\left( -Z\bar{\mathbf{v}}\mathbf{w}^* + Z\bar{\mathbf{w}}\mathbf{v}^* \right)^T Z = -Z\left( \bar{\mathbf{w}}\mathbf{v}^*Z - \bar{\mathbf{v}}\mathbf{w}^*Z \right)Z = Z\bar{\mathbf{w}}\mathbf{v}^* - Z\bar{\mathbf{v}}\mathbf{w}^* = -\mathcal{T}(\mathbf{w})\mathbf{v}^* + \mathcal{T}(\mathbf{v})\mathbf{w}^* = V.$$
If we start with $\mathbf{v}$ and $\mathbf{w}$ orthogonal, then $V$ will be the rank-two partial isometry taking $\mathbf{v}$ to $-\mathcal{T}\mathbf{w}$ and $\mathbf{w}$ to $\mathcal{T}\mathbf{v}$. We record this (up to an overall sign) as a lemma that avoids the ugly notation.
Suppose $\mathbf{v}$ and $\mathbf{w}$ are orthogonal unit vectors. Then the partial isometry from $\mathbb{C}\mathbf{v} + \mathbb{C}\mathbf{w}$ to $\mathbb{C}\mathcal{T}\mathbf{v} + \mathbb{C}\mathcal{T}\mathbf{w}$ that sends $\mathbf{v}$ to $\mathcal{T}\mathbf{w}$ and $\mathbf{w}$ to $-\mathcal{T}\mathbf{v}$ will be self-dual.

Theorem 7.5. If $X$ in $M_{2N}(\mathbb{C})$ satisfies $X^\sharp = X$ then there is a unitary $U$ so that $U^\sharp = U$ and $X = U|X|$.

Proof. Once more $W = Xf(X^*X)$ works to create a partial isometry $W$ with $X = WP$ for $P = |X|$, and this time we have $W^\sharp = W$. We need a partial isometry from $\ker(W)$ to $\ker(W^*)$ that is self-dual. The dimension of $\ker(W)$ will be even, by Proposition 7.3. Moreover, if
$$\mathbf{v}_1, \mathbf{w}_1, \ldots, \mathbf{v}_m, \mathbf{w}_m$$
is an orthogonal basis for $\ker(W)$ then
$$\mathcal{T}\mathbf{v}_1, \mathcal{T}\mathbf{w}_1, \ldots, \mathcal{T}\mathbf{v}_m, \mathcal{T}\mathbf{w}_m$$
will be an orthogonal basis for $\ker(W^*)$. The needed self-dual partial isometry $V$ will send $\mathbf{v}_j$ to $\mathcal{T}\mathbf{w}_j$ and $\mathbf{w}_j$ to $-\mathcal{T}\mathbf{v}_j$, and the self-dual unitary we use will be $U = W + V$. $\Box$

The odd particle causes even degeneracy
We hope not to scare the mathematical reader with more discussion of Kramers degeneracy. What Kramers discovered was that for certain systems involving an odd number of electrons, the Hamiltonian always has all eigenvalues with even multiplicity.

A mathematical manifestation of this is that the tensor product of two dual operations is the transpose operation in disguise; specifically, after an orthogonal change of basis, it becomes the transpose. In contrast to that, the tensor product of three dual operations is a larger dual operation in disguise.

This is essentially the same as the facts, more familiar to mathematicians, that $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{H} \cong M_4(\mathbb{R})$ and $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{H} \otimes_{\mathbb{R}} \mathbb{H} \cong M_4(\mathbb{H})$.

In the following, we let $Z_N$ be as in (1.4), the matrix that specifies the dual operation.

Lemma 8.1.
Consider
$$U = \frac{1}{\sqrt{2}}\left( I \otimes I - iZ_N \otimes Z_M \right).$$
For all $X \in M_{2N}(\mathbb{C})$ and $Y \in M_{2M}(\mathbb{C})$,
$$U^*\left( X^\sharp \otimes Y^\sharp \right)U = \left( U^*(X \otimes Y)U \right)^T.$$
Proof.
Since $Z_K^T = -Z_K$ we see $U = U^T$. Also
$$U^*U = \tfrac{1}{2}\left( I \otimes I + iZ_N \otimes Z_M \right)\left( I \otimes I - iZ_N \otimes Z_M \right) = \tfrac{1}{2}\left( I \otimes I + (Z_N \otimes Z_M)^2 \right)·\!\!\!\!\!\;\, = I \otimes I,$$
so $U$ is a unitary and $\bar{U} = U^{-1}$. Since
$$U^2 = -iZ_N \otimes Z_M$$
we find
$$U^*\left( X^\sharp \otimes Y^\sharp \right)U = \bar{U}\left( X^\sharp \otimes Y^\sharp \right)U = \bar{U}\left( -iZ_N \otimes Z_M \right)\left( X^T \otimes Y^T \right)\left( iZ_N \otimes Z_M \right)U = \bar{U}U^2\left( X^T \otimes Y^T \right)\bar{U}^2U = \left( U^*(X \otimes Y)U \right)^T. \qquad \Box$$

We have refrained from discussing $C^*$-algebras, but here they add clarity. What the lemma is showing is
$$\left( M_{2N}(\mathbb{C}) \otimes M_{2M}(\mathbb{C}),\ \sharp \otimes \sharp \right) \cong \left( M_{4NM}(\mathbb{C}),\ T \right).$$

Lemma 8.2.
For all $X \in M_N(\mathbb{C})$ and $Y \in M_{2M}(\mathbb{C})$,
$$X^T \otimes Y^\sharp = (X \otimes Y)^\sharp,$$
where the $\sharp$ on the right is taken with respect to $I \otimes Z_M$.

Proof. This is much simpler, as
$$(X \otimes Y)^\sharp = -(I \otimes Z_M)(X \otimes Y)^T(I \otimes Z_M) = -(I \otimes Z_M)\left( X^T \otimes Y^T \right)(I \otimes Z_M) = X^T \otimes \left( -Z_M Y^T Z_M \right). \qquad \Box$$

Together, these lemmas show us that a tensor product of three $\left( M_{2N_j}(\mathbb{C}), \sharp \right)$ leads to something isomorphic to $\left( M_{8N_1N_2N_3}(\mathbb{C}), \sharp \right)$.

Acknowledgements
The author gratefully acknowledges the guidance and assistance received from mathematicians and physicists, in particular Vageli Coutsias, Matthew Hastings and Adam Sørensen.

This work was partially supported by a grant from the Simons Foundation (208723 to Loring), and by the Efroymson fund at the University of New Mexico.
References

[1] A. Bunse-Gerstner, R. Byers, and V. Mehrmann, A quaternion QR algorithm, Numerische Mathematik, 55 (1989), pp. 83–95.
[2] A. Bunse-Gerstner, R. Byers, and V. Mehrmann, Numerical methods for simultaneous diagonalization, SIAM J. Matrix Anal. Appl., 14 (1993), pp. 927–949.
[3] G. K. Pedersen, Factorization in C*-algebras, Exposition. Math., 16 (1998), pp. 145–156.
[4] D. R. Farenick and B. A. F. Pidkowich, The spectral theorem in quaternions, Linear Algebra Appl., 371 (2003), pp. 75–102.
[5] M. B. Hastings and T. A. Loring, Topological insulators and C*-algebras: Theory and numerical practice, Ann. Physics, 326 (2011), pp. 1699–1759.
[6] H. Kramers, Théorie générale de la rotation paramagnétique dans les cristaux, Proc. Acad. Amst., 33 (1930), pp. 959–972.
[7] H. C. Lee, Eigenvalues and canonical forms of matrices with quaternion coefficients, Proc. Roy. Irish Acad. Sect. A., 52 (1949), pp. 253–260.
[8] T. A. Loring and A. Sørensen, Almost commuting self-adjoint matrices—the real and self-dual cases, arXiv:1012.3494.
[9] M. Mehta, Random Matrices, Academic Press, 2004.
[10] C. Paige and C. Van Loan, A Schur decomposition for Hamiltonian matrices, Linear Algebra Appl., 41 (1981), pp. 11–32.
[11] N. Wiegmann, Some theorems on matrices with real quaternion elements, Canad. J. Math., 7 (1955), pp. 191–201.
[12] E. Wigner, Über die Operation der Zeitumkehr in der Quantenmechanik, Nachr. Ges. Wiss. Gött., 32 (1932), pp. 546–559.
[13] F. Zhang, Quaternions and matrices of quaternions, Linear Algebra Appl., 251 (1997), pp. 21–57.
[14] F. Zhang and Y. Wei, Jordan canonical form of a partitioned complex matrix and its application to real quaternion matrices, Communications in Algebra, 29 (2001), pp. 2363–2375.