NEW CLASSES OF MATRIX DECOMPOSITIONS
KE YE
Abstract.
The idea of decomposing a matrix into a product of structured matrices such as triangular, orthogonal, and diagonal matrices is a milestone of numerical computations. In this paper, we describe six new classes of matrix decompositions, extending our work in [8]. We prove that every n × n matrix is a product of finitely many bidiagonal, skew symmetric (when n is even), companion, or generalized Vandermonde matrices, respectively. We also prove that a generic n × n centrosymmetric matrix is a product of finitely many symmetric Toeplitz (resp. persymmetric Hankel) matrices. We determine an upper bound on the number of structured matrices needed to decompose a matrix in each case.

1. Introduction
Matrix decomposition is an important technique in numerical computations. For example, we have the classical matrix decompositions:
(1) LU: a generic matrix can be decomposed as a product of an upper triangular matrix and a lower triangular matrix.
(2) QR: every matrix can be decomposed as a product of an orthogonal matrix and an upper triangular matrix.
(3) SVD: every matrix can be decomposed as a product of two orthogonal matrices and a diagonal matrix.
These matrix decompositions play a central role in engineering and scientific problems related to matrix computations [1], [6]. For example, consider a linear system Ax = b, where A is an n × n matrix and b is a column vector of length n. We can first apply the LU decomposition to A to obtain LUx = b, where L is a lower triangular matrix and U is an upper triangular matrix. Next, we solve the two triangular systems
    Ly = b,    Ux = y
to obtain the solution of the original linear system. The advantage of first decomposing A into the product of L and U is that solving a linear system with a triangular coefficient matrix is much easier than solving one with a general coefficient matrix; a similar idea applies to the QR and SVD decompositions (see the sketch below).
The classical matrix decompositions (LU, QR and SVD) correspond to the Bruhat, Iwasawa, and Cartan decompositions of Lie groups [5, 2]. Beyond these classical ones, there are other matrix decompositions. For instance,
(1) Every n × n matrix is a product of at most 2n + 5 Toeplitz (resp. Hankel) matrices [8].
(2) Every matrix is a product of two symmetric matrices [3].
KY is partially supported by AFOSR FA9550-13-1-0133, DARPA D15AP00109, NSF IIS 1546413, DMS 1209136, and DMS 1057064.
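For readers who want to see the triangular-solve idea in practice, here is a minimal sketch using SciPy (the library factors with partial pivoting; the random matrix and right-hand side are our own illustrative choices, not from the paper):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Solve Ax = b by factoring A = LU once (with partial pivoting),
# then solving the two triangular systems Ly = b and Ux = y.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
b = rng.standard_normal(5)

lu, piv = lu_factor(A)       # O(n^3) factorization, done once
x = lu_solve((lu, piv), b)   # each subsequent solve is only O(n^2)
assert np.allclose(A @ x, b)
```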
As we have seen for the classical matrix decompositions, Toeplitz, Hankel and symmetric matrix decompositions are important in the sense that structured matrices are well understood. For example, a Toeplitz linear system can be solved in O(n log² n) operations using displacement rank [10], compared to at least O(n²) for general linear systems. Sometimes "matrix decomposition" refers to the decomposition of a matrix into the sum of two matrices; see for example [16, 17, 18]. However, whenever we mention matrix decomposition in this paper, we always refer to the multiplicative version.
In this article, we study matrix decompositions beyond those mentioned above. We use algebraic geometry as our tool to explore the existence of matrix decompositions for various structured matrices. We define the necessary notions in Section 2, and in Section 3 we prove some general results for the matrix decomposition problem and establish a strategy to tackle it. In Section 4 we discuss the matrix decomposition problem with two factors and recover the LU decomposition and the QR decomposition for generic matrices using our method. Here the LU (resp. QR) decomposition for generic matrices means that the set of matrices which can be written as the product of a lower triangular (resp. orthogonal) matrix and an upper triangular matrix is a dense subset (with the Zariski topology) of the space of all n × n matrices.
In Section 5 we apply the strategy built in Section 3.4 to the matrix decomposition problem for linear subspaces. Lastly, in Section 6 we apply the strategy to the matrix decomposition problem for non-linear varieties. We summarize our contributions in the following list:
(1) Bidiagonal decomposition (Section 5.1).
(2) Skew symmetric decomposition (Section 5.2).
(3) Symmetric Toeplitz decomposition and persymmetric Hankel decomposition (Section 5.3).
(4) Generic decomposition (Section 5.4).
(5) Companion decomposition (Section 6.1).
(6) Generalized Vandermonde decomposition (Section 6.2).
For each type of matrix in the list above, we first prove the existence of the matrix decomposition for a generic matrix, in the sense that the set of all matrices which cannot be decomposed as a product of matrices of the given type is contained in a proper algebraic subvariety of C^{n×n}. Then we use a result from topological group theory to prove the existence of the matrix decomposition for every matrix. The price we pay is an increase in the number of factors. Our method can only show the existence of the matrix decomposition, and no algorithm can be obtained in general. However, we do find an algorithm for the companion decomposition in Section 6.1.

2. Algebraic geometry
In this section, we introduce the notions from algebraic geometry needed in this paper. We work over the complex numbers, but all results hold over algebraically closed fields of characteristic zero. Every notion we define in this section can be generalized to a more abstract version, but we concentrate on a simplified version. Main references for this section are [7, 9, 12, 13, 19].
Let C[x_1, ..., x_n] be the polynomial ring in n variables over C. We say that a subset X ⊆ C^n is an algebraic subvariety if X is the zero set of finitely many polynomials F_1, ..., F_r ∈ C[x_1, ..., x_n], and we say that X is defined by F_1, ..., F_r. For example, any linear subspace of C^n is an algebraic subvariety, because it is defined by polynomials of degree one. Less trivial examples are algebraic groups such as GL_n(C), the group of all n × n invertible matrices, and O(n), the group of all n × n orthogonal matrices. We remark here that GL_n(C) is an algebraic subvariety of C^{n²+1} defined by
    F(x_{ij}, t) = t det(x_{ij}) − 1 ∈ C[x_{ij}, t],
where t and x_{ij}, 1 ≤ i, j ≤ n, are variables and det(x_{ij}) is the determinant of the n × n matrix (x_{ij}). Also, O(n) is an algebraic subvariety of C^{n²} because O(n) is defined by
    (x_{ij})^T (x_{ij}) − I_n ∈ C[x_{ij}],
where x_{ij}, 1 ≤ i, j ≤ n, are variables.
Let X be an algebraic subvariety of C^n. We say that X is irreducible if X cannot be written as the union of two algebraic subvarieties properly contained in X. In other words, whenever X = X_1 ∪ X_2 for algebraic subvarieties X_1 and X_2, we have X = X_1 or X = X_2. It is clear that linear subspaces, GL_n(C) and O(n) are all irreducible. The algebraic subvariety X defined by the equation x_1 x_2 = 0 is not irreducible, since X is the union of the subvarieties X_i defined by the equations x_i = 0, i = 1, 2.
Let X ⊆ C^n be an algebraic subvariety. Let I(X) be the ideal consisting of all polynomials f ∈ C[x_1, ..., x_n] such that f(x) = 0 for every x ∈ X. It is well known [9] that the ideal I(X) must be finitely generated. Assume that I(X) is generated by polynomials F_1, ..., F_r. Let J_p be the Jacobian matrix
    J_p = [∂F_i/∂x_j (p)],
where 1 ≤ i ≤ r and 1 ≤ j ≤ n. We define the dimension of X to be
    dim X := min_{p ∈ X} dim ker(J_p).
The notion of dimension coincides with intuition. For example, the dimension of a linear subspace of C^n is the same as its dimension as a linear space. The dimension of GL_n(C) is n² and the dimension of O(n) is \binom{n}{2} = n(n − 1)/2. If p ∈ X is a point such that
    dim X = dim ker(J_p),
we say that p is a smooth point of X, and we define the tangent space T_p X of X at p to be
    T_p X := ker(J_p).
For example, the tangent space T_p X of a linear subspace X ⊆ C^n at a point p ∈ X can be identified with X itself. The tangent space of O(n) at the identity e is simply the Lie algebra o(n), the Lie algebra consisting of all n × n skew symmetric matrices [14].
Let U ⊆ C^n be a subset. We define the Zariski closure of U to be the common zero set of the polynomials vanishing on U. Namely,
    cl(U) = {x ∈ C^n : f(x) = 0 for all f ∈ I(U)}.
For example, the Zariski closure of R^n in C^n is the whole space C^n. We remark that the Zariski closure can be much larger than the Euclidean closure: the Euclidean closure of R^n in C^n is just R^n itself.
Let X ⊆ C^n and Y ⊆ C^m be two algebraic subvarieties. We say a map f : X → Y is a polynomial map if f can be represented as
    f(x_1, ..., x_n) = (f_1(x_1, ..., x_n), ..., f_m(x_1, ..., x_n)),
where f_1, ..., f_m are polynomials in n variables. For example, let C^{n×n} denote the space of all n × n matrices; clearly C^{n×n} ≅ C^{n²}. Then the map
    ρ_r : C^{n×n} × ⋯ × C^{n×n} (r copies) → C^{n×n}
defined by ρ_r(A_1, ..., A_r) = A_1 ⋯ A_r is a polynomial map. If W_1, ..., W_r are algebraic subvarieties of C^{n×n}, then the restriction of ρ_r to W_1 × ⋯ × W_r is, by definition, also a polynomial map.
We denote the set of all k-dimensional linear subspaces of C^n by Gr(k, n) and call it the Grassmannian of k-planes in C^n. In particular, when k = 1 we have Gr(1, n) = P^{n−1}, the projective space. We say that a subset X of P^{n−1} is a projective subvariety if X is defined by homogeneous polynomials, i.e., there are homogeneous polynomials F_1, ..., F_r ∈ C[x_1, ..., x_n] such that
    X = {[p] ∈ P^{n−1} : F_1(p) = ⋯ = F_r(p) = 0}.
Here [p] is the element of P^{n−1} corresponding to the line joining the origin and p ∈ C^n. It is a fundamental fact [9, 11, 12, 13] that Gr(k, n) is a projective subvariety of P^{N−1}, where N = \binom{n}{k}. Furthermore, Gr(k, n) is smooth, i.e., every point of Gr(k, n) is a smooth point; hence we can define the tangent bundle T Gr(k, n), whose fiber over a point [W] ∈ Gr(k, n) is simply the tangent space T_{[W]} Gr(k, n). We define
    E := {([W], w) : [W] ∈ Gr(k, n), w ∈ W}
and a projection map π : E → Gr(k, n) by π([W], w) = [W]. It turns out that
(E, π) is a vector bundle on Gr(k, n) [9, 11, 12, 13]. We say that E is the tautological vector bundle on Gr(k, n). By definition, the fiber π^{−1}([W]) over a point [W] ∈ Gr(k, n) is simply the vector space W.
Let X be an algebraic subvariety of C^n and let P be a property defined for points of X. We say that P is a generic property if there exists a proper algebraic subvariety X_P ⊊ X such that if x ∈ X does not satisfy the property P, then x ∈ X_P. If the property P is understood, we say that x ∈ X is a generic point if x satisfies the property P. For example, for a fixed hyperplane H in C^n, a generic point x ∈ C^n is not contained in H. A generic n × n matrix is invertible, since the matrices that are not invertible are defined by the vanishing of the determinant. We can also say that for a fixed m-plane L, a generic k-plane intersects L in an (m + k − n)-dimensional subspace.
If x ∈ C^n is a generic point for a property P, then by definition the set of points of C^n that do not satisfy P is contained in an algebraic subvariety X_P ⊊ C^n. If we equip C^n with the Lebesgue measure, then X_P always has measure zero. In other words, the set of generic points has full measure.
In particular, when we say that a generic n × n matrix can be decomposed into a product of finitely many structured matrices, for example Toeplitz matrices, we mean that the set of all matrices which admit such a decomposition contains an open dense subset (with the Zariski topology) of the space of all n × n matrices. For those who are not familiar with the notion of generic objects, one can replace "generic" by "random" to obtain the intuition, though this is not rigorous.
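As an illustration of the slogan "generic = random," the following small sketch (our own; sample size and distribution are arbitrary choices) checks that randomly drawn matrices avoid the determinant hypersurface:

```python
import numpy as np

# The non-invertible matrices form the zero set of the determinant, a
# proper subvariety of measure zero, so a randomly drawn matrix is
# invertible with probability one.
rng = np.random.default_rng(1)
samples = (rng.standard_normal((4, 4)) for _ in range(10_000))
print(all(abs(np.linalg.det(A)) > 1e-12 for A in samples))  # True
```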
3. General method

Let X be an algebraic subvariety of C^{n×n} which is closed under matrix multiplication, i.e., for any A, B ∈ X we have AB ∈ X. Let r be a positive integer and let W_1, ..., W_r be subvarieties of X. We consider the map
    ρ_r : W^r := W_1 × ⋯ × W_r → X
defined by matrix multiplication, ρ_r(A_1, ..., A_r) = A_1 ⋯ A_r. We can rephrase the matrix decomposition problem in terms of ρ_r.

Question 3.1. (1) Does there exist r such that ρ_r is a dominant map, i.e., the Zariski closure of ρ_r(W^r) equals X?
(2) Does there exist r such that ρ_r is a surjective map, i.e., ρ_r(W^r) = X?

In general, the first question in Question 3.1 is weaker than the second. However, we will see later that with some assumptions on X and the W_i, we can conclude that if ρ_r is dominant, then ρ_{r′} is surjective for some r′ ≥ r.
3.1. Lower bound on r. First, we can do a naive dimension count to get a lower bound on r. To do this we need

Proposition 3.2. [9] If f : Y → Z is a dominant polynomial map between two irreducible algebraic varieties, then dim Y = dim Z + dim f^{−1}(z) for a generic point z ∈ Z. In particular, dim Y ≥ dim Z.

Applying Proposition 3.2 to our situation, we obtain
Corollary 3.3. If ρ_r : W^r → X is dominant, then
    Σ_{i=1}^r dim W_i ≥ dim X.    (1)

Corollary 3.4. If dim W_1 = ⋯ = dim W_r = m and ρ_r is dominant, then r ≥ ⌈dim X / m⌉.

We say that an algebraic subvariety W ⊆ C^{n×n} is a cone if x ∈ W implies λx ∈ W for any λ ∈ C. Assume that W_1, ..., W_r ⊆ X are cones. For any A ∈ X with a decomposition A = A_1 ⋯ A_r, where A_1 ∈ W_1, ..., A_r ∈ W_r, the fiber ρ_r^{−1}(A) contains the subvariety
    Z_A = {(λ_1 A_1, ..., λ_r A_r) ∈ W^r : λ_i ∈ C, Π_{i=1}^r λ_i = 1}.
It is clear from the definition of Z_A that dim Z_A = r − 1. Applying Proposition 3.2 to this situation, we obtain
Corollary 3.5. If W_1, ..., W_r are cones and ρ_r : W^r → X is dominant, then
    Σ_{i=1}^r dim W_i − (r − 1) ≥ dim X.

Corollary 3.6. If W_1, ..., W_r are cones of the same dimension m and ρ_r is dominant, then r ≥ ⌈(dim X − 1)/(m − 1)⌉.

3.2. Criterion for dominant maps.

Proposition 3.7.
Let r be a positive integer and let W_1, ..., W_r be subvarieties of X. If there is a point a = (A_1, ..., A_r) ∈ W^r such that the differential
    dρ_r|_a : T_{A_1} W_1 × ⋯ × T_{A_r} W_r → T_{ρ_r(a)} X
has full rank dim X, then the map ρ_r is dominant.

Proof. Suppose that the Zariski closure of ρ_r(W^r) is not equal to X. Then it is a proper subvariety of X and hence has dimension strictly smaller than dim X, so the rank of dρ_r|_{a′} is strictly smaller than dim X for every a′ ∈ W^r. This contradicts the assumption that there exists some a ∈ W^r at which dρ_r|_a has the maximal rank dim X. □

For readers unfamiliar with the calculation of the differential dρ_r|_a, we record the following formula:
    dρ_r|_a(X_1, ..., X_r) = Σ_{i=1}^r A_1 ⋯ A_{i−1} X_i A_{i+1} ⋯ A_r,    (2)
where X_i ∈ T_{A_i} W_i. If in particular W_i is a linear subspace of C^{n×n}, then we may identify the tangent space T_{A_i} W_i with W_i itself. We will apply formula (2) repeatedly in the rest of this paper.
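As a quick numerical sanity check of formula (2), one can compare it with a finite-difference quotient; the following sketch (with our own random test data) does this for r = 2:

```python
import numpy as np

# Finite-difference check of formula (2) for r = 2: the derivative of
# (A1, A2) -> A1 A2 at (A1, A2) in the direction (X1, X2) is X1 A2 + A1 X2.
rng = np.random.default_rng(2)
A1, A2, X1, X2 = (rng.standard_normal((4, 4)) for _ in range(4))
eps = 1e-6
numeric = ((A1 + eps * X1) @ (A2 + eps * X2) - A1 @ A2) / eps
exact = X1 @ A2 + A1 @ X2
print(np.max(np.abs(numeric - exact)))  # O(eps): only the eps*X1@X2 term remains
```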
3.3. Criterion for surjective maps.

Proposition 3.8. [2]
Let G be a topological group and let U be an open dense subset of G. Assume that U contains the identity element of the group G. Then G = U · U, i.e., every element g ∈ G is of the form hh′ for some h, h′ ∈ U.

Theorem 3.9 (open mapping theorem). [7] Let
X, Y be two irreducible varieties and let f : X → Y be a polynomial map. If f is dominant, then there is some U ⊆ f(X) which is open and dense in Y.

Proposition 3.10.
Let W_1 = ⋯ = W_r = W be a linear subspace and X = C^{n×n}. Suppose that W contains all diagonal matrices and that ρ_r is dominant. Then the map
    ρ_{r′} : W^{r′} := W × ⋯ × W (r′ copies) → C^{n×n}
defined by matrix multiplication is surjective for r′ = 4r + 1.

Proof. Since ρ_r is dominant, by Theorem 3.9 its image ρ_r(W^r), where W^r := W × ⋯ × W (r copies), contains an open dense subset of C^{n×n}. This implies that ρ_r(W^r) contains an open dense subset of the group GL_n(C), because GL_n(C) is an open dense subset of C^{n×n}. By Proposition 3.8 we see that
    GL_n(C) ⊆ ρ_r(W^r) · ρ_r(W^r).
Lastly, if A ∈ C^{n×n}, then there exist P, Q ∈ GL_n(C) and a diagonal matrix D ∈ C^{n×n} such that A = PDQ.
Hence we see that
    C^{n×n} ⊆ ρ_r(W^r) · ρ_r(W^r) · W · ρ_r(W^r) · ρ_r(W^r) = ρ_{r′}(W^{r′}),
with r′ = 4r + 1. □

3.4. Our strategy.
Let X be an algebraic subvariety of C^{n×n} which is closed under matrix multiplication and let W_1, ..., W_r be r algebraic subvarieties of X. We define
    ρ_r : W^r := W_1 × ⋯ × W_r → X.
In general, we may answer Question 3.1 by the following strategy.
(1) We first calculate the lower bound r_0 on r for ρ_r to be dominant, according to Corollaries 3.3, 3.4, 3.5 and 3.6. If r < r_0 then ρ_r cannot be dominant.
(2) If r ≥ r_0, we calculate the differential dρ_r|_a of ρ_r at a point a ∈ W^r. If dρ_r|_a has the maximal rank dim X, then ρ_r is dominant by Proposition 3.7.
(3) If W_1 = ⋯ = W_r = W is a linear subspace of X = C^{n×n} containing all diagonal matrices and ρ_r is dominant, then by Proposition 3.10 ρ_{r′} is surjective, where r′ = 4r + 1.
The main step in our strategy is to find a point a ∈ W^r such that the differential of ρ_r at a has rank dim X. The rest of this paper concentrates on applying the above strategy to answer Question 3.1 for various choices of the W_i and X.
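In practice, step (2) can be carried out numerically: pick a random point of W^r and compute the rank of the matrix whose columns are the vectorized products from formula (2). The following sketch (the function name and the random-point heuristic are ours, not the paper's) does this for factor spaces given by explicit bases; the demo checks the 2-diagonal ("bidiagonal") space of Section 5.1 with r = 2n − 2 factors.

```python
import numpy as np

def jacobian_rank(ws, n, rng):
    """Rank of d(rho_r) at a random point of W_1 x ... x W_r, where ws[i]
    is a basis (list of n x n arrays) of the linear space W_i."""
    As = [sum(rng.standard_normal() * B for B in W) for W in ws]
    cols = []
    for i in range(len(As)):
        pre, post = np.eye(n), np.eye(n)
        for A in As[:i]:
            pre = pre @ A
        for A in As[i + 1:]:
            post = post @ A
        # formula (2): the i-th block of columns is A_1...A_{i-1} B A_{i+1}...A_r
        cols += [(pre @ B @ post).ravel() for B in ws[i]]
    return np.linalg.matrix_rank(np.column_stack(cols))

# Demo: W = the space D_2 of "bidiagonal" (2-diagonal) matrices of Section 5.1.
n, rng = 4, np.random.default_rng(3)
basis = []
for d in (-1, 0, 1):
    for i in range(n - abs(d)):
        B = np.zeros((n, n))
        B[max(0, -d) + i, max(0, d) + i] = 1.0
        basis.append(B)
print(jacobian_rank([basis] * (2 * n - 2), n, rng), "== n^2 =", n * n)
```

A full-rank answer at a single random point certifies dominance by Proposition 3.7, up to floating-point error.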
4. Toy examples: LU and QR decompositions

In this section we discuss matrix decompositions with two factors, which is the simplest case. We can recover the existence of the LU and QR decompositions for a generic matrix using our method.
We know that a generic matrix has an LU decomposition and that every matrix has a QR decomposition. Although the existence of the LU and QR decompositions is quite elementary and clear from the linear algebra point of view, it is interesting to recover it from another point of view. In fact, we will prove a more general result.
Theorem 4.1.
Let W_1 and W_2 be two algebraic subvarieties of C^{n×n} such that
(1) both W_1 and W_2 contain the identity matrix I_n as a smooth point, and
(2) T_{I_n} W_1 + T_{I_n} W_2 = C^{n×n}.
Then a generic n × n matrix is a product of some A_1 ∈ W_1 and A_2 ∈ W_2. Moreover, every n × n matrix M is a product of some A_1, B_1, C_1, D_1 ∈ W_1, A_2, B_2, C_2, D_2 ∈ W_2 and a diagonal matrix E, i.e.,
    M = A_1 A_2 B_1 B_2 E C_1 C_2 D_1 D_2.
Here, by definition, we can identify the tangent space T_{A_i} W_i at a smooth point A_i with a linear subspace of C^{n×n} for i = 1, 2.

Proof.
Consider the differential dρ_2|_{(I_n, I_n)} of ρ_2 at (I_n, I_n). By formula (2) we have
    dρ_2|_{(I_n, I_n)}(X_1, X_2) = X_1 + X_2
for any X_i ∈ T_{I_n} W_i, i = 1,
2. By assumption, dρ_2|_{(I_n, I_n)} is surjective, hence ρ_2 is dominant by Proposition 3.7. The "moreover" part follows from Proposition 3.10. □

Here we remind readers that by generic we mean there are polynomials F_1, ..., F_r ∈ C[x_{ij}] such that whenever A = (a_{ij}) is a matrix that cannot be expressed as a product of some A_1 ∈ W_1 and A_2 ∈ W_2, we must have
    F_1(A) = ⋯ = F_r(A) = 0.
We warn readers that the fact that a generic matrix has a decomposition A = A_1 A_2 for A_i ∈ W_i, i = 1,
2, does not imply that every matrix has such a decomposition. Intuitively speaking, a generic matrix having a decomposition means that "most" matrices admit such a decomposition. For example, a generic matrix has an LU decomposition, but it is not true that every matrix has an LU decomposition. We will see this phenomenon again in Section 6.1, where we prove that a generic n × n matrix is a product of n companion matrices, but there exist n × n matrices that do not admit such a decomposition.

4.1. Triangular decomposition.
We apply Theorem 4.1 to the LU decomposition and its variants.
Corollary 4.2.
Let A be a generic n × n matrix. Then:
(1) There exist a lower triangular matrix L and an upper triangular matrix U such that A = LU.
(2) There exist a lower triangular matrix L and an upper triangular matrix U such that A = UL.
(3) There exist a top triangular matrix T and a bottom triangular matrix B such that A = TB.
(4) There exist a top triangular matrix T and a bottom triangular matrix B such that A = BT.
Remark 4.3. On the one hand, Corollary 4.2 does not specify which matrices are generic. It only guarantees that if we equip C^{n×n} with the Lebesgue measure, the probability that a randomly picked n × n matrix can be written as the product of a lower triangular matrix and an upper triangular matrix is one. On the other hand, it is known [20] that a nonsingular matrix admits an LU decomposition if and only if all its leading principal minors are nonzero. This implies that a nonsingular matrix whose leading principal minors are all nonzero is a generic matrix in this case. However, there are more generic matrices; for example, matrices of rank k whose first k leading principal minors are nonzero are also generic [20]. It is also known [21] that there exist n × n matrices which do not admit LU decompositions. Hence the generic matrices for the LU decomposition form a proper subset of the space of n × n matrices.

Corollary 4.4.
Let A be an n × n matrix. Then:
(1) There exist lower triangular matrices L_1, L_2, L_3, L_4 and upper triangular matrices U_1, U_2, U_3, U_4 such that A = L_1 U_1 L_2 U_2 L_3 U_3 L_4 U_4.
(2) There exist lower triangular matrices L_1, L_2, L_3, L_4 and upper triangular matrices U_1, U_2, U_3, U_4 such that A = U_1 L_1 U_2 L_2 U_3 L_3 U_4 L_4.
(3) There exist top triangular matrices T_1, T_2, T_3, T_4 and bottom triangular matrices B_1, B_2, B_3, B_4 such that A = T_1 B_1 T_2 B_2 T_3 B_3 T_4 B_4.
(4) There exist top triangular matrices T_1, T_2, T_3, T_4 and bottom triangular matrices B_1, B_2, B_3, B_4 such that A = B_1 T_1 B_2 T_2 B_3 T_3 B_4 T_4.

4.2. QR decompositions.
Let O(n) be the group of complex orthogonal matrices and let U be the space of upper triangular matrices. Since the tangent space of O(n) at I_n is simply the linear space of all n × n skew symmetric matrices [14], and both O(n) and U contain I_n as a smooth point, we can apply Theorem 4.1 directly to O(n) and U.

Corollary 4.5.
Let A be a generic n × n matrix. Then:
(1) There exist an orthogonal matrix Q and an upper triangular matrix R such that A = RQ.
(2) There exist an orthogonal matrix Q and an upper triangular matrix R such that A = QR.
(3) There exist an orthogonal matrix Q and a lower triangular matrix S such that A = QS.
(4) There exist an orthogonal matrix Q and a lower triangular matrix S such that A = SQ.
Corollary 4.6.
Let A be an n × n matrix. Then:
(1) There exist orthogonal matrices Q_1, Q_2, Q_3, Q_4 and upper triangular matrices R_1, R_2, R_3, R_4 such that A = R_1 Q_1 R_2 Q_2 R_3 Q_3 R_4 Q_4.
(2) There exist orthogonal matrices Q_1, Q_2, Q_3, Q_4 and upper triangular matrices R_1, R_2, R_3, R_4 such that A = Q_1 R_1 Q_2 R_2 Q_3 R_3 Q_4 R_4.
(3) There exist orthogonal matrices Q_1, Q_2, Q_3, Q_4 and lower triangular matrices S_1, S_2, S_3, S_4 such that A = Q_1 S_1 Q_2 S_2 Q_3 S_3 Q_4 S_4.
(4) There exist orthogonal matrices Q_1, Q_2, Q_3, Q_4 and lower triangular matrices S_1, S_2, S_3, S_4 such that A = S_1 Q_1 S_2 Q_2 S_3 Q_3 S_4 Q_4.

Remark 4.7. Corollary 4.6 is redundant, because it is known that every n × n matrix admits a QR decomposition. Furthermore, this implies that the generic matrices for the QR decomposition in Corollary 4.5 are actually all matrices. Combining this fact with Remark 4.3, we see that the generic matrices are not necessarily the same for different matrix decompositions.
One might ask whether the same method applies to the SVD. Unfortunately, the SVD involves complex conjugation of a matrix, which makes the decomposition non-algebraic, so we are not allowed to use the same argument to recover the SVD, even for generic matrices.

5. Matrix decomposition for linear spaces
5.1. Bidiagonal decomposition.
Let A = (a_{ij}) be an n × n matrix. We say that A is a k-diagonal matrix if
    a_{ij} = 0 whenever |i − j| ≥ k.
In particular, 1-diagonal matrices are simply diagonal matrices, and 2-diagonal matrices are called bidiagonal matrices. For example, a 3 × 3 bidiagonal matrix is of the form
    [ a b 0 ]
    [ c d e ]
    [ 0 f g ].
An upper k-diagonal matrix A = (a_{ij}) is a k-diagonal matrix with the further restriction
    a_{ij} = 0 whenever i − j ≥ 1.
A 3 × 3 upper bidiagonal matrix is of the form
    [ a b 0 ]
    [ 0 c d ]
    [ 0 0 e ].
A matrix A is called lower k-diagonal if A^T is upper k-diagonal. We denote the set of all k-diagonal matrices by D_k, the set of all upper k-diagonal matrices by D_{k,≥} and the set of all lower k-diagonal matrices by D_{k,≤}.

Lemma 5.1.
Let 2 ≤ k ≤ n be an integer. A generic n × n upper (resp. lower) k-diagonal matrix is a product of k − 1 upper (resp. lower) bidiagonal matrices. In particular, a generic n × n upper (resp. lower) triangular matrix is a product of n − 1 upper (resp. lower) bidiagonal matrices.

Proof. We prove the lemma in the upper triangular case. For a positive integer k ≥ 2, recall that D_{k,≥} is the space of upper k-diagonal matrices. It is clear that the product of k − 1 upper bidiagonal matrices is an upper k-diagonal matrix. We want to show that the map
    ρ_{k−1} : D_{2,≥} × ⋯ × D_{2,≥} (k − 1 copies) → D_{k,≥}
defined by matrix multiplication is dominant. We proceed by induction on k. When k = 2, it is clear that ρ_1 is dominant. Assume that the map ρ_{k−1} is dominant, where k − 1 ≤ n − 1; we prove that ρ_k is also dominant. To this end we can factor the map ρ_k as
    ρ_k : D_{2,≥} × (D_{2,≥} × ⋯ × D_{2,≥} (k − 1 copies)) --(Id, ρ_{k−1})--> D_{2,≥} × D_{k,≥} --ρ--> D_{k+1,≥},
where Id is the identity map on D_{2,≥} and ρ is the map defined by multiplication of two matrices. By the induction hypothesis, (Id, ρ_{k−1}) is dominant, hence it suffices to show that ρ is dominant. To see this, we calculate the differential of ρ. By formula (2), the differential of ρ at (A, B) is given by
    dρ|_{(A,B)}(X, Y) = XB + AY,
for all (
X, Y) ∈ D_{2,≥} × D_{k,≥}. On the other hand, given any C ∈ D_{k+1,≥}, we can write C = C′ + C″, where C′ ∈ D_{k,≥} and C″ has nonzero entries only on the k-th superdiagonal. We take B ∈ D_{k,≥} to be the matrix with ones on the (k − 1)-th superdiagonal and zeros elsewhere, i.e., B = (δ_{j, i+k−1}), where δ is the Kronecker delta, and A = I_n. Then one can easily find (X, Y) ∈ D_{2,≥} × D_{k,≥} such that XB = C″ and Y = C′. This implies that dρ|_{(A,B)} is surjective, hence ρ is dominant by Proposition 3.7. □

Corollary 5.2.
Every invertible upper (resp. lower) triangular matrix is a product of 2(n − 1) upper (resp. lower) bidiagonal matrices.

Proof. Since a generic upper triangular matrix is a product of n − 1 upper bidiagonal matrices, the corollary follows from Proposition 3.8 applied to the group of invertible upper triangular matrices. □

Proposition 5.3.
A generic n × n matrix can be decomposed into a product of n − 1 tridiagonal matrices, or equivalently of 2(n − 1) bidiagonal matrices. An invertible n × n matrix is a product of 4(n − 1) bidiagonal matrices.

Proof. By Lemma 5.1, a generic upper (resp. lower) triangular matrix is a product of n − 1 upper (resp. lower) bidiagonal matrices, and a generic n × n matrix has an LU decomposition. Hence a generic matrix can be decomposed as a product of n − 1 lower and n − 1 upper bidiagonal matrices; grouping consecutive pairs of factors yields n − 1 tridiagonal matrices. The last statement follows from Proposition 3.8. □

Theorem 5.4.
Let r be the smallest number such that every n × n matrix is a product of r bidiagonal matrices. Then
    n − 1 ≤ r ≤ 8n − 7.

Proof. Notice that every matrix A can be written as A = PDQ, where P, Q are invertible and D is diagonal. By Proposition 5.3, P and Q are each products of 4(n − 1) bidiagonal matrices. Since diagonal matrices are also bidiagonal, we see that every n × n matrix is a product of 8(n − 1) + 1 = 8n − 7 bidiagonal matrices. This gives the upper bound on r. For the lower bound, we simply notice that a product of k − 1 bidiagonal matrices is k-diagonal, hence r must be at least n − 1. □
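The bandwidth count behind the lower bound is easy to verify numerically; the following sketch (our own) multiplies random bandwidth-1 matrices and checks the band of the product:

```python
import numpy as np

# A product of m matrices of bandwidth 1 (the "bidiagonal", i.e. 2-diagonal,
# matrices) has bandwidth at most m, i.e. it is (m+1)-diagonal, so at least
# n - 1 factors are needed before the product can be a full n x n matrix.
rng = np.random.default_rng(4)
n, m = 8, 3
P = np.eye(n)
for _ in range(m):
    P = P @ np.triu(np.tril(rng.standard_normal((n, n)), 1), -1)  # bandwidth 1
i, j = np.indices((n, n))
print(np.allclose(P[np.abs(i - j) > m], 0.0))  # True: P is (m+1)-diagonal
```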
2, by Corollary 3.6 the expected value of r is ⌈ n +13 ⌉ , while the lower boundof r is n −
1. This shows that Proposition 5.4 gives us an example that the expected value maynot be achieved. Roughly speaking, this is because entries on the diagonal of a tridiagonal matrixdo not contribute to expand the product. To be more precise, if X is a diagonal matrix and Y isa bidiagonal matrix then XY is still a bidiagonal matrix.5.2. Skew symmetric decomposition.
We consider the skew symmetric matrix decomposition problem in this section. An n × n skew symmetric matrix A is defined by the condition
    A = −A^T.
We denote the space of all n × n skew symmetric matrices by Λ_n. It is clear that Λ_n is a linear subspace of C^{n×n} and
    dim(Λ_n) = \binom{n}{2} = n(n − 1)/2.
On the one hand, since ⌈(n² − 1)/(\binom{n}{2} − 1)⌉ = 3 for n ≥ 3, we see that if the map
    ρ_r : Λ_n × ⋯ × Λ_n (r copies) → C^{n×n}
is dominant, then r is at least three. On the other hand, from the definition one sees that for any A ∈ Λ_n we have
    det(A) = det(A^T) = (−1)^n det(A).
In particular, if n is odd we obtain det(A) = 0. This implies that when n is odd, the map ρ_r can never be dominant onto C^{n×n}, regardless of how large r is. However, when n is odd, we can expect that
    ρ_r : Λ_n × ⋯ × Λ_n (r copies) → DET_n
is dominant for r ≥ 3, where DET_n is the hypersurface of all n × n matrices whose determinants are zero.

Proposition 5.5.
We have the following two cases:
(1) n even: a generic n × n matrix is a product of r skew symmetric matrices for (n, r) with
    (a) n ≥ 8 and r ≥ 3, or
    (b) n = 6 and r ≥ 5, or
    (c) n = 4 and r ≥ 5.
(2) n odd: a generic n × n matrix with zero determinant is a product of r skew symmetric matrices for (n, r) with
    (a) n ≥ 5 and r ≥ 3, or
    (b) n = 3 and r ≥ 3.
Again, we consider the map ρ_r : Λ_n × ⋯ × Λ_n (r copies) → C^{n×n} when n is even, and ρ_r : Λ_n × ⋯ × Λ_n (r copies) → DET_n when n is odd.

Example 5.6.
Using Macaulay2 [22] we can calculate the dimension d of the closure of the image of ρ_r for small n and r. We list some results:
(1) n = 2: d = 1 for any r.
(2) (n, r) = (3, 2): d = 7.
(3) (n, r) = (3, 3): d = 8, and the closure of im(ρ_r) is DET_3.
(4) (n, r) = (4, 3): d = 13.
(5) (n, r) = (4, 4): d = 15, and the closure of im(ρ_r) is a hypersurface in C^{4×4}.
(6) (n, r) = (4, 5): d = 16, and the closure of im(ρ_r) is C^{4×4}.
(7) (n, r) = (5, 3), (7, 3), (9, 3) or (11, 3): d = n² − 1, and the closure of im(ρ_r) is DET_n.
(8) (n, r) = (6, 4): d = 35, and the closure of im(ρ_r) is a hypersurface in C^{6×6}.
(9) (n, r) = (6, 5): d = 36, and the closure of im(ρ_r) is C^{6×6}.
(10) (n, r) = (8, 3), (10, 3), (12, 3) or (14, 3): d = n², and the closure of im(ρ_r) is C^{n×n}.
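These dimensions can be cross-checked numerically: by Propositions 3.2 and 3.7, the dimension of the closure of the image equals the rank of the differential at a generic point. The following sketch (our own; the function name is ours) reproduces two of the entries for n = 4:

```python
import numpy as np

# Numerical cross-check of Example 5.6: d = rank of d(rho_r) at a random
# r-tuple of skew symmetric matrices.
def skew_image_dim(n, r, rng):
    basis = []
    for i in range(n):
        for j in range(i + 1, n):
            B = np.zeros((n, n))
            B[i, j], B[j, i] = 1.0, -1.0
            basis.append(B)
    As = [sum(rng.standard_normal() * B for B in basis) for _ in range(r)]
    cols = []
    for i in range(r):
        pre, post = np.eye(n), np.eye(n)
        for A in As[:i]:
            pre = pre @ A
        for A in As[i + 1:]:
            post = post @ A
        cols += [(pre @ B @ post).ravel() for B in basis]
    return np.linalg.matrix_rank(np.column_stack(cols))

rng = np.random.default_rng(5)
print(skew_image_dim(4, 3, rng), skew_image_dim(4, 5, rng))  # 13, 16
```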
Proof of Proposition 5.5. It remains to show that ρ_3 is dominant for n ≥ 16 even (resp. n ≥ 13 odd); the remaining cases are covered by Example 5.6. We proceed by induction on n.
Case 1. We assume n ≥ 16 is even. We consider three block diagonal matrices
    a = [ A 0 ; 0 A′ ],  b = [ B 0 ; 0 B′ ],  c = [ C 0 ; 0 C′ ],
where A, B, C are (n − 8) × (n − 8) skew symmetric matrices and A′, B′, C′ are 8 × 8 skew symmetric matrices such that the differentials of
    ρ_3^{(n−8)} : Λ_{n−8} × Λ_{n−8} × Λ_{n−8} → C^{(n−8)×(n−8)}  and  ρ_3^{(8)} : Λ_8 × Λ_8 × Λ_8 → C^{8×8}
are surjective at (A, B, C) and (A′, B′, C′), respectively; such points exist by the induction hypothesis. We consider the differential of ρ_3 : Λ_n × Λ_n × Λ_n → C^{n×n} at (a, b, c). We parametrize the tangent spaces of Λ_n at a, b and c by
    x = [ X u ; −u^T X′ ],  y = [ Y v ; −v^T Y′ ],  z = [ Z w ; −w^T Z′ ].
Then by formula (2) we have
    dρ_3|_{(a,b,c)}(x, y, z) = ab·z + a·y·c + x·bc
      = [ ABZ + AYC + XBC ,  ABw + AvC′ + uB′C′ ; −A′B′w^T − A′v^T C − u^T BC ,  A′B′Z′ + A′Y′C′ + X′B′C′ ].
By the choice of A, B, C and A′, B′, C′ we know that ABZ + AYC + XBC can be any (n − 8) × (n − 8) matrix and that A′B′Z′ + A′Y′C′ + X′B′C′ can be any 8 × 8 matrix. Moreover, ABw + AvC′ + uB′C′ and −A′B′w^T − A′v^T C − u^T BC can be any (n − 8) × 8 and 8 × (n − 8) matrix, respectively. This shows that dρ_3|_{(a,b,c)} is surjective, hence ρ_3 is dominant.
Case 2. We assume n ≥ 13 is odd. Then we can choose
    a = [ A 0 ; 0 A′ ],  b = [ B 0 ; 0 B′ ],  c = [ C 0 ; 0 C′ ],
where A, B, C are (n − 5) × (n − 5) skew symmetric matrices and A′, B′, C′ are 5 × 5 skew symmetric matrices such that
    ρ_3^{(n−5)} : Λ_{n−5} × Λ_{n−5} × Λ_{n−5} → C^{(n−5)×(n−5)}  and  ρ_3^{(5)} : Λ_5 × Λ_5 × Λ_5 → DET_5
are dominant, with surjective differentials at (A, B, C) and (A′, B′, C′), respectively. A calculation similar to the previous case shows that ρ_3 is dominant. □

We remark that when n = 2, a skew symmetric matrix is of the form
    [ 0 a ; −a 0 ],  a ∈ C.
Therefore, if r is even, the image of ρ_r is the space of all 2 × 2 scalar matrices, and if r is odd, the image of ρ_r is the space of 2 × 2 skew symmetric matrices.
By Proposition 3.10 we can derive from Proposition 5.5 the following

Theorem 5.7.
For n ≥ 8 even, every n × n matrix is a product of 13 skew symmetric matrices. Every 6 × 6 matrix is a product of 21 skew symmetric matrices. Every 4 × 4 matrix is a product of 21 skew symmetric matrices.

Notice that we are not able to apply Proposition 3.10 when n is odd. This is because the image of ρ_r is then contained in DET_n, which does not contain the group of invertible matrices.
5.3. Symmetric Toeplitz matrix decomposition.
A symmetric Toeplitz matrix A = (a_{ij}) is defined by
    a_{ij} = a_{i+p, j+p},  a_{ij} = a_{ji},  1 ≤ i, j, i + p, j + p ≤ n.
We denote the space of all n × n symmetric Toeplitz matrices by ST_n. A centrosymmetric matrix B = (b_{ij}) is defined by
    b_{ij} = b_{n−i+1, n−j+1},  1 ≤ i, j ≤ n.
It is easy to verify that the product of two centrosymmetric matrices is again centrosymmetric, hence the space CS_n of all centrosymmetric matrices is an algebra. Moreover, we have ST_n ⊂ CS_n. We say that a matrix A is persymmetric Hankel if JA is symmetric Toeplitz, where J is the exchange matrix
    J = [ 0 ⋯ 0 1 ; 0 ⋯ 1 0 ; ⋰ ; 1 0 ⋯ 0 ].
We denote the space of all n × n persymmetric Hankel matrices by PH_n. It is clear that PH_n ⊂ CS_n.
We consider the symmetric Toeplitz (resp. persymmetric Hankel) matrix decomposition problem for a centrosymmetric matrix. It is clear that
    dim(ST_n) = dim(PH_n) = n,  dim(CS_n) = ⌈n²/2⌉.
Hence by Corollary 3.6 we see that if
    ρ_r : ST_n × ⋯ × ST_n (r copies) → CS_n  or  ρ_r : PH_n × ⋯ × PH_n (r copies) → CS_n
is dominant, then
    r ≥ ⌈(⌈n²/2⌉ − 1)/(n − 1)⌉ = ⌊n/2⌋ + 1 if n ≥ 3, and r ≥ 1 if n = 2.

Proposition 5.8.
Let n ≥ 3 be an integer. A generic n × n centrosymmetric matrix is a product of ⌊n/2⌋ + 1 symmetric Toeplitz (resp. persymmetric Hankel) matrices.

The proof of Proposition 5.8 is similar to the proof of the Toeplitz matrix decomposition theorem in [8], hence we only give a sketch of the proof here.
Sketch of proof of Proposition 5.8.
It is sufficient to prove the statement for symmetric Toeplitz matrices. Indeed, since
    JA = AJ and J² = I_n whenever A is symmetric Toeplitz,
suppose X ∈ CS_n has a decomposition X = A_1 ⋯ A_r, where A_1, ..., A_r ∈ ST_n and r = ⌊n/2⌋ + 1. Then we see that
    JX = (JA_1)(JA_2) ⋯ (JA_r) if r is odd, and X = (JA_1)(JA_2) ⋯ (JA_r) if r is even.
In either case, this implies that a generic centrosymmetric matrix is a product of r persymmetric Hankel matrices.
Let B_k := (δ_{i, j+k})_{i,j=1}^n, for k = −(n − 1), −(n − 2), ..., n − 1, be the standard basis of the space of n × n Toeplitz matrices. Then the matrices
    S_0 := B_0,  S_k := B_k + B_{−k},  k = 1, 2, ..., n − 1,
form a basis of ST_n. Precisely, we have
    B_0 = I_n, and for k ≥ 1, B_k is the matrix with ones on the k-th subdiagonal and zeros elsewhere (B_{−k} = B_k^T has ones on the k-th superdiagonal), so that
    S_0 = I_n,  S_1 = [ 0 1 0 ⋯ 0 ; 1 0 1 ⋯ 0 ; ⋱ ; 0 ⋯ 1 0 1 ; 0 ⋯ 0 1 0 ],  ...,  S_{n−1} = [ 0 ⋯ 0 1 ; 0 ⋯ 0 0 ; ⋮ ; 1 0 ⋯ 0 ],
i.e., S_k has ones on the k-th sub- and superdiagonals and zeros elsewhere.
To prove that ρ_r is dominant, it suffices to find a point
    a := (A_{n−r}, ..., A_{n−1}) ∈ ST_n × ⋯ × ST_n (r copies)
such that the differential of ρ_r at a has the maximal rank ⌈n²/2⌉. Instead of choosing a point a explicitly, we will show that such a point exists. To this end, we write
    A_{n−i} := S_0 + t_{n−i} S_{n−i},  i = 1, 2, ..., r,
where t_{n−1}, ..., t_{n−r} are indeterminates. For such A_{n−i}, the differential dρ_r|_a can be represented as an ⌈n²/2⌉ × rn matrix M whose entries are polynomials in t_{n−1}, ..., t_{n−r}. Now, to see that M has rank ⌈n²/2⌉, we need to find a nonzero ⌈n²/2⌉ × ⌈n²/2⌉ minor of M.

Claim 5.9.
By the same type of argument as in [8], we can show:
(1) Any ⌈n²/2⌉ × ⌈n²/2⌉ minor of M is a polynomial in the t's of degree at least ⌈n²/2⌉ − n.
(2) There exists an ⌈n²/2⌉ × ⌈n²/2⌉ minor of M containing a monomial of degree exactly ⌈n²/2⌉ − n whose coefficient is nonzero. Indeed, the desired monomial is
    t_{n−1}^{n−1} t_{n−2}^{n−1} ⋯ t_{n−r+1}^{n−1} if n is odd, and t_{n−1}^{n−1} ⋯ t_{n−r+2}^{n−1} t_{n−r+1}^{n/2−1} if n is even.
This shows that for a fixed n there exist values of t_{n−1}, ..., t_{n−r} such that the differential dρ_r|_a has the maximal rank ⌈n²/2⌉. □

We work out an example to illustrate how the proof of Proposition 5.8 works. We adopt the notation of [8]. For the map ρ_r : ST_n × ⋯ × ST_n (r copies) → CS_n, we write
    X_{n−i} := Σ_{k=0}^{n−1} x_{n−i,k} S_k,  i = 1, 2, ..., r,
for the matrix occurring in the i-th argument of ρ_r. Then by formula (2), the differential dρ_r|_a is simply the linear map defined by
    dρ_r|_a(X_{n−r}, ..., X_{n−1}) = Σ_{i=1}^r A_{n−r} ⋯ A_{n−i−1} X_{n−i} A_{n−i+1} ⋯ A_{n−1},
for X_{n−i} ∈ ST_n. We denote the entries of dρ_r|_a(X_{n−r}, ..., X_{n−1}) by L_{p,q}, 1 ≤ p, q ≤ n. Then it is clear that the matrix (L_{p,q}) is centrosymmetric, i.e.,
    L_{p,q} = L_{n−p+1, n−q+1},  1 ≤ p, q ≤ n,
so only ⌈n²/2⌉ of the entries L_{p,q} are independent.
Case 1. We consider the case n = 3. This gives
    r = ⌊n/2⌋ + 1 = 2,  ⌈n²/2⌉ = 5.
We will see that any 5 × 5 minor of the 5 × 6 matrix M is a polynomial in the t's of degree at least ⌈n²/2⌉ − n = 2. We can simply write
    L_{p,q} = x_{1,|q−p|} + x_{2,|q−p|} + Ω(t),  p, q = 1, 2, 3.
Here Ω(t) stands for the terms of L_{p,q} of degree at least one in the t's. With this notation, we express M as
                       x_{1,0} x_{1,1} x_{1,2} x_{2,0} x_{2,1} x_{2,2}
    L_{1,1} = L_{3,3}  [  1      ∗      ∗      1      ∗      ∗  ]
    L_{1,2} = L_{3,2}  [  ∗      1      ∗      ∗      1      ∗  ]
    L_{1,3} = L_{3,1}  [  ∗      ∗      1      ∗      ∗      1  ]
    L_{2,1} = L_{2,3}  [  ∗      1      ∗      ∗      1      ∗  ]
    L_{2,2}            [  1      ∗      ∗      1      ∗      ∗  ]
Here ∗ means that the entry is of the form Ω(t), and 1 means that the corresponding L_{p,q} contains x_{1,k} or x_{2,k} with coefficient of the form 1 + Ω(t). For example, since L_{1,1} = L_{3,3} is of the form x_{1,0} + x_{2,0} + Ω(t), we put 1 in positions (1,1) and (1,4) of M and ∗ elsewhere in the first row. It is not hard to see that any 5 × 5 minor of M has degree at least 2 in the t's.
Case 2. We consider the case n = 5. In this case we have
    r = ⌊n/2⌋ + 1 = 3,  ⌈n²/2⌉ = 13.
We consider the table indicating how the monomial t_4⁴ t_3⁴ is obtained: we pick t_4 from four of the entries L_{i,j}, we pick t_3 from four other entries, and we pick the constant 1 from the remaining five L_{i,j}'s. By the definition of the L_{i,j}, this is the unique way to obtain the monomial t_4⁴ t_3⁴. This verifies the second statement of Claim 5.9.
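The differential computed in this sketch of proof can also be evaluated numerically. The following sketch (our own) builds the basis S_0, ..., S_{n−1} and verifies the full rank ⌈n²/2⌉ at a random point for n = 5:

```python
import numpy as np

# S_0 = I_n and S_k has ones on the k-th sub- and superdiagonals. We check
# that d(rho_r) at a random point of ST_n^r has rank ceil(n^2/2) = dim CS_n,
# confirming dominance for r = floor(n/2) + 1.
n = 5
r = n // 2 + 1  # = 3
S = [np.eye(n)] + [np.diag(np.ones(n - k), k) + np.diag(np.ones(n - k), -k)
                   for k in range(1, n)]
rng = np.random.default_rng(6)
As = [sum(rng.standard_normal() * B for B in S) for _ in range(r)]
cols = []
for i in range(r):
    pre, post = np.eye(n), np.eye(n)
    for A in As[:i]:
        pre = pre @ A
    for A in As[i + 1:]:
        post = post @ A
    cols += [(pre @ B @ post).ravel() for B in S]
print(np.linalg.matrix_rank(np.column_stack(cols)))  # ceil(25/2) = 13
```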
5.4. Generic matrix decomposition. In this section we consider the decomposition problem for generic linear subspaces of C^{n×n}. Let r be a positive integer. Assume that for i = 1, 2, ..., r, W_i is a k_i-dimensional subspace of C^{n×n}. We define W^r := W_1 × ⋯ × W_r and let ρ_r : W^r → C^{n×n} be the map sending (A_1, ..., A_r) to the product A_1 ⋯ A_r. We want to determine r such that ρ_r is dominant. Consider the following diagram:

    TE_1 × ⋯ × TE_r  --d ρ̂_r-->  T Gr(k_1, n²) × ⋯ × T Gr(k_r, n²) × C^{n×n}
         |                                     |
    E_1 × ⋯ × E_r    -- ρ̂_r -->  Gr(k_1, n²) × ⋯ × Gr(k_r, n²) × C^{n×n}
         |                                     |
    Gr(k_1, n²) × ⋯ × Gr(k_r, n²)  --id-->  Gr(k_1, n²) × ⋯ × Gr(k_r, n²)

Here E_i is the tautological vector bundle over Gr(k_i, n²), TE_i is the tangent bundle of E_i, ρ̂_r is the bundle map induced by ρ_r : W^r → C^{n×n}, and d ρ̂_r is the differential of ρ̂_r. To be more precise, for any [W_i] ∈ Gr(k_i, n²) the fiber of E_i over [W_i] is W_i, and the fiber of TE_i over ([W_i], A_i) is T_{[W_i]} Gr(k_i, n²) ⊕ W_i, where A_i ∈ W_i. If we restrict ρ̂_r to the fiber E_1|_{[W_1]} × ⋯ × E_r|_{[W_r]}, we obtain the map ρ_r : W^r → C^{n×n} defined before, and the restriction of d ρ̂_r to TE_1|_{([W_1], A_1)} × ⋯ × TE_r|_{([W_r], A_r)} becomes the differential of ρ_r at the point (A_1, ..., A_r).

Lemma 5.10.
Let r be a positive integer. For each i = 1, ..., r, let k_i be a fixed integer such that 1 ≤ k_i ≤ n². Assume that ρ_r : W^r → C^{n×n} is dominant for some k_i-dimensional subspaces W_i of C^{n×n}, i = 1, 2, ..., r. Then for generic k_i-dimensional subspaces W′_i of C^{n×n}, the map ρ_r : W′^r → C^{n×n} is dominant, where W′^r := W′_1 × ⋯ × W′_r.

Proof. Since ρ_r : W^r → C^{n×n} is dominant, the Jacobian matrix of ρ_r at a generic point (A_1, ..., A_r) ∈ W^r has maximal rank; that is, the Jacobian matrix of ρ̂_r at (([W_1], A_1), ..., ([W_r], A_r)) has maximal rank. By Proposition 3.7 we see that ρ̂_r is dominant, i.e., ρ_r is dominant for generic [W′_i] ∈ Gr(k_i, n²), i = 1, 2, ..., r. □
We shall make use of the following result.
Proposition 5.11. [8]
A generic n × n matrix can be decomposed as the product of ⌊n/2⌋ + 1 Toeplitz matrices.
Proposition 5.12.
Let C^{n×n} be the space of all n × n matrices. Then:
(i) For generic (2n − 1)-dimensional subspaces W_1, ..., W_r of C^{n×n}, a generic n × n matrix is a product of elements of the W_i, i = 1, 2, ..., r, if r ≥ ⌊n/2⌋ + 1.
(ii) For generic \binom{n+1}{2}-dimensional subspaces W_1, ..., W_r of C^{n×n}, a generic n × n matrix is a product of elements of the W_i, i = 1, 2, ..., r, if r ≥ 2.
(iii) For generic (3n − 2)-dimensional subspaces W_1, ..., W_r of C^{n×n}, a generic n × n matrix is a product of elements of the W_i, i = 1, 2, ..., r, if r ≥ 2(n − 1).
(iv) For generic \binom{n}{2}-dimensional subspaces W_1, ..., W_r of C^{n×n}, a generic n × n matrix is a product of elements of the W_i, i = 1, 2, ..., r, if
    (a) r ≥ 3 when n ≥ 8 is even,
    (b) r ≥ 5 when n = 6,
    (c) r ≥ 5 when n = 4.

Proof. The first statement follows from Proposition 5.11 and Lemma 5.10. The second statement follows from Corollary 4.2 and Lemma 5.10. The third statement follows from Proposition 5.3 and Lemma 5.10. The last statement follows from Proposition 5.5 and Lemma 5.10. □
The combination of Proposition 3.10 and Proposition 5.12 implies
Theorem 5.13.
Let C^{n×n} be the space of all n × n matrices. Then:
(i) For generic (2n − 1)-dimensional subspaces W_1, ..., W_r of C^{n×n}, every n × n matrix is a product of elements of the W_i, i = 1, ..., r, if r ≥ 4⌊n/2⌋ + 5.
(ii) For generic \binom{n+1}{2}-dimensional subspaces W_1, ..., W_r of C^{n×n}, every n × n matrix is a product of elements of the W_i, i = 1, ..., r, if r ≥ 9.
(iii) For generic (3n − 2)-dimensional subspaces W_1, ..., W_r of C^{n×n}, every n × n matrix is a product of elements of the W_i, i = 1, ..., r, if r ≥ 8(n − 1) + 1.
(iv) For generic \binom{n}{2}-dimensional subspaces W_1, ..., W_r of C^{n×n}, every n × n matrix is a product of elements of the W_i, i = 1, ..., r, if
    (a) r ≥ 13 when n ≥ 8 is even,
    (b) r ≥ 21 when n = 6,
    (c) r ≥ 21 when n = 4.
We close this section by remarking that Proposition 5.12 (resp. Theorem 5.13) only holds for generic subspaces W_1, ..., W_r of C^{n×n}; i.e., there are proper algebraic subvarieties Z_i ⊂ Gr(k_i, n²) such that if
    (W_1, ..., W_r) ∈ (Gr(k_1, n²) − Z_1) × ⋯ × (Gr(k_r, n²) − Z_r),
then ρ_r : W_1 × ⋯ × W_r → C^{n×n} is dominant (resp. surjective). However, we do not know any information about the subvarieties Z_i. The main contribution of Proposition 5.12 and Theorem 5.13 is that if the matrix decomposition (in both its dominant and surjective versions) holds for some W_1, ..., W_r, then it also holds for almost all linear subspaces W′_i of dimensions dim W_i, i = 1, 2, ..., r, respectively.

6. Matrix decomposition for nonlinear spaces
We consider matrix decompositions for non-linear algebraic subvarieties in this section. In Section 6.1 we discuss the decomposition into a product of companion matrices, and in Section 6.2 we discuss the decomposition into generalized Vandermonde matrices.
6.1. Companion decomposition. An n × n companion matrix is a matrix of the form
    [ 0 0 ⋯ 0 c_1 ]
    [ 1 0 ⋯ 0 c_2 ]
    [ 0 1 ⋯ 0 c_3 ]
    [     ⋱        ]
    [ 0 0 ⋯ 1 c_n ],
where c_1, ..., c_n are arbitrary complex numbers. We denote by C_n the set of all companion matrices. Then it is clear that C_n is an affine variety of dimension n.

Proposition 6.1.
A generic n × n matrix is a product of n companion matrices.

Proof. We need to prove that the map
    ρ_n : C_n × ⋯ × C_n (n copies) → C^{n×n}
is dominant. Let σ be the matrix corresponding to the cyclic permutation (12 ⋯ n), i.e.,
    σ = [ 0 0 ⋯ 0 1 ; 1 0 ⋯ 0 0 ; 0 1 ⋯ 0 0 ; ⋱ ; 0 0 ⋯ 1 0 ].
For an n × n matrix A, the matrix σA is obtained by shifting the i-th row of A to the (i + 1)-th row, i = 1, ..., n. Similarly, the matrix Aσ is obtained by shifting the i-th column of A to the (i − 1)-th column, i = 1, ..., n. Here we adopt the convention that the (n + 1)-th row is the first row and the 0-th column is the n-th column.
We calculate the rank of dρ_n at the point (σ, ..., σ); note that σ is itself a companion matrix. First notice that the tangent space T_σ C_n of C_n at σ is the linear space consisting of matrices of the form
    [ 0 ⋯ 0 c_1 ; 0 ⋯ 0 c_2 ; ⋮ ; 0 ⋯ 0 c_n ],
where c_1, ..., c_n are arbitrary complex numbers. Let Y_1, ..., Y_n be n elements of T_σ C_n. Then by formula (2) we have
    dρ_n|_{(σ, ..., σ)}(Y_1, ..., Y_n) = Σ_{i=1}^n σ^{i−1} Y_i σ^{n−i}.
Since σ corresponds to (12 ⋯ n), it is easy to see that σ^{i−1} Y_i σ^{n−i} is a matrix with zero entries everywhere except possibly in the i-th column. As the Y_i's are independent of each other, this shows that the rank of dρ_n|_{(σ, ..., σ)} is n². □

Proposition 3.10 together with Proposition 6.1 imply
Theorem 6.2.
Every n × n invertible matrix is a product of 2n companion matrices. Every n × n matrix is a product of 4n companion matrices and a diagonal matrix.

Since the map ρ_n : C_n × ⋯ × C_n (n copies) → C^{n×n} is dominant, by Proposition 3.2 we see that for a generic matrix A ∈ C^{n×n} the fiber ρ_n^{−1}(A) is of dimension zero, hence ρ_n^{−1}(A) is a finite set. In fact, we can prove more.

Theorem 6.3.
The decomposition of a generic n × n matrix into a product of n companion matrices is unique; i.e., for a generic n × n matrix A, if
    A = C_1 ⋯ C_n = C′_1 ⋯ C′_n,
where C_i, C′_i, i = 1, 2, ..., n, are companion matrices, then C_i = C′_i for all i = 1, ..., n.

Proof. We consider n companion matrices C_1, ..., C_n, write
    C_i := [ 0 ⋯ 0 c_{i,1} ; 1 ⋯ 0 c_{i,2} ; ⋱ ; 0 ⋯ 1 c_{i,n} ],
and calculate the product X_k = C_1 ⋯ C_k. We claim that the (p, q)-th entry X^k_{p,q} of X_k is a polynomial in the c_{ij} given by
    X^k_{p,q} = Σ_{j=1}^{q+k−n−1} X^k_{p,q−j} c_{q+k−n, n−j+1} + c_{q+k−n, n+1+p−q−k},  if q ≥ n − k + 1,
    X^k_{p,q} = 1,  if p − q = k and q < n − k + 1,
    X^k_{p,q} = 0,  otherwise.
If k = n, then
    X^n_{p,q} = Σ_{j=1}^{q−1} X^n_{p,q−j} c_{q, n−j+1} + c_{q, p−q+1}.
Now, given a generic n × n matrix A = (a_{ij}), we can find c_{ij}, 1 ≤ i, j ≤ n, such that
    a_{p,1} = c_{1,p},
    a_{p,2} = a_{p,1} c_{2,n} + c_{2,p−1},
    a_{p,3} = a_{p,2} c_{3,n} + a_{p,1} c_{3,n−1} + c_{3,p−2},
    ⋮
    a_{p,n} = a_{p,n−1} c_{n,n} + a_{p,n−2} c_{n,n−1} + ⋯ + a_{p,1} c_{n,2} + c_{n,p−n+1},    (3)
and these c_{ij} are uniquely determined by (3). The matrices C_i built from these c_{ij} are the desired companion matrices such that A = C_1 ⋯ C_n. □

The proof of Theorem 6.3 in fact gives another way to show that a generic n × n matrix is a product of n companion matrices. Moreover, it gives an algorithm to decompose a generic n × n matrix into a product of n companion matrices. In Algorithm 1, the input is an n × n matrix A = (a_{ij}) with entries a_{ij}, 1 ≤ i, j ≤ n, and the output is a sequence of n companion matrices C_1, ..., C_n such that A = C_1 ⋯ C_n, if such a decomposition exists and is unique.

Algorithm 1
Companion matrix decomposition
for q = 1, 2, ..., n do
    Solve the linear system
        a_{p,q} = Σ_{j=1}^{q−1} a_{p,q−j} c_{q,n−j+1} + c_{q,p−q+1},  p = 1, ..., n,
    for c_{q,1}, ..., c_{q,n}. Here we adopt the convention that a_{i,j} = 0 and c_{i,j} = 0 if either i ≤ 0 or j ≤ 0.
    If the solution does not exist or is not unique, the decomposition of A does not exist or is not unique; stop the algorithm. Otherwise, define the matrix
        C_q := [ 0 ⋯ 0 c_{q,1} ; 1 ⋯ 0 c_{q,2} ; ⋱ ; 0 ⋯ 1 c_{q,n} ],
    and continue.
end for

Lastly, we remark that it is not true that every n × n matrix is a product of n companion matrices. Indeed, if we consider a matrix A = (a_{ij}) with
    a_{1,1} = 0,  a_{1,2} = 1,
then from (3) we must have c_{1,1} = a_{1,1} = 0 and a_{1,1} c_{2,n} = a_{1,2} = 1, which is impossible. Hence the companion matrix decomposition is an example where the map ρ_r is dominant but not surjective, as we remarked in Section 4.
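Algorithm 1 is straightforward to implement; the following sketch (our own NumPy transcription, with our own function name and error handling) solves the n linear systems and returns the factors:

```python
import numpy as np

def companion_decomposition(A):
    """Factor A as a product C_1 ... C_n of companion matrices by solving
    the linear systems (3) for q = 1, ..., n, as in Algorithm 1."""
    n = A.shape[0]
    Cs = []
    for q in range(1, n + 1):
        # Equations over p = 1..n:
        #   a_{p,q} = sum_{j=1}^{q-1} a_{p,q-j} c_{q,n-j+1} + c_{q,p-q+1}
        M = np.zeros((n, n))
        for p in range(1, n + 1):
            for j in range(1, q):
                M[p - 1, n - j] += A[p - 1, q - j - 1]  # coeff of c_{q,n-j+1}
            if 1 <= p - q + 1 <= n:
                M[p - 1, p - q] += 1.0                  # coeff of c_{q,p-q+1}
        if np.linalg.matrix_rank(M) < n:
            raise ValueError("decomposition does not exist or is not unique")
        c = np.linalg.solve(M, A[:, q - 1])
        Cq = np.zeros((n, n))
        Cq[1:, :-1] = np.eye(n - 1)  # ones on the subdiagonal
        Cq[:, -1] = c                # last column (c_{q,1}, ..., c_{q,n})
        Cs.append(Cq)
    return Cs

A = np.random.default_rng(7).standard_normal((4, 4))
Cs = companion_decomposition(A)
print(np.allclose(np.linalg.multi_dot(Cs), A))  # True for a generic A
```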
6.2. Generalized Vandermonde decomposition. Now we consider generalized Vandermonde matrices, which we first define.
Definition 6.4.
Let s be an integer. We call a matrix of the form
    ((x_q)^{s+p−1})_{p,q=1}^n = [ x_1^s x_2^s ⋯ x_n^s ; x_1^{s+1} x_2^{s+1} ⋯ x_n^{s+1} ; ⋮ ; x_1^{s+n−1} x_2^{s+n−1} ⋯ x_n^{s+n−1} ]
a generalized Vandermonde matrix of type s. We denote the set of all generalized Vandermonde matrices of type s by Vand_s and the set of transposes of generalized Vandermonde matrices of type s by Vand^T_s.
By definition, a Vandermonde matrix is a generalized Vandermonde matrix of type 0. We consider the matrix decomposition problem for generalized Vandermonde matrices and their transposes.
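Concretely, a generalized Vandermonde matrix of type s is easy to generate; the following sketch (the helper name is our own) constructs one from a vector of nodes:

```python
import numpy as np

# Entry (p, q) of a generalized Vandermonde matrix of type s is x_q^(s+p-1).
def gen_vandermonde(x, s):
    x = np.asarray(x, dtype=complex)
    n = len(x)
    return x[np.newaxis, :] ** (s + np.arange(n)[:, np.newaxis])

x = np.array([1.0, 2.0, 3.0])
print(gen_vandermonde(x, 0))  # type 0: the classical Vandermonde matrix
print(gen_vandermonde(x, 2))  # type 2: rows x^2, x^3, x^4
```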
Proposition 6.5. Let s_1, s_2, ..., s_n be n integers such that s_i ≢ s_j (mod n) whenever i ≠ j, and Σ_{j=1}^n s_j ≠ 0. Then a generic n × n matrix is a product of elements of Vand_{s_i} and Vand^T_{s_i}, i = 1, 2, ..., n.

Proof. Again, we need to prove that the map
    ρ_n : Vand^T_{s_1} × Vand_{s_1} × ⋯ × Vand^T_{s_n} × Vand_{s_n} → C^{n×n}
is dominant. Let w be a primitive n-th root of unity. Let
    W_i = { B_i · A_i | B_i ∈ Vand^T_{s_i}, A_i = (w^{−q(p+s_i−1)}) ∈ Vand_{s_i} }.
It is clear that W_i ⊂ Vand^T_{s_i} · Vand_{s_i} and that Id ∈ W_i. Then it suffices to show that
    ρ_n : W_1 × ⋯ × W_n → C^{n×n}
is dominant. For this, we will show that the differential dρ_n at (Id, Id, ..., Id) is surjective. Consider the differential
    dρ_n|_{(Id, Id, ..., Id)}(X_1, ..., X_n) = Σ_{i=1}^n X_i,
where X_i = ((q + s_i − 1) w^{p(q+s_i−1)} x_{p,i}) · (w^{−q(p+s_i−1)}) ∈ T_{Id} W_i and the x_{p,i} are variables. A simple calculation then shows that the (p, q)-th entry of X_i is ξ_{p,q,i} x_{p,i}, where
    ξ_{p,q,i} = ( Σ_{k=1}^n (k + s_i − 1) w^{(p−q)k} ) w^{(p−q)(s_i−1)}.
We regard C^{n×n} as C^{n²} via the linear isomorphism h : C^{n×n} → C^{n²} defined by
    h((x_{i,j})) = (x_{1,1}, x_{1,2}, ..., x_{1,n}, x_{2,1}, ..., x_{2,n}, ..., x_{n,1}, ..., x_{n,n}).
Let M be the coefficient matrix of dρ_n|_{(Id, Id, ..., Id)} with respect to the variables x_{p,i}. Then M is an n² × n² matrix which, after a permutation of columns, becomes block diagonal with the n × n blocks
    M_p = [ ξ_{p,1,1} ξ_{p,1,2} ⋯ ξ_{p,1,n} ; ξ_{p,2,1} ξ_{p,2,2} ⋯ ξ_{p,2,n} ; ⋮ ; ξ_{p,n,1} ξ_{p,n,2} ⋯ ξ_{p,n,n} ],  p = 1, 2, ..., n,
on the diagonal. We now prove that M has full rank n²; it suffices to prove that each M_p has rank n. Note that we have the formulas
    Σ_{k=1}^n k w^k = −nw/(1 − w),  Σ_{k=1}^n w^k = 0,
for any n-th root of unity w ≠ 1. Hence
    ξ_{p,i,j} = (−n w^{p−i}/(1 − w^{p−i})) w^{(p−i)(s_j−1)},  if p ≠ i,
    ξ_{p,i,j} = (2s_j + n − 1) n/2,  if p = i.
It follows that the rank of M_p equals the rank of the matrix M̃_p = (ξ̃_{p,i,j}), where
    ξ̃_{p,i,j} = w^{−i s_j},  if i ≠ p,
    ξ̃_{p,i,j} = s_j w^{−p s_j},  if i = p.
Thus M̃_p is of the form
    [ w^{−s_1} ⋯ w^{−s_n} ; ⋮ ; w^{−(p−1)s_1} ⋯ w^{−(p−1)s_n} ; α_1 w^{−p s_1} ⋯ α_n w^{−p s_n} ; w^{−(p+1)s_1} ⋯ w^{−(p+1)s_n} ; ⋮ ; w^{−n s_1} ⋯ w^{−n s_n} ],
where α_1, ..., α_n are fixed integers (here α_j = s_j). Expanding the determinant along the p-th row, we have
    Det(M̃_p) = Σ_{j=1}^n (−1)^{p+j−1} α_j w^{−p s_j} V(w^{−s_j}) S_{j, n−1−p},
where V(w^{−s_j}) is the determinant of the Vandermonde matrix determined by w^{−s_1}, ..., w^{−s_{j−1}}, w^{−s_{j+1}}, ..., w^{−s_n}, and S_{j, n−1−p} is the (n − 1 − p)-th elementary symmetric function in w^{−s_1}, ..., w^{−s_{j−1}}, w^{−s_{j+1}}, ..., w^{−s_n}. Since the s_j are distinct modulo n, the numbers w^{−s_1}, ..., w^{−s_n} are exactly the n-th roots of unity, so
    (t − w^{−s_1}) ⋯ (t − w^{−s_n}) = t^n − 1,
and therefore
    (t − w^{−s_1}) ⋯ (t − w^{−s_{j−1}})(t − w^{−s_{j+1}}) ⋯ (t − w^{−s_n}) = (t^n − 1)/(t − w^{−s_j}) = Σ_{k=0}^{n−1} (w^{−s_j})^k t^{n−1−k},
whence S_{j, n−1−p} = (−1)^{n−1−p} (w^{−s_j})^{n−1−p} = (−1)^{n−1−p} (w^{−s_j})^{−1−p}. On the other hand,
    V(w^{−s_j}) = Π_{a>b, a,b≠j} (w^{−s_a} − w^{−s_b}) = (−1)^{j−1} Π_{a>b} (w^{−s_a} − w^{−s_b}) / Π_{c≠j} (w^{−s_c} − w^{−s_j}).
Let V be the determinant of the Vandermonde matrix determined by w^{−s_1}, ..., w^{−s_n}; it is obvious that V ≠ 0. Also, from (t^n − 1)/(t − w^{−s_j}) = Π_{c≠j} (t − w^{−s_c}) we have
    Π_{c≠j} (w^{−s_c} − w^{−s_j}) = lim_{t → w^{−s_j}} (t^n − 1)/(t − w^{−s_j}) = n w^{s_j}.
Therefore, simplifying,
    Det(M̃_p) = V Σ_{j=1}^n α_j w^{s_j} / (n w^{s_j}) = (V/n) Σ_{j=1}^n α_j.
In particular, setting α_j = s_j, we see that Det(M̃_p) ≠ 0 if and only if Σ_{j=1}^n s_j ≠ 0. This implies that the map ρ_n is dominant for all s_1, ..., s_n such that s_i ≢ s_j (mod n) for i ≠ j and Σ_{j=1}^n s_j ≠ 0. □
Corollary 6.6.
Let s_1, ..., s_n be as in Proposition 6.5 and let
    W_i = { B_i · A_i | B_i ∈ Vand^T_{s_i}, A_i = (w^{−q(p+s_i−1)}) }.
Then a generic n × n matrix is a product of elements A_i ∈ W_i, i = 1, 2, ..., n.
Theorem 6.7.
Let s_1, ..., s_n be as in Proposition 6.5. For every n × n invertible matrix M there are A_i, A′_i ∈ Vand_{s_i} and B_i, B′_i ∈ Vand^T_{s_i}, i = 1, 2, ..., n, such that
    M = B_1 A_1 ⋯ B_n A_n B′_1 A′_1 ⋯ B′_n A′_n.
For every n × n matrix M there are A_i, A′_i, C_i, C′_i ∈ Vand_{s_i}, B_i, B′_i, D_i, D′_i ∈ Vand^T_{s_i}, i = 1, 2, ..., n, and a diagonal matrix E such that
    M = B_1 A_1 ⋯ B_n A_n B′_1 A′_1 ⋯ B′_n A′_n E D_1 C_1 ⋯ D_n C_n D′_1 C′_1 ⋯ D′_n C′_n.

Theorem 6.8.
Let W_i, i = 1, ..., n, be as in Corollary 6.6. For every n × n invertible matrix M there are elements A_i, B_i ∈ W_i for each i = 1, 2, ..., n such that
    M = A_1 ⋯ A_n B_1 ⋯ B_n.
For every n × n matrix M there are A_i, B_i, A′_i, B′_i ∈ W_i for each i = 1, 2, ..., n and a diagonal matrix C such that
    M = A_1 ⋯ A_n B_1 ⋯ B_n C A′_1 ⋯ A′_n B′_1 ⋯ B′_n.

7. Conclusion
We have discussed the existence of bidiagonal, skew symmetric, symmetric Toeplitz, persymmetric Hankel, generic, companion, and generalized Vandermonde matrix decompositions, for both generic and arbitrary matrices.
It is natural to ask, for example, whether the number of bidiagonal matrices needed to decompose a generic (resp. arbitrary) matrix is the smallest possible. For most types of matrix decompositions discussed in this paper, the number we obtain is already the smallest for a generic matrix; it is still open whether the number we obtain is the smallest for an arbitrary matrix. We summarize our main results in the following table.

    decomposition                 | r (generic) | sharpness | r (arbitrary) | algorithm
    bidiagonal                    | 2n − 2      | unknown   | 8n − 7        | unknown
    skew symmetric (n ≥ 8 even)   | 3           | yes       | 13            | unknown
    symmetric Toeplitz            | ⌊n/2⌋ + 1   | yes       | unknown       | unknown
    companion                     | n           | yes       | 4n + 1        | yes
    generalized Vandermonde       | 2n          | unknown   | 8n + 1        | unknown

References

[1] "Algorithms for the ages,"
Science, 287 (2000), no. 5454, p. 799.
[2] A. Borel, Linear Algebraic Groups, 2nd Ed., Graduate Texts in Mathematics, 126, Springer-Verlag, New York, NY, 1991.
[3] A. J. Bosch, "The factorization of a square matrix into two symmetric matrices,"
Amer. Math. Monthly, 93 (1986), no. 6, pp. 462–464.
[4] R. Howe, "Very basic Lie theory," Amer. Math. Monthly, 90 (1983), no. 9, pp. 600–623.
[5] A. W. Knapp, Lie Groups Beyond an Introduction, 2nd Ed., Progress in Mathematics, 140, Birkhäuser, Boston, MA, 2002.
[6] G. W. Stewart, "The decompositional approach to matrix computation,"
Comput. Sci. Eng., 2 (2000), no. 1, pp. 50–59.
[7] J. L. Taylor, Several Complex Variables with Connections to Algebraic Geometry and Lie Groups, Graduate Studies in Mathematics, 46, American Mathematical Society, Providence, RI, 2002.
[8] K. Ye and L.-H. Lim, "Every matrix is a product of Toeplitz matrices," Found. Comput. Math., 16 (2016), no. 3, pp. 577–598.
[9] I. R. Shafarevich, Basic Algebraic Geometry 1, Springer, Heidelberg, 2013.
[10] R. R. Bitmead and B. D. O. Anderson, "Asymptotically fast solution of Toeplitz and related systems of linear equations," Linear Algebra Appl., 34 (1980), pp. 103–116.
[11] J. Weyman, Cohomology of Vector Bundles and Syzygies, Cambridge Tracts in Mathematics, 149, Cambridge University Press, Cambridge, 2003.
[12] J. Harris,
Algebraic Geometry. A First Course, Graduate Texts in Mathematics, 133, Springer-Verlag, New York, 1995.
[13] P. Griffiths and J. Harris,
Principles of Algebraic Geometry, Wiley Classics Library, John Wiley & Sons, Inc., New York, 1994.
[14] W. Fulton and J. Harris,
Representation Theory. A First Course, Graduate Texts in Mathematics, 129, Readings in Mathematics, Springer-Verlag, New York, 1991.
[15] G. S. Ammar and W. B. Gragg, "Superfast solution of real positive definite Toeplitz systems,"
SIAM J. Matrix Anal. Appl., 9 (1988), no. 1, pp. 61–76.
[16] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky, "Rank-sparsity incoherence for matrix decomposition," SIAM J. Optim., 21 (2011), no. 2, pp. 572–596.
[17] D. L. Donoho and X. Huo, "Uncertainty principles and ideal atomic decomposition," IEEE Trans. Inform. Theory, 47 (2001), no. 7, pp. 2845–2862.
[18] V. J. Barranca, D. Zhou, and D. Cai, "Low-rank network decomposition reveals structural characteristics of small-world networks," Phys. Rev. E, 92 (2015), iss. 6, 062822.
[19] R. Hartshorne,
Algebraic Geometry, Graduate Texts in Mathematics, 52, Springer-Verlag, New York–Heidelberg, 1977.
[20] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
[21] P. Okunev and C. R. Johnson, "Necessary and sufficient conditions for existence of the LU factorization of an arbitrary matrix," arXiv:math/0506382, 2005.
[22] D. R. Grayson and M. E. Stillman,
Macaulay2, a software system for research in algebraic geometry, available at http://www.math.uiuc.edu/Macaulay2/.

Department of Statistics, University of Chicago, Chicago, IL 60637-1514.
E-mail address: