Computing Nearby Non-trivial Smith Forms
Mark Giesbrecht, Joseph Haraldson, George Labahn
Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
Abstract
We consider the problem of computing the nearest matrix polynomial with a non-trivial Smith Normal Form. We show that computing the Smith form of a matrix polynomial is amenable to numeric computation as an optimization problem. Furthermore, we describe an effective optimization technique to find a nearby matrix polynomial with a non-trivial Smith form. The results are then generalized to include the computation of a matrix polynomial having a maximum specified number of ones in the Smith form (i.e., with a maximum specified McCoy rank).

We discuss the geometry and existence of solutions and how our results can be used for an error analysis. We develop an optimization-based approach and demonstrate an iterative numerical method for computing a nearby matrix polynomial with the desired spectral properties. We also describe an implementation of our algorithms and demonstrate their robustness with examples in Maple.
1. Introduction
For a given square matrix polynomial $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$, one can find unimodular matrices $\mathbf{U}, \mathbf{V} \in \mathbb{R}[t]^{n \times n}$ such that $\mathbf{U}\mathbf{A}\mathbf{V}$ is a diagonal matrix $\mathbf{S}$. Unimodular means that there is a polynomial inverse matrix, or equivalently, that the determinant is a non-zero constant from $\mathbb{R}$. The unimodular matrices $\mathbf{U}, \mathbf{V}$ encapsulate the polynomial row and column operations, respectively, needed for such a diagonalization. The best known diagonalization is the Smith Normal Form (SNF, or simply Smith form) of a matrix polynomial. Here
$$\mathbf{S} = \mathrm{diag}(s_1, \ldots, s_n) \in \mathbb{R}[t]^{n \times n},$$
where $s_1, \ldots, s_n \in \mathbb{R}[t]$ are monic and $s_i \mid s_{i+1}$ for $1 \le i < n$. The Smith form always exists and is unique, though the associated unimodular matrices $\mathbf{U}, \mathbf{V}$ are not unique (see, e.g., Kailath, 1980; Gohberg et al., 2009). The diagonal entries $s_1, \ldots, s_n$ are referred to as the invariant factors of $\mathbf{A}$.

Email addresses: [email protected] (Mark Giesbrecht), [email protected] (Joseph Haraldson), [email protected] (George Labahn)
URL: https://cs.uwaterloo.ca/~mwg (Mark Giesbrecht), https://cs.uwaterloo.ca/~jharalds (Joseph Haraldson), https://cs.uwaterloo.ca/~glabahn (George Labahn)

Matrix polynomials and their Smith forms are used in many areas of computational algebra, control systems theory, differential equations and mechanics. The Smith form is important as it effectively reveals the structure of the polynomial lattice of rows and columns, as well as the effects of localizing at individual eigenvalues. That is, it characterizes how the rank decreases as the variable $t$ is set to different values (especially eigenvalues, where the rank drops). The Smith form is closely related to the more general Smith–McMillan form for matrices of rational functions, a form that reveals the structure of the eigenvalue at infinity, or the infinite spectral structure. The eigenvalue at infinity is non-trivial if the leading coefficient matrix is rank deficient or, equivalently, if the determinant does not achieve the generic degree.

The algebra of matrix polynomials is typically described assuming that the coefficients are fixed and come from an exact arithmetic domain, usually the field of real or complex numbers. In this exact setting, computing the Smith form has been well studied, and very efficient procedures are available (see Kaltofen and Storjohann (2015) and the references therein). However, in some applications, particularly control theory and mechanics, the coefficients can come from measured data or contain some amount of uncertainty. Compounding this, for efficiency reasons such computations are usually performed using floating point to approximate computations in $\mathbb{R}$, introducing roundoff error.
As such, algorithms must accommodate numerical inaccuracies and are prone to numerical instability.

Numerical methods to compute the Smith form of a matrix polynomial typically rely on linearization and orthogonal transformations (Van Dooren and Dewilde, 1983; Beelen and Van Dooren, 1988; Demmel and Kågström, 1993a,b; Demmel and Edelman, 1995) to infer the Smith form of a nearby matrix polynomial via the Jordan blocks in the Kronecker canonical form (see Kailath (1980)). These linearization techniques are numerically backwards stable, and for many problems this is sufficient to ensure that the computed solutions are computationally useful when a problem is continuous.

Unfortunately, the eigenvalues of a matrix polynomial are not necessarily continuous functions of its coefficients, and backwards stability is not always sufficient to ensure computed solutions are useful in the presence of discontinuities. Previous methods are also unstructured in the sense that the computed non-trivial Smith form may not be the Smith form of a matrix polynomial with a prescribed coefficient structure, for example, maintaining the degree of entries or not introducing additional non-zero entries or coefficients. In extreme instances, the unstructured backwards error can be arbitrarily small, while the structured distance to an interesting Smith form is relatively large. Finally, existing numerical methods can also fail to compute meaningful results on some problems due to uncertainties. Examples of such problems include nearly rank deficient matrix polynomials, repeated eigenvalues or eigenvalues that are close together, and other ill-posed instances.

In this paper we consider the problem of computing a nearby matrix polynomial with a prescribed spectral structure: broadly speaking, the degrees and multiplicities of the invariant factors in the Smith form, or equivalently the structure and multiplicity of the eigenvalues of the matrix polynomial.
The results presented in this paper extend those in the conference paper (Giesbrecht, Haraldson, and Labahn, 2018). This work is not so much about computing the Smith normal form of a matrix polynomial using floating point arithmetic; rather, our focus is on the computation of a nearby matrix polynomial with "an interesting" or non-generic Smith normal form. The emphasis in this work is on the finite spectral structure of a matrix polynomial, since the techniques described are easily generalized to handle the instance of the infinite spectral structure as a special case.

The Smith form of a matrix polynomial is not continuous with respect to the usual topology of the coefficients, and the resulting limitations of backward stability are not the only issue that needs to be addressed when finding nearest objects in an approximate arithmetic environment. A second issue can be illustrated by recalling the well-known representation of the invariant factors $s_1, \ldots, s_n$ of a matrix $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$ as ratios $s_i = \delta_i / \delta_{i-1}$ of the determinantal divisors $\delta_1, \delta_2, \ldots, \delta_n \in \mathbb{R}[t]$, where $\delta_0 = 1$ and
$$\delta_i = \gcd\{\, \text{all } i \times i \text{ minors of } \mathbf{A} \,\} \in \mathbb{R}[t].$$
Already in the case of $2 \times 2$ matrices, the nearest matrix polynomial with a non-trivial Smith form may be unattainable. That is, there are co-prime polynomials with nearby polynomials with a non-trivial GCD at distances arbitrarily approaching an infimum, while at the infimum itself the GCD is trivial (see, e.g., Giesbrecht, Haraldson, and Kaltofen, 2019a).

The issue of unattainable infima extends to the Smith normal form. As an example, consider
$$\mathbf{A} = \begin{pmatrix} t^2 - 2t + 1 & \\ & t^2 + 2t + 1 \end{pmatrix} = \mathrm{diag}(f, g) \in \mathbb{R}[t]^{2 \times 2}.$$
If we look for nearby polynomials $\widetilde{f}, \widetilde{g} \in \mathbb{R}[t]$ of degree at most 2 such that $\gcd(\widetilde{f}, \widetilde{g}) = \gamma t + 1$ for some $\gamma \in \mathbb{R}$, minimizing $\|f - \widetilde{f}\|^2 + \|g - \widetilde{g}\|^2$, then it is shown in (Haraldson, 2015, Example 3.3.6) that this distance is a rational function of $\gamma$ whose infimum, approached as $\gamma \to 0$, is 2. However, at $\gamma = 0$ we have $\gcd(\widetilde{f}, \widetilde{g}) = 1$, so the GCD is trivial at the infimum while it is non-trivial for every $\gamma \neq 0$. For $\mathbf{A}$ to have a non-trivial Smith form we must perturb $f, g$ so that they have a non-trivial GCD, and thus any such perturbation must be at a distance of at least 2. However, the perturbation of distance precisely 2 has a trivial Smith form. There is no merit to perturbing the off-diagonal entries of $\mathbf{A}$.

Our work indirectly involves measuring the sensitivity of the eigenvalues of $\mathbf{A}$ and the determinant of $\mathbf{A}$. Thus we differ from most sensitivity and perturbation analyses (e.g., Stewart, 1994; Ahmad and Alam, 2009), since we also study how perturbations affect the invariant factors, instead of the roots of the determinant. Additionally, our theory is able to support the instance of $\mathbf{A}$ being rank deficient and having degree exceeding one. One may also approach the problem geometrically in the context of manifolds (Edelman et al., 1997, 1999). We do not consider the manifold approach directly since it does not yield numerical algorithms.

Determining what it means for a matrix polynomial to have a non-trivial Smith form numerically, and finding the distance from one matrix polynomial to another matrix polynomial having an interesting or non-trivial Smith form, are formulated as finding solutions to continuous optimization problems. The main contributions of this paper are deciding when $\mathbf{A}$ has an interesting Smith form, providing bounds on a "radius of triviality" around $\mathbf{A}$, and a structured stability analysis of iterative methods to compute a structured matrix polynomial with desired spectral properties.

The remainder of the paper is organized as follows. In Section 2 we give the notation and terminology along with some needed background used in our work. Section 3 discusses the approximate Smith form computation as an optimization problem and provides some new bounds on the distance to non-triviality.
We present an optimization algorithm in Section 4 with local stability properties and rapid local convergence to compute a nearby matrix polynomial with a non-trivial Smith form, and discuss implementation details. A method to compute a matrix polynomial with a prescribed lower bound on the number of ones in the Smith form is discussed in Section 5. We discuss our implementation and include some examples in Section 6. The paper ends with a conclusion along with topics for future research.

A preliminary version of some of the results in this paper appears in the proceedings of the ISSAC 2018 conference (Giesbrecht, Haraldson, and Labahn, 2018).
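As a concrete sketch of the determinantal-divisor representation $s_i = \delta_i/\delta_{i-1}$ discussed above, the following computes the invariant factors of $2 \times 2$ matrix polynomials with sympy. The helper name `invariant_factors_2x2` and the test matrices are our own illustration, not code from the paper.

```python
import sympy as sp
from functools import reduce

t = sp.symbols('t')

def invariant_factors_2x2(A):
    # s1 = delta_1 = gcd of all entries (the 1x1 minors),
    # s2 = delta_2 / delta_1, where delta_2 = det(A); both made monic.
    entries = [e for e in A if e != 0]
    delta1 = reduce(sp.gcd, entries)
    delta2 = A.det()
    s1 = sp.Poly(delta1, t).monic().as_expr()
    s2 = sp.Poly(sp.cancel(delta2 / delta1), t).monic().as_expr()
    return s1, s2

# Coprime diagonal entries -> trivial Smith form (s1 = 1).
A = sp.diag((t - 1)**2, (t + 1)**2)
s1, s2 = invariant_factors_2x2(A)
print(s1)  # 1

# A common factor t^2 - 1 on the diagonal -> non-trivial Smith form.
B = sp.diag((t - 1)*(t + 1), (t - 1)*(t + 1))
r1, r2 = invariant_factors_2x2(B)
print(sp.expand(r1))  # t**2 - 1
```

Note that the diagonal of the Smith form is generally not the diagonal of the input: the invariant factors must satisfy the divisibility chain $s_1 \mid s_2$.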
2. Preliminaries
In this section we give the notation and formal definitions needed to precisely describe the problems summarized above. We also present some existing results used as building blocks for our work. In addition, we provide a basic description of matrix functions and their first-order derivatives (Jacobian matrices), which will be needed for the optimization work central to our results.
We make extensive use of the following terminology and definitions. A matrix polynomial $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$ is an $n \times n$ matrix whose entries are polynomials. Typically we also work with matrices whose entries have degree bounded by $d$, and use the notation $\mathbb{R}_{\le d}[t]^{n \times n}$ to describe this set. Alternatively, we may express matrix polynomials as $\mathbf{A} = \sum_{0 \le j \le d} A_j t^j$ where $A_j \in \mathbb{R}^{n \times n}$. The degree $d$ of a matrix polynomial is defined to be the degree of the highest-order non-zero entry of $\mathbf{A}$, or equivalently the largest index $j$ such that $A_j \neq 0$. We say that $\mathbf{A}$ has full rank or is regular if $\det(\mathbf{A}) \not\equiv 0$. As noted in the introduction, $\mathbf{A}$ is said to be unimodular if $\det(\mathbf{A}) \in \mathbb{R} \setminus \{0\}$. The (finite) eigenvalues are the roots of $\det(\mathbf{A}) \in \mathbb{R}[t]$. The norm of a polynomial $a \in \mathbb{R}[t]$ is defined as $\|a\| = \|a\|_2 = \|(a_0, a_1, \ldots, a_d)\|_2$. For matrix polynomials we define $\|\mathbf{A}\| = \|\mathbf{A}\|_F = \sqrt{\sum_{i,j} \|\mathbf{A}_{ij}\|^2}$. Our choice of norm is a distributed coefficient norm, sometimes known as the Frobenius norm.

Definition 2.1 (SVD – Golub and Van Loan 2012). The Singular Value Decomposition (SVD) of $A \in \mathbb{R}^{n \times n}$ is given by $U^T \Sigma V$, where $U, V \in \mathbb{R}^{n \times n}$ satisfy $U^T U = I$, $V^T V = I$, and $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$ is a diagonal matrix with non-negative real entries in descending order of magnitude, the singular values of $A$. The distance to the nearest (unstructured) matrix of rank $m < n$ is $\sigma_{m+1}(A)$.

For scalar matrices we frequently write $\|\cdot\|_2$ for the largest singular value, and $\sigma_{\min}(\cdot)$ for the smallest singular value. In this paper we are mainly concerned with coefficient structures that preserve the zero coefficient structure of a matrix polynomial; that is, we generally do not change zero coefficients to non-zero, or increase the degrees of matrix entries.

Definition 2.2 (Affine/Linear Structure). A non-zero matrix polynomial $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$ of degree at most $d$ has a linear structure from a set $\mathbf{K}$ if $\mathbf{A} \in \mathrm{span}(\mathbf{K})$ as a vector space over $\mathbb{R}$, where
$$\mathbf{K} = \{\, C_{0,1}, \ldots, C_{0,k},\ tC_{1,1}, \ldots, tC_{1,k},\ \ldots,\ t^d C_{d,1}, \ldots, t^d C_{d,k} \,\},$$
where $C_{l,j} \in \mathbb{R}^{n \times n}$ for $1 \le j \le k$, and $k > 0$ is a finite index variable. If $\mathbf{A} = \mathbf{C}_0 + \mathbf{C}$, where $\mathbf{C}_0 \in \mathbb{R}[t]^{n \times n}$ is fixed and $\mathbf{C} \in \mathrm{span}(\mathbf{K})$, then $\mathbf{A}$ is said to have an affine structure from the set $\mathbf{K}$.

Affine and linearly structured matrices are best thought of as imposing linear equality constraints on the entries. Examples of matrices with a linear structure include matrices with prescribed zero entries or coefficients, Toeplitz/Hankel matrices, Sylvester matrices, resultant-like matrices, Ruppert matrices, and several other matrices appearing in symbolic-numeric computation. Matrices with an affine structure include all matrices with a linear structure and, in addition, matrices having prescribed non-zero constant entries/coefficients, for example monic matrix polynomials.

Recall that the rank of a matrix polynomial is the maximum number of linearly independent rows or columns as a vector space over $\mathbb{R}(t)$.
This is the same as the rank of the matrix $\mathbf{A}(\omega)$ for any $\omega \in \mathbb{C}$ that is not an eigenvalue of $\mathbf{A}(t)$. If we allow evaluation at eigenvalues, then the McCoy rank is the lowest rank when $\mathbf{A}$ is evaluated at an eigenvalue.

Definition 2.3 (McCoy Rank and Non-Trivial SNF). The McCoy rank of $\mathbf{A}$ is $\min_{\omega \in \mathbb{C}} \{\mathrm{rank}\, \mathbf{A}(\omega)\}$, the lowest rank possible when $\mathbf{A}$ is evaluated at any $\omega \in \mathbb{C}$. Note that the rank of $\mathbf{A}$ only drops at all if it is evaluated at an eigenvalue $\omega \in \mathbb{C}$. The McCoy rank is also the number of ones in the Smith form. Equivalently, if $\mathbf{A}$ has $r$ non-trivial invariant factors, then the McCoy rank of $\mathbf{A}$ is $n - r$. The matrix polynomial $\mathbf{A}$ is said to have a non-trivial or interesting Smith normal form if the McCoy rank is at most $n - 2$, or equivalently, if it has two or more invariant factors of non-zero degree.

Problem 2.4 (Approximate SNF Problem). Given a matrix polynomial $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$, find the distance to a non-trivial SNF. That is, find a matrix polynomial $\widehat{\mathbf{A}} \in \mathbb{R}[t]^{n \times n}$ of prescribed coefficient structure that has a prescribed McCoy rank of at most $n - r$, for $r \ge 2$, such that $\|\mathbf{A} - \widehat{\mathbf{A}}\|$ is minimized under $\|\cdot\|$, if such an $\widehat{\mathbf{A}}$ exists. We will consider $\|\cdot\| = \|\cdot\|_F$ to be the Frobenius norm. The nearest matrix of McCoy rank at most $n - 2$, if it exists, is called the approximate SNF.

Problem 2.5 (Lower McCoy Rank Approximation Problem). Compute the nearest matrix with McCoy rank $n - r$, if it exists, for $r \ge 2$.

In a generic sense, the nearest matrix polynomial with an interesting SNF will have McCoy rank $n - 2$; the cases of McCoy rank below $n - 2$ are different for non-trivial instances. In our study of nearest non-trivial Smith forms we often make use of the representation of the diagonal elements as ratios of GCDs of sub-determinants. When the entries are polynomials with numeric coefficients it is helpful to embed the arithmetic operations of polynomial multiplication and polynomial GCD into a matrix problem having numeric coefficients (i.e., from $\mathbb{R}$). Such an embedding allows us to employ a number of tools, including condition numbers and perturbations of matrix functions.

We start with some basic notation for mapping matrices and polynomials to vectors.

Definition 2.6 (Vec Operator). We define the operator $\mathrm{vec} : \mathbb{R}_{\le d}[t] \to \mathbb{R}^{(d+1) \times 1}$ as follows:
$$p = \sum_{j=0}^{d} p_j t^j \in \mathbb{R}[t] \ \mapsto\ \mathrm{vec}(p) = (p_0, p_1, \ldots, p_d)^T \in \mathbb{R}^{(d+1) \times 1}.$$
The vec operator $\mathrm{vec}(\cdot)$ is extended to map a matrix in $\mathbb{R}[t]^{m \times n}$ to a single vector in $\mathbb{R}^{mn(d+1) \times 1}$ by stacking columns of (padded) coefficient vectors on top of each other:
$$\mathbf{A} \in \mathbb{R}[t]^{m \times n} \ \mapsto\ \mathrm{vec}(\mathbf{A}) = \begin{pmatrix} \mathrm{vec}(\mathbf{A}_{11}) \\ \vdots \\ \mathrm{vec}(\mathbf{A}_{mn}) \end{pmatrix} \in \mathbb{R}^{mn(d+1) \times 1}.$$
It is sometimes useful to reduce matrix polynomials to vectors of polynomials in $\mathbb{R}[t]$ rather than vectors over $\mathbb{R}$.

Definition 2.7 (Polynomial Vec Operator). The pvec operator maps $\mathbb{R}[t]^{m \times n}$ to a vector in $\mathbb{R}[t]^{mn \times 1}$ as
$$\mathbf{A} \in \mathbb{R}[t]^{m \times n} \ \mapsto\ \mathrm{pvec}(\mathbf{A}) = \begin{pmatrix} \mathbf{A}_{11} \\ \vdots \\ \mathbf{A}_{mn} \end{pmatrix} \in \mathbb{R}[t]^{mn \times 1}.$$
We define the vectorization of matrix polynomials in this somewhat non-standard way so that we can later facilitate the computation of derivatives of matrix polynomial valued functions.
To describe polynomial multiplication in terms of linear maps over scalars we have:
Definition 2.8 (Convolution Matrix). Polynomial multiplication between polynomials $a, b \in \mathbb{R}[t]$, of degrees $d_1$ and $d_2$ respectively, may be expressed as a Toeplitz-matrix–vector product. We define
$$\phi_{d_2}(a) = \begin{pmatrix} a_0 & & \\ \vdots & \ddots & \\ a_{d_1} & & a_0 \\ & \ddots & \vdots \\ & & a_{d_1} \end{pmatrix} \in \mathbb{R}^{(d_1 + d_2 + 1) \times (d_2 + 1)}.$$
It follows that $\mathrm{vec}(ab) = \phi_{d_2}(a)\,\mathrm{vec}(b)$. When $a$ is non-zero, we can also define division through pseudo-inversion or linear least squares. In a similar manner, we can define the product of matrix polynomials through a Toeplitz-block matrix.

Definition 2.9 (Block Convolution Matrix). We can express multiplication of a matrix and vector of polynomials, $\mathbf{A} \in \mathbb{R}[t]^{m \times n}$ and $\mathbf{b} \in \mathbb{R}[t]^{n \times 1}$, of degrees at most $d_1$ and $d_2$ respectively, as a scalar linear system $\mathrm{vec}(\mathbf{A}\mathbf{b}) = \Phi_{d_2}(\mathbf{A})\,\mathrm{vec}(\mathbf{b})$, where
$$\Phi_{d_2}(\mathbf{A}) = \begin{pmatrix} \phi_{d_2}(\mathbf{A}_{11}) & \cdots & \phi_{d_2}(\mathbf{A}_{1n}) \\ \vdots & & \vdots \\ \phi_{d_2}(\mathbf{A}_{m1}) & \cdots & \phi_{d_2}(\mathbf{A}_{mn}) \end{pmatrix} \in \mathbb{R}^{m(d_1 + d_2 + 1) \times n(d_2 + 1)}.$$
The block convolution matrix is sometimes referred to as a "Sylvester matrix" associated with $\mathbf{A}$. However, we reserve the term "Sylvester matrix" for the more standard linear system appearing from the GCD of two (or more) polynomials. The block convolution matrix is a scalar matrix whose entries have a linear (Toeplitz-block) structure.
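Definition 2.8 can be sketched directly. The helper `phi` below (our own naming, not from the paper) builds $\phi_{d_2}(a)$ with coefficient vectors ordered low-to-high, and the identity $\mathrm{vec}(ab) = \phi_{d_2}(a)\,\mathrm{vec}(b)$ is checked against `np.convolve`:

```python
import numpy as np

def phi(a, db):
    # Convolution (Toeplitz) matrix phi_{db}(a): maps the coefficient
    # vector of b (deg <= db) to the coefficient vector of a*b.
    # Coefficient vectors are ordered from constant term upward.
    da = len(a) - 1
    M = np.zeros((da + db + 1, db + 1))
    for j in range(db + 1):
        M[j:j + da + 1, j] = a
    return M

a = np.array([1.0, 2.0, 3.0])   # 1 + 2t + 3t^2
b = np.array([4.0, 5.0])        # 4 + 5t
ab = np.convolve(a, b)          # coefficients of the product a*b
assert np.allclose(phi(a, len(b) - 1) @ b, ab)
```

Division by a non-zero $a$ then amounts to a least-squares solve with `phi(a, db)`, as noted in the text.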
Definition 2.10 (Kronecker Product). The Kronecker product of $\mathbf{A} \in \mathbb{R}[t]^{m \times n}$ and $\mathbf{B} \in \mathbb{R}[t]^{k \times \ell}$, denoted $\mathbf{A} \otimes \mathbf{B}$, is the $mk \times n\ell$ matrix over $\mathbb{R}[t]$ defined as
$$\mathbf{A} \otimes \mathbf{B} = \begin{pmatrix} \mathbf{A}_{11}\mathbf{B} & \cdots & \mathbf{A}_{1n}\mathbf{B} \\ \vdots & & \vdots \\ \mathbf{A}_{m1}\mathbf{B} & \cdots & \mathbf{A}_{mn}\mathbf{B} \end{pmatrix} \in \mathbb{R}[t]^{mk \times n\ell}.$$
This definition of Kronecker product, sometimes referred to as the "outer product", also holds for scalar matrices (and vectors).
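A quick numerical check of the column-stacking vec convention together with the standard Kronecker identity $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$ (a minimal numpy sketch of a well-known fact, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
X = rng.standard_normal((4, 2))
B = rng.standard_normal((2, 5))

def vec(M):
    # Column-stacking vec (Fortran order), matching vec(AXB) = (B^T (x) A) vec(X).
    return M.reshape(-1, order='F')

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
assert np.allclose(lhs, rhs)
```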
Lemma 2.11.
For scalar matrices $A$, $X$ and $B$ of compatible dimension over $\mathbb{R}$, we have $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$. Likewise, for matrix polynomials $\mathbf{A}$, $\mathbf{X}$ and $\mathbf{B}$ of compatible dimension over $\mathbb{R}[t]$, we have $\mathrm{pvec}(\mathbf{A}\mathbf{X}\mathbf{B}) = (\mathbf{B}^T \otimes \mathbf{A})\,\mathrm{pvec}(\mathbf{X})$.

The Kronecker product can also be used to re-write matrix equations of the form $AX = B$, for matrices $A$, $B$ and $X$ of compatible dimensions, as
$$\mathrm{vec}(AX) = (X^T \otimes I)\,\mathrm{vec}(A) = (I \otimes A)\,\mathrm{vec}(X) = \mathrm{vec}(B).$$
In this paper we will need to compute derivatives of some important matrix polynomial valued functions, namely the determinant and adjoint. This problem is approached in the context of computing the Jacobian matrix of a vector valued function. The analysis in this section will be useful for showing that Lagrange multipliers typically exist in the optimization problems encountered. The quantities computed can also be used to derive first-order perturbation bounds for these matrix polynomial valued functions with respect to $\|\cdot\|_F$.

Recall that the adjoint (adjugate) of a matrix polynomial $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$, denoted by $\mathrm{Adj}(\mathbf{A}) \in \mathbb{R}[t]^{n \times n}$, is the transpose of the cofactor matrix. Thus $\mathrm{Adj}(\mathbf{A})_{ij} = (-1)^{i+j} \det(\mathbf{A}[j \mid i])$, where $\mathbf{A}[j \mid i]$ is the matrix formed by removing row $j$ and column $i$ from $\mathbf{A}$.
When $\mathbf{A}$ has full rank, $\mathbf{A}$ satisfies $\mathbf{A}\,\mathrm{Adj}(\mathbf{A}) = \det(\mathbf{A})\,I$. The determinant of an $n \times n$ matrix polynomial having entries of degree at most $d$ can be viewed as a mapping from $\mathbb{R}^{n^2(d+1)} \to \mathbb{R}^{nd+1}$, since the determinant has degree at most $nd$. With this same viewpoint, we can view the adjoint of a matrix polynomial as a mapping from $\mathbb{R}^{n^2(d+1)} \to \mathbb{R}^{n^2((n-1)d+1)}$, since the degrees of the entries of the adjoint are at most $(n-1)d$. Our notation for computing derivatives of vector valued functions follows that of (Magnus and Neudecker, 1988).

It is not surprising that the determinant of a matrix polynomial has an identity (Magnus and Neudecker, 1988) similar to the well-known scalar identity $\nabla \det(A) = \mathrm{vec}(\mathrm{Adj}(A)^T)^T$.

Theorem 2.12.
Let $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$ have degree at most $d$. Then
$$J_{\det} = \frac{\partial\,\mathrm{vec}(\det(\mathbf{A}))}{\partial\,\mathrm{vec}(\mathbf{A})} = \Phi_d\big(\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\big) \in \mathbb{R}^{(nd+1) \times n^2(d+1)}.$$

Proof.
We note that, generalizing the scalar identity $\nabla \det(\cdot) = \mathrm{vec}(\mathrm{Adj}(\cdot)^T)^T$, we can write a first-order expansion of the determinant as
$$\det(\mathbf{A} + \Delta\mathbf{A}) = \det(\mathbf{A}) + \mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T \mathrm{pvec}(\Delta\mathbf{A}) + O\big(\|\Delta\mathbf{A}\|_F^2\big),$$
and ignoring higher-order terms we obtain the scalar expression
$$\mathrm{vec}(\det(\mathbf{A} + \Delta\mathbf{A})) \approx \mathrm{vec}(\det(\mathbf{A})) + \mathrm{vec}\big(\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T \mathrm{pvec}(\Delta\mathbf{A})\big).$$
The Jacobian can be extracted by writing (padding with zero coefficient entries as necessary)
$$\mathrm{vec}\big(\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T \mathrm{pvec}(\Delta\mathbf{A})\big) = J_{\det}\,\mathrm{vec}(\Delta\mathbf{A})$$
as a matrix–vector product. Thus, using block-convolution matrices, we have
$$\frac{\partial\,\mathrm{vec}(\det(\mathbf{A}))}{\partial\,\mathrm{vec}(\mathbf{A})} = \nabla(\det(\mathbf{A})) = \Phi_d\big(\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\big).$$

Now that we have a closed-form expression for the derivative of the determinant, it is useful to derive a closed-form expression for the derivative of the adjoint matrix. The closed-form expression reveals rank information, and is independently useful for optimization algorithms requiring derivatives. The rank information is useful to obtain insights about the existence of Lagrange multipliers. If $J_{\det}$ has full or locally constant (row) rank, then constraint qualifications will hold for several constrained optimization problems involving the determinant. If $\mathrm{Adj}(\mathbf{A})$ is non-zero, then one can often infer the existence of Lagrange multipliers for other problems as well.

Theorem 2.13.
Let $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$ have degree at most $d$ and rank $n$. The Jacobian of $\mathrm{Adj}(\mathbf{A})$ is $J_{\mathrm{Adj}} \in \mathbb{R}^{n^2((n-1)d+1) \times n^2(d+1)}$ with
$$J_{\mathrm{Adj}} = \big[\Phi_{(n-1)d}(I \otimes \mathbf{A})\big]^{+} \Big[\Phi_d\big(\mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\big) - \Phi_d\big(\mathrm{Adj}(\mathbf{A})^T \otimes I\big)\Big],$$
where $I$ is understood to be the $n \times n$ identity matrix and, for a scalar matrix $A$ of full rank, $A^{+}$ is the Moore–Penrose pseudo-inverse arising from the SVD.

Proof. First recall that if $\mathbf{A}$ has full rank, then $\mathbf{A}\,\mathrm{Adj}(\mathbf{A}) = \mathrm{Adj}(\mathbf{A})\,\mathbf{A} = \det(\mathbf{A})\,I$. This expression defines the adjoint matrix when $\mathbf{A}$ has full rank. We can write
$$\mathrm{pvec}(\mathbf{A}\,\mathrm{Adj}(\mathbf{A})) = (\mathrm{Adj}(\mathbf{A})^T \otimes I)\,\mathrm{pvec}(\mathbf{A}) = (I \otimes \mathbf{A})\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})),$$
thus converting to a linear system over $\mathbb{R}$ produces
$$\mathrm{vec}(\mathbf{A}\,\mathrm{Adj}(\mathbf{A})) = \Phi_{(n-1)d}(I \otimes \mathbf{A})\,\mathrm{vec}(\mathrm{Adj}(\mathbf{A})) = \Phi_d(\mathrm{Adj}(\mathbf{A})^T \otimes I)\,\mathrm{vec}(\mathbf{A}).$$
Taking derivatives of the left-hand side gives
$$\partial\,\mathrm{vec}(\mathbf{A}\,\mathrm{Adj}(\mathbf{A})) = \big(\partial\,\Phi_{(n-1)d}(I \otimes \mathbf{A})\big)\mathrm{vec}(\mathrm{Adj}(\mathbf{A})) + \Phi_{(n-1)d}(I \otimes \mathbf{A})\,\partial\,\mathrm{vec}(\mathrm{Adj}(\mathbf{A})). \quad (1)$$
Next we observe that (1) has the same coefficients as the expression
$$\mathrm{vec}\big((\partial\mathbf{A})\,\mathrm{Adj}(\mathbf{A}) + \mathbf{A}\,(\partial\,\mathrm{Adj}(\mathbf{A}))\big),$$
which is equivalent to
$$\mathrm{vec}\big((\mathrm{Adj}(\mathbf{A})^T \otimes I)\,\mathrm{pvec}(\partial\mathbf{A}) + (I \otimes \mathbf{A})\,\mathrm{pvec}(\partial\,\mathrm{Adj}(\mathbf{A}))\big),$$
which reduces to
$$\Phi_d\big(\mathrm{Adj}(\mathbf{A})^T \otimes I\big)\,\mathrm{vec}(\partial\mathbf{A}) + \Phi_{(n-1)d}(I \otimes \mathbf{A})\,\mathrm{vec}(\partial\,\mathrm{Adj}(\mathbf{A})). \quad (2)$$
We now have the derivative of the left-hand side of the expression $\mathbf{A}\,\mathrm{Adj}(\mathbf{A}) = \det(\mathbf{A})\,I$. Differentiation of the right-hand side yields
$$\partial\,\mathrm{vec}(\det(\mathbf{A})\,I) = \mathrm{vec}\big(\partial\,\mathrm{pvec}(\det(\mathbf{A})\,I)\big),$$
which is equivalent to the expression
$$\mathrm{vec}\big(\partial\,\mathrm{pvec}(\det(\mathbf{A})\,I)\big) = \mathrm{vec}\big(\mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\,\mathrm{pvec}(\partial\mathbf{A})\big). \quad (3)$$
Converting (3) into a linear system over $\mathbb{R}$ leads to
$$\mathrm{vec}\big(\mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\,\mathrm{pvec}(\partial\mathbf{A})\big) = \Phi_d\big(\mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\big)\,\mathrm{vec}(\partial\mathbf{A}), \quad (4)$$
which is the derivative of the right-hand side. Combining (2) and (4) we have
$$\Phi_{(n-1)d}(I \otimes \mathbf{A})\,\frac{\partial\,\mathrm{vec}(\mathrm{Adj}(\mathbf{A}))}{\partial\,\mathrm{vec}(\mathbf{A})} = \Phi_d\big(\mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\big) - \Phi_d\big(\mathrm{Adj}(\mathbf{A})^T \otimes I\big).$$
Assuming that $\mathbf{A}$ has full rank, so that $\Phi_{(n-1)d}(I \otimes \mathbf{A})$ is pseudo-invertible, we can write
$$J_{\mathrm{Adj}} = \big[\Phi_{(n-1)d}(I \otimes \mathbf{A})\big]^{+} \Big[\Phi_d\big(\mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\big) - \Phi_d\big(\mathrm{Adj}(\mathbf{A})^T \otimes I\big)\Big],$$
which completes the proof.

An observation that is important later is that the derivative of the adjoint has a Toeplitz-block structure. More importantly, the bandwidth is $O(d)$, and we only need to compute $O(n^2)$ columns instead of $O(n^2 d)$. We also note that $J_{\mathrm{Adj}}$ may be padded with zeros, since $\mathbf{A}$ may not have generic degrees.
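The scalar prototype of these Jacobians is the identity $\partial\det(A)/\partial A_{ij} = \mathrm{Adj}(A)_{ji}$, which can be verified by finite differences. A minimal numpy sketch (our own illustration; the adjugate is formed as $\det(A)A^{-1}$, valid for invertible $A$):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))

# Adjugate via Adj(A) = det(A) * inv(A), valid when A is invertible.
adj = np.linalg.det(A) * np.linalg.inv(A)

# Central-difference gradient of det at A; the scalar identity says
# d det(A) / dA_ij = Adj(A)_ji, i.e. grad = Adj(A)^T.
h = 1e-6
grad = np.zeros_like(A)
for i in range(4):
    for j in range(4):
        E = np.zeros_like(A)
        E[i, j] = h
        grad[i, j] = (np.linalg.det(A + E) - np.linalg.det(A - E)) / (2 * h)

assert np.allclose(grad, adj.T, atol=1e-4)
```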
Corollary 2.14. If $\mathbf{A}$ has full rank then $J_{\mathrm{Adj}}$ has full rank.

Proof. The matrix $\Phi_{(n-1)d}(I \otimes \mathbf{A})$ has full rank since $I \otimes \mathbf{A}$ has full rank. The matrix
$$\mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T - \mathrm{Adj}(\mathbf{A})^T \otimes I = -\big(\mathrm{Adj}(\mathbf{A})^T \otimes I - \mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\big) \quad (5)$$
is a rank-one update to a matrix polynomial. By evaluating (5) at a complex number $\omega$ that is not an eigenvalue of $\mathbf{A}$ we can show that (5) has full rank. Let $A = \mathbf{A}(\omega)$, so $A \in \mathbb{R}^{n \times n}$ has full rank. Using the Sherman–Morrison formula (Higham, 2002, pg. 487) for rank-1 updates to a matrix, we need to verify that
$$1 - \mathrm{vec}(\mathrm{Adj}(A)^T)^T \Big[\big(\mathrm{Adj}(A)^T\big)^{-1} \otimes I\Big] \mathrm{vec}(I) \neq 0$$
in order to ensure that (5) has full rank. We have that
$$\mathrm{vec}(\mathrm{Adj}(A)^T)^T \Big[\big(\mathrm{Adj}(A)^T\big)^{-1} \otimes I\Big] \mathrm{vec}(I) = \mathrm{vec}(\mathrm{Adj}(A)^T)^T\,\mathrm{vec}\big(\mathrm{Adj}(A)^{-1}\big) = \mathrm{Tr}\big(\mathrm{Adj}(A)\,\mathrm{Adj}(A)^{-1}\big) = n,$$
so the quantity above is $1 - n \neq 0$, and thus (5) has full rank. Note we used the identities, for matrices $X$, $Y$ and $Z$ of appropriate dimension, $\mathrm{vec}(XYZ) = (Z^T \otimes X)\,\mathrm{vec}(Y)$ and $\mathrm{vec}(X^T)^T\,\mathrm{vec}(Y) = \mathrm{Tr}(XY)$. Accordingly, we have that
$$\Phi_d\big(\mathrm{pvec}(I)\,\mathrm{pvec}(\mathrm{Adj}(\mathbf{A})^T)^T\big) - \Phi_d\big(\mathrm{Adj}(\mathbf{A})^T \otimes I\big)$$
has full rank; thus $J_{\mathrm{Adj}}$ is a product of two matrices of full rank, so $J_{\mathrm{Adj}}$ must also have full rank.

Corollary 2.14 implies that Lagrange multipliers will exist for several optimization problems involving the adjoint matrix as a constraint, since the Jacobian matrix of the adjoint has full rank. The linear independence constraint qualification or the constant rank constraint qualification will hold for several optimization problems of the form
$$\min \|\Delta\mathbf{A}\| \quad \text{subject to} \quad \mathrm{Adj}(\mathbf{A} + \Delta\mathbf{A}) = \mathbf{F},$$
for some reasonably prescribed $\mathbf{F} \in \mathbb{R}[t]^{n \times n}$.

Remark 2.15. If $\mathbf{A}$ is rank deficient, then the derivative is still defined, but not necessarily by Theorem 2.13.
If $\mathrm{rank}(\mathbf{A}) \le n - 3$ then $J_{\mathrm{Adj}} = 0$, since all $(n-2) \times (n-2)$ minors vanish ($J_{\mathrm{Adj}}$ consists of the coefficients of these minors). If $\mathrm{rank}(\mathbf{A}) = n - 2$ or $\mathrm{rank}(\mathbf{A}) = n - 1$ then $J_{\mathrm{Adj}}$ is still defined, and in both cases $J_{\mathrm{Adj}} \neq 0$. However, $J_{\mathrm{Adj}}$ is not necessarily described by Theorem 2.13.
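The proof of Corollary 2.14 rests on the Sherman–Morrison formula for rank-one updates. A minimal numpy check of that formula (our own illustration of a standard fact, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)   # comfortably invertible
u = rng.standard_normal((4, 1))
v = rng.standard_normal((4, 1))

Ainv = np.linalg.inv(A)
denom = 1.0 + (v.T @ Ainv @ u).item()
# A + u v^T is invertible exactly when denom != 0; this denominator is the
# quantity the Corollary's proof checks (there it evaluates to 1 - n).
assert abs(denom) > 1e-8

# Sherman-Morrison: (A + u v^T)^{-1} = A^{-1} - A^{-1} u v^T A^{-1} / denom.
lhs = np.linalg.inv(A + u @ v.T)
rhs = Ainv - (Ainv @ u @ v.T @ Ainv) / denom
assert np.allclose(lhs, rhs)
```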
For several affine or linear perturbation structures (such as ones that preserve the degree of entries or the support of entries), Theorem 2.13 and the associated Corollary 2.14 will hold (after deleting some extraneous rows or columns).
3. When Does a Numerical Matrix Polynomial Have a Non-trivial SNF?
In this section we consider the question of determining whether a matrix polynomial has a non-trivial SNF, or rather how much the coefficients need to be perturbed to obtain a non-trivial SNF. We provide a lower bound on this distance by analyzing the distance to a reduced-rank generalized Sylvester matrix.

3.1. Embeddings into generalized Sylvester matrices and approximate GCDs

In the introduction we demonstrated that some nearby non-trivial Smith forms are unattainable. In this subsection we investigate why these unattainable values occur. We first review some basic results needed to analyze the topology of the approximate Smith form problem.

For a matrix $\mathbf{A} \in \mathbb{R}[t]^{n \times n}$, we know that $s_n = \delta_n / \delta_{n-1}$, the quotient of the determinant and the GCD of all $(n-1) \times (n-1)$ minors. Since these minors are precisely the entries of the adjoint matrix, it follows that $\mathbf{A}$ has a non-trivial Smith form if and only if the GCD of all entries of the adjoint is non-trivial, that is, $\deg\big(\gcd\{\mathrm{Adj}(\mathbf{A})_{ij}\}\big) \ge 1$. In order to obtain bounds on the distance to a matrix having a non-trivial Smith form, we consider an approximate GCD problem of the form
$$\min \Big\{ \|\Delta\mathbf{A}\| \ \text{ subject to } \ \deg\big(\gcd\{\mathrm{Adj}(\mathbf{A} + \Delta\mathbf{A})_{ij}\}\big) \neq 0 \Big\}.$$
If this were a classical approximate GCD problem, then the use of Sylvester-like matrices would be sufficient. However, in our problem we want to look at GCDs of nearby polynomials with the added wrinkle that the degrees of the entries of the adjoint may change under perturbations. In order to perform such an analysis, we need to study a family of generalized Sylvester matrices that allow higher-degree zero coefficients to be perturbed.

Let $\mathbf{f} = (f_1, \ldots, f_k) \in \mathbb{R}[t]^k$ be a vector of polynomials with degrees $\mathbf{d} = (d_1, \ldots, d_k)$, ordered so that $d_j \ge d_{j+1}$ for $1 \le j \le k - 1$. Set $d = d_1$ and $\ell = \max(d_2, \ldots, d_k)$, and suppose that for each $i \in \{1, \ldots, k\}$ we have $f_i = \sum_{0 \le j \le d_i} f_{ij} t^j$.

Definition 3.1 (Generalized Sylvester Matrix). The generalized Sylvester matrix of $\mathbf{f}$ is defined as
$$\mathrm{Syl}(\mathbf{f}) = \mathrm{Syl}_{\mathbf{d}}(\mathbf{f}) = \begin{pmatrix} \phi_{\ell-1}(f_1)^T \\ \phi_{d-1}(f_2)^T \\ \vdots \\ \phi_{d-1}(f_k)^T \end{pmatrix} \in \mathbb{R}^{(\ell + (k-1)d) \times (\ell + d)}.$$

Some authors, e.g., (Fatouros and Karcanias, 2003; Vardulakis and Stoyle, 1978), refer to such a matrix as an expanded Sylvester matrix or generalized resultant matrix. The generalized Sylvester matrix has many useful properties pertaining to the Bézout coefficients. However, we are only concerned with the well-known result that $\gcd(\mathbf{f}) = \gcd(f_1, \ldots, f_k) = 1$ if and only if $\mathrm{Syl}_{\mathbf{d}}(\mathbf{f})$ has full rank (Vardulakis and Stoyle, 1978).

Sometimes treating a polynomial of degree $d$ as one of larger degree is useful. This can be accomplished by constructing a similar matrix and padding rows and columns with zero entries. The generalized Sylvester matrix of degree at most $\mathbf{d}' \ge \mathbf{d}$ (component-wise) of $\mathbf{f}$ is defined analogously as $\mathrm{Syl}_{\mathbf{d}'}(\mathbf{f})$, taking $d$ to be the largest entry and $\ell$ to be the largest degree of the remaining entries of $\mathbf{d}'$. Note that $\ell = d$ is possible and typical. If the entries of $\mathbf{f}$ have a non-trivial GCD (that is possibly unattainable) under a perturbation structure $\Delta\mathbf{f}$, then it is necessary that $\mathrm{Syl}_{\mathbf{d}'}(\mathbf{f})$ is rank deficient, and often this will be sufficient.

If we view the entries of $\mathbf{f}$ as polynomials of degree $\mathbf{d}'$ with $d'_i > d_i$ for all $i$, then the entries of $\mathbf{f}$ have an unattainable GCD of distance zero, typically of the form $1 + \varepsilon t$. In other words, the underlying approximate GCD problem is ill-posed in the sense that the solution is unattainable. In order to study the theory of unattainable GCDs, sometimes referred to as GCDs at infinity, we need to study the notion of a degree-reversed polynomial.

Lemma 3.2.
If max(d) = max(d′), then Syl_d(f) has full rank if and only if Syl_{d′}(f) has full rank.

Proof. Let d and ℓ be the largest and second largest entries of d, and let ℓ′ be the second largest entry of d′. The result follows from the main theorem of Vardulakis and Stoyle (1978) by considering the case of ℓ′ = d.

This lemma characterizes the (generic) case when elements of maximal degree of f do not change under perturbations, in which case the generalized Sylvester matrix still meaningfully encodes GCD information. However, it is possible that Syl_d(f) has full rank and Syl_{d′}(f) is rank deficient while the distance to a non-trivial GCD is not zero. This can occur when d_j = d′_j for some j and d′ ≥ d. To understand the most general case, we need to look at generalized Sylvester matrices of the reversal of several polynomials.

Definition 3.3.
The degree d reversal of f ∈ R[t] of degree at most d is defined as rev_d(f) = t^d f(t^{−1}). For a vector of polynomials f ∈ R[t]^k of degrees at most d = (d_1, . . . , d_k), the degree d reversal of f is the vector rev_d(f) = (rev_{d_1}(f_1), . . . , rev_{d_k}(f_k)).

The following theorem enables us to determine if unattainable solutions are occurring in an approximate GCD problem with an arbitrary (possibly non-linear) structure on the coefficients.

Theorem 3.4.
Let f be a vector of non-zero polynomials of degree at most d. Suppose that Syl_d(f) has full rank and Syl_{d′}(f) is rank deficient, where the perturbations Δf have degrees at most d′ and the entries of f have degrees d. Then f has an unattainable non-trivial GCD of distance zero under the perturbation structure Δf if and only if Syl(rev_{d′}(f)) is rank deficient.

Proof. Suppose that Syl(rev_{d′}(f)) has full rank. Then gcd(rev_{d′}(f)) =
1, hence f does not have an unattainable non-trivial GCD, since gcd(f) =
1. Conversely, suppose that Syl(rev_{d′}(f)) is rank deficient. Then t is a factor of gcd(rev_{d′}(f)) but t is not a factor of gcd(rev_d(f)). Accordingly, all entries of f + Δf may increase in degree, and so the distance of f to having a non-trivial GCD is zero; such a GCD is unattainable.

If the generalized Sylvester matrix of f has full rank, but the generalized Sylvester matrix that encodes the perturbations f + Δf is rank deficient, then either there is an unattainable GCD, or the generalized Sylvester matrix is rank deficient due to over-padding with zeros. Theorem 3.4 provides a reliable way to detect this over-padding.

Definition 3.5.
We say that A has an unattainable non-trivial Smith form if gcd(Adj(A)) = 1 and gcd(Adj(A + Δ̃A)) ≠ 1 for an arbitrarily small perturbation Δ̃A = Δ̃(ΔA) of some prescribed affine structure.

Note that Δ̃A just means that perturbations to A are structured as an affine function of ΔA. It is important to carefully consider structured perturbations, because some matrix polynomials have an unattainable non-trivial SNF under unstructured perturbations, but have an attainable non-trivial SNF under structured perturbations (perturbations that preserve the degree of entries or support of entries are structured). Solutions that cannot be attained correspond to an eigenvalue at infinity of A with a non-trivial spectral structure. Such examples are easily constructed when det(A) or Adj(A) have non-generic degrees.

Example 3.6. Let

A = ( t        t − 1 )
    ( t + 1    t     ) ∈ R[t]^{2×2}    and    C = ( A  0 )
                                                  ( 0  A ) ∈ R[t]^{4×4}.

Then C has an unattainable non-trivial Smith form if all perturbations to A are support or degree preserving (i.e. they preserve zero entries or do not increase the degree of each entry), both linear structures. Note that A and C are both unimodular. However, small perturbations to the non-zero coefficients of A make A + ΔA non-unimodular.

The Smith form of rev(C) = t C|_{t = t^{−1}} is SNF(rev(C)) = diag(1, 1, t², t²), which implies that the eigenvalue at infinity of A has a non-trivial spectral structure. The eigenvalue at infinity having a non-trivial spectral structure implies that the SNF of C is unattainable. Note that this is equivalent to saying that C has a non-trivial Smith-McMillan form.

These examples are non-generic.
Generically, the degree of all entries in the adjoint will be (n − 1)d and will remain unchanged locally under perturbations to the coefficients. Computing the distance to the nearest matrix polynomial with a non-trivial Smith form under a prescribed perturbation structure can be formulated as finding the nearest rank deficient (structured) generalized Sylvester matrix of the adjoint or the d′ reversal of the adjoint.

Suppose that A ∈ R[t]^{n×n} of degree at most d has a trivial Smith form and does not have an unattainable non-trivial Smith form. Then one method to compute a lower bound on the distance the entries of A need to be perturbed to have an attainable or unattainable non-trivial Smith form is to solve

inf ‖ΔA‖ subject to rank(Syl_{d′}(Adj(A + Δ̃A))) < e,   e = rank(Syl_{d′}(Adj(A))).   (6)

Here d′ is the vector of the largest possible degrees of each entry of Adj(A + Δ̃A), and Δ̃A is a prescribed linear or affine perturbation structure. It is sufficient to compute max(d′), a quantity which will generically be (n − 1)d. For non-generic instances we require the computation of d′. This optimization problem is non-convex, but multi-linear in each coefficient of ΔA.

We do not attempt to solve this problem directly via numerical techniques, since it enforces a necessary condition that is often sufficient. Instead we use it to develop a theory of solutions which can be exploited by faster and more robust numerical methods.

Lemma 3.7.
Let f be a vector of polynomials with degrees d and admissible perturbations Δf of degrees d′ where max(d) ≤ max(d′). Then the family of generalized Sylvester matrices Syl_{d′}(f) of rank at least e form an open set under the perturbations Δf.

Proof. By the degree assumption on Δf we have that, for an infinitesimal Δf, the matrices Syl_{d′}(f) and Syl_{d′}(Δf) have the same dimension. Accordingly, suppose that Syl_{d′}(f) has rank at least e. Then the Sylvester matrix in question must have rank at least e in an open neighborhood around it. In particular, when ‖Syl_{d′}(Δf)‖ < σ_e(Syl_{d′}(f)) then rank(Syl_{d′}(f + Δf)) ≥ rank(Syl_{d′}(f)), and the result follows.

Theorem 3.8.
The optimization problem (6) has an attainable global minimum under linearperturbation structures.Proof.
Let S be the set of all generalized Sylvester matrices of rank at most e − 1 with degree structure d′ arising from adjoints of matrix polynomials of the same dimension and degree as A. Lemma 3.7 implies that S is topologically closed.

Let R = { Syl_{d′}(Adj(C)) subject to ‖C‖ ≤ 2‖A‖ }, where the generalized Sylvester matrices are padded with zeros to have the appropriate dimension if required. Since ΔA has a linear perturbation structure, a feasible point always exists, namely ΔA = −A, giving C = A + ΔA = 0. By inspection, R is seen to be a non-empty set that is bounded and closed.

The functional ‖ · ‖ is continuous over the non-empty closed and bounded set S ∩ R. Let B ∈ S ∩ R. By Weierstrass's theorem ‖A − B‖ has an attainable global minimum over S ∩ R.

Note that if a feasible point exists under an affine perturbation structure, then a solution to the optimization problem exists as well. What this result says is that computing the distance to non-triviality is generally a well-posed problem, even though computing a matrix polynomial of minimum distance may be ill-posed (the solution is unattainable). The same results also hold when working over the d′ reversed coefficients. A similar argument is employed by (Kaltofen et al., 2007, Theorem 2.1).

Suppose that A ∈ R[t]^{n×n}, of degree at most d, has a trivial Smith form and does not have an unattainable non-trivial Smith form. This section provides some basic bounds on the distance the coefficients of A need to be perturbed to have a non-trivial Smith form.
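The bounds below are phrased coefficient-wise in terms of the adjoint Adj(A). As a concrete aid, the coefficients of Adj(A) can be computed numerically by evaluating A at sufficiently many points, taking the scalar adjugate, and interpolating each entry. The following is a minimal sketch assuming numpy; the helper names are ours, not from the paper.

```python
import numpy as np

def eval_poly_matrix(coeffs, t):
    # coeffs[j] is the n x n coefficient of t**j, so A(t) = sum_j coeffs[j] * t**j
    return sum(c * t**j for j, c in enumerate(coeffs))

def scalar_adjugate(M):
    # adjugate of an invertible scalar matrix: Adj(M) = det(M) * M^{-1}
    return np.linalg.det(M) * np.linalg.inv(M)

def adjoint_coeffs(coeffs):
    """Coefficients of Adj(A(t)).  Since deg Adj(A) <= (n-1)d, it is
    determined by (n-1)d + 1 evaluations; assumes A(t) is invertible
    at the chosen nodes."""
    n, d = coeffs[0].shape[0], len(coeffs) - 1
    m = (n - 1) * d + 1
    ts = np.cos(np.pi * (2 * np.arange(m) + 1) / (2 * m))  # Chebyshev nodes
    V = np.vander(ts, m, increasing=True)
    vals = np.array([scalar_adjugate(eval_poly_matrix(coeffs, t)) for t in ts])
    # interpolate every entry at once: slice j of the result is the t**j coefficient
    return np.linalg.solve(V, vals.reshape(m, -1)).reshape(m, n, n)
```

For the unimodular matrix A(t) = [[t, t − 1], [t + 1, t]] this recovers Adj(A) = [[t, 1 − t], [−t − 1, t]]. The same evaluation-interpolation idea is what allows the adjoint to be handled numerically later, without symbolic expansion.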
The bounds we derive are unstructured, although they can be generalized to several perturbation structures (such as ones that preserve the degree or support of entries) in a straightforward manner.

If we consider the mapping Adj(·) as a vector-valued function from R^{n²(d+1)} → R^{n²((n−1)d+1)} (with some coordinates possibly fixed to zero), then we note that the mapping is locally Lipschitz. More precisely, there exists c > 0 such that for sufficiently small ΔA,

‖Adj(A) − Adj(A + ΔA)‖ ≤ c ‖ΔA‖.

The quantity c can be approximately bounded above using the (scalar) Jacobian matrix ∇Adj(·) evaluated at A. A local upper bound for c is approximately ‖∇Adj(A)‖₂. We can invoke Theorem 2.13 if A has full rank. By considering ĉ = ‖[Φ_{(n−1)d}(I ⊗ A)]⁺‖, we obtain the (absolute) first-order approximate perturbation bound

‖Adj(A) − Adj(A + ΔA)‖_F ≲ ĉ (n + √n)(d + 1) ‖Adj(A)‖_F ‖ΔA‖_F.

The entries of ∇Adj(A) consist of the coefficients of the (n − 2) × (n −
2) minors of A. This follows because Adj(·) is a multi-linear vector mapping, and the derivative of each entry with respect to a coefficient of A is a coefficient of an (n − 2) × (n − 2) minor. The size of each minor can be bounded above (albeit poorly) by Hadamard's inequality (Goldstein-Graham variant, see (Lossers, 1974)). As such, we have the sequence of bounds

‖∇Adj(A)‖₂ ≤ n √(d + 1) ‖∇Adj(A)‖_∞ ≤ n (d + 1)^{1/2} ‖A‖_∞^{n−2} (d + 1)^{n−2} n^{(n−2)/2},

where ‖A‖_∞ is understood to be a vector norm and ‖∇Adj(A)‖_∞ is understood to be a matrix norm. The bound in question can be used in conjunction with the SVD to obtain a lower bound on the distance to a matrix polynomial with a non-trivial Smith form.

Theorem 3.9.
Suppose that d′ = (γ, γ, . . . , γ) and Syl_{d′}(Adj(A)) has rank e. Then an approximate lower bound on the distance to non-triviality is

σ_e(Syl_{d′}(Adj(A))) / ( γ ‖∇Adj(A)‖_F ).

Proof.
We note that for polynomials f with degrees d′ we have ‖Syl_{d′}(f)‖_F = γ ‖f‖_F. Accordingly, if ΔA is a minimal perturbation to non-triviality, then

(1/γ) σ_e(Syl_{d′}(Adj(A))) ≤ ‖Adj(A) − Adj(A + ΔA)‖_F ≲ ‖∇Adj(A)‖_F ‖ΔA‖_F,

and the theorem follows by a simple rearrangement. Note that ‖ · ‖₂ ≤ ‖ · ‖_F.

If d′ has different entries, then ℓ ‖f‖_F ≤ ‖Syl_{d′}(f)‖_F ≤ γ ‖f‖_F, where γ and ℓ are the largest and second-largest entries of d′. The lower bound provided can also be improved using the Karmarkar-Lakshman distance (Karmarkar and Lakshman, 1996) in lieu of the smallest singular value of the generalized Sylvester matrix, the d′ reversal of the adjoint, or other approximate GCD lower bounds (e.g., (Beckermann and Labahn, 1998)).
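As an illustration of Definition 3.1 and the role of σ_e, the following numpy sketch builds the generalized Sylvester matrix for a vector of polynomials (the helper names are ours) and checks rank deficiency for a pair with an exact common factor.

```python
import numpy as np

def conv_matrix(p, rows):
    # rows x (rows + deg p) convolution matrix: row i holds p shifted by i
    M = np.zeros((rows, rows + len(p) - 1))
    for i in range(rows):
        M[i, i:i + len(p)] = p
    return M

def gen_sylvester(polys):
    """Generalized Sylvester matrix of Definition 3.1.  Coefficients are
    stored low degree to high; polys must be sorted by decreasing degree."""
    degs = [len(p) - 1 for p in polys]
    d, ell = degs[0], max(degs[1:])
    blocks = [conv_matrix(polys[0], ell)]            # f_1 contributes ell rows
    for p in polys[1:]:                              # the rest contribute d rows,
        padded = np.pad(p, (0, ell - (len(p) - 1)))  # each padded to degree ell
        blocks.append(conv_matrix(padded, d))
    return np.vstack(blocks)

f1 = np.array([-2.0, 1.0, 1.0])   # (t - 1)(t + 2)
f2 = np.array([3.0, -4.0, 1.0])   # (t - 1)(t - 3): gcd is t - 1
S = gen_sylvester([f1, f2])
sing = np.linalg.svd(S, compute_uv=False)
```

Since gcd(f1, f2) = t − 1 is non-trivial, S is rank deficient and its smallest singular value vanishes. For nearby coprime polynomials, the smallest relevant singular value σ_e(S) measures how far the coefficients are from admitting a common factor, which is exactly how it enters the lower bound of Theorem 3.9.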
4. Approximate SNF via Optimization
In this section we formulate the approximate Smith form problem as the solution to a contin-uous constrained optimization problem. We assume that the solutions in question are attainableand develop a method with rapid local convergence. As the problem is non-convex, our conver-gence analysis will be local.
An equivalent statement to A having a non-trivial attainable Smith form is that Adj(A) = F* h, where F* is a matrix of scalar polynomials and h is a divisor of gcd(Adj(A)). This directly leads to the following optimization problem:

min ‖ΔA‖_F subject to Adj(A + ΔA) = F* h,  F* ∈ R[t]^{n×n},  h ∈ R[t],
                       N_h vec(h) = 1,  N_h ∈ R^{1×(deg(h)+1)}.   (7)

This is a multi-linearly structured approximate GCD problem, which is a non-convex optimization problem. Instead of finding a rank deficient Sylvester matrix, we directly enforce that the entries of Adj(A) have a non-trivial GCD. The normalization requirement that N_h vec(h) = 1 forces h to have a non-zero degree, so that h is not a scalar. One useful normalization is to define N_h such that lcoeff(h) = 1 (lcoeff(·) is the leading coefficient of a polynomial). Explicitly, we assume the degree of the approximate GCD is known and make it monic. Of course, other valid normalizations also exist.

Since we are working over R[t], there will always be a quadratic, linear or zero factor of attainable solutions. If h = t, then (A + ΔA)(0) is rank deficient, and computing the approximate SNF reduces to a nearest (scalar) matrix of rank at most n − 2 problem; otherwise we may take deg(h) = 1 or deg(h) = 2.

In order to solve our problem we will employ the method of Lagrange multipliers. The Lagrangian is defined as

L = ‖ΔA‖²_F + λ^T ( vec(Adj(A + ΔA) − F* h) ; N_h vec(h) − 1 ),

where λ is a vector of Lagrange multipliers. A necessary first-order condition (KKT condition, e.g. (Bertsekas, 1999)) for a tuple z⋆ = z⋆(ΔA, F*, h, λ) to be a regular (attainable) minimizer is that the gradient of L vanishes, that is,

∇L(z⋆) = 0.
(8)

Let J be the Jacobian matrix of the constraints, defined as

J = ∇_{ΔA, F*, h} ( vec(Adj(A + ΔA) − F* h) ).

The second-order sufficiency condition for optimality at a local minimizer z⋆ is that

ker(J(z⋆))^T ∇_xx L(z⋆) ker(J(z⋆)) ≻ 0,   (9)

that is, the Hessian with respect to x = x(ΔA, F*, h) is positive definite over the kernel of the Jacobian of the constraints. The vector x corresponds to the variables in the affine structure of ΔA, F*, and h. If (8) and (9) both hold, then z⋆ is necessarily a local minimizer of (7). Of course, it is also necessary that ker(J(z⋆))^T ∇_xx L(z⋆) ker(J(z⋆)) ⪰ 0 when ∇L = 0.

A problem with Newton-like methods is that when the Hessian is rank deficient or ill-conditioned, the Newton step becomes ill-defined or the rate of convergence degrades. The proposed formulation of our problem can encounter a rank deficient Hessian (this is due to over-padding some vectors with zero entries or redundant constraints). Despite this, we are still able to obtain a method with rapid local convergence under a very weak normalization assumption.

In order to obtain rapid convergence we make use of the Levenberg-Marquardt (LM) algorithm. If H = ∇²L, then the LM iteration is defined by repeatedly solving for z^{(k+1)} = z^{(k)} + Δz^{(k)} via

(H^T H + ν_k I) Δz^{(k)} = −H^T ∇L(z^{(k)}),  where z = (x ; λ) ∈ R^ℓ,

using ‖∇L‖² as a merit function. The speed of convergence depends on the choice of ν_k >
0. Note that since LM is essentially a regularized Gauss-Newton method, when the Hessian is rank deficient we may converge to a stationary point of the merit function. If convergence to a stationary point of the merit function is detected, then the method of Wright (2005) can be used to replace LM in several instances.

Yamashita and Fukushima (2001) show that, under a local error-bound condition, LM converges quadratically to a solution of a system of non-linear equations g(z) = 0. For this it is sufficient that regularity (J having full rank) holds or that second-order sufficiency holds; it is not necessary to satisfy both. Note that we assume Lagrange multipliers exist. However, unlike the case when J has full rank, the multipliers need not be unique. The advantage of LM over other Newton-like methods is that this method is globalized in exchange for an extra matrix multiplication, as H^T H + ν_k I is always positive definite, and hence the step is always a descent direction for the merit function. We make the choice of ν_k ≈ ‖g(z)‖ based on the results of Fan and Yuan (2005).

Definition 4.1 (Local Error Bound). Let Z⋆ be the set of all solutions to g(z) = 0 and X be a subset of R^ℓ such that X ∩ Z⋆ ≠ ∅. We say that ‖g(z)‖ provides a local error bound on g(z) = 0 if there exists a positive constant c such that c · dist(z, Z⋆) ≤ ‖g(z)‖ for all z ∈ X, where dist(·) is the distance between a point and a set.

In this section it is useful to consider g(z) = ∇L(z), as we need the local error bounds to estimate the distance to solutions of ∇L(z) = 0.

Theorem 4.2.
If the second-order sufficiency condition (9) holds at an attainable solution to (7), then the local error-bound property holds.

Proof. This result follows immediately from Section 3 of Wright (2005) and the references therein.

The bounds of Wright can be used to infer when quadratic convergence occurs for Newton-like methods. In this problem, perturbations to x are important in understanding how the problem behaves locally.

Remark 4.3.
Let z = z(x, λ), where x is a vector of variables and λ is a vector of Lagrange multipliers, and define g(z) = ∇L(z). First suppose that both the second-order sufficiency condition (9) and the first-order necessary condition (8) hold at the point z⋆. We can write the first-order expansion

g(z⋆ + Δz) = H(z⋆)(Δz) + O(‖Δz‖²) ≈ H(z⋆)(Δz),

noting that g(z⋆) = 0. It is useful to observe that

H(z⋆) = ( H_xx(z⋆)  J^T(z⋆) )
        ( J(z⋆)     0       ).

If Δx = 0, then the error bound from Hoffman (1952) (main theorem) applies, and we have that there exists c_hof > 0 such that c_hof ‖Δλ‖ ≤ ‖g(x, λ + Δλ)‖. If Δx ≠ 0, then

‖ ( H_xx(z⋆) ; J(z⋆) ) Δx ‖ ≈ ‖g(x + Δx, λ)‖,

and (9) implies that H(z⋆)(Δz) = 0 ⟹ Δx = 0, so

σ_min( H_xx(z⋆) ; J(z⋆) ) ‖Δx‖ ≲ ‖g(x + Δx, λ)‖.

Thus there exists c_σmin > 0, when ‖Δx‖ is sufficiently small, such that c_σmin ‖Δx‖ ≤ ‖g(x + Δx, λ)‖. Note that c_σmin ≈ σ_min( H_xx(z⋆) ; J(z⋆) ).

The first-order approximation implies that when ‖Δz‖ is sufficiently small,

g(z⋆ + Δz) ≈ H(z⋆)(Δz) = H(z⋆)(Δx ; 0) + H(z⋆)(0 ; Δλ) ≈ g(x + Δx, λ) + g(x, λ + Δλ).

The key idea is to separate the problem into the cases of Δx = 0 and Δx ≠ 0, and then derive error bounds for each case. (Here "globalized", as used above, means that the method will converge to a stationary point of the merit function, not necessarily a local extremum of the problem.)
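The LM iteration analyzed above can be sketched generically as follows; this is a toy illustration on a small system g(z) = 0 with the choice ν_k = ‖g(z_k)‖, not the actual Smith form Lagrangian, and the function names are ours.

```python
import numpy as np

def levenberg_marquardt(g, jac, z, tol=1e-12, max_iter=50):
    """Solve g(z) = 0 via (H^T H + nu_k I) dz = -H^T g with nu_k = ||g(z_k)||."""
    for _ in range(max_iter):
        gz = g(z)
        nu = np.linalg.norm(gz)
        if nu < tol:
            break
        H = jac(z)
        # nu > 0 makes H^T H + nu I positive definite, so the step is defined
        # even when H is rank deficient
        dz = np.linalg.solve(H.T @ H + nu * np.eye(len(z)), -H.T @ gz)
        z = z + dz
    return z

# toy system: intersect the circle x^2 + y^2 = 4 with the line x = y
g = lambda z: np.array([z[0]**2 + z[1]**2 - 4.0, z[0] - z[1]])
jac = lambda z: np.array([[2.0 * z[0], 2.0 * z[1]], [1.0, -1.0]])
z = levenberg_marquardt(g, jac, np.array([2.0, 1.0]))
```

Near a solution where the Jacobian is well behaved, ν_k shrinks with the residual and the step approaches a Gauss-Newton step, which is the mechanism behind the quadratic convergence discussed in this section.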
The important part of the discussion is that if one can estimate c_σmin, then one can often infer when quadratic convergence occurs.

The second-order sufficiency assumption is not necessary to derive error bounds. It is straightforward to show that the local error bound property holds if J(z⋆) has full rank, as the Lagrange multipliers will be (locally) unique, hence the solution is (locally) unique. Alternatively, if J had constant rank in a non-trivial open neighborhood around a solution, then a similar argument could be made about the local error-bound property.

Theorem 4.4.
The second-order sufficiency condition holds at minimal solutions with Lagrange multipliers of minimal norm if h is of maximal degree and monic and the minimal structured perturbation ‖ΔA⋆‖ is sufficiently small.

Proof. In the block coordinates (ΔA, F*, h), the Hessian of L with respect to x = x(ΔA, F*, h) is

∇_xx L = H_xx = ( F + 2I   0     0 )
                ( 0        0     E )
                ( 0        E^T   0 ),

where F is a square matrix with zero diagonal whose entries are multi-linear polynomials in λ and ΔA, and E^T is a matrix whose entries are homogeneous linear functions in λ.

If ΔA⋆ = 0 then λ⋆ = 0. Hence both E = 0 and F = 0, and if y ∈ ker(H_xx) ∩ ker(J) then the components of y corresponding to ΔA vanish, i.e., y = (0, v, u)^T. The Jacobian of the constraints may be written (up to permutation) as

J = ( ∗   C_h   C_{F*} )
    ( 0   0     N_h    ),

where the blocks ∗ correspond to differentiating with respect to variables in ΔA, and the blocks C_{F*} and C_h consist of block convolution and convolution matrices that correspond to multiplication by F* and h, respectively. The block N_h contains a normalization vector to ensure that h has the appropriate degree.

If Jy = 0 for such a y, then there exist a vector of polynomials v and a polynomial u, with the same degrees as F* and h, such that F* u + v h = 0 and N_h vec(u) = 0. Now h is a factor of both F* u and v h. Since gcd(F*, h) = 1, h is a factor of u. It follows that deg(u) = deg(h), so there exists some α ∈ R such that u = α h. Since h is monic, we have that N_h vec(h) = 1 and N_h vec(u) = 0, which implies that α = 0, and so u = 0, v h = 0 and v = 0. Hence ker(J) ∩ ker(H_xx) = {0}, and second-order sufficiency holds when ‖ΔA⋆‖ = 0. If ‖ΔA⋆‖ is sufficiently small, then ‖F‖ will be sufficiently small so that F + 2I has full rank. Accordingly, vectors in ker(H_xx) still vanish on the ΔA components, and the same argument applies.

We remark that the techniques in the proof are very similar to those of Zeng and Dayton (2004) and Giesbrecht, Haraldson, and Kaltofen (2019a) to show that a Jacobian matrix appearing in approximate GCD computations of two (or more) polynomials has full rank. If we over-estimated the degrees of F*, then H_xx would have some columns and rows consisting of zeros (the block-convolution matrices would be padded with extra zero entries).

In the proof of Theorem 4.4 we note that ∇_xx L = ∇_xx ‖ΔA‖²_F + ∇_x (λ^T J). The matrix F = ∇_{ΔA} (λ^T ∇_{ΔA} vec(Adj(A + ΔA))) will consist of coefficients of the (n − 3) × (n −
3) minorsof (cid:65) + ∆ (cid:65) scaled by entries of λ . Accordingly, F will generally not have − Remark 4.5.
Thus far we have assumed that Lagrange multipliers exist at the current solutions of interest, which are attainable solutions that have full rank. Corollary 2.14 and the proof of Theorem 4.4 imply that Lagrange multipliers generally exist under these assumptions for several perturbation structures, since we need to solve

2 vec(ΔA)^T = −λ^T J,

where J generally has constant or full rank. Of course, if the solution were unattainable, then the GCD constraints would break down, as there is a "solution at infinity" in the sense that ‖h‖ → ∞ as ΔA → ΔA⋆.

The implication of the local error-bound property holding is that one can reasonably approximate when quadratic convergence occurs by estimating σ_min([H_xx | J^T]) and c_hof. In particular, these quantities act as a structured condition number on the system. A structured backwards-error analysis of existing techniques can be performed using these quantities. Additionally, it is somewhat generic that F + 2I has full rank, hence the local error bound will hold for most instances of the approximate SNF problem with an attainable solution. It is also important to note that we did not explicitly use the adjoint matrix; indeed, the result remains valid if we replace the adjoint with minors of prescribed dimension. Likewise, if A is an ill-posed instance of lower McCoy rank or approximate SNF without an attainable global minimum, then optimizing over a reversal of each entry of Adj(A + ΔA) would yield a non-trivial answer and the same stability properties would hold. Thus, poorly posed problems also remain poorly posed if slightly perturbed.

Corollary 4.6.
The LM algorithm for solving ∇L = 0 has quadratic convergence under the assumptions of Theorem 4.2 and using ν_k = ‖∇L(z^{(k)})‖.

Proof. The quantity ∇L is a multivariate polynomial, hence it is locally Lipschitz. Second-order sufficiency holds, thus the local error bound property is satisfied. The method converges rapidly with a suitable initial guess.

Note that for several perturbation structures, if the adjoint has generic degrees, then the Jacobian of the constraints will have full rank, and a standard Newton iteration is also well-defined and will converge quadratically as well. In the next section we discuss a technique that possibly forgoes rapid local convergence, but has a polynomial per-iteration cost to compute a low McCoy rank approximation.

The most glaring problem in deriving a fast iterative algorithm for the approximate Smith form problem is that the matrix Adj(A + ΔA) has exponentially many coefficients as a multivariate polynomial in ΔA. This means computing the adjoint matrix symbolically as an ansatz is not feasible. In order to solve (8) we instead approximate the derivatives of the coefficients of the adjoint numerically.

To compute an initial guess, we can use ΔA_init = 0 and take F* and h to be a reasonable approximation to an approximate GCD of Adj(A), which will often be valid as per Theorem 4.2. To make sure the point is feasible, one can use a variant of Newton's method to project to a feasible point. Corollary 2.14 implies that with a suitable initial guess, reasonable variants of Newton's method (such as LM) will converge quadratically to a feasible point, assuming one exists.

Another technique is to take two rows or columns of A and perturb them so that the 2n entries have a non-trivial GCD. To find the best guess with this technique, O(n²) approximate GCD computations on O(n) polynomials of degree d need to be performed.
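One crude way to realize the first initial guess strategy (ΔA_init = 0, with h taken from an approximate GCD of the adjoint entries) is to cluster the roots of those entries. The sketch below (numpy; the helper is ours, not from the paper, and it only looks for a degree-one h) picks the tightest cluster containing one root of each polynomial.

```python
import numpy as np

def init_gcd_guess(polys):
    """Guess a monic degree-1 approximate GCD h = t - c of several polynomials
    (coefficients stored low degree to high) from their root clusters."""
    roots = [np.roots(p[::-1]) for p in polys]   # np.roots wants high -> low
    best, best_cost = None, np.inf
    for r in roots[0]:
        # nearest root of every other polynomial to the candidate root r
        chosen = [r] + [rs[np.argmin(np.abs(rs - r))] for rs in roots[1:]]
        cost = max(abs(c - r) for c in chosen)
        if cost < best_cost:
            best, best_cost = np.mean(chosen), cost
    return np.array([-best.real, 1.0])           # h = t - best, monic

polys = [np.array([-2.0, 1.0, 1.0]),   # (t - 1)(t + 2)
         np.array([3.1, -4.0, 1.0])]   # a perturbation of (t - 1)(t - 3)
h = init_gcd_guess(polys)              # close to t - 1
```

This is only a heuristic starting point; the feasibility projection by a Newton variant, as described above, is still needed before the main iteration.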
In the next section we will discuss more sophisticated techniques.

If a solution is unattainable, then the degrees of all the entries of the adjoint matrix may change in an open neighborhood around a solution. If ΔA⋆ is an unattainable solution (of full rank) to (7), then h(t) = t is clearly not a solution, since h(t) = t being a solution implies that such a solution would be attainable. Let d_Adj be the generic degree of Adj(A + ΔA); then t is a factor of gcd(rev_{d_Adj}(Adj(A + ΔA⋆))). The reversed adjoint has no GCD at infinity by assumption, as such a GCD at infinity would be an attainable solution to the original problem. Accordingly, we note that Theorem 4.4 applies after some straightforward modifications, since

∇ vec(Adj(A + ΔA))  and  ∇ vec(rev_{d_Adj}(Adj(A + ΔA)))

are essentially (block) permutations of each other. Since rev_{d_Adj}(Adj(A + ΔA)) achieves the generic degree, Lagrange multipliers should exist, as we can apply Corollary 2.14 to ∇ vec(rev_{d_Adj}(Adj(A + ΔA))) by permuting entries, and the underlying approximate GCD problem is well-posed. Thus the problem will also typically admit Lagrange multipliers.

The essential ingredient in Theorem 4.4 is the normalization of the underlying approximate GCD problem. This means that "backwards stable" algorithms will compute the exact SNF of a nearby matrix polynomial that has no meaning in the context of the computation. This generally occurs because the radius of uncertainty, usually proportional to unit rounding errors, contains infinitely many matrix polynomials with a non-trivial SNF. Backwards stability is not meaningful in this context, because the instance of the problem is not continuous. In such instances, computing the SNF is most likely the wrong problem to be considering. Instead, computing the spectral structure of eigenvalues at infinity is most likely the appropriate problem.
However, there exist instances where both problems could be simultaneously poorly conditioned. If the reversed problem has a radius of stability with respect to Theorem 4.4, then the original problem has a radius of instability, meaning that the iterates will converge to a point where ‖h‖ is excessively large. In other words, if an instance of a problem is ill-posed, then it cannot be regularized (the finite and infinite eigenvalues and their spectral structure are indistinguishable in floating point arithmetic) in the context of the QZ decomposition, GUPTRI (Demmel and Kågström, 1993a,b) or similar algorithms. There are some instances where attempting to compute the SNF numerically is not possible and should not be attempted. In the context of an optimization problem, we can of course regularize the problem as we have just described. Van Dooren (1979) suggests that ill-posed problems should be formulated as optimization problems as a means of regularization to overcome some of the numerical difficulties.
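The detection of solutions at infinity via reversals (Theorem 3.4) can be sketched concretely for the two-polynomial case using the classical Sylvester matrix (numpy; the helper names are ours).

```python
import numpy as np

def conv_matrix(p, rows):
    M = np.zeros((rows, rows + len(p) - 1))
    for i in range(rows):
        M[i, i:i + len(p)] = p
    return M

def sylvester2(p, q):
    # classical Sylvester matrix of two polynomials, coefficients low -> high
    return np.vstack([conv_matrix(p, len(q) - 1), conv_matrix(q, len(p) - 1)])

def rev(p, dprime):
    # degree-d' reversal rev_{d'}(f) = t^{d'} f(1/t)
    return np.pad(p, (0, dprime + 1 - len(p)))[::-1]

f1, f2 = np.array([1.0, 1.0]), np.array([2.0, 1.0])   # t + 1 and t + 2: coprime
full_rank = np.linalg.matrix_rank(sylvester2(f1, f2)) == 2

# allow perturbations of degree up to 2: the reversals share the factor t,
# signalling an unattainable GCD (at infinity) of distance zero
r1, r2 = rev(f1, 2), rev(f2, 2)
deficient = np.linalg.matrix_rank(sylvester2(r1, r2)) < 4
```

Here the unperturbed pair is coprime (full-rank Sylvester matrix), yet once both entries are viewed as degree-2 polynomials their reversals share the factor t, which is exactly the rank-deficiency certificate for a GCD at infinity discussed above.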
5. Lower McCoy Rank Approximation
In this section we describe how to find a nearby matrix polynomial of lower McCoy rank. Another way to formulate A having a non-trivial SNF is to solve the minimization problem

min ‖ΔA‖_F subject to (A(ω) + ΔA(ω)) B = 0,  B* B = I,  for some ω ∈ C and B ∈ C^{n×2},   (10)

where ΔA must have the appropriate structure. Essentially this finds the smallest perturbation of A with an eigenvalue that lowers the rank by at least 2. The auxiliary variables ω and B are used to enforce this constraint. Here B* is the conjugate transpose of B, and B* B = I ensures that the kernel vectors are linearly independent and do not tend towards zero.

The optimization is unstable if ω is reasonably large, since the largest terms appearing are proportional to O((d + 1) ‖A‖_∞ |ω|^d). To remedy this, if we assume that a solution to the optimization problem (10) exists and has full rank, then we may transform A + ΔA into a degree-one matrix polynomial (also known as a matrix pencil) with the same spectral properties, known as a linearization. If there is no full-rank solution, one can simply take a lower-rank approximation (Giesbrecht, Haraldson, and Labahn, 2019b) and extract a square matrix polynomial of full rank that may be linearized. Alternatively, one may forgo the linearization and work directly with a problem that is more poorly conditioned. For the rest of this section we will assume, without loss of generality, that A and the solutions to the low McCoy rank problem have full rank.

We can encode the spectral structure and SNF of A as the following degree-one matrix polynomial (sometimes referred to as the companion linearization (Gohberg et al., 2009)) P ∈ R[t]^{nd×nd}, defined as

P = t · diag(I, . . . , I, A_d) − (  0     I                      )
                                 (        ⋱     ⋱                )
                                 (              0     I          )
                                 ( −A_0  −A_1  · · ·  −A_{d−1}   ).

This particular linearization encodes the SNF of A, as SNF(P) = diag(I, I, . . .
, I, SNF(A)). It follows that A has a non-trivial SNF if and only if P has a non-trivial SNF. If we preserve the affine structure of P and only perturb blocks corresponding to A, then the reduction to a pencil will be sufficient. Other linearizations are possible as well. The pencil is generally better behaved numerically, since the largest entry upon evaluation at ω ∈ C is proportional to O(d ‖A‖_∞ |ω|) rather than O(‖A‖_∞ |ω|^d), albeit with matrices that are d times larger.

5.1. Fast Low McCoy Rank via Optimization

One way to approach the lower McCoy rank approximation problem is to study all the minors (or sufficiently many) of a matrix polynomial. This method immediately generalizes from the previous section; however, it is not practical for computational purposes, since the number of minors grows exponentially in the dimension. Instead, we can approach the problem by formulating it as an optimization problem, one that is remarkably similar to structured lower rank approximation of scalar matrices. This similarity facilitates computing an initial guess for the following optimization problem using the SVD.

The lower McCoy rank approximation problem may be formulated as the following real optimization problem: to find the nearest matrix polynomial to A ∈ R[t]^{n×n} with McCoy rank n − r, find the perturbation ΔA ∈ R[t]^{n×n} which minimizes

min ‖ΔA‖_F subject to Re((P + ΔP)(ω) B) = 0,  Im((P + ΔP)(ω) B) = 0,
                       Re(B* B) = I_r,  Im(B* B) = 0,  for some ω ∈ C and B ∈ C^{nd×r}.   (11)

Note that the perturbation ΔA is real valued in this problem. The unitary constraint on B ensures that rank(B) = r and each column of B remains away from zero.
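The companion pencil evaluated inside (11) can be built directly from the coefficients of A. The numpy sketch below (helper names are ours; this is one common block layout, and other linearizations are possible, as noted above) returns P(t) = tE − N and checks det P(t) = det A(t) at a sample point.

```python
import numpy as np

def companion_pencil(coeffs):
    """P(t) = t*E - N with det P(t) = det A(t), where coeffs[j] is the
    n x n coefficient of t**j and d = len(coeffs) - 1."""
    n, d = coeffs[0].shape[0], len(coeffs) - 1
    E = np.eye(n * d)
    E[-n:, -n:] = coeffs[d]                        # diag(I, ..., I, A_d)
    N = np.zeros((n * d, n * d))
    for i in range(d - 1):                         # identity blocks above the diagonal
        N[i * n:(i + 1) * n, (i + 1) * n:(i + 2) * n] = np.eye(n)
    for j in range(d):                             # last block row: -A_0, ..., -A_{d-1}
        N[-n:, j * n:(j + 1) * n] = -coeffs[j]
    return E, N

# A(t) = [[t^2 + 1, t], [0, t^2 - 1]], so det A(t) = t^4 - 1
A0 = np.array([[1.0, 0.0], [0.0, -1.0]])
A1 = np.array([[0.0, 1.0], [0.0, 0.0]])
A2 = np.eye(2)
E, N = companion_pencil([A0, A1, A2])
detP = np.linalg.det(2.0 * E - N)                  # should equal det A(2) = 5 * 3 = 15
vals = np.linalg.eigvals(np.linalg.solve(E, N))    # finite eigenvalues of A
```

Since A_2 = I here, E is invertible and the eigenvalues of E^{-1}N are the eigenvalues of A, the fourth roots of unity in this example; in (11) only the blocks of P corresponding to the coefficients of A would be perturbed.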
Accordingly, $\omega\in\mathbb{C}$ will be an eigenvalue of $\mathbf{P}+\Delta\mathbf{P}$ since $\mathrm{rank}\big((\mathbf{P}+\Delta\mathbf{P})(\omega)\big) \le nd - r$, and thus the McCoy rank of $\mathbf{A}+\Delta\mathbf{A}$ is at most $n-r$.

Real matrix polynomials can have complex eigenvalues, so complex numbers must necessarily appear in the constraints. The constraints arising from complex quantities may be divided into real parts and imaginary parts, denoted $\Re(\cdot)$ and $\Im(\cdot)$, respectively. By dividing each constraint into real and imaginary parts, we are able to solve an equivalent optimization problem entirely in real variables. This ensures that $\Im(\Delta\mathbf{A}) = 0$, that is, the perturbations are real.
Since $\mathbf{A}+\Delta\mathbf{A}$ may have complex eigenvalues (but entries with real coefficients), we require that $\mathrm{SNF}(\mathbf{A}+\Delta\mathbf{A})$ has entries from $\mathbb{R}[t]$. Accordingly, we need to interpret the auxiliary variable $\omega$. The instance $\Im(\omega) = 0$ yields $t-\omega$ as a factor of an invariant factor, while $\Im(\omega)\neq 0$ yields the real quadratic factor $(t-\omega)(t-\overline{\omega})$. Thus, at a solution we are able to recover a real invariant factor regardless of whether $\omega$ has a non-zero imaginary part.

In order to approach the problem using the method of Lagrange multipliers, we define the Lagrangian as
$$\mathcal{L} = \|\Delta\mathbf{A}\|_F^2 + \lambda^T \operatorname{vec}\begin{pmatrix} \Re\big((\mathbf{P}+\Delta\mathbf{P})(\omega)B\big) \\ \Im\big((\mathbf{P}+\Delta\mathbf{P})(\omega)B\big) \\ \Re(B^*B) - I_r \\ \Im(B^*B) \end{pmatrix},$$
and proceed to solve $\nabla\mathcal{L} = 0$.
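As a quick numerical illustration (not from the paper) of why a complex $\omega$ still produces a real invariant factor: the product $(t-\omega)(t-\overline{\omega})$ expands to $t^2 - 2\Re(\omega)\,t + |\omega|^2$, which has real coefficients.

```python
import numpy as np

omega = -0.3 - 0.7j  # hypothetical complex eigenvalue of a real matrix polynomial

# (t - omega)(t - conj(omega)) = t^2 - 2*Re(omega)*t + |omega|^2,
# so the recovered invariant-factor block has real coefficients.
coeffs = np.array([abs(omega)**2, -2 * omega.real, 1.0])  # constant, linear, quadratic

# Cross-check by multiplying out the two linear factors directly.
expanded = np.polynomial.polynomial.polymul([-omega, 1.0], [-omega.conjugate(), 1.0])
print(np.allclose(expanded, coeffs))  # True
```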
In our implementation we again make use of the Levenberg-Marquardt (LM) method, although given the relatively cheap gradient cost, a first-order method will often be sufficient and faster. The problem is essentially tri-linear, and structurally similar to affinely structured low-rank approximation, for which Lagrange multipliers exist for most instances.

It is important to note that an attainable solution to this problem is not guaranteed, as it is possible for $\|\omega\|\to\infty$ as $\Delta\mathbf{A}\to\Delta\mathbf{A}^\star$. Such an instance is an unattainable solution in the context of Section 4.5. These solutions behave like an infinite eigenvalue and can be handled by specifically considering the eigenvalue at $t = \infty$.

5.2. Computing an Initial Guess

In order to compute an initial guess to (11) we exploit the pseudo tri-linearity of the problem. If two of $\Delta\mathbf{A}$, $\omega$ and $B$ are fixed, then the problem is linear (or a linear surrogate can be solved) in the remaining variable. Despite the unitary constraint on $B$ being non-linear, it is not challenging to handle: any full-rank $B$ is suitable for an initial guess, since we may orthonormalize $B$ to satisfy the constraint $B^*B = I_r$.

First we approximate the determinant of $\mathbf{A}$ and consider initial guesses where $\sigma_{n-r}(\mathbf{A}(\omega_{\mathrm{init}}))$ is reasonably small. If $\sigma_{n-r}(\mathbf{A}(\omega_{\mathrm{init}}))$ is reasonably small, then $\omega_{\mathrm{init}}$ is (approximately) an eigenvalue of a nearby matrix polynomial of reduced McCoy rank. The zeros and local extrema of $\det(\mathbf{A})$ are suitable candidates for computing an initial guess for $\omega$. The kernel $B_{\mathrm{init}}$ can be approximated from the smallest $r$ singular vectors of $\mathbf{A}(\omega_{\mathrm{init}})$. This ensures that $B_{\mathrm{init}}$ is unitary and spans the kernel of a nearby rank-deficient (scalar) matrix.
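A minimal sketch of this SVD-based initial guess, with hypothetical helper names and toy data (it assumes candidate values of $\omega$, such as roots and extrema of $\det(\mathbf{A})$, are already available):

```python
import numpy as np

def initial_guess(A_eval, omega_candidates, r):
    # Choose the candidate where the r-th smallest singular value of A(omega)
    # is smallest, i.e. where A(omega) is closest to losing rank r.
    def gap(omega):
        return np.linalg.svd(A_eval(omega), compute_uv=False)[-r]
    omega_init = min(omega_candidates, key=gap)
    # B_init: right singular vectors for the r smallest singular values,
    # orthonormal by construction.
    _, _, Vh = np.linalg.svd(A_eval(omega_init))
    B_init = Vh[-r:].conj().T
    return omega_init, B_init

# Hypothetical 2x2 example: A(t) = diag(t - 1, t + 1) drops rank at omega = 1.
A_eval = lambda t: np.array([[t - 1.0, 0.0], [0.0, t + 1.0]])
omega_init, B_init = initial_guess(A_eval, [-2.0, 0.5, 1.0, 3.0], r=1)
print(omega_init)                                        # 1.0
print(np.allclose(B_init.conj().T @ B_init, np.eye(1)))  # True
```

Because the singular vectors come directly from the SVD, the unitary constraint $B^*B = I_r$ is satisfied without any extra orthonormalization step.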
To compute an initial guess for $\Delta\mathbf{A}$ we can take $\Delta\mathbf{A}_{\mathrm{init}} = 0$, or solve a linear least squares problem where $B$ and $\omega$ are fixed. Alternatively, one may project to a feasible point by using a variant of Newton's method, taking $\Delta\mathbf{A}_{\mathrm{init}} = 0$, $\omega_{\mathrm{init}}$ and $B_{\mathrm{init}}$ as an initial guess for the Newton iteration to solve $(\mathbf{A}+\Delta\mathbf{A})(\omega)B = 0$, $B^*B = I_r$. A feasible point computed by Newton's method tends not to perturb $\Delta\mathbf{A}$ very much, whereas the least squares approximation may perturb $\mathbf{A}$ by an unnecessarily large amount.

The problems previously discussed are NP-hard to solve exactly, and to approximate, with coefficients from $\mathbb{Q}$. This follows since affinely structured low-rank approximation (Braatz et al., 1994; Poljak and Rohn, 1993) is a special case: if we consider a matrix polynomial of degree zero, then this is a scalar matrix with an affine structure. The approximate SNF will be a matrix of rank at most $n - 2$, and finding the nearest affinely structured singular matrix is NP-hard.
Despite the problem being intractable in the worst case, not all instances are necessarily hard. The formulation (11) is multi-linear and polynomial, hence amenable to the sum-of-squares hierarchy. Lasserre's sum-of-squares hierarchy (Lasserre, 2001) is a global framework for polynomial optimization that asymptotically approximates a lower bound. Accordingly, if $\|\omega_{\mathrm{opt}}\|$ is bounded, then sum-of-squares techniques should yield insight into the problem.
6. Implementation and Examples
We have implemented our algorithms and techniques in the
Maple computer algebra system.¹ We use the variant of Levenberg-Marquardt discussed in Section 4 in several instances to solve the first-order necessary conditions. All computations are done using hardware precision and measured in floating-point operations, or FLOPs. The input size of our problem is measured in the dimension and degree of $\mathbf{A}$, which are $n$ and $d$ respectively. The cost of most quasi-Newton methods is roughly proportional to inverting the Hessian matrix, which is $O(\ell^3)$, where $\ell$ is the number of variables in the problem.

The method of Section 4 requires approximately $O((n^2d)^3) = O(n^6d^3)$ FLOPs per iteration in an asymptotically optimal implementation with cubic matrix inversion, which is the cost of inverting the Hessian.²

¹ Sample code is at .
² For $\phi, \psi : \mathbb{R}\to\mathbb{R}$, $\phi = \widetilde{O}(\psi)$ if and only if $\phi = O(\psi(\log|\psi|)^c)$ for some absolute constant $c \ge 0$, i.e., we ignore logarithmic factors.
Computing the Hessian costs roughly $\widetilde{O}(n^2d \times (n^2d)^2) = \widetilde{O}(n^6d^3)$ FLOPs (which can be done via interpolation in a straightforward manner). There are $O(n^2d)$ Lagrange multipliers, since the adjoint has degree at most $(n-1)d$. Using reverse-mode automatic differentiation to compute $\nabla\mathcal{L}$, this can be accomplished in $\widetilde{O}(n^2d \times n^2d) = \widetilde{O}(n^4d^2)$ FLOPs.

The method of Section 5 has a Hessian matrix of size $O(n^2d) \times O(n^2d)$ in the case of a McCoy rank zero approximation. Accordingly, the per-iteration cost is roughly $O(n^6d^3)$ FLOPs; if the linearization is not performed, the per-iteration cost is correspondingly higher. Given the lack of expensive adjoint computation, a first-order method will typically require several orders of magnitude fewer FLOPs per iteration (ignoring the initial setup cost), with local linear convergence.

Example 6.1 (Nearest Interesting SNF). Consider the matrix polynomial $\mathbf{A}$ with a trivial SNF of the form $\mathrm{diag}(1,\ldots,1,\det(\mathbf{A}))$. [Displayed matrix omitted.] If we prescribe the perturbations to leave zero coefficients unchanged, then using the methods of Section 4 and Section 5 results in a local minimizer $\mathbf{A}+\Delta\mathbf{A}_{\mathrm{opt}}$ [displayed matrix omitted], with $\|\Delta\mathbf{A}_{\mathrm{opt}}\| \approx \ldots$ The SNF of $\mathbf{A}+\Delta\mathbf{A}_{\mathrm{opt}}$ is approximately $\mathrm{diag}(1, 1, s_1, s_1(\cdots))$, where $s_1 \approx t^2 + \cdots$ is the quadratic factor corresponding to $\omega_{\mathrm{opt}} \approx \cdots$ The method discussed in Section 4 converges to approximately $\ldots$ decimal points of accuracy after $\ldots$ iterations, and the method of Section 5 converges to the same precision after approximately $\ldots$ iterations. The initial guess used in both instances was $\Delta\mathbf{A}_{\mathrm{init}} = 0$. The initial guesses of $\mathbf{F}^{*}$ and $h$ were computed by an approximate GCD routine.
For the initial guess of $\omega$ we chose a root or local extremum of $\det(\mathbf{A})$ that minimized the second-smallest singular value of $\mathbf{A}(\omega)$, one of which is $\omega_{\mathrm{init}} \approx \cdots$

Example 6.2 (Lowest McCoy Rank Approximation). Let $\mathbf{A}$ be as in the previous example and consider the lowest McCoy rank approximation problem with the same prescribed perturbation structure. In this case we compute a local minimizer $\mathbf{A}+\Delta\mathbf{A}_{\mathrm{opt}}$ [displayed matrix omitted], with $\|\Delta\mathbf{A}_{\mathrm{opt}}\| \approx \ldots$ after $\ldots$ iterations to $\ldots$ decimal points of accuracy. We compute $\omega_{\mathrm{opt}} \approx \cdots$, which corresponds to the single invariant factor $s_1 \approx t^2 + \cdots$ The SNF of $\mathbf{A}+\Delta\mathbf{A}_{\mathrm{opt}}$ is of the form $\mathrm{diag}(s_1, s_1, s_1, s_1)$.
7. Conclusion and Topics for Future Research
In this paper we have shown that the problem of computing a nearby matrix polynomial with a non-trivial spectral structure can be solved by (mostly local) optimization techniques. Regularity conditions were shown to hold for most instances of the problems in question, ensuring that Lagrange multipliers exist under mild assumptions about the solutions. When Lagrange multipliers do not exist, alternative formulations that admit Lagrange multipliers have been proposed. Several of these algorithms are shown to be theoretically robust with a suitable initial guess. In general, reasonable quasi-Newton methods will have rapid local convergence under normalization assumptions for all the problems considered.

There are a number of problems that remain open for future work. In particular, in the case of nearby non-trivial Smith forms there is the question of obtaining such forms via polynomial row and column operations, that is, finding the unimodular matrix multipliers that will produce our nearest Smith form. Preliminary work on this topic, including the formulation as an optimization problem and a proof of the existence of Lagrange multipliers for the optimization, can be found in the thesis of Haraldson (2019). In some cases it may be practical to prescribe the degree structure, also called the structural supports, of the eigenvalues or the invariant factors of a nearby matrix polynomial. In this case, rather than look for a closest non-trivial SNF, one would be interested in a closest SNF having a particular degree structure. As before, this can be formulated as an optimization problem, with early results available in (Haraldson, 2019).
References
Ahmad, S., Alam, R., 2009. Pseudospectra, critical points and multiple eigenvalues of matrix polynomials. Linear Algebra and its Applications 430 (4), 1171–1195.
Beckermann, B., Labahn, G., 1998. When are two numerical polynomials relatively prime? Journal of Symbolic Computation 26, 677–689.
Beelen, T., Van Dooren, P., 1988. An improved algorithm for the computation of Kronecker's canonical form of a singular pencil. Linear Algebra and its Applications 105, 9–65.
Bertsekas, D., 1999. Nonlinear Programming. Athena Scientific, USA.
Braatz, R. P., Young, P. M., Doyle, J. C., Morari, M., 1994. Computational complexity of µ calculation. IEEE Transactions on Automatic Control 39 (5), 1000–1002.
Demmel, J., Kågström, B., 1993a. The generalized Schur decomposition of an arbitrary pencil A − λB: robust software with error bounds and applications. Part I: theory and algorithms. ACM Transactions on Mathematical Software (TOMS) 19 (2), 160–174.
Demmel, J., Kågström, B., 1993b. The generalized Schur decomposition of an arbitrary pencil A − λB: robust software with error bounds and applications. Part II: software and applications. ACM Transactions on Mathematical Software (TOMS) 19 (2), 175–201.
Demmel, J. W., Edelman, A., 1995. The dimension of matrices (matrix pencils) with given Jordan (Kronecker) canonical forms. Linear Algebra and its Applications 230, 61–87.
Edelman, A., Elmroth, E., Kågström, B., 1997. A geometric approach to perturbation theory of matrices and matrix pencils. Part I: Versal deformations. SIAM Journal on Matrix Analysis and Applications 18 (3), 653–692.
Edelman, A., Elmroth, E., Kågström, B., 1999. A geometric approach to perturbation theory of matrices and matrix pencils. Part II: A stratification-enhanced staircase algorithm. SIAM Journal on Matrix Analysis and Applications 20 (3), 667–699.
Fan, J.-Y., Yuan, Y.-X., 2005. On the quadratic convergence of the Levenberg-Marquardt method without nonsingularity assumption. Computing 74 (1), 23–39.
Fatouros, S., Karcanias, N., 2003. Resultant properties of gcd of many polynomials and a factorization representation of gcd. International Journal of Control 76 (16), 1666–1683.
Giesbrecht, M., Haraldson, J., Kaltofen, E., 2019a. Computing approximate greatest common right divisors of differential polynomials. Foundations of Computational Mathematics, to appear.
Giesbrecht, M., Haraldson, J., Labahn, G., 2017. Computing the nearest rank-deficient matrix polynomial. In: Proceedings of the International Symposium on Symbolic and Algebraic Computation (ISSAC'17). Kaiserslautern, Germany, pp. 181–188.
Giesbrecht, M., Haraldson, J., Labahn, G., 2018. Computing nearby non-trivial Smith forms. In: Proceedings of the International Symposium on Symbolic and Algebraic Computation (ISSAC'18). New York, USA, pp. 159–166.
Giesbrecht, M., Haraldson, J., Labahn, G., 2019b. Lower rank approximations of matrix polynomials. Journal of Symbolic Computation, to appear.
Gohberg, I., Lancaster, P., Rodman, L., 2009. Matrix Polynomials. SIAM, USA.
Golub, G., Van Loan, C., 2012. Matrix Computations. Vol. 3. Johns Hopkins University Press, USA.
Haraldson, J., 2015. Computing Approximate GCRDs of Differential Polynomials. Master's thesis, Cheriton School of Computer Science, University of Waterloo.
Haraldson, J., 2019. Matrix Polynomials and their Lower Rank Approximations. Ph.D. thesis, Cheriton School of Computer Science, University of Waterloo.
Higham, N. J., 2002. Accuracy and Stability of Numerical Algorithms. Vol. 80. SIAM.
Hoffman, A. J., 1952. On approximate solutions of systems of linear inequalities. Journal of Research of the National Bureau of Standards 49 (4).
Kailath, T., 1980. Linear Systems. Vol. 156. Prentice-Hall, USA.
Kaltofen, E., Storjohann, A., 2015. The complexity of computational problems in exact linear algebra. In: Encyclopedia of Applied and Computational Mathematics. Springer, Germany, pp. 227–233.
Kaltofen, E., Yang, Z., Zhi, L., 2007.
Structured low rank approximation of a Sylvester matrix. In: Symbolic-Numeric Computation. Trends in Mathematics. Birkhäuser Verlag, Basel, Switzerland, pp. 69–83.
Karmarkar, N., Lakshman, Y. N., 1996. Approximate polynomial greatest common divisors and nearest singular polynomials. In: Proceedings of the International Symposium on Symbolic and Algebraic Computation (ISSAC'96). ACM Press, Zurich, Switzerland, pp. 35–39.
Lasserre, J.-B., 2001. Global optimization with polynomials and the problem of moments. SIAM Journal on Optimization 11 (3), 796–817.
Lossers, O., 1974. Solution to problem 73-17: A Hadamard-type bound on the coefficients of a determinant of polynomials. SIAM Review 16 (3), 394–395.
Magnus, J., Neudecker, H., 1988. Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley.
Poljak, S., Rohn, J., 1993. Checking robust nonsingularity is NP-hard. Mathematics of Control, Signals, and Systems (MCSS) 6 (1), 1–9.
Stewart, G., 1994. Perturbation theory for rectangular matrix pencils. Linear Algebra and its Applications 208, 297–301.
Van Dooren, P., 1979. The computation of Kronecker's canonical form of a singular pencil. Linear Algebra and its Applications 27, 103–140.
Van Dooren, P., Dewilde, P., 1983. The eigenstructure of an arbitrary polynomial matrix: computational aspects. Linear Algebra and its Applications 50, 545–579.
Vardulakis, A., Stoyle, P., 1978. Generalized resultant theorem. IMA Journal of Applied Mathematics 22 (3), 331–335.
Wright, S., 2005. An algorithm for degenerate nonlinear programming with rapid local convergence. SIAM Journal on Optimization 15 (3), 673–696.
Yamashita, N., Fukushima, M., 2001. On the rate of convergence of the Levenberg-Marquardt method. In: Topics in Numerical Analysis. Springer, pp. 239–249.
Zeng, Z., Dayton, B. H., 2004. The approximate GCD of inexact polynomials. In: Proceedings of the International Symposium on Symbolic and Algebraic Computation (ISSAC'04). Santander, Spain, pp.
320–327.