Zhang Neural Networks for Fast and Accurate Computations of the Field of Values
Frank Uhlig ∗

Abstract
In this paper a new and different neural network, called Zhang Neural Network (ZNN), is appropriated from discrete time-varying matrix problems and applied to the angle parameter-varying matrix field of values (FoV) problem. This problem acts as a test bed for newly discovered convergent 1-step ahead finite difference formulas of high truncation orders. The ZNN method that uses a look-ahead finite difference scheme of error order 6 gives us 15+ accurate digits of the FoV boundary in record time when applied to hermitean matrix flows A(t).

Keywords: field of values, numerical range, Zhang neural network, parameter-varying matrix problem, 1-step ahead finite difference formula, convergent characteristic polynomial, eigenvalue computation, numerical matrix algorithm
AMS :
Introduction
The field of values (FoV) F(A) = { x∗Ax | x ∈ C^n, ‖x‖ = 1 } of real or complex square matrices A_{n,n} has been studied for well over 100 years. In 1902 Ivar Bendixson [1] gave us the so called Bendixson rectangle that contains the field of values. Felix Hausdorff [7] and Otto Toeplitz [16] independently proved its convexity exactly 100 years ago. And 40 years ago, Charlie Johnson [8] added rotations to the Bendixson rectangle idea to compute boundary points of F(A) in the complex plane via repeated eigenvalue computations as accurately as Francis' QR algorithm allows for hermitean matrices, i.e., with 14 to 15 accurate leading digits. Johnson's method computes the eigenvectors of the extreme eigenvalues of matrix rotations A(t) = e^{it} · A of A via the hermitean and skew-hermitean parts H and K of A, defined as H = (A + A∗)/2 and K = (A − A∗)/(2i) with H∗ = H and K∗ = K.
With these notations any given matrix A = H + iK ∈ C^{n,n} generates the hermitean matrix flow A(t) = cos(t) · H + sin(t) · K with A(t) = A(t)∗ for all angles 0 ≤ t ≤ 2π in its associated FoV problem. The normalized eigenvectors x for the largest and smallest eigenvalues of A(t) then determine two ∂F(A) points via the quadratic form x∗Ax ∈ C and two ∂F(A) tangents. 37 years ago Miroslav Fiedler [6] established that the field of values boundary curve ∂F(A) is an algebraic curve except for possible corner points. Authors such as Rudolf Kippenhahn [9], Marvin Marcus and Claire Pesce [13] and many others have extended our understanding of the field of values. Workshops on field of values research have been held under the name of 'Numerical Ranges and Numerical Radii' (WONRA) biannually worldwide since 1992. The most recent WONRA gatherings have attracted researchers in quantum computing theory.

This paper deals with recent advances in how to compute the field of values boundary curve ∂F(A) faster and more accurately than Johnson's fundamental eigenvalue method [8] or Loisel and Maxwell's [11] most recent ODE path following method.

∗ Department of Mathematics and Statistics, Auburn University, Auburn, AL 36849-5310 ([email protected])

The eigenvalue concept originated with differential operators A acting on functions f ≠ 0 when studied in the early 1800s: what are the functions f (if any) for which A f = αf for some scalar α and a given differential operator A? In 1829, Augustin-Louis Cauchy [2, p. 175] applied the eigen concept from differential equation models to matrices and began to view the erstwhile 'eigenvalue equation' A f = αf as a 'null space equation', namely as A f − αf = 0 or (A − α id) f = 0 for the identity operator with id f = f for all f. This then led to the use of determinants and trying to find the roots of the characteristic polynomial of A.
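Johnson's rotation method sketched above reduces each boundary computation to a hermitean eigenvalue problem. The following short Python/NumPy sketch (our own illustrative translation, not the paper's MATLAB code) computes FoV boundary points exactly this way:

```python
import numpy as np

def fov_boundary_johnson(A, m=90):
    """Approximate boundary points of the field of values F(A) by
    Johnson's method: for m angles t in [0, pi) take the eigenvectors
    of the two extreme eigenvalues of A(t) = cos(t)*H + sin(t)*K."""
    H = (A + A.conj().T) / 2        # hermitean part of A
    K = (A - A.conj().T) / (2j)     # skew-hermitean part, made hermitean
    pts = []
    for t in np.linspace(0.0, np.pi, m, endpoint=False):
        At = np.cos(t) * H + np.sin(t) * K   # hermitean for every t
        w, V = np.linalg.eigh(At)            # eigenvalues ascending
        for x in (V[:, 0], V[:, -1]):        # smallest and largest
            pts.append((x.conj() @ A @ x).item())   # point on dF(A)
    return np.array(pts)

# toy check on a hermitean test matrix A (K = 0): F(A) degenerates
# to the real interval [lambda_min, lambda_max]
A = np.array([[2.0, 1.0], [1.0, 0.0]], dtype=complex)
pts = fov_boundary_johnson(A)
```

For this hermitean test matrix the computed boundary points are (numerically) real and fill the interval between the extreme eigenvalues 1 ± √2, as the theory predicts.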
Some of today's newest and fastest FoV computational methods reverse this earlier step of 200 years ago from differential equations ideas to related concepts for matrices by again employing differential operators and DE solvers in time-varying matrix problems. Interestingly, they can do so more speedily and more accurately here than standard matrix decomposition procedures would.

For example, Sébastien Loisel and Peter Maxwell [11] start with the hermitean part function H(e^{it}A) = 1/2 (e^{it}A + e^{−it}A∗) for a given A ∈ C^{n,n}. Clearly H(e^{it}A) is hermitean. If λ(t) is the largest eigenvalue of H(e^{it}A) and x(t) is its unit eigenvector, then x(t)∗ A x(t) lies on ∂(F(A)) and this is true for every t ∈ [0, 2π]. Their recent paper [11] studies the parametrized equation

H(e^{it}A) x(t) = λ(t) x(t).   (1)

With S(A) = 1/2 (A − A∗) we have A = H(A) + S(A). By requiring all eigenvectors x(t) of A(t) to be unit vectors and differentiating (1) with respect to t, Loisel and Maxwell [11, section 6] obtain the differential equation

\begin{pmatrix} x'(t) \\ \lambda'(t) \end{pmatrix} = \begin{pmatrix} H(e^{it}A) - \lambda(t) I_n & -x(t) \\ -x(t)^* & 0 \end{pmatrix}^{-1} \begin{pmatrix} -i S(e^{it}A)\, x(t) \\ 0 \end{pmatrix}.   (2)

This system of ODEs is treated in [11, section 6] as an initial value problem starting from just one explicit set of eigendata computations for t = 0 and H(A) itself. The parameter-varying eigenvalue problem is then solved by following the maximal eigenvalue's path of H(e^{it}A) for t ∈ (0, 2π]. To solve (2) Loisel and Maxwell use the Dormand-Prince RK5(4)7M method, see [11, section 6], jointly with Hermite interpolation of degree 4. The chosen integrator has error terms of order O(h^5) and the method delivers dense ∂(F(A)) output points.
[11] includes detailed descriptions of what might happen if the eigenvalues of H(e^{it}A) cross paths, an occurrence that will happen if A has repeated eigenvalues and that results in piecewise line segments on the field of values boundary curve. In [11] ways are described to detect such events and how to adjust ODE path following eigen methods in case of eigenvalue crossings. For random entry matrices A ∈ C^{n,n}, eigenvalue crossings of the associated matrix flow A(t) do not occur and thus they do not affect field of values boundary curve computations, see e.g. [14]. However, normal matrices A have polygonal fields of values and their FoV corners are the matrix eigenvalues. Normal matrices A can be detected readily from their defining equation A∗A = AA∗ and their FoV is easily determined from their eigenvalues.

Here we propose a new, even faster and more accurate method than [11] that uses convergent look-ahead finite difference formulas and Zhang Neural Networks rather than initial value ODE solvers. For simplicity we restrict our algorithm to non-normal matrices A that do not have repeated eigenvalues and we do not incorporate any work-arounds for matrices with repeated eigenvalues.

Discretized Zhang Neural Networks are a totally new and different process; they share no properties or steps with matrix decomposition methods, nor any with classical path following or homotopy ideas. Their numerical set-up starts from a general time- or parameter-varying model equation
F(t) = f(A(t), B(t), C(t), ..., x(t), y(t), ...) = g(α(t), β(t), ...) = G(t)   (3)

with time- or parameter-varying function inputs A(t), B(t), ..., α(t), β(t), ... for the unknowns x(t), y(t), .... From the given general equation (3) and following Cauchy's idea from 1829, we form the homogeneous error equation

E(t) = F(t) − G(t) = 0   (4)

that ideally should be zero or near zero for all t and the yet unknown time-varying solutions x(t), y(t), .... The ZNN Ansatz then requires exponential decay for E(t) by stipulating

Ė(t) = −η E(t)   (5)

with a decay constant η > 0. This translates the given task of solving (3) into an ordinary differential equation.

Zhang Neural Networks go back almost 20 years to Yunong Zhang and Jun Wang [24]. These new methods have been used extensively by engineers and the engineering literature now contains well over 400 papers on problem specific ZNN methods. There are several books on ZNN, such as [26]. The treated problems are generally time-varying, with parameter inputs A(t), B(t), ..., α(t), β(t), ... coming in at regular frequencies, such as 10, 50 or 100 Hz, from sensors on robots or autonomous vehicles and so forth. Their specific tasks may involve optimization, linear solves, operator inversions, eigenvalue computations, and most any other standard computational set-up that we are familiar with in the static numerical problem solving realm, such as Sylvester equations, Lyapunov equations etc. Note that static solvers are generally of little use or accuracy in time-varying problems. And besides, it appears that the numerical theory for parameter-varying (continuous or discrete) numerical problems is not well developed at this time. The new area has not even been understood or researched from the Numerical Analysis angle of view.
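The ZNN Ansatz (5) can be tried out on a toy scalar problem: solve the time-varying equation a(t) x(t) = b(t) by stipulating exponential decay of the error E(t) = a(t)x(t) − b(t) and integrating the resulting ODE with a simple Euler step. This Python sketch (our own minimal example with made-up functions a and b, not from the paper) shows the error being driven to near zero despite a deliberately wrong initial value:

```python
import numpy as np

# ZNN for the scalar time-varying equation a(t) * x(t) = b(t).
# Error function E(t) = a(t)x(t) - b(t); stipulating Edot = -eta*E
# and solving for xdot yields the ZNN ODE integrated below.
def znn_scalar(eta=50.0, tau=1e-3, t_end=2.0, x0=5.0):
    a  = lambda t: 2.0 + np.sin(t)
    ad = lambda t: np.cos(t)             # a'(t)
    b  = lambda t: np.cos(t)
    bd = lambda t: -np.sin(t)            # b'(t)
    x, t = x0, 0.0                       # deliberately wrong start x(0)
    while t < t_end:
        E    = a(t) * x - b(t)           # current error E(t_k)
        xdot = (-eta * E - ad(t) * x + bd(t)) / a(t)
        x   += tau * xdot                # forward Euler step
        t   += tau
    return x, b(t) / a(t)                # ZNN iterate vs. exact solution

x, x_true = znn_scalar()
```

After the exponential transient dies out, the iterate tracks the exact time-varying solution b(t)/a(t) up to the discretization error of the Euler step.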
Many specific ZNN methods for time-varying or parameter-varying problems have been developed and are successfully used in applications by engineers today, yet they are still unstudied and mostly unknown in numerical analysis circles, in our textbooks and courses.

This paper uses newly developed convergent look-ahead finite difference schemes of high truncation error orders from [18] and Zhang Neural Networks for the parametrized field of values boundary location problem with

A(t) = cos(t) (A + A∗)/2 + sin(t) (A − A∗)/(2i) = A(t)∗ ∈ C^{n,n}   ( A(t) = H(e^{it}A) in Loisel and Maxwell, [11] ).   (6)

STEP 1, model making :
For the FoV problem ZNN starts from the initial eigendata for A(0) = H at t = 0 and continues in discrete and predictive time or angle steps until t = 2π. For all matrix eigenvalue problems, Zhang Neural Networks would try to solve the parametric eigenvalue equation

A(t) V(t) = V(t) D(t) for all t   (7)

with two linked time-varying unknowns, the nonsingular eigenvector matrices V(t) and the diagonal matrices D(t) that contain the eigenvalues of A(t) ∈ C^{n,n}.

Step 2; form the error function :
The associated ZNN error equation is

E(t) = A(t) V(t) − V(t) D(t) = O_n ∈ C^{n,n}.   (8)

Step 2a; specify the model for one eigenvalue λ_i(t) and eigenvector x_i(t) pair of A(t) : Now we consider the error equation for one eigenvector/eigenvalue pair,

e_i(t) = A(t) x_i(t) − λ_i(t) x_i(t) = o_n ∈ C^n

for i = 1, ..., n.

Step 3, stipulate exponential decay for e_i(t) : Stipulating exponential decay for e_i means that ė_i(t) = −η e_i(t) for each i and – for simplicity's sake – a global positive decay constant η. By expansion and rearrangement this leads to the ODE

(A(t) − λ_i(t) I_n) ẋ_i(t) − λ̇_i(t) x_i(t) = (−η (A(t) − λ_i(t) I_n) − Ȧ(t)) x_i(t)   (9)

where I_n is the identity matrix of the same size n by n as A(t).

Step 3a, arrange equation (9) in augmented matrix form :
Upon further rearrangement this becomes the matrix ODE

\begin{pmatrix} A(t) - \lambda_i(t) I_n & -x_i(t) \\ 2\, x_i^*(t) & 0 \end{pmatrix} \begin{pmatrix} \dot x_i(t) \\ \dot\lambda_i(t) \end{pmatrix} = \begin{pmatrix} (-\eta (A(t) - \lambda_i(t) I_n) - \dot A(t))\, x_i(t) \\ -\mu\, (x_i^*(t) x_i(t) - 1) \end{pmatrix},   (10)

where the bottom block-matrix row ensures that the computed eigenvectors become unit vectors with stipulated exponential error decay for a separate decay constant µ.

These are the three ubiquitous paper and pencil start-up steps for all ZNN based matrix methods.

In our FoV computations, the matrix flow A(t) consists entirely of hermitean matrices and therefore all occurring eigenvalues λ_i(t) will be real and the top left block A(t) − λ_i(t) I_n of the system matrix in (10) is hermitean for all t. By dividing the lower block row in (10) by −2, we obtain the following equivalent model, now with a completely hermitean system matrix for speedier computations:

\begin{pmatrix} A(t) - \lambda_i(t) I_n & -x_i(t) \\ -x_i^*(t) & 0 \end{pmatrix} \begin{pmatrix} \dot x_i(t) \\ \dot\lambda_i(t) \end{pmatrix} = \begin{pmatrix} (-\eta (A(t) - \lambda_i(t) I_n) - \dot A(t))\, x_i(t) \\ \mu/2 \cdot (x_i^*(t) x_i(t) - 1) \end{pmatrix}.   (11)

Further details of this process are contained in [21, section 2].

The system matrix P(t_k) above is identical to the one of equation (2) that was obtained by differentiating the matrix eigenequation (1) and was used by Loisel and Maxwell's [11] IVP ODE path following method. But the right hand side of our two coupled matrix ODEs in (11) is completely different; besides, our ODE version (11) with a hermitean system matrix contains the decay constants η and µ and completely different terms than (2). The continuous-time matrix eigendata model (10) has been proven convergent and robust for noisy inputs in [25] for symmetric matrix flows A(t) = A(t)^T.
The ideas and proofs of [25] easily generalize to hermitean flows A(t) = A(t)∗. Here we deal with the specific discretized version of the eigendata FoV model that we have just introduced. ZNN methods for more general time-varying matrix problems were studied and tested earlier in [21].

For discretized input data matrices A(t_k) ∈ C^{n,n} and k = 0, 1, 2, ... it is most natural to discretize their differential equations in sync with the sampling gap τ = t_{k+1} − t_k. In the following we assume that τ is constant for all k. With

P(t_k) = \begin{pmatrix} A(t_k) - \lambda_i(t_k) I_n & -x_i(t_k) \\ -x_i^*(t_k) & 0 \end{pmatrix} \in C^{n+1,n+1}, \quad z(t_k) = \begin{pmatrix} x_i(t_k) \\ \lambda_i(t_k) \end{pmatrix} \in C^{n+1}, and

q(t_k) = \begin{pmatrix} (-\eta (A(t_k) - \lambda_i(t_k) I_n) - \dot A(t_k))\, x_i(t_k) \\ \mu/2 \cdot (x_i^*(t_k) x_i(t_k) - 1) \end{pmatrix} \in C^{n+1}   (12)

our model (11) with a hermitean system matrix flow P(t) becomes the matrix ODE

P(t_k) ż(t_k) = q(t_k)   (13)

for k = 0, 1, 2, ..., each equidistant discrete time step 0 ≤ t_k ≤ t_f and for each associated eigenpair x_i(t_k) and λ_i(t_k) of A(t_k). Here t_f denotes the final time.

Following the above start-up steps, we are now ready to work through the discretized ZNN finite difference portion of the scheme. The ZNN method ultimately relies entirely on look-ahead convergent finite difference schemes. The ensuing high accuracy predictions of future states of a system differ from any other method that we have ever seen or encountered and are completely new. Therefore ZNN methods behave quite differently for time-varying matrix problems. They belong to a special, new branch of numerical analysis that deserves to be studied without prejudice in its own right.

Standard static methods would compute the eigendata of A(t_k) at each time instant t_k. Thereby one would know the system properties at time t_k shortly after time t_k has passed due to the inherent computational lag.
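In Python/NumPy notation (an illustrative sketch of ours, not the author's MATLAB code, with function and variable names of our own choosing), the blocks P(t_k), z(t_k) and q(t_k) of (12) can be assembled as follows; note that P is hermitean by construction and that q vanishes when (x, λ) is an exact unit eigenpair of a constant flow:

```python
import numpy as np

def znn_blocks(At, Adot, x, lam, eta, mu):
    """Assemble P(t_k), z(t_k) and q(t_k) of (12) for one eigenpair
    (x, lam) of the hermitean matrix At = A(t_k)."""
    n = At.shape[0]
    M = At - lam * np.eye(n)                       # A(t_k) - lambda_i I_n
    P = np.block([[M, -x.reshape(n, 1)],
                  [-x.conj().reshape(1, n), np.zeros((1, 1))]])
    z = np.append(x, lam)                          # stacked iterate
    q = np.append((-eta * M - Adot) @ x,           # top block of (12)
                  mu / 2 * (x.conj() @ x - 1.0))   # unit length term
    return P, z, q
```

For a diagonal test matrix with exact unit eigenpair and Ȧ = 0 the right hand side q is exactly zero, so the model correctly stipulates no motion for already-converged eigendata.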
For modern robot control, autonomous vehicle control and other applications, however, such knowledge is of little value and might even be dangerously deceptive. When we walk or drive along a path or street it is good to know where we currently are, i.e., our static data, or where and how we have just been. But much more importantly, when moving about we need to look ahead, even if only somewhat unconsciously, and adjust our direction, our speed and so forth at time t_k for the future time instance t_{k+1}. To do so at t_k, we can only rely on the known system data from past or current moments t_j with j ≤ k. As we progress on our path we must stay on the road, follow its curves, and try to avoid puddles and potholes etc. that may lie ahead. Thus we need to make accurate assessments of what comes upon us next in the immediate future using our present and past data. Clearly meaningful numerical methods for such time-varying problems must be able to predict a system's states in the future at time t_{k+1} and do so reliably from data for times at or before t_k. Discrete Zhang Neural Networks rely on convergent look-ahead finite difference schemes that can predict the system's subsequent states reliably and accurately at time t_{k+1} from past data. ZNN methods can do so quickly even for corrupted data, corrupted by sensor or transmission failures, thanks to their stipulated exponential error decay; for examples see [21]. Moreover the convergence rate and accuracy of discretized ZNN methods depend solely on the chosen sampling gap τ, on the chosen decay constants η, µ, ..., and on the forward difference formula's truncation error order. Backward finite difference formulas may also be employed inside ZNN methods, for example to approximate the derivative Ȧ(t_k) in the right hand side computations of q(t_k) in formulas (13) and (12) above.

Look-ahead finite difference formulas are rare in the literature and in our handbook lists such as in [5, section 14.2].
Moreover the few that are listed and involve states at times t_{k+1}, t_k, t_{k−1} through t_{k−j} for some j ≥ 1 are usually not convergent and thus of no help with discretized ZNN methods that excel at predicting or computing future states fast and accurately. Convergence of multistep formulas is standardly defined via their characteristic polynomials: a multistep formula is convergent if it is zero stable, i.e., if all of its characteristic polynomial roots lie in the closed unit disk of C and all roots on the unit circle are simple, see e.g. [5, section 17.4]. In the past, only six convergent look-ahead finite difference schemes have been known, all with relatively low truncation error orders of at most four, see e.g. [18]. These were either of classical origin, such as Euler's formula, or constructed by hand from Taylor expansions for the iterates z_{k+1}, z_{k−1}, ..., z_{k−ℓ} with clever manipulations that would cancel second and higher order derivatives in a linear combination of these Taylor expansion equations. For details we refer to [18]. Recently this author [18] has constructed an algorithm for finding convergent look-ahead finite difference formulas of arbitrary truncation error orders. New such finite difference schemes have been found up to truncation error order eight. In this paper we will apply and test several convergent look-ahead finite difference formulas on the matrix FoV boundary problem: one known formula of truncation error order 4, found by hand, pencil and paper in [10], another new one of order 6 from [18], as well as four more such formulas of truncation error orders 3, 5, 7 and 7, respectively. Recall that this paper and the FoV matrix problem are intended as a test bed for comparisons of fast and accurate ZNN based computations. Our ZNN methods achieve spectacular results, see the next section.

Step 4, clear up all variables appearing in or after Step 3a :
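The root condition just described is easy to check numerically. Below is a small Python helper (our own sketch; the actual formula coefficients of [18] are not reproduced here) that tests zero stability of a characteristic polynomial, illustrated with Euler's look-ahead formula z_{k+1} = z_k + τ·ż_k, whose characteristic polynomial is p(x) = x − 1:

```python
import numpy as np

def is_zero_stable(coeffs, tol=1e-6):
    """Check the root condition for a characteristic polynomial given
    by its coefficients (highest degree first): all roots must lie in
    the closed unit disk and roots on the unit circle must be simple."""
    roots = np.roots(coeffs)
    if np.any(np.abs(roots) > 1.0 + tol):
        return False                      # a root outside the unit disk
    on_circle = roots[np.abs(np.abs(roots) - 1.0) <= tol]
    # roots on the unit circle must be simple: no two may coincide
    for i in range(len(on_circle)):
        for j in range(i + 1, len(on_circle)):
            if abs(on_circle[i] - on_circle[j]) <= tol:
                return False
    return True

# Euler's look-ahead formula z_{k+1} = z_k + tau*zdot_k: p(x) = x - 1
print(is_zero_stable([1.0, -1.0]))        # True
# Non-convergent example: p(x) = x^2 - 2x + 1 has the double root 1
print(is_zero_stable([1.0, -2.0, 1.0]))   # False
```

The tolerance absorbs the rounding errors that numerical root finders introduce, e.g. when splitting a double root on the unit circle into a nearby pair.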
The iterates of our algorithm are the entries in

z_k = z(t_k) = \begin{pmatrix} x_i(t_k) \\ \lambda_i(t_k) \end{pmatrix} \in C^{n+1}

from (12). Here x_i(t_k) denotes a field of values generating extreme eigenvector of A(t_k) = cos(t_k) H + sin(t_k) K as defined in (6) and λ_i(t_k) is the corresponding eigenvalue of A(t_k) for the angle t_k ∈ [0, 2π]. For non sensor based processes such as our formulaic equations based FoV boundary point problem, the derivative Ȧ(t_k) that occurs inside the ZNN based ODE (13) in q and formula (12) does not need to be approximated by a backward difference formula since the derivative of A(t_k) = cos(t_k) H + sin(t_k) K with respect to t is simply Ȧ(t_k) = −sin(t_k) H + cos(t_k) K.

Step 5, computer implementation of a difference scheme with fictitious name m_s : Here m ≥ s are both positive integers. The new eigendata iterate z_{k+1} = z(t_k + τ) at time t_{k+1} is found in ZNN from Ȧ(t_k), from previous eigendata for t_j with j ≤ k, the respective previous eigenvectors zs, and the associated previous eigenvalues ze that are stored in ZZ ∈ C^{n+1,m+s}, as well as the n+1 by n+1 matrix P as detailed in (12). This is expressed in the following MATLAB code lines:

% top eigenvalue iteration:
ze = ZZ(n+1,1);
zs = ZZ(1:n,1);
Al = Ath; Al(logicIn) = diag(Al) - ze*ones(n,1);  % Al = Ath - ze*In
P = [Al,-zs;-zs',0];                              % P is hermitean

The two code lines below define the right hand side vector q(t_k) in (12) and find the solution X with P(t_k) X = q(t_k) as indicated in equation (13):

q = taucoeff*tau*[(eta*Al + Adot)*zs; -1.5*eta*(zs'*zs-1)]; % n+1 vector q in (11)
X = linsolve(P,q); % hermitean matrix P in linsolve without any opts.SYM setting

Following some experimentation, we use the increased decay constant µ = 3η in the bottom entry of q to achieve faster convergence of the eigenvectors to unit length.
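To see Steps 1–5 work together, here is a deliberately simplified Python rendition of one complete discretized ZNN run (our own sketch: it replaces the paper's high order formulas and MATLAB code with the simplest convergent look-ahead formula, Euler's z_{k+1} = z_k + τ·ż_k, and tracks one extreme eigenvalue of the flow A(t) = cos(t)H + sin(t)K):

```python
import numpy as np

def znn_track_max_eig(H, K, tau=1e-4, t_end=1.0, eta=100.0):
    """Discretized ZNN eigenpair tracking for A(t) = cos(t)H + sin(t)K,
    using the predictive Euler step z_{k+1} = z_k + tau*zdot_k in place
    of the paper's higher order look-ahead formulas."""
    n = H.shape[0]
    mu = 3.0 * eta                       # decay constant for unit length
    w, V = np.linalg.eigh(H)             # start-up eigendata at t = 0
    x, lam = V[:, -1].astype(complex), w[-1]
    t = 0.0
    while t < t_end:
        At   = np.cos(t) * H + np.sin(t) * K
        Adot = -np.sin(t) * H + np.cos(t) * K
        M    = At - lam * np.eye(n)
        P    = np.block([[M, -x.reshape(n, 1)],
                         [-x.conj().reshape(1, n), np.zeros((1, 1))]])
        q    = np.append((-eta * M - Adot) @ x,
                         mu / 2 * (x.conj() @ x - 1.0))
        zdot = np.linalg.solve(P, q)     # model (13): P zdot = q
        x    = x + tau * zdot[:n]        # predictive step to t_{k+1}
        lam  = (lam + tau * zdot[n]).real
        t   += tau
    return lam

# hermitean test flow built from a fixed complex matrix A
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = (A + A.conj().T) / 2
K = (A - A.conj().T) / (2j)
lam = znn_track_max_eig(H, K)
```

At t = 1 the tracked value agrees with an eigenvalue of A(1) computed by a static eigensolver; with the paper's order 4 or order 6 formulas in place of the Euler step, several stored past iterates would be combined instead of just one.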
Finally the eigendata vector z_{k+1} is evaluated as Znew in the next code line via the chosen finite difference formula's characteristic polynomial of degree m+s, the solution vector X of P x = q and the previous eigendata matrix ZZ for the z_j with the m+s indices j ≤ k, namely

Znew = -(X + ZZ*polyrest); % from the formula for zdot

The only change from one ZNN method to another in this code involves the choice of which normalized finite difference polynomial to use. The respective characteristic polynomial is stored – without its leading coefficient 1 – in polyrest ∈ R^{m+s}.

Step 6, choosing a convergent 1-step ahead finite discretization formula :
Below we test our ZNN method for two different convergent polynomials and their associated look-ahead finite difference formulas, namely one with truncation error order 4 and one with truncation error order 6. The first polynomial was found by pencil and paper. It has integer coefficients and first appeared in [10], while the second was computed more recently via the MATLAB optimization and search code detailed in [18]. Both schemes work well for FoV boundary computations. The order 4 difference scheme has a non-normalized characteristic polynomial p(x) of degree 4 with integer coefficients and leading coefficient 8, and the discrete ZNN method with p needs m + s = 2 + 2 = 4 starting values of both the eigenvectors and eigenvalues for A(t_k) at k = 0, ..., 3. These we compute at start-up using explicit Francis QR eigenanalyses. The roots of p, one complex conjugate pair and two real roots, lie well within the unit disk in C, see Figure 2. Our computer generated order 6 finite difference scheme has a non-rational characteristic polynomial of degree 9 which we approximate in MATLAB double precision. This method uses its polynomial in normalized form and needs 9 eigendata starting values, which are again found via Francis QR. Its roots, three complex conjugate pairs and three real roots, all lie properly inside the unit disk in Figure 2, ensuring convergence when used inside a look-ahead finite difference based ZNN scheme.

Step 7, run tests :
All our test tables for ZNN field of values computations below use random entry complex matrices A_{n,n} of varying dimensions. In our first two test tables the matrix dimensions n increase in size by factors of 3 and all tables use sampling gaps such as τ = 0.01 (100 Hz), 0.001, 0.0001 or 0.00015. The most accurate field of values boundary points that anyone can create in MATLAB – without use of accuracy enhancements such as inverse iteration or extended digital precision ranges – are those from Francis QR algorithm eigenanalyses of A(t_k). These are accurate in their leading 14 or 15 digits. Loisel and Maxwell [11] interpolate their ODE computed results further. We instead take smaller and smaller sampling gaps τ for the ZNN iterations and generate up to 62,833 FoV boundary points via ZNN when τ = 0.0001. This many points seem to be enough for an accurate polygon approximation of the FoV boundary. The accuracy is measured in our code by the average number of correct ZNN computed digits of the boundary point estimates x∗Ax ∈ C for x = (z_k)_{[1:n]} ∈ C^n, the eigenvector part of our iterate z_k that was computed predictively at time t_{k−1} via ZNN, when compared with the Francis QR computed FoV boundary point for A(t_k).

Note again that this predictive way of computing by design with ZNN is not available in any other time-varying matrix algorithm.

finite diff. formula   n = 3     n = 9     n = 27    n = 81    n = 243
τ = 0.01               η = 19    η = 19    η = 13    η = 7     η = 7
τ = 0.001              η = 80    η = 90    η = 84    η = 89    η = 90
τ = 0.0001             η = 930   η = 900   η = 900   η = 900   η = 900
τ = 0.00015            η = 600

Table 1 : Optimal η values, accurate digits and run times for the ZNN method of truncation error order 4.

Next the analogous data table for ZNN and the field of values boundary problem when using the finite difference formula of truncation error order 6.

finite diff. formula   n = 3     n = 9     n = 27    n = 81    n = 243
τ = 0.01               η = 6.…   η = 7.…   η = 8.…   η = 8.…   η = 8.…
τ = 0.001              η = 41    η = 49    η = 47    η = 43    η = 49
τ = 0.0001             η = 390   η = 500   η = 530   η = 510   η = 520
τ = 0.00015            η = 290   η = 360   η = 380   η = 370   η = 380

Table 2 : Optimal η values, accurate digits and run times for the ZNN method of truncation error order 6.

To round out our comparisons, we next look at four convergent look-ahead finite difference schemes of different truncation error orders.
Complex random entry        f. diff. formula   f. diff. formula   f. diff. formula   f. diff. formula
matrix A_{n,n}, n = 243     (order 3)          (order 5)          (order 7)          (order 7)
                            0.… ≤ h ≤ 0.26
τ = 0.01 (100 Hz)           η = 22, 5.6 acc. digits, 631 FoV points, 1.9 sec
τ = 0.001                   η = 260            η = 27             η = 6.…            η = 16
τ = 0.0001                  η = 2500           η = 320            η = 69             η = 160

Table 3 : Optimal η values, accurate digits and run times for the four ZNN methods of truncation error orders 3, 5, and 7 twice, respectively.

The row data in Table 2 shows that the ZNN methods based on the order 6 polynomial with τ = 0.00015 or τ = 0.0001 give us more than 15 accurate leading digits for 40,000+ or 60,000+ FoV boundary points, respectively, for all chosen matrix sizes n by n. CPU times increase by factors between 1.2 and 7 for each tripling of the test matrix dimension n from 3 to 243 and for each fixed τ. For any fixed τ the number of accurately computed digits remains almost the same for all tested dimensions, as does the optimal η decay constant at around 300 and 500, respectively. Looking at the column data in Table 2, i.e. for decreasing sampling gaps τ and fixed matrix dimension n, the auxiliary constant h = η · τ is bounded by 0.03 and 0.06 in each column.

The data for ZNN with the order 4 formula in Table 1 shows similar qualities as that of Table 2, albeit with different optimal η progressions and generally lesser numbers of accurately computed digits (by 1 or 2 digits) in the computed FoV boundary points. Table 3 explores the behavior of four adjacent methods with truncation error orders O(τ^3), O(τ^5) and O(τ^7) (twice) and for dimension n = 243 only. The ZNN method with truncation error order O(τ^5) behaves nearly identically to the higher order methods but still comes short of their better accuracy performance, and it runs optimally with a much smaller h. And finally, the ZNN methods with the two convergent look-ahead finite difference formulas of truncation error order O(τ^7) do not improve significantly on the results of the order 4 or order 6 methods despite their respective higher truncation error orders.
Note however that one of the two order 7 formulas works much better regarding accurate digits than the other; compare the different root location geometry of their associated characteristic polynomials in Figure 2 below. Further note that our lowest truncation order method is the only one that allows usable output with 100 Hz discrete data input; the other tested ZNN methods would not converge for such large sampling gaps τ. Also, the value of h = η · τ is well below 0.1 for all methods except for the order 3 formula with h ≈ 0.25 for achieving optimal data accuracy in FoV applications. Usually the values of h hover around 1 to 10 in time-varying ZNN methods. The first 'recurrent neural network' paper with ZNN exponential error decay stipulations by Yunong Zhang and Jun Wang [24] includes the following explanation for the 'small h' phenomenon that we have observed in our FoV application. In [24, overlapping sentence on p. 1163, 1164] there is the following remark on combined networks, such as ours with two independent error equations: "The ( ... ) process using neural networks is virtually a multiple time-scale system. The plant and controller are in a time scale. N_z and N_k are in another time scale smaller than that of the plant and controller, which requires that the dynamic process for computing KH be sufficiently quick, or the plant be sufficiently slow time-varying." Thus our aim to achieve high global accuracy of the ZNN computed FoV boundary points by using very small sampling gaps throughout the algorithm counteracts the problem specifics for our chosen knife blade shaped fields of values with long, rather straight sides and very tight turns at the straights' ends for large n. The insight that Zhang neural networks are seemingly challenged by our necessarily quick, narrowly spaced angle changes in high FoV curvature areas and slow angle variances across the straights helps explain our small and very sensitive h anomaly due to our specific desires here. Yet ZNN still comes through very well.
For the accuracy of the ODE path following method on random entry matrices A, see [11, Fig. 8.1, +++ curve]. Our ZNN method delivers around 15.3 accurate digits in 111 seconds for 243 by 243 random entry matrices when it computes 41,889 FoV boundary points for τ = 0.00015. Figure 1 shows the MATLAB loglog line plot for Loisel and Maxwell's path-following ODE method [11, Fig. 8.1] and our ZNN based method. It depicts run times versus algorithm accuracy and includes the accuracy/CPU time point * for Johnson's method [8] when used with MATLAB's recently optimized Francis eig m file with multishifts.

Figure 1

Figure 1 shows that our discrete ZNN method is about 4 times faster at each accuracy level than the current best FoV problem solver, i.e., the ODE path following method of [11]. To draw the ZNN accuracy data line in Figure 1 we have used the run time data for τ = 0.001 and n = 243 from Table 2 with a 13.2 digit accuracy achieved in 16.6 seconds from 6,283 computed FoV boundary points. Here we have allowed for 2 accurate digits lost – from the 13.2 actual accurate digits on average at 6,000+ FoV boundary points down to just 11.2 – to draw the ZNN line in Figure 1. This accounts for the unavoidable errors in the gaps between the computed boundary points due to the natural bulging out of the true convex FoV boundary curve.

It is interesting to note here that specialized MATLAB programs that rely on particular eigendata acquisition methods for special matrix problems seem to have become almost irrelevant now due to recent multishift improvements of eig in MATLAB. Using the built-in MATLAB m file eig today in Johnson's original FoV algorithm [8] to acquire 42,000 FoV boundary points clearly bests Loisel and Maxwell [11] in speed by a factor of two and achieves the same level of accuracy. In turn today's eig MATLAB function is bested by discretized ZNN, again by a factor of around two according to the CPU data and Figure 1.
This paper was intended to test, examine and show differences in our newly discovered high order discretization versions of ZNN. What further speed gains are possible with local on-chip implementations of discretized ZNN methods now begs an answer.

From our set of data for various error order ZNN methods, it is obvious that ZNN methods of lesser error orders can give us lower accuracies, but their CPU times are all similar to those of the higher order methods. This is due to the low O(n) CPU work of ZNN to generate future FoV boundary points via finite difference schemes as linear combinations of previous vector data. The length of an n-vector does not really matter much for O(n) computational processes since in our ZNN algorithms the main effort and CPU time is spent on solving the linear systems (13), P(t_k) ż(t_k) = q(t_k), in the MATLAB code line X = linsolve(P,q) that was quoted at the start of this section. This line, when executed 40,000+ times for n = 243 with τ = 0.00015, takes up almost all of the CPU time. How to solve linear equations more efficiently, possibly via ZNN methods, than by using O(n^3) Gauss or generalized Cholesky methods or O(n^{2.8...}) Strassen inspired methods is an intriguing open problem.

The author's MATLAB function runconv1step1Plot(runs,jend,k,s) in [18] and [19] creates convergent normalized polynomials for possible use in look-ahead finite difference formulas. It does so from a seed vector y of length s ≥ k that the operator chooses and then it creates such polynomials, named k_s here, with truncation error orders k + 2 for the associated difference scheme. Better high order finite difference methods than ours are likely to exist. How to find or create such formulas is unknown.
We feel that the quality of a polynomial in finite look-ahead difference methods might depend on the lay of the polynomial's roots inside the unit disk. To help contemplate these matters, here are the root plots of the six characteristic polynomials for the convergent look-ahead discretizations that we have chosen to create Tables 1, 2, and 3.

The two polynomials with identical truncation error order 7 show slightly different geometries: five of the roots of one polynomial inside the open unit disk lie much closer to the unit circle than the four inside roots closest to the unit circle do for the other. Moreover, the roots of the former are more widely and more evenly spread over the unit disk than those of the latter. In our computations, see Table 3, the results with ZNN and one of these polynomials are much improved over those with ZNN and the other for τ = 0.… and τ = 0.…, with quite different values of h for the identical random entry matrix A. Is there a correlation between a polynomial's root geometry, its accuracy, and the optimal value of h in the associated finite difference ZNN scheme? If so, which geometries and which values of h give better results; and why? These are open questions.

Our data was computed using MATLAB R2018a on a MacBook Pro (early 2015 version) under High Sierra OS with a 4 core Intel i5 processor and 16 GB of RAM. The MATLAB codes FOVZNN4 5ashorteigsym.m and FOVZNN5 6bshorteigsym.m for our two methods can be found and downloaded from [20].
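The root geometry discussed above is easy to inspect numerically. The helper below is our own sketch: it assumes (our reading of the convergence requirement for these look-ahead formulas) that the characteristic polynomial has its principal root at 1 and all remaining roots strictly inside the open unit disk; the coefficients of the paper's actual k_s polynomials are not reproduced here.

```python
import numpy as np

def root_geometry(coeffs):
    """For a characteristic polynomial given by its coefficients (highest
    degree first), return (distance of the principal root from 1,
    largest modulus among the remaining roots).  A convergent look-ahead
    formula is assumed to need the first value ~0 and the second < 1."""
    roots = np.roots(coeffs)
    i = np.argmin(np.abs(roots - 1.0))           # principal root, closest to 1
    others = np.delete(roots, i)                 # all remaining roots
    spread = np.max(np.abs(others)) if others.size else 0.0
    return np.abs(roots[i] - 1.0), spread

# toy illustration with x^2 - x, whose roots are 1 and 0:
d_principal, max_other = root_geometry([1.0, -1.0, 0.0])
```

Plotting np.roots of each candidate polynomial against the unit circle reproduces root plots of the kind referred to above and makes the "lay of the roots" visible at a glance.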
Remark: (On repeated eigenvalues inside symmetric matrix flows A(t))
For both decomposition methods and path-following ODE initial value algorithms that find symmetric matrix flow eigendata over time, it is crucial that the matrix flow A(t) never encounters repeated eigenvalues in its course, see Dieci and Eirola [4] and Loisel and Maxwell [11], respectively. In [11], elaborate work-arounds for symmetric matrix eigenvalue repetitions are developed that can mitigate these built-in method limitations. Our discretized ZNN eigenvalue method encounters no problems when faced with matrix flows that incur repeated eigenvalues. If we had computed all eigenvalues of each A(t_k) instead of just the two extreme ones, and at each step had selected the two extreme eigenvalues and used their associated eigenvectors to find the corresponding FoV points, then this extension of our algorithm and code would deal securely with such derogatory symmetric matrix flows. That task sounds like a good project for a strong master's candidate.

Conclusions:
The ZNN method is completely new and highly accurate for time-varying matrix problems. It differs from all other time-varying matrix methods in its use of an error equation inspired differential equation that assures ZNN of automatic, exponentially fast convergence. This new ODE is then solved by forward looking finite difference equations that have never been used before and were actually unknown before their appearance with ZNN. Being completely new and different from decomposition methods such as Dieci et al. [4] or path-following ODE based methods such as [3, 11, 22, 23], ZNN bypasses some of the erstwhile algorithm-caused restrictions and limitations. It often works well in wider, less restricted applications than was possible before. ZNN is a newly emerging, worthy subject of numerical analysis, ready-made for thorough studies that may answer its many new open questions. Our community is thus called to set ZNN on firm computational and numerical ground.
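The error-equation design just mentioned is the defining construction of every ZNN method. In generic form (our own symbols e, η, z; equation (13) is the discretized end product of this construction) it reads:

```latex
% 1) encode the time-varying problem as an error function that is zero
%    exactly at the desired solution z(t):
e(t) \;=\; f\bigl(z(t),\,t\bigr)
% 2) stipulate exponential error decay with a design gain \eta > 0:
\dot e(t) \;=\; -\,\eta\, e(t)
\qquad\Longrightarrow\qquad
e(t) \;=\; e^{-\eta t}\, e(0) \;\longrightarrow\; 0 \quad\text{exponentially fast}
% 3) differentiating e(t) along the flow and inserting the decay law yields a
%    linear system  P(t)\,\dot z(t) \;=\; q(t)
%    for \dot z(t), which the look-ahead finite difference formulas discretize.
```

This is what makes the convergence automatic: any initial error is damped at the rate η regardless of how A(t) varies, provided the discretization itself is convergent.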
Afterthoughts:
When I was younger, much younger, I sometimes envisioned someday replacing how we computed matrix eigenvalues and eigenvectors then, namely via matrix factorizations such as QR or via vector and subspace iteration schemes, with solving large numbers of linear equations instead. That was at a time when I was trying to gain a deeper understanding, guessing at what might lie ahead for us in 'scientific computing' terms, as we would call it now, well before that term was invented. I mused about it, but found no answers then.

This paper's main algorithm spends most of its time (up to nearly half of its total CPU time) on solving thousands and thousands of linear equations in its most expensive MATLAB code line X = linsolve(P,q). On the other hand, its finite difference scheme formulated eigendata finding part runs almost effortlessly and in almost no time, by cleverly combining vectors linearly at O(n) cost. As applied to the FoV boundary problem here, Zhang Neural Networks can find time-varying and parameter-varying eigendata in record time and with record accuracy today by using cheap and clever linear vector combinations, while still relying on costly linear equation solves instead of Francis multishift QR. Amazing.

Does this answer my day-dreams of old, I wonder? What else is in store for us in future matrix numerics?

And: Google no longer uses the SVD and Francis QR, as Gene Golub had once suggested in Google's 'garage days', nor does Google use its 'page rank algorithm' as its search engine any longer. Instead it uses a different Neural Network based search engine, built with TensorFlow, since around 2013, see e.g. [15].
References

[1] Ivar Bendixson, Sur les racines d'une équation fondamentale, Acta Math., 25 (1902), p. 358-365.
[2] Augustin-Louis Cauchy, Sur l'équation à l'aide de laquelle on détermine les inégalités séculaires des mouvements des planètes, Exerc. de Math., 4 (1829); also Oeuvres, (2) 9, p. 174-195.
[3] A. Cichocki and R. Unbehauen, Neural networks for computing eigenvalues and eigenvectors, Biol. Cybern., 68 (1992), p. 155-164.
[4] Luca Dieci and Timo Eirola, On smooth decompositions of matrices, SIAM J. Matrix Anal. Appl., 20 (1999), p. 800-819.
[5] Gisela Engeln-Müllges and Frank Uhlig, Numerical Algorithms with C, Springer (1996), 602 p.
[6] Miroslav Fiedler, Geometry of the numerical range of matrices, Linear Algebra Appl., 37 (1981), p. 81-96.
[7] Felix Hausdorff, Der Wertvorrat einer Bilinearform, Math. Zeitschr., 3 (1919), p. 314-316.
[8] Charles R. Johnson, Numerical determination of the field of values of a general complex matrix, SIAM J. Numer. Anal., 15 (1978), p. 595-602.
[9] Rudolf Kippenhahn, Über den Wertevorrat einer Matrix, Math. Nachr., 6 (1951), p. 193-228; also available in English as On the numerical range of a matrix, translated by Paul F. Zachlin and Michiel E. Hochstenbach, Linear Multilinear Algebra, 56 (2008), p. 185-225.
[10] Jian Li, Mingzhi Mao, Frank Uhlig and Yunong Zhang, A 5-instant finite difference formula to find discrete time-varying generalized matrix inverses and scalar reciprocals, Numer. Alg. (2018), 21 p., https://doi.org/10.1007/s11075-018-0564-5
[11] Sébastien Loisel and Peter Maxwell, Path-following method to determine the field of values of a matrix at high accuracy, SIAM J. Matrix Anal. Appl., 39 (2018), p. 1726-1749. https://doi.org/10.1137/17M1148608
[12] A. A. Mailybaev, Computation of multiple eigenvalues and generalized eigenvectors for matrices dependent on parameters, Numer. Linear Algebra Appl., 13 (2006), p. 419-436.
[13] Marvin Marcus and Claire Pesce, Computer generated numerical ranges and some resulting theorems, Lin. Multilin. Alg., 20 (1987), p. 121-157.
[14] John von Neumann and Eugene Paul Wigner, On the behavior of the eigenvalues of adiabatic processes, Physikalische Zeitschrift, 30 (1929), p. 467-470; reprinted in Quantum Chemistry, Classic Scientific Papers, Hinne Hettema (editor), World Scientific (2000), p. 25-31.
[15] James Somers, Binary Stars: The friendship that made Google huge, The New Yorker, Annals of Technology, December 10, 2018 issue, p. 28-35; excerpts on 'Neural Networks' (from p. 34, 35) at
[16] Otto Toeplitz, Das algebraische Analogon zu einem Satz von Fejér, Math. Zeitschr., 2 (1918), p. 187-197.
[17] Frank Uhlig, Faster and more accurate computation of the field of values boundary for n by n matrices, Lin. and Multilin. Alg., 62 (2014), p. 554-567. http://dx.doi.org/10.1080/03081087.2013.779269
[18] Frank Uhlig, The construction of high order convergent look-ahead finite difference formulas for Zhang neural networks, submitted, 9 p.
[19] Frank Uhlig, The MATLAB code runconv1step1Plot.m for convergent polynomial construction is available at
[20] Frank Uhlig, The MATLAB codes FOVZNN4 5ashorteigsym.m and FOVZNN5 6bshorteigsym.m for our algorithms are available at
[21] Frank Uhlig and Yunong Zhang, Time-varying matrix eigenanalyses via Zhang neural networks and finite difference equations, submitted, 16 p.
[22] X. Wang, M. Che and Y. Wei, Recurrent neural network for computation of generalized eigenvalue problem with real diagonalizable matrix pair and its applications, Neurocomputing, 216 (2016), p. 230-241.
[23] Z. Yi, Y. Fu and H. J. Tang, Neural networks based approach for computing eigenvectors and eigenvalues of symmetric matrix, Comput. Math. Appl., 47 (2004), p. 1155-1164.
[24] Yunong Zhang and Jun Wang, Recurrent neural networks for nonlinear output regulation, Automatica, 37 (2001), p. 1161-1173.
[25] Yunong Zhang, Min Yang, Chumin Li, Frank Uhlig, Haifeng Hu, New continuous ZD model for computation of time-varying eigenvalues and corresponding eigenvectors, submitted, 16 p.
[26] Y