A New Selection Operator for the Discrete Empirical Interpolation Method -- improved a priori error bound and extensions
ZLATKO DRMAČ∗ AND SERKAN GUGERCIN†

Abstract.
This paper introduces a new framework for constructing the Discrete Empirical Interpolation Method (DEIM) projection operator. The interpolation node selection procedure is formulated using the QR factorization with column pivoting, and it enjoys a sharper error bound for the DEIM projection error. Furthermore, for a subspace U given as the range of an orthonormal U, the DEIM projection does not change if U is replaced by UΩ with an arbitrary unitary matrix Ω. In a large-scale setting, the new approach allows modifications that use only randomly sampled rows of U, but with the potential of producing good approximations, with corresponding probabilistic error bounds. Another salient feature of the new framework is that a robust and efficient software implementation is easily developed, based on readily available high performance linear algebra packages.

Key words. empirical interpolation, nonlinear model reduction, proper orthogonal decomposition, projections, QR factorization, randomized sampling, rank revealing factorization
AMS subject classifications.
1. Introduction.
Direct numerical simulation of dynamical systems plays a crucial role in studying a great variety of complex physical phenomena, in areas ranging from neuron modeling to microchip design. The ever-increasing demand for accuracy leads to dynamical systems of ever-larger scale and complexity. Simulation in such large-scale settings can make overwhelming demands on computational resources, thus creating a need for model reduction: smaller, faster approximations to complex dynamical systems that still guarantee high fidelity.
Consider the following nonlinear dynamical system of ordinary differential equations (ODEs):

  E ẋ(t) = A x(t) + f(x(t)) + B g(t),   t ≥ 0,   (1.1)

where E, A ∈ R^{n×n}, B ∈ R^{n×ν}, f : R^n → R^n, and g : [0, ∞) → R^ν. In (1.1), x(t) ∈ R^n is the state and g(t) is the external forcing term (input); thus (1.1) has n degrees of freedom and ν inputs.

Systems of the form (1.1) with very large state-space dimension (n ≈ O(10^5) or higher) arise in many disciplines and typically originate from discretization of partial differential equation models. The goal of model reduction is to replace (1.1) with a reduced surrogate dynamical system having much lower state-space dimension r ≪ n. The reduced model will then have the structure

  E_r ẋ_r(t) = A_r x_r(t) + f_r(x_r(t)) + B_r g(t),   (1.2)

where E_r, A_r ∈ R^{r×r}, B_r ∈ R^{r×ν}, and f_r : R^r → R^r.

We will use a Galerkin projection to construct the reduced model (1.2): Let V_r be an r-dimensional subspace spanned by the columns of V ∈ R^{n×r}. Then, we approximate the full state x(t) using the ansatz x(t) ≈ V x_r(t) and enforce the Galerkin orthogonality condition E V ẋ_r(t) − A V x_r(t) − f(V x_r(t)) − B g(t) ⊥ V_r to obtain the reduced model (1.2) with the reduced model quantities given by

  E_r = V^T E V,  A_r = V^T A V,  B_r = V^T B,  and  f_r(x_r(t)) = V^T f(V x_r(t)).

∗ Faculty of Science, Department of Mathematics, University of Zagreb, Bijenička 30, 10000 Zagreb, Croatia.
† Department of Mathematics, Virginia Polytechnic Institute and State University, 460 McBryde, Virginia Tech, Blacksburg, VA 24061-0123.
(1.3)

For linear dynamical systems, i.e., when f = 0 in (1.1), a plethora of methods exist to perform model reduction: these include Gramian-based methods such as Balanced Truncation [37, 36] and Optimal Hankel Norm Approximation [24], and rational interpolation based methods such as the Iterative Rational Krylov Algorithm [27]. These methods rely on the concept of the transfer function and perform model reduction independently of the input g(t). These ideas have recently been extended to systems with bilinear [4, 6, 23] and quadratic nonlinearities [26, 7]. For general nonlinearities, Proper Orthogonal Decomposition (POD) is the most commonly used method. POD [35, 9] obtains the model reduction basis V from a truncated SVD approximation to a matrix of "snapshots", a numerically computed trajectory of the full model. It is related to methods (and known by other names) such as Principal Component Analysis (PCA) in statistical analysis [29] and the Karhunen–Loève expansion [34] in stochastic analysis.

To construct the model reduction basis V via POD, one performs a numerical simulation of (1.1) for an input g(t) and initial condition x_0. Let x_0, x_1, ..., x_{N−1} denote the snapshots resulting from this numerical simulation; i.e., x_i = x(t_i) ∈ R^n for i = 0, 1, ..., N − 1. Construct the POD snapshot matrix

  X = [x_0, x_1, x_2, ..., x_{N−1}] ∈ R^{n×N}   (1.4)

and compute its thin SVD

  X = Z Σ Y^T,   (1.5)

where Z ∈ R^{n×N}, Σ ∈ R^{N×N}, and Y ∈ R^{N×N}, with Z^T Z = Y^T Y = I_N, and Σ = diag(σ_1, σ_2, ..., σ_N), with σ_1 ≥ σ_2 ≥ ... ≥ σ_N ≥
0. Then model reduction by POD chooses V as the matrix of the leading r left singular vectors of X, corresponding to the r largest singular values. Using MATLAB notation, this corresponds to V = Z(:, 1:r). This basis selection by POD minimizes Σ_{i=0}^{N−1} ‖x_i − ΦΦ^T x_i‖₂² over all Φ ∈ R^{n×r} with orthonormal columns. Since the objective function does not change if Φ is post-multiplied by an arbitrary r×r orthogonal matrix, this procedure actually seeks an r-dimensional subspace that optimally captures the snapshots in the least squares sense. For more details on POD, we refer the reader to [28, 32].

Even though the state x_r(t) of the reduced model (1.2) lives in an r-dimensional subspace, the definition of the reduced nonlinear term f_r(x_r(t)) = V^T f(V x_r(t)) in (1.3) requires lifting x_r(t) back to the full n-dimensional space in order to evaluate the nonlinear term; this is known as the lifting bottleneck and degrades the performance of reduced models for nonlinear systems. Various approaches exist to tackle this issue; see, e.g., [21, 5, 3, 39, 14]. In this paper, we focus on the Discrete Empirical Interpolation Method (DEIM) [39], a discrete variant of the Empirical Interpolation Method introduced in [5].

As explained in the original source [39],
DEIM can be used to approximate and efficiently evaluate a general nonlinear function f, which is not necessarily tied to the model reduction set-up we discussed above. For example, f(τ) could be a vector-valued function of a possibly multidimensional parameter τ. Therefore, following [39], we will present the DEIM construction and our analysis for a generic nonlinear vector-valued function f(τ), yet we will point out the implications for nonlinear model reduction.

1.4. DEIM. Given a nonlinear function f : T → R^n with T ⊂ R^d and a matrix U ∈ R^{n×m} of rank m, the DEIM approximation of f is defined by [39, Definition 3.1]

  f̂(τ) = U (S^T U)^{−1} S^T f(τ),   (1.6)

where S is an n×m matrix obtained by selecting certain columns of the n×n identity matrix I. With the DEIM approximation to f defined as in (1.6), the nonlinear term in the reduced model (1.2) is now approximated by

  f_r(x_r(t)) ≈ V^T U (S^T U)^{−1} S^T f(V x_r(t)).   (1.7)

An effective numerical implementation of f_r(x_r(t)) differs from its analytical formula in (1.7): it allows computing f_r(x_r(t)) without lifting x_r(t) to the full dimension n, by selecting only certain rows of V x_r(t). We skip those details and refer the reader to [39].

Computation of the
DEIM basis U. In an application, the matrix U can be computed as follows. For a finite grid of parameter samples τ_j ∈ T, the function is sampled and, as done for the state x(t) in POD for model reduction, the function values, the nonlinear snapshots, are collected in a matrix F, i.e., F = [f(τ_1), f(τ_2), ..., f(τ_κ)]. If f(τ) is n₁×n₂ matrix valued, the vec(·) operator is used to map its range to R^{n₁·n₂}. Then, an orthogonal projection Ω = UU^T of low rank m onto the range U = R(U) is constructed so that ‖F − ΩF‖_F is minimal. Typically, m ≪ n. Therefore, U can be considered as the POD basis for the nonlinear snapshots. The hope is that the range of U will capture the values of f over the entire parameter space, i.e., that ‖f(τ) − UU^T f(τ)‖₂ will be sufficiently small at any τ ∈ T.

The role of S, which we will call the selection operator, is to strategically pick coordinate indices in R^n at which the approximant interpolates f. (Note that S^T f̂(τ) = S^T f(τ).) The DEIM algorithm, proposed in [39], forces the selection operator S to seek m linearly independent rows of U such that the local growth of the spectral norm of (S^T U)^{−1} is limited via a greedy search, as implemented in Algorithm 1. This objective is founded in the following theoretical basis of DEIM [39, Lemma 3.2]:
Lemma 1.1.
Let U ∈ R^{n×m} be orthonormal (U*U = I_m, m < n) and let

  f̂ = U (S^T U)^{−1} S^T f   (1.8)

be the DEIM projection of an arbitrary f ∈ R^n, with S computed by Algorithm 1. Then

  ‖f − f̂‖₂ ≤ c ‖(I − UU*) f‖₂,   c = ‖(S^T U)^{−1}‖₂,   (1.9)

where c ≤ (1 + √(2n))^{m−1} ‖u₁‖_∞^{−1} ≤ √n (1 + √(2n))^{m−1}.

Hence, we can focus on a purely matrix-theoretical problem: given an orthonormal U ∈ C^{n×m} (U*U = I_m), find a row selection matrix S with ‖(S^T U)^{−1}‖₂ as small as possible. If R(U) captures the behavior of f well over the given parameter space, and if S results in a moderate value of c in (1.9), the DEIM approximation will succeed. The error bound (1.9) in Lemma 1.1 is rather pessimistic and the
DEIM projection usually performs substantially better in practice; see [39] for several illustrations of the superior performance of DEIM. Hence, this is an interesting theoretical question: can the upper bound be improved, and what selection operator S will have a sharper a priori error bound, perhaps only mildly dependent on n?¹

¹From now on, we consider the problem over the complex field.

Algorithm 1 DEIM (Discrete Empirical Interpolation Method) [39, Algorithm 1]
Input: u_1, ..., u_m linearly independent.
Output: Selection operator S = S_m (implicitly, by ℘_m).
1: p_1 = arg max_i |u_1(i)|
2: U_1 = [u_1]
3: S_1 = [e_{p_1}]; ℘_1 = [p_1]
4: for j = 2 : m do
5:   Solve (S_{j−1}^T U_{j−1}) z = S_{j−1}^T u_j for z
6:   r_j = u_j − U_{j−1} z;  p_j = arg max_i |r_j(i)|
7:   U_j = [U_{j−1}, u_j];  S_j = [S_{j−1}, e_{p_j}];  ℘_j = (℘_{j−1}, p_j)
8: end for

Note that S computed in Algorithm 1 depends on a particular basis for U; merely reordering the basis vectors may result in a different S. If, for example, U consists of the left singular vectors corresponding to the m dominant singular values of the data sample matrix F, and if some of those singular values are multiple or tightly clustered, then some singular vectors (columns of U) are non-unique or are numerically poorly determined by the data, and the computed U could be algorithm dependent. But the subspace they span is well determined. Therefore, from both the theoretical and practical points of view, it is important to ask whether we can efficiently construct S with an a priori assurance that c will be moderate and independent of the choice of the orthonormal basis U of U. Our interest for studying
DEIM in more detail was triggered by the above theoretical questions from a numerical linear algebra point of view, and by a practical question of efficient implementation of DEIM as mathematical software on high performance computing machinery. The complexity of Algorithm 1 is O(nm²) + O(m³). Unfortunately, it has an unfavorable flop per memory reference ratio (level 2 BLAS), which precludes an efficient software implementation. It would be advantageous to have an algorithm based on BLAS 3 building blocks, with potential for parallel implementations. Furthermore, we may ask whether the contribution of the factor n in the overall complexity can be reduced or even removed (e.g., by using only a subset of the rows of U) without substantial loss in the quality of the computed selection operator.

Fortunately, an affirmative answer to all the questions above is surprisingly simple and effective: QR factorization with column pivoting of U*. Our new implementation of DEIM, designated as
Q-DEIM, computes S independently of a particular orthonormal basis U, enjoys a better upper bound for the condition number c of the DEIM projection, and in practice usually computes an S with a smaller value of ‖(S^T U)^{−1}‖₂ than the original DEIM algorithm. A further advantage of Q-DEIM is that it is based on numerically robust high performance procedures already available in software packages such as LAPACK, ScaLAPACK, and MATLAB, so no additional effort is needed for tuning a high performance DEIM. The details and a theoretical foundation of Q-DEIM are given in §2. In particular, we show that the DEIM projection is almost as good as the orthogonal projection onto the range of U, and we present numerical experiments that illustrate the performance of Q-DEIM in the context of nonlinear model reduction. We also show that a good DEIM projection is possible even when using only a small portion of the rows of U, and we introduce Q-DEIMr, a restricted and randomized DEIM selection that combines the technique used in Q-DEIM with the ideas of randomized sampling. Further developments and applications are outlined in the final section.
2. A new DEIM framework. A key observation leading to the selection strategy presented in this section is based on the solution to a similar problem in [19], arising in the proof of global convergence of a block version of the Jacobi algorithm for diagonalization of Hermitian matrices. There, a row permutation is needed such that the (1,1) diagonal block of a family of 2×2 block partitioned matrices remains well conditioned, where the blocks are built from rows of a matrix U whose columns are orthonormal.

We adapt the strategy from [19] and use it to obtain a better DEIM projection; the result is a new selection method, called Q-DEIM, with an improved theoretical bound on c and with the selection operator invariant under arbitrary changes of the orthonormal basis of the range of U. We also use the seminal work of Goreinov, Tyrtyshnikov and Zamarashkin [25] to show that the DEIM projection is not only numerically but also theoretically almost as good as the orthogonal projection, up to a factor of the dimension.
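Before stating the formal result, the selection strategy can be made concrete with a short Python/NumPy sketch (our own illustration, not the authors' implementation; the function name `qdeim_select` is ours). It mimics Businger–Golub column pivoting applied to W = U*: at each step, pick the remaining column of largest Euclidean norm and project that direction out of the rest.

```python
import numpy as np

def qdeim_select(U):
    """Select m interpolation indices from an n-by-m orthonormal U by
    greedy column pivoting applied to W = U^*.  This mirrors the pivot
    choices of a column-pivoted (Businger-Golub) QR factorization of U^*:
    at each step, take the column of the deflated W with the largest
    Euclidean norm, then project that direction out of all columns.
    In exact arithmetic the result depends only on range(U)."""
    W = U.conj().T.copy()
    m, n = W.shape
    piv = []
    for _ in range(m):
        norms = np.linalg.norm(W, axis=0)
        norms[piv] = -1.0                    # exclude already selected columns
        p = int(np.argmax(norms))
        piv.append(p)
        q = W[:, p] / norms[p]               # unit vector of the pivot column
        W = W - np.outer(q, q.conj() @ W)    # deflate: project out pivot direction
    return np.array(piv)
```

Replacing U by UΩ with Ω unitary leaves all the column norms involved unchanged, so the same indices are selected; this is the basis-independence property claimed above.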
2.1. Q-DEIM – a new selection procedure. An answer to all the practical questions raised above is given in the following theorem.

Theorem 2.1.
Let U ∈ C^{n×m}, U*U = I_m, m < n. Then:

• There exists an algorithm that computes a selection operator S with complexity O(nm²), such that

  ‖(S^T U)^{−1}‖₂ ≤ √(n − m + 1) · (√(4^m + 6m − 1))/3,   (2.1)

and for any f ∈ C^n,

  ‖f − U (S^T U)^{−1} S^T f‖₂ ≤ √n · O(2^m) · ‖f − UU* f‖₂.   (2.2)

If U is only of full column rank, then the bound (2.1) changes to

  ‖(S^T U)^{−1}‖₂ ≤ (√(n − m + 1)/σ_min(U)) · (√(4^m + 6m − 1))/3.   (2.3)

• There exists a selection operator S⋆ such that the DEIM projection error is bounded by

  ‖f − U (S⋆^T U)^{−1} S⋆^T f‖₂ ≤ √(1 + m(n − m)) · ‖f − UU* f‖₂.   (2.4)

• The selection operators S, S⋆ do not change if U is changed to UΩ, where Ω is an arbitrary m×m unitary matrix; i.e., the selection of indices is assigned to a point on the Stiefel manifold, represented by U.

Proof. Let W = U* ∈ C^{m×n}, and let

  W Π = (Ŵ₁  Ŵ₂) = QR   (2.5)

be a column pivoted (rank revealing) QR factorization, with R upper trapezoidal. We have at our disposal a variety of pivoting strategies that reveal the numerical rank by constructing R in a way that controls the condition numbers (explicitly or implicitly) of its leading submatrices. For instance, the Businger–Golub pivoting [13] at step i first determines the smallest local index p̂_i of the largest (in Euclidean norm) column in the submatrix (i : m, i : n) and swaps the columns i and p_i = i − 1 + p̂_i globally, in the whole matrix:

  [schematic of the column interchange, shown for n = 7, m = 4, i = 2, p̂₂ = 3, p₂ = 4]   (2.6)

Then, the QR step maps the i-th column in the submatrix (i : m, i : n) to e_i R_ii and keeps all the remaining column norms in the submatrix unchanged and bounded by |R_ii|. (Here e_i denotes the i-th canonical vector of appropriate dimension.) The product of all transpositions gives the permutation Π.

We define the selection operator S as the one that collects the columns of W built into Ŵ₁; this implies that S^T U = Ŵ₁* and we need to estimate ‖Ŵ₁^{−1}‖₂. Partition R in (2.5) as R = (T  K) with an m×m upper triangular T. Then Ŵ₁ = QT, and the problem reduces to bounding ‖T^{−1}‖₂. As a result of the pivoting (2.6), the matrix T, as the leading m×m submatrix of R, has a special diagonal dominance structure:

  |T_ii|² ≥ Σ_{j=i}^{k} |T_jk|²,  1 ≤ i ≤ k ≤ m;   |T_mm| = max_{j=m:n} |R_mj|.   (2.7)

Further, since Ŵ ≡ W Π = QR and Ŵ Ŵ* = U* U = Q R R* Q* = I_m, we conclude that R R* = I_m, which implies that

  1 = ‖R(m, :)‖₂² = |T_mm|² + Σ_{j=m+1}^{n} |R_mj|² ≤ (n − m + 1) |T_mm|²,   (2.8)

and that

  min_{i=1:m} |T_ii| = |T_mm| ≥ 1/√(n − m + 1).   (2.9)

If we set D = diag(T_ii)_{i=1}^{m} and T̆ = D^{−1} T, then ‖T^{−1}‖₂ ≤ √(n − m + 1) ‖T̆^{−1}‖₂. Further, if we assume U to be just of rank m, not necessarily orthonormal, then

  |T_mm| ≥ ‖R(m, :)‖₂/√(n − m + 1) ≥ σ_min(R)/√(n − m + 1) = σ_min(U)/√(n − m + 1),   (2.10)

  σ_min(T) ≥ σ_min(U) / (√(n − m + 1) ‖T̆^{−1}‖₂).
(2.11)

Since S^T U = Ŵ₁* = T* Q*, it follows that ‖(S^T U)^{−1}‖₂ = ‖T^{−1}‖₂ = 1/σ_min(T). Hence, to prove (2.1) and (2.3) it remains to estimate the norm of T̆^{−1} = T^{−1} D. This can be done using an analysis of Faddeev, Kublanovskaya and Faddeeva [22], which can also be found in [33]. Systematic use of (2.7) as in [33, Chapter 6] yields the following useful inequalities:

  |T^{−1} e_i| ≤ (1/|T_ii|) (2^{i−2}, 2^{i−3}, ..., 2, 1, 1, 0, ..., 0)^T,   i = 2, ..., m,

where the absolute value and the inequality between vectors are understood element-wise. For i = 1, trivially, we have T^{−1} e_1 = e_1 (1/T_11), and T̆^{−1} e_1 = e_1. For i = 2, ..., m we use the relations T̆^{−1} e_i = T^{−1} D e_i = T^{−1} e_i T_ii to conclude

  |T̆^{−1} e_i| ≤ (2^{i−2}, 2^{i−3}, ..., 2, 1, 1, 0, ..., 0)^T,

and thus (2.1), (2.3) follow by (2.9), (2.10), (2.11) and

  ‖T̆^{−1}‖₂ ≤ ‖T̆^{−1}‖_F ≤ g(m),  where  g(m) = √( m + Σ_{i=2}^{m} Σ_{j=0}^{i−2} 4^j ) = (√(4^m + 6m − 1))/3.

If U is changed to UΩ with unitary Ω, then the column pivoted QR is computed with W̃ = Ω* U* = Ω* W on input. The fact that the QR factorization of W or of W̃ is implicitly the Cholesky factorization of the Hermitian semidefinite H = W* W = W̃* W̃ extends to the pivoted factorizations as well. In the first step, obviously, looking for the largest diagonal entry of H in the pivoted Cholesky factorization is equivalent to looking for the column of W̃ (or W) of largest Euclidean length. Hence, the pivoting will select the same columns in both cases.
After k steps of annihilation using Householder reflectors with appropriate column interchanges, the intermediate result is

  W̃^{(k)} = [ W̃^{(k)}_{[11]}  W̃^{(k)}_{[12]} ;  0  W̃^{(k)}_{[22]} ],

and it is easily checked that (W̃^{(k)}_{[22]})* W̃^{(k)}_{[22]} equals the Schur complement at the corresponding step of the pivoted Cholesky factorization of H. Hence, the next step will have the same pivot selection in both processes.

The existence of S⋆ is based on an elegant argument by Goreinov et al. [25], who used the concept of matrix volume (the absolute value of the determinant). The selection S⋆ is defined to be the one that maximizes the volume of S⋆^T U over all (n choose m) = n!/(m!(n−m)!) m×m submatrices of U. Then, by [25, Lemma 2.1],

  ‖(S⋆^T U)^{−1}‖₂ ≤ √(1 + m(n − m)).   (2.12)

Since post-multiplying U by a unitary Ω cannot change the volume of any m×m submatrix of U, the same maximizing-volume submatrix will be selected. (Here we assume that a simple additional condition is imposed to assure a unique selection in the case of several maxima; for instance, among multiple choices, select the one with the smallest lexicographically ordered indices.) The error bounds (2.2) and (2.4) follow by inserting the corresponding bounds (2.1) and (2.12) for c into (1.9).

Remark 2.2.
According to [22], a slightly better bound than g(m) is possible. It should be emphasized that the O(2^m) upper bound is attained only on a contrived example (the notorious Kahan matrix [31]), and in practice it can be replaced by O(m). Also, in the application of DEIM, m is assumed small to moderate, so ‖T̆^{−1}‖₂ can be estimated in O(m²) or even O(m) time using a suitable condition number estimator; the factor √(n − m + 1) can be replaced by the actually computed, and potentially smaller, value 1/|T_mm|. The deployment of an incremental condition estimator will be particularly important in the randomized sampling version of Q-DEIM introduced later. The worst-case bounds on c are pessimistic, but practical experience shows that the pivoting used in Theorem 2.1 works very well. It has been conjectured in [25] that (2.12) can be replaced with ‖(S⋆^T U)^{−1}‖₂ ≤ √n, and proved that no bound smaller than √n can exist in general.

Remark 2.3.
While the existence and superiority of S⋆ are clear, its construction is difficult. However, we can use its characterization to understand why the selection operator S defined in Theorem 2.1 (and the Businger–Golub pivoting in general) usually works very well in practice. The volume of the submatrix selected by S equals the volume ∏_{i=1}^{m} |T_ii| of the upper triangular T, which is the leading m×m submatrix of the computed R factor. On the other hand, the pivoting, by design, tries at each step to produce the maximal possible |T_ii|; thus it can be interpreted as a greedy volume maximizing scheme. In fact, such an interpretation motivates post-processing to increase the determinant, e.g., by replacing the trailing 2×2 submatrix T(m−1:m, m−1:m) of T with better choices, obtained by inspecting the determinants of 2×2 submatrices drawn from R(m−1:m, m+1:n) and moving the corresponding columns upfront.

Example 2.4.
We illustrate the difference in the values of c = ‖(S^T U)^{−1}‖₂ computed by DEIM and Q-DEIM using 200 randomly generated orthonormal matrices with n = 10000 rows. The Q-DEIM selection not only enjoys the better upper bound, but in most cases it also provides a smaller actual value of c, as seen in Figure 2.1. It is interesting to note that all Q-DEIM values of ‖(S^T U)^{−1}‖₂ are below 100, sustaining the conjectured bound ‖(S⋆^T U)^{−1}‖₂ ≤ √n for the volume maximizing scheme. On the other hand, DEIM breaches the √n upper bound in most of the trials, indicating a less optimal selection with respect to the volume maximizing criterion. In the case of matrices specially constructed to exhibit large pivot growth, the value of c in both methods may exceed √n, but not by a big factor. We also compare DEIM and Q-DEIM using a 2048 × 100 basis for a FitzHugh–Nagumo system, analyzed in detail below; the value of c for Q-DEIM is fixed at 2.…×10¹ < √2048.

In terms of the row selection from U, the actual computation used in Theorem 2.1 is an LQ factorization of U with row pivoting. The transposition and QR with column pivoting are used only for convenience and due to the availability of software implementations. A Householder based QR factorization of a fat m×n matrix runs with complexity O(m²n), similar to the complexity of DEIM. LAPACK [1] based software tools use the optimized, BLAS 3 based and robust [20] function xGEQP3. Other pivoting strategies are possible, such as those in xGEQPX, xGEQPY from [10], [11]. On parallel computing machinery, our new approach uses the best available QR code with column pivoting, e.g.,
PxGEQPF from ScaLAPACK [12]. As an illustration of the simplicity at which we get a high performance computation of a good selection operator, and to make a case for Q-DEIM, we briefly describe a MATLAB implementation. Using the notation of Theorem 2.1, we have

  U = Π [T*; K*] Q*,   S^T U = T* Q*,  and thus  M ≡ U (S^T U)^{−1} = Π [I_m; K* T^{−*}].   (2.13)

The computation T^{−1} K = T̆^{−1} (D^{−1} K) by a triangular solver (e.g., the backslash \ or linsolve() in MATLAB) is numerically stable, as T̆ is well conditioned and max_{ij} |(D^{−1} K)_{ij}| ≤ 1.

Fig. 2.1. (Example 2.4) Comparison of the value c = ‖(S^T U)^{−1}‖₂ in DEIM and Q-DEIM. The first row: the comparison using 200 random orthonormal matrices with n = 10000. The second row: random changes of a DEIM orthonormal basis U of size 2048 × 100, computed from simulation of the FitzHugh–Nagumo system. The basis changes are obtained by post-multiplication by random real orthogonal matrices (uniformly distributed in the Haar measure).

The explicitly set identity matrix I_m in (2.13) guarantees that the selected entries of a vector f will be exactly interpolated when M is computed as in (2.13). If M is computed as M̃ = computed(U/(S^T U)) (e.g., by MATLAB's slash), then S^T M̃ = I_m + (ε_ij)_{m×m}, with all ε_ij at roundoff level. If s_1, ..., s_m are the interpolation indices selected by S, then checking the interpolation for f ∈ R^n yields

  (S^T M̃ S^T f)_i = f_{s_i}(1 + ε_ii) + Σ_{j≠i} f_{s_j} ε_ij,   i = 1, ..., m,

revealing an undesirable pollution of f_{s_i}, in particular if max_{s_j ≠ s_i} |f_{s_j}| ≫ |f_{s_i}|.

function [S, M] = q_deim(U)
% Input:  U  n-by-m with orthonormal columns
% Output: S  selection of m row indices with guaranteed upper bound
%            norm(inv(U(S,:))) <= sqrt(n-m+1) * O(2^m)
%         M  the matrix U*inv(U(S,:))
% The Q-DEIM projection of an n-vector f is then M*f(S).
[n, m] = size(U) ;
if nargout == 1
   [~, ~, P] = qr(U', 'vector') ; S = P(1:m) ;
else
   [~, R, P] = qr(U', 'vector') ; S = P(1:m) ;
   M = [eye(m) ; (R(:, 1:m) \ R(:, m+1:n))'] ;
   Pinverse(P) = 1 : n ; M = M(Pinverse, :) ;
end
end

2.1.2. DEIM and LU with partial pivoting.
It has been known, at least to experts, that DEIM is a variation of Gaussian elimination; Sorensen [40, 38] called it a pivoted LU without replacement. Recently, [2] proposed to replace Step 7 of Algorithm 1, U_j = (U_{j−1}  u_j), by Û_j = (Û_{j−1}  r̂). (To make the distinction clear, we denote the new variable by Û and we use Û_j to denote the matrix Û(:, 1:j) at the end of the j-th step.) In other words, the basis Û_{j−1} is updated by adding the current residual vector. Assume that at step j, Û_{j−1} = U_{j−1} G_{j−1}, where G_{j−1} is unit upper triangular, and that both computations have the same selection S_{j−1} (this holds at j = 2). Then the residual r̂ = u_j − Û_{j−1}(S_{j−1}^T Û_{j−1})^{−1} S_{j−1}^T u_j is easily shown to be the same as r_j in Algorithm 1. Hence, the updated S_j will be the same, and Û_j ≡ (Û_{j−1}  r̂) = (U_{j−1}  u_j) G_j, with an updated unit upper triangular G_j. Note that S_j^T Û_j = (S_{j−1}  e_{p_j})^T (Û_{j−1}  r̂) is lower triangular at each step: the upper triangular part of the last column of this product is S_{j−1}^T r̂, which is zero by the definition of r̂. Therefore, S^T Û = S^T U G = Z with a lower trapezoidal Z and a unit upper triangular G. Hence, DEIM (with replacement) is the same as a row pivoted LU decomposition. This connection might help in understanding why DEIM behaves much better than the theoretical upper bound would suggest. Similar to the discussion in Remark 2.3, DEIM is applying a locally greedy search to maximize the volume of S^T U: because Û = U G with a unit upper triangular G, det(S^T Û) = det(S^T U) = ∏_{j=1}^{m} Z_jj, and the Z_jj's are the results of a greedy search for maxima.

The modified update yielding Û reduces the computational complexity of forming Û(S^T Û)^{−1} to O(nm²), down from O(nm²) + O(m³), because the LU decomposition of S^T Û is no longer necessary. However, the claim in [2] that Algorithm 1 contains O(m⁴) complexity is misleading, because a practical implementation of Step 5, S_{j−1}^T U_{j−1} z = S_{j−1}^T u_j, will not compute the LU decomposition of S_{j−1}^T U_{j−1} from scratch in each step. Instead, it will exploit the fact that, in the j-th step, S_{j−1}^T U_{j−1} is changed by appending only a new row and a new column, and will update the LU decomposition with complexity O(j²), thus making the total complexity of Algorithm 1 O(nm²) + O(m³).

This connection also suggests using a rank revealing LU decomposition of U with complete pivoting and taking the indices of the first m pivoted rows as the DEIM indices. The bound on c changes only by a factor originating from the inverse of the unit upper triangular LU factor, which, as a consequence of complete pivoting, is bounded by O(2^m). On the other hand, in the case of partial pivoting, as in DEIM, the bound is rather pessimistic. However, as expected,
DEIM behaves much better in practice, as partial LU hardly ever exhibits the worst case growth scenario. It is an interesting challenge to determine what orthonormal basis of the range of U is best for the performance of DEIM.

In this section, we test the performance of the new selection procedure on two model reduction benchmark problems.
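Comparisons of the kind performed in Example 2.4 (and in the benchmarks below) are easy to reproduce. The following Python/NumPy sketch is our own illustrative code, not from the paper; the function names are ours. It computes the constant c = ‖(S^T U)^{−1}‖₂ for the greedy selection of Algorithm 1 and for a pivoted-QR style selection:

```python
import numpy as np

def deim_select(U):
    """Greedy DEIM selection (Algorithm 1): interpolate the current basis
    vector at the rows chosen so far and pick the row where the residual
    is largest in magnitude."""
    n, m = U.shape
    p = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        z = np.linalg.solve(U[np.ix_(p, range(j))], U[p, j])
        r = U[:, j] - U[:, :j] @ z
        p.append(int(np.argmax(np.abs(r))))
    return np.array(p)

def pivoted_qr_select(U):
    """Q-DEIM-style selection: greedy norm pivoting with deflation applied
    to W = U^*, mirroring Businger-Golub column pivoted QR of U^*."""
    W = U.conj().T.copy()
    m, n = W.shape
    piv = []
    for _ in range(m):
        norms = np.linalg.norm(W, axis=0)
        norms[piv] = -1.0                  # exclude already selected columns
        k = int(np.argmax(norms))
        piv.append(k)
        q = W[:, k] / norms[k]
        W = W - np.outer(q, q.conj() @ W)  # project out the pivot direction
    return np.array(piv)

def selection_constant(U, s):
    """c = ||(S^T U)^{-1}||_2 for the row selection s."""
    return np.linalg.norm(np.linalg.inv(U[s, :]), 2)
```

On random orthonormal matrices both constants are modest; in our experience the pivoted selection typically, though not always, gives the smaller value, consistent with Example 2.4.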
The F–N system, a simpli-fied version of the Hodgkin–Huxley model, arises in modeling the activation anddeactivation dynamics of a spiking neuron. This example is borrowed from [39] andwe follow their description of the model, including their notation and parameter se-lection. Let v and w denote, respectively, the voltage and recovery of voltage. Also,let x ∈ [0 , L ] and t ≥
0. Then, the underlying dynamics are described by the coupled onlinear PDEs εv t ( x, t ) = ε v xx ( x, t ) + f ( v ( x, t )) − w ( x, t ) + c (2.14) w t ( x, t ) = bv ( x, t ) − γw ( x, t ) + c (2.15)with the nonlinearity appearing as f ( v ) = v ( v − . − v ) and with the initial andboundary conditions v ( x,
0) = 0 , w ( x,
0) = 0 , x ∈ [0 , L ] ,v x (0 , t ) = − i ( t ) , v x ( L, t ) = 0 , t ≥ , where the model parameters are chosen as L = 1, ε = 0 . b = 0 . γ = 2, c = 0 . i ( t ) = 50000 t e − t . A finite difference discretization leads to asystem of the form (1.1) with system dimension n = 2048. A time-domain simula-tion with equally spaced points for t = [0 ,
8] leads to N = 100 state and nonlinearsnapshots. Following [39], we choose r = m = 5 and perform model reduction usingboth DEIM selection procedures. We simulate both reduced models, collect reduced-order snapshots X DEIM and X Q-DEIM . Then, to measure the error in model reduc-tion, we lift these two snapshots back to the original dimension, i.e., we compute V X DEIM and V X Q-DEIM where V is the POD basis for model reduction, and measuretheir distance (in the relative Frobenius-norm) from the original snapshot X . Let (cid:15) DEIM = (cid:107) X − V X DEIM (cid:107) F (cid:107) X (cid:107) F and (cid:15) Q-DEIM = (cid:107) X − V X Q-DEIM (cid:107) F (cid:107) X (cid:107) F denote the resulting errors. For r = m = 5, we obtain (cid:15) DEIM = 3 . × − and (cid:15) Q-DEIM = 3 . × − .To illustrate that this is the usual behavior, i.e., the Q-DEIM selection performsas well as the original
DEIM selection, we test other r = m values as well:

r = m = 4 : ε_DEIM = 4.…×10^{−…},  ε_Q-DEIM = 3.…×10^{−…}
r = m = 6 : ε_DEIM = 3.…×10^{−…},  ε_Q-DEIM = 3.…×10^{−…}
r = m = 7 : ε_DEIM = 2.…×10^{−…},  ε_Q-DEIM = 3.…×10^{−…}

To better illustrate the comparison, we measure, in the relative 2-norm, how accurately each entry of x(t) is reconstructed with DEIM and
Q-DEIM. That is, we measure the relative 2-norm distance between the k-th row of the original snapshot matrix X and the k-th rows of the reconstructed matrices V X_DEIM and V X_Q-DEIM. The results are shown in Figure 2.2, once more illustrating that both procedures perform equally well.
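For concreteness, a finite-difference semi-discretization of (2.14)–(2.15) can be sketched as below. This is an illustrative reconstruction, not the code used for the experiments; the parameter values (ε = 0.015, b = 0.5, γ = 2, c = 0.05, i(t) = 50000 t³e^{−15t}) are the standard F–N benchmark choices and should be checked against [39].

```python
import numpy as np

def fn_rhs(t, y, n=1024, L=1.0, eps=0.015, b=0.5, gamma=2.0, c=0.05):
    """Semi-discretized FitzHugh-Nagumo right-hand side (illustrative sketch).

    y stacks the voltage v and the recovery w on an n-point grid; the Neumann
    boundary data uses the assumed input i(t) = 50000 t^3 exp(-15 t)."""
    h = L / (n - 1)
    v, w = y[:n], y[n:]
    i_t = 50000.0 * t**3 * np.exp(-15.0 * t)
    # Central-difference v_xx with Neumann conditions handled via ghost nodes:
    # v_x(0, t) = -i(t) and v_x(L, t) = 0.
    vxx = np.empty(n)
    vxx[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / h**2
    vxx[0] = 2.0 * (v[1] - v[0] + h * i_t) / h**2
    vxx[-1] = 2.0 * (v[-2] - v[-1]) / h**2
    f_v = v * (v - 0.1) * (1.0 - v)          # cubic nonlinearity f(v)
    dv = eps * vxx + (f_v - w + c) / eps     # from eps*v_t = eps^2 v_xx + f(v) - w + c
    dw = b * v - gamma * w + c
    return np.concatenate([dv, dw])
```

Feeding fn_rhs to any standard ODE integrator (e.g. scipy.integrate.solve_ivp) and sampling the trajectory at equally spaced times yields the state and nonlinear snapshot matrices discussed above.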
This is a model of a nonlinear RC-ladder circuit [16, 4, 7], another benchmark example for model reduction. The nonlinearity results from resistors in parallel connection with diodes; the diode I–V characteristics carry the nonlinearity i_D = e^{40 v_D} − 1, where i_D is the current through the diode and v_D is the voltage across it. The input is the current source entering at node 1. We take n = 1000, i.e., we connect 1000 such ladder stages, and excite the system using the exponential forcing g(t) = e^{−t}. A numerical simulation over t ∈ [0,
7] seconds results in 1425 POD snapshots x_i and 1425 nonlinear (DEIM) snapshots f(x_i). The decay of the POD and DEIM singular values is shown in the left-hand plot of Figure 2.3. Based on this decay, we pick r = m = 10 and apply POD with both DEIM and
Q-DEIM selections. In this example, the voltage at node 1, i.e., the first component of x(t) (denoted by ξ(t)), is the quantity of interest, and we measure how well both reduced models approximate ξ(t). (The model can be downloaded from the Max Planck Institute Model Reduction Wiki page at http://morwiki.mpi-magdeburg.mpg.de/morwiki/index.php/Nonlinear-RC-Ladder.)

Fig. 2.2. Comparison of the relative error in the snapshot reconstruction for the F–N model for different r and m values. The horizontal axis variable k corresponds to the k-th row of x(t) for which the relative error is computed.

As shown in Figure 2.3, both methods perform extremely well; the reduced-model quantities are virtually indistinguishable from the original. As in the previous example, we compute the reconstruction errors due to DEIM and
Q-DEIM, and obtain ε_DEIM = 8.…×10^{−…} and ε_Q-DEIM = 6.…×10^{−…}; once again high accuracy for both models. The reconstruction errors in ξ(t) due to DEIM and Q-DEIM are 1.…×10^{−…} and 7.…×10^{−…}, respectively. Once we increase r = m = 10 to r = m = 20, both reduced models become even more accurate, with ε_DEIM = 1.…×10^{−…} and ε_Q-DEIM = 1.…×10^{−…}. The reconstruction errors in ξ(t) are now 3.…×10^{−…} for DEIM and 3.…×10^{−…} for Q-DEIM.
Fig. 2.3. The left plot shows the decay of the POD and DEIM singular values. The right plot shows the reconstruction accuracy of ξ(t) by both selection methods.

To further illustrate the dependence of the error on the reduced dimension, we test both DEIM and
Q-DEIM for 1 ≤ r = m ≤
20. The results depicted in Figure 2.4below confirm the earlier observations.
Fig. 2.4. Relative errors ε_DEIM and ε_Q-DEIM for varying r and m values.

The two other excitation selections suggested for this model are g(t) = sin(2π…t) and g(t) = sin(2π…t) [17]. For these two inputs, for all the r and m combinations we have tried, DEIM and
Q-DEIM have returned exactly the same selection matrix S, resulting in exactly the same reduced model, which is very accurate, with relative errors of O(10^{−…}) even for r = m = 5. For brevity, we omit the resulting figures.
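The POD quantities recurring in these experiments (a basis from a snapshot SVD, a rank picked from the singular value decay, and the relative Frobenius reconstruction error) can be sketched in a few lines; the truncation tolerance below is an illustrative choice, not the one used in the experiments.

```python
import numpy as np

def pod_basis(X, tol=1e-8):
    """POD basis of a snapshot matrix X: leading left singular vectors, truncated
    where the singular values have decayed below tol * sigma_1."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))
    return U[:, :r], s

def rel_fro_error(X, X_rec):
    """Relative Frobenius-norm error ||X - X_rec||_F / ||X||_F, the measure
    used above for eps_DEIM and eps_Q-DEIM."""
    return np.linalg.norm(X - X_rec) / np.linalg.norm(X)
```

For reduced-order snapshots X_r, the lifted reconstruction is V @ X_r, so the reported errors correspond to rel_fro_error(X, V @ X_r).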
3. Using restricted/randomized basis information.
The framework introduced in §2 allows for a modification, which we denote Q-DEIMr: a version of Q-DEIM that works only on a random selection of the rows of U. This introduces the techniques of randomized sampling into the theory and practice of DEIM, but it also provides a new approach to sampling from orthonormal matrices [30].

If the dimension n is large, it would be advantageous to determine a good selection operator S in a way that reduces the O(mn) factor in the complexity of the algorithm. From the proof of Theorem 2.1 it follows that, for the quality of the DEIM projection, we only need to ensure that the upper triangular matrix T = R(1:m, 1:m) has a small inverse (in norm). But T is the pivoted QR triangular factor of certain columns of W, and our initial task is to find their indices; we have no interest in the QR factorization as such. This immediately suggests that we may attempt to find such indices using only a small selection of the columns of W (i.e., of the rows of U).

The following scheme depicts the main idea: suppose we randomly sample k ≥ m columns of W (marked ↑) and assemble them in a local working m × k array L.

[Scheme (3.1): k sampled columns of the m × n array W (marked ↑) are copied into the m × k working array L; the remaining columns (marked ⇑) form the active set from which further samples may be drawn.]   (3.1)

Then we attempt QR with column pivoting on L, with a built-in incremental condition estimator (ICE) that at any step j very efficiently estimates the norm of the inverse of the thus far constructed part of the triangular factor L(1:j, 1:j). In another situation, e.g., in the case of gappy POD approximation, we may want to avoid some rows of U because, e.g., they correspond to spatial coordinates with corrupted or missing information.
(Here we assume that the triangular factor overwrites the corresponding part of the array L.)

This is done as follows. Suppose the first j − 1 columns of L have been processed and the corresponding (j−1) × (j−1) triangular factor (the ∗ entries in (3.2)) is well conditioned. The global column indices (in W) that correspond to these columns of L are stored and removed from the active set of indices from which the random selection is made. All Householder reflectors used are accumulated in an m × m matrix Ξ.

In the j-th step, a new pivot column is selected and swapped into the j-th position, and a single Householder reflector is applied only to that column, to annihilate its entries in positions j + 1 to m and to compute its (j, j) entry. At this moment we have all the ingredients to compute the value γ_j = ‖L(1:j, 1:j)^{−1}‖. If γ_j is below a given threshold, the factorization continues by accepting the pivotal column, completing the j-th step, and looking for the next pivot. If not, the (j, j) entry is too small and, due to pivoting, all entries in the active submatrix of L are also small. In that case, the columns j to k of L are useless for our purposes; we discard them and draw k − j + 1 new columns from the active set of columns of W (the ⇑ columns in (3.1)). Before using the newly selected columns as part of L, we update them by applying the unitary matrix Ξ, which contains all previously accumulated transformations. The resulting new columns in L can then participate in pivoting.
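The sampling-and-checking loop described above might be sketched as follows. For brevity, this sketch recomputes σ_min of the current selection with a dense SVD instead of maintaining it incrementally with ICE and the accumulated reflectors Ξ, so it illustrates the logic of Q-DEIMr rather than its efficient implementation; the batch size k and the bound c_max are the quantities discussed in the text.

```python
import numpy as np
from scipy.linalg import qr

def qdeimr_select(U, k=None, c_max=None, max_rounds=50, rng=None):
    """Randomized Q-DEIMr-style selection (illustrative sketch): repeatedly draw
    batches of rows of U (columns of W = U^T), keep the m pivots of a column-pivoted
    QR of the pooled working array, and stop once c = 1/sigma_min of the selected
    m x m submatrix falls below the prescribed bound."""
    n, m = U.shape
    k = max(k or m, m)                                 # batch size, at least m
    c_max = c_max or np.sqrt(m) * np.sqrt(n - m + 1)   # ad hoc bound, as in Example 3.1
    rng = rng or np.random.default_rng()
    active = rng.permutation(n).tolist()               # randomly permuted row indices
    kept = []                                          # best-so-far global indices
    for _ in range(max_rounds):
        fresh = [active.pop() for _ in range(min(k, len(active)))]
        pool = kept + fresh                            # candidate rows of U
        _, _, piv = qr(U[pool, :].T, mode='economic', pivoting=True)
        kept = [pool[p] for p in piv[:m]]              # pivoted-QR winners
        c = 1.0 / np.linalg.svd(U[kept, :], compute_uv=False)[-1]
        if c <= c_max or not active:
            break
    return np.sort(np.array(kept)), c
```

Only the visited rows of U are ever touched, and the loop exits as soon as a sufficiently well-conditioned m × m submatrix has been found.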
[Scheme (3.2): the working array L after j − 1 accepted steps: the accepted triangular part, the current pivot column with its (j, j) entry, the (small) active submatrix, and the freshly drawn, Ξ-updated replacement columns.]   (3.2)

At this point we may simply continue with the factorization: search for the next pivot column, swap it into the j-th position, test γ_j against the threshold, and accept the column if it passes the test, as illustrated in (3.2). Another option is to discard all previous pivoting and start anew: determine the largest column for the first position and proceed with a completely new pivoting process on the updated contents of L. The value of k is not necessarily fixed and may change dynamically, with a safety device to prevent failure. With a proper data structure, one can develop a detailed algorithm and an efficient software implementation. For the sake of brevity, we omit the details; however, we provide one illustrative example.

Example 3.1.
Let f(t; µ) = 10 e^{−µt}(cos(4µt) + sin(4µt)), 1 ≤ t ≤ 6, 0 ≤ µ ≤ π. Take 40 uniformly sampled values of µ and compute the snapshots over the discretized t-domain at n = 10000 uniformly spaced nodes. The best low-rank approximation of the sampled 10000 × 40 snapshot matrix returned U with m = 34 columns. This indicates that the POD basis has captured the function's behavior. We allowed Q-DEIMr to process only k = m columns in the work array L, and set the upper bound for c at √m · √(n − m + 1). (This ad hoc choice is motivated by the structure of the upper bound for c, as in the proof of Theorem 2.1; note that we use √m instead of the worst-case theoretical O(2^m) factor.) Column index drawing is done simply: ℓ "random" indices are taken as the ℓ leading indices of a randomly permuted active set. We stress here that the purpose of this example is to illustrate the idea and its potential, and to motivate further study of the randomized sampling approach to the DEIM projection.

After processing 113 rows of U (out of 10000), Q-DEIMr selected a submatrix with c ≈ …; DEIM processed the whole matrix U and returned c ≈ …. To test how well the two methods approximate f, we compute its value at 200 points in the µ-interval: for each µ_j the function is evaluated over the t-grid, giving f_{µ_j} ∈ R^n. The same is done with the DEIM and Q-DEIMr projections, giving f^{DEIM}_{µ_j} and f^{Q-DEIMr}_{µ_j}, respectively. The results of a comparison are depicted in Figure 3.1.

Fig. 3.1. (Example 3.1) Comparison of the approximation errors of DEIM and Q-DEIMr for f(t; µ) = 10 e^{−µt}(cos(4µt) + sin(4µt)). Left: the function evaluated at µ = 1.2787. Right: the relative errors ‖f^{DEIM}_{µ_j} − f_{µ_j}‖/‖f_{µ_j}‖ and ‖f^{Q-DEIMr}_{µ_j} − f_{µ_j}‖/‖f_{µ_j}‖ for uniformly spaced values of µ_j ∈ [0, π]. Q-DEIMr used … rows of U (at most … at the same time) to make the selection; DEIM used all rows.

Since at this point no sophisticated sampling strategy is used, the results may vary, depending on n, m, and the given upper bound for c. Figure 3.2 illustrates how the prescribed upper bound for c changes the execution and performance of Q-DEIMr. By visiting only 220 rows, we fully recover the accuracy achieved by using all 10000 rows of U.

Fig. 3.2. (Example 3.1) Comparison of the relative errors ‖f^{DEIM}_{µ_j} − f_{µ_j}‖/‖f_{µ_j}‖ and ‖f^{Q-DEIMr}_{µ_j} − f_{µ_j}‖/‖f_{µ_j}‖ for f(t; µ) = 10 e^{−µt}(cos(4µt) + sin(4µt)). Left: the upper bound in Q-DEIMr is set to m√(n − m + 1); it used … rows, with c ≈ …. Right: the upper bound in Q-DEIMr is set to √m · √(n − m + 1)/…; it used … rows, with c ≈ …. In both cases DEIM used all rows of U to make a selection, and Q-DEIMr was allowed to process at most … rows of U at the same time.

Remark 3.2.
The technical details of using ICE in the above procedure are similar to the rank-revealing method with windowed column pivoting [11]. However, the overall procedure is substantially different in spirit: our objective is not to compute a rank-revealing QR factorization (we know that W is of full row rank, W W* = I_m, and we do not even need its QR factorization), but just to find a well-conditioned submatrix. This allows us to touch only a selection of the columns of W and to exit as soon as a sufficiently well-conditioned submatrix is determined.

Remark 3.3. The quest for a well-conditioned submatrix of W can obviously be parallelized: many processors can independently work on different (not necessarily disjoint) subsets of column indices, with no need whatsoever to engage in communication, until one of them finds suitable columns and sends a halt signal. If more than one selection is found, the best one is chosen.

The column selection procedure can be improved at the cost of one pass through the array W to compute the column norms ω_i = ‖W(:, i)‖. Such additional information can be used, e.g., in the following two ways:
• Select from the sorted columns in batches, as needed, starting from the largest ones. The norms of the columns in the active set can also be down-dated, using the procedure from column-pivoted QR with a numerically robust implementation [20].
• Define p_i = ω_i² / (Σ_{j=1}^{n} ω_j²), i = 1, …, n. Then (p_1, …, p_n) is a probability distribution that can be used to draw column samples; it prefers larger columns. Each column is used only once, and the distribution is recomputed for the active set.
This opens a completely new aspect of DEIM and establishes its connection to randomized numerical linear algebra, in particular to randomized sampling of rows of orthonormal matrices [30]. In particular, one can view
Q-DEIMr as a guided randomized sampling algorithm for orthonormal matrices. A detailed analysis of blending the two procedures is omitted here; it will be available in our subsequent work.
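The norm-based drawing distribution can be set up in a few lines. The sketch below uses squared column norms; for W with orthonormal rows these are exactly the leverage scores appearing in randomized row sampling of orthonormal matrices [30] (a plain-norm variant would differ only in the exponent).

```python
import numpy as np

def column_sampling_probs(W):
    """Probability distribution over the columns of W proportional to the squared
    column norms; for W W^* = I_m these are the leverage scores of U = W^*."""
    w2 = np.sum(np.abs(W)**2, axis=0)
    return w2 / w2.sum()
```

Samples from the active set can then be drawn without replacement, e.g. via np.random.default_rng().choice(n, size=k, replace=False, p=column_sampling_probs(W)).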
Remark 3.4. Our experiments with randomized selection indicate that merely picking m random interpolation indices will not work in general; the sophisticated strategies of DEIM, Q-DEIM, and Q-DEIMr are necessary for a reliable and robust black-box procedure. Indeed, this is already revealed in Example 3.1: the initial random selection is not enough, and Q-DEIMr brings in additional rows to process. Clearly, the results depend on the function being approximated, but to illustrate the importance of the selection principle of DEIM and Q-DEIMr, below we show the reconstruction accuracy of DEIM and of randomly selected indices (with a few trials to select better rows, but without the condition number control of Q-DEIMr) for two nonlinear parametrized functions. As the figures illustrate, random selection without the techniques from the DEIM procedures may perform very poorly.
Fig. 3.3. (Remark 3.4) The reconstruction accuracy of a plain random selection of indices, as compared to Q-DEIM and Q-DEIMr. The first plot uses the data generated by the function from Example 3.1. The function used for the second plot is f(x, µ) = sinh(µ · cosh(µ/x)), with 0.… ≤ x ≤ …, … ≤ µ ≤ π, and m = 11 out of n = 2000 indices selected.

4. Conclusions and Future Work. Using the tools of QR factorization with column pivoting, this paper has introduced a new
DEIM index selection strategy that is invariant under orthogonal transformations and comes with a sharper bound for the DEIM projection error. The new approach, called Q-DEIM, is tested on several numerical examples, and it performs as well as the original DEIM selection procedure. For the case of large dimensions, a modification is proposed that uses only randomly sampled rows of the data matrix, yet still leads to high-fidelity approximations.

In addition to the nonlinear model reduction and parametrized function settings presented here, the new Q-DEIM selection is well suited for several important applications that we are currently investigating for our subsequent work. One such application is randomized sampling of rows of orthonormal matrices for solving least-squares problems effectively in cases with huge row dimension. The second is the DEIM-induced CUR factorization recently introduced by Sorensen and Embree [41]. A third application arises in nonlinear inversion and parametric model reduction, see, e.g., [8, 18], where an affine decomposition is needed for a parametric matrix A(p) ∈ R^{n×n} for an efficient online model reduction step. This is usually handled by vectorizing A(p), which might lead to a very large row dimension depending on the sparsity pattern of A(p). We are currently testing Q-DEIM and Q-DEIMr in these settings. Further, both DEIM and Q-DEIM (including Q-DEIMr) can adopt an updating scheme when the nonlinear snapshot basis is obtained by the SVD: one can enlarge m and U incrementally to yield better approximations. In Q-DEIM, increasing m requires updating a rank-revealing QR decomposition; thus, finding an efficient updating scheme for this incremental implementation will prove very useful.
5. Acknowledgments. We would like to thank Dr. Saifon Chaturantabut for providing the data and code for the FitzHugh–Nagumo model example in §2.

REFERENCES

[1]
E. Anderson, Z. Bai, C. Bischof, L. S. Blackford, J. Demmel, J. J. Dongarra, J. Du Croz, S. Hammarling, A. Greenbaum, A. McKenney, and D. Sorensen, LAPACK Users' Guide (Third Ed.), Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1999.
[2] H. Antil, S. E. Field, F. Herrmann, R. H. Nochetto, and M. Tiglio, Two-step greedy algorithm for reduced order quadratures, Journal of Scientific Computing, 57 (2013), pp. 604–637.
[3] P. Astrid, S. Weiland, K. Willcox, and T. Backx, Missing point estimation in models described by proper orthogonal decomposition, IEEE Transactions on Automatic Control, (2008), pp. 2237–2251.
[4] Z. Bai and D. Skoogh, A projection method for model reduction of bilinear dynamical systems, Linear Algebra and its Applications, 415 (2006), pp. 406–425.
[5] M. Barrault, N. C. Nguyen, Y. Maday, and A. T. Patera, An empirical interpolation method: Application to efficient reduced-basis discretization of partial differential equations, C. R. Acad. Sci. Paris, Série I, 339 (2004), pp. 667–672.
[6] P. Benner and T. Breiten, Interpolation-based H₂-model reduction of bilinear control systems, SIAM Journal on Matrix Analysis and Applications, 33 (2012), pp. 859–885.
[7] P. Benner and T. Breiten, Two-sided projection methods for nonlinear model order reduction, SIAM Journal on Scientific Computing, 37 (2015), pp. B239–B260.
[8] P. Benner, S. Gugercin, and K. Willcox, A survey of model reduction methods for parametric systems, Tech. Rep. MPIMD/13-14, Max Planck Institute Magdeburg Preprint, August 2013.
[9] G. Berkooz, P. Holmes, and J. Lumley, The proper orthogonal decomposition in the analysis of turbulent flows, Annual Review of Fluid Mechanics, 25 (1993), pp. 539–575.
[10] C. H. Bischof and G. Quintana-Ortí, Algorithm 782: Codes for rank-revealing QR factorizations of dense matrices, ACM Transactions on Mathematical Software, 24 (1998), pp. 254–257.
[11] C. H. Bischof and G. Quintana-Ortí, Computing rank-revealing QR factorizations of dense matrices, ACM Transactions on Mathematical Software, 24 (1998), pp. 226–253.
[12] L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley, ScaLAPACK Users' Guide, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1997.
[13] P. A. Businger and G. H. Golub, Linear least squares solutions by Householder transformations, Numerische Mathematik, 7 (1965), pp. 269–276.
[14] K. Carlberg, C. Farhat, J. Cortial, and D. Amsallem, The GNAT method for nonlinear model reduction: Effective implementation and application to computational fluid dynamics and turbulent flows, Journal of Computational Physics, 242 (2013), pp. 623–647.
[15] S. Chandrasekaran and I. C. F. Ipsen, On rank-revealing factorizations, SIAM Journal on Matrix Analysis and Applications, 15 (1994), pp. 592–622.
[16] Y. Chen, Model order reduction for nonlinear systems, Master's thesis, Massachusetts Institute of Technology, 1999.
[17] M. Condon and R. Ivanov, Model reduction of nonlinear systems, COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, 23 (2004), pp. 547–557.
[18] E. de Sturler, S. Gugercin, M. Kilmer, S. Chaturantabut, C. Beattie, and M. O'Connell, Nonlinear parametric inversion using interpolatory model reduction, SIAM Journal on Scientific Computing, to appear; arXiv:1311.0922, (2015).
[19] Z. Drmač, A global convergence proof for cyclic Jacobi methods with block rotations, SIAM Journal on Matrix Analysis and Applications, 31 (2009), pp. 1329–1350.
[20] Z. Drmač and Z. Bujanović, On the failure of rank revealing QR factorization software – a case study, ACM Transactions on Mathematical Software, 35 (2008), pp. 1–28.
[21] R. Everson and L. Sirovich, The Karhunen–Loève procedure for gappy data, Journal of the Optical Society of America, 12 (1995), pp. 1657–1664.
[22] D. Faddeev, V. Kublanovskaya, and V. Faddeeva, Solution of linear algebraic systems with rectangular matrices, Proc. Steklov Inst. Math., 96 (1968), pp. 93–111.
[23] G. Flagg and S. Gugercin, Multipoint Volterra series interpolation and H₂ optimal model reduction of bilinear systems, SIAM Journal on Matrix Analysis and Applications, to appear (2015); arXiv:1312.2627.
[24] K. Glover, All optimal Hankel-norm approximations of linear multivariable systems and their L∞-error bounds, International Journal of Control, 39 (1984), pp. 1115–1193.
[25] S. Goreinov, E. Tyrtyshnikov, and N. Zamarashkin, A theory of pseudoskeleton approximations, Linear Algebra and its Applications, 261 (1997), pp. 1–21.
[26] C. Gu, QLMOR: A projection-based nonlinear model order reduction approach using quadratic-linear representation of nonlinear systems, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 30 (2011), pp. 1307–1320.
[27] S. Gugercin, A. Antoulas, and C. Beattie, H₂ model reduction for large-scale linear dynamical systems, SIAM Journal on Matrix Analysis and Applications, 30 (2008), pp. 609–638.
[28] M. Hinze and S. Volkwein, Proper orthogonal decomposition surrogate models for nonlinear dynamical systems: Error estimates and suboptimal control, in Dimension Reduction of Large-Scale Systems, Springer, 2005, pp. 261–306.
[29] H. Hotelling, Analysis of a complex of statistical variables with principal components, Journal of Educational Psychology, 24 (1933), pp. 417–441, 498–520.
[30] I. C. F. Ipsen and T. Wentworth, The effect of coherence on sampling from matrices with orthonormal columns, and preconditioned least squares problems, SIAM Journal on Matrix Analysis and Applications, 35 (2014), pp. 1490–1520.
[31] W. Kahan, Numerical linear algebra, Canadian Mathematical Bulletin, 9 (1965), pp. 757–801.
[32] K. Kunisch and S. Volkwein, Galerkin proper orthogonal decomposition methods for a general equation in fluid dynamics, SIAM Journal on Numerical Analysis, 40 (2002), pp. 492–515.
[33] C. L. Lawson and R. J. Hanson, Solving Least Squares Problems, Prentice-Hall Inc., Englewood Cliffs, NJ, 1974.
[34] M. Loève, Probability Theory, D. Van Nostrand Company Inc., New York, 1955.
[35] J. Lumley, The structures of inhomogeneous turbulent flow, Atmospheric Turbulence and Radio Wave Propagation, (1967), pp. 166–178.
[36] B. Moore, Principal component analysis in linear systems: Controllability, observability, and model reduction, IEEE Transactions on Automatic Control, 26 (1981), pp. 17–32.
[37] C. Mullis and R. Roberts, Synthesis of minimum roundoff noise fixed point digital filters, IEEE Transactions on Circuits and Systems, 23 (1976), pp. 551–562.
[38] S. Chaturantabut and D. Sorensen, Nonlinear model reduction for porous media flow, in 2010 SIAM Annual Meeting, 2010.
[39] S. Chaturantabut and D. Sorensen, Nonlinear model reduction via discrete empirical interpolation, SIAM Journal on Scientific Computing, 32 (2010), pp. 2737–2764.
[40] D. Sorensen, Private communications, 2010.
[41] D. C. Sorensen and M. Embree, A DEIM induced CUR factorization.