Data-driven modeling of linear dynamical systems with quadratic output in the AAA framework
Ion Victor Gosea* and Serkan Gugercin†

*Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstr. 1, 39106 Magdeburg, Germany. Email: [email protected]

†Department of Mathematics and Computational Modeling and Data Analytics Division, Academy of Integrated Science, Virginia Tech, Blacksburg, VA 24061, USA. Email: [email protected]
We extend the AAA (Adaptive Antoulas-Anderson) algorithm to develop a data-driven modeling framework for linear systems with quadratic output (LQO). Such systems are characterized by two transfer functions: one corresponding to the linear part of the output and another one to the quadratic part. We first establish the joint barycentric representations and the interpolation theory for the two transfer functions of LQO systems. This analysis leads to the proposed AAA-LQO algorithm. We show that by interpolating the transfer function values on a subset of samples, together with imposing a least-squares minimization on the rest, we construct reliable data-driven LQO models. Two numerical test cases illustrate the efficiency of the proposed method.
Keywords: data-driven modeling, nonlinear dynamics, rational approximation, barycentric form, interpolation.
1. Introduction
Model order reduction (MOR) is used to approximate large-scale dynamical systems with smaller ones that ideally have similar response characteristics to the original one. This has been an active research area and many approaches to MOR have been proposed. We refer the reader to [1, 3, 6, 8, 24, 26] and the references therein for an overview of MOR methods for both linear and nonlinear dynamical systems.

MOR, as the name implies, assumes access to a full-order model to be reduced, in most cases in the form of a state-space formulation obtained via, e.g., a spatial discretization of the underlying partial differential equations. The reduced-order quantities are then computed via an explicit projection of the full-order quantities. However, in some cases, access to the full-order dynamics is not available. Instead, one has access to a collection of input/output measurements. In this case, the goal is to construct the approximation directly from data, which we refer to as data-driven modeling. This is the framework we consider in this paper.

Specifically, we focus on data-driven modeling of linear dynamical systems with quadratic output (LQO). In our formulation, data correspond to frequency-domain samples of the input/output mapping of the underlying LQO system, in the form of samples of its two transfer functions; the first transfer function is a single-variable one and the second a bivariate one. For this data set, the proposed framework first develops the barycentric rational interpolation theory for LQO systems to interpolate a subset of the data, and then extends the AAA algorithm [20] to this setting by minimizing a least-squares measure over the remaining data.

We note that system identification of general nonlinear systems has been a popular topic. In particular, we mention here the special case of identifying linear systems with nonlinear output or input functions, e.g., the so-called Wiener [30] and Hammerstein models, respectively. Significant effort has been allocated to the identification of such models; see, e.g., [16], [12] and the references therein. Nevertheless, the methods previously mentioned are based in the time domain, while in this paper we focus on frequency-domain data. We point out that the frequency-data-based Loewner framework was recently extended to identifying Hammerstein models.
Preprint (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg). 2021-03-01 . V. Gosea, S. Gugercin: The AAA framework for linear dynamical systems with quadratic output
The rest of the paper is organized as follows. We introduce LQO systems and their transfer functions in Section 2, followed by a review of barycentric rational approximation and the AAA algorithm in Section 3. Next, we develop the theory for barycentric representation and multivariate interpolation for LQO systems in Section 4. Based on this analysis, in Section 5 we present the proposed algorithm, AAA-LQO, for data-driven modeling of LQO systems. The numerical experiments are given in Section 6, followed by the conclusions in Section 7.
2. Linear systems with quadratic output
In state-space form, linear dynamical systems with quadratic output (LQO systems) are described as
$$\Sigma_{\rm LQO}:\ \begin{cases} \dot{x}(t) = A x(t) + b\, u(t),\\[2pt] y(t) = c^T x(t) + K\,[x(t) \otimes x(t)], \end{cases} \qquad (1)$$
where $A \in \mathbb{R}^{N\times N}$, $b, c \in \mathbb{R}^N$, $K \in \mathbb{R}^{1\times N^2}$, and the symbol $\otimes$ denotes the Kronecker product, i.e., for the vector $x = [x_1\ x_2\ \cdots\ x_N]^T \in \mathbb{R}^N$, we have $x \otimes x = [x_1^2\ x_1 x_2\ \cdots\ x_1 x_N\ \cdots\ x_N^2]^T \in \mathbb{R}^{N^2}$. The quadratic part of the output in (1), $K[x(t)\otimes x(t)]$, can be rewritten as $x^T(t)\, M\, x(t)$ with $M \in \mathbb{R}^{N\times N}$ and $K = [\mathrm{vec}(M)]^T$, where $\mathrm{vec}$ denotes the vectorization operation. In some cases we have $c = 0$ in (1), and thus the output has only the quadratic term.

Several projection-based MOR methodologies have already been proposed for LQO systems. More precisely, balanced truncation-type methods were considered in [7, 23, 28], while interpolation-based methods were used in [14, 29]. All these methods are intrusive, meaning that they explicitly work with the state-space matrices $A$, $b$, $c$, and $K$ in (1). The main goal of this work is to develop a data-driven modeling framework for LQO systems where only input-output measurements, in the form of transfer function evaluations, are needed, as opposed to the internal state-space representation. Therefore, our first goal is to derive transfer functions for this special class of dynamical systems.
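The rewriting of the quadratic output term $K[x \otimes x]$ as the bilinear form $x^T M x$ is easy to check numerically. The following is a minimal sketch with randomly generated toy values (dimensions and data are illustrative, not from the paper):

```python
import numpy as np

# Toy check of the identity K [x ⊗ x] = x^T M x with K = vec(M)^T.
# All values below are randomly generated for illustration only.
rng = np.random.default_rng(0)
N = 4
M = rng.standard_normal((N, N))
x = rng.standard_normal(N)

K = M.flatten(order="F")          # vec(M): stack the columns of M
quad_via_K = K @ np.kron(x, x)    # quadratic output term in (1)
quad_via_M = x @ M @ x            # equivalent bilinear form
assert np.isclose(quad_via_K, quad_via_M)
```

Since the output is a scalar, the identity holds regardless of whether $M$ is symmetric (for a scalar, $x^T M^T x = x^T M x$).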
Many classes of nonlinear systems can be represented in the time domain by generalized kernels, as in the classical Wiener or Volterra series representations. Generically, an infinite number of kernels appear in such series, corresponding to each homogeneous subsystem. For more details we refer the reader to [25, 30].

For the LQO system (1), the nonlinearity is present in the state-to-output equation only, and one can write the input-output mapping of the system in the frequency domain using two transfer functions:

1. one corresponding to the linear part of the output, i.e., $y_1(t) = c^T x(t)$;
2. one corresponding to the quadratic part of the output, i.e., $y_2(t) = K\,(x(t) \otimes x(t))$.

These transfer functions were recently derived in [14] using their time-domain representations. In the next result, we introduce and re-derive them for the completeness of the paper and to illustrate to the reader how they naturally appear.

Lemma 1. Consider the LQO system in (1) with $x(0) = 0$. Let the input $u(t)$ be a sum of $J$ harmonic terms, i.e.,
$$u(t) = \sum_{j=1}^{J} e^{\mathrm{i}\omega_j t}, \quad \text{where } \omega_j > 0 \text{ for } j = 1, 2, \ldots, J, \qquad (2)$$
and $\mathrm{i}^2 = -1$. Then, the output $y(t)$ is given by
$$y(t) = \sum_{j=1}^{J} H_1(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t} + \sum_{j=1}^{J} \sum_{\ell=1}^{J} H_2(\mathrm{i}\omega_j, \mathrm{i}\omega_\ell)\, e^{\mathrm{i}(\omega_j + \omega_\ell) t}, \qquad (3)$$
where
$$H_1(s) = c^T (s I_N - A)^{-1} b \qquad (4)$$
is the single-variable rational transfer function corresponding to $y_1(t)$, and
$$H_2(s, z) = K\left[ (s I_N - A)^{-1} b \otimes (z I_N - A)^{-1} b \right] \qquad (5)$$
is the two-variable rational transfer function corresponding to $y_2(t)$, with $I_N$ denoting the identity matrix of size $N \times N$.

Proof. For the input $u(t)$ in (2) with $x(0) = 0$, the steady-state solution of the linear state equation in (1) can be written as a sum of scaled complex exponentials:
$$x(t) = \sum_{j=1}^{J} G(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t}, \qquad (6)$$
where $G(s) = (s I_N - A)^{-1} b$. Substituting (6) into the output equation in (1) yields
$$y(t) = c^T \sum_{j=1}^{J} G(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t} + K \left[ \sum_{j=1}^{J} G(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t} \right] \otimes \left[ \sum_{\ell=1}^{J} G(\mathrm{i}\omega_\ell)\, e^{\mathrm{i}\omega_\ell t} \right] = \sum_{j=1}^{J} c^T G(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t} + \sum_{j=1}^{J} \sum_{\ell=1}^{J} K \left[ G(\mathrm{i}\omega_j) \otimes G(\mathrm{i}\omega_\ell) \right] e^{\mathrm{i}(\omega_j + \omega_\ell) t}. \qquad (7)$$
Substituting $G(s) = (s I_N - A)^{-1} b$ back into the last equation yields the desired result (3), with $H_1(s)$ and $H_2(s, z)$ as defined in (4) and (5).

Lemma 1 shows that the LQO system (1) is characterized by two transfer functions, namely $H_1(s)$ (corresponding to the linear component $y_1(t)$ of the output) and $H_2(s, z)$ (corresponding to the quadratic component $y_2(t)$ of the output). As in the classical linear case, $H_1(s)$ is a rational function in a single variable. On the other hand, $H_2(s, z)$ is also a rational function, but of two variables.
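Given any state-space realization, the two transfer functions (4) and (5) can be evaluated directly with linear solves. The sketch below uses randomly generated toy matrices (all values are illustrative) and also checks the bilinear-form reading of $H_2$, namely $H_2(s,z) = G(s)^T M\, G(z)$, under the row-major flattening convention chosen in the snippet:

```python
import numpy as np

# Toy realization (values illustrative): evaluate H1(s) and H2(s, z) of
# (4)-(5) via linear solves, and check the bilinear-form identity
# H2(s, z) = G(s)^T M G(z), with K taken as the row-major flattening of M.
rng = np.random.default_rng(1)
N = 5
A = -2.0 * np.eye(N) + 0.1 * rng.standard_normal((N, N))
b = rng.standard_normal(N)
c = rng.standard_normal(N)
M = rng.standard_normal((N, N))
K = M.flatten()                      # row-major: K @ (u ⊗ v) = u^T M v

def G(s):
    """Resolvent applied to the input vector: (s I - A)^{-1} b."""
    return np.linalg.solve(s * np.eye(N) - A, b)

def H1(s):
    return c @ G(s)                  # single-variable transfer function (4)

def H2(s, z):
    return K @ np.kron(G(s), G(z))   # two-variable transfer function (5)

s0, z0 = 2.0j, 3.0j
assert np.isclose(H2(s0, z0), G(s0) @ M @ G(z0))
```

Sampling $H_1$ and $H_2$ on a grid of frequency points in this fashion produces exactly the kind of data set used in the sections that follow.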
These two transfer functions, which fully describe the LQO system (1), will play the fundamental role in our analysis to extend barycentric interpolation and AAA to the LQO setting. Before we establish the theory for LQO systems, we briefly review the AAA algorithm for linear systems in Section 3.
Remark 1. In the proposed framework, we require sampling the two transfer functions $H_1(s)$ and $H_2(s, z)$. As shown in Lemma 1, this can be achieved by exciting the system (as a black box) with purely oscillatory control inputs, measuring the outputs, and performing a Fourier transformation. For more details on such procedures in similar settings, we refer the reader to [17]. We also note that [27] examines systems described by two time-domain kernels together with their Fourier transformations (viewed as transfer functions) and their measurements. Even though no explicit representation of these functions in terms of a state-space realization is considered there, those ideas equally apply to sampling $H_1(s)$ and $H_2(s, z)$.

Remark 2. Note that in the special case $K = \alpha\,(c^T \otimes c^T)$, where $\alpha$ is a scalar, we obtain $y_2(t) = \alpha\, y_1(t)^2$. Therefore, in this case the output $y(t)$ is a quadratic polynomial in the linear output $y_1(t)$, and the LQO model can be interpreted as a Wiener model [30]. However, our focus here is on the general case of LQO systems without this special structure.
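The reduction in Remark 2 is easy to verify numerically; a minimal sketch with illustrative toy vectors:

```python
import numpy as np

# Remark 2: with K = alpha * (c^T ⊗ c^T), the quadratic output term
# K (x ⊗ x) reduces to alpha * (c^T x)^2.  Toy values for illustration.
rng = np.random.default_rng(2)
c = rng.standard_normal(4)
x = rng.standard_normal(4)
alpha = 0.7

K = alpha * np.kron(c, c)         # the row vector c^T ⊗ c^T, scaled by alpha
assert np.isclose(K @ np.kron(x, x), alpha * (c @ x) ** 2)
```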
3. Barycentric rational approximation for linear systems and the AAA algorithm
For an underlying function $H(\cdot): \mathbb{C} \to \mathbb{C}$, e.g., the transfer function of a single-input/single-output (SISO) linear dynamical system, assume the following set of measurements:
$$\{H(s_i)\} \subset \mathbb{C}, \quad \text{where } s_i \in \mathbb{C} \text{ for } i = 1, 2, \ldots, N_s. \qquad (8)$$
Partition the sampling points into two disjoint sets:
$$\{s_1, \ldots, s_{N_s}\} = \{\xi_1, \ldots, \xi_n\} \cup \{\hat{\xi}_1, \ldots, \hat{\xi}_{N_s - n}\} \overset{\mathrm{def}}{=} \xi \cup \hat{\xi}. \qquad (9)$$
We will clarify later how this partitioning is chosen. Based on (9), define the sampled values
$$h_i \overset{\mathrm{def}}{=} H(\xi_i) \ \text{ for } i = 1, \ldots, n, \quad \text{and} \quad \hat{h}_i \overset{\mathrm{def}}{=} H(\hat{\xi}_i) \ \text{ for } i = 1, \ldots, N_s - n, \qquad (10)$$
and the corresponding data sets
$$h \overset{\mathrm{def}}{=} \{h_1, \ldots, h_n\} \quad \text{and} \quad \hat{h} \overset{\mathrm{def}}{=} \{\hat{h}_1, \ldots, \hat{h}_{N_s - n}\}. \qquad (11)$$
Define the rational function $r(s)$ in barycentric form [9], a numerically stable representation of rational functions:
$$r(s) = \frac{p(s)}{q(s)} = \sum_{k=1}^{n} \frac{w_k h_k}{s - \xi_k} \Bigg/ \left( 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} \right), \qquad (12)$$
where $\xi_k \in \mathbb{C}$ are the sampling (support) points and the weights $w_k \in \mathbb{C}$ are to be determined. (With the addition of 1 to the denominator, we guarantee that $r(s)$ is a strictly proper rational function with numerator degree $n-1$ and denominator degree $n$. This is done in anticipation of the dynamical system (1) we aim to approximate, where there is no direct input-to-output mapping. This is not a restriction, and the numerator and denominator degrees can be chosen in a different way [9, 20].) By construction, the rational function $r(s)$ in (12) is a rational interpolant at the support point set $\xi$, i.e.,
$$r(\xi_k) = h_k \quad \text{for } k = 1, 2, \ldots, n, \qquad (13)$$
assuming $w_k \neq 0$. Then, the freedom in choosing the weights $\{w_k\}$ can be used to match the remaining data $\hat{h}$ in an appropriate measure. Assuming enough degrees of freedom, [2] chooses the weights $\{w_k\}$ to enforce interpolation of $\hat{h}$ as well, by computing a rational function interpolating the full data (8). We skip the details of the conditions that guarantee the existence and uniqueness of such a rational interpolant and refer the reader to [2, 3].

The AAA (Adaptive Antoulas-Anderson) algorithm [20], on the other hand, elegantly combines interpolation and least-squares (LS) fitting. In the barycentric form (12), which interpolates the data $h$ by construction, AAA chooses the weights $\{w_k\}$ to minimize an LS error over the data $\hat{h}$. Note that the LS problem over $\hat{h}$ is nonlinear in the weights $\{w_k\}$, since these weights appear in the denominator of $r(s)$ as well. AAA solves a relaxed, linearized LS problem instead. For a sampling point $\hat{\xi}_i$ in the set $\hat{\xi}$, AAA uses the linearization
$$\hat{h}_i - r(\hat{\xi}_i) = \frac{1}{q(\hat{\xi}_i)} \left( \hat{h}_i\, q(\hat{\xi}_i) - p(\hat{\xi}_i) \right) \rightsquigarrow \hat{h}_i\, q(\hat{\xi}_i) - p(\hat{\xi}_i), \qquad (14)$$
leading to the linearized LS problem
$$\min_{w_1, \ldots, w_n}\ \sum_{i=1}^{N_s - n} \left| \hat{h}_i\, q(\hat{\xi}_i) - p(\hat{\xi}_i) \right|^2. \qquad (15)$$
AAA is an iterative algorithm and builds the partitioning (9) using a greedy search. Assume that in step $n$, AAA has the rational approximant $r(s)$ as in (12) corresponding to the partitioning (9), where the weights $\{w_k\}$ are selected by solving (15). AAA updates (9) via a greedy search by finding $\hat{\xi}_i \in \hat{\xi}$ for which the error $|r(\hat{\xi}_i) - \hat{h}_i|$ is the largest. This sampling point is then added to the interpolation set $\xi$, the barycentric rational approximant $r(s)$ in (12) is updated accordingly (its degree is now one higher), and the new weights are computed, as before, by solving the linearized LS problem. The procedure is repeated until either a desired order or an error tolerance is reached. For further details, we refer the reader to the original source [20]. The AAA algorithm has proved very flexible and effective, and has been employed in various applications such as rational approximation over disconnected domains [20], solving nonlinear eigenvalue problems [19], modeling of parametrized dynamics [10], and approximation of matrix-valued functions [15].
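The linearized LS step (14)-(15) for the strictly proper barycentric form (12) can be sketched in a few lines. The target function, sampling points, and partition below are toy choices for illustration; a full AAA implementation would additionally run the greedy support-point selection described above:

```python
import numpy as np

def H(s):
    """Toy rational target (illustrative, not from the paper)."""
    return 1.0 / (s + 1.0) + 2.0 / (s + 3.0)

xi = np.array([0.5, 1.5, 2.5])            # support (interpolation) points
xi_hat = np.linspace(0.0, 4.0, 21)        # remaining sampling points
xi_hat = xi_hat[~np.isin(xi_hat, xi)]     # keep the two sets disjoint

h, h_hat = H(xi), H(xi_hat)

# The linearized residual at xi_hat_i is
#   -( sum_k w_k (h_hat_i - h_k)/(xi_hat_i - xi_k) + h_hat_i ),
# so the relaxed problem is the linear LS problem  min_w || L w + h_hat ||_2.
L = (h_hat[:, None] - h[None, :]) / (xi_hat[:, None] - xi[None, :])
w, *_ = np.linalg.lstsq(L, -h_hat, rcond=None)

def r(s):
    """Barycentric form (12) with the computed weights."""
    cs = 1.0 / (s - xi)
    return (cs @ (w * h)) / (1.0 + cs @ w)

# The LS solution can be no worse than any other weight vector.
assert np.linalg.norm(L @ w + h_hat) <= np.linalg.norm(L @ np.ones(3) + h_hat) + 1e-12
```

In the full AAA iteration, the point $\hat{\xi}_i$ with the largest error $|r(\hat{\xi}_i) - \hat{h}_i|$ would next be moved from $\hat{\xi}$ into $\xi$ and the weights recomputed.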
4. Barycentric representations for LQO systems
To develop interpolating barycentric forms for $H_1(s)$ and $H_2(s, z)$, we first need to specify the data corresponding to the underlying LQO system $\Sigma_{\rm LQO}$. The first transfer function $H_1(s)$ of $\Sigma_{\rm LQO}$ is a regular single-variable rational function and, as in Section 3, we sample $H_1(s)$ at the distinct points $\{s_1, \ldots, s_{N_s}\}$ to obtain the data set
$$\{H_1(s_i)\} \subset \mathbb{C}, \quad \text{where } s_i \in \mathbb{C} \text{ for } i = 1, 2, \ldots, N_s. \qquad (16)$$
The second transfer function $H_2(s, z)$, on the other hand, is a function of two variables. Therefore, in agreement with the data (16), we sample $H_2(s, z)$ on the corresponding rectangular grid:
$$\{H_2(s_i, s_j)\} \subset \mathbb{C}, \quad \text{where } s_i, s_j \in \mathbb{C} \text{ for } i, j = 1, 2, \ldots, N_s. \qquad (17)$$
Partition the full set of sampling points into two disjoint sets
$$\{s_1, \ldots, s_{N_s}\} = \{\xi_1, \ldots, \xi_n\} \cup \{\hat{\xi}_1, \ldots, \hat{\xi}_{N_s - n}\} = \xi \cup \hat{\xi} \qquad (18)$$
and define the sampled values (measurements):
$$h_i \overset{\mathrm{def}}{=} H_1(\xi_i) \quad \text{for } i = 1, 2, \ldots, n \qquad (19)$$
and
$$h_{i,j} \overset{\mathrm{def}}{=} H_2(\xi_i, \xi_j) \quad \text{for } i, j = 1, 2, \ldots, n. \qquad (20)$$
Then, the goal is to construct a data-driven LQO system directly from these samples without access to the internal dynamics of $\Sigma_{\rm LQO}$. The partition (18) and the error measure used in approximating the data will be clarified later. First we show how the data in (16) and (17) can be used to develop barycentric-like representations corresponding to a reduced LQO system. We use the notation $r_1(s)$ to denote the rational approximation to $H_1(s)$, and $r_2(s, z)$ for that to $H_2(s, z)$.

Proposition 1. Given the $H_1(s)$ samples in (16), pick the nonzero weights $\{w_1, w_2, \ldots, w_n\}$. Then, the barycentric rational function
$$r_1(s) = \frac{n_1(s)}{d_1(s)} = \sum_{k=1}^{n} \frac{w_k h_k}{s - \xi_k} \Bigg/ \left( 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} \right) \qquad (21)$$
interpolates the data in (19). Let $e \in \mathbb{C}^n$ denote the vector of ones. Define the matrices
$$\hat{b} = [w_1\ w_2\ \ldots\ w_n]^T \in \mathbb{C}^n, \quad \Xi = \mathrm{diag}(\xi_1, \ldots, \xi_n) \in \mathbb{C}^{n \times n}, \quad \hat{A} = \Xi - \hat{b} e^T \in \mathbb{C}^{n \times n}, \quad \text{and} \quad \hat{c}^T = [h_1\ h_2\ \ldots\ h_n] \text{ with } \hat{c} \in \mathbb{C}^n. \qquad (22)$$
Then, $r_1(s)$ has the state-space form
$$r_1(s) = \hat{c}^T (s \hat{I} - \hat{A})^{-1} \hat{b}, \qquad (23)$$
where $\hat{I}$ is the identity matrix of dimension $n \times n$.

Proof. The fact that $r_1(s)$ is an interpolating rational function for the data (19) is just a restatement of (13), included for completeness. To prove (23), we use the Sherman-Morrison formula [13]: let $M \in \mathbb{C}^{n \times n}$ be invertible and $u, v \in \mathbb{C}^n$ be such that $1 + v^* M^{-1} u \neq 0$. Then,
$$(M + u v^*)^{-1} = M^{-1} - \frac{M^{-1} u v^* M^{-1}}{1 + v^* M^{-1} u}. \qquad (24)$$
From (22) and (23), we have
$$r_1(s) = \hat{c}^T (s \hat{I} - \hat{A})^{-1} \hat{b} = \hat{c}^T \left[ (s I - \Xi) + \hat{b} e^T \right]^{-1} \hat{b}. \qquad (25)$$
To simplify the notation, let $\hat{\Phi}_s = s I_n - \Xi$. Then, applying the Sherman-Morrison formula to the middle term in (25) with $M = \hat{\Phi}_s$, $u = \hat{b}$, and $v = e$, we obtain
$$r_1(s) = \hat{c}^T (\hat{\Phi}_s + \hat{b} e^T)^{-1} \hat{b} = \hat{c}^T \left( \hat{\Phi}_s^{-1} - \frac{\hat{\Phi}_s^{-1} \hat{b}\, e^T \hat{\Phi}_s^{-1}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}} \right) \hat{b} = \hat{c}^T \left( \hat{\Phi}_s^{-1} \hat{b} - \frac{\hat{\Phi}_s^{-1} \hat{b} \cdot e^T \hat{\Phi}_s^{-1} \hat{b}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}} \right) = \frac{\hat{c}^T \hat{\Phi}_s^{-1} \hat{b}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}}. \qquad (26)$$
Since $\Xi$ is diagonal,
$$\hat{\Phi}_s^{-1} = (s I - \Xi)^{-1} = \mathrm{diag}\!\left( (s - \xi_1)^{-1}, \ldots, (s - \xi_n)^{-1} \right).$$
Then, using the definitions of $\hat{b}$ and $\hat{c}$ in (22), we obtain
$$\hat{c}^T \hat{\Phi}_s^{-1} \hat{b} = \sum_{k=1}^{n} \frac{w_k h_k}{s - \xi_k} \quad \text{and} \quad e^T \hat{\Phi}_s^{-1} \hat{b} = \sum_{k=1}^{n} \frac{w_k}{s - \xi_k}. \qquad (27)$$
Substituting these last two equalities into (26) yields (23).

We note that state-space realizations of rational functions are unique up to a similarity transformation. For other equivalent state-space representations of a barycentric form, we refer the reader to, e.g., [5, 19].

Given the samples of $H_1(s)$ (the data in (19)) of the LQO system (1), Proposition 1 constructs the linear part of the data-driven
LQO model directly from these samples. What we need to achieve next is to use the $H_2(s,z)$ samples (the data in (20)) to construct a two-variable rational function $r_2(s,z)$ in a barycentric-like form corresponding to the quadratic part of the data-driven LQO model. However, $r_2(s,z)$ cannot be constructed independently from $r_1(s)$. Once $r_2(s,z)$ is constructed, we should be able to interpret $r_1(s)$ and $r_2(s,z)$ as the linear and quadratic transfer functions of a single LQO system. This is the precise reason that we cannot simply view $r_2(s,z)$ as an independent two-variable rational function and use the classical multivariate barycentric form [3, 4]. Therefore, $r_2(s,z)$ needs to have the form
$$r_2(s,z) = \hat{K} \left[ (s\hat{I} - \hat{A})^{-1} \hat{b} \otimes (z\hat{I} - \hat{A})^{-1} \hat{b} \right],$$
where $\hat{A}$ and $\hat{b}$ are the same matrices from (22) used in modeling $r_1(s)$, and $\hat{K} \in \mathbb{C}^{1 \times n^2}$ is the (quadratic) free variable that will be used to model the new data (20). The next result achieves this goal.

Theorem 1. Assume the set-up in Proposition 1, and further assume that the $H_2(s,z)$ samples in (17) are given. Define the two-variable function $r_2(s,z)$ in a barycentric-like form:
$$r_2(s,z) = \frac{\displaystyle \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{h_{k,\ell}\, w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}}{\displaystyle 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} + \sum_{\ell=1}^{n} \frac{w_\ell}{z - \xi_\ell} + \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}}. \qquad (28)$$
Then, $r_2(s,z)$ interpolates the data (20), i.e.,
$$r_2(\xi_i, \xi_j) = H_2(\xi_i, \xi_j) \quad \text{for } i, j = 1, \ldots, n. \qquad (29)$$
Define $\hat{M} \in \mathbb{C}^{n \times n}$ and $\hat{K} \in \mathbb{C}^{1 \times n^2}$ via
$$(\hat{M})_{i,j} = h_{i,j} \quad \text{for } i, j = 1, 2, \ldots, n \qquad (30)$$
and
$$\hat{K} = [\mathrm{vec}(\hat{M})]^T. \qquad (31)$$
Then, $r_2(s,z)$ has the state-space form
$$r_2(s,z) = \hat{K} \left[ (s\hat{I} - \hat{A})^{-1} \hat{b} \otimes (z\hat{I} - \hat{A})^{-1} \hat{b} \right]. \qquad (32)$$

Proof. To prove the interpolation property (29) of the barycentric representation (28), inspired by the linear case, we start by introducing various polynomials in one or two variables:
$$p(s) = \prod_{k=1}^{n} (s - \xi_k), \quad p(z) = \prod_{\ell=1}^{n} (z - \xi_\ell), \quad p_i(s) = \prod_{k=1, k \neq i}^{n} (s - \xi_k), \quad p_j(z) = \prod_{\ell=1, \ell \neq j}^{n} (z - \xi_\ell),$$
$$P(s,z) = p(s)\, p(z), \quad \text{and} \quad P_{i,j}(s,z) = p_i(s)\, p_j(z), \qquad (33)$$
for $i, j = 1, \ldots, n$. Multiplying both the numerator and denominator of $r_2(s,z)$ in (28) by $P(s,z)$ yields
$$r_2(s,z) = \frac{n_2(s,z)}{d_2(s,z)}, \qquad (34)$$
with
$$n_2(s,z) = \sum_{k=1}^{n} \sum_{\ell=1}^{n} h_{k,\ell}\, w_k w_\ell\, P_{k,\ell}(s,z), \qquad d_2(s,z) = P(s,z) + \sum_{k=1}^{n} w_k\, p_k(s)\, p(z) + \sum_{\ell=1}^{n} w_\ell\, p_\ell(z)\, p(s) + \sum_{k=1}^{n} \sum_{\ell=1}^{n} w_k w_\ell\, P_{k,\ell}(s,z). \qquad (35)$$
Then, evaluating $r_2(s,z)$ at $s = \xi_i$ and $z = \xi_j$ gives
$$r_2(\xi_i, \xi_j) = \frac{n_2(\xi_i, \xi_j)}{d_2(\xi_i, \xi_j)} = \frac{h_{i,j}\, w_i w_j\, P_{i,j}(\xi_i, \xi_j)}{w_i w_j\, P_{i,j}(\xi_i, \xi_j)} = h_{i,j}.$$
To prove (32), we first note that
$$r_2(s,z) = \hat{K} \left[ (s\hat{I} - \hat{A})^{-1} \hat{b} \otimes (z\hat{I} - \hat{A})^{-1} \hat{b} \right] = \hat{K} \left[ \frac{\hat{\Phi}_s^{-1} \hat{b}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}} \otimes \frac{\hat{\Phi}_z^{-1} \hat{b}}{1 + e^T \hat{\Phi}_z^{-1} \hat{b}} \right],$$
where we used the fact that
$$(s\hat{I} - \hat{A})^{-1} \hat{b} = (\hat{\Phi}_s + \hat{b} e^T)^{-1} \hat{b} = \frac{\hat{\Phi}_s^{-1} \hat{b}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}},$$
as shown in deriving (26). Since $\hat{\Phi}_s$ is diagonal, we have
$$r_2(s,z) = \frac{\hat{K}}{(1 + e^T \hat{\Phi}_s^{-1} \hat{b})(1 + e^T \hat{\Phi}_z^{-1} \hat{b})} \left[ \begin{array}{c} \frac{w_1}{s - \xi_1} \\ \vdots \\ \frac{w_n}{s - \xi_n} \end{array} \right] \otimes \left[ \begin{array}{c} \frac{w_1}{z - \xi_1} \\ \vdots \\ \frac{w_n}{z - \xi_n} \end{array} \right].$$
Then, using the definition of $\hat{K}$ in (31) together with the second formula in (27), we obtain
$$r_2(s,z) = \frac{\displaystyle \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{h_{k,\ell}\, w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}}{\displaystyle \left( 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} \right)\left( 1 + \sum_{\ell=1}^{n} \frac{w_\ell}{z - \xi_\ell} \right)} \qquad (36)$$
$$= \frac{\displaystyle \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{h_{k,\ell}\, w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}}{\displaystyle 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} + \sum_{\ell=1}^{n} \frac{w_\ell}{z - \xi_\ell} + \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}},$$
which concludes the proof.

The next result follows from Proposition 1 and Theorem 1.

Corollary 1.
Assume the set-up in Proposition 1 and Theorem 1. Then, the interpolating rational functions $r_1(s)$ and $r_2(s,z)$ together correspond to an interpolatory LQO model
$$\hat{\Sigma}_{\rm LQO}:\ \begin{cases} \dot{\hat{x}}(t) = \hat{A}\hat{x}(t) + \hat{b}\,u(t), \\[2pt] \hat{y}(t) = \hat{c}^T \hat{x}(t) + \hat{K}\,[\hat{x}(t) \otimes \hat{x}(t)]. \end{cases} \qquad (37)$$
In other words, the first (linear) transfer function of $\hat{\Sigma}_{\rm LQO}$ is $r_1(s)$ and its second transfer function is $r_2(s,z)$.

Recall the partitioning of the sampling points in (18). In Theorem 1, we have shown that $r_2(s,z)$ interpolates $H_2(s,z)$ over the sampling set $\xi \times \xi$. What is the value of $r_2(s,z)$ over the mixed sampling sets $\xi \times \hat{\xi}$ and $\hat{\xi} \times \xi$? Even though we do not enforce interpolation over these sets, in Section 5 we will need a closed-form expression for the value of $r_2(s,z)$ over $\xi \times \hat{\xi}$ and $\hat{\xi} \times \xi$. The next lemma establishes these results.

Lemma 2.
Let $r_2(s,z)$ be as defined in (28), corresponding to the sampling points in (18) and the data in (17). Then,
$$r_2(\xi_i, \hat{\xi}_j) = \frac{\displaystyle \sum_{\ell=1}^{n} \frac{w_\ell\, h_{i,\ell}}{\hat{\xi}_j - \xi_\ell}}{\displaystyle 1 + \sum_{\ell=1}^{n} \frac{w_\ell}{\hat{\xi}_j - \xi_\ell}} \quad \text{and} \quad r_2(\hat{\xi}_j, \xi_i) = \frac{\displaystyle \sum_{k=1}^{n} \frac{w_k\, h_{k,i}}{\hat{\xi}_j - \xi_k}}{\displaystyle 1 + \sum_{k=1}^{n} \frac{w_k}{\hat{\xi}_j - \xi_k}}, \qquad (38)$$
for $i = 1, \ldots, n$ and $j = 1, \ldots, N_s - n$.

Proof. The proof is given in Appendix A.

It is important to note that the numerators and denominators of $r_2(\xi_i, \hat{\xi}_j)$ and $r_2(\hat{\xi}_j, \xi_i)$ in (38) are linear in the weights $w_\ell$. This is in contrast to the general form of $r_2(s,z)$ in (36), where both the numerator and denominator are quadratic in $w_\ell$ when evaluated over $\hat{\xi} \times \hat{\xi}$.
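Proposition 1, Theorem 1, and Lemma 2 can be sanity-checked numerically. The sketch below uses small illustrative values for the support points, weights, and samples (none taken from the paper) and verifies that the barycentric forms (21) and (28) match the state-space forms (23) and (32), and that the mixed-set formula (38) is the limiting value of (28) at a support point:

```python
import numpy as np

# Toy data (illustrative only): support points, weights, and samples.
n = 3
xi = np.array([1.0, 2.0, 3.0])               # support points xi_k
w = np.array([0.4, -0.3, 0.7])               # barycentric weights w_k
h1 = np.array([1.0, -2.0, 0.5])              # h_k   = H1(xi_k)
h2 = np.array([[1.0, 2.0, 0.5],              # h_{k,l} = H2(xi_k, xi_l)
               [2.0, -1.0, 0.3],
               [0.5, 0.3, 4.0]])

e = np.ones(n)
A_hat = np.diag(xi) - np.outer(w, e)         # (22): A_hat = Xi - b_hat e^T
# Row-major flattening chosen so that K_hat @ (u ⊗ v) = sum h2[k,l] u_k v_l.
K_hat = h2.flatten()

def res(s):
    """(s I - A_hat)^{-1} b_hat, with b_hat = w."""
    return np.linalg.solve(s * np.eye(n) - A_hat, w)

def r1(s):                                   # barycentric form (21)
    cs = w / (s - xi)
    return (cs @ h1) / (1.0 + cs.sum())

def r2(s, z):                                # barycentric-like form (28)/(36)
    cs, cz = w / (s - xi), w / (z - xi)
    return (cs @ h2 @ cz) / ((1.0 + cs.sum()) * (1.0 + cz.sum()))

def r2_mixed(i, z):                          # closed form (38) for r2(xi_i, z)
    cz = w / (z - xi)
    return (h2[i] @ cz) / (1.0 + cz.sum())

s, z = 0.5 + 1.0j, -0.2 + 2.0j
assert np.isclose(r1(s), h1 @ res(s))                         # checks (23)
assert np.isclose(r2(s, z), K_hat @ np.kron(res(s), res(z)))  # checks (32)
assert np.isclose(r2(xi[1] + 1e-8, 5.0), r2_mixed(1, 5.0), atol=1e-5)  # (38)
```

Note that the toy sample matrix `h2` is symmetric, so the flattening convention for `K_hat` does not affect the scalar result here.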
5. Proposed framework for data-driven modeling of LQO systems
Section 4 established the necessary ingredients to extend AAA to LQO systems. Given the measurements (16) and (17), Proposition 1 and Theorem 1 show how to construct the barycentric forms $r_1(s)$ and $r_2(s,z)$ interpolating these data in accordance with the partitioning (18). Furthermore, Corollary 1 states that $r_1(s)$ and $r_2(s,z)$ together correspond to an interpolatory LQO system. Based on these results, in this section we fully develop the AAA framework for LQO systems. The resulting algorithm is denoted by AAA-LQO.

AAA-LQO is an iterative algorithm, adding one degree of freedom to the current data-driven LQO model in every iteration step. In the $n$th step, $r_1^{(n)}(s)$ and $r_2^{(n)}(s,z)$ correspond to a data-driven order-$n$ LQO model for the partitioning of the sampling points in (18). First, for this current partitioning, in Section 5.1 we introduce an LS error measure that is used to choose the barycentric weights $\{w_k\}$ appearing in the definitions of $r_1^{(n)}(s)$ and $r_2^{(n)}(s,z)$ in (21) and (28). Then, in Section 5.2 we establish a greedy search procedure for updating the partitioning (18). The algorithm then continues with the LS minimization for the $(n+1)$th step to construct $r_1^{(n+1)}(s)$ and $r_2^{(n+1)}(s,z)$. AAA-LQO terminates after a desired error criterion is met or a maximum allowed order is reached, as explained in Section 5.3.

Even though Section 5.1 investigates the LS problem in the $n$th step of AAA-LQO, to simplify the notation we drop the superscript and use $r_1(s)$ and $r_2(s,z)$ instead; they should be understood as the approximants in the $n$th step. We will reintroduce the superscript in Section 5.2.
LQO data (16) and (17), we recall (and repeat)the partitioning of the sampling points as in (18): { s , . . . , s N s } = { ξ , . . . , ξ n } ∪ { ˆ ξ , . . . , ˆ ξ N s − n } = ξ ∪ ˆ ξ . (39) Then, r ( s ) interpolates H ( s ) over ξ (i.e., it interpolatesthe data (19)) and r ( s, z ) interpolates H ( s, z ) over ξ × ξ (i.e., it interpolates the data (20)). Also recall that to-gether, r ( s ) and r ( s, z ) correspond to a LQO system.The only remaining degrees of freedom in defining r ( s ) and r ( s, z ) , and thus the corresponding LQO system arethe barycentric weights { w , . . . , w n } . We will choose thoseweights to minimize an appropriate error measure in the uninterpolated data corresponding to the sampling pointsˆ ξ . We first introduce the notation for these uninterpolatedvalues :ˆ h i def ==== H ( ˆ ξ i ) for i = , , . . . , N s − n, (40)ˆ h (2,2) i,j def ==== H ( ˆ ξ i , ˆ ξ j ) for i, j = , , . . . , N s − n, (41)ˆ h (1,2) i,j def ==== H ( ξ i , ˆ ξ j ) for i = , . . . , n, j = , . . . , N s − n, (42)ˆ h (2,1) j,i def ==== H ( ˆ ξ j , ξ i ) for j = , . . . , N s − n, i = , . . . , n. (43)We denote with w ∈ C n the vector of weights to be deter-mined, as w = [ w w . . . w n ] T . A reasonable error measure to minimize is the LS distance inthe uninterpolated data, leading to the minimization prob- Since the evaluation of the uninterpolated H ( s, z ) values occurover three different sets, namely ξ × ˆ ξ , ˆ ξ × ξ , and ˆ ξ × ˆ ξ , we usea superscript to distinguish them. Recall that the interpolatedvalues h i,j = H ( ξ i , ξ j ) are over ξ × ξ only and thus the superscriptnotation is avoided for h i,j . lem min w ≠ ( J + J + J + J ) (44)where J = N s − n N s − n ∑ i = ( r ( ˆ ξ i ) − ˆ h i ) , (45) J = n ( N s − n ) n ∑ i = N s − n ∑ j = ( r ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j ) , (46) J = ( N s − n ) n N s − n ∑ i = n ∑ j = ( r ( ˆ ξ i , ξ j ) − ˆ h (2,1) j,i ) , and (47) J = ( N s − n ) N s − n ∑ i = N s − n ∑ j = ( r ( ˆ ξ i , ˆ ξ j ) − ˆ h (2,2) i,j ) . 
(48)As in the original AAA for linear dynamical systems, the LS problem (44) is nonlinear in w for LQO systems. Theformulation is more complicated here due to the additional r ( s, z ) term. To resolve this numerical difficulty, we willemploy a strategy, similar to the lineraziation step in (14),and solve a relaxed optimization problem. However, theresulting LS problem in our case will still be nonlinear, yetmuch easier to solve than (44). In the end, we will tackle theoriginal nonlinear LS problem (44) by solving a sequence ofquadratic LS problems. We note that in (45)-(48), we scaleevery error term J i with the number of data points in it. n In this section, we show how to relax the each term, J i ,in the nonlinear LS problem (44). The resulting problemwill then constitute a crucial component in the proposediterative algorithm (Section 5.3). Linearizing J : Note that the i th term of J in (45),namely r ( ˆ ξ i ) − ˆ h i , is the same as the term in (14) appear-ing in AAA . This is natural since r ( s ) corresponds to thelinear part of the LQO system. Therefore, we can linearize J similar to (14). Write r ( s ) as r ( s ) = n ( s )/ d ( s ) , asdefined in (21). Then, the i th term in (45) is linearized as r ( ˆ ξ i ) − ˆ h i = d ( ˆ ξ i ) ( n ( ˆ ξ i ) − ˆ h i d ( ˆ ξ i )) ↝ n ( ˆ ξ i ) − ˆ h i d ( ˆ ξ i ) . (49) Preprint (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg). 2021-03-01 . V. Gosea, S. Gugercin: The AAA framework for linear dynamical systems with quadratic output n ( s ) and d ( s ) from the definition of r ( s ) in(21) into (49), one obtains n ( ˆ ξ i ) − ˆ h i d ( ˆ ξ i ) = n ∑ k = w k h k ˆ ξ i − ξ k − ˆ h i ( + n ∑ k = w k ˆ ξ i − ξ k ) = n ∑ k = w k ( h k − ˆ h i ) ˆ ξ i − ξ k − ˆ h i . (50)For a matrix X , let ( X ) ij denote its ( ij ) th entry. Similarly,for a vector x , let ( x ) i denote its i th entry. Then, definethe Loewner matrix L ∈ C ( N s − n ) × n with ( L ) ik = ˆ h i − h k ˆ ξ i − ξ k , for i = , . . . 
, N s , k = , . . . , n, (51)and the vector ˆ h ∈ C N s − n with ( ˆ h ) i = ˆ h i . Then, N s − n ∑ i = ( n ( ˆ ξ i ) − ˆ h i d ( ˆ ξ i )) = ∥ L w + ˆ h ∥ . Therefore, the J term in (45) will be relaxed to J ⟿ N s − n ∥ L w + ˆ h ∥ . (52) Linearizing J and J : Now we extend the linearizationstrategy used in J , which only involved the single-variablefunction r ( s ) , to the error terms J and J , which in-volve r ( s, z ) . The closed-form expressions for r ( ξ i , ˆ ξ j ) and r ( ˆ ξ j , ξ i ) we derived in Lemma 2 will prove fundamen-tal in achieving these goals.We start with J . Write r ( s, z ) = n ( s, z )/ d ( s, z ) as in(34). Then, the linearizing the ( ij ) th term in (46) means r ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j = d ( ξ i , ˆ ξ j ) ( n ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j d ( ξ i , ˆ ξ j )) ↝ n ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j d ( ξ i , ˆ ξ j ) . (53)We substitute n ( ξ i , ˆ ξ j ) and d ( ξ i , ˆ ξ j ) from (38) into (53)to obtain n ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j d ( ξ i , ˆ ξ j ) = n ∑ ℓ = w ℓ h i,ℓ ˆ ξ j − ξ ℓ − ˆ h (1,2) i,j ( + n ∑ ℓ = w ℓ ˆ ξ j − ξ ℓ ) = − ⎛⎜⎝ n ∑ ℓ = w ℓ ( ˆ h (1,2) i,j − h i,ℓ ) ˆ ξ j − ξ ℓ + ˆ h (1,2) i,j ⎞⎟⎠ . (54)Define the indexing variable α ij = ( i − )( N s − n ) + j andlet ˆ h (1,2) ∈ C n ( N s − n ) be the vector defined such that ( ˆ h (1,2) ) α ij = ˆ h (1,2) i,j for 1 ⩽ i ⩽ n and 1 ⩽ j ⩽ N s − n. (55) Define the Loewner matrix L ( , ) ∈ C n ( N s − n ) × n with entries ( L (1,2) ) α ij ℓ = ˆ h (1,2) i,j − h i,ℓ ˆ ξ j − ξ ℓ , (56)for 1 ⩽ i ⩽ n, ⩽ j ⩽ N s − n , and 1 ⩽ ℓ ⩽ n . Then, using(56) and (55) in (54), we obtain n ∑ i = N s − n ∑ j = ( n ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j d ( ξ i , ˆ ξ j )) = ∥ L (1,2) w + ˆ h (1,2) ∥ , yielding the linearization of J : J ⟿ ( N s − n ) n ÂÂÂÂÂ L (1,2) w + ˆ h (1,2) ÂÂÂÂÂ . 
(57)Using similar arguments and the explicit formula for theexpression r ( ˆ ξ j , ξ i ) in (38), the J term in (47) is linearizedto J ⟿ ( N s − n ) n ÂÂÂÂÂ L (2,1) w + ˆ h (2,1) ÂÂÂÂÂ , (58)where the Loewner matrix L ( , ) ∈ C n ( N s − n ) × n and the vec-tor ˆ h (2,1) ∈ C n ( N s − n ) are defined as ( L (2,1) ) γ ji k = ˆ h (2,1) j,i − h k,i ˆ ξ j − ξ k and ( ˆ h (2,1) ) γ ji = ˆ h (2,1) j,i , with 1 ⩽ j ⩽ N s − n, ⩽ i ⩽ n , 1 ⩽ k ⩽ n , and γ ji = ( j − ) n + i . Quadraticizing the J term : In this section we show howto relax the remaining term, J , in the minimization prob-lem (44). Note that this term includes r ( ˆ ξ i , ˆ ξ j ) ; i.e., r ( s, z ) evaluated over ˆ ξ × ˆ ξ . As we stated earlier, unlike r ( ξ i , ˆ ξ j ) ( r ( s, z ) over ξ × ˆ ξ ) or r ( ˆ ξ i , ξ j ) ( r ( s, z ) over ξ × ˆ ξ ), thenumerator and denominator of the quantity r ( ˆ ξ i , ˆ ξ j ) isquadratic in the weights w ℓ . Therefore, relaxing the ( ij ) thterm in J via multiplying it out with its denominator, willnot yield a linear term, but rather a quadratic, i.e., even therelaxed problem cannot be solved as a linear LS problem.This is what we establish next.Similar to (53), relax the ( ij ) th term in (48) using r ( ˆ ξ i , ˆ ξ j ) − ˆ h (2,2) i,j = d ( ˆ ξ i , ˆ ξ j ) ( n ( ˆ ξ i , ˆ ξ j ) − ˆ h (2,2) i,j d ( ˆ ξ i , ˆ ξ j )) ↝ n ( ˆ ξ i , ˆ ξ j ) − ˆ h (2,2) i,j d ( ˆ ξ i , ˆ ξ j ) . (59)Using (36), we obtain r ( ˆ ξ i , ˆ ξ j ) = ∑ nk = ∑ nℓ = w k w ℓ h k,ℓ ( ˆ ξ i − ξ k )( ˆ ξ j − ξ ℓ ) ( + ∑ nk = w k ˆ ξ i − ξ k )( + ∑ nℓ = w ℓ ˆ ξ j − ξ ℓ ) = n ( ˆ ξ i , ˆ ξ j ) d ( ˆ ξ i , ˆ ξ j ) . (60) Preprint (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg). 2021-03-01 . V. Gosea, S. 
Substituting $n_2(\hat\xi_i, \hat\xi_j)$ and $d_2(\hat\xi_i, \hat\xi_j)$ from (60) into (59) and re-arranging the terms yields
\[ n_2(\hat\xi_i, \hat\xi_j) - \hat h^{(2,2)}_{i,j}\, d_2(\hat\xi_i, \hat\xi_j) = - \Bigg( \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{ w_k w_\ell \big( \hat h^{(2,2)}_{i,j} - h_{k,\ell} \big) }{ (\hat\xi_i - \xi_k)(\hat\xi_j - \xi_\ell) } + \sum_{k=1}^{n} \frac{ w_k\, \hat h^{(2,2)}_{i,j} }{ \hat\xi_i - \xi_k } + \sum_{\ell=1}^{n} \frac{ w_\ell\, \hat h^{(2,2)}_{i,j} }{ \hat\xi_j - \xi_\ell } + \hat h^{(2,2)}_{i,j} \Bigg) . \tag{61} \]
Note that the expression in (61) is quadratic in $w_k$, as anticipated. As we did for $J_1$, $J_2$, and $J_3$, to express the result more compactly in matrix form, we introduce the (2-D) Loewner matrix $L^{(2,2)} \in \mathbb{C}^{(N_s - n)^2 \times n^2}$ as
\[ \big( L^{(2,2)} \big)_{\alpha_{ij}, \beta_{k\ell}} = \frac{ \hat h^{(2,2)}_{i,j} - h_{k,\ell} }{ (\hat\xi_i - \xi_k)(\hat\xi_j - \xi_\ell) } , \tag{62} \]
where $\alpha_{ij} = (i-1)(N_s - n) + j$ and $\beta_{k\ell} = (k-1)n + \ell$ with $i, j \in \{1, 2, \dots, N_s - n\}$ and $k, \ell \in \{1, 2, \dots, n\}$. Then, the $\alpha_{ij}$th entry of the vector $L^{(2,2)}(w \otimes w) \in \mathbb{C}^{(N_s - n)^2}$ is
\[ \big( L^{(2,2)} (w \otimes w) \big)_{\alpha_{ij}} = - \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{ w_k w_\ell \big( h_{k,\ell} - \hat h^{(2,2)}_{i,j} \big) }{ (\hat\xi_i - \xi_k)(\hat\xi_j - \xi_\ell) } , \tag{63} \]
thus recovering the first sum in (61). Next, introduce the matrices $U_1, U_2 \in \mathbb{C}^{(N_s - n)^2 \times n}$ such that for $1 \leqslant k, \ell \leqslant n$,
\[ \big( U_1 \big)_{\alpha_{ij}, k} = \frac{ \hat h^{(2,2)}_{i,j} }{ \hat\xi_i - \xi_k } \quad\text{and}\quad \big( U_2 \big)_{\alpha_{ij}, \ell} = \frac{ \hat h^{(2,2)}_{i,j} }{ \hat\xi_j - \xi_\ell } . \tag{64} \]
Using $U_1$ and $U_2$ in (64), the last two sums in (61) can be compactly written as
\[ \sum_{k=1}^{n} \frac{ w_k\, \hat h^{(2,2)}_{i,j} }{ \hat\xi_i - \xi_k } = \big( U_1 w \big)_{\alpha_{ij}} \quad\text{and}\quad \sum_{\ell=1}^{n} \frac{ w_\ell\, \hat h^{(2,2)}_{i,j} }{ \hat\xi_j - \xi_\ell } = \big( U_2 w \big)_{\alpha_{ij}} . \tag{65} \]
Define $U = U_1 + U_2$. Then using (64), we write
\[ \big( U \big)_{\alpha_{ij}, k} = \big( U_1 \big)_{\alpha_{ij}, k} + \big( U_2 \big)_{\alpha_{ij}, k} = \frac{ \hat h^{(2,2)}_{i,j} \big( \hat\xi_i + \hat\xi_j - 2\xi_k \big) }{ (\hat\xi_i - \xi_k)(\hat\xi_j - \xi_k) } . \]
(66)

Inserting (62) and (66) into (61), we obtain
\[ \sum_{i=1}^{N_s - n} \sum_{j=1}^{N_s - n} \big( n_2(\hat\xi_i, \hat\xi_j) - \hat h^{(2,2)}_{i,j}\, d_2(\hat\xi_i, \hat\xi_j) \big)^2 = \| L^{(2,2)} (w \otimes w) + U w + \hat h^{(2,2)} \|^2 , \tag{67} \]
where $\hat h^{(2,2)} \in \mathbb{C}^{(N_s - n)^2}$ is the vector defined as
\[ \big( \hat h^{(2,2)} \big)_{\alpha_{ij}} = \hat h^{(2,2)}_{i,j} , \tag{68} \]
with $\alpha_{ij} = (i-1)(N_s - n) + j$ as before and $1 \leqslant i, j \leqslant N_s - n$. The expression (67) yields the final relaxation of $J_4$:
\[ J_4 \rightsquigarrow \tfrac{1}{(N_s - n)^2}\, \| L^{(2,2)} (w \otimes w) + U w + \hat h^{(2,2)} \|^2 . \tag{69} \]

Combining the relaxations of $J_1$, $J_2$, $J_3$, and $J_4$ given in (52), (57), (58), and (69), at the $n$th step of the algorithm we need to solve the quadraticized minimization problem
\[ \min_{w} \Big\{ \rho_1 \| L_1 w + \hat h_1 \|^2 + \rho_2 \big( \| L^{(1,2)} w + \hat h^{(1,2)} \|^2 + \| L^{(2,1)} w + \hat h^{(2,1)} \|^2 \big) + \rho_3 \| L^{(2,2)} (w \otimes w) + U w + \hat h^{(2,2)} \|^2 \Big\} , \tag{70} \]
where
\[ \rho_1 = \tfrac{1}{N_s - n} , \quad \rho_2 = \tfrac{1}{(N_s - n)\, n} , \quad\text{and}\quad \rho_3 = \tfrac{1}{(N_s - n)^2} . \tag{71} \]
Note that, due to the last term, the optimization problem (70) is no longer a linear LS problem; nevertheless, it can be solved efficiently. One can explicitly compute the gradient (and Hessian) of the cost function and apply a well-established (quasi-)Newton method [21]. If we had a one-step algorithm whose solution is given by (70), one would employ these techniques. However, solving (70) is only one step of our proposed iterative algorithm. Hence, as the iteration continues (and $n$ increases), the vector $w$ (and the data partition) is updated, and a new optimization problem of larger dimension must be solved. Therefore, we solve (70) only approximately in every step.

One can obtain an approximate solution to (70) in various ways. In our formulation, we first solve the part of (70) that can be written as a linear least-squares problem in $w$, namely
\[ \min_{w} \Big\{ \rho_1 \| L_1 w + \hat h_1 \|^2 + \rho_2 \big( \| L^{(1,2)} w + \hat h^{(1,2)} \|^2 + \| L^{(2,1)} w + \hat h^{(2,1)} \|^2 \big) \Big\} . \]
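The vectorized form (67) of the quadratic residual can also be checked numerically. In the sketch below, all data are made up (`xi`, `xh`, `H`, `Hh`, and `w` are hypothetical), and the Kronecker-ordered indexing $\alpha_{ij}$, $\beta_{k\ell}$ is realized through `numpy.kron`:

```python
import numpy as np

# Toy sizes: n support points and m = Ns - n least-squares points.
rng = np.random.default_rng(1)
n, m = 3, 4
xi = np.array([1.0, 2.0, 3.0])            # support points
xh = np.array([10.0, 11.0, 12.0, 13.0])   # LS points
H = rng.standard_normal((n, n))           # H2 samples on support x support
Hh = rng.standard_normal((m, m))          # H2 samples on LS x LS
w = rng.standard_normal(n)                # barycentric weights

Ci = 1.0 / (xh[:, None] - xi[None, :])    # Cauchy matrix, LS x support

# 2-D Loewner matrix (62): rows indexed by alpha_ij, columns by beta_kl
L22 = (Hh.reshape(-1, 1) - H.reshape(1, -1)) * np.kron(Ci, Ci)
# U = U1 + U2 as in (64)-(66)
U = Hh.reshape(-1, 1) * (np.kron(Ci, np.ones((m, 1))) +
                         np.kron(np.ones((m, 1)), Ci))

# Direct evaluation of n2 - hhat * d2 over the LS grid
N2 = Ci @ (np.outer(w, w) * H) @ Ci.T     # n2(xh_i, xh_j)
dvec = 1.0 + Ci @ w
resid = N2 - Hh * np.outer(dvec, dvec)    # n2 - hhat * d2

# Vectorized identity behind (67)
assert np.allclose(resid.reshape(-1),
                   -(L22 @ np.kron(w, w) + U @ w + Hh.reshape(-1)))
```

The row-major reshapes match the index maps $\alpha_{ij} = (i-1)(N_s-n)+j$ and $\beta_{k\ell} = (k-1)n+\ell$ of the text.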
(72)

The optimization problem (72) is a classical linear least-squares problem:
\[ \tilde w = \arg\min_{w} \left\| \begin{bmatrix} \sqrt{\rho_1}\, L_1 \\ \sqrt{\rho_2}\, L^{(1,2)} \\ \sqrt{\rho_2}\, L^{(2,1)} \end{bmatrix} w + \begin{bmatrix} \sqrt{\rho_1}\, \hat h_1 \\ \sqrt{\rho_2}\, \hat h^{(1,2)} \\ \sqrt{\rho_2}\, \hat h^{(2,1)} \end{bmatrix} \right\|^2 . \tag{73} \]
Given $\tilde w$, we further relax the last term in (70) as
\[ \rho_3 \| L^{(2,2)} (w \otimes w) + U w + \hat h^{(2,2)} \|^2 \rightsquigarrow \rho_3 \| L^{(2,2)} (\tilde w \otimes w) + U w + \hat h^{(2,2)} \|^2 . \tag{74} \]
Using $L^{(2,2)} (\tilde w \otimes w) = L^{(2,2)} (\tilde w \otimes I) w$, we rewrite (74) as
\[ \rho_3 \| L^{(2,2)} (\tilde w \otimes w) + U w + \hat h^{(2,2)} \|^2 = \rho_3 \| T w + \hat h^{(2,2)} \|^2 , \tag{75} \]
where the matrix $T \in \mathbb{C}^{(N_s - n)^2 \times n}$ is defined as
\[ T = L^{(2,2)} (\tilde w \otimes I) + U . \tag{76} \]
Then, using (75) in place of the last term in (70), we obtain a minimization problem that is a linear LS problem. Thus, the solution to our final approximation of (70) is given by
\[ w_\star = \arg\min_{w} \left\| \begin{bmatrix} \sqrt{\rho_1}\, L_1 \\ \sqrt{\rho_2}\, L^{(1,2)} \\ \sqrt{\rho_2}\, L^{(2,1)} \\ \sqrt{\rho_3}\, T \end{bmatrix} w + \begin{bmatrix} \sqrt{\rho_1}\, \hat h_1 \\ \sqrt{\rho_2}\, \hat h^{(1,2)} \\ \sqrt{\rho_2}\, \hat h^{(2,1)} \\ \sqrt{\rho_3}\, \hat h^{(2,2)} \end{bmatrix} \right\|^2 . \tag{77} \]
Therefore, in the $n$th step of AAA-LQO, the optimization problem (44) is relaxed and the solution of this relaxed problem (the weights) is given by (77). The algorithm proceeds with the updated weights as we discuss next.
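The two-stage relaxed solve can be sketched in a few lines of numpy. The function below is a hypothetical stand-in (the name, the `lstsq`-based solver, and the placement of the square-root weights on both the matrix blocks and the right-hand sides are our choices for illustration, not the paper's implementation):

```python
import numpy as np

def solve_relaxed_weights(L1, h1, L12, h12, L21, h21, L22, U, h22,
                          rho1, rho2, rho3):
    """Sketch of the two-stage solve: first the linear LS part (73),
    then the stacked linear LS problem (77) with T from (76)."""
    s1, s2, s3 = np.sqrt(rho1), np.sqrt(rho2), np.sqrt(rho3)
    A = np.vstack([s1 * L1, s2 * L12, s2 * L21])
    b = -np.concatenate([s1 * h1, s2 * h12, s2 * h21])
    w_tilde = np.linalg.lstsq(A, b, rcond=None)[0]          # solves (73)

    n = L1.shape[1]
    # T = L22 (w_tilde kron I) + U, using L22 (w~ x w) = L22 (w~ x I) w
    T = L22 @ np.kron(w_tilde.reshape(-1, 1), np.eye(n)) + U
    A2 = np.vstack([A, s3 * T])
    b2 = np.concatenate([b, -s3 * h22])
    return np.linalg.lstsq(A2, b2, rcond=None)[0]           # solves (77)
```

One `lstsq` call per stage suffices; in the overall iteration this pair of solves is repeated at every step $n$ with the updated partition.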
5.2. Greedy selection of the next support point

Given the partition (39) in Step $n$ of the algorithm, Section 5.1 showed how to choose the barycentric weights $w$ to minimize a joint LS measure over the uninterpolated data set. The only remaining component of the proposed approach is, then, to choose the next support point $\xi_{n+1}$ and update the data partition (39), so that we can repeat Section 5.1 for the updated partition until a desired tolerance is achieved. In other words, we move one sampling point from the LS set $\hat\xi$ to the interpolation set $\xi$. Which point to move from $\hat\xi$ to $\xi$ is decided in a greedy manner. To emphasize the iterative nature of the overall algorithm, at Step $n$ we denote by $r_1^{(n)}(s)$ and $r_2^{(n)}(s,z)$ the two transfer functions of the current LQO approximant. (Note that we dropped the superscript in Section 5.1 to simplify the notation there.)

We start by defining two constants based on the data:
\[ M_1 = \max_{s \in \Omega} | H_1(s) | , \qquad M_2 = \max_{s \in \Omega,\, z \in \Omega} | H_2(s, z) | . \tag{78} \]
For the current approximant in Step $n$, introduce the absolute error measures, i.e., the deviations in the linear and quadratic parts:
\[ \epsilon_1^{(n)} = \max_{s \in \Omega} | H_1(s) - r_1^{(n)}(s) | , \qquad \epsilon_2^{(n)} = \max_{s, z \in \Omega} | H_2(s, z) - r_2^{(n)}(s, z) | . \tag{79} \]
The next support point $\xi_{n+1}$ is chosen by means of a greedy search over the set $\Omega \setminus \{\xi_1, \dots, \xi_n\}$ using the error measures $\epsilon_1^{(n)}$ and $\epsilon_2^{(n)}$. More precisely, if $\epsilon_1^{(n)} / M_1 > \epsilon_2^{(n)} / M_2$, then
\[ \xi_{n+1} = \arg\max_{s \in \Omega} | H_1(s) - r_1^{(n)}(s) | . \]
On the other hand, if $\epsilon_1^{(n)} / M_1 < \epsilon_2^{(n)} / M_2$, define $s^{(n+1)}$ and $z^{(n+1)}$ via
\[ \big( s^{(n+1)}, z^{(n+1)} \big) = \arg\max_{s, z \in \Omega} | H_2(s, z) - r_2^{(n)}(s, z) | . \]
The question is then whether to choose $s^{(n+1)}$ or $z^{(n+1)}$ as $\xi_{n+1}$. If only one of them was already a support point, then we choose the other one as $\xi_{n+1}$.
If neither $s^{(n+1)}$ nor $z^{(n+1)}$ was previously chosen as a support point, then we compare $| H_1(s^{(n+1)}) - r_1^{(n)}(s^{(n+1)}) |$ and $| H_1(z^{(n+1)}) - r_1^{(n)}(z^{(n+1)}) |$, and choose as $\xi_{n+1}$ the one that yields the larger deviation in the first transfer function. (Clearly, both cannot already be support points, due to the interpolation property.)

Remark 3.
Instead of considering the full grid of pairs of sampling points $(s, z)$ and the associated measurements, one could consider a sparser grid of $H_2(s, z)$ samples. This modification would require changing the greedy selection scheme accordingly, to make sure that all possible combinations of selected points appear in the sparser grid. We skip this aspect in our examples and work with the full data set.

Now, we have all the pieces to describe the algorithmic framework for the proposed method
AAA-LQO, the
AAA algorithm for
LQO systems.

Given the full LQO data (16) and (17), we initiate the approximant ($n = 0$) by choosing $r_1^{(0)}(s)$ as the average of the $H_1(s)$ samples and $r_2^{(0)}(s, z)$ as the average of the $H_2(s, z)$ samples. Then, using the greedy selection strategy of Section 5.2, we update the partition (39) and solve for the barycentric weights as in Section 5.1, more specifically using (77). Let $n_{\max}$ denote the largest dimension permitted for the data-driven LQO approximant $\widehat\Sigma_{\mathrm{LQO}}$ and let $\epsilon$ denote the relative error tolerance. Then, AAA-LQO terminates either when the prescribed dimension $n_{\max}$ is reached, or when the prescribed error tolerance is achieved, namely
\[ \max\big( \epsilon_1^{(n)} / M_1 ,\; \epsilon_2^{(n)} / M_2 \big) < \epsilon . \tag{80} \]
A sketch of
AAA-LQO is given in Algorithm 1.
Algorithm 1.
AAA-LQO: AAA algorithm for
LQO systems
Input:
Sampling points $\{ s_1, \dots, s_{N_s} \}$, and samples $\{ H_1(s_i) \}$ and $\{ H_2(s_i, s_j) \}$ of an LQO system; maximum dimension allowed $n_{\max}$; stopping tolerance $\epsilon$.

Output: data-driven LQO system $\widehat\Sigma_{\mathrm{LQO}}$ as in (37).

1. $n = 0$; $r_1^{(0)} = \mathrm{avg}\{ H_1(s_i) \}$ and $r_2^{(0)} = \mathrm{avg}\{ H_2(s_i, s_j) \}$.
2. while $\max\big( \epsilon_1^{(n)} / M_1 ,\; \epsilon_2^{(n)} / M_2 \big) > \epsilon$ and $n < n_{\max}$:
3.   Employ the greedy selection scheme to choose the next support point(s) and update the partitioning as described in Section 5.2.
4.   Compute the vector of weights $w_\star$ as in (77).
5.   Update $r_1^{(n)}(s)$ and $r_2^{(n)}(s, z)$, and compute the errors $\epsilon_1^{(n)}$ and $\epsilon_2^{(n)}$ as in (79).
6.   $n = n + 1$.
7. end while

Remark 4. Note that, by choosing complex-conjugate sampling points and sampled values, one can enforce the fitted models to be real-valued. This is enforced for both examples presented in Section 6.
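The loop of Algorithm 1 can be sketched compactly in Python. The three callables below (`greedy_update`, `solve_weights`, `errors`) are hypothetical stand-ins for the greedy step of Section 5.2, the relaxed LS solve (77), and the error evaluation (79); this is a structural sketch, not the authors' implementation:

```python
import numpy as np

def aaa_lqo(H1_samples, H2_samples, n_max, eps,
            greedy_update, solve_weights, errors):
    """Skeleton of Algorithm 1 (AAA-LQO); stops when the relative
    errors fall below eps or the dimension budget n_max is reached."""
    M1 = np.max(np.abs(H1_samples))
    M2 = np.max(np.abs(H2_samples))
    # n = 0: both approximants start as the averages of the data
    r1 = np.full_like(H1_samples, np.mean(H1_samples))
    r2 = np.full_like(H2_samples, np.mean(H2_samples))
    support, n = [], 0
    e1, e2 = errors(r1, r2)
    while max(e1 / M1, e2 / M2) > eps and n < n_max:
        support = greedy_update(support, r1, r2)   # pick xi_{n+1} (Sec. 5.2)
        w, r1, r2 = solve_weights(support)         # weights via (77)
        e1, e2 = errors(r1, r2)                    # errors as in (79)
        n += 1
    return support, r1, r2
```

The stopping test mirrors (80): both relative error measures must drop below the prescribed tolerance.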
6. Numerical examples
We test
AAA-LQO, as given in Algorithm 1, on two
LQO systems. We also apply the original
AAA algorithm (from the linear case) to the data corresponding to the first (linear) transfer function only. Therefore, we construct two approximants: (1) a data-driven LQO approximant of order $n$ using AAA-LQO, and (2) a data-driven linear approximant using AAA. Note that both approximants are real-valued, enforced by using a data set that is closed under complex conjugation.
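A sampling grid that is closed under complex conjugation, as used in both examples, can be generated as follows (the interval endpoints below are placeholders; the exact interval is not reproduced here):

```python
import numpy as np

# 60 logarithmically spaced points on the positive imaginary axis,
# plus their complex conjugates; the endpoints 1e-1 and 1e2 are
# placeholder values, not the ones used in the experiments.
s_plus = 1j * np.logspace(-1, 2, 60)
s = np.concatenate([s_plus, np.conj(s_plus)])

assert s.size == 120
# closed under conjugation: conjugating the set only permutes it
assert np.allclose(np.sort_complex(s), np.sort_complex(np.conj(s)))
```

Fitting against such a conjugate-closed set (with conjugate sample values) is what allows the resulting barycentric models to be kept real-valued, cf. Remark 4.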
6.1. ISS model with a quadratic output

First, we use a single-input/single-output version of the ISS 1R Model from the SLICOT MOR benchmark collection [11]. We construct an LQO system from this linear model by adding a quadratic output whose matrix $M \in \mathbb{R}^{N \times N}$ is chosen as a weighted sum of $I^{(0)}$, $I^{(-1)}$, and $I^{(+1)}$; it scales products of the state variables in the output equation. Here, $I^{(k)}$ denotes a quasi-diagonal matrix whose entries of ones are shifted from the main diagonal by the integer $k$ ($k > 0$ above and $k < 0$ below the diagonal, with $I^{(0)} = I$).

We collect the following data: we pick 60 logarithmically spaced points on the positive part of the imaginary axis and add their conjugate pairs, to obtain $N_s = 120$ sampling points $\{ s_i \}$ and the samples $\{ H_1(s_i) \}$ for $i = 1, 2, \dots, N_s$ as in (16). Then, as in (17), we sample the second transfer function at $H_2(s_i, s_j)$ for $i, j = 1, 2, \dots, N_s$. The sampled data are depicted in Figure 1, where we display only the measurements on the "positive side" of the imaginary axis, skipping the conjugate data.

Figure 1: Measurements corresponding to the transfer functions; $H_1(s)$ (top) and $H_2(s, z)$ (bottom).

We apply Algorithm 1 with $n_{\max} = 30$ and $\epsilon = 10^{-2}$ (a relative tolerance on the data). With these settings, AAA-LQO yields a data-driven LQO model of order $n = 18$. Using only the $\{ H_1(s_i) \}$ samples (corresponding to the linear observation map), we apply AAA and obtain a data-driven linear approximant of order $n =$
18. The
AAA approximant is constructed simply to illustrate that a linear dynamical system approximation is not sufficient to accurately represent the underlying
LQO system.

In the top plot of Figure 2, we show the magnitude of the first transfer function $H_1(s)$ of the original system together with that of the linear AAA model and the first transfer function $r_1^{(n)}(s)$ of the AAA-LQO model. As expected, the
AAA model does a good job in matching the linear part of the output. Similarly, the
AAA-LQO model also matches $H_1(s)$ accurately. To better illustrate this, in the bottom plot of Figure 2 we depict the magnitude of the approximation errors in $H_1(s)$. The plot reveals that the AAA-LQO model has a smaller error for most of the frequency values, even in approximating $H_1(s)$. This happens despite the fact that it focuses on both $H_1(s)$ and $H_2(s, z)$, unlike the AAA model, which only tries to approximate $H_1(s)$.

Figure 2: First transfer function approximation; magnitudes for the original, AAA-linear, and AAA-LQO models (top) and relative approximation errors for the first transfer function (bottom).

In Figure 3 we depict the selected support points (interpolation points) for both
AAA and
AAA-LQO algorithms (without the complex conjugate pairs), as well as the poles of the learned models (i.e., the eigenvalues of $A_r$ in both cases). Note that there are 9 complex conjugate pairs of support points for each method. Even though some of the support points of AAA and
AAA-LQO overlap, two of the pairs are different. This difference causes a big deviation in the pole pattern, as shown in the bottom plot, illustrating that even the linear part of the
AAA-LQO approximant, i.e., $r_1^{(n)}(s)$, is fundamentally different from the linear AAA model. This is expected since
AAA-LQO constructs $r_1^{(n)}(s)$ and $r_2^{(n)}(s, z)$ together by minimizing a joint LS measure in both $H_1(s)$ and $H_2(s, z)$.

Figure 3: The support points (top) and the poles (bottom) for the two AAA reduced-order models.

To show the overall performance of
AAA-LQO in accurately approximating not only $H_1(s)$ but also $H_2(s, z)$ (the full LQO behavior), we perform a time-domain simulation of the original
LQO system $\Sigma_{\mathrm{LQO}}$, the data-driven AAA-LQO model $\widehat\Sigma_{\mathrm{LQO}}$, and the linear
AAA model, using a sinusoidal control input $u(t)$. During the simulation of the original system $\Sigma_{\mathrm{LQO}}$, we also compute only the linear part of the output, which the
AAA model should approximate well. The results are given in the top plot of Figure 4. The first observation is that the output of $\widehat\Sigma_{\mathrm{LQO}}$ from
AAA-LQO accurately replicates the output of Σ
LQO. On the other hand, the linear
AAA model completely misses the quadratic output and is only able to approximate the linear component of the output, as expected. The approximation error in the output corresponding to $\widehat\Sigma_{\mathrm{LQO}}$ is depicted in the bottom plot of Figure 4.

Figure 4: Time-domain simulations: (top) output $y(t)$ of the original and data-driven models, (bottom) approximation error.

In Figure 5 we show the convergence behavior of
AAA-LQO by plotting the evolution of the relative approximation errors $\epsilon_1^{(n)}/M_1$ and $\epsilon_2^{(n)}/M_2$ for all even values of $n$. For reference, we also depict the convergence behavior of AAA. The figure illustrates that after $n =$
18, both relative errors fall below the given tolerance of $10^{-2}$ and the algorithm terminates.

Figure 5: Relative approximation errors in each step.

To investigate how the order of the
AAA-LQO model varies based on the stopping tolerance, we increase $n_{\max}$ and run AAA-LQO for four decreasing tolerance values $\tau_1 > \tau_2 > \tau_3 > \tau_4$. The results are displayed in a table of $\tau$ versus the resulting order $n$. For the smallest tolerance $\tau_4$, Figure 6 depicts the convergence behavior of AAA-LQO by plotting $\epsilon_1^{(n)}/M_1$ and $\epsilon_2^{(n)}/M_2$ during the iteration.
Figure 6: Relative approximation errors in each step.
6.2. A stochastic mass-spring-damper model

This model, taken from [23], corresponds to an
LQO system whose output measures a variance of the state variable. A linear mass-spring-damper SISO dynamical system was modified in [22] by means of stochastic modeling, replacing the physical parameters by independent random variables and yielding a linear dynamical system with multiple outputs. Based on this multiple-output system, a SISO
LQO system was derived in [23], where the output corresponds to the variance of the original output (and thus is quadratic in nature). We refer the reader to [23] for further details. We obtain the measurements from a version of this model corresponding to an underlying
LQO system of order $N$ with $c = 0$ in (1); hence, $H_1(s) = 0$ for all $s$. As sampling points $\{ s_i \}$, we choose 60 logarithmically spaced points on the positive part of the imaginary axis together with their conjugate pairs, leading to $N_s =$
120 samples. Since H ( s ) =
0, we only need to sample $H_2(s_i, s_j)$ for $i, j = 1, 2, \dots, N_s$. The corresponding data for the second transfer function are depicted in Figure 7.

Figure 7: Measurements of the second transfer function.

We apply AAA-LQO with $n_{\max} =$
50 and a relative stopping tolerance $\epsilon$, obtaining an LQO model of order $n =$
30. To show the accuracy of the approximant, we perform time-domain simulations of the full model and the approximant with a sinusoidal input $u(t)$. We depict the observed outputs in the top plot of Figure 8, illustrating an accurate approximation. The corresponding output error is plotted in the bottom plot of Figure 8.

Figure 8: Time-domain simulations; output of the original and the reduced system (top) and approximation error (bottom).

Finally, in Figure 9 we show the convergence behavior of
AAA-LQO by plotting the evolution of the relative approximation error $\epsilon_2^{(n)}/M_2$.

Remark 5.
Since
AAA-LQO uses a greedy selection scheme and is not a descent algorithm, there is no theoretical guarantee that the maximum approximation error will decrease monotonically. This can be seen in Figures 5, 6, and 9. This behavior was also observed in the original
AAA algorithm; see, e.g., Application 6.3 in [20]. However, numerically the error indeed decreases monotonically with $n$ in most cases.

Figure 9: Maximum relative approximation error at each step.
7. Conclusions
We have proposed a novel data-driven modeling method, called
AAA-LQO , for linear systems with quadratic outputs(
LQO ). AAA-LQO extends the
AAA algorithm to this new setting by first developing the barycentric representation theory for the two transfer functions arising in the analysis of
LQO systems and then formulating an LS minimization framework to efficiently solve for the barycentric coefficients. The two numerical examples illustrate that AAA-LQO provides high-fidelity data-driven approximants to the original model. The barycentric form we developed here for
LQO systems offers promising research directions for modeling systems with general polynomial observation maps, as well as for nonlinearities appearing in the dynamical equation, such as bilinear or quadratic-bilinear systems. These topics are the focus of ongoing research.
8. Acknowledgements
The authors would like to thank Dr. Akil Narayan for providing the source codes for generating the numerical example presented in Section 6.2. S. Gugercin was supported in part by the National Science Foundation under Grants No. DMS-1720257 and DMS-1819110. Part of this material is based upon work supported by the National Science Foundation under Grant No. DMS-1439786 and by the Simons Foundation Grant No. 507536 while Gugercin was in residence at the Institute for Computational and Experimental Research in Mathematics in Providence, RI, during the "Model and dimension reduction in uncertain and dynamic systems" program.
References [1] A. C. Antoulas.
Approximation of large-scale dynamical systems. SIAM, Philadelphia, 2005. [2] A. C. Antoulas and B. D. O. Anderson. On the scalar rational interpolation problem.
IMA Journal of Math-ematical Control and Information , 3(2-3):61–88, 1986.[3] A. C. Antoulas, C. Beattie, and S. Gugercin.
Inter-polatory methods for model reduction . ComputationalScience and Engineering 21. SIAM, Philadelphia, 2020.[4] A. C. Antoulas, A. C. Ionita, and S. Lefteriu. On two-variable rational interpolation.
Linear Algebra and its Applications, 436(8):2889-2915, 2012. [5] A. C. Antoulas, S. Lefteriu, and A. C. Ionita. A tutorial introduction to the Loewner framework for model reduction. In
Model Reduction and Approximation , chap-ter 8, pages 335–376. SIAM, 2017.[6] U. Baur, P. Benner, and L. Feng. Model order re-duction for linear and nonlinear systems: A system-theoretic perspective.
Archives of ComputationalMethods in Engineering , 21(4):331–358, 2014.[7] P. Benner, P. Goyal, and I. Pontes Duff. Grami-ans, energy functionals and balanced truncationfor linear dynamical systems with quadratic out-puts. Technical report, arXiv preprint available at https://arxiv.org/abs/1909.04597 , 2019.[8] P. Benner, M. Ohlberger, A. Cohen, and K. Willcox.
Model Reduction and Approximation . Society for In-dustrial and Applied Mathematics, Philadelphia, PA,2017. doi:10.1137/1.9781611974829 .[9] J.P. Berrut and L. N. Trefethen. Barycentric Lagrangeinterpolation.
SIAM Rev. , 46(3):501–517, aug 2004.[10] A. Carracedo Rodriguez and S. Gugercin. The p-AAA algorithm for data driven modeling of parametricdynamical systems. Technical report, arXiv preprintavailable at https://arxiv.org/abs/2003.06536 ,2020.[11] Y. Chahlaoui and P. Van Dooren. A collection ofbenchmark examples for model reduction of lineartime invariant dynamical systems. Technical Report2002–2, SLICOT Working Note, 2002. Available from .[12] F. Giri and E.-W. Bai, editors.
Block-oriented Nonlin-ear System Identification . Lecture Notes in Control andInformation Sciences. Springer-Verlag, London, 2010.
[13] G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins University Press, fourth edition, 2013. [14] I. V. Gosea and A. C. Antoulas. A two-sided iterative framework for model reduction of linear systems with quadratic output. In
Proceedings of the 58th Confer-ence on Decision and Control (CDC), December 11-13,Nice, France , 2019.[15] I. V. Gosea and S. G¨uttel. Algorithms forthe rational approximation of matrix-valued func-tions. Technical report, arXiv preprint available at https://arxiv.org/abs/2003.06410 , 2020.[16] A. Juditsky, H. Hjalmarsson, A. Benveniste, B. De-lyon, L. Ljung, J. Sj¨oberg, and Q. Zhang. Nonlinearblack-box models in system identification: Mathemati-cal foundations.
Automatica , 31(12):1725 – 1750, 1995.[17] D. S. Karachalios, I. V. Gosea, and A. C.Antoulas. On bilinear time domain identifica-tion and reduction in the Loewner framework.In
Model Reduction of Complex Dynamical Sys-tems , International Series of Numerical Mathemat-ics. Springer, 2020. accepted September 2020. URL: https://arxiv.org/abs/2003.08711 .[18] D. S. Karachalios, I. V. Gosea, and A. C. Antoulas.The Loewner framework for nonlinear identificationand reduction of Hammerstein cascaded dynamical sys-tems. In
Special Issue: 91st Annual Meeting of theInternational Association of Applied Mathematics andMechanics (GAMM) . Wiley, 2021.[19] P. Lietaert, J. P´erez, B. Vandereycken, andK. Meerbergen. Automatic rational approxima-tion and linearization of nonlinear eigenvalue prob-lems. Technical report, arXiv preprint available at https://arxiv.org/abs/1801.08622 , 2018.[20] Y. Nakatsukasa, O. Sete, and L. N. Trefethen. TheAAA algorithm for rational approximation.
SIAMJournal on Scientific Computing , 40(3):A1494–A1522,2018.[21] J. Nocedal and S. Wright.
Numerical optimization .Springer Science & Business Media, 2006.[22] R. Pulch. Model order reduction and low-dimensionalrepresentations for random linear dynamical systems.
Math. Comput. Simulat. , 144:1–20, 2018.[23] R. Pulch and A. Narayan. Balanced truncation formodel order reduction of linear dynamical systems with quadratic outputs.
SIAM Journal on Scientific Com-puting , 41(4):A2270–A2295, 2019.[24] A. Quarteroni, A. Manzoni, and F. Negri.
Reducedbasis methods for partial differential equations: an in-troduction , volume 92. Springer, 2015.[25] W J. Rugh.
Nonlinear System Theory - TheVolterra/Wiener Approach . University Press, Balti-more, MD, 1981.[26] G. Scarciotti and A. Astolfi. Data-driven model re-duction by moment matching for linear and nonlinearsystems.
Automatica , 79:340–351, 2017.[27] L.J. Tick. The estimation of ”transfer functions” ofquadratic systems.
Technometrics , 3(4):563–567, 1961.[28] R. Van Beeumen and K. Meerbergen.
Model reduc-tion by balanced truncation of linear systems with aquadratic output , pages 2033–2036. T.E. Simons, G.Psihoyios, Ch. Tsitouras (eds.), International Confer-ence on Numerical Analysis and Applied Mathematics(ICNAAM). American Institute of Physics, 2010.[29] R. Van Beeumen, K. Van Nimmen, G. Lombaert, andK. Meerbergen. Model reduction for dynamical sys-tems with quadratic output.
Int. J. Numer. Meth. En-gng. , 91:229–248, 2012.[30] N. Wiener.
Nonlinear problems in random theory . Wi-ley, New York, 1958.
A. Proof of Lemma 2
Substituting $s = \xi_i$ and $z = \hat\xi_j$ into (35) gives
\[ n_2(\xi_i, \hat\xi_j) = \sum_{k=1}^{n} \sum_{\ell=1}^{n} h_{k,\ell}\, w_k w_\ell\, P_{k,\ell}(\xi_i, \hat\xi_j) = \sum_{\ell=1}^{n} h_{i,\ell}\, w_i w_\ell\, P_{i,\ell}(\xi_i, \hat\xi_j) \]
and
\[ d_2(\xi_i, \hat\xi_j) = P(\xi_i, \hat\xi_j) + \sum_{k=1}^{n} w_k\, p_k(\xi_i)\, p(\hat\xi_j) + \sum_{\ell=1}^{n} w_\ell\, p_\ell(\hat\xi_j)\, p(\xi_i) + \sum_{k=1}^{n} \sum_{\ell=1}^{n} w_k w_\ell\, P_{k,\ell}(\xi_i, \hat\xi_j) = w_i\, p_i(\xi_i)\, p(\hat\xi_j) + \sum_{\ell=1}^{n} w_i w_\ell\, P_{i,\ell}(\xi_i, \hat\xi_j) ,
\]
where $P$, $p_k$, and $P_{k,\ell}$ are as given in (33). Hence, writing $r_2(\xi_i, \hat\xi_j) = n_2(\xi_i, \hat\xi_j)/d_2(\xi_i, \hat\xi_j)$, we have
\[ r_2(\xi_i, \hat\xi_j) = \frac{ \displaystyle \sum_{\ell=1}^{n} h_{i,\ell}\, w_i w_\ell\, P_{i,\ell}(\xi_i, \hat\xi_j) }{ w_i\, p_i(\xi_i)\, p(\hat\xi_j) + \displaystyle \sum_{\ell=1}^{n} w_i w_\ell\, P_{i,\ell}(\xi_i, \hat\xi_j) } . \tag{81} \]
Introduce the notation
\[ P_{L_i}(\xi_i, z) = \prod_{k=1,\, k \neq i}^{n} (\xi_i - \xi_k) \prod_{\ell=1}^{n} (z - \xi_\ell) = p_i(\xi_i)\, p(z) . \tag{82} \]
Since $P_{L_i}(\xi_i, \hat\xi_j) = P_{i,\ell}(\xi_i, \hat\xi_j)\, (\hat\xi_j - \xi_\ell)$ holds, we can write
\[ r_2(\xi_i, \hat\xi_j) = \frac{ \displaystyle \sum_{\ell=1}^{n} h_{i,\ell}\, w_i w_\ell\, \frac{ P_{L_i}(\xi_i, \hat\xi_j) }{ \hat\xi_j - \xi_\ell } }{ w_i\, P_{L_i}(\xi_i, \hat\xi_j) + \displaystyle \sum_{\ell=1}^{n} w_i w_\ell\, \frac{ P_{L_i}(\xi_i, \hat\xi_j) }{ \hat\xi_j - \xi_\ell } } . \tag{83} \]
Cancelling $w_i\, P_{L_i}(\xi_i, \hat\xi_j)$ from both the numerator and the denominator of this expression proves the first desired result in (38). The proof for $r_2(\hat\xi_j, \xi_i)$ follows similarly.
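The cancellation in Lemma 2 can also be checked numerically: evaluating the full barycentric quotient (36) just off a support point should agree with the simplified single-sum expression (38). All data below are made up for illustration:

```python
import numpy as np

xi = np.array([1.0, 2.0, 3.0])           # support points
w = np.array([0.7, -0.4, 1.1])           # barycentric weights (made up)
H = np.arange(1.0, 10.0).reshape(3, 3)   # samples h_{k,l} (made up)
zhat = 7.5                               # a point from the LS set

def r2(s, z):
    """Full barycentric quotient (36) for the quadratic transfer function."""
    Cs, Cz = 1.0 / (s - xi), 1.0 / (z - xi)
    return (Cs @ (np.outer(w, w) * H) @ Cz) / ((1.0 + Cs @ w) * (1.0 + Cz @ w))

# Lemma 2: at s = xi_i the quotient collapses to a single sum; compare
# the full form evaluated just off the support point against (38).
i = 1
lhs = r2(xi[i] + 1e-8, zhat)
Cz = 1.0 / (zhat - xi)
rhs = (w * H[i]) @ Cz / (1.0 + w @ Cz)
assert np.allclose(lhs, rhs, rtol=1e-5)
```

The agreement reflects that the common factor $w_i P_{L_i}(\xi_i, \hat\xi_j)$ cancels exactly in the limit $s \to \xi_i$.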