Data-driven modeling of linear dynamical systems with quadratic output in the AAA framework
Ion Victor Gosea* and Serkan Gugercin†

*Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstr. 1, 39106 Magdeburg, Germany. Email: [email protected]

†Department of Mathematics and Computational Modeling and Data Analytics Division, Academy of Integrated Science, Virginia Tech, Blacksburg, VA 24061, USA. Email: [email protected]
We extend the AAA (Adaptive Antoulas-Anderson) algorithm to develop a data-driven modeling framework for linear systems with quadratic output (LQO). Such systems are characterized by two transfer functions: one corresponding to the linear part of the output and another one to the quadratic part. We first establish the joint barycentric representations and the interpolation theory for the two transfer functions of LQO systems. This analysis leads to the proposed AAA-LQO algorithm. We show that by interpolating the transfer function values on a subset of samples, together with imposing a least-squares minimization on the rest, we construct reliable data-driven LQO models. Two numerical test cases illustrate the efficiency of the proposed method.
Keywords: data-driven modeling, nonlinear dynamics, rational approximation, barycentric form, interpolation.
1. Introduction
Model order reduction (MOR) is used to approximate large-scale dynamical systems with smaller ones that ideally have similar response characteristics to the original one. This has been an active research area and many approaches to MOR have been proposed. We refer the reader to [1, 3, 6, 8, 24, 26] and the references therein for an overview of MOR methods for both linear and nonlinear dynamical systems.

MOR, as the name implies, assumes access to a full-order model to be reduced, in most cases in the form of a state-space formulation obtained via, e.g., a spatial discretization of the underlying partial differential equations. The reduced-order quantities are then computed via an explicit projection of the full-order quantities. However, in some cases, access to the full-order dynamics is not available. Instead, one has access to a collection of input/output measurements. In this case, the goal is to construct the approximation directly from data, which we refer to as data-driven modeling. This is the framework we consider in this paper.

Specifically, we focus on data-driven modeling of linear dynamical systems with quadratic output (LQO). In our formulation, data correspond to frequency-domain samples of the input/output mapping of the underlying LQO system, in the form of samples of its two transfer functions; the first transfer function is a single-variable one and the second a bivariate one. For this data set, the proposed framework first develops the barycentric rational interpolation theory for LQO systems to interpolate a subset of the data, and then extends the AAA algorithm [20] to this setting by minimizing a least-squares measure over the remaining data.

We note that system identification of general nonlinear systems has been a popular topic. In particular, we mention here the special case of identifying linear systems with nonlinear output or input functions, e.g., the so-called Wiener [30] and Hammerstein models, respectively. Significant effort has been allocated to the identification of such models; see, e.g., [16], [12] and the references therein. Nevertheless, the methods previously mentioned are based in the time domain, while in this paper we focus on frequency-domain data. We point out that the frequency-data-based Loewner framework was recently extended to identifying Hammerstein models.
Preprint (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg). 2021-03-01 . V. Gosea, S. Gugercin: The AAA framework for linear dynamical systems with quadratic output
The rest of the paper is organized as follows. We introduce LQO systems and their transfer functions in Section 2, followed by a review of barycentric rational approximation and the AAA algorithm in Section 3. Next, we develop the theory for barycentric representation and multivariate interpolation for LQO systems in Section 4. Based on this analysis, in Section 5 we present the proposed algorithm, AAA-LQO, for data-driven modeling of LQO systems. The numerical experiments are given in Section 6, followed by the conclusions in Section 7.
2. Linear systems with quadratic output
In state-space form, linear dynamical systems with quadratic output (LQO systems) are described as
$$\Sigma_{\rm LQO}:\ \begin{cases} \dot{x}(t) = A x(t) + b\, u(t),\\[2pt] y(t) = c^T x(t) + K\,[x(t) \otimes x(t)], \end{cases} \qquad (1)$$
where $A \in \mathbb{R}^{N\times N}$, $b, c \in \mathbb{R}^N$, $K \in \mathbb{R}^{1\times N^2}$, and the symbol $\otimes$ denotes the Kronecker product, i.e., for the vector $x = [x_1\ x_2\ \cdots\ x_N]^T \in \mathbb{R}^N$, we have $x \otimes x = [x_1^2\ x_1 x_2\ \cdots\ x_1 x_N\ \cdots\ x_N^2]^T \in \mathbb{R}^{N^2}$. The quadratic part of the output in (1), $K[x(t)\otimes x(t)]$, can be rewritten as $x^T(t)\, M\, x(t)$ with $M \in \mathbb{R}^{N\times N}$ and $K = [\mathrm{vec}(M)]^T$, where $\mathrm{vec}$ denotes the vectorization operation. In some cases we have $c = 0$ in (1), and thus the output has only the quadratic term.

Several projection-based MOR methodologies have already been proposed for LQO systems. More precisely, balanced truncation-type methods were considered in [7, 23, 28], while interpolation-based methods were used in [14, 29]. All these methods are intrusive, meaning that they explicitly work with the state-space matrices $A$, $b$, $c$, and $K$ in (1). The main goal of this work is to develop a data-driven modeling framework for LQO systems where only input-output measurements, in the form of transfer function evaluations, are needed, as opposed to the internal state-space representation. Therefore, our first goal is to derive transfer functions for this special class of dynamical systems.
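The rewriting of the quadratic output term $K[x \otimes x]$ as the bilinear form $x^T M x$ is easy to check numerically. The following is a minimal sketch with randomly generated toy values (dimensions and data are illustrative, not from the paper):

```python
import numpy as np

# Toy check of the identity K [x ⊗ x] = x^T M x with K = vec(M)^T.
# All values below are randomly generated for illustration only.
rng = np.random.default_rng(0)
N = 4
M = rng.standard_normal((N, N))
x = rng.standard_normal(N)

K = M.flatten(order="F")          # vec(M): stack the columns of M
quad_via_K = K @ np.kron(x, x)    # quadratic output term in (1)
quad_via_M = x @ M @ x            # equivalent bilinear form
assert np.isclose(quad_via_K, quad_via_M)
```

Since the output is a scalar, the identity holds regardless of whether $M$ is symmetric (for a scalar, $x^T M^T x = x^T M x$).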
Many classes of nonlinear systems can be represented in the time domain by generalized kernels, as in the classical Wiener or Volterra series representations. Generically, an infinite number of kernels appear in such series, corresponding to each homogeneous subsystem. For more details we refer the reader to [25, 30].

For the LQO system (1), the nonlinearity is present in the state-to-output equation only, and one can write the input-output mapping of the system in the frequency domain using two transfer functions:

1. one corresponding to the linear part of the output, i.e., $y_1(t) = c^T x(t)$;
2. one corresponding to the quadratic part of the output, i.e., $y_2(t) = K\,(x(t) \otimes x(t))$.

These transfer functions were recently derived in [14] using their time-domain representations. In the next result, we introduce and re-derive them for the completeness of the paper and to illustrate to the reader how they naturally appear.

Lemma 1. Consider the LQO system in (1) with $x(0) = 0$. Let the input $u(t)$ be a sum of $J$ harmonic terms, i.e.,
$$u(t) = \sum_{j=1}^{J} e^{\mathrm{i}\omega_j t}, \quad \text{where } \omega_j > 0 \text{ for } j = 1, 2, \ldots, J, \qquad (2)$$
and $\mathrm{i}^2 = -1$. Then, the output $y(t)$ is given by
$$y(t) = \sum_{j=1}^{J} H_1(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t} + \sum_{j=1}^{J} \sum_{\ell=1}^{J} H_2(\mathrm{i}\omega_j, \mathrm{i}\omega_\ell)\, e^{\mathrm{i}(\omega_j + \omega_\ell) t}, \qquad (3)$$
where
$$H_1(s) = c^T (s I_N - A)^{-1} b \qquad (4)$$
is the single-variable rational transfer function corresponding to $y_1(t)$, and
$$H_2(s, z) = K\left[ (s I_N - A)^{-1} b \otimes (z I_N - A)^{-1} b \right] \qquad (5)$$
is the two-variable rational transfer function corresponding to $y_2(t)$, with $I_N$ denoting the identity matrix of size $N \times N$.

Proof. For the input $u(t)$ in (2) with $x(0) = 0$, the steady-state solution of the linear state equation in (1) can be written as a sum of scaled complex exponentials:
$$x(t) = \sum_{j=1}^{J} G(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t}, \qquad (6)$$
where $G(s) = (s I_N - A)^{-1} b$. Substituting (6) into the output equation in (1) yields
$$y(t) = c^T \sum_{j=1}^{J} G(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t} + K \left[ \sum_{j=1}^{J} G(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t} \right] \otimes \left[ \sum_{\ell=1}^{J} G(\mathrm{i}\omega_\ell)\, e^{\mathrm{i}\omega_\ell t} \right] = \sum_{j=1}^{J} c^T G(\mathrm{i}\omega_j)\, e^{\mathrm{i}\omega_j t} + \sum_{j=1}^{J} \sum_{\ell=1}^{J} K \left[ G(\mathrm{i}\omega_j) \otimes G(\mathrm{i}\omega_\ell) \right] e^{\mathrm{i}(\omega_j + \omega_\ell) t}. \qquad (7)$$
Substituting $G(s) = (s I_N - A)^{-1} b$ back into the last equation yields the desired result (3), with $H_1(s)$ and $H_2(s, z)$ as defined in (4) and (5).

Lemma 1 shows that the LQO system (1) is characterized by two transfer functions, namely $H_1(s)$ (corresponding to the linear component $y_1(t)$ of the output) and $H_2(s, z)$ (corresponding to the quadratic component $y_2(t)$ of the output). As in the classical linear case, $H_1(s)$ is a rational function in a single variable. On the other hand, $H_2(s, z)$ is also a rational function, but of two variables.
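Given any state-space realization, the two transfer functions (4) and (5) can be evaluated directly with linear solves. The sketch below uses randomly generated toy matrices (all values are illustrative) and also checks the bilinear-form reading of $H_2$, namely $H_2(s,z) = G(s)^T M\, G(z)$, under the row-major flattening convention chosen in the snippet:

```python
import numpy as np

# Toy realization (values illustrative): evaluate H1(s) and H2(s, z) of
# (4)-(5) via linear solves, and check the bilinear-form identity
# H2(s, z) = G(s)^T M G(z), with K taken as the row-major flattening of M.
rng = np.random.default_rng(1)
N = 5
A = -2.0 * np.eye(N) + 0.1 * rng.standard_normal((N, N))
b = rng.standard_normal(N)
c = rng.standard_normal(N)
M = rng.standard_normal((N, N))
K = M.flatten()                      # row-major: K @ (u ⊗ v) = u^T M v

def G(s):
    """Resolvent applied to the input vector: (s I - A)^{-1} b."""
    return np.linalg.solve(s * np.eye(N) - A, b)

def H1(s):
    return c @ G(s)                  # single-variable transfer function (4)

def H2(s, z):
    return K @ np.kron(G(s), G(z))   # two-variable transfer function (5)

s0, z0 = 2.0j, 3.0j
assert np.isclose(H2(s0, z0), G(s0) @ M @ G(z0))
```

Sampling $H_1$ and $H_2$ on a grid of frequency points in this fashion produces exactly the kind of data set used in the sections that follow.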
These two transfer functions, which fully describe the LQO system (1), will play the fundamental role in our analysis to extend barycentric interpolation and AAA to the LQO setting. Before we establish the theory for LQO systems, we briefly review the AAA algorithm for linear systems in Section 3.
Remark 1. In the proposed framework, we require sampling the two transfer functions $H_1(s)$ and $H_2(s, z)$. As shown in Lemma 1, this can be achieved by exciting the system (as a black box) with purely oscillatory control inputs, measuring the outputs, and performing a Fourier transformation. For more details on such procedures in similar settings, we refer the reader to [17]. We also note that [27] examines systems described by two time-domain kernels together with their Fourier transformations (viewed as transfer functions) and their measurements. Even though no explicit representation of these functions in terms of a state-space realization is considered there, those ideas equally apply to sampling $H_1(s)$ and $H_2(s, z)$.

Remark 2. Note that in the special case $K = \alpha\,(c^T \otimes c^T)$, where $\alpha$ is a scalar, we obtain $y_2(t) = \alpha\, y_1(t)^2$. Therefore, in this case the output $y(t)$ is a quadratic polynomial in the linear output $y_1(t)$, and the LQO model can be interpreted as a Wiener model [30]. However, our focus here is on the general case of LQO systems without this special structure.
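The reduction in Remark 2 is easy to verify numerically; a minimal sketch with illustrative toy vectors:

```python
import numpy as np

# Remark 2: with K = alpha * (c^T ⊗ c^T), the quadratic output term
# K (x ⊗ x) reduces to alpha * (c^T x)^2.  Toy values for illustration.
rng = np.random.default_rng(2)
c = rng.standard_normal(4)
x = rng.standard_normal(4)
alpha = 0.7

K = alpha * np.kron(c, c)         # the row vector c^T ⊗ c^T, scaled by alpha
assert np.isclose(K @ np.kron(x, x), alpha * (c @ x) ** 2)
```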
3. Barycentric rational approximation for linear systems and the AAA algorithm
For an underlying function $H(\cdot): \mathbb{C} \to \mathbb{C}$, e.g., the transfer function of a single-input/single-output (SISO) linear dynamical system, assume the following set of measurements:
$$\{H(s_i)\} \subset \mathbb{C}, \quad \text{where } s_i \in \mathbb{C} \text{ for } i = 1, 2, \ldots, N_s. \qquad (8)$$
Partition the sampling points into two disjoint sets:
$$\{s_1, \ldots, s_{N_s}\} = \{\xi_1, \ldots, \xi_n\} \cup \{\hat{\xi}_1, \ldots, \hat{\xi}_{N_s - n}\} \overset{\mathrm{def}}{=} \xi \cup \hat{\xi}. \qquad (9)$$
We will clarify later how this partitioning is chosen. Based on (9), define the sampled values
$$h_i \overset{\mathrm{def}}{=} H(\xi_i) \ \text{ for } i = 1, \ldots, n, \quad \text{and} \quad \hat{h}_i \overset{\mathrm{def}}{=} H(\hat{\xi}_i) \ \text{ for } i = 1, \ldots, N_s - n, \qquad (10)$$
and the corresponding data sets
$$h \overset{\mathrm{def}}{=} \{h_1, \ldots, h_n\} \quad \text{and} \quad \hat{h} \overset{\mathrm{def}}{=} \{\hat{h}_1, \ldots, \hat{h}_{N_s - n}\}. \qquad (11)$$
Define the rational function $r(s)$ in barycentric form [9], a numerically stable representation of rational functions:
$$r(s) = \frac{p(s)}{q(s)} = \sum_{k=1}^{n} \frac{w_k h_k}{s - \xi_k} \Bigg/ \left( 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} \right), \qquad (12)$$
where $\xi_k \in \mathbb{C}$ are the sampling (support) points and the weights $w_k \in \mathbb{C}$ are to be determined. (With the addition of 1 to the denominator, we guarantee that $r(s)$ is a strictly proper rational function with numerator degree $n-1$ and denominator degree $n$. This is done in anticipation of the dynamical system (1) we aim to approximate, where there is no direct input-to-output mapping. This is not a restriction, and the numerator and denominator degrees can be chosen in a different way [9, 20].) By construction, the rational function $r(s)$ in (12) is a rational interpolant at the support point set $\xi$, i.e.,
$$r(\xi_k) = h_k \quad \text{for } k = 1, 2, \ldots, n, \qquad (13)$$
assuming $w_k \neq 0$. Then, the freedom in choosing the weights $\{w_k\}$ can be used to match the remaining data $\hat{h}$ in an appropriate measure. Assuming enough degrees of freedom, [2] chooses the weights $\{w_k\}$ to enforce interpolation of $\hat{h}$ as well, by computing a rational function interpolating the full data (8). We skip the details of the conditions that guarantee the existence and uniqueness of such a rational interpolant and refer the reader to [2, 3].

The AAA (Adaptive Antoulas-Anderson) algorithm [20], on the other hand, elegantly combines interpolation and least-squares (LS) fitting. In the barycentric form (12), which interpolates the data $h$ by construction, AAA chooses the weights $\{w_k\}$ to minimize an LS error over the data $\hat{h}$. Note that the LS problem over $\hat{h}$ is nonlinear in the weights $\{w_k\}$, since these weights appear in the denominator of $r(s)$ as well. AAA solves a relaxed, linearized LS problem instead. For a sampling point $\hat{\xi}_i$ in the set $\hat{\xi}$, AAA uses the linearization
$$\hat{h}_i - r(\hat{\xi}_i) = \frac{1}{q(\hat{\xi}_i)} \left( \hat{h}_i\, q(\hat{\xi}_i) - p(\hat{\xi}_i) \right) \rightsquigarrow \hat{h}_i\, q(\hat{\xi}_i) - p(\hat{\xi}_i), \qquad (14)$$
leading to the linearized LS problem
$$\min_{w_1, \ldots, w_n}\ \sum_{i=1}^{N_s - n} \left| \hat{h}_i\, q(\hat{\xi}_i) - p(\hat{\xi}_i) \right|^2. \qquad (15)$$
AAA is an iterative algorithm and builds the partitioning (9) using a greedy search. Assume that in step $n$, AAA has the rational approximant $r(s)$ as in (12) corresponding to the partitioning (9), where the weights $\{w_k\}$ are selected by solving (15). AAA updates (9) via a greedy search by finding $\hat{\xi}_i \in \hat{\xi}$ for which the error $|r(\hat{\xi}_i) - \hat{h}_i|$ is the largest. This sampling point is then added to the interpolation set $\xi$, the barycentric rational approximant $r(s)$ in (12) is updated accordingly (its degree is now one higher), and the new weights are computed, as before, by solving the linearized LS problem. The procedure is repeated until either a desired order or an error tolerance is reached. For further details, we refer the reader to the original source [20]. The AAA algorithm has proved very flexible and effective, and has been employed in various applications such as rational approximation over disconnected domains [20], solving nonlinear eigenvalue problems [19], modeling of parametrized dynamics [10], and approximation of matrix-valued functions [15].
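The linearized LS step (14)-(15) for the strictly proper barycentric form (12) can be sketched in a few lines. The target function, sampling points, and partition below are toy choices for illustration; a full AAA implementation would additionally run the greedy support-point selection described above:

```python
import numpy as np

def H(s):
    """Toy rational target (illustrative, not from the paper)."""
    return 1.0 / (s + 1.0) + 2.0 / (s + 3.0)

xi = np.array([0.5, 1.5, 2.5])            # support (interpolation) points
xi_hat = np.linspace(0.0, 4.0, 21)        # remaining sampling points
xi_hat = xi_hat[~np.isin(xi_hat, xi)]     # keep the two sets disjoint

h, h_hat = H(xi), H(xi_hat)

# The linearized residual at xi_hat_i is
#   -( sum_k w_k (h_hat_i - h_k)/(xi_hat_i - xi_k) + h_hat_i ),
# so the relaxed problem is the linear LS problem  min_w || L w + h_hat ||_2.
L = (h_hat[:, None] - h[None, :]) / (xi_hat[:, None] - xi[None, :])
w, *_ = np.linalg.lstsq(L, -h_hat, rcond=None)

def r(s):
    """Barycentric form (12) with the computed weights."""
    cs = 1.0 / (s - xi)
    return (cs @ (w * h)) / (1.0 + cs @ w)

# The LS solution can be no worse than any other weight vector.
assert np.linalg.norm(L @ w + h_hat) <= np.linalg.norm(L @ np.ones(3) + h_hat) + 1e-12
```

In the full AAA iteration, the point $\hat{\xi}_i$ with the largest error $|r(\hat{\xi}_i) - \hat{h}_i|$ would next be moved from $\hat{\xi}$ into $\xi$ and the weights recomputed.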
4. Barycentric representations for LQO systems
To develop interpolating barycentric forms for $H_1(s)$ and $H_2(s, z)$, we first need to specify the data corresponding to the underlying LQO system $\Sigma_{\rm LQO}$. The first transfer function $H_1(s)$ of $\Sigma_{\rm LQO}$ is a regular single-variable rational function and, as in Section 3, we sample $H_1(s)$ at the distinct points $\{s_1, \ldots, s_{N_s}\}$ to obtain the data set
$$\{H_1(s_i)\} \subset \mathbb{C}, \quad \text{where } s_i \in \mathbb{C} \text{ for } i = 1, 2, \ldots, N_s. \qquad (16)$$
The second transfer function $H_2(s, z)$, on the other hand, is a function of two variables. Therefore, in agreement with the data (16), we sample $H_2(s, z)$ on the corresponding rectangular grid:
$$\{H_2(s_i, s_j)\} \subset \mathbb{C}, \quad \text{where } s_i, s_j \in \mathbb{C} \text{ for } i, j = 1, 2, \ldots, N_s. \qquad (17)$$
Partition the full set of sampling points into two disjoint sets
$$\{s_1, \ldots, s_{N_s}\} = \{\xi_1, \ldots, \xi_n\} \cup \{\hat{\xi}_1, \ldots, \hat{\xi}_{N_s - n}\} = \xi \cup \hat{\xi} \qquad (18)$$
and define the sampled values (measurements):
$$h_i \overset{\mathrm{def}}{=} H_1(\xi_i) \quad \text{for } i = 1, 2, \ldots, n \qquad (19)$$
and
$$h_{i,j} \overset{\mathrm{def}}{=} H_2(\xi_i, \xi_j) \quad \text{for } i, j = 1, 2, \ldots, n. \qquad (20)$$
Then, the goal is to construct a data-driven LQO system directly from these samples without access to the internal dynamics of $\Sigma_{\rm LQO}$. The partition (18) and the error measure used in approximating the data will be clarified later. First we show how the data in (16) and (17) can be used to develop barycentric-like representations corresponding to a reduced LQO system. We use the notation $r_1(s)$ to denote the rational approximation to $H_1(s)$, and $r_2(s, z)$ for that to $H_2(s, z)$.

Proposition 1. Given the $H_1(s)$ samples in (16), pick the nonzero weights $\{w_1, w_2, \ldots, w_n\}$. Then, the barycentric rational function
$$r_1(s) = \frac{n_1(s)}{d_1(s)} = \sum_{k=1}^{n} \frac{w_k h_k}{s - \xi_k} \Bigg/ \left( 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} \right) \qquad (21)$$
interpolates the data in (19). Let $e \in \mathbb{C}^n$ denote the vector of ones. Define the matrices
$$\hat{b} = [w_1\ w_2\ \ldots\ w_n]^T \in \mathbb{C}^n, \quad \Xi = \mathrm{diag}(\xi_1, \ldots, \xi_n) \in \mathbb{C}^{n \times n}, \quad \hat{A} = \Xi - \hat{b} e^T \in \mathbb{C}^{n \times n}, \quad \text{and} \quad \hat{c}^T = [h_1\ h_2\ \ldots\ h_n] \text{ with } \hat{c} \in \mathbb{C}^n. \qquad (22)$$
Then, $r_1(s)$ has the state-space form
$$r_1(s) = \hat{c}^T (s \hat{I} - \hat{A})^{-1} \hat{b}, \qquad (23)$$
where $\hat{I}$ is the identity matrix of dimension $n \times n$.

Proof. The fact that $r_1(s)$ is an interpolating rational function for the data (19) is just a restatement of (13), included for completeness. To prove (23), we use the Sherman-Morrison formula [13]: let $M \in \mathbb{C}^{n \times n}$ be invertible and $u, v \in \mathbb{C}^n$ be such that $1 + v^* M^{-1} u \neq 0$. Then,
$$(M + u v^*)^{-1} = M^{-1} - \frac{M^{-1} u v^* M^{-1}}{1 + v^* M^{-1} u}. \qquad (24)$$
From (22) and (23), we have
$$r_1(s) = \hat{c}^T (s \hat{I} - \hat{A})^{-1} \hat{b} = \hat{c}^T \left[ (s I - \Xi) + \hat{b} e^T \right]^{-1} \hat{b}. \qquad (25)$$
To simplify the notation, let $\hat{\Phi}_s = s I_n - \Xi$. Then, applying the Sherman-Morrison formula to the middle term in (25) with $M = \hat{\Phi}_s$, $u = \hat{b}$, and $v = e$, we obtain
$$r_1(s) = \hat{c}^T (\hat{\Phi}_s + \hat{b} e^T)^{-1} \hat{b} = \hat{c}^T \left( \hat{\Phi}_s^{-1} - \frac{\hat{\Phi}_s^{-1} \hat{b}\, e^T \hat{\Phi}_s^{-1}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}} \right) \hat{b} = \hat{c}^T \left( \hat{\Phi}_s^{-1} \hat{b} - \frac{\hat{\Phi}_s^{-1} \hat{b} \cdot e^T \hat{\Phi}_s^{-1} \hat{b}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}} \right) = \frac{\hat{c}^T \hat{\Phi}_s^{-1} \hat{b}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}}. \qquad (26)$$
Since $\Xi$ is diagonal,
$$\hat{\Phi}_s^{-1} = (s I - \Xi)^{-1} = \mathrm{diag}\!\left( (s - \xi_1)^{-1}, \ldots, (s - \xi_n)^{-1} \right).$$
Then, using the definitions of $\hat{b}$ and $\hat{c}$ in (22), we obtain
$$\hat{c}^T \hat{\Phi}_s^{-1} \hat{b} = \sum_{k=1}^{n} \frac{w_k h_k}{s - \xi_k} \quad \text{and} \quad e^T \hat{\Phi}_s^{-1} \hat{b} = \sum_{k=1}^{n} \frac{w_k}{s - \xi_k}. \qquad (27)$$
Substituting these last two equalities into (26) yields (23).

We note that state-space realizations of rational functions are unique up to a similarity transformation. For other equivalent state-space representations of a barycentric form, we refer the reader to, e.g., [5, 19].

Given the samples of $H_1(s)$ (the data in (19)) of the LQO system (1), Proposition 1 constructs the linear part of the data-driven
LQO model directly from these samples. What we need to achieve next is to use the $H_2(s,z)$ samples (the data in (20)) to construct a two-variable rational function $r_2(s,z)$ in a barycentric-like form corresponding to the quadratic part of the data-driven LQO model. However, $r_2(s,z)$ cannot be constructed independently from $r_1(s)$. Once $r_2(s,z)$ is constructed, we should be able to interpret $r_1(s)$ and $r_2(s,z)$ as the linear and quadratic transfer functions of a single LQO system. This is the precise reason that we cannot simply view $r_2(s,z)$ as an independent two-variable rational function and use the classical multivariate barycentric form [3, 4]. Therefore, $r_2(s,z)$ needs to have the form
$$r_2(s,z) = \hat{K} \left[ (s\hat{I} - \hat{A})^{-1} \hat{b} \otimes (z\hat{I} - \hat{A})^{-1} \hat{b} \right],$$
where $\hat{A}$ and $\hat{b}$ are the same matrices from (22) used in modeling $r_1(s)$, and $\hat{K} \in \mathbb{C}^{1 \times n^2}$ is the (quadratic) free variable that will be used to model the new data (20). The next result achieves this goal.

Theorem 1. Assume the set-up in Proposition 1, and further assume that the $H_2(s,z)$ samples in (17) are given. Define the two-variable function $r_2(s,z)$ in a barycentric-like form:
$$r_2(s,z) = \frac{\displaystyle \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{h_{k,\ell}\, w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}}{\displaystyle 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} + \sum_{\ell=1}^{n} \frac{w_\ell}{z - \xi_\ell} + \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}}. \qquad (28)$$
Then, $r_2(s,z)$ interpolates the data (20), i.e.,
$$r_2(\xi_i, \xi_j) = H_2(\xi_i, \xi_j) \quad \text{for } i, j = 1, \ldots, n. \qquad (29)$$
Define $\hat{M} \in \mathbb{C}^{n \times n}$ and $\hat{K} \in \mathbb{C}^{1 \times n^2}$ via
$$(\hat{M})_{i,j} = h_{i,j} \quad \text{for } i, j = 1, 2, \ldots, n \qquad (30)$$
and
$$\hat{K} = [\mathrm{vec}(\hat{M})]^T. \qquad (31)$$
Then, $r_2(s,z)$ has the state-space form
$$r_2(s,z) = \hat{K} \left[ (s\hat{I} - \hat{A})^{-1} \hat{b} \otimes (z\hat{I} - \hat{A})^{-1} \hat{b} \right]. \qquad (32)$$

Proof. To prove the interpolation property (29) of the barycentric representation (28), inspired by the linear case, we start by introducing various polynomials in one or two variables:
$$p(s) = \prod_{k=1}^{n} (s - \xi_k), \quad p(z) = \prod_{\ell=1}^{n} (z - \xi_\ell), \quad p_i(s) = \prod_{k=1, k \neq i}^{n} (s - \xi_k), \quad p_j(z) = \prod_{\ell=1, \ell \neq j}^{n} (z - \xi_\ell),$$
$$P(s,z) = p(s)\, p(z), \quad \text{and} \quad P_{i,j}(s,z) = p_i(s)\, p_j(z), \qquad (33)$$
for $i, j = 1, \ldots, n$. Multiplying both the numerator and denominator of $r_2(s,z)$ in (28) by $P(s,z)$ yields
$$r_2(s,z) = \frac{n_2(s,z)}{d_2(s,z)}, \qquad (34)$$
with
$$n_2(s,z) = \sum_{k=1}^{n} \sum_{\ell=1}^{n} h_{k,\ell}\, w_k w_\ell\, P_{k,\ell}(s,z), \qquad d_2(s,z) = P(s,z) + \sum_{k=1}^{n} w_k\, p_k(s)\, p(z) + \sum_{\ell=1}^{n} w_\ell\, p_\ell(z)\, p(s) + \sum_{k=1}^{n} \sum_{\ell=1}^{n} w_k w_\ell\, P_{k,\ell}(s,z). \qquad (35)$$
Then, evaluating $r_2(s,z)$ at $s = \xi_i$ and $z = \xi_j$ gives
$$r_2(\xi_i, \xi_j) = \frac{n_2(\xi_i, \xi_j)}{d_2(\xi_i, \xi_j)} = \frac{h_{i,j}\, w_i w_j\, P_{i,j}(\xi_i, \xi_j)}{w_i w_j\, P_{i,j}(\xi_i, \xi_j)} = h_{i,j}.$$
To prove (32), we first note that
$$r_2(s,z) = \hat{K} \left[ (s\hat{I} - \hat{A})^{-1} \hat{b} \otimes (z\hat{I} - \hat{A})^{-1} \hat{b} \right] = \hat{K} \left[ \frac{\hat{\Phi}_s^{-1} \hat{b}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}} \otimes \frac{\hat{\Phi}_z^{-1} \hat{b}}{1 + e^T \hat{\Phi}_z^{-1} \hat{b}} \right],$$
where we used the fact that
$$(s\hat{I} - \hat{A})^{-1} \hat{b} = (\hat{\Phi}_s + \hat{b} e^T)^{-1} \hat{b} = \frac{\hat{\Phi}_s^{-1} \hat{b}}{1 + e^T \hat{\Phi}_s^{-1} \hat{b}},$$
as shown in deriving (26). Since $\hat{\Phi}_s$ is diagonal, we have
$$r_2(s,z) = \frac{\hat{K}}{(1 + e^T \hat{\Phi}_s^{-1} \hat{b})(1 + e^T \hat{\Phi}_z^{-1} \hat{b})} \left[ \begin{array}{c} \frac{w_1}{s - \xi_1} \\ \vdots \\ \frac{w_n}{s - \xi_n} \end{array} \right] \otimes \left[ \begin{array}{c} \frac{w_1}{z - \xi_1} \\ \vdots \\ \frac{w_n}{z - \xi_n} \end{array} \right].$$
Then, using the definition of $\hat{K}$ in (31) together with the second formula in (27), we obtain
$$r_2(s,z) = \frac{\displaystyle \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{h_{k,\ell}\, w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}}{\displaystyle \left( 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} \right)\left( 1 + \sum_{\ell=1}^{n} \frac{w_\ell}{z - \xi_\ell} \right)} \qquad (36)$$
$$= \frac{\displaystyle \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{h_{k,\ell}\, w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}}{\displaystyle 1 + \sum_{k=1}^{n} \frac{w_k}{s - \xi_k} + \sum_{\ell=1}^{n} \frac{w_\ell}{z - \xi_\ell} + \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{w_k w_\ell}{(s - \xi_k)(z - \xi_\ell)}},$$
which concludes the proof.

The next result follows from Proposition 1 and Theorem 1.

Corollary 1.
Assume the set-up in Proposition 1 and Theorem 1. Then, the interpolating rational functions $r_1(s)$ and $r_2(s,z)$ together correspond to an interpolatory LQO model
$$\hat{\Sigma}_{\rm LQO}:\ \begin{cases} \dot{\hat{x}}(t) = \hat{A}\hat{x}(t) + \hat{b}\,u(t), \\[2pt] \hat{y}(t) = \hat{c}^T \hat{x}(t) + \hat{K}\,[\hat{x}(t) \otimes \hat{x}(t)]. \end{cases} \qquad (37)$$
In other words, the first (linear) transfer function of $\hat{\Sigma}_{\rm LQO}$ is $r_1(s)$ and its second transfer function is $r_2(s,z)$.

Recall the partitioning of the sampling points in (18). In Theorem 1, we have shown that $r_2(s,z)$ interpolates $H_2(s,z)$ over the sampling set $\xi \times \xi$. What is the value of $r_2(s,z)$ over the mixed sampling sets $\xi \times \hat{\xi}$ and $\hat{\xi} \times \xi$? Even though we do not enforce interpolation over these sets, in Section 5 we will need a closed-form expression for the value of $r_2(s,z)$ over $\xi \times \hat{\xi}$ and $\hat{\xi} \times \xi$. The next lemma establishes these results.

Lemma 2.
Let $r_2(s,z)$ be as defined in (28), corresponding to the sampling points in (18) and the data in (17). Then,
$$r_2(\xi_i, \hat{\xi}_j) = \frac{\displaystyle \sum_{\ell=1}^{n} \frac{w_\ell\, h_{i,\ell}}{\hat{\xi}_j - \xi_\ell}}{\displaystyle 1 + \sum_{\ell=1}^{n} \frac{w_\ell}{\hat{\xi}_j - \xi_\ell}} \quad \text{and} \quad r_2(\hat{\xi}_j, \xi_i) = \frac{\displaystyle \sum_{k=1}^{n} \frac{w_k\, h_{k,i}}{\hat{\xi}_j - \xi_k}}{\displaystyle 1 + \sum_{k=1}^{n} \frac{w_k}{\hat{\xi}_j - \xi_k}}, \qquad (38)$$
for $i = 1, \ldots, n$ and $j = 1, \ldots, N_s - n$.

Proof. The proof is given in Appendix A.

It is important to note that the numerators and denominators of $r_2(\xi_i, \hat{\xi}_j)$ and $r_2(\hat{\xi}_j, \xi_i)$ in (38) are linear in the weights $w_\ell$. This is in contrast to the general form of $r_2(s,z)$ in (36), where both the numerator and denominator are quadratic in $w_\ell$ when evaluated over $\hat{\xi} \times \hat{\xi}$.
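Proposition 1, Theorem 1, and Lemma 2 can be sanity-checked numerically. The sketch below uses small illustrative values for the support points, weights, and samples (none taken from the paper) and verifies that the barycentric forms (21) and (28) match the state-space forms (23) and (32), and that the mixed-set formula (38) is the limiting value of (28) at a support point:

```python
import numpy as np

# Toy data (illustrative only): support points, weights, and samples.
n = 3
xi = np.array([1.0, 2.0, 3.0])               # support points xi_k
w = np.array([0.4, -0.3, 0.7])               # barycentric weights w_k
h1 = np.array([1.0, -2.0, 0.5])              # h_k   = H1(xi_k)
h2 = np.array([[1.0, 2.0, 0.5],              # h_{k,l} = H2(xi_k, xi_l)
               [2.0, -1.0, 0.3],
               [0.5, 0.3, 4.0]])

e = np.ones(n)
A_hat = np.diag(xi) - np.outer(w, e)         # (22): A_hat = Xi - b_hat e^T
# Row-major flattening chosen so that K_hat @ (u ⊗ v) = sum h2[k,l] u_k v_l.
K_hat = h2.flatten()

def res(s):
    """(s I - A_hat)^{-1} b_hat, with b_hat = w."""
    return np.linalg.solve(s * np.eye(n) - A_hat, w)

def r1(s):                                   # barycentric form (21)
    cs = w / (s - xi)
    return (cs @ h1) / (1.0 + cs.sum())

def r2(s, z):                                # barycentric-like form (28)/(36)
    cs, cz = w / (s - xi), w / (z - xi)
    return (cs @ h2 @ cz) / ((1.0 + cs.sum()) * (1.0 + cz.sum()))

def r2_mixed(i, z):                          # closed form (38) for r2(xi_i, z)
    cz = w / (z - xi)
    return (h2[i] @ cz) / (1.0 + cz.sum())

s, z = 0.5 + 1.0j, -0.2 + 2.0j
assert np.isclose(r1(s), h1 @ res(s))                         # checks (23)
assert np.isclose(r2(s, z), K_hat @ np.kron(res(s), res(z)))  # checks (32)
assert np.isclose(r2(xi[1] + 1e-8, 5.0), r2_mixed(1, 5.0), atol=1e-5)  # (38)
```

Note that the toy sample matrix `h2` is symmetric, so the flattening convention for `K_hat` does not affect the scalar result here.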
5. Proposed framework for data-driven modeling of LQO systems
Section 4 established the necessary ingredients to extend AAA to LQO systems. Given the measurements (16) and (17), Proposition 1 and Theorem 1 show how to construct the barycentric forms $r_1(s)$ and $r_2(s,z)$ interpolating these data in accordance with the partitioning (18). Furthermore, Corollary 1 states that $r_1(s)$ and $r_2(s,z)$ together correspond to an interpolatory LQO system. Based on these results, in this section we fully develop the AAA framework for LQO systems. The resulting algorithm is denoted by AAA-LQO.

AAA-LQO is an iterative algorithm, adding one degree of freedom to the current data-driven LQO model in every iteration step. In the $n$th step, $r_1^{(n)}(s)$ and $r_2^{(n)}(s,z)$ correspond to a data-driven order-$n$ LQO model for the partitioning of the sampling points in (18). First, for this current partitioning, in Section 5.1 we introduce an LS error measure that is used to choose the barycentric weights $\{w_k\}$ appearing in the definitions of $r_1^{(n)}(s)$ and $r_2^{(n)}(s,z)$ in (21) and (28). Then, in Section 5.2 we establish a greedy search procedure for updating the partitioning (18). The algorithm then continues with the LS minimization for the $(n+1)$th step to construct $r_1^{(n+1)}(s)$ and $r_2^{(n+1)}(s,z)$. AAA-LQO terminates after a desired error criterion is met or a maximum allowed order is reached, as explained in Section 5.3.

Even though Section 5.1 investigates the LS problem in the $n$th step of AAA-LQO, to simplify the notation we drop the superscript and use $r_1(s)$ and $r_2(s,z)$ instead; they should be understood as the approximants in the $n$th step. We will reintroduce the superscript in Section 5.2.
LQO data (16) and (17), we recall (and repeat)the partitioning of the sampling points as in (18): { s , . . . , s N s } = { ξ , . . . , ξ n } ∪ { ˆ ξ , . . . , ˆ ξ N s − n } = ξ ∪ ˆ ξ . (39) Then, r ( s ) interpolates H ( s ) over ξ (i.e., it interpolatesthe data (19)) and r ( s, z ) interpolates H ( s, z ) over ξ × ξ (i.e., it interpolates the data (20)). Also recall that to-gether, r ( s ) and r ( s, z ) correspond to a LQO system.The only remaining degrees of freedom in defining r ( s ) and r ( s, z ) , and thus the corresponding LQO system arethe barycentric weights { w , . . . , w n } . We will choose thoseweights to minimize an appropriate error measure in the uninterpolated data corresponding to the sampling pointsˆ ξ . We first introduce the notation for these uninterpolatedvalues :ˆ h i def ==== H ( ˆ ξ i ) for i = , , . . . , N s − n, (40)ˆ h (2,2) i,j def ==== H ( ˆ ξ i , ˆ ξ j ) for i, j = , , . . . , N s − n, (41)ˆ h (1,2) i,j def ==== H ( ξ i , ˆ ξ j ) for i = , . . . , n, j = , . . . , N s − n, (42)ˆ h (2,1) j,i def ==== H ( ˆ ξ j , ξ i ) for j = , . . . , N s − n, i = , . . . , n. (43)We denote with w ∈ C n the vector of weights to be deter-mined, as w = [ w w . . . w n ] T . A reasonable error measure to minimize is the LS distance inthe uninterpolated data, leading to the minimization prob- Since the evaluation of the uninterpolated H ( s, z ) values occurover three different sets, namely ξ × ˆ ξ , ˆ ξ × ξ , and ˆ ξ × ˆ ξ , we usea superscript to distinguish them. Recall that the interpolatedvalues h i,j = H ( ξ i , ξ j ) are over ξ × ξ only and thus the superscriptnotation is avoided for h i,j . lem min w ≠ ( J + J + J + J ) (44)where J = N s − n N s − n ∑ i = ( r ( ˆ ξ i ) − ˆ h i ) , (45) J = n ( N s − n ) n ∑ i = N s − n ∑ j = ( r ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j ) , (46) J = ( N s − n ) n N s − n ∑ i = n ∑ j = ( r ( ˆ ξ i , ξ j ) − ˆ h (2,1) j,i ) , and (47) J = ( N s − n ) N s − n ∑ i = N s − n ∑ j = ( r ( ˆ ξ i , ˆ ξ j ) − ˆ h (2,2) i,j ) . 
(48)As in the original AAA for linear dynamical systems, the LS problem (44) is nonlinear in w for LQO systems. Theformulation is more complicated here due to the additional r ( s, z ) term. To resolve this numerical difficulty, we willemploy a strategy, similar to the lineraziation step in (14),and solve a relaxed optimization problem. However, theresulting LS problem in our case will still be nonlinear, yetmuch easier to solve than (44). In the end, we will tackle theoriginal nonlinear LS problem (44) by solving a sequence ofquadratic LS problems. We note that in (45)-(48), we scaleevery error term J i with the number of data points in it. n In this section, we show how to relax the each term, J i ,in the nonlinear LS problem (44). The resulting problemwill then constitute a crucial component in the proposediterative algorithm (Section 5.3). Linearizing J : Note that the i th term of J in (45),namely r ( ˆ ξ i ) − ˆ h i , is the same as the term in (14) appear-ing in AAA . This is natural since r ( s ) corresponds to thelinear part of the LQO system. Therefore, we can linearize J similar to (14). Write r ( s ) as r ( s ) = n ( s )/ d ( s ) , asdefined in (21). Then, the i th term in (45) is linearized as r ( ˆ ξ i ) − ˆ h i = d ( ˆ ξ i ) ( n ( ˆ ξ i ) − ˆ h i d ( ˆ ξ i )) ↝ n ( ˆ ξ i ) − ˆ h i d ( ˆ ξ i ) . (49) Preprint (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg). 2021-03-01 . V. Gosea, S. Gugercin: The AAA framework for linear dynamical systems with quadratic output n ( s ) and d ( s ) from the definition of r ( s ) in(21) into (49), one obtains n ( ˆ ξ i ) − ˆ h i d ( ˆ ξ i ) = n ∑ k = w k h k ˆ ξ i − ξ k − ˆ h i ( + n ∑ k = w k ˆ ξ i − ξ k ) = n ∑ k = w k ( h k − ˆ h i ) ˆ ξ i − ξ k − ˆ h i . (50)For a matrix X , let ( X ) ij denote its ( ij ) th entry. Similarly,for a vector x , let ( x ) i denote its i th entry. Then, definethe Loewner matrix L ∈ C ( N s − n ) × n with ( L ) ik = ˆ h i − h k ˆ ξ i − ξ k , for i = , . . . 
, N s , k = , . . . , n, (51)and the vector ˆ h ∈ C N s − n with ( ˆ h ) i = ˆ h i . Then, N s − n ∑ i = ( n ( ˆ ξ i ) − ˆ h i d ( ˆ ξ i )) = ∥ L w + ˆ h ∥ . Therefore, the J term in (45) will be relaxed to J ⟿ N s − n ∥ L w + ˆ h ∥ . (52) Linearizing J and J : Now we extend the linearizationstrategy used in J , which only involved the single-variablefunction r ( s ) , to the error terms J and J , which in-volve r ( s, z ) . The closed-form expressions for r ( ξ i , ˆ ξ j ) and r ( ˆ ξ j , ξ i ) we derived in Lemma 2 will prove fundamen-tal in achieving these goals.We start with J . Write r ( s, z ) = n ( s, z )/ d ( s, z ) as in(34). Then, the linearizing the ( ij ) th term in (46) means r ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j = d ( ξ i , ˆ ξ j ) ( n ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j d ( ξ i , ˆ ξ j )) ↝ n ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j d ( ξ i , ˆ ξ j ) . (53)We substitute n ( ξ i , ˆ ξ j ) and d ( ξ i , ˆ ξ j ) from (38) into (53)to obtain n ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j d ( ξ i , ˆ ξ j ) = n ∑ ℓ = w ℓ h i,ℓ ˆ ξ j − ξ ℓ − ˆ h (1,2) i,j ( + n ∑ ℓ = w ℓ ˆ ξ j − ξ ℓ ) = − ⎛⎜⎝ n ∑ ℓ = w ℓ ( ˆ h (1,2) i,j − h i,ℓ ) ˆ ξ j − ξ ℓ + ˆ h (1,2) i,j ⎞⎟⎠ . (54)Define the indexing variable α ij = ( i − )( N s − n ) + j andlet ˆ h (1,2) ∈ C n ( N s − n ) be the vector defined such that ( ˆ h (1,2) ) α ij = ˆ h (1,2) i,j for 1 ⩽ i ⩽ n and 1 ⩽ j ⩽ N s − n. (55) Define the Loewner matrix L ( , ) ∈ C n ( N s − n ) × n with entries ( L (1,2) ) α ij ℓ = ˆ h (1,2) i,j − h i,ℓ ˆ ξ j − ξ ℓ , (56)for 1 ⩽ i ⩽ n, ⩽ j ⩽ N s − n , and 1 ⩽ ℓ ⩽ n . Then, using(56) and (55) in (54), we obtain n ∑ i = N s − n ∑ j = ( n ( ξ i , ˆ ξ j ) − ˆ h (1,2) i,j d ( ξ i , ˆ ξ j )) = ∥ L (1,2) w + ˆ h (1,2) ∥ , yielding the linearization of J : J ⟿ ( N s − n ) n ÂÂÂÂÂ L (1,2) w + ˆ h (1,2) ÂÂÂÂÂ . 
(57)Using similar arguments and the explicit formula for theexpression r ( ˆ ξ j , ξ i ) in (38), the J term in (47) is linearizedto J ⟿ ( N s − n ) n ÂÂÂÂÂ L (2,1) w + ˆ h (2,1) ÂÂÂÂÂ , (58)where the Loewner matrix L ( , ) ∈ C n ( N s − n ) × n and the vec-tor ˆ h (2,1) ∈ C n ( N s − n ) are defined as ( L (2,1) ) γ ji k = ˆ h (2,1) j,i − h k,i ˆ ξ j − ξ k and ( ˆ h (2,1) ) γ ji = ˆ h (2,1) j,i , with 1 ⩽ j ⩽ N s − n, ⩽ i ⩽ n , 1 ⩽ k ⩽ n , and γ ji = ( j − ) n + i . Quadraticizing the J term : In this section we show howto relax the remaining term, J , in the minimization prob-lem (44). Note that this term includes r ( ˆ ξ i , ˆ ξ j ) ; i.e., r ( s, z ) evaluated over ˆ ξ × ˆ ξ . As we stated earlier, unlike r ( ξ i , ˆ ξ j ) ( r ( s, z ) over ξ × ˆ ξ ) or r ( ˆ ξ i , ξ j ) ( r ( s, z ) over ξ × ˆ ξ ), thenumerator and denominator of the quantity r ( ˆ ξ i , ˆ ξ j ) isquadratic in the weights w ℓ . Therefore, relaxing the ( ij ) thterm in J via multiplying it out with its denominator, willnot yield a linear term, but rather a quadratic, i.e., even therelaxed problem cannot be solved as a linear LS problem.This is what we establish next.Similar to (53), relax the ( ij ) th term in (48) using r ( ˆ ξ i , ˆ ξ j ) − ˆ h (2,2) i,j = d ( ˆ ξ i , ˆ ξ j ) ( n ( ˆ ξ i , ˆ ξ j ) − ˆ h (2,2) i,j d ( ˆ ξ i , ˆ ξ j )) ↝ n ( ˆ ξ i , ˆ ξ j ) − ˆ h (2,2) i,j d ( ˆ ξ i , ˆ ξ j ) . (59)Using (36), we obtain r ( ˆ ξ i , ˆ ξ j ) = ∑ nk = ∑ nℓ = w k w ℓ h k,ℓ ( ˆ ξ i − ξ k )( ˆ ξ j − ξ ℓ ) ( + ∑ nk = w k ˆ ξ i − ξ k )( + ∑ nℓ = w ℓ ˆ ξ j − ξ ℓ ) = n ( ˆ ξ i , ˆ ξ j ) d ( ˆ ξ i , ˆ ξ j ) . (60) Preprint (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg). 2021-03-01 . V. Gosea, S. 
Substituting $n_2(\hat\xi_i, \hat\xi_j)$ and $d_2(\hat\xi_i, \hat\xi_j)$ from (60) into (59) and re-arranging the terms yields
\[ n_2(\hat\xi_i, \hat\xi_j) - \hat h^{(2,2)}_{i,j}\, d_2(\hat\xi_i, \hat\xi_j) = - \Bigg( \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{ w_k w_\ell \big( \hat h^{(2,2)}_{i,j} - h_{k,\ell} \big) }{ (\hat\xi_i - \xi_k)(\hat\xi_j - \xi_\ell) } + \sum_{k=1}^{n} \frac{ w_k\, \hat h^{(2,2)}_{i,j} }{ \hat\xi_i - \xi_k } + \sum_{\ell=1}^{n} \frac{ w_\ell\, \hat h^{(2,2)}_{i,j} }{ \hat\xi_j - \xi_\ell } + \hat h^{(2,2)}_{i,j} \Bigg) . \tag{61} \]
Note that the expression in (61) is quadratic in $w_k$, as anticipated. As we did for $J_1$, $J_2$, and $J_3$, to express the result more compactly in matrix form, we introduce the (2-D) Loewner matrix $L^{(2,2)} \in \mathbb{C}^{(N_s - n)^2 \times n^2}$ as
\[ \big( L^{(2,2)} \big)_{\alpha_{ij}, \beta_{k\ell}} = \frac{ \hat h^{(2,2)}_{i,j} - h_{k,\ell} }{ (\hat\xi_i - \xi_k)(\hat\xi_j - \xi_\ell) } , \tag{62} \]
where $\alpha_{ij} = (i-1)(N_s - n) + j$ and $\beta_{k\ell} = (k-1)n + \ell$ with $i, j \in \{1, 2, \dots, N_s - n\}$ and $k, \ell \in \{1, 2, \dots, n\}$. Then, the $\alpha_{ij}$th entry of the vector $L^{(2,2)}(w \otimes w) \in \mathbb{C}^{(N_s - n)^2}$ is
\[ \big( L^{(2,2)} (w \otimes w) \big)_{\alpha_{ij}} = - \sum_{k=1}^{n} \sum_{\ell=1}^{n} \frac{ w_k w_\ell \big( h_{k,\ell} - \hat h^{(2,2)}_{i,j} \big) }{ (\hat\xi_i - \xi_k)(\hat\xi_j - \xi_\ell) } , \tag{63} \]
thus recovering the first sum in (61). Next, introduce the matrices $U_1, U_2 \in \mathbb{C}^{(N_s - n)^2 \times n}$ such that for $1 \leqslant k, \ell \leqslant n$,
\[ \big( U_1 \big)_{\alpha_{ij}, k} = \frac{ \hat h^{(2,2)}_{i,j} }{ \hat\xi_i - \xi_k } \quad\text{and}\quad \big( U_2 \big)_{\alpha_{ij}, \ell} = \frac{ \hat h^{(2,2)}_{i,j} }{ \hat\xi_j - \xi_\ell } . \tag{64} \]
Using $U_1$ and $U_2$ in (64), the last two sums in (61) can be compactly written as
\[ \sum_{k=1}^{n} \frac{ w_k\, \hat h^{(2,2)}_{i,j} }{ \hat\xi_i - \xi_k } = \big( U_1 w \big)_{\alpha_{ij}} \quad\text{and}\quad \sum_{\ell=1}^{n} \frac{ w_\ell\, \hat h^{(2,2)}_{i,j} }{ \hat\xi_j - \xi_\ell } = \big( U_2 w \big)_{\alpha_{ij}} . \tag{65} \]
Define $U = U_1 + U_2$. Then using (64), we write
\[ \big( U \big)_{\alpha_{ij}, k} = \big( U_1 \big)_{\alpha_{ij}, k} + \big( U_2 \big)_{\alpha_{ij}, k} = \frac{ \hat h^{(2,2)}_{i,j} \big( \hat\xi_i + \hat\xi_j - 2\xi_k \big) }{ (\hat\xi_i - \xi_k)(\hat\xi_j - \xi_k) } . \]
(66)

Inserting (62) and (66) into (61), we obtain
\[ \sum_{i=1}^{N_s - n} \sum_{j=1}^{N_s - n} \big( n_2(\hat\xi_i, \hat\xi_j) - \hat h^{(2,2)}_{i,j}\, d_2(\hat\xi_i, \hat\xi_j) \big)^2 = \| L^{(2,2)} (w \otimes w) + U w + \hat h^{(2,2)} \|^2 , \tag{67} \]
where $\hat h^{(2,2)} \in \mathbb{C}^{(N_s - n)^2}$ is the vector defined as
\[ \big( \hat h^{(2,2)} \big)_{\alpha_{ij}} = \hat h^{(2,2)}_{i,j} , \tag{68} \]
with $\alpha_{ij} = (i-1)(N_s - n) + j$ as before and $1 \leqslant i, j \leqslant N_s - n$. The expression (67) yields the final relaxation of $J_4$:
\[ J_4 \rightsquigarrow \tfrac{1}{(N_s - n)^2}\, \| L^{(2,2)} (w \otimes w) + U w + \hat h^{(2,2)} \|^2 . \tag{69} \]

Combining the relaxations of $J_1$, $J_2$, $J_3$, and $J_4$ given in (52), (57), (58), and (69), at the $n$th step of the algorithm we need to solve the quadraticized minimization problem
\[ \min_{w} \Big\{ \rho_1 \| L_1 w + \hat h_1 \|^2 + \rho_2 \big( \| L^{(1,2)} w + \hat h^{(1,2)} \|^2 + \| L^{(2,1)} w + \hat h^{(2,1)} \|^2 \big) + \rho_3 \| L^{(2,2)} (w \otimes w) + U w + \hat h^{(2,2)} \|^2 \Big\} , \tag{70} \]
where
\[ \rho_1 = \tfrac{1}{N_s - n} , \quad \rho_2 = \tfrac{1}{(N_s - n)\, n} , \quad\text{and}\quad \rho_3 = \tfrac{1}{(N_s - n)^2} . \tag{71} \]
Note that, due to the last term, the optimization problem (70) is no longer a linear LS problem; nevertheless, it can be solved efficiently. One can explicitly compute the gradient (and Hessian) of the cost function and apply a well-established (quasi-)Newton method [21]. If we had a one-step algorithm whose solution is given by (70), one would employ these techniques. However, solving (70) is only one step of our proposed iterative algorithm. Hence, as the iteration continues (and $n$ increases), the vector $w$ (and the data partition) is updated, and a new optimization problem of larger dimension must be solved. Therefore, we solve (70) only approximately in every step.

One can obtain an approximate solution to (70) in various ways. In our formulation, we first solve the part of (70) that can be written as a linear least-squares problem in $w$, namely
\[ \min_{w} \Big\{ \rho_1 \| L_1 w + \hat h_1 \|^2 + \rho_2 \big( \| L^{(1,2)} w + \hat h^{(1,2)} \|^2 + \| L^{(2,1)} w + \hat h^{(2,1)} \|^2 \big) \Big\} . \]
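The vectorized form (67) of the quadratic residual can also be checked numerically. In the sketch below, all data are made up (`xi`, `xh`, `H`, `Hh`, and `w` are hypothetical), and the Kronecker-ordered indexing $\alpha_{ij}$, $\beta_{k\ell}$ is realized through `numpy.kron`:

```python
import numpy as np

# Toy sizes: n support points and m = Ns - n least-squares points.
rng = np.random.default_rng(1)
n, m = 3, 4
xi = np.array([1.0, 2.0, 3.0])            # support points
xh = np.array([10.0, 11.0, 12.0, 13.0])   # LS points
H = rng.standard_normal((n, n))           # H2 samples on support x support
Hh = rng.standard_normal((m, m))          # H2 samples on LS x LS
w = rng.standard_normal(n)                # barycentric weights

Ci = 1.0 / (xh[:, None] - xi[None, :])    # Cauchy matrix, LS x support

# 2-D Loewner matrix (62): rows indexed by alpha_ij, columns by beta_kl
L22 = (Hh.reshape(-1, 1) - H.reshape(1, -1)) * np.kron(Ci, Ci)
# U = U1 + U2 as in (64)-(66)
U = Hh.reshape(-1, 1) * (np.kron(Ci, np.ones((m, 1))) +
                         np.kron(np.ones((m, 1)), Ci))

# Direct evaluation of n2 - hhat * d2 over the LS grid
N2 = Ci @ (np.outer(w, w) * H) @ Ci.T     # n2(xh_i, xh_j)
dvec = 1.0 + Ci @ w
resid = N2 - Hh * np.outer(dvec, dvec)    # n2 - hhat * d2

# Vectorized identity behind (67)
assert np.allclose(resid.reshape(-1),
                   -(L22 @ np.kron(w, w) + U @ w + Hh.reshape(-1)))
```

The row-major reshapes match the index maps $\alpha_{ij} = (i-1)(N_s-n)+j$ and $\beta_{k\ell} = (k-1)n+\ell$ of the text.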
(72)

The optimization problem (72) is a classical linear least-squares problem:
\[ \tilde w = \arg\min_{w} \left\| \begin{bmatrix} \sqrt{\rho_1}\, L_1 \\ \sqrt{\rho_2}\, L^{(1,2)} \\ \sqrt{\rho_2}\, L^{(2,1)} \end{bmatrix} w + \begin{bmatrix} \sqrt{\rho_1}\, \hat h_1 \\ \sqrt{\rho_2}\, \hat h^{(1,2)} \\ \sqrt{\rho_2}\, \hat h^{(2,1)} \end{bmatrix} \right\|^2 . \tag{73} \]
Given $\tilde w$, we further relax the last term in (70) as
\[ \rho_3 \| L^{(2,2)} (w \otimes w) + U w + \hat h^{(2,2)} \|^2 \rightsquigarrow \rho_3 \| L^{(2,2)} (\tilde w \otimes w) + U w + \hat h^{(2,2)} \|^2 . \tag{74} \]
Using $L^{(2,2)} (\tilde w \otimes w) = L^{(2,2)} (\tilde w \otimes I) w$, we rewrite (74) as
\[ \rho_3 \| L^{(2,2)} (\tilde w \otimes w) + U w + \hat h^{(2,2)} \|^2 = \rho_3 \| T w + \hat h^{(2,2)} \|^2 , \tag{75} \]
where the matrix $T \in \mathbb{C}^{(N_s - n)^2 \times n}$ is defined as
\[ T = L^{(2,2)} (\tilde w \otimes I) + U . \tag{76} \]
Then, using (75) in place of the last term in (70), we obtain a minimization problem that is a linear LS problem. Thus, the solution to our final approximation of (70) is given by
\[ w_\star = \arg\min_{w} \left\| \begin{bmatrix} \sqrt{\rho_1}\, L_1 \\ \sqrt{\rho_2}\, L^{(1,2)} \\ \sqrt{\rho_2}\, L^{(2,1)} \\ \sqrt{\rho_3}\, T \end{bmatrix} w + \begin{bmatrix} \sqrt{\rho_1}\, \hat h_1 \\ \sqrt{\rho_2}\, \hat h^{(1,2)} \\ \sqrt{\rho_2}\, \hat h^{(2,1)} \\ \sqrt{\rho_3}\, \hat h^{(2,2)} \end{bmatrix} \right\|^2 . \tag{77} \]
Therefore, in the $n$th step of AAA-LQO, the optimization problem (44) is relaxed and the solution of this relaxed problem (the weights) is given by (77). The algorithm proceeds with the updated weights as we discuss next.
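The two-stage relaxed solve can be sketched in a few lines of numpy. The function below is a hypothetical stand-in (the name, the `lstsq`-based solver, and the placement of the square-root weights on both the matrix blocks and the right-hand sides are our choices for illustration, not the paper's implementation):

```python
import numpy as np

def solve_relaxed_weights(L1, h1, L12, h12, L21, h21, L22, U, h22,
                          rho1, rho2, rho3):
    """Sketch of the two-stage solve: first the linear LS part (73),
    then the stacked linear LS problem (77) with T from (76)."""
    s1, s2, s3 = np.sqrt(rho1), np.sqrt(rho2), np.sqrt(rho3)
    A = np.vstack([s1 * L1, s2 * L12, s2 * L21])
    b = -np.concatenate([s1 * h1, s2 * h12, s2 * h21])
    w_tilde = np.linalg.lstsq(A, b, rcond=None)[0]          # solves (73)

    n = L1.shape[1]
    # T = L22 (w_tilde kron I) + U, using L22 (w~ x w) = L22 (w~ x I) w
    T = L22 @ np.kron(w_tilde.reshape(-1, 1), np.eye(n)) + U
    A2 = np.vstack([A, s3 * T])
    b2 = np.concatenate([b, -s3 * h22])
    return np.linalg.lstsq(A2, b2, rcond=None)[0]           # solves (77)
```

One `lstsq` call per stage suffices; in the overall iteration this pair of solves is repeated at every step $n$ with the updated partition.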
5.2. Greedy selection of the next support point

Given the partition (39) in Step $n$ of the algorithm, Section 5.1 showed how to choose the barycentric weights $w$ to minimize a joint LS measure over the uninterpolated data set. The only remaining component of the proposed approach is, then, to choose the next support point $\xi_{n+1}$ and update the data partition (39), so that we can repeat Section 5.1 for the updated partition until a desired tolerance is achieved. In other words, we move one sampling point from the LS set $\hat\xi$ to the interpolation set $\xi$. Which point to move from $\hat\xi$ to $\xi$ is decided in a greedy manner. To emphasize the iterative nature of the overall algorithm, at Step $n$ we denote by $r_1^{(n)}(s)$ and $r_2^{(n)}(s,z)$ the two transfer functions of the current LQO approximant. (Note that we dropped the superscript in Section 5.1 to simplify the notation there.)

We start by defining two constants based on the data:
\[ M_1 = \max_{s \in \Omega} | H_1(s) | , \qquad M_2 = \max_{s \in \Omega,\, z \in \Omega} | H_2(s, z) | . \tag{78} \]
For the current approximant in Step $n$, introduce the absolute error measures, i.e., the deviations in the linear and quadratic parts:
\[ \epsilon_1^{(n)} = \max_{s \in \Omega} | H_1(s) - r_1^{(n)}(s) | , \qquad \epsilon_2^{(n)} = \max_{s, z \in \Omega} | H_2(s, z) - r_2^{(n)}(s, z) | . \tag{79} \]
The next support point $\xi_{n+1}$ is chosen by means of a greedy search over the set $\Omega \setminus \{\xi_1, \dots, \xi_n\}$ using the error measures $\epsilon_1^{(n)}$ and $\epsilon_2^{(n)}$. More precisely, if $\epsilon_1^{(n)} / M_1 > \epsilon_2^{(n)} / M_2$, then
\[ \xi_{n+1} = \arg\max_{s \in \Omega} | H_1(s) - r_1^{(n)}(s) | . \]
On the other hand, if $\epsilon_1^{(n)} / M_1 < \epsilon_2^{(n)} / M_2$, define $s^{(n+1)}$ and $z^{(n+1)}$ via
\[ \big( s^{(n+1)}, z^{(n+1)} \big) = \arg\max_{s, z \in \Omega} | H_2(s, z) - r_2^{(n)}(s, z) | . \]
The question is then whether to choose $s^{(n+1)}$ or $z^{(n+1)}$ as $\xi_{n+1}$. If only one of them was already a support point, then we choose the other one as $\xi_{n+1}$.
If neither $s^{(n+1)}$ nor $z^{(n+1)}$ was previously chosen as a support point, then we compare $| H_1(s^{(n+1)}) - r_1^{(n)}(s^{(n+1)}) |$ and $| H_1(z^{(n+1)}) - r_1^{(n)}(z^{(n+1)}) |$, and choose as $\xi_{n+1}$ the one that yields the larger deviation in the first transfer function. (Clearly, both cannot already be support points, due to the interpolation property.)

Remark 3.
Instead of considering the full grid of pairs of sampling points $(s, z)$ and the associated measurements, one could consider a sparser grid of $H_2(s, z)$ samples. This modification would require changing the greedy selection scheme accordingly, to make sure that all possible combinations of selected points appear in the sparser grid. We skip this aspect in our examples and work with the full data set.

Now, we have all the pieces to describe the algorithmic framework for the proposed method
AAA-LQO, the
AAA algorithm for
LQO systems.

Given the full LQO data (16) and (17), we initiate the approximant ($n = 0$) by choosing $r_1^{(0)}(s)$ as the average of the $H_1(s)$ samples and $r_2^{(0)}(s, z)$ as the average of the $H_2(s, z)$ samples. Then, using the greedy selection strategy of Section 5.2, we update the partition (39) and solve for the barycentric weights as in Section 5.1, more specifically using (77). Let $n_{\max}$ denote the largest dimension permitted for the data-driven LQO approximant $\widehat\Sigma_{\mathrm{LQO}}$ and let $\epsilon$ denote the relative error tolerance. Then, AAA-LQO terminates either when the prescribed dimension $n_{\max}$ is reached, or when the prescribed error tolerance is achieved, namely
\[ \max\big( \epsilon_1^{(n)} / M_1 ,\; \epsilon_2^{(n)} / M_2 \big) < \epsilon . \tag{80} \]
A sketch of
AAA-LQO is given in Algorithm 1.
Algorithm 1.
AAA-LQO: AAA algorithm for
LQO systems
Input:
Sampling points $\{ s_1, \dots, s_{N_s} \}$, and samples $\{ H_1(s_i) \}$ and $\{ H_2(s_i, s_j) \}$ of an LQO system; maximum dimension allowed $n_{\max}$; stopping tolerance $\epsilon$.

Output: data-driven LQO system $\widehat\Sigma_{\mathrm{LQO}}$ as in (37).

1. $n = 0$; $r_1^{(0)} = \mathrm{avg}\{ H_1(s_i) \}$ and $r_2^{(0)} = \mathrm{avg}\{ H_2(s_i, s_j) \}$.
2. while $\max\big( \epsilon_1^{(n)} / M_1 ,\; \epsilon_2^{(n)} / M_2 \big) > \epsilon$ and $n < n_{\max}$:
3.   Employ the greedy selection scheme to choose the next support point(s) and update the partitioning as described in Section 5.2.
4.   Compute the vector of weights $w_\star$ as in (77).
5.   Update $r_1^{(n)}(s)$ and $r_2^{(n)}(s, z)$, and compute the errors $\epsilon_1^{(n)}$ and $\epsilon_2^{(n)}$ as in (79).
6.   $n = n + 1$.
7. end while

Remark 4. Note that, by choosing complex-conjugate sampling points and sampled values, one can enforce the fitted models to be real-valued. This is enforced for both examples presented in Section 6.
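The loop of Algorithm 1 can be sketched compactly in Python. The three callables below (`greedy_update`, `solve_weights`, `errors`) are hypothetical stand-ins for the greedy step of Section 5.2, the relaxed LS solve (77), and the error evaluation (79); this is a structural sketch, not the authors' implementation:

```python
import numpy as np

def aaa_lqo(H1_samples, H2_samples, n_max, eps,
            greedy_update, solve_weights, errors):
    """Skeleton of Algorithm 1 (AAA-LQO); stops when the relative
    errors fall below eps or the dimension budget n_max is reached."""
    M1 = np.max(np.abs(H1_samples))
    M2 = np.max(np.abs(H2_samples))
    # n = 0: both approximants start as the averages of the data
    r1 = np.full_like(H1_samples, np.mean(H1_samples))
    r2 = np.full_like(H2_samples, np.mean(H2_samples))
    support, n = [], 0
    e1, e2 = errors(r1, r2)
    while max(e1 / M1, e2 / M2) > eps and n < n_max:
        support = greedy_update(support, r1, r2)   # pick xi_{n+1} (Sec. 5.2)
        w, r1, r2 = solve_weights(support)         # weights via (77)
        e1, e2 = errors(r1, r2)                    # errors as in (79)
        n += 1
    return support, r1, r2
```

The stopping test mirrors (80): both relative error measures must drop below the prescribed tolerance.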
6. Numerical examples
We test
AAA-LQO, as given in Algorithm 1, on two
LQO systems. We also apply the original
AAA algorithm (from the linear case) to the data corresponding to the first (linear) transfer function only. Therefore, we construct two approximants: (1) a data-driven LQO approximant of order $n$ using AAA-LQO, and (2) a data-driven linear approximant using AAA. Note that both approximants are real-valued, enforced by using a data set that is closed under complex conjugation.
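A sampling grid that is closed under complex conjugation, as used in both examples, can be generated as follows (the interval endpoints below are placeholders; the exact interval is not reproduced here):

```python
import numpy as np

# 60 logarithmically spaced points on the positive imaginary axis,
# plus their complex conjugates; the endpoints 1e-1 and 1e2 are
# placeholder values, not the ones used in the experiments.
s_plus = 1j * np.logspace(-1, 2, 60)
s = np.concatenate([s_plus, np.conj(s_plus)])

assert s.size == 120
# closed under conjugation: conjugating the set only permutes it
assert np.allclose(np.sort_complex(s), np.sort_complex(np.conj(s)))
```

Fitting against such a conjugate-closed set (with conjugate sample values) is what allows the resulting barycentric models to be kept real-valued, cf. Remark 4.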
6.1. ISS model with a quadratic output

First, we use a single-input/single-output version of the ISS 1R Model from the SLICOT MOR benchmark collection [11]. We construct an LQO system from this linear model by adding a quadratic output whose matrix $M \in \mathbb{R}^{N \times N}$ is chosen as a weighted sum of $I^{(0)}$, $I^{(-1)}$, and $I^{(+1)}$; it scales products of the state variables in the output equation. Here, $I^{(k)}$ denotes a quasi-diagonal matrix whose entries of ones are shifted from the main diagonal by the integer $k$ ($k > 0$ above and $k < 0$ below the diagonal, with $I^{(0)} = I$).

We collect the following data: we pick 60 logarithmically spaced points on the positive part of the imaginary axis and add their conjugate pairs, to obtain $N_s = 120$ sampling points $\{ s_i \}$ and the samples $\{ H_1(s_i) \}$ for $i = 1, 2, \dots, N_s$ as in (16). Then, as in (17), we sample the second transfer function at $H_2(s_i, s_j)$ for $i, j = 1, 2, \dots, N_s$. The sampled data are depicted in Figure 1, where we display only the measurements on the "positive side" of the imaginary axis, skipping the conjugate data.

Figure 1: Measurements corresponding to the transfer functions; $H_1(s)$ (top) and $H_2(s, z)$ (bottom).

We apply Algorithm 1 with $n_{\max} = 30$ and $\epsilon = 10^{-2}$ (a relative tolerance on the data). With these settings, AAA-LQO yields a data-driven LQO model of order $n = 18$. Using only the $\{ H_1(s_i) \}$ samples (corresponding to the linear observation map), we apply AAA and obtain a data-driven linear approximant of order $n =$
18. The
AAA approximant is constructed simply to illustrate that a linear dynamical system approximation is not sufficient to accurately represent the underlying
LQO system.

In the top plot of Figure 2, we show the magnitude of the first transfer function $H_1(s)$ of the original system together with that of the linear AAA model and the first transfer function $r_1^{(n)}(s)$ of the AAA-LQO model. As expected, the
AAA model does a good job in matching the linear part of the output. Similarly, the
AAA-LQO model also matches $H_1(s)$ accurately. To better illustrate this, in the bottom plot of Figure 2 we depict the magnitude of the approximation errors in $H_1(s)$. The plot reveals that the AAA-LQO model has a smaller error for most of the frequency values, even in approximating $H_1(s)$. This happens despite the fact that it focuses on both $H_1(s)$ and $H_2(s, z)$, unlike the AAA model, which only tries to approximate $H_1(s)$.

Figure 2: First transfer function approximation; magnitudes for the original, AAA-linear, and AAA-LQO models (top) and relative approximation errors for the first transfer function (bottom).

In Figure 3 we depict the selected support points (interpolation points) for both
AAA and
AAA-LQO algorithms (without the complex conjugate pairs), as well as the poles of the learned models (i.e., the eigenvalues of $A_r$ in both cases). Note that there are 9 complex conjugate pairs of support points for each method. Even though some of the support points of AAA and
AAA-LQO overlap, two of the pairs are different. This difference causes a big deviation in the pole pattern, as shown in the bottom plot, illustrating that even the linear part of the
AAA-LQO approximant, i.e., $r_1^{(n)}(s)$, is fundamentally different from the linear AAA model. This is expected since
AAA-LQO constructs $r_1^{(n)}(s)$ and $r_2^{(n)}(s, z)$ together by minimizing a joint LS measure in both $H_1(s)$ and $H_2(s, z)$.

Figure 3: The support points (top) and the poles (bottom) for the two AAA reduced-order models.

To show the overall performance of
AAA-LQO in accurately approximating not only $H_1(s)$ but also $H_2(s, z)$ (the full LQO behavior), we perform a time-domain simulation of the original
LQO system $\Sigma_{\mathrm{LQO}}$, the data-driven AAA-LQO model $\widehat\Sigma_{\mathrm{LQO}}$, and the linear
AAA model, using a sinusoidal control input $u(t)$. During the simulation of the original system $\Sigma_{\mathrm{LQO}}$, we also compute only the linear part of the output, which the
AAA model should approximate well. The results are given in the top plot of Figure 4. The first observation is that the output of $\widehat\Sigma_{\mathrm{LQO}}$ from
AAA-LQO accurately replicates the output of Σ
LQO. On the other hand, the linear
AAA model completely misses the quadratic output and is only able to approximate the linear component of the output, as expected. The approximation error in the output corresponding to $\widehat\Sigma_{\mathrm{LQO}}$ is depicted in the bottom plot of Figure 4.

Figure 4: Time-domain simulations: (top) output $y(t)$ of the original and data-driven models, (bottom) approximation error.

In Figure 5 we show the convergence behavior of
AAA-LQO by plotting the evolution of the relative approximation errors $\epsilon_1^{(n)}/M_1$ and $\epsilon_2^{(n)}/M_2$ for all even values of $n$. For reference, we also depict the convergence behavior of AAA. The figure illustrates that after $n =$
18, both relative errors fall below the given tolerance of $10^{-2}$ and the algorithm terminates.

Figure 5: Relative approximation errors in each step.

To investigate how the order of the
AAA-LQO model varies based on the stopping tolerance, we increase $n_{\max}$ and run AAA-LQO for four decreasing tolerance values $\tau_1 > \tau_2 > \tau_3 > \tau_4$. The results are displayed in a table of $\tau$ versus the resulting order $n$. For the smallest tolerance $\tau_4$, Figure 6 depicts the convergence behavior of AAA-LQO by plotting $\epsilon_1^{(n)}/M_1$ and $\epsilon_2^{(n)}/M_2$ during the iteration.
Figure 6: Relative approximation errors in each step.
6.2. A stochastic mass-spring-damper model

This model, taken from [23], corresponds to an
LQO system whose output measures a variance of the state variable. A linear mass-spring-damper SISO dynamical system was modified in [22] by means of stochastic modeling, replacing the physical parameters by independent random variables and yielding a linear dynamical system with multiple outputs. Based on this multiple-output system, a SISO
LQO system was derived in [23], where the output corresponds to the variance of the original output (and thus is quadratic in nature). We refer the reader to [23] for further details. We obtain the measurements from a version of this model corresponding to an underlying
LQO system of order $N$ with $c = 0$ in (1); hence, $H_1(s) = 0$ for all $s$. As sampling points $\{ s_i \}$, we choose 60 logarithmically spaced points on the positive part of the imaginary axis together with their conjugate pairs, leading to $N_s =$
120 samples. Since H ( s ) =
0, we only need to sample $H_2(s_i, s_j)$ for $i, j = 1, 2, \dots, N_s$. The corresponding data for the second transfer function are depicted in Figure 7.

Figure 7: Measurements of the second transfer function.

We apply AAA-LQO with $n_{\max} =$
50 and a relative stopping tolerance $\epsilon$, obtaining an LQO model of order $n =$
30. To show the accuracy of the approximant, we perform time-domain simulations of the full model and the approximant with a sinusoidal input $u(t)$. We depict the observed outputs in the top plot of Figure 8, illustrating an accurate approximation. The corresponding output error is plotted in the bottom plot of Figure 8.

Figure 8: Time-domain simulations; output of the original and the reduced system (top) and approximation error (bottom).

Finally, in Figure 9 we show the convergence behavior of
AAA-LQO by plotting the evolution of the relative approximation error $\epsilon_2^{(n)}/M_2$.

Remark 5.
Since
AAA-LQO uses a greedy selection scheme and is not a descent algorithm, there is no theoretical guarantee that the maximum approximation error will decrease monotonically. This can be seen in Figures 5, 6, and 9. This behavior was also observed in the original
AAA algorithm; see, e.g., Application 6.3 in [20]. However, numerically the error indeed decreases monotonically with $n$ in most cases.

Figure 9: Maximum relative approximation error at each step.
7. Conclusions
We have proposed a novel data-driven modeling method, called
AAA-LQO , for linear systems with quadratic outputs(
LQO ). AAA-LQO extends the
AAA algorithm to this new setting by first developing the barycentric representation theory for the two transfer functions arising in the analysis of
LQO systems and then formulating an LS minimization framework to efficiently solve for the barycentric coefficients. The two numerical examples illustrate that AAA-LQO provides high-fidelity data-driven approximants to the original model. The barycentric form we developed here for
LQO systems offers promising research directions for modeling systems with general polynomial observation maps, as well as for nonlinearities appearing in the dynamical equation, such as bilinear or quadratic-bilinear systems. These topics are the focus of ongoing research.
8. Acknowledgements
The authors would like to thank Dr. Akil Narayan for providing the source codes for generating the numerical example presented in Section 6.2. S. Gugercin was supported in part by the National Science Foundation under Grants No. DMS-1720257 and DMS-1819110. Part of this material is based upon work supported by the National Science Foundation under Grant No. DMS-1439786 and by the Simons Foundation Grant No. 507536 while Gugercin was in residence at the Institute for Computational and Experimental Research in Mathematics in Providence, RI, during the "Model and dimension reduction in uncertain and dynamic systems" program.
References [1] A. C. Antoulas.
Approximation of large-scale dynamical systems. SIAM, Philadelphia, 2005. [2] A. C. Antoulas and B. D. O. Anderson. On the scalar rational interpolation problem.
IMA Journal of Math-ematical Control and Information , 3(2-3):61–88, 1986.[3] A. C. Antoulas, C. Beattie, and S. Gugercin.
Inter-polatory methods for model reduction . ComputationalScience and Engineering 21. SIAM, Philadelphia, 2020.[4] A. C. Antoulas, A. C. Ionita, and S. Lefteriu. On two-variable rational interpolation.
Linear Algebra and its Applications, 436(8):2889-2915, 2012. [5] A. C. Antoulas, S. Lefteriu, and A. C. Ionita. A tutorial introduction to the Loewner framework for model reduction. In
Model Reduction and Approximation , chap-ter 8, pages 335–376. SIAM, 2017.[6] U. Baur, P. Benner, and L. Feng. Model order re-duction for linear and nonlinear systems: A system-theoretic perspective.
Archives of ComputationalMethods in Engineering , 21(4):331–358, 2014.[7] P. Benner, P. Goyal, and I. Pontes Duff. Grami-ans, energy functionals and balanced truncationfor linear dynamical systems with quadratic out-puts. Technical report, arXiv preprint available at https://arxiv.org/abs/1909.04597 , 2019.[8] P. Benner, M. Ohlberger, A. Cohen, and K. Willcox.
Model Reduction and Approximation . Society for In-dustrial and Applied Mathematics, Philadelphia, PA,2017. doi:10.1137/1.9781611974829 .[9] J.P. Berrut and L. N. Trefethen. Barycentric Lagrangeinterpolation.
SIAM Rev. , 46(3):501–517, aug 2004.[10] A. Carracedo Rodriguez and S. Gugercin. The p-AAA algorithm for data driven modeling of parametricdynamical systems. Technical report, arXiv preprintavailable at https://arxiv.org/abs/2003.06536 ,2020.[11] Y. Chahlaoui and P. Van Dooren. A collection ofbenchmark examples for model reduction of lineartime invariant dynamical systems. Technical Report2002–2, SLICOT Working Note, 2002. Available from .[12] F. Giri and E.-W. Bai, editors.
Block-oriented Nonlin-ear System Identification . Lecture Notes in Control andInformation Sciences. Springer-Verlag, London, 2010.
[13] G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins University Press, fourth edition, 2013. [14] I. V. Gosea and A. C. Antoulas. A two-sided iterative framework for model reduction of linear systems with quadratic output. In
Proceedings of the 58th Confer-ence on Decision and Control (CDC), December 11-13,Nice, France , 2019.[15] I. V. Gosea and S. G¨uttel. Algorithms forthe rational approximation of matrix-valued func-tions. Technical report, arXiv preprint available at https://arxiv.org/abs/2003.06410 , 2020.[16] A. Juditsky, H. Hjalmarsson, A. Benveniste, B. De-lyon, L. Ljung, J. Sj¨oberg, and Q. Zhang. Nonlinearblack-box models in system identification: Mathemati-cal foundations.
Automatica , 31(12):1725 – 1750, 1995.[17] D. S. Karachalios, I. V. Gosea, and A. C.Antoulas. On bilinear time domain identifica-tion and reduction in the Loewner framework.In
Model Reduction of Complex Dynamical Sys-tems , International Series of Numerical Mathemat-ics. Springer, 2020. accepted September 2020. URL: https://arxiv.org/abs/2003.08711 .[18] D. S. Karachalios, I. V. Gosea, and A. C. Antoulas.The Loewner framework for nonlinear identificationand reduction of Hammerstein cascaded dynamical sys-tems. In
Special Issue: 91st Annual Meeting of theInternational Association of Applied Mathematics andMechanics (GAMM) . Wiley, 2021.[19] P. Lietaert, J. P´erez, B. Vandereycken, andK. Meerbergen. Automatic rational approxima-tion and linearization of nonlinear eigenvalue prob-lems. Technical report, arXiv preprint available at https://arxiv.org/abs/1801.08622 , 2018.[20] Y. Nakatsukasa, O. Sete, and L. N. Trefethen. TheAAA algorithm for rational approximation.
SIAMJournal on Scientific Computing , 40(3):A1494–A1522,2018.[21] J. Nocedal and S. Wright.
Numerical optimization .Springer Science & Business Media, 2006.[22] R. Pulch. Model order reduction and low-dimensionalrepresentations for random linear dynamical systems.
Math. Comput. Simulat. , 144:1–20, 2018.[23] R. Pulch and A. Narayan. Balanced truncation formodel order reduction of linear dynamical systems with quadratic outputs.
SIAM Journal on Scientific Com-puting , 41(4):A2270–A2295, 2019.[24] A. Quarteroni, A. Manzoni, and F. Negri.
Reducedbasis methods for partial differential equations: an in-troduction , volume 92. Springer, 2015.[25] W J. Rugh.
Nonlinear System Theory - TheVolterra/Wiener Approach . University Press, Balti-more, MD, 1981.[26] G. Scarciotti and A. Astolfi. Data-driven model re-duction by moment matching for linear and nonlinearsystems.
Automatica , 79:340–351, 2017.[27] L.J. Tick. The estimation of ”transfer functions” ofquadratic systems.
Technometrics , 3(4):563–567, 1961.[28] R. Van Beeumen and K. Meerbergen.
Model reduc-tion by balanced truncation of linear systems with aquadratic output , pages 2033–2036. T.E. Simons, G.Psihoyios, Ch. Tsitouras (eds.), International Confer-ence on Numerical Analysis and Applied Mathematics(ICNAAM). American Institute of Physics, 2010.[29] R. Van Beeumen, K. Van Nimmen, G. Lombaert, andK. Meerbergen. Model reduction for dynamical sys-tems with quadratic output.
Int. J. Numer. Meth. En-gng. , 91:229–248, 2012.[30] N. Wiener.
Nonlinear problems in random theory . Wi-ley, New York, 1958.
A. Proof of Lemma 2
Substituting $s = \xi_i$ and $z = \hat\xi_j$ into (35) gives
\[ n_2(\xi_i, \hat\xi_j) = \sum_{k=1}^{n} \sum_{\ell=1}^{n} h_{k,\ell}\, w_k w_\ell\, P_{k,\ell}(\xi_i, \hat\xi_j) = \sum_{\ell=1}^{n} h_{i,\ell}\, w_i w_\ell\, P_{i,\ell}(\xi_i, \hat\xi_j) \]
and
\[ d_2(\xi_i, \hat\xi_j) = P(\xi_i, \hat\xi_j) + \sum_{k=1}^{n} w_k\, p_k(\xi_i)\, p(\hat\xi_j) + \sum_{\ell=1}^{n} w_\ell\, p_\ell(\hat\xi_j)\, p(\xi_i) + \sum_{k=1}^{n} \sum_{\ell=1}^{n} w_k w_\ell\, P_{k,\ell}(\xi_i, \hat\xi_j) = w_i\, p_i(\xi_i)\, p(\hat\xi_j) + \sum_{\ell=1}^{n} w_i w_\ell\, P_{i,\ell}(\xi_i, \hat\xi_j) ,
\]
where $P$, $p_k$, and $P_{k,\ell}$ are as given in (33). Hence, writing $r_2(\xi_i, \hat\xi_j) = n_2(\xi_i, \hat\xi_j)/d_2(\xi_i, \hat\xi_j)$, we have
\[ r_2(\xi_i, \hat\xi_j) = \frac{ \displaystyle \sum_{\ell=1}^{n} h_{i,\ell}\, w_i w_\ell\, P_{i,\ell}(\xi_i, \hat\xi_j) }{ w_i\, p_i(\xi_i)\, p(\hat\xi_j) + \displaystyle \sum_{\ell=1}^{n} w_i w_\ell\, P_{i,\ell}(\xi_i, \hat\xi_j) } . \tag{81} \]
Introduce the notation
\[ P_{L_i}(\xi_i, z) = \prod_{k=1,\, k \neq i}^{n} (\xi_i - \xi_k) \prod_{\ell=1}^{n} (z - \xi_\ell) = p_i(\xi_i)\, p(z) . \tag{82} \]
Since $P_{L_i}(\xi_i, \hat\xi_j) = P_{i,\ell}(\xi_i, \hat\xi_j)\, (\hat\xi_j - \xi_\ell)$ holds, we can write
\[ r_2(\xi_i, \hat\xi_j) = \frac{ \displaystyle \sum_{\ell=1}^{n} h_{i,\ell}\, w_i w_\ell\, \frac{ P_{L_i}(\xi_i, \hat\xi_j) }{ \hat\xi_j - \xi_\ell } }{ w_i\, P_{L_i}(\xi_i, \hat\xi_j) + \displaystyle \sum_{\ell=1}^{n} w_i w_\ell\, \frac{ P_{L_i}(\xi_i, \hat\xi_j) }{ \hat\xi_j - \xi_\ell } } . \tag{83} \]
Cancelling $w_i\, P_{L_i}(\xi_i, \hat\xi_j)$ from both the numerator and the denominator of this expression proves the first desired result in (38). The proof for $r_2(\hat\xi_j, \xi_i)$ follows similarly.
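The cancellation in Lemma 2 can also be checked numerically: evaluating the full barycentric quotient (36) just off a support point should agree with the simplified single-sum expression (38). All data below are made up for illustration:

```python
import numpy as np

xi = np.array([1.0, 2.0, 3.0])           # support points
w = np.array([0.7, -0.4, 1.1])           # barycentric weights (made up)
H = np.arange(1.0, 10.0).reshape(3, 3)   # samples h_{k,l} (made up)
zhat = 7.5                               # a point from the LS set

def r2(s, z):
    """Full barycentric quotient (36) for the quadratic transfer function."""
    Cs, Cz = 1.0 / (s - xi), 1.0 / (z - xi)
    return (Cs @ (np.outer(w, w) * H) @ Cz) / ((1.0 + Cs @ w) * (1.0 + Cz @ w))

# Lemma 2: at s = xi_i the quotient collapses to a single sum; compare
# the full form evaluated just off the support point against (38).
i = 1
lhs = r2(xi[i] + 1e-8, zhat)
Cz = 1.0 / (zhat - xi)
rhs = (w * H[i]) @ Cz / (1.0 + w @ Cz)
assert np.allclose(lhs, rhs, rtol=1e-5)
```

The agreement reflects that the common factor $w_i P_{L_i}(\xi_i, \hat\xi_j)$ cancels exactly in the limit $s \to \xi_i$.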