Linear Matrix Inequality Approaches to Koopman Operator Approximation
Steven Dahdah and James Richard Forbes

Steven Dahdah, Ph.D. Candidate, Department of Mechanical Engineering, McGill University, 817 Sherbrooke Street West, Montreal QC H3A 0C3, [email protected]

James Richard Forbes, Associate Professor, Department of Mechanical Engineering, McGill University, 817 Sherbrooke Street West, Montreal QC H3A 0C3, [email protected]

February 9, 2021
1 Introduction

Koopman operator theory [1] provides a means to globally represent a nonlinear system as a linear system by transforming its states into an infinite-dimensional space of lifted states. The Koopman operator advances the current lifted state of the system to the next lifted state, much like the state transition matrix of a linear system.

In general, the Koopman representation of a nonlinear system is infinite-dimensional. One way to approximate the Koopman operator in finite dimensions is to select a set of lifted states and use linear regression to find a matrix approximation of the Koopman operator, also called a Koopman matrix [2, 3].
The Koopman representation of a nonlinear system is particularly convenient for control systems design, as its linear representation of nonlinear systems is compatible with a wide variety of existing linear optimal control techniques [3–7].

The regression problem associated with finding an approximate Koopman operator is numerically challenging, requiring regularization techniques, such as Tikhonov regularization or the LASSO [8], to find a suitable solution. The novelty of this document is the reformulation of the Koopman matrix regression problem as a convex optimization problem with linear matrix inequality (LMI) constraints, and the use of additional LMIs to, for instance, regularize the optimization problem. In particular, regularizers with LMI forms, such as the matrix 2-norm or the H_∞ norm, can be added to the optimization problem in a modular fashion. Additional stability constraints can be added in the same way. Although convex optimization and LMIs have previously been used to synthesize controllers for Koopman models [7], these tools have not yet been leveraged when solving the regression problem associated with finding the Koopman matrix.

A particular novelty of this document is solving the Koopman regression problem with a system norm regularizer. Although this document explores the particular use of the H_∞ norm [9, §3.2.2] as a regularizer, any system norm, such as the H_2 norm [9, §3.3] or a mixed H_2-H_∞ norm [9, §3.5], can be used. This systems perspective on the regression problem is a natural fit with the Koopman operator, because the Koopman matrix describes the time evolution of the data associated with a dynamic system. Rather than regularizing with a matrix norm, regularizing with a system norm enables a systems interpretation of the regularization. Moreover, using a system norm as a regularizer enables the use of weighting functions that can explicitly penalize errors in a particular frequency band.

This document focuses on the formulation of the Koopman matrix regression problem using convex optimization and LMIs, and demonstrates how LMIs can be leveraged to regularize the problem or enforce additional constraints. This document does not present numerical results; that work is ongoing and will be reported in the future.

2 Koopman Operator Theory

2.1 The Koopman Operator

Consider the discrete-time nonlinear process

x_{k+1} = f(x_k),    (1)

where x_k ∈ M evolves on a smooth manifold M ⊆ R^{m×1}, which is often simply the entirety of R^{m×1}. Let ψ : M → R be a lifting function, where ψ ∈ H. Any function of x_k that returns a scalar is a lifting function. There are therefore infinitely many lifting functions, and they form a Hilbert space H. The Koopman operator U : H → H is a linear operator that advances all scalar-valued lifting functions in time by one timestep. That is [2, §3.2],

(U ψ)(·) = (ψ ∘ f)(·).    (2)

Using the Koopman operator, the dynamics of (1) may be rewritten linearly in terms of ψ as

ψ(x_{k+1}) = (U ψ)(x_k).    (3)

In finite dimensions, (3) is approximated by

ψ(x_{k+1}) = U ψ(x_k) + r_k,    (4)

where ψ : M → R^{p×1}, U ∈ R^{p×p}, and r_k is the residual error. Since each element of ψ is a member of H, ψ is called a vector-valued lifting function. The Koopman matrix U is a matrix approximation of the Koopman operator.
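To make the notion of a vector-valued lifting function concrete, the sketch below stacks a handful of monomials of a two-dimensional state. The specific lifting functions, the toy dynamics f, and all names are illustrative assumptions rather than choices made in this document.

```python
import numpy as np

def lift(x):
    """Hypothetical vector-valued lifting function psi: R^2 -> R^5.

    Stacks the states themselves with a few monomials. Each entry is a
    scalar-valued lifting function, so psi matches the form used in (4).
    """
    x1, x2 = x
    return np.array([x1, x2, x1 ** 2, x1 * x2, x2 ** 2])

def f(x):
    """Toy nonlinear dynamics, assumed only for this example."""
    return np.array([0.9 * x[0] + 0.1 * x[1] ** 2, 0.8 * x[1]])

x_k = np.array([1.0, -0.5])
psi_k = lift(x_k)        # lifted state psi(x_k)
psi_kp1 = lift(f(x_k))   # lifted state psi(x_{k+1}) appearing in (4)
```

Any other collection of scalar-valued functions of x_k, such as radial basis functions, could be stacked in the same way.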
2.2 The Koopman Operator with Inputs

If the discrete-time nonlinear process has exogenous inputs, the definitions of the lifting functions and the Koopman operator must be adjusted. Consider

x_{k+1} = f(x_k, u_k),    (5)

where x_k ∈ M ⊆ R^{m×1} and u_k ∈ N ⊆ R^{n×1}. In this case, the lifting functions become ψ : M × N → R, ψ ∈ H, and the Koopman operator U : H → H is instead defined so that

(U ψ)(x_k, u_k) = ψ(f(x_k, u_k), ⋆),    (6)

where ⋆ = u_{k+1} if the input has state-dependent dynamics, or ⋆ = 0 if the input has no dynamics [2, §6.5]. The input is state-dependent if it is computed by a controller.

Let the vector-valued lifting function ψ : M × N → R^{p×1} be partitioned as

ψ(x_k, u_k) = \begin{bmatrix} ϑ(x_k) \\ υ(x_k, u_k) \end{bmatrix},    (7)

where ϑ : M → R^{p_ϑ×1}, υ : M × N → R^{p_υ×1}, and p_ϑ + p_υ = p. In the case where the input has no dynamics, (6) has the form [2, §6.5.1]

ϑ(x_{k+1}) = U ψ(x_k, u_k) + r_k,    (8)

where

U = \begin{bmatrix} A & B \end{bmatrix}.    (9)

When expanded, this yields the familiar linear state-space form,

ϑ(x_{k+1}) = A ϑ(x_k) + B υ(x_k, u_k) + r_k.    (10)

2.3 Approximating the Koopman Matrix from Data

To approximate the Koopman matrix from data, consider a dataset D = {x_k, u_k}_{k=0}^{q} and the corresponding snapshot matrices

X = \begin{bmatrix} x_0 & x_1 & ⋯ & x_{q−1} \end{bmatrix} ∈ R^{m×q},    (11a)
U = \begin{bmatrix} u_0 & u_1 & ⋯ & u_{q−1} \end{bmatrix} ∈ R^{n×q},    (11b)
X_+ = \begin{bmatrix} x_1 & x_2 & ⋯ & x_q \end{bmatrix} ∈ R^{m×q}.    (11c)

The lifted snapshot matrices are

Ψ = \begin{bmatrix} ψ_0 & ψ_1 & ⋯ & ψ_{q−1} \end{bmatrix} ∈ R^{p×q},    (12a)
Θ_+ = \begin{bmatrix} ϑ_1 & ϑ_2 & ⋯ & ϑ_q \end{bmatrix} ∈ R^{p_ϑ×q},    (12b)

where ψ_k = ψ(x_k, u_k) and ϑ_k = ϑ(x_k). Note that time-shifted input snapshots are not required. The Koopman matrix that minimizes

J(U) = ‖Θ_+ − U Ψ‖²_F    (13)

is therefore [2, §1.2.1]

U = Θ_+ Ψ^†,    (14)

where (·)^† denotes the Moore-Penrose pseudoinverse.
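As a concrete sketch of how the snapshot matrices (11)-(12) and the least-squares solution (14) translate to code, the snippet below uses the simple lifting choice ϑ(x_k) = x_k and υ(x_k, u_k) = u_k. The random data is a placeholder assumption that only fixes the matrix shapes, and np.linalg.lstsq is used rather than forming the pseudoinverse explicitly.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical trajectory data: q = 1000 snapshot pairs of a system with
# m = 2 states and n = 1 input, lifted with theta(x_k) = x_k and
# upsilon(x_k, u_k) = u_k.
q, m, n = 1000, 2, 1
X_all = rng.standard_normal((m, q + 1))   # states x_0, ..., x_q
U_in = rng.standard_normal((n, q))        # inputs u_0, ..., u_{q-1}

Psi = np.vstack([X_all[:, :-1], U_in])    # lifted snapshots Psi, as in (12a)
Theta_plus = X_all[:, 1:]                 # time-shifted snapshots Theta_+, (12b)

# Least-squares Koopman matrix (14): U = Theta_+ Psi^dagger.
U_koop = np.linalg.lstsq(Psi.T, Theta_plus.T, rcond=None)[0].T
A, B = U_koop[:, :m], U_koop[:, m:]       # partition U = [A B], as in (9)
```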
2.4 Extended DMD

Extended dynamic mode decomposition (EDMD) [10] is a method to compute (14) that reduces the computational cost when the number of snapshots is much larger than the dimension of the lifted state (p ≪ q) [2, §10.3.1]. Specifically, it reduces the size of the pseudoinverse required. Consider the least-squares solution for the Koopman matrix,

U = Θ_+ Ψ^†    (15)
  = Θ_+ (Ψ^T Ψ^{T†}) Ψ^†    (16)
  = (Θ_+ Ψ^T)(Ψ Ψ^T)^†    (17)
  = ((1/q) Θ_+ Ψ^T)((1/q) Ψ Ψ^T)^†    (18)
  = G H^†,    (19)

where

G = (1/q) Θ_+ Ψ^T ∈ R^{p_ϑ×p},    (20)
H = (1/q) Ψ Ψ^T ∈ R^{p×p}.    (21)

Since p ≪ q, EDMD greatly reduces the dimension of the pseudoinverse operation required to compute U [2, §10.3.1]. To improve numerical conditioning, G and H are scaled by the number of snapshots q, as in (18). This step greatly improves the condition number of H without otherwise altering the regression problem.

3 Reformulating the Regression Problem with LMI Constraints

To add regularizers in a modular fashion, the Koopman operator regression problem is reformulated as a convex optimization problem with LMI constraints. Recall that the Koopman matrix U minimizes

J(U) = ‖Θ_+ − U Ψ‖²_F.    (13)

This cost function can be rewritten as a convex optimization problem with LMI constraints. Specifically, consider

J(U) = (1/q) ‖Θ_+ − U Ψ‖²_F    (22)
     = (1/q) tr((Θ_+ − U Ψ)(Θ_+ − U Ψ)^T)    (23)
     = (1/q) tr(Θ_+ Θ_+^T) − (1/q) tr(Θ_+ Ψ^T U^T + U Ψ Θ_+^T) + (1/q) tr(U Ψ Ψ^T U^T)    (24)
     = (1/q) tr(Θ_+ Θ_+^T) − 2 tr(U ((1/q) Ψ Θ_+^T)) + tr(U ((1/q) Ψ Ψ^T) U^T)    (25)
     = c − 2 tr(U G^T) + tr(U H U^T),    (26)

where c = (1/q) tr(Θ_+ Θ_+^T) is a scalar constant, G is defined in (20), and H is defined in (21). The minimization of (26) is equivalent to the minimization of

J(U, ν, Z) = −2 tr(U G^T) + ν    (27)

subject to

tr(Z) ≤ ν,    (28)
Z > 0,    (29)
U H U^T ≤ Z,    (30)

where ν and Z are slack variables that allow the cost function to be rewritten using LMIs [9, §2.14.1], and where the constant c has been dropped, since it does not affect the minimizer. Using the Schur complement [9, §2.3.2], (30) can be rewritten as

Z − U H U^T ≥ 0 ⟺ \begin{bmatrix} Z & U \\ U^T & H^{−1} \end{bmatrix} ≥ 0, H > 0.    (31)

Note that H = H^T > 0 if the rows of Ψ are linearly independent, that is, if Ψ has full row rank. Minimizing (13) is therefore equivalent to solving

min J(U, ν, Z) = −2 tr(U G^T) + ν    (32)
s.t. tr(Z) ≤ ν,    (33)
Z > 0,    (34)
\begin{bmatrix} Z & U \\ U^T & H^{−1} \end{bmatrix} ≥ 0.    (35)

Note that the objective function and constraints are convex, and U appears linearly in all of them.

3.1 Avoiding the Inverse of H

Computing the inverse of H in (35) is numerically problematic, and can be avoided using a Cholesky decomposition or an eigendecomposition. The use of an eigendecomposition is explored in this section. Consider

H = V Λ V^T    (36)
  = V √Λ √Λ V^T,    (37)

which implies that

Z − U H U^T = Z − U (V √Λ √Λ V^T) U^T    (38)
            = Z − (U V √Λ)(U V √Λ)^T.    (39)

Applying the Schur complement [9, §2.3.2] once again yields a new form of (30),

\begin{bmatrix} Z & U V √Λ \\ √Λ V^T U^T & 1 \end{bmatrix} ≥ 0.    (40)

This form trades a matrix inverse for an eigendecomposition. The new optimization problem without matrix inversion is

min J(U, ν, Z) = −2 tr(U G^T) + ν    (41)
s.t. tr(Z) ≤ ν,    (42)
Z > 0,    (43)
\begin{bmatrix} Z & U V √Λ \\ √Λ V^T U^T & 1 \end{bmatrix} ≥ 0,    (44)

where H = V Λ V^T. This formulation of the optimization problem is almost always preferable to the formulation that requires inverting H.
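The problem (41)-(44) can be prototyped directly in a semidefinite programming framework. The sketch below uses CVXPY, which is an assumption of this example rather than a tool prescribed by this document; strict inequalities are relaxed to semidefinite constraints, as is standard in numerical practice, and all names are illustrative.

```python
import cvxpy as cp
import numpy as np

def solve_koopman_lmi(Theta_plus, Psi):
    """Minimal sketch of the LMI-constrained regression (41)-(44).

    Assumes CVXPY with an SDP-capable solver (e.g., SCS) is installed.
    """
    p, q = Psi.shape
    p_theta = Theta_plus.shape[0]

    G = Theta_plus @ Psi.T / q             # G, as in (20)
    H = Psi @ Psi.T / q                    # H, as in (21)
    lam, V = np.linalg.eigh(H)             # H = V Lambda V^T, as in (36)
    V_sqrt_lam = V * np.sqrt(np.clip(lam, 0.0, None))  # V sqrt(Lambda)

    U = cp.Variable((p_theta, p))
    Z = cp.Variable((p_theta, p_theta), symmetric=True)
    nu = cp.Variable()

    # LMI (44); the strict inequality Z > 0 is relaxed to Z >> 0.
    lmi = cp.bmat([[Z, U @ V_sqrt_lam],
                   [(U @ V_sqrt_lam).T, np.eye(p)]])
    constraints = [cp.trace(Z) <= nu, Z >> 0, lmi >> 0]
    objective = cp.Minimize(-2 * cp.trace(U @ G.T) + nu)
    cp.Problem(objective, constraints).solve()
    return U.value
```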
4 Tikhonov Regularization

Tikhonov regularization [8], here based on the matrix 2-norm, has an LMI form that can easily be incorporated into the optimization problem. The regularized cost function is

J(U) = ‖Θ_+ − U Ψ‖²_F + α ‖U‖_2,    (45)

where α is the regularization coefficient. The matrix 2-norm of a matrix is its maximum singular value. That is,

‖U‖_2 = √(λ̄(U^T U))    (46)
      = σ̄(U),    (47)

where λ̄(·) is the maximum eigenvalue and σ̄(·) is the maximum singular value. Consider the modified optimization problem

min J(U, γ) = (1/q) ‖Θ_+ − U Ψ‖²_F + (α/q) γ    (48)
s.t. σ̄(U) ≤ γ.    (49)

The constraint (49) can be rewritten as [9, §2.11.1]

\begin{bmatrix} γ1 & U^T \\ U & γ1 \end{bmatrix} ≥ 0.    (50)

It follows that the optimization problem

min J(U, ν, Z, γ) = −2 tr(U G^T) + ν + (α/q) γ    (51)
s.t. tr(Z) ≤ ν,    (52)
Z > 0,    (53)
\begin{bmatrix} Z & U V √Λ \\ √Λ V^T U^T & 1 \end{bmatrix} ≥ 0,    (54)
\begin{bmatrix} γ1 & U^T \\ U & γ1 \end{bmatrix} ≥ 0,    (55)

where H = V Λ V^T, is equivalent to minimizing (45).

5 Nuclear Norm Regularization

Nuclear norm regularization [11, 12] can be incorporated to favour low-rank Koopman operators. The regularized cost function is

J(U) = ‖Θ_+ − U Ψ‖²_F + α ‖U‖_*,    (56)

where α is the regularization coefficient. The nuclear norm of a matrix is defined as

‖U‖_* = tr(√(U^T U))    (57)
      = Σ_{i=0}^{p_ϑ−1} σ_i(U),    (58)

where σ_i(·) is the i-th singular value. Recall that U ∈ R^{p_ϑ×p}, so U has at most p_ϑ nonzero singular values. From [11], the solution to the optimization problem

min ‖U‖_*    (59)

is equivalent to the solution to the optimization problem

min (1/2)(tr(W_1) + tr(W_2))    (60)
s.t. \begin{bmatrix} W_1 & U \\ U^T & W_2 \end{bmatrix} ≥ 0,    (61)

where W_1 = W_1^T and W_2 = W_2^T. It follows that the optimization problem

min J(U, ν, Z, γ, W_1, W_2) = −2 tr(U G^T) + ν + (α/q) γ    (62)
s.t. tr(Z) ≤ ν,    (63)
Z > 0,    (64)
\begin{bmatrix} Z & U V √Λ \\ √Λ V^T U^T & 1 \end{bmatrix} ≥ 0,    (65)
(1/2)(tr(W_1) + tr(W_2)) ≤ γ,    (66)
\begin{bmatrix} W_1 & U \\ U^T & W_2 \end{bmatrix} ≥ 0,    (67)

where H = V Λ V^T, is equivalent to minimizing (56).

6 Asymptotic Stability Constraint

To ensure that all eigenvalues associated with the matrix A, where U = [A B], have magnitude strictly less than one, thus ensuring asymptotic stability, a modified Lyapunov constraint [13, §1.4.4],

P > 0,    (68)
A^T P A − ρ̄² P ≤ 0,    (69)

can be added to ensure that the magnitude of the largest eigenvalue of A is no larger than ρ̄ < 1. The LMI form of this constraint is

P > 0,    (70)
\begin{bmatrix} ρ̄² P & A^T Γ \\ Γ^T A & Γ + Γ^T − P \end{bmatrix} > 0,    (71)

where Γ ∈ R^{p_ϑ×p_ϑ} [9, §3.1.4]. The full optimization problem with the asymptotic stability constraint is

min J(U, ν, Z, P, Γ) = −2 tr(U G^T) + ν    (72)
s.t. tr(Z) ≤ ν,    (73)
Z > 0,    (74)
\begin{bmatrix} Z & U V √Λ \\ √Λ V^T U^T & 1 \end{bmatrix} ≥ 0,    (75)
P > 0,    (76)
\begin{bmatrix} ρ̄² P & A^T Γ \\ Γ^T A & Γ + Γ^T − P \end{bmatrix} > 0,    (77)

where H = V Λ V^T and Γ ∈ R^{p_ϑ×p_ϑ}.

Since both A and Γ are unknown, this optimization problem is bilinear, and must be solved iteratively by holding either A or Γ fixed while solving for the other. To accomplish this, select an initial guess

Γ_0 = 1    (78)

and solve

min J(U, ν, Z, P) = −2 tr(U G^T) + ν    (79)
s.t. tr(Z) ≤ ν,    (80)
Z > 0,    (81)
\begin{bmatrix} Z & U V √Λ \\ √Λ V^T U^T & 1 \end{bmatrix} ≥ 0,    (82)
P > 0,    (83)
\begin{bmatrix} ρ̄² P & A^T Γ_i \\ Γ_i^T A & Γ_i + Γ_i^T − P \end{bmatrix} > 0,    (84)

where H = V Λ V^T, to obtain A_i. Then, find any Γ that satisfies

\begin{bmatrix} ρ̄² P & A_i^T Γ \\ Γ^T A_i & Γ + Γ^T − P \end{bmatrix} > 0.    (85)

Set the result to Γ_{i+1} and iterate until the spectrum of A_i stops changing significantly. Note that it is preferable to solve a feasibility problem for Γ_{i+1} rather than solving an optimization problem that minimizes its norm. The value of Γ_{i+1} that results in the best A_{i+1} may not have the smallest norm.
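A minimal sketch of the alternating scheme (78)-(85) follows, again assuming CVXPY. The fixed iteration count, the small margin eps used to approximate strict inequalities, and all names are illustrative assumptions; a practical implementation would monitor the spectrum of A_i for convergence.

```python
import cvxpy as cp
import numpy as np

def solve_stable_koopman(Theta_plus, Psi, rho_bar=0.99, n_iter=10):
    """Sketch of the alternating scheme (78)-(85) for the bilinear
    stability-constrained problem. Assumes theta(x_k) occupies the
    first p_theta rows of the lifted state.
    """
    p, q = Psi.shape
    p_theta = Theta_plus.shape[0]
    G = Theta_plus @ Psi.T / q
    lam, V = np.linalg.eigh(Psi @ Psi.T / q)
    V_sqrt_lam = V * np.sqrt(np.clip(lam, 0.0, None))
    eps = 1e-7                      # strict LMIs relaxed by a small margin
    Gamma = np.eye(p_theta)         # initial guess Gamma_0 = 1, as in (78)

    U_val = None
    for _ in range(n_iter):
        # Step 1: hold Gamma fixed, solve (79)-(84) for U (and P).
        U = cp.Variable((p_theta, p))
        Z = cp.Variable((p_theta, p_theta), symmetric=True)
        P = cp.Variable((p_theta, p_theta), symmetric=True)
        A = U[:, :p_theta]          # A from U = [A B], as in (9)
        stab = cp.bmat([[rho_bar ** 2 * P, A.T @ Gamma],
                        [Gamma.T @ A, Gamma + Gamma.T - P]])
        regr = cp.bmat([[Z, U @ V_sqrt_lam],
                        [(U @ V_sqrt_lam).T, np.eye(p)]])
        cons = [regr >> 0,
                P >> eps * np.eye(p_theta),
                stab >> eps * np.eye(2 * p_theta)]
        # The slack nu in (79) is eliminated by minimizing tr(Z) directly.
        cp.Problem(cp.Minimize(-2 * cp.trace(U @ G.T) + cp.trace(Z)),
                   cons).solve()
        U_val = U.value
        A_i = U_val[:, :p_theta]

        # Step 2: hold A_i fixed, find any feasible Gamma satisfying (85).
        Gam = cp.Variable((p_theta, p_theta))
        P2 = cp.Variable((p_theta, p_theta), symmetric=True)
        stab_i = cp.bmat([[rho_bar ** 2 * P2, A_i.T @ Gam],
                          [Gam.T @ A_i, Gam + Gam.T - P2]])
        cp.Problem(cp.Minimize(0),
                   [P2 >> eps * np.eye(p_theta),
                    stab_i >> eps * np.eye(2 * p_theta)]).solve()
        Gamma = Gam.value
    return U_val
```

Step 2 is deliberately posed as a pure feasibility problem, matching the note above that minimizing the norm of Γ can degrade the next iterate.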
7 H_∞ Norm Regularization

A system norm, like the H_∞ norm, the H_2 norm, or a mixed H_2-H_∞ norm, can be used to regularize the Koopman regression problem when it is posed as in Section 3. The use of the H_∞ norm as a regularizer when finding the Koopman matrix via regression is explored next. Minimizing the H_∞ norm guarantees that the resulting LTI system will be asymptotically stable, and allows the regularization problem to be tuned in the frequency domain with weighting functions.

The Koopman representation of a nonlinear system can be thought of as a discrete-time LTI system 𝒢 : ℓ_2e → ℓ_2e, where U = [A B], C = 1, and D = 0. That is,

𝒢 ∼ \begin{bmatrix} A & B \\ C & D \end{bmatrix},    (86)

where ∼ denotes a minimal state-space realization. The H_∞ norm of 𝒢 is the worst-case gain from ‖u‖_2 to ‖𝒢 u‖_2. That is [9, §3.2.2],

‖𝒢‖_∞ = sup_{u ∈ ℓ_2, u ≠ 0} ‖𝒢 u‖_2 / ‖u‖_2.    (87)

With H_∞ norm regularization, the cost function associated with the regression problem is

J(U) = ‖Θ_+ − U Ψ‖²_F + α ‖𝒢‖_∞,    (88)

where α is the regularization coefficient. The H_∞ norm has an LMI formulation. The inequality ‖𝒢‖_∞ < γ holds if and only if [9, §3.2.2]

P > 0,    (89)
\begin{bmatrix} P & A P & B & 0 \\ P A^T & P & 0 & P C^T \\ B^T & 0 & γ1 & D^T \\ 0 & C P & D & γ1 \end{bmatrix} > 0.    (90)

The full optimization problem with H_∞ regularization is

min J(U, ν, Z, γ, P) = −2 tr(U G^T) + ν + (α/q) γ    (91)
s.t. tr(Z) ≤ ν,    (92)
Z > 0,    (93)
\begin{bmatrix} Z & U V √Λ \\ √Λ V^T U^T & 1 \end{bmatrix} ≥ 0,    (94)
P > 0,    (95)
\begin{bmatrix} P & A P & B & 0 \\ P A^T & P & 0 & P C^T \\ B^T & 0 & γ1 & D^T \\ 0 & C P & D & γ1 \end{bmatrix} > 0,    (96)

where H = V Λ V^T.

This is a bilinear optimization problem, as both P and A are unknown. It must be solved iteratively by holding either P or A fixed while solving for the other. To accomplish this, select an initial guess

P_0 = 1    (97)

and solve

min J(U, ν, Z, γ) = −2 tr(U G^T) + ν + (α/q) γ    (98)
s.t. tr(Z) ≤ ν,    (99)
Z > 0,    (100)
\begin{bmatrix} Z & U V √Λ \\ √Λ V^T U^T & 1 \end{bmatrix} ≥ 0,    (101)
\begin{bmatrix} P_i & A P_i & B & 0 \\ P_i A^T & P_i & 0 & P_i C^T \\ B^T & 0 & γ1 & D^T \\ 0 & C P_i & D & γ1 \end{bmatrix} > 0,    (102)

where H = V Λ V^T, to obtain A_i and B_i. Then, find any P that satisfies

P > 0,    (103)
\begin{bmatrix} P & A_i P & B_i & 0 \\ P A_i^T & P & 0 & P C^T \\ B_i^T & 0 & γ_i 1 & D^T \\ 0 & C P & D & γ_i 1 \end{bmatrix} > 0.    (104)

Set the result to P_{i+1} and iterate until A_i stops changing significantly. Note that it is preferable to solve a feasibility problem for P_{i+1} rather than solving an optimization problem that minimizes its norm. The value of P_{i+1} that results in the best A_{i+1} may not have the smallest norm.
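To make the characterization (89)-(90) concrete, the sketch below computes the H_∞ norm of a fixed realization (A, B, C, D) by minimizing γ subject to (90); within the alternating scheme above, the same LMI appears with P_i or A_i held fixed. A Schur-stable A and an SDP-capable CVXPY solver are assumed, and the toy system and function names are illustrative.

```python
import cvxpy as cp
import numpy as np

def hinf_norm_lmi(A, B, C, D):
    """Sketch: H-infinity norm of a discrete-time LTI system, found by
    minimizing gamma subject to the LMI (89)-(90).
    """
    n = A.shape[0]
    n_u = B.shape[1]
    n_y = C.shape[0]
    P = cp.Variable((n, n), symmetric=True)
    gam = cp.Variable()
    # LMI (90); linear in P and gam because A, B, C, D are fixed here.
    lmi = cp.bmat([
        [P, A @ P, B, np.zeros((n, n_y))],
        [P @ A.T, P, np.zeros((n, n_u)), P @ C.T],
        [B.T, np.zeros((n_u, n)), gam * np.eye(n_u), D.T],
        [np.zeros((n_y, n)), C @ P, D, gam * np.eye(n_y)],
    ])
    eps = 1e-7  # strict inequalities relaxed by a small margin
    cons = [P >> eps * np.eye(n), lmi >> eps * np.eye(2 * n + n_u + n_y)]
    cp.Problem(cp.Minimize(gam), cons).solve()
    return gam.value

# Example with C = 1 and D = 0, matching the realization in (86).
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
C, D = np.eye(2), np.zeros((2, 1))
print(hinf_norm_lmi(A, B, C, D))  # worst-case gain of the toy system
```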
8 Conclusion

Regression is one way to approximate a Koopman matrix from data. The LMI-based methods presented here to regularize and constrain the Koopman matrix regression problem are part of a modular approach that can be readily adjusted for the problem at hand. The proposed method of regularizing the Koopman matrix regression problem with the H_∞ norm provides a systems perspective on the problem and allows the regularization to be tuned in the frequency domain using weighting functions. Other system norms, like the H_2 norm, the generalized H_2 norm, and the peak-to-peak norm [9, §3], can also be used as regularizers. The unique properties of these system norms may prove useful in the identification of approximate Koopman operators from data.

References

[1] B. O. Koopman, "Hamiltonian systems and transformation in Hilbert space," Proceedings of the National Academy of Sciences, vol. 17, no. 5, pp. 315–318, 1931.
[2] J. N. Kutz, S. L. Brunton, B. W. Brunton, and J. L. Proctor, Dynamic Mode Decomposition: Data-Driven Modeling of Complex Systems. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2016.
[3] S. E. Otto and C. W. Rowley, "Koopman operators for estimation and control of dynamical systems," Annual Review of Control, Robotics, and Autonomous Systems, vol. 4, no. 1, 2021.
[4] I. Abraham and T. D. Murphey, "Active learning of dynamics for data-driven control using Koopman operators," IEEE Trans. Robot., vol. 35, no. 5, pp. 1071–1083, 2019.
[5] G. Mamakoukas, M. Castano, X. Tan, and T. Murphey, "Local Koopman operators for data-driven control of robotic systems," in Proc. Robotics: Science and Systems XV, Freiburg im Breisgau, Germany, 2019.
[6] D. Bruder, B. Gillespie, C. D. Remy, and R. Vasudevan, "Modeling and control of soft robots using the Koopman operator and model predictive control," in Proc. Robotics: Science and Systems XV, Freiburg im Breisgau, Germany, 2019.
[7] D. Uchida, A. Yamashita, and H. Asama, "Data-driven Koopman controller synthesis based on the extended H_2 norm characterization," IEEE Control Systems Letters, vol. 5, no. 5, pp. 1795–1800, 2021.
[8] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning. New York, NY: Springer, 2013.
[9] R. J. Caverly and J. R. Forbes, "LMI properties and applications in systems, stability, and control theory," arXiv:1903.08599v2 [cs.SY], 2019.
[10] M. O. Williams, I. G. Kevrekidis, and C. W. Rowley, "A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition," J. Nonlinear Sci., vol. 25, no. 6, pp. 1307–1346, 2015.
[11] B. Recht, M. Fazel, and P. Parrilo, "Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization," SIAM Review, vol. 52, no. 3, pp. 471–501, 2010.
[12] N. Blomberg, "On nuclear norm regularization in system identification," Ph.D. dissertation, KTH Royal Institute of Technology, 2016.
[13] L. El Ghaoui and S.-I. Niculescu, Advances in Linear Matrix Inequality Methods in Control. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2000.