The structure of global conservation laws in Galerkin plasma models
Alan A. Kaptanoglu, Kyle D. Morgan, Christopher J. Hansen, Steven L. Brunton
TThe structure of global conservation lawsin Galerkin plasma models
Alan A. Kaptanoglu
Department of Physics, University of Washington, Seattle, WA, 98195, USA
Kyle D. Morgan
Department of Aeronautics and Astronautics, University of Washington, Seattle, WA, 98195, USA
Chris J. Hansen
Department of Aeronautics and Astronautics, University of Washington, Seattle, WA, 98195, USA andDepartment of Applied Physics and Applied Mathematics,Columbia University, New York, NY, 10027, USA
Steven L. Brunton
Department of Mechanical Engineering, University of Washington, Seattle, WA, 98195, USA
Plasmas are highly nonlinear and multi-scale, motivating a hierarchy of models to understandand describe their behavior. However, there is a scarcity of plasma models of lower fidelity thanmagnetohydrodynamics (MHD). Galerkin models, obtained by projection of the MHD equationsonto a truncated modal basis, can furnish this gap in the lower levels of the model hierarchy.In the present work, we develop low-dimensional Galerkin plasma models which preserve globalconservation laws by construction. This additional model structure enables physics-constrainedmachine learning algorithms that can discover these types of low-dimensional plasma models directlyfrom data. This formulation relies on an energy-based inner product which takes into account allof the dynamic variables. The theoretical results here build a bridge to the extensive Galerkinliterature in fluid mechanics, and facilitate the development of physics-constrained reduced-ordermodels from plasma data.
I. INTRODUCTION
There are a tremendous number of known plasma mod-els of varying model complexity, from magnetohydro-dynamics (MHD) to the Klimontovich equations, but alarge gap exists in the lower rungs of this hierarchy be-tween simple circuit models and the many MHD variants.This is a valuable place for improvement because higherfidelity models often require computationally intensiveand high-dimensional simulations [1–3], obfuscating thedynamics and precluding model-based real-time control.Additionally, many high-dimensional nonlinear systemstend to evolve on low-dimensional attractors [4], definedby spatio-temporal coherent structures that character-ize the dominant behavior of the system. A number ofstudies in the plasma physics community indicate thatthe vast majority of the total plasma energy can be ex-plained by fewer than ten low-dimensional modes, acrossa large range of parameter regimes, geometry, and degreeof nonlinearity [5–13]. In these cases, the evolution ofonly a few coherent structures, obtained from systematicmodel-reduction techniques [14, 15], can closely approx-imate the full evolution of the high-dimensional physicalsystem.In the present work and our companion paper [16], weprovide a theoretical framework for physics-constrained,low-dimensional plasma models which furnish this im-portant gap in the hierarchy of plasma models. This is a breakthrough in principled and interpretable model re-duction, offering an alternative to deep learning methodswhich often require vast quantities of data and produceopaque models. The present work focuses on the impor-tant technical details for constructing and constrainingthese models, while the accompanying publication con-tains an overview of our high-level contributions and ini-tial results on 3D plasma simulations.
II. LOW-DIMENSIONAL MODELS
Although there are many ways to obtain low-dimensional models, Galerkin methods and their exten-sions have seen remarkable success in fluid mechanics;careful development of a dimensionalized inner productenabled the extension of the proper orthogonal decompo-sition (POD) from incompressible to compressible fluidflows [17]. It is also common in fluid mechanics to obtainnonlinear reduced-order models by Galerkin projection ofthe Navier-Stokes equations onto POD modes, making itpossible to enforce known symmetries and conservationlaws, such as conservation of energy [18–21].The present work adapts and extends these innovationsfor plasmas, enabling a wealth of advanced modeling andcontrol machinery. The POD is already used extensivelyfor interpreting plasma physics data across a range of pa-rameter regimes [22–26]. For clarity of presentation androbust connection with the Galerkin literature in fluid a r X i v : . [ phy s i c s . p l a s m - ph ] J a n mechanics, we present results for MHD models which areat most quadratic in nonlinearity. This includes idealMHD, incompressible Hall-MHD, or compressible Hall-MHD with a slowly time-varying density, which togetherdescribe the dynamics of a fairly broad class of space andlaboratory plasmas [27–32]. A. An MHD energy inner product
Traditional use of the POD on the MHD fields (ve-locity, magnetic, and temperature) requires separate de-compositions for u , B , and T , or an arbitrary choice ofdimensionalization. However, separate decompositionsof the fields obfuscates the interpretability and increasesthe complexity of a low-dimensional model, and choosingthe units of the combined matrix of measurement datacan have a significant impact on the performance andenergy spectrum of the resulting POD basis. Inspiredby the inner product defined for compressible fluids [17],we introduce an inner product for compressible magne-tohydrodynamic fluids by using the configuration vector q ( x , t ) = [ B u , B , B T ]. Here B u = √ ρµ u , B T = 2 (cid:112) ρµ k b T /m i ( γ − , (1)where u is the fluid velocity, ρ is the mass density, k b is Boltzmann’s constant, µ is the permeability of freespace, T is the plasma temperature, m i is the ion mass, γ is the adiabatic index, and p = 2 ρT /m i is the plasmapressure. B u and B T are defined so that the followingscaled inner product yields the total energy W,W = 12 µ (cid:104) q , q (cid:105) = (cid:90) (cid:32) ρu + B µ + pγ − (cid:33) d x . (2) B. Review of the POD
For the POD, measurements at time t k are arrangedin a vector q k ∈ R D , called a snapshot, where the di-mension D is the product of the number of spatial lo-cations and the number of variables measured at eachpoint. The data is sampled at times t , t , ..., t M , ar-ranged in a matrix X ∈ R D × M , and the average in time¯ q is subtracted off. The singular value decomposition(SVD) provides a low-rank approximation of the subse-quent data matrix X = time −−−−−−−−−−−−−−−−−−−−−−→ q ( t ) q ( t ) · · · q ( t M ) q ( t ) q ( t ) · · · q ( t M )... ... . . . ... q D ( t ) q D ( t ) · · · q D ( t M ) (cid:121) s t a t e = U Σ V ∗ , (3) where U ∈ R D × D and V ∈ R M × M are unitary matrices,and Σ ∈ R D × M is a diagonal matrix containing non-negative and decreasing entries s jj called the singularvalues of X . V ∗ denotes the complex-conjugate trans-pose of V . The singular values indicate the relativeimportance of the corresponding columns of U and V for describing the spatio-temporal structure of X . Al-though varying terminology is used in different fields (inthe plasma physics community this method is often re-ferred to as the biorthogonal decomposition, or BOD),in practice the SVD, BOD, and POD are synonymous.It is often possible to discard small values of Σ , result-ing in a truncated matrix Σ r ∈ R r × r . With the first r (cid:28) min( D, M ) columns of U and V , denoted U r and V r , the matrix X can be approximated as X ≈ U r Σ r V ∗ r . (4)The truncation rank r is typically chosen to balance ac-curacy and complexity [33]. The computational com-plexity of the SVD is O ( DM + M ) [34], although thereare randomized singular value decompositions [35–37]for extremely large problems that can be as fast as O ( DM log( r )). This computational speed enables theSVD to be computed online, for updating a model in areal-time control application, or offline, for the decompo-sition of very large data, an examination of the physics,or development of a more generic model describing thedynamics of an amalgam of discharges. A well-definedSVD requires that the measurements in X have the samephysical dimensions. With a dimensionalized measure-ment vector q , the matrix X ∗ X may be computed via X ∗ X ≈ (cid:104) q ( t k ) , q ( t m ) (cid:105) . (5)When the number of snapshots is far fewer than the num-ber of measurements, M (cid:28) D , we can use the method ofsnapshots. Substitution of Eq. 4 into X ∗ X produces X ∗ XV r ≈ V r Σ r , (6)an eigenvalue equation for V r ; therefore we can obtain V r by diagonalizing X ∗ X ∈ R M × M instead of computing theSVD of X . The chronos are the temporal SVD modes,i.e. the columns of V r , denoted v j . The topos are thespatial modes forming the columns of U r , denoted χ .In the present work, we scale the normalized matrix ofchronos, a , through a jk = v jk / (cid:80) rj =1 max k | v jk | . Finally,the reconstruction can be written q ( x i , t k ) ≈ ¯ q ( x i )+ r (cid:88) j =1 χ j ( x i ) a j ( t k ) , (7)We have absorbed the normalization of a jk and the sin-gular values into the definition of χ j ( x i ). By construc-tion (cid:104) χ i , χ j (cid:105) ∝ δ ij . Note that, in principle, we could haveexpanded q in any set of modes, although orthonormalmodes are preferred because this property facilitates theanalysis in Section III. The advantage of the POD basis isthat the modes are ordered by energy content; a trunca-tion of the system still captures the vast majority of thedynamics. A separate POD of each of the MHD fieldswould lead to three sets of POD modes with indepen-dent time dynamics and mixed orthogonality properties.In contrast, our approach captures all the fields simulta-neously, resulting in a single set of modes a i ( t ) in Eq. (7).An example of this decomposition is illustrated inFig. 1 for a 3D plasma simulation that is dominated byharmonics of a driving frequency. In general, examiningthe structure and symmetry in the spatial and tempo-ral POD modes can inform physical understanding. Forinstance, in Fig. 1, the short-wavelength structures ex-hibited in the 3D spatial modes derive both from dis-persive whistler waves via the Hall term and the smallcharacteristic scale associated with the driving actuator.The steep fall-off in the singular values also indicates thatmodels of only the first few modes would be enough toaccurately forecast and control the dominant dynamics. C. POD-Galerkin models
While we have obtained a useful modal decompositionof the evolved fields, we have yet to derive a model forthe subsequent temporal evolution of the modes. Now wesubstitute Eq. (7) into a quadratically nonlinear MHDmodel, such as ideal MHD, incompressible Hall-MHD, orcompressible Hall-MHD with a slowly time-varying den-sity. Utilizing the orthonormality of the χ i produces:˙ a i ( t ) = C i + r (cid:88) j =1 L ij a j + r (cid:88) j,k =1 Q ijk a j a k , (8) C i = (cid:104) C + L ( ¯ q )+ Q ( ¯ q , ¯ q ) , χ i (cid:105) ,L ij = (cid:104) L ( χ j )+ Q ( ¯ q , χ j )+ Q ( χ j , ¯ q ) , χ i (cid:105) ,Q ijk = (cid:104) Q ( χ j , χ k ) , χ i (cid:105) . The inner products integrate out the spatial dependence,and the model is quadratic in the temporal POD modes a i ( t ). In contrast to Eq. (8), a Galerkin model based onseparate POD expansions for each field would involve sig-nificantly more complicated nonlinear terms from mixingand a lack of orthonormality (cid:104) χ u i , χ B j (cid:105) (cid:54) = δ ij between the different POD modes associated with each field. III. CONSTRAINTS ON MODEL STRUCTURE
Local and global MHD conservation laws are retainedin this low-dimensional basis. Vanishing ∇· B and theorthonormality of the temporal POD modes produce ∇· χ Bi = 0 , ∀ i. (9)In other words, the orthonormality of the temporalmodes guarantees that the local divergence constraintis satisfied by each of the χ Bi by construction. In con-trast, global energy conservation will produce strong con-straints on the structure of the coefficients in Eq. 8. A. Global conservation of energy
For an examination of the global conservation laws, weconsider isothermal Hall-MHD with the assumption thatthe density is slowly-varying in time. This model reducesto ideal MHD and incompressible resistive or Hall MHDin the appropriate limits, and produces (Galtier [39] Eq.3.22) ∂W∂t = − (cid:90) (cid:20) ˜ ν ( ∇× u ) + ηµ ( ∇× B ) +43 ˜ ν ( ∇· u ) (cid:21) d x (10) − (cid:73) (cid:34)(cid:18) ρu + p (cid:19) u + P −
43 ˜ ν ( ∇· u ) u − ˜ ν u × ( ∇× u ) (cid:35) · ˆ n dS. Here ˆ n is a unit normal vector to the boundary, and P = 1 µ E × B = u e µ · ( B I − BB ) − ηµ ( ∇× B ) × B , (11)is the Poynting vector ( E is the electric field), which isoften a known function of space and time. Omission ofthe Hall term changes u e to u . Even with a Hall-MHDmodel that evolves the temperature, the electron diamag-netic contribution to P does not change the energy bal-ance if Dirichlet boundary conditions are used for ρ and T . To simplify, we assume that u · ˆ n = u × ˆ n =0, J · ˆ n =0,and B · ˆ n =0 at the wall, consistent with the simulationsused in the accompanying work [16]. Moreover, we nowassume that we have a steady-state, define a constant a ( t )=1, and substitute Eq. (7) into Eq. (10).0 ≈ ∂W∂t = (cid:73) ηµ (( ∇× B ) × B ) · ˆ n dS − (cid:90) (cid:34) νµ ( ∇× B u − ∇ ρ ρ × B u ) + ηµ ( ∇× B ) + 43 νµ ( ∇· B u − ∇ ρ ρ · B u ) (cid:35) d x , (12)= W C + r (cid:88) i =1 W Li a i + r (cid:88) i,j =1 W Q ij a i a j = W C + r (cid:88) i =1 a i ( W Li + r (cid:88) j =1 W Qij a j )= W C + r (cid:88) i =1 r (cid:88) j =0 W Qij a i a j = r (cid:88) i =0 r (cid:88) j =0 W Qij a i a j , tω (a) Pairwise correlations of POD amplitudes (b) Spatial POD modes(c) Time evolutions a ( t ) and Fourier transforms ˜ a ( ω ) N o r m a li ze d s i n g u l a r v a l u e s Mode index10 − − − − − a a a a a a a a a a a a a a χ χ χ χ χ χ χ B x B y B z B ux B uy B uz a a a a a a a ˜ a ˜ a ˜ a ˜ a ˜ a ˜ a ˜ a FIG. 1: The first seven POD modes for an isothermal, compressible Hall-MHD simulation of the HIT-SI device [38]:(a) Feature space trajectories of every mode pair and the singular values; (b) 3D spatial modes visualized in the Z = 0 midplane and normalized to ±
1; (c) Normalized temporal modes and corresponding Fourier transformsproduce harmonics of the driving frequency, labeled 1-5 in the Fourier space.where we have padded the matrix in the last step so that W Q i =0, W Qi = W Li for i ∈{ ,...,r } , and W Q = W C . Itimmediately follows from Eq. (12) that W Qij is an antisymmetric matrix. Writing out the coefficients we have0= W Q00 = ηµ (cid:73) (cid:2) ( ∇× ¯ B ) × ¯ B (cid:3) · ˆ n dS − (cid:90) (cid:20) ν ( ∇× ¯ B u −∇ ρ ρ × ¯ B u ) + ηµ ( ∇× ¯ B ) +43 ν ( ∇· ¯ B u −∇ ρ ρ · ¯ B u ) (cid:21) d x , (13)0= W Q i = ηµ (cid:73) (cid:104) ( ∇× ¯ B ) × χ Bi +( ∇× χ Bi ) × ¯ B (cid:105) · ˆ n dS − (cid:90) (cid:20) ν ( ∇× ¯ B u −∇ ρ ρ × ¯ B u ) · ( ∇× χ B u i −∇ ρ ρ × χ B u i )+ ηµ ( ∇× ¯ B ) · ( ∇× χ Bi )+43 ν ( ∇· ¯ B u −∇ ρ ρ · ¯ B u ) · ( ∇· χ B u i −∇ ρ ρ · χ B u i ) (cid:21) d x ,W Q ij = ηµ (cid:73) (cid:104) ( ∇× χ Bi ) × χ Bj (cid:105) · ˆ n dS − (cid:90) (cid:20) ν ( ∇× χ B u i −∇ ρ ρ × χ B u i ) · ( ∇× χ B u j −∇ ρ ρ × χ B u j )+ ηµ ( ∇× χ Bi ) · ( ∇× χ Bj )+43 ν ( ∇· χ B u i −∇ ρ ρ · χ B u i ) · ( ∇· χ B u j −∇ ρ ρ · χ B u j ) (cid:21) d x . With some algebra, we can compute a i ˙ a i for i ∈{ ,...,r } , a i ˙ a i = r (cid:88) i,j =1 a i ∂a j ∂t (cid:90) χ i χ j d x = (cid:90) ∂q ∂t d x = ∂W∂t (14) In index notation a i ˙ a i = a i C i + a i L ij a j + a i Q ijk a j a k for i,j,k ∈{ ,...,r } . First, note that W Q i =0 produces C i =0for all i ∈{ ,...,r } . In other words, there are no constantterms in the Galerkin model; data-driven methods canimplement this constraint by simply searching for modelsthat do not have constant terms. This a physical conse-quence of our assumption that ¯ q is steady-state becausenonzero constant terms in the Galerkin model would im-ply the possibility of unbounded growth in the energynorm. The anti-symmetry of W Qij for i,j ∈{ ,...,r } con-strains the quadratic structure of a T · a , a T L a ≈ . (15)This physical interpretation is also clear; if the plasma issteady-state but has finite dissipation, the input power,here manifested through a purely quadratic Poynting flux P ∝ η J × B , must be balancing these losses. Finally, be-cause of the boundary conditions there are no cubic terms in the energy at all, meaning a T Q aa =0 . (16)This is the condition for a system to have energy-preserving quadratic nonlinearities; this conclusion doesnot rely on any assumption of steady-state and energy-preserving structure in other quadratic nonlinearities iswell-studied in fluid mechanics [20, 40]. B. Global conservation of cross-helicity
An analogous derivation can be done to further con-strain the model-building for systems which conservecross-helicity, although this is not appropriate for thesimulation results in the accompanying work [16]. Con-sider the local form of cross-helicity H c = u · B . UsingGaltier [39] Eq. (3.36), ∂H c ∂t = −∇· (cid:32) u γp ( γ − ρ (cid:33) B + u × ( u × B ) − d i √ ρµ u × (cid:0) ( ∇× B ) × B (cid:1) − η u × ( ∇× B ) (17)+ ν ∇· (cid:18) B × ( ∇× u )+ 43 ( ∇· u ) B (cid:19) − d i √ ρµ ( ∇× u ) · (cid:0) ( ∇× B ) × B (cid:1) − ( η + ν )( ∇× B ) · ( ∇× u ) . Consider again the simplifying case J · ˆ n =0, B · ˆ n =0, and u · ˆ n = u × ˆ n =0. Then the integral form is0 ≈ (cid:90) ∂H c ∂t d x = (cid:90) (cid:34) ν ∇ ρρ · (cid:18) B × ( ∇× u )+ 43 ( ∇· u ) B (cid:19) − d i √ ρµ ( ∇× u ) · (cid:0) ( ∇× B ) × B (cid:1) − ( η + ν )( ∇× B ) · ( ∇× u ) (cid:35) d x . (18)Substituting in Eq. (7) produces terms up to cubic in thetemporal POD modes, (cid:90) ∂H c ∂t d x = ∂∂t ( a i a j ) (cid:90) √ ρµ χ B u i · χ Bj d x (19)0= A ij ∂∂t ( a i a j )= A ij C j a i A ij L jk a i a k A ij Q jkl a i a k a l Note that if the system is energy-preserving, C j =0 forall j, so the first equality is already satisfied. The secondinequality gives A ij L jk anti-symmetric under swapping i and k , and energy-preservation gave anti-symmetry un-der swapping j and k (Eq. 15). The most straightforwardsolution is L jk =0 for all j , k ; this solution is precisely theideal limit corresponding to η = ν =0. Since A ij is notsymmetric by construction, this constraint can also ap-ply to systems which conserve cross-helicity despite finitedissipation.Lastly, A ij Q jkl , containing only the contribution fromthe Hall-term, exhibits the same structure as (and is compatible with) our constraint on the energy-preservingnonlinearities in Eq. (16). The simplest solution is A ij Q jkl =0 for all i,k,l , since this corresponds to standardMHD without the Hall term. Like the analysis of the lin-ear terms, this constraint indicates that if the Hall-termshave this special energy-preserving structure, nonzeroHall contributions can still conserve cross-helicity. En-forcing other invariants of Hall-MHD may require alter-native formulations to the one presented here, since de-rived fields like the vector potential are involved. C. Conservation laws with velocity units
In closer analogy to fluid dynamics [17], we could havealternatively used q =[ u , u A ,u s ], u s = 4 Tm i ( γ − , u A = B √ µ ρ , (20)12 (cid:104) q , q (cid:105) = 12 (cid:90) (cid:16) u + u A + u s (cid:17) d x . (21)We have defined a scaled plasma sound speed, u s . If ρ is uniform ρ (cid:104) q , q (cid:105) / W . The isothermal and time-independent density assumptions allow us to derive an-other quadratic model in q , for which a POD-Galerkinmodel is readily available (the form is identical to Eq. 8but the coefficients have changed). Once again, assume u · ˆ n = u × ˆ n =0, and B · ˆ n =0 on the boundary, so that (cid:90) ρ dq dt d x = ∂W∂t . (22)This is equivalent to Eq. (14) in the particular case oftime-independent density. Without this assumption, anextra term appears, proportional to (cid:82) u ·∇ ( u + u A ) d x .Although from dimensional analysis this term is poten-tially very large, this may not be the case for many lab-oratory devices with strong anisotropy introduced by alarge external magnetic field. For instance, steady-statetoroidal plasmas with large closed flux surfaces wouldexpect u ·∇ u A and u ·∇ u to be small, as the fluid ve-locity is primarily along field lines and gradients in boththe magnetic and velocity fields are primarily across fieldlines. For this reason, in certain devices the use of q =[ u , u A ,u s ] could be a useful alternative to the formu-lation used in the main body of this work. It is possi-ble that, in these units, the structure of the nonlinear-ities in the associated POD-Galerkin model may provemore amenable to analysis. Now that we have illustratedhow global conservation laws manifest as structure inGalerkin models, we could compute these coefficients andevolve the subsequent model. However, For an explicitcalculation of the model coefficients, the first and sec-ond order spatial derivatives for the MHD fields must bewell-approximated in the region of experimental interest.In some cases, high-resolution diagnostics on experimen-tal devices can resolve these quantities in a particularregion of the plasma, but even if this data is available,computing these inner products and evaluating the non-linear terms in the model is resource-intensive. Fortu-nately, there are hyper-reduction techniques from fluiddynamics [41], such as the discrete empirical interpola-tion method (DEIM) [42], QDEIM [43], missing-point es-timation (MPE) [44] and gappy POD [45, 46], which canenable efficient computations. Instead of using hyper-reduction, one can use emerging and increasingly sophis-ticated machine learning methods to discover Galerkinmodels from data. In the following section, we deriveconstraints on machine learning methods that guaranteethe model structure we derived from global conservationlaws in Sec. III. IV. GLOBAL CONSERVATION LAWS INMACHINE LEARNING MODEL DISCOVERY
Increasingly, machine learning techniques are allowingscientists to extract a system’s governing equations ofmotion directly from data. Here we show how these algo- rithms can directly incorporate global conservation lawsduring the search for low-dimensional models in plasmadatasets. We use the sparse identification of nonlineardynamics (SINDy) algorithm [47] to identify nonlinearreduced-order models for plasmas in the accompanyingwork [16].
A. The constrained SINDy method
The goal of SINDy is to identify a low-dimensionalmodel for a ( t ), the vector of POD amplitudes, as a sparselinear combination of elements from a library of candi-date terms Θ = (cid:2) θ ( a ) θ ( a ) ··· θ p ( a ) (cid:3) : ddt a = f ( a ) ≈ Θ ( a ) Ξ . (23)To address this combinatorically hard problem, it lever-ages sparse regression techniques, optimizing for thesparsest set of equations that produces an accurate fitof the data. The SINDy optimization problem ismin Ξ || ˙ a − Θ ( a ) Ξ || + λR ( Ξ ) , (24)subject to D Ξ [:]= d , where R ( Ξ ) is some regularizer, like the L or L norm, which promotes sparsity in Ξ . Here a , ˙ a ∈ R M × r , Θ ( a ) ∈ R M × N , Ξ ∈ R N × r , D ∈ R N c × rN , Ξ [:] ∈ R rN , d ∈ R N c , where N is the number of candidateterms, N c is the number of constraints, and Ξ [:]= (cid:2) ξ a ··· ξ a r ··· ξ a N ··· ξ a r N (cid:3) . B. Derivation of the SINDy constraints
In Sec. III, we derived model constraints from globalconservation laws; our goal here is to rewrite these con-straints to be compatible with the notation in Eq. 24.The conclusions for the global conservation of energywere: 1) no constant terms, 2) an anti-symmetryconstraint on the linear part of the coefficient ma-trix Ξ , and 3) a more complicated energy-preservingstructure in the quadratic coefficients. Consider aquadratic library in a set of r modes, ordered as Θ ( a )=[ a ,...,a r ,a a ,...,a r − a r ,a ,...,a r ]. Note that this ar-rangement of the polynomials in Θ differs from Loiseauet al. [40], so the indexing and subscripts are also dif-ferent here. First we will consider the constraint on thelinear part of the Galerkin model in Eq. 8, or equivalentlythat the quadratic term a T L a ≈
0. We can rewrite thisin the SINDy notation as0= (cid:2) a ··· a r (cid:3) ξ a ··· ξ a r ... . . . ... ξ a r r ··· ξ a r r a ... a r . (25)We conclude ξ a j i = − ξ a i j for i,j ∈{ ,...,r } and identify ξ a j i by accessing the ( i − r + j index in Ξ [:]. Note we areonly accessing the first r elements of Ξ [:]. For mod-els limited to linear and quadratic polynomials, N =( r +3 r ) / N L =( r + r ) / rN − N L = r ( r +2 r − / D Ξ [:]= d ,we can write this out explicitly for r =3, ··· ··· ··· ··· ··· ··· ξ a ξ a ξ a ξ a ... = . (26)The boundary conditions u · ˆ n =0, J · ˆ n =0, B · ˆ n =0 guar-anteed that the quadratic nonlinearities were energy-preserving, and thus that cubic terms in Eq. 10 vanish.In this case, regardless of the steady-state assumption,we have that r (cid:88) i,j,k =0 Q ijk a i a j a k ≈ . (27)This constraint is significantly more involved to reformat.Written in SINDy notation, this is equivalent to0= (cid:2) a ··· a r (cid:3) ξ a r +1 ξ a r +2 ··· ξ a N ... ... ... ... ξ a r r +1 ξ a r r +2 ··· ξ a r N a a ... a r − a r a ... a r . (28)Expand this all out and group the like terms, i.e. termswhich look like a i , a i a j or a i a j a k , i,j,k ∈{ ,...,r } , i (cid:54) = j (cid:54) = k . All of the like terms can be straightforwardlyshown to be linearly independent, so we can considerthree constraints separately for the three types of terms.The number of each of these respective terms is (cid:0) r (cid:1) = r ,2 (cid:0) r (cid:1) = r ( r − (cid:0) r (cid:1) = r ( r − r − /
6, for a total of r ( r +1)( r +2) / N Q constraints. With both constraints,we have rN − N L − N Q = r ( r − r +5) / N c = N L + N Q constraints. Further considering thequadratic case, we find that coefficients which adorn a i must vanish, ξ a i N − r + i =0. Now define˜ ξ ijk = ξ a i r + j (2 r − j − k − . (29)The second type of constraint, with i (cid:54) = j , produces ξ a i N − r + j = (cid:40) ˜ ξ jij i
A hierarchy of models with varying fidelity is essen-tial for understanding and controlling plasmas. Thepresent work, along with the accompanying work [16],provides a principled lower level on this hierarchy − low-dimensional and interpretable plasma models which canbe used for physical discovery, forecasting, stability anal-ysis, and real-time control. We have illustrated howGalerkin plasma models retain the global conservationlaws of MHD, and how machine learning methods likeSINDy can use these constraints directly in an optimiza-tion procedure for discovering such models from data.This principled enforcement of global conservation lawsis critical for the success of future low-dimensional plasmamodels. VI. ACKNOWLEDGEMENTS
The authors would like to extend their gratitude toDr. Uri Shumlak for his input on this work. This workand the companion work were supported by the ArmyResearch Office (ARO W911NF-19-1-0045) and the AirForce Office of Scientific Research (AFOSR FA9550-18-1-0200). Simulations were supported by the U.S. Depart-ment of Energy under award numbers DE-SC0016256and DE-AR0001098 and used resources of the NationalEnergy Research Scientific Computing Center, supportedby the Office of Science of the U.S. Department of Energyunder Contract No. DE-AC02–05CH11231. [1] J. Candy and R. E. Waltz, Anomalous transport scal-ing in the DIII-D tokamak matched by supercomputersimulation, Phys. Rev. Lett. , 045001 (2003).[2] O. Ohia, J. Egedal, V. S. Lukin, W. Daughton, and A. Le,Demonstration of anisotropic fluid closure capturing thekinetic structure of magnetic reconnection, Phys. Rev.Lett. , 115004 (2012).[3] D. Groˇselj, A. Mallet, N. F. Loureiro, and F. Jenko, Fullykinetic simulation of 3D kinetic Alfv´en turbulence, Phys.Rev. Lett. , 105101 (2018).[4] K. Taira, S. L. Brunton, S. Dawson, C. W. Rowley,T. Colonius, B. J. McKeon, O. T. Schmidt, S. Gordeyev,V. Theofilis, and L. S. Ukeiley, Modal analysis of fluidflows: An overview, AIAA Journal , 4013 (2017).[5] I. Furno, Fast transient transport phenomena measuredby soft x-ray emission in TCV tokamak plasmas , Tech.Rep. (2001).[6] R. Jim´enez-G´omez, E. Ascas´ıbar, T. Estrada, I. Garc´ıa-Cort´es, B. Van Milligen, A. L´opez-Fraguas, I. Pastor,and D. L´opez-Bruna, Analysis of magnetohydrodynamicinstabilities in TJ-II plasmas, Fusion science and tech-nology , 20 (2007).[7] B. P. van Milligen, E. S´anchez, A. Alonso, M. A. Pe-drosa, C. Hidalgo, A. M. de Aguilera, and A. L. Fraguas,The use of the biorthogonal decomposition for the iden-tification of zonal flows at TJ-II, Plasma Physics andControlled Fusion , 025005 (2014).[8] M. Pandya, Low edge safety factor disruptions in theCompact Toroidal Hybrid: Operation in the low-q regime,passive disruption avoidance and the nature of MHD pre-cursors , Ph.D. thesis, Auburn University (2016).[9] B. Victor, C. Akcay, C. Hansen, T. Jarboe, B. Nelson,and K. Morgan, Development of validation metrics usingbiorthogonal decomposition for the comparison of mag-netic field measurements, Plasma Physics and ControlledFusion , 045010 (2015).[10] E. Strait, J. King, J. Hanson, and N. Logan, Spatial andtemporal analysis of DIII-D 3D magnetic diagnostic data,Review of Scientific Instruments , 11D423 (2016).[11] P. J. Byrne, Study of External Kink Modes in ShapedHBT-EP Plasmas , Ph.D. thesis, Columbia University(2017).[12] S. Gu, B. Wan, Y. Sun, N. Chu, Y. Liu, T. Shi, H. Wang,M. Jia, and K. He, A new criterion for controlling edgelocalized modes based on a multi-mode plasma response,Nuclear Fusion , 126042 (2019).[13] A. A. Kaptanoglu, K. D. Morgan, C. J. Hansen, andS. L. Brunton, Characterizing magnetized plasmas withdynamic mode decomposition, Physics of Plasmas ,032108 (2020).[14] P. Benner, V. Mehrmann, and D. C. Sorensen, Dimensionreduction of large-scale systems , Vol. 45 (Springer, 2005).[15] P. Benner, M. Ohlberger, A. Cohen, and K. Willcox,
Model reduction and approximation: theory and algo-rithms (SIAM, 2017).[16] A. A. Kaptanoglu, K. D. Morgan, C. J. Hansen, and S. L.Brunton, Physics-constrained, low-dimensional modelsfor mhd: First-principles and data-driven approaches,arXiv preprint arXiv:2004.10389 (2020).[17] C. W. Rowley, T. Colonius, and R. M. Murray, Modelreduction for compressible flows using POD and Galerkin projection, Physica D: Nonlinear Phenomena , 115(2004).[18] B. R. Noack, M. Schlegel, M. Morzynski, and G. Tad-mor,
Galerkin method for nonlinear dynamics (Springer,2011).[19] M. J. Balajewicz, E. H. Dowell, and B. R. Noack, Low-dimensional modelling of high-Reynolds-number shearflows incorporating constraints from the Navier–Stokesequation, Journal of Fluid Mechanics , 285 (2013).[20] M. Schlegel and B. R. Noack, On long-term boundednessof Galerkin models, Journal of Fluid Mechanics , 325(2015).[21] K. Carlberg, M. Barone, and H. Antil, Galerkin v. least-squares Petrov–Galerkin projection in nonlinear modelreduction, Journal of Computational Physics , 693(2017).[22] T. Dudok de Wit, A.-L. Pecquet, J.-C. Vallet, andR. Lima, The biorthogonal decomposition as a tool forinvestigating fluctuations in plasmas, Physics of Plasmas , 3288 (1994).[23] J. Levesque, N. Rath, D. Shiraki, S. Angelini, J. Bialek,P. Byrne, B. DeBono, P. Hughes, M. Mauel, G. Navratil, et al. , Multimode observations and 3D magnetic controlof the boundary of a tokamak plasma, Nuclear Fusion , 073037 (2013).[24] C. Galperti, C. Marchetto, E. Alessi, D. Minelli,M. Mosconi, F. Belli, L. Boncagni, A. Botrugno, P. Bu-ratti, B. Esposito, et al. , Development of real-time MHDmarkers based on biorthogonal decomposition of signalsfrom Mirnov coils, Plasma Physics and Controlled Fusion , 114012 (2014).[25] B. P. Van Milligen, E. S´anchez, A. Alonso, M. Pedrosa,C. Hidalgo, A. M. De Aguilera, and A. L. Fraguas, Theuse of the biorthogonal decomposition for the identifica-tion of zonal flows at TJ-II, Plasma Physics and Con-trolled Fusion , 025005 (2014).[26] C. Hansen, B. Victor, K. Morgan, T. Jarboe, A. Hos-sack, G. Marklin, B. Nelson, and D. Sutherland, Nu-merical studies and metric development for validationof magnetohydrodynamic models on the HIT-SI exper-iment, Physics of Plasmas , 056105 (2015).[27] D. D. Schnack, D. C. Barnes, D. P. Brennan, C. C.Hegna, E. Held, C. C. Kim, S. E. Kruger, A. Y. Pankin,and C. R. Sovinec, Computational modeling of fully ion-ized magnetized plasmas using the fluid approximation,Physics of Plasmas , 058103 (2006).[28] Z. W. Ma and A. Bhattacharjee, Hall magnetohydrody-namic reconnection: The geospace environment model-ing challenge, Journal of Geophysical Research: SpacePhysics , 3773.[29] V. Krishan and S. M. Mahajan, Magnetic fluctuationsand Hall magnetohydrodynamic turbulence in the solarwind, Journal of Geophysical Research: Space Physics .[30] F. Ebrahimi, B. Lefebvre, C. B. Forest, and A. Bhat-tacharjee, Global Hall-MHD simulations of magnetoro-tational instability in a plasma Couette flow experiment,Physics of Plasmas , 062904 (2011).[31] N. M. Ferraro, Calculations of two-fluid linear response tonon-axisymmetric fields in tokamaks, Physics of Plasmas , 056105 (2012). [32] A. A. Kaptanoglu, T. E. Benedett, K. D. Morgan, C. J.Hansen, and T. R. Jarboe, Two-temperature effects inHall-MHD simulations of the HIT-SI experiment, Physicsof Plasmas , 072505 (2020).[33] S. L. Brunton and J. N. Kutz, Data-driven science andengineering: Machine learning, dynamical systems, andcontrol (Cambridge University Press, 2019).[34] G. H. Golub et al. , Matrix computations, The Johns Hop-kins (1996).[35] A. Frieze, R. Kannan, and S. Vempala, Fast Monte-Carloalgorithms for finding low-rank approximations, Journalof the ACM (JACM) , 1025 (2004).[36] E. Liberty, F. Woolfe, P.-G. Martinsson, V. Rokhlin,and M. Tygert, Randomized algorithms for the low-rankapproximation of matrices, Proceedings of the NationalAcademy of Sciences , 20167 (2007).[37] F. Woolfe, E. Liberty, V. Rokhlin, and M. Tygert, A fastrandomized algorithm for the approximation of matrices,Applied and Computational Harmonic Analysis , 335(2008).[38] T. Jarboe, W. Hamp, G. Marklin, B. Nelson, R. O’Neill,A. Redd, P. Sieck, R. Smith, and J. Wrobel, Spheromakformation by steady inductive helicity injection, Physicalreview letters , 115003 (2006).[39] S. Galtier, Introduction to modern magnetohydrodynam-ics (Cambridge University Press, 2016).[40] J.-C. Loiseau, B. R. Noack, and S. L. Brunton, Sparsereduced-order modeling: sensor-based dynamics to full-state estimation, Journal of Fluid Mechanics , 459(2018).[41] P. Benner, S. Gugercin, and K. Willcox, A survey ofprojection-based model reduction methods for paramet-ric dynamical systems, SIAM review , 483 (2015).[42] S. Chaturantabut and D. C. Sorensen, Discrete empiricalinterpolation for nonlinear model reduction, in Proceed-ings of the 48th IEEE Conference on Decision and Con-trol (CDC) held jointly with 2009 28th Chinese Control Conference (IEEE, 2009) pp. 4316–4321.[43] Z. Drmac and S. Gugercin, A new selection operator forthe discrete empirical interpolation method—improveda priori error bound and extensions, SIAM Journal onScientific Computing , A631 (2016).[44] P. Astrid, S. Weiland, K. Willcox, and T. Backx, Missingpoint estimation in models described by proper orthog-onal decomposition, IEEE Transactions on AutomaticControl , 2237 (2008).[45] K. Willcox, Unsteady flow sensing and estimation via thegappy proper orthogonal decomposition, Computers &Fluids , 208 (2006).[46] K. Carlberg, C. Farhat, J. Cortial, and D. Amsallem, TheGNAT method for nonlinear model reduction: effectiveimplementation and application to computational fluiddynamics and turbulent flows, Journal of ComputationalPhysics , 623 (2013).[47] S. L. Brunton, J. L. Proctor, and J. N. Kutz, Discover-ing governing equations from data by sparse identifica-tion of nonlinear dynamical systems, Proceedings of theNational Academy of Sciences , 3932 (2016).[48] P. J. Morrison and J. M. Greene, Noncanonical hamil-tonian density formulation of hydrodynamics and idealmagnetohydrodynamics, Physical Review Letters , 790(1980).[49] Z. Yoshida and E. Hameiri, Canonical hamiltonian me-chanics of hall magnetohydrodynamics and its limitto ideal magnetohydrodynamics, Journal of Physics A:Mathematical and Theoretical , 335502 (2013).[50] H. M. Abdelhamid, Y. Kawazura, and Z. Yoshida, Hamil-tonian formalism of extended magnetohydrodynamics,Journal of Physics A: Mathematical and Theoretical ,235502 (2015).[51] H. K. Chu and M. Hayashibe, Discovering interpretabledynamics by sparsity promotion on energy and the la-grangian, IEEE Robotics and Automation Letters5