Towards tensor-based methods for the numerical approximation of the Perron–Frobenius and Koopman operator
Stefan Klus and Christof Schütte
Department of Mathematics and Computer Science, Freie Universität Berlin, Germany
Zuse Institute Berlin, Germany
Abstract
The global behavior of dynamical systems can be studied by analyzing the eigenvalues and corresponding eigenfunctions of linear operators associated with the system. Two important operators which are frequently used to gain insight into the system's behavior are the Perron–Frobenius operator and the Koopman operator. Due to the curse of dimensionality, computing the eigenfunctions of high-dimensional systems is in general infeasible. We will propose a tensor-based reformulation of two numerical methods for computing finite-dimensional approximations of the aforementioned infinite-dimensional operators, namely Ulam's method and Extended Dynamic Mode Decomposition (EDMD). The aim of the tensor formulation is to approximate the eigenfunctions by low-rank tensors, potentially resulting in a significant reduction of the time and memory required to solve the resulting eigenvalue problems, provided that such a low-rank tensor decomposition exists. Typically, not all variables of a high-dimensional dynamical system contribute equally to the system's behavior; often the dynamics can be decomposed into slow and fast processes, which is also reflected in the eigenfunctions. Thus, the weak coupling between different variables might be approximated by low-rank tensor cores. We will illustrate the efficiency of the tensor-based formulation of Ulam's method and EDMD using simple stochastic differential equations.
1 Introduction

The Perron–Frobenius operator and the Koopman operator enable the analysis of the global behavior of dynamical systems. Eigenfunctions of these operators can be used to extract the dominant dynamics, to detect almost invariant sets, or to decompose the system into fast and slow processes [1, 2, 3, 4, 5, 6, 7]. Assume the state space of the system is R^d and each direction is discretized using n grid points (or boxes if Ulam's method is used); then overall n^d values need to be stored. Even for d = 10 and n = 10, more than 70 gigabytes of storage space would be required, whereas typical systems might have hundreds or thousands of dimensions and naturally also require a more fine-grained discretization. This so-called curse of dimensionality can be overcome by using tensor formats which compress the data and store only the information that is relevant for the reconstruction. In general, only an approximation of the original data can be retrieved. Approximating the objects under consideration by sums of low-rank tensor products has become a powerful approach for tackling high-dimensional problems [8], and in many physically significant problems near-linear complexity can be achieved since the separation rank depends only weakly on the dimension [9]. For high-dimensional systems exhibiting multiscale behavior, it might be possible to represent the weak coupling between different variables by low-rank tensor cores. The leading eigenfunctions of the Koopman operator, for instance, are typically almost constant in the fast variables of the system and depend mainly on the slowly changing variables (see [6]). Thus, using tensor-based algorithms could significantly reduce the amount of time and memory required to compute and store eigenfunctions. In this way, analyzing high-dimensional systems that could not be tackled using standard methods might become feasible.

Tensors, in our sense, are just multidimensional arrays as shown in Figure 1.
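The storage estimate above can be checked with a few lines (the grid sizes are the ones from the text; assuming eight bytes per value, i.e. double precision):

```python
# Back-of-the-envelope check of the storage estimate: a full grid with
# n = 10 points per direction in d = 10 dimensions.
n, d = 10, 10
num_values = n ** d                   # 10^10 grid values
bytes_total = num_values * 8          # 8 bytes per double-precision float
gigabytes = bytes_total / 1024 ** 3   # binary gigabytes (GiB)
print(num_values)                     # 10000000000
print(round(gigabytes, 1))            # 74.5, i.e. more than 70 gigabytes
```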
Here and in what follows, standard vectors will be denoted by lower-case letters, e.g. v, matrices by upper-case letters, e.g. A, and tensors by the corresponding bold symbols, e.g. x. It is important to note that tensors are typically not explicitly given – for example by observed data – but only implicitly as solutions of systems of linear or nonlinear equations or eigenvalue problems [10]. Thus, numerical methods that operate directly on tensor approximations need to be developed since the full tensors cannot be stored or handled anymore in practice.

Figure 1:
Tensors as multidimensional arrays. Here, v = (v_i) denotes a vector, A = (a_ij) a matrix, and x = (x_ijk) a tensor of order three.

Over the last years, low-rank tensor approximation approaches have become increasingly popular in the scientific computing community and are now becoming a standard tool to cope with large-scale problems that could not be handled before by standard numerical methods. An overview of different low-rank tensor approximation approaches can be found in [10].

In this paper, we will show that the use of low-rank tensor approximation schemes potentially enables the computation of eigenfunctions of high-dimensional systems. The aim of this paper, however, is not to show that our approach is more efficient – the tensor algorithms are mainly implemented in Matlab, and a comparison with highly optimized numerical libraries implemented in C or C++ would not lead to meaningful results; developing high-performance libraries for large-scale tensor problems is a separate task – but to derive a tensor-based reformulation of existing methods and to show equivalence, so that the theory available for the conventional matrix-vector based formulation can be carried over to multi-dimensional arrays. Furthermore, tensors could enable low-rank approximations of the Perron–Frobenius and Koopman operator as well as their eigenfunctions. One of the main future goals is to combine low-rank tensor decomposition techniques and the splitting of the dynamics into fast and slow processes. In [6], it has been shown that such a splitting of a multi-scale system exists and can be exploited to extract the slow dynamics. Another open problem is the generation of the low-rank approximations of the operators. Currently, the canonical tensor format representations are converted to the tensor-train format, which is time-consuming and in general leads to high ranks.
Ideally, low-rank TT approximations should be directly generated from the given data.

We will start by introducing standard methods such as Ulam's method and the recently developed Extended Dynamic Mode Decomposition (EDMD) to approximate the eigenfunctions of the Perron–Frobenius operator and the Koopman operator in Section 2. Then, in Section 3, different tensor formats will be described. In Section 4, we will reformulate Ulam's method and EDMD as tensor-based methods. Section 5 contains a brief summary of simple power iteration schemes for the resulting tensor-based (generalized) eigenvalue problems. Simple examples which illustrate the proposed approaches are shown in Section 6. In Section 7, we will conclude with a short summary and possible future work.

In this section, we will briefly introduce the Perron–Frobenius operator P and the Koopman operator K as well as numerical methods to compute finite-dimensional approximations, namely Ulam's method and EDMD. The main difference between Ulam's method and EDMD is that the former uses indicator functions for a given box discretization of the domain while the latter allows arbitrary ansatz functions such as monomials, Hermite polynomials, trigonometric functions, or radial basis functions. Although EDMD was primarily developed for the approximation of the Koopman operator, it can be used to compute eigenfunctions of the Perron–Frobenius operator as well [12]. Analogously, Ulam's method can also be used to compute eigenfunctions of the Koopman operator. Higher-order methods for the approximation of the Perron–Frobenius operator have been proposed in [11].

2.1 Perron–Frobenius and Koopman operator

Let S : X → X be a dynamical system defined on a domain X, for example X ⊆ R^d. Then the Perron–Frobenius operator or transfer operator P is defined by

    ∫ g · (P f) dm = ∫ (g ∘ S) · f dm,    (1)

for all f, g ∈ F, where F is an appropriately defined function space and ∘ denotes function composition.
We assume in what follows that F = L²(X). The aim is to compute eigenfunctions of the Perron–Frobenius operator, given by

    P φ_i = λ_i φ_i.

The eigenfunction φ_1 corresponding to λ_1 = 1 is the invariant density of the system, i.e. P φ_1 = φ_1. The magnitude of the second largest eigenvalue λ_2 can be interpreted as the rate at which initial densities converge to the invariant density (for more details and assumptions about the dynamical system, see e.g. [6] and references therein). More generally, the leading eigenvalues of the Perron–Frobenius operator close to one correspond to the slowly converging transients of the system.

The Koopman operator K, on the other hand, is defined by K f = f ∘ S and acts on functions f : X → C, f ∈ F. Correspondingly, the stochastic Koopman operator is defined by K f = E[f ∘ S], where E[·] denotes the expected value with respect to the probability measure underlying S(x). We will only introduce the required notation and focus mainly on discrete-time dynamical systems; for more details on the Koopman operator and its properties, we refer to [4, 5, 13]. While the Perron–Frobenius operator describes the evolution of densities, the Koopman operator describes the evolution of observables, which could be measurements or sensor probes [4]. Instead of analyzing orbits {x, S(x), S²(x), ...} of the dynamical system, we now analyze the measurements {f(x), f(S(x)), f(S²(x)), ...} at these points.

The Koopman operator K is the adjoint of the Perron–Frobenius operator P and thus an infinite-dimensional but linear operator. A finite-dimensional approximation (computed using generalized Galerkin methods) of this operator captures the dynamics of a nonlinear dynamical system without necessitating a linearization around a fixed point [4, 5]. We are again interested in eigenfunctions of the operator, given by

    K φ_i = λ_i φ_i.
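As a toy illustration (not from the paper), the defining property K f = f ∘ S can be checked directly for the linear map S(x) = a x, whose monomial observables are Koopman eigenfunctions:

```python
# Minimal illustration of the Koopman operator's action K f = f ∘ S for the
# linear map S(x) = a x: the observable f(x) = x^k is an eigenfunction with
# eigenvalue a^k, since f(S(x)) = (a x)^k = a^k f(x). Names are ours.
a = 0.5
S = lambda x: a * x

def koopman(f):
    return lambda x: f(S(x))   # (K f)(x) = f(S(x))

f = lambda x: x ** 3           # candidate eigenfunction, k = 3
x = 2.0
print(koopman(f)(x))           # 1.0
print(a ** 3 * f(x))           # 1.0  -> K f = a^3 f
```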
Let f : X → R be an observable of the system that can be written as a linear combination of the linearly independent eigenfunctions φ_i, i.e.

    f(x) = ∑_i c_i φ_i(x),

with c_i ∈ C. Then

    (K f)(x) = ∑_i λ_i c_i φ_i(x).

Analogously, for vector-valued functions F = [f_1, ..., f_n]^T, we obtain

    K F = [∑_i λ_i c_{i,1} φ_i, ..., ∑_i λ_i c_{i,n} φ_i]^T = ∑_i λ_i φ_i [c_{i,1}, ..., c_{i,n}]^T = ∑_i λ_i φ_i v_i,

where v_i = [c_{i,1}, ..., c_{i,n}]^T. These vectors v_i corresponding to the eigenfunctions φ_i are called Koopman modes.

2.2 Ulam's method

A frequently used method to compute an approximation of the Perron–Frobenius operator is Ulam's method, see e.g. [2, 14, 15, 6]. First, the state space X is covered by a finite number of disjoint boxes {B_1, ..., B_k}. Let 1_{B_i} be the indicator function for box B_i, i.e.

    1_{B_i}(x) = { 1, if x ∈ B_i,
                   0, otherwise.

With (1), we obtain

    ∫ 1_{B_j} · (P 1_{B_i}) dm = ∫ (1_{B_j} ∘ S) · 1_{B_i} dm = ∫ 1_{S^{-1}(B_j)} · 1_{B_i} dm = m(S^{-1}(B_j) ∩ B_i).

This relationship can be represented by a matrix P̂ = (p̂_ij) ∈ R^{k×k} with

    p̂_ij = m(S^{-1}(B_j) ∩ B_i) / m(B_i).

Here, in order to avoid confusion with the approximation P on a tensor space introduced below, we denote the matrix representation by P̂ instead of P. The denominator m(B_i) normalizes the entries p̂_ij so that P̂ becomes a row-stochastic matrix and defines a finite Markov chain. The left eigenvector corresponding to the eigenvalue λ_1 = 1 approximates the invariant measure of the Perron–Frobenius operator P.

The entries p̂_ij of the matrix P̂ represent the probabilities of points being mapped from box B_i to box B_j by the dynamical system S. These entries can be estimated by randomly choosing a large number of test points x_i^(l), l = 1, ..., n, in each box B_i and by counting how many times test points were mapped from box B_i to box B_j by S, i.e.

    p̂_ij ≈ (1/n) ∑_{l=1}^n 1_{B_j}(S(x_i^(l))).    (2)

The eigenfunctions of the Perron–Frobenius operator are then approximated by the left eigenvectors of the matrix P̂, the eigenfunctions of the Koopman operator by the right eigenvectors.

2.3 EDMD

An approximation of the Koopman operator, the Koopman eigenvalues, eigenfunctions, and eigenmodes can be computed using EDMD. The method requires data, i.e. a set of values x_i and the corresponding y_i = S(x_i) values, i = 1, ..., m, written in matrix form as

    X = [x_1  x_2  ···  x_m]  and  Y = [y_1  y_2  ···  y_m],

and additionally a set of ansatz functions or observables D = {ψ_1, ψ_2, ..., ψ_k}, with ψ_i : X → R. Thus, X, Y ∈ R^{d×m}. The vectors x_i are used as collocation points to approximate the integrals required for the approximation of the Koopman operator. Let

    Ψ = [ψ_1  ψ_2  ···  ψ_k]^T,    (3)

Ψ : X → R^k, be the vector of all ansatz functions; then K can be approximated by a matrix K̂ ∈ R^{k×k}, with

    K̂^T = Â Ĝ^+,

where ^+ denotes the pseudoinverse. The matrices Â, Ĝ ∈ R^{k×k} are defined as

    Â = (1/m) ∑_{l=1}^m Ψ(y_l) Ψ(x_l)^T,    Ĝ = (1/m) ∑_{l=1}^m Ψ(x_l) Ψ(x_l)^T.    (4)

As before, we use the ^ symbol to distinguish the matrices from the tensor approximations that will be introduced in Section 4. An approximation of the eigenfunction φ_i of the Koopman operator K is then given by φ_i = ξ_i Ψ, where ξ_i is the i-th left eigenvector of the matrix K̂^T. Alternatively, the generalized eigenvalue problem

    ξ_i Â = λ_i ξ_i Ĝ    (5)

can be solved, provided that Ĝ is regular. To compute eigenfunctions of the Perron–Frobenius operator, the corresponding eigenvalue problem

    ξ_i Â^T = λ_i ξ_i Ĝ

needs to be solved. Note that this formulation is similar to the variational approach to compute eigenfunctions of transfer operators of reversible processes presented in [16, 17]. For more details, we refer the reader to [12].

3 Tensor formats

Several different tensor formats have been developed in the past, e.g.
the canonical format, the Tucker format, and the tensor-train format. In this section, we will briefly introduce tensors and the required notation. The overall goal is to rewrite the methods presented in the previous section as tensor-based methods and to take advantage of low-rank tensor approximations and the fact that the dynamics of high-dimensional systems can often be decomposed.
3.1 Full format

A tensor in full format is simply a multidimensional array v ∈ R^{k_1×···×k_d}. (A variation of this format is the sparse format, which stores only the nonzero entries and is used, for example, in the sparse grid approach [18].) The entries of a tensor v are indexed by v_i = v_{i_1,...,i_d} = v[i] = v[i_1, ..., i_d], where i = (i_1, ..., i_d) is a multi-index. Addition and subtraction are trivially defined element-wise. Multiplication of a tensor v by a scalar c ∈ R is naturally generalized as (c v)[i_1, ..., i_d] = c v[i_1, ..., i_d]. Matrix-vector multiplication is defined as follows: Given a linear operator A defined on a tensor space R^{k_1×···×k_d}, with A = A[i_1, ..., i_d, j_1, ..., j_d] ∈ R^{k_1×···×k_d×k_1×···×k_d}, the product of A and v is

    (A v)[i_1, ..., i_d] = ∑_{j_1=1}^{k_1} ··· ∑_{j_d=1}^{k_d} A[i_1, ..., i_d, j_1, ..., j_d] v[j_1, ..., j_d]

or, in shorthand notation using multi-indices, (A v)_i = ∑_j A_ij v_j. Furthermore, the inner product of two tensors v, w ∈ R^{k_1×···×k_d} is defined as

    ⟨v, w⟩ = ∑_{i_1=1}^{k_1} ··· ∑_{i_d=1}^{k_d} v[i_1, ..., i_d] w[i_1, ..., i_d]

and the outer product v ⊗ w ∈ R^{k_1×···×k_d×k_1×···×k_d} as a tensor with entries

    (v ⊗ w)[i_1, ..., i_d, j_1, ..., j_d] = v[i_1, ..., i_d] w[j_1, ..., j_d].

The outer product v ⊗ w can be regarded as a linear map that acts on tensors in R^{k_1×···×k_d}. Often it is required or convenient to rewrite a tensor as a vector. The vectorization of a tensor, denoted vec(v), where vec : R^{k_1×···×k_d} → R^{k_1···k_d}, reorders the entries of v into one column vector. For v ∈ R^{2×2×2}, for example,

    vec(v) = [v_111  v_211  v_121  v_221  v_112  v_212  v_122  v_222]^T.

For our purposes, we will mainly be interested in eigenvalue problems of the form A v = λ v or A v = λ B v. Since the full format quickly becomes infeasible for growing d, one typically uses the r-term or TT format for numerical computations in order to minimize computational costs as well as storage requirements.
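For d = 2, these full-format operations can be expressed directly with einsum; the final check also illustrates the vectorization used later in Proposition 4.2 (all sizes are illustrative):

```python
import numpy as np

# Full-format tensor operations for d = 2 (illustrative sizes, names ours).
k1, k2 = 3, 4
rng = np.random.default_rng(1)
A = rng.standard_normal((k1, k2, k1, k2))   # linear operator on R^{k1 x k2}
v = rng.standard_normal((k1, k2))
w = rng.standard_normal((k1, k2))

Av = np.einsum('ijkl,kl->ij', A, v)         # (A v)_i = sum_j A_ij v_j
ip = np.einsum('ij,ij->', v, w)             # inner product <v, w>
outer = np.einsum('ij,kl->ijkl', v, w)      # outer product v ⊗ w

# Vectorization: with column-major ordering (matching the index map (6) below),
# the tensor operator turns into an ordinary matrix acting on vec(v).
vec = lambda t: t.reshape(-1, order='F')
A_mat = A.reshape(k1 * k2, k1 * k2, order='F')
print(np.allclose(A_mat @ vec(v), vec(Av)))   # True
```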
Except for very particular examples, it is impossible to compress the data without any compression error [18]. Typically, the tensor representation is just an approximation of the original data. Below, we will describe different compressed tensor formats; the introduction is based on [18].

3.2 Canonical format

A tensor space is given by V = ⨂_{µ=1}^d V_µ, where V_1, ..., V_d are vector spaces defined over the same field K, typically R or C. An elementary tensor is defined to be a product of the form

    v = v_1 ⊗ ··· ⊗ v_d,

with v_µ ∈ V_µ, µ = 1, ..., d. An algebraic tensor is then a linear combination of elementary tensors, i.e.

    v = ∑_{l=1}^r v_1^(l) ⊗ ··· ⊗ v_d^(l).

This format is also called the r-term format or CP format. That is, instead of trying to store the tensor in the dense format, one only considers tensors that can be written as products of the form v[i_1, ..., i_d] = v_1[i_1] ··· v_d[i_d]. If the best approximation of this form is not good enough, then the natural extension is to consider v[i_1, ..., i_d] = ∑_{l=1}^r v_1^(l)[i_1] ··· v_d^(l)[i_d], cf. [19, 9]. Defining

    R_r = { ∑_{l=1}^r v_1^(l) ⊗ ··· ⊗ v_d^(l) : v_µ^(l) ∈ V_µ }

for r ∈ N, which implies that {0} = R_0 ⊂ R_1 ⊂ ··· ⊂ V, we call

    rank(v) = min{r : v ∈ R_r}

the tensor rank of v.

Example 3.1.
Let us consider a system of the form dx_t = −∇_x V(x_t) dt + σ dW_t, where V is the energy landscape associated with the system. The invariant density of this process is given by µ(x) = (1/Z) e^{−βV(x)}, where β and Z are constants [20]. Assume that the state space is three-dimensional and that the potential function can be written as V(x) = V_1(x_1) + V_2(x_2) + V_3(x_3); then

    µ(x) = (1/Z) e^{−βV_1(x_1)} e^{−βV_2(x_2)} e^{−βV_3(x_3)}.

Thus, storing the invariant density for a grid with k grid points in each direction in the full format would require an array of size k³, while storing it in the canonical tensor format would require only a tensor of rank 1 and thus an array of size 3k. △

Given tensors in the r-term format, basic operations are defined as follows [18]:

• Addition:

    v = ∑_{l=1}^{r_v} ⨂_{µ=1}^d v_µ^(l),  w = ∑_{l=1}^{r_w} ⨂_{µ=1}^d w_µ^(l)  ⇒  x = v + w = ∑_{l=1}^{r_v+r_w} ⨂_{µ=1}^d x_µ^(l),

  where

    x_µ^(l) = { v_µ^(l),        1 ≤ l ≤ r_v,
                w_µ^(l−r_v),    r_v + 1 ≤ l ≤ r_v + r_w.

• Matrix-vector multiplication:

    A = ∑_{l_A=1}^{r_A} ⨂_{µ=1}^d A_µ^(l_A),  v = ∑_{l_v=1}^{r_v} ⨂_{µ=1}^d v_µ^(l_v)  ⇒  A v = ∑_{l_A=1}^{r_A} ∑_{l_v=1}^{r_v} ⨂_{µ=1}^d A_µ^(l_A) v_µ^(l_v).

Since these operations increase the rank, truncation is typically required to approximate the resulting tensor v with rank r_v by a tensor ṽ with a lower rank r_ṽ, by either fixing the rank r_ṽ or by fixing ε such that ‖v − ṽ‖ < ε.

3.3 TT format

Another frequently used tensor format is the TT format – TT now stands for tensor train instead of the former tree tensor – which can be obtained by successive singular value decompositions. This is a special case of a more general hierarchical format and has been introduced in quantum physics under the name
Matrix Product States (MPS); see [18] for details. A tensor v ∈ R^{k_1×···×k_d} is decomposed into d component tensors v_µ of at most order three (the first and last are of order two and are often, for the sake of simplicity, considered as tensors of order three with "boundary condition" ρ_0 = ρ_d = 1). That is, the entries of v are given by

    v[i_1, ..., i_d] = ∑_{k_1=1}^{ρ_1} ··· ∑_{k_{d−1}=1}^{ρ_{d−1}} v_1[1, i_1, k_1] v_2[k_1, i_2, k_2] ··· v_{d−1}[k_{d−2}, i_{d−1}, k_{d−1}] v_d[k_{d−1}, i_d, 1].

For fixed indices, the component tensors of order three can be regarded as matrices, which leads to a more compact representation

    v[i_1, ..., i_d] = V_1[i_1] V_2[i_2] ··· V_{d−1}[i_{d−1}] V_d[i_d]

and justifies the original name MPS. Here, the numbers ρ_i are the ranks of the TT tensor, resulting in a rank vector ρ = [ρ_0, ρ_1, ..., ρ_d] which determines the complexity of the representation. The main advantages of the TT format are its stability from an algorithmic point of view and reasonable computational costs, provided that the ranks of the tensors are small [8]. The basic operations such as addition and matrix-vector multiplication are more complex than in the canonical format and can be found, for example, in [21]. Converting a tensor from the canonical format to the TT format is trivial, but the TT representation requires more memory. Numerical toolboxes for the TT decomposition of tensors and several algorithms for solving linear systems of equations are available online, see e.g. [22].

Complexity-wise, the canonical format would be the ideal candidate for representing tensors since the number of required parameters depends only linearly on the dimension d, the rank r, and the sizes of the individual vector spaces. It turned out, however, that solving even simple problems using the canonical format is hard in practice due to redundancies and instabilities which can lead to numerical problems [8].
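The construction by successive singular value decompositions can be sketched as follows (a bare-bones TT-SVD; function names are ours, and actual toolboxes such as [22] are far more refined):

```python
import numpy as np

# Bare-bones TT-SVD sketch: each core v_mu has shape (rho_{mu-1}, k_mu, rho_mu),
# with the boundary condition rho_0 = rho_d = 1.
def tt_svd(v, eps=1e-12):
    dims = v.shape
    cores, rho = [], 1
    rest = v.reshape(rho * dims[0], -1)
    for mu in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(rest, full_matrices=False)
        r = max(1, int(np.sum(s > eps * s[0])))   # drop negligible singular values
        cores.append(U[:, :r].reshape(rho, dims[mu], r))
        rest = (np.diag(s[:r]) @ Vt[:r]).reshape(r * dims[mu + 1], -1)
        rho = r
    cores.append(rest.reshape(rho, dims[-1], 1))
    return cores

def tt_full(cores):
    # contract the train back into a full tensor
    out = cores[0]
    for c in cores[1:]:
        out = np.tensordot(out, c, axes=(out.ndim - 1, 0))
    return out.reshape([c.shape[1] for c in cores])

x = np.random.default_rng(2).standard_normal((4, 5, 6))
cores = tt_svd(x)
print([c.shape for c in cores])
print(np.allclose(tt_full(cores), x))   # True: without truncation, TT-SVD is exact
```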
The main advantage of the TT format is its structural simplicity: higher-order tensors are reduced to a product of d tensors of at most order three. Similar approaches have been known in quantum physics for a long time; the rigorous mathematical analysis, however, is still work in progress (see [8] and references therein). For our purposes, we will rely on the TT format and the TT toolbox developed by Oseledets et al. [22] and implement simple power iteration schemes to solve the resulting eigenvalue problems, as we will show in Section 5. Other tensor formats, however, might be advantageous as well for the analysis of the Perron–Frobenius and Koopman operator; this should be investigated further in the future. One drawback of the TT format is that the decomposition depends on the ordering of the dimensions and thus results in different tensor ranks for different orderings.

4 Tensor-based reformulation of Ulam's method and EDMD

In this section, we will present a tensor-based reformulation of Ulam's method and EDMD and show that these methods are equivalent to the corresponding vector-based counterparts. The new formulation enables the use of the low-rank tensor approximation approaches described in the previous section. That is, variables can be approximated with different degrees of accuracy.
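As a point of reference for the reformulation, the vector-based EDMD baseline of Section 2 can be sketched in a few lines (the toy system S(x) = 0.9x and the monomial ansatz functions are our own choices):

```python
import numpy as np

# Toy EDMD sketch (system and basis are illustrative, not from the paper).
rng = np.random.default_rng(0)

def S(x):                                  # toy dynamical system
    return 0.9 * x

psi = [lambda x: np.ones_like(x),          # ansatz functions {1, x, x^2}
       lambda x: x,
       lambda x: x**2]

X = rng.uniform(-1, 1, 100)                # collocation points x_l
Y = S(X)                                   # y_l = S(x_l)

PsiX = np.array([p(X) for p in psi])       # k x m matrix with columns Psi(x_l)
PsiY = np.array([p(Y) for p in psi])
m = X.size
A_hat = PsiY @ PsiX.T / m                  # (4): A = (1/m) sum Psi(y_l) Psi(x_l)^T
G_hat = PsiX @ PsiX.T / m                  # (4): G = (1/m) sum Psi(x_l) Psi(x_l)^T
K_hat_T = A_hat @ np.linalg.pinv(G_hat)    # K^T = A G^+

# Since S is linear and the monomial basis is invariant under K, the Koopman
# eigenvalues 1, 0.9 and 0.81 = 0.9^2 are recovered exactly.
print(np.sort(np.linalg.eigvals(K_hat_T).real)[::-1])   # ≈ [1.0, 0.9, 0.81]
```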
Given a dynamical system S : X → X, X ⊂ R^d, define a box B that contains X, i.e.

    B = I_1 × ··· × I_d ⊃ X,

where each I_µ = [a_µ, b_µ] ⊂ R is an interval. Furthermore, let each I_µ be partitioned into k_µ subintervals I_µ^{i_µ}, i_µ = 1, ..., k_µ, such that I_µ^{i_µ} ∩ I_µ^{j_µ} = ∅ for i_µ ≠ j_µ. This results in a partitioning of B into k̂ = ∏_{µ=1}^d k_µ boxes. Using again standard multi-index notation, we will denote i = (i_1, ..., i_d). Equipped with the mapping

    i = (i_1, ..., i_d) ↦ î = 1 + ∑_{µ=1}^d ( ∏_{ν=1}^{µ−1} k_ν ) (i_µ − 1),    (6)

each multi-index i corresponds to a number î ∈ {1, ..., k̂}. This induces a canonical numbering of the boxes

    B_i = I_1^{i_1} × ··· × I_d^{i_d} = B_î

and of the entries of tensors x ∈ R^{k_1×···×k_d} such that x_i = x_î, where x = vec(x). Thus, with the aid of Ulam's method we could now generate the finite-dimensional representation of the Perron–Frobenius operator P̂ ∈ R^{k̂×k̂} as described in Section 2. Our goal, however, is to approximate the operator by a tensor P ∈ R^{k_1×···×k_d×k_1×···×k_d}. Note that the indicator function for the box B_i can be written as

    1_{B_i}(x) = ∏_{µ=1}^d 1_{I_µ^{i_µ}}(x_µ) = 1_{B_î}(x).    (7)

That is, each d-dimensional indicator function 1_{B_i}(x) is now written as a product of d one-dimensional indicator functions 1_{I_µ^{i_µ}}(x_µ).

Example 4.1.
Let us start with a simple example which illustrates the idea behind the tensor-based formulation. Consider the box discretization {B_1, ..., B_9} of B = [0, 3]² shown in Figure 2(a). Thus, using Ulam's method, we would obtain 9 indicator functions {1_{B_1}, ..., 1_{B_9}} and the matrix P̂ that approximates the Perron–Frobenius operator P would be a row-stochastic (9 × 9)-matrix; an example is shown in Figure 2(b):

    P̂ = [0.68  0.09  0.07  0.04  0.02  0     0.09  0.01  0
         0.36  0.06  0.40  0.03  0     0.02  0.05  0.03  0.05
         0.07  0.12  0.64  0.02  0     0.07  0.03  0     0.05
         0.31  0.07  0.02  0.09  0.01  0.02  0.39  0.05  0.04
         0.25  0.05  0.18  0.06  0.01  0.02  0.17  0.05  0.21
         0.06  0.06  0.37  0.01  0     0.04  0.07  0.03  0.36
         0.17  0.01  0.01  0.09  0.01  0.01  0.60  0.05  0.05
         0.05  0     0.06  0.08  0     0.05  0.29  0.13  0.34
         0.01  0     0.03  0.02  0.02  0.11  0.05  0.09  0.67].

Figure 2: (a) Box discretization of B = [0, 3]². (b) Example of a resulting approximation P̂ of the Perron–Frobenius operator P, obtained by applying Ulam's method.

Defining intervals I_µ^1 = [0, 1), I_µ^2 = [1, 2), I_µ^3 = [2, 3] as well as indicator functions

    1_{I_µ^{i_µ}}(x_µ) = { 1, x_µ ∈ I_µ^{i_µ},
                           0, otherwise,

for µ = 1, 2 and i_µ = 1, 2, 3, the ansatz functions for the box discretization can be written as

    1_{B_{1,1}}(x) = 1_{I_1^1}(x_1) 1_{I_2^1}(x_2) = 1_{B_1}(x),
    1_{B_{2,1}}(x) = 1_{I_1^2}(x_1) 1_{I_2^1}(x_2) = 1_{B_2}(x),
    ...
    1_{B_{2,3}}(x) = 1_{I_1^2}(x_1) 1_{I_2^3}(x_2) = 1_{B_8}(x),
    1_{B_{3,3}}(x) = 1_{I_1^3}(x_1) 1_{I_2^3}(x_2) = 1_{B_9}(x). △

We now define the tensor approximation P of the Perron–Frobenius operator P. Let Q_µ : R^d → R be the projection onto the µ-th component of a vector, i.e. Q_µ(x) = x_µ. Then we define

    P[i_1, ..., i_d, j_1, ..., j_d] = (1/n) ∑_{l=1}^n ∏_{µ=1}^d 1_{I_µ^{j_µ}}(Q_µ(S(x^(l)_{i_1,...,i_d}))),    (8)

where x^(l)_{i_1,...,i_d}, l = 1, ..., n, are the test points generated for box B_i. Instead of checking to which d-dimensional box the test points are mapped, the d dimensions are now treated separately.

Proposition 4.2.
It holds that P v = λ v ⇔ P̂ v = λ v, where v = vec(v).

Proof.
It suffices to show that P[i_1, ..., i_d, j_1, ..., j_d] = P̂_{îĵ} and that (P v)[i_1, ..., i_d] = (P̂ v)_î. The entries of P and P̂ are identical since with (2) and (7)

    P̂_{îĵ} = (1/n) ∑_{l=1}^n 1_{B_ĵ}(S(x^(l)_î)) = (1/n) ∑_{l=1}^n ∏_{µ=1}^d 1_{I_µ^{j_µ}}(Q_µ(S(x^(l)_{i_1,...,i_d}))) = P[i_1, ..., i_d, j_1, ..., j_d].

For d = 1, the multi-index i is mapped to î = i_1 and j to ĵ = j_1 by (6). Furthermore, k̂ = k_1 and

    (P v)[i_1] = ∑_{j_1=1}^{k_1} P[i_1, j_1] v[j_1] = ∑_{ĵ=1}^{k̂} P̂_{îĵ} v_ĵ = (P̂ v)_î.

Then, we obtain by induction

    (P v)[i_1, ..., i_{d+1}] = ∑_{j_1=1}^{k_1} ··· ∑_{j_{d+1}=1}^{k_{d+1}} P[i_1, ..., i_{d+1}, j_1, ..., j_{d+1}] v[j_1, ..., j_{d+1}]
                             = ∑_{j_{d+1}=1}^{k_{d+1}} ∑_{j_1=1}^{k_1} ··· ∑_{j_d=1}^{k_d} P^{(i_{d+1}, j_{d+1})}[i_1, ..., i_d, j_1, ..., j_d] v^{(j_{d+1})}[j_1, ..., j_d]
                             = [P̂^{(i_{d+1}, 1)}  ···  P̂^{(i_{d+1}, k_{d+1})}] [v^(1); ...; v^{(k_{d+1})}]
                             = (P̂ v)_î.

Here, the matrices and vectors with the superscripts (i_{d+1}, j_{d+1}) and (j_{d+1}), respectively, are obtained by fixing the corresponding indices; that is, these matrices and vectors are lower-dimensional slices of the corresponding higher-dimensional objects. Since the entries of v^{(j_{d+1})} are indexed by (j_1, ..., j_d) ↦ ĵ = 1 + ∑_{µ=1}^d (∏_{ν=1}^{µ−1} k_ν)(j_µ − 1), the entries of the vector v – obtained by stacking the vectors v^{(j_{d+1})} ∈ R^{k_1···k_d} – are indexed by

    (j_1, ..., j_{d+1}) ↦ ĵ = 1 + ∑_{µ=1}^d (∏_{ν=1}^{µ−1} k_ν)(j_µ − 1) + (k_1 ··· k_d)(j_{d+1} − 1) = 1 + ∑_{µ=1}^{d+1} (∏_{ν=1}^{µ−1} k_ν)(j_µ − 1).

Analogously, left eigentensors of P correspond to left eigenvectors of P̂. The implementation of the tensor-based formulation is straightforward since only index computations for intervals are required; a numbering of the d-dimensional boxes is not needed anymore. Let T be the set of all test points and ind : R^d → N^d the function that returns the corresponding multi-index i for a point x ∈ R^d so that x_µ ∈ I_µ^{i_µ}, µ = 1, ..., d. Then, Ulam's method can simply be expressed as:

    for each test point x ∈ T do
        y = S(x)
        i = ind(x)
        j = ind(y)
        P[i_1, ..., i_d, j_1, ..., j_d] ← P[i_1, ..., i_d, j_1, ..., j_d] + 1/n
    end for

In the standard formulation, the rows of the matrix P̂ sum up to one. Correspondingly, the sum of all entries of each subtensor of P with fixed multi-index i is one, i.e.

    ∑_{j_1=1}^{k_1} ··· ∑_{j_d=1}^{k_d} P[i_1, ..., i_d, j_1, ..., j_d] = 1.

Example 4.3.
Let us consider Example 4.1 again. We select n random test points x^(l)_{i_1,i_2} for each box B_{i_1,i_2}, i_1, i_2 = 1, ..., 3, l = 1, ..., n. This leads to a new approximation P ∈ R^{3×3×3×3}. Written in Matlab notation, we would obtain

    P[:, :, 1, 1] = [0.68 0.31 0.17; 0.36 0.25 0.05; 0.07 0.06 0.01],
    P[:, :, 1, 2] = [0.04 0.09 0.09; 0.03 0.06 0.08; 0.02 0.01 0.02],
    P[:, :, 1, 3] = [0.09 0.39 0.60; 0.05 0.17 0.29; 0.03 0.07 0.05],
    P[:, :, 2, 1] = [0.09 0.07 0.01; 0.06 0.05 0;    0.12 0.06 0],
    P[:, :, 2, 2] = [0.02 0.01 0.01; 0    0.01 0;    0    0    0.02],
    P[:, :, 2, 3] = [0.01 0.05 0.05; 0.03 0.05 0.13; 0    0.03 0.09],
    P[:, :, 3, 1] = [0.07 0.02 0.01; 0.40 0.18 0.06; 0.64 0.37 0.03],
    P[:, :, 3, 2] = [0    0.02 0.01; 0.02 0.02 0.05; 0.07 0.04 0.11],
    P[:, :, 3, 3] = [0    0.04 0.05; 0.05 0.21 0.34; 0.05 0.36 0.67].

Note that each matrix P[:, :, j_1, j_2] corresponds to a column of the matrix P̂ in Figure 2. For the resulting eigenvalue problem, we obtain – using a simple power iteration, see Section 5 – the left eigentensor v corresponding to the largest eigenvalue λ_1 = 1, which is a good approximation of the leading left eigenvector v of the matrix P̂. △

Instead of working with the full format, the matrix P can also be expressed directly using the canonical tensor format. Assume that a test point x is mapped from box B_i to box B_j, where i = (i_1, ..., i_d) and j = (j_1, ..., j_d) are again multi-indices. Now let e_µ^{i_µ} ∈ R^{k_µ} be the i_µ-th unit vector of size k_µ and let

    e_i = ⨂_{µ=1}^d e_µ^{i_µ}.

Then the elementary tensor e_i ⊗ e_j ∈ R^{k_1×···×k_d×k_1×···×k_d} describes the mapping of this point. Furthermore, let ind again be the function that returns the multi-index of the box that contains the point x and T the set of all test points; then the matrix P can be represented as a sum of elementary tensors of this form, i.e.

    P = (1/n) ∑_{x∈T} e_{ind(x)} ⊗ e_{ind(S(x))}.

The rank of this representation is at most k̂ n, i.e. the number of boxes multiplied by the number of test points per box, and thus potentially too large to store (unless sparse tensor formats are used). However, we will store only a low-rank approximation to reduce the required storage space. Note that this elementary tensor representation can also be easily converted into the TT format for numerical computations using the TT toolbox.

The question now is whether the tensor representation offers advantages over the standard formulation of Ulam's method. The goal is to approximate the eigenfunctions of the Perron–Frobenius operator or Koopman operator using low-rank tensors, reducing the computational cost as well as the memory consumption. Before we present numerical results, let us also rewrite EDMD in tensor form.
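For the discretization of Example 4.1, the index-based accumulation scheme above can be sketched in a few lines (the dynamics, point counts, and helper names are our own choices):

```python
import numpy as np

# Sketch of the tensor variant of Ulam's method: d = 2, domain B = [0,3]^2 with
# k intervals per direction, n random test points per box, and a toy map S.
rng = np.random.default_rng(3)
k, n = 3, 100
edges = np.linspace(0.0, 3.0, k + 1)

def S(x):                                  # toy dynamics mapping [0,3]^2 into itself
    return np.clip(0.5 * (x - 1.5) + 1.5 + 0.1 * np.sin(x[::-1]), 0, 3 - 1e-9)

def ind(x):                                # multi-index of the box containing x
    return tuple(np.clip(np.searchsorted(edges, xi, side='right') - 1, 0, k - 1)
                 for xi in x)

P = np.zeros((k, k, k, k))                 # P in R^{k1 x k2 x k1 x k2}
for i1 in range(k):
    for i2 in range(k):
        for _ in range(n):                 # n random test points per box
            x = np.array([rng.uniform(edges[i1], edges[i1 + 1]),
                          rng.uniform(edges[i2], edges[i2 + 1])])
            j = ind(S(x))
            P[i1, i2, j[0], j[1]] += 1.0 / n

# Each subtensor with fixed multi-index i sums to one (row-stochasticity), and a
# column-major reshape recovers the Ulam matrix P_hat of Section 2 via (6):
print(np.allclose(P.sum(axis=(2, 3)), 1.0))            # True
P_hat = P.reshape(k * k, k * k, order='F')
print(np.allclose(P_hat.sum(axis=1), 1.0))             # True
```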
Instead of writing Ψ as a vector of functions Ψ = [ψ_1, ψ_2, ..., ψ_k̂]^T, we now write Ψ as a tensor of functions. We start by selecting basis functions for each dimension separately. Let

    D_µ = {ψ_µ^1, ..., ψ_µ^{k_µ}}

be the set of basis functions for dimension µ, µ = 1, ..., d. Here, each ψ_µ^{i_µ} : R → R depends only on x_µ. Then our tensor basis for EDMD contains all functions of the form

    ψ_i(x) = ∏_{µ=1}^d ψ_µ^{i_µ}(x_µ),    (9)

where i = (i_1, ..., i_d) is a multi-index. Thus,

    D = { ∏_{µ=1}^d ψ_µ^{i_µ} : ψ_µ^{i_µ} ∈ D_µ }.

That is, we have again k̂ = ∏_{µ=1}^d k_µ basis functions and Ψ : R^d → R^{k_1×···×k_d}, with

    Ψ[i_1, ..., i_d](x) = ψ_i(x).

Example 4.4.
Let us begin with a simple example: Assume we have a two-dimensional domain $X \subset \mathbb{R}^2$ and we want to use monomials of order up to three, $\{1, x_\mu, x_\mu^2, x_\mu^3\}$, in both the $x_1$ and the $x_2$ direction to approximate the eigenfunctions of the Koopman operator. Written in tensor form, we obtain
$$\Psi(x) = \begin{bmatrix} 1 & x_2 & x_2^2 & x_2^3 \\ x_1 & x_1 x_2 & x_1 x_2^2 & x_1 x_2^3 \\ x_1^2 & x_1^2 x_2 & x_1^2 x_2^2 & x_1^2 x_2^3 \\ x_1^3 & x_1^3 x_2 & x_1^3 x_2^2 & x_1^3 x_2^3 \end{bmatrix}.$$
That is, $\Psi[i_1, i_2](x) = x_1^{i_1 - 1} x_2^{i_2 - 1}$. Analogously, for a $d$-dimensional domain, we would obtain $\Psi(x) \in \mathbb{R}^{k_1 \times \dots \times k_d}$ with $\Psi[i_1, \dots, i_d](x) = x_1^{i_1 - 1} \cdots x_d^{i_d - 1}$. $\triangle$

Such a tensor basis is often used for high-dimensional problems, see also [23]. Typical basis functions are monomials, Hermite polynomials, or trigonometric functions. In the standard formulation, all basis functions are enumerated and rewritten in vector form (3). The difference here is that the tensor form will be preserved. We will again use (6) as a mapping from multi-index to single index when required.

Now $A, G \in \mathbb{R}^{k_1 \times \dots \times k_d \times k_1 \times \dots \times k_d}$ can be constructed as follows:
$$A[i_1, \dots, i_d, j_1, \dots, j_d] = \langle \mathcal{K}\, \Psi[i_1, \dots, i_d], \Psi[j_1, \dots, j_d] \rangle,$$
$$G[i_1, \dots, i_d, j_1, \dots, j_d] = \langle \Psi[i_1, \dots, i_d], \Psi[j_1, \dots, j_d] \rangle.$$
The corresponding estimates obtained from the data are
$$A[i_1, \dots, i_d, j_1, \dots, j_d] = \frac{1}{m} \sum_{l=1}^m \Psi[i_1, \dots, i_d](y_l)\, \Psi[j_1, \dots, j_d](x_l),$$
$$G[i_1, \dots, i_d, j_1, \dots, j_d] = \frac{1}{m} \sum_{l=1}^m \Psi[i_1, \dots, i_d](x_l)\, \Psi[j_1, \dots, j_d](x_l),$$
or, in short form using the outer product,
$$A = \frac{1}{m} \sum_{l=1}^m \Psi(y_l) \otimes \Psi(x_l), \qquad G = \frac{1}{m} \sum_{l=1}^m \Psi(x_l) \otimes \Psi(x_l), \tag{10}$$
which in turn results in a generalized eigenvalue problem of the form
$$\xi A = \lambda\, \xi G.$$
Note that the eigenvalue problem is the same as (5) in the standard case. For the sake of simplicity, we are omitting the index $i$ here.

Proposition 4.5.
Provided that the basis functions can be written in the tensor product form (9),
$$\xi A = \lambda\, \xi G \quad \Longleftrightarrow \quad \hat\xi \hat A = \lambda\, \hat\xi \hat G.$$

Proof.
We just show that the entries of $A$ and $\hat A$ as well as the entries of $G$ and $\hat G$ are identical; the rest – the equivalence of the matrix–vector and tensor products – follows from Proposition 4.2. Assuming that the basis can be written in the product form, we obtain from (4)
$$\hat a_{\hat\imath \hat\jmath} = \frac{1}{m} \sum_{l=1}^m \psi_{\hat\imath}(y_l)\, \psi_{\hat\jmath}(x_l) = \frac{1}{m} \sum_{l=1}^m \prod_{\mu=1}^d \psi^\mu_{i_\mu}(Q_\mu(y_l)) \prod_{\mu=1}^d \psi^\mu_{j_\mu}(Q_\mu(x_l)) = \frac{1}{m} \sum_{l=1}^m \Psi[i_1, \dots, i_d](y_l)\, \Psi[j_1, \dots, j_d](x_l) = A[i_1, \dots, i_d, j_1, \dots, j_d]$$
and analogously $\hat g_{\hat\imath \hat\jmath} = G[i_1, \dots, i_d, j_1, \dots, j_d]$. Here, $Q_\mu$ is again the projection onto the $\mu$-th component of a vector, cf. (8).

Instead of storing the dense matrices $A$ and $G$, we can again represent these matrices directly in the canonical tensor format. The basis was chosen in such a way that $\Psi(x)$ can be written as
$$\Psi(x) = \bigotimes_{\mu=1}^d \tilde\psi_\mu(x_\mu),$$
where $\tilde\psi_\mu = [\psi^\mu_1, \dots, \psi^\mu_{k_\mu}]^T$, i.e. $\tilde\psi_\mu(x_\mu) \in \mathbb{R}^{k_\mu}$. With (10) it follows that $A$ and $G$ can be written as sums of $m$ elementary tensors. As before, we are not storing the full-rank tensor, but only low-rank approximations.

The eigentensors $\xi \in \mathbb{R}^{k_1 \times \dots \times k_d}$ of the generalized eigenvalue problem can then be used to approximate the eigenfunctions of the Perron–Frobenius or Koopman operator: Let $\xi$ be a left eigentensor; then $\varphi(x) = \langle \xi, \Psi(x) \rangle$ is an approximation of the corresponding eigenfunction of the Koopman operator. If $\xi$ is a right eigentensor of the generalized eigenvalue problem – observe that $G[i_1, \dots, i_d, j_1, \dots, j_d] = G[j_1, \dots, j_d, i_1, \dots, i_d]$ –, then $\varphi(x)$ is an approximation of the corresponding eigenfunction of the Perron–Frobenius operator, see also [12].

To compute the dominant eigenfunctions of the Koopman or Perron–Frobenius operator, we will use the simple power iteration schemes outlined in the next section. General-purpose eigenvalue solvers for nonsymmetric generalized eigenvalue problems are, to our knowledge, not yet part of the tensor libraries.
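The estimators (10) and the equivalence stated in Proposition 4.5 are easy to check numerically for a small example: flattening the tensors $A$ and $G$ with a fixed multi-index ordering must reproduce the standard EDMD matrices $\hat A$ and $\hat G$. A sketch under arbitrary illustrative choices (monomial product basis, toy data, names hypothetical):

```python
import numpy as np
from functools import reduce

def Psi(x, k):
    """Product basis (9) with monomials: Psi[i1,...,id](x) = prod_mu x_mu^(i_mu - 1)."""
    factors = [np.array([x[mu]**j for j in range(k[mu])]) for mu in range(len(k))]
    return reduce(np.multiply.outer, factors)

k = (3, 3)                                  # k_mu basis functions per dimension
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))       # sample points x_l
Y = 0.9 * X[:, ::-1]                        # images y_l of a toy linear map
m = len(X)

# Tensor estimators (10): sums of elementary tensors Psi(y_l) (x) Psi(x_l).
A = sum(np.multiply.outer(Psi(y, k), Psi(x, k)) for x, y in zip(X, Y)) / m
G = sum(np.multiply.outer(Psi(x, k), Psi(x, k)) for x in X) / m

# Standard EDMD with the enumerated basis psi_hat = vec(Psi): matrices A_hat, G_hat.
A_hat = sum(np.outer(Psi(y, k).ravel(), Psi(x, k).ravel()) for x, y in zip(X, Y)) / m
G_hat = sum(np.outer(Psi(x, k).ravel(), Psi(x, k).ravel()) for x in X) / m

# Flattening the tensor versions reproduces the standard matrices (Proposition 4.5),
# and G is symmetric under swapping the multi-indices i and j.
print(np.allclose(A.reshape(9, 9), A_hat),
      np.allclose(G.reshape(9, 9), G_hat),
      np.allclose(G, G.transpose(2, 3, 0, 1)))
```

The same `Psi` helper also evaluates an eigenfunction approximation $\varphi(x) = \langle \xi, \Psi(x) \rangle$ as `np.sum(xi * Psi(x, k))` for a given eigentensor `xi`.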
Solvers for symmetric (non-generalized) eigenvalue problems already exist and are part of the TT toolbox [22].

In Section 2 and Section 4, we have shown that in order to compute the eigenfunctions of the Perron–Frobenius and Koopman operator, respectively, using either Ulam's method or EDMD, we need to solve standard or generalized eigenvalue problems. For the reformulated version of these methods, we have to develop the numerical algorithms required to solve the resulting tensor-based eigenvalue problems. At the time of writing, we are not aware of any tensor toolbox containing numerical methods for nonsymmetric generalized eigenvalue problems. Methods for the computation of eigenvectors of symmetric positive definite matrices in the TT format have been proposed in [8], where the eigenvalue problem is rewritten as a (Rayleigh-quotient-based) minimization problem which is then solved using the Alternating Linear Scheme (ALS). In practice, these methods have recently also been used successfully for nonsymmetric problems, although convergence has not been shown yet [24].

Suitable methods for eigenvalue problems can be subdivided into two main categories, as explained in [10] (see also the references therein, e.g. [9]): the first category of methods is based on combining classical iterative algorithms with low-rank truncation after each step; the second is based on a reformulation as an optimization problem, where admissible solutions are constrained to the set of low-rank tensors. In this section, we will describe a generalization of simple power iteration methods – belonging to the first category – to tensor-based eigenvalue problems. For a detailed description of general power iteration methods, we refer to [25]. Power iteration and inverse power iteration for tensors have also been proposed in [9]. The main difference between the standard algorithms and their tensor-based counterparts is that for the latter truncation is used to keep the ranks of the tensors low. It is important that the iteration moves from the initial state to the final state without creating intermediate solutions with an excessive rank [9].
In what follows, let $\mathcal{T}$ denote the truncation of a tensor. Instead of a classical iteration scheme of the form $x_{k+1} = F(x_k)$, we then simply obtain $x_{k+1} = \mathcal{T}(F(x_k))$. For an eigenvalue problem of the form $A v = \lambda v$, given an initial guess $v^{(0)}$ for the dominant eigenvector, the power iteration algorithm computes:

for $k = 1, 2, \dots$ do
  $w^{(k)} = \mathcal{T}\big(A v^{(k-1)}\big)$
  $v^{(k)} = w^{(k)} / \big\| w^{(k)} \big\|$
  $\lambda^{(k)} = \big\langle v^{(k)}, \mathcal{T}\big(A v^{(k)}\big) \big\rangle$
end for

The iteration converges to an eigenvector associated with the largest eigenvalue $\lambda_1$ of the truncated operator $A$ if the eigenvalue is simple and the initial guess $v^{(0)}$ has a component in the direction of the corresponding dominant eigenvector $v_1$ [25]. Even if the initial guess does not have a component in the direction of $v_1$, rounding errors typically ensure that this direction will be picked up during the iteration. The rate of convergence depends on the ratio between the second-largest and largest eigenvalue, $\lambda_2 / \lambda_1$. The main advantage of this method is that it requires only matrix–vector multiplications and can thus easily be used for tensor eigenvalue problems.

A modification of this algorithm to compute eigenvectors corresponding to any eigenvalue is the inverse power iteration with shift, which – assuming that $A - \theta I$ is nonsingular – can be written in the form:

for $k = 1, 2, \dots$ do
  Solve $(A - \theta I)\, w^{(k)} = v^{(k-1)}$
  $v^{(k)} = w^{(k)} / \big\| w^{(k)} \big\|$
  $\lambda^{(k)} = \big\langle v^{(k)}, A v^{(k)} \big\rangle$
end for

The parameter $\theta$ is called the shift, and the iteration converges to the eigenvalue closest to $\theta$. This method is just the standard power iteration applied to the matrix $(A - \theta I)^{-1}$. Here, the linear solver computes a low-rank approximation of the solution, so that truncation is not required.

Given matrices $A$ and $B$, the generalized eigenvalue problem is given by $A v = \lambda B v$.
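The truncated power iteration above can be sketched in a few lines of dense linear algebra (our own illustration, not the TT-based implementation used later): for a two-dimensional problem, the iterate can be kept as a matrix and the truncation $\mathcal{T}$ realized as a best rank-$r$ approximation via the SVD. The operator and the ranks below are arbitrary choices:

```python
import numpy as np

def truncate(V, r):
    """T(V): best rank-r approximation of the matrix-shaped iterate via the SVD."""
    U, s, Wt = np.linalg.svd(V, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Wt[:r]

def truncated_power_iteration(A, shape, r, steps=500, seed=0):
    """Power iteration v <- T(A v) / ||T(A v)|| for A v = lambda v; the iterate
    is reshaped to `shape` and truncated to rank r after every multiplication."""
    rng = np.random.default_rng(seed)
    v = rng.random(A.shape[1])
    for _ in range(steps):
        w = truncate((A @ v).reshape(shape), r).ravel()
        v = w / np.linalg.norm(w)
    lam = v @ truncate((A @ v).reshape(shape), r).ravel()
    return lam, v

# Example: diagonal operator on R^(6*6) with eigenvalues 1, ..., 36; the dominant
# eigenvector, reshaped to 6 x 6, has rank 1, so rank-3 truncation is harmless.
A = np.diag(np.arange(1.0, 37.0))
lam, v = truncated_power_iteration(A, shape=(6, 6), r=3)
print(round(lam, 6))
```

In the TT setting, `truncate` would be replaced by TT rounding and the iterate stored as a tensor train instead of a dense matrix.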
In this case, the power iteration method also requires the solution of a linear system of equations. Thus, we can also directly apply the inverse power iteration, where the resulting systems of linear equations are again solved with ALS:

for $k = 1, 2, \dots$ do
  Solve $(A - \theta B)\, w^{(k)} = B v^{(k-1)}$
  $v^{(k)} = w^{(k)} / \big\| w^{(k)} \big\|$
  $\lambda^{(k)} = \big\langle v^{(k)}, A v^{(k)} \big\rangle / \big\langle v^{(k)}, B v^{(k)} \big\rangle$
end for

In order to keep the ranks of the intermediate solutions low, we also approximate the matrices $P$ (Ulam's method) or $A$ and $G$ (EDMD) by low-rank tensors. That is, the initial matrices are converted to $\tilde P$, $\tilde A$, and $\tilde G$ with a lower rank, since the rank of these matrices can initially be very high. We are typically only interested in the general behavior of the eigenfunctions; a highly accurate representation of the eigenfunctions is often not needed.

All examples presented within this section have been implemented in Matlab using – for the sake of efficiency – mex-functions to integrate the SDEs. All tensor computations were carried out with the TT toolbox [22]. For the eigenvector computations, we used our implementation of the simple power iteration methods described in Section 5.
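A plain dense version of the shifted inverse power iteration for the generalized problem is easy to state; replacing the linear solve below with an ALS-type low-rank solver yields the tensor variant. The matrices here are arbitrary illustrative choices, not taken from the examples in this paper:

```python
import numpy as np

def inverse_power_iteration(A, B, theta, steps=100, seed=0):
    """Shifted inverse power iteration for the generalized problem A v = lambda B v;
    converges to the eigenvalue closest to the shift theta. In the tensor setting,
    the linear solve below would be carried out by an ALS-type low-rank solver."""
    rng = np.random.default_rng(seed)
    v = rng.random(A.shape[0])
    M = A - theta * B
    for _ in range(steps):
        w = np.linalg.solve(M, B @ v)
        v = w / np.linalg.norm(w)
    lam = (v @ (A @ v)) / (v @ (B @ v))
    return lam, v

# Diagonal toy problem: generalized eigenvalues A_ii / B_ii = 1, 3, 4.
A = np.diag([2.0, 6.0, 12.0])
B = np.diag([2.0, 2.0, 3.0])
lam, _ = inverse_power_iteration(A, B, theta=2.5)
print(round(lam, 6))   # eigenvalue closest to the shift, i.e. 3.0
```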
Let us start with a simple two-dimensional example, a stochastic differential equation of the form
$$\mathrm{d}x_1 = -\nabla_{x_1} V(x_1, x_2)\,\mathrm{d}t + \sigma\,\mathrm{d}W_1, \qquad \mathrm{d}x_2 = -\nabla_{x_2} V(x_1, x_2)\,\mathrm{d}t + \sigma\,\mathrm{d}W_2,$$
where $W_1$ and $W_2$ are two independent standard Wiener processes. Here, the potential is given by $V(x_1, x_2) = (x_1^2 - 1)^2 + x_2^2$, see Figure 3 (cf. [12]). Furthermore, we set $\sigma = 0.7$. Note that in this case the potential can be written as $V(x_1, x_2) = V_1(x_1) + V_2(x_2)$. In order to analyze the tensor-based methods, we rotate the potential by an angle $\alpha$ and obtain
$$\widetilde V(x_1, x_2) = \big( (\cos(\alpha)\, x_1 - \sin(\alpha)\, x_2)^2 - 1 \big)^2 + (\sin(\alpha)\, x_1 + \cos(\alpha)\, x_2)^2.$$
The two independent Wiener processes $W_1$ and $W_2$ are rotated accordingly. We would expect that the eigenfunctions of systems with small $\alpha$ can be accurately approximated by low-rank tensors, whereas systems with a larger value of $\alpha$ require higher ranks since the dynamics are not aligned with the axes anymore.

The second eigenfunctions of the Perron–Frobenius operator for the systems with potential $\widetilde V$ and different values of $\alpha$, computed using the tensor-based version of Ulam's method, are shown in Figure 4. We chose four values of $\alpha$ ranging from $0$ to $\pi/4$, including $\alpha = \pi/6$. The domain $X$ was subdivided into $50 \times 50$ equally sized boxes, that is, $P \in \mathbb{R}^{50 \times 50 \times 50 \times 50}$. For each box, 100 randomly chosen test points were generated. The Euler–Maruyama method with a small step size $h$ was used for the numerical integration, where one evaluation of $S$ corresponds to $10{,}000$ integration steps. The shift $\theta$ of the power iteration method was set to a value slightly smaller than 1. Figure 5 illustrates how the tensor approximation, depending on the rank, successively picks up the information about the shape of the eigenfunction and generates more and more accurate representations. For $\alpha = \pi/4$, a tensor of rank 1 cannot represent the minimum and maximum simultaneously, since this would lead to two additional peaks in the lower left and upper right corner. Here, the first pair of singular vectors represents the maximum, the second pair of singular vectors the minimum.

Figure 3: Double-well potential $V(x_1, x_2) = (x_1^2 - 1)^2 + x_2^2$.

Figure 4: Second eigenfunction of the Perron–Frobenius operator for different values of $\alpha$.

Additionally, we computed the eigenfunctions with the standard version of Ulam's method to evaluate the accuracy of the approximation and compared it with the results obtained using the new tensor-based formulation. Figure 6 shows the influence of the truncation of the operator as well as the influence of the truncation of the resulting eigenfunctions. Here, in order to analyze the accuracy, we also compare the first eigenfunction with the analytically computed invariant density. Since we are computing eigenfunctions of the Perron–Frobenius operator associated with a stochastic differential equation, the results depend strongly on the number of test points chosen for each box: the higher the number of test points per box, the smoother the eigenfunction approximation. Thus, in this case, the smoother low-rank solutions can counterintuitively lead to better approximations of the true eigenfunctions. The high ranks are mainly required to resolve the numerical noise introduced by the coarse approximation of the operator. This can be seen, for example, in Figure 6b. Decreasing the rank initially reduces the error – the truncation of the operator results in smoother eigenfunctions – until the shape of the eigenfunction cannot be described by a low-rank approximation anymore and the error increases. For $\alpha = 0$, the $x_1$ and $x_2$ dynamics are independent and a low-rank approximation is sufficient.
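For completeness, a compact sketch of the data generation (not the authors' mex implementation): the Euler–Maruyama scheme applied to the rotated double-well gradient system, with the step size, step count, and seed chosen only for illustration:

```python
import numpy as np

def grad_V(x, alpha=0.0):
    """Gradient of the rotated double-well potential
    V~(x1, x2) = ((c*x1 - s*x2)^2 - 1)^2 + (s*x1 + c*x2)^2."""
    c, s = np.cos(alpha), np.sin(alpha)
    u, w = c * x[0] - s * x[1], s * x[0] + c * x[1]
    du = 4.0 * u * (u**2 - 1.0)           # dV/du
    dw = 2.0 * w                          # dV/dw
    # chain rule back to (x1, x2)
    return np.array([c * du + s * dw, -s * du + c * dw])

def euler_maruyama(x0, sigma=0.7, h=1e-3, steps=10_000, alpha=0.0, seed=0):
    """Integrate dx = -grad V(x) dt + sigma dW with the Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x - h * grad_V(x, alpha) + sigma * np.sqrt(h) * rng.standard_normal(2)
    return x

# One evaluation of S for a test point starting in the right well:
x_end = euler_maruyama([1.0, 0.0], alpha=np.pi / 6)
print(x_end.shape)
```

Applying `euler_maruyama` to every test point of every box yields the pairs needed to assemble the Ulam tensor $P$.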
Furthermore, the results illustrate that for a fixed-rank approximation, the error is smaller when the system's dynamics are aligned with the axes. This example shows that in order to be able to approximate eigenfunctions by low-rank tensors, the dynamics of the system should be aligned with the chosen axes, although even if the dynamics are not aligned, the tensor format might be advantageous. In general, the dynamics are unknown a priori and not necessarily aligned with the axes, but for higher-dimensional systems it is often possible to decompose a system into slow and fast subsystems. Not all variables of a system might be equally important to describe the system's behavior. The intuition would be that certain subsystems require less information and that the tensor approximation automatically captures the relevant dynamics, using high ranks only when necessary.

Figure 5: Tensor approximations of the second eigenfunction of the Perron–Frobenius operator for $\alpha = \pi/4$ and different ranks $r$.

Let us consider a more complex three-dimensional example with the potential function
$$V(x_1, x_2, x_3) = 3\, e^{-x_1^2 - (x_2 - \frac{1}{3})^2} - 3\, e^{-x_1^2 - (x_2 - \frac{5}{3})^2} - 5\, e^{-(x_1 - 1)^2 - x_2^2} - 5\, e^{-(x_1 + 1)^2 - x_2^2} + \frac{2}{10}\, x_1^4 + \frac{2}{10}\, \big(x_2 - \tfrac{1}{3}\big)^4 + x_3^2.$$
This is a potential taken from [20], augmented by a third dimension. We subdivided the domain $X$ into $20 \times 20 \times 20$ boxes of the same size. For each box, we randomly generated 1000 test points. The resulting finite-dimensional approximation is then a tensor $P \in \mathbb{R}^{20 \times 20 \times 20 \times 20 \times 20 \times 20}$. Figure 7 shows a scatter plot of the first three eigenfunctions, which were computed using the inverse power iteration described in Section 5. The first eigenfunction clearly shows the three expected regions with high probabilities corresponding to the minima of the potential $V$. The second eigenfunction separates the two deeper wells at $(-1, 0, 0)$ and $(1, 0, 0)$, and the third eigenfunction separates these deep wells from the third, shallower well. The difference between the eigenvector $\hat v$ computed with the standard method and the (vectorized) eigentensor $v$ is small; the remaining deviation is mainly due to the less accurate power iteration method applied to the tensor eigenvalue problem.

For the sake of comparison, we computed the eigenfunctions of the Koopman operator using EDMD. We chose low-order monomial basis functions $D = \{ x_1^{i_1} x_2^{i_2} x_3^{i_3} \}$. For higher-order monomials, the resulting matrices are ill-conditioned and the eigenfunctions cannot be computed accurately anymore. The results are shown in Figure 8. As before, the second eigenfunction separates the two deeper wells. The third eigenfunction separates these wells from the third, shallower well and is close to zero in the regions around the deep wells. The trivial eigenfunction of the Koopman operator corresponding to $\lambda = 1$ is not plotted here since it is constant and does not contain relevant information about the system. Note that the eigenvalues are slightly different due to the different set of basis functions used for EDMD.
Currently, several toolboxes for tensor-based problems are under development. Once these toolboxes contain methods for solving nonsymmetric generalized eigenvalue problems, the proposed approaches can be implemented easily, potentially facilitating the computation of metastable sets or almost invariant sets of dynamical systems that could not be handled before due to the curse of dimensionality. We demonstrated the tensor-based version of Ulam's method and EDMD using two- and three-dimensional problems mainly because the results can be easily validated and visualized.

Figure 6: (Top) Let $\hat v$ denote the first eigenvector of $\hat P$, $v$ the (vectorized) eigentensor of the truncated tensor representation $P$, and $\mu_{\mathrm{inv}}$ the analytically computed invariant density. The error here is defined by $e = \frac{1}{\hat k} \| v - \hat v \|$ and $e = \frac{1}{\hat k} \| v - \mu_{\mathrm{inv}} \|$, respectively, where $\hat k$ is the number of boxes. a) Difference between $v$ and $\hat v$ depending on the rank of $P$. b) Difference between $v$ and $\mu_{\mathrm{inv}}$. (Bottom) Influence of the truncation of the first eigentensor $v$ of the full tensor representation $P$ on the accuracy. c) Difference between the truncated eigentensor $v$ and $\hat v$. d) Difference between the truncated eigentensor $v$ and $\mu_{\mathrm{inv}}$. The dashed lines show the error for the full-rank approximation, which is almost identical for the different values of $\alpha$.

Future work also includes determining which tensor format is suited best for our purposes. Currently, one of the main bottlenecks is the simulation required to obtain the data. Even simple molecular dynamics simulations, for instance, might easily take several days, but in order to capture the behavior of the system, long trajectories or a large number of short simulations with different initial conditions are required. This huge amount of data must then be processed. Thus, the construction of the matrices $P$ or $A$ and $G$ is also time-consuming.
The number of boxes or basis functions required to represent each variable of the system accurately is in general unknown a priori. If we are, for instance, only interested in the leading eigenfunctions of the Koopman operator, the fast variables of the system are typically almost constant and require less information to be captured. Starting with a set of only a few basis functions for each unknown, which is then, if needed, augmented adaptively based on the system's behavior, would greatly improve the efficiency. Adaptive methods combined with (sparse) tensor approaches might be able to tackle high-dimensional systems and diminish the curse of dimensionality. Furthermore, a detailed numerical analysis of the efficiency and accuracy of the proposed algorithms would help understand the limitations and find opportunities for improvement.

Figure 7: Scatter plot of the first three eigenfunctions of the Perron–Frobenius operator for the triple-well system. Only entries whose absolute value is larger than a given threshold are plotted; entries close to zero are omitted.

Figure 8: Scatter plot of the second and third eigenfunction of the Koopman operator for the triple-well system. Note that, compared to the other plots, the second eigenfunction of the Koopman operator is rotated by 180 degrees for a better visualization.

Another open problem is the – depending on the number of dimensions $d$ – typically extremely large condition number of the matrices $A$ or $G$ if EDMD with, for instance, monomials is used to compute the eigenfunctions of the Perron–Frobenius or Koopman operator. Hence, the resulting eigenvalue problems cannot be solved accurately anymore for high-dimensional systems. A detailed understanding and numerical analysis of different basis functions might help mitigate this problem. Radial basis functions or other more locally defined functions could lead to better results.

Acknowledgements
This research has been partially funded by Deutsche Forschungsgemeinschaft (DFG) through grant CRC 1114.Moreover, we would like to thank the reviewers for their helpful comments and suggestions.
References

[1] M. Dellnitz and O. Junge. On the approximation of complicated dynamical behavior. SIAM Journal on Numerical Analysis, 36(2):491–515, 1999.
[2] M. Dellnitz, G. Froyland, and O. Junge. The algorithms behind GAIO – Set oriented numerical methods for dynamical systems. In Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems, pages 145–174. Springer, 2000.
[3] R. Preis, M. Dellnitz, M. Hessel, Ch. Schütte, and E. Meerbach. Dominant paths between almost invariant sets of dynamical systems. DFG Schwerpunktprogramm 1095, Technical Report 154, 2004.
[4] M. Budišić, R. Mohr, and I. Mezić. Applied Koopmanism. Chaos: An Interdisciplinary Journal of Nonlinear Science, 22(4), 2012.
[5] M. O. Williams, I. G. Kevrekidis, and C. W. Rowley. A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition. ArXiv e-prints, 2014.
[6] G. Froyland, G. Gottwald, and A. Hammerlindl. A computational method to extract macroscopic variables and their dynamics in multiscale systems. SIAM Journal on Applied Dynamical Systems, 13(4):1816–1846, 2014.
[7] G. Froyland, G. A. Gottwald, and A. Hammerlindl. A trajectory-free framework for analysing multiscale systems. ArXiv e-prints, 2014.
[8] S. Holtz, T. Rohwedder, and R. Schneider. The alternating linear scheme for tensor optimization in the tensor train format. SIAM Journal on Scientific Computing, 34(2):A683–A713, 2012.
[9] G. Beylkin and M. J. Mohlenkamp. Algorithms for numerical analysis in high dimensions. SIAM Journal on Scientific Computing, 26(6):2133–2159, 2005.
[10] L. Grasedyck, D. Kressner, and C. Tobler. A literature survey of low-rank tensor approximation techniques. ArXiv e-prints, 2013.
[11] J. Ding, Q. Du, and T.-Y. Li. High order approximation of the Frobenius–Perron operator. Applied Mathematics and Computation, 53:151–171, 1993.
[12] S. Klus, P. Koltai, and Ch. Schütte. On the numerical approximation of the Perron–Frobenius and Koopman operator. ArXiv e-prints, 2015.
[13] M. O. Williams, C. W. Rowley, and I. G. Kevrekidis. A kernel-based approach to data-driven Koopman spectral analysis. ArXiv e-prints, 2014.
[14] G. Chen and T. Ueta, editors. Chaos in Circuits and Systems. World Scientific Series on Nonlinear Science, Series B, Volume 11. World Scientific, 2002.
[15] E. M. Bollt and N. Santitissadeekorn. Applied and Computational Measurable Dynamics. Society for Industrial and Applied Mathematics, 2013.
[16] F. Nüske, B. G. Keller, G. Pérez-Hernández, A. S. J. S. Mey, and F. Noé. Variational approach to molecular kinetics. Journal of Chemical Theory and Computation, 10(4):1739–1752, 2014.
[17] F. Nüske, R. Schneider, F. Vitalini, and F. Noé. Variational tensor approach for approximating the rare-event kinetics of macromolecular systems. The Journal of Chemical Physics, 144(5), 2016.
[18] W. Hackbusch. Numerical tensor calculus. Acta Numerica, 23:651–742, 2014.
[19] G. Beylkin and M. J. Mohlenkamp. Numerical operator calculus in higher dimensions. Proceedings of the National Academy of Sciences, 99(16):10246–10251, 2002.
[20] Ch. Schütte and M. Sarich. Metastability and Markov State Models in Molecular Dynamics: Modeling, Analysis, Algorithmic Approaches. Number 24 in Courant Lecture Notes. American Mathematical Society, 2013.
[21] I. V. Oseledets. Tensor-train decomposition. SIAM Journal on Scientific Computing, 33(5):2295–2317, 2011.
[22] I. V. Oseledets. TT-Toolbox 2.0: Fast multidimensional array operations in TT format, 2011.
[23] G. Friesecke, O. Junge, and P. Koltai. Mean field approximation in conformation dynamics. Multiscale Modeling & Simulation, 8(1):254–268, 2009.
[24] P. Gelß, S. Matera, and Ch. Schütte. Solving the master equation without kinetic Monte Carlo: Tensor train approximations for a CO oxidation model. Journal of Computational Physics, 314:489–502, 2016.
[25] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press.