A Flexible Power Method for Solving Infinite Dimensional Tensor Eigenvalue Problems
ROEL VAN BEEUMEN†, LANA PERIŠA‡, DANIEL KRESSNER§, AND CHAO YANG†

Abstract.
We propose a flexible power method for computing the leftmost, i.e., algebraically smallest, eigenvalue of an infinite dimensional tensor eigenvalue problem, Hx = λx, where the infinite dimensional symmetric matrix H exhibits a translational invariant structure. We assume the smallest eigenvalue of H is simple and apply a power iteration of e^{−Ht} with the eigenvector represented in a compact way as a translational invariant infinite Tensor Ring (iTR). Hence, the infinite dimensional eigenvector can be represented by a finite number of iTR cores of finite rank. In order to implement this power iteration, we use a small parameter t so that the infinite matrix-vector operation e^{−Ht}x can efficiently be approximated by the Lie product formula, also known as Suzuki–Trotter splitting, and we employ a low rank approximation through a truncated singular value decomposition on the iTR cores in order to keep the cost of subsequent power iterations bounded. We also use an efficient way for computing the iTR Rayleigh quotient and introduce a finite size iTR residual which is used to monitor the convergence of the Rayleigh quotient and to modify the timestep t. In this paper, we discuss two different implementations of the flexible power algorithm and illustrate the automatic timestep adaptation approach for several numerical examples.

Key words. infinite eigenvalue problem, tensor eigenvalue problem, tensor ring, tensor train, translational invariance, matrix product states
AMS subject classifications.
1. Introduction.
We consider the problem of computing the leftmost, i.e., algebraically smallest, eigenvalue of an infinite dimensional tensor eigenvalue problem

(1.1)    Hx = λx,

where the infinite dimensional symmetric matrix H is defined as the infinite sum of Kronecker products of an infinite number of finite matrices, i.e.,

(1.2)    H = \sum_{k=-\infty}^{+\infty} H_k,    H_k = · · · ⊗ I ⊗ I ⊗ M_{k,k+1} ⊗ I ⊗ I ⊗ · · ·,

with I the d × d identity matrix and M_{k,k+1} ∈ R^{d²×d²}. We assume that the matrices M_{k,k+1} are identical for all k; hence H is said to be translational invariant. The double subscript k, k+1 for the matrix M is merely used here to indicate the overlapping positions of M in each Kronecker product. This type of eigenvalue problem originates from the study of simple quantum many-body systems such as a quantum spin chain with nearest neighbor interactions [3]. For some specific choices of M, the analytical solutions of these eigenvalue problems are known [2, 6, 1]. However, in general, the smallest eigenvalue and its corresponding eigenvector need to be computed numerically.

∗ Submitted to the editors February 2, 2021.
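To make the structure of (1.2) concrete, the following sketch (our own illustration, not the method proposed in this paper) assembles a finite, open-boundary truncation of H with ℓ sites and computes its smallest eigenvalue. The helper name `finite_H` is ours, and the choice of M as a spin-1/2 Heisenberg nearest-neighbor coupling is an assumption; it is one of the analytically solved cases mentioned above [2].

```python
import numpy as np
from scipy.sparse import identity, kron
from scipy.sparse.linalg import eigsh

def finite_H(M, d, ell):
    """Assemble H_ell = sum_k I ⊗ ... ⊗ M_{k,k+1} ⊗ ... ⊗ I with (ell-1)
    nearest-neighbor terms, where M is d^2 x d^2 and acts on sites k, k+1."""
    H = None
    for k in range(ell - 1):
        left = identity(d**k, format="csr")
        right = identity(d**(ell - k - 2), format="csr")
        term = kron(kron(left, M), right, format="csr")
        H = term if H is None else H + term
    return H

# Hypothetical choice of M: spin-1/2 Heisenberg coupling S.S on two sites
Sx = 0.5 * np.array([[0, 1], [1, 0]])
Sy = 0.5 * np.array([[0, -1j], [1j, 0]])
Sz = 0.5 * np.array([[1, 0], [0, -1]])
M = np.real(np.kron(Sx, Sx) + np.kron(Sy, Sy) + np.kron(Sz, Sz))

for ell in (4, 6, 8, 10):
    H = finite_H(M, d=2, ell=ell)
    lam = eigsh(H, k=1, which="SA")[0][0]
    print(ell, lam / (ell - 1))   # energy per bond for increasing chain length
```

Since the matrix dimension d^ℓ grows exponentially in ℓ, this finite-section approach quickly becomes infeasible, which is exactly the motivation for working with the infinite representation directly.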
Funding:
This work was partially funded by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program under Contract No. DE-AC02-05CH11231.
† Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States ([email protected], [email protected]).
‡ Visage Technologies, Diskettgatan 11A, SE-583 35 Linköping, Sweden ([email protected]).
§ Department of Mathematics, École Polytechnique Fédérale de Lausanne, Station 8, 1015 Lausanne, Switzerland (daniel.kressner@epfl.ch).
The eigenvectors of H are infinite dimensional vectors which obviously cannot be computed or stored directly in memory. One way to study such a problem computationally is to start with a finite H_ℓ ∈ R^{d^ℓ × d^ℓ} that only contains (ℓ−1) terms in the summation and Kronecker products of (ℓ−1) matrices, and examine how the smallest eigenvalue of H_ℓ changes as ℓ increases. Note that the Kronecker structure of H_ℓ allows for efficiently representing this matrix in, e.g., Tensor Train (TT) format [10]. The corresponding eigenvector can then be represented by the TT

x_ℓ(i_1, i_2, . . . , i_ℓ) = X_1(i_1) X_2(i_2) · · · X_ℓ(i_ℓ),

where X_k(i_k) is an r_k × r_{k+1} matrix, with r_1 = r_{ℓ+1} = 1, and the indices i_k = 1, . . . , d, for k = 1, . . . , ℓ. Because there is a limit on how large ℓ can be chosen, we may never know the true solution in the limit ℓ → ∞. Such a limit is known in the physics literature as the thermodynamic limit and is important for describing macroscopic properties of quantum materials when H corresponds to a quantum many-body Hamiltonian [14].

In this paper, we directly compute the smallest eigenvalue, and corresponding eigenvector, of the infinite dimensional eigenvalue problem (1.1)–(1.2). In order to do that, we represent the eigenvector in a compact form and make use of the translational invariance property of H. This invariance property was studied by Bethe [2] and Hulthén [6], known as the Bethe–Hulthén hypothesis, and suggests that the elements of the eigenvector are invariant with respect to a cyclic permutation of the tensor indices, i.e.,

x(. . . , i_{−1}, i_0, i_1, . . .) = x(. . . , i_0, i_1, i_2, . . .),

where all indices i_k = 1, . . . , d, and the sequence of indices . . . , i_{−1}, i_0, i_1, . . . specifies a particular element of the infinite dimensional vector x.

We assume that the smallest eigenvalue of H is simple and propose to apply a flexible power iteration to e^{−Ht} for some small and variable parameter t to compute the desired eigenpair. We represent the eigenvector to be computed as a translational invariant infinite Tensor Ring (iTR), defined as follows

x(. . . , i_{−1}, i_0, i_1, . . .) = Tr[ \prod_{k=−∞}^{+∞} X(i_k) ],

where all indices i_k = 1, . . . , d, and each X(i_k) is a matrix of size r × r.
Note that due to the translational invariance, we only need to store, and work with, d matrices of size r × r, which is manageable as long as the rank r is not too large. An iTR can be seen as the infinite limit of a finite size tensor ring [17] and is also known as a uniform Matrix Product State (uMPS) [16].

In order to implement the power method, we must be able to multiply a matrix exponential with an iTR and keep the product in iTR form. Computing e^{−Ht}x, with H being an infinite dimensional matrix and x an iTR, is in general not possible. However, the special structure of (1.2) allows us to split H in even and odd terms, H_e and H_o, respectively. When t is sufficiently small, we can use the Lie product formula, also known as Suzuki–Trotter splitting, to simplify e^{−Ht}x into only local tensor contractions with a d² × d² matrix exponential e^{−Mt}. Such contractions can be implemented in a way that the translational invariance of the contracted tensor is preserved, although the rank generally increases. To keep the cost of subsequent power iterations bounded, a low rank approximation through the use of a truncated singular value decomposition is applied.

We use the Rayleigh quotient of the iTR x, obtained from the power method, to approximate the smallest eigenvalue of H. Because (1.2) contains an infinite number of terms in the summation, the classical definition of the Rayleigh quotient involves a diverging infinite sum. Therefore, we define the iTR Rayleigh quotient in an average sense, and introduce the canonical form of an iTR that can simplify such a definition. In the numerical experiments, we will show that the iTR Rayleigh quotient used within the flexible power method converges from above to the exact solution. We also give an alternative way to obtain an eigenvalue approximation that can be more accurate than the Rayleigh quotient.
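The first-order accuracy of the Lie product formula is easy to observe on a small finite analogue. In the sketch below, two random symmetric matrices are our own toy stand-ins for the even and odd parts H_e and H_o; halving t should reduce the splitting error by roughly a factor of four, consistent with the O(t²) error of the splitting.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
# Random symmetric stand-ins for the even and odd parts H_e and H_o
A = rng.standard_normal((8, 8)); A = (A + A.T) / 2
B = rng.standard_normal((8, 8)); B = (B + B.T) / 2

for t in (0.1, 0.05, 0.025):
    # Lie product formula: e^{-(A+B)t} = e^{-At} e^{-Bt} + O(t^2)
    err = np.linalg.norm(expm(-(A + B) * t) - expm(-A * t) @ expm(-B * t))
    print(t, err)
```

In the actual algorithm the exponentials never act on the full infinite matrix; only the local d² × d² exponential e^{−Mt} is ever formed.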
Such an approximation relies on viewing an iTR as the product of an infinite-sized frame matrix and a vectorized tensor of finite size.

The flexible power iteration was first introduced by Vidal in [15] as the iTEBD algorithm. We call such a scheme a flexible power iteration because it is similar to applying a power method to the matrix exponential e^{−Ht}, except that the matrix exponential is approximated, in each power iteration, through Trotter splitting, and rank truncation is applied to the iTR representation of e^{−Ht}x. This is where the term "flexible" comes from. In [15] and other papers, the accuracy of the eigenvalue approximation is assessed by either comparing the approximation with the exact solution or examining the difference between the approximate eigenvalues obtained from two consecutive iterations. The former is not practical when the exact solution of the problem is unknown. The latter can be misleading when the flexible iteration starts to stagnate, which does happen as shown in our numerical experiments.

Based on the canonical form of the iTR, we also introduce a finite size iTR residual that can be used to monitor the convergence of the Rayleigh quotient. We give a practical procedure for adjusting the parameter t in the flexible power iteration to ensure rapid but stable convergence and provide a stopping criterion for terminating the power iteration. As illustrated in the numerical experiments, such a procedure is very effective.

The paper is organized as follows. In section 2, we introduce diagrammatic notations for scalars, vectors, matrices, and tensors, as well as for tensor operations, that make it easier to explain operations performed on tensor rings.
In section 3, we begin with a formal definition of a single core translational invariant infinite Tensor Ring, describe its properties, and show how an iTR can be converted into a so-called canonical form, which is convenient for manipulating iTRs in various calculations. We also extend these definitions and properties to 2-core translational invariant iTRs, which are used to represent the approximate eigenvector of H (1.2) containing nearest neighbor interactions. In section 4, we properly define the iTR Rayleigh quotient and introduce the finite iTR residual. In section 5, we present the flexible power iteration for computing the smallest eigenpair of H. We also show how e^{−Ht}x is computed to preserve the iTR structure and discuss a number of practical computational considerations regarding the convergence and error assessment of the method. Three numerical examples are given in section 6 to demonstrate the effectiveness of the flexible power method and the adaptive strategy proposed in section 5. Finally, the main conclusions are summarized in section 7.
2. Notation.
Throughout the paper, we denote scalars by lower case Greek characters, vectors by lower case Roman characters, and matrices and higher order tensors by upper case Roman characters, e.g., α, x, and M, respectively. I_d is the d × d identity matrix and diagonal matrices are denoted by upper case Greek characters, e.g., Σ. The trace of a matrix M is given by Tr(M) and its vectorization by vec(M). For infinite dimensional vectors and matrices we use bold face Roman characters, e.g., v and M, respectively.

We will make use of the powerful tensor diagram notation [11] in order to graphically represent tensor contractions. In this notation, a tensor is drawn as a node with one leg per index: a scalar α has no legs, a vector v has one leg j, a matrix M has two legs i and j, and a 3rd-order tensor T has three legs i, j, k. This graphical notation has the advantage that tensor contractions can be represented by connecting the lines that correspond to the indices to sum over. For example, in the diagrams for the matrix-vector product Mv and the matrix-matrix product MN, the connections between M and v, and between M and N, correspond to the sum over the index j. Because contracting a tensor over one of its indices with the identity matrix has no effect, we represent the identity matrix I by just a line. Another important tensor contraction, that we will often use, is the contraction of two 3rd-order tensors X and Y into a so-called supercore Z,

Z_{ijlm} = \sum_k X_{ijk} Y_{klm},

where, in the diagrams, a thick leg indicates combining the indices j and l into a single index. Some other commonly used matrix operations, such as the trace Tr(M), the transpose v^⊤, and the vectorizations vec(M) and vec(M)^⊤, also have corresponding diagrams.
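The supercore contraction above can be sketched in a few lines of NumPy (a minimal illustration with our own index convention: each core is stored as a left-bond × physical × right-bond array):

```python
import numpy as np

# Contract two 3rd-order cores X (r x d x r) and Y (r x d x r) into a
# supercore Z_{ijlm} = sum_k X_{ijk} Y_{klm}, then merge the two physical
# legs j and l into the single "thick leg" index.
r, d = 3, 2
rng = np.random.default_rng(1)
X = rng.standard_normal((r, d, r))
Y = rng.standard_normal((r, d, r))

Z4 = np.einsum('ijk,klm->ijlm', X, Y)   # 4th-order supercore
Z = Z4.reshape(r, d * d, r)             # combined physical index of size d^2
print(Z.shape)
```

The reshape relies on NumPy's row-major ordering, so the combined index enumerates (j, l) pairs with l varying fastest.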
In order to save space, we sometimes rotate the diagrams counterclockwise by 90 degrees, e.g., for the vectorization vec(T) of a 3rd-order tensor. Throughout the paper, reshaping tensors into matrices and vice versa will play a key role, for example, reshaping the rank-1 matrices

(2.1)    vec(I) vec(Σ²)^⊤    and    vec(Σ²) vec(I)^⊤

into 4th-order tensors, where Σ is a diagonal matrix. We again obtain a matrix by combining respectively the left and right pointing legs into one index.
3. Translational invariant infinite tensor ring.
In this section, we define the translational invariant infinite tensor ring and discuss its properties that will be used in the next sections to construct an approximation to the eigenvector associated with the smallest eigenvalue of the infinite dimensional matrix H defined in (1.2). We start by defining translational invariance for a (finite) tensor ring.

The tensor ring decomposition [17, 8] is a way to represent high-order tensors, or vectors that can be reshaped into high-order tensors, by a sequence of 3rd-order tensors that are circularly multiplied. We define a finite size tensor ring as follows.
Definition 3.1. Let x_ℓ be an ℓth-order tensor of size d_1 × d_2 × · · · × d_ℓ; then its tensor ring representation (3.1) is defined element-wise as

x_ℓ(i_1, i_2, . . . , i_ℓ) := Tr[X_1(i_1) X_2(i_2) · · · X_ℓ(i_ℓ)],

where the indices i_k, k = 1, 2, . . . , ℓ, run from 1 to d_k, and each X_k(i_k) is a matrix of size r_k × r_{k+1}, with r_1 = r_{ℓ+1}.

We call the 3rd-order tensors X_k the cores of x_ℓ, X_k(i_k) the slices of the kth core, and r_k the corresponding tensor ring ranks. In the physics literature, a tensor ring is called a matrix product state with periodic boundary conditions [12]. When r_1 = r_{ℓ+1} = 1, a tensor ring becomes a tensor train [10], also known in the physics literature as a matrix product state with open boundary conditions [12].

When all dimensions d_k = d, all ranks r_k = r, and all slices X_k(i_k) = X(i_k), for k = 1, 2, . . . , ℓ, the tensor ring x_ℓ is said to be translational invariant. It follows from the trace operation property that

x_ℓ(i_1, i_2, . . . , i_ℓ) = x_ℓ(i_2, . . . , i_ℓ, i_1),

i.e., x_ℓ is invariant under cyclic permutations of its indices. This is a desirable property because the eigenvectors of H defined in (1.2) are known to have such a property for certain M_{k,k+1}. We define a translational invariant (finite) tensor ring as follows.

Definition 3.2. Let x_ℓ be an ℓth-order tensor of size d^ℓ; then its translational invariant tensor ring representation (3.2) is defined element-wise as

x_ℓ(i_1, i_2, . . . , i_ℓ) := Tr[X(i_1) X(i_2) · · · X(i_ℓ)],

where the indices i_k, k = 1, 2, . . . , ℓ, run from 1 to d, and each X(i_k) is a matrix of size r × r.

Note that, due to the translational invariance, the tensor ring (3.2) is fully determined by only one core X of size r × d × r. Generalizing Definition 3.1 to tensor rings with an infinite number of cores is impossible because this would require an infinite amount of data to represent x. On the other hand, using the translational invariance property allows us to generalize Definition 3.2 to infinite tensor rings (iTR).

Definition 3.3.
A translational invariant iTR is defined element-wise as

(3.3)    x(. . . , i_{−1}, i_0, i_1, . . .) := Tr[ \prod_{k=−∞}^{+∞} X(i_k) ],

where all indices i_k run from 1 to d and each X(i_k) is a matrix of size r × r, with r referred to as the rank of x.

Note that even though x contains an infinite number of cores, each core is completely defined by the d slices of X, so that x can be represented by a finite amount of data although it is infinite dimensional. A translational invariant iTR is also known as a uniform matrix product state [16]. In the remainder of the paper, we will only work with translational invariant iTRs and just call them iTRs for brevity.

The core tensor X of each iTR defines a special matrix called the transfer matrix, which plays a key role in many operations performed on the corresponding iTR, e.g., in the Rayleigh quotient and the residual calculations.

Definition 3.4. Let x be an iTR as defined in Definition 3.3. Then we define the transfer matrix T_X associated with x as the r² × r² matrix

(3.4)    T_X := \sum_{i=1}^{d} X(i) ⊗ X(i),

where X(i) is the ith slice of X.

The transfer matrix (3.4) and in particular its dominant left and right eigenvectors, i.e., the eigenvectors corresponding to the eigenvalue with largest magnitude, will play an indispensable role in operations involving iTRs.
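The transfer matrix and its dominant eigenvectors are straightforward to form numerically. The following NumPy sketch (our own illustration, with a random core standing in for an actual iTR core) builds T_X, extracts the dominant left and right eigenvectors, and checks the matricized eigenvector equations (3.5) as well as the rank-one limit of Lemma 3.6 below:

```python
import numpy as np

# Hypothetical random iTR core: d slices X[i], each an r x r matrix
d, r = 2, 4
rng = np.random.default_rng(0)
X = rng.standard_normal((d, r, r))

# Transfer matrix T_X = sum_i X(i) (x) X(i), of size r^2 x r^2
T = sum(np.kron(X[i], X[i]) for i in range(d))

# Dominant right eigenvector v_R and left eigenvector v_L of T_X
w, V = np.linalg.eig(T)
i0 = int(np.argmax(np.abs(w)))
eta, vR = w[i0].real, V[:, i0].real
wL, U = np.linalg.eig(T.T)
vL = U[:, int(np.argmax(np.abs(wL)))].real
vL = vL / (vL @ vR)                     # normalization v_L^T v_R = 1

# Matricized eigenvector equations, cf. (3.5)
VL, VR = vL.reshape(r, r), vR.reshape(r, r)
print(np.allclose(sum(X[i].T @ VL @ X[i] for i in range(d)), eta * VL))
print(np.allclose(sum(X[i] @ VR @ X[i].T for i in range(d)), eta * VR))

# Limit of Lemma 3.6: (T_X / eta)^k -> v_R v_L^T
P = np.linalg.matrix_power(T / eta, 1000)
print(np.linalg.norm(P - np.outer(vR, vL)))   # small
```

For a random real core the dominant eigenvalue is generically simple, real, and positive, in line with the conjecture stated next.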
Conjecture 3.5. Let x be a real iTR as defined in Definition 3.3 and assume that the dominant eigenvalue of the corresponding transfer matrix T_X is simple. Then this eigenvalue, denoted by η, is real and positive, and the corresponding left and right eigenvectors v_L and v_R satisfy the conditions

(3.5)    \sum_{i=1}^{d} X(i)^⊤ V_L X(i) = η V_L,    \sum_{i=1}^{d} X(i) V_R X(i)^⊤ = η V_R,

where v_L =: vec(V_L) and v_R =: vec(V_R), with V_L and V_R positive definite matrices.

When the dominant eigenvalue of the transfer matrix is simple, v_L^⊤ v_R ≠ 0 holds. We will assume that the eigenvectors v_L and v_R satisfy the normalization condition v_L^⊤ v_R = 1, which is equivalent to Tr(V_L^⊤ V_R) = 1. A graphical depiction of (3.5) and the normalization condition for v_L and v_R is shown in Figure 1.

Fig. 1. A graphical representation of (3.5) and the normalization condition v_L^⊤ v_R = 1.

Lemma 3.6.
If the transfer matrix T_X is an injective linear map and η is its dominant eigenvalue with v_L and v_R being the corresponding left and right eigenvectors, respectively, normalized to satisfy v_L^⊤ v_R = 1, then

(3.6)    \lim_{k→∞} (T_X / η)^k = v_R v_L^⊤.

Proof. Let the left and right eigenvalue decompositions of T_X be given by

W_L^∗ T_X = Λ W_L^∗,    T_X W_R = W_R Λ,    W_L^∗ W_R = I,

where the diagonal matrix Λ contains the eigenvalues on its diagonal, and W_L and W_R are the corresponding left and right eigenvectors of T_X, respectively, normalized so that W_L^∗ W_R = I, i.e.,

Λ := diag(η, •),    W_L := [v_L, •],    W_R := [v_R, •].

Then, using that η is simple and W_R^{−1} = W_L^∗, yields

\lim_{k→∞} (T_X / η)^k = \lim_{k→∞} η^{−k} W_R Λ^k W_R^{−1} = \lim_{k→∞} η^{−k} W_R Λ^k W_L^∗ = v_R v_L^⊤,

which completes the proof.

Corollary 3.7.
Let x be an iTR as defined in Definition 3.3. Then x is normalized, i.e., x^⊤ x = 1, if its corresponding transfer matrix T_X has a simple dominant eigenvalue η = 1.

Proof. The proof directly follows from Lemma 3.6, i.e.,

x^⊤ x = Tr[ \lim_{k→∞} (T_X)^k ] = Tr[v_R v_L^⊤] = v_L^⊤ v_R = 1,

where we also used the trace property Tr(AB) = Tr(BA).

Because we can insert the product of any nonsingular matrix S and its inverse between two consecutive cores of an iTR and redefine each slice as S^{−1} X(i) S, i = 1, . . . , d, the representation of an iTR is not unique. It is useful to define a canonical form that makes it easier to describe computations performed on iTRs.

Definition 3.8.
Let x be an iTR as defined in Definition 3.3. Then its canonical form is defined element-wise as

(3.7)    x(. . . , i_{−1}, i_0, i_1, . . .) := Tr[ \prod_{k=−∞}^{+∞} Q(i_k) Σ ],

where Q(i) ∈ R^{r×r}, for i = 1, . . . , d, and Σ ∈ R^{r×r} is a diagonal matrix with decreasing non-negative real numbers on its diagonal and ‖Σ‖_F = 1, such that the following left and right orthogonality conditions hold

(3.8)    \sum_{i=1}^{d} Q_L(i)^⊤ Q_L(i) = ηI,    \sum_{i=1}^{d} Q_R(i) Q_R(i)^⊤ = ηI,

with Q_L(i) := Σ Q(i), Q_R(i) := Q(i)Σ, and η ∈ R being the dominant eigenvalue of the transfer matrix T_X.
We call the 3rd-order tensor Q the orthogonal core of x and Σ the corresponding singular value matrix, since it can be obtained from a singular value decomposition. The 3rd-order tensors Q_L and Q_R, defined in Definition 3.8, are called the left and right orthogonal cores, respectively.

Before explaining how to compute the canonical form, we first list some properties. First, note that by definition, Q and Σ satisfy the relations

(3.9)    Σ Q(i) Σ = Q_L(i) Σ = Σ Q_R(i),

for i = 1, . . . , d. The left and right canonical transfer matrices are defined as follows.

Definition 3.9.
Let x be an iTR in canonical form as defined in Definition 3.8. Then we define its left and right canonical transfer matrices T_{Q_L} and T_{Q_R} as the following r² × r² matrices

(3.10)    T_{Q_L} := \sum_{i=1}^{d} Q_L(i) ⊗ Q_L(i),

(3.11)    T_{Q_R} := \sum_{i=1}^{d} Q_R(i) ⊗ Q_R(i),

where Q_L and Q_R are defined in Definition 3.8.

The canonical transfer matrices, associated with an iTR in canonical form, and their dominant left and right eigenvectors, which play a crucial role in the evaluation of the iTR Rayleigh quotient and the iTR residual, have the following fixed form.
Lemma 3.10.
Let x be an iTR in canonical form as defined in Definition 3.8 with its corresponding canonical transfer matrices defined as in Definition 3.9. Then,

(3.12)    vec(I)^⊤ T_{Q_L} = η vec(I)^⊤,    T_{Q_R} vec(I) = η vec(I),

(3.13)    T_{Q_L} vec(Σ²) = η vec(Σ²),    vec(Σ²)^⊤ T_{Q_R} = η vec(Σ²)^⊤,

with vec(I) being the dominant left eigenvector of T_{Q_L} and the dominant right eigenvector of T_{Q_R}, and vec(Σ²) being the dominant right eigenvector of T_{Q_L} and the dominant left eigenvector of T_{Q_R}.

Proof. The proof for the dominant right eigenvector of T_{Q_R} directly follows from vectorizing the right equality of (3.8),

vec( \sum_{i=1}^{d} Q_R(i) Q_R(i)^⊤ ) = ( \sum_{i=1}^{d} Q_R(i) ⊗ Q_R(i) ) vec(I) = T_{Q_R} vec(I) = η vec(I),

where we made use of the "vec trick" identity, i.e.,

(3.14)    vec(ABC) = (C^⊤ ⊗ A) vec(B),

and Definition 3.9, respectively. For the proof of the dominant left eigenvector of T_{Q_L}, we start from vectorizing the left equality of (3.8) and apply (3.14) to obtain

(3.15)    vec( \sum_{i=1}^{d} Q_L(i)^⊤ Q_L(i) ) = ( \sum_{i=1}^{d} Q_L(i)^⊤ ⊗ Q_L(i)^⊤ ) vec(I) = η vec(I).

To show that vec(Σ²) is the dominant right eigenvector of T_{Q_L}, we start again from the vectorization of the right equality of (3.8),

(3.16)    vec( \sum_{i=1}^{d} Q(i) Σ² Q(i)^⊤ ) = ( \sum_{i=1}^{d} Q(i) ⊗ Q(i) ) vec(Σ²) = η vec(I),

where we substituted Q_R(i) = Q(i)Σ and applied (3.14). Next, we multiply both sides of (3.16) from the left by Σ ⊗ Σ, yielding

(Σ ⊗ Σ) ( \sum_{i=1}^{d} Q(i) ⊗ Q(i) ) vec(Σ²) = T_{Q_L} vec(Σ²) = η (Σ ⊗ Σ) vec(I) = η vec(Σ²).

Finally, in order to prove that vec(Σ²) is also the dominant left eigenvector of T_{Q_R}, we start from the vectorization of the left equality of (3.8),

(3.17)    vec( \sum_{i=1}^{d} Q(i)^⊤ Σ² Q(i) ) = ( \sum_{i=1}^{d} Q(i)^⊤ ⊗ Q(i)^⊤ ) vec(Σ²) = η vec(I),

where we substituted Q_L(i) = Σ Q(i) and made use of the identity (3.14).
Next, we multiply both sides of (3.17) from the left by Σ ⊗ Σ, followed by a transposition, yielding

vec(Σ²)^⊤ ( \sum_{i=1}^{d} Q(i) ⊗ Q(i) ) (Σ ⊗ Σ) = vec(Σ²)^⊤ T_{Q_R} = η [(Σ ⊗ Σ) vec(I)]^⊤ = η vec(Σ²)^⊤,

which proves (3.13) and completes the proof.

The eigenvalue decompositions of Lemma 3.10 are graphically shown in Figure 2. Note also that the left and right orthogonality conditions (3.8), defined by the canonical form, can be expressed in terms of the canonical transfer matrices and are equivalent to (3.12).

Fig. 2. Eigenvalue decompositions of the canonical transfer matrices: (a) vec(I)^⊤ T_{Q_L} = η vec(I)^⊤; (b) T_{Q_R} vec(I) = η vec(I); (c) T_{Q_L} vec(Σ²) = η vec(Σ²); (d) vec(Σ²)^⊤ T_{Q_R} = η vec(Σ²)^⊤.
Using the orthogonality conditions (3.8), we notice that computing the canonical form of an iTR corresponds to finding matrices Q(i) and a diagonal Σ so that

(3.18)    \sum_{i=1}^{d} Q(i)^⊤ Σ² Q(i) = ηI = \sum_{i=1}^{d} Q(i) Σ² Q(i)^⊤.

Algorithm 1 shows how the canonical decomposition of an iTR can be computed. Using the properties L^{−1} = R Σ̃, R^{−1} = Σ̃ L, V_R = R Σ̃² R^⊤, and V_L = L^⊤ Σ̃² L, in combination with the identity (3.14), one can show that Q and Σ obtained in step 6 of Algorithm 1 satisfy (3.18). The complexity of Algorithm 1 is dominated by step 1, which requires the computation of the dominant left and right eigenvectors of the r² × r² transfer matrix T_X. Note that all remaining steps 2–7 only require operations on r × r matrices.

Algorithm 1:
Canonical decomposition of iTR
Input: X
Output: Q and diagonal matrix Σ
1. Dominant left and right eigenvectors of transfer matrix T_X: vec(V_L)^⊤ T_X = η vec(V_L)^⊤, T_X vec(V_R) = η vec(V_R).
2. Eigendecompositions of V_L and V_R: V_L = U_L Λ_L U_L^⊤, V_R = U_R Λ_R U_R^⊤.
3. Form: Ũ_L = U_L Λ_L^{1/2}, Ũ_R = U_R Λ_R^{1/2}.
4. Singular value decomposition: V Σ̃ W^⊤ = Ũ_L^⊤ Ũ_R.
5. Form: L = W^⊤ Ũ_R^{−1}, R = Ũ_L^{−⊤} V.
6. Canonical form: Q(i) = ‖Σ̃‖_F L X(i) R, Σ = Σ̃ / ‖Σ̃‖_F.
7. (Normalization): Q(i) = Q(i) / √η.

Starting from an iTR in canonical form, as defined in Definition 3.8, we can rewrite it as a product of an infinite dimensional matrix with orthogonal columns, called the frame matrix [4, 7], and a finite vector of length equal to the number of elements in one core.
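Algorithm 1 can be sketched directly in NumPy. The implementation below is our own reading of the algorithm (array names and index conventions are ours); it assumes, as in Conjecture 3.5, that the matricized dominant eigenvectors V_L and V_R are positive definite, which holds generically for a random core, and fixes the arbitrary eigenvector signs accordingly.

```python
import numpy as np

def canonical_form(X):
    """Sketch of Algorithm 1: canonical decomposition of an iTR core X of
    shape (d, r, r). Returns Q (d, r, r), the diagonal of Sigma, and eta."""
    d, r, _ = X.shape
    T = sum(np.kron(X[i], X[i]) for i in range(d))

    # Step 1: dominant left/right eigenvectors of the transfer matrix T_X
    w, V = np.linalg.eig(T)
    i0 = int(np.argmax(np.abs(w)))
    eta = w[i0].real
    VR = V[:, i0].real.reshape(r, r)
    wL, U = np.linalg.eig(T.T)
    VL = U[:, int(np.argmax(np.abs(wL)))].real.reshape(r, r)
    # symmetrize and fix signs so that V_L and V_R are positive definite
    VR = (VR + VR.T) / 2;  VR = VR * np.sign(np.trace(VR))
    VL = (VL + VL.T) / 2;  VL = VL * np.sign(np.trace(VL))

    # Steps 2-3: U_tilde = U Lambda^{1/2}
    lamL, UL = np.linalg.eigh(VL)
    lamR, UR = np.linalg.eigh(VR)
    UtL = UL * np.sqrt(np.maximum(lamL, 0.0))
    UtR = UR * np.sqrt(np.maximum(lamR, 0.0))

    # Step 4: SVD of U_tilde_L^T U_tilde_R
    Vs, St, Wt = np.linalg.svd(UtL.T @ UtR)   # Vs @ diag(St) @ Wt

    # Step 5: gauge matrices L and R
    L = Wt @ np.linalg.inv(UtR)               # L = W^T UtR^{-1}
    R = np.linalg.inv(UtL.T) @ Vs             # R = UtL^{-T} V

    # Steps 6-7: canonical core and normalized singular values
    nrm = np.linalg.norm(St)
    Sigma = St / nrm
    Q = np.array([nrm * L @ X[i] @ R for i in range(d)]) / np.sqrt(eta)
    return Q, Sigma, eta

# Check the orthogonality conditions (3.18), with eta scaled to 1 by step 7
rng = np.random.default_rng(3)
X = rng.standard_normal((2, 4, 4))
Q, Sigma, eta = canonical_form(X)
S2 = np.diag(Sigma**2)
left  = sum(Q[i].T @ S2 @ Q[i] for i in range(2))
right = sum(Q[i] @ S2 @ Q[i].T for i in range(2))
print(np.allclose(left, np.eye(4)), np.allclose(right, np.eye(4)))
```

The full r² × r² eigenvalue computation in step 1 is only for illustration; in practice one would use an iterative eigensolver that applies T_X through the map V ↦ Σ_i X(i) V X(i)^⊤ at O(dr³) cost.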
Definition 3.11.
Let x be a normalized iTR in canonical form as defined in Definition 3.8. Then we define its kth frame matrix F_{≠k} as the infinite tensor network (3.19) obtained by removing the kth core from x.

Note that by this definition the frame matrix F_{≠k} has an infinite number of rows, which are represented as down pointing legs, and dr² columns, which are represented as up pointing legs. Now the iTR x can also be written as

(3.20)    x = F_{≠k} vec(Q_c),    Q_c(i) := Σ Q(i) Σ,

where Q_c is called the canonical center core. Another important property of the frame matrix is that it has orthonormal columns, i.e., F_{≠k}^⊤ F_{≠k} = I_{dr²}. This follows by using, in a first step, Lemma 3.6, i.e., lim_{k→∞} (T_{Q_R})^k = vec(I) vec(Σ²)^⊤, in order to replace the infinite repetition of T_{Q_R} on both sides of the free indices, followed by vec(I)^⊤ T_{Q_L} = vec(I)^⊤ and vec(Σ²)^⊤ vec(I) = 1, respectively.

Due to the orthogonality of the frame matrix, this matrix will turn out to be an important tool to extract and update cores in an efficient way. Further, this matrix will also play a role in splitting supercores into normal cores.

The translational invariance principle can be generalized to allow the unit of invariance to include more than one core. Such a unit is sometimes referred to as a supercore. We now define a 2-core translational invariant iTR, which we will refer to as iTR2 to distinguish it from iTRs with a single core.
Definition 3.12.
A 2-core translational invariant iTR is defined element-wise as

(3.21)    x(. . . , i_1, i_2, i_3, i_4, . . .) := Tr[ \prod_{k=−∞}^{+∞} X(i_{2k}) Y(i_{2k+1}) ],

where all indices i_k run from 1 to d and each X(i) and Y(i) is a matrix of size r × r.

In contrast to an iTR with a single core, there are now two transfer matrices associated with every iTR2, playing a central role in the iTR2 operations.
Definition 3.13.
Let x be an iTR2 as defined in Definition 3.12. Then we define the transfer matrices T_{XY} and T_{YX} associated with x as the r² × r² matrices

(3.22)    T_{XY} := \sum_{i=1}^{d} \sum_{j=1}^{d} [X(i) ⊗ X(i)] [Y(j) ⊗ Y(j)],

(3.23)    T_{YX} := \sum_{i=1}^{d} \sum_{j=1}^{d} [Y(i) ⊗ Y(i)] [X(j) ⊗ X(j)],

where X(i) is the ith slice of X and Y(j) the jth slice of Y.

Note that the transfer matrices T_{XY} and T_{YX} of an iTR2 can also be expressed in terms of the 1-core transfer matrices, i.e., T_{XY} = T_X T_Y and T_{YX} = T_Y T_X, where T_X is the transfer matrix formed by only using the core X and T_Y the transfer matrix formed by only using the core Y.

Lemma 3.14.
Let x be an iTR2 as defined in Definition 3.12 and assume that T_X and T_Y are nonsingular. Then its corresponding transfer matrices T_{XY} and T_{YX}, defined in Definition 3.13, have the same eigenvalues.

Proof.
We will show that every eigenvalue of T_{XY} is also an eigenvalue of T_{YX}, and vice versa. First, assume that (λ, v) is an eigenpair of T_{XY},

T_{XY} v = T_X T_Y v = λv.

Multiplying both sides from the left by T_Y yields T_Y T_X (T_Y v) = λ (T_Y v); hence, the pair (λ, w := T_Y v) is an eigenpair of T_{YX}. Similarly, assume that (λ, w) is an eigenpair of T_{YX},

T_{YX} w = T_Y T_X w = λw.

Multiplying both sides from the left by T_X yields T_X T_Y (T_X w) = λ (T_X w); hence, the pair (λ, v := T_X w) is an eigenpair of T_{XY}.

We now define the canonical form of an iTR2. Since the canonical form is mainly used for normalized iTRs and iTR2s, we will assume that the dominant eigenvalue of its transfer matrices is η = 1.

Definition 3.15. Let x be an iTR2 as defined in Definition 3.12. Then its canonical form is defined element-wise as

(3.24)    x(. . . , i_1, i_2, i_3, i_4, . . .) := Tr[ \prod_{k=−∞}^{+∞} Q(i_{2k}) Σ U(i_{2k+1}) Ω ],

where Q(i), U(i) ∈ R^{r×r}, for i = 1, . . . , d, and Σ, Ω ∈ R^{r×r} are diagonal matrices with decreasing non-negative real numbers on their diagonals and ‖Σ‖_F = ‖Ω‖_F = 1, such that the following left and right orthogonality conditions hold

(3.25)    \sum_{i=1}^{d} Q_L(i)^⊤ Q_L(i) = I,    \sum_{i=1}^{d} Q_R(i) Q_R(i)^⊤ = I,

(3.26)    \sum_{i=1}^{d} U_L(i)^⊤ U_L(i) = I,    \sum_{i=1}^{d} U_R(i) U_R(i)^⊤ = I,

with Q_L(i) := Ω Q(i), Q_R(i) := Q(i)Σ, U_L(i) := Σ U(i), and U_R(i) := U(i)Ω.
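Lemma 3.14 is easy to confirm numerically; the sketch below (random cores of our own choosing) builds the two iTR2 transfer matrices and checks that every eigenvalue of T_XY appears in the spectrum of T_YX:

```python
import numpy as np

# Hypothetical random iTR2 cores X and Y, each with d slices of size r x r
d, r = 2, 3
rng = np.random.default_rng(7)
X = rng.standard_normal((d, r, r))
Y = rng.standard_normal((d, r, r))

TX = sum(np.kron(X[i], X[i]) for i in range(d))
TY = sum(np.kron(Y[j], Y[j]) for j in range(d))

ev_XY = np.linalg.eigvals(TX @ TY)   # spectrum of T_XY = T_X T_Y
ev_YX = np.linalg.eigvals(TY @ TX)   # spectrum of T_YX = T_Y T_X

# distance from each eigenvalue of T_XY to the nearest eigenvalue of T_YX
gap = max(np.min(np.abs(ev_YX - lam)) for lam in ev_XY)
print(gap)   # near machine precision
```

This is the familiar fact that AB and BA have the same eigenvalues; the nonsingularity assumption of the lemma guarantees that the constructed vectors T_Y v and T_X w are nonzero.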
Let x be an iTR2 as defined in Defi-nition . Then its canonical form is defined as x = Q Σ i U Ω i Q Σ i U Ω i · · · · · · and element-wise as (3.24) x ( . . . , i , i , i , i , . . . ) := Tr " + ∞ Y k = −∞ Q ( i k ) Σ U ( i k +1 ) Ω , where Q ( i ) , U ( i ) ∈ R r × r , for i = 1 , . . . , d , and Σ , Ω ∈ R r × r are diagonal matrices withdecreasing non-negative real numbers on its diagonal and k Σ k F = k Ω k F = 1 , suchthat the following left and right orthogonality conditions hold d X i =1 Q L ( i ) ⊤ Q L ( i ) = I, d X i =1 Q R ( i ) Q R ( i ) ⊤ = I, (3.25) d X i =1 U L ( i ) ⊤ U L ( i ) = I, d X i =1 U R ( i ) U R ( i ) ⊤ = I, (3.26) with Q L ( i ) := Ω Q ( i ) , Q R ( i ) := Q ( i )Σ , U L ( i ) := Σ U ( i ) , and U R ( i ) := U ( i )Ω . LEXIBLE POWER METHOD FOR INFINITE TEP
Definition 3.16.
Let x be an iTR2 in canonical form as defined in Definition 3.15. Then we define its left canonical transfer matrices as the r² × r² matrices T_{Q_L U_L} and T_{U_L Q_L}, and its right canonical transfer matrices as T_{Q_R U_R} and T_{U_R Q_R}, i.e.,

(3.27)    T_{Q_L U_L} = T_{Q_L} T_{U_L},    T_{Q_R U_R} = T_{Q_R} T_{U_R},

(3.28)    T_{U_L Q_L} = T_{U_L} T_{Q_L},    T_{U_R Q_R} = T_{U_R} T_{Q_R},

where Q_L, Q_R, U_L, and U_R are defined in Definition 3.15.

Lemma 3.17.
Let x be an iTR2 in canonical form as defined in Definition 3.15 and its corresponding canonical transfer matrices as in Definition 3.16. Then,

(3.29)    T_{Q_L U_L} vec(Ω²) = vec(Ω²),    vec(Ω²)^⊤ T_{Q_R U_R} = vec(Ω²)^⊤,

(3.30)    T_{U_L Q_L} vec(Σ²) = vec(Σ²),    vec(Σ²)^⊤ T_{U_R Q_R} = vec(Σ²)^⊤,

with vec(Ω²) being the dominant right eigenvector of T_{Q_L U_L} and the dominant left eigenvector of T_{Q_R U_R}, and vec(Σ²) being the dominant right eigenvector of T_{U_L Q_L} and the dominant left eigenvector of T_{U_R Q_R}.

Proof. The proof is similar to the one of Lemma 3.10.

The canonical decomposition of an iTR2 can be computed by making use of Algorithm 1 applied to the supercore, followed by a singular value decomposition of the matricization of the obtained canonical center supercore. This procedure is summarized in Algorithm 2. In case d ≪ r, the complexity of Algorithm 2 is dominated by step 2, which requires again computing the dominant left and right eigenvectors of the r² × r² transfer matrix in Algorithm 1.

Algorithm 2:
Canonical decomposition of iTR2
Input: X and Y
Output: Q and U, and diagonal matrices Σ and Ω
1. Form supercore Z as the contraction of the cores X and Y.
2. Use Algorithm 1 on Z, resulting in the canonical supercore Q and the diagonal matrix Ω.
3. Form the canonical center supercore Q_c and reshape it into a matrix M.
4. Singular value decomposition of M: V Σ W^⊤ = M; reshape V and W back into cores.
5. Canonical form: Q(i) = Ω^{−1} V(i), U(j) = W(j) Ω^{−1}.
4. Eigenvalue approximation.
Before we describe the algorithm for computing the leftmost eigenvalue and its corresponding eigenvector, we first need to define an eigenpair of an infinite dimensional matrix H. The use of the iTR ansatz for the approximate eigenvector also requires special attention to be paid to the definition of the Rayleigh quotient and corresponding residual.

First, we introduce the graphical notation of the infinite dimensional matrices H_k defined in (1.2): H_k acts as the identity on every site except sites k and k+1, where the d² × d² matrix M connects the input indices (i_k, i_{k+1}) to the output indices (j_k, j_{k+1}). Using this notation, the infinite dimensional matrix-vector product H_k x, where x is an iTR, corresponds to contracting along all i indices.

4.1. Rayleigh quotient. Let H be the infinite dimensional matrix given in (1.2) and x a normalized iTR that is an approximation of an eigenvector of H. Because (1.2) contains an infinite number of terms in the summation, the classical definition of the Rayleigh quotient involves an infinite sum, i.e.,

(4.1) x^⊤ H x = Σ_{k=−∞}^{+∞} x^⊤ H_k x,

where each term

(4.2) x^⊤ H_k x

is, for all k, the contraction of the iTR x and its transpose with the d² × d² matrix M, which only interacts with two nearest neighbor cores of x (and x^⊤). In this case, H_k is said to be a local nearest neighbor operator. Since all terms in (4.1) contribute equally to the summation, we define the Rayleigh quotient in an average sense as follows.

Theorem 4.1.
Let x be a normalized nonzero iTR as defined earlier and T_X its corresponding transfer matrix. Then the Rayleigh quotient for a given infinite dimensional matrix H of the form (1.2) can be represented by

(4.3) θ = v_L^⊤ M̃ v_R,

where v_L = vec(V_L) and v_R = vec(V_R), with V_L and V_R, respectively, the matricizations of the left and right dominant eigenvectors of T_X, and M̃ the two-site transfer matrix carrying M, defined in (4.4) below.

Proof.
We first rewrite (4.2) in matrix notation as

(4.4) x^⊤ H_k x = Tr[ ( ∏_{ℓ=−∞}^{k−1} T_X ) M̃ ( ∏_{ℓ=k+2}^{+∞} T_X ) ],

where M̃ denotes the contraction of two neighboring cores of x and x^⊤ with the matrix M. It follows from Lemma 3.6 that

x^⊤ H_k x = Tr[ v_R v_L^⊤ M̃ v_R v_L^⊤ ] = Tr[ v_L^⊤ M̃ v_R v_L^⊤ v_R ] = v_L^⊤ M̃ v_R = θ,

where we consecutively used the trace cyclic property Tr(XYZ) = Tr(YZX) and the normalization of the left and right eigenvectors, v_L^⊤ v_R = 1.

If x is a normalized iTR, the evaluation of the Rayleigh quotient only requires multiplying the matrix M̃ in (4.4) from the left and right, respectively, with the left and right dominant eigenvectors of its corresponding transfer matrix T_X. In case x is given in canonical form, the Rayleigh quotient simplifies further because there is no need to compute the left and right dominant eigenvectors of the transfer matrix, see Lemma 3.10.

Corollary.
Let x be a normalized iTR in canonical form as defined earlier and its corresponding canonical transfer matrices as defined above. Then the Rayleigh quotient associated with the infinite dimensional matrix H (1.2) can be represented by

(4.5) θ = vec(Σ²)^⊤ M̃_Q vec(I),

where Q and Σ are the canonical core and diagonal matrix, and M̃_Q is the contraction of two neighboring canonical cores of x and x^⊤ with the matrix M.

Proof. Since x is in canonical form, the terms in (4.1) are given, for all k, by

(4.6) x^⊤ H_k x = Tr[ ( ∏_{ℓ=−∞}^{k−1} T_{Q_R} ) M̃_Q ( ∏_{ℓ=k+2}^{+∞} T_{Q_R} ) ].

Next, using Lemma 3.6 together with the fact that vec(Σ²) and vec(I) are the left and right dominant eigenvectors of T_{Q_R}, respectively, we obtain

x^⊤ H_k x = Tr[ vec(I) vec(Σ²)^⊤ M̃_Q vec(I) vec(Σ²)^⊤ ] = vec(Σ²)^⊤ M̃_Q vec(I) = θ,

where we consecutively used the trace cyclic property Tr(XYZ) = Tr(YZX) and the normalization of the left and right eigenvectors of T_{Q_R}, i.e., vec(Σ²)^⊤ vec(I) = 1.

In case x is an iTR2, the classical definition of the Rayleigh quotient (4.1) consists of even terms (4.7), in which M acts on a core pair (X₁, X₂), and odd terms (4.8), in which M acts on a core pair (X₂, X₁), for all k. Hence, we define the iTR2 Rayleigh quotient as the average of one even and one odd term.

Theorem.
Let x be a normalized nonzero iTR2 as defined earlier and T_{XY} and T_{YX} its corresponding transfer matrices. Then the Rayleigh quotient for a given infinite dimensional matrix H (1.2) is defined as

(4.9) θ = ½ (v_L^e)^⊤ M̃_{XY} v_R^e + ½ (v_L^o)^⊤ M̃_{YX} v_R^o,

where v_L^e = vec(V_L^e) and v_R^e = vec(V_R^e), with V_L^e and V_R^e the matricizations of the left and right dominant eigenvectors of T_{XY}, and v_L^o, v_R^o are defined analogously from the left and right dominant eigenvectors of T_{YX}; here M̃_{XY} and M̃_{YX} denote the contractions of the corresponding core pairs of x and x^⊤ with M.

Proof.
The proof is similar to the one for Theorem 4.1.

Using the iTR2 canonical form of Definition 3.15, the Rayleigh quotient (4.9) yields

(4.10) θ = ½ vec(Ω²)^⊤ M̃_{QU} vec(I) + ½ vec(Σ²)^⊤ M̃_{UQ} vec(I),

where Q, U, Σ, and Ω are defined in Definition 3.15, and M̃_{QU} and M̃_{UQ} are the contractions of the canonical core pairs (Q, U) and (U, Q) of x and x^⊤ with M.

4.2. Residual. One way to measure the accuracy of an approximate eigenpair (θ, x) of H is to evaluate the residual norm. When H is infinite dimensional and x an iTR, some care is needed to properly define this residual. By making use of the property that x can be written as the matrix-vector product of the frame matrix and the canonical center core (3.20), we introduce a local iTR residual involving only one core of x.

Since H has an infinite number of terms in the summation (1.2), the standard definition of the residual for a Ritz pair (θ, x), i.e.,

(4.11) Hx − θx,

is also infinite. However, by projecting H onto the subspace spanned by the columns of the frame matrix, defined in Definition 3.11, we can define a residual associated with the projected H. Due to the translational invariance of x, we can choose an arbitrary k for the frame matrix. Therefore, without loss of generality, we set k = 0 for the remainder of this section.

We define the projection of H onto F_{≠0} as follows

(4.12) H_{≠0} := F_{≠0}^⊤ H F_{≠0} = Σ_{k=−∞}^{+∞} F_{≠0}^⊤ H_k F_{≠0} =: Σ_{k=−∞}^{+∞} H̄_k,

where H_k is defined in (1.2) and the matrices H̄_k ∈ R^{dr²×dr²} are obtained by projecting the corresponding H_k onto F_{≠0}. Depending on the relative position of the matrix M in each H_k, with respect to index i₀ in F_{≠0}, the matrices H̄_k have different expressions for each k, i.e.,

(4.13) H̄_{−2−ℓ} is built from l and (T_{Q_L})^ℓ, for ℓ = 0, 1, ..., the matrices H̄_{−1} and H̄_0 apply M directly to the center core and one of its neighbors, and H̄_{1+ℓ} is built from r and (T_{Q_R})^ℓ, for ℓ = 0, 1, ...,

where l and r, defined in (4.14), are the partial contractions of M with the cores to the left and right of the center, respectively. Note that, because the transfer matrices T_{Q_L} and T_{Q_R} have a dominant eigenvalue 1, the infinite sums

(4.15) Σ_{ℓ=0}^{+∞} (T_{Q_L})^ℓ and Σ_{ℓ=0}^{+∞} (T_{Q_R})^ℓ

diverge; hence, the infinite sum in (4.12) also diverges. On the other hand, we can show that for every H̄_k in (4.13),

vec(Q_c)^⊤ H̄_k vec(Q_c) = θ,

where θ is the Rayleigh quotient (4.5) and Q_c the canonical center core (3.20). Therefore, we define the iTR residual as follows

(4.16) Res := Σ_{k=−∞}^{+∞} ( H̄_k vec(Q_c) − θ vec(Q_c) ) =: Σ_{k=−∞}^{+∞} Res_k.

The following theorem shows how to construct this residual and proves that, in contrast to the infinite sum in (4.12), the iTR residual defined in (4.16) does converge.
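The contrast between the divergent sums (4.15) and the convergent deflated series used in the next theorem can be reproduced on a small dense stand-in for a transfer matrix (a hypothetical 9×9 matrix with a simple dominant eigenvalue 1; not an actual iTR transfer matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 9

# Stand-in for a normalized transfer matrix: T = P diag(1, mu_2, ..., mu_n) P^{-1}
# with a simple dominant eigenvalue 1 and |mu_i| < 1 otherwise.
P = rng.standard_normal((n, n)) + n * np.eye(n)   # well conditioned
mus = np.concatenate(([1.0], rng.uniform(-0.8, 0.8, n - 1)))
T = P @ np.diag(mus) @ np.linalg.inv(P)

# Dominant right/left eigenvectors; vL @ vR = 1 holds by construction.
vR = P[:, 0]
vL = np.linalg.inv(P)[0, :]

# Partial sums of T^l diverge: they grow linearly in the number of terms.
S1 = sum(np.linalg.matrix_power(T, l) for l in range(200))
assert np.linalg.norm(S1) > 50

# After deflating the dominant eigenpair, the spectral radius drops below 1
# and the geometric series converges to the resolvent [I - T_tilde]^{-1}.
Tt = T - np.outer(vR, vL)
S2 = sum(np.linalg.matrix_power(Tt, l) for l in range(200))
assert np.allclose(S2, np.linalg.inv(np.eye(n) - Tt), atol=1e-8)
```

This is precisely the mechanism behind (4.18) and (4.19): the resolvent of the deflated transfer matrix replaces a divergent geometric series.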
Theorem 4.4 (iTR residual). Let x be a normalized nonzero canonical iTR as defined earlier. Then the residual for a given infinite dimensional matrix H (1.2) is given by

(4.17) Res = (contraction of Q_c with L) + H̄_{−1} vec(Q_c) + H̄_0 vec(Q_c) + (contraction of Q_c with R) − 4 θ vec(Q_c),

where θ is the Rayleigh quotient (4.5), Q_c the canonical center core (3.20), and

(4.18) vec(L)^⊤ := vec(l)^⊤ [I − T̃_{Q_L}]^{-1},  T̃_{Q_L} := T_{Q_L} − vec(Σ²) vec(I)^⊤,

(4.19) vec(R) := [I − T̃_{Q_R}]^{-1} vec(r),  T̃_{Q_R} := T_{Q_R} − vec(I) vec(Σ²)^⊤,

with l and r defined by the diagrams given in (4.14).

Proof. Since each term in the infinite sum of (4.16) is defined as the difference Res_k = H̄_k vec(Q_c) − θ vec(Q_c), we immediately obtain from (4.13) that the 2nd and 3rd terms in (4.17) are equal to H̄_{−1} vec(Q_c) and H̄_0 vec(Q_c), respectively. Next, we prove that Σ_{k=−∞}^{−2} Res_k yields the 1st term in (4.17) minus θ vec(Q_c), that Σ_{k=1}^{+∞} Res_k yields the 4th term minus θ vec(Q_c), and that both infinite sums converge.

By using (4.5) and (4.14), we rewrite θ vec(Q_c) in a factored form in which the rank-one matrix vec(Σ²) vec(I)^⊤ appears between the partial contraction l and the center core Q_c, so that

(4.20) Res_{−2−ℓ} = H̄_{−2−ℓ} vec(Q_c) − θ vec(Q_c), built from l and (T_{Q_L})^ℓ, collapses to the corresponding expression with (T̃_{Q_L})^ℓ, for ℓ = 1, 2, ....

Because vec(I) and vec(Σ²) are the left and right eigenvectors associated with the dominant eigenvalue 1 of T_{Q_L}, as shown in Lemma 3.10, the largest eigenvalue of the deflated matrix T̃_{Q_L} is less than 1 in magnitude. Hence, in contrast to (4.15), the geometric series of T̃_{Q_L} converges, yielding

Σ_{ℓ=0}^{+∞} (T̃_{Q_L})^ℓ = [I − T̃_{Q_L}]^{-1}.

Note that this sum starts from ℓ = 0 whereas (4.20) starts from ℓ = 1. Consequently, the infinite sum Σ_{k=−∞}^{−2} Res_k yields the 1st term in (4.17) minus θ vec(Q_c), where L is defined by (4.18).

For the remaining part of the proof, we rewrite θ vec(Q_c) with the rank-one matrix vec(I) vec(Σ²)^⊤ appearing between the center core Q_c and the partial contraction r, yielding

(4.21) Res_{1+ℓ} = H̄_{1+ℓ} vec(Q_c) − θ vec(Q_c), built from r and (T_{Q_R})^ℓ, which collapses to the corresponding expression with (T̃_{Q_R})^ℓ, for ℓ = 1, 2, ....

Using similar arguments as before, we can show that the geometric series of T̃_{Q_R} converges and that the infinite sum Σ_{k=1}^{+∞} Res_k yields the 4th term in (4.17) minus θ vec(Q_c). Remark that the term θ vec(Q_c) in Res_k always gets incorporated in T̃_{Q_L} or T̃_{Q_R}, except for k = −2, −1, 0, 1, yielding the 5th term in (4.17). This completes the proof.

In order to define the residual for a Ritz pair (θ, x) with x being an iTR2, we first introduce the frame matrices F_{≠Q} and F_{≠U}, which allow us to rewrite x as follows

(4.22) x = F_{≠Q} vec(Q_c) = F_{≠U} vec(U_c),

where F_{≠Q}^⊤ F_{≠Q} = F_{≠U}^⊤ F_{≠U} = I and the canonical center cores Q_c and U_c are

(4.23) Q_c = Ω Q Σ,  U_c = Σ U Ω,

respectively. Next, we define the residual in an average sense as follows

Res = ½ Σ_{k=−∞}^{+∞} ( F_{≠Q}^⊤ H_k F_{≠Q} vec(Q_c) − θ vec(Q_c) ) + ½ Σ_{k=−∞}^{+∞} ( F_{≠U}^⊤ H_k F_{≠U} vec(U_c) − θ vec(U_c) ),

where θ is the Rayleigh quotient (4.10), and Q_c and U_c the canonical center cores (4.23). The following theorem summarizes how to compute this residual.

Theorem.
Let x be a normalized nonzero canonical iTR2 as defined in Definition 3.15. Then the residual for a given infinite dimensional matrix H (1.2) is given by

(4.24) Res = ½ Res_Q + ½ Res_U,

where Res_Q collects the contractions of Q_c with L_Q and R_Q, the two local M-terms, and the unabsorbed multiples of θ vec(Q_c), and Res_U is defined analogously with U_c, L_U, and R_U, with θ the Rayleigh quotient (4.10), Q_c and U_c the canonical center cores (4.23), and

(4.25) vec(L_Q)^⊤ := [ vec(l_Q)^⊤ + vec(l_U)^⊤ T_{U_L} ] [I − T̃_{Q_L U_L}]^{-1},
      vec(L_U)^⊤ := [ vec(l_U)^⊤ + vec(l_Q)^⊤ T_{Q_L} ] [I − T̃_{U_L Q_L}]^{-1},
      vec(R_Q) := [I − T̃_{U_R Q_R}]^{-1} [ vec(r_U) + T_{U_R} vec(r_Q) ],
      vec(R_U) := [I − T̃_{Q_R U_R}]^{-1} [ vec(r_Q) + T_{Q_R} vec(r_U) ],

with

T̃_{Q_L U_L} := T_{Q_L U_L} − vec(Ω²) vec(I)^⊤,  T̃_{Q_R U_R} := T_{Q_R U_R} − vec(I) vec(Ω²)^⊤,
T̃_{U_L Q_L} := T_{U_L Q_L} − vec(Σ²) vec(I)^⊤,  T̃_{U_R Q_R} := T_{U_R Q_R} − vec(I) vec(Σ²)^⊤,

and l_Q, r_Q, l_U, and r_U the partial contractions of M with the core pairs (Q, U) and (U, Q), defined analogously to l and r in (4.14).

Proof.
The proof is similar to the one for Theorem 4.4.
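As a sanity check of the averaged Rayleigh quotient of Theorem 4.1, the following sketch (random core, hypothetical small dimensions, illustrative index convention for M) builds the two-site transfer matrix M̃ for a given nearest neighbor operator M and verifies that M = cI yields θ = c, as it must, since then every H_k is c times the identity:

```python
import numpy as np

rng = np.random.default_rng(2)
r, d = 3, 2

# Random iTR core, rescaled so its transfer matrix T_X has dominant eigenvalue 1.
X = rng.standard_normal((d, r, r))
T = sum(np.kron(X[i], X[i]) for i in range(d))
w, V = np.linalg.eig(T)
k = np.argmax(np.abs(w))
X /= np.sqrt(np.abs(w[k]))
T /= np.abs(w[k])

# Dominant right/left eigenvectors of T_X, normalized so vL @ vR = 1.
vR = np.real(V[:, k])
wl, Vl = np.linalg.eig(T.T)
vL = np.real(Vl[:, np.argmax(np.abs(wl))])
vL /= vL @ vR

def rayleigh(M):
    # theta = vL^T Mtilde vR, with Mtilde the two-site transfer matrix
    # carrying the nearest neighbor operator M (d^2 x d^2).
    Mt = np.zeros((r * r, r * r))
    for i in range(d):
        for j in range(d):
            for a in range(d):
                for b in range(d):
                    Mt += M[i * d + j, a * d + b] * np.kron(X[i] @ X[j], X[a] @ X[b])
    return vL @ Mt @ vR

# For M = c*I the average energy per site must be exactly c.
assert np.isclose(rayleigh(np.eye(d * d)), 1.0)
assert np.isclose(rayleigh(2.5 * np.eye(d * d)), 2.5)
```

For M = cI the two-site transfer matrix reduces to c T², so the check exercises exactly the normalization v_L^⊤ v_R = 1 and T v_R = v_R used in the proof of Theorem 4.1.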
5. Flexible power method.
One way to compute the algebraically smallest eigenvalue λ₁ of H is to apply the power method to the matrix exponential e^{−H}. If λ₁ is simple, the power method converges linearly to the desired eigenpair (λ₁, x₁) at the rate e^{λ₁−λ₂}. This approach is generally not recommended because working with the matrix exponential e^{−H} can be costly. In fact, computing e^{−H} or applying e^{−H} to a vector may be harder than computing eigenvalues of H. However, when H has the structure exhibited in (1.2), e^{−Ht} may be approximated by a simpler form that makes it possible to perform a power iteration with e^{−Ht} for a small t.

Throughout this section, we will make use of iTR matrices, the matrix analogues of iTRs in which every core A carries two physical indices, so that each A(j, i) is a matrix of size r × r, with all indices i_k, j_k running from 1 to d and r referred to as the rank of A.
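The power iteration with e^{−Ht} can be reproduced on a finite dimensional analog, here a hypothetical 6-site periodic transverse field Ising chain, small enough to form e^{−Ht} densely (parameters are illustrative, not those of the infinite chain):

```python
import numpy as np
from scipy.linalg import expm

N, g, t = 6, 2.0, 0.1
sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])

def site(op, k):
    # op acting on site k of an N-site chain, identity elsewhere
    mats = [np.eye(2)] * N
    mats[k] = op
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

# Transverse field Ising Hamiltonian with periodic boundary conditions
H = sum(-site(sz, k) @ site(sz, (k + 1) % N) - g * site(sx, k) for k in range(N))

# Power iteration with e^{-Ht}: converges to the eigenvector of the
# algebraically smallest eigenvalue, since e^{-lambda_1 t} is dominant.
P = expm(-H * t)
rng = np.random.default_rng(3)
x = rng.standard_normal(2 ** N)
for _ in range(500):
    x = P @ x
    x /= np.linalg.norm(x)

theta = x @ H @ x
assert np.isclose(theta, np.linalg.eigvalsh(H).min(), atol=1e-8)
```

In the infinite dimensional setting neither H nor e^{−Ht} can be formed; the remainder of this section replaces the dense `expm` by the splitting (5.1) and the dense vector by an iTR.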
5.1. Approximating exp(−Ht) by matrix splitting. We now discuss how to approximate e^{−Ht} x, with x being an iTR, as required in a single step of the power iteration. Because the infinite matrix H consists of an infinite sum, it is tempting to rewrite e^{−H} = e^{−Σ_k H_k} as ··· e^{−H_{k−1}} e^{−H_k} e^{−H_{k+1}} ···. However, this is not valid because the H_k's do not commute in general. On the other hand, we can use the Lie product formula, also known as Suzuki–Trotter splitting,

e^{(A+B)t} = e^{At} e^{Bt} + O(t²),

where A and B are square matrices, yielding the following approximation

(5.1) e^{−Ht} ≈ ∏_{k=−∞}^{+∞} e^{−H_k t},

which may be accurate enough if t is sufficiently small. In the physics literature, t is viewed as an imaginary time variable; hence, e^{−Ht} is often referred to as imaginary time evolution.

First, assume the matrices H_k to be completely local, i.e.,

H_k = ··· ⊗ I_d ⊗ I_d ⊗ A_k ⊗ I_d ⊗ I_d ⊗ ···, A_k = A ∈ R^{d×d},

so that, by making use of the identities

e^{A⊗I} = e^A ⊗ I, and e^{I⊗A} = I ⊗ e^A,

the matrix exponentials e^{−H_k t} can be simplified as follows

e^{−H_k t} = ··· ⊗ I ⊗ I ⊗ e^{−At} ⊗ I ⊗ I ⊗ ···,

and the infinite product in (5.1) corresponds to the rank-1 iTR matrix

∏_{k=−∞}^{+∞} e^{−H_k t} = ··· ⊗ e^{−At} ⊗ e^{−At} ⊗ e^{−At} ⊗ ···.

Hence, if x is an iTR, the product

(5.2) y = ( ∏_{k=−∞}^{+∞} e^{−H_k t} ) x

is again an iTR of the same rank, and (5.2) can efficiently be computed by left multiplying the slices of x with the d × d matrix e^{−At},

Y(j) = Σ_{i=1}^{d} (e^{−At})_{ji} X(i),

where X and Y are the cores of x and y, respectively.

Next, let the H_k represent nearest neighbor interactions, i.e.,

(5.3) H_k = ··· ⊗ I_d ⊗ I_d ⊗ M_{k,k+1} ⊗ I_d ⊗ I_d ⊗ ···, M_{k,k+1} = M ∈ R^{d²×d²},

so that each H_k interacts with the kth and (k+1)st cores of x. Because H_k and H_{k+1} both interact with the (k+1)st core of x, e^{−Ht} x cannot be directly obtained as in (5.2). Therefore, we first reorder the summation and group the even and odd terms together,

H = ( Σ_{k=−∞}^{+∞} H_{2k} ) + ( Σ_{k=−∞}^{+∞} H_{2k+1} ) =: H_e + H_o.

Using the Suzuki–Trotter splitting twice yields

(5.4) e^{−Ht} = e^{−(H_e + H_o)t} ≈ ( ∏_{k=−∞}^{+∞} e^{−H_{2k} t} ) ( ∏_{k=−∞}^{+∞} e^{−H_{2k+1} t} ),

for sufficiently small t, where the even product (5.5) applies e^{−Mt} to the site pairs (2k, 2k+1) and the odd product
(5.6) applies e^{−Mt} to the site pairs (2k+1, 2k+2). Remark that both (5.5) and (5.6) can also be considered as completely local operators when we combine the corresponding neighboring indices: in (5.7) the even product acts locally in the combined indices i_{2k,2k+1} = (i_{2k}, i_{2k+1}), and in (5.8) the odd product acts locally in the combined indices i_{2k+1,2k+2} = (i_{2k+1}, i_{2k+2}).

Then, in order to approximate e^{−Ht} x, we first focus on the odd terms, i.e.,

(5.9) y_o = ( ∏_{k=−∞}^{+∞} e^{−H_{2k+1} t} ) x,

which can be computed in the same way as (5.2) by reformulating x as an iTR2, i.e., combining the (2k+1)st and (2k+2)nd cores into supercores X₂ of dimension r × d² × r. Hence, (5.9) only requires the left multiplication of the slices of the new supercore with the d² × d² matrix e^{−Mt},

Y_o(j) = Σ_{i=1}^{d²} (e^{−Mt})_{ji} X₂(i),

where the index i runs over the combined physical dimension d² of the supercore. Thereafter, we only need to multiply in the even terms; however, directly applying

(5.10) y = ( ∏_{k=−∞}^{+∞} e^{−H_{2k} t} ) y_o

is not possible because the supercores Y_o are formed by combining the indices (2k+1, 2k+2), while (5.7) requires the indices (2k, 2k+1) to be combined. Therefore, we first need to split y_o back into two normal cores, followed by a new and different recombination of the cores in order to carry out (5.10) efficiently. The dimension of Y_o is r × d² × r.
However, there is no guarantee that this supercore can be split into two cores of dimension r × d × r without losing information. A possible way to split it is by first reshaping Y_o into an rd × dr matrix C, next approximating C by a rank-r factorization

C ≈ C₁ C₂^⊤, where C₁, C₂ ∈ R^{dr×r},

and finally reshaping the factors C₁ and C₂ into 3rd-order tensors, i.e., cores Y₁ and Y₂ of dimension r × d × r. In other words, y_o is approximated by the iTR2 ỹ_o with cores Y₁ and Y₂.
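The split just described can be sketched with a truncated SVD (hypothetical dimensions; the best rank-r factorization in the Frobenius norm):

```python
import numpy as np

rng = np.random.default_rng(4)
r, d = 5, 2

# A supercore of dimension r x d^2 x r, to be split back into two cores of
# dimension r x d x r via a rank-r truncated SVD of its (rd) x (dr)
# matricization, grouping (left bond, first physical index) as rows and
# (second physical index, right bond) as columns.
Y = rng.standard_normal((r, d * d, r))
C = Y.reshape(r, d, d, r).reshape(r * d, d * r)

U, s, Vt = np.linalg.svd(C, full_matrices=False)
C1 = U[:, :r] * s[:r]          # absorb the kept singular values on the left
C2 = Vt[:r, :]
Y1 = C1.reshape(r, d, r)       # cores Y1, Y2 of dimension r x d x r
Y2 = C2.reshape(r, d, r)

# The truncation error equals the norm of the discarded singular values.
err = np.linalg.norm(C - C1 @ C2)
assert np.isclose(err, np.linalg.norm(s[r:]))
```

By the Eckart–Young theorem no rank-r factorization can do better, which is why the error in each power step is governed by the decay of the singular values of the matricized supercore.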
Consequently, we can now approximate (5.10) by replacing y_o with ỹ_o, where the even product again amounts to left multiplying the slices of the recombined supercore with e^{−Mt}. In the final step, we split the resulting supercore into two cores Y₁ and Y₂ by the same rank-r factorization as before, so that y is approximated by the iTR2 ỹ with cores Y₁ and Y₂.

Remark that, although we started with an iTR x, the result of applying e^{−Ht} to it is an iTR2. Therefore, in Figure 3 we graphically illustrate the application of (5.1) to an iTR2 with cores X₁ and X₂. Due to the translational invariance, we only need to apply e^{−Mt} twice in order to compute ỹ; the remaining, redundant computations can completely be ignored, so that the multiplication y = e^{−Ht} x can be computed in an efficient way. (Figure 3: the matrix-vector operation y = e^{−Ht} x.)

As illustrated in Figure 3, we only need to operate on one supercore at a time. Instead of working with supercores involving X₁, X₂ and Y₁, Y₂, respectively, we will operate on the following canonical center supercores X₁ and X₂,

X₁ = Ω Q Σ U Ω, and X₂ = Σ U Ω Q Σ,

respectively, so that their corresponding frame matrices are orthogonal.

We only describe the application of the odd part of e^{−Ht} onto X₁, since the even part can be performed in exactly the same way by interchanging Q, U and Σ, Ω, respectively. The computation consists of 5 steps as illustrated in Figure 4. In step 1, the canonical center supercore X₁ is formed. Next, we apply e^{−Mt} to it in step 2 and reshape the result Y into a matrix. In step 3, we approximate the matrix representation of Y by a truncated SVD, with the diagonal matrix S_r containing only the r largest singular values, followed by reshaping the left and right singular vectors W and V into 3rd-order tensors, i.e., cores. Then in step 4, in order to update the cores Q and U by Ω^{-1}W and V^⊤Ω^{-1}, respectively, and the diagonal matrix Σ by S_r, we transform the result of step 3 into the form of step 1 by left and right multiplying W and V^⊤, respectively, by Ω^{-1}. Note that, although the resulting form resembles the form of step 1, the former is not in canonical form yet. Therefore, we update in step 5 both cores Q⋆ and U⋆, and both diagonal matrices Σ⋆ and Ω⋆, to bring the updated iTR2 again in canonical form. (Figure 4: the odd part of the matrix-vector operation y = e^{−Ht} x, where x is given in canonical iTR2 form; the even part corresponds to interchanging the cores Q and U, and the diagonal matrices Σ and Ω, respectively. (1) Forming the canonical center supercore. (2) Applying the matrix exponential followed by reshaping the result into a matrix. (3) Performing a truncated SVD and reshaping the singular vectors into cores. (4) Updating the cores and the central diagonal matrix, while keeping Ω unchanged. (5) Computing the canonical iTR2 form.)
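Steps 1 to 4 of Figure 4 can be sketched as follows, with random, hypothetical iTR2 data and a random symmetric two-site operator M, merely to exercise the tensor manipulations (a real run would start from an actual canonical form and additionally normalize Σ⋆):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(5)
r, d, t = 4, 2, 0.05

# Hypothetical iTR2 data: cores Q, U (layout (d, r, r)) and well conditioned
# diagonal matrices Sigma, Omega; random, for illustration only.
Q = rng.standard_normal((d, r, r))
U = rng.standard_normal((d, r, r))
Sigma = np.diag(rng.uniform(0.5, 1.5, r))
Omega = np.diag(rng.uniform(0.5, 1.5, r))
M = rng.standard_normal((d * d, d * d)); M = (M + M.T) / 2
G = expm(-M * t)                                   # two-site gate e^{-Mt}

# step 1: canonical center supercore X1 = Omega Q Sigma U Omega
X1 = np.einsum('ab,ibc,ce,jeg,gf->aijf', Omega, Q, Sigma, U, Omega)

# step 2: apply the gate to the combined physical index p = (i, j)
Y = np.einsum('pq,aqf->apf', G, X1.reshape(r, d * d, r))

# step 3: truncated SVD of the (rd) x (dr) matricization, keeping r values
Ymat = Y.reshape(r, d, d, r).reshape(r * d, d * r)
W, s, Vt = np.linalg.svd(Ymat, full_matrices=False)
Wr, sr, Vr = W[:, :r], s[:r], Vt[:r, :]
assert np.isclose(np.linalg.norm(Ymat - (Wr * sr) @ Vr), np.linalg.norm(s[r:]))

# step 4: absorb Omega^{-1} on both sides to recover the form of step 1
Oinv = np.diag(1 / np.diag(Omega))
Qs = np.einsum('ab,bic->aic', Oinv, Wr.reshape(r, d, r)).transpose(1, 0, 2)
Us = np.einsum('cjb,ba->cja', Vr.reshape(r, d, r), Oinv).transpose(1, 0, 2)
Ss = np.diag(sr)

# consistency: Omega Q* Sigma* U* Omega reproduces the truncated supercore
X1s = np.einsum('ab,ibc,ce,jeg,gf->aijf', Omega, Qs, Ss, Us, Omega)
assert np.allclose(X1s.reshape(r * d, d * r), (Wr * sr) @ Vr)
```

Step 5, the recanonicalization, would additionally require the dominant eigenvectors of the r² × r² transfer matrices and is therefore the expensive part discussed below.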
5.2. Flexible power algorithm. We now have all components to state the method for computing the smallest eigenvalue of (1.2). Due to the approximation made by the Suzuki–Trotter splitting of e^{−Ht} and the rank truncation for splitting a supercore into two normal cores, the multiplication y = e^{−Ht} x is inexact. Even when t is fixed, the error introduced by the rank truncation is different in each iteration and can be viewed as applying a slightly different operator every time. Therefore, we call the algorithm for computing the smallest eigenvalue of H, based on a power method for e^{−H}, the flexible power method, in analogy with the flexible GMRES algorithm used for computing solutions of linear systems.

Algorithm 3 shows the outline of the flexible power method for computing the smallest eigenvalue of (1.2). This algorithm takes a canonical iTR2 as starting vector and returns the Rayleigh quotient and its corresponding eigenvector in canonical iTR2 form. In every iteration of Algorithm 3, we apply the matrix exponential e^{−Mt} twice, according to Figure 4, followed by the computation of the Rayleigh quotient and residual.

As we will show in the numerical experiments, the timestep t can have a significant effect on the convergence and accuracy of the Rayleigh quotient. Since Algorithm 3 applies approximate powers of the matrix exponential e^{−H}, choosing a smaller timestep
Algorithm 3: Flexible power method
Input: canonical iTR2 with cores Q, U ∈ R^{r×d×r} and diagonal matrices Σ, Ω ∈ R^{r×r}; nearest neighbor interaction matrix M and initial timestep t
Output: Rayleigh quotient θ and corresponding eigenvector in canonical iTR2 form
while not converged do
  1. Apply e^{−Mt} to supercore X₁ according to Figure 4.
  2. Apply e^{−Mt} to supercore X₂ according to Figure 4.
  3. Compute the Rayleigh quotient θ as in (4.10).
  4. Check convergence based on the residual (4.24).
  5. Update t.
end

will require more iterations for applying the same power. However, if the timestep is chosen too large, the convergence of Algorithm 3 stagnates due to the dominating Suzuki–Trotter splitting error. It is therefore recommended to adapt and decrease the timestep t in the course of the algorithm. By starting with a relatively large t, the power method will stagnate after a small number of iterations. We can use the iTR2 residual for detecting this stagnation. Next, we decrease t and continue the power method with the already obtained approximate eigenvector. By updating t and using more accurate approximate eigenvectors for smaller t, we are able to reduce the total number of iterations.

Most operations of Algorithm 3 can be efficiently implemented involving only 3rd-order tensors of size r × d × r, such as steps 1–4 in Figure 4 and the Rayleigh quotient. On the other hand, step 5 in Figure 4 and the computation of the residual involve transfer matrices of size r² × r². The former requires computing dominant eigenvectors of the left and right transfer matrices, and the latter needs linear system solves involving the same matrices. In both cases, iterative methods can be used; however, even for low values of r these computations can dominate the cost. As an alternative, step 5 can be skipped, continuing instead with the updated cores Q⋆ := Ω^{-1}W and U⋆ := V^⊤Ω^{-1}, and the updated diagonal matrices Σ⋆ := S_r and Ω⋆ := Ω. Note that in this case only one of the two diagonal matrices is updated; hence, the resulting iTR2 will, in general, not satisfy all the orthogonality conditions of Definition 3.15. However, due to the alternating singular value decompositions on Y₁ and Y₂, we observe that the resulting iTR2 converges to canonical form.
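The timestep control used with Algorithm 3 can be sketched as follows; the residual history below is synthetic and the helper is hypothetical, but the criterion mirrors the one used later in the experiments (divide t by 10 when the leading digits of ‖Res‖ freeze over a few checks, or when the residual increases):

```python
import numpy as np

def adapt_timestep(t, res_history, window=3, digits=3):
    """Return the new timestep given the recent residual norms."""
    if len(res_history) < window + 1:
        return t
    recent = res_history[-window - 1:]
    if recent[-1] > recent[-2]:                  # residual increased
        return t / 10
    # first `digits` significant digits unchanged over `window` checks?
    def lead(x):
        return round(x / 10 ** np.floor(np.log10(abs(x))), digits - 1)
    if len({lead(x) for x in recent}) == 1:      # stagnation detected
        return t / 10
    return t

# synthetic residual norms: decreasing, then stagnating around 3.21e-5
hist = [1e-2, 4e-3, 9e-4, 3.213e-5, 3.214e-5, 3.212e-5, 3.210e-5]
assert np.isclose(adapt_timestep(1e-3, hist), 1e-4)   # stagnating: decrease t
assert adapt_timestep(1e-3, hist[:3]) == 1e-3         # still improving: keep t
```

In practice the check is performed only every so many iterations, since each residual evaluation requires the linear system solves of (4.25).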
As a result, we can still use the relatively cheap canonical Rayleigh quotient (4.10), compared to the standard Rayleigh quotient (4.9), which would again require the computation of dominant eigenvectors of the transfer matrices.

Note that in the iTEBD algorithm described in [15, 9] the matrix-vector operation y = e^{−Ht} x is implemented in a slightly different way: instead of bringing the updated iTR2 to canonical form, as done in step 5 in Figure 4, the iTR2 is directly transformed.
6. Numerical experiments.
In this section we illustrate the flexible power method, outlined in Algorithm 3, for three examples: the one-dimensional Ising model with a transverse field, the spin S = 1 Heisenberg isotropic antiferromagnetic model, and the spin S = 1/2 Heisenberg model. The dominant eigenvectors of the transfer matrices are computed via eigs, and the linear systems for computing the residual in (4.25) are solved via gmres.

6.1. One-dimensional transverse field Ising model. We start with the one-dimensional transverse field Ising model, which has the corresponding infinite dimensional Hamiltonian

(6.1) H = − Σ_{k=−∞}^{+∞} ··· ⊗ I ⊗ σ_z ⊗ σ_z ⊗ I ⊗ ··· − g Σ_{k=−∞}^{+∞} ··· ⊗ I ⊗ σ_x ⊗ I ⊗ ···,

where σ_x and σ_z are the 1st and 3rd Pauli matrices, respectively, and the corresponding 4 × 4 nearest neighbor interaction matrix in the notation of (1.2) is

M = −σ_z ⊗ σ_z − (g/2)(σ_x ⊗ I + I ⊗ σ_x),

where the transverse field term is distributed evenly over the two sites of each bond. We choose the parameter g = 2. The exact smallest eigenvalue of (6.1) is [13]

λ₁(g) = −(1/2π) ∫_{−π}^{π} √(1 + g² − 2g cos x) dx, λ₁(2) ≈ −2.127 088 819 946 730.

In the first experiment, we compare the two variants of Algorithm 3, i.e., the canonical variant, where we always convert the result of applying e^{−Mt} back into canonical iTR2 form, and the non-canonical variant, where we skip step 5 in Figure 4. In order to improve the numerical stability of the latter, we normalize the diagonal matrices to have unit Frobenius norm. Figure 5a shows the Rayleigh quotient θ and its difference with the exact solution as a function of the iteration count. Note that the results for both variants of Algorithm 3 are indistinguishable; hence, we will only use the faster non-canonical variant in the remainder of this section.

Next, we analyze the effect of the timestep t on the convergence of Algorithm 3. Figure 5b depicts the convergence of two different runs where we decreased the timestep t after every 100 or 1000 iterations, respectively; different timesteps are indicated by different colors. From this figure it is clear that the timestep control can have a significant effect on the overall attained accuracy. When t is decreased too soon, as in the first run, the flexible power method tends to stagnate, and further decreasing t does not help. On the other hand, we do not want to run too many iterations, and therefore we will use the iTR residual, introduced in section 4.2, to detect stagnation.

As discussed in section 5.2, the total number of iterations can be reduced by gradually decreasing the timestep t. Table 1 compares running Algorithm 3 with a fixed timestep to the automatic timestep adaption approach, in which t was divided by 10 when the first 3 digits of the residual norm ‖Res‖ were not changing anymore in 3 consecutive convergence checks, or in case the residual increased. The left plot in Figure 5c shows the corresponding convergence as a function of the iteration count. Note that by using the automatic timestep adaption approach, the accuracy of the Ritz value is multiple orders of magnitude higher than achieved in both runs shown in Figure 5b.

(Figure 5: One-dimensional transverse field Ising model with g = 2 and iTR rank r = 10. (a) Comparison of the canonical and non-canonical variants of Algorithm 3 for a fixed timestep. (b) Influence of the timestep t on the convergence of Algorithm 3. (c) Convergence as a function of iteration count and total time.)
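The reference value λ₁(2) can be reproduced by numerical quadrature of the integral formula above:

```python
import numpy as np
from scipy.integrate import quad

def lam1(g):
    # lambda_1(g) = -(1/2pi) * integral over [-pi, pi] of sqrt(1 + g^2 - 2 g cos x)
    val, _ = quad(lambda x: np.sqrt(1 + g * g - 2 * g * np.cos(x)), -np.pi, np.pi)
    return -val / (2 * np.pi)

assert np.isclose(lam1(2.0), -2.127088819946730)
```

The integrand is smooth for g ≠ 1 (at the critical point g = 1 it has a kink at x = 0), so the quadrature converges quickly for the parameter used here.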
(Table 1: Wall clock times for the one-dimensional transverse field Ising model with g = 2 and iTR rank r = 10, for a sequence of fixed timesteps and for the automatic timestep adaption.)

The right plot in Figure 5c shows the same results as the left one, but as a function of the total time T = N t, with N the number of iterations of Algorithm 3 for each t. Remark that the stagnation happens for every timestep at more or less the same total time T, corresponding to, roughly speaking, the same power of e^{−H}. We will use this observation in the next numerical experiments to change the frequency of the convergence check based on the current timestep.
6.2. Spin S = 1 Heisenberg isotropic antiferromagnetic model. As a second example, we consider the isotropic antiferromagnetic case of the spin S = 1 Heisenberg model. The resulting Hamiltonian has the form of (1.2), with the following nearest neighbor interaction matrix

M_{k,k+1} = X_k ⊗ X_{k+1} + Y_k ⊗ Y_{k+1} + Δ Z_k ⊗ Z_{k+1},

where Δ = 1, and X_k, Y_k, and Z_k are the spin-1 generators of SU(2),

X_k = (1/√2) [0 1 0; 1 0 1; 0 1 0],  Y_k = (1/√2) [0 −ı 0; ı 0 −ı; 0 ı 0],  Z_k = [1 0 0; 0 0 0; 0 0 −1].

The smallest eigenvalue is λ₁ ≈ −1.401 484 038 971 2(2) [5].

Figures 6a to 6d show the convergence of Algorithm 3 for different iTR ranks r = 10, 20, 30, and 60, respectively. For every case, we started with a relatively large timestep and performed the convergence check at a frequency proportional to 1/t, in order to avoid computing too many (expensive) residual calculations. The corresponding wall clock times for the different runs are given in Table 2. This table illustrates again that the smaller the timestep t, the more iterations are required and, hence, the larger the wall clock time.

When the rank of the iTR is too small, we see from Figure 6 that the Rayleigh quotient does not improve further even when the timestep is reduced, because the SVD truncation error is too large. On the other hand, we observe from Figures 6b to 6d that the residual norms in all these runs behave in a similar fashion. Since the iTR2 residual (4.24) is a projected residual, we cannot use it as the only metric for deciding when to terminate the flexible power iteration. As illustrated in Figure 6e, the singular values given by the diagonal matrices of Definition 3.15, and in particular the smallest singular value, together with the residual, are good indicators for the actual accuracy of the approximate eigenpair of H, because the overall accuracy is bounded by both the Suzuki–Trotter splitting and the SVD truncation error.
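The spin-1 generators above can be checked to satisfy the su(2) commutation relations and the spin-1 Casimir identity, and the resulting nearest neighbor interaction matrix is symmetric, as required for H in (1.2):

```python
import numpy as np

# Standard spin-1 generators of SU(2)
s = 1 / np.sqrt(2)
X = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Y = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]])
Z = np.diag([1.0, 0.0, -1.0]).astype(complex)

# su(2) commutation relations and Casimir S^2 = S(S+1) I with S = 1
assert np.allclose(X @ Y - Y @ X, 1j * Z)
assert np.allclose(Y @ Z - Z @ Y, 1j * X)
assert np.allclose(X @ X + Y @ Y + Z @ Z, 2 * np.eye(3))

# Nearest neighbor interaction matrix for Delta = 1; real and symmetric,
# since the imaginary parts of Y cancel in the Kronecker product Y (x) Y.
M = np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z)
assert np.allclose(M, M.conj().T) and np.allclose(M.imag, 0)
```

Note that although Y is complex, the 9 × 9 matrix M is real, so the iTR machinery above can be applied with real arithmetic.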
(Figure 6: Spin S = 1 Heisenberg isotropic antiferromagnetic model: convergence as a function of the total time T for ranks (a) r = 10, (b) r = 20, (c) r = 30, (d) r = 60, and (e) the singular values.)

6.3. Spin S = 1/2 Heisenberg model. As a last example, we consider the spin S = 1/2 Heisenberg model with the following nearest neighbor interaction matrix

M_{k,k+1} = X_k ⊗ X_{k+1} + Y_k ⊗ Y_{k+1} + Z_k ⊗ Z_{k+1},

where X_k = σ_x, Y_k = σ_y, and Z_k = σ_z, with σ_x, σ_y, σ_z the Pauli matrices. The exact smallest eigenvalue is

λ₁ = −ln(2) + 1/4 ≈ −0.443 147 180 559 945.

In Figure 7a, we plot the convergence history of the Rayleigh quotient θ to the exact eigenvalue λ₁, as well as the changes in the first (θ₁) and second (θ₂) terms of
Wall clock times corresponding to Figures to . t r = 10 r = 20 r = 30 r = 60iter time [s] iter time [s] iter time [s] iter time [s]1e − .
60 70 0 .
71 80 1 .
04 100 3 . − .
75 1 ,
100 2 .
26 900 3 .
28 700 9 . − ,
000 3 .
27 6 ,
000 8 .
63 7 ,
000 19 .
60 9 ,
000 96 . − ,
000 33 .
75 80 ,
000 107 .
57 110 ,
000 314 .
33 60 ,
000 590 . − ,
000 288 .
10 900 ,
000 1 , .
97 800 ,
000 2 , .
29 800 ,
000 7 , . (4.9) (excluding the 1/2 factor) with respect to the total time T . In this calculation,we limit the rank of the iTR to 10 and used a fixed timestep t = 0 . θ approaches to λ from above, but θ can fall below λ and eventually approaches λ from below. Although θ and θ both converge towards λ , the convergence ismuch slower than θ which is the average of θ and θ .In addition to approximating the desired eigenvalue by the Rayleigh quotient,we can obtain another type of approximation by computing the eigenvalue of theprojected Hamiltonian matrix H =0 defined by (4.12). Because the summation in H =0 diverges, the eigenvalue of H =0 is not finite. However, the eigenvalue of theaverage of H =0 is finite and serves as an approximation to the average eigenvalue of H defined in (1.2). To compute the average of H =0 , we can first subtract θI fromeach of the terms H − − ℓ and H ℓ in (4.13), for ℓ ≥
1, where θ is the Rayleigh quotient. We then compute the geometric sums H_L = H_{-2} + Σ_{ℓ≥1} (H_{-2-ℓ} − θI) and H_R = H_1 + Σ_{ℓ≥1} (H_{1+ℓ} − θI), respectively, using the same techniques as discussed in the proof of Theorem 4.4 for the average residual calculation. We can show that both geometric sums converge. Next, we sum up H_L, H_{-1}, H_0, and H_R after θI is subtracted from each of these terms. The subtraction of θI from H_L and H_R is made to account for the θI's not subtracted from H_{-2} and H_1, respectively, before the geometric sums were performed to yield H_L and H_R. After dividing the sum of all four terms by 4, we add θI back to form the total average of H=0, i.e.,

ˆH=0 = (H_L + H_{-1} + H_0 + H_R − 4θI)/4 + θI = (H_L + H_{-1} + H_0 + H_R)/4.

In Figure 7a, the curve marked by the diamonds shows that the eigenvalue ˆθ of the average ˆH=0 converges faster to λ. However, computing the average ˆH=0 can be expensive when the rank of x becomes large. Figure 7b shows how the norm of the average residual and the difference between the approximate and exact eigenvalue change with respect to the total time T when the rank of the iTR is allowed to increase to 120. Even with this much larger rank, it is difficult to reduce the error in the approximate eigenvalue much further.
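The convergence of such geometric sums can be illustrated in finite dimensions. As a sketch (with a generic contraction A standing in for the shifted operator, not the paper's actual transfer matrix): whenever the spectral radius of A is below one, the series Σ_{ℓ≥0} A^ℓ b equals (I − A)^{-1} b, so it can be evaluated by a single linear solve instead of term-by-term summation.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
A *= 0.5 / max(abs(np.linalg.eigvals(A)))   # rescale so spectral radius = 0.5 < 1
b = rng.standard_normal(5)

# Naive partial sums of the geometric series sum_{l>=0} A^l b ...
s, term = np.zeros(5), b.copy()
for _ in range(200):
    s += term
    term = A @ term

# ... agree with the closed form (I - A)^{-1} b
x = np.linalg.solve(np.eye(5) - A, b)
print(np.linalg.norm(s - x))   # tiny (series truncation error ~ 0.5**200)
```

The same idea, applied blockwise, is what makes the infinite sums H_L and H_R computable once θI has been subtracted from each term.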
7. Conclusions.
[Fig. 7. Spin S = 1/2 Heisenberg model. (a) Convergence of the iTR Rayleigh quotient θ and its two comprising terms θ1 and θ2, together with ˆθ, for r = 10. (b) Convergence of |λ − θ| and ‖R‖ as a function of total time for r = 120.]

In this paper, we described a flexible power method for solving a class of infinite dimensional tensor eigenvalue problems in which the infinite dimensional matrix to be partially diagonalized has the special sum of Kronecker product structure shown in (1.2), and the approximate eigenvector can be represented by an infinite translational invariant tensor ring (iTR). We presented some algebraic properties of the iTR in terms of its transfer matrix and showed how to obtain a canonical form which is convenient for manipulating an iTR in several calculations. We showed how the Rayleigh quotient of an iTR x with respect to the matrix (1.2) can be properly computed in an average sense. This is the quantity computed in each step of the flexible power iteration, in which the matrix exponential e^{-Ht} is approximately multiplied with x for some small adjustable parameter t. The flexibility of the power method results from the approximations made in each iteration to compute e^{-Ht}x. The approximation involves splitting e^{-Ht} into a product of simpler terms and reducing the rank of the iTR after e^{-Ht} is applied to x.
Although the flexible power method itself is not new, we gave more algorithmic and implementation details in this paper on how to carry out this procedure in a practical setting. In particular, we presented a better way to compute the Rayleigh quotient, and pointed out an alternative way to extract a possibly better approximation to the desired eigenvalue. We also developed a practical and effective way to monitor the convergence of the flexible power iteration by computing an average (finite) residual estimate.
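To make the overall scheme concrete, the following is a minimal finite dimensional sketch of the power iteration with e^{-Ht} (our own illustration, not the paper's code): it builds a small dense spin-1/2 Heisenberg chain, repeatedly applies the matrix exponential, and monitors the Rayleigh quotient. The iTR representation, Suzuki–Trotter splitting, rank truncation, and timestep adaptation that make the method viable in infinite dimensions are all omitted here.

```python
import numpy as np
from scipy.linalg import expm

def heisenberg_chain(n):
    """Dense Hamiltonian H = sum_k X_k X_{k+1} + Y_k Y_{k+1} + Z_k Z_{k+1}
    on an open chain of n sites, built from Pauli matrices."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    H = np.zeros((2**n, 2**n), dtype=complex)
    for k in range(n - 1):
        for s in (sx, sy, sz):
            op = np.eye(1)
            for j in range(n):
                op = np.kron(op, s if j in (k, k + 1) else np.eye(2))
            H += op
    return H

def power_method(H, t=0.1, maxit=1000, tol=1e-12):
    """Power iteration with exp(-t*H); the Rayleigh quotient converges to
    the leftmost (algebraically smallest) eigenvalue of symmetric H."""
    rng = np.random.default_rng(0)
    x = rng.standard_normal(H.shape[0])
    x /= np.linalg.norm(x)
    E = expm(-t * H)        # never formed explicitly in the infinite setting
    theta_old = np.inf
    for _ in range(maxit):
        x = E @ x
        x /= np.linalg.norm(x)
        theta = float(np.real(x.conj() @ H @ x))   # Rayleigh quotient
        if abs(theta - theta_old) < tol:
            break
        theta_old = theta
    return theta

H = heisenberg_chain(6)
theta = power_method(H)
lam = float(np.linalg.eigvalsh(H)[0])
print(theta, lam)           # the two values agree closely
```

In the paper, the dense expm is replaced by a Suzuki–Trotter product of local exponentials acting on the iTR cores, followed by a truncated SVD to keep the iTR rank bounded, and t is adapted based on the average residual.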
Such a residual estimate is used to determine when to decrease t to improve convergence and when to terminate the power iteration.
We should note that the flexible power iteration is not the only method for solving this type of infinite dimensional tensor eigenvalue problems. Alternative methods [...] e^{-Ht} with an iTR x. A comparison of the flexible power method with other algorithms will be made in a future study.

REFERENCES
[1]
M. T. Batchelor, The Bethe ansatz after 75 years, Phys. Today, 60 (2007), pp. 36–40, https://doi.org/10.1063/1.2709557.
[2] H. Bethe, Zur Theorie der Metalle, Z. Phys., 71 (1931), pp. 205–226, https://doi.org/10.1007/BF01341708.
[3] J. C. Bonner, H. W. J. Blöte, H. Beck, and G. Müller, Quantum spin chains, in Physics in One Dimension, J. Bernasconi and T. Schneider, eds., Berlin, Heidelberg, 1981, Springer Berlin Heidelberg, pp. 115–128.
[4] S. V. Dolgov, B. N. Khoromskij, I. V. Oseledets, and D. V. Savostyanov, Computation of extreme eigenvalues in higher dimensions using block tensor train format, Comput. Phys. Commun., 185 (2014), pp. 1207–1216, https://doi.org/10.1016/j.cpc.2013.12.017.
[5] J. Haegeman, J. I. Cirac, T. J. Osborne, I. Pižorn, H. Verschelde, and F. Verstraete, Time-dependent variational principle for quantum lattices, Phys. Rev. Lett., 107 (2011), p. 070601, https://doi.org/10.1103/PhysRevLett.107.070601.
[6] L. Hulthén, Über das Austauschproblem eines Kristalles, Ark. Mat. Astr. Fys., 26A (1938), pp. 1–106.
[7] D. Kressner, M. Steinlechner, and A. Uschmajew, Low-rank tensor methods with subspace correction for symmetric eigenvalue problems, SIAM J. Sci. Comput., 36 (2014), pp. A2346–A2368, https://doi.org/10.1137/130949919.
[8] O. Mickelin and S. Karaman, On algorithms for and computing with the tensor ring decomposition, 2018, https://arxiv.org/abs/1807.02513.
[9] R. Orús and G. Vidal, Infinite time-evolving block decimation algorithm beyond unitary evolution, Phys. Rev. B, 78 (2008), p. 155117, https://doi.org/10.1103/PhysRevB.78.155117.
[10] I. V. Oseledets, Tensor-train decomposition, SIAM J. Sci. Comput., 33 (2011), pp. 2295–2317, https://doi.org/10.1137/090752286.
[11] R. Penrose, Applications of negative dimensional tensors, in Combinatorial Mathematics and its Applications, D. J. A. Welsh, ed., vol. 1, New York, 1971, Academic Press, pp. 221–244.
[12] D. Perez-Garcia, F. Verstraete, M. M. Wolf, and J. I. Cirac, Matrix product state representations, Quantum Inf. Comput., 7 (2007), pp. 401–430.
[13] P. Pfeuty, The one-dimensional Ising model with a transverse field, Ann. Physics, 57 (1970), pp. 79–90, https://doi.org/10.1016/0003-4916(70)90270-8.
[14] S. Rommer and S. Östlund, Thermodynamic limit and matrix-product states, in Lecture Notes in Physics, I. Peschel, M. Kaulke, X. Wang, and K. Hallberg, eds., vol. 528, Berlin, Heidelberg, 1999, Springer, pp. 67–89.
[15] G. Vidal, Classical simulation of infinite-size quantum lattice systems in one spatial dimension, Phys. Rev. Lett., 98 (2007), p. 070201, https://doi.org/10.1103/PhysRevLett.98.070201.
[16] V. Zauner-Stauber, L. Vanderstraeten, M. T. Fishman, F. Verstraete, and J. Haegeman, Variational optimization algorithms for uniform matrix product states, Phys. Rev. B, 97 (2018), p. 045145, https://doi.org/10.1103/PhysRevB.97.045145.
[17]