A Flexible Power Method for Solving Infinite Dimensional Tensor Eigenvalue Problems
ROEL VAN BEEUMEN†, LANA PERIŠA‡, DANIEL KRESSNER§, AND CHAO YANG†

Abstract.
We propose a flexible power method for computing the leftmost, i.e., algebraically smallest, eigenvalue of an infinite dimensional tensor eigenvalue problem, Hx = λx, where the infinite dimensional symmetric matrix H exhibits a translational invariant structure. We assume the smallest eigenvalue of H is simple and apply a power iteration of e^{−Ht} with the eigenvector represented in a compact way as a translational invariant infinite Tensor Ring (iTR). Hence, the infinite dimensional eigenvector can be represented by a finite number of iTR cores of finite rank. In order to implement this power iteration, we use a small parameter t so that the infinite matrix-vector operation e^{−Ht}x can efficiently be approximated by the Lie product formula, also known as Suzuki–Trotter splitting, and we employ a low rank approximation through a truncated singular value decomposition on the iTR cores in order to keep the cost of subsequent power iterations bounded. We also use an efficient way for computing the iTR Rayleigh quotient and introduce a finite size iTR residual which is used to monitor the convergence of the Rayleigh quotient and to modify the timestep t. In this paper, we discuss two different implementations of the flexible power algorithm and illustrate the automatic timestep adaptation approach for several numerical examples.

Key words. infinite eigenvalue problem, tensor eigenvalue problem, tensor ring, tensor train, translational invariance, matrix product states
AMS subject classifications.
1. Introduction.
We consider the problem of computing the leftmost, i.e., algebraically smallest, eigenvalue of an infinite dimensional tensor eigenvalue problem

(1.1)    Hx = λx,

where the infinite dimensional symmetric matrix H is defined as the infinite sum of Kronecker products of an infinite number of finite matrices, i.e.,

(1.2)    H = \sum_{k=-\infty}^{+\infty} H_k,    H_k = · · · ⊗ I ⊗ I ⊗ M_{k,k+1} ⊗ I ⊗ I ⊗ · · ·,

with I the d × d identity matrix and M_{k,k+1} ∈ R^{d²×d²}. We assume that the matrices M_{k,k+1} are identical for all k; hence H is said to be translational invariant. The double subscript k, k+1 for the matrix M is merely used here to indicate the overlapping positions of M in each Kronecker product. This type of eigenvalue problem originates from the study of simple quantum many-body systems such as a quantum spin chain with nearest neighbor interactions [3]. For some specific choices of M, the analytical solutions of these eigenvalue problems are known [2, 6, 1]. However, in general, the smallest eigenvalue and its corresponding eigenvector need to be computed numerically.

∗ Submitted to the editors February 2, 2021.
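To make the structure of (1.2) concrete, the following sketch (our own illustration, not the method proposed in this paper) assembles a finite, open-boundary truncation of H with ℓ sites and computes its smallest eigenvalue. The helper name `finite_H` is ours, and the choice of M as a spin-1/2 Heisenberg nearest-neighbor coupling is an assumption; it is one of the analytically solved cases mentioned above [2].

```python
import numpy as np
from scipy.sparse import identity, kron
from scipy.sparse.linalg import eigsh

def finite_H(M, d, ell):
    """Assemble H_ell = sum_k I ⊗ ... ⊗ M_{k,k+1} ⊗ ... ⊗ I with (ell-1)
    nearest-neighbor terms, where M is d^2 x d^2 and acts on sites k, k+1."""
    H = None
    for k in range(ell - 1):
        left = identity(d**k, format="csr")
        right = identity(d**(ell - k - 2), format="csr")
        term = kron(kron(left, M), right, format="csr")
        H = term if H is None else H + term
    return H

# Hypothetical choice of M: spin-1/2 Heisenberg coupling S.S on two sites
Sx = 0.5 * np.array([[0, 1], [1, 0]])
Sy = 0.5 * np.array([[0, -1j], [1j, 0]])
Sz = 0.5 * np.array([[1, 0], [0, -1]])
M = np.real(np.kron(Sx, Sx) + np.kron(Sy, Sy) + np.kron(Sz, Sz))

for ell in (4, 6, 8, 10):
    H = finite_H(M, d=2, ell=ell)
    lam = eigsh(H, k=1, which="SA")[0][0]
    print(ell, lam / (ell - 1))   # energy per bond for increasing chain length
```

Since the matrix dimension d^ℓ grows exponentially in ℓ, this finite-section approach quickly becomes infeasible, which is exactly the motivation for working with the infinite representation directly.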
Funding:
This work was partially funded by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program under Contract No. DE-AC02-05CH11231.
† Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States ([email protected], [email protected]).
‡ Visage Technologies, Diskettgatan 11A, SE-583 35 Linköping, Sweden ([email protected]).
§ Department of Mathematics, École Polytechnique Fédérale de Lausanne, Station 8, 1015 Lausanne, Switzerland (daniel.kressner@epfl.ch).
The eigenvectors of H are infinite dimensional vectors which obviously cannot be computed or stored directly in memory. One way to study such a problem computationally is to start with a finite H_ℓ ∈ R^{d^ℓ × d^ℓ} that only contains (ℓ−1) terms in the summation and Kronecker products of (ℓ−1) matrices, and examine how the smallest eigenvalue of H_ℓ changes as ℓ increases. Note that the Kronecker structure of H_ℓ allows for efficiently representing this matrix in, e.g., Tensor Train (TT) format [10]. The corresponding eigenvector can then be represented by the TT

x_ℓ(i_1, i_2, . . . , i_ℓ) = X_1(i_1) X_2(i_2) · · · X_ℓ(i_ℓ),

where X_k(i_k) is an r_k × r_{k+1} matrix, with r_1 = r_{ℓ+1} = 1, and the indices i_k = 1, . . . , d, for k = 1, . . . , ℓ. Because there is a limit on how large ℓ can be chosen, we may never know the true solution in the limit ℓ → ∞. Such a limit is known in the physics literature as the thermodynamic limit and is important for describing macroscopic properties of quantum materials when H corresponds to a quantum many-body Hamiltonian [14].

In this paper, we directly compute the smallest eigenvalue, and corresponding eigenvector, of the infinite dimensional eigenvalue problem (1.1)–(1.2). In order to do that, we represent the eigenvector in a compact form and make use of the translational invariance property of H. This invariance property was studied by Bethe [2] and Hulthén [6], known as the Bethe–Hulthén hypothesis, and suggests that the elements of the eigenvector are invariant with respect to a cyclic permutation of the tensor indices, i.e.,

x(. . . , i_{−1}, i_0, i_1, . . .) = x(. . . , i_0, i_1, i_2, . . .),

where all indices i_k = 1, . . . , d, and the sequence of indices . . . , i_{−1}, i_0, i_1, . . . specifies a particular element of the infinite dimensional vector x.

We assume that the smallest eigenvalue of H is simple and propose to apply a flexible power iteration to e^{−Ht} for some small and variable parameter t to compute the desired eigenpair. We represent the eigenvector to be computed as a translational invariant infinite Tensor Ring (iTR), defined as follows

x(. . . , i_{−1}, i_0, i_1, . . .) = Tr[ \prod_{k=−∞}^{+∞} X(i_k) ],

where all indices i_k = 1, . . . , d, and each X(i_k) is a matrix of size r × r.
Note that due to the translational invariance, we only need to store, and work with, d matrices of size r × r, which is manageable as long as the rank r is not too large. An iTR can be seen as the infinite limit of a finite size tensor ring [17] and is also known as a uniform Matrix Product State (uMPS) [16].

In order to implement the power method, we must be able to multiply a matrix exponential with an iTR and keep the product in iTR form. Computing e^{−Ht}x, with H being an infinite dimensional matrix and x an iTR, is in general not possible. However, the special structure of (1.2) allows us to split H in even and odd terms, H_e and H_o, respectively. When t is sufficiently small, we can use the Lie product formula, also known as Suzuki–Trotter splitting, to simplify e^{−Ht}x into only local tensor contractions with a d² × d² matrix exponential e^{−Mt}. Such contractions can be implemented in a way that the translational invariance of the contracted tensor is preserved, although the rank generally increases. To keep the cost of subsequent power iterations bounded, a low rank approximation through the use of a truncated singular value decomposition is applied.

We use the Rayleigh quotient of the iTR x, obtained from the power method, to approximate the smallest eigenvalue of H. Because (1.2) contains an infinite number of terms in the summation, the classical definition of the Rayleigh quotient involves a diverging infinite sum. Therefore, we define the iTR Rayleigh quotient in an average sense, and introduce the canonical form of an iTR that can simplify such a definition. In the numerical experiments, we will show that the iTR Rayleigh quotient used within the flexible power method converges from above to the exact solution. We also give an alternative way to obtain an eigenvalue approximation that can be more accurate than the Rayleigh quotient.
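The first-order accuracy of the Lie product formula is easy to observe on a small finite analogue. In the sketch below, two random symmetric matrices are our own toy stand-ins for the even and odd parts H_e and H_o; halving t should reduce the splitting error by roughly a factor of four, consistent with the O(t²) error of the splitting.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
# Random symmetric stand-ins for the even and odd parts H_e and H_o
A = rng.standard_normal((8, 8)); A = (A + A.T) / 2
B = rng.standard_normal((8, 8)); B = (B + B.T) / 2

for t in (0.1, 0.05, 0.025):
    # Lie product formula: e^{-(A+B)t} = e^{-At} e^{-Bt} + O(t^2)
    err = np.linalg.norm(expm(-(A + B) * t) - expm(-A * t) @ expm(-B * t))
    print(t, err)
```

In the actual algorithm the exponentials never act on the full infinite matrix; only the local d² × d² exponential e^{−Mt} is ever formed.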
Such an approximation relies on viewing an iTR as the product of an infinite-sized frame matrix and a vectorized tensor of finite size.

The flexible power iteration was first introduced by Vidal in [15] as the iTEBD algorithm. We call such a scheme a flexible power iteration because it is similar to applying a power method to the matrix exponential e^{−Ht}, except that the matrix exponential is approximated, in each power iteration, through Trotter splitting, and rank truncation is applied to the iTR representation of e^{−Ht}x. This is where the term "flexible" comes from. In [15] and other papers, the accuracy of the eigenvalue approximation is assessed by either comparing the approximation with the exact solution or examining the difference between the approximate eigenvalues obtained from two consecutive iterations. The former is not practical when the exact solution of the problem is unknown. The latter can be misleading when the flexible iteration starts to stagnate, which does happen as shown in our numerical experiments.

Based on the canonical form of the iTR, we also introduce a finite size iTR residual that can be used to monitor the convergence of the Rayleigh quotient. We give a practical procedure for adjusting the parameter t in the flexible power iteration to ensure rapid but stable convergence and provide a stopping criterion for terminating the power iteration. As illustrated in the numerical experiments, such a procedure is very effective.

The paper is organized as follows. In section 2, we introduce diagrammatic notations for scalars, vectors, matrices, and tensors, as well as for tensor operations, that make it easier to explain operations performed on tensor rings.
In section 3, we begin with a formal definition of a single core translational invariant infinite Tensor Ring, describe its properties, and show how an iTR can be converted into a so-called canonical form, which is convenient for manipulating iTRs in various calculations. We also extend these definitions and properties to 2-core translational invariant iTRs, which are used to represent the approximate eigenvector of H (1.2) containing nearest neighbor interactions. In section 4, we properly define the iTR Rayleigh quotient and introduce the finite iTR residual. In section 5, we present the flexible power iteration for computing the smallest eigenpair of H. We also show how e^{−Ht}x is computed to preserve the iTR structure and discuss a number of practical computational considerations regarding the convergence and error assessment of the method. Three numerical examples are given in section 6 to demonstrate the effectiveness of the flexible power method and the adaptive strategy proposed in section 5. Finally, the main conclusions are summarized in section 7.
2. Notation.
Throughout the paper, we denote scalars by lower case Greek characters, vectors by lower case Roman characters, and matrices and higher order tensors by upper case Roman characters, e.g., α, x, and M, respectively. I_d is the d × d identity matrix and diagonal matrices are denoted by upper case Greek characters, e.g., Σ. The trace of a matrix M is given by Tr(M) and its vectorization by vec(M). For infinite dimensional vectors and matrices we use bold face Roman characters, e.g., v and M, respectively.

We will make use of the powerful tensor diagram notation [11] in order to graphically represent tensor contractions. In this notation, a tensor is drawn as a node with one leg per index: a scalar α has no legs, a vector v has one leg j, a matrix M has two legs i and j, and a 3rd-order tensor T has three legs i, j, k. This graphical notation has the advantage that tensor contractions can be represented by connecting the lines that correspond to the indices to sum over. For example, in the diagrams for the matrix-vector product Mv and the matrix-matrix product MN, the connections between M and v, and between M and N, correspond to the sum over the index j. Because contracting a tensor over one of its indices with the identity matrix has no effect, we represent the identity matrix I by just a line. Another important tensor contraction, that we will often use, is the contraction of two 3rd-order tensors X and Y into a so-called supercore Z,

Z_{ijlm} = \sum_k X_{ijk} Y_{klm},

where, in the diagrams, a thick leg indicates combining the indices j and l into a single index. Some other commonly used matrix operations, such as the trace Tr(M), the transpose v^⊤, and the vectorizations vec(M) and vec(M)^⊤, also have corresponding diagrams.
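The supercore contraction above can be sketched in a few lines of NumPy (a minimal illustration with our own index convention: each core is stored as a left-bond × physical × right-bond array):

```python
import numpy as np

# Contract two 3rd-order cores X (r x d x r) and Y (r x d x r) into a
# supercore Z_{ijlm} = sum_k X_{ijk} Y_{klm}, then merge the two physical
# legs j and l into the single "thick leg" index.
r, d = 3, 2
rng = np.random.default_rng(1)
X = rng.standard_normal((r, d, r))
Y = rng.standard_normal((r, d, r))

Z4 = np.einsum('ijk,klm->ijlm', X, Y)   # 4th-order supercore
Z = Z4.reshape(r, d * d, r)             # combined physical index of size d^2
print(Z.shape)
```

The reshape relies on NumPy's row-major ordering, so the combined index enumerates (j, l) pairs with l varying fastest.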
In order to save space, we sometimes rotate the diagrams counterclockwise by 90 degrees, e.g., for the vectorization vec(T) of a 3rd-order tensor. Throughout the paper, reshaping tensors into matrices and vice versa will play a key role, for example, reshaping the rank-1 matrices

(2.1)    vec(I) vec(Σ²)^⊤    and    vec(Σ²) vec(I)^⊤

into 4th-order tensors, where Σ is a diagonal matrix. We again obtain a matrix by combining respectively the left and right pointing legs into one index.
3. Translational invariant infinite tensor ring.
In this section, we define the translational invariant infinite tensor ring and discuss its properties that will be used in the next sections to construct an approximation to the eigenvector associated with the smallest eigenvalue of the infinite dimensional matrix H defined in (1.2). We start by defining translational invariance for a (finite) tensor ring.

The tensor ring decomposition [17, 8] is a way to represent high-order tensors, or vectors that can be reshaped into high-order tensors, by a sequence of 3rd-order tensors that are circularly multiplied. We define a finite size tensor ring as follows.
Definition 3.1. Let x_ℓ be an ℓth-order tensor of size d_1 × d_2 × · · · × d_ℓ; then its tensor ring representation (3.1) is defined element-wise as

x_ℓ(i_1, i_2, . . . , i_ℓ) := Tr[X_1(i_1) X_2(i_2) · · · X_ℓ(i_ℓ)],

where the indices i_k, k = 1, 2, . . . , ℓ, run from 1 to d_k, and each X_k(i_k) is a matrix of size r_k × r_{k+1}, with r_1 = r_{ℓ+1}.

We call the 3rd-order tensors X_k the cores of x_ℓ, X_k(i_k) the slices of the kth core, and r_k the corresponding tensor ring ranks. In the physics literature, a tensor ring is called a matrix product state with periodic boundary conditions [12]. When r_1 = r_{ℓ+1} = 1, a tensor ring becomes a tensor train [10], also known in the physics literature as a matrix product state with open boundary conditions [12].

When all dimensions d_k = d, all ranks r_k = r, and all slices X_k(i_k) = X(i_k), for k = 1, 2, . . . , ℓ, the tensor ring x_ℓ is said to be translational invariant. It follows from the trace operation property that

x_ℓ(i_1, i_2, . . . , i_ℓ) = x_ℓ(i_2, . . . , i_ℓ, i_1),

i.e., x_ℓ is invariant under cyclic permutations of its indices. This is a desirable property because the eigenvectors of H defined in (1.2) are known to have such a property for certain M_{k,k+1}. We define a translational invariant (finite) tensor ring as follows.

Definition 3.2. Let x_ℓ be an ℓth-order tensor of size d^ℓ; then its translational invariant tensor ring representation (3.2) is defined element-wise as

x_ℓ(i_1, i_2, . . . , i_ℓ) := Tr[X(i_1) X(i_2) · · · X(i_ℓ)],

where the indices i_k, k = 1, 2, . . . , ℓ, run from 1 to d, and each X(i_k) is a matrix of size r × r.

Note that, due to the translational invariance, the tensor ring (3.2) is fully determined by only one core X of size r × d × r. Generalizing Definition 3.1 to tensor rings with an infinite number of cores is impossible because this would require an infinite amount of data to represent x. On the other hand, using the translational invariance property allows us to generalize Definition 3.2 to infinite tensor rings (iTR).

Definition 3.3.
A translational invariant iTR is defined element-wise as

(3.3)    x(. . . , i_{−1}, i_0, i_1, . . .) := Tr[ \prod_{k=−∞}^{+∞} X(i_k) ],

where all indices i_k run from 1 to d and each X(i_k) is a matrix of size r × r, with r referred to as the rank of x.

Note that even though x contains an infinite number of cores, each core is completely defined by the d slices of X, so that x can be represented by a finite amount of data although it is infinite dimensional. A translational invariant iTR is also known as a uniform matrix product state [16]. In the remainder of the paper, we will only work with translational invariant iTRs and just call them iTRs for brevity.

The core tensor X of each iTR defines a special matrix called the transfer matrix, which plays a key role in many operations performed on the corresponding iTR, e.g., in the Rayleigh quotient and the residual calculations.

Definition 3.4. Let x be an iTR as defined in Definition 3.3. Then we define the transfer matrix T_X associated with x as the r² × r² matrix

(3.4)    T_X := \sum_{i=1}^{d} X(i) ⊗ X(i),

where X(i) is the ith slice of X.

The transfer matrix (3.4) and in particular its dominant left and right eigenvectors, i.e., the eigenvectors corresponding to the eigenvalue with largest magnitude, will play an indispensable role in operations involving iTRs.
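The transfer matrix and its dominant eigenvectors are straightforward to form numerically. The following NumPy sketch (our own illustration, with a random core standing in for an actual iTR core) builds T_X, extracts the dominant left and right eigenvectors, and checks the matricized eigenvector equations (3.5) as well as the rank-one limit of Lemma 3.6 below:

```python
import numpy as np

# Hypothetical random iTR core: d slices X[i], each an r x r matrix
d, r = 2, 4
rng = np.random.default_rng(0)
X = rng.standard_normal((d, r, r))

# Transfer matrix T_X = sum_i X(i) (x) X(i), of size r^2 x r^2
T = sum(np.kron(X[i], X[i]) for i in range(d))

# Dominant right eigenvector v_R and left eigenvector v_L of T_X
w, V = np.linalg.eig(T)
i0 = int(np.argmax(np.abs(w)))
eta, vR = w[i0].real, V[:, i0].real
wL, U = np.linalg.eig(T.T)
vL = U[:, int(np.argmax(np.abs(wL)))].real
vL = vL / (vL @ vR)                     # normalization v_L^T v_R = 1

# Matricized eigenvector equations, cf. (3.5)
VL, VR = vL.reshape(r, r), vR.reshape(r, r)
print(np.allclose(sum(X[i].T @ VL @ X[i] for i in range(d)), eta * VL))
print(np.allclose(sum(X[i] @ VR @ X[i].T for i in range(d)), eta * VR))

# Limit of Lemma 3.6: (T_X / eta)^k -> v_R v_L^T
P = np.linalg.matrix_power(T / eta, 1000)
print(np.linalg.norm(P - np.outer(vR, vL)))   # small
```

For a random real core the dominant eigenvalue is generically simple, real, and positive, in line with the conjecture stated next.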
Conjecture 3.5. Let x be a real iTR as defined in Definition 3.3 and assume that the dominant eigenvalue of the corresponding transfer matrix T_X is simple. Then this eigenvalue, denoted by η, is real and positive, and the corresponding left and right eigenvectors v_L and v_R satisfy the conditions

(3.5)    \sum_{i=1}^{d} X(i)^⊤ V_L X(i) = η V_L,    \sum_{i=1}^{d} X(i) V_R X(i)^⊤ = η V_R,

where v_L =: vec(V_L) and v_R =: vec(V_R), with V_L and V_R positive definite matrices.

When the dominant eigenvalue of the transfer matrix is simple, v_L^⊤ v_R ≠ 0 holds. We will assume that the eigenvectors v_L and v_R satisfy the normalization condition v_L^⊤ v_R = 1, which is equivalent to Tr(V_L^⊤ V_R) = 1. A graphical depiction of (3.5) and the normalization condition for v_L and v_R is shown in Figure 1.

Fig. 1. A graphical representation of (3.5) and the normalization condition v_L^⊤ v_R = 1.

Lemma 3.6.
If the transfer matrix T_X is an injective linear map and η is its dominant eigenvalue with v_L and v_R being the corresponding left and right eigenvectors, respectively, normalized to satisfy v_L^⊤ v_R = 1, then

(3.6)    \lim_{k→∞} (T_X / η)^k = v_R v_L^⊤.

Proof. Let the left and right eigenvalue decompositions of T_X be given by

W_L^∗ T_X = Λ W_L^∗,    T_X W_R = W_R Λ,    W_L^∗ W_R = I,

where the diagonal matrix Λ contains the eigenvalues on its diagonal, and W_L and W_R are the corresponding left and right eigenvectors of T_X, respectively, normalized so that W_L^∗ W_R = I, i.e.,

Λ := diag(η, •),    W_L := [v_L, •],    W_R := [v_R, •].

Then, using that η is simple and W_R^{−1} = W_L^∗, yields

\lim_{k→∞} (T_X / η)^k = \lim_{k→∞} η^{−k} W_R Λ^k W_R^{−1} = \lim_{k→∞} η^{−k} W_R Λ^k W_L^∗ = v_R v_L^⊤,

which completes the proof.

Corollary 3.7.
Let x be an iTR as defined in Definition 3.3. Then x is normalized, i.e., x^⊤ x = 1, if its corresponding transfer matrix T_X has a simple dominant eigenvalue η = 1.

Proof. The proof directly follows from Lemma 3.6, i.e.,

x^⊤ x = Tr[ \lim_{k→∞} (T_X)^k ] = Tr[v_R v_L^⊤] = v_L^⊤ v_R = 1,

where we also used the trace property Tr(AB) = Tr(BA).

Because we can insert the product of any nonsingular matrix S and its inverse between two consecutive cores of an iTR and redefine each slice as S^{−1} X(i) S, i = 1, . . . , d, the representation of an iTR is not unique. It is useful to define a canonical form that makes it easier to describe computations performed on iTRs.

Definition 3.8.
Let x be an iTR as defined in Definition 3.3. Then its canonical form is defined element-wise as

(3.7)    x(. . . , i_{−1}, i_0, i_1, . . .) := Tr[ \prod_{k=−∞}^{+∞} Q(i_k) Σ ],

where Q(i) ∈ R^{r×r}, for i = 1, . . . , d, and Σ ∈ R^{r×r} is a diagonal matrix with decreasing non-negative real numbers on its diagonal and ‖Σ‖_F = 1, such that the following left and right orthogonality conditions hold

(3.8)    \sum_{i=1}^{d} Q_L(i)^⊤ Q_L(i) = ηI,    \sum_{i=1}^{d} Q_R(i) Q_R(i)^⊤ = ηI,

with Q_L(i) := Σ Q(i), Q_R(i) := Q(i)Σ, and η ∈ R being the dominant eigenvalue of the transfer matrix T_X.
We call the 3rd-order tensor Q the orthogonal core of x and Σ the corresponding singular value matrix, since it can be obtained from a singular value decomposition. The 3rd-order tensors Q_L and Q_R, defined in Definition 3.8, are called the left and right orthogonal cores, respectively.

Before explaining how to compute the canonical form, we first list some properties. First, note that by definition, Q and Σ satisfy the relations

(3.9)    Σ Q(i) Σ = Q_L(i) Σ = Σ Q_R(i),

for i = 1, . . . , d. The left and right canonical transfer matrices are defined as follows.

Definition 3.9.
Let x be an iTR in canonical form as defined in Definition 3.8. Then we define its left and right canonical transfer matrices T_{Q_L} and T_{Q_R} as the following r² × r² matrices

(3.10)    T_{Q_L} := \sum_{i=1}^{d} Q_L(i) ⊗ Q_L(i),

(3.11)    T_{Q_R} := \sum_{i=1}^{d} Q_R(i) ⊗ Q_R(i),

where Q_L and Q_R are defined in Definition 3.8.

The canonical transfer matrices, associated with an iTR in canonical form, and their dominant left and right eigenvectors, which play a crucial role in the evaluation of the iTR Rayleigh quotient and the iTR residual, have the following fixed form.
Lemma 3.10.
Let x be an iTR in canonical form as defined in Definition 3.8 with its corresponding canonical transfer matrices defined as in Definition 3.9. Then,

(3.12)    vec(I)^⊤ T_{Q_L} = η vec(I)^⊤,    T_{Q_R} vec(I) = η vec(I),

(3.13)    T_{Q_L} vec(Σ²) = η vec(Σ²),    vec(Σ²)^⊤ T_{Q_R} = η vec(Σ²)^⊤,

with vec(I) being the dominant left eigenvector of T_{Q_L} and the dominant right eigenvector of T_{Q_R}, and vec(Σ²) being the dominant right eigenvector of T_{Q_L} and the dominant left eigenvector of T_{Q_R}.

Proof. The proof for the dominant right eigenvector of T_{Q_R} directly follows from vectorizing the right equality of (3.8),

vec( \sum_{i=1}^{d} Q_R(i) Q_R(i)^⊤ ) = ( \sum_{i=1}^{d} Q_R(i) ⊗ Q_R(i) ) vec(I) = T_{Q_R} vec(I) = η vec(I),

where we made use of the "vec trick" identity, i.e.,

(3.14)    vec(ABC) = (C^⊤ ⊗ A) vec(B),

and Definition 3.9, respectively. For the proof of the dominant left eigenvector of T_{Q_L}, we start from vectorizing the left equality of (3.8) and apply (3.14) to obtain

(3.15)    vec( \sum_{i=1}^{d} Q_L(i)^⊤ Q_L(i) ) = ( \sum_{i=1}^{d} Q_L(i)^⊤ ⊗ Q_L(i)^⊤ ) vec(I) = η vec(I).

To show that vec(Σ²) is the dominant right eigenvector of T_{Q_L}, we start again from the vectorization of the right equality of (3.8),

(3.16)    vec( \sum_{i=1}^{d} Q(i) Σ² Q(i)^⊤ ) = ( \sum_{i=1}^{d} Q(i) ⊗ Q(i) ) vec(Σ²) = η vec(I),

where we substituted Q_R(i) = Q(i)Σ and applied (3.14). Next, we multiply both sides of (3.16) from the left by Σ ⊗ Σ, yielding

(Σ ⊗ Σ) ( \sum_{i=1}^{d} Q(i) ⊗ Q(i) ) vec(Σ²) = T_{Q_L} vec(Σ²) = η (Σ ⊗ Σ) vec(I) = η vec(Σ²).

Finally, in order to prove that vec(Σ²) is also the dominant left eigenvector of T_{Q_R}, we start from the vectorization of the left equality of (3.8),

(3.17)    vec( \sum_{i=1}^{d} Q(i)^⊤ Σ² Q(i) ) = ( \sum_{i=1}^{d} Q(i)^⊤ ⊗ Q(i)^⊤ ) vec(Σ²) = η vec(I),

where we substituted Q_L(i) = Σ Q(i) and made use of the identity (3.14).
Next, we multiply both sides of (3.17) from the left by Σ ⊗ Σ, followed by a transposition, yielding

vec(Σ²)^⊤ ( \sum_{i=1}^{d} Q(i) ⊗ Q(i) ) (Σ ⊗ Σ) = vec(Σ²)^⊤ T_{Q_R} = η [(Σ ⊗ Σ) vec(I)]^⊤ = η vec(Σ²)^⊤,

which proves (3.13) and completes the proof.

The eigenvalue decompositions of Lemma 3.10 are graphically shown in Figure 2. Note also that the left and right orthogonality conditions (3.8), defined by the canonical form, can be expressed in terms of the canonical transfer matrices and are equivalent to (3.12).

Fig. 2. Eigenvalue decompositions of the canonical transfer matrices: (a) vec(I)^⊤ T_{Q_L} = η vec(I)^⊤; (b) T_{Q_R} vec(I) = η vec(I); (c) T_{Q_L} vec(Σ²) = η vec(Σ²); (d) vec(Σ²)^⊤ T_{Q_R} = η vec(Σ²)^⊤.
Using the orthogonality conditions (3.8), we notice that computing the canonical form of an iTR corresponds to finding matrices Q(i) and a diagonal Σ so that

(3.18)    \sum_{i=1}^{d} Q(i)^⊤ Σ² Q(i) = ηI = \sum_{i=1}^{d} Q(i) Σ² Q(i)^⊤.

Algorithm 1 shows how the canonical decomposition of an iTR can be computed. Using the properties L^{−1} = R Σ̃, R^{−1} = Σ̃ L, V_R = R Σ̃² R^⊤, and V_L = L^⊤ Σ̃² L, in combination with the identity (3.14), one can show that Q and Σ obtained in step 6 of Algorithm 1 satisfy (3.18). The complexity of Algorithm 1 is dominated by step 1, which requires the computation of the dominant left and right eigenvectors of the r² × r² transfer matrix T_X. Note that all remaining steps 2–7 only require operations on r × r matrices.

Algorithm 1:
Canonical decomposition of iTR
Input: X
Output: Q and diagonal matrix Σ
1. Dominant left and right eigenvectors of transfer matrix T_X: vec(V_L)^⊤ T_X = η vec(V_L)^⊤, T_X vec(V_R) = η vec(V_R).
2. Eigendecompositions of V_L and V_R: V_L = U_L Λ_L U_L^⊤, V_R = U_R Λ_R U_R^⊤.
3. Form: Ũ_L = U_L Λ_L^{1/2}, Ũ_R = U_R Λ_R^{1/2}.
4. Singular value decomposition: V Σ̃ W^⊤ = Ũ_L^⊤ Ũ_R.
5. Form: L = W^⊤ Ũ_R^{−1}, R = Ũ_L^{−⊤} V.
6. Canonical form: Q(i) = ‖Σ̃‖_F L X(i) R, Σ = Σ̃ / ‖Σ̃‖_F.
7. (Normalization): Q(i) = Q(i) / √η.

Starting from an iTR in canonical form, as defined in Definition 3.8, we can rewrite it as a product of an infinite dimensional matrix with orthogonal columns, called the frame matrix [4, 7], and a finite vector of length equal to the number of elements in one core.
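Algorithm 1 can be sketched directly in NumPy. The implementation below is our own reading of the algorithm (array names and index conventions are ours); it assumes, as in Conjecture 3.5, that the matricized dominant eigenvectors V_L and V_R are positive definite, which holds generically for a random core, and fixes the arbitrary eigenvector signs accordingly.

```python
import numpy as np

def canonical_form(X):
    """Sketch of Algorithm 1: canonical decomposition of an iTR core X of
    shape (d, r, r). Returns Q (d, r, r), the diagonal of Sigma, and eta."""
    d, r, _ = X.shape
    T = sum(np.kron(X[i], X[i]) for i in range(d))

    # Step 1: dominant left/right eigenvectors of the transfer matrix T_X
    w, V = np.linalg.eig(T)
    i0 = int(np.argmax(np.abs(w)))
    eta = w[i0].real
    VR = V[:, i0].real.reshape(r, r)
    wL, U = np.linalg.eig(T.T)
    VL = U[:, int(np.argmax(np.abs(wL)))].real.reshape(r, r)
    # symmetrize and fix signs so that V_L and V_R are positive definite
    VR = (VR + VR.T) / 2;  VR = VR * np.sign(np.trace(VR))
    VL = (VL + VL.T) / 2;  VL = VL * np.sign(np.trace(VL))

    # Steps 2-3: U_tilde = U Lambda^{1/2}
    lamL, UL = np.linalg.eigh(VL)
    lamR, UR = np.linalg.eigh(VR)
    UtL = UL * np.sqrt(np.maximum(lamL, 0.0))
    UtR = UR * np.sqrt(np.maximum(lamR, 0.0))

    # Step 4: SVD of U_tilde_L^T U_tilde_R
    Vs, St, Wt = np.linalg.svd(UtL.T @ UtR)   # Vs @ diag(St) @ Wt

    # Step 5: gauge matrices L and R
    L = Wt @ np.linalg.inv(UtR)               # L = W^T UtR^{-1}
    R = np.linalg.inv(UtL.T) @ Vs             # R = UtL^{-T} V

    # Steps 6-7: canonical core and normalized singular values
    nrm = np.linalg.norm(St)
    Sigma = St / nrm
    Q = np.array([nrm * L @ X[i] @ R for i in range(d)]) / np.sqrt(eta)
    return Q, Sigma, eta

# Check the orthogonality conditions (3.18), with eta scaled to 1 by step 7
rng = np.random.default_rng(3)
X = rng.standard_normal((2, 4, 4))
Q, Sigma, eta = canonical_form(X)
S2 = np.diag(Sigma**2)
left  = sum(Q[i].T @ S2 @ Q[i] for i in range(2))
right = sum(Q[i] @ S2 @ Q[i].T for i in range(2))
print(np.allclose(left, np.eye(4)), np.allclose(right, np.eye(4)))
```

The full r² × r² eigenvalue computation in step 1 is only for illustration; in practice one would use an iterative eigensolver that applies T_X through the map V ↦ Σ_i X(i) V X(i)^⊤ at O(dr³) cost.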
Definition 3.11.
Let x be a normalized iTR in canonical form as defined in Definition 3.8. Then we define its kth frame matrix F_{≠k} as the infinite tensor network (3.19) obtained by removing the kth core from x.

Note that by this definition the frame matrix F_{≠k} has an infinite number of rows, which are represented as down pointing legs, and dr² columns, which are represented as up pointing legs. Now the iTR x can also be written as

(3.20)    x = F_{≠k} vec(Q_c),    Q_c(i) := Σ Q(i) Σ,

where Q_c is called the canonical center core. Another important property of the frame matrix is that it has orthonormal columns, i.e., F_{≠k}^⊤ F_{≠k} = I_{dr²}. This follows by using, in a first step, Lemma 3.6, i.e., lim_{k→∞} (T_{Q_R})^k = vec(I) vec(Σ²)^⊤, in order to replace the infinite repetition of T_{Q_R} on both sides of the free indices, followed by vec(I)^⊤ T_{Q_L} = vec(I)^⊤ and vec(Σ²)^⊤ vec(I) = 1, respectively.

Due to the orthogonality of the frame matrix, this matrix will turn out to be an important tool to extract and update cores in an efficient way. Further, this matrix will also play a role in splitting supercores into normal cores.

The translational invariance principle can be generalized to allow the unit of invariance to include more than one core. Such a unit is sometimes referred to as a supercore. We now define a 2-core translational invariant iTR, which we will refer to as iTR2 to distinguish it from iTRs with a single core.
Definition 3.12.
A 2-core translational invariant iTR is defined element-wise as

(3.21)    x(. . . , i_1, i_2, i_3, i_4, . . .) := Tr[ \prod_{k=−∞}^{+∞} X(i_{2k}) Y(i_{2k+1}) ],

where all indices i_k run from 1 to d and each X(i) and Y(i) is a matrix of size r × r.

In contrast to an iTR with a single core, there are now two transfer matrices associated with every iTR2, playing a central role in the iTR2 operations.
Definition 3.13.
Let x be an iTR2 as defined in Definition 3.12. Then we define the transfer matrices T_{XY} and T_{YX} associated with x as the r² × r² matrices

(3.22)    T_{XY} := \sum_{i=1}^{d} \sum_{j=1}^{d} [X(i) ⊗ X(i)] [Y(j) ⊗ Y(j)],

(3.23)    T_{YX} := \sum_{i=1}^{d} \sum_{j=1}^{d} [Y(i) ⊗ Y(i)] [X(j) ⊗ X(j)],

where X(i) is the ith slice of X and Y(j) the jth slice of Y.

Note that the transfer matrices T_{XY} and T_{YX} of an iTR2 can also be expressed in terms of the 1-core transfer matrices, i.e., T_{XY} = T_X T_Y and T_{YX} = T_Y T_X, where T_X is the transfer matrix formed by only using the core X and T_Y the transfer matrix formed by only using the core Y.

Lemma 3.14.
Let x be an iTR2 as defined in Definition 3.12 and assume that T_X and T_Y are nonsingular. Then its corresponding transfer matrices T_{XY} and T_{YX}, defined in Definition 3.13, have the same eigenvalues.

Proof.
We will show that every eigenvalue of T_{XY} is also an eigenvalue of T_{YX}, and vice versa. First, assume that (λ, v) is an eigenpair of T_{XY},

T_{XY} v = T_X T_Y v = λv.

Multiplying both sides from the left by T_Y yields T_Y T_X (T_Y v) = λ (T_Y v); hence, the pair (λ, w := T_Y v) is an eigenpair of T_{YX}. Similarly, assume that (λ, w) is an eigenpair of T_{YX},

T_{YX} w = T_Y T_X w = λw.

Multiplying both sides from the left by T_X yields T_X T_Y (T_X w) = λ (T_X w); hence, the pair (λ, v := T_X w) is an eigenpair of T_{XY}.

We now define the canonical form of an iTR2. Since the canonical form is mainly used for normalized iTRs and iTR2s, we will assume that the dominant eigenvalue of its transfer matrices is η = 1.

Definition 3.15. Let x be an iTR2 as defined in Definition 3.12. Then its canonical form is defined element-wise as

(3.24)    x(. . . , i_1, i_2, i_3, i_4, . . .) := Tr[ \prod_{k=−∞}^{+∞} Q(i_{2k}) Σ U(i_{2k+1}) Ω ],

where Q(i), U(i) ∈ R^{r×r}, for i = 1, . . . , d, and Σ, Ω ∈ R^{r×r} are diagonal matrices with decreasing non-negative real numbers on their diagonals and ‖Σ‖_F = ‖Ω‖_F = 1, such that the following left and right orthogonality conditions hold

(3.25)    \sum_{i=1}^{d} Q_L(i)^⊤ Q_L(i) = I,    \sum_{i=1}^{d} Q_R(i) Q_R(i)^⊤ = I,

(3.26)    \sum_{i=1}^{d} U_L(i)^⊤ U_L(i) = I,    \sum_{i=1}^{d} U_R(i) U_R(i)^⊤ = I,

with Q_L(i) := Ω Q(i), Q_R(i) := Q(i)Σ, U_L(i) := Σ U(i), and U_R(i) := U(i)Ω.
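Lemma 3.14 is easy to confirm numerically; the sketch below (random cores of our own choosing) builds the two iTR2 transfer matrices and checks that every eigenvalue of T_XY appears in the spectrum of T_YX:

```python
import numpy as np

# Hypothetical random iTR2 cores X and Y, each with d slices of size r x r
d, r = 2, 3
rng = np.random.default_rng(7)
X = rng.standard_normal((d, r, r))
Y = rng.standard_normal((d, r, r))

TX = sum(np.kron(X[i], X[i]) for i in range(d))
TY = sum(np.kron(Y[j], Y[j]) for j in range(d))

ev_XY = np.linalg.eigvals(TX @ TY)   # spectrum of T_XY = T_X T_Y
ev_YX = np.linalg.eigvals(TY @ TX)   # spectrum of T_YX = T_Y T_X

# distance from each eigenvalue of T_XY to the nearest eigenvalue of T_YX
gap = max(np.min(np.abs(ev_YX - lam)) for lam in ev_XY)
print(gap)   # near machine precision
```

This is the familiar fact that AB and BA have the same eigenvalues; the nonsingularity assumption of the lemma guarantees that the constructed vectors T_Y v and T_X w are nonzero.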
Let x be an iTR2 as defined in Defi-nition . Then its canonical form is defined as x = Q Σ i U Ω i Q Σ i U Ω i · · · · · · and element-wise as (3.24) x ( . . . , i , i , i , i , . . . ) := Tr " + ∞ Y k = −∞ Q ( i k ) Σ U ( i k +1 ) Ω , where Q ( i ) , U ( i ) ∈ R r × r , for i = 1 , . . . , d , and Σ , Ω ∈ R r × r are diagonal matrices withdecreasing non-negative real numbers on its diagonal and k Σ k F = k Ω k F = 1 , suchthat the following left and right orthogonality conditions hold d X i =1 Q L ( i ) ⊤ Q L ( i ) = I, d X i =1 Q R ( i ) Q R ( i ) ⊤ = I, (3.25) d X i =1 U L ( i ) ⊤ U L ( i ) = I, d X i =1 U R ( i ) U R ( i ) ⊤ = I, (3.26) with Q L ( i ) := Ω Q ( i ) , Q R ( i ) := Q ( i )Σ , U L ( i ) := Σ U ( i ) , and U R ( i ) := U ( i )Ω . LEXIBLE POWER METHOD FOR INFINITE TEP
Definition 3.16.
Let x be an iTR2 in canonical form as defined in Definition 3.15. Then we define its left canonical transfer matrices as the r² × r² matrices T_{Q_L U_L} and T_{U_L Q_L}, and its right canonical transfer matrices as T_{Q_R U_R} and T_{U_R Q_R}, i.e.,

(3.27)    T_{Q_L U_L} = T_{Q_L} T_{U_L},    T_{Q_R U_R} = T_{Q_R} T_{U_R},

(3.28)    T_{U_L Q_L} = T_{U_L} T_{Q_L},    T_{U_R Q_R} = T_{U_R} T_{Q_R},

where Q_L, Q_R, U_L, and U_R are defined in Definition 3.15.

Lemma 3.17.
Let x be an iTR2 in canonical form as defined in Definition 3.15 and its corresponding canonical transfer matrices as in Definition 3.16. Then,

(3.29)    T_{Q_L U_L} vec(Ω²) = vec(Ω²),    vec(Ω²)^⊤ T_{Q_R U_R} = vec(Ω²)^⊤,

(3.30)    T_{U_L Q_L} vec(Σ²) = vec(Σ²),    vec(Σ²)^⊤ T_{U_R Q_R} = vec(Σ²)^⊤,

with vec(Ω²) being the dominant right eigenvector of T_{Q_L U_L} and the dominant left eigenvector of T_{Q_R U_R}, and vec(Σ²) being the dominant right eigenvector of T_{U_L Q_L} and the dominant left eigenvector of T_{U_R Q_R}.

Proof. The proof is similar to the one of Lemma 3.10.

The canonical decomposition of an iTR2 can be computed by making use of Algorithm 1 applied to the supercore, followed by a singular value decomposition of the matricization of the obtained canonical center supercore. This procedure is summarized in Algorithm 2. In case d ≪ r, the complexity of Algorithm 2 is dominated by step 2, which requires again computing the dominant left and right eigenvectors of the r² × r² transfer matrix in Algorithm 1.

Algorithm 2:
Canonical decomposition of iTR2
Input: X and Y
Output: Q and U, and diagonal matrices Σ and Ω
1. Form supercore Z as the contraction of the cores X and Y.
2. Use Algorithm 1 on Z, resulting in the canonical supercore Q and the diagonal matrix Ω.
3. Form the canonical center supercore Q_c and reshape it into a matrix M.
4. Singular value decomposition of M: V Σ W^⊤ = M; reshape V and W back into cores.
5. Canonical form: Q(i) = Ω^{−1} V(i), U(j) = W(j) Ω^{−1}.
4. Eigenvalue approximation.
Before we describe the algorithm for computing the leftmost eigenvalue and its corresponding eigenvector, we first need to define an eigenpair of an infinite dimensional matrix H. The use of the iTR ansatz for the approximate eigenvector also requires special attention to be paid to the definition of the Rayleigh quotient and corresponding residual.

First, we introduce the graphical notation of the infinite dimensional matrices H_k defined in (1.2): H_k acts as the identity on every site except sites k and k+1, where the d² × d² matrix M connects the input indices (i_k, i_{k+1}) to the output indices (j_k, j_{k+1}). Using this notation, the infinite dimensional matrix-vector product H_k x, where x is an iTR, corresponds to contracting along all i indices.

4.1. Rayleigh quotient. Let H be the infinite dimensional matrix given in (1.2) and x a normalized iTR that is an approximation of an eigenvector of H. Because (1.2) contains an infinite number of terms in the summation, the classical definition of the Rayleigh quotient involves an infinite sum, i.e.,

(4.1) x^⊤ H x = Σ_{k=−∞}^{+∞} x^⊤ H_k x,

where each term

(4.2) x^⊤ H_k x

is, for all k, the contraction of the iTR x and its transpose with the d² × d² matrix M, which only interacts with two nearest neighbor cores of x (and x^⊤). In this case, H_k is said to be a local nearest neighbor operator. Since all terms in (4.1) contribute equally to the summation, we define the Rayleigh quotient in an average sense as follows.

Theorem 4.1.
Let x be a normalized nonzero iTR as defined earlier and T_X its corresponding transfer matrix. Then the Rayleigh quotient for a given infinite dimensional matrix H of the form (1.2) can be represented by

(4.3) θ = v_L^⊤ M̃ v_R,

where v_L = vec(V_L) and v_R = vec(V_R), with V_L and V_R, respectively, the matricizations of the left and right dominant eigenvectors of T_X, and M̃ the two-site transfer matrix carrying M, defined in (4.4) below.

Proof.
We first rewrite (4.2) in matrix notation as

(4.4) x^⊤ H_k x = Tr[ ( ∏_{ℓ=−∞}^{k−1} T_X ) M̃ ( ∏_{ℓ=k+2}^{+∞} T_X ) ],

where M̃ denotes the contraction of two neighboring cores of x and x^⊤ with the matrix M. It follows from Lemma 3.6 that

x^⊤ H_k x = Tr[ v_R v_L^⊤ M̃ v_R v_L^⊤ ] = Tr[ v_L^⊤ M̃ v_R v_L^⊤ v_R ] = v_L^⊤ M̃ v_R = θ,

where we consecutively used the trace cyclic property Tr(XYZ) = Tr(YZX) and the normalization of the left and right eigenvectors, v_L^⊤ v_R = 1.

If x is a normalized iTR, the evaluation of the Rayleigh quotient only requires multiplying the matrix M̃ in (4.4) from the left and right, respectively, with the left and right dominant eigenvectors of its corresponding transfer matrix T_X. In case x is given in canonical form, the Rayleigh quotient simplifies further because there is no need to compute the left and right dominant eigenvectors of the transfer matrix, see Lemma 3.10.

Corollary.
Let x be a normalized iTR in canonical form as defined earlier and its corresponding canonical transfer matrices as defined above. Then the Rayleigh quotient associated with the infinite dimensional matrix H (1.2) can be represented by

(4.5) θ = vec(Σ²)^⊤ M̃_Q vec(I),

where Q and Σ are the canonical core and diagonal matrix, and M̃_Q is the contraction of two neighboring canonical cores of x and x^⊤ with the matrix M.

Proof. Since x is in canonical form, the terms in (4.1) are given, for all k, by

(4.6) x^⊤ H_k x = Tr[ ( ∏_{ℓ=−∞}^{k−1} T_{Q_R} ) M̃_Q ( ∏_{ℓ=k+2}^{+∞} T_{Q_R} ) ].

Next, using Lemma 3.6 together with the fact that vec(Σ²) and vec(I) are the left and right dominant eigenvectors of T_{Q_R}, respectively, we obtain

x^⊤ H_k x = Tr[ vec(I) vec(Σ²)^⊤ M̃_Q vec(I) vec(Σ²)^⊤ ] = vec(Σ²)^⊤ M̃_Q vec(I) = θ,

where we consecutively used the trace cyclic property Tr(XYZ) = Tr(YZX) and the normalization of the left and right eigenvectors of T_{Q_R}, i.e., vec(Σ²)^⊤ vec(I) = 1.

In case x is an iTR2, the classical definition of the Rayleigh quotient (4.1) consists of even terms (4.7), in which M acts on a core pair (X₁, X₂), and odd terms (4.8), in which M acts on a core pair (X₂, X₁), for all k. Hence, we define the iTR2 Rayleigh quotient as the average of one even and one odd term.

Theorem.
Let x be a normalized nonzero iTR2 as defined earlier and T_{XY} and T_{YX} its corresponding transfer matrices. Then the Rayleigh quotient for a given infinite dimensional matrix H (1.2) is defined as

(4.9) θ = ½ (v_L^e)^⊤ M̃_{XY} v_R^e + ½ (v_L^o)^⊤ M̃_{YX} v_R^o,

where v_L^e = vec(V_L^e) and v_R^e = vec(V_R^e), with V_L^e and V_R^e the matricizations of the left and right dominant eigenvectors of T_{XY}, and v_L^o, v_R^o are defined analogously from the left and right dominant eigenvectors of T_{YX}; here M̃_{XY} and M̃_{YX} denote the contractions of the corresponding core pairs of x and x^⊤ with M.

Proof.
The proof is similar to the one for Theorem 4.1.

Using the iTR2 canonical form of Definition 3.15, the Rayleigh quotient (4.9) yields

(4.10) θ = ½ vec(Ω²)^⊤ M̃_{QU} vec(I) + ½ vec(Σ²)^⊤ M̃_{UQ} vec(I),

where Q, U, Σ, and Ω are defined in Definition 3.15, and M̃_{QU} and M̃_{UQ} are the contractions of the canonical core pairs (Q, U) and (U, Q) of x and x^⊤ with M.

4.2. Residual. One way to measure the accuracy of an approximate eigenpair (θ, x) of H is to evaluate the residual norm. When H is infinite dimensional and x an iTR, some care is needed to properly define this residual. By making use of the property that x can be written as the matrix-vector product of the frame matrix and the canonical center core (3.20), we introduce a local iTR residual involving only one core of x.

Since H has an infinite number of terms in the summation (1.2), the standard definition of the residual for a Ritz pair (θ, x), i.e.,

(4.11) Hx − θx,

is also infinite. However, by projecting H onto the subspace spanned by the columns of the frame matrix, defined in Definition 3.11, we can define a residual associated with the projected H. Due to the translational invariance of x, we can choose an arbitrary k for the frame matrix. Therefore, without loss of generality, we set k = 0 for the remainder of this section.

We define the projection of H onto F_{≠0} as follows

(4.12) H_{≠0} := F_{≠0}^⊤ H F_{≠0} = Σ_{k=−∞}^{+∞} F_{≠0}^⊤ H_k F_{≠0} =: Σ_{k=−∞}^{+∞} H̄_k,

where H_k is defined in (1.2) and the matrices H̄_k ∈ R^{dr²×dr²} are obtained by projecting the corresponding H_k onto F_{≠0}. Depending on the relative position of the matrix M in each H_k, with respect to index i₀ in F_{≠0}, the matrices H̄_k have different expressions for each k, i.e.,

(4.13) H̄_{−2−ℓ} is built from l and (T_{Q_L})^ℓ, for ℓ = 0, 1, ..., the matrices H̄_{−1} and H̄_0 apply M directly to the center core and one of its neighbors, and H̄_{1+ℓ} is built from r and (T_{Q_R})^ℓ, for ℓ = 0, 1, ...,

where l and r, defined in (4.14), are the partial contractions of M with the cores to the left and right of the center, respectively. Note that, because the transfer matrices T_{Q_L} and T_{Q_R} have a dominant eigenvalue 1, the infinite sums

(4.15) Σ_{ℓ=0}^{+∞} (T_{Q_L})^ℓ and Σ_{ℓ=0}^{+∞} (T_{Q_R})^ℓ

diverge; hence, the infinite sum in (4.12) also diverges. On the other hand, we can show that for every H̄_k in (4.13),

vec(Q_c)^⊤ H̄_k vec(Q_c) = θ,

where θ is the Rayleigh quotient (4.5) and Q_c the canonical center core (3.20). Therefore, we define the iTR residual as follows

(4.16) Res := Σ_{k=−∞}^{+∞} ( H̄_k vec(Q_c) − θ vec(Q_c) ) =: Σ_{k=−∞}^{+∞} Res_k.

The following theorem shows how to construct this residual and proves that, in contrast to the infinite sum in (4.12), the iTR residual defined in (4.16) does converge.
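The contrast between the divergent sums (4.15) and the convergent deflated series used in the next theorem can be reproduced on a small dense stand-in for a transfer matrix (a hypothetical 9×9 matrix with a simple dominant eigenvalue 1; not an actual iTR transfer matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 9

# Stand-in for a normalized transfer matrix: T = P diag(1, mu_2, ..., mu_n) P^{-1}
# with a simple dominant eigenvalue 1 and |mu_i| < 1 otherwise.
P = rng.standard_normal((n, n)) + n * np.eye(n)   # well conditioned
mus = np.concatenate(([1.0], rng.uniform(-0.8, 0.8, n - 1)))
T = P @ np.diag(mus) @ np.linalg.inv(P)

# Dominant right/left eigenvectors; vL @ vR = 1 holds by construction.
vR = P[:, 0]
vL = np.linalg.inv(P)[0, :]

# Partial sums of T^l diverge: they grow linearly in the number of terms.
S1 = sum(np.linalg.matrix_power(T, l) for l in range(200))
assert np.linalg.norm(S1) > 50

# After deflating the dominant eigenpair, the spectral radius drops below 1
# and the geometric series converges to the resolvent [I - T_tilde]^{-1}.
Tt = T - np.outer(vR, vL)
S2 = sum(np.linalg.matrix_power(Tt, l) for l in range(200))
assert np.allclose(S2, np.linalg.inv(np.eye(n) - Tt), atol=1e-8)
```

This is precisely the mechanism behind (4.18) and (4.19): the resolvent of the deflated transfer matrix replaces a divergent geometric series.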
Theorem 4.4 (iTR residual). Let x be a normalized nonzero canonical iTR as defined earlier. Then the residual for a given infinite dimensional matrix H (1.2) is given by

(4.17) Res = (contraction of Q_c with L) + H̄_{−1} vec(Q_c) + H̄_0 vec(Q_c) + (contraction of Q_c with R) − 4 θ vec(Q_c),

where θ is the Rayleigh quotient (4.5), Q_c the canonical center core (3.20), and

(4.18) vec(L)^⊤ := vec(l)^⊤ [I − T̃_{Q_L}]^{-1},  T̃_{Q_L} := T_{Q_L} − vec(Σ²) vec(I)^⊤,

(4.19) vec(R) := [I − T̃_{Q_R}]^{-1} vec(r),  T̃_{Q_R} := T_{Q_R} − vec(I) vec(Σ²)^⊤,

with l and r defined by the diagrams given in (4.14).

Proof. Since each term in the infinite sum of (4.16) is defined as the difference Res_k = H̄_k vec(Q_c) − θ vec(Q_c), we immediately obtain from (4.13) that the 2nd and 3rd terms in (4.17) are equal to H̄_{−1} vec(Q_c) and H̄_0 vec(Q_c), respectively. Next, we prove that Σ_{k=−∞}^{−2} Res_k yields the 1st term in (4.17) minus θ vec(Q_c), that Σ_{k=1}^{+∞} Res_k yields the 4th term minus θ vec(Q_c), and that both infinite sums converge.

By using (4.5) and (4.14), we rewrite θ vec(Q_c) in a factored form in which the rank-one matrix vec(Σ²) vec(I)^⊤ appears between the partial contraction l and the center core Q_c, so that

(4.20) Res_{−2−ℓ} = H̄_{−2−ℓ} vec(Q_c) − θ vec(Q_c), built from l and (T_{Q_L})^ℓ, collapses to the corresponding expression with (T̃_{Q_L})^ℓ, for ℓ = 1, 2, ....

Because vec(I) and vec(Σ²) are the left and right eigenvectors associated with the dominant eigenvalue 1 of T_{Q_L}, as shown in Lemma 3.10, the largest eigenvalue of the deflated matrix T̃_{Q_L} is less than 1 in magnitude. Hence, in contrast to (4.15), the geometric series of T̃_{Q_L} converges, yielding

Σ_{ℓ=0}^{+∞} (T̃_{Q_L})^ℓ = [I − T̃_{Q_L}]^{-1}.

Note that this sum starts from ℓ = 0 whereas (4.20) starts from ℓ = 1. Consequently, the infinite sum Σ_{k=−∞}^{−2} Res_k yields the 1st term in (4.17) minus θ vec(Q_c), where L is defined by (4.18).

For the remaining part of the proof, we rewrite θ vec(Q_c) with the rank-one matrix vec(I) vec(Σ²)^⊤ appearing between the center core Q_c and the partial contraction r, yielding

(4.21) Res_{1+ℓ} = H̄_{1+ℓ} vec(Q_c) − θ vec(Q_c), built from r and (T_{Q_R})^ℓ, which collapses to the corresponding expression with (T̃_{Q_R})^ℓ, for ℓ = 1, 2, ....

Using similar arguments as before, we can show that the geometric series of T̃_{Q_R} converges and that the infinite sum Σ_{k=1}^{+∞} Res_k yields the 4th term in (4.17) minus θ vec(Q_c). Remark that the term θ vec(Q_c) in Res_k always gets incorporated in T̃_{Q_L} or T̃_{Q_R}, except for k = −2, −1, 0, 1, yielding the 5th term in (4.17). This completes the proof.

In order to define the residual for a Ritz pair (θ, x) with x being an iTR2, we first introduce the frame matrices F_{≠Q} and F_{≠U}, which allow us to rewrite x as follows

(4.22) x = F_{≠Q} vec(Q_c) = F_{≠U} vec(U_c),

where F_{≠Q}^⊤ F_{≠Q} = F_{≠U}^⊤ F_{≠U} = I and the canonical center cores Q_c and U_c are

(4.23) Q_c = Ω Q Σ,  U_c = Σ U Ω,

respectively. Next, we define the residual in an average sense as follows

Res = ½ Σ_{k=−∞}^{+∞} ( F_{≠Q}^⊤ H_k F_{≠Q} vec(Q_c) − θ vec(Q_c) ) + ½ Σ_{k=−∞}^{+∞} ( F_{≠U}^⊤ H_k F_{≠U} vec(U_c) − θ vec(U_c) ),

where θ is the Rayleigh quotient (4.10), and Q_c and U_c the canonical center cores (4.23). The following theorem summarizes how to compute this residual.

Theorem.
Let x be a normalized nonzero canonical iTR2 as defined in Definition 3.15. Then the residual for a given infinite dimensional matrix H (1.2) is given by

(4.24) Res = ½ Res_Q + ½ Res_U,

where Res_Q collects the contractions of Q_c with L_Q and R_Q, the two local M-terms, and the unabsorbed multiples of θ vec(Q_c), and Res_U is defined analogously with U_c, L_U, and R_U, with θ the Rayleigh quotient (4.10), Q_c and U_c the canonical center cores (4.23), and

(4.25) vec(L_Q)^⊤ := [ vec(l_Q)^⊤ + vec(l_U)^⊤ T_{U_L} ] [I − T̃_{Q_L U_L}]^{-1},
      vec(L_U)^⊤ := [ vec(l_U)^⊤ + vec(l_Q)^⊤ T_{Q_L} ] [I − T̃_{U_L Q_L}]^{-1},
      vec(R_Q) := [I − T̃_{U_R Q_R}]^{-1} [ vec(r_U) + T_{U_R} vec(r_Q) ],
      vec(R_U) := [I − T̃_{Q_R U_R}]^{-1} [ vec(r_Q) + T_{Q_R} vec(r_U) ],

with

T̃_{Q_L U_L} := T_{Q_L U_L} − vec(Ω²) vec(I)^⊤,  T̃_{Q_R U_R} := T_{Q_R U_R} − vec(I) vec(Ω²)^⊤,
T̃_{U_L Q_L} := T_{U_L Q_L} − vec(Σ²) vec(I)^⊤,  T̃_{U_R Q_R} := T_{U_R Q_R} − vec(I) vec(Σ²)^⊤,

and l_Q, r_Q, l_U, and r_U the partial contractions of M with the core pairs (Q, U) and (U, Q), defined analogously to l and r in (4.14).

Proof.
The proof is similar to the one for Theorem 4.4.
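As a sanity check of the averaged Rayleigh quotient of Theorem 4.1, the following sketch (random core, hypothetical small dimensions, illustrative index convention for M) builds the two-site transfer matrix M̃ for a given nearest neighbor operator M and verifies that M = cI yields θ = c, as it must, since then every H_k is c times the identity:

```python
import numpy as np

rng = np.random.default_rng(2)
r, d = 3, 2

# Random iTR core, rescaled so its transfer matrix T_X has dominant eigenvalue 1.
X = rng.standard_normal((d, r, r))
T = sum(np.kron(X[i], X[i]) for i in range(d))
w, V = np.linalg.eig(T)
k = np.argmax(np.abs(w))
X /= np.sqrt(np.abs(w[k]))
T /= np.abs(w[k])

# Dominant right/left eigenvectors of T_X, normalized so vL @ vR = 1.
vR = np.real(V[:, k])
wl, Vl = np.linalg.eig(T.T)
vL = np.real(Vl[:, np.argmax(np.abs(wl))])
vL /= vL @ vR

def rayleigh(M):
    # theta = vL^T Mtilde vR, with Mtilde the two-site transfer matrix
    # carrying the nearest neighbor operator M (d^2 x d^2).
    Mt = np.zeros((r * r, r * r))
    for i in range(d):
        for j in range(d):
            for a in range(d):
                for b in range(d):
                    Mt += M[i * d + j, a * d + b] * np.kron(X[i] @ X[j], X[a] @ X[b])
    return vL @ Mt @ vR

# For M = c*I the average energy per site must be exactly c.
assert np.isclose(rayleigh(np.eye(d * d)), 1.0)
assert np.isclose(rayleigh(2.5 * np.eye(d * d)), 2.5)
```

For M = cI the two-site transfer matrix reduces to c T², so the check exercises exactly the normalization v_L^⊤ v_R = 1 and T v_R = v_R used in the proof of Theorem 4.1.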
5. Flexible power method.
One way to compute the algebraically smallest eigenvalue λ₁ of H is to apply the power method to the matrix exponential e^{−H}. If λ₁ is simple, the power method converges linearly to the desired eigenpair (λ₁, x₁) at the rate e^{λ₁−λ₂}. This approach is generally not recommended because working with the matrix exponential e^{−H} can be costly. In fact, computing e^{−H} or applying e^{−H} to a vector may be harder than computing eigenvalues of H. However, when H has the structure exhibited in (1.2), e^{−Ht} may be approximated by a simpler form that makes it possible to perform a power iteration with e^{−Ht} for a small t.

Throughout this section, we will make use of iTR matrices, the matrix analogues of iTRs in which every core A carries two physical indices, so that each A(j, i) is a matrix of size r × r, with all indices i_k, j_k running from 1 to d and r referred to as the rank of A.
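The power iteration with e^{−Ht} can be reproduced on a finite dimensional analog, here a hypothetical 6-site periodic transverse field Ising chain, small enough to form e^{−Ht} densely (parameters are illustrative, not those of the infinite chain):

```python
import numpy as np
from scipy.linalg import expm

N, g, t = 6, 2.0, 0.1
sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])

def site(op, k):
    # op acting on site k of an N-site chain, identity elsewhere
    mats = [np.eye(2)] * N
    mats[k] = op
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

# Transverse field Ising Hamiltonian with periodic boundary conditions
H = sum(-site(sz, k) @ site(sz, (k + 1) % N) - g * site(sx, k) for k in range(N))

# Power iteration with e^{-Ht}: converges to the eigenvector of the
# algebraically smallest eigenvalue, since e^{-lambda_1 t} is dominant.
P = expm(-H * t)
rng = np.random.default_rng(3)
x = rng.standard_normal(2 ** N)
for _ in range(500):
    x = P @ x
    x /= np.linalg.norm(x)

theta = x @ H @ x
assert np.isclose(theta, np.linalg.eigvalsh(H).min(), atol=1e-8)
```

In the infinite dimensional setting neither H nor e^{−Ht} can be formed; the remainder of this section replaces the dense `expm` by the splitting (5.1) and the dense vector by an iTR.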
5.1. Approximating exp(−Ht) by matrix splitting. We now discuss how to approximate e^{−Ht} x, with x being an iTR, as required in a single step of the power iteration. Because the infinite matrix H consists of an infinite sum, it is tempting to rewrite e^{−H} = e^{−Σ_k H_k} as ··· e^{−H_{k−1}} e^{−H_k} e^{−H_{k+1}} ···. However, this is not valid because the H_k's do not commute in general. On the other hand, we can use the Lie product formula, also known as Suzuki–Trotter splitting,

e^{(A+B)t} = e^{At} e^{Bt} + O(t²),

where A and B are square matrices, yielding the following approximation

(5.1) e^{−Ht} ≈ ∏_{k=−∞}^{+∞} e^{−H_k t},

which may be accurate enough if t is sufficiently small. In the physics literature, t is viewed as an imaginary time variable; hence, e^{−Ht} is often referred to as imaginary time evolution.

First, assume the matrices H_k to be completely local, i.e.,

H_k = ··· ⊗ I_d ⊗ I_d ⊗ A_k ⊗ I_d ⊗ I_d ⊗ ···, A_k = A ∈ R^{d×d},

so that, by making use of the identities

e^{A⊗I} = e^A ⊗ I, and e^{I⊗A} = I ⊗ e^A,

the matrix exponentials e^{−H_k t} can be simplified as follows

e^{−H_k t} = ··· ⊗ I ⊗ I ⊗ e^{−At} ⊗ I ⊗ I ⊗ ···,

and the infinite product in (5.1) corresponds to the rank-1 iTR matrix

∏_{k=−∞}^{+∞} e^{−H_k t} = ··· ⊗ e^{−At} ⊗ e^{−At} ⊗ e^{−At} ⊗ ···.

Hence, if x is an iTR, the product

(5.2) y = ( ∏_{k=−∞}^{+∞} e^{−H_k t} ) x

is again an iTR of the same rank, and (5.2) can efficiently be computed by left multiplying the slices of x with the d × d matrix e^{−At},

Y(j) = Σ_{i=1}^{d} (e^{−At})_{ji} X(i),

where X and Y are the cores of x and y, respectively.

Next, let the H_k represent nearest neighbor interactions, i.e.,

(5.3) H_k = ··· ⊗ I_d ⊗ I_d ⊗ M_{k,k+1} ⊗ I_d ⊗ I_d ⊗ ···, M_{k,k+1} = M ∈ R^{d²×d²},

so that each H_k interacts with the kth and (k+1)st cores of x. Because H_k and H_{k+1} both interact with the (k+1)st core of x, e^{−Ht} x cannot be directly obtained as in (5.2). Therefore, we first reorder the summation and group the even and odd terms together,

H = ( Σ_{k=−∞}^{+∞} H_{2k} ) + ( Σ_{k=−∞}^{+∞} H_{2k+1} ) =: H_e + H_o.

Using the Suzuki–Trotter splitting twice yields

(5.4) e^{−Ht} = e^{−(H_e + H_o)t} ≈ ( ∏_{k=−∞}^{+∞} e^{−H_{2k} t} ) ( ∏_{k=−∞}^{+∞} e^{−H_{2k+1} t} ),

for sufficiently small t, where the even product (5.5) applies e^{−Mt} to the site pairs (2k, 2k+1) and the odd product
(5.6) applies e^{−Mt} to the site pairs (2k+1, 2k+2). Remark that both (5.5) and (5.6) can also be considered as completely local operators when we combine the corresponding neighboring indices: in (5.7) the even product acts locally in the combined indices i_{2k,2k+1} = (i_{2k}, i_{2k+1}), and in (5.8) the odd product acts locally in the combined indices i_{2k+1,2k+2} = (i_{2k+1}, i_{2k+2}).

Then, in order to approximate e^{−Ht} x, we first focus on the odd terms, i.e.,

(5.9) y_o = ( ∏_{k=−∞}^{+∞} e^{−H_{2k+1} t} ) x,

which can be computed in the same way as (5.2) by reformulating x as an iTR2, i.e., combining the (2k+1)st and (2k+2)nd cores into supercores X₂ of dimension r × d² × r. Hence, (5.9) only requires the left multiplication of the slices of the new supercore with the d² × d² matrix e^{−Mt},

Y_o(j) = Σ_{i=1}^{d²} (e^{−Mt})_{ji} X₂(i),

where the index i runs over the combined physical dimension d² of the supercore. Thereafter, we only need to multiply in the even terms; however, directly applying

(5.10) y = ( ∏_{k=−∞}^{+∞} e^{−H_{2k} t} ) y_o

is not possible because the supercores Y_o are formed by combining the indices (2k+1, 2k+2), while (5.7) requires the indices (2k, 2k+1) to be combined. Therefore, we first need to split y_o back into two normal cores, followed by a new and different recombination of the cores in order to carry out (5.10) efficiently. The dimension of Y_o is r × d² × r.
However, there is no guarantee that this supercore can be split into two cores of dimension r × d × r without losing information. A possible way to split it is by first reshaping Y_o into an rd × dr matrix C, next approximating C by a rank-r factorization

C ≈ C₁ C₂^⊤, where C₁, C₂ ∈ R^{dr×r},

and finally reshaping the factors C₁ and C₂ into 3rd-order tensors, i.e., cores Y₁ and Y₂ of dimension r × d × r. In other words, y_o is approximated by the iTR2 ỹ_o with cores Y₁ and Y₂.
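The split just described can be sketched with a truncated SVD (hypothetical dimensions; the best rank-r factorization in the Frobenius norm):

```python
import numpy as np

rng = np.random.default_rng(4)
r, d = 5, 2

# A supercore of dimension r x d^2 x r, to be split back into two cores of
# dimension r x d x r via a rank-r truncated SVD of its (rd) x (dr)
# matricization, grouping (left bond, first physical index) as rows and
# (second physical index, right bond) as columns.
Y = rng.standard_normal((r, d * d, r))
C = Y.reshape(r, d, d, r).reshape(r * d, d * r)

U, s, Vt = np.linalg.svd(C, full_matrices=False)
C1 = U[:, :r] * s[:r]          # absorb the kept singular values on the left
C2 = Vt[:r, :]
Y1 = C1.reshape(r, d, r)       # cores Y1, Y2 of dimension r x d x r
Y2 = C2.reshape(r, d, r)

# The truncation error equals the norm of the discarded singular values.
err = np.linalg.norm(C - C1 @ C2)
assert np.isclose(err, np.linalg.norm(s[r:]))
```

By the Eckart–Young theorem no rank-r factorization can do better, which is why the error in each power step is governed by the decay of the singular values of the matricized supercore.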
Consequently, we can now approximate (5.10) by replacing y_o with ỹ_o, where the even product again amounts to left multiplying the slices of the recombined supercore with e^{−Mt}. In the final step, we split the resulting supercore into two cores Y₁ and Y₂ by the same rank-r factorization as before, so that y is approximated by the iTR2 ỹ with cores Y₁ and Y₂.

Remark that, although we started with an iTR x, the result of applying e^{−Ht} to it is an iTR2. Therefore, in Figure 3 we graphically illustrate the application of (5.1) to an iTR2 with cores X₁ and X₂. Due to the translational invariance, we only need to apply e^{−Mt} twice in order to compute ỹ; the remaining, redundant computations can completely be ignored, so that the multiplication y = e^{−Ht} x can be computed in an efficient way. (Figure 3: the matrix-vector operation y = e^{−Ht} x.)

As illustrated in Figure 3, we only need to operate on one supercore at a time. Instead of working with supercores involving X₁, X₂ and Y₁, Y₂, respectively, we will operate on the following canonical center supercores X₁ and X₂,

X₁ = Ω Q Σ U Ω, and X₂ = Σ U Ω Q Σ,

respectively, so that their corresponding frame matrices are orthogonal.

We only describe the application of the odd part of e^{−Ht} onto X₁, since the even part can be performed in exactly the same way by interchanging Q, U and Σ, Ω, respectively. The computation consists of 5 steps as illustrated in Figure 4. In step 1, the canonical center supercore X₁ is formed. Next, we apply e^{−Mt} to it in step 2 and reshape the result Y into a matrix. In step 3, we approximate the matrix representation of Y by a truncated SVD, with the diagonal matrix S_r containing only the r largest singular values, followed by reshaping the left and right singular vectors W and V into 3rd-order tensors, i.e., cores. Then in step 4, in order to update the cores Q and U by Ω^{-1}W and V^⊤Ω^{-1}, respectively, and the diagonal matrix Σ by S_r, we transform the result of step 3 into the form of step 1 by left and right multiplying W and V^⊤, respectively, by Ω^{-1}. Note that, although the resulting form resembles the form of step 1, the former is not in canonical form yet. Therefore, we update in step 5 both cores Q⋆ and U⋆, and both diagonal matrices Σ⋆ and Ω⋆, to bring the updated iTR2 again in canonical form. (Figure 4: the odd part of the matrix-vector operation y = e^{−Ht} x, where x is given in canonical iTR2 form; the even part corresponds to interchanging the cores Q and U, and the diagonal matrices Σ and Ω, respectively. (1) Forming the canonical center supercore. (2) Applying the matrix exponential followed by reshaping the result into a matrix. (3) Performing a truncated SVD and reshaping the singular vectors into cores. (4) Updating the cores and the central diagonal matrix, while keeping Ω unchanged. (5) Computing the canonical iTR2 form.)
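Steps 1 to 4 of Figure 4 can be sketched as follows, with random, hypothetical iTR2 data and a random symmetric two-site operator M, merely to exercise the tensor manipulations (a real run would start from an actual canonical form and additionally normalize Σ⋆):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(5)
r, d, t = 4, 2, 0.05

# Hypothetical iTR2 data: cores Q, U (layout (d, r, r)) and well conditioned
# diagonal matrices Sigma, Omega; random, for illustration only.
Q = rng.standard_normal((d, r, r))
U = rng.standard_normal((d, r, r))
Sigma = np.diag(rng.uniform(0.5, 1.5, r))
Omega = np.diag(rng.uniform(0.5, 1.5, r))
M = rng.standard_normal((d * d, d * d)); M = (M + M.T) / 2
G = expm(-M * t)                                   # two-site gate e^{-Mt}

# step 1: canonical center supercore X1 = Omega Q Sigma U Omega
X1 = np.einsum('ab,ibc,ce,jeg,gf->aijf', Omega, Q, Sigma, U, Omega)

# step 2: apply the gate to the combined physical index p = (i, j)
Y = np.einsum('pq,aqf->apf', G, X1.reshape(r, d * d, r))

# step 3: truncated SVD of the (rd) x (dr) matricization, keeping r values
Ymat = Y.reshape(r, d, d, r).reshape(r * d, d * r)
W, s, Vt = np.linalg.svd(Ymat, full_matrices=False)
Wr, sr, Vr = W[:, :r], s[:r], Vt[:r, :]
assert np.isclose(np.linalg.norm(Ymat - (Wr * sr) @ Vr), np.linalg.norm(s[r:]))

# step 4: absorb Omega^{-1} on both sides to recover the form of step 1
Oinv = np.diag(1 / np.diag(Omega))
Qs = np.einsum('ab,bic->aic', Oinv, Wr.reshape(r, d, r)).transpose(1, 0, 2)
Us = np.einsum('cjb,ba->cja', Vr.reshape(r, d, r), Oinv).transpose(1, 0, 2)
Ss = np.diag(sr)

# consistency: Omega Q* Sigma* U* Omega reproduces the truncated supercore
X1s = np.einsum('ab,ibc,ce,jeg,gf->aijf', Omega, Qs, Ss, Us, Omega)
assert np.allclose(X1s.reshape(r * d, d * r), (Wr * sr) @ Vr)
```

Step 5, the recanonicalization, would additionally require the dominant eigenvectors of the r² × r² transfer matrices and is therefore the expensive part discussed below.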
5.2. Flexible power algorithm. We now have all components to state the method for computing the smallest eigenvalue of (1.2). Due to the approximation made by the Suzuki–Trotter splitting of e^{−Ht} and the rank truncation for splitting a supercore into two normal cores, the multiplication y = e^{−Ht} x is inexact. Even when t is fixed, the error introduced by the rank truncation is different in each iteration and can be viewed as applying a slightly different operator every time. Therefore, we call the algorithm for computing the smallest eigenvalue of H, based on a power method for e^{−H}, the flexible power method, in analogy with the flexible GMRES algorithm used for computing solutions of linear systems.

Algorithm 3 shows the outline of the flexible power method for computing the smallest eigenvalue of (1.2). This algorithm takes a canonical iTR2 as starting vector and returns the Rayleigh quotient and its corresponding eigenvector in canonical iTR2 form. In every iteration of Algorithm 3, we apply the matrix exponential e^{−Mt} twice, according to Figure 4, followed by the computation of the Rayleigh quotient and residual.

As we will show in the numerical experiments, the timestep t can have a significant effect on the convergence and accuracy of the Rayleigh quotient. Since Algorithm 3 applies approximate powers of the matrix exponential e^{−H}, choosing a smaller timestep
Algorithm 3: Flexible power method
Input: canonical iTR2 with cores Q, U ∈ R^{r×d×r} and diagonal matrices Σ, Ω ∈ R^{r×r}; nearest neighbor interaction matrix M and initial timestep t
Output: Rayleigh quotient θ and corresponding eigenvector in canonical iTR2 form
while not converged do
  1. Apply e^{−Mt} to supercore X₁ according to Figure 4.
  2. Apply e^{−Mt} to supercore X₂ according to Figure 4.
  3. Compute the Rayleigh quotient θ as in (4.10).
  4. Check convergence based on the residual (4.24).
  5. Update t.
end

will require more iterations for applying the same power. However, if the timestep is chosen too large, the convergence of Algorithm 3 stagnates due to the dominating Suzuki–Trotter splitting error. It is therefore recommended to adapt and decrease the timestep t in the course of the algorithm. By starting with a relatively large t, the power method will stagnate after a small number of iterations. We can use the iTR2 residual for detecting this stagnation. Next, we decrease t and continue the power method with the already obtained approximate eigenvector. By updating t and using more accurate approximate eigenvectors for smaller t, we are able to reduce the total number of iterations.

Most operations of Algorithm 3 can be efficiently implemented involving only 3rd-order tensors of size r × d × r, such as steps 1–4 in Figure 4 and the Rayleigh quotient. On the other hand, step 5 in Figure 4 and the computation of the residual involve transfer matrices of size r² × r². The former requires computing dominant eigenvectors of the left and right transfer matrices, and the latter needs linear system solves involving the same matrices. In both cases, iterative methods can be used; however, even for low values of r these computations can dominate the cost. As an alternative, step 5 can be skipped, continuing instead with the updated cores Q⋆ := Ω^{-1}W and U⋆ := V^⊤Ω^{-1}, and the updated diagonal matrices Σ⋆ := S_r and Ω⋆ := Ω. Note that in this case only one of the two diagonal matrices is updated; hence, the resulting iTR2 will, in general, not satisfy all the orthogonality conditions of Definition 3.15. However, due to the alternating singular value decompositions on Y₁ and Y₂, we observe that the resulting iTR2 converges to canonical form.
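The timestep control used with Algorithm 3 can be sketched as follows; the residual history below is synthetic and the helper is hypothetical, but the criterion mirrors the one used later in the experiments (divide t by 10 when the leading digits of ‖Res‖ freeze over a few checks, or when the residual increases):

```python
import numpy as np

def adapt_timestep(t, res_history, window=3, digits=3):
    """Return the new timestep given the recent residual norms."""
    if len(res_history) < window + 1:
        return t
    recent = res_history[-window - 1:]
    if recent[-1] > recent[-2]:                  # residual increased
        return t / 10
    # first `digits` significant digits unchanged over `window` checks?
    def lead(x):
        return round(x / 10 ** np.floor(np.log10(abs(x))), digits - 1)
    if len({lead(x) for x in recent}) == 1:      # stagnation detected
        return t / 10
    return t

# synthetic residual norms: decreasing, then stagnating around 3.21e-5
hist = [1e-2, 4e-3, 9e-4, 3.213e-5, 3.214e-5, 3.212e-5, 3.210e-5]
assert np.isclose(adapt_timestep(1e-3, hist), 1e-4)   # stagnating: decrease t
assert adapt_timestep(1e-3, hist[:3]) == 1e-3         # still improving: keep t
```

In practice the check is performed only every so many iterations, since each residual evaluation requires the linear system solves of (4.25).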
As a result, we can still use the relatively cheap canonical Rayleigh quotient (4.10), compared to the standard Rayleigh quotient (4.9), which would again require the computation of dominant eigenvectors of the transfer matrices.

Note that in the iTEBD algorithm described in [15, 9] the matrix-vector operation y = e^{−Ht} x is implemented in a slightly different way: instead of bringing the updated iTR2 to canonical form, as done in step 5 in Figure 4, the iTR2 is directly transformed.
6. Numerical experiments.
In this section we illustrate the flexible power method, outlined in Algorithm 3, for three examples: the one-dimensional Ising model with a transverse field, the spin S = 1 Heisenberg isotropic antiferromagnetic model, and the spin S = 1/2 Heisenberg model. The dominant eigenvectors of the transfer matrices are computed via eigs, and the linear systems for computing the residual in (4.25) are solved via gmres.

6.1. One-dimensional transverse field Ising model. We start with the one-dimensional transverse field Ising model, which has the corresponding infinite dimensional Hamiltonian

(6.1) H = − Σ_{k=−∞}^{+∞} ··· ⊗ I ⊗ σ_z ⊗ σ_z ⊗ I ⊗ ··· − g Σ_{k=−∞}^{+∞} ··· ⊗ I ⊗ σ_x ⊗ I ⊗ ···,

where σ_x and σ_z are the 1st and 3rd Pauli matrices, respectively, and the corresponding 4 × 4 nearest neighbor interaction matrix in the notation of (1.2) is

M = −σ_z ⊗ σ_z − (g/2)(σ_x ⊗ I + I ⊗ σ_x),

where the transverse field term is distributed evenly over the two sites of each bond. We choose the parameter g = 2. The exact smallest eigenvalue of (6.1) is [13]

λ₁(g) = −(1/2π) ∫_{−π}^{π} √(1 + g² − 2g cos x) dx, λ₁(2) ≈ −2.127 088 819 946 730.

In the first experiment, we compare the two variants of Algorithm 3, i.e., the canonical variant, where we always convert the result of applying e^{−Mt} back into canonical iTR2 form, and the non-canonical variant, where we skip step 5 in Figure 4. In order to improve the numerical stability of the latter, we normalize the diagonal matrices to have unit Frobenius norm. Figure 5a shows the Rayleigh quotient θ and its difference with the exact solution as a function of the iteration count. Note that the results for both variants of Algorithm 3 are indistinguishable; hence, we will only use the faster non-canonical variant in the remainder of this section.

Next, we analyze the effect of the timestep t on the convergence of Algorithm 3. Figure 5b depicts the convergence of two different runs where we decreased the timestep t after every 100 or 1000 iterations, respectively; different timesteps are indicated by different colors. From this figure it is clear that the timestep control can have a significant effect on the overall attained accuracy. When t is decreased too soon, as in the first run, the flexible power method tends to stagnate, and further decreasing t does not help. On the other hand, we do not want to run too many iterations, and therefore we will use the iTR residual, introduced in section 4.2, to detect stagnation.

As discussed in section 5.2, the total number of iterations can be reduced by gradually decreasing the timestep t. Table 1 compares running Algorithm 3 with a fixed timestep to the automatic timestep adaption approach, in which t was divided by 10 when the first 3 digits of the residual norm ‖Res‖ were not changing anymore in 3 consecutive convergence checks, or in case the residual increased. The left plot in Figure 5c shows the corresponding convergence as a function of the iteration count. Note that by using the automatic timestep adaption approach, the accuracy of the Ritz value is multiple orders of magnitude higher than achieved in both runs shown in Figure 5b.

(Figure 5: One-dimensional transverse field Ising model with g = 2 and iTR rank r = 10. (a) Comparison of the canonical and non-canonical variants of Algorithm 3 for a fixed timestep. (b) Influence of the timestep t on the convergence of Algorithm 3. (c) Convergence as a function of iteration count and total time.)
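The reference value λ₁(2) can be reproduced by numerical quadrature of the integral formula above:

```python
import numpy as np
from scipy.integrate import quad

def lam1(g):
    # lambda_1(g) = -(1/2pi) * integral over [-pi, pi] of sqrt(1 + g^2 - 2 g cos x)
    val, _ = quad(lambda x: np.sqrt(1 + g * g - 2 * g * np.cos(x)), -np.pi, np.pi)
    return -val / (2 * np.pi)

assert np.isclose(lam1(2.0), -2.127088819946730)
```

The integrand is smooth for g ≠ 1 (at the critical point g = 1 it has a kink at x = 0), so the quadrature converges quickly for the parameter used here.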
(Table 1: Wall clock times for the one-dimensional transverse field Ising model with g = 2 and iTR rank r = 10, for a sequence of fixed timesteps and for the automatic timestep adaption.)

The right plot in Figure 5c shows the same results as the left one, but as a function of the total time T = N t, with N the number of iterations of Algorithm 3 for each t. Remark that the stagnation happens for every timestep at more or less the same total time T, corresponding to, roughly speaking, the same power of e^{−H}. We will use this observation in the next numerical experiments to change the frequency of the convergence check based on the current timestep.
6.2. Spin S = 1 Heisenberg isotropic antiferromagnetic model. As a second example, we consider the isotropic antiferromagnetic case of the spin S = 1 Heisenberg model. The resulting Hamiltonian has the form of (1.2), with the following nearest neighbor interaction matrix

M_{k,k+1} = X_k ⊗ X_{k+1} + Y_k ⊗ Y_{k+1} + Δ Z_k ⊗ Z_{k+1},

where Δ = 1, and X_k, Y_k, and Z_k are the spin-1 generators of SU(2),

X_k = (1/√2) [0 1 0; 1 0 1; 0 1 0],  Y_k = (1/√2) [0 −ı 0; ı 0 −ı; 0 ı 0],  Z_k = [1 0 0; 0 0 0; 0 0 −1].

The smallest eigenvalue is λ₁ ≈ −1.401 484 038 971 2(2) [5].

Figures 6a to 6d show the convergence of Algorithm 3 for different iTR ranks r = 10, 20, 30, and 60, respectively. For every case, we started with a relatively large timestep and performed the convergence check at a frequency proportional to 1/t, in order to avoid computing too many (expensive) residual calculations. The corresponding wall clock times for the different runs are given in Table 2. This table illustrates again that the smaller the timestep t, the more iterations are required and, hence, the larger the wall clock time.

When the rank of the iTR is too small, we see from Figure 6 that the Rayleigh quotient does not improve further even when the timestep is reduced, because the SVD truncation error is too large. On the other hand, we observe from Figures 6b to 6d that the residual norms in all these runs behave in a similar fashion. Since the iTR2 residual (4.24) is a projected residual, we cannot use it as the only metric for deciding when to terminate the flexible power iteration. As illustrated in Figure 6e, the singular values given by the diagonal matrices of Definition 3.15, and in particular the smallest singular value, together with the residual, are good indicators for the actual accuracy of the approximate eigenpair of H, because the overall accuracy is bounded by both the Suzuki–Trotter splitting and the SVD truncation error.
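The spin-1 generators above can be checked to satisfy the su(2) commutation relations and the spin-1 Casimir identity, and the resulting nearest neighbor interaction matrix is symmetric, as required for H in (1.2):

```python
import numpy as np

# Standard spin-1 generators of SU(2)
s = 1 / np.sqrt(2)
X = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Y = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]])
Z = np.diag([1.0, 0.0, -1.0]).astype(complex)

# su(2) commutation relations and Casimir S^2 = S(S+1) I with S = 1
assert np.allclose(X @ Y - Y @ X, 1j * Z)
assert np.allclose(Y @ Z - Z @ Y, 1j * X)
assert np.allclose(X @ X + Y @ Y + Z @ Z, 2 * np.eye(3))

# Nearest neighbor interaction matrix for Delta = 1; real and symmetric,
# since the imaginary parts of Y cancel in the Kronecker product Y (x) Y.
M = np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z)
assert np.allclose(M, M.conj().T) and np.allclose(M.imag, 0)
```

Note that although Y is complex, the 9 × 9 matrix M is real, so the iTR machinery above can be applied with real arithmetic.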
(Figure 6: Spin S = 1 Heisenberg isotropic antiferromagnetic model: convergence as a function of the total time T for ranks (a) r = 10, (b) r = 20, (c) r = 30, (d) r = 60, and (e) the singular values.)

6.3. Spin S = 1/2 Heisenberg model. As a last example, we consider the spin S = 1/2 Heisenberg model with the following nearest neighbor interaction matrix

M_{k,k+1} = X_k ⊗ X_{k+1} + Y_k ⊗ Y_{k+1} + Z_k ⊗ Z_{k+1},

where X_k = σ_x, Y_k = σ_y, and Z_k = σ_z, with σ_x, σ_y, σ_z the Pauli matrices. The exact smallest eigenvalue is

λ₁ = −ln(2) + 1/4 ≈ −0.443 147 180 559 945.

In Figure 7a, we plot the convergence history of the Rayleigh quotient θ to the exact eigenvalue λ₁, as well as the changes in the first (θ₁) and second (θ₂) terms of
Wall clock times corresponding to Figures to . t r = 10 r = 20 r = 30 r = 60iter time [s] iter time [s] iter time [s] iter time [s]1e − .
60 70 0 .
71 80 1 .
04 100 3 . − .
75 1 ,
100 2 .
26 900 3 .
28 700 9 . − ,
000 3 .
27 6 ,
000 8 .
63 7 ,
000 19 .
60 9 ,
000 96 . − ,
000 33 .
75 80 ,
000 107 .
57 110 ,
000 314 .
33 60 ,
000 590 . − ,
000 288 .
10 900 ,
000 1 , .
97 800 ,
000 2 , .
29 800 ,
000 7 , . (4.9) (excluding the 1/2 factor) with respect to the total time T . In this calculation,we limit the rank of the iTR to 10 and used a fixed timestep t = 0 . θ approaches to λ from above, but θ can fall below λ and eventually approaches λ from below. Although θ and θ both converge towards λ , the convergence ismuch slower than θ which is the average of θ and θ .In addition to approximating the desired eigenvalue by the Rayleigh quotient,we can obtain another type of approximation by computing the eigenvalue of theprojected Hamiltonian matrix H =0 defined by (4.12). Because the summation in H =0 diverges, the eigenvalue of H =0 is not finite. However, the eigenvalue of theaverage of H =0 is finite and serves as an approximation to the average eigenvalue of H defined in (1.2). To compute the average of H =0 , we can first subtract θI fromeach of the terms H − − ℓ and H ℓ in (4.13), for ℓ ≥
1, where θ is the Rayleigh quotient. We then compute the geometric sums H_L = H_{-2} + Σ_{ℓ≥1} (H_{-2-ℓ} − θI) and H_R = H_1 + Σ_{ℓ≥1} (H_{1+ℓ} − θI), respectively, using the same techniques as discussed in the proof of Theorem 4.4 for the average residual calculation. We can show that both geometric sums converge. Next, we sum up H_L, H_{-1}, H_0, and H_R after θI is subtracted from each of these terms. The subtraction of θI from H_L and H_R is made to account for the θI's not subtracted from H_{-2} and H_1, respectively, before the geometric sums were performed to yield H_L and H_R. After dividing the sum of all four terms by 4, we add θI back to form the total average of H=0, i.e.,

ˆH=0 = (H_L + H_{-1} + H_0 + H_R − 4θI)/4 + θI = (H_L + H_{-1} + H_0 + H_R)/4.

In Figure 7a, the curve marked by the diamonds shows that the eigenvalue ˆθ of the average ˆH=0 converges faster to λ. However, computing the average ˆH=0 can be expensive when the rank of x becomes large. Figure 7b shows how the norm of the average residual and the difference between the approximate and exact eigenvalue change with respect to the total time T when the rank of the iTR is allowed to increase to 120. Even with this much larger rank, it is difficult to reduce the error in the approximate eigenvalue much further.
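The convergence of such geometric sums can be illustrated in finite dimensions. As a sketch (with a generic contraction A standing in for the shifted operator, not the paper's actual transfer matrix): whenever the spectral radius of A is below one, the series Σ_{ℓ≥0} A^ℓ b equals (I − A)^{-1} b, so it can be evaluated by a single linear solve instead of term-by-term summation.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
A *= 0.5 / max(abs(np.linalg.eigvals(A)))   # rescale so spectral radius = 0.5 < 1
b = rng.standard_normal(5)

# Naive partial sums of the geometric series sum_{l>=0} A^l b ...
s, term = np.zeros(5), b.copy()
for _ in range(200):
    s += term
    term = A @ term

# ... agree with the closed form (I - A)^{-1} b
x = np.linalg.solve(np.eye(5) - A, b)
print(np.linalg.norm(s - x))   # tiny (series truncation error ~ 0.5**200)
```

The same idea, applied blockwise, is what makes the infinite sums H_L and H_R computable once θI has been subtracted from each term.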
7. Conclusions.
[Fig. 7. Spin S = 1/2 Heisenberg model. (a) Convergence of the iTR Rayleigh quotient θ and its two comprising terms θ1 and θ2, together with ˆθ, for r = 10. (b) Convergence of |λ − θ| and ‖R‖ as a function of total time for r = 120.]

In this paper, we described a flexible power method for solving a class of infinite dimensional tensor eigenvalue problems in which the infinite dimensional matrix to be partially diagonalized has the special sum of Kronecker product structure shown in (1.2), and the approximate eigenvector can be represented by an infinite translational invariant tensor ring (iTR). We presented some algebraic properties of the iTR in terms of its transfer matrix and showed how to obtain a canonical form which is convenient for manipulating an iTR in several calculations. We showed how the Rayleigh quotient of an iTR x with respect to the matrix (1.2) can be properly computed in an average sense. This is the quantity computed in each step of the flexible power iteration, in which the matrix exponential e^{-Ht} is approximately multiplied with x for some small adjustable parameter t. The flexibility of the power method results from the approximations made in each iteration to compute e^{-Ht}x. The approximation involves splitting e^{-Ht} into a product of simpler terms and reducing the rank of the iTR after e^{-Ht} is applied to x.
Although the flexible power method itself is not new, we gave more algorithmic and implementation details in this paper on how to carry out this procedure in a practical setting. In particular, we presented a better way to compute the Rayleigh quotient, and pointed out an alternative way to extract a possibly better approximation to the desired eigenvalue. We also developed a practical and effective way to monitor the convergence of the flexible power iteration by computing an average (finite) residual estimate.
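To make the overall scheme concrete, the following is a minimal finite dimensional sketch of the power iteration with e^{-Ht} (our own illustration, not the paper's code): it builds a small dense spin-1/2 Heisenberg chain, repeatedly applies the matrix exponential, and monitors the Rayleigh quotient. The iTR representation, Suzuki–Trotter splitting, rank truncation, and timestep adaptation that make the method viable in infinite dimensions are all omitted here.

```python
import numpy as np
from scipy.linalg import expm

def heisenberg_chain(n):
    """Dense Hamiltonian H = sum_k X_k X_{k+1} + Y_k Y_{k+1} + Z_k Z_{k+1}
    on an open chain of n sites, built from Pauli matrices."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    H = np.zeros((2**n, 2**n), dtype=complex)
    for k in range(n - 1):
        for s in (sx, sy, sz):
            op = np.eye(1)
            for j in range(n):
                op = np.kron(op, s if j in (k, k + 1) else np.eye(2))
            H += op
    return H

def power_method(H, t=0.1, maxit=1000, tol=1e-12):
    """Power iteration with exp(-t*H); the Rayleigh quotient converges to
    the leftmost (algebraically smallest) eigenvalue of symmetric H."""
    rng = np.random.default_rng(0)
    x = rng.standard_normal(H.shape[0])
    x /= np.linalg.norm(x)
    E = expm(-t * H)        # never formed explicitly in the infinite setting
    theta_old = np.inf
    for _ in range(maxit):
        x = E @ x
        x /= np.linalg.norm(x)
        theta = float(np.real(x.conj() @ H @ x))   # Rayleigh quotient
        if abs(theta - theta_old) < tol:
            break
        theta_old = theta
    return theta

H = heisenberg_chain(6)
theta = power_method(H)
lam = float(np.linalg.eigvalsh(H)[0])
print(theta, lam)           # the two values agree closely
```

In the paper, the dense expm is replaced by a Suzuki–Trotter product of local exponentials acting on the iTR cores, followed by a truncated SVD to keep the iTR rank bounded, and t is adapted based on the average residual.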
Such a residual estimate is used to determine when to decrease t to improve convergence and when to terminate the power iteration.
We should note that the flexible power iteration is not the only method for solving this type of infinite dimensional tensor eigenvalue problems. Alternative methods [...] e^{-Ht} with an iTR x. A comparison of the flexible power method with other algorithms will be made in a future study.

REFERENCES
[1]
M. T. Batchelor, The Bethe ansatz after 75 years, Phys. Today, 60 (2007), pp. 36–40, https://doi.org/10.1063/1.2709557.
[2] H. Bethe, Zur Theorie der Metalle, Z. Phys., 71 (1931), pp. 205–226, https://doi.org/10.1007/BF01341708.
[3] J. C. Bonner, H. W. J. Blöte, H. Beck, and G. Müller, Quantum spin chains, in Physics in One Dimension, J. Bernasconi and T. Schneider, eds., Berlin, Heidelberg, 1981, Springer Berlin Heidelberg, pp. 115–128.
[4] S. V. Dolgov, B. N. Khoromskij, I. V. Oseledets, and D. V. Savostyanov, Computation of extreme eigenvalues in higher dimensions using block tensor train format, Comput. Phys. Commun., 185 (2014), pp. 1207–1216, https://doi.org/10.1016/j.cpc.2013.12.017.
[5] J. Haegeman, J. I. Cirac, T. J. Osborne, I. Pižorn, H. Verschelde, and F. Verstraete, Time-dependent variational principle for quantum lattices, Phys. Rev. Lett., 107 (2011), p. 070601, https://doi.org/10.1103/PhysRevLett.107.070601.
[6] L. Hulthén, Über das Austauschproblem eines Kristalles, Ark. Mat. Astr. Fys., 26A (1938), pp. 1–106.
[7] D. Kressner, M. Steinlechner, and A. Uschmajew, Low-rank tensor methods with subspace correction for symmetric eigenvalue problems, SIAM J. Sci. Comput., 36 (2014), pp. A2346–A2368, https://doi.org/10.1137/130949919.
[8] O. Mickelin and S. Karaman, On algorithms for and computing with the tensor ring decomposition, 2018, https://arxiv.org/abs/1807.02513.
[9] R. Orús and G. Vidal, Infinite time-evolving block decimation algorithm beyond unitary evolution, Phys. Rev. B, 78 (2008), p. 155117, https://doi.org/10.1103/PhysRevB.78.155117.
[10] I. V. Oseledets, Tensor-train decomposition, SIAM J. Sci. Comput., 33 (2011), pp. 2295–2317, https://doi.org/10.1137/090752286.
[11] R. Penrose, Applications of negative dimensional tensors, in Combinatorial Mathematics and its Applications, D. J. A. Welsh, ed., vol. 1, New York, 1971, Academic Press, pp. 221–244.
[12] D. Perez-Garcia, F. Verstraete, M. M. Wolf, and J. I. Cirac, Matrix product state representations, Quantum Inf. Comput., 7 (2007), pp. 401–430.
[13] P. Pfeuty, The one-dimensional Ising model with a transverse field, Ann. Physics, 57 (1970), pp. 79–90, https://doi.org/10.1016/0003-4916(70)90270-8.
[14] S. Rommer and S. Östlund, Thermodynamic limit and matrix-product states, in Lecture Notes in Physics, I. Peschel, M. Kaulke, X. Wang, and K. Hallberg, eds., vol. 528, Berlin, Heidelberg, 1999, Springer, pp. 67–89.
[15] G. Vidal, Classical simulation of infinite-size quantum lattice systems in one spatial dimension, Phys. Rev. Lett., 98 (2007), p. 070201, https://doi.org/10.1103/PhysRevLett.98.070201.
[16] V. Zauner-Stauber, L. Vanderstraeten, M. T. Fishman, F. Verstraete, and J. Haegeman, Variational optimization algorithms for uniform matrix product states, Phys. Rev. B, 97 (2018), p. 045145, https://doi.org/10.1103/PhysRevB.97.045145.
[17]