[PDF] An iterative method to compute the overlap Dirac operator at nonzero chemical potential

Abstract

The overlap Dirac operator at nonzero quark chemical potential involves the computation of the sign function of a non-Hermitian matrix. In this talk we present an iterative method, first proposed by us in Ref. [1], which allows for an efficient computation of the operator, even on large lattices. The starting point is a Krylov subspace approximation, based on the Arnoldi algorithm, for the evaluation of a generic matrix function. The efficiency of this method is spoiled when the matrix has eigenvalues close to a function discontinuity. To cure this, a small number of critical eigenvectors are added to the Krylov subspace, and two different deflation schemes are proposed in this augmented subspace. The ensuing method is then applied to the sign function of the overlap Dirac operator, for two different lattice sizes. The sign function has a discontinuity along the imaginary axis, and the numerical results show how deflation dramatically improves the efficiency of the method.

Full PDF

AAn iterative method to compute the overlap Diracoperator at nonzero chemical potential

Jacques Bloch ∗ and Tilo Wettig Institute for Theoretical Physics, University of Regensburg, 93040 Regensburg, GermanyE-mail: [email protected]

Andreas Frommer and Bruno Lang

Department of Mathematics, University of Wuppertal, 42097 Wuppertal, Germany

The overlap Dirac operator at nonzero quark chemical potential involves the computation of thesign function of a non-Hermitian matrix. In this talk we present an iterative method, ﬁrst proposedby us in Ref. [1], which allows for an efﬁcient computation of the operator, even on large lattices.The starting point is a Krylov subspace approximation, based on the Arnoldi algorithm, for theevaluation of a generic matrix function. The efﬁciency of this method is spoiled when the matrixhas eigenvalues close to a function discontinuity. To cure this, a small number of critical eigen-vectors are added to the Krylov subspace, and two different deﬂation schemes are proposed in thisaugmented subspace. The ensuing method is then applied to the sign function of the overlap Diracoperator, for two different lattice sizes. The sign function has a discontinuity along the imaginaryaxis, and the numerical results show how deﬂation dramatically improves the efﬁciency of themethod.

The XXV International Symposium on Lattice Field TheoryJuly 30 - August 4 2007Regensburg, Germany ∗ Speaker. c (cid:13) Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike Licence. http://pos.sissa.it/ a r X i v : . [ h e p - l a t ] O c t n iterative method to compute the overlap Dirac operator at nonzero chemical potential Jacques Bloch

1. The overlap operator and the sign function at nonzero quark chemical potential

The overlap Dirac operator [2, 3] provides an exact solution of the Ginsparg-Wilson relationand hence implements chiral symmetry in lattice QCD even at ﬁnite lattice spacing. At zero quarkchemical potential the overlap operator requires the computation of the sign function of the Hermi-tian Wilson-Dirac operator, for which efﬁcient methods have been developed [4, 5].To describe QCD at nonzero baryon density (see Ref. [6] for a review), a quark chemicalpotential µ is introduced in the QCD Lagrangian. The massless overlap Dirac operator at nonzero µ was deﬁned in Ref. [7] as D ov ( µ ) = + γ sgn ( H w ( µ )) (1.1)with H w ( µ ) = γ D w ( µ ) . D w ( µ ) is the Wilson-Dirac operator at nonzero chemical potential [8] [ D w ( µ )] nm = δ n , m − κ ∑ j = ( + γ j ) U n , j δ n + ˆ j , m − κ ∑ j = ( − γ j ) U † n − ˆ j , j δ n − ˆ j , m (1.2) − κ ( + γ ) e µ U n , δ n + ˆ4 , m − κ ( − γ ) e − µ U † n − ˆ4 , δ n − ˆ4 , m , where κ = / ( + m w ) with negative Wilson mass m w ∈ ( − , ) , γ ν with ν = , . . . , γ = γ γ γ γ , and U n , ν are SU ( ) matrices. The exponentialfactors e ± µ implement the quark chemical potential on the lattice. For µ (cid:54) = H w ( µ ) of the sign function in Eq.(1.1) becomes non-Hermitian, and one is faced with the problem ofdeﬁning and computing the sign function of a non-Hermitian matrix.Consider a given matrix A of dimension N and a generic function f . Let Γ be a collection ofclosed contours in C such that f is analytic inside and on Γ and such that Γ encloses the spectrumof A . Then the function f ( A ) of the matrix A can be deﬁned by [9] f ( A ) = π i (cid:73) Γ f ( z )( zI − A ) − dz , (1.3)where the integral is deﬁned component-wise and I denotes the identity matrix. From this deﬁnitionit is easy to derive a spectral function deﬁnition. If the matrix A is diagonalizable, i.e., A = U Λ U − with a diagonal eigenvalue matrix Λ = diag ( λ i ) and U ∈ Gl ( N , C ) , then f ( A ) = U diag ( f ( λ i )) U − . (1.4)If A cannot be diagonalized, a more general spectral deﬁnition can be derived from Eq. (1.3) usingthe Jordan decomposition [10, 1]. Non-Hermitian matrices typically have complex eigenvalues,and applying Eq. (1.4) to the sign function in Eq. (1.1) requires the evaluation of the sign of acomplex number. The sign function needs to satisfy [ sgn ( z )] = x , sgn ( x ) = ± x ≷

0. To satisfy these properties, it has become standard to deﬁnesgn ( z ) ≡ z √ z = sgn ( Re ( z )) , (1.5)where in the last equality the cut of the square root is chosen along the negative real axis. Usingthe deﬁnition (1.5) in the spectral deﬁnition (1.4) and reordering the eigenvalues according to thesign of their real part allows one to write the matrix sign function assgn ( A ) = U (cid:32) + I − I (cid:33) U − . (1.6)2 n iterative method to compute the overlap Dirac operator at nonzero chemical potential Jacques Bloch

The sign function satisﬁes sgn ( A ) = I , and a short calculation [7] shows that for this reason theoverlap operator D ov ( µ ) as deﬁned in Eq. (1.1) satisﬁes the Ginsparg-Wilson relation. Moreover,this deﬁnition agrees with the result obtained when deriving Eq. (1.1) from the domain-wall fermionformalism at µ (cid:54) =

2. Arnoldi method and function approximation for a non-Hermitian matrix

A numerical implementation of the sign function using the spectral deﬁnition (1.4) is onlypossible for small matrices, as a full diagonalization becomes too expensive as the matrix grows.Alternatively, matrix-based iterative algorithms for the computation of the matrix sign functionhave been around for many years, see Ref. [12] and references therein. These are efﬁcient formedium-sized problems, but are still unaffordable for the very large matrices occurring in typicallattice QCD simulations. Therefore, another iterative method is required which approximates thevector y = sgn ( A ) x , rather than the full sign matrix itself. Such iterative methods are alreadyextensively used for Hermitian matrices [13, 14]. Most of these methods are derived from theLanczos method, which uses short recurrences to build an orthonormal basis in a Krylov subspace.Krylov subspace methods have also been introduced for non-Hermitian matrices [15, 16]. Thetwo most widely used methods to compute a basis for the Krylov subspace are the Arnoldi methodand the two-sided Lanczos method. In contrast to the Hermitian case, the Arnoldi method requireslong recurrences to construct an orthonormal basis for the Krylov subspace, while the two-sidedLanczos method uses two short recurrence relations at the cost of losing orthogonality. Here wedescribe a Krylov subspace approximation based on the Arnoldi method to evaluate f ( A ) x for ageneric function of a non-Hermitian matrix.We aim to construct an approximation to f ( A ) x using a polynomial of degree k − k (cid:28) N .For any k there exists a best polynomial approximation ˆ y = P k − ( A ) x of degree at most k −

1, whichis the orthogonal projection of f ( A ) x on the Krylov subspace K k ( A , x ) = span ( x , Ax , . . . , A k − x ) .An orthonormal basis V k = ( v , . . . , v k ) for the Krylov subspace K k ( A , x ) is constructed using theArnoldi recurrence AV k = V k H k + β k v k + e Tk , (2.1)where v = x / β , β = | x | , H k is an upper Hessenberg matrix, β k = H k + , k , and e k is the k -th basisvector in C k . Then V k V † k is a projector on the Krylov subspace, and the projection ˆ y of f ( A ) x on K k ( A , x ) can formally be written as ˆ y = V k V † k f ( A ) x . (2.2)However, to compute the projection (2.2) one would already have to know the exact result f ( A ) x .Therefore, a method is needed to approximate the projected vector ˆ y . From Eq. (2.1) it follows that H k = V † k AV k , (2.3)which suggests the approximation [15] f ( H k ) ≈ V † k f ( A ) V k . (2.4)As x = β V k e , Eq. (2.4) can be substituted in Eq. (2.2), ﬁnally yielding the approximationˆ y ≈ β V k f ( H k ) e . (2.5)3 n iterative method to compute the overlap Dirac operator at nonzero chemical potential Jacques Bloch

In this approximation the computation of f ( A ) is replaced by that of f ( H k ) , where H k is of muchsmaller size than A . f ( H k ) e should be evaluated by some suitable numerical method.The computation of the matrix sign function using Eq. (2.5) converges to the exact solution(see the m = m critical eigen-values λ i of A with orthonormal eigenvectors u i have been computed. Then f ( A ) x = m ∑ i = f ( λ i )( u † i x ) u i + f ( A ) x ⊥ , (2.6)where x = x (cid:107) + x ⊥ with x (cid:107) = ∑ mi = ( u † i x ) u i and x ⊥ = x − x (cid:107) . The ﬁrst term on the right-hand sideof Eq. (2.6) can be computed exactly, while the second term can be approximated using a Krylovsubspace method for f ( A ) x ⊥ . Deﬂation will allow for a much smaller-sized Krylov subspace.For non-Hermitian matrices the eigenvectors are no longer orthogonal, and the simple decom-position into orthogonal subspaces, leading to Eq. (2.6), no longer holds. In the next two sectionswe will develop two alternative deﬂation schemes for the non-Hermitian case.

3. Schur deﬂation

We construct the subspace Ω m + K k ( A , x ) , which is the sum of the subspace Ω m spanned by theright eigenvectors corresponding to m critical eigenvalues of A and the Krylov subspace K k ( A , x ) .Assume that m critical eigenvalues and right eigenvectors of A have been computed. From this, onecan construct m Schur vectors s i , which form an orthonormal basis of Ω m , satisfying AS m = S m T m , (3.1)where S m = ( s , . . . , s m ) and T m is an m × m upper triangular matrix whose diagonal elements arethe eigenvalues corresponding to the Schur vectors.We propose a modiﬁed Arnoldi method to construct an orthogonal basis of the composite sub-space Ω m + K k ( A , x ) . That is, each Arnoldi vector is orthogonalized not only against the previousones, but also against the Schur vectors s i . In analogy to (2.1), this process can be summarized as A (cid:16) S m V k (cid:17) = (cid:16) S m V k (cid:17) (cid:32) T m S † m AV k H k (cid:33) + β k v k + e Tm + k (3.2)with v = x ⊥ / β , where x ⊥ = ( − S m S † m ) x is the projection of x onto the orthogonal complement Ω ⊥ of Ω m and β = | x ⊥ | . The Hessenberg matrix H = (cid:32) T m S † m AV k H k (cid:33) (3.3)4 n iterative method to compute the overlap Dirac operator at nonzero chemical potential Jacques Bloch satisﬁes a relation similar to Eq. (2.3), namely H = Q † AQ , where the columns of Q = ( S m V k ) forman orthonormal basis of Ω m + K k ( A , x ) . In analogy to Sec. 2 we construct the approximation f ( A ) x ≈ Q f ( H ) Q † x . (3.4)Because of the block structure (3.3) of H , the matrix f ( H ) can be written as f ( H ) = (cid:32) f ( T m ) Y f ( H k ) (cid:33) , (3.5)where Y reﬂects the coupling between both subspaces and satisﬁes the Sylvester equation T m Y − Y H k = f ( T m ) X − X f ( H k ) (3.6)with X = S † m AV k . Combining (3.4) and (3.5), we obtain f ( A ) x ≈ S m f ( T m ) S † m x + (cid:16) S m V k (cid:17) (cid:32) Yf ( H k ) (cid:33) β e . (3.7) f ( T m ) and f ( H k ) e are computed with some suitable numerical method, and the mixed triangu-lar/Hessenberg Sylvester equation (3.6) for Y can be solved efﬁciently with the method of Ref. [1].

4. LR-deﬂation

An alternative deﬂation in the same composite subspace Ω m + K k ( A , x ) can be constructedusing both the left and right eigenvectors corresponding to the critical eigenvalues. Assume that m critical eigenvalues of A and their corresponding left and right eigenvectors have been computed, AR m = R m Λ m , L † m A = Λ m L † m , (4.1)where Λ m is the diagonal matrix of critical eigenvalues, and R m = ( r , . . . , r m ) and L m = ( (cid:96) , . . . , (cid:96) m ) are the matrices with the corresponding right and left eigenvectors, respectively. The left andright eigenvectors corresponding to different eigenvalues are orthogonal. If the eigenvectors arenormalized such that (cid:96) † i r i =

1, then L † m R m = I m , and R m L † m is an oblique projector on the subspace Ω m spanned by the right eigenvectors. Let us now decompose x as x = x (cid:107) + x (cid:9) , where x (cid:107) = R m L † m x and x (cid:9) = x − x (cid:107) . Then f ( A ) x = R m f ( Λ m ) L † m x + f ( A ) x (cid:9) . (4.2)The ﬁrst term on the right-hand side, which follows from Eq. (4.1), can be evaluated exactly, whilethe second term can be approximated by applying the Arnoldi method described in Sec. 2 to x (cid:9) . Anorthonormal basis V k is constructed in the Krylov subspace K k ( A , x (cid:9) ) using the Arnoldi recurrence(2.1), with v = x (cid:9) / β and β = | x (cid:9) | . Successive operations of A on x (cid:9) will yield no contributionsalong the m critical eigendirections, hence K k ( A , x (cid:9) ) does not mix with Ω m . Applying the Arnoldiapproximation (2.5) to Eq. (4.2) yields the ﬁnal approximation f ( A ) x ≈ R m f ( Λ m ) L † m x + β V k f ( H k ) e . (4.3)Again, the ﬁrst column of f ( H k ) will be computed with some suitable numerical method.5 n iterative method to compute the overlap Dirac operator at nonzero chemical potential Jacques Bloch x -10 x -8 x -6 e rr o r m=0248163264 x -10 x -8 x -6 e rr o r m=0163264128 Figure 1:

Accuracy of approximation (4.3) for y = sgn ( H w ( µ )) x with x = ( , , . . . , ) at µ = .

3, for a 4 lattice (left) and a 6 lattice (right). The relative error ε = (cid:107) ˜ y − y (cid:107) / (cid:107) y (cid:107) is shown as a function of the Krylovsubspace size k for various numbers of deﬂated eigenvalues m using the LR-deﬂation.

5. Results

We used the methods described above to compute the sign function occurring in the overlapDirac operator (1.1) for a 4 and a 6 lattice gauge conﬁguration. Deﬂation of critical eigenvalues isessential because H w ( µ ) has eigenvalues close to the imaginary axis. In practice, these eigenvectorsneed to be computed to high accuracy, as this will limit the overall accuracy of the function approx-imation. This was done with ARPACK [17]. The modiﬁed Arnoldi method was implemented inC++ using the optimized ATLAS BLAS library [18]. The convergence of the method is illustratedin Fig. 1, where the accuracy of the approximation is shown as a function of the Krylov subspacesize. The various curves correspond to different numbers of deﬂated eigenvalues, using the LR-deﬂation scheme. Without deﬂation ( m =

0) the need for large Krylov subspaces would make themethod unusable. Clearly, deﬂation highly improves the efﬁciency of the numerical method: asmore eigenvalues are deﬂated, smaller Krylov subspaces are sufﬁcient to achieve a given accu-racy. Furthermore, the deﬂation efﬁciency seems to grow with increasing lattice volume. Indeed,although the matrix size N for the 6 lattice is more than 5 times larger than in the 4 case, theKrylov subspace only has to be expanded by a factor of 1.2 to achieve a given accuracy of 10 − (for m ≈ . N ). It is also interesting to note that the modiﬁed Arnoldi approximation (4.3) for f ( A ) x is very close to the best approximation in the composite subspace, which is given by theorthogonal projection of f ( A ) x on Ω m + K k ( A , x ) , as was checked numerically.The results for the Schur deﬂation are not shown here, but are very similar to those for the LR-deﬂation. The Schur deﬂation is slightly less accurate, and requires more CPU time per evaluation,mainly because of the additional orthogonalization of the Arnoldi vectors with respect to the Schurvectors. However, the time taken by its initialization phase is halved, as it only requires the compu-tation of the right eigenvectors, and the best choice of deﬂation scheme will depend on the numberof vectors x for which sgn ( H w ) x needs to be computed. If one needs to apply both sgn ( H w ) and itsadjoint, then, obviously, the LR-deﬂation will be the better choice. A more detailed discussion ofboth deﬂation schemes can be found in Ref. [1] 6 n iterative method to compute the overlap Dirac operator at nonzero chemical potential Jacques Bloch

6. Conclusion

In this talk we presented an algorithm to approximate the action of a function of a non-Hermitian matrix on an arbitrary vector, when some of the eigenvalues of the matrix lie in a regionof the complex plane close to a discontinuity of the function. The method approximates the solutionvector in a composite subspace consisting of a Krylov subspace augmented by the eigenvectors cor-responding to a small number of critical eigenvalues. Two deﬂation variants were presented basedon different subspace decompositions: the Schur deﬂation uses two coupled orthogonal subspaces,while the LR-deﬂation uses two decoupled but non-orthogonal subspaces. Deﬂation explicitlytakes into account the contribution of the critical eigenvalues. This allows for smaller-sized Krylovsubspaces, which is crucial for the efﬁciency of the method. The method was applied to the overlapDirac operator of lattice QCD at nonzero chemical potential, where the importance of deﬂation wasclearly demonstrated.

Acknowledgments

This work was supported in part by DFG grants FOR465-WE2332/4-2 and Fr755/15-1.

References [1] J. Bloch, A. Frommer, B. Lang and T. Wettig, Comput. Phys. Commun. (2007) , in press,arXiv:0704.3486 [hep-lat].[2] R. Narayanan and H. Neuberger, Nucl. Phys. B443 (1995) 305, hep-th/9411108.[3] H. Neuberger, Phys. Lett. B417 (1998) 141, hep-lat/9707022.[4] H. Neuberger, Phys. Rev. Lett. 81 (1998) 4060, hep-lat/9806025.[5] J. van den Eshof et al., Comput. Phys. Commun. 146 (2002) 203, hep-lat/0202025.[6] M.A. Stephanov, PoS LAT2006 (2006) 024, hep-lat/0701002.[7] J. Bloch and T. Wettig, Phys. Rev. Lett. 97 (2006) 012003, hep-lat/0604020.[8] P. Hasenfratz and F. Karsch, Phys. Lett. B125 (1983) 308.[9] N. Dunford and J. Schwartz, Linear Operators, Part I: General Theory (Interscience Publishers, 1958).[10] G.H. Golub and C.F. Van Loan, Matrix Computations (The Johns Hopkins University Press, 1989).[11] J. Bloch and T. Wettig, (2007), arXiv:0709.4630 [hep-lat].[12] N.J. Higham, Num. Algorithms 15 (1997) 227.[13] H.A. van der Vorst, J. Comput. Appl. Math. 18 (1987) 249.[14] V. Druskin, A. Greenbaum and L. Knizhnerman, SIAM J. Sci. Comput. 19 (1998) 38.[15] E. Gallopoulos and Y. Saad, ICS ’89: Proceedings of the 3rd international conference onSupercomputing, pp. 17–28, New York, NY, USA, 1989, ACM Press.[16] M. Hochbruck and M.E. Hochstenbach, Subspace extraction for matrix functions, preprint, 2005.[17] .[18] http://math-atlas.sourceforge.net ..