Alternating Descent Method for Gauge Cooling of Complex Langevin Simulations
XIAOYU DONG, ZHENNING CAI, AND YANA DI
Abstract.
We study the gauge cooling technique for the complex Langevin method applied to computations in lattice quantum chromodynamics. We propose a new solver for the minimization problem that optimizes the gauge, which involves no parameters in each iteration and shows better performance than the classical gradient descent method, especially when the lattice size is large. Two numerical tests are carried out to show the effectiveness of the new algorithm.
Keywords. complex Langevin method, gauge cooling, alternating descent method.

1. Introduction
Lattice QCD is the standard nonperturbative tool for quantum chromodynamics (QCD). The link variables U_{x,μ} ∈ SU(3) stand for the gluons between lattice points x and x + μ̂, where x = (t, x₁, x₂, x₃) ∈ ℕ⁴. Using {U} to represent the collection of all link variables U_{x,μ}, one can compute the expectation value of any observable O by

⟨O⟩ = (1/Z) ∫ [dU] O({U}) e^{−S({U})},

where the partition function Z is given by

Z = ∫ [dU] e^{−S({U})},

and the total action S({U}) can be written as S_B({U}) + S_F({U}), with S_B({U}) being the bosonic action and S_F({U}) being the negative logarithm of the fermion determinant. For nonzero chemical potential, the fermion determinant det M may be complex, leading to the numerical sign problem when applying the Monte Carlo method to compute ⟨O⟩.

Several methods have been proposed to mitigate the numerical sign problem, including analytical methods such as analytical continuation and series expansion [11, 4], the Lefschetz thimble method [10, 9], and the complex Langevin method (CLM) [15, 14, 16, 22]. The CLM is a straightforward generalization of the real Langevin method, which extends the dynamics on the SU(3) field to dynamics on the SL(3, ℂ) field. Despite its formal legitimacy [2], the results may diverge or even converge to wrong values [20, 24], and this often occurs when large excursions occur in the complex Langevin dynamics. Recently in [24], by a detailed analysis of the boundary terms, the underlying mechanism of the wrong convergence has surfaced, and the result therein is further developed in [25] to find corrections of the numerical values. Nevertheless, the most straightforward way to stabilize the dynamics is to pull the fields closer to SU(3), which can be done by gauge cooling [26] or dynamical stabilization [5]. The gauge cooling method utilizes the redundant degrees of freedom in the gauge field, and it does not introduce any bias into the expectation values.
Such a method has been formally justified in [20, 18], and it has been successfully applied to a number of problems [27, 19]. Basically, the gauge cooling method requires choosing an optimal gauge for the complexified field, which minimizes the distance between the current field and SU(3). In one dimension, it has been shown in [8] that the problem can be solved analytically, since the field is essentially equivalent to the one-link model. While the analytical solution is unavailable in the multi-dimensional case, the minimization problem is usually solved by the gradient descent method, for which the step length has to be chosen carefully to achieve a satisfactory convergence rate. Some techniques to choose the step length have been discussed in [1, 6].

In this paper, we focus on the gauge cooling problem and propose a more efficient method to solve the optimization problem, which does not require delicate selection of the step length. The rest of this paper is organized as follows. In Section 2, we give a brief introduction to the complex Langevin method and the gauge cooling technique. Our alternating descent method is presented in Section 3. Section 4 includes two numerical experiments to test the new method. Finally, some concluding remarks are given in Section 5.

Date: August 18, 2020. Zhenning Cai's work was supported by the Academic Research Fund of the Ministry of Education of Singapore under grant No. R-146-000-291-114. Yana Di's work was supported by the National Natural Science Foundation of China No. 11771437. The computations were partly done on the high performance computers of the State Key Laboratory of Scientific and Engineering Computing, Chinese Academy of Sciences.
2. Complex Langevin method and gauge cooling
In this section, we provide a brief review of the complex Langevin method and the method of gauge cooling. To be more general, we consider the SU(n) gauge theory with {U} ∈ [SU(n)]^{4N₀N₁N₂N₃}, where N₀ is the number of lattice points in the time direction and N₁, N₂, N₃ are those in the spatial directions. Define

G = {x = (t, x₁, x₂, x₃) : t = 1, ⋯, N₀; x₁ = 1, ⋯, N₁; x₂ = 1, ⋯, N₂; x₃ = 1, ⋯, N₃},

so that {U} can be represented by {U_{x,μ} : x ∈ G; μ = 0, 1, 2, 3}. We assume periodic boundary conditions for {U}. Let w_{a,x,μ}, for a = 1, ⋯, n²−1, x ∈ G and μ = 0, 1, 2, 3, be independent Wiener processes. The complex Langevin equation reads

(1) dU_{x,μ} = −∑_{a=1}^{n²−1} iλ_a [U_{x,μ} D_{a,x,μ} S({U}) dt + U_{x,μ} ∘ dw_{a,x,μ}], ∀x ∈ G, ∀μ = 0, 1, 2, 3,

where λ_a, a = 1, ⋯, n²−1, are the traceless Hermitian generators of the SU(n) group satisfying

tr(λ_a λ_b) = 2δ_{ab}, ∀a, b = 1, ⋯, n²−1,

and D_{a,x,μ} denotes the left Lie derivative operator defined as

D_{a,x,μ} f({U}) = lim_{ε→0} [f({U^ε}) − f({U})]/ε

with {U^ε} being the set of the matrices U^ε_{y,ν} = exp(iεδ_{xy}δ_{μν}λ_a) U_{y,ν} for y ∈ G, ν = 0, 1, 2, 3, a = 1, ⋯, n²−1.
Here the field {U^ε} is actually the perturbation of {U} which only replaces the single link U_{x,μ} by exp(iελ_a)U_{x,μ}. In practice, we apply the Euler–Maruyama discretization of the equation (1) for time evolution:

(2) U_{x,μ} ← exp(−∑_{a=1}^{n²−1} iλ_a (D_{a,x,μ}S Δt + η_{a,x,μ}√Δt)) U_{x,μ}, ∀x ∈ G, μ = 0, 1, 2, 3,

where Δt is the time step and each η_{a,x,μ} is normally distributed with mean 0 and variance 2.

The complex Langevin method may diverge due to insufficient decay of the probability density function at infinity, which is often owing to too much excursion away from a unitary field. The gauge cooling method uses the redundant degrees of freedom in the gauge theory to restrict such excursions. Specifically, the observable O and the complex action S are invariant under the gauge transformation defined as

(3) Ũ_{x,μ} = V_x^{−1} U_{x,μ} V_{x+μ̂}, ∀x ∈ G, ∀μ = 0, 1, 2, 3,

where V_x ∈ SL(n, ℂ) for any x ∈ G, and the V_x also satisfy the periodic boundary conditions. Define {Ũ} as the collection of all the Ũ_{x,μ}. The gauge cooling method applies such a transformation after every time step. By choosing V_x appropriately, the drift away from the unitary field can be mitigated. The distance between {U} and [SU(n)]^{4N₀N₁N₂N₃} can be measured by the norm

‖{U}‖ = (∑_{μ=0}^{3} ∑_{x∈G} ‖U_{x,μ}‖_F²)^{1/2}, ∀{U} ∈ [ℂ^{n×n}]^{4N₀N₁N₂N₃},

based on the Frobenius norm of matrices defined by ‖U_{x,μ}‖_F = [tr(U_{x,μ}U_{x,μ}†)]^{1/2}. It can be proved that ‖{U}‖² ⩾ 4nN₀N₁N₂N₃ for any {U} ∈ [SL(n, ℂ)]^{4N₀N₁N₂N₃}, and the equality holds if and only if {U} ∈ [SU(n)]^{4N₀N₁N₂N₃} [8].
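To make the distance concrete, here is a minimal NumPy sketch of the squared norm ‖{U}‖² together with a check of its lower bound (function names are ours, not from the paper):

```python
import numpy as np

def dist_sq(links):
    """||{U}||^2: sum of squared Frobenius norms tr(U U†) over all links.
    It equals n per link exactly when every link is unitary, and is larger otherwise."""
    return sum(float(np.real(np.trace(U @ U.conj().T))) for U in links)

def random_sl(n, rng, scale=0.2):
    """A random SL(n, C) matrix: normalize a near-identity complex matrix to det = 1."""
    M = np.eye(n) + scale * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    return M / np.linalg.det(M) ** (1.0 / n)

rng = np.random.default_rng(0)
links = [random_sl(3, rng) for _ in range(8)]
print(dist_sq([np.eye(3)] * 8))   # unitary links attain the minimum: 3 per link, 24 here
print(dist_sq(links))             # strictly larger for non-unitary SL(3, C) links
```

For unitary links the bound is attained with equality; any excursion into SL(3, ℂ) strictly increases the functional, which is what gauge cooling tries to undo.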
Once the link variables {U} are updated by one time step, the gauge cooling step requires us to solve the minimization problem

(4) argmin_{{Ũ}∈G({U})} F({Ũ}),

where F({Ũ}) = ‖{Ũ}‖², and G({U}) is the set of all fields which are equivalent to {U} under the gauge transformation:

G({U}) = { {Ũ} : Ũ_{x,μ} = V_x^{−1} U_{x,μ} V_{x+μ̂}, V_x ∈ SL(n, ℂ) }.

In [26], this problem is solved iteratively by the gradient descent method:

(5) Ũ^{(0)}_{x,μ} = U_{x,μ}, Ũ^{(k+1)}_{x,μ} = exp(∑_{a=1}^{n²−1} sV^{(k)}_{a,x} λ_a) Ũ^{(k)}_{x,μ} exp(−∑_{a=1}^{n²−1} sV^{(k)}_{a,x+μ̂} λ_a),

where

V^{(k)}_{a,x} = −∑_{μ=0}^{3} tr{λ_a [Ũ^{(k)}_{x,μ}(Ũ^{(k)}_{x,μ})† − (Ũ^{(k)}_{x−μ̂,μ})† Ũ^{(k)}_{x−μ̂,μ}]},

and s stands for the step length. Since the optimization problem has to be solved at every time step, an accurate line search method may be too expensive for gauge cooling. The simplest choice is to take s proportional to Δt [26], while it is unclear how to select a suitable coefficient. In [1], the authors introduced the method of adaptive gauge cooling, which sets s to be larger when the current solution is farther away from the minimum point, and controls the divergence by reducing s when the cooling drift is large. In [6], the authors find s numerically by applying Brent's method [7] and adding an upper bound to avoid instability.

3. Alternating descent method for gauge cooling
To avoid the intricate selection of the step length, we introduce in this section a new numerical method for gauge cooling, which is free of parameters in each iteration. The general idea of this method is to decompose all the variables V_x into two subsets, and these two subsets of variables are optimized in turns. Below we refer to this method as the alternating descent method.

To begin with, we define two index sets:

G_e = {x = (t, x₁, x₂, x₃) ∈ G : (t + x₁ + x₂ + x₃) is even},
G_o = {x = (t, x₁, x₂, x₃) ∈ G : (t + x₁ + x₂ + x₃) is odd}.

In order to be consistent with the periodic boundary condition, we require that N₀, N₁, N₂ and N₃ be even numbers. Each iteration of the alternating descent method then contains two steps:

(1) Let Ũ^{(k+1/2)}_{x,μ} = (V^{(k)}_x)^{−1} Ũ^{(k)}_{x,μ} and Ũ^{(k+1/2)}_{x−μ̂,μ} = Ũ^{(k)}_{x−μ̂,μ} V^{(k)}_x for all x ∈ G_e, where V^{(k)}_x, x ∈ G_e, are chosen to minimize F({Ũ^{(k+1/2)}}).
(2) Let Ũ^{(k+1)}_{x,μ} = (V^{(k+1/2)}_x)^{−1} Ũ^{(k+1/2)}_{x,μ} and Ũ^{(k+1)}_{x−μ̂,μ} = Ũ^{(k+1/2)}_{x−μ̂,μ} V^{(k+1/2)}_x for all x ∈ G_o, where V^{(k+1/2)}_x, x ∈ G_o, are chosen to minimize F({Ũ^{(k+1)}}).

The operation in each step finds the optimal gauge transformation with the transformation variables V^{(k)}_x (or V^{(k+1/2)}_x) fixed to be the identity on odd (or even) lattice points. To implement the above algorithm, it remains only to find V^{(k)}_x for x ∈ G_e and V^{(k+1/2)}_x for x ∈ G_o. Due to the periodic boundary condition, the algorithms for finding both sets of matrices are nearly identical, and below we focus only on the algorithm to find V^{(k)}_x, x ∈ G_e. For simplicity, we omit the superscript k, and thus the optimization problem to be solved is

(6) argmin_{V_x ∈ SL(n,ℂ), x ∈ G_e} ∑_{μ=0}^{3} ∑_{x∈G_e} tr(V_x^{−1} Ũ_{x,μ} Ũ_{x,μ}† V_x^{−†} + Ũ_{x−μ̂,μ} V_x V_x† Ũ_{x−μ̂,μ}†).

Note that the objective function is exactly the value of F(·) for the link variables after the gauge transformation.
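The even–odd splitting above is a standard checkerboard decomposition; a short sketch (helper name is ours) verifies that, with all extents even, every link joins the two sublattices, which is why the V_x within one sublattice do not interact:

```python
from itertools import product

def even_odd_sites(shape):
    """Split the sites of a periodic lattice into the checkerboard sets G_e and G_o."""
    G_e, G_o = [], []
    for x in product(*(range(n) for n in shape)):
        (G_e if sum(x) % 2 == 0 else G_o).append(x)
    return G_e, G_o

G_e, G_o = even_odd_sites((4, 4, 4, 4))
assert len(G_e) == len(G_o) == 128
# With even extents, every link (x, x + mu-hat) joins an even site to an odd one,
# also across the periodic boundary, so the subproblems within one sublattice decouple.
for x in G_e:
    for mu in range(4):
        y = list(x); y[mu] = (y[mu] + 1) % 4
        assert tuple(y) in G_o
```

If any extent were odd, the wrap-around link would join two sites of equal parity, which is exactly why the method requires N₀, N₁, N₂, N₃ to be even.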
In this problem, the optimization of every V_x can be fully decoupled, meaning that we can solve the minimization problem

(7) argmin_{V_x ∈ SL(n,ℂ)} F_x(V_x) = ∑_{μ=0}^{3} tr(V_x^{−1} Ũ_{x,μ} Ũ_{x,μ}† V_x^{−†} + Ũ_{x−μ̂,μ} V_x V_x† Ũ_{x−μ̂,μ}†)

for all x ∈ G_e, which makes it feasible to solve (6) exactly. Now we fix x ∈ G_e and try to solve (7). Since V_x ∈ SL(n, ℂ), we can write V_x as

V_x = exp(−∑_{a=1}^{n²−1} V_{a,x} λ_a),

where all V_{a,x} are real numbers, as in the gradient descent method. By straightforward calculation, we have

(8) ∂F_x/∂V_{a,x} = 2 tr[λ_a ∑_{μ=0}^{3} (V_x^{−1} Ũ_{x,μ} Ũ_{x,μ}† V_x^{−†} − V_x† Ũ_{x−μ̂,μ}† Ũ_{x−μ̂,μ} V_x)].

For simplicity, we let

P_x = ∑_{μ=0}^{3} Ũ_{x,μ} Ũ_{x,μ}†,  Q_x = ∑_{μ=0}^{3} Ũ_{x−μ̂,μ}† Ũ_{x−μ̂,μ},

so that the optimal choice of V_{a,x} satisfies

tr[λ_a (V_x^{−1} P_x V_x^{−†} − V_x† Q_x V_x)] = 0, ∀a = 1, ⋯, n²−1.

Since the matrix V_x^{−1} P_x V_x^{−†} − V_x† Q_x V_x is Hermitian, the above equation is equivalent to

(9) V_x^{−1} P_x V_x^{−†} − V_x† Q_x V_x = α_x I,

where α_x ∈ ℝ is an unknown coefficient. Now we further simplify the notation by omitting the variable x, so that the equation (9) can be rewritten as

(10) V V† Q V V† + α V V† = P.

If α is given, this equation is an algebraic Riccati equation [17]. The standard way to solve this equation is to first solve the eigenvalue problem

(11) ( (1/2)αI   Q  )( X )   ( X )
     (   P   −(1/2)αI )( Y ) = ( Y ) Λ,

where X, Y, Λ ∈ ℂ^{n×n} and Λ is a diagonal matrix whose diagonal elements all have positive real parts. Then the solution can be written as V V† = Y X^{−1}. Here Λ only gives half of the eigenvalues of the first matrix in (11), and the other half of the eigenvalues must have negative real parts. The solution contains the parameter α, which can be further determined by det(V V†) = 1.
In our case, we can simplify the problem by expanding (11):

(12) (1/2)αX + QY = XΛ,  PX − (1/2)αY = YΛ,

which yields

(13) QPX = X(Λ − (1/2)αI)(Λ + (1/2)αI) = X(Λ² − (1/4)α²I).

Since P and Q are Hermitian positive definite matrices, their product QP is diagonalizable and all its eigenvalues are real and positive. Thus X can be obtained by solving the smaller eigenvalue problem (13). Here we assume that det X = 1. Suppose ξ₁, ⋯, ξ_n are the eigenvalues of QP. Then

Λ = diag(√(ξ₁ + α²/4), √(ξ₂ + α²/4), ⋯, √(ξ_n + α²/4)).

By the second equation in (12), we have

Y = PX(Λ + (1/2)αI)^{−1}.

Therefore

(14) V V† = Y X^{−1} = PX(Λ + (1/2)αI)^{−1} X^{−1}.

Using the conditions det(V) = 1 and det(X) = 1, we see that

det(P) det(Λ + (1/2)αI)^{−1} = 1.

Therefore α satisfies

(15) ∏_{j=1}^{n} (√(ξ_j + α²/4) + α/2) = det(P).

Note that this equation determines a unique solution, due to the monotonicity of the left-hand side with respect to α. Thus we obtain the following algorithm to solve for V:

1) Eigendecompose the matrix QP to get the eigenvalues ξ₁, ξ₂, ⋯, ξ_n and the eigenvector matrix X with det(X) = 1.
2) Solve the equation (15) to find the value of α.
3) Compute V V† by (14).
4) Find V by taking the matrix square root of (14).

The above algorithm solves the optimization problem (7) exactly, and thus provides the solution of (6). This method turns out to be similar to the coordinate descent method, which updates one variable at a time. The coordinate descent method has wide applications in large-scale optimization problems [21], especially when the subproblem for each variable can be solved exactly [13, 28]. In the following section, we will show by numerical tests that such a method is also highly efficient for the gauge cooling problem.
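Steps 1)–4) fit in a short routine. The sketch below is our reading of the algorithm in pure NumPy, with plain bisection standing in for the scalar root solver in step 2; the test problem P, Q at the end is hypothetical:

```python
import numpy as np

def solve_local_gauge(P, Q, iters=200):
    """Exact minimizer of F_x(V) = tr(V^{-1} P V^{-dag} + V^dag Q V) over Hermitian
    V in SL(n, C): eigendecompose QP, solve eq. (15) for alpha by bisection,
    build V V^dag by eq. (14), and take the Hermitian matrix square root."""
    n = P.shape[0]
    xi, X = np.linalg.eig(Q @ P)      # QP has a real positive spectrum (product of HPD matrices)
    xi = xi.real
    detP = np.linalg.det(P).real

    def g(alpha):                     # eq. (15) residual; monotonically increasing in alpha
        return np.prod(np.sqrt(xi + alpha ** 2 / 4) + alpha / 2) - detP

    lo, hi = -1.0, 1.0
    while g(lo) > 0:
        lo *= 2
    while g(hi) < 0:
        hi *= 2
    for _ in range(iters):            # bisection for the unique root of eq. (15)
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    alpha = 0.5 * (lo + hi)

    Lam = np.sqrt(xi + alpha ** 2 / 4)
    W = P @ X @ np.diag(1.0 / (Lam + alpha / 2)) @ np.linalg.inv(X)   # V V^dag, eq. (14)
    W = 0.5 * (W + W.conj().T)        # symmetrize against round-off
    w, U = np.linalg.eigh(W)
    return (U * np.sqrt(w)) @ U.conj().T                              # Hermitian square root

# a hypothetical single-site test problem: P, Q Hermitian positive definite
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
P = A @ A.conj().T + np.eye(3)
Q = B @ B.conj().T + np.eye(3)
V = solve_local_gauge(P, Q)
```

The returned V is Hermitian with det V = 1, the stationarity condition (9) is satisfied up to round-off, and F_x(V) never exceeds F_x(I).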
4. Numerical Examples

We now apply the alternating descent method to two numerical examples to demonstrate its efficiency. In both examples, the Euler–Maruyama method (2) is applied to solve the complex Langevin dynamics. Our method is compared with the gradient descent method, and when the alternating descent method is used, only one iteration is applied after each time step. The two examples shown below are, respectively, a one-dimensional Polyakov loop model and a four-dimensional model for heavy quark QCD.

4.1. Polyakov loop model.
For the one-dimensional Polyakov loop model, we denote the link variables by U_k, k = 1, 2, ⋯, N. Following [26], we define the action S by

S({U}) = −tr(β₁ U₁ ⋯ U_N + β₂ U_N^{−1} ⋯ U₁^{−1}),

and choose β₁ = β + κe^{μ} and β₂ = β̄ + κe^{−μ} with β = 2, κ = 0.…, and μ = 1. The model is applied to the SU(3) theory, so that λ_a, a = 1, ⋯, 8, are the Gell-Mann matrices. In this example, α_x in (9) always equals zero, since both V_x^{−1} P_x V_x^{−†} and V_x† Q_x V_x are Hermitian, positive definite, and have determinant 1.

In our numerical experiments, we choose the Langevin time step to be Δt = 2 × 10^{−…} and the simulation ends at t = 10. Note that in the one-dimensional case, the exact solution of the optimization problem (4) can be obtained analytically, and the result is

U_k ← (ΛΛ†)^{1/(2N)}, k = 1, ⋯, N−1,  U_N ← Λ(ΛΛ†)^{(1−N)/(2N)},

where Λ is a diagonal matrix whose diagonal entries are the eigenvalues of U_N U₁ ⋯ U_{N−1}. For details, we refer the readers to a recent work [8]. Thus, the following four methods will be tested in this example:

• Complex Langevin with no gauge cooling.
• Complex Langevin with gauge cooling implemented by the gradient descent method [26].
• Complex Langevin with optimal gauge cooling [8].
• Complex Langevin with gauge cooling implemented by the alternating descent method introduced in Section 3.
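The closed-form cooling above can be sketched directly (the exponents are our reconstruction of the formula from [8]; names are ours). The cooled chain is diagonal, keeps the Polyakov loop spectrum, and never increases ‖{U}‖²:

```python
import numpy as np
from functools import reduce

def optimal_cooling_1d(links):
    """Closed-form gauge cooling of a periodic 1-D chain: spread the eigenvalues of
    the Polyakov loop evenly over the links (exponents reconstructed from [8])."""
    N = len(links)
    loop = reduce(np.matmul, links)        # same spectrum as U_N U_1 ... U_{N-1}
    lam = np.linalg.eigvals(loop)          # diagonal entries of Lambda
    mod = np.abs(lam) ** (1.0 / N)         # diagonal of (Lambda Lambda†)^{1/(2N)}
    common = np.diag(mod).astype(complex)  # U_k for k = 1, ..., N-1
    last = np.diag(lam * mod ** (1 - N))   # U_N = Lambda (Lambda Lambda†)^{(1-N)/(2N)}
    return [common.copy() for _ in range(N - 1)] + [last]

# a hypothetical chain of random SL(3, C) links
rng = np.random.default_rng(0)
raw = []
for _ in range(6):
    M = np.eye(3) + 0.3 * (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
    raw.append(M / np.linalg.det(M) ** (1.0 / 3))
cooled = optimal_cooling_1d(raw)
```

The product of the cooled links equals Λ exactly, so every loop observable tr[(U₁⋯U_N)^k] is unchanged, while ‖{U}‖² drops to N ∑ᵢ |λᵢ|^{2/N}.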
[Figure 1 about here: four panels (N = 4, 8, 16, 32) plotting ΔF against the Langevin time t, comparing optimal gauge cooling, no gauge cooling, gradient descent, and alternating descent; the N = 32 panel additionally shows gradient descent with 3 steps at s = Δt, 20 steps at s = Δt, and 3 steps at s = 100Δt.]
Figure 1. The evolution of ΔF with different gauge cooling techniques.

For the gradient descent method, the update of the link variables (5) can be formulated as

V_{a,k} ← −tr{λ_a [U_k U_k† − U_{k−1}† U_{k−1}]},
U_k ← exp(∑_{a=1}^{8} sV_{a,k} λ_a) U_k exp(−∑_{a=1}^{8} sV_{a,k+1} λ_a),

where s stands for the step length, which is set as s = Δt. This iteration is applied three times after every Langevin step. For the alternating descent method, only one iteration per time step is applied, as mentioned at the beginning of this section.

Fig. 1 shows the evolution of the norm ‖{U}‖ over the Langevin time. The vertical axis ΔF is defined by ΔF = F({U}) − 3N, which vanishes only if {U} ∈ [SU(3)]^N. Because of the randomness, each curve in the figures only shows the result of one particular run in our simulations. From the figures, we can observe without surprise that if no gauge cooling is applied, the computations are forced to terminate for N = 8, 16, 32 due to overflow, and even for N = 4, the norm of the links increases too quickly. For N = 4 and 8, the performance of the gradient descent method is excellent due to its good agreement with the optimal gauge cooling. But the disparity between these two methods increases gradually as N gets larger, as is obvious for N = 16 and 32. On the contrary, in all cases, the alternating descent method performs almost as well as the optimal gauge cooling. In general, we see that the alternating descent method is always competitive, and when the number of lattice points is large, it can significantly outperform the gradient descent method.

To demonstrate the efficiency of the method, we have also listed in Table 1 the computational time in our experiments. In Table 1, both the total computational time and the time spent only on gauge cooling are provided, where for the gradient descent method, three iterations are applied at every time step, while only one iteration per time step is applied for the alternating descent method. On average, the computational time per iteration is similar for both methods, and the alternating descent method is faster due to its ability to use fewer iterations. By comparing the two columns for each method, it can be seen that gauge cooling takes a significant portion of the total computational time, so that a better algorithm for the optimization can effectively reduce the computational cost.
Table 1. Computational time (in seconds) of every 10 Langevin steps with gauge cooling. Opt.: Optimal gauge cooling, GD: Gradient descent method, AD: Alternating descent method, g.c.: gauge cooling

         Opt.            GD              AD
N    total   g.c.    total   g.c.    total   g.c.
4    0.598   0.061   1.441   0.890   0.868   0.…
8    ….157   0.071   2.925   1.787   1.745   0.…
16   ….342   0.087   5.889   3.551   3.585   1.…
32   ….044   0.119   12.280  7.151   7.619   2.…

For N = 32, if we increase the number of iterations to 20, the dynamics can be better stabilized, although the value of ΔF is still about one magnitude larger than that of the alternating descent method. However, this also results in a significant increase of the computational cost. By tuning the parameters, we find that choosing the step length s = 100Δt and keeping the number of iterations at 3 can provide a better balance between the computational cost and the efficacy of gauge cooling. Even so, the alternating descent method is still superior in this case.

To verify the validity of these numerical methods, we calculated the expectation values of the observables

O_k({U}) = tr[(U₁U₂⋯U_N)^k]

for k = ±1, ±2, ±3. Samples are taken every 50 steps from t = 1 until t = 10. The results are listed in Table 2, and because the value of ⟨O_k⟩ is real, its imaginary part is omitted. The exact values of the expectation can be obtained by Weyl's integration theorem. For N = 4, although the expectation values can be calculated without gauge cooling, the results are obviously erroneous. For N = 32, the data for the gradient descent method provided in Table 2 are computed using s = Δt with 3 iterations per step, which show large errors especially when |k| is large.

Table 2. Numerical results for the observables ⟨O_k⟩. Opt.: Optimal gauge cooling, GD: Gradient descent method, AD: Alternating descent method, No g.c.: No gauge cooling

             N = 4                       N = 8
k    Exact   Opt.   GD   AD   No g.c.   Opt.   GD   AD   No g.c.
1    …       …      …    …    …         …      …    …    …
−1   …       …      …    …    …         …      …    …    …
2    …       …      …    …    …         …      …    …    …
−2   …       …      …    …    …         …      …    …    …
3    …       …      …    …    …         …      …    …    …
−3   …       …      …    …    …         …      …    …    …

             N = 16                      N = 32
k    Exact   Opt.   GD   AD   No g.c.   Opt.   GD   AD   No g.c.
1    …       …      …    …    …         …      …    …    …
−1   …       …      …    …    …         …      …    …    …
2    …       …      …    …    …         …      …    …    …
−2   …       …      …    …    …         …      …    …    …
3    …       …      …    …    …         …      …    …    …
−3   …       …      …    …    …         …      …    …    …

To demonstrate the efficiency of the alternating descent method more clearly, we remove the complex Langevin dynamics and consider solving the optimization problem (7) directly. The link variables {U} are set to be a random complexified gauge transformation applied to a random SU(3) field, so that one of the
solutions to the problem (7) is the field before the gauge transformation, for which ΔF = 0.

[Figure 2 about here: ΔF against the number of gauge cooling steps for N = 4, 32, 256, and 1024; each panel compares the alternating descent method with the gradient descent method at various step lengths, from s = 0.4Δt up to s = 300Δt (N = 4), s = 150Δt (N = 32), and s = 20Δt (N = 256 and 1024).]

Figure 2. The evolution of ΔF with increasing gauge cooling steps.

Below we compare the performance of the alternating descent method and that of the gradient descent method with various step lengths s. The numbers of link variables are chosen to be 4, 32,
256, and 1024, and the results are plotted in Fig. 2. Note that the gradient descent method may diverge when the step length is too large, and in our numerical test for every N, we have tried to maximize the step length s while maintaining the stability of the iterative process. In all test cases, the alternating descent method shows its superiority in the cooling effect. In particular, it can efficiently bring down the value of ΔF by two to three orders of magnitude in the first few steps, and this effect is independent of the number of link variables. Such a property is highly desirable in the simulation of lattice QCD, for it allows us to reduce the computational cost of gauge cooling.

4.2. Heavy quark QCD.
In this example, we consider QCD at finite chemical potential as described in [3]. Denoting any x ∈ G by x = (t, 𝐱), the fermion determinant is given by

det M(μ) = ∏_𝐱 det(I + C P_𝐱)² det(I + C′ P_𝐱^{−1})²,

where C = [2κ exp(μ)]^{N₀}, C′ = [2κ exp(−μ)]^{N₀}, and

P_𝐱 = ∏_{t=1}^{N₀} U_{(t,𝐱),0}.

Define the plaquette

U_{x,μν} = U_{x,μ} U_{x+μ̂,ν} U_{x+ν̂,μ}^{−1} U_{x,ν}^{−1}.

[Figure 3 about here: ⟨O⟩ and ⟨O^{−1}⟩ as functions of μ, computed with the alternating descent method (left) and the gradient descent method (right).]
Figure 3. The expectation values of the observables with β = 3.0.

The total action is S({U}) = −ln(det M) + S_B, where

S_B = −β ∑_{x∈G} ∑_{μ<ν} [ (1/6)(tr U_{x,μν} + tr U_{x,μν}^{−1}) − 1 ].

We refer the readers to [3] for the Lie derivatives of this action. The same example has also been computed using the complex Langevin method in [23, 26, 12].

In our simulation, we choose the chemical potential μ ranging from 0 to 4, with two values of the coupling, β = 3.0 and β = 5.0. The observables of interest are

⟨O⟩ = (1/(3N₁N₂N₃)) ∑_𝐱 ⟨tr P_𝐱⟩,  ⟨O^{−1}⟩ = (1/(3N₁N₂N₃)) ∑_𝐱 ⟨tr P_𝐱^{−1}⟩.

The value of κ is set to be 0.12. The lattice used in our numerical tests is N₀ = N₁ = N₂ = N₃ = 4, and the Langevin time step is Δt = 2 × 10^{−…}. Again, at every Langevin step, we perform three iterations for the gradient descent method and only one iteration for the alternating descent method. Since the lattice size in each direction is small, we expect that the gradient descent method and the alternating descent method have similar efficiency, while the alternating descent method still holds the advantage of being parameter-free in each iteration. Moreover, on average, the gradient descent method spends 8.… times as much time on gauge cooling as the alternating descent method.

For β = 3.0, we start to take samples every 50 steps from t = 2 to t = 10 for all values of μ. For β = 5.0, for each μ we start 8 parallel complex Langevin processes, and for each of them, we take one sample every 50 steps from t = 2 to t = 4, so that in total, the same number of samples as in the case β = 3.0 is collected. These results can be cross-checked with the analytical results in [23], and the bell-shaped curves agree with the results computed in [26]. As expected, the results of both methods agree well with each other; it is worth mentioning that for the gradient descent method, three iterations are applied to solve the optimization problem after every time step, while for the alternating descent method, only one iteration is applied. Fig. 5 shows the evolution of ΔF. Our one-dimensional examples in Section 4.1 suggest similar behavior for the two methods, as is generally true in our numerical tests. However, a difference can be observed in the case β = 5.0, μ = 1.5.

[Figure 4 about here: ⟨O⟩ and ⟨O^{−1}⟩ as functions of μ, computed with the alternating descent method (left) and the gradient descent method (right).]
Figure 4. The expectation values of the observables with β = 5.0.

[Figure 5 about here: the evolution of ΔF for β = 3.0, μ = 1.2 (left) and β = 5.0, μ = 1.5 (right).]

Figure 5. The evolution of ΔF with different gauge cooling techniques.

5. Conclusion
We have introduced a new gauge cooling technique called the alternating descent method, which helps effectively find the gauge transformation that pulls the complexified gauge field closer to the unitary field. The key properties of this method include:

• No parameter needs to be chosen in each iteration.
• It is easy to implement, since the cooling step is carried out site by site.
• The objective function declines fast, especially in the first few steps.
• The performance of the method is stable for various numbers of link variables.

Our numerical tests show that when the number of link variables is small, its performance is comparable to the gradient descent method with a carefully chosen step length; when the lattice size gets larger, the alternating descent method becomes increasingly advantageous. The underlying mechanism of the alternating descent method needs to be further studied, and we expect more applications to be carried out in our future work.
References

[1] G. Aarts, L. Bongiovanni, E. Seiler, D. Sexty, and I.-O. Stamatescu. Controlling complex Langevin dynamics at finite density.
Eur. Phys. J. A , 49:89, 2013.
[2] G. Aarts, E. Seiler, and I.-O. Stamatescu. Complex Langevin method: When can it be trusted?
Phys. Rev. D, 81:054508, 2010.
[3] G. Aarts and I.-O. Stamatescu. Stochastic quantization at finite chemical potential.
J. High. Ener. Phys., 09:018, 2008.
[4] C. R. Allton, S. Ejiri, S. J. Hands, O. Kaczmarek, F. Karsch, E. Laermann, and C. Schmidt. Equation of state for two-flavor QCD at nonzero chemical potential.
Phys. Rev. D, 68(1):014507, 2003.
[5] F. Attanasio and B. Jäger. Dynamical stabilisation of complex Langevin simulations of QCD.
Eur. Phys. J. C, 79:16, 2019.
[6] J. Bloch, J. Glesaaen, J. J. M. Verbaarschot, and S. Zafeiropoulos. Complex Langevin simulation of a random matrix model at nonzero chemical potential.
J. High. Ener. Phys. , 03:015, 2018.[7] R. Brent. An algorithm with guaranteed convergence for finding a zero of a function.
Comput. J. , 14:422–425, 1971.[8] Z. Cai, Y. Di, and X. Dong. How does gauge cooling stabilize complex Langevin?
Commun. Comput. Phys., 27:1344–1377, 2020.
[9] M. Cristoforetti, F. Di Renzo, A. Mukherjee, and L. Scorzato. Monte Carlo simulations on the Lefschetz thimble: Taming the sign problem.
Phys. Rev. D, 88(5):051501, 2013.
[10] M. Cristoforetti, F. Di Renzo, and L. Scorzato. New approach to the sign problem in quantum field theories: High density QCD on a Lefschetz thimble.
Phys. Rev. D , 86(7):074506, 2012.[11] P. de Forcrand and O. Philipsen. The QCD phase diagram for small densities from imaginary chemical potential.
Nucl. Phys. B, 642(1):290–306, 2002.
[12] Z. Fodor, S. D. Katz, D. Sexty, and C. Török. Complex Langevin dynamics for dynamical QCD at nonzero chemical potential: A comparison with multiparameter reweighting.
Phys. Rev. D, 92:094516, 2015.
[13] C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan. A dual coordinate descent method for large-scale linear SVM. In
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 408–415, 2008.
[14] J. Klauder. A Langevin approach to fermion and quantum spin correlation functions.
J. Phys. A: Math. Gen. , 16:L317–319,1983.[15] J. Klauder. Stochastic quantization.
Acta Physica Austriaca, Suppl., XXV:251–281, 1983.
[16] J. Klauder. Coherent-state Langevin equations for canonical quantum systems with applications to the quantized Hall effect.
Phys. Rev. A, 29:2036–2047, 1984.
[17] P. Lancaster and L. Rodman.
Algebraic Riccati Equations. Oxford University Press, 1995.
[18] K. Nagata, J. Nishimura, and S. Shimasaki. Argument for justification of the complex Langevin method and the condition for correct convergence.
Phys. Rev. D , 94:114515, 2016.[19] K. Nagata, J. Nishimura, and S. Shimasaki. Gauge cooling for the singular-drift problem in the complex Langevin method— a test in Random Matrix Theory for finite density QCD.
J. High. Ener. Phys. , 7:73, 2016.[20] K. Nagata, J. Nishimura, and S. Shimasaki. Justification of the complex Langevin method with the gauge cooling procedure.
Prog. Theor. Exp. Phys. , 2016(1):013B01, 2016.[21] Y. Nesterov. Efficiency of coordinate descent methods on huge-scale optimization problems.
SIAM Journal on Optimization ,22(2):341–362, 2012.[22] G. Parisi. On complex probabilities.
Phys. Lett. B, 131:393–395, 1983.
[23] R. De Pietri, A. Feo, E. Seiler, and I.-O. Stamatescu. A model for QCD at high density and large quark mass.
Phys. Rev.D , 76:114501, 2007.[24] M. Scherzer, E. Seiler, D. Sexty, and I.-O. Stamatescu. Complex Langevin and boundary terms.
Phys. Rev. D, 99:014512, 2019.
[25] M. Scherzer, E. Seiler, D. Sexty, and I.-O. Stamatescu. Controlling complex Langevin simulations of lattice models by boundary term analysis.
Phys. Rev. D , 101(1):014501, 2020.[26] E. Seiler, D. Sexty, and I.-O. Stamatescu. Gauge cooling in complex Langevin for lattice QCD with heavy quarks.
Phys.Lett. B , 723:213–216, 2013.[27] D. Sexty. Simulating full QCD at nonzero density using the complex Langevin equation.
Phys. Lett. B , 729:108–111, 2014.[28] Z. Wang, Y. Li, and J. Lu. Coordinate descent full configuration interaction.
J. Chem. Theory Comput., 15:3558–3569, 2019.

(Xiaoyu Dong)
University of Chinese Academy of Sciences, Beijing 100049, China
E-mail address: [email protected]

(Zhenning Cai) Department of Mathematics, National University of Singapore, Level 4, Block S17, 10 Lower Kent Ridge Road, Singapore 119076
E-mail address: [email protected]

(Yana Di) Institute of Mathematical Research, Beijing Normal University & UIC, Zhuhai 519087, China
E-mail address: