Alternating Descent Method for Gauge Cooling of Complex Langevin Simulations
XIAOYU DONG, ZHENNING CAI, AND YANA DI
Abstract.
We study the gauge cooling technique for the complex Langevin method applied to computations in lattice quantum chromodynamics. We propose a new solver for the minimization problem that optimizes the gauge, which involves no parameters in each iteration and shows better performance than the classical gradient descent method, especially when the lattice size is large. Two numerical tests are carried out to show the effectiveness of the new algorithm.
Keywords. complex Langevin method, gauge cooling, alternating descent method.

1. Introduction
Lattice QCD is the standard nonperturbative tool for quantum chromodynamics (QCD). The link variables U_{x,μ} ∈ SU(3) stand for the gluons between lattice points x and x + μ̂, where x = (t, x₁, x₂, x₃) ∈ ℕ⁴. Using {U} to represent the collection of all link variables U_{x,μ}, one can compute the expectation value of any observable O by

⟨O⟩ = (1/Z) ∫ [dU] O({U}) e^{−S({U})},

where the partition function Z is given by

Z = ∫ [dU] e^{−S({U})},

and the total action S({U}) can be written as S_B({U}) + S_F({U}), with S_B({U}) being the bosonic action and S_F({U}) being the negative logarithm of the fermion determinant. For nonzero chemical potential, the fermion determinant det M may be complex, leading to the numerical sign problem when applying the Monte Carlo method to compute ⟨O⟩.

Several methods have been proposed to mitigate the numerical sign problem, including analytical methods such as analytical continuation and series expansion [11, 4], the Lefschetz thimble method [10, 9], and the complex Langevin method (CLM) [15, 14, 16, 22]. The CLM is a straightforward generalization of the real Langevin method, which extends the dynamics on the SU(3) field to dynamics on the SL(3, ℂ) field. Despite its formal legitimacy [2], the results may diverge or even converge to wrong values [20, 24], and this often occurs when large excursions occur in the complex Langevin dynamics. Recently in [24], by a detailed analysis of the boundary terms, the underlying mechanism of the wrong convergence has surfaced, and the result therein is further developed in [25] to find corrections of the numerical values. Nevertheless, the most straightforward way to stabilize the dynamics is to pull the fields closer to SU(3), which can be done by gauge cooling [26] or dynamical stabilization [5]. The gauge cooling method utilizes the redundant degrees of freedom in the gauge field, and it does not introduce any bias into the expectation values.
Such a method has been formally justified in [20, 18], and it has been successfully applied to a number of problems [27, 19]. Basically, the gauge cooling method requires choosing an optimal gauge for the complexified field, which minimizes the distance between the current field and SU(3). In one dimension, it has been shown in [8] that the problem can be solved analytically, since the field is essentially equivalent to the one-link model. While the analytical solution is unavailable in the multi-dimensional case, the minimization problem is usually solved by the gradient descent method, for which the step length has to be chosen carefully to achieve a satisfactory convergence rate. Some techniques to choose the step length have been discussed in [1, 6].

In this paper, we focus on the gauge cooling problem and propose a more efficient method to solve the optimization problem, which does not require delicate selection of the step length. The rest of this paper is organized as follows. In Section 2, we give a brief introduction to the complex Langevin method and the gauge cooling technique. Our alternating descent method is presented in Section 3. Section 4 includes two numerical experiments to test the new method. Finally, some concluding remarks are given in Section 5.

Date: August 18, 2020. Zhenning Cai's work was supported by the Academic Research Fund of the Ministry of Education of Singapore under grant No. R-146-000-291-114. Yana Di's work was supported by the National Natural Science Foundation of China No. 11771437. The computations were partly done on the high performance computers of the State Key Laboratory of Scientific and Engineering Computing, Chinese Academy of Sciences.
2. Complex Langevin method and gauge cooling
In this section, we provide a brief review of the complex Langevin method and the method of gauge cooling. To be more general, we consider the SU(n) gauge theory with {U} ∈ [SU(n)]^{4N₀N₁N₂N₃}, where N₀ is the number of lattice points in the time direction and N₁, N₂, N₃ are those in the spatial directions. Define

G = {x = (t, x₁, x₂, x₃) : t = 1, ⋯, N₀; x₁ = 1, ⋯, N₁; x₂ = 1, ⋯, N₂; x₃ = 1, ⋯, N₃},

so that {U} can be represented by {U_{x,μ} : x ∈ G; μ = 0, 1, 2, 3}. We assume periodic boundary conditions for {U}. Let w_{a,x,μ}, for a = 1, ⋯, n²−1, x ∈ G and μ = 0, 1, 2, 3, be independent Wiener processes. The complex Langevin equation reads

(1) dU_{x,μ} = −∑_{a=1}^{n²−1} iλ_a [U_{x,μ} D_{a,x,μ} S({U}) dt + U_{x,μ} ∘ dw_{a,x,μ}], ∀x ∈ G, ∀μ = 0, 1, 2, 3,

where λ_a, a = 1, ⋯, n²−1, are the traceless Hermitian generators of the SU(n) group satisfying

tr(λ_a λ_b) = 2δ_{ab}, ∀a, b = 1, ⋯, n²−1,

and D_{a,x,μ} denotes the left Lie derivative operator defined as

D_{a,x,μ} f({U}) = lim_{ε→0} [f({U^ε}) − f({U})]/ε

with {U^ε} being the set of the matrices U^ε_{y,ν} = exp(iεδ_{xy}δ_{μν}λ_a) U_{y,ν} for y ∈ G, ν = 0, 1, 2, 3, a = 1, ⋯, n²−1.
Here the field {U^ε} is actually the perturbation of {U} which only replaces the single link U_{x,μ} by exp(iελ_a)U_{x,μ}. In practice, we apply the Euler–Maruyama discretization of the equation (1) for time evolution:

(2) U_{x,μ} ← exp(−∑_{a=1}^{n²−1} iλ_a (D_{a,x,μ}S Δt + η_{a,x,μ}√Δt)) U_{x,μ}, ∀x ∈ G, μ = 0, 1, 2, 3,

where Δt is the time step and each η_{a,x,μ} is normally distributed with mean 0 and variance 2.

The complex Langevin method may diverge due to insufficient decay of the probability density function at infinity, which is often owing to too much excursion away from a unitary field. The gauge cooling method uses the redundant degrees of freedom in the gauge theory to restrict such excursions. Specifically, the observable O and the complex action S are invariant under the gauge transformation defined as

(3) Ũ_{x,μ} = V_x^{−1} U_{x,μ} V_{x+μ̂}, ∀x ∈ G, ∀μ = 0, 1, 2, 3,

where V_x ∈ SL(n, ℂ) for any x ∈ G, and the V_x also satisfy the periodic boundary conditions. Define {Ũ} as the collection of all the Ũ_{x,μ}. The gauge cooling method applies such a transformation after every time step. By choosing V_x appropriately, the drift away from the unitary field can be mitigated. The distance between {U} and [SU(n)]^{4N₀N₁N₂N₃} can be measured by the norm

‖{U}‖ = (∑_{μ=0}^{3} ∑_{x∈G} ‖U_{x,μ}‖_F²)^{1/2}, ∀{U} ∈ [ℂ^{n×n}]^{4N₀N₁N₂N₃},

based on the Frobenius norm of matrices defined by ‖U_{x,μ}‖_F = [tr(U_{x,μ}U_{x,μ}†)]^{1/2}. It can be proved that ‖{U}‖² ⩾ 4nN₀N₁N₂N₃ for any {U} ∈ [SL(n, ℂ)]^{4N₀N₁N₂N₃}, and the equality holds if and only if {U} ∈ [SU(n)]^{4N₀N₁N₂N₃} [8].
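To make the distance concrete, here is a minimal NumPy sketch of the squared norm ‖{U}‖² together with a check of its lower bound (function names are ours, not from the paper):

```python
import numpy as np

def dist_sq(links):
    """||{U}||^2: sum of squared Frobenius norms tr(U U†) over all links.
    It equals n per link exactly when every link is unitary, and is larger otherwise."""
    return sum(float(np.real(np.trace(U @ U.conj().T))) for U in links)

def random_sl(n, rng, scale=0.2):
    """A random SL(n, C) matrix: normalize a near-identity complex matrix to det = 1."""
    M = np.eye(n) + scale * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    return M / np.linalg.det(M) ** (1.0 / n)

rng = np.random.default_rng(0)
links = [random_sl(3, rng) for _ in range(8)]
print(dist_sq([np.eye(3)] * 8))   # unitary links attain the minimum: 3 per link, 24 here
print(dist_sq(links))             # strictly larger for non-unitary SL(3, C) links
```

For unitary links the bound is attained with equality; any excursion into SL(3, ℂ) strictly increases the functional, which is what gauge cooling tries to undo.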
Once the link variables {U} are updated by one time step, the gauge cooling step requires us to solve the minimization problem

(4) argmin_{{Ũ}∈G({U})} F({Ũ}),

where F({Ũ}) = ‖{Ũ}‖², and G({U}) is the set of all fields which are equivalent to {U} under the gauge transformation:

G({U}) = { {Ũ} : Ũ_{x,μ} = V_x^{−1} U_{x,μ} V_{x+μ̂}, V_x ∈ SL(n, ℂ) }.

In [26], this problem is solved iteratively by the gradient descent method:

(5) Ũ^{(0)}_{x,μ} = U_{x,μ}, Ũ^{(k+1)}_{x,μ} = exp(∑_{a=1}^{n²−1} sV^{(k)}_{a,x} λ_a) Ũ^{(k)}_{x,μ} exp(−∑_{a=1}^{n²−1} sV^{(k)}_{a,x+μ̂} λ_a),

where

V^{(k)}_{a,x} = −∑_{μ=0}^{3} tr{λ_a [Ũ^{(k)}_{x,μ}(Ũ^{(k)}_{x,μ})† − (Ũ^{(k)}_{x−μ̂,μ})† Ũ^{(k)}_{x−μ̂,μ}]},

and s stands for the step length. Since the optimization problem has to be solved at every time step, an accurate line search method may be too expensive for gauge cooling. The simplest choice is to take s proportional to Δt [26], while it is unclear how to select a suitable coefficient. In [1], the authors introduced the method of adaptive gauge cooling, which sets s to be larger when the current solution is farther away from the minimum point, and controls the divergence by reducing s when the cooling drift is large. In [6], the authors find s numerically by applying Brent's method [7] and adding an upper bound to avoid instability.

3. Alternating descent method for gauge cooling
To avoid the intricate selection of the step length, we introduce in this section a new numerical method for gauge cooling, which is free of parameters in each iteration. The general idea of this method is to decompose all the variables V_x into two subsets, and these two subsets of variables are optimized in turns. Below we refer to this method as the alternating descent method.

To begin with, we define two index sets:

G_e = {x = (t, x₁, x₂, x₃) ∈ G : (t + x₁ + x₂ + x₃) is even},
G_o = {x = (t, x₁, x₂, x₃) ∈ G : (t + x₁ + x₂ + x₃) is odd}.

In order to be consistent with the periodic boundary condition, we require that N₀, N₁, N₂ and N₃ be even numbers. Each iteration of the alternating descent method then contains two steps:

(1) Let Ũ^{(k+1/2)}_{x,μ} = (V^{(k)}_x)^{−1} Ũ^{(k)}_{x,μ} and Ũ^{(k+1/2)}_{x−μ̂,μ} = Ũ^{(k)}_{x−μ̂,μ} V^{(k)}_x for all x ∈ G_e, where V^{(k)}_x, x ∈ G_e, are chosen to minimize F({Ũ^{(k+1/2)}}).
(2) Let Ũ^{(k+1)}_{x,μ} = (V^{(k+1/2)}_x)^{−1} Ũ^{(k+1/2)}_{x,μ} and Ũ^{(k+1)}_{x−μ̂,μ} = Ũ^{(k+1/2)}_{x−μ̂,μ} V^{(k+1/2)}_x for all x ∈ G_o, where V^{(k+1/2)}_x, x ∈ G_o, are chosen to minimize F({Ũ^{(k+1)}}).

The operation in each step finds the optimal gauge transformation with the transformation variables V^{(k)}_x (or V^{(k+1/2)}_x) fixed to be the identity on odd (or even) lattice points. To implement the above algorithm, it remains only to find V^{(k)}_x for x ∈ G_e and V^{(k+1/2)}_x for x ∈ G_o. Due to the periodic boundary condition, the algorithms for finding both sets of matrices are nearly identical, and below we focus only on the algorithm to find V^{(k)}_x, x ∈ G_e. For simplicity, we omit the superscript k, and thus the optimization problem to be solved is

(6) argmin_{V_x ∈ SL(n,ℂ), x ∈ G_e} ∑_{μ=0}^{3} ∑_{x∈G_e} tr(V_x^{−1} Ũ_{x,μ} Ũ_{x,μ}† V_x^{−†} + Ũ_{x−μ̂,μ} V_x V_x† Ũ_{x−μ̂,μ}†).

Note that the objective function is exactly the value of F(·) for the link variables after the gauge transformation.
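The even–odd splitting above is a standard checkerboard decomposition; a short sketch (helper name is ours) verifies that, with all extents even, every link joins the two sublattices, which is why the V_x within one sublattice do not interact:

```python
from itertools import product

def even_odd_sites(shape):
    """Split the sites of a periodic lattice into the checkerboard sets G_e and G_o."""
    G_e, G_o = [], []
    for x in product(*(range(n) for n in shape)):
        (G_e if sum(x) % 2 == 0 else G_o).append(x)
    return G_e, G_o

G_e, G_o = even_odd_sites((4, 4, 4, 4))
assert len(G_e) == len(G_o) == 128
# With even extents, every link (x, x + mu-hat) joins an even site to an odd one,
# also across the periodic boundary, so the subproblems within one sublattice decouple.
for x in G_e:
    for mu in range(4):
        y = list(x); y[mu] = (y[mu] + 1) % 4
        assert tuple(y) in G_o
```

If any extent were odd, the wrap-around link would join two sites of equal parity, which is exactly why the method requires N₀, N₁, N₂, N₃ to be even.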
In this problem, the optimization of every V_x can be fully decoupled, meaning that we can solve the minimization problem

(7) argmin_{V_x ∈ SL(n,ℂ)} F_x(V_x) = ∑_{μ=0}^{3} tr(V_x^{−1} Ũ_{x,μ} Ũ_{x,μ}† V_x^{−†} + Ũ_{x−μ̂,μ} V_x V_x† Ũ_{x−μ̂,μ}†)

for all x ∈ G_e, which makes it feasible to solve (6) exactly. Now we fix x ∈ G_e and try to solve (7). Since V_x ∈ SL(n, ℂ), we can write V_x as

V_x = exp(−∑_{a=1}^{n²−1} V_{a,x} λ_a),

where all V_{a,x} are real numbers, as in the gradient descent method. By straightforward calculation, we have

(8) ∂F_x/∂V_{a,x} = 2 tr[λ_a ∑_{μ=0}^{3} (V_x^{−1} Ũ_{x,μ} Ũ_{x,μ}† V_x^{−†} − V_x† Ũ_{x−μ̂,μ}† Ũ_{x−μ̂,μ} V_x)].

For simplicity, we let

P_x = ∑_{μ=0}^{3} Ũ_{x,μ} Ũ_{x,μ}†,  Q_x = ∑_{μ=0}^{3} Ũ_{x−μ̂,μ}† Ũ_{x−μ̂,μ},

so that the optimal choice of V_{a,x} satisfies

tr[λ_a (V_x^{−1} P_x V_x^{−†} − V_x† Q_x V_x)] = 0, ∀a = 1, ⋯, n²−1.

Since the matrix V_x^{−1} P_x V_x^{−†} − V_x† Q_x V_x is Hermitian, the above equation is equivalent to

(9) V_x^{−1} P_x V_x^{−†} − V_x† Q_x V_x = α_x I,

where α_x ∈ ℝ is an unknown coefficient. Now we further simplify the notation by omitting the variable x, so that the equation (9) can be rewritten as

(10) V V† Q V V† + α V V† = P.

If α is given, this equation is an algebraic Riccati equation [17]. The standard way to solve this equation is to first solve the eigenvalue problem

(11) ( (1/2)αI   Q  )( X )   ( X )
     (   P   −(1/2)αI )( Y ) = ( Y ) Λ,

where X, Y, Λ ∈ ℂ^{n×n} and Λ is a diagonal matrix whose diagonal elements all have positive real parts. Then the solution can be written as V V† = Y X^{−1}. Here Λ only gives half of the eigenvalues of the first matrix in (11), and the other half of the eigenvalues must have negative real parts. The solution contains the parameter α, which can be further determined by det(V V†) = 1.
In our case, we can simplify the problem by expanding (11):

(12) (1/2)αX + QY = XΛ,  PX − (1/2)αY = YΛ,

which yields

(13) QPX = X(Λ − (1/2)αI)(Λ + (1/2)αI) = X(Λ² − (1/4)α²I).

Since P and Q are Hermitian positive definite matrices, their product QP is diagonalizable and all its eigenvalues are real and positive. Thus X can be obtained by solving the smaller eigenvalue problem (13). Here we assume that det X = 1. Suppose ξ₁, ⋯, ξ_n are the eigenvalues of QP. Then

Λ = diag(√(ξ₁ + α²/4), √(ξ₂ + α²/4), ⋯, √(ξ_n + α²/4)).

By the second equation in (12), we have

Y = PX(Λ + (1/2)αI)^{−1}.

Therefore

(14) V V† = Y X^{−1} = PX(Λ + (1/2)αI)^{−1} X^{−1}.

Using the conditions det(V) = 1 and det(X) = 1, we see that

det(P) det(Λ + (1/2)αI)^{−1} = 1.

Therefore α satisfies

(15) ∏_{j=1}^{n} (√(ξ_j + α²/4) + α/2) = det(P).

Note that this equation determines a unique solution, due to the monotonicity of the left-hand side with respect to α. Thus we obtain the following algorithm to solve for V:

1) Eigendecompose the matrix QP to get the eigenvalues ξ₁, ξ₂, ⋯, ξ_n and the eigenvector matrix X with det(X) = 1.
2) Solve the equation (15) to find the value of α.
3) Compute V V† by (14).
4) Find V by taking the matrix square root of (14).

The above algorithm solves the optimization problem (7) exactly, and thus provides the solution of (6). This method turns out to be similar to the coordinate descent method, which updates one variable at a time. The coordinate descent method has wide applications in large-scale optimization problems [21], especially when the subproblem for each variable can be solved exactly [13, 28]. In the following section, we will show by numerical tests that such a method is also highly efficient for the gauge cooling problem.
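Steps 1)–4) fit in a short routine. The sketch below is our reading of the algorithm in pure NumPy, with plain bisection standing in for the scalar root solver in step 2; the test problem P, Q at the end is hypothetical:

```python
import numpy as np

def solve_local_gauge(P, Q, iters=200):
    """Exact minimizer of F_x(V) = tr(V^{-1} P V^{-dag} + V^dag Q V) over Hermitian
    V in SL(n, C): eigendecompose QP, solve eq. (15) for alpha by bisection,
    build V V^dag by eq. (14), and take the Hermitian matrix square root."""
    n = P.shape[0]
    xi, X = np.linalg.eig(Q @ P)      # QP has a real positive spectrum (product of HPD matrices)
    xi = xi.real
    detP = np.linalg.det(P).real

    def g(alpha):                     # eq. (15) residual; monotonically increasing in alpha
        return np.prod(np.sqrt(xi + alpha ** 2 / 4) + alpha / 2) - detP

    lo, hi = -1.0, 1.0
    while g(lo) > 0:
        lo *= 2
    while g(hi) < 0:
        hi *= 2
    for _ in range(iters):            # bisection for the unique root of eq. (15)
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    alpha = 0.5 * (lo + hi)

    Lam = np.sqrt(xi + alpha ** 2 / 4)
    W = P @ X @ np.diag(1.0 / (Lam + alpha / 2)) @ np.linalg.inv(X)   # V V^dag, eq. (14)
    W = 0.5 * (W + W.conj().T)        # symmetrize against round-off
    w, U = np.linalg.eigh(W)
    return (U * np.sqrt(w)) @ U.conj().T                              # Hermitian square root

# a hypothetical single-site test problem: P, Q Hermitian positive definite
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
P = A @ A.conj().T + np.eye(3)
Q = B @ B.conj().T + np.eye(3)
V = solve_local_gauge(P, Q)
```

The returned V is Hermitian with det V = 1, the stationarity condition (9) is satisfied up to round-off, and F_x(V) never exceeds F_x(I).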
4. Numerical Examples

We now apply the alternating descent method to two numerical examples to demonstrate its efficiency. In both examples, the Euler–Maruyama method (2) is applied to solve the complex Langevin dynamics. Our method is compared with the gradient descent method, and when the alternating descent method is used, only one iteration is applied after each time step. The two examples shown below are, respectively, a one-dimensional Polyakov loop model and a four-dimensional model for heavy quark QCD.

4.1. Polyakov loop model.
For the one-dimensional Polyakov loop model, we denote the link variables by U_k, k = 1, 2, ⋯, N. Following [26], we define the action S by

S({U}) = −tr(β₁ U₁ ⋯ U_N + β₂ U_N^{−1} ⋯ U₁^{−1}),

and choose β₁ = β + κe^{μ} and β₂ = β̄ + κe^{−μ} with β = 2, κ = 0.…, and μ = 1. The model is applied to the SU(3) theory, so that λ_a, a = 1, ⋯, 8, are the Gell-Mann matrices. In this example, α_x in (9) always equals zero, since both V_x^{−1} P_x V_x^{−†} and V_x† Q_x V_x are Hermitian, positive definite, and have determinant 1.

In our numerical experiments, we choose the Langevin time step to be Δt = 2 × 10^{−…} and the simulation ends at t = 10. Note that in the one-dimensional case, the exact solution of the optimization problem (4) can be obtained analytically, and the result is

U_k ← (ΛΛ†)^{1/(2N)}, k = 1, ⋯, N−1,  U_N ← Λ(ΛΛ†)^{(1−N)/(2N)},

where Λ is a diagonal matrix whose diagonal entries are the eigenvalues of U_N U₁ ⋯ U_{N−1}. For details, we refer the readers to a recent work [8]. Thus, the following four methods will be tested in this example:

• Complex Langevin with no gauge cooling.
• Complex Langevin with gauge cooling implemented by the gradient descent method [26].
• Complex Langevin with optimal gauge cooling [8].
• Complex Langevin with gauge cooling implemented by the alternating descent method introduced in Section 3.
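The closed-form cooling above can be sketched directly (the exponents are our reconstruction of the formula from [8]; names are ours). The cooled chain is diagonal, keeps the Polyakov loop spectrum, and never increases ‖{U}‖²:

```python
import numpy as np
from functools import reduce

def optimal_cooling_1d(links):
    """Closed-form gauge cooling of a periodic 1-D chain: spread the eigenvalues of
    the Polyakov loop evenly over the links (exponents reconstructed from [8])."""
    N = len(links)
    loop = reduce(np.matmul, links)        # same spectrum as U_N U_1 ... U_{N-1}
    lam = np.linalg.eigvals(loop)          # diagonal entries of Lambda
    mod = np.abs(lam) ** (1.0 / N)         # diagonal of (Lambda Lambda†)^{1/(2N)}
    common = np.diag(mod).astype(complex)  # U_k for k = 1, ..., N-1
    last = np.diag(lam * mod ** (1 - N))   # U_N = Lambda (Lambda Lambda†)^{(1-N)/(2N)}
    return [common.copy() for _ in range(N - 1)] + [last]

# a hypothetical chain of random SL(3, C) links
rng = np.random.default_rng(0)
raw = []
for _ in range(6):
    M = np.eye(3) + 0.3 * (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
    raw.append(M / np.linalg.det(M) ** (1.0 / 3))
cooled = optimal_cooling_1d(raw)
```

The product of the cooled links equals Λ exactly, so every loop observable tr[(U₁⋯U_N)^k] is unchanged, while ‖{U}‖² drops to N ∑ᵢ |λᵢ|^{2/N}.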
[Figure 1 about here: four panels (N = 4, 8, 16, 32) plotting ΔF against the Langevin time t, comparing optimal gauge cooling, no gauge cooling, gradient descent, and alternating descent; the N = 32 panel additionally shows gradient descent with 3 steps at s = Δt, 20 steps at s = Δt, and 3 steps at s = 100Δt.]
Figure 1. The evolution of ΔF with different gauge cooling techniques.

For the gradient descent method, the update of the link variables (5) can be formulated as

V_{a,k} ← −tr{λ_a [U_k U_k† − U_{k−1}† U_{k−1}]},
U_k ← exp(∑_{a=1}^{8} sV_{a,k} λ_a) U_k exp(−∑_{a=1}^{8} sV_{a,k+1} λ_a),

where s stands for the step length, which is set as s = Δt. This iteration is applied three times after every Langevin step. For the alternating descent method, only one iteration per time step is applied, as mentioned at the beginning of this section.

Fig. 1 shows the evolution of the norm ‖{U}‖ over the Langevin time. The vertical axis ΔF is defined by ΔF = F({U}) − 3N, which vanishes only if {U} ∈ [SU(3)]^N. Because of the randomness, each curve in the figures only shows the result of one particular run in our simulations. From the figures, we can observe without surprise that if no gauge cooling is applied, the computations are forced to terminate for N = 8, 16, 32 due to overflow, and even for N = 4, the norm of the links increases too quickly. For N = 4 and 8, the performance of the gradient descent method is excellent due to its good agreement with the optimal gauge cooling. But the disparity between these two methods increases gradually as N gets larger, as is obvious for N = 16 and 32. On the contrary, in all cases, the alternating descent method performs almost as well as the optimal gauge cooling. In general, we see that the alternating descent method is always competitive, and when the number of lattice points is large, it can significantly outperform the gradient descent method.

To demonstrate the efficiency of the method, we have also listed in Table 1 the computational time in our experiments. In Table 1, both the total computational time and the time spent only on gauge cooling are provided, where for the gradient descent method, three iterations are applied at every time step, while only one iteration per time step is applied for the alternating descent method. On average, the computational time per iteration is similar for both methods, and the alternating descent method is faster due to its ability to use fewer iterations. By comparing the two columns for each method, it can be seen that gauge cooling takes a significant portion of the total computational time, so that a better algorithm for the optimization can effectively reduce the computational cost.
Table 1. Computational time (in seconds) of every 10 Langevin steps with gauge cooling. Opt.: Optimal gauge cooling, GD: Gradient descent method, AD: Alternating descent method, g.c.: gauge cooling

         Opt.            GD              AD
N    total   g.c.    total   g.c.    total   g.c.
4    0.598   0.061   1.441   0.890   0.868   0.…
8    ….157   0.071   2.925   1.787   1.745   0.…
16   ….342   0.087   5.889   3.551   3.585   1.…
32   ….044   0.119   12.280  7.151   7.619   2.…

For N = 32, if we increase the number of iterations to 20, the dynamics can be better stabilized, although the value of ΔF is still about one magnitude larger than that of the alternating descent method. However, this also results in a significant increase of the computational cost. By tuning the parameters, we find that choosing the step length s = 100Δt and keeping the number of iterations at 3 can provide a better balance between the computational cost and the efficacy of gauge cooling. Even so, the alternating descent method is still superior in this case.

To verify the validity of these numerical methods, we calculated the expectation values of the observables

O_k({U}) = tr[(U₁U₂⋯U_N)^k]

for k = ±1, ±2, ±3. Samples are taken every 50 steps from t = 1 until t = 10. The results are listed in Table 2, and because the value of ⟨O_k⟩ is real, its imaginary part is omitted. The exact values of the expectation can be obtained by Weyl's integration theorem. For N = 4, although the expectation values can be calculated without gauge cooling, the results are obviously erroneous. For N = 32, the data for the gradient descent method provided in Table 2 are computed using s = Δt with 3 iterations per step, which show large errors especially when |k| is large.

Table 2. Numerical results for the observables ⟨O_k⟩. Opt.: Optimal gauge cooling, GD: Gradient descent method, AD: Alternating descent method, No g.c.: No gauge cooling

             N = 4                       N = 8
k    Exact   Opt.   GD   AD   No g.c.   Opt.   GD   AD   No g.c.
1    …       …      …    …    …         …      …    …    …
−1   …       …      …    …    …         …      …    …    …
2    …       …      …    …    …         …      …    …    …
−2   …       …      …    …    …         …      …    …    …
3    …       …      …    …    …         …      …    …    …
−3   …       …      …    …    …         …      …    …    …

             N = 16                      N = 32
k    Exact   Opt.   GD   AD   No g.c.   Opt.   GD   AD   No g.c.
1    …       …      …    …    …         …      …    …    …
−1   …       …      …    …    …         …      …    …    …
2    …       …      …    …    …         …      …    …    …
−2   …       …      …    …    …         …      …    …    …
3    …       …      …    …    …         …      …    …    …
−3   …       …      …    …    …         …      …    …    …

To demonstrate the efficiency of the alternating descent method more clearly, we remove the complex Langevin dynamics and consider solving the optimization problem (7) directly. The link variables {U} are set to be a random complexified gauge transformation applied to a random SU(3) field, so that one of the
solutions to the problem (7) is the field before the gauge transformation, for which ΔF = 0.

[Figure 2 about here: ΔF against the number of gauge cooling steps for N = 4, 32, 256, and 1024; each panel compares the alternating descent method with the gradient descent method at various step lengths, from s = 0.4Δt up to s = 300Δt (N = 4), s = 150Δt (N = 32), and s = 20Δt (N = 256 and 1024).]

Figure 2. The evolution of ΔF with increasing gauge cooling steps.

Below we compare the performance of the alternating descent method and that of the gradient descent method with various step lengths s. The numbers of link variables are chosen to be 4, 32,
256, and 1024, and the results are plotted in Fig. 2. Note that the gradient descent method may diverge when the step length is too large, and in our numerical test for every N, we have tried to maximize the step length s while maintaining the stability of the iterative process. In all test cases, the alternating descent method shows its superiority in the cooling effect. In particular, it can efficiently bring down the value of ΔF by two to three orders of magnitude in the first few steps, and this effect is independent of the number of link variables. Such a property is highly desirable in the simulation of lattice QCD, for it allows us to reduce the computational cost of gauge cooling.

4.2. Heavy quark QCD.
In this example, we consider QCD at finite chemical potential as described in [3]. Denoting any x ∈ G by x = (t, 𝐱), the fermion determinant is given by

det M(μ) = ∏_𝐱 det(I + C P_𝐱)² det(I + C′ P_𝐱^{−1})²,

where C = [2κ exp(μ)]^{N₀}, C′ = [2κ exp(−μ)]^{N₀}, and

P_𝐱 = ∏_{t=1}^{N₀} U_{(t,𝐱),0}.

Define the plaquette

U_{x,μν} = U_{x,μ} U_{x+μ̂,ν} U_{x+ν̂,μ}^{−1} U_{x,ν}^{−1}.

[Figure 3 about here: ⟨O⟩ and ⟨O^{−1}⟩ as functions of μ, computed with the alternating descent method (left) and the gradient descent method (right).]
Figure 3. The expectation values of the observables with β = 3.0.

The total action is S({U}) = −ln(det M) + S_B, where

S_B = −β ∑_{x∈G} ∑_{μ<ν} [ (1/6)(tr U_{x,μν} + tr U_{x,μν}^{−1}) − 1 ].

We refer the readers to [3] for the Lie derivatives of this action. The same example has also been computed using the complex Langevin method in [23, 26, 12].

In our simulation, we choose the chemical potential μ ranging from 0 to 4, with two values of the coupling, β = 3.0 and β = 5.0. The observables of interest are

⟨O⟩ = (1/(3N₁N₂N₃)) ∑_𝐱 ⟨tr P_𝐱⟩,  ⟨O^{−1}⟩ = (1/(3N₁N₂N₃)) ∑_𝐱 ⟨tr P_𝐱^{−1}⟩.

The value of κ is set to be 0.12. The lattice used in our numerical tests is N₀ = N₁ = N₂ = N₃ = 4, and the Langevin time step is Δt = 2 × 10^{−…}. Again, at every Langevin step, we perform three iterations for the gradient descent method and only one iteration for the alternating descent method. Since the lattice size in each direction is small, we expect that the gradient descent method and the alternating descent method have similar efficiency, while the alternating descent method still holds the advantage of being parameter-free in each iteration. Moreover, on average, the gradient descent method spends 8.… times as much time on gauge cooling as the alternating descent method.

For β = 3.0, we start to take samples every 50 steps from t = 2 to t = 10 for all values of μ. For β = 5.0, for each μ we start 8 parallel complex Langevin processes, and for each of them, we take one sample every 50 steps from t = 2 to t = 4, so that in total, the same number of samples as in the case β = 3.0 is collected. These results can be cross-checked with the analytical results in [23], and the bell-shaped curves agree with the results computed in [26]. As expected, the results of both methods agree well with each other; it is worth mentioning that for the gradient descent method, three iterations are applied to solve the optimization problem after every time step, while for the alternating descent method, only one iteration is applied. Fig. 5 shows the evolution of ΔF. Our one-dimensional examples in Section 4.1 suggest similar behavior for the two methods, as is generally true in our numerical tests. However, a difference can be observed in the case β = 5.0, μ = 1.5.

[Figure 4 about here: ⟨O⟩ and ⟨O^{−1}⟩ as functions of μ, computed with the alternating descent method (left) and the gradient descent method (right).]
Figure 4. The expectation values of the observables with β = 5.0.

[Figure 5 about here: the evolution of ΔF for β = 3.0, μ = 1.2 (left) and β = 5.0, μ = 1.5 (right).]

Figure 5. The evolution of ΔF with different gauge cooling techniques.

5. Conclusion
We have introduced a new gauge cooling technique called the alternating descent method, which helps effectively find the gauge transformation that pulls the complexified gauge field closer to the unitary field. The key properties of this method include:

• No parameter needs to be chosen in each iteration.
• It is easy to implement, since the cooling step is carried out site by site.
• The objective function declines fast, especially in the first few steps.
• The performance of the method is stable for various numbers of link variables.

Our numerical tests show that when the number of link variables is small, its performance is comparable to the gradient descent method with a carefully chosen step length; when the lattice size gets larger, the alternating descent method becomes increasingly advantageous. The underlying mechanism of the alternating descent method needs to be further studied, and we expect more applications to be carried out in our future work.
References

[1] G. Aarts, L. Bongiovanni, E. Seiler, D. Sexty, and I.-O. Stamatescu. Controlling complex Langevin dynamics at finite density.
Eur. Phys. J. A , 49:89, 2013.
[2] G. Aarts, E. Seiler, and I.-O. Stamatescu. Complex Langevin method: When can it be trusted?
Phys. Rev. D, 81:054508, 2010.
[3] G. Aarts and I.-O. Stamatescu. Stochastic quantization at finite chemical potential.
J. High. Ener. Phys., 09:018, 2008.
[4] C. R. Allton, S. Ejiri, S. J. Hands, O. Kaczmarek, F. Karsch, E. Laermann, and C. Schmidt. Equation of state for two-flavor QCD at nonzero chemical potential.
Phys. Rev. D, 68(1):014507, 2003.
[5] F. Attanasio and B. Jäger. Dynamical stabilisation of complex Langevin simulations of QCD.
Eur. Phys. J. C, 79:16, 2019.
[6] J. Bloch, J. Glesaaen, J. J. M. Verbaarschot, and S. Zafeiropoulos. Complex Langevin simulation of a random matrix model at nonzero chemical potential.
J. High. Ener. Phys. , 03:015, 2018.[7] R. Brent. An algorithm with guaranteed convergence for finding a zero of a function.
Comput. J. , 14:422–425, 1971.[8] Z. Cai, Y. Di, and X. Dong. How does gauge cooling stabilize complex Langevin?
Commun. Comput. Phys., 27:1344–1377, 2020.
[9] M. Cristoforetti, F. Di Renzo, A. Mukherjee, and L. Scorzato. Monte Carlo simulations on the Lefschetz thimble: Taming the sign problem.
Phys. Rev. D, 88(5):051501, 2013.
[10] M. Cristoforetti, F. Di Renzo, and L. Scorzato. New approach to the sign problem in quantum field theories: High density QCD on a Lefschetz thimble.
Phys. Rev. D , 86(7):074506, 2012.[11] P. de Forcrand and O. Philipsen. The QCD phase diagram for small densities from imaginary chemical potential.
Nucl. Phys. B, 642(1):290–306, 2002.
[12] Z. Fodor, S. D. Katz, D. Sexty, and C. Török. Complex Langevin dynamics for dynamical QCD at nonzero chemical potential: A comparison with multiparameter reweighting.
Phys. Rev. D, 92:094516, 2015.
[13] C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan. A dual coordinate descent method for large-scale linear SVM. In
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 408–415, 2008.
[14] J. Klauder. A Langevin approach to fermion and quantum spin correlation functions.
J. Phys. A: Math. Gen. , 16:L317–319,1983.[15] J. Klauder. Stochastic quantization.
Acta Physica Austriaca, Suppl., XXV:251–281, 1983.
[16] J. Klauder. Coherent-state Langevin equations for canonical quantum systems with applications to the quantized Hall effect.
Phys. Rev. A, 29:2036–2047, 1984.
[17] P. Lancaster and L. Rodman.
Algebraic Riccati Equations. Oxford University Press, 1995.
[18] K. Nagata, J. Nishimura, and S. Shimasaki. Argument for justification of the complex Langevin method and the condition for correct convergence.
Phys. Rev. D , 94:114515, 2016.[19] K. Nagata, J. Nishimura, and S. Shimasaki. Gauge cooling for the singular-drift problem in the complex Langevin method— a test in Random Matrix Theory for finite density QCD.
J. High. Ener. Phys. , 7:73, 2016.[20] K. Nagata, J. Nishimura, and S. Shimasaki. Justification of the complex Langevin method with the gauge cooling procedure.
Prog. Theor. Exp. Phys. , 2016(1):013B01, 2016.[21] Y. Nesterov. Efficiency of coordinate descent methods on huge-scale optimization problems.
SIAM Journal on Optimization ,22(2):341–362, 2012.[22] G. Parisi. On complex probabilities.
Phys. Lett. B, 131:393–395, 1983.
[23] R. De Pietri, A. Feo, E. Seiler, and I.-O. Stamatescu. A model for QCD at high density and large quark mass.
Phys. Rev.D , 76:114501, 2007.[24] M. Scherzer, E. Seiler, D. Sexty, and I.-O. Stamatescu. Complex Langevin and boundary terms.
Phys. Rev. D, 99:014512, 2019.
[25] M. Scherzer, E. Seiler, D. Sexty, and I.-O. Stamatescu. Controlling complex Langevin simulations of lattice models by boundary term analysis.
Phys. Rev. D , 101(1):014501, 2020.[26] E. Seiler, D. Sexty, and I.-O. Stamatescu. Gauge cooling in complex Langevin for lattice QCD with heavy quarks.
Phys.Lett. B , 723:213–216, 2013.[27] D. Sexty. Simulating full QCD at nonzero density using the complex Langevin equation.
Phys. Lett. B , 729:108–111, 2014.[28] Z. Wang, Y. Li, and J. Lu. Coordinate descent full configuration interaction.
J. Chem. Theory Comput., 15:3558–3569, 2019.

(Xiaoyu Dong)
University of Chinese Academy of Sciences, Beijing 100049, China
E-mail address: [email protected]

(Zhenning Cai) Department of Mathematics, National University of Singapore, Level 4, Block S17, 10 Lower Kent Ridge Road, Singapore 119076
E-mail address: [email protected]

(Yana Di) Institute of Mathematical Research, Beijing Normal University & UIC, Zhuhai 519087, China
E-mail address: