Fast Estimator of jacobians in Monte Carlo Integration on Lefschetz Thimbles
Andrei Alexandru, Gokce Basar, Paulo F. Bedaque, Gregory W. Ridgway, Neill C. Warrington
FFast Estimator of jacobians in Monte Carlo Integration on Lefschetz Thimbles
Andrei Alexandru ∗ Department of PhysicsGeorge Washington UniversityWashington, DC 20052
G¨ok¸ce Ba¸sar, † Paulo F. Bedaque, ‡ Gregory W. Ridgway, § and Neill C. Warrington ¶ Department of PhysicsUniversity of MarylandCollege Park, MD 20742 (Dated: October 15, 2018)A solution to the sign problem is the so-called “Lefschetz thimble approach” where the domain ofintegration for field variables in the path integral is deformed from the real axis to a sub-manifold inthe complex space. For properly chosen sub-manifolds (“thimbles”) the sign problem disappears or isdrastically alleviated. The parametrization of the thimble by real coordinates require the calculationof a jacobian with a computational cost of order O ( V ), where V is proportional to the spacetimevolume. In this note we propose two estimators for this jacobian with a computational cost oforder O ( V ). We discuss analytically the regimes where we expect the estimator to work and shownumerical examples in two different models. I. INTRODUCTION
Lattice regularized path integrals provide a means for analyzing strongly coupled systems where perturbationtheory breaks down. These high dimensional integrals are most efficiently computed through Monte Carlomethods but these methods hinge on the fact that e − S / Z is a positive definite probability distribution ( S isthe action and Z the partition function of the system). Unfortunately this is not the case when the action S iscomplex as it is for many systems of interest. Systems with a complex action include QCD, or most models, atfinite density and the Hubbard model away from half filling. This inability to consider e − S / Z as a probabilitydistribution leads to the notorious “sign problem”. A geometric solution to the sign problem recently putforward is the thimble approach [1]. In this scheme, the path integral is extended to complex fields and theoriginal action is analytically continued to a holomorphic function of complex fields. Picard-Lefschetz theoryshows that the original integral over real fields is equal to a sum of integrals, each of which is computed overa submanifold in the complex space called a thimble. These manifolds can be chosen in such a away that theimaginary part of the action S I is (piecewise) constant, so the previously problematic oscillatory part of theaction factors out of the partition function, leaving a positive definite probability distribution that can beused to sample the thimble. Every implementation of this idea of a quantum field theory thus far is verycomputationally expensive since they all involve, for one reason or another, the transport of a basis of vectorsfrom point to point along the thimble. In [2, 3], the transport of a basis is necessary in order to either makeproposal in the Monte Carlo steps or a step in the Langevin evolution that actually lie on the thimble (anontrivial task since the surface is curved). In the method used in [4] one parametrizes the thimble by pointsin the tangent space to the critical point. The jacobian of this parametrization needs to be computed and itcan be thought of as the volume spanned by the vectors obtained by transporting a unit basis to a givenpoint on the thimble. Regardless of the reason, the computational cost of transporting a basis of the tangentspace to the thimble from one point to another scales as O ( V ) (where V is proportional to the spacetimevolume), which consume the overwhelming majority of the computation resources and quickly renders thecalculation unfeasible expensive as the spacetime volume increases. Therefore, it would be beneficial to havean alternative to this procedure. In the present note, we present two such estimators of the jacobian witha computational cost of the order O ( V ) which tracks the exact jacobian with remarkable precision. We ∗ [email protected] † [email protected] ‡ [email protected] § [email protected] ¶ [email protected] a r X i v : . [ h e p - l a t ] A p r motivate the estimators theoretically and demonstrate through examples that these estimators can be used inMetropolis updates with the difference between the estimators and the true jacobian included by reweigthing.Two different models are used for this purpose. A fermionic model in 0 + 1 dimensions and the relativisticBose gas model in 3 + 1 dimensions at finite density in a small lattice. II. THIMBLES AND THE CONTRACTION METHOD
A critical point z c is a point in complex space where the gradient of the action vanishes: ∂S∂z i (cid:12)(cid:12)(cid:12)(cid:12) z c = 0 , (2.1)where i = 1 , . . . , N . A thimble T associated to the critical point z c is the union of all curves that satisfysteepest descent flow equations dz i dτ = − ∂S∂z i (2.2)approaching z c asymptotically. Here the bar denotes complex conjugation and τ is an auxiliary time parameteralong a trajectory. Expanding the flow eq. (2.2) into its real and imaginary parts, dz Ri dτ = − ∂S R ∂z Ri = ∂S I ∂z Ii ,dz Ii dτ = − ∂S R ∂z Ii = − ∂S I ∂z Ri , (2.3)we find that the flow is the gradient flow of the real part of the action S R and, at the same time, thehamiltonian flow of the “hamiltonian” S I . That means that the flow seeks smaller values of S R but conserves S I . As the thimble is defined as the set of curves that satisfy a first order differential equation, any point ona thimble T lies on one and only one curve that satisfies 2.2 and the asymptotic boundary conditions. Thisfact is exploited in the sampling algorithm we will use.Near z c , the action is well approximated by S ( z ) = S ( z c ) + z i H ij z j where the hessian H ij = ∂ S∂z i ∂z j ( z c )is a symmetric, but not necessarily real matrix. Expanding the quadratic action in terms of its real andimaginary parts, one finds ddt (cid:20) z R z I (cid:21) = − (cid:20) H R − H I − H I − H R (cid:21) (cid:20) z R z I (cid:21) = − H (cid:20) z R z I (cid:21) . (2.4)The reason for separating the system in its real and imaginary is that the matrix appearing on eq. (2.4)is now real and symmetric, so it affords an orthonormal basis of eigenvectors with real eigenvalues. Thenon-zero eigenvalues of this matrix appear in pairs ( λ, − λ ) ; the directions tangent to the thimble correspondto the eigenvectors with a positive eigenvalue and has, therefore, N (real) dimensions.The contraction algorithm [4] is a Metropolis style Monte Carlo integration. The difficulty with applyingany Markov chain or similar approaches to thimble integration is that field configurations lying on the thimblemust be proposed in an accept/reject step. This is not an obvious task as the geometry of the thimble isnot a priori known. To understand how the contraction algorithm handles this geometric constraint, notethe following: first, under the flow, any point on the thimble will at some τ arrive arbitrarily close to thecritical point. Second, since the action always increases as the field configuration moves away from the criticalpoint, asymptotically distant fields are not likely to contribute significantly to any observable. Therefore,incurring only exponentially small errors, a finite (but arbitrarily large) subset of the thimble can be sampledto compute observables. The first fact tells us that after a finite amount of flow T , the entire relevant subset We will assume here that H is not degenerate, that is, that no eigenvalue vanishes. is mapped arbitrarily close to the critical point of the thimble. Since the flowed points are in one-to-onecorrespondence with the original region of the thimble, the flow defines a map between the thimble and thetangent space of the critical point. Near enough to the critical point the thimble can be approximated by itstangent space. Thus, any point in the thimble (or, at least, in the regions of the thimble with substantialsupport in the path integral) is mapped by the flow into a point near the critical point and, consequently, intoa point on the tangent space. Alternatively, a point in the tangent space near the critical point is mappedby the reversed flow to a point approaching the thimble as the flow time grows. This observation can thenbe used to establish a coordinate system on the thimble. Every point z of the thimble is parametrized by apoint ˜ z on the tangent space by the flow by a fixed time T :˜ z (cid:55)→ z ( τ = T ) where dz i dτ = + ∂S∂z i and z (0) = ˜ z . (2.5)Note the plus sign in the equation above due to fact that the (reverse) flow begins near the critical pointand flows outwards. The point ˜ z close to the critical point, in its turn, can be written as a (real) linearcombination of the N “complex eigenvectors” satisfying Hρ ( a ) = λ ( a ) ρ ( a ) 2 :˜ z i = N (cid:88) a =1 c a ρ ( a ) i . (2.6)The utility of the parametrization z = z (˜ z ) rests on the fact that Monte Carlo updates can be easilyconstrained to the thimble by proposing changes that keep the coefficients c i real. In fact, a change in c i , andconsequently, to ˜ z i , leads to a change in the point z reached by the (reverse) flow after a time T . The largerthe value of T the closer to the critical point ˜ z will be and more justified we are in identifying a point on thetangent space to a point in the gaussian region of the thimble.The points on the thimble can then be sampled by sampling the variables ˜ z (or c i ) through, for instance, aMetropolis algorithm. In terms of ˜ z we have, in each thimble (cid:104) O (cid:105) = 1 Z (cid:90) T dz e − S ( z ) O ( z ) = 1 Z (cid:90) G d ˜ z e − S ( z (˜ z )) O ( z (˜ z )) det (cid:18) ∂z∂ ˜ z (cid:19) = 1 Z (cid:90) G d ˜ z e − S eff (˜ z ) O (˜ z ) det (cid:18) ∂z∂ ˜ z (cid:19) (2.7)where S eff (˜ z ) = S ( z (˜ z )) − ln(det (cid:0) ∂z∂ ˜ z (cid:1) (˜ z )), O (˜ z ) = O ( z (˜ z )) and the set G is the subset of the tangent space of z c to which the relevant region of T is mapped under the flow. The jacobian det J ( T ) = det (cid:0) ∂z∂ ˜ z (cid:1) encodeshow volumes stretch and twist between the tangent space of the critical point z c and a point on the thimble.It is a complex quantity depending on the geometry of the thimble and on the particular parametrization wechose. In order to compute det J ( T ), we can take an orthonormal basis of the tangent space like, for instance, ρ ( a ) , and evolve it by the (reverse) flow from ˜ z to z . The equation governing the evolution of a vector bythe flow can be found considering the evolution of two infinitesimally close points z and z connected bya tangent vector v = z − z . The flow of these points define the evolution of the tangent vector which isgoverned by the equation dv i dτ = H ij ( z ( τ )) v j where H ij ( z ) ≡ ∂ S ( z ) ∂z i ∂z j . (2.8)The evolved basis defines a volume in complex space whose determinant is det J ( T ). In summary, J can befound by solving dJdt = HJ, J (0) =
R ,dz i dτ = ∂S∂z i , z (0) = ˜ z, (2.9)where the matrix H ( z ( τ )) is evaluated along the trajectory z ( τ ). The initial condition for jacobian matrix J is the matrix R that has the vector ρ i as the columns, that is R ij ≡ ρ ( i ) j . Note that because the vectors ρ i are The N N-dimensional complex eigenvectors ρ ( a ) can be found from the N ρ ( a ) R , ρ ( a ) I ) of H by combining their real and imaginary parts as ρ = ρ R + iρ I . orthogonal not only in terms of the real scalar product, which is true because the real H matrix is symmetric,but also in terms of complex scalar product . If we normalize these vectors, the matrix R is unitary and itsdeterminant is a phase.The cost of computing the jacobian can be split into two parts. One, is the diagonalization of the Hessian H ( z cr ) defining the tangent space which is O ( V ) operation. This step, however, needs to be done only oncein the beginning of the calculation. The cost of solving the system of differential equation in eq. (2.9) isdominated by the matrix multiplication and is also generically of order O ( V ). In some special models, wherethe action is local (typically purely bosonic models) the Hessian is a sparse matrix and the computationalcost of this step is O ( V ). III. THE ESTIMATORS
In this section we propose the use of two different estimators of J . The first one, W is given bylog J ≈ log W = (cid:90) T dt (cid:48) (cid:88) a ρ ( a ) † H ( t (cid:48) )¯ ρ ( a ) , (3.1)where ρ ( a ) are the complex eigenvectors of the matrix H ( z c ) = ∂ S ( z c ) /∂z i ∂z j with positive eigenvalues. Inthe gaussian region, where the action is well approximated by its quadratic part S ( z ) ≈ S ( z c ) + 12 ( z − z c ) T H ( z c )( z − z c ) , (3.2) W agrees with J . In order to see that we notice that eq. (2.9) has the formal solution J ( t ) = R + tHR + t HHR + t HHHR + · · · , (3.3)Indeed, dJ ( t ) dt = HR + tHHR + t HHHR + · · · = H (cid:20) R + tHR + t HHR + · · · (cid:21) = H ¯ J . (3.4)But, since Hρ ( a ) = λ ( a ) ρ ( a ) , HR = Λ R where Λ = diag( λ (1) , λ (2) , · · · ). Thus, J ( T ) = R + T Λ R + T
2! Λ R + · · · = e T Λ R (3.5)and det J ( T ) = det( e T Λ )det( R ) = e T (cid:80) a λ ( a ) det R . (3.6)On the other hand, again in the quadratic approximation, we havelog W = (cid:90) T dt (cid:48) (cid:88) a ρ ( a ) † H ¯ ρ ( a ) = T (cid:88) a ρ ( a ) † ρ ( a ) λ ( a ) = T (cid:88) a λ ( a ) , (3.7)and we see that these expressions only differ by det R which is a fixed phase.The second estimator of J we propose is W = exp (cid:90) T dt (cid:48) Tr H ( t (cid:48) ) . (3.8) This extra condition in true because Im ρ † i ρ j = 0, due to fact that the vectors iρ i are also eigenvectors of H orthogonal on ρ j . A theoretical justification for its use can be found by noticing that the time-ordered exponential P exp (cid:90) T dt (cid:48) Tr H ( t (cid:48) ) (3.9)is the solution of dJdt = HJ, (3.10)which, up to the complex conjugation of J , is the same as eq. (2.9). Presumably W is a good estimator of J in situations when J is mostly real.The cost of computing W and W is somewhat dependent on the model and it will be discussed below;in practice they are computationally much cheaper than det J . Their usefulness relies on that and on thefact that they track the actual value of det J well. Note, they can be used for Monte Carlo runs as long asthe difference between det J and W , is reweighted with the observable. More precisely, we can write theexpectation value of an observable as: (cid:104)O(cid:105) = 1 Z (cid:90) d ˜ z O det Je − S R = 1 Z (cid:90) d ˜ z O e i Im log det J e − S eff = 1 Z (cid:90) d ˜ z O e i Im log det J e − S eff + S (cid:48) eff e − S (cid:48) eff = 1 Z (cid:48) (cid:90) d ˜ z O e i Im log det J e − ∆ S e − S (cid:48) eff (cid:44) Z (cid:48) (cid:90) d ˜ z e i Im log det J e − ∆ S e − S (cid:48) eff = (cid:104)O e i Im log det J e − ∆ S (cid:105) S (cid:48) eff (cid:14) (cid:104) e i Im log det J e − ∆ S (cid:105) S (cid:48) eff , (3.11)where S eff = S R − Re log det J , S (cid:48) eff = S R − Re log det W or S (cid:48) eff = S R − Re log det W and ∆ S = S eff − S (cid:48) eff . Z and Z (cid:48) are the partition functions for det J × exp( − S R ) and exp( − S (cid:48) eff ) respectively. It is evident fromthe formula above that the efficacy of the method relies on on the quantity exp( i Im log det J ) exp( − ∆ S ) tofluctuate little from one field configuration to the next. Some of this fluctuation comes from exp( i Im log det J )(called the “residual phase”) and is intrinsic to the geometry of the thimble and is unrelated to the estimator.The other factor, exp( − ∆ S ), arises from the use of the estimator. In the next section we present tests of thisidea in specific models. IV. RESULTS
The first model we will consider is a 0 + 1 dimensional version of the Thirring model with (continuous)action given by L Th. = ¯ χ (cid:18) γ ddt + m + µγ (cid:19) χ + g (cid:0) ¯ χγ χ (cid:1) , (4.1)where χ is a two component, time dependent spinor and γ is a Pauli matrix. After discretizing it (usingstaggered fermions) and introducing a bosonic auxiliary variable φ we arrive at the lattice model Z = (cid:34) N (cid:89) t =1 (cid:90) π d ˆ φ t π (cid:35) det De − g (cid:80) Nt =1 (1 − cos ˆ φ t ) ≡ (cid:34) N (cid:89) t =1 (cid:90) π d ˆ φ t π (cid:35) e − S [ ˆ φ ] , (4.2)where the effective action and the explicit form of the discretized Dirac matrix are S [ ˆ φ ] = 12ˆ g N (cid:88) t =1 (1 − cos ˆ φ t ) − log det D [ ˆ φ ] ,D t,t (cid:48) [ ˆ φ ] = 12 (cid:16) e ˆ µ + iφ t δ t +1 ,t (cid:48) − e − ˆ µ − i ˆ φ t (cid:48) δ t − ,t (cid:48) + e − ˆ µ − i ˆ φ t (cid:48) δ t, δ t (cid:48) ,N − e ˆ µ + i ˆ φ t δ t,N δ t (cid:48) , (cid:17) + ˆ m δ t,t (cid:48) . (4.3) (cid:83) FIG. 1: Statistical power for the parameter set N = 8 , ˆ m = 1 , ˆ µ = 1 , ˆ g = 1 / N = β/a is an even number that denotes the number of lattice sites related to the inverse temperatureof the system β , and all the dimensionful quantities, m , g , µ are rendered dimensionless by multiplyingwith appropriate powers of the lattice spacing: ˆ m = ma , ˆ µ = µa , ˆ g = g a , ˆ φ = aφ . This model has a signproblem at finite µ that can be severe at large enough g and µ . However, it is easily solved analytically andhas been used as a testing ground of different approaches to the sign problem [5]. It has also been studiedwith the Lefshetz thimble approach where the need to include multi thimble contributions were detected[4, 6, 7] and approaches to include them were discussed [8]. Here we will not discuss the agreement betweenthe Monte Carlo calculations and the exact result. Instead, we focus on the feasibility to use one of theestimators instead of the actual jacobian in the manner described by eq. (3.11).For that we repeat the calculations in [4] by now using the estimator W . If the estimator is accurate, theratio w = e − ∆ S / (cid:104) e − ∆ S (cid:105) fluctuates little and the weight of every configuration to (cid:104)O(cid:105) is similar. On the otherhand, if only a few configurations have a large value of w and dominate the reweighting, large statisticalerrors are expected. This observation is made quantitative by the statistical power Σ defined asΣ = 1 N (cid:104) w (cid:105) S (cid:48) eff (cid:104) w (cid:105) S (cid:48) eff , (4.4)where N is the number of configurations used in the averages. If all configurations have the same statisticalweights in the reweight procedure the statistical power is 1; on the other extreme, if only one configurationhas weight one while all the other configurations have zero weight the statistical power attains its smallestpossible value 1 / N .The critical point with the smallest (real part of the) action is a constant field configuration φ t = ( φ, · · · , φ )for a certain value of φ . The tangent space to the associated thimble (at the critical point) is purely real.Consequently, W = W in this case. The cost of computing the estimator W in this model is dominated bythe cost of computing the trace of the hessian matrix at every point of the flow trajectory connecting ˜ z to z and is of order O ( V ), compared to the O ( V ) cost of computing det J .In order to test the effectiveness of the jacobian estimators we use W in the Monte Carlo runs (or,equivalently, the accept/reject step is perfomed using S (cid:48) eff ). The configurations thus obtained are thenreweighted as shown in eq. (3.11) and the statistical power of the reweighting is computed.To integrate the flow equations we use an adaptive integrator [9], which uses a fourth and fifth orderRunge-Kutta methods to evaluate the next point along the flow and also to estimate the errors. The step Of course, in this simple model an explicit formula for the determinant can be found and, in this sense, the cost of computingdet J is not O ( V ). This is a special feature of this 0 + 1 dimensional model and does not generalize to higher dimensionswhile the estimates above do. (cid:83) (a) Low temperature limit: N = 32 , ˆ m = 1 , ˆ µ = 1 , ˆ g = 1 / (cid:83) (b) Continuum limit: N = 32 , ˆ m = 1 / , ˆ µ = 1 / , ˆ g = 0 . FIG. 2: Statistical power of reweighting for N = 32.size is adjusted up or down to keep the errors at the desired level. The observable used to monitor theerror dictates the size of the integration steps. We monitor both the position along the flow, z ( t ), and thejacobian (or its estimator) and use the largest error to adjust the step size. It turns out that the jacobian ismore sensitive to the integration errors and it determines the step size. It is important to note that whenintegrating the estimator along the flow, the integrator does not use the same steps as used in the calculationof the jacobian. This has two important consequences. First, the estimator is less sensitive to numericalerrors, probably because its calculation does not require an LU decomposition as is necessary for the jacobian.This makes the integration run even faster since it requires fewer steps. On the other hand, the calculation of z ( T ) produces slightly different results when integrating the jacobian or the estimator. The observables areevaluated using the value of z ( T ) produced by the more precise jacobian integration. To account for the factthat z (cid:48) ( T ), the result of estimator integration, differs slightly from the exact z ( T ), we need to reweight using S (cid:48) eff ( z (cid:48) ( T )) = S R ( z (cid:48) ( T )) − Re log det W . Note that z (cid:48) ( T ) can be made arbitrarily close to z ( T ) by tighteningthe integrator errors. From a numerical perspective, it is more profitable to use a looser and faster integrator,as long as S (cid:48) eff ( z (cid:48) ( T )) differs little from S (cid:48) eff ( z ( T )). The final results has no bias, but the error bars mightincrease due to the additional reweighting due to the difference between S (cid:48) eff ( z (cid:48) ( T )) and S (cid:48) eff ( z ( T )).In Fig. 1 we show the statistical power with parameters N = 8, ˆ m = ˆ µ = 1 and ˆ g = 1 /
2. The statisticalpower for both estimators is close to one indicating that both W and W are useful. One might naivelythink that the estimators are reliable for short flow times T but will degrade for larger flows. After all,det J = W = W = 1 for T = 0. As Fig. 1 shows, the opposite is true. This can be understood by noticingthat the points sampled on the thimble are determined by the physics of the model and not by our choiceof parametrization. Consider the parametrization with two different flow times T and T (cid:48) . In these twoparametrizations the same point z on the thimble is parametrized by two different points ˜ z and ˜ z (cid:48) . The pathconnecting ˜ z to z (for flow time T ) and the path connecting ˜ z (cid:48) to the same z (for flow time T (cid:48) ) differ only bythe path connecting ˜ z to ˜ z (cid:48) which lies very close to the critical point. In that region, as shown above, W is agood estimator of det J . Thus, beyond a certain flow time the estimate of det J by W does not get worse.The natural question is whether the estimators still track the jacobian for larger values of N . There are twoways of taking the large N limit: by lowering the temperature with a fixed lattice spacing or by decreasing thelattice spacing at a fixed temperature. In the first case only the parameter N changes while in the second oneall parameters have to be scaled appropriately. Fig. 2a shows the statistical power for the parameters N = 32,ˆ m = ˆ µ = 1 and ˆ g = 1 /
2, corresponding to a temperature 4 times smaller than in Fig. 1. Fig. 2b showsinstead the statistical power for the parameter set N = 32, ˆ m = ˆ µ = 1 / g = 0 . corresponding toa continuum limit extrapolation of the parameter set used in Fig 1. As explained in [7, 8], for flows timeslarger than T > . T = 0 . T = 1 . The parameter scaling we use in the continuum limit is explained in [8]. is probably related to the fact that calculations with small flow times correspond to an integration over amanifold very different from the one thimble sampled in the T → ∞ limit (this point is discussed extensivelyin [8]). As such we should concentrate on the larger flows and, for those, the statistical power is high.The second model we consider is a system of relativistic spin-0 bosons with a finite chemical potential.The lattice action is given by S = (cid:88) x (cid:20) (cid:18) m (cid:19) φ ax φ ax − (cid:88) ν =1 φ ax φ ax +ˆ ν − cosh µφ ax φ ax +ˆ0 + i sinh µ(cid:15) ab φ ax φ bx +ˆ0 + λ φ ax φ ax ) − h ( φ x + φ x ) (cid:21) , (4.5)where (cid:15) is the antisymmetric tensor with (cid:15) = 1. The Hessian is given by ∂∂φ bx ∂∂φ cy S = (8 + m ) δ xy δ bc − (cid:88) ν =1 ( δ x +ˆ ν,y + δ x − ˆ ν,y ) δ bc − cosh µ (cid:16) δ x +ˆ0 ,y + δ x − ˆ0 ,y (cid:17) δ bc + i sinh µ (cid:16) δ x +ˆ0 ,y − δ x − ˆ0 ,y (cid:17) (cid:15) bc + λδ xy (cid:2) ( φ ax φ ax ) δ bc + 2 φ bx φ cx (cid:3) . (4.6)The computation of W for this action starts by the computation (only once) of the basis ρ ( a ) . For the presenthessian H ( φ ) we have H ( φ ) bcx,y = ( H ) bcx,y + λδ xy (cid:2) ( φ ax φ ax ) δ bc + 2 φ bx φ cx − ( φ → φ cr ) (cid:3) , (4.7)where H = H ( φ cr ). In order to compute W we define∆( φ ) ≡ (cid:88) i,x,y,b,c ( ρ ( i )0 ) bx δ xy (cid:2) ( φ ax φ ax ) δ bc + 2( φ bx φ cx ) (cid:3) ( ρ ( i )0 ) cy = (cid:88) x (cid:88) bc R bcx (cid:2) ( φ ax φ ax ) δ bc + 2( φ bx φ cx ) (cid:3) , (4.8)where ρ ( i )0 are the “complex eigenvector” of H and R bcx ≡ (cid:88) i ( ρ ( i )0 ) bx ( ρ ( i )0 ) cx , (4.9)a vector of 2 × (cid:88) a ρ ( a ) † H ( φ )¯ ρ ( a ) = (cid:88) a ρ ( a ) † H ¯ ρ ( a ) + ∆( φ ) − ∆( φ cr ) = (cid:32)(cid:88) i λ i − ∆( φ cr ) (cid:33) + ∆( φ ) , (4.10)where λ i are the positive eigenvalues of H at the critical point. The quantity in the parenthesis is computedonce at the beginning of the calculation and the only term that has to be computed at every step is ∆( φ ).The calculation of ρ ( a ) † H ¯ ρ ( a ) is then a O ( V ) operation. Similarly, the computation of W involves only thetrace of H ( φ ) which is a O ( V ) operation.The statistical power of both estimators, W and W are shown in Fig 3 for a 4 lattice and parameter m = 1, µ = 1 . h = 0 .
03 + i . h was chosen complex to avoidthe appearance of the Stokes phenomenon (a thimble connecting two critical points) which violates thedecomposition of the original integral into a sum of thimbles.The main feature seen in Fig 3 is that the statistical power of the W reweighting remains very close toone at all values of flow time tested. The estimator W also tracks the jacobian reasonably well and has theadvantage of being easier to implement since it does not require the knowledge of the eigenvalues ρ ( a )0 . Thehit on the statistical value of configurations generated using W is more than compensated by the gain inspeed during the Monte Carlo runs. In fact, for the parameters we explored the Monte Carlo runs are a factorof 2000 faster using W than using the jacobian det J while the statistical power is reduced by only a factor ≈
3. Furthermore, the statistical power does seem to saturate at large flow times and not asymptote to zero.
V. CONCLUSIONS
We presented in this note two fast estimators for the jacobian that arises when parametrizing a thimble asused in the contraction algorithm. We showed that these estimators yield statistically robust data sets (in W W (cid:83) FIG. 3: Statistical power of the bosonic model on a 4 lattice with parameters, ˆ m = 1 , ˆ µ = 1 . , λ = 1 and h = 0 .
03 + i .
003 for the two estimators W and W .the sense of statistical power) for a 0 + 1 dimensional fermion model and a 3 + 1 dimensional bosonic modelin a small lattice. The computational costs of the estimators are O ( V ) ( V is the spacetime volume) while theoriginal jacobian requires a number of order O ( V ) of operations. Even in the small volumes considered in ourexperiments (4 in the bosonic model) this is an improvement by a factor of approximately 2000 in efficiencyand it is essential for the practical use of the algorithm. Furthermore, we found that these estimators donot lose their utility in the limit of large flows. This is an extremely useful trait, especially for large latticeswhich require large flows to tame the sign problem. The authors are confident that the estimators describedin this paper will prove useful for a variety of models in the future. ACKNOWLEDGMENTS
A.A. is supported in part by the National Science Foundation CAREER grant PHY-1151648. A.A.gratefully acknowledges the hospitality of the Physics Department at the University of Maryland where partof this work was carried out. G.B., P.B., G.R and N.C.W. are supported by U.S. Department of Energyunder Contract No. DE-FG02-93ER-40762. [1]
AuroraScience
Collaboration, M. Cristoforetti, F. Di Renzo, and L. Scorzato,
New approach to the sign problemin quantum field theories: High density QCD on a Lefschetz thimble , Phys. Rev.
D86 (2012) 074506,[ arXiv:1205.3996 ].[2] M. Cristoforetti, F. Di Renzo, A. Mukherjee, and L. Scorzato,
Monte Carlo simulations on the Lefschetz thimble:Taming the sign problem , Phys. Rev.
D88 (2013), no. 5 051501, [ arXiv:1303.7204 ].[3] H. Fujii, D. Honda, M. Kato, Y. Kikukawa, S. Komatsu, and T. Sano,
Hybrid Monte Carlo on Lefschetz thimbles -A study of the residual sign problem , JHEP (2013) 147, [ arXiv:1309.4371 ].[4] A. Alexandru, G. Basar, and P. Bedaque, Monte Carlo algorithm for simulating fermions on Lefschetz thimbles , Phys. Rev.
D93 (2016), no. 1 014504, [ arXiv:1510.0325 ].[5] J. M. Pawlowski and C. Zielinski,
Thirring model at finite density in 0+1 dimensions with stochastic quantization:Crosscheck with an exact solution , Phys. Rev.
D87 (2013), no. 9 094503, [ arXiv:1302.1622 ].[6] H. Fujii, S. Kamata, and Y. Kikukawa,
Lefschetz thimble structure in one-dimensional lattice Thirring model atfinite density , arXiv:1509.0817 .[7] H. Fujii, S. Kamata, and Y. Kikukawa, Monte Carlo study of Lefschetz thimble structure in one-dimensionalThirring model at finite density , arXiv:1509.0914 . [8] A. Alexandru, G. Basar, P. F. Bedaque, G. W. Ridgway, and N. C. Warrington, Sign problem and Monte Carlocalculations beyond Lefschetz thimbles , arXiv:1512.0876 .[9] J. R. Cash and A. H. Karp, A variable order runge-kutta method for initial value problems with rapidly varyingright-hand sides , ACM Trans. Math. Softw.16