Large-N SU(N) Yang-Mills theories with milder topological freezing
Claudio Bonanno* (Università di Pisa and INFN Sezione di Pisa, Largo Pontecorvo 3, I-56127 Pisa, Italy; Centro Nazionale INFN di Studi Avanzati GGI, Largo Enrico Fermi 2, I-50125 Firenze, Italy)

Claudio Bonati† and Massimo D'Elia‡ (Università di Pisa and INFN Sezione di Pisa, Largo Pontecorvo 3, I-56127 Pisa, Italy)

(Dated: February 8, 2021)

We simulate 4d SU(N) pure-gauge theories at large N using a parallel tempering scheme that combines simulations with open and periodic boundary conditions, implementing the algorithm originally proposed by Martin Hasenbusch for 2d CP^{N−1} models. This allows us to dramatically suppress the topological freezing suffered by standard local algorithms, reducing the autocorrelation time of Q by up to two orders of magnitude. Using this algorithm in combination with simulations at non-zero imaginary θ, we are able to refine state-of-the-art results for the large-N behavior of the quartic coefficient b_2 of the θ-dependence of the vacuum energy, reaching an accuracy comparable with that of the large-N limit of the topological susceptibility.

PACS numbers: 12.38.Aw, 11.15.Ha, 12.38.Gc, 12.38.Mh
1. INTRODUCTION
The dependence of physical observables on the topological parameter θ is one of the most interesting properties of four-dimensional SU(N) pure-gauge theories. The parameter is coupled in the action to the topological charge

Q = ∫ d⁴x q(x) = (1/(64π²)) ε_{μνρσ} ∫ d⁴x F^a_{μν}(x) F^a_{ρσ}(x);  (1)

θ-dependence can be studied to achieve a better understanding of the non-perturbative features of Yang–Mills theories, but it also has direct phenomenological implications for hadron physics in the limit of a large number of colors [1–7].

* Electronic address: [email protected]
† Electronic address: [email protected]
‡ Electronic address: [email protected]

A particularly interesting quantity, whose dependence on θ has been thoroughly investigated, is the vacuum energy density E(θ), which is formally defined by the relation

E(θ) ≡ −(1/V) log ∫ [dA] e^{−S_YM[A] + iθQ[A]},  (2)

where V is the four-dimensional Euclidean space-time volume. The functional form of E(θ) is known only for very specific theories, such as QCD close to the chiral limit [7]. It is customary to consider a Taylor expansion of E(θ) around θ = 0. Since Q is odd under a CP transformation, only even powers of θ appear in the expansion, which can be parametrized in the form

E(θ) − E(0) = (1/2) χ θ² (1 + Σ_{n=1}^∞ b_{2n} θ^{2n}).  (3)

The coefficients of this expansion are related to the cumulants of the topological charge distribution at θ = 0; for instance χ = ⟨Q²⟩_{θ=0}/V, where χ is the topological susceptibility.

While the exact numerical values of the coefficients χ and b_{2n} are generically unknown, something is known about their dependence on the number of colors N, at least when N is large enough. Indeed, assuming that a non-trivial θ-dependence is present in the large-N limit, it is possible to fix the large-N scaling form of the energy density, E(θ, N) = N² Ē(θ/N) [4, 6, 8], implying

χ = χ̄ + O(1/N²),  (4)
b_{2n} = b̄_{2n}/N^{2n} + O(1/N^{2n+2}).  (5)

The value of χ̄ is related to the mass of the η′ meson by the Witten–Veneziano formula [4, 5], which provides the estimate χ̄ ≃ (180 MeV)⁴. Analytic estimates of χ̄ and of the b̄_{2n} coefficients are available only for two-dimensional models [9–13]. Given the non-perturbative nature of θ-dependence, the numerical lattice approach is the natural tool to investigate such topics quantitatively, and in particular to test large-N predictions [11, 13–27]. There are, however, some non-trivial computational challenges that have to be faced, especially in the large-N regime.

The first problem is related to the measurement of the coefficients b_{2n}. The task is challenging in itself since, contrary to what happens for χ, these coefficients approach zero as N is increased. In addition, the simplest estimators available in θ = 0 simulations are based on the cumulants of the topological charge distribution, which are not self-averaging quantities, leading to a bad signal-to-noise ratio at large volumes. This problem can be solved by introducing an explicit source term in the action, coupled to the topological charge density, which corresponds in practice to an imaginary-θ term [22, 28–36]. The method was first exploited for the determination of the b_{2n} coefficients in Ref. [22] and later developed and applied in several works, both for SU(N) Yang–Mills theories [13, 25–27] and for two-dimensional CP^{N−1} models [37, 38].

The second problem is the so-called "freezing problem": it is well known in a wide range of theories sharing the presence of topological excitations [17, 39–49], and it will be the main topic of this work. When adopting local update algorithms on lattices with periodic boundary conditions, the topological modes experience a severe critical slowing down when approaching the continuum limit, with the autocorrelation time of the topological charge growing approximately exponentially as a function of 1/a [17, 43].
As a consequence, gauge configurations stay fixed (frozen) in a given topological sector for an exceedingly large amount of Monte Carlo time, thus preventing a correct sampling of the path-integral distribution. This problem becomes even worse in the large-N limit, since at fixed lattice spacing a the autocorrelation time of the topological charge seems to grow exponentially with the number of colors [17, 37, 43].

Despite the fact that a definitive solution to the topological freezing problem has been obtained only in toy models [48], several strategies have been proposed to reduce its severity, in particular by reducing the exponential critical slowing down to a polynomial one [44, 45, 48, 50], or by extracting information even from completely frozen configurations [51]. A popular method, proposed in Ref. [44], is to adopt open boundary conditions for the gauge fields in the time direction instead of the usual periodic ones. The presence of an open boundary eliminates the barriers among different topological sectors even in the continuum, and one expects the critical slowing down of the topological modes to be essentially diffusive, with an autocorrelation time increasing only as a power of 1/a when approaching the continuum limit. Using this method, however, one also breaks translation invariance and completely loses any notion of global topological charge. Therefore χ and b_{2n} can only be estimated from the integral of the (2n + 2)-point connected correlators of the topological charge density on the bulk of the lattice.

A different strategy, which keeps the advantages of open boundary conditions without breaking translation invariance, has been proposed by M. Hasenbusch in Ref. [52], where it was tested for two-dimensional CP^{N−1} models. The basic idea of this method is to combine periodic and open boundary conditions in a parallel tempering framework, using the copies with open or partially open boundary conditions as sources of topological fluctuations for the copy with periodic boundary conditions, which is the one on which measurements are performed. To reduce the number of copies to be used in the parallel tempering, open boundary conditions are not enforced along the whole temporal boundary, but only in a limited space region, which will be referred to as the "defect" in the following.

In Ref. [38] it was shown that in two-dimensional CP^{N−1} models, at large N, the adoption of the Hasenbusch algorithm in combination with the imaginary-θ method achieves impressive improvements compared to previously available results for the θ-dependence of these models. The aim of the present work is to test the same setup in the case of four-dimensional SU(N) Yang–Mills theories at zero temperature, comparing its performance with that of standard simulations. In doing this we will also refine the state-of-the-art results on the large-N behavior of the b_2 coefficient.

This paper is organized as follows: in Sec. 2 we discuss our lattice setup, along with the parallel tempering algorithm and the imaginary-θ method; in Sec. 3 we show the numerical results obtained with the parallel tempering; finally, in Sec. 4 we draw our conclusions.
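To make the signal-to-noise issue of θ = 0 estimators concrete, recall that at θ = 0 the cumulants of Q give χ = ⟨Q²⟩/V and, at lowest order, b_2 = −k_4/(12 k_2), with k_2 and k_4 the second and fourth cumulants. The sketch below illustrates these standard estimators on synthetic data (a rounded Gaussian sample standing in for real Monte Carlo measurements of Q; the volume V is an illustrative value, not one of the lattices of this work):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for Monte Carlo measurements of the global
# topological charge Q: a rounded Gaussian sample, for which the
# quartic coefficient b2 vanishes by construction.
Q = rng.normal(0.0, 3.0, size=500_000).round()
V = 16**4                                  # illustrative 4d volume

k2 = np.mean(Q**2) - np.mean(Q)**2         # second cumulant of Q
k4 = np.mean(Q**4) - 3 * np.mean(Q**2)**2  # fourth cumulant (for <Q> = 0)
chi = k2 / V                               # topological susceptibility
b2 = -k4 / (12.0 * k2)                     # quartic coefficient estimator

# k4 is a difference of large fluctuating numbers, so the relative
# error of b2 degrades badly with increasing statistics/volume: this
# is precisely what motivates the imaginary-theta method.
```

With Gaussian input the estimated b2 is compatible with zero, but its statistical noise is visibly larger than that of chi, illustrating the non-self-averaging character of higher cumulants.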
2. LATTICE SETUP
In this section we introduce the discretizations adopted for the action and the topological charge, we present a summary of the imaginary-θ method, and we discuss the parallel tempering algorithm employed in our simulations.

A. Lattice action and lattice topological charge
We discretize the Yang–Mills action on a hyper-cubic lattice of size L⁴, with periodic boundary conditions in every direction (see Sec. 2 C for the defect), using the standard Wilson action

S_W = −(β/N) Σ_{x, μ>ν} ℜ Tr{Π_{μν}(x)},  (6)

where Π_{μν}(x) ≡ U_μ(x) U_ν(x + μ̂) U†_μ(x + ν̂) U†_ν(x) is the plaquette operator.

For the topological charge (1) we adopt the simplest discretization with definite parity, the so-called clover discretization:

Q_clov = (1/(2⁹π²)) Σ_{x, μνρσ} ε_{μνρσ} Tr{C_{μν}(x) C_{ρσ}(x)},  (7)

where C_{μν}(x) is a discretization of the field strength given by the sum of the 4 plaquettes centered at the site x and lying in the μ–ν plane. Q_clov is generically non-integer, and it is related, configuration by configuration, to the physical charge Q by [53]

Q_clov = ZQ + η,  (8)

where Z is a finite renormalization constant that approaches 1 in the continuum limit, and η is a stochastic noise due to ultraviolet (UV) fluctuations at the scale of the lattice spacing. Using the variance of Q_clov to estimate the topological susceptibility would require taking into account both multiplicative and additive renormalizations; this can be avoided by using one of the several smoothing procedures that have been proposed in the literature, such as the gradient flow [54, 55] or cooling [56–62], which are all known to agree with each other when properly matched [63, 64]. In this work we use cooling, due to its simplicity and numerical effectiveness.

We denote by Q^cool_clov the topological charge obtained by measuring the observable in Eq. (7) on a configuration to which a certain number of cooling steps has been applied. To assign an integer topological charge Q_L to each configuration we follow Ref. [17], defining

Q_L = round{α Q^cool_clov},  (9)

where "round" denotes rounding to the closest integer and the value of α is fixed by minimizing

⟨(α Q^cool_clov − round{α Q^cool_clov})²⟩,  (10)

so that the maxima of the distribution of α Q^cool_clov are located approximately at integer values; this fixing is performed at θ = 0 and then adopted also for θ ≠ 0. The topological susceptibility computed using Q_L becomes stable (i.e. independent of the number of cooling steps n_cool) after n_cool ∼ 10; moreover, this threshold turns out to be weakly dependent on the lattice spacing. We therefore chose n_cool = 20 to define the topological charge in all simulations, verifying also the stability of all continuum extrapolations if a different value of n_cool is used.

B. Imaginary-θ method

As anticipated in the introduction, the imaginary-θ method is a technique useful to estimate the topological susceptibility and especially the coefficients b_{2n} introduced in Eq. (3). In this section we provide a short summary of this computational method, referring to Ref. [25] for more details. The idea of the method is to introduce an imaginary-θ term, in order to avoid a sign problem, and to extract χ and b_{2n} from the dependence on θ of the cumulants of the topological charge distribution: the method wins over the standard computation at θ = 0 since now the information on the b_{2n} parameters is contained in all cumulants, including the lowest-order ones. The procedure is most conveniently explained by working formally in the continuum: the continuum Euclidean action can be written in the form

S(θ_I) = S_YM − θ_I Q,  (11)

where θ_I = iθ. The dependence on θ_I of the cumulants of the topological charge distribution can be computed from the derivatives of E(θ) in Eq. (2), properly continued to the imaginary axis, and can be expressed in terms of χ and the b_{2n} using Eq. (3). As an example, the explicit expressions of the first few cumulants as functions of θ_I read:

k_1(θ_I)/V = χ [θ_I − 2 b_2 θ_I³ + 3 b_4 θ_I⁵ + O(θ_I⁷)],
k_2(θ_I)/V = χ [1 − 6 b_2 θ_I² + 15 b_4 θ_I⁴ + O(θ_I⁶)],
k_3(θ_I)/V = χ [−12 b_2 θ_I + 60 b_4 θ_I³ + O(θ_I⁵)],
k_4(θ_I)/V = χ [−12 b_2 + 180 b_4 θ_I² + O(θ_I⁴)],  (12)

where V is the space-time volume and the first four cumulants of the topological charge are

k_1 = ⟨Q⟩,
k_2 = ⟨Q²⟩ − ⟨Q⟩²,
k_3 = ⟨Q³⟩ − 3⟨Q²⟩⟨Q⟩ + 2⟨Q⟩³,
k_4 = ⟨Q⁴⟩ − 4⟨Q³⟩⟨Q⟩ − 3⟨Q²⟩² + 12⟨Q²⟩⟨Q⟩² − 6⟨Q⟩⁴.
(13)

All these averages are computed using the weight e^{−S(θ_I)} in the path integral.

Let us now describe what changes on the lattice: the lattice action is

S_L(θ_L) = S_W − θ_L Q_clov,  (14)

where the θ-term is discretized using the non-smoothed clover charge Q_clov defined in the previous subsection, and θ_L is the bare imaginary-θ coupling. The reason for using Q_clov is that with this choice standard heatbath and overrelaxation algorithms can be used in the update. Relations analogous to Eq. (12) can be obtained, where θ_I = Zθ_L and Z is the renormalization constant appearing in Eq. (8). Measuring the cumulants for several values of θ_L, we can thus fit the values of χ, b_{2n} and Z. We explicitly note that the cumulants are not affected by the renormalization of Q_clov, since they are evaluated using the smoothed and rounded charge Q_L introduced in the previous section (see Ref. [38] for a more detailed discussion of this point).

C. Parallel tempering of volume defect
The lattice action in Eq. (14), like the standard Wilson action, is linear in each of the link variables; hence standard heatbath [65, 66] and overrelaxation [67] algorithms can be applied to update the gauge configurations when using the imaginary-θ method. However, as anticipated in the introduction, the local nature of this updating procedure results in a slowing down of the topological modes which is exponential both in the inverse lattice spacing and in the number N of colors. This makes it very difficult to perform controlled continuum extrapolations for large values of N, since even simulations with moderately small values of the lattice spacing become prohibitively expensive as N is increased.

To mitigate this problem, we adopt in this work the parallel tempering algorithm proposed for CP^{N−1} models in Ref. [52], where it was shown to perform as well as simulations with open boundaries while bypassing their complications related to finite-size effects. Moreover, as shown for CP^{N−1} models in Ref. [38], parallel tempering can easily be applied in combination with the imaginary-θ method discussed in the previous subsection.

In this algorithm we consider N_r identical systems, differing only in the boundary conditions on a cuboid defect located on a given spatial slice. In particular, the boundary conditions imposed on the links that orthogonally cross the defect are chosen so that the different copies interpolate between periodic boundary conditions (pbc) and open boundary conditions (obc).
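As an illustration of these interpolating boundary conditions, the rescaling factors for the links crossing the defect can be stored as a simple mask, one per replica. The sketch below uses hypothetical lattice sizes and a naive linear choice of the interpolating constants c(r); in practice the c(r) are tuned, as discussed in the text.

```python
import numpy as np

def boundary_factors(n_replicas, L, Ld):
    """Return K[r, x2, x3, x4], the factor multiplying the mu = 1 links
    crossing the boundary slice: it equals c(r) on the Ld^3 defect and
    1 elsewhere.  The linear choice c(r) = 1 - r/(N_r - 1) is only a
    first guess; in a real run the c(r) are tuned so that all swap
    acceptances come out (approximately) equal."""
    K = np.ones((n_replicas, L, L, L))
    for r in range(n_replicas):
        c = 1.0 - r / (n_replicas - 1)  # c(0) = 1 (pbc), c(N_r-1) = 0 (obc)
        K[r, :Ld, :Ld, :Ld] = c         # only links crossing the defect
    return K
```

Only the pbc replica (r = 0) keeps every factor equal to 1, and only the obc replica (r = N_r − 1) has fully open links on the defect.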
Each system is evolved independently using standard local algorithms, and different copies are exchanged from time to time with a Metropolis step, so that the strong reduction of the autocorrelation time achieved in the obc copy is transferred to the pbc one, on which the measurement of the cumulants of the topological charge Q_L is performed. Since the injection (or ejection) of topological charge into the system is mainly triggered by the update of the links close to the defect, it is convenient [52] to alternate updating sweeps over the whole lattice with hierarchic updates over sub-regions of the lattice centered around the defect. In particular, we updated more frequently small hyper-rectangular regions centered around the defect.

In our simulations the location of the defect is the tridimensional region

D = {x₁ = L − a, 0 ≤ x₂ < L_d^(2), 0 ≤ x₃ < L_d^(3), 0 ≤ x₄ < L_d^(4)};

however, after every hierarchic update we perform a random translation of the pbc copy by one lattice spacing, thus effectively moving the location of the defect. For the sake of simplicity we use a cubic defect, L_d^(2) = L_d^(3) = L_d^(4) ≡ L_d, and it is sufficient to choose L_d equal to a few lattice spacings to obtain satisfactory performance. For a discussion of how the choice of L_d affects the efficiency of the algorithm, see Sec. 3 A.

In order to specify how the different boundary conditions across the defect are implemented, it is convenient to rescale each link of every replica according to U^(r)_μ(x) → K^(r)_μ(x) U^(r)_μ(x), where U^(r)_μ(x) indicates a link of the r-th replica and the explicit expression of K^(r)_μ(x) is

K^(r)_μ(x) = c^(r) if μ = 1 and x ∈ D, and 1 otherwise,

so that only the links crossing the volume defect are affected by its presence. For the pbc replica (corresponding to r = 0) we have c^(0) = 1; for the obc replica (corresponding to r = N_r − 1) we have c^(N_r−1) = 0; for 0 < r < N_r − 1, c^(r) interpolates between 0 and 1. With these notations the action of the r-th copy reads

S^(r)_L(θ_L) = S^(r)_W + S^(r)_θ(θ_L) = −(β/N) Σ_{x, μ>ν} K^(r)_{μν}(x) ℜ Tr{Π^(r)_{μν}(x)} − θ_L Q_clov[U^(r)_μ(x)],

where K^(r)_{μν}(x) is a short-hand for

K^(r)_{μν}(x) ≡ K^(r)_μ(x) K^(r)_ν(x + μ̂) K^(r)_μ(x + ν̂) K^(r)_ν(x).

Note that we chose to keep the θ-term insensitive to the presence of the defect, which only affects the Wilson part of the action. Exploratory simulations performed by modifying also the θ-term provided evidence that this choice does not significantly affect the performance of the algorithm, analogously to what was found for CP^{N−1} models in Ref. [38]. This is not surprising, since the barriers between the topological sectors, responsible for the critical slowing down, stem essentially from the Wilson term.

The swap of replicas was proposed after every step of hierarchic update, for every couple of adjacent copies r and r + 1 (differentiating the cases of even and odd r in order to avoid synchronization problems), and was accepted with the Metropolis probability

p = min{1, exp(−ΔS)} = min{1, exp(−S^swap_W + S^no swap_W)},  (15)

where the θ-term, which is not affected by the defect, does not enter the acceptance probability. Since the boundary conditions of different replicas differ only on a sub-region of the lattice, to compute ΔS it is sufficient to sum the contributions to the action of the plaquettes centered on sites lying in a hyper-cuboid region centered around the defect and extending one lattice spacing from it.

For the parallel tempering to be effective in decorrelating the topological modes of the pbc copy, there must be no bottleneck preventing a configuration in the obc copy from being swapped toward the pbc one.
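A minimal sketch of this swap step is shown below. Here the scalar P[r] is a toy stand-in for the sum of plaquette traces near the defect of the configuration currently held by replica r (a real implementation would recompute the Wilson action of both configurations with the exchanged boundary factors); only this region enters ΔS, and the θ-term drops out of the acceptance.

```python
import math
import random

def swap_sweep(P, c, beta_over_N, rng, parity):
    """One sweep of Metropolis swap proposals between adjacent replicas.
    Even and odd pairs are handled in alternate sweeps (parity = 0 or 1)
    to avoid synchronization problems.  c[r] is the boundary factor of
    replica r (c[0] = 1 pbc, c[-1] = 0 obc)."""
    accepted = 0
    for r in range(parity, len(P) - 1, 2):
        # action difference for exchanging the configurations of r, r+1:
        # only defect plaquettes, rescaled by c, contribute
        dS = -beta_over_N * (c[r] - c[r + 1]) * (P[r + 1] - P[r])
        if rng.random() < min(1.0, math.exp(-dS)):
            P[r], P[r + 1] = P[r + 1], P[r]   # exchange configurations
            accepted += 1
    return accepted

rng = random.Random(0)
P = [rng.gauss(100.0, 5.0) for _ in range(10)]   # toy "configurations"
c = [1.0 - r / 9 for r in range(10)]             # pbc ... obc factors
for sweep in range(4):
    swap_sweep(P, c, 1.0, rng, sweep % 2)
```

Note that swaps only permute the configurations among the replicas; no configuration is ever created or destroyed by the exchange step.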
In order to guarantee a barrier-free random walk of the configurations among the different replicas, it is thus convenient to choose the constants c^(r) entering K^(r)_μ(x) in such a way that the acceptance ratio is constant for all the proposed swaps:

p(0, 1) = p(1, 2) = ··· = p(N_r − 2, N_r − 1).
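One possible tuning strategy (an illustration, not necessarily the procedure used in this work) is to iteratively widen the spacing between adjacent c(r) values for pairs whose acceptance is above average and shrink it for the others. The function accept_model below is a hypothetical stand-in for a short test simulation measuring one pair's acceptance.

```python
import math
import numpy as np

def tune_c(accept_model, n_replicas, n_iter=200, damping=0.5):
    """Iteratively adjust the interpolating constants c(r) so that the
    swap acceptance p(r, r+1) is the same for every pair of adjacent
    replicas.  accept_model(c_hi, c_lo) returns the acceptance for one
    pair (in a real run, measured in test simulations at theta_L = 0,
    since acceptances do not depend on theta_L)."""
    c = np.linspace(1.0, 0.0, n_replicas)          # linear first guess
    for _ in range(n_iter):
        p = np.array([accept_model(c[r], c[r + 1])
                      for r in range(n_replicas - 1)])
        # widen the c-gap of pairs with above-average acceptance,
        # shrink the others, then renormalize so c(0)=1, c(N_r-1)=0
        gaps = (c[:-1] - c[1:]) * (p / p.mean()) ** damping
        gaps /= gaps.sum()
        c = np.concatenate(([1.0], 1.0 - np.cumsum(gaps)))
    return c

# Toy acceptance model (assumption for illustration only): swaps are
# harder both for larger c-gaps and closer to the periodic replica.
model = lambda hi, lo: math.exp(-2.0 * (hi - lo) * (1.0 + 2.0 * hi))
c = tune_c(model, n_replicas=10)
```

At the fixed point of this iteration all pair acceptances coincide, which is exactly the condition stated above.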
3. NUMERICAL RESULTS
In order to compare the performance of parallel tempering with that of the standard algorithm, we performed simulations for N = 4 and 6 with both algorithms and for several values of β at θ_L = 0, measuring the autocorrelation time of Q_L. We then performed simulations at non-zero θ_L with parallel tempering for each value of β, to estimate χ, b_2 and b_4 using the imaginary-θ method.

In Tab. I we summarize the parameters of the performed simulations. Lattice sizes have been chosen to ensure that L√σ ≳ 3, where σ ≃ (440 MeV)² is the string tension, so as to keep finite-size effects under control [13, 68]. Statistics reported in Tab. I refer to the parallel tempering simulations and are given in number of parallel tempering steps. A single step of tempering update consists, first of all, of a complete update of each replica, using 5 lattice sweeps of over-relaxation [67] followed by 1 lattice sweep of heat-bath [65, 66], both implemented à la Cabibbo–Marinari [69], i.e., updating all the N(N − 1)/2 SU(2) subgroups of SU(N). After this "global" update step we perform iterations on the sub-lattices entering the hierarchic update (see Sec. 2 C), each iteration consisting of

• a local update sweep of the sub-lattices for every replica, using the same combination of local algorithms adopted for the global update;
• the parallel tempering swap proposal;
• a random translation of the pbc copy by one lattice spacing.

Since each system is updated using the same procedure, and the time required for hierarchic updates, swaps and translations is negligible with respect to the time of the global update, the total numerical effort of a single parallel tempering step is ∼ N_r times larger than the one required for a local update in the standard setup.

A. Parallel tempering: results and comparison
Just by inspection of the time histories of the topological charge it is easy to realize that parallel tempering substantially reduces topological freezing, allowing us to perform simulations at values of the lattice spacing which would have been otherwise prohibitive with the standard algorithm. An example of the Monte Carlo time evolution of Q_L obtained with parallel tempering for N = 6, β = 25.75 and θ_L = 0 is shown in Fig. 1, where we compare it with the evolution obtained with the standard algorithm.

TABLE I: Summary of simulation parameters. Simulations at non-zero values of θ_L were performed in steps of Δθ_L = 2 up to θ_L = 10 and in steps of Δθ_L = 2.5 for θ_L > 10. The last column refers to the total statistics accumulated over all imaginary-θ simulations. The defect length was, in all cases, L_d/a = 2. All simulations for N = 4 were performed using N_r = 10, corresponding to a constant swap probability p around 20%, while simulations for N = 6 used N_r = 17, corresponding to p ≈ 30%.

N = 4
β       L/a  a√σ         L√σ   θ_L^max  Stat. θ_L = 0  Stat. θ_L ≠ 0
11.104  16   0.1981(5)*  3.17  15       255k           787k
11.347  20   0.1590(6)   3.17  15       1.39M          2.44M

N = 6
β       L/a  a√σ          L√σ   θ_L^max  Stat. θ_L = 0  Stat. θ_L ≠ 0
24.768  12   0.2912(11)   3.57  15       103k           257k
24.845  12   0.2801(13)   3.41  15       113k           166k
25.056  12   0.2499(10)   3.04  15       228k           280k
25.394  14   0.2143(8)    3.00  15       513k           553k
25.750  16   0.1878(18)   3.00  17.5     1.12M          1.84M

[FIG. 1: Monte Carlo time evolution of the lattice topological charge Q_L for a run with N = 6, β = 25.75 and θ_L = 0. For the comparison to be fair, data for the parallel tempering case are plotted as a function of the total number of global updates performed on all the replicas, i.e. 17 times the number of parallel tempering updates.]

In order to quantitatively characterize the gain achieved with parallel tempering and to optimize its efficiency, it is useful to study the autocorrelation time of the topological susceptibility. We use as definition of the autocorrelation time for a generic observable O the expression [71]

τ(O) = (1/2) [(Δ_binned O / Δ_naive O)² − 1],  (16)

where Δ_binned O is the error associated to ⟨O⟩ by a self-consistent binning analysis, while Δ_naive O is the usual standard error of the mean for independent identically distributed samples. The autocorrelation time of the topological susceptibility, however, does not take into account the increased computational effort of the parallel tempering algorithm with respect to standard local algorithms. As a figure of merit for the computational effectiveness of parallel tempering it is thus convenient to introduce the effective autocorrelation time

τ_pt(O) = τ(O) N_r.  (17)

As discussed in the previous section, in every parallel tempering simulation we tuned the parameters c^(r) in such a way that the acceptance p(r, r + 1) of the Metropolis swap move between replicas r and r + 1 is approximately independent of r. This tuning was performed using test simulations at θ_L = 0 (acceptances do not depend on θ_L), and in Fig. 2 we show an example of the behavior of c^(r) for a run with N = 4 and β = 11.347. A simple linear behavior is a reasonable first approximation for the c^(r) values; however, using the tuned values, τ_pt(Q_L) is about half the one obtained with a simple linear interpolation. Once p(r, r + 1) is almost independent of r, configurations move freely among the different replicas following a random walk, as shown in Fig. 3.

The constant value p that is reached by p(r, r + 1) after the tuning of c^(r) obviously depends on the number N_r of replicas used. We did not perform a systematic investigation of the dependence of the numerical effectiveness of parallel tempering on p; however, this dependence seems to be quite mild.
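As an aside, the binning estimator of Eq. (16) is straightforward to implement; the sketch below can be checked on synthetic data with a known autocorrelation time.

```python
import numpy as np

def tau_from_binning(samples, bin_size):
    """Integrated autocorrelation time from Eq. (16):
    tau = ((Delta_binned / Delta_naive)**2 - 1) / 2,
    where Delta_binned is the standard error of the bin averages and
    Delta_naive the naive i.i.d. standard error of the mean.  The bin
    size must be large compared to the true autocorrelation time for
    the binning analysis to be self-consistent."""
    x = np.asarray(samples, dtype=float)
    n = (len(x) // bin_size) * bin_size        # drop the ragged tail
    bins = x[:n].reshape(-1, bin_size).mean(axis=1)
    err_naive = x[:n].std(ddof=1) / np.sqrt(n)
    err_binned = bins.std(ddof=1) / np.sqrt(len(bins))
    return 0.5 * ((err_binned / err_naive) ** 2 - 1.0)
```

For an AR(1) process x_{t+1} = ρ x_t + noise the estimator returns, up to statistical fluctuations, the known value τ = ρ/(1 − ρ), while for white noise it returns a value compatible with zero.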
Indeed, by increasing the number of replicas N_r the constant acceptance probability p grows and the autocorrelation time τ(Q_L) is reduced; however, the computational cost also increases with N_r. The net effect is that τ_pt(Q_L) is largely insensitive to the specific value of p, at least as long as p is not too close to 0 or 1, as we verified in some test simulations. For example, simulations performed with N = 4 and β = 11.104 using p ≃ 20% (corresponding to N_r = 10) or p ≃ 30% (corresponding to N_r = 12) provided consistent values of τ_pt(Q_L): 72(10) and 78(18) respectively. The value of p was kept fixed while approaching the continuum limit, and in all cases we found that this could be achieved by using the same value of N_r for the different lattice spacings (at fixed physical volume).

Most of the simulations reported in this work used L_d/a = 2 for the size of the defect, a value that is sufficient to drastically reduce the freezing problem. To investigate the dependence of τ_pt on L_d, some more simulations have been performed with L_d/a = 3 and L_d/a = 1 for the case N = 6, always keeping the swap acceptance probability p fixed to about 30%, which requires scaling the number of replicas with L_d.

A complete list of the obtained autocorrelation times τ_pt(Q_L) is reported in Tab. II, where they are also compared with the results obtained in Ref. [13] using just local algorithms. The scaling of τ_pt(Q_L) with 1/(a√σ) for the case N = 6 is shown in Fig. 4.

[FIG. 2: Behavior of c^(r) as a function of the replica index r, compared to a simple linear behavior (upper panel), along with the corresponding acceptances p(r, r + 1) of the swaps between replicas r and r + 1 (lower panel). Data refer to a run with N = 4, β = 11.347 and θ_L = 0.]

[FIG. 3: Random walk of a configuration among the different replicas during a parallel tempering run for N = 4, β = 11.347 and θ_L = 0. Replicas are parametrized by their value of c^(r), and the Monte Carlo time is expressed in units of the parallel tempering update step defined in the text.]

Simulations performed for N = 6 at β = 24.845 (corresponding to the points at 1/(a√σ) ≃ 3.57 in Fig. 4) using L_d/a = 1, 2, 3 show that L_d/a = 2 is the optimal choice for this value of the coupling. As can be seen from Fig. 4, the autocorrelation times extracted from simulations performed at fixed L_d/a = 2 are much smaller than the corresponding ones obtained from simulations using local update algorithms, also at smaller values of the lattice spacing; however, τ_pt(Q_L) still seems to scale exponentially with the inverse lattice spacing. This is due to the fact that, by approaching the continuum limit at fixed L_d/a, the size of the defect in physical units is reduced, and the mechanism of injection of topological charge through the defect becomes less and less efficient. If instead L_d is kept fixed in physical units while approaching the continuum limit, one generically expects a polynomial critical slowing down in 1/(a√σ). To investigate this point we performed additional simulations at β = 25.75 using L_d/a = 3, in order to have at this lattice spacing a defect of the same physical size as the one corresponding to L_d/a = 2 at β = 24.845 (in both cases L_d√σ ≃ 0.56). Despite the ≈ 33% reduction of the lattice spacing, the effective autocorrelation time τ_pt(Q_L) is compatible in the two cases, as reported in Tab. II and shown in Fig. 4.

[TABLE II: Results for the autocorrelation time of Q_L obtained using the standard and the Hasenbusch (parallel tempering) algorithms for N = 4 and N = 6, listing for each β the values of a√σ, L_d/a, p (%), N_r, τ_pt(Q_L) and τ_std(Q_L). Quantities denoted with * are taken from Ref. [13], where the procedure used for the local update was the same as in the present work. The result denoted by ** is just a rough estimate, since it was impossible to obtain a reliable value even after ∼ .5M trajectories.]

[FIG. 4: Scaling of τ_pt(Q_L) with the inverse lattice spacing, obtained using the local algorithms or parallel tempering for N = 6. The autocorrelation times obtained with the standard algorithm and with parallel tempering at fixed L_d/a = 2 are both compatible with an exponential scaling in 1/a (dashed and solid lines), with best fits of the form log{τ_pt} = k_1 + k_2/(a√σ).]

These results do not yet permit a clear assessment of the scaling and of the optimal tuning of the parallel tempering algorithm towards the continuum limit; however, altogether they give a strong indication that it works exceedingly well, compared to standard algorithms, in reducing topological freezing, and that the best scaling is obtained by keeping L_d approximately fixed in physical units, in analogy with the case of CP^{N−1} models [38, 52], where the continuum limit is performed at fixed L_d/ξ, i.e. at fixed physical size of the defect.

B. Analytic continuation and continuum limit
In Tab. III we summarize the results obtained for the topological observables χ and b_2 at different values of N and of the lattice spacing, obtained by fitting the θ-dependence of the cumulants as described in Sec. 2 B. An example of imaginary-θ fit of the cumulants is shown in Fig. 5 for the case N = 6 and β = 25.750. We find the O(θ_L^6) term in the expansion of the vacuum energy to be well compatible with zero, since no signal above zero is observed for b_4 at either value of N. For this reason, the results for a^4 χ and b_2 reported in Tab. III have been obtained by neglecting b_4 in Eqs. (12). Finally, we note that correlations between the different cumulants are small and do not significantly affect the result of the fit, as we explicitly checked by performing both correlated and uncorrelated fits.

We used our data for N = 4 and N = 6, as well as data obtained at larger lattice spacings taken from Ref. [13], to extrapolate continuum results for χ/σ^2 and b_2. For the topological susceptibility the improvement with respect to previously available results is only marginal, since the dominant source of error comes from the string tension σ used to set the scale. For b_2, instead, we achieved a substantial improvement of the state of the art, both for N = 4 and N = 6. In particular, for N = 6 parallel tempering allowed us to reach much finer lattice spacings than those used in previous studies. In this way we could perform for the first time a controlled continuum extrapolation of b_2 in this case, while in Ref. [13] only a reasonable confidence interval was reported. In Tab. IV we summarize our continuum limits, while in Fig. 6 we report our continuum extrapolations.

TAB. III: Results of the imaginary-θ fits for N = 4 and 6 (columns: β, a√σ, Z, a^4 χ, b_2), obtained by considering up to O(θ_L^4) terms in the Taylor expansion of the vacuum energy, i.e., neglecting b_4 in Eqs. (12).

FIG. 5: Best fit of the first 3 cumulants k_1/V, k_2/V, k_3/V (solid, dashed and dotted line respectively) for N = 6, β = 25.750, obtained by considering up to O(θ_L^4) terms in the Taylor expansion of the vacuum energy, i.e., neglecting b_4 in Eqs. (12).

TAB. IV: Continuum limits of χ/σ^2 and b_2; the N = 3 results are taken from Ref. [25].

FIG. 6: Continuum extrapolations of χ/σ^2 (above) and b_2 (below) for N = 4 and 6 (solid and dashed line respectively), obtained fitting linear corrections in a^2 σ to the continuum limit. The diamond points represent the determinations reported in Ref. [13] for N = 4 and 6.

C. Large-N limit

In this section we revisit the large-N extrapolation on the basis of our improved results; in particular, we report our estimates of χ̄ and b̄_2 introduced in Eq. (4). Let us start from the topological susceptibility: following large-N expectations, we fitted our data for N ≥ 3 with

χ/σ^2 = χ̄/σ^2 + k/N^2 + O(1/N^4). (18)

Our data are in agreement with the expected large-N scaling and, using the large-N scale determination of Ref. [73], corresponding to 242(10) MeV, to convert to physical units, we get χ̄^{1/4} = 173(8) MeV, in agreement with the prediction χ̄^{1/4} ≃
180 MeV obtained from the Witten–Veneziano formula.

FIG. 7: Extrapolation of χ/σ^2 towards the large-N limit, using the fit function χ/σ^2 = χ̄/σ^2 + k/N^2.

We now pass to the discussion of the large-N behavior of b_2. According to the standard large-N arguments, we expect a behavior of the type

b_2 = b̄_2/N^2 + b̄_2^{(1)}/N^4 + O(1/N^6). (19)

To test this prediction we perform a best fit of our data for N ≥ 3 with the function b_2(N) = b̄_2/N^c, obtaining perfect agreement with expectations, since the fitted exponent turns out to be compatible with c = 2. Fixing c = 2 and fitting our data with just the leading behavior b_2 = b̄_2/N^2 in the ranges N ≥ 3 and N ≥ 4, we obtain mutually compatible results for b̄_2.
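These large-N fits are straightforward to reproduce. The sketch below (a minimal example with scipy; the data arrays are purely illustrative placeholders, not the values determined in this work) shows both fit strategies used above: the exponent c left free as a consistency check, and c fixed to the expected value 2.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative placeholder data (NOT the results of this paper):
# synthetic b_2(N) values roughly following a -0.21/N^2 scaling.
N = np.array([3.0, 4.0, 6.0])
b2 = np.array([-0.023, -0.013, -0.006])
b2_err = np.array([0.002, 0.002, 0.001])

def leading(N, b2bar):
    # Leading large-N behavior with the exponent fixed to c = 2.
    return b2bar / N**2

def free_exponent(N, b2bar, c):
    # Same ansatz, but with the exponent c left as a fit parameter.
    return b2bar / N**c

p_lead, cov_lead = curve_fit(leading, N, b2, sigma=b2_err,
                             absolute_sigma=True)
p_free, cov_free = curve_fit(free_exponent, N, b2, sigma=b2_err,
                             absolute_sigma=True, p0=[-0.2, 2.0])

print(f"fixed c=2 : b2bar = {p_lead[0]:.4f} +/- {np.sqrt(cov_lead[0, 0]):.4f}")
print(f"free c    : b2bar = {p_free[0]:.4f}, c = {p_free[1]:.2f}")
```

For data that actually follow the leading 1/N^2 behavior, the free-exponent fit returns c compatible with 2, which is the consistency check performed in the text.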
4. CONCLUSIONS
In this work we investigated the θ-dependence of SU(N) Yang–Mills theories at zero temperature using the parallel tempering algorithm proposed by Hasenbusch in Ref. [52]. This algorithm was originally tested in two-dimensional CP^{N-1} models, and a first extension of the original proposal has already been performed in Ref. [38] (still for CP^{N-1} models), by extending the parallel tempering approach to simulations at imaginary θ values. In the present work we implemented the same setup in SU(N) Yang–Mills theory, thus proving the feasibility of the approach also for computationally more demanding models, and improving the state-of-the-art results for the θ-dependence of these models in the large-N limit.

FIG. 8: Extrapolation of b_2 towards the large-N limit. The solid line represents the best fit obtained using the fit function b_2 = b̄_2/N^c in the whole range; the dashed line represents the best fit obtained using the same fit function but with fixed c = 2 in the whole range; the dotted line represents the best fit obtained in the whole range including a further b̄_2^{(1)}/N^4 term in the fit function.

The idea of the method is to simulate many independent identical systems differing only for the boundary conditions imposed on a cubic defect L_d × L_d × L_d, which are chosen to interpolate between open (obc) and periodic (pbc) boundary conditions. Each replica evolves independently, and swaps among them are proposed from time to time in order to transfer configurations from the obc to the pbc replica.
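The swap step just described amounts to a standard Metropolis exchange between adjacent replicas. The following schematic Python sketch illustrates it under stated assumptions: `confs` and `actions` are hypothetical stand-ins for the gauge configurations and for the replica-dependent lattice actions (each action evaluates a configuration with the boundary coefficients of its replica); the actual simulation code is not shown in the paper.

```python
import math
import random

def propose_swap(confs, actions, r, s, rng=random):
    """Metropolis swap of the configurations of replicas r and s.

    confs[i] is the configuration currently held by replica i, and
    actions[i](conf) evaluates the action of conf with the boundary
    coefficients of replica i (interpolating between open and
    periodic boundary conditions on the defect).
    """
    # Change of the total action if the two configurations are exchanged.
    delta_S = (actions[r](confs[s]) + actions[s](confs[r])
               - actions[r](confs[r]) - actions[s](confs[s]))
    # Accept with probability min(1, exp(-delta_S)); detailed balance of
    # each replica's Markov chain is preserved by this exchange.
    if delta_S <= 0 or rng.random() < math.exp(-delta_S):
        confs[r], confs[s] = confs[s], confs[r]
        return True
    return False
```

Repeatedly proposing such swaps between neighboring replicas lets configurations random-walk from the fast-decorrelating obc replica to the pbc replica, on which all measurements are taken.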
In this way a drastic reduction of the autocorrelation time of the topological charge is achieved, while avoiding the complications related to the breaking of translation invariance connected to the adoption of open boundary conditions, since measurements are performed on the pbc replica.

By using the parallel tempering algorithm we obtained an impressive reduction of the autocorrelation times of topological observables, which for the smallest lattice spacing used was of at least two orders of magnitude, even when taking into account the larger computational cost. A nice feature of the algorithm is that this gain was obtained without optimally tuning all the possible parameters entering the update, which proves the robustness of the approach. The most relevant parameter to be fixed is clearly the size of the defect, and we verified that for the cases studied in this paper the optimal size lies in a fixed range in physical units.

We exploited this algorithm, in combination with simulations at imaginary θ, to investigate θ-dependence beyond the leading O(θ^2) order. In particular, we improved the accuracy of the determination of the coefficient b_2 for both N = 4 and N = 6, in the latter case also performing a controlled continuum limit extrapolation, which was not possible with previously available results. These data confirm that b_2 scales with the number of colors in a way that is consistent with the leading behavior predicted by large-N arguments: data for N ≥ 3 are well fitted by b_2 = b̄_2/N^2, and a fit with b_2 = b̄_2/N^c returns an exponent compatible with the value c = 2 expected at large N [75].

A further refinement of present results, including also new values of N, would thus be welcome in the future: the algorithm proposed in Ref. [52], and extended to SU(N) gauge theories in this study, will permit such a systematic refinement.

Acknowledgments
Numerical simulations have been performed on the MARCONI machine at CINECA, based on the agreement between INFN and CINECA (under projects INF19 npqcd and INF20 npqcd).

[1] G. 't Hooft, Nucl. Phys. B, 461 (1974).
[2] G. 't Hooft, Phys. Rev. Lett., 8 (1976).
[3] E. Witten, Nucl. Phys. B, 285 (1979).
[4] E. Witten, Nucl. Phys. B, 269 (1979).
[5] G. Veneziano, Nucl. Phys. B, 213 (1979).
[6] E. Witten, Annals Phys., 363 (1980).
[7] P. Di Vecchia and G. Veneziano, Nucl. Phys. B, 253 (1980).
[8] E. Witten, Phys. Rev. Lett., 2862 (1998), hep-th/9807109.
[9] A. D'Adda, M. Lüscher, and P. Di Vecchia, Nucl. Phys. B, 63 (1978).
[10] M. Campostrini and P. Rossi, Phys. Lett. B, 305 (1991).
[11] L. Del Debbio, G. M. Manca, H. Panagopoulos, A. Skouroupathis, and E. Vicari, JHEP, 005 (2006), hep-th/0603041.
[12] P. Rossi, Phys. Rev. D, 045013 (2016), 1606.07252.
[13] C. Bonati, M. D'Elia, P. Rossi, and E. Vicari, Phys. Rev. D, 085017 (2016), 1607.06360.
[14] B. Alles, M. D'Elia, and A. Di Giacomo, Nucl. Phys. B, 281 (1997), [Erratum: Nucl. Phys. B 679, 397–399 (2004)], hep-lat/9605013.
[15] B. Alles, M. D'Elia, and A. Di Giacomo, Phys. Lett. B, 119 (1997), hep-lat/9706016.
[16] L. Del Debbio, L. Giusti, and C. Pica, Phys. Rev. Lett., 032003 (2005), hep-th/0407052.
[17] L. Del Debbio, H. Panagopoulos, and E. Vicari, JHEP, 044 (2002), hep-th/0204125.
[18] M. D'Elia, Nucl. Phys. B, 139 (2003), hep-lat/0302007.
[19] B. Lucini, M. Teper, and U. Wenger, Nucl. Phys. B, 461 (2005), hep-lat/0401028.
[20] L. Giusti, S. Petrarca, and B. Taglienti, Phys. Rev. D, 094510 (2007), 0705.2352.
[21] E. Vicari and H. Panagopoulos, Phys. Rept., 93 (2009), 0803.1593.
[22] H. Panagopoulos and E. Vicari, JHEP, 119 (2011), 1109.6815.
[23] M. Cè, C. Consonni, G. P. Engel, and L. Giusti, Phys. Rev. D, 074502 (2015), 1506.06052.
[24] M. Cè, M. Garcia Vera, L. Giusti, and S. Schaefer, Phys. Lett. B, 232 (2016), 1607.05939.
[25] C. Bonati, M. D'Elia, and A. Scapellato, Phys. Rev. D, 025028 (2016), 1512.01544.
[26] C. Bonati, M. Cardinali, and M. D'Elia, Phys. Rev. D, 054508 (2018), 1807.06558.
[27] C. Bonati, M. Cardinali, M. D'Elia, and F. Mazziotti, Phys. Rev. D, 034508 (2020), 1912.02662.
[28] G. Bhanot and F. David, Nucl. Phys. B, 127 (1985).
[29] V. Azcoiti, G. Di Carlo, A. Galante, and V. Laliena, Phys. Rev. Lett., 141601 (2002), hep-lat/0203017.
[30] B. Alles and A. Papa, Phys. Rev. D, 056008 (2008), 0711.1496.
[31] M. Imachi, M. Kambayashi, Y. Shinno, and H. Yoneyama, Prog. Theor. Phys., 181 (2006).
[32] S. Aoki, R. Horsley, T. Izubuchi, Y. Nakamura, D. Pleiter, P. E. L. Rakow, G. Schierholz, and J. Zanotti (2008), 0808.1428.
[33] B. Alles, M. Giordano, and A. Papa, Phys. Rev. B, 184421 (2014), 1409.1704.
[34] M. D'Elia and F. Negro, Phys. Rev. Lett., 072001 (2012), 1205.0538.
[35] M. D'Elia and F. Negro, Phys. Rev. D, 034503 (2013), 1306.2919.
[36] M. D'Elia, M. Mariti, and F. Negro, Phys. Rev. Lett., 082002 (2013), 1209.0722.
[37] C. Bonanno, C. Bonati, and M. D'Elia, JHEP, 003 (2019), 1807.11357.
[38] M. Berni, C. Bonanno, and M. D'Elia, Phys. Rev. D, 114509 (2019), 1911.03384.
[39] B. Alles, G. Boyd, M. D'Elia, A. Di Giacomo, and E. Vicari, Phys. Lett. B, 107 (1996), hep-lat/9607049.
[40] P. de Forcrand, M. Garcia Perez, J. E. Hetrick, and I.-O. Stamatescu, Nucl. Phys. Proc. Suppl., 549 (1998), hep-lat/9710001.
[41] B. Lucini and M. Teper, JHEP, 050 (2001), hep-lat/0103027.
[42] D. B. Leinweber, A. G. Williams, J.-b. Zhang, and F. X. Lee, Phys. Lett. B, 187 (2004), hep-lat/0312035.
[43] L. Del Debbio, G. M. Manca, and E. Vicari, Phys. Lett. B, 315 (2004), hep-lat/0403001.
[44] M. Lüscher and S. Schaefer, JHEP, 036 (2011), 1105.4749.
[45] A. Laio, G. Martinelli, and F. Sanfilippo, JHEP, 089 (2016), 1508.07270.
[46] J. Flynn, A. Jüttner, A. Lawson, and F. Sanfilippo (2015), 1504.06292.
[47] C. Bonati, M. D'Elia, M. Mariti, G. Martinelli, M. Mesiti, F. Negro, F. Sanfilippo, and G. Villadoro, JHEP, 155 (2016), 1512.06746.
[48] C. Bonati and M. D'Elia, Phys. Rev. E, 013308 (2018), 1709.10034.
[49] A. Athenodorou and M. Teper, JHEP, 172 (2020), 2007.06422.
[50] E. Vicari, Phys. Lett. B, 139 (1993), hep-lat/9209025.
[51] W. Bietenholz, P. de Forcrand, and U. Gerber, JHEP, 070 (2015), 1509.06433.
[52] M. Hasenbusch, Phys. Rev. D, 054504 (2017), 1706.04443.
[53] M. Campostrini, A. Di Giacomo, and H. Panagopoulos, Phys. Lett. B, 206 (1988).
[54] M. Lüscher, Commun. Math. Phys., 899 (2010), 0907.5491.
[55] M. Lüscher, JHEP, 071 (2010), [Erratum: JHEP 03, 092 (2014)], 1006.4518.
[56] B. Berg, Phys. Lett. B, 475 (1981).
[57] Y. Iwasaki and T. Yoshie, Phys. Lett. B, 159 (1983).
[58] S. Itoh, Y. Iwasaki, and T. Yoshie, Phys. Lett. B, 141 (1984).
[59] M. Teper, Phys. Lett. B, 357 (1985).
[60] E.-M. Ilgenfritz, M. Laursen, G. Schierholz, M. Müller-Preussker, and H. Schiller, Nucl. Phys. B, 693 (1986).
[61] M. Campostrini, A. Di Giacomo, H. Panagopoulos, and E. Vicari, Nucl. Phys. B, 683 (1990).
[62] B. Alles, L. Cosmai, M. D'Elia, and A. Papa, Phys. Rev. D, 094507 (2000), hep-lat/0001027.
[63] C. Bonati and M. D'Elia, Phys. Rev. D, 105005 (2014), 1401.2441.
[64] C. Alexandrou, A. Athenodorou, and K. Jansen, Phys. Rev. D, 125014 (2015), 1509.04259.
[65] M. Creutz, Phys. Rev. D, 2308 (1980).
[66] A. Kennedy and B. Pendleton, Phys. Lett. B, 393 (1985).
[67] M. Creutz, Phys. Rev. D, 515 (1987).
[68] L. Del Debbio, H. Panagopoulos, P. Rossi, and E. Vicari, JHEP, 009 (2002), hep-th/0111090.
[69] N. Cabibbo and E. Marinari, Phys. Lett. B, 387 (1982).
[70] B. Lucini, M. Teper, and U. Wenger, JHEP, 033 (2005), hep-lat/0502003.
[71] B. A. Berg, Markov Chain Monte Carlo Simulations and their Statistical Analysis (World Scientific Publishing, Singapore, 2004), p. 196–235, ISBN 978-981-238-935-0.
[72] A. Gonzalez-Arroyo and M. Okawa, PoS LATTICE2012, 221 (2012), 1212.3835.
[73] M. Göckeler, R. Horsley, A. Irving, D. Pleiter, P. E. Rakow, G. Schierholz, and H. Stüben (QCDSF, UKQCD), Nucl. Phys. B Proc. Suppl., 228 (2005), hep-lat/0409166.
[74] T. Vonk, F.-K. Guo, and U.-G. Meißner, JHEP06