Path integral contour deformations for observables in SU(N) gauge theory
William Detmold, Gurtej Kanwar, Henry Lamm, Michael L. Wagman, Neill C. Warrington
FFERMILAB-PUB-21-014-T, INT-PUB-21-002, MIT-CTP/5270
Path integral contour deformations for observables in SU ( N ) gauge theory William Detmold,
1, 2
Gurtej Kanwar,
1, 2
Henry Lamm, Michael L. Wagman, and Neill C. Warrington Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA The NSF AI Institute for Artificial Intelligence and Fundamental Interactions Fermi National Accelerator Laboratory, Batavia, IL 60510, USA Institute for Nuclear Theory, University of Washington, Seattle, Washington 98195-1550
Path integral contour deformations have been shown to mitigate sign and signal-to-noise problemsassociated with phase fluctuations in lattice field theories. We define a family of contour deforma-tions applicable to SU ( N ) lattice gauge theory that can reduce sign and signal-to-noise problemsassociated with complex actions and complex observables. For observables, these contours can beused to define deformed observables with identical expectation value but different variance. Asa proof-of-principle, we apply machine learning techniques to optimize the deformed observablesassociated with Wilson loops in two dimensional SU (2) and SU (3) gauge theory. We study loopsconsisting of up to 64 plaquettes and achieve variance reduction of up to 4 orders of magnitude. I. INTRODUCTION
In order to test the Standard Model and search for newphysics in experiments involving hadrons and nuclei, pre-cision calculations of Standard Model observables are re-quired. To achieve this, Monte Carlo (MC) calculationsof lattice-regularized path integrals for quantum chro-modynamics (QCD) have been used to make precise pre-dictions for many phenomenologically relevant quantitiesin the meson and nucleon sectors; for recent reviews seeRefs. [1–5]. Lattice QCD calculations using MC methodsare performed in Euclidean spacetime where the action S ( U ) is typically a real function of the discretized gaugefield U x,µ ∈ SU (3) and an ensemble of gauge fields canbe generated with probability distribution proportionalto e − S ( U ) . Predictions for the expectation values of ob-servables O are then obtained by averaging the resultsfor O ( U ) obtained for each gauge field configuration.For correlation functions describing nucleons, nuclei,and most mesons, O ( U ) is complex and includes a gauge-field-dependent phase [6, 7]. Phase fluctuations growmore rapid with increasing Euclidean time separationand lead to a “signal-to-noise (StN) problem” in whichthe StN for a fixed size statistical ensemble decreases ex-ponentially with increasing time separation [6–12]. Al-ternatively, multi-nucleon systems can be probed by in-cluding non-zero quark chemical potential in the action.In this case, e − S ( U ) is not real and positive definite andcannot be interpreted as a probability distribution andthe theory is said to have a “sign problem.” If e − Re S ( U ) is used to define a probability distribution in this case, e − i Im S ( U ) leads to rapid phase fluctuations of path in-tegrands and the appearance of a StN problem that isexponential in the spacetime volume [13–19].A generic method for taming sign and StN problemsin path integrals has recently emerged. This method in-volves deforming the manifold of integration of the pathintegral into a complexified field space. If the path inte-grand is a holomorphic function of the field variables,then a multi-dimensional version of Cauchy’s integraltheorem ensures that expectation values of the corre- sponding observable is unchanged by the manifold de-formation. Manifold deformation may, however, changethe values of observables on individual field configura-tions and therefore modify the severity of phase fluctu-ations and associated sign/StN problems. Several meth-ods for finding useful manifolds have been proposed andsuccessfully applied in lattice field theories as well as non-relativistic quantum mechanical theories relevant for con-densed matter physics [20–46]. For a recent review seeRef. [47]. Although most applications have focused onimproving sign problems in theories with complex ac-tions, an analogous method for improving StN problemsfor complex observables in theories with real actions wasintroduced in Ref. [48].The focus of this work is on addressing sign/StN prob-lems in QCD-like theories; in this setting, path integralcontour deformations have so far been restricted to solv-ing sign problems in simple contexts. Thermodynamicobservables at non-zero quark chemical potential havebeen computed in (0 + 1)D QCD, a theory with a singlelink variable, using Lefschetz thimble methods [49] andthe generalized thimble method [50], while preliminaryresults using neural-network manifolds were obtained in[51]. In higher dimensions, Lefschetz thimbles were usedto analyze finite-density observables for one- and two-sitesystems in the heavy-dense limit in Ref. [52], and thisstudy predicted that the number of relevant Lefschetzthimbles would grow exponentially with the number oflattice sites for larger systems. The task of computingnoisy observables in non-Abelian lattice gauge theoriesfor larger spacetime volumes and higher spacetime di-mensions has remained challenging.This paper introduces a simple yet expressive family ofcomplexified manifolds for taming sign/StN problems in SU ( N ) gauge theory using path integral contour defor-mations. The Jacobians required for calculations usingthis family of deformations are shown to be triangularmatrices whose determinants can be computed with acost that scales linearly with the spacetime volume. Thisfamily of manifolds can be applied to all theories involv-ing SU ( N ) gauge fields, including gauge theories coupledto matter fields, although their practical utility for im- a r X i v : . [ h e p - l a t ] J a n proving sign/StN problems is only explored here for puregauge theory.The deformed observable method introduced inRef. [48] relates path integrals over deformed contoursto path integrals written in terms of modified observ-ables on undeformed contours, enabling improvement inthe StN of observables without the need to modify MCsampling. We apply the method here to calculations ofWilson loops in SU (2) and SU (3) gauge theory, in whichWilson loops are known to have an exponentially severeStN problem and have been used to study other StN im-provement methods [53]. Calculations are performed in(1 + 1)D as a proof of concept, as it is possible to com-pare with exact StN results derived analytically and touse specialized approaches for efficient Monte Carlo en-semble generation for (1 + 1)D gauge theories. Resultsare obtained for a range of Wilson loop areas and latticespacings including areas of up to 64 lattice units at thefinest lattice spacing. The variances of Wilson loops withlargest areas are reduced by factors of 10 –10 , demon-strating that deformed observables can dramatically im-prove StN problems in SU ( N ) lattice gauge theory. Thelinear scaling with spacetime volume of these contour de-formations suggests that it should be computationallyfeasible to explore the application of analogous contourdeformations to (3 + 1)D lattice gauge theory in futurework.The remainder of this paper is organized as follows.Sec. II describes our approach to contour deformationsfor SU ( N ) variables, including a family of complex man-ifolds for integration over sets of SU ( N ) variables, andreviews the deformed observables method introduced inRef. [48]. Sec. III presents analytical results for expec-tation values and variances of observables in (1 + 1)D SU ( N ) lattice gauge theory. Results for MC calculationsof deformed observables for Wilson loops are presentedfor SU (2) gauge theory in Sec. IV and for SU (3) gaugetheory in Sec. V. A summary of results and considerationof future work is found in Sec. VI. II. GENERAL FORMALISM
Cauchy’s integral theorem implies that the contour ofa complex line integral can be deformed without chang-ing the integral value if the integrand is holomorphic inthe intervening region and the endpoints are held fixed. When multidimensional integration is performed, the fulltheorem can be generalized if the integral is describableas iterated complex line integrals or by a technical ex-tension to the full multivariate setting [54]. For the pur-pose of contour deformations, however, only a weakerform of the theorem (equivalent to Stokes’ theorem) is For periodic functions this condition on the endpoints can berelaxed, as discussed in Sec. II A. θ invalid (cid:101) θ ( θ )valid (cid:101) θ ( θ ) φ valid (cid:101) φ ( φ )valid (cid:101) φ ( φ )identifiedFIG. 1. Left: schematic depiction of valid and invalid contourdeformations, defined by the mapping (cid:101) θ ( θ ) from base coordi-nates to the manifold, when the original domain is a finite in-terval. Right: schematic depiction of additional allowed defor-mations (shifts) when endpoints are identified; these shifts areapplicable to U (1) variables or azimuthal angles φ in SU ( N )manifolds. required. Specifically, a contour deformation from man-ifold M A to M B leaves the integral value unchanged if M A ∪M B bounds a region in which the integrand is holo-morphic; see Ref. [47] for a simple proof. To implementsuch contour deformations and confirm holomorphy of anintegrand throughout the relevant region of configurationspace, a coordinate parameterization is useful. We dis-cuss such parameterizations and contour deformation for SU ( N ) groups and SU ( N ) gauge theory in the followingsections. A. Contour deformations of angular parameters
A general formalism for applying path integral con-tour deformations to SU ( N ) group integrals can be ob-tained by using manifold coordinates that map subsetsof R N − to SU ( N ). For any N , the group manifoldcan be given explicit global coordinates using N − azimuthal angles φ , . . . , φ J ∈ [0 , π ] and zenith an-gles θ , . . . , θ K ∈ [0 , π/ J = ( N + N − / K = ( N − N ) / The azimuthal angles are peri-odic, such that φ i = 0 is identified with φ i = 2 π , whilethe zenith angles have distinct endpoints. We define thecombined coordinate Ω ≡ ( φ , . . . , φ J , θ , . . . , θ K ).A generic integral over group-valued variable U ∈ SU ( N ) can be written as I = (cid:90) dU f ( U ) , (1)where the Haar measure dU is defined to be the uniquemeasure that satisfies d ( V U ) = d ( U V ) = dU for V ∈ This is not the only possible assignment of angular coordinates tothe manifold. For example, Appendix B explores an alternativeparameterization for SU (2). SU ( N ). We choose the conventional normalization con-dition (cid:82) dU = 1. Using the coordinates introducedabove, the integral can immediately be cast as integra-tion over a sub-domain of R N − , I = J (cid:89) j =1 (cid:20)(cid:90) π dφ j (cid:21) K (cid:89) k =1 (cid:34)(cid:90) π/ dθ k (cid:35) h (Ω) f ( U (Ω)) , (2)where h (Ω) is the Jacobian factor associated with thechange of measure from dU to d Ω = (cid:81) j dφ j (cid:81) k dθ k , and U (Ω) is the group element at the manifold coordinateΩ. The specific forms of h (Ω) and U (Ω) for the groups SU (2) and SU (3) are presented in Sec. IV A and V A.From Eq. (2), it is clear that each θ k can be deformedinto the complex plane holding the endpoints 0 and π/ φ j can be deformed under the weaker con-straint that the endpoints remain identified, as shownin Fig. 1. This can be seen by noting that the end- points of the shifted contour can be connected to theoriginal endpoints using a pair of segments parallel tothe imaginary axis; these segments differ by a real 2 π shift and a change of orientation so that integrals of peri-odic functions along these segments exactly cancel. Thisapproach to deforming periodic variables with identifiedendpoints has previously been applied to path integralsinvolving U (1) variables [48, 56–58]. If the integrand h (Ω) f ( U (Ω)) is a holomorphic function of all componentsof Ω, the value of the integral is unchanged by the de-formations described above. We can define a deformedintegration contour by the maps (cid:101) φ j = (cid:101) φ j (Ω) ∈ C and (cid:101) θ i = (cid:101) θ i (Ω) ∈ C ; for conciseness the collective set of de-formed coordinates can be written as a function of orig-inal coordinates (cid:101) Ω ≡ ( (cid:101) φ , . . . , (cid:101) φ J , (cid:101) θ , . . . , (cid:101) θ K ) = (cid:101) Ω(Ω).The value of the integral is unchanged by this deforma-tion, I = J (cid:89) j =1 (cid:20)(cid:90) π dφ j (cid:21) K (cid:89) k =1 (cid:34)(cid:90) π/ dθ k (cid:35) J (Ω) h ( (cid:101) Ω) f ( U ( (cid:101) Ω))= J (cid:89) j =1 (cid:20)(cid:90) π dφ j (cid:21) K (cid:89) k =1 (cid:34)(cid:90) π/ dθ k (cid:35) h (Ω) (cid:40) J (Ω) h ( (cid:101) Ω) h (Ω) f ( U ( (cid:101) Ω)) (cid:41) = (cid:90) dU J ( U ) f ( (cid:101) U ) . (3)Above, (cid:101) U ≡ U ( (cid:101) Ω) is the deformed group element, J (Ω) ≡ det ∂ (cid:101) Ω α ∂ Ω β is the (complex) Jacobian factor relating themeasure d Ω of the base contour to the measure d (cid:101) Ω ofthe deformed contour, and the Jacobian factor relatingthe Haar measure dU between the original and deformedcontours is given by J ( U ) ≡ J (Ω) h ( (cid:101) Ω) /h (Ω) . (4)For any concrete map U (Ω), the deformed group ele-ment is given by simply applying the map to the com-plexified coordinate (cid:101) Ω. Any real Lie group has a uniquecomplexification and the space in which (cid:101) U lives is well un-derstood. In particular, the complexification of SU ( N )is SL ( N, C ) [59]. This is easy to see on intuitive grounds,as SU ( N ) matrices are specified by unit-norm eigenval-ues with determinant 1 and the effect of complexifyingthe group is to allow arbitrary non-zero eigenvalues whilepreserving the determinant constraint, resulting in thegroup SL ( N, C ).Though the deformation is defined in terms of the man-ifold coordinates, we can see in the last line of Eq. (3)that this new integral can be written independently of thecoordinates, using the modified integrand J ( U ) f ( (cid:101) U ( U )).This form suggests a coordinate-independent definition of a holomorphic integrand and contour deformation. Sucha general approach is beyond the scope of this work, butwe note that a deformation can be applied in any co-ordinate system so long as the integrand can be shownto be holomorphic in some coordinate system. The an-gular coordinates for SU ( N ) are sufficient to show thatall components of the matrix representation of U areholomorphic (i.e. the Wirtinger derivatives ∂U (Ω) /∂ ¯Ω i are zero [54]). The components of U − are holomor-phic whenever U is invertible, as can be seen by startingwith the definition of the inverse, U − U = 1, applyingWirtinger derivatives to both sides, ∂ ( U − U ) / ¯Ω i = 0,and using holomorphy of U to obtain ∂U − /∂ ¯Ω i = 0.The general construction so far is valid for collectionsof SU ( N ) group variables, and is therefore applicable to SU ( N ) lattice gauge theory. Standard pure-gauge ac-tions for SU ( N ) lattice gauge theory can be written asholomorphic functions of the variables U and their in-verses, if we replace instances of U † with U − , and arethus holomorphic throughout the SL ( N, C ) domain ofeach complexified group variable. This fact has previ-ously been recognized in complex Langevin approachesto extensive sign problems in lattice gauge theory [60–62]. Similarly, (unrooted) fermionic determinants can bewritten as polynomials in components of gauge links andare therefore holomorphic [36], and many observables canbe analyzed in the same way. B. Vertical contour deformations
We define a family of deformed manifolds for an SU ( N ) variable in terms of the angular parameterizationby (cid:101) Ω = Ω + if (Ω; λ, χ ) , (5)where f (Ω; λ, χ ) ∈ R N − . This family of vertical de-formations inspired by Refs. [56, 63] is not fully generalas it only includes contour deformations describable byvertically shifting each point purely in the imaginary di-rection, and in particular it does not include any defor-mations with Re (cid:101) Ω(Ω) = Re (cid:101)
Ω(Ω (cid:48) ) for some Ω (cid:54) = Ω (cid:48) . TheJacobian of any deformation in this family is straightfor-ward to compute as J (Ω) = det ∂ (cid:101) Ω α ∂ Ω β = det (cid:20) δ αβ + i ∂f α (Ω; λ, χ ) ∂ Ω β (cid:21) , (6)where α, β = 1 , . . . , N − φ , . . . , φ J , θ , . . . , θ K ).The function f can further be expanded in terms ofFourier modes, f (Ω; λ, χ ) = Λ (cid:88) { n i } =0 Λ (cid:88) { m j } =1 λ I T I (Ω; χ I , . . . , χ aI ) , (7)where I ≡ ( n . . . n a , m . . . m b ), and T I (Ω; χ . . . χ a ) ≡ a (cid:89) i =1 sin( φ i n i + χ i ) b (cid:89) j =1 sin(2 θ j m j ) , (8)which provides a complete basis for vertical contour de-formations in the limit Λ → ∞ . Including successivelymore Fourier modes by increasing the Fourier cutoff Λsystematically improves the flexibility of the function f .In our applications to SU ( N ) gauge theories in (1 + 1)Dbelow, Fourier cutoffs Λ ∈ { , , } are explored. It isnoteworthy that the sum over azimuthal Fourier modesincludes the constant mode as well as phase offsets χ i because azimuthal angles can be deformed without fix-ing their endpoints as discussed above. These constantmodes are essential for the sign/StN problem reductionachieved in (1 + 1)D examples below. C. Path integral deformations for noisy observables
We next review the deformed observable approach pre-sented in Ref [48], in which contour deformations of lat-tice field theory path integrals are used to define modi-fied observables with improved noise properties and un-changed expectation value. We focus here on SU ( N ) lattice gauge theory path integrals, which are high-dimensional integrals over a collection of group-valueddegrees of freedom U x,µ ∈ SU ( N ). Here, x specifies asite on the discrete spacetime lattice and µ ∈ { , . . . , N d } is any of the N d spacetime directions on the lattice. Theintegrals under study take the form (cid:104)O(cid:105) ≡ Z (cid:90) D U O ( U ) e − S ( U ) , (9)where Z = (cid:90) D U e − S ( U ) (10)and the action S ( U ) ∈ R is a function of all gauge links.Details on the construction of lattice gauge theory arepresented in Sec. III.It is possible to deform the integration contour of an SU ( N ) lattice gauge theory path integral by individu-ally deforming each group-valued variable U x,µ using theformalism presented above. In principle the deformedlink, (cid:101) U x,µ , could be a function of all other links on thelattice. However, evaluating the Jacobian factor arisingfrom such an arbitrary deformation would require O ( V )operations, where V is the number of sites of the lattice.For state-of-the-art lattice field theory calculations, thisis intractable. A similar obstacle is encountered in theapplication of normalizing flows to sampling probabilitydistributions in image or lattice data analysis, see e.g.Ref. [64] for a review, where it is avoided by explicitlyrestricting to triangular Jacobians for which the deter-minant factor is efficiently calculable from the diagonalelements. In this work, we similarly restrict to triangu-lar Jacobians by allowing deformations of each variableto depend only on previous variables in a canonical or-dering, described in detail for our particular studies of SU (2) and SU (3) in the following sections. Exploringother options is an interesting possibility for future work;for example, transformations built from a composition ofmultiple triangular transformations may allow more gen-eral spacetime dependence without significant increasein cost, whereas other alternatives based on convolutionsmay scale supra-linearly as the volume is increased.Given a deformed manifold M with tractable Jaco-bian factor, a deformed integral can be constructed tocompute the same expectation value defined by Eq. (9), (cid:104)O(cid:105) = 1 Z (cid:90) M D (cid:101) U O ( (cid:101) U ) e − S ( (cid:101) U ) . (11)The above equality holds if the original manifold canbe deformed to M without encountering any non-holomorphic regions of the integrand, i.e. if the integrandis holomorphic in the region bounded by the union of M and the original manifold. The abstract manifold M canbe specified by a map (cid:101) U ( U ), which then gives a concreteprescription for computing Eq. (11), (cid:104)O(cid:105) = 1 Z (cid:90) D U J ( U ) O ( (cid:101) U ( U )) e − S ( (cid:101) U ( U )) . (12)In contrast to Eq. (9), the path integral in Eq. (12) in-volves a generically complex-valued action, S ( (cid:101) U ( U )) ∈ C , and a different observable J ( U ) O ( (cid:101) U ( U )) written interms of the (efficiently computed) Jacobian factor J ( U ).Cauchy’s theorem implies that these modifications con-spire to cancel and result in an identical expectationvalue.It is useful to further rewrite Eq. (12) as a path integralwith respect to the original action, (cid:104)O(cid:105) = 1 Z (cid:90) D U (cid:110) J ( U ) O ( (cid:101) U ( U )) e − S ( (cid:101) U ( U ))+ S ( U ) (cid:111) e − S ( U ) ≡ Z (cid:90) D U Q ( U ) e − S ( U ) = (cid:104)Q(cid:105) . (13)In this rewriting, it is clear that the deformed path in-tegral is still accessible by performing Monte Carlo sam-pling with respect to the original action S ( U ) and esti-mating the sample mean of Q ( U ). These methods cantherefore be applied at the measurement step, after anensemble of field configurations has been sampled. Inessence, the deformed path integral defines a new observ-able Q ( U ) that has provably identical expectation valueto the original observable O ( U ). Notably, this new ob-servable generically has very different structure from O ,as it may be non-local and depends on the structure ofthe action.The variance of Q ( U ) can, however, be vastly differentfrom the variance of O ( U ). For most observables, sam-ples of O ( U ) are complex-valued, and the variance of thereal and imaginary components are given byVar[Re O ] = 12 (cid:10) |O | (cid:11) + 12 (cid:10) O (cid:11) − [Re (cid:104)O(cid:105) ] , Var[Im O ] = 12 (cid:10) |O | (cid:11) − (cid:10) O (cid:11) − [Im (cid:104)O(cid:105) ] . (14)These are not generically identical to the variance of Re Q and Im Q ,Var[Re Q ] = 12 (cid:10) |Q | (cid:11) + 12 (cid:10) Q (cid:11) − [Re (cid:104)Q(cid:105) ] , Var[Im Q ] = 12 (cid:10) |Q | (cid:11) − (cid:10) Q (cid:11) − [Im (cid:104)Q(cid:105) ] . (15)The final term is unchanged because (cid:104)O(cid:105) = (cid:104)Q(cid:105) , but thefirst two terms in both lines of Eq. (15) are not genericallyequal to the corresponding terms in Eq. (14). Explicitly,those terms are given by (cid:10) |Q | (cid:11) = 1 Z (cid:90) D U (cid:12)(cid:12)(cid:12) J ( U ) O ( (cid:101) U ( U )) (cid:12)(cid:12)(cid:12) e − S ( (cid:101) U ( U ))+ S ( U ) , (cid:10) Q (cid:11) = 1 Z (cid:90) D U J ( U ) O ( (cid:101) U ( U )) e − S ( (cid:101) U ( U ))+ S ( U ) . (16)Extra factors of S ( U ) persist in both expressions inEq. (16), and a non-holomorphic absolute value appears for (cid:10) |Q | (cid:11) , preventing identification with the terms inEq. (14). It can thus be fruitful to look for a modifiedobservable Q for which the terms in Eq. (16) are min-imized and the statistical noise is less than that of theoriginal observable.Given an explicit parameterization of the deformedcontour, standard gradient-based optimization methodscan be applied to find the parameters that minimize theterms in Eq. (16). Since the parameters only affect theobservable itself (the sampling weight is always e − S ( U ) ),the gradient of the variance with respect to the param-eters can be written as an expectation value under theoriginal Monte Carlo sampling. In this work, minimiza-tion of Eq. (16) is performed using stochastic gradientdescent over the deformed contour parameters, which ap-proximately converges to a local minimum of the varianceunder mild assumptions [65–68], and starting at the orig-inal manifold ensures that the result improves relative to(or at worst is equivalent to) the original variance.A potential obstacle to the deformed observablemethod is that some contour deformations could lead toa severe overlap problem between probability distribu-tions proportional to e − S ( U ) and e − Re[ S ( (cid:101) U ( U ))] that couldmake estimates of deformed observable variances from fi-nite Monte Carlo ensembles unreliable. The possibilityof constructing deformed observables with severe overlapproblems can be mitigated, however, by constructing de-formed observables that minimize the variance of Q sincelarge fluctuations of e − S ( (cid:101) U ( U ))+ S ( U ) will tend to increasethe variance of Q . Overlap problems could still arise dueto overfitting to the sample variance of a Monte Carlo en-semble that does not sample the field configurations asso-ciated with large fluctuations of e − S ( (cid:101) U ( U ))+ S ( U ) ; practicalstrategies to avoid such overfitting problems are discussedin Sec. IV C below.The terms to minimize in Eq. (16) are specific to aparticular observable O . There is no reason to expect asingle manifold deformation to be optimal for all possibleobservables, but one does expect similar observables tobe highly correlated, and thus to receive similar varianceimprovements from the same deformation. This suggeststwo useful practical improvements:1. Optimal manifolds can be found for a few repre-sentative observables and can be reused for othersimilar observables.2. When optimizing manifold parameters, the optimalparameters for a similar observable can be used toinitialize the search.In our studies of SU (2) and SU (3) lattice gauge theory inSec. IV and Sec. V, the initialization approach was foundto significantly reduce the number of steps required foroptimization. D. Rewriting observables before deformation
It is often the case that the expectation value of anobservable O has multiple equivalent path integral repre-sentations, for example when a theory possesses a gaugeor global symmetry that modifies O but leaves the actionand expectation values of observables invariant. In ad-dition to the freedom of choosing the parameterizationof contour deformations for a given path integral dis-cussed above, the application of contour deformations toobservables includes the freedom of choosing which pathintegral representation to deform.Simple examples of this freedom arise in two-dimensional U (1) Euclidean lattice gauge theory. Forinstance, path integral representations of Wilson loopswith unit area in this theory are proportional to (cid:90) π − π dφ π e iφ e β cos φ = I ( β ) , (17)which is an integral representation of the modified Besselfunction of the first kind I n ( β ), with n = 1, written interms of the field variable φ . The integrand appearingcan be interpreted as a product of an observable O = e iφ and an e − S factor for the Euclidean action S = − β cos φ .This theory has a charge conjugation symmetry that actson the integrand of Eq. (17) by φ → − φ , which leavesthe action invariant but modifies the observable e iφ → e − iφ . An equally valid integral representation of I ( β )is obtained by averaging the observable and its chargeconjugate as I ( β ) = (cid:90) π − π dφ π cos( φ ) e β cos φ . (18)The choice of integrand in Eq. (17) or Eq. (18) is irrel-evant for Monte Carlo calculations using the integrationcontours shown because the Monte Carlo estimator usedin the first case is Re e iφ = cos φ . However, the abilityof path integral contour deformations to reduce the vari-ance of a Monte Carlo calculation does depend on therepresentation. In particular, the variance of a MonteCarlo evaluation of Eq. (17) can be significantly reducedby contour deformations while the variance of a MonteCarlo evaluation of Eq. (18) cannot, as discussed belowand shown in Fig. 2.Ref. [48] demonstrated that the variance of two-dimensional U (1) Wilson loops can be significantly re-duced using contour deformations of the representationgiven in Eq. (17); we review the analytical derivationhere. Denote averaging of O ( φ ) with respect to the pathintegral of the theory with action − β cos φ by (cid:104)O ( φ ) (cid:105) β ≡ (cid:90) π − π dφ πI ( β ) O ( φ ) e β cos φ , (19)where the modified Bessel function with rank n = 0is used to normalize the distribution. In general, themodified Bessel functions of the first kind are given by I n ( β ) = (cid:82) π − π dφ π e inφ e β cos φ and will be used throughoutthe following derivation. Further denoting the corre-sponding variance of O given β by Var β [ O ], the varianceof Re e iφ = cos φ can be computed to beVar β [Re e iφ ] = (cid:10) cos ( φ ) (cid:11) β − (cid:10) e iφ (cid:11) β = 12 (cid:20) I ( β ) I ( β ) (cid:21) − (cid:20) I ( β ) I ( β ) (cid:21) . (20)The constant vertical deformation φ → (cid:101) φ ( φ ) = φ + if (21)leads to a deformed observable Q e ( φ ) = e i (cid:101) φ ( φ ) e β cos( (cid:101) φ ( φ )) − β cos( φ ) (22)satisfying (cid:104)Q e (cid:105) β = (cid:10) e iφ (cid:11) β . Using Re Q e as a Monte Carloestimator instead of Re e iφ results in varianceVar β [Re Q e ] = e − f (cid:20) R ( β, f )2 + V ( β, f ) (cid:21) − (cid:20) I ( β ) I ( β ) (cid:21) , (23)where V ( β, f ) = (cid:32) e f − e − f − (cid:33) I ( β (cid:112) − f ))2 I ( β ) ,R ( β, f ) = I ( β (2 cosh( f ) − I ( β ) . (24)For positive β , Var β [Re Q e ] generically has a minimum atsome strictly positive f that defines the optimal contourfor reducing the variance of Q e . Using deformed observ-ables associated with this optimal contour rather thanthe original contour leads to variance reduction whosesignificance is larger than an order of magnitude for β (cid:38) U (1) Wilson loops [48].Applying the same constant vertical deformation tothe representation in Eq. (18) leads to an alternative de-formed observable Q c ( φ ) = cos( (cid:101) φ ( φ )) e β cos( (cid:101) φ ( φ )) − β cos( φ ) (25)satisfying (cid:104)Q c (cid:105) β = (cid:10) e iφ (cid:11) β . The real part of this alterna-tive deformed observable has varianceVar β [Re Q c ] = 12 e − f V ( β, f ) + 12 e f V ( β, − f )+ 12 R ( β, f ) − (cid:20) I ( β ) I ( β ) (cid:21) . (26)This expression is symmetric under f → − f and its gra-dient with respect to f therefore vanishes at f = 0, whichcan be verified to be the minimum of f . The deformedobservable Q c associated with integrating along the real − . − . . . . f V a r β [ R e ( e i φ ) ] / V a r β [ R e Q e ] β = 2 β = 3 β = 4 β = 5 − . − . . . . f V a r β [ c o s ( φ ) ] / V a r β [ R e Q c ] β = 2 β = 3 β = 4 β = 5 FIG. 2. Ratios of the variance of observable Re( e iφ ) = cos( φ ) to the variance of deformed observables obtained by applyingthe transformation φ → φ + if to the integrand e iφ (left) or cos( φ ) (right) in the one-dimensional path integral with Euclideanaction − β cos φ discussed in the main text. Gray hatching indicates the region in which the variance of the deformed observableis higher than the original (i.e. there is no improvement). axis is thus the choice with lowest variance, while increas-ing | f | away from zero always leads to increased varianceas illustrated for several choices of β in Fig. 2.The qualitative features of these results can be under-stood from the behaviors of magnitude and phase fluc-tuations of deformed observables. The magnitude of e i (cid:101) φ is reduced for all φ by a constant vertical deformationwith f >
0. To preserve the results of the holomor-phic integral in Eq. (17) under this deformation, cancel-lations from phase fluctuations must correspondingly beless severe and sign/StN problems should therefore beimproved. On the other hand, the magnitude of cos( (cid:101) φ )always satisfies | cos( (cid:101) φ ) | ≥ | cos( φ ) | and therefore can onlybe increased by applying a vertical contour deformation.This suggests that phase fluctuations must become moresevere to preserve deformation-invariant integral resultsand that sign/StN problems will be worsened. This argu-ment applies to non-constant vertical deformations andsuggests that it is generically difficult to construct a con-tour deformation of Eq. (18) that leads to a deformedobservable with variance comparable to the observable Q e obtained by deforming Eq. (17).In the non-Abelian gauge theories that are focus ofthis work, the traces of Wilson loops define gauge invari-ant observables related to the potential energies of staticquark configurations analogous to e iφ in U (1) gauge the-ory. The eigenvalues of Wilson loops in U ( N ) or SU ( N )gauge theories can be expressed as e iφ j , where φ j ∈ R for j = 1 , . . . , N and for the case of SU ( N ) the phasessatisfy (cid:80) Nj =1 φ j = 0 mod 2 π . The trace of the Wilsonloop is given by (cid:80) j e iφ j . For SU ( N ) gauge theory, theunit determinant condition therefore results in analogousobstacles to improving the variance of the trace of Wil-son loops if the observable is not first rewritten usingsymmetries. For the case of SU (2), there is a precise correspondencebetween Wilson loop traces and Eq. (18). The unit de-terminant condition requires the Wilson loop eigenvaluesto be of the form e iφ and e − iφ and the trace appearingin both the observable and the Wilson action [69] (dis-cussed below in Sec. IV) are proportional to cos( φ ). Itis therefore similarly impossible to improve the varianceof a unit area SU (2) Wilson loop trace using constantvertical deformations, and the previous analysis suggestsit is difficult to make significant variance improvementsto traces of SU (2) Wilson loops of general area using ver-tical contour deformations. However, this analysis alsoindicates that e iφ could provide a suitable starting pointfor defining deformed observables. The gauge invariantcomponent of the Wilson loop eigenvalue e iφ is cos( φ )making this a suitable rewriting of the observable thatdoes not affect the expectation value but enables vari-ance improvements by contour deformation. In the case of SU ( N ) with N ≥
3, complex eigenvaluesare not guaranteed to come in complex conjugate pairs,but the unit determinant condition still guarantees thatany vertical deformation of the eigenvalue phases will in-crease the magnitude of at least one e iφ j phase factor.In the sum over eigenvalues defining the trace of a Wil- We could instead use linearity of the expectation value tostart from an explicitly gauge-invariant observable and rewrite (cid:104) cos( φ ) (cid:105) = (cid:10) e iφ (cid:11) + (cid:10) e − iφ (cid:11) . Most generally, we could then ap-ply distinct deformations to each expectation value on the right-hand-side of this expression, and in particular applying constantimaginary shifts with opposite signs will result in the same StNimprovement for both terms. In this example, the two expecta-tion values can be equated using charge conjugation symmetry,which allows the exact rewriting (cid:104) cos( φ ) (cid:105) = (cid:10) e iφ (cid:11) , but the tech-nique of splitting the expectation value using linearity and apply-ing independent deforms may be useful where such a symmetryis not present. son loop, the largest eigenvalue magnitude will set thetypical observable magnitude on generic gauge fields inthe integration domain. While it is possible for strongercancellation between eigenvalues to occur on particulargauge fields, this larger typical magnitude throughout themajority of the domain of integration suggests that can-cellations from phase fluctuations with similar severity tothose of the original observable will therefore be requiredfor such deformed observables to achieve identical expec-tation values, and significant sign/StN problems may beexpected. If one instead chooses the non-gauge-invariantintegrand e iφ to measure the same expectation value,the determinant constraint does not prevent exponen-tially decreasing the magnitude throughout the integra-tion domain using contour deformations. For example,one can choose the shift φ → (cid:101) φ = φ + ifφ → (cid:101) φ = φ − if / φ → (cid:101) φ = φ − if / e − f on every gauge field in the integration domain.Rewriting Wilson loop observables based on eigenval-ues requires diagonalization, and a more practical al-ternative is to use the (1 ,
1) component of the (matrix-valued) Wilson loop as another non-gauge-invariant func-tion with the same expectation value as the trace dividedby N . The phase of this (or any other) single color com-ponent of the Wilson loop is not constrained by the unitdeterminant condition and therefore one expects that asuitable parameterization can be found in which verticaldeformations can be applied to the phase of the (1 , e iφ . Suchparameterizations are given for SU (2) in Sec. IV and for SU (3) in Sec. V following Ref. [55] and are used as start-ing points for defining deformed observables with reducedvariance in calculations of Wilson loop expectation val-ues. An alternative parameterization of SU (2) in whichthe real part of the (1 ,
1) component of the Wilson loopis expressed as cos( α/
2) is explored in Appendix B; asexpected, vertical contour deformations do not improvethe variance of unit area Wilson loops and less (thoughstill significant) variance reduction is found for larger areaWilson loops with this alternative parameterization.
III. NOISE PROBLEMS IN SU ( N ) LATTICEGAUGE THEORY
A simple setting for analyzing SU ( N ) lattice gaugetheory is obtained by considering (1 + 1)D Euclideanspacetime with open boundary conditions. In this space-time geometry, much like in (3 + 1) dimensions, the the-ory features confinement of static test charges and anexponentially severe StN problem associated with staticquark correlation functions, which can be identified withWilson loops [69]. Numerical calculations of Wilson loop expectation values can be performed at much lower com-putational cost in (1 + 1)D than (3 + 1)D, facilitatinga first exploration of path integral contour deformationsapplied to non-Abelian gauge theory observables on non-trivial lattices. Analytic results for (1 + 1)D observablessuch as Wilson loops are also known [70, 71] and can beused to verify the correctness of numerical results. Theseresults are extended to analytic results for the variancesof (1 + 1)D Wilson loops below, which are then used inSecs. IV–V to verify the correctness and study the effec-tiveness of contour deformations applied to Wilson loops.In particular, analytic results can be used to determinethe StN gains obtained by using deformed observableseven when the corresponding undeformed observables aretoo noisy to be determined reliably. A. SU ( N ) lattice gauge theory in (1 + 1) D Lattice gauge theory in (1 + 1)D is defined on a set V of Euclidean spacetime points x arranged in a discretetwo-dimensional lattice, with vectors ˆ1 and ˆ2 giving thedisplacement in lattice units between neighboring latticesites along the two Euclidean spacetime axes. The dis-cretized gauge field is represented by group-valued vari-ables on each link of the lattice, with U x,µ denoting thevariable associated with link ( x, x + ˆ µ ). The physical con-tent of the theory is encoded in the (discretized) action.We consider the Wilson action for SU ( N ) lattice gaugetheory [69], given for a (1 + 1)D Euclidean spacetimevolume by S ≡ − g (cid:88) x ∈V tr (cid:0) P x + P − x (cid:1) , (28)where g is the bare gauge coupling and each plaquette P x ∈ SU ( N ) is defined as P x ≡ U x, U x +ˆ1 , U − x +ˆ2 , U − x, . (29)Writing the action and plaquettes using inversion ratherthan Hermitian conjugation allows the relevant inte-grands to be interpreted in the following sections as holo-morphic functions of integration variables throughout thecomplexified domain. For SU ( N ) elements these opera-tions are equivalent, but analytically continuing the ac-tion to SL ( N, C ) requires the use of the inverse [60–62].Expectation values of operators O ( U ) in the latticeregularized theory are defined by specializing Eq. (9)–(10) to the particular case of SU ( N ) lattice gauge theory (cid:104)O(cid:105) = 1 Z (cid:90) D U O ( U ) e − S ( U ) , (30)where the Euclidean partition function Z is defined by Z = (cid:90) D U e − S ( U ) (31)and D U = (cid:81) x,µ dU x,µ in terms of the Haar measure dU x,µ of SU ( N ).With open boundary conditions in (1 + 1)D, the parti-tion function defined by this action factorizes into a prod-uct of independent integrals over each P x . To exploit thisfactorization in (1 + 1)D, a gauge fixing prescription canbe applied in which U x, = 1 for all x and U x, = 1 forsites with x = 0 (a maximal tree gauge). In this gauge, P x = U x, U − x +ˆ2 , , (32)which can be easily inverted to obtain U x, = (cid:34) x − (cid:89) k =0 P x + k ˆ2 (cid:35) − . (33)The variables P x are therefore in one-to-one correspon-dence with the remaining non-gauge-fixed U x, . TheHaar measure is invariant under this change of variables,and the path integral defining the partition function fac-torizes as Z = (cid:89) x ∈V (cid:48) z = z |V (cid:48) | , (34)where V (cid:48) ⊂ V is the subset of lattice points with uncon-strained U x, in this gauge (those for which x (cid:54) = 0) and z is the contribution to the partition function from a singleplaquette, z ≡ (cid:90) dP e g tr ( P + P − ) . (35)The calculations of z and similar single-variable SU ( N )integrals are presented in Appendix A.Wilson loops are defined by the matrix-valued quantity W A ≡ (cid:89) x,µ ∈ ∂ A U x,µ , (36)where (cid:81) x,µ ∈ ∂ A U x,µ represents an ordered product oflinks along the boundary ∂ A of the two-dimensional re-gion A with area A . The expectation value of the gauge-invariant observable N tr ( W A ) probes the interaction be-tween a pair of static quarks if the region A is taken tobe rectangular. Inserting Eq. (33) into Eq. (36) gives N tr ( W A ) = 1 N tr (cid:32) (cid:89) x ∈A P x (cid:33) . (37)Using linearity of expectation values and factorization ofpath integrals analogous to Eq. (34), the expectation val-ues of Wilson loops can be related to products of (matrix-valued) single-variable expectation values, (cid:28) N tr ( W A ) (cid:29) = 1 N tr (cid:32) (cid:89) x ∈A (cid:104) P x (cid:105) (cid:33) . (38) For simplicity we restrict to rectangular Wilson loops with onecorner at the origin.
Each single-variable expectation value is given by (cid:10) P abx (cid:11) = (cid:104) χ (cid:105) δ ab , allowing the traced Wilson loop to bewritten as a product of scalars, (cid:28) N tr ( W A ) (cid:29) = (cid:89) x ∈A (cid:104) χ (cid:105) = (cid:104) χ (cid:105) A , (39)where we have introduced the single-variable normal-ized expectation value of the group character function χ ( P ) = tr( P ), (cid:104) χ (cid:105) ≡ z (cid:90) dP N tr( P ) e g tr ( P + P − ) , (40)whose value is computed in Appendix A.Eq. (39) implies that Wilson loop expectation valuesfollow area law scaling, (cid:104) tr( W A ) /N (cid:105) ∼ e − σA , and SU ( N )gauge theory in (1 + 1)D confines for all values of thecoupling, with a separation-independent force betweenstatic test charges given by the string tension σ ≡ − lim A →∞ ∂ A ln W A = − ln (cid:104) χ (cid:105) . (41)Although (cid:104) χ (cid:105) is in general given by a convergent infiniteseries in Eq. (A6), in the case of SU (2) a simpler formcan be found in terms of modified Bessel functions, σ SU (2) = ln (cid:18) I (4 /g ) I (4 /g ) (cid:19) , (42)which goes to zero as g →
0. This observation can begeneralized to all SU ( N ) groups, and the lattice-unitsstring tension goes to zero while the static quark corre-lation length grows to infinity in the limit of g → σV , where V isthe total number of plaquettes; the particular choices ofcouplings and V used in our numerical studies are re-ported in Table I. Results are plotted versus σA whencomparing quantities at fixed physical separation is im-portant. B. Noise and sign problems in the Wilson loop
Although the expectation value (cid:104) tr( W A ) /N (cid:105) is real,the integrand tr( W A ) /N has fluctuating signs (for N =2) or fluctuating complex phases (for N ≥
3) across thedomain of integration. These fluctuations result in asign/StN problem for this observable. The sample mean0 SU (2) SU (3) σ V g β g β SU (2) and SU (3) lattice gauge theory. The dimensionlessquantity σV is fixed to 6 . σ and V are individuallyvaried. The conventional Wilson action parameter β = 2 N/g is also reported. of Re tr( W A ) /N gives an estimator for (cid:104) tr( W A ) /N (cid:105) , andthe variance of this estimator can be directly computed,Var[Re tr( W A ) /N ]= 1 N (cid:10) Re tr( W A ) (cid:11) − N Re (cid:104) tr( W A ) (cid:105) = 12 N (cid:10)(cid:12)(cid:12) tr( W A ) (cid:12)(cid:12)(cid:11) + 12 N (cid:10) tr( W A ) (cid:11) − N Re (cid:104) tr( W A ) (cid:105) . (43)The expectation values in the first and second terms inthe variance can be factorized analogously to the Wilsonloop expectation value, and are shown in Appendix A tobe (cid:10)(cid:12)(cid:12) tr( W A ) (cid:12)(cid:12)(cid:11) = 1 + ( N − (cid:104) χ , − (cid:105) A (cid:10) tr( W A ) (cid:11) = N ( N + 1)2 (cid:104) χ (cid:105) A + N ( N − (cid:104) χ , (cid:105) A , (44)in terms of the single-site integrals (cid:104) χ , − (cid:105) , (cid:104) χ (cid:105) , and (cid:104) χ , (cid:105) , defined in Eqs. (A9) and (A13). In total, thevariance isVar[Re tr( W A ) /N ] = 12 N + O ( e − cA )2 N − e − σA , (45)where c is a constant. The fact that (cid:104) χ r (cid:105) < r (assuming that g is finite) [72] impliesthat c > A → ∞ ,Var[Re tr( W A ) /N ] ∼ N . (46)The signal-to-noise ratio for n samples can be writtenexactly in terms of Eqs. (43), (44), and (39), but for thisanalysis it is sufficient to identify the asymptotic behaviorfrom Eqs. (45) and (39), givingStN[Re tr( W A ) /N ] = (cid:10) N tr ( W A ) (cid:11)(cid:113) n Var[Re N tr( W A )] ∼ N √ ne − σA , (47)which degrades exponentially with area A . For the esti-mator Re tr( W A ) /N , the analysis above shows that this can only be counteracted by exponentially increasing thenumber of samples n . Eqs. (44) and (45) also make clearthat the leading asymptotic behavior of the variance isdue to the typical magnitude-squared of the observable, (cid:10) | tr( W A ) | /N (cid:11) , which remains O (1) for all areas. Can-cellations due to phase fluctuations are required to re-produce the exponentially small Wilson loop expectationvalues predicted for large areas, confirming that the StNproblem can be related to a sign problem in the Wilsonloop observable.Attributing the StN problem to O (1) magnitudes forindividual samples of the Wilson loop observable atall areas also inspires our deformations of the Wilsonloop observable in the following sections. The quantity (cid:10) | tr( W A ) | /N (cid:11) can be written as an integral of a non-holomorphic integrand which will generically be modi-fied by contour deformations of the path integral. Ifwe choose contour deformations that reduce the averagemagnitude of the observable, this quantity, and thus theleading term of the variance, will be reduced. The ob-servable mean is unchanged and the StN ratio will thusincrease under such a deformation.For SU (2), the single-site integrals can be evaluatedstraightforwardly (see Appendix A) and the SU (2) vari-ance isVar[Re tr( W SU (2) A ) /N ]= 14 + 34 (cid:18) I (4 /g ) I (4 /g ) (cid:19) A − e − σ SU (2) A , (48)where σ SU (2) is given in Eq. (42). The StN of SU (2)Wilson loops in (1 + 1)D can therefore be explicitly cal-culated,StN[Re tr( W SU (2) A ) /N ]= 2 √ ne − σ SU (2) A (cid:114) (cid:16) I (4 /g ) I (4 /g ) (cid:17) A − e − σ SU (2) A . (49)Using numerical evaluation of the corresponding single-site integrals for SU ( N ) Wilson loops yields theoreticalcurves for the variance and signal-to-noise for general N .In the studies below, we choose to deform the (1 ,
1) com-ponent of the Wilson loop, W A , instead of tr W A /N fol-lowing the reasoning of Sec. II D. The variance of W A can be related to the variance of tr( W A ) /N and is com-pared to Monte Carlo results in the following sections. IV. SU (2) PATH INTEGRAL CONTOURDEFORMATIONS
As a proof-of-principle, we apply path integral contourdeformations to Wilson loop calculations in SU (2) latticegauge theory in (1 + 1)D with open boundary conditions.An identical setting with gauge group SU (3) is investi-gated in the following section.1 A. Gauge field parameterization
There are many possible parameterizations of the SU (2) group manifold, any of which can be used to de-fine valid path integral contour deformations. We argueabove that it is advantageous to consider a single compo-nent of the Wilson loop, taken without loss of generalityto be W A , as the observable whose path integral contouris deformed in order to calculate (cid:10) W A (cid:11) = (cid:104) tr( W A ) /N (cid:105) .Contour deformations that reduce the magnitude of W A in generic gauge field configurations while preserving (cid:10) W A (cid:11) can be expected to reduce phase fluctuations andtherefore the variance of W A . The angular parameter-ization of each plaquette P x ∈ SU (2) is useful for thispurpose, and is explicitly defined by P x = sin θ x e iφ x ,P x = cos θ x e iφ x ,P x = − cos θ x e − iφ x ,P x = sin θ x e − iφ x , (50)following the generalized SU ( N ) angular parameteriza-tion given in Ref. [55]. The azimuthal angles satisfy φ x , φ x ∈ [0 , π ], with endpoints identified, while the an-gle θ x spans the finite interval [0 , π/ dP x = h (Ω x ) d Ω x ≡ π sin(2 θ x ) dθ x dφ x dφ x . (51)We begin by considering the effects of simple deforma-tions using these parameters. In the simplest case of aregion A with area A = 1, the Wilson loop consists of asingle plaquette, W A = P x , where the loop starts andends at site x . The magnitude of W A can be reducedby e − λ by deforming φ x → φ x + iλ analogously to theapproach described for e iφ integrals above. In the caseof A = 2, the Wilson loop can be written in terms of theproduct of two plaquettes, W A = ( P x P x (cid:48) ) . In the an-gular parameterization, the Wilson loop is a sum of twoterms( P x P x (cid:48) ) = sin θ x sin θ x (cid:48) e iφ x + iφ x (cid:48) +cos θ x cos θ x (cid:48) e iφ x − iφ x (cid:48) . (52)The first term involves products of diagonal entries whosemagnitude can be reduced by e − λ by taking φ x → φ x + iλ or φ x (cid:48) → φ x (cid:48) + iλ and the second term involves off-diagonal components whose magnitude can be reducedanalogously by taking φ x − φ x (cid:48) → ( φ x − φ x (cid:48) ) + iλ . For A >
2, it can be seen similarly that shifting φ x → φ x + iλ and ( φ x − φ x +1 ) → ( φ x − φ x +1 ) + iλ for all x leads tosuppression of the magnitudes of all terms appearing in W A .A family of contour deformations capable of express-ing these constant imaginary shifts to the phases of allelements of P x can therefore be expected to reduce phasefluctuations and the variance of W A . Such a family ofcontour deformations is parameterized below as a subset of the vertical deformation expanded in a Fourier seriesin Eqs. (5)–(8).An alternative parameterization of SU (2) is exploredin Appendix B, in which it is found that imaginary shiftsalong these lines are more difficult to express and ordersof magnitude less variance reduction is achieved when ap-plying the same optimization methods. This explorationsuggests that a choice of parameterization that allows theobservable to be expressed in the form e iφ is importantfor variance reduction in observables afflicted with a signproblem.It is also possible to directly parameterize the gaugefield U x,µ ∈ SU (2) using Eq. (50). This alternative pa-rameterization may be useful in more than two space-time dimensions, where Gauss’ Law constraints implythat not all plaquettes are independent and a path in-tegral change of variables from U x,µ to P µνx cannot beperformed straightforwardly. B. Fourier deformation basis
In our study of SU (2) gauge theory, we optimize overa family of vertical contour deformations expressed interms of a Fourier series truncated above a specific cutoffmode. To avoid a costly Jacobian calculation, each pla-quette variable P x is deformed conditioned on plaquettes P y at sites earlier in the product defining W A in Eq. (37),which we write as y ≤ x . This family of deformations isgiven by˜ θ x ≡ θ x + i (cid:88) y ≤ x f θ ( θ y , φ y , φ y ; κ xy , λ xy , η xy , χ xy , ζ xy ) , ˜ φ x ≡ φ x + iκ x ; φ + i (cid:88) y ≤ x f φ ( θ y , φ y , φ y ; κ xy , λ xy , η xy , χ xy , ζ xy ) , ˜ φ x ≡ φ x + iκ x ; φ + i (cid:88) y ≤ x f φ ( θ y , φ y , φ y ; κ xy , λ xy , η xy , χ xy , ζ xy ) , (53)in terms of parameters κ xy , λ xy , η xy , χ xy , and ζ xy . Thefunctions f θ , f φ , and f φ compute the shift in the imag-inary direction of the angular parameters of P x condi-tioned on P y , and their decomposition in terms of Fouriermodes is detailed below. For this ordered dependence onprevious sites, the Jacobian determinant factorizes intoa product of block determinants per lattice site J = (cid:89) x j x ( θ x , φ x , φ x ) , (54)where j x ( θ x , φ x , φ x ) = det ∂f θ ∂θ x ∂f θ ∂φ x ∂f θ ∂φ x ∂f φ ∂θ x ∂f φ ∂φ x ∂f φ ∂φ x ∂f φ ∂θ x ∂f φ ∂φ x ∂f φ ∂φ x . (55)2The structure of the deformation in Eq. (53) therefore by-passes the need for expensive Jacobian determinant cal-culations involving matrices whose rank grows with thespacetime volume and is inspired by analogous methodsto simplify Jacobian determinant calculations in normal- izing flows [64]. Note that an absolute value is not takenover the determinant in Eq. (55).The vertical deformation in Eq. (53) can be expandedin a multi-parameter Fourier series as f θ = Λ (cid:88) m =1 κ xy ; θm sin (2 mθ y ) (cid:40) Λ (cid:88) n =1 (cid:104) η xy ; θ,φ mn sin( nφ y + χ xy ; θ,φ mn ) + η xy ; θ,φ mn sin( nφ y + χ xy ; θ,φ mn ) (cid:105)(cid:41) ,f φ = Λ (cid:88) m =1 κ xy ; φ m sin( mφ y + ζ xy ; φ m ) (cid:40) Λ (cid:88) n =1 (cid:104) λ xy ; φ ,θmn sin(2 nθ y ) + η xy ; φ ,φ mn sin( nφ y + χ xy ; φ ,φ mn ) (cid:105)(cid:41) ,f φ = Λ (cid:88) m =1 κ xy ; φ m sin( mφ y + ζ xy ; φ m ) (cid:40) Λ (cid:88) n =1 (cid:104) λ xy ; φ ,θmn sin(2 nθ y ) + η xy ; φ ,φ mn sin( nφ y + χ xy ; φ ,φ mn ) (cid:105)(cid:41) , (56)where Λ is a hyperparameter that sets the maximumFourier mode to include and controls the total numberof free parameters. As the zero modes have trivial y de-pendence, we have collected them in Eq. (53) into the y -independent terms κ x ; φ and κ x ; φ . The included Fourierterms are defined to satisfy the constraints (cid:101) θ x (0) = 0, (cid:101) θ x ( π/
2) = π/ (cid:101) φ x (0) = (cid:101) φ x (2 π ), and (cid:101) φ x (0) = (cid:101) φ x (2 π ),which together ensure that the endpoints of both thezenith and azimuthal integration domains are appropri-ately handled as described in Sec. II A. The derivativesneeded for the Jacobian in Eqs. (54)–(55) can be calcu-lated straightforwardly by differentiating Eq. (56). Theadditional factor describing the change in the Haar mea-sure needed to compute the Jacobian of the group mea-sure is given in these coordinates as (cid:89) x h ( (cid:101) Ω x ) h (Ω x ) = (cid:89) x (cid:34) sin(2˜ θ x )sin(2 θ x ) (cid:35) . (57)Combining the results of Eq. (50) and Eqs. (53)–(57), the expectation value of any holomorphic observ-able O ( { P x } ) is equal to the expectation value of thedeformed observable Q ( { P x } ) ≡ O ( { (cid:101) P x } ) e − S ( { (cid:101) P x } ) e − S ( { P x } ) (cid:89) x j x (cid:34) sin(2˜ θ x )sin(2 θ x ) (cid:35) , (58)where (cid:101) P x = (cid:32) sin (cid:101) θ x e i (cid:101) φ x cos (cid:101) θ x e i (cid:101) φ x − cos (cid:101) θ x e − i (cid:101) φ x sin (cid:101) θ x e − i (cid:101) φ x (cid:33) ∈ SL (2 , C ) . (59)If the plaquettes are sampled in the matrix representationfor Monte Carlo evaluation, computing the observable Q in Eq. (58) requires converting to the angular represen-tation before deforming and evaluating. This conversion is given by θ x = arcsin( | P x | ) ,φ x = arg( P x ) ,φ x = arg( P x ) , (60)and can be done when evaluating the observable Q ( { P x } ). Though these functions are not entire, the con-version used here does not determine whether the inte-grand itself is a holomorphic function of these angularparameters. C. Optimization procedure
This contour deformation expanded in a Fourier seriesprovides a means of exploring deformed observables withpotentially reduced variance. It is shown above that sim-ple deformations within this family are possible to con-struct by hand and are already sufficient to reduce theaverage magnitude of Wilson loop observables. However,these deformations are restricted to zero modes of theFourier expansion and rely on construction based on in-tuition. To maximize the variance reduction, we explorenumerical optimization of the deformation parameters κ xy , λ xy , η xy , χ xy , and ζ xy as discussed in Sec. II C. Weare interested in Re W A , for which the terms of Eq. (43)that can be modified by contour deformation are L ≡ (cid:10) (Re Q A ) (cid:11) = 12 (cid:10) | Q A | (cid:11) + 12 (cid:10) Q A (cid:11) , (61)where Q A is the deformed observable associated withthe W A . The first term in Eq. (61) is manifestlynon-holomorphic due to the absolute value over acomplex-valued observable, while the second term in-cludes squared reweighting factors of the original anddeformed action which prevent identification as a defor-mation of (cid:10) ( W A ) (cid:11) . These terms together define the loss function L that we aim to minimize as a function of thedeformation parameters.This loss function is written as an expectation value interms of sampling from the original Monte Carlo weights e − S ( { P x } ) , and its gradient can similarly be written as anexpectation value, ∇L = (cid:104) Q A ∇ Re Q A (cid:105) . (62)The term ∇ Re Q A can be expanded using the explicitform of Q A given in Eq. (58), as well as the forms ofthe observable W A and the action S in terms of { P x } in Sec. III A. For this study, the gradient ∇ Re Q A wascomputed explicitly and cross-checked using automaticdifferentiation available in the JAX library [73]. Eq. (62)can be stochastically estimated using an (undeformed)ensemble of n configurations { P ix } , i ∈ [1 , . . . , n ], drawnproportional to the weight e − S ( { P x } ) , ∇L ≈ n n (cid:88) i =1 (cid:2) Q A ( { P ix } ) ∇ Re Q A ( { P ix } ) (cid:3) . (63)In all experiments below, we used the Adam opti-mizer [74] to iteratively update parameters based onthese stochastic gradient estimates. Each gradient es-timate was computed using 1 / s and then permanentlyreduced the step size by a factor of F (i.e. s i +1 = s i /F )each time the loss function failed to improve over a win-dow of W steps. The schedule halted training after thestep size was reduced N r times. We used the parameters F = 10, W = 50, N r = 2, and s = 10 − for both SU (2)and SU (3) gauge theory.In preliminary investigations, we found that naivemanifold optimization resulted in overtraining, i.e. over-fitting parameters to the specific training data avail-able [75–77]. In the context of manifold optimization,this corresponds to minimizing a finite-ensemble varianceestimator rather than minimizing the true variance ofRe Q A . In practice this produced deformed observableswith higher variance when measured on a different en-semble than the training set.To mitigate overtraining in the final results, we re-served a “test set” of configurations, independent of thetraining data, on which the loss function was periodicallymeasured [78]; the reserved test set of configurations waschosen to have the same size as the training set. The stepsize schedule was configured to use loss measurements based on this test set, ensuring that training was slowedand halted before significantly overfitting. We furtherused a mini-batching technique, in which a set of m ≤ n configurations are resampled from the training set to es-timate Eq. (63), as a means of avoiding overtraining [68].The mini-batch size was chosen to be equal to the size ofthe full training set (i.e. m = n ), thus mini-batch eval-uation in this context was just a resampling operation,giving variation in gradient estimates over multiple eval-uations. The fluctuations in these gradient estimates areon the order of the uncertainty of the variance estima-tor, preventing overfitting below this uncertainty. Foreach observable W A we also found it important to re-strict to deforming only the plaquettes within the region A . Though including extra degrees of freedom cannotmake the optimal variance any higher (the optimizationsteps could always leave those plaquettes undeformed),in practice we found that including such degrees of free-dom allowed the deformed manifold to rapidly overtrain,resulting in worse performance overall. Appendix C de-tails further possible approaches to avoiding overfittingand overlap problems using a regularizing term added tothe loss function. These approaches were found to beunnecessary for our final results.Finally, making a good choice of initial manifold pa-rameters yielded practical improvement in training timeand quality. On one hand, initializing the manifold pa-rameters to zero ensures that gradient descent starts fromthe original manifold, and the optimization procedureshould find a local minimum with variance no higher thanthe original manifold (up to noise from stochastic gradi-ent estimates). However, correlations in sign and mag-nitude fluctuations of observables with similar structure,such as Wilson loops W A and W A (cid:48) with overlapping ar-eas A and A (cid:48) , suggest that the variance reduction fromcontour deformations will be correlated as well. Thoughthe optimal manifold for one observable will not gener-ically be optimal for the other, it can serve as a betterstarting point than the original manifold. In our studyof Wilson loops, we initialized manifold parameters foreach Wilson loop of area A using the optimal parametersfor the Wilson loop of area A −
1, when the Fourier cutoffΛ = 0, or using the optimal parameters for the Wilsonloop of area A and cutoff Λ −
1, when Λ (cid:54) = 0. Figure 3shows the improvement in optimization time and qualityusing this method for manifold deformations with Λ = 0.While this approach sacrifices the guarantee that the lo-cal minimum obtained corresponds to a deformed observ-able with variance no higher than the original observable(in the limit of infinitely precise gradient estimates), inpractice we find that this property is not violated. Wealso note that this property can be easily checked afteroptimization, and if the variance were found to increasewith respect to the original observable training could in-stead be started from the original manifold to recover theguarantee.4
Training iteration − − L A = 4 A = 8 A = 12 A = 16 A = 20 A = 24 A = 28 A = 32 0 500 1000 1500 2000 Training iteration − − L A = 4 A = 8 A = 12 A = 16 A = 20 A = 24 A = 28 A = 32 FIG. 3. SU (2) Wilson loop loss L plotted with respect to training iterations for optimization of Wilson loops with g = 0 .
71 andarea A starting with the undeformed manifold (left) or starting with the deformed manifold calculated for area A − A −
1, despite using nearly four times fewertraining iterations.
D. Monte Carlo calculations
We investigated the practical performance of these con-tour deformations on the three sets of SU (2) parametersdetailed in Table I. These parameters correspond to vari-ation in the string tension by a factor of 4. At eachchoice of parameters, n = 32000 configurations were gen-erated, exploiting the factorization of the path integralto draw samples of each plaquette P x independently fromthe marginal distribution, p ( P x ) ∝ e − g tr( P x + P − x ) . (64)Plaquette samples were generated using Hybrid MonteCarlo [79] with parameters tuned to maintain autocor-relation times of order 1–2, and these individual pla-quettes were then arranged into lattices consisting of V sites each. A random shuffle was applied to the collectionof plaquettes prior to this assignment to avoid spuriousspatial correlations, ensuring an asymptotically vanish-ing bias in the limit of infinite ensemble size.On each ensemble, we performed a study of deforma-tions with Fourier cutoff Λ fixed to zero. For all threechoices of parameters, training on a subset of 320 con-figurations was sufficient to converge to manifolds withvariance reduction up to four orders of magnitude whencomparing the largest deformed observables to the origi-nal Wilson loop observables. An additional subset of 320configurations was reserved as the test set during op-timization, and the remaining 31360 configurations were For SU (2) gauge theory it is also possible to apply a heat bathalgorithm to directly draw samples. However, more complicatedvariants are required for SU (3) [80–82] and Hybrid Monte Carlowas selected for simplicity. used to measure results. Measurements of Q A were foundto be consistent with the exact results given in Sec. III Aand with the Monte Carlo results for W A in the regionwhere the estimates of W A were reliable. The variance ofRe W A is dominated by (cid:10) (Re W A ) (cid:11) , which is positive-definite for all gauge field configurations. It was there-fore possible to precisely measure the variance withouta sign/StN problem, and the expected agreement withanalytical results was reproduced. In particular, O (1)scaling at large area can be clearly seen. In contrast, wefound that the variance of Q A for manifolds optimized asabove falls exponentially at large area, giving exponen-tial improvements in the signal-to-noise. These resultsare presented for all three ensembles in Fig. 4.Improvements from contour deformation were alsofound to be similar for Wilson loops with fixed physi-cal area σA across the three values of the lattice spacing.Fig. 5 compares the improvement in the Wilson loop vari-ance versus the dimensionless scale σA across all threeensembles, and the three curves can be seen to nearlycollapse at small areas, though there are differences ofroughly a factor of 4 at the largest areas. Despite thisvariation, the variance of the Wilson loop with largestarea is reduced by more than 10 even for the finest en-semble. Analogous results were observed for (1 + 1)D U (1) gauge theory in Ref. [48].For Λ = 0, there are few enough parameters that itis possible to investigate the optimal parameters foundby the optimization procedure. Fig. 6 depicts the valuesof κ x ; φ and κ x ; φ when optimized to reduce varianceof Q A at three choices of the loop area. It is arguedin Sec. IV A that the magnitude of Q A can be reducedon each sample if φ x and the differences φ x − φ x +1 areshifted by a positive imaginary constant. This manifold isapproximately discovered by the optimization procedure:the final parameters are a positive, nearly constant κ x ; φ ,5 − − − (cid:10) W A (cid:11) (original) h Q A i (deformed)2 4 6 8 10 12 14 16 A . . . A − − − − − − Var (cid:0) Re W A (cid:1) (original)Var ( Q A ) (deformed)10 − − − (cid:10) W A (cid:11) (original) h Q A i (deformed)4 8 12 16 20 24 28 32 A . . . A − − − − − − Var (cid:0) Re W A (cid:1) (original)Var ( Q A ) (deformed)10 − − − (cid:10) W A (cid:11) (original) h Q A i (deformed)8 16 24 32 40 48 56 64 A . . . A − − − − − − Var (cid:0) Re W A (cid:1) (original)Var ( Q A ) (deformed) FIG. 4. SU (2) Wilson loop expectation values and variances for ensembles with three different values of the gauge coupling g = 0 . , . , .
51 (top to bottom). Red lines indicate analytical results for (cid:10) W A (cid:11) = (cid:104) tr( W A ) / (cid:105) (left plots) and forVar(Re W A ) (right plots). σ A V a r ( R e W A ) / V a r ( Q A ) g = 0 .
98 [ SU (2)] g = 0 .
71 [ SU (2)] g = 0 .
51 [ SU (2)] FIG. 5. SU (2) Wilson loop variance ratios of standard observ-ables to deformed observables for ensembles with three differ-ent values of the gauge coupling g , corresponding to threedifferent values of lattice spacing. . . κ x ; φ x κ x ; φ A=8 A=16 A=24
FIG. 6. The manifold parameters found by optimizing thevariance of the deformed Wilson loop observable Q A at threedifferent choices of area A on the ensemble with total volume V = 32 and β = 8 .
0. Optimization at each A was initializedusing the parameters found for the observable with area A − corresponding to a positive imaginary shift applied to φ x , and a decreasing κ x ; φ , corresponding to a positiveimaginary shift applied to each difference φ x − φ x +1 . Onlythese relative differences between neighboring φ x have aneffect on the value of Q A , thus the overall shift on thecollection of κ x ; φ is irrelevant.We further optimized manifold parameters usingFourier cutoffs Λ = 1 , g = 0 .
71 to investigate gains from including higherFourier modes. Including Fourier modes larger thanthe constant term enables more complex deformationsof each angular parameter, and introduces possible de- σA V a r ( R e W A ) / V a r ( Q A ) Λ = 0 [ SU (2)]Λ = 1 [ SU (2)]Λ = 2 [ SU (2)] FIG. 7. SU (2) Wilson loop variance ratios of standard observ-ables to deformed observables for ensembles with g = 0 .
71 andthree different values of the manifold parameterization cutoff. pendence on plaquettes at sites y ≤ x when deforming P x . Despite this increased expressivity, these additionalparameters did not provide significantly larger StN im-provements compared to using only constant terms, asshown in Fig. 7. In some cases, the optimized mani-fold with larger cutoff resulted in higher variance (lowervariance ratio) than the optimized manifold with cutoffΛ = 0. The manifolds with larger cutoffs include all pos-sible manifolds with smaller cutoffs, thus this is neces-sarily a training effect, likely due to noisier gradients andless stable training dynamics. We did not pursue noise re-duction and alternative approaches to training (such asiteratively including higher Λ) as these manifolds withhigher cutoffs did not produce significant improvementsat any value of the area. V. SU (3) PATH INTEGRAL CONTOURDEFORMATIONS
We further investigated the ability of contour defor-mations to reduce the variance of Wilson loops in SU (3)gauge theory in (1 + 1)D with open boundary conditions.This setting is identical to the previous section, except forthe use of SU (3) rather than SU (2) gauge field variables.Suitable parameterizations for contour deformations of SU (3) gauge fields are discussed below. A. Gauge field parameterization and contourdeformation
For the SU (3) gauge group, we use the angular param-eterization constructed in Ref. [55]. The components of7a single plaquette P x ∈ SU (3) are parameterized as P x = cos θ x cos θ x e iφ x ,P x = sin θ x e iφ x ,P x = cos θ x sin θ x e iφ x ,P x = sin θ x sin θ x e − i ( φ x + φ x ) − sin θ x cos θ x cos θ x e i ( φ x + φ x − φ x ) ,P x = cos θ x cos θ x e iφ x ,P x = − cos θ x sin θ x e − i ( φ x + φ x ) − sin θ x sin θ x cos θ x e i ( φ x − φ x + φ x ) ,P x = − sin θ x cos θ x sin θ x e i ( φ x − φ x + φ x ) − sin θ x cos θ x e − i ( φ x + φ x ) ,P x = cos θ x sin θ x e iφ x ,P x = cos θ x cos θ x e − i ( φ x + φ x ) − sin θ x sin θ x sin θ x e − i ( φ x − φ x − φ x ) , (65)in terms of the three zenith angles 0 ≤ θ x , θ x , θ x ≤ π/ ≤ φ x , . . . , φ x ≤ π foreach plaquette. We collect these angles into a variableΩ x = ( θ x , .., θ x , φ x , ..., φ x ) for ease of notation. The Haarmeasure is related to the measure of Ω by [55] dP x = 12 π sin θ x (cos θ x ) sin θ x cos θ x sin θ x cos θ x × dθ x dθ x dθ x dφ x . . . dφ x . (66) To compute deformed observables from Monte Carlosamples in the matrix representation, an inverse map of(65) is needed and for example can be specified by θ x = arcsin( | P x | ) ,θ x = arccos (cid:0) | P x | / cos( θ x ) (cid:1) ,θ x = arccos (cid:0) | P x | / cos( θ x ) (cid:1) ,φ x = arg( P x ) ,φ x = arg( P x ) ,φ x = arg( P x ) ,φ x = arg( P x ) ,φ x = arg( P x ) . (67)An SU (3) field configuration in (1 + 1)D with openboundary conditions is defined by a collection of angularvariables Ω x associated all plaquettes P x on the lattice.The a th component of the deformed angles at site x is de-noted by ( (cid:101) Ω x ) a , and for vertical deformations expandedin Fourier modes is specified by˜ φ ax = φ ax + iκ x ; φ a + i (cid:88) y ≤ x f φ a (Ω y ; κ xy , λ xy , χ xy , ζ xy ) , ˜ θ ax = θ ax + i (cid:88) y ≤ x f θ a (Ω y ; κ xy , λ xy , χ xy ) . (68)where f θ a = Λ (cid:88) m =1 κ xy ; am sin(2 mθ ay ) (cid:26) Λ (cid:88) n =1 (cid:2) (cid:88) r (cid:54) = ar =1 λ xy ; armn sin(2 nθ ry ) + (cid:88) s =1 η xy ; asmn sin( nφ sy + χ xy ; asmn ) (cid:3)(cid:27) ,f φ a = Λ (cid:88) m =1 κ xy ; am sin( mφ ay + ζ xy ; am ) (cid:26) Λ (cid:88) n =1 (cid:2) (cid:88) r =1 λ xy ; armn sin(2 nθ ry ) + (cid:88) s (cid:54) = as =1 η xy ; asmn sin( nφ sy + χ xy ; asmn ) (cid:3)(cid:27) . (69)Deformed observables analogous to Eq. (58) can be con-structed for the SU (3) case using this parameterizationand ratios of the deformed and undeformed Haar measurefactors obtained from Eq. (66). B. Results
Practical performance of these deformations was in-vestigated by optimizing Wilson loop variance using thethree sets of SU (3) parameters detailed in Table I as inthe SU (2) case. The couplings were tuned to match thestring tensions used for SU (2) gauge theory and corre-spond to lattice spacings varying by a factor of 4. Foreach choice of parameters, an ensemble of n = 32000 configurations was generated using the factorized HMCmethod discussed in Sec. IV D. Fig. 8 shows variance re-duction for Wilson loops of all possible sizes for the threelattice spacings studied. At the largest loop areas, wefound variance reduction of greater than three orders ofmagnitude. Across all three ensembles, analytical resultsfor the Wilson loop expectation values and variances werereproduced by the undeformed Monte Carlo estimates.The expectation value of the deformed observable is con-sistent with the analytical and original Monte Carlo re-sults, while the variance of the deformed observable ex-ponentially decreases with increasing σA .Fig. 9 compares the variance reduction achieved at allthree lattice spacings versus physical loop area σA . Wefound approximately equivalent improvement in the vari-8 − − − (cid:10) W A (cid:11) (original) h Q A i (deformed)2 4 6 8 10 12 14 16 A . . . A − − − − Var (cid:0) Re W A (cid:1) (original)Var ( Q A ) (deformed)10 − − − (cid:10) W A (cid:11) (original) h Q A i (deformed)4 8 12 16 20 24 28 32 A . . . A − − − − Var (cid:0) Re W A (cid:1) (original)Var ( Q A ) (deformed)10 − − − (cid:10) W A (cid:11) (original) h Q A i (deformed)8 16 24 32 40 48 56 64 A . . . A − − − − Var (cid:0) Re W A (cid:1) (original)Var ( Q A ) (deformed) FIG. 8. SU (3) Wilson loop expectation values and variances for ensembles with three different values of the gauge couplingthat from top to bottom correspond to g = 0 . , . , .
38. Red lines correspond to analytical results for (cid:10) W A (cid:11) = (cid:104) tr( W A ) / (cid:105) (left plots) and Var(Re W A ) (right plots). σ A V a r ( R e W A ) / V a r ( Q A ) g = 0 .
72 [ SU (3)] g = 0 .
53 [ SU (3)] g = 0 .
38 [ SU (3)] FIG. 9. SU (3) Wilson loop variance ratios of standard ob-servables to deformed observables for ensembles with threedifferent values of the gauge coupling that correspond to g = 0 . , . , .
38 (top to bottom). ance at the two coarser lattice spacings ( g = 0 .
72 and g = 0 . g = 0 . SU (2) gauge theory suggest that the results at finerlattice spacings could be partially explained by increaseddifficulty in training the larger number of parameters.The Λ = 0 manifold parameterization involves fewenough parameters that it is possible to investigate thestructure of the optimal parameters similarly to the caseof SU (2) gauge theory. As shown in Fig. 10, we foundthat the optimized values of κ x ;30 and κ x ;40 decrease with x , while κ x ;10 and κ x ;20 appear to converge towards ap-proximately constant positive and negative values, re-spectively. The final parameter κ x ;50 fluctuates in boththe positive and negative directions. These results canbe qualitatively explained by expanding the (1 ,
1) com-ponent of the Wilson loop for small area. For A = 1the Wilson loop is equivalent to P x , for which the (1 , P x = cos θ x cos θ x e iφ x . (70)The magnitude of this quantity can be reduced by shift-ing φ → (cid:101) φ x = φ x + iλ , which is consistent with thepositive κ x ;10 values obtained after optimization shown inFig. 10. Extending the analysis to the A = 2 Wilson . . κ x ; φ − . . κ x ; φ κ x ; φ κ x ; φ x − κ x ; φ A=8 A=16 A=24
FIG. 10. The parameters of the optimal manifold for Wilsonloop Q A at three different areas A , as determined on the en-semble with total volume V = 32 and g = 0 .
53. Optimizationfor each Q A was initialized using the parameters for optimal Q A (cid:48) with region A (cid:48) of area A −
1, as detailed in the main text. loop, the (1 ,
1) component is given by( P x P x (cid:48) ) = e iφ x + iφ x (cid:48) cos θ x cos θ x (cid:48) cos θ x cos θ x (cid:48) − e iφ x (cid:48) + iφ x (cid:48) + i ( φ x − φ x (cid:48) ) cos θ x (cid:48) cos θ x (cid:48) sin θ x sin θ x (cid:48) − e − iφ x (cid:48) + i ( φ x − φ x (cid:48) ) cos θ x cos θ x (cid:48) sin θ x sin θ x (cid:48) − e iφ x + iφ x (cid:48) − iφ x (cid:48) + iφ x (cid:48) cos θ x cos θ x (cid:48) sin θ x (cid:48) sin θ x sin θ x (cid:48) + e iφ x − iφ x (cid:48) − iφ x (cid:48) sin θ x sin θ x (cid:48) sin θ x (cid:48) . (71)The magnitude of the first, second, and fourth terms arereduced by shifting φ x → φ x + iλ and φ x (cid:48) → φ x (cid:48) + iλ with λ >
0. The magnitude of the second and third terms canbe reduced by shifting φ x − φ x (cid:48) → φ x − φ x (cid:48) + iδ and φ x − φ x (cid:48) → φ x − φ x (cid:48) + iδ with δ >
0; this is also consis-tent with a positive imaginary shift of iδ in φ x − φ x (cid:48) and φ x − φ x (cid:48) , reducing the magnitude of the fourth and fifthterms. These deformations result in reduced magnitudeand correspondingly lower variance. Deformations withthese qualitative features are reproduced in the optimizedmanifolds found for Λ = 0. Finally, we note that φ x (cid:48) ap-pears in the exponent with opposite signs in the secondand third terms, and similarly for φ x (cid:48) in the fourth andfifth terms, so there is no constant vertical deformation0 σA V a r ( R e W A ) / V a r ( Q A ) Λ = 0 [ SU (3)]Λ = 1 [ SU (3)]Λ = 2 [ SU (3)] FIG. 11. SU (3) Wilson loop variance ratios of standard ob-servables to deformed observables for ensembles with g = 0 . of these terms that will reduce the overall magnitude.Analogously to the case of SU (2) gauge theory, no sig-nificant StN improvements were obtained by includinghigher Fourier modes. Fig. 11 directly compares opti-mized manifolds for Λ ≤ VI. CONCLUSIONS
In this work, we have defined a family of complex mani-folds for path integral deformations in SU ( N ) lattice fieldtheories. The manifolds introduced here are describedin terms of an angular parameterization of each SU ( N )variable, with dependence between variables at differingspacetime sites restricted to enforce a triangular Jaco-bian. We find that choosing a parameterization in whichthe observable of interest can be written as O = e iθ X is a useful practical choice, allowing constant shift defor-mations in the parameter θ to make substantial progressin reducing noise. Choosing a spacetime dependence ofthe deformation that gives rise to a triangular Jacobianis key to ensuring the deformed integral can be computedefficiently, as the Jacobian determinant can then be eval-uated with cost linear in the number of spacetime lattice sites.This manifold parameterization can be combinedwith the method of deformed observables introduced inRef. [48] to reduce noise in observables. This approachis applicable when the action is real and the Boltzmannweight e − S can be treated as a probability measure. Westress that this method of deformed observables does notrequire changing the Monte Carlo sampling, despite be-ing based on an analysis of contour deformation of theentire path integral, and can be thought of as an ap-proach to analytically relate observables with identicalexpectation values and different variance. Keeping theMonte Carlo weights unchanged allows manifold opti-mization using estimates of the deformed observable vari-ance computed with respect to a fixed Monte Carlo en-semble. There is a tradeoff between the cost of optimizingmanifold parameters and the statistical precision gained.In practice, we find that initializing manifold parametersfrom optimal parameters for similar observables signifi-cantly reduces the associated cost.This method was shown in Sec. IV and V to improvethe variance of Wilson loop observables in (1+1)D SU (2)and SU (3) lattice gauge theory by orders of magnitude.For the original Wilson loop observables, the signal-to-noise ratio decreases exponentially with area. The de-formed observables mitigate this StN problem, and inparticular we find that the improvements are consistentwith an exponential in the physical Wilson loop area,with the most significant reduction in noise for Wilsonloops of the largest area. The improvement in variancewas empirically found to be similar at three different lat-tice spacings, though less of an improvement was seenat the finest lattice spacing for SU (3); the achieved im-provements in the continuum limit and for other theo-ries is thus an interesting subject of future investigation.However, we stress that making any gains at finite lat-tice spacing is still a significant step forward due to theconvenience of the method: optimizing a deformation ona fixed ensemble quickly gives new observables that en-code the same physical content while having significantlyreduced noise.In demonstrating the method, we focused here on aparticular deformation of the angular parameters basedon a Fourier series expansion and shift in the imaginarydirection only. Writing the observable phase fluctuationsin terms of these periodic angular parameters that canbe shifted by a constant in the imaginary direction ledto deformed observables with significant StN improve-ment. The surprising result that zero-mode terms alonesignificantly reduce noise, with neither dependence be-tween plaquettes at different spacetime sites nor depen-dence on the values of the angular parameters themselves,suggests that the majority of the StN problem in these(1 + 1)D theories arises from independent local fluctua-tions of SU ( N ) angular parameters.Complications are expected in higher dimensions, asGauss’ Law implies that plaquettes at differing space-time locations and orientations must satisfy many inde-1pendent constraints. Deformations thus cannot indepen-dently address fluctuations in each plaquette included ina Wilson loop, or more generally in each fundamentaldegree of freedom included in an observable. It is there-fore an interesting line of future work to determine howbest to incorporate this spacetime dependence in higherdimensional applications of path integral contour defor-mations. The approach employed here is one of manypossible approaches to creating expressive transforma-tions with efficiently computable Jacobians. This issuehas been explored in some depth in normalizing flowsfor sampling in many contexts, including image genera-tion and ensemble generation for lattice field theory; seeRefs. [64, 83] for recent reviews. Similar techniques mayprove more fruitful in future applications of path integralcontour deformations to observables of phenomenologicalinterest in (3 + 1)D lattice gauge theories. ACKNOWLEDGMENTS
W.D. and G.K. are supported in part by the U.S. DOEgrant No. DE-SC0011090. W.D. is also supported bythe SciDAC4 award DE-SC0018121. H.L. is supportedby a Department of Energy QuantiSED grant. N.C.W.is supported by the U.S. DOE under Grant No. DE-FG02-00ER41132. This manuscript has been authoredby Fermi Research Alliance, LLC under Contract No.DE-AC02-07CH11359 with the U.S. Department of En-ergy, Office of Science, Office of High Energy Physics.
Appendix A: Single-variable SU ( N ) integrals The calculation of z for SU ( N ) can be performed anal-ogously to the U ( N ) case considered in Refs. [70, 71] withfurther details for the SU ( N ) case given in Ref. [72]. Inthe eigenbasis of P , the Haar measure is given by dP = 1 N ! N (cid:89) I =1 (cid:34) dθ I π (cid:89) J
1) ( δ i l δ j k δ i l δ j k + δ i l δ j k δ i l δ j k ) . (A7)From Table II, we recognize that (cid:10)(cid:12)(cid:12) tr( W A ) /N (cid:12)(cid:12)(cid:11) is re-lated to χ , − ( P x ). Thus, applying Eq. (A7) we can de-2rive that (cid:90) d Ω χ , − ( A Ω P Ω † B )= tr( AB ) tr( B † A † ) (cid:104) χ , − (cid:105) + tr( AA † B † B ) 1 N (1 − (cid:104) χ , − (cid:105) ) − (cid:2) tr( AB ) tr( B † A † ) − (cid:3) (cid:104) χ , − (cid:105) = χ , − ( AB ) (cid:104) χ , − (cid:105) (A8)where (cid:104) χ , − (cid:105) ≡ z (cid:90) dP N − χ , − ( P ) e g tr ( P + P † ) . (A9)Iterating this identity within (cid:10)(cid:12)(cid:12) tr( W A ) (cid:12)(cid:12)(cid:11) gives (cid:10)(cid:12)(cid:12) tr( W A ) (cid:12)(cid:12)(cid:11) = N (cid:104) χ , − (cid:105) A + (1 − (cid:104) χ , − (cid:105) ) A − (cid:88) k =0 (cid:18) (cid:104) χ , − (cid:105) − N (cid:19) k = 1 + ( N − (cid:104) χ , − (cid:105) A . (A10)Since tr( W A ) is not a character χ r , attempting to fac-torize it generates new terms. Specifically, in addition totr( W A (cid:48) ) one finds tr( W A (cid:48) ) for A (cid:48) ⊂ A . Instead, a basisinvolving χ ( P ) and χ , − ( P ) can be constructed fromlinear combinations of these traces and satisfies (cid:90) d Ω χ ( A Ω P Ω † B ) = 2 N ( N + 1) χ ( AB ) (cid:104) χ (cid:105) , (A11)and (cid:90) d Ω χ , ( A Ω P Ω † B ) = 2 N ( N − χ , ( AB ) (cid:104) χ , (cid:105) , (A12)where as in Eq. (A9) we have factorized using the expec-tation values of the characters (cid:104) χ (cid:105) ≡ z (cid:90) dP N ( N + 1) χ ( P ) e g tr ( P + P † ) , (cid:104) χ , (cid:105) ≡ z (cid:90) dP N ( N − χ , ( P ) e g tr ( P + P † ) . (A13)Putting these together and iterating gives (cid:10) tr( W A ) (cid:11) = N ( N + 1)2 (cid:104) χ (cid:105) A + N ( N − (cid:104) χ , (cid:105) A . (A14)Combining Eqs. (A10) and (A14) gives the general ex- pression for variance of the SU ( N ) Wilson loopVar[Re tr( W SU ( N ) A ) /N ]= 12 N (cid:10)(cid:12)(cid:12) tr( W A ) (cid:12)(cid:12)(cid:11) + 12 N (cid:10) tr( W A ) (cid:11) − N (cid:104) tr( W A ) (cid:105) = 12 N (cid:20) N − (cid:104) χ , − (cid:105) A + N ( N + 1)2 (cid:104) χ (cid:105) A + N ( N − (cid:104) χ , (cid:105) A (cid:21) − e − σ SU ( N ) A . (A15)The character expectation values can be obtained for gen-eral SU ( N ) gauge groups by numerically evaluating theintegrals in Eq. (A13).For SU (2), it is straightforward to analyticallyevaluate the character expection values appearing inEq. (A13). Not all representations of SU (2) are indepen-dent using the enumeration in Table II, and in particular χ ( P ) = χ , − ( P ) and χ , ( P ) = χ ( P ) = 1. After re-moving the redundant characters, the remain integrals χ j ( P ) can be solved using expression of the characters interms of angles [72] χ j = sin([ j + 1 / α )sin( α/
2) (A16)where j = 0 , , · · · index the unique characters of SU (2)and r = { , − } equals j = 1. With these, one has1 z (cid:90) DP d j χ j ( P ) e g Tr( P + P † ) = I j +1 (4 /g ) I (4 /g ) . (A17)From which we deriveVar[Re tr( W SU (2) A ) /N ]= 14 + 34 (cid:18) I (4 /g ) I (4 /g ) (cid:19) A − e − σ SU (2) A , (A18) Appendix B: Alternative SU (2) coordinates As an alternative to the parameterization presented inSec. IV A, plaquettes P x ∈ SU (2) can be represented as P x = exp (cid:18) iα x n x · (cid:126)σ (cid:19) = cos (cid:16) α x (cid:17) + i sin (cid:16) α x (cid:17) ˆ n x · (cid:126)σ. (B1)The su (2) unit vector ˆ n x can be further parameterized asˆ n x = (cos φ x sin θ x , sin φ x sin θ x , cos θ x ) , (B2)and a general SU (2) group element P x can be parame-terized in terms of the three angles0 ≤ α x < π, ≤ φ x < π, ≤ θ x < π. (B3)3 σ A V a r ( R e W A ) / V a r ( Q A ) Main text g = 0 .
98 [Euler] g = 0 .
71 [Euler] g = 0 .
51 [Euler]
FIG. 12. Colored points SU (2) Wilson loop variance ratiosusing the alternative gauge field parameterization defined inEq. (B1). Gray points show analogous variance ratios usingthe parameterization defined in Sec. IV A for comparison andare identical to the results in Fig. 5. The Haar measure is given in these coordinates as dP x = 14 π sin (cid:16) α x (cid:17) dα x sin θ x dθ x dφ x , (B4)The inverse map needed to obtain these angular param-eters for an SU (2) matrix P x is given by α x = 2 arccos (cid:20) (cid:0) P x + P x (cid:1)(cid:21) θ x = arccos (cid:20) P x − P x i sin( α x / (cid:21) φ x = 12 arg (cid:20) P x P x (cid:21) . (B5)As with Eq. (60), these are not entire functions of P x but this is not an obstacle for contour deformation be-cause it is only the parameterization given by Eqs. (B1)and (B2) that determines whether path integrands canbe interpreted as holomorphic functions of the angles { α x , θ x , φ x } associated with P x .Deformed observables starting with the (1 ,
1) compo-nent of SU (2) Wilson loops can be defined using thisparameterization. A family of vertical deformations for { α x , θ x , φ x } can be defined analogously to the deforma-tion described in Sec. IV B. Since α x and θ x have fixed(non-identified) integration contour endpoints, a con-stant vertical deformation can only be applied to φ x . For A = 1 in particular, Tr( P x ) = cos( α x / , A = 1. As shown in Fig. 12, for A > ,
1) components of(products of) SU (2) matrices using the parameterizationEq. (B1). The significance of the difference between theresults demonstrates the utility of rewriting observablesbefore deformation in achieving practical StN improve-ments, as discussed in Sec. II D. Appendix C: Regularization terms to avoidovertraining and overlap problems
When Re ˜ S is significantly different from S , we can en-counter an overlap problem for training and evaluation;both processes involve factors of e − Re ˜ S + S that can havevery large magnitude fluctuations in this situation. Tomitigate this problem, it is helpful to include regular-ization terms in the loss function L . These terms maybias the exact loss minimum away from the optimal, butallow closer convergence to that optimal solution given fi-nite statistics estimates of L . The strength of these termsis controlled by a small parameter (cid:15) .We discuss two possible terms here. First, an L2 reg-ularizer [85] may be used, which simply ensures the pa-rameters controlling the deformation all remain close tozero. Generically labeling those parameters as λ i , thisloss term can be written L L2 ≡ (cid:15) (cid:88) i | λ i | . (C1)In the limit of (cid:15) → ∞ , the parameters λ i are forced tozero and the optimization procedure must remain at theoriginal manifold. A smaller choice of (cid:15) mildly biasesthe optimization procedure towards the original mani-fold, such that the loss function and gradients remainfeasible to estimate with finite statistics. An alternateapproach is to directly penalize distance between Re ˜ S and S using a regularization term, L act ≡ (cid:15) Z (cid:90) dx e − S ( x ) (cid:12)(cid:12)(cid:12) S ( x ) − Re ˜ S ( x ) (cid:12)(cid:12)(cid:12) . (C2)This term is minimized when S = Re ˜ S , providing a biastowards remaining close to the original manifold. Thoughwritten as a path integral, this quantity can be estimatedusing the original samples, much like the main loss func-tion and gradients.Both of these regularizer terms were explored, howeverno severe overlap problem was observed during training4when deformation parameters were restricted to those contained within the target Wilson loop observables. Fi-nal results are based on training without either term. [1] S. Aoki et al. (Flavour Lattice Averaging Group), Eur.Phys. J. C , 113 (2020), arXiv:1902.08191 [hep-lat].[2] W. Detmold, R. G. Edwards, J. J. Dudek, M. En-gelhardt, H.-W. Lin, S. Meinel, K. Orginos, andP. Shanahan (USQCD), Eur. Phys. J. A55 , 193 (2019),arXiv:1904.09512 [hep-lat].[3] C. Lehner et al. (USQCD), Eur. Phys. J. A , 195(2019), arXiv:1904.09479 [hep-lat].[4] V. Cirigliano, Z. Davoudi, T. Bhattacharya, T. Izubuchi,P. E. Shanahan, S. Syritsyn, and M. L. Wag-man (USQCD), Eur. Phys. J. A55 , 197 (2019),arXiv:1904.09704 [hep-lat].[5] T. Aoyama et al. , Phys. Rept. , 1 (2020),arXiv:2006.04822 [hep-ph].[6] M. L. Wagman and M. J. Savage, Phys. Rev.
D96 ,114508 (2017), arXiv:1611.07643 [hep-lat].[7] M. L. Wagman,
Statistical Angles on the Lattice QCDSignal-to-Noise Problem , Ph.D. thesis, U. Washington,Seattle (main) (2017), arXiv:1711.00062 [hep-lat].[8] G. Parisi,
Common trends in particle and condensed mat-ter physics: Proceedings of Les Houches Winter AdvancedStudy Institute, February 1980 , Phys. Rept. , 203(1984).[9] G. P. Lepage, in
Boulder TASI 1989:97-120 (1989) pp.97–120.[10] S. R. Beane, W. Detmold, T. C. Luu, K. Orginos,A. Parre˜no, M. J. Savage, A. Torok, and A. Walker-Loud, Phys. Rev.
D79 , 114502 (2009), arXiv:0903.2990[hep-lat].[11] S. R. Beane, W. Detmold, K. Orginos, and M. J. Savage,J. Phys. G , 034022 (2015), arXiv:1410.2937 [nucl-th].[12] Z. Davoudi, W. Detmold, K. Orginos, A. Parre˜no, M. J.Savage, P. Shanahan, and M. L. Wagman, (2020),arXiv:2008.11160 [hep-lat].[13] P. E. Gibbs, Phys. Lett. B182 , 369 (1986).[14] T. D. Cohen, Phys. Rev. Lett. , 222001 (2003),arXiv:hep-ph/0307089 [hep-ph].[15] T. D. Cohen, Phys. Rev. Lett. , 032002 (2003),arXiv:hep-ph/0304024.[16] K. Splittorff and J. J. M. Verbaarschot, Phys. Rev. Lett. , 031601 (2007), arXiv:hep-lat/0609076 [hep-lat].[17] K. Splittorff and J. J. M. Verbaarschot, Phys. Rev. D75 ,116003 (2007), arXiv:hep-lat/0702011 [HEP-LAT].[18] P. de Forcrand,
Proceedings, 27th International Sympo-sium on Lattice field theory (Lattice 2009): Beijing, P.R.China, July 26-31, 2009 , PoS
LAT2009 , 010 (2009),arXiv:1005.0539 [hep-lat].[19] A. Alexandru, C. Gattringer, H. P. Schadler, K. Split-torff, and J. J. M. Verbaarschot, Phys. Rev.
D91 , 074501(2015), arXiv:1411.4143 [hep-lat].[20] M. Cristoforetti, F. Di Renzo, and L. Scorzato(AuroraScience), Phys. Rev.
D86 , 074506 (2012),arXiv:1205.3996 [hep-lat].[21] G. Aarts, Phys. Rev.
D88 , 094501 (2013),arXiv:1308.4811 [hep-lat].[22] M. Cristoforetti, F. Di Renzo, A. Mukherjee,and L. Scorzato, Phys. Rev.
D88 , 051501 (2013), arXiv:1303.7204 [hep-lat].[23] A. Mukherjee, M. Cristoforetti, and L. Scorzato,Phys. Rev.
D88 , 051502 (2013), arXiv:1308.0233[physics.comp-ph].[24] G. Aarts, L. Bongiovanni, E. Seiler, and D. Sexty, JHEP , 159 (2014), arXiv:1407.2090 [hep-lat].[25] M. Cristoforetti, F. Di Renzo, G. Eruzzi, A. Mukherjee,C. Schmidt, L. Scorzato, and C. Torrero, Phys. Rev. D89 , 114505 (2014), arXiv:1403.5637 [hep-lat].[26] A. Alexandru, G. Ba¸sar, and P. Bedaque, Phys. Rev.
D93 , 014504 (2016), arXiv:1510.03258 [hep-lat].[27] A. Alexandru, G. Ba¸sar, P. F. Bedaque, G. W. Ridg-way, and N. C. Warrington, JHEP , 053 (2016),arXiv:1512.08764 [hep-lat].[28] A. Alexandru, G. Ba¸sar, P. F. Bedaque, S. Vartak, andN. C. Warrington, Phys. Rev. Lett. , 081602 (2016),arXiv:1605.08040 [hep-lat].[29] H. Fujii, S. Kamata, and Y. Kikukawa, JHEP , 125(2015), [Erratum: JHEP09,172(2016)], arXiv:1509.09141[hep-lat].[30] Y. Tanizaki, Y. Hidaka, and T. Hayata, New J. Phys. , 033002 (2016), arXiv:1509.07146 [hep-th].[31] A. Alexandru, P. F. Bedaque, H. Lamm, andS. Lawrence, Phys. Rev. D96 , 094505 (2017),arXiv:1709.01971 [hep-lat].[32] A. Alexandru, G. Ba¸sar, P. F. Bedaque, and G. W. Ridg-way, Phys. Rev.
D95 , 114501 (2017), arXiv:1704.06404[hep-lat].[33] Y. Mori, K. Kashiwa, and A. Ohnishi, PTEP ,023B04 (2018), arXiv:1709.03208 [hep-lat].[34] Y. Tanizaki, H. Nishimura, and J. J. M. Verbaarschot,JHEP , 100 (2017), arXiv:1706.03822 [hep-lat].[35] A. Alexandru, P. F. Bedaque, and N. C. Warrington,Phys. Rev. D98 , 054514 (2018), arXiv:1805.00125 [hep-lat].[36] A. Alexandru, G. Ba¸sar, P. F. Bedaque, H. Lamm,and S. Lawrence, Phys. Rev.
D98 , 034506 (2018),arXiv:1807.02027 [hep-lat].[37] A. Alexandru, P. F. Bedaque, H. Lamm, andS. Lawrence, Phys. Rev.
D97 , 094510 (2018),arXiv:1804.00697 [hep-lat].[38] A. Alexandru, P. F. Bedaque, H. Lamm, S. Lawrence,and N. C. Warrington, Phys. Rev. Lett. , 191602(2018), arXiv:1808.09799 [hep-lat].[39] K. Kashiwa, Y. Mori, and A. Ohnishi, Phys. Rev.
D99 ,014033 (2019), arXiv:1805.08940 [hep-ph].[40] M. Fukuma, N. Matsumoto, and N. Umeda, (2019),arXiv:1912.13303 [hep-lat].[41] M. Fukuma, N. Matsumoto, and N. Umeda, Phys. Rev.
D100 , 114510 (2019), arXiv:1906.04243 [cond-mat.str-el].[42] K. Kashiwa, Y. Mori, and A. Ohnishi, Phys. Rev.
D99 ,114005 (2019), arXiv:1903.03679 [hep-lat].[43] Z.-G. Mou, P. M. Saffin, and A. Tranberg, JHEP ,135 (2019), arXiv:1909.02488 [hep-th].[44] M. Ulybyshev, C. Winterowd, and S. Zafeiropou-los, Phys. Rev. D101 , 014508 (2020), arXiv:1906.07678 [cond-mat.str-el].[45] S. Lawrence, “Sign problems in quantum field the-ory: Classical and quantum approaches,” (2020),arXiv:2006.03683 [hep-lat].[46] S. Lawrence and Y. Yamauchi, (2021), arXiv:2101.05755[hep-lat].[47] A. Alexandru, G. Basar, P. F. Bedaque, and N. C.Warrington, “Complex paths around the sign problem,”(2020), arXiv:2007.05436 [hep-lat].[48] W. Detmold, G. Kanwar, M. L. Wagman, andN. C. Warrington, Phys. Rev. D , 014514 (2020),arXiv:2003.05914 [hep-lat].[49] F. Di Renzo and G. Eruzzi, Physical Review D (2018),10.1103/physrevd.97.014503.[50] C. Schmidt and F. Ziesch´e, PoS LATTICE2016 , 076(2017), arXiv:1701.08959 [hep-lat].[51] A. Ohnishi, Y. Mori, and K. Kashiwa, JPS Conf. Proc. , 024011 (2019).[52] K. Zambello and F. D. Renzo, “Towards lefschetzthimbles regularization of heavy-dense qcd,” (2018),arXiv:1811.03605 [hep-lat].[53] M. L¨uscher and P. Weisz, JHEP , 010 (2001),arXiv:hep-lat/0108014 [hep-lat].[54] R. M. Range, Holomorphic functions and integral rep-resentations in several complex variables , Vol. 108(Springer Science & Business Media, 2013).[55] J. Bronzan, Phys. Rev. D , 1994 (1988).[56] A. Alexandru, P. F. Bedaque, H. Lamm, andS. Lawrence, Phys. Rev. D , 094510 (2018).[57] K. Kashiwa and Y. Mori, Phys. Rev. D , 054519(2020), arXiv:2007.04167 [hep-lat].[58] A. Alexandru, G. Basar, P. F. Bedaque, G. W. Ridgway,and N. C. Warrington, Phys. Rev. D , 014502 (2017),arXiv:1609.01730 [hep-lat].[59] D. Bump, “Complexification,” in Lie Groups (SpringerNew York, New York, NY, 2013) pp. 205–211.[60] G. Aarts and I.-O. Stamatescu, JHEP , 018 (2008),arXiv:0807.1597 [hep-lat].[61] D. Sexty, Proceedings, 24th International Conferenceon Ultra-Relativistic Nucleus-Nucleus Collisions (QuarkMatter 2014): Darmstadt, Germany, May 19-24, 2014 ,Nucl. Phys.
A931 , 856 (2014), arXiv:1408.6767 [hep-lat].[62] E. Seiler,
Proceedings, 35th International Symposium onLattice Field Theory (Lattice 2017): Granada, Spain,June 18-24, 2017 , EPJ Web Conf. , 01019 (2018),arXiv:1708.08254 [hep-lat].[63] A. Alexandru, P. F. Bedaque, H. Lamm, S. Lawrence,and N. C. Warrington, Phys. Rev. Lett. , 191602(2018).[64] G. Papamakarios, E. Nalisnick, D. J. Rezende, S. Mo-hamed, and B. Lakshminarayanan, “Normalizing Flows for Probabilistic Modeling and Inference,” (2019),arXiv:1912.02762 [stat.ML].[65] H. Robbins and S. Monro, Ann. Math. Statist. , 400(1951).[66] V. S. Borkar, Stochastic approximation: a dynamical sys-tems viewpoint , Vol. 48 (Springer, 2009) Chap. 2.[67] J. Chee and P. Toulis, “Convergence diagnostics forstochastic gradient descent with constant step size,”(2018), arXiv:1710.06382 [stat.ML].[68] L. Bottou, F. E. Curtis, and J. Nocedal, “Optimiza-tion methods for large-scale machine learning,” (2018),arXiv:1606.04838 [stat.ML].[69] K. G. Wilson, Phys. Rev.