Normalizing Flows and the Real-Time Sign Problem

Abstract

Normalizing flows have recently been applied to the problem of accelerating Markov chains in lattice field theory. We propose a generalization of normalizing flows that allows them to be applied to theories with a sign problem. These complex normalizing flows are closely related to contour deformations (i.e. the generalized Lefschetz thimble method), which have been applied to sign problems in the past. We discuss the question of the existence of normalizing flows: they do not exist in the most general case, but we argue that exact normalizing flows are likely to exist for many physically interesting problems, including cases where the Lefschetz thimble decomposition has an intractable sign problem. Finally, normalizing flows can be constructed in perturbation theory. We give numerical results on their effectiveness across a range of couplings for the Schwinger-Keldysh sign problem associated with a real scalar field in 0+1 dimensions.

Normalizing Flows and the Real-Time Sign Problem

Scott Lawrence and Yukari Yamauchi
Department of Physics, University of Colorado, Boulder, CO 80309, USA
Department of Physics, University of Maryland, College Park, Maryland 20742, USA

Normalizing flows have recently been applied to the problem of accelerating Markov chains in lattice field theory. We propose a generalization of normalizing flows that allows them to be applied to theories with a sign problem. These complex normalizing flows are closely related to contour deformations (i.e. the generalized Lefschetz thimble method), which have been applied to sign problems in the past. We discuss the question of the existence of normalizing flows: they do not exist in the most general case, but we argue that approximate normalizing flows are likely to exist for many physically interesting problems. Finally, normalizing flows can be constructed in perturbation theory. We give numerical results on their effectiveness across a range of couplings for the Schwinger-Keldysh sign problem associated with a real scalar field in $0+1$ dimensions.

I. INTRODUCTION

Monte Carlo methods, applied to lattice quantum field theory, are unique in providing nonperturbative access to observables in QCD and other field theories. These methods are not, however, equally applicable to all theories and observables. In particular, when applied to theories with a finite density of relativistic fermions, or to observables involving real-time evolution, lattice Monte Carlo methods are afflicted by the so-called sign problem. This obstacle to computing quantum real-time dynamics has persisted for a considerable time, and is a central motivation for the use of quantum computers in high energy and nuclear physics.

Lattice methods work by framing the observable to be computed as a ratio of high-dimensional integrals. Spacetime is discretized, and Feynman’s path integral becomes a finite- (but large-) dimensional integral. For many theories, this procedure results in a probability distribution over field configurations, which can be importance sampled with Markov chain Monte Carlo methods. However, in the case of a finite density of relativistic fermions, or nonequilibrium calculations, the Boltzmann factor $e^{-S}$ is generally complex, and cannot be treated as a probability distribution. In such cases, a standard approach is to sample according to the “quenched” Boltzmann factor $e^{-\mathrm{Re}\,S}$, and include the phases by reweighting. The cost of reweighting is generically exponential in the spacetime volume of the system being simulated.

Recent work has introduced normalizing flows as a tool for accelerating Markov chain Monte Carlo methods [1–5]. The idea is to construct (usually by training with gradient descent) a generative model that samples approximately according to the lattice Boltzmann factor. Either by reweighting or by using the model to create proposals for a Markov chain, the systematic bias of training is removed.
This method is particularly anticipated to reduce the cost associated with the approach to the continuum limit (“critical slowing down”).

Normalizing flows, as usually formulated, are not directly helpful for the sign problem: a generative model necessarily models a real probability distribution, rather than the complex weights associated with lattice models with a sign problem. In this paper, we show how normalizing flows may be generalized to alleviate or remove a sign problem. The core of this idea is the observation that a normalizing flow, suitably generalized, implicitly defines an integration contour along which the sign problem may be alleviated or removed. These complex normalizing flows are thus in the same family of methods as the Lefschetz thimble approach [6], the generalized Lefschetz thimble method [7, 8], and the search for sign-optimized manifolds [9–12].

Complex normalizing flows exist only when a manifold is available that exactly solves the sign problem. We discuss the conditions under which such perfect manifolds exist. By modifying the holomorphic gradient flow of [7, 13], we argue that locally perfect manifolds, on which there are no local fluctuations of the phase of the Boltzmann factor, always exist. These manifolds may nonetheless possess a global sign problem: different components of the manifold, separated by singularities of the action, may contribute to the integral with different phases, and therefore (partially) cancel. Conditioned on a mild conjecture regarding the dependence of locally perfect manifolds on the parameters of the action, we show that globally perfect manifolds exist for a broad class of physical systems, including the Schwinger-Keldysh sign problem.

When a manifold is available that merely approximately solves the sign problem, an approximate normalizing flow exists.
A simple physical argument suggests that for many problems of physical relevance, manifolds that approximately solve the sign problem (with the approximation getting better in the infinite volume limit) should be available.

We also find that the tool of normalizing flows results in a method for perturbatively approximating sign-problem-ameliorating integration contours, as well as a new approach for machine learning of such contours (a prospect previously explored in [9, 10, 12, 14, 15]). We explore the in-practice effectiveness of the perturbatively constructed flow with numerical experiments on modest $0+1$ dimensional lattices. This method does not appear to have scaling properties that would allow it to be used, at least without serious improvement, in higher-dimensional theories.

Finally, a perturbative view of normalizing flows gives rise to a method of computing lattice expectation values by solving a certain high-dimensional first-order partial differential equation. We demonstrate this method on lattice scalar field theory. Unfortunately, this is mostly a curiosity, chiefly because the practical algorithm for solving the differential equations represents an uncontrolled approximation. On theories with no sign problem, the fact that reliable error bars are not available renders it inferior to standard methods; on theories with a sign problem, solving the differential equation turns out to be hard in practice (for reasons apparently closely connected to the sign problem itself).

The remainder of this paper is structured as follows. In Sec. II we describe the lattice Schwinger-Keldysh formalism and the origin of the sign problem. Sec. III details the generalization of normalizing flows to the complex setting, and shows how they relate to contour integrals and the generalized thimble method. We discuss the question of the existence of complex normalizing flows (and correspondingly, manifolds that solve the sign problem) in Sec.
IV; the notions of “global” and “local” sign problems are defined here. Perturbative constructions of complex normalizing flows are given in Sec. V, with numerical experiments characterizing their effectiveness. Finally, Sec. VI outlines future avenues to explore.

II. LATTICE SCHWINGER-KELDYSH

The lattice Schwinger-Keldysh method was introduced in [16, 17], for use with the generalized Lefschetz thimble method, as a formalism for computing real-time observables, that is, observables where the operators have some time separation. The Schwinger-Keldysh action is readily derived by considering a lattice field theory in the Hamiltonian formulation. We are interested in a time-separated observable, of the form $\langle \mathcal{O}(t)\mathcal{O}(0)\rangle$, with the expectation value taken in a thermal ensemble of inverse temperature $\beta$. Removing time-dependences from the operators, this expectation value can be written

$$\langle \mathcal{O}(t)\mathcal{O}(0)\rangle = \frac{\mathrm{Tr}\, e^{-\beta H} e^{iHt}\, \mathcal{O}\, e^{-iHt}\, \mathcal{O}}{\mathrm{Tr}\, e^{-\beta H}}. \tag{1}$$

The ordinary lattice path integral involves only the imaginary-time operator, $e^{-\beta H}$. That operator is split up into a product of many $e^{-a_\tau H}$, each of which is Trotterized. Resolutions of the identity are inserted between each pair of operators, resulting in an integral over all (discrete) paths of field configurations.

The Schwinger-Keldysh path integral does not differ in its derivation. After all time-evolution operators (whether real or imaginary) are Trotterized and the field integrals inserted, the expectation value is given by

$$\langle \mathcal{O}(t)\mathcal{O}(0)\rangle = \frac{\int \mathcal{D}\phi\; e^{-S[\phi]}\, \mathcal{O}(t)\mathcal{O}(0)}{\int \mathcal{D}\phi\; e^{-S[\phi]}}, \tag{2}$$

with the (Euclidean) action, in the case of a single real scalar field,

$$S = \sum_{t,x} \frac{(\phi_{x,t} - \phi_{x,t+1})^2}{2a(t)} + \sum_t \frac{a(t) + a(t-1)}{2}\left[\sum_{\langle x x'\rangle} \frac{(\phi_{x,t} - \phi_{x',t})^2}{2a_x^2} + \sum_x \left(\frac{m^2}{2}\phi_{x,t}^2 + \frac{\lambda}{4!}\phi_{x,t}^4\right)\right]. \tag{3}$$

Here $m$ is the bare mass and $\lambda$ the coupling. Because some of the Trotterized time-evolution operators were imaginary time and others were real time, the timelike lattice spacing $a$ is taken to vary over the lattice.
In this paper we will take it to be defined by the “S-contour”, although other choices are possible:

$$a(t) = \begin{cases} -i & t \in [0, N_t) \\ 1 & t \in [N_t, N_t + N_\beta/2) \\ i & t \in [N_t + N_\beta/2, 2N_t + N_\beta/2) \\ 1 & t \in [2N_t + N_\beta/2, 2N_t + N_\beta). \end{cases} \tag{4}$$

Here $N_t$ and $N_\beta$ denote the number of real-time and thermodynamic time evolution steps, respectively. This choice of action is equivalent to an $O(a)$ Trotter approximation to $e^{-\beta H/2} e^{iHt} e^{-\beta H/2} e^{-iHt}$.

If there were no timeslices with $\mathrm{Im}\, a \neq 0$, the action would be real. In that case, the Metropolis method gives an algorithm by which the computer can sample from the probability distribution proportional to $e^{-S}$. Expectation values with respect to that distribution correspond to physical expectation values.

As things are, the action is not purely real, and the Boltzmann factor does not correspond to any probability distribution (at least, not any distribution of real-valued fields). The standard approach at this point is to sample with respect to the quenched Boltzmann factor $e^{-\mathrm{Re}\, S}$, and then reweight, computing observables as

$$\langle \mathcal{O}\rangle = \frac{\langle e^{-iS_I}\, \mathcal{O}\rangle_Q}{\langle e^{-iS_I}\rangle_Q}. \tag{5}$$

Here $S_I \equiv \mathrm{Im}\, S$, and $\langle\cdot\rangle_Q$ denotes an expectation value with respect to the quenched distribution. The denominator, $\langle e^{-iS_I}\rangle_Q$, is known as the average phase, and characteristically decays exponentially in the spacetime volume of the system. A simple but robust argument shows that this exponential decay is a generic phenomenon. The average phase can be written as a ratio $Z/Z_Q$ of the physical partition function to the quenched partition function $Z_Q \equiv \int e^{-\mathrm{Re}\, S}$. The physical partition function, in the large volume limit, behaves thermodynamically as the exponential of the (extensive) free energy, and therefore scales as $e^{fV}$, with $f$ the free energy density.
The quenched partition function describes some (less interesting) thermodynamic system, and is therefore expected to have the same scaling, but with a different exponent: $e^{f_Q V}$. Thus, the ratio will decay exponentially. This can be avoided only when $f_Q = f$ exactly: when there is no sign problem at all.

The real-time portion of the Schwinger-Keldysh contour gives the action an imaginary part. Note that fields along the real-time portion of the contour do not contribute at all to the real part of the action. Along those directions, importance sampling has no effect and the sign problem is maximally bad; this is generic to all field theories. Specifically for scalar field theory, because the domain of the path integral has infinite measure, the quenched partition function does not converge and the average phase is exactly zero.

III. NORMALIZING FLOWS AND CONTOUR INTEGRALS

We begin by introducing normalizing flows in the case of a theory with no sign problem. To accelerate the process of sampling from the Boltzmann distribution $e^{-S}$, we can look for a map $\tilde\phi(\phi)$ with the property

$$\left(\det \frac{\partial\tilde\phi}{\partial\phi}\right) e^{-S[\tilde\phi(\phi)]} \approx \mathcal{N} e^{-\phi^2/2}. \tag{6}$$

The map $\tilde\phi$ (termed a normalizing flow in the machine learning literature) transforms a Gaussian distribution, which can be sampled from efficiently, to the physical distribution desired. (Strictly speaking, it is the inverse map $\tilde\phi \mapsto \phi$ that is usually referred to as the normalizing flow, as it transforms the distribution $e^{-S}$ into the normal distribution. The convention used here, of working with $\tilde\phi(\phi)$ itself, allows generalization to actions with a sign problem.) The normalization constant $\mathcal{N}$ is inserted to account for the fact that the partition function $Z$ is generically not equal to the Gaussian integral.

If the flow $\tilde\phi(\phi)$ is exact, it allows expectation values to be computed directly (and is referred to as a trivializing map). A flow which is merely an approximation induces an effective action on the fields $\tilde\phi$ which is unequal to the desired physical action:

$$S_{\mathrm{induced}}(\tilde\phi) = \frac{\phi^2}{2} + \log\det\frac{\partial\tilde\phi}{\partial\phi}, \tag{7}$$

where $\phi$ is the preimage of $\tilde\phi$ under the normalizing flow. To compute the correct expectation values, we must reweight by computing a ratio of expectation values:

$$\langle\mathcal{O}\rangle = \frac{\langle \mathcal{O}\, e^{S_{\mathrm{induced}} - S}\rangle_n}{\langle e^{S_{\mathrm{induced}} - S}\rangle_n}, \tag{8}$$

where $\langle\cdot\rangle_n$ denotes an expectation value with respect to the normal distribution over $\phi$. In practice, it is often more efficient to use the normalizing flow to generate proposals for a Markov chain instead; the distinction will not matter here.

This procedure can begin with any easily sampled distribution. The use of a Gaussian is a convenient choice when the domain of integration is $\mathbb{R}^N$.
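As a minimal sketch of the reweighting in Eq. (8), consider a single real variable with no sign problem. The quartic action, the coupling `lam`, and the simple rescaling used as a stand-in for an approximate flow are all illustrative assumptions, not the constructions studied in this paper.

```python
import numpy as np

def action(x, lam=0.1):
    """Toy real action S(x) = x^2/2 + lam * x^4 (illustrative choice)."""
    return 0.5 * x**2 + lam * x**4

def flow(phi, scale=0.8):
    """Hypothetical approximate flow: a rescaling. Returns (phi_tilde, log det J)."""
    return scale * phi, np.log(scale) * np.ones_like(phi)

def reweighted_mean(observable, n_samples=200_000, seed=0):
    """Estimate <O> via Eq. (8), sampling phi from the normal distribution."""
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal(n_samples)
    phi_tilde, log_det = flow(phi)
    s_induced = 0.5 * phi**2 + log_det       # induced action of the flow
    log_w = s_induced - action(phi_tilde)    # exponent appearing in Eq. (8)
    w = np.exp(log_w - log_w.max())          # constant factor cancels in the ratio
    return np.sum(w * observable(phi_tilde)) / np.sum(w)

def quadrature_mean(observable):
    """Reference value from a direct Riemann sum over the real line."""
    x = np.linspace(-10.0, 10.0, 20001)
    weight = np.exp(-action(x))
    return np.sum(weight * observable(x)) / np.sum(weight)

estimate = reweighted_mean(lambda x: x**2)
reference = quadrature_mean(lambda x: x**2)
print("reweighted:", estimate, "quadrature:", reference)
```

Because the rescaling has a constant Jacobian, its $\log\det$ term cancels in the ratio; for a field-dependent flow it would not.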
For compact domains of integration, a uniform distribution is likely to be a more convenient starting point than a Gaussian.

Note also that normalizing flows compose. Given a sequence of distributions $p_1, \ldots, p_k$, and $k-1$ normalizing flows transforming each $p_i$ to $p_{i+1}$, the composition of those normalizing flows transforms $p_1$ to $p_k$. This compositional property is preserved by the complex normalizing flows defined below.

This method is clearly not directly applicable to models with a sign problem. The normalizing flow induces an effective action on the physical fields $\tilde\phi$ which is always real, and therefore will never match the physical action $S[\tilde\phi]$. (Or at least, the Boltzmann factor is always real. A noninvertible flow may induce a negative Boltzmann factor.) We can construct a normalizing flow for the quenched action $\mathrm{Re}\, S[\tilde\phi]$, but this will at most lead to a polynomial speed up in an exponentially slow algorithm. (Furthermore, as discussed in Sec. II, the real part of the action for real-time sign problems is typically flat for most directions. Sampling from the quenched action is not hard to begin with.)

Instead, inspired by the generalized thimble method, we can allow $\tilde\phi(\phi)$ to map real trivial fields $\phi \in \mathbb{R}^N$ to complex-valued physical fields $\tilde\phi \in \mathbb{C}^N$. We dub such a construction a complex normalizing flow. The condition Eq. (6) remains the same; to guarantee equality of expectation values, we will see that additional constraints on the behavior of $\tilde\phi$ are needed.

Assuming for the moment that Eq. (6) holds exactly, let us see what expectation values are computed:

$$\langle \mathcal{O}(\tilde\phi)\rangle_n = \frac{\int_{\tilde\phi(\mathbb{R}^N)} \mathcal{D}\tilde\phi\; \mathcal{O}\, e^{-S}}{\int_{\tilde\phi(\mathbb{R}^N)} \mathcal{D}\tilde\phi\; e^{-S}}. \tag{9}$$
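A minimal sketch of an exact complex normalizing flow, in the sense of Eqs. (6) and (9), is provided by a single-variable Gaussian action with a complex coefficient. The action $S(z) = \sigma z^2/2$, the value of `SIGMA`, and the function names are assumptions chosen for illustration. The map $\tilde\phi(\phi) = \phi/\sqrt{\sigma}$ rotates and rescales the real line, and for $\mathrm{Re}\,\sigma > 0$ the rotated contour gives the same integral as the real line.

```python
import numpy as np

SIGMA = 1.0 + 2.0j  # illustrative complex coefficient with Re(SIGMA) > 0

def action(z):
    """Toy holomorphic action S(z) = SIGMA * z^2 / 2."""
    return 0.5 * SIGMA * z**2

def complex_flow(phi):
    """Map real Gaussian samples phi onto the rotated contour phi / sqrt(SIGMA)."""
    return phi / np.sqrt(SIGMA)

rng = np.random.default_rng(0)
n_samples = 400_000

# Flowed contour: sample phi ~ N(0, 1) and push it through the complex flow.
phi = rng.standard_normal(n_samples)
z = complex_flow(phi)
jacobian = 1.0 / np.sqrt(SIGMA)  # d(phi_tilde)/d(phi), constant for this map
# Left side of Eq. (6) divided by exp(-phi^2/2); should be the constant N.
ratio = jacobian * np.exp(0.5 * phi**2 - action(z))
flow_estimate = np.mean(z**2)    # no reweighting needed on this contour

# Real line: sample the quenched weight exp(-Re S), then reweight by the phase.
x = rng.standard_normal(n_samples) / np.sqrt(SIGMA.real)
phase = np.exp(-1j * action(x).imag)
average_phase = np.mean(phase)
reweighted_estimate = np.mean(phase * x**2) / average_phase

exact = 1.0 / SIGMA  # exact value of <z^2> for this Gaussian action
print("flow:", flow_estimate, "reweighted:", reweighted_estimate, "exact:", exact)
print("|average phase| on the real line:", abs(average_phase))
```

In this toy case the flowed contour has no phase fluctuations at all, while the real line has an average phase of magnitude below one; both estimates agree with the exact value within statistical error.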

Although the integrand in Eq. (9) is the desired one, the domain of integration is incorrect. The physical expectation value $\langle\mathcal{O}\rangle$ is obtained by an integral over the real plane $\mathbb{R}^N \subset \mathbb{C}^N$. The domain of integration used in Eq. (9) is the image of $\mathbb{R}^N$ under the map $\tilde\phi$. In order for the two integrals to be guaranteed equal, we must require the following:

• The Boltzmann factor $e^{-S[\tilde\phi]}$ is holomorphic, as is the product with the observable $e^{-S}\mathcal{O}$ [18].
• The image of $\mathbb{R}^N$ under $\tilde\phi(\phi)$ is a continuous manifold $\mathcal{M} \subset \mathbb{C}^N$.
• The contours $\mathbb{R}^N$ and $\mathcal{M}$ are connected by a homotopy; that is, there exists a continuous family of manifolds $\mathcal{M}(t)$ such that $\mathcal{M}(0) = \mathbb{R}^N$, $\mathcal{M}(1) = \mathcal{M}$, and at no point does $\mathcal{M}(t)$ pass through a singularity of an integrand.

Implicit in the last condition is the requirement that, when the complexified domain is not compact, the asymptotic behavior of the manifold at infinity not change. A change in this asymptotic behavior is considered equivalent to the manifold passing through the singularity at infinity.

From the conditions for equality above, it is clear that a complex normalizing flow induces a manifold of integration $\mathcal{M}$ of exactly the sort used in the generalized thimble method. For an exactly normalizing flow, the integration along this manifold exhibits no sign problem. Therefore, (exact) complex normalizing flows exist only if there is a manifold which exactly solves the sign problem. In fact, as discussed in Sec. IV E below, the converse holds as well: the existence of a manifold with no sign problem implies the existence of an exact normalizing flow.

In cases where the complex normalizing flow is not exact, reweighting is used to recover the precise expectation values as usual. This will generally be necessary throughout the numerical methods explored in this paper.

IV. EXISTENCE

A theory of complex normalizing flows does little good if such flows do not exist for problems of physical interest. This section is devoted to investigating when complex normalizing flows exist. Although in no (non-trivial) case can we show that normalizing flows certainly exist, the evidence suggests that such flows are more likely to exist in the case of bosonic (including real-time) sign problems than in the case of fermion sign problems.

First, we construct manifolds that entirely remove local phase fluctuations, leaving only global cancellations between different parts of the manifold of integration. This construction uses the holomorphic gradient flow (often used to approximate or define Lefschetz thimbles), defined and characterized in Sec. IV A. The existence of locally sign-free manifolds is argued for in the subsequent section, Sec. IV B. In Sec. IV C, we conjecture that locally perfect manifolds behave smoothly as parameters of the action are varied. The existence of perfect manifolds for the Schwinger-Keldysh sign problem follows from this conjecture. Several examples, where sign-free manifolds can either be found explicitly or shown not to exist at all, are given in Sec. IV D; these examples suggest a pattern in which sign-free manifolds generically exist for bosonic, but not fermionic, sign problems. Finally, in Sec. IV E, we use well-known results regarding normalizing flows in the real setting to conclude that, conditional on a perfect manifold existing, a complex normalizing flow must exist.

A. Holomorphic Gradient Flow

The holomorphic gradient flow is a first-order differential equation used to approximate Lefschetz thimbles [7, 13]. Lefschetz thimbles are the surfaces of steepest descent of $\mathrm{Re}\, S$ proceeding from critical points of the action. A certain union of the thimbles can be shown to yield the same integral as the real plane $\mathbb{R}^N$. Because the thimbles are generically sub-optimal in terms of the sign problem, we will ignore them and focus on the behavior of the flow itself. The key result is that, when a Boltzmann factor has local phase fluctuations on the real plane, the holomorphic gradient flow can always be used to find a nearby manifold with an improved sign problem.

The holomorphic gradient flow is defined by

$$\frac{dz}{dt} = \overline{\frac{\partial S}{\partial z}}. \tag{10}$$

Here the partial derivative $\frac{\partial}{\partial z}$ denotes the usual holomorphic derivative (i.e. the Wirtinger derivative), and the bar denotes complex conjugation. We will assume throughout that the action $S$ is holomorphic in the field variables $z$. This differential equation governs the evolution of a field configuration $z$ through complex space. When applied to all field configurations in a manifold, we obtain a family of manifolds parameterized by the flow time $t$. Note that the flow time is purely fictitious, and is unrelated to the physical time (represented as part of the lattice).

Considering the evolution of the action along the flow, note that the imaginary part of the action never changes, and the real part can only increase (or remain the same, if we begin at a critical point):

$$\frac{dS}{dt} = \left|\frac{\partial S}{\partial z}\right|^2. \tag{11}$$

This can be taken as a motivation for the holomorphic gradient flow, as by increasing the real part of the action, we may hope to decrease the quenched partition function and improve the average phase. Following this observation, it is convenient to work with the real part of the action $u \equiv \mathrm{Re}\, S$.

The flow Eq. (10) is most frequently applied to manifolds beginning from the real plane ($\mathbb{R}^N \subset \mathbb{C}^N$). Let us examine its behavior at early times.
Note that, as a consequence of Cauchy’s integral theorem, the partition function itself will not be changed by the flow [19]; only the quenched partition function will change. The change in the quenched partition function is given by

$$\frac{d}{dt} Z_Q = \frac{d}{dt}\int_{\mathbb{R}^N} \mathcal{D}x\; e^{-u[z(x)]} \left|\det\frac{\partial z}{\partial x}\right|, \tag{12}$$

where we have chosen to parameterize the infinitesimally flowed manifold $z(x)$ by the real plane. Because the domain of integration is unchanged, being the parameterizing real plane for any $t$, we may proceed to inspect the derivative of the integrand, writing $J \equiv \partial z/\partial x$ for the Jacobian:

$$\frac{d}{dt}\, e^{-u}|\det J| = e^{-u}|\det J|\left[\mathrm{Re}\,\mathrm{Tr}\, J^{-1}\frac{dJ}{dt} - \frac{du}{dt}\right]. \tag{13}$$

We have already seen that $\frac{du}{dt} = \left|\frac{\partial S}{\partial z}\right|^2$ is guaranteed to be non-negative, and positive away from a critical point. The Jacobian term, however, may be larger than the guaranteed-negative term, resulting in a worsening sign problem with flow time $t$. Empirically, this is indeed the case at sufficiently long flow times: the average phase is maximized at some intermediate $t$, rather than in the limit $t \to \infty$. Beginning from the real plane, however, the Jacobian is the identity, and we find

$$\mathrm{Tr}\, J^{-1}\frac{dJ}{dt} = \sum_i \overline{\frac{\partial^2 S}{\partial z_i^2}}. \tag{14}$$

The real part reduces to $\sum_i \frac{\partial^2 u}{\partial x_i^2}$. Returning to Eq. (13),

$$\frac{d}{dt}\, e^{-u}|\det J| = \sum_i e^{-u}\left[\frac{\partial^2 u}{\partial x_i^2} - \left|\frac{\partial S}{\partial z_i}\right|^2\right]. \tag{15}$$

Consider each $i$ independently. Each term is very nearly a total derivative, since

$$\frac{\partial}{\partial x_i}\left(\frac{\partial u}{\partial x_i}\, e^{-u}\right) = e^{-u}\left[\frac{\partial^2 u}{\partial x_i^2} - \left(\frac{\partial u}{\partial x_i}\right)^2\right]. \tag{16}$$

To connect the two expressions, observe that the magnitude of the derivative of the action can be written

$$\left|\frac{\partial S}{\partial z_i}\right|^2 = \left(\frac{\partial u}{\partial x_i}\right)^2 + \left(\frac{\partial v}{\partial x_i}\right)^2, \tag{17}$$

where $v \equiv \mathrm{Im}\, S$.
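For completeness, the intermediate step that combines Eqs. (15), (16) and (17) can be written out explicitly. This is a sketch under the assumption that $e^{-u}$ decays fast enough at infinity for boundary terms to vanish:

```latex
\begin{align}
\frac{d}{dt}\, e^{-u} |\det J|
 &= \sum_i e^{-u}\left[\frac{\partial^2 u}{\partial x_i^2}
    - \left(\frac{\partial u}{\partial x_i}\right)^2
    - \left(\frac{\partial v}{\partial x_i}\right)^2\right] \\
 &= \sum_i \frac{\partial}{\partial x_i}
    \left(\frac{\partial u}{\partial x_i}\, e^{-u}\right)
    - e^{-u} \sum_i \left(\frac{\partial v}{\partial x_i}\right)^2 .
\end{align}
```

Upon integration over $\mathbb{R}^N$, the total-derivative term contributes only a boundary term at infinity, which vanishes under the stated assumption, leaving just the term involving $v$.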
As a result, we find that the change in the quenched partition function, when starting from the real plane, is

$$\frac{dZ_Q}{dt} = -\int \mathcal{D}x\; e^{-u(x)} \sum_i \left(\frac{\partial v}{\partial x_i}\right)^2. \tag{18}$$

(The same argument applies to any flat manifold.)
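Eq. (18) can be checked numerically in one dimension. The sketch below assumes a toy action $S(z) = z^2/2 + \lambda z^4 + i g z$ with illustrative couplings `LAM` and `G` (not taken from the paper). It applies the linearized flow step $z = x + \epsilon\,\overline{\partial S/\partial z}$ to the real line and compares a finite-difference derivative of $Z_Q$ at zero flow time with the right-hand side of Eq. (18).

```python
import numpy as np

LAM, G = 0.1, 0.7  # illustrative couplings, not taken from the paper

def action(z):
    """Toy holomorphic action S(z) = z^2/2 + LAM z^4 + i G z."""
    return 0.5 * z**2 + LAM * z**4 + 1j * G * z

def d_action(z):
    """First holomorphic derivative dS/dz."""
    return z + 4.0 * LAM * z**3 + 1j * G

def d2_action(z):
    """Second holomorphic derivative d^2S/dz^2."""
    return 1.0 + 12.0 * LAM * z**2

x = np.linspace(-8.0, 8.0, 40001)  # parameterizing real line
dx = x[1] - x[0]

def quenched_partition_function(eps):
    """Z_Q on the contour z(x) = x + eps * conj(S'(x)), a linearized flow step."""
    z = x + eps * np.conj(d_action(x))
    jac = 1.0 + eps * np.conj(d2_action(x))  # dz/dx along the real line
    return np.sum(np.exp(-action(z).real) * np.abs(jac)) * dx

eps = 1e-3
# Central finite difference for dZ_Q/dt at t = 0.
lhs = (quenched_partition_function(eps)
       - quenched_partition_function(-eps)) / (2.0 * eps)

# Right-hand side of Eq. (18): on the real line, dv/dx = Im S'(x).
dv_dx = d_action(x).imag
rhs = -np.sum(np.exp(-action(x).real) * dv_dx**2) * dx
print("finite difference:", lhs, "Eq. (18):", rhs)
```

The linearized step is adequate here because only the derivative at zero flow time is compared; for finite flow times the flow equation itself would have to be integrated.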

The right-hand side of Eq. (18) is never positive; hence, the sign problem is never worsened by a small amount of flow from the real plane. Moreover, as long as the imaginary part of the action is non-constant on the portion of the real plane where $e^{-u}$ is nonvanishing, a small amount of flow will make the quenched partition function strictly smaller.

This is the key result regarding the holomorphic gradient flow: when beginning from the real plane, if $\mathrm{Im}\, S$ is not constant where $\mathrm{Re}\, S$ is finite, a small amount of flow is guaranteed to improve the sign problem.

One other property of the holomorphic gradient flow is of interest: regions of $\mathbb{C}^N$ at which $\mathrm{Re}\, S$ diverges (becoming large and positive) act as attractors. The flow will collide with these singularities in a finite flow time. This does not cause the evolution of the manifold itself to be ill-defined. The manifold will be continuous, but not smooth, where it intersects singularities of the action. Because the Boltzmann factor vanishes at these singularities, the manifold’s behavior there contributes neither to the integral nor to the sign problem.

B. Existence of Locally Perfect Manifolds

The holomorphic gradient flow is not guaranteed to result in a perfect manifold at asymptotically large times. The asymptotic manifold under this flow is a union of Lefschetz thimbles. Two features of the Lefschetz thimbles contribute a nonvanishing sign problem. First, on each thimble $\mathrm{Im}\, S$ is constant, but different thimbles generically have different values of $\mathrm{Im}\, S$. Thus, cancellations occur between different thimbles, which may become severe when multiple thimbles have similar quenched weights. Second, although $\mathrm{Im}\, S$ is constant, the phase in the integral is actually $\mathrm{Im}\, S - \mathrm{Im}\log\det J$, with the Jacobian term coming from the integration measure $dz$. Each thimble, therefore, comes with local phase fluctuations, which have been found to become severe on large lattices [20].

These two problems are sometimes contrasted and referred to as “global” vs “local” sign problems. An example of an unremovable global sign problem was given in [19]: the one-dimensional integral $\int(\cos\theta + \epsilon)\,d\theta$, for small $\epsilon$, cannot have its sign problem repaired by any contour deformation. Local sign problems, in contrast, have been found to be removable even where the thimbles fail. In the heavy-dense limit of the Thirring model, numerical experiments show that the flow results in a suboptimal manifold even when a perfect manifold does exist [20].

We will see in this section that local sign problems are always removable; that is, a piecewise-smooth contour exists along which there are no locally fluctuating phases, but there may be cancellations between different pieces. Combined with the observation above that at least some global sign problems are unremovable, this indicates that the distinction is well defined.
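The global sign problem of the integral $\int(\cos\theta + \epsilon)\,d\theta$ quoted above can be made concrete numerically. The sketch below is only an illustration under stated assumptions: it takes $\theta \in [0, 2\pi)$, picks an arbitrary value `EPS`, and examines just the one-parameter family of constant imaginary shifts $\theta \to \theta + ic$. It therefore does not test every contour, and the function names are hypothetical.

```python
import numpy as np

EPS = 0.1  # illustrative small parameter
theta = np.linspace(0.0, 2.0 * np.pi, 200_000, endpoint=False)
dtheta = theta[1] - theta[0]

def integrand(z):
    """The 'Boltzmann factor' cos(z) + EPS, continued to complex z."""
    return np.cos(z) + EPS

def average_phase(shift):
    """|Z| / Z_Q on the contour theta + i * shift (Jacobian is 1 for a shift)."""
    values = integrand(theta + 1j * shift)
    z_full = np.sum(values) * dtheta
    z_quenched = np.sum(np.abs(values)) * dtheta
    return abs(z_full) / z_quenched

shifts = np.linspace(0.0, 2.0, 21)
phases = np.array([average_phase(c) for c in shifts])
for c, p in zip(shifts, phases):
    print(f"shift {c:4.1f}: average phase {p:.4f}")
```

Within this family the average phase is largest on the undeformed contour and only decreases as the shift grows, consistent with (though not a proof of) the statement that no deformation repairs this sign problem.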
Sign problems can be decomposed into a local and global part, with the local part fixable by an appropriate choice of integration contour, but the global part requiring more drastic manipulations.

Now we turn to a procedure, based on the holomorphic gradient flow, for removing local sign problems. Begin with $\mathcal{M}_0 = \mathbb{R}^N \subset \mathbb{C}^N$, and flow for a small amount of time $\epsilon$. This defines a manifold $\mathcal{M}_1$, parameterized approximately via

$$\tilde\phi_1(\phi) = \phi + \epsilon\,\overline{\frac{\partial S}{\partial\phi}}. \tag{19}$$

(In this discussion, for illustrative purposes, we will expand to linear order in the flow time $\epsilon$, as if a discrete jump was made. To treat the zeros of $e^{-S}$ correctly, it is important to perform a proper evolution by the flow equation instead.) By the argument in the previous section, the quenched partition function on $\mathcal{M}_1$ is no larger than that on $\mathcal{M}_0$; moreover, if $\mathcal{M}_1 \neq \mathcal{M}_0$, then the quenched partition function is smaller.

Consider the effective action $S_1$ induced by $\tilde\phi_1$; this is a function from $\mathbb{R}^N$ to the complex numbers. We would now like to flow with respect to this effective action. Doing so would guarantee that the sign problem once again improves. In order for this to make sense, however, $S_1$ must be holomorphic. Naively, it appears not to be. In particular, its definition includes the explicitly antiholomorphic term $\overline{\partial S/\partial\phi}$, preventing us from repeating the last step.

This apparent obstruction is illusory. First consider $\tilde\phi_1(\phi)$ as defined in Eq. (19). As written, it appears to contain both a holomorphic and an antiholomorphic term. However, the only aspect of $\tilde\phi_1$ we care about is its definition on $\mathbb{R}^N$. Functions $\mathbb{R}^N \to \mathbb{C}$ are neither holomorphic nor antiholomorphic; as long as $\tilde\phi_1$ is sufficiently smooth, it can be analytically continued into the complex plane in a purely holomorphic way. Concretely, we can replace Eq. (19) by

$$\tilde\phi_1(\phi) = \phi + \epsilon\,\overline{\left.\frac{\partial S}{\partial\phi}\right|_{\bar\phi}}; \tag{20}$$

that is, we evaluate the function $\overline{\partial S/\partial\phi}$ at $\bar\phi$.
This parameterizes exactly the same manifold (as the two functions agree on $\mathbb{R}^N$), but also defines a holomorphic map when evaluated on the rest of the complex plane.

Now we return to the effective action, which after one step is given by

$$S_1(\phi) = S[\tilde\phi_1(\phi)] - \log\det\left(1 + \epsilon\,\frac{\partial}{\partial\phi}\overline{\left.\frac{\partial S}{\partial\phi}\right|_{\bar\phi}}\right). \tag{21}$$

As initially defined, this is an analytic function of the real plane alone. As with $\tilde\phi_1$, we can choose its behavior in the complex plane to be holomorphic, at least in some region around the real plane. We can now flow the real plane again, this time with respect to $S_1$. As before, this is guaranteed to improve the sign problem. We obtain a function $\tilde\phi_2$ which maps the real plane (the domain of $S_1$) to some slightly deformed contour. Composing $\tilde\phi_1 \circ \tilde\phi_2$ yields a map from the domain of the original action $S$ to a deformed contour. Thus, we obtain a new integration manifold $\mathcal{M}_2 = \tilde\phi_1 \circ \tilde\phi_2(\mathbb{R}^N)$, which induces an effective action $S_2$, and we repeat.

At every step of this modified flow, we have the freedom to arbitrarily reparameterize the integration manifold. There is no requirement that the parameterizations be “connected” from one step to the next. This means that the evolution of the manifold from step to step is non-unique.

What can happen to the manifold in the limit of a large number of steps? As long as the manifold is changing, $Z_Q$ is shrinking; this implies that we cannot reach a cycle. The development of some singular behavior, unremovable by reparameterization, could require us to take ever-smaller steps $\epsilon$.
We do not have a formal proof forbidding this; however, numerical experience with the holomorphic gradient flow indicates that it does not create singularities away from zeros of the Boltzmann factor. The last possibility is a fixed point: subsequent manifolds are ever-better approximations (or perhaps equal) to a manifold $\mathcal{M}_f$ which is unchanged by the holomorphic gradient flow under the effective action.

The properties of such a fixed-point manifold are best understood by considering the corresponding effective action $S_f$. Since $\mathcal{M}_f$ is unchanged after a step of flow, it must be that the flow vectors $\overline{\partial S_f/\partial\phi}$ lie entirely within the real plane. Equivalently, since the sign problem is not improved by flow, $\mathrm{Im}\, S_f$ must be constant everywhere $|e^{-S_f}| > 0$. Critically, this does not imply that $\mathrm{Im}\, S_f$ is in fact globally constant, merely that regions of distinct $\mathrm{Im}\, S_f$ are separated by vanishing Boltzmann factors.

To summarize: the effective action $S_f$ on the real plane satisfies $e^{-\mathrm{Re}\, S_f}\, \partial\, \mathrm{Im}\, S_f = 0$. The imaginary part is locally constant except at places where the entire action diverges (and the Boltzmann factor vanishes). The real plane is thus divided into distinct regions, each with constant $\mathrm{Im}\, S_f$ and therefore no local sign problem, but with the possibility of cancellations between the regions.

What does this imply about $\mathcal{M}_f$? Regions on $\mathbb{R}^N$ where $S_f$ does not diverge correspond to smooth parts of the fixed-point manifold, which have no sign problem when integrated over. These smooth regions terminate where $\mathcal{M}_f$ intersects with singularities of $S$, either due to a fermion determinant becoming 0, or (in the case of bosonic sign problems) where one or more fields $\tilde\phi$ diverge.

The similarities of $\mathcal{M}_f$ to the Lefschetz thimbles $\mathcal{M}_T$ are striking. Like $\mathcal{M}_f$, when the Lefschetz thimbles are parameterized by the real plane, they are separated from each other by regions of vanishing effective Boltzmann factor. Those regions on the real plane correspond to the places in $\mathbb{C}^N$ where $\mathcal{M}_T$ intersects with divergences of $S$. The key difference is that, when working with the thimbles, the imaginary part of the physical action is constant but the effective action (due to the Jacobian) may have imaginary fluctuations. The fixed-point manifold $\mathcal{M}_f$ will have a fluctuating imaginary action, but a locally constant imaginary part of the effective action on the real plane.

On the fixed-point manifold, the notion of a “global” sign problem becomes clear. Different parts of the manifold have different $\mathrm{Im}\, S_f$, with those differences apparently not removable by any choice of integration contour. Which physical systems possess global sign problems remains an open question.

Hints of this notion of a “global” sign problem are, as mentioned earlier, visible already when considering the Lefschetz thimbles.
When the integral over the real line is equal to a sum of integrals over two (or more) thimbles with different phases, it is tempting to disregard the local part of the sign problem on the thimbles, and attribute the cancellations between thimbles to a global sign problem. It is an open question whether such a global sign problem on the thimbles implies an unremovable global sign problem.

In the context of Lefschetz thimbles, it has been argued that cancellations between different thimbles should not be severe in the infinite volume limit. One such argument proceeds as follows [21]. Each thimble is associated with a critical point of the action; i.e., a classical solution to the equations of motion. At large volumes, we may reasonably expect the integral to be dominated by thimbles associated with large-scale classical solutions. These solutions, and therefore their associated thimbles, persist as we enlarge the volume. Therefore, we may now talk about "a thimble" across multiple volumes. Each thimble's contribution to the path integral should grow thermodynamically, defining a per-thimble free energy. Unless protected by some symmetry, each of these free energies will generically be different, causing one thimble to dominate in the large-volume limit. Even in the case of the thimbles, this argument is not a proof. Its applicability to the fixed-point manifolds is particularly unclear.

C. Existence of Perfect Manifolds

In the restricted case of polynomial actions (this excludes lattice models with a fermion determinant), we can show that globally perfect manifolds always exist, provided that locally perfect manifolds depend smoothly on the parameters of the action. To be precise, the conjecture we need is:

Conjecture.

Let $S_t$ be a continuous family of actions, and let $\mathcal{M}$ be a manifold on which $e^{-S_0}\, dz$ has no local phase fluctuations. Then there exists a continuous family of manifolds $\mathcal{M}_t$, with $\mathcal{M}_0 = \mathcal{M}$, such that $e^{-S_t}\, dz$ has no local phase fluctuations on $\mathcal{M}_t$.

This conjecture empirically holds in several one-dimensional models explored in the next section. It is also motivated by thinking of normalizing flows as analytic functions not just of the field variables, but also of the parameters of the action.
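The statement of the conjecture can be made concrete in a case where the family of manifolds is known in closed form. The sketch below is an illustration only: the toy family $S_t(z) = e^{it} z^2$ and the function names are assumptions introduced here, not models or code from the text. For this family the rotated line $z = e^{-it/2} x$ plays the role of $\mathcal{M}_t$, it reduces to the real line at $t = 0$, and $e^{-S_t}\,dz$ has a constant phase along it.

```python
import numpy as np

def action(z, t):
    # Toy family S_t(z) = exp(i t) z^2; an illustrative assumption, not a model from the text.
    return np.exp(1j * t) * z**2

def average_sign(t, rotate, half_width=12.0, n=20001):
    """|int e^{-S_t} dz| / int |e^{-S_t} dz| along the straight contour z = exp(i theta) x."""
    x = np.linspace(-half_width, half_width, n)
    theta = -0.5 * t if rotate else 0.0   # rotate=True selects M_t; rotate=False is the real line
    dz_dx = np.exp(1j * theta)            # constant Jacobian of a straight contour
    weight = np.exp(-action(dz_dx * x, t)) * dz_dx
    # The grid spacing cancels in the ratio, so plain sums suffice.
    return abs(np.sum(weight)) / np.sum(np.abs(weight))

for t in (0.0, 0.5, 1.0):
    print(t, average_sign(t, rotate=False), average_sign(t, rotate=True))
```

On the real line the average sign of this toy model degrades as $t$ grows, while on the rotated contour it stays at unity for every $t$ shown. A Gaussian is of course far simpler than the actions of physical interest; the point is only to show what a continuous family $\mathcal{M}_t$ with $\mathcal{M}_0 = \mathbb{R}$ looks like.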

FIG. 1. Perfect manifolds, found by numerical search, for the one-dimensional integral defined by Eq. (22). All manifolds have an average sign measured to be within $10^{-[\ldots]}$ of unity.

Suppose we start from an action $S_0$ that has no sign problem on the real plane, whether local or global. Later actions $S_t$ have a sign problem. By definition the corresponding manifolds $\mathcal{M}_t$ have no local sign problem; can a global sign problem be created?

One way for a global sign problem to be created, without requiring discontinuous behavior of the family $\mathcal{M}_t$, is for singularities of the action to intersect with the manifold. In the case of the Schwinger-Keldysh action Eq. (3) and other polynomial actions, however, there are no singularities of the action except at infinity. Any global sign problem must come from regions of $\mathcal{M}_t$ on which all fields can become arbitrarily large.

Creating a global sign problem, therefore, implies introducing a new such region of $\mathcal{M}_t$. This is a discontinuous operation on the family of manifolds. If we are to hold to the previous conjecture, we must conclude that this family of manifolds is not only locally perfect, but in fact globally perfect.

D. Examples

In one dimension, manifolds with no sign problem can be readily found by a numerical search. As an example, Fig. 1 shows such manifolds for the action

$$S = x^4 + \lambda e^{i\phi} x^2 \tag{22}$$

for various values of $\lambda$ and $\phi$. Note that any coefficient of the $x^4$ term can be absorbed by a linear change of variables, so this is the most general quartic action that is even in $x$. These examples motivate the conjecture that similarly structured sign problems (in particular, those with a polynomial action) generally admit perfect manifolds.

The availability of perfect manifolds does not hold even for all one-dimensional integrals, however. A simple example, not physically motivated, was given in [19]. The integral of $(\cos\theta + \epsilon)$ has a sign problem of order $\epsilon^{-1}$ for small $\epsilon$. For sufficiently small $\epsilon$, it cannot have its sign problem removed by any contour deformation. This is readily confirmed by noticing that the magnitude of $\cos(a + ib)$ (which is the quenched Boltzmann factor) is minimized when $b = 0$. The integral along the real line will then have the smallest quenched partition function, and therefore the best possible sign problem.

In the previous section, we discussed how a global sign problem could be created when singularities of the action intersected with a locally perfect manifold. The case of $(\cos\theta + \epsilon)$ is a clear demonstration of this phenomenon. At $\epsilon > 1$, there are two zeros of the Boltzmann factor, at $\operatorname{Re}\theta = 0$ and $\operatorname{Im}\theta = \pm\cosh^{-1}\epsilon$. As $\epsilon$ is lowered, these move towards the real line; at $\epsilon = 1$ they merge at $\theta = 0$. At this point, no global sign problem yet exists, but the locally perfect manifold now passes through a zero. Continue lowering $\epsilon$, and the two zeros again split, now at $\operatorname{Re}\theta = \pm\cos^{-1}\epsilon$. Although the manifold has never changed, it now consists of segments with cancelling phases.

Let us now consider a more physical model: the 0+1-dimensional Thirring model as studied in [7, 13]. The Boltzmann factor defining this model is

$$e^{-S} = \exp\left(g \sum_i \cos z_i\right) \det_{i,j}\left[ m\,\delta_{i,j} + \frac{1}{2}\left( e^{\mu + i z_i}\,\delta_{i+1,j} - e^{-\mu - i z_j}\,\delta_{i-1,j} + e^{-\mu - i z_j}\,\delta_{i,1}\delta_{j,\beta} - e^{\mu + i z_i}\,\delta_{j,1}\delta_{i,\beta} \right) \right], \tag{23}$$

where $g$ is a coupling constant, $\mu$ is the chemical potential (and origin of the sign problem), and $m$ is the bare mass. The $z_1, \ldots, z_\beta$ are the degrees of freedom being integrated over; there are $\beta$ links on the lattice.

The fermion determinant depends only on the sum of the fields $\beta\sigma = \sum_i z_i$. A natural simplification, therefore, is to consider the "mean-field" model, a one-dimensional integral with Boltzmann factor

$$e^{-S(\sigma)} = e^{\beta g \cos\sigma}\left[ \cos\big(\beta(\sigma - i\mu)\big) + 1 \right]. \tag{24}$$

We have taken $m = 0$ for convenience (and neglected an overall normalization).

For our purposes, an interesting limit is that of large $\beta$, while keeping the coupling and chemical potential both of order unity. Numerical experiments indicate that there is no contour that exactly solves the sign problem with these parameters; indeed, the average phase falls exponentially in $\beta$, as one would expect. This holds even if we neglect the Jacobian.
In particular, write the quenched Boltzmann factor explicitly in terms of the real and imaginary parts of $\sigma = \sigma_R + i\sigma_I$:

$$|e^{-S}| = e^{\beta g \cos\sigma_R \cosh\sigma_I}\, \Big| 1 + \cos(\beta\sigma_R)\cosh\big(\beta(\sigma_I - \mu)\big) - i \sin(\beta\sigma_R)\sinh\big(\beta(\sigma_I - \mu)\big) \Big|. \tag{25}$$

The quenched partition function can be given a lower bound by minimizing $\int |e^{-S[\sigma_R, \sigma_I(\sigma_R)]}|$ over all functions $\sigma_I$. The minimization over $\sigma_I$ can be done individually for each $\sigma_R$. Even this lower bound on the quenched partition function is still exponentially larger than the physical partition function.

Note that the fact that no manifold exists to resolve the mean-field sign problem does not prove that no manifold exists that resolves the sign problem of the original theory: it is merely suggestive. It seems plausible that a similar technique could be used to establish the impossibility of resolving the original sign problem.

This fermionic example differs sharply from the Schwinger-Keldysh action (and from Eq. (22)). The action is not a polynomial, and relatedly, the Boltzmann factor falls to zero away from infinity (and nearly on the real plane). If the failure to have a perfect manifold is related to these features, then we expect fermionic sign problems to frequently be unresolvable via contour deformation, while bosonic sign problems would generically be resolvable.

E. Existence of Normalizing Flows

A parameterization $\tilde\phi(\phi)$ of a manifold with no sign problem induces an effective action on the real plane that is always real. Thanks to the composability of normalizing flows, the problem of finding a complex normalizing flow reduces to the problem of finding an ordinary normalizing flow for that effective action. Provided that this can be done, the existence of a perfect manifold implies the existence of a complex normalizing flow.

As it happens, given probability distributions $p(x)$ and $\pi(\tilde x)$, a map $x \to \tilde x$ always exists such that the measure $p(x)\,dx$ induces the measure $\pi(\tilde x)\,d\tilde x$; that is, such that

$$p(x) \left( \det \frac{\partial \tilde x}{\partial x} \right)^{-1} = \pi[\tilde x(x)]. \tag{26}$$

The construction is simplest, and unique, in one dimension. Define the cumulative distribution functions $P$ and $\Pi$, of $p$ and $\pi$ respectively:

$$P(x) = \int_{-\infty}^{x} dx'\, p(x'). \tag{27}$$

The CDF can be seen as a normalizing flow from a probability distribution to the uniform distribution on the unit interval. Therefore, the desired map is given by $\Pi^{-1} \circ P$.

In the multidimensional case such maps are known to exist as well, but cease to be unique. Finding maps with desirable properties is an active area of research; see [22] for a review.

The remainder of this work is dedicated to the task of finding approximate normalizing flows, under the assumption that such flows exist.

V. PERTURBING FLOWS
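The one-dimensional construction $\Pi^{-1} \circ P$ described around Eq. (27), which reappears in this section as the map of Eq. (34), can be sketched numerically. The following is a minimal illustration, assuming a unit Gaussian source density and a target proportional to $e^{-y^4}$; the helper names `gaussian_cdf` and `make_flow`, the grid limits, and the use of tabulation plus interpolation to invert $\Pi$ are all choices made here for illustration rather than anything prescribed by the text.

```python
import math
import numpy as np

def gaussian_cdf(x):
    """P(x) for the unit Gaussian density proportional to exp(-x^2 / 2)."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def make_flow(log_target, lo=-4.0, hi=4.0, n=8001):
    """Return the map x -> Pi^{-1}(P(x)), with Pi tabulated by the trapezoid rule."""
    grid = np.linspace(lo, hi, n)
    dens = np.exp(log_target(grid))
    steps = 0.5 * (dens[1:] + dens[:-1]) * np.diff(grid)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]                      # normalize so that Pi(hi) = 1
    # Inverting Pi amounts to interpolating the grid as a function of the CDF.
    return lambda x: np.interp(gaussian_cdf(x), cdf, grid)

flow = make_flow(lambda y: -y**4)       # target density proportional to exp(-y^4)
rng = np.random.default_rng(0)
samples = flow(rng.standard_normal(200_000))
second_moment = np.mean(samples**2)
print(second_moment, math.gamma(0.75) / math.gamma(0.25))
```

The second moment of the transformed samples can be compared against the closed form $\Gamma(3/4)/\Gamma(1/4)$ for the quartic density, which gives a quick check that the map pushes the Gaussian measure onto the intended target.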

In principle, a complex normalizing flow can be trained in much the same way as a regular normalizing flow. In practice, this training is a difficult process. One principal reason, closely linked to the sign problem, is that when comparing Boltzmann factors, a difference of $2\pi$ in the imaginary part of the action is invisible. As a result, if the physical Boltzmann factor is $1$ and the induced Boltzmann factor is $-1$, the gradient descent procedure has no way to know whether the induced $\operatorname{Im} S$ should be changed by $\pi$ or $-\pi$ (or perhaps $3\pi$). Circumventing this requires either maintaining a normalizing flow which is always "within $\pi$" of being exact, or defining the flow in such a way that the imaginary part of the induced action is itself well defined.

Instead, we will work in the spirit of [23], and construct normalizing flows in perturbation theory.

A. Leading Order

A normalizing flow need not begin with a Gaussian distribution. In the general case, the condition for a normalizing flow reads

$$\left( \det \frac{\partial \tilde\phi}{\partial \phi} \right) e^{-S[\tilde\phi(\phi)]} = \mathcal{N} e^{-S_0(\phi)}, \tag{28}$$

where $S_0$ is the action defining the original probability distribution, which $\phi \mapsto \tilde\phi$ transforms into $e^{-S}$. Consider the case where $S$ is merely a perturbation of $S_0$; that is, where

$$S = S_0 + \lambda \mathcal{O} \tag{29}$$

for small $\lambda$. When $\lambda = 0$, a suitable normalizing flow is simply $\tilde\phi = \phi$. For small $\lambda$, we expand $\tilde\phi$ as a power series: $\tilde\phi = \phi + \lambda \Delta^{(1)} + O(\lambda^2)$. Returning to Eq. (28) and expanding to leading order in $\lambda$, we find the differential equation for $\Delta^{(1)}$:

$$\nabla \cdot \Delta^{(1)} - \Delta^{(1)} \cdot \nabla S_0 = \mathcal{O} - \langle \mathcal{O} \rangle. \tag{30}$$

Here the expectation value $\langle \mathcal{O} \rangle$ is evaluated with respect to the original action $S_0$. We can obtain Eq. (30) more quickly simply by considering the integral of $\nabla \cdot \left( \Delta e^{-S_0} \right)$, for any $\Delta$ that decays at infinity, or diverges sufficiently slowly. As the integral of a total derivative, it must vanish. This implies that the expectation value of $\nabla \cdot \Delta - \Delta \cdot \nabla S_0$ vanishes as well.

When perturbing from a free theory (that is, when $S_0$ defines a Gaussian), Eq. (30) can be solved exactly. In particular, with

$$S = \sum_{ij} \phi_i M_{ij} \phi_j + \lambda \sum_i \Lambda_i \phi_i^4, \tag{31}$$

the perturbative flow $\Delta^{(1)}$ is given by

$$\Delta^{(1)}_i = -\sum_j \left[ \frac{1}{2} M^{-1}_{ij} \Lambda_j \phi_j^3 + \frac{3}{4} M^{-1}_{ij} M^{-1}_{jj} \Lambda_j \phi_j \right]. \tag{32}$$

Note that Eq. (31) is a generalization of the Schwinger-Keldysh action Eq. (3). Any two Gaussians are trivially connected by a normalizing flow, and as noted earlier, normalizing flows compose. Thus, Eq. (32) implicitly defines a (perturbative) normalizing flow for the Schwinger-Keldysh sign problem in $\phi^4$ field theory.

Unfortunately, Eq. (28) is not the only condition constraining a complex normalizing flow. As discussed in Sec. III, the asymptotic behavior of the contour $\tilde\phi(\mathbb{R}^N)$ must match that of the real plane; i.e., the two manifolds must be in the same homology class.
It is not a surprise that the perturbative flow Eq. (32) violates this condition, as the perturbative expansion is equivalent to an expansion in small fields $\phi$, while the asymptotic behavior is determined purely by the behavior of $\Delta^{(1)}$ when $\phi$ is large.

To get the correct asymptotic behavior, we can work instead in the strong coupling expansion to obtain $\Delta^{(1,\mathrm{strong})}$, which becomes a good approximation at large $\phi$. For an action of the form of Eq. (31), it is convenient to construct our normalizing flow as a sequence of four maps:

1. Map the distribution $e^{-\phi^2/2}$ to $e^{-\psi_1^4}$ via $\psi_1 = F_1(\phi)$.
2. Rotate and scale the complex plane via $\psi_2 = F_2(\psi_1)$ to obtain the distribution $e^{-\Lambda \psi_2^4}$.
3. Introduce a perturbative quadratic piece via a perturbative flow $\psi_3 = F_3(\psi_2) = \psi_2 + \frac{1}{\sqrt{\lambda}}\, \delta^{(1)}(\psi_2)$. The resulting distribution is $e^{-S'(\psi_3)}$, where
$$S'(\psi) = \sum_i \Lambda_i \psi_i^4 + \frac{1}{\sqrt{\lambda}} \sum_{ij} \psi_i M_{ij} \psi_j. \tag{33}$$
4. Rescale the fields to restore the correct field normalization via $\tilde\phi = F_4(\psi_3)$, finally obtaining the desired distribution $e^{-S(\tilde\phi)}$, with the action defined in Eq. (31).

FIG. 2. Simulations with the normalizing flow computed in the strong-coupling expansion to leading order. On the left, the resulting sign problem is computed on a lattice with $N_\beta = 2$, $n_t = 5$, $m = 0.5$, as a function of the coupling $\lambda$. The real-time correlator $\langle \phi(t)\phi(0) \rangle$ is shown on the right, at the same temperature and with $m = 0.$[…] and $\lambda = 0.33$. The solid lines labelled 'Exact' include the same Trotterization errors present on the Schwinger-Keldysh lattice.

Note that the first two maps, $F_1$ and $F_2$, factor into one-dimensional maps, which can be obtained straightforwardly via the prescription following Eq. (27). Accordingly, $F_1$ can be written as

$$F_1(\phi) = \Pi^{-1} \circ P, \quad \text{with} \tag{34}$$

$$\Pi(\phi) = \frac{1}{2} + \frac{1}{2} \left( 1 - \frac{\Gamma\left[1/4, \phi^4\right]}{\Gamma(1/4)} \right) \operatorname{sgn} \phi, \tag{35}$$

$$P(\phi) = \frac{1}{2} \left( 1 + \operatorname{Erf}\left( \phi / \sqrt{2} \right) \right). \tag{36}$$

Above, $\Gamma(x)$ is the gamma function, $\Gamma(s, x)$ is the upper incomplete gamma function, and $\operatorname{Erf}(x)$ is the error function.

The second map is given by simply multiplying $\phi$ by $\Lambda_i^{-1/4}$. This is a rotation of the complex plane on most of the lattice, with an additional scaling factor of $2^{[\ldots]}$ on the corners of the Schwinger-Keldysh contour. Thus the map $F_2$ is defined as

$$F_2(\phi) = \phi / \Lambda_i^{1/4}. \tag{37}$$

The map $F_3$ is where the strong coupling expansion is performed. The differential equation for $\delta^{(1)}$ is of the form of Eq. (30), with $\mathcal{O} = \sum_{ij} \phi_i M_{ij} \phi_j$. The expectation value $\langle \mathcal{O} \rangle$ must now be evaluated with respect to the leading-order action $S_0 = \sum_i \Lambda_i \phi_i^4$. Expressed in terms of $f_i = \delta^{(1)}_i(\phi)\, e^{-\Lambda_i \phi_i^4}$, and using the fact that $\langle \phi_i \phi_j \rangle$ vanishes when $i \neq j$ (in the strong coupling limit), the differential equation reads

$$\frac{\partial f_i}{\partial \phi_i}\, e^{\Lambda_i \phi_i^4} - \sum_j M_{ij} \phi_i \phi_j = -M_{ii} \langle \phi_i^2 \rangle. \tag{38}$$

The expectation value required is $\langle \phi_i^2 \rangle = \Gamma(3/4) / \left( \Gamma(1/4) \sqrt{\Lambda_i} \right)$. Using the fact that only diagonal and nearest-neighbor terms of $M$ are non-zero, the solution is

$$\delta^{(1)}_i(\phi) = e^{\Lambda_i \phi_i^4} M_{ii} \left[ -\frac{\phi_i^3\, \Gamma\left[\tfrac{3}{4}, \Lambda_i \phi_i^4\right]}{4 (\Lambda_i \phi_i^4)^{3/4}} + \langle \phi_i^2 \rangle \frac{\phi_i\, \Gamma\left[\tfrac{1}{4}, \Lambda_i \phi_i^4\right]}{4 (\Lambda_i \phi_i^4)^{1/4}} \right] + \sum_{j \in \{i-1,\, i+1\}} e^{\Lambda_i \phi_i^4}\, \frac{\sqrt{\pi}}{4 \sqrt{\Lambda_i}} \left[ \operatorname{Erf}\left( \sqrt{\Lambda_i \phi_i^4} \right) - C \right] M_{ij} \phi_j. \tag{39}$$

Above, $(\cdot)^{1/4}$ refers specifically to the principal fourth root. A specific choice of $C = 1$ gives a solution which vanishes at $\psi_i \to \infty$ and is oscillation-free.

Finally, $F_4$ rescales the field by a factor of $\lambda^{1/4}$:

$$F_4(\phi) = \phi / \lambda^{1/4}. \tag{40}$$

Putting it all together, the entire perturbative flow from $e^{-\sum_i \psi_i^2/2}$ to $e^{-S(\phi)}$ at the leading order is

$$\phi + \Delta^{(1,\mathrm{strong})}(\phi) = \left[ F_4 \circ F_3 \circ F_2 \circ F_1 \right](\phi). \tag{41}$$

The left panel of Fig. 2 shows the average phase obtained by this flow on a 12-site lattice with $m = 0.5$, as the coupling is varied. As expected, at strong coupling, the sign problem is almost entirely removed, whereas at sufficiently small coupling the average phase is too small to be distinguished from zero. As a check of the correctness and convergence of the flow, the right panel of the same figure shows the real-time correlator obtained with $m = 0.$[…] and $\lambda = 0.33$, compared with an exact Hamiltonian calculation. The lattice behind this calculation has 14 sites (two thermal links and six temporal links in each direction), and the average phase was computed to be $\langle \sigma \rangle = 0.$[…].

B. Extracting Expectation Values
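Since this subsection builds on Eq. (30), it may help to see that equation verified in the simplest possible setting. The sketch below is an illustration under stated assumptions: a single variable, $S_0 = x^2/2$ (which corresponds to $M = 1/2$, $\Lambda = 1$ in the normalization of Eq. (31)), and $\mathcal{O} = x^4$, for which $\Delta(x) = -(x^3 + 3x)$ solves Eq. (30) with $\langle \mathcal{O} \rangle = 3$. The grid, the helper `expect`, and the value of `lam` are hypothetical choices made for this example.

```python
import numpy as np

# Uniform grid; Riemann sums converge very quickly for these Gaussian-damped integrands.
x = np.linspace(-12.0, 12.0, 48001)

def expect(f, lam=0.0):
    """<f> under the weight exp(-x^2/2 - lam * x^4), evaluated on the grid."""
    w = np.exp(-0.5 * x**2 - lam * x**4)
    return np.sum(w * f(x)) / np.sum(w)

O = lambda y: y**4                       # perturbing observable
delta = lambda y: -(y**3 + 3.0 * y)      # candidate solution of Eq. (30)
ddelta = lambda y: -(3.0 * y**2 + 3.0)   # its derivative (the divergence in one dimension)

mean_O = expect(O)                       # expectation value with respect to S_0
pts = np.linspace(-3.0, 3.0, 13)
# Residual of Eq. (30): div(Delta) - Delta * S_0' - (O - <O>), with S_0' = x.
residual = ddelta(pts) - delta(pts) * pts - (O(pts) - mean_O)

lam = 1e-3
# <x^2> from Gaussian samples pushed through the first-order flow x -> x + lam * Delta(x) ...
flowed = expect(lambda y: (y + lam * delta(y))**2)
# ... compared with <x^2> evaluated directly under the perturbed action.
exact = expect(lambda y: y**2, lam=lam)
print(np.max(np.abs(residual)), flowed, exact)
```

The residual vanishes to numerical precision, and the two estimates of $\langle x^2 \rangle$ differ only at order $\lambda^2$, as expected of a leading-order flow.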

When the action $S_0$ defines a probability distribution from which sampling can be performed efficiently, Eq. (30) provides a means to approximately sample from the perturbed distribution defined by $S$. However, that equation is still valid when $S_0$ is itself hard to sample from (and perhaps afflicted with a sign problem). Any vector field $\Delta_i$ corresponds to some observable $\mathcal{O}$ whose expectation value is known; and for a fixed desired observable, a numerical solution for $\Delta$ can be attempted, which will automatically yield the expectation value.

The differential equation is in a number of dimensions equal to the number of sites on the lattice. Neural networks have been profitably applied to solving such high-dimensional differential equations [24]. Our strategy is as follows. A multi-layer perceptron (MLP), with parameters labelled $W$, will represent $\Delta$ as a function of the (real) fields $\phi$; a single additional training parameter $E$ represents $\langle \mathcal{O} \rangle$. We train these parameters with respect to the cost function

$$C(W, E) = \int d\phi\; e^{-\phi^2/2} \left| \nabla \cdot \Delta_W(\phi) - \Delta_W(\phi) \cdot \nabla S(\phi) - E + \mathcal{O}(\phi) \right|^2, \tag{42}$$

which is estimated by randomly sampling from the Gaussian distribution in $\phi$.

The left panel of Fig. 3 showcases this method on 0+1-dimensional scalar field theory, which is equivalent to the model of Eq. (3) with no real-time evolution. The lattice parameters are $m = 0.$[…] and $\beta = 10$; expectation values are given as a function of the coupling. A two-layer MLP is used with hyperbolic tangent as an activation function. The method is seen to have reasonable agreement with the exact answer across all couplings.

This method has serious drawbacks. Most importantly, it represents an uncontrolled approximation. The error in the estimation of $\langle \mathcal{O} \rangle$ is due to the shortcomings of the ansatz used for $\Delta$, rather than insufficient statistics; therefore, this error can be neither estimated by bootstrap nor removed by taking a larger number of samples.

In the case of the Schwinger-Keldysh action with $N_t > 0$, another issue emerges. The most natural cost function for training $\Delta$ would be

$$C'(W, E) = \int d\phi\; e^{-S} \left| \nabla \cdot \Delta_W(\phi) - \Delta_W(\phi) \cdot \nabla S(\phi) - E + \mathcal{O}(\phi) \right|^2. \tag{43}$$

The task of evaluating this cost function itself has a sign problem; estimating its derivatives with respect to the MLP parameters has a related signal-to-noise problem. Training with respect to the cost function of Eq. (42) has no such difficulty, but a small value of that cost function does not imply that $\Delta$ is a good approximation in terms of the 'true' cost function. This mismatch allows an apparently good fit to correspond to very inaccurate expectation values.

The right panel of Fig. 3 showcases this failure more clearly, in the case of $N_\beta = 2$, for various real-time evolutions $N_t$. Shown is the expectation value $\langle \phi(t)\phi(0) \rangle$ for several time separations $t$, at an inverse temperature of $\beta = 2$, with lattice parameters $m = 0.$[…] and $g = 0.$[…].

VI. FURTHER DISCUSSION

This paper introduced the notion of complex normalizing flows, which extend the applicability of normalizing flow-based methods to (some) models afflicted with a sign problem. Unlike real normalizing flows, which always exist, complex normalizing flows exist only when an integration contour is available which exactly removes the sign problem. Approximate normalizing flows can be constructed in perturbation theory, but low-order perturbative approximations were found not to have a tractable sign problem on lattices with more than $\sim 20$ sites.

The general question of when sign problem-solving integration contours exist remains open. For simple fermionic models, they can be readily shown not to exist, although the possibility remains that the manner of integrating out fermions could be changed to remedy this. Simple models inspired by the Schwinger-Keldysh action do have contours which exactly solve the sign problem. Conjecturally, this is a generic feature of polynomial actions.

The Schwinger-Keldysh sign problem for scalar fields comes from a polynomial action, but this is far from an unusual property. Consider the case of $SU(N)$ gauge theories in the absence of fermions. The complexification of $SU(N)$ is the group of complex $N \times N$ matrices $U$ obeying $\det U = 1$, $SL(N; \mathbb{C})$. On this space, the standard Wilson action and its many improvements can be written holomorphically as a polynomial of $U$ and $U^{-1}$. Cramer's rule for the inversion of matrices provides an expression for $U^{-1}$ in terms of the elements of $U$, in which the only non-polynomial factor is $(\det U)^{-1}$. In the case of $SL(N; \mathbb{C})$ matrices, this factor is always 1 and can be neglected. What remains in the action is a polynomial of the field variables.

We argued that contours always exist which locally solve the sign problem, by describing an iterative procedure for removing local fluctuations in $\operatorname{Im} S$, and studying the properties of its fixed points. Analogous to the Lefschetz thimbles, a fixed-point manifold of this procedure can be decomposed into several smooth pieces, separated from each other by singularities of the physical action. The key difference is that on this manifold, the imaginary part of the effective action is constant, rather than the imaginary part of the physical action.

FIG. 3. Evaluation of expectation values via the machine learning method of Sec. V B. On the left, a 10-site lattice with no real-time evolution, with $m = 0.$[…] $\beta = 2$, $m = 0.5$, and $\lambda = 0.5$. In both cases, exact results are shown by the solid line.

If the evidence presented earlier is taken at face value, it is likely that manifolds exist that resolve the real-time bosonic sign problem. It is critical to note that this is not equivalent to a solution to that sign problem. Difficulties could still exist with the computational task of finding such manifolds, or there may be no efficient algorithms for sampling from them. In fact, in the case of a real-time sign problem with arbitrary time-dependent source terms (which is still a polynomial action), this is the expected result: it was shown in [25] that the task of computing amplitudes in such a context is BQP-hard.

ACKNOWLEDGMENTS

We are indebted to Andrei Alexandru, Tom Cohen, Frederic Koehler, and Henry Lamm for many useful discussions. We are additionally grateful to Henry Lamm for reviewing a previous version of this manuscript. S.L. is supported by the U.S. Department of Energy under Contract No. DE-SC0017905. Y.Y. is supported by the U.S. Department of Energy under Contract No. DE-FG02-93ER-40762 and by the Jefferson Science Associates 2020-2021 graduate fellowship program.

[Footnote] In fact, the regions on the real plane of constant $\operatorname{Im} S_f$ are thimbles of the fixed-point effective action.

[1] M. Albergo, G. Kanwar, and P. Shanahan, Flow-based generative models for Markov chain Monte Carlo in lattice field theory, Phys. Rev. D, 034515 (2019), arXiv:1904.12072 [hep-lat].
[2] G. Kanwar, M. S. Albergo, D. Boyda, K. Cranmer, D. C. Hackett, S. Racanière, D. J. Rezende, and P. E. Shanahan, Equivariant flow-based sampling for lattice gauge theory, Phys. Rev. Lett., 121601 (2020), arXiv:2003.06413 [hep-lat].
[3] D. Boyda, G. Kanwar, S. Racanière, D. J. Rezende, M. S. Albergo, K. Cranmer, D. C. Hackett, and P. E. Shanahan, Sampling using SU(N) gauge equivariant flows, (2020), arXiv:2008.05456 [hep-lat].
[4] K. A. Nicoli, S. Nakajima, N. Strodthoff, W. Samek, K.-R. Müller, and P. Kessel, Asymptotically unbiased estimation of physical observables with neural samplers, Phys. Rev. E, 023304 (2020), arXiv:1910.13496 [cond-mat.stat-mech].
[5] K. A. Nicoli, C. J. Anders, L. Funcke, T. Hartung, K. Jansen, P. Kessel, S. Nakajima, and P. Stornati, Estimation of Thermodynamic Observables in Lattice Field Theories with Deep Generative Models, Phys. Rev. Lett., 032001 (2021), arXiv:2007.07115 [hep-lat].
[6] M. Cristoforetti, F. Di Renzo, and L. Scorzato (AuroraScience), New approach to the sign problem in quantum field theories: High density QCD on a Lefschetz thimble, Phys. Rev. D, 074506 (2012), arXiv:1205.3996 [hep-lat].
[7] A. Alexandru, G. Basar, P. F. Bedaque, G. W. Ridgway, and N. C. Warrington, Sign problem and Monte Carlo calculations beyond Lefschetz thimbles, JHEP, 053, arXiv:1512.08764 [hep-lat].
[8] A. Alexandru, G. Basar, P. F. Bedaque, G. W. Ridgway, and N. C. Warrington, Monte Carlo calculations of the finite density Thirring model, Phys. Rev. D, 014502 (2017), arXiv:1609.01730 [hep-lat].
[9] Y. Mori, K. Kashiwa, and A. Ohnishi, Application of a neural network to the sign problem via the path optimization method, PTEP, 023B04 (2018), arXiv:1709.03208 [hep-lat].
[10] Y. Mori, K. Kashiwa, and A. Ohnishi, Toward solving the sign problem with path optimization method, Phys. Rev. D, 111501 (2017), arXiv:1705.05605 [hep-lat].
[11] A. Alexandru, P. F. Bedaque, H. Lamm, S. Lawrence, and N. C. Warrington, Fermions at Finite Density in 2+1 Dimensions with Sign-Optimized Manifolds, Phys. Rev. Lett., 191602 (2018), arXiv:1808.09799 [hep-lat].
[12] A. Alexandru, P. F. Bedaque, H. Lamm, and S. Lawrence, Finite-Density Monte Carlo Calculations on Sign-Optimized Manifolds, Phys. Rev. D, 094510 (2018), arXiv:1804.00697 [hep-lat].
[13] A. Alexandru, G. Basar, and P. Bedaque, Monte Carlo algorithm for simulating fermions on Lefschetz thimbles, Phys. Rev. D, 014504 (2016), arXiv:1510.03258 [hep-lat].
[14] A. Alexandru, P. F. Bedaque, H. Lamm, and S. Lawrence, Deep Learning Beyond Lefschetz Thimbles, Phys. Rev. D, 094505 (2017), arXiv:1709.01971 [hep-lat].
[15] J.-L. Wynen, E. Berkowitz, S. Krieg, T. Luu, and J. Ostmeyer, Leveraging Machine Learning to Alleviate Hubbard Model Sign Problems, (2020), arXiv:2006.11221 [cond-mat.str-el].
[16] A. Alexandru, G. Basar, P. F. Bedaque, S. Vartak, and N. C. Warrington, Monte Carlo Study of Real Time Dynamics on the Lattice, Phys. Rev. Lett., 081602 (2016), arXiv:1605.08040 [hep-lat].
[17] A. Alexandru, G. Basar, P. F. Bedaque, and G. W. Ridgway, Schwinger-Keldysh formalism on the lattice: A faster algorithm and its application to field theory, Phys. Rev. D, 114501 (2017), arXiv:1704.06404 [hep-lat].
[18] A. Alexandru, G. Başar, P. F. Bedaque, H. Lamm, and S. Lawrence, Finite Density QED Near Lefschetz Thimbles, Phys. Rev. D, 034506 (2018), arXiv:1807.02027 [hep-lat].
[19] S. Lawrence, Sign Problems in Quantum Field Theory: Classical and Quantum Approaches, Ph.D. thesis, Maryland U. (2020), arXiv:2006.03683 [hep-lat].
[20] S. Lawrence, Beyond Thimbles: Sign-Optimized Manifolds for Finite Density, PoS LATTICE2018, 149 (2018), arXiv:1810.06529 [hep-lat].
[21] T. Cohen, personal communication.
[22] C. Villani, Topics in optimal transportation, 58 (American Mathematical Soc., 2003).
[23] S. Lawrence, Perturbative Removal of a Sign Problem, Phys. Rev. D, 094504 (2020), arXiv:2009.10901 [hep-lat].
[24] J. Han, A. Jentzen, and W. E, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences, 8505 (2018).
[25] S. P. Jordan, H. Krovi, K. S. Lee, and J. Preskill, BQP-completeness of scattering in scalar quantum field theory, Quantum 2