[PDF] Stochastic lists: Sampling multi-variable functions with population methods

Abstract

We introduce the method of stochastic lists to deal with a multi-variable positive function, defined by a self-consistent equation, typical for certain problems in physics and mathematics. In this approach, the function's properties are represented statistically by lists containing a large collection of sets of coordinates (or "walkers") that are distributed according to the function's value. The coordinates are generated stochastically by the Metropolis algorithm and may replace older entries according to some protocol. While stochastic lists offer a solution to the impossibility of efficiently computing and storing multi-variable functions without a systematic bias, extrapolation in the inverse of the number of walkers is usually difficult, even though in practice very good results are found already for short lists. This situation is reminiscent of diffusion Monte Carlo, and is hence generic for all population-based methods. We illustrate the method by computing the lowest-order vertex corrections in Hedin's scheme for the Fröhlich polaron and the ground state energy and wavefunction of the Heisenberg model in two dimensions.

Full PDF

SStochastic lists: Sampling multi-variable functions with population methods

Lode Pollet

Department of Physics, Arnold Sommerfeld Center for Theoretical Physics,University of Munich, Theresienstrasse 37, 80333 Munich, Germany andWilczek Quantum Center, School of Physics and Astronomy and T. D. Lee Institute,Shanghai Jiao Tong University, Shanghai 200240, China

Nikolay V. Prokof’ev

Department of Physics, University of Massachusetts, Amherst, MA 01003, USA andNational Research Center “Kurchatov Institute,” 123182 Moscow, Russia

Boris V. Svistunov

Department of Physics, University of Massachusetts, Amherst, MA 01003, USANational Research Center “Kurchatov Institute,” 123182 Moscow, Russia andWilczek Quantum Center, School of Physics and Astronomy and T. D. Lee Institute,Shanghai Jiao Tong University, Shanghai 200240, China (Dated: June 5, 2018)We introduce the method of stochastic lists to deal with a multi-variable positive function, deﬁnedby a self-consistent equation, typical for certain problems in physics and mathematics. In thisapproach, the function’s properties are represented statistically by lists containing a large collectionof sets of coordinates (or “walkers”) that are distributed according to the function’s value. Thecoordinates are generated stochastically by the Metropolis algorithm and may replace older entriesaccording to some protocol. While stochastic lists oﬀer a solution to the impossibility of eﬃcientlycomputing and storing multi-variable functions without a systematic bias, extrapolation in theinverse of the number of walkers is usually diﬃcult, even though in practice very good results arefound already for short lists. This situation is reminiscent of diﬀusion Monte Carlo, and is hencegeneric for all population-based methods. We illustrate the method by computing the lowest-ordervertex corrections in Hedin’s scheme for the Fr¨ohlich polaron and the ground state energy andwavefunction of the Heisenberg model in two dimensions.

PACS numbers: 03.75.Hh, 67.85.-d, 64.70.Tg, 05.30.Jp

I. INTRODUCTION

We are interested in the solution F ( x ) of equations ofthe type F ( x ) = F ( x ) + K [ F ( x )] , (1)where the coordinate x is high-dimensional. The (gener-ically nonlinear) functional K can involve a number ofintegrations, multiplications and summations, but we donot consider diﬀerentiations. Ultimately, the solution of(1) is often used to compute integrals involving F andsome other, more simple, functions. A straightforwardapproach to solve Eq. (1) is by ﬁxed point iterations:Starting from a guess F (0) , one computes the right-handside, plugs the newly obtained F (1) into the right-handside, and continues this iteration until, ideally, conver-gence is reached. However, as soon as the dimension ishigher than three it becomes very diﬃcult to eﬃcientlycompute and store the function F .Equations of the above type typically occur in the self-consistent formulation of quantum ﬁeld theory, such asthe Hedin equations, the Schwinger-Dyson equations, parquet equations, etc. They can also occur in thepresence of spontaneous symmetry breaking, such asthe Bardeen-Cooper-Schrieﬀer theory, when the order-ing ﬁeld has to be determined self-consistently. Whereas the Green function, the self-energy, the polarization, andthe eﬀective interaction are typically two-dimensional incase of rotational and translational symmetry [i.e., thereis one spatial (or momentum) coordinate and one time(or frequency) coordinate] and can be stored eﬃcientlywith standard grids, the irreducible three-point vertexis already ﬁve-dimensional in 3D. Studies attempting tosolve the Hedin equations therefore often treat the ver-tex function as just a bare vertex, leading to the GW -approximation. Alternatively, the full function is repre-sented by an inﬁnite number of contributions that onlyinvolve relatively simple integrals [cf. the expansion ofthe Luttinger-Ward functional in the Baym-Kadanoﬀ ef-fective action (or its generalization to bosons ) pop-ular in the context of electronic structure calculations].This problem—sometimes referred to as the curse ofdimensions—is among the most prominent ones faced bydiagrammatic Monte Carlo (DiagMC) methods. This isunsatisfactory, because the premise of the DiagMC sim-ulation is precisely to deal with high dimensions whilemaintaining the central limit theorem. In this work, weintroduce stochastic lists , a method in which the high-dimensional function F ( x ) is represented by a stochasticlist of coordinates x , x , . . . , x P , with P the length ofthe list. The entries in the stochastic list are obtainedby some Markov process (to be discussed later; the key a r X i v : . [ c ond - m a t . s t a t - m ec h ] J un property is that the values of F ( x ) do not explicitly en-ter into the equation used for generation of the list en-tries), and can be refreshed by some protocol, which isnon-Markovian. The list provides a faithful statisticalrepresentation of F ( x ): any quantity of interest whichcan be written as some integral over F can be computed.In this work we assume that F is positive.We benchmark the approach by considering two diﬀer-ent systems. First, we study the Hedin equations for theFr¨ohlich polaron by computing the lowest order vertexcorrections self-consistently. Second, we formulate thepower method to ﬁnd the ground state of a Hamiltonianin the language of stochastic lists and study the ground-state energy and wavefunction of the two-dimensionalHeisenberg model. We ﬁnd that, in practice, short listsgive already remarkably accurate results. Typically, apower law extrapolation can be attempted over severaldecades in 1 /P . However, we found deviations for longerlists (in our examples when P (cid:29) ), which makes anyextrapolation diﬃcult (unless the stochastic error domi-nates at this point).As is clear from the power method example, stochas-tic lists share a number of properties with the diﬀusionMonte Carlo method. An exponential scaling for diﬀu-sion Monte Carlo was reported previously in the litera-ture and was related to the correlation within the popula-tion of walkers as a consequence of the population controlmechanisms. . It seems therefore that all population-based methods ultimately scale exponentially, implyingthat for suﬃciently large (and hard) systems the extrapo-lation cannot be done reliably to eliminate the systematicbias. This outcome also questions the practicality of themethod for arbitrary (bosonic) problems. Nevertheless,for many realistic cases this scaling will not be seen (dueto the dominance of the stochastic error), and by suﬃ-cient insight into the problem (such as choosing a verygood guiding wavefunction) the prefactor can be substan-tially reduced such that the scaling is not an issue. Thiswe can also demonstrate for stochastic lists.This paper is structured as follows. In Section II,we study the non-crossing approximation and the low-est non-trivial order vertex corrections for the Fr¨ohlichpolaron problem. In Section III, we proceed with theground state energy of the two-dimensional Heisenbergmodel, with a special emphasis on the numerical con-vergence of the stochastic process. We conclude in Sec-tion IV. II. THE HEDIN EQUATIONS FOR THEFR ¨OHLICH POLARON PROBLEM

As a ﬁrst application, we consider the Hedin equa-tions for the Fr¨ohlich polaron. This system has a positiveexpansion, and is known to be convergent at any ﬁnitetemperature. It is hence free from the most important re-striction on the DiagMC method, which is the series con-vergence. (Since the DiagMC algorithms work by itera- tion, they typically require a ﬁnite region of convergence.)Sign-positive representation also implies that there is noneed to take special care of the diagram topologies, andone can proceed with standard Metropolis-Hastings sam-pling techniques.

Therefore, the study of the Fr¨ohlichpolaron provides an ideal opportunity to benchmark theidea of stochastic lists in the context of vertex corrections.

A. Model

The Fr¨ohlich polaron model describes the interactionbetween an itinerant electron and longitudinal opticalphonons in insulators. Historically, it was the ﬁrst prob-lem to which the DiagMC method was applied andfor which it was able to provide deﬁnite answers regard-ing the polaron spectrum and arbitrarily precise polaronenergies for any coupling strength. The Hamiltonian inthe thermodynamic limit is given by (¯ h = 1) H = H el + H ph + H el − ph (2) H el = (cid:90) d k (2 π ) k m a † k a k ,H ph = (cid:90) d q (2 π ) ω q b † q b q = ω ph (cid:90) d q (2 π ) b † q b q ,H el − ph = (cid:90) d k d q (2 π ) V ( q )( b † q − b q ) a † k − q a k ,V ( q ) = iω ph q (2 mω ph ) / (cid:18) παV (cid:19) / = i ˜ αq . The operators a k and b q are annihilation operators forelectrons of mass m with momentum k and phononswith momentum q , respectively. The phonon frequency ω q ≡ ω ph can be taken momentum-independent for op-tical, longitudinal phonons. The dimensionless couplingconstant is α . Typical values for α vary from 0 .

023 forInSb over 0 .

29 for CdTe to 1 .

84 for AgCl. Here we focus on the T = 0 case. In theimaginary-time representation, the bare propagatorreads G ( k , τ ) = − θ ( τ ) exp (cid:104) − ( k m − µ ) τ (cid:105) . The phononpropagator D ( q , τ ) = (˜ α/q ) exp( − ω ph τ ) remains un-renormalized and is dispersionless. We absorbed themodulus squared of the electron-phonon interaction po-tential into the phonon propagator for convenience (thetwo factors always enter the technique as a product).The method of stochastic lists can not deal with momen-tum or frequency conservation because all coordinates ofthe vertex function are generated by the list without re-strictions. We therefore need to work in the imaginary-time, real-space representation, where the propagatorsfor phonons and electrons read D ( r , τ ) = ˜ α πr exp( − ω ph τ )and G ( r , τ ) = − θ ( τ ) (cid:0) m πτ (cid:1) / exp( − mr τ + µτ ), respec-tively. -6-5-4-3-2-1 0 0 2 4 6 8 10 l n (- G ( p = , τ )) τ stoch. listNCA FIG. 1. (Color online.) Comparison of the Green func-tion dependence on imaginary time at zero momentum, − G ( p = 0 , τ ), obtained in the NCA approximation betweenthe stochastic list method (stoch. list) and an explicit evalu-ation. Data are shown for ˜ α = 5 , µ = − P = 10 . B. The Non-Crossing Approximation

As a ﬁrst application to set the ideas, we solve theFr¨ohlich polaron problem in the non-crossing approxima-tion (NCA), which corresponds to the ﬁrst-order skeletondiagram. The coupled set of NCA equations reads [weemploy both ( k , ω n ) and ( r , τ ) representations to simplifythese equations]Σ (1) ( r , τ ) = D ( r , τ ) G (1) ( r , τ ) ,G (1) ( k , ω n ) = 1 G − ( k , ω n ) − Σ (1) ( k , ω n ) . (3)Reference NCA data can be found in the lecture notesRef. 18.In order to solve these equations with stochastic lists,we cast the entire setup as a single self-consistent non-linear integral equation of the form (1): G (1) ( X ) = G ( X ) + (cid:90) I d X d X , (4) I = G ( X ) G (1) ( X − X ) D ( X − X ) G (1) ( X − X ) , where we introduced the 4-dimensional space-time posi-tion vectors X = ( r , τ ), X = ( r , τ ), and X = ( r , τ ).We then pretend that G (1) ( X ) cannot be evaluated andstored as a function, but its properties can be repre-sented by a collection—the list—of X -coordinates gen-erated stochastically by sampling the r.h.s. of (4) usingstandard DiagMC techniques. There is no gain in usingrotational symmetry for the coordinates in the list.The minimal set of updates consists of switching be-tween the two sectors corresponding to the ﬁrst and sec-ond terms in the r.h.s. of (4). The known integralover the ﬁrst term, N = (cid:82) | G ( X ) | d X = 1 / | µ | , isused for normalization: the Monte Carlo statistics forany property is properly normalized once multiplied bythe factor N /Z , where Z is the number of samples that belong to the ﬁrst term. For example, the nor-malization integral N G = (cid:82) | G (1) ( X ) | d X is obtained as N G = N ( Z/Z ), where Z is the total number of samples(no matter whether they belong to the ﬁrst or the secondterm).In order to go from the ﬁrst to the second term, wedraw a random variable τ according to an exponentialdistribution ∝ | µ | e −| µ | τ dτ and then generate three ran-dom numbers ( x , y , z ) according to a gaussian distri-bution with zero mean and variance τ /m ; the corre-sponding probability density is based on the G func-tion. Next, we choose coordinates X and X uni-formly from the existing list; they deﬁne X = X + X and X = X + X , by translation invariance. Theprobability 1 /P for selecting these two variables fromthe list is a faithful representation of the probability G (1) ( X − X ) G (1) ( X − X ) d X d X/ N G . In the reverseupdate taking the simulation from the second to the ﬁrstterm, the coordinates of the free propagator are againdetermined by the probability density based on G : anexponential random number for the time and three gaus-sian random numbers for the space.The key property behind the list technique is thatall the values of the unknown function G (1) cancel inthe acceptance ratio: The unknown full Green functionsappear in both the proposed conﬁguration weight andthe probability density used to generate new variables.For the same reason all exponential and gaussian fac-tors cancel G ( X ). This results in an acceptance ratio R = D ( X − X ) N G (and 1 /R for the reverse update).The update is then accepted with probability min(1 , R ),according to the Metropolis-Hastings algorithm. Onethus arrives at a protocol of dealing with a function with-out knowing/revealing its explicit form.In this implementation, we work with an existing listfrom which we draw random coordinates, while simulta-neously preparing a new list which we consecutively ﬁllafter each Monte Carlo update by recording the currentvalue of X (in both sectors). When the new list is full,it replaces the existing one and the new list is reset tozero. To initialize the procedure, we start with a shortlist based on random coordinates that we draw from the G distribution, which we let grow by a small factor of theorder of 1 . ÷ .

01 till the maximum length is reached.The results for a moderate coupling ˜ α = 5 are shownin Fig. 1. Within the level of resolution of the plot, thereis no diﬀerence between the exact NCA result and theone obtained by stochastic lists of the length P = 10 .We postpone the discussion of convergence properties tillSec. III. Here we simply note that for P = 10 any sys-tematic bias was subdominant to the statistical noise.Also, there is no need for fast Fourier transforms (cf.Ref. 18) when employing lists and thus no need to takecare of the asymptotic behavior of the Green function forlarge frequencies. FIG. 2. (Color online.)

Upper pane : Lowest-order contribu-tions to the three-point vertex in the (imaginary-time, real-space) representation.

Lower pane : Self-energy in terms of thethree-point vertex in the (imaginary-time, real-space) repre-sentation. -14-12-10-8-6-4-2 0 0 1 2 3 4 5 6 7 8 l n - G ( p = , τ ) τ P=1Mbare

FIG. 3. (Color online.) Comparison of the Green functiondependence on imaginary time at zero momentum, − G ( p =0 , τ ), obtained by the stochastic list method applied to thescheme illustrated in Fig. 2 with a list of length P = 10 (“P=1M”) versus the result obtained by a conventional Di-agMC simulation restricted to sample the same set of dia-grams from the bare series (see text). C. First-order vertex corrections

We now consider the Hedin scheme where the ver-tex function takes into account both the zeroth-orderterm and the ﬁrst-order corrections. The system to solveconsists of three equations, the two of which are showngraphically in Fig. 2, and the third one is the Dyson equa-tion for the Green function. Our implementation involvestwo stochastic lists: one for the Green function G ( X ),as before, and the other one for the three-point vertexΓ( X , X ), the latter containing 6 spatial and 2 tempo-ral coordinates. (As before, there is no gain in exploitingsymmetries to reduce the number of spatial coordinates.)We set up one stochastic Markov-chain process to sampleboth quantities. The algorithm itself is a straightforward ∆ E log(1/P), log(1/N w )HetheringtonProtocol 1Protocol 2Protocol 3 FIG. 4. (Color online.) Diﬀerence between the exact energyper site (taken from Ref. 19) and the one obtained with eithera stochastic list of length P (see text for an explanation of thediﬀerent protocols), or in a diﬀusion Monte Carlo simulationwith N w walkers (“Hetherington,” with k = 128; see text).The system is a 2D Heisenberg model with linear size L = 10with J = 1. extension of the one discussed above, and we will notelaborate here on minute technical details.Reference data were obtained from the algorithms dis-cussed in Ref. 18, where we used the bare-series codewith the updates “insert-remove” and “dress-undress”switched on and the “swap” update switched oﬀ. Start-ing from an arbitrary Green function diagram, the “in-sert” update attempts to insert a new D -propagatorwithout dressing any of the existing vertices; i.e., noneof the vertices covered by the new D -propagator remainsunlinked on the updated time interval. “Remove” is thecomplementary update. The “dress” update attempts toinsert a new D -propagator that covers precisely one ver-tex; the “undress” update is its complementary partner.This set of updates is not ergodic for the full problem(e.g., no diagrams in which a D -propagator covers twoor more non-linked vertices can be reached), but it ac-counts for all diagrams covered by Fig. 2.The results of benchmark comparison are shown inFig. 3. We see that perfect agreement is reached for theGreen function at zero momentum as a function of imag-inary time for a list of length P = 10 (the same lengthis used for both G and Γ). The method of stochastic listsseems thus promising to study vertex corrections in thecontext of (bosonic) dynamical mean-ﬁeld theory and itscluster extensions. However, as we will see in the nextsection, establishing the convergence of the answer canonly be done on a case by case basis, at best. -6 -6 -6 -6 - E FIG. 5. (Color online.) Minus the energy per site as a functionof 1 /P using protocol 3 (with κ = 1) with a Gutzwiller guidingwavefunction with b = 0 .

8, which is (close to) optimal. Theerror bars have been obtained from 10 independent runs. Thedata are compared with the value

E/J = − . . The data have been extrapolatedlinearly (shown as a full line in the plot) according to E ( x ) = E − bx with x = 1 /P , resulting in a = 0 . b = 11 . L = 10 with J = 1. III. GROUND STATE OF THEANTIFERROMAGNETIC HEISENBERG MODEL

In this section, we consider the spin-1 / H = J (cid:88) (cid:104) i,j (cid:105) S i · S j = J (cid:88) (cid:104) i,j (cid:105) ( S + i S − j + S − i S + j +2 S zi S zj ) , (5)with spin exchange amplitude J >

0. The sum isover nearest neighbor sites, and the lattice is of size L × L . By performing the unitary transformation S xi →− S xi , S yi → − S yi , S zi → S zi on one of the sublattices,the sign of the amplitude in front of the raising and low-ering term is reversed. The matrix elements of C − H , foran appropriately chosen constant C = JL /

2, are thenall positive in the usual S z basis. The eigenvalue problem( C − H ) ψ = ( C − E ) ψ, (6)can be considered a power method when the state ψ isiteratively represented by the stochastic list. It yieldsthe absolute value of the largest eigenvalue in magni-tude, whose eigenfunction can always be chosen posi-tive for a positive matrix. With the above transforma-tions, this corresponds to projecting onto the antiferro-magnetic ground state (and not the ferromagnetic anti-groundstate). This problem is challenging because of thegapless spectrum of elementary excitations in the ther- modynamic limit and the large dimension of the Hilbertspace, growing exponentially with system size.The coordinates in the list consist now of L bits rep-resenting the spins on the lattice. We add to the sampledconﬁguration space a dummy spin-independent term fornormalization purposes; the equation to solve thus con-tains a normalization constant C N as well as the matrix-vector multiplication term. The key updates are switch-ing between these two terms. To go from the former tothe latter, we pick randomly an entry from the stochasticlist of length P , which provides us a coordinate (Fockstate) j . The probability of this selection is 1 /P or | ψ j | / N ψ , where N ψ = (cid:80) j | ψ j | is the normalization sum(estimated using the same procedure as described abovefor N G ). Next, we have to determine all non-zero Hamil-ton matrix elements H ij when acting on the state cor-responding to coordinate j . We assume that the Hamil-tonian is too large to be stored explicitly, so that thisstep must be repeated in every Monte Carlo update. Forsparse matrices, there are very few possible ﬁnal coordi-nates i . We choose the ﬁnal coordinate i according to theheatbath algorithm, i.e. with probability p i = H ij / ¯ H j ,where ¯ H j = (cid:80) i H ij . The resulting Metropolis-Hastingsacceptance ratio is just R = N ψ ¯ H j /C N (and 1 /R forthe reverse update). The update is accepted with prob-ability min(1 , R). We also occasionally employed theMetropolis-like algorithm, in which one of the non-zeromatrix elements H ij is chosen with uniform probabilityinstead of the heatbath algorithm. It did not lead to sig-niﬁcant diﬀerences in the autocorrelation times. A goodchoice for the dummy term constant is C N = 2 JL (tocompensate the typical value of the ¯ H j sum). Measure-ments of the parameter N ψ and recordings of the newlist entries are performed only in the matrix-vector mul-tiplication sector.Apart from the previously discussed protocol (“proto-col 1”) where the current list is replaced by the new oneonce the latter is completed, we also applied “protocol2.” Here there is only one list updated at each MonteCarlo step by drawing a random integer in the interval m ∈ [0 , P [ and replacing the existing entry m with thenew coordinate. In “protocol 3” the list is continuouslygrowing as a function of the Monte Carlo steps accordingto P = (cid:112) τ MC /κ . The list grows then by one entry when-ever the integer part of (cid:112) τ MC /κ increases by one; oth-erwise an existing entry is overwritten when measuringthe list. As will be clear from the results, the diﬀerencesbetween these protocols are not of leading importance.The above algorithm works remarkably well for smallmatrices, even if they are poorly conditioned. For a sys-tem of size L = 4, the systematic error can easily bemade smaller than the statistical noise. When P exceeds1000, we ﬁnd that the systematic error decreases as 1 /P .However, at larger system size (speaking of L = 10 andlarger), this extrapolation law is no longer valid: Longerlists are needed, which need to be iterated much longerto converge. We also observe that the converged resultscannot be extrapolated as a power law in 1 /P (eventhough the results are remarkably accurate already forshort lists). This gets worse with increasing system size,and the marginal gain of using longer lists diminishesfurther. We also see in Fig. 4 that the scaling does notdepend on which protocol we use, suggesting that the rea-son for the ineﬃciency of the simulation must be found inthe build-up of autocorrelations scaling unfavorably withthe system size: due to the overwriting of the entries inthe stochastic list, a few Fock states tend to dominateand the superposition of those is not the exact groundstate. Protocol 3 appears to yield results that can beextrapolated by a single power law over several decadesin 1 /P and may hence look superior. Some care withthis observation is however needed because we could notreach lists that are as long as in protocols 1 and 2; i.e., it might be that the law P = (cid:112) τ MC /κ is still too fastwhen P (cid:29) .The method of stochastic lists applied to the powermethod is highly reminiscent of diﬀusion Monte Carlo,which has also been applied to the Heisenberg modelwith impressive results. We compare here with animplementation motivated by the original algorithm byHetherington and the reconﬁguration ideas of Ref. 25.In this scheme, N w walkers propagate through the Fockspace but, instead of satisfying a detailed balance con-dition, they acquire multiplicative weight factors (theseare the previously introduced ¯ H j sums), which ﬂuctuateat an extensive scale. After a number of generations k ,a population-control mechanism is applied to keep thenumber of walkers ﬁxed. Walkers with high weight aremore likely to reproduce and walkers with a low weightare more likely to be eliminated. The resulting bias iscompensated by global factors (cid:10) ¯ H (1) (cid:11) , . . . , (cid:10) ¯ H ( k ) (cid:11) (i.e.,the ¯ H values averaged over all walkers in every genera-tion), see Refs. 20 and 25 for details. In Fig. 4 we seethat the above algorithm with k = 128 yields results thatare more accurate than the list when there are few walk-ers, but that the scaling is the same as for the list. Wechecked that the same holds for k = 12. It has beenknown since the early days of diﬀusion Monte Carlo thatthe population size might easily lead to the dominantsource of error; more recently, Nemec claimed an expo-nential scaling, and population size bias was also foundby Boninsegni and Moroni. The explanation given byNemec apparently also applies to stochastic lists.It is well known that diﬀusion Monte Carlo can sig-niﬁcantly be enhanced by using a good guiding wave-function. For the HAF model, and certainly for smallsystem sizes, excellent variational Jastrow wavefunctionsare known. We employed here a simpler but faster toevaluate Gutzwiller ansatz, reminiscent of perturbationtheory, ψ G ∼ (cid:89) (cid:104) i,j (cid:105) exp( − bS zi S zj ) , (7)where b is a variational parameter (Note however thatwe have no proof that our ﬁnal answer for the ground state energy is variational). For b > H ij = ψ G ( i ) H ij /ψ G ( j ) where ψ G ( i ) denotes Eq. 7evaluated for spin conﬁguration i . We performed thesimulation for various values of b using protocol 3. Weshow in Fig. 5 the convergence for b = 0 .

8, which is veryclose to optimal. We ﬁnd that the energies diﬀer by anamount 8 × ∼ − for the longest lists we have studied, P ∼ × . The ﬁgure makes however clear that theenergy still drifts as a function of 1 /P . If we extrap-olate, the results agree within error bars (of the orderof 4 × − ) with Sandvik’s stochastic series expansionresults and with the diﬀusion Monte Carlo results ofRef. 25.These results are hence up to two orders of mag-nitude more precise than the results without using theguiding wavefunction. However, a poor guiding wave-function can lead to severe slowing down and, recallingour main goal of studying vertex corrections, one hasto recognize that good (and easy to evaluate) guidingschemes are not available in general. IV. CONCLUSION

We have introduced the method of stochastic lists,which allows one to accurately emulate properties of amulti-variable function F ( x ). The repeatedly refreshedlist consists of a large set of coordinates x distributedaccording to the F ( x ) values. The list of length P repre-sents only a tiny fraction of the full coordinate space of F , but after it is refreshed multiple times, a faithful rep-resentation of the entire F ( x ) function is obtained. Themethod was benchmarked by computing vertex correc-tions self-consistently for the Fr¨ohlich polaron model andby applying the power method for obtaining the ground-state energy and wave function of the antiferromagneticHeisenberg model. The method gives reasonably accu-rate results for most problems in practice, and can appar-ently be extrapolated as a power law over several decadesin the inverse list length. However, for very long lists wecould observe deviations rendering a controlled extrapo-lation for an arbitrary problem diﬃcult. This behaviorseems inherent to all population-based methods. Never-theless, stochastic lists are extremely promising for manyproblems where clever guiding functions for the stochas-tic sampling are known. Here the systematic error canbe reduced to such a degree that it becomes irrelevant inpractice. Acknowledgement.

We wish to thank M. Boninsegnifor valuable discussions. This work was supported byH2020/ERC Consolidator Grant No. 771891 (QSIM-CORR), the Munich Quantum Center and the DFGthrough Nano-Initiative Munich, the Simons Collabora-tion on the Many Electron Problem, and the NationalScience Foundation under the grant DMR-1720465. Theopen data for this project , including an open sourceimplementation for Fig. 5, can be found at https:// gitlab.lrz.de/QSIMCORR/StochasticList . L. Hedin, Phys. Rev. , A796 (1965). F. J. Dyson, Phys. Rev. , 1736 (1949). J. Schwinger, Proceedings of the Na-tional Academy of Sciences I. Y. Pomeranchuk, V. V. Sudakov, and K. A. Ter-Martirosyan, Phys. Rev. , 784 (1956). K. Ter-Martirosyan, Phys. Rev. , 948 (1958). J. Luttinger and J. C. Ward, Phys. Rev. (1960). G. Baym and L. P. Kadanoﬀ, Phys. Rev. , 287 (1961). G. Baym, Phys. Rev. , 1391 (1962). C. D. Dominicis and P. C. Martin, Journal of MathematicalPhysics , 14 (1964). C. D. Dominicis and P. C. Martin, Journal of MathematicalPhysics , 31 (1964). N. Nemec, Phys. Rev. B , 035119 (2010). N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H.Teller, and E. Teller, The Journal of Chemical Physics ,1087 (1953). W. K. Hastings, Biometrika , 97 (1970). N. V. Prokof’ev and B. V. Svistunov, Phys. Rev. Lett. ,2514 (1998). A. S. Mishchenko, N. V. Prokof’ev, A. Sakamoto, andB. V. Svistunov, Phys. Rev. B , 6317 (2000). A. S. Mishchenko, N. V. Prokof’ev, B. V. Svistunov, andA. Sakamoto, International Journal of Modern Physics B , 3940 (2001). J. T. Devreese, ArXiv e-prints (2010), arXiv:1012.4576[cond-mat.other]. J. Greitemann and L. Pollet, ArXiv e-prints (2017),arXiv:1711.03044 [cond-mat.stat-mech]. A. W. Sandvik, Phys. Rev. B , 11678 (1997). J. H. Hetherington, Phys. Rev. A , 2713 (1984). N. Trivedi and D. M. Ceperley, Phys. Rev. B , 4552(1990). K. J. Runge, Phys. Rev. B , 7229 (1992). K. J. Runge, Phys. Rev. B , 12292 (1992). C. J. Umrigar, M. P. Nightingale, and K. J. Runge,The Journal of Chemical Physics , 2865 (1993),https://doi.org/10.1063/1.465195. M. Calandra Buonaura and S. Sorella, Phys. Rev. B ,11446 (1998). M. Boninsegni and S. Moroni, Phys. Rev. E , 056712(2012). F. Franji´c and S. Sorella, Progress of Theoretical Physics97