Applicability of Quasi-Monte Carlo for lattice systems
Andreas Ammon, Tobias Hartung, Karl Jansen, Hernan Leovey, Andreas Griewank, Micheal Müller-Preussker
HHU-EP-13/69SFB/CPP-13-98DESY 13-225
Applicability of Quasi-Monte Carlo for latticesystems
Andreas Ammon ∗ , a , b Tobias Hartung, c Karl Jansen, b Hernan Leovey, d Andreas Griewank d and Michael Müller-Preussker a a Humboldt-University Berlin, Department of PhysicsUnter den Linden 6, D-10099 Berlin, Germany b NIC, DESY ZeuthenPlatanenallee 6, D-15738 Zeuthen, Germany c King’s College London, Department of MathematicsStrand, London WC2R 2LS, United Kingdom d Humboldt-University Berlin, Department of MathematicsUnter den Linden 6, D-10099 Berlin, GermanyE-mail:
[email protected] , [email protected] , [email protected] , [email protected] , [email protected] , [email protected] This project investigates the applicability of quasi-Monte Carlo methods to Euclidean lattice sys-tems in order to improve the asymptotic error scaling of observables for such theories. The error ofan observable calculated by averaging over random observations generated from ordinary MonteCarlo simulations scales like N − / , where N is the number of observations. By means of quasi-Monte Carlo methods it is possible to improve this scaling for certain problems to N − , or evenfurther if the problems are regular enough. We adapted and applied this approach to simple sys-tems like the quantum harmonic and anharmonic oscillator and verified an improved error scalingof all investigated observables in both cases. ∗ Speaker. c (cid:13) Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike Licence. http://pos.sissa.it/ a r X i v : . [ h e p - l a t ] N ov MC on the lattice
Andreas Ammon
1. Motivation
The quasi-Monte Carlo (QMC) method and their randomizations (RQMC) are highly interest-ing for the domain of lattice field theory. It offers the possibility to improve tremendously theasymptotic error scaling of observables obtained from Monte Carlo (MC) simulations of latticepath integrals. Substantial reductions in computing time could be achieved if the QMC approachcould eventually be applied to lattice-QCD (quantum chromodynamics in its lattice regularizedform). A mathematical review of the QMC approach can be found in [1]. The major part of thiscontribution is based on our paper [2] (cf. also [3]). The reader interested in more details is referredto this reference at any point of the following discussion.In order to better understand the point where the QMC approach sets in with its improvement,we want to outline the typical workflow during the treatment of a general lattice problem withconventional methods. Such a lattice system might be described by the partition function Z = (cid:82) D x e − S [ x ] given the action S . An observable O could be calculated by (cid:104) O (cid:105) = Z − (cid:82) D x e − S [ x ] O [ x ] . D x stands for the path integral measure of all dynamic fields relevant to the model under consid-eration. This could be for example the gauge field measure for lattice gauge theories or simply aparticle path measure for quantum mechanical problems. It is originated in the high dimensionalityof the lattice path integral that it can naturally only be treated by means of MC methods, providedthat analytic solutions are missing. Within the variety of MC approaches the Markov chain-MonteCarlo (Mc-MC) approach turns out to be the most efficient, as it allows for importance sampling.During a Mc-MC simulation a number N of field configurations ( x i ) i = ... N is generated succes-sively, each of them based on its predecessor and distributed (after the thermalization) according tothe Boltzmann weight. Then for each sample x i the observable O is measured, leading to N samples O i . Often, these samples are distributed normal, at least to a good approximation in most cases.Then, the asymptotic error of the mean ¯ O = N ∑ Ni = O i scales like N − . A fixed-factor increaseof the error often arises from correlations between successive observations O i , which originates inthe nature of Mc-MC methods. Both features, the crude asymptotic error scaling and the possiblystrong auto-correlation, lead to a necessity to generate a large amount of samples to reach a givenerror level. In many cases it is even impossible to reach the target accuracy due to the lack ofsufficient computing resources.The QMC approach provides the potential to circumvent the aforementioned problems, as itexhibits certain favorable properties. Most importantly, it is able to improve the error scaling to N − , given that certain conditions are met (see [2]). It is further encouraging to realize that theQMC technique has already been applied successfully in other fields like financial mathematics [4]for example. But before coming to specific demonstrations of this fascinating approach, we wantto take a closer look to a prominent feature of QMC samples.
2. Quasi-Monte Carlo point sets are more uniform
Most point sets constructed through QMC techniques fulfill a so-called low-discrepancy prop-erty (see [2], sections 3 and 4) also referred to uniformity or more uniform (than conventional MonteCarlo point sets). This property can be illustrated through a simple example in two dimensions, butwhose results can be generalized to arbitrary many dimensions.For this experiment the unit square [ , ] × [ , ] , subdivided into 8 × and uniformly inthis unit square. Then, for each of the 8 × =
64 little squares the number of contained pointsare counted. An example of the outcome of such an experiment is shown in the upper diagram offigure 1. We use the Mersenne Twister pseudo random number generator [5]. MC on the lattice
Andreas Ammon / / / / / ll l ll l ll ll ll ll lll ll ll lll ll l ll l llll l l l ll ll ll ll ll l lll l l lll ll ll lll lll ll lll l l llll lll l lll lll lll llll l ll ll lll lllll lllll ll llll lll ll l ll ll l lll l l ll l ll llll lll l ll ll l l lll l ll lll ll lll ll l ll l ll lll ll ll ll lll lll ll ll l l ll lll ll l l ll ll l ll ll ll ll lll l ll llll l l llll l ll l llll ll ll ll ll l ll ll ll l l l ll ll ll l lll l lll l llll ll ll ll lllll ll ll l lll l l ll lll l lll llll l lll llll llll l l l ll l lll ll ll ll ll ll ll llll l ll ll l ll l ll l ll l ll lll l l lll lll ll ll l l ll l ll ll ll ll lll ll ll ll llll l lllll l lll ll lll ll lll ll ll l lll lll l l lll lll l ll llll l llll ll l ll ll l ll ll ll lll l llll ll llll llll l ll lll lll l lll lll ll lll Pseudo random 2d point setHistogram of counts count f r equen cy Figure 1:
The pseudo-random sam-pling of 512 points in a unit square( upper plot ). For each little squarethe number of containing points iscounted and indicated through acolour code. The meaning of eachcolor code can be seen from the lower diagram , where a histogramof the counts is shown.
The color of each square corresponds to the number ofpoints it contains. In the lower part of this figure we haveplotted a histogram of the counts. We can clearly see that thedistribution of counts is rather broad. This means in practicethat many squares contain significantly more or less samplesthan one would expect on average, namely 8. If this set ofpoints ( x i , y i ) i = ... would be used for a Monte Carlo approx-imation of a two-dimensional integral (cid:82) (cid:82) f ( x , y ) dxdy ≈ ∑ i = f ( x i , y i ) function values in squares with very manyor very few points contribute too much or too less respec-tively to the overall average. This can be seen as a possi-bly avoidable source of extraneous fluctuations which havenothing to do with the nature of the problem, the integral of f . Hence, it is highly desirable if the filling of squares withsamples would happen more evenly. We will see in the fol-lowing repetition of the experiment with QMC samples, thatthis is exactly what can be provided by the QMC method. Wewant to mention, that the distribution on the bottom of figure 1can be described theoretically by the Poisson distribution, inthe limit of infinitely many little squares keeping the averagecount per little square fixed (at 8).We repeat now the experiment with exactly the same pa-rameters but instead of a pseudo-random number generatorwe employ the Sobol’ approach [6], a special QMC method,for the generation of points. The result is shown in figure 2.As can be seen in the upper plot, the filling of squares in facthappens completely even. Each little square contains exactly8 points, and this leads to a delta histogram (shown on thelower part of figure 2). If again the points are used in the ap-proximation of a two-dimensional integral the function valuesfrom each square contributing to the average are representedmuch better with respect to the area they cover, and hence,smaller stochastic fluctuations are expected, leading very likely to smaller errors of this approxi-mation.Through this illustration we might get an understanding on how the more evenly distributedsamples from QMC methods could help to decrease the natural statistical fluctuations of stochasticapproximations.
3. Lattice harmonic and anharmonic oscillator
We want to briefly introduce the quantum mechanical harmonic and anharmonic oscillator quan-tized through the lattice path integral, which we will investigate numerically later on. These sys-tems have been investigated in detail already in [7] using the Metropolis algorithm, which is con-sidered as a Markov chain-Monte Carlo method.The underlying action S = a d ∑ i = (cid:18) M ( x i + − x i ) a + µ x i + λ x i (cid:19) (3.1)3 MC on the lattice
Andreas Ammon with the periodic boundary condition x d + = x is obtained from the discretization of the classicalmechanical action of a particle with mass M passing along the path x ( t ) considered in Euclideantime on an equidistant finite time lattice with lattice spacing a and d lattice points (extent T = ad ). / / / / / ll l llll l ll l l lll lll l llll lll l l lll lll l llll l ll l l lll l ll l llll lll l l lll lll l llll l ll l l lll lll l llll lll l l lll lll l llll l ll l l lll l ll l llll lll l l lll lll l llll l ll l l lll lll l llll lll l l lll lll l llll l ll l l lll l ll l llll lll l l lll lll l llll l ll l l lll lll l llll lll l l lll lll l llll l ll l l lll l ll l llll lll l l lll lll l llll l ll l l lll lll l llll lll l l lll lll l llll l ll l l lll l ll l llll lll l l lll lll l llll l ll l l lll lll l llll lll l l lll lll l llll l ll l l lll l ll l llll lll l l lll lll l llll l ll l l lll lll l llll lll l l lll lll l llll l ll l l lll l ll l llll lll l l lll lll l llll l ll l l lll lll l llll lll l l lll lll l llll l ll l l lll l ll l llll lll l l lll l RQMC point setHistogram of counts count f r equen cy Figure 2:
Distribution of 512Sobol’ points generated uniformlyin the unit square ( upper plot ) andhistogram of counts ( lower plot ).See also description of figure 1.
The time derivative ˙ x ( t ) is replaced by the forward finite dif-ference a ( x i + − x i ) . λ controls the strength of the anhar-monic term x i . Hence, the harmonic oscillator is obtained for λ = µ >
0. This condition has tobe met for a convergent path integral. The anharmonic os-cillator can be simulated with λ > µ ∈ R , both beingfinite. The case µ <
0, to which we restrict ourselves inthe following, is particularly interesting, as the potential ex-hibits two minima in this case (cf. double-well potential). Thequantization is performed through the partition function Z = (cid:82) e − S ( x ) dx . . . dx d . An observable O of the so quantized sys-tem can be expressed as (cid:104) O (cid:105) = Z − (cid:82) O ( x ) e − S ( x ) dx . . . dx d .The primary physical observables are X = d d ∑ i = x i , X = d d ∑ i = x i , and Γ ( τ ) = d d ∑ i = x i x i + τ a . (3.2)The ground state energy E = µ X + λ X + µ and theenergy gap ∆ E = E − E between the ground and first ex-cited state can be derived from them. The latter is deter-mined from a non-linear regression of the model Γ ( τ ) ∼ C (cid:0) e − ∆ E τ + e − ∆ E ( T − τ ) (cid:1) to the data for the correlator Γ ( τ ) ,defined in (3.2), in a range of times τ where the influence ofhigher-than-the-first excited states is negligible.
4. Gaussian Sampling
As the action of the harmonic oscillator is at most quadratic in the variables x i , the correspondingpartition function can be expressed as a multivariate Gaussian integral Z = (cid:82) exp (cid:0) − x t C − x (cid:1) ,where the components of C − are given by ( C − ) i , j = M a (cid:16)(cid:16) + a µ M (cid:17) δ i , j − ( δ i , j + + δ i + , j ) (cid:17) (obtained from comparing: x t C − x = S ( x ) ). C is called the covariance matrix.For such a case, the sampling of lattice paths x is particularly straightforward, and can be basedon samples z , which are generated uniformly in the d -dimensional unit cube. This is particularlyuseful, as most RQMC methods provide samples in this form.Hence, our algorithm aiming at the generation of properly distributed samples x starts in thefirst step with1. the generation of a uniform sample z = ( z , . . . , z d ) t ∈ [ , ] d . This is either, as mentionedabove, a RQMC sample stemming from a scrambled (randomized) Sobol’ point set, using di-rection numbers from F. Kuo’s page http://web.maths.unsw.edu.au/~fkuo/sobol/index.html,or a sample obtained from a separate uniform sampling of each dimension with a pseudo-random number generator. The latter case will be referred to as (conventional) Monte Carlo(MC) sampling in the following.2. In the next step, univariate Gaussian samples w = ( w , . . . , w d ) t are generated by applyingthe inverse standard normal distribution function Φ − to the z i and multiplying the result with4 MC on the lattice
Andreas Ammon the square root of the eigenvalues λ i of C : w i = (cid:112) λ i Φ − (cid:16) z π − ( i ) (cid:17) . (4.1)The eigenvalues are given in a closed form as λ i = (cid:16) M a ( u − cos ( π i / d ) (cid:17) − . As indicatedthrough the (inverse of the) permutation π , the order of dimensions in z has to be modifiedsuch that the component z comes upon the largest eigenvalue, z comes upon the secondlargest eigenvalue and so on, until the last component z d meets the smallest eigenvalue. Thiscan be achieved by determining a permutation π which brings the eigenvalues in decreasingorder ( λ π ( ) ≤ λ π ( ) ≤ . . . ≤ λ π ( d ) ) and calculating π − as the inverse of this permutation(fulfilling π − ( π ( i )) = i ).3. Finally, the multivariate Gaussian variables x i are generated from the orthonormal trans-formation x = G w , where G = ℜ ( F ) + ℑ ( F ) is the discrete Hartley transform and F k , l = √ d e π ikl / d the discrete Fourier transform.
5. Results
The algorithm we just discussed was used to generate lattice paths and corresponding samplesof the observable X for a harmonic oscillator with the parameters M = . a = . µ = . d =
100 for a fixed number of samples N = , , , , . For each N the experiment is repeatedwith 300 scramblings (see section 5 in [2]) of the Sobol’ sequence in the RQMC simulation toallow the approximation of the error and the variance of the error. For the MC simulation 300different seeds have been used to initialize the individual runs. A fit of the model ∆ (cid:104) X (cid:105) ∼ C · N α to the determined errors of X yields an exponent α = − . ( ) for the RQMC simulation caseand α = − . ( ) for the MC simulation. A plot illustrating the results is shown in figure 3. Theoutcome of this investigation basically proofs the full functioning of the (R)QMC method for areal, even though trivial, physical model.To study the scaling for a less trivial model, we passed on to the anharmonic oscillator withthe parameters λ = . M = . µ = − a = . / d . The experiment was performed for d =
100 and 1000 dimensions. The sampling of lattice paths is less straightforward in the presentcase, but can be realized on the basis of the sampling method we used before for the harmonicoscillator. This happens with the aid of the reweighting approach (cf. section 7 in [2]), whichwas first used in [8]. Additionally to the physical action S , describing the anharmonic oscil-lator, an artificial action S (cid:48) is introduced, which is constructed exactly as a harmonic oscilla-tor action with a different set of parameters M (cid:48) , a (cid:48) and µ (cid:48) . Then, harmonic oscillator paths ( x i ) i = ... N are generated corresponding to this unphysical action S (cid:48) . Approximations of observ-ables O of the anharmonic oscillator (described by the physical action S ) are obtained from theweighted mean (cid:104) O (cid:105) ≈ (cid:0) ∑ Ni = O ( x i ) W ( x i ) (cid:1) / (cid:0) ∑ Ni = W ( x i ) (cid:1) , where the weight function W is givenby W ( x ) = exp ( − S ( x ) + S (cid:48) ( x )) . Now, it remains to find reasonable parameters with the objectiveof reducing the fluctuations of the weights W ( x i ) as much as possible, leading most likely to thesmallest possible error of the observables. We found that only the modification of the parameter µ (cid:48) leads already to satisfying results, such that M (cid:48) = M and a (cid:48) = a could be left unchanged. Aheuristic optimization approach led to a value of µ (cid:48) = . X , X and E is significantly improved, although less than in the harmonic oscillator5 MC on the lattice
Andreas Ammon l l l l l − − − Number of samples − N E rr o r o f < x > Error of
Error of X for the RQMC and MC simulation of the harmonic oscillator ( left ). The right plotshows the fit of the asymptotic error scaling for the RQMC simulation. O α log C χ / dof X -0.763(8) 2.0(1) 7.9 / 6 d = X -0.758(8) 4.0(1) 13.2 / 6 E -0.737(9) 4.0(1) 8.3 / 6 X -0.758(14) 2.0(2) 5.0 / 4 d = X -0.755(14) 4.0(2) 5.7 / 4 E -0.737(13) 4.0(2) 4.0 / 4 Table 1:
Results for the error scaling of the observalbes X , X and E for the model of the anharmonicoscillator, simulated through reweighting. Observable errors were fitted to the model ∆ O ∼ CN α . case. More specifically, we can conclude that, within statistical uncertainties, the error scaling is of O ( N − ) in all considered cases.In a further effort we investigated also the energy gap. In order to be able to measure this quan-tity, we increased µ from −
16 to − T → ∞ and a →
0) to a change of ∆ E from 0 . .
576 [9]. For d =
100 dimensionsand sample sizes of N = , , , we obtain an exponent of α = − . ( ) . These results arevery interesting in that the imporved error scaling seems to be rather independent of the observableunder consideration.
6. Outlook & conclusions
In this contribution we showed a first successful application of one specific RQMC method toEuclidean lattice models. We verified a perfect error scaling of O ( N − ) for the harmonic oscillatorand a strongly improved error scaling of O ( N − ) for the anharmonic oscillator with a double-wellpotential. The latter investigation also included the energy gap, which can be considered as a rathernon-trivial observable, as it is obtained from the correlator using a non-linear procedure. A better6 MC on the lattice
Andreas Ammon l l l l l l − − Error of
Double-log plot of the error scaling of the observables X (left), X (middle) and E (right) withthe number of samples N in the RQMC approach. The dashed line shows the fit of the model ∆ O ∼ CN α tothe data. Numerical results can be seen in table 1. understanding of this in-between behavior of N − is planned for the future, and should lead toa better theoretical understanding on how the QMC method behaves when applied to non-trivialproblems. Further improvements in the sampling of the anharmonic oscillator are assumed whenan optimally tuned, more generalized covariance matrix is used in the Gaussian sampling step.Furthermore, other promising non-Gaussian sampling approaches, like inverse sampling [10], areinvestigated at the moment, aiming at a better description of the anharmonic action and probablyinvolving the potential to be applicable to a much broader class of lattice problems; though, it willbe interesting to see in the future how efficient these techniques are in practice.As a next step towards lattice gauge theories we are currently considering a one-dimensionalspin like model, described by the action S = aI ∑ di = − a cos ( φ i + − φ i ) , where I is the moment ofinertia, a the lattice spacing, and φ i are angular variables with periodic boundary conditions ( φ d = φ ). This model exhibits topological features, visible through the non-vanishing of the windingnumber – a feature not present in the previously considered oscillator models but in other latticegauge theories like QCD. Having managed this model with generalizable methods it could beenvisaged that also generic gauge theories could be addressed in the future. References [1] F. Kuo, C. Schwab and I. Sloan,
ANZIAM Journal , (01) (2012).[2] K. Jansen, H. Leovey, A. Ammon, A. Griewank, and M. Müller-Preussker, Comput.Phys.Commun. (2013), http://dx.doi.org/10.1016/j.cpc.2013.10.011, arXiv:1302.6419 [hep-lat] .[3] K. Jansen, H. Leovey, A. Nube, A. Griewank and M. Müller-Preussker
J. Phys.: Conf. Ser. (2013)012043, arXiv:1211.4388 [hep-lat] .[4] P. Glasserman, Springer-Verlag, New-York 2004.[5] M. Matsumoto and T. Nishimura,
ACM Trans. Model. Comput. Simul., no. 1 (1998) 3–30.[6] I. M. Sobol’, U.S.S.R. Comput. Math. and Math. Phys., no. 4 (1967) 86–112.[7] M. Creutz and B. Freedman, Annals Phys., (1981) 427.[8] A.M. Ferrenberg and R.H. Swendsen,
Phys. Rev. Lett., no. 23 (1988) 2635–2638.[9] R. Blankenbecler and T. A. DeGrand, Thomas A. and R. L. Sugar, Phys.Rev.,
D21 (1980) 1055.[10] L. Devroye, Springer-Verlag, New-York 1986.(1980) 1055.[10] L. Devroye, Springer-Verlag, New-York 1986.