aa r X i v : . [ h e p - l a t ] N ov Computational Methods in Quantum Field Theory
Kurt LangfeldSchool of Mathematics & Statistics, University of PlymouthPlymouth, PL4 8AA, UKemail: [email protected] 19, 2007
Abstract
After a brief introduction to the statistical description of data, these lecture notesfocus on quantum field theories as they emerge from lattice models in the criticallimit. For the simulation of these lattice models, Markov chain Monte-Carlo methodsare widely used. We discuss the heat bath and, more modern, cluster algorithms.The Ising model is used as a concrete illustration of important concepts such ascorrespondence between a theory of branes and quantum field theory or the dualitymap between strong and weak couplings. The notes then discuss the inclusion ofgauge symmetries in lattice models and, in particular, the continuum limit in whichquantum Yang-Mills theories arise.
Notes based on a lecture presented at the XIX Physics Graduate Days at theUniversity of Heidelberg, 8th - 12th October 2007.1 ontents Z gauge theory: 3 and 4 dimensions . . . . . . . . . . . . . . . . . . 335.3 Setting up lattice Yang-Mills theory . . . . . . . . . . . . . . . . . . . . . 375.4 The fermion doubling problem . . . . . . . . . . . . . . . . . . . . . . . . 415.5 Overlap fermions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445.6 Measuring observables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455.7 The continuum limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 Assume that we would like to determine a physical observable ¯ x such as a hadron massor a decay constant by a numerical calculation involving statistical methods or by a directexperimental measurement. A perfect device would just produce ¯ x with a single measure-ment. In practice, such a device does not exist. A realistic device produce a value x inthe interval [ x, x + dx ] with probability P ( x ) dx , where the probability distribution P ( x )characterises the apparatus. We will not assume that our experimental device is hamperedby systematic errors, but we will assume that the device produces the exact value ¯ x by an2verage over many measurements, i.e., Z dx x P ( x ) = ¯ x , (1)but, depending on P ( x ), a single measurement can be far from the true value.As an example, we consider an observable ¯ x = 3 and a crude experiment which can produceany value for between 0 and 6 with equal probability: P ( x ) = (cid:26) / x ∈ [0 , x is not sufficient to reveal the true observable. Theonly thing we can do is to repeat the measurement n times and to consider the average: y = 1 n h x + . . . + x n i , where x to x n are the values obtained from each of the measurements. For the moment,we will assume that the measurements are independent , i.e., that the probability for findinga set { x . . . x n } of data is given by: P ( x , . . . , x n ) = P ( x ) . . . P ( x n ) . The crucial question is to which accuracy have we estimated the true observable ¯ x ?The answer can be inferred from the probability distribution Q ( y ) for the value y : Q n ( y ) = Z n Y i =1 dx i δ (cid:18) y − n [ x + . . . + x n ] (cid:19) P ( x ) . . . P ( x n ) . (3)Given the proper normalisation of the single event distributions, i.e., Z dx i P ( x i ) = 1 , using (1), we can easily show that the average of y coincides with the observable:¯ y = Z dy y Q n ( y )= Z dy Z n Y i =1 dx i n [ x + . . . + x n ] δ (cid:18) y − n [ x + . . . + x n ] (cid:19) P ( x ) . . . P ( x n ) . = Z n Y i =1 dx i n [ x + . . . + x n ] P ( x ) . . . P ( x n ) = 1 n n Z dx x P ( x ) = ¯ x . σ of our estimate is provided by the second moment: σ ( n ) = Z dy (cid:16) y − ¯ y (cid:17) Q n ( y ) , (the variance) . (4)If the distribution Q n ( y ) peaks around the true value for our observable ¯ x and σ ( n ) is tiny,it would mean that a single estimator y has high probability to fall close to ¯ x with highprobability implying that it yields a good approximation to ¯ x .Let us study the moments of the distribution Q ( y ): q m = Z dy Q n ( y ) y m . (5)In order to draw further conclusions, we need to restrict the classes of single event proba-bility distributions P ( x ): we will assume that its Fourier transform¯ P ( β ) = Z dx P ( x ) exp {− i β x } (6)is an analytic function of β at β = 0. As a consequence, the moments of P ( x ) exist andare given by: p m = Z dx P ( x ) x m = i m d m dβ m ¯ P ( β ) | β =0 . (7)We will further assume that ¯ P ( β ) vanishes for | β | → ±∞ . This seems to be quite a weakconstraint. I point out, however, that systems with rare but large fluctuations genericallyfail to possess higher moments. One example is stock market indices [1].Our aim is to express the moments of Q n ( y ) in terms of the moments of P ( x ). For thispurpose, we rewrite the δ -function in (3) as δ (cid:18) y − n [ x + . . . + x n ] (cid:19) = Z dα π exp[ i α y ] n Y i =1 exp n − i αn x i o , (8)and find Q n ( y ) = Z dα π exp( i α y ) (cid:20)Z dx P ( x ) exp n − i αn x o(cid:21) n , (9)= Z dα π exp( i α y ) h ¯ P (cid:16) αn (cid:17) i n . (10)The moments of Q n ( y ) are then obtained from q m = Z dy Z dα π ( − i ) m d m dα m h exp( i α y ) i ¯ P n (cid:16) αn (cid:17) . (11)4fter a series of partial integrations with respect to α (note that boundary terms vanishby virtue of our above assumptions), the latter equation is given by q m = Z dy Z dα π exp( i α y ) ( i ) m d m dα m h ¯ P n (cid:16) αn (cid:17)i = Z dα π n m Z dy exp( i α y ) ( i ) m d m dβ m h ¯ P n ( β ) i = i m n m d m dβ m h ¯ P n ( β ) i | β =0 . (12)Of particular interest are the so-called cumulants c k [ Q n ] of the distribution Q n ( y ). Theseare defined via the generating function T Q ( x ) = X m =0 m ! q m x m , c k [ Q n ] := d k dx k ln T Q ( x ) | x =0 . (13)Note that in particular we find for the ’error’ σ in (4) σ = q − q = c [ Q n ] . (14)Using Taylor’s theorem and the explicit expression (12), we find T Q ( x ) = ¯ P n (cid:18) i xn (cid:19) , c k [ Q n ] = i k n k − h ln ¯ P (0) i ( k ) , (15)where ( k ) denotes the k th derivative. Introducing the cumulants c k [ P ] of the single eventdistribution as well, i.e., c k [ P ] = i k h ln ¯ P (0) i ( k ) , (16)we arrive at a very important result: c k [ Q n ] = 1 n k − c k [ P ] . (17)Note that the cumulants c k [ P ] are finite numbers which characterise the single event proba-bility distribution. Equation (17) then implies that for increasing number of measurements n , the higher ( k >
1) cumulants of Q n ( y ) vanish. In particular, we find that σ ( n ) = p c [ Q n ] = p c [ P ] √ n ∝ / √ n . (18)For the above example, we find p = 16 Z dx x = 3 , p = 16 Z dx x = 12 , c [ P ] = 12 − = 3 , (19)and therefore σ ( n ) = p /n .
5t is well known that if c k [ G ] = 0 for k >
2, the probability distribution G is a Gaussian.We therefore expect that if n is chosen sufficiently large so that we can neglect c k [ Q n ] with k >
2, we should be able to approximate Q n ( y ) by a Gaussian. To support this claim(without mathematical rigour), we start from (10): Q n ( y ) = Z dα π exp( i α y ) exp n n ln h ¯ P (cid:16) αn (cid:17) io , and expand the logarithm with respect to α : Q n ( y ) = Z dα π exp( i α y ) exp ( n ∞ X k =0 k ! (cid:2) ln ¯ P (0) (cid:3) ( k ) (cid:16) αn (cid:17) k ) = Z dα π exp( i α y ) exp ( n ∞ X k =1 k ! c k [ P ] (cid:16) − i αn (cid:17) k ) , where we have used ¯ P (0) = R dx P ( x ) = 1 and the definition of the cumulants of P in(16). Using c [ P ] = p = ¯ x = ¯ y , we find: Q n ( y ) = Z dα π exp[ i α ( y − ¯ y )] exp (cid:26) − c [ P ] (cid:18) α n (cid:19) + O ( α /n ) (cid:27) (20)Note that the dominant contributions from the α integration arises from the regime where α < √ n . In this regime, the correction term is of order O ( α /n ) ≈ O (1 / √ n )and will be neglected for sufficiently large n . The remaining integral can be easily per-formed: Q n ( y ) ≈ √ π σ exp (cid:26) − ( y − ¯ y ) σ (cid:27) , σ = c [ P ] n . (21)which is the celebrated Gaussian distribution. This finding is called the central limittheorem: if the moments of probability distribution P ( x ) exist, the probability distributionfor the average y can be approximated by a Gaussian for sufficiently large n given that thestandard deviation σ is properly scaled with n .Let us discuss this result. Figure 1 shows the distributions Q n ( y ) for n = 1 , , n = 10.Already for n = 10 the distribution is well approximated by the Gaussian.The existence of at least the moment c [ P ] of the single event distribution is crucial forthe error reduction by repeated measurements. Let us consider a Lorentz distribution forthe moment: P L ( x ) = 1 πb
11 + ( x/b ) . (22)6 y Q n ( y ) n=1n=2n=3n=10Gauss, n=10 Figure 1: Illustration of the central limit theorem: probability distributions of the average y after n measurements.With the naked eye this distribution resembles the Gaussian. The crucial difference is,however, that the second moment does not exist: Z dx x P L ( x ) −→ ∞ . The Fourier transform of P L ( x ) does, however, exist¯ P L ( β ) = exp n − b | β | o . (23)If we now repeat the measurements n times, the distribution of the average y is, accordingto (10), by Q n ( y ) = Z dα π exp( i α y ) h exp (cid:16) − b αn (cid:17) i n = P L ( y ) . Obviously, the probability distribution does not change at all even if we repeat the mea-surements many times. This actually implies that it impossible to experimentally gain areliable value for the observable ¯ x . Let us return to the example in (2), and let us assume a group of experimentalists hasperformed n = 12 measurements with the result:7 .813 2.021 0.331 0.8655.394 5.937 5.027 1.5560.325 2.750 1.283 3.890 The average over these values and an estimate h x i of c [ P ] are given by y = 1 n n X k =1 x k ≈ . , h x i = 1 n n X k =1 x k ≈ . . (24)We point out that 3 .
581 is a poor estimate of the true value (19) of c [ P ] = 3, but it givesthe order of magnitude. With this estimate for c [ P ], we find for the error (18) σ ( n = 12) ≈ r . ≈ . . Hence, the final ’experimental’ result for the observable would be¯ x ≈ . ± .
546 = 2 . . (25)Note that the true result ¯ x = 3 lies well with the reach of the error bars.The above experiment was repeated by several research labs. Depending on the budgetand the focus of research, different labs produce different numbers n of measurements:CERN GSI DESY BNL n
120 50 78 150 y . ± .
163 2 . ± .
255 3 . ± .
207 3 . ± . x and how can we quantify its (statistical) error?To answer these questions, we assume that the number n of each measurement was largeenough to approximate the distribution of an individual result y k , k = 1 . . . N (where N = 4 for the above example) by a Gaussian (21): Q ( y l ) ≈ √ π σ l exp (cid:26) − ( y l − ¯ x ) σ l (cid:27) . (26)For the world average y we make the ansatz y = N X l =1 a l y l , N X l =1 a l = 1 . (27)8he weights a l must be chosen in an optimal way. This choice will depend on the errors σ l of the individual experiments. In particular, the experiment with the smallest error shouldcontribute to the world average with the largest weight. Assuming that the experiments atthe different labs were carried out independently, the probability distribution of the worldaverage is now given by W ( y ) = Z N Y k =1 dy k δ (cid:16) y − N X l =1 a l y l (cid:17) Q ( y ) . . . Q ( y N ) . (28)Representing the δ -function in terms of a Fourier integral over α (see (8)), the integrationsover y . . . y N can be easily performed: W ( y ) = Z dα π exp { i ( y − ¯ x ) } exp ( − α X l a l σ l ) . Performing the α integration finally fields: Q ( y ) ≈ √ π σ exp (cid:26) − ( y − ¯ x ) σ (cid:27) , σ = X l a l σ l . (29)The optimal result is achieved if the error squared, i.e., σ , is as small as possible. Here,we must take into account the normalisation condition in (27). We therefore minimise X l h a l σ l − λ a l i −→ min , (30)where λ is a Lagrange multiplier. The global minimum is easily obtained: a l = λ σ l , λ = X l σ l . (31)The minimal value for σ then satisfies σ = λ ⇒ σ = X l σ l . (32)The optimal choice for the weights can therefore also be written as a l = σ σ l . (33)Let us return to the above example. We find σ ≈ . , (34) a ≈ . , a ≈ . , a ≈ . , a ≈ . . With the weights at our disposal, we easily find the optimal value for the world average y ≈ . x = 3 . ± .
089 = 3 . . (35)Note that the true result ¯ x = 3 is again covered within error bars and that the error becamesignificantly smaller than that of the best result provided by the BNL group.9 .3 Autocorrelations In the previous subsections, we have repeatedly assumed that the measurements x i areindependent. We will see below that a vital tool of computational quantum field theory isto use information from the measurement x i to obtain the value for x i +1 . In this case, thedata set x i , i = 1 . . . n is generated by the chain x → x → . . . → x n − → x n , and the probability of finding a particular set does not factorise anymore: P ( x , . . . , x n ) = P ( x ) . . . P ( x n ) . In the context of QFT simulations we will, however, make an effort to render the values x i as independent as possible. This generically implies that events which are separated bysome ‘time’ τ , i.e., the events x i and x i + τ , can be considered as statistically independent.The trick for obtaining an idea of the error of the estimator is to group b measurementstogether: y ν = 1 b b X i =1 x ν + i , ν = 1 . . . M , M = nb , (36)where we choose 1 ≤ τ ≪ b . (37)This latter constraint implies that the values y ν are statistically independent and thatthey have a Gaussian distribution because of the central limit theorem. The quantities ofinterest are the average ¯ y = 1 M M X ν =1 y ν , (38)which converges to the observable ¯ x in the limit M → ∞ , and the corresponding error σ = 1 M c [ P y ] = bn c [ P y ] . (39)where the cumulant c [ P y ] is given by c [ P y ] = * M M X ν =1 y ν + − "* M M X ν =1 y ν + , (40)with h f i = Z n Y l =1 dx l f ( x . . . x n ) P ( x , . . . , x n ) . Assuming translational invariance, i.e., h x ν + i x ν + k i = h x i x k i ,
10e find c [ P y ] = 1 b b X i =1 b X k =1 c ( k − i ) , c ( k − i ) = h x i x k i − h x i ih x k i , (41)where c ( k − i ) is called the autocorrelation function . Introducing the relative distance t = k − i , and trading the summation over k in (41) for a summation over t , we find c [ P y ] = 1 b b X i =1 b − i X t =1 − i c ( t ) . (42)Interchanging the summation indices t and i and after summing over i , this last equationbecomes: c [ P y ] = 1 b c (0) + 2 b b − X t =1 ( b − t ) c ( t ) . We have already mentioned that the measurements x i and x i + τ are (almost) uncorrelated.The equivalent statement is that the correlation function vanishes for sufficiently largearguments: c ( t ) ≈ , for t > τ . For b ≫ τ , we approximately find: c [ P y ] ≈ b c (0) + 2 b b − X t =1 b c ( t ) = 1 b b − X t =1 − b c ( t ) . (43)It is convenient to introduce the normalised autocorrelation function by ρ ( t ) = c ( t ) c (0) , c (0) = h x i − h x i = c [ P x ] . (44)The integrated autocorrelation time is then defined by τ = 12 b − X t =1 − b ρ ( t ) = 12 + b − X t =1 ρ ( t ) . (45)Inserting (45,44,43) into (39), we finally obtain for the error which should be attributed toour estimate ¯ y in (38): σ = 2 τn c [ P x ] . (46)Let us perform a consistency check by considering the special case that the measurementsare uncorrelated. In this case, the autocorrelation function ρ ( t ) vanishes for t ≥
1, and theautocorrelation time is given by τ = 1 /
2. We indeed recover the familiar result σ = 1 n c [ P x ] , for independent measurements. (47)11ote that autocorrelations increase the error bars. Not knowing the autocorrelations in anumerical simulations leads us to the erroneous assumption that the error of the estimator isgiven by (47), while the true result is by a factor τ larger. Not knowing the autocorrelationsalways leads to an underestimation of the statistical error. A phase transition occurs if the properties of matter change qualitatively when an externalparameter, such as temperature, is altered. The phase transition of water from a liquidto a gas phase when the temperature exceeds roughly ≈ Celsius (under normalconditions), is well known from everyday life. A second example is the ferromagnet: theinteraction between microscopic spins favour a unique orientation of the spins. This yieldsan ordered phase at low temperatures. Above the critical temperature, called the
Curietemperature in the present context, the spins are organised in a disordered phase .Let us assume that a ferromagnet is in the disordered phase at a temperature slightlybigger than the critical temperature T c . If we decrease the temperature, the informationof the unique orientation spreads over the spin lattice. This ‘Gedankenexperiment ’showsthat the spatial correlation of the spins becomes large at the critical temperature. Thisphenomenon, is quantified with the help of the spin-spin correlation function : (cid:28) σ ( x ) σ ( y ) (cid:29) ∝ exp (cid:26) − | x − y | ξ (cid:27) . (48)The correlation length ξ obviously measures the spatial distance over which the spins showroughly the same orientation. Close to the phase transition, i.e. for T > ∼ T c , ξ becomeslarge anticipating the ordered phase: ξ ≈ ξ + (cid:12)(cid:12)(cid:12)(cid:12) − TT c (cid:12)(cid:12)(cid:12)(cid:12) − ν , ( T > ∼ T c ) , (49)where ν is a positive number. The divergence of the correlation length at the phasetransition is characteristic for a transition of (or higher) order. In the case of a transition, the increase of ξ is hindered by the nucleation of bubbles which containchunks of the new state of matter. These bubbles provide additional disorder and thecorrelation length stays finite.For phase transitions above first order, the singularity of the correlation length has itsfingerprint in many other thermodynamical quantities such as the specific heat C or the magnetic susceptibility χ : C ≈ C (cid:12)(cid:12)(cid:12)(cid:12) − TT c (cid:12)(cid:12)(cid:12)(cid:12) − α , χ ≈ χ (cid:12)(cid:12)(cid:12)(cid:12) − TT c (cid:12)(cid:12)(cid:12)(cid:12) − γ , ( T > ∼ T c ) . critical exponents ν , α , γ are independent of the microscopic properties of the spinmodel (such as the lattice geometry), and only depend on the symmetries (present at thetransition) and the number of dimensions. They are often used to sort solid state physicsmodels into the so-called universality classes . Let the lattice spacing a denote the distance between two neighbouring lattice sites . In theprevious subsection, we found that the correlation length diverges if the coupling constant β (or inverse temperature in the present context, β = 1 /T ) approaches its critical value(see (49)). This statement can be phrased in units of the lattice spacing as ξa = κ (cid:18) β c − β (cid:19) − ν , β < ∼ β c , (50)where κ is a dimensionless constant which can be obtained by numerical means.Let us now reinterpret these findings. Rather than saying that ξ diverges and a is fixed, wesay that the correlation length ξ is fixed and is given by an observable in physical units. Wewill see that this interpretation of the same data defines a quantum field theory . Forfixed correlation length ξ , (50) defines the lattice spacing as a function of β , i.e., a → a ( β ), a ( β ) = 1 κ ( β − β c ) ν ξ , β → β c . (51)The key point is if we make the number of spins bigger and bigger and, at the same time,the distance a between the spins smaller and smaller, we will obtain a field theory in thelimit a →
0. For the 2d Ising model on a cubic lattice, we have β c ≈ .
44 and ν = 1. Thefield theory limit is then approached when a vanishes linearly with β − β c : a ( β ) = ξκ ( β − β c ) , (2d Ising model).Note that the dimensionless parameter β is not at our disposal anymore, since it specifiesthe magnitude of the lattice spacing. Instead, the value of ξ parameterises the emergingquantum field theory. The exchange of a dimensionless parameter for a scale dependent onein a quantum field theory is known as dimensional transmutation. It is a generic featureof quantum field theories. For instance in the case of perturbative QCD, the dimensionlessgauge coupling g is eliminated in favour of the scale dependent parameter Λ QCD .Let us assume that a certain correlation function was obtained by a numerical simulationof a classical lattice model for large values | x − y | , D (cid:18) | x − y | (cid:19) = (cid:28) F ( φ ( x )) F ( φ ( y )) (cid:29) ∝ exp (cid:26) − m | x − y | (cid:27) , (52)13 yz |z−x| = 2 2 aa|y−x| = 3a Figure 2: Spin correlation along the diagonal and the symmetry axis, respectively.where m is called the screening mass . Since the distance | x − y | is only known in unitsof the lattice spacing by construction, the simulation will provide the mass in units ofthe lattice spacing as a function of β , i.e. ma ( β ). If universality holds, one finds thecharacteristic scaling of the lattice model, i.e., m a ( β ) = κ m (cid:18) β c − β (cid:19) ν , β < ∼ β c . (53)Using (50), we see that the product m ξ approaches a constant in the vicinity of the criticallimit: m ξ = m a ξa = κ m κ . (54)Note that κ and κ m are two c-numbers which we obtain from the numerical simulations.With the help of these two numbers we can “measure” the desired mass m in units of 1 /ξ ,where ξ is the only free parameter of our theory.In the case of a quantum field theory, we expect that due to the isotropy of the vacuumthe correlation function (52) only depends on the distance between x and y . In the clas-sical lattice model, continuous rotational symmetry is violated due to the presence of thecubic lattice, and it might happen that the quantum field theory which emerges from thelattice model inherits the anisotropy. This anisotropy can be measured by comparing thecorrelation length in lattice units along a lattice symmetry axis, ξ , and a long the diagonaldirection, ξ d (see figure 2). As far as global symmetries are concerned, the symmetry underconsideration is restored in the critical limit (51): ξ d = ξ , for a → . Further details on the restoration of rotational symmetry in the context of the 2-dimensionalIsing model can be found in [2]. 14 .3 Mean-field approximation
The starting point for a thermodynamical description of the Ising model is the partitionfunction: Z = X { σ x } exp ( − β H ( σ )) (55)where β = 1 /T and where a spin σ x = ± x of the squarelattice. The sum in (55) extends over all possible spin configurations. The ferromagnetic interaction favours a unique orientation of the spins, and is described by H ( σ ) = − X
Links are short line segments which join two neighbouring sites onthe lattice. In order to unambiguously address a link on the lattice, we use coordinateswhich are integers with the exception of one coordinate which is half integer, such as 2 . plaquette ,17hich is an elementary square of the cubic lattice. Two coordinates are half integer whena plaquette is addressed. In higher dimensions, there are also cubes , and their coordinatesare half integer, while the other coordinates are integer.The dual lattice is an important object which helps to gain non-perturbative informationfor certain lattice models. The coordinates of the dual lattice are obtained by adding 0 . d dimensional lattice model, the dualitytransformation maps an x -dimensional geometrical object into a d − x dimensional objecton the dual lattice. Let us consider 2 dimensions. A site, such as (2 ,
4) is mapped into(2 . , . . , , . σ x σ y can only be ±
1, we expand:exp n β σ x σ y o = a + b σ x σ y . Inserting both possible values for the product σ x σ y , we find: a + b = e β , a − b = e − β , and finally: exp n β σ x σ y o = cosh β + sinh β σ x σ y . (67)Hence, the partition function in (55) can be written as Z = X { σ x } cosh N Y h xy i h β σ x σ y i . (68)where x and y are nearest neighbours on the lattice, and the corresponding link is denotedby h xy i . Note also that, in 2 dimensions, there are 2 N links for a lattice with N sites.In order to perform the sum over all spin configurations in (68), we use the importantrelations: X σ = ± σ = 0 , X σ = ± σ = 2 . Hence, if we perform the sum over the spin σ x in (68), we must make sure that it appearstwice (or an even number of times) when we expand the products of the square brackets.Thus, if we avoid a vanishing contribution to the partition function, integration over thespins generates closed loops the corners of which are marked by a pair of spins. For eachlink of the closed loop, we get a factor tanh β . Hence, after we have integrated out allspins, the partition function can be written as: Z = cosh N β N X loops h tanh β i N ( L ) , (69)18 closedloopspins appeartwice Figure 5: Integration over spins generate closed loops.where N ( L ) is the number of links of the closed loop L . Note that we obtained a factor of2 for each sum over a particular spin σ . This gives rise to the prefactor 2 N in front of thesum in (69). We now have converted the Ising model into a string theory, but we have notgained much information on the Ising model so far. To proceed further, we must controlthe sum over closed loops. For this purpose, we introduce new variables τ = ± −
1, the corresponding link is part of the loop. If the productis 1, the link is not part of the loop. The advantage of the τ variables is that we canrandomly assign ± τ configurations will do the sum over all possible closed loops for us.Note that each plaquette of the lattice is mapped to a site on the dual lattice. The linkbetween two neighbouring plaquettes is mapped into the link between the adjacent sitesof the dual lattice. Finally, we must express N ( L ) in terms of the τ variables. For thispurpose, we have to count all activated links (links which are part of a loop) on the lattice.It is easy to check that N ( L ) = X h x d y d i h − τ x d τ y d i (70)counts these links: if two neighbouring τ s are equal, they do not contribute to N ( L ), andis they are different, they contribute 1 as they should. Using the τ -representation of theclosed loops, the partition function (69) becomes Z = [cosh β ] N [2 tanh β ] N X { τ xd } Y h x d y d i h tanh β i − τ xd τ yd . (71)19 τ = 1τ = −1 Figure 6: Introducing dual variables to represent closed loops.This last equation can be written as Z = sinh N (2 β ) X { τ xd } exp n X h x d y d i e β τ xd τ yd o . (72) e β = −
12 ln tanh β . (73)We have obtained again a 2d Ising model which is now formulated on the dual lattice: theonly difference is that the coupling is now e β rather than β . It is not generally true thatthe duality transform yields the same lattice model just with different couplings. Modelswhich do have this property are called self dual .Now let us assume that β is large (small temperature). In this case, we find from (73) that e β ≈ e − β , β large . By contrast, if β is small (the high temperature limit), we find e β ≈ −
12 ln β , β small . Hence, large β corresponds to small e β and vice versa (see figure 7). This is interesting sincethe so-called strong coupling expansion techniques are available for small β . Performingthe expansion with respect to e β in the dual model, the large β regime can also be studiedby analytic methods. The basis of this expansion is a Taylor expansion of the exponentialwith respect to β . This expansion naturally reaches its radius of convergence when β β β ∼ β M a gn e ti s a ti on Figure 7: The dual coupling constant e β as a function of β (left). Magnetisation as afunction of β (right).approaches the critical coupling β c . Performing the expansion using the dual model, theTaylor expansion with respect to e β also breaks down at the critical coupling. There are noother couplings for which singularities in thermodynamical quantities occur. Hence, thecritical point is obtained if β = e β = β c . (74)Using (73), we therefore find β c = −
12 ln tanh β c , β c = 12 ln(1 + √ ≈ . . . . . (75)Figure 7 also shows the magnetisation as a function of β for a 128 ×
128 lattice comparedwith the mean field result and the exact result in the infinite volume limit.There are lot of interesting features of field theories already present in the Ising model:there is the relation between a lattice model and a theory of strings, and there is the dualitytransform which maps the high temperature theory onto a low temperature theory.
The idea central to all simulations of lattice models is to generate lattice configurations { σ x } according to their probabilistic measure P ( σ ) = exp ( − β H ( σ )) / Z (76)21here Z is the partition function (55). A straightforward idea to accomplish this task wouldbe to generate randomly the spins at each site x and to accept or reject this configurationaccording to (76). The problem is that we would hardly find any acceptable configurations.Why is this so?Let us answer this question in the context of the Ising model of the previous section. Thetwo dimensional lattice consists of N = 125 ×
125 sites. Since σ ∈ {− , +1 } , there are2 N ≈ different lattice configurations. We further introduce the average action persite, i.e. ¯ s = 1 N (cid:28) X
1. (iii) Subsequently, pick another spin for theupdate and start again with (i). Once all spins have been visited at least once, one sweep has been performed. 24 = 0.38β = 0.42 β = 0.46β = 0.10
Figure 8: Thermalised spin configurations of the 2d Ising model for several β valuesstarting from high temperature phase to the low temperature ordered phase.The algorithm above needs an initial configuration. We could choose a unique orienta-tion of all spins. Since this is the ground state of the Hamiltonian which dominates thepartition function for small values of the temperature, this initialisation is called a coldstart . Alternatively, we could start with a random orientation of the spins. This is a con-figuration which is relevant at very high temperatures where interactions are negligible.This initialisation is therefore called a hot start . Independently of our choice, a numberof sweeps is carried out to generate a statistically relevant configuration. This procedureis known as thermalisation . The number required to arrive at an equilibrated spin latticedepends on the number of degrees of freedom and the temperature.Let us now examine typical lattice spin configurations. Starting at low β , the sampleconfigurations are highly disordered (see figure 8). Increasing β up to ≈ .
3, the clusters ofspins with the same orientation already extend over several lattice spacings. Approachingthe critical value, e.g. for β ≈ .
42, the clusters are already as large as the lattice. This25bservation reflects the growth of the spin correlation length which, for the present case,is ξ ≈ ξ + (cid:12)(cid:12)(cid:12)(cid:12) − TT c (cid:12)(cid:12)(cid:12)(cid:12) − , ( T > ∼ T c ) , (93)and hence diverges when β → β c .If our numerical approach should produce the configurations of a Markov chain, the con-figurations may not depend on the Monte-Carlo history. To find out whether the config-urations are indeed statistically independent, we may inspect the autocorrelation time τ e.g. say for the magnetisation M (recall subsection 1.3 for discussions of autocorrelations).To guarantee independence, we perform about 2 τ Monte-Carlo sweeps before we considera configuration eligible for contributing to an estimator. In the case of the heat bath algo-rithm (in fact, for all local update algorithms), one discovers that the autocorrelation timestrongly increases when the critical point is approached. This implies that the interestingregime of the model, namely the regime close to the phase transition, is not accessiblewith these types of algorithm. The reason for this is the following: consider a spin insideone of the clusters. All the neighbouring spins are pointing in the same direction. If thisspin is now subjected to a local update procedure, the spin hardly changes because of thestrong mean field produced by the other spins. Hence, only the boundaries of the clus-ter are significantly modified after one sweep through the lattice. The correct physics is,however, described by configurations consisting of strongly fluctuating clusters. In orderto change a cluster completely, there are roughly ξ lattice sweeps necessary. Hence, onlyafter ξ sweeps, the configuration has changed significantly. This, however, implies thatthe autocorrelation time is roughly given by τ ≈ ξ . Indeed, it was empirically observedfor the Metropolis algorithm that τ ≈ ξ z , z Metro ≈ . . (94)The index z is called the dynamical critical exponent and depends on the algorithm. Sincethe physical correlation length ξ diverges at the phase transition, (94) implies that theregime near the phase transition cannot be simulated with local update algorithms. State-of-the-art simulations which explore the transition regime use the so-called clusteralgorithms. The difference to local update algorithms is that many spins are flipped at atime. To derive the prescription of such a cluster update, we rewrite the partition function(55) as Z = X { σ x } exp β X
Let us assume that the motion of a particle of mass m in 1 dimension is governed by apotential V ( x ). The classical equation of motion can be calculated by variational methodsfrom the action S = Z t dt n m x − V ( x ) o . (99)Classically, these equations of motion determine the time evolution of the position of theparticle x ( t ). At quantum mechanical level, the partition function Z ( T ) = tr exp (cid:26) − T H (cid:27) (100)is a convenient starting point to discuss the thermodynamics of the physical system. Here, H is the quantum mechanical Hamiltonian, i.e., H = − ~ m d dx + V ( x ) . (101) T is the temperature, and is considered as an external parameter. Once one has succeededin calculating the partition function (100), thermodynamical quantities can be easily ob-tained by taking derivatives, e.g., the temperature dependence of the internal energy isgiven by h H i = T d ln Z ( T ) dT . (102)Although a direct calculation of the eigenstates h n | of the Hamiltonian might be the easiestway to calculate a quantum mechanical partition function in practical applications, I wouldlike to reformulate (100) in terms of a functional integral. This will be of great help whenwe generalise the quantum mechanical considerations to the case of a quantum field theory.For these purposes, I introduce a length scale L := 1 /T and an interval [0 , L ] which Idecompose into N equidistant portions of length a ≪ L , where a is called lattice spacing .It is trivial to obtainexp (cid:26) − T H (cid:27) = exp (cid:26) − N X ν =1 a H (cid:27) = N Y ν =1 exp {− a H } . (103)Let us define complete sets of momentum and position eigenstates ( | p i and | x i , respectively)by 1 = Z dx ν | x ν i h x ν | , Z dp ν | p ν i h p ν | , (104)for ν = 1 . . . N . As usual, these states obey h p k | x k i = exp n − i ~ p k x k o . | x i of position eigenstates to evaluate the trace in (100), we find Z dx h x | N Y ν =1 exp {− aH }| x i = Z dx dp dx dp . . . dx N − dp N − h x | e − aH | p i h p | x i h x | e − aH | p i h p | x i . . . h x N − | e − aH | p N − i h p N − | x i . Note that the operators p and V ( x ) do not commute. We may, however, write:exp (cid:26) − a p m − aV ( x ) + a m [ V ( x ) , p ] + . . . (cid:27) = exp {− aV ( x ) } exp (cid:26) − a p m (cid:27) . Since | x i and | p i are eigenstates of the position operator and momentum operators, re-spectively, we find h x k | exp {− aH }| p k i = exp (cid:26) − a (cid:20) p k m + V ( x k ) + O ( a ) (cid:21)(cid:27) exp { i ~ p k x k } . The partition function therefore becomes up to terms of order a Z ( T ) = Z dx dp dx dp . . . dx N − dp N − dx N exp ( − a N − X k =0 (cid:20) p k m + V ( x k ) (cid:21)) exp ( − i ~ N − X k =0 p k ( x k +1 − x k ) ) h x | x N i (105)It is straightforward to perform the momentum integrations, which are Gaussian, Z ( T ) = (cid:18) πma (cid:19) N/ Z dx dx . . . dx N δ x x N (106)exp ( − a N − X k =0 (cid:20) m x k +1 − x k ) a ~ + V ( x k ) (cid:21)) This equation is a completely regularised expression for the partition function and canbe directly used in numerical simulations. Note that in the framework of quantum fieldtheory, one sets ~ = 1.A compact notation can be derived by formally taking the lattice spacing a to zero. Forthis purpose, we define a h := ~ a , and the Euclidean action by S E = Z L dτ (cid:26) m x + V ( x ) (cid:27) . (107)Note the sign change in front of the potential compared with the standard action (99).The interval [0 , L ], which was introduced above (103), is called Euclidean time interval.29 u E u c li d e an ti m e space u τ i D/2 m2a nearestneighbors Figure 9: Classical versus quantum partition functions of a 1-dimensional particle chain.By construction (see above), the length of the Euclidean time interval is given by theinverse temperature, i.e., L = 1 /T . We also introduce a Euclidean particle trajectory, anda Euclidean velocity x k → x ( τ ) x k +1 − x k a h → ˙ x ( τ ) , (108)where we identify dτ = a h . Using the shorthand notation (cid:18) π ~ ma h (cid:19) N/ Z dx dx . . . dx N − → D x ( τ ) , the partition function (106) can be formally written as a functional integral Z ( T ) = Z D x ( τ ) exp (cid:26) − ~ S E (cid:27) . (109)Eq.(109) suggests that an average over all Euclidean trajectories x ( τ ) must be performedwhere the probabilistic weight of each trajectory is given by exp {− S E / ~ } . Note also thatin view of the δ -function in (106) only trajectories which are periodic in Euclidean timemust be considered, i.e., x (0) = x ( L = 1 /T ). For illustration purposes, we consider the 1-dimensional particle chain in figure 9. Here,the positions of the particles i = 0 . . . n are characterised by their extensions u i from theequilibrium position. The particles experience a harmonic potential depending on the30istance to the nearest neighbour. Here, I choose the boundary conditions u = 0, u n = 0.The Hamiltonian, which describes the classical physics, is given by H = n − X i =1 (cid:20) m p i + D u i +1 − u i ) (cid:21) . (110)Hence the classical partition function is given by the multi-dimensional integral Z cla ( T ) ∝ Z dp . . . dp n − du . . . du n − exp (cid:26) − H T (cid:27) . (111)In order to calculate the full quantum mechanical partition function of the particle chain,we first write down the Euclidean partition function. Note for this purpose that thedisplacements u i now acquire an additional dependence on the Euclidean time u i → u i ( τ ) ≡ u τi . With this notation the Euclidean action is given by S E = N X τ =1 n − X i =1 a (cid:20) m a ( u τ, i − u τ − , i ) + D u τ, i +1 − u τ, i ) (cid:21) . (112)The interactions between the c-number fields u τi can be easily visualised (see figure 9): thefields u τi harmonically interact with their nearest neighbours. The harmonic interactionstrength is given by D/ m/ a for neighbours in Euclidean timedirection. The quantum mechanical partition function can be calculated by integratingover of the fields u τi located at the sites of a 2-dimensional grid, .i.e. Z ( T ) ∝ Z D u exp {− S E } , (113)where the temperature enters the consideration via the extension of the lattice in Euclideantime direction with fields obeying periodic boundary conditions.To conclude, we observe that the partition function of a classical D + 1 dimensional fieldtheory (in lattice regularisation) describes the full partition function of a D dimensionalquantum system. D is the number of space dimensions. This correspondence is veryhelpful in understanding the quantum behaviour of a theory, since it can be mapped to aclassical field theory (at the expense of an additional dimension). In the next section, wewill study classical partition functions in 4-dimensional Euclidean space in order to derivethe information on the thermodynamics of the full quantum system. The Ising model, strictly speaking the partition function (55), is invariant under the global transformation of the spins given by σ Ω ( x ) = Ω σ ( x ) , Ω = ± . (114)31he transformation is called global because the transformation affects all spins at the sametime, i.e., Ω is independent of the coordinates (sites). The corresponding symmetry groupis Z .This symmetry group can be generalised to a local symmetry, also known as gauge sym-metry , by demanding invariance under σ Ω ( x ) = Ω( x ) σ ( x ) , Ω( x ) = ± . (115)Of course, the action (56) of the standard Ising model is not invariant under the hugesymmetry group which is now [ Z ] N , where N is the number of sites. In order to obtaina version of the Ising model which possesses a Z gauge symmetry, we need to change theaction. The only way to do it, is to introduce an additional field, Z µ ( x ). This field isassociated with the links of the lattice: x specifies the site and µ the direction in which wefind the link. Alternatively, we could write: Z µ ( x ) = Z h xy i , y = x + ˆ e µ , where ˆ e µ is the unit vector in µ direction. For the latter expression, we will also abbreviate x + ˆ e µ = x + µ . For the action, we choose S matter = κ X h xy i σ ( x ) Z µ ( x ) σ ( x + µ ) , (116)and demand that the link Z µ transforms under gauge transformations as Z Ω µ ( x ) = Ω( x ) Z µ ( x ) Ω( x + µ ) . (117)Since spin and link transform simultaneously with the same Ω( x ) and since Ω ( x ) = 1, oneeasily proves the gauge invariance of the action (116).Obviously, the action S matter describes the interaction between the matter fields, i.e., thespins, and the new link fields. What is left to do is to design a gauge invariant action forthese new degrees of freedom. This interaction should be short ranged in order to preservesome desirable features such as universality. A possible choice is S link = β X x, µ>ν P µν ( x ) , P µν ( x ) = Z µ ( x ) Z ν ( x + µ ) Z µ ( x + ν ) Z ν ( x ) . (118)Here, the numbers ( x, µ > ν ) specify the plaquette of the lattice the lower left cornerof which is located at site x and which is spanned by the directions µ and ν . The fieldcombination P µν ( x ) is often called the plaquette for short. The proof that P µν ( x ) is indeedinvariant under gauge transformations (117) is left to the reader.32he total action of the gauged Ising model consists of two parts: the matter part and the“gauge” part. Correspondingly, there are two coupling constants: the convention is that β is the pre-factor in the pure gauge action, while κ multiplies the matter part.Once our system is now gauged, it only makes sense to consider gauge invariant observablessince non-gauge invariant quantities vanish. Let us explore this for a simple gauge variantquantity such as the spin correlation function: C ( x , y ) = 1 Z X { σ } σ ( x ) σ ( y ) exp n S [ σ ] o , S [ σ ] = S matter + S link , (119) Z = X { σ } exp n S [ σ ] o , (120)Let us now consider a particular gauge transformation (115) of the spins, i.e.,Ω( x ) = (cid:26) − x = x σ ( x ) → σ Ω ( x ), we use the gauge invariance ofthe action and the sum , i.e., S [ σ ] = S [ σ Ω ] , X { σ } = X { σ Ω } . The sum is trivially invariant, since we sum anyhow over all possible ± C ( x , y ) = 1 Z X { σ } σ Ω ( x ) σ Ω ( y ) exp n S [ σ ] o = 1 Z X { σ } [ − σ ( x )] σ ( y ) exp n S [ σ ] o = − C ( x , y ) , (122)where we used our particular choice (121) in the last line above. We conclude from thisthat C ( x , y ) = 0. Z gauge theory: 3 and 4 dimensions Let us now consider the particular case κ = 0 when the matter fields are absent from thetheory. The emerging theory is called pure Z gauge theory, and it is a theory of link fieldsonly. What are the physical (i.e. gauge invariant) degrees of freedom in this case? Let usconsider the more interesting case of 3 and 4 dimensions for these considerations. In orderto talk about gauge invariant information, we now consider the plaquettes P p , p = ( x ; µν ),defined in (118). We say that a short flux line (vortex) passes through the plaquette p if33 ink = −1flux line Figure 10: Flux passing through an elementary cube (left) and closed flux lines on a 3dlattice (right). P p = −
1. Since the plaquette variables P are gauge invariant, so are the flux lines. Moreformally, we introduce a vortex plaquette variable by v p = Y l ∈ p Z l (123)and consider the flux lines which enter/leave an elementary cube of the lattice. We takethe product of all vortex plaquettes which are associated with the faces of the elementarycube and find Y p ∈ c v p = ( − ν , (124)where ν is the total number of vortices at the faces of the cube. Inserting the definition(123), we also find that Y p ∈ c v p = Y p ∈ c Y l ∈ p Z l = 1 , (125)since in the latter products all Z l factors appear twice (see figure 10, left panel; remember Z l = 1). Comparing (125) with (124), we realise that ν must be even. In particular, ν = 1is excluded implying that a vortex never ends inside a cube. Considering 3 dimensions(or the spatial hypercube of 4-dimensional space time), we find that the gauge invariantvortices form closed lines in space. See figure 10, right panel, for an illustration. Let us consider the 4 dimensional model first. The constraint (125) ismost easily interpreted on the dual lattice. A plaquette p of the original lattice maps onto34 plaquette ∗ p of the dual lattice, and a cube c corresponds to a link ∗ l on the dual lattice.Hence, the constraint (125) reads on the dual lattice Y ∗ p ∈ ∗ l v ∗ p = 1 . (126)This simply means that the number of vortex plaquettes which are attached to a link onthe dual lattice must be even. Accordingly, the vortex plaquettes form closed surfaces onthe dual lattice. If n denotes the number of negative plaquettes on the original lattice, thenumber of trivial plaquettes in 4 dimension is 6 N − n , where N is the number of latticepoints. Hence, the probabilistic weight of such a configuration is:[exp { β } ] N − n [exp {− β } ] n = exp { N β } exp {− β n } , so that the partition function can be written as Z = exp { N β } X { closed surfaces } exp {− β n } . (127)Let us interpret this partition function: the degrees of freedom are closed two dimensionalsheets (2-branes) embedded in four dimensions. The surface A of these branes is given by n . Hence, the probabilistic factor is given byexp n − β A o implying that 2 β can be interpreted as the surface tension . At zero temperature ( β → ∞ ),the empty vacuum (no 2-branes) is realised. At finite temperatures, the brane entropycompetes with the penalty from the weight factor. A direct calculation of the entropyof 2d world-sheets in 4d would be cumbersome. However, exploiting the relation to the Z gauge theory makes the calculation of brane expectation values easily accessible bynumerical means.Let us proceed to obtain the duality map of the 4d Z gauge theory. The basic trickto perform the sum over the closed surfaces is to introduce degrees of freedom whichautomatically resolve the constraint. In the present case, these are links ∗ Z ∗ l on the duallattice. Let us consider X { ∗ Z ∗ l } Y ∗ p h t P ∗ p [ ∗ Z ] i , (128)where P ∗ p [ ∗ Z ] is the plaquette generated by the links ∗ Z on the dual lattice. When weremove the brackets in (128), the only way to have a non-vanishing contribution to thesum is by making sure that each link ∗ Z ∗ l appears an even number of times. This, however,means that the negative plaquettes P ∗ p [ ∗ Z ] form closed surfaces. Hence, we find X { ∗ Z ∗ l } Y ∗ p h t P ∗ p [ ∗ Z ] i = 2 N X { closed surfaces } t n . (129)35ote that there are 4 N links on the 4 dimensional lattice and that the factor 2 N arisesfrom the sum over ∗ Z ∗ l . Thus, using (129) in (127), we find Z = exp { N β } − N X { ∗ Z ∗ l } Y ∗ p h t P ∗ p [ ∗ Z ] i , t = exp {− β } . (130)Since the plaquette P ∗ p [ ∗ Z ] only acquires values ±
1, we may write: Y ∗ p exp n e β P ∗ p [ ∗ Z ] o = [cosh e β ] N Y ∗ p h e β P ∗ p [ ∗ Z ] i . (131)The partition function (130) therefore becomes Z = (cid:20) exp { β } cosh e β (cid:21) N − N X { ∗ Z ∗ l } exp n e β X ∗ p P ∗ p [ ∗ Z ] o , (132)exp {− β } = tanh e β . (133)First of all we note that the dual of pure Z gauge theory is another 4-dimensional Z gauge theory: the model is self-dual. Furthermore, relation (133) is already familiar to us:we have obtained a complete analogue of the relation between β and its dual e β for the 2dIsing model. We therefore once again encounter the fact that the weak coupling regime ismapped to the strong coupling regime of the same model. As a byproduct we find thatthe critical coupling is given by β c = 12 ln(1 + √ ≈ . . . . . (134)The fact that the critical couplings of the 2d Ising model and 4d pure Z gauge theorycoincide might be a numerical accident. At least, I do not know any deeper reason forit. I finally point out that the Z gauge symmetries of the original and the dual formu-lation are completely unrelated. This can be most easily seen from the fact that, at anintermediate stage, we have formulated the model entirely in terms of physical, i.e., gaugeinvariant variables: the closed vortex sheets of the dual lattice. Resolving this constraint,the gauge invariance of the dual model arose from a parameterisation invariance, namely,the redundancy when we performed the sum over all closed world sheets with the help ofdual gauge links ∗ Z . Let us finally discuss the 3-dimensional model. In 3 dimensions, aplaquette p is mapped to a link ∗ l and a cube c is mapped to a site ∗ x on the dual lattice.The constraint (125) then translates to Y ∗ l ∈ ∗ x v ∗ l = 1 . (135)36he plaquettes carrying negative flux on the original lattice are represented by links formingclosed loops on the dual lattice. The partition function now takes the form (there are 3 N links on the lattice): Z = exp { N β } X { closed loops } exp {− β n } . (136)We already know how to perform the sum over all closed loops from subsection 2.4:2 − N X { τ ∗ x } Y ∗ l exp n e β τ ∗ x τ ∗ y o (137)= h cosh e β i N X { τ ∗ x } Y ∗ l h e β τ ∗ x τ ∗ y i = h cosh e β i N X { closed loops } [tanh e β ] n . Identifying once again exp {− β } = tanh e β , (138)we find Z = (cid:20) exp { β } e β (cid:21) N X { τ ∗ x } exp nX ∗ l e β τ ∗ x τ ∗ y o . (139)Obviously, the Z gauge theory is dual to a theory which is not a gauge theory anymore, the3d Ising model. This has tremendous consequences: for the standard Ising model, clusterupdate algorithms are available. Using the Swendsen-Wang or Wolff type cluster update,we are able to simulate a gauge theory with much less autocorrelations. Unfortunately,such a framework is not (yet) available for more relevant theories such as lattice Yang-Millstheories. Due to the universality conjecture, the lattice model with the correct number of dimensionsand the correct symmetries uniquely defines the corresponding quantum field theory in thecritical limit. The purpose of the present subsection is to propose a classical lattice modelwhich satisfies these prerequisites in the case of Yang-Mills theory.The QCD matter fields (quarks) belong to the fundamental representation of the so-called SU ( N c ) colour group ( N c = 3 for QCD). Gauge invariance means that the action of thequark fields is invariant under the local unitary transformations, i.e., q ( x ) → q ′ ( x ) = Ω( x ) q ( x ) , Ω( x ) ∈ SU ( N c ) . . (140)As explained in many text books, local gauge invariance of the quark kinetic term mayonly be achieved by introducing additional dynamical fields, the gluon fields A µ ( x ).37 Y µ (z) Ω +(z+ ) µ ...U Ω ( z + ) µ U ν ( z + ) µ ... µ ν Z Ω (x) Ω (y)+ P = P product of link variables P gauge transformation Figure 11: Path ordered product of link variables.The quark fields are associated with the sites in a lattice formulation. Hence, the symmetrygroup of the classical lattice Yang-Mills model is [ SU ( N c )] N s , where N s is the number oflattice sites. In order to enforce such a high symmetry in the critical limit of a latticemodel, it has turned out essential to realise the symmetry even for finite values of thelattice spacing a . This in turn forces the model to maintain local gauge invariance in thecontinuum limit [9]. A potential candidate for a quark kinetic term in the non-interactingcase is X x,µ (cid:20) ¯ q ( x ) γ µ q ( x + µ ) − ¯ q ( x + µ ) γ µ q ( x ) (cid:21) , (141)where the γ µ are the Euclidean γ matrices. Of course, the action (141) is not invariant underthe gauge transformations (140). To achieve this invariance, we introduce an additionalfield of vector type, thus being related to the links of the lattice, U µ ( x ) ∈ SU ( N c ) . (142)38 µν (x) = µ ν plaquette Nc Figure 12: Lattice plaquette variableGeneralising the quark kinetic term (141) to S Q = X x,µ (cid:20) ¯ q ( x ) γ µ U µ ( x ) q ( x + µ ) − ¯ q ( x + µ ) γ µ U † µ ( x ) q ( x ) (cid:21) , (143)one obtains the desired local invariance upon demanding that the link fields transform as U µ ( x ) → Ω( x ) U µ ( x ) Ω † ( x + µ ) . (144)Let us follow the case of the gauged Ising model and construct a kinetic term for the linkfields U µ ( x ). For lattice models “kinetic” means that the interactions of the fields on thelattice are short range , i.e., only nearest neighbours are involved. In order to design agauge invariant kinetic term for every value of the lattice spacing, we firstly investigate thetransformation properties of a path ordered product of link variables. Let us consider anopen path C which starts at point x and ends at y (see figure 11 for an illustration), anddefine P ( x, y ) = P Y x ∈ C U ( x ) , (145)where P implies path ordering. Inserting the gauge transformed links (144) into (145), onefinds P ( x, y ) → P ′ ( x, y ) = Ω( x ) P ( x, y ) Ω( y ) . (146)With the help of (145), it is easy to construct a kinetic term for the link variables which(i) is gauge invariant and (ii) involves only next to nearest neighbours. For this purposeone chooses C to be a closed path starting at x and ending at y = x which encircles anelementary plaquette (see figure 12): P µν ( x ) = 1 N c tr P ( x, y )= 1 N c tr n U µ ( x ) U ν ( x + µ ) U † µ ( x + ν ) U † ν ( x ) o . (147)39sing (146) and the invariance of the trace under cyclic permutations, one easily showsthat the plaquette (147) is indeed gauge invariant.The lattice partition function involves an integration over the dynamical fields of the theory.In the case of the group valued link variables (142), the question arises which measure D U µ should be employed for the integrations. We must demand that the integration measuredoes not spoil gauge invariance. To ensure this we use the so-called Haar measure whichsatisfies dU µ ( x ) = d (cid:18) AU µ ( x ) B (cid:19) , A, B ∈ SU ( N c ) . (148)The Haar measure is available in closed form for the unitary groups SU ( N c ). Here, I willonly present the Haar measure for SU(2) group integrations. The SU(2) unitary matrix U is conveniently parameterised in terms of the Pauli matrices, U = a + i ~a~τ , U U † = 1 → a + ~a = 1 . (149)Since the constraint U U † = 1, i.e. a + ~a = 1, is not changed if U is multiplied with A from the left and B from the right, respectively, these multiplications can be viewed asrotations in the 4-dimensional space spanned by ( a , ~a ). Therefore, an invariant measurecan be defined by dU = da da da da δ (cid:18) a + ~a − (cid:19) . (150)Introducing polar coordinates for the 3-dimensional vector ~a := a~n , ~n~n = 1, the integrationover the norm of the vector ~a can be performed with the help of the δ function in (150).We obtain the final result for the SU(2) Haar measure, i.e., dU = da q − a d Ω ~n , (151)which is commonly used in lattice simulations.Finally, the lattice representation of the gauge invariant partition function is given by Z = Z D U D q D q † exp (cid:26) − S Q + β X x,µ>ν h P µν ( x ) + h . c . i(cid:27) , (152)where the quark interaction is encoded in S Q (143) and P µν ( x ) is the plaquette (147). β is related to the bare gauge coupling constant g of the continuum formulation by β =2 N c /g . The particular choice (152) of the lattice regularised gluonic action is known asthe Wilson action [9]. Note that the fields q ( x ), q † ( x ) are anti-commuting Grassmannfields. This choice for the fermionic fields is necessary to obtain the correct Fermi statisticsas well as to ensure the Pauli principle. It implies that the lattice model (152) cannot bestraightforwardly be used in numerical simulations. Rather, since the action for the quarkfields is quadratic, the integration over the quark fields has to be performed analytically: Z D q D q † exp n − ¯ q A M AB q B o = Det M [ U ] . (153)40here the index A comprises space-time as well as spinorial, etc. indices. The quark deter-minant Det M [ U ] is a gauge invariant function of the link variables U µ ( x ). Note, however,that the link interaction mediated by the quark determinant is non-local, implying thata link at a particular site is coupled to all other links of the lattice. In practice, thisimplies that a local update of a single link enforces the calculation of a functional deter-minant. This explains why the numerical simulation of Yang-Mills theory with dynamicalquarks requires much more computational resources than the simulation of the theory in quenched approximation , where the quark determinant is neglected for the update of thelink variables. It turns out that the treatment of the quark degrees of freedom in (152) is still too naive:since the Dirac equation is linear in momentum, its lattice analogue does not produce justthe desired quark degrees of freedom in the limit a →
0, but rather 2 D copies of them ( D is the number of space time dimensions). This is already true for the free theory as will beshown in what follows.Let us firstly introduce the generating functional for connected Green functions in the caseof free and massless bosonic theory, Z [ j ] = Z D φ exp (cid:26) − φ k Π kl φ l + j x φ x (cid:27) . (154)A sum is understood over indices which appear twice. One easily verifies that the connectedcorrelation function is obtained from Z [ j ] via f ( x − z ) := h φ x φ z E − D φ x ED φ z E = d ln Z [ j ] dj x dj z . (155)By “completing the square” in (154), we find Z [ j ] ∝ exp (cid:26) j x (cid:16) Π − (cid:17) xz j z (cid:27) , (156)and hence for the free bosonic case D φ x φ z E − D φ x ED φ z E = (cid:18) Π − (cid:19) xz . (157)In order to evaluate the inverse Π − , of the ”kinetic” operator, we introduce its eigenvaluesand eigenvectors, whereupon Π | k i = λ k | k i , (158)and formally write (cid:16) Π − (cid:17) xz = X k | k i λ k h k | . (159)41 k a λ a continuumlattice ka λ a continuumlattice Figure 13: Dispersion relation from the tree level kinetic term (continuum versus latticeformulation) for the bosonic case (left) and the fermionic case (right panel).It is now easy to calculate the correlation function for the continuum case Π = − ∂ . Theeigenfunctions are subjected to periodic boundary conditions φ ( x ) = φ ( x + L ), i.e., φ ( x ) ∝ e ikx , , e ikL = 1 , k = 2 πL n , n ∈ Z . (160)The discrete k levels are called Matsubara frequencies . In the continuum, there are nofurther restrictions on the integer n . Making the ansatz (160), we find that the eigenvaluesare given by λ ( k ) = k (continuum) . (161)Hence, a free massless particle manifests itself in the correlation function (159) as a poleat zero momentum transfer. The lattice version of the eigenvalue equation isΠ φ ( x ) = X µ (cid:20) − φ ( x + µ ) + 2 φ ( x ) − φ ( x − µ ) (cid:21) = λ latt a φ ( x ) . (162)In order to solve this equation, we use the plane wave ansatz (160). One crucial differencebetween the lattice and the continuum version is that only wavelengths l obeying l ≥ a , πk ≥ a (163)are sensible. The lattice naturally provides an UV momentum cutoff, i.e., Λ UV = π/a .Inserting (160) into (162) one finds λ latt a = X µ h − e ik µ a − e − ik µ a i = 4 X µ sin (cid:16) k µ a (cid:17) . (164)For momenta which are small compared to the UV cutoff, i.e., ka ≪ π , we recover thecontinuum dispersion relation λ latt = k h O ( k a ) i . (165)42n figure 13 the continuum dispersion relation for bosons is compared to its lattice version.The lattice correlation function has only one singularity reflecting that in the scaling limit λa ≪ ka ≪ π , the dispersion relation of the continuum free particle is recovered.Let us move on to the fermionic case. In order to reproduce the correct Fermi statistics,fermion fields ψ ( x ) are of Grassmann type and obey anti-periodic boundary conditions. Irefer to the textbook [2] for an introduction to the free fermionic theory, and only quote thefinal result for the correlation function which formally agrees with (159). In the continuum,the eigenvalue equation is given byΠ ψ ( x ) = ∂/ψ ( x ) = λ ψ ( x ) , (166)where anti-hermitian (Euclidean) γ matrices are used. The ansatz for the spinor wavefunctions is again of plane wave type, ψ ( x ) ∝ u ( k ) e ikx , e ikL = − , k = 2 πL (cid:16) n + 12 (cid:17) , n ∈ Z . (167)The spectrum λ ( k ) is determined by making the ansatz u ( k ) = h ik/ + λ i u , (168)which yields h ik/ − λ i u ( k ) = h ik/ − λ ih ik/ + λ i u = 0 , (169)and therefore h k − λ i u = 0 . (170)Hence, the spectrum of the continuum theory is linearly increasing, λ = ±√ k . Usingthe kinetic energy for a free quark theory introduced in (141), the lattice version of theeigenvalue equation is given by12 X µ (cid:20) γ µ ψ ( x + µ ) − γ µ ψ ( x − µ ) (cid:21) = λ a ψ ( x ) . (171)The ansatz (167) also provides the eigenvectors of the eigenvalue problem (171). Repeatingthe steps which have led to the continuum dispersion relation, one finds its lattice analogue λ a = sX µ sin (cid:16) k µ a (cid:17) . (172)The fermionic eigenvalue distribution is shown in figure (13), right panel. Close to thecritical limit ( λa ≪ ka . In addition, a second singularity occurs for ka ≈ π . This shows that even if λa ≪ Nielsen-Ninomiya No-Go theorem ).At the present stage, a lot of research effort is devoted to incorporate chiral symmetry atthe expense of, say, a moderate non-locality of the action [10].
In the continuum formulation, the chirally invariant Dirac operator D satisfies the relations (cid:8) D , γ (cid:9) = 0 , (cid:8) D − , γ (cid:9) = 0 , (173)which tells us that the non-zero eigenvalues ¯ λ appear in pairs { ¯ λ, − ¯ λ } . Let D denote alattice candidate of the Dirac operator (in units of the lattice spacing a ) satisfying theso-called Ginsparg-Wilson relation [11], (cid:8)
D, γ (cid:9) = 2 D γ D . (174)One observes that the right hand side of (174) is of order a (compared with the order a ofthe left hand side) implying that the naive continuum limit a → e D − := D − − , (175)and using (cid:8) γ , D − (cid:9) = 2 γ , which directly follows from the Ginsparg-Wilson relation (174), we observe that e D − maybe used as a chirally invariant quark propagator, i.e., (cid:8) γ , e D − (cid:9) = 0 . (176)Hence, we are left with the task to find an operator D obeying the Ginsparg-Wilson relation(174). Here, I will briefly discuss the Overlap Dirac operator [12]- [14], firstly introducedin the pioneering paper [12] by Narayanan and Neuberger. One introduces D = 12 h γ H i , (177)where H is a Hermitian operator with eigenvalues ±
1. Common choice is H = D w / (cid:16) D † w D w (cid:17) / , (178)where D w is the standard Hermitian Wilson-Dirac operator. Inserting (177) into (174), itis straightforward to prove that D from (177) satisfies the Ginsparg-Wilson relation (174).A comprehensive discussion of the quark propagator (175) in the context of a simulationof SU(3) Yang-Mills theory can be found in [15].44 tWilson loop: d(screening) masses Figure 14: Wilson loop and loop–loop correlation function
We have observed that the trace of the path ordered product P ( x, y ) of link variables (145)is gauge invariant when taken along a closed curve C , i.e., x = y . Depending on the choicefor the loop C , the expectation value of such loop variables can be connected to physicalobservables. For instance, for the so-called Wilson loop, we choose a rectangular loop withsize r in one spatial direction and the extension t in the Euclidean time direction (see figure14, left panel). In the limit of large t , the Wilson loop expectation value is related to thepotential V ( r ) between a static quark and a static anti-quark which are separated by thedistance r , i.e., D W [ C ] E ∝ exp n − V ( r ) t o , (179)In the particular case that the potential is linearly rising, V ( r ) = σr with string tension σ , one observes that the Wilson loop expectation value exponentially decreases with thearea A enclosed by the loop C . Since a linearly rising quark anti-quark potential impliesconfinement (see discussion below), this area law (due to Wilson) is a litmus test for quarkconfinement.Furthermore, one can calculate the correlation function L ( t x − t y , ~x − ~y ) of two loops centredat x and y , respectively (see figure 14), right panel). Here, information is transported frompoint x to y by gauge invariant states | ph i . The shape of a particular loop determines itsbehaviour under the symmetry transformations of the underlying lattice. These symmetrytransformations correspond to rotations in the continuum limit. Therefore, it is possibleto select the spin quantum number of the state | ph i by adjusting the shape of the loop.For large distances ∆ = t x − t y , the correlation function exponentially decreases, i.e., X ~u L ( t x − t y , ~u = ~x − ~y ) ∝ exp n − ma ∆ o . (180)45 r/a V (r) a on-axis fit(100) data(110) data(111) dataSU(2) β=1.45 Figure 15: The static quark anti-quark potential as obtained from pure SU(2) latticegauge theory. Plot from [16].Hence, the calculation of loop correlation functions provides access to the so-called screeningmasses m of physical particles. In the purely gluonic theory, the only gauge invariant statesare the glue balls, while in full QCD also hadronic states contribute to the correlation func-tions. For definiteness, I confine myself to the case of pure (i.e. no quarks) SU(2) gauge theory.The generalisation of the findings of the present section to SU ( N c ) is straightforward. Thetask is now to find the critical limit of the lattice Yang-Mills theory.There is a lesson to learn from continuum Yang-Mills theory. In order to renormalise thecontinuum theory, one absorbs a logarithmic divergence into the bare gauge coupling. Adetailed calculation yields 1 g (Λ) = 1124 π ln Λ µ + finite , (181)where Λ is the UV cutoff and where µ is an arbitrary renormalisation point. The coefficientin front of the logarithmic term is the quantity of interest and can be obtained by evaluatinga bunch of one-loop Feynman diagrams. Eq.(181) shows that in the critical limit Λ → .2 1.3 1.4 1.5 1.6 β σ a data1-loop2-loopSU(2) improved action β σ a data1-loop2-loopSU(3) improved action Figure 16: Approaching the continuum limit of SU(2) (left) and SU(3) (right) latticegauge theory (improved action from [16]). ∞ the bare coupling vanishes. This is one manifestation of the celebrated property of asymptotic freedom . Switching from the continuum to the lattice formulation we identifyΛ = π/a and use β = 4 /g to straightforwardly derive a ( β ) = const . exp (cid:26) − π β (cid:27) . (182)Due to asymptotic freedom, we expect that the critical limit is approached when β → ∞ .The perturbative relation between a and β in (182) is called asymptotic scaling .Modern computer simulations use a more complicated “kinetic” term for the gluon fields.One example of such an improved action is given by S = β X µ>ν,x h κ ¯ P µν ( x ) + κ ¯ P (2) µν ( x ) i . (183)where ¯ P (2) µν ( x ) is the 2 × κ + 16 κ = 1 , (184)ensures that the familiar relation between β and the bare gauge coupling g , β = 2 N c /g ,is maintained. The residual freedom of choosing κ and κ can be used to obtain a rathergood agreement with asymptotic scaling on rather coarse lattices.In order to search for the critical limit with the help of numerical simulations, we calculatea physical quantity, e.g. the string tension σ in units of the lattice spacing as a function47f the only parameter β . This is done by calculating the static quark anti-quark potential V ( r ) as a function of the quark anti-quark distance r = n a . The outcome in units of thelattice spacing is shown in figure 15. By fitting the numerical data to V ( r ) a = v − αn + σa n , we find the string tension in units of the lattice spacing, σa , for each value of β . Theoutcome of this calculation is shown in figure 16. One indeed observes that the c-number σa exponentially decreases for large values of β in agreement with the prediction (182) ofcontinuum Yang-Mills theory. The quantum field theoretical limit of the classical latticemodel is obtained by interpreting the correlation length, i.e., the string tension σ in thepresent example, as a fixed physical quantity, and reinterpreting the β dependence of thenumerical data for σa as the β dependence of the lattice spacing.Let us assume we have obtained a glue ball mass m in lattice units, i.e., we know ma asa function of β . If the mass m is a physical observable, one must recover from the datathe characteristic dependence a ( β ) (see (182)) for sufficiently large β values. Hence, theratio of the two dimensionless numbers m a /σa approaches a constant for β close to thecritical point (see figure 16, right panel). Extrapolating the data to the continuum limit a →
0, i.e., β → ∞ , one determines the physical mass m in units of another physicalscale, i.e., √ σ . Finally, let us count the number of parameters. The only parameter ofthe classical lattice model is β , but β is no longer at our disposal in the quantum fieldtheory limit (which implies β → ∞ ). However, the physical value of the correlation length(or √ σ in the present example) takes over the role of a free parameter. The replacementof a dimensionless parameter by a mass scale in the continuum limit is a feature of manyquantum field theories and is called dimensional transmutation . On the lattice everymass scale is obtained in units of the string tension, √ σ = 440 MeV is used to assign thefamiliar units of QCD to observables. For 32 lattice points in any space-time direction, wethen find: β (input) 1 .
250 1 .
400 1 .
500 1 . σa (calculated) 0 . . . . L = N a . . . . π/a . . . . β arbitrarily small sincethe physical volume becomes too small. Small values of β result in large volumes, but wecannot make β too small in order to have a reasonably large UV cutoff. Thus, for a fixednumber of points, there is a small window of β values which are appropriate for a study ofQCD particle properties. This window is sometimes called the scaling window . Acknowledgements:
I thank Tom Heinzl and Martin Lavelle for a careful reading ofthe manuscript and helpful comments. 48 eferences [1] P. Gopikrishnan, V. Plerou1, L. A. Nunes Amaral, M. Meyer, and H. E. Stanley,
Scaling of the distribution of fluctuations of financial market indices ,Phys. Rev.
E60 , 5305 (1999).[2] Michel Le Bellac,
Quantum and Statistical Field Theory , Clarendon Press, Oxford.[3] H. A. Kramers and G. H. Wannier, Phys. Rev. , 252 (1941).[4] C. N. Yang, The spontaneous magnetization of a two-dimensional Ising model ,Phys. Rev. , 808 (1952).[5] C. M. Fortuin and P. W. Kasteleyn, On the Random cluster model. 1. Introductionand relation to other model ,Physica , 536 (1972).[6] R. H. Swendsen and J. S. Wang, Nonuniversal critical dynamics in Monte Carlosimulations ,Phys. Rev. Lett. , 86 (1987).[7] U. Wolff, Collective Monte Carlo Updating for Spin Systems ,Phys. Rev. Lett. , 361 (1989).[8] Wolfhard Janke, Nonlocal Monte Carlo Algorithms for Statistical Physics Applica-tions ,Mathematics and Comnputers in Simulations , 329 (1998).[9] K. G. Wilson, Phys. Rev. D , 2445 (1974).[10] see e.g. D. B. Kaplan, A Method for simulating chiral fermions on the lattice , Phys.Lett. B , 342 (1992), [arXiv:hep-lat/9206013].[11] P. H. Ginsparg and K. G. Wilson,
A Remnant Of Chiral Symmetry On The Lattice ,Phys. Rev. D , 2649 (1982).[12] R. Narayanan and H. Neuberger, A Construction of lattice chiral gauge theories ,Nucl. Phys. B , 305 (1995) [arXiv:hep-th/9411108].[13] H. Neuberger,
A practical implementation of the overlap-Dirac operator , Phys. Rev.Lett. , 4060 (1998) [arXiv:hep-lat/9806025].[14] R. G. Edwards, U. M. Heller and R. Narayanan, A study of chiral symmetry inquenched QCD using the overlap-Dirac operator , Phys. Rev. D , 094510 (1999)[arXiv:hep-lat/9811030]. 4915] F. D. Bonnet, P. O. Bowman, D. B. Leinweber, A. G. Williams and J. b. Zhang[CSSM Lattice collaboration], Overlap quark propagator in Landau gauge , Phys.Rev. D , 114503 (2002) [arXiv:hep-lat/0202003].[16] K. Langfeld, Improved actions and asymptotic scaling in lattice Yang-Mills theory ,Phys. Rev. D76