A Pedagogical Discussion of Magnetisation in the Mean Field Ising Model
Dalton A R Sakthivadivel
Stony Brook University, Stony Brook, New York, 11794-5281
(Dated: 3rd February 2021)

Here, a complete, pedagogical tutorial for applying mean field theory to the two-dimensional Ising model is presented. Beginning with the motivation and basis for mean field theory, we formally derive the Bogoliubov inequality and discuss mean field theory itself. We proceed with the use of mean field theory to determine Ising magnetisation, and the results of the derivation are interpreted graphically and physically. We include some more general comments on the thermodynamics of the phase transition. We end by evaluating symmetry considerations in magnetisation, and some more subtle features of the Ising model. Together, a self-contained overview of the magnetised Ising model is presented, with some novel presentation of important results.
I. INTRODUCTION
The Ising model is one of the most commonly used models in statistical mechanics, due to its ability to describe the dynamics of a large number of seemingly quite different systems. Particularly adept at describing phase transitions, the Ising model comprises a particular universality class, which describes the grouping of many different systems according to some key common features in their phase transitions. The Ising model has been used for such diverse purposes as describing the transition from liquid to gas, to representing various features of string theory [1, 2]. This is remarkable for a simple model of the electronic structure of a magnet.

Devised in 1920 by Wilhelm Lenz, Ernst Ising’s doctoral supervisor, it was given to Ising and solved by him in 1925 [3]. Ising solved a one-dimensional model, by way of transfer matrix. The solution is simple, but unfortunately, there is no phase transition in one dimension, making the classical one-dimensional Ising model not interesting. In two dimensions, we observe a phase transition from a paramagnet (disordered spins, no magnetisation) to a ferromagnet at temperatures below a critical point. On the other hand, in two dimensions, the interactions become too complex to solve for analytically with any ease. The Ising model in three dimensions is still unsolved, and in d ≥

II. MAIN RESULTS
Mean field theory (MFT) is formalised by applying the Bogoliubov inequality to a variational Hamiltonian. Broadly, this states that the choice of a simpler model, and the statistics it yields, can be made formally based on minimisation of a variational term. We explore the Bogoliubov inequality below.

A. Proving The Bogoliubov Inequality
Perturbative methods are commonly used in physics when the problem at hand is intractable. They take a simpler, exactly solvable model and perturb it to describe the more complicated problem. In general, these are composed of a solvable expression $A_0$ and an expansion in some control parameter $\lambda$, such that $A$ is approximated by the series
$$A \approx A_P = A_0 + \lambda A_1 + \lambda^2 A_2 + \dots + \lambda^n A_n.$$
The Bogoliubov inequality operates on one such perturbative method, and is used to justify MFT. Say we were trying to find the free energy $F$ of a system, given by $-\beta^{-1}\ln(Z)$, where this calculation was not tractable due to computational or analytical difficulty. We may separate the system into two components such that one has ‘easy’ statistics and the other is more complicated, with the caveat that together they must approximate the true Hamiltonian of the system.

Free energy is an important statistic, as many fundamental quantities can be derived from it. In general, it is natural to ask the question: how good is our approximation of the system’s dynamics? The Bogoliubov inequality answers this question for both statistical and quantum systems.

Suppose we begin with a simple, unperturbed Hamiltonian $\hat{H}_0$ and perturb it with a more complicated expression $\hat{H}_1$ to get $\hat{H}_P = \hat{H}_0 + \lambda\hat{H}_1$, and we wanted to approximate the behaviour of the true system as best as possible based only on our choice of the second term. This is a variational problem: we are attempting to identify the minimum difference between the dynamics of our perturbative Hamiltonian $\hat{H}_P$ and those of our true Hamiltonian $\hat{H}$ by varying our perturbative components. In fact, we will see the second term doesn’t matter at all, and a judicious choice of trial Hamiltonian $\hat{H}_0$ will provide a close match to the actual free energy, which we then improve by minimisation.

In the Bogoliubov inequality, the statistics we reproduce are free energy related.
The Bogoliubov inequality ensures the effect this perturbative model has on the free energy can be given a rigorous upper bound, such that we know the extent of our approximation $F_V$.

Given that the free energy $F$ is a concave function of our control parameter $\lambda$, i.e.,
$$\frac{d^2 F}{d\lambda^2} < 0,$$
so that $F$ never exceeds its first-order estimate in $\lambda$, we have the following:
$$F \leq F_0 + \langle \hat{H} - \hat{H}_0 \rangle.$$
Deriving the Bogoliubov inequality can be made as simple as observing what happens when we have our perturbative Hamiltonian in the partition function. Say we are able to define this perturbation as an intentional collection of terms such that $\hat{H}_P$ is equal to $\hat{H}$, i.e.,
$$\lambda\hat{H}_1 \equiv \lambda\Delta\hat{H} = \hat{H} - \hat{H}_0.$$
Then, the energy approximation is exact, and the partition function $Z$ is
$$Z = \sum e^{-\beta\hat{H}} = \sum e^{-\beta(\hat{H}_0 + \lambda\Delta\hat{H})}.$$
We further transform this equation using some basic algebra, with the intention of reducing it to the form of a free energy:
$$Z = Z_0 \sum e^{-\beta\lambda\Delta\hat{H}} \cdot Z_0^{-1} e^{-\beta\hat{H}_0} = Z_0 \langle e^{-\beta\lambda\Delta\hat{H}} \rangle.$$
Some intervening discussion of this result is necessary. The first step factors the exponential inside the sum and multiplies by $Z_0 Z_0^{-1} = 1$, where $Z_0$ is the partition function using $\hat{H}_0$. The second step uses the mean with respect to the Gibbs distribution to simplify this term even further. In particular, we denote by $\langle e^{-\beta\lambda\Delta\hat{H}} \rangle$ the mean with respect to a Gibbs distribution over $\hat{H}_0$, namely $Z_0^{-1} e^{-\beta\hat{H}_0}$. In this case, it is clear to see where the result comes from.

We now use Jensen’s inequality, a useful general result on convex functions. It states that, when a function $f$ of a variable $x$ is convex, i.e., $f'' > 0$, the mean $\langle f(x) \rangle$ is always greater than or equal to $f(\langle x \rangle)$. Here, since $e^{-x}$ is convex, we apply Jensen’s inequality to get
$$Z_0 \langle e^{-\beta\lambda\Delta\hat{H}} \rangle \geq Z_0\, e^{\langle -\beta\lambda\Delta\hat{H} \rangle}.$$
Taking the logarithm and multiplying by $-k_B T = -\beta^{-1}$, which reverses the inequality, this simplifies to
$$-k_B T \ln(Z) \leq -k_B T \ln(Z_0) + \langle \lambda\Delta\hat{H} \rangle.$$
This is equivalent to
$$F \leq F_0 + \langle \hat{H} - \hat{H}_0 \rangle$$
because $Z_0 \langle e^{-\beta\lambda\Delta\hat{H}} \rangle$ is equal to our partition function $Z$, proven earlier, and $F = -k_B T \ln(Z)$. As such, we recover the Bogoliubov inequality.

The term $F_0 + \langle \hat{H} - \hat{H}_0 \rangle$ is called our variational free energy, denoted by $F_V$, and is the resulting free energy from our perturbative or variational model. It is, in general, greater than or equal to the actual free energy of the system; we only have equality when there is no perturbative component, and $\hat{H} = \hat{H}_0$. In other words, our partition function with respect to our trial Hamiltonian must be our actual partition function. Clearly, that defeats the purpose of using a perturbative method in the first place. We can, on the other hand, assume that $F_V$ is some curve lying above $F$, and minimise it when the term depends on some parameter $\lambda$, to approximate $F$ as closely as possible. We will demonstrate this in the Ising model now.

B. Deriving a Mean Field Model by Variational Methods
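Before the mean field model is built, the inequality proved in the previous subsection can be checked by brute force. The sketch below is only an illustration under stated assumptions: it uses a periodic one-dimensional chain of six spins (small enough to enumerate exactly), a non-interacting trial Hamiltonian $\hat{H}_0 = -m\sum_i s_i$ of the kind introduced in this subsection, and arbitrary values of $J$, $h$ and $\beta$. The function names are hypothetical.

```python
import itertools
import math

def energies(n, J, h):
    """Yield (spins, energy) for a periodic 1D Ising chain of n spins (assumed geometry)."""
    for spins in itertools.product((-1, 1), repeat=n):
        bonds = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        yield spins, -J * bonds - h * sum(spins)

def exact_free_energy(n, J, h, beta):
    """F = -(1/beta) ln Z, with Z summed over all 2^n configurations."""
    Z = sum(math.exp(-beta * E) for _, E in energies(n, J, h))
    return -math.log(Z) / beta

def variational_free_energy(n, J, h, beta, m):
    """F_V = F_0 + <H - H_0>, averaged over the Gibbs distribution of H_0 = -m * sum(s_i)."""
    weighted = []
    for spins, E in energies(n, J, h):
        E0 = -m * sum(spins)
        weighted.append((math.exp(-beta * E0), E - E0))
    Z0 = sum(w for w, _ in weighted)
    F0 = -math.log(Z0) / beta
    mean_diff = sum(w * d for w, d in weighted) / Z0
    return F0 + mean_diff

# Illustrative parameter values, not taken from the text.
n, J, h, beta = 6, 1.0, 0.1, 0.7
F = exact_free_energy(n, J, h, beta)
trial_fields = [0.1 * k for k in range(-30, 31)]
F_V = [variational_free_energy(n, J, h, beta, m) for m in trial_fields]
gap = min(F_V) - F
print(f"exact F = {F:.4f}, best F_V = {min(F_V):.4f}, gap = {gap:.4f}")
```

Every trial field gives a value of $F_V$ at or above the exact $F$, and scanning over $m$ picks out the tightest bound, which is the minimisation carried out analytically below.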
Recall our Hamiltonian that we defined earlier for the Ising model. Suppose we use the following trial Hamiltonian:
$$\hat{H}_0 = -\sum_i m_i s_i$$
rather than the actual Ising Hamiltonian
$$\hat{H} = -J\sum_{i,j} s_i s_j - h\sum_i s_i.$$
This is a system with no interactions, whose spins experience only an effective magnetic field $m_i$, perhaps from neighbouring or coupled spins. In fact, we can further simplify to an isotropic magnetic field $m$, in other words, the same in each direction. Our earlier Bogoliubov inequality becomes the following:
$$F_V = F_0 + \Big\langle \Big(-J\sum s_i s_j - h\sum s_i\Big) - \Big(-m\sum s_i\Big) \Big\rangle.$$
We will proceed to simplify the variational free energy so as to calculate its minimum with respect to $m$.

First, we distribute the expectation into the sums inside. This step needs only the linearity of the expectation, not the previously mentioned Jensen’s inequality.
$$F_V = F_0 + \Big\langle -J\sum s_i s_j - h\sum s_i + m\sum s_i \Big\rangle$$
$$= F_0 - J\sum \langle s_i s_j \rangle - h\sum \langle s_i \rangle + m\sum \langle s_i \rangle$$
$$= F_0 - J\sum \langle s_i s_j \rangle + (m - h)\sum \langle s_i \rangle.$$
We now take the derivative with respect to $m$, intending to minimise the variational free energy by setting $\frac{\partial F_V}{\partial m} = 0$.
$$\frac{\partial F_V}{\partial m} = \frac{\partial}{\partial m}\Big(F_0 - J\sum \langle s_i s_j \rangle + (m - h)\sum \langle s_i \rangle\Big)$$
$$= \frac{\partial F_0}{\partial m} - \frac{\partial}{\partial m}\Big(J\sum \langle s_i s_j \rangle\Big) + \frac{\partial}{\partial m}\Big((m - h)\sum \langle s_i \rangle\Big).$$
We evaluate these terms separately. With $\hat{H}_0 = -m\sum s_i$, the Boltzmann weight is $e^{-\beta\hat{H}_0} = e^{\beta m \sum s_i}$, so
$$\frac{\partial F_0}{\partial m} = -\frac{1}{\beta}\frac{\partial \ln(Z_0)}{\partial m} = -\frac{1}{\beta Z_0}\frac{\partial Z_0}{\partial m} = -\frac{1}{\beta\big(\sum e^{\beta m \sum s_i}\big)}\frac{\partial \big(\sum e^{\beta m \sum s_i}\big)}{\partial m} = -\frac{\sum s_i\, e^{\beta m \sum s_i}}{\sum e^{\beta m \sum s_i}} = -\langle s_i \rangle.$$
$$-\frac{\partial}{\partial m}\Big(J\sum \langle s_i s_j \rangle\Big) = -J\sum \frac{\partial \langle s_i s_j \rangle}{\partial m} = -J\sum \left(\frac{\partial \langle s_i \rangle}{\partial m}\langle s_j \rangle + \langle s_i \rangle \frac{\partial \langle s_j \rangle}{\partial m}\right) = -J\sum \left(\frac{\partial \langle s_i \rangle}{\partial m}\langle s_j \rangle\right).$$
$$\frac{\partial}{\partial m}\Big((m - h)\sum \langle s_i \rangle\Big) = \sum \langle s_i \rangle + (m - h)\frac{\partial \sum \langle s_i \rangle}{\partial m} = \langle s_i \rangle + (m - h)\frac{\partial \langle s_i \rangle}{\partial m}.$$
In the first derivation we use our assumption that the partition function is non-interacting, and so the sum over states is simply the two possible spin states. We also note that the thermal average of spins is present in what might otherwise be an intermediate step, and so we convert it to this form rather than finish the calculation.

In the second derivation we have used an implication of spins being uncorrelated, namely, that $\langle s_i s_j \rangle = \langle s_i \rangle \langle s_j \rangle$. We also use an effective field that only influences a single neighbour, rather than both. Thus, one of the derivatives vanishes. This is a reasonable assumption within the mean field regime, given the set-up of our lattice and our non-interacting spins.

In the final derivation we have simply applied the product rule and then reduced the sum over the thermal average to the single thermal average that exists for the system.

Our final equation looks like this:
$$\frac{\partial F_V}{\partial m} = -\langle s_i \rangle - J\sum \left(\frac{\partial \langle s_i \rangle}{\partial m}\langle s_j \rangle\right) + \langle s_i \rangle + (m - h)\frac{\partial \langle s_i \rangle}{\partial m}.$$
We simplify this to:
$$\frac{\partial F_V}{\partial m} = \frac{\partial \langle s_i \rangle}{\partial m}\Big(-J\sum \langle s_j \rangle + (m - h)\Big),$$
where the two thermal spin averages cancel and the partial derivative $\frac{\partial \langle s_i \rangle}{\partial m}$ factors out.

Now, solving for $\frac{\partial F_V}{\partial m} = 0$, and noting that $\frac{\partial \langle s_i \rangle}{\partial m}$ is non-zero, we have:
$$0 = \frac{\partial \langle s_i \rangle}{\partial m}\Big(-J\sum \langle s_j \rangle + (m - h)\Big) \implies 0 = -J\sum \langle s_j \rangle + (m - h) \implies m = J\sum \langle s_j \rangle + h.$$
Finally, we simplify this by using an expression for the thermal average. In general, the thermal or ensemble average of a statistical variable is a weighted sum over the possible states given by $\hat{H}$,
$$\langle A \rangle = \frac{\sum A_i\, e^{-\beta \sum \hat{H}_i}}{\sum e^{-\beta \sum \hat{H}_i}}.$$
Clearly, one may also define this using a partition function, which we will use now to simplify $\langle s_j \rangle$.
$$Z = \sum e^{-\beta \sum \hat{H}_i} = \Big(\sum e^{-\beta\hat{H}_1}\Big)\Big(\sum e^{-\beta\hat{H}_2}\Big) \dots \Big(\sum e^{-\beta\hat{H}_n}\Big) = \prod_i \Big(\sum e^{-\beta\hat{H}_i}\Big).$$
Here, the sum in the exponent has been factored into a product of individual exponentials.

Recall we have used a thermal average with respect to our trial Hamiltonian, in which each spin contributes an energy $-m s_i$ and hence a weight $e^{\beta m s_i}$, giving us
$$Z_0 = \prod_i \Big(\sum e^{\beta m s_i}\Big) = \prod \Big(e^{\beta m (+1)} + e^{\beta m (-1)}\Big).$$
Any reader familiar with hyperbolic functions will recognise this as the following:
$$\prod 2\cosh(\beta m),$$
and using this in our thermal average, we get:
$$\langle s_j \rangle = \frac{1}{Z_0}\sum_i s_i\, e^{-\beta \sum \hat{H}_i} = \frac{(+1)e^{\beta m (+1)} + (-1)e^{\beta m (-1)}}{2\cosh(\beta m)} = \frac{2\sinh(\beta m)}{2\cosh(\beta m)} = \tanh(\beta m).$$
Using this result, we can define our model as such:
$$m = J\sum \langle s_j \rangle + h = J\sum \tanh(\beta m) + h.$$
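The two closed forms used in this step, $Z_0 = \prod 2\cosh(\beta m)$ and $\langle s_j \rangle = \tanh(\beta m)$, can be confirmed by direct enumeration. The following is a minimal sketch, assuming the trial Hamiltonian $\hat{H}_0 = -m\sum_i s_i$ with a uniform field and five spins; the function name and the parameter values are illustrative only.

```python
import itertools
import math

def trial_statistics(n, m, beta):
    """Brute-force Z_0 and <s_1> for H_0 = -m * sum(s_i), summing over all 2^n states."""
    Z0 = 0.0
    s_weighted = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        weight = math.exp(beta * m * sum(spins))  # e^{-beta * H_0}
        Z0 += weight
        s_weighted += spins[0] * weight
    return Z0, s_weighted / Z0

# Illustrative values, not taken from the text.
n, m, beta = 5, 0.8, 1.3
Z0, s_mean = trial_statistics(n, m, beta)
print(Z0, (2 * math.cosh(beta * m)) ** n)   # the two numbers should agree
print(s_mean, math.tanh(beta * m))          # the two numbers should agree
```

Because the spins are independent under $\hat{H}_0$, the average of any single spin is the same, so checking the first spin is enough.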
The remaining sum runs over neighbouring spins from our earlier $\hat{H}$, so it cannot be removed by considering the effective field as we did previously; however, because of this effective field, it can be reduced to multiplication by the number of neighbours $z$.

So, finally, this is our mean field model of magnetisation:
$$m = zJ\tanh(\beta m) + h.$$

C. Magnetisation in the Ising Model
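The mean field equation $m = zJ\tanh(\beta m) + h$ has no closed-form solution for $m$, but it is straightforward to solve numerically. The sketch below uses bisection to find the largest non-negative root for $h \geq 0$. It assumes units in which $k_B = 1$ and a square lattice with $z = 4$, neither of which is fixed by the text, and the function name is hypothetical.

```python
import math

def mean_field_root(z, J, h, T, k_B=1.0, tol=1e-12):
    """Largest non-negative solution of m = z*J*tanh(m / (k_B*T)) + h, assuming h >= 0."""
    beta = 1.0 / (k_B * T)

    def g(m):
        # g(m) = 0 exactly when m satisfies the self-consistent equation.
        return z * J * math.tanh(beta * m) + h - m

    lo, hi = 1e-9, z * J + h       # g(hi) <= 0 because tanh < 1
    if g(lo) <= 0.0:
        return 0.0                 # only the trivial solution m = 0 remains
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z, J = 4, 1.0                      # assumed square lattice, so z*J/k_B = 4
for T in (2.0, 3.0, 3.9, 4.1, 5.0):
    print(T, round(mean_field_root(z, J, 0.0, T), 6))
```

At zero field a non-zero root appears only when $\beta zJ > 1$, that is, for $T < zJ/k_B$; above that temperature the routine returns zero.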
For a graphical analysis, it is typically good to reduce the number of free variables that we would have to plot. Here, we have two variables of import, $m$ and $T$. Since the constant $zJ$ can be absorbed into the argument of the function, $\beta = \frac{1}{k_B T}$, and we can realistically assume zero external field $h$, we have the self-consistent equation
$$m = \tanh\left(\frac{T_c m}{T}\right). \qquad (1)$$
While this clearly suggests that $T_c = \frac{zJ}{k_B}$, this is still a difficult equation to make sense of, since there is no obvious way to isolate $m$. In fact, because hyperbolic equations are transcendental, there is no way to solve for $m$ algebraically. However, if we rearrange it as
$$\tanh\left(\frac{T_c m}{T}\right) - m = 0,$$
we then have the intersection of the magnetisation value with the magnetisation equation in dimensions $m$, $T$, and $f(T, m)$. This is our self-consistency condition: trivially, the intersection between two surfaces is defined by the set of points at which the functions are equal, such that $f(x, y) - g(x, y) = 0$. Thus, we define our self-consistent solution as (1), which also serves as the projection of the intersection of the two surfaces onto the $m$ and $T$ axes. In other words, it projects the intersection of the two surfaces in $f(T, m)$ to $m(T)$. As such, our magnetisation $m(T)$ is this curve.

Figure 1. Magnetisation function. The surface $f(T, m) = \tanh\left(\frac{T_c m}{T}\right)$ is plotted here for $x = T$ and $y = m$.

Figure 2. Projecting the intersection of the two surfaces. One surface is $y = m$ and the other is $f(T, m) = \tanh\left(\frac{T_c m}{T}\right)$. The locus of points where $f(T, m) = m$ is plotted on the subspace $(T, m)$.

In order to obtain an expression for magnetisation in terms of temperature, we need to use this function and the previously mentioned self-consistency condition. The self-consistency condition constrains the possible solutions to $m$ as the temperature changes. By applying the self-consistency condition to constrain the evolution of magnetisation in the temperature space, we recover the magnetisation of the Ising model.

D. The Thermodynamics of Magnetisation
We will explore the physical meaning and intuition for magnetisation in this section.

Recall the definition of actual free energy, in statistical mechanical terms:
$$F = -\beta^{-1}\ln(Z).$$
We want to expand this into thermodynamical terms, so that we can study state variables in a more concise or apparent way, rather than looking at their underlying statistical structure.

Beginning with this statistical structure, we use a generalisation of Boltzmann’s famous definition of ensemble entropy as proportional to the logarithm of the number of microstates $W$,
$$S = k_B \ln(W).$$
In a system coupled to a heat bath we must generalise to the Gibbs entropy, which is the Boltzmann entropy for microstates without equal likelihood. This becomes
$$S = -k_B \sum_i p_i \ln(p_i),$$
which is clearly the Boltzmann equation when $p_i = W^{-1}$.

We condense the probability of states into the Boltzmann distribution
$$p_i = \frac{e^{-\beta E_i}}{Z},$$
yielding
$$S = -k_B \sum p_i\big(-\beta E_i - \ln(Z)\big).$$
Distributing the sum over the probability of each state $i$, and using $\sum p_i = 1$ and $\sum p_i E_i = \langle E \rangle$, we get:
$$S = k_B\big(\beta \langle E \rangle + \ln(Z)\big),$$
which is equal to the following:
$$S = \frac{\langle E \rangle}{T} + k_B \ln(Z)$$
$$TS = \langle E \rangle + k_B T \ln(Z)$$
$$F = \langle E \rangle - TS.$$
Thus, we have arrived at the macroscopic or Helmholtz free energy. There is an alternate way to do so by using the Laplace transform of an integral over phase space, but that is unnecessary here. Now, we can analyse what happens to the Ising model, macroscopically, during a phase transition.

Specific to our evaluation of the Ising model, and the simplest consideration of the issue, is that as temperature decreases, entropy contributes increasingly less to the system. This should seem correct, because entropy is reflected in disorder and thermal fluctuations cause disorder. As entropy decreases, our system becomes more ordered.

We can look further at this by using the following fact: stable configurations of a system minimise free energy. In that case, we can see by the above that as the entropic contribution to free energy decreases, $F \approx E$. Since the Hamiltonian is a measurement of the total energy of the system, we evaluate this directly, and see that when spins align to +1 (in the ferromagnetic case), the system is in the lowest possible energy state. We may calculate this using $\hat{H}$, remembering that the sum occurs over individual multiplicative pairs of spins $i$ and $j$:
$$\hat{H} = -J\sum_{i,j} s_i s_j.$$
So, for the lowest possible energy state to occur, every product $s_i s_j$ must be positive, since this yields a set of positive numbers multiplied by a negative number $-J$. A product of two spins is positive only when both spins share the same sign, so every spin must point the same way; here we take that direction to be positive.

This still leaves one question unaddressed: why, in principle, do we expect free energy to be minimised in a stable system? It is as simple as the definition of stability: a stable system does not change, and free energy is the capacity for a system to do work, or enact change. Thus, when a system does not change, its free energy is minimised.

In fact, free energy is generally minimised in an equilibrium system, which defines the stable states it can take, even disordered ones.
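The identity $F = \langle E \rangle - TS$ and the low-temperature behaviour described above can both be checked on a system small enough to enumerate. The sketch below is an illustration under assumptions not made in the text: a periodic zero-field chain of eight spins, units with $k_B = 1$, and a hypothetical function name. It computes $F$ from the partition function and, independently, $\langle E \rangle$ and the Gibbs entropy from the Boltzmann probabilities.

```python
import itertools
import math

def thermodynamics(n, J, T, k_B=1.0):
    """Exact F, <E> and Gibbs entropy S for a periodic zero-field Ising chain of n spins."""
    beta = 1.0 / (k_B * T)
    energies = [
        -J * sum(s[i] * s[(i + 1) % n] for i in range(n))
        for s in itertools.product((-1, 1), repeat=n)
    ]
    Z = sum(math.exp(-beta * E) for E in energies)
    probs = [math.exp(-beta * E) / Z for E in energies]
    mean_E = sum(p * E for p, E in zip(probs, energies))
    S = -k_B * sum(p * math.log(p) for p in probs if p > 0.0)
    F = -k_B * T * math.log(Z)
    return F, mean_E, S

n, J = 8, 1.0   # illustrative values
for T in (0.2, 1.0, 5.0):
    F, mean_E, S = thermodynamics(n, J, T)
    print(f"T={T}: F={F:.4f}, <E>-TS={mean_E - T * S:.4f}, <E>={mean_E:.4f}")
```

At the lowest temperature the mean energy sits essentially at the aligned value $-nJ$ and the entropic term $TS$ is small, which is the regime $F \approx E$ discussed above.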
This may seem confusing, but we can also use this to define the tendency to destabilise, and allow entropy to affect spin configurations, in terms of Jaynesian Maximum Entropy [4]. E T Jaynes defined most results in statistical mechanical dynamics as arising from the natural tendency to maximise entropy, which itself is a simple consequence of the second law of thermodynamics and related results. For high temperatures where $F \approx -TS$, clearly, maximising entropy is equivalent to minimising free energy.

We can define a coherent picture for the energy in this system as follows: the Ising model is attached to a temperature bath, with which it is in thermal equilibrium. We use the following definition of macroscopic entropy: $\Delta S_{\text{bath}} = -T^{-1}Q$. The change in energy towards equilibrium is described by a conservation law, where the heat flow is equivalent to the energy leaving the bath and entering the lattice: $Q = -\Delta E_{\text{bath}} = \Delta E_{\text{IM}}$. As such, the change in entropy towards equilibrium is given by the following:
$$-\frac{Q}{T} + \Delta S_{\text{IM}},$$
where $Q$ is the heat flow and $\Delta S_{\text{IM}}$ is the change in lattice entropy. We translate this entropic insight into free energy as follows:
$$\Delta S = -\frac{Q}{T} + \Delta S_{\text{IM}} = \frac{-Q + T\Delta S_{\text{IM}}}{T} = \frac{-\Delta E_{\text{IM}} + T\Delta S_{\text{IM}}}{T} = -\frac{\Delta(E_{\text{IM}} - TS_{\text{IM}})}{T} = -\frac{\Delta F}{T}.$$
Since $\Delta S \geq 0$,
$$\frac{\Delta F}{T} \leq 0.$$
So, regardless of the temperature of the system, an equilibrium state at any $T$ is defined by minimum actual free energy. Once again, at low temperatures, this is accomplished by reducing the actual energy of a configuration, which occurs only when spins are positively aligned.

E. Broken Symmetry and Spin Configuration
We note one final important consideration for magnetisation and for phase transitions in general. We have previously made reference to a non-zero order parameter as characterising a phase transition. In defining what ‘order’ means in the order parameter, we are prompted to consider the configuration of the system, and thus symmetry. An order parameter is, in a more fundamental sense, a measurement of the breaking of symmetry induced by the phase transition. Consider that, with zero field, the Ising Hamiltonian is symmetric under the $\mathbb{Z}_2$ transformation $s \to -s$, a total rotation:
$$\hat{H} = \sum J_{ij} \cdot (-s_i) \cdot (-s_j) \iff \hat{H} = \sum J_{ij} \cdot s_i \cdot s_j.$$
This is obviously true, and yet, the ground state of $\hat{H}$ is not symmetric: all spins must align upwards due to the very energy considerations that previously yielded symmetry. The order parameter is a measurement of how the physics changes in a phase transition. We can demonstrate that the ensemble average of the order parameter vanishes if the symmetries of the Hamiltonian are obeyed. If not, then the order parameter becomes non-zero, and we have broken the symmetry we began with. We saw this directly with magnetisation: $m = 0$ before the phase transition and $m = 1$ afterwards, which is given by the change in configuration from disordered spins to long-range order, the latter of which implies no symmetric mixture of states.

This leads directly to another question, which is more difficult to answer: how is the magnetised state chosen in the non-interacting case, with no symmetry breaking magnetic field $h$? This presents an interesting problem for the mean field approximation, which assumes non-interacting spins, and for a more realistic model where the $J$’s are not homogeneous, and the penalty for magnetising ‘incorrectly’ is not clear. In this case, it is chosen randomly, due to fluctuations at criticality.
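Both statements about symmetry, that the zero-field energy is unchanged under $s \to -s$ and that the ensemble average of the order parameter then vanishes, can be verified exactly on a finite system. The sketch below assumes a periodic chain of eight spins with uniform coupling $J$ (the text allows general $J_{ij}$), and the function names and parameter values are illustrative.

```python
import itertools
import math

def energy(spins, J, h):
    """Periodic-chain Ising energy -J * sum(s_i * s_{i+1}) - h * sum(s_i)."""
    n = len(spins)
    bonds = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
    return -J * bonds - h * sum(spins)

def mean_magnetisation(n, J, h, beta):
    """Exact ensemble average of (1/n) * sum(s_i) over all 2^n configurations."""
    Z = 0.0
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        weight = math.exp(-beta * energy(spins, J, h))
        Z += weight
        total += weight * sum(spins) / n
    return total / Z

n, J, beta = 8, 1.0, 2.0   # illustrative values
flip_invariant = all(
    energy(s, J, 0.0) == energy(tuple(-x for x in s), J, 0.0)
    for s in itertools.product((-1, 1), repeat=n)
)
print(flip_invariant)
print(mean_magnetisation(n, J, 0.0, beta))    # vanishes up to rounding
print(mean_magnetisation(n, J, 0.05, beta))   # a small field selects a direction
```

For any finite number of spins the zero-field average is zero by symmetry, however low the temperature; only an explicit field, here a small positive $h$, tips the average to one side.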
When both states are equally likely, and therefore $\langle m \rangle = 0$, but we expect symmetry to still be broken, a paradox is evident. Indeed, we must ask how we can prove the Ising model will magnetise at all, and what value it is expected to take.

The special case $h = 0$ demands a more sophisticated look at the definition of magnetisation. Degenerate ground states in quantum systems, such as a superposition, or coherence, of possible low energy states, present just this same problem; luckily, there are many methods for characterising such systems. The simplest technique for doing so consists of applying the thermodynamic limit $N \to \infty$ to a perturbative expansion of our degenerate magnetised ground state. For a lattice of size $\Lambda$, this method formalises the calculation of the following (non-commuting) double limit:
$$\lim_{h \to 0} \lim_{\Lambda \to \infty} m_\Lambda(h).$$
The simplest thing we could do is prove why the limit takes this form. We can do this mathematically by showing that the partition function is a sum of smooth (in fact, analytic) functions $e^{-\beta\hat{H}}$, so for finite $N$, the partition function is also smooth. As such, for $h = 0$, symmetry requires that magnetisation $m(h)$ is continuously zero everywhere. On the other hand, an infinite sum of continuous functions is not necessarily continuous itself, and so admits a discontinuity in $m(h)$ in the case of $h \to 0$. In particular, we have m ( h ) > h →

III. CONCLUSION
We have derived, from first principles, a mean field theory as a valid approximation for the two-dimensional Ising model. We began by justifying MFT using the Bogoliubov inequality, and then calculated this mean field theory. We then analysed the meaning of these results, and built an intuition for what the mean field magnetisation equation indicates and why the Ising model magnetises at all. We have explored some typical results in a very important statistical mechanical model, and set the stage for even more involved investigation of the Ising model, phase transitions, and statistical mechanics itself.

[1] A D Bruce and N B Wilding. Scaling fields and universality of the liquid-gas critical point. Physical Review Letters, 68:193–196, Jan 1992.
[2] A Sedrakyan. 3D Ising model as a string theory in three-dimensional Euclidean space. Physics Letters B, 304(3):256–262, 1993.
[3] Stephen Brush. History of the Lenz-Ising model. Reviews of Modern Physics, 39(4):883–893, 1967.
[4] Edwin Thompson Jaynes. Information theory and statistical mechanics. Physical Review, 106:620–630, 1957.
[5] Elliott W Montroll, Renfrey B Potts, and John C Ward. Correlations and spontaneous magnetization of the two-dimensional Ising model. Journal of Mathematical Physics, 4(2):308–322, 1963.
[6] Tai Tsun Wu, Barry M McCoy, Craig A Tracy, and Eytan Barouch. Spin-spin correlation functions for the two-dimensional Ising model: Exact theory in the scaling region. Physical Review B, 13:316–374, Jan 1976.
[7] N N Bogoliubov. Lectures on Quantum Statistics, volume 2 of Lectures on Quantum Statistics. Gordon and Breach, 1970.