Maximum Entropy Principle for the Microcanonical Ensemble
arXiv [cond-mat.stat-mech], Feb
Michele Campisi ∗ and Donald H. Kobe †
Department of Physics, University of North Texas, Denton, TX 76203-1427, U.S.A.
(Dated: May 30, 2018)

We derive the microcanonical ensemble from the Maximum Entropy Principle (MEP) using the phase space volume entropy of P. Hertz. Maximizing this entropy with respect to the probability distribution with the constraints of normalization and average energy, we obtain the condition of constant energy. This approach is complementary to the traditional derivation of the microcanonical ensemble from the MEP using Shannon entropy and assuming a priori that the energy is constant, which results in equal probabilities.
PACS numbers: 05.30.Ch, 05.30.-d, 05.20.Gg, 89.70.+c
Keywords: microcanonical ensemble, maximum entropy principle, constraints, quantum ensemble, classical ensemble, probability distribution
I. INTRODUCTION
The seminal works of Jaynes [1, 2] present the information theory approach to statistical physics using the Maximum Entropy Principle (MEP). In the original papers, Jaynes maximized the Shannon information entropy using constraints of normalization and average energy to obtain the canonical ensemble. Later on, Tsallis [3] maximized generalized information entropies, like the Rényi and Tsallis entropies, using constraints of normalization and average energy to obtain deformed exponential distributions that describe the behavior of nonextensive systems.

In this paper we show that there is also a special information entropy associated with the microcanonical ensemble. This microcanonical information entropy is the phase-space volume entropy, originally due to P. Hertz [4] (see also [5]), which satisfies the heat theorem [6, 7, 8]. Using this entropy in the MEP with constraints of normalization and average energy, we obtain the condition that the energy distribution is a delta function, i.e., we derive the microcanonical ensemble from the MEP.

In Section II we review the traditional application of the MEP to the microcanonical ensemble. The quantum statistical application of the MEP with discrete probabilities using the volume entropy is treated in Section III. The classical statistical application is given in Section IV, which employs integration and functional differentiation with continuous probability distribution functions. The conclusion is given in Section V.
II. TRADITIONAL APPROACH TO THE MICROCANONICAL ENSEMBLE
The traditional MEP is reviewed here to contrast it with our approach and to establish the notation. The traditional approach to the quantum microcanonical ensemble starts with the assumption that the system is isolated and has a fixed energy $U$. Such a macrostate of energy $U$ can be realized in a number $W$ of possible ways, each corresponding to a microstate $i$. One then looks for the probability $p_i$ that the system is in a certain state $i$ with energy $U$. In quantum mechanics $U$ is an eigenvalue $E_\beta$ of the Hamiltonian operator and $W$ is its degeneracy $g_\beta$, i.e., $U = E_\beta$ and $W = g_\beta$. Since we are looking for the probability of a state $i$ that is already assumed to belong to the eigenvalue $E_\beta$, the traditional MEP does not have to use the energy constraint and is

$$ -\sum_{j \in \{ j \mid E_j = E_\beta \}} p_j \log p_j \;-\; \lambda \left( \sum_{j \in \{ j \mid E_j = E_\beta \}} p_j - 1 \right) = \text{maximum}, \tag{1} $$

where the first term is the Shannon entropy and the sums are restricted to states $j \in \{ j \mid E_j = E_\beta \}$.

∗ Electronic address: [email protected]
† Electronic address: [email protected]
The MEP in (1) gives Laplace's Principle of Insufficient Reason,

$$ p_i = \frac{1}{g_\beta} = \text{constant} \quad \text{for } i \in \{ j \mid E_j = E_\beta \}, \tag{2} $$

which shows that the states $j$ in the given macrostate $\beta$ with energy $E_\beta$ are equiprobable. Thus the maximization procedure gives us a flat distribution. With some abuse of terminology Eq. (2) is often referred to as the "microcanonical ensemble," but it is defined only for the states $j$ such that $E_j = E_\beta$. Strictly speaking, the microcanonical ensemble is defined on the whole phase space and constrains the system state to lie on a given surface of constant energy. The microcanonical ensemble of energy $E_\beta$ is really given as [9]

$$ p_i = \frac{1}{g_\beta}\, \delta_{\mathrm{Kr}}(E_i, E_\beta), \tag{3} $$

where $\delta_{\mathrm{Kr}}$ is the Kronecker delta [$\delta_{\mathrm{Kr}}(x, y) = 1$ for $x = y$ and $0$ for $x \neq y$]. The Kronecker delta does not appear in Eq. (2) because it is assumed a priori.

We stress that the traditional approach does not maximize on the whole set of eigenstates of the Hamiltonian but rather on the subset of eigenstates belonging to the eigenvalue $E_\beta$. This approach is quite different from Jaynes's derivation of the canonical ensemble, where $i$ runs over all the energy eigenstates. In the following section we ask the question: Is it possible to derive the microcanonical ensemble in (3) from a suitable MEP performed on the whole set of eigenstates, as Jaynes did for the canonical ensemble?
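The difference between the flat distribution of Eq. (2) and the full microcanonical distribution of Eq. (3) can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: the toy spectrum `energies`, the comparison distribution `skewed`, and the helper names `shannon_entropy` and `microcanonical` are all hypothetical choices made only for illustration.

```python
import math


def shannon_entropy(p):
    """Shannon entropy -sum_j p_j log p_j, with 0 log 0 taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def microcanonical(energies, e_beta):
    """Eq. (3): p_i = delta_Kr(E_i, E_beta) / g_beta over ALL eigenstates."""
    g_beta = sum(1 for e in energies if e == e_beta)  # degeneracy of E_beta
    if g_beta == 0:
        raise ValueError("e_beta is not an eigenvalue of the toy spectrum")
    return [1.0 / g_beta if e == e_beta else 0.0 for e in energies]


# Hypothetical toy spectrum (illustrative values only): E = 1.0 is 3-fold degenerate.
energies = [0.0, 1.0, 1.0, 1.0, 2.0, 2.0]
p = microcanonical(energies, 1.0)

# Eq. (2): on the degenerate subspace alone, the flat distribution has
# entropy log(g_beta); another normalized distribution has less.
flat = [1.0 / 3.0] * 3
skewed = [0.5, 0.3, 0.2]  # arbitrary normalized comparison distribution

# Average energy of the full distribution, i.e. sum_i p_i E_i.
mean_energy = sum(pi * e for pi, e in zip(p, energies))
```

Here `p` is defined on the whole spectrum and vanishes outside the eigenvalue `1.0`, as in Eq. (3), whereas `flat` lives only on the three degenerate states, as in Eq. (2). The comparison with `skewed` checks a single alternative distribution and is not a proof of maximality.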
III. DERIVATION OF THE MICROCANONICAL DISTRIBUTION: QUANTUM CASE
To answer the question posed above, let us proceed by analogy with Jaynes's approach to the canonical ensemble. In order to obtain the canonical distribution,

$$ p_i = Z^{-1} e^{-\beta E_i}, \tag{4} $$

where $Z$ is the partition function and $\beta^{-1}$ is the absolute temperature, one maximizes the Shannon entropy $-\sum_i p_i \log p_i$ under the energy constraint $U = \sum_i p_i E_i$ and the normalization constraint $\sum_i p_i = 1$, where $i$ runs over all energy eigenstates. When the Shannon entropy is evaluated with the maximizing distribution (4) we obtain the correct thermodynamic entropy

$$ \beta U + \log \sum_n e^{-\beta E_n}. \tag{5} $$

This thermodynamic entropy is correct in the sense that it satisfies the heat theorem whenever the averages are calculated over the canonical ensemble [10].

In the microcanonical case the correct thermodynamic entropy that satisfies the heat theorem is given by the logarithm of the volume of phase space enclosed by the hypersurface of energy $U = E_\beta$ [7, 10]. In the quantum version such entropy is

$$ S(U) = \log \Phi(U) \doteq \log \sum_j \theta(U - E_j), \tag{6} $$

where $\theta(x)$ is the step function [$\theta(x) = 1$ for $x \geq 0$, and $0$ for $x < 0$]. Since the maximization now runs over all energy eigenstates, we must use the energy constraint, as we do with the canonical ensemble. Thus we are maximizing (6) under the normalization and average energy conditions,

$$ \sum_j p_j = 1, \qquad \sum_j p_j E_j = U. \tag{7} $$

Using the constraints in Eq. (7), we can rewrite the entropy in Eq. (6) as

$$ S(p) = \log \sum_j \theta\!\left( \sum_k p_k E_k - E_j \sum_k p_k \right), \tag{8} $$

where the sums on $j$ and $k$ are over all states. The discrete probability distribution $p = \{ p_i \}$ for the microcanonical ensemble is obtained when this entropy is an extremum. Differentiating Eq. (8) with respect to $p_i$ and setting the result equal to zero, we obtain

$$ \frac{\partial S}{\partial p_i} = \frac{1}{\Phi(U)} \sum_j \delta(U - E_j)\,(E_i - E_j) = 0, \tag{9} $$

for each state $i$, where $\theta'(x) = \delta(x)$ is the Dirac delta function. We can see by inspection that each term in Eq. (9) vanishes if $E_j \neq U$. When $E_j = U$ the state $i$ must be such that $E_i = E_j$ [because $x\delta(x) = 0$].
In the latter case we have $E_i = U$.

The probability distribution for states $i$ is therefore

$$ p_i = A_i\, \delta_{\mathrm{Kr}}(E_i, U = E_\beta), \tag{10} $$

where the $A_i$ are yet to be determined. The Kronecker delta $\delta_{\mathrm{Kr}}(E_i, E_\beta)$ imposes the restriction that the probability of states $i \notin \{ i \mid E_i = E_\beta \}$ is zero.

Since there is nothing to distinguish different states $i \in \{ i \mid E_i = E_\beta \}$, we can invoke Laplace's Principle of Insufficient Reason, obtained from the traditional MEP approach, to choose $A_i = A_\beta$ to be the same for all states belonging to the same eigenenergy $E_\beta$. Using the constraint of normalization in Eq. (7), we obtain

$$ p_i = \frac{1}{g_\beta}\, \delta_{\mathrm{Kr}}(E_i, E_\beta), \tag{11} $$

which is the microcanonical probability distribution. The only nonzero contributions are from states $i$ with fixed energy $E_i = E_\beta$.

IV. DERIVATION OF THE MICROCANONICAL DISTRIBUTION: CLASSICAL CASE
The derivation of the classical microcanonical distribution proceeds in a way analogous to the quantum derivation. Because we need to use a continuous probability distribution, we must use integration and functional differentiation in the MEP. However, the treatment is sufficiently different to merit some discussion.

The classical counterpart of Eq. (6), the volume entropy of P. Hertz [4], is

$$ S(U) = \log \Phi(U), \tag{12} $$

where $U$ is again the energy. In the classical case the function $\Phi(U)$ is now the volume of phase space enclosed by the hypersurface of energy $U$ [7],

$$ \Phi(U) \doteq \int_{z \in \{ z \mid H(z) \leq U \}} dz = \int dz\, \theta(U - H(z)), \tag{13} $$

where the Hamiltonian is $H(z)$ and the step function $\theta(U - H(z))$ provides the limits for the integral. The phase space coordinate $z = (q, p)$ consists of the set of canonical coordinates $q = \{ q_i \}_{i=1}^{n}$ in $n$-dimensional space and the set of their conjugate canonical momenta $p = \{ p_i \}_{i=1}^{n}$. The element of volume in $2n$-dimensional phase space is $dz = d^n q\, d^n p$, and integration is over all phase space if no limits are shown.

For the classical case, the constraints on normalization and average energy corresponding to Eq. (7) are

$$ \int dz\, \rho(z) = 1, \qquad \int dz\, \rho(z) H(z) = U, \tag{14} $$

respectively, where $\rho(z)$ is the probability density in phase space. The MEP for the classical microcanonical ensemble is analogous to the quantum case. Using Eq. (13) and the constraints of normalization and average energy in Eq. (14), we can rewrite the entropy in Eq. (12) as a functional

$$ S[\rho] = \log \int dz'\, \theta\!\left( \int dz''\, \rho(z'') H(z'') - H(z') \int dz''\, \rho(z'') \right), \tag{15} $$

where the integration is over all phase space. The continuous probability distribution $\rho = \rho(z)$ for the microcanonical ensemble is obtained when this entropy is an extremum. Functionally differentiating Eq. (15) with respect to $\rho(z)$ and setting the result equal to zero, we obtain

$$ \frac{\delta S[\rho]}{\delta \rho(z)} = \frac{1}{\Phi(U)} \int dz'\, \delta(U - H(z'))\,(H(z) - H(z')) = 0. \tag{16} $$

By inspection we see that the integrand vanishes if $z'$ is such that $H(z') \neq U$. When $H(z') = U$ for some values $z'$ we must also have $H(z') = H(z)$ for some values of $z$ [because $x\delta(x) = 0$]. In the latter case we therefore have $H(z) = U$. The distribution function $\rho(z)$ therefore has a delta function that restricts the Hamiltonian to the hypersurface of energy $U$,

$$ \rho(z) = A(z)\, \delta(U - H(z)), \tag{17} $$

where $A(z)$ is an arbitrary function of $z$. Since there is nothing to distinguish different points in phase space $z \in \{ z \mid H(z) = U \}$ that are all on the energy hypersurface, we can invoke Laplace's Principle of Insufficient Reason to choose $A(z) = A_U$, which is constant for fixed $U$ for all these phase space points. The normalization condition in Eq. (14) then becomes

$$ \int dz\, \rho(z) = \int dz\, A(z)\, \delta(U - H(z)) = A_U \int dz\, \delta(U - H(z)) = 1. \tag{18} $$

The last integral in Eq. (18) can be performed by making a change of variables to $e = H(z)$, which gives

$$ \int dz\, \delta(U - H(z)) = \int de\, \frac{dz}{de}\, \delta(U - e) = \left( \frac{dz}{de} \right)_{e = U} \equiv \Omega(U), \tag{19} $$

where the function $\Omega(U)$ is the density of states for energy $U$, i.e., the number of states per unit energy. Substituting Eq. (19) into Eq. (18), we obtain $A_U = \Omega(U)^{-1}$. Therefore, the probability distribution function in phase space in Eq. (17) becomes

$$ \rho(z) = \frac{1}{\Omega(U)}\, \delta(U - H(z)), \tag{20} $$

which is in fact the well-known classical microcanonical distribution. If the phase space point $z$ is not on the energy hypersurface $U = H(z)$, the probability density is zero. This probability density is analogous to the probability distribution in Eq. (11) for the quantum case, where the degeneracy $g_\beta$ corresponds to the density of states $\Omega(U)$.

V. CONCLUSION