[PDF] Calculation of the Residual Entropy of Ice Ih by Monte Carlo simulation with the Combination of the Replica-Exchange Wang-Landau algorithm and Multicanonical Replica-Exchange Method

Abstract

We estimated the residual entropy of ice Ih by the recently developed simulation protocol, namely, the combination of Replica-Exchange Wang-Landau algorithm and Multicanonical Replica-Exchange Method. We employed a model with the nearest neighbor interactions on the three-dimensional hexagonal lattice, which satisfied the ice rules in the ground state. The results showed that our estimate of the residual entropy is found to be within 0.038 % of series expansion estimate by Nagle and within 0.000077 % of PEPS algorithm by Vanderstraeten. In this article, we not only give our latest estimate of the residual entropy of ice Ih but also discuss the importance of the uniformity of a random number generator in MC simulations.

Full PDF

aa r X i v : . [ c ond - m a t . s t a t - m ec h ] S e p Calculation of the Residual Entropy of Ice Ihby Monte Carlo simulation with the Combination of the Replica-ExchangeWang-Landau algorithm and Multicanonical Replica-Exchange Method

Takuya Hayashi , Chizuru Muguruma , and Yuko Okamoto , , Department of Physics, Graduate School of Science,Nagoya University, Nagoya, Aichi 464-8602, Japan Faculty of Liberal Arts and Sciences, Chukyo University, Nagoya, Aichi 466-8666, Japan Center for Computational Science, Graduate School of Engineering,Nagoya University, Nagoya, Aichi 464-8603, Japan Information Technology Center, Nagoya University, Nagoya, Aichi 464-8601, Japan

038 % of series expansion estimateby Nagle and within 0 . I. INTRODUCTION

After the experimental discovery that the ice Ih has non-zero residual entropy near zero temperature [1],the theoretical explanation about the origin was proposed by the ice rules [2], which considered the hydrogenbonds between water molecules in ice [3]. The residual entropy per one water molecule S is proportional tothe logarithm of the number of degrees of freedom of the orientations of one water molecule W : S = k B ln W . (1)Here, k B is the Boltzmann constant and the value is 1 . / deg]. The estimate by Pauling was W Pauling0 =1 . S Pauling0 ≃ .

806 [3]. It was in accord with the experimental value S Experiment0 = 0 . Multicanonical (MUCA)

Monte Carlo (MC)

Method [10, 11] (for reviews, see, e.g., [12, 13]). After these simulation modelswere suggested, many research groups estimated the residual entropy by various computational approachesfor the last decade (see, e.g., [14–18]). The estimates by computer simulations seem to be equal to or moreaccurate than theoretical estimate by Nagle. Although the residual entropy of ice is becoming one of goodexamples to test the accuracy of simulation algorithms, there seem to be small disagreements among theestimates. The exact residual entropy of Ice Ih has yet to be obtained.In this article, we present our latest estimate of the residual entropy by the recently proposed MC simula-tion with the combination of

Replica-Exchange Wang-Landau algorithm (REWL) [19, 20] and

MulticanonicalReplica-Exchange Method (MUCAREM) [21–23], which we refer to as REWL-MUCAREM [24]. REWL-MUCAREM can give us high precise estimates of the density of state (DOS) and the entropy under appro-priate computational conditions. We employed the 2-state model proposed in [6]. Our latest result is in goodagreement with the estimates by several research groups which used other simulation methods. In addition,we also report that the uniformity of the random numbers is important for MC simulations.This article is organized as follows. We summarize the results of previous researches brieﬂy in Sec. II. InSec. III, we explain the ice model that we employed and the REWL-MUCAREM protocol. In Sec. IV, thesimulation details are given. In Sec. V, our results are presented, and Sec. VI is devoted to conclusions. InAppendix A, the importance of random numbers is discussed.

II. RESIDUAL ENTROPY

Figure 1 shows the hexagonal crystal structure of ice Ih in two-dimensional projections. Figures 1(a) and1(b) correspond to the projection to the xy -plane and the yz -plane, respectively. We assume that the watermolecules exist as H O molecules in ice and hydrogen atoms can occupy one of the two places on each bondaccording to the ice rules in [3]: (1) there is one hydrogen atom on each bond, and (2) there are two hydrogenatoms near each oxygen atom.Suppose that there are N water molecules. The number of hydrogen atoms is 2 N . The theoretical residualentropy S per one water molecule is deﬁned by: S = k B ln WN = k B ln W , (2)where W = ( W ) N . (3)Here, W is the total number of conﬁgurations of water molecules which satisfy the two ice rules. By deﬁning W as the number of orientations per one water molecule, Pauling estimated the value to be [3] W Pauling0 = 1 . . (4)His strategy is as follows: ignoring the second ice rule (two hydrogen atoms exist near each oxygen atom),2 N conﬁgurations can be considered because each hydrogen atom is given the choice of two positions oneach bond. There are 16 arrangements of the four hydrogen atoms around one oxygen atom and the only 6arrangements can satisfy the second ice rule. Thus, the total number of conﬁgurations W that satisﬁes theice rule (1) and ice rule (2) simultaneously is: W = ( W Pauling0 ) N = 2 N × (cid:18) (cid:19) N = (cid:18) (cid:19) N . (5)Eq. (5) can be converted to the residual entropy as S Pauling0 = k B ln (cid:16) W Pauling0 (cid:17) = 0 . · · · [cal / deg mole] . (6)Onsager and Dupuis showed that W Pauling0 = 1 . W Nagle0 = 1 . , (7)and S Nagle0 = 0 . · · · [cal / deg mole] . (8)Here, the error bar is not statistical but reﬂects higher-order corrections of the expansion, which are notentirely under control. In terms of theoretical approximation, another series expansion method, which usednumerical linked cluster (NCL) expansion, were proposed [25].With the development of computer science, many research groups have tried to estimate the residualentropy by various computational approach (for example, Thermodynamic Integration method, Wang-Landaualgorithm, and PEPS algorithm) [14–18]. However, there remain small diﬀerences between these results. Wegive our latest estimate by REWL-MUCAREM protocol in this article. III. MODELS AND METHODSA. Models

We used the 2-state model [6]. In this model, we do not consider distinct orientations of the water molecule(the ice rule (2) is ignored), but allow two positions for each hydrogen nucleus between two oxygen atoms(the ice rule (1) is always satisﬁed). The total potential energy E of this system is deﬁned by E = − X i f (cid:0) i, b i , b i , b i , b i (cid:1) , (9)where i stands for a site number of oxygen atoms. The sum is over all sites (oxygen atoms) of the lattice.The function f is given by f (cid:0) i, b i , b i , b i , b i (cid:1) =  i i i (10)The ground state of this model fulﬁlls the two ice rules completely. The energy at the ground state E ground is − N . Because the normalization (the total number of conﬁgurations P E n ( E ) is 2 N where n ( E ) isthe number of states at energy E ) is known, MUCA simulations allow us to estimate the number of theconﬁgurations at the ground state accurately by calculating the ratio of ˜ n ( E ground ) to P E ˜ n ( E ) [26]. Here,˜ n ( E ) is the estimates obtained from MUCA simulations. B. Methods

We used an advanced generalized-ensemble MC algorithm that we recently developed, REWL-MUCAREM[24]. In this protocol, the multicanonical weight factor (i.e., the inverse of the DOS) is determined roughlyby a REWL simulation and then the weight factor is reﬁned by repeating MUCAREM simulations.A brief explanation of MUCA [10–13] is now given here. The multicanonical probability distribution ofpotential energy P MUCA ( E ) is deﬁned by P MUCA ( E ) ∝ g ( E ) W MUCA ( E ) ≡ const , (11)where W MUCA ( E ) is the multicanonical weight factor, the function g ( E ) is the DOS, and E is the totalpotential energy. By omitting a constant factor, we have W MUCA ( E ) = 1 g ( E ) . (12)In MUCA MC simulations, the trial moves are accepted with the following Metropolis transition probability w ( E → E ′ ): w ( E → E ′ ) = min (cid:20) , W MUCA ( E ′ ) W MUCA ( E ) (cid:21) = min (cid:20) , g ( E ) g ( E ′ ) (cid:21) . (13)Here, E is the potential energy of the original conﬁguration and E ′ is that of a proposed one. After a longproduction run, the best estimate of DOS can be obtained by the single-histogram reweighting techniques[27]: g ( E ) = H ( E ) W MUCA ( E ) , (14)where H ( E ) is the histogram of sampled potential energy. Practically, the W MUCA ( E ) is set to exp[ − βE ]at ﬁrst and modiﬁed by repeating sampling and reweighting. Here, β is the inverse of temperature T ( β = 1 /k B T ).The Wang-Landau (WL) algorithm [28, 29] also uses 1 /g ( E ) as the weight factor and the Metropoliscriterion is the same as in Eq. (13). However, g ( E ) is updated dynamically as g ( E ) → f × g ( E ) duringthe simulation when the simulation visits a certain energy value E . f is a modiﬁcation factor. We continuethe updating until the histogram H ( E ) becomes ﬂat. If H ( E ) is ﬂat enough, a next simulation beginsafter resetting the histogram to zero and reducing the modiﬁcation factor (usually, f → √ f ). The ﬂatnessevaluation can be done in various ways. This process is terminated when the modiﬁcation factor attains apredetermined value f ﬁnal , and exp(10 − ) ≃ .

000 000 01 is often used as f ﬁnal . Hence, the estimated g ( E )tends to converge to the true DOS of the system within this much accuracy set by f ﬁnal .MUCA can be combined with Replica-Exchange Method (REM) [30–32] for more eﬃcient sampling. (REMis also referred to as

Parallel Tempering [33].) The method is referred to as MUCAREM [21–23]. In MU-CAREM, the entire energy range of interest [ E min , E max ] is divided into M sub-regions, E { m } min ≤ E ≤ E { m } max ( m = 1 , , · · · , M ), where E { } min = E min and E { M } max = E max . There should be some overlaps between theadjacent regions. MUCAREM uses M replicas of the original system. The weight factor for sub-region m isdeﬁned by [21–23]: W { m } MUCA ( E ) =  e − β { m } L E , for E < E { m } min , g m ( E ) , for E { m } min ≤ E ≤ E { m } max ,e − β { m } H E , for E > E { m } max , (15)where g m ( E ) is the DOS for E { m } min ≤ E ≤ E { m } max in sub-region m , β { m } L = dk B ln [ g m ( E )] /dE ( E = E { m } min )and, β { m } H = d ln [ g m ( E )] /dE ( E = E { m } max ). The MUCAREM weight factor W MUCAREM ( E ) for the entireenergy range is expressed by the following formula: W MUCAREM ( E ) = M Y m =1 W { m } MUCA ( E ) . (16)After a certain number of independent MC steps, replica exchange is proposed between two replicas, i and j ,in neighboring sub-regions, m and m + 1, respectively. The transition probability, w MUCAREM , of this replicaexchange is given by w MUCAREM = min " , W { m } MUCA ( E j ) W { m +1 } MUCA ( E i ) W { m } MUCA ( E i ) W { m +1 } MUCA ( E j ) , (17)where E i and E j are the energy of replicas i and j before the replica exchange, respectively. If replica exchangeis accepted, the two replicas exchange their weight factors W { m } MUCA ( E ) and W { m +1 } MUCA ( E ) and energy histogram H m ( E ) and H m +1 ( E ). The ﬁnal estimate of DOS can be obtained from H m ( E ) after a long productionsimulation by the multiple-histogram reweighting techniques [34, 35] or weighted histogram analysis method(WHAM) [35]. Let n m be the total number of samples for the m -th energy sub-region. The ﬁnal estimate ofDOS, g ( E ), is obtained by solving the following WHAM equations self-consistently by iteration [21–23]:  g ( E ) = M X m =1 H m ( E ) M X m =1 n m exp ( f m ) W { m } MUCA ( E ) , exp ( − f m ) = X E g ( E ) W { m } MUCA ( E ) . (18)Repeating these MUCAREM sampling and WHAM reweighting processes can obtain more accurate DOS.Although ordinary REM is often used to obtain the ﬁrst estimate of DOS in the MUCAREM iterations, weused the results of REWL simulation [19, 20] instead of the ﬁrst REM run because REWL is stable and itcan give more accurate DOS.The REWL method is essentially based on the same weight factors as in MUCAREM, while the WL simu-lations replace the MUCA simulations for each replica. This simulation is terminated when the modiﬁcationfactors on all sub-regions attain a certain minimum value f ﬁnal . After a REWL simulation, M pieces of DOSfragments with overlapping energy intervals are obtained. The fragments need to be connected in order todetermine the ﬁnal DOS in the entire energy range [ E min , E max ]. The joining point for any two overlappingDOS pieces is chosen where the inverse microcanonical temperature β (= ∂S ( E ) /∂E ) coincides best [19, 20].This connecting process can be omitted in REWL-MUCAREM because the estimated DOS from WHAM isused directly as multicanonical weight factor in MUCAREM. After repeating MUCAREM several times, theDOS with highest accuracy is obtained. In this article, ordinary MUCA simulations were performed afterREWL-MUCAREM for estimating the errors. IV. COMPUTATIONAL DETAILS

The total number of water molecules N is given by n x × n y × n z , where n x , n y , and n z are the numbersof sites along the x, y, and z axis, respectively (see Fig. 1). The total number of sites (i.e., total numberof oxygen atoms) is N and the total number of hydrogen atoms is 2 N . The values of n x , n y , and n z arerestricted to n x = 1 , , , · · · , n y = 4 , , , · · · , and n z = 2 , , , · · · , because we used periodic boundaryconditions (PBC). The total number of molecules considered was N = 128 , , , , , , N times.The REWL-MUCAREM protocol was used in order to obtain the DOS. It corresponds to the numberof conﬁguration n ( E ) at E . In MUCA and WL MC simulations, it is necessary to determine the entireenergy range [ E min , E max ] before starting simulations. We selected the values as follows: [ E min , E max ] =[ − N, − N/ E min corresponds to the ground state and E max corresponds to the energy value aroundwhich the entropy takes the maximum value (see Fig. 2). Figure 2 shows the typical dimensionless entropy(ln n ( E )) per one water molecule in ice Ih, which was estimated by our additional simulation for the system N = 2880 under the condition [ E min , E max ] = [ − N, E/N = − /

4. Thus, the inverse temperature β takes the value 0 at E max . Under the condition[ E min , E max ] = [ − N, − N/ − N ≤ E ≤ − N/ β = 0 is obtained in E > − N/

4. The n ( E ) is summed up to themaximum energy which obtained during simulations in order to estimate the total number of conﬁgurations.Although it is desirable to take E max = 0 in order to estimate W with high accuracy according to ournormalization, it is suﬃcient that E max is − N/ E = − N/ E = − N/ n ( E ) which was normalized at E min per onewater molecule for the system N = 128. It was summed up from E min to E ( E is a certain energy value).The summation is saturated at a bit larger energy than E/N = − /

4. The diﬀerence between the asymptoticvalue and the value 16 /

6, which is the inverse value of Pauling’s estimate 6 /

16, represents the eﬀects of closedloops in [4]. The inset in Fig. 3 shows the n ( E ) directly. Most of the total number of conﬁgurations aredistributed around the peak. These results implies that the sum of the number of states which takes muchhigher energy than E/N = − / W under the condition [ − N, − N/

4] with the estimate underthe condition [ − N,

0] up to the N = 2880 system, the diﬀerence was small enough within errors. As a resultfor [ E min , E max ] = [ − N, − N/ H min /H max > . f → √ f . Here, H min is the smallest and H max is the largest valueof the histogram H ( E ). We iterated the reducing f process 20 times and we set f ﬁnal ≃ . × − .Once a rough estimate of DOS was obtained by REWL, MUCAREM samplings and WHAM reweightingprocesses were then repeated 5 times in order to get more precise DOS. The total number of MC sweeps foreach MUCAREM was 2 . × sweeps.After we obtained a DOS by REWL-MUCAREM, MUCA production runs were performed M = 32 timesindependently for evaluating the residual entropy and errors. Average values and errors were obtained bythe following standard formulae: n ( E min ) = M X i =1 n ( E min ) { i } M , ε n = vuuuut M X i =1 (cid:16) n ( E min ) { i } − n ( E min ) (cid:17) M ( M − . (19)Here, n ( E min ) { i } is a measured value from the i -th simulation ( i = 1 , , · · · , M ). The total number ofMC sweeps for measurement was 6 . × sweeps for each MUCA production run. The single-histogramreweighting techniques were employed in order to obtain estimates for W .Random number generators have a large eﬀect on the MC method (see Appendix A). In this article, theMersenne Twister random number generator was employed [36]. We used the program code on open source[37]. V. RESULTS AND DISCUSSION

Figure 4 shows the time series of the energy-range index of one of the replicas (Replica 1) during theﬁnal MUCAREM simulation for the N = 4704 system. Here, we used 32 replicas. The total energy range[ E min , E max ] was divided into 32 sub-regions. E min was − E max was − N = 4704 system. Each energy labelcorresponds to the sub-region m . Although we used 32 sub-regions, the only four sub-regions ( m = 1 , , , N = 4704 and Fig.8 shows the energy histogram obtained after the MUCA production runs which used the ﬁnal DOS as theweight factor. The ideal MUCA weight factor makes a completely ﬂat histogram. The ﬂatness ( H min /H max )after MUCA production runs are listed in TABLE II, and the values are larger than 0 . .

5. It means that our estimate of DOS by theREWL-MUCAREM protocol is very accurate indeed. Similar results were obtained in all system sizes.The tunneling events during the MUCA production runs were also counted. Here, a tunneling event isdeﬁned by a trajectory that goes from E min to E max and back (or goes from E max to E min and back).TABLE II lists the total number of tunneling events of 32 independent MUCA production runs. A lot oftunneling events were indeed observed in all system sizes. It implies that the observed conﬁgurations changeddramatically during the simulation many times. We concluded that our REWL-MUCAREM protocol andMUCA production run worked properly from these results.Our estimates of W are also listed in TABLE II. The values obtained from Eq. (19) and the extrapolationare shown in Fig. 9. We used the following form as an extrapolation formula: W (cid:18) N (cid:19) = W (0) + a (cid:18) N (cid:19) θ . (20)Here, we have θ = 1 reﬂects bond correlations in the ground state [6]. The ﬁnal estimate of W ThisWork0 (whichis equal to W (0) in Fig. 9) is given in the last row of TABLE III. The data points for smaller lattice sizesare included in the ﬁt, but not shown in Fig. 9 because we would like to focus on the large lattice N region.The ﬁnal estimate is W (0) = 1 . ± . . (21)This estimate converts into S = 0 . ± . / deg mole] . (22)The parameters of the ﬁt is also consistent and their values are a = 1 . ± . θ = 0 . ± . W with the results of other research groups. In Fig. 10,the estimation values of W with their error bar were plotted. Various calculation methods for S and W and their calculated values were summarized in TABLE III. The relative error between our result and theestimate of Nagle is 0 .

038 %. We used the following formula as relative error. ε = | A − A | A . (23)Here, A is our measured value and A is Nagle’s theoretical estimate. Our previous evaluation in 2012 [9]by MUCA showed that the diﬀerence was 0 .

017 %. However, we considered that our latest estimate is morereliable than that of previous one because of the accuracy of the random number generator. The Metropoliscriteria based on MUCA weight factor in Eq. (13) might not have worked properly in large systems (especially,the system for N = 2880: see Appendix A) in [9]. Our latest estimate is within the error of the estimatesby MUCA simulation in 2007 [6], in which the problem of random number generator did not occurred. Inorder to estimate the residual entropy with higher accuracy than our latest results, the calculation of W on systems larger than N = 4704 will be necessary. Although our latest results are slightly diﬀerent fromour previous results in [9], three diﬀerent computational approaches (PEPS algorithm [18], ThermodynamicIntegration [15] and REWL-MUCAREM) give almost the same estimates. VI. CONCLUSIONS

Although the theoretical or experimental estimate is still diﬃcult, the residual entropy of ice Ih is becomingone of good models for testing the accuracy of simulation algorithms because of the rapid computationaldevelopment in recent years. However, there seem to be small disagreements among the results of thesesimulations. The exact residual entropy of Ice Ih has yet to be obtained. In this article, we estimated theresidual entropy by the REWL-MUCAREM simulations. Although our ﬁnal estimate is slightly diﬀerentfrom that of the previous MUCA simulation in [9], it agreed well with the results of several simulation groupsand three diﬀerent computational groups gave almost same estimates. We also discussed the importance ofthe uniformity of pseudo random number generators in Appendix A. iThe REWL-MUCAREM strategy can be useful to estimate DOS with high accuracy for the systems whichhave rough energy landscapes, for example, spin-glass or protein systems. By combining with the reweightingtechniques, more information about the systems can be obtained in detail. In addition, REWL-MUCAREMprotocol can also be used in molecular dynamics (MD) simulations. The problem of discrete random numbersin MC simulations can be avoid by MD simulations. (Perhaps,

Statistical temperature molecular dynamicsmethod (STMD) [39, 40] or meta-dynamics algorithm [41–43], which has a close relationship to WL, is properto the systems.) In this case, we can incorporate many techniques which improve the eﬃciency of sampling(e.g., [44]) into REWL-MUCAREM MD. We hope that the REWL-MUCAREM strategy will give us morereliable insights into complex systems.

Acknowledgements:

Some of the computations were performed on the supercomputers at the Supercomputer Center, Institutefor Solid State Physics, University of Tokyo.

Appendix A: The Eﬀects of Random Numbers on Multicanonical Monte Carlo Simulations

There is no doubt that the quality of pseudo random number generators strongly aﬀects the results ofMonte Carlo simulations. Pseudo random number generators have their own characteristics, for example,periodicity of random numbers. Here, we would like to discuss the minimum value which can be generatedby random number generators and the eﬀects on the MUCA MC simulations.We used two well-known pseudo random number generators, namely, Marsaglia pseudo random numbergenerator [38] and Mersenne Twister pseudo random number generator [36]. Marsaglia generator was em-ployed in our previous studies [6, 8, 9]. Mersenne Twister generator was used in this work. The source codesare found in [13, 37].In order to compare the accuracy of random numbers, pseudo random numbers were generated 10 timesby these generators. The generated values less than 5 . × − by Marsaglia generator (green dots) andMersenne Twister generator (purple dots) are plotted in Fig. A1. Although random numbers by MersenneTwister generator seems to make a uniform distribution, we can see a discrete distribution by Marsagliagenerators. Thus, samples by Marsaglia make green lines in Fig. A1. The minimum random number valueby Marsaglia generator was 0 and the next minimum value was 5 . × − . The random seeds wereSeed1= 11 and Seed2 = 20. It means that Marsaglia generator we employed cannot generate the valueswithin (0 , . × − ) as a random number. On the other hand, the minimum random number value byMersenne Twister generator (the random seed is 4357) was 0 and the next minimum value was 2 . × − in our test, which is smaller than the value 5 . × − by Marsaglia.In the 2-state model, the transition probability w ( X → X ) = exp( − ∆ S ), where ∆ S = ln n − ln n ,during MUCA simulations from the ground state X to the ﬁrst excited state X are shown in Fig. A2. Theinset in Fig. A2 shows the diﬀerences of the estimate of entropy ∆ S between the ground state (the valueof entropy is ln n ) and the ﬁrst exited state (the value of entropy is ln n ). It is clear that the diﬀerencebecomes larger as the number of molecules increases. Thus, the acceptance probability around the groundstate becomes small. The w ( X → X ) is approximately to e − . ( ≃ . × − ) for N = 4704. TheMarsaglia generator would not work properly because of the badness of the uniformity of random numbers.In addition, we might not have obtained a proper estimate for N = 2880 in our previous work in [9]. Thisis the reason why our latest estimate of residual entropy ( S = 0 . ± . / deg mole]) in thisarticle is diﬀerent from our previous result ( S = 0 . ± . / deg mole]). Note that thereare a sophisticated Marsaglia random number generator to alleviate the discrete problem by combining twoMarsaglia random numbers into one [13]. TABLE I. Initial conditions in REWL-MUCAREM simulations.

N n x n y n z No. of replicas Replica Exchange a WL criteria b Total MC sweeps Total MC sweepsfor REWL c for MUCAREM d

128 4 8 4 4 250 500 1 . × . × × . × . × × . × . × × . × . × × . × . × × . × . × × . × . × × . × . × × a The interval of replica exchange trial (MC sweeps) in REWL and MUCAREM. b The interval of WL criteria check (MC sweeps) in REWL. c Total MC sweeps per each replica that is required for all WL weight factors f to converge to f ﬁnal in REWL. d Total MC sweeps per each replica in MUCAREM. MUCAREM simulations were repeated 5 times.TABLE II. Estimated residual entropy of Ice Ih.

N n x n y n z Tunneling a Flatness b W S

128 4 8 4 7612228 0 . . . . . . . . . . . . . . . . . . . . . . . . ∞ ﬁtting 1 . . a The total counts of observed tunneling events during 32 MUCA production runs. b The value of ﬂatness ( H max /H min ) after 32 MUCA production runs. * The values in parentheses represent the errors obtained by 32 MUCA production runs and ﬁtting, using Eq. (19).TABLE III. Comparing the estimates of various methods.Group Methods W ∆ W S ∆ S Nagle [5] Series expansion 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . (a) n y n x up down

012 0 1 2 3 4 xy - plane (b) n z n y Oxygen (up)Oxygen (down)Hydrogen

FIG. 1. Two-dimensional projection of ice Ih. (a) shows the projection to the xy -plane and (b) shows the projectionto the yz -plane. The scale is diﬀerent from the actual ice Ih structure for simplicity. n x , n y , and n z are the numbersof sites along the x, y, and z axis, respectively. The total number of water molecules N is given by n x × n y × n z .The red triangles imply that the lattice points exist above the xy -plane, and the blue triangles imply that the latticepoints exist below the xy -plane in (a). Oxygen atoms are located on lattice points. The triangles in (b) also representthe oxygen atoms. The dotted lines represent the hydrogen bonds pair of oxygen atoms. The ﬁlled green circles arehydrogen atoms on chemical bonds. Hydrogen atoms can occupy one of the two places on each bond according to theice rules.FIG. 2. Typical dimensionless entropy ln n ( E/N ) perone water molecule of ice Ih as a function of po-tential energy per site (

E/N ). The values were ob-tained by additional REWL-MUCAREM simulationfor N = 2880 system. ln n (0) is set to ln(2) becausethe possible conﬁgures at E = 0 are 2. The entropytakes the maximum value at the energy E/N = − /

4. FIG. 3. The summation of n ( E/N ) from E min to E for the system N = 128. The value is normalized at E min per one water molecule. The horizontal greenline shows the inverse of Pauling’s estimate (6 / E/N = − /

4. The inset showsthe n ( E/N ) we obtained. Here, n (0) is set to 2. Thehorizontal orange line shows the total number of con-ﬁgurations ( P E n ( E/N ) = 2 N ). n ( E/N ) takes themaximum value at E = − / (cid:0)

0 5x10 (cid:8) E n e r gy L a b e l Time-step

FIG. 4. History of the energy-range index (Energylabel) of one of the replicas (Replica 1) during theﬁnal MUCAREM simulation for N = 4704. -9500-9000-8500-8000-7500-7000-6500-6000-5500 0 5x10 (cid:9)(cid:10)(cid:11)(cid:12)(cid:13)(cid:14) (cid:15)(cid:16)(cid:17)(cid:18) E Time-step

FIG. 5. History of the potential energy of one ofthe replicas (Replica 1) during the ﬁnal MUCAREMsimulation for N = 4704. (cid:19)(cid:20)(cid:21) (a) Energy label 1 (b) Energy label 2 H ( E ) [ × ] (c) Energy label 3 (d) Energy label 4 E FIG. 6. Histograms of potential energy obtained by the ﬁnal MUCAREM simulation of the water molecules N = 4704.Each energy label corresponds to the sub-region m . Sub-regions have an overlap of about 80 % between neighboringsub-regions. Each histogram shows a ﬂat distribution. (cid:22)(cid:23)(cid:24) (cid:25)(cid:26)(cid:27) (cid:28)(cid:29)(cid:30)(cid:31)!" -9500 -9000 -8500 -8000 -7500 -7000 -6500 -6000 -5500 N = l n n ( E )( × E FIG. 7. The entropy as a function of energy E es-timated by the REWL-MUCAREM simulation for N = 4704. Here, the value of ln n ( E ) at E = − *+,-./012458 -9500 -9000 -8500 -8000 -7500 -7000 -6500 -6000 -5500 H ( E ?@ × A B E FIG. 8. Total histogram of potential energy obtainedby the MUCA production simulation for N = 4704.The entropy in Fig. 7 was used as the MUCA weightfactor. Nagle W FIG. 9. The degree of freedom of the orientation of one water molecule W (1 /N ) at ground state as the function ofthe inverse of N . Error bars are smaller than the symbols. CDEFGH IJKLMN OPQRST

Nagle (1966) Berg (2007)Berg (2012)Herrero (2013)Kolafa (2014)Ferreyra (2018) Vanderstraeten (2018)This work (2020) R e s ea r c h g r oup s W FIG. 10. Evaluates of W by several research groups.FIG. A1. Generated random numbers byMarsaglia generator (green) and MersenneTwister generator (purple). Although the pur-ple dots seem to be distributed uniformly,green dots only take nine discrete values(0 , . , . , . , . , . , . , . , . × − ]). FIG. A2. Transition probability ( ω ( X → X ) =exp[ − ∆ S ]) from the ground state X (the estimateddimensionless entropy is ln n ) to the ﬁrst excitedstates X (the estimated dimensionless entropy isln n ) in the 2-state model. Here, ∆ S is deﬁned by∆ S = ln n − ln n . The inset shows ∆ S . The shadedarea (light blue region) corresponds to the range ofthe ordinate in Fig. A1. [1] W. F. Giauque and M. Ashley, Phys. Rev. , 81 (1933).[2] J. D. Bernal and R. H. Fowler, J. Chem. Phys. , 515 (1933).[3] L. Pauling, J. Am. Chem. Soc. , 2680 (1935).[4] L. Onsager and M. Dupuis, Rend. Sc. Int. Fis. Enrico Fermi , 294 (1960).[5] J. F. Nagle, J. Math. Phys. , 1484 (1966).[6] B. A. Berg, C. Muguruma, and Y. Okamoto, Phys. Rev. B , 092202 (2007).[7] B. A. Berg and W. Yang, J. Chem. Phys. , 224502 (2007).[8] C. Muguruma, Y. Okamoto, B. A. Berg, Phys. Rev. E , 041113 (2008).[9] B. A. Berg, C. Muguruma, and Y. Okamoto, Mol. Sim. , 856 (2012).[10] B. A. Berg and T. Neuhaus, Phys. Lett. B , 249 (1991).[11] B. A. Berg and T. Neuhaus, Phys. Rev. Lett. , 9 (1992).[12] W. Janke, Physica A , 164 (1998).[13] B. A. Berg, Markov Chain Monte Carlo Simulation and Their Statistical Analysis (World Scientiﬁc, Singapore,2004).[14] C. P. Herrero and R. Ram´ırez, Chem. Phys. Lett. , 568 (2013).[15] J. Kolafa, J. Chem. Phys. , 204507 (2014).[16] M. V. Ferreyra, G. Giordano, R. A. Borzi, J. J. Betouras, and S. A. Grigera, Phys. Rev. E , 042146 (2016).[17] M. V. Ferreyra and S. A. Grigera, Phys. Rev. E , 042146, (2018).[18] L. Vanderstraeten, B. Vanhecke and F. Verstraete, Phys. Rev. E , 142145 (2018).[19] T. Vogel, Y. W. Li, T. W¨ust, and D. P. Landau, Phys. Rev. Lett. , 210603 (2013).[20] T. Vogel, Y. W. Li, T. W¨ust, and D. P. Landau, Phys. Rev. E , 023302 (2014).[21] Y. Sugita and Y. Okamoto, Chem. Phys. Lett. , 261 (2000).[22] A. Mitsutake, Y. Sugita, and Y. Okamoto, J. Chem. Phys. , 6664 (2003).[23] A. Mitsutake, Y. Sugita, and Y. Okamoto, J. Chem. Phys. , 6676 (2003).[24] T. Hayashi and Y. Okamoto, Phys. Rev. E , 043304 (2019).[25] R. R. P. Singh and J. Oitmaa, Phys. Rev. B , 144414 (2012).[26] B. A. Berg and T. Celik, Phys. Rev. Lett. , 2292 (1992).[27] A. M. Ferrenberg and R. H. Swendsen, Phys. Rev. Lett. , 2635 (1988).[28] F. Wang and D. P. Landau, Phys. Rev. Lett. , 2050 (2001).[29] F. Wang and D. P. Landau, Phys. Rev. E , 056101 (2001).[30] K. Hukushima and K. Nemoto, J. Phys. Soc. Jpn. , 1604 (1996).[31] Y. Sugita and Y. Okamoto, Chem. Phys. Lett. , 141 (1999).[32] R. H. Swendsen and J. S. Wang, Phys. Rev. Lett. , 2607 (1986).[33] E. Marinari, G. Parisi, and J. J. Ruiz-Lorenzo, in Spin Glasses and Random Fields, A. P. Young (ed.) (WorldScientiﬁc, Singapore, 1997) pp. 59-98.[34] A. M. Ferrenberg and R. H. Swendsen, Phys. Rev. Lett. , 1195 (1989).[35] S. Kumar, J. M. Rosenberg, D. Bouzida, R. H. Swendsen, and P. A. Kollman, J. Comput. Chem. , 1011 (1992).[36] M. Matsumoto and T. Nishimura, TOMACS , 35 (1990).[39] J. Kim, J. E. Straub, and T. Keyes, Phys. Rev. Lett. , 050601 (2006).[40] J. Kim, J. E. Straub, and T. Keyes, J. Phys. Chem. B , 8646 (2012).[41] A. Laio and M. Parrinello, Proc. Natl. Acad. Sci. USA , 12562 (2002).[42] T. Huber, A. E. Torda, W. F. van Gunsteren, J. Comp. Aid. Mol. Des. , 695 (1994).[43] H. Gr¨ubmuller, Phys. Rev. E , 2893 (1995).[44] C. Junghans, D. Perez, and T. Vogel, J. Chem. Theory Comput.10