An improved coarse-grained model of solvation and the hydrophobic effect
aa r X i v : . [ c ond - m a t . s o f t ] O c t An improved coarse-grained model of solvation and the hydrophobic effect
Patrick Varilly, Amish J. Patel, and David Chandler a) Department of Chemistry, University of California, Berkeley, California 94720 (Dated: 5 November 2018)
We present a coarse-grained lattice model of solvation thermodynamics and the hydrophobic effect thatimplements the ideas of Lum-Chandler-Weeks (LCW) theory [J. Phys. Chem. B , 4570 (1999)] andimproves upon previous lattice models based on it. Through comparison with molecular simulation, we showthat our model captures the length-scale and curvature dependence of solvation free energies with near-quantitative accuracy and two to three orders of magnitude less computational effort, and further, correctlydescribes the large but rare solvent fluctuations that are involved in dewetting, vapor tube formation andhydrophobic assembly. Our model is intermediate in detail and complexity between implicit-solvent modelsand explicit-water simulations.
I. INTRODUCTION
This is a technical paper that addresses how the hy-drophobic effect may be understood quantitatively. De-spite its technical nature, the physical ideas and finalmodel we formulate should be accessible and potentiallyinteresting to a wide audience of researchers who are con-fronted with the many manifestations of the hydrophobiceffect, and are in need of an effective quantitative tool fortreating them.The hydrophobic effect is presumed to be an importantdriving force in biology and nanoscale self-assembly.
Because of its collective nature and its length-scaledependence, and because of its nonlocal dependence onsolute surface moeities, modeling the hydrophobic ef-fect remains a challenge. To treat it theoretically, onecould track the explicit position of possibly tens of thou-sands of water molecules around solutes of interest, but the computational cost of this approach limits itsapplications. Alternatively, at significantly reduced cost,one could replace explicit waters by an implicit solventmodel, as is done in the generalized Born and surface area(GBSA) approach. In this paper, building on previ-ous efforts, we propose a coarse-grained model inter-mediate in detail between these two extremes, one thatretains most of the computational advantage of implicitsolvent models and overcomes two of their significant con-ceptual flaws: their incorrect scaling behavior and theirneglect of rare but large solvent density fluctuations thatplay pivotal roles in the dynamics of assembly.Solvation free energies of solutes with sub-nanometer features, exactly the size prevalent in bi-ological regimes, do not in general scale as surfacearea. By construction, models that assume suchscaling significantly underestimate the driving force forhydrophobic assembly.
Our model, on the other hand,captures the correct scaling behavior with solute size forgeneric solute geometries.Since hydrophobicity is a solvent property as much asit is a solute property, it is important to consider the sol- a) Electronic mail: [email protected] vent on length scales dictated by the solute(s). Numer-ous studies of hydrophobicity have shown thatrare solvent motions and dewetting transitions in con-fining environments play a critical role in solute assem-bly and function. Our model adequately captures theserare and important fluctuations. To demonstrate this,we consider the water number distribution P V ( N ) in aprobe volume V . Hummer et al. introduced the idea ofcharacterizing this distribution in the context of solva-tion theory, and the utility of this function has beendemonstrated subsequently. The GBSA model, widely used in biological set-tings, captures the effect of electrostatics with reasonableaccuracy, but its treatment of the hydrophobic effect isless adequate, for the reasons discussed above. In-teresting examples of solutes for which hydrophobicity isessential, and for which GBSA is unsuitable, include largeclasses of proteins, such as those involved in transmem-brane protein recognition and insertion, and versatilechaperones. It is these kinds of solutes for which ourapproach may eventually prove most useful.The ideas behind our model are those of Lum-Chandler-Weeks (LCW) theory. Ten Wolde, Sun andChandler generalized this theory by casting it interms of a Hamiltonian for a lattice field theory. Themotivation for that development was to facilitate treat-ments of large length scale dynamics. The motivation ofthe current work is similar, though in this paper we con-fine our attention to time-independent properties. Themain contribution of this paper is to improve upon theseprevious attempts, and to introduce concrete implemen-tations of the underlying theory that illustrate the im-provements, which are significant.The paper is organized as follows. In Section II, wesketch the physical ideas behind our model and presenttheir implementation in a tractable format. The deriva-tion and approximations made therein are left to the Ap-pendix. In Section III, we consider the accuracy of ourmodel by computing the solvation free energies of solutesand P V ( N ) distributions for various geometries, with andwithout adhesive solute-solvent interactions. Finally, inSection IV we conclude with a discussion of the mer-its and limitations of the present implementation of themodel. II. MODELA. Density fields and Hamiltonian
In this section we first consider general features of aliquid solvent, specializing to water only later. We fo-cus on the solvent density, ρ ( r ). For water in particular, ρ ( r ) refers to the instantaneous positions of water oxygenatoms. Effects of other variables such as molecular ori-entations appear implicitly in terms of parameters. Wedecompose density into large and small length-scale con-tributions ρ ℓ n ( r ) and δρ ( r ), respectively, ρ ( r ) = ρ ℓ n ( r ) + δρ ( r ) . (1)Here, ρ ℓ is the bulk liquid density, and n ( r ) is an Ising-likefield that is 1 in regions that are locally liquid-like and 0in regions that are locally vapor-like. The field takes onintermediate values only around interfacial regions. Thislarge length-scale field describes extended liquid-vaporinterfaces, while the small length-scale field describesmore rapidly-varying density fluctuations. This separa-tion implies some form of space-time coarse-graining todefine n ( r ), a coarse-graining which is most reasonablefor dense fluids far from their critical points. The keydevelopment of LCW was to describe how to (a) per-form this decomposition, and (b) couple the two separatefields.Building on the work of ten Wolde, Sun andChandler, we construct a Hamiltonian for the solventdensity that captures the dominant physics. We have H [ n ( r ) , δρ ( r )] = H large [ n ( r )]+ H small [ δρ ( r ); n ( r )] + H int [ n ( r ) , δρ ( r )] . (2)The term H large [ n ( r )] captures the physics of interfaceformation in n ( r ). For the term H small [ δρ ( r ); n ( r )],we exploit the observation that small length-scale den-sity fluctuations in homogeneous liquids obey Gaussianstatistics. Thus, for a given configuration of n ( r ),we assume that δρ ( r ) has Gaussian statistics with vari-ance χ ( r , r ′ ; [ n ( r )]) = h δρ ( r ) δρ ( r ′ ) i n ( r ) . Here, the right-hand side denotes the thermal average of δρ ( r ) δρ ( r ′ ) under the constraint of fixed n ( r ). The term H small [ δρ ( r ); n ( r )] is then a Gaussian with this variance,namely H small [ δρ ( r ); n ( r )] = k B T Z r Z r ′ δρ ( r ) χ − ( r , r ′ ; [ n ( r )]) δρ ( r ′ ) , (3)where k B T is temperature, T , times Boltzmann’s con-stant. For conciseness, we use an abbreviated integration notation, where the integration variable is denoted with asubscript to the integral sign, and the integration domainis all of space unless otherwise stated. We approximatethe variance with χ ( r , r ′ ; [ n ( r )]) ≈ ( χ ( r − r ′ ) , n ( r ) = n ( r ′ ) = 1;0 , otherwise , (4)where χ ( r ) can be written in terms of the radial distri-bution function g ( r ) as χ ( r ) = ρ ℓ δ ( r ) + ρ ℓ [ g ( | r | ) − . (5)For the uses we make of the approximation in Equa-tion (4), corrections have quantitative but not qualitativeeffects, as discussed in the Appendix and also Ref. 44.Finally, H int [ n ( r ) , δρ ( r )] is an effective coupling between n ( r ) and δρ ( r ) due to unbalanced attractive forces in thesolvent, whose details are given in the Appendix.In the absence of large solutes, fluctuations in n ( r ) areunlikely. The only fluctuations of significance in thatcase are those described by δρ ( r ). In the presence oflarge solutes, however, n ( r ) will often differ significantlyfrom its bulk mean value. In that case the statistics of δρ ( r ) is modified and the coupling H int [ n ( r ) , δρ ( r )] be-tween n ( r ) and δρ ( r ) becomes significant. When δρ ( r )is integrated out, a renormalized Hamiltonian for n ( r )results.LCW theory is a mean-field theory for the averagelarge length-scale field, h n ( r ) i , so it ignores the effects oflarge-scale fluctuations in n ( r ). Subsequent lattice im-plementations of LCW theory have incorporatedfluctuations in the simplest possible manner. The presentmodel refines these previous attempts to achieve near-quantitative accuracy for solvation free-energies and cor-rect behavior of fluctuations in n ( r ). Most importantly,we improve the calculation of the interfacial energies dueto n ( r ).To write down the renormalized Hamiltonian, we be-gin by describing n ( r ) with reference to a cubic grid ofspacing λ , depicted in Figure 1, and we denote its valueat the center of cell i by n i . Then, n ( r ) is given by n ( r ) = X i n i Ψ( r − r i ) , (6)where r i is the center of cell i and n i is 1 or 0, and thesum is over all cells i . The function Ψ( r ) is maximal withvalue 1 at r = ; it is cubic symmetric about the origin;and it is zero when the magnitude of any of the Carte-sian components of r is greater than λ . The value of λ ,about which we will say more later, should be roughlythe size of the bulk correlation length of the liquid sol-vent. The typical size of interfacial energies between cellson this grid is γλ , where γ is the liquid-vapor surfacetension of the solvent. The dissolved solute excludes sol-vent density from a volume v , and we define ¯ v to be itscomplement, so that the total volume of the system isthe union of v and ¯ v . The excluded volume can be of any n ( r ) = 1 n ( r ) = 0 λ v cell i v i FIG. 1. Schematic showing the solute and the large length-scale density field on a grid shape, and it can be composed of disconnected parts.In Ref. 42, regions within v are called “in”, and regionswithin ¯ v are called “out”. For any volume V , we havethe projector b V ( r ), b V ( r ) = ( , r ∈ V, , otherwise , (7)so that b V ( r ) + b ¯ V ( r ) = 1. We denote the overlap of v or ¯ v with cell i by v i and ¯ v i , respectively.In the absence of a solute, the Gaussian nature of δρ ( r )results in solvent number fluctuation correlations. Thecorrelations between the portions of cells i and j thatoverlap with two volumes V and V ′ , respectively, formthe elements of a matrix χ ij ( V, V ′ ) = Z r ∈ i Z r ′ ∈ j b V ( r ) χ ( r , r ′ ) b V ′ ( r ′ ) . (8)Here, the domain of the r and r ′ integrals are restrictedto the volume of cells i and j , as indicated. A way toestimate these elements is outlined in the Appendix. Theresulting matrix is used to calculate entropic effects dueto solvent exclusion and the linear response of solventdensity to external fields.Part of the renormalized Hamiltonian is the free en-ergy H large [ n ( r )] of the field n ( r ) in the absence of ex-ternal solutes. We estimate this contribution using aLandau-Ginzburg Hamiltonian H large [ n ( r )] = Z r h w ( n ( r ) , µ ) + m |∇ n ( r ) | i , (9)where w ( ρ/ρ ℓ , µ ) is the grand free energy density for theliquid solvent at a given density, ρ , and chemical po-tential, µ , relative to that of the gas. The parameter m reflects surface tension and intrinsic interfacial width. At liquid-gas phase coexistence, µ = 0, the value of the inte-gral is conveniently expressed as the sum γλ P i h i withthe local integrals h i = 1 γλ Z x i + λx i d x Z y i + λy i d y Z z i + λz i d z h w ( n ( x, y, z ) ,
0) + m |∇ n ( x, y, z ) | i . (10)The quantity h i depends only on the values of n j forcells j that share one of the corners of cell i . Thereare only 14 distinct possible values of h i , which canbe precalculated numerically for a given free energydensity w ( n,
0) and cell size λ , as detailed in the Ap-pendix. In previous modeling, the simpler Ising modelestimate γλ P h ij i ( n i − n j ) has been used. For reasonsdiscussed in the Appendix, this simpler estimate provesless accurate than the one used here.With the above notation, we now write the Hamilto-nian for our model, which constitutes the main result ofthe paper, H eff [ { n i } ] = γλ X i h i − µρ ℓ λ X i n i + K X i φ i ( − ρ ℓ n i v i )+ k B T [ h N i v / σ v + C/ , (11a)where φ i = 2 aρ ℓ h − n i − X ′ j (nn i ) n j i , (11b) h N i v = ρ ℓ X i n i v i , (11c) σ v = X i,j n i χ ij ( v, v ) n j , (11d) C = ( ln(2 πσ v ) , h N i v > , max[ln(2 πσ v ) , h N i v ] , otherwise . (11e)The field φ i , with strength governed by the positive pa-rameter a , is what Weeks has termed an “unbalancingpotential”. The primed sum over j (nn i ) is a sum overthe six cells j that are nearest neighbors to cell i . The fi-nal expression for φ i shown above, with renormalizationconstant K , is the result of an accurate and computa-tionally convenient approximation, which is described indetail in the Appendix.The terms on the right-hand-side of Equation (11a)respectively approximate: the free-energy cost of estab-lishing interfaces in n ( r ), the pressure-induced bias to-wards the liquid phase, the effective coupling between n ( r ) and δρ ( r ) induced by the presence of a solute, andthe entropic cost of excluding solvent density from theportions of v where n ( r ) = 1. As the total number of wa-ters to be excluded, h N i v , approaches zero, the statisticsof solvent number fluctuations in v changes from Gaus-sian to Poisson, so that its variance, σ v , also approacheszero. Equation (11e) captures this change continuouslyand prevents H eff [ { n i } ] from becoming infinitely negativein this limit. B. Incorporating Solute-solvent attractions
The above Hamiltonian pertains to the simplest case,where the solute interacts with the solvent only by hard-core repulsion. Realistic solutes additionally have attrac-tive interactions with the solvent that can be modeled asan external potential u ( r ) that couples to ρ ( r ). Such apotential induces an additional term in our Hamiltonian,which we denote by H u [ { n i } ]. To describe this term, wedefine a discretized analogue u i of u ( r ), u i = 1¯ v i Z r ∈ i b ¯ v ( r ) u ( r ) . (12)Notice that u i is independent of u ( r ) for values of r insidethe solute. The apparent divergence, where v completelyoverlaps cell i , has no effect in the final expression. Inparticular, H u [ { n i } ] = X i u i n i h ρ ℓ ¯ v i − X j χ ij (¯ v, v ) n j h N i v /σ v − X j χ ij (¯ v, ¯ v ) n j β ( u j + φ j ) i + k B T X i,j βu i n i χ ij (¯ v, ¯ v ) n j βu j , (13)where β is the reciprocal of k B T . The first part is themean-field contribution R r u ( r ) h ρ ( r ) i , while the last termis the entropic cost of the external potential modifyingthe average solvent density in the vicinity of the solute. C. Parameters of the Hamiltonian
We now specialize our model to water at ambient con-ditions, T = 298 K and 1 atm pressure, p . Further, wecomment upon what changes are required for applica-tions at different states of water.The cell size length λ should be no smaller than theintrinsic width of the liquid-vapor interface. Based uponthe interfacial profile of the SPC/E model, we there-fore pick λ = 4 ˚A. This is the minimal scale over whichthe time-averaged solvent density can transition fromliquid-like to vapor-like values. Following Ref. 52, weuse the free-energy density w ( n, µ ) = 2 md ( n − n − µρ ℓ n, (14)with d = 1 .
27 ˚A because this choice reproduces the sig-moidal density profile of water-vapor interfaces at coex-istence. The resulting values of h i are tabulated in the Appendix. The bulk liquid density ρ ℓ is the experimen-tal value, whereby a liquid cell contains ρ ℓ λ ≈ . m is chosen such thatthe interfacial energy of vapor spheres of radius R tendsto 4 πγR for large R . At ambient conditions, the ex-perimental value for the surface tension yields γλ ≈ . k B T . Finally, the relative chemical potential is givenby µ ≈ ( p − p vap ) /ρ ℓ , where p vap is the vapor pressureat 298 K. This relationship gives µ ≈ . × − k B T ,which is quite small, reflecting that water at ambientconditions is nearly at coexistence with its vapor.The matrix elements χ ij ( V, V ′ ) are computed from theradial distribution function, g ( r ), and we derive this func-tion from Narten and Levy’s tabulated data. It is aconvenient data set because it covers a broad range oftemperatures for the liquid at and near p = 1 atm. Atone temperature, 25 ◦ C, we have checked that a differentestimate of the radial distribution function, that of theSPC/E model, yields similar matrix elements, and theresulting solvation properties are essentially identical tothose obtained when the χ ij ( V, V ′ )’s are computed fromthe Narten-Levy data at the same temperature.The only parameters that we estimate through fittingare the strength a of the unbalancing potential and therenormalization constant K . In the absence of solute-solvent attractions, only the product of a and K is rel-evant. Values of a and K with Kaρ ℓ = 2 . k B T allowus to match the solvation free energies of hard spheresin SPC/E water (see below). By comparing the aver-age value of the computationally convenient approximateexpression involving φ i in Equation (11a) with that ofits complete and unrenormalized counterpart, as is donein the Appendix, we find that K is about 1 /
2, so that aρ ℓ ≈ . k B T . This value for a is close to the originalLCW estimate, arrived at from a different criterion.These values are applicable at ambient conditions.As temperature and pressure vary, only γ , µ and g ( r )vary appreciably, while K varies slightly. In partic-ular, surface tension decreases roughly linearly withtemperature (with d γ/ d T ≈ − .
15 mJ/m · K which is − . × − k B T /λ · K at T = 298 K). As noted above, µ increases roughly linearly with pressure. The pair corre-lation function g ( r ) loses some structure for temperaturesabove 50 ◦ C. The terms that are modeled by the renor-malization constant K reflect the degree to which solventdensity layers next to a solute. Since this layering reflectsthe structure of g ( r ), we expect K to be slightly state-dependent, with its value increasing with temperature.Conversely, liquid water has a nearly constant den-sity and bulk correlation length at the temperatures andpressures where our model would be useful, so ρ ℓ and λ can be taken as constant as well. Theoretical estimatesfor a in simple liquids (Eq. 4 in Ref 46) are state-independent, so we expect that in water, a will be nearlystate-independent as well. III. APPLICATIONS AND RESULTSA. Solvation Free Energies
To test our model’s ability to capture the length-scaledependence of solvation, and to parametrize the strengthof the unbalancing potential, we have calculated thesolvation free energy of hard spheres of different radii.Whether within our model or using explicit water simu-lations, we calculate the solvation free energy of a solutefollowing the guidelines of Ref. 56. Briefly, we first de-fine a series of M + 1 solutes S through S M that slowlyinterpolate from an empty system ( S ) to the final so-lute of interest ( S M ). We then sequentially calculate thefree energy difference between solute m and solute m + 1using the Bennett acceptance ratio estimator (BAR),and, where necessary, the linear interpolation stratifica-tion procedure of Ref. 56. Error estimates are calculatedusing BAR, and are generally smaller than 0 . . would take around 600hours on the same machine to obtain a similar statisticalaccuracy.Hard-sphere solvation free energies scale as solute vol-ume for small spheres, and as surface area for largespheres, with a smooth crossover at intermediate sizes. Figure 2 illustrates this behavior and compares the re-sults of our model to previous simulation results usingSPC/E water. As the model manifestly reproduces thesmall- and large-length scale limits, the most significantfeature illustrated in Figure 2 is the gradual crossoverfrom volume to surface area scaling. Ignoring the unbal-ancing potential leads to a qualitatively correct scalingbehavior. However, adjustment of the single parameter a ,which determines the strength of the unbalancing poten-tial, yields a near-exact agreement between our modeland the SPC/E results for all sphere sizes. In all subse-quent results, the parameter a is fixed at this value.The model results have small lattice artifacts—resultsthat depend upon the position of the solute relative tothat of the coarse-grained lattice—as shown in the insetof Figure 2. When studying stationary solutes, latticeartifacts may be mitigated by performing multiple cal-culations, differing only by small displacements of thesolutes, and then averaging the results. When studyingdynamical phenomena, lattice artifacts tend to pin so-lutes into alignment with the coarse-grained lattice. Forarbitrary molecular solutes, we expect that pinning forcesacting on one portion of the molecule will generically op-pose pinning forces on other parts of the molecule, so thatthe total pinning forces will largely cancel out. However,when treating many identical molecules, lattice artifacts β G / π R ( n m − ) Sphere radius R (nm)Explicitwater Present model51015 0.5 1 FIG. 2. Solvation free energies G of hard spheres of increasingradii, as calculated from explicit SPC/E water simulations (solid blue), from the coarse-grained model (solid black), andfrom the most common GBSA variant (arrow at bottomright). When the coarse-grained model has no unbalancingpotential ( a = 0, dashed gray), the intermediate-size regimeis only qualitatively reproduced. For large spheres, the ratioof G to surface area tends to the liquid-vapor surface tension γ (horizontal red dots). Inset: Illustration of lattice artifacts.The spheres are centered at different offsets from the lattice:a generic position (0 .
98 ˚A , .
79 ˚A , .
89 ˚A) that breaks all ro-tational and mirror symmetries (black), a lattice cell corner(blue) and a lattice cell center (red). All three curves areidentical for R ≤ .
35 nm. can add constructively, and additional steps are neededto mitigate them. Since the unbalancing potential is explicitlyparametrized with the solvation free energy of hardspheres, it is useful to evaluate the accuracy of theresults in other geometries. To this effect we computedthe solvation free energies of a family of hexagonalplates, consisting of 37 methane-like oily sites arrangedinto three concentric rings. We control the size of theseplates, depicted in Figure 3, by varying the distance d between neighboring oily sites. For our calculationswith explicit SPC/E water, the sites are uncharged andinteract with the solvent molecules via a standard water-methane Lennard-Jones potential. To study therole of attractive interactions, we split this Lennard-Jones potential using the Weeks-Chandler-Andersen(WCA) prescription into a repulsive part u ( r ) andan attractive part ∆ u ( r ). The magnitude of the at-tractive tail can be varied systematically with a scalingparameter η , such that u η ( r ) = u ( r ) + η ∆ u ( r ) . (15)For the ideal hydrophobic plate, we set η to zero.In the coarse-grained model, the repulsive core of thesolute is represented as an excluded volume. To con-struct it, we replace each solute particle by a thermally- -10-505101520 1.5 2 2.5 3 3.5 4 4.5 5 β G / S A S A ( n m − ) Solute Center Separation d (Å) d FIG. 3. Solvation free energies G of hexagonal plates,as a function of plate size, as calculated by the coarse-grained model (solid lines), by explicit SPC/E water sim-ulations (points), and by the most common GBSA variant(arrow on right). Three values of the attractive interactionstrength η are shown: 0 . . . with a particle radius of 1 .
97 ˚A, a solvent radiusof 1 . R = 3 .
37 ˚A. equivalent hard sphere, whose radius R is estimated ac-cording to R = Z ∞ d r [1 − exp( − βu ( r ))] , which is a first approximation to the WCA value of thisradius, and is essentially the radius at which u ( r )is k B T . The excluded volume is then the union of thehard-sphere volumes of each solute site.Figure 3 compares the solvation free energies for thisfamily of solute plates computed from our atomistic sim-ulations with those computed from the coarse-grainedmodel with the unbalancing parameter a determinedabove for solvated hard spheres. Now, with this differentgeometry, the coarse-grained model continues to performwell. The discrepancies are primarily due to the smallunderestimation, shown in Figure 2, of the solvation freeenergy of small spheres. Figure 3 also compares the sol-vation free energies of plates with increasing attractionsto the corresponding results from explicit-water simu-lations. Aside from the small artifacts already presentin the ideal solute case, the contribution of the attrac-tions to solvation free energies calculated with the coarsegrained model is nearly quantitative. B. Fluctuations
A more detailed probe of solvent behavior than sol-vation free energies is the probability P V ( N ) of find-ing N waters in a given volume V . The solvation freeenergy G of an ideal, volume-excluding hydrophobe issimply βG = − ln P V (0), and we can glean informationabout hydrophobicity and dewetting from the behaviorat non-zero N .In the present model, we estimate P V ( N ) by a two-step procedure. For any given solvent configuration { n i } ,the small length-scale fluctuations of δρ ( r ) give rise to aGaussian distribution in the numbers of waters, so that P V ( N |{ n i } ) ∝ exp (cid:2) − ( N − h N i V ) / σ V (cid:3) , (16)where, h N i V = X i n i h ρ ℓ V i − X j χ ij ( V, v ) n j h N i v /σ v − X j χ ij ( V, ¯ v ) n j β ( u j + φ j ) i , (17)and, σ V = X ij n i χ ij ( V, V ) n j . (18)Here, V i is the overlap of the probe volume with cell i .Notice the use of the probe volume V in the χ ij matrices.Formally, we then thermally average the above result overall possible solvent configurations to obtain P V ( N ) ∝ X { n i } P V ( N |{ n i } ) exp( − βH eff [ { n i } ]) . In practice, we estimate this sum by sampling a latticevariable n that closely correlates with N , given by n = X i ∈ V n i . We divide the range of possible values of n into smalloverlapping windows, and sample relevant configura-tions at every value of n using Wang-Landau sampling along n , together with replica exchange, to obtain goodsampling and avoid kinetic traps. We then used themultistate Bennet acceptance ratio estimator (MBAR)to reconstruct from these runs the probability distribu-tion P ( n ). During the umbrella sampling runs, latticeconfigurations with equal n are observed in proportionto their Boltzmann weight. Using the notation { n i } ∈ n to denote all observed lattice gas configurations with aparticular value of n , we finally obtain P V ( N ) = X n P ( n ) X { n i }∈ n P V ( N |{ n i } ) . To estimate the statistical errors in our procedure, wecalculate P V ( N ) in five independent Monte Carlo runs,and estimate the standard error in the mean of ln P V ( N ). -160-140-120-100-80-60-40-200 0 10 20 30 40 50 60 70 l n P V ( N ) Number of waters N SPC/EPresent model a = 0 Ising LG, a = 0 FIG. 4. Water number distribution in a 12 × ×
12 ˚A ,as obtained using explicit SPC/E water simulations, thepresent model, the present model without the unbalancingpotential ( a = 0), and a model with an Ising Lattice Gas andno unbalancing potential. For comparison, we also calculate these distribu-tions in SPC/E water using LAMMPS as describedpreviously, paying careful attention to good samplingaround free energy barriers. Errors were estimated withMBAR.In the absence of a solute, P V ( N ) is sensitive only tothe interfacial energetics of the lattice gas. Figure 4 com-pares the P V ( N ) curve obtained using the present modelfor a 12 × ×
12 ˚A volume with results that we havepreviously obtained from simulation of SPC/E water, and with (a) a version of the coarse-grained model thatlacks an unbalacing potential ( a is set to zero), and (b) aversion that additionally uses the naive Ising lattice gasfor estimating interfacial energetics in n ( r ). Our presentmodel captures the observed deviations from Gaussianbehavior better than these simpler models, which reflectsits higher accuracy in estimating interfacial energeticsand microscopic curvature effects.We have also previously examined how hydrophobicsolutes affect water number fluctuations in nearby probevolumes. To evaluate the performance of our model inthat scenario, we use the model hydrophobic plate so-lute described in Ref. 32. The plate is made up of oilyparticles with the same number density as water, whosecenters lie inside a 24 × × volume . Taking intoaccount the van der Waals radii of the oily particles, theplate has approximate dimensions 28 × × . Wemodel this solute in the same way as the hexagonal platesdescribed above. As before, we explore the role of at-tractive interactions by varying the attraction strengthparameter η .Figure 5 shows the water number distribution in a24 × × probe volume adjacent to the plate. With -120-100-80-60-40-200 0 10 20 30 40 50 60 70 l n P V ( N ) Number of waters N η = 0 η = 1 η = 2 z FIG. 5. Water number distributions in a probe volume of size24 × × immediately adjacent to a model plate solute(inset) of varying attractive strength η , in the coarse-grainedmodel (solid lines) and in explicit SPC/E water (points).Defining z = 0 to be the plane passing through the plate cen-ter, points in the probe volume (green) satisfy 5 ˚A < z < no solute-solvent attractive interactions, the probabil-ity computed from the lattice model has a clear fat tailtowards lower numbers of waters in the probe volume.This fat tail is the hallmark of a soft vapor-liquid inter-face, in this case a soft interface next to the hydropho-bic solute. At higher attractive interactions, this fattail is correspondingly depressed, but not entirely sup-pressed. Accordingly, in Ref. 32, the fat tail is only fullysuppressed when η exceeds 3 . P V ( N ) distributions are much more de-tailed probes of solvent structure than solvation free ener-gies, we expect more room for disagreement with simula-tion. Nevertheless, we emphasize that, by construction,no implicit solvation models can capture the above ef-fects on solvent structure, which underlie the pathwaysof hydrophobic assembly. Other coarse-grained solvationmodels (for example, see Ref. 76), on the other hand, can probe rare solvent fluctuations, and it would be useful toevaluate their accuracy in this respect as compared toexplicit-water models and the present lattice model. C. Confinement
To examine confinement in detail, we place two of themodel hydrophobic plates at a distance d from each other,as shown in Figure 6, and calculate the water numberdistribution in a 24 × × ( d −
3) ˚A probe volume be-tween them as a function of interplate separation d and d " FIG. 6. Setup for examining water fluctuations under confine-ment (here, d = 8 ˚A). The model hydrophobic plates (greyparticles) are placed d ˚A apart: taking into account the vander Waals radii of about 2 ˚A of the plates’ oily particles andthe 3 ˚A thickness of each plate, the center of the first plateis placed at z = 0, the center of the second plate is placedat z = d + 7 ˚A. The van der Waals radius of water (red andwhite sticks) being about 1 . × × ( d −
3) ˚A probevolume (green) extends from z = 5 ˚A to z = d + 2 ˚A. Theplates are not perfectly flat, so some waters fit between theplates and the probe volume. attraction strength η . Figure 7 summarizes the resultsin the form of a phase diagram. At small separationsand low attractive strengths, the dry state (low N ) ismost stable, whereas high attractive strengths and largeseparations stabilize the wet state (high N ). Generi-cally, the hydrophobic association of two such plates pro-ceeds through a dewetting transition in the inter-platevolume. The general, though not quantitative, agreement be-tween the coarse-grained model and the SPC/E data isvery encouraging: bistability is observed in the P V ( N )distributions in both cases, with the barriers at the nearlyequal values of N , and with barrier heights that track theSPC/E barrier heights. The phase boundary in Figure 7closely tracks the phase boundary observed in explicitwater, with a shift of less than 2 ˚A for all η . Moreover, asshown in Figure 8, once the general shift in the phaseboundaries is accounted for, the P V ( N ) distributionsfor systems near that boundary obtained by the coarse-grained model and the SPC/E simulations agree reason-ably well. Hence, the present model is better suited thanimplicit solvation models for studies of nanoscale self-assembly or protein-protein interactions driven by thehydrophobic effect. A recently developed coarse-grainedmodel of water (mW water) has been used to extensivelyprobe these rare fluctuations, and their predictions alsodisplay the characteristic bistability of dewetting transi- A tt r a c t i o n s t r e n g t h η Plate separation d (Å)Dry Wet FIG. 7. Phase diagram for the interplate region of the systemdepicted in Figure 6. For the explicit SPC/E water simula-tions (blue), each symbol corresponds to an individual P V ( N )distribution that we have calculated (filled: wet state stable;open: dry state stable). The phase boundary (blue dashes) isestimated from a linear interpolation of the relative stabilityof the wet and dry states. The relative stability is deter-mined from the relative depths of the basins in − ln P V ( N )The phase boundary for the present model (red solid line) wasestimated from a dense sampling of P V ( N ) distributions, andis accurate to ± . d and ± . η . tions that we observe. IV. DISCUSSION
We have presented a coarse-grained model of solvationthermodynamics that correctly reproduces the length-scale dependence of solvation free energies, and, more-over, correctly captures the behavior of the slow and raresolvent fluctuations that are pivotal in pathways to hy-drophobic assembly. Our model is applicable to genericsolute shapes, and addresses the effects of solute-solventattractive interactions.While our model successfully describes various aspectsof the hydrophobic effect, several technical challengesmust be addressed before it can be applied in biologi-cal settings. Most notably, electrostatic forces are miss-ing from our model. As a first approximation, the GBtreatment may well be sufficiently accurate, as long asthe low-permittivity cavity includes both the solute’s ex-cluded volume and the regions where n ( r ) is zero. Itmay also be possible to implement electrostatics in termsof a dipole density field coupled to the water densityfield. The statistics of the dipole field are known to beGaussian so that their contribution to H eff [ { n i } ] maybe computed analytically.A second notable technical hurdle is to find efficientalgorithms for calculating the gradient of H eff [ { n i } ] withrespect to the position of the solute’s atomic centers, nec- -20-18-16-14-12-10-8-6-4-2 0 0.2 0.4 0.6 0.8 1 1.2 l n P V ( N ) Relative water density N/ ( ρ ℓ V ) FIG. 8. Water density distribution of confined water 1 ˚A fromcoexistence. These distributions are for the system depictedin Figure 6 when η = 0 .
5. Coexistence lines are shown in Fig-ure 7. The explicit water simulation data (black) correspondsto d = 11 ˚A, while the coarse-grained model (red) results cor-respond to d = 9 ˚A. The remaining P V ( N ) distributions areincluded in the Supplementary Data . essary for implementing realistic solute dynamics, suchas Brownian dynamics. In the context of solvent latticemodels, the problem is tractable for spherical solutes withlimited overlap, but the implementation of a solutionfor generic solutes is more challenging.Finally, as with implicit solvation models, our ownmodel does not attempt to capture solvent dynamics . Forthermodynamically-driven processes, almost any reason-able dynamics may suffice when estimating the kineticprefactor of rate constants of interest. Indeed, in a pre-vious lattice model, the solvent dynamics is approxi-mated by Glauber dynamics, the solute’s by Langevindynamics, and the relative rates at which the two dynam-ics are advanced are calibrated through physically rea-sonable arguments. However, it is known, as evidencedin the form of the Oseen tensor, that hydrodynamicinteractions can be long-ranged and can influencetimescales of molecular processes by one or more ordersof magnitude. This observation may prove important innanoscale assembly processes that are kinetically driven,rather than thermodynamically driven. Approaches toimplementing coarse-grained dynamics in a lattice set-ting include multiparticle collision dynamics, fluctuat-ing hydrodynamics and lattice Boltzmann methods, among others. We leave all dynamical considerations tofuture work. ACKNOWLEDGMENTS
NIH Grant No. R01-GM078102-04 supported P.V. inthe later stages of this work, A.P. throughout and D.C. inthe early stages. In the early stages, P.V. was supported by a Berkeley Fellowship. In the later stages, D.C. wassupported by the Director, Office of Science, Office of Ba-sic Energy Sciences, Materials Sciences and EngineeringDivision and Chemical Sciences, Geosciences, and Bio-sciences Division of the U.S. Department of Energy underContract No. DE-AC02-05CH11231. John Chodera andMichael Shirts helped in understanding and implement-ing MBAR. We thank David Limmer, Ulf Pedersen andThomas Speck for a critical reading of the manuscript.
Appendix A: Derivation of the Model1. Continuum formulation
In this appendix, we derive Equation (11) starting fromthe microscopic ideas of LCW theory embodied in Equa-tion (2). The forms of H small [ δρ ( r ); n ( r )] and H large [ n ( r )]are given in Equations (3) and (9), and are discussed inthe main text. Here, we begin by discussing the detailsof the H int [ n ( r ) , δρ ( r )] term. Next, we integrate out thefield δρ ( r ) to obtain the effective Hamiltonian H eff [ n ( r )].In the following section, we discretize it.Whenever n ( r ) is non-uniform, solvent molecules ex-perience an effective attraction towards denser regions,or equivalently, an effective repulsion from less dense re-gions. As argued by Weeks and coworkers, thiseffect can be modeled as a coupling, H int [ n ( r ) , δρ ( r )], be-tween an external unbalancing potential, φ ( r ), and sol-vent density. In the absence of a solute, the energetics ofthis effect is completely contained in H large [ n ( r )], but thepresence of a solute gives rise to important corrections.Formally, the coupling is given by H int [ n ( r ) , δρ ( r )] = Z r φ ( r ) δρ ( r ) + H norm [ n ( r )] , (A1)where φ ( r ) = − aρ ℓ [ n ( r ) − . (A2)Here, a determines the strength of the potential, and theoverbar operator smears n ( r ) over the effective range ofsolvent-solvent attractive interactions. The potential isshifted so that it is zero for the uniform liquid. The term H norm [ n ( r )] is chosen so that, in the absence of a solute, H eff [ n ( r )] is identical to H large [ n ( r )].We now integrate out the field δρ ( r ) to ob-tain H eff [ n ( r )]. For notational simplicity, we suppress thedependence of χ ( r , r ′ ) on n ( r ) manifest in Equation (4).The total density ρ ℓ n ( r ) + δρ ( r ) is constrained to be zerofor all points r in v , so the effective Hamiltonian is givenby exp {− βH eff [ n ( r )] } = Z D δρ ( r )exp {− βH [ n ( r ) , δρ ( r )] } Y r ∈ v δ ( ρ ℓ n ( r ) + δρ ( r )) . (A3)A long but straightforward calculation yields0 H eff [ n ( r )] = H large [ n ( r )] + k B T ln p det 2 πχ v − Z r ∈ v φ ( r ) ρ ℓ n ( r )+ k B T Z r ∈ v Z r ′ ∈ v " ρ ℓ n ( r ) − Z r ′′ ∈ ¯ v χ ( r , r ′′ ) βφ ( r ′′ ) χ − v ( r , r ′ ) " ρ ℓ n ( r ′ ) − Z r ′′′ ∈ ¯ v χ ( r ′ , r ′′′ ) βφ ( r ′′′ ) − k B T Z r ∈ ¯ v Z r ′ ∈ ¯ v βφ ( r ) χ ( r , r ′ ) βφ ( r ′ ) + H norm [ n ( r )] . (A4)Here, χ v ( r , r ′ ) is the restriction of χ ( r , r ′ ) to the vol-ume v . As such, χ − v ( r , r ′ ) satisfies Z r ′ ∈ v χ − v ( r , r ′ ) χ ( r ′ , r ′′ ) = δ ( r − r ′′ ) , r , r ′′ ∈ v. (A5)To make H eff [ n ( r )] equal to H large [ n ( r )] in the absence ofa solute, H norm [ n ( r )] = k B T Z r Z r ′ βφ ( r ) χ ( r , r ′ ) βφ ( r ′ ) . (A6)It is useful to recast Equation (A4) into a form wherethe physical significance of each term is manifest. To doso, we first note how the constraint of zero solvent den-sity inside v modifies the solvent density and its fluctua-tion spectrum outside of v . As described in Ref. 42, theaverage of δρ ( r ) δρ ( r ′ ) in the presence of the constraint,denoted by χ (m) ( r , r ′ ), is given by χ (m) ( r , r ′ ) = χ ( r , r ′ ) − Z r ′′ ∈ v Z r ′′′ ∈ v χ ( r , r ′′ ) χ − v ( r ′′ , r ′′′ ) χ ( r ′′′ , r ′ ) . (A7)From Equation (A5), it follows that χ (m) ( r , r ′ ) is zerowhenever r or r ′ are in v , as required by the solvent ex-clusion constraint. To describe the constraint’s effect onthe average density, we introduce an auxiliary field c ( r )that satisfies Z r ′ ∈ v χ ( r , r ′ ) c ( r ′ ) = ρ ℓ n ( r ) , r ∈ v, (A8a) c ( r ) = 0 , r ∈ ¯ v. (A8b)In terms of c ( r ) and χ (m) , the average density in thepresence of the solute is given by h ρ ( r ) i = ρ ℓ n ( r ) − Z r ′ ∈ v χ ( r , r ′ ) c ( r ′ ) − Z r ′ ∈ ¯ v χ (m) ( r , r ′ ) βφ ( r ′ ) . (A9)Equation (A4) can now be written much more simply as follows. H eff [ n ( r )] = H large [ n ( r )] − Z r ∈ v φ ( r ) ρ ℓ n ( r )+ k B T ln p det 2 πχ v + k B T Z r ∈ v ρ ℓ n ( r ) c ( r ) − Z r ∈ ¯ v Z r ′ ∈ v φ ( r ) χ ( r , r ′ ) c ( r ′ )+ k B T Z r Z r ′ βφ ( r )( χ − χ (m) )( r , r ′ ) βφ ( r ′ ) . (A10)For the geometries we considered, the sum of the last twoterms of this equation is, on average, opposite in signbut nearly proportional to the much simpler remainingterm involving φ ( r ) (see Section D). Physically, thesethree terms capture the energetic bonus of driving δρ ( r )to 0 inside v where φ is positive, the energetic cost ofthe consequent density enhancement just outside of v ,and the small difference between (a) the entropic costassociated with φ modifying the solvent density in thepresence of a solute and (b) that same cost in the absenceof a solute. In typical configurations, the three terms areroughly proportional to the subvolume of v where n ( r ) =1, and capture how solvation free energies are modified bythe microscopic curvature of v . We have found it accurateto model the effect of these three terms using only thesecond term of Equation (A10), whose strength is thenrenormalized by a factor K . The resulting approximationfor H int [ n ( r )] is H int [ n ( r )] ≈ − K Z r ∈ v φ ( r ) ρ ℓ n ( r ) . (A11)Finally, we introduce an important simplification in H small [ n ( r )]. Instead of solving Equation (A8a) to obtainthe value of the field c ( r ) in v , we replace c ( r ) there by itsaverage value, c , and obtain the much simpler relation c = h N i v /σ v , (A12)where h N i v = Z r ∈ v ρ ℓ n ( r ) , (A13) σ v = Z r ∈ v Z r ′ ∈ v χ ( r , r ′ ) . (A14)1Equivalently, in Equation (A3) we enforce the single con-straint that R r ∈ v ρ ℓ n ( r )+ δρ ( r ) be zero, instead of enforc-ing the multitude of constraints that ρ ℓ n ( r ) + δρ ( r ) bezero at every point r in v . We have verified that this ap-proximation, dubbed the “one-basis set approximation”in previous works, does not appreciably change thesolvation free energies and P V ( N ) distributions that wehave obtained. Crucially, this approximation replaces thelarge (though sparse) linear system of Equation (A8a)with the trivial relation of Equation (A12), and is there-fore very advantageous computationally. With it, theterm H small [ n ( r )] is given by H small [ n ( r )] = k B T [ h N i v / σ v + C/ , (A15)The normalization constant C is defined by Equa-tion (11e). When the value of h N i v becomes small, theintegral defining σ v is dominated by the δ -function inEquation (5) and takes the value σ v ≈ h N i v . The valueln(2 πσ v ) of C that is applicable for larger h N i v thustends unphysically to negative infinity as h N i v tends tozero. This deficiency arises from a breakdown of Gaus-sian statistics for solvent number fluctuations in sub-Angstrom volumes. Since solvent molecules are discreteentities, these statistics are instead Poissonian. A smallcavity v can contain either one solvent molecule, withprobability h N i v , or no solvent molecules, with proba-bility 1 − h N i v . The free-energy cost of evacuating thatcavity is thus − k B T ln(1 − h N i v ) ≈ h N i v k B T . The def-inition of C given in Equation (11e) is a simple, con-tinuous way of capturing this difference in fluctuationstatistics at tiny length-scales. The crossover occurs at h N i v ≈ (2 π − − ≈ .
2. Lattice formulation
Using Equation (6), we express H eff [ n ( r )] and its com-ponent terms in terms of the lattice variables n i , so that H eff [ { n i } ] = H large [ { n i } ] + H int [ { n i } ] + H small [ { n i } ] . (A16)The integrals that define each term are then approxi-mated through lattice sums, with continuous fields re-placed by either their average values or their integralsover each cell.Equation (A15) for H small [ δρ ( r ); n ( r )] is the easiest totackle. We begin by discretizing Equation (4), whichdefines χ ( r , r ′ ), when the domains of integration for r and r ′ are V and V ′ , respectively. In terms of the ma-trix χ ij ( V, V ′ ) defined by Equation (8), our prescriptionyields χ ( r , r ′ ) → n i χ ij ( V, V ′ ) n j , r ∈ V, r ′ ∈ V ′ . Equation (11d) for σ v then follows immediately fromEquation (A14). Equation (11c) for h N i v reasonably ap-proximates the integral in Equation (A13). To discretize Equation (A11) for H int [ n ( r )], we needto choose a concrete implementation of the overbar op-eration that is used to define φ ( r ). Following Ref. 13, weapproximate it as a weighted average involving the celland its nearest neighbors , given by n ( r ) → h n i + 112 X j (nn i ) n j i . The average, φ i , of φ ( r ) over cell i follows immediatelyfrom Equation (A2), and is given by Equation (11b). Fol-lowing our prescription, Equation (A11) is then reason-ably discretized as the lattice sum H int [ { n i } ] ≈ K X i φ i ( − ρ ℓ n i v i ) . Discretizing H large [ n ( r )] correctly is a surprisingly sub-tle challenge. Previously, it has been approxi-mated it by an Ising Hamiltonian with nearest-neighborcoupling H large [ { n i } ] ? → γλ X h ij i ( n i − n j ) − µρ ℓ λ X i n i . Unfortunately, the use of this Hamiltonian results in se-rious artifacts. Consider, for instance, the energetics ofa convex vapor bubble embedded in the liquid, as rep-resented by the field { n i } . Many configurations of thefield that are physically distinct have nonetheless equalprojections onto the xy -, yz - and xz -planes, so they willbe given equal statistical weight by the Hamiltonian.Hence, the use of this Hamiltonian results in an unphys-ical excess of entropy, as shown in detail in Section E.Moreover, the energetic cost of common configurations ofthe field { n i } is substantially overestimated. The IsingHamiltonian assigns a large vapor bubble of radius R aninterfacial energy of about 6 πγR , not 4 πγR . Whereasusing a renormalized γ can alleviate this latter problem, the problem of excess entropy is more fundamental.Motivated by the above deficiencies of the Ising Hamil-tonian, we have instead chosen to evaluate the Landau-Ginzburg integral in Equation (9) numerically. To pro-ceed, we need to construct the basis function Ψ( r ) used inEquation (6). Our choice, depicted in Figure 9 for water,approximates the usual van der Waals construction ata local level. We first construct a 1D basis function ψ ( x )satisfying w ′ ( ψ ( x ) , − mψ ′′ ( x ) = 0 , (A17)with boundary conditions ψ (0) = 1 and ψ ( λ ) = 0. Wethen extend the range of ψ ( x ) and symmetrize it so that ψ ( x > λ ) = 0 , (A18)and ψ ( x <
0) = ψ ( − x ) . (A19)2 -4 -2 0 2 4 ψ ( x ) x (Å) FIG. 9. Constructing n ( r ) from { n i } . The binary field speci-fies whether the density at the center of each lattice cell shouldbe that of the liquid or that of the vapor. Between cell cen-ters, the density is interpolated using the basis function ψ ( x )(whose form for water is shown in the lower left panel). Thedashed lines delineate the domain of integration of the localfree energy h i given by Equation (10). Finally, the three-dimensional basis function Ψ( r ) is con-structed from the one-dimensional profiles ψ ( x ) to giveΨ( x, y, z ) = ψ ( x ) ψ ( y ) ψ ( z ) . The field n ( r ) constructed from Equation (6) using thisbasis function has many useful properties: the value of n ( r ) at the center of each cell i corresponds to the stateencoded in n i ; the density interpolates smoothly betweenadjacent cells; and the density profile of a configurationrepresenting an axis-aligned wall, where all n i ’s are 1 onone side of a plane and 0 on the other, nearly reproducesthe interface profile given by the van der Waals construc-tion.For water, we use the function w ( n, µ ) given in Equa-tion (14). This choice results in both sides of Equa-tion (A17) being proportional to m , so the function ψ ( x )is independent of m . In the free van der Waals theory,where the boundary conditions on Equation (A17) are ψ ( −∞ ) = 1 and ψ (+ ∞ ) = 0, the density profile ψ ( z )that results is ψ ( z ) = [1 + tanh( z/d )] / , which accurately describes the average density profile ofan SPC/E water slab at ambient conditions. The thick-ness parameter d can thus be determined from simula-tion. A complication due to capillary waves is that d grows logarithmically with simulation box size, sodifferent authors quote different values of d : 1 .
27 ˚A for a19 ×
19 ˚A interface in Ref. 52 and 1 .
54 ˚A for a 30 ×
30 ˚A interface in Ref. 89. We choose the smaller value be-cause the instantaneous configuration of n ( r ) should beblurred only by small-scale fluctuations, not by large-scale capillary waves, which correspond instead to differ-ent conformations of n ( r ). The profile shown in Figure 9corresponds to the solution of Equation (A17) when themore restrictive boundary conditions described above areimposed, with λ = 4 ˚A and d = 1 .
27 ˚A.With concrete choices of Ψ( r ), w ( n, m and λ , theintegrals h i defined by Equation (10) can be evaluated.We discuss the choice of m below. As outlined in themain text, the value of h i depends only on the values of n j for the 8 cells j that share one of the corners of cell i .Out of the 256 possible configurations of { n j } , only 14are unique when one accounts for reflection, rotation andinversion symmetry. Thus, only 14 distinct integrals needto be evaluated numerically. This decomposition bearsa strong resemblance to the marching cubes algorithm that reconstructs interfaces in volumetric data, and iswidely used in computerized tomography.In principle, the value of m is related to the surfacetension by the relation γ = Z λ h w ( ψ ( x ) ,
0) + mψ ′ ( x ) / i d x. (A20)On a lattice, as exemplified above by the Ising Hamilto-nian, this choice results in perfect interfacial energies forflat axis-aligned interfaces at the expense of more com-mon curved interfaces. Thus, we instead choose m self-consistenly such that ψ ( x ) satisfies Equation (A17) andthe calculated interfacial energy of some reference geom-etry of surface area A is γA . Equation (A20) correspondsto a cubic reference geometry. Since curved surfaces arefar more common than flat one in realistic solutes, weinstead use large spheres as our reference geometry.For the specific form of w ( n ) that we use for water, h i is proportional to m and ψ ( x ) is itself independent of m .The above self-consistent procedure can hence be imple-mented quite simply. We first calculate the h i quantitiesup to a factor of m , and then pick m to obtain the correctinterfacial energies. The resulting values of h i are givenin Table I.
3. Incorporating solute-solvent interactions
A generic solute interacts with a solvent moleculethrough a potential u ( r ). This interaction is reflectedin the microscopic Hamiltonian of Equation (2) as anadditional term H u [ n ( r ) , δρ ( r )] given by H u [ n ( r ) , δρ ( r )] = Z r u ( r )[ ρ ℓ n ( r ) + δρ ( r )] . Upon integrating out the density fluctuations, an addi-tional term H u [ n ( r )] appears in H eff [ n ( r )]. Physically,3 TABLE I. Relative interfacial free energy h i for each distinct neighboring cell configuration (diagrams after Ref. 90). Highlightedcorners denote cells j with n j = 1, whereas the others refer to cells with n j = 0; cell i is the lower-left corner in the back. Toaid the eye, a schematic of the implied liquid-vapor interface of each configuration is shown in orange. The values of h i areinversion-symmetric: interchanging highlighted and unhighlighted corners yields the same interface and interfacial energy. Alsoshown are the values of h i that would reproduce the energetics of the standard Ising lattice gas, namely γλ P h ij i ( n i − n j ) . Local { n j } configuration h i (Present model) 0.000 0.387 0.676 0.725 0.754 0.851 0.965 h i (Ising model) 0.000 0.750 1.000 1.500 1.500 1.250 1.750Local { n j } configuration h i (Present model) 0.983 0.857 0.910 1.104 0.965 1.040 1.134 h i (Ising model) 2.250 1.000 1.500 2.000 1.500 2.000 3.000the total solvent density responds linearly to the externalfield u ( r ) according to the density fluctuation spectrumis given by χ (m) ( r , r ′ ), so that h ρ ( r ) i = ρ ℓ n ( r ) − Z r ′ χ ( r , r ′ ) c ( r ′ ) − Z r ′ χ (m) ( r , r ′ ) β [ φ ( r ′ ) + u ( r ′ )] . (A21)The resulting free energy change H u [ n ( r )] arises from thedirect interaction of the solute and the solvent, and fromthe entropic cost of modifying the mean solvent densityaround the solute. It is given by H u [ n ( r )] = Z r d r u ( r ) h ρ ( r ) i + k B T Z r Z r ′ βu ( r ) χ (m) ( r , r ′ ) βu ( r ′ ) . (A22)Note that the integrands are zero whenever r or r ′ areinside the solute.To implement the previous equation on a lattice, wehave found it useful to approximate χ (m) ( r , r ′ ) by χ (m) ( r , r ′ ) ≈ ( χ ( r − r ′ ) , r , r ′ ∈ ¯ v, n ( r ) = n ( r ′ ) = 1 , , otherwise . (A23)We also use the one-basis set approximation, c ( r ) ≈ c ,given in Equation (A12). Discretizing Equation (A22)as in the previous section then immediately yields Equa-tion (13). Appendix B: Estimating χ ij ( V, V ′ ) An essential ingredient of the model we present is thematrix χ ij ( V, V ′ ), given by the integral in Equation (8). The terms involving the delta-functions of Equation (4)are trivial. Owing to the rapid oscillations in g ( r ) −
1, theremaining integrals are harder to estimate. We employ atwo-step procedure to estimate these integrals efficiently.We begin by subdiving the λ = 4 ˚A-resolution grid ofcells into a much finer grid of resolution λ f = 1 ˚A. Forclarity, below we explicitly distinguish between cells inthe coarse grid, indexed by the letters i and j , and cells inthe fine grid, indexed by the letters a and b . We evaluatethe integrals of the non-delta-function portion of χ onthe fine grid without otherwise restricting the argumentsto particular volumes V and V ′ , and denote the resultby χ ab . Each fine cell is so small that the effect of arestriction on the integration domain can be estimatedaccurately with a simple interpolation formula. We thenuse these interpolated values in the fine grid to build upthe elements of χ ij ( V, V ′ ) over the coarse grid.To evaluate χ ab , we use the Narten-Levy data for thestructure factor S ( k ) of water. Since the S ( k ) is un-available for wave-numbers k higher than 16 ˚A − , we blurthe domains of integration over a range of about 2 π/
16 ˚A,which makes the values of the integrals practically insen-sitive to this missing data. Concretely, we introduce abasis function Φ, given byΦ( x, y, z ) = ϕ ( x ) ϕ ( y ) ϕ ( z ) , with ϕ ( x ) = 12 (cid:20) tanh x − λ f / − tanh x + λ f / (cid:21) . The function ϕ is unity around x = 0, and goes rapidlyto zero as | x | & λ f /
2, with ∆ controlling the range of x over which this transition occurs. We have found a valueof 0 . r a todenote the center of fine cell a , the value of χ ab is given4by χ ab = ρ ℓ Z r Z r ′ Φ( r − r a )[ g ( | r − r ′ | ) − r ′ − r b ) . (B1)The integral is best evaluated in Fourier space, wherethe term in square brackets appears as the experimen-tal S ( k ) profile. We overcome the convergence problemsof a rapidly oscillating integrand by using the Haselgrove-Conroy integration algorithm. To properly accountfor g ( r ) being exactly zero for r . .
35 ˚A, we further set χ ab to exactly − ρ ℓ if all points in a are within r c = 2 .
35 ˚Afrom all points in b . To limit the range of χ ab , we alsoset it to zero if all points in a are more than 10 ˚A fromall point in b . The values of χ ab need only be calculatedonce at each state point of water, and we have spent con-siderable effort in compiling them at ambient conditions.Our results are included in the Supplementary Data .For specific volumes V and V ′ , we estimate the valueof χ ij ( V, V ′ ) as a weighted average of the pertinent valuesof χ ab , χ ij ( V, V ′ ) ≈ ρ ℓ ( V ∩ V ′ ) + X a ∈ i X b ∈ j ( V a /λ f ) χ ab ( V ′ b /λ f ) , (B2)where ( V ∩ V ′ ) is the volume of the overlap between V and V ′ . This interpolation formula for χ ij is manifestlylinear in its arguments, so that χ ij ( V, V ′ ) + χ ij ( V, V ′′ ) = χ ij ( V, V ′ ∪ V ′′ ) , whenever V ′ and V ′′ do not overlap. Most importantly,the interpolation procedure is simple, convenient, andcorrect for the limiting cases of where all the values of V a are either 0 or λ f .For comparison, we have also calculated values of χ ab from an explicit SPC/E water simulation in GROMACSat temperature T = 298 K and pressure p = 1 atm. Thevalues are also included in the Supplementary Data .For the quantities we have studied in the main text, usingthese values for χ ab instead of those derived from theNarten-Levy data yields nearly identical results. Appendix C: Fluctuation variance
The variance of the field δρ ( r ) given in Equation (4) isa simplification of the LCW interpolation formula, χ LCW ( r , r ′ ) = ρ ℓ n ( r ) δ ( r − r ′ )+ ρ ℓ n ( r )[ g ( | r − r ′ | ) − n ( r ′ ) , to the case where n ( r ) only takes the values 0 or 1.The discrepancies arising from using Equation (4) andmore precise expressions for the variance are mostlyquantitative and limited to the vicinity of liquid-vaporinterfaces. One possible improvement to Equation (4) is given in Ref. 42: χ ( r , r ′ ) = χ ( r , r ′ ) − Z r ′′ ∈ E d r ′′ Z r ′′′ ∈ E d r ′′′ χ ( r , r ′′ ) χ − E ( r ′′ , r ′′′ ) χ ( r ′′′ , r ′ ) , (C1)where E is the empty (i.e., gaseous) region of space where n ( r ) is 0, and χ − E ( r , r ′ ) satisfies Z r ′′ ∈ E χ − E ( r , r ′′ ) χ ( r ′′ , r ′ ) = δ ( r − r ′ ) , r , r ′ ∈ E. Equations (C1) and (4) are in qualitatively agreement:both are zero when r or r ′ are in the gaseous region,and both reduce to χ ( r , r ′ ) well into the liquid phase.The differences are, as expected, concentrated near theboundaries of E . In this refined expression, the integrandoscillates significantly within a lattice cell, so a latticeapproximation to Equation (C1) proves unreliable. Be-cause using Equation (4) gives accurate results for allquantities we have examined, we regard the approximateEquation (4) to be acceptable, and we have not pursuedalgorithms by which Equation (C1) can be accuratelyevaluated. Appendix D: How well is the effect of unbalanced forcescaptured by Equation (A11) ? Above, we replaced the three terms involving φ i inEquation (A10) by the simpler expression given in (A11).We now justify this replacement.Denote by H + [ n ( r )] the terms dropped from Equa-tion (A10). They are H + [ n ( r )] = − Z r ∈ ¯ v Z r ′ ∈ v φ ( r ) χ ( r , r ′ ) c ( r ′ )+ k B T Z r Z r ′ βφ ( r )( χ − χ (m) )( r , r ′ ) βφ ( r ′ ) . (D1)Using the approximation for χ (m) given in Equa-tion (A23) and the one-basis set approximation of Equa-tion (A12), we discretize these terms to obtain a latticeversion of H + [ n ( r )], H + [ { n i } ] = − X i,j φ i n i χ ij (¯ v, v ) n j h N i v /σ v + k B T X i,j βφ i n i [ χ ij ( v, v ) / χ ij (¯ v, v )] n j βφ j . Because of the double sums in the formula, calculating H + [ { n i } ] is by far the most computationally-demandingpart of calculating H eff [ { n i } ]. Since a single cell flipchanges the value of φ i in up to 7 cells, calculating incre-mental changes to H + [ { n i } ] is also much more expensivethan calculating incremental changes to H u [ { n i } ] (Equa-tion (13)), which has a similar structure.5 β G / π R ( n m − ) Sphere radius R (nm)0.40.50.6 0.4 0.8 1.2 R (nm) 0.40.50.6 2 3 4 5 d (nm) FIG. 10. Solvation free energies G of hard spheres as afunction of sphere radius, where the term H + [ { n i } ] (Equa-tion (D1)) has been included and the renormalization con-stant K has been set to 1 (solid black), compared to thesimpler model in Equation (11) (circles). The averages of −h H int [ { n i } ] i (red) and h H + [ { n i } ] i (black) are nearly propor-tional to each other. Left Inset: implied renormalization con-stant K , equal to h H int [ { n i } ] + H + [ { n i } ] i / h H int [ { n i } ] i . Notethat both the numerator and denominator take on essentiallyzero value for R . . K forhexagonal plate solute (Figure 3) with η = 1 .
0. The impliedvalue of K is similar for different η . Figure 10 presents the solvation free energies of hardspheres calculated when the H + [ { n i } ] term is includedand the renormalization constant K is set to 1. As canbe seen, the term corresponding to H int [ { n i } ] has a muchlarger absolute value, and in the region where their valuesare not negligible, the average values of H int [ { n i } ] and H + [ { n i } ] are, as claimed, essentially proportional. Therenormalization procedure we implement thus seems jus-tified, a conclusion borne out by the results in the text.For completeness, we have verified that the solvation freeenergies of the hexagonal plate solute (Figure 3) calcu-lated when H + [ { n i } ] is included and K is 1 are essentiallyidentical to the ones calculated using Equation (11). Appendix E: Comparison to the model of ten Wolde andChandler
Above, we argued that the Ising Hamiltonian estimatefor H large [ n ( r )] overestimates the interfacial energy of asphere of radius R by a factor of 3 /
2. However, the lat-tice version of LCW theory presented by ten Wolde andChandler uses precisely this Hamiltonian, yet the sol-vation free energy of spheres seems to tend to the correctvalue as R grows. Here we explain this apparent paradox.Figure 11 shows the solvation free energies of spheres inthe model of Ref. 14, and shows how they differ when thelattice cell size of λ = 2 . λ = 2 . β G / π R ( n m − ) Sphere radius R (nm) FIG. 11. Solvation free energies G of spheres in the model ofRef. 14 (black), for cell sizes λ = 2 . λ = 2 . H large [ { n i } ] (red) to significantly exceed the solvationfree energy, but also leads to large excess entropies (blue, T S = h H i − G ). At λ = 2 . λ = 2 . claimed, h H large [ { n i } ] i is much larger than it should be,but for λ = 2 . W. Kauzmann, Adv. Protein Chem. , 1 (1959). C. Tanford,
The hydrophobic effect: formation of micelles andbiological membranes (Wiley, 1973). B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, andP. Walter,
Molecular Biology of the Cell , 5th ed. (Garland Sci-ence, 2007). D. Chandler, Nature , 640 (2005). N. Chennamsetty, V. Voynov, V. Kayser, B. Helk, and B. L.Trout, Proc. Natl. Acad. Sci. U.S.A. , 11937 (2009). N. Chennamsetty, V. Voynov, V. Kayser, B. Helk, and B. L.Trout, J. Phys. Chem. B , 6614 (2010). H. Acharya, S. Vembanur, S. N. Jamadagni, and S. Garde, Fara-day Discuss. , 1 (2010). M. R. Shirts, J. W. Pitera, W. C. Swope, and V. S. Pande, J.Chem. Phys. , 5740 (2003). M. R. Shirts and V. S. Pande, J. Chem. Phys. , 134508 (2005). W. C. Still, A. Tempczyk, R. C. Hawley, and T. Hendrickson,J. Am. Chem. Soc. , 6127 (1990). D. Qiu, P. S. Shenkin, F. P. Hollinger, and W. C. Still, J. Phys.Chem. A , 3005 (1997). K. Lum, D. Chandler, and J. D. Weeks, J. Phys. Chem. B ,4570 (1999). P. R. ten Wolde, S. X. Sun, and D. Chandler, Phys. Rev. E ,011201 (2001). P. R. ten Wolde and D. Chandler, Proc. Natl. Acad. Sci. U.S.A. , 6539 (2002). D. Chandler,
Introduction to Modern Statistical Mechanics (Ox-ford University Press, 1987). B. Roux and T. Simonson, Biophys. Chem. , 1 (1999). For reduced classes of solutes, such as linear alkanes, surface-area scaling is nonetheless observed. These molecules are prop-erly in the small length-scale regime, where solvation free en- ergy scales as volume. However, for linear molecules, surfacearea also scales as volume, leading to the misleading scal-ing behavior. Moreover, the resulting empirical surface ten-sion is almost negligibly small. Typical values are in the 5–10 cal/mol/˚A ≈ k B T /nm range, in contrast to the water-air surface tension of about 17 k B T /nm and the water-oil surfacetension of about 12 k B T nm : see Ref. 18. C. Tanford, Proc. Natl. Acad. Sci. U.S.A. , 4175 (1979). D. Huang, P. L. Geissler, and D. Chandler, J. Phys. Chem. B , 6704 (2001). R. M. Levy, L. Y. Zhang, E. Gallicchio, and A. K. Felts, J. Am.Chem. Soc. , 9523 (2003). J. Chen and C. L. Brooks, J. Am. Chem. Soc. , 2444 (2007). A. Wallqvist and B. J. Berne, J. Phys. Chem. , 2893 (1995). K. Lum and A. Luzar, Phys. Rev. E , R6283 (1997). P. G. Bolhuis and D. Chandler, J. Chem. Phys. , 8154 (2000). A. Anishkin and S. Sukharev, Biophys. J. , 2883 (2004). P. Liu, X. Huang, R. Zhou, and B. J. Berne, Nature , 159(2005). T. F. Miller, E. Vanden-Eijnden, and D. Chandler,Proc. Natl. Acad. Sci. U.S.A. , 14559 (2007). M. V. Athawale, G. Goel, T. Ghosh, T. M. Truskett, andS. Garde, Proc. Natl. Acad. Sci. U.S.A. , 733 (2007). J. C. Rasaiah, S. Garde, and G. Hummer, Annu. Rev. Phys.Chem. , 713 (2008). S. N. Jamadagni, R. Godawat, J. S. Dordick, and S. Garde, J.Phys. Chem. B , 4093 (2009). B. J. Berne, J. D. Weeks, and R. Zhou, Annu. Rev. Phys. Chem. , 85 (2009). A. J. Patel, P. Varilly, and D. Chandler, J. Phys. Chem. B ,1632 (2010). G. Hummer, S. Garde, A. E. Garc´ıa, A. Pohorille, and L. R.Pratt, Proc. Natl. Acad. Sci. U.S.A. , 8951 (1996). See, for instance, D. Huang and D. Chandler, Phys. Rev. E ,1501 (2000) and Refs. 35, 32 and 36. S. Garde, R. Khare, and G. Hummer, J. Chem. Phys. , 1574(2000). L. Xu and V. Molinero, J. Phys. Chem. B , 7320 (2010). R. Zhou, Proteins , 148 (2003). I. Daidone, M. B. Ulmschneider, A. Di Nola, A. Amadei, andJ. C. Smith, Proc. Natl. Acad. Sci. U.S.A. , 15230 (2007). J. Chen, C. L. B. III, and J. Khandogin, Curr. Opin. Struc. Biol. , 140 (2008). C. Y. Janda, J. Li, C. Oubridge, H. Hernandez, C. V. Robinson,and K. Nagai, Nature , 507 (2010). S. F. Harris, A. K. Shiau, and D. A. Agard, Structure , 1087(2004). D. Chandler, Phys. Rev. E , 2898 (1993). G. Crooks and D. Chandler, Phys. Rev. E , 4217 (1997). K. Lum,
Hydrophobicity at Small and Large Length Scales , Ph.D.thesis, University of California, Berkeley (1998), Figure (3.6),page 61. A. P. Willard and D. Chandler, J. Phys. Chem. B , 6187(2008). J. D. Weeks, Annu. Rev. Phys. Chem. , 533 (2002). J. D. Weeks, R. L. B. Selinger, and J. Q. Broughton, Phys. Rev.Lett. , 2694 (1995). J. D. Weeks, K. Vollmayr, and K. Katsov, Physica A , 461(1997). J. D. Weeks, K. Katsov, and K. Vollmayr, Phys. Rev. Lett. ,4400 (1998). K. Katsov and J. D. Weeks, J. Phys. Chem. B , 6738 (2001). H. Berendsen, J. Grigera, and T. Straatsma, J. Phys. Chem. ,6269 (1987). D. M. Huang and D. Chandler, J. Phys. Chem. B , 2047(2002). E. Lemmon, M. McLinden, and D. Friend, in
NIST Chem-istry WebBook, NIST Standard Reference Database Number69 , edited by P. Linstrom and W. Mallard (National Instituteof Standards and Technology, Gaithersburg MD, 20899, 2009) http://webbook.nist.gov (retrieved October 25, 2009). A. H. Narten and H. A. Levy, J. Chem. Phys. , 2263 (1971). D. M. Huang and D. Chandler, Proc. Natl. Acad. Sci. U.S.A. ,8324 (2000). A. Pohorille, C. Jarzynski, and C. Chipot, J. Phys. Chem. B , 10235 (2010). C. H. Bennett, J. Comp. Phys. , 245 (1976). B. Hess, C. Kutzner, D. van der Spoel, and E. Lindahl, J. Chem.Theory Comput. , 435 (2008). C. G. Sztrum-Vartash and E. Rabani, J. Phys. Chem. C ,11040 (2010). J. Chen and C. L. Brooks III, Phys. Chem. Chem. Phys. , 471(2008). W. Humphrey, A. Dalke, and K. Schulten, J. Mol. Graphics ,33 (1996). VMD was developed by the Theoretical and Computa-tional Biophysics Group in the Beckman Institute for AdvancedScience and Technology at the University of Illinois at Urbana-Champaign. The parameters of the solute-solute Lennard-Jones potential arethose of Ref. 63: σ = 3 .
905 ˚A and ǫ = 0 .
118 kcal/mol. Lorentz-Berthelot mixing rules were used to obtain the water-solute in-teraction parameters. W. L. Jorgensen, J. D. Madura, and C. J. Swenson, J. Am.Chem. Soc. , 6638 (1984). J. D. Weeks, D. Chandler, and H. C. Andersen, J. Chem. Phys. , 5237 (1971). J.-P. Hansen and I. R. McDonald,
Theory of Simple Liquids , 3rded. (Academic Press, 2006), Section 5.3 L. Verlet and J.-J. Weis, Phys. Rev. A , 939 (1972). F. Wang and D. Landau, Phys. Rev. Lett. , 2050 (2001). D. Earl and M. Deem, Phys. Chem. Chem. Phys. , 3910 (2005). M. R. Shirts and J. D. Chodera, J. Chem. Phys. , 124105(2008). S. Plimpton, J. Comp. Phys. , 1 (1995), available at http://lammps.sandia.gov . See Supplementary Material Document No. NNNN for a listing ofthe model plate’s coordinates, tabulated values of χ ab for waterat ambient conditions, and additional P V ( N ) distributions. J. Mittal and G. Hummer, Proc. Natl. Acad. Sci. U.S.A. ,20130 (2008). R. Godawat, S. N. Jamadagni, and S. Garde,Proc. Natl. Acad. Sci. U.S.A. , 15119 (2009). S. Sarupria and S. Garde, Phys. Rev. Lett. , 037803 (2009). J. Mittal and G. Hummer, Faraday Discuss. , 341 (2010). P. Setny and M. Zacharias, J. Phys. Chem. B , 8667 (2010). V. Molinero and E. B. Moore, J. Phys. Chem. B , 4008 (2009). X. Song, D. Chandler, and R. A. Marcus, J. Phys. Chem. ,11954 (1996). X. Song and D. Chandler, J. Chem. Phys. , 2594 (1998). M. Doi and S. F. Edwards,
The theory of polymer dynamics (Ox-ford, 1988). N. Kikuchi, A. Gent, and J. Yeomans, Eur. Phys. J. E , 63(2002). S. Whitelam, E. H. Feng, M. F. Hagan, and P. L. Geissler, SoftMatter , 1251 (2009). A. Malevanets and R. Kapral, Europhys. Lett. , 552 (1998). N. K. Voulgarakis and J.-W. Chu, J. Chem. Phys. , 134111(2009). S. Chen and G. D. Doolen, Ann. Rev. Fluid Mech. , 329 (1998). In Ref. 13, the term proportional to n i is omitted. Since φ ( r )only acts on cells with n i = 1, this omission is inconsequential,and shows up as an extra factor of 2 in their value of a . J. S. Rowlinson and B. Widom,
Molecule Theory of Capillarity (Dover, 1982). Chapter 3. J. D. Weeks, J. Chem. Phys. , 3106 (1977). C. Vega and E. de Miguel, J. Chem. Phys. , 154707 (2007). W. E. Lorensen and H. E. Cline, SIGGRAPH Comput. Graph. , 163 (1987). C. B. Haselgrove, Math. Comput. , 323 (1961). H. Conroy, J. Chem. Phys.47