Derivation of coarse-grained potentials via multistate iterative Boltzmann inversion
DDerivation of coarse-grained potentials via multistate iterative Boltzmann inversion
Timothy C. Moore and Christopher R. Iacovella
Department of Chemical and Biomolecular Engineering,Vanderbilt University, Nashville, TN 37235 USA andVanderbilt Multiscale Modeling and Simulation (MuMS) Center,Vanderbilt University, Nashville, TN 37235 USA
Clare McCabe ∗ Department of Chemical and Biomolecular Engineering,Vanderbilt University, Nashville, TN 37235 USAVanderbilt Multiscale Modeling and Simulation (MuMS) Center,Vanderbilt University, Nashville, TN 37235 USA andDepartment of Chemistry, Vanderbilt University, Nashville, TN 37235 USA (Dated: August 25, 2018)In this work, an extension to the standard iterative Boltzmann inversion (IBI) method to derivecoarse-grained potentials is proposed. It is shown that the inclusion of target data from multiplestates yields a less state-dependent potential, and is thus better suited to simulate systems overa range of thermodynamic states than the standard IBI method. The inclusion of target datafrom multiple states forces the algorithm to sample regions of potential phase space that matchthe radial distribution function at multiple state points, thus producing a derived potential thatis more representative of the underlying potential interactions. It is shown that the algorithm isable to converge to the true potential for a system where the underlying potential is known. Itis also shown that potentials derived via the proposed method better predict the behavior of n -alkane chains than those derived via the standard method. Additionally, through the examinationof alkane monolayers, it is shown that the relative weight given to each state in the fitting procedurecan impact bulk system properties, allowing the potentials to be further tuned in order to matchthe properties of reference atomistic and/or experimental systems. I. INTRODUCTION
The utility of coarse-grained (CG) forcefields for softmatter and biological simulations has been well estab-lished in the literature, enabling simulation to exploregreater length- and time-scales than is feasible with fullyatomistic models. This is of particular importance whenstudying the self-assembly of soft matter systems, wherethe assembly is typically driven by weak forces, (e.g.,hydrophobicity and entropy) [1–6] and structures oftendemonstrate hierarchical ordering (e.g., molecules orga-nized into micelles, micelles organized into local/globalpatterns). [5, 7–11] While generic, non-specific CG mod-els have been widely applied, [12–16] providing impor-tant information regarding trends and design rules, it isoften necessary to use CG models specifically mappedto the system of interest to provide a direct one-to-onecorrespondence with experiment. While several trans-ferable CG forcefields, such as TraPPE-CG [17] andMARTINI,[18] have been developed, akin to forcefield de-velopment at the atomistic level, [19–23] the developmentof new CG forcefields is still often necessary. This is oftenrequired since the available forcefields may be lacking thenecessary molecular species/groupings or may not havebeen optimized for the properties of interest. This secondpoint is of particular consequence, since, for example, a ∗ [email protected] forcefield optimized to match phase behavior may not ap-propriately capture subtle structural features.[17] Gener-ally speaking, direct structural correspondence is neededto accurately transition between different simulation lev-els (e.g., atomistic to CG), in order to perform multiscale[24–26] and hybrid-multiscale simulations, [27–33] as wellas to recover atomistic details from CG simulations. [34–37]Several approaches have been developed to derive andoptimize CG forcefields. [38–44] Among these, the iter-ative Boltzmann inversion (IBI) method[38] has becomea popular choice due to its straightforward nature, gen-eral applicability to a wide range of systems, and basisin structural properties. The IBI method relies on self-consistently adjusting a given potential to achieve conver-gence with target structural data; for nonbonded inter-actions this target data takes the form of the radial dis-tribution function (RDF) between interaction sites andthe potential is iteratively updated according to: V i +1 ( r ) = V i ( r ) + αk B T ln (cid:20) g i ( r ) g ∗ ( r ) (cid:21) (1)where V i ( r ) is a numerical pair potential; i representsthe current iteration; α is a damping factor to suppresslarges changes to the potential update, often varying from0.2 to unity, where smaller values tend to be necessaryto capture dense and/or crystalline states;[45] k B is theBoltzmann constant; T the absolute temperature; r theseparation between particles; g i ( r ) the pair RDF from a r X i v : . [ c ond - m a t . s o f t ] O c t the simulation of V i ( r ), and g ∗ ( r ) the RDF of the tar-get system mapped to the CG level. Although the CGpotentials derived from IBI are typically able to accu-rately reproduce the target RDFs, they are, in general,only applicable at the state point for which they werederived, due to the structural nature of their derivation(e.g., note the explicit temperature dependence of Equa-tion 1, as well as the implicit temperature and densitydependence through the g ( r ) terms).[46, 47] For exam-ple, separate potentials were required to capture both thesolid and fluid structures of a pure simple lipid;[45] Qian, et al. [48] found that the potential derived using IBI forethylbenzene scales in a non-linear fashion with temper-ature (i.e., a square root dependence); and several workshave shown that CG polymer potentials derived via theIBI method can depend on the chemical environment forwhich they were derived. [38, 49, 50] Recent work hasshown that some of the CG potentials in a benzene-urea-water system derived via IBI have some degree of statepoint transferability,[51] but it is unclear why IBI pro-vides transferability for some but not all. Furthermore,for complex systems, it may not be possible to optimizethe potentials at the state points of interest, due to time-or length-scale limitations of the atomistic simulationsand/or system complexity (i.e., systems many unique in-teractions that need to be derived simultaneously); thusmaking it difficult to apply the IBI method appropri-ately, given that potentials are not necessarily transfer-able. Perhaps of most concern is the fact that the IBImethod does not guarantee a unique solution, as a mul-titude of vastly differing potentials may give rise to oth-erwise matching RDFs. The form of the final derived po-tential may also vary based on runtime parameters, suchas the inital potential guess, potential cutoff, magnitudeof the damping factor, etc. Additionally, the derived po-tential may include artifacts associated with intermedi-ate and long-range structural correlations in the system,e.g., oscillatory behavior in the potential that follows thepeaks and valleys in the RDF, which may alter otherproperties of the system, even if RDFs match.In this work, the IBI method is extended to per-form multi-state optimization, i.e., the potential is self-consistently adjusted to achieve simultaneous conver-gence of target data from multiple states. The generalidea, illustrated in Figure 1, is that the inclusion of targetdata from multiple states adds constraints to the opti-mization problem, such that the derived forcefield tendstoward a single potential that can adequately representall states. For example, potentials in region ‘ i ’ of theupper portion of Figure 1 are able to match the targetstructure at a single state ‘ i ’, potentials in region ‘ ii ’ areable to reproduce target data at state ‘ ii ’, etc., with a sin-gle representative potential lying at the overlap of theseregions, shown as region ‘ iv ’. To test the efficacy of theproposed multi-state iterative Boltzmann inversion (MSIBI) method, in Section III A, we first perform poten-tial optimizations for the idealized system of a Lennard-Jones (LJ) fluid for which the potential is known, in or- der to determine if the method resolves the correct po-tential. In Section III B, to test the method in a systemwhere only nonbonded interactions are present in the CGmodel, a 3-to-1 mapped CG forcefield is optimizted forpropane using target data generated from united-atom(UA) propane simulations, and compared with a single-site LJ model mapped to the experimental critical pointof propane. In Section III C, we apply this approach to n -dodecane, a system more representative of the typicalapplication of a CG forcefield. In Section III D, we exam-ine a monolayer system composed of n -docecane, whereit is demonstrated that adjustment of the relative weightsgiven to each target in the MS IBI method can be usedto tune the potentials to match other measurable systemproperties beyond the RDF. all potentialsi ii iii ivall potentialsiiii iiiv a)b) FIG. 1. Regions of good potential phase space for states withoptimal overlap (top) and too much overlap (bottom).
II. METHODS AND SIMULATION DETAILSA. Single-state iterative Boltzmann inversion
In the IBI method (which for clarity we shall refer toas single state, SS IBI), a numerical pair potential, V ( r )is iteratively updated according to Equation 1. In thismanner, V ( r ) is updated at each separation, r , basedon whether the RDF overpredicts or underpredicts thetarget RDF at the given r , and is repeated until the trialRDF matches the target RDF within some tolerance.[38]The initial guess of the numerical potential is often takento be the Boltzmann inversion of the RDF of the targetsystem: V ( r ) = − k B T ln g ∗ ( r ) (2)While not exact for site-site interactions in molecules,[52]this methodology is motivated by the statistical me-chanics relationship between the potential of mean force(PMF) and the RDF, and provides a reasonable startingpotential over which to iterate.Typically, potentials derived with this method are ca-pable of reproducing the target RDFs with high accu-racy, with slight deviations resulting from informationlost during coarse-graining. The ease of use of the IBImethod and its general applicability make it a powerfultool; given a CG mapping and a target RDF, site-site pairpotentials can be readily derived with little user input. B. Multistate extension of IBI
Although potentials derived with SS IBI will typicallyreproduce their target RDFs with high accuracy, cautionmust be taken when using the potentials. Upon success-ful convergence of the potential, it is only guaranteed thatthe derived and target RDFs match, not that the poten-tial is necessarily representative of the “true” underly-ing potential (i.e, not necessarily state independent). Itis important to note, especially since information is lostdue to coarse-graining, that a multitude of potentials maygive rise to similar RDFs. Only a small portion of thepotentials that produce matching RDFs may actually fallwithin the region of potentials that match the true po-tential and, since the true potential is typically unknown,it is difficult to assess the accuracy of the derived poten-tials. If the derived potential falls far outside the truepotential region, this may give rise to potentials that,despite providing a good match for the target RDF, lacktransferability and may contain artifacts making themincapable of resolving system properties other than theRDF.The proposed MS IBI method aims to minimize thestate dependence of the derived potentials by adding ad-ditional constraints to the optimization process such thatthe derived potentials fall within the region of phasespace where potentials are representative of the “true”potential. This approach relies on two key assumptions:(1) different thermodynamic states have different regionsof the potential phase space that adequately reproducetheir respective target RDFs, and (2) that the true, un-derlying potential lies within the common overlap be-tween these regions of phase space. As the name suggests,this is accomplished by updating the derived potential tosimultaneously match target RDFs at different thermo-dynamic states, producing a single potential that pro-vides sufficient matching for all target RDFs considered.As shown graphically in the upper portion of Figure 1, the converged potential lies at the intersection of each ofthe regions representing the target RDFs, as this is theonly region where a sufficient match will be found for allstates.The implementation of MS IBI is similar to that of SSIBI, the only additional requirement is more target data.As in SS IBI, the initial potential is assumed to be theBoltzmann inversions of the target RDFs, averaged overthe N states used, V ( r ) = − N (cid:88) s k B T s g ∗ s ( r ) (3)where the subscript s represents the property at state s .After a trial CG simulation is run at each state usingthe potential from Equation 3, the potential is updatedaccording to: V i +1 ( r ) = V i ( r ) + 1 N (cid:88) s α s ( r ) k B T s ln (cid:20) g is ( r ) g ∗ s ( r ) (cid:21) (4)While in SS IBI, α represents a damping factor usefulfor suppressing fluctuations in the potential update, here α s ( r ) also serves as a weighting factor, allowing moreor less emphasis to be put on each state. For exam-ple, if fitting a potential with three states, where state1 will ultimately be of most interest, it may make senseto give state 1 a higher α value; this will be discussedlater in Section III D. Additionally, here α s ( r ) is definedas a linear function of the separation r , with the points α (0) = α max , and α ( r cutoff ) = 0, such that we ensurethe derived potential has a value of zero at the interac-tion cutoff, r cutoff (i.e., the point at which we assumethat pair interactions are zero). Since α decreases as r increases, increased emphasis is placed on shorter-rangeinteractions compared to long-range interactions, similarto the radial dependence of the pressure correction for-mula often used with IBI.[38] This helps to suppress theinfluence of long-range structural correlations on the de-rived potential, as short-range interactions may certainlygive rise to long-range correlations (e.g., the formation ofbulk crystals from particles interacting through a short-ranged, truncated potential). For direct comparability inthis work, both SS and MS IBI treat the damping factoras a linear function of separation, with a fixed value of0 at the potential cutoff. Note that, although bondedinteractions may be optimized in a similar manner (i.e.,adjusting the potential to match a target distribution),in this work, we make the assumption that bonded andnonbonded interactions are sufficiently independent suchthat we use analytical bonded potentials, as has beendone in previous work.[45, 53, 54]The choice of states used in the fitting procedure isnaturally important to deriving an accurate potential.To derive the potential most representative of the un-derlying one, it would not be beneficial to choose stateswith RDFs that are too similar, as the overlap regionwould be large, essentially providing minimal additionalconstraints; this situation is shown in the lower portionof Figure 1. In such a case, there would be no advantageto the multistate fitting. At the other end of the spec-trum, there may in fact be no overlap between states, ormore specifically, no overlap for a given level of match-ing (i.e., no overlap without relaxing the tolerance of anRDF similarity test). For some systems, it may not bepossible to define a single pair potential that accuratelyreproduces the target structure at all states. This is nota problem unique to CG potentials, as it applies at alllevels of modeling, e.g., classical atomistic potentials mayalso lack full state-independence given that they do notallow variation in electron density. C. Simulation Model
In this work, simulations were performed using 3 dis-tinct models: generic LJ fluid, TraPPE-UA, and CGmodels derived via IBI. First, simulations of monatomicLJ spheres were performed in the canonical ensem-ble (i.e., fixed number of particles N , volume V , andtemperature T ), with temperature controlled via theNos´e-Hoover thermostat. These monatomic LJ systemscontained 1468 particles initially randomly distributedthroughout the box, and were run for 1 × timesteps,during which the reduced temperature was decreasedfrom 2.0 to the final target temperature. The systemswere further equilibrated for 1 × timesteps before tar-get data was collected over 1 × steps. A timestep of1 × − reduced time units was used. The interactionparameters used in all LJ simulations were σ = 1 . ε = 1 .
0, with a potential cutoff r cutoff = 3 σ . Here,no coarse-graining was applied to the target systems, asthese simulations were used simply to test the efficacyof the potential derivation under the ideal circumstanceswhere the true potential is known.The second model used relies on the TraPPE-UA force-field for simulation.[19] Here, alkanes were simulatedin the canonical ensemble, with temperature controlledvia the Nos´e-Hoover thermostat. Bulk fluid systemsfor both propane (1024 molecules) and n -dodecane (400molecules) were simulated at 3 different states, as listedin Section III, and used to generate target RDF data.Although not an all-atom model (as hydrogens are notexplicitly modeled), the TraPPE forcefield was chosen forcomputational convenience, since, in principle, the targetdata can come from any source. In all cases, a timestepof 1 fs was used. After an initial equilibration period of5 ns, data was collected over a 10 ns production run. Inaddition to the bulk fluid n -dodecane simulations, UAsimulations were performed of n -dodecane gel and fluidmonolayers, composed of 100 n -dodecane chains each.These were performed in the same manner as the bulksimulations at 298 K, but with the first bead of each chainheld stationary such that a 2D hexagonally arranged pe-riodic array with density 4.10 chains per ns (gel) and3.79 chains per nm (fluid) was achieved; these were cho-sen to match state points commonly used in alkylsilane monolayer simuations and experiments.[55]The third model used is a CG representation of alka-nes. In all cases a 3-to-1 CG model (i.e., each CG beadrepresents 3 UA carbon groups) was used to simulatebulk fluid and monolayer systems of alkanes. Pair po-tentials were derived using the SS and MS IBI methods,using the results of the UA simulations as target data,as discussed in detail in Section III. The bond stretch-ing and angle bending potentials used in the study ofdodecane were derived by a Boltzmann inversion of thebonded distributions sampled in the atomistic trajectorymapped to the CG level.[53] Specifically, from a normal-ized bond length distribution p ( r ), the bond stretchingpotential is written as V bond ( r ) = − k B T ln p ( r ) (5)which, assuming a Gaussian bond length distribution,results in a harmonic potential about the most proba-ble bond length, r eq ; note an identical formalism wasused for angles, where θ is substituted for r , and the nor-malization includes a factor of sin − θ . Since minimalstate dependence was found between systems, a singleset of bonded parameters was used in all simulations,with k/k B = 15 .
60 K/˚A and r eq = 3 .
56 ˚A for bondsand k/k B = 0 .
17 K/deg and θ eq = 174 .
53 ˚A for angles.Bond histograms and additional details are included inthe Supplemental Material.[56]In all cases, the GPU-enabled HOOMD-Blue[57] , [58]simulation engine was used to perform the simulations.The high performance of the GPU allows for rapid deriva-tion of potentials. A standard potential optimization us-ing MS IBI required approximately 50 iterations to bewell-converged. For the pure LJ systems with 1468 par-ticles, this convergence took less than one hour usingthree NVidia GTX580 GPUs concurrently. The follow-ing convergence criteria was used to measure how well atrial RDF matched with its target, where dr is the sizeof an RDF bin: f fit = 1 − (cid:82) r cut dr | g i ( r ) − g ∗ ( r ) | (cid:82) r cut dr | g i ( r ) + g ∗ ( r ) | (6)An f fit value of unity represents a perfect match be-tween the trial and target RDFs. Additionally, in allfigures, the following two-point central moving averagesmoothing function was applied to the derived potentialto reduce the noise: V (cid:48) n ( r ) = 13 [ V n − ( r ) + V n ( r ) + V n +1 ( r )] (7)where V n ( r ) is the n th element of the numerical poten-tial, and the prime denotes the smoothed value. Theapplicaiton of the smoothing function was not found tosignificantly influence the behavior or degree of match-ing. III. RESULTSA. Monatomic Lennard-Jones fluid
To test the efficacy of the MS IBI method, potentialswere derived using RDFs from monatomic LJ spheresas target data, and the results compared to single statepotential derivation (i.e., SS IBI). Target data was ac-quired from the following states: stata A, reduced den-sity ρ ∗ = N σ /V = 0 .
85, reduced temperature T ∗ = k B T /ε = 0 .
5; state B, ρ ∗ = 0 . T ∗ = 1 .
5; and state C, ρ ∗ = 0 . T ∗ = 2 .
0. No coarse-graining was performedsince the goal was to test whether the MS IBI methodcould recover a known potential. In contrast to map-ping an atomistic system to the CG level, no informationabout the system is lost ensuring that a single potential isapplicable to all states and that this potential is known.While the RDFs match well, as illustrated in Figure 2,the potentials derived via SS IBI demonstrate signifi-cant state dependence, as shown in Figure 2d. For themore dense states A and B (Figures 2a,b), the SS IBImethod was not able to converge to the true potentialto the extent that in the most dense system (state A),the converged potential is almost purely repulsive. Thisresult is due to the elevated density of this state, wherethe structure can be reproduced with a purely repulsivepotential.[59] In this case, even though the RDF matchesthe target well, the overall behavior of the system wouldbe dramatically altered as compared to the target. A sim-ilar situation arises in state B where only a weak attrac-tive potential is required to match the target structure.In state C, however, the low density causes attractiveforces to become important, and as such, the attractiveportion of the LJ potential is needed to fully reproducethe target data. Thus, the true LJ potential is recov-ered only for state C. The application of SS IBI to themonatomic LJ system illustrates two points: (1) that po-tentials derived via SS IBI are state-dependent, and (2)these potentials are not unique, in that both the LJ po-tential and the vastly differing derived potential producematching RDFs.MS IBI aims to address the aforementioned issuesby forcing the potential to sample portions of potentialphase space that satisfy all of the constraints, i.e., finda single potential that matches the target structure atmultiple states. The results of applying MS IBI to themonatomic LJ fluid are shown in Figure 2e-h. The inclu-sion of target data from multiple states results in closelymatching RDFs and a derived potential that accuratelyreproduces the true LJ potential, as shown in Figure 2h.Although this example is simple, as no coarse-grainingwas performed, it illustrates the ability of MS IBI to re-cover a known potential and reduce the state-dependenceof the derived potential. . . . . . . . . g ( r ) (a) f fit = 0.9970 0.0 1.0 2.0 3.0 4.0 5.00 . . . . . . g ( r ) (b) f fit = 0.99780.0 1.0 2.0 3.0 4.0 5.00 . . . . . . . . . . g ( r ) (c) f fit = 0.9972 0 . . . . . . V ( r ) / (cid:15) (d) f fit,A = 0.0699 f fit,B = 0.0694 f fit,C = 0.91290.0 1.0 2.0 3.0 4.0 5.00 . . . . . . . . g ( r ) (e) f fit = 0.9959 0.0 1.0 2.0 3.0 4.0 5.00 . . . . . . g ( r ) (f) f fit = 0.99580.0 1.0 2.0 3.0 4.0 5.0 r/ σ . . . . . . . . . . g ( r ) (g) f fit = 0.9899 0 . . . . . . r/ σ -1.5-1.0-0.50.00.51.01.52.02.53.0 V ( r ) / (cid:15) (h) f fit = 0.9320 FIG. 2. RDFs and potentials derived for the LJ system. (a-d) SS IBI results. (e-h) MS IBI results. The α value usedfor the MS IBI optimizations was 0.7 for each state. f fit forthe potentials was calculated in the rancge σ ≤ r ≤ r cutoff .The solid black line represents the target RDF (a-c, e-g) orthe known potential (d, h). The symbols represent the derivedpotential (d, h) or the RDFs calculated from simulations usingthe derived potential (a-c, e-g). B. Propane
To further test the MS IBI algorithm, potential op-timizations were performed on propane. The chosen 3-to-1 mapping results in a single-site model that can bedirectly compared to known single site 12-6 LJ mod-els from the literature.[60] Note, the 12-6 LJ potentialshould not be considered to the the “true” potential, butrather a good approximation. Target data was acquiredfrom UA simulations at the following states: state A,298 K, 0.818 g/mL, α A (0) = 0 .
5; state B, 298 K, 0.439g/mL, α B (0) = 0 .
7, and state C, 298 K, 0.014 g/mL, α C (0) = 0 .
5. The resulting RDFs and (single) pair po-tential are presented in Figure 3. At each state, f fit indicates excellent agreement between the target RDFsand those calculated from simulations using the derivedpotential. Moreover, we find that the derived potentialagrees well with a single-site 12-6 LJ model using param-eters mapped to the critical point of propane,[60] provid-ing confidence in the MS IBI method. While the matchbetween the two potentials is good, the derived potentialdoes show two small bumps at ∼ ∼ . . . . . . g ( r ) f fit = 0.9780(a) 0 2 4 6 8 10 12 14 160 . . . . . g ( r ) f fit = 0.9879(b)0 2 4 6 8 10 12 14 16 r( ˚A ) . . . . . . g ( r ) f fit = 0.9762(c) 3 4 5 6 7 8 9 10 r( ˚A ) − − V ( r ) / k B ( K ) (d) Derived potentialPu et al . 2007 FIG. 3. RDFs (a-c) and potential (d) derived for propaneusing MS IBI. a, b, and c correspond to states A, B, and Cin the text, respectively. The α values used were 0.5, 0.7, and0.5 for states A, B, and C, respectively. To illustrate the consistency of the potentials derivedvia MS IBI (i.e., that the final potential is insensitive tothe initial guess), optimizations were performed using anumber of different initial potentials. In addition to thePMF-like quantity of Equation 3, three additional initialguesses were used, each a 12-6 LJ potential with vastlydiffering parameters: (1) ε = 0 .
46 kcal/mol, σ = 4 . ε = 0 .
001 kcal/mol, σ = σ ; (3) ε = 2 ε , σ = σ ; The final derived potentials are, in each case,very similar to each other and to the derived potentialin Figre 3d, as shown in Figure 4. Particularly, the f fit values between each potential and the derived potentialshown in Figure 3 are 0.986, 0.980, and 0.986, respec-tively. C. n -dodecane To look at a more complex system and test the state-independence, we look at n -dodecane in the bulk anduse it to examine systems containing monolayers. Inter- r( ˚A ) − − − − V ( r ) / k B ( K ) r( ˚A ) − − − V ( r ) / k B ( K ) FIG. 4. Different initial guesses (top) and the resulting de-rived potentials (bottom) for propane optimizations. Bluetriangles: LJ with ε = ε and σ = σ ; black circles: V ( r )from Eq. 3 (same as shown in Figure 3); green ‘x’: LJ with ε = ε and σ = σ ; solid black line: 1-site propane model[60](not used as initial guess, shown for reference); magenta ‘+’:LJ with ε = ε and σ = σ . Symbols in top plot correspond tothe same symbols in the bottom plot. Note that all potentialsconverge to very similar values. molecular pair potentials were derived for the beads of aCG model of n -dodecane, again using a 3-to-1 mapping.The resulting 4-site model contains two middle beads andtwo terminal beads, where middle and terminal beadswere treated as unique entities, resulting in the need toderive three pair potentials; harmonic bonds and angleswere used, as detailed in the methods section. The targetdata was collected from UA simulations of n -dodecane atthe following states: state A, 298 K, 1.04 g/mL; state B,298 K, 0.74 g/mL; and state C, 370 K, 0.55 g/mL; thedamping values used were α A (0) = 0 . α B (0) = 0 . α C (0) = 0 .
5. Note that state B corresponds to theexperimental density at standard ambient temperatureand pressure, and, as such, is given higher weight thanthe other states in this example. Close agreement withthe target RDFs is found, with an f fit value greater than0.98 for each of the nine RDFs calculated (not shown, seeSupplemental Material). To further assess the quality ofthe potentials derived via MS IBI, the average squaredradius of gyration normalized by the average end-to-enddistance, denoted by R chain , was calculated, providing ameasure of the chain conformations at different thermo-dynamic states. Using potentials derived with MS IBI,good agreement is seen between the UA target data andthe CG model of the ratio R chain , as shown in Figure 5;in this plotting scheme an ideal match corresponds to adata point situated on the line y = x . While deviationsbecome more apparent as R chain increases, the potentialsderived from only state B (i.e., standard temperature andpressure) via SS IBI show larger, systematic deviationsof R chain over the entire range of state points considered.As such, it appears the potential from MS IBI more accu-rately models the conformations of dodecane over a rangeof state points. Note, in both cases, additional simula-tions were performed at state points not used in the fit-ting (state points used in the fitting are highlighted withopen squares in Figure 5), to also test the transferabilityof the derived potential. The improved match of chainconformations was further tested by examining systemscontaining n -dodecane monolayers. .
34 0 .
36 0 .
38 0 . R chain , CG . . . . R c h a i n , U A .
34 0 .
36 0 .
38 0 . R chain , CG . . . . R c h a i n , U A FIG. 5. Comparison of a structural metric between the CGand UA models of n -dodecane. The CG potentials were de-rived from MS IBI (top) and SS IBI (bottom). A value lyingon the solid line represents a perfect match between the CGand UA models. Squares represent data points from simula-tions at state points where the potential was derived; circlesare data points from other states used for testing the state de-pendence. The states used in the multi-state fitting are statesA, B, and C as described above with α A . α B (0) = 0 . α C (0) = 0 .
5. State B was used for the single state fittingwith α = 0 . D. n -dodecane Monolayer As mentioned in Section II B, here the damping co-efficient, α s ( r ), is a function of both separation, r , andstate, s . Recall that the dependence on separation is cho- sen such that the derived potential has a value of zero atthe potential cutoff as well as to reduce the influenceof intermediate and long-range structural correlations onthe derived potentials. Adjusting the α s (0) value given toeach state effectively alters the weight given to each statein the fitting, i.e., more or less emphasis can be placed ona given state. While adjusting the relative weight givento each state may have only a small effect on the derivedRDFs, it may alter the potential, which ultimately mayvary other system properties, allowing potentials to betuned to capture specific behaviors. To demonstrate this,as well as to further test the transferability of the derivedpotentials, alkane monolayers were simulated with the 3-to-1 CG model, with potentials optimized in the bulkstates discussed above, using various values of α s (0) foreach of the three states. The average tilt angle, θ , withrespect to the surface and the nematic order parameter, S , of the chains were calculated[61] and compared withthose values calculated from the corresponding UA sim-ulations. Note that the UA monolayer simulations werenot used as target data in the potential derivation, usedonly to validate the properties predicted by the derivedCG potential.Unique sets of CG potentials were derived over a rangeof α s (0) values, as summarized in Table I. Here, the statesA, B, and C are the same states previously used as tar-get data to derive a CG potential for bulk systems of n -dodecane above. The results indicate that both theaverage chain tilt angle and the nematic order param-eter are functions of the relative α s (0) weights for thefluid state monolayer. For the gel phase monolayer, thenematic order parameter is less dependent on the α s (0)values, while the chain tilt angle is significantly depen-dent.Initially, potentials were optimized with equal weightsassigned to each state. As shown in Table I, this α setyields a potential that significantly overpredicts the fluidphase order parameter, while it underpredicts the gelphase chain tilt. Since the monolayers are inherentlysomewhat ordered, it would be expected that increas-ing the relative weight given to the most dense state,state A, would yield a potential that better captures thesytem behavior. By systematically reducing the weightgiven to the less dense states, first state C, then state B,a potential that very closely reproduces the monolayerbehavior is obtained for α values of 0.7, 0.1, and 0.1 forstate A, B, and C, respectively. Given the small weightsassigned to states B and C in this case, it may be ex-pected that this potential would give results similar tothe potential derived via SS IBI at state A. However, itcan be seen in Table I that this clearly is not the case;potentials derived from SS IBI at state A show large de-viations, underpredicting both the average tilt angle andnematic order parameter in the gel phase monolayer, instark contrast to the near perfect behavior predicted byMS IBI. This result is a direct consequence of using theMS IBI method; even though low weights are given to theother states, the derived potentials will only be consid-ered converged if all states demonstrate good agreement.Again, we note that when deriving the potentials, the UAmonolayer was not used as target data (i.e., the structurematching was performed in the same manner as describedin Section III C, except with varying values of α s (0) foreach state). The close match that is observed is a resultof the success of the MS IBI method in deriving a moregenerally applicable, transferable, set of potentials. IV. CONCLUSION
A multistate extension of the popular IBI method hasbeen proposed. In the proposed MS IBI method, multi-ple thermodynamic states are used in the derivation of asingle, generally applicable potential. For systems with aknown potential, it was shown that the MS IBI methodwas capable of accurately recovering the true, underlyingpotential, while the SS IBI method was unable to con-sistently derive a gnerally applicable potential. Throughthe coarse-graining of propane, it was shown that MS IBIwas able to recover a potential very similar to a previ-ously published single-site model with good reproducibil-ity. Furthermore, potentials derived via MS IBI wereshown to better reproduce structural conformations of n -dodecane than potentials derived via SS IBI. It was alsodemonstrated that adjusting the relative weights given toeach target in the optimizations can be used to tune sys-tem properties beyond the RDF; in this case, tuning theweights enabled potentials to be derived that providednear perfect agreement between CG and atomistic mod-els when considering the nematic order parameter andtilt angle of an n -dodecane monolayer. While pressure,and thermodynamics in general, were not investigated inthis work, the standard pressure correction scheme of SSIBI[38] could be trivially applied to MS IBI by calculat-ing the average pressure deviations between all states,and using this quantity in the pressure correction term. As such, the MS IBI stands as an improvement of thetypical IBI method, producing more generally applicablepotentials that can be tuned to match target propertiesfrom experiment or finer-grained simulations.This improved methodology should be very useful fora host of molecular systems, including, for example, lipidsystems, where not only do systems demonstrate struc-tural heterogeneity within a given state point (i.e., differ-ent molecular structures in a single system), but proper-ties such as tilt angle, nematic order, area per lipid, etc.,need to be tuned in order to match atomistic simulationsand experiment. [62–66] Given that the MS IBI approachis also capable of deriving potentials which demonstrateincreased levels of transferability than SS IBI, potentialscan be derived for complex systems with many unique in-teractions by examining the individual components sep-arately, reducing the number of simultaneous optimiza-tions that need to be performed. Furthermore, this workpresents a method to develop potentials that enable theexamination of phase transitions; in many prior worksutilizing SS IBI, different potentials are needed to ap-propriately model different states, making it difficult toaccurately examine the transition between those states.[18, 45, 67, 68] Additionally, given that multi-GPU ma-chines and GPU enabled simulation packages[57, 69–72]are becoming more common, the potential derivation pro-cess can be carried out with relatively little computa-tional effort, even if a large number of targets are needed,or a large number of iterations must be undertaken to findappropriate weighting functions. ACKNOWLEDGMENTS
The authors acknowledge support from grant numberR01 AR057886-01 from the National Institute of Arthri-tis and Musculoskeletal and Skin Diseases. [1] J. N. Israelachvili, D. J. Mitchell, and B. W. Ninham, J.Chem. Soc., Faraday Trans. 2 , 1525 (1976).[2] J. N. Israelachvili, D. J. Mitchell, and B. W. Ninham,Biochimica et Biophysica Acta (BBA)-Biomembranes , 185 (1977).[3] I. Muˇseviˇc and M. ˇSkarabot, Soft Matter , 195 (2008).[4] K. J. Bishop, C. E. Wilmer, S. Soh, and B. A. Grzy-bowski, small , 1600 (2009).[5] C. R. Iacovella, A. S. Keys, and S. C. Glotzer, Proc. ofthe Nat. Ac. of Sci. , 20935 (2011).[6] P. Ziherl and R. D. Kamien, J. Phys. Chem. B , 10147(2001).[7] E. L. Thomas, D. J. Kinning, D. B. Alward, and C. S.Henkee, Macromol. , 2934 (1987).[8] C. L. Phillips, C. R. Iacovella, and S. C. Glotzer, SoftMatter , 1693 (2010).[9] C. R. Iacovella and S. C. Glotzer, Nano Lett. , 1206(2009). [10] M. A. Glaser, G. M. Grason, R. D. Kamien, A. Koˇsmrlj,C. D. Santangelo, and P. Ziherl, Europhys. Lett. ,46004 (2007).[11] P. Ziherl and R. D. Kamien, Phys. Rev. Lett. , 3528(2000).[12] R. Larson, L. Scriven, and H. Davis, J. Chem. Phys. ,2411 (1985).[13] S. C. Glotzer, M. A. Horsch, C. R. Iacovella, Z. Zhang,E. R. Chan, and X. Zhang, Current Opinion in Colloid& Interface Science , 287 (2005).[14] L. Gai, K. A. Maerzke, P. T. Cummings, and C. McCabe,J. Chem. Phys. , 144901 (2012).[15] C. R. Iacovella, M. A. Horsch, Z. Zhang, and S. C.Glotzer, Langmuir , 9488 (2005).[16] K. F. Lau and K. A. Dill, Macromol. , 3986 (1989).[17] K. A. Maerzke and J. I. Siepmann, J. Phys. Chem. B , 3452 (2011).[18] S. J. Marrink, H. J. Risselada, S. Yefimov, D. P. Tiele- TABLE I. Average chain tilt with respect to surface normal, θ , and nematic order parameter, S , for the monolayers in thefluid state (subscript F) and in the gel state (subscript G). The states A, B, and C are the same ones used in Section III C.Values are given as ensemble averages ± standard deviation. α A / α B / α C θ F S ,F θ G S ,G ± . ◦ . ± .
013 18 ± . ◦ . ± . ± . ◦ . ± .
012 20 ± . ◦ . ± . ± . ◦ . ± .
020 20 ± . ◦ . ± . ± . ◦ . ± .
018 20 ± . ◦ . ± . ± . ◦ . ± .
019 19 ± . ◦ . ± . ± . ◦ . ± .
014 12 ± . ◦ . ± . ± . ◦ . ± .
011 8 ± . ◦ . ± . ± . ◦ . ± .
021 29 ± . ◦ . ± . ± ◦ . ± .
023 32 . ± . ◦ . ± . ± . ◦ . ± .
017 15 ± . ◦ . ± . ± . ◦ . ± .
005 21 . ± . ◦ . ± . ± . ◦ . ± .
002 20 ± . ◦ . ± . ± ◦ . ± .
028 32 . ± . ◦ . ± . , 7812(2007).[19] M. G. Martin and J. I. Siepmann, J. Phys. Chem. B ,2569 (1998).[20] W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives, J.Am. Chem. Soc. , 11225 (1996).[21] C. Oostenbrink, A. Villa, A. E. Mark, and W. F.Van Gunsteren, J. Comp. Chem. , 1656 (2004).[22] K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu,S. Zhong, J. Shim, E. Darian, O. Guvench, P. Lopes,I. Vorobyov, et al. , J. Comp. Chem. , 671 (2010).[23] J. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman,and D. A. Case, J. Comp. Chem. , 1157 (2004).[24] C. McCabe, S. C. Glotzer, J. Kieffer, M. Neurock, andP. T. Cummings, J. Comp. and Theor. Nanosci. , 265(2004).[25] H. Liu, M. Li, Z.-Y. Lu, Z.-G. Zhang, C.-C. Sun, andT. Cui, Macromol. , 8650 (2011).[26] C. Peter and K. Kremer, Soft Matter , 4357 (2009).[27] N. Di Pasquale, D. Marchisio, and P. Carbone, J. Chem.Phys. , 164111 (2012).[28] B. Ensing, S. O. Nielsen, P. B. Moore, M. L. Klein,and M. Parrinello, J. Chem. Theory and Comp. , 1100(2007).[29] E. Lidorikis, M. E. Bachlechner, R. K. Kalia, A. Nakano,P. Vashishta, and G. Z. Voyiadjis, Phys. Rev. Lett. ,086104 (2001).[30] J. Michel, M. Orsi, and J. W. Essex, J. Phys. Chem. B , 657 (2008).[31] M. Praprotnik, L. D. Site, and K. Kremer, Annu. Rev.Phys. Chem. , 545 (2008).[32] A. J. Rzepiela, M. Louhivuori, C. Peter, and S. J. Mar-rink, Phys. Chem. Chem. Phys. , 10437 (2011).[33] T. Werder, J. H. Walther, and P. Koumoutsakos, J.Comp. Phys. , 373 (2005).[34] A. P. Heath, L. E. Kavraki, and C. Clementi, Proteins:Structure, Function, and Bioinformatics , 646 (2007).[35] B. Hess, C. Holm, and N. van der Vegt, J. Chem. Phys. , 164509 (2006).[36] P. Liu, Q. Shi, E. Lyman, and G. A. Voth, J. Chem.Phys. , 114103 (2008).[37] A. J. Rzepiela, L. V. Sch¨afer, N. Goga, H. J. Risselada,A. H. De Vries, and S. J. Marrink, J. Comp. Chem. , 1333 (2010).[38] D. Reith, M. P¨utz, and F. M¨uller-Plathe, J. Comp.Chem. , 1624 (2003).[39] F. Ercolessi and J. B. Adams, Europhys. Lett. , 583(1994).[40] S. Izvekov and G. A. Voth, J. Phys. Chem. B , 2469(2005).[41] M. S. Shell, J. Chem. Phys. , 108 (2008).[42] A. Chaimovich and M. S. Shell, J. Chem. Phys. ,094112 (2011).[43] C. R. Iacovella, R. E. Rogers, S. C. Glotzer, and M. J.Solomon, J. Chem. Phys. , 164903 (2010).[44] B. Bozorgui, D. Meng, S. K. Kumar, C. Chakravarty,and A. Cacciuto, Nano Lett. , 2732 (2013).[45] K. Hadley and C. McCabe, J. Chem. Phys. , 134505(2010).[46] R. Faller, Polymer , 3869 (2004).[47] C.-C. Fu, P. M. Kulkarni, M. S. Shell, and L. G. Leal,J. Chem. Phys. , 164106 (2012).[48] H.-J. Qian, P. Carbone, X. Chen, H. A. Karimi-Varzaneh,C. C. Liew, and F. Muller-Plathe, Macromol. , 9919(2008).[49] B. Bayramoglu and R. Faller, Macromol. , 9205 (2012).[50] B. Bayramoglu and R. Faller, Macromol. , 7957 (2013).[51] P. Ganguly and N. F. van der Vegt, J. Chem. Theory andComp. , 1347 (2013).[52] E. R. Chan, A. Striolo, C. McCabe, P. T. Cummings,and S. C. Glotzer, J. Chem. Phys. , 114102 (2007).[53] G. Milano, S. Goudeau, and F. M¨uller-Plathe, J. of Poly-mer Science Part B: Polymer Phys. , 871 (2005).[54] K. Hadley and C. McCabe, Biophys. J. , 2896 (2010).[55] J. L. Rivera, G. K. Jennings, and C. McCabe, J. Chem.Phys. , 244701 (2012).[56] See supplemental material at [URL will be inserted byAIP] for derivation of n -dodecane pair and bonded po-tentials and associated RDFs.[57] J. A. Anderson, C. D. Lorenz, and A. Travesset, J.Comp. Phys. , 5342 (2008).[58] http://codeblue.umich.edu/hoomd-blue.[59] J. D. Weeks, D. Chandler, and H. C. Andersen, J. Chem.Phys. , 5237 (1971).[60] Q. Pu, Y. Leng, X. Zhao, and P. T. Cummings, Nan-otech. , 424007 (2007). [61] A. S. Keys, C. R. Iacovella, and S. C. Glotzer, J. Comp.Phys. , 6438 (2011).[62] H. Heller, M. Schaefer, and K. Schulten, J. Phys. Chem. , 8343 (1993).[63] E. Egberts, S.-J. Marrink, and H. J. Berendsen, Euro.Biophys. J. , 423 (1994).[64] L. Saiz and M. L. Klein, Accounts of Chemical Research , 482 (2002).[65] H. L. Scott, Current Opinion in Structural Biology ,495 (2002).[66] M. Venturoli, M. Maddalena Sperotto, M. Kranenburg,and B. Smit, Physics Reports , 1 (2006). [67] T. Vettorel and H. Meyer, J. Chem. Theory and Comp. , 616 (2006).[68] G. D’Adamo, A. Pelissetto, and C. Pierleoni, Soft Matter , 5151 (2012).[69] B. Hess, C. Kutzner, D. Van Der Spoel, and E. Lindahl,J. Chem. Theory and Comp. , 435 (2008).[70] S. Plimpton, J. Comp. Phys. , 1 (1995).[71] W. M. Brown, P. Wang, S. J. Plimpton, and A. N. Thar-rington, Computer Phys. Comm. , 898 (2011).[72] W. M. Brown, A. Kohlmeyer, S. J. Plimpton, and A. N.Tharrington, Computer Phys. Comm.183