Intra-chain organisation of hydrophobic residues controls inter-chain aggregation rates of amphiphilic polymers
Patrick Varilly, Adam P. Willard, Julius B. Kirkegaard, Tuomas P. J. Knowles, David Chandler
IIntra-chain organisation of hydrophobic residues controls inter-chainaggregation rates of amphiphilic polymers
Patrick Varilly, Adam P. Willard, Julius B. Kirkegaard, Tuomas P. J. Knowles,
1, 4 and David Chandler Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW,UK Department of Chemistry, MIT, Cambridge MA, USA Department Applied Mathematics and Theoretical Physics, University of Cambridge, Centre for Mathematical Sciences,Wilberforce Road, Cambridge CB3 0WA, UK Cavendish Laboratory, Department of Physics, University of Cambridge, J J Thomson Avenue, Cambridge CB3 0HF,UK Department of Chemistry, University of California, Berkeley, California 94720, USA
Aggregation of amphiphiles through the action of hydrophobic interactions is a common feature in soft condensed mat-ter systems and is of particular importance in the context of biophysics as it underlies both the generation of functionalbiological machinery as well as the formation of pathological misassembled states of proteins. Here we explore theaggregation behaviour of amphiphilic polymers using lattice Monte-Carlo calculations and show that the distribution ofhydrophobic residues within the polymer sequence determines the facility with which dry/wet interfaces can be createdand that such interfaces drive the aggregation process.Keywords: Hydrophobic effect — Protein aggregation — Polypeptide sequenceDue to their importance in governing self-assembly of bio-logical components, hydrophobic interactions and the mecha-nism of hydrophobic collapse leading to the aggregation ofhydrophobic species in an aqueous environment have beenstudied in detail using approaches ranging from spectroscopyto atomistic and coarse-grained simulations . The phe-nomenon of hydrophobic collapse by its very nature involvesthe removal of water molecules between adjacent hydropho-bic entities in order to allow for them to come together,and therefore the creation of an interface with unsatisfiedhydrogen-bonding separating ”wet” solvent from ”dry” aggre-gated hydrophobes in a manner reminiscent of a liquid-vapourphase transition. The picture that has emerged from computa-tional studies of the collapse of hydrophobic chains is that itis the creation of such interfaces which controls the transitionbetween solvated hydrophobes and their compact aggregatedstate . In biological systems, hydrophobic units rarely oc-cur in isolation and are most commonly part of a macromolec-ular system. In the present work, we investigate using latticeMonte-Carlo calculations the nature of the hydrophobic col-lapse for polymers with a varying distribution of hydrophobicand hydrophilic elements and demonstrate that the clusteringof hydrophobic entities is crucial for nucleating the formationof dry interfaces driving the eventual aggregation process.
I. OFF-LATTICE SOLUTES IN A LATTICE GAS SOLVENTMODEL
Hydrophobic assembly is characterised by the expulsionof water molecules from aggregates of hydrophobic entities.This effect can be captured by considering the evolution ofthe local water density field. The short length scale densityfluctuations contributing to the local density are characterisedby rapid relaxation and follow to a very good approxima-tion Gaussian statistics . These short scale fluctuations cantherefore be integrated analytically , resulting in a coarse grained density field ρ ( (cid:126)r ) which is readily simulated with adiscretized binary field n i which tracks the density fluctua-tions resulting in the appearance of cells i with a lower den-sity, ”vapour” cells n i = 0 , and ”wet” cells within the bulksolvent with n i = 1 . This description is particularly wellsuited for numerical evaluation as the computationally costlyshort length scale fluctuations characteristic of atomistic mod-els have been treated analytically. Within this picture, there isa cost to create a wet/dry interface, given by nearest neigh-bour interactions of the form n i n i ± and an energy associatedwith the solvation of chemical species n i µ ex . This descrip-tion is therefore equivalent to a 3D lattice gas system with theHamiltonian: H [ { n i } , { h i } ] = (cid:88) i [ − µ + µ ex h i ] n i + (cid:15) (cid:88) (cid:104) i,j (cid:105) n i n j (1)where µ is the solvent chemical potential and (cid:104) . . . (cid:105) indicatessummation over nearest neighbours on the lattice. The pres-ence of hydrophobic solutes at lattice sites i with h i = 1 re-sults in an excess chemical potential µ ex .The values of the parameters governing the coarse grainedwater degrees of freedom, µ = 3 (cid:15) − . · k B T and (cid:15) = 1 . k B T , can be determined for a lattice size of l = 0 . nm through comparisons with the experimental bulk valuesfor the isothermal compressibility and the surface tension γ = (cid:15)/ (2 l ) = 0 .
07N m − of water at room temperatureand 1 atm pressure. The value of µ coex = 3 (cid:15) represents thechemical potential of the solvent at phase coexistence with thevapour phase, and the small difference µ − µ coex (cid:28) k B T highlights the fact that water is close to phase coexistenceunder standard conditions. It has been shown that this de-scription of water reproduces faithfully the key properties as-sociated with hydrophobic interactions, in particular the char-acteristic solvation free energy changes with increasing so-lute sizes. This coarse-grained water description has previ-ously been used to study the collapse of a single hydrophobicchain , and we extend this approach here to cover the aggre- a r X i v : . [ phy s i c s . b i o - ph ] O c t gation of amphiphilic chains.With the specific parameterization given above, the latticesolvent used here is below the roughening transition for thethree-dimensional Ising model. As a result, there can be lat-tice artifacts due to a tendency of an interface to align with theorientation of the underlying lattice vectors . A true liquid-vapor interface would not exhibit this behavior. Vaikun-tanathan and Geissler have recently demonstrated that thistendency can give rise to inaccurate solvation free energiesfor hydrophobic solutes that are nanometer sized or larger .This inaccuracy is most significant for aspherical or irregu-larly shaped solutes, but grows less pronounced as the rough-ening transition is approached from below. In the case pre-sented here the model is only slightly below the rougheningtransition, which is located at (cid:15) ≈ . k B T , and so lattice ar-tifacts are only expected to manifest on length scales of about − (cid:96) . Since the critical nucleus size for hydrophobic pep-tide aggregation is about 1nm, or ≈ (cid:96) , we expect any latticeartifacts associated with being below the roughening transi-tion to be negligible for the results presented here. Indeed, themechanism for collapse found for a hydrophobic chain withthe lattice solvent model we employ is consistent with thatfound for same hydrophobic chain with an atomistic solventmodel. For more generally shaped solutes, effects of lat-tice artifacts may best be avoided by adopting Vaikuntanathanand Geisslers related lattice model, which is slightly morecomplicated than that of Ref. (17).For the hydrophobic segments, the excess chemical poten-tial is given by µ ex = c phob v , where c phob = 60 k B T nm − is taken to be the reversible work required to accommodate ahydrophobe of volume v . Idealised hydrophilic segments arewater like and as such the excess chemical potential due tothe presence of hydrophobic solutes vanishes µ ex = 0 . Fur-thermore, weak depletion forces act between two hydropho-bic particles and originate from the reduction of volume fromwhich the solute excludes solvent molecules.The solvent degrees of freedom in the model, { n i } , canbe efficiently sampled using a Metropolis Monte Carlo al-gorithm. By contrast, sampling the solute degrees of free-dom is more complex. The principal problem is to calcu-late the overlap volume v a between the excluded volume v and a given fine cell a . The volume v is typically a unionof overlapping spheres, one for each excluded volume asso-ciated with a solute particle, here a polymer segment. Pre-viously in Refs. (17) and (10), an interpolation scheme wasused that only works if no point in space is within the solvent-excluding radius of more than two spheres simultaneously,and a solute geometry was chosen that avoids this situation.In the present paper we focus on aggregation of multiplechains where densely packed structures are expected and thusmultiple overlaps will occur – thus the existing interpolationscheme is not suitable.Here, we discuss a partial solution to the above problem.Specifically, we present an approximate method of calculat-ing v a when v is, as above, a union of possibly overlappingspheres of a few different sizes. The gradient of v a with re-spect to the centers of these spheres is also easy to calculate.In principle, propagating these gradients to obtain gradients V t o t ( Å ) A BC d(Å)
FIG. 1. Shown in A, the fine grid on which the overlap volumeis calculated. This is mapped onto the lattice on which the field n i is defined as shown in B. In C, the total volume excluded by twospheres of radius R = 4 . ˚A when a distance d apart, as calculatedexactly (Equation (4)) and by the numerical approximation schemewith spacing 1 ˚A. of H eff [ { n i } ] is then simply a (non-trivial) bookkeeping ex-ercise. In describing our scheme, we treat cell indices a asvectors that can be added and subtracted. We denote by x a the coordinates of the corner of cell a with lowest Cartesiancomponents. For a solvent-excluding sphere of radius R cen-tered at x , we can pre-calculate the overlap volumes ˆ v s of allcells s by any method, such as Monte Carlo integration. Wedo this once at the beginning of a simulation.Generically, the center r of a solvent-excluding spherewill not coincide with a cell corner. We denote the indicesof the eight corners of the cell containing r by a , . . . , a ,and their positions by x , . . . , x . We construct eight non-negative weights c ( r ) , . . . , c ( r ) , the sum of which is and whose value depends continuously on r , such that r = (cid:80) k =1 c k ( r ) x k . Any scheme with these characteristics, suchas trilinear interpolation, can be used. The overlap volumesfor cells a near x are then estimated by ˜ v a ( r ) ≈ (cid:88) k =1 c k ( r )ˆ v a − a k . (2)This interpolation scheme has the desirable property that thetotal volume of a sphere, given by (cid:80) a ˜ v a ( r ) , is independent A B
FIG. 2. Hydrophobic collapse of an amphiphilic chains. In A isshown the system before collapse and in B the state of the systemafter 20,000 MC moves have been performed. Vapour lattice sitesare shown as grey, transparent cubes, hydrophobic residues in red,and hydrophilic residues in blue. of r . -1 Largest Bubble Size ( nm ) H y d r o p h o b i c C o n t a c t s FIG. 3. Map of the free energy landscape for the aggregation ofa solution of diblock polymer chains composed of six hydrophilicresidues followed by six hydrophilic residues computed using um-brella sampling (colour scale: energy in k B T ). Superimposed inblack is shown an unbiased trajectory displaying aggregation. When a solute is composed of multiple spheres, centeredat r N , we simply add together the overlap volumes given byEquation (2) for each solute, but we cap the sum at the totalvolume of each fine cell, λ f . In summary, we have v a ≈ min (cid:34) λ f , N (cid:88) n =1 ˜ v a ( r n ) (cid:35) (3)This scheme is exact whenever one or more spheres overlapscell a completely, as well as when two spheres both overlapcell a but not each other. When two or more spheres bothpartially overlap cell a and each other, our scheme mildly overestimates the overlap volume. We have evaluated the pre-cision of our scheme by calculating the total volume V tot oftwo spheres of radius R whose position is varied, and thetwo spheres are placed at arbitrary positions with respect tothe fine grid. The total volume can be calculated analyticallywhen the two sphere centers are a distance d apart, V tot ( d ) = × πR , d > R, × πR − π (4 R + d )(2 R − d ) , otherwise . (4)The results of this comparison are shown in Figure 1C when R = 4 . ˚A and the fine grid resolution is ˚A. As expected,the exact and numerical results agree closely. Of equal im-portance, the spread in the numerical estimate of the total vol-ume is small, which suggests that the lattice artifacts of ouroverlap-volume scheme are quite modest while offering a ro-bust and computationally advantageous solution to the prob-lem of computing overlap integrals for off-lattice solutes.We note that if the solute needed to be propagated throughsome variant of molecular dynamics, such as Langevin dy-namics , the gradient of H eff [ { n i } ] with respect to solute po-sitions would also be needed, and this is easy to calculate tothe present approximation scheme. II. RESULTS
The framework presented in this paper allows the study ofsolutes that can move freely in space in combination with aneffective and computationally tractable explicit solvent modelthat exploits the statistical mechanics of lattice gas models.Using this approach, we have probed the aggregation be-haviour of amphiphilic polymers with differing sequences ofhydrophilic and hydrophobic residues. We model the dimen-sions of our chains on that of polypeptide chains; the ex-cluded volume of the residues V = 310 . ˚A with a corevolume V = 92 . ˚A , is chosen to match that determinedexperimentally for amino acids . Furthermore, the excesschemical potentials of solvation possess values which coverthe range measured for hydrophilic and hydrophobic aminoacids . The polymer is modeled as a Gaussian chain, and thesimulation box with periodic boundary conditions has a vol-ume of 216.0 nm . We used a polymer mass concentration of79.6 mg/ml, a value comparable to the total protein concen-tration in many organisms.First, a solution of polymer chains composed of six hy-drophobic residues followed by six hydrophilic residues spon-taneously aggregates during unbiased simulations to form acluster as shown in Fig. 2 where the hydrophobic sectionsform a dry core and the hydrophilic segments are solvated onthe outside of this core. We follow the mechanism of aggre-gation of the chains by focusing on two coordinates: a sol-vent coordinate measuring the size of largest bubble of vapoursites, and a polymer coordinate which describes the number ofhydrophobic residues that are in contact, defined as their cen-tres lying within a distance less than R + 1 ˚A, where R is theresidue radius. In this manner we have a reporter for both the F r ee E ne r g y HPHPHPHPHPHPHHPPHHPPHHPPHHHPPPHHHPPPHHHHHHPPPPPP
FIG. 4. The free energy landscape in the solvent coordinate forpolymers which all have as half of their residues hydrophobes andas the other half hydrophilic residues, but where the distribution ofhydrophobes within the sequence has been varied. changes in the conformations of the polymer chains as well ason the behaviour of the solvent. Unbiased trajectories showthat during the aggregation process initially rapid fluctuationsin the number of hydrophobic contacts are observed; this pro-cess in itself does not, however, lead to aggregation of thechains since any contacts formed are able to dissociate readilyduring the simulation. However, the system may undergo acritical fluctuation in the water coordinate leading to the dry-ing of a hydrophobic contact. This fluctuation then drives thecomplete aggregation as other residues are subsequently re-cruited into this hydrophobic core. This mechanism is anal-ogous to that observed for the formation of intra-molecularcontacts in purely hydrophobic chains studied previously .The importance of fluctuations in the solvent degrees offreedom, which allow hydrophobic contacts to be stabilised,raises the question of how the ease of generating such fluc-tuations depends on the sequence in which hydrophobic andhydrophilic residues are distributed within the polymer chain.This question is also motivated by the empirical observationthat there is evidence from studies of protein sequences thatsignificant evolutionary pressures govern the distributions ofhydrophilic and hydrophobic residues in order to avoid un-wanted aggregation , in addition to the more conventionalrole of sequence in determining the final fold of the chain .We investigated this question by generating different se-quences of polymers which all share the same average com-position of 50% hydrophilic residues and 50% hydrophobicresidues. The evaluation of the free energy barriers in Fig. 4against aggregation reveals that the sequence of the polymer,even for a constant average composition, has a major role indetermining its propensity to aggregate. The free energy barri-ers are significantly larger for polymers which have only smallhydrophobic clusters and where the residues are evenly dis-tributed throughout the chain. By contrast, for polypeptidechains where the hydrophobic residues are segregated to oneend of the chain, we observe a significantly reduced barrier -6-4-20246 0 5 10 15 20 25 30 H y d r o ph o b i c i t y ( K y t e - D oo li tt l e ) residue numberwild typemutant -4-2024 -5 0 5 10 15 20 25 30 35 -4-2024-4-2024 -5 0 5 10 15 20 25 30 35 -4-2024 WTM clustered hydrophobic residues
Hydrophobicity (Kyte-Doolittle) - - - - - FIG. 5. Analysis of the aggregation propensity horse heart apomyo-globin and a scrambled sequence (top). The local Kyte-Doolittlehydrophobicity is shown as a bar chart as a function of residue num-ber for both the wild type and scrambled sequence; a sliding windowaverage of size 4 over the sequence length of wild type horse heartapomyoglobin (solid lines) compared to scrambled sequence result-ing in an aggregation prone mutant (red line), highlighting the role ofclusters of hydrophobic residues in favouring aggregation even in theabsence of an overall increase of the hydrophobicity of the sequence,in agreement with the simulation results in Fig.4. The sequences arefrom . and increased propensity to aggregate. The entropic penalty ofbringing together a critical number of hydrophobic residues,that are required to observe a drying transition leading a sta-ble hydrophobic contact, is significantly greater when theseresidues are not at adjacent positions in the chain but dis-tributed throughout the sequence. In this manner, the interplaybetween polymer degrees of freedom and solvent degrees offreedom generates a very significant and sensitive dependencyof the aggregation potential of the chain on the precise place-ment of the hydrophobic residues.Interestingly, experiments designed to probe the aggrega-tion propensity of sequence scrambled variants of the first 29residues of horse heart apomyoglobin have been reported .This system consists of a short peptide where the aminoacid composition has been kept constant, but several mutantswere generated where the position of the amino acids withinthe sequence was varied. It was observed that such mu-tants possessed markedly different aggregation propensitiesdespite their common amino acid composition, with aggrega-tion prone clusters of hydrophobic residues being particularlyassociated with a high propensity to aggregate. To facilitatea quantitative comparison between these different peptideswe map their sequences onto a quantitative measure of hy-drophobicity. The Kyte-Doolittle scale is one such measurethat associates a scalar hydrophobicity score with each aminoacid. If we consider the local Kyte-Doolittle hydrophobicity of the most aggregation prone mutant reported in the studyrelative to the aggregation resistant wild type, as shown inFig. 5, a marked difference in the distribution pattern of thehydrophobic residues is apparent, with the aggregation-pronemutant possessing a cluster of hydrophobic residues whichare distributed more evenly throughout the sequence for thewild type. This type of observation is in close agreement withthe importance of clusters of hydrophobic residues determinedfrom simulations in the present work. III. CONCLUSIONS
In this paper we have developed and described an approachto use off-lattice solutes within a statistical mechanical modelof lattice solvents. Our system is computationally tractable,yet includes the relevant solvent degrees of freedom. We haveused this system to probe the role of the distribution withinthe sequence of amphiphilic polymers of the positions of hy-drophobes. Our simulations show that highly aggregationprone chains result when the hydrophobes are not distributedevenly within the chain but cluster in close proximity alongthe chain. This clustering favours the nucleation of a dry hy-drophobic core when two or more such chains come together,leading to an inter-molecular hydrophobic collapse stabilisingthe aggregated state.
ACKNOWLEDGMENTS
We thank the BBSRC, the Frances and Augustus NewmanFoundation and the Wellcome Trust (TPJK) for financial sup-port and Daan Frenkel and Suriyanarayanan Vaikuntanathanfor helpful discussions. We are indebted to Michele Vendr-uscolo for bringing to our attention the data in Fig. 5. DChas been supported in part by Director, Office of Science, Of-fice of Basic Energy Sciences, and by the Division of Chem-ical Sciences, Geosciences, and Biosciences of the U.S. De-partment of Energy at LBNL under Contract No. DE-AC02- 05CH11231 Wallqvist, A.; Berne, B.
J. Chem.Phys. , (9), 2893–2899. Lum, K.; Chandler, D.; Weeks, J. D.
J. Phys.Chem. B , (22), 4570–4577. Huang, D. M.; Geissler, P. L.; Chandler, D.
J. Phys.Chem. B , (28),6704–6709. Raschke, T. M.; Tsai, J.; Levitt, M.
Proc. Natl. Acad. Sci , (11),5965–5969. Huang, D. M.; Chandler, D.
J. Phys. Chem. B , (8), 2047–2053. Dixit, S.; Crain, J.; Poon, W.; Finney, J.; Soper, A.
Nature , (6883),829–832. Chandler, D.
Nature , (7059), 640–647. Granick, S.; Bae, S. C.
Science (New York, NY) , (5907), 1477–1478. Rasaiah, J. C.; Garde, S.; Hummer, G.
Annu. Rev. Phys. Chem. , ,713–740. Willard, A. P.; Chandler, D.
J. Phys.Chem. B , (19), 6187–6192. Berne, B. J.; Weeks, J. D.; Zhou, R.
Annu. Rev. Phys. Chem. , , 85. Patel, A. J.; Varilly, P.; Chandler, D.
J. Phys.Chem. B , (4), 1632–1637. Hummer, G.
Nature chemistry , (11), 906–907. Garde, S.; Patel, A. J.
Proc. Natl. Acad. Sci , (40), 16491–16492. Patel, A. J.; Varilly, P.; Jamadagni, S. N.; Hagan, M. F.; Chandler, D.;Garde, S.
J. Phys.Chem. B , (8), 2498–2503. Ben-Amotz, D.
J. Phys. Chem. Lett . ten Wolde, P. R.; Chandler, D. Proc. Natl. Acad. Sci , (10), 6539–6543. Miller, T. F.; Vanden-Eijnden, E.; Chandler, D.
Proc. Natl. Acad. Sci , (37), 14559–14564. Jamadagni, S. N.; Godawat, R.; Dordick, J. S.; Garde, S.
J. Phys.Chem. B , (13), 4093–4101. Hummer, G.; Garde, S.; Garcia, A. E.; Pohorille, A.; Pratt, L. R.
Proc. Natl. Acad. Sci , (17), 8951–8955. Chandler, D.
Phys. Rev. E , (4), 2898. Varilly, P.; Patel, A. J.; Chandler, D.
J. Chem. Phys. , (7), 074109. Weeks, J. D.
J. Chem. Phys , , 3106. Vaikuntanathan, S.; Geissler, P. L.
Phys. Rev. Lett. , (2), 020603. Vaikuntanathan, S.; Rotskoff, G.; Hudson, A.; Geissler, P. L.Proc. Natl. Acad. Sci , (16), E2224–E2230. Levitt, M.
J. Mol. Biol. , (1), 59–107. Roseman, M. A. J. Mol. Biol , (3), 621–623. Monsellier, E.; Ramazzotti, M.; de Laureto, P. P.; Tartaglia, G.-G.; Taddei,N.; Fontana, A.; Vendruscolo, M.; Chiti, F.
Biophys. J. , (12), 4382– 4391. DuBay, K. F.; Pawar, A. P.; Chiti, F.; Zurdo, J.; Dobson, C. M.; Vendrus-colo, M.
J. Mol. Biol , (5), 1317–1326. Anfinsen, C. B.
Science , (96), 223–230. Kyte, J.; Doolittle, R. F.
J. Mol. Biol ,157