[PDF] Competition between monomeric and dimeric crystals in schematic models for globular proteins

Abstract

Advances in experimental techniques and in theoretical models have improved our understanding of protein crystallization. But they have also left open questions regarding the protein phase behavior and self-assembly kinetics, such as why (nearly) identical crystallization conditions sometimes result in the formation of different crystal forms. Here, we develop a patchy particle model with competing sets of patches that provides a microscopic explanation of this phenomenon. We identify different regimes in which one or two crystal forms can coexist with a low-density fluid. Using analytical approximations, we extend our findings to different crystal phases, providing a general framework for treating protein crystallization when multiple crystal forms compete. Our results also suggest different experimental routes for targeting a specific crystal form, and for reducing the dynamical competition between the two forms, thus facilitating protein crystal assembly.

Full PDF

CCompetition between monomeric and dimeric crystals in schematic models forglobular proteins

Diana Fusco

1, 2 and Patrick Charbonneau

1, 2, 3 Program in Computational Biology and Bioinformatics, Duke University, Durham, NC 27708 Department of Chemistry, Duke University, Durham, NC 27708 Department of Physics, Duke University, Durham, NC 27708

Advances in experimental techniques and in theoretical models have improved our understandingof protein crystallization. But they have also left open questions regarding the protein phase behaviorand self-assembly kinetics, such as why (nearly) identical crystallization conditions sometimes resultin the formation of diﬀerent crystal forms. Here, we develop a patchy particle model with competingsets of patches that provides a microscopic explanation of this phenomenon. We identify diﬀerentregimes in which one or two crystal forms can coexist with a low-density ﬂuid. Using analyticalapproximations, we extend our ﬁndings to diﬀerent crystal phases, providing a general frameworkfor treating protein crystallization when multiple crystal forms compete. Our results also suggestdiﬀerent experimental routes for targeting a speciﬁc crystal form, and for reducing the dynamicalcompetition between the two forms, thus facilitating protein crystal assembly.

I. INTRODUCTION

Crystallizing biomolecules is of central importance fordetermining their three-dimensional structure throughX-ray or neutron diﬀraction [1, 2], but remains noto-riously diﬃcult to achieve [3]. Fortunately, increas-ing the number of new structures deposited in publicdatabases [1] enriches our understanding of eﬀective crys-tallization screens and strategies that can be used withthe vast majority of biomolecules that still resist assem-bly [4]. Yet even elementary structural analyses revealblind spots in our materials comprehension. For instance,among the 80K Protein Data Bank (PDB [1]) depositedstructures obtained through X-ray crystallography, 45%come from monomeric structures, 43% come from homo-mers, and 64% of these homomers result from dimericassembly. Dimer formation may thus be an importantaspect of crystal formation, yet has thus far been mostlyneglected [5].A related feature is that many proteins crystallize inmore than one crystal form, with some instances, suchas lysozyme, resulting in tens of diﬀerent unit cells [6, 7].This diversity is partially caused by the variety in crystal-lization conditions. Diﬀerent cosolute, pH levels, and saltconcentrations can tilt the scale toward diﬀerent protein-protein interaction mechanisms, leading to the assemblyof distinct crystal forms. A very high salt concentration,for instance, strengthens hydrophobic interactions andscreens electrostatic ones. Yet even under the same so-lution conditions, and thus presumably similar eﬀectiveprotein-protein interactions, diﬀerent crystals are some-times found to assemble [8–11]. This phenomenon hasbeen observed in at least three diﬀerent experimentalcontexts: (i) by changing the crystallization tempera-ture [11], (ii) by changing the initial protein concentra-tion [11], (iii) by letting the crystallization experimentrun longer [8–11]. This last eﬀect is particularly inter-esting because studies have found that the crystal withthe slower growth rate can typically be resolved at a higher resolution; it is less likely to incorporate defectsand is thus often preferable, especially for neutron diﬀrac-tion [12].A microscopic understanding of these experimental ob-servations would help target the desirable crystal phaseand increase its growth speed and reliability. Studies ofsmall molecules have revealed how atomistic conforma-tional changes result in metastable solid polymorphismswith diﬀerent nucleation rates [13]. In larger and morerigid biomolecules, such as globular proteins, an alterna-tive cause to such phenomenon is the presence of com-peting crystal contacts. In this scenario, one possiblemechanism for crystal competition is the formation ofprotein dimers, which occurs when a given region on theprotein surface strongly interacts with the same region ona diﬀerent chain. Under these circumstances, a crystalof dimers, which satisﬁes the speciﬁc dimeric interaction,may also compete with a crystal of monomers, which doesnot carry the dimeric interaction but may be more eﬃ-ciently packed. Given the reported abundance of dimericand monomeric crystals, this scenario oﬀers a promisingstarting point for understanding the role of competitionin protein crystal assembly.From a physical viewpoint, identifying the solutionconditions leading to the formation of protein crystals isakin to determining the protein solution phase diagram.In typical experimental setups, a protein is crystallizedby super-saturating a low concentration protein solutionat constant temperature [2]. The crystal that nucleates isthus expected to be the thermodynamically stable form,leaving behind a solution depleted in proteins. Early ex-perimental characterization revealed an analogy betweenthe phase behavior of globular proteins and that of col-loidal particles with short range interactions [14, 15],which both often display a metastable critical point be-low the crystal solubility line. In these systems, success-ful crystallization is typically achieved in the region in-termediate between the solubility line -above which thesolution is stable- and the liquid-liquid critical point - a r X i v : . [ c ond - m a t . s o f t ] M a r below which the system precipitates into amorphous ma-terials [16–18]. The observation that even spherical andrigid globular proteins are characterized by directionalinteractions (these interactions are notably responsiblefor the low packing fraction of protein crystals comparedto atomic solids), however, soon led to the replacementof spherical symmetry with patchiness [19–23]. In patchymodels, a protein is described as a spherical particle dec-orated with attractive patches that mimic the solution-mediated directional protein-protein interactions drivingcrystal assembly. Properly parameterized, these modelscan achieve near-quantitative agreement with experimen-tal phase diagrams of simple proteins [19, 22]. Thus far,however, patchy models have assumed that only patchesleading to a single crystal form are present on the pro-tein surface. To take account of competing crystal forms,a dual set of complementary patches that correspond todistinct crystal symmetries has to be included.In this study, we design a patchy particle model for pro-teins with competing interactions that can result in bothmonomeric and dimeric crystal forms. Using numericalsimulations, we characterize the model’s phase diagramunder diﬀerent interaction parameters and test whetherit can explain the experimental observation that diﬀerentcrystal forms can assemble depending on (i) crystalliza-tion temperature, (ii) initial protein concentration, and(iii) experimental time. II. METHODS

In the following section, we describe the schematic pro-tein model, summarize the details of the simulation tech-niques, and present analytical approximations that canbe used to extend the present analysis to diﬀerent crystallattices.

A. Model

We adopt a schematic patchy particle model for pro-teins with a distribution of orientational interactions cho-sen such that the model can form both monomeric anddimeric crystals (Fig. 1). Each protein is represented bya hard sphere, which models the overall steric repulsionbetween two proteins, decorated by square-well attractivepatches that represent the protein-protein interactions atcrystal contact. The solvent contribution is taken to beeﬀective and is thus directly integrated into the patch-patch interaction potential. Because the crystal symme-try P2 is the most common for both monomericand dimeric crystals in the PDB [1], we assign patch po-sitions according to protein crystals that already havethis symmetry. The monomeric crystal patches are cho-sen to be the same as those used in a previous study ofthe rubredoxin crystal (PDB: 1BRF) [22]; the dimericcrystal patches follow the crystal symmetry of the yeastMyo5 SH3 domain (PDB: 1ZUY), whose chain length and crystal density are similar to those of rubredoxin.This choice guarantees that the two crystal forms of theschematic model have comparable number density, al-though the dimeric crystal happens to be slightly denser.Note that the model does not correspond to a speciﬁcprotein, but should be taken as prototype for monomeric–dimeric crystal competition (Fig. 1). MonomericDimeric

FIG. 1. Representation of the patchy particle and the twocrystals unit cells. The green patch is the dimeric interaction( D ), the red patches are the crystal contacts of the monomericunit cell ( m ) and the blue patches are the crystal contacts ofthe dimeric unit cell ( d ). The purple patch represents patch 2in Table I, which is a shared crystal contact between the twocrystal forms. Each particles carries a set Γ of n = 13 patches whosepair interactions are depicted in Figure 1. Particles 1and 2, whose centers are a distance r apart, interactthrough a pair potential φ ( r , Ω , Ω ) = φ HS ( r ) + n (cid:88) i,j =1 φ i,j ( r , Ω , Ω ) , (1)where Ω and Ω are Euler angles describing the orienta-tion of the two particles. The hard-sphere (HS) potentialcaptures the volume exclusion up to a diameter σφ HS = (cid:26) ∞ r ≤ σ r > σ, (2)and patch-patch interactions are the product of a radialand of an angular component φ i,j ( r , Ω , Ω ) = ψ i,j ( r ) ω i,j (Ω , Ω ) . (3)The radial component depends on the patch type and theinter-particle distance ψ i,j = (cid:26) − ε i,j r ≤ λ i,j and p i,j = 10 otherwise , (4)where p i,j takes value 1 if patch i interacts with j hasreported in Table I and 0 otherwise, whereas λ i,j and ε i,j are the square-well interaction range and strength,respectively. The angular part guarantees that patchesonly interact when facing each other ω i,j (Ω , Ω ) = (cid:26) θ ,i ≤ δ i and θ ,j ≤ δ j , (5)where θ ,i is the angle between vector r and patch i vec-tor on particle 1, θ ,j is the angle between r = − r and patch j vector on particle 2, and δ i and δ j are thesemi-angular widths of the patches. The choice of inter-action potential implicitly assumes that long-range elec-trostatic repulsion can be ignored and that no isotropicdepletion forces are at play. These assumptions are rea-sonable for almost 50% of successful crystallization ex-periments, in which the relatively high salt concentrationscreens long-range electrostatics and no depletion agent,such as poly-ethylene glycol, is used [24].In the following, we simplify the model by assumingthat the patch width and interaction range are equal forall patches, using δ i = acos(0 .

99) and λ i,j = 1 . σ as typ-ical values for protein-protein interactions [22]. Threetypes of interaction energies are considered: ε D for thepatch that holds the dimer together, i.e., the dimericpatch, and ε m and ε d for the patches corresponding tothe crystal contacts of the monomeric and the dimericcrystals, respectively. From the P2 lattice geome-try, we obtain a lattice energy per particle e m = − ε m for the monomeric crystal and e d = − ( ε D +5 ε d ) / σ , and energy and inverse temper-ature β = 1 /k B T , where k B is the Boltzmann constant,in units of ε D .Our choice of model accounts for systems in which thedimeric patch is stronger than the other, less speciﬁc in-teractions. It explores the regimes in which the dimericpatch strength suﬃces to control crystal assembly com-pared to diﬀerent combinations of ε d and ε m . In partic-ular, we consider the case ε d = ε m , which corresponds toidentical crystal contacts for the two crystal forms, andthe case e d = e m ( ε m = ε d ), which corresponds to anidentical lattice energy for the two crystal forms (Fig. 2).Table I reports the speciﬁc positions of the patches onthe sphere and their interaction energy pairing. B. Numerical Simulations

Phase diagrams for the model family are computedwith the help of special-purpose Monte Carlo (MC) sim-ulation methodologies that closely follow the approachused in Ref. 21. The gas-liquid line of the phase diagramis obtained using the Gibbs ensemble method [25], andthe critical temperature T c and density ρ c are extracted number θ φ interacting pair type D d / m d d d d d d m

10 1.9858 0.6667 9 m

11 2.7123 3.5542 12 m

12 0.4293 3.5542 11 m

13 1.0007 1.1452 2 m TABLE I. Patches spherical coordinates (in radiants), inter-acting patch and types, classiﬁed as D for the dimeric patch, d for the dimeric crystal contacts and m for the monomericcrystal contacts. Columns 1 and 4 identify the pairs of inter-acting patches for which p i,j = 1 FIG. 2. Combination of crystal contact strengths explored inthis paper. The blue solid line represents the case in whichboth crystals have identical crystal contact strength, the reddashed line when the energy per particle in the two crystalsis the same. The symbols identify the speciﬁc parameter setsinvestigated in simulations. using the law of rectilinear diameters [26]. For these com-putations, a system of N = 1000 particles is simulatedfor an equilibration run of 2 × MC cycles that pre-cedes the production run of 3 × MC cycles. Each MCcycle consists of N particle displacements, N particle ro-tations, N/

10 particle swaps between the liquid and thegas box and 2 volume moves.The solubility line is computed by integrating theClausius-Clapeyron equation starting from a coexistencepoint determined using free energy calculations and ther-modynamic integration [26, 27]. In this case, the crystalfree energy is computed by integrating from an Einsteincrystal [26], whose free energy is evaluated by a saddlepoint approximation [21], and the ﬂuid free energy is in-tegrated from the ideal gas reference state. Free-energyintegration over isotherms or isobars identiﬁes the coex-istence points between the ﬂuid and the dimeric crystal,the ﬂuid and the monomeric crystal, and the two crystals.Simulations at constant N , pressure P , and T are runusing N =500 particles for the ﬂuid and the monomericcrystal, and N =512 particles for the dimeric crystal. Inthis case, each MC cycle corresponds to N particle dis-placements, N particle rotations, and 2 volume movesthat are isotropic for the ﬂuid and anisotropic for thecrystal.To analyze the ﬂuid phase dynamics, we perform MCsimulations with N =864 particles, at constant V and T = T c for 3 × MC cycles. Previous studies haveshown that MC simulations qualitatively capture theBrownian dynamics of patchy particle ﬂuids similar tothose used here [21, 28, 29]. During the course of thesimulation we track the number and size of monomericand dimeric crystallites, as well as the number of dimers.We classify a particle as being part of a monomeric ordimeric crystallite if, respectively, all its monomeric ordimeric crystal contacts are satisﬁed. Two crystal parti-cles belong to the same monomeric or dimeric crystalliteif they are bonded through a monomeric or a dimeric con-tact, respectively. The number of dimers is straightfor-wardly deﬁned as the number of particles whose dimericpatch is satisﬁed divided by 2. The nucleation barriersare calculated using umbrella sampling in constant

N P T

MC simulations using the crystal cluster size as order pa-rameters, similarly to Ref. 21.

C. Analytical approximations

We derive analytical approximations for the ﬂuid andcrystal free energy using Wertheim perturbation theoryand cell theory, respectively, in order to generalize thesimulation results to a broader set of interaction energyvalues and diﬀerent crystal lattices.

1. Wertheim perturbation theory for the ﬂuid

Wetheim’s perturbation theory [30, 31] approximatesthe free energy of a ﬂuid of patchy particles as the HSfree energy [32] and a bond free energy correction a f = a HS + a bond , (6)where βa bond = n (cid:88) i =1 (cid:18) ln X i − X i (cid:19) . (7)Here X i is the probability that the particle is not bondedat patch i . The chemical potential is then given by βµ f = βa f + βPρ = βa HS + βa bond + βP HS ρ + βP bond ρ , (8) where ρ = N/V is the number density and the pressurecontribution to bonding is βP bond = ρ n (cid:88) i =1 (cid:18) ∂X i ∂ρ (cid:19) (cid:18) X i − (cid:19) . (9)The value of X i can be determined by solving the equa-tion of mass-action X i = 11 + (cid:80) nj =1 ρX j ∆ i,j . (10)Following the notation of Ref. 33, we deﬁne ∆ i,j as∆ i,j = (cid:90) g r (12) f i,j (12) d (12) , (11)where d (12) denotes an integral over all orientations andseparations of two particles, g r is the radial distribu-tion function of HS ﬂuids and f i,j = exp[ − βφ (12)] − i,j canbe approximated by using the contact value of the radialdistribution function g + , such that∆ i,j = π [1 − cos( δ )] ( λ − g + . (12)As showed in Ref. 33, for patches that interact with asingle partner, it follows that X i = 21 + (cid:112) ρ ∆ i,j , (13)where j denotes the partner site. In our model, all in-teractions but three fall into this category. For the otherthree (2, 3 and 13 in Table I), because patch 2 interactswith both 3 and 13 with diﬀerent energies, the solutioncomes from solving X + ρX [∆ , X + ∆ , X ] = 1 X + ρ ∆ , X X = 1 X + ρ ∆ , X X = 1 . (14)In the special case ε m = ε d , we also have that ∆ , =∆ , , hence X = X = X +12 , which oﬀers an analyti-cal solution X = − (1 + ρ ∆ , ) + (cid:113) ρ ∆ , + ρ ∆ , ρ ∆ , . (15)

2. Cell model for crystals

Previous studies [21, 35] of the ﬂuid-crystal coexistenceof patchy particles approximated the chemical potentialof the crystal as βµ c = βe c − s c + βP/ρ c ∼ βe c − s SC , (16)where e c and s c , indicate the lattice energy per particleand the entropy per particle, respectively. This approx-imation assumes that βP/ρ is small, which is realisticwhen studying liquid-crystal coexistence at low pressure,and that the entropy of the crystal, whatever its sym-metry, can be roughly approximated by that of a simplecubic lattice, s SC . This last assumption, however, breaksdown when analyzing coexistence between two crystals,or otherwise the only remaining contribution is the dif-ference in lattice energy.A cell model for a crystal approximates the partitionfunction by the free volume of each particle [36]. Sim-ilarly to the HS case, we use a Voronoi tessellation todivide the crystal in cells V i , each containing a singleparticle i , by construction. The partition function is thenobtained by assuming perfect decorrelation of the cells Z ≈ (cid:90) V i d r i d Ω i exp( − β (cid:88) i

In this section, we discuss the analytical and simula-tion results for two sets of parameters: (i) equal latticeenergy ( e d = e m ), and (ii) equal crystal contact energy( ε d = ε m ). The relatively good agreement between simu-lation and theory allows us to extend the analytical treat-ment and to draw more general insights into the role ofcompetition in protein crystal assembly. A. Case 1: Equal lattice energy

In the case where both crystal forms have the samelattice energy, we ﬁnd that the dimeric crystal phase has ρ s FIG. 3. Cell-model crystal entropy of the monomeric (bluecircles) and dimeric (red squares) lattices as a function of crys-tal density determined using Monte Carlo integration. Thelines are provided as a guide for the eye. no thermodynamic advantage over the monomeric crystalat low-to-medium protein ﬂuid concentrations. Indepen-dently of the crystal contact energy, the ﬂuid thermody-namically coexists with the monomeric crystal (Fig. 4).Although stable, the dimeric crystal is only found at veryhigh pressures and ρ > ε m ≈ ε d (cid:28) ε m = ε d = 1at ρ < .

4, monomeric crystallites grow into large clus-ters, while the dimeric crystals remain small (Fig. 5). At ρ > .

4, however, only a few small monomeric crystallitesare observed, while the number of dimeric crystallites in-creases. Remarkably, although the ﬂuid supersaturationincreases with density, crystallization of the thermody-namically preferred crystal becomes increasingly dynam-ically suppressed, even though the system is still far fromthe glassy regime. In light of this observation, it is rea-sonable to hypothesize that as the ﬂuid density increases,crowding enhances the occurrence of interactions that areincompatible with the monomeric crystal, i.e., dimericcrystal contacts and patches. In other words, the ﬂuid-monomeric-crystal interfacial free energy likely increases,which increases the barrier to crystal nucleation.The dimeric patch inhibits monomeric crystal assemblyeven more so when ε d <

1, as the energetics of dimer for-mation grows increasingly favored (Fig. 5). As ε d is low-ered, the number of dimers indeed steadily increases andno spontaneous monomeric crystal growth is observedat any density, over the course of the simulation. At ε d = 0 . ρ (cid:38) .

3, the dimeric crystal is more stablethan the ﬂuid, as conﬁrmed by the growth (melt) of adimeric crystal seed above (below) this density, and thenear constant satisfaction of the dimeric patch suggests d = m =0.40 0.5 10.10.120.140.160.180.2 d = m =1 T / D d = m =0.325 0 0.5 10.0110.01150.0120.0125 d = m =0.1 MF DF F FM M D D (b) (c) (d) d =0.1 m =0.25 (a) (e) F M

FIG. 4. Temperature-density phase diagrams for a set of models. (a) e d = e m = 0 .

75, (b) ε d = ε m = 1, (c) ε d = ε m = 0 . ε d = ε m = 0 . ε d = ε m = 0 .

1. Black squares indicate the liquid-liquid coexistence line and the ﬁlled squares thecritical point obtained through ﬁtting (black line). Blue circles indicate the ﬂuid–monomeric crystal coexistence, red trianglesmonomeric-dimeric crystal coexistence, and green right-pointed triangles ﬂuid–dimeric crystal coexistence (lines are guides forthe eye). The dashed line represents the metastable ﬂuid-dimeric crystal coexistence line. The letters indicate the stabilityregions for the ﬂuid (F), the monomeric crystal (M) and the dimeric crystal (D). that the metastable dimeric crystal may be kinetically ac-cessible. Although the nucleation of the metastable phaseis clearly observed in Case 2 (see below), specializedrare-event sampling methodologies are needed to conclu-sively settle this issue. Calculations of the nucleationbarriers for the two crystals at diﬀerent densities sup-ports the kinetic control hypothesis (Fig. 5). The heightof the barrier increases with density for the monomericcrystal and decreases for the dimeric crystals, in spiteof the fact that the drive to crystallize ∆ µ steadily in-creases. The interfacial free energy between the ﬂuid andthe monomeric crystal thus rapidly increases with den-sity. Above ρ = 0 .

5, the free energy barrier to forminga dimeric crystal is even lower than the monomeric one,suggesting that the former likely precipitates earlier. Yet,at much longer times, the dimeric crystal should even-tually transform into the thermodynamically-preferredmonomeric crystal through a (much slower) crystal re-organization. The formation of dimeric crystals underkinetic control can therefore explain two key experimen-tal observations: diﬀerent initial protein concentrationsleading to diﬀerent crystals, depending on whether theconcentration is above or below the metastable dimericcrystal solubility line; and diﬀerent experimental waitingtime leading to diﬀerent crystal structures.It is also interesting to note that for ε d <

1, the frac-tion of dimers in the ﬂuid decreases with increasing ﬂuiddensity. At low density, clusters of size larger than twoare rare due to their entropic cost, and therefore thestrongest (dimeric) patch dominates clustering; by con-trast, at high densities, multiple particles come in contactand the weaker patches more easily participate in the for-mation of aggregates, without necessarily satisfying alldimeric interactions. Increasing temperature, however,leaves only dimers in the metastable ﬂuid at all densi-ties, because the eﬀective stickiness of weaker patcheslowers and the dimeric patch becomes the dominating interaction type. Note that a similar phenomenon hasbeen reported for patchy particles with a single, narrowpatch [38]. In these models, simple oligomers are ob-served at low densities, but larger clusters form in high-density ﬂuid. In our case, the dimeric patch is too nar-row to allow for multiple contacts, but the other (weaker)patches play that role in its stead.

B. Case 2: Equal crystal contact energy

The topology of the phase diagram dramaticallychanges upon tuning ε d = ε m ≤ ε d = ε m = 1 the only stable crystal is the monomeric crys-tal and the phase diagram is qualitatively similar to theone discussed in the previous section. Under thermody-namic control, a low-density protein solution would nec-essarily crystallize in the monomeric form. Decreasingthe crystal contact strength allows for the dimeric crys-tal form to appear at ρ <

1, but up to ε d = ε m ≈ . ρ ≥ . T , where the entropic contribution to the free energy issmall. The accessibility of this phase depends on therelative position of the ﬂuid–dimeric crystal–monomericcrystal triple point with respect to the gas-liquid criticalpoint. The phase diagram at ε d = ε m = 0 .

325 illustrates ρ ε d monomericdimeric>503-501-20>503-501-20 ρ Δ G monomericdimeric ρ d i m e r f r a c t i o n (a) (b) (c) FIG. 5. Nucleation behavior of the two crystals. (a) Size of monomeric and dimeric crystals as a function of the ﬂuid densityand crystal energy e d = e m after 3 × MC steps at T c . The dashed line is the limit of solubility of the metastable dimericcrystal. (b) Nucleation barriers as a function of ﬂuid density for monomeric (blue circles) and dimeric (red squares) crystalsat ε d = 1 (empty symbols) and ε d = 0 . this situation. In this case, the triple point temperature T t is higher than T c , and therefore both crystal formsare thermodynamically accessible from the low densityﬂuid: the monomeric crystal at higher temperatures andthe dimeric crystal at lower temperatures. Interestingly,a comparable eﬀect has been observed in lysozyme. Un-der identical solution conmpositions, but diﬀerent tem-peratures (5 ◦ C vs. 35 ◦ C), lysozyme assembles in diﬀer-ent crystal forms [11]. The absence of deposited crystalstructures, however, prevents us from determining if alysozyme dimer is involved in the assembly.Note that when ε d = ε m becomes suﬃciently low, thetriple point temperature increases so much that only veryhigh density ﬂuids could coexist with the monomericcrystal. In actual protein crystallization experiments,reaching such a high protein concentration, however, typ-ically results in denaturation [39], hence the patchy parti-cle description breaks down. In this case, the monomericcrystal is thus assumed to be unreachable and the proteincan only form a dimeric crystal.Figure 4 indicates that in general weakening crystalcontacts lowers the critical point and raises the triplepoint. The analytical Wertheim-cell model allows usto extend these simulation results to any ε d = ε m , soas to better capture the thermodynamics of crystal as-sembly. [40] The case ε d (cid:54) = ε m could also be examinedby numerically solving Eq. (14), but is not consideredhere. Equating the chemical potential and pressure oflow- and high-density ﬂuids provides an estimate for thecritical temperature (blue line in Fig. 6) [21]. Because atlow pressure the monomeric–dimeric crystal coexistence T P line is almost vertical, we can safely assume thatthe triple point between the ﬂuid and the two crystals,i.e., the minimum stability temperature of the monomericcrystal, is found in the low-pressure regime. In this case,equating the free energies of the two crystals provides the triple point temperature T t (red line in Fig. 6)1 T t ∼ s d ( ρ d ) − s m ( ρ m ) e d − e m . (18)The critical and triple temperatures crossover at a spe-ciﬁc value ε d = ε m = ε + . For ε d = ε m > ε + , T c > T t ,and thus, in order to crystallize in the dimeric form, den-sifying a protein ﬂuid would have to cross the metastablegas-liquid coexistence regime. Gelation should then typ-ically render this crystal form inaccessible [16–18]. For ε d = ε m < ε + , the triple point is higher than the crit-ical point. A crystallization experiment at a tempera-ture T c < T < T t would result in dimeric crystal forma-tion. Comparing the position of the critical and the tripletemperature thus identiﬁes an upper threshold ε + abovewhich the ﬂuid can only crystallize in the monomericcrystal, and below which both crystal forms are attain-able by tuning the system temperature. FIG. 6. Critical temperature T c (blue solid line) and tripletemperature T t (red dashed line) as a function of ε d = ε m .The crossing point identiﬁes the crossover ε + above whichonly the monomeric crystal can be obtained, and below whichboth monomeric and dimeric crystals are accessible. At low ε d = ε m , the monomeric crystal is only reach-able from a very high-density ﬂuid (Fig. 4), which, asargued above, is experimentally inaccessible. A secondthreshold ε − for ε d = ε m , below which only the dimericcrystal coexists with the low density ﬂuid can thus bedeﬁned. The value of ε − is estimated as the point, wherethe triple point involves a high-density ﬂuid (arbitrarilyset to ρ = 0 . e m − T s m ( ρ m ) + P f /ρ m = µ f e d − T s d ( ρ d ) + P f /ρ d = µ f . (19)Here, P f and µ f are the values for a high-density ﬂuid,and ρ m ( ρ d ) are the monomeric (dimeric) crystal coexist-ing densities. For ε d = ε m , independently of the actualvalue of the energy, ρ d ∼ .

89 and ρ m ∼ .

82, whichspeciﬁes the crystal entropies in Eq. (19) (Fig. 3).

FIG. 7. Crystal stability regimes. The blue line is the criti-cal temperature, the red dashed line is the coexistence tem-perature between dimeric and monomeric crystals at the lowpressure limit. The green dotted line and the magenta dot-dashed line are the coexistence temperatures between the ﬂuidat ρ = 0 . ε + and lower ε − thresholds. Figure 7 summarizes the interplay between the crit-ical point, the triple point at low pressure, and thetriple point coexisting with a high density ﬂuid. When ε d = ε m < ε − , the monomeric crystal is only stable whenit coexists with a ﬂuid with ρ > .

8, while the dimericcrystal coexists with a low-to-medium density ﬂuid. Inthe range ε − < ε d = ε m < ε + , the ﬂuid coexists withboth the dimeric (at lower temperatures and densities)and the monomeric (at higher temperatures and densi-ties) crystals. For ε d = ε m > ε + , the critical temper-ature rises above the triple point and the dimeric crys-tal becomes inaccessible. From this analysis we get thatthe thermodynamic competition between the two crystalforms occurs over a relatively small range of parameter space. Hence, generally, a single symmetry is thermo-dynamically stable, despite the presence of several com-peting patches leading to diﬀerent crystal forms. Kineticcontrol is thus likely to be a more prevalent mechanismfor driving crystal contention than outright thermody-namic competition. IV. CONCLUSIONS

In this work, we have used a simple patchy parti-cle model to study competing crystal forms in the con-text of protein crystallization, and extended our ﬁndingsby using analytical approximations from Wertheim andcell theory. Our analysis reveals that tuning the rela-tive strength of crystal-contact and dimeric interactionsresults in qualitatively diﬀerent equilibrium phase dia-grams, and suggests that certain protein crystal assem-blies may be under kinetic control. These results providea microscopic explanation for some of the experimentalobservations gathered over the years. The existence ofa ﬂuid-crystal triple point positioned above the criticalpoint identiﬁes a possible scenario that explains why dif-ferent crystallization temperatures may result in distinctcrystal forms. If this solution behavior is detected exper-imentally, our analysis suggests that tuning the sampletemperature should be an eﬀective strategy to obtain thedesired crystal and control its quality. A more invasivealternative would be to mutate amino acids involved inthe dimeric patch: strengthening the dimer would widenthe stability range of the dimeric crystal, and weaken-ing it would favor the monomeric crystal. Crosslinkingthe proteins along the dimeric interaction, for instance,would fully suppress the monomeric crystal form. Thedimeric crystal would then become the only stable crys-tal form and larger crystallites should be able to grow.Similar strategies have already been experimentally usedto crystallize recalcitrant proteins by encouraging themto dimerize [41]. A notable example is racemic proteincrystallization, where the introduction of a protein mir-ror image favors pseudo-dimer assembly [42]. Beyondthe general structure of the phase diagram, we also showthat more subtle experimental observations can be un-derstood by analyzing the dynamics of protein assembly.Changing the crystal form by tuning the initial proteinconcentration or the experimental time are characteris-tic of kinetically-controlled assembly. In this context, weﬁnd that decreasing the protein concentration , and hencethe ﬂuid supersaturation, may counter-intuitively accel-erate crystal assembly.The schematic model introduced here provides a gen-eral framework for understanding how crystal latticeswith diﬀerent energies, entropies, and densities assem-ble. For example, if the higher-density crystal had ahigher entropy, the position of the two crystals in the T – ρ phase diagram would be ﬂipped, but the ﬂuid phasewould mostly remain unaﬀected. A diﬀerent assemblykinetic would then be expected. We thus anticipate thatfuture analyses on these richer models for protein assem-bly will provide better guidance for macromolecular crys-tallization. V. ACKNOWLEDGEMENT

We thank J. Skinner for his gracious support and en-couragements over the years. We acknowledge supportfrom National Science Foundation Grant No. NSF DMR-1055586. [1] H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N.Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne,Nucleic Acids Res. , 235 (2000).[2] A. McPherson, Crystallization of Biological Macro-molecules (CSHL Press, Cold Spring Harbor, 1999).[3] K. Khaﬁzov, C. Madrid-Aliste, S. C. Almo, and A. Fiser,Proc. Natl. Acad. Sci. , 3733 (2014).[4] D. Fusco, T. J. Barnum, A. E. Bruno, J. R. Luft,E. H. Snell, M. Sayan, and P. Charbonneau, unpublishedarXiv:1312.7012 (2014).[5] Z. Liu, W.-P. Zhang, Q. Xing, X. Ren, M. Liu, and C.Tang, Angewandte Chemie International Edition , 469(2012).[6] X. J. Zhang, J. A. Wozniak, and B. W. Matthews, J.Mol. Biol. , 527 (1995).[7] J. E. Kohn, P. V. Afonine, J. Z. Ruscio, P. D. Adams,and T. Head-Gordon, PLoS Comput. Biol. , e1000911(2010).[8] M. Dixon, H. Nicholson, L. Shewchuk, W. Baase, and B.Matthews, J. Mol. Biol. , 917 (1992).[9] H. Faber and B. Matthews, Nature , 263 (1990).[10] D. E. McRee, S. M. Redford, T. E. Meyer, and M. A.Cusanovich, J. Biol. Chem. , 5364 (1990).[11] A. V. Elgersma, M. Ataka, and T. Katsura, J. Cryst.Growth , 31 (1992).[12] D. A. Myles, Curr. Opin. Struct. Biol. , 630 (2006).[13] S.-Y. Chung, Y.-M. Kim, J.-G. Kim, and Y.-J. Kim, Na-ture physics , 68 (2009).[14] D. Rosenbaum, P. C. Zamora, and C. F. Zukoski, Phys.Rev. Lett. , 150 (1996).[15] P. R. ten Wolde and D. Frenkel, Science , 1975(1997).[16] P. Charbonneau and D. R. Reichman, Phys. Rev. E ,050401 (2007).[17] P. J. Lu, E. Zaccarelli, F. Ciulla, A. B. Schoﬁeld, F.Sciortino, and D. A. Weitz, Nature , 499 (2008).[18] A. Fortini, E. Sanz, and M. Dijkstra, Phys. Rev. E ,041402 (2008).[19] A. Lomakin, N. Asherie, and G. B. Benedek, Proc. Natl.Acad. Sci. USA , 9465 (1999).[20] E. Bianchi, R. Blaak, and C. N. Likos, Phys. Chem.Chem. Phys. , 6397 (2011).[21] D. Fusco and P. Charbonneau, Phys. Rev. E , 012721(2013). [22] D. Fusco, J. J. Headd, A. De Simone, J. Wang, and P.Charbonneau, Soft Matter , 290 (2014).[23] T. K. Haxton and S. Whitelam, Soft Matter , 3558(2012).[24] M. Charles, S. Veesler, and F. Bonnet´e, Acta Crystallogr.D , 1311 (2006).[25] A. Z. Panagiotopoulos, Mol. Phys. , 813 (1987).[26] D. Frenkel and B. Smit, Understanding Molecular Simu-lation (Academic Press, London, 2001).[27] C. Vega, E. Sanz, J. L. F. Abascal, and E. G. Noya, J.Phys.-Condens. Mat. , 153101 (2008).[28] C. De Michele, S. Gabrielli, P. Tartaglia, and F.Sciortino, J. Phys. Chem. B , 8064 (2006).[29] E. Sanz and D. Marenduzzo, J. Chem. Phys. , 194102(2010).[30] M. S. Wertheim, J. Stat. Phys. , 19 (1984).[31] M. S. Wertheim, J. Stat. Phys. , 35 (1984).[32] N. F. Carnahan and K. E. Starling, J. Chem. Phys. ,(1969).[33] G. Jackson, W. G. Chapman, and K. E. Gubbins, Mol.Phys. , 1 (1988).[34] N. Kern and D. Frenkel, J. Chem. Phys. , 9882(2003).[35] R. P. Sear, J. Chem. Phys. , 4800 (1999).[36] G. Gompper and M. Schick, Soft Matter: Colloidal order:entropic and surface forces (Vch Verlagsgesellschaft Mbh,ADDRESS, 2007), Vol. 3.[37] C. Vega and P. A. Monson, J. Chem. Phys. , 9938(1998).[38] G. Munao, Z. Preisler, T. Vissers, F. Smallenburg, andF. Sciortino, Soft Matter , 2652 (2013).[39] P. G. Vekilov and A. A. Chernov, Solid State Phys. ,1 (2002).[40] In agreement with other reports [37], 10% discrepancybetween cell theory and simulations is observed, whichexplains why at ε d = ε m = 0 . < ε + the theory predictsa triple point above the critical point in contrast to thesimulation results (Fig. 7 vs. Fig. 4).[41] D. R. Banatao, D. Cascio, C. S. Crowley, M. R. Fleissner,H. L. Tienson, and T. O. Yeates, Proc. Natl. Acad. Sci.USA , 16230 (2006).[42] T. O. Yeates and S. B. Kent, Annu. Rev. Biophys.41