[PDF] Analyzing mechanisms and microscopic reversibility of self-assembly

Abstract

We use computer simulations to investigate self-assembly in a system of model chaperonin proteins, and in an Ising lattice gas. We discuss the mechanisms responsible for rapid and efficient assembly in these systems, and we use measurements of dynamical activity and assembly progress to compare their propensities for kinetic trapping. We use the analytic solution of a simple minimal model to illustrate the key features associated with such trapping, paying particular attention to the number of ways that particles can misbind. We discuss the relevance of our results for the design and control of self-assembly in general.

Full PDF

AAnalyzing mechanisms and microscopic reversibility of self-assembly

James Grant, Robert L. Jack, and Stephen Whitelam Department of Physics, University of Bath, Bath BA2 7AY, UK Molecular Foundry, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720,USA

We use computer simulations to investigate self-assembly in a system of model chaperonin proteins, and in anIsing lattice gas. We discuss the mechanisms responsible for rapid and eﬃcient assembly in these systems, andwe use measurements of dynamical activity and assembly progress to compare their propensities for kinetictrapping. We use the analytic solution of a simple minimal model to illustrate the key features associatedwith such trapping, paying particular attention to the number of ways that particles can misbind. We discussthe relevance of our results for the design and control of self-assembly in general.

I. INTRODUCTION

In self-assembly , particles combine spontaneously toform structures that can be closed, like capsids andDNA ‘origami’ , or extended, like ﬁlaments , sheets and unusual crystals . The possibility of exploitingassembly for technological ends has been discussed manytimes , but to realize this possibility we need to developthe ability to predict and control the properties of exper-imental self-assembling systems in general. In particular,understanding how systems can be designed so as to as-semble reliably and rapidly while avoiding kinetic trapsremains a key challenge.Eﬀective dynamical assembly typically requires bond-making and bond-breaking events, so that assemblingparticles can avoid long-lived disordered structures andform the desired ordered one. The role of transient un-binding during self-assembly is understood at a qualita-tive level : particles on the micro- and nanoscalecan exploit thermal ﬂuctuations in order to sample arange of bound conﬁgurations as structures grow. Suchﬂuctuations allow particles to break local bonds and es-cape the kinetic ‘traps’ that result when misbound parti-cles become frozen into place by the arrival of more ma-terial. The importance of such ﬂuctuations is apparentfrom measurements, in computer simulations, of assem-bly yield as a function of particle binding strength. Typ-ically, such curves are non-monotonic, with a decrease inyield at large binding strength due to the suppression ofbond-breaking events (see Fig. 1). However,while the roles of ﬂuctuations and transient unbindingare clear at this qualitative level, it is not clear ‘howmuch’ reversibility is required for eﬀective self-assemblyin a given system.Here we address this question. We introduce a toymodel of assembly whose analytic solution demonstratesa minimal set of requirements for kinetic trapping. Wealso consider computer simulations of two models of in-teracting particles. The ﬁrst is an oﬀ-lattice, coarse-grained model of ‘chaperonin’ proteins from whichﬁlament-like and sheet-like structures can assemble. Thesecond is the two-dimensional lattice gas, whose separa-tion into dense and dilute phases exhibits many of thecharacteristic features of self-assembly . We discussthe assembly mechanisms in these models, and in par- ticular identify whether assembly is more eﬃcient whena single structure forms by nucleation and growth, orwhen multiple structures form simultaneously. We thenconsider the role of thermal ﬂuctuations, comparing mea-surements of dynamical activity with the ﬂux to-wards the assembled state. For example, as chaperoninparticles assemble into a close-packed sheet, they typi-cally bind and unbind hundreds or thousands of timesbefore attaining their ﬁnal positions. We ﬁnd that boththe mechanism of assembly and the dynamical activityindicate the eﬀectiveness of a system in avoiding (or es-caping from) kinetic traps, and we discuss the relevanceof these results for the design of self-assembling systems. II. MODELS AND ASSEMBLY YIELDSA. General considerations

A key aim of this article is to identify features that areconserved between diﬀerent self-assembling systems. Tothis end, we show results for three model systems, em-phasising their common features as well as some salientdiﬀerences. In all cases we initialise interacting particlesin disordered conﬁgurations and they evolve with diﬀu-sive dynamics towards low-energy thermally-equilibratedstructures. For example, we will consider model chaper-onin proteins that assemble into extended close-packedsheets (full details are given in Sec. II B). We deﬁne the‘yield’ of this assembly process to be the fraction of par-ticles embedded in such close-packed sheets. To facilitatecomparison between systems, we consistently use (cid:15) b /T todenote a dimensionless measure of the strength of inter-particle bonds; we also use n opt to denote the assemblyyield, deﬁned as the fraction of particles that are in ‘op-timal’ bonding environments.Fig. 1(a) shows results for the sheet-forming chap-eronin system and Fig. 1(b) shows results for a two-dimensional lattice model where particles assemble intolarge close-packed clusters (see Sec. II C for full details).For these two systems, on these time scales, n opt is largeonly in a narrow range of bond strength. When bondsare too weak, the assembled structure is not stable; whenbonds are too strong, the system is vulnerable to kinetictrapping. We contrast this behavior with that of a dif- a r X i v : . [ c ond - m a t . s o f t ] D ec . . . eqm t long t = 5 t b /T b /T = 6 . b /T = 9 . (a) n opt b /T (c) n opt

10 15 20 2500 . . . t = 5 t t long b /T = 12 b /T = 24 (b) n opt t = 10 t = 10 b /T b /T = 5 b /T = 10 eqm FIG. 1. Assembly yield (cid:104) n opt (cid:105) versus binding strength (cid:15) b /T ,for various times and for equilibrated systems. We showrepresentative snapshots of clusters at long times, for bondstrengths indicated. (a) Sheet-forming chaperonin system( σ = 0 . (cid:15) b /T ; in (c), yield ismonotonic, reﬂecting the absence of kinetic trapping. Datamarked ‘long’ are taken from simulations lasting 300 hours ofCPU time, rather than a ﬁxed ﬁnal time t . ferent model of chaperonin proteins which assemble intolong ﬁlaments. Fig. 1(c) shows that this process does notsuﬀer kinetic trapping even when bonds are very strong:the yield is monotonic in (cid:15) b /T . B. Chaperonin model

Chaperonin proteins assemble in vitro into a rangeof structures that include extended two-dimensionalsheets and quasi one-dimensional ﬁlaments. FollowingRefs. , we model chaperonins as hard spheres of di-ameter 2 a equipped with orientation-dependent pairwiseinteractions that encourage either equator-to-equator orpole-to-pole binding (see Appendix B). The anisotropicinteractions have range a/ (cid:15) b /T . They also depend ona parameter σ that determines how precisely two chap-eronins must align before they receive an energetic re-ward: the smaller is σ , the more speciﬁc is the angu-lar interaction. We simulated N = 1000 chaperoninsin periodically-replicated cubic boxes of side L . Chaper-onins were present at a concentration of 0.82% by volume(i.e. N π ( a/L ) = 0 . σ = 0 . σ = 0 . σ and the volumefraction. Although there is a degree of arbitrariness inthe particular values chosen, we ﬁnd that the qualitativebehaviour of the systems we consider here varies onlyweakly if we vary model parameters over a wide rangeof values. For instance, we do not ﬁnd regimes in whichthe yields of chaperonin sheet formers or the lattice gas(Fig. 1) vary monotonically with binding strength. In-deed, Fig. 1 shows that these systems exhibit similarqualitative trends, despite their diﬀerences in dimension,packing fraction, and the microscopic detail of their inter-actions. Similar behaviour has been observed in a rangeof other self-assembling systems . We are thereforeconﬁdent that our results are relevant for a range of self-assembling model systems; we would also expect similarphenomenology to be reproduced in experiments.We performed dynamic simulations, starting fromwell-mixed conﬁgurations, using the virtual-move MonteCarlo (MC) algorithm described in Ref. . This algo-rithm approximates a diﬀusive dynamics by using po-tential energy gradients to generate both single-particle-and collective translations and rotations. We deﬁne τ B as the mean time taken for an isolated particle to diﬀuse ! b /T . . . n o p t state 0: ‘monomers’states 1: ‘misbound particles’ state 2: ‘optimally bound particles’ ( a ) ( b ) ( c ) E = − b E = − b / E = 0 ··· ! b /T . . . n o p t M = 0 eqmeqm M = 10 t = 50 t = 300 n opt b /T n opt cM e − β b / e − β b c b /T t = 3000 t = 50 t = 100 t = 150 FIG. 2. Analytic toy model of assembly demonstrating the requirements for kinetic trapping. (a) Particles transfer betweenthe ‘monomer’, ‘misbound’ and ‘optimally bound’ levels with the rates shown; (cid:15) b is the particle binding strength; c is aconcentration variable (set to 10 − in the other panels); and M is the number of ways of misbinding. (b) When there existsthe possibility of misbinding ( M > non-monotonic with (cid:15) b , because as (cid:15) b increases 1) equilibrium yield increases but 2) the escape rate from misbound states decreases . (c) When misbinding is not possible ( M = 0), dynamic yieldincreases with binding strength. Similar behaviour is seen in computer models in Fig. 1. a length equal to its diameter (150 MC steps in our sim-ulations). For later convenience we deﬁne t ≡ MCsteps ≈ τ B . For the sheet-forming systems we sam-pled thermal equilibrium by starting from a large close-packed sheet inserted into a gas of monomers, and usinglocal Monte Carlo moves supplemented by the nonlocalalgorithm described in Ref. .To deﬁne the yield n opt , we consider two particles i and j to be neighbours if their interaction energy E ij ≤− T . For the sheet-forming model the optimal number ofneighbours is N max = 6; for the ﬁlament-forming model N max = 2. The yield n opt is the fraction of particles withthis number of neighbours. We also deﬁne a normalisedenergy (‘fraction of possible bonds’) n b = − E N max (cid:15) b , (1)where E is the total energy of the system. Thus, n b = 0if no bonds are present, while n b = 1 if all particles arein optimal binding environments.The results shown in Fig 1 illustrate that the sheet-forming model suﬀers from kinetic trapping when (cid:15) b /T is large, so that good assembly occurs only in an inter-mediate range of bond strengths . On the other hand,growing ﬁlaments in this model cannot become kineti-cally trapped: each particle can bind only at its northor south pole, and each of those two modes of bindingpermits the structure to be extended in an orderly man-ner. In this case, thermal ﬂuctuations do not facilitateassembly, but instead break up long ﬁlaments and reduceyield. We note that assembly of ﬁlaments may still suﬀerfrom kinetic trapping if they have more internal structurethan the simple strings of particles considered here . C. Lattice gas

We also consider the two-dimensional lattice gas, com-prising N particles on a square lattice of V = L sites.Particles on nearest-neighbouring sites form bonds ofenergy − (cid:15) b ; particles may not overlap. The systemphase-separates when bonds are strong, forming dense(liquid) and dilute (gas) phases. We work at density ρ ≡ N/V = 0 .

002 for which the onset of phase separation(binodal) is at (cid:15) b /T = 3 . . We take L = 2048 through-out. Motivated by the characteristic non-monotonic yieldshown in Fig. 1(b), we draw an analogy between thisphase separation and the self-assembly observed in thechaperonin model . In the limit of large (cid:15) b /T we ob-serve diﬀusion-limited cluster aggregation , an exampleof kinetic trapping that frustrates phase separation.We again used an MC scheme with cluster moves inorder to simulate the dynamics of Brownian particles dis-persed in a solvent. Our scheme is a variant of the ‘cleav-ing’ algorithm of Ref. : In each MC move we select aseed particle, and begin to grow a cluster by adding tothe seed, with probability p c = 1 − e − λ(cid:15) b /T , each of itsneighbouring particles. Here λ = 0 . E betweenthe original and proposed conﬁgurations. The cluster ismoved with probability p m = p a /n where n is the sizeof the cluster and p a = min(1 , e − (1 − λ )∆ E/T ) if ∆ E (cid:54) = 0.When ∆ E = 0 we take p a = α with α = 0 .

9. The factors p a , p m and p c together ensure that the dynamics obeydetailed balance and that clusters of n particles diﬀusewith a rate proportional to 1 /n . The parameters α and λ are chosen for computational eﬃciency, and for con-sistency with our other studies of this model . An MCsweep comprises N MC moves. The Brownian time foran isolated particle is τ B ≈ .

11 MC sweep. Equilib-rium conditions were probed using simulations that wereinitialised with a large assembled cluster.Particles are considered to be neighbours if they areon adjacent lattice sites. The optimal number of neigh-bours is N max = 4, allowing n opt and n b to be deﬁned asin Sec. II B. Fig. 1(b) shows that the assembly yield ofthis simple two-dimensional lattice model is qualitativelysimilar to that of the sheet-forming chaperonin model. Inwhat follows, we use comparisons between these systemsto identify which assembly properties may be generalisedbetween models, and which are model-dependent. D. Schematic model of assembly and kinetic trapping

To illustrate the physical origins of the behaviour inFig. 1, we introduce a toy model of self-assembly. Weconsider a large number of particles, each of which caninhabit any of three energy levels: a ‘monomer’ level ofenergy 0, a ‘misbound’ level of energy − (cid:15) b /

2, and an‘optimally bound’ level of energy − (cid:15) b : see Fig. 2(a).Particles begin in the monomer level, and transfer intothe bound levels with the displayed rates. Here c is aconcentration-like variable, and M is the degeneracy ofthe misbound level, which reﬂects the number of ways aparticle can misbind. Particles escape from bound stateswith the Arrhenius-like rates shown.Denoting the unbound, misbound and optimally boundstates by 0 , , t P ( t ) = W P ( t ) , (2)where P ( t ) ≡ ( P ( t ) , P ( t ) , P ( t )); the variable P i ( t ) isthe probability that a particle resides in state i at time t ; and the matrix W is W =  − c ( M + 1) α α cM − α c − α  . (3)We have deﬁned α ≡ e − (cid:15) b / T for compactness of notationand we take Boltzmann’s constant k B = 1 throughoutthis paper. The yield in this model is n opt ≡ P .All particles start in the monomer state, so thatEq. (2) is to be solved with the initial condition P (0) =(1 , , P ( t ) converges to the equilibrium distribution s = Z ( α , cM α, c ) where Z = c + cM α + α is the parti-tion function. Thus the equilibrium (long-time) yield is n eq = c/ ( c + cM α + α ). Dynamic yields are shown in Fig. 2(b,c). The long-timeyield n eq increases as particle binding strength (cid:15) b /T in-creases. However, the escape rate α from the misboundstate decreases as (cid:15) b /T increases, so that misbound parti-cles take a long time to unbind and transfer to the boundstate. As long as M >

0, these two conﬂicting eﬀectsresult in a yield n opt that at ﬁnite times decreases forlarge binding strength (Fig. 2(b)). We show in AppendixA that if α is small then reaching the equilibrium yieldtakes a time of order ( M + 1) /α . However, if M = 0,i.e. there is no possibility of binding in a non-productivemanner, then yield increases monotonically with bindingstrength (Fig. 2(c)).When M > M = 0. We see immediately the three requirements for adynamic yield that is non-monotonic in particle bindingstrength: 1) equilibrium yield increases with increasingbinding strength; 2) there exists the possibility of mis-binding; and 3) the escape rate from misbound states decreases with increasing binding strength. III. ASSEMBLY MECHANISMS

We collate information about assembly mechanisms atdiﬀerent state points and in diﬀerent models by plot-ting in Fig. 3 the normalized energy n b against thenormalized number of optimally-bound particles n opt (which amounts to using energy as a measure of assem-bly progress). If the system contains large clusters ofoptimally bound particles then one expects n opt ≈ n b .However, particles on the cluster surfaces contribute to n b but not to n opt so one ﬁnds in general that n opt < n b .For ﬁxed n b the diﬀerence n b − n opt is smallest when thesystem contains one large cluster, which has relativelyfew surface particles. For kinetically trapped states onetypically ﬁnds n opt (cid:28) n b , because few particles are inoptimal environments.In the chaperonin system there is a pronounced nucle-ation regime in which assembly proceeds by growth ofa single large cluster. Since nucleation is a rare event,this regime is characterised by system-wide ﬂuctuations.However, the assembly mechanism does not ﬂuctuate,but is the same for all trajectories: a single sheet growsfrom a gas of particles (evidence for this assertion is givenin Appendix C). In the parametric plots of Fig. 3, this be-comes clearest when we plot (cid:104) n opt (cid:105) n b , the assembly yieldfrom multiple trajectories averaged over conﬁgurationswith a given value of n b . For the lattice gas system, thefree energy barrier to nucleation is smaller, ﬂuctuationsbetween trajectories are less pronounced, and it is ap-propriate to take time as a parametric variable. We plotquantities averaged at constant time, (cid:104) n opt ( t ) (cid:105) against (cid:104) n b ( t ) (cid:105) . We also show isochrones , lines connecting pointsof equal time (lattice gas), or points of equal average time .

25 0 . . n b00 . . . ! b /T = 6 . ! b /T = 6 . ! b /T = 6 . ! b /T = 7 ! b /T = 7 . ! b /T = 7 . ! b /T = 9 n b (a) Sheet-forming chaperonins .

25 0 . .

75 1 n b00 . . . ! b /T = 12 ! b /T = 14 ! b /T = 16 ! b /T = 18 ! b /T = 20 ! b /T = 22 ! b /T = 24 n b n opt n b n opt n b n opt n b (b) Lattice gas (c) Filament-forming chaperonins = 2= 2.2= 2.5= 2.9= 3.3= 5= 10 opt b /T .

25 0 . .

75 1 n b00 . . . ! b /T = 12 ! b /T = 14 ! b /T = 16 ! b /T = 18 ! b /T = 20 ! b /T = 22 ! b /T = 24 .

25 0 . . n b00 . . . ! b /T = 6 . ! b /T = 6 . ! b /T = 6 . ! b /T = 7 ! b /T = 7 . ! b /T = 7 . ! b /T = 9 n b t t t t t ( a ) ( b ) ( c ) n b n o p t n b n o p t n b n o p t n b = 2= 2.2= 2.5= 2.9= 3.3= 5= 10 opt b /T .

25 0 . .

75 1 n b00 . . . ! b /T = 12 ! b /T = 14 ! b /T = 16 ! b /T = 18 ! b /T = 20 ! b /T = 22 ! b /T = 24 .

25 0 . . n b00 . . . ! b /T = 6 . ! b /T = 6 . ! b /T = 6 . ! b /T = 7 ! b /T = 7 . ! b /T = 7 . ! b /T = 9 n b t t t t t ( a ) ( b ) ( c ) n b n o p t n b n o p t n b n o p t n b t t t t t b /T FIG. 3. Assembly quality n opt versus progress n b for (a) the chaperonin sheet-forming system ( σ = 0 . , 10 and 10 MC steps), and (c) the chaperonin ﬁlament-forming system. We show data for a range ofbond strengths (cid:15) b /T , as indicated. Time advances from bottom left to top right: dotted lines of constant time (isochrones)are drawn. In (a) the straight lines for the two highest temperatures indicate that assembly corresponds to the nucleation andgrowth of a single sheet; as temperature is lowered, multiple nucleation events are seen, and curves bend away from this line.Peak yield at long time is obtained (at (cid:15) b /T ≈ .

8) slightly away from the single-sheet nucleation regime (peak yield is obtainedeven further from this regime for a sheet-forming system with a more generous angular binding criterion: see Fig. 4(c)). Similarbehaviour is seen in (b), although the nucleation regime is less pronounced. In (c), lowering temperature changes the assemblymechanism only slightly, and maximal yield at ﬁxed time is always obtained for the lowest temperature considered. (chaperonin systems).Fig. 3 allows us to draw several conclusions about theassembly mechanism in these systems. In Fig. 3(a) thenearly-straight lines at the two highest temperatures in-dicate that assembly corresponds to the nucleation andgrowth of a single sheet. As temperature is lowered, mul-tiple nucleation events are seen, and curves bend awayfrom this line. (Since there are multiple growing sheets,the fraction of bound particles located on cluster surfacesis larger, and n opt /n b is lower, than in the single-sheetregime.) The maximal yield n opt at long times is ob-tained at (cid:15) b /T ≈ .

8, slightly away from the single-sheetnucleation regime. That is, while the ratio of surfaceto bulk particles is optimal in the single-sheet regime,the total number of assembled particles increases with (cid:15) b such that the yield continues to increase even as thesurface-to-bulk ratio starts to fall. This competition be-tween quality and quantity of assembled product was re-cently discussed in Ref. . Further from the single-sheetnucleation regime the surface-to-bulk eﬀect dominates,and yield begins to decline. For very strong bonds, clus-ters become ramiﬁed, as illustrated in Fig. 1, and yield issmall. We note in passing that optimal assembly regimeseems to take place near the spinodal line for phase sep-aration, since it is associated with a nucleation barrierthat is just small enough for nucleation to cease to be arare event – the possibility of controlling the nucleationbarrier to achieve optimal assembly was discussed in .In Fig. 3(b), we show data for the lattice gas model,which behaves similarly to the sheet-forming chaper-onins: maximal yield is obtained in a regime in whichmany clusters grow simultaneously, but too strong aninteraction again impairs assembly. By contrast, theassembly mechanism in the ﬁlament-forming system is largely insensitive to bond strength: the main eﬀect ofincreasing (cid:15) b /T is that the system makes more progressalong the reaction coordinate (Fig. 3(c)). This again re-ﬂects the low propensity for kinetic trapping in this sys-tem.Finally, we note that despite their diﬀerent spatial di-mensionality and binding geometry, the sheet-formingchaperonin model and the lattice gas show similar be-haviour in the representations of Figs. 1 and 3. In bothcases, assembly can take place through the nucleationand growth of a single structure, but optimal yield oc-curs in the regime in which several clusters (sheets) growsimultaneously (see also ). In Fig. 4 we show data for achaperonin sheet-forming system with an angular bindingspeciﬁcity ( σ = 0 .

7) more generous than that ( σ = 0 . IV. REVERSIBILITY OF BINDINGA. Everything put together (well) falls apart (transiently):Statistics of bond-breaking and bond-making

As we have discussed (see e.g. Fig. 2), non-monotonicyields such as those shown in Fig. 1 occur because assem-bling particles must break bonds that are not compatiblewith the ﬁnal ordered structure . In Fig. 5 we show thescaled energy E i ≡ E i /(cid:15) b of each of 5 randomly-chosen .

25 0 . . n b00 . . . y ! b /T = 5 . ! b /T = 5 . ! b /T = 6 ! b /T = 6 . ! b /T = 7 ! b /T = 8 ! b /T = 9 . . . eqm t long t = 5 t ! b /T n opt ! b /T = 6 t = 40 t t = 60 t n b t t t t t ( a ) ( b ) ( c ) ! n opt " n b .

25 0 . . n b00 . . . y ! b /T = 5 . ! b /T = 5 . ! b /T = 6 ! b /T = 6 . ! b /T = 7 ! b /T = 8 ! b /T = 9 FIG. 4. Data for a chaperonin sheet-forming system with angular binding speciﬁcity σ = 0 . heal (b), eliminatingjoins formed by collisions and allowing particles to acquire optimal binding environments. As a result, peak yield (c) is foundfurther from the regime of single-sheet nucleation than in Fig. 3(a): this system can tolerate deeper supercooling than the onewith σ = 0 . chaperonin sheet-formers as a function of the time t , fortwo diﬀerent bond strengths. We show similar data forthe lattice gas system. It is clear that assembling par-ticles bind and unbind, and unbind more readily at theweaker bond strength. However, despite the clear link be-tween bond-breaking events and good assembly, we pos-sess little understanding of how many bond breakings arerequired in order to maintain eﬀective assembly.To investigate this, we recorded for each particle thenumber of bound neighbours N old it possessed before eachaccepted MC move, and the number of bound neigh-bours N new it possessed after each accepted MC move.If N new > N old then we count a binding event for thisparticle; if N new < N old then we count an unbindingevent . If N new = N old then we assume that nothinghappened to this particle (it might have gained and lostneighbours in equal number, but this happens so rarelyin our simulations that we ignore it). We write K ± torepresent the total number of binding/unbinding eventsin a given time window of an assembly trajectory. We usethese counts of binding and unbinding events to measurereversibility by separating them into time-reversal sym-metric and asymmetric measures. That is, averaging thenumbers of events between times 0 and t , we deﬁne the traﬃc (or dynamical activity ) as T ( t ) ≡ (cid:104) K + (cid:105) + (cid:104) K − (cid:105) N , (4)and the ﬂux as F ( t ) ≡ (cid:104) K + (cid:105) − (cid:104) K − (cid:105) N . (5) Traﬃc measures the total number of events per particle;ﬂux measures the excess of binding over unbinding eventsper particle, and is a measure of the extent to whichtime-reversal symmetry is broken in the system. For anequilibrated system (which is time-reversal symmetric),we have F ( t ) = 0 and T ( t ) ∝ t . For a system in whichbonds never break, we have F ( t ) = T ( t ).We show typical results in Fig. 6. The maximal possi-ble ﬂux in a system is approximately N max : ﬂux increasesin time in a similar way to (cid:104) n b ( t ) (cid:105) , because it quantiﬁesthe number of bonds in the system. The ﬂux thereforesaturates at long times, while the traﬃc continues to in-crease (events continue to happen in the system evenafter it has equilibrated in the assembled state). B. Two steps forwards, one step back: quantifyingreversibility

Close inspection of Fig. 6 reveals that under optimalassembly conditions in the sheet-forming system, par-ticles eventually form on average about 5 . . t t Sheet-forming chaperoninsLattice gas (a) b /T = 6 . b /T = 8 . b /T = 4 . b /T = 5 . ! ! ! ! t/t

50 10 1502460246 t/t

50 10 15 E i E i E i E i FIG. 5. (a,b) Scaled energy E i for each of 5 randomly-chosensheet-forming chaperonin particles as a function of time t ,for two diﬀerent bond strengths and σ = 0 .

3. (c,d) Similardata for lattice gas particles. Assembling particles bind and unbind, with unbinding being more frequent when bonds areweaker. The range of times shown is such that substantialassembly has occurred by the end of all trajectories. toy model deﬁned in Sec. II D. We assume that reachingthe ‘optimally bound’ state results in 2 binding events,and reaching the ‘misbound’ state results in 1 event(the idea is that optimally-bound particles typically have N max neighbours while misbound ones have fewer than N max ; we take N max = 2).Assuming that (cid:15) b /T is large, it is useful to work atleading order in α ≡ e − (cid:15) b / (2 T ) . The analysis is performedin Appendix A: here we summarise the main results. Inthe limit of small α (and assuming M > n opt ∼ − e − t/τ with τ ≈ ( M + 1) /α . There is a broad time window τ (cid:28) t (cid:28) α − in which F ( t ) ≈ T ( t ) ≈ M + 1). Making aparametric plot of ﬂux against traﬃc as in Fig. 7(a), oneobserves that for large (cid:15) b , the traﬃc plays the role of aclock, with the system reaching the assembled state when T ( t ) ≈ M + 1) (and T ( t ) / F ( t ) = M + 1). [If the limitof large (cid:15) b has not yet been reached, the system reachesequilibrium at a value of T ( t ) larger than 2( M + 1).]We show parametric plots of ﬂux and traﬃc for thesheet-forming chaperonin model and the lattice gas inFigs. 7(b,c). At a ﬁxed value of traﬃc, ﬂux is an increas-ing function of (cid:15) b , reﬂecting the role of (cid:15) b as a drivingforce towards the assembled state. But at ﬁxed time,traﬃc is a decreasing function of (cid:15) b , reﬂecting the role of (cid:15) b in the activation energy for escaping from misbound t t Sheet-forming chaperoninsLattice gas (d)(a) (b) F ( t ) T ( t ) F ( t ) T ( t ) (c) t/t t/t b /T -2 -1 -2 -1 -2 b /T FIG. 6. (a,b) Flux and traﬃc measurements for the sheet-forming chaperonin system ( σ = 0 . (cid:15) b /T (compare yield in Fig. 1); but traﬃc decreases with increasing (cid:15) b /T , due to the role of the bond strength as an activationenergy for bond-breaking. states.The parametric plots of Fig. 7 allow comparison ofreversibility of assembly between diﬀerent systems. Bycomparison with the schematic model, we use this datato quantify systems’ propensity for kinetic trapping, asfollows. We calculate the ratio ˜ M ( t ) = T ( t ) / F ( t ): un-der optimal assembly conditions in the schematic model(large (cid:15) b /T and long time t ) then ˜ M ( t ) approaches M +1,the ratio of the number of misbound and optimally boundstates. Given simulations of ﬁxed length t but varying (cid:15) b /T , we deﬁne a parameter M eﬀ ( t ) by evaluating ˜ M ( t )in the system with optimal (cid:15) b /T . In the schematic model, M eﬀ ( t ) ≈ M + 1 as long as substantial assembly occursbefore time t for at least one value of (cid:15) b /T .For our computer models, we obtain order-of-magnitude estimates of M eﬀ as follows. For the sheet-forming chaperonins and times in the range 20 − t ,optimal assembly is in the range 7 < (cid:15) b /T < . F ≈ < T < M eﬀ lies in the range400 − − MC sweeps, optimal assembly is in therange 5 < (cid:15) b /T < .

7, the ﬂux is

F ≈ − M eﬀ is 200 − -1 (a) Schematic model(c) Lattice gas(b) Sheet-forming chaperonins b /T F ( t ) T ( t ) F ( t ) T ( t ) b /T T ( t ) b /T t t t -2 -1 t = 0 . t = 1 t = 10 FIG. 7. Parametric plots of ﬂux and traﬃc during assembly,showing the role of traﬃc as a system clock. (a) Schematic M -state model with M = 10 and c = 0 .

01. For large (cid:15) b /T , as-sembly is complete after approximately 2( M +1) = 22 events.(b) Sheet-forming chaperonin model ( σ = 0 .

3) with isochronesat the indicated times. (c) Lattice gas model (isochrones areat 10 , 10 and 10 MC steps).

Given the large overlap in estimates of M eﬀ for lattice gasand chaperonin systems, we conclude that the propensityfor kinetic trapping in these two models are quite simi-lar. For the sheet-forming chaperonins with σ = 0 . − t we obtain a rangefor M eﬀ of 800 − σ = 0 .

3. Itmay be that the less speciﬁc interaction potential oﬀers more possibilities for disordered states, so that the sys-tem requires more unbonding events in order to reacha ﬁnal ordered structure. For ﬁlament-forming chaper-onins, optimal assembly occurs at very large (cid:15) b /T , forwhich T ≈ F and hence M eﬀ = 1 (the analogous toymodel has M = M eﬀ − M eﬀ : the model of Sec. II D is a toy model ofassembly, and one should not expect a direct mapping tomore detailed computer models. For example, the valueswe obtain for M eﬀ depend on the method used to identifyneighbouring particles in the chaperonin model, and onthe time at which ﬂux and traﬃc are measured. Physi-cally, the structures of the misbound states that cause ki-netic trapping vary with time as assembly takes place, sodescribing these states with a single number M eﬀ is sim-plistic. Nevertheless, we argue that the parameter M eﬀ which we extract provides a useful estimate of the im-portance of kinetic trapping in these assembling systems.Comparison of the values of M eﬀ emphasises the diﬀer-ence between sheet-forming and ﬁlament-forming chap-eronins. On the other hand, the diﬀerence between thesheet-forming chaperonins and the assembling lattice gasmodel is very small, especially given the inherent uncer-tainties in estimating M eﬀ .In terms of eﬀectiveness of assembly, we draw two mainconclusions from the toy model. Firstly, the time taken toequilibrate depends strongly on the activation barrier forescape from misbound states, and is τ ∼ ( M + 1)e − (cid:15) b / T .Thus, assembly is most rapid if the system possesses rel-atively weak bonds. Secondly, the number of unbindingevents required to arrive at the assembled product de-pends on the number of misbound states: this numberreﬂects a system’s propensity for trapping, and minimis-ing M provides a method for increasing assembly quality.Practical design rules for minimisation of M remain anoutstanding problem, but tuning the speciﬁcity of inter-particle attractions might provide a route to minimiz-ing this parameter. V. OUTLOOK

Based on the analysis of this article, we draw two mainconclusions. In Section III we showed that the assem-bly mechanism assumed by classical nucleation theory(CNT), consisting of the growth of an isolated, com-pact cluster, typically operates when bonds are relativelyweak. As bonds get stronger this simple picture no longerholds: multiple clusters grow , and for very strongbonds cluster structures become ramiﬁed. We ﬁnd thatthe competition between quality and quantity of assem-bly results in optimal assembly happening away fromthe ‘CNT regime’. The extent to which this happensdepends on the design of inter-component interactions(compare the sheet-forming systems with angular speci-ﬁcity σ = 0 . σ = 0 . M eﬀ that counts degeneracy of misbound states. Large valuesof M eﬀ indicate that a system is prone to kinetic trap-ping; a system’s bonds must be relatively weak in orderto avoid such trapping.These conclusions reinforce the importance of anneal-ing if kinetic trapping is to be avoided. If departures fromCNT at optimal assembly are large, then the system iseﬀective in annealing disordered clusters into well-formedproducts. Similarly, if M eﬀ is small, the system requiresrelatively few unbinding events in order to arrive at anassembled product. Both these measurements reﬂect the‘forgivingness’ of assembly, by which we mean the abilityof a particles to escape from kinetic traps and form anassembled product. We believe that guidelines for im-proving forgivingness are potentially useful in the designof self-assembly in general. For the chaperonin sheet-formers that we considered, we found that the versionwith reduced angular speciﬁcity seems to be the moreforgiving of the two. Similarly, crystallisation tends tobe most forgiving when interactions are relatively long-ranged; short-ranged interactions more frequently leadto gelation or other forms of kinetic trapping . Furthersimulation studies are needed in order to clarify the im-portance of microscopic parameters to the ‘forgivingness’of self-assembly, and to assess how typical numbers for‘ﬂux’ and ‘traﬃc’ compare to those seen in the modelsystems studied here. Ultimately, however, applicationof the ideas developed here requires the development ofexperimental systems in which the microscopic reversibil-ity of self-assembling components can be quantiﬁed. ACKNOWLEDGMENTS

RLJ thanks Mike Hagan and Paddy Royall for manyuseful discussions. This work was done as part of a Userproject at the Molecular Foundry, Lawrence BerkeleyNational Laboratory. JG and RLJ acknowledge ﬁnan-cial support by the EPSRC through a doctoral traininggrant (to JG) and through grants EP/G038074/1 andEP/I003797/1 (to RLJ). SW was supported by the Di-rector, Oﬃce of Science, Oﬃce of Basic Energy Sciences,of the U.S. Department of Energy under Contract No.DE-AC02–05CH11231.

Appendix A: Minimal model of kinetic trapping

In this appendix, we analyse the toy model introducedin Sec. II D. The master equation (2) can be solved ex- actly by matrix diagonalization: we write W = SDS − where D is a diagonal matrix. The columns of S are theright eigenvectors of W . The solution is then P ( t ) = S e Dt S − P (0). There is a zero eigenvalue of W thatcorresponds to the steady state: we denote the othertwo eigenvalues by − λ + and − λ − which are ordered as0 < λ − < λ + .

1. Assembly yield

The yield of assembly is n opt ( t ) ≡ P ( t ). The righteigenvector of W corresponding to the equilibrium stateis s = Z ( α , cM α, c ) where Z = c + cM α + α is the par-tition function. Thus the equilibrium (long-time) yield is n eq = cc + cMα + α while for general t the solution is of theform n opt ( t ) = n eq [1 − a e − λ + t − b e − λ − t ] , (A1)where a and b are (positive) constants that depend on α , c , and M , subject to a + b = 1.To gain physical intuition, it is convenient to assumethat α is small. In this case, we have λ + = c ( M + 1) + O ( α ) while λ − = α ( M +1) + O ( α ). Physically, the systemforms bonds quickly (with rate λ + ), arriving in a state inwhich P ≈ M +1 and P ≈ MM +1 . There is then a slow re-laxation (with rate λ − (cid:28)

1) in which P increases to thevalue n eq ≈

1. [Here and in the following, we use approx-imate equalities to indicate that there are corrections at O ( α ).] The slow relaxation to equilibrium involves parti-cles escaping from the misbound energy level, and there-fore has an activated rate λ − ∼ e − (cid:15) b / T . This gives riseto the non-monotonic yield plot shown in Fig. 2(b).When there is no possibility of misbinding (i.e. when M = 0), the previous analysis holds but b = 0 in (A1);the slow stage of relaxation is irrelevant for the yield.In this case, yield curves are monotonic with (cid:15) b /T ; seeFig. 2(c).

2. Flux and traﬃc

To obtain time-averaged ﬂux and traﬃc in this model,we notice that the average number of transitions fromstate 1 to state 0 between times 0 and t is K = α (cid:82) t d t (cid:48) P ( t (cid:48) ), with similar results for transitions betweenother states. For a full analysis of the statistics of thenumber of transitions between states in Markov pro-cesses, see . If we assume that transitions betweenstates 0 and 1 involve the making (or breaking) of onebond while transitions between states 0 and 2 involvemaking or breaking of two bonds, we arrive at expres-0sions for the traﬃc and ﬂux: T ( t ) = (cid:90) t d t (cid:48) [ c ( M + 2) P ( t (cid:48) ) + αP ( t (cid:48) ) + 2 α P ( t (cid:48) )] , F ( t ) = (cid:90) t d t (cid:48) [ − c ( M + 2) P ( t (cid:48) ) + αP ( t (cid:48) ) + 2 α P ( t (cid:48) )] . (A2)For the initial conditions used here, it may be readilyshown from (2) that F ( t ) = P ( t ) + 2 P ( t ), as required.Using the solution P ( t ) given above and performingthe time integral, one arrives at T ( t ) = ( k + + k − ) T S − D int ( t ) S P (0) , where k − = (0 , α, α ), k + = ( c ( M + 2) , ,

0) and D int ( t )is a diagonal matrix with elements ( t, (1 − e − λ − t ) /λ − , (1 − e λ + t ) /λ + ).We again analyse the limit of small α . For large times( λ − t (cid:29)

1) the ﬂux saturates at 2 + O ( α ) while the traﬃcis T ( t ) ≈ M + 1) + 2( M + 2) α t. (A3)For large enough times, the second term dominates andthe traﬃc increases linearly with time, but if α − (cid:28) t (cid:28) α − then traﬃc saturates at 2( M + 1). This is thelimit in which the number of unbinding events from themisbound state is large, but unbinding events from theoptimally-bound state are rare enough that they may beneglected. The existence of such a limit is the basis forthe extraction of the parameter M eﬀ discussed in Sec. IV. Appendix B: Inter-chaperonin potential

Model chaperonins are hard spheres of diameter 2 a ,equipped with an attractive pairwise interaction that op-erates only when the centres of two chaperonins lie withina distance 2 a and 2 a + a/

4. Consider two chaperonins i and j that lie within this interaction range. Let n i and n j be unit vectors pointing from the centre of each chap-eronin to its north pole, and let r ij be the unit vectorpointing from the centre of i to the centre of j . Let φ ij be the angle between the orientation vectors n i and n j ,and let θ i be the angle between n i and r ij (and let θ j be the angle between n j and − r ij ). Our ‘sticky equator’systems have orientational interaction (cid:15) eq = − (cid:15) b ˆ C ( φ ij ; σ align ) C ( θ i ; σ eq ) C ( θ j ; σ eq ) , (B1)where C α ( ψ ; σ ) ≡ e − (cos ψ − α ) /σ rewards the alignmentof angles ψ and cos − α . The parameter σ determinesthe angular tolerance of this interaction. ˆ C α ( ψ ; σ ) ≡ C α ( ψ ; σ ) + C − α ( ψ ; σ ) is this function’s symmetrizedcounterpart. In Eq. (B1) the factors C encourage ori-entation vectors to point perpendicular to the inter-chaperonin vector. The factor ˆ C encourages orienta-tion vectors to point parallel or antiparallel. For the ×

107 4 ×

107 6 × t . . . n opt n opt t .

25 0 . . n b00 . . . n opt n b n opt FIG. 8. System-wide ﬂuctuations associated with nucleationcan be controlled by using bond number (system energy) asa measure of assembly progress. (Top) We show yield n opt against time t for three independent dynamical simulationsof the sheet-forming model considered in the main text, for (cid:15) b /T = 6 .

6. Here assembly proceeds via nucleation andgrowth of a single sheet, and the characteristic time for ap-pearance of the sheet is broadly distributed. (Bottom) How-ever, when bond number n b is used as a measure of reactionprogress, the data collapse. This collapse reveals that the as-sembly mechanism in all three trajectories is the same, andmotivates the parametric plot of Fig. 3. sheet-forming system described in the main text we set σ align = σ eq = 0 .

3. For the sheet-forming system de-scribed in Appendix B we set σ align = σ eq = 0 . (cid:15) pol = − (cid:15) b ˆ C ( φ ij ; σ align ) ˆ C ( θ i ; σ pol ) ˆ C ( θ j ; σ pol ) , (B2)whose three functions encourage alignment vectors topoint parallel or antiparallel (function 1), and align-ment vectors to point parallel or antiparallel to the inter-chaperonin vector (functions 2 and 3). We set σ align = 0 . σ pol = 0 . Appendix C: Trajectory-to-trajectory ﬂuctuations

The self-assembly of sheets and lattice gas clusters re-ﬂects an underlying ﬁrst-order phase transition, and canhappen, roughly speaking, in one of two ways. Either asingle critical nucleus appears in the system and grows byacquiring monomers, or many clusters of the new phasegrow simultaneously and coalesce. Which of these mecha-nisms operates depends on the thermodynamic state andthe system size (the latter is ﬁxed in our simulations).The nucleation regime is characterised by large ﬂuctua-tions: the randomly-distributed time at which the ﬁrstcritical nucleus appears strongly aﬀects the behaviour ofthe whole system. In simulation studies, ﬂuctuations as-sociated with rare nucleation events lead to substantialdiﬀerences in values of observables such as assembly yield n opt ( t ) from run-to-run; the time-averaged yield (cid:104) n opt ( t ) (cid:105) is usually not representative of the behaviour of any sin-gle trajectory. In order to deduce the assembly mecha-nism it is therefore useful to use the number of bonds inthe system, rather than time, as a reaction coordinate.Fig. 8 shows that data from diﬀerent trajectories col-lapse in this representation: although the time to assem-bly varies signiﬁcantly between trajectories, the assemblymechanism does not. G. Whitesides and B. Grzybowski, Science, , 2418 (2002). S. C. Glotzer and M. J. Solomon, Nature Mat., , 557 (2007). M. F. Hagan and D. Chandler, Biophys. J., , 42 (2006). P. Rothemund, Nature, , 297 (2006). Y. Yang, R. Meyer, and M. F. Hagan, Phys. Rev. Lett., ,258102 (2010). C. Paavola, S. Chan, Y. Li, K. Mazzarella, R. McMillan, andJ. Trent, Nanotechnology, , 1171 (2006). Y. Li, C. D. Paavola, H. Kagawa, S. L. Chan, and J. D. Trent,Nanotechnology, , 455101 (2007). M. Leunissen, C. Christova, A. Hynninen, C. Royall, A. Camp-bell, A. Imhof, M. Dijkstra, R. van Roij, and A. van Blaaderen,Nature, , 235 (2005). C. R. Iacovella and S. C. Glotzer, Nano Lett., , 1206 (2009). S. Chung, S. Shin, C. Bertozzi, and J. De Yoreo, Proc. Natl.Acad. Sci. (USA), , 16536 (2010). F. Romano, E. Sanz, and F. Sciortino, J. Chem. Phys., ,184501 (2010), ISSN 00219606. W. L. Miller and A. Cacciuto, J. Chem. Phys., , 234108(2010). G. M. Whitesides and M. Boncheva, Proc. Natl. Acad. Sci.(USA), , 4769 (2002). R. L. Jack, M. F. Hagan, and D. Chandler, Phys. Rev. E, ,021119 (2007). A. W. Wilber, J. P. K. Doye, A. A. Louis, E. G. Noya, M. A.Miller, and P. Wong, J. Chem. Phys., (2007). D. C. Rapaport, Phys. Rev. Lett., , 186101 (2008). S. Whitelam, E. H. Feng, M. F. Hagan, and P. L. Geissler, SoftMatter, , 1251 (2009). M. F. Hagan, O. M. Elrad, and R. L. Jack, J. Chem. Phys., ,104115 (2011). D. Klotsa and R. L. Jack, Soft Matter, , 6294 (2011). C. L. Klix, C. P. Royall, and H. Tanaka, Phys. Rev. Lett., ,165702 (2010). S. Whitelam, Phys. Rev. Lett., , 088102 (2010). S. Whitelam and P. L. Geissler, J. Chem. Phys., , 154101(2007). M. Baiesi, C. Maes, and B. Wynants, Phys. Rev. Lett., (2009). J. P. Garrahan, R. L. Jack, V. Lecomte, E. Pitard, K. van Dui-jvendijk, and F. van Wijland, Phys. Rev. Lett., , 195702(2007). S. Whitelam, C. Rogers, A. Pasqua, C. Paavola, J. Trent, andP. L. Geissler, Nano Letters, , 292 (2009). S. Whitelam, arXiv:1009.2008, Molecular Simulation, in press(2011). We note also that dynamical trajectories of the chaperonin modelequilibrate only within a narrow range of bond strengths. Atsmall bond strengths, free energy barriers to sheet nucleation arelarge enough that they are not surmounted in direct simulations;at large bond strengths, disordered aggregates form and do notrelax on timescales simulated. M. Fandrich, J. Meinhardt, and N. Grigorieﬀ, Prion, , 89(2009). R. J. Baxter,

Exactly solved models in statistical mechanics (Aca-demic Press, 2002). P. Meakin, Phys. Rev. Lett., , 1119 (1983). J. Grant and R. L. Jack, in preparation. We also tested a modiﬁed scheme in which particles making mul-tiple bonds in one move contribute N new − N old to K + , etc. Theresults from this scheme and the one used in the main text areessentially indistinguishable. E. Sanz, C. Valeriani, T. Vissers, A. Fortini, M. E. Leunissen,A. van Blaaderen, D. Frenkel, and M. Dijkstra, J. Phys.: Con-dens. Matt., , 494247 (2008). J. P. Garrahan, R. L. Jack, V. Lecomte, E. Pitard, K. van Dui-jvendijk, and F. van Wijland, J. Phys. A,42