A Time Lower Bound for Multiple Nucleation on a Surface

Abstract

Majumder, Reif and Sahu have presented a stochastic model of reversible, error-permitting, two-dimensional tile self-assembly, and showed that restricted classes of tile assembly systems achieved equilibrium in (expected) polynomial time. One open question they asked was how much computational power would be added if the model permitted multiple nucleation, i.e., independent groups of tiles growing before attaching to the original seed assembly. This paper provides a partial answer, by proving that if a tile assembly model uses only local binding rules, then it cannot use multiple nucleation on a surface to solve certain "simple" problems in constant time (time independent of the size of the surface). Moreover, this time bound applies to macroscale robotic systems that assemble in a three-dimensional grid, not just to tile assembly systems on a two-dimensional surface. The proof technique defines a new model of distributed computing that simulates tile (and robotic) self-assembly. Keywords: self-assembly, multiple nucleation, locally checkable labeling.


Aaron Sterling ∗ November 8, 2018


Nature is replete with examples of the self-assembly of individual parts into a more complex whole, such as the development from zygote to fetus, or, more simply, the replication of DNA itself. In his Ph.D. thesis in 1998, Winfree proposed a formal mathematical model to reason algorithmically about processes of self-assembly [21]. Winfree connected the experimental work of Seeman [16] (who had built “DNA tiles,” molecules with unmatched DNA base pairs protruding in four directions, so they could be approximated by squares with different “glues” on each side) to a notion of tiling the integer plane developed by Wang in the […]

∗ Laboratory for Nanoscale Self-Assembly, Department of Computer Science, Iowa State University, Ames, IA 50014, USA. This research was supported in part by National Science Foundation Grants 0652569 and 0728806.

[…] seed tile or a connected, finite seed assembly. Tiles would then accrete one at a time to the seed assembly, growing a seed supertile. A tile assembly system is a finite set of tile types. Tile types are characterized by the names of the “glues” they carry on each of their four sides, and the binding strength each glue can exert. We assume that when the tiles interact “in solution,” there are infinitely many tiles of each tile type. Tile assembly proceeds in discrete stages. At each stage $s$, from all possibilities of tile attachment at all possible locations (as determined by the glues of the tile types and the binding requirements of the system overall), one tile will bind, with tile type and location “chosen” nondeterministically from possible legal bonds at that stage. (Later, we will generalize this so multiple tiles can bind concurrently, at a given stage.) Winfree proved that his Tile Assembly Model is Turing universal.

The abstract Tile Assembly Model (aTAM) is error-free and irreversible: tiles always bind correctly, and, once a tile binds, it can never unbind. Adleman et al. were the first to define a notion of time complexity for tile assembly, using a one-dimensional error-permitting, reversible model, where tiles would assemble in a line with some error probability, then be scrambled, and fall back to the line [1]. Adleman et al. proved bounds on how long it would take such models to achieve equilibrium. Majumder, Reif and Sahu have recently presented a two-dimensional stochastic model for self-assembly [11], and have shown that some tiling problems in their model correspond to rapidly mixing Markov chains, that is, Markov chains that reach stationary distribution in time polynomial in the state space of legally reachable assemblies.

While the aTAM is nondeterministic, real-world chemical reactions are probabilistic, and discrete molecular interactions are often modeled stochastically. We will define a class of stochastic self-assembly models that contains the model of Majumder et al., and prove a lower bound about any model in that class.

The tile assembly systems analyzed in [11] had the property that their equilibrium assemblies were identical (allowing for small error) with their terminal or complete assemblies, i.e., assemblies that cannot legally evolve further, given the rules of the system. This identity does not, however, hold in general. In a closed chemical system, where equilibrium may be achieved, it is possible that the system at equilibrium might consist almost entirely of large, undesirable assemblies that do not perform the desired computation. In these cases, correct assembly occurs when the system is out of equilibrium, and can be maintained because there is a large kinetic energy barrier to forming undesired structures. Therefore, when we discuss the “solution to a problem” in this paper, we identify that with the notion of a complete assembly.

We will prove a time complexity lower bound on the solution of a graph coloring problem for a class of self-assembly models, including, but not limited to, a generalization of the model of [11].
The tile assembly model in [11], like the aTAM, allows only for a single seed assembly, and one of the open problems in [11] was how the model might change if it allowed multiple nucleation, i.e., if multiple supertiles could build independently before attaching to a growing seed supertile. The main result of this paper provides a time complexity lower bound for a class of tile assembly models that permit multiple nucleation on a 2D surface or a 3D grid: there is no way for those models to use multiple nucleation to achieve a speedup to tiling a surface in constant time (time independent of the size of the surface) in order to solve a graph coloring problem, even though that graph coloring problem requires only seven tile types to solve in the aTAM. This result holds for tile assembly models that are reversible, irreversible, error-permitting or error-free. In fact, a speedup to constant time is impossible, even if we relax the model to allow that, at each step $s$, there is a positive probability for every available location that a tile will bind there (instead of requiring that exactly one tile bind per stage).

To our knowledge, the method of proof in this paper is novel: given a tile assembly model and a tile assembly system $\mathcal{T}$ in that model, we construct a distributed network of processors that can simulate the behavior of $\mathcal{T}$ as it assembles on a surface. Our result then follows from the theorem by Naor and Stockmeyer that locally checkable labeling (LCL) problems have no local solution in constant time [12]. This is true for both deterministic and randomized algorithms, so no constant-time tile assembly system exists that solves an LCL problem with a positive probability of success.
We consider one LCL problem in particular, the weak $c$-coloring problem, and demonstrate a tile set of only seven tile types that solves the weak $c$-coloring problem in the abstract Tile Assembly Model, even though that same problem is impossible to solve in constant time by multiple nucleation on a surface, for a broad class of self-assembly models. Intuitively, this demonstrates that even a problem that can be solved in polynomial time by using a few local rules when starting from a single point cannot necessarily be solved in constant time when starting from multiple points, regardless of the rule set used. (The abstract Tile Assembly Model can weakly $c$-color an $n \times n$ surface in $n^2$ steps, yet none of the multiple nucleation models we consider can solve the weak $c$-coloring problem in constant-many steps.)

The results of Naor and Stockmeyer we apply are more powerful than needed to obtain the time complexity lower bound for a system in which the self-assembling agents are as simple as DNA tiles. Our lower bound actually demonstrates that constant-time speedup to solve LCL problems is impossible via multiple nucleation, even for self-assembling modular robots capable of forming physical bonds in a three-dimensional grid, and, in addition, of sending messages to their neighbors once they have bonded, and potentially deciding to break bonds they previously formed.

In the abstract Tile Assembly Model, one tile is added per stage, so the primary complexity measure is not one of time, but of how much information a tile set needs in order to solve a particular problem. Several researchers [1] [3] [4] [15] [17] have investigated the tile complexity (the minimum number of distinct tile types required for assembly) of finite shapes, and sets of “scale-equivalent” shapes (essentially a $\mathbb{Z} \times \mathbb{Z}$ analogue of the Euclidean notion of similar figures).
For example, it is now known that the number of tile types required to assemble a square of size $n \times n$ (for $n$ any natural number) is $\Omega(\log n / \log \log n)$ [15]. Or, if $T$ is the set of all discrete equilateral triangles, the asymptotically optimal relationship between triangle size and number of tiles required to assemble that triangle is closely related to the Kolmogorov complexity of a program that outputs the triangle as a list of coordinates [17].

Despite these advances in understanding of the complexity of assembling finite, bounded shapes, the self-assembly of infinite structures is not as well understood. In particular, there are few lower bounds or impossibility results on what infinite structures can be self-assembled in the Tile Assembly Model. The first such impossibility result appeared in [10], when Lathrop, Lutz and Summers showed that no finite tile set can assemble the discrete Sierpinski Triangle by placing a tile only on the coordinates of the shape itself. (By contrast, Winfree had shown that just seven tile types are required to tile the first quadrant of the integer plane with tiles of one color on the coordinates of the discrete Sierpinski Triangle, and tiles of another color on the coordinates of the complement [21].) Recently, Patitz and Summers have extended this initial impossibility result to other discrete fractals [13], and Lathrop et al. [9] have demonstrated sets in $\mathbb{Z} \times \mathbb{Z}$ that are Turing decidable but cannot be self-assembled in Winfree’s sense.

To date, there has been no work comparing the strengths of different tile assembly models with respect to infinite (nor to finite but arbitrarily large) structures. Since self-assembly is a process in which each point has only local knowledge, it is natural to consider whether the techniques of distributed computing might be useful for comparing models of self-assembly and proving impossibility results about them. This paper is an initial attempt in that direction.

Aggarwal et al. in [3] proposed a generalization of the standard Tile Assembly Model, which they called the $q$-Tile Assembly Model. This model permitted multiple nucleation: tiles did not need to bind immediately to the seed supertile. Instead, they could form independent supertiles of size up to some constant $q$ before then attaching to the seed supertile. While the main question considered in [3] was tile complexity, we can also ask whether multiple nucleation would allow an improvement in time complexity. Intuitively: does starting from multiple points allow us to build things strictly faster than starting from a single point?

As mentioned above, Majumder, Reif and Sahu recently presented a stochastic, error-permitting tile assembly model, and calculated the rate of convergence to equilibrium for several tile assembly systems [11]. The model in [11] permitted only a single seed assembly, and addition of one tile to the seed supertile at each stage. Majumder, Reif and Sahu left as an open question how the model might be extended to permit the presence and binding of multiple supertiles. Therefore, we can rephrase the “intuitive” question above as follows: can we tile a surface of size $n \times n$ in a constant number of stages, by randomly selecting nucleation points on the surface, building supertiles of size $q$ or smaller from those points in $\leq q$ stages, and then allowing $\leq r$ additional stages for tiles to fall off and be replaced if the edges of the supertiles contain tiles that bind incorrectly? (The assembly achieves equilibrium in constant time because $q$ and $r$ do not depend on $n$.) The partial answer obtained in this paper is that locally checkable labeling problems cannot be solved in constant time, if we limit ourselves to self-assembly on a surface.

Limiting ourselves to self-assembly on a surface is significant, because we are requiring that agents adhere to a substrate and then never move again, unless they dissociate completely from the larger assembly.
When assemblies multiply nucleate in solution, however, they form disjoint supertiles that can float independently until potentially becoming aligned, with some probability. A self-assembly model that made this rigorous might be strictly stronger than the self-assembly models we consider in this paper, as it is not clear how to simulate floating supertiles within our distributed computing models without introducing slowdown, as processors simulating locations of the surface would have to “pass along” information from one processor to the next, to simulate elements of the moving supertile. We leave the possibility of simulating floating supertiles to future work.

Another limitation to our results is that our proof technique applies only to self-assembly models whose binding rules are completely local. One could imagine models in which supertiles combine (or separate) based on simultaneous interactions at several locations, instead of the models we consider in this paper, in which the system’s behavior at each location depends only on the properties of that location’s immediate neighbors. The self-assembly literature, to our knowledge, contains little regarding self-assembly models with nonlocal binding rules, and this could be a fruitful area to investigate.

Klavins and co-authors have modeled self-assembly phenomena (and programmed self-assembling modular robots) using graph grammars [6] [8]. Klavins in [7] informally compares the limitations of the “distributed algorithms” of graph grammars (used to program self-assembling robots) to impossibility results in distributed computing. Recently, we have shown connections between self-assembly and the wait-free consensus hierarchy [18], and we have embedded the “graph assembly systems” of Klavins into a known graph grammar characterization of distributed systems [19].
The present paper, to the best of our knowledge, is the first to construct a formal reduction from self-assembly models to models of distributed computing.

Section 2 of this paper describes the abstract Tile Assembly Model, and then considers generalizations of the standard model that permit multiple nucleation. Section 3 reviews the distributed computing results of Naor and Stockmeyer needed to prove the impossibility result. In Section 4 we present our simulation technique and lower bound results. Section 5 concludes the paper and suggests directions for future research.

The west side has binding strength 0, represented by a dashed line. The north side has glue type “Y0” and binding strength 2, represented by a double line. The east side has glue type “0” and binding strength 1, represented by a single line. The south side has glue type “Y1” and binding strength 2. This tile is named “Y1”.

Figure 1: An example tile with explanation.

Winfree’s objective in defining the Tile Assembly Model was to provide a useful mathematical abstraction of DNA tiles combining in solution [21]. Rothemund [14], and Rothemund and Winfree [15], extended the original definition of the model. For a comprehensive introduction to tile assembly, we refer the reader to [14]. In our presentation here, we follow [10], which gives equal status to finite and infinite tile assemblies.

Intuitively, a tile of type $t$ is a unit square that can be placed with its center on a point in the integer lattice. A tile has a unique orientation; it can be translated, but not rotated. We identify the side of a tile with the direction (or unit vector) one must travel from the center to cross that side. The literature often refers to west, north, east and south sides, starting at the leftmost side and proceeding clockwise. Each side of a tile is covered with a “glue” that has a color and a strength. Figure 1 shows how a tile is represented graphically.

If tiles of types $t$ and $t'$ are placed adjacent to each other, then they will bind with the strength shared by both adjacent sides if the glues on those sides are the same. Note that this definition of binding implies that if the glues of the adjacent sides do not have the same color or strength, then their binding strength is 0. Later, we will permit pairs of glues to have negative binding strength, to model error occurrence and correction.

One parameter in a tile assembly model is the minimum binding strength required for tiles to bind “stably.” This parameter is usually termed temperature and denoted by $\tau$, where $\tau \in \mathbb{N}$.

As we consider only two-dimensional tile assemblies, we limit ourselves to working in $\mathbb{Z}^2 = \mathbb{Z} \times \mathbb{Z}$. $U$ is the set of all unit vectors in $\mathbb{Z}^2$.

A binding function on an (undirected) graph $G = (V, E)$ is a function $\beta : E \to \mathbb{N}$. If $\beta$ is a binding function on a graph $G = (V, E)$ and $C = (C_0, C_1)$ is a cut of $G$, then the binding strength of $\beta$ on $C$ is

$$\beta_C = \sum \{ \beta(e) \mid e \in E,\ e \cap C_0 \neq \emptyset, \text{ and } e \cap C_1 \neq \emptyset \}.$$
The binding strength of $\beta$ on $G$ is then $\beta(G) = \min \{ \beta_C \mid C \text{ is a cut of } G \}$. Intuitively, the binding function captures the strength with which any two neighbors are bound together, and the binding strength of the graph is the minimum strength of bonds that would have to be severed in order to separate the graph into two pieces.

A binding graph is an ordered triple $G = (V, E, \beta)$ where $(V, E)$ is a graph and $\beta$ is a binding function on $(V, E)$. If $\tau \in \mathbb{N}$, a binding graph $G = (V, E, \beta)$ is $\tau$-stable if $\beta((V, E)) \geq \tau$.

Recall that a grid graph is a graph $G = (V, E)$ where $V \subseteq \mathbb{Z} \times \mathbb{Z}$ and every edge $\{\vec{m}, \vec{n}\} \in E$ has the property that $\vec{m} - \vec{n} \in U$. We write $[V]^2$ for the set $\{\{v_1, v_2\} \mid v_1 \in V \text{ and } v_2 \in V\}$, i.e., the two-element subsets of $V$.

Definition 1. A tile type over a (finite) alphabet $\Sigma$ is a function $t : U \to \Sigma^* \times \mathbb{N}$. We write $t = (\mathrm{col}_t, \mathrm{str}_t)$, where $\mathrm{col}_t : U \to \Sigma^*$ and $\mathrm{str}_t : U \to \mathbb{N}$ are defined by $t(\vec{u}) = (\mathrm{col}_t(\vec{u}), \mathrm{str}_t(\vec{u}))$ for all $\vec{u} \in U$.

Definition 2. If $T$ is a set of tile types, a $T$-configuration is a partial function $\alpha : \mathbb{Z}^2 \dashrightarrow T$.

Definition 3.

The binding graph of a $T$-configuration $\alpha : \mathbb{Z}^2 \dashrightarrow T$ is the binding graph $G_\alpha = (V, E, \beta)$, where $(V, E)$ is the grid graph given by $V = \mathrm{dom}(\alpha)$,

$$E = \left\{ \{\vec{m}, \vec{n}\} \in [V]^2 \;\middle|\; \vec{m} - \vec{n} \in U,\ \mathrm{col}_{\alpha(\vec{m})}(\vec{n} - \vec{m}) = \mathrm{col}_{\alpha(\vec{n})}(\vec{m} - \vec{n}), \text{ and } \mathrm{str}_{\alpha(\vec{m})}(\vec{n} - \vec{m}) > 0 \right\},$$

and the binding function $\beta : E \to \mathbb{Z}^+$ is given by $\beta(\{\vec{m}, \vec{n}\}) = \mathrm{str}_{\alpha(\vec{m})}(\vec{n} - \vec{m})$ for all $\{\vec{m}, \vec{n}\} \in E$.

Definition 4.

For $T$ a set of tile types, a $T$-configuration $\alpha$ is stable if its binding graph $G_\alpha$ is $\tau$-stable. A $\tau$-$T$-assembly is a $T$-configuration that is $\tau$-stable. We write $\mathcal{A}^\tau_T$ for the set of all $\tau$-$T$-assemblies.

Definition 5.

Let $\alpha$ and $\alpha'$ be $T$-configurations.

1. $\alpha$ is a subconfiguration of $\alpha'$, and we write $\alpha \sqsubseteq \alpha'$, if $\mathrm{dom}(\alpha) \subseteq \mathrm{dom}(\alpha')$ and, for all $\vec{m} \in \mathrm{dom}(\alpha)$, $\alpha(\vec{m}) = \alpha'(\vec{m})$.
2. $\alpha'$ is a single-tile extension of $\alpha$ if $\alpha \sqsubseteq \alpha'$ and $\mathrm{dom}(\alpha') \setminus \mathrm{dom}(\alpha)$ is a singleton set. In this case, we write $\alpha' = \alpha + (\vec{m} \mapsto t)$, where $\{\vec{m}\} = \mathrm{dom}(\alpha') \setminus \mathrm{dom}(\alpha)$ and $t = \alpha'(\vec{m})$.
3. The notation $\alpha \xrightarrow[\tau,T]{1} \alpha'$ means that $\alpha, \alpha' \in \mathcal{A}^\tau_T$ and $\alpha'$ is a single-tile extension of $\alpha$. (The “1” above the arrow is to denote that a single tile is added at this step.)

Definition 6.

Let $\alpha \in \mathcal{A}^\tau_T$.

1. For each $t \in T$, the $\tau$-$t$-frontier of $\alpha$ is the set

$$\partial^\tau_t \alpha = \left\{ \vec{m} \in \mathbb{Z}^2 \setminus \mathrm{dom}(\alpha) \;\middle|\; \sum_{\vec{u} \in U} \mathrm{str}_t(\vec{u}) \cdot [\![ \alpha(\vec{m} + \vec{u})(-\vec{u}) = t(\vec{u}) ]\!] \geq \tau \right\}.$$

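2. The $\tau$-frontier of $\alpha$ is the set $\partial^\tau \alpha = \bigcup_{t \in T} \partial^\tau_t \alpha$.

Definition 7. A $\tau$-$T$-assembly sequence is a sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ in $\mathcal{A}^\tau_T$, where $k \in \mathbb{Z}^+ \cup \{\infty\}$ and, for each $i$ with $1 \leq i + 1 < k$, $\alpha_i \xrightarrow[\tau,T]{1} \alpha_{i+1}$.

Definition 8.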

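The result of a $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ is the unique $T$-configuration $\alpha = \mathrm{res}(\vec{\alpha})$ satisfying: $\mathrm{dom}(\alpha) = \bigcup_{0 \leq i < k} \mathrm{dom}(\alpha_i)$ and $\alpha_i \sqsubseteq \alpha$ for each $0 \leq i < k$.

Definition 9.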

Let $\alpha, \alpha' \in \mathcal{A}^\tau_T$. A $\tau$-$T$-assembly sequence from $\alpha$ to $\alpha'$ is a $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ such that $\alpha_0 = \alpha$ and $\mathrm{res}(\vec{\alpha}) = \alpha'$. We write $\alpha \xrightarrow[\tau,T]{} \alpha'$ to indicate that there exists a $\tau$-$T$-assembly sequence from $\alpha$ to $\alpha'$.

Definition 10.

An assembly $\alpha \in \mathcal{A}^\tau_T$ is terminal if $\partial^\tau \alpha = \emptyset$.

Intuitively, a configuration is a set of tiles that have been placed in the plane, and the configuration is stable if the binding strength at every possible cut is at least as high as the temperature of the system. Informally, an assembly sequence is a sequence of single-tile additions to the frontier of the assembly constructed at the previous stage. Assembly sequences can be finite or infinite in length.

We are now ready to present a definition of a tile assembly system.
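The definitions of binding graph, $\tau$-stability and frontier can be made concrete with a small sketch. The following Python is only an illustration under stated assumptions: the helper names (`make_tile`, `binding_edges`, `is_stable`, `frontier`), the `NO_GLUE` placeholder and the three-tile toy set are hypothetical and do not come from the paper; two sides are treated as bound only when both color and strength agree; the stability test enumerates every cut, so it is usable only for very small, nonempty configurations; and the frontier search assumes $\tau \geq 1$, so only cells adjacent to the assembly are examined.

```python
from itertools import combinations

# Unit vectors of Z^2 (north, east, south, west).
N, E, S, W = (0, 1), (1, 0), (0, -1), (-1, 0)
U = (N, E, S, W)
NO_GLUE = ("", 0)  # hypothetical placeholder for a side with strength 0


def neg(u):
    return (-u[0], -u[1])


def add(m, u):
    return (m[0] + u[0], m[1] + u[1])


def make_tile(north=NO_GLUE, east=NO_GLUE, south=NO_GLUE, west=NO_GLUE):
    """A tile type maps each unit vector to a (color, strength) pair."""
    return {N: north, E: east, S: south, W: west}


def binding_edges(alpha):
    """Edges of the binding graph of configuration alpha, with their strengths."""
    edges = {}
    for m, t in alpha.items():
        for u in (N, E):  # visit each undirected edge once
            n = add(m, u)
            # Assumption: sides bind only if color AND strength agree.
            if n in alpha and t[u] == alpha[n][neg(u)] and t[u][1] > 0:
                edges[(m, n)] = t[u][1]
    return edges


def is_stable(alpha, tau):
    """Brute force: every cut of the binding graph must have strength >= tau."""
    cells = list(alpha)
    edges = binding_edges(alpha)
    first, rest = cells[0], cells[1:]
    for r in range(len(rest)):  # the side without `first` is never empty
        for extra in combinations(rest, r):
            side = {first, *extra}
            cut = sum(s for (m, n), s in edges.items()
                      if (m in side) != (n in side))
            if cut < tau:
                return False
    return True


def frontier(alpha, tile_types, tau):
    """Map each frontier location to the names of tile types that can bind there."""
    candidates = {add(m, u) for m in alpha for u in U} - set(alpha)
    result = {}
    for m in sorted(candidates):
        for name, t in tile_types.items():
            strength = sum(t[u][1] for u in U
                           if add(m, u) in alpha
                           and alpha[add(m, u)][neg(u)] == t[u])
            if strength >= tau:
                result.setdefault(m, []).append(name)
    return result


# Toy tile set (hypothetical): "x" can attach east of the seed at tau = 2.
tiles = {
    "seed": make_tile(east=("a", 2)),
    "x": make_tile(west=("a", 2), north=("b", 1)),
    "y": make_tile(south=("b", 1)),
}
sigma = {(0, 0): tiles["seed"]}
print(frontier(sigma, tiles, tau=2))
```

In this toy system the only frontier location of the seed at temperature 2 is the cell to its east, and once "x" is placed there the assembly is terminal at temperature 2, because "y" would bind with strength 1 only.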

Definition 11.

Write $\mathcal{A}^\tau_T$ for the set of configurations, stable at temperature $\tau$, of tiles whose tile types are in $T$. A tile assembly system is an ordered triple $\mathcal{T} = (T, \sigma, \tau)$ where $T$ is a finite set of tile types, $\sigma \in \mathcal{A}^\tau_T$ is the seed assembly, and $\tau \in \mathbb{N}$ is the temperature. We require $\mathrm{dom}(\sigma)$ to be finite.

Definition 12.

Let $\mathcal{T} = (T, \sigma, \tau)$ be a tile assembly system.

1. Then the set of assemblies produced by $\mathcal{T}$ is $\mathcal{A}[\mathcal{T}] = \{ \alpha \in \mathcal{A}^\tau_T \mid \sigma \xrightarrow[\tau,T]{} \alpha \}$, where “$\sigma \xrightarrow[\tau,T]{} \alpha$” means that tile configuration $\alpha$ can be obtained from seed assembly $\sigma$ by a legal addition of tiles.
2. The set of terminal assemblies produced by $\mathcal{T}$ is $\mathcal{A}_\Box[\mathcal{T}] = \{ \alpha \in \mathcal{A}[\mathcal{T}] \mid \alpha \text{ is terminal} \}$, where “terminal” describes a configuration to which no tiles can be legally added.

If we view tile assembly as the programming of matter, the following analogy is useful: the seed assembly is the input to the computation; the additions of tiles to the growing assembly are the legal steps the computation can take; the temperature is the primary inference rule of the system; and the terminal assemblies are the possible outputs.

We are, of course, interested in being able to prove that a certain tile assembly system always achieves a certain output. In [17], Soloveichik and Winfree presented a strong technique for this: local determinism.

Informally, an assembly sequence $\vec{\alpha}$ is locally deterministic if (1) each tile added in $\vec{\alpha}$ binds with the minimum strength required for binding; (2) if there is a tile of type $t$ at location $\vec{m}$ in the result of $\vec{\alpha}$, and $t$ and the immediate “OUT-neighbors” of $t$ are deleted from the result of $\vec{\alpha}$, then no other tile type in $T$ can legally bind at $\vec{m}$; and (3) the result of $\vec{\alpha}$ is terminal. We formalize these points as follows.

Definition 13 (Soloveichik and Winfree [17]). A $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ with result $\alpha$ is locally deterministic if it has the following three properties.

1. For all $\vec{m} \in \mathrm{dom}(\alpha) - \mathrm{dom}(\alpha_0)$,

$$\sum_{\vec{u} \in \mathrm{IN}^{\vec{\alpha}}(\vec{m})} \mathrm{str}_{\alpha_{i_{\vec{\alpha}}(\vec{m})}}(\vec{m}, \vec{u}) = \tau,$$

where $\mathrm{IN}^{\vec{\alpha}}(\vec{m})$ means the sides of the tile that bound at location $\vec{m}$ during assembly sequence $\vec{\alpha}$ that contributed nonzero strength during the stage at which the tile bound. (Informally, these are the “input sides” of the tile at location $\vec{m}$, with respect to assembly sequence $\vec{\alpha}$.)
2. For all $\vec{m} \in \mathrm{dom}(\alpha) - \mathrm{dom}(\alpha_0)$ and all $t \in T - \{\alpha(\vec{m})\}$, $\vec{m} \notin \partial^\tau_t(\vec{\alpha} \setminus \vec{m})$.
3. $\partial^\tau \alpha = \emptyset$.

Definition 14 (Soloveichik and Winfree [17]). A tile assembly system $\mathcal{T}$ is locally deterministic if there exists a locally deterministic $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ with $\alpha_0 = \sigma$.

Theorem 1 (Soloveichik and Winfree [17]). If $\mathcal{T}$ is locally deterministic, then $\mathcal{T}$ has a unique terminal assembly.

We move now from DNA tiles self-assembling on a two-dimensional surface to a more general setting, where self-assembling “agents” with the ability not just to bind but also to communicate after binding, and potentially unbind, can assemble either in the plane or in three-space. One could think of these agents as (nano- or macroscale) robots that interlock physically, and, after interlocking, can send their neighbors electronic messages of low complexity. Based on receipt of messages, the robots can then decide to break bonds to one or more of their neighbors. Such modular robots have already been implemented in laboratory experiments [7]. Further, these robots may be constructed so each has (at least with high probability) a unique identification code, permitting transmission of strictly more information than is possible in the setting of tile self-assembly, in which tiles do not have unique identifiers.

We will consider generalizations of the abstract Tile Assembly Model that include the following: (1) multiple nucleation; (2) assembly in which glues bind incorrectly according to some error probability; (3) negative glue strengths, allowing incorrectly bound tiles to be released from the assembly so it is possible for a correctly-binding tile to attach in that location; (4) a third spatial dimension; and (5) tiles can now be “agents,” i.e., finite state machines with algorithms and unique identifiers. We formalize this as follows.

Definition 15. A $d$-regular self-assembling agent type $T$ is a finite state machine of form $T = \langle A, (g_1, \ldots, g_d) \rangle$, where $A$ is a (deterministic or probabilistic) algorithm and the $g_i$’s are finite strings over a finite alphabet (codes for the glue types associated with $T$) that are hardcoded into the machine. The algorithm $A$ can be null (in the case of passive self-assembly like DNA tiles), or can decide whether to transmit messages of length bounded by a constant to neighbors based on the agent’s interaction with neighboring glue types.

We will assume that all agent types have identical geometric structure, and their $d$ glues all have the same orientation. For example, in the aTAM, all agent types are unit squares, oriented north, east, south, west. Also, for simplicity of the proof, we will assume that our agents are memoryless. However, because of the generality of the results of Naor and Stockmeyer, our lower bound results would still hold if agents could make active self-assembly decisions based on a history of messages received from neighbors, not just the current messages and glue types of their neighbors.

Definition 16.

The binary relation $R$ is a set of binding rules for the (finite) set of agent types $\{T_i\}_i$ if, for any $(x, y) \in R$, both $x$ and $y$ are glue types that appear in elements of $\{T_i\}_i$.

Definition 17. The function $\beta$ is an assignment of binding strengths for the set of binding rules $R$, if the domain of $\beta$ is $R$, and the range of $\beta$ is the set of nonnegative integers.

Definition 18. $\mathcal{M}$ is a model of $d$-regular self-assembling agents if $\mathcal{M} = \langle \{T_i\}_i, R, \beta, \tau, \sigma \rangle$, where $\{T_i\}_i$ is a (finite) set of $d$-regular self-assembling agent types, $R$ is a set of binding rules for $\{T_i\}_i$, $\beta$ is an assignment of binding strengths for $R$, $\tau$ is the temperature of the system (the threshold binding strength for bonds to be stable), and $\sigma$ is an initial (finite) seed assembly.

The algorithm of each agent type may include a variable MY-ID, and we allow the possibility that each agent in the system does, in fact, have a unique identification number. This might be appropriate when modeling robotic self-assembly. In the case of molecular self-assembly, each agent is anonymous.

Assembly systems in both the aTAM and the stochastic model of Majumder et al. can be defined in this formalism, by giving each agent type an algorithm that performs no instructions, and defining (respectively deterministic or probabilistic) binding relations in a natural way.

To conclude this section, we formalize what it means for a self-assembly model to allow multiple nucleation on a surface.

Definition 19.

Let $\mathcal{M}$ be a model of $d$-regular self-assembling agents. We say $\mathcal{M}$ allows multiple nucleation if, in addition to the placement of the seed assembly at the initial stage of assembly, there is some probability $\pi_\nu$ such that (at the first stage of assembly only) an agent is placed on each location of the surface with probability $\pi_\nu$. Further, if an agent is placed at location $\vec{m}$ because of multiple nucleation, its agent type is chosen uniformly at random from the space of possible agent types.

We could allow multiple nucleation to occur at multiple stages during the assembly, not just the first. Again, because of the generality of Naor and Stockmeyer’s results, that would not affect our lower bound proof.
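The first-stage nucleation step of Definition 19 can be sketched as follows. This is a minimal illustration, assuming a two-dimensional $n \times n$ surface, agent types represented by plain strings, and the convention that locations already occupied by the seed assembly are not re-nucleated; the function name `nucleate` and all parameter names are hypothetical.

```python
import random


def nucleate(n, agent_types, seed_assembly, pi_nu, rng):
    """Place the seed, then nucleate each remaining location independently
    with probability pi_nu (first stage of assembly only)."""
    surface = dict(seed_assembly)  # location -> agent type
    for x in range(n):
        for y in range(n):
            # Assumption: seed locations are left untouched.
            if (x, y) not in surface and rng.random() < pi_nu:
                # Agent type chosen uniformly at random.
                surface[(x, y)] = rng.choice(agent_types)
    return surface


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
surface = nucleate(n=8, agent_types=["a", "b", "c"],
                   seed_assembly={(0, 0): "s"}, pi_nu=0.1, rng=rng)
print(len(surface) - 1, "nucleation sites besides the seed")
```

Because each location is sampled independently, the expected number of nucleation sites grows with the area of the surface, which is what makes a constant-time speedup look plausible at first sight.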

In a well-known distributed computing paper, Naor and Stockmeyer investigated whether “locally checkable labeling” problems could be solved over a network of processors in an entirely local manner, where a local solution means a solution arrived at “within time (or distance) independent of the size of the network” [12]. One locally checkable labeling problem Naor and Stockmeyer considered was the weak $c$-coloring problem.

Definition 20 (Naor and Stockmeyer [12]). For $c \in \mathbb{N}$, a weak $c$-coloring of a graph is an assignment of numbers from $\{1, \ldots, c\}$ (the possible “colors”) to the vertices of the graph such that for every non-isolated vertex $v$ there is at least one neighbor $w$ such that $v$ and $w$ receive different colors. Given a graph $G$, the weak $c$-coloring problem for $G$ is to weak $c$-color the nodes of $G$.
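Checking a weak $c$-coloring is itself a purely local test, which the following sketch illustrates. It is not taken from the paper: the adjacency-dictionary representation and the names `is_weak_coloring` and `grid_graph` are assumptions made for the example, and the three colorings are toy inputs.

```python
def is_weak_coloring(adj, color, c):
    """adj: vertex -> set of neighbors; color: vertex -> integer in 1..c."""
    for v, nbrs in adj.items():
        if not 1 <= color[v] <= c:
            return False
        # A non-isolated vertex needs at least one differently colored neighbor.
        if nbrs and all(color[w] == color[v] for w in nbrs):
            return False
    return True


def grid_graph(n):
    """Adjacency of the n x n grid graph (north, south, east, west neighbors)."""
    adj = {}
    for x in range(n):
        for y in range(n):
            adj[(x, y)] = {(x + dx, y + dy)
                           for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                           if 0 <= x + dx < n and 0 <= y + dy < n}
    return adj


adj = grid_graph(4)
checkerboard = {(x, y): 1 + (x + y) % 2 for (x, y) in adj}
stripes = {(x, y): [1, 2, 2, 1][x] for (x, y) in adj}  # weak but not proper
constant = {v: 1 for v in adj}
print(is_weak_coloring(adj, checkerboard, 2),
      is_weak_coloring(adj, stripes, 2),
      is_weak_coloring(adj, constant, 2))
```

The striped coloring shows that a weak coloring need not be a proper coloring: two adjacent columns share a color, yet every vertex still has some neighbor of the other color. Note that both valid colorings here are computed from global coordinates, which is information a local algorithm does not have.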

In the context of tiling, to solve the weak $c$-coloring problem for an $n \times n$ surface means tiling the surface so each tile has at least one neighbor (to the north, south, east or west) of a different color. In the next section, we will present a simple solution to the weak $c$-coloring problem in the abstract Tile Assembly Model. By contrast, Naor and Stockmeyer showed that no local, constant-time algorithm can solve the weak $c$-coloring problem for grid graphs, nor for $k$-dimensional meshes, a generalization of grid graphs which we now define.

Definition 21. A $k$-dimensional mesh is a graph with vertex set $\{1, 2, \ldots, m\}^k$ for some $m$, such that two vertices are connected by an edge if the $L_1$-distance between them is 1.

Theorem 2 (Naor and Stockmeyer [12]). For any natural numbers $c$, $k$ and $t$, there is no local algorithm with time bound $t$ that solves the weak $c$-coloring problem for the class of $k$-dimensional meshes. (This remains true even if the processors have unique identifiers and can transmit them as part of the local algorithm.)

A second theorem from the same paper says that randomization does not help. The original result is stronger than the formulation here.

Theorem 3 (Naor and Stockmeyer [12]). Fix a class $\mathcal{G}$ of graphs closed under disjoint union. If there is a randomized local algorithm $P$ with time bound $t$ that solves the weak $c$-coloring problem for $\mathcal{G}$ with error probability $\epsilon$ for some $\epsilon < 1$, then there is a deterministic local algorithm $A$ with time bound $t$ that solves the weak $c$-coloring problem for $\mathcal{G}$.

In order to apply the theorems of Naor and Stockmeyer to the realm of self-assembly, we build a distributed network of processors that reduces a self-assembly problem to a distributed computing problem. The motivating intuition is that each processor simulates a location of the surface, and reports to its neighbors whether there is a (simulated) agent at that location. Formally, we prove the following theorem.

Theorem 4. Let $\mathcal{M}$ be a model of $d$-regular self-assembling agents for any natural number $d > 0$, such that $\mathcal{M}$ self-assembles on a $k$-dimensional mesh of size $n^k$, and that $\mathcal{M}$ allows multiple nucleation. Then there is a model $\mathcal{N}$ of distributed computing that simulates $\mathcal{M}$ using $n^k$ processors with the network topology of a $k$-dimensional mesh, and constant-size message complexity.

Proof. Fix a model of $d$-regular self-assembling agents $\mathcal{M}$ as in the theorem statement. Let $\alpha$ be a configuration of agents on the mesh of size $n^k$. Let $\Gamma$ be the set of glue types of $\mathcal{M}$ and $M$ the set of electronic messages of $\mathcal{M}$. (Both $\Gamma$ and $M$ are finite sets.) The definition of the binding function $\beta$ induces a function

$$\hat{\beta} : (T \cup \{\emptyset\}) \times (\Gamma \cup \{\emptyset\})^d \times (M \cup \{\emptyset\})^d \times (T \cup \{\emptyset\}) \longrightarrow [0, 1].$$

$\hat{\beta}$ takes as input the (possibly empty) agent type $\alpha(\vec{m})$ at some location $\vec{m}$ in configuration $\alpha$, and, based on the glue types and electronic messages received from the $d$ neighbors that could be incident to an agent at $\vec{m}$, returns, for each agent type, the probability that $\alpha(\vec{m})$ would contain that agent type, over the space of all legal $\mathcal{M}$-assembly sequences that start with configuration $\alpha$ and run for one time step. In particular, for fixed $t \in T$, fixed $\gamma_1, \ldots, \gamma_d \in \Gamma \cup \{\emptyset\}$, and fixed $m_1, \ldots, m_d \in M \cup \{\emptyset\}$, it is true that

$$\sum_{t' \in T \cup \{\emptyset\}} \hat{\beta}(t, \gamma_1, \ldots, \gamma_d, m_1, \ldots, m_d, t') = 1.$$

We have not formally defined $\mathcal{M}$-assembly sequences, but they are a natural extension of the $\tau$-$T$-assembly sequences of tile self-assembly, where the $\beta$ and $\tau$ of $\mathcal{M}$ are used to determine whether agents bind stably to one another.
Also, if an agent type lies on the edge of the $n^k$-size surface, so it does not have a neighbor in a particular direction, we define $\hat{\beta}$ so that the empty set is the glue and electronic message “transmitted” from the “neighbor” in that direction.

We simulate assembly sequences of $\mathcal{M}$ on a $k$-dimensional mesh where each of the dimensions has length $n$ by a network of processors $\mathcal{N}$ whose network graph is also a $k$-dimensional mesh of total size $n^k$. Each processor will simulate the presence or absence of an agent in the same location on the assembly surface. We interpret bonds between two agents as messages. We add on top of those messages an additional set of electronic messages agents can send neighbors, and encode the combination as an ordered pair: glue type and electronic message. The function $\hat{\beta}$ will be the probabilistic transition function for processors in this system.

Processors of $\mathcal{N}$ are of the following form.

Processor $p_i$
• $d$-many input message buffers: $\mathrm{inbuf}_{i,1}, \ldots, \mathrm{inbuf}_{i,d}$.
• $d$-many output message buffers: $\mathrm{outbuf}_{i,1}, \ldots, \mathrm{outbuf}_{i,d}$.
• A color variable: $\mathrm{COLOR}_i$, a variable that can take a value from $\{1, \ldots, c\}$, where $c$ is a global constant.
• A local state: Each processor is in one of $|T| + 1$ different local states $q$ during a given execution stage $s$. There is one state $q_k$ to simulate each agent type $t_k \in T$, and an additional state EMPTY, to simulate the absence of an agent from the surface location that $p_i$ is simulating.
• A state transition function: This function takes the current processor state and the messages received in the current round, and probabilistically directs what state the processor will adopt in the next round.

The messages processors send on the network are of form ⟨glue type, electronic message⟩. The input message buffers of processor $p_i$ simulate the glue types of the edges the agent at $p_i$’s location is adjacent to, and the electronic messages (if any) received from an agent’s neighbors. The output message buffers of $p_i$ simulate the glues on the edges of the tile $p_i$ is simulating, and the electronic messages the agent transmits to its neighbors. The purpose of $\mathrm{COLOR}_i$ is to simulate the color of the agent placed at the location simulated by $p_i$.

All processors in $\mathcal{N}$ are hardcoded with the same probabilistic state transition function, which is determined from the definition of $\hat{\beta}$ (which we induced above from the properties of $\mathcal{M}$), in the natural way: if, in round $r$ of the algorithm execution, $p_i$ is in state $q_k$, a simulation of $t_k \in T$, and hears messages that simulate glue types $g_1, \ldots, g_d$ and electronic messages $m_1, \ldots, m_d$, then at the end of round $r$, it will transition to state $q_j$ with probability $\pi_j$, where $\hat{\beta}(t_k, g_1, \ldots, g_d, m_1, \ldots, m_d, t_j) = \pi_j$ and each $t_j$ is a distinct element of $T \cup \{\emptyset\}$. As explained above, we denote the state that simulates the “presence of the empty set” (i.e., the absence of any agent from the location simulated by $p_i$) as EMPTY.

To simulate the process of self-assembly, we run the following distributed algorithm on $\mathcal{N}$.

Algorithm execution proceeds in synchronized rounds. Before execution begins, all processors start in state EMPTY.
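The probabilistic state transition of a single processor can be sketched as follows. This is only an illustration under stated assumptions: states are represented by agent-type labels with `None` standing for EMPTY, each received message is a (glue, electronic message) pair or `None`, and `next_state` and `toy_beta_hat` are hypothetical names. The toy rule is invented for the example and is not a rule from the paper.

```python
import random

EMPTY = None  # assumed encoding of the state that simulates "no agent here"


def next_state(state, inbox, beta_hat, states, rng):
    """Sample the state a processor adopts at the end of a round.

    state    -- current state (an agent type label, or EMPTY)
    inbox    -- tuple of d received messages, each a (glue, message) pair
                or None when nothing arrived from that direction
    beta_hat -- callable (state, glues, messages, candidate) -> probability
    states   -- list of all candidate next states, including EMPTY
    rng      -- a random.Random instance
    """
    glues = tuple(m[0] if m is not None else None for m in inbox)
    msgs = tuple(m[1] if m is not None else None for m in inbox)
    weights = [beta_hat(state, glues, msgs, q) for q in states]
    # For fixed inputs, beta_hat must define a probability distribution
    # over the candidate next states.
    assert abs(sum(weights) - 1.0) < 1e-9
    return rng.choices(states, weights=weights, k=1)[0]


def toy_beta_hat(state, glues, msgs, candidate):
    """Hypothetical rule: an empty location next to glue "g" becomes type "t";
    in every other case the state is unchanged."""
    target = "t" if state is EMPTY and "g" in glues else state
    return 1.0 if candidate == target else 0.0
```

Because every processor applies the same function to inputs from its $d$ neighbors only, one round of the network is obtained by calling `next_state` once per processor on the messages sent in the previous round.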
In round $r = 0$ (through the intervention of an omniscient operator), each processor in the locations corresponding to the seed assembly enters the state that simulates the agent type at that location in the seed assembly.

Also in round $r = 0$, each processor not simulating part of the seed assembly “wakes up” (enters a state other than EMPTY) with probability $\pi_\nu$, the multiple nucleation probability of $\mathcal{M}$. If a processor wakes up, it enters state $q \neq$ EMPTY, chosen uniformly at random from the set of non-EMPTY states.

For any round $r > 0$, each processor runs either Algorithm 1 or Algorithm 2, depending on whether it is in state EMPTY.

The interaction between agents in $\mathcal{M}$ is completely defined by the glues and electronic messages of an agent’s immediate neighbors, as specified in the function $\hat{\beta}$ and the algorithm of each agent type. The processors of $\mathcal{N}$ simulate that behavior with Algorithm 2. Since the processors of $\mathcal{N}$ simulate empty locations with Algorithm 1, by a straightforward induction argument, $\mathcal{N}$ can simulate all possible $\mathcal{M}$-assembly sequences, and the theorem is proved.

We obtain our time lower bound results as corollaries of Theorem 4.

Corollary 1.

If the (deterministic or probabilistic) binding rules of a multiply nucleating tile assembly system $\mathcal{T}$ are entirely local, then $\mathcal{T}$ is unable to solve the weak $c$-coloring problem in constant time.

Proof. Suppose $\mathcal{T}$ is an irreversible tiling model. If $\mathcal{T}$ can weak $c$-color surfaces in constant time, then there is a deterministic algorithm for the distributed network $\mathcal{N}$ that weak $c$-colors $\mathcal{N}$ locally, and in constant time. By Theorem 2 that is impossible.

So assume $\mathcal{T}$ is a reversible tiling model, and when $\mathcal{T}$ assembles, it weak $c$-colors the tiling surface, and achieves bond pair equilibrium in constant time. Then there is a local probabilistic algorithm for $\mathcal{N}$ that weak $c$-colors $\mathcal{N}$ in constant time, with positive probability of success. By Theorem 3 that is impossible as well. Therefore, no $\mathcal{T}$ exists that weak $c$-colors surfaces in constant time.

Algorithm 1 For $p_i$ in state EMPTY at round $r$
    if $r = 0$ then
        wake up with probability $\pi_\nu$, and cease execution for this round.
    end if
    if $r > 0$ then
        Read the $d$-many input buffers.
        if no messages were received then
            cease execution for the round
        else
            let $q$ be the state change obtained according to the probabilities $\hat{\beta}$ assigns to the space $T \cup \{\emptyset\}$, for a location that has adjacent glue types and electronic messages that are simulated by the messages received this round.
            Send the messages indicated by state $q$ and the behavior of $A$.
            Set the value of $\mathrm{COLOR}_i$ according to $q$.
            Enter state $q$ and cease execution for this round.
        end if
    end if

Algorithm 2 For $p_i$ in state $q \neq$ EMPTY (at any round)
    Read the $d$-many input buffers.
    if no messages were received then
        Send the messages indicated by state $q$ and the behavior of $A$ and cease execution for this round.
    else
        Let $q'$ be the state change obtained probabilistically, based on the probabilities the function $\hat{\beta}$ assigns to the space $T \cup \{\emptyset\}$, given input from the glue types and electronic messages simulated by the messages received this round.
        Send the messages indicated by state $q'$.
        Set the value of $\mathrm{COLOR}_i$ according to $q'$.
        Enter state $q'$ and cease execution for this round.
    end if

Figure 2: The tileset $T^*$ used in the proof of Proposition 1.

By a similar argument, we obtain a lower bound for active self-assembling agents on a three-dimensional cubic grid.

Corollary 2.

If a model of 6-regular self-assembling agents has only local binding rules, then it cannot solve the weak $c$-coloring problem in constant time on a 3-dimensional mesh, for any value of $c$.

A physical interpretation of Corollary 2 would be that robots self-assembling in three-space (i.e., $k = 3$ and, in this example, $d = 6$ so there are six arms coming off each robot, orthogonally to one another) cannot achieve speedup to constant time by self-assembling in separate groups and then joining the groups together. This lower bound remains in effect even if the robots are designed by a method that assigns each robot a unique identifier.

We conclude this section by noting that the weak $c$-coloring problem has low tile complexity (that is, it can be defined using only a few local rules) in the aTAM.

Proposition 1. There is a tile assembly system in the abstract Tile Assembly Model that weak $c$-colors the first quadrant, using only seven distinct tile types.

Proof. Figure 2 exhibits a tileset $T^*$ of seven tile types that assembles into a weak $c$-coloring of the first quadrant, starting from an individual seed tile placed at the origin. One can verify by inspection that $T^*$ is locally deterministic, so it will always produce the same terminal assembly. All assembly sequences generated by $T^*$ produce a checkerboard pattern in which a monochromatic “+” configuration never appears. Hence, it solves the weak $c$-coloring problem for the entire first quadrant, and also for all $n \times n$ squares, for any $n$.

One can define a three-dimensional version of the tileset $T^*$ (shown in Figure 2) in the natural way, using for example the 3D tile assembly model in [5]. Such a three-dimensional tileset will weakly $c$-color the three-dimensional mesh where $d = 6$, with low tile complexity.
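The claim that a checkerboard pattern weakly colors a mesh can be checked mechanically on small instances. The sketch below assumes that "checkerboard" means coloring each location by the parity of its coordinate sum, and it builds the mesh directly from the $L_1$-distance definition; the function names and the use of coordinates from `range(n)` are choices made for this example. It computes the coloring globally, so it says nothing about local constant-time algorithms and does not conflict with Theorem 2.

```python
import itertools


def mesh(n, k):
    """Adjacency dict of the k-dimensional mesh with n points per side.

    Vertices are k-tuples over range(n); two vertices are adjacent exactly
    when their L1-distance is 1.
    """
    adj = {}
    for v in itertools.product(range(n), repeat=k):
        nbrs = []
        for axis in range(k):
            for step in (-1, 1):
                x = v[axis] + step
                if 0 <= x < n:
                    nbrs.append(v[:axis] + (x,) + v[axis + 1:])
        adj[v] = nbrs
    return adj


def checkerboard(v):
    """Color 1 or 2 according to the parity of the coordinate sum."""
    return 1 + sum(v) % 2


def is_weakly_colored(adj, color):
    """True if every non-isolated vertex has a differently colored neighbor."""
    return all(
        not nbrs or any(color(w) != color(v) for w in nbrs)
        for v, nbrs in adj.items()
    )
```

For instance, `is_weakly_colored(mesh(5, 2), checkerboard)` and `is_weakly_colored(mesh(4, 3), checkerboard)` both hold, whereas a constant coloring fails on any mesh with at least one edge.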
In this paper, we showed that if a tile assembly model has only local binding rules, then it cannot use multiple nucleation on a surface to solve locally checkable labeling problems in constant time, even though the abstract Tile Assembly Model can solve a locally checkable labeling problem using just seven tile types. In fact, we proved a more general impossibility result, which showed the same lower bound applies to self-assembling agents in a three-dimensional grid that are capable of binding and subsequently sending messages to their neighbors. To the best of our knowledge, this was the first application of a distributed computing impossibility result to the field of self-assembly.

There are still many open questions regarding multiple nucleation. Aggarwal et al. asked in [3] whether multiple nucleation might reduce the tile complexity of finite shapes. The answer is not known. Furthermore, we can ask for what class of computational problems there exists some function $f$ such that we could tile an $n \times n$ square in time $O(1) < O(f) < O(n)$, and “solve” the problem with “acceptable” probability of error, in a tile assembly model that permits multiple nucleation. It would also be interesting to explore the possibility of modeling multiple nucleation of molecules floating in solution, instead of adhering to a surface, perhaps by using techniques from the field of ad hoc wireless networks.

We hope that this is just the start of a conversation between researchers in the fields of distributed computing and biomolecular computation.

Acknowledgements

I am grateful to Soma Chaudhuri, Dave Doty, Jim Lathrop and Jack Lutz for helpful discussions on earlier versions of this paper. I am also grateful to two anonymous referees, who suggested significant conceptual and technical improvements to my original journal submission.

References