A Time Lower Bound for Multiple Nucleation on a Surface
Aaron Sterling ∗ November 8, 2018
Abstract
Majumder, Reif and Sahu have presented a stochastic model of reversible, error-permitting, two-dimensional tile self-assembly, and showed that restricted classes of tile assembly systems achieved equilibrium in (expected) polynomial time. One open question they asked was how much computational power would be added if the model permitted multiple nucleation, i.e., independent groups of tiles growing before attaching to the original seed assembly. This paper provides a partial answer, by proving that if a tile assembly model uses only local binding rules, then it cannot use multiple nucleation on a surface to solve certain “simple” problems in constant time (time independent of the size of the surface). Moreover, this time bound applies to macroscale robotic systems that assemble in a three-dimensional grid, not just to tile assembly systems on a two-dimensional surface. The proof technique defines a new model of distributed computing that simulates tile (and robotic) self-assembly.
Keywords: self-assembly, multiple nucleation, locally checkable labeling.
∗ Laboratory for Nanoscale Self-Assembly, Department of Computer Science, Iowa State University, Ames, IA 50014, USA. This research was supported in part by National Science Foundation Grants 0652569 and 0728806.

Nature is replete with examples of the self-assembly of individual parts into a more complex whole, such as the development from zygote to fetus, or, more simply, the replication of DNA itself. In his Ph.D. thesis in 1998, Winfree proposed a formal mathematical model to reason algorithmically about processes of self-assembly [21]. Winfree connected the experimental work of Seeman [16] (who had built “DNA tiles,” molecules with unmatched DNA base pairs protruding in four directions, so they could be approximated by squares with different “glues” on each side) to a notion of tiling the integer plane developed by Wang in the [...] seed tile or a connected, finite seed assembly. Tiles would then accrete one at a time to the seed assembly, growing a seed supertile. A tile assembly system is a finite set of tile types. Tile types are characterized by the names of the “glues” they carry on each of their four sides, and the binding strength each glue can exert. We assume that when the tiles interact “in solution,” there are infinitely many tiles of each tile type. Tile assembly proceeds in discrete stages. At each stage $s$, from all possibilities of tile attachment at all possible locations (as determined by the glues of the tile types and the binding requirements of the system overall), one tile will bind, with tile type and location “chosen” nondeterministically from possible legal bonds at that stage. (Later, we will generalize this so multiple tiles can bind concurrently, at a given stage.) Winfree proved that his Tile Assembly Model is Turing universal.

The abstract Tile Assembly Model (aTAM) is error-free and irreversible: tiles always bind correctly, and, once a tile binds, it can never unbind. Adleman et al.
were the first to define a notion of time complexity for tile assembly, using a one-dimensional error-permitting, reversible model, where tiles would assemble in a line with some error probability, then be scrambled, and fall back to the line [1]. Adleman et al. proved bounds on how long it would take such models to achieve equilibrium. Majumder, Reif and Sahu have recently presented a two-dimensional stochastic model for self-assembly [11], and have shown that some tiling problems in their model correspond to rapidly mixing Markov chains, that is, Markov chains that reach stationary distribution in time polynomial in the state space of legally reachable assemblies.

While the aTAM is nondeterministic, real-world chemical reactions are probabilistic, and discrete molecular interactions are often modeled stochastically. We will define a class of stochastic self-assembly models that contains the model of Majumder et al., and prove a lower bound about any model in that class.

The tile assembly systems analyzed in [11] had the property that their equilibrium assemblies were identical (allowing for small error) with their terminal or complete assemblies, i.e., assemblies that cannot legally evolve further, given the rules of the system. This identity does not, however, hold in general. In a closed chemical system, where equilibrium may be achieved, it is possible that the system at equilibrium might consist almost entirely of large, undesirable assemblies that do not perform the desired computation. In these cases, correct assembly occurs when the system is out of equilibrium, and can be maintained because there is a large kinetic energy barrier to forming undesired structures. Therefore, when we discuss the “solution to a problem” in this paper, we identify that with the notion of a complete assembly.

We will prove a time complexity lower bound on the solution of a graph coloring problem for a class of self-assembly models, including, but not limited to, a generalization of the model of [11].
The tile assembly model in [11], like the aTAM, allows only for a single seed assembly, and one of the open problems in [11] was how the model might change if it allowed multiple nucleation, i.e., if multiple supertiles could build independently before attaching to a growing seed supertile. The main result of this paper provides a time complexity lower bound for a class of tile assembly models that permit multiple nucleation on a 2D surface or a 3D grid: there is no way for those models to use multiple nucleation to achieve a speedup to tiling a surface in constant time (time independent of the size of the surface) in order to solve a graph coloring problem, even though that graph coloring problem requires only seven tile types to solve in the aTAM. This result holds for tile assembly models that are reversible, irreversible, error-permitting or error-free. In fact, a speedup to constant time is impossible, even if we relax the model to allow that, at each step $s$, there is a positive probability for every available location that a tile will bind there (instead of requiring that exactly one tile bind per stage).

To our knowledge, the method of proof in this paper is novel: given a tile assembly model and a tile assembly system $\mathcal{T}$ in that model, we construct a distributed network of processors that can simulate the behavior of $\mathcal{T}$ as it assembles on a surface. Our result then follows from the theorem by Naor and Stockmeyer that locally checkable labeling (LCL) problems have no local solution in constant time [12]. This is true for both deterministic and randomized algorithms, so no constant-time tile assembly system exists that solves an LCL problem with a positive probability of success.
We consider one LCL problem in particular, the weak $c$-coloring problem, and demonstrate a tile set of only seven tile types that solves the weak $c$-coloring problem in the abstract Tile Assembly Model, even though that same problem is impossible to solve in constant time by multiple nucleation on a surface, for a broad class of self-assembly models. Intuitively, this demonstrates that even a problem that can be solved in polynomial time by using a few local rules when starting from a single point cannot necessarily be solved in constant time when starting from multiple points, regardless of the rule set used. (The abstract Tile Assembly Model can weakly $c$-color an $n \times n$ surface in $n^2$ steps, yet none of the multiple nucleation models we consider can solve the weak $c$-coloring problem in constant-many steps.)

The results of Naor and Stockmeyer we apply are more powerful than needed to obtain the time complexity lower bound for a system in which the self-assembling agents are as simple as DNA tiles. Our lower bound actually demonstrates that constant-time speedup to solve LCL problems is impossible via multiple nucleation, even for self-assembling modular robots capable of forming physical bonds in a three-dimensional grid, and, in addition, of sending messages to their neighbors once they have bonded, and potentially deciding to break bonds they previously formed.

In the abstract Tile Assembly Model, one tile is added per stage, so the primary complexity measure is not one of time, but of how much information a tile set needs in order to solve a particular problem. Several researchers [1] [3] [4] [15] [17] have investigated the tile complexity (the minimum number of distinct tile types required for assembly) of finite shapes, and sets of “scale-equivalent” shapes (essentially a $\mathbb{Z} \times \mathbb{Z}$ analogue of the Euclidean notion of similar figures).
For example, it is now known that the number of tile types required to assemble a square of size $n \times n$ (for $n$ any natural number) is $\Omega(\log n / \log \log n)$ [15]. Or, if $T$ is the set of all discrete equilateral triangles, the asymptotically optimal relationship between triangle size and number of tiles required to assemble that triangle is closely related to the Kolmogorov Complexity of a program that outputs the triangle as a list of coordinates [17].

Despite these advances in understanding of the complexity of assembling finite, bounded shapes, the self-assembly of infinite structures is not as well understood. In particular, there are few lower bounds or impossibility results on what infinite structures can be self-assembled in the Tile Assembly Model. The first such impossibility result appeared in [10], when Lathrop, Lutz and Summers showed that no finite tile set can assemble the discrete Sierpinski Triangle by placing a tile only on the coordinates of the shape itself. (By contrast, Winfree had shown that just seven tile types are required to tile the first quadrant of the integer plane with tiles of one color on the coordinates of the discrete Sierpinski Triangle, and tiles of another color on the coordinates of the complement [21].) Recently, Patitz and Summers have extended this initial impossibility result to other discrete fractals [13], and Lathrop et al. [9] have demonstrated sets in $\mathbb{Z} \times \mathbb{Z}$ that are Turing decidable but cannot be self-assembled in Winfree’s sense.

To date, there has been no work comparing the strengths of different tile assembly models with respect to infinite (nor to finite but arbitrarily large) structures. Since self-assembly is a process in which each point has only local knowledge, it is natural to consider whether the techniques of distributed computing might be useful for comparing models of self-assembly and proving impossibility results about them. This paper is an initial attempt in that direction.

Aggarwal et al.
in [3] proposed a generalization of the standard Tile Assembly Model, which they called the $q$-Tile Assembly Model. This model permitted multiple nucleation: tiles did not need to bind immediately to the seed supertile. Instead, they could form independent supertiles of size up to some constant $q$ before then attaching to the seed supertile. While the main question considered in [3] was tile complexity, we can also ask whether multiple nucleation would allow an improvement in time complexity. Intuitively: does starting from multiple points allow us to build things strictly faster than starting from a single point?

As mentioned above, Majumder, Reif and Sahu recently presented a stochastic, error-permitting tile assembly model, and calculated the rate of convergence to equilibrium for several tile assembly systems [11]. The model in [11] permitted only a single seed assembly, and addition of one tile to the seed supertile at each stage. Majumder, Reif and Sahu left as an open question how the model might be extended to permit the presence and binding of multiple supertiles. Therefore, we can rephrase the “intuitive” question above as follows: Can we tile a surface of size $n \times n$ in a constant number of stages, by randomly selecting nucleation points on the surface, building supertiles of size $q$ or smaller from those points in $\leq q$ stages, and then allowing $\leq r$ additional stages for tiles to fall off and be replaced if the edges of the supertiles contain tiles that bind incorrectly? (The assembly achieves equilibrium in constant time because $q$ and $r$ do not depend on $n$.) The partial answer obtained in this paper is that locally checkable labeling problems cannot be solved in constant time, if we limit ourselves to self-assembly on a surface.

Limiting ourselves to self-assembly on a surface is significant, because we are requiring that agents adhere to a substrate and then never move again, unless they dissociate completely from the larger assembly.
When assemblies multiply nucleate in solution, however, they form disjoint supertiles that can float independently until potentially becoming aligned, with some probability. A self-assembly model that made this rigorous might be strictly stronger than the self-assembly models we consider in this paper, as it is not clear how to simulate floating supertiles within our distributed computing models without introducing slowdown, as processors simulating locations of the surface would have to “pass along” information from one processor to the next, to simulate elements of the moving supertile. We leave the possibility of simulating floating supertiles to future work.

Another limitation to our results is that our proof technique applies only to self-assembly models whose binding rules are completely local. One could imagine models in which supertiles combine (or separate) based on simultaneous interactions at several locations, instead of the models we consider in this paper, in which the system’s behavior at each location depends only on the properties of that location’s immediate neighbors. The self-assembly literature, to our knowledge, contains little regarding self-assembly models with nonlocal binding rules, and this could be a fruitful area to investigate.

Klavins and co-authors have modeled self-assembly phenomena, and programmed self-assembling modular robots, using graph grammars [6] [8]. Klavins in [7] informally compares the limitations of the “distributed algorithms” of graph grammars (used to program self-assembling robots) to impossibility results in distributed computing. Recently, we have shown connections between self-assembly and the wait-free consensus hierarchy [18], and we have embedded the “graph assembly systems” of Klavins into a known graph grammar characterization of distributed systems [19].
The present paper, to the best of our knowledge, is the first to construct a formal reduction from self-assembly models to models of distributed computing.

Section 2 of this paper describes the abstract Tile Assembly Model, and then considers generalizations of the standard model that permit multiple nucleation. Section 3 reviews the distributed computing results of Naor and Stockmeyer needed to prove the impossibility result. In Section 4 we present our simulation technique and lower bound results. Section 5 concludes the paper and suggests directions for future research.
The west side has binding strength 0, represented by a dashed line. The north side has glue type “Y0” and binding strength 2, represented by a double line. The east side has glue type “0” and binding strength 1, represented by a single line. The south side has glue type “Y1” and binding strength 2. This tile is named “Y1”.
Figure 1: An example tile with explanation.
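As a minimal sketch, the tile of Figure 1 can be written down as plain data, assuming a tile type is stored as a mapping from sides (unit vectors) to (glue color, strength) pairs. The names `Tile`, `bond_strength`, `y1` and `above` are hypothetical, and the tile `above` is an invented neighbor used only for illustration. The binding rule used here is the one the model description states: two abutting glues bind only if they agree in both color and strength.

```python
from typing import Dict, Tuple

# Unit vectors naming the four sides of a tile.
NORTH, EAST, SOUTH, WEST = (0, 1), (1, 0), (0, -1), (-1, 0)

# A tile type maps each side to a (glue color, glue strength) pair.
Tile = Dict[Tuple[int, int], Tuple[str, int]]


def bond_strength(t: Tile, t_prime: Tile, u: Tuple[int, int]) -> int:
    """Strength of the bond when t_prime sits at offset u from t.

    The glues bind only if the two facing sides carry the same color and
    the same strength; otherwise the bond strength is 0.
    """
    facing = (-u[0], -u[1])  # the side of t_prime that touches t
    if t[u] == t_prime[facing]:
        return t[u][1]
    return 0


# The tile of Figure 1: west strength 0, north "Y0"/2, east "0"/1, south "Y1"/2.
y1: Tile = {WEST: ("", 0), NORTH: ("Y0", 2), EAST: ("0", 1), SOUTH: ("Y1", 2)}

# Hypothetical neighbor whose south glue matches the north glue of y1.
above: Tile = {WEST: ("", 0), NORTH: ("", 0), EAST: ("", 0), SOUTH: ("Y0", 2)}

print(bond_strength(y1, above, NORTH))  # matching "Y0" glues of strength 2
print(bond_strength(y1, above, EAST))   # glues differ, so no bond
```

Storing the side as a unit vector keeps the "facing side" computation a simple negation, which is also how the formal definitions index glues.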
Winfree’s objective in defining the Tile Assembly Model was to provide a useful mathematical abstraction of DNA tiles combining in solution [21]. Rothemund [14], and Rothemund and Winfree [15], extended the original definition of the model. For a comprehensive introduction to tile assembly, we refer the reader to [14]. In our presentation here, we follow [10], which gives equal status to finite and infinite tile assemblies.

Intuitively, a tile of type $t$ is a unit square that can be placed with its center on a point in the integer lattice. A tile has a unique orientation; it can be translated, but not rotated. We identify the side of a tile with the direction (or unit vector) one must travel from the center to cross that side. The literature often refers to west, north, east and south sides, starting at the leftmost side and proceeding clockwise. Each side of a tile is covered with a “glue” that has a color and a strength. Figure 1 shows how a tile is represented graphically.

If tiles of types $t$ and $t'$ are placed adjacent to each other, then they will bind with the strength shared by both adjacent sides if the glues on those sides are the same. Note that this definition of binding implies that if the glues of the adjacent sides do not have the same color or strength, then their binding strength is 0. Later, we will permit pairs of glues to have negative binding strength, to model error occurrence and correction.

One parameter in a tile assembly model is the minimum binding strength required for tiles to bind “stably.” This parameter is usually termed temperature and denoted by $\tau$, where $\tau \in \mathbb{N}$.

As we consider only two-dimensional tile assemblies, we limit ourselves to working in $\mathbb{Z}^2 = \mathbb{Z} \times \mathbb{Z}$. $U$ is the set of all unit vectors in $\mathbb{Z}^2$.

A binding function on an (undirected) graph $G = (V, E)$ is a function $\beta : E \to \mathbb{N}$.
If $\beta$ is a binding function on a graph $G = (V, E)$ and $C = (C_0, C_1)$ is a cut of $G$, then the binding strength of $\beta$ on $C$ is

$$\beta_C = \sum \{ \beta(e) \mid e = \{e_0, e_1\} \in E,\ e_0 \in C_0, \text{ and } e_1 \in C_1 \}.$$

The binding strength of $\beta$ on $G$ is then $\beta(G) = \min \{ \beta_C \mid C \text{ is a cut of } G \}$. Intuitively, the binding function captures the strength with which any two neighbors are bound together, and the binding strength of the graph is the minimum strength of bonds that would have to be severed in order to separate the graph into two pieces.

A binding graph is an ordered triple $G = (V, E, \beta)$ where $(V, E)$ is a graph and $\beta$ is a binding function on $(V, E)$. If $\tau \in \mathbb{N}$, a binding graph $G = (V, E, \beta)$ is $\tau$-stable if $\beta((V, E)) \geq \tau$.

Recall that a grid graph is a graph $G = (V, E)$ where $V \subseteq \mathbb{Z} \times \mathbb{Z}$ and every edge $\{\vec{m}, \vec{n}\} \in E$ has the property that $\vec{m} - \vec{n} \in U$. We write $[V]^2$ for the set $\{\{v_1, v_2\} \mid v_1 \in V \text{ and } v_2 \in V\}$, i.e., the two-element subsets of $V$.

Definition 1. A tile type over a (finite) alphabet $\Sigma$ is a function $t : U \to \Sigma^* \times \mathbb{N}$. We write $t = (\mathrm{col}_t, \mathrm{str}_t)$, where $\mathrm{col}_t : U \to \Sigma^*$ and $\mathrm{str}_t : U \to \mathbb{N}$ are defined by $t(\vec{u}) = (\mathrm{col}_t(\vec{u}), \mathrm{str}_t(\vec{u}))$ for all $\vec{u} \in U$.

Definition 2. If $T$ is a set of tile types, a $T$-configuration is a partial function $\alpha : \mathbb{Z}^2 \dashrightarrow T$.

Definition 3. The binding graph of a $T$-configuration $\alpha : \mathbb{Z}^2 \dashrightarrow T$ is the binding graph $G_\alpha = (V, E, \beta)$, where $(V, E)$ is the grid graph given by $V = \mathrm{dom}(\alpha)$,

$$E = \left\{ \{\vec{m}, \vec{n}\} \in [V]^2 \;\middle|\; \vec{m} - \vec{n} \in U,\ \mathrm{col}_{\alpha(\vec{m})}(\vec{n} - \vec{m}) = \mathrm{col}_{\alpha(\vec{n})}(\vec{m} - \vec{n}), \text{ and } \mathrm{str}_{\alpha(\vec{m})}(\vec{n} - \vec{m}) > 0 \right\},$$

and the binding function $\beta : E \to \mathbb{Z}^+$ is given by $\beta(\{\vec{m}, \vec{n}\}) = \mathrm{str}_{\alpha(\vec{m})}(\vec{n} - \vec{m})$ for all $\{\vec{m}, \vec{n}\} \in E$.

Definition 4. For $T$ a set of tile types, a $T$-configuration $\alpha$ is stable if its binding graph $G_\alpha$ is $\tau$-stable. A $\tau$-$T$-assembly is a $T$-configuration that is $\tau$-stable. We write $\mathcal{A}^\tau_T$ for the set of all $\tau$-$T$-assemblies.

Definition 5. Let $\alpha$ and $\alpha'$ be $T$-configurations.
1. $\alpha$ is a subconfiguration of $\alpha'$, and we write $\alpha \sqsubseteq \alpha'$, if $\mathrm{dom}(\alpha) \subseteq \mathrm{dom}(\alpha')$ and, for all $\vec{m} \in \mathrm{dom}(\alpha)$, $\alpha(\vec{m}) = \alpha'(\vec{m})$.
2. $\alpha'$ is a single-tile extension of $\alpha$ if $\alpha \sqsubseteq \alpha'$ and $\mathrm{dom}(\alpha') \setminus \mathrm{dom}(\alpha)$ is a singleton set. In this case, we write $\alpha' = \alpha + (\vec{m} \mapsto t)$, where $\{\vec{m}\} = \mathrm{dom}(\alpha') \setminus \mathrm{dom}(\alpha)$ and $t = \alpha'(\vec{m})$.
3. The notation $\alpha \xrightarrow[\tau,T]{1} \alpha'$ means that $\alpha, \alpha' \in \mathcal{A}^\tau_T$ and $\alpha'$ is a single-tile extension of $\alpha$. (The “1” above the arrow is to denote that a single tile is added at this step.)

Definition 6. Let $\alpha \in \mathcal{A}^\tau_T$.
1. For each $t \in T$, the $\tau$-$t$-frontier of $\alpha$ is the set

$$\partial^\tau_t \alpha = \left\{ \vec{m} \in \mathbb{Z}^2 \setminus \mathrm{dom}(\alpha) \;\middle|\; \sum_{\vec{u} \in U} \mathrm{str}_t(\vec{u}) \cdot [\![ \alpha(\vec{m} + \vec{u})(-\vec{u}) = t(\vec{u}) ]\!] \geq \tau \right\},$$

where $[\![ \varphi ]\!]$ is 1 if $\varphi$ holds and 0 otherwise.
2. The $\tau$-frontier of $\alpha$ is the set

$$\partial^\tau \alpha = \bigcup_{t \in T} \partial^\tau_t \alpha.$$

Definition 7. A $\tau$-$T$-assembly sequence is a sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ in $\mathcal{A}^\tau_T$, where $k \in \mathbb{Z}^+ \cup \{\infty\}$ and, for each $i$ with $1 \leq i + 1 < k$, $\alpha_i \xrightarrow[\tau,T]{1} \alpha_{i+1}$.

Definition 8. The result of a $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ is the unique $T$-configuration $\alpha = \mathrm{res}(\vec{\alpha})$ satisfying: $\mathrm{dom}(\alpha) = \bigcup_{0 \leq i < k} \mathrm{dom}(\alpha_i)$ and $\alpha_i \sqsubseteq \alpha$ for each $0 \leq i < k$.

Definition 9. Let $\alpha, \alpha' \in \mathcal{A}^\tau_T$. A $\tau$-$T$-assembly sequence from $\alpha$ to $\alpha'$ is a $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ such that $\alpha_0 = \alpha$ and $\mathrm{res}(\vec{\alpha}) = \alpha'$. We write $\alpha \xrightarrow[\tau,T]{} \alpha'$ to indicate that there exists a $\tau$-$T$-assembly sequence from $\alpha$ to $\alpha'$.

Definition 10. An assembly $\alpha \in \mathcal{A}^\tau_T$ is terminal if $\partial^\tau \alpha = \emptyset$.

Intuitively, a configuration is a set of tiles that have been placed in the plane, and the configuration is stable if the binding strength at every possible cut is at least as high as the temperature of the system. Informally, an assembly sequence is a sequence of single-tile additions to the frontier of the assembly constructed at the previous stage. Assembly sequences can be finite or infinite in length.

We are now ready to present a definition of a tile assembly system.

Definition 11. Write $\mathcal{A}^\tau_T$ for the set of configurations, stable at temperature $\tau$, of tiles whose tile types are in $T$. A tile assembly system is an ordered triple $\mathcal{T} = (T, \sigma, \tau)$ where $T$ is a finite set of tile types, $\sigma \in \mathcal{A}^\tau_T$ is the seed assembly, and $\tau \in \mathbb{N}$ is the temperature. We require $\mathrm{dom}(\sigma)$ to be finite.

Definition 12. Let $\mathcal{T} = (T, \sigma, \tau)$ be a tile assembly system.
1. The set of assemblies produced by $\mathcal{T}$ is

$$\mathcal{A}[\mathcal{T}] = \left\{ \alpha \in \mathcal{A}^\tau_T \;\middle|\; \sigma \xrightarrow[\tau,T]{} \alpha \right\},$$

where “$\sigma \xrightarrow[\tau,T]{} \alpha$” means that tile configuration $\alpha$ can be obtained from seed assembly $\sigma$ by a legal addition of tiles.
2. The set of terminal assemblies produced by $\mathcal{T}$ is

$$\mathcal{A}_\Box[\mathcal{T}] = \{ \alpha \in \mathcal{A}[\mathcal{T}] \mid \alpha \text{ is terminal} \},$$

where “terminal” describes a configuration to which no tiles can be legally added.
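The frontier and terminal-assembly definitions can be illustrated with a short sketch. It rests on several assumptions that are not part of the formal model: the plane is cut down to a finite set of locations `box`, a configuration is a dictionary from locations to tile names, and the nondeterministic choice of the next tile is replaced by taking the first option in sorted order. The names `frontier`, `grow`, `ROW` and `box` are hypothetical.

```python
from typing import Dict, Set, Tuple

Vec = Tuple[int, int]
Glue = Tuple[str, int]        # (color, strength)
Tile = Dict[Vec, Glue]        # side, given as a unit vector -> glue
U = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # unit vectors of Z^2


def frontier(alpha: Dict[Vec, str], tiles: Dict[str, Tile], name: str,
             tau: int, box: Set[Vec]) -> Set[Vec]:
    """tau-t-frontier of alpha for the tile type called `name`, within `box`."""
    t = tiles[name]
    result = set()
    for m in box - set(alpha):
        total = 0
        for u in U:
            neighbor = (m[0] + u[0], m[1] + u[1])
            if neighbor in alpha:
                facing = tiles[alpha[neighbor]][(-u[0], -u[1])]
                if facing == t[u]:      # the bracket in the frontier formula
                    total += t[u][1]    # add str_t(u)
        if total >= tau:
            result.add(m)
    return result


def grow(seed: Dict[Vec, str], tiles: Dict[str, Tile], tau: int,
         box: Set[Vec]) -> Dict[Vec, str]:
    """Add one tile per stage until the frontier (within `box`) is empty."""
    alpha = dict(seed)
    while True:
        options = sorted((m, name) for name in tiles
                         for m in frontier(alpha, tiles, name, tau, box))
        if not options:
            return alpha                # terminal with respect to the box
        m, name = options[0]            # stand-in for the nondeterministic choice
        alpha[m] = name


# Hypothetical one-tile system: strength-2 glue "a" on the east and west sides.
ROW: Tile = {(0, 1): ("", 0), (1, 0): ("a", 2), (0, -1): ("", 0), (-1, 0): ("a", 2)}
tiles = {"row": ROW}
box = {(x, y) for x in range(4) for y in range(2)}
final = grow({(0, 0): "row"}, tiles, tau=2, box=box)
print(sorted(final))  # the bottom row fills; the top row cannot attach
```

In this toy system the top row never enters the frontier, because the north and south glues have strength 0 and so contribute nothing toward the temperature threshold.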
If we view tile assembly as the programming of matter, the following analogy is useful: the seed assembly is the input to the computation; the additions of tiles to the growing assembly are the legal steps the computation can take; the temperature is the primary inference rule of the system; and the terminal assemblies are the possible outputs.

We are, of course, interested in being able to prove that a certain tile assembly system always achieves a certain output. In [17], Soloveichik and Winfree presented a strong technique for this: local determinism.

Informally, an assembly sequence $\vec{\alpha}$ is locally deterministic if (1) each tile added in $\vec{\alpha}$ binds with the minimum strength required for binding; (2) if there is a tile of type $t_0$ at location $\vec{m}$ in the result of $\vec{\alpha}$, and $t_0$ and the immediate “OUT-neighbors” of $t_0$ are deleted from the result of $\vec{\alpha}$, then no other tile type in $T$ can legally bind at $\vec{m}$; and (3) the result of $\vec{\alpha}$ is terminal. We formalize these points as follows.

Definition 13 (Soloveichik and Winfree [17]). A $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i \leq k)$ with result $\alpha$ is locally deterministic if it has the following three properties.
1. For all $\vec{m} \in \mathrm{dom}(\alpha) - \mathrm{dom}(\alpha_0)$,

$$\sum_{\vec{u} \in \mathrm{IN}^{\vec{\alpha}}(\vec{m})} \mathrm{str}_{\alpha_{i_{\vec{\alpha}}(\vec{m})}}(\vec{m}, \vec{u}) = \tau,$$

where $i_{\vec{\alpha}}(\vec{m})$ is the stage at which the tile at $\vec{m}$ bound, and $\mathrm{IN}^{\vec{\alpha}}(\vec{m})$ means the sides of the tile that bound at location $\vec{m}$ during assembly sequence $\vec{\alpha}$ that contributed nonzero strength during the stage at which the tile bound. (Informally, these are the “input sides” of the tile at location $\vec{m}$, with respect to assembly sequence $\vec{\alpha}$.)
2. For all $\vec{m} \in \mathrm{dom}(\alpha) - \mathrm{dom}(\alpha_0)$ and all $t \in T - \{\alpha(\vec{m})\}$, $\vec{m} \notin \partial^\tau_t(\vec{\alpha} \setminus \vec{m})$.
3. $\partial^\tau \alpha = \emptyset$.

Definition 14 (Soloveichik and Winfree [17]). A tile assembly system $\mathcal{T}$ is locally deterministic if there exists a locally deterministic $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \leq i < k)$ with $\alpha_0 = \sigma$.

Theorem 1 (Soloveichik and Winfree [17]). If $\mathcal{T}$ is locally deterministic, then $\mathcal{T}$ has a unique terminal assembly.
We move now from DNA tiles self-assembling on a two-dimensional surface to a more general setting, where self-assembling “agents,” with the ability not just to bind but also to communicate after binding and potentially unbind, can assemble either in the plane or in three-space. One could think of these agents as (nano- or macroscale) robots that interlock physically, and, after interlocking, can send their neighbors electronic messages of low complexity. Based on receipt of messages, the robots can then decide to break bonds to one or more of their neighbors. Such modular robots have already been implemented in laboratory experiments [7]. Further, these robots may be constructed so each has (at least with high probability) a unique identification code, permitting transmission of strictly more information than is possible in the setting of tile self-assembly, in which tiles do not have unique identifiers.

We will consider generalizations of the abstract Tile Assembly Model that include the following: (1) multiple nucleation; (2) assembly in which glues bind incorrectly according to some error probability; (3) negative glue strengths, allowing incorrectly bound tiles to be released from the assembly so it is possible for a correctly-binding tile to attach in that location; (4) a third spatial dimension; and (5) tiles that can now be “agents,” i.e., finite state machines with algorithms and unique identifiers. We formalize this as follows.

Definition 15. A $d$-regular self-assembling agent type $T$ is a finite state machine of form $T = \langle A, (g_1, \ldots, g_d) \rangle$, where $A$ is a (deterministic or probabilistic) algorithm and the $g_i$’s are finite strings over a finite alphabet (codes for the glue types associated with $T$) that are hardcoded into the machine.
The algorithm $A$ can be null (in the case of passive self-assembly like DNA tiles), or can decide whether to transmit messages of length bounded by a constant to neighbors based on the agent’s interaction with neighboring glue types.

We will assume that all agent types have identical geometric structure, and their $d$ glues all have the same orientation. For example, in the aTAM, all agent types are unit squares, oriented north, east, south, west. Also, for simplicity of the proof, we will assume that our agents are memoryless. However, because of the generality of the results of Naor and Stockmeyer, our lower bound results would still hold if agents could make active self-assembly decisions based on a history of messages received from neighbors, not just the current messages and glue types of their neighbors.

Definition 16. The binary relation $R$ is a set of binding rules for the (finite) set of agent types $\{T_i\}_i$ if, for any $(x, y) \in R$, both $x$ and $y$ are glue types that appear in elements of $\{T_i\}_i$.

Definition 17. The function $\beta$ is an assignment of binding strengths for the set of binding rules $R$, if the domain of $\beta$ is $R$, and the range of $\beta$ is the set of nonnegative integers.

Definition 18. $\mathcal{M}$ is a model of $d$-regular self-assembling agents if $\mathcal{M} = \langle \{T_i\}_i, R, \beta, \tau, \sigma \rangle$, where $\{T_i\}_i$ is a (finite) set of $d$-regular self-assembling agent types, $R$ is a set of binding rules for $\{T_i\}_i$, $\beta$ is an assignment of binding strengths for $R$, $\tau$ is the temperature of the system (the threshold binding strength for bonds to be stable), and $\sigma$ is an initial (finite) seed assembly.

The algorithm of each agent type may include a variable MY-ID, and we allow the possibility that each agent in the system does, in fact, have a unique identification number. This might be appropriate when modeling robotic self-assembly. In the case of molecular self-assembly, each agent is anonymous.

Assembly systems in both the aTAM and the stochastic model of Majumder et al.
can be defined in this formalism, by giving each agent type an algorithm that performs no instructions, and defining (respectively deterministic or probabilistic) binding relations in a natural way.

To conclude this section, we formalize what it means for a self-assembly model to allow multiple nucleation on a surface.

Definition 19. Let $\mathcal{M}$ be a model of $d$-regular self-assembling agents. We say $\mathcal{M}$ allows multiple nucleation if, in addition to the placement of the seed assembly at the initial stage of assembly, there is some probability $\pi_\nu$ such that (at the first stage of assembly only) an agent is placed on each location of the surface with probability $\pi_\nu$. Further, if an agent is placed at location $\vec{m}$ because of multiple nucleation, its agent type is chosen uniformly at random from the space of possible agent types.

We could allow multiple nucleation to occur at multiple stages during the assembly, not just the first. Again, because of the generality of Naor and Stockmeyer’s results, that would not affect our lower bound proof.

In a well known distributed computing paper, Naor and Stockmeyer investigated whether “locally checkable labeling” problems could be solved over a network of processors in an entirely local manner, where a local solution means a solution arrived at “within time (or distance) independent of the size of the network” [12]. One locally checkable labeling problem Naor and Stockmeyer considered was the weak $c$-coloring problem.

Definition 20 (Naor and Stockmeyer [12]). For $c \in \mathbb{N}$, a weak $c$-coloring of a graph is an assignment of numbers from $\{1, \ldots, c\}$ (the possible “colors”) to the vertices of the graph such that for every non-isolated vertex $v$ there is at least one neighbor $w$ such that $v$ and $w$ receive different colors. Given a graph $G$, the weak $c$-coloring problem for $G$ is to weak $c$-color the nodes of $G$.
In the context of tiling, to solve the weak $c$-coloring problem for an $n \times n$ surface means tiling the surface so each tile has at least one neighbor (to the north, south, east or west) of a different color. In the next section, we will present a simple solution to the weak $c$-coloring problem in the abstract Tile Assembly Model. By contrast, Naor and Stockmeyer showed that no local, constant-time algorithm can solve the weak $c$-coloring problem for grid graphs, nor for $k$-dimensional meshes, a generalization of grid graphs which we now define.

Definition 21. A $k$-dimensional mesh is a graph with vertex set $\{1, 2, \ldots, m\}^k$ for some $m$, such that two vertices are connected by an edge if the $L_1$-distance between them is 1.

Theorem 2 (Naor and Stockmeyer [12]). For any natural numbers $c$, $k$ and $t$, there is no local algorithm with time bound $t$ that solves the weak $c$-coloring problem for the class of $k$-dimensional meshes. (This remains true even if the processors have unique identifiers and can transmit them as part of the local algorithm.)

A second theorem from the same paper says that randomization does not help. The original result is stronger than the formulation here.

Theorem 3 (Naor and Stockmeyer [12]). Fix a class $\mathcal{G}$ of graphs closed under disjoint union. If there is a randomized local algorithm $P$ with time bound $t$ that solves the weak $c$-coloring problem for $\mathcal{G}$ with error probability $\epsilon$ for some $\epsilon < 1$, then there is a deterministic local algorithm $A$ with time bound $t$ that solves the weak $c$-coloring problem for $\mathcal{G}$.

In order to apply the theorems of Naor and Stockmeyer to the realm of self-assembly, we build a distributed network of processors that reduces a self-assembly problem to a distributed computing problem. The motivating intuition is that each processor simulates a location of the surface, and reports to its neighbors whether there is a (simulated) agent at that location. Formally, we prove the following theorem.

Theorem 4.
Let $\mathcal{M}$ be a model of $d$-regular self-assembling agents for any natural number $d > 0$, such that $\mathcal{M}$ self-assembles on a $k$-dimensional mesh of size $n^k$, and that $\mathcal{M}$ allows multiple nucleation. Then there is a model $\mathcal{N}$ of distributed computing that simulates $\mathcal{M}$ using $n^k$ processors with the network topology of a $k$-dimensional mesh, and constant-size message complexity.

Proof. Fix a model of $d$-regular self-assembling agents $\mathcal{M}$ as in the theorem statement. Let $\alpha$ be a configuration of agents on the mesh of size $n^k$. Let $T$ be the set of agent types of $\mathcal{M}$, $\Gamma$ the set of glue types of $\mathcal{M}$, and $M$ the set of electronic messages of $\mathcal{M}$. (Both $\Gamma$ and $M$ are finite sets.) The definition of the binding function $\beta$ induces a function

$$\hat{\beta} : (T \cup \{\emptyset\}) \times (\Gamma \cup \{\emptyset\})^d \times (M \cup \{\emptyset\})^d \times (T \cup \{\emptyset\}) \to [0, 1].$$

$\hat{\beta}$ takes as input the (possibly empty) agent type $\alpha(\vec{m})$ at some location $\vec{m}$ in configuration $\alpha$, and, based on the glue types and electronic messages received from the $d$ neighbors that could be incident to an agent at $\vec{m}$, returns, for each agent type, the probability that $\alpha(\vec{m})$ would contain that agent type, over the space of all legal $\mathcal{M}$-assembly sequences that start with configuration $\alpha$ and run for one time step. In particular, for fixed $t_1 \in T$, fixed $\gamma_1, \ldots, \gamma_d \in \Gamma \cup \{\emptyset\}$, and fixed $m_1, \ldots, m_d \in M \cup \{\emptyset\}$, it is true that

$$\sum_{t_2 \in T \cup \{\emptyset\}} \hat{\beta}(t_1, \gamma_1, \ldots, \gamma_d, m_1, \ldots, m_d, t_2) = 1.$$

We have not formally defined $\mathcal{M}$-assembly sequences, but they are a natural extension of the $\tau$-$T$-assembly sequences of tile self-assembly, where the $\beta$ and $\tau$ of $\mathcal{M}$ are used to determine whether agents bind stably to one another.
Also, if an agent type lies on the edge of the $n^k$-size surface, so that it does not have a neighbor in a particular direction, we define $\hat{\beta}$ so that the empty set is the glue and electronic message “transmitted” from the “neighbor” in that direction.

We simulate assembly sequences of $\mathcal{M}$ on a k-dimensional mesh where each of the dimensions has length n by a network of processors $\mathcal{N}$ whose network graph is also a k-dimensional mesh of total size $n^k$. Each processor will simulate the presence or absence of an agent in the same location on the assembly surface. We interpret bonds between two agents as messages. On top of those messages, we add an additional set of electronic messages that agents can send to their neighbors, and encode the combination as an ordered pair: glue type and electronic message. The function $\hat{\beta}$ will be the probabilistic transition function for processors in this system.

Processors of $\mathcal{N}$ are of the following form.

Processor $p_i$

  d-many input message buffers: $\mathit{inbuf}_{i,1}, \ldots, \mathit{inbuf}_{i,d}$.

  d-many output message buffers: $\mathit{outbuf}_{i,1}, \ldots, \mathit{outbuf}_{i,d}$.

  A color variable: $\mathrm{COLOR}_i$, a variable that can take a value from $\{1, \ldots, c\}$, where c is a global constant.

  A local state: Each processor is in one of $|T| + 1$ different local states $q$ during a given execution stage $s$. There is one state $q_k$ to simulate each agent type $t_k \in T$, and an additional state EMPTY, to simulate the absence of an agent from the surface location that $p_i$ is simulating.

  A state transition function: This function takes the current processor state and the messages received in the current round, and probabilistically directs what state the processor will adopt in the next round.

The messages processors send on the network are of the form ⟨glue type, electronic message⟩. The input message buffers of processor $p_i$ simulate the glue types of the edges the agent at $p_i$'s location is adjacent to, and the electronic messages (if any) received from an agent's neighbors.
The output message buffers of $p_i$ simulate the glues on the edges of the agent that $p_i$ is simulating, and the electronic messages the agent transmits to its neighbors. The purpose of $\mathrm{COLOR}_i$ is to simulate the color of the agent placed at the location simulated by $p_i$.

All processors in $\mathcal{N}$ are hardcoded with the same probabilistic state transition function, which is determined from the definition of $\hat{\beta}$ (which we induced above from the properties of $\mathcal{M}$), in the natural way: if, in round r of the algorithm execution, $p_i$ is in state $q_k$, a simulation of $t_k \in T$, and hears messages that simulate glue types $g_1, \ldots, g_d$ and electronic messages $m_1, \ldots, m_d$, then at the end of round r, it will transition to state $q_j$ with probability $\pi_j$, where $\hat{\beta}(t_k, g_1, \ldots, g_d, m_1, \ldots, m_d, t_j) = \pi_j$ and each $t_j$ is a distinct element of $T \cup \{\emptyset\}$. As explained above, we denote the state that simulates the “presence of the empty set” (i.e., the absence of any agent from the location simulated by $p_i$) as EMPTY.

To simulate the process of self-assembly, we run the following distributed algorithm on $\mathcal{N}$.

Algorithm execution proceeds in synchronized rounds. Before execution begins, all processors start in state EMPTY. In round r = 0 (through the intervention of an omniscient operator), each processor in the locations corresponding to the seed assembly enters the state that simulates the agent type at that location in the seed assembly.

Also in round r = 0, each processor not simulating part of the seed assembly “wakes up” (enters a state other than EMPTY) with probability $\pi_\nu$, the multiple nucleation probability of $\mathcal{M}$.
If a processor wakes up, it enters a state q ≠ EMPTY, chosen uniformly at random from the set of non-EMPTY states.

For any round r > 0, each processor runs either Algorithm 1 or Algorithm 2, depending on whether it is in state EMPTY.

Algorithm 1 For $p_i$ in state EMPTY at round r
  if r = 0 then
    Wake up with probability $\pi_\nu$, and cease execution for this round.
  end if
  if r > 0 then
    Read the d-many input buffers.
    if no messages were received then
      Cease execution for the round.
    else
      Let q be the new state drawn according to the probabilities that $\hat{\beta}$ assigns to the space $T \cup \{\emptyset\}$, for a location that has adjacent glue types and electronic messages that are simulated by the messages received this round.
      Send the messages indicated by state q and the behavior of A.
      Set the value of $\mathrm{COLOR}_i$ according to q.
      Enter state q and cease execution for this round.
    end if
  end if

The interaction between agents in $\mathcal{M}$ is completely defined by the glues and electronic messages of an agent's immediate neighbors, as specified in the function $\hat{\beta}$ and the algorithm of each agent type. The processors of $\mathcal{N}$ simulate that behavior with Algorithm 2. Since the processors of $\mathcal{N}$ simulate empty locations with Algorithm 1, by a straightforward induction argument, $\mathcal{N}$ can simulate all possible $\mathcal{M}$-assembly sequences, and the theorem is proved.

We obtain our time lower bound results as corollaries of Theorem 4.

Corollary 1. If the (deterministic or probabilistic) binding rules of a multiply nucleating tile assembly system $\mathcal{T}$ are entirely local, then $\mathcal{T}$ is unable to solve the weak c-coloring problem in constant time.

Proof. Suppose $\mathcal{T}$ is an irreversible tiling model. If $\mathcal{T}$ can weak c-color surfaces in constant time, then there is a deterministic algorithm for the distributed

Algorithm 2 For $p_i$ in state q ≠ EMPTY (at any round)
  Read the d-many input buffers.
  if no messages were received then
    Send the messages indicated by state q and the behavior of A, and cease execution for this round.
  else
    Let $q'$ be the new state obtained probabilistically, based on the probabilities that the function $\hat{\beta}$ assigns to the space $T \cup \{\emptyset\}$, given input from the glue types and electronic messages simulated by the messages received this round.
    Send the messages indicated by state $q'$.
    Set the value of $\mathrm{COLOR}_i$ according to $q'$.
    Enter state $q'$ and cease execution for this round.
  end if

Figure 2: The tileset $T^*$ used in the proof of Proposition 1.

network $\mathcal{N}$ that weak c-colors $\mathcal{N}$ locally, and in constant time. By Theorem 2 that is impossible.

So assume $\mathcal{T}$ is a reversible tiling model, and when $\mathcal{T}$ assembles, it weak c-colors the tiling surface, and achieves bond pair equilibrium in constant time. Then there is a local probabilistic algorithm for $\mathcal{N}$ that weak c-colors $\mathcal{N}$ in constant time, with positive probability of success. By Theorem 3, such an algorithm would yield a deterministic local algorithm with the same time bound, which Theorem 2 rules out, so that is impossible as well. Therefore, no $\mathcal{T}$ exists that weak c-colors surfaces in constant time.

By a similar argument, we obtain a lower bound for active self-assembling agents on a three-dimensional cubic grid.

Corollary 2. If a model of 6-regular self-assembling agents has only local binding rules, then it cannot solve the weak c-coloring problem in constant time on a 3-dimensional mesh, for any value of c.

A physical interpretation of Corollary 2 would be that robots self-assembling in three-space (i.e., k = 3 and, in this example, d = 6, so there are six arms coming off each robot, orthogonally to one another) cannot achieve speedup to constant time by self-assembling in separate groups and then joining the groups together. This lower bound remains in effect even if the robots are designed by a method that assigns each robot a unique identifier.

We conclude this section by noting that the weak c-coloring problem has low tile complexity (that is, it can be defined using only a few local rules) in the aTAM.

Proposition 1.
There is a tile assembly system in the abstract Tile Assembly Model that weak c-colors the first quadrant, using only seven distinct tile types.

Proof. Figure 2 exhibits a tileset $T^*$ of seven tile types that assembles into a weak c-coloring of the first quadrant, starting from an individual seed tile placed at the origin. One can verify by inspection that $T^*$ is locally deterministic, so it will always produce the same terminal assembly. All assembly sequences generated by $T^*$ produce a checkerboard pattern in which a monochromatic “+” configuration never appears. Hence, it solves the weak c-coloring problem for the entire first quadrant, and also for all n × n squares, for any n.

One can define a three-dimensional version of the tileset $T^*$ (shown in Figure 2) in the natural way, using for example the 3D tile assembly model in [5]. Such a three-dimensional tileset will weakly c-color the three-dimensional mesh where d = 6, with low tile complexity.

In this paper, we showed that if a tile assembly model has only local binding rules, then it cannot use multiple nucleation on a surface to solve locally checkable labeling problems in constant time, even though the abstract Tile Assembly Model can solve a locally checkable labeling problem using just seven tile types. In fact, we proved a more general impossibility result, which showed the same lower bound applies to self-assembling agents in a three-dimensional grid that are capable of binding and subsequently sending messages to their neighbors. To the best of our knowledge, this was the first application of a distributed computing impossibility result to the field of self-assembly.

There are still many open questions regarding multiple nucleation. Aggarwal et al. asked in [3] whether multiple nucleation might reduce the tile complexity of finite shapes. The answer is not known.
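Both the result recalled above and the open questions here turn on what it means for a labeling to be locally checkable. As a small illustration for the weak c-coloring problem, the following Python sketch tests the defining condition cell by cell, looking only at the four orthogonal neighbors. The function names and the 0/1 color encoding are assumptions made for this example; the checkerboard generator only imitates the pattern described in the proof of Proposition 1 and is not the tileset $T^*$ itself.

```python
def is_weak_coloring(grid):
    """Return True if every cell has at least one orthogonal neighbor
    (north, south, east or west) whose color differs from its own."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            # Collect the colors of the neighbors that lie on the surface.
            neighbors = [
                grid[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            # The local condition fails if no neighbor has another color.
            if all(color == grid[r][c] for color in neighbors):
                return False
    return True


def checkerboard(n):
    """Build an n x n checkerboard using the two colors 0 and 1."""
    return [[(r + c) % 2 for c in range(n)] for r in range(n)]
```

On a checkerboard of side at least 2, every neighbor of a cell has the other color, so the check succeeds; a grid containing a monochromatic “+” configuration fails it at the center of the “+”.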
Furthermore, we can ask for what class of computational problems there exists some function f such that we could tile an n × n square in time O(1) < O(f) < O(n), and “solve” the problem with “acceptable” probability of error, in a tile assembly model that permits multiple nucleation. It would also be interesting to explore the possibility of modeling multiple nucleation of molecules floating in solution (instead of adhering to a surface), perhaps by using techniques from the field of ad hoc wireless networks.

We hope that this is just the start of a conversation between researchers in the fields of distributed computing and biomolecular computation.

Acknowledgements

I am grateful to Soma Chaudhuri, Dave Doty, Jim Lathrop and Jack Lutz for helpful discussions on earlier versions of this paper. I am also grateful to two anonymous referees, who suggested significant conceptual and technical improvements to my original journal submission.

References