An Improved Algorithm for Coarse-Graining Cellular Automata
Yerim Song
Canyon Crest Academy, San Diego, CA, USA∗

Joshua A. Grochow
Department of Computer Science, University of Colorado Boulder, Boulder, CO, USA and
Department of Mathematics, University of Colorado Boulder, Boulder, CO, USA†

[email protected]
[email protected]

In studying the predictability of emergent phenomena in complex systems, Israeli & Goldenfeld (Phys. Rev. Lett., 2004; Phys. Rev. E, 2006) showed how to coarse-grain (elementary) cellular automata (CA). Their algorithm for finding coarse-grainings of supercell size N took doubly-exponential 2^{2^N} time, and thus only allowed them to explore supercell sizes N ≤ 4. Here we introduce a new, more efficient algorithm for finding coarse-grainings between any two given CA that allows us to systematically explore all elementary CA with supercell sizes up to N = 7, and to explore individual examples of even larger supercell size. Our algorithm is based on a backtracking search, similar to the DPLL algorithm with unit propagation for the NP-complete problem of Boolean Satisfiability.

I. INTRODUCTION
Cellular automata (CA) are a model of dynamical systems that are discrete in both space and time. Since their development by Ulam and von Neumann [1] in the 1940s, CA have been applied to many subjects including biology [2], physics [3], and computer science [4, 5].

Coarse-graining is one method of reducing the complexity of a system, while hopefully retaining enough structure to reveal insights about it. Rather than focusing on small, specific details, we "zoom out" and look at a larger picture. Israeli & Goldenfeld [6, 7] introduced a method of coarse-graining cellular automata, and applied it systematically to the elementary cellular automata.

Elementary cellular automata were introduced and studied extensively by Wolfram [6]: they are automata with 2 states, on a 1-dimensional grid of cells, with only nearest-neighbor interactions. There are exactly 256 distinct elementary CAs, and Wolfram proposed dividing them into four classes based on their long-term behavior: stabilizing to a homogeneous state, ultimately becoming periodic, or maintaining random or complex-looking behavior indefinitely. The latter classes are hypothesized to be "computationally irreducible" and difficult or impossible to predict [8–10]. However, through the process of coarse-graining, even CA that are originally in these complex classes can coarse-grain to CA in the simpler classes [6, 7].

To calculate these coarse-grainings, Israeli and Goldenfeld used a brute-force algorithm, which required 2^{2^N} steps to search for coarse-grainings of supercell size N. The lengthy time required for the computations limited their study to coarse-grainings of supercell size only 4. They reported that there were 16 elementary CA that they could not coarse-grain at all, and for the other CA it was unknown whether they had found all the available coarse-grainings.

In this paper we develop a new algorithm, using a backtracking search similar to the DPLL algorithm for Boolean Satisfiability [11, 12]. This backtracking search allows us to prune branches from the search tree early, based on coarse-graining constraints. Furthermore, dynamically ordering the variables allows us to speed up the process considerably in practice.

Using our new algorithm, we were able to systematically find all coarse-grainings between elementary CA up to supercell size 7 on a commodity laptop. Note that 2^{2^7} = 2^{128} ≈ 3.4 × 10^{38}, so even if each step could be done in 1 ms, the brute-force algorithm of [6, 7] would take ≈ 10^{28} years to handle supercell size 7. We were also able to search for specific coarse-grainings of larger supercell size. We find 56 new coarse-grainings at these larger supercell sizes (26 up to symmetry), though we leave open the mathematical question of how to prove when all nontrivial coarse-grainings have been found. Interestingly, all of these new coarse-grainings were found at N = 5 or 6; there were no new pairs of CA A, B such that A coarse-grained to B at supercell size 7.

A. Related Work
Magiera & Dzwinel [13, 14] also present an improved algorithm for coarse-graining CA, and get up to supercell size 7. However, despite the similar name, they are solving a related but different problem than the one we solve. Namely, they solve the problem:
IsCoarseGrainable
Input: A CA A and a supercell size N.
Output: A nontrivial coarse-graining of A with supercell size N, or ⊥ if none exists.

In contrast, we solve the following problem, which enables us to fill out Fig. 5 (a CA analogue of a renormalization-group flow diagram):

CoarseGraining
Input: Two CAs A, B and a supercell size N.
Output: A nontrivial coarse-graining from A to B with supercell size N, or ⊥ if none exists.

Their algorithm begins by constructing A^N, which has k^N states per cell if A has k states per cell, and then tries to collapse those new states together as much as possible, resulting in another CA (but one which is not specified ahead of time in the input). In contrast, our algorithm tries to find the coarse-graining map from A^N to the given CA B. The fact that B is given in the input allows us to use other algorithmic tactics that are less easy to take advantage of in their setting. In contrast, their algorithm is aimed at deciding which CA can be coarse-grained at all.

We note that one could directly reduce to a Boolean Satisfiability problem and use an off-the-shelf SAT solver. This did not seem very promising to us, as supercell size N would result in a SAT instance with 2^N variables and 2^{3N} constraints, where each constraint would consist of several CNF clauses. See Remark 1 below for more details.

However, given that our approach is similar to approaches to the NP-complete problem of Boolean Satisfiability, it is natural to wonder about comparing the two. Because the coarse-graining problem has exponentially many variables—one variable for the projection of each possible N-tuple of states—it is easy to see that CoarseGraining is in the complexity class
NEXP. It would be interesting to know whether it is NEXP-complete, as that would suggest that some amount of brute-force search is inevitable, assuming the widely believed complexity conjecture that EXP ≠ NEXP.

We note that while we present our algorithm and results in the context of elementary CA (2-state, 1D, nearest-neighbor), it is trivial to adapt it to arbitrary CA.
II. BACKGROUND ON CELLULAR AUTOMATA AND COARSE-GRAINING
A cellular automaton (CA) A has cells at the nodes of some (usually infinite, regular) graph; each cell has a state from a finite set S_A, and an update rule f_A(a; a_1, ..., a_k), which takes in the state a of a cell and the states a_1, ..., a_k of its neighbors, and outputs the state of the cell at the next time step. In this paper we consider only elementary CA, which are CA on a 1D lattice, with nearest-neighbor interactions, and 2 states, S_A = {0, 1}. Thus each rule f_A takes in only three states: that of a cell, its left neighbor, and its right neighbor. For geometric convenience, rather than putting the cell's state first, we put them in geometric order, so that, e.g., f_A(a_1, a_2, a_3) is the next state of the cell in position 2.

As introduced by Israeli & Goldenfeld [6, 7], a coarse-graining from one CA A to another CA B with supercell size N is a map P : S_A^N → S_B such that, given any string of states, if we first run A for N steps and then apply P, the result is the same as first applying P and then running B for a single step. That is, the coarse-graining scales both the cell size and the speed of the CA by N.

For 1D CA with nearest-neighbor interactions, this is captured precisely by the fundamental coarse-graining equation

    P(f_A^N(x_1, x_2, x_3)) = f_B(P(x_1), P(x_2), P(x_3))   for all x_i ∈ S_A^N    (1)

Given an input (x_1, x_2, x_3) ∈ (S_A^N)^3, we will refer to the result on the left-hand side here throughout the paper as res_1 = res_1(x_1, x_2, x_3), and similarly the result on the right-hand side as res_2. We will sometimes abbreviate Eq. (1) to: P A^N = B P.
Given CA A, B and supercell size N, our goal is thus to find a projection P : S_A^N → S_B such that

    res_1 = res_2   for all (x_1, x_2, x_3) ∈ (S_A^N)^3.

These will be the fundamental constraints that our backtracking algorithm explores and exploits. Following [6, 7], we consider a coarse-graining "trivial" if P is a constant function, that is, if it maps all inputs to 0 (resp., all inputs to 1). We ignore these trivial coarse-grainings by fiat.

It is difficult to discover which projection functions P are valid coarse-grainings at window size N, because there are 2^{2^N} − 2 possible projections P (excluding the two trivial possibilities), and for each such P there are 2^{3N} constraints to check. There seems to be no apparent pattern as to which projections will satisfy all of the constraints, and thus be a coarse-graining. (The lack of such patterns could be attested to theoretically by answering affirmatively our question above about NEXP-completeness.) In the absence of such patterns, one might be resigned to trying every possible projection, and indeed this is essentially the approach taken in [6, 7]. In this paper we show that a backtracking search with a few easy-to-implement heuristics can do significantly better in practice, while still ensuring our algorithm is complete in the sense that every possible coarse-graining will be found.
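To make the constraints concrete, the following minimal sketch checks Eq. (1) for a single, fully specified candidate P, represented as a dictionary from N-tuples over {0, 1} to {0, 1}. This is our own illustrative Python, not the authors' code, and the names step, evolve_middle, and satisfies_eq1 are ours; the brute-force approach of [6, 7] amounts, essentially, to running such a check on every one of the 2^{2^N} − 2 nontrivial candidates.

    from itertools import product

    def step(cells, rule):
        # One synchronous update of an elementary CA under the Wolfram rule
        # number convention.  The two boundary cells lack a full neighborhood,
        # so the output is two cells shorter than the input.
        return [(rule >> (4 * cells[i - 1] + 2 * cells[i] + cells[i + 1])) & 1
                for i in range(1, len(cells) - 1)]

    def evolve_middle(x1, x2, x3, rule, N):
        # f_A^N(x1, x2, x3): run N steps on the 3N-cell string x1 x2 x3 and
        # return the middle supercell, the only N cells fully determined.
        cells = list(x1 + x2 + x3)
        for _ in range(N):
            cells = step(cells, rule)
        return tuple(cells)

    def satisfies_eq1(P, rule_A, rule_B, N):
        # Check P(f_A^N(x1,x2,x3)) == f_B(P(x1),P(x2),P(x3)) on all inputs.
        supercells = list(product((0, 1), repeat=N))
        for x1, x2, x3 in product(supercells, repeat=3):
            res1 = P[evolve_middle(x1, x2, x3, rule_A, N)]
            res2 = (rule_B >> (4 * P[x1] + 2 * P[x2] + P[x3])) & 1
            if res1 != res2:
                return False
        return True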
A. Symmetries
In our algorithm, we will determine whether there are coarse-grainings from A to B for all pairs of elementary CA A, B, for supercell sizes up to N = 7. Once we have found a coarse-graining from A to B, we do not consider it further in our experiments; e.g., if we find a coarse-graining from A to B at supercell size 3, we do not search for coarse-grainings from A to B with larger supercell sizes, though our algorithm could be used to do so for further exploratory purposes.

Additionally, as noted in [6, 7], we may eliminate CA that are "isomorphic" to one another under the symmetries that swap left and right, that swap 0 and 1, or that swap both left-right and 0-1. If we use σ_LR to denote the left-right swap, σ_01 to denote the 0-1 swap, and A → B to denote a coarse-graining, then we have:

    A → B ⇐⇒ σ(A) → σ(B)   for all σ ∈ {σ_LR, σ_01, σ_LR ∘ σ_01 = σ_01 ∘ σ_LR}.

For example, suppose A is rule 24 and B is rule 240. We have σ_01(24) = 231, σ_LR(24) = 66, σ_{01,LR}(24) = 189, and σ_01(240) = 240, σ_LR(240) = σ_{LR,01}(240) = 170. So among the possibilities

    24 → 240,   231 → 240,   66 → 170,   189 → 170,

we need only check one, rather than all four. By eliminating these symmetric cases, the number of comparisons we must make between pairs of elementary CA is cut down to about 1/3 of the original total of 256^2; it is not 1/4 because of the presence of some rules that are mapped to themselves by some of the symmetries (such as σ_01(240) = 240 in the preceding example).
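For concreteness, the rule-number symmetries can be computed as below; this is our own sketch (the function names are ours, not from the paper), and the assertions reproduce the rule 24 / rule 240 example above.

    def mirror(rule):
        # sigma_LR: the rule with left and right neighbors interchanged.
        out = 0
        for n in range(8):
            l, c, r = (n >> 2) & 1, (n >> 1) & 1, n & 1
            out |= ((rule >> (4 * r + 2 * c + l)) & 1) << n
        return out

    def conjugate(rule):
        # sigma_01: relabel states 0 <-> 1 (complement inputs and output).
        out = 0
        for n in range(8):
            out |= (1 - ((rule >> (7 - n)) & 1)) << n
        return out

    assert mirror(24) == 66 and conjugate(24) == 231
    assert conjugate(mirror(24)) == 189
    assert conjugate(240) == 240 and mirror(240) == 170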
III. BACKTRACKING ALGORITHM

The new approach we developed to more efficiently compute coarse-grainings involves a backtracking tree search, allowing us to selectively cut tree branches earlier in the computation without losing information. The basic idea is that, given cellular automata
A, B and supercell size N, we seek to find values of P(x) for all x ∈ S_A^N that satisfy (1). For each such x, we branch on trying either P(x) = 0 or P(x) = 1 (when working with elementary CA). As we assign values to P(x) for some of the x's, we can begin to see whether (1) is satisfied or violated for different inputs (x_1, x_2, x_3) ∈ (S_A^N)^3. In some cases, this lets us undo an assignment for P(x), backtrack, and try the other value.

Figure 1 shows an example of this search tree for supercell size N = 2. Generally, for supercell size N, the tree will have 2^N levels and 2^{2^N} leaves, though many of these will be pruned before they are reached. Note that although they are included in the tree diagram, we exclude the all-0s and all-1s projections in our calculations, as these are considered "trivial" coarse-grainings, as in Israeli & Goldenfeld. These projections are represented by the first and last leaves in the diagram. If we did not exclude them, then almost all rules could be coarse-grained by these trivial projections.
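At a high level, the search can be sketched as follows; this is our own pseudocode-style Python, not the authors' implementation. Here violates stands for the pruning test described in Sec. III A below; forced-value propagation (Sec. III B) and dynamic ordering (Sec. III C) are omitted for brevity.

    def search(P, unassigned, violates):
        # Extend the partial projection P (a dict from N-tuples over {0,1}
        # to 0 or 1); yield every completed projection that passes all checks.
        if violates(P):
            return                        # prune: no extension can satisfy Eq. (1)
        if not unassigned:
            if len(set(P.values())) > 1:  # skip the trivial all-0s/all-1s maps
                yield dict(P)
            return
        x = unassigned[0]                 # static order; Sec. III C makes this dynamic
        for value in (0, 1):              # branch on P(x) = 0, then on P(x) = 1
            P[x] = value
            yield from search(P, unassigned[1:], violates)
        del P[x]                          # backtrack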
A. Tree pruning

The first way we reduce the number of computations is to prune branches at earlier levels of the tree, so that computations in the remaining levels can be skipped.
FIG. 1. The search tree above displays all 16 possible projections with supercell size N = 2. Based on the projection input on the left, the output is displayed on the tree. In this diagram, level 1 represents P(00), level 2 represents P(01), and so on. The boxed sequence demonstrates that the fourth leaf corresponds to the projection 0011, that is, P(00) = 0, P(01) = 0, P(10) = 1, P(11) = 1.

By allowing several projections to be skipped early on, we manage to significantly reduce the computation time to find or rule out a coarse-graining. To determine whether a branch can be pruned, the algorithm checks (1) on inputs (x_1, x_2, x_3) ∈ (S_A^N)^3 that have been determined by the choices made so far. Given a partial projection P, defined only on a subset of S_A^N, we say that an input (x_1, x_2, x_3) has a decided projection if P(x_1), P(x_2), P(x_3), and P(f_A^N(x_1, x_2, x_3)) have all been assigned already. We begin at level 1, where the first projection will be set to either 0 or 1. Meanwhile, the rest of the projections will be temporarily undecided. Using all possible inputs with decided projections, we check whether P A^N = B P is satisfied before proceeding to the next level. Recall that we use res_1 to denote the output of P A^N (when defined) and res_2 the output of B P. If we discover a decided input for which res_1 ≠ res_2, then the most recent branch leading to the current projection can be safely cut off, as there are no valid projections that extend it (i.e., in its subtree).
Example 1. To better illustrate the method, we walk through an example of finding a coarse-graining from rule 196 to rule 192 with supercell size N = 2; this example is illustrated in Figure 2. To begin, the tree diagram will start at level 1 and branch on P(00) = 0. We only check inputs whose projections are decided, so the only available input to test at this point is x = 000000. When rule 196 is applied to this pattern N = 2 times, the result is 00, so we may indeed check the result at x. P(00) = 0, so res_1 = 0. To calculate res_2, we apply P to each supercell of the input 000000, resulting in P(00)P(00)P(00) = 000, and then use rule 192 on 000 to get res_2 = 0. Because res_1 = res_2 at this point, we cannot cut this branch, and continue to the next level in the tree.

Suppose that in the next level we branch on P(01) = 1 (and we still have assigned P(00) = 0). As x = 000100 has its projection decided, we may attempt to validate our choices on that input. To get res_1 we first apply rule 196 twice to get an outcome of 01, and then apply P to the result, getting P(f_A^N(000100)) = P(01) = 1. To compute res_2 we apply rule 192 to P(00)P(01)P(00) = 010, resulting in res_2 = 0. Since res_1 ≠ res_2, we can cut off the whole branch with P(00) = 0, P(01) = 1, avoiding further computations.
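As a sanity check, the following few lines reproduce the arithmetic of Example 1 under the usual Wolfram rule-number convention (our own snippet, not the authors' code):

    def step(cells, rule):
        return [(rule >> (4 * cells[i - 1] + 2 * cells[i] + cells[i + 1])) & 1
                for i in range(1, len(cells) - 1)]

    cells = [0, 0, 0, 1, 0, 0]               # the input x1 x2 x3 = 00 01 00
    for _ in range(2):                       # apply rule 196 N = 2 times
        cells = step(cells, 196)
    assert cells == [0, 1]                   # f_A^N(x) = 01, so res_1 = P(01) = 1
    res2 = (192 >> (4 * 0 + 2 * 1 + 0)) & 1  # rule 192 on P(00)P(01)P(00) = 010
    assert res2 == 0                         # res_1 != res_2: prune this branch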
Remark 1. This is one of the places our approach differs from simply reducing to a Boolean Satisfiability (SAT) problem and using an off-the-shelf SAT solver. In our approach, we get to choose "on the fly" which inputs (x_1, x_2, x_3) to test for res_1 = res_2. In contrast, a reduction to a SAT solver would require writing down all the constraints for every possible input right from the beginning (of which there are 8^N), before any branches have been made.

FIG. 2. Tree pruning. The search tree for attempting to coarse-grain from rule 196 to rule 192 at supercell size N = 2 (Example 1). After branching on P(00) = 0 and P(01) = 1, one of the equations (1) is violated, and the search backs up to try the branch P(01) = 0 instead. The 'X' represents the tree being pruned at level 2; the box represents all the computations that were skipped due to this cut.

B. Forced values
Our second method to reduce computation time is to leverage forced values of the projection. Namely, once we have branched on assigning P(x) for some values of x ∈ S_A^N, we may be forced to assign P(x′) for some new, previously unassigned x′ in order to satisfy (1). (This is similar to unit propagation in Boolean Satisfiability solvers.) When this occurs later in the tree, it lets us eliminate duplicated computations; when it occurs earlier in the tree, it lets us avoid large swaths of the tree.

In this method, instead of simply checking whether res_1 = res_2 on inputs (x_1, x_2, x_3) for which P(x_1), P(x_2), P(x_3), and P(f_A^N(x_1, x_2, x_3)) have all been assigned, we make an effort to match res_1 and res_2 even when these have not yet all been assigned. As in the method and example above, we begin with an input (x_1, x_2, x_3) such that P(x_1), P(x_2), and P(x_3) are all defined. We can then calculate res_2 = f_B(P(x_1), P(x_2), P(x_3)). Let x_0 = f_A^N(x_1, x_2, x_3); note that res_1 = P(x_0). If P(x_0) is not yet assigned, then the value of P(x_0) is forced by our earlier choices to be equal to res_2.
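A sketch of this propagation step might look as follows; the names are again our own, and step/middle re-implement f_A^N as in the sketch of Sec. II.

    def step(cells, rule):
        return [(rule >> (4 * cells[i - 1] + 2 * cells[i] + cells[i + 1])) & 1
                for i in range(1, len(cells) - 1)]

    def middle(x1, x2, x3, rule, N):
        cells = list(x1 + x2 + x3)
        for _ in range(N):
            cells = step(cells, rule)
        return tuple(cells)

    def propagate(P, x1, x2, x3, rule_A, rule_B, N):
        # Assumes P(x1), P(x2), P(x3) are already assigned.  Either confirm
        # consistency, report a conflict, or force the value of P(x0).
        res2 = (rule_B >> (4 * P[x1] + 2 * P[x2] + P[x3])) & 1
        x0 = middle(x1, x2, x3, rule_A, N)
        if x0 in P:
            return "ok" if P[x0] == res2 else "conflict"
        P[x0] = res2   # forced: any other value would violate Eq. (1)
        return "forced"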
Example 2. We illustrate the process of forcing on a coarse-graining from rule 2 to rule 4 with N = 2; see Figure 3. In this example, we will assign variables in "backwards" order, beginning with P(11) and ending with P(00). We start by branching on P(11) = 0. The only input for which P(x_1), P(x_2), P(x_3) are all determined is x = 111111. To find res_1 we apply rule 2 N times and get the result f_A^N(x) = 00. However, P(00) won't be determined until the 4th level of the tree, and we are only at level 1, so P(00) is currently unassigned. We compute res_2 and get a result of 0. To ensure res_1 = res_2, res_1 must also be 0, so we record P(00) = 0. Then the algorithm continues by branching on the value of P(10), then P(01), and when it gets to the last level, it remembers that the value of P(00) was already forced to 0 and does not branch further; at that point it just checks whether the choices it has made satisfy (1) for all inputs.

FIG. 3. Forced values. This search tree shows the coarse-graining from rule 2 to rule 4 at supercell size N = 2. While calculating whether we can continue after branching on P(11) = 0, we find that the projection at level 4, P(00), is forced to be 0. The algorithm thus records that P(00) = 0 at this point, but then continues down the search tree as previously.

C. Dynamic variable ordering
Our third method to reduce computation time is to take further advantage of the forced values. Namely, when a value is forced, we can be more efficient by reordering the projection inputs dynamically (i.e., on the fly). That is, instead of the levels of the tree being labeled in a static order such as 00, 01, 10, 11 ("forwards") or 11, 10, 01, 00 ("backwards"), we can instead make the ordering dynamic as the algorithm proceeds. By considering a forced variable as soon as it is forced, the algorithm is able to make further inferences at that time, performing even more pruning and perhaps finding even more forced variables. This is analogous to unit propagation in SAT solvers.
Example 3 (Continuation of Example 2). See Figure 4. As in Example 2, suppose we branch on P(11) = 0 and discover that P(00) = 0 is forced. Rather than waiting to see P(00) at level 4 as "originally planned" (in the static, backwards order), the algorithm decides to make P(00) the immediate next (second) level of the tree. The algorithm now continues as before. But while doing this, we see that P(10) is also forced, so level 3 gets dynamically set to P(10). Dynamic ordering allows cuts to occur earlier in the search tree, significantly reducing the computation time.

FIG. 4. The dynamic ordering process. See Example 3. (Above) In this example, the backwards algorithm is used, so it begins by branching on P(11), while the ordering of the remaining variables is not yet decided (indicated by the "?"s). (Below) After branching on P(11), the value of P(00) is forced, so level 2 is assigned to 00. The process continues in this way until all the variables are assigned, either by branching or by being forced.

D. Implementation variants
Our dynamic ordering selects the next projection as one that is forced, if such a projection exists. We still have to explain which inputs x to test res_1 = res_2 on (when there is a choice, which occurs frequently further along in the tree), and which projection to choose next when none are forced, including the first projection to branch on. We implemented four specific strategies for these choices, which we test experimentally:

F1: In this strategy, the default ordering is "forward," moving from P(00···0) to P(11···1).

F2: The default ordering is still "forward," but now after a forced value is assigned, the next variable to branch on is the next one after the forced variable in the forward ordering.

B1: Similar to strategy F1, except the default ordering is "backwards," starting with P(11···1) and ending with P(00···0).

B2: Similar to strategy F2, but with the default ordering being backwards.

For example, in strategy F1, the algorithm begins by branching on P(000). If P(100) were then forced (and there were no variables forced immediately after), then the next projection to branch on would be P(001), since 001 is the first unassigned input in the forward ordering. In contrast, in the same scenario in F2, the algorithm would branch next on P(101), since 101 is the next input after the forced input 100.
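One plausible way (ours, not taken from the paper) to express the four strategies as a single next-variable rule is sketched below; in particular, the wrap-around scan used by F2/B2 when the forced variable sits near the end of the order is our assumption, not something specified above.

    def next_variable(strategy, order, assigned, last_forced=None):
        # `order`: all 2^N supercells listed in forward order; B1/B2 reverse it.
        # F2/B2 resume scanning just after the most recently forced variable.
        if strategy in ("B1", "B2"):
            order = list(reversed(order))
        start = 0
        if strategy in ("F2", "B2") and last_forced in order:
            start = order.index(last_forced) + 1
        for x in order[start:] + order[:start]:   # wrap around (our assumption)
            if x not in assigned:
                return x
        return None   # all variables assigned

With the supercells in forward order, the variables 000 and 100 assigned, and last_forced = 100, this returns 001 under F1 and 101 under F2, matching the example above.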
E. Rules 0 and 255

Experimentally, we noticed that coarse-graining to rules 0 or 255 took a very long time, because pruning could not occur until very late in the search tree. Note that a nontrivial coarse-graining to (say) rule 0 is the same as a projection P that is 0 everywhere except possibly on some input x that never occurs in the run of the CA A, that is, an input x ∈ S_A^N that is not in the image of f_A^N. So for coarse-grainings to rules 0 or 255, we instead search directly for such non-occurring inputs x ∈ S_A^N. When such an input is found, if coarse-graining to rule 0 we simply set P(x) = 1, to ensure our projection is not the trivial all-0s projection, which recall we are excluding by fiat (for rule 255 we set P(x) = 0 to avoid the all-1s projection).

If every possible projection input occurs at least once in the image of f_A^N, then we immediately conclude that A cannot be non-trivially coarse-grained to either rule 0 or 255. Also, note that by this characterization, a rule A can be non-trivially coarse-grained to rule 0 if and only if it can be coarse-grained to rule 255. (Note that for most A, this does not follow from the fact that σ_01(0) = 255, since that would only let us conclude that A → 0 ⇔ σ_01(A) → 255.)
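This direct test is simple to implement by enumerating all 2^{3N} input strings and collecting the middle supercells; a sketch (with our own function names) follows.

    from itertools import product

    def step(cells, rule):
        return [(rule >> (4 * cells[i - 1] + 2 * cells[i] + cells[i + 1])) & 1
                for i in range(1, len(cells) - 1)]

    def missing_patterns(rule_A, N):
        # N-cell patterns never produced as the middle supercell of f_A^N;
        # nonempty iff rule_A coarse-grains nontrivially to rules 0 and 255.
        image = set()
        for cells in product((0, 1), repeat=3 * N):
            cells = list(cells)
            for _ in range(N):
                cells = step(cells, rule_A)
            image.add(tuple(cells))
        return set(product((0, 1), repeat=N)) - image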
IV. RESULTS

A. Coarse grainings

By implementing our improved method, we significantly sped up the search for coarse-grainings. We used our new method(s) to exhaustively find all coarse-grainings between elementary CA with supercell size N up to 7, on a commodity laptop. Israeli & Goldenfeld reported results up to N = 4, and briefly discussed further results they achieved with large amounts of time on a supercomputer. Our Figure 5 extends their results up to N = 7.

The following coarse-grainings were discovered only at supercell sizes 5 or 6, but not smaller (CA are bracketed according to their symmetry classes):

• The following rules coarse-grain to rules 0 and 255: [25, 61, 67, 103], [41, 97, 107, 121], [43, 113], [54, 147], [57, 99], [62, 118, 131, 145], [73, 109], [94, 133], [104, 233], [110, 124, 137, 193], [122, 161], [142, 212]

• The following coarse-grain to rule 204: 23, [36, 219], [50, 179], 77, [108, 201], [132, 222], 178, 232

• [1, …], [19, …] → …

• [22, …], [104, …] → [128, …]

• [66, …] → …, … → …

All of the above were discovered at N = 5 or
6; despite exhaustively and conclusively searching at N = 7, we did not discover any new pairs (A, B) such that A coarse-grained to B at supercell size 7 but not smaller.

The main diagram of results in [6, 7] has 54 elementary CA that have no nontrivial coarse-grainings at all up to supercell size 4; using a supercomputer to explore larger supercell sizes, Israeli & Goldenfeld then report that there were only 16 elementary CA for which they couldn't find a nontrivial coarse-graining. On a laptop, we find only 20 elementary CA that do not have coarse-grainings up to supercell size N = 7: the 16 previously reported, along with the two symmetry classes [37, 91] and [164, 218]. The non-trivial coarse-grainings for these four are presumably present at larger supercell sizes.

B. Comparison of algorithms
We compared the four implementation variants F1, F2, B1, B2 of our method discussed above with the brute-force method of [6, 7]. The brute-force approach does not include any of the new ideas developed in this paper, such as the search tree or pruning branches. The brute-force method does, however, include the reduction due to symmetries.

To measure efficiency we display the number of comparisons between res_1 and res_2; these are displayed in Figure 6; since the four graphs for F1, F2, B1, B2 were so similar, we also report the numerical values for the five approaches in Table I.

As expected, the brute-force approach was significantly less efficient than the other four selection methods. In Figure 6 all four of our new approaches appear to exhibit singly exponential exp(cN) scaling, in comparison to the 2^{2^N} scaling of the brute-force algorithm. As we progress, pruning branches apparently has greater effect in reducing the number of comparisons.

TABLE I. Number of comparisons between res_1 and res_2 for the five methods (brute force, F1, F2, B1, B2). The * indicates values left uncomputed because the computations took too long.

Among F1, F2, B1, and B2, we see that despite the backward ordering being the best approach overall up to N = 6, there is only about a 1-2% difference in the number of comparisons for N = 4 and 5. When we move on to N = 6, however, the difference increases to about 15%. On examination, we found that this greater difference was caused by a single case, the coarse-graining from rule 162 to rule 170. If we eliminate this case, N = 6 would also have a minuscule 1-2% difference between our four variants, similar to that observed at smaller values of N.

The coarse-graining from rule 162 to rule 170 at N = 6 is an interesting example because of the immense difference between computations in the forward and backward directions. When applying the backward ordering, there were precisely 500K comparisons. Meanwhile, for the forward ordering, there were 916M comparisons, a factor of approximately 1800. In this specific example, the backward direction yields a drastic improvement over the forward direction, but this does not hold true for all other cases.

In the coarse-graining from rule 154 to rule 170 with N = 4, we discovered the opposite situation, in which the forward direction was far more efficient than the backward direction. When using the backward approach, 66,040 total comparisons were needed to complete. Meanwhile, when we tried the forward approach, there were only 7,128 comparisons, a factor of just under 10.

We also find examples where there is a significant difference between selection methods 1 and 2. In coarse-graining rule 29 to rule 51 with N = 7, method B1 used 1,353K comparisons, while B2 used only 241K, a nearly six-fold difference.

More importantly, we discovered that by altering the initial input projection, we could reduce the number of comparisons by an extremely large amount. Although it was a coincidence, we found that when choosing to first branch on P(1010101), only 60 comparisons were required to complete (just 60, not 60K!). Even if we do not expect such drastic improvements in all cases, the magnitude of this drop was surprising, and suggests that heuristics for better selection of projection inputs to branch on could yield quite significant further improvements.

FIG. 5. The coarse-graining transitions within the 256 elementary CA. Results from supercell size N = 2 to N = 7. An arrow indicates that the first rule may be coarse-grained to the second rule for at least one choice of supercell size N and nontrivial projection P.

FIG. 6. Number of comparisons between res_1 and res_2 used by each of the five methods (brute force, F1, F2, B1, B2) in finding coarse-grainings up to supercell size N = 6. (The plots for F2, B2 are not visible simply because they are so well-overlapped by those for F1, B1 at this scale.) Note the y-axis is scaled logarithmically, so all four of our new approaches appear to exhibit singly exponential scaling (approximately exp(cN)), in comparison to the 2^{2^N} scaling of the brute-force algorithm. The counts for the brute-force algorithm are not reported for the largest values of N.

As is to be expected, no one ordering is always the best. Thus, to provide optimal efficiency we could combine all
four of these. Each of the coarse-grainings is more effective with different types of dynamic orderings due to its unique binary patterns, so to further reduce the computations, we still need to discover how to choose a selection method that will be most compatible with each of the rules. We now discuss a few patterns we noticed that might be useful in this endeavor.

Using these new approaches, we observed four common patterns in the cuts. In the first, both the 0 and 1 branches immediately prune off at level 1, the lowest level. The second pattern tapers from higher to lower levels in the 0 branch but cuts immediately at level 1 in the 1 branch. The third pattern is the opposite, in which the 0 branch immediately cuts at level 1 but the 1 branch tapers from lower to higher levels. The last pattern we saw frequently was a combination of the second and third, in which the 0 branch tapers from higher to lower levels while the 1 branch tapers from lower to higher levels.

While most coarse-grainings followed these four general trends with slight deviations, there were some outliers. The coarse-graining from rule 162 to rule 170 in the forward ordering was unique among all those we explored. It had a seemingly highly irregular pattern of cuts, with most cuts occurring only very late in the tree, having little effect in reducing the runtime. However, when we tried reversing the order, starting at 11···1, the shape transformed into the fourth pattern mentioned above and the computation finished fairly quickly.

Overall, we saw that the cases in which the coarse-graining did not follow one of the preceding four patterns took far longer to compute than the others. However, as seen in the example above, changing the direction can sometimes help fix this issue.
V. CONCLUSION
We developed a much more efficient algorithm for finding coarse-grainings between cellular automata, using a backtracking search with propagation of forced values and a dynamic variable ordering. Experimentally, our method appears to have singly-exponential time scaling, compared to the previous brute-force method, whose runtime scaled as 2^{2^N} [6, 7]. Using our method we could exhaustively examine all possible coarse-grainings of elementary cellular automata up to supercell size 7, extending the previous results, which only went up to supercell size 4 [6, 7]. We found 26 new symmetry classes of coarse-grainings (56 coarse-grainings total).

To explore further improvements to our algorithm, we examined several different variable orderings, and found that each had different advantages. We examined specific cases suggesting that heuristics to select which variable to branch on next could have quite drastic effects on the runtime. Other possible improvements are suggested by analogy with the SAT literature, such as an analogue of clause learning, or trying to incorporate SAT solvers directly but dynamically (rather than via an all-at-once reduction; see Remark 1).

Interesting open questions include the original questions raised in [6, 7]—such as whether the remaining 16 elementary CA can be coarse-grained at all—as well as new questions we highlight, such as classifying the complexity of the coarse-graining problem. The experimentally observed singly-exponential scaling of our algorithm raises the possibility that the problem might in fact be in the complexity class EXP, making the question of whether it is in
EXP or is NEXP-complete even more salient. There also remains the question of finding mathematical methods to prove when there does not exist a coarse-graining between two given (elementary) cellular automata—regardless of supercell size—which would enable us to complete the picture of coarse-grainings between elementary CA.
ACKNOWLEDGMENTS
We would like to thank ATHENA by WiSTEM, the organization that matched the authors together, eventually leading to this project. The authors were partially funded by NSF grant DMS-1829826 (formerly DMS-1622390).

[1] J. von Neumann,
Theory of Self-Reproducing Automata (University of Illinois Press, 1966), edited and completed by A. W. Burks.
[2] G. B. Ermentrout and L. Edelstein-Keshet, Cellular automata approaches to biological modeling, J. Theor. Biol. 160, 97 (1993).
[3] A. Ilachinski, Cellular Automata: A Discrete Universe (World Scientific Publishing Co., Inc., River Edge, NJ, 2001).
[4] M. Mitchell, Computation in cellular automata: A selected review, in