Error-Prone Cellular Automata as Metaphors of Immunity as Computation

Kátia K. Cassiano, Valmir C. Barbosa∗

Programa de Engenharia de Sistemas e Computação, COPPE
Universidade Federal do Rio de Janeiro
Caixa Postal 68511
21941-972 Rio de Janeiro - RJ, Brazil
Abstract
Modeling the immune system so that its essential functionalities stand out without the need for every molecular or cellular interaction to be taken into account has been challenging for many decades. Two competing approaches have been the clonal selection theory and the idiotypic-network theory, each stemming from a relatively separate set of principles. One recent perspective holding the promise of unification is that of immunity as computation, that is, of immunity as the process of computing the state of the body so that protection can be effected, as well as boosted through learning. Here we investigate the use of cellular automata (CA) as the core abstraction supporting this new perspective. Our choice of CA for this role is based on the potential variety of basins in a CA attractor field. Associating each basin with a consistent set of body states, and moreover providing for the noisy evolution of the CA in time so that jumping between basins is possible, have provided the necessary backdrop. Given a CA rule to be followed by all cells synchronously, our model is based on a probability with which each cell, at each time step, independently updates its own state differently than the rule mandates. Setting up and solving the corresponding Markov chain for its stationary probabilities have revealed that already in the context of elementary CA there exist rules that, while allowing transitions between basins, display remarkable resiliency in terms of basin occupation. For these rules, the long-run probability that the CA is found in a given basin is practically the same as in the deterministic case when the initial CA state is chosen uniformly at random. We argue that, consequently, our single-parameter CA model may be a suitable abstraction of immunity as computation.
Keywords:
Immune system, Cellular automata, Immunity as computation.

∗ Corresponding author ([email protected]).

Introduction
The immune system is one of the body's major regulatory systems. Comprising important elements at various physical scales, such as organs, cells, and molecules, the immune system provides defenses against pathogenic bacteria and viruses, identifies and seeks to eliminate abnormally behaving cells before they become established tumors, and carries out tissue restoration as well as various other housekeeping activities. The immune response to invading pathogens, as well as the system's participation in body maintenance, is the product of learning and self-organization: Beginning with the so-called innate immunity, the immune system is capable of recreating itself along its history while avoiding the pitfalls of autoimmunity [20]. In order to remain fit for such a potentially daunting task for as long as possible, the immune system relies on the process known as somatic hypermutation [21], which continually provides the required diversity at the immune-cellular level.

While by virtue of the immune system's nature as a self-organizing entity it seems safe to view the rise of the various immune functions as a process that proceeds from the bottom up, starting with local interactions at the molecular level, immunity is undoubtedly a systemic process. Explanatory theories of the immune system have therefore oscillated between the very local (with the clonal selection theory [5, 11]) and the very wide (with the elusive idiotypic-network theory [16, 3, 19], based on the idea that many immune-system elements interact with one another much as they do with antigens). A curious (though apt) perspective that might reconcile the two extremes is that the immune system continually "computes" the state of the body (of which it is part), resulting in state alterations as the immune system both acts and learns [6].

Models of the immune system, however, have concentrated on expressing the evolution in time of cell concentrations and other quantities, usually by differential equations (cf., e.g., [10]) but also by discrete-time abstractions akin to cellular automata (CA) [22]. In general such models have been shown to provide a qualitatively convincing picture of how several of the important immune functions arise, or of how the idiotypic network is thought to be organized. But the immunity-as-computation paradigm is to our knowledge yet to be explored, though it should be for at least two reasons that we find quite compelling. The first one is that viewing immunity as resulting from the continual computation of states of the body is bound to require new abstractions through which such states can be represented and manipulated, mathematically or computationally. As a consequence, valuable insight can be expected to emerge. The second reason is that, once suitable state representations have been identified, the possibility of uncertain events that renders the entire system both adaptive and vulnerable can be more easily taken into account.

Here we begin to investigate the use of CA as a suitable abstraction to underlie the study of the immune system as a computational entity. Although choosing CA may seem only natural to unconditional CA enthusiasts, given the impressive plethora of domains to which CA have been applied [27], in our vision there are specific reasons backing our choice. One of them is that, by virtue of the deterministic character of how CA evolve in time, all CA states for a given finite number of cells and a fixed rule are necessarily partitioned into attractor basins.
Viewing CA states as body states and the CA rule as summarizing the computation of body states by the immune system immediately yields an interpretation of each basin as the set of states to which the body is confined once it is born into that basin. Depending on the CA rule in question some basins may express a complex succession of body states while others may seem dull by comparison or merely bespeak decay and disorganization.

Another reason for choosing a representation by CA is that they yield easily to the incorporation of uncertainty. This can be achieved in many ways, our choice being to allow each cell, at each time step, to disobey the CA rule in use and change its state differently than the rule mandates. We model this possibility by a single probability parameter, denoted by p. The usual, deterministic CA world is recovered by setting p to 0, but proceeding otherwise (i.e., choosing p > 0)
immediately opens up new doors. Specifically, every CA state becomes reachable from every other state, whence it follows that the aforementioned attractor basins are no longer unreachable from one another during the CA dynamics but rather allow the body whose states are the CA states to journey through a rich variety of domains (health, disease, recovery, etc.), however unlikely the transition from one to another may be. It also follows that the attractor dynamics inside a basin is no longer inevitable, and likewise that the periodic attractor lying at a basin's core is not inescapable.

The question we seek to answer is the following. Given a CA rule and an attractor basin in the corresponding CA-state space, what is the probability that, in the long run, the CA state is part of that basin? Unlike in other studies that model uncertainty in a manner similar to ours (cf., e.g., [23] and references therein), answering this question relies not on analyzing spatiotemporal patterns of CA evolution but rather on solving Markov chains for their stationary distributions. This is computationally strenuous, but for modestly sized systems we show that there do exist CA rules for which the added uncertainty, while allowing the desired transitions between CA states of different basins to occur, nevertheless tends to confine the CA dynamics to within the same basin where it would unfold if no uncertainty had been added but initial conditions were random.

We proceed in the following manner. We present our model, along with its main properties, in Section 2. This is followed by our methodology in Section 3, results in Section 4, and discussion in Sections 5 and 6. We conclude in Section 7.
The model

We consider binary CA, i.e., CA whose cell states are either 0 or 1. If n is the number of cells, assumed finite, then the number of distinct CA states is 2^n. All cells update their states at all times synchronously (i.e., in lockstep) based on the same rule, which can be thought of as a table of binary outputs indexed by (δ + 1)-bit inputs. Here δ is the size of a cell's neighborhood, the same for all cells, so a cell's new state depends on its own current state and on its neighbors' current states. Each rule's size is 2^{δ+1}, so there exist 2^{2^{δ+1}} distinct rules. Fixing the rule to be used gives rise to a function f mapping each CA state in {0, 1}^n into another state in the same set.

Our model is based on turning deterministic CA into probabilistic ones. We do this by introducing a probability, p, with which each cell, at each time step, disobeys the rule's prescription for its next state independently of all other cells. So, if x denotes a cell's next state and the CA rule's current prescription for the value of x is b ∈ {0, 1}, we have

    x := 1 − b, with probability p;
         b,     otherwise.                                           (1)

Now let i, j ∈ {0, 1}^n be any two CA states and let H_{i,j} be the Hamming distance between them (i.e., the number of cells at which i and j differ). Additionally, let k_i = f(i), i.e., k_i is the CA state that follows i in the deterministic dynamics for the rule at hand. Once we introduce the probability p, the probability that CA state i is followed by j, denoted by p_{i,j}, is

    p_{i,j} = p^{H_{j,k_i}} (1 − p)^{n − H_{j,k_i}}.                 (2)

Readily, letting j = k_i yields H_{j,k_i} = 0, and consequently p_{i,j} = (1 − p)^n. This is the probability with which i is followed by k_i, that is, the probability that at any given time step the deterministic prescription is respected.

Thus, while using p = 0 clearly recovers the traditional, deterministic dynamics (since p_{i,j} = 1 if j = k_i and p_{i,j} = 0 otherwise), using p > 0 gives rise to a Markov chain on the CA states having P = [p_{i,j}] for transition-probability matrix. To see this it suffices to verify that the elements of P sum up to 1 on any row. That is, fixing i yields

    \sum_{j ∈ {0,1}^n} p_{i,j} = \sum_{h=0}^{n} \binom{n}{h} p^h (1 − p)^{n−h} = 1    (3)

(because k_i is fixed along with i and differs at h cells from \binom{n}{h} of the 2^n CA states for any given number h of cells). Moreover, for p > 0 every element of P is nonzero and therefore the chain is ergodic, meaning that, regardless of how likely it is for any given CA state to be the initial state, in the long run the CA is found in state i with the stationary probability π_i given by π = πP, where π = [π_i] is a row vector.

By Eq. (1), letting p = 1 also implies deterministic behavior, but following the rule that is complementary to the one that is followed when p = 0. That is, one rule sets x to b if and only if the other sets it to 1 − b. A similar type of symmetry occurs between the case in which some p > 0 is used and that in which 1 − p is used instead. To see this, first let \bar{l} denote the complement of CA state l (i.e., adding any cell's state in l to its state in \bar{l} yields 1). It clearly follows that H_{l,j} + H_{\bar{l},j} = n for any CA state j. Now recall that Eq. (2) refers to a specific CA rule and to each cell disobeying it with probability p at each time step. Rewriting the equation for the complementary rule and also letting it be disobeyed with probability 1 − p instead has no effect on the value of p_{i,j}, since

    (1 − p)^{H_{j,\bar{k}_i}} p^{n − H_{j,\bar{k}_i}} = (1 − p)^{n − H_{j,k_i}} p^{H_{j,k_i}}.    (4)
Thus, studying the case of any given rule under p leads to the same Markov chain as studying the complementary rule under 1 − p, and consequently to the same stationary probabilities on the CA states. Typically our interest lies in small values of p, which makes the case of (the correspondingly large) 1 − p even more remarkable, at least at the level of CA states. At the higher level of the attractor basins, however, no equivalence can in general be expected: The probability that the CA is found in a particular basin in the long run depends on how the CA states cluster into basins and in general this happens differently for a given rule and its complement.

Nevertheless, there do exist rule pairs that display equivalent behavior for the same value of p. We identify these pairs by first introducing a transformation between CA states, call it g, and requiring that one of the rules in the pair lead the CA from state i to state k_i if and only if the other rule leads the CA from state g(i) to state g(k_i). Any rule pair satisfying this requirement is such that the corresponding sets of attractor basins, one for each rule, are structurally equivalent to each other. If, moreover, we require H_{j,k_i} = H_{g(j),g(k_i)}, then we also have p_{i,j} = p_{g(i),g(j)}. What results from this is that, in the long run, the CA is found in any given basin of one of the rules with the same probability that it is found in the equivalent basin of the other rule.

Rule pairs like this are important in our context because they have the potential of reducing the number of rules that have to be analyzed. This is so because, even though the two sets of stationary probabilities on the CA states are in general distinct, when the probabilities are summed up inside any basin of one of the rules the result is the same as that for the other rule's equivalent basin. One transformation g for which every rule has a counterpart with which it satisfies the two requirements above is negation, i.e., adding any cell's state in i to its state in g(i) yields 1. Another one is reflection, i.e., the c-th cell's state in i is the same as the (n − c + 1)-th cell's state in g(i) for every c ∈ {1, 2, ..., n}.

By Eq. (2), letting p = 0.5 yields p_{i,j} = 1/2^n regardless of i, j, or the rule being used. From this it follows that π_i = 1/2^n for every i, so the CA is equally likely to be found at any state in the long run. However, our transition-probability matrix P for this particular value of p is not the only one leading to the uniform distribution over the CA states: In fact, this happens if and only if the matrix is doubly stochastic (i.e., its elements add up to 1 column-wise just as they do row-wise) and yields an ergodic chain. An example is obtained by letting

    p_{i,j} = 1/\binom{n}{τ}, if H_{i,j} = τ;
              0,              otherwise                              (5)

for any number τ of cells [24] (but note that our p = 0.5 matrix is not of this form for any single τ).
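For concreteness, the negation and reflection counterparts just described can be computed directly on elementary-CA rule tables (the case δ = 2 used in our experiments). The short sketch below is illustrative only; it assumes the standard Wolfram numbering of rule tables and recovers the grouping of the 256 rules into 88 equivalence classes mentioned later in the text.

```python
def rule_table(r):
    """Output bits of elementary rule r, indexed by 4*left + 2*center + right."""
    return [(r >> i) & 1 for i in range(8)]

def table_to_rule(t):
    return sum(bit << i for i, bit in enumerate(t))

def reflect(r):
    """Counterpart of r under reflection of the cell arrangement."""
    t = rule_table(r)
    out = [0] * 8
    for i in range(8):
        left, center, right = (i >> 2) & 1, (i >> 1) & 1, i & 1
        out[i] = t[(right << 2) | (center << 1) | left]
    return table_to_rule(out)

def negate(r):
    """Counterpart of r under negation (complement both the inputs and the output)."""
    t = rule_table(r)
    return table_to_rule([1 - t[7 - i] for i in range(8)])

# Group the 256 elementary rules by negation and reflection; the complementary
# rule (255 - r) would further join two such classes into clusters of size at most 8.
classes = {frozenset({r, reflect(r), negate(r), reflect(negate(r))}) for r in range(256)}
print(len(classes))  # prints 88, the number of equivalence classes
```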
Our model is a special case of the so-called probabilistic CA (PCA), in which a cell's next state is no longer given by the customary deterministic rule but instead is chosen probabilistically as a function of the cell's and its neighbors' current states. Our particular type of PCA relies on the probabilistic decision summarized in Eq. (1), itself dependent on a specific deterministic rule (unlike most PCA, in whose case no deterministic rule plays any role).

Placing our model within the wider class of PCA is important because they have been viewed as prototypes of many important systems, both physical and computational, in a way similar to that in which immunity may come to be characterized as a computational process. Examples of such systems include the spin lattices of statistical physics [9, 4, 14, 12, 18] and, more generally, the Markov and Gibbs random fields [1] that, together with various asynchronous state-update schemes [15, 2], underlie many of the so-called probabilistic graphical models (such as Bayesian networks and hidden Markov models) in artificial intelligence [17].
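The defining quantities of the model are easily made concrete. The sketch below, given for illustration only, implements the deterministic successor function f, the error-prone update of Eq. (1), and the transition matrix of Eq. (2) for an elementary CA (δ = 2, periodic boundaries); the values of n, p, and the rule number are arbitrary choices, not those used in our experiments.

```python
import numpy as np

def successor(rule, state, n):
    """Deterministic successor k_i = f(i) of an elementary CA with periodic
    boundaries; 'state' is an integer whose n low-order bits are the cell states."""
    bits = [(state >> c) & 1 for c in range(n)]
    nxt = 0
    for c in range(n):
        left, center, right = bits[(c - 1) % n], bits[c], bits[(c + 1) % n]
        nxt |= ((rule >> (4 * left + 2 * center + right)) & 1) << c
    return nxt

def noisy_step(rule, state, n, p, rng):
    """One time step under Eq. (1): each cell flips the prescribed bit
    independently with probability p."""
    flips = sum(int(rng.random() < p) << c for c in range(n))
    return successor(rule, state, n) ^ flips

def transition_matrix(rule, n, p):
    """P[i, j] = p^H(j, k_i) * (1 - p)^(n - H(j, k_i)), as in Eq. (2)."""
    size = 1 << n
    P = np.empty((size, size))
    for i in range(size):
        k = successor(rule, i, n)
        for j in range(size):
            h = bin(j ^ k).count("1")  # Hamming distance H(j, k_i)
            P[i, j] = p ** h * (1 - p) ** (n - h)
    return P

n, p, rule = 8, 0.01, 110                  # illustrative values only
rng = np.random.default_rng(0)
P = transition_matrix(rule, n, p)
assert np.allclose(P.sum(axis=1), 1.0)                         # Eq. (3): rows sum to 1
assert np.isclose(P[0, successor(rule, 0, n)], (1 - p) ** n)   # the deterministic move is the most likely
state = noisy_step(rule, 0, n, p, rng)                         # one error-prone update
```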
Methodology

Given a deterministic CA rule and the number n of cells, let m denote the number of attractor basins into which the set {0, 1}^n is partitioned. We denote these basins by B_1, B_2, ..., B_m. For the case in which the rule in question may be disobeyed by any cell at any time step according to Eq. (1) with p > 0,
our aim is to calculate the probability that, in the long run, the CA is found in some state of a given basin B ∈ {B_1, B_2, ..., B_m}. Denoting this probability by π_B, we clearly have

    π_B = \sum_{k ∈ {0,1}^n} π_{B|k} Pr(k),                          (6)

where π_{B|k} is the conditional probability that in the long run the CA is found in some state in B, given that it started at state k, and Pr(k) is the probability that it did start at k. However, it follows from our discussion in Section 2 that π_{B|k} is actually unaffected by k and can be obtained by adding up π_k, the stationary probability of CA state k in the associated Markov chain, for all k ∈ B. We then have

    π_B = \sum_{k ∈ B} π_k,                                          (7)

regardless of how we choose the initial state k, i.e., regardless of Pr(k) for any k.

All our analyses in the forthcoming sections are based on comparing π_B to the corresponding probability when p = 0, that is, when evolution is deterministic. We denote this probability by σ_B and the corresponding conditional probability, given k, by σ_{B|k}. Readily,

    σ_{B|k} = 1, if k ∈ B;
              0, otherwise                                           (8)

and

    σ_B = \sum_{k ∈ {0,1}^n} σ_{B|k} Pr(k) = \sum_{k ∈ B} Pr(k),     (9)

so σ_B is clearly dependent upon how k is chosen. We continue by assuming that this happens uniformly at random, that is, Pr(k) = 1/2^n for every k, whence we obtain

    σ_B = |B| / 2^n.                                                 (10)

Thus, σ_B results trivially from the uniform distribution over all CA states (we simply add it up for all states in basin B).

Obtaining π_B for every basin B requires the system π = πP to be solved, subject to the constraints that π_i > 0 for every i ∈ {0, 1}^n and \sum_{i ∈ {0,1}^n} π_i = 1, for each desired combination of n, CA rule, and p > 0.
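Enumerating the basins of the deterministic dynamics, and from them σ_B of Eq. (10) and π_B of Eq. (7), is straightforward for CA of the sizes considered here. The sketch below is purely illustrative and reuses the successor function, rule, and n from the previous listing; it labels each state with the basin that its deterministic trajectory eventually enters.

```python
def basin_labels(rule, n):
    """Label every state in {0,1}^n with its attractor basin: two states share a
    label iff their deterministic trajectories reach the same cycle.  A new basin
    is labeled by the smallest state on its cycle."""
    size = 1 << n
    label = [-1] * size
    for start in range(size):
        if label[start] >= 0:
            continue
        path, pos = [], {}
        s = start
        while label[s] < 0 and s not in pos:
            pos[s] = len(path)
            path.append(s)
            s = successor(rule, s, n)
        b = label[s] if label[s] >= 0 else min(path[pos[s]:])
        for q in path:
            label[q] = b
    return label

label = basin_labels(rule, n)
basins = sorted(set(label))
sigma = {b: label.count(b) / (1 << n) for b in basins}   # Eq. (10): sigma_B = |B| / 2^n

def basin_probabilities(pi, label):
    """Eq. (7): pi_B as the sum of the stationary probabilities of B's states."""
    out = {}
    for state, b in enumerate(label):
        out[b] = out.get(b, 0.0) + pi[state]
    return out
```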
We have used the solver that is freely available as part of the Tangram-II modeling tool [8]. This solver employs state-of-the-art techniques for the above determination of π given P, but in our case P is a 2^n × 2^n matrix with no zeroes and no facilitating symmetries or structure. Thus the solution process has been very time-consuming, which has constrained n to the modest values of 10 through 12. For the record, we mention that, depending on the CA rule at hand, stepping up to n = 13 would demand nearly two months per run on an Intel Xeon E5-1650 at 3.2 GHz with enough memory to store the entire 8192 × 8192 matrix.

Results

Henceforth we let ℬ denote the set {B_1, B_2, ..., B_m} of all basins for a given CA rule and a fixed value of n. We compare the distributions π_{B_1}, π_{B_2}, ..., π_{B_m} and σ_{B_1}, σ_{B_2}, ..., σ_{B_m} by means of the Hellinger distance between them, denoted by H(π, σ) and given by

    H(π, σ) = \sqrt{1 − \sum_{B ∈ ℬ} \sqrt{π_B σ_B}}.                (11)

Using the Hellinger distance to compare the two distributions is convenient not only because it truly is a distance function but also because it is always such that 0 ≤ H(π, σ) ≤ 1.
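For small instances the stationary distribution can also be obtained by elementary means. The sketch below, which merely iterates π ← πP until convergence (power iteration), is an illustrative stand-in for the exact solver used in our experiments and assumes the matrix P built in the earlier listing.

```python
import numpy as np

def stationary_distribution(P, tol=1e-12, max_iter=1_000_000):
    """Stationary vector of an ergodic chain: iterate pi <- pi P to convergence."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

pi = stationary_distribution(P)
assert np.allclose(pi @ P, pi) and np.isclose(pi.sum(), 1.0)   # pi solves pi = pi P
```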
In fact, clearly H(π, σ) = 0 if and only if π_B = σ_B for all B ∈ ℬ and H(π, σ) = 1 if and only if π_B σ_B = 0 for all B ∈ ℬ. The latter, however, can never be achieved in our context because both π_B and σ_B are strictly positive for all B ∈ ℬ.

We also compare the mean and standard deviation of basin sizes as they vary from one distribution to the other. To this end, we use the ratios

    ρ_mean = \frac{\sum_{B ∈ ℬ} π_B |B|}{\sum_{B ∈ ℬ} σ_B |B|}       (12)

and

    ρ_s.d. = \sqrt{\frac{\sum_{B ∈ ℬ} π_B |B|^2 − (\sum_{B ∈ ℬ} π_B |B|)^2}{\sum_{B ∈ ℬ} σ_B |B|^2 − (\sum_{B ∈ ℬ} σ_B |B|)^2}}.    (13)

Clearly, comparing ρ_mean to 1 lets us detect increases or decreases in the mean basin size as we move from using the probabilities σ_{B_1}, σ_{B_2}, ..., σ_{B_m} to using π_{B_1}, π_{B_2}, ..., π_{B_m}, and likewise for ρ_s.d. with respect to the standard deviation of basin sizes.

These data are given in Tables 1 and 2, the former containing Hellinger distances, the latter containing mean and standard-deviation ratios. All data refer to elementary CA [25], which in the present context corresponds to setting a cell's neighborhood size (δ) to 2, and to an arrangement of cells that is one-dimensional with periodic boundaries (i.e., the first and last cells in the arrangement are neighbors). Moreover, our data encompass all combinations of a unique rule, a CA size n ∈ {10, 11, 12}, and one of the two values of the probability p considered (labeled I and II in the tables). By unique rule we mean one that is not equivalent to any other selected rule by negation or reflection. Of the 256 possible elementary-CA rules, 88 are unique in this sense but group with the remaining 168 rules into equivalence classes of size at most 4, or into larger clusters of size at most 8 as two equivalence classes of pairwise complementary rules are joined. Each of the equivalence classes might be represented in our tables by any of its members, but we follow Wuensche and Lesser, who in their atlas [29] use one or two rules of each larger cluster, viz. the rule of least number (in the customary Wolfram sense [25]) and its complement if not already in the first rule's equivalence class. Each table also informs a rule's class (1 through 4) according to Wolfram's well-known qualitative scheme [26].
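Both comparison measures are immediate to evaluate once the two basin distributions are at hand. The sketch below is again illustrative only, reusing sigma, label, basins, pi, and basin_probabilities from the previous listings; it computes H(π, σ) of Eq. (11) and the ratios of Eqs. (12) and (13).

```python
import math

def hellinger(pi_B, sigma_B):
    """Hellinger distance between the two basin distributions, Eq. (11)."""
    return math.sqrt(1.0 - sum(math.sqrt(pi_B[b] * sigma_B[b]) for b in sigma_B))

def size_ratios(pi_B, sigma_B, sizes):
    """rho_mean and rho_s.d. of Eqs. (12) and (13); 'sizes' maps a basin to |B|.
    Assumes more than one basin (otherwise the variances below vanish)."""
    mean = lambda w: sum(w[b] * sizes[b] for b in w)
    var = lambda w: sum(w[b] * sizes[b] ** 2 for b in w) - mean(w) ** 2
    return mean(pi_B) / mean(sigma_B), math.sqrt(var(pi_B) / var(sigma_B))

sizes = {b: label.count(b) for b in basins}
pi_B = basin_probabilities(pi, label)
print(hellinger(pi_B, sigma), size_ratios(pi_B, sigma, sizes))
```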
Table 1: Hellinger distances H(π, σ) for the two values of p (I and II).

Wolfram            n = 10               n = 11               n = 12
rule   Class   I         II         I         II         I         II
248    1       0.248956  0.248918   0.223216  0.223206   0.200278  0.200274
249    1       0.091299  0.091276   0.073387  0.073380   0.059551  0.059549
250    1       0.176776  0.176729   0.015626  0.015626   0.125000  0.124995
251    1       0.031257  0.031224   0.000000  0.000000   0.015626  0.015623
252    1       0.022100  0.022100   0.015626  0.015626   0.011049  0.011049
253    1       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
254    1       0.022100  0.022100   0.015626  0.015626   0.011049  0.011049
12     2       0.084915  0.083186   0.088408  0.086590   0.091966  0.090065
13     2       0.561884  0.436485   0.298189  0.248901   0.620558  0.469093
14     2       0.296465  0.245507   0.335487  0.266519   0.492403  0.338750
15     2       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
19     2       0.665262  0.455025   0.687454  0.466017   0.706507  0.476426
23     2       0.649786  0.504885   0.674805  0.517551   0.699909  0.532909
24     2       0.151170  0.145988   0.162003  0.156480   0.163455  0.157906
25     2       0.178794  0.142287   0.205302  0.166376   0.240752  0.184615
26     2       0.096811  0.087081   0.092313  0.081963   0.082902  0.073380
27     2       0.078410  0.075206   0.078149  0.075674   0.088139  0.084074
28     2       0.507050  0.293820   0.276705  0.187250   0.554002  0.304611
29     2       0.042041  0.041289   0.044401  0.043618   0.046283  0.045463
33     2       0.128619  0.124749   0.102505  0.099605   0.131497  0.128095
35     2       0.112197  0.095522   0.108425  0.093326   0.136032  0.100919
36     2       0.210118  0.202473   0.212746  0.204591   0.218307  0.209832
37     2       0.217930  0.128618   0.130832  0.085161   0.153067  0.127335
38     2       0.053281  0.051025   0.060564  0.058097   0.058462  0.055931
41     2       0.120070  0.097433   0.115620  0.106492   0.150240  0.107285
43     2       0.299150  0.254258   0.340329  0.279313   0.499819  0.357524
46     2       0.176524  0.167568   0.186924  0.177534   0.196353  0.186419
50     2       0.621971  0.428828   0.434170  0.334403   0.667557  0.448039
51     2       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
57     2       0.271948  0.093908   0.095220  0.034859   0.287305  0.079804
58     2       0.230614  0.194367   0.401245  0.276011   0.410814  0.410734
62     2       0.197845  0.138003   0.132255  0.122913   0.215650  0.117917
77     2       0.649786  0.504885   0.447864  0.375344   0.699909  0.532909
204    2       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
205    2       0.081927  0.080898   0.082447  0.081349   0.083984  0.082815
210    2       0.007328  0.006725   0.000000  0.000000   0.010027  0.008364
212    2       0.299150  0.254258   0.340329  0.279313   0.499819  0.357524
214    2       0.104722  0.100715   0.138109  0.122696   0.112897  0.094736
217    2       0.120698  0.115389   0.130762  0.125581   0.126279  0.121099
218    2       0.266122  0.259242   0.221638  0.214302   0.265582  0.258562
220    2       0.092624  0.090070   0.092633  0.089882   0.094236  0.091322
222    2       0.081912  0.081363   0.081185  0.080555   0.084533  0.084083
226    2       0.170992  0.148976   0.079509  0.075465   0.196928  0.168250
227    2       0.128235  0.089781   0.076123  0.065573   0.109403  0.072279
228    2       0.322168  0.308522   0.308795  0.298360   0.299346  0.290927
229    2       0.101086  0.096509   0.134254  0.119041   0.115874  0.108862
230    2       0.238324  0.230492   0.258104  0.250046   0.277262  0.268729
232    2       0.649786  0.504885   0.674805  0.517551   0.699909  0.532909
233    2       0.370172  0.311163   0.404103  0.341928   0.420107  0.357435
236    2       0.767929  0.677742   0.790184  0.701010   0.810056  0.722260
240    2       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
241    2       0.161638  0.159585   0.169371  0.167225   0.176764  0.174527
242    2       0.219813  0.216880   0.229854  0.226784   0.239672  0.236473
243    2       0.084915  0.083186   0.088408  0.086590   0.091966  0.090065
244    2       0.253371  0.248050   0.265043  0.259493   0.276340  0.270564
246    2       0.123588  0.121820   0.126632  0.124803   0.133269  0.131387
18     3       0.177219  0.171971   0.098671  0.095609   0.216715  0.200709
22     3       0.051738  0.044548   0.090170  0.077895   0.250064  0.134211
30     3       0.034544  0.016889   0.020255  0.007665   0.038162  0.009492
45     3       0.016456  0.010594   0.000000  0.000000   0.056506  0.004097
60     3       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
73     3       0.294481  0.182849   0.120781  0.107784   0.181444  0.151175
90     3       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
105    3       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
126    3       0.114009  0.110473   0.158805  0.146309   0.140149  0.125633
150    3       0.000000  0.000000   0.000000  0.000000   0.000000  0.000000
161    3       0.213897  0.142957   0.046676  0.018741   0.145264  0.101305
182    3       0.118028  0.098982   0.037891  0.030647   0.110355  0.095368
225    3       0.034380  0.020135   0.062034  0.018674   0.488660  0.119769
54     4       0.250242  0.160801   0.078368  0.060045   0.285840  0.138826
193    4       0.066972  0.049254   0.057118  0.030831   0.096156  0.052231
Table 2: Mean and standard-deviation ratios for the two values of p (I and II).

Wolfram            n = 10                   n = 11                   n = 12
rule   Class   ρ_mean      ρ_s.d.       ρ_mean      ρ_s.d.       ρ_mean      ρ_s.d.
               I     II    I     II     I     II    I     II     I     II    I     II
248    1       1.13  1.13  0.00  0.00   1.11  1.11  0.00  0.00   1.09  1.09  0.00  0.00
249    1       1.02  1.02  0.00  0.00   1.01  1.01  0.00  0.00   1.01  1.01  0.00  0.00
250    1       1.06  1.06  0.00  0.00   1.00  1.00  0.00  0.00   1.03  1.03  0.00  0.00
251    1       1.00  1.00  0.00  0.00   1.00  1.00  1.00  1.00   1.00  1.00  0.00  0.00
252    1       1.00  1.00  0.00  0.00   1.00  1.00  0.00  0.00   1.00  1.00  0.00  0.00
253    1       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
254    1       1.00  1.00  0.00  0.00   1.00  1.00  0.00  0.00   1.00  1.00  0.00  0.00
12     2       0.93  0.93  1.01  1.01   0.92  0.92  1.01  1.01   0.91  0.92  1.00  1.00
13     2       1.93  1.83  0.18  0.54   1.17  1.16  0.08  0.26   2.33  2.14  0.24  0.71
14     2       1.03  1.03  0.99  0.99   1.42  1.37  0.71  0.77   1.91  1.71  0.26  0.68
15     2       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
19     2       3.68  2.95  0.48  1.16   4.28  3.27  0.58  1.33   4.97  3.61  0.70  1.51
23     2       3.33  3.01  0.26  0.75   3.89  3.43  0.32  0.88   4.57  3.92  0.38  1.02
24     2       1.01  1.01  1.23  1.21   0.97  0.97  1.03  1.02   0.97  0.98  0.98  0.98
25     2       1.00  0.99  0.99  1.00   0.87  0.89  1.27  1.22   1.21  1.17  0.99  1.01
26     2       0.92  0.92  1.03  1.03   0.94  0.94  0.98  0.98   0.89  0.90  0.94  0.94
27     2       0.95  0.95  0.97  0.97   0.93  0.93  0.94  0.94   0.94  0.94  0.98  0.98
28     2       2.16  1.78  0.35  0.85   1.22  1.18  0.16  0.48   2.72  2.04  0.46  1.03
29     2       1.03  1.03  1.03  1.03   1.04  1.04  1.06  1.06   1.04  1.04  1.05  1.05
33     2       0.77  0.77  0.85  0.86   0.79  0.79  0.84  0.84   0.77  0.77  0.83  0.83
35     2       1.15  1.13  1.01  1.01   1.00  1.00  0.97  0.97   1.10  1.07  1.00  1.00
36     2       0.57  0.59  0.90  0.90   0.54  0.56  0.86  0.87   0.50  0.52  0.82  0.83
37     2       0.86  0.92  1.18  1.10   0.96  0.99  1.14  1.09   1.03  1.03  0.96  0.96
38     2       0.96  0.97  1.02  1.02   0.96  0.96  1.04  1.03   0.95  0.95  1.00  1.00
41     2       1.04  1.03  0.95  0.97   1.16  1.15  1.12  1.12   0.83  0.89  0.97  1.01
43     2       1.06  1.05  0.86  0.87   1.42  1.38  0.68  0.74   2.08  1.87  0.24  0.65
46     2       0.96  0.96  1.09  1.08   0.97  0.97  1.14  1.13   0.92  0.93  1.03  1.03
50     2       3.29  2.74  0.37  0.94   1.58  1.52  0.14  0.43   4.49  3.43  0.52  1.22
51     2       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
57     2       1.28  1.12  0.34  0.83   1.03  1.01  0.28  0.78   1.35  1.12  0.41  0.89
58     2       0.97  0.98  1.00  0.99   1.36  1.30  0.18  0.52   1.35  1.31  0.43  0.43
62     2       1.24  1.17  0.61  0.77   1.31  1.30  1.17  1.16   0.85  0.96  1.04  1.01
77     2       3.33  3.01  0.26  0.75   1.58  1.55  0.10  0.32   4.57  3.92  0.38  1.02
94     2       0.84  0.87  1.13  1.12   0.90  0.92  1.04  1.03   2.04  1.30  0.66  1.08
178    2       3.33  3.01  0.26  0.75   1.58  1.55  0.10  0.32   4.57  3.92  0.38  1.02
197    2       1.91  1.70  0.28  0.75   1.17  1.15  0.13  0.39   2.30  1.92  0.37  0.94
198    2       2.17  1.85  0.30  0.80   1.22  1.19  0.14  0.42   2.75  2.17  0.41  0.98
201    2       0.91  0.91  1.16  1.15   0.90  0.91  1.12  1.11   0.97  0.96  1.22  1.19
204    2       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
205    2       0.97  0.97  1.02  1.01   0.97  0.97  1.00  1.00   0.97  0.97  1.01  1.01
210    2       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
212    2       1.04  1.04  0.91  0.91   1.56  1.50  0.85  0.89   2.08  1.87  0.24  0.65
214    2       0.96  0.96  1.05  1.05   0.79  0.82  1.09  1.09   0.97  0.97  0.96  0.97
217    2       0.94  0.94  1.37  1.36   0.92  0.92  1.37  1.35   0.90  0.90  1.12  1.11
218    2       0.59  0.60  0.80  0.81   0.55  0.56  0.79  0.80   0.50  0.52  0.73  0.74
220    2       0.93  0.93  1.02  1.02   0.92  0.92  1.01  1.01   0.91  0.92  1.01  1.01
222    2       0.85  0.85  0.91  0.91   0.83  0.84  0.89  0.89   0.82  0.81  0.88  0.88
226    2       1.08  1.07  0.96  0.97   0.99  0.99  1.00  1.00   1.09  1.08  0.98  0.98
227    2       1.07  1.04  1.00  1.00   1.07  1.05  1.04  1.04   1.08  1.05  1.02  1.01
228    2       0.72  0.74  1.05  1.05   0.69  0.70  0.98  0.98   0.66  0.68  0.91  0.91
229    2       0.96  0.96  1.03  1.03   0.89  0.90  1.01  1.01   0.80  0.81  0.86  0.86
230    2       0.70  0.71  0.86  0.87   0.73  0.74  0.90  0.91   0.70  0.71  0.80  0.81
232    2       3.33  3.01  0.26  0.75   3.89  3.43  0.32  0.88   4.57  3.92  0.38  1.02
233    2       1.39  1.37  0.09  0.28   1.48  1.46  0.09  0.28   1.53  1.51  0.09  0.29
236    2       5.40  5.11  0.24  0.72   6.39  6.01  0.27  0.82   7.57  7.08  0.31  0.92
237    2       1.88  1.84  0.10  0.31   2.00  1.95  0.10  0.32   2.13  2.07  0.11  0.33
240    2       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
241    2       0.80  0.80  0.94  0.94   0.77  0.77  0.90  0.90   0.76  0.76  0.88  0.89
242    2       0.78  0.78  0.96  0.96   0.68  0.68  0.81  0.81   0.69  0.69  0.86  0.86
243    2       0.94  0.94  1.04  1.04   0.92  0.92  1.02  1.02   0.92  0.93  1.00  1.00
244    2       0.96  0.96  0.98  0.98   0.96  0.96  0.99  0.99   0.96  0.96  1.00  1.00
246    2       0.88  0.88  1.10  1.10   0.89  0.90  1.17  1.16   0.92  0.92  1.07  1.07
18     3       0.86  0.86  0.94  0.94   0.90  0.90  1.04  1.04   0.81  0.82  0.64  0.65
22     3       0.94  0.95  0.98  1.00   0.97  0.96  0.99  0.98   0.60  0.79  1.01  1.05
30     3       0.98  0.99  1.05  1.04   0.98  0.99  1.02  1.01   0.99  1.00  1.10  1.04
45     3       0.99  0.99  1.01  1.01   1.00  1.00  1.00  1.00   0.98  1.00  1.00  1.00
60     3       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
73     3       1.08  1.02  1.16  1.09   0.93  0.93  0.90  0.90   0.91  0.86  0.98  0.92
90     3       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
105    3       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
126    3       0.91  0.91  1.07  1.07   0.88  0.88  1.02  1.02   0.87  0.88  0.75  0.78
150    3       1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00
161    3       0.97  0.98  1.16  1.12   1.02  1.00  0.98  1.02   0.93  0.94  1.05  1.05
182    3       1.02  1.02  0.87  0.88   1.02  1.02  0.95  0.96   1.01  1.00  0.87  0.86
225    3       0.99  0.99  1.02  1.02   1.03  0.99  0.88  1.04   0.39  0.92  1.24  1.14
54     4       0.86  0.90  0.84  0.89   1.03  1.04  1.06  1.05   1.41  1.17  0.86  0.98
193    4       1.03  1.02  0.99  1.00   1.07  1.03  0.92  0.97   0.95  0.97  1.05  1.03
Discussion

As we have seen, disobeying a CA rule independently at each cell with probability p makes the CA dynamics stochastic and puts it between two extremes that, in a sense, are equivalent. One extreme is the p = 0 case, i.e., the case in which the rule is not disobeyed at all and the customary deterministic dynamics is followed. In this case, the long-run probability that a randomly chosen CA state is in some basin B is σ_B and stems from the uniform distribution on the CA states provided the initial state is itself chosen uniformly at random. The other extreme is that of p = 0.5, in which case the long-run probability that the CA is found in basin B is π_B, now stemming from CA-state probabilities that are again uniform but now by virtue of the underlying Markov chain's stationary distribution.

Comparing these two distributions as indicated in Section 4 clearly yields H(π, σ) = 0 and, consequently, ρ_mean = ρ_s.d. = 1, regardless of the particular CA rule and CA size being considered. Although these values may look like what we seek (stochastic CA dynamics that, while allowing occasional transitions between basins, let the CA state be found in a same basin for long stretches of time), they are only the product of erratic transitions between the CA states. In fact, for p = 0.5 the CA states succeed one another completely erratically, quite unlike what happens for p = 0. What we seek requires instead some p such that 0 < p < 0.5.
We first rewrite the transition probability p_{i,j} of Eq. (2) as

    p_{i,j} = (1 − p)^n \left( \frac{1 − p}{p} \right)^{−H_{j,k_i}},    (14)

which for 0 < p < 0.5 means that p_{i,j} decays exponentially with the Hamming distance between j and k_i from the maximum value of (1 − p)^n. This maximum, as we have noted, is achieved for j = k_i, so evolving toward i's deterministic successor in a single time step is always exponentially more likely than doing it toward any other CA state. Intuitively one might then expect the occurrence of H(π, σ) ≈ 0, given that at every time step it is by far most likely that j = k_i is picked when the current CA state is i.

We proceed by singling out some rules for a more detailed discussion. Most of these are highlighted in Tables 1 and 2 with a bold typeface. We occasionally mention specific characteristics of a rule or its basins, and for these the reader is referred to one of the available atlases [29, 28].

First note that, though not commonplace, rules for which H(π, σ) is indistinguishable from 0 within the six decimal places used in Table 1 do exist. These are class-1 rules 0 and 253; class-2 rules 15, 51, 204, and 240; and class-3 rules 60, 90 (the XOR rule), 105, and 150. For two of these rules, namely 0 and 253, the value of H(π, σ) is precisely 0, since each of them gives rise to exactly one basin of attraction, call it B, whence it follows that π_B = σ_B = 1 no matter what the stationary CA-state probabilities that make up π_B turn out to be. The value of H(π, σ) is precisely 0 also for four other rules, namely 15, 51, 204, and 240, but for an entirely different reason. What happens in these cases is that the transition-probability matrix is doubly stochastic, which as we have noted implies that the stationary distribution over the CA states is uniform. For rules 51 and 204, in particular, double stochasticity is a consequence of the matrices' being symmetric (i.e., p_{i,j} = p_{j,i} for all CA states i and j). As for rules 60, 90, 105, and 150, H(π, σ) is probably only approximately equal to 0, since the matrices do not seem to be doubly stochastic.

Making the requirement on H(π, σ) less stringent, for example by replacing indistinguishability from 0 with H(π, σ) < 0.1,
turns up further rules: class-1 rules 249, 251, 252, and 254; class-2 rules 12, 26, 27, 29, 38, 205, 210, 220, 222, and 243; class-3 rules 30 and 45; and even one of the elusive class-4 rules, namely rule 193 (more widely recognized through its equivalent by both negation and reflection, the celebrated rule 110, known to be capable of universal computation). The class-1 additions to the list are not really surprising, since in all four cases nearly all CA states cluster into one single basin and therefore our argument above for rules 0 and 253 essentially continues to hold (though approximately). As for the remaining additions (the class-2 through class-4 rules), no readily discernible characteristic seems to stand out that might help explain the relative proximity of the two distributions, not even inside each class.

Aside from these 27 zero or near-zero cases of the Hellinger distance, the remaining 61 rules in Tables 1 and 2, at least for our small sample of n and p values, all give rise to stationary basin probabilities that differ from those of the deterministic case (with initial CA states chosen uniformly at random) to some substantial extent. Singling out some rules on the higher extreme of distance values is not as clear-cut a task as picking the zeroes. As we mentioned earlier, the theoretical maximum distance of 1 can never be achieved for distributions that are strictly positive everywhere, so figuring out the actual maximum for elementary CA is far from a trivial task.

What we do is then to highlight those rules that, across our small sample of n and p values, are on the far side of the (admittedly arbitrary) threshold of H(π, σ) = 0.45.
Doing this yields four rules, all in class 2 and italicized in the tables: rules 19, 23, 232, and 236. Once again it is hard to discern any explaining characteristics, but from Table 2 it is clear that all four rules have in common the facts that ρ_mean is substantially larger than 1 (but less so as p is increased) and that ρ_s.d. is often smaller than 1 (but growing as p is increased). That is, for small p the distribution is more concentrated on larger basins, all relative to the basin-size distribution arising from the uniform distribution on CA states. This becomes less so as p is increased and the already discussed limit, as p is driven toward 0.5, is approached.
The present study has hinged on Eq. (1), a simple probabilistic expression of a cell's ability to alter its state differently than the CA rule in use directs it to, at every time step and independently of all other cells. If we view the CA states as states of the body, including the portion of it known as the immune system, then the evolution of CA states in time stands not only for the natural succession of body states but also for the computation of such states by the immune system. Given this context, the adoption of the spatially and temporally local probabilistic alterations to the CA rule given in Eq. (1) is an attempt to summarize several phenomena originating from the uncertainty that is inherent to every biological process. Such uncertainty drives adaptation, gives rise to diversity as well as disease, and fuels the appearance of idiotypes never before seen in the body and with them the possibility of better immunity through learning.

Though inherently stochastic, our model is also inherently dependent on a fixed CA rule. This is clear already in Eq. (1) itself, where we recall that b stands precisely for the cell's next state according to such a fixed rule. Moreover, although Eq. (1) makes every state update of every cell nondeterministic, globally it is always exponentially more likely to evolve to the CA state the rule mandates than to any other CA state. This means that the clustering of CA states into basins, though no longer unbreachable, is still meaningful and can be exploited as we adopt the modified CA as metaphors of immunity as computation. For example, each basin can be viewed as encompassing CA states that are equivalent from the perspective of the immune system as it computes the state of the body. Some possibilities that come to mind are basins representing a healthy or unhealthy body, others representing a body under recovery through the action of the immune system, and still many others as details are brought into the picture.

In such a setting, changes in the CA state other than those mandated by the underlying CA rule can be interpreted in a variety of ways: e.g., inter-basin transitions may stand for the appearance of or the recovery from diseases, as well as for adaptation into a distinct, though still healthy, set of states; intra-basin transitions, in turn, may represent change that nevertheless does not fundamentally alter the state of the body as far as being healthy is concerned. So far we have explored this landscape by simply asking what the effects of Eq. (1) might be in terms of fundamentally deviating the CA from its traditional excursion into the field of attractor basins under the CA rule in question. We have discovered CA rules in all four of Wolfram's classes for which no fundamental deviation exists while still allowing the CA to occasionally drift in and out of the field's basins.

It is telling that we should find such behavior already in the simplest of CA, viz. elementary CA, and already for the very small ones we investigate in this work. Moving forward will require the investigation of more complex CA, at the same time higher-dimensional, larger, and governed by larger-neighborhood rules. We expect that these enriched scenarios will provide many useful possibilities to characterize immunity as computation. In our view, the importance of characterizations such as this can hardly be overstated: Even as we write, immunotherapy is being hailed as a fundamental breakthrough in cancer treatment (cf. [7], as well as [13] and related content), and theoretical modeling is bound to be instrumental in better understanding this and other applications.
Concluding remarks

An important characteristic of our model is its reliance on one single parameter, the probability p. Assuming that it acts at each cell independently of all others has allowed the transition probability p_{i,j}, from CA state i to CA state j in a single step, to be written as in Eq. (2), which in turn implies the ergodicity of the corresponding Markov chain whenever p > 0.
The model is then conceptually simple, but studying it requires the Markov chain's stationary probabilities to be found, which by virtue of the model's inherent combinatorial growth in the general case quickly becomes computationally burdensome if not downright intractable.

Further research should then first concentrate on looking for those CA rules, if any exist, for which the transition matrix can somehow be simplified so that some facilitating structure emerges.
We already know that, for p < 0.5, the dominant probability on any of the matrix's rows, say the i-th, is p_{i,k_i} = (1 − p)^n. Not only this, but p_{i,j} for any j ≠ k_i is smaller than p_{i,k_i} by the exponentially decaying factor of [(1 − p)/p]^{−H_{j,k_i}}. The key to solving the Markov chain associated with certain rules may lie precisely in ignoring such vanishingly small probabilities, but to our knowledge substantial further research is needed to ascertain this.
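As a purely speculative illustration of this truncation idea (not a method examined in the present study), one could zero out all entries of P below some threshold and renormalize, obtaining a sparse matrix to which specialized solvers apply; whether the resulting stationary distribution remains close to the exact one is precisely what would have to be ascertained.

```python
import numpy as np

def truncated_matrix(P, eps):
    """Zero out transition probabilities below eps and renormalize each row.
    Illustrative only; eps should be smaller than (1 - p)**n so that every row
    keeps its dominant entry, and the effect on the stationary distribution is
    left unexamined here."""
    Q = np.where(P >= eps, P, 0.0)
    return Q / Q.sum(axis=1, keepdims=True)
```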
Acknowledgments

We acknowledge partial support from CNPq, CAPES, a FAPERJ BBP grant, and the joint PRONEX initiative of CNPq and FAPERJ under contract E-26/110.550/2010.
References

[1] V. C. Barbosa. Massively Parallel Models of Computation. Ellis Horwood, Chichester, UK, 1993.
[2] V. C. Barbosa and E. Gafni. A distributed implementation of simulated annealing. Journal of Parallel and Distributed Computing, 6:411–434, 1989.
[3] U. Behn. Idiotypic networks: toward a renaissance? Immunological Reviews, 216:142–152, 2007.
[4] C. H. Bennett and G. Grinstein. Role of irreversibility in stabilizing complex and nonergodic behavior in locally interacting discrete systems. Physical Review Letters, 55:657–660, 1985.
[5] F. M. Burnet. The Clonal Selection Theory of Acquired Immunity. Cambridge University Press, Cambridge, UK, 1959.
[6] I. R. Cohen. Real and artificial immune systems: computing the state of the body. Nature Reviews Immunology, 7:569–574, 2007.
[7] J. Couzin-Frankel. Cancer immunotherapy. Science, 342:1432–1433, 2013.
[8] E. de Souza e Silva, R. M. M. Leão, A. P. C. Silva, A. A. A. Rocha, F. P. Duarte, F. J. Silveira Filho, G. D. G. Jaime, and R. R. Muntz. Modeling, analysis, measurement and experimentation with the Tangram-II integrated environment. In Proceedings of the First International Conference on Performance Evaluation Methodologies and Tools.
[9] E. Domany and W. Kinzel. Equivalence of cellular automata to Ising models and directed percolation. Physical Review Letters, 53:311–314, 1984.
[10] L. E. Flores, E. J. Aguilar, V. C. Barbosa, and L. A. V. Carvalho. A graph model for the evolution of specificity in humoral immunity. Journal of Theoretical Biology, 229:311–325, 2004.
[11] D. R. Forsdyke. The origins of the clonal selection theory of immunity: a case study for evaluation in science. FASEB Journal, 9:164–166, 1995.
[12] A. Georges and P. Le Doussal. From equilibrium spin models to probabilistic cellular automata. Journal of Statistical Physics, 54:1011–1064, 1989.
[13] L. Gravitz. Cancer immunotherapy. Nature, 504:S1, 2013.
[14] G. Grinstein, C. Jayaprakash, and Y. He. Statistical mechanics of probabilistic cellular automata. Physical Review Letters, 55:2527–2530, 1985.
[15] T. E. Ingerson and R. L. Buvel. Structure in asynchronous cellular automata. Physica D, 10:59–68, 1984.
[16] N. K. Jerne. Towards a network theory of the immune system. Annales d'Immunologie, C125:373–389, 1974.
[17] D. Koller and N. Friedman. Probabilistic Graphical Models. MIT Press, Cambridge, MA, 2009.
[18] J. L. Lebowitz, C. Maes, and E. R. Speer. Statistical mechanics of probabilistic cellular automata. Journal of Statistical Physics, 59:117–170, 1990.
[19] A. Madi, D. Y. Kenett, S. Bransburg-Zabary, Y. Merbl, F. J. Quintana, A. I. Tauber, I. R. Cohen, and E. Ben-Jacob. Network theory analysis of antibody-antigen reactivity data: the immune trees at birth and adulthood. PLoS ONE.
[21] V. H. Odegard and D. G. Schatz. Targeting of somatic hypermutation. Nature Reviews Immunology, 6:573–583, 2006.
[22] H. Schmidtchen, M. Thüne, and U. Behn. Randomly evolving idiotypic networks: structural properties and architecture. Physical Review E, 86:011930, 2012.
[23] F. Silva and L. Correia. An experimental study of noise and asynchrony in elementary cellular automata with sampling compensation. Natural Computing, 12:573–588, 2013.
[24] B. Voorhees and C. Beauchemin. Point mutations and transitions between cellular automata attractor basins. Complex Systems, 15:41–78, 2004.
[25] S. Wolfram. Statistical mechanics of cellular automata. Reviews of Modern Physics, 55:601–644, 1983.
[26] S. Wolfram. Universality and complexity in cellular automata. Physica D, 10:1–35, 1984.
[27] S. Wolfram. A New Kind of Science. Wolfram Media, Champaign, IL, 2002.
[28] Wolfram Research, Inc. Elementary cellular automata. http://atlas.wolfram.com/01/01/.
[29] A. Wuensche and M. Lesser. The Global Dynamics of Cellular Automata: An Atlas of Basin of Attraction Fields of One-Dimensional Cellular Automata. Addison-Wesley, Reading, MA, 1992.