Algorithmic Information Dynamics of Persistent Patterns and Colliding Particles in the Game of Life
Hector Zenil, Narsis A. Kiani, Jesper Tegnér

Algorithmic Dynamics Lab, Centre for Molecular Medicine, Karolinska Institute, Stockholm, Sweden
Unit of Computational Medicine, Department of Medicine, Karolinska Institute, Stockholm, Sweden
Science for Life Laboratory, SciLifeLab, Stockholm, Sweden
Algorithmic Nature Group, LABORES for the Natural and Digital Sciences, Paris, France
Biological and Environmental Sciences and Engineering Division, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Kingdom of Saudi Arabia

{hector.zenil, narsis.kiani, jesper.tegner}@ki.se

Abstract
Without loss of generality with respect to other systems, including possibly non-deterministic ones, we demonstrate the application of methods drawn from algorithmic information dynamics to the characterisation and classification of emergent and persistent patterns, motifs and colliding particles in Conway's Game of Life (GoL), a cellular automaton serving as a case study illustrating the way in which such ideas can be applied to a typical discrete dynamical system. We explore the issue of local observations of closed systems whose orbits may appear open because of inaccessibility to the global rules governing the overall system. We also investigate aspects of symmetry related to complexity in the distribution of patterns that occur with high frequency in GoL (which we thus call motifs) and analyse the distribution of these motifs with a view to tracking the changes in their algorithmic probability over time. We demonstrate how the tools introduced are an alternative to other computable measures that are unable to capture changes in emergent structures in evolving complex systems, structures that are often too small or too subtle to be properly characterised by methods such as lossless compression and Shannon entropy.
Keywords:
Kolmogorov-Chaitin complexity; cellular automata; algorithmic probability; algorithmic Coding Theorem; Turing machines; Algorithmic Information Theory; Game of Life; dynamic pattern classification

∗ Source code available at: https://github.com/hzenilc/algorithmicdynamicGoL.git. An online implementation of estimations of graph complexity is also available.

Introduction
It has been proven that there are quantitative connections between indicators of algorithmic information content (or algorithmic complexity) and the chaotic behaviour of dynamical systems related to their sensitivity to initial conditions. Some of these results and the relevant references are given, for example, in [21]. Previous numerical approaches, such as the one used in [21] and others cited in the same paper, including those proposed by the authors of the landmark textbook on Kolmogorov complexity [18], make use of computable measures, in particular measures based on popular lossless compression algorithms, and suggest that non-computable approximations cannot be used in computer simulations or in the analysis of experiments. One of the aims of this paper is to prove that a new measure [28, 26, 27] based on the concept of algorithmic probability, which has been shown to be more powerful [31, 29] than computable approximations [25] such as popular lossless compression algorithms (e.g. LZW), can overcome some previous limitations and difficulties in profiling orbit complexity, difficulties particularly encountered in the investigation of the behaviour of local observations typical of computer experiments in, e.g., cellular automata research. This is because typically used popular lossless compression algorithms are closer to Shannon entropy in their operation [31] than to a measure of algorithmic complexity, and Shannon entropy is not only limited in that it can only quantify statistical regularities, but is also not robust and can easily be fooled in very simple ways [30].

The concept of
Algorithmic Information Dynamics (or simply algorithmic dynamics) was introduced in [26] and draws heavily on the theories of computability and algorithmic information. It is a calculus with which to study the change in the causal content of a dynamical system's orbits when the complex system is perturbed or unfolds over time. We demonstrate the application and utility of these methods in characterising evolving emergent patterns and interactions (collisions) in a well-studied example of a dynamical (discrete) complex system that has been proven to be very expressive by virtue of being computationally universal [6].

The purpose of algorithmic dynamics is to trace in detail the changes in algorithmic probability, estimated from local observations, produced by natural or induced perturbations in evolving open complex systems. This is possible even for partial observations that may look different but come from the same source. In general, we have only partial access in the real world to a system's underlying generating mechanism, yet from partial observations algorithmic models can be derived, and their likelihood of being the producers of the phenomena observed can be estimated.
Conway's Game of Life [7] (GoL) is a 2-dimensional cellular automaton (see Figure 8 Sup. Inf.). A cellular automaton is a computer program that applies in parallel a global rule composed of local rules on a tape of cells with symbols (e.g. binary). The local rules governing GoL are traditionally written as follows:

1. A live cell with fewer than two live neighbours dies.
2. A live cell with more than three live neighbours dies.
3. A live cell with two or three live neighbours continues to live.
4. A dead cell with exactly three live neighbours becomes a live cell.

Each of these is a local rule governing a special case, while the set of rules 1-4 constitutes the global rule defining the Game of Life.

Following [6], we call a configuration in GoL that contains only a finite number of 'alive' cells and prevails a pattern. If such a pattern occurs with high frequency we call it a motif.

For example, so-called 'gliders' are a (small) pattern that emerges in GoL with high frequency. The most frequent glider motif (see Fig. 3D) travels diagonally, advancing one cell every four steps of the automaton runtime t counted from the initial condition t = 0.

Glider collisions and interactions can produce other particles such as so-called 'blocks', 'beehives', 'blinkers', 'traffic lights', and a less common pattern known as the 'eater'. Particle collisions in cellular automata, as in high-energy particle supercolliders, have been studied before [16], demonstrating the computational capabilities of such interactions, where both annihilation and new particle production are key. Particle collision and interaction profiling may thus be key to controlling the way in which computation can happen within the cellular automaton. For example, using only gliders, one can build a pattern that acts like a finite state machine connected to two counters.
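For concreteness, the four local rules above can be implemented as a synchronous update on a set of live-cell coordinates. This is a minimal sketch, not the code released with the paper; the names `life_step` and `neighbours` are illustrative.

```python
def neighbours(cell):
    """The eight cells in the Moore neighbourhood of `cell`."""
    r, c = cell
    return {(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)}

def life_step(live):
    """One synchronous application of the global rule built from rules 1-4."""
    candidates = live | {n for cell in live for n in neighbours(cell)}
    new_live = set()
    for cell in candidates:
        k = len(neighbours(cell) & live)
        if cell in live and k in (2, 3):    # rule 3: survival
            new_live.add(cell)
        elif cell not in live and k == 3:   # rule 4: birth
            new_live.add(cell)
        # rules 1 and 2 (under- and over-population) are the implicit 'else: die'
    return new_live

# A 'blinker' is a period-2 oscillator: two updates return the initial pattern.
blinker = {(0, -1), (0, 0), (0, 1)}
assert life_step(life_step(blinker)) == blinker
```

Because the update touches only live cells and their neighbours, sparse configurations such as gliders can be evolved on an effectively unbounded grid.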
Such a finite state machine connected to two counters has the same computational power as a universal Turing machine, so using the glider the Game of Life automaton was proven to be Turing-universal, that is, as powerful as any computer with unlimited memory and no time constraints [1]. GoL is an example of a 2-dimensional cellular automaton that is not only Turing-universal but also intrinsically universal [6]. This means that the Game of Life not only computes any computable function but can also emulate the behaviour of any other 2-dimensional cellular automaton (under rescaling).

We are interested in applying some measures related to (algorithmic) information theory to track the local dynamical changes of patterns and motifs in GoL, changes that may shed light on the local but also the global behaviour of a discrete dynamical system, of which GoL is a well-known case study. To this end, we compare and apply Shannon entropy; Compress, an algorithm implementing lossless compression; and a measure related to and motivated by algorithmic probability (CTM/BDM) that has been used in other contexts with interesting results.

Shannon entropy
The entropy of a discrete random variable s with possible values s_1, ..., s_n and probability distribution P(s) is defined as:

H(s) = -\sum_{i=1}^{n} P(s_i) \log_2 P(s_i)

where P(s_i) \log_2 P(s_i) is taken to be 0 whenever P(s_i) = 0.

In the case of arrays or matrices, s is a random variable over a set of arrays or matrices according to some probability distribution (usually the uniform distribution is assumed, given that Shannon entropy per se provides no means or methods for updating P(s)).

Lossless compression algorithms have traditionally been used to approximate the Kolmogorov complexity of an object. Data compression can be viewed as a function that maps data onto other data using the same units or alphabet (if the translation is into different units or a larger or smaller alphabet, the process is called a 're-encoding' or simply a 'translation'). Compression is successful if the resulting data are shorter than the original data plus the decompression instructions needed to fully reconstruct said original data. For a compression algorithm to be lossless, there must be a reverse mapping from compressed data to the original data. That is to say, the compression method must encapsulate a bijection between 'plain' and 'compressed' data, because the original data and the compressed data should be in the same units.

A caveat about lossless compression: the most popular lossless compression algorithms, such as LZW (Gzip, PNG, Compress), which are traditionally considered approximations to algorithmic (Kolmogorov) complexity, are closer to Shannon entropy than to algorithmic complexity (which we will denote by K). This is because these popular lossless compression algorithms implement a method that traverses the object of interest looking for statistical repetitions, from which a basic grammar is produced based entirely on their frequency of appearance.
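As a small illustration of the definition above, Shannon entropy can be estimated from the empirical symbol frequencies of an observed binary configuration. This is a minimal sketch; `shannon_entropy` is an illustrative name, not a function from the paper's released code.

```python
from collections import Counter
from math import log2

def shannon_entropy(cells):
    """H(s) estimated from empirical symbol frequencies, in bits per symbol."""
    counts = Counter(cells)
    n = len(cells)
    # Symbols with zero probability never appear in `counts`, which matches
    # the convention that P(s_i) log P(s_i) = 0 when P(s_i) = 0.
    return -sum((c / n) * log2(c / n) for c in counts.values())

# A balanced binary string carries 1 bit per symbol and a constant string
# 0 bits, even though '01010101' is algorithmically about as simple as
# '00000000': entropy sees only symbol frequencies, not structure.
assert shannon_entropy("01010101") == 1.0
assert shannon_entropy("00000000") == 0.0
```

The last two assertions show the gap that motivates the algorithmic-probability approach: statistical balance and algorithmic simplicity are different properties.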
Common lossless compression algorithms consequently overlook many algorithmic aspects of data that are invisible to them because they leave no statistical mark.

Algorithmic Probability is a seminal concept in the theory of algorithmic information. The algorithmic probability of a string s is a measure that describes the probability that a valid random program p (one that is not the prefix of any other valid program) produces the string s when run on a universal Turing machine U. In equation form this can be rendered as:

m(s) = \sum_{p : U(p) = s} 1/2^{|p|}

That is, the sum over all the programs p for which U outputs s and halts.

The Algorithmic Probability [20, 14] measure m(s) is related to algorithmic complexity K(s) in that m(s) is at least the maximum term in the summation of programs, given that the shortest program carries the greatest weight in the sum. The Coding Theorem further establishes the connection between m(s) and K(s) as follows:

|-\log_2 m(s) - K(s)| < c    (1)

where c is a fixed constant independent of s. The Coding Theorem implies that [5, 17] one can estimate the algorithmic complexity of a string from its frequency by rewriting Eq. 1 as:

K_m(s) = -\log_2 m(s) + O(1)    (2)

where O(1) is a constant independent of s. One can see that it is possible to approximate K by approximations to m (such finite approximations have also been explored in [19] on integer sequences), with the added advantage that m(s) is more sensitive to small objects [5] than the traditional approach to K using lossless compression algorithms, which typically perform poorly on small objects (e.g. small patterns).

A major improvement in approximating the algorithmic complexity of strings, images, graphs and networks based on the concept of algorithmic probability (AP) offers different, more stable and more robust approximations to algorithmic complexity by way of the so-called algorithmic Coding theorem (c.f.
below). The method, called the Coding Theorem Method, suffers from some of the same drawbacks as other approximations to K, including lossless compression, related to the additive constant involved in the invariance theorem, as introduced by Kolmogorov, Chaitin and Solomonoff [13, 3, 20], which guarantees convergence towards K in the limit without the rate of convergence ever being known. The chief advantage of the algorithm is, however, that algorithmic probability (AP) [20, 14] looks not only for repetitions but also for algorithmically causal segments, such as the deterministic digits of π, without the need for strong assumptions about the underlying mass distributions.

Figure 1: The algorithmic complexity of an observation. A: Generating rule of Conway's Game of Life (GoL), a 2-dimensional cellular automaton whose global rule is composed of local rules that can be represented by the average of the values of the cells in the (Moore) neighbourhood (a property also referred to as 'totalistic' [23]). B: 3D space-time representation of successive configurations of GoL after 30 steps. C: Projected slice window w of an observation of the evolution of B, the last step of GoL.

As illustrated in Figure 1, an isolated observation window does not contain all the algorithmic information of an evolving system. In particular, it may not contain enough information to infer the set of local generating rules, and hence the global rule, of a deterministic system (Figure 1A). So in practice the phenomena in the window appear to be driven by external processes that are random for all practical purposes, while others can be explained by interacting/evolving local patterns in space and time (Figure 1C). This means that even though GoL is a fully deterministic system, and thus its algorithmic complexity K can only grow by log(t) (Figure 1B), one can meaningfully estimate
K(w) of a cross-section w (Figure 1C) of an orbit of a deterministic system like GoL and study its algorithmic dynamics (the change of K(w) over time).

The method studied and applied here was first defined in [24, 29], and is in many respects independent of the observer to the greatest possible extent. For example, unlike popular implementations of lossless compression used to approximate algorithmic complexity (such as LZW), the method based on Algorithmic Probability averages over a large number of computer programs found to accurately (without loss of any information) reproduce the output, thus making the problem of the choice of enumeration less relevant, as against the more arbitrary choice of a particular lossless compression algorithm, especially one that is mostly a variation of limited measures such as Shannon entropy. The advantage of the measure of graph algorithmic complexity is that when it diverges from algorithmic complexity (because it requires greater computational power) it can only behave as poorly as Shannon entropy [29], but any behaviour divergent from Shannon entropy can only be an improvement on entropy and a more accurate estimation of the actual information contained in the object, based on local calculations of algorithmic complexity.

The
Coding Theorem Method (CTM) [5, 17] is rooted in the relation established by Algorithmic Probability between the frequency of production of a string by a random program and the string's Kolmogorov complexity (Eq. 1, also called the algorithmic
Coding theorem, in contrast with the Coding theorem of classical information theory). Essentially, it exploits the fact that the more frequent a string (or object), the lower its algorithmic complexity, while strings of lower frequency have higher algorithmic complexity. As has been said, BDM actually calculates Shannon entropy combined with better, local estimations of algorithmic complexity.

The approach adopted here consists in determining the algorithmic complexity of a matrix by quantifying the likelihood that a random Turing machine operating on a 2-dimensional tape can generate it and halt. The
Block Decomposition Method (BDM) then decomposes the matrix into smaller matrices for which we can numerically calculate the algorithmic probability, by running a large set of small 2-dimensional deterministic Turing machines, and, upon application of the algorithmic Coding theorem, its algorithmic complexity. The overall complexity of the original matrix is then the sum of the complexity of its parts, albeit with a logarithmic penalisation for repetitions, given that n repetitions of the same object add only log n to its overall complexity, as one can simply describe a repetition in terms of the multiplicity of the first occurrence. More formally, the approximation to the Kolmogorov complexity of a matrix G is defined as follows:

K_{BDM}(G, d) = \sum_{(r_u, n_u) \in A(G)_{d \times d}} \log_2(n_u) + CTM(r_u)    (3)

where CTM(r_u) is the approximation of the algorithmic (Kolmogorov-Chaitin) complexity of the subarrays r_u arrived at by using the algorithmic Coding theorem (Eq. 2), and A(G)_{d \times d} represents the set with elements (r_u, n_u) obtained when decomposing the matrix G into non-overlapping squares of size d by d. In each (r_u, n_u) pair, r_u is one such square and n_u its multiplicity (number of occurrences). From now on K_{BDM}(G, d = 4) will be denoted simply by K(G), but it should be taken as an approximation to K(G) unless otherwise stated (e.g.
when speaking of the theoretical true K(G) value).

The only parameters used for BDM, as suggested in [29], were a maximum block size of 12 for strings and 4 for arrays, given the current best CTM approximation [17], based on an empirical distribution over all Turing machines with up to 5 states, with no string/array overlapping in the decomposition for maximum efficiency (BDM runs in linear time), and for which the error (due to boundary conditions) is bounded [29].

An advantage of these algorithm-based measures is that the 2-dimensional versions of both CTM and BDM are native bidimensional measures of complexity and thus do not destroy the 2-dimensional structure of a matrix. This is achieved by generalising the algorithmic Coding theorem to 2-dimensional Turing machines. In this way we can define the probability of production of a matrix as the result of a randomly chosen deterministic 2-dimensional-tape Turing machine, without any transformation of the array into a string that would make it dependent on an arbitrary mapping.

Figure 2A suggests that highly symmetric patterns/motifs that produce about the same number of black and white pixels and look similar to Entropy (small standard deviation) can actually have more complex shapes than those collapsed by Entropy alone. Similar results were obtained before and after normalising by pattern size (length × width). The symmetries considered include those of the square dihedral group D4, i.e. invariance under rotations and reflections.
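The decomposition in Eq. 3 can be sketched in a few lines. The CTM values below are a hypothetical stand-in table (real CTM values come from the precomputed output-frequency distributions of small 2-dimensional Turing machines), the toy block size is d = 2 rather than the d = 4 used in the paper, and `blocks`, `bdm` and `ctm` are illustrative names.

```python
from collections import Counter
from math import log2

def blocks(matrix, d):
    """Non-overlapping d x d sub-matrices of `matrix`, as hashable tuples."""
    rows, cols = len(matrix), len(matrix[0])
    for i in range(0, rows - d + 1, d):
        for j in range(0, cols - d + 1, d):
            yield tuple(tuple(matrix[i + k][j:j + d]) for k in range(d))

def bdm(matrix, ctm, d=2):
    """Eq. 3: sum CTM(r_u) + log2(n_u) over distinct blocks r_u of
    multiplicity n_u, so repetitions are penalised only logarithmically."""
    counts = Counter(blocks(matrix, d))
    return sum(ctm[block] + log2(n) for block, n in counts.items())

# Hypothetical CTM value for one 2 x 2 block (illustration only):
ctm = {((0, 0), (0, 0)): 3.3}
flat = [[0] * 4 for _ in range(4)]
# Four repetitions of one simple block add only log2(4) = 2 bits:
assert abs(bdm(flat, ctm) - (3.3 + 2.0)) < 1e-9
```

The same decomposition applied to a matrix with many distinct blocks accumulates one CTM term per block, which is what lets BDM grow with genuine structural variety rather than with sheer size.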
Shannon entropy characterises the highest symmetry as having the lowest randomness, but both lossless compression and algorithmic probability (BDM) suggest that highly symmetric shapes can also reach higher complexity.

The distribution of motifs (the 100 most frequent local persistent patterns, also called ash, as they are the debris of random interactions) in GoL is reported in Fig. 2B, obtained by starting from 1 829 196 random seeds (in a torus configuration) with initial density 0.375 black cells over a grid of size 2048 × 2048, showing a statistically significant simplicity bias in the distribution of these motifs.

Algorithmic probability may not account for a greater percentage of the deviation from a uniform or normal distribution because patterns are filtered by persistence, i.e. only persistent patterns are retained after an arbitrary runtime step, and therefore no natural halting state exists, likely producing a difference in distribution as reported in [31], where distributions from halting and non-halting models of computation were studied and analysed. Values of algorithmic probability for some motifs (the top and bottom 20 motifs in GoL) are given in Figure 9.

On the other hand, as plotted in Figure 2B-D, the frequency and algorithmic complexity of the patterns in GoL follow a rank distribution and are negatively correlated.

Figure 2: A: Classical and algorithmic measures versus symmetries of the top 100 most frequent patterns (hence motifs) in GoL. The measures show diverse (and similar) abilities to separate patterns with the highest and lowest numbers of symmetries. Notation for the square dihedral group D4: invariant to all possible rotations (*), to all reflections (+), to 2 rotations only (X), to 2 reflections (/), to 1 rotation (:) and to 1 reflection (.). B: The heavily long-tailed distribution of local persistent patterns in GoL (of less than 10 × 10 pixels) from the 100 most frequent emerging patterns, and (C and D) of most-likely still and periodic structures.

Figure 3: A: Algorithmic probability approximation of local GoL orbits by BDM on evolving patterns of size 3 × 3.

While each pattern in GoL evolving in time t comes from the same generating global rule, for which K(GoL(t)) is fixed (up to log(t), corresponding to the binary encoding of the runtime step), a pattern within an observational window (Fig. 1) that does not necessarily display the action of all the local rules of the global rule can be regarded as an (open) system separate from the larger system governed by the global rule. This is similar to what happens in the practice of understanding real-world complex systems, to which we only have partial access.

Figure 4: A, B and C: 3 possible collisions showing 2-particle annihilation (A), stability (B) and instability, i.e. production of new particles (C). D: The algorithmic information dynamics of a 2-particle stable collision.

Figure 5: Orbit algorithmic dynamics of local emergent patterns in GoL. Compress (A) and Entropy (B) retrieve very noisy results compared to BDM (C), which converges faster and separates the dynamic behaviour of all emerging patterns in GoL of size 4 × 4.

The observational sliding window consists of n × m cells from a 2D cross-section of the 3D evolution of GoL, as shown in Figure 1. In most cases n = m. The size of n and m is determined by the size of the pattern of interest, with the sliding window following the unfolding pattern. The values of n or m may increase if the pattern grows, but never decrease, even if the pattern disappears. Each line in all plots corresponds to the algorithmic dynamics (complexity change) of the orbit of a local pattern in GoL, unless otherwise stated (e.g. in collapsed cases).
Figure 3, for example, demonstrates how the algorithmic probability approach implemented by BDM can capture dynamical changes even for small patterns, where lossless compression may fail because it is limited to statistical regularities that are not always present. For example, in Figure 3A, BDM captures the periodic/oscillating behaviour (period 2) of a small pattern, something that compression, as an approximation to algorithmic complexity, was unable to capture for the same motifs in Figure 3B. Likewise, the BDM approximation to algorithmic complexity captures the periodic behaviour of the glider in Figure 3D for 10 steps.

Figure 6: Orbit complexity profiling. A and B: collapsing all the simplest cases (1, 2 and 4) to the bottom, closest-to-zero values, diverging from the only open-ended case (3). A: The measure BDM returns the best separation compared to Entropy. C: 16 steps corresponding to the evolving steps of the 4 cases captured in A and B. D: The algorithmic information dynamics of 3-particle interactions/collisions. The unstable collision corresponds to Figure 4D, the 3-particle annihilation is qualitatively similar to the 2-particle case of Figure 4A, and the near-miss stable collision corresponds to Figure 4B, where the 4 particles look as if about to collide but appear not to (hence a 'near miss'). Starting seeds are shown in Figure 10 Sup. Inf.

Figure 7: A: Collapsed cases suggesting clusters of dynamical-system attractors of colliding gliders in GoL. B: Density plot of all non-trivial (particles not entirely annihilated) qualitative interactions among 4 particles. The darker, the later and the more persistent in time.

Figures 5A and B illustrate cases of diagonal particle (glider) collisions. In a slightly different position, the same 2 particles can produce a single still pattern, as shown in Figure 5D, that reaches a maximum of complexity when new particles are produced, thereby profiling the collision as a transition between a dynamic and a still configuration.
In Figure 5A the particles annihilate each other after a short transition through different configurations. In Figure 5B the collision of 4 gliders produces a stable non-empty configuration of still particles after a short transition of slightly more complicated interactions. We call this interaction a 'near-miss' because the particles seem to have missed each other even though there is an underlying interaction. In Figure 5C, an unstable collision, characterised by the open-ended number of new patterns evolving over time in a growing window, can also be characterised by its algorithmic dynamics using BDM, as shown in Figure 6D and marked as an unstable collision.

More cases, both trivial and non-trivial, are shown in Figures 5 and 6A and B. Figure 5 shows seven other cases of evolving motifs starting from different initial conditions in small sliding grid windows of size up to 4 × 4.

We traced the evolution of collisions of so-called gliders. Figure 4 Sup. Inf. shows concrete examples of particle collisions of gliders in GoL and the algorithmic dynamic characterisation of one such interaction, and Figure 7A illustrates all cases for a sliding window of up to size 17 ×
17, where all cases for up to 4 colliding gliders are reported, analysed and classified by different information-theoretic indexes, including compression as a typical estimator of algorithmic complexity and BDM as an improvement on both Shannon entropy alone and typical lossless compression algorithms. The results show that the cases can be classified into a few categories corresponding to the qualitative behaviour of the possible outcomes of up to 4-particle collisions.

Figure 6D summarises the algorithmic dynamics of different collisions, and for all cases with up to 4 gliders in Figure 7A, by numerically producing all collisions but collapsing cases of similar behaviour into qualitatively different classes, as shown in the density plots in Figure 7B. The interaction of colliding particles is characterised by their algorithmic dynamics: the algorithmic probability estimated by BDM remains constant in the case in which the 4 particles prevail, collapses to 0 in the annihilation case, and diverges for the unstable collision that produces more particles.
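The profiling just described reduces to simple bookkeeping: estimate the complexity of the observation window at each step and record its change over the orbit. In this sketch the entropy of the window's 2 × 2 blocks serves as a cheap, computable stand-in for the BDM estimate, and all names are illustrative; the qualitative signatures are constancy for the stable case, collapse for annihilation and growth for the unstable case.

```python
from collections import Counter
from math import log2

def window_complexity(window):
    """Entropy (bits) of the distribution of 2 x 2 blocks in the window;
    used here as a computable stand-in for the BDM estimate."""
    blks = Counter(
        (window[i][j], window[i][j + 1], window[i + 1][j], window[i + 1][j + 1])
        for i in range(len(window) - 1)
        for j in range(len(window[0]) - 1))
    n = sum(blks.values())
    return -sum((c / n) * log2(c / n) for c in blks.values())

def dynamics(orbit):
    """Complexity profile C(w_t) over an orbit, plus its step-to-step change."""
    cs = [window_complexity(w) for w in orbit]
    return cs, [b - a for a, b in zip(cs, cs[1:])]

# A period-2 oscillator observed through a fixed 4 x 4 window: the profile
# alternates between two values, returning to its start after one full period.
horizontal = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 1, 1], [0, 0, 0, 0]]
vertical   = [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0]]
profile, change = dynamics([horizontal, vertical, horizontal])
assert abs(profile[0] - profile[2]) < 1e-9
```

Replacing `window_complexity` with a BDM estimator (and feeding in windows produced by the GoL update) yields the kind of per-orbit complexity curves summarised in the figures.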
Conclusions

We have explained how observational windows can be regarded as apparently open systems, even though they come from a closed deterministic system D(t) for which the algorithmic complexity K(D) cannot differ by more than log(t) over time t, a (mostly) fixed algorithmic complexity value. However, in local observations patterns seem to emerge and interact, generating algorithmic information as they unfold, requiring different local rules and revealing the underlying mechanisms of the larger closed system.

We have shown the different capabilities that classical information and algorithmic complexity (the former represented by the lossless compression algorithm Compress, the latter based on algorithmic probability) display in the characterisation of these objects, and how they can be used and exploited to track changes and analyse their spatial dynamics.

We have illustrated the way in which the methods and tools of algorithmic dynamics can be used to measure the algorithmic information dynamics of discrete dynamical systems, in particular of emerging local patterns (particles) and interacting objects (such as colliding particles), as exemplified in a much-studied 2-dimensional cellular automaton.

Acknowledgements

H.Z. was supported by Swedish Research Council (Vetenskapsrådet) grant No. 2015-05299.
References

[1] E.R. Berlekamp, J.H. Conway, R.K. Guy, Winning Ways for your Mathematical Plays (2nd ed.), A K Peters Ltd., 2004.
[2] A.H. Brady, The determination of the value of Rado's noncomputable function Σ(k) for four-state Turing machines, Mathematics of Computation, 40(162): 647–665, 1983.
[3] G.J. Chaitin, On the length of programs for computing finite binary sequences: Statistical considerations, Journal of the ACM, 16(1): 145–159, 1969.
[4] T.M. Cover, J.A. Thomas, Elements of Information Theory, Wiley Series in Telecommunications and Signal Processing, Wiley-Blackwell, 2nd ed., 2006.
[5] J.-P. Delahaye, H. Zenil, Numerical Evaluation of the Complexity of Short Strings: A Glance Into the Innermost Structure of Algorithmic Randomness, Applied Mathematics and Computation, 2012.
[6] B. Durand, Zs. Róka, The Game of Life: universality revisited, Research Report No 98-01, 1998.
[7] M. Gardner, Mathematical Games: The fantastic combinations of John Conway's new solitaire game 'life', Scientific American, 223: 120–123, 1970.
[8] N. Gauvrit, H. Singmann, F. Soler-Toscano, H. Zenil, Algorithmic complexity for psychology: A user-friendly implementation of the coding theorem method, Behavior Research Methods, 48(1): 1–16, 2015.
[9] N. Gauvrit, F. Soler-Toscano, H. Zenil, Natural scene statistics mediate the perception of image complexity, Visual Cognition, 22(8): 1084–1091, 2014.
[10] S.W. Golomb, Polyominoes (2nd ed.), Princeton University Press, Princeton, New Jersey, 1994.
[11] V. Kempe, N. Gauvrit, D. Forsyth, Structure emerges faster during cultural transmission in children than in adults, Cognition, 136: 247–254, 2015.
[12] W. Kircher, M. Li, P. Vitányi, The Miraculous Universal Distribution, The Mathematical Intelligencer, 1997.
[13] A.N. Kolmogorov, Three approaches to the quantitative definition of information, Problems of Information and Transmission, 1(1): 1–7, 1965.
[14] L. Levin, Laws of information conservation (non-growth) and aspects of the foundation of probability theory, Problems of Information Transmission, 10(3): 206–210, 1974.
[15] T. Radó, On non-computable functions, Bell System Technical Journal, 41(3): 877–884, 1962.
[16] G.J. Martínez, A. Adamatzky, H.V. McIntosh, A Computation in a Cellular Automaton Collider Rule 110, in A. Adamatzky (ed.), Advances in Unconventional Computing, pp. 391–428, Springer Verlag, 2016.
[17] F. Soler-Toscano, H. Zenil, J.-P. Delahaye, N. Gauvrit, Calculating Kolmogorov Complexity from the Output Frequency Distributions of Small Turing Machines, PLoS ONE, 9(5): e96223, 2014.
[18] M. Li, P. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, 3rd ed., Springer, 2008.
[19] F. Soler-Toscano, H. Zenil, A Computable Measure of Algorithmic Probability by Finite Approximations with an Application to Integer Sequences, Complexity, vol. 2017, Article ID 7208216, 2017.
[20] R.J. Solomonoff, A formal theory of inductive inference: Parts 1 and 2, Information and Control, 7: 1–22 and 224–254, 1964.
[21] V. Benci, C. Bonanno, S. Galatolo, G. Menconi, M. Virgilio, Dynamical systems and computable information, Discrete & Continuous Dynamical Systems - B, 4(4): 935–960, 2004.
[22] E.W. Weisstein, Polyomino, MathWorld: A Wolfram Web Resource, http://mathworld.wolfram.com/Polyomino.html
[23] S. Wolfram, A New Kind of Science, Wolfram Media, Champaign, IL, 2002.
[24] H. Zenil, F. Soler-Toscano, J.-P. Delahaye, N. Gauvrit, Two-Dimensional Kolmogorov Complexity and Validation of the Coding Theorem Method by Compressibility, PeerJ Computer Science, 1: e23, 2015.
[25] H. Zenil, Algorithmic Data Analytics, Small Data Matters and Correlation versus Causation, in M. Ott, W. Pietsch, J. Wernecke (eds.), Berechenbarkeit der Welt? Philosophie und Wissenschaft im Zeitalter von Big Data, Springer Verlag, Heidelberg, 2017.
[26] H. Zenil, N.A. Kiani, F. Marabita, Y. Deng, S. Elias, A. Schmidt, G. Ball, J. Tegnér, An Algorithmic Information Calculus for Causal Discovery and Reprogramming Systems, bioRxiv, https://doi.org/10.1101/185637
[27] H. Zenil, F. Soler-Toscano, K. Dingle, A. Louis, Correlation of automorphism group size and topological properties with program-size complexity evaluations of graphs and complex networks, Physica A: Statistical Mechanics and its Applications, 404: 341–358, 2014.
[28] H. Zenil, N.A. Kiani, J. Tegnér, Methods of Information Theory and Algorithmic Complexity for Network Biology, Seminars in Cell and Developmental Biology, 51: 32–43, 2016.
[29] H. Zenil, F. Soler-Toscano, N.A. Kiani, S. Hernández-Orozco, A. Rueda-Toicen, A Decomposition Method for Global Evaluation of Shannon Entropy and Local Estimations of Algorithmic Complexity, arXiv:1609.00110 [cs.IT].
[30] H. Zenil, N.A. Kiani, J. Tegnér, Low Algorithmic Complexity Entropy-deceiving Graphs, Physical Review E, 96, 012308, 2017.
[31] H. Zenil, L. Badillo, S. Hernández-Orozco, F. Hernández-Quiroz, Coding-theorem Like Behaviour and Emergence of the Universal Distribution from Resource-bounded Algorithmic Probability, International Journal of Parallel, Emergent and Distributed Systems (accepted).

Supplementary Information