[PDF] Percolation theory of self-exciting temporal processes

Abstract

We investigate how the properties of inhomogeneous patterns of activity, appearing in many natural and social phenomena, depend on the temporal resolution used to define individual bursts of activity. To this end, we consider time series of microscopic events produced by a self-exciting Hawkes process, and leverage a percolation framework to study the formation of macroscopic bursts of activity as a function of the resolution parameter. We find that the very same process may result in different distributions of avalanche size and duration, which are understood in terms of the competition between the 1D percolation and the branching process universality class. Pure regimes for the individual classes are observed at specific values of the resolution parameter corresponding to the critical points of the percolation diagram. A regime of crossover characterized by a mixture of the two universal behaviors is observed in a wide region of the diagram. The hybrid scaling appears to be a likely outcome for an analysis of the time series based on a reasonably chosen, but not precisely adjusted, value of the resolution parameter.

Full PDF

PPercolation theory of self-exciting temporal processes

Daniele Notarmuzi, Claudio Castellano, Alessandro Flammini, Dario Mazzilli, and Filippo Radicchi ∗ Center for Complex Networks and Systems Research,Luddy School of Informatics, Computing, and Engineering,Indiana University, Bloomington, Indiana 47408, USA Istituto dei Sistemi Complessi (ISC-CNR), Via dei Taurini 19, I-00185 Roma, Italy

We investigate how the properties of inhomogeneous patterns of activity, appearing in many natural and socialphenomena, depend on the temporal resolution used to deﬁne individual bursts of activity. To this end, we con-sider time series of microscopic events produced by a self-exciting Hawkes process, and leverage a percolationframework to study the formation of macroscopic bursts of activity as a function of the resolution parameter.We ﬁnd that the very same process may result in di ﬀ erent distributions of avalanche size and duration, whichare understood in terms of the competition between the 1D percolation and the branching process universalityclass. Pure regimes for the individual classes are observed at speciﬁc values of the resolution parameter cor-responding to the critical points of the percolation diagram. A regime of crossover characterized by a mixtureof the two universal behaviors is observed in a wide region of the diagram. The hybrid scaling appears to be alikely outcome for an analysis of the time series based on a reasonably chosen, but not precisely adjusted, valueof the resolution parameter. Inhomogeneous patterns of activity, characterized by burstsof events separated by periods of quiescence, are ubiquitousin nature [1]. The ﬁring of neurons [2, 3], earthquakes [4], en-ergy release in astrophysical systems [5] and spreading of in-formation in social systems [6–8] exhibit bursty activity, withintensity and duration of bursts obeying power-law distribu-tions [2, 3, 7].If activity consists of point-like events in time, size andduration of bursts are obtained from the inter-event time se-quence. The analysis of many systems [4, 6, 7, 9, 10] revealsthat the inter-event time between consecutive events has a fat-tailed distribution [4, 6, 7]. This distribution appears more re-liable for the characterization of correlation in bursty systemsthan other traditional measures, e.g., the autocorrelation func-tion [7, 11, 12]. However, the relation between autocorrelationand burst size distribution is opaque. Further complicationsarise as the separation between di ﬀ erent bursts is not clear-cut. In discrete time series, avalanches of correlated activityare monitored by coarsening the time series at a ﬁxed tempo-ral scale, and correlations are measured by assigning events tothe same burst if their inter-event time is smaller than a giventhreshold [7]. The threshold is set equal to some arbitrar-ily chosen value and / or imposed by the temporal resolutionat which empirical data are acquired, despite its potential ofa ﬀ ecting the properties of the resulting distributions [13–19].The purpose of the present letter is to understand the re-lation between temporal resolution and burst statistics. Weintroduce a principled technique to determine the value of thetime resolution that should be used to deﬁne avalanches fromtime series. We validate the method on time series generatedaccording to an Hawkes process [20], a model of autocorre-lated behavior used for the description of earthquakes [21, 22],neuronal networks [23], and socio-economic systems [24, 25].The use of the Hawkes process a ﬀ ords us a complete controlover the mechanism that generates correlations and the possi-bility to attack the problem analytically.We start by deﬁning a cluster of activity consistently with the informal notion of a burst composed of close-by events.Data are represented by K total events { t , . . . , t K } , where t i is the time of appearance of the i -th event. We ﬁx a res-olution parameter ∆ ≥ t b is given by the S consecutive events { t b , t b + , . . . , t b + S − } such that t b − t b − > ∆ , t b + S − t b + S − > ∆ ,and t b + i − t b + i − ≤ ∆ for all i = , . . . , S . We assume t = −∞ and t K + = + ∞ , implying that the ﬁrst and the last eventsopen and close a cluster, respectively. We deﬁne the size S as the number of events within the cluster, and its duration as T = t b + S − − t b , i.e., the time lag between the ﬁrst and lastevent in the cluster.If ∆ is larger than the largest inter-event time, then we havea single cluster of size K and duration t K − t . On the otherhand, if ∆ is smaller than the smallest inter-event time, eachevent is a cluster of size 1 and duration 0. As in 1D percolationproblems [26], we expect for an intermediate value ∆ = ∆ ∗ atransition from the non-percolating to the percolating phase.What can we learn from the percolation diagram of the timeseries? Does ﬁxing ∆ = ∆ ∗ allow us to observe properties ofthe process otherwise not apparent?We address the above questions in a controlled settingwhere we generate time series via an Hawkes process [20, 27]with conditional rate λ ( t | t , . . . , t k ) = µ + n k (cid:88) i = φ ( t − t i ) . (1)The rate depends on the k earlier events happened at times t ≤ t ≤ . . . ≤ t k ≤ t . The ﬁrst term in Eq. (1) producesspontaneous events at rate µ ≥

0. The second term consistsof the sum of individual contributions from each earlier event,with the i -th event happened at time t i ≤ t increasing the rateby φ ( t − t i ). φ ( x ) is the excitation or kernel function of theself-exciting process, and it is assumed to be non-negative andmonotonically non-increasing. Typical choices for the ker-nel are exponential or power-law decaying functions. We willconsider both cases. In Eq. (1), we assume (cid:82) ∞ φ ( x ) dx = a r X i v : . [ phy s i c s . s o c - ph ] F e b . . . a K = 10 K = 10 K = 10 K = 10 K = 10 − . . . b − − . . . c − ∆ / ∆ ∗ − ∆ / ∆ ∗ − ∆ / ∆ ∗ − ∆ / ∆ ∗ Resolution parameter ∆ P e r c o l a t i on s t r eng t h P ∞ Figure 1. Percolation phase diagrams of self-exciting temporal processes. We plot the percolation strength P ∞ as a function of the resolutionparameter ∆ for various conﬁgurations of the rate of Eq. (1), with exponential kernel function and various system sizes K . Average values areobtained by considering R = realizations of the process. (a) We set n = µ =

1. The inset shows the same data as of the main withabscissa rescaled as ∆ / ∆ ∗ . (b) We set n = µ = − . The insets display the same data as of the main, but with rescaled abscissa, ∆ / ∆ ∗ in the lower inset, and ∆ / ∆ ∗ in the upper inset. (c) We set n = µ = . The inset shows the same data as of the main, but with abscissarescaled as ∆ / ∆ ∗ . so that the memory term is weighted by the single parameter n ≥

0. Unless otherwise stated, we always set n =

1, cor-responding to the critical dynamical regime of the temporalpoint process described by Eq. (1) [21].The percolation framework allows us to characterize thegeneric Hawkes process of Eq. (1) using ﬁnite-size scalinganalysis [26] (SM, sec. C). The total number K of events inthe time series is the system size. For a given value of K ,we generate multiple time series and compute the percola-tion strength P ∞ , i.e., the fraction of events belonging to thelargest cluster, and the associated susceptibility (SM, sec. C).By studying the behavior of these macroscopic observablesas K grows, we estimate the values of the thresholds and thecritical exponents.Let us start with the case n = µ . The genericinter-event time x i = t i − t i − is a random variate distributedas P ( x i ) = µ e − µ x i . Two consecutive events are part ofthe same cluster with probability P ( x i ≤ ∆ ) = − e − µ ∆ ,which is independent of the index i and represents an e ﬀ ec-tive bond occupation probability in a homogeneous 1D perco-lation model [26, 28]. For ﬁnite K values, P ∞ sharply growsfrom 0 to 1 around the pseudo-critical point ∆ ∗ ( K ) = log( K ) /µ (SM, sec. D). Finite-size scaling analysis indicates that thetransition is discontinuous, as expected for 1D ordinary perco-lation [26]. We note that the distributions of cluster size P ( S )and duration P ( T ) are exactly described by the 1D percolationtheory [26] (SM, sec. D). They are the product of a power-lawfunction and a fast-decaying scaling function accounting forthe system ﬁnite size [28]. In this speciﬁc case, the scalingfunctions contain a multiplicative term that exactly cancels thepower-law term of the distribution. Therefore, the distribu-tions have exponential behavior at ∆ = ∆ ∗ . A clear signatureof criticality is manifest in the relation between size and dura-tion, (cid:104) S (cid:105) ∼ T , in agreement with the relation (cid:104) S (cid:105) ∼ T ( α − / ( τ − (SM, sec. D).We now consider the Hawkes process of Eq. (1) with expo-nential kernel φ ( x ) = e − x [27, 29]. Results of our ﬁnite-sizescaling analysis are reported in Figures 1b and 1c, for µ (cid:28) µ (cid:29)

1, respectively.For µ (cid:28)

1, the phenomenology is rich, with two distincttransitions at ∆ ∗ < ∆ ∗ , respectively. Around the critical point ∆ ∗ , the system is characterized by a behavior compatible withthe universality class of 1D percolation, i.e, the same as ofthe homogeneous Poisson process. Both P ( S ) and P ( T ) dis-play power-law decays at ∆ ∗ , with exponent values τ = α = ∆ ∗ ( K ) (cid:39) log( K ) / (cid:104) λ (cid:105) = log( K ) / (cid:16) µ + (cid:112) K µ (cid:17) , thusleading to a vanishing critical point in the thermodynamiclimit (SM, sec. E). (cid:104) λ (cid:105) is the expectation value, over an in-ﬁnite number of realizations of the process, of the rate after K events have happened; the estimate of the critical point ∆ ∗ ( K ) is thus obtained using the same exact equation as fora homogeneous Poisson process with e ﬀ ective rate (cid:104) λ (cid:105) . Theother transition at ∆ ∗ ( K ) = log( K ) /µ , which tends to inﬁniteas K grows, corresponds to the merger of the whole time se-ries into one cluster; its features are compatible with thoseof the universality class of the mean-ﬁeld branching process,i.e., τ = / α =

2. The region of the phase diagram[ ∆ ∗ ( K ) , ∆ ∗ ( K )], which is expanding as K increases, is char-acterized by critical behavior. While the percolation strengthplateaus at P ∞ (cid:39) − / e (cid:39) .

63, the susceptibility is largerthan zero. Furthermore, the distribution P ( S ) displays a neatcrossover between the regime τ = S and theregime τ = / S (Figure 2a).For µ (cid:29)

1, the phase diagram displays a single transition(Figure 1c), with features identical to those described for thecase µ (cid:28) ∆ ∗ : no crossover is present, and the criticalexponents of the distributions P ( S ) and P ( T ) are τ = α = λ ( t ) ∼ t (SM, Sec. K).The two di ﬀ erent behaviors observed for µ (cid:28) µ (cid:29) µ (cid:28) /µ (cid:29)

1, con-secutive bursts are well separated one from the other. Increas-ing ∆ , the system exhibits ﬁrst a transition ”within bursts” at ∆ = ∆ ∗ , corresponding to the merger of events within the sameburst, and then a transition ”across bursts” at ∆ = ∆ ∗ , corre-sponding to the merger of consecutive bursts of activity. For µ (cid:29)

1, all events belong to a unique burst of self-excitation.The time scale of spontaneous activity is equal or smaller thanthe one due to self-excitation. Thus, although the memorydecays exponentially fast, a new spontaneous event re-excitesthe process quickly enough to allow the burst to proceed its ac-tivity uninterrupted. The burst is truncated in the simulationsdue to the ﬁxed size K of the time series. As ∆ increases, allevents of the single burst are merged into a single cluster. Thetransition is therefore of the same type as the one observedwithin bursts at ∆ = ∆ ∗ in the case µ (cid:28) − − a ∆ = 10 − ∆ ∗ ∆ = ∆ ∗ ∆ = 10 / ∆ ∗ ∆ = 10 ∆ ∗ S − S − / S − S − / − − b ∆ = ∆ ∗ /

2∆ = ∆ ∗ ∆ = 2∆ ∗ ∆ = 5∆ ∗ S − S − / S − S − / − − − c T − − − − − d T − Cluster duration T Cluster size S P r obab ili t y d i s t r i bu t i on Figure 2. Critical properties of self-exciting temporal processes. Weconsider processes generated from the rate of Eq. (1) with exponen-tial kernel and n =

1. System size K = . Histograms obtainedby considering C = clusters per conﬁguration. (a) Cluster sizedistribution for µ = − . (b) Cluster size distribution for µ = .(c) Cluster duration distribution for the same data as in panel a. (d)Cluster duration distribution for the same data as in panel b. We can separately study the transitions within and acrossbursts. To this end, we simplify the actual process of Eq. (1)by setting µ = ﬀ springs) obeying a Poisson distribution with ex-pected value equal to n , the parameter appearing in Eq. (1).Time is assigned as follows. The ﬁrst event happens at an ar-bitrary time t , say for simplicity t =

0. Then each of the fol-lowing events has associated a time equal to the time of its par-ent plus a random variate x extracted from the kernel function φ ( x ) of Eq. (1). The mapping to the BP o ﬀ ers an alternative(on average statistically equivalent) way of generating time se-ries for the self-exciting process of Eq. (1). We ﬁrst generate aBP tree, and then associate a time to each event of the tree ac-cording to the rule described above. The time t associated to ageneric event of the g -th generation is distributed according toa function P ( t | g ). For the exponential kernel function, P ( t | g )is the sum of g exponentially distributed variables, i.e., the Er-lang distribution with rate equal to 1, P ( t | g ) = t g − e − t / ( g − ab t t t t t t t t Generation index / time N u m be r o f e v en t s /I s t an t aneou s r a t e T TT N g ˜ ( t | N ,...,N G ) b TimeTime series t t t t t t t t First generation t t t Second generation t t t t Third generation t t t t t Figure 3. Latent tree structure of self-exciting temporal processes.(a) Each event in the time series on the left is associated to a parentnode. On the right, the branching tree corresponding to the timeseries. Each node is assigned to a generation, and each bond hasassociated an inter-event time. If N g is the number of nodes in the g -th generation, the depicted tree has { N = , N = , N = , N = , . . . } . (b) The mapping of panel a allows us to associate to the tree { N , . . . , N G } (blue curve) an inhomogeneous Poisson process withinstantaneous rate ˜ λ ( t | N , . . . , N G ) (orange). Such a process generatestime series statistically equivalent to those generated by an Hawkesprocess with latent tree structure { N , . . . , N G } . The inverse resolutionparameter ∆ − (dashed black line) is an e ﬀ ective threshold for thePoisson process ˜ λ ( t | N , . . . , N G ). As a result, size and duration ofclusters are related by Eq. (2). The shaded areas denote the two termsappearing on the rhs of Eq. (2). The mapping of the self-exciting process to a BP allows usto fully understand the numerical ﬁndings of Figures 1 and 2.For n = P ( Z ) ∼ Z − / and the distribution of the tree depth is P ( D ) ∼ D − . Individual bursts of activity, as seen for su ﬃ ciently high ∆ values and µ (cid:28)

1, obey this statistics. Speciﬁcally, the sizeof each burst S is exactly the size Z of the tree. The averageduration of the bursts (cid:104) T (cid:105) ∼ D , as expected for the sum of iidexponentially distributed random variates. For ∆ ∈ [ ∆ ∗ , ∆ ∗ ], P ∞ of Figure 1b follows the same statistics as the maximumvalue of a sample of variables extracted from the distribution P ( Z ) ∼ Z − / divided by their sum, and the average value ofthe ratio plateaus at 1 − / e for su ﬃ ciently large sample sizes,fully explaining the results of Figure 1 (SM, sec. H).The behavior at ∆ = ∆ ∗ and the crossover towards thestandard BP regime for larger ∆ are due to a threshold phe-nomenon. This directly follows from the abrupt nature ofthe percolation transition of the Poisson process (Figure 1a).Given the latent branching tree { N , N , . . . , N g , . . . , N G } ,where N g indicates the number of events of the g -th generationof the tree, the time series of the Hawkes process is statisti-cally equivalent to the one of the inhomogeneous Poisson pro-cess with instantaneous rate ˜ λ ( t | N , . . . , N G ) = (cid:80) g P ( t | g ) N g .Hence, for a given ∆ , as long as ˜ λ ( t | N , . . . , N G ) > / ∆ , allevents are part of the same cluster of activity; when instead˜ λ ( t | N , . . . , N G ) < / ∆ , then events around time t belong toseparate clusters of activity. As a consequence, the total num-ber of events S T that form a cluster of activity of duration T is the integral of the curve ˜ λ ( t | N , . . . , N G ) in the time intervalwhen the rate is above ∆ − (Figure 3b). We repeat a similarcalculation as in Ref. [19]. The integral can be split in twocontributions, one corresponding to the area of the order of T appearing above the threshold line, as expected for a criti-cal BP [19, 30], and the other corresponding to the area ∆ − T appearing below threshold, S T ∼ T + ∆ − T . (2)While the distribution of cluster durations is always the same[i.e., P ( T ) ∼ T − of the underlying BP], if ∆ − > T then S T ∼ T implying the within-burst statistics P ( S ) ∼ S − . Instead, if ∆ − < T then S T ∼ T and the conservation of probabilityleads to the BP statistics P ( S ) ∼ S − / . When the two termson the rhs of Eq. (2) have comparable magnitude, a crossoverbetween the two scalings occurs. The crossover point varieswith the temporal resolution as S c ∝ ∆ − (SM, sec. G). Afull understanding of P ( S ) is achieved by noting that power-law scaling requires a minimum sample size to be observed,su ﬃ cient for the largest cluster to have duration comparableto 1 / ∆ ∗ . If the sample is not large enough the distribution willappear as exponential (SM, sec. G).We ﬁnally consider the power-law kernel function φ ( x ) = ( γ − + x ) − γ . The branching structure underlying the processis not a ﬀ ected by the kernel so the results above should con-tinue to hold [21]. For γ > φ ( x ) has ﬁnite mean value and,as a consequence, results are identical to those obtained forthe exponential kernel (SM, Sec. G). Speciﬁcally, P ∞ shows adiscontinuous transition when µ (cid:29)

1, while two sharp transi-tions are observed for µ (cid:28)

1. The distribution of cluster sizes exhibits a crossover from τ = ∆ ∗ to τ = / ∆ (cid:29) ∆ ∗ when µ (cid:28)

1, and the exponent τ = µ (cid:29)

1. If γ ≤ φ ( x ) has diverging mean value, the typicalinter-event time is large preventing the present framework tobe applicable.In summary, we investigated how self-excitation mecha-nisms are reﬂected in the bursty dynamics, exploring theirrelationship with avalanche distributions, which o ﬀ er an ef-fective probe into the presence of autocorrelation in time se-ries [1]. We focused on the Hawkes process, a general mecha-nism to produce self-excitation, autocorrelation, and fat-taileddistributions in the avalanche size and duration. Critical be-havior in the distributions is observed at speciﬁc values of theresolution parameter ∆ , and is characterized by exponents in-dependent of the form of the self-excitation mechanism. Theuniversal critical behavior is governed by both the branch-ing structure underlying the Hawkes process and the featuresof 1D percolation. Nontrivial details of the size distribu-tion depend on the relative force of the spontaneous and self-excitation mechanisms. The two classes of behavior coexistfor a wide range of ∆ values, thus making the observation of amixture of two classes the most likely outcome of an analysiswhere the resolution parameter is not ﬁne-tuned. All ﬁndingsextend to the slightly subcritical conﬁguration of the Hawkesprocess (SM, Sec. I), thus showing that our method is sci-entiﬁcally sound also for the analysis of avalanches in somenatural systems possibly operating close to, but not exactlyin, a critical regime [31]. Our work o ﬀ ers an interpretativeframework for the relationship between avalanche propertiesand the mechanisms producing autocorrelation in bursty dy-namics. More work in this area is nevertheless needed. TheHawkes process is unable to reproduce the variety of criticalbehaviors reported for real data sets in Ref. [1], and other self-excitation mechanisms need to be considered.D.N. and F.R. acknowledge partial support from the Na-tional Science Foundation (Grant No. CMMI-1552487). ∗ ﬁ[email protected][1] M. Karsai, H.-H. Jo, and K. Kaski, Bursty human dynamics (Springer, 2018).[2] L. Dalla Porta and M. Copelli, PLoS computational biology ,e1006924 (2019).[3] J. M. Beggs and D. Plenz, Journal of neuroscience , 11167(2003).[4] P. Bak, K. Christensen, L. Danon, and T. Scanlon, PhysicalReview Letters , 178501 (2002).[5] F. Wang and Z. Dai, Nature Physics , 465 (2013).[6] A.-L. Barabasi, Nature , 207 (2005).[7] M. Karsai, K. Kaski, A.-L. Barab´asi, and J. Kert´esz, Scientiﬁcreports , 1 (2012).[8] J. P. Gleeson, J. A. Ward, K. P. O’sullivan, and W. T. Lee,Physical review letters , 048701 (2014).[9] A. J. Fontenele, N. A. de Vasconcelos, T. Feliciano, L. A.Aguiar, C. Soares-Cunha, B. Coimbra, L. Dalla Porta,S. Ribeiro, A. J. Rodrigues, N. Sousa, et al. , Physical review letters , 208101 (2019).[10] L. Weng, A. Flammini, A. Vespignani, and F. Menczer, Scien-tiﬁc reports , 335 (2012).[11] H.-H. Jo, J. I. Perotti, K. Kaski, and J. Kert´esz, Physical ReviewE , 022814 (2015).[12] P. Kumar, E. Korkolis, R. Benzi, D. Denisov, A. Niemeijer,P. Schall, F. Toschi, and J. Trampert, Scientiﬁc reports , 1(2020).[13] V. Pasquale, P. Massobrio, L. Bologna, M. Chiappalone, andS. Martinoia, Neuroscience , 1354 (2008).[14] J. P. Neto, F. P. Spitzner, and V. Priesemann, arXiv preprintarXiv:1910.09984 (2019).[15] S. R. Miller, S. Yu, and D. Plenz, Scientiﬁc reports , 1 (2019).[16] M. Chiappalone, A. Novellino, I. Vajda, A. Vato, S. Martinoia,and J. van Pelt, Neurocomputing , 653 (2005).[17] A. Levina and V. Priesemann, Nature communications , 1(2017).[18] S. Jani´cevi´c, L. Laurson, K. J. Måløy, S. Santucci, and M. J.Alava, Physical review letters , 230601 (2016).[19] P. Villegas, S. di Santo, R. Burioni, and M. A. Mu˜noz, PhysicalReview E , 012133 (2019).[20] A. G. Hawkes, Biometrika , 83 (1971).[21] A. Helmstetter and D. Sornette, Journal of Geophysical Re-search: Solid Earth , ESE (2002).[22] Y. Ogata, Journal of the American Statistical association , 9(1988).[23] F. Y. K. Kossio, S. Goedeke, B. van den Akker, B. Ibarz,and R.-M. Memmesheimer, Physical review letters , 058301(2018). [24] R. Crane and D. Sornette, Proceedings of the NationalAcademy of Sciences , 15649 (2008).[25] D. Sornette, F. Deschˆatres, T. Gilbert, and Y. Ageon, PhysicalReview Letters , 228701 (2004).[26] D. Stau ﬀ er and A. Aharony, Introduction to percolation theory (Taylor and Francis, 1994).[27] A. G. Hawkes and D. Oakes, Journal of Applied Probability ,493 (1974).[28] D. Stau ﬀ er and C. Jayaprakash, Physics Letters A , 433(1978).[29] A. Dassios and H. Zhao, Electronic Communications in Proba-bility (2013).[30] S. di Santo, P. Villegas, R. Burioni, and M. A. Mu˜noz, PhysicalReview E , 032115 (2017).[31] V. Priesemann, M. Wibral, M. Valderrama, R. Pr¨opper,M. Le Van Quyen, T. Geisel, J. Triesch, D. Nikoli´c, and M. H.Munk, Frontiers in systems neuroscience , 108 (2014).[32] Y. Ogata, IEEE transactions on information theory , 23(1981).[33] M.-A. Rizoiu, Y. Lee, S. Mishra, and L. Xie, arXiv preprintarXiv:1708.06401 (2017).[34] A. Dassios and H. Zhao, Advances in applied probability ,814 (2011).[35] P. Colomer-de Sim´on and M. Bogu˜n´a, Physical Review X ,041020 (2014).[36] M. Bogun´a, R. Pastor-Satorras, and A. Vespignani, The Euro-pean Physical Journal B , 205 (2004). SUPPLEMENTAL MATERIALHawkes process: numerical simulations

Exponential kernel

The Markov property of the exponential kernel can be exploited to numerically simulate the process in linear time by meansof the algorithm developed in Ref. [29]. In the present case of unmarked Hawkes process the implementation is straightforward.Given the rate λ ( t k − ) right after the ( k − k -th inter-event time can be sampled as1. Set F = − log( u ) /µ, where u ∼ Unif(0 , G = + log( u ) / ( λ ( t k − ) − µ ) , where u ∼ Unif(0 , F =  ∞ if G ≤ − log( G ) if G > x k = min { F , F } .5. Set λ ( t k − + x k ) = ( λ ( t k − ) − µ ) e − x k + n + µ . Power-law kernel

To generate time series from an Hawkes process with power-law kernel we take advantage of the thinning algorithm byOgata [32]. The computational complexity of the algorithm grows quadratically with the number of events. The thinningalgorithm requires knowledge of the full history { t , t , ..., t k − } of the process for the k -th inter-event to be sampled. Speciﬁcally,given λ ( t k − ) and the set of previous event times,1. Set x = − log( u ) /λ ( t k − ) , where u ∼ Unif(0 , λ ( t k − + x ) and compute the ratio (cid:96) = λ ( t k − + x ) /λ ( t k − ).3. Sample u ∼ Unif(0 , u ≤ (cid:96) , set t k = t k − + x . Note that λ ( t k ) = λ ( t k − + x ) + n ( γ − ﬃ ciency reasons, time can be updated even if the event is rejected (i.e. u > (cid:96) ) and λ ( t k − + x ) can be used in step1 instead of λ ( t k − ) [33]. Leveraging the equivalence with the branching process to generating time series for the Hawkes process

As stated in the main text, a statistically equivalent way to produce time series with the rate of Eq. (1) is to generate abranching tree with the Poisson o ﬀ spring distribution P ( o ; n ) and then to assign an event time to each node. The inter-event time x between successive generation’s nodes is a random variable with distribution φ ( x ). This type of simulations is particularlyuseful to explore the large-scale properties of the process with the power-law kernel, in that the quadratic scaling of the thinningalgorithm with K prevents large time series to be generated. Critical Hawkes process: average rate

Theorem 3.6 and Corollary 3.5 in Ref. [34] can be reformulated for the rate of Eq. (1) in the case of exponential kernel. Therate in Ref. [34] takes the form of Eq. (1) by simply setting, in Ref. [34] notation, a = λ = µ , δ = Y i = Z i = n for all i .Then Theorem 3.6 and Corollary 3.5 take the form (cid:104) λ ( t ) (cid:105) = µ (1 + t ) (cid:104) λ ( t ) (cid:105) = µ + (2 µ + n )( µ t + µ t /

2) (S3)respectively for n =

1. In Eq. (S3), (cid:104)·(cid:105) indicates average value over an inﬁnite number of realizations of the Hawkes process. Itfollows that (cid:104) λ ( t ) (cid:105) − (cid:104) λ ( t ) (cid:105) (cid:104) λ ( t ) (cid:105) = µ (S4)for large time. Thus the process experiences ﬂuctuations that are much smaller than the average rate for µ (cid:29) t , namely k ( t ), obeys dk ( t ) / dt = λ ( t ), it is easy to see that (cid:104) k ( t ) (cid:105) = µ (cid:16) t + t / (cid:17) . Itfollows that for t (cid:28) (cid:104) λ (cid:105) ≈ µ and (cid:104) k (cid:105) ( t ) ≈ µ t , while for large time wehave that (cid:104) λ (cid:105) ≈ µ t and (cid:104) k (cid:105) ( t ) ≈ µ t /

2. Finally, inverting the relation between time and number of events, the average rate is (cid:104) λ ( k ) (cid:105) = µ + (cid:112) µ k . (S5)The equations above allow to understand the physical interpretation of the phase diagram of Figure 1 of the main paper.In Figure S4a, the average rate is shown and compared to individual realizations of the process. For µ (cid:28)

1, the individualrealization is characterized by a sequence of bursts during which the rate grows proportionally to the square root of the eventnumber, in agreement with Eq. (S5). A burst ends when a ﬂuctuation of the rate is large enough for it to drop much below itsaverage value. Strong ﬂuctuations correspond to inter-event times much larger than 1. They are expected to occur as predictedby Eq. (S4). Large inter-event times make all the individual terms in the kernel small, with the rate dropping to its minimalvalue, i.e., λ = µ . Then, a new event occurring after an inter-event time of the order of µ − gives rise to a new burst of theself-exciting process. When instead µ (cid:29)

1, ﬂuctuations are expected to be smaller than the average value. The temporal scalesof the spontaneous activity and of the self-exciting process are comparable, so that spontaneously generated events reinforce theburst instead of interrupting it.The quadratic scaling (cid:104) k ( t ) (cid:105) ≈ µ t / λ . The number of events grows as the square of time, in the same way as the size of a branching process grows as the squareof the number of generations. This fact can also be seen by noting that λ ∼ √ k and that the typical inter-event time is of theorder of 1 /λ . We are not aware of explicit formulas for the average rate in the case of the power-law kernel. However, it is notsurprising to see, in numerical simulations, λ growing with the square root of k for this kernel as well (Figure S4b). a µ = 0 . µ = 100 b √ k Event number k R a t e λ Figure S4. Individual realizations of the critical Hawkes process. (a) Exponential kernel. Two realizations of the process, one for µ = .

01 andone for µ = γ = .

5. The dashed line here simply scales as √ k . Finite-size scaling analysis

The percolation strength is deﬁned as the average size of the largest cluster, S M , normalized to K : P ∞ = (cid:104) S M (cid:105) K . (S6)The susceptibility associated to the order parameter can be deﬁned as χ = (cid:104) S M (cid:105) − (cid:104) S M (cid:105) (cid:104) S M (cid:105) . (S7)Averages are meant over an inﬁnite number of realizations of the Hawkes process.The cluster number n S is usually deﬁned as the number of clusters of size S per unit lattice site [26]. The same deﬁnition canbe used for the cluster duration n T . In percolation theory, the scaling ansatz for the cluster number is formulated as n S = S − τ G S ( z ) z = ( p c − p ) S σ , (S8)where p is the concentration (i.e., the bond occupation probability) and p c is the critical point of the model. Homogeneouspercolation in one dimension has p c =

1. The scaling ansatz is assumed to be valid in the vicinity of the critical point and forsizes S (cid:29) S c ∝ | p − p c | − /σ . (S9)Equations (S8, S9) deﬁne the exponents τ and σ , respectively quantifying the decay of the cluster number and the divergence ofthe crossover scale. All the other critical exponents can be computed via scaling relations involving τ and σ . Cluster duration,on a lattice, is the same observable as the cluster size. On disordered topologies, we assume that a scaling form analogous toEq. (S8) exists for the cluster duration, with n T characterized by its own exponents α and σ T and by its own scaling function G T .Given the scaling of Eq. (S8), the order parameter is expected to grow from zero to a positive value as a power law withexponent β . The critical exponent β can be determined using the knowledge of τ and σ , according to P ∞ ∝ ( p − p c ) β for p → p + c ,β = τ − σ . (S10)In the vicinity of the critical point the susceptibility (S7) diverges as a power law χ ∝ | p − p c | − ( γ + β ) γ = − τσ . (S11)The presence of the exponent β in the above expression is direct a consequence of the deﬁnition of Eq. (S7) [35].Critical exponents can be evaluated by means of Finite-Size Scaling (FSS) analysis. The key concept underlying FSS is theexistence of a unique length scale characterizing the behaviour of an inﬁnite system. This is the correlation length ξ , whichrepresents the typical linear size of the largest (but ﬁnite) cluster. The correlation length is in fact deﬁned as the typical scale ofthe correlation function, which has the exponential form g ( r ) = e − r /ξ for p < p c . The correlation function, in turn, is deﬁnedas the probability that a site at distance r from an active site belongs to the same cluster. The correlation length diverges as | p − p c | − ν in the vicinity of the critical point, indicating that the largest cluster spans the whole lattice. This divergence, however,is de facto bounded by K in ﬁnite systems, with K the linear size of the system. Then in any ﬁnite system, at any p value, thereare two possibilities: p is close enough to p c , so that ξ (cid:29) K , or p is far enough from p c and ξ (cid:28) K .A second crucial property is the nature of the critical point: p c is a thermodynamic quantity. Critical behavior in ﬁnite systemsis observed at a di ﬀ erent value of the concentration p , namely p ∗ . The pseudo (size-dependent) critical point scales with thesystem size according to p c − p ∗ ( K ) ∼ K − /ν . (S12)The above equation quantiﬁes the convergence of p ∗ toward p c , and allows us to estimate the exponent ν . In the vicinity of p ∗ ,we further have ξ (cid:29) K , thus allowing us to measure the exponents β and γ . The scaling P ∞ ∼ ( p − p c ) β can be replaced, byvirtue of ξ ∼ ( p − p c ) − ν , with P ∞ ∼ ξ − β/ν . Further, if the system is close enough to criticality, ξ is bounded by K and the expectedscaling is P ∞ ∼ K − β/ν . Analogous reasoning leads to the scaling of the susceptibility χ ∼ K − ( γ + β ) /ν . We recall that ordinarypercolation in one dimension is characterized by the exponents τ = ν = σ = γ = β =

0, stemming for the discontinuousnature of the transition.

Percolation theory of the Poisson process

The Hawkes process for n = µ . Event times are independent, and the inter-event timedistribution is exponential with rate µ . It follows that the probability that two consecutive events belong to the same cluster is p = µ (cid:82) ∆ e − µ x dx = − e − µ ∆ . We used the symbol p on purpose to stress the analogy of this quantity with the bond occupationprobability in ordinary percolation.The percolation probability in a ﬁnite system can be computed as follows. A time series with K events percolates if ∆ is greaterthan the largest inter-event time in the time series. We can estimate the e ﬀ ective threshold ∆ ∗ as the average (over realizations)of the largest inter-event time observed by drawing K samples from the inter-event time distribution. We can therefore solve theequation K (cid:90) ∞ ∆ ∗ P ( x ) dx = , obtaining ∆ ∗ ( K ) = µ − log( K ).In summary, the percolation problem associated to time series generated by a homogeneous Poisson process is a standardbond percolation problem in one dimension. In particular, all critical exponents are identical to those of the one-dimensionalpercolation, as well as the cluster number n S is expected to behave as predicted by the exact solution in Ref. [28]. The clusterduration n T can be also understood by nothing that, as the duration is the sum of the inter-event times composing the cluster andas the inter-event time distribution is exponential, the duration of a cluster with size S is, on average, (cid:104) T (cid:105) = (cid:104) x (cid:105) S = S µ − . Thecluster duration n T can be computed by noting that the total number of cluster with duration T is N T = µ N s = µ Kn s and that thecluster duration must be normalized to the length of the system, L = (cid:104) x (cid:105) K = µ − K , instead of its size K . In summary, one has n S = N s K = S − G [ S ( p c − p )] n T = N T L = T − G [ µ T ( p c − p )] G ( x ) = x e − x . (S13)Note also that the scaling function G S = G T , so we use the symbol G .Figure S5 shows the result of the FSS analysis for the Poisson process. We measure the e ﬀ ective critical point as the valueof the temporal resolution ∆ where the susceptibility peaks. We use p c − p ∗ = e − µ ∆ ∗ and conﬁrm the scaling K − , i.e., ν = ν , the scaling of the threshold with the system size should be performed by measuringsystem size in units of time and not of events. We note, however, that for the homogeneous Poisson process the number ofevents grows linearly with time, i.e., k ( t ) ∼ t , thus measuring ν with respect to time or number of events does not change itsvalue. The exponents γ and β , measured from the value of χ and of (cid:104) S M (cid:105) at ∆ ∗ , are also consistent with those of the ordinaryone-dimensional percolation. Resolution parameter ∆0 . . . . S u sc ep t i b ili t y K − χ a K = 10 K = 10 K = 10 K = 10 K = 10 System size K − − b e − µ ∆ ∗ ( K ) /ν = 1 . System size K c − β/ν = 1 . h S M (∆ ∗ ) i χ (∆ ∗ )( γ + β ) /ν = 1 . χ (∆ ∗ )( γ + β ) /ν = 1 . − ∆ / ∆ ∗ Figure S5. Finite-size scaling analysis of percolation in Poisson process. We use here µ =

1. (a) Susceptibility [i.e., Eq. (S7)] divided by K ,as a function of the resolution parameter ∆ . The inset shows the same data as of the main panel but the resolution parameter is rescaled as ∆ / ∆ ∗ . (b) Convergence of the e ﬀ ective critical point to the critical point. We measure ∆ ∗ ( K ) as the value of the resolution parameter where weobserve the peak of the susceptibility. (c) Scaling of the peak of susceptibility (orange squares) and of the average size of the largest cluster(blue circles) at the e ﬀ ective resolution ∆ ∗ . Also, we conﬁrm the value of the critical exponents τ = α = σ = σ T = G (Figure S6).We stress that the value of the critical exponent τ cannot be deduced immediately by looking at the distribution P ( S ) atcriticality (Figure S7). Percolation theory of the critical Hawkes process

We can fully describe the percolation transition of the Hawkes process in terms of the percolation transition of the Poissonprocess. The only caveat is accounting for the fact that process is not stationary at criticality [27]. To this end, we simply assumethat the process is a Poisson process with rate dependent on the number of events as in Eq. (S5). The e ﬀ ective critical point is ∆ ∗ = log( K ) (cid:104) λ ( K ) (cid:105) = log( K ) µ + (cid:112) K µ . The above expression implies that, in the thermodynamic limit, the critical point of the model is ∆ c = ∆ ∗ approaches ∆ c with the exponent ν = ν =

1. In fact, the number of events scales quadratically with time, i.e., k ( t ) ∼ t . There,0 − Rescaled size S ( p c − p )10 − − R e sc a l ed c l u s t e r nu m be r S n S a x exp( − x ) − − Rescaled duration T ( p c − p )10 − − R e sc a l ed c l u s t e r du r a t i on T n T b x exp( − x ) − Cluster duration T A v e r age c l u s t e r s i z e h S i c ∆ = ∆ ∗ /

10∆ = ∆ ∗ /

5∆ = ∆ ∗ /

2∆ = ∆ ∗ / . x Figure S6. Cluster statistics of the Poisson process. We set µ =

1. Each curve is estimated by considering C = clusters. The system size is K = . (a) Cluster number n S , rescaled by S , as a function of S ( p c − p ), for di ﬀ erent values of p and p c =

1. Here, p = − e − µ ∆ . The solidblack line is the scaling function G ( x ) = x e − x [28] (b) Cluster duration n T , rescaled by T , as a function of T ( p c − p ). The solid black line isthe same as in panel a. (c) Average size of clusters as a function of their duration. The black line indicates linear scaling. Cluster size S − − P r obab ili t y d i s t r i bu t i on P ( S ) a − Rescaled size S ( p c − p )10 − R e sc a l ed p r obab ili t y S P ( S ) b Cluster size S − − C l u s t e r nu m be r n S c ∆ = ∆ ∗ /

10∆ = ∆ ∗ /

5∆ = ∆ ∗ /

2∆ = ∆ ∗ / . Figure S7. Cluster statistics of the Poisson process. We set µ =

1. Each curve is estimated by considering C = clusters. The system sizeis K = . (a) Probability distribution P ( S ) of the cluster size for di ﬀ erent values of ∆ . (b) Same as in panel a, but the distribution is rescaledby S and the ordinate is rescaled by ( p c − p ), with p = e − µ ∆ and p c =

1. (c) Cluster number n S for the same data as in panels a, b. the susceptibility associated to the order parameter collapses, if properly rescaled, on a unique curve, irrespective of the µ valueunder consideration. The empirical measure of the e ﬀ ective critical point, as the value of ∆ where the susceptibility peaks,further shows the scaling ∆ ∗ ( K ) ∼ log( K ) / K / . The FSS analysis is completed by showing that the exponent β =

0, as expectedfor a discontinuous transition, and that the exponent γ = ν . It follows, from respectively Eqs. (S10) and (S11), that τ = σ = τ = α =

2. For µ (cid:28) ∆ large enough, the cluster size distribution shows the scaling τ = /

2. For thischoice of the parameters, we observe also a quadratic relation between S and T .As stated in the main text, we further consider the power law kernel φ ( x ) = ( γ − + x ) − γ for γ >

2, so that φ ( x ) has aﬁnite mean. The simulation requires the use of the thinning algorithm, so that the system size K must be kept small. We showthe percolation phase diagram and the cluster size distribution for both µ (cid:28) µ (cid:29) µ (cid:28) µ (cid:29)

1. Further, as Figure S4 shows, the rategrows again as the square root of the number of events, as expected for a branching process. It follows that the critical point ∆ ∗ for the power-law kernel scales again as K − / as can be veriﬁed by rescaling the data in Figure S10a. Overall, Figure S4and Figure S10 show that the results presented in the main text for the exponential kernel remain valid for the power law kernelwhich, in turn, suggests that the results are valid for any kernel with ﬁnite mean. Large-scale simulations of the non-MarkovianHawkes process can be performed by exploiting the map of the process with the branching process (BP), which is done in the1 − Resolution parameter ∆0 . . . S u sc ep t i b ili t y K − χ a K = 10 K = 10 K = 10 K = 10 K = 10 System size K − − b ∆ ∗ ( K )1 /ν = log( K ) / √ K System size K c − β/ν = 1 . h S M (∆ ∗ ) i χ (∆ ∗ )( γ + β ) /ν = 1 . χ (∆ ∗ )( γ + β ) /ν = 1 . − − Resolution parameter ∆0 . . . S u sc ep t i b ili t y K − χ d K = 10 K = 10 K = 10 K = 10 K = 10 System size K − − e ∆ ∗ ( K )1 /ν = log( K ) / √ K System size K f − β/ν = 1 . h S M (∆ ∗ ) i χ (∆ ∗ )( γ + β ) /ν = 1 . χ (∆ ∗ )( γ + β ) /ν = 1 . − ∆ / ∆ ∗ − ∆ / ∆ ∗ Figure S8. Finite-size scaling analysis of percolation in the Hawkes process. In panels a, b, and c, we use µ = − . (a) Susceptibility [i.e.,Eq. (S7)] divided by K as a function of the resolution parameter ∆ . The inset shows the same data as in the main panel bu as a function of therescaled variable ∆ / ∆ ∗ . (b) Scaling of the position of the peak of the susceptibility for di ﬀ erent system size. (c) Scaling of the maximum ofthe susceptibility and of the average size of the largest cluster at the resolution ∆ ∗ . (d), (e), (f) Same as in panels a, b, and c, respectively, butfor µ = . next section. The temporal branching process

To numerically demonstrate the statistical equivalence between the Hawkes process and the branching process, in Figures S11and S12 we show results valid for time series generated using a branching process with inter-event times respectively sampledfrom exponential and power-law distributions.

Crossover point

As stated in the main text, the crossover between the scaling τ = τ = / ∆ − . On average, the largest cluster whose duration is T ≤ ∆ − hassize S T ∝ ∆ − T so that S T ≤ ∆ − . Analogously, the smallest cluster whose duration is T ≥ ∆ − has, on average, size S T ∝ T so that S T ≥ ∆ − . It follows that the crossover point is expected to scale with the temporal resolution as S c ∝ ∆ − . This result isconﬁrmed in Fig. S13a.2 Figure S9. Critical percolation properties of self-exciting temporal processes. We consider the rate of Eq. (1) of the main text with exponentialkernel function and n =

1. System size is ﬁxed at K = . Histograms are obtained by considering C = clusters per conﬁguration. Weuse µ = − in panels a, b, and c. (a) Complementary Cumulative Distribution of the cluster size at di ﬀ erent resolution ∆ . (b) Distribution ofthe cluster duration for the same data as in panel a. (c) Average size of clusters as a function of their duration for the same data as in panel (a).(d), (e) and (f) Same as in panels a, b, and c, respectively, but for µ = . − Resolution parameter ∆ . . . P e r c o l a t i on s t r eng t h P ∞ a K = 10 K = 10 K = 10 Cluster size S − − P r obab ili t y d i s t r i bu t i on b ∆ = 10 − ∆ ∗ ∆ = ∆ ∗ ∆ = 10 / ∆ ∗ ∆ = 10 / ∆ ∗ S − S − / S − S − / Cluster size S − − P r obab ili t y d i s t r i bu t i on c ∆ = ∆ ∗ /

2∆ = ∆ ∗ ∆ = 2∆ ∗ S − S − / S − S − / − Figure S10. Percolation transition in power-law self-exciting temporal processes. We consider temporal processes generated by the rate ofEq. (1) of the main text with a power-law kernel and parameter n =

1. The exponent of the power-law kernel is γ = .

5. Results are obtainedby considering R = realizations of the process. (a) Percolation strength as a function of the resolution parameter. We display results fordi ﬀ erent system sizes K . In the main panel, we set µ = − . In the inset, results are obtained for µ = . (b) Distribution of the cluster sizeat criticality for µ = − . System size here is K = . (c) Same as in panel b but for µ = . System size and sample size

The argument further allows us to understand the properties of the cluster size distribution. Given the power-law shape of P ( S ), the largest cluster in a sample of size C grows as a power law [36]. If C is not large enough for (at least) the largestcluster to have size comparable with ∆ − T , then the power-law shape of P ( S ) is not observed at all, and the distribution appearsto be exponential. If the sample size is large enough, then P ( S ) does not show any further dependence on C , and the crossoverpoint S c is not a ﬀ ected by C , as conﬁrmed by Figure S13b. Finally, for a given system size K , we have shown that there exist a3 − − − a S − . S − . − − − b T − . − c ∆ = ∆ ∗ ∆ = 10 / ∆ ∗ ∆ = 10 ∆ ∗ TT Cluster Size S Cluster Duration T Cluster Duration T P r obab ili t y d i s t r i bu t i on P r obab ili t y d i s t r i bu t i on A v e r age c l u s t e r s i z e h S i Figure S11. Critical percolation properties of time series generated from a branching tree with bonds weighted by inter-event times distributedas φ ( x ). We use here φ ( x ) = e − x . Histograms are obtained by considering C = clusters per conﬁguration. (a) Distribution of the clustersize at di ﬀ erent resolution ∆ . (b) Distribution of the cluster duration for the same data as in panel a. (c) Average size of clusters as a functionof their duration for the same data as in panel a. − − − a S − . S − . − − − b T − . − c ∆ = ∆ ∗ ∆ = 10 / ∆ ∗ ∆ = 10 ∆ ∗ TT Cluster size S Cluster duration T Cluster duration T P r obab ili t y d i s t r i bu t i on P r obab ili t y d i s t r i bu t i on A v e r age c l u s t e r s i z e h S i Figure S12. Critical percolation properties of time series generated from a branching tree with bonds weighted by inter-event times distributedas φ ( x ). We use here the power-law kernel φ ( x ) = ( γ − + x ) − γ with γ = .

5. Histograms are obtained by considering C = clusters perconﬁguration. (a) Distribution of the cluster size at di ﬀ erent resolution ∆ . (b) Distribution of the cluster duration for the same data as in panela. (c) Average size of clusters as a function of their duration for the same data as in panel a. pseudo-critical temporal resolution ∆ ∗ ( K ) such that P ( S ) ∼ S − . As ∆ ∗ →

0, reducing K for a given ∆ is analogous to increasing ∆ for a given K (and vice versa). It follows that, for a given temporal resolution ∆ , system size can be too small for power-lawscaling to be observed or large enough for the crossover between the two regimes to be observed. Between the two extremes,there exists a size such that the resolution ∆ = ∆ ∗ ( K ), as shown in Figure S13c. Theory of critical branching process explains plateau values observed in phase diagrams

We show here that numerical values of the order parameter and susceptibility observed in Figure 1b of the main text, andFigures S8a, and S8d can be reproduced by considering the statistics of the critical branching process. Recall that the orderparameter is obtained as the normalized size of the largest cluster, averaged over several realizations, and that the susceptibilityis deﬁned as the ﬂuctuations of the average size of the largest cluster. We can reproduce these statistics by:1. Extracting a sample of size C of iid random variables { Z , Z , . . . , Z C } from the power-law distribution P ( Z ) ∼ Z − / , i.e.,the distribution of the tree size for a critical branching process.2. Calculating their sum V = (cid:80) Cc = Z q and their maximum Z max = max { Z , Z , . . . , Z C } to estimate Y = Z max / V . In this way4 Rescaled cluster size S ∆ − S CCD F ( S ) a ∆ = ∆ ∗ ∆ = 10 / ∆ ∗ ∆ = 10 ∆ ∗ ∆ = 10 / ∆ ∗ Cluster size S − − CCD F ( S ) b C = 10 C = 10 C = 10 Cluster size S − − CCD F ( S ) c K = 10 K = 10 K = 10 S − Figure S13. Crossover behavior and role of system and sample size in cluster statistics of the Hawkes process. We use φ ( x ) = e − x and µ = − .(a) Complementary Cumulative Distribution Function (CCDF), rescaled by S , against the rescaled cluster size S ∆ − of C = clusters. Weuse here K = . (b) CCDF( S ) for di ﬀ erent sample size C . We use here ∆ = . K = . (c) CCDF( S ) for di ﬀ erent system size. Weconsider here C = clusters and ∆ = . Y represents S M / K in a single realization of the Hawkes process. Order parameter and susceptibility can be obtained byaveraging.3. Repeating the ﬁrst two points T times to obtain the sample { Y , Y , . . . , Y T } , and ﬁnally estimating the order parameter as P ∞ = T − T (cid:88) t = Y t and the susceptibility as χ =  T − T (cid:88) t = Y t − P ∞  / P ∞ . . . . . . P e r c o l a t i on s t r eng t h P ∞ a T = 10 T = 10 T = 10 . . . . . S u sc ep t i b ili t y χ bSample size C Figure S14. (a) Estimate of the order parameter P ∞ as a function of the sample size C obtained for di ﬀ erent total realizations T of the process.The horizontal black line stands for P ∞ = − / e . (b) Susceptibility χ as a function of the sample size. χ plateaus at the same value as observedin Figure 1b of the main text, and Figures S8a and S8d. Percolation theory of the subcritical Hawkes process

As stated in the main text, several di ﬀ erent systems exhibit bursty patterns of activity. Neuronal networks are one possibleexample of such systems. Whether the brain truly operates at criticality [3] or in a slightly subcritical state [31] is a debated5argument. As such, it could be interesting to check whether the theory presented in the main text remains valid for n (cid:46) n = . (cid:104) λ (cid:105) and second moment (cid:104) λ (cid:105) of the rate of an exponential Hawkes process in the subcritical regime n < (cid:104) λ (cid:105) = µ − n (cid:104) λ (cid:105) = µ + n µ − n ) . (S14)Both the above expressions are valid in the long-term limit, when the process is stationary. In particular, we note that the squareof the ratio standard deviation over average value is inversely proportional to µ , i.e., (cid:104) λ (cid:105) − (cid:104) λ (cid:105) (cid:104) λ (cid:105) = n µ . (S15)The results of the numerical simulations reported in Figures S15 and S16 are explained using the above expressions.For µ (cid:28)

1, the behavior is similar to the one observed for the critical Hawkes process. In this regime, time series consist ofquick bursts of activity well separated in time. The fundamental di ﬀ erence with the critical conﬁguration is that burst sizes obeya power-law distribution with a neat exponential cut-o ﬀ [23]. As such, the largest cluster generated by a single burst has ﬁnitesize, independently of the system size K . In turn, the value of the plateau observed in the percolation phase diagram decreasesas K increase, and the associated phase transition disappears in the thermodynamic limit, see Figure S15. In ﬁnite-size systems,we can still measure the cluster distributions P ( S ) and P ( T ) around the pseudo-critical point ∆ ∗ , as done in Figure S16. ThePoisson-like transition is observed at the pseudo-critical point ∆ ∗ = log( K ) /µ . This second transition does not vanish as thesystem size increases.For µ (cid:29)

1, Eq. (S15) tells us that ﬂuctuations are much smaller than the expected value of the rate, so we expect the processto behave as a Poisson process with rate (cid:104) λ (cid:105) . In particular, the phase diagram should look like the one of a homogeneous one-dimensional percolation model with occupation probability p = − e −(cid:104) λ (cid:105) ∆ , thus consisting of a unique transition happening at ∆ ∗ = log( K ) / (cid:104) λ (cid:105) . Our prediction is conﬁrmed by the collapse of the order parameter shown in Figure S15c, and further supportedby the numerical analysis regarding the scaling of the cluster number, see Figure S16. − . . . a − − − b ∆ ∗ − − . . . c K = 10 K = 10 K = 10 K = 10 K = 10 K = 10 ∆ / ∆ ∗ − ∆ / ∆ ∗ Resolution parameter ∆ P e r c o l a t i on s t r eng t h P ∞ Figure S15. Percolation phase diagrams of the subcritical Hawkes process. We consider the exponential kernel function and use n = .

95. Weplot the percolation strength P ∞ as a function of the resolution parameter ∆ for di ﬀ erent conﬁgurations and di ﬀ erent system sizes K. Averagevalues are obtained by considering R = realizations of the process. (a) We set µ = − . The inset shows the same data as of the mainpanel but the resolution parameter is rescaled as ∆ / ∆ ∗ , with ∆ ∗ = (1 − n ) log( K ) /µ . (b) Same as in panel a, but we use the logarithmic scale forthe ordinates. The vertical gray line roughly indicates the position of the pseudo-critical point ∆ ∗ . We use this value as the reference for theplots in Figure S16. (c) We set µ = ∆ / ∆ ∗ ,with ∆ ∗ = (1 − n ) log( K ) /µ . Size S − − − P r obab ili t y P ( S ) a S − S − . − Rescaled size S ( p c − p )10 − − R e sc a l ed c l u s t e r nu m be r S n S b x − e − x − Duration T − − P r obab ili t y P ( T ) c ∆ = ∆ ∗ /

10∆ = ∆ ∗ ∆ = 2∆ ∗ ∆ = 10∆ ∗ ∆ = 100∆ ∗ T − T − − − Rescaled duration T h λ i ( p c − p )10 − − R e sc a l ed c l u s t e r du r a t i on T n T d ∆ = ∆ ∗ /

10∆ = ∆ ∗ /

5∆ = ∆ ∗ /

2∆ = ∆ ∗ / . x − e − x Figure S16. Percolation properties of the subcritical Hawkes process. We consider temporal processes identical to those studied in Figure S15,thus generated using an exponential kernel function with n = .

95. System size is ﬁxed at K = . Histograms are obtained by considering C = clusters per conﬁguration. (a) We set µ = − . Probability density function of cluster size S for di ﬀ erent values of ∆ . The dashedblack line scales as S − while the solid blue line scales as S − / . (b) We set µ = n s , rescaled by S , as a function of S (1 − p ), for di ﬀ erent values of p . The solid black line is the scaling function G ( x ) = x e − x . (c) We set µ = − . Probability density functionof cluster duration T for di ﬀ erent values of ∆ . The dashed black line scales as T − . (d) We set µ = n T , rescaled by T , asa function of (cid:104) λ (cid:105) T (1 − p ), for the same values of p as in panel (a). The solid black line is the scaling function G ( x ) = x e − x . Non-homogeneous Poisson process with rate linearly growing in time

Here we discuss a non-homogeneous Poisson process with rate linearly growing in time, i.e., λ ( t ) = t . Numerical simulationsof the model can be performed e ﬃ ciently by using the inverse transform method. Speciﬁcally, the probability that no events areobserved in the interval ( t , t + x ) is P n = [ t , x ] = exp (cid:32) − (cid:90) t + xt λ ( t (cid:48) ) dt (cid:48) (cid:33) . (S16)Thus, the inter-event time x is a random variable satisfying x = − t + (cid:113) t − u ) , where u ∼ Unif(0 , . (S17)All inter-events are obtained from the above expression, including the ﬁrst one at t = λ ( t ) = t implies that k ( t ) = t and λ ( k ) = √ k . Figure S17 shows that the expected dependence k ( t ) = t is correct.We remark that the most general expression for a linearly increasing rate is λ ( t ) = µ + β t . In our experiments, we assume thattime is measured in units equal to β . Further, we assume µ =

0. Taking µ > ∆ . The transition point of the model isexpected to scale as ∆ ∗ = log( K ) / √ K for a time series with K events. Such a prediction is veriﬁed in the inset of Figure S18a.The validity of the percolation framework is further conﬁrmed in Figure S19a, where the susceptibility is shown to collapse7 Time t E v en t nu m be r k k ( t ) t Figure S17. Individual realizations of the non-homogeneous Poisson process with deterministic linear time growth. The realization is made of K = events. The dashed blue line is the expected relation k ( t ) = t . under the usual rescaling of the temporal resolution, and in Figure S19b, where the location of the susceptibility’s peak isshown to converge to 0 as ∆ ∗ ( K ) = log( K ) / √ K . Figure S19c further displays the measure of the exponent β = γ = τ = α = − Resolution parameter ∆0 . . . P e r c o l a t i on s t r eng t h P ∞ a K = 10 K = 10 K = 10 K = 10 K = 10 Cluster size S − − − P r obab ili t y d i s t r i bu t i on b ∆ = ∆ ∗ / ∆ = ∆ ∗ /

10∆ = ∆ ∗ ∆ = ∆ ∗ / − S − S − / S − S − / − − Cluster duration T − P r obab ili t y d i s t r i bu t i on c T − T − − ∆ / ∆ ∗ Figure S18. Percolation transition in the non-homogeneous Poisson processes. Results are obtained by considering R = realizations of theprocess. (a) Percolation strength as a function of the resolution parameter ∆ . We display results for di ﬀ erent system sizes K . The inset showsthe same data as of the main panel but the resolution parameter is rescaled as ∆ / ∆ ∗ . (b) Distribution of the cluster size. System size here is K = . (c) Distribution of the cluster duration for the same data of panel (b). In conclusion, the non-homogeneous Poisson process with linearly increasing rate has the same critical properties as of thehomogeneous Poisson process.8 − Resolution parameter ∆0 . . . S u sc ep t i b ili t y K − χ a K = 10 K = 10 K = 10 K = 10 K = 10 System size K − − − b ∆ ∗ ( K )1 /ν = log( K ) / √ K System size K c − β/ν = 1 . h S M (∆ ∗ ) i χ (∆ ∗ )( γ + β ) /ν = 1 . χ (∆ ∗ )( γ + β ) /ν = 1 . − ∆ / ∆ ∗ Figure S19. Finite-size scaling analysis of percolation in the non-homogeneous Poisson process.(a) Susceptibility [i.e., Eq. (S5)] divided byK, as a function of the resolution parameter ∆ . The inset shows the same data as of the main panel but the resolution parameter is rescaled as ∆ / ∆ ∗ . (b) Convergence of the e ﬀ ective critical point to the critical point. We measure ∆ ∗ as the value of the resolution parameter where weobserve the peak of the susceptibility. (c) Scaling of the peak of susceptibility (orange squares) and of the average size of the largest cluster(blue circles) at the e ﬀ ective resolution ∆ ∗∗