Burst-tree decomposition of time series reveals the structure of temporal correlations
BBurst-tree decomposition of time series reveals the structure of temporal correlations
Hang-Hyun Jo,
1, 2, 3, ∗ Takayuki Hiraoka, and Mikko Kivel¨a Asia Pacific Center for Theoretical Physics, Pohang 37673, Republic of Korea Department of Physics, Pohang University of Science and Technology, Pohang 37673, Republic of Korea Department of Computer Science, Aalto University, Espoo FI-00076, Finland (Dated: August 1, 2019)Comprehensive characterization of non-Poissonian, bursty temporal patterns observed in variousnatural and social processes is crucial to understand the underlying mechanisms behind such tem-poral patterns. Among them bursty event sequences have been studied mostly in terms of intereventtimes (IETs), while the higher-order correlation structure between IETs has gained very little at-tention due to the lack of a proper characterization method. In this paper we propose a methodof decomposing an event sequence into a set of IETs and a burst tree, which exactly captures thestructure of temporal correlations that is entirely missing in the analysis of IET distributions. Weapply the burst-tree decomposition method to various datasets and analyze the structure of the re-vealed burst trees. In particular, we observe that event sequences show similar burst-tree structure,such as heavy-tailed burst size distributions, despite of very different IET distributions. The bursttrees allow us to directly characterize the preferential and assortative mixing structure of burstsresponsible for the higher-order temporal correlations. We also show how to use the decompositionmethod for the systematic investigation of such higher-order correlations captured by the burst treesin the framework of randomized reference models. Finally, we devise a simple kernel-based modelfor generating event sequences showing appropriate higher-order temporal correlations. Our methodis a tool to make the otherwise overwhelming analysis of higher-order correlations in bursty timeseries tractable by turning it into the analysis of a tree structure.
I. INTRODUCTION
A variety of dynamical processes in natural and socialphenomena are known to be non-Poissonian or bursty, asobserved in solar flares [1], earthquakes [2, 3], neuronalfirings [4], and human activities [5, 6] to name a few.Traditionally, long-term temporal correlations have beencharacterized in terms of 1 /f noise [7–9], autocorrelationfunction [10–13], or Hurst exponent [10, 11, 14–16]. Morerecently, temporally correlated behavior, called bursts,has gained attention [5, 6]. Bursts are rapidly occur-ring events in short-time periods alternating with longinactive periods. The mechanisms behind bursty tem-poral patterns have been studied by a number of mod-eling approaches [5, 6, 12, 17–26]. It is also well-knownthat bursty interactions between elements of the systemsinfluence the dynamical processes taking place in thosesystems, such as spreading or diffusion [27–34]. There-fore, characterization of the bursty temporal patterns iscrucial to understand the underlying mechanisms for theemergent dynamics observed in various complex systems.Temporal correlations in event sequences can be un-derstood not only by statistical properties of time inter-vals between two consecutive events, i.e., interevent times(IETs), but also by correlations between IETs [35, 36].The temporal correlations due to the heavy-tailed IETdistributions have been extensively studied in recentyears [6]. In contrast, characterization and understand-ing of the sequence of IETs is far from being fully ex-plored. The correlations between IETs have been de- ∗ [email protected] scribed in terms of memory coefficient [35] and burstytrain size distribution [12] among others [6]. The mem-ory coefficient is a Pearson correlation coefficient betweentwo consecutive IETs. To capture the structure beyondpairwise correlations the notion of bursty trains has beensuggested. The size of bursty train or burst size is de-fined as the number of consecutive events that are notseparated by IETs larger than some fixed time resolu-tion or time window. Several empirical distributions ofburst sizes are found to show heavy tails or power-lawtails for a wide range of time windows [12, 37], whichclearly implies the existence of higher-order correlationsbetween IETs than simply expected by the memory coef-ficient [38]. These findings naturally raise an importantquestion: What is the origin of such higher-order tem-poral correlations? This issue has rarely been explored,except for a recent numerical study demonstrating therole of the tendency of bigger (smaller) bursts to be fol-lowed by bigger (smaller) ones in the higher-order tem-poral correlations [36].We stress that the burst size distribution of a time se-ries is far from capturing the entire structure of temporalcorrelations present in the time series: The burst size dis-tributions are often based on a few or even a single—oftenarbitrarily chosen—time resolutions, which limits the in-terpretation of the results to these specific resolutions.Further, information on the correlations between consec-utive bursts is missing in the burst size distributions.This information is crucial for understanding the mecha-nisms behind the higher-order correlations between IETsevidenced by heavy-tailed burst size distributions, as wellas for exploring the implications of this type of higher-order correlations. Therefore, it is strongly required to go a r X i v : . [ phy s i c s . d a t a - a n ] J u l beyond the current state of the art and devise a methodfor comprehensively characterizing the structure of tem-poral correlations in event sequences.In this paper we propose a method of decomposingany event sequence into an IET distribution and a bursttree. The IET distribution reveals the temporal scalesbetween two consecutive events, while the burst tree doesthe same for their higher-order correlation structure. Asthe IET distributions have been extensively analyzed inthe literature [6], we focus mostly on the higher-ordercorrelation structure in the event sequences. By measur-ing various existing and newly introduced quantities fromthe revealed burst tree, such as the memory coefficientbetween consecutive bursts, we empirically demonstratethat the burst-tree decomposition is indeed useful to di-rectly characterize the preferential and assortative mix-ing structure of bursts responsible for the higher-ordercorrelations between IETs. Further, we observe thatevent sequences show similar burst-tree structure, suchas heavy-tailed burst size distributions, despite of verydifferent IET distributions. The burst-tree structure alsoallows us to construct novel microcanonical randomizedreference models for event sequences [39], which can beused to explore the implications of higher-order correla-tions in a controlled and meaningful way. Finally, wesuccessfully generate event sequences showing the empir-ically observed higher-order temporal correlations usinga simple model based on the burst-tree structure. II. BURST-TREE DECOMPOSITION METHOD
We propose a burst-tree decomposition method for de-tecting the temporal correlation structure in an eventsequence. For a given time window ∆ t , some consecu-tive events can be clustered to a bursty train: A burstytrain is defined as a set of events such that intereventtimes (IETs) between any two consecutive events in thebursty train are less than or equal to ∆ t , while thosebetween events in different bursty trains are larger than∆ t [12]. The number of events in each bursty train iscalled a burst size. On the one hand, when ∆ t is smallerthan the minimum IET of the given event sequence, de-noted by τ min , each event constitutes a burst of size 1on its own. On the other hand, when ∆ t is larger thanthe maximum IET, denoted by τ max , all events belong toone burst, which we call a giant burst. Then by increas-ing ∆ t continuously from τ min to τ max , the bursts willconsecutively merge to form bigger bursts, see Fig. 1(a)for a schematic diagram. Such a merging process can befully described by a rooted tree whose leaf nodes, internalnodes, and the root node correspond to the events, themergings or merged bursts, and the giant burst, respec-tively. Note that the root node is also an internal node.Hence, this burst tree of the event sequence reveals thehierarchical structure of temporal correlations [40].We start by introducing a notation for event sequences.A given event sequence of n + 1 events can be described by an ordered set of event timings { ˆ t , · · · , ˆ t n } . In mostcases ˆ t indicates the beginning time of data collection.Otherwise, its relationship to the beginning time of datacollection can be used to infer IET distributions [41]. Forconvenience we shift timings by ˆ t using t i ≡ ˆ t i − ˆ t forthe i th event. This leads to the shifted event sequence E ≡ { t , · · · , t n } with t = 0 by definition. From E thesequence of IETs is derived as { τ , · · · , τ n } by the defini-tion of τ i ≡ t i − t i − .Using the shifted event sequence E we can now formallydefine the burst tree. Firstly, each of n + 1 leaf nodes ofthe tree represents a single event, i.e., a burst of size 1.Secondly, each of n internal nodes of the tree, indexedby u , represents a merging of two consecutive bursts,indexed by v and w , respectively. Here v ( w ) is the indexof the earlier (later) burst among them or the left (right)child of its parent node u . The IET between bursts v and w , i.e., the time interval between the last event in v andthe first event in w , is associated with the internal node u ,and this IET is denoted by ˆ τ u . Note that the distributionof ˆ τ u s is exactly the same as that of τ i s, denoted by P ( τ ).Although the leaf nodes are not associated with any IETby construction, we set their associated IETs as 0 forconvenience. The index u for the internal node followsthe rank of its associated IET in { ˆ τ u } , e.g., u = 1 for theroot node as ˆ τ = τ max . In sum, each internal node isrepresented by a tuple of ( u, v, w, ˆ τ u ), and the weightedburst tree by T ≡ { ( u, v, w, ˆ τ u ) } for u = 1 , · · · , n . Oncethe weighted burst tree is derived, the burst size for eachinternal node u , denoted by b u , is computed as beingequal to b v + b w .The weighted burst tree T is an alternative representa-tion of the event sequence E . That is, the event sequence E can be exactly reconstructed by visiting internal nodesof T in the inorder and by setting the i th event timing for i = 1 , · · · , n as t i = t i − + ˆ τ u ( i ) , where u ( i ) denotes the i th visited internal node by the inorder traversal [42]. Wedenote this type of equivalence by E (cid:98) = T . Now T can bedecomposed into the IET distribution P ( τ ) and the ordi-nal burst tree G ≡ { ( u, v, w ) } . Here the ordinal burst treeretains the information on the ranks or orders of internalnodes, while the information on the IETs is discarded.As before, T can be exactly reconstructed by associatingthe u th largest IET in P ( τ ) to the internal node u in G .We denote this equivalence by T (cid:98) =( P ( τ ) , G ). By transi-tivity, E is also equivalent to ( P ( τ ) , G ), i.e., E (cid:98) =( P ( τ ) , G ).We note that if more than two consecutive bursts areseparated by IETs of the same length, hence mergedto the same node at the same time, then the order inwhich those bursts are merged in a pairwise manner isnot well defined. In such cases, we randomly and uni-formly choose two consecutive bursts and merge theminto one burst, and repeat this binary merging until allthese bursts are merged into one burst. These cornercases leading to the non-binary merging are insignifi-cant for the analysis of the datasets in the next sec-tion [43]. We also remark that the burst-tree decompo-sition method has some conceptual resemblance to visi- time τ max Δ t
0 500 1000 1500 2000 (b)
Wikipedia editor 1 ∆ t ( s e c ond ) t (second) -1.0-0.50.00.51.010 (g) < a u > ∆ t (second) -1.0-0.50.00.51.010 (e) ∆ t (second) M b M lr -12 -9 -6 -3 (c)(a) P ( τ ) τ (second) α =1.77(1) 10 -6 -4 -2 -2 -1 (d) Q ∆ t ( b ) < b > ∆ t b/ ∆ t ∆ t=300 s1800 s3600 s10800 s β =2.52(5) (f) K(b v ,b w )b v b w -1 FIG. 1. (a) Schematic diagram for the burst-tree decomposition method of an event sequence. The lower horizontal arrowdenotes a time axis and blue vertical lines are events. The upper burst tree is derived by increasing the time window ∆ t from0 to the maximum IET τ max . At the bottom of the tree are the leaf nodes (red empty circles). Each internal node (red filledcircle) in the tree denotes the merging of its left and right children, and the number next to the internal node is the burst sizeafter merging. The height of the internal node corresponds to the IET between the last event of the left child and the first eventof the right child. For the horizontal dashed line, see the main text. (b) A burst tree derived from a part of the edit sequenceby the most active Wikipedia editor, namely, editor 1. (c–g) Empirical results of the editor 1’s event sequence of n ≈ . × in terms of the IET distribution P ( τ ), burst size distributions Q ∆ t ( b ) for several values of ∆ t , memory coefficients betweenconsecutive bursts M b and between sibling bursts M lr at each ∆ t , the merging kernel K ( b v , b w ), and the averaged asymmetry (cid:104) a u (cid:105) at each ∆ t , respectively. In the panel (d), (cid:104) b (cid:105) ∆ t is the average burst size for a given ∆ t . Vertical dashed lines in panels(c, e, g) denote 1 day. bility graphs [44] in a sense that both methods map timeseries onto graphs. III. HIGHER-ORDER STRUCTURE IN DATAA. Data description
We consider four time series datasets: EnglishWikipedia, Twitter, heartbeat, and Japan UniversityNetwork Earthquake Catalog (JUNEC). (i) We analyzeedit sequences by editors from the English Wikipediadump on October 2, 2015 [45]. Each edit is recordedwith the timing of edit in a resolution of seconds. As acase study, we choose one of the most active editors, whoedited more than 1 . . . × earth-quakes occurred in Japan from July 1, 1985 to December31, 1998 [49]. B. Preferential and assortative mixing structure
As a case study, we analyze the Wikipedia editor 1’sevent sequence of n ≈ . × using the burst-tree de-composition method. In Fig. 1(b), we show a burst-treestructure derived from a part of the event sequence bythe editor 1. We also find a power-law scaling in the in-terevent time (IET) distribution, i.e., P ( τ ) ∼ τ − α with α = 1 . < τ < in seconds, as shown inFig. 1(c).To study the mixing structure of bursts for the higher-order correlations between IETs, we measure various ex-isting and newly introduced quantities. First of all, theburst size distribution Q ∆ t ( b ) for a time window ∆ t sim-ply reflects a horizontal cross-section of the weightedburst tree at the height of ∆ t , as depicted by crossingpoints between the edges of the tree and the horizontaldashed line in Fig. 1(a). Precisely, we collect pairs witha child (either leaf node or internal node) and its parentsatisfying the condition that an IET associated with thechild is smaller than or equal to ∆ t , while its parent isassociated with an IET larger than ∆ t . Denoting thetime-ordered set of such children by C ∆ t , the burst sizedistribution Q ∆ t ( b ) is directly obtained from the burstsizes b u of nodes u ∈ C ∆ t . We find for the editor 1 that Q ∆ t ( b ) ∼ b − β with β = 2 . t ,as shown in Fig. 1(d). To investigate the origin of sucha power-law behavior in burst size distributions, we pro-pose two definitions of memory coefficient between con-secutive burst sizes, M b and M lr , as well as the mergingkernel K ( b v , b w ). Later we also make use of the burst-tree structure to get some insight into the feature of timeasymmetry by measuring asymmetries { a u } .The assortative mixing structure of bursts, i.e., a ten-dency of big (small) bursts to be followed by big (small)ones, can be directly tested by measuring the correla-tions between consecutive bursts. We define the memorycoefficient M b for a given ∆ t as a Pearson correlation co-efficient between burst sizes of two consecutive nodes in C ∆ t as follows: M b ≡ n b − n b − (cid:88) k =1 ( b ( k ) − µ )( b ( k +1) − µ ) σ σ , (1)where n b = | C ∆ t | is the number of bursts and b ( k ) denotesthe burst size of the k th node in C ∆ t for k = 1 , · · · , n b . µ ( µ ) and σ ( σ ) denote the average and standard devia-tion of burst sizes of nodes except for the last (the first)node in C ∆ t , respectively. Positive M b implies a tendencyof big (small) bursts to be followed by big (small) ones.The opposite tendency can be observed for the negative M b , while M b = 0 indicates the absence of correlationsbetween consecutive burst sizes. Figure 1(e) shows that M b has positive values of 0 . ∼ . t , clearly revealing the assortative mixing structureof bursts. Note that the value of M b fluctuates around0 for ∆ t > M b in Eq. (1), all pairs of two consec-utive nodes or bursts in C ∆ t have been equally consideredno matter how large IETs separating those bursts are,which may introduce some spurious correlations. Suchspurious correlations can be corrected by considering onlythe pairs of sibling nodes in C ∆ t , i.e., those sharing thesame parent node. We denote the set of those siblings by S ∆ t ≡ { ( v, w ) | v, w ∈ C ∆ t , ( u, v, w ) ∈ G} . Then we sug-gest a novel definition of the memory coefficient betweensibling bursts for a given ∆ t as follows: M lr ≡ | S ∆ t | (cid:88) ( v,w ) ∈ S ∆ t ( b v − µ l )( b w − µ r ) σ l σ r , (2)where µ l ( µ r ) and σ l ( σ r ) respectively denote the aver-age and standard deviation of burst sizes of all the left(right) children in S ∆ t . In Fig. 1(e), we find that M lr has positive values of 0 . ∼ . t ,again showing the assortative mixing structure of bursts.The correlations between siblings of burst sizes b v and b w for the whole burst tree can be more precisely charac-terized by estimating a merging kernel K ( b v , b w ). This isbased on the observation that the merging process with increasing ∆ t from τ min to τ max can be interpreted asa stochastic process for coalescence in physical or net-worked systems [50]. Instead of ∆ t , we use the cumu-lative number of binary mergings, denoted by s , as anauxiliary time in the process. Then the merging processcan be described as follows: At the time step s = 0 (i.e.,∆ t < τ min ) we have n + 1 events, equivalently, n + 1bursts of size 1. At each time step s , one has n + 1 − s bursts, whose burst size distribution is denoted by Q s ( b ).A pair of bursts among n +1 − s bursts are randomly cho-sen with a probability proportional to the merging kernel K ( b v , b w ), and then they are merged into another burstof size b v + b w . This process is repeated until all eventseventually belong to a giant burst (i.e., ∆ t ≥ τ max ).Then we consider the empirical ordinal burst tree G as a realization of the above merging process. As eachmerging corresponds to an internal node u in G , the timestep s is related to the node index u as s = n − u . For es-timating the merging kernel from G , we define m s ( b v , b w )at the time step s , i.e., for the internal node u = n − s ,as having a value of 1 if its child nodes have burst sizes b v and b w , and 0 otherwise. The merging kernel is nowestimated using the following formula: K ( b v , b w ) ≡ (cid:80) n − s =0 m s ( b v , b w ) (cid:80) n − s =0 Q s ( b v ) Q s ( b w ) . (3)Equation (3) has been modified from the formula thatwas introduced to numerically estimate the kernel forthe preferential attachment in evolving scale-free net-works [51]. Therefore, we expect the merging kernel toreveal the mechanism behind the power-law burst sizedistributions.From the merging kernel estimated for the editor 1in Fig. 1(f) we make three important observations: (i) K ( b v , b w ) shows a high profile in the diagonal part aroundthe line of b v = b w , while it has low values in the off-diagonal part. (ii) The diagonal cross-section K ( b, b ) isan overall increasing function of b . (iii) K ( b v , b w ) showsan overall symmetric behavior with respect to the diago-nal axis, implying K ( b, b (cid:48) ) ≈ K ( b (cid:48) , b ). The observation (i)is indeed consistent with positive M b and M lr in Fig. 1(e),confirming the assortative mixing of bursts. On the otherhand, the observation (ii) implies the preferential mixingstructure of bursts, by which the bigger bursts tend tobe merged earlier, i.e., at smaller time scales, than thesmaller ones. Conclusively, these empirical evidences en-able us to understand the power-law behavior of burstsize distributions in Fig. 1(d) by means of the preferen-tial and assortative mixing structure. Further, a possibleconnection between the observation (iii) and the averagedasymmetry (cid:104) a u (cid:105) in Fig. 1(f) will be discussed below.We demonstrate how the burst-tree structure can beused to study the time asymmetry in the event sequenceregarding the issues of nonlinearity or irreversibility intime series [52–55]. For this, we define for each inter-nal node u the asymmetry between its two children with (f) Heartbeat subject 1 P ( τ ) τ (ms) -1.0-0.50.00.51.0 0 500 1000 1500 (h) ∆ t (ms) M b M lr -1.0-0.50.00.51.0 0 500 1000 1500 (j) < a u > ∆ t (ms) -1.0-0.50.00.51.010 (c) ∆ t (second) M b M lr -1.0-0.50.00.51.010 (m) ∆ t (second) M b M lr -1.0-0.50.00.51.010 (e) < a u > ∆ t (second) -1.0-0.50.00.51.010 (o) < a u > ∆ t (second) -9 -6 -3 (a) Twitter user 1 P ( τ ) τ (second) -9 -6 -3 (k) JUNEC P ( τ ) τ (second) -7 -4 -1 -1 (l) Q ∆ t ( b ) < b > ∆ t b/ ∆ t ∆ t=600 s1800 s3600 s7200 s β =2.9(1)10 -7 -4 -1 -2 (g) Q ∆ t ( b ) < b > ∆ t b/ ∆ t ∆ t=500 ms700 ms900 ms1100 ms β =1.8(1)10 -4 -2 -2 -1 (b) Q ∆ t ( b ) < b > ∆ t b/ ∆ t ∆ t=60 s300 s1800 s β =1.5(1) (n) K(b v ,b w )b v b w -1 (i) K(b v ,b w )b v b w (d) K(b v ,b w )b v b w -1 FIG. 2. Empirical results for the tweet sequence of Twitter user 1 of n ≈ . × (top), the heartbeat time series of subject1 of n ≈ . × (middle), and Japanese earthquake sequence (JUNEC) of n ≈ . × (bottom), in terms of the IETdistribution P ( τ ), burst size distributions Q ∆ t ( b ), memory coefficients M b and M lr , the merging kernel K ( b v , b w ), and theaveraged asymmetry (cid:104) a u (cid:105) (from left to right), respectively. In panels (b, g, l), (cid:104) b (cid:105) ∆ t is the average burst size for a given ∆ t . burst sizes b v and b w as a u ≡ b w − b v b w + b v . (4)If b w = b v , one gets a u = 0. If b w (cid:29) b v , we have a u ≈ a u ≈ − b w (cid:28) b v . For a given ∆ t , we take anaverage of a u over the nodes whose associated IETs arethe same as ∆ t , and denote it by (cid:104) a u (cid:105) . For the editor1, the value of (cid:104) a u (cid:105) turns out to remain close to 0 for∆ t < t between2 and 10 hours, see Fig. 1(g). It implies that relativelysmall bursts tend to be followed by relatively big burstsafter several hours of inactivity. One can interpret thisobservation such that the less bursty editing activitiesin the afternoon tend to be followed by the more burstyediting activities at night after several hours. Note thatthe behavior of (cid:104) a u (cid:105) ≈ t can be to some ex-tent considered as being consistent with the observation(iii) for the merging kernel, namely, K ( b, b (cid:48) ) ≈ K ( b (cid:48) , b ).Our framework of the burst-tree decomposition can bestraightforwardly applied to any other event sequences.We have analyzed edit sequences of other active editorsin the English Wikipedia, tweet sequences of active Twit-ter users, heartbeat time series of healthy subjects, andthe earthquake sequence in the JUNEC. Among them,the results for the Twitter user 1, for the heartbeat sub-ject 1, and for the earthquake sequence are summarizedin Fig. 2, while those for other Wikipedia editors, other Twitter users, and other heartbeat subjects are presentedin Figs. S2–S4 of Supplemental Material [43]. For mostevent sequences analyzed, we find heavy-tailed burst sizedistributions Q ∆ t ( b ) for several values of ∆ t , positive M b and M lr for a wide range of ∆ t , merging kernels K ( b v , b w )with overall increasing K ( b, b ), and negligible values of (cid:104) a u (cid:105) for a wide range of ∆ t .Interestingly, as shown in Fig. 2, we commonly ob-serve nontrivial structure of burst trees, such as heavy-tailed burst size distributions, irrespective of the func-tional form of IET distributions P ( τ ). This clearly showsthat the IET distributions and the burst-tree structuresare not only separable—as is evident from the equiva-lence relation of E (cid:98) =( P ( τ ) , G )—but there does not seemto be a strong connection between the IET distributionsand the higher-order structures.Finally, we remark that there could be alternativemechanisms behind the power-law distributions of burstsizes. Since the merging process with increasing ∆ t issimilar to the percolation process with increasing connec-tivity between elements of the system, one can hypothe-size that the power-law burst size distribution could cor-respond to the power-law distribution of connected com-ponent sizes at the percolation transition point. However,this analogy may not be plausible because the power-lawburst size distribution is observed for a wide range of ∆ t in our empirical analysis, while the power-law distribu-tion of connected component sizes appears only at the -1.0-0.50.00.51.010 (b) M b ∆ t (second) -1.0-0.50.00.51.010 (c) M l r ∆ t (second) -1.0-0.50.00.51.010 (e) < a u > ∆ t (second) -1 (d) K ( b , b ) b -8 -6 -4 -2 (a) IET MRRM Q ∆ t ( b ) b ∆ t=300 s -1.0-0.50.00.51.010 (f) Left-right MRRM M b ∆ t (second) -1.0-0.50.00.51.010 (g) M l r ∆ t (second) -1.0-0.50.00.51.010 (h) < a u > ∆ t (second) FIG. 3. Results of two microcanonical randomized reference models (MRRMs) for the Wikipedia editor 1’s edit sequence using100 randomized sequences for each MRRM: (a–e) IET MRRRM and (f–h) left-right MRRM. In all panels, the median is plottedby the red solid curve, while 95th and 5th percentiles by the orange solid curves. The original curves are also plotted by bluesymbols for comparison. Vertical dashed lines in some panels denote 1 day. percolation transition point in the conventional percola-tion problem [56, 57]. By measuring the fraction of thelargest burst size and the susceptibility as functions of ∆ t for the editor 1, we find that the percolation transitionoccurs around at ∆ t ≈ C. Randomized reference models
Next, we demonstrate that our burst-tree decomposi-tion method can be useful to systematically characterizetemporal correlations or features in event sequences. Forthis, we adopt the methodology of microcanonical ran-domized reference models (MRRMs) for event sequences.The MRRMs have been extensively applied to character-ize various features in temporal networks, see Ref. [39]and references therein. These MRRMs can be definedwith the set of features they retain, and here we denoteby P [ X ] the MRRM which exactly retains the feature X , while maximally randomizing everything else. TheMRRMs can be ordered according to the amount of in-formation they preserve, such that the simpler or lessinformative the features are, the more of the originaldata is shuffled or randomized. In the case with eventsequences, only very simple features discarding higher-order correlations such as the IET distribution have beenconsidered in the literature [39]. To investigate the effectsof keeping higher-order structures, compared to keepingonly the simple ones, we will study the higher-order MR-RMs based on the burst-tree structure.The simplest MRRM used here is the one that onlykeeps the number of events, P [ n + 1]. This MRRM ran-domizes the timings of events by assigning to each event arandom timing drawn from a uniform distribution in the TABLE I. Features or temporal correlations conserved in var-ious microcanonical randomized reference models (MRRMs).The “original” data trivially retains all features. For defini-tions of MRRMs and symbols, see the main text.MRRM P ( τ ) Q ∆ t ( b ) M b M lr K ( b, b ) (cid:104) a u (cid:105) Original (cid:88) (cid:88) (cid:88) (cid:88) (cid:88) (cid:88)
Left-right shuffled (cid:88) (cid:88) (cid:88)
IET shuffled (cid:88)
Timing shuffled entire time period [ t , t n ]. In the limit of large n this re-sults in a Poisson process with an event rate determinedonly by the number of events and the time period [58].The next simplest MRRM, denoted by P [ n + 1 , P ( τ )],retains the number of events and the IET distribution,which we call the IET MRRM. By permuting IETs inthe empirical IET sequence, the IET MRRM only keepscorrelations between two consecutive events, while all thehigher-order correlations considering more than two con-secutive events are destroyed.There are several possibilities of devising MRRMsbased on the burst-tree structure. Here we are interestedin the question about whether the arrow of time is rele-vant to the structure of temporal correlations. To studythis issue, we first define an unoriented ordinal burst treeˆ G = { ( u, [ v, w ]) } , which is the same as the ordinal bursttree G defined in the previous section, except that theset [ v, w ] does not carry any information on the left-rightorientation of v and w . Now we devise an MRRM, de-noted by P [ P ( τ ) , ˆ G ], which keeps all features other thanthe left-right orientation of bursts. This MRRM is imple-mented by randomly swapping the left and right childrenfor each internal node u , enabling us to call it the left-right MRRM. It strictly conserves the features measuredby P ( τ ), Q ∆ t ( b ), and K ( b, b ), while it may destroy otherfeatures measured by M b , M lr , and (cid:104) a u (cid:105) . In the case with M lr in Eq. (2), the randomization keeps the average of b v b w , while it may change µ l , µ r , σ l , and σ r but onlymarginally. Thus, we do not expect M lr in the left-rightMRRM to be drastically different from the original re-sult. The features conserved in the above three MRRMsare summarized in Table I [59].Using the empirical event sequences, we test if the fea-tures that are not necessarily destroyed by the random-ization still remain after the randomization. For eachMRRM, we generate 100 randomized event sequences tomeasure various quantities for detecting the correspond-ing features for each randomized event sequence, fromwhich we obtain the curves of median, 95th and 5th per-centiles to compare them to the original curves. As forthe merging kernel, we compare the results of its diago-nal cross-section K ( b, b ) instead of K ( b v , b w ) for effectivecomparison.As an example, we find for the editor 1 that the shuf-fling of event timings destroys all temporal correlations,while the shuffling of IETs does the same apart from theIET distribution as expected, as shown in Fig. 3(a–e).Interestingly, the shuffling of the left-right children turnsout to barely destroy the feature measured by M b , while M lr is almost identical to the original result, as depictedin Fig. 3(f, g). It implies that the correlations betweentwo consecutive burst sizes (measured by M b ) might bedominated by those between sibling bursts (measured by M lr ). Finally, the time asymmetry measured by (cid:104) a u (cid:105) isdestroyed by the randomization, see Fig. 3(h). The com-plete results of the MRRM for the editor 1 are shownin Fig. S5 of Supplemental Material [43]. We also findthe similar results for other datasets, i.e., the Twitteruser 1, the heartbeat subject 1, and the JUNEC. For thecomplete results of MRRMs for the Twitter user 1, theheartbeat subject 1, and the JUNEC, see Figs. S6–S8 ofSupplemental Material [43], respectively.The conclusion of the above MRRM study is that thefeatures of the burst tree we observed for the data can-not be explained by randomness, but there is much morestructure in the time series than just the IET distribu-tions. Further, the temporal order of the bursts doesnot have a major effect on the observed burst-tree struc-ture, apart from the asymmetries { a u } that we destroyin the left-right MRRM. Based on this conclusion, onecan devise a simple model, mainly exploiting the merg-ing kernel, for generating the event sequence with theempirically observed higher-order temporal correlations,as discussed in the next section. IV. KERNEL-BASED MODELING
Based on the empirical findings for the merging kernel,we can devise a simple model to reproduce the temporalcorrelations observed in the empirical event sequences. To generate an event sequence consisting of n + 1 events,we need an interevent time (IET) distribution P ( τ ) fordrawing n IETs, and an ordinal burst tree G = { ( u, v, w ) } with n internal nodes for the higher-order correlations be-tween those IETs. For constructing the ordinal burst treewith n internal nodes, we begin with n + 1 events or leafnodes, i.e., n + 1 bursts of size 1. Then two bursts, say b and b (cid:48) , are randomly chosen with a probability propor-tional to a model kernel K ( b, b (cid:48) ). These two bursts aremerged and randomly set as the left and right childrenof the merged burst, i.e., their parent node. This par-ent node is indexed by n as this node will be associatedwith the smallest IET in P ( τ ). The next merging leadsto another parent node to be indexed by n −
1, and soon. This binary merging is repeated until we end up withthe giant burst of size n + 1. Inspired by the empiricalmerging kernels, e.g., in Figs. 1(f) and 2(d, n), we adoptthe following model kernel: K ( b, b (cid:48) ) = [1 + c (ln b + ln b (cid:48) )] × (cid:104) c e − (ln b − ln b (cid:48) ) /c (cid:105) (5)with positive parameters c , c , and c . The first paren-thesis of the right hand side in Eq. (5) describes an in-creasing behavior of the diagonal cross-section along theline of b = b (cid:48) for the preferential mixing of bursts. Thesecond parenthesis is to implement the symmetrically de-caying behavior with respect to the diagonal axis, whichis for the assortative mixing of bursts. For example, seethe heatmap of this model kernel for c = 3, c = 100,and c = 4 in Fig. 4(a).Once the ordinal burst tree is ready, we draw n ran-dom numbers from an IET distribution P ( τ ) to get aset of n IETs, { τ i } . Precisely, we use a power-law IETdistribution with an exponent α > P ( τ ) = ( α − τ − α for τ ≥ τ min = 1 . (6)After assigning the IETs in { τ i } to the internal nodes in G , e.g., the largest IET to the root node, the event se-quence of n + 1 events is obtained by setting t = 0 andthen by calculating the event timings as t i = t i − + ˆ τ u ( i ) ,where u ( i ) denotes the i th visited internal node whentraversing the ordinal burst tree in the inorder. Thisevent sequence is analyzed to find that our simple kernel-based model successfully generates the event sequenceshowing the heavy-tailed burst size distributions for sev-eral values of ∆ t as well as positive M b and M lr for awide range of ∆ t , as shown in Fig. 4(b, c).Using our kernel-based model, we can test if both pref-erential and assortative mixing structures are necessaryfor the power-law burst size distributions. The case withonly assortative mixing can be studied by setting c = 0,leading to the constant K ( b, b ), as depicted in Fig. 4(d).This leads to thinner tails of Q ∆ t ( b ) than those for thecase with c >
0. Yet the positive M b and M lr are ob-served, implying that the assortative mixing is not suffi-cient to generate the power-law burst size distributions. -1.0-0.50.00.51.010 (l) ∆ t M b M lr -1.0-0.50.00.51.010 (i) ∆ t M b M lr -1.0-0.50.00.51.010 (f) ∆ t M b M lr -1.0-0.50.00.51.010 (c) ∆ t M b M lr -7 -4 -1 -2 (k) Q ∆ t ( b ) < b > ∆ t b/ ∆ t ∆ t=16642561024 β =1.30(2)10 -7 -4 -1 -2 (b) Q ∆ t ( b ) < b > ∆ t b/ ∆ t ∆ t=16642561024 β =1.32(2)10 -7 -4 -1 -2 (h) Q ∆ t ( b ) < b > ∆ t b/ ∆ t ∆ t=1664256102410 -7 -4 -1 -2 (e) Q ∆ t ( b ) < b > ∆ t b/ ∆ t ∆ t=16642561024 (j) b v b w (g) b v b w (d) b v b w (a) b v b w c =3, c =100, c =1 c =3, c =0, c =4 c =0, c =100, c =4 c =3, c =100, c =4K(b v ,b w ) FIG. 4. Simulation results of the kernel-based model using the model kernel in Eq. (5) with various sets of parameter values,as depicted in the left panels. For each case, 100 event sequence of n = 10 are generated, also using the IET distribution inEq. (6) with α = 1 .
8. By aggregating the detected burst sizes in 100 event sequences, we obtain the burst size distributions Q ∆ t ( b ) for several values of ∆ t , rescaled by the average burst size (cid:104) b (cid:105) ∆ t (center panels). The curves of memory coefficients M b and M lr are averaged over 100 event sequences (right panels), where the error bars denotes the standard errors. Next, we consider the case only with the preferential mix-ing by setting c = 0. Then the diagonal part of K ( b, b (cid:48) )is no longer higher than the off-diagonal part, as depictedin Fig. 4(g). It turns out that tails of Q ∆ t ( b ) are thinnerthan those for the case with c >
0. Further, the valuesof M b and M lr are almost zero or even slightly nega-tive for a wide range of ∆ t , because big and small burstscan be merged with each other more easily. Therefore,we conclude that both preferential and assortative mix-ing structures are necessary for obtaining the power-law Q ∆ t ( b ) and positive M b and M lr simultaneously.Finally, we test the effect of c on the results: Thesmaller c leads to the steeper decay of K ( b v , b w ) alongthe direction perpendicular to the diagonal axis, as shownin Fig. 4(j). As the smaller c would enhance the possi-bility of merging bursts of similar sizes, one can expect M lr to have the larger values than for the case with thelarger c , which is indeed the case as shown in Fig. 4(l).We also find no considerable differences in M v as well as in Q ∆ t ( b ). In particular, the shapes of Q ∆ t ( b ) are quitesimilar to the case with the larger c , probably becausethe heavy tails of Q ∆ t ( b ) are largely affected by the char-acteristics of the diagonal cross-section K ( b, b ).We also have tested other functional forms of the modelkernel to draw the qualitatively same conclusions, seeSec. V and Figs. S9 and S10 of Supplemental Mate-rial [43]. In addition, we remark that the asymmetric be-havior between sibling bursts can be easily implementedin our model, e.g., by assigning a bigger (smaller) burstamong chosen b and b (cid:48) to the right (left) child with aprobability p ( q = 1 − p ). Then the case with p = q = 1 / p, q (cid:54) = 1 / V. CONCLUSION
The comprehensive characterization of temporal corre-lations observed in various natural and social processesis crucial to the understanding of the underlying mech-anisms behind such temporal processes. Non-Poissonianor bursty temporal patterns in empirical event sequenceshave been studied mostly in terms of heterogeneous in-terevent times (IETs), while the higher-order correlationsbetween IETs are far from being fully understood due tothe lack of the proper characterization method. In thispaper we have proposed the burst-tree decompositionmethod that decomposes a given event sequence into theIET distribution and the ordinal burst tree, hence with-out loss of information on the temporal correlations. Thisimplies that the ordinal burst tree, together with an IETdistribution, can exactly reproduce the original eventsequence. Using our burst-tree decomposition methodone can systematically study the hierarchical structureof temporal correlations: In particular, the preferentialand assortative mixing structure of bursts is empiricallyvalidated by measuring the novel memory coefficients be-tween consecutive bursts and between sibling bursts aswell as the merging kernel. In addition, the burst-treedecomposition turns out to be useful for the system-atic investigation of temporal correlations in the frame-work of randomized reference models [39]. Finally, basedon the empirically estimated merging kernels, we devisea kernel-based model to successfully generate event se-quences showing the higher-order temporal correlations observed in the empirical datasets.We remark that once the ordinal burst tree is de-rived or given, it can be associated with any other setof IETs, irrespective of the functional form of the IETdistribution. This implies that apparently very differentevent sequences might have the similar temporal correla-tion structure when their burst trees look similar to eachother. We have observed this type of phenomenon in theempirical event sequences that show heavy-tailed burstsize distributions despite of very different IET distribu-tions. In addition, mapping the structure of temporalcorrelations onto a tree enables to propose other novelquantities for measuring various higher-order correlationsas a tree structure is more intuitive and better visualizedthan the time series itself. Finally, we have consideredonly a binary tree in our work, while for more realisticdecomposition of the event sequences in various datasetsmore complex trees than a binary tree can be used in afuture.
ACKNOWLEDGMENTS
The authors thank Woo-Sik Son for providing us withthe preprocessed dataset of the English Wikipedia. H.-H.J. and T.H. acknowledge financial support by Ba-sic Science Research Program through the National Re-search Foundation of Korea (NRF) grant funded by theMinistry of Education (NRF-2018R1D1A1A09081919). [1] M. S. Wheatland, P. A. Sturrock, and J. M. McTiernan,The Astrophysical Journal , 448 (1998).[2] ´A. Corral, Physical Review Letters , 108501 (2004).[3] L. de Arcangelis, C. Godano, E. Lippiello, andM. Nicodemi, Physical Review Letters , 051102 (2006).[4] T. Kemuriyama, H. Ohta, Y. Sato, S. Maruyama,M. Tandai-Hiruma, K. Kato, and Y. Nishida, BioSystems , 144 (2010).[5] A.-L. Barab´asi, Nature , 207 (2005).[6] M. Karsai, H.-H. Jo, and K. Kaski, Bursty Human Dy-namics (Springer International Publishing, Cham, 2018).[7] P. Bak, C. Tang, and K. Wiesenfeld, Physical ReviewLetters , 381 (1987).[8] M. B. Weissman, Reviews of Modern Physics , 537(1988).[9] L. Ward and P. Greenwood, Scholarpedia , 1537 (2007).[10] J. W. Kantelhardt, E. Koscielny-Bunde, H. H. A. Rego,S. Havlin, and A. Bunde, Physica A: Statistical Mechan-ics and its Applications , 441 (2001).[11] P. Allegrini, D. Menicucci, R. Bedini, L. Fronzoni,A. Gemignani, P. Grigolini, B. J. West, and P. Paradisi,Physical Review E , 061914 (2009).[12] M. Karsai, K. Kaski, A.-L. Barab´asi, and J. Kert´esz, Sci-entific Reports , 397 (2012).[13] T. Yasseri, R. Sumi, A. Rung, A. Kornai, and J. Kert´esz,PLoS ONE , e38869 (2012). [14] C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E.Stanley, and A. L. Goldberger, Physical Review E ,1685 (1994).[15] D. Rybski, S. V. Buldyrev, S. Havlin, F. Liljeros, andH. A. Makse, Proceedings of the National Academy ofSciences , 12640 (2009).[16] D. Rybski, S. V. Buldyrev, S. Havlin, F. Liljeros, andH. A. Makse, Scientific Reports , 560 (2012).[17] A. V´azquez, J. G. Oliveira, Z. Dezs¨o, K.-I. Goh, I. Kon-dor, and A.-L. Barab´asi, Physical Review E , 036127(2006).[18] R. D. Malmgren, D. B. Stouffer, A. E. Motter, andL. A. N. Amaral, Proceedings of the National Academyof Sciences , 18153 (2008).[19] R. D. Malmgren, D. B. Stouffer, A. S. L. O. Campanharo,and L. A. Amaral, Science , 1696 (2009).[20] H.-H. Jo, R. K. Pan, J. I. Perotti, and K. Kaski, PhysicalReview E , 062131 (2013).[21] N. Masuda, T. Takaguchi, N. Sato, and K. Yano, in Tem-poral Networks , edited by P. Holme and J. Saramaki(Springer-Verlag, Berlin, 2013) pp. 245–264.[22] P. Wang, T. Zhou, X.-P. Han, and B.-H. Wang, PhysicaA: Statistical Mechanics and its Applications , 145(2014).[23] H.-H. Jo, J. I. Perotti, K. Kaski, and J. Kert´esz, PhysicalReview E , 022814 (2015). [24] G. Garc´ıa-P´erez, M. Bogu˜n´a, and M. ´A. Serrano, Scien-tific Reports , 9714 (2015).[25] J. R. Zipkin, F. P. Schoenberg, K. Coronges, and A. L.Bertozzi, European Journal of Applied Mathematics ,502 (2016).[26] B.-H. Lee, W.-S. Jung, and H.-H. Jo, Physical Review E , 022316 (2018).[27] A. Vazquez, Physica A: Statistical Mechanics and its Ap-plications , 747 (2007).[28] M. Karsai, M. Kivel¨a, R. K. Pan, K. Kaski, J. Kert´esz,A.-L. Barab´asi, and J. Saram¨aki, Physical Review E ,025102(R) (2011).[29] G. Miritello, E. Moro, and R. Lara, Physical Review E , 045102(R) (2011).[30] L. E. C. Rocha, F. Liljeros, and P. Holme, PLOS Com-putational Biology , e1001109 (2011).[31] H.-H. Jo, J. I. Perotti, K. Kaski, and J. Kert´esz, PhysicalReview X , 011041 (2014).[32] J.-C. Delvenne, R. Lambiotte, and L. E. C. Rocha, Na-ture Communications , 7366 (2015).[33] O. Artime, J. J. Ramasco, and M. San Miguel, ScientificReports , 41627 (2017).[34] T. Hiraoka and H.-H. Jo, Scientific Reports , 15321(2018).[35] K.-I. Goh and A.-L. Barab´asi, EPL (Europhysics Letters) , 48002 (2008).[36] H.-H. Jo, Physical Review E , 062131 (2017).[37] W. Wang, N. Yuan, L. Pan, P. Jiao, W. Dai, G. Xue, andD. Liu, Physica A: Statistical Mechanics and its Appli-cations , 846 (2015).[38] Note that the exponential burst size distributions haverecently been reported using the mobile phone commu-nication dataset [60], implying the apparent absence ofthe higher-order correlations in the dataset analyzed.[39] L. Gauvin, M. G´enois, M. Karsai, M. Kivel¨a, T. Tak-aguchi, E. Valdano, and C. L. Vestergaard (2018),arXiv:1806.04032.[40] The burst tree can be seen as a hierarchical data cluster-ing in one-dimensional space using the nearest neighbordistance [61].[41] M. Kivel¨a and M. A. Porter, Physical Review E ,052813 (2015).[42] The inorder traversal is a depth-first way of traversinga tree by which one first visits the left (earlier) subtreebefore visiting the branching node and lastly the right(later) subtree.[43] See Supplemental Material at [URL] for the statistics ofcases of merging more than two bursts at the same timefor Wikipedia editor 1 (Sec. I and Fig. S1(a, b)), empir-ical results of the 20 most active Wikipedia editors, the20 most active Twitter users, and 20 healthy subjects (Sec. II and Figs. S2–S4), empirical results of percolationprocess for Wikipedia editor 1 (Sec. III and Fig. S1(c)),complete results of microcanonical randomized referencemodels for Wikipedia editor 1, Twitter user 1, heart-beat subject 1, and Japanese earthquakes (Sec. IV andFigs. S5–S8), and simulation results of other kernel-basedmodels (Sec. V and Figs. S9 and S10).[44] L. Lacasa, B. Luque, F. Ballesteros, J. Luque, and J. C.Nu˜no, Proceedings of the National Academy of Sciences , 4972 (2008).[45] https://dumps.wikimedia.org/ .[46] J. Yang and J. Leskovec, in Proceedings of the fourthACM international conference on Web search and datamining (ACM Press, Hong Kong, China, 2011) pp. 177–186.[47] P. Stein and R. Goldsmith, Normal Sinus Rhythm RRInterval Database (2003), type: dataset.[48] https://physionet.org/physiobank/ .[49] .[50] D. J. Aldous, Bernoulli , 3 (1999).[51] T. Pham, P. Sheridan, and H. Shimodaira, PLoS ONE , e0137796 (2015).[52] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, andJ. Doyne Farmer, Physica D: Nonlinear Phenomena ,77 (1992).[53] C. S. Daw, C. E. A. Finney, and M. B. Kennel, PhysicalReview E , 1912 (2000).[54] A. Porporato, J. R. Rigby, and E. Daly, Physical ReviewLetters , 094101 (2007).[55] J. F. Donges, R. V. Donner, and J. Kurths, EPL (Euro-physics Letters) , 10004 (2013).[56] D. Stauffer and A. Aharony, Introduction to percolationtheory , 2nd ed. (Taylor & Francis, 1994).[57] M. E. J. Newman,
Networks: An Introduction , 1st ed.(Oxford University Press, 2010).[58] M. Kivel¨a, R. K. Pan, K. Kaski, J. Kert´esz, J. Saram¨aki,and M. Karsai, Journal of Statistical Mechanics: Theoryand Experiment , P03005 (2012).[59] For the Wikipedia editor 1 and Twitter user 1, one canalso introduce variants of the left-right MRRM by limit-ing the shuffling only to internal nodes with IETs larger(or smaller) than a fixed timescale of 1 day, aiming to de-stroy the left-right structure for the timescale longer (orshorter) than 1 day. The results of these variants are in-cluded in Figs. S5 and S6 of Supplemental Material [43].[60] Z.-Q. Jiang, W.-J. Xie, M.-X. Li, W.-X. Zhou, andD. Sornette, Journal of Statistical Mechanics: Theoryand Experiment , 073210 (2016).[61] G. Gan, C. Ma, and J. Wu,