Characterizing stochastic time series with ordinal networks
Arthur A. B. Pessa∗ and Haroldo V. Ribeiro†
Departamento de Física, Universidade Estadual de Maringá – Maringá, PR 87020-900, Brazil
(Dated: October 4, 2019)

Approaches for mapping time series to networks have become essential tools for dealing with the increasing challenges of characterizing data from complex systems. Among the different algorithms, the recently proposed ordinal networks stand out due to their simplicity and computational efficiency. However, applications of ordinal networks have mainly focused on time series arising from nonlinear dynamical systems, while basic properties of ordinal networks related to simple stochastic processes remain poorly understood. Here, we investigate several properties of ordinal networks emerging from random time series, noisy periodic signals, fractional Brownian motion, and earthquake magnitude series. For ordinal networks of random series, we present an approach for building the exact form of the adjacency matrix, which in turn is useful for detecting non-random behavior in time series and the existence of missing transitions among ordinal patterns. We find that the average value of a local entropy, estimated from transition probabilities among neighboring nodes of ordinal networks, is more robust against noise addition than the standard permutation entropy. We show that ordinal networks can be used for estimating the Hurst exponent of time series with accuracy comparable with state-of-the-art methods. Finally, we argue that ordinal networks can detect sudden changes in Earth's seismic activity caused by large earthquakes.
I. INTRODUCTION
The amount of data researchers currently handle has drastically increased over the past few decades. Not only has the data volume grown, but the investigated themes and degree of detail are nowadays also unprecedented. While large and detailed data sets allow us to probe quantitative questions about topics as diverse as material sciences [1] and art [2], this complexity often challenges the methods and techniques available for executing the data analysis. Even the discrimination among simple data such as time series becomes challenging depending on the subject and amount of data involved in the task. Thus, the development of efficient and often interdisciplinary approaches for data analysis represents an important step towards the extraction of hidden and meaningful patterns from complex data sets [3].

Among the different approaches that have been proposed, the idea of applying complex network tools for investigating time series has attracted great interest from the scientific community since the seminal works of Nicolis et al. [4], Zhang and Small [5], and Lacasa et al. [6], particularly among researchers working on statistical physics [7]. These seminal works gave birth to the three main approaches for mapping time series into complex networks: transition networks, proximity networks, and visibility graphs, respectively. In visibility graphs, each data point of a time series is mapped into a vertex of a network, and links are drawn based on a visibility condition between pairs of data points [6, 8]. Proximity networks, in turn, assume that time series segments play the role of nodes, and a similarity measure estimated between every pair of segments defines the network links. Mainly because of their intimate connection with recurrence plots [7], recurrence networks are an important method among the proximity-based approaches.

∗ arthur [email protected]
† [email protected]
In this approach, time series segments are regarded as nodes, and standard metrics such as the Euclidean distance are used to create connections. On the other hand, transition networks consider discrete states over time series partitions as nodes, while the connections represent transition probabilities among these states. The ideas underlying transition networks have a long historical precedent that can be traced back to the theory of Markov chains [9], but their use as a tool for time series analysis is indeed much more recent [7].

Transition networks of particular interest to this work are the ordinal networks. This method was proposed by Small [10] and is directly inspired by the work of Bandt and Pompe [11], which consists of an embedding method (known as the Bandt-Pompe symbolization approach) based on the idea of ordinal or permutation patterns. In ordinal networks, nodes represent possible ordering patterns among elements within partitions of a time series, and links are drawn based on the temporal succession of these patterns. Since their first appearance in the literature, ordinal networks have mainly been used for investigating time series arising from nonlinear dynamical systems (such as those obtained by iterating chaotic maps), while very few works have used these networks for characterizing stochastic processes or real-world time series [12]. It is equally surprising that, in spite of recently proposed generalizations of this approach (including one dealing with multivariate time series [13]), basic properties of ordinal networks obtained from simple univariate stochastic time series (such as a random walk) remain poorly understood.

Here we help to fill this gap by investigating properties of ordinal networks obtained from periodic, random, and fractional Brownian motion time series. We show that properties of ordinal networks can be used for estimating the Hurst exponent of time series with high precision, outperforming state-of-the-art methods such as detrended fluctuation analysis [14, 15]. We further demonstrate the usefulness of this algorithm for investigating empirical time series by showing that ordinal networks are capable of identifying sudden changes in Earth's seismic activity after the occurrence of large earthquakes (mainshocks).

This paper is organized as follows. We review how ordinal networks map time series to complex networks and discuss their general structural constraints in Section II. Next, in Section III, we investigate properties of ordinal networks emerging from periodic (III A), fully random (III B), noisy periodic (III C), fractional Brownian motion (III D), and empirical seismic activity time series (III E). We present our conclusions and final remarks in Section IV.

II. METHODS
As we have mentioned, ordinal networks have a direct connection with the symbolization approach of Bandt and Shiha [16] and with the permutation entropy framework of Bandt and Pompe [11]. To better understand this connection, we start by revisiting these ideas. Within the so-called Bandt-Pompe approach, from a given time series {x_t}_{t=1,...,N} of length N, we construct n = N − d + 1 overlapping partitions of length d (the embedding dimension), represented by w_s = (x_s, x_{s+1}, ..., x_{s+d−2}, x_{s+d−1}), where s = 1, ..., n is the partition index. Next, we evaluate the permutation π_s = (r_0, r_1, ..., r_{d−1}) of (0, 1, ..., d−1) that sorts the elements of w_s in ascending order, that is, the permutation defined by x_{s+r_0} ⩽ x_{s+r_1} ⩽ ⋯ ⩽ x_{s+r_{d−1}}. In case of draws within a partition, we keep the order of occurrence, that is, we set r_i < r_{i+1} whenever x_{s+r_i} = x_{s+r_{i+1}} [17, 18]. After these two steps, we obtain a symbolic sequence {π_s}_{s=1,...,n}.

Having obtained all permutations, we calculate the relative frequency p_i(π_i) of each one of the d! possible permutations π_i of the symbols (0, 1, ..., d−1),

    p_i(π_i) = (number of partitions of type π_i in {π_s}) / n ,    (1)

from which we estimate the ordinal probability distribution P = {p_i(π_i)}_{i=1,...,d!}. The permutation entropy is simply the Shannon entropy [19] of the ordinal probability distribution, that is,

    H = − Σ_{i=1}^{d!} p_i(π_i) log p_i(π_i) ,    (2)

where log(...) stands for the base-2 logarithm. The condition d! ≪ N must hold for a reliable estimate of the distribution P = {p_i(π_i)}_{i=1,...,d!}. The value of H quantifies the randomness in the local ordering patterns of x_t: H ≈ log d! indicates that the elements of x_t are locally randomly ordered, while H ≈ 0 indicates that the elements of x_t are very likely to appear in a particular order.

Inspired by the Bandt-Pompe approach, Small [10] proposed to use the symbolic sequence {π_s}_{s=1,...,n} for creating the ordinal network, a graph representation of the time series {x_t}_{t=1,...,N}. The approach consists of considering all possible permutations {π_i}_{i=1,...,d!} as the network nodes, and connections are drawn between every pair of permutations that occurs in succession within the sequence {π_s}_{s=1,...,n}. The edges are directed according to the temporal succession of permutations (for instance, for the symbolic sequence {π_1, π_2} the link is π_1 → π_2) and are weighted by the relative frequency with which the succession (π_i, π_j) occurs in {π_s}_{s=1,...,n}. The elements of the weighted adjacency matrix of this network are

    p_{i,j} = (number of times π_i is followed by π_j in {π_s}) / (n − 1) ,    (3)

where i, j = 1, 2, ..., d! and the denominator n − 1 is the total number of successions in {π_s}_{s=1,...,n}.

To illustrate this procedure, let us consider the time series x_t of Figure 1(a), with N = 9 elements and embedding dimension d = 2. In this case, there are only two possible permutations: π_1 = (0, 1) (labeled as '01') and π_2 = (1, 0) (labeled as '10'). The first partition is w_1 = (x_1, x_2) and corresponds to the permutation (1, 0) since its elements are in descending order. The second partition is w_2 = (x_2, x_3) and corresponds to the permutation (0, 1) since its elements are in ascending order. By repeating this process for all n = 8 partitions, we obtain the symbolic sequence depicted in Figure 1(a). Next, by analyzing all consecutive successions between pairs of permutations, we find that (0,1) → (0,1), (0,1) → (1,0), (1,0) → (0,1), and (1,0) → (1,0) occur 1, 2, 3, and 1 times, respectively. Thus, the elements of the weighted adjacency matrix associated with this particular time series are p_{1,1} = 1/7, p_{1,2} = 2/7, p_{2,1} = 3/7, and p_{2,2} = 1/7 [Figure 1(b)]. Because each of the two permutations occurs four times in the symbolic sequence, we also have p_1(π_1) = 4/8, p_2(π_2) = 4/8, and thus the permutation entropy of this series is H = 1.
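The symbolization just described is straightforward to implement. The sketch below is our illustration (not the authors' code, and the example series is hypothetical, not the one of Figure 1): a stable argsort extracts the ordinal patterns and automatically implements the order-of-occurrence rule for draws, and the permutation entropy of Eq. (2) follows from the pattern frequencies.

```python
import numpy as np

def ordinal_patterns(x, d):
    """Bandt-Pompe symbolic sequence: one permutation tuple per
    overlapping partition of length d. A stable argsort breaks draws
    by order of occurrence, as described in the text."""
    x = np.asarray(x, dtype=float)
    return [tuple(int(r) for r in np.argsort(x[s:s + d], kind="stable"))
            for s in range(len(x) - d + 1)]

def permutation_entropy(x, d):
    """Base-2 Shannon entropy of the ordinal distribution (Eq. 2)."""
    _, counts = np.unique(ordinal_patterns(x, d), axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

# Illustrative series: six partitions for d = 2
x = [4, 7, 9, 10, 6, 11, 3]
print(ordinal_patterns(x, 2))   # [(0, 1), (0, 1), (0, 1), (1, 0), (0, 1), (1, 0)]
print(permutation_entropy(x, 2))
```

For this series the patterns (0, 1) and (1, 0) occur with frequencies 2/3 and 1/3, so H ≈ 0.918 bits.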
McCullough et al. [20] have proposed to estimate another entropic metric based on the transition probabilities (see also Unakafov and Keller [21]). The idea is to calculate a local entropy for node i via

    h_i = − Σ_{j∈O_i} p'_{i,j} log p'_{i,j} ,    (4)

where p'_{i,j} = p_{i,j} / Σ_{k∈O_i} p_{i,k} is the renormalized transition probability of leaving node i to node j, and O_i stands for the outgoing neighbourhood of node i (all edges that leave node i). The value of h_i thus quantifies the degree of randomness associated with transitions starting from permutation π_i. For instance, h_i = log |O_i| (|...| stands for set cardinality, that is, the number of possible outgoing neighbours of i) if all possible transitions leaving π_i are equiprobable, whereas h_i = 0 if a single permutation always follows π_i. For the network in Figure 1, h_1 = h('01') = 0.918 and h_2 = h('10') = 0.811.

FIG. 1. Ordinal network approach for mapping time series into networks. (a) Illustration of a time series x_t with N = 9 elements and the corresponding symbolic sequence obtained from the Bandt-Pompe method with embedding dimension d = 2. (b) Visualization of the ordinal network associated with the time series x_t. Nodes '01' (red) and '10' (blue) correspond to the permutations (0, 1) and (1, 0), respectively. Directed edges are drawn between network nodes based on the temporal succession of permutations in the symbolic sequence, and their weights reflect the relative frequency of each possible succession. Self-loops appear when a permutation is followed by itself in the symbolic sequence.

By averaging the local entropies h_i, McCullough et al. [20] have proposed to calculate the global node entropy

    H_GN = Σ_{i=1}^{d!} p'_i h_i ,    (5)

where p'_i = Σ_{j∈I_i} p_{j,i} is the probability of arriving at node i from its incoming neighbourhood I_i (the in-strength of node i). It is worth noticing that p'_i ≈ p_i(π_i) for long time series, and thus the permutation entropy is the Shannon entropy associated with the in-strengths of ordinal network nodes. Equation 5 represents a weighted mean of the local node entropy and quantifies the global degree of randomness over all (first-order) permutation transitions occurring in a time series x_t. More regular time series have smaller values of H_GN than time series associated with random processes. For the example shown in Figure 1, we find that H_GN ≈ 0.872.

McCullough et al. [20] have further proposed to ignore the self-loops when estimating the global node entropy (H_GN). As they have argued, self-loops may become too intense when dealing with over-sampled time series, which in turn may bias the values of H_GN. In our work, we have opted to keep the self-loops because of the nature of the time series we shall investigate. To be more precise in terminology, the entropy definition of Eq. 5 is usually called conditional permutation entropy [21]. It is also worth mentioning that our entropy definition and other topological properties of ordinal networks are different from the concept of simplicial complexes of graphs [22], where algebraic topology methods are used for defining topological information measures [23], such as the entropy of a topological level used by Andjelković et al. [24] for investigating traffic jamming.

In addition to the global node entropy H_GN, the study of time series based on ordinal networks can explore the myriad of network metrics available for characterizing complex networks. However, important features of ordinal networks are naturally inherited from and limited by the symbolic sequences that give rise to the networks.
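Equations (3)-(5) can be sketched in a few lines (our illustration, using a hypothetical example series): build the weighted transitions, renormalize the outgoing weights of each node, and average the local entropies weighted by in-strength.

```python
from collections import Counter
import numpy as np

def transition_weights(x, d):
    """Weighted adjacency p_{i,j} of the ordinal network (Eq. 3)."""
    pats = [tuple(int(r) for r in np.argsort(x[s:s + d], kind="stable"))
            for s in range(len(x) - d + 1)]
    pairs = Counter(zip(pats, pats[1:]))
    return {edge: c / (len(pats) - 1) for edge, c in pairs.items()}

def global_node_entropy(x, d):
    """H_GN (Eq. 5): in-strength-weighted mean of local entropies (Eq. 4)."""
    p = transition_weights(x, d)
    nodes = {i for i, _ in p} | {j for _, j in p}
    H = 0.0
    for i in nodes:
        out = [w for (a, _), w in p.items() if a == i]
        if not out:            # node with no outgoing edge (series end)
            continue
        q = np.array(out) / sum(out)              # renormalized p'_{i,j}
        h_i = float(-np.sum(q * np.log2(q)))      # local entropy (Eq. 4)
        s_in = sum(w for (_, b), w in p.items() if b == i)  # in-strength p'_i
        H += s_in * h_i
    return H

x = [4, 7, 9, 10, 6, 11, 3]        # illustrative series
print(global_node_entropy(x, 2))   # ≈ 0.6 for this example
```

For this short series, node '01' has two equiprobable exits (h = 1) and in-strength 3/5, while node '10' always exits to '01' (h = 0), giving H_GN = 0.6.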
Some of these properties of ordinal networks have already been implicitly discussed in previous works [10, 25, 26], but they still lack attention. An important limitation is related to the maximum number of connections of a node. This constraint is not readily apparent for d = 2, because in this case the network displays all possible connections [Figure 1(b)]. On the other hand, there exist six possible permutations (that is, d! = 6) and thus six nodes in ordinal networks with d = 3, but each node connects to only three other nodes. This restriction arises from the fact that a permutation π_s ordering a partition w_s can only be followed by three ordinal patterns associated with the partition w_{s+1}.

To illustrate such constraints, let us consider a partition w_1 = (x_1, x_2, x_3) having the permutation π_1 = (0, 1, 2), that is, an ordering of its elements such that x_1 < x_2 < x_3. Suppose now that the next partition is w_2 = (x_2, x_3, x_4), where x_2 and x_3 are the same elements contained in w_1. Because π_1 = (0, 1, 2), we know that the condition x_2 < x_3 still holds in w_2, and therefore the symbol 0 must precede the symbol 1 in the permutation associated with w_2. Among all possible permutations for d = 3, that is, {(0,1,2), (0,2,1), (1,0,2), (1,2,0), (2,0,1), (2,1,0)}, only (0,1,2), (0,2,1), and (2,0,1) satisfy the previous condition. Thus, there are only three permutations that can appear after π_1, which in turn limits the number of outgoing connections of node (0,1,2) to three. By using the same argument, we can verify that the number of incoming connections is also limited to three and that these constraints hold for all nodes.

These ideas generalize to all values of the embedding dimension d, so that all nodes in an ordinal network have in-degree and out-degree limited to numbers between 0 and d. Consequently, ordinal networks cannot have more than d × (d!) edges. This limitation results from the fact that the ordering of the elements in a partition w_s is partially carried over to the next partition w_{s+1}, since the elements of w_{s+1} comprise d − 1 elements of w_s. Furthermore, the smaller the embedding dimension, the fewer are the elements shared among the time series partitions, and thus the series past rapidly becomes unimportant for determining future permutations.

Another intriguing consequence of the constraints associated with transitions among permutations is related to self-edges in the network.
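These counting arguments are easy to verify by brute force. The sketch below (an illustration, not from the paper) enumerates every permutation that can follow a given pattern when the window slides by one element: the d − 1 surviving elements keep their relative order, so only the rank of the incoming element is free, which confirms that each node has exactly d possible successors; the same enumeration also reveals which patterns can follow themselves, anticipating the self-loop restriction discussed next.

```python
import itertools

def possible_successors(pi):
    """All ordinal patterns that may follow `pi` when the partition
    window advances by one element."""
    d = len(pi)
    # relative order of the surviving elements, re-indexed to 0..d-2
    surviving = [r - 1 for r in pi if r != 0]
    succ = set()
    for rank in range(d):   # possible rank of the incoming element
        succ.add(tuple(surviving[:rank] + [d - 1] + surviving[rank:]))
    return succ

for d in (2, 3, 4):
    for pi in itertools.permutations(range(d)):
        assert len(possible_successors(pi)) == d   # exactly d successors

print(sorted(possible_successors((0, 1, 2))))
# [(0, 1, 2), (0, 2, 1), (2, 0, 1)]
```

The enumeration reproduces the three successors of (0, 1, 2) derived in the text.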
By analyzing all possible transitions, we conclude that self-loops can only exist for two particular nodes of ordinal networks, regardless of the embedding dimension d. These two nodes are associated with the only fully ascending and fully descending permutations, that is, permutations related to partitions in which the elements are successively increasing or decreasing. These nodes correspond to the permutations (0, 1, 2) and (2, 1, 0) in the case of d = 3, and to (0, 1, 2, 3) and (3, 2, 1, 0) for d = 4.

We further note that these constraints refer to ordinal networks built from overlapping partitions shifted by one element at a time. A permutation π_s can be followed by any possible permutation in the case of non-overlapping partitions, since consecutive partitions do not share any time series elements. For time-lagged elements, similar restrictions among permutations emerge for higher-order transitions. These generalized algorithms are also interesting and may deserve further investigation; here, we have focused on ordinal networks directly inspired by the seminal works of Bandt and Shiha [16] and Bandt and Pompe [11], as illustrated in Figure 1.

III. RESULTS

A. Ordinal networks of simple time series
We start our empirical investigation by analyzing the structure of ordinal networks arising from elementary time series. Perhaps the simplest time series to consider is a monotonic (increasing or decreasing) series. In this case, regardless of the embedding dimension d, the ordinal network is composed of a single node (representing the solely increasing or decreasing permutation) with only one self-loop (meaning that the permutation is always followed by itself). Therefore, network metrics for monotonic time series are all trivial.

Periodic series are another example of simple yet more interesting signals. Ordinal networks for simple periodic series form cyclic structures, where the arrangement of nodes and edges alludes to the series itself. If the number of data points within a period of a time series is T and the embedding dimension is d = T, the ordinal network is a cyclic graph (without self-loops) with T nodes (the number of different permutations), since the time series repeats itself after T data points. Furthermore, this cyclic structure does not change if we consider larger embedding dimensions (d > T); only the permutation symbols associated with the network nodes are modified [Figure 2(a)]. Because of this behavior, ordinal networks of periodic signals display a constant average degree [for in and out connections, Figure 2(c)] and a diameter that grows linearly with the embedding dimension [Figure 2(d)]. We can also show that the sum of the weighted distances from one node to all others is Σ_{i=1}^{d−1} (i/d), which is independent of the chosen node due to the graph symmetry. Thus, the average weighted shortest path for the whole network is

    ⟨l⟩_per = (1/d) Σ_{i=1}^{d−1} (i/d) = (d − 1)/(2d) ,    (6)

where the summation represents the weighted distance from a particular node to all others, and the factor 1/d accounts for the average over all nodes in the network [Figure 2(e)].
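A quick numerical check of Eq. (6), under the assumptions stated above (a directed ring of d nodes whose edges all carry weight 1/d, with the distance from a node averaged over all d nodes, including the zero distance to itself):

```python
def average_weighted_path_periodic(d):
    """Average weighted shortest path of the cyclic ordinal network
    of a periodic signal with T = d (every edge carries weight 1/d)."""
    # the node i steps ahead is reached at weighted distance i/d
    return sum(i / d for i in range(d)) / d

for d in range(2, 20):
    assert abs(average_weighted_path_periodic(d) - (d - 1) / (2 * d)) < 1e-12
print(average_weighted_path_periodic(3))   # ≈ 1/3 for d = 3
```

As d grows, ⟨l⟩_per approaches 1/2 from below, consistent with the constant-period ring structure.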
This expression is valid when all existing edges have the same weight, which holds for long time series or series composed of an integer number of periods.

B. Random ordinal networks
We have further investigated ordinal networks emerging from random signals. Differently from monotonic and periodic signals, ordinal networks emerging from random series (henceforward called random ordinal networks) are expected to display all possible connections, since all permutations are equally likely to occur in sufficiently long random signals. Figure 2(b) shows examples of ordinal networks with different embedding dimensions emerging from Gaussian white noise. We notice that these networks are composed of d! nodes and that each node has d outgoing and d incoming links. Because of these properties, the average degree and the diameter of random ordinal networks increase linearly with the embedding dimension d [Figure 2(c) and (d)].

FIG. 2. Ordinal networks of periodic and random time series. (a) Illustration of the mapping of a periodic signal into ordinal networks with different embedding dimensions d (indicated within the panel). (b) Mapping of Gaussian white noise into ordinal networks with different embedding dimensions d (shown within the panel). Black edges indicate the transitions that are twice as likely to occur. (c) Average degree ⟨k⟩, (d) diameter L, and (e) average weighted shortest path ⟨l⟩ as a function of the embedding dimension d for ordinal networks emerging from random (circles) and periodic (triangles) signals. The periodic signals have period T matching the embedding dimension (d = T). The results for the three previous properties hold for sufficiently long time series, that is, when a reliable estimate of all permutation transitions is available.

An intriguing aspect of these random ordinal networks is that edge weights are not all the same, albeit all permutations are equiprobable in random time series. Indeed, one of the d outward edges of every node has twice the weight of the remaining d − 1 edges [black edges in Figure 2(b)], a result that holds for long sequences of random numbers drawn from any continuous probability distribution. To illustrate how this happens, let us consider that the first partition with d = 3 is w_1 = (x_1, x_2, x_3) and that the corresponding permutation is π_1 = (0, 1, 2) (that is, x_1 < x_2 < x_3). The next partition is w_2 = (x_2, x_3, x_4), and the new element x_4 fits one of the following conditions: i) x_4 < x_1 < x_2 < x_3, ii) x_1 < x_4 < x_2 < x_3, iii) x_1 < x_2 < x_4 < x_3, and iv) x_1 < x_2 < x_3 < x_4. Conditions i) and ii) yield π_2 = (2, 0, 1), while iii) results in π_2 = (0, 2, 1) and iv) in π_2 = (0, 1, 2). Thus, there are two possibilities of finding π_2 = (2, 0, 1), which makes the transition (0,1,2) → (2,0,1) twice as likely as (0,1,2) → (0,2,1) or (0,1,2) → (0,1,2). It is also worth noticing that if we draw a large number of samples (x_1, x_2, x_3) such that x_1 < x_2 < x_3, the average values of x_1, x_2, and x_3 converge to the quartiles of the probability distribution of the time series, and therefore the conditions i)-iv) are equiprobable.
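The quartile argument can be verified with a short Monte Carlo experiment (an illustrative sketch, assuming standard Gaussian white noise): conditioning on x_1 < x_2 < x_3 and drawing a new element x_4, the pattern of the next partition (x_2, x_3, x_4) should be (2, 0, 1) about half of the time and (0, 2, 1) or (0, 1, 2) about a quarter of the time each.

```python
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)
counts = Counter()
for _ in range(50_000):
    x1, x2, x3 = np.sort(rng.standard_normal(3))   # impose x1 < x2 < x3
    x4 = rng.standard_normal()                     # incoming element
    pattern = tuple(int(r) for r in np.argsort((x2, x3, x4)))
    counts[pattern] += 1

total = sum(counts.values())
for pattern in sorted(counts):
    print(pattern, round(counts[pattern] / total, 3))
# (2, 0, 1) occurs with frequency ~0.5; the other two with ~0.25 each
```

Only three patterns ever appear, matching the successor constraint of Section II, and the 2:1:1 frequency ratio matches the doubled edge discussed in the text.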
The same idea holds for all other first-order transitions when d = 3 and generalizes to any embedding dimension: the permutation π_{s+1} that is more likely to follow π_s is the one in which the symbol 'd − 1' occupies the position of the symbol '0' in π_s. For instance, if π_s = (0, 1, 2, 3), then π*_{s+1} = (3, 0, 1, 2) is twice as likely as π_{s+1} ∈ {(0, 3, 1, 2), (0, 1, 3, 2), (0, 1, 2, 3)}, because the symbol '3' in π*_{s+1} is located in the same position the symbol '0' occupies in π_s. This result allows us to build the weighted adjacency matrix of random ordinal networks for any embedding dimension d. To do so, we start with a network having d! nodes (each one representing a particular permutation π_i) and draw a directed connection with unitary weight between all pairs of nodes π_i and π_j for which the transition π_i → π_j is possible. Next, we update the weights of all more probable connections from one to two. Finally, the elements of the resulting adjacency matrix p_{i,j} are divided by the factor (d + 1)!, which represents the sum of all unitary weights [d!(d − 1)] plus twice the number of edges with double weights [d!]. By using this weighted adjacency matrix, we can numerically evaluate any network metric of random ordinal networks with arbitrary embedding dimension d. Figure 2(e) shows the average weighted shortest path as a function of d estimated from these theoretical networks, where we observe that this measure approaches zero for large values of d, since edge weights decrease as d increases.

Another interesting property of random ordinal networks that we can analytically estimate is the global node entropy (Eq. 5). To do so, we first notice that the local entropy of node i is

    h_i = − p̃'_{i,j} log p̃'_{i,j} − Σ_{j=1}^{d−1} p'_{i,j} log p'_{i,j} = log(d + 1) − 2/(d + 1) ,    (7)

where p'_{i,j} = 1/(d + 1) are the renormalized transition probabilities from node i to node j that are equiprobable, and p̃'_{i,j} = 2/(d + 1) represents the renormalized transition probability that is twice as likely to occur. By plugging this result into Eq. 5 with p'_i = 1/d!
(equiprobable permutations), we find that the global node entropy for uncorrelated sequences of random numbers is

    H_GN^(rand) = log(d + 1) − (log 4)/(d + 1) .    (8)

We note that the value of H_GN^(rand) is always smaller than the one obtained when all transitions are equiprobable (in that case, H_GN^(equi) = log d), and only when d → ∞ do we have H_GN^(rand) → H_GN^(equi). Thus, the value of H_GN^(rand) does not represent the maximum entropy for a given d, in the sense that there exist time series for which H_GN > H_GN^(rand).

An interesting application of these theoretical random networks is to detect non-random behavior in empirical time series. One option is to directly compare empirical ordinal networks with their random theoretical counterparts via a proper graph distance measure [28, 29]. To illustrate this possibility, we have investigated whether partially ordered Gaussian white noise can be distinguished from pure noise by estimating the edit distance δ [29] between their ordinal networks. For a weighted and directed graph, this simple metric represents the amount of edge weight (strength) that needs to be reallocated so that the two graphs become equal. We have generated an ensemble of white noise series in which a given fraction η of consecutive elements is placed in ascending order. Next, we map these time series into ordinal networks and estimate the average values of the edit distance δ to the theoretical random networks as a function of the fraction of ordered data η. Figures 3(a)-(c) show the behavior of the average value of δ as a function of η, as well as the 95% confidence region, for embedding dimensions d ∈ {2, 3, 4} and for series of different lengths N. As expected, we observe that the values of δ increase with η in all cases. We have also estimated the average value of the edit distance between empirical ordinal networks of white noise series (with the same lengths) and the theoretical random networks for constructing random confidence bands.
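The construction above and the closed form of Eq. (8) can be checked with a short sketch (our illustration; the function names and the partially ordered test series are hypothetical choices, not the authors' setup). It builds the exact random ordinal network for arbitrary d, verifies H_GN^(rand), and measures the edit distance, taken here as the total edge weight that must be reallocated to make two weighted digraphs equal, from empirical white noise networks with and without an ordered stretch.

```python
import itertools
import math
from collections import Counter

import numpy as np

def random_network(d):
    """Exact transition weights p[(pi, pj)] of the random ordinal
    network: unitary weight for every allowed transition, doubled when
    symbol d - 1 occupies the position symbol 0 had in pi, and all
    weights normalized by (d + 1)!."""
    p = {}
    for pi in itertools.permutations(range(d)):
        surviving = [r - 1 for r in pi if r != 0]
        for k in range(d):                      # rank of the new element
            pj = tuple(surviving[:k] + [d - 1] + surviving[k:])
            w = 2 if k == pi.index(0) else 1
            p[(pi, pj)] = w / math.factorial(d + 1)
    return p

def global_node_entropy(p):
    """H_GN (Eq. 5) of a weighted transition graph given as a dict."""
    H = 0.0
    for i in {a for a, _ in p}:
        out = np.array([w for (a, _), w in p.items() if a == i])
        q = out / out.sum()                     # renormalized p'_{i,j}
        s_in = sum(w for (_, b), w in p.items() if b == i)  # in-strength
        H += s_in * float(-np.sum(q * np.log2(q)))
    return H

def empirical_network(x, d):
    """Empirical transition weights (Eq. 3) via Bandt-Pompe patterns."""
    pats = [tuple(int(r) for r in np.argsort(x[s:s + d], kind="stable"))
            for s in range(len(x) - d + 1)]
    pairs = Counter(zip(pats, pats[1:]))
    return {e: c / (len(pats) - 1) for e, c in pairs.items()}

def edit_distance(p, q):
    """Edge weight that must be reallocated so the two graphs match."""
    return 0.5 * sum(abs(p.get(e, 0.0) - q.get(e, 0.0))
                     for e in set(p) | set(q))

d = 3
theo = random_network(d)
print(global_node_entropy(theo))   # ≈ 1.5 = log2(4) - 2/4, as in Eq. (8)

rng = np.random.default_rng(42)
noise = rng.standard_normal(30_000)
partially_ordered = noise.copy()
partially_ordered[:6_000] = np.sort(partially_ordered[:6_000])  # eta = 20%
print(edit_distance(empirical_network(noise, d), theo))             # small
print(edit_distance(empirical_network(partially_ordered, d), theo)) # larger
```

The series with a sorted stretch sits measurably farther from the theoretical network than pure noise, which is the essence of the detection procedure.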
Values of δ outside this random confidence band represent a reliable indicator that an empirical time series displays deviations from purely random behavior. Moreover, these confidence regions also allow us to identify the threshold fraction η* of order from which the edit distance δ is capable of accurately detecting this anomaly. Figures 3(a)-(c) show that this approach is capable of detecting the ordering in time series with 1000 elements already for small threshold fractions η* (vertical dashed lines) when d = 2, 3, and 4. These figures also indicate that the values of η* decrease with the increase of the time series length N.

We have further estimated the average values of η* over 10 realizations of the detection procedure as a function of N. Figure 3(d) shows that η* decreases exponentially with N for d ∈ {2, 3, 4} and that, for very long series, all values of d are equally efficient in detecting the minimal fraction of ordered elements used in our numerical experiments (η = 1%). For shorter series, smaller embedding dimensions detect this simple ordering anomaly at smaller threshold fractions η*. This situation is likely to change if the anomalous pattern becomes more complex, so that ordinal networks with higher embedding dimensions may perform better at detecting the anomaly in such cases. In any case, this example illustrates that there will always be a trade-off between embedding dimension and time series length.

FIG. 3. Detecting non-random behavior with random ordinal networks. Panels (a)-(c) show the average values of the edit distance δ (colored lines) between ordinal networks of partially ordered white noise (η is the fraction of ordered elements) and the theoretical random ordinal networks for embedding dimensions d = 2, 3, and 4, respectively. In each of these panels, results are obtained from ensembles of 1000 time series of different lengths N, and the shaded regions indicate the confidence bands of δ calculated from finite random series. Vertical dashed lines indicate the threshold fraction η* of ordered elements at which a partially ordered random series becomes distinguishable from a totally random series of the same length. (d) Threshold fraction η* as a function of the time series length N for different embedding dimensions d ∈ {2, 3, 4}. The markers represent average values of η* estimated from 10 replicas, the shaded regions are one standard deviation bands, and the dashed line indicates the minimum fraction of ordered elements used in our numerical experiments (1%).

By knowing the exact form of random ordinal networks, we can also estimate the fraction of missing transitions in empirical time series. This analysis is somewhat similar to that of missing permutations or forbidden patterns introduced by Amigó et al. [30-32] and explored by several works [33-35]. These works have observed that some ordinal patterns cannot occur in chaotic systems [for instance, the permutation (2, 1, 0) never appears in the logistic map under fully developed chaos] and that even stochastic processes may present missing ordinal patterns depending on the time series length and the choice of the embedding dimension. However, the number of missing patterns in random processes decays as the time series becomes longer and, because of that, such patterns are often called "false forbidden patterns." Figure 4(a) illustrates how the fraction of missing permutations in Gaussian white noise decreases with the increase of the time series length for different embedding dimensions d ∈ {3, 4, 5, 6, 7, 8}. To extend these results, we test whether noisy series also present "false forbidden transitions." To do so, we evaluate ordinal networks from Gaussian white noise series of length N and compare their number of transition links with those in the exact form of the random ordinal network. Figure 4(b) shows these values for different embedding dimensions, where we observe that white noise series also present "false missing transitions" depending on the values of N and d. We note that the shape of these curves resembles those observed for the fraction of missing permutations; however, the number of missing transitions decreases more slowly with N than the number of missing permutations. It is also worth mentioning that the edit distance calculated between these empirical ordinal networks and the exact form of random ordinal networks quantifies not only the existence of missing transitions, but also the differences in their occurrence frequencies in a time series.

C. Ordinal networks of noisy periodic time series
It is also interesting to investigate how ordinal networks of periodic signals transform into random ordinal networks as we add noise to such time series. To do so, we generate sawtooth-like signals of period T, that is, x_t = {0, 1, 0, 1, ...} for T = 2, x_t = {0, 1/2, 1, 0, 1/2, 1, ...} for T = 3, and so on, to which we add noise uniformly distributed in the interval [−ξ, ξ], where ξ is a parameter controlling the signal-to-noise ratio.

FIG. 4. Missing ordinal patterns and missing permutation transitions in random series. (a) Fraction of missing permutations and (b) fraction of missing permutation transitions in Gaussian white noise series of unitary variance as a function of the time series length N for different values of the embedding dimension d ∈ {3, 4, 5, 6, 7, 8} (indicated by the legend). In both plots, the colored curves represent average values over 100 realizations of white noise series for each embedding dimension, and the shaded regions are 95% confidence intervals.

Figure 5(a) shows examples of ordinal networks obtained from these noisy periodic signals for ξ ∈ {0, 0.4, 0.8, 2} and d = T = 3. As expected, the ring-like structure of the periodic ordinal network (ξ = 0) approaches a random ordinal network as we increase the value of ξ. We notice that this gradual process starts with the emergence of all possible network nodes (which happens for small values of ξ), followed by the fading of the three initial links of the network for the largest values of ξ.

We have also compared the robustness to noise addition of the global node entropy H_GN (Eq. 5) and the standard permutation entropy H (Eq. 2). For that purpose, we generate an ensemble of 100 sawtooth-like signals for each of several values of ξ between 0 and 2. Next, we estimate the values of H_GN and H for each time series and the average values of these quantities for each value of ξ. We have further normalized these quantities, dividing the values of H_GN by H_GN^(rand) (Eq. 8) and the values of H by log d!. Figure 5(b) shows the mean of the normalized values of H_GN and H as a function of ξ for different embedding dimensions. We observe that the values of H_GN are more robust than those of H against noise addition and, therefore, more efficient for distinguishing among time series with different values of ξ. This feature is most evident for d = T = 2 because in this case x_t = {0, 1, 0, 1, ...}, and so the two ordinal patterns are equiprobable even when ξ < 0.5, making the normalized value of H always equal to 1. Conversely, the global node entropy is zero for ξ < 0.5 and only departs from zero for ξ greater than 0.5.
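As a concrete illustration, the sketch below generates a noisy T = 2 sawtooth and computes both quantities. The function names are ours, and `global_node_entropy` only sketches the idea behind Eq. 5 (node-probability-weighted entropies of the outgoing transition probabilities); it is not the paper's exact normalized form.

```python
import math
import random
from collections import Counter

def ordinal_patterns(x, d):
    """Sequence of ordinal (Bandt-Pompe) patterns of embedding dimension d."""
    return [tuple(sorted(range(d), key=lambda i: x[t + i]))
            for t in range(len(x) - d + 1)]

def permutation_entropy(x, d):
    """Permutation entropy H normalized by its maximum value log d!."""
    counts = Counter(ordinal_patterns(x, d))
    n = sum(counts.values())
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(math.factorial(d))

def global_node_entropy(x, d):
    """Average of local transition entropies weighted by node probabilities
    (a sketch of the idea behind H_GN; normalization omitted)."""
    pats = ordinal_patterns(x, d)
    nodes = Counter(pats)
    links = Counter(zip(pats[:-1], pats[1:]))
    h_gn = 0.0
    for a, na in nodes.items():
        out = [w for (u, _), w in links.items() if u == a]
        tot = sum(out)
        if tot:
            h_local = -sum((w / tot) * math.log(w / tot) for w in out)
            h_gn += (na / len(pats)) * h_local
    return h_gn

# Noisy sawtooth with T = 2 and noise uniform in [-xi, xi], here xi < 0.5.
random.seed(1)
xi = 0.3
signal = [t % 2 + random.uniform(-xi, xi) for t in range(1000)]
```

For ξ < 0.5 the two d = 2 patterns strictly alternate, so the normalized permutation entropy of `signal` is essentially 1 while its transition entropy is exactly zero, matching the discussion above.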
For larger embedding dimensions, we observe that the permutation entropy H approaches 1 much faster than H_GN. For instance, the values of H are unable to distinguish these noisy periodic series if ξ > 1, whereas the values of H_GN are still able to differentiate them.

D. Ordinal networks of fractional Gaussian noise and fractional Brownian motion
We have also investigated ordinal networks emerging from time series of fractional Gaussian noise and fractional Brownian motion [36, 37]. Fractional Brownian motion is a stochastic process exhibiting long-range correlations and characterized by a parameter h ∈ (0, 1), the Hurst exponent. The increments of a fractional Brownian motion are Gaussian distributed with zero mean, stationary, and usually called fractional Gaussian noise. The Hurst exponent h controls the roughness of these stochastic processes. For h < 1/2, a fractional Gaussian noise is anti-persistent, meaning (roughly speaking) that positive values are followed by negative values (or vice versa) more frequently than by chance. On the other hand, a fractional Gaussian noise is persistent if h > 1/2, and the usual uncorrelated Gaussian noise is recovered as h → 1/2. For fractional Brownian motion, the larger the value of h, the smoother is the generated time series.

Figure 6(a) and (b) show examples of fractional Gaussian noises and fractional Brownian motions for different values of h generated with the procedure of Hosking [38]. These figures also depict visualizations of the corresponding ordinal networks. We note that fractional Gaussian noises display more unequal edge weights for small values of h, while fractional Brownian motions display more balanced weights for low values of h.

FIG. 5. Robustness to noise addition of permutation entropy and global node entropy. (a) Visualizations of ordinal networks related to noisy periodic signals for different values of ξ (ξ = 0.00, 0.40, 0.80, and 2.00, shown within the panel). (b) Average values of the permutation entropy H (circles) and global node entropy H_GN (triangles) obtained from an ensemble of 100 realizations of noisy periodic signals as a function of ξ for different embedding dimensions, where the period T is set equal to the embedding dimension (T = d). The permutation entropy is normalized by its maximum value and the global node entropy by the value of the random ordinal network (Eq. 8). The tiny shaded regions correspond to one standard deviation bands, and dashed lines are the values for random ordinal networks.

In the case of fractional Brownian motions, it is worth noticing that the auto-loop weight associated with the decreasing permutation becomes quite intense for large values of h, reflecting the downward trends of those particular realizations of the stochastic process. One possibility for quantifying inequality in edge weights is to calculate the Gini index [39]. This coefficient is widely used in several disciplines (especially in economics) and represents a measure of statistical dispersion of probability distributions. Values of the Gini index close to zero show that the weights are equally distributed, while values close to 1 indicate a sharp inequality in the weight distribution.
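As a concrete illustration, the Gini index of a set of edge weights can be computed from the sorted weights using the standard closed form based on the relative mean absolute difference; this is a minimal sketch, not the paper's code:

```python
def gini(weights):
    """Gini index of a list of non-negative weights (0 means perfectly equal;
    values near 1 mean a few edges concentrate almost all of the weight)."""
    w = sorted(weights)
    n = len(w)
    total = sum(w)
    # Closed form: G = sum_i (2i - n + 1) * w_i / (n * sum_j w_j), 0-based i.
    return sum((2 * i - n + 1) * v for i, v in enumerate(w)) / (n * total)

print(gini([0.25, 0.25, 0.25, 0.25]))  # equal weights -> 0.0
print(gini([0.0, 0.0, 0.0, 1.0]))      # maximal concentration -> 0.75
```

Applied to the edge weights of an ordinal network, `gini` quantifies how concentrated the transition probabilities are; note that with n weights and a single nonzero entry the index equals (n − 1)/n rather than exactly 1.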
We have estimated the Gini index from an ensemble of ordinal networks associated with fractional Gaussian noise and fractional Brownian motion with different Hurst exponents (100 realizations for each h ∈ {0.05, 0.10, ..., 0.95}). The average values of the Gini coefficient are shown in Figure 6(c), where the results confirm our previous visual analysis. For fractional Gaussian noise, we observe that the Gini index decreases as the value of h rises up to h ≈ 0.5, from where it displays a plateau whose value is very close to the Gini index of a random ordinal network. Conversely, the Gini coefficient systematically increases with h for fractional Brownian motion, reflecting the rise in persistent behavior.

We have also evaluated the average values of the normalized global node entropy H_GN in comparison with the normalized permutation entropy H. Figure 6(d) shows the results for fractional Gaussian noises. We note that the behavior of H versus h is more concave than that of H_GN versus h, and the values of H_GN display a much broader range of variation than those of H. This behavior is similar to what we have reported for noisy periodic signals [Figure 5(b)] and provides further support to the hypothesis that the global node entropy has larger discriminating power than the usual permutation entropy.

We have further calculated the average weighted shortest path ⟨l⟩ of ordinal networks mapped from fractional Brownian motion as a function of the Hurst exponent h. Figure 6(e) shows that the values of ⟨l⟩ monotonically decrease with h for different embedding dimensions. This monotonic behavior of ⟨l⟩ versus h suggests that we can use the average weighted shortest path to predict the Hurst exponent. To systematically test this possibility, we have built a statistical learning regression task in which the Hurst exponents of fractional Brownian motions with h ∈ {0.05, 0.10, 0.15, ..., 0.95}, generated via Hosking's method, are predicted from the values of ⟨l⟩.

FIG. 6. Ordinal networks of fractional Gaussian noises and fractional Brownian motions. (a) Illustration of the mapping of fractional Gaussian noise samples with different Hurst exponents (h, shown within the panel) into ordinal networks. (b) Illustration of the mapping of fractional Brownian motion samples with different Hurst exponents (h, shown within the panel) into ordinal networks. In the previous panels, the thickness of the links is proportional to the edge weights and all time series are generated with the same starting seed. (c) Dependence of the Gini index associated with the edge weight distribution on the Hurst exponent h for fractional Gaussian noise (circles) and fractional Brownian motion (triangles). (d) Permutation entropy H (triangles) and global node entropy H_GN (circles) as a function of the Hurst exponent h for fractional Gaussian noise. The permutation entropy is normalized by its maximum value and the global node entropy by the value of the random ordinal network (Eq. 8). (e) Average weighted shortest path ⟨l⟩ as a function of the Hurst exponent h for different embedding dimensions.
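The average weighted shortest path ⟨l⟩ can be sketched as below. The excerpt does not restate how edge weights are converted into lengths, so we assume, purely for illustration, that each directed edge has length equal to the reciprocal of its transition weight; other conventions (for instance, minus the logarithm of the weight) can be swapped in directly.

```python
import heapq
import random
from collections import Counter

def ordinal_network(x, d):
    """Weighted directed ordinal network: nodes are ordinal patterns and
    edge weights are relative frequencies of successive-pattern transitions."""
    pats = [tuple(sorted(range(d), key=lambda i: x[t + i]))
            for t in range(len(x) - d + 1)]
    links = Counter(zip(pats[:-1], pats[1:]))
    total = sum(links.values())
    return {edge: c / total for edge, c in links.items()}

def avg_weighted_shortest_path(net):
    """Average Dijkstra distance over all reachable ordered node pairs,
    assuming (for illustration only) edge length = 1 / edge weight."""
    adj, nodes = {}, set()
    for (a, b), w in net.items():
        adj.setdefault(a, []).append((b, 1.0 / w))
        nodes.update((a, b))
    lengths = []
    for s in nodes:
        dist, heap = {s: 0.0}, [(0.0, s)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > dist[u]:
                continue  # stale heap entry
            for v, l in adj.get(u, ()):
                if du + l < dist.get(v, float("inf")):
                    dist[v] = du + l
                    heapq.heappush(heap, (du + l, v))
        lengths += [lv for k, lv in dist.items() if k != s]
    return sum(lengths) / len(lengths)

random.seed(0)
net = ordinal_network([random.random() for _ in range(5000)], 3)
```

With this convention, `avg_weighted_shortest_path(net)` makes frequent transitions correspond to short edges, so the quantity is finite whenever node pairs are mutually reachable through directed paths.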
FIG. 7. Estimating the Hurst exponent with ordinal networks. Panels (a)-(c) show the relationship between true and predicted values of the Hurst exponent obtained via ordinal networks, detrended fluctuation analysis (DFA), and quantile graphs (q = 50), respectively. Predictions are obtained by applying a K-nearest neighbors regressor algorithm to the values of the average weighted shortest path ⟨l⟩ (for ordinal networks) and vertex degree (for quantile graphs) related to fractional Brownian motion series of 1024 data points. For DFA, the value of h is obtained by least squares fitting of the relationship between the fluctuation function F(s) and the scale parameter s on a log-log scale (that is, log F(s) ∝ h log s). (d) The bar plot shows the accuracy of each approach as measured by the coefficient of determination R^2. Error bars are 95% bootstrapping confidence intervals with 1000 resamples. (e) Dependence of R^2 on the number of nodes in ordinal networks (squares) and quantile graphs (circles).

By using this data set, we have trained a K-nearest neighbors regressor algorithm [40] (see Appendix A) with 75% of all series and used a five-fold cross-validation approach to select the optimal number of neighbors K. The remaining 25% of the series (never exposed to the learning algorithm) are used for testing the accuracy of the predictions. Figure 7(a) shows the relationship between true and predicted values of the Hurst exponent, where a high accuracy (as measured by the coefficient of determination R^2) is achieved. This represents a remarkable precision, particularly when considering that the series contain only 1024 elements. To provide a baseline accuracy, we have compared the results obtained from ordinal networks with those from detrended fluctuation analysis (DFA) [14], a widely used approach for estimating the Hurst exponent that is considered a cutting-edge and reliable method [15]. Figure 7(b) shows the relationship between true and predicted/estimated values of h obtained by applying DFA to 25% of the same ensemble of series. We immediately note a larger dispersion of this relationship, which is quantified by a smaller value of R^2.

We have also compared ordinal networks with quantile graphs [41, 42], another approach for mapping time series into networks. In quantile graphs, nodes represent a number q of quantiles of the empirical probability distribution associated with the time series, and links are drawn based on the temporal succession of the quantiles related to each time series element. We have built quantile graphs with the same data set used for ordinal networks and tested several network metrics (average vertex degree, average weighted shortest path, diameter, clustering coefficient) as predictive features of the Hurst exponent. We find that the average vertex degree (in, out, or both combined) displays the best performance for this regression task. The average weighted shortest paths are problematic for quantile graphs due to the emergence of infinite distances, associated either with more than one network component or with nodes unreachable through the directed paths. Figure 7(c) shows the relationship between true and predicted values of the Hurst exponent for quantile graphs with 50 quantiles.

It is worth remarking that the number of quantiles q plays a role analogous to the embedding dimension d, since both parameters define the number of nodes in the mapped network. Thus, we need to fine-tune these parameters to have a fairer comparison between both approaches. To do so, we have trained the learning algorithms using different values of d and of the number of quantiles q, and estimated the values of R^2 in the test set for each combination.
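For reference, the DFA baseline (log F(s) ∝ h log s, as in the caption of Figure 7) can be sketched as follows, assuming non-overlapping windows and local linear detrending; this is a minimal sketch, not the implementation used by the authors:

```python
import numpy as np

def dfa_exponent(x, scales=(4, 8, 16, 32, 64, 128)):
    """Estimate the DFA scaling exponent: the slope of log F(s) versus log s,
    where F(s) is the RMS fluctuation around linear fits in windows of size s."""
    y = np.cumsum(np.asarray(x, dtype=float) - np.mean(x))  # integrated profile
    fs = []
    for s in scales:
        n_win = len(y) // s
        msq = []
        for k in range(n_win):
            seg = y[k * s:(k + 1) * s]
            t = np.arange(s)
            coef = np.polyfit(t, seg, 1)  # local linear trend
            msq.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        fs.append(np.sqrt(np.mean(msq)))
    slope, _ = np.polyfit(np.log(scales), np.log(fs), 1)
    return slope
```

For fractional Gaussian noise the fitted slope estimates h directly, while for fractional Brownian motion it equals h + 1; for uncorrelated white noise it is close to 0.5.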
Figure 7(e) shows R^2 as a function of the number of network nodes for both approaches. We observe that increasing d systematically reduces the performance of the regression task, so that the optimum occurs for d = 2. For quantile graphs, the performance degrades when the number of quantiles is either too small (q ⪅ 10) or too large, with the best results occurring at intermediate values of q. More importantly, the comparison between the optimal values of d and q shows that ordinal networks (with d = 2) significantly outperform quantile graphs in this regression task.

E. Ordinal networks of earthquake magnitude series
As the last application, we have investigated ordinal networks emerging from time series of Earth seismic activity. In particular, we have analyzed earthquake magnitude time series from the Southern California Seismic Network [43] between the years 1990 and 2019. We have asked whether ordinal networks are capable of detecting changes in the behavior of magnitude series caused by the occurrence of a large earthquake event (mainshock). To test this possibility, we have selected all events of magnitude higher than 7.0 and built two time series composed of N events before and N events after the mainshock. Figure 8(a) shows an example of before and after time series for N = 200 related to the M 7.2 2010 Baja California earthquake that took place on April 4, 2010, at Guadalupe Victoria, a small city in the Mexican state of Baja California [44]. There were also two other large earthquakes in our data: the M 7.3 1992 Landers (June 28, 1992) [45] and the M 7.1 1999 Hector Mine (October 16, 1999) [46] earthquakes.

We have thus mapped the before and after magnitude series of those three mainshocks into ordinal networks (with N = 200 events each) and estimated the average weighted shortest path ⟨l⟩ of the before and after networks. The results of Figure 8(b) show that the values of ⟨l⟩ always decrease after the occurrence of a large earthquake. We have also verified that this result is robust against changes in the number of before and after events within the range 150-300, as shown in Figure 8(c). The decrease in ⟨l⟩ after the occurrence of a large mainshock is likely to be related to Omori's law [47], one of the fundamental seismic laws, which establishes that the number of aftershocks per unit of time decays as a power-law function of the elapsed time since the mainshock. We have verified that the Omori decay increases the persistence of the time series after mainshock events, as quantified by the lag-1 autocorrelation coefficient, which rises after the mainshocks. Thus, the decrease in ⟨l⟩ may be associated with the increase in the persistent behavior of time series after mainshock events. It is also worth mentioning that the small length of these time series (N ≤ 300) makes detecting such changes a challenging task.

IV. CONCLUSIONS
We have presented an investigation of ordinal networks mapped from time series of stochastic nature. In particular, we have analyzed ordinal networks emerging from random series, noisy periodic signals, fractional Brownian motions, and earthquake magnitude sequences. We have provided a detailed description of random ordinal networks, revealing some counter-intuitive properties such as the non-uniform distribution of edge weights and the existence of auto-loops only at nodes related to solely increasing or decreasing permutations. We have also proposed an approach capable of building the exact form of random ordinal networks, which in turn can be used for detecting non-random behavior in time series and missing permutation transitions. Our results for noisy periodic signals indicate that the global node entropy estimated from ordinal networks is more robust against the presence of noise than the standard permutation entropy. We have further demonstrated the usefulness of ordinal networks for estimating the Hurst exponent of time series and for detecting sudden changes in earthquake magnitude series after the occurrence of large mainshock events.

We thus believe our work contributes to a better understanding of the general properties of ordinal networks, shedding light upon results and applications related to time series of stochastic nature.
ACKNOWLEDGMENTS
This research was supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, Grants 407690/2018-2 and 303121/2018-1).
FIG. 8. Detecting changes in Earth seismic activity with ordinal networks. (a) Earthquake magnitude time series before (red) and after (purple) the M 7.2 2010 Baja California earthquake. (b) Average weighted shortest path ⟨l⟩ estimated from ordinal networks built with before and after time series (200 events each) associated with three major earthquakes (indicated within the plot). We note that the values of ⟨l⟩ decrease after the mainshock occurrence in all three cases. (c) Each colored curve shows the difference between the average weighted shortest path before (⟨l⟩_before) and after (⟨l⟩_after) the mainshock as a function of the number of before and after events (that is, the time series length N). We observe that ⟨l⟩_before − ⟨l⟩_after is always positive for N within the range 150-300, which corroborates the robustness of our results.

Appendix A: Statistical learning algorithm
The K-nearest neighbors (KNN) algorithm is a supervised statistical learning method used in both classification and regression tasks [40]. The term supervised indicates that the algorithm analyzes a fraction of the data set (the training set) to produce an inferred function that is then used for predicting the behavior of new data instances. In particular, the KNN algorithm determines the value of a new observation by averaging the values of the K closest data points (in the space spanned by the independent variables of the training set). Thus, the number of nearest neighbors K is a parameter of this algorithm, which is usually determined by simultaneously minimizing the bias and variance errors of the predictions [40]. Bias errors happen when the learning model is too simple to provide an adequate description of the data. Variance errors, on the other hand, emerge when a complex model adjusts very well to the training set but is not able to generalize to unseen instances.

As mentioned in the main text, we have used the average weighted shortest path ⟨l⟩ of ordinal networks and the average vertex degree ⟨k⟩ of quantile graphs in order to estimate the Hurst exponent h of samples of fractional Brownian motion with the KNN algorithm. To do so, we generate an ensemble of 100 pairs of values (h, ⟨l⟩) and (h, ⟨k⟩) for each h ∈ {0.05, 0.10, 0.15, ..., 0.95}. We then randomly select 75% of these data for training the KNN algorithm and leave the remaining 25% for testing the accuracy of the predictions. Using the training set, we have employed a five-fold cross-validation approach [40] in order to determine the optimal value of the parameter K. This process consists of splitting the training set into 5 subsets, using one of the subsets for validating the algorithm and the remaining 4 for training. This process is repeated 5 times so that each subset is used once for validation.
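The procedure above can be sketched as follows. The data here are a synthetic stand-in for the actual (h, ⟨l⟩) pairs (a monotone one-dimensional feature plus Gaussian noise), and all function names are ours:

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(t for _, t in nearest) / k

def cv_score(data, k, folds=5):
    """Mean R^2 over five folds: each fold is held out once for validation."""
    scores = []
    for f in range(folds):
        val = data[f::folds]
        train = [p for i, p in enumerate(data) if i % folds != f]
        preds = [knn_predict(train, x, k) for x, _ in val]
        truth = [t for _, t in val]
        mean_t = sum(truth) / len(truth)
        ss_res = sum((p - t) ** 2 for p, t in zip(preds, truth))
        ss_tot = sum((t - mean_t) ** 2 for t in truth)
        scores.append(1 - ss_res / ss_tot)
    return sum(scores) / folds

# Synthetic stand-in for (feature, Hurst) pairs: monotone relation plus noise.
random.seed(0)
data = [(h + random.gauss(0, 0.05), h)
        for h in [i / 100 for i in range(5, 96)] for _ in range(5)]
random.shuffle(data)

# Pick the K with the highest cross-validated score (odd K from 1 to 29).
best_k = max(range(1, 30, 2), key=lambda k: cv_score(data, k))
```

Plotting `cv_score(data, k)` against k reproduces the validation curves described above, and `best_k` corresponds to their maximum.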
In each step, we estimate the accuracy (here the coefficient of determination R^2) on the training subsets (the training score) and on the validating subset (the validation score). After the 5 repetitions, we estimate the average values of the training and validation scores and their confidence intervals. The plot of these scores as a function of the number of nearest neighbors K (the so-called validation curves) allows us to identify the optimal value of this parameter, that is, the one corresponding to the highest validation score. With this procedure, we find that the optimal value of K for the results reported in the main text is K = 125 for ordinal networks [Figure 7(a)] and K = 77 for quantile graphs [Figure 7(c)].

[1] A. Ziletti, D. Kumar, M. Scheffler, and L. M. Ghiringhelli, Nature Communications, 2775 (2018).
[2] H. Y. Sigaki, M. Perc, and H. V. Ribeiro, Proceedings of the National Academy of Sciences, E8585 (2018).
[3] C. A. Mattmann, Nature, 473 (2013).
[4] G. Nicolis, A. G. Cantu, and C. Nicolis, International Journal of Bifurcation and Chaos, 3467 (2005).
[5] J. Zhang and M. Small, Physical Review Letters, 238701 (2006).
[6] L. Lacasa, B. Luque, F. Ballesteros, J. Luque, and J. C. Nuño, Proceedings of the National Academy of Sciences, 4972 (2008).
[7] Y. Zou, R. V. Donner, N. Marwan, J. F. Donges, and J. Kurths, Physics Reports, 1 (2019).
[8] L. Lacasa and R. Toral, Physical Review E, 036120 (2010).
[9] J. Schnakenberg, Reviews of Modern Physics, 571 (1976).
[10] M. Small, in (2013), pp. 2509–2512.
[11] C. Bandt and B. Pompe, Physical Review Letters, 174102 (2002).
[12] M. Small, M. McCullough, and K. Sakellariou, in (2018), pp. 1–5.
[13] J. Zhang, J. Zhou, M. Tang, H. Guo, M. Small, and Y. Zou, Scientific Reports, 7795 (2017).
[14] C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, and A. L. Goldberger, Physical Review E, 1685 (1994).
[15] Y.-H. Shao, G.-F. Gu, Z.-Q. Jiang, W.-X. Zhou, and D. Sornette, Scientific Reports, 835 (2012).
[16] C. Bandt and F. Shiha, Journal of Time Series Analysis, 646 (2007).
[17] Y. Cao, W.-w. Tung, J. B. Gao, V. A. Protopopescu, and L. M. Hively, Physical Review E, 046217 (2004).
[18] O. A. Rosso, H. A. Larrondo, M. T. Martin, A. Plastino, and M. A. Fuentes, Physical Review Letters, 154102 (2007).
[19] C. E. Shannon, Bell System Technical Journal, 379 (1948).
[20] M. McCullough, M. Small, H. H. C. Iu, and T. Stemler, Philosophical Transactions of the Royal Society A, 20160292 (2017).
[21] A. M. Unakafov and K. Keller, Physica D, 94 (2014).
[22] J. Jonsson, Simplicial Complexes of Graphs (Springer, 2008).
[23] P. Baudot, M. Tapia, D. Bennequin, and J.-M. Goaillard, Entropy, 869 (2019).
[24] M. Anđelković, N. Gupte, and B. Tadić, Physical Review E, 052817 (2015).
[25] M. McCullough, M. Small, T. Stemler, and H. H.-C. Iu, Chaos, 053101 (2015).
[26] X. Sun, M. Small, Y. Zhao, and X. Xue, Chaos, 024402 (2014).
[27] C. Masoller, Y. Hong, S. Ayad, F. Gustave, S. Barland, A. J. Pons, S. Gómez, and A. Arenas, New Journal of Physics, 023068 (2015).
[28] C. Donnat and S. Holmes, The Annals of Applied Statistics, 971 (2018).
[29] P. Wills and F. G. Meyer, arXiv preprint arXiv:1904.07414 (2019).
[30] J. M. Amigó, L. Kocarev, and J. Szczepanski, Physics Letters A, 27 (2006).
[31] J. M. Amigó, S. Zambrano, and M. A. F. Sanjuán, Europhysics Letters (EPL), 50001 (2007).
[32] J. Amigó, S. Zambrano, and M. A. Sanjuán, EPL (Europhysics Letters), 60005 (2008).
[33] M. McCullough, K. Sakellariou, T. Stemler, and M. Small, Chaos, 123103 (2016).
[34] K. Sakellariou, M. McCullough, T. Stemler, and M. Small, Chaos: An Interdisciplinary Journal of Nonlinear Science, 123104 (2016).
[35] F. Olivares, L. Zunino, and D. G. Pérez, Physica A, 122100 (2019).
[36] B. B. Mandelbrot, The Fractal Geometry of Nature (Freeman, San Francisco, 1982).
[37] B. B. Mandelbrot and J. W. Van Ness, SIAM Review, 422 (1968).
[38] J. R. M. Hosking, Water Resources Research, 1898 (1984).
[39] C. Gini, The Economic Journal, 124 (1921).
[40] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning: With Applications in R (Springer, New York, 2014).
[41] A. S. L. O. Campanharo, M. I. Sirer, R. D. Malmgren, F. M. Ramos, and L. A. N. Amaral, PLoS ONE, e23378 (2011).
[42] A. S. Campanharo and F. M. Ramos, Physica A, 43 (2016).
[43] "California Institute of Technology and United States Geological Survey Pasadena - Southern California Seismic Network." Available: , Accessed: 28 Jun 2019.
[44] "Tectonic Summary – M 7.2 - 12km SW of Delta, B.C., MX." Available: https://earthquake.usgs.gov/earthquakes/eventpage/usp000habu/executive, Accessed: 28 Jun 2019.
[45] "Tectonic Summary – M 7.3 - Landers, California Earthquake." Available: https://earthquake.usgs.gov/earthquakes/eventpage/usp00059sn/executive, Accessed: 28 Jun 2019.
[46] "Tectonic Summary – M 7.1 - 16km SW of Ludlow, CA." Available: https://earthquake.usgs.gov/earthquakes/eventpage/usp0009fwb/executive, Accessed: 28 Jun 2019.
[47] T. Utsu, Y. Ogata, and R. S. Matsu'ura, Journal of Physics of the Earth 43, 1 (1995).