A Look into Chaos Detection through Topological Data Analysis
AA Look into Chaos Detection through Topological Data Analysis
Joshua R. Tempelman and Firas A. Khasawneh
Department of Mechanical Engineering, Michigan State University, 428 S. Shaw Ln. East Lansing, MI 48824
Abstract.
Traditionally, computation of Lyapunov exponents has been the marque method for iden-tifying chaos in a time series. Recently, new methods have emerged for systems with both known andunknown models to produce a definitive 0–1 diagnostic. However, there still lacks a method whichcan reliably perform an evaluation for noisy time series with no known model. In this paper, wepresent a new chaos detection method which utilizes tools from topological data analysis. Bi-variatedensity estimates of the randomly projected time series in the p - q plane described in Gottwald andMelbourne’s approach for 0–1 detection are used to generate a gray-scale image. We show thatsimple statistical summaries of the 0D sub-level set persistence of the images can elucidate whetheror not the underlying time series is chaotic. Case studies on the Lorenz and Rossler attractors aswell as the Logistic Map are used to validate this claim. We demonstrate that our test is comparableto the 0–1 correlation test for clean time series and that it is able to distinguish between periodicand chaotic dynamics even at high noise-levels. However, we show that neither our persistence basedtest nor the 0–1 test converge for trajectories with partially predicable chaos, i.e. trajectories with across-distance scaling exponent of zero and a non-zero cross correlation. Key words.
Topological data analysis, chaos detection, persistent homology, time series
Chaos is an obscure phenomenon which appears in many physical and theoretical systems. Chaoticsystems display a hyper-sensitivity to initial conditions, and their subsequent dynamics are difficult toanticipate as they are seemingly random and sporadic. Differentiating a chaotic time series from aperiodic one is not a trivial task. The traditional recipe for doing so utilizes a tangent space methodwhere, for example, the sign of the maximal Lyuponov exponent is used to classify the systems behavior.This method has proven reliable, although it is not straightforward to implement for systems with noknown model, even with the procedures given by Bennetin [1] and Wolf et al . [2]. Additionally, thetangent space methods perform poorly when dealing with noisy finite time series, and this is a majorlimitation when analyzing data from a high-dimensional and noisy system.There have been recent efforts to detect chaos in time series such as the binary test introduced byWernecke et al . [3] which can also distinguish strong chaos from partially predictable chaos. Like theLyuponov exponent procedure, this method is reliant on the time–evolution of initially close trajectories.Thus, a known model is required to deliver a diagnosis. Additionally, there are no results given in [3] onthe effects of noise on the scoring method.A popular emergent tool for chaos detection in a time series is the 0–1 test which was introduced byGottwald and Melbourne in 2004 [4]. Unlike tangent space methods, knowledge of the full state–space isnot required to perform the 0–1 test. Since a state-space reconstruction is not required, the 0–1 methodis favorable for data streams for which not all state variables are observable; this is the case for manyengineered and natural systems.The 0–1 test has been presented in two forms: the regression test and the correlation test (seeSection 1.2 and Refs. [4, 5, 6, 7]). The correlation test has been shown to perform exceptionally well fornoise-free time series [6]. However, when noise contaminates the data, a modified version of the originalregression test desensitizes the scores to noise [5, 6]. Alternatively, the correlation method handles noiseexceptionally well if “gauge” values are given for a time series known to be periodic [8]. However, therequired gauge score is not always obtainable in real world systems.In this paper, we present an alternative approach for combating the sensitivity of the 0–1 test tonoise. Our approach combines elements from the 0–1 recipe with a tool from Topological Data Analysis(TDA) specifically persistent homology. We leverage the stability of persistent homology in the presenceof noise to deliver a method for identifying the shift-points between chaotic and regular dynamics inmoderately noisy time series.
Address correspondence to [email protected]
Code and data available at Mendeley repository
Chaos Detection with Persistent Homology
Accepted February 26, 2020 in
Physica D: Nonlinear Phenomenon , DOI : a r X i v : . [ n li n . C D ] M a r haos detection through TDA Tempelman and Khasawneh The paper is organized as follows. In Sections 1.1 and 1.2, we introduce the dynamic regimes we arestudying and review the 0–1 test, respectively. In Section 2, we describe persistent homology and howthis tool can be used in conjunction with projections of the time series data in order to identify the shiftpoint between chaotic and periodic dynamics. Section 3 investigates the robustness of our approach tonoise, and the summarizing conclusion is given in Section 4.
The exposition of our method uses simulations of the Lorenz model. The equations of motion for thisdynamical system are defined as [9] ˙ x = σ ( y − x ) , ˙ y = x ( ρ − z ) − y, ˙ z = xy − βz. (1)where nominal definitions of the Lorenz constants for this study are β = 8 / , σ = 10 , and ρ ∈ [180 . , . is left as a bifurcation parameter. The manipulation of ρ can lead to three distinctdynamic regimes as shown in Fig. 1. In [3], it is shown that the Lorenz system displays periodic, par-tially predictable chaos (PPC), and chaotic behavior over the range ρ ∈ [180 . , . . These regimesare formally defined in [3] as strong chaos, PPC, and laminar flow. We implement the 0–1 tests of [5, 6]as well as our proposed persistence–based method on time series which were shown in [3] to be PPC.The PPC trajectories are characterized by having a non-zero cross correlation for time-scales beyondthe Lyapunov prediction time and a cross-distance scaling exponent ν of zero. The cross-correlation isdefined as C = 1 s (cid:104) [ x ( t ) − E ( x )] · [( x ( t ) − E ( x )] (cid:105) , (2)where E ( x ) is the expected value given by E ( x ) = lim N →∞ N N (cid:88) j =1 x ( j ) , (cid:104) (cid:105) denotes the ensemble mean, and the variance s is s = lim T −∞ (cid:90) TT [ x ( t ) − E ( x )] d t. (3)The cross-distance scaling between trajectories x and x is defined with d ( t > T λ ) ∝ δ ν , d ( t ) = (cid:104)| x ( t ) − x ( t ) |(cid:105) (4)where T λ is the Lyapunov prediction time T λ = ln ( | x − x | /δ ) /λ m , δ is the distance of initially closetrajectories, and λ m is the maximal Lyapunov exponent. The reader is directed to [3] for a detailedexplanation and demonstration of these definitions.In this paper, the regimes will be referred to as chaotic, PPC, and periodic dynamics, respectively.The ensemble mean of independent trials used in [3] is very effective in identifying these respectiveregimes with a binary diagnostic (Fig. 1). However, these evaluations require a known set of differentialequations to generate neighboring initial conditions. Thus, the approach in [3] requires the full knowledgeof the state–space and cannot be applied to a single observation of a high–dimensional system. A 0–1 diagnostic for identifying chaos in time series with known or unknown models, i.e. usingrecorded data, is given in [4]. The method has been presented in several different forms since its originalconception [4, 5, 6, 7]. In this paper, we use the modified regression method described in [5] as wellas the correlation method as described in [7]. The original test identifies some observable characteristic φ ( j ) of a dynamic system and uses it to construct planar ( p, q ) data sets according to p ( n ) = n (cid:88) j =1 φ ( j ) cos jc, and q ( n ) = n (cid:88) j =1 φ ( j ) sin jc, (5)where c is a random variable drawn from a uniform distribution in (0 , π ) and n = (0 , , . . . , N ) . Using asingle data set of length N , the number of c values selected creates N c data sets in the ( p, q ) plane. Inthis paper, N c = 100 and N = 5000 unless otherwise stated. The p - q data is then used to compute themean-square displacement M c ( n ) = lim N →∞ N N (cid:88) j =1 [ p c ( j + n ) − p c ( j )] + [ q c ( j + n ) − q c ( j )] . (6)Page 2 haos detection through TDA Tempelman and Khasawneh Figure 1: (a) The cross-correlation C , (b) the scaling distance ν , and (c) the various dynamic regimes displayedby the Lorenz system for modification of the parameter ρ . A chaotic trajectory is shown in for ρ = 180 . . Partiallypredictable chaos is shown in the center with the trajectory tightly orbiting the attracting braid, and fully periodicbehavior is shown to the right where the trajectory is stable on the manifold. Note that in Eq. (6), n = (1 , . . . , n cut ) where n cut = N/ . The test for chaos is based on the asymptoticgrowth of M c ( n ) with respect to n , which is a bounded function of time for periodic dynamics and growslinearly with time for chaotic dynamics [6]. It is shown in [6] that the modified mean-square displacement D c ( n ) possesses better convergence characteristics. To construct D c ( n ) , begin by defining the oscillatoryterm V osc ( c, n ) = ( Eφ ) − cos ( nc )1 − cos ( c ) . (7)Now, subtract the V osc term from the mean-square displacement to define the modified mean-squaredisplacement D c ( n ) = M c ( n ) − V osc . (8)The test can now be performed using either the correlation test or regression test. Using the standarddefinitions for covariance and variance respectively cov ( x, y ) = 1 q q (cid:88) j =1 ( x ( j ) − ¯ x ) ( y ( j ) − ¯ y ) , and var ( x ) = cov ( x, x ) , (9)the correlation K c value is found using the correlation function K c = corr ( ξ, ∆) = cov ( ξ, δ ) (cid:112) var ( ξ ) var (∆) ∈ [ − , , (10)where ξ = (1 , , . . . , n cut ) and ∆ = ( D c (1) , D c (2) , . . . D c ( n cut )) . The 0–1 correlation K value of the timeseries is then defined as the median value of the K c values with a median near zero noting a periodictime series and a median near one noting a chaotic time series.Alternatively, the 0–1 regression test may be applied according to K c = lim n →∞ log ˜ D c ( n )log n , ˜ D c ( n ) = D c ( n ) − min n =1 ,...,n cut D c ( n ) . (11)In [5], a K c value is computed for M c ( n ) rather than ˜ D c ( n ) , and this will be referred to as the modified0–1 regression test K c = lim n →∞ log M c ( n )log n . (12)For finite data, the K c score is computed as the slope of the line which fits the log( M c ( n )) versus log( n ) with the least absolute deviation [6]. The modified 0–1 regression test will be used instead of the standardPage 3 haos detection through TDA Tempelman and Khasawneh regression test in this paper since it has been shown to perform better in the presence of noise [5, 6];noise robustness will play major role in later sections of this paper (section 3).The 0–1 test typically works better than tangent space methods for detecting chaos. However, someabating factors must be addressed. The test will always return a periodic diagnosis when the data isover-sampled for the case of time-continuous systems [10]. Additionally, false positives are found whenperiodic time series are contaminated with substantial noise [5]. The issue of oversampling can be resolvedby properly sub-sampling the time series according to [6, 10, 11, 12], the effects of noise are more difficultto overcome. A method has been developed which allows the 0–1 test to handle noise very effectively if agauge-value (i.e. time series with known periodic parameters) is provided [8]. Furthermore, the effect ofnoise on the 0–1 test has been leveraged to predict the noise level within a periodic time series [7]. If noiseis of concern and using a gauge value is not an option, the modified regression score should be used [5, 8].However, whereas the 0-1 test as described above has a mathematically rigorous underpinning [13], thisis not the case for the more noise-robust modification [5].An interesting intermediate result of the 0–1 method is the projection of the data onto the p - q plane. If the dynamics of the time series are periodic, then the resulting projection will have a compactand typically annular topology. In contrast, Fig. 2 shows that chaotic dynamics lead to an irregularscattering of points in the p - q space. Therefore, the topology of the p - q projections can give insight intothe underlying dynamics, as we show in Section 2. Figure 2: (top) the K c values found using the 0–1 method for a periodic time series (blue) and a chaotic timeseries (red). The data is generated with the Lorenz model. (Bottom left) The p - q projections for the periodictime series produce a compact annular geometry. (Bottom right) The p - q projection of the chaotic time seriesproduces a scattered, irregular pattern. The motivation behind a TDA approach is that data can have shape, and that shape has a meaning.TDA provides tools which lead to a computational summary of the shape of data. We showed inSection 1.2 that the 0–1 test produces data projections with meaningful topology. Thus, these projectionscan be quantified in a topological sense and studied in a new framework.The ability to quantify the topology of a data structure has led to many breakthroughs in variousfields. Specifically, TDA has already made meaningful contributions in time series analysis [14, 15, 16],economics [17], machining dynamics [18, 19, 20, 21], biochemistry [22], and plant morphology [23]. Inthis section, we extend the application of this powerful computational tool to chaos detection.Recently, Mittal and Gupta have explored persistence homology as a tool for detecting early bifur-cations and chaos in dynamic systems [24]. In [24], persistent homology is used to identify the presencePage 4 haos detection through TDA Tempelman and Khasawneh of holes in the phase portraits of deterministic dynamical systems, thus an indicator for bifurcations.The method has shown promising results in [24] even with a signal to noise ratio of 30 dB. However,wavelet filtering is required to pre–process noisy data. Additionally, we note that their method requiresone to have access to the whole phase-space or to perform a phase-space reconstruction. The approachwe describe here requires neither data pre–processing nor phase-space reconstruction.We use the Lorenz system as a test model for elucidating our TDA-based approach for chaos detection.Time series which are generated for different values of the bifurcation parameter ρ in the Lorenz modelare used to exhibit chaotic, PPC, and periodic dynamics. The Lorenz time series were produced usingthe ODE45 function in MATLAB R (cid:13) with a relative tolerance of 1e-6, a time step of ∆ t = 0 . , and atransient cut-off of 100 seconds.Consistent with the 0–1 test, the time series data must first be sub-sampled, if needed. This was doneusing the frequency approach presented by Melosik and Marszalek [10, 11]. This sub-sampling methodresults in a time series sampled such that f max < f s < f max where f s is the sub–sampling frequencyand f max is the maximum significant frequency of the data in the frequency domain. In the 0–1 test,failure to sub-sample may result in false-negatives. Similarly, we found that oversampling affects thetopological characteristics of the p – q projections thus leading to incorrect conclusions when studying thetopology of the p – q space. As a general rule-of-thumb, it is suggested that the sampling rate chosen inthe time series sub–sampling is set to three times the maximum significant frequency of the data.The subsequent sections give a detailed development of the TDA approach; the overview is as follows.Multiple projections of p - q data are generated for a given time series. The bi-variate Kernel DensityEstimates (KDEs) of these projections produce a gray scale image (Fig. 4). Sub-level set persistence isthen used to provide a computational summary of the topological structure of the KDE. We give a briefexplanation of persistent homology in Section 2.1 and the procedure for applying TDA to the p - q datain Section 2.2. The results are explained in Section 2.3 while case studies on the Rossler system and theLogistic map are given in Sections 2.3.1 and 2.3.2 respectively. The implementation of this method isdescribed with Algorithm 1 which we implement in MATLAB R (cid:13) and is published online [25]. Only a brief explanation of persistent homology, commonly known as persistence, will be provided inthis paper. For a more comprehensive explanation, the reader is directed to [26, 27, 28, 29, 30, 31].Persistence can be applied to 3-dimensional data such as contour plots or even gray-scaled images.Each point in topological space X is assigned a real number, which for the purpose of this explanation,may be thought of as a height coordinate. This resulting function f : X → R can have local maximaand minima. Persistent homology is interested in changes in topology at various sub-level sets . For thisfunction f , the λ sub-level set is L λ = { x : f ( x ) ≤ λ } = f − ([ −∞ , λ ]) . (13)For any pair of levels λ > λ the condition L λ ⊇ L λ must be satisfied and the collection of sets { L λ } λ ∈ R forms a filtration where the level is the index set [32]. Consider Fig. 3 which shows a toyexample for computing 0D sub-level persistence for a function f . One way to visualize the process is toimagine a rising water level beginning at the base of X and account for each time a new minimum of thesurface is touched and new maximum is reached (see Fig. 3). As the level λ is raised, new topologicalfeatures are captured through generators of its homology group. Two dimensions of homology groupsare considered. The 0D groups capture connected components (see Fig. 3) while the 1D groups capturesregions forming circular structures (see Fig. 5).As new levels are introduced, new generators of homology groups are formed and existing generatorsmerge. The level at which these generators and groupings occur are the birth and death time, respectively,of the topological feature. Therefore, the three characteristics for each generator are the homologydimension, birth time, and death time. The collection of these points is drawn as what is referred to asa persistence diagram. For | D | generators, the persistence diagram is D = { ( r j , b j , d j ) : j = 1 , . . . , | D |} where r j , b j , and d j are, respectively, homology order, birth time, and death time of the j th generator [32]. In this paper, we utilize the 0D sub-level set persistence to extract the underlying characteristics ofthe probability densities of the p - q projections. The KDEs correspond to the p - q data according to [33] ˆ f ( x, H ) = 1 n n (cid:88) i =1 K H ( x − x i ) , (14)Page 5 haos detection through TDA Tempelman and Khasawneh Figure 3:
An example showing the construction of the 0D persistence diagram of the function f : X → R . (left)The rising “water” level on cross-section cut of a function f in topological space X and (middle) the sub-level setsproduced by tracking the “birth” and “death” of topological features on f . The birth and death times are plottedagainst on each other (right) to produce the corresponding persistence diagram for the function. Figure 4:
The transformation from time series data to a gray-scaled image. The raw-time series (left) areprojected into the p - q plane (middle-left) before a bi-variate kernel density estimate produces a smooth topologicalsurface (middle-right). The density estimates are then turned into gray-scale images (right) for sub-level setpersistence. Shown here is one iteration for laminar flow (top) and chaos (bottom). Figure 5:
Sub-level filtrations for the KDE (left) of a p - q projection for a periodic time series and the resultingpersistence diagram for the 0D persistence. where x = ( p, q ) , n is the number of elements in { x , x , . . . , x n } , and K H is the kernel function. Theresulting KDEs are converted to a gray scale image. The gray scale image is normalized to have a pixelPage 6 haos detection through TDA Tempelman and Khasawneh value in [0, 1].It should be noted that increasing the grid resolution will drastically lengthen the runtime of theprogram. To reduce computation time of the corresponding sublevel set persistence, a KDE grid size of × is used. This is a relatively coarse grid resolution and results in KDEs which are jagged whencompared to a KDE of a higher resolution. This could lead to misleading computational summaries ofthe KDE topology, thus a 2D Gaussian smoothing function G h = 12 πh e − x y h (15)is used. We found h = 1 . to be a good choice for the smoothing kernel width at a × grid resolution(the effect of this kernel width on the results of the persistence test is discussed in Section 3.1). TheDIPHA library was used to generate all the needed 0D persistence diagrams. The central motivation for this study is to yield a reliable interpretation of the underlying dynamics oftime series data using persistence. To better understand how this may be achieved, refer to Fig. 6. Here,multiple persistence diagrams are overlaid for ranges of the bifurcation parameter ρ of the Lorenz model.The persistence points with a birth time near zero are omitted from Fig. 6 since they are considerednoise in this application. The persistence points shown in Fig. 6 show that when persistence diagramsare generated from time series associated with a periodic bifurcation parameter, the persistence pointsaccumulate in the upper-right corner of the persistence diagram. Figure 6: (top) The bifurcation diagram of the Lorenz model as the bifurcation parameter ρ is varied between180.3 and 181.3. (bottom) The persistence diagrams generated from kernel density estimates of p - q projectionsacquired from time series at ranges of the bifurcation parameter. The color scheme corresponds to the underlyingdynamics as found in [3] where blue indicates periodicity, orange indicates PPC, and red indicates chaos. These produce birth/death combinations which are tightly clustered and bounded to regions on ornear the diagonal. There is an apparent correlation here between the birth and death times of thepersistence points. Furthermore, these points all appear to be occurring at higher birth and death timesthan what is seen for their chaotic counterparts. The persistence data generated form PPC trials arevisually indistinguishable from periodic trials. On the other hand, chaos based persistence diagrams arevisually perceptible from periodic when many trials are overlaid as shown in Fig. 6. Thus, the hypothesisto test is whether or not this behavior may be reliably used to trace back the dynamics of the originaltime series.
Scoring : Infinitely many p - q projections may be generated for a given time series since c may beindefinitely drawn from (0 , π ) . Thus, we use an ensemble of p - q projections and use a statistical summaryof birth and death times to deduce the original dynamics of the time series. In this section, the meandistance of persistence points from the diagram’s origin are used to discern the underlying dynamics ofthe time series. https://github.com/DIPHA/dipha Page 7 haos detection through TDA Tempelman and Khasawneh
Figure 7:
Basic statistical summary of 200 persistence diagrams taken from the gray-scaled image renditions ofthe KDE of the p - q projections for 200 simulations of the Lorenz system. The red bars correspond to chaos, theorange bars correspond to PPC, and the blue bars correspond to periodic dynamics. 200 simulations are usedfor ρ ∈ [180 . , . to generate the full range of dynamics. The images were smoothed by Gaussian filteringand the data used for the statistical summary ignores points with birth very close to the origin. Figure 8:
Visual interpretation of sub-level filtrations for the normalized KDE of a p - q projection for a (left)chaotic and (right) periodic time series and the resulting 0D persistence. The threshold for chaotic versus periodic or PPC behavior in persistence diagrams can be definedas a result of the location of the points in the diagram. A basic statistical evaluation of the trialsused in Fig. 6 is used to elucidate how such a threshold is defined. Looking at the average birth anddeath times, Fig. 7 shows that the average time of birth and death for periodic data is substantiallyhigher than the chaotic counterpart (approximately . – . for periodic and . – . for chaotic). Thestatistical summary outlined in Fig. 7 ignores the birth/death combination on or near zero birth timevertical. These persistence points do not reveal any relevant information, and can be thought of asirrelevant noise in the KDE plots. The statistical summary was conducted by taking the mean birthand death time of each diagram for 200 realizations of p - q projections at each bifurcation parameter andthen computing the ensemble mean of these values. The average birth and death times are substantiallyhigher for periodic and PPC data; therefore, we score the data using the average distance to the originaccording to P S = (cid:42) N i (cid:88) j =1 (cid:113) ( d j + b j ) N i , ( b j , d j ) ∈ D i (cid:43) , (16)where P S ∈ [0 , √ , D i is the persistence diagram generated form the i th selection of c ∈ (0 , π ) , and N i is the number of 0D persistence points in D i .The reason for the higher average birth and death times in the persistence diagrams for KDEsgenerated from periodic time series as compared to chaotic ones is illustrated in Fig. 8. The diffusivenature of the p - q trajectories [8] for chaotic data results in KDEs which typically have multiple localmaxima and minima randomally appearing in the p - q space. These local extrema correlate to the birthsand deaths of homology classes in the persistence diagram. Alternatively, the bounded projections ofperiodic time series data will consistently produce a KDE which is bound to a finite region in the p - q space with prominent peaks representing frequently visited parts of the p - q space. This structure of theKDE leads to the homology classes being constant during the filtration until the top of the structure isreached. At which point, ridges near the peak of the KDE produce many generators near a value of onewhich die soon after birth.Thus, it can be concluded that for the Lorenz system, if the P S score for a set of p - q projectionsis near √ , then the underlying dynamics are periodic. The mean distance from the origin P S wasconsistently found to be a successful score for classifying chaos for the systems studied in this paper andthus it will be the focus of the results section. Page 8 haos detection through TDA Tempelman and Khasawneh Figure 9:
A direct comparison (left) the proposed method of averaging persistence data based on its positionwith respect to the origin and (right) the Gottwald 0–1 correlation (C) and regression (R) tests. The long–time0–1 correlation score (N=25000) is provided as a reference.
Algorithm 1
Testing for chaos with persistence procedure P S ( Φ , f s ) (cid:46) time series data matrix φ ← subsample( φ, f s ) (cid:46) Select observable and sub-sample for i = 1 : N do (cid:46) N = 200 is suggested c i ← rand (0 , π ) (cid:46) Random c value [ p , q ] ← f ( x , c ) (cid:46) Project into p - q space P ← f ( p , q ) (cid:46) Get bi-variate density estimate of (p,q) I ← P (cid:46) Convert to image [ B , D , d ] ← DIPHA( I ) (cid:46) Call DIPHA and get persistence data B ← B ( d = 0) (cid:46) Cut out persistence data with dimension 1 D ← D ( d = 0) s ← f ( B , D ) (cid:46) Calculate distance from origin for each point S i ← mean( s ) (cid:46) Take the mean of distances for the diagram end return mean( S ) (cid:46) Compute ensemble mean on index i Figure 10:
The convergence of (left) the 0-1 correlation test, (middle) the 0–1 modified regression test and(right) the
P S scores. Figure 9 shows a direct comparison between our proposed method (see Fig. 4 and Algorithm 1) andthe 0–1 correlation test. The 0–1 test and
P S scores are performed using N = 5000 (after sub-sampling),and a long time limit ( N = 25000 ) evaluation of the 0–1 test is included as a reference case [8, 34]. Whenthe dynamics are periodic ( ρ > . ), the distance from the origin stabilizes near 1.3. When chaoticdynamics are present, the P S score stabilizes at approximately 0.8. The scores of 1.3 for periodic and0.8 for chaotic were found to be consistent when testing with longer time series as well.The numerous numerical experiments that we preformed seemed to yield P S scores close to the [1 / √ ≈ . , (cid:112) / ≈ . . These two scores coincide with the geometric centers of the diagonal linePage 9 haos detection through TDA Tempelman and Khasawneh and the upper left triangle of the persistence diagram, respectively. This suggests that the persistence ofthe corresponding probability density estimates of the projections possess a global shape which appearssimilar to the one obtained form the 0D persistence of Gaussian random field excursions [35]. However,computing distributions of persistence diagrams is not a trivial task [35, 36, 37, 38] and is complicated byfactors such as the highly nonlinear nature of the resulting metric space, as well as the non-uniquenessof geodesics and means in the persistence diagram space. Finding a closed-form expression for theseexpectations is a topic for future research.Figure 9 also shows how the 0–1 test and the P S scores behave when PPC dynamics are presentin the bifurcation parameter range ρ ∈ [180 . , . Specifically, in this parameter range both methodsreturn a periodic diagnostic for N = 5000 . However, Fig. 10 shows that the long-time limit ( N = 10 points) for PPC data deviates from this periodic result for both the 0–1 correlational and regressionscores. The figure shows that as the time series is extended, both the 0–1 correlation test and the P S statistic start to yield non-binary results between zero and one, but not necessarily near zero or one.Although this is found to be consistent in both the modified regression and correlation 0–1 tests, thedivergence from a periodic score occurs quicker with the latter. Similarly, a P S score between 0.8 and1.3, but not necessarily near 0.8 or 1.3, is returned from the P S statistic. One may argue that this issimply an issue of finite time. While it appears that the PPC data are tending towards one of the twodiagnostic limits for both the 0–1 tests and P S scores, it does so at too slow of a rate for convergenceto be achieved without resorting to excessively taxing computations. In contrast, scores for stronglychaotic or fully periodic regimes remain consistent during this long-time evaluation. This shows that itis difficult to distinguish PPC solely from p - q projections. Figure 11: (a) The bifurcation diagram of the Rossler model for a ∈ [0 . , . , (b) 2-D projection of thephase space in the x - y plane for multiple values of the bifurcation parameter, (c) resulting persistence diagramsgenerated over ranges of a , and (d) the results P S and 0–1 correlation (C) and regression (R) testing. Thelong-time correlation (N=25000) 0–1 score is provided as a reference. Page 10 haos detection through TDA Tempelman and Khasawneh
In order for our method to be applicable to a wide class of signals, it must yield consistent numericalresults for periodic and chaotic time series regardless of the source of the signal. To investigate this, weuse the Rossler system to gather additional synthetic data. Like the Lorenz equations, the Rossler modelis a classic system which has been studied widely [39] and it is described by ˙ x = − y − z, ˙ y = x + ay, ˙ z = γ + z ( x − c ) , (17)where a , c and γ are scalars. We only consider fully periodic and fully chaotic signals in this section. TheRossler data is generated through numerical simulation using the MATLAB R (cid:13) ODE45 integration toolwith ∆ t = 0 . , a relative tolerance of − , and a transient cut-off of 1,000 seconds. The parametersselected for this study are a ∈ [0 .
25 0 . , γ = 2 , and c = 4 [40]. Figure 11 shows the results of the0–1 correlation test along with the P S scores for N = 5000 as well as the long time reference using N = 25000 using the 0–1 correlation test. The bifurcation diagram and trajectories in the phase spaceare also given in Fig. 11a and 11b, respectively.The same TDA based approach we used for the Lorenz system for chaos detection is applied tothe Rossler data. Time series were generated for a span of bifurcation parameters. The minimumsignificant frequency sub-sampling method described in section 2 (i.e. f max < f s < f max ) was appliedto the Rossler simulations which are then projected into the p - q space per Eq. (5). The 0D sub-levelset persistence then generated the birth/death times for gray-scale renditions of the normalized KDEof these projections which were smoothed per Eq. (15) with h = 1 . . The P S and the correlation 0–1scores were obtained for the time series generated over the different values of a with N = 5000 . Figure 11shows that the correlation 0–1 test and the P S scores perform comparably for N = 5000 , with a more Figure 12: (a) The bifurcation diagram of the Logistics map for µ ∈ [3 , (b) the plots of x n versus x n +1 as µ is varied, (c) persistence diagrams generated at different ranges of the bifurcation parameter, and (d) theresulting P S and 0–1 correlation (C) and regression (R) scores. The long-time correlation (N=25000) 0–1 scoreis provided as a reference. Page 11 haos detection through TDA Tempelman and Khasawneh definitive classification
P S scores for a ∈ [0 . , . . In this range of a values, the regression 0–1 testvalues are rather ambiguous since the K values commonly fall between 0.5 and 0.8 over these selectionsof a .On the other hand, the P S and 0–1 scores do not consistently yield a definitive periodic or chaoticdiagnostic for a ∈ [0 . , . . While there is a “gray area" in this parameter range, Fig. 11 shows theemergence of elevated plateaus in the P S scores and 0–1 correlation scores for a > . . These valuescoincide with windows of periodicity shown in the bifurcation diagram and in the long–time referencescores of the 0–1 correlation test. Such gray areas are to be expected when dealing with finite time seriessince the length of the time series may not be sufficient to capture the full dynamics of the attractor [8, 34]. We further test our method on the Logistic map which is an iterative map governed by [41] x n +1 = µx n (1 − x n ) . (18)This nonlinear map was introduced as a simplified population model where x n is a ratio between existingpopulation and maximum population. Here, the bifurcation parameter is µ with transitions in thedynamics occurring during µ ∈ [3 , .Both the 0–1 correlation test and the persistence scoring method are applied to time series generatedfrom Eq. (18) over the range of µ values. Like before, the kernel width h in Eq. (15) is set to 1.3 andthe persistence points found very close to the origin are ignored. The 0–1 correlation test does a verygood job of correctly separating regions of chaos and periodicity through nearly perfect binary scores.The P S scores do not separate as cleanly as shown in Fig. 12. However, the transition from periodicbehavior to chaotic behavior ( µ ≈ . ) as well as the pockets of periodicity in µ ∈ [3 . , are clearlynoticeable in the persistence scores. Additionally, Fig. 12 shows that P S ≈ . indicates a periodic timeseries while P S ≈ . indicates chaos in the Logistic map. This is consistent with the results in Fig. 9for the Lorenz system and in Fig. 11 for the Rossler system. Figure 13: (top)Bifurcation of the Lorenz model for ρ ∈ [135 , and (bottom)The P S scores and the 0-1scores for the Lorenz model for ρ ∈ [135 , . However, it is clear from Figs. 9, 11, and 12 that at times the topological approach does not separatetime series data as cleanly as the 0–1 correlation test. Specifically, the transition from chaotic to periodicbehavior is at times less obvious in comparison to the original 0–1 test (see Figs. 11 and 12).Page 12 haos detection through TDA Tempelman and Khasawneh
In addition to testing across multiple systems, we also explore the Lorenz model of Section 1.1, butthis time with the system parameters σ = 10 , β = 0 . , and ρ ∈ [135 , . Again, the time seriesare sub-sampled per the minimum significant frequency method described in section 2 so that f max
P S scores to noise and compare their performance to both the correlation and regression 0–1methods.While the correlation 0–1 test does an excellent job in distinguishing chaos from periodicity, theincreased sensitivity makes it more susceptible to noise [6]. A brief noise sensitivity demonstration isgiven in Fig. 14 to exhibit this for simulations of the Lorenz system. This is done by averaging the median K c values as computed by Eq. (10) for time series containing signal-to-noise ratios (SNRs) ranging from60 to 10 decibels (dB). The average K at each SNR is computed using 100 realizations of these noisy timeseries. Figure 14 shows that low noise levels (i.e. high SNR) do not affect the resulting K score of the0–1 correlation test. However, at an SNR of 40 dB, the test begins to yield an ambiguous (non-binary)result. Noise levels above an SNR of 30 dB consistently produce a false-positive indicator for periodictime series. These negative effects on the 0–1 test in the presence of noise were discussed in [5, 34], Figure 14:
Effects of noise on the Correlation 0–1 chaos test with one-hundred tests being performed for eachdynamic regime starting from noise levels of SNR= 50dB to SNR=10 dB with all points plotted (left), and (right)a plot of the mean K values as well as error bars given by one standard deviation. It can be seen that 30 dBlead to ambiguous results for periodic time series, and noise levels above 30 dB lead to false positives. Page 13 haos detection through TDA Tempelman and Khasawneh
These references recommend using the 0–1 regression test with noisy time series which was shown tooutperform tangent space methods for chaos detection in noisy signals [5, 6].We start by showing the effect of increasing the level of noise on the structure of the persistencediagrams. Figure 15 shows that these diagrams remain stable as the SNR is varied between 45 and30 dB. The reason that persistent homology provides this stability is its ability to emphasize the mainfeatures of a data set and ignore minor perturbations [28]. In other words, the topological structure ofthe data does not typically change in the presence of moderate noise. However, that structure starts tobreak down once noise levels reach the point at which the salient topological features of the KDE are nolonger distinguishable from noise.The robustness of the persistence–based
P S score to noise is further illustrated with Fig. 16. Fig-ure 16 shows the effects of noise on the respective detection methods for the Lorenz model, Rosslermodel, and Logistic map for SNR levels of 40 and 30 dB. We note that a minimum SNR of 15 dB is oftennecessary to extract any useful information from a signal. The results obtained from the P S scores aswell as the modified 0–1 test appear stable for 40 dB of noise; however, deviations from the nominalnoise-free cases are apparent when the noise levels reach 30 dB. In contrast, the correlation 0–1 testappears to be greatly affected by the noise showing that the trade off of the increased sensitivity to thetest is the inability to effectively handle noise.For the Lorenz model, the last column of Fig. 16 shows that the P S scores are both stable andbinary when noise is added to the signal. While the modified regression method performs well, it doesnot appear to separate as cleanly as the P S scores. Similar results are shown for the Rossler model; thecorrelation method quickly separates from the nominal case when noise levels increase while the modifiedregression method and the P S scores remain relatively unaffected. For both the Rossler model and theLogistic maps, the 0–1 correlation test indicates false positives for wide regions of periodic parameterswhile the persistence scores do not deviate dramatically from what is found in the noise-free data. The P S scores are comparable to the modified regression 0–1 scores when examining the results of the Figure 15:
The shifts in birth and death time trends as noise is added to the original time series. Shown hereis the effect of the noise levels of
SNR = 45 dB,
SNR = 40 dB,
SNR = 35 dB, and
SNR = 30 dB. The diagramrepresenting each noise level is composed of multiple realizations of periodic and chaotic time series generatedwith the Lorenz model for σ = 10 , β = 8 / , and ρ ∈ (180 . , . . The stability of the persistence points withrespect to noise levels is clearly illustrated. Page 14 haos detection through TDA Tempelman and Khasawneh
Figure 16: (top) The effects of added noise to the (top) Lorenz model, (middle) Rossler model, and (bottom)Logistic map for (left) the 0–1 correlation method (1 or 0, respectively), (middle) to 0–1 modified regressionmethod, and (right) the
P S scores. The effects of noise are comparable between the modified regression 0–1test and the P S scores, and both of these methods are shown to be far more robust than the 0–1 correlationmethod. Logistic map as well. An interesting observation to note is that the correlation method yields a falsepositive for the periodic parameter µ near 3.0 even for small noise levels. The kernel width h of Eq. (15) plays an important role in the final persistence score. The valueof h = 1 . is selected based on a parameter study of the systems across their bifurcation parameterswhereby the value of h is varied between 0.1 and 4. The resulting persistence scores are plotted to revealwhich kernel parameter provides the best qualitative separation between periodic and chaotic parameterranges for both clean and noisy (SNR = 30 dB) time series. Figure 17 shows the effect of the h on thepersistence scores. Note that Fig. 17 displays the normalized persistence score (cid:103) P S , which is normalizedto have a maximum of one and minimum of zero.For both the Lorenz model and Logistics map, there is much better contrast between P S scoresgenerated from periodic time series versus chaotic time series when h < . . For the Rossler model,Page 15 haos detection through TDA Tempelman and Khasawneh Figure 17:
The normalized persistence scores for (left) no noise and (right) an SNR of 30 dB. The results aregiven for the Lorenz model, Rossler model, and Logistics map for data with no noise as well as for and SNR of30 dB. The color values indicate the value of persistence scores at each bifurcation parameter over a range of thekernel width h ∈ [0 . , . the cleanest separation of chaotic and periodic regimes occurs at < h < . . For the Logistics map,the kernel width does not appear to be a factor for the P S scores ability to appropriately differentiatethe classes when no noise is present. However, when the SNR is 30 dB the best range for separation is < h < . . Thus, it is concluded that for the systems explored in this paper any value . h ≤ . willsufficiently distinguish between chaos and periodicity for both clean and noisy time series. Note thatif a finer grid resolution is chosen for the KDE construction, this optimal bandwidth range is likely tochange. Gottwald and Melbourne’s 0-1 test for chaos uses a one dimensional time series to drive a two-dimensional system in a p - q space. The test is based on showing that when the original system’s dynamicsare regular, the resulting p - q trajectories are typically bounded whereas chaotic dynamics lead to diffusivetrajectories. Gottwald and Melbourne used these observations to define two flavors of the test: the 0-1correlation test, and the 0-1 modified regression test. The former is a more sensitive test but it incurslarger errors in the presence of noise, while the latter provides better results in the presence of noise.Page 16 haos detection through TDA Tempelman and Khasawneh However, mathematical justification has been provided for the 0–1 correlation test [13] while the samecannot be said for the modified 0–1 regression test.This work builds upon the results of Gottwald and Melbourne and identifies the dynamics of thesystem that generated the time series by examining the geometry of the density of the p - q trajectoriesusing persistent homology. Specifically, A Kernel Density Estimate (KDE) is obtained using the p - q projections and the corresponding 0D sub-level set persistence is computed and summarized in persistencediagrams. Several projections are obtained for the same time series to acquire ensemble of persistencediagrams that encode the shape of the corresponding KDEs. Based on observations of 0D persistenceon normalized KDEs, we define the persistence score P S ∈ [0 , √ which represents the mean distancefrom the origin for the points in the persistence diagram. P S assumes values close to √ when the timeseries is periodic, but its values are below √ for chaotic signals.The KDEs are smoothed by a Gaussian filter with kernel width h . Fig. 17 shows the parameter studywhich leads to the bandwidth selection of . ≤ h ≤ . , and it is found that the results of P S scoresare affected by the kernel bandwidth.The 0–1 regression and correlation test as well as the P S scores are applied to the PPC dynamicsfound in the Lorenz model for ρ ∈ [180 . , , . While Fig. 9 shows that each of these tests report PPCdata as periodic in the short-time evaluation ( N = 5000 ), Fig. 10 shows that each test fails to convergeon a binary results for PPC data in the long-time limit ( N = 10 ).For noise-free time series, Figs. 9, 11, and 12 shows that our P S score successfully delineates betweenchaotic and periodic behavior albeit not with as clear of a separation as the 0-1 correlation test. Incontrast, Fig. 16 shows that when noise is added, the P S score maintains a consistent performanceowing to the stability of persistence diagrams. In this case, comparisons to the more noise robust 0-1modified regression test show that P S provides clearer separation especially for high levels of noise, orequivalently, low levels of Signal to Noise Ratio (SNR). If a periodic parameter is known, a gauge valuemay be obtained and then the test described in [8] can be performed. Otherwise, either the modifiedregression test or the P S scores should be evaluated. The P S scores perform just as well the modifiedregression test in the presence of noise, and the P S scores separate data effectively for clean time series. Acknowledgment
The authors would like to acknowledge Dr. Elizabeth Munch for the helpful insight she providedregarding persistent homology and its applications. This material is based upon work supported by theNational Science Foundation under Grant Nos. CMMI-1759823 and DMS-1759824 with PI FAK.
References [1] G. Benettin, L. Galgani, A. Giorgilli, and J.-M. Strelcyn, “Lyapunov characteristic exponents forsmooth dynamical systems and for hamiltonian systems: A method for computing all of them. part2: Numerical application,”
Meccanica , vol. 15, pp. 21–30, mar 1980.[2] A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano, “Determining lyapunov exponents from atime series,”
Physica D: Nonlinear Phenomena , vol. 16, pp. 285–317, jul 1985.[3] H. Wernecke, Bulcsusandor, and C. Gros, “How to test for partially predictable chaos,”
ScientificReports , vol. 7, apr 2017.[4] G. A. Gottwald and I. Melbourne, “A new test for chaos in deterministic systems,”
Proceedings ofthe Royal Society A: Mathematical, Physical and Engineering Sciences , vol. 460, pp. 603–611, feb2004.[5] G. A. Gottwald and I. Melbourne, “Testing for chaos in deterministic systems with noise,”
PhysicaD: Nonlinear Phenomena , vol. 212, pp. 100–110, dec 2005.[6] G. A. Gottwald and I. Melbourne, “On the implementation of the 0–1 test for chaos,”
SIAM Journalon Applied Dynamical Systems , vol. 8, pp. 129–145, jan 2009.[7] C. H. Skokos, G. A. Gottwald, and J. Laskar,
Chaos Detection and Predictability , vol. 915. Springer,2016.[8] G. A. Gottwald and I. Melbourne, “The 0-1 test for chaos: A review,” in
Chaos Detection andPredictability , pp. 221–247, Springer Berlin Heidelberg, 2016.[9] E. N. Lorenz, “Deterministic nonperiodic flow,”
Journal of the Atmospheric Sciences , vol. 20,pp. 130–141, mar 1963. Page 17 haos detection through TDA Tempelman and Khasawneh [10] M. Melosik and W. Marszalek, “On the 0/1 test for chaos in continuous systems,”
Bulletin of thePolish Academy of Sciences Technical Sciences , vol. 64, pp. 521–528, sep 2016.[11] A. Myers and F. Khasawneh, “On the automatic parameter selection for permutation entropy,”[12] A. D. Myers and F. A. Khasawneh, “Delay parameter selection in permutation entropy using topo-logical data analysis,”[13] G. A. Gottwald and I. Melbourne, “On the validity of the 0–1 test for chaos,”
Nonlinearity , vol. 22,pp. 1367–1382, may 2009.[14] M. Robinson,
Topological Signal Processing . Springer Berlin Heidelberg, 2014.[15] J. A. Perea, “Topological time series analysis,”[16] F. A. Khasawneh and E. Munch, “Topological data analysis for true step detection in periodic piece-wise constant signals,”
Proceedings of the Royal Society A: Mathematical, Physical and EngineeringScience , vol. 474, p. 20180027, oct 2018.[17] M. Gidea, Y. A. Katz, P. Roldan, D. Goldsmith, and Y. Shmalo, “Topological recognition of criticaltransitions in time series of cryptocurrencies,”
SSRN Electronic Journal , 2018.[18] F. A. Khasawneh and E. Munch, “Chatter detection in turning using persistent homology,”
Mechan-ical Systems and Signal Processing , vol. 70-71, pp. 527–541, mar 2016.[19] F. A. Khasawneh, E. Munch, and J. A. Perea, “Chatter classification in turning using machinelearning and topological data analysis,”
IFAC-PapersOnLine , vol. 51, no. 14, pp. 195–200, 2018.[20] M. C. Yesilli, F. A. Khasawneh, and A. Otto, “Topological feature vectors for chatter detection inturning processes,”[21] M. C. Yesilli, S. Tymochko, F. A. Khasawneh, and E. Munch, “Chatter diagnosis in milling usingsupervised learning and topological features vector,”[22] M. Offroy and L. Duponchel, “Topological data analysis: A promising big data exploration tool inbiology, analytical chemistry and physical chemistry,”
Analytica Chimica Acta , vol. 910, pp. 1–11,mar 2016.[23] M. Li, M. Frank, V. Coneva, W. Mio, D. H. Chitwood, and C. N. Topp, “The persistent homologymathematical framework provides enhanced genotype-to-phenotype associations for plant morphol-ogy,”
Plant Physiology , p. 001042018, jun 2018.[24] K. Mittal and S. Gupta, “Topological characterization and early detection of bifurcations and chaosin complex systems using persistent homology,”
Chaos: An Interdisciplinary Journal of NonlinearScience , vol. 27, p. 051102, may 2017.[25] J. R. Tempelman and F. A. Khasawneh, “Chaos detection with persistent homology.”[26] J. R. Munkres,
Elements of Algebraic Topology . WESTVIEW PR, 1993.[27] E. Munch, “A user’s guide to topological data analysis,”
Journal of Learning Analytics , vol. 4,pp. 47–61, jul 2017.[28] D. Cohen-Steiner, H. Edelsbrunner, and J. Harer, “Stability of persistence diagrams,”
Discrete &Computational Geometry , vol. 37, pp. 103–120, dec 2006.[29] R. W. Ghrist,
Elementary applied topology , vol. 1. Createspace Seattle, 2014.[30] H. Edelsbrunner and D. Morozov, “Persistent homology: theory and practice,” 2013.[31] H. Edelsbrunner and J. Harer, “Persistent homology—a survey,” 2008.[32] E. Berry, Y.-C. Chen, J. Cisewski-Kehe, and B. T. Fasy, “Functional summaries of persistencediagrams,”[33] Z. I. Botev, J. F. Grotowski, and D. P. Kroese, “Kernel density estimation via diffusion,”
The Annalsof Statistics , vol. 38, pp. 2916–2957, oct 2010.[34] G. A. Gottwald and I. Melbourne, “Comment on “reliability of the 0-1 test for chaos”,”
PhysicalReview E , vol. 77, feb 2008.[35] R. J. Adler and S. Agami, “Modelling persistence diagrams with planar point processes, and revealingtopology with bagplots,”
Journal of Applied and Computational Topology , vol. 3, pp. 139–183, jul2019.[36] R. J. Adler, O. Bobrowski, M. S. Borman, E. Subag, and S. Weinberger, “Persistent homologyPage 18 haos detection through TDA Tempelman and Khasawneh for random fields and complexes,” in
Institute of Mathematical Statistics Collections , pp. 124–143,Institute of Mathematical Statistics, 2010.[37] R. J. Adler, O. Bobrowski, and S. Weinberger, “Crackle: The homology of noise,”
Discrete &Computational Geometry , vol. 52, pp. 680–704, aug 2014.[38] M. Kahle and E. Meckes, “Limit theorems for betti numbers of random simplicial complexes,”
Homology, Homotopy and Applications , vol. 15, no. 1, pp. 343–374, 2013.[39] O. Rössler, “An equation for continuous chaos,”
Physics Letters A , vol. 57, pp. 397–398, jul 1976.[40] C. Letellier, P. Dutertre, and B. Maheu, “Unstable periodic orbits and templates of the rösslersystem: Toward a systematic topological characterization,”
Chaos: An Interdisciplinary Journal ofNonlinear Science , vol. 5, pp. 271–282, mar 1995.[41] R. M. May, “Simple mathematical models with very complicated dynamics,” in