Differential symbolic entropy in nonlinear dynamics complexity analysis
arXiv [physics.data-an]
Wenpo Yao a, Jun Wang b,∗
a School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
b School of Geography and Biological Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Abstract
Differential symbolic entropy (DSEn), a measure of nonlinear dynamic complexity, is proposed in this contribution. With a flexible controlling parameter, the measure uses local nonlinear dynamical information among three adjacent elements to extract nonlinear complexity. In complexity detection for chaotic logistic series, DSEn follows the changes in the chaotic features of the logistic map satisfactorily. In nonlinear analysis of real-world physiological heart signals, DSEn distinguishes three kinds of heart rates with statistical significance, healthy young subjects > healthy elderly people > CHF (congestive heart failure) patients, which supports the complexity-loss theory of aging and heart disease in cardiac nonlinearity. Moreover, DSEn does not place high demands on data length and can extract nonlinear complexity from short data sets; it is therefore an efficient parameter for characterizing nonlinear dynamic complexity.
Keywords: differential symbolic entropy; symbolization; nonlinear complexity; heartbeat
1. Introduction
The paradigm of deterministic chaos for nonlinear complex systems has become increasingly popular because of its attractive concepts and successful applications to chaotic models and real-world complex systems [1]. Nonlinear complexity measures, including fractal dimensions, the correlation dimension and Lyapunov exponents, are defined for chaotic dynamical systems and applied in physics, biology, meteorology, chemistry and other fields [2]. Entropy methods, such as K-S entropy, approximate entropy [3], sample entropy [4, 5, 6], permutation entropy [7, 8], multiscale entropy [9, 10] and others [11, 12, 13], have also been developed for these nonlinear dynamics. Among these nonlinear complexity measures, symbolic dynamics analysis, built on the ideas of simplicity and efficiency, provides a rigorous way of observing dynamics with finite precision [7, 14, 15].

Symbolic time series analysis involves a symbolic transformation and measurements on the resulting symbol sequences [16]. It relaxes the high demands on data by transforming the raw series into a finite number of states and mapping each state onto a symbol from a given alphabet. For example, symbolic transfer entropy [17, 18] and multiscale symbolic entropy analysis [19] improve on the original methods, for instance by reducing the demands on data or the sensitivity to noise, by taking advantage of symbolic dynamic analysis. The symbolizations used by these methods can be grouped into static range-partitioning and dynamic difference-based approaches [16, 20]; the dynamic methods are well suited to real-time use and are relatively insensitive to extreme noise spikes [21, 22]. Measures of symbol sequences include direct visual histograms and quantitative measures based on classical statistics or information theory [16]. Combinations of symbolization and entropy measures, which belong to the information-theoretic methods, play important roles in complexity detection and nonlinear dynamics analysis because they are simple, fast and insensitive to noise [14, 23, 24, 25].

DSEn (differential symbolic entropy), which targets the informational properties of dynamical symbol sequences, is proposed in this contribution. It extracts local nonlinear dynamic information from three adjacent elements and uses an adjustable controlling parameter to improve flexibility in nonlinear complexity detection. A chaotic model, the logistic map, and three groups of real-world physiological heart signals are used to test the nonlinear dynamic complexity detection of DSEn.

∗ Corresponding author. Email address: [email protected] (Jun Wang)
Preprint submitted to Journal of LaTeX Templates, December 5, 2018
2. Differential symbolic entropy
Symbolization plays an important role in symbolic dynamic analysis. The symbolic procedure inevitably loses part of the statistical information; however, it simplifies time series analysis and contributes to dynamic complexity detection by extracting symbolic dynamic information [24, 25].

The symbolization in the work of J. Kurths et al. [26], a typical local dynamic symbolization, performs the symbolic transformation by comparing adjacent elements. Given a time series $X = \{x_1, x_2, \cdots, x_i, \cdots\}$, the JK symbolization, which is described as especially reflecting the dynamical properties of the record [26], transforms the time series into a symbol sequence as in Eq. (1).

$$
s_i(x_i) =
\begin{cases}
0: & \Delta x > 1.5\sigma_\Delta \\
1: & \Delta x > 0 \ \text{and} \ \Delta x \le 1.5\sigma_\Delta \\
2: & \Delta x > -1.5\sigma_\Delta \ \text{and} \ \Delta x \le 0 \\
3: & \Delta x \le -1.5\sigma_\Delta
\end{cases}
\tag{1}
$$

where $\Delta x = x_{i+1} - x_i$, and $\sigma_\Delta$ is the variance of the adjacent measurement values.

The JK symbolic transformation refines the differences between neighboring elements, but it considers only two adjacent values and lacks flexibility because of the fixed $1.5\sigma_\Delta$.

Taking the relationships among three consecutive elements into account, we propose a differential symbolic transformation with a flexible controlling parameter. For a time series $X = \{x_1, x_2, \cdots, x_i, \cdots\}$, the differences between the current element and its preceding and following elements are $D_1 = \|x(i) - x(i-1)\|$ and $D_2 = \|x(i) - x(i+1)\|$. The difference between $D_1$ and $D_2$ is $\mathit{diff} = D_1 - D_2$, and $\mathit{var}$ is defined as $\sqrt{(D_1^2 + D_2^2)/2}$. The differential symbolization with controlling parameter $\alpha$ is given by Eq. (2).

$$
s_i(x_i) =
\begin{cases}
0: & \mathit{diff} \ge \alpha \cdot \mathit{var} \\
1: & 0 \le \mathit{diff} < \alpha \cdot \mathit{var} \\
2: & -\alpha \cdot \mathit{var} < \mathit{diff} < 0 \\
3: & \mathit{diff} \le -\alpha \cdot \mathit{var}
\end{cases}
\tag{2}
$$

The symbolization in Eq. (2) brings more detailed local information into the complexity measure than the JK symbolic transformation does.

The symbol series $S(i)$ is then encoded into a code series $C(i)$ of $m$-symbol words; with the 4-symbol differential symbolization there are $4^m$ possible symbols in the coded series. Taking 3-bit coding as an example, the code for 'abc' can be $c(i) = a \cdot n^2 + b \cdot n + c$, where $n$ should not be smaller than the number of symbol types. The procedure of symbolization and coding is illustrated in Fig. 1. The probabilities of the code symbols are $P(\pi) = [p(\pi_1), p(\pi_2), \ldots, p(\pi_{4^m})]$.

Figure 1: Process of symbolization and coding, from the data series X(i) to the symbol series S(i) and the code series C(i) (data and symbols in the dashed frame are not symbolized or encoded). In creating the code series, the symbol length m is 3 and the delay τ is 1. The first and last elements are not transformed, as follows from the definition of the symbolization, and the last n-1 symbols are not encoded.

Entropy is a classical approach to quantifying complexity and is well suited to characterizing real-world time series [27]. Differential symbolic entropy, the central concept of this paper, is defined as the Shannon entropy of the probability distribution of all words, Eq. (3), and its normalized form is $h(m) = H(m)/\log 4^m$.

$$
H(m) = -\sum_i p(\pi_i) \log p(\pi_i), \quad \text{where } p(\pi_i) \neq 0
\tag{3}
$$
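The definitions in this section can be sketched in a few lines of Python. The sketch is illustrative and is not the authors' implementation: the function names (`differential_symbols`, `encode`, `dsen`, `logistic_series`) are hypothetical, the last case of Eq. (2) is taken as diff ≤ −α·var, and the delay τ is assumed to pick the symbols s(i), s(i+τ), ..., s(i+(m−1)τ) for each word. Words are coded in base n = 4, so the word 'abc' = (1, 2, 3) becomes 1·16 + 2·4 + 3 = 27.

```python
import math
from collections import Counter


def differential_symbols(x, alpha):
    """Map each interior point x[i] to a symbol in {0, 1, 2, 3} following Eq. (2)."""
    symbols = []
    for i in range(1, len(x) - 1):        # first and last points lack a neighbour
        d1 = abs(x[i] - x[i - 1])         # D_1, difference to the preceding element
        d2 = abs(x[i] - x[i + 1])         # D_2, difference to the following element
        diff = d1 - d2
        var = math.sqrt((d1 ** 2 + d2 ** 2) / 2.0)
        if diff >= alpha * var:
            symbols.append(0)
        elif diff >= 0:
            symbols.append(1)
        elif diff > -alpha * var:
            symbols.append(2)
        else:                             # assumed reading: diff <= -alpha * var
            symbols.append(3)
    return symbols


def encode(symbols, m=3, tau=1, n=4):
    """Code m symbols spaced tau apart as one base-n integer (assumed use of tau)."""
    span = (m - 1) * tau
    codes = []
    for i in range(len(symbols) - span):
        word = 0
        for k in range(m):
            word = word * n + symbols[i + k * tau]
        codes.append(word)
    return codes


def dsen(x, m=3, alpha=0.35, tau=1):
    """Normalized Shannon entropy h(m) = H(m) / log(4**m) of the code words."""
    codes = encode(differential_symbols(x, alpha), m, tau)
    if not codes:
        raise ValueError("series too short for the chosen m and tau")
    total = len(codes)
    # Only observed words enter the sum, so every probability is non-zero.
    h = sum((c / total) * math.log(total / c) for c in Counter(codes).values())
    return h / math.log(4 ** m)


def logistic_series(r, x0=0.03, length=1200):
    """Iterate x_{i+1} = r * x_i * (1 - x_i) starting from x0."""
    xs = [x0]
    for _ in range(length - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


if __name__ == "__main__":
    for r in (3.5, 3.7, 4.0):
        print(r, round(dsen(logistic_series(r)), 3))
```

The `__main__` lines apply the measure to logistic-map series with initial value 0.03 and length 1200, the setting used in the next section; α = 0.35 is an arbitrary choice inside the reported range [0.33, 0.38].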
3. Differential symbolic entropy in chaotic model test
The logistic map is employed to investigate the chaos detection of DSEn. This second-degree polynomial map, written as $x_{i+1} = r \cdot x_i (1 - x_i)$, is often cited in chaotic and nonlinear dynamical analysis and used to calculate the properties of random processes [28].

Figure 2: Logistic map for the controlling parameter r varying from 3.4 to 4 with a step of $10^{-\ldots}$. The logistic series is generated with an initial value of 0.03 and the length of each sequence is 1200 (the initial value and the length of the logistic series have no significant influence on the chaos detection of DSEn). The cut-off point at which the sequence becomes chaotic, r∗ = 3.5669, is marked in each subplot. (a) Bifurcation diagram. (b) JK symbolic entropy. (c) and (d) Differential symbolic entropy with α = 0.33 and 0.38.

Fig. 2(a) shows the logistic map for varying r. In nonlinear complexity detection, DSEn with α between 0.33 and 0.38 detects chaos satisfactorily from its onset at r∗, and its entropy increases with increasing r, as shown in Fig. 2(c) and (d).

As r approaches r∗, JKSEn (JK symbolic entropy), however, shows no change in the early stage, as can be seen in Fig. 2(b). JKSEn first rises at r = 3.679, which is much larger than r∗, so it does not identify the nonlinear features at the very beginning of the chaotic logistic series. Moreover, as the chaotic features of the logistic map increase, JKSEn behaves unstably. At r = 3.702 and between 3.739 and 3.744, JKSEn takes abnormally high entropy values, and for r between 3.857 and 3.927 JKSEn does not increase with the enhanced chaotic behavior of the logistic series.

4. Differential symbolic entropy in heartbeats

In this section, we test DSEn in extracting the nonlinear dynamic complexity of cardiac electrical activity. Heart rate, mainly represented by the RR intervals derived from the ECG, is highly irregular and non-stationary [29, 30, 31] and contains nonlinear physiological information about cardiac regulation that contributes to scientific research and clinical applications [32, 33].

Heartbeat intervals of three groups of subjects from the PhysioNet database [34] are used to test differential symbolic entropy. The three kinds of heartbeats, which are often used to test nonlinear approaches [9, 14, 19, 20, 31], are collected from patients with severe congestive heart failure [35] and from two groups of healthy young and elderly subjects [36].
The controlling parameter influences the symbolic transformation and the nonlinear complexity detection of DSEn in heartbeats. When α is larger than 0.3, the differential symbolic entropies of the three types of heart signals satisfy Young > Elderly > CHF, and this ordering remains unchanged. Among all the controlling parameters, α between 0.59 and 0.63 gives DSEn the best discrimination of the different heartbeats. The differential symbolic entropy of the three kinds of heart rates is shown in Fig. 3, and the corresponding statistical tests are listed in Table 1.

Figure 3: Differential symbolic entropy (DSEn) of three kinds of heart rates (CHF, Elderly, Young). 'm3τ1' denotes that the coding length m is 3 and the delay τ is 1.

Table 1: Statistical tests of DSEn of three kinds of heart rates. P values of DSEn of CHF and healthy young heartbeats, not listed in the table, are less than 2.… × 10^−….

                      τ=1              τ=2
CHF-Elderly (m=3)     3.… × 10^−…      ….… × 10^−…
Elderly-Young (m=3)   2.… × 10^−…      ….… × 10^−…
CHF-Elderly (m=4)     1.… × 10^−…      ….… × 10^−…
Elderly-Young (m=4)   3.… × 10^−…      ….… × 10^−…

The basic ordering of dynamic complexity among the three kinds of heartbeats characterized by DSEn, healthy young subjects > healthy elderly people > CHF patients, is consistent with the well-accepted complexity-loss theory: with aging and heart disease, the cardiac modulation that regulates heartbeats decreases, leading to a loss of nonlinear complexity in heart rates. CHF patients have the lowest complexity because CHF damages the heart's control systems, so the fluctuation patterns of heartbeat intervals in CHF patients become quite regular and the inherent nonlinear irregularity of the heart decreases. With increasing age, the cardiac functions of the elderly decline, resulting in a loss of dynamic complexity information, so elderly persons have lower entropy values than young ones [10, 30, 31, 37, 38].

For comparison, the JK symbolic entropy of the three kinds of heartbeats is shown in Fig. 4.

Figure 4: JK symbolic entropy (JKSEn) of three kinds of heart rates (CHF, Elderly, Young), panels (a) and (b). Coding lengths are m = 3 and 4, and the coding delay is from 1 to 5. Only when τ = 5 does JKSEn detect nonlinearity in a way similar to the complexity-loss theory.

The coding process, particularly the delay, has a great impact on the nonlinear analysis of JKSEn in heartbeats. When the delay varies from 1 to 5, the JKSEn values of the three kinds of heart rates change considerably and show no clear regularity. When the delay is 2, the JKSEn ordering of the heart signals, healthy young < healthy elderly < CHF, is contrary to the complexity-loss knowledge about aging and heart disease [10, 30, 31, 37, 38]. Only when τ = 5 are the results consistent with recent research; however, the discriminations between CHF and elderly subjects (p = 0.065 and 0.142 for m = 3 and 4) and between elderly and young subjects (p = 0.32 and 0.408 for m = 3 and 4) are not statistically acceptable.

Compared with JKSEn, DSEn discriminates the three kinds of heartbeats with statistical significance, and its results are consistent with the well-accepted complexity-loss theory. Two reasons may account for the advantages of DSEn. First, JKSEn takes only the relationship between two elements into consideration, whereas DSEn uses the relationships among three neighboring values and can therefore bring more dynamic information to the detection of time series complexity. Second, the parameter in JKSEn is fixed at 1.5, whereas that in DSEn can be adjusted. Nonlinear complex processes have different structural or dynamical properties, and different symbolic time series analyses or complexity measures target different aspects of nonlinear complex systems. One cannot find an optimal controlling parameter for all processes; the disadvantage of fixed range-partitioning symbolization is that it cannot be adjusted to different applications.
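To make the contrast with the two-point rule concrete, the JK symbolization of Eq. (1) might be sketched as follows. This is an illustrative sketch and not code from [26]: `jk_symbols` is a hypothetical name and, as an assumption, σ_Δ defaults to the population standard deviation of the successive differences Δx so that the threshold has the same units as Δx. A different scale can be supplied through `sigma`.

```python
import statistics


def jk_symbols(x, sigma=None, factor=1.5):
    """Symbolize successive differences of x into {0, 1, 2, 3} following Eq. (1)."""
    deltas = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    if sigma is None:
        # Assumed scale: population standard deviation of the differences.
        sigma = statistics.pstdev(deltas)
    threshold = factor * sigma            # the fixed 1.5 * sigma_Delta of Eq. (1)
    symbols = []
    for d in deltas:
        if d > threshold:
            symbols.append(0)             # large increase
        elif d > 0:
            symbols.append(1)             # small increase
        elif d > -threshold:
            symbols.append(2)             # small decrease or no change
        else:
            symbols.append(3)             # large decrease
    return symbols
```

Each symbol here depends on one difference between two neighbors and on a threshold fixed by `factor`, whereas the differential rule of Eq. (2) compares two differences around the same point and scales the threshold by the adjustable α.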
In the nonlinear complexity detection of the chaotic model, the length of the logistic sequence does not affect the analysis of DSEn, and in related symbolic entropy research [7, 14, 18] the nonlinear measures usually do not place high demands on data length and extract complexity satisfactorily from very short time series. To test the influence of data length on the dynamic analysis method, we set the data length from 100 to 1800 with a step size of 100.

When the data length is only 200, as shown in Fig. 5, the nonlinear complexity of the three heart rates has increased clearly and the three kinds of heart rates are clearly distinguished, healthy young > healthy elderly > CHF patients. As the data length increases from 500 to 600, the entropy values of the three heart signals grow slightly and gradually become stable and convergent.
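The data-length test described above can be sketched as a loop over truncated records. The sketch assumes that a record of length n means its first n samples, which the text does not state. `prefix_lengths` and `measure_vs_length` are hypothetical helper names, and `sign_entropy` is only a stand-in measure so that the block runs on its own; any entropy function of a sequence, such as an implementation of DSEn, can be passed instead.

```python
import math
from collections import Counter


def prefix_lengths(start=100, stop=1800, step=100):
    """Data lengths used in the test: 100, 200, ..., 1800."""
    return list(range(start, stop + 1, step))


def measure_vs_length(series, measure, lengths):
    """Evaluate `measure` on the first n samples for every n that fits the record."""
    return {n: measure(series[:n]) for n in lengths if n <= len(series)}


def sign_entropy(x):
    """Stand-in measure (not DSEn): normalized entropy of up/down moves."""
    moves = ["up" if b > a else "down" for a, b in zip(x, x[1:])]
    total = len(moves)
    h = sum((c / total) * math.log(total / c) for c in Counter(moves).values())
    return h / math.log(2)


if __name__ == "__main__":
    # Synthetic record for illustration only; real use would pass RR intervals.
    record = [math.sin(0.37 * i) + 0.1 * math.sin(2.9 * i) for i in range(1800)]
    for n, value in measure_vs_length(record, sign_entropy, prefix_lengths()).items():
        print(n, round(value, 3))
```

Plotting the returned values against n for each subject group would give curves of the kind summarized in Fig. 5.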
Table 2: Statistical tests of DSEn of three kinds of heart rates for different data lengths, where '0.000' should be read as 'smaller than 0.001'. P values between DSEn of CHF and healthy young heartbeats are usually less than 0.00001.

Data length            200    300    400    500    600    700    800
m=3  CHF-Elderly       0.039  0.015  0.006  0.002  0.001  0.001  0.001
m=3  Elderly-Young     0.004  0.002  0.001  0.001  0.000  0.001  0.001
m=4  CHF-Elderly
Figure 5: Differential symbolic entropy of three types of heartbeats (CHF, Elderly, Young) for increasing data length, panels (a) to (d). The encoding lengths of DSEn are 3 and 4 and τ = 1. α values of 0.61 and 0.63 are selected because of their satisfactory performance in the previous section.

[...] or larger, the DSEn values of the three groups of heartbeats are significantly different from each other statistically. In our nonlinear complexity analysis of heartbeats, differential symbolic entropy does not place high demands on data length and distinguishes the three different signals in short data sets. As shown by t tests, DSEn gives satisfactory nonlinear complexity analysis when the data are as short as 300 points and is not significantly influenced by data length.

From the above analysis, differential symbolic entropy, which extracts local dynamic information and detects nonlinear complexity, detects chaos successfully in the logistic map and in real-world physiological heart rates. Moreover, it does not place high requirements on data length and enables nonlinear dynamic analysis of short data sets.

Based on the studies of the chaotic model and the real-world physiological signals, we learn that the controlling parameter plays an important role in differential symbolic entropy analysis and should be adjusted accordingly. In chaos detection for the logistic map, controlling parameters in the interval [0.33, 0.38] are chosen for their preferable complexity extraction, whereas in the nonlinear dynamic analysis of the three kinds of heartbeats, controlling parameters giving satisfactory distinctions should be selected in [0.59, 0.63] according to statistical tests. The controlling parameter therefore needs to be chosen according to the signal, and the inability to do so is the drawback of JKSEn. The reason for the differing choices of controlling parameter, we suppose, lies in the differences in nonlinear structural or dynamical properties among chaotic models and physiological signals, and we cannot find one optimal controlling parameter for all signals.
5. Conclusions
Differential symbolic entropy is a nonlinear complexity measure that makes use of difference-based dynamics among three adjacent elements. This symbolic complexity measure, with its adjustable controlling parameter, gives satisfactory nonlinear analysis of the chaotic model and of real-world physiological signals, and it is fast and simple even for very short data sets.

The complexity-loss theory of decreased nonlinear complexity in aging and diseased heart rates is validated by DSEn. Our findings also suggest that, for different structural or dynamical information in complex systems, the controlling parameter of the differential symbolic transformation should be adjusted accordingly to extract the nonlinear symbolic dynamics.

6. Acknowledgments
The work is supported by the National Natural Science Foundation of China (Grant Nos. 61271082, 61401518, 81201161), the Jiangsu Provincial Key R&D Program (Social Development) (Grant No. BE2015700), the Natural Science Foundation of Jiangsu Province (Grant No. BK20141432), the Natural Science Research Major Program in Universities of Jiangsu Province (Grant No. 16KJA310002), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province.