Reliable Detection of Causal Asymmetries in Dynamical Systems
RReliable Detection of Causal Asymmetries in Dynamical Systems
Erik Laminski ∗ and Klaus R. Pawelzik † University of Bremen, 28359 Bremen, Germany (Dated: October 20, 2020)Knowledge about existence, strength, and dominant direction of causal influences is of paramountimportance for understanding complex systems. With limited amounts of realistic data, however,current methods for investigating causal links among different observables from dynamical systemssuffer from ambiguous results. Particularly challenging are synchronizations, where it is difficult toinfer the dominant direction of influence. Missing is a statistically well defined approach that avoidsfalse positive detections while being sensitive for weak interactions. The proposed method exploitslocal inflations of manifolds to obtain estimates of upper bounds on the information loss amongstate reconstructions from two observables. It comes with a test for the absence of causal influences.Simulated data demonstrate that it is robust to intrinsic noise, copes with synchronizations, andtolerates also measurement noise.
I. INTRODUCTION
Simple cause-effect notions of causality are mislead-ing when interactions are reciprocal. This is particularlyclear for deterministic dynamical systems where Takens’Theorem [1–3] shows that different observables individ-ually contain all information about the entire system’sstate. Here, state reconstructions from different observ-ables are generically equivalent which reflects the factthat the systems are non-separable and behave as wholes.Still, it is of considerable interest to gain knowledge aboutthe strength of influences among selected components ofa complex dynamical system.Several methods proposed for this purpose rely onphase space reconstructions [4–6]. In essence, they arebased on the following simple consideration: Let a sys-tem X uni-directionally influence another system Y. Ob-viously, then Y receives information about X. Conse-quently, states of Y will contain information about thestate of X, while states of X by assumption cannot pro-vide full ’knowledge’ about Y. Thereby states Y can beexpected to predict observables of X better than viceversa. This heuristics was put on solid mathematicalgrounds in [5] leading to a method that allowed to cap-ture also mutual interactions. The measure of Topologi-cal Causality was introduced for inference of the intensityof directed effective influences among observables. It re-lies on local expansions of the mappings between state re-constructions. While mathematically transparent, Topo-logical Causality was hitherto estimated from fitting thelocal maps among state reconstructions from data. Thiscan be quite challenging, in particular, when only limitedamounts of observations are available, when the system’sstrict determinism is violated (intrinsic noise), and whenmeasurement noise contaminates the data. Particularly,synchronizations and common input can cause mislead-ing results. Similar problems affect also other methods ∗ [email protected] † [email protected] based on the same heuristics. E.g. the CCM method [4]was found to yield wrong directions of dominant influ-ences with synchronizations [7, 8], particularly see Fig.4in [9].The method presented here is also based on expan-sions, however, not among but instead within each statespace reconstruction. Following the above heuristics, itestimates a bound on the relative amount of informationcontained in the state reconstructed from of the causallyaffected system about the state reconstructed from theeffecting system. This is done by comparing the sizesof local neighbourhoods in one state space reconstruc-tion with both, the projections of neighbours in the otherstate space reconstruction and random neighbours. Thecorresponding manifold inflations not only capture pos-sible dimensional conflicts among state reconstructions[6], but also allow for a statistical criterion to control forfalse positive detections of causal influences.After introducing the cross projection method (CPM)it is tested on a time discrete model system demonstrat-ing the relation to coupling constants and its dependencyon intrinsic noise, limited measurements, and measure-ment noise. Then we demonstrate that CPM reliably de-tects the absence of causal links and copes with synchro-nization when applied to time series of realistic lengthfrom time continuous systems. A first application toheart rate and breathing rate data is found to deliverunambiguous results. II. MANIFOLD INFLATION AS A PROXY FORINFORMATION LOSS
We consider dynamical systems composed of two sub-systems X and Y , governed by ˙x = f ( x , w xy µ x ( y )) ˙y = g ( y , w yx µ y ( x ))where µ i ( i ) denote fixed scalar functions and w ij coupling constants. It was shown by Takens [1] that atopologically equivalent portrait of the attractor can be a r X i v : . [ phy s i c s . d a t a - a n ] O c t reconstructed from lagged coordinates. For this purposedelayed copies of a single observable are merged giving a m -dimensional vector r x ( t ) = [ x ( t ) , ..., x ( t + ( m − τ )].The dimension m is sufficient if m > D where D is the dimension of the attractor. Formally the timedelay τ > m and τ [10][11]. Note that to be a valid reconstruction of theoverall attractor, there must be incoming connections w xi linking the subsystem X to the whole system. Inthis case there also exists a unique mapping betweenreconstructions from different observables, in this casefrom r x to r y [12].Here we use a combination of local and global prop-erties of the relations among nearest neighbours to ref-erence points in both reconstructions for estimating howmuch the causally affected system ’knows’ about the stateof the system influencing it. Each point is identified byits time index t and its location in the respective recon-struction. For each reference point t we determine the k nearest neighbours in both reconstructed spaces t xl and t yl , l = 1 , ..., k and the (euclidean) distances from the ref-erence point to these neighbours both, in their originspace L x ( t, t xl ) and the distance to the putative neigh-bours based on proximity in the respective other space L x ( t, t yl ).For the chance level an ensemble E of surrogate neigh-bors is generated by shifting the time indices of all actualneighbors of each reference point with a mutual randomtime interval δt E We avoid overlap of the embedding vec-tor by excluding the immediate temporal neighbors (i.e. δt r > ± mτ ). When applying this random time shift tothe time indices of neighbors in space Y a given referencepoint t becomes in X associated with a set of points t y ∗ l haveing distances L x ( t, t y ∗ l ) to these points. This proce-dure removes causal relations of neighborhoods betweendifferent observables while preserving the temporal cor-relations within the set of surrogate neighbors.Also, we introduce d ji ( k ) as mean logarithmic size of theneighbourhood for all L i ( t, t jl ): d ji ( k ) = (cid:68) log ( max [( L i ( t, t jl )] l =1 ..k ) (cid:69) t ∧ E i, j = x, y, x ∗ , y ∗ The random neighbourhoods are here averaged overboth, the Monte-Carlo ensembles E and the referencepoints, whereas for the non-shifted neighbours the aver-age is only over the reference points.As an example consider two coupled logistic maps x ( t + 1) = x ( t )[ R x (1 − x ( t )) − w xy y ( t )] + η x ( t ) y ( t + 1) = y ( t )[ R y (1 − y ( t )) − w yx x ( t )] + η y ( t )with reflecting boundaries, R i being system parametersand subjected to additive Gaussian noise η i ( t ) ∈ N (0 , σ ). The simplest case is the noise free unilaterally coupledsystem ( w xy = 0 and w yx = 0 . x t over the ( y t , y t +1 )-plane. (FIG.1). Since w yx > FIG. 1. x t over y t and y t +1 for the noise free unilaterallycoupled logistic maps with w xy = 0 , w yx = 0 . , R x = R y =3 .
82, together with the projection of the manifold on the y t +1 - y t -plane and the x t -axis. 10 data points are shown in bluethe 10 nearest neighbours of a reference point are shown incolour in both, the manifold and the projections (grey). The10 neighbours searched in Y (X) are shown in red (green). Y and the images of neighbours in Y are also localizedin X. With increasing coupling w yx neighbours searchedin Y become more localized in X, in the limiting caseof perfect information preservation the neighbours areidentical and d xx ( k ) = d yx ( k ). In contrast, X here doesnot constrain Y completely and thus the image of theseneighbours is spread over the whole y t - y t +1 -plane. Inthe other limiting case, w xy = 0, no information aboutX is included in Y and d yx ( k ) will on average be identicalwith the random neighbourhood d y ∗ x ( k ).These relations between neighborhood sizes are visi-ble in plots of the mean logarithmic neighbourhood sizes d ji ( k ) as functions of κ = ψ ( k ) − log ( N ), where k is thenumber of neighbours and N the amount of data and ψ is the digamma-function. This particular choice of theabscissa allows for an unbiased estimate of the fractal(information) dimensions D of the subsystems by tak-ing the slope of ψ ( k ) versus d xx ( k ) (resp. d yy ( k )) [13]. Inthe present example the manifold reconstructed from ob-servable X has a smaller dimension than the attractorreconstructed from the influenced observable Y. Whiledimensional conflicts provide a sufficient criterion for thedirection of causal influence in unilaterally coupled deter-ministic systems [6], they are useless for mutually coupledsystems. In this general case a different criterion for de-termining the dominant direction of causal influence isneeded.With d xx ( k ) providing a lower bound and d ∗ x ( k ) an up- -6 -4 (k) -8-6-4-20 d ij ( k ) (a) -6 -4 (k) -8-6-4-20 (b) -6 -4 (k) (d) -6 -4 (k) (c) FIG. 2. Logarithmic neighbourhood sizes d ji ( k ) over κ fornoise free unilateral coupled logistic maps ( w xy = 0 , w yx = 0 . R x = R y = 3 .
82) from N = 10 data points. The observ-ables were embedded with m = 4 and τ = 1 and 10 ensem-bles were used for chance-level-estimation. a) d xx ( k ) shownas solid line, d yx ( k ) shown as dotted line and the respectivechance-level d y ∗ x ( k ) (dashed line). b) d yy ( k ) shown as solidline, d xy ( k ) shown as dotted line and the respective chance-level d x ∗ y ( k ) (dashed line). Furthermore the fivefold standarderror is shown for each d ji ( k ) as a grey shade. c) , d) show the same results, e.g. neighborhoodsizes, forthe heart and breathing rate of a sleeping human. The datawas sampled at 2 Hz and embedded with τ = 6 and m = 5,all n = 601 data points were chosen as reference points. c) d HeartHeart ( k ) shown as solid line, d BreathHeart ( k ) shown as dot-ted line and the respective chance-level d Breath ∗ Heart ( k ) (dashedline). Due to the small amount of data only the threefoldstandard deviation was used to determine the chance-level. d) d BreathBreath ( k ) shown as solid line, d HeartBreath ( k ) shown as dottedline and the respective chance-level d Heart ∗ Breath ( k ) (dashed line). per bound for the size of the k -the neighbourhood, thesize d xy ( k ) can be used to define a measure for the infor-mation preserved within the neighbourhood in y . We usethe ratio of the distance between d xx ( k ) and d yx ( k ) and thechance-level d y ∗ x ( k ): I x → y ( k ) = d y ∗ x ( k ) − d yx ( k ) d y ∗ x ( k ) − d xx ( k )Note that 0 ≤ I x → y ≤ I x → y (cid:39) x does not influence Y at all. If I x → y (cid:39) Y ’knowseverything’ about X , which suggests that X has a stronginfluence on Y . Furthermore we introduce a measure forthe asymmetry of causal influences α : α ( k ) = I y → x ( k ) − I x → y ( k ) I y → x ( k ) + I x → y ( k )For determining significance we use the standard error(SE) of the mean logarithmic neighbourhood sizes σ ¯ d ji .For significance the standard errors σ ¯ d yx and σ ¯ d ∗ x must notoverlap, e.g. as shown in FIG. 2 a) . In practice the errorsoverlap for large k . Therefore, in all following results werequire for significance that the difference exceeds severalSE’s and additonally that at least 15 out of k = 1 .. III. RESULTS
Firstly, we investigate sensitivity and specificity of themethod. For this purpose two logistic maps are coupledbilaterally and the causal influence is determined for dif-ferent coupling weights as shown in Fig. 3. The causal in-fluence I y → x is a monotonic function of the weight w y → x ,while the other direction I x → y is largely unaffected bythis weight over three different magnitudes of couplingstrength. However, for couplings smaller than 10 − -4 -3 -2 -1 w y x I y x I x y FIG. 3. Causal Influence between bilaterally coupled logisticmaps. 2 · data points were generated with R x = R y =3 .
92, the observed time series were embedded with m = 4and τ = 1 and we used the k = 20 nearest neighbors todetermine the causal influence and E = 10 permutations forthe chance-level. Black lines show I y → x corresponding to thevaried coupling w y → x , grey lines show the reverse directionwith the coupling W x → y fixed at 0 (circles), 0 . .
032 (triangles). coupling for different amounts of data (Fig. 4 (a) , (d) ).For this system at least 10 data points are needed todetect significant causal influences. If the coupling issmall the required amount of data increases. Note that,regardless of the amount of data, no false positives aredetected. To demonstrate noise robustness we injectedintrinsic and external additive Gaussian noise in the lo-gistic maps FIG. 4 (c) - (f ) . While noise lowers thecausal influence in both cases, it still correctly dependson the coupling and even for strong noise no false posi-tives are introduced.Next we demonstrate the correct identification of thedirection of causal influence for time continuous systemsusing two coupled Lorenz-systems. Here, we additionallyintroduced external and internal noise to demonstratenoise resistance also in this case (Fig. 5). The Lorenzsystems used were coupled by their x -components andare given by:˙ x i = − µ i ( x i − y i ) + w ij x j + ση x i ˙ y i = ρ i x i − y i − x i z i + ση y i ˙ z i = − θ i z i + x i y i + ση z i i with < η i η j > = δ ij δ ( t − t (cid:48) ) , with parameters µ i = 28, ρ i = 8 / θ i = 10. Forboth, the noise free and noise polluted system, the correctdirection of causal influence is determined correctly by α . w x y (a) N w y x (d) (b) / x (e) (c) / x (f) i j FIG. 4. Bilaterally coupled logistic maps R x = R y = 3 . w x → y = 0 .
05 and varying w y → x between 0 and 0 .
1. All timeseries were embedded with τ = 1 and m = 4 and 10 ensem-bles were generated to estimate chance-level. (a) & (d ) Vary-ing amount of data between 10 and 10 using E = 10 per-mutations for chance-level, in (a) the causal influence I y → x associated with the coupling w yx and (d) the reverse direc-tion I y → x . For noise polluted time-series 10 data points wereused. (b) & (e ) Additive internal noise is injected into thesystem. The x -axis shows the ratio of the standard devia-tions of noise and the unpolluted system varying between 0and roughly 8% noise. (b) shows the causal influence I y → x associated with the coupling w yx and (e) the reverse direc-tion I y → x . (c) & (f ) Additive external noise is added to theobserved time-series. The x -axis shows the ratio of the stan-dard deviations of noise and the unpolluted system varyingbetween 0 and roughly 8% noise. (c) shows the causal influ-ence I y → x associated with the coupling w yx and f the reversedirection I y → x . For better visualisation contour lines marklines of equal causal influence. However, in the presence of noise the causal influence isweakened and the asymmetry less pronounced, which wasalso observed for the logistic maps.Finally, we investigate the causal influence for strongcouplings, where systems tend to synchronize. To quan-tify synchronization we estimate the information dimen-sion from d ii ( k ). For unilateral coupling this is a sensitivemeasure, since for complete synchronization the dimen-sion of the driven system will drop to the one of thedriving system. As an example we use a Roessler-Lorenz-System as analysed in [9]. In this system synchronizationoccurs at a critical value of w ≈ .
5, where the dimen-sions of the reconstructions coincide. CPM is not onlyable to detect the correct direction of causal influence be-fore, but also after the critical coupling is reached (Fig.5 b),c)). This result is in sharp contrast to other ap-proaches (Krakovsk´a Fig.4 [9]). Further tests in othersystems (e.g. two coupled Fitzhugh-Nagumo Neurons)showed similar results up to the critical value. Inter-estingly, we found that intrinsic noise improves the de- -1 0 1 w -1-0.500.51 (a) w -1-0.500.51 2345 D (c) w I (b) FIG. 5. (a)
Asymmetry index between two Lorenz oscillators( N = 10 data points) with slightly different frequencies andcoupled by their y -components ( θ / = 10 , ρ = 28 . , ρ =27 . , θ / = 8 / m =9 and τ = 10 and E = 10 were used to estimate chance-level. The noise free asymmetry is shown in black. Colouredcircles represent internal noise with σ = 2 (green) and additiveexternal noise with σ = 2 (red). (b),(c) Unilaterally coupledRoessler (X) → Lorenz (Y) System. 10 reference points werechosen from 10 data points with τ = 2 and m = 7. Forchance-level erstimation E = 10 ensembles were used. (b) Causal Influence I x → y (blue) and I y → x (red) for the noise free(solid-lines) and for the systems perturbed by Gaussian whitenoise (asterix’s). (c) The asymmetry for the noise free (solidlines) and perturbed system (asterix’s). Grey Lines show theInformation Dimension estimated from the respective timeseries. tectability of influence asymmetries (asterix’s in Fig. 5c)).
IV. DISCUSSION:
When a system’s component X unilaterally influencesanother component Y, then Y recieves informationabout X. X will not have as large proportions ofinformation about states of Y, since in this case thelatter typically have independent degrees of freedom.This simple heuristics is somewhat opposite to standardapproaches based on prediction in direction of theinfluence including Granger Causality [14] and TransferEntropy [15]. Here, we consider the relative amounts ofinformation about the systems’ states in the directionsopposite to the influences. The cross prediction methodestimates metric inflations of neighbourhoods and theirprojections via the putatively homeomorphic mappingsamong manifolds of reconstructed system states as aproxy for information loss. In the past a related idea wasused for determining the quality of mappings [10] [16],where the ratio of the inflations within the respectivespaces was expected to be close to one for homeomorphywhile for topology violations it systematically deviatesfrom one. In other words, while the core of methodpresented here is information theoretic its particularsensitivity for metric topology violations could havebeen expected from previous work. A recent work[5] similarly used local properties of the mappingsamong state reconstructions to establish a measure ofcausal influence (termed Topological Causality). Instark contrast to Topological Causality, however, CPMexploits expansions within and not among reconstruc-tions. Basing the method on the relation of distances within the same space solves a range of problems ofTC and related approaches including Convergent CrossMapping [4]. In particular, CPM is much less sensitiveto synchronizations where to our knowledge previousmethods often deliver misleading results [7][8].We are confident that applications also to real datawill provide more reliable results than previous meth-ods. For example an application to heart rate versusbreathing rate data from an apnea patient [17] revealeda substantial asymmetry of influence from heart rate tobreathing rate (Fig. 2 c), d)), an unambiguous resultthat is more pronounced than when determined withTransfer Entropy from the same data (Fig. 4 in [15]).We thank M. Sch¨unemann for helpful comments on themanuscript. [1] F. Takens, Detecting strange attractors in turbulence,in
Dynamical Systems and Turbulence , Springer LectureNotes in Mathematics, Vol. 898, edited by D. A. Randand L.-S. Young (Springer-Verlag, Berlin, 1981).[2] N. H. Packard, J. P. Crutchfield, J. D. Farmer, and R. S.Shaw, Geometry from a time series, Phys. Rev. Lett. ,712 (1980).[3] J. Stark, D. Broomhead, M. Davies, and J. Huke, Takensembedding theorems for forced and stochastic systems,Nonlinear Analysis: Theory, Methods & Applications ,5303 (1997), proceedings of the Second World Congressof Nonlinear Analysts.[4] G. Sugihara, R. M. May, H. Ye, C. Hsieh, E. R. Deyle,and M. Fogarty, Detecting causality in complex ecosys-tems, Science , 496 (2012).[5] D. Harnack, E. Laminski, M. Sch¨unemann, and K. R.Pawelzik, Topological causality in dynamical systems,Phys. Rev. Lett. , 098301 (2017).[6] Z. Benk˝o, ´Ad´am Zlatniczki, D. Fab´o, A. S´olyom,L. Er˝oss, A. Telcs, and Z. Somogyv´ari, Exact infer-ence of causal relations in dynamical systems (2018),arXiv:1808.10806 [q-bio.QM].[7] E. R. Deyle, M. C. Maher, R. D. Hernandez,S. Basu, and G. Sugihara, Global environmen-tal drivers of influenza, Proceedings of the Na-tional Academy of Sciences , 052203 (2016).[10] W. Liebert, K. Pawelzik, and H. Schuster, Optimal em-beddings of chaotic attractors from topological consider-ations, EPL (Europhysics Letters) , 521 (1991).[11] P. Grassberger, T. Schreiber, and C. Schaffrath, Non-linear time sequence analysis, International Journal ofBifurcation and Chaos , 521 (1991).[12] B. Cummins, T. Gedeon, and K. Spendlove, On the effi-cacy of state space reconstruction methods in determin-ing causality, SIAM Journal on Applied Dynamical Sys-tems , 335 (2015), https://doi.org/10.1137/130946344.[13] P. Grassberger, Finite sample corrections to entropy anddimension estimates, Physics Letters A , 369 (1988).[14] C. W. J. Granger, Investigating causal relations by econo-metric models and cross-spectral methods, Econometrica , 424 (1969).[15] T. Schreiber, Measuring information transfer, Phys. Rev.Lett. , 461 (2000).[16] H.-U. Bauer and K. R. Pawelzik, Quantifying the neigh-borhood preservation of self-organizing feature maps,IEEE Transactions on neural networks3