[PDF] Reliable Detection of Causal Asymmetries in Dynamical Systems

Abstract

Knowledge about existence, strength, and dominant direction of causal influences is of paramount importance for understanding complex systems. With limited amounts of realistic data, however, current methods for investigating causal links among different observables from dynamical systems suffer from ambiguous results. Missing is a statistically well defined approach that avoids false positive detections while being sensitive for weak interactions. Ideally, it should be able to infer directed causal influences also when synchronizations occur. The proposed method exploits local inflations of manifolds to obtain estimates of upper bounds on the information loss among state reconstructions from two observables. It comes with a test for the absence of causal influences. Simulated data demonstrate that it is robust to intrinsic noise, copes with synchronizations, and tolerates also measurement noise.

Full PDF

RReliable Detection of Causal Asymmetries in Dynamical Systems

Erik Laminski ∗ and Klaus R. Pawelzik † University of Bremen, 28359 Bremen, Germany (Dated: October 20, 2020)Knowledge about existence, strength, and dominant direction of causal inﬂuences is of paramountimportance for understanding complex systems. With limited amounts of realistic data, however,current methods for investigating causal links among diﬀerent observables from dynamical systemssuﬀer from ambiguous results. Particularly challenging are synchronizations, where it is diﬃcult toinfer the dominant direction of inﬂuence. Missing is a statistically well deﬁned approach that avoidsfalse positive detections while being sensitive for weak interactions. The proposed method exploitslocal inﬂations of manifolds to obtain estimates of upper bounds on the information loss amongstate reconstructions from two observables. It comes with a test for the absence of causal inﬂuences.Simulated data demonstrate that it is robust to intrinsic noise, copes with synchronizations, andtolerates also measurement noise.

I. INTRODUCTION

Simple cause-eﬀect notions of causality are mislead-ing when interactions are reciprocal. This is particularlyclear for deterministic dynamical systems where Takens’Theorem [1–3] shows that diﬀerent observables individ-ually contain all information about the entire system’sstate. Here, state reconstructions from diﬀerent observ-ables are generically equivalent which reﬂects the factthat the systems are non-separable and behave as wholes.Still, it is of considerable interest to gain knowledge aboutthe strength of inﬂuences among selected components ofa complex dynamical system.Several methods proposed for this purpose rely onphase space reconstructions [4–6]. In essence, they arebased on the following simple consideration: Let a sys-tem X uni-directionally inﬂuence another system Y. Ob-viously, then Y receives information about X. Conse-quently, states of Y will contain information about thestate of X, while states of X by assumption cannot pro-vide full ’knowledge’ about Y. Thereby states Y can beexpected to predict observables of X better than viceversa. This heuristics was put on solid mathematicalgrounds in [5] leading to a method that allowed to cap-ture also mutual interactions. The measure of Topologi-cal Causality was introduced for inference of the intensityof directed eﬀective inﬂuences among observables. It re-lies on local expansions of the mappings between state re-constructions. While mathematically transparent, Topo-logical Causality was hitherto estimated from ﬁtting thelocal maps among state reconstructions from data. Thiscan be quite challenging, in particular, when only limitedamounts of observations are available, when the system’sstrict determinism is violated (intrinsic noise), and whenmeasurement noise contaminates the data. Particularly,synchronizations and common input can cause mislead-ing results. Similar problems aﬀect also other methods ∗ [email protected] † [email protected] based on the same heuristics. E.g. the CCM method [4]was found to yield wrong directions of dominant inﬂu-ences with synchronizations [7, 8], particularly see Fig.4in [9].The method presented here is also based on expan-sions, however, not among but instead within each statespace reconstruction. Following the above heuristics, itestimates a bound on the relative amount of informationcontained in the state reconstructed from of the causallyaﬀected system about the state reconstructed from theeﬀecting system. This is done by comparing the sizesof local neighbourhoods in one state space reconstruc-tion with both, the projections of neighbours in the otherstate space reconstruction and random neighbours. Thecorresponding manifold inﬂations not only capture pos-sible dimensional conﬂicts among state reconstructions[6], but also allow for a statistical criterion to control forfalse positive detections of causal inﬂuences.After introducing the cross projection method (CPM)it is tested on a time discrete model system demonstrat-ing the relation to coupling constants and its dependencyon intrinsic noise, limited measurements, and measure-ment noise. Then we demonstrate that CPM reliably de-tects the absence of causal links and copes with synchro-nization when applied to time series of realistic lengthfrom time continuous systems. A ﬁrst application toheart rate and breathing rate data is found to deliverunambiguous results. II. MANIFOLD INFLATION AS A PROXY FORINFORMATION LOSS

We consider dynamical systems composed of two sub-systems X and Y , governed by ˙x = f ( x , w xy µ x ( y )) ˙y = g ( y , w yx µ y ( x ))where µ i ( i ) denote ﬁxed scalar functions and w ij coupling constants. It was shown by Takens [1] that atopologically equivalent portrait of the attractor can be a r X i v : . [ phy s i c s . d a t a - a n ] O c t reconstructed from lagged coordinates. For this purposedelayed copies of a single observable are merged giving a m -dimensional vector r x ( t ) = [ x ( t ) , ..., x ( t + ( m − τ )].The dimension m is suﬃcient if m > D where D is the dimension of the attractor. Formally the timedelay τ > m and τ [10][11]. Note that to be a valid reconstruction of theoverall attractor, there must be incoming connections w xi linking the subsystem X to the whole system. Inthis case there also exists a unique mapping betweenreconstructions from diﬀerent observables, in this casefrom r x to r y [12].Here we use a combination of local and global prop-erties of the relations among nearest neighbours to ref-erence points in both reconstructions for estimating howmuch the causally aﬀected system ’knows’ about the stateof the system inﬂuencing it. Each point is identiﬁed byits time index t and its location in the respective recon-struction. For each reference point t we determine the k nearest neighbours in both reconstructed spaces t xl and t yl , l = 1 , ..., k and the (euclidean) distances from the ref-erence point to these neighbours both, in their originspace L x ( t, t xl ) and the distance to the putative neigh-bours based on proximity in the respective other space L x ( t, t yl ).For the chance level an ensemble E of surrogate neigh-bors is generated by shifting the time indices of all actualneighbors of each reference point with a mutual randomtime interval δt E We avoid overlap of the embedding vec-tor by excluding the immediate temporal neighbors (i.e. δt r > ± mτ ). When applying this random time shift tothe time indices of neighbors in space Y a given referencepoint t becomes in X associated with a set of points t y ∗ l haveing distances L x ( t, t y ∗ l ) to these points. This proce-dure removes causal relations of neighborhoods betweendiﬀerent observables while preserving the temporal cor-relations within the set of surrogate neighbors.Also, we introduce d ji ( k ) as mean logarithmic size of theneighbourhood for all L i ( t, t jl ): d ji ( k ) = (cid:68) log ( max [( L i ( t, t jl )] l =1 ..k ) (cid:69) t ∧ E i, j = x, y, x ∗ , y ∗ The random neighbourhoods are here averaged overboth, the Monte-Carlo ensembles E and the referencepoints, whereas for the non-shifted neighbours the aver-age is only over the reference points.As an example consider two coupled logistic maps x ( t + 1) = x ( t )[ R x (1 − x ( t )) − w xy y ( t )] + η x ( t ) y ( t + 1) = y ( t )[ R y (1 − y ( t )) − w yx x ( t )] + η y ( t )with reﬂecting boundaries, R i being system parametersand subjected to additive Gaussian noise η i ( t ) ∈ N (0 , σ ). The simplest case is the noise free unilaterally coupledsystem ( w xy = 0 and w yx = 0 . x t over the ( y t , y t +1 )-plane. (FIG.1). Since w yx > FIG. 1. x t over y t and y t +1 for the noise free unilaterallycoupled logistic maps with w xy = 0 , w yx = 0 . , R x = R y =3 .

82, together with the projection of the manifold on the y t +1 - y t -plane and the x t -axis. 10 data points are shown in bluethe 10 nearest neighbours of a reference point are shown incolour in both, the manifold and the projections (grey). The10 neighbours searched in Y (X) are shown in red (green). Y and the images of neighbours in Y are also localizedin X. With increasing coupling w yx neighbours searchedin Y become more localized in X, in the limiting caseof perfect information preservation the neighbours areidentical and d xx ( k ) = d yx ( k ). In contrast, X here doesnot constrain Y completely and thus the image of theseneighbours is spread over the whole y t - y t +1 -plane. Inthe other limiting case, w xy = 0, no information aboutX is included in Y and d yx ( k ) will on average be identicalwith the random neighbourhood d y ∗ x ( k ).These relations between neighborhood sizes are visi-ble in plots of the mean logarithmic neighbourhood sizes d ji ( k ) as functions of κ = ψ ( k ) − log ( N ), where k is thenumber of neighbours and N the amount of data and ψ is the digamma-function. This particular choice of theabscissa allows for an unbiased estimate of the fractal(information) dimensions D of the subsystems by tak-ing the slope of ψ ( k ) versus d xx ( k ) (resp. d yy ( k )) [13]. Inthe present example the manifold reconstructed from ob-servable X has a smaller dimension than the attractorreconstructed from the inﬂuenced observable Y. Whiledimensional conﬂicts provide a suﬃcient criterion for thedirection of causal inﬂuence in unilaterally coupled deter-ministic systems [6], they are useless for mutually coupledsystems. In this general case a diﬀerent criterion for de-termining the dominant direction of causal inﬂuence isneeded.With d xx ( k ) providing a lower bound and d ∗ x ( k ) an up- -6 -4 (k) -8-6-4-20 d ij ( k ) (a) -6 -4 (k) -8-6-4-20 (b) -6 -4 (k) (d) -6 -4 (k) (c) FIG. 2. Logarithmic neighbourhood sizes d ji ( k ) over κ fornoise free unilateral coupled logistic maps ( w xy = 0 , w yx = 0 . R x = R y = 3 .

82) from N = 10 data points. The observ-ables were embedded with m = 4 and τ = 1 and 10 ensem-bles were used for chance-level-estimation. a) d xx ( k ) shownas solid line, d yx ( k ) shown as dotted line and the respectivechance-level d y ∗ x ( k ) (dashed line). b) d yy ( k ) shown as solidline, d xy ( k ) shown as dotted line and the respective chance-level d x ∗ y ( k ) (dashed line). Furthermore the ﬁvefold standarderror is shown for each d ji ( k ) as a grey shade. c) , d) show the same results, e.g. neighborhoodsizes, forthe heart and breathing rate of a sleeping human. The datawas sampled at 2 Hz and embedded with τ = 6 and m = 5,all n = 601 data points were chosen as reference points. c) d HeartHeart ( k ) shown as solid line, d BreathHeart ( k ) shown as dot-ted line and the respective chance-level d Breath ∗ Heart ( k ) (dashedline). Due to the small amount of data only the threefoldstandard deviation was used to determine the chance-level. d) d BreathBreath ( k ) shown as solid line, d HeartBreath ( k ) shown as dottedline and the respective chance-level d Heart ∗ Breath ( k ) (dashed line). per bound for the size of the k -the neighbourhood, thesize d xy ( k ) can be used to deﬁne a measure for the infor-mation preserved within the neighbourhood in y . We usethe ratio of the distance between d xx ( k ) and d yx ( k ) and thechance-level d y ∗ x ( k ): I x → y ( k ) = d y ∗ x ( k ) − d yx ( k ) d y ∗ x ( k ) − d xx ( k )Note that 0 ≤ I x → y ≤ I x → y (cid:39) x does not inﬂuence Y at all. If I x → y (cid:39) Y ’knowseverything’ about X , which suggests that X has a stronginﬂuence on Y . Furthermore we introduce a measure forthe asymmetry of causal inﬂuences α : α ( k ) = I y → x ( k ) − I x → y ( k ) I y → x ( k ) + I x → y ( k )For determining signiﬁcance we use the standard error(SE) of the mean logarithmic neighbourhood sizes σ ¯ d ji .For signiﬁcance the standard errors σ ¯ d yx and σ ¯ d ∗ x must notoverlap, e.g. as shown in FIG. 2 a) . In practice the errorsoverlap for large k . Therefore, in all following results werequire for signiﬁcance that the diﬀerence exceeds severalSE’s and additonally that at least 15 out of k = 1 .. III. RESULTS

Firstly, we investigate sensitivity and speciﬁcity of themethod. For this purpose two logistic maps are coupledbilaterally and the causal inﬂuence is determined for dif-ferent coupling weights as shown in Fig. 3. The causal in-ﬂuence I y → x is a monotonic function of the weight w y → x ,while the other direction I x → y is largely unaﬀected bythis weight over three diﬀerent magnitudes of couplingstrength. However, for couplings smaller than 10 − -4 -3 -2 -1 w y x I y x I x y FIG. 3. Causal Inﬂuence between bilaterally coupled logisticmaps. 2 · data points were generated with R x = R y =3 .

92, the observed time series were embedded with m = 4and τ = 1 and we used the k = 20 nearest neighbors todetermine the causal inﬂuence and E = 10 permutations forthe chance-level. Black lines show I y → x corresponding to thevaried coupling w y → x , grey lines show the reverse directionwith the coupling W x → y ﬁxed at 0 (circles), 0 . .

032 (triangles). coupling for diﬀerent amounts of data (Fig. 4 (a) , (d) ).For this system at least 10 data points are needed todetect signiﬁcant causal inﬂuences. If the coupling issmall the required amount of data increases. Note that,regardless of the amount of data, no false positives aredetected. To demonstrate noise robustness we injectedintrinsic and external additive Gaussian noise in the lo-gistic maps FIG. 4 (c) - (f ) . While noise lowers thecausal inﬂuence in both cases, it still correctly dependson the coupling and even for strong noise no false posi-tives are introduced.Next we demonstrate the correct identiﬁcation of thedirection of causal inﬂuence for time continuous systemsusing two coupled Lorenz-systems. Here, we additionallyintroduced external and internal noise to demonstratenoise resistance also in this case (Fig. 5). The Lorenzsystems used were coupled by their x -components andare given by:˙ x i = − µ i ( x i − y i ) + w ij x j + ση x i ˙ y i = ρ i x i − y i − x i z i + ση y i ˙ z i = − θ i z i + x i y i + ση z i i with < η i η j > = δ ij δ ( t − t (cid:48) ) , with parameters µ i = 28, ρ i = 8 / θ i = 10. Forboth, the noise free and noise polluted system, the correctdirection of causal inﬂuence is determined correctly by α . w x y (a) N w y x (d) (b) / x (e) (c) / x (f) i j FIG. 4. Bilaterally coupled logistic maps R x = R y = 3 . w x → y = 0 .

05 and varying w y → x between 0 and 0 .

1. All timeseries were embedded with τ = 1 and m = 4 and 10 ensem-bles were generated to estimate chance-level. (a) & (d ) Vary-ing amount of data between 10 and 10 using E = 10 per-mutations for chance-level, in (a) the causal inﬂuence I y → x associated with the coupling w yx and (d) the reverse direc-tion I y → x . For noise polluted time-series 10 data points wereused. (b) & (e ) Additive internal noise is injected into thesystem. The x -axis shows the ratio of the standard devia-tions of noise and the unpolluted system varying between 0and roughly 8% noise. (b) shows the causal inﬂuence I y → x associated with the coupling w yx and (e) the reverse direc-tion I y → x . (c) & (f ) Additive external noise is added to theobserved time-series. The x -axis shows the ratio of the stan-dard deviations of noise and the unpolluted system varyingbetween 0 and roughly 8% noise. (c) shows the causal inﬂu-ence I y → x associated with the coupling w yx and f the reversedirection I y → x . For better visualisation contour lines marklines of equal causal inﬂuence. However, in the presence of noise the causal inﬂuence isweakened and the asymmetry less pronounced, which wasalso observed for the logistic maps.Finally, we investigate the causal inﬂuence for strongcouplings, where systems tend to synchronize. To quan-tify synchronization we estimate the information dimen-sion from d ii ( k ). For unilateral coupling this is a sensitivemeasure, since for complete synchronization the dimen-sion of the driven system will drop to the one of thedriving system. As an example we use a Roessler-Lorenz-System as analysed in [9]. In this system synchronizationoccurs at a critical value of w ≈ .

5, where the dimen-sions of the reconstructions coincide. CPM is not onlyable to detect the correct direction of causal inﬂuence be-fore, but also after the critical coupling is reached (Fig.5 b),c)). This result is in sharp contrast to other ap-proaches (Krakovsk´a Fig.4 [9]). Further tests in othersystems (e.g. two coupled Fitzhugh-Nagumo Neurons)showed similar results up to the critical value. Inter-estingly, we found that intrinsic noise improves the de- -1 0 1 w -1-0.500.51 (a) w -1-0.500.51 2345 D (c) w I (b) FIG. 5. (a)

Asymmetry index between two Lorenz oscillators( N = 10 data points) with slightly diﬀerent frequencies andcoupled by their y -components ( θ / = 10 , ρ = 28 . , ρ =27 . , θ / = 8 / m =9 and τ = 10 and E = 10 were used to estimate chance-level. The noise free asymmetry is shown in black. Colouredcircles represent internal noise with σ = 2 (green) and additiveexternal noise with σ = 2 (red). (b),(c) Unilaterally coupledRoessler (X) → Lorenz (Y) System. 10 reference points werechosen from 10 data points with τ = 2 and m = 7. Forchance-level erstimation E = 10 ensembles were used. (b) Causal Inﬂuence I x → y (blue) and I y → x (red) for the noise free(solid-lines) and for the systems perturbed by Gaussian whitenoise (asterix’s). (c) The asymmetry for the noise free (solidlines) and perturbed system (asterix’s). Grey Lines show theInformation Dimension estimated from the respective timeseries. tectability of inﬂuence asymmetries (asterix’s in Fig. 5c)).

IV. DISCUSSION:

When a system’s component X unilaterally inﬂuencesanother component Y, then Y recieves informationabout X. X will not have as large proportions ofinformation about states of Y, since in this case thelatter typically have independent degrees of freedom.This simple heuristics is somewhat opposite to standardapproaches based on prediction in direction of theinﬂuence including Granger Causality [14] and TransferEntropy [15]. Here, we consider the relative amounts ofinformation about the systems’ states in the directionsopposite to the inﬂuences. The cross prediction methodestimates metric inﬂations of neighbourhoods and theirprojections via the putatively homeomorphic mappingsamong manifolds of reconstructed system states as aproxy for information loss. In the past a related idea wasused for determining the quality of mappings [10] [16],where the ratio of the inﬂations within the respectivespaces was expected to be close to one for homeomorphywhile for topology violations it systematically deviatesfrom one. In other words, while the core of methodpresented here is information theoretic its particularsensitivity for metric topology violations could havebeen expected from previous work. A recent work[5] similarly used local properties of the mappingsamong state reconstructions to establish a measure ofcausal inﬂuence (termed Topological Causality). Instark contrast to Topological Causality, however, CPMexploits expansions within and not among reconstruc-tions. Basing the method on the relation of distances within the same space solves a range of problems ofTC and related approaches including Convergent CrossMapping [4]. In particular, CPM is much less sensitiveto synchronizations where to our knowledge previousmethods often deliver misleading results [7][8].We are conﬁdent that applications also to real datawill provide more reliable results than previous meth-ods. For example an application to heart rate versusbreathing rate data from an apnea patient [17] revealeda substantial asymmetry of inﬂuence from heart rate tobreathing rate (Fig. 2 c), d)), an unambiguous resultthat is more pronounced than when determined withTransfer Entropy from the same data (Fig. 4 in [15]).We thank M. Sch¨unemann for helpful comments on themanuscript. [1] F. Takens, Detecting strange attractors in turbulence,in

Dynamical Systems and Turbulence , Springer LectureNotes in Mathematics, Vol. 898, edited by D. A. Randand L.-S. Young (Springer-Verlag, Berlin, 1981).[2] N. H. Packard, J. P. Crutchﬁeld, J. D. Farmer, and R. S.Shaw, Geometry from a time series, Phys. Rev. Lett. ,712 (1980).[3] J. Stark, D. Broomhead, M. Davies, and J. Huke, Takensembedding theorems for forced and stochastic systems,Nonlinear Analysis: Theory, Methods & Applications ,5303 (1997), proceedings of the Second World Congressof Nonlinear Analysts.[4] G. Sugihara, R. M. May, H. Ye, C. Hsieh, E. R. Deyle,and M. Fogarty, Detecting causality in complex ecosys-tems, Science , 496 (2012).[5] D. Harnack, E. Laminski, M. Sch¨unemann, and K. R.Pawelzik, Topological causality in dynamical systems,Phys. Rev. Lett. , 098301 (2017).[6] Z. Benk˝o, ´Ad´am Zlatniczki, D. Fab´o, A. S´olyom,L. Er˝oss, A. Telcs, and Z. Somogyv´ari, Exact infer-ence of causal relations in dynamical systems (2018),arXiv:1808.10806 [q-bio.QM].[7] E. R. Deyle, M. C. Maher, R. D. Hernandez,S. Basu, and G. Sugihara, Global environmen-tal drivers of inﬂuenza, Proceedings of the Na-tional Academy of Sciences , 052203 (2016).[10] W. Liebert, K. Pawelzik, and H. Schuster, Optimal em-beddings of chaotic attractors from topological consider-ations, EPL (Europhysics Letters) , 521 (1991).[11] P. Grassberger, T. Schreiber, and C. Schaﬀrath, Non-linear time sequence analysis, International Journal ofBifurcation and Chaos , 521 (1991).[12] B. Cummins, T. Gedeon, and K. Spendlove, On the eﬃ-cacy of state space reconstruction methods in determin-ing causality, SIAM Journal on Applied Dynamical Sys-tems , 335 (2015), https://doi.org/10.1137/130946344.[13] P. Grassberger, Finite sample corrections to entropy anddimension estimates, Physics Letters A , 369 (1988).[14] C. W. J. Granger, Investigating causal relations by econo-metric models and cross-spectral methods, Econometrica , 424 (1969).[15] T. Schreiber, Measuring information transfer, Phys. Rev.Lett. , 461 (2000).[16] H.-U. Bauer and K. R. Pawelzik, Quantifying the neigh-borhood preservation of self-organizing feature maps,IEEE Transactions on neural networks3