[PDF] Coping with dating errors in causality estimation

Abstract

{We consider the problem of estimating causal influences between observed processes from time series possibly corrupted by errors in the time variable (dating errors) which are typical in palaeoclimatology, planetary science and astrophysics. "Causality ratio" based on the Wiener -- Granger causality is proposed and studied for a paradigmatic class of model systems to reveal conditions under which it correctly indicates directionality of unidirectional coupling. It is argued that in case of {\it a priori} known directionality, the causality ratio allows a characterization of dating errors and observational noise. Finally, we apply the developed approach to palaeoclimatic data and quantify the influence of solar activity on tropical Atlantic climate dynamics over the last two millennia. A stronger solar influence in the first millennium A.D. is inferred. The results also suggest a dating error of about 20 years in the solar proxy time series over the same period.

Full PDF

eepl draft

Coping with Dating Errors in Causality Estimation

D.A. Smirnov , , N. Marwan , S.F.M. Breitenbach , F. Lechleitner , , and J. Kurths , Saratov Branch of V.A. Kotel’nikov Institute of RadioEngineering and Electronics of the Russian Academy of Sciences– 38 Zelyonaya St., Saratov 410019, Russia Institute of Applied Physics of the Russian Academy of Sciences, 46 Ulyanova St., Nizhny Novgorod 603950, Russia Potsdam Institute for Climate Impact Research, Telegraphenberg A31, Potsdam 14473, Germany Institute for Geology, Mineralogy & Geophysics, Ruhr-Universit¨at Bochum, Universit¨atsstr. 150, 44801, Bochum,Germany Department of Earth Sciences, ETH Zurich, Sonneggstrasse 5, 8092 Zurich, Switzerland Department of Earth Sciences, Durham University, Durham, DH1 3LE, UK

PACS – Time series analysis

Abstract –We consider the problem of estimating causal inﬂuences between observed processesfrom time series possibly corrupted by errors in the time variable (dating errors) which are typicalin palaeoclimatology, planetary science and astrophysics. “Causality ratio” based on the Wiener– Granger causality is proposed and studied for a paradigmatic class of model systems to revealconditions under which it correctly indicates directionality of unidirectional coupling. It is arguedthat in case of a priori known directionality, the causality ratio allows a characterization of datingerrors and observational noise. Finally, we apply the developed approach to palaeoclimatic dataand quantify the inﬂuence of solar activity on tropical Atlantic climate dynamics over the lasttwo millennia. A stronger solar inﬂuence in the ﬁrst millennium A.D. is inferred. The results alsosuggest a dating error of about 20 years in the solar proxy time series over the same period.

Introduction. –

Revealing cause-and-eﬀect relation-ships between observed processes at various time scales isan important step in understanding many physical, bio-logical, physiological and geophysical systems [1–8]. Fre-quently, this issue must be addressed with rather limitedknowledge about the systems under study, amounts of ob-servational data, and dating accuracy. A general approachto detect and quantify causal couplings, i.e., to ﬁnd out“who drives whom”, is the Wiener – Granger (WG) causal-ity [2, 3]. In its simplest version, the idea is to checkwhether a present value of one process ( X ) can be pre-dicted more accurately using the past of a second process( Y ) in comparison with predictions based solely on thepast of X . In fact, this concept generalizes a conditional(partial) cross-correlation [11] and has been followed by anumber of elaborations such as information-theoretic mea-sures [3,12–15] and various nonlinear approximations [16].Despite some limitations and obstacles [8, 17–19], theWG causality appears quite useful in practice, allowingmeaningful dynamical interpretations [10, 22] and becom-ing increasingly widely used in diﬀerent ﬁelds, such asbiomedicine [1, 5, 8] and geophysics [6].Causal coupling estimation is also of great value in cli- mate science, where temporal changes of climatically sen-sitive proxies [23] are the main source of information aboutpast climate dynamics over long time intervals. The sta-lagmite YOK-I from the Yok Balum Cave in SouthernBelize is especially well dated [24] and provides a high-resolution reconstruction of low-latitudinal Atlantic mois-ture variations [25]. Making use of solar irradiance re-constructions (e.g. [26]), one can ask “How do variationsin solar activity aﬀect regional Atlantic climate?”. An-swering this question helps further delineating the time-variant processes that drive climate variations. However,this question leads directly to the main diﬃculty with suchdata: dating accuracy of the reconstructions used. Uncer-tainties inherent to sampling and dating methods limit ourknowledge of the time instant of each proxy observation,so that temporal ordering of the observations from the twotime series may be distorted uniformly or irregularly in thecourse of time. This makes questionable any applicationof the WG causality approach, which essentially requiresa clear distinction between the future and the past.In this Letter, we propose a solution with an appropri-ate speciﬁcation of the problem setting and adaptation ofthe WG causality characteristics. We consider a situationp-1 a r X i v : . [ phy s i c s . d a t a - a n ] A p r .A. Smirnov et alwhere it is known in advance that the coupling betweentwo processes underlying the observed time series is unidi-rectional, and the problem reduces to identifying the cou-pling directionality. Observational noise and dating errorsmay strongly aﬀect the results of any coupling analysis.In particular, the usual cross-correlation function (CCF)is obviously insuﬃcient since even a uniform dating er-ror moves the location of the CCF maximum along thetime axis, so that “lead – lag” information is lost. Wenote, however, that the WG causality approach providestwo coupling characteristics corresponding to the two di-rections X → Y and Y → X , which is a richer characteri-zation than a single CCF value. To make the WG causal-ity work in case of dating errors, we suggest its modiﬁca-tion involving the deﬁnition of the causality ratio r Y → X which is the ratio of maximized time-lagged truncated WGcausalities in the directions Y → X and X → Y . We arguethat if a coupling indeed exists in the direction Y → X ,then under certain conditions r Y → X >

1, i.e., the causal-ity ratio is an indicator of the coupling directionality.We study the conditions under which this causality ra-tio allows us to extract information on directionality ofunidirectional coupling or, knowing the directionality, tocharacterize dating errors and observational noise in theanalyzed time series. As for the latter task, the men-tioned palaeoclimate problem is a relevant example wherecoupling is unidirectional from solar activity variationsto regional climate (reﬂected in proxy reconstructions),while dating errors and observational noise in the proxysignals remain largely unknown. Here, we (i) determinethe causality ratio for a class of model systems exactly, (ii)analyze statistical properties of its estimator in numericalsimulations, and (iii) apply the approach to palaeoclimatedata using the two records mentioned above to assess theirdating accuracy and quantify the time-variant inﬂuence ofsolar activity on the tropical Atlantic climate. Further de-tails of the method and additional results are given in [27].

Wiener – Granger causality. –

Let ( X ( t ) , Y ( t )) bea bivariate random process with realizations ( x ( t ) , y ( t )).Denote x n = x ( t n ), y n = y ( t n ), where t n = nh , n ∈ Z , and h is the sampling interval. Considerthe self-predictor x indn = E [ X ( t n ) | x n − , x n − , . . . ] wherethe expectation E [ ·|· ] is conditioned on the inﬁnite past { x n − , x n − , . . . } . Its mean-squared error is σ X,ind = E [( X ( t n ) − x indn ) ] where the expectation is taken overall x n and all { x n − , x n − , . . . } . This error is the leastover all self-predictors for X . The joint predictor x jointn = E [ X ( t n ) | x n − , y n − , x n − , y n − , . . . ] gives the least error σ X,joint over all joint predictors. The prediction improve-ment (PI) G Y → X = ( σ X,ind − σ X,joint ) /σ X,ind is a measureof WG causality in the direction Y → X . Everything isanalogous for the direction X → Y .The WG idea was ﬁrst realized for stationary Gaussianprocesses [3]. Then, when estimating G Y → X from a ﬁnitetime series { x n , y n } Nn =1 , one truncates the (conditioning)inﬁnite pasts at ﬁnite numbers of terms l X and l XY and ﬁts univariate and bivariate linear autoregressive modelsof the orders l X and ( l X , l XY ) to the data via the ordi-nary least-squares technique. In other words, one usesthe predictors x indn,l X = E [ X n | x n − , x n − , . . . , x n − l X ] and x jointn,l X ,l XY = E [ X n | x n − , . . . , x n − l X , y n − , . . . , y n − l XY ] andgets the truncated WG causality G trY → X . The latter is of-ten a good approximation of G Y → X even at small l X and l XY . The model orders can be selected via the Schwarzcriterion [5] and statistical signiﬁcance can be checked viaFisher’s F -test [6]. Causality ratio. –

Consider a more general settingwith the original processes X and Y , whose observedversions X and Y are distorted along two lines. First,due to an amplitude noise: X ( t ) = X ( t ) + Ξ( t ) and Y ( t ) = Y ( t ) + Ψ( t ) where Ξ( t ) and Ψ( t ) are independentobservational noises with variances σ and σ , whose dis-crete time realizations ξ n and ψ n are white noises. Second,due to time uncertainty: genuine ( a priori unknown) ob-servation instants t Xn and t Yn deviate from the supposedregular equidistant series t n = nh : x n = x ( t Xn ) + ξ n and y n = y ( t Yn ) + ψ n with t Xn + δ Xn = nh and t Yn + δ Yn = nh ,where δ Xn and δ Yn stay for the time axis (i.e. dating)errors. The latter may be rapidly ﬂuctuating or slowlyvarying and may be deﬁned either as random processesor deterministic functions of time. To account for thedating errors and retain sensitivity to coupling, we usethe time-lagged WG causality: namely, G trY → X (∆) is de-ﬁned as prediction improvement of x n when using the seg-ment { y n − ∆ /h , . . . , y n − ( l XY − − ∆ /h } . Then, we suggestto determine its maximum over an interval of positiveand negative time lags of some width 2∆ m : G tr,maxY → X =max − ∆ m ≤ ∆ ≤ ∆ m G trY → X (∆). Analogously we deﬁne G tr,maxX → Y .Finally, the causality ratio in the direction Y → X reads r Y → X = G tr,maxY → X G tr,maxX → Y . (1)Obviously, r X → Y = 1 /r Y → X . The value of ∆ m should bechosen so as to exceed a maximal possible dating error toavoid missing the maximal PIs. If, moreover, the couplingis time-delayed, locations of the PIs maxima are shiftedalong the ∆-axis by the value of this delay. Hence, ifone expects a time delay, then the value of ∆ m should beselected so as to exceed the sum of the absolute values ofthe coupling delay and the dating error.We conjecture that for unidirectional coupling Y → X and similar individual characteristics of the processes X and Y , the ratio r Y → X is considerably greater than unity.However, dating errors and observational noise along withestimates ﬂuctuations due to shortness of time series maysomewhat decrease r Y → X , which is studied below. Model system. –

Since the value of r Y → X may de-pend on many features of the processes under study (suchas characteristic times and sampling interval) and param-eters of the estimation technique (such as l X ), we need top-2oping with Dating Errorschoose a reasonably simple system and a narrow range ofthe parameters for which the causality ratio can be stud-ied in detail. As such a testing system, we use coupled“relaxators” (ﬁrst-order decay processes): dX /dt = − αX ( t ) + kY ( t ) + ζ X ( t ) ,dY /dt = − αY ( t ) + ζ Y ( t ) , (2)where α determines the characteristic relaxation time τ = 1 /α , k is the coupling coeﬃcient, and ζ X and ζ Y areindependent zero-mean white noises with autocorrelationfunctions E [ ζ X ( t ) ζ X ( t )] = E [ ζ Y ( t ) ζ Y ( t )] = δ ( t − t )where δ is Dirac’s delta. Eqs. (9) represent a simple,but basic class of systems which still exhibit irregulartemporal behavior and are often encountered in diﬀerentﬁelds (e.g. [30]). The squared zero-lag CCF reads here C X Y , = ( β/ / (1 + β/

2) where β = k /α is a non-dimensional coupling strength. C X Y , ranges from 0 (for k = 0) to 0.5 (for k → ∞ ) and can be used to parameter-ize the coupling strength as well. The sampling rate canbe conveniently characterised by the ratio h/τ .For system (9) it appears possible to conﬁne ourselveswith the orders l X = l XY = l Y = l Y X = 1. It can beargued that G trY → X (∆) obtained at l X = l XY = 1 is closeto G trY → X (∆) obtained at l X = ∞ and l XY = 1, if thesampling interval h is not too small (e.g. ≥ . τ ) [10]. Innumerical simulations here, we also ﬁnd that the resultsfor G trY → X (∆) with l X = 1 are close to those obtainedwith l X selected via the Schwarz criterion (diﬀerence ofthe order of 1%). Similar arguments hold for l XY . Then,the quantity G trY → X (∆) can be expressed via the autocor-relation function (ACF) C XX ( h ) and the CCF C XY (∆)and C XY (∆ − h ) [8, 27]. Having found ACFs and CCFanalytically, we compute the time-lagged truncated WGcausalities versus ∆ and select their maxima to calculatethe causality ratio. Such a precise analysis is performed forvarious coupling coeﬃcient values, sampling intervals, ob-servational noise and dating error levels, while statisticalproperties of the causality ratio estimator are investigatedin numerical simulations. We check if indeed r Y → X > r Y → X can be at all. A closer atten-tion is paid to cases with 0 . ≤ C XY,max ≤ . . ≤ G tr,maxY → X ≤ .

03 which are reminiscentof those often observed in climate data analysis in casesof statistically signiﬁcant coupling detection (e.g. [31] andthe palaeoclimate example below).

Exact study of possible causality ratio values.–

Before considering the central point of dating errorsidentiﬁcation, it is necessary to study the case of undis-torted observations X = X and Y = Y . For the mostpractically interesting situations of not too sparse sam-pling (e.g. h ≤ . τ ), r Y → X is well above unity, conﬁ-dently indicating the correct coupling direction. Namely, r Y → X = 1 . h = 0 . τ and a moderately strong cou-pling of C X Y , = 0 .

1. For rather sparse samplings of h ≥ τ , the ratio r Y → X gets close to unity and, hence, can- not reliably reveal coupling directionality. This is similarfor any coupling strength: in particular, at h/τ = 0 . r Y → X ≈ .

6) inthe wide range of 0 < C XY, < .

3. For stronger couplings, r Y → X becomes even greater, up to ≈ C XY, = 0 . r Y → X correctly de-tects coupling directionality. More details are given in [27].Though there can be diﬀerent types of dating errors,their basic eﬀect can be studied on a simple example wheredating errors equal a constant temporal shift half the time(e.g. for an older half of a palaeoclimate record where ac-curate dating is more diﬃcult) and zero otherwise. Re-gardless which signal is erroneously dated, only the rela-tive dating errors matter in causality estimation. For deﬁ-niteness, we introduce the dating errors only into the driv-ing signal: δ Yn = const = δ Y half time (for n = 1 , . . . , N/ δ Yn = 0 otherwise (for n = N/ , . . . , N ). The “av-erage CCF” of such a nonstationary process ( X, Y ) canbe deﬁned as the expectation of the sample CCF com-puted over the entire time span and equals an arithmeticmean of the CCFs for the two stationary halves. Theusual WG causalities deﬁned for the entire time span areexpressed via such an average CCF in the same way as be-fore. Figs. 1,a,b show that the shape of the plots for thetime-lagged WG causalities and locations of their max-ima change strongly when the dating error becomes com-parable with the relaxation time τ . Then, the “correct” G tr,maxY → X decreases almost two times as compared to zerodating error, while the opposite G tr,maxX → Y decreases only1.5 times. At that, the causality ratio becomes close tounity and may even fall down to 0.9 for the dating errorgreater than τ . If a smaller or a larger portion of a timeseries suﬀers from a uniform dating error, then the eﬀect ofthe latter on the causality ratio and the respective distor-tions of the plots G trY → X (∆) are weaker [27], in particular,they vanish if the entire time series is characterized witha uniform dating error since the causality ratio involvesmaximization over temporal shifts.Principally, dating errors may be distributed in a com-plicated manner determined both by random walk-likestochastic contribution, analytical limitations and globalcontribution induced by incorrect tie points as, e.g., er-roneous attribution of volcanic eruption dates due to in-correct identiﬁcation of individual eruptions [13]. Still, wehave obtained results very similar to Fig. 1 for dating er-rors linearly increasing with age, even with a superimposedrandom-walk component whose values become of the orderof τ for ages of the order of 100 τ as motivated by palaeocli-mate applications. Thus, the described eﬀect of the datingerrors is robust, being observed just for reasonably largedating errors without any other, speciﬁc conditions.When dating errors are present, it is natural to expectalso an observational noise. Let us ﬁrst show how the lat-ter aﬀects the causality ratio for zero dating errors. Itappears that the noise Ψ in the driving signal can signif-icantly decrease r Y → X . Thus, at moderate h/τ = 0 . D / t G Y ® X ( D ) D / t G X ® Y ( D ) half-time datingerror d Y / t -4 -2 0 2 4 -4 -2 0 2 4 a) b) -2 -1 0 1 2 half-time dating error d Y / t G Y ® X , G X ® Y C XY , max tr , max tr , max c) tr tr -2 -1 0 1 2 half-time dating error d Y / t r Y ® X d) Fig. 1: Causality measures depending on dating error δ Y for the system (9) at h/τ = 0 . σ = σ = 0, k/α issuch that C X Y , = 0 . l X = l XY = l Y = l Y X = 1: (a,b)truncated WG causalities versus time lag for diﬀerent datingerrors; (c) maximal truncated WG causalities (blue and green)and maximum CCF value (black) and (d) causality ratio versus δ Y . C X Y , = 0 . σ = 0, the “correct” G tr,maxY → X de-creases with σ faster than G tr,maxX → Y so that r Y → X ap-proaches unity at σ /σ Y > . r Y → X apart fromunity, which becomes quite visible as soon as σ /σ X ex-ceeds just 0 .

1. To summarize, large values of σ /σ Y (50%and greater) along with small σ /σ X (less than 10%) atmoderate coupling strengths make the ratio r Y → X closeto unity. Hence, such a speciﬁc combination of noise levelscan complicate inference of coupling direction from r Y → X .To distinguish between impacts of observational noiseand dating error from data, we can use either (i) assump-tions about possible levels of both factors or (ii) shapes ofthe plots G tr (∆). For example, (i) if the noise is hardlygreater than 20% in terms of variance, then r Y → X < . τ / G trY → X (∆) and G trX → Y (∆) strongly diﬀer from each other (cf. Figs. 1,a,band 2,a,b), this is a sign of dating errors rather than obser-vational noise. Being based on exact values of the causalityratio, such considerations are valid only for long enoughtime series, where statistical ﬂuctuations can be neglected. Note of causality ratio estimation. –

Muchsmaller causality ratio estimates (e.g., 0 .

5) could appearin practice either due to a violation of Eq. (9) or too shorttime series. To give an analytic guess for possible statis-tical ﬂuctuations of time series-based estimates, we notethat the estimator (

N/l XY ) ˆ G trY → X (∆) for suﬃciently large N roughly follows χ distribution with l XY degrees of free-dom, so the amplitude of its deviations from the meanfor l XY = 1 equals 3 /N (the latter is the distance from0 . m = 4 τ , the diﬀer- s Y /s Y G Y ® X , G X ® Y C XY ,max s Y /s Y r Y ® X tr , max tr , max c) d) D / t G Y ® X ( D ) D / t G X ® Y ( D ) noise level s / s Y -4 -2 0 2 4 -4 -2 0 2 4 a) b) tr tr Fig. 2: Causality measures depending on observational noiselevel σ for the system (9) at h/τ = 0 . σ = 0, δ Y = 0, C X Y , = 0 .

1, and l X = l XY = l Y = l Y X = 1: (a,b) truncatedWG causalities versus time lag for diﬀerent noise levels; (c)maximal truncated WG causalities (blue and green) and max-imal CCF value (black) and (d) causality ratio versus σ /σ Y . ence δ ˆ G = ˆ G tr,maxY → X (∆) − ˆ G tr,maxX → Y (∆) for l XY = l Y X = 1ﬂuctuates with an amplitude of (3 /N ) √ · ≈ /N . De-note the expectation of this diﬀerence δG . Then, δ ˆ G andhence the estimator ˆ r Y → X = ˆ G tr,maxY → X / ˆ G tr,maxX → Y are slightlyaﬀected by statistical ﬂuctuations if the time series lengthis N (cid:29) /δG . Hence, for a typical δ ˆ G ≈ .

01 (as inthe following example) one should require N (cid:29) Causality estimates from palaeoclimate data. –

A key problem in Climate Sciences is to understandand evaluate relative contributions of diﬀerent factors toobserved global and regional climate variations over timescales on the order of decades and longer. The bestsources of such information from the pre-instrumental eraare palaeoclimate proxies from diﬀerent natural archives.One well-dated high-resolution reconstruction has beenextracted from the stalagmite YOK-I from Yok BalumCave (Southern Belize) [25]. The δ O record representslocal to regional hydroclimate variations in that Atlanticregion over the last two millennia with a mean temporalresolution of half a year and is characterized by very lowdating errors (up to 17 yrs for ages about 2000 yrs). Thistime series ( x signal) is examined here in parallel with thereconstruction of the total solar irradiance (TSI) based on Be measurements on ice cores [26] to extract informa-tion on a possible inﬂuence of solar activity ( y signal) onthe Belize climate over the last two millennia.The time series are presented in Figs. 3,a,b. The TSIdata (Fig. 3,b) have originally been processed to removethe 11-yr solar cycle [26] and sampled in steps of h = 5p-4oping with Dating Errors time, yr AD -5-4-3 x [ d O, permil VPDB] a) time, yr AD -1.2-0.8-0.400.40.8 y [TSI, W/m ] b) -2 -1 0 1 2 D /25 yr G Y ® X ( D ) , G X ® Y ( D ) tr tr -2 -1 0 1 2 D /25 yr p ( G Y ® X ) , p ( G X ® Y ) D /25 yr -0.400.40.8 C XX , C YY -4 -2 0 2 4 D /25 yr -0.2-0.100.10.20.3 C XY c)e) d)f) tr tr Fig. 3: Estimation from palaeoclimate data over the period [15yr BC - 2010 yr AD]: (a) time series of δ O from a speleothemrepresenting local climate (moisture) in the Atlantic region, redpoints denote the original data, blue line – smoothed signal; (b)time series of solar activity (total solar irradiance); (c) sampleACF for the signals x (blue) and y (green); (d) sample CCF;(e) truncated WG causalities in the directions TSI → Belizeclimate (blue) and Belize climate → TSI (green) for l X = 3, l XY = 1, l Y = 4, l Y X = 1; (f) the respective pointwise p -levels for the positivity of ˆ G trY → X (blue) and ˆ G trX → Y (green),black dashed lines show the pointwise p -levels correspondingto the total p -level of 0.05 and obtained via the Bonferronicorrection [12] with a pre-deﬁned order of tests. yrs. The original, nonequidistantly sampled YOK-I δ O values are shown as red dots in Fig. 3,a, the blue lineshows the Gaussian kernel-based ﬁltered [34] record (eﬃ-cient width of 5 yrs) sampled equidistantly in smaller stepsof 1 yr. The sample ACFs of both signals (Fig. 3,c) andtheir CCF (Fig. 3,d, ˆ C XY,max = 0 .

09) agree reasonablywell with the hypothesis of the relaxators (9) with τ ≈ N = 400: thesignal duration is 80 τ , the sampling interval is 0 . τ .To focus on the most statistically reliable results, weuse the model orders selected via the Schwarz criterionfor these data ( l X = 3 and l Y = 4), even though every-thing is similar for the unit orders. The WG causalityestimates diﬀer from zero at least at the level of 0.05:ˆ G tr,maxY → X = 0 .

014 and ˆ G tr,maxX → Y = 0 .

025 (Figs. 3,e,f). Sinceˆ G trY → X (∆) for the direction TSI → Belize climate is max-imal at negative time lag ∆ instead of an expected non-negative lag, a possible dating error can be assumed. Itis surprising that the causality ratio from TSI to Belize climate is ˆ r Y → X = 0 .

56, though we would expect muchgreater r Y → X > . r Y → X > . r Y → X . Causality estimates from short time series. –

Taking N = 400 and h/τ = 0 .

2, we generated an ensembleof 1000 time series by integrating Eqs. (9) with the Euler– Maruyama technique at time step of τ /

300 and impos-ing (or not) observational noise and dating errors. Fromeach time series, we estimated WG causalities and causal-ity ratio (for l X = 3, l Y = 4, l XY = l Y X = 1). Then wecalculated their mean values and probabilities to exceedthreshold values equal to the respective palaeoclimate es-timates [27]. The result is that for this data amount theeﬀect of statistical ﬂuctuations on the causality estimatesis considerably stronger than that of dating errors (thesecond place) and observational noise (the third place).Without observational noise and dating errors, we spec-ify k/α = 0 .

45 which gives CCF close to the palaeocli-mate estimate. For smaller k/α (e.g. ≤ .

3) the WGcausality estimates are insigniﬁcant according to the F -test, while for greater k/α (e.g. ≥ .

6) the CCF and WGcausalities estimates considerably exceed the respectivepalaeoclimate values. The estimation shows that typicallyˆ r y → x >

1. A less typical case of r y → x < p = 0 .

05 in more than 90% of the time series. Ap-pearance of the plots ˆ G trY → X (∆) and ˆ G trX → Y (∆) is similarto Figs. 3,e,f, except for the locations of the maxima [27]:statistically signiﬁcant ˆ G trY → X (∆) has a maximum nearzero, not at a negative lag. However, the half-time dat-ing error δ Y = 0 . τ = 20 yrs moves the maximum ofˆ G trY → X (∆) to negative lags of ∆ ≈ − δ Y which is observedin about 50% of the ensemble. Thus, the system (9) withdating errors is closer to our palaeoclimate example.The values of ˆ r Y → X depend on various factors [27]. Forzero observational noise and zero dating errors the meanof ˆ r Y → X is 1.2 which is already low enough as comparedto the theoretical r Y → X = 1 .

6, i.e., statistical ﬂuctuationsof the estimate already play the role of noise. The ratio r Y → X decreases very slightly under increasing noise in thedriving signal σ even up to a very large 100% level (atzero noise in the driven signal). The probability to observevalues of ˆ r Y → X ≤ .

56 rises with σ from 0.03 only up toto 0.05. The estimates of r Y → X appear more sensitive tothe dating error and their mean falls down to 1.1 alreadyfor moderate δ Y = − . τ and the probability of observingˆ r Y → X ≤ .

56 rises from 0.03 to 0.06 at δ Y = − . τ andeven to 0.08 at δ Y = − τ suggesting that the dating erroris more probable to be of importance here than the obser-vational noise. Overall, for a time series of the consideredmoderate length, statistical ﬂuctuations are more inﬂuen-tial than observational noise and dating errors: the formerp-5.A. Smirnov et aldecrease the causality ratio from 1.6 to 1.2, as comparedto the change of the order of 0.1 induced by the dating er-ror and 0.05 by observational noise. Thus, the time serieslength seems to be the main factor limiting the accuracyof the estimation for the palaeoclimate data at hand. Yet,as justiﬁed above, the relative importance of each factordepends on the time series length. In practice, it can bechecked ad hoc for a time series at hand as is done here.To develop a standard test for statistical signiﬁcance,we note that under the null hypothesis of uncoupled pro-cesses the estimator ˆ r Y → X resembles the ratio of two χ -distributed quantities with l XY and l Y X degrees of free-dom. Maximization of G tr (∆) over an interval of width2∆ m = 4 τ consisting of four independent segments cor-responds to maximization of χ -distributed quantity overfour independent trials. Numerical simulations show thatfor l XY = 1 such a maximization results in the distribu-tion which can be approximated by the χ law with twodegrees of freedom. Then, ˆ r Y → X is distributed accordingto Fisher’s F -law with (2 ,

2) degrees of freedom. However,quality of the approximation reduces for short time series,where Monte-Carlo based estimation seems more reliable.Additional tests with simulations of a non-equidistantsampling from (9) and a subsequent Gaussian kernel-basedﬁltering (all identical to the palaeoclimate case) show thatit slightly increases the likelihood of the causality esti-mates obtained from the palaeoclimate data. Still, evenin case of best correspondence, the system (9) exhibitscharacteristics similar to those in the palaeoclimate dataonly in 10% of all realizations. One reason for this limitedagreement between the data and the stationary randomprocess (9) can be temporal changes of some characteris-tics of the processes underlying the proxy records.

Nonstationarity of the palaeoclimate processes.–

We have accounted for a possible nonstationarityby moving window analysis of the palaeoclimate data.The main results are presented in Fig. 4 for two non-overlapping time windows corresponding to the two subse-quent millennia. Figs. 4,a,b (the ﬁrst millennium A.D.) re-veal a usual value of the causality ratio r Y → X = 1 . > -2 -1 0 1 2 D /25 yr G Y ® X ( D ) , G X ® Y ( D ) tr tr -2 -1 0 1 2 D /25 yr p ( G Y ® X ) , p ( G X ® Y ) a) b) -2 -1 0 1 2 D /25 yr G Y ® X ( D ) , G X ® Y ( D ) tr tr -2 -1 0 1 2 D /25 yr p ( G Y ® X ) , p ( G X ® Y ) c) d) tr trtr tr Fig. 4: Estimation from two non-overlapping 1000-yr intervalsof the palaeoclimate data: (a,b) [15 yr BC – 985 yr AD]; (c,d)[985 yr AD – 1985 yr AD]. Panels (a,c) show truncated WGcausalities in the directions TSI → Belize climate (blue) andBelize climate → TSI (green) for l X = 3, l XY = 1, l Y = 4, l Y X = 1. Panels (b,d) show pointwise p-levels for positivityof ˆ G trY → X (blue) and ˆ G trX → Y (green), black dashed lines showpointwise p -levels corresponding to the total p -level of 0.05. to the Sun). Such a lag may well be determined by datingerror of at least 20 yrs: Either the age of the solar signalis underestimated or the age of the cave signal is over-estimated. Importantly, the question about which signal(or both) has a larger dating error is not possible to an-swer on the basis of bivariate data. We therefore includethe best-dated ice-core based volcanic activity data [13] inour analysis (instead of the TSI data) to check whetherits inﬂuence on the Belize climate (which is expected andwell-accepted) is also characterized by a non-physical neg-ative temporal shift [27]. We have found highly statisti-cally signiﬁcant volcanic forcing on speleothem δ O varia-tions, the maximum of G trY → X (∆) being shifted to positive∆ = 2 or 3 yrs, i.e. ∆ ≈ h/

2, that agrees with the notionof volcanic forcing delayed by no more than 1 yr. Such asmall time delay is totally acceptable. Hence, the test withthe volcanic record shows that there is excellent correspon-dence between eruptions recorded in ice cores and YOK-Iwhich strongly supports the claim of highly accurate dat-ing of the speleothem. Therefore, we conclude that it isthe TSI record which is less accurately dated in the ﬁrstmillennium A.D., with a possible age underestimation ofabout 20 yrs.

Conclusions. –

Dating errors are an almost in-evitable characteristic of palaeoclimate time series whichmakes causality estimation even more diﬃcult. We haveproposed the causality ratio r Y → X based on WG causal-ity (1) as a relevant tool to cope with this problem. Wehave shown that the value of r Y → X > Y → X foridentical stochastic relaxators in the absence of observa-p-6oping with Dating Errorstional noise and dating errors, if the sampling is not toosparse. Only very large observational noise in the drivingsignal (more than 50% in terms of variance) along withthe noise-free driven signal makes r Y → X close to unity andunsuitable for coupling directionality identiﬁcation. Thecausality ratio is more sensitive to the dating error: if halfa time series is dated with an error about the relaxationtime τ or greater, r Y → X gets close to unity again. Hence,in case of a priori known coupling direction, the value of r Y → X allows to assess likely values of dating errors andobservational noise level. However, statistical ﬂuctuationsof the estimates from suﬃciently short time series may ex-ceed the inﬂuence of dating errors and observational noise.Applying the above results to analyze palaeoclimatedata, we conﬁrmed a strong inﬂuence of solar activity onthe Belize climate over the ﬁrst millennium A.D. and sug-gested that this inﬂuence strongly decreased in the secondmillennium. An unexpectedly low causality ratio appearsto be determined by the shortness of the time series and,probably, the dating error in the solar proxy over the ﬁrstmillennium A.D. of about 20 yrs, the age of the solar databeing underestimated. It seems to be an interesting andfruitful conclusion from an analysis of such a short pieceof data on the basis of the adapted causality analysis.The theoretical part of our research is based on the anal-ysis of a simple, but basic test system (9). Further studiesof the inﬂuence of dating errors and other factors on WGcausalities for more general systems are relevant, includ-ing non-identical processes, higher dimensionality of statespaces, and various kinds of nonlinearity. More “inertial”couplings can be analyzed with l XY > l XY diﬀerent temporal shifts rather than with a single ∆.All these features will possibly reveal more complicated re-lationships between the causality ratio and coupling direc-tionality which can then be taken into account, extendingthe range of applicability of the approach to all ﬁelds wheredating errors are encountered. Yet, the research presentedhere is valuable as the ﬁrst step which already reveals thatthe adapted WG causality analysis is a promising tool todeal with data corrupted by dating errors and extract in-formation about underlying causal couplings. ∗ ∗ ∗ The work is partially supported by the Governmentof Russian Federation (Agreement No. 14.Z50.31.0033with the Institute of Applied Physics RAS) and the Euro-pean Union’s Horizon 2020 Research and Innovation pro-gramme (Marie Sk(cid:32)lodowska-Curie grant agreement No.691037). The theoretical and numerical study of mathe-matical examples is done under the support of the RussianScience Foundation (grant No. 14-12-00291).

REFERENCES[1] E. Pereda, R. Quian Quiroga, and J. Bhattacharya, Progr.Neurobiol. (2005) 1. [2] M. Winterhalder, B. Schelter, and J. Timmer (eds.), Handbook of Time Series Analysis (Berlin: Wiley-VCH,2006).[3] K. Hlavackova-Schindler, M. Palus, M. Vejmelka, and J.Bhattacharya, Phys. Rep. (2007) 1.[4] B.P. Bezruchko and D.A. Smirnov,

Extracting knowledgefrom time series: An introduction to nonlinear empiricalmodeling (Berlin, Heidelberg: Springer-Verlag, 2010).[5] M. Wibral, B. Rahm, M. Rieder, et al, Prog. Biophys.Mol. Biol. , 80 (2011).[6] A. Attanasio, A. Pasini, and U. Triacca, Atmospheric andClimate Sciences (2013) 514.[7] C.L. Webber and N. Marwan (eds.), Recurrence Quantiﬁ-cation Analysis Theory and Best Practices (Berlin, Hei-delberg: Springer-Verlag, 2015).[8] A. Mueller, J.F. Kraemer, T. Penzel, H. Bonnemeier, J.Kurths, and N. Wessel, Physiol. Meas. (2016) R46R72.[9] N. Wiener, in E.F. Beckenbach (ed.) Modern Mathematicsfor the Engineer (New York: McGraw-Hill, 1956).[10] C.W.J. Granger, Econometrica (1969) 424.[11] J. Runge, J. Kurths, and V. Petoukhov, J. Climate (2014) 720.[12] T. Schreiber, Phys. Rev. Lett. (2000) 461.[13] N. Ay and D. Polani, Adv. Complex Syst. (2008) 17.[14] J.T. Lizier and M. Prokopenko, Eur. Phys. J. B (2010)605.[15] X. San Liang, Phys. Rev. E (2014) 052150.[16] A. Montalto, S. Stramaglia, L. Faes, G. Tessitore, R. Pre-vete, and D. Marinazzo, Neural Networks (2015) 159.[17] H. Nalatore, M. Ding, and G. Rangarajan, Phys. Rev. E (2007) 031123.[18] D.W. Hahs and S.D. Pethel, Phys. Rev. Lett. (2011)128701.[19] D.A. Smirnov and B.P. Bezruchko, Europhys. Lett. (2012) 10005.[20] D.A. Smirnov, Phys. Rev. E (2013) 042917.[21] D.A. Smirnov, Phys. Rev. E (2014) 062921.[22] D.A. Smirnov and I.I. Mokhov, Phys. Rev. E (2015)042138.[23] P.D. Jones, K.R. Briﬀa, T.J. Osborn, et al, The Holocene (2009) 3.[24] H.E. Ridley, Y. Asmerom, J.U.L. Baldini, et al, Nat.Geosci. (2015) 195.[25] D.J. Kennett, S.F.M. Breitenbach, V.V. Aquino et al, Sci-ence (2012) 788.[26] F. Steinhilber, J. Beer, and C. Froehlich, Geophys. Res.Lett. (2009) L19704.[27] Supplementary material on the web-site of EurophysicsLetters https://epletters.net/.[28] G. Schwarz, Ann. Stat. (1978) 461.[29] G.A.F. Seber, Linear Regression Analysis (New York: Wi-ley, 1977).[30] K. Hasselmann, Tellus (1976) 473.[31] I.I. Mokhov, D.A. Smirnov, P.I. Nakonechny, S.S. Ko-zlenko, Ye.P. Seleznev, and J. Kurths, Geophys. Res. Lett. (2011) L00F04.[32] M. Sigl, M. Winstrup, J.R. McConnell, et al. Nature (2015) 543.[33] E.L. Lehmann, Testing Statistical Hypotheses (New York:Springer, 1986).[34] K. Rehfeld, N. Marwan, S.F.M. Breitenbach, and J.Kurths, Clim. Dyn. (2013) 3. p-7.A. Smirnov et al p-8oping with Dating Errors Supplementary Material. –Deﬁnition of Wiener – Granger causality. –

Let ( X ( t ) , Y ( t )) be a bivariate random process with x n = X ( nh ), y n = Y ( nh ), n ∈ Z , h is sampling interval. Self-predictor of x n given by x indn = E [ x n | x n − , x n − , . . . ], where E [ ·|· ]stands for a conditional expectation, gives the least (over all self-predictors) mean-squared error σ x,ind = E [( x n − x indn ) ]. The joint predictor x jointn = E [ x n | x n − , y n − , x n − , y n − , . . . ] gives the error σ x,joint . Normalized predictionimprovement value G Y → X = ( σ x,ind − σ x,joint ) /σ x,ind is a measure of WG causality in the direction Y → X , originallycalled “causality strength” [1]. The idea was suggested in Ref. [2] and realized in Ref. [3] in application to stationaryGaussian process ( x n , y n ). The latter yields to a bivariate linear autoregressive (AR) equation x n = ∞ (cid:88) k =1 a x,k x n − k + ∞ (cid:88) k =1 b x,k y n − k + ξ n ,y n = ∞ (cid:88) k =1 a y,k y n − k + ∞ (cid:88) k =1 b y,k x n − k + ψ n , (3)where ( ξ n , ψ n ) is bivariate zero-mean Gaussian white noise with variances σ ξ , σ ψ and covariance E [ ξ n ψ n ] = γ .Whiteness assures that σ ξ = σ x,joint and σ ψ = σ y,joint [4]. Similarly, a process x n yields to a univariate ARdescription, i.e. the ﬁrst line of Eqs. (3) with all b x,k = 0 and white noise ξ (cid:48) n with variance σ ξ (cid:48) = σ x,ind . Now, G Y → X can be determined. Everything is similar for G X → Y . Estimation of WG causality. –

In order to estimate the theoretical values G y → x from a ﬁnite time series { x n , y n } Nn =1 , one truncates the inﬁnite sums in Eq. (3) at ﬁnite numbers of terms and ﬁts truncated univariate andbivariate AR models x n = l X (cid:88) k =1 ˜ a x,k x n − k + l XY (cid:88) k =1 ˜ b x,k y n − k + ˜ ξ n ,y n = l Y (cid:88) k =1 ˜ a y,k y n − k + l Y X (cid:88) k =1 ˜ b y,k x n − k + ˜ ψ n , (4)to the data via the ordinary least-squares technique, e.g. [4]. Formally speaking, one uses the predictors x indn,l X = E [ x n | x n − , x n − , . . . , x n − l X ] and x jointn,l X ,l XY = E [ x n | x n − , . . . , x n − l X , y n − , . . . , y n − l XY ]. Thereby, one gets truncatedWG causality measure G trY → X . The latter is often a good approximation of G Y → X already at quite small values ofthe AR orders l X and l XY .The value of l X is often (and, in particular, in study of the climate data here) selected via the Schwarz criterion [5].Namely, one minimizes the quantity N σ ξ + l X + 12 ln N , where ˆ σ ξ is the achieved mean-squared error of the one-stepAR model prediction. At any value of l XY , the pointwise statistical signiﬁcance level p ( l XY ) (probability of randomerror) of the conclusion “ G Y → X >

0” is checked via Fisher’s F -test [6]. The value of l XY can also be selected via theSchwarz criterion. Alternatively, it can be selected via minimization of the overall signiﬁcance level with the account ofBonferroni correction for multiple testing, i.e. via minimization of the value l XY p ( l XY ). In our palaeoclimate examplewe conﬁne ourselves with l XY = l Y X = 1 based on the Schwarz criterion. Thereby, we ﬁnally get an estimate ˆ G Y → X . Exact calculation of truncated WG causality. –

Denote R ( z ) = (cid:104) z · z T (cid:105) covariance matrix of a ran-dom vector z , angle brackets stand for expectation. Denote x l X n − = ( x n − , x n − , . . . , x n − l X ) T and y l XY n − =( y n − , y n − , . . . , y n − l XY ) T , where T stands for transposition. To compute G Y → X , one can use the covariance ma-trices R ( x n , x l X n − ), R ( x l X n − ), R ( x n , x l X n − , y l XY n − ), and R ( x l X n − , y l XY n − ) of the respective (conjugated) random vectors.These are square matrices of dimensions l X + 1, l X , l X + l XY + 1, and l X + l XY , respectively. According to Refs. [7–9],the truncated WG causality for stationary Gaussian processes x n and y n relates to the determinants of these matricesas G trY → X = 1 − | R ( x n , x l X n − , y l XY n − ) || R ( x l X n − , y l XY n − ) | / | R ( x n , x l X n − ) || R ( x l X n − ) | . (5)If l X = l XY = 1, the right-hand side of Eq. (5) involves only the correlations C XX ( h ), C XY (0) and C XY ( h ), wherecorrelation functions are deﬁned as C XX ( lh ) = (cid:104) x n x n − l (cid:105) / (cid:104) x n (cid:105) , C XY ( lh ) = (cid:104) x n y n − l (cid:105) / (cid:112) (cid:104) x n (cid:105)(cid:104) y n (cid:105) , where zero mean ofthe processes is taken into account and l is integer . p-9.A. Smirnov et alThe time-lagged WG causality G trY → X (∆) is deﬁned in full analogy with (5) where y l XY n − is replaced by y n − l where l is integer and ∆ = lh : G trY → X (∆) = 1 − | R ( x n , x l X n , y l XY n − l ) || R ( x l X n , y l XY n − l ) | / | R ( x n , x l X n ) || R ( x l X n ) | . (6)If l X = l XY = 1, the right-hand side of Eq. (6) involves only the correlations C XX ( h ), C XY (∆) and C XY (∆ − h ).For a model system speciﬁed by stochastic diﬀerential equations d z /dt = A · z + ξ, (7)where A is a constant matrix and ξ is white noise, all these covariance matrices can be found via standard solution oflinear diﬀerential equations for the second moments [10]: d (cid:104) z (0) · z ( − t ) T (cid:105) dt = A · (cid:104) z (0) · z ( − t ) T (cid:105) . (8) Model system and design of numerical study. –

To repeat the main text: As a model system, we consideridentical ﬁrst-order decay processes dX /dt = − αX ( t ) + kY ( t ) + ζ X ( t ) ,dY /dt = − αY ( t ) + ζ Y ( t ) , (9)where α determines the characteristic relaxation time τ = 1 /α , k is the coupling coeﬃcient, and ζ X and ζ Y areindependent zero-mean white noises with autocorrelation functions (cid:104) ζ X ( t ) ζ X ( t ) (cid:105) = (cid:104) ζ Y ( t ) ζ Y ( t ) (cid:105) = δ ( t − t ) where δ is Dirac’s delta. For the system (9) it appears possible to conﬁne ourselves with the orders l X = l XY = l Y = l Y X = 1.The quantity G trY → X (∆) at l X = l XY = 1 coincides exactly with squared partial cross-correlation [11]. Since thecovariance matrices are found explicitly for the system (9), we compute the time-lagged truncated WG causalitiesversus ∆ in the wide range [ − τ, τ ] at high resolution of 0 . τ to select their maxima. Thereby, the causality ratiois found at high precision. Causality ratio versus sampling rate and coupling strength. –

Figs. 5,a,b show G trY → X (∆) and G trX → Y (∆)for various sampling rates at moderate coupling strength corresponding to C XY, = 0 . C XY,max = 0 . h = 0 . τ , the maximum value of G tr (∆) in the “correct” direction Y → X is achieved at small positive∆ = h/ h as is possible when one of thesignals is available at such a smaller sampling interval (Fig. 1,a, black line), or at ∆ = 0, if ∆ is varied in steps of h (black circles). The maximum in the opposite direction X → Y is achieved at a “nonphysical” negative ∆ = − τ (Fig. 1,b) as a result of the interdependence between X ( t ) and Y ( t ) induced by the Y → X coupling. This pattern of D / t G Y ® X ( D ) D / t G X ® Y ( D ) h/ t tr -2 0 2 -4 -2 0 2 a) b) h / t C XY , max , G Y ® X , G X ® Y h / t r Y ® Xtr , max tr , max c) d) tr Fig. 5: Causality measures for the system (9) at C X Y , = 0 . l X = l XY = l Y = l Y X = 1: (a,b) truncated WG causalitiesversus time lag for diﬀerent sampling intervals; (c) maximal truncated WG causalities (blue and green) and maximum CCFvalue (black) and (d) causality ratio versus sampling interval. Dashed lines in (c) and (d) are for maximization over ∆ variedin steps of h , solid lines – for ∆ varied in smaller steps of 0 . τ . p-10oping with Dating Errors C X Y ,0 C XY ,max, G Y ® X , G X ® Y C X Y ,0 r Y ® X tr , max tr , max a) b) Fig. 6: Causality measures for the system (9) at ∆ t/τ = 0 . l X = l XY = l Y = l Y X = 1 versus squared zero-lag CCF atzero observational noise: (a) Maximal truncated WG causalities (blue and green) and maximal CCF value (black); (d) causalityratio. the maxima locations is characteristic of unidirectional coupling. The causality ratio is r Y → X ≈ . h , with r Y → X ≈ . h ≥ τ , the ratio r Y → X gets close to unity, since G trY → X (∆) and G trX → Y (∆)become almost independent of the conditioning variables x ( t − h ) and y ( t − h ) tending to the squared CCF (Fig. 5,a-c).For discrete ∆, the ratio r Y → X → τ , the causality ratio cannot reliablyreveal the coupling directionality.The situation is similar for any coupling strength. Fig. 6,a,b show dependencies of the causality characteristics on C X Y , at ∆ t/τ = 0 .

2. The causality ratio achieves its maximal value of ≈ C X Y , → . X is sustained entirely by the system Y and C X Y ,max = 0 .

86. The causality ratio remains almost constantand equal to ≈ . C X Y , from 0.05 to 0.3 (Fig. 6,b). This range corresponds to the maximalCCF ranging from the (notable) value of 0.27 to the (rather large) value of 0.66. In particular, this range includesthe most interesting for us moderate maximal CCFs about 0.3 – 0.4. Thus, if the sampling is not too sparse andcross-correlation is not too low, the causality ratio in the “correct” direction is considerably greater than unity (1.6and greater) which should allow one to conﬁdently infer coupling direction in practice from a suﬃciently long timeseries. s Y /s Y r Y ® X half-time datingerror d Y / t b) -1 0 1 2 dating error d Y / t r Y ® X time series portionwith dating error1/21/31/41/51/6 a) Fig. 7: Causality measures for the system (9) at ∆ t/τ = 0 . C X Y , = 0 . σ = 0 and diﬀerent half-time dating errors. p-11.A. Smirnov et al Causality ratio versus dating errors and observational noise. –

Fig. 7,a presents the causality ratio versusdating error for the situation when diﬀerent portion of the time series { y n } Nn =1 (from 1/2 to 1/6 of the entire series)is corrupted by the dating error. One can see that half-time dating error reduces the value of the causality ratio moststrongly (the solid curve). Smaller portions distorted by the uniform dating error lead to a weaker reduction of thecausality ratio (dashed curves). Larger uniform error-corrupted portions of 2/3, 3/4, 4/5, and 5/6 lead to the samecausality ratio reduction as the smaller complementary ones of 1/3, 1/4, 1/5, and 1/6, respectively (not shown in theplots). Indeed, if the entire series suﬀers from a uniform dating error, this does not inﬂuence the causality ratio sincemaximization over temporal shifts is involved in the deﬁnition of the latter.Fig. 7,b presents simultaneous inﬂuence of the half-time dating error and observational noise in the driving signal σ . One can see that their contributions to the reduction of the causality ratio r y → x can sum up: e.g. dating errorof 0 . τ reduces the causality ratio as compared to zero dating error approximately by 0.15 both for σ = 0 and σ = 0 . σ Y , while σ = 0 . σ Y reduces the causality ratio as compared to σ = 0 approximately by 0.3 both for zerodating error and δ Y = 0 . τ . However, for stronger errors of both kinds their eﬀects do not simply add: For datingerror of 0 . τ and greater, the causality ratio saturates at the level of unity and the noise does not reduce it any more(and in some range of the noise levels it even increases the causality ratio). Similar saturation of the causality ratiovalues exists for the noise of about σ = 0 . σ Y and greater. However, the latter is a huge noise level (about 80 % inroot-mean-squared amplitude), while the dating error of 0 . τ is quite realistic for palaeoclimate studies, including theexample considered in this work. Thus, the capability of the dating error to decrease r y → x seems to be stronger andmore robust. Still, we note that even the two factors together cannot make r y → x considerably less than unity, onlythe ranges of their values leading to r y → x ≈ -2 -1 0 1 2 D / t G Y ® X ( D ) , G X ® Y ( D ) tr tr -2 -1 0 1 2 D / t p ( G Y ® X ) , p ( G X ® Y ) -2 -1 0 1 2 D / t G Y ® X ( D ) , G X ® Y ( D ) tr tr -2 -1 0 1 2 D / t p ( G Y ® X ) , p ( G X ® Y ) -4 -2 0 2 4 D / t -0.400.40.8 C XX , C YY -0.200.20.4 C XY -4 -2 0 2 4 D / t -0.400.40.8 C XX , C YY -0.200.20.4 C XY a) b)c) d)e) f) tr tr tr tr Fig. 8: Two examples (left and right column, respectively) of correlation and causality estimates from time series of thesystem (9) versus time lag at ∆ t/τ = 0 . σ = 0, k = 0 . δ Y = 0: (a,b) ACF for the signals x (blue) and y (green) andCCF (black); (c,d) WG causality in the directions Y → X (blue) and X → Y (green); (e,f) F -test based signiﬁcance levelestimates (pointwise p-levels) for positivity of G Y → X (blue) and G x → y (green), black dashed lines show the pointwise p -levelcorresponding to the global p -level of 0.05 (Bonferroni correction [12] with a pre-deﬁned order of tests). p-12oping with Dating Errors Causality estimates from time series: Numerical simulations. –

As discussed in the main text, thecausality ratio is slightly aﬀected by the estimator ﬂuctuations for the estimated values of palaeoclimate predictionimprovements(of the order of 0.01), if the time series length is

N > N = 400 (the signal duration of 80 τ at sampling interval 0 . τ ) so that the role of statisticalﬂuctuations may well appear strong. Therefore, we performed numerical experiments with estimation of WG causalitiesand causality ratio from time series with the above parameters N = 400 and ∆ t/τ = 0 . α = 1 /

300 month − = 1 /

25 yr − . To generate the time series, we integrated the with Euler – Maruyama techniquewith time step of τ /

300 = 1 month and sampling interval of ∆ t = 60 months which is analogous to the paleoclimatedata below. An ensemble of 1000 time series was generated at each set of parameter values. Mean values of WGcausalities and causality ratio and probability of them to exceed the respective experimentally observed paleoclimateestimates are computed from each ensemble.Starting with the case of absent observational noise and dating errors, we specify k/α = 0 .

45, i.e. k = 0 . − which appears overall the most close to the observed paleoclimate data properties. In the selected case, weget mean value of the maximal sample CCF equal to 0 .

33 and the probability for it to exceed the paleoclimate valueof 0 .

29 equal to 0 .

67. Here, we present estimates for the truncated WG causality G trY → X for l X = 4 , l XY = 1 ratherthan for l X = l XY = 1 to be consistent with the paleoclimate example where the orders were selected via the Schwarzcriterion. However, numerical experiments show that the causality ratio estimates in these two cases are very close toeach other, in particular, their statistics (mean values and probabilities) diﬀer by no more than 1%. This is a furtherconﬁrmation that the above results for l X = l XY = 1 are correct for (and at least qualitatively agree with) those forhigher AR orders, in particular, for the Schwarz criterion-based orders l X , l XY .Fig. 8 presents two examples of estimates obtained from two diﬀerent time series of the system (9): left column isthe most typical case where the causality ratio r Y → X is greater than unity (namely about 1.5, Fig. 8,c), right columnis less typical case observed in less than for 10% of the time series in the ensemble where r Y → X < F -test with Bonferroni correction, which takes place for more than 90% of the time series in the ensemble. As forACF and CCF estimates they look quite similar for both cases (Fig. 8,a,b). The right column is quantitatively similarto the paleoclimate example except for the positions of the maxima in WG causality plots. For the correct direction,the maximum is located close to zero in contrast with the paleoclimate example where it is located at a negative lag.Sometimes, the maxima for the correct direction can appear at negative lags in this mathematical example as well,but these cases correspond to statistically insigniﬁcant WG causality estimates. -2 -1 0 1 2 D / t G Y ® X ( D ) , G X ® Y ( D ) tr tr -2 -1 0 1 2 D / t p ( G Y ® X ) , p ( G X ® Y ) D / t -0.400.40.8 C XX , C YY -4 -2 0 2 4 D / t -0.200.20.4 C XY a)c) b)d) tr tr Fig. 9: An example of correlation and causality estimates from a time series of the system (9) versus time lag at at ∆ t/τ = 0 . σ = 0, k = 0 . δ Y /τ = 0 .

8: (a) ACF for the signals x (blue) and y (green); (b) CCF; (c) WGcausality in the directions Y → X (blue) and X → Y (green); (d) pointwise p-levels for positivity of G Y → X (blue) and G X → Y (green), black dashed lines show the pointwise p -level corresponding to the global p -level of 0.05. p-13.A. Smirnov et al s Y /s Y G Y ® X , G X ® Y r Y ® X tr , max tr , max _ a) _ _ s Y /s Y G Y ® X >0.014]P[ G X ® Y >0.025] 0.950.960.97P[ r Y ® X > 0.56 ] tr ,max tr ,max b) half-time dating error d Y / t G Y ® X , G X ® Y r Y ® X tr , max tr , max _ c) _ _ half-time dating error d Y / t G Y ® X >0.014]P[ G X ® Y >0.025] 0.920.940.96P[ r Y ® X > 0.56 ] tr ,max tr ,max d) Fig. 10: Statistics of the causality estimates for the system (9) depending on observational noise level at zero dating error (a,b)and on half-time dating error at zero observational noise (c,d) over an ensemble of 1000 time series of the length N = 400 atsampling interval ∆ t/τ = 0 .

2. The left column shows mean values for the maximum truncated WG causalities (blue and green)and the causality ratio (black lines, right ordinate axes). The right column shows probabilities for these estimates to exceed thesame estimates obtained from the paleoclimate data.

The half-time dating error δ Y = − . τ = −

20 yr (Fig. 9) moves the plots for WG causality estimates along theabscissa axis. In particular, the maximum of the plot for the correct direction moves to the negative lags of ∆ ≈ δ Y (Fig. 9,c,d). This location of the maxima is similar to those for paleoclimate data and is observed in about 50% ofcases for the analyzed ensemble. Thus, we could say that the system (9) with half-time dating error exhibit someproperties close to those for paleoclimate data.To study a dependence of the estimated causalities on noise level and dating error, let us consider Fig. 10. Notethat the mean value of the estimate of the causality ratio r Y → X is already low enough already for zero noise sincestatistical ﬂuctuations play the role of noise and move the estimated causality ratio close to unity that the theoreticalvalue (1.2 as compared to 1.6, Fig. 10,a, black line). Fig. 10,a further shows that mean values of WG causalitiessomewhat decrease with the noise level, but the causality ratio decreases very slightly from 1.2 to 1.17 at the verylarge 100% noise. As for the probabilities to exceed the ﬁxed “paleoclimate” values, Fig. 10,b shows that they areconstant for WG causalities, but for the causality ratio the probability to observe such a low value as 0.56 rises from0.03 to 0.05 with the noise level. Overall, the causality ratio estimates appear weakly sensitive to observational noise,even though a very large noise makes the observed paleoclimate estimate somewhat more probable, prompting thatthe solar activity signal might be more noise-corrupted than the Atlantic climate proxy.As for the dating error, Figs. 10,c,d show that the causality ratio estimates are more sensitive to this quantity.Thus, r Y → X falls down to 1.1 already for moderate δ Y = − . τ and, more importantly, probability of observingso low causality ratio rises from 0.03 to 0.06 at δ Y = − . τ and even to 0.08 at δ Y = − τ making the observedpaleoclimate estimate of r Y → X even more probable, prompting that the dating error might well be present in theearlier parts of the paleoclimate data at hand.Overall, we must state that the contribution of statistical ﬂuctuations is much more important than impacts of thedating error and observational noise. The former circumstance decreases the causality ratio on average from 1.6 to 1.2,as compared to the average change of the order of 0.1 induced by the dating error and 0.05 by the observational noise.Thus, the time series length seems to be the main factor limiting the accuracy of estimation for the paleoclimate dataat hand. Estimation of causality between volcanic activity and Yok Balum speleothem-based data. –

We haveperformed an additional analysis with quite accurately dated recently published volcanic activity data [13] (Fig. 11).We have revealed that the volcanic activity inﬂuences the δ O variations with ∆ = 2 − h/ time, yr AD -5-4-3 x [ d O, permil VPDB] a) -2 -1 0 1 2 D /25 yr G Y ® X ( D ) tr -2 -1 0 1 2 D /25 yr p ( G Y ® X ) D /25 yr -0.400.40.8 C XX , C YY -4 -2 0 2 4 D /25 yr -0.100.10.2 C XY c)e) d)f) tr time, yr AD -40-30-20-100 y [volc forcing, W/m ] b) Fig. 11: Estimation from palaeoclimate data over the period [15 yr BC - 2010 yr AD]: (a) time series of δ O from aspeleothem representing local climate (moisture) in the Atlantic region, red points denote the original data, blue line – signalwhich is smoothed with a Gaussian kernel of the eﬀective width of 5 yrs; (b) an original proxy time series of volcanic activity(global); (c) sample ACF for the signals x (blue) and y smoothed with a Gaussian kernel of 5 yrs width (green); (d) sampleCCF; (e) truncated WG causalities in the directions volcanoes → Belize climate for l X = 3, l XY = 1; (f) the respective pointwise p -levels for the positivity of ˆ G trY → X , black dashed lines show the pointwise p -levels corresponding to the total p -level of 0.05. coupling and absent dating errors. The deviation is less than 1 yr. The obtained WG causality estimate is statisticallyhighly signiﬁcant and the observed small time lag is perfectly acceptable.Considering quite precise dating of the volcanic activity proxy, the above result is a strong argument in favor ofan accurate dating of the speleothem data as well. Since we have found a “non-physical” negative lag of total solarirradiance (TSI) variations behind the speleothem-based hydroclimate proxy, we suggest that it is the solar activitysignal which might be less accurately dated (with a possible 20 yrs age underestimation, i.e. the TSI record might bein sections too young) rather than the speleothem-based hydroclimate proxy. This notion is corroborated by previousstudies, e.g. Ref. [14] (Supplementary material, Figure caption) where the authors also found 22 yrs negative lag (theTSI impossibly following Asian monsoon) and concluded that to be acceptable and within errors. REFERENCES[1] C.W.J. Granger, Information and Control , 18 (1963).[2] N. Wiener, in E.F. Beckenbach (ed.) Modern Mathematics for the Engineer (New York: McGraw-Hill, 1956).[3] C.W.J. Granger, Econometrica (1969) 424.[4] G.E.P. Box and G.M. Jenkins, Time series analysis. Forecasting and control (San Francisco: Holden-Day, 1970).[5] G. Schwarz, Ann. Stat. , 461 (1978).[6] G.A.F. Seber, Linear Regression Analysis (New York: Wiley, 1977).[7] L. Barnett, A.B. Barrett, and A.K. Seth, Phys. Rev. Lett. (2009) 238701.[8] D.A. Smirnov, Phys. Rev. E , 042917 (2013).[9] D.W. Hahs and S.D. Pethel, Entropy 15, 767 (2013).[10] D.A. Smirnov, Phys. Rev. E , 062921 (2014).[11] Runge, J. Kurths, and V. Petoukhov, J. Climate , 720 (2014).[12] E.L. Lehmann, Testing Statistical Hypotheses (New York: Springer, 1986).[13] M. Sigl, M. Winstrup, J.R. McConnell1, et al. Nature , 543 (2015).[14] F. Steinhilber, J.A. Abreu, J. Beer, et al. Proceedings of the National Academy of Sciences , 5967 (2012)., 5967 (2012).