[PDF] The significance of trends in long-term correlated records

Abstract

We study the distribution P(x;α,L) of the relative trend x in long-term correlated records of length L that are characterized by a Hurst-exponent α between 0.5 and 1.5 obtained by DFA2. The relative trend x is the ratio between the strength of the trend Δ in the record measured by linear regression, and the standard deviation σ around the regression line. We consider L between 400 and 2200, which is the typical length scale of monthly local and annual reconstructed global climate records. Extending previous work by Lennartz and Bunde \cite{Lennartz2011} we show explicitely that P follows the student-t distribution P∝[1+(x/a ) 2 /l ] −(l+1)/2 , where the scaling parameter a depends on both L and α , while the effective length l depends, for α below 1.15, only on the record length L . From P we can derive an analytical expression for the trend significance S(x;α,L)= ∫ x −x P( x ′ ;α,L)d x ′ and the border lines of the 95% percent significance interval. We show that the results are nearly independent of the distribution of the data in the record, holding for Gaussian data as well as for highly skewed non-Gaussian data. For an application, we use our methodology to estimate the significance of Central West Antarctic warming.

Full PDF

aa r X i v : . [ phy s i c s . d a t a - a n ] N ov The signiﬁcance of trends in long-term correlated records

Araik Tamazian

Institut f¨ur Theoretische Physik, Justus-Liebig-Universit¨at Giessen, 35392 Giessen, Germany

Josef Ludescher and Armin Bunde

Institut f¨ur Theoretische Physik, Justus-Liebig-Universit¨at Giessen, D-35392 Giessen, Germany

We study the distribution P ( x ; α, L ) of the relative trend x in long-term correlated records oflength L that are characterized by a Hurst-exponent α between 0.5 and 1.5 obtained by DFA2. Therelative trend x is the ratio between the strength of the trend ∆ in the record measured by linearregression, and the standard deviation σ around the regression line. We consider L between 400and 2200, which is the typical length scale of monthly local and annual reconstructed global climaterecords. Extending previous work by Lennartz and Bunde [1] we show explicitely that P follows thestudent-t distribution P ∝ [1 + ( x/a ) /l ] − ( l +1) / , where the scaling parameter a depends on both L and α , while the eﬀective length l depends, for α below 1.15, only on the record length L . From P we can derive an analytical expression for the trend signiﬁcance S ( x ; α, L ) = R x − x P ( x ′ ; α, L ) dx ′ and the border lines of the 95% percent signiﬁcance interval. We show that the results are nearlyindependent of the distribution of the data in the record, holding for Gaussian data as well as forhighly skewed non-Gaussian data. For an application, we use our methodology to estimate thesigniﬁcance of Central West Antarctic warming.PACS numbers: 05.10.-a, 05.45.Tp, 92.70.Mn I. INTRODUCTION

The variability of a data set { y i } , i = 1 , . . . , L , de-pends on its (internal) correlation properties and can beinﬂuenced by external mechanisms. Prominent examplesare temperature records[1–28], river ﬂows [29–36], andsea level heights [37–39] that all show a strong naturallong-term persistence and, in addition, are eﬀected byanthropogenic inﬂuences that may lead to an additionaltrend.In long-term persistent records, small events have atendency to cluster in ”valleys” while large events tendto cluster in ”mountains”. Accordingly, long-term persis-tent records exhibit a pronounced valley-mountain struc-ture, where it is diﬃcult to distinguish a natural trend(starting in a valley and ending in a mountain) from asmall external deterministic trend. The problem of esti-mating the anthropogenic trend in temperature records,river ﬂows and sea level height is an important issue inhydroclimatology and is called the ”detection problem”[3, 40–42]. The central quantity here is the probabil-ity P ( x ; α, L ) that in a long-term persistent record oflength L , characterized by the Hurst exponent α , a rel-ative trend of strength x occurs; x is obtained from thestandard linear regression analysis of the record and rep-resents the height ∆ of the regression line divided by thestandard deviation of y i around the regression line (seeSection II). From P one obtains the signiﬁcance of thetrend as well as its error bars (denoting the 95% signiﬁ-cance interval)[1, 3, 22].In many previous attempts to solve this problem (see,e.g., [43–47, 49]) it had been tacitly assumed that thepersistence of the climate records can be modelled by anauto-regressive process of ﬁrst order (AR(1)) which al-lowed, in a simple and straightforward way, to estimate the signiﬁcance of the warming trend ∆ and its errorbars. In this case, the central quantity is the probability P ( x ; C (1) , L ) that in an AR(1) record of length L , char-acterized by the detrended lag-1 autocorrelation C (1),a relative trend of strength x occurs. It is well knownthat P ( x ; C (1) , L ) follows a student-t distribution (seeEq (6)), where the scaling paramer a and the eﬀectivelength l are functions of C (1) and L . This approach,however, which is conventional in climate science, canonly be considered as a crude approximation since thetemperature variability cannot be described by an AR(1)process where the autocorrelation function C ( s ) decaysexponentially with time lag s , but is described by a long-term persistent process where C ( s ) decays algebraicallywith s .In recent years, there have been several attempts tosolve the detection problem in long-term persistent data[1, 14, 16, 17, 19–22]. Using Monte Carlo simulationsand scaling arguments it was found empirically [1, 22]that for long-term persistent Gaussian data, P ( x ; α, L )can be approximated reasonably by a Gaussian for small x and by a simple exponential for large x values.Here we perform the same kind of calculations as in[1, 22], but with a considerably better statistics, and showthat the best approximation for P , in the whole x -regime ,is again the student-t distribution , where now the scalingparameter a and the eﬀective length l depend on α and L .While the previous result [1] represents a good approxi-mation in the respective x -windows, the present result ismore satisfying since it shows that the distributions of arelative trend in uncorrelated, short-term correlated andlong-term correlated data all follow the same equation,namely a student-t-distribution, but with diﬀerent scal-ing parameters a and eﬀective lengths l . Accordingly, theexceeding probability W and the signiﬁcance S of a trendis described, in all these diﬀerent systems, by the samehypergeometric function. In addition, we also study P for strongly skewed non-Gaussian data and ﬁnd that, toa very good approximation, P is the same for all consid-ered distributions. Finally, we apply our methodology tothe West-Antarctic temperature record at Byrd station.The paper is organized as follows: In Section II wedescribe how the exceedance probability W and the sig-niﬁcance S is related to P and which form these quanti-ties have for uncorrelated Gaussian noise and short-termcorrelated Gaussian data characterized by an AR(1) pro-cess. We also give a brief introduction into long-termpersistent data and their characterization. In SectionsIII we present our numerical results for the signiﬁcanceof a relative trend in long-term persistent Gaussian (Sec-tion III) and non-Gaussian data (Section IV). In SectionV we show how our approach can be applied to monthlytemperature records. As an example we take the Byrdrecord from West Antarctica that has very recently beenreconstructed [49]. In Section VI, ﬁnally, we summarizeour results. II. DETECTION OF EXTERNAL TRENDS

We consider a record { y i } , i = 1 , . . . , L and assume,without loss of generality, that the mean value ¯ y of thedata is zero. To estimate the increase or decrease of thedata values in the considered time window of length L ,one usually performes a regression analysis. From theregression line r i = bi + d , one obtains the magnitude ofthe trend ∆ y = c ( L −

1) as well as the ﬂuctuations aroundthe trend, characterized by the standard deviation σ =[(1 /L ) P Li =1 ( y i − r i ) ] / . The relevant quantity we areinterested in is the relative trend x = ∆ y/σ. (1)When a certain relative trend has been measured in adata set, the central question is, if this trend may be dueto the natural variability of the data set or not (”detec-tion problem”). To solve this problem, one needs to knowthe probability P ( x ; L ) dx that in model records with thesame persistence properties as the considered data set, arelative trend between x and x + dx occurs. The proba-bility density function P ( x ; L ) is symmetric in x . In thefollowing we consider x >

0. From P we derive the ex-ceedance probability W ( x ; L ) = R ∞ x P ( x ′ ; L ) dx ′ and thetrend signiﬁcance S ( x ; L ) = Z x − x P ( x ′ ; L ) dx ′ = 1 − W ( x ; L ) . (2)By deﬁnition, S is the probability that the relative trendin the record is between − x and x .If the signiﬁcance of a relative trend is above 0.95 (or95%), one usually assumes that the considered trend can-not be fully explained by the natural variability of therecord. The relation S ( x ; L ) = 0 .

95 deﬁnes the up-per and lower limits ± x of the 95% signiﬁcance inter- val (also called conﬁdence interval). By the above as-sumption, relative trends x between − x ( L ) and x ( L )can be regarded as natural. If x is above x , the part x − x cannot be explained by the natural variability ofthe record and thus can be regarded as minimum externalrelative trend, x minext = x − x . (3)On the other hand, the external trend cannot exceed x maxext = x + x , (4)which thus represents the maximum external relativetrend. By deﬁnition, x minext represents the lower marginof the observed relative trend that cannot be explainedby the natural variability alone, while x maxext ist the largestpossible external relative trend consistent with the nat-ural variability of the record. According to Eqs. (3) and(4), ± x ( L ) can be regarded as error bars for an exter-nal relative trend in a record of length L . A. White noise

For uncorrelated Gaussian data (white noise), it hasbeen assumed (see [43] and references therein) that theratio t b between the estimated trend slope b and its stan-dard error s b (see Eqs. (1-5) in [43]) follows a student-tdistribution. This assumption can be written as P ( x ; L ) = Γ( l ( L )+12 )Γ( l ( L )2 ) p πl ( L ) a (cid:18) x/a ) l ( L ) (cid:19) − l ( L )+12 , (5)with the degrees of freedom l ( L ) = L − a = √ L − √ L + 2 1 p l ( L ) . (7)Γ denotes the Γ-function. In the limit of large L , a tendsto a ∼ = √ / p l ( L ).From (5) and (2) one can obtain straightforwardly thesigniﬁcance trend S as a function of x/a and l ( L ), S ( x ; L ) = 2 xa Γ( ( l ( L ) + 1) √ πl Γ( l ( L )2 ) × F (cid:18) ,

12 ( l ( L ) + 1); 32 ; − ( x/a ) l ( L ) (cid:19) (8)where F is the hypergeometric function. B. Short-term correlations

The most basic model for short-term correlations indata sets is the autoregressive process of ﬁrst order(AR1), where the data satisfy the equation y i +1 = c y i + η i , i = 1 , , . . . , L − . (9)Here, the AR1 parameter c is between -1 and 1 and η i iswhite noise. For c < c > c = 0, they are whitenoise.For characterizing the persistence of a record, one oftenstudies the autocorrelation function C ( s ) = h y i y i + s i ≡ L − s ) P L − si =1 y i y i + s / ( L P Li =1 y i ). By deﬁnition, C (0) =1. It is easy to show that for AR1 processes, in thelimit of L → ∞ , C ( s ) decays exponentially, C ( s ) = c s ,i.e., c is identical to the lag-1 autocorrelation C (1). For c > C ( s ) can be written as C ( s ) = exp( − s/s x )) where s x = 1 / | ln c | denotes the persistence time.It has been shown [43] that for suﬃciently large L where C (1) = c , P has approximately the form of thestudent-t distribution Eq. (5), with l ( L ) = L − C (1)1 + C (1) − a = √ L − √ L + 2 1 p l ( L ) ∼ = √ p l ( L ) (11)Accordingly, the signiﬁcance S of the trend is describedby Eq. (8) with l ( L ) from (10) and a from (11). C. Long-term persistence

Long-term correlated records can be characterized bythe power spectral density S ( f ) = | y ( f ) | , where { y ( f ) } , f = 0 , , . . . , L/

2, is the Fourier transform of { y i } . Withincreasing frequency f , S ( f ) decays by a power law, S ( f ) ∼ f − β , (12)where β > β = 0. Records with 0 < β < C ( s ) thatdecays by a power law, C ( s ) ∼ (1 − γ ) s − γ , with γ = 1 − β .To generate long-term persistent data, one usually usesthe Fourier-ﬁltering technique based on (12), where longrecords of uncorrelated Gaussian data (typically of length L = 2 ) are transformed to Fourier space. The result ismultiplied by f − β/ and then transformed back to timespace. The resulting record is Gaussian distributed. Forobtaining records of the desired length L , one divides thelong record into segments of length L .Since both S ( f ) and C ( s ) exhibit large ﬁnite size ef-fects and are strongly inﬂuenced by external determin-istic trends, one usually does not use these methods to characterize the long-term persistence, but prefers meth-ods like the detrending ﬂuctuation analysis of 2nd order(DFA2) [48] where linear trends in the data are elimi-nated systematically.In DFA2 one measures the variability of a record bystudying the ﬂuctuations in segments of the record as afunction of the length s of the segments. Accordingly,one ﬁrst divides the record { y i } , i = 1 , , . . . , L , into non-overlapping windows ν of lengths s . Then one focuses, ineach segment ν , on the cumulated sum Y i of the data anddetermines the variance F ν ( s ) of the Y i around the bestpolynomial ﬁt of order 2. After averaging F ν ( s ) over allsegments ν and taking the square root, we arrive at thedesired ﬂuctuation function F ( s ). One can show that F ( s ) ∼ s α . (13)The exponent α can be associated with the Hurst expo-nent, and is related to the correlation exponent γ and thespectral exponent β by α = (1 + β ) / α = 1 − γ/ α = 1 /

2. The DFA2 techniquegives reliable results for time scales s between 10 and L/ F ( s ) approaches a power law,with α = 1 /

2, for s well above the persistence time s x .For s well below s x (which can only be the case for b veryclose to 1), α is close to 1.5. The diﬀerence in the func-tional form of F ( s ) allows to distinguish between short-term and long-term persistent processes.Recently, it has been shown [1] by Monte Carlo simula-tions that in long-term persistent data of length L , wherethe Hurst exponent α is determined by DFA2, the prob-ability density P ( x ; α, L ) of the relative trend x can bereasonably approximated by a Gaussian for small x andby a simple exponential for large x . Using scaling theory,an analytic expression for W ( x ; L ) has been obtained, asa function of α , in the two x -regimes. Here we followthe same route as in [1], but with a better statistics, andﬁnd that the best approximation for P , in the whole x regime, is the student-t-distribution Eq. (5), where thescaling parameter a and the eﬀective length l depend onboth α and L .Accordingly, the signiﬁcance of a relative trend in long-term persistent records is described by the same hyperge-ometric function as for white noise and AR1 noise, onlythe parameters l and a are diﬀerent. We also show thatthe results derived for Gaussian data hold, in an excel-lent approximation, also for data with a symmetric ex-ponential distribution D ( y ) = (1 / exp ( −| y | ) as well asfor strongly skewed distributions like the one-sided expo-nential and one-sided power-law distribution. α -3 -2 -1 H ( α ; α * , L ) α * -3 -2 -1 G ( α * ; α , L ) α * = 0.75 α = 0.75a) b) FIG. 1. (color online) (a) When long data sets with globalHurst exponents α ∗ (created by Fourier ﬁltering) are di-vided into sub-segments of length L , the local Hurst expo-nents α vary around α ∗ . The ﬁgure shows the distribu-tion H ( α ; α ∗ , L ) of the local Hurst exponents α (obtained byDFA2) in segments of lenght L = 400 (continuous red line)and L = 2200 (dashed green line), for ﬁxed global Hurst ex-ponent α ∗ = 0 .

75. (b) According to (a), the same local Hurstexponent α can originate from long data sets with diﬀerentglobal Hurst exponents α ∗ . The ﬁgure shows, for the localHurst exponent α = 0 .

75 in segments of length L = 400 (con-tinuous red line) and L = 2200 (dashed green line), the dis-tribution G ( α ∗ ; α, L ) of the involved global Hurst exponents α ∗ . III. SIGNIFICANCE OF TRENDS INLONG-TERM CORRELATED GAUSSIAN DATASETS

For determining P ( x ; α, L ) numerically, we follow [1].We use the Fourier-ﬁltering technique [2] to generate 800synthetic records of length 2 , for 241 global Hurst expo-nents α ∗ ranging from α ∗ = 0 . α ∗ = 2 .

5. We are in-terested in data sets with lengths L between 400 and 2200which correspond, in monthly temperature data sets, todata lengths between 33 and 183 years. Accordingly,we divided each data set into subsequences of lengths L = 400 , , , . . . , L , we used linear regression to determine (i) the localDFA2 Hurst exponent α as the slope of the regressionline in a double logarithmic presentation of the ﬂuctua-tion function F ( s ) between s = 10 and L/ x . We are interested in α values between0.5 and 1.5, which are most common in nature.It has been noticed before [1, 18, 22], that the localHurst exponents α obtained in each subrecord are notidentical to the global Hurst exponent α ∗ of the entirerecord, but vary around α ∗ . The distribution of the localHurst exponents α , for ﬁxed α ∗ = 0 .

75 and L = 400 and2200, is shown in Fig. 1a. As expected, the distributionnarrows with increasing subrecord length L . Accordingly,when in a subrecord a certain local Hurst exponent α is measured, the subrecord may be part of a long dataset with a diﬀerent global Hurst exponent α ∗ . Figure1b shows, for ﬁxed α = 0 .

75, the distribution of the α ∗ values. Again, the distribution narrows with increasing L . As a consequence, for determining the signiﬁcance of -3 -2 -1 - S ( x ; α , L ) L = 6000 2 410 -3 -2 -1 - S ( x ; α , L ) -3 -2 -1 - S ( x ; α , L ) α = 0.5, 0.6, FIG. 2. (color online) Signiﬁcance S ( x ; α, L ) of relative trends x occurring in long term correlated data of length L and Hurstexponent α . The data are Gaussian distributed. For clarity,we focus on 1 − S . (a) is for L = 600 and α = 0 . , . ., . α = 0 . , . , . α = 1 . , . , . , . , . L = 1200 and L = 1800, respectively. a relative trend in a long-term correlated record of length L , one cannot simply identify the local Hurst exponents α with the global one, but has to determine the localHurst exponents in each subrecord separately. As wewill show in the second part of this Section, ignoring thisfact will lead to a strongly enhanced signiﬁcance. Welike to note that a similar problems occurs in short-termpersistent records, where only in long data sets the lag-1autocorrelation function C (1) is equal to the persistenceparameter b . In subrecords (or short data sets) of length L , the values of C (1) ﬂuctuate around b , and Eqs. (10)and (11) are not valid [28].After having obtained in each subrecord k (of ﬁxedlength L ) the local Hurst exponents α k , we focus on thosesubrecords that have local α values between 0.49 and1.51. We divide the local α values into 51 windows oflength 0.02, such that in the ﬁrst window α = 0 . ± . α = 0 . ± .

01, and in the lastwindow α = 1 . ± .

01. Then we determine, in each α -window, the distribution P ( x ; α, L ) of the relative trendsas well as the trend signiﬁcance S .Figure 2 shows 1 − S ( x ; α, L ) = 2 W ( x ; α, L ) for three α -1 a ( α , L ) L = 400L = 1200L = 2200

FIG. 3. (color online) The characteristic relative trend a ( α, L )that characterizes the decay of 1 − S (see Fig. 2 and Eq. ( ?? ),as a function of the Hurst exponent α for three record lengths L = 400 , , α = 0 . representative data lengths L = 600, 1200, and 1800,which in monthly climate records correspond to 50, 100,and 150 years. The dots are our numerical results. Thecontinous lines result from a ﬁt of S to Eq. (8), withappropriately chosen values for the scaling parameter a and the eﬀective length l . The ﬁgure shows that over allthree decades of 1 − S considered here (where S rangesfrom 0 to 0.999 (or from 0 to 99.9 %)), the ﬁt is excellent.The parameters l and a are listed, for L between 400 and2200, in the Appendix.Table 1 in the Appendix shows that for ﬁxed recordlength L , the eﬀective length l is a constant in the mostrelevant range between α = 0 . L = 600, 1200, and 1800, l = 9.24, 12.05, and 13.69,respectively. For α above 1.1, l ( L ) decreases strongly.Figure 3 shows that the scaling parameter a listed inTable 2 in the Appendix, can be approximated, for α between 0.5 and 1.2, by a power law, where the slopeincreases with increasing L . Above α = 1 . a showsonly a very weak L dependence.From Fig. 2, by intersecting 1 − S with the constant5 × − , we obtain immediately the relative trend x that is conventionally used to estimate the error barsof a measured relative trend. Figure 4 shows x for L = 600, 1200 and 1800. From the ﬁgure, one can imme-diately read oﬀ the error bars of a relative temperaturetrend in monthly records of length 50, 100, and 150ywith DFA2 exponent α , and specify its lower and upperbounds x minext = x − x and x maxext = x + x .Finally, at the end of this Section, let us go back toits beginning and ask the following question: Given along record of length L described by the global Hurstexponent α ∗ , which is divided into subrecords of length L ≪ L . What is the signiﬁcance ˜ S ( x ; α ∗ , L ) of a relativetrend x in these subrecords? We expect that for large L where all local Hurst exponents α are very close to α ∗ ,˜ S ( x, α ∗ , L ) and S ( x, α, L ) will coincide. For small L weexpect that ˜ S ( x, α ∗ , L ) overestimates the signiﬁcance.The diﬀerence between S ( x, α, L ) and ˜ S ( x, α ∗ , L ) may α -1 x ( α ) L = 600L = 1200L = 1800

FIG. 4. (color online) The relative trend x correspondingto the 95% signiﬁcance level as a function of the Hurst ex-ponent α for the segment lengths L = 600 , x is obtained from theintersection of 1 − S in Fig. 2 with 1-0.95=0.05. also be regarded as follows: If we know a priori that theconsidered data set is characterized by a certain Hurstexponent (which is the case, for example, when we con-sider records consisting of Gaussian distributed indepen-dent random numbers or their cumulated sum), then˜ S ( x, α ∗ , L ) gives the proper signiﬁcance. If we do notknow the characteristics of the data set a priori, theuncertainty is increased and we have to determine ex-plicitely its Hurst exponent, S ( x, α, L ) gives the propersigniﬁcance.Figure 5 shows ˜ S ( x, α ∗ , L ) for the global Hurst expo-nents α ∗ = 0 .

5, 0.75, 1, and 1.25 in subrecords of lengths L = 400 , , and 2200. The ﬁgure shows also the sig-niﬁcance S ( x, α, L ), for α = 0 .

5, 0.75, 1, and 1.25. For α ∗ = 0 .

5, ˜ S ( x, α ∗ , L ) follows Eq. (8) with (6) and (7). Wefound that also for α = 0 .

75 and 1, ˜ S ( x, α ∗ , L ) is well de-scribed by the student-t distribution (8) with l ( L ) = L − α = 0 .

75, the a values are 0.512, 0.383, and 0.328for L = 400, 1200 and 2200, respectively. For α = 1, therespective a values are 1.313, 1.185, and 1.135. As ex-pected, ˜ S ( x, α ∗ , L ) overestimates the signiﬁcance of anobserved relative trend and thus underestimates the er-ror bars ± x of a relative trend. For example, when in amonthly temperature record of length 600 (correspondingto 50 years) characterized by α = 0 .

75 a relative trend x = 1 is measured, the proper signiﬁcance of this trendis S = 0 .

94, i.e. the trend is not signiﬁcant. However,if falsely ˜ S is used for estimating the signiﬁcance, oneoverestimates the signiﬁcance, since ˜ S = 0 .

975 in thiscase.

IV. SIGNIFICANCE OF TRENDS INLONG-TERM CORRELATED NON-GAUSSIANDATA SETS

By using the Fourier-ﬁltering technique we generatedlong-term correlated Gaussian data { y i } . Many natu-ral records, e.g., monthly temperature anomalies where -3 -2 -1 - S ( x ; α ∗ , L ) L = 400 0.1 0.2 0.3 0.4L = 1200 0.1 0.2 0.3L = 22000 1 2 310 -3 -2 -1 - S ( x ; α ∗ , L ) -3 -2 -1 - S ( x ; α ∗ , L ) -3 -2 -1 - S ( x ; α ∗ , L ) α ∗ = 0.50.751.01.25 α = 0.5 ~~~~ FIG. 5. (color online) Signiﬁcance ˜ S ( x, α ∗ , L ) of relativetrends x occurring in segments of length N of long data sets(length 2 ) with ﬁxed global Hurst exponent α ∗ (continousline). The data follow Gaussian distributions. For clarity, wefocus on 1 − S , as in Fig. 2. We compare ˜ S ( x, α ∗ , L ) with thesigniﬁcance S ( x, α, L ) of relative trends x occuring in datasets of the same length N with ﬁxed local Hurst exponent α (dashed red line) that was shown in Fig. 2. The consid-ered segment lengths are L = 400 , , and 2200, shown incolumns 1 −

3. The Hurst exponents α ∗ = α are 0.5, 0.75,1.0, and 1 .

25, shown in rows 1 − the seasonal trend has been removed, are Gaussian dis-tributed. But others, like river run-oﬀ data, have a quiteskewed distribution and cannot be characterized by aGaussian [35 ? , 36]. Accordingly, the question arises,to which extend our results derived in the previous sub-section are general and apply also to non-Gaussian dis-tributions.To answer this question, we have considered threekinds of non-Gaussian distributions: (i) the symmet-ric exponential distribution D ( y ) = (1 /

2) exp( −| y | ), (ii)the (highly skewed) exponential distribution D ( y ) =exp( − y ) , y ≥

0, and (iii) the (highly skewed) power lawdistribution D ( y ) = y − , y ≥

0. To generate these dis-tributions, we have ﬁrst generated long-term correlateddata { y i } of length 2 that are Gaussian distributed, asabove. Then we generated 2 data of the considerednon-Gaussian distribution and exchanged the long-term α -3 -2 -1 H ( α ; α * , L ) αα * = 1.5 a) b) FIG. 6. (color online) Distribution H ( α ; α ∗ , L ) of local Hurstexponents α (obtained by DFA2) in segments of lenght L =400 and L = 2200, for ﬁxed global Hurst exponents α ∗ = 1 . H ( α ; α ∗ , L ) has been obtained for data following a Gaus-sian distribution (open circles), an exponential distribution(full circles), a symmetric exponential distribution (triangleup), and a power law distribution (triangle down). The non-Gaussian data have been generated by the exchange tech-nique. correlated Gaussian data rankwise by the non-Gaussiandata.By this simple exchange technique we obtain long-termcorrelated data following the considered non-Gaussiandistribution, but the global Hurst exponent as well asthe local ones usually diﬀer slightly from the original one.These slight deviations do not play a role here, since weconsider α ∗ values between 0.1 and 1.9 and only the local α values measured by DFA2 are essential in our analy-sis. If one needs to obtain data with exactly the same α ∗ value as the Gaussian data, one has to use the it-erative Schreiber-Schmitz procedure [51], where in eachiteration the data are (1) Fourier-transformed to f space.Then (2) the Fourier-transformed data are exchangedby the Fourier-transform of the original Gaussian dataand (3) Fourier-transformed back to time space. Finally,(4) these data are exchanged rankwise by the desirednon-Gaussian distribution. By comparing the simple ex-change method with the Schreiber-Schmitz procedure wefound that both methods yield, for the same global Hurstexponent α ∗ ≤ .

5, the same distribution H ( α ; α ∗ , L ) oflocal Hurst exponents α as the Gaussian records.Figure 6 shows, for α ∗ = 1 . L = 400 and 2200, thedistributions H ( α ; α ∗ , L ) of local Hurst exponents α forboth Gaussian and non-Gaussian data. Above α ∗ = 1 . α ∗ , the output α ∗ was al-ways close to 1.5, and the distribution of the α valuesbecame much broader than for the Gaussian data. Ac-cordingly, our analysis of non-Gaussian data is limited tothose α values that are typically absent in long recordswith α ∗ above 1.5. We found that this is the case forsubrecords of length L = 400, 1200, and 2200, for α be-low α c ( L ) = 1 .

15, 1.31, and 1.34, respectively, where thefraction of subrecords originating from long records with -3 -2 -1 - S ( x ; α , L ) L = 400 0.2 0.41200 0.1 0.2 0.322000 1 2 310 -3 -2 -1 - S ( x ; α , L ) -3 -2 -1 - S ( x ; α , L ) -3 -2 -1 - S ( x ; α , L ) α = 0.50.751.01.25 FIG. 7. (color online) Dependence of the signiﬁcance S ( x, α, L ) of the relative trend x on the distribution of thelong-term persistent data. Apart from Gaussian data (con-tinous line, shown also in Fig. 2), we consider data withasymmetric exponential distribution (full circle), symmetricexponential distribution (triangle up), and power law distri-bution (triangle down). For convenience, we show 1 − S asin Fig. 2. The record lenghts are L = 400, 1200, and 2200(columns 1 − α = 0 . , . , − α ∗ above 1.5, is below 10 − . Accordingly, our analysisholds only for local Hurst exponents α below α c ( L ).Figure 7 shows the signiﬁcance of the relative trend x for the 3 non-Gaussian distributions considered here, for L = 400, 1200, and 2200. The continous line (diﬃcult tosee) is the result for the Gaussian data. The ﬁgure showsthat for α below 1 (rows 1 − − S of the Gaussian distribution appears to be alower bound for 1 − S of the skewed distributions, thismeans that the signiﬁcance of a trend in long-term cor-related Gaussian data represents an upper bound for thesigniﬁcance.Accordingly, the value obtained for x in Gaussiandistributed data represents a lower bound, but the diﬀer-ences between the diﬀerent distributions are very small, the largest deviations occur for α = 0 .

75 and L = 2200where x for the skewed power-law distribution exceeds x for the Gaussian distribution by less than 5 percent.For α = 1, x is the same for all distributions. For α = 1 .

25 (last row) which is below α c for L = 1200 and2200, the agreement between Gaussian and non-Gaussiandata is perfect for L = 1200 and 2200. For L = 400,where α is above α c and thus should not be considered,Gaussian and non-Gaussian data still collapse for 1 − S above 10 − . The shoulder below 10 − is an artifact whichoriginates from records where the original α ∗ value wasabove 1.5. V. EXAMPLE: THE ANTARCTIC BYRDRECORD

It is straightforward to apply our methodology to ob-servational data. Important applications are climate data(e.g., river ﬂows, precipitation, and temperature data)where one likes to know the signiﬁcance of trends dueto anthropogenic climate change. When considering cli-mate data, it is important to use monthly data whereadditional short-term dependencies have been averagedout and seasonal trends can be better eliminated than indaily data [1].For convenience we consider temperature data. Theseasonal trend elimination is done in 2 steps [35, 36]. Inthe ﬁrst step, we substract the monthly seasonal trend toobtain the temperature anomalies ˜ y i , i = 1 , . . . , L . Sincethe variance of the temperature anomalies may dependon the season, we divide in the second step the tempera-ture anomalies by the seasonal standard deviation. Theresulting dimensionless record y i has unit variance andzero mean.Next we perform the regression analysis for the { y i } which yields ∆ and σ and thus the relative trend x . Thenwe employ DFA2 and obtain the Hurst exponent α . From x and α we can estimate the signiﬁcance S of the temper-ature trend from Tables I and II as well as the boundary ± x of the 95% signiﬁcance interval.Since we divided the temperature anomalies by theseasonal standard deviation to obtain { y i } , ∆ and σ aswell as the error bars ± ∆ = x σ are dimensionless.To obtain the real trend ∆ real and its real error bars ± ∆ real in units of o C , we perform a regression analy-sis of the temperature anomalies ˜ y i , i = 1 , . . . , L . To ob-tain the error bars ∆ real95 we use the identity (see [1])∆ real95 / ∆ real = ∆ / ∆, which then yields∆ real95 = ∆ ∆ real ∆ . The resulting minimum and maximum external temper-ature trends are ∆ real ± ∆ real95 .To show explicitely how our approach can be used toestimate the signiﬁcance of a warming trend we considerthe monthly (corrected) Byrd record between 1957 and2013 that was recently reconstructed by Bromwich et y i F ( s ) [ a r b . un it s ] α = 0.65 a) b) FIG. 8. (color online) (a) Monthly fully seasonally detrendedrecord y i at the Byrd station between 1957 and 2013 (blacklines) and the corresponding linear regression (red line). (b)The DFA2 ﬂuctuation function F ( s ) of the monthly Byrdrecord y i (black circles). al. [50].(An earlier version [46] of the Byrd record hasbeen discussed in [28]). The Byrd station is located inthe center of West Antarctica which is one of the fastestwarming places on Earth. The regression analysis yields∆ real = 2 . o C . It is obvious that the question of the sig-niﬁcance of Antarctic warming is highly relevant, sincethe warming trend inﬂuences the melting of the WestAntarctic Ice Shielf and thus contributes to future sealevel rise.Figure 8a shows the fully seasonally detrended Byrdrecord y i , i = 1 , . . . , .

69 and σ = 1 . x = 0 . y i (full circles). In the double logarithmic plot, the DFA2ﬂuctuation function F ( s ) follows a straight line with ex-ponent α = 0 .

65 between s = 10 and L/

4. Accordingly,the data are long-term persistent and our methodologyapplies. The exponent α = 0 .

65 is in agreement withearlier estimates [28, 50]. We like to note that the tem-perature anomalies ˜ y i yield to the same Hurst exponentshowing explicitely that the second step in the seasonaldetrending has no inﬂuence on the persistence properties.For obtaining the degrees of freedom l and the scalingfactor a for α = 0 .

65 and L = 684, we consider the 4thline in Table 1 and 2 and use the respective values for L = 500 , ,

700 and 800 for a cubic interpolation. Thisgives l = 9 .

78 and a = 0 . S = 0 .

953 and x = 0 .

64. Accordingly,the signiﬁcance of the warming trend at the Byrd stationis 95.3%. The minimum external trend is 0 . o C , whilethe maximum external trend is 3 . o C .When Bromwich et al determined these quantities,they used the conventional hypothesis that the annual linearly detrended temperature data follow an AR(1)process and that S can be obtained from Eqs. (9), (11)and (12) (even though the length of the annual data is too small to make (11) and (12) applicable). For theannual detrended data, they obtained C (1) = 0 .

075 andthus l = 49 .

05 and a = 0 . S = 0 .

999 and x = 0 . . o C , while the maximumexternal trend is 3 . o C .Accordingly, the signiﬁcance of the warming trend aswell as the minimum external trend has been stronglyoverestimated by Bromwich et al, while the uncertainty2 x has been underestimated. VI. CONCLUSIONS

In summary, we have studied by extensive Monte-Carlosimulations the distribution P ( x ; α, L ) of linear trends inlong-term correlated records of length L that are charac-terized by a Hurst exponent α between 0.5 and 1.5 (de-termined by DFA2). The Hurst exponent was obtainedby linear regression from the slope of the regression linein a double logarithmic representation of the DFA2 ﬂuc-tuation function F ( s ) between s = 10 and L/

4. We haveconsidered record lengths L between 400 and 2200, whichcorresponds, in monthly climate records, to time scalesbetween 33.3 and 183.3 years. In each record we havedetermined by linear regression analysis the increase ∆and the standard deviation σ of the data around the re-gression line; the ratio x = ∆ /σ is the relative trend.We have extended the earlier analysis [1] in three im-portant directions:(i) We found that P ( x ; α, L ) follows, in the whole x -range, the student-t distribution with two ﬁt parameters,the scaling parameter a and the eﬀective length l . Thisgeneralizes nicely the known results for white noise andAR(1) processes, where P also follows a student-t dis-tribution, but with diﬀerent a and l values. For Hurstexponents between 0.5 and 1.1, l depends only on therecord length L , and not on the Hurst exponent α . Inthe previous work [1], the distribution was approximatedby a Gaussian at small x and an exponential at large x values, which allowed to determine easily the signiﬁcanceof large relative trends.(ii) In [1], only Hurst exponents up to 1.1 could betreated analytically. Here we extend the analytical anal-ysis to α = 1 . x are more pronounced. We alsoconsidered slightly smaller and larger record lengths.(iii) In [1], only Gaussian data were considered. Herewe have shown explicitely that the results are stable anddo hold also for very diﬀerent, highly skewed distribu-tions.For applying our methodology to observable data onemust be sure that there are no additional short termcorrelations on short time scales. It is known that intemperature anomalies, there are additional short termcorrelation on time scales up to 10 days. For river ﬂows,the short term persistence may range up to one month.These short term correlations can be eliminated by aver-aging the data over short time windows that are largerthan the persistence time, i.e. by considering monthlytemperature anomalies and quarter annual river ﬂows.The DFA2 analysis then has to be performed on these av-eraged records and the actual data length L is the lengthof the averaged data set.In long-term persistent processes it is enough to de-termine α by DFA2 and use this value to determine a and l . In most previous evaluations of the signiﬁcanceof trends in climate science the signiﬁcance of trendshas been determined from the value of C (1) in annualrecords, assuming that the signiﬁcance of trends in anAR(1) record may be a good approximation for the signif-icance of trends in long-term correlated climate records.Our results show that one does not need to rely on thiscrude approximation (which usually strongly exaggeratesthe signiﬁcance) since the estimation of the signiﬁcanceof trends in long-term correlated records is not more dif-ﬁcult.Acknowledgements: We like to thank the DeutscheForschungsgemeinschaft and the Ministry of Educationand Science of the Russian Federation for ﬁnancial sup-port. VII. APPENDIX

We have shown in this article that the probability P ( x ; α, L ) that in a long-term persistent record of length L , characterized by the Hurst exponent α , a relativetrend of strength x occurs, has the form of a student-t distribution, P ( x ; α, L ) = Γ( l +12 )Γ( l ) √ πla (cid:18) x/a ) l (cid:19) − l +12 , with the eﬀective length l and the scaling parameter a .The related trend signiﬁcance is S ( x ; α, L ) = 2 xa Γ( ( l + 1) √ πl Γ( l ) F (cid:18) ,

12 ( l + 1); 32 ; − ( x/a ) l (cid:19) . Tables I and II list the eﬀective lengths l and the scalingfactor a as function of the DFA2 Hurst exponent α andthe record length l . [1] S. Lennartz and A. Bunde, Phys. Rev. E., , 021129(2011).[2] B. B. Mandelbrot, Gaussian Self-Aﬃnity and Fractals (Springer, New York, Berlin, Heidelberg, 2001)[3] P. Bloomﬁeld, and D. Nychka, Climatic Change, , 275(1992).[4] J. D. Pelletier and D. L. Turcotte , J. Hydrology, ,198 (1997).[5] E. Koscielny-Bunde et al., Phys. Rev. Lett., , 729(1998).[6] B. D. Malamud and D. L. Turcotte, Advances in Geo-physics, , 1 (1999).[7] P. Talkner and R.O. Weber, Phys. Rev. E, , 150,DOI:10.1103/PhysRevE.62.150 (2000).[8] R.O. Weber and P. Talkner, J. Geophys. Res. , 20131(2001).[9] R.A. Monetti, S. Havlin, and A. Bunde, Physica A ,581 (2003).[10] J. Eichner, E. Koscielny-Bunde, A. Bunde, S. Havlin, andH. J. Schellnhuber, Phys. Rev. E, , 046133 (2003).[11] K. Fraedrich and R. Blender, Phys. Rev. Lett. , 108501(2003).[12] R. Blender and K. Fraedrich, Geophys. Res. Lett., ,1769 (2003).[13] L. A. Gil-Alana, J. Climate, , 5357 (2005).[14] T.A. Cohn and H.F. Lins, Geophys. Res. Lett., ,L23402 (2005).[15] A. Kir´aly, I. Bartos, and I.M. J´anosi, Tellus, , 5, 593,(2006).[16] D. Rybski, A. Bunde, S. Havlin and H. von Storch, Geo-phys. Res. Lett. , L06718 (2006). [17] E. Giese, I. Mossig, D. Rybski, and A. Bunde, Erdkunde , 186 (2007).[18] D. A. Rybski, A. Bunde, and H. v. Storch, J. Geophys.Res. Atmospheres, , D02106 (2008).[19] E. Zorita, T.F. Stocker, and H. v. Storch, Geophys. Res.Lett. , L24706 (2008).[20] D. Rybski and A. Bunde, Physica A , 1687 (2009).[21] J.M. Halley, Physica A , 2492 (2009).[22] S. Lennartz, and A. Bunde, Geophys. Res. Lett., ,L16706 (2009).[23] S. Fatichi, S. M. Barbosa, E. Caporali, and M. E. Silva,J. Geophys. Res., , D18121 (2009).[24] C. Franzke, J. Climate, , 6074 (2010).[25] C. Franzke, C., J. Climate, , 4172 (2012).[26] S. Lovejoy and D. Schertzer, in: Extreme events and nat-ural hazards: the complexity perspective, A.S. Sharma,A. Bunde, D. Baker, and V.P. Dimri (eds), AGU mono-graphs, 231 (2012).[27] C. Franzke, Geophys. Res. Lett., , 1391 (2013).[28] A. Bunde, J. Ludescher, C. Franzke, and U. B¨untgen,Nature Geoscience, , 246 (2014).[29] H. E. Hurst, Transactions of the American Society of civilengineers, , 770 (1951).[30] B. B. Mandelbrot and J. R. Wallis, Water Resources Re-search, , 5, 909 (1968).[31] A. Montanari, R. Rosso, and M. S. Taqqu, Water Re-sources Research, , 5, 1249 (2000).[32] A. Montanari, Theory and Application of Long-rangeDependence, P. Doukhan, G. Oppenheim, M. S. Taqqu,Eds., 461-472 (2003).[33] D. Koutsoyiannis, Hydrological Sciences J. , 3 (2003).[34] D. Koutsoyiannis, J. Hydrology , 25 (2006). α \ L L400 L500 L600 L700 L800 L900 L1000 L1200 L1400 L1600 L1800 L2000 L22000.50 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.55 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.60 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.65 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.70 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.75 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.80 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.85 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.90 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.95 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.501.00 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.501.05 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.501.10 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.501.15 7.06 7.69 8.86 9.32 9.63 10.51 11.08 11.76 13.12 12.50 13.07 14.24 14.441.20 6.95 7.50 8.27 9.54 9.21 9.63 10.28 11.14 11.91 11.92 12.48 13.13 14.261.25 6.41 7.32 7.91 8.74 8.61 9.27 9.85 10.55 11.21 11.34 11.59 12.44 12.941.30 6.25 6.68 7.60 8.11 8.28 8.55 9.28 9.90 10.21 10.36 10.74 11.26 11.031.35 5.93 6.31 7.29 7.68 7.83 8.21 8.74 9.12 9.88 9.88 10.26 10.40 10.401.40 5.47 5.80 6.63 7.21 7.44 7.55 7.93 8.49 8.94 8.78 9.02 9.48 9.741.45 5.19 5.54 6.33 6.83 6.83 7.14 7.58 7.82 8.62 8.18 8.43 8.80 9.011.50 4.77 5.10 5.88 6.45 6.33 6.76 6.88 7.42 7.48 7.55 7.76 8.25 8.41TABLE I. Eﬀective length l ( α, L ) in in the trend distribution P and the trend signiﬁcance S in long term correlated data withHurst exponent α and record length L . α \ L L400 L500 L600 L700 L800 L900 L1000 L1200 L1400 L1600 L1800 L2000 L22000.50 0.174 0.162 0.133 0.124 0.115 0.107 0.104 0.092 0.086 0.081 0.076 0.072 0.0680.55 0.227 0.212 0.177 0.165 0.154 0.145 0.140 0.126 0.117 0.111 0.105 0.100 0.0960.60 0.292 0.275 0.232 0.218 0.205 0.193 0.188 0.171 0.160 0.153 0.145 0.139 0.1330.65 0.371 0.352 0.300 0.284 0.268 0.256 0.249 0.227 0.215 0.206 0.196 0.190 0.1820.70 0.465 0.447 0.385 0.368 0.348 0.332 0.325 0.301 0.286 0.275 0.265 0.256 0.2470.75 0.577 0.556 0.486 0.466 0.446 0.428 0.420 0.392 0.376 0.364 0.351 0.341 0.3320.80 0.706 0.687 0.606 0.585 0.564 0.543 0.537 0.504 0.486 0.473 0.460 0.448 0.4380.85 0.854 0.835 0.745 0.726 0.703 0.681 0.675 0.639 0.621 0.609 0.593 0.582 0.5680.90 1.021 1.006 0.906 0.883 0.864 0.841 0.833 0.800 0.778 0.769 0.752 0.742 0.7260.95 1.208 1.198 1.088 1.069 1.051 1.026 1.021 0.981 0.961 0.952 0.937 0.924 0.9171.00 1.417 1.410 1.291 1.276 1.256 1.233 1.229 1.187 1.171 1.167 1.151 1.142 1.1001.05 1.646 1.648 1.517 1.506 1.488 1.466 1.456 1.420 1.409 1.404 1.390 1.340 1.3661.10 1.900 1.915 1.776 1.758 1.745 1.722 1.716 1.681 1.666 1.670 1.657 1.590 1.6331.15 2.138 2.133 2.024 2.016 1.992 1.985 1.993 1.953 1.971 1.938 1.937 1.938 1.9251.20 2.427 2.420 2.288 2.327 2.273 2.261 2.268 2.248 2.244 2.238 2.237 2.231 2.2381.25 2.695 2.758 2.584 2.615 2.568 2.569 2.580 2.556 2.559 2.561 2.548 2.552 2.5571.30 3.045 3.044 2.917 2.927 2.906 2.884 2.915 2.893 2.883 2.887 2.886 2.887 2.8571.35 3.404 3.411 3.286 3.287 3.264 3.252 3.279 3.242 3.267 3.261 3.263 3.239 3.2281.40 3.758 3.750 3.621 3.662 3.665 3.619 3.637 3.640 3.643 3.626 3.630 3.624 3.6351.45 4.202 4.234 4.052 4.102 4.062 4.060 4.101 4.042 4.097 4.061 4.055 4.065 4.0601.50 4.666 4.679 4.538 4.615 4.534 4.554 4.546 4.550 4.514 4.535 4.530 4.560 4.561TABLE II. Scaling factor a ( α, L ) in the trend distribution P and the trend signiﬁcance S in long term correlated data withHurst exponent α and record length L .[35] J. W. Kantelhardt, E. Koscielny-Bunde, D. Rybski, P.Braun, A. Bunde, and S. Havlin, J. Geophys. Res. At-mospheres, , D01106 (2006).[36] E. Koscielny-Bunde, J. W. Kantelhardt, P. Braun, A.Bunde, and S. Havlin, J. of Hydrology., , 120 (2006). [37] A. Beretta, H. E. Roman, F. Ricich, and F. Crisciani,Physica A, , 695 (2005).[38] S. Dangendorf et al. Geophys. Res. Lett. , 15, 5530(2014).[39] M. Becker, M. Karpytchev, and S. Lennartz-Sassinek,Geophys. Res. Lett. , 15, 5571 (2014). [40] K. Hasselmann, J. Climate , 1957 (1993).[41] G.C. Hegerl,H. von Storch, K. Hasselmann, B.D. Santer,U. Cubasch, and P.D. Jones J. Climate , 2281 (1996).[42] F. W. Zwiers, in Antropogenic Climate Change (Springer,New York, 1999), pp. 163-209[43] B. D. Santer et al., J. Geophys. Res. Atmospheres, ,7337 (2000).[44] J. Turner, Int. J. Climatol., , 279 (2005).[45] E. J. Steig, Nature, , 459 (2009).[46] D. H. Bromwich al., Nature Geoscience, , 139 (2012).[47] IPCC, Stocker T. F. et al. Eds. (2013), Climate Change2013: The Physical Science Basis. Contribution of Work- ing Group I to the Fifth Assessment Report of the In-tergovernmental Panel on Climate Change , CambridgeUniversity Press, Cambridge and New York.[48] J. W. Kantelhardt, E. Koscielny-Bunde, H.H.A. Rego, A.Bunde, and S. Havlin, Physica A, , 441 (2001).[49] D. H. Bromwich et al, Nature Geoscience, , 11247(2014).[50] D. H. Bromwich and J. P. Nicolas, Nature Geoscience, ,247 (2014).[51] T. Schreiber and A. Schmitz, Phys. Rev. E,77