The significance of trends in long-term correlated records
aa r X i v : . [ phy s i c s . d a t a - a n ] N ov The significance of trends in long-term correlated records
Araik Tamazian
Institut f¨ur Theoretische Physik, Justus-Liebig-Universit¨at Giessen, 35392 Giessen, Germany
Josef Ludescher and Armin Bunde
Institut f¨ur Theoretische Physik, Justus-Liebig-Universit¨at Giessen, D-35392 Giessen, Germany
We study the distribution P ( x ; α, L ) of the relative trend x in long-term correlated records oflength L that are characterized by a Hurst-exponent α between 0.5 and 1.5 obtained by DFA2. Therelative trend x is the ratio between the strength of the trend ∆ in the record measured by linearregression, and the standard deviation σ around the regression line. We consider L between 400and 2200, which is the typical length scale of monthly local and annual reconstructed global climaterecords. Extending previous work by Lennartz and Bunde [1] we show explicitely that P follows thestudent-t distribution P ∝ [1 + ( x/a ) /l ] − ( l +1) / , where the scaling parameter a depends on both L and α , while the effective length l depends, for α below 1.15, only on the record length L . From P we can derive an analytical expression for the trend significance S ( x ; α, L ) = R x − x P ( x ′ ; α, L ) dx ′ and the border lines of the 95% percent significance interval. We show that the results are nearlyindependent of the distribution of the data in the record, holding for Gaussian data as well as forhighly skewed non-Gaussian data. For an application, we use our methodology to estimate thesignificance of Central West Antarctic warming.PACS numbers: 05.10.-a, 05.45.Tp, 92.70.Mn I. INTRODUCTION
The variability of a data set { y i } , i = 1 , . . . , L , de-pends on its (internal) correlation properties and can beinfluenced by external mechanisms. Prominent examplesare temperature records[1–28], river flows [29–36], andsea level heights [37–39] that all show a strong naturallong-term persistence and, in addition, are effected byanthropogenic influences that may lead to an additionaltrend.In long-term persistent records, small events have atendency to cluster in ”valleys” while large events tendto cluster in ”mountains”. Accordingly, long-term persis-tent records exhibit a pronounced valley-mountain struc-ture, where it is difficult to distinguish a natural trend(starting in a valley and ending in a mountain) from asmall external deterministic trend. The problem of esti-mating the anthropogenic trend in temperature records,river flows and sea level height is an important issue inhydroclimatology and is called the ”detection problem”[3, 40–42]. The central quantity here is the probabil-ity P ( x ; α, L ) that in a long-term persistent record oflength L , characterized by the Hurst exponent α , a rel-ative trend of strength x occurs; x is obtained from thestandard linear regression analysis of the record and rep-resents the height ∆ of the regression line divided by thestandard deviation of y i around the regression line (seeSection II). From P one obtains the significance of thetrend as well as its error bars (denoting the 95% signifi-cance interval)[1, 3, 22].In many previous attempts to solve this problem (see,e.g., [43–47, 49]) it had been tacitly assumed that thepersistence of the climate records can be modelled by anauto-regressive process of first order (AR(1)) which al-lowed, in a simple and straightforward way, to estimate the significance of the warming trend ∆ and its errorbars. In this case, the central quantity is the probability P ( x ; C (1) , L ) that in an AR(1) record of length L , char-acterized by the detrended lag-1 autocorrelation C (1),a relative trend of strength x occurs. It is well knownthat P ( x ; C (1) , L ) follows a student-t distribution (seeEq (6)), where the scaling paramer a and the effectivelength l are functions of C (1) and L . This approach,however, which is conventional in climate science, canonly be considered as a crude approximation since thetemperature variability cannot be described by an AR(1)process where the autocorrelation function C ( s ) decaysexponentially with time lag s , but is described by a long-term persistent process where C ( s ) decays algebraicallywith s .In recent years, there have been several attempts tosolve the detection problem in long-term persistent data[1, 14, 16, 17, 19–22]. Using Monte Carlo simulationsand scaling arguments it was found empirically [1, 22]that for long-term persistent Gaussian data, P ( x ; α, L )can be approximated reasonably by a Gaussian for small x and by a simple exponential for large x values.Here we perform the same kind of calculations as in[1, 22], but with a considerably better statistics, and showthat the best approximation for P , in the whole x -regime ,is again the student-t distribution , where now the scalingparameter a and the effective length l depend on α and L .While the previous result [1] represents a good approxi-mation in the respective x -windows, the present result ismore satisfying since it shows that the distributions of arelative trend in uncorrelated, short-term correlated andlong-term correlated data all follow the same equation,namely a student-t-distribution, but with different scal-ing parameters a and effective lengths l . Accordingly, theexceeding probability W and the significance S of a trendis described, in all these different systems, by the samehypergeometric function. In addition, we also study P for strongly skewed non-Gaussian data and find that, toa very good approximation, P is the same for all consid-ered distributions. Finally, we apply our methodology tothe West-Antarctic temperature record at Byrd station.The paper is organized as follows: In Section II wedescribe how the exceedance probability W and the sig-nificance S is related to P and which form these quanti-ties have for uncorrelated Gaussian noise and short-termcorrelated Gaussian data characterized by an AR(1) pro-cess. We also give a brief introduction into long-termpersistent data and their characterization. In SectionsIII we present our numerical results for the significanceof a relative trend in long-term persistent Gaussian (Sec-tion III) and non-Gaussian data (Section IV). In SectionV we show how our approach can be applied to monthlytemperature records. As an example we take the Byrdrecord from West Antarctica that has very recently beenreconstructed [49]. In Section VI, finally, we summarizeour results. II. DETECTION OF EXTERNAL TRENDS
We consider a record { y i } , i = 1 , . . . , L and assume,without loss of generality, that the mean value ¯ y of thedata is zero. To estimate the increase or decrease of thedata values in the considered time window of length L ,one usually performes a regression analysis. From theregression line r i = bi + d , one obtains the magnitude ofthe trend ∆ y = c ( L −
1) as well as the fluctuations aroundthe trend, characterized by the standard deviation σ =[(1 /L ) P Li =1 ( y i − r i ) ] / . The relevant quantity we areinterested in is the relative trend x = ∆ y/σ. (1)When a certain relative trend has been measured in adata set, the central question is, if this trend may be dueto the natural variability of the data set or not (”detec-tion problem”). To solve this problem, one needs to knowthe probability P ( x ; L ) dx that in model records with thesame persistence properties as the considered data set, arelative trend between x and x + dx occurs. The proba-bility density function P ( x ; L ) is symmetric in x . In thefollowing we consider x >
0. From P we derive the ex-ceedance probability W ( x ; L ) = R ∞ x P ( x ′ ; L ) dx ′ and thetrend significance S ( x ; L ) = Z x − x P ( x ′ ; L ) dx ′ = 1 − W ( x ; L ) . (2)By definition, S is the probability that the relative trendin the record is between − x and x .If the significance of a relative trend is above 0.95 (or95%), one usually assumes that the considered trend can-not be fully explained by the natural variability of therecord. The relation S ( x ; L ) = 0 .
95 defines the up-per and lower limits ± x of the 95% significance inter- val (also called confidence interval). By the above as-sumption, relative trends x between − x ( L ) and x ( L )can be regarded as natural. If x is above x , the part x − x cannot be explained by the natural variability ofthe record and thus can be regarded as minimum externalrelative trend, x minext = x − x . (3)On the other hand, the external trend cannot exceed x maxext = x + x , (4)which thus represents the maximum external relativetrend. By definition, x minext represents the lower marginof the observed relative trend that cannot be explainedby the natural variability alone, while x maxext ist the largestpossible external relative trend consistent with the nat-ural variability of the record. According to Eqs. (3) and(4), ± x ( L ) can be regarded as error bars for an exter-nal relative trend in a record of length L . A. White noise
For uncorrelated Gaussian data (white noise), it hasbeen assumed (see [43] and references therein) that theratio t b between the estimated trend slope b and its stan-dard error s b (see Eqs. (1-5) in [43]) follows a student-tdistribution. This assumption can be written as P ( x ; L ) = Γ( l ( L )+12 )Γ( l ( L )2 ) p πl ( L ) a (cid:18) x/a ) l ( L ) (cid:19) − l ( L )+12 , (5)with the degrees of freedom l ( L ) = L − a = √ L − √ L + 2 1 p l ( L ) . (7)Γ denotes the Γ-function. In the limit of large L , a tendsto a ∼ = √ / p l ( L ).From (5) and (2) one can obtain straightforwardly thesignificance trend S as a function of x/a and l ( L ), S ( x ; L ) = 2 xa Γ( ( l ( L ) + 1) √ πl Γ( l ( L )2 ) × F (cid:18) ,
12 ( l ( L ) + 1); 32 ; − ( x/a ) l ( L ) (cid:19) (8)where F is the hypergeometric function. B. Short-term correlations
The most basic model for short-term correlations indata sets is the autoregressive process of first order(AR1), where the data satisfy the equation y i +1 = c y i + η i , i = 1 , , . . . , L − . (9)Here, the AR1 parameter c is between -1 and 1 and η i iswhite noise. For c < c > c = 0, they are whitenoise.For characterizing the persistence of a record, one oftenstudies the autocorrelation function C ( s ) = h y i y i + s i ≡ L − s ) P L − si =1 y i y i + s / ( L P Li =1 y i ). By definition, C (0) =1. It is easy to show that for AR1 processes, in thelimit of L → ∞ , C ( s ) decays exponentially, C ( s ) = c s ,i.e., c is identical to the lag-1 autocorrelation C (1). For c > C ( s ) can be written as C ( s ) = exp( − s/s x )) where s x = 1 / | ln c | denotes the persistence time.It has been shown [43] that for sufficiently large L where C (1) = c , P has approximately the form of thestudent-t distribution Eq. (5), with l ( L ) = L − C (1)1 + C (1) − a = √ L − √ L + 2 1 p l ( L ) ∼ = √ p l ( L ) (11)Accordingly, the significance S of the trend is describedby Eq. (8) with l ( L ) from (10) and a from (11). C. Long-term persistence
Long-term correlated records can be characterized bythe power spectral density S ( f ) = | y ( f ) | , where { y ( f ) } , f = 0 , , . . . , L/
2, is the Fourier transform of { y i } . Withincreasing frequency f , S ( f ) decays by a power law, S ( f ) ∼ f − β , (12)where β > β = 0. Records with 0 < β < C ( s ) thatdecays by a power law, C ( s ) ∼ (1 − γ ) s − γ , with γ = 1 − β .To generate long-term persistent data, one usually usesthe Fourier-filtering technique based on (12), where longrecords of uncorrelated Gaussian data (typically of length L = 2 ) are transformed to Fourier space. The result ismultiplied by f − β/ and then transformed back to timespace. The resulting record is Gaussian distributed. Forobtaining records of the desired length L , one divides thelong record into segments of length L .Since both S ( f ) and C ( s ) exhibit large finite size ef-fects and are strongly influenced by external determin-istic trends, one usually does not use these methods to characterize the long-term persistence, but prefers meth-ods like the detrending fluctuation analysis of 2nd order(DFA2) [48] where linear trends in the data are elimi-nated systematically.In DFA2 one measures the variability of a record bystudying the fluctuations in segments of the record as afunction of the length s of the segments. Accordingly,one first divides the record { y i } , i = 1 , , . . . , L , into non-overlapping windows ν of lengths s . Then one focuses, ineach segment ν , on the cumulated sum Y i of the data anddetermines the variance F ν ( s ) of the Y i around the bestpolynomial fit of order 2. After averaging F ν ( s ) over allsegments ν and taking the square root, we arrive at thedesired fluctuation function F ( s ). One can show that F ( s ) ∼ s α . (13)The exponent α can be associated with the Hurst expo-nent, and is related to the correlation exponent γ and thespectral exponent β by α = (1 + β ) / α = 1 − γ/ α = 1 /
2. The DFA2 techniquegives reliable results for time scales s between 10 and L/ F ( s ) approaches a power law,with α = 1 /
2, for s well above the persistence time s x .For s well below s x (which can only be the case for b veryclose to 1), α is close to 1.5. The difference in the func-tional form of F ( s ) allows to distinguish between short-term and long-term persistent processes.Recently, it has been shown [1] by Monte Carlo simula-tions that in long-term persistent data of length L , wherethe Hurst exponent α is determined by DFA2, the prob-ability density P ( x ; α, L ) of the relative trend x can bereasonably approximated by a Gaussian for small x andby a simple exponential for large x . Using scaling theory,an analytic expression for W ( x ; L ) has been obtained, asa function of α , in the two x -regimes. Here we followthe same route as in [1], but with a better statistics, andfind that the best approximation for P , in the whole x regime, is the student-t-distribution Eq. (5), where thescaling parameter a and the effective length l depend onboth α and L .Accordingly, the significance of a relative trend in long-term persistent records is described by the same hyperge-ometric function as for white noise and AR1 noise, onlythe parameters l and a are different. We also show thatthe results derived for Gaussian data hold, in an excel-lent approximation, also for data with a symmetric ex-ponential distribution D ( y ) = (1 / exp ( −| y | ) as well asfor strongly skewed distributions like the one-sided expo-nential and one-sided power-law distribution. α -3 -2 -1 H ( α ; α * , L ) α * -3 -2 -1 G ( α * ; α , L ) α * = 0.75 α = 0.75a) b) FIG. 1. (color online) (a) When long data sets with globalHurst exponents α ∗ (created by Fourier filtering) are di-vided into sub-segments of length L , the local Hurst expo-nents α vary around α ∗ . The figure shows the distribu-tion H ( α ; α ∗ , L ) of the local Hurst exponents α (obtained byDFA2) in segments of lenght L = 400 (continuous red line)and L = 2200 (dashed green line), for fixed global Hurst ex-ponent α ∗ = 0 .
75. (b) According to (a), the same local Hurstexponent α can originate from long data sets with differentglobal Hurst exponents α ∗ . The figure shows, for the localHurst exponent α = 0 .
75 in segments of length L = 400 (con-tinuous red line) and L = 2200 (dashed green line), the dis-tribution G ( α ∗ ; α, L ) of the involved global Hurst exponents α ∗ . III. SIGNIFICANCE OF TRENDS INLONG-TERM CORRELATED GAUSSIAN DATASETS
For determining P ( x ; α, L ) numerically, we follow [1].We use the Fourier-filtering technique [2] to generate 800synthetic records of length 2 , for 241 global Hurst expo-nents α ∗ ranging from α ∗ = 0 . α ∗ = 2 .
5. We are in-terested in data sets with lengths L between 400 and 2200which correspond, in monthly temperature data sets, todata lengths between 33 and 183 years. Accordingly,we divided each data set into subsequences of lengths L = 400 , , , . . . , L , we used linear regression to determine (i) the localDFA2 Hurst exponent α as the slope of the regressionline in a double logarithmic presentation of the fluctua-tion function F ( s ) between s = 10 and L/ x . We are interested in α values between0.5 and 1.5, which are most common in nature.It has been noticed before [1, 18, 22], that the localHurst exponents α obtained in each subrecord are notidentical to the global Hurst exponent α ∗ of the entirerecord, but vary around α ∗ . The distribution of the localHurst exponents α , for fixed α ∗ = 0 .
75 and L = 400 and2200, is shown in Fig. 1a. As expected, the distributionnarrows with increasing subrecord length L . Accordingly,when in a subrecord a certain local Hurst exponent α is measured, the subrecord may be part of a long dataset with a different global Hurst exponent α ∗ . Figure1b shows, for fixed α = 0 .
75, the distribution of the α ∗ values. Again, the distribution narrows with increasing L . As a consequence, for determining the significance of -3 -2 -1 - S ( x ; α , L ) L = 6000 2 410 -3 -2 -1 - S ( x ; α , L ) -3 -2 -1 - S ( x ; α , L ) α = 0.5, 0.6, FIG. 2. (color online) Significance S ( x ; α, L ) of relative trends x occurring in long term correlated data of length L and Hurstexponent α . The data are Gaussian distributed. For clarity,we focus on 1 − S . (a) is for L = 600 and α = 0 . , . ., . α = 0 . , . , . α = 1 . , . , . , . , . L = 1200 and L = 1800, respectively. a relative trend in a long-term correlated record of length L , one cannot simply identify the local Hurst exponents α with the global one, but has to determine the localHurst exponents in each subrecord separately. As wewill show in the second part of this Section, ignoring thisfact will lead to a strongly enhanced significance. Welike to note that a similar problems occurs in short-termpersistent records, where only in long data sets the lag-1autocorrelation function C (1) is equal to the persistenceparameter b . In subrecords (or short data sets) of length L , the values of C (1) fluctuate around b , and Eqs. (10)and (11) are not valid [28].After having obtained in each subrecord k (of fixedlength L ) the local Hurst exponents α k , we focus on thosesubrecords that have local α values between 0.49 and1.51. We divide the local α values into 51 windows oflength 0.02, such that in the first window α = 0 . ± . α = 0 . ± .
01, and in the lastwindow α = 1 . ± .
01. Then we determine, in each α -window, the distribution P ( x ; α, L ) of the relative trendsas well as the trend significance S .Figure 2 shows 1 − S ( x ; α, L ) = 2 W ( x ; α, L ) for three α -1 a ( α , L ) L = 400L = 1200L = 2200
FIG. 3. (color online) The characteristic relative trend a ( α, L )that characterizes the decay of 1 − S (see Fig. 2 and Eq. ( ?? ),as a function of the Hurst exponent α for three record lengths L = 400 , , α = 0 . representative data lengths L = 600, 1200, and 1800,which in monthly climate records correspond to 50, 100,and 150 years. The dots are our numerical results. Thecontinous lines result from a fit of S to Eq. (8), withappropriately chosen values for the scaling parameter a and the effective length l . The figure shows that over allthree decades of 1 − S considered here (where S rangesfrom 0 to 0.999 (or from 0 to 99.9 %)), the fit is excellent.The parameters l and a are listed, for L between 400 and2200, in the Appendix.Table 1 in the Appendix shows that for fixed recordlength L , the effective length l is a constant in the mostrelevant range between α = 0 . L = 600, 1200, and 1800, l = 9.24, 12.05, and 13.69,respectively. For α above 1.1, l ( L ) decreases strongly.Figure 3 shows that the scaling parameter a listed inTable 2 in the Appendix, can be approximated, for α between 0.5 and 1.2, by a power law, where the slopeincreases with increasing L . Above α = 1 . a showsonly a very weak L dependence.From Fig. 2, by intersecting 1 − S with the constant5 × − , we obtain immediately the relative trend x that is conventionally used to estimate the error barsof a measured relative trend. Figure 4 shows x for L = 600, 1200 and 1800. From the figure, one can imme-diately read off the error bars of a relative temperaturetrend in monthly records of length 50, 100, and 150ywith DFA2 exponent α , and specify its lower and upperbounds x minext = x − x and x maxext = x + x .Finally, at the end of this Section, let us go back toits beginning and ask the following question: Given along record of length L described by the global Hurstexponent α ∗ , which is divided into subrecords of length L ≪ L . What is the significance ˜ S ( x ; α ∗ , L ) of a relativetrend x in these subrecords? We expect that for large L where all local Hurst exponents α are very close to α ∗ ,˜ S ( x, α ∗ , L ) and S ( x, α, L ) will coincide. For small L weexpect that ˜ S ( x, α ∗ , L ) overestimates the significance.The difference between S ( x, α, L ) and ˜ S ( x, α ∗ , L ) may α -1 x ( α ) L = 600L = 1200L = 1800
FIG. 4. (color online) The relative trend x correspondingto the 95% significance level as a function of the Hurst ex-ponent α for the segment lengths L = 600 , x is obtained from theintersection of 1 − S in Fig. 2 with 1-0.95=0.05. also be regarded as follows: If we know a priori that theconsidered data set is characterized by a certain Hurstexponent (which is the case, for example, when we con-sider records consisting of Gaussian distributed indepen-dent random numbers or their cumulated sum), then˜ S ( x, α ∗ , L ) gives the proper significance. If we do notknow the characteristics of the data set a priori, theuncertainty is increased and we have to determine ex-plicitely its Hurst exponent, S ( x, α, L ) gives the propersignificance.Figure 5 shows ˜ S ( x, α ∗ , L ) for the global Hurst expo-nents α ∗ = 0 .
5, 0.75, 1, and 1.25 in subrecords of lengths L = 400 , , and 2200. The figure shows also the sig-nificance S ( x, α, L ), for α = 0 .
5, 0.75, 1, and 1.25. For α ∗ = 0 .
5, ˜ S ( x, α ∗ , L ) follows Eq. (8) with (6) and (7). Wefound that also for α = 0 .
75 and 1, ˜ S ( x, α ∗ , L ) is well de-scribed by the student-t distribution (8) with l ( L ) = L − α = 0 .
75, the a values are 0.512, 0.383, and 0.328for L = 400, 1200 and 2200, respectively. For α = 1, therespective a values are 1.313, 1.185, and 1.135. As ex-pected, ˜ S ( x, α ∗ , L ) overestimates the significance of anobserved relative trend and thus underestimates the er-ror bars ± x of a relative trend. For example, when in amonthly temperature record of length 600 (correspondingto 50 years) characterized by α = 0 .
75 a relative trend x = 1 is measured, the proper significance of this trendis S = 0 .
94, i.e. the trend is not significant. However,if falsely ˜ S is used for estimating the significance, oneoverestimates the significance, since ˜ S = 0 .
975 in thiscase.
IV. SIGNIFICANCE OF TRENDS INLONG-TERM CORRELATED NON-GAUSSIANDATA SETS
By using the Fourier-filtering technique we generatedlong-term correlated Gaussian data { y i } . Many natu-ral records, e.g., monthly temperature anomalies where -3 -2 -1 - S ( x ; α ∗ , L ) L = 400 0.1 0.2 0.3 0.4L = 1200 0.1 0.2 0.3L = 22000 1 2 310 -3 -2 -1 - S ( x ; α ∗ , L ) -3 -2 -1 - S ( x ; α ∗ , L ) -3 -2 -1 - S ( x ; α ∗ , L ) α ∗ = 0.50.751.01.25 α = 0.5 ~~~~ FIG. 5. (color online) Significance ˜ S ( x, α ∗ , L ) of relativetrends x occurring in segments of length N of long data sets(length 2 ) with fixed global Hurst exponent α ∗ (continousline). The data follow Gaussian distributions. For clarity, wefocus on 1 − S , as in Fig. 2. We compare ˜ S ( x, α ∗ , L ) with thesignificance S ( x, α, L ) of relative trends x occuring in datasets of the same length N with fixed local Hurst exponent α (dashed red line) that was shown in Fig. 2. The consid-ered segment lengths are L = 400 , , and 2200, shown incolumns 1 −
3. The Hurst exponents α ∗ = α are 0.5, 0.75,1.0, and 1 .
25, shown in rows 1 − the seasonal trend has been removed, are Gaussian dis-tributed. But others, like river run-off data, have a quiteskewed distribution and cannot be characterized by aGaussian [35 ? , 36]. Accordingly, the question arises,to which extend our results derived in the previous sub-section are general and apply also to non-Gaussian dis-tributions.To answer this question, we have considered threekinds of non-Gaussian distributions: (i) the symmet-ric exponential distribution D ( y ) = (1 /
2) exp( −| y | ), (ii)the (highly skewed) exponential distribution D ( y ) =exp( − y ) , y ≥
0, and (iii) the (highly skewed) power lawdistribution D ( y ) = y − , y ≥
0. To generate these dis-tributions, we have first generated long-term correlateddata { y i } of length 2 that are Gaussian distributed, asabove. Then we generated 2 data of the considerednon-Gaussian distribution and exchanged the long-term α -3 -2 -1 H ( α ; α * , L ) αα * = 1.5 a) b) FIG. 6. (color online) Distribution H ( α ; α ∗ , L ) of local Hurstexponents α (obtained by DFA2) in segments of lenght L =400 and L = 2200, for fixed global Hurst exponents α ∗ = 1 . H ( α ; α ∗ , L ) has been obtained for data following a Gaus-sian distribution (open circles), an exponential distribution(full circles), a symmetric exponential distribution (triangleup), and a power law distribution (triangle down). The non-Gaussian data have been generated by the exchange tech-nique. correlated Gaussian data rankwise by the non-Gaussiandata.By this simple exchange technique we obtain long-termcorrelated data following the considered non-Gaussiandistribution, but the global Hurst exponent as well asthe local ones usually differ slightly from the original one.These slight deviations do not play a role here, since weconsider α ∗ values between 0.1 and 1.9 and only the local α values measured by DFA2 are essential in our analy-sis. If one needs to obtain data with exactly the same α ∗ value as the Gaussian data, one has to use the it-erative Schreiber-Schmitz procedure [51], where in eachiteration the data are (1) Fourier-transformed to f space.Then (2) the Fourier-transformed data are exchangedby the Fourier-transform of the original Gaussian dataand (3) Fourier-transformed back to time space. Finally,(4) these data are exchanged rankwise by the desirednon-Gaussian distribution. By comparing the simple ex-change method with the Schreiber-Schmitz procedure wefound that both methods yield, for the same global Hurstexponent α ∗ ≤ .
5, the same distribution H ( α ; α ∗ , L ) oflocal Hurst exponents α as the Gaussian records.Figure 6 shows, for α ∗ = 1 . L = 400 and 2200, thedistributions H ( α ; α ∗ , L ) of local Hurst exponents α forboth Gaussian and non-Gaussian data. Above α ∗ = 1 . α ∗ , the output α ∗ was al-ways close to 1.5, and the distribution of the α valuesbecame much broader than for the Gaussian data. Ac-cordingly, our analysis of non-Gaussian data is limited tothose α values that are typically absent in long recordswith α ∗ above 1.5. We found that this is the case forsubrecords of length L = 400, 1200, and 2200, for α be-low α c ( L ) = 1 .
15, 1.31, and 1.34, respectively, where thefraction of subrecords originating from long records with -3 -2 -1 - S ( x ; α , L ) L = 400 0.2 0.41200 0.1 0.2 0.322000 1 2 310 -3 -2 -1 - S ( x ; α , L ) -3 -2 -1 - S ( x ; α , L ) -3 -2 -1 - S ( x ; α , L ) α = 0.50.751.01.25 FIG. 7. (color online) Dependence of the significance S ( x, α, L ) of the relative trend x on the distribution of thelong-term persistent data. Apart from Gaussian data (con-tinous line, shown also in Fig. 2), we consider data withasymmetric exponential distribution (full circle), symmetricexponential distribution (triangle up), and power law distri-bution (triangle down). For convenience, we show 1 − S asin Fig. 2. The record lenghts are L = 400, 1200, and 2200(columns 1 − α = 0 . , . , − α ∗ above 1.5, is below 10 − . Accordingly, our analysisholds only for local Hurst exponents α below α c ( L ).Figure 7 shows the significance of the relative trend x for the 3 non-Gaussian distributions considered here, for L = 400, 1200, and 2200. The continous line (difficult tosee) is the result for the Gaussian data. The figure showsthat for α below 1 (rows 1 − − S of the Gaussian distribution appears to be alower bound for 1 − S of the skewed distributions, thismeans that the significance of a trend in long-term cor-related Gaussian data represents an upper bound for thesignificance.Accordingly, the value obtained for x in Gaussiandistributed data represents a lower bound, but the differ-ences between the different distributions are very small, the largest deviations occur for α = 0 .
75 and L = 2200where x for the skewed power-law distribution exceeds x for the Gaussian distribution by less than 5 percent.For α = 1, x is the same for all distributions. For α = 1 .
25 (last row) which is below α c for L = 1200 and2200, the agreement between Gaussian and non-Gaussiandata is perfect for L = 1200 and 2200. For L = 400,where α is above α c and thus should not be considered,Gaussian and non-Gaussian data still collapse for 1 − S above 10 − . The shoulder below 10 − is an artifact whichoriginates from records where the original α ∗ value wasabove 1.5. V. EXAMPLE: THE ANTARCTIC BYRDRECORD
It is straightforward to apply our methodology to ob-servational data. Important applications are climate data(e.g., river flows, precipitation, and temperature data)where one likes to know the significance of trends dueto anthropogenic climate change. When considering cli-mate data, it is important to use monthly data whereadditional short-term dependencies have been averagedout and seasonal trends can be better eliminated than indaily data [1].For convenience we consider temperature data. Theseasonal trend elimination is done in 2 steps [35, 36]. Inthe first step, we substract the monthly seasonal trend toobtain the temperature anomalies ˜ y i , i = 1 , . . . , L . Sincethe variance of the temperature anomalies may dependon the season, we divide in the second step the tempera-ture anomalies by the seasonal standard deviation. Theresulting dimensionless record y i has unit variance andzero mean.Next we perform the regression analysis for the { y i } which yields ∆ and σ and thus the relative trend x . Thenwe employ DFA2 and obtain the Hurst exponent α . From x and α we can estimate the significance S of the temper-ature trend from Tables I and II as well as the boundary ± x of the 95% significance interval.Since we divided the temperature anomalies by theseasonal standard deviation to obtain { y i } , ∆ and σ aswell as the error bars ± ∆ = x σ are dimensionless.To obtain the real trend ∆ real and its real error bars ± ∆ real in units of o C , we perform a regression analy-sis of the temperature anomalies ˜ y i , i = 1 , . . . , L . To ob-tain the error bars ∆ real95 we use the identity (see [1])∆ real95 / ∆ real = ∆ / ∆, which then yields∆ real95 = ∆ ∆ real ∆ . The resulting minimum and maximum external temper-ature trends are ∆ real ± ∆ real95 .To show explicitely how our approach can be used toestimate the significance of a warming trend we considerthe monthly (corrected) Byrd record between 1957 and2013 that was recently reconstructed by Bromwich et y i F ( s ) [ a r b . un it s ] α = 0.65 a) b) FIG. 8. (color online) (a) Monthly fully seasonally detrendedrecord y i at the Byrd station between 1957 and 2013 (blacklines) and the corresponding linear regression (red line). (b)The DFA2 fluctuation function F ( s ) of the monthly Byrdrecord y i (black circles). al. [50].(An earlier version [46] of the Byrd record hasbeen discussed in [28]). The Byrd station is located inthe center of West Antarctica which is one of the fastestwarming places on Earth. The regression analysis yields∆ real = 2 . o C . It is obvious that the question of the sig-nificance of Antarctic warming is highly relevant, sincethe warming trend influences the melting of the WestAntarctic Ice Shielf and thus contributes to future sealevel rise.Figure 8a shows the fully seasonally detrended Byrdrecord y i , i = 1 , . . . , .
69 and σ = 1 . x = 0 . y i (full circles). In the double logarithmic plot, the DFA2fluctuation function F ( s ) follows a straight line with ex-ponent α = 0 .
65 between s = 10 and L/
4. Accordingly,the data are long-term persistent and our methodologyapplies. The exponent α = 0 .
65 is in agreement withearlier estimates [28, 50]. We like to note that the tem-perature anomalies ˜ y i yield to the same Hurst exponentshowing explicitely that the second step in the seasonaldetrending has no influence on the persistence properties.For obtaining the degrees of freedom l and the scalingfactor a for α = 0 .
65 and L = 684, we consider the 4thline in Table 1 and 2 and use the respective values for L = 500 , ,
700 and 800 for a cubic interpolation. Thisgives l = 9 .
78 and a = 0 . S = 0 .
953 and x = 0 .
64. Accordingly,the significance of the warming trend at the Byrd stationis 95.3%. The minimum external trend is 0 . o C , whilethe maximum external trend is 3 . o C .When Bromwich et al determined these quantities,they used the conventional hypothesis that the annual linearly detrended temperature data follow an AR(1)process and that S can be obtained from Eqs. (9), (11)and (12) (even though the length of the annual data is too small to make (11) and (12) applicable). For theannual detrended data, they obtained C (1) = 0 .
075 andthus l = 49 .
05 and a = 0 . S = 0 .
999 and x = 0 . . o C , while the maximumexternal trend is 3 . o C .Accordingly, the significance of the warming trend aswell as the minimum external trend has been stronglyoverestimated by Bromwich et al, while the uncertainty2 x has been underestimated. VI. CONCLUSIONS
In summary, we have studied by extensive Monte-Carlosimulations the distribution P ( x ; α, L ) of linear trends inlong-term correlated records of length L that are charac-terized by a Hurst exponent α between 0.5 and 1.5 (de-termined by DFA2). The Hurst exponent was obtainedby linear regression from the slope of the regression linein a double logarithmic representation of the DFA2 fluc-tuation function F ( s ) between s = 10 and L/
4. We haveconsidered record lengths L between 400 and 2200, whichcorresponds, in monthly climate records, to time scalesbetween 33.3 and 183.3 years. In each record we havedetermined by linear regression analysis the increase ∆and the standard deviation σ of the data around the re-gression line; the ratio x = ∆ /σ is the relative trend.We have extended the earlier analysis [1] in three im-portant directions:(i) We found that P ( x ; α, L ) follows, in the whole x -range, the student-t distribution with two fit parameters,the scaling parameter a and the effective length l . Thisgeneralizes nicely the known results for white noise andAR(1) processes, where P also follows a student-t dis-tribution, but with different a and l values. For Hurstexponents between 0.5 and 1.1, l depends only on therecord length L , and not on the Hurst exponent α . Inthe previous work [1], the distribution was approximatedby a Gaussian at small x and an exponential at large x values, which allowed to determine easily the significanceof large relative trends.(ii) In [1], only Hurst exponents up to 1.1 could betreated analytically. Here we extend the analytical anal-ysis to α = 1 . x are more pronounced. We alsoconsidered slightly smaller and larger record lengths.(iii) In [1], only Gaussian data were considered. Herewe have shown explicitely that the results are stable anddo hold also for very different, highly skewed distribu-tions.For applying our methodology to observable data onemust be sure that there are no additional short termcorrelations on short time scales. It is known that intemperature anomalies, there are additional short termcorrelation on time scales up to 10 days. For river flows,the short term persistence may range up to one month.These short term correlations can be eliminated by aver-aging the data over short time windows that are largerthan the persistence time, i.e. by considering monthlytemperature anomalies and quarter annual river flows.The DFA2 analysis then has to be performed on these av-eraged records and the actual data length L is the lengthof the averaged data set.In long-term persistent processes it is enough to de-termine α by DFA2 and use this value to determine a and l . In most previous evaluations of the significanceof trends in climate science the significance of trendshas been determined from the value of C (1) in annualrecords, assuming that the significance of trends in anAR(1) record may be a good approximation for the signif-icance of trends in long-term correlated climate records.Our results show that one does not need to rely on thiscrude approximation (which usually strongly exaggeratesthe significance) since the estimation of the significanceof trends in long-term correlated records is not more dif-ficult.Acknowledgements: We like to thank the DeutscheForschungsgemeinschaft and the Ministry of Educationand Science of the Russian Federation for financial sup-port. VII. APPENDIX
We have shown in this article that the probability P ( x ; α, L ) that in a long-term persistent record of length L , characterized by the Hurst exponent α , a relativetrend of strength x occurs, has the form of a student-t distribution, P ( x ; α, L ) = Γ( l +12 )Γ( l ) √ πla (cid:18) x/a ) l (cid:19) − l +12 , with the effective length l and the scaling parameter a .The related trend significance is S ( x ; α, L ) = 2 xa Γ( ( l + 1) √ πl Γ( l ) F (cid:18) ,
12 ( l + 1); 32 ; − ( x/a ) l (cid:19) . Tables I and II list the effective lengths l and the scalingfactor a as function of the DFA2 Hurst exponent α andthe record length l . [1] S. Lennartz and A. Bunde, Phys. Rev. E., , 021129(2011).[2] B. B. Mandelbrot, Gaussian Self-Affinity and Fractals (Springer, New York, Berlin, Heidelberg, 2001)[3] P. Bloomfield, and D. Nychka, Climatic Change, , 275(1992).[4] J. D. Pelletier and D. L. Turcotte , J. Hydrology, ,198 (1997).[5] E. Koscielny-Bunde et al., Phys. Rev. Lett., , 729(1998).[6] B. D. Malamud and D. L. Turcotte, Advances in Geo-physics, , 1 (1999).[7] P. Talkner and R.O. Weber, Phys. Rev. E, , 150,DOI:10.1103/PhysRevE.62.150 (2000).[8] R.O. Weber and P. Talkner, J. Geophys. Res. , 20131(2001).[9] R.A. Monetti, S. Havlin, and A. Bunde, Physica A ,581 (2003).[10] J. Eichner, E. Koscielny-Bunde, A. Bunde, S. Havlin, andH. J. Schellnhuber, Phys. Rev. E, , 046133 (2003).[11] K. Fraedrich and R. Blender, Phys. Rev. Lett. , 108501(2003).[12] R. Blender and K. Fraedrich, Geophys. Res. Lett., ,1769 (2003).[13] L. A. Gil-Alana, J. Climate, , 5357 (2005).[14] T.A. Cohn and H.F. Lins, Geophys. Res. Lett., ,L23402 (2005).[15] A. Kir´aly, I. Bartos, and I.M. J´anosi, Tellus, , 5, 593,(2006).[16] D. Rybski, A. Bunde, S. Havlin and H. von Storch, Geo-phys. Res. Lett. , L06718 (2006). [17] E. Giese, I. Mossig, D. Rybski, and A. Bunde, Erdkunde , 186 (2007).[18] D. A. Rybski, A. Bunde, and H. v. Storch, J. Geophys.Res. Atmospheres, , D02106 (2008).[19] E. Zorita, T.F. Stocker, and H. v. Storch, Geophys. Res.Lett. , L24706 (2008).[20] D. Rybski and A. Bunde, Physica A , 1687 (2009).[21] J.M. Halley, Physica A , 2492 (2009).[22] S. Lennartz, and A. Bunde, Geophys. Res. Lett., ,L16706 (2009).[23] S. Fatichi, S. M. Barbosa, E. Caporali, and M. E. Silva,J. Geophys. Res., , D18121 (2009).[24] C. Franzke, J. Climate, , 6074 (2010).[25] C. Franzke, C., J. Climate, , 4172 (2012).[26] S. Lovejoy and D. Schertzer, in: Extreme events and nat-ural hazards: the complexity perspective, A.S. Sharma,A. Bunde, D. Baker, and V.P. Dimri (eds), AGU mono-graphs, 231 (2012).[27] C. Franzke, Geophys. Res. Lett., , 1391 (2013).[28] A. Bunde, J. Ludescher, C. Franzke, and U. B¨untgen,Nature Geoscience, , 246 (2014).[29] H. E. Hurst, Transactions of the American Society of civilengineers, , 770 (1951).[30] B. B. Mandelbrot and J. R. Wallis, Water Resources Re-search, , 5, 909 (1968).[31] A. Montanari, R. Rosso, and M. S. Taqqu, Water Re-sources Research, , 5, 1249 (2000).[32] A. Montanari, Theory and Application of Long-rangeDependence, P. Doukhan, G. Oppenheim, M. S. Taqqu,Eds., 461-472 (2003).[33] D. Koutsoyiannis, Hydrological Sciences J. , 3 (2003).[34] D. Koutsoyiannis, J. Hydrology , 25 (2006). α \ L L400 L500 L600 L700 L800 L900 L1000 L1200 L1400 L1600 L1800 L2000 L22000.50 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.55 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.60 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.65 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.70 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.75 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.80 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.85 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.90 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.500.95 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.501.00 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.501.05 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.501.10 7.60 8.50 9.24 9.87 10.41 10.88 11.31 12.05 12.67 13.21 13.69 14.11 14.501.15 7.06 7.69 8.86 9.32 9.63 10.51 11.08 11.76 13.12 12.50 13.07 14.24 14.441.20 6.95 7.50 8.27 9.54 9.21 9.63 10.28 11.14 11.91 11.92 12.48 13.13 14.261.25 6.41 7.32 7.91 8.74 8.61 9.27 9.85 10.55 11.21 11.34 11.59 12.44 12.941.30 6.25 6.68 7.60 8.11 8.28 8.55 9.28 9.90 10.21 10.36 10.74 11.26 11.031.35 5.93 6.31 7.29 7.68 7.83 8.21 8.74 9.12 9.88 9.88 10.26 10.40 10.401.40 5.47 5.80 6.63 7.21 7.44 7.55 7.93 8.49 8.94 8.78 9.02 9.48 9.741.45 5.19 5.54 6.33 6.83 6.83 7.14 7.58 7.82 8.62 8.18 8.43 8.80 9.011.50 4.77 5.10 5.88 6.45 6.33 6.76 6.88 7.42 7.48 7.55 7.76 8.25 8.41TABLE I. Effective length l ( α, L ) in in the trend distribution P and the trend significance S in long term correlated data withHurst exponent α and record length L . α \ L L400 L500 L600 L700 L800 L900 L1000 L1200 L1400 L1600 L1800 L2000 L22000.50 0.174 0.162 0.133 0.124 0.115 0.107 0.104 0.092 0.086 0.081 0.076 0.072 0.0680.55 0.227 0.212 0.177 0.165 0.154 0.145 0.140 0.126 0.117 0.111 0.105 0.100 0.0960.60 0.292 0.275 0.232 0.218 0.205 0.193 0.188 0.171 0.160 0.153 0.145 0.139 0.1330.65 0.371 0.352 0.300 0.284 0.268 0.256 0.249 0.227 0.215 0.206 0.196 0.190 0.1820.70 0.465 0.447 0.385 0.368 0.348 0.332 0.325 0.301 0.286 0.275 0.265 0.256 0.2470.75 0.577 0.556 0.486 0.466 0.446 0.428 0.420 0.392 0.376 0.364 0.351 0.341 0.3320.80 0.706 0.687 0.606 0.585 0.564 0.543 0.537 0.504 0.486 0.473 0.460 0.448 0.4380.85 0.854 0.835 0.745 0.726 0.703 0.681 0.675 0.639 0.621 0.609 0.593 0.582 0.5680.90 1.021 1.006 0.906 0.883 0.864 0.841 0.833 0.800 0.778 0.769 0.752 0.742 0.7260.95 1.208 1.198 1.088 1.069 1.051 1.026 1.021 0.981 0.961 0.952 0.937 0.924 0.9171.00 1.417 1.410 1.291 1.276 1.256 1.233 1.229 1.187 1.171 1.167 1.151 1.142 1.1001.05 1.646 1.648 1.517 1.506 1.488 1.466 1.456 1.420 1.409 1.404 1.390 1.340 1.3661.10 1.900 1.915 1.776 1.758 1.745 1.722 1.716 1.681 1.666 1.670 1.657 1.590 1.6331.15 2.138 2.133 2.024 2.016 1.992 1.985 1.993 1.953 1.971 1.938 1.937 1.938 1.9251.20 2.427 2.420 2.288 2.327 2.273 2.261 2.268 2.248 2.244 2.238 2.237 2.231 2.2381.25 2.695 2.758 2.584 2.615 2.568 2.569 2.580 2.556 2.559 2.561 2.548 2.552 2.5571.30 3.045 3.044 2.917 2.927 2.906 2.884 2.915 2.893 2.883 2.887 2.886 2.887 2.8571.35 3.404 3.411 3.286 3.287 3.264 3.252 3.279 3.242 3.267 3.261 3.263 3.239 3.2281.40 3.758 3.750 3.621 3.662 3.665 3.619 3.637 3.640 3.643 3.626 3.630 3.624 3.6351.45 4.202 4.234 4.052 4.102 4.062 4.060 4.101 4.042 4.097 4.061 4.055 4.065 4.0601.50 4.666 4.679 4.538 4.615 4.534 4.554 4.546 4.550 4.514 4.535 4.530 4.560 4.561TABLE II. Scaling factor a ( α, L ) in the trend distribution P and the trend significance S in long term correlated data withHurst exponent α and record length L .[35] J. W. Kantelhardt, E. Koscielny-Bunde, D. Rybski, P.Braun, A. Bunde, and S. Havlin, J. Geophys. Res. At-mospheres, , D01106 (2006).[36] E. Koscielny-Bunde, J. W. Kantelhardt, P. Braun, A.Bunde, and S. Havlin, J. of Hydrology., , 120 (2006). [37] A. Beretta, H. E. Roman, F. Ricich, and F. Crisciani,Physica A, , 695 (2005).[38] S. Dangendorf et al. Geophys. Res. Lett. , 15, 5530(2014).[39] M. Becker, M. Karpytchev, and S. Lennartz-Sassinek,Geophys. Res. Lett. , 15, 5571 (2014). [40] K. Hasselmann, J. Climate , 1957 (1993).[41] G.C. Hegerl,H. von Storch, K. Hasselmann, B.D. Santer,U. Cubasch, and P.D. Jones J. Climate , 2281 (1996).[42] F. W. Zwiers, in Antropogenic Climate Change (Springer,New York, 1999), pp. 163-209[43] B. D. Santer et al., J. Geophys. Res. Atmospheres, ,7337 (2000).[44] J. Turner, Int. J. Climatol., , 279 (2005).[45] E. J. Steig, Nature, , 459 (2009).[46] D. H. Bromwich al., Nature Geoscience, , 139 (2012).[47] IPCC, Stocker T. F. et al. Eds. (2013), Climate Change2013: The Physical Science Basis. Contribution of Work- ing Group I to the Fifth Assessment Report of the In-tergovernmental Panel on Climate Change , CambridgeUniversity Press, Cambridge and New York.[48] J. W. Kantelhardt, E. Koscielny-Bunde, H.H.A. Rego, A.Bunde, and S. Havlin, Physica A, , 441 (2001).[49] D. H. Bromwich et al, Nature Geoscience, , 11247(2014).[50] D. H. Bromwich and J. P. Nicolas, Nature Geoscience, ,247 (2014).[51] T. Schreiber and A. Schmitz, Phys. Rev. E,77