Long-term prediction intervals of economic time series
Empirical Economics (2020) 58:191–222
https://doi.org/10.1007/s00181-019-01689-2
M. Chudý · S. Karmakar · W. B. Wu

Received: 16 July 2018 / Accepted: 27 March 2019 / Published online: 10 April 2019
© The Author(s) 2019
Abstract
We construct long-term prediction intervals for time-aggregated future values of univariate economic time series. We propose computational adjustments of the existing methods to improve coverage probability under a small sample constraint. A pseudo-out-of-sample evaluation shows that our methods perform at least as well as selected alternative methods based on model-implied Bayesian approaches and bootstrapping. Our most successful method yields prediction intervals for eight macroeconomic indicators over a horizon spanning several decades.
Keywords
Heavy-tailed noise · Long memory · Kernel quantile estimator · Stationary bootstrap · Bayes
JEL Classification
C14 · C15 · C53 · C87 · E27
Correspondence: M. Chudý ([email protected])

Institute for Financial Policy, Ministry of Finance, Stefanovicova 5, 817 82 Bratislava, Slovak Republic; Department of Statistics and Operations Research, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Vienna, Austria; Department of Statistics, University of Florida, 230 Newell Drive, Gainesville, FL 32611, USA; Department of Statistics, University of Chicago, 5747 S. Ellis Avenue, Chicago, IL 60637, USA

1 Introduction

Long-term predictions of economic time series are published yearly by the US Congressional Budget Office (CBO) in its Long-term Budget and Economic Outlook. In this report, the CBO predicts US federal spending and revenue growth in the coming decades under the assumption of stable tax and spending policies. However, structural changes occur over the long run (taking the turbulent period after the Great Moderation as an example), and not only as a result of changes in legislation. The CBO stated in its
January 2000 Budget and Economic Outlook that the baseline projections allow for an average recession within the next 10 years (2000–2010). Today, we know that the 2008 recession was more severe than the predicted average recession. Moreover, in its 2011 report the US Financial Crisis Inquiry Commission concluded that the crisis would have been avoidable if timely preventive measures had been introduced. We do not link the absence of these measures with the CBO's projections from 18 years ago, but we do believe that accurate long-term economic predictions can trigger correct and timely decisions. Economic predictions for several decades ahead are crucial for strategic decisions made by trust funds, pension management and insurance companies, portfolio management of specific derivatives (Kitsul and Wright 2013) and assets (see Bansal et al. 2016). Several facts hamper the long-term prediction of economic time series: small sample size, because most post-WWII economic indicators are reported on monthly/quarterly bases; (anti-)persistence (see Diebold and Rudebusch 1989; Baillie 1996; Diebold and Linder 1996, who also give PIs); heteroscedasticity; and structural change (Cheng et al. 2016), the latter of which is inevitable in the long run (Stock and Watson 2005).

Sometimes, decision makers call for predictions of boundaries [L, U] covering the future value of interest with a certain probability. Unlike point forecasts, prediction intervals (PIs) can capture the uncertainty surrounding predictions. As a proxy for this uncertainty, one can look at the widths of different PIs. Most software packages offer PIs as part of their standard output. PIs from exponential smoothing, for instance, are readily available without any strict assumptions, but as the forecast horizon grows, these PIs often become too wide to be informative (see Chatfield 1993, for more background).
By contrast, PIs implied by arma–garch models often turn out to be too narrow because they ignore distributional, parameter and model uncertainty (see Pastor and Stambaugh 2012). Pascual et al. (2004, 2006) therefore compute predictive densities using bootstrapping without the usual distributional assumptions while incorporating parameter uncertainty. Using Bayesian methods, one can account for both model and parameter uncertainty, but the pre-assigned coverage of PIs is attained only on average relative to the specified prior. Müller and Watson (2016) construct bayes PIs for temporal averages of economic series' growth rates over a horizon of 10–75 years. Using the so-called least favorable distribution solves the problem above with the pre-assigned coverage and makes their PIs more conservative. Zhou et al. (2010) provide theoretically valid long-term PIs for the same type of target as Müller and Watson (2016), i.e., the temporal aggregate of a series over a long horizon. As opposed to Müller and Watson (2016), Zhou et al. (2010) do not require any model fitting (at least in our univariate setup) and thus provide a very simple alternative. While both papers allow for the presence of a long-memory component in the data generating process (DGP), Zhou et al. (2010) did not apply their methods to economic time series, nor has either of the papers compared themselves with any benchmark. These facts pave the way for the following empirical research:

– First, since Zhou et al. (2010) evaluate their PIs using only simulated data, we find it necessary to verify their results using real data. (Anti-persistence can be observed as well, often as a result of (over-)differencing.)
– Second, the methods of Zhou et al. (2010), although theoretically valid, do not account for some characteristics of economic time series. Therefore, we propose computational adjustments of the PIs of Zhou et al. (2010) that lead to better predictive performance for small samples and long horizons. Our adjustments employ a stationary bootstrap (Politis and Romano 1994) and kernel quantile estimators (Sheather and Marron 1990).
– Third, since neither Zhou et al. (2010) nor Müller and Watson (2016) compare their PIs to any benchmark, we take over this responsibility and conduct an extensive pseudo-out-of-sample (POOS) comparison. We augment the comparison with PIs implied by arfima–garch models computed as one of the following: (i) forecasts for time-aggregated series or (ii) time-aggregated forecasts of disaggregated series. To compute (i) and (ii) we use both analytic formulas and bootstrap path simulations (Pascual et al. 2006).

The main results of our paper may be summarized as follows:

– First, our simulation study reveals that the PIs of Zhou et al. (2010) fail to achieve their nominal coverage rate under a growing horizon as a result of rapidly shrinking width. Particularly under a long-memory DGP, the coverage rate reaches only half of the nominal level.
– Second, using the proposed computational adjustments, we achieved an improvement in the coverage rate of 20 pp, which may, however, still be below the nominal level.
– Third, based on real data (S&P 500 returns and US 3-month TB interest rates), the adjusted PIs of Zhou et al. (2010) provide a valid competitor for Müller and Watson (2016). Particularly in the case of asset returns, the PIs of Müller and Watson (2016) provide higher coverage but less precision (larger width), while for assets' volatility, the roles are switched. In both cases, the adjusted Zhou et al. (2010) PIs outperform the bootstrap PIs of Pascual et al. (2006).
– Fourth, with the adjusted method of Zhou et al. (2010), we construct long-term prediction intervals for selected US macroeconomic time series including GDP growth, total factor productivity, inflation, population, and others.
These PIs provide an alternative to the PIs given by Müller and Watson (2016) in Table 5 on pages 1731–1732 of the referenced paper.

Our article is organized as follows: In Sect. 2 we review the methods discussed above with a focus on their scope and implementation. We further describe our computational adjustments of both methods of Zhou et al. (2010) and justify them using simulations. Section 3 summarizes the empirical comparison of all previously discussed methods. Section 4 provides PIs for eight macro-indicators over a horizon of up to seven decades from now. Section 5 contains concluding remarks. Plots and details concerning implementation and underlying theory are available in "Appendix."

2 Methods

In this section, we first briefly discuss the three selected approaches for computing PIs, followed by their merits and demerits. Then we propose computational adjustments of the methods proposed by Zhou et al. (2010). Next, our simulations show that these adjustments improve the coverage when the horizon m is large compared to the sample size T, for example when m = T/2. In the following, assume that we observe y_1, ..., y_T and we want to provide a PI for the temporal average (y_{T+1} + ... + y_{T+m})/m. For the rest of the paper, we use the following notation (and analogously for the process of innovations e_t):

$$\bar{y} = \frac{1}{T}\sum_{t=1}^{T} y_t, \qquad \bar{y}_{+1:m} = \frac{1}{m}\sum_{t=T+1}^{T+m} y_t, \qquad \bar{y}_t(m) = \frac{1}{m}\sum_{j=1}^{m} y_{t-j+1}. \tag{2.1}$$

For a specific description of their approach, let us assume a weakly stationary arma(1,1)–garch(1,1) process of the form

$$y_t = \phi y_{t-1} + e_t + \theta e_{t-1}, \qquad e_t = \sigma_t \varepsilon_t, \quad \varepsilon_t \sim WN(0,1), \qquad \sigma_t^2 = \omega + \alpha e_{t-1}^2 + \beta \sigma_{t-1}^2. \tag{2.2}$$

In order to obtain PIs for y_{T+m}, one typically uses the estimated MSE predictors of y_{T+m} and σ²_{T+m} given the past observations:

$$\hat{y}_{T,T+m} = \hat{\phi}^m y_T + \hat{\phi}^{m-1}\hat{\theta}\,\hat{e}_T, \tag{2.3}$$

$$\hat{\sigma}^2_{T,T+m} = \frac{\hat{\omega}}{1-\hat{\alpha}-\hat{\beta}} + \left(\hat{\alpha}+\hat{\beta}\right)^{m-1}\left(\hat{\sigma}^2_{T+1} - \frac{\hat{\omega}}{1-\hat{\alpha}-\hat{\beta}}\right). \tag{2.4}$$

The resulting analytic (anlt) PIs have the form

$$[L,U] = \hat{y}_{T,T+m} + \left[Q_N(\alpha/2),\; Q_N(1-\alpha/2)\right]\left(\sum_{j=1}^{m} \hat{\sigma}^2_{T,T+j}\,\hat{\Psi}^2_{m-j}\right)^{1/2}, \tag{2.5}$$

where Ψ̂_j, j = 0, ..., m−1, are the estimated MA coefficients of y_t and Q_N denotes the standard normal quantile. Besides the fact that these PIs ignore the parameter uncertainty, they would be inappropriate for heavy-tailed processes or when the innovations distribution is asymmetric. In order to deal with these issues, Pascual et al. (2004) introduce a re-sampling strategy using the estimated innovations in order to simulate y_t, t = T+1, ..., T+m, and then compute the conditional distribution of y_{T+m} directly, avoiding strict distributional assumptions on the innovations. Their approach does not need a backward representation and thus captures garch-type processes. They also show the validity of their PIs for arima processes.
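As an illustration, the recursions (2.3)–(2.4) and the analytic PI (2.5) reduce to a few lines of code. This is a minimal sketch under the assumption that the ARMA(1,1)–GARCH(1,1) parameter estimates are already available; the function name and inputs are illustrative, not part of the original implementation:

```python
import math

def analytic_pi(y_T, e_T, sigma2_T1, params, m, z=1.645):
    """Analytic m-step-ahead PI in the spirit of (2.3)-(2.5).
    params = (phi, theta, omega, alpha, beta) are fitted values;
    z is the normal quantile Q_N(1 - alpha/2)."""
    phi, theta, omega, alpha, beta = params
    # point forecast (2.3): phi^m * y_T + phi^(m-1) * theta * e_T
    y_hat = phi**m * y_T + phi**(m - 1) * theta * e_T
    # variance forecasts (2.4): mean reversion around omega / (1 - alpha - beta)
    uncond = omega / (1 - alpha - beta)
    sig2 = [uncond + (alpha + beta)**(j - 1) * (sigma2_T1 - uncond)
            for j in range(1, m + 1)]
    # MA(inf) weights of an ARMA(1,1): psi_0 = 1, psi_j = phi^(j-1) (phi + theta)
    psi = [1.0] + [phi**(j - 1) * (phi + theta) for j in range(1, m)]
    # scale in (2.5): sqrt( sum_j sigma2_{T,T+j} * psi_{m-j}^2 )
    scale = math.sqrt(sum(sig2[j - 1] * psi[m - j]**2 for j in range(1, m + 1)))
    return y_hat - z * scale, y_hat + z * scale
```

For a pure AR(1) with constant variance (θ = α = β = 0) the interval collapses to the familiar textbook formula, which provides a quick sanity check.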
However, we are not aware of any extension of these results to arfima. Computationally, it is simple to obtain both the analytic and bootstrap PIs implied by arfima models, since all necessary ingredients are readily available in the rugarch R-package (see Ghalanos 2017), which efficiently implements the bootstrap PIs of Pascual et al. (2004, 2006). While the re-sampling can also incorporate parameter uncertainty, Pascual et al. (2006) show that for garch models, the coverage of PIs is similar whether one accounts for parameter uncertainty or not. The superiority of the bootstrap PIs over the analytic PIs (2.5) prevails especially when the innovations distribution is asymmetric.

In order to obtain PIs for ȳ_{+1:m} with arfima–garch-type models, we can either (i) use averages of the in-sample observations ȳ_t(m), as defined above, or (ii) average the forecasts of y_t over t = T+1, ..., T+m. In both cases, we fit arfima(p,d,q)–garch(P,Q) models to the data with the rugarch R-package. As already mentioned, the full re-sampling scheme takes into account the parameter uncertainty, however, at the price of a high computational cost for only a minor improvement of performance. Therefore, we use a partial re-sampling scheme which accounts for the uncertainty due to the unknown distribution of innovations. The fractional parameter d ∈ [0, 0.5) is, depending on the series, either fixed to 0 (only for stock returns, see Sect. 3) or estimated by maximum likelihood (ML). The arma orders p, q are restricted to small values and selected by aic; the garch orders (P, Q) are restricted to two candidate specifications. The details of our implementation follow:

Fitting arfima–garch to averaged in-sample observations (avg-series):
1. Compute the series of overlapping rolling averages ȳ_t(m) = m⁻¹ Σ_{i=1}^m y_{t−i+1}, for t = m, ..., T.
2. Fit the selected arfima–garch model to the series of ȳ_t(m).
3. Compute:
   (anlt) m-step-ahead MSE forecasts of the averaged series, analogously to (2.3), substituting the observations y_t by the rolling averages ȳ_t(m). Then, the PIs are given by (2.5).
   (boot) Re-sample the residuals ê_t, t = 1, ..., T, and generate b = 1, ..., B future paths of the averaged series, t = T+1, ..., T+m, recursively using the fitted model and the parameter estimates from the original sample. Obtain the PIs by inverting the empirical distribution of the bootstrapped averages at T+m, b = 1, ..., B.

Fitting arfima–garch to original series and averaging forecasts (avg-forecasts):
1. Fit the selected arfima–garch model to the series y_t.
2. Compute:
   (anlt) the averaged forecast m⁻¹ Σ_{i=1}^m ŷ_{T,T+i}, with ŷ_{T,T+i} the i-step-ahead analytic forecast from (2.3). The PI is [L, U] = ŷ̄_{T,+1:m} + [Q_t(α/2), Q_t(1−α/2)] σ̂ᵃ_{T,+1:m}, where the scaling factor σ̂ᵃ_{T,+1:m} is derived in "Appendix C."
   (boot) Re-sample the residuals ê_t, t = 1, ..., T, and generate b = 1, ..., B future paths ŷᵇ_{T,t}, t = T+1, ..., T+m, recursively using the fitted model and the parameter estimates from the original sample. Compute the temporal averages ŷ̄ᵇ_{T,+1:m}, b = 1, ..., B. We obtain the PIs by inverting the empirical distribution of ŷ̄ᵇ_{T,+1:m}, b = 1, ..., B.

Instead of the normal quantiles, we rather utilize Student's t, where the df is estimated by ML.
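The two aggregation routes — fitting to rolling temporal averages (i) versus averaging the forecasts of the original series (ii) — can be illustrated with a small NumPy sketch; names are illustrative and the model fitting itself (done with rugarch in the paper) is omitted:

```python
import numpy as np

def rolling_averages(y, m):
    """Overlapping rolling averages ybar_t(m), t = m, ..., T.
    Approach (i) fits the arfima-garch model to this derived series."""
    y = np.asarray(y, dtype=float)
    # convolution with a uniform window of length m yields all m-term means
    return np.convolve(y, np.ones(m) / m, mode="valid")

def average_forecasts(point_forecasts):
    """Approach (ii): average the 1,...,m step-ahead forecasts of the original series."""
    return float(np.mean(point_forecasts))
```

Both routes target the same quantity ȳ_{+1:m}, but (i) loses m − 1 effective observations at the start of the sample while (ii) compounds forecast errors over the horizon, which is why the paper includes both in the POOS comparison.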
With both the sample size and the horizon growing proportionally, Müller and Watson (2016) provide asymptotically valid long-term PIs under a rich class of models for series with long memory under a unified spectral representation. In order to capture a larger scope of long-run dynamics in economic time series beyond those described by arfima models, Müller and Watson (2016) consider two additional models, namely (i) the local-level model

$$y_t = y_{1,t} + (bT)^{-1}\sum_{s=1}^{t} y_{2,s}, \quad \text{where } \{y_{1,t}\}, \{y_{2,t}\} \text{ are mutually independent I(0) processes,} \tag{2.6}$$

and (ii) the local-to-unity ar(1) model defined by

$$y_t = (1 - c/T)\, y_{t-1} + y_{1,t}, \quad \text{where } \{y_{1,t}\} \text{ is an I(0) process.} \tag{2.7}$$

The former captures varying "local means" arising, e.g., from stochastic breaks, while the latter is useful for modeling highly persistent series. In (i) the role of the persistent component is determined by the parameter b, while in (ii) it is driven by c. The arfima models with fractional integration parameter d complete the triple of models in Müller and Watson (2016), who design a unified spectral representation of their long-run dynamics using the parametrization ϑ = (b, c, d).

A natural way to incorporate the uncertainty about ϑ, which is crucial for the asymptotic predictive distribution of ȳ_{+1:m}, is to assume a prior for ϑ. A practical drawback of such an approach is that the pre-assigned coverage holds only on average relative to the prior. Hence, Müller and Watson (2016) further robustify their bayes PIs in order to attain "frequentist coverage," i.e., coverage that holds over the whole parameter space.

The main idea behind their approach is to extract the long-run information from selected low-frequency projections of y_t, t = 1, ..., T. Assume that the set of predictors for ȳ_{+1:m} consists of q low-frequency cosine transformations X = (X_1, ..., X_q)ᵀ of y_t. Then the asymptotic conditional density of ȳ_{+1:m} is a function of the covariance matrix of (X_1, ..., X_q, ȳ_{+1:m}), denoted Σ, which in turn can be expressed as a function of properly scaled spectra S(m/T, q, ϑ). When the number of frequencies q is kept small, the high-frequency noise is filtered out, thus providing more robustness. For fixed ϑ = (0, 0, 0) the conditional distribution of ȳ_{+1:m} turns out to be Student's t with q degrees of freedom and the PIs take the form

$$[L,U] = \bar{y} + \left[Q_{t_q}(\alpha/2),\; Q_{t_q}(1-\alpha/2)\right]\sqrt{\frac{m+T}{mTq}\, X^{\mathsf T} X}. \tag{2.8}$$

These (naive) PIs implied by fixed ϑ = (0, 0, 0) can be enhanced by imposing a uniform prior on ϑ, giving equal weight to all combinations of parameters −0.4 ≤ d ≤ 1 and b, c ≥ 0, and using a standard Bayesian procedure to obtain the posterior predictive density, which is no longer a simple t-distribution but rather a mixture of different t-distributions. We denote the implied PIs as (bayes) PIs. Finally, the (robust) PIs additionally guarantee the correct coverage uniformly across the parameter space Θ and simultaneously have optimal (mean) width. We conclude by giving our implementation steps for the (bayes) PIs, leaving the additional steps necessary for computing the (robust) PIs to our "Appendix B."
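To make the low-frequency construction concrete, here is a small sketch of cosine transformations and the naive PI (2.8). The discrete-cosine convention below is a standard one and only approximates Müller and Watson's exact weighting; the caller supplies the Student-t quantile Q_{t_q}(1 − α/2), so no distribution library is assumed:

```python
import numpy as np

def cosine_transforms(y, q):
    """Low-frequency cosine transformations X_1, ..., X_q of y_t
    (a standard DCT-II style convention, used here for illustration)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    t = np.arange(1, T + 1)
    return np.array([np.sqrt(2.0 / T) * np.sum(np.cos(np.pi * j * (t - 0.5) / T) * y)
                     for j in range(1, q + 1)])

def naive_pi(y, m, q, t_quantile):
    """Naive PI (2.8) under fixed theta = (0,0,0):
    ybar +/- Q_{t_q}(1-alpha/2) * sqrt((m+T)/(m*T*q) * X'X)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    X = cosine_transforms(y, q)
    scale = np.sqrt((m + T) / (m * T * q) * (X @ X))
    return y.mean() - t_quantile * scale, y.mean() + t_quantile * scale
```

Because each cosine weight sums to zero over the sample, a constant series produces X ≈ 0 and a degenerate interval at the sample mean, which is a convenient sanity check.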
Bayes PIs (bayes):
1. Set q small and compute the cosine transformations X = (X_1, ..., X_q) of the target series y_t. Standardize them as Z = (Z_1, ..., Z_q)ᵀ = X/√(XᵀX).
2. For a chosen grid of parameter values ϑ = (b, c, d) satisfying −0.4 ≤ d ≤ 1 and b, c ≥ 0, compute Σ(m/T, q, ϑ) following formulas (9) and (20) from Müller and Watson (2016) and using, e.g., a numerical integration algorithm. (Details are given in the "Appendix" of the original paper.)
3. Choose a prior for ϑ = (b, c, d) and compute the posterior covariance matrix

$$\Sigma = \begin{pmatrix} \Sigma_{ZZ} & \Sigma_{Z\bar{e}} \\ \Sigma_{Z\bar{e}}^{\mathsf T} & \Sigma_{\bar{e}\bar{e}} \end{pmatrix}.$$

4. Obtain the covariance matrix of the residuals as Σ_UU = Σ_ēē − Σᵀ_Zē Σ⁻¹_ZZ Σ_Zē.
5. Compute the quantiles Q_{tmix_q}(α/2), Q_{tmix_q}(1−α/2) of the conditional (mixture-t) distribution of ȳ_{+1:m} using, e.g., a sequential bisection approximation. (Details are given in the "Appendix" of the original paper.)
6. The PIs are given by [L, U] = ȳ + [Q_{tmix_q}(α/2), Q_{tmix_q}(1−α/2)] √(XᵀX).

For presentational clarity of their approach, assume

$$y_t = \mu + e_t, \tag{2.9}$$

where e_t is a mean-zero stationary process and μ is the unknown deterministic mean. The PI for the y_t process will be constructed via that of the ê_t = y_t − μ̂ process by adding μ̂ = ȳ to both endpoints of the interval. This is common practice and can also be proved to have the correct coverage using standard arguments. We first provide a summary of the two methods proposed in Zhou et al. (2010) and then we discuss their consistency.

CLT method (clt): If the process e_t shows short-range dependence and light-tailed behavior, then in the light of a quenched CLT, Zhou et al. (2010) propose the following PI for ē_{+1:m}:

$$[L,U] = \left[Q_N(\alpha/2),\; Q_N(1-\alpha/2)\right]\frac{\sigma}{\sqrt{m}}, \tag{2.10}$$

where σ is the long-run standard deviation (sd) of e_t. However, since σ is unknown, it must be estimated. One popular choice is the lag window estimator
$$\hat{\sigma}^2 = \sum_{k=-k_T}^{k_T} \hat{\gamma}_k, \qquad \hat{\gamma}_k = \frac{1}{T}\sum_{t=1}^{T-|k|}(\hat{e}_t - \bar{\hat{e}})(\hat{e}_{t+|k|} - \bar{\hat{e}}). \tag{2.11}$$

The PI for ȳ_{+1:m} with nominal coverage 100(1−α)% is given by

$$[L,U] = \bar{y} + \left[Q_N(\alpha/2),\; Q_N(1-\alpha/2)\right]\frac{\hat{\sigma}}{\sqrt{m}}. \tag{2.12}$$

Quantile method (qtl): If we allow for heavy tails and long memory, the PI for ȳ_{+1:m} with nominal coverage 100(1−α)% can be obtained by

$$[L,U] = \bar{y} + \left[\hat{Q}(\alpha/2),\; \hat{Q}(1-\alpha/2)\right], \tag{2.13}$$

where Q̂(·) is the respective empirical quantile of ê̄_t(m), t = m, ..., T.

The clt method is applicable only for processes with light-tailed behavior and short-range dependence. Let S_t = Σ_{1≤j≤t} e_j. Under stationarity, the problem of predicting ē_{+1:m} = (S_{T+m} − S_T)/m after observing e_1, ..., e_T can be analogously thought of as predicting S_m/√m after observing ..., e_{−1}, e_0. Let F be the σ-field generated by ..., e_{−1}, e_0. Assume E(|e_t|^p) < ∞ for some p > 2. Wu and Woodroofe (2004) proved that, if for some q > 1/2,

$$\left\| E(S_m \mid F) \right\| = O\!\left(\frac{\sqrt{m}}{\log^{q} m}\right), \tag{2.14}$$

then we have the a.s. convergence

$$\Delta\!\left(P(S_m/\sqrt{m} \le \cdot \mid F),\; N(0, \sigma^2)\right) \to 0, \tag{2.15}$$

as m → ∞, where Δ denotes the Lévy distance and σ² = lim_{m→∞} ‖S_m‖²/m is the long-run variance. Verifying (2.14) is elementary for many well-known time series models. We postpone the discussion of the verification of such a result for linear and nonlinear processes to "Appendix D" for the interested reader.

The qtl method is based on the intuitive fact that for the horizon m growing to ∞ and in the light of weak dependence,

$$P\!\left(a \le \frac{e_{T+1} + \cdots + e_{T+m}}{m} \le b \;\Big|\; e_1, \ldots, e_T\right) \approx P\!\left(a \le \frac{e_{T+1} + \cdots + e_{T+m}}{m} \le b\right), \tag{2.16}$$

if m/T is not too small. Thus, it suffices to estimate the quantiles of ē_{+1:m} = (e_{T+1} + ··· + e_{T+m})/m using, e.g., the empirical quantiles of ē_t(m), t = m, ..., T. The power of this method lies in its applicability to heavy-tailed error processes. Zhou et al. (2010) provided a consistency result for this method for the subclass of linear processes (see Theorem 2 in our "Appendix D").

Pros and cons of Pascual et al. (2004, 2006): When comparing the bootstrap PIs to the analytic PIs, the former provide the advantage of including the uncertainty due to the unknown distribution of the residuals and unknown parameters into the uncertainty about the target. In cases when the distribution of the residuals is asymmetric and doubts about the proximity of the estimated model to the true DGP exist, the bootstrap approach dominates the analytic one. Furthermore, regarding the implementation, the analytic PIs are more difficult to obtain since we deal with a nonstandard target. By contrast, the bootstrap PIs are readily available in the R-package rugarch; hence, their estimation is cheap. Concerning the two ways of fitting the models to data, i.e., (i) using the series of rolling temporal averages or (ii) using the original series and averaging the forecasts, there are pros and cons for each approach regarding the implementation and the effective use of our relatively small sample. The literature (e.g., page 302 in Lütkepohl 2006; Marcellino 1999) does not provide any conclusion about the superiority of (i) over (ii), or vice versa. Therefore, we include both (i) and (ii) in the POOS comparison in Sect. 3.
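For reference, the original clt (2.12) and qtl (2.13) constructions of Zhou et al. (2010) reduce to a few lines each; this is a minimal sketch assuming σ̂ is estimated separately, e.g., by the lag window estimator (2.11):

```python
import math

def clt_pi(y, sigma_hat, m, z=1.645):
    """clt PI (2.12): ybar +/- Q_N(1-alpha/2) * sigma_hat / sqrt(m)."""
    ybar = sum(y) / len(y)
    half = z * sigma_hat / math.sqrt(m)
    return ybar - half, ybar + half

def qtl_pi(y, e_hat, m, alpha=0.10):
    """qtl PI (2.13): ybar shifted by empirical quantiles of the
    in-sample rolling averages ebar_t(m), t = m, ..., T."""
    ybar = sum(y) / len(y)
    bars = sorted(sum(e_hat[t - m:t]) / m for t in range(m, len(e_hat) + 1))
    lo = bars[int(alpha / 2 * (len(bars) - 1))]
    hi = bars[int((1 - alpha / 2) * (len(bars) - 1))]
    return ybar + lo, ybar + hi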
Pros and cons of Müller and Watson (2016): Their methods represent the state of the art, being robust against stylized peculiarities of economic time series. Their Bayesian approach accounts for both model and parameter uncertainty, but the focus is only on those parameters ruling the persistence, in contrast to the previously discussed bootstrap approach, where the focus is on the short-term dynamics. To date, no package implementation has been available, which makes the approach less attractive to practitioners. Moreover, the PIs depend on several forecaster-made choices, such as the number of frequencies q to keep, the grid of values for the parameters, and the choice of prior. Even with these inputs fixed, the computation takes long due to multiple advanced numerical approximations required for the (bayes) PIs and further optimization to attain the "frequentist coverage." Precomputed PIs are available for the fixed parameter q = 12 and a grid of ratios m/T; the replication files for these methods are available in MATLAB from M. Watson's homepage.

Pros and cons of Zhou et al. (2010): Their methods provide a simple alternative to the previously discussed ones. As to their scope of applicability, the clt method does not require any specific rate of how fast the horizon can grow compared to the sample size. However, the predictive performance heavily depends on the estimator of the long-term volatility σ. Furthermore, for some processes with heavy-tailed innovations or long-range dependence, the notion of the long-run variance σ² does not exist, and thus this method is not applicable. The attractive features of the qtl PIs are their simplicity and their more general applicability compared with the clt. Their computation requires almost no optimization (at least in our univariate case) and is straightforward. Pascual et al. (2004, 2006) and Müller and Watson (2016) assume that the DGP of y_t is a (possibly long-memory and heteroscedastic version of an) arma process. Zhou et al. (2010) do not a priori assume any parametrization for the dynamics of y_t, but argue that both qtl and clt PIs are valid for arma processes, whereas only the former should be used for processes with a long memory. The simulations of Zhou et al. (2010) confirm their claims. However, as we demonstrate next, the qtl PIs under-perform when T is small and m/T ≈ 1/2. Therefore, we propose some computational adjustments and provide a simulation-based justification of their superiority over the original clt and qtl.

The simulation setup in Zhou et al. (2010), page 1440, assumes a considerably larger sample than ours, while the ratio m/T stays approximately the same. Here we show how one can easily modify the computation of the clt and qtl PIs in order to enhance their performance. In particular, for qtl, we use a stationary bootstrap (Politis and Romano 1994) with the optimal window width as proposed by Politis and White (2004) and Patton et al. (2009) to obtain a set of replicated series. Next, kernel quantile estimators (see Silverman 1986; Sheather and Marron 1990) are used instead of sample quantiles. In order to improve the clt method, we employ a different estimator (cf. 2.18) of σ than (2.11) and account for the estimation uncertainty. These three modifications are then shown to improve the empirical coverage using simulations.

Stationary bootstrap
The procedure starts with the decomposition of the original sample into blocks by choosing the starting point i and the block length L_i from a uniform and a geometric distribution, respectively, independently of the data. For every starting point and length, we re-sample from the blocks of the original series. The resulting blocks are then concatenated. As proposed by Politis and Romano (1994) in their seminal paper, this way of re-sampling retains the weak stationarity and is less sensitive to the choice of block size than the moving block bootstrap (Künsch 1989). It also retains the dependence structure asymptotically, since every block contains consecutive elements of the original series. The two re-sampling schemes differ in the way they deal with the end-effects. Under mixing conditions, the consistency of the stationary bootstrap for the centered and normalized mean process has been studied in the literature. Gonçalves and de Jong (2003) show that under some mild moment conditions, for some suitable c_T → 0,

$$\sup_x \left| P^*\!\left(\sqrt{T}(\bar{y}^* - \bar{y}) \le x\right) - P\!\left(\sqrt{T}(\bar{y} - \mu) \le x\right) \right| = o_P(c_T) \tag{2.17}$$

holds, where ȳ* and P* refer to the re-sampled mean and the probability measure induced by the bootstrap. We conjecture that, along the same line of proof shown by Zhou et al. (2010), it is easy to show consistency results for the bootstrapped versions of the rolling averages of m consecutive realizations. This is immediate for linear processes. For nonlinear processes, one can use the functional dependence measure introduced by Wu (2005) and obtain analogous results. To keep our focus on empirical evaluations, we leave the proof of the consistency for our future work. Interested readers can also look at the arguments by Sun and Lahiri (2006) for the moving block bootstrap and the corresponding changes as suggested in Lahiri (2013) to get an idea of how to show a quantile consistency result. For time series forecasting and quantile regression, the stationary bootstrap has been used by White (2000) and Han et al. (2016), among others.

Kernel quantile estimation

The efficiency of kernel quantile estimators over the usual sample quantiles was proved in Falk (1984) and was extended to several variants by Sheather and Marron (1990). As shown in the latter, the improvement in MSE is of a constant order determined by a functional of the symmetric kernel K used. The theorems mentioned in Sect. 2 are easily extendable to these kernel quantile estimators. We conjecture that one can use the Bahadur-type representations for the kernel quantile estimators, as shown in Falk (1985), and obtain similar consistency results for at least linear processes. We used the popular Epanechnikov kernel K(x) = 0.75(1 − x²)₊ for our computations because of its efficiency in terms of mean integrated square error.

Estimation of σ and degrees of freedom

As mentioned above, Zhou et al. (2010) used the clt PI as in (2.12) with normal quantiles. For many applications in economics and finance, the normal distribution fails to describe the possibly heavy-tailed behavior. Therefore, we propose to substitute the normal with the Student t-distribution, given the fact that σ has to be estimated. Accounting for the estimation uncertainty indeed gives a Student t-distribution of ē_{+1:m} in the limit. The question remains: how many degrees of freedom (df) should we use? Rather than some arbitrary choice such as 5, as used by default in many software packages, or an ML-estimated df, which would be very noisy given the small sample, we link the df to the estimator of σ. This would not be trivial for the lag window estimator (2.11). Instead, we use the subsampling block estimator (see eq. 2, page 142 in Dehling et al. 2013)

$$\tilde{\sigma} = \frac{\sqrt{\pi l/2}}{T} \sum_{i=1}^{\kappa} \left| \sum_{t=(i-1)l+1}^{il} \hat{e}_t \right|, \tag{2.18}$$

with block length l and number of blocks κ = ⌊T/l⌋.
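The subsampling block estimator (2.18) is straightforward to implement; here is a minimal sketch, with the block length l supplied by the caller (the function name is illustrative):

```python
import math

def subsample_sd(e, l):
    """Subsampling block estimator (2.18) of the long-run sd:
    sigma_tilde = sqrt(pi * l / 2) / T * sum_i |block sum i|,
    over kappa = floor(T / l) non-overlapping blocks.
    kappa - 1 later serves as the t degrees of freedom in (2.19)."""
    T = len(e)
    kappa = T // l
    total = sum(abs(sum(e[(i - 1) * l:i * l])) for i in range(1, kappa + 1))
    sigma_tilde = math.sqrt(math.pi * l / 2.0) / T * total
    return sigma_tilde, kappa
```

The normalization can be checked against i.i.d. N(0, σ²) data: each block sum has mean absolute value σ√(2l/π), so the factor √(πl/2)/T makes σ̃ consistent for σ.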
Then the adjusted clt 100(1−α)% PI for ȳ_{+1:m} is given by

$$[L,U] = \bar{y} + \left[Q_{t_{\kappa-1}}(\alpha/2),\; Q_{t_{\kappa-1}}(1-\alpha/2)\right]\frac{\tilde{\sigma}}{\sqrt{m}}, \tag{2.19}$$

where Q_{t_{κ−1}}(·) denotes a quantile of Student's t distribution with κ−1 degrees of freedom, since σ̃² with a proper normalization constant behaves similarly to a χ² distribution with κ−1 degrees of freedom.

The implementation of the adjusted qtl method (kernel-boot) is as follows:
1. Replicate the series e_t B times, obtaining eᵇ_t, t = 1, ..., T, b = 1, ..., B.
2. Compute ēᵇ_t(m) = m⁻¹ Σ_{i=1}^m eᵇ_{t−i+1}, t = m, ..., T, from every replicated series.
3. Estimate the α/2-th and (1−α/2)-th quantiles Q̂(α/2) and Q̂(1−α/2) using the Epanechnikov kernel quantile estimator from the replicated rolling averages, b = 1, ..., B.
4. The PI for ȳ_{+1:m} is [L, U] = ȳ + [Q̂(α/2), Q̂(1−α/2)].
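Putting the pieces together, the kernel-boot steps above can be sketched as follows. This is a simplified sketch: it uses a fixed expected block length rather than the data-driven optimal width of Politis and White (2004), and the kernel quantile estimator is a basic weighted-order-statistic variant; names and defaults are illustrative:

```python
import math
import random

def stationary_bootstrap(e, avg_prob=0.1, rng=None):
    """One stationary-bootstrap replicate (Politis and Romano 1994):
    blocks start at a uniform index, have geometric length with mean
    1/avg_prob, and indices wrap around the sample."""
    rng = rng or random.Random()
    T = len(e)
    out = []
    while len(out) < T:
        start = rng.randrange(T)
        length = 1
        while rng.random() > avg_prob:  # geometric block length
            length += 1
        out.extend(e[(start + k) % T] for k in range(length))
    return out[:T]

def epanechnikov_quantile(xs, p, h=0.05):
    """Kernel quantile estimator: weighted sum of order statistics with
    Epanechnikov weights K(u) = 0.75 (1 - u^2)_+ centred at probability p."""
    xs = sorted(xs)
    n = len(xs)
    w = [0.75 * max(0.0, 1 - ((i / n - p) / h) ** 2) for i in range(1, n + 1)]
    s = sum(w)
    if s == 0:  # bandwidth too small: fall back to the sample quantile
        return xs[min(n - 1, int(p * n))]
    return sum(wi * xi for wi, xi in zip(w, xs)) / s

def kernel_boot_pi(e, ybar, m, alpha=0.10, B=500, rng=None):
    """Adjusted qtl (kernel-boot) PI: bootstrap the residuals, take rolling
    m-averages, then kernel-estimate the alpha/2 and 1-alpha/2 quantiles."""
    rng = rng or random.Random(0)
    pool = []
    for _ in range(B):
        eb = stationary_bootstrap(e, rng=rng)
        pool.extend(sum(eb[t - m:t]) / m for t in range(m, len(eb) + 1))
    return (ybar + epanechnikov_quantile(pool, alpha / 2),
            ybar + epanechnikov_quantile(pool, 1 - alpha / 2))
```

Compared with the original qtl, the bootstrap pool multiplies the number of available rolling averages by B, and the kernel smoothing stabilizes the tail quantiles — exactly the two effects exploited in the simulations below.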
Similarly, the implementation of the adjusted clt method (clt-tdist):
1. Estimate the long-run standard deviation from e_t, t = 1, ..., T, using the subsampling estimator (2.18) with the block length as proposed by Carlstein (1986).
2. The PI is given by [L, U] = ȳ + [Q_{t_{κ−1}}(α/2), Q_{t_{κ−1}}(1−α/2)] σ̃/√m.

An extensive out-of-sample forecasting evaluation based on independent samples is possible only with artificial data. Our simulation setup is designed to assess the performance of the original methods of Zhou et al. (2010) as described in Sect. 2.1.3 and the computational modifications described in Sect. 2.2.1. The simulation results provide evidence for the usefulness of these modifications in an artificial setup based on possibly long-memory arma-like processes. This setup would provide an advantage for the approaches described in Sects. 2.1.1 and 2.1.2, should we challenge them. We leave this task for the next section and real data.

We adopt the following four scenarios for the e_t process from Zhou et al. (2010):
(i) an ar(1) process e_t = φ e_{t−1} + σ ε_t with i.i.d. mixture-normal innovations ε_t;
(ii) a long-memory linear process e_t = σ Σ_{j=0}^∞ (j+1)^{−γ} ε_{t−j}, with noise as in (i);
(iii) the ar(1) process of (i) with i.i.d. stable innovations ε_t with heavy-tail index 1.5 and scale 1;
(iv) the linear process of (ii), with noise as in (iii);
which correspond to (i) light tail and short memory, (ii) light tail and long memory, (iii) heavy tail and short memory, and (iv) heavy tail and long memory DGPs. For each scenario, we generate pseudo-data of length T + m, using the first T observations for estimation and the last m for evaluation. The experiment is repeated N_trials = 10,000 times for each scenario. We choose the sample size T = 260, the horizons m = 20, 30, 40, 60, 90, 130, and calibrate σ to the data (see the footnote below). We construct PIs with nominal coverage p = 1 − α = 90% (see Table 1A), resp. 67% (see Table 1B), and compute the empirical coverage probability

$$\hat{p} = \frac{1}{N_{trials}} \sum_{i=1}^{N_{trials}} I\!\left( [L,U]_i \ni \bar{e}_{i,+1:m} \right), \tag{2.20}$$

where the indicator I for the i-th trial is 1 when the future mean ē_{i,+1:m} is covered by [L, U]_i and 0 otherwise. Furthermore, we report the relative median width

$$\hat{w} = \operatorname{median}\!\left( |U-L|_1, \ldots, |U-L|_{N_{trials}} \right) \Big/ \left( \hat{Q}(1-\alpha/2) - \hat{Q}(\alpha/2) \right), \tag{2.21}$$

where Q̂(·) denotes the corresponding quantile of the empirical distribution of ē_{+1:m}, estimated from ē_{i,+1:m}, i = 1, ..., N_trials. We focus on the evaluation under the longest horizon, m = 130.

(The value of σ was obtained from an ar(1) model fitted to the full data set of S&P 500 returns.)

Table 1 Simulation results: comparison of coverage probabilities for the two original methods proposed by Zhou et al. (2010) and the computational adjustments thereof proposed in Sect. 2.2. Reported are the coverage probability p̂ and the relative median width ŵ for horizons 20, 30, 40, 60, 90, 130; panel (A) gives the results of the simulated forecasting experiment for nominal coverage probability p = 1 − α = 90%.
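The simulation loop behind the reported coverage and width measures can be sketched end to end for a clt-type method. The AR coefficient, the mixture-noise weights, the block-length rule and the normal quantile standing in for the Student-t quantile (e.g., scipy.stats.t.ppf with κ − 1 df) are all illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from statistics import NormalDist

def carlstein_sigma(e, block=None):
    """Subsampling estimate of the long-run sd from non-overlapping blocks."""
    T = len(e)
    l = block or max(2, int(round(T ** (1 / 3))))   # illustrative block-length rule
    kappa = T // l
    bm = e[:kappa * l].reshape(kappa, l).mean(axis=1)
    return np.sqrt(l * np.mean((bm - bm.mean()) ** 2)), kappa

def clt_pi(y, m, alpha=0.10):
    """PI for the mean of the next m values; a normal quantile stands in for
    the Student-t quantile with kappa - 1 df used by clt-tdist."""
    sigma, kappa = carlstein_sigma(y - y.mean())
    q = NormalDist().inv_cdf(1 - alpha / 2)
    half = q * sigma / np.sqrt(m)
    return y.mean() - half, y.mean() + half

def coverage_and_width(dgp, n_trials, T, m, alpha=0.10, seed=0):
    """Empirical coverage p-hat as in (2.20) and relative median width as in (2.21)."""
    rng = np.random.default_rng(seed)
    hits, widths, targets = [], [], []
    for _ in range(n_trials):
        y = dgp(T + m, rng)
        lo, hi = clt_pi(y[:T], m, alpha)
        target = y[T:].mean()                       # the targeted future average
        hits.append(lo <= target <= hi)
        widths.append(hi - lo)
        targets.append(target)
    iqr = np.quantile(targets, 1 - alpha / 2) - np.quantile(targets, alpha / 2)
    return np.mean(hits), np.median(widths) / iqr

def ar1_mixture(n, rng, phi=0.5):
    """Scenario-(i)-style DGP: AR(1) with mixture-normal noise (values illustrative)."""
    eps = np.where(rng.random(n) < 0.9, rng.normal(0, 1, n), rng.normal(0, 3, n))
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = phi * e[t - 1] + eps[t]
    return e
```

Running `coverage_and_width(ar1_mixture, 300, 100, 20)` mirrors one cell of Table 1: a coverage estimate and the width of the PI relative to the empirical inter-quantile range of the targets.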
Scenario (i) When m = 130, the original qtl covers the future realizations in only around 48% of cases, while the nominal coverage is 90%. Employing the kernel quantile adjustment on qtl increases this number by 4 percentage points (pp), and combining it with the bootstrap adjustment yields an additional 26pp on top. Intuitively, using Student's t quantiles (instead of normal ones) leads to a higher coverage probability for the clt. As expected, the two methods perform similarly well in this particular scenario.
Scenario (ii) Long memory of the DGP has a strongly negative impact on both methods. The combined kernel-bootstrap adjustment increases the coverage of qtl by 20pp, which is, however, still very low. The same holds for the performance of clt under the t-quantile adjustment.
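A remedy applied later in Sect. 3 and Appendix B is to fractionally difference the series before re-sampling and to invert the transformation afterward. A minimal sketch of (1 − L)^d and its inverse via the binomial-weight recursion, with d = 0.4 purely illustrative:

```python
import numpy as np

def frac_diff_weights(d, n):
    """Binomial weights of (1 - L)^d: w_0 = 1, w_j = w_{j-1} * (j - 1 - d) / j."""
    w = np.empty(n)
    w[0] = 1.0
    for j in range(1, n):
        w[j] = w[j - 1] * (j - 1 - d) / j
    return w

def frac_diff(x, d):
    """de_t = (1 - L)^d e_t, truncated at the start of the sample."""
    w = frac_diff_weights(d, len(x))
    return np.array([np.dot(w[: t + 1], x[t::-1]) for t in range(len(x))])

def frac_integrate(x, d):
    """Inverse transform (1 - L)^{-d}, applied before the kernel quantile step."""
    return frac_diff(x, -d)
```

Because both operators are truncated the same way, `frac_integrate(frac_diff(x, d), d)` recovers x exactly up to floating-point error, and d = 1 reduces to ordinary first differencing.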
Scenario (iii) Heavy-tailed noise also has a negative impact on the original clt (its coverage probability falls by 13pp compared to the light-tailed case), whereas qtl, as expected, is more robust (it falls by only 4–6pp). The kernel-bootstrap adjustment increases the coverage probability by 27pp for the qtl, whereas clt-tdist yields only a negligible improvement over the original clt.

Scenario (iv)
The combined effect of (ii) and (iii) cuts the coverage probabilities further down, below 45%. The proposed adjustments increase the coverage probabilities by up to 10pp.

Overall, for the short and medium horizons, i.e., m = 20, ..., 60, we corroborate the conclusion of Zhou et al. (2010) that the (original) clt loses against the (original) qtl. However, both original methods exhibit a rapid decay in their coverage probabilities as the forecasting horizon grows. For instance, in scenario (iv) the gap between the horizons m = 20 and m = 130 for the qtl is 47pp. Concerning the width of the PIs, both the adjusted and the original methods underestimate the dispersion, and the gap between the width of the PIs and the width of the empirical inter-quantile range increases with the horizon. However, our computational adjustments improve the original methods consistently over all scenarios. The improvement is most remarkable for the combined adjustment (kernel-boot).

This section summarizes a real-data POOS forecasting comparison for:
(zxw) the adjusted PIs of Zhou et al. (2010),
(mw) the robust bayes PIs of Müller and Watson (2016),
(prr) the bootstrap PIs of Pascual et al. (2004, 2006), augmented by their analytical counterpart.
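The rolling pseudo-out-of-sample design described in the next subsection can be sketched as follows. This is a minimal sketch: the naive normal-quantile interval is merely a stand-in forecasting method, while the sample size T = 260 and the non-overlapping step of length m follow the text.

```python
import numpy as np
from statistics import NormalDist

def naive_pi(train, m, alpha=0.10):
    """Stand-in PI: sample mean +/- a normal quantile times the sd of the mean."""
    q = NormalDist().inv_cdf(1 - alpha / 2)
    half = q * train.std(ddof=1) / np.sqrt(m)
    return train.mean() - half, train.mean() + half

def poos(y, pi_method, T=260, m=130, alpha=0.10):
    """Rolling POOS evaluation with non-overlapping m-step target windows."""
    hits, widths = [], []
    for origin in range(T, len(y) - m + 1, m):       # re-estimate at each origin
        lo, hi = pi_method(y[origin - T:origin], m, alpha)
        target = y[origin:origin + m].mean()         # targeted future average
        hits.append(lo <= target <= hi)
        widths.append(hi - lo)
    return np.mean(hits), np.median(widths), len(hits)
```

Swapping `naive_pi` for a kernel-boot or clt-tdist implementation reproduces one row of the real-data comparison; the third return value is the number of non-overlapping trials the sample length permits.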
Data and setup for POOS exercise
The data on a univariate time series y_t are sampled at equidistant times t = 1, ..., T. We forecast the average of m future values ȳ_{+1:m} = m^{−1} Σ_{t=1}^{m} y_{T+t}. We design our POOS comparison using the following three time series (plots of the series are given in "Appendix A"):
(spret) S&P 500 value-weighted daily returns including dividends, available from January 2, 1926, till December 31, 2014, with a total of 23,535 observations,
(spret2) squared daily returns, over the same period, and
(tb3m) nominal interest rates for 3-month US Treasury Bills, available from April 1, 1954, till August 13, 2015, with a total of 15,396 observations.
The sample size for post-WWII quarterly macroeconomic time series is 4 × 68 = 272 observations. We mimic the macroeconomic forecasting setup in that we use rolling sample estimation with sample size T = 260 days (i.e., 1 year of daily data) and forecasting horizons m = 20, 30, 40, 60, 90 and 130 days. The rolling samples overlap in the last (resp. first) T − m observations, so that, e.g., for m = 130 (resp. m = 20) we obtain N_trials = 178 (resp. 1163) non-overlapping POOS trials. All models are selected and all parameters estimated anew at each forecast origin.

The simulation results in Sect. 2.2.2 have shown that the zxw PIs have decent coverage for short-memory e_t ∼ I(0), but lose coverage rapidly if the process has long memory. As a remedy, we apply an appropriate transformation before we use re-sampling, and we perform the reversed transformation immediately before the estimation of the kernel quantiles (see "Appendix B"). The re-sampling scheme also benefits from the prior transformation, since the stationary bootstrap is suitable for weakly dependent series. For zxw, we assume spret ∼ I(0), spret2 ∼ I(d) with fractional d, and tb3m ∼ I(1), and we replace e_t by the respective differences de_t = (1 − L)^d e_t (with L as the lag operator). Concerning the bootstrap/analytic PIs for arfima(p,d,q)–garch(P,Q), we report only the best empirical coverage probability p̂ and the corresponding relative width ŵ among two choices of (P, Q).

POOS results
As in Sect. 2.2.2, we evaluate the coverage probability (2.20) and the relative median width (2.21), for nominal coverage probabilities 90% (see Table 2A) and 67% (see Table 2B). Overall, the results show a tight competition between mw and zxw. Better coverage probability is generally compensated by a larger width, hence less precision. Only for tb3m does zxw perform better in both respects. The prr PIs show mixed performance, and it is difficult to draw any general conclusion whether one should prefer averaging the series (series) or averaging the forecasts (4cast), and whether to use analytic formulas (anlt) rather than bootstrapping (boot) to obtain PIs. We keep our focus on the coverage achieved at the longer horizons.

spret Based on the simulation results for short-memory and heavy-tailed series, we expect that both methods of zxw should give decent coverage probability close to the nominal level. The real-data performance is better than the artificial data suggested, with an average drop of 9 resp. 12pp below the nominal level for kernel-boot resp. clt-tdist. On the other hand, mw exceeds the nominal coverage even with the naive method. The difference in coverage probability between robust and kernel-boot reaches 15pp. The zxw methods provide an advantage regarding the width, as robust has twice the width of kernel-boot at the longest horizon. prr gives decent coverage only for the short horizons. For medium and long horizons, both the coverage and the width of prr exhibit a rapid decay. Averaging the series dominates averaging the forecasts by 10pp. To our surprise, the bootstrap PIs do not outperform the analytic PIs. Regarding the width, the mw intervals are 40pp more conservative than the empirical inter-quantile range of the out-of-
sample mean-returns, whereas zxw resp. prr are 23–30pp resp. 31–43pp below the inter-quantile range width.

Table 2 Real-data results: comparison of coverage probabilities for zxw, mw and prr on each of the three daily time series spret, spret2 and tb3m, for nominal coverage probabilities p = 1 − α = 90% (panel A) and 67% (panel B) and horizons of 20, 30, 40, 60, 90 and 130 days. The rows are kernel-boot and clt-tdist (zxw), robust, bayes and naive (mw), and series-anlt, series-boot, 4cast-anlt and 4cast-boot (prr). The reported values are coverage probabilities and median widths of the respective prediction intervals based on 100+ rolling pseudo-out-of-sample trials; the median widths are reported relative to the width of the respective inter-quantile range computed from the empirical distribution of the targeted out-of-sample averages.

spret2 Realized volatility is known for its persistence and heavy tails. The mw intervals give slightly lower coverage probabilities than zxw, compensated by a relatively smaller width, thus better precision. With a growing horizon, all prr methods suffer a drop in coverage, at least 40pp below the nominal level, accompanied by the largest reduction of width among all methods. The bootstrap PIs dominate the analytic ones, and the competition between 4cast and series is tight. Concerning the relative width, when compared to the previous case of returns, all methods provide very narrow PIs. We believe that the seemingly shrinking width of the PIs is caused by a larger dispersion of the entire spret2 series [entering the denominator of (2.21)] compared to the dispersion of each local average [the numerator in (2.21)]. Note that for spret2 the denominator in (2.21) does not provide an adequate scale, and the discrepancy becomes even worse for the more persistent tb3m.

tb3m Interest rates exhibit strong persistence, and for enhanced performance of zxw, we again apply differencing, but with d =
1. Note that the naive PIs have coverage probabilities as low as 25%. The coverage probabilities of all methods are lower than for the previous two series, but zxw performs better than mw for all horizons. Moreover, zxw gives better results in terms of width. The coverage probability of prr falls far below the nominal level as the horizon grows. Here, series dominates 4cast, and the bootstrap PIs are inferior to the analytic PIs, even if at only about half the nominal coverage. Except for spret, clt-tdist gives slightly higher coverage probabilities than kernel-boot, corresponding to smaller precision. Eventually, we prefer the kernel-boot method and use it in the following section for computing PIs for eight economic time series and S&P 500 returns.

Müller and Watson (2016), in Table 5 on pages 1731–1732, give their long-run PIs for eight quarterly post-WWII US economic time series and quarterly returns. Since our kernel-boot method performed well in the preceding real-data POOS comparison, we employ it to obtain alternative PIs for these series. We report these PIs in Tables 3 and 4. Müller and Watson (2016) compare their PIs to those published by the CBO. They conclude that some similarities between their PIs and the CBO's for series such as GDP are due to a combination of (i) the CBO's ignoring parameter uncertainty and (ii) the CBO's ignoring possible anti-persistence of GDP during the Great Moderation. Since the effects of (i) and (ii) on the width of the PIs are opposite, they eventually seem to cancel out.

The eight economic time series are real per capita GDP, real per capita consumption expenditures, total factor productivity, labor productivity, population, inflation (PCE, the personal consumption expenditure deflator), inflation (CPI, the consumer price index) and Japanese inflation (CPI), all transformed into log-differences.
Table 3 Prediction intervals for long-run averages of quarterly post-WWII growth rates, at nominal coverage probabilities 67% and 90%, for horizons of 10, 25 and 50 years. The rows are GDP/Pop, Cons/Pop, TF prod., Labor prod., Population, PCE infl., CPI infl., Jap. CPI infl. and Returns. The eight macroeconomic time series are real per capita GDP, real per capita consumption expenditures, total factor productivity, labor productivity, population, the personal consumption expenditure deflator, CPI inflation and Japanese inflation, all transformed into log-differences. The prediction intervals provide an alternative to Table 5 of Müller and Watson (2016), who report intervals also for the horizon m = 75 years for these short post-WWII quarterly series. However, since this horizon would exceed the sample size, we cannot provide the kernel-boot alternative; we present results for this horizon in the next table, where we use the longer yearly series.

Table 4
Prediction intervals for long-run averages of annual growth rates and annual S&P 500 returns, at nominal coverage probabilities 67% and 90%, for horizons of 10, 25, 50 and 75 years. The series are real per capita GDP, real per capita consumption expenditures, population, CPI inflation and Japanese inflation, all transformed into log-differences. This table provides alternative prediction intervals to those reported in Table 5 of Müller and Watson (2016).

(Plots are given in "Appendix A.") The data are available from 1Q-1947 till 4Q-2014, and we forecast them over the next m = 10, 25 and 50 years. For a subset of these series, we report results based on a longer (yearly) sample starting in 1920, and we add the horizon m = 75 years for these yearly series.

For per capita real GDP, per capita consumption and productivity, we use differencing before computing the kernel-boot PIs. Thus, these intervals are wider than in Müller and Watson (2016), especially those for GDP. This case is similar to the case of realized volatility in the previous section. Wide PIs are often considered a failure of the forecasting method or model; on the other hand, they can also reflect higher uncertainty about the future of the series. The width of the PIs for GDP is not surprising given that, similarly to the CBO, we do not account for the possible anti-persistence during the Great Moderation. With the longer yearly sample, our PIs get even wider, as a result of the higher volatility in the early twentieth century. Interestingly, the growth in labor productivity seems to be generally higher than reported by Müller and Watson (2016).
Consumption, population and inflation are well known to be quite persistent. Therefore, similarly as in the case of interest rates, we would expect that kernel-boot could give better coverage and possibly narrower PIs than robust. The uncertainty is similarly large according to both our kernel-boot and robust, but the location is generally shifted downward, especially for inflation. For quarterly returns, we might expect kernel-boot to give less conservative, thus narrower, estimates, and we see this happening, with the discrepancy growing with the forecasting horizon. It is clear that robust is very conservative in its uncertainty about positive returns, where the difference from kernel-boot reaches 11pp. Employing the longer yearly time series makes the difference fall to 3pp. On the other hand, 3pp is a lot from an investor's perspective.
We have constructed prediction intervals for univariate economic time series. Our forecasting comparison shows that even the simple methods of Zhou et al. (2010) provide a valid alternative to the sophisticated prediction intervals designed specifically for the economic framework by Müller and Watson (2016). However, based on our simulation results, we emphasize that both the methods and the series need to be suitably adjusted, especially under the small sample constraint, which is quite common in practice. Based on the comparison results, we eventually provided alternative long-run prediction intervals for eight US economic indicators.

Forecasting the average growth of economic series over the coming decades is a very ambitious task, and naturally, there are doubts about its usefulness in practice. The test of Breitung and Knüppel (2018) of whether a forecast is informative is based on the prediction error variance. They conclude that economic forecasts beyond a horizon of several quarters become uninformative. At first sight, such a claim seems to be an argument against pursuing the research of Müller and Watson (2016) and Zhou et al. (2010). However, there are some differences in the assumptions and targets which have to be carefully analyzed before making such statements. The assumption of a long-memory component is crucial, and it is hard to verify and to distinguish from a possible structural break. In this paper, we did not tackle the issue of whether long-term predictions are informative or not; we instead probed into the existing methods and provided new empirical comparison results.

Throughout this paper, we focused on PIs estimated from historical data on the predicted series. A multivariate or high-dimensional extension would, of course, be attractive. It is widely recognized that big data contain additional forecasting power. Unfortunately, in the economic literature, the boom of forecasting with many predictors (e.g., Stock and Watson 2012; Elliott et al.
2013; Kim and Swanson 2014) is mainly focused on short horizons and point forecasting (for an exception, see Bai and Ng 2006). This is not a coincidence. Many economic time series exhibit persistence (of varying degrees), and this is their essential property in the long run. These long-term effects, combined over many series, are difficult to understand, partially due to their dependence on unknown nuisance parameters (see Elliott et al. 2015). The role of cointegration in long-run forecasting is investigated by Christoffersen and Diebold (1998).

We did not use some methods, such as quantile (auto-)regression (Koenker 2005), in the current study, and the out-of-sample forecasting comparison could be enhanced by statistical tests (see Clements and Taylor 2003; Gneiting and Raftery 2007, for example).

An extension (including the theory) of Zhou et al. (2010) to a high-dimensional regression framework using the LASSO estimator is currently being developed. Even more challenging is the case of multivariate target series and the subsequent construction of simultaneous prediction intervals, which can have interesting implications for market trading strategies.
Acknowledgements
Open access funding provided by University of Vienna. We would like to thank Jörg Breitung, Francis Diebold, Alexander Kment, Lubos Pastor, Irina Pimenova, Justin Veenstra, Mark Watson, as well as the participants of the conference Big Data in Predictive Dynamic Econometric Modeling held at the University of Pennsylvania and the 1st Vienna Workshop on Economic Forecasting 2018 held at the Institute for Advanced Studies, for helpful discussions and for answering our questions. Special thanks go to Eric Nesbitt and the two anonymous referees. We also acknowledge the computational resources provided by the Vienna Scientific Cluster. Note that the opinions expressed in the paper are those of the authors and do not necessarily reflect the opinions of the Institute for Financial Policy.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
Funding
The research was partly supported by grant NSF/DMS 1405410. M. Chudý received financial support from the J.W. Fulbright Commission for Educational Exchange in the Slovak Republic, The Ministry of Education, Science, Research and Sport of the Slovak Republic and the Stevanovich Center for Financial Mathematics.
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Appendix A: Figures of time series used in Sects. 3 and 4
See Figs. 1, 2 and 3.

Fig. 1 Daily time series: a S&P 500 value-weighted daily returns incl. dividends, b squared returns, c nominal interest rates for 3-month US Treasury Bills

Fig. 2 Annual time series (growth rates): a real per capita GDP, b real per capita consumption expenditures, c inflation (CPI), d population

Fig. 3 Quarterly time series (growth rates): a real per capita GDP, b real per capita consumption expenditures, c total factor productivity, d labor productivity, e population, f prices (PCE), g inflation (CPI), h Japanese inflation
Appendix B: Additional steps for implementation of zxw and mw
All macro-series in Sect. 4 are transformed to log-differences. This does not preclude long-memory dynamics or even a unit root. Note that if y_t is I(1) and has a deterministic trend component rather than a constant level, the location of the PI would have to be shifted by a multiple of the trend increment Δy instead of being centered at ȳ.

kernel-boot:
1. Compute the mean-adjusted series e_t = y_t − ȳ, t = 1, ..., T.
2. Fix d according to the assumed order of integration and compute de_t = (1 − L)^d e_t, t = 1, ..., T, where L denotes the lag operator.
3. Replicate de_t, t = 1, ..., T, B times, getting de^b_t, t = 1, ..., T, b = 1, ..., B.
4. Compute the series of overlapping rolling means ¯de^b_t(m) = m^{−1} Σ_{i=1}^{m} de^b_{t−i+1}, t = m, ..., T, from every replicated series.
5. Estimate the quantiles Q̂(α/2) and Q̂(1 − α/2) from ē^b_T(m), b = 1, ..., B, with T = 260, obtained as ē^b_T(m) = m^{−1} Σ_{i=1}^{m} (1 − L)^{−d} de^b_{T−i+1}.
6. The PI is given by [L, U] = ȳ + [Q̂(α/2), Q̂(1 − α/2)].

clt-tdist:
1. Compute the mean-adjusted series e_t = y_t − ȳ, t = 1, ..., T.
2. Fix d according to the assumed order of integration and compute de_t = (1 − L)^d e_t, t = 1, ..., T, where L denotes the lag operator.
3. Estimate the long-run standard deviation σ̃ of de_t, t = 1, ..., T.
4. Compute the long-run standard deviation of e_t: for d = 1, σ_e(σ̃) = σ̃ √((m + 1)(2m + 1)/(6m)); for fractional d, σ_e(σ̃) = σ̃ m^{−1} √( Σ_{i=1}^{m} ( Σ_{j=0}^{m−i} (−1)^j C(−d, j) )² ), where C(·, ·) denotes the generalized binomial coefficient.
5. The PI is given by ȳ + [Q_{t_{κ−1}}(α/2), Q_{t_{κ−1}}(1 − α/2)] σ_e.

robust (after steps 1–4):
5.1 Compute the weights for the specific choice of q and m/T and the prior from step 3.
5.2 Numerically approximate the so-called least favorable distribution (LFD) of θ for the specific choice of q and m/T (see the "Appendix" of Müller and Watson 2016).
5.3 Using the weights and the LFD, solve the minimization problem on page 1721 in Müller and Watson (2016) to get quantiles which give uniform coverage and minimize the expected width of the PIs.
6. The same as in bayes, with the robust quantiles.

Appendix C: Derivation of an error standard deviation for time-aggregated forecast
The formula for computing the prediction error standard deviation is âs_{T,+1:m} = m^{−1} √( Σ_{i=1}^{m} ( σ̂_{T,T+i} Σ_{j=0}^{m−i} Ψ̂_j )² ), where Ψ̂_j, j = 0, ..., m − 1, are the estimates of the coefficients from the causal representation of y_t, and σ̂_{T,T+i} is the garch forecast of the innovation standard deviation. For simplicity, we show the derivation for the case of constant innovation variance.

Assume that y_t has the causal representation y_t = ϵ_t + Ψ_1 ϵ_{t−1} + ⋯, where ϵ_t ∼ (0, σ²) is the innovation process with constant second moment. Standing at time T, the i-th-step-ahead prediction error can be expressed as pe_{T,i} = ϵ_{T+i} + Ψ_1 ϵ_{T+i−1} + ⋯ + Ψ_{i−1} ϵ_{T+1}.

See http://mathworld.wolfram.com/BinomialSeries.html.
The average prediction error over the horizons i = 1, ..., m is therefore given by

p̄e_{T,+1:m} = m^{−1} Σ_{i=1}^{m} Σ_{j=1}^{i} Ψ_{i−j} ϵ_{T+j},   with Ψ_0 = 1.

Now, this can be rewritten as

p̄e_{T,+1:m} = m^{−1} ( ϵ_{T+1} Σ_{j=0}^{m−1} Ψ_j + ϵ_{T+2} Σ_{j=0}^{m−2} Ψ_j + ⋯ + ϵ_{T+m−1} (Ψ_0 + Ψ_1) + ϵ_{T+m} ),

where the inner sums are the constants c_{m−1}, c_{m−2}, ..., c_1, with c_k = Σ_{j=0}^{k} Ψ_j and c_0 = Ψ_0 = 1. Since the innovations are uncorrelated, we can compute the variance of the average prediction error over the horizons i = 1, ..., m as

var(p̄e_{T,+1:m}) = (σ/m)² Σ_{i=1}^{m} c²_{m−i}.

Appendix D: Discussion on CLT and QTL methods
For the interested reader here we provide some discussion on the justification of thetwo original methods from Zhou et al. (2010) and how one can verify them in linear andpossibly nonlinear processes. First, we discuss a result for the CLT method. Assumethe process e t admits the following linear form e t = ∞ (cid:2) j = a j (cid:18) t − j , (5.1)where (cid:18) t are mean-zero, independent and identically distributed (i.i.d.) random vari-ables with finite second moment. For this structural form, we can evaluate (2.14). Weassume a particular decay rate of a i and state the following theorem. Theorem 1
Assume the process $e_t$ admits representation (5.1), where $a_i$ satisfies
$$
a_i = O\big(i^{-\chi}(\log i)^{-A}\big), \quad \chi > 1,\ A > 0, \tag{5.2}
$$
where larger $\chi$ and $A$ mean a faster decay of dependence. Further assume $A > 1/2$ if $1 < \chi \le 3/2$. Then the sufficient condition (2.14) holds, and hence the convergence (2.15) to the asymptotic normal distribution follows.

Proof
$$
\|E(S_m \mid \mathcal{F}_0)\|^2 = \big\|(a_1 + \cdots + a_m)\epsilon_0 + (a_2 + \cdots + a_{m+1})\epsilon_{-1} + \cdots\big\|^2 = \sigma^2 \sum_{i=1}^{\infty} b_i^2, \tag{5.3}
$$
where $b_i = a_i + \cdots + a_{i+m-1}$. Note that $\sum_{i=1}^{\infty} b_i^2$ assumes the following order depending on $\chi$:
$$
\sum_{i=1}^{\infty} b_i^2 = \begin{cases} O\big(m^{3-2\chi}(\log m)^{-2A}\big), & \text{for } 3 - 2\chi > 0,\\ O(1), & \text{for } 3 - 2\chi \le 0. \end{cases} \tag{5.4}
$$
In both cases the bound is $o(m)$, which yields the sufficient condition (2.14). $\square$

Note that Theorem 1 concerns only linear processes. This class already covers a large class of time series processes. However, we do not necessarily require linearity of the error process $(e_t)$. One can equivalently use the functional dependence measure introduced in Wu (2005) to state an analogous result for stationary, possibly nonlinear, error processes of the form $e_t = G(\epsilon_t, \epsilon_{t-1}, \ldots)$, where the $\epsilon_i$ are i.i.d. random variables. For this process, assuming $p \ge 2$, one defines
$$
\delta_{j,p} = \big\|e_j - G(\epsilon_j, \ldots, \epsilon_0^{*}, \ldots)\big\|_p,
$$
where $\epsilon^{*}_{(\cdot)}$ is an i.i.d. copy of the $\epsilon_{(\cdot)}$ process. For the specific case of $e_t$ assuming the linear form specified in (5.1), we have $\delta_{j,p} = |a_j|\,\|\epsilon_0 - \epsilon_0^{*}\|_p$. This lays out a straightforward way in which our results for the linear process can be extended to nonlinear processes. Next, for the sake of completeness, we borrow a result from Zhou et al. (2010) that discusses quantile consistency for the QTL method. Recall that, as promised, this method allows for the situation where the i.i.d. innovations $\epsilon_t$ in the decomposition (5.1) have either light tails, i.e., $E(\epsilon_t^2) < \infty$, or heavy tails, i.e., $0 < \alpha < 2$, where $\alpha = \sup\{r > 0 : E(|\epsilon_t|^r) < \infty\}$. We impose the following conditions on the coefficients for short- or long-range dependence and also assume boundedness of the density of $\epsilon_t$ in the following sense:
$$
\begin{aligned}
(SRD)&:\ \sum_{j=0}^{\infty} |a_j| < \infty,\\
(DEN)&:\ \sup_{x\in\mathbb{R}} \big(f_\epsilon(x) + |f'_\epsilon(x)|\big) < \infty,\\
(LRD)&:\ a_j = O\big((j+1)^{-\gamma} l(j)\big),\quad 1/\alpha < \gamma < 1,\ l(\cdot)\ \text{a slowly varying function (s.v.f.)},
\end{aligned} \tag{5.5}
$$
where an s.v.f. is a function $g(x)$ such that $\lim_{x\to\infty} g(tx)/g(x) = 1$ for every fixed $t > 0$. The condition (SRD) is a classic short-range dependence condition (see Box et al. 2015 for more discussion). (LRD) refers to long memory of the time series and is satisfied by a large class of models such as ARFIMA. (DEN) is also a mild condition since, by the inversion theorem, all symmetric stable distributions satisfy it. We borrow the following result from Zhou et al. (2010) for linear processes. It is worth noting that one can extend it to nonlinear processes as well by defining the coupling-based dependence on the predictive density of $e_t$, as done in Zhang and Wu (2015), but we postpone that discussion to a future paper.

Quantile consistency results for the quantile method
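Before the formal statement, a small simulated sketch of this quantile construction may be useful. It is only an illustration under assumed settings (an AR(1) error process, i.e., light-tailed (SRD), with standard normal innovations); for simplicity it reads a 90% interval for the future $m$-period average directly off the empirical quantiles of the overlapping block means, omitting the $H_m$ normalization, which matters for the asymptotic theory rather than for this toy computation:

```python
import numpy as np

rng = np.random.default_rng(42)

# simulate a short-range-dependent, light-tailed error process:
# AR(1), e_t = phi * e_{t-1} + eps_t, eps_t ~ N(0, 1)
T, m, phi = 2000, 50, 0.5
eps = rng.standard_normal(T + 200)
e = np.empty(T + 200)
e[0] = eps[0]
for t in range(1, T + 200):
    e[t] = phi * e[t - 1] + eps[t]
e = e[200:]                                  # drop burn-in

# overlapping block means e_bar_t(m) for t = m, ..., T
csum = np.concatenate(([0.0], np.cumsum(e)))
block_means = (csum[m:] - csum[:-m]) / m     # (T - m + 1) overlapping means

# empirical quantiles of the block means give a 90% prediction
# interval for the future m-period average of the error process
lo, hi = np.quantile(block_means, [0.05, 0.95])
print(lo, hi)
```

In practice the interval would be centered around a point forecast of the aggregated series rather than around zero, and the number of blocks $T - m + 1$ must be large relative to $m$ for the quantiles to be reliable, which is exactly what the rate conditions below on $m$ versus $T$ formalize.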
For a fixed $0 < q < 1$, let $\hat{Q}(q)$ and $\tilde{Q}(q)$ denote the $q$th sample quantile and actual quantile of $\big(m\bar{e}_t(m)/H_m\big)$, $t = m, \ldots, T$, where, using (5.5),
$$
H_m = \begin{cases}
\sqrt{m}, & \text{if (SRD) holds and } E(\epsilon_j^2) < \infty,\\
\inf\{x : P(|\epsilon_i| > x) \le 1/m\}, & \text{if (SRD) holds and } E(\epsilon_j^2) = \infty,\\
m^{3/2-\gamma}\, l(m), & \text{if (LRD) holds and } E(\epsilon_j^2) < \infty,\\
\inf\{x : P(|\epsilon_i| > x) \le 1/m\}\, m^{1-\gamma}\, l(m), & \text{if (LRD) holds and } E(\epsilon_j^2) = \infty.
\end{cases} \tag{5.6}
$$
We have the following rates of convergence of the quantiles, depending on the nature of the tails and of the dependence:

Theorem 2 (Zhou et al. 2010, Theorems 1-4) (Quantile consistency)

– Light-tailed (SRD): Suppose (DEN) and (SRD) hold and $E(\epsilon_j^2) < \infty$. If $m/T \to 0$, then for any fixed $0 < q < 1$,
$$|\hat{Q}(q) - \tilde{Q}(q)| = O_P\big(m/\sqrt{T}\big). \tag{5.7}$$
– Light-tailed (LRD): Suppose (LRD) and (DEN) hold with $\gamma$ and $l(\cdot)$ in (5.5). If $m^{3/2-\gamma}/\big(T^{3/2-\gamma} l(T)\big) \to 0$, then for any fixed $0 < q < 1$,
$$|\hat{Q}(q) - \tilde{Q}(q)| = O_P\big(m\, T^{1/2-\gamma} |l(T)|\big). \tag{5.8}$$
– Heavy-tailed (SRD): Suppose (DEN) and (SRD) hold and $E(|\epsilon_j|^{\alpha}) < \infty$ for some $1 < \alpha < 2$. If $m = O(T^k)$ for some $k < (\alpha-1)/(\alpha+1)$, then for any fixed $0 < q < 1$,
$$|\hat{Q}(q) - \tilde{Q}(q)| = O_P\big(m\, T^{\nu}\big) \quad \text{for all } \nu > 1/\alpha - 1. \tag{5.9}$$
– Heavy-tailed (LRD): Suppose (LRD) holds with $\gamma$ and $l(\cdot)$ in (5.5). If $m = O(T^k)$ for some $k < (\alpha\gamma-1)/(2\alpha+1-\alpha\gamma)$, then for any fixed $0 < q < 1$,
$$|\hat{Q}(q) - \tilde{Q}(q)| = O_P\big(m\, T^{\nu}\big) \quad \text{for all } \nu > 1/\alpha - \gamma. \tag{5.10}$$

References
Andersen TG, Bollerslev T, Diebold FX, Labys P (2003) Modeling and forecasting realized volatility. Econometrica 71(2):579–625
Bai J, Ng S (2006) Confidence intervals for diffusion index forecasts and inference for factor-augmented regressions. Econometrica 74:1133–1150
Baillie RT (1996) Long memory processes and fractional integration in econometrics. J Econ 73:5–59
Bansal R, Kiku D, Yaron A (2016) Risks for the long run: estimation with time aggregation. J Monet Econ 82:52–69
Box GE, Jenkins GM, Reinsel GC, Ljung GM (2015) Time series analysis: forecasting and control. Wiley, London
Breitung J, Knüppel M (2018) How far can we forecast? Statistical tests of the predictive content. Deutsche Bundesbank Discussion Paper No. 07/2018
Carlstein E (1986) The use of subseries values for estimating the variance of a general statistic from a stationary sequence. Ann Stat 14:1171–1179
Chatfield C (1993) Calculating interval forecasts. J Bus Econ Stat 11(2):121–135
Cheng X, Liao Z, Schorfheide F (2016) Shrinkage estimation of high-dimensional factor models with structural instabilities. Rev Econ Stud 83(4):1511–1543
Christoffersen PF, Diebold FX (1998) Cointegration and long-horizon forecasting. J Bus Econ Stat 16:450–458
Clements MP, Taylor N (2003) Evaluating interval forecasts of high-frequency financial data. Appl Econ 18:445–456
Dehling H, Fried R, Shapirov O, Vogel D, Wornowizki M (2013) Estimation of the variance of partial sums of dependent processes. Stat Probab Lett 83(1):141–147
Diebold FX, Linder P (1996) Fractional integration and interval prediction. Econ Lett 50:305–313
Diebold FX, Rudebusch GD (1989) Long memory and persistence in aggregate output. J Monet Econ 24:189–209
Elliott G, Gargano A, Timmermann A (2013) Complete subset regressions. J Econ 177(2):357–373
Elliott G, Müller UK, Watson MW (2015) Nearly optimal tests when a nuisance parameter is present under the null hypothesis. Econometrica 83(2):771–811
Falk M (1984) Relative deficiency of kernel type estimators of quantiles. Ann Stat 12(1):261–268
Falk M (1985) Asymptotic normality of the kernel quantile estimator. Ann Stat 13(1):428–433
Ghalanos A (2017) rugarch: univariate GARCH models. R package version 1.3-8
Gneiting T, Raftery AE (2007) Strictly proper scoring rules, prediction, and estimation. J Am Stat Assoc 102(477):359–378
Gonçalves S, de Jong R (2003) Consistency of the stationary bootstrap under weak moment conditions. Econ Lett 81(2):273–278
Han H, Linton O, Oka T, Whang Y-J (2016) The cross-quantilogram: measuring quantile dependence and testing directional predictability between time series. J Econ 193(1):251–270
Kim H, Swanson N (2014) Forecasting financial and macroeconomic variables using data reduction methods: new empirical evidence. J Econ 178:352–367
Kitsul Y, Wright J (2013) The economics of options-implied inflation probability density functions. J Financ Econ 110:696–711
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Künsch HR (1989) The jackknife and the bootstrap for general stationary observations. Ann Stat 17(3):1217–1241
Lahiri SN (2013) Resampling methods for dependent data. Springer, Berlin
Lütkepohl H (2006) Forecasting with VARMA models. In: Elliott G, Granger CWJ, Timmermann A (eds) Handbook of economic forecasting, vol 1. Elsevier B.V., North Holland, pp 287–325
Marcellino M (1999) Some consequences of temporal aggregation in empirical analysis. J Bus Econ Stat 17(1):129–136
Müller U, Watson M (2016) Measuring uncertainty about long-run predictions. Rev Econ Stud 83(4):1711–1740
Pascual L, Romo J, Ruiz E (2004) Bootstrap predictive inference for ARIMA processes. J Time Ser Anal 25(4):449–465
Pascual L, Romo J, Ruiz E (2006) Bootstrap prediction for returns and volatilities in GARCH models. Comput Stat Data Anal 50(9):2293–2312
Pastor L, Stambaugh RF (2012) Are stocks really less volatile in the long run? J Finance 67(2):431–478
Patton A, Politis DN, White H (2009) Correction to "Automatic block-length selection for the dependent bootstrap". Econ Rev 28(4):372–375
Politis DN, Romano JP (1994) The stationary bootstrap. J Am Stat Assoc 89:1303–1313
Politis DN, White H (2004) Automatic block-length selection for the dependent bootstrap. Econ Rev 23(1):53–70
Sheather SJ, Marron JS (1990) Kernel quantile estimators. J Am Stat Assoc 85:410–416
Silverman B (1986) Density estimation for statistics and data analysis. Chapman & Hall/CRC, London
Stock J, Watson M (2012) Generalised shrinkage methods for forecasting using many predictors. J Bus Econ Stat 30(4):482–493
Stock JH, Watson MW (2005) Understanding changes in international business cycle dynamics. J Eur Econ Assoc 3:968–1006
Sun S, Lahiri SN (2006) Bootstrapping the sample quantile of a weakly dependent sequence. Sankhyā Indian J Stat 68:130–166
White H (2000) A reality check for data snooping. Econometrica 68(5):1097–1126
Wu WB (2005) Nonlinear system theory: another look at dependence. Proc Natl Acad Sci USA 102(40):14150–14154 (electronic)
Wu WB, Woodroofe M (2004) Martingale approximations for sums of stationary processes. Ann Probab 32(2):1674–1690
Zhang T, Wu WB (2015) Time-varying nonlinear regression models: nonparametric estimation and model selection. Ann Stat 43(2):741–768
Zhou Z, Xu Z, Wu WB (2010) Long-term prediction intervals of time series. IEEE Trans Inform Theory 56(3):1436–1446