Cross-temporal forecast reconciliation: Optimal combination method and heuristic alternatives
Tommaso Di Fonzo∗   Daniele Girolimetto∗∗

Abstract
Forecast reconciliation is a post-forecasting process aimed to improve the quality of the base forecasts for a system of hierarchical/grouped time series (Hyndman et al., 2011). Contemporaneous (cross-sectional) and temporal hierarchies have been considered in the literature, but, except for Kourentzes and Athanasopoulos (2019), these two features have generally not been fully considered together. Adopting a notation able to deal simultaneously with both forecast reconciliation dimensions, the paper presents two new results: (i) an iterative cross-temporal forecast reconciliation procedure which extends, and overcomes some weaknesses of, the two-step procedure by Kourentzes and Athanasopoulos (2019), and (ii) the closed-form expression of the optimal (in least squares sense) point forecasts which fulfill both contemporaneous and temporal constraints. The feasibility of the proposed procedures, along with first evaluations of their performance as compared to the best-performing 'single dimension' (either cross-sectional or temporal) forecast reconciliation procedures, is studied through a forecasting experiment on the 95 quarterly time series of the Australian GDP from the Income and Expenditure sides considered by Athanasopoulos et al. (2019).
Key Words: Cross-temporal forecast reconciliation, Optimal combination, Heuristics, Hierarchical and Grouped Time Series, Quarterly Australian GDP, Income and Expenditure sides.
JEL classification codes: C22, C61, C82
1. Introduction
In several operational fields, decisions to be successful require the support of accurate, detailed, but also coherent forecasts. Forecasts are coherent when the predicted values at the disaggregate and aggregate scales are equal when brought to the same level. For example, temporally coherent monthly predictions sum up to annual ones, and similarly geographically coherent regional predictions add up to country-level ones. Such coherence is an important qualifier for forecasts, supporting aligned decision making across different planning units and horizons, while avoiding that different decision making units plan on different views of the future. For example, Kourentzes and Athanasopoulos (2019) generate forecasts for Australian domestic tourism that are coherent across multiple geographical divisions (regions, zones, states, and whole country), but are also coherent across time (at monthly, bi-monthly, quarterly, 4-monthly, 6-monthly, and annual levels), i.e. for different planning horizons.

∗ Department of Statistical Sciences, University of Padua, Italy. [email protected]
∗∗ Department of Statistical Sciences, University of Padua, Italy. [email protected]

As a matter of fact, in the growing literature on hierarchical forecast reconciliation the cross-sectional (contemporaneous) and temporal coherency dimensions are mostly considered separately: either 'time-by-time' cross-sectional forecast reconciliation for an n-dimensional time series (Hyndman et al., 2011, 2020), or temporal coherency for the forecasts of a single variable at different time frequencies (Athanasopoulos et al., 2017, Hyndman and Kourentzes, 2018). The issue of getting coherent forecasts along both cross-sectional and temporal dimensions (i.e., cross-temporal coherency) has been dealt with by Yagli et al. (2019) and Spiliotis et al. (2020). However, as far as we know, the procedure
proposed by Kourentzes and Athanasopoulos (2019) is the only one able to simultaneously fulfill both cross-sectional and temporal coherency in the final reconciled point forecasts at any considered aggregation dimension.

Whereas the most recent forecast reconciliation procedures for each single coherency dimension are based on some optimality criterion (van Erven and Cugliari, 2015, Wickramasuriya et al., 2019), the cross-temporal reconciliation procedure by Kourentzes and Athanasopoulos (2019) can be considered a heuristic with a simple and effective structure, i.e. an approach that employs a practical method which is not guaranteed to be optimal, but which is nevertheless sufficient for reaching an immediate goal, which in our case is the coherency of the reconciled forecasts along all the considered dimensions. This fact is probably due to the consideration that in a cross-temporal forecast reconciliation framework the complexity and the dimensions of the problem grow very quickly, along with the required computational time and memory (Kourentzes and Athanasopoulos, 2019, Nystrup et al., 2020). This is certainly true, but nevertheless an appropriate, thorough use of some well known linear algebra tools, together with expanding computational facilities in terms of both calculation power and memory, makes it feasible to look for the optimal solution (in least squares sense), expanding the field of application of the 'forecast reconciliation methodology' to simultaneously encompass both contemporaneous (cross-sectional) and temporal coherency dimensions.

Bottom-up and top-down approaches to forecast reconciliation are well known to both practitioners and researchers. In short, according to the bottom-up approach (Dunn et al., 1976), the forecasts at the most disaggregated level are summed up to obtain the aggregates of interest.
On the contrary, in the top-down approach (Gross and Sohl, 1990) the most aggregated series is forecasted first, and then disaggregated to provide lower-level predictions (Fliedner, 2001, Athanasopoulos et al., 2009, and the references therein). A combination of these two approaches, known as middle-out (Athanasopoulos et al., 2009), selects an intermediate level of (dis)aggregation, moves downward in a top-down fashion, and upward according to bottom-up.

However, in the last decade there have been several contributions exploiting a regression-based optimal combination approach (Hyndman et al., 2011), which has proven convincing from a mathematical-statistical point of view, flexible enough to be adapted to both cross-sectional and temporal frameworks (Wickramasuriya et al., 2019, and Athanasopoulos et al., 2017, respectively), and very effective in improving the base forecasts in many different application fields (economics, demography, energy, tourism, etc.).

In this paper we consider an optimal combination approach, which takes the base (incoherent, and however obtained) forecasts of all nodes in the hierarchy, and reconciles them. We show that all the summation constraints induced by the cross-temporal hierarchy underlying the time series may be used to reconcile the base forecasts through a simple projection onto a well chosen linear space. To this end, we extend the optimal (in least squares sense) solutions separately proposed for each coherency dimension, developing the optimal point forecasting procedure which, while exploiting both cross-sectional and temporal hierarchies, simultaneously fulfills both contemporaneous and temporal constraints. We refer to this as optimal combination cross-temporal forecast reconciliation.
In addition, building on the existing literature on this topic (Wickramasuriya et al., 2019, Athanasopoulos et al., 2017, and Nystrup et al., 2020), we discuss some simple approximations of the covariance matrix to be used in statistical point forecast reconciliation, with focus on those making use of the in-sample residuals (when available) of the models used to produce the base forecasts. The closely related, and very important, issue of probabilistic forecast reconciliation (Gamakumara et al., 2018, Athanasopoulos et al., 2019, Hong et al., 2019, Jeon et al., 2019, Roach, 2019, Ben Taieb et al., 2020, Yang, 2020) is not considered in this paper, and will be dealt with in the near future.

The paper is organized as follows. We start by presenting the general framework of point forecast reconciliation according to a projection approach (Panagiotelis et al., 2020), avoiding reference to cross-sectional or time indices (section 2). Hierarchical and grouped systems of multivariate time series are introduced in section 3. The temporal hierarchies which characterize a single time series are discussed in section 4. The cross-sectional and temporal coherency dimensions are dealt with simultaneously in section 5, and the optimal (in least squares sense) solution to the cross-temporal forecast reconciliation problem is then developed in section 6. The heuristics proposed by Kourentzes and Athanasopoulos (2019) is described in section 7, where some variants of that procedure are discussed, and a very simple (and possibly more effective) iterative cross-temporal reconciliation procedure is proposed.
The feasibility of all the proposed procedures, along with the evaluation of their performance as compared to the best-performing 'single dimension' (either cross-sectional or temporal) forecast reconciliation procedures, is studied in section 8 through a forecasting experiment on the 95 quarterly time series of the Australian GDP from the Income and Expenditure sides considered by Athanasopoulos et al. (2019) and Bisaglia et al. (2020). Section 9 presents conclusions and lists some topics in this field worth considering for future research. Finally, the Appendix contains complementary materials, on both methodological and practical issues, not included in the paper for length reasons. In addition, it contains supplementary tables and graphs related to the empirical application.

Symbols and notation used in the paper:

$n_b$, $n_a$, $n$: number of bottom, upper, and total time series ($n = n_a + n_b$)
$m$: highest available sampling frequency per seasonal cycle (max. order of temporal aggregation)
$h \ge 1$: forecast horizon for the lowest frequency time series
$T$: number of high-frequency observations used in the forecasting models
$N$: number of observations at the lowest frequency: $N = T/m$
$\mathcal{K}$: set of factors of $m$ in descending order: $\mathcal{K} = \{k_p, k_{p-1}, \ldots, k_2, k_1\}$, $k_p = m$, $k_1 = 1$
$k^*$: $\sum_{j=1}^{p-1} k_j$
$M_k$: $m/k$, $k \in \mathcal{K}$
$b_t \in \mathbb{R}^{n_b}$: vector containing the bottom time series (bts) at time $t$
$a_t \in \mathbb{R}^{n_a}$: vector containing the upper time series (uts) at time $t$
$y_t \in \mathbb{R}^{n}$: vector containing the time series $y_t = [a_t' \; b_t']'$ at time $t$
$Y, \hat{Y}, \tilde{Y} \in \mathbb{R}^{n \times h(k^*+m)}$: matrices of target, base, and cross-temporally reconciled forecasts
$Y^{[k]}$, $k \in \mathcal{K}$: $(n \times hM_k)$ matrix containing the target forecasts of the level-$k$ temporally aggregated series; component of matrix $Y = [Y^{[m]} \; Y^{[k_{p-1}]} \; \ldots \; Y^{[k_2]} \; Y^{[1]}]$
$\hat{Y}^{[k]}$, $k \in \mathcal{K}$: $(n \times hM_k)$ matrix containing the base forecasts of the level-$k$ temporally aggregated series; component of matrix $\hat{Y} = [\hat{Y}^{[m]} \; \hat{Y}^{[k_{p-1}]} \; \ldots \; \hat{Y}^{[k_2]} \; \hat{Y}^{[1]}]$
$\tilde{Y}^{[k]}$, $k \in \mathcal{K}$: $(n \times hM_k)$ matrix containing the cross-temporally reconciled forecasts of the level-$k$ temporally aggregated series; component of matrix $\tilde{Y} = [\tilde{Y}^{[m]} \; \tilde{Y}^{[k_{p-1}]} \; \ldots \; \tilde{Y}^{[k_2]} \; \tilde{Y}^{[1]}]$
$A^{[k]}, B^{[k]}$, $k \in \mathcal{K}$: $(n_a \times hM_k)$ and $(n_b \times hM_k)$ components of matrix $Y^{[k]} = \begin{bmatrix} A^{[k]} \\ B^{[k]} \end{bmatrix}$
$B^* \in \mathbb{R}^{n_b \times hk^*}$: $[B^{[m]} \; B^{[k_{p-1}]} \; \cdots \; B^{[k_2]}]$
$y, \hat{y}, \tilde{y} \in \mathbb{R}^{nh(k^*+m)}$: $y = \mathrm{vec}(Y')$, $\hat{y} = \mathrm{vec}(\hat{Y}')$, $\tilde{y} = \mathrm{vec}(\tilde{Y}')$
$\check{y} \in \mathbb{R}^{nh(k^*+m)}$: alternative vectorization of matrix $Y$, used in the cross-temporal structural representation: $\check{y} = [\mathrm{vec}(A')' \; \mathrm{vec}(B^{*\prime})' \; \mathrm{vec}(B^{[1]\prime})']'$
$C \in \mathbb{R}^{n_a \times n_b}$: cross-sectional (contemporaneous) aggregation matrix
$S \in \mathbb{R}^{n \times n_b}$: cross-sectional (contemporaneous) summing matrix
$U' \in \mathbb{R}^{n_a \times n}$: zero constraints cross-sectional kernel matrix: $U'Y = 0_{[n_a \times h(k^*+m)]}$
$K_h \in \mathbb{R}^{hk^* \times hm}$: temporal aggregation matrix
$R_h \in \mathbb{R}^{h(k^*+m) \times hm}$: temporal summing matrix
$Z'_h \in \mathbb{R}^{hk^* \times h(k^*+m)}$: zero constraints temporal kernel matrix: $Z'_h Y' = 0_{[hk^* \times n]}$
$H' \in \mathbb{R}^{hn_a^* \times nh(k^*+m)}$: zero constraints full row-rank cross-temporal kernel matrix: $H'y = 0$
$\check{C} \in \mathbb{R}^{n_a^* \times n_b m}$: cross-temporal aggregation matrix (structural representation) for $h = 1$
$\check{S} \in \mathbb{R}^{n(k^*+m) \times n_b m}$: cross-temporal summing matrix (structural representation) for $h = 1$
$\check{H}' \in \mathbb{R}^{hn_a^* \times nh(k^*+m)}$: zero constraints full row-rank cross-temporal kernel matrix valid for $\check{y}$: $\check{H}'\check{y} = 0$

2. Optimal point forecast reconciliation: projection approach

Forecast reconciliation is a post-forecasting process aimed to improve the quality of the base forecasts for a system of hierarchical/grouped, and more generally linearly constrained, time series (Hyndman et al., 2011, Panagiotelis et al., 2020) by exploiting the constraints that the series in the system must fulfill, whereas in general the base forecasts do not.

Let $y$ be an $(s \times 1)$ vector of target point forecasts which are wished to satisfy the system of linearly independent constraints
$$H'y = 0_{(r \times 1)}, \qquad (1)$$
where $H'$ is an $(r \times s)$ matrix with $\mathrm{rank}(H') = r < s$, and $0_{(r \times 1)}$ is an $(r \times 1)$ null vector. Let $\hat{y}$ be an $(s \times 1)$ vector of unbiased point forecasts not fulfilling the linear constraints (1) (i.e., $H'\hat{y} \ne 0$).

Drawing upon Stone et al. (1942), Byron (1978), Weale (1988), Solomou and Weale (1993), and Dagum and Cholette (2006), among others, we consider a regression-based reconciliation method assuming that $\hat{y}$ is related to $y$ by
$$\hat{y} = y + \varepsilon, \qquad (2)$$
where $\varepsilon$ is an $(s \times 1)$ vector of zero-mean disturbances with known p.d. covariance matrix $W$. The reconciled forecasts $\tilde{y}$ are found by minimizing the generalized least squares (GLS) objective function $(\hat{y} - y)'W^{-1}(\hat{y} - y)$ constrained by (1):
$$\tilde{y} = \arg\min_{y} \; (y - \hat{y})'W^{-1}(y - \hat{y}), \quad \text{s.t. } H'y = 0.$$
The solution is given by (see Appendix A.1):
$$\tilde{y} = \hat{y} - WH(H'WH)^{-1}H'\hat{y} = M\hat{y}, \qquad (3)$$
where $M = I_s - WH(H'WH)^{-1}H'$ is an $(s \times s)$ projection matrix.
Denoting by $d_{\hat{y}} = -H'\hat{y}$ the $(r \times 1)$ vector containing the base forecasts' 'coherency' errors, we can re-state expression (3) as
$$\tilde{y} = \hat{y} + WH(H'WH)^{-1}d_{\hat{y}},$$
which makes it clear that the reconciliation formula (3) simply 'adjusts' the original forecast vector $\hat{y}$ with a linear combination, according to the smoothing matrix $WH(H'WH)^{-1}$, of the coherency errors in the base forecasts. In addition, if the error term of model (2) is Gaussian, the reconciliation error $\tilde{\varepsilon} = \tilde{y} - y$ is a zero-mean Gaussian vector with covariance matrix
$$E\left[(\tilde{y} - y)(\tilde{y} - y)'\right] = W - WH(H'WH)^{-1}H'W = MW.$$

Hyndman et al. (2011; see also Wickramasuriya et al., 2019) propose an alternative formulation for the reconciled estimates, equivalent to expression (3) and obtained by GLS estimation of the model
$$\hat{y} = S\beta + \varepsilon, \qquad (4)$$
where $S$ is a 'structural summation matrix' describing the aggregation relationships operating on $y$, and $\beta$ is a subset of $y$ containing the target forecasts of the bottom level series, such that $y = S\beta$ (see section 3).¹ Since the hypotheses on $\varepsilon$ remain unchanged, it can be shown (see Appendix A.1) that
$$\tilde{\beta} = \left(S'W^{-1}S\right)^{-1}S'W^{-1}\hat{y}$$
is the best linear unbiased estimate of $\beta$, and that the whole reconciled forecasts vector is given by $\tilde{y} = S\tilde{\beta} = SG\hat{y}$, where $G = \left(S'W^{-1}S\right)^{-1}S'W^{-1}$.

¹ A geometric interpretation of the entire hierarchical forecasting problem is given by Panagiotelis et al. (2020).

As witnessed by the huge literature on adjusting preliminary data (as the base forecasts can be considered) in order to fulfill some externally imposed constraints, the distinctive feature of the generalized least-squares reconciliation approach is that it can take into account the 'quality', however measured, of the preliminary estimates, through an appropriate choice of the covariance matrix $W$.
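As an illustration of expression (3) and of its structural counterpart $\tilde{y} = SG\hat{y}$, the following sketch (in Python with NumPy; the tiny three-variable hierarchy, the base forecasts, and the diagonal $W$ are our own illustrative choices, not taken from the paper) verifies numerically that the two forms coincide and that the reconciled forecasts satisfy $H'\tilde{y} = 0$:

```python
import numpy as np

# Toy hierarchy: y_tot = y_A + y_B, so the single constraint is H' = [1, -1, -1]
H = np.array([[1.0, -1.0, -1.0]]).T          # (s x r), with s = 3, r = 1
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])                   # structural summation matrix

y_hat = np.array([10.0, 4.0, 5.0])           # incoherent base forecasts: 10 != 4 + 5
W = np.diag([2.0, 1.0, 1.0])                 # assumed known p.d. covariance

# Projection form (3): y~ = y^ - W H (H'WH)^{-1} H' y^
M = np.eye(3) - W @ H @ np.linalg.inv(H.T @ W @ H) @ H.T
y_tilde = M @ y_hat

# Structural GLS form: G = (S'W^{-1}S)^{-1} S'W^{-1}, y~ = S G y^
Winv = np.linalg.inv(W)
G = np.linalg.inv(S.T @ Winv @ S) @ S.T @ Winv
y_tilde_struct = S @ (G @ y_hat)

print(np.allclose(y_tilde, y_tilde_struct))  # the two forms coincide
print(np.allclose(H.T @ y_tilde, 0.0))       # reconciled forecasts are coherent
```

With these numbers the base forecasts $[10, 4, 5]$ are adjusted to $[9.5, 4.25, 5.25]$: the coherency error $10 - 9 = 1$ is distributed across the series according to $W$, with the less reliable top series (variance 2) absorbing the larger share.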
However, for a long time these procedures have depended on the assumption that this matrix (or any other indicator of the accuracy of the figures to be reconciled) was known. In many practical situations $W$ is assumed to be diagonal, and the data are adjusted in the light of their relative variances so as to satisfy the linear restrictions. But another, perhaps more delicate, challenge arises when no reliability measure is available, or it can hardly be deduced from the data. The solutions proposed in the literature for this case are basically of two types, both of which are consistent with the least-squares approach shown so far:

1. mathematical/mechanical solutions: the base forecasts are balanced by minimizing a penalty criterion which 'induces' a covariance matrix (which is simply a statistical artifact);
2. data-based solutions: the variability of the base forecasts to be reconciled is estimated through the models and the data used to produce the forecasts.

As for point forecast reconciliation, in the following we will consider both approaches, with an explicit preference towards approximations of $W$ based on the in-sample residuals (when available), which appear both more convincing from a statistical point of view and generally well-performing in practical applications. However, this topic deserves further attention (Jeon et al., 2019, p. 368; see also Kourentzes, 2017, 2018), and will be considered in future research.
3. Hierarchical and Grouped Time Series
Extending the definition of hierarchical time series given by Panagiotelis et al. (2020), a linearly constrained time series $y_t$ is an $n$-dimensional time series such that all observed values $y_1, \ldots, y_T$ and all future values $y_{T+1}, y_{T+2}, \ldots$ lie in the coherent linear subspace $\mathcal{S}$, that is: $y_t \in \mathcal{S}$, $\forall t$. In many situations, the time series are linked through summation constraints, which induce a hierarchy. Figure 1 gives an example of a hierarchical system with eight variables and three levels: the top variable at level 1, two variables (A and B) at level 2, and five variables at level 3 (AA, AB, BA, BB, BC). The temporal observations of these variables form a hierarchical time series, consisting of 5 bottom time series (bts) and 3 aggregated upper time series (uts).

Figure 1: A simple two-level hierarchical structure

Assuming that the relationship mapping the lower-level series in the hierarchy of Figure 1 into the higher ones is always a simple summation,² the bottom-level series can be thought of as building blocks that cannot be obtained as sums of other series in the hierarchy, while all the series at upper levels can be expressed by appropriately summing part or all of them. For all time periods $t = 1, \ldots, T$, the link between the top level series $y_t$ and the bottom level series is given by:
$$y_t = y_{AA,t} + y_{AB,t} + y_{BA,t} + y_{BB,t} + y_{BC,t}. \qquad (5)$$
At the same time, the nodes at the intermediate level of the hierarchy satisfy the aggregation constraints:
$$y_{A,t} = y_{AA,t} + y_{AB,t}, \qquad y_{B,t} = y_{BA,t} + y_{BB,t} + y_{BC,t}. \qquad (6)$$
In summary, there are as many summation constraints as nodes with leaves (3, i.e. Total, A, B). Consider now the matrices $C$, $S$, and $U'$, of dimension $(3 \times 5)$, $(8 \times 5)$, and $(3 \times 8)$, respectively:
$$C = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \end{bmatrix}, \qquad S = \begin{bmatrix} C \\ I_5 \end{bmatrix}, \qquad U' = [I_3 \;\; -C],$$
where matrix $U'$ encodes each summation relationship in a row, with 1 at the associated node and $-1$ at its leaves.

Expressions (5) and (6) can be written in a more compact way if we define the vectors of bottom level ($b_t$) and upper level ($a_t$) time series at time $t$ as, respectively,
$$b_t = [y_{AA,t} \; y_{AB,t} \; y_{BA,t} \; y_{BB,t} \; y_{BC,t}]', \qquad a_t = [y_t \; y_{A,t} \; y_{B,t}]'.$$
Denoting by $y_t$ the $(8 \times 1)$ vector $y_t = [a_t' \; b_t']'$, the relationships linking bottom and upper time series can be equivalently expressed as:
$$a_t = Cb_t, \qquad y_t = Sb_t, \qquad U'y_t = 0_{(3 \times 1)}, \qquad t = 1, \ldots, T. \qquad (7)$$
Thus, for any time index $t$, $y_t$ is in the kernel of $U'$, also known as the null space of the linear transformation induced by matrix $U'$, which is given by the set of vectors $v \in \mathbb{R}^8$ such that $U'v = 0_{(3 \times 1)}$ (Harville, 2008, p. 591). We call structural representation of series $y_t$ the formulation $y_t = Sb_t$, $t = 1, \ldots, T$, and zero constraints kernel representation of series $y_t$ the equivalent expression $U'y_t = 0$, $t = 1, \ldots, T$.

² For space reasons, in this paper only simple summation for both contemporaneous and temporal aggregation relationships is considered. Remaining in a linear framework, the extension to general linear constraints (i.e., weighted summation), able to cover other important data features, is rather straightforward (Shang, 2017, 2019, Shang and Hyndman, 2017, Li and Hyndman, 2019, Panagiotelis et al., 2020).
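The matrices of this example can be sketched as follows (Python/NumPy; a minimal illustration with arbitrary bottom values, verifying $U'S = 0$ and the coherency of any $y_t = Sb_t$):

```python
import numpy as np

# C, S and U' for the Figure 1 hierarchy
# (bottom series: AA, AB, BA, BB, BC; upper series: Total, A, B)
C = np.array([[1, 1, 1, 1, 1],    # Total = AA + AB + BA + BB + BC
              [1, 1, 0, 0, 0],    # A     = AA + AB
              [0, 0, 1, 1, 1]])   # B     = BA + BB + BC

n_a, n_b = C.shape
S = np.vstack([C, np.eye(n_b, dtype=int)])      # structural: y_t = S b_t
U = np.hstack([np.eye(n_a, dtype=int), -C])     # kernel form: U' y_t = 0

b_t = np.array([3, 2, 4, 1, 5])                 # arbitrary bottom values
y_t = S @ b_t                                   # coherent full vector

print(y_t[:3])               # upper series: Total = 15, A = 5, B = 10
print(U @ y_t)               # zero vector: the constraints hold
print(np.all(U @ S == 0))    # U'S = 0 by construction
```

Any vector built as $Sb_t$ automatically lies in the kernel of $U'$, which is the algebraic content of (7).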
A linearly constrained time series formed by two or more hierarchical time series sharing the same top level series, and the same bottom level series, is called a grouped time series (Hyndman et al., 2011, Hyndman and Athanasopoulos, 2018). Provided matrix $C$ is appropriately designed, the definitions of matrices $S$ and $U'$, depending solely on matrix $C$, remain unchanged.

It should be noted that we can face linearly constrained time series for which the structural representation $y_t = Sb_t$ does not give a straightforward view of the links between bottom and upper level time series. Figure 2 shows two very simple hierarchies, where the variables of each hierarchy contribute (from different sides) to the same top level variable $X$, and the bottom level series of the hierarchy on the left side ($A_1$, $A_2$, $B$) are independent from those on the right side ($C$, $D$).
Figure 2: Two hierarchies sharing the same top-level series $X$

The aggregation relationships between the upper variables $X$ and $A$, and the bottom ones $A_1$, $A_2$, $B$, $C$, and $D$, are given by:
$$X = A_1 + A_2 + B, \qquad X = C + D, \qquad A = A_1 + A_2. \qquad (8)$$
Expression (8) cannot be represented as a mapping from the bottom variables into (themselves and) the upper variables. Nevertheless, it is possible to set up the constraints valid for all the component series in $y = [X \; A \; A_1 \; A_2 \; B \; C \; D]'$ through the matrix
$$\check{U}' = \begin{bmatrix} 1 & 0 & -1 & -1 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & -1 & -1 \\ 0 & 1 & -1 & -1 & 0 & 0 & 0 \end{bmatrix}, \qquad \check{U}'y = 0_{(3 \times 1)}.$$
After simple operations on expression (8), it is found:
$$X = C + D, \qquad A = -B + C + D, \qquad A_1 = -A_2 - B + C + D, \qquad (9)$$
so we can write $U'y = 0_{(3 \times 1)}$, with
$$U' = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & -1 & -1 \\ 0 & 1 & 0 & 0 & 1 & -1 & -1 \\ 0 & 0 & 1 & 1 & 1 & -1 & -1 \end{bmatrix} = [I_3 \;\; -C].$$
While there is no practical problem in working with such constraints, it is clear that they do not conform to the visual pattern of the linearly constrained time series in Figure 2, where $A_1$ appears as a 'bottom variable', whereas in (9) it is expressed as a linear combination of series $A_2$, $B$, $C$, and $D$.

In addition, notice that the left side hierarchy of Figure 2 is 'unbalanced', in the sense that, unlike node $A$, node $B$ has no children, and thus is located at the bottom level of the hierarchy, though it could be considered at the same level as node $A$. Situations like this, often met in practice when dealing with hierarchical/grouped time series, require an appropriate formulation of the cross-sectional aggregation matrix $C$, in order to avoid node duplication (see Appendix A.2).

Suppose we have the $(n \times 1)$ vector $\hat{y}_h$ of unbiased base forecasts for the $n$ variables of the linearly constrained series $y_t$ for the forecast horizon $h$. If the base forecasts have been independently computed, generally they do not fulfill the cross-sectional aggregation constraints, i.e. $U'\hat{y}_h \ne 0_{(n_a \times 1)}$. By adapting the general point forecast reconciliation formula (3), the vector of reconciled forecasts is given by:
$$\tilde{y}_h = \hat{y}_h - W_{cs}U\left(U'W_{cs}U\right)^{-1}U'\hat{y}_h, \qquad (10)$$
where $W_{cs}$ is an $(n \times n)$ p.d. matrix, assumed known, and the suffix 'cs' stands for 'cross-sectional'. Alternative choices for $W_{cs}$ proposed in the literature are the following:

• identity (cs-ols): $W_{cs} = I_n$ (Hyndman et al.
2011),
• structural (cs-struc): $W_{cs} = \mathrm{diag}(S\mathbf{1}_{n_b})$ (Athanasopoulos et al., 2017),
• series variance (cs-wls): $W_{cs} = \hat{W}_{cs\text{-}var} = I_n \odot \hat{W}$ (Hyndman et al., 2016),
• MinT-shr (cs-shr): $W_{cs} = \hat{W}_{cs\text{-}shr} = \hat{\lambda}\hat{W}_{cs\text{-}var} + (1 - \hat{\lambda})\hat{W}$ (Wickramasuriya et al., 2019),
• MinT-sam (cs-sam): $W_{cs} = \hat{W}$ (Wickramasuriya et al., 2019),

where the symbol $\odot$ denotes the Hadamard product, $\hat{\lambda}$ is an estimated shrinkage coefficient (Ledoit and Wolf, 2004, Schäfer and Strimmer, 2005), and $\hat{W}$ is the sample covariance matrix of the one-step-ahead in-sample forecast errors:³
$$\hat{W} = \frac{1}{T}\sum_{t=1}^{T} \hat{e}_t\hat{e}_t', \qquad (11)$$
where $\hat{e}_t = y_t - \hat{y}_t$, $t = 1, \ldots, T$, are $(n \times 1)$ vectors of in-sample forecast errors. The first three matrices are diagonal, and in the first case the projection is orthogonal, whereas the latter two (cs-shr and cs-sam) have been proposed within the minimum-trace point forecast reconciliation approach by Wickramasuriya et al. (2019). It should be noted that the quality of the estimate $\hat{W}$ crucially depends on the dimension of $T$. In particular, when $T < n$, matrix $\hat{W}$ is singular, which prevents the matrix inversion in expression (10). The shrunk version $\hat{W}_{cs\text{-}shr}$ is a feasible alternative, performing well in many practical situations (Wickramasuriya et al., 2019).

³ Expression (11) assumes that $T^{-1}\sum_{t=1}^{T} \hat{e}_{t,i} = 0$, $i = 1, \ldots, n$. When this does not hold, $\hat{W}$ is the sample Mean Square Error (MSE) matrix.

Let us denote with
$$b_t = [b_{1t} \; \ldots \; b_{jt} \; \ldots \; b_{n_b t}]', \qquad t = 1, \ldots, T, \qquad (12)$$
the $T$ vectors, each of dimension $(n_b \times 1)$, containing the high-frequency bottom time series (hf-bts), that is, the bottom series of the hierarchy/group observed at the highest available temporal frequency.
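Some of the residual-based choices for $W_{cs}$ listed above can be sketched as follows (Python/NumPy; the residual matrix is simulated, and a fixed shrinkage weight is used in place of the estimated $\hat{\lambda}$ of Ledoit and Wolf, 2004):

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 4, 60
E = rng.normal(size=(n, T))            # columns e_t: in-sample one-step forecast errors

W_sam = (E @ E.T) / T                  # sample covariance (11): cs-sam
W_var = np.diag(np.diag(W_sam))        # series variances only: cs-wls

# Shrinkage towards the diagonal (cs-shr): W = lam * W_var + (1 - lam) * W_sam.
# A fixed lam = 0.5 stands in for the estimated shrinkage coefficient.
lam = 0.5
W_shr = lam * W_var + (1 - lam) * W_sam

# The shrunk matrix remains p.d. (it would do so even when T < n
# makes W_sam singular, since W_var is p.d. and lam > 0)
print(np.all(np.linalg.eigvalsh(W_shr) > 0))
```

Plugging any of these matrices into (10) in place of $W_{cs}$ yields the corresponding reconciled forecasts.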
As we shall see in section 5, in cross-temporal hierarchies of time series the hf-bts should be considered as the 'very' bottom series of the system, since they cannot be formed as either contemporaneous or temporal sums of other variables. Likewise, let us denote with
$$a_t = [a_{1t} \; \ldots \; a_{it} \; \ldots \; a_{n_a t}]', \qquad t = 1, \ldots, T, \qquad (13)$$
the $T$ vectors, each of dimension $(n_a \times 1)$, containing the high-frequency upper time series (hf-uts), which are the cross-sectionally aggregated series of the hierarchy/group, observed at the highest temporal frequency.

At each time $t = 1, \ldots, T$, the cross-sectional (contemporaneous) aggregation constraints that map the hf-bts into the hf-uts can be written as:
$$a_t = Cb_t, \qquad t = 1, \ldots, T, \qquad (14)$$
where $C$ is an $(n_a \times n_b)$ contemporaneous aggregation matrix. The structural representation of the linearly constrained time series $y_t$ is in turn given by (Hyndman et al., 2011):
$$\begin{bmatrix} a_t \\ b_t \end{bmatrix} = \begin{bmatrix} C \\ I_{n_b} \end{bmatrix} b_t \;\Rightarrow\; y_t = Sb_t, \qquad t = 1, \ldots, T,$$
where $S = \begin{bmatrix} C \\ I_{n_b} \end{bmatrix}$ is an $(n \times n_b)$ contemporaneous summing matrix, with $n = n_a + n_b$. The constraints valid for $y_t$ can be expressed in kernel form through the $(n_a \times n)$ zero constraints matrix $U' = [I_{n_a} \;\; -C]$, that is:
$$U'y_t = 0_{(n_a \times 1)}, \qquad t = 1, \ldots, T.$$

Now, denote by $B$ the $(n_b \times T)$ matrix containing the $T$ observations of the $n_b$-variate hf-bts of the system:
$$B = [b_1 \; \ldots \; b_t \; \ldots \; b_T] = \begin{bmatrix} b_1^{*\prime} \\ \vdots \\ b_i^{*\prime} \\ \vdots \\ b_{n_b}^{*\prime} \end{bmatrix},$$
where $b_t$ has been defined in (12), and $b_i^* = [b_{i1} \; \ldots \; b_{it} \; \ldots \; b_{iT}]'$, $i = 1, \ldots, n_b$, is the $(T \times 1)$ vector containing all the observations of the $i$-th univariate hf-bts; the asterisk in $b_i^*$ is used to distinguish this vector, which collects $b_{it}$ across all times for one series, from $b_t$, which collects $b_{it}$ across all series for one time.

We consider the $(n_a \times T)$ matrix $A$ for the hf-uts as well:
$$A = [a_1 \; \ldots \; a_t \; \ldots \; a_T] = \begin{bmatrix} a_1^{*\prime} \\ \vdots \\ a_j^{*\prime} \\ \vdots \\ a_{n_a}^{*\prime} \end{bmatrix},$$
where $a_t$ was defined in (13), and $a_j^* = [a_{j1} \; \ldots \; a_{jt} \; \ldots \; a_{jT}]'$, $j = 1, \ldots, n_a$, is the $(T \times 1)$ vector containing all the observations of the $j$-th univariate hf-uts.

The cross-sectional (contemporaneous) aggregation relationships (14) linking bottom and upper time series of $y_t$ can thus be expressed in compact form, simultaneously encompassing all $T$ time periods, for both types of data organization. In fact, extending expression (14) to the whole observation period, it is
$$A = CB, \qquad (15)$$
which is equivalent to
$$U'Y = 0_{(n_a \times T)}, \qquad (16)$$
where $Y = \begin{bmatrix} A \\ B \end{bmatrix}$ is the $(n \times T)$ matrix containing the observations of all $n$ series. It is worth noting that the cross-sectional constraints (15) and (16) hold at any time observation index of any temporal frequency. This has to be considered when dealing with cross-temporal hierarchies (see section 5).

Now, let us consider two vectorized forms of matrices $B$ and $A$, namely:
$$b = \mathrm{vec}(B), \quad a = \mathrm{vec}(A), \qquad b^* = \mathrm{vec}(B'), \quad a^* = \mathrm{vec}(A').$$
Both $b$ and $b^*$ have the same dimension $(Tn_b \times 1)$, and this holds for $a$ and $a^*$ as well, which have dimension $(Tn_a \times 1)$. However, in the former case ($b$ and $a$) the data is organized 'by-time-first-and-then-by-variable':
$$b = [b_1' \; \ldots \; b_t' \; \ldots \; b_T']', \qquad a = [a_1' \; \ldots \; a_t' \; \ldots \; a_T']',$$
whereas in the latter ($b^*$ and $a^*$) the data is organized 'by-variable-first-and-then-by-time':
$$b^* = [b_1^{*\prime} \; \ldots \; b_i^{*\prime} \; \ldots \; b_{n_b}^{*\prime}]', \qquad a^* = [a_1^{*\prime} \; \ldots \; a_j^{*\prime} \; \ldots \; a_{n_a}^{*\prime}]'.$$
$b$ ($a$) can be obtained by a simple transformation of vector $b^*$ ($a^*$) through an appropriate permutation matrix, and vice versa (see Appendix A.3.1).

Depending on the preferred data organization, the cross-sectional constraints (15) can be equivalently expressed in vectorized form as (Harville, 2008, p. 345):
$$a = \mathrm{vec}(A) = (I_T \otimes C)\,b, \qquad a^* = \mathrm{vec}(A') = (C \otimes I_T)\,b^*, \qquad (17)$$
where the symbol $\otimes$ denotes the Kronecker product. Expressions (17) can also be formulated using matrix $U'$, as in (16), i.e.
$$(I_T \otimes U')\begin{bmatrix} a \\ b \end{bmatrix} = 0_{(Tn_a \times 1)}, \qquad (U' \otimes I_T)\begin{bmatrix} a^* \\ b^* \end{bmatrix} = 0_{(Tn_a \times 1)}. \qquad (18)$$
In order to avoid mistakes, one should note that, while
$$y^* = \mathrm{vec}(Y') = \begin{bmatrix} \mathrm{vec}(A') \\ \mathrm{vec}(B') \end{bmatrix} = \begin{bmatrix} a^* \\ b^* \end{bmatrix}, \qquad (19)$$
it is in turn
$$y = \mathrm{vec}(Y) \ne \begin{bmatrix} \mathrm{vec}(A) \\ \mathrm{vec}(B) \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}.$$
Therefore, in the following, when a matrix vectorization is needed, we will generally prefer using the vectorized versions of matrices $Y'$, $A'$, and $B'$, as in (19). In addition, to ease the notation, from now on the asterisk will be omitted, which means that, unlike previously, we denote by $y$, $a$, and $b$ the vectors:
$$y = \mathrm{vec}(Y'), \qquad a = \mathrm{vec}(A'), \qquad b = \mathrm{vec}(B').$$
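The two data organizations and the vectorized constraints (17) can be checked numerically (Python/NumPy; a toy system with one upper and two bottom series, chosen by us for illustration):

```python
import numpy as np

# Row i of B holds series i over time, so:
#   vec(B)  = column-major flatten = by-time-first
#   vec(B') = row-major flatten    = by-variable-first
n_b, T = 2, 3
B = np.arange(1, n_b * T + 1).reshape(n_b, T)

b      = B.flatten(order='F')   # vec(B):  [b_11, b_21, b_12, b_22, ...]
b_star = B.flatten(order='C')   # vec(B'): [b_11, ..., b_1T, b_21, ...]

# Single upper series: a_t = b_1t + b_2t
C = np.array([[1, 1]])
A = C @ B

# Both vectorized forms of A = CB, as in (17)
assert np.allclose(A.flatten(order='F'), np.kron(np.eye(T), C) @ b)
assert np.allclose(A.flatten(order='C'), np.kron(C, np.eye(T)) @ b_star)
print("both vectorized constraint forms hold")
```

The Kronecker factors swap sides between the two layouts, which is exactly the pitfall the distinction between $\mathrm{vec}(Y)$ and $\mathrm{vec}(Y')$ in (19) is meant to flag.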
4. Temporal hierarchies
Following Athanasopoulos et al. (2017), we consider a time series $\{x_t\}_{t=1}^{T}$ observed at the highest available sampling frequency per seasonal cycle, say $m$ (e.g., months per year, $m = 12$; quarters per year, $m = 4$; hours per day, $m = 24$). Given a factor $k$ of $m$, we can construct a temporally aggregated version of the time series $x_t$, through the non-overlapping sums of its $k$ successive values, which has seasonal period equal to $M_k = m/k$. (If $k$ is not a factor of $m$, the seasonality of the aggregated series is non-integer, and forecasts of the aggregate are therefore more difficult to compute.) To avoid ragged-edge data, we assume that the total number of observations of $x_t$ involved in the non-overlapping aggregation is a multiple of $m$, i.e. $T = N \cdot m$, where $N$ is the length of the most temporally aggregated version of the series, i.e. the series observed at the lowest available frequency.

We denote by $\mathcal{K} = \{k_p, k_{p-1}, \ldots, k_2, k_1\}$ the set of the $p$ factors of $m$, in descending order, where $k_p = m$ and $k_1 = 1$. The temporally aggregated series of order $k$ can be written as
$$x_l^{[k]} = \sum_{t=(l-1)k+1}^{lk} x_t, \qquad l = 1, \ldots, \frac{T}{k}, \quad k \in \mathcal{K}. \qquad (20)$$
Expression (20) also accounts for the trivial temporal aggregation transforming $x_t$ into itself (i.e., $x_t \equiv x_l^{[1]}$, $l = t$).

Figure 3: Temporal hierarchy for a quarterly series using the common index $\tau$ for all levels of aggregation (annual node $x_\tau^{[4]}$, semi-annual nodes $x^{[2]}$, quarterly leaves $x^{[1]}$).

Since the observation index $l$ in (20) varies with each aggregation level $k$, in order to express a common index for all levels we define $\tau$ as the observation index of the most aggregated series, such that $l = \tau$ at that level, i.e. $x_\tau^{[m]}$, $\tau = 1, \ldots, N$. As for the other temporally aggregated series defined in expression (20), we stack the observations for each aggregation level below $m$ in the $(M_k \times 1)$ column vectors
$$x_\tau^{[k]} = \left[\, x_{M_k(\tau-1)+1}^{[k]} \;\; x_{M_k(\tau-1)+2}^{[k]} \;\; \ldots \;\; x_{M_k\tau}^{[k]} \,\right]', \qquad \tau = 1, \ldots, N, \quad k \in \{k_{p-1}, \ldots, k_2, k_1\}. \qquad (21)$$
We may collect $x_\tau^{[m]}$ and the $p-1$ vectors defined by expression (21) in a single column vector, keeping the temporally aggregated data distinct from the high-frequency data:
$$x_\tau = \left[\, x_\tau^{[m]} \;\; x_\tau^{[k_{p-1}]\prime} \;\; \ldots \;\; x_\tau^{[k_2]\prime} \;\; x_\tau^{[1]\prime} \,\right]' = \left[\, {}^t x_\tau' \;\; x_\tau^{[1]\prime} \,\right]', \qquad \tau = 1, \ldots, N,$$
where ${}^t x_\tau = [\, x_\tau^{[m]} \;\; x_\tau^{[k_{p-1}]\prime} \;\; \ldots \;\; x_\tau^{[k_2]\prime} \,]'$ is a $(k^* \times 1)$ vector, with $k^* = \sum_{j=1}^{p-1} k_j$, containing all the temporally aggregated series at observation index $\tau$; $x_\tau^{[1]}$ is the $(m \times 1)$ vector of observations of the time series at the highest available frequency within the complete $\tau$-th cycle; thus each $x_\tau$ has dimension $[(k^*+m) \times 1]$.

The relationships linking the original high-frequency series $x_t$ and its temporal aggregates can be graphically represented as a hierarchical/grouped series. For example, for quarterly data $k \in \{4, 2, 1\}$, so every four quarterly observations are aggregated up to their annual and semi-annual counterparts. According to the notation so far, for a single year the quarterly hierarchical structure can be defined as in Figure 3, where $x^{[4]}$, $x^{[2]}$ and $x^{[1]}$ denote annual, semi-annual, and quarterly values, respectively. The relationships linking the nodes of the hierarchy can be expressed as we did in (7) for the cross-sectional (contemporaneous) hierarchy case:
$${}^t x_\tau = K x_\tau^{[1]}, \qquad x_\tau = R x_\tau^{[1]}, \qquad Z' x_\tau = 0_{(k^* \times 1)}, \qquad \tau = 1, \ldots, N, \qquad (22)$$
where $K$ is the $(k^* \times m)$ temporal aggregation matrix converting the high-frequency observations into lower-frequency (temporally aggregated) ones, $R = \begin{bmatrix} K \\ I_m \end{bmatrix}$ is the $[(k^*+m) \times m]$ temporal summing matrix, and $Z' = [\, I_{k^*} \;\; -K \,]$ is the zero-constraints kernel matrix valid for $x_\tau$.
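The relationships in (22) can be sketched numerically for the quarterly case; the matrices below are a minimal illustrative construction, not library code:

```python
import numpy as np

# Quarterly case: m = 4, K = {4, 2, 1}, k* = 3 low-frequency nodes
# (one annual + two semi-annual), stacked by descending aggregation order.
K = np.array([[1., 1., 1., 1.],   # annual total of the 4 quarters
              [1., 1., 0., 0.],   # first semester
              [0., 0., 1., 1.]])  # second semester
m = K.shape[1]
R = np.vstack([K, np.eye(m)])             # temporal summing matrix (7 x 4)
Z = np.hstack([np.eye(3), -K])            # zero-constraints kernel Z' (3 x 7)

# Any coherent x_tau = R @ x1 (x1 = the 4 quarterly values) satisfies Z' x_tau = 0
x1 = np.array([10., 20., 30., 40.])
x_tau = R @ x1

assert np.allclose(Z @ x_tau, 0)
assert np.allclose(Z @ R, 0)              # Z' R = 0 holds by construction
```

The identity $Z'R = 0$ is the temporal counterpart of the cross-sectional relation between the summing matrix and the zero-constraints kernel.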
For example, with quarterly data it is $m = 4$, $k^* = 3$, and
$$K = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \qquad R = \begin{bmatrix} K \\ I_4 \end{bmatrix}, \qquad Z' = [\, I_3 \;\; -K \,].$$
The temporal aggregation relationships can be extended to the whole time span covered by the series $x_t$. Denoting by $x = (x_1' \ldots x_\tau' \ldots x_N')'$ the $[N(k^*+m) \times 1]$ vector containing all the data of the series at any observed temporal frequency, the complete set of temporal aggregation constraints valid for this vector is given by
$$Z_N' x = 0_{(Nk^* \times 1)}, \qquad (23)$$
where $Z_N' = [\, I_{Nk^*} \;\; -K_N \,]$, and $K_N$ is the $(Nk^* \times Nm)$ matrix extending the temporal aggregation matrix $K$ to all $N$ cycles.

It is not always possible to represent the temporal aggregates of one series in a single tree such as Figure 3. In Appendix A.4 the representations valid for monthly and hourly hierarchies are shown. (For any given integer $m > 1$, there is a single unique temporal hierarchy only if $m = q^\alpha$, where $\alpha$ is a positive integer and $q$ is a prime number (Yang et al., 2017). A corollary is that a single unique hierarchy is only possible when there are no coprime pairs in the set $\{k_{p-1}, \ldots, k_2, k_1\}$ (Athanasopoulos et al., 2017).)

Suppose we have the $[(k^*+m) \times 1]$ vector $\hat{x}_h$ of unbiased base forecasts for the $p$ temporal aggregates of a single time series within a complete time cycle, i.e. at forecast horizon $h = 1$ for the lowest (most aggregated) time frequency. If the base forecasts have been computed independently, in general they do not fulfill the temporal aggregation constraints, i.e. $Z'\hat{x}_h \neq 0_{(k^* \times 1)}$. By adapting the general point forecast reconciliation formula (3), and dropping the suffix $h$ to simplify the notation, the vector of temporally reconciled forecasts is given by
$$\tilde{x} = \hat{x} - \Omega Z \left( Z' \Omega Z \right)^{-1} Z' \hat{x}, \qquad (24)$$
where $\Omega$ is a $[(k^*+m) \times (k^*+m)]$ p.d. matrix, assumed known. In order to consider possible residual-based estimates of matrix $\Omega$, denote by
$$\hat{e}_\tau^{[k]} = x_\tau^{[k]} - \hat{x}_\tau^{[k]}, \qquad \tau = 1, \ldots, N, \quad k \in \mathcal{K}, \qquad (25)$$
the $(M_k \times 1)$ vectors of the in-sample residuals at time index $\tau$ of the models used to generate the base forecasts of the temporally aggregated series of order $k$. These vectors can be organized in matrix form as
$$\widehat{E}_x^{[k]} = \begin{bmatrix} (\hat{e}_1^{[k]})' \\ \vdots \\ (\hat{e}_\tau^{[k]})' \\ \vdots \\ (\hat{e}_N^{[k]})' \end{bmatrix}, \qquad k \in \mathcal{K}, \qquad (26)$$
where each matrix $\widehat{E}_x^{[k]}$ has dimension $(N \times M_k)$, and then grouped in the $[N \times (k^*+m)]$ matrix of in-sample residuals
$$\widehat{E}_x = \left[\, \widehat{E}_x^{[m]} \;\; \widehat{E}_x^{[k_{p-1}]} \;\; \ldots \;\; \widehat{E}_x^{[k_2]} \;\; \widehat{E}_x^{[1]} \,\right].$$
Each column of this matrix contains the in-sample residuals pertaining to a specific node of the temporal hierarchy, so the sample cross-covariance matrix of the $k^*+m$ nodes of the temporal hierarchy is given by
$$\widehat{\Omega} = \frac{1}{N}\, \widehat{E}_x' \widehat{E}_x. \qquad (27)$$
(Expression (27) assumes that $\frac{1}{N}\sum_{\tau=1}^{N} \hat{e}_{\tau,l} = 0$, $l = 1, \ldots, k^*+m$. When this does not hold, $\widehat{\Omega}$ is the sample Mean Square Error (MSE) matrix.) This matrix is well defined if $N > (k^*+m)$; otherwise there may be singularity issues preventing its use in expression (24) in place of matrix $\Omega$.

Athanasopoulos et al. (2017) and Hyndman and Kourentzes (2018) consider the following alternative choices for $\Omega$ (the suffix 't' stands for 'temporal', to keep the 't'-procedures distinct from the 'cs'-ones shown in section 3.1):
• identity (t-ols): $\Omega = I_{k^*+m}$;
• structural (t-struc): $\Omega = \widehat{\Omega}_{\text{t-struc}} = \mathrm{diag}(R\,\mathbf{1}_m)$;
• hierarchy variance scaling (t-wlsh): $\Omega = \widehat{\Omega}_{\text{t-wlsh}} = I_{k^*+m} \odot \widehat{\Omega}$;
• series variance scaling (t-wlsv): $\Omega = \widehat{\Omega}_{\text{t-wlsv}}$;
• MinT-shr (t-shr): $\Omega = \widehat{\Omega}_{\text{t-shr}} = \hat{\lambda}\,\widehat{\Omega}_{\text{t-wlsh}} + (1-\hat{\lambda})\,\widehat{\Omega}$;
• MinT-sam (t-sam): $\Omega = \widehat{\Omega}$.

The series variance scaling matrix $\widehat{\Omega}_{\text{t-wlsv}}$ is a diagonal matrix "which contains estimates of the in-sample one-step-ahead error variances across each level" (Athanasopoulos et al., 2017, p. 64); it requires a reduced number of variances to be estimated ($p$ instead of $k^*+m$) as compared to the hierarchy variance scaling matrix $\widehat{\Omega}_{\text{t-wlsh}}$, with an increased sample size available for the estimation.

"As the purpose of temporal aggregation is to exploit important information about time series at different frequencies", Nystrup et al. (2020) propose further formulations that include potential information in the autocorrelation structure. The matrices considered in this paper are (for the time being, we do not consider all the newly proposed covariance matrices; the interested reader may refer to Nystrup et al., 2020):
• auto-covariance scaling (t-acov): $\Omega = \widehat{\Omega}_{\text{t-acov}}$;
• structural Markov (t-strar1): $\Omega = \widehat{\Omega}_{\text{t-strar1}}$;
• series Markov (t-sar1): $\Omega = \widehat{\Omega}_{\text{t-sar1}}$;
• hierarchy Markov (t-har1): $\Omega = \widehat{\Omega}_{\text{t-har1}}$.

Auto-covariance scaling makes use of the estimates of the full autocovariance matrices within each aggregation level, while ignoring correlations between aggregation levels:
$$\widehat{\Omega}_{\text{t-acov}} = \begin{bmatrix} \widehat{\Omega}^{[m]} & 0 & \cdots & 0 \\ 0 & \widehat{\Omega}^{[k_{p-1}]} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \widehat{\Omega}^{[1]} \end{bmatrix},$$
where the $(M_k \times M_k)$ matrices $\widehat{\Omega}^{[k]}$ are given by (matrix $\widehat{\Omega}^{[m]}$ reduces to a scalar variance):
$$\widehat{\Omega}^{[k]} = \frac{1}{N}\sum_{\tau=1}^{N} \hat{e}_\tau^{[k]} (\hat{e}_\tau^{[k]})' = \frac{1}{N}\, (\widehat{E}_x^{[k]})' \widehat{E}_x^{[k]}, \qquad k \in \mathcal{K}, \qquad (28)$$
where the vectors $\hat{e}_\tau^{[k]}$ and the matrices $\widehat{E}_x^{[k]}$ are given by (25) and (26), respectively. A necessary condition for matrix $\widehat{\Omega}_{\text{t-acov}}$ to be invertible is $N > m$, which is less demanding than what is needed for the non-singularity of matrix $\widehat{\Omega}$.

Because it is sometimes difficult to estimate the covariance matrix within each aggregation level without assuming that it has some special form, Nystrup et al. (2020) propose "an estimator that blends autocorrelation and variance information, but only requires estimation of the first order autocorrelation coefficient at each aggregation level". They consider the Toeplitz matrix of the estimated first-order autocorrelation coefficients of the in-sample residuals for the $p-1$ levels $k = k_1, \ldots, k_{p-1}$ of the series' temporal hierarchy. Denoting these autocorrelation coefficients by $\rho_{[k]}$, it is:
$$\Gamma^{[m]} = 1, \qquad \Gamma^{[k]} = \begin{bmatrix} 1 & \rho_{[k]} & \cdots & \rho_{[k]}^{M_k-1} \\ \rho_{[k]} & 1 & \cdots & \rho_{[k]}^{M_k-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{[k]}^{M_k-1} & \rho_{[k]}^{M_k-2} & \cdots & 1 \end{bmatrix}, \qquad k = k_1, \ldots, k_{p-1},$$
where each matrix $\Gamma^{[k]}$, $k \in \mathcal{K}$, has dimension $(M_k \times M_k)$. The $p$ matrices are used to build the $[(k^*+m) \times (k^*+m)]$ matrix
$$\Gamma = \begin{bmatrix} 1 & 0' & \cdots & 0' \\ 0 & \Gamma^{[k_{p-1}]} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Gamma^{[1]} \end{bmatrix},$$
which enters the following choices of $\Omega$:
$$\widehat{\Omega}_{\text{t-strar1}} = \widehat{\Omega}_{\text{t-struc}}^{1/2}\, \Gamma\, \widehat{\Omega}_{\text{t-struc}}^{1/2}, \qquad \widehat{\Omega}_{\text{t-sar1}} = \widehat{\Omega}_{\text{t-wlsv}}^{1/2}\, \Gamma\, \widehat{\Omega}_{\text{t-wlsv}}^{1/2}, \qquad \widehat{\Omega}_{\text{t-har1}} = \widehat{\Omega}_{\text{t-wlsh}}^{1/2}\, \Gamma\, \widehat{\Omega}_{\text{t-wlsh}}^{1/2}.$$
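The reconciliation formula (24), with any of the $\Omega$ choices above, is a one-line computation; here is an illustrative sketch for the quarterly case with the t-ols choice $\Omega = I$ (the base forecast values are made up):

```python
import numpy as np

# Quarterly temporal hierarchy: m = 4, k* = 3
K = np.array([[1., 1., 1., 1.], [1., 1., 0., 0.], [0., 0., 1., 1.]])
Z = np.hstack([np.eye(3), -K])            # zero-constraints kernel Z' (3 x 7)
Omega = np.eye(7)                         # t-ols; any p.d. Omega works here

# Incoherent base forecasts: [annual, sem1, sem2, q1, q2, q3, q4]
x_hat = np.array([102., 49., 50., 24., 26., 25., 27.])

# Formula (24): x_til = x_hat - Omega Z (Z' Omega Z)^{-1} Z' x_hat
x_til = x_hat - Omega @ Z.T @ np.linalg.solve(Z @ Omega @ Z.T, Z @ x_hat)

assert np.allclose(Z @ x_til, 0)          # reconciled forecasts are coherent
```

Swapping `Omega` for a diagonal matrix of residual variances gives the t-wlsh/t-wlsv variants, and for the full estimate (27) the t-sam variant.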
5. The cross-temporal forecast reconciliation framework

5.1 Cross-temporal aggregation constraints
The cross-sectional aggregation relationships (18), linking the $n$ series at a single time period, and the temporal aggregation relationships (22), valid for an individual variable, can be considered simultaneously by extending (i) the cross-sectional constraints to all observation frequencies, and (ii) the temporal aggregation relationships to all variables.

Considering the contemporaneous and temporal dimensions in the same framework requires extending and adapting the notation used so far. To this end, define the $p$ matrices $Y^{[k]}$, each of dimension $(n \times NM_k)$, as
$$Y^{[k]} = \begin{bmatrix} A^{[k]} \\ B^{[k]} \end{bmatrix}, \qquad k \in \mathcal{K},$$
where
$$B^{[k]} = \begin{bmatrix} b_1^{[k]\prime} \\ \vdots \\ b_i^{[k]\prime} \\ \vdots \\ b_{n_b}^{[k]\prime} \end{bmatrix}, \qquad A^{[k]} = \begin{bmatrix} a_1^{[k]\prime} \\ \vdots \\ a_j^{[k]\prime} \\ \vdots \\ a_{n_a}^{[k]\prime} \end{bmatrix}, \qquad k \in \mathcal{K},$$
are the matrices containing the $k$-order temporal aggregates of the bts ($B^{[k]}$) and uts ($A^{[k]}$), of dimension $(n_b \times NM_k)$ and $(n_a \times NM_k)$, respectively.

In order to be consistent with the notation so far, $Y^{[1]}$, $B^{[1]}$, and $A^{[1]}$ denote the matrices containing the data at the highest available sampling frequency, while $Y$, $B$, and $A$ are now used to denote the matrices containing the data at any considered temporal frequency, that is:
$$Y = \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} A^{[m]} & A^{[k_{p-1}]} & \cdots & A^{[k_2]} & A^{[1]} \\ B^{[m]} & B^{[k_{p-1}]} & \cdots & B^{[k_2]} & B^{[1]} \end{bmatrix},$$
where $Y$, $A$, and $B$ have $n$, $n_a$ and $n_b$ rows, respectively, and the same number of columns, $N(k^*+m)$.

Cross-sectional aggregation constraints
The cross-sectional aggregation relationships operating along all the time observation indices can be worked out by extending the latter equation in expression (18):
$$U' Y^{[k]} = 0_{(n_a \times NM_k)}, \qquad k \in \mathcal{K},$$
which can be expressed in compact form as
$$U' Y = 0_{[n_a \times N(k^*+m)]}. \qquad (29)$$
If we define $y$ as the $[nN(k^*+m) \times 1]$ vector containing all the observations of all series at any temporal aggregation level, organized 'by-variable-first-then-by-descending-aggregation-order', that is $y = \mathrm{vec}(Y')$, the cross-sectional (contemporaneous) constraints (29) can be equivalently expressed as:
$$\left( U' \otimes I_{N(k^*+m)} \right) y = 0_{[n_a N(k^*+m) \times 1]}. \qquad (30)$$

Temporal aggregation constraints
The temporal aggregation relationships (22), valid for a single series, can be extended to each component of the $n$-variate time series as follows:
$$\begin{bmatrix} A^{[m]\prime} & B^{[m]\prime} \\ \vdots & \vdots \\ A^{[k_2]\prime} & B^{[k_2]\prime} \end{bmatrix} = K_N \begin{bmatrix} A^{[1]\prime} & B^{[1]\prime} \end{bmatrix}, \qquad (31)$$
which can be equivalently written as $[\, I_{Nk^*} \;\; -K_N \,]\, Y' = 0_{(Nk^* \times n)}$, that is:
$$Z_N' Y' = 0_{(Nk^* \times n)}. \qquad (32)$$
The temporal aggregation constraints (32) can thus be re-stated as:
$$\left( I_n \otimes Z_N' \right) y = 0_{(nNk^* \times 1)}, \qquad (33)$$
which, for $n = 1$, is equivalent to expression (23).

In summary, by considering expressions (30) and (33) together, the cross-temporal constraints working on the complete set of observations can be expressed as:
$$\breve{H}' y = 0_{(n^* \times 1)}, \qquad (34)$$
where $n^* = n_a N(k^*+m) + nNk^*$, and
$$\breve{H}' = \begin{bmatrix} U' \otimes I_{N(k^*+m)} \\ I_n \otimes Z_N' \end{bmatrix}$$
is an $[n^* \times nN(k^*+m)]$ cross-temporal zero-constraints kernel matrix. Due to the simultaneous consideration of the temporal and cross-sectional relationships linking the various time series of the system, some rows of $\breve{H}'$ are redundant, and can be eliminated if one wishes a full row-rank zero-constraints kernel matrix. This issue is not new, since it has been encountered in the past when contemporaneous and temporal aggregation constraints are simultaneously considered for the reconciliation of a system of time series (Di Fonzo, 1990, Di Fonzo and Marini, 2011).
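The redundancy of $\breve{H}'$ can be checked numerically on a toy system; the hierarchy below (one total equal to the sum of two quarterly bottom series, so $n = 3$, $n_a = 1$, $m = 4$, $k^* = 3$, $N = 1$) is an illustrative choice, not an example from the paper:

```python
import numpy as np

U = np.array([[1., -1., -1.]])            # U': total - bottom1 - bottom2
K = np.array([[1., 1., 1., 1.], [1., 1., 0., 0.], [0., 0., 1., 1.]])
Z = np.hstack([np.eye(3), -K])            # Z'_N with N = 1
n, n_a, k_star, m = 3, 1, 3, 4
cols = k_star + m                         # 7 columns of Y

# H_breve' = [U' x I ; I_n x Z'_N] has n* = n_a(k*+m) + n k* = 16 rows,
# acting on y = vec(Y') of length n(k*+m) = 21
H_breve = np.vstack([np.kron(U, np.eye(cols)),
                     np.kron(np.eye(n), Z)])

# Only N(n_a m + n k*) = 13 rows are linearly independent: the n_a k*
# cross-sectional constraints on the lf series are implied by the hf
# cross-sectional constraints together with the temporal ones.
assert H_breve.shape == (16, n * cols)
assert np.linalg.matrix_rank(H_breve) == 13
```

The rank equals $21 - n_b m = 13$: coherent systems have exactly $n_b m$ free values (the hf-bts), so the kernel of valid constraints has that co-dimension.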
In detail, matrix $\breve{H}'$ consists of:
• $Nn_ak^*$ rows defining the cross-sectional (contemporaneous) aggregation constraints operating at the temporally aggregated (lf) observation indices;
• $Nn_am$ rows defining the cross-sectional (contemporaneous) aggregation constraints operating at the high-frequency (hf) observation indices;
• $Nn_ak^*$ rows defining the temporal aggregation constraints operating on both hf- and lf-uts;
• $Nn_bk^*$ rows defining the temporal aggregation constraints operating on both hf- and lf-bts.
Since the first set of $Nn_ak^*$ constraints is linearly dependent on the other rows of matrix $\breve{H}'$, a full row-rank cross-temporal zero-constraints kernel matrix $H'$ can be obtained by:
1. considering the $[Nn(k^*+m) \times Nn(k^*+m)]$ commutation matrix $P$ (Magnus and Neudecker, 2019, p. 54; see Appendix A.3.1) such that $P\,\mathrm{vec}(Y) = \mathrm{vec}(Y')$;
2. defining a matrix $U^*$ as:
$$U^* = \begin{bmatrix} 0_{(Nn_am \times Nnk^*)} & I_{Nm} \otimes U' \end{bmatrix} P';$$
3. considering the $[N(n_am + nk^*) \times Nn(k^*+m)]$ matrix:
$$H' = \begin{bmatrix} U^* \\ I_n \otimes Z_N' \end{bmatrix}, \qquad (35)$$
which has full row rank equal to $N(n_am + nk^*) = n^* - Nn_ak^*$, and allows the complete cross-temporal constraints (34) to be re-stated as:
$$H' y = 0. \qquad (36)$$

Cross-temporal structural representation
The cross-temporal structural representation can be seen as a generalization of the cross-sectional structural representation (see section 3) from a single time index $t$ to a single cycle index $\tau$ (i.e., the low-frequency time index), extended to cover $n(k^*+m)$ nodes instead of $n$.

Denote by $Y_\tau$ the $[n \times (k^*+m)]$ data matrix available at cycle $\tau$:
$$Y_\tau = \begin{bmatrix} A_\tau \\ B_\tau \end{bmatrix} = \begin{bmatrix} A_\tau^{[m]} & A_\tau^{[k_{p-1}]} & \ldots & A_\tau^{[k_2]} & A_\tau^{[1]} \\ B_\tau^{[m]} & B_\tau^{[k_{p-1}]} & \ldots & B_\tau^{[k_2]} & B_\tau^{[1]} \end{bmatrix}, \qquad \tau = 1, \ldots, N,$$
and let $\check{S}$ be the $[n(k^*+m) \times n_bm]$ cross-temporal summation matrix
$$\check{S} = \begin{bmatrix} \check{C} \\ I_{n_bm} \end{bmatrix},$$
where $\check{C}$ denotes an $(n_a^* \times n_bm)$ cross-temporal aggregation matrix mapping the hf-bts into the uts and lf-bts ones ($n_a^* = n_a(k^*+m) + n_bk^*$). Denote by
$$a_\tau^* = \begin{bmatrix} \mathrm{vec}(A_\tau') \\ \mathrm{vec}(B_\tau^{*\prime}) \end{bmatrix}, \qquad \tau = 1, \ldots, N,$$
the $(n_a^* \times 1)$ vector of 'cross-temporal upper series', containing the uts and lf-bts data at the low-frequency time index $\tau$, where $B_\tau^* = [\, B_\tau^{[m]} \;\; B_\tau^{[k_{p-1}]} \;\; \ldots \;\; B_\tau^{[k_2]} \,]$, $\tau = 1, \ldots, N$, and by
$$b_\tau^{[1]} = \mathrm{vec}\left(B_\tau^{[1]\prime}\right), \qquad \tau = 1, \ldots, N,$$
the $(n_bm \times 1)$ vector of 'cross-temporal bottom series', containing the hf-bts data. The structural representation of a cross-temporal system of $n$ time series takes the form
$$\check{y}_\tau = \check{S}\, b_\tau^{[1]}, \qquad \tau = 1, \ldots, N, \qquad (37)$$
where $\check{y}_\tau$ is an $[n(k^*+m) \times 1]$ vector in which we place all the uts and the lf-bts at the top, and all the hf-bts at the bottom:
$$\check{y}_\tau = \begin{bmatrix} a_\tau^* \\ b_\tau^{[1]} \end{bmatrix}, \qquad \tau = 1, \ldots, N. \qquad (38)$$

Let us assume that we have unbiased base forecasts for all the individual time series of the multivariate hierarchical/grouped time series, and for all levels of the temporal hierarchies built from the highest available sampling frequency.
In addition, assume that the forecast horizon for the most temporally aggregated time series is $h = 1$, and that the forecast horizons for the other temporally aggregated series cover the entire time cycle. This means that (i) the forecast horizon for the highest-frequency time series is equal to $m$, and (ii) in general, the forecast horizon for a temporally aggregated time series of order $k$ spans from 1 to $M_k$.

The base forecasts for each bottom time series of the system form the vectors
$$\hat{b}_i^{[k]}, \qquad i = 1, \ldots, n_b, \quad k \in \mathcal{K},$$
where $\hat{b}_i^{[1]} = \{\hat{b}_{il}^{[1]}\}_{l=1}^{m}$ is the $(m \times 1)$ vector containing the base forecasts for the $i$-th high-frequency bottom time series (hf-bts), which are the 'very' bottom time series in the cross-temporal framework, while the remaining $\hat{b}_i^{[k]}$'s (for $k \neq 1$) contain the $M_k$ forecasts for the lower-frequency ones (lf-bts). The base forecasts for the upper time series can be defined likewise as
$$\hat{a}_j^{[k]}, \qquad j = 1, \ldots, n_a, \quad k \in \mathcal{K},$$
where $\hat{a}_j^{[1]} = \{\hat{a}_{jl}^{[1]}\}_{l=1}^{m}$ is the $(m \times 1)$ vector containing the base forecasts for the $j$-th high-frequency upper time series (hf-uts), and the $\hat{a}_j^{[k]}$'s (for $k \neq 1$) are $(M_k \times 1)$ vectors of low-frequency upper time series (lf-uts) forecasts.

Let us collect these base forecasts in the $(n_b \times M_k)$ and $(n_a \times M_k)$ matrices, respectively,
$$\widehat{B}^{[k]} = \begin{bmatrix} \hat{b}_1^{[k]\prime} \\ \vdots \\ \hat{b}_i^{[k]\prime} \\ \vdots \\ \hat{b}_{n_b}^{[k]\prime} \end{bmatrix}, \qquad \widehat{A}^{[k]} = \begin{bmatrix} \hat{a}_1^{[k]\prime} \\ \vdots \\ \hat{a}_j^{[k]\prime} \\ \vdots \\ \hat{a}_{n_a}^{[k]\prime} \end{bmatrix}, \qquad k \in \mathcal{K}. \qquad (39)$$
The matrix containing the base bts forecasts is given by:
$$\widehat{B} = \left[\, \widehat{B}^{[m]} \;\; \widehat{B}^{[k_{p-1}]} \;\; \ldots \;\; \widehat{B}^{[k_2]} \;\; \widehat{B}^{[1]} \,\right],$$
where $\widehat{B}$ has dimension $[n_b \times (k^*+m)]$.
The base uts forecasts can be similarly arranged in the $[n_a \times (k^*+m)]$ matrix
$$\widehat{A} = \left[\, \widehat{A}^{[m]} \;\; \widehat{A}^{[k_{p-1}]} \;\; \ldots \;\; \widehat{A}^{[k_2]} \;\; \widehat{A}^{[1]} \,\right].$$
(The general case $h \geq 1$ can be dealt with in a straightforward way.) We then obtain the $p$ matrices $\widehat{Y}^{[k]}$, each of dimension $(n \times M_k)$, with $n = n_a + n_b$, containing the base forecasts for the temporal aggregation level $k$ of both uts and bts:
$$\widehat{Y}^{[k]} = \begin{bmatrix} \widehat{A}^{[k]} \\ \widehat{B}^{[k]} \end{bmatrix}, \qquad k \in \mathcal{K}.$$
Finally, denoting by $\widehat{Y}$ the $[n \times (k^*+m)]$ matrix containing the base forecasts of all series and for all temporal aggregation levels, it is:
$$\widehat{Y} = \left[\, \widehat{Y}^{[m]} \;\; \widehat{Y}^{[k_{p-1}]} \;\; \ldots \;\; \widehat{Y}^{[k_2]} \;\; \widehat{Y}^{[1]} \,\right] = \begin{bmatrix} \widehat{A} \\ \widehat{B} \end{bmatrix}.$$
In general, the base forecasts fulfill neither the cross-sectional (contemporaneous) nor the temporal aggregation constraints. That is, respectively:
$$U'\widehat{Y} \neq 0_{[n_a \times (k^*+m)]}, \qquad Z'\widehat{Y}' \neq 0_{(k^* \times n)}.$$
The cross-temporal point forecast reconciliation problem can thus be stated as follows: we look for a reconciled point forecast matrix, say $\widetilde{Y}$, which is 'as close as possible' (according to a pre-specified metric) to the base forecast matrix $\widehat{Y}$, and simultaneously in line with the cross-sectional and temporal aggregation constraints, that is:
$$U'\widetilde{Y} = 0_{[n_a \times (k^*+m)]}, \qquad Z'\widetilde{Y}' = 0_{(k^* \times n)}. \qquad (40)$$
As we have previously shown, the relationships (40) can be expressed in vectorized form as $H'\tilde{y} = 0$, where $\tilde{y} = \mathrm{vec}(\widetilde{Y}')$ and, since $h = 1$, the full row-rank matrix $H'$ in (35) becomes
$$H' = \begin{bmatrix} U^* \\ I_n \otimes Z' \end{bmatrix}. \qquad (41)$$

5.3 Bottom-up cross-temporal forecast reconciliation

Cross-temporal reconciled forecasts for all series at any temporal aggregation level can easily be computed by appropriate summation of the hf-bts base forecasts $\hat{b}_i^{[1]}$, $i = 1, \ldots, n_b$, according to a bottom-up procedure, much as is currently done in both the cross-sectional and temporal frameworks. Denoting by $\ddot{Y}$ the $[n \times (k^*+m)]$ matrix containing the bottom-up cross-temporal reconciled forecasts, it is:
$$\ddot{Y} = \begin{bmatrix} \ddot{A} \\ \ddot{B} \end{bmatrix} = \begin{bmatrix} \ddot{A}^{[m]} & \ddot{A}^{[k_{p-1}]} & \ldots & \ddot{A}^{[k_2]} & \ddot{A}^{[1]} \\ \ddot{B}^{[m]} & \ddot{B}^{[k_{p-1}]} & \ldots & \ddot{B}^{[k_2]} & \ddot{B}^{[1]} \end{bmatrix}.$$
Since the hf-bts reconciled forecasts are by definition equal to the hf-bts base forecasts, i.e. $\ddot{B}^{[1]} = \widehat{B}^{[1]}$, the bottom-up forecast reconciliation procedure consists of the following steps:
1. compute the hf-uts reconciled forecasts using the cross-sectional aggregation relationship (15): $\ddot{A}^{[1]} = C\widehat{B}^{[1]}$;
2. compute the lf-bts reconciled forecasts according to the temporal aggregation relationship (31):
$$\begin{bmatrix} (\ddot{B}^{[m]})' \\ \vdots \\ (\ddot{B}^{[k_2]})' \end{bmatrix} = K (\widehat{B}^{[1]})' \;\Rightarrow\; \left[\, \ddot{B}^{[m]} \;\; \ddot{B}^{[k_{p-1}]} \;\; \ldots \;\; \ddot{B}^{[k_2]} \,\right] = \widehat{B}^{[1]} K';$$
3. compute the lf-uts reconciled forecasts by cross-sectional aggregation of the lf-bts reconciled forecasts obtained in the previous step:
$$\ddot{A}^{[k]} = C\ddot{B}^{[k]}, \;\; k \in \mathcal{K} \;\Rightarrow\; \left[\, \ddot{A}^{[m]} \;\; \ddot{A}^{[k_{p-1}]} \;\; \ldots \;\; \ddot{A}^{[k_2]} \,\right] = C\widehat{B}^{[1]}K'.$$
In summary, the matrix containing the bottom-up reconciled forecasts, which depends solely on the hf-bts base forecasts, is given by:
$$\ddot{Y} = \begin{bmatrix} C\widehat{B}^{[1]}K' & C\widehat{B}^{[1]} \\ \widehat{B}^{[1]}K' & \widehat{B}^{[1]} \end{bmatrix}. \qquad (42)$$
An equivalent, succinct alternative to expression (42) consists in exploiting the cross-temporal structural representation (37):
$$\ddot{\check{y}} = \check{S}\,\hat{b}^{[1]}, \qquad (43)$$
where $\hat{b}^{[1]} = \mathrm{vec}(\widehat{B}^{[1]\prime})$, keeping in mind that the elements in $\check{y}$ and in $y$ are organized differently, and in general $\ddot{\check{y}} \neq \ddot{y}$, with $\ddot{y} = \mathrm{vec}(\ddot{Y}')$. This last issue can easily be dealt with by considering the $[n(k^*+m) \times n(k^*+m)]$ permutation matrix $Q$ such that $y = Q\check{y}$ (see Appendix A.3.2): given the orthogonality of matrix $Q$, expression (43) can be re-stated as $\ddot{y} = Q\check{S}\hat{b}^{[1]}$. However, the formulation of matrix $\check{S}$, which requires managing linear relationships across the cross-sectional and temporal dimensions, may be tedious and error-prone, mostly for large collections of time series. In such cases it may be preferable to use formulation (42), where $\widehat{B}^{[1]}$, $C$ and $K$ are involved in simple matrix products. Appendix A.5 describes all these features with reference to a 'toy example' of a very simple two-level hierarchy with two quarterly bottom time series.
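Formulation (42) is easy to verify numerically; the following sketch uses the toy two-level hierarchy with two quarterly bottom series (illustrative numbers, not the appendix example):

```python
import numpy as np

C = np.array([[1., 1.]])                               # total = b1 + b2
K = np.array([[1., 1., 1., 1.], [1., 1., 0., 0.], [0., 0., 1., 1.]])
B1_hat = np.array([[10., 12., 11., 13.],               # hf-bts base forecasts
                   [20., 22., 21., 23.]])              # (n_b = 2, m = 4)

# Bottom-up matrix (42): [[C B K', C B], [B K', B]]
Y_bu = np.block([[C @ B1_hat @ K.T, C @ B1_hat],       # uts: lf and hf
                 [B1_hat @ K.T,     B1_hat]])          # bts: lf and hf

# The result is coherent in both dimensions
U = np.array([[1., -1., -1.]])
Z = np.hstack([np.eye(3), -K])
assert np.allclose(U @ Y_bu, 0)                        # cross-sectional (29)
assert np.allclose(Z @ Y_bu.T, 0)                      # temporal (32)
```

Only the $n_b \times m$ hf-bts base forecasts enter the computation, which is what makes the bottom-up approach simple but also wasteful of the information in the other base forecasts.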
6. Cross-temporal optimal forecast combination
Let us consider the multivariate regression model
$$\widehat{Y} = Y + E, \qquad (44)$$
where the matrices involved each have dimension $[n \times (k^*+m)]$ and contain, respectively, the base forecasts ($\widehat{Y}$), the target forecasts ($Y$), and the coherency errors ($E$) for the $n$ component variables of the linearly constrained time series of interest. For each variable, $k^*+m$ base forecasts are available, pertaining to all aggregation levels of the temporal hierarchy for a complete cycle of high-frequency observations, $m$.

Consider now two vectorized versions of model (44), obtained by transforming the matrices either in original form,
$$\mathrm{vec}(\widehat{Y}) = \mathrm{vec}(Y) + \mathrm{vec}(E) \;\Leftrightarrow\; \widehat{\mathcal{Y}} = \mathcal{Y} + \varepsilon, \qquad (45)$$
or in transposed form,
$$\mathrm{vec}(\widehat{Y}') = \mathrm{vec}(Y') + \mathrm{vec}(E') \;\Leftrightarrow\; \hat{y} = y + \eta. \qquad (46)$$
The target forecasts must fulfill the cross-sectional (contemporaneous) constraints $U'Y = 0_{[n_a \times (k^*+m)]}$ and the temporal aggregation constraints $Z'Y' = 0_{(k^* \times n)}$, that is, in vectorized form:
$$\left( I_{k^*+m} \otimes U' \right) \mathcal{Y} = 0_{[n_a(k^*+m) \times 1]} \;\Leftrightarrow\; \left( U' \otimes I_{k^*+m} \right) y = 0_{[n_a(k^*+m) \times 1]}, \qquad (47)$$
$$\left( Z' \otimes I_n \right) \mathcal{Y} = 0_{(k^*n \times 1)} \;\Leftrightarrow\; \left( I_n \otimes Z' \right) y = 0_{(k^*n \times 1)}. \qquad (48)$$
Denote by $P$ the $[n(k^*+m) \times n(k^*+m)]$ commutation matrix such that
$$P\,\mathrm{vec}(Y) = \mathrm{vec}(Y') \;\Leftrightarrow\; P\mathcal{Y} = y, \qquad P\,\mathrm{vec}(\widehat{Y}) = \mathrm{vec}(\widehat{Y}') \;\Leftrightarrow\; P\widehat{\mathcal{Y}} = \hat{y}, \qquad P\,\mathrm{vec}(E) = \mathrm{vec}(E') \;\Leftrightarrow\; P\varepsilon = \eta.$$
As a consequence, using the full row-rank matrix $H'$ defined by expression (41), the constraints (47) and (48) can be re-stated as:
$$H'y = 0 \;\Leftrightarrow\; H'P\mathcal{Y} = 0.$$
Let $W = E[\varepsilon\varepsilon']$ be the covariance matrix of vector $\varepsilon$, and $\Omega = E[\eta\eta']$ the covariance matrix of vector $\eta$. Clearly, $W$ and $\Omega$ are different parameterizations of the same statistical object, i.e.
the covariance matrix of the random disturbances in the multivariate regression model (44), for which the following relationships hold: $\Omega = PWP'$, $W = P'\Omega P$.

In order to apply the general point forecast reconciliation formula (3) to a cross-temporal forecast reconciliation problem, we may consider either the expression
$$\tilde{y} = \hat{y} - \Omega H \left( H'\Omega H \right)^{-1} H'\hat{y},$$
where $\hat{y} = \mathrm{vec}(\widehat{Y}')$ is the row vectorization of the base forecasts matrix $\widehat{Y}$, or equivalently re-state the expression above as
$$\widetilde{\mathcal{Y}} = \widehat{\mathcal{Y}} - WP'H \left( H'PWP'H \right)^{-1} H'P\widehat{\mathcal{Y}},$$
by considering the column vectorization as in (45).

6.1 Simple alternative approximations of the covariance matrix for cross-temporal point forecast reconciliation

Consider the column vectorized form (45) of the multivariate regression model, whose random disturbances can be written as
$$\varepsilon = \left[\, \varepsilon_1^{[m]\prime} \;\; \varepsilon_1^{[k_{p-1}]\prime} \cdots \varepsilon_{m/k_{p-1}}^{[k_{p-1}]\prime} \;\; \cdots \;\; \varepsilon_1^{[1]\prime} \cdots \varepsilon_m^{[1]\prime} \,\right]',$$
where each $(n \times 1)$ vector $\varepsilon_l^{[k]}$, $k \in \mathcal{K}$, $l = 1, \ldots, M_k$, contains contemporaneous random disturbances, i.e. at the same observation index of a given temporal aggregation order.

A simple, though rather unrealistic, generalization to the cross-temporal framework of the cross-sectional approach (see section 3.1) consists in assuming that only the disturbances at the same time index of the same temporal aggregation level are correlated, whereas no temporal dependence (either within the same series at different times, or between the $n$ series) is admitted:
$$E\left[ \varepsilon_r^{[k_i]} (\varepsilon_s^{[k_j]})' \right] = \begin{cases} W_l^{[k]} & \text{if } k_i = k_j = k, \; r = s = l \\ 0 & \text{otherwise} \end{cases}, \qquad k \in \mathcal{K}, \; l = 1, \ldots, M_k.$$
In this case, the covariance matrix $W$ has the block-diagonal structure
$$W = \mathrm{diag}\left( W_1^{[m]}, \; W_1^{[k_{p-1}]}, \ldots, W_{m/k_{p-1}}^{[k_{p-1}]}, \; \ldots, \; W_1^{[1]}, \ldots, W_m^{[1]} \right). \qquad (49)$$
Furthermore, if it is assumed that within each temporal aggregation level the random disturbances follow a multivariate white noise, so that the contemporaneous covariance matrices are constant in time (i.e., $W_l^{[k]} = W^{[k]}$, $k \in \mathcal{K}$, $l = 1, \ldots, M_k$), the previous expression simplifies to:
$$W = \mathrm{diag}\left( W^{[m]}, \; I_{m/k_{p-1}} \otimes W^{[k_{p-1}]}, \; \ldots, \; I_m \otimes W^{[1]} \right). \qquad (50)$$
From a practical point of view, each $(n \times n)$ matrix $W^{[k]}$, $k \in \mathcal{K}$, may be approximated as in the cross-sectional forecast reconciliation case, possibly using the in-sample residuals (see section 3.1). Expressions (49) and (50) can thus be seen as two simple extensions to the cross-temporal case of the approach developed in the cross-sectional framework, where no temporal dependence is accounted for, either within or between the $n$ series.

We may similarly propose a simplified pattern for the disturbance covariance matrix of the multivariate regression model (44) by considering the row vectorization form (46). In this case, the random disturbances vector $\eta$ can be written as $\eta = [\, \eta_1' \ldots \eta_i' \ldots \eta_n' \,]'$, where each $[(k^*+m) \times 1]$ vector $\eta_i$, $i = 1, \ldots, n$, contains the random disturbances at the different observation indices of the various temporal aggregation levels for the same series $i$. If we assume that the $n$ series are uncorrelated at any observation index of any temporal aggregation level (i.e., neither contemporaneous nor temporal correlation is admitted between the series, which again is rather unrealistic), denoting by $\Omega_{ii} = E(\eta_i\eta_i')$, $i = 1, \ldots, n$, the $[(k^*+m) \times (k^*+m)]$ covariance matrix of the coherency errors of the temporal hierarchies of series $i$, the complete matrix $\Omega$ can be written as follows:
$$\Omega = \mathrm{diag}\left( \Omega_{11}, \; \Omega_{22}, \; \ldots, \; \Omega_{nn} \right), \qquad (51)$$
where each matrix $\Omega_{ii}$, $i = 1, \ldots, n$, may be approximated as in the temporal forecast reconciliation case, possibly using the in-sample residuals (see section 4.1). Thus expression (51) can be seen as a very simple (maybe too simple!) extension to the cross-temporal case of the approach developed in the temporal hierarchies framework, where no correlation is admitted between the random errors of the $n$ series.

Clearly, the two covariance patterns (50) and (51) (i) sit at opposite ends of the possible ways of dealing with cross-temporal variables, and (ii) should be considered as first practical devices to make the optimal combination forecast approach feasible for the cross-temporal framework as well. This is undoubtedly one of the most important open issues remaining in this field, and we plan to investigate it in depth in the near future.

Residual-based estimates of the covariance matrix $W$ (and of its re-parameterized counterpart $\Omega$) make use of the in-sample residuals of the models used to forecast the $n$ time series at any temporal aggregation level. Denote by $\widehat{E}_l^{[k]}$, $k \in \mathcal{K}$, $l = 1, \ldots, M_k$, the $(n \times N)$ matrix containing the in-sample residuals for a single node of the cross-temporal hierarchy (i.e., its $i$-th row contains the residuals for the $N$ sub-periods $l$ of the model used to forecast the temporal aggregate of order $k$ of series $i$). For each temporal aggregation level $k \in \mathcal{K}$, the $M_k$ matrices $\widehat{E}_l^{[k]}$ can be grouped into the $(n \times NM_k)$ matrix
$$\widehat{E}^{[k]} = \left[\, \widehat{E}_1^{[k]} \;\; \ldots \;\; \widehat{E}_l^{[k]} \;\; \ldots \;\; \widehat{E}_{M_k}^{[k]} \,\right], \qquad k \in \mathcal{K}.$$
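The block-diagonal pattern (51) is straightforward to assemble; this is an illustrative numpy sketch with made-up per-series covariance blocks:

```python
import numpy as np

# Per-series blocks Omega_ii on the diagonal, zero cross-series blocks.
rng = np.random.default_rng(2)
n, dim = 3, 7                          # n series, k* + m = 7 nodes each

blocks = []
for _ in range(n):
    A = rng.standard_normal((dim, 20))
    blocks.append(A @ A.T / 20)        # a p.d. Omega_ii-style estimate

Omega = np.zeros((n * dim, n * dim))
for i, Oii in enumerate(blocks):
    Omega[i * dim:(i + 1) * dim, i * dim:(i + 1) * dim] = Oii

# Cross-series (off-diagonal) blocks are zero by construction
assert np.allclose(Omega[:dim, dim:2 * dim], 0)
assert np.allclose(Omega[:dim, :dim], blocks[0])
```

Pattern (50) is assembled the same way, but with the blocks indexed by temporal aggregation level rather than by series, each repeated $M_k$ times via a Kronecker product with $I_{M_k}$.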
The $[n(k^*+m) \times N]$ matrix containing all the residuals at any time observation index and any temporal aggregation level can in turn be written as
$$\widehat{E} = \begin{bmatrix} \widehat{E}_1^{[m]} \\ \widehat{E}_1^{[k_{p-1}]} \\ \vdots \\ \widehat{E}_{m/k_{p-1}}^{[k_{p-1}]} \\ \vdots \\ \widehat{E}_1^{[1]} \\ \vdots \\ \widehat{E}_m^{[1]} \end{bmatrix} = \left[\, \hat{e}_1 \;\; \ldots \;\; \hat{e}_\tau \;\; \ldots \;\; \hat{e}_N \,\right],$$
where each $[n(k^*+m) \times 1]$ vector $\hat{e}_\tau$, $\tau = 1, \ldots, N$, is given by:
$$\hat{e}_\tau = \Big[\, \underbrace{\hat{e}_{1,\tau}^{[m]} \; \ldots \; (\hat{e}_{1,\tau}^{[1]})'}_{k^*+m} \;\; \cdots \;\; \underbrace{\hat{e}_{n,\tau}^{[m]} \; \ldots \; (\hat{e}_{n,\tau}^{[1]})'}_{k^*+m} \,\Big]'.$$
The sample residual covariance matrix can be calculated according to both parameterizations as:
$$\widehat{\Omega}_{\text{sam}} = \frac{1}{N}\sum_{\tau=1}^{N} \hat{e}_\tau \hat{e}_\tau' = \frac{1}{N}\,\widehat{E}\widehat{E}', \qquad \widehat{W}_{\text{sam}} = P'\widehat{\Omega}_{\text{sam}}P.$$
However, in many practical situations the matrix $\widehat{E}$ has a number of rows (equal to the number of nodes in the cross-temporal hierarchy) much larger than the number of columns (equal to $N = T/m$). Thus the matrices $\widehat{\Omega}_{\text{sam}}$ and $\widehat{W}_{\text{sam}}$ might not have good properties (in particular, they are not p.d. if $N \leq n(k^*+m)$), and simplified approximations must be looked for. Two feasible alternatives are given by either the diagonalization or the shrinkage of matrix $\widehat{W}_{\text{sam}}$, that is, respectively:
$$\widehat{W}_{\text{wlsh}} = I_{n(k^*+m)} \odot \widehat{W}_{\text{sam}}, \qquad \widehat{W}_{\text{shr}} = \hat{\lambda}\,\widehat{W}_{\text{wlsh}} + (1-\hat{\lambda})\,\widehat{W}_{\text{sam}},$$
where $\widehat{W}_{\text{wlsh}}$ is a diagonal matrix containing the estimates of the 'hierarchy variances' for each node of the cross-temporal hierarchy, $\widehat{W}_{\text{shr}}$ is the matrix obtained by shrinkage of $\widehat{W}_{\text{sam}}$ towards the target $\widehat{W}_{\text{wlsh}}$, and $\hat{\lambda}$ is an estimate of the shrinkage intensity coefficient $\lambda$, $0 \leq \lambda \leq 1$.
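The diagonalization and shrinkage steps can be sketched as follows; the residual matrix and the fixed shrinkage intensity are illustrative (in practice $\hat{\lambda}$ is estimated from the data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, N = 6, 40                     # n(k*+m) hierarchy nodes, N cycles
E_hat = rng.standard_normal((n_nodes, N))   # stand-in in-sample residuals

W_sam = (E_hat @ E_hat.T) / N          # sample (MSE) covariance matrix
W_wlsh = np.diag(np.diag(W_sam))       # diagonalization: 'hierarchy variances'
lambda_ = 0.5                          # illustrative shrinkage intensity
W_shr = lambda_ * W_wlsh + (1 - lambda_) * W_sam

# Shrinkage keeps the sample variances and pulls covariances toward zero
assert np.allclose(np.diag(W_shr), np.diag(W_sam))
assert np.all(np.linalg.eigvalsh(W_shr) > 0)   # p.d. here, since N > n_nodes
```

With `N <= n_nodes` the sample matrix `W_sam` is singular, which is exactly the situation in which the shrunk (or diagonal) version becomes necessary.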
Both $\widehat{W}_{\text{sam}}$ and $\widehat{W}_{\text{shr}}$ refer to all the $n(k^*+m)$ hierarchy nodes simultaneously taken but, unlike the former matrix, the latter should not suffer from possible singularity problems.

The $(n \times n)$ matrices $W^{[k]}_l$, $k \in \mathcal{K}$, $l = 1, \dots, M_k$, forming the blocks on the diagonal of matrix (49), can be estimated both in full and shrunk version using the in-sample residuals $\widehat{E}^{[k]}_l$:
$$
\widehat{W}^{[k]}_{l,\text{sam}} = \frac{1}{N}\,\widehat{E}^{[k]}_l\big(\widehat{E}^{[k]}_l\big)', \qquad k \in \mathcal{K}, \; l = 1, \dots, M_k, \tag{52}
$$
$$
\widehat{W}^{[k]}_{l,\text{shr}} = \hat{\lambda}_{k,l}\Big(I_n \odot \widehat{W}^{[k]}_{l,\text{sam}}\Big) + \big(1-\hat{\lambda}_{k,l}\big)\,\widehat{W}^{[k]}_{l,\text{sam}}, \qquad k \in \mathcal{K}, \; l = 1, \dots, M_k.
$$
(Indeed, $\check{\Omega}_{\text{sam}}$ and $\check{W}_{\text{sam}}$ are mean square error (MSE) matrices.) Similarly, full and shrunk estimates of the matrices $W^{[k]}$, $k \in \mathcal{K}$, forming the blocks on the diagonal of matrix (50), may be computed as:
$$
\widehat{W}^{[k]}_{\text{sam}} = \frac{1}{N M_k}\,\widehat{E}^{[k]}\big(\widehat{E}^{[k]}\big)', \qquad k \in \mathcal{K}, \tag{53}
$$
$$
\widehat{W}^{[k]}_{\text{shr}} = \hat{\lambda}_{k}\Big(I_n \odot \widehat{W}^{[k]}_{\text{sam}}\Big) + \big(1-\hat{\lambda}_{k}\big)\,\widehat{W}^{[k]}_{\text{sam}}, \qquad k \in \mathcal{K}. \tag{54}
$$
While expression (52) always requires $N > n$ in order to have good properties, formula (53) makes it clear that - except for $\widehat{W}^{[m]}_{\text{sam}}$, which is calculated with the same $N$ residuals for each series as $\widehat{W}^{[m]}_{1,\text{sam}}$ - the estimates are based on more data, and the necessary condition to have a p.d. matrix is $N M_k > n$. However, in order to have all the $p$ matrices $\widehat{W}^{[k]}_{\text{sam}}$ well defined, the more restrictive condition $N > n$ should be met. Matrices (52)-(54) can be used to approximate $W$ as follows:
$$
\widehat{W}^{BD}_{h,\text{sam}} = \mathrm{diag}\Big(\widehat{W}^{[m]}_{1,\text{sam}},\, \widehat{W}^{[k_{p-1}]}_{1,\text{sam}},\, \dots,\, \widehat{W}^{[k_{p-1}]}_{m/k_{p-1},\text{sam}},\, \dots,\, \widehat{W}^{[1]}_{1,\text{sam}},\, \dots,\, \widehat{W}^{[1]}_{m,\text{sam}}\Big),
$$
$$
\widehat{W}^{BD}_{h,\text{shr}} = \mathrm{diag}\Big(\widehat{W}^{[m]}_{1,\text{shr}},\, \widehat{W}^{[k_{p-1}]}_{1,\text{shr}},\, \dots,\, \widehat{W}^{[k_{p-1}]}_{m/k_{p-1},\text{shr}},\, \dots,\, \widehat{W}^{[1]}_{1,\text{shr}},\, \dots,\, \widehat{W}^{[1]}_{m,\text{shr}}\Big),
$$
$$
\check{W}^{BD}_{\text{sam}} = \mathrm{diag}\Big(\check{W}^{[m]}_{\text{sam}},\; I_{m/k_{p-1}} \otimes \check{W}^{[k_{p-1}]}_{\text{sam}},\; \dots,\; I_m \otimes \check{W}^{[1]}_{\text{sam}}\Big),
$$
$$
\check{W}^{BD}_{\text{shr}} = \mathrm{diag}\Big(\check{W}^{[m]}_{\text{shr}},\; I_{m/k_{p-1}} \otimes \check{W}^{[k_{p-1}]}_{\text{shr}},\; \dots,\; I_m \otimes \check{W}^{[1]}_{\text{shr}}\Big).
$$
The approximations of $W$ (or $\Omega$) shown so far are simple extensions to the cross-temporal framework of the approximations for $W$ (or $\Omega$) considered either in cross-sectional or in temporal forecast reconciliation. For the time being, we are considering the following approximations ('oct' stands for 'optimal cross-temporal'):

• identity (oct-ols): $W = \Omega = I_{n(k^*+m)}$
• structural (oct-struc): $W = \widehat{W}_{\text{struc}} = P'\big[\mathrm{diag}\big(Q\check{S}\,\mathbf{1}_{n_b m}\big)\big]P = \mathrm{diag}\big(P'Q\check{S}\,\mathbf{1}_{n_b m}\big)$ (see section 5.3, and appendix A.3.2)
• hierarchy variance scaling (oct-wlsh): $W = \widehat{W}_{\text{wlsh}}$
• series variance scaling (oct-wlsv): $W = \check{W}_{\text{wlsv}} = P'\check{\Omega}_{\text{wlsv}}P$, where $\check{\Omega}_{\text{wlsv}}$ is a straightforward extension of $\check{\Omega}_{\text{t-wlsv}}$ (see section 4.1)
• block-diagonal shrunk cross-covariance scaling (oct-bdshr): $W = \check{W}^{BD}_{\text{shr}}$
• block-diagonal cross-covariance scaling (oct-bdsam): $W = \check{W}^{BD}_{\text{sam}}$
• auto-covariance scaling (oct-acov): $W = \check{W}_{\text{acov}} = P'\check{\Omega}_{\text{acov}}P$, where $\check{\Omega}_{\text{acov}}$ is a straightforward extension of $\check{\Omega}_{\text{t-acov}}$ (see section 4.1)
• MinT-shr (oct-shr): $W = \widehat{W}_{\text{shr}}$
• MinT-sam (oct-sam): $W = \widehat{W}_{\text{sam}}$
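The block-diagonal matrices above can be assembled mechanically from the per-level blocks; a hedged NumPy sketch (`blockdiag_cov`, `W_levels` and `reps` are illustrative names, with `reps[i]` playing the role of $M_k$, the number of diagonal repetitions of the level-$k$ block):

```python
import numpy as np

def blockdiag_cov(W_levels, reps):
    """Assemble diag(I_{r_1} (x) W_1, ..., I_{r_p} (x) W_p): one (n x n)
    block per temporal aggregation level, repeated r_i = M_k times along
    the diagonal, mirroring the structure of the W-BD matrices."""
    blocks = [np.kron(np.eye(r), W) for W, r in zip(W_levels, reps)]
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    pos = 0
    for b in blocks:
        d = b.shape[0]
        out[pos:pos + d, pos:pos + d] = b
        pos += d
    return out
```

For a quarterly hierarchy ($m = 4$, $\mathcal{K} = \{4, 2, 1\}$) with $n$ series, `reps = (1, 2, 4)` and the result is an $n(k^*+m) \times n(k^*+m)$ block-diagonal matrix.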
7. A heuristic cross-temporal reconciliation procedure
Kourentzes and Athanasopoulos (2019), henceforth KA, have proposed a cross-temporal reconciliation procedure that can be viewed as an ensemble forecasting procedure which exploits the simple averaging of different forecasts. The procedure consists of the following steps (it is assumed $h = 1$):

Step 1
For each individual variable, compute the temporally reconciled forecasts and collect them in the $[n \times (k^*+m)]$ matrix $\check{Y}$: $\widehat{Y} \rightarrow \check{Y}$.

This result can be obtained by applying the point forecast reconciliation formula (24) to each column of matrix $\widehat{Y}'$, which can be written as:
$$
\widehat{Y}' = \begin{bmatrix}
\hat{t}_{a_1} & \cdots & \hat{t}_{a_{n_a}} & \hat{t}_{b_1} & \cdots & \hat{t}_{b_{n_b}} \\
\hat{a}^{[1]}_1 & \cdots & \hat{a}^{[1]}_{n_a} & \hat{b}^{[1]}_1 & \cdots & \hat{b}^{[1]}_{n_b}
\end{bmatrix}.
$$
The $n_a$ vectors of temporally reconciled forecasts of the uts can be obtained as:
$$
\begin{bmatrix} \check{t}_{a_j} \\ \check{a}^{[1]}_j \end{bmatrix} = M_{a_j}\begin{bmatrix} \hat{t}_{a_j} \\ \hat{a}^{[1]}_j \end{bmatrix}, \qquad
M_{a_j} = I_{k^*+m} - \Omega_{a_j}Z\big(Z'\Omega_{a_j}Z\big)^{-1}Z', \qquad j = 1, \dots, n_a.
$$
Likewise, the $n_b$ vectors of temporally reconciled forecasts of the bts are given by:
$$
\begin{bmatrix} \check{t}_{b_i} \\ \check{b}^{[1]}_i \end{bmatrix} = M_{b_i}\begin{bmatrix} \hat{t}_{b_i} \\ \hat{b}^{[1]}_i \end{bmatrix}, \qquad
M_{b_i} = I_{k^*+m} - \Omega_{b_i}Z\big(Z'\Omega_{b_i}Z\big)^{-1}Z', \qquad i = 1, \dots, n_b,
$$
where the $n_a + n_b$ matrices $M_{a_j}$ and $M_{b_i}$ have dimension $[(k^*+m) \times (k^*+m)]$, and $\Omega_{a_j}$, $j = 1, \dots, n_a$, and $\Omega_{b_i}$, $i = 1, \dots, n_b$, are known p.d. $[(k^*+m) \times (k^*+m)]$ matrices. The mapping performing the transformation of the base forecasts into the temporally reconciled ones can be expressed in compact form as:
$$
\mathrm{vec}\big(\check{Y}'\big) = \mathrm{diag}\big(M_{a_1}, \dots, M_{a_{n_a}}, M_{b_1}, \dots, M_{b_{n_b}}\big)\,\mathrm{vec}\big(\widehat{Y}'\big),
$$
and then matrix $\check{Y}'$ can be re-stated as:
$$
\check{Y}' = \begin{bmatrix}
\check{t}_{a_1} & \cdots & \check{t}_{a_{n_a}} & \check{t}_{b_1} & \cdots & \check{t}_{b_{n_b}} \\
\check{a}^{[1]}_1 & \cdots & \check{a}^{[1]}_{n_a} & \check{b}^{[1]}_1 & \cdots & \check{b}^{[1]}_{n_b}
\end{bmatrix}
= \begin{bmatrix}
\big(\check{A}^{[m]}\big)' & \big(\check{B}^{[m]}\big)' \\
\vdots & \vdots \\
\big(\check{A}^{[k]}\big)' & \big(\check{B}^{[k]}\big)' \\
\vdots & \vdots \\
\big(\check{A}^{[1]}\big)' & \big(\check{B}^{[1]}\big)'
\end{bmatrix}.
$$
These reconciled forecasts are in line with the temporal aggregation constraints, i.e.
$Z'\check{Y}' = \mathbf{0}_{(k^* \times n)}$, but in general they are not in line with the cross-sectional (contemporaneous) constraints, that is: $U'\check{Y} \ne \mathbf{0}_{[n_a \times (k^*+m)]}$.

Step 2
Transform $\check{Y}$ by computing time-by-time cross-sectionally reconciled forecasts for all the temporal aggregation levels, and collect them in the $[n \times (k^*+m)]$ matrix $\breve{Y}$: $\check{Y} \rightarrow \breve{Y}$.

Matrix $\check{Y}$ can be written as
$$
\check{Y} = \big[\,\check{Y}^{[m]} \;\; \check{Y}^{[k_{p-1}]} \;\; \dots \;\; \check{Y}^{[k]} \;\; \check{Y}^{[1]}\,\big],
$$
where $\check{Y}^{[k]}$, $k \in \mathcal{K}$, has dimension $(n \times M_k)$. Thus, the cross-sectionally reconciled forecasts can be computed by transforming each $\check{Y}^{[k]}$ as:
$$
\breve{Y}^{[k]} = M^{[k]}\check{Y}^{[k]}, \qquad k \in \mathcal{K},
$$
where $M^{[k]}$ denotes the $(n \times n)$ projection matrix used to reconcile forecasts of $k$-level temporally aggregated time series:
$$
M^{[k]} = I_n - W^{[k]}U\big(U'W^{[k]}U\big)^{-1}U', \qquad k \in \mathcal{K},
$$
and $W^{[k]}$ is an $(n \times n)$ known p.d. matrix. Since $U'M^{[k]} = \mathbf{0}_{(n_a \times n)}$, $k \in \mathcal{K}$, the reconciled forecasts are cross-sectionally coherent, i.e. $U'\breve{Y} = \mathbf{0}_{[n_a \times (k^*+m)]}$, but not temporally: $Z'\breve{Y}' \ne \mathbf{0}_{(k^* \times n)}$.

Step 3

Transform again the step 1 forecasts $\check{Y}$, by computing time-by-time cross-sectionally reconciled forecasts for all the temporal aggregation levels using the $(n \times n)$ matrix $M$, given by the average of the matrices $M^{[k]}$ obtained at step 2: $\check{Y} \rightarrow \widetilde{Y}_{KA}$. Matrix $M$ can be expressed as:
$$
M = \frac{1}{p}\sum_{k \in \mathcal{K}} M^{[k]},
$$
and the final cross-temporal reconciled forecasts are given by:
$$
\widetilde{Y}_{KA} = M\,\check{Y}. \tag{55}
$$
Since $U'M = \frac{1}{p}\sum_{k \in \mathcal{K}} U'M^{[k]} = \mathbf{0}_{(n_a \times n)}$, and $Z'\check{Y}' = \mathbf{0}_{(k^* \times n)}$, the reconciled forecasts (55) fulfill both cross-sectional and temporal aggregation constraints:
$$
U'\widetilde{Y}_{KA} = U'M\check{Y} = \mathbf{0}_{[n_a \times (k^*+m)]}, \qquad
Z'\big(\widetilde{Y}_{KA}\big)' = Z'\check{Y}'M' = \mathbf{0}_{(k^* \times n)}.
$$
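The three steps above can be sketched numerically as follows (a minimal NumPy sketch under the simplifying assumption of identity weighting matrices, $\Omega_{a_j} = \Omega_{b_i} = I$ and $W^{[k]} = I$, in which case all the $M^{[k]}$ coincide and the step 3 average is trivial; the function names are illustrative):

```python
import numpy as np

def proj_null(C, W=None):
    """Projection matrix I - W C (C' W C)^{-1} C', which annihilates C'
    from the left (W = I gives the orthogonal case used below)."""
    r = C.shape[0]
    W = np.eye(r) if W is None else W
    WC = W @ C
    return np.eye(r) - WC @ np.linalg.solve(C.T @ WC, C.T)

def ka_heuristic(Y_hat, Z, U):
    """Sketch of the three KA steps with identity weighting matrices.

    Y_hat : (n x (k*+m)) base forecasts, one row per series;
    Z     : ((k*+m) x k*) temporal zero constraints  (Z' Y' = 0 if coherent);
    U     : (n x n_a) cross-sectional zero constraints (U' Y = 0 if coherent).
    """
    M_t = proj_null(Z)          # step 1: temporal reconciliation, row by row
    Y_check = Y_hat @ M_t.T     # now Z' Y_check' = 0
    M_cs = proj_null(U)         # step 2: with identity weights, every level
                                #         shares the same projector M^[k]
    M_bar = M_cs                # step 3: average of the (identical) M^[k]
    return M_bar @ Y_check      # fulfills both sets of constraints
```

With $n = 3$ series (one total and two bottom) and one aggregate period split into two sub-periods, $Z = U = [1, -1, -1]'$ and the output satisfies both $U'Y = 0$ and $Z'Y' = 0$ exactly.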
To perform step 1, KA consider two alternatives for the $[(k^*+m) \times (k^*+m)]$ matrices $\Omega_{a_j}$ and $\Omega_{b_i}$ needed to compute the transformation matrices $M_{a_j}$ and $M_{b_i}$, respectively. The former is t-struc, while the latter is t-wlsv (see section 4.1). As for step 2, KA use either cs-wls or cs-shr (see section 3.1).

Remark 1
These two steps can be seen as the successive application of two distinct multivariate reconciliation procedures, each characterized by a different covariance matrix and different constraints. In the first step a linearly constrained quadratic problem is solved, where only the temporal aggregation constraints are considered:
$$
\check{y} = \arg\min_{y}\;(y - \hat{y})'\,\Omega^{-1}(y - \hat{y}), \quad \text{s.t.} \quad \big(I_n \otimes Z'\big)y = \mathbf{0},
$$
where $\Omega$ is the block-diagonal matrix in (51). The solution is given by:
$$
\check{y} = \hat{y} - \Omega(I_n \otimes Z)\big[\big(I_n \otimes Z'\big)\Omega(I_n \otimes Z)\big]^{-1}\big(I_n \otimes Z'\big)\hat{y} = M\hat{y},
$$
where
$$
M = I - \Omega(I_n \otimes Z)\big[\big(I_n \otimes Z'\big)\Omega(I_n \otimes Z)\big]^{-1}\big(I_n \otimes Z'\big)
$$
is the $[n(k^*+m) \times n(k^*+m)]$ projection matrix
$$
M = \mathrm{diag}\big(M_1, \dots, M_n\big), \qquad \text{with} \quad
M_i = I_{k^*+m} - \Omega_{ii}Z\big(Z'\Omega_{ii}Z\big)^{-1}Z', \quad i = 1, \dots, n.
$$
In the second step, the problem
$$
\breve{y} = \arg\min_{y}\;\big(y - \check{y}\big)'\,W^{-1}\big(y - \check{y}\big), \quad \text{s.t.} \quad \big(I_{k^*+m} \otimes U'\big)y = \mathbf{0}
$$
is solved, where $W$ is the block-diagonal matrix in (50), and whose solution is given by:
$$
\breve{y} = \check{y} - W(I_{k^*+m} \otimes U)\big[\big(I_{k^*+m} \otimes U'\big)W(I_{k^*+m} \otimes U)\big]^{-1}\big(I_{k^*+m} \otimes U'\big)\check{y} = \overline{M}\check{y}, \tag{56}
$$
where
$$
\overline{M} = I - W(I_{k^*+m} \otimes U)\big[\big(I_{k^*+m} \otimes U'\big)W(I_{k^*+m} \otimes U)\big]^{-1}\big(I_{k^*+m} \otimes U'\big)
$$
is the $[n(k^*+m) \times n(k^*+m)]$ projection matrix
$$
\overline{M} = \mathrm{diag}\big(M^{[m]}, \underbrace{M^{[k_{p-1}]}, \dots, M^{[k_{p-1}]}}_{m/k_{p-1}}, \dots, \underbrace{M^{[1]}, \dots, M^{[1]}}_{m}\big)
= \mathrm{diag}\big(M^{[m]},\; I_{m/k_{p-1}} \otimes M^{[k_{p-1}]},\; \dots,\; I_m \otimes M^{[1]}\big),
$$
with $M^{[k]} = I_n - W^{[k]}U\big(U'W^{[k]}U\big)^{-1}U'$, $k \in \mathcal{K}$. It is worth noting that the reconciled forecasts (56) can be expressed according to the alternative vectorization as $\breve{y} = P'\,\mathrm{vec}\big(\breve{Y}\big)$.

Remark 2
The cross-sectional reconciliation performed at step 2 of the KA procedure involves the transformation of $k^*+m$ vectors of forecasts. More precisely, each transformation matrix $M^{[k]}$, $k \in \mathcal{K}$, is applied to $M_k$ different $(n \times 1)$ vectors. Thus, a sensible alternative to the KA proposal might be to consider the weighted average of the transformation matrices:
$$
M^{*} = \frac{1}{k^*+m}\sum_{k \in \mathcal{K}} M_k\,M^{[k]}.
$$

Remark 3
In general, the final result of the reconciliation procedure changes if the user inverts the order of application of the two reconciliation steps. In Appendix A.6 the 'cross-sectional-first-then-temporal' reconciliation procedure is shown, along with the relevant $M$ matrix, which in this case is obtained through an average of the projection matrices used for the reconciliation of the $n$ series according to temporal hierarchies. Since the differences between the reconciled point forecasts according to these two approaches may not be negligible (see section 7.2), in our view this is a weakness of the procedure, and calls for a decision rule about the final reconciled forecasts to retain. A practical way of proceeding could be to choose, between the two alternatives, the reconciled forecasts which are the 'closest' (according to a given metric) to the base forecasts.

Remark 4
The calculation of the average matrix $M$ in the final step of the procedure, needed to recover the cross-temporal coherency of the point forecasts, requires the availability of the projection matrices used in the second step. This poses no problem when closed-form reconciliation formulae can be used. Unfortunately, this is not the case when the user is interested in considering more general linear constraints (e.g., non-negativity of the final reconciled estimates), which should be treated with appropriate numerical procedures (Kourentzes and Athanasopoulos, 2020a, Wickramasuriya et al., 2020). In the next subsection we extend the heuristic KA procedure in such a way that these issues can be overcome in a simple and effective manner.

Taking inspiration from the heuristic KA reconciliation procedure, we consider an iterative procedure which produces cross-temporally reconciled forecasts by alternating forecast reconciliation along one single dimension (either cross-sectional or temporal) at each iteration step. Each iteration consists of the first two steps of the heuristic KA procedure, so the forecasts are reconciled by alternating cross-sectional (contemporaneous) reconciliation and reconciliation through temporal hierarchies in a cyclic fashion. Starting from the base forecasts $\widehat{Y}$, denote with $d_{cs}$ and $d_{te}$, respectively, the cross-sectional and temporal gross discrepancies, given by:
$$
d_{cs} = \big\|U'\widehat{Y}\big\|, \qquad d_{te} = \big\|Z'\widehat{Y}'\big\|,
$$
where $\|X\| = \sum_{i,j}|x_{i,j}|$. Since the base forecasts are not in line with either type of constraints, in general both $d_{cs}$ and $d_{te}$ are greater than zero. The iterative procedure can be described as follows:
1. Start the iterations by calculating the temporally reconciled forecasts $\widetilde{Y}^{(1)}$, such that $Z'\big(\widetilde{Y}^{(1)}\big)' = \mathbf{0}$, and $d^{(1)}_{cs} = \big\|U'\widetilde{Y}^{(1)}\big\| \ge 0$.

2. The point forecasts in matrix $\widetilde{Y}^{(1)}$ are then cross-sectionally reconciled, obtaining $\widetilde{Y}^{(2)}$, which is such that $U'\widetilde{Y}^{(2)} = \mathbf{0}$, and $d^{(1)}_{te} = \big\|Z'\big(\widetilde{Y}^{(2)}\big)'\big\| \ge 0$.

3. The updates in steps 1 and 2 are performed at each iteration $j$, $j = 1, 2, \dots$, until a convergence criterion is met, that is $d^{(j)}_{te} < \delta$, where $\delta$ is a positive tolerance value, and matrix $\widetilde{Y}^{(2j)}$ contains the final cross-temporal reconciled forecasts.

(The issue of non-negativity raised in Remark 4 is currently under study, in order to develop a procedure which, by exploiting some distinctive features of a hierarchical/grouped time series, is able to produce non-negative reconciled forecasts with a reduced computational effort.)
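The alternating scheme above can be sketched as follows (a minimal NumPy sketch under simplifying assumptions: identity weights for the temporal step and level-specific weight matrices $W^{[k]}$ for the cross-sectional step; the function and argument names are illustrative):

```python
import numpy as np

def proj_null(C, W=None):
    """I - W C (C' W C)^{-1} C': annihilates C' from the left."""
    r = C.shape[0]
    W = np.eye(r) if W is None else W
    WC = W @ C
    return np.eye(r) - WC @ np.linalg.solve(C.T @ WC, C.T)

def ite_rec(Y_hat, Z, U, W_levels, cols_per_level, tol=1e-10, max_iter=200):
    """Sketch of the iterative heuristic: alternate (1) temporal reconciliation
    of each series and (2) time-by-time cross-sectional reconciliation with
    level-specific weights W^[k], until the temporal gross discrepancy
    d_te = ||Z'Y'||_1 falls below tol. Returns (Y, iterations used)."""
    M_t = proj_null(Z)
    M_cs = [proj_null(U, W) for W in W_levels]
    Y = Y_hat.copy()
    for j in range(1, max_iter + 1):
        Y = Y @ M_t.T                              # temporal step: Z'Y' = 0
        pos = 0
        for Mk, nc in zip(M_cs, cols_per_level):   # cross-sectional step: U'Y = 0
            Y[:, pos:pos + nc] = Mk @ Y[:, pos:pos + nc]
            pos += nc
        if np.abs(Z.T @ Y.T).sum() < tol:          # d_te
            return Y, j
    return Y, max_iter
```

When all the $W^{[k]}$ coincide (e.g., identity weights), a single iteration already restores both sets of constraints; it is the level-specific weighting that makes further iterations necessary.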
Figure 4: Quarterly, semi-annual and annual Australian GDP one-step-ahead reconciled forecasts according to the Kourentzes and Athanasopoulos (2019) cross-temporal reconciliation approach (t-wlsv for the temporal step, cs-shr for the cross-sectional step) by alternating the constraint dimensions to be fulfilled: percentage differences between the reconciled forecasts obtained through (i) temporal-then-cross-sectional reconciliation, and (ii) cross-sectional-then-temporal reconciliation. The differences between the two reconciled forecasts are divided by their arithmetic mean.
Figure 5 completes the results shown so far, by considering the forecasts of the strictly positive 79 (out of 95) variables from both Income and Expenditure sides, cross-temporally reconciled according to the KA procedure and its iterative variant. The boxplots show the distributions of the percentage discrepancies between the reconciled forecasts obtained using temporal reconciliation first and cross-sectional reconciliation then, vis-à-vis the results obtained by inverting the order of application of the two reconciliation procedures. It clearly appears that the iterative variant of the original KA proposal produces less pronounced discrepancies.

Figure 5: Quarterly, semi-annual and annual one-step-ahead reconciled forecasts of 79 out of 95 time series of the Australian GDP from Income and Expenditure sides using both the original KA cross-temporal reconciliation procedure (t-wlsv for the temporal step, and cs-shr for the cross-sectional one), and its iterative variant: boxplots of the percentage differences between the reconciled forecasts obtained through (i) temporal-then-cross-sectional reconciliation, and (ii) cross-sectional-then-temporal reconciliation. The differences between each pair of reconciled forecasts are divided by their arithmetic mean.
It must also be said that the convergence speed of the iterative procedure does not seem to be affected by the choice of the first dimension to be fulfilled when the iteration starts. Figure 6 shows an example of the convergence speed of the iterative procedure, either starting with the cross-sectional (bottom panel) or the temporal (top panel) reconciliation procedure, for the Australian GDP forecasts. In both cases convergence is achieved very quickly: fixing the tolerance $\delta$, 15 (14) iterations are needed when starting from the temporal (cross-sectional) dimension. Furthermore, from the fourth iteration onwards the constraints are practically fulfilled in both cases. Nevertheless, since the final reconciled values depend on this choice, it would be useful to have an ex-ante 'choice rule' between the two alternatives. We are currently working on this issue; however, in the rest of the paper, when considering heuristic cross-temporal forecast reconciliation procedures, for ease of presentation we maintain the original choice made by KA, performing temporal forecast reconciliation first, and cross-sectional reconciliation afterwards.
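The decision rule sketched in Remark 3 - retain, between the two orderings, the reconciled forecasts that are closest to the base ones - can be expressed in a few lines (the function name and the use of the Frobenius norm as the distance metric are illustrative assumptions):

```python
import numpy as np

def pick_order(Y_base, Y_tcs, Y_cst):
    """Remark-3-style rule: compare the temporal-first (tcs) and the
    cross-sectional-first (cst) reconciled forecasts with the base
    forecasts and retain the closest one (Frobenius distance)."""
    d_tcs = np.linalg.norm(Y_tcs - Y_base)
    d_cst = np.linalg.norm(Y_cst - Y_base)
    return ("tcs", Y_tcs) if d_tcs <= d_cst else ("cst", Y_cst)
```

Any other metric (e.g., the $L_1$ norm used for the gross discrepancies) could be substituted without changing the logic.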
(In the comparison shown in Figure 5, temporal reconciliation has been done using t-wlsv, while cross-sectional reconciliation was performed using cs-shr. However, this result does not seem to depend on the reconciliation procedures considered: t-struc+cs-wls, t-struc+cs-shr, and t-wlsv+cs-wls give very similar results, here not presented for space reasons, but available on request from the authors.)

Figure 6: Cross-sectional and temporal gross incoherence at each iteration step of the iterative cross-temporal forecast reconciliation procedure (t-wlsv + cs-shr) for the Australian GDP time series, at the first forecast origin 1994:Q3.

8. Cross-temporal reconciliation of the Australian GDP forecasts from Income and Expenditure sides

In a recent paper, Athanasopoulos et al. (2019, p. 690) propose "the application of state-of-the-art forecast reconciliation methods to macroeconomic forecasting" in order to perform aligned decision making and to improve forecast accuracy. In their empirical study they consider the cross-sectional forecast reconciliation for 95 Australian Quarterly National Accounts time series, describing the Gross Domestic Product (GDP) at current prices from Income and Expenditure sides, interpreted as two distinct hierarchical structures. In the former case (Income), GDP is at the top of 15 lower-level aggregates (figure 7), while in the latter (Expenditure), GDP is the top-level aggregate of a hierarchy of 79 time series (see figures 21.5-21.7 in Athanasopoulos et al., 2019, pp. 703-705). By managing the complete set of 95 time series following the approach described in section 3, Bisaglia et al. (2020) have extended the results of Athanasopoulos et al. (2019), showing that fully reconciled forecasts of
GDP, coherent with all the reconciled forecasts from both Expenditure and Income sides, can be obtained through the projection approach described in section 2. According to the notation adopted so far, the $(33 \times 95)$ kernel matrix accounting for the cross-sectional zero constraints is given by (Bisaglia et al., 2020):
$$
U' = \begin{bmatrix}
1 & \mathbf{0}'_{(5 \times 1)} & -\mathbf{1}'_{(10 \times 1)} & \mathbf{0}'_{(26 \times 1)} & \mathbf{0}'_{(53 \times 1)} \\
1 & \mathbf{0}'_{(5 \times 1)} & \mathbf{0}'_{(10 \times 1)} & \mathbf{0}'_{(26 \times 1)} & -\mathbf{1}'_{(53 \times 1)} \\
\mathbf{0}_{(5 \times 1)} & I_5 & -C_I & \mathbf{0}_{(5 \times 26)} & \mathbf{0}_{(5 \times 53)} \\
\mathbf{0}_{(26 \times 1)} & \mathbf{0}_{(26 \times 5)} & \mathbf{0}_{(26 \times 10)} & I_{26} & -C_E
\end{bmatrix},
$$
where $C_I$ and $C_E$ denote the aggregation matrices of the Income and Expenditure hierarchies, respectively. In what follows, cross-temporal forecast reconciliation is applied within the same forecasting experiment designed by Athanasopoulos et al. (2019), extended in order to consider semi-annual and annual forecasts as well: for the available time series span (1984:Q4 - 2018:Q1), quarterly base forecasts from 1 up to 4 quarters ahead have been obtained for the $n = 95$ separate time series through simple univariate ARIMA models selected using the auto.arima function of the R-package forecast (Hyndman et al., 2020). The forecasting experiment uses a recursive training sample with expanding window length, where the first training sample is set from 1984:Q4 to 1994:Q3 and the last ends on 2017:Q1, for a total of 91 forecast origins. Likewise, in the same automatic fashion we have computed (i) one- and two-step-ahead forecasts for the time series obtained by temporal aggregation of two successive quarters, and (ii) one-step-ahead forecasts for the time series obtained by temporal aggregation of four successive quarters.

Figure 7: Hierarchical structure of the income approach for Australian GDP. The pink cell contains the most aggregate series, the blue cells contain intermediate-level series, and the yellow cells correspond to the most disaggregate bottom-level series. Source: Athanasopoulos et al., 2019, p. 702.

We evaluate the performance of multiple (say,
$J > 1$) forecast reconciliation procedures through forecast accuracy indices calculated on the forecast errors
$$
\hat{e}^{[k],h}_{i,j,t} = y^{[k]}_{i,t+h} - \hat{y}^{[k],h}_{i,j,t}, \qquad
i = 1, \dots, 95, \quad j = 0, \dots, J, \quad t = 1, \dots, 91, \quad k \in \mathcal{K}, \quad h = 1, \dots, h_k,
$$
where $y$ and $\hat{y}$ are the observed and forecasted values, respectively, $i$ denotes the series ($i = 1, \dots, 32$ for the uts, $i = 33, \dots, 95$ for the bts), $j = 0$ denotes the base forecasts, $t$ is the forecast origin ($t = 1$ corresponds to 1994:Q3), $\mathcal{K} = \{4, 2, 1\}$, and $h_4 = 1$, $h_2 = 2$, $h_1 = 4$ are the forecast horizons for annual, semi-annual, and quarterly time series, respectively.

The accuracy is evaluated using the Average Relative Mean Square Error (AvgRelMSE, Davydenko and Fildes, 2013; Kourentzes and Athanasopoulos, 2019, 2020b), obtained by transforming the MSE index, given by the average across all 91 forecast origins of the squared forecast errors:
$$
\mathrm{MSE}^{[k],h}_{i,j} = \frac{1}{91}\sum_{t=1}^{91}\big(\hat{e}^{[k],h}_{i,j,t}\big)^2, \qquad
i = 1, \dots, 95, \quad j = 0, \dots, J, \quad k \in \mathcal{K}, \quad h = 1, \dots, h_k. \tag{57}
$$
(The R scripts, the data and the results of the paper by Athanasopoulos et al. (2019) are available in the github repository located at https://github.com/PuwasalaG/Hierarchical-Book-Chapter. We did not change this first, crucial step in the forecast reconciliation workflow, since the focus is on the potential of cross-temporal forecast reconciliation. However, Athanasopoulos et al. (2019) point out that this fast and flexible approach performs well in forecasting Australian GDP aggregates, even compared to other more complex methods. Sagaert et al. (2019) warn practitioners that this could be 'a myopic choice as (the accuracy metrics) consider solely the first moment of the error distribution and ignore higher moments, which can have significant implications for decision making'. This important issue will be dealt with in the near future.)
The AvgRelMSE is the geometric mean across all 95 series of the MSE ratios of a forecast over a benchmark given by the base, incoherent ARIMA forecasts, across all evaluation samples, for a given horizon $h$:
$$
\mathrm{AvgRelMSE}^{[k],h}_{j} = \left(\prod_{i=1}^{95} \mathrm{rMSE}^{[k],h}_{i,j}\right)^{1/95}, \qquad
j = 0, \dots, J, \quad k \in \mathcal{K}, \quad h = 1, \dots, h_k, \tag{58}
$$
where $\mathrm{rMSE}^{[k],h}_{i,j}$ is the relative MSE:
$$
\mathrm{rMSE}^{[k],h}_{i,j} = \frac{\mathrm{MSE}^{[k],h}_{i,j}}{\mathrm{MSE}^{[k],h}_{i,0}}, \qquad
i = 1, \dots, 95, \quad j = 0, \dots, J, \quad k \in \mathcal{K}, \quad h = 1, \dots, h_k.
$$
If a forecast outperforms the base forecasts, then the AvgRelMSE is smaller than one and vice-versa, and the percentage improvement in accuracy over the benchmark can be calculated as $\big(1 - \mathrm{AvgRelMSE}^{[k],h}_{j}\big) \times 100$.

Expression (58), which refers to all 95 time series, can be re-stated for (i) groups of variables (e.g., bts and uts), (ii) multiple forecast horizons (e.g., $h = 1\text{-}4$ for quarterly forecasts, $k = 1$; $h = 1\text{-}2$ for semi-annual forecasts, $k = 2$), and (iii) different temporal aggregation levels over the whole forecast horizon (e.g., accuracy indices for the whole temporal hierarchy of each series). In Appendix A.7 we show the expressions used to compute forecast accuracy indices in a rolling forecast experiment, like the one we are dealing with, for selected combinations of variables/time frequencies/forecast horizons. In order to give a complete picture of the evaluation results, in the next subsection we show and discuss the MSE-based accuracy indices, at multiple timescales and forecast horizons, for a set of selected forecast reconciliation procedures.
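For a given temporal aggregation level $k$ and horizon $h$, the AvgRelMSE in (58) reduces to a geometric mean of per-series MSE ratios; a minimal sketch (the function name is illustrative; the error matrices are assumed to hold one row of forecast errors per series and one column per forecast origin):

```python
import numpy as np

def avg_rel_mse(err, err_base):
    """AvgRelMSE for one (k, h) pair: geometric mean across series of
    MSE(procedure) / MSE(base), computed from (n_series x n_origins)
    forecast-error matrices."""
    mse = (err ** 2).mean(axis=1)
    mse_base = (err_base ** 2).mean(axis=1)
    return float(np.exp(np.log(mse / mse_base).mean()))
```

Values below one mean the procedure improves on the base forecasts, and $(1 - \mathrm{AvgRelMSE}) \times 100$ gives the percentage improvement.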
Appendix A.8 reports the indices based on MAE as well, together with several tables and graphs of the accuracy indices (for both MSE and MAE) for all the forecast reconciliation procedures described in the previous sections, keeping distinct the one-dimension (either cross-sectional or temporal) forecast reconciliation procedures from the cross-temporal heuristic and optimal combination procedures. Furthermore, we use the non-parametric Friedman and post-hoc Nemenyi tests (see also Koning et al., 2005, and Hibon et al., 2012), as implemented in the R-package tsutils (Kourentzes, 2019), to establish whether the differences in the forecasts produced by the considered procedures are significant.

(Davydenko and Fildes (2013) develop the Average Relative MAE (AvgRelMAE), based on the Mean Absolute Error of the forecasts, but suggest that this formulation 'If required (...) can also be extended to other measures of dispersion or loss functions', as the AvgRelMSE in (58) and the AvgRelRMSE, based on the Root Mean Square Error (Sagaert et al., 2019, Kourentzes and Athanasopoulos, 2020a). On this last point, Kourentzes and Athanasopoulos (2020b, pp. 17-18) raise an important issue by claiming that 'in contrast to common practice, we believe that there is limited benefit in an empirical evaluation setting, to report average accuracy measures across all levels of the hierarchy (...). It is very improbable that this reflects a realistic situation. Hence, it is paramount that the modeller attempts to establish a strong connection between the objectives of the forecasts and the evaluation'. In the forecasting experiment of this paper, where only three temporal aggregation levels are in order, it could be sensible not to consider as strategic the semi-annual time frequency, which is instrumentally used to improve the accuracy of quarterly and annual forecasts, but is probably of reduced practical usefulness to analysts and decision makers.)
The empirical application mainly aims to evaluate the performance of the most convincing new cross-temporal reconciliation procedures, which basically are those using residual-based approximations of the covariance matrix, as compared to state-of-the-art point forecast reconciliation procedures. More precisely, we consider five selected procedures recently proposed in the hierarchical forecasting literature:

• cs-shr (Wickramasuriya et al., 2019),
• t-wlsv (Kourentzes et al., 2017),
• t-acov (Nystrup et al., 2020),
• t-sar1 (Nystrup et al., 2020),
• kah-wlsv-shr (Kourentzes and Athanasopoulos, 2019),

five (two-step and iterative) variants of the KA approach:

• tcs-acov-shr, i.e. two-step t-acov + cs-shr,
• tcs-sar1-shr, i.e. two-step t-sar1 + cs-shr,
• ite-wlsv-shr, i.e. iterative t-wlsv + cs-shr (see section 7.2),
• ite-acov-shr, i.e. iterative t-acov + cs-shr (see section 7.2),
• ite-sar1-shr, i.e. iterative t-sar1 + cs-shr (see section 7.2),

and finally, three optimal combination forecast procedures:

• oct-wlsv, i.e. $W = \check{W}_{\text{wlsv}}$ (see section 6.1),
• oct-bdshr, i.e. $W = \check{W}^{BD}_{\text{shr}}$ (see section 6.1),
• oct-acov, i.e. $W = \check{W}_{\text{acov}}$ (see section 6.1).

The first five procedures have proven to perform well in the empirical applications where they have been used (Athanasopoulos et al., 2017, 2019, Wickramasuriya et al., 2019, Bisaglia et al., 2020, Nystrup et al., 2020, among others). Clearly, the one-dimension reconciliation procedures (cs-shr, t-wlsv, t-acov, and t-sar1) do not give fully coherent forecasts.
Rather, insofar as they are expected to improve on the base forecasts, the best-practice one-dimension procedures should be viewed as stricter benchmarks for the cross-temporal forecast reconciliation procedures, which are required to give accurate one-number forecasts as well.

In summary, the forecasting experiment was designed to evaluate the capability of the cross-temporal forecast reconciliation procedures to improve the forecast accuracy as compared (i) to the base forecasts, and (ii) to the most performing one-dimension forecast reconciliation procedures. In addition, the experiment should help in assessing (iii) the performance of both KA variants (two-step and iterative procedures) and optimal combination forecasts as compared to the original proposal by KA, and (iv) the feasibility and the accuracy of the optimal combination cross-temporal reconciliation procedures, which for the time being - even when they are computed using the in-sample residuals - are based on rather simple/unrealistic approximations of the covariance matrix (see section 6.1). As for this last point, we are interested in understanding if there is any significant difference between the reconciled forecasts produced by the most performing heuristic and optimal combination forecast procedures.

Table 1 presents the AvgRelMSEs obtained for the forecasting techniques (base + 13 reconciliation procedures) listed in the previous sub-section. We provide results for all 95 component time series, and for the 32 upper-level and the 63 bottom-level time series separately. The results are shown by level of temporal aggregation and forecast horizon.
In each column, the lowest error is highlighted in red boldface, while values greater than one, which mean that the reconciled forecasts are worse than the base ones, are highlighted in black boldface. Most of the data in the table are represented in the top panel of Figure 8, containing the graphs of the AvgRelMSEs for the considered procedures, across all forecast horizons, by temporal aggregation level of the forecasted series. The ranks of these indices are reported in the bottom panel of the same figure, with background colours chosen to highlight the procedures' performance, from best (green) to worst (red). In this figure the procedures have been put in the order given by the overall AvgRelMSE, which seems a good compromise to represent such a multiple comparison. Figure 9 shows the Multiple Comparison with the Best Nemenyi test, after the Friedman test has shown that the forecasts given by the considered procedures are different both when all temporal aggregation levels and forecast horizons (top panel), and when only one-step-ahead quarterly forecasts (bottom panel), are considered.

Table 1: AvgRelMSE at any temporal aggregation level and any forecast horizon.
                  Quarterly                               Semi-annual             Annual   All
Procedure         1       2       3       4       1-4     1       2       1-2     1

all 95 series
base              1       1       1       1       1       1       1       1       1       1
cs-shr            0.9583  0.9701  0.9757  0.9824  0.9716  0.9526  0.9781  0.9652  0.9657  0.9689
ite-sar1-shr      0.9613  0.9683  0.9588  0.9605  0.9622  0.8151  0.9092  0.8609  0.7514  0.8997
oct-wlsv          0.9692  0.9719  0.9622  0.9631  0.9666  0.8203  0.9125  0.8652  0.7562  0.9042
oct-bdshr         0.9838  0.9798  0.9618  0.9665  0.9730  0.8297  0.9144  0.8710  0.7573  0.9095
oct-acov          0.9553  0.9648  0.9767  0.9707  0.9668  0.8013  0.9185  0.8579  0.7531  0.9016

32 upper series
base              1       1       1       1       1       1       1       1       1       1
ite-acov-shr      0.9283  0.9398  0.9259  0.9314  0.9313

63 bottom series
base              1       1       1       1       1       1       1       1       1       1
cs-shr            0.9806  0.9928  0.9998
tcs-sar1-shr      0.9832  0.9818  0.9762  0.9761  0.9793  0.8268  0.9253  0.8746  0.7713  0.9164
ite-wlsv-shr      0.9798  0.9814  0.9776  0.9776  0.9791  0.8259  0.9275  0.8753  0.7723  0.9166

Figure 8: Top panel: Average Relative MSE across all series and forecast horizons, by frequency of observation. Bottom panel: Rankings by frequency of observation and forecast horizon.

[Figure 9, top panel - MSE mean ranks across all time frequencies and forecast horizons: ite-wlsv-shr 5.19, ite-acov-shr 5.39, ite-sar1-shr 5.44, tcs-acov-shr 5.98, kah-wlsv-shr 6.01, tcs-sar1-shr 6.04, oct-acov 6.38, oct-wlsv 6.78, oct-bdshr 7.54, t-wlsv 8.23, t-acov 8.25, t-sar1 8.4, cs-shr 12.27, base 13.09.]
[Critical distance = 2.04; Friedman test: 0 (Ha: Different). Figure 9, bottom panel - MSE mean ranks for one-step-ahead quarterly forecasts: ite-sar1-shr 5.92, ite-acov-shr 5.93, ite-wlsv-shr 5.94, cs-shr 5.98, tcs-acov-shr 6.51, tcs-sar1-shr 6.69, kah-wlsv-shr 6.79, oct-acov 6.86, oct-wlsv 7.4, base 8.81, oct-bdshr 9.06, t-acov 9.36, t-sar1 9.84, t-wlsv 9.92.]
Figure 9: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MSE mean rank (i) across all time frequencies and forecast horizons (top), and (ii) for one-step-ahead quarterly forecasts (bottom).

The main results found on this dataset can be summarized as follows:

• as compared to both base forecasts and one-dimension reconciliation procedures, using cross-temporal hierarchies provides a clear decrease in the AvgRelMSE for the uts (likely the most important variables for the decision maker, e.g. GDP) at any temporal aggregation level and any forecast horizon;
• this accuracy improvement is less marked, though still visually evident, for the bottom-level series, as compared to the reconciled forecasts through temporal hierarchies alone, which however are cross-sectionally incoherent;
• each iterative procedure performs better than its two-step counterpart;
• within the cross-temporal procedures, the heuristic procedures provide better results than the optimal combination ones.

Looking at the performance of each procedure, it is worth noting that cs-shr scores first for the quarterly forecasts of the uts, and almost always improves on the base forecasts' accuracy, regardless of series group, temporal aggregation level and forecast horizon. In addition, from the bottom panel of Figure 9 we observe that, when considered only on a quarterly basis, the one-step-ahead forecasts for all series provided by cs-shr are (temporally incoherent and) not significantly different from those provided by the best procedure (which in this case is ite-acov-shr).
However, since the temporal dimension is not accounted for by this reconciliation procedure, its relative performance worsens (i.e., the cross-temporal procedures improve on the base forecasts more than cs-shr does) as the temporal aggregation level increases. (The only exception to cs-shr's improvement on the base forecasts is an AvgRelMSE greater than 1, namely 1.0094, for the bts quarterly forecasts at horizon 4.)

Overall, ite-acov-shr always scores best for all series and all forecast horizons, and second-best for the bts series and all forecast horizons, while tcs-acov-shr scores second and first, in turn. Moreover, ite-acov-shr shows good results for the uts forecasts as well, although in this case the best performances are given by ite-sar1-shr and ite-wlsv-shr. Figure 9 shows that the differences in the forecasts produced by all the considered heuristic procedures are not statistically significant at any temporal aggregation level and forecast horizon (Figure 9 reports only the test results across all temporal aggregation levels and forecast horizons, top panel, and for k = 1 and h = 1, bottom panel). Furthermore, two optimal combination procedures (oct-acov and oct-wlsv) produce reconciled forecasts not significantly different from the best procedure according to the Nemenyi test (see Figure 9), while oct-bdshr is significantly (worse and) different from the best forecast reconciliation procedure.

Finally, Table 2 shows the AvgRelMSE's for selected upper time series and reconciliation procedures. The series we analyze (see Figure 10) come from the first three levels of both the Income and Expenditure side hierarchies:
• Gross Domestic Product
• Total Factor Income (Income side)
• Gross Operating Surplus (Income side)
• Compensation of Employees (Income side)
• Gross National Expenditure (Expenditure side)
• Domestic Final Demand (Expenditure side)
• Changes in Inventories (Expenditure side)
• Final Consumption Expenditures (Expenditure side)
• Gross Fixed Capital Formation (Expenditure side)
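As a reading aid for Table 2: AvgRelMSE-type indices are geometric means of MSE ratios relative to the base forecasts (in the spirit of Davydenko and Fildes, 2013, cited in the references). A minimal base-R sketch under that assumption (the function name avg_rel_mse is our own):

```r
# Geometric mean of the MSE ratios (procedure vs. base forecasts):
# values below 1 mean the procedure improves on the base forecasts on average.
avg_rel_mse <- function(mse_proc, mse_base) {
  exp(mean(log(mse_proc / mse_base)))
}

# toy values: two cases improve on the base, one worsens
avg_rel_mse(c(0.8, 1.1, 0.9), c(1.0, 1.0, 1.0))  # < 1: improvement on average
```

Using a geometric rather than arithmetic mean keeps the index symmetric: a ratio of 2 and a ratio of 1/2 cancel out exactly.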
The graphs of the Nemenyi test for each temporal aggregation level and each forecast horizon are provided in Appendix A.8. The results for all 95 series, and for AvgRelMAE as well, are also available in Appendix A.8.

Table 2: AvgRelMSE at any temporal aggregation level and any forecast horizon for selected upper time series and reconciliation procedures (columns: Quarterly, Semi-annual, Annual, All; rows include cs-shr, t-acov and oct-acov). [Most entries of the table were lost in extraction; among the surviving fragments are the oct-acov rows for Gross Operating Surplus (0.9524, 0.9233, 0.9181, 0.8826, 0.9187, 0.8534) and Final Consumption Expenditures (0.9691, 0.9489, 0.9310, 0.9373, 0.9464, 0.8277, 0.8673, 0.8473, 0.6888, 0.8763).]

Figure 10: Quarterly GDP and selected time series from both Income and Expenditure sides: actual values and one-step-ahead base forecasts during the testing period (1994:Q4 - 2018:Q1).

The ability of cs-shr to improve on short-term (1- or 2-quarter-ahead) base forecasts clearly emerges, with the only exception of the forecasts of the Changes in Inventories series, where most indices at the quarterly level are greater than 1. However, this poor performance is shared by the other reconciliation procedures as well, and is likely due to the low quality of the base forecasts as compared to the other considered series (see Figure 10).

To conclude, the general improvement registered on average (last column of Table 2) by the cross-temporal reconciliation procedures may be considered a positive outcome, which combines an acceptable forecasting performance at the quarterly level with a good performance at the semi-annual and annual levels, with the additional feature that the complete system of forecasts is internally and temporally coherent.

Conclusions
The hierarchical framework is currently considered an effective way to improve the accuracy of forecasts in many different fields of application. In this paper we give some contributions and extensions to a topic which has been widely studied in the last decade, by connecting it to the widespread literature on least-squares adjustment of preliminary data (Stone et al., 1942, Byron, 1978), with focus on a projection approach which de facto encompasses and extends the modelling framework of Hyndman et al. (2011) (see Wickramasuriya et al., 2019, and Panagiotelis et al., 2020). However, we agree with Jeon et al. (2019, p. 368) that a "shortcoming of many of the approaches above, including WLS with structural scaling, is that the weights (...) are a function of in-sample errors and are not directly determined with reference to an objective function ultimately used to assess forecast quality". This problem, present for cross-temporal hierarchies as well, adds to the dimensionality issues which generally characterize these structures, whose number of nodes is considerably larger than in the relevant single-dimension hierarchies, and calls for alternative estimation strategies, based for example on cross-validation, as proposed by Jeon et al. (2019), or - when enough data is available - on machine learning techniques (Mancuso et al., 2020, Spiliotis et al., 2020).

Nevertheless, cross-temporal point forecast reconciliation seems to be a promising theme, which is worth considering for future research. In particular, we are currently working to finalize an R package offering classical and new optimal and heuristic combination forecast reconciliation procedures (FoReco: forecast reconciliation). In addition, we plan to perform simulation experiments to better understand the behaviour, potential, and possible shortcomings of the proposed procedures.
Other topics in our research agenda are:
• looking for more realistic (and hopefully effective) approximations of the covariance matrices for cross-temporal reconciliation, (i) by building on Jeon et al. (2019), (ii) by deepening some ideas by Kourentzes (2017, 2018), and (iii) by extending/adapting some proposals by Nystrup et al. (2020) to the cross-temporal framework;
• extending the cross-temporal framework to the reconciliation of probabilistic forecasts (Gamakumara et al., 2018, Jeon et al., 2019, Ben Taieb et al., 2020), and to Bayesian (Eckert et al., 2019) and fast (Ashouri et al., 2019) forecast reconciliation procedures;
• extending the cross-temporal optimal combination approach to the case of intermittent demand forecasts (Petropoulos and Kourentzes, 2015), with the related non-negativity issues (Kourentzes and Athanasopoulos, 2020a, Wickramasuriya et al., 2020), and possible consideration of 'soft' constraints (Danilov and Magnus, 2008).

References

Ashouri, M., Hyndman, R.J., Shmueli, G. (2019), Fast forecast reconciliation using linear models, Department of Econometrics and Business Statistics, Monash University, Working Paper 29/19.
Athanasopoulos, G., Ahmed, R.A., Hyndman, R.J. (2009), Hierarchical forecasts for Australian domestic tourism, International Journal of Forecasting, 25, 1, 146-166.
Athanasopoulos, G., Gamakumara, P., Panagiotelis, A., Hyndman, R.J., Affan, M. (2019), Hierarchical Forecasting, in Fuleky, P. (ed.), Macroeconomic Forecasting in the Era of Big Data, pp. 703-733, Cham, Springer.
Athanasopoulos, G., Hyndman, R.J., Kourentzes, N., Petropoulos, F. (2017), Forecasting with Temporal Hierarchies, European Journal of Operational Research, 262, 1, 60-74.
Ben Taieb, S., Taylor, J.W., Hyndman, R.J. (2020), Hierarchical Probabilistic Forecasting of Electricity Demand with Smart Meter Data, Journal of the American Statistical Association (in press).
Bertani, N., Satopää, V., Jensen, S. (2020), Joint Bottom-Up Method for Hierarchical Time-Series: Application to Australian Tourism. Available at SSRN: https://ssrn.com/abstract=3542278. Accessed on June 1, 2020.
Bisaglia, L., Di Fonzo, T., Girolimetto, D. (2020), Fully reconciled GDP forecasts from Income and Expenditure sides, arXiv:2004.03864v2. Accessed on June 1, 2020.
Byron, R.P. (1978), The estimation of large Social Account Matrices, Journal of the Royal Statistical Society A, 141, 3, 359-367.
Dagum, E.B., Cholette, P.A. (2006), Benchmarking, Temporal Distribution and Reconciliation Methods for Time Series, New York, Springer.
Danilov, D., Magnus, J.R. (2008), On the estimation of a large sparse Bayesian system: The Snaer program, Computational Statistics & Data Analysis, 52, 9, 4203-4224.
Davydenko, A., Fildes, R. (2013), Measuring forecasting accuracy: The case of judgmental adjustments to SKU-level demand forecasts, International Journal of Forecasting, 29, 3, 510-522.
Deming, E., Stephan, F.F. (1940), On a least squares adjustment of a sampled frequency table when the expected marginal totals are known, The Annals of Mathematical Statistics, 11, 4, 427-444.
Di Fonzo, T. (1990), The estimation of M disaggregate time series when contemporaneous and temporal aggregates are known, The Review of Economics and Statistics, 72, 1, 188-192.
Di Fonzo, T., Marini, M. (2011), Simultaneous and two-step reconciliation of systems of time series: methodological and practical issues, Journal of the Royal Statistical Society, Series C, 60, 2, 143-164.
Dunn, D.M., Williams, W.H., DeChaine, T.L. (1976), Aggregate versus subaggregate models in local area forecasting, Journal of the American Statistical Association, 71, 353, 68-71.
Eckert, F., Hyndman, R.J., Panagiotelis, A. (2019), Forecasting Swiss Exports using Bayesian Forecast Reconciliation, KOF Working Papers, KOF Swiss Economic Institute, ETH Zurich, No. 457.
van Erven, T., Cugliari, J. (2015), Game-theoretically Optimal Reconciliation of Contemporaneous Hierarchical Time Series Forecasts, in Antoniadis, A., Poggi, J.M., Brossat, X. (eds.), Modeling and Stochastic Learning for Forecasting in High Dimensions, Berlin, Springer, 297-317.
Fliedner, G. (2001), Hierarchical forecasting: issues and guidelines, Industrial Management and Data Systems, 101, 1, 5-12.
Gamakumara, P., Panagiotelis, A., Athanasopoulos, G., Hyndman, R.J. (2018), Probabilistic Forecasts in Hierarchical Time Series, Department of Econometrics and Business Statistics, Monash University, Working Paper 11/18.
Gross, C.W., Sohl, J.E. (1990), Disaggregation methods to expedite product line forecasting, Journal of Forecasting, 9, 3, 233-254.
Harville, D.A. (2008), Matrix algebra from a statistician's perspective, New York, Springer-Verlag.
Hibon, M., Crone, S.F., Kourentzes, N. (2012), Statistical significance of forecasting methods, presentation at the 32nd Annual International Symposium on Forecasting, Boston. Available at: https://kourentzes.com/forecasting/wp-content/uploads/2014/04/ISF2012_Tests_Kourentzes.pdf. Accessed on June 1, 2020.
Hong, T., Xie, J., Black, J. (2019), Global energy forecasting competition 2017: Hierarchical probabilistic load forecasting, International Journal of Forecasting, 35, 4, 1389-1399.
Hyndman, R.J., Ahmed, R.A., Athanasopoulos, G., Shang, H.L. (2011), Optimal combination forecasts for hierarchical time series, Computational Statistics & Data Analysis, 55, 9, 2579-2589.
Hyndman, R.J., Athanasopoulos, G. (2018), Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia. OTexts.com/fpp2. Accessed on November 14, 2019.
Hyndman, R.J., Athanasopoulos, G., Bergmeir, C., Caceres, G., Chhay, L., O'Hara-Wild, M., Petropoulos, F., Razbash, S., Wang, E., Yasmeen, F. (2020), Forecast: Forecasting functions for time series and linear models, R package version 8.12 (March 31, 2020).
Hyndman, R.J., Kourentzes, N. (2018), Package 'thief', R package version 0.3 (January 24, 2018).
Hyndman, R.J., Lee, A., Wang, E. (2016), Fast computation of reconciled forecasts for hierarchical and grouped time series, Computational Statistics & Data Analysis, 97, 16-32.
Hyndman, R.J., Lee, A., Wang, E., Wickramasuriya, S.L. (2020), Package 'hts', R package version 6.0.0 (April 3, 2020).
Jeon, J., Panagiotelis, A., Petropoulos, F. (2019), Probabilistic forecast reconciliation with applications to wind power and electric load, European Journal of Operational Research, 279, 2, 364-379.
Johnston, R.J., Pattie, C.J. (1993), Entropy-Maximizing and the Iterative Proportional Fitting Procedure, The Professional Geographer, 45, 3, 317-322.
Koning, A.J., Franses, P.H., Hibon, M., Stekler, H.O. (2005), The M3 competition: Statistical tests of the results, International Journal of Forecasting, 21, 3, 397-409.
Kourentzes, N. (2017), Uncertainty in predictive modelling, https://kourentzes.com/forecasting/wp-content/uploads/2017/09/OR59_Kourentzes_Uncertainty.pdf. Accessed on June 1, 2020.
Kourentzes, N. (2018), Model uncertainty in hierarchical forecasting.
Kourentzes, N., Athanasopoulos, G. (2019), Cross-temporal coherent forecasts for Australian tourism, Annals of Tourism Research, 75, 393-409.
Kourentzes, N., Athanasopoulos, G. (2020a), Elucidate structure in intermittent demand series, European Journal of Operational Research (in press).
Kourentzes, N., Athanasopoulos, G. (2020b), On the evaluation of hierarchical forecasts, Department of Econometrics and Business Statistics, Monash University, Working Paper 02/20.
Kourentzes, N., Svetunkov, I., Schaer, O. (2020), Package 'tsutils'. Time Series Exploration, Modelling and Forecasting, R package version 0.9.2 (February 6, 2020).
Ledoit, O., Wolf, M. (2004), A well-conditioned estimator for large-dimensional covariance matrices, Journal of Multivariate Analysis, 88, 2, 365-411.
Li, H., Hyndman, R.J. (2019), Assessing longevity inequality in the U.S.: What can be said about the future? Available at http://dx.doi.org/10.2139/ssrn.3550683. Accessed on June 1, 2020.
Lou, T.T., Shiou, S.-H. (2002), Inverses of 2 × 2 block matrices, Computers and Mathematics with Applications, 43, 1-2, 119-129.
Magnus, J.R., Neudecker, H. (2019), Matrix Differential Calculus with Applications in Statistics and Econometrics, third edition, New York, Wiley.
Mancuso, P., Piccialli, V., Sudoso, A.M. (2020), A machine learning approach for forecasting hierarchical time series. Available at https://arxiv.org/abs/2006.00630. Accessed on June 16, 2020.
Miller, R.E., Blair, P.D. (2009), Input-output analysis: foundations and extensions, 2nd edition, New York, Cambridge University Press.
Nystrup, P., Lindström, E., Pinson, P., Madsen, H. (2020), Temporal hierarchies with autocorrelation for load forecasting, European Journal of Operational Research, 280, 1, 876-888.
Panagiotelis, A., Gamakumara, P., Athanasopoulos, G., Hyndman, R.J. (2020), Forecast reconciliation: A geometric view with new insights on bias correction, International Journal of Forecasting, in press.
Petropoulos, F., Kourentzes, N. (2015), Forecast combination for intermittent demand, Journal of the Operational Research Society, 66, 6, 914-924.
Roach, C. (2019), Reconciled boosted models for GEFCom2017 hierarchical probabilistic load forecasting, International Journal of Forecasting, 35, 4, 1439-1450.
Sagaert, Y.R., Kourentzes, N., De Vuyst, S., Aghezzaf, E.-H. (2019), Incorporating macroeconomic leading indicators in tactical capacity planning, International Journal of Production Economics, 209, 12-19.
Schäfer, J.L., Strimmer, K. (2005), A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics, Statistical Applications in Genetics and Molecular Biology, 4, 1.
Shang, H.L. (2017), Reconciling Forecasts of Infant Mortality Rates at National and Sub-National Levels: Grouped Time-Series Methods, Population Research and Policy Review, 36, 55-84.
Shang, H.L. (2019), Dynamic principal component regression for forecasting functional time series in a group structure, arXiv:1909.00456. Accessed on June 1, 2020.
Shang, H.L., Hyndman, R.J. (2017), Grouped functional time series forecasting: an application to age-specific mortality rates, Journal of Computational and Graphical Statistics, 26, 2, 330-343.
Solomou, S., Weale, M. (1993), Balanced estimates of national accounts when measurement errors are autocorrelated, Journal of the Royal Statistical Society, A, 156, 1, 89-105.
Spiliotis, E., Abolghasemi, M., Hyndman, R.J., Petropoulos, F., Assimakopoulos, V. (2020), Hierarchical forecast reconciliation with machine learning. Available at https://arxiv.org/abs/2006.02043. Accessed on June 16, 2020.
Spiliotis, E., Petropoulos, F., Kourentzes, N., Assimakopoulos, V. (2020), Cross-temporal aggregation: Improving the forecast accuracy of hierarchical electricity consumption, Applied Energy, 261, 1.
Stone, R., Champernowne, D.G., Meade, J.E. (1942), The precision of national income estimates, The Review of Economic Studies, 9, 2, 111-125.
Weale, M. (1988), The reconciliation of values, volumes and prices in the National Accounts, Journal of the Royal Statistical Society, A, 151, 1, 211-221.
Wickramasuriya, S.L., Athanasopoulos, G., Hyndman, R.J. (2019), Optimal forecast reconciliation for hierarchical and grouped time series through trace minimization, Journal of the American Statistical Association, 114, 526, 804-819.
Wickramasuriya, S.L., Turlach, B.A., Hyndman, R.J. (2020), Optimal non-negative forecast reconciliation, Statistics and Computing (in press).
Yagli, G.M., Yang, D., Srinivasan, D. (2019), Reconciling solar forecasts: Sequential reconciliation, Solar Energy, 179, 391-397.
Yang, D. (2020), Reconciling solar forecasts: Probabilistic forecast reconciliation in a nonparametric framework, Solar Energy, in press.
Yang, D., Quan, H., Disfani, V.R., Rodriguez-Gallegos, C.D. (2017), Reconciling solar forecasts: Temporal hierarchy, Solar Energy, 158, 332-346.

Appendix

Cross-temporal forecast reconciliation: Optimal combination method and heuristic alternatives
Contents:
A.1 Alternative, equivalent formulations of the solution to the optimal point forecast reconciliation problem
A.2 Balanced and unbalanced hierarchies
A.3 Commutation matrix and the relationships linking vectors and matrices of bottom and upper time series
A.4 Monthly and hourly temporal hierarchies
A.5 Cross-temporal hierarchy: a toy example
A.6 An alternative heuristic cross-temporal reconciliation procedure
A.7 Average relative accuracy indices for selected groups of variables/time frequencies/forecast horizons, in a rolling forecast experiment
A.8 Forecast reconciliation experiment: supplementary tables and graphs

A.1 Alternative, equivalent formulations of the solution to the optimal point forecast reconciliation problem

Given the model
$$\hat{y} = S\beta + \varepsilon, \quad E(\varepsilon) = 0, \quad E(\varepsilon\varepsilon') = W,$$
the GLS estimator of vector $\beta$ is given by
$$\tilde{\beta} = \left(S'W^{-1}S\right)^{-1}S'W^{-1}\hat{y},$$
and then the vector containing all reconciled forecasts is given by
$$\tilde{y} = S\tilde{\beta} = S\left(S'W^{-1}S\right)^{-1}S'W^{-1}\hat{y} = SG\hat{y}, \tag{A.1}$$
where $G = \left(S'W^{-1}S\right)^{-1}S'W^{-1}$.

Now we show that solution (A.1) is equivalent to the one obtained by considering the following model and its subsequent formulation as a constrained quadratic minimization:
$$\hat{y} = y + \varepsilon, \quad E(\varepsilon) = 0, \quad E(\varepsilon\varepsilon') = W, \quad \text{s.t. } H'y = 0,$$
where $H' = \left[\, I_{n_a} \;\; -C \,\right]$ is a matrix of dimension $\left[n_a \times (n_a + n_b)\right]$. In this case the following constrained minimization problem must be solved:
$$\min_y \; (y - \hat{y})'W^{-1}(y - \hat{y}), \quad \text{s.t. } H'y = 0.$$

Let us consider the Lagrangean function
$$L = (y - \hat{y})'W^{-1}(y - \hat{y}) + 2\lambda'H'y,$$
where $\lambda$ is an $(n_a \times 1)$ vector of Lagrange multipliers. Differentiating $L$ with respect to $y$ and $\lambda$, and then equating to zero (first order conditions), we get the linear system
$$W^{-1}y + H\lambda = W^{-1}\hat{y}, \qquad H'y = 0,$$
that is:
$$\begin{bmatrix} W^{-1} & H \\ H' & 0 \end{bmatrix} \begin{bmatrix} y \\ \lambda \end{bmatrix} = \begin{bmatrix} W^{-1}\hat{y} \\ 0 \end{bmatrix}.$$
According to the lemma of inversion of a block-partitioned matrix (Lou and Shiou, 2002), it is:
$$\begin{bmatrix} W^{-1} & H \\ H' & 0 \end{bmatrix}^{-1} = \begin{bmatrix} W - WH\left(H'WH\right)^{-1}H'W & WH\left(H'WH\right)^{-1} \\ \left(H'WH\right)^{-1}H'W & -\left(H'WH\right)^{-1} \end{bmatrix},$$
and thus
$$\tilde{y} = \left[ I_n - WH\left(H'WH\right)^{-1}H' \right]\hat{y}.$$

Now, let us consider matrix $J$, defined as $J = \left[\, 0_{n_b \times n_a} \;\; I_{n_b} \,\right]$, which has dimension $\left[n_b \times n\right]$ and, when applied to an $(n \times 1)$ vector, 'extracts' its last $n_b$ elements. In other words, by denoting $\tilde{\beta} = J\tilde{y}$, it is:
$$\tilde{\beta} = \left[ J - JWH\left(H'WH\right)^{-1}H' \right]\hat{y} = G\hat{y}, \qquad \tilde{y} = S\tilde{\beta},$$
with
$$G = \left(S'W^{-1}S\right)^{-1}S'W^{-1} = J - JWH\left(H'WH\right)^{-1}H'.$$

In the former case, the expression requires the inversion of an $(n \times n)$ matrix, $W$, and of an $(n_b \times n_b)$ matrix, $S'W^{-1}S$. In the latter case the matrix to be inverted, $H'WH$, has dimension $(n_a \times n_a)$.
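The equivalence above can be checked numerically. The following base-R sketch is ours (not the paper's code), using arbitrary toy values on a minimal hierarchy with $n_a = 1$ and $n_b = 2$:

```r
# Check that the GLS/structural solution S(S'W^-1 S)^-1 S'W^-1 yhat
# coincides with the projection [I - WH(H'WH)^-1 H'] yhat.
set.seed(1)
C <- matrix(c(1, 1), nrow = 1)          # total = sum of the two bottom series
S <- rbind(C, diag(2))                  # summing matrix (3 x 2)
H <- t(cbind(diag(1), -C))              # H' = [I_na  -C], here (1 x 3)
A <- matrix(rnorm(9), 3, 3)
W <- crossprod(A) + diag(3)             # a generic s.p.d. covariance matrix
yhat <- rnorm(3)                        # incoherent base forecasts
Winv <- solve(W)

y1 <- S %*% solve(t(S) %*% Winv %*% S) %*% t(S) %*% Winv %*% yhat
y2 <- (diag(3) - W %*% H %*% solve(t(H) %*% W %*% H) %*% t(H)) %*% yhat

max(abs(y1 - y2))                       # numerically zero: the two agree
```

The projection form also makes it immediate that the reconciled vector satisfies the constraint, since H'y2 = 0 by construction.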
A.2 Balanced and unbalanced hierarchies

A simple three-level hierarchy is shown in the right panel of Figure A.1, where variable C at the second level of the hierarchy has no 'children', and is thus considered as a bottom variable too, at level three of the hierarchy. The left panel shows the 'balanced version' of the same hierarchy, where variable C is (duplicated and) present at both levels two and three.

Figure A.1: A simple unbalanced hierarchy (right) and its balanced version (left).

The aggregation relationships linking the component series can be expressed as follows:
$$y_{Tot} = y_{AA} + y_{AB} + y_{BA} + y_{BB} + y_C, \quad y_A = y_{AA} + y_{AB}, \quad y_B = y_{BA} + y_{BB}, \quad y_C = y_C,$$
where the last equality has merely the function of making the hierarchy balanced. The corresponding contemporaneous aggregation matrix $C$ is given by:
$$C = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$$
The redundant relationship ($y_C = y_C$) makes the last row of matrix $C$ equal to the last row of the contemporaneous summing matrix $S = \begin{bmatrix} C \\ I_5 \end{bmatrix}$. This redundancy can be easily eliminated by considering the new contemporaneous aggregation matrix $\tilde{C}$, which in the case of an unbalanced hierarchy has clearly one row less than in the balanced version:
$$\tilde{C} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \end{bmatrix}, \qquad \tilde{S} = \begin{bmatrix} \tilde{C} \\ I_5 \end{bmatrix},$$
which has dimension (8 × 5) instead of (9 × 5) as for matrix $S$. In a complex hierarchy, especially when contemporaneous and temporal hierarchies are simultaneously considered, this fact should be carefully considered in order to save memory space and computing time.

The R package hts (Hyndman et al., 2020) manages only balanced hierarchies, and thus builds matrix $S$ instead of $\tilde{S}$. Due to this fact, large cross-sectional hierarchies might require computational efforts larger than necessary, and could face numerical problems when more sophisticated reconciliation strategies are applied. For example, the grouped time series of the Australian Tourism Demand analyzed by Wickramasuriya et al. (2019) (see also Ashouri et al., 2019; Bertani et al., 2020; Wickramasuriya et al., 2020) contains 30 duplicated time series, since it comes from two unbalanced hierarchies with only 525 'unique' time series (304 bts and 221 uts), as compared to the 555 time series of the balanced version. A similar, though less pronounced, case (105 'unique' time series out of 111 for the balanced hierarchy) is present in the reduced version of this system analyzed by Kourentzes and Athanasopoulos (2019) and Panagiotelis et al. (2020).

A.3 Commutation matrix and the relationships linking vectors and matrices of bottom and upper time series
Given an $(r \times c)$ matrix $X$, denote with $C_{r,c}$ the $(rc \times rc)$ commutation matrix (Magnus and Neudecker, 2019) which maps $\mathrm{vec}(X)$ into $\mathrm{vec}(X')$:
$$C_{r,c}\,\mathrm{vec}(X) = \mathrm{vec}(X').$$
This matrix is a special type of permutation matrix, obtained by simple exchanges of rows of the identity matrix, and is therefore orthogonal, that is:
$$C_{r,c}^{-1} = C_{r,c}' = C_{c,r}.$$

A.3.1 Cross-sectional case: the permutation matrices linking vectors $b^*$ to $b$ and $a^*$ to $a$

Denoting $b = \mathrm{vec}(B)$, $b^* = \mathrm{vec}(B')$, $a = \mathrm{vec}(A)$, $a^* = \mathrm{vec}(A')$, the mappings of $b$ into $b^*$ and of $a$ into $a^*$, respectively, can be expressed as
$$P_b\, b = b^*, \qquad P_a\, a = a^*,$$
where $P_b$ and $P_a$ are $(n_bT \times n_bT)$ and $(n_aT \times n_aT)$ commutation matrices, respectively. Since both $P_b$ and $P_a$ are orthogonal, it is:
$$b = P_b'\, b^*, \qquad a = P_a'\, a^*.$$
The index $k$, $k = 1, \dots, n_bT$, of the generic element of vector $b^*$ can be expressed in terms of the row and column indices of the corresponding element of matrix $B'$:
$$\mathrm{vec}(B') = b^* = \{b^*_k\}, \quad b^*_k = b_{ti}, \quad \text{with } k = t + (i-1)T.$$
As for the index $l$, $l = 1, \dots, n_aT$, of the generic element of vector $a^*$, we have:
$$\mathrm{vec}(A') = a^* = \{a^*_l\}, \quad a^*_l = a_{tj}, \quad \text{with } l = t + (j-1)T.$$

A numerical example. Assuming that $n = 2$ variables and $T = 3$ time periods are considered, the matrix
$$X = \begin{bmatrix} 11 & 12 & 13 \\ 21 & 22 & 23 \end{bmatrix}$$
can be vectorized either as $\mathrm{vec}(X) = x = [\,11\; 21\; 12\; 22\; 13\; 23\,]'$ or as $\mathrm{vec}(X') = x^* = [\,11\; 12\; 13\; 21\; 22\; 23\,]'$. In this case, the permutation matrix $P$ mapping $x^*$ into $x$, such that $x = Px^*$ (and $x^* = P'x$), is given by:
$$P = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.$$
The following R script performs the calculation of matrix $P$:

n <- 2
t <- 3
I <- matrix(1:(n*t), n, t, byrow = TRUE)
I <- as.vector(I)
P <- diag(n*t)[I, ]  # rows of the identity matrix permuted according to I (final step implied by the text)

A.3.2 Cross-temporal case: the relationship between $\check{y}$ and $\mathrm{vec}(Y')$

Assuming $h = 1$, denote with $Y = \begin{bmatrix} A \\ B \end{bmatrix}$ the $[n \times (k^* + m)]$ matrix of the target forecasts at any temporal frequency. The $[n_b \times (k^* + m)]$ submatrix $B$, which contains the target forecasts of the bottom time series, can be written as:
$$B = \left[\, B^{[m]} \;\; B^{[k_{p-1}]} \;\; \dots \;\; B^{[k_2]} \;\; B^{[1]} \,\right] = \left[\, B^* \;\; B^{[1]} \,\right],$$
where the $(n_b \times k^*)$ matrix $B^* = \left[\, B^{[m]} \;\; B^{[k_{p-1}]} \;\; \dots \;\; B^{[k_2]} \,\right]$ and the matrix $B^{[1]}$ contain the target forecasts for, respectively, the temporally aggregated time series (lf-bts) and the high-frequency ones (hf-bts). The following relationships hold:
$$C_{n_b,(k^*+m)}\,\mathrm{vec}(B) = \mathrm{vec}(B'), \quad C_{n_b,k^*}\,\mathrm{vec}(B^*) = \mathrm{vec}\!\left[(B^*)'\right], \quad C_{n_b,m}\,\mathrm{vec}\!\left(B^{[1]}\right) = \mathrm{vec}\!\left[(B^{[1]})'\right].$$
Since $\mathrm{vec}(B) = \begin{bmatrix} \mathrm{vec}(B^*) \\ \mathrm{vec}(B^{[1]}) \end{bmatrix}$, we can write:
$$C_{n_b,(k^*+m)}\,\mathrm{vec}(B) = C_{n_b,(k^*+m)} \begin{bmatrix} C_{k^*,n_b} & 0_{(n_bk^* \times n_bm)} \\ 0_{(n_bm \times n_bk^*)} & C_{m,n_b} \end{bmatrix} \begin{bmatrix} \mathrm{vec}\!\left[(B^*)'\right] \\ \mathrm{vec}\!\left[(B^{[1]})'\right] \end{bmatrix},$$
that is:
$$\mathrm{vec}(B') = \widetilde{Q} \begin{bmatrix} \mathrm{vec}\!\left[(B^*)'\right] \\ \mathrm{vec}\!\left[(B^{[1]})'\right] \end{bmatrix}, \quad \text{where} \quad \widetilde{Q} = C_{n_b,(k^*+m)} \begin{bmatrix} C_{k^*,n_b} & 0_{(n_bk^* \times n_bm)} \\ 0_{(n_bm \times n_bk^*)} & C_{m,n_b} \end{bmatrix}. \tag{A.2}$$
According to expression (38), the $[n(k^*+m) \times 1]$ vector $\check{y}$ can be written as:
$$\check{y} = \begin{bmatrix} \mathrm{vec}(A') \\ \mathrm{vec}\!\left[(B^*)'\right] \\ \mathrm{vec}\!\left[(B^{[1]})'\right] \end{bmatrix}.$$
Then, $\mathrm{vec}(Y')$ can be expressed in terms of $\check{y}$ as:
$$\mathrm{vec}(Y') = Q\,\check{y}, \quad \text{where} \quad Q = \begin{bmatrix} I_{n_a(k^*+m)} & 0_{[n_a(k^*+m) \times n_bm]} \\ 0_{[n_bm \times n_a(k^*+m)]} & \widetilde{Q} \end{bmatrix}.$$

A.4 Monthly and hourly temporal hierarchies
For monthly data, the aggregates of interest are for $k \in \{1, 2, 3, 4, 6, 12\}$. Hence the monthly observations are aggregated to annual, semi-annual, four-monthly, quarterly and bi-monthly observations. These can be represented in two separate hierarchies, as shown in Figure A.2, which means that the temporal hierarchies form a grouped series, sharing the 'top level' (annual) aggregate, and the same twelve 'bottom' nodes, one for each month of the original temporally disaggregated time series.

Figure A.2: The two temporal hierarchies induced by a monthly time series. Panel (a): Monthly - Quarterly - Semi-Annual - Annual frequencies. Panel (b): Monthly - Bi-Monthly - Four-Monthly - Annual frequencies.

However, the $(16 \times 12)$ temporal aggregation matrix $K$ for this case is easily obtained:
$$K = \begin{bmatrix} \mathbf{1}_{12}' \\ I_2 \otimes \mathbf{1}_6' \\ I_3 \otimes \mathbf{1}_4' \\ I_4 \otimes \mathbf{1}_3' \\ I_6 \otimes \mathbf{1}_2' \end{bmatrix},$$
and thus
$$R = \begin{bmatrix} K \\ I_{12} \end{bmatrix}, \quad Z' = \left[\, I_{16} \;\; -K \,\right], \quad x_\tau = R\,x^{[1]}_\tau, \quad Z'x_\tau = 0, \quad \tau = 1, \dots, N,$$
where $x_\tau = \left[\, x^{[12]}_\tau,\; x^{[6]'}_\tau,\; x^{[4]'}_\tau,\; x^{[3]'}_\tau,\; x^{[2]'}_\tau,\; x^{[1]'}_\tau \,\right]'$ is the $(28 \times 1)$ vector containing all temporal aggregates of variable $X$ at the observation index $\tau$ (i.e., within the complete $\tau$-th cycle).

Let us conclude by considering the case of an hourly time series with diurnal periodicity. In this case it is $m = 24$, $k^* = 36$, and $K_N$ is the $(36N \times 24N)$ matrix
$$K_N = \begin{bmatrix} I_N \otimes \mathbf{1}_{24}' \\ I_{2N} \otimes \mathbf{1}_{12}' \\ I_{3N} \otimes \mathbf{1}_{8}' \\ I_{4N} \otimes \mathbf{1}_{6}' \\ I_{6N} \otimes \mathbf{1}_{4}' \\ I_{8N} \otimes \mathbf{1}_{3}' \\ I_{12N} \otimes \mathbf{1}_{2}' \end{bmatrix},$$
which converts single-hour values into the sums over 24, 12, 8, 6, 4, 3, and 2 hours, respectively, and $Z'_N = \left[\, I_{36N} \;\; -K_N \,\right]$ is a full row-rank $(36N \times 60N)$ matrix.
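The monthly aggregation matrix above can be built with a few Kronecker products. A base-R sketch (ours, not the paper's code):

```r
# (16 x 12) temporal aggregation matrix K for monthly data, stacking annual,
# semi-annual, four-monthly, quarterly and bi-monthly aggregation.
ones <- function(k) matrix(1, 1, k)   # 1 x k row vector of ones
K <- rbind(diag(1) %x% ones(12),      # annual
           diag(2) %x% ones(6),       # semi-annual
           diag(3) %x% ones(4),       # four-monthly
           diag(4) %x% ones(3),       # quarterly
           diag(6) %x% ones(2))       # bi-monthly
R  <- rbind(K, diag(12))              # summing matrix of the temporal hierarchy
Zp <- cbind(diag(16), -K)             # zero-constraints matrix Z' = [I_16  -K]
dim(K)                                # 16 x 12
```

The same pattern with divisors of 24 instead of 12 yields the hourly matrix $K_N$ described in the text.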
Let us consider the relationships linking all the variables implied by a cross-temporal hier-archy for the very simple case of a total quarterly series observed for one year, X , obtainedas the sum of two component variables, W and Z , respectively. The contemporaneous(cross-sectional) constraint, X = W + Z , must hold at any observation index of all tempo-ral frequencies considered in the temporal hierarchy of Figure 3 (annual, semi-annual andquarterly), as shown in Figure A.3, which gives a graphical view of the the way in whichthe two dimensions (cross-sectional and temporal) are combined within a complete timecycle (one year).All the nodes in the cross-temporal hierarchy can be expressed in terms of the quarterlybottom time series w [1] t and z [1] t , t = 1 , . . . , , according to the structural representation: x [4]1 x [2]1 x [2]2 x [1]1 x [1]2 x [1]3 x [1]4 w [4]1 w [2]1 w [2]2 z [4]1 z [2]1 z [2]2 w [1]1 w [1]2 w [1]3 w [1]4 z [1]1 z [1]2 z [1]3 z [1]4 (cid:124) (cid:123)(cid:122) (cid:125) ˇ y = I (cid:124) (cid:123)(cid:122) (cid:125) ˇ S w [1]1 w [1]2 w [1]3 w [1]4 z [1]1 z [1]2 z [1]3 z [1]4 (cid:124) (cid:123)(cid:122) (cid:125) b , [1]1 w [1]2 w [1]3 w [1]4 w [2]1 w [2]2 w [4]1 W z [1]1 z [1]2 z [1]3 z [1]4 z [2]1 z [2]2 z [4]1 Z x [1]1 x [1]2 x [1]3 x [1]4 x [2]1 x [2]2 x [1]1 X Figure A.3 : A two level cross-temporal hierarchy with quarterly datawhere a = (cid:104) x [4]1 x [2]1 x [2]2 x [1]1 x [1]2 x [1]3 x [1]4 (cid:105) (cid:48) , b = (cid:104) w [1]1 w [1]2 w [1]3 w [1]4 z [1]1 z [1]2 z [1]3 z [1]4 (cid:105) (cid:48) , ˇ y = (cid:20) ab (cid:21) , ˇ S = (cid:20) ˇ CI (cid:21) , and ˇ C is the (13 × matrix: ˇ C = . The zero constraints valid for the nodes of the cross-temporal hierarchy can be representedthrough the (13 × matrix ˇ H (cid:48) = (cid:2) I − ˇ C (cid:3) , which has full row-rank, and is such that: ˇ H (cid:48) ˇ y = (13 × . 
(A.3)

According to the notation used so far, it is $n_a = 1$, $n_b = 2$, $T = m = 4$, $N = 1$, $p = 3$, and $\mathcal{K} = \{4, 2, 1\}$. The contemporaneous aggregation matrix $C$, mapping bts into uts, is simply a $(1 \times 2)$ row vector of ones: $C = [1 \;\; 1]$, and thus $U'$ is the $(1 \times 3)$ row vector $U' = [1 \; -1 \; -1]$. Furthermore, the $(3 \times 4)$ temporal aggregation matrix $K$, mapping a quarterly series into its semi-annual and annual counterparts, and the related $(3 \times 7)$ matrix $Z' = [I_3 \; -K]$, are given by:
$$
K = \begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1
\end{bmatrix}, \qquad
Z' = \begin{bmatrix}
1 & 0 & 0 & -1 & -1 & -1 & -1 \\
0 & 1 & 0 & -1 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & -1
\end{bmatrix}.
$$
The $(3 \times 7)$ matrix $Y$, collecting all the time series at any observation frequency, is given by:
$$
Y = \begin{bmatrix}
x^{[4]}_1 & x^{[2]}_1 & x^{[2]}_2 & x^{[1]}_1 & x^{[1]}_2 & x^{[1]}_3 & x^{[1]}_4 \\
w^{[4]}_1 & w^{[2]}_1 & w^{[2]}_2 & w^{[1]}_1 & w^{[1]}_2 & w^{[1]}_3 & w^{[1]}_4 \\
z^{[4]}_1 & z^{[2]}_1 & z^{[2]}_2 & z^{[1]}_1 & z^{[1]}_2 & z^{[1]}_3 & z^{[1]}_4
\end{bmatrix},
$$
and then $y = \operatorname{vec}(Y')$ is the $(21 \times 1)$ vector
$$
y = \left[ x^{[4]}_1 \; x^{[2]}_1 \; x^{[2]}_2 \; x^{[1]}_1 \; \cdots \; x^{[1]}_4 \;\; w^{[4]}_1 \; w^{[2]}_1 \; w^{[2]}_2 \; w^{[1]}_1 \; \cdots \; w^{[1]}_4 \;\; z^{[4]}_1 \; z^{[2]}_1 \; z^{[2]}_2 \; z^{[1]}_1 \; \cdots \; z^{[1]}_4 \right]',
$$
which is differently organized as compared to $\check{y}$. However, it is easy to show that $y = Q\check{y}$, where $Q$ is the $(21 \times 21)$ permutation matrix
$$
Q = \begin{bmatrix}
I_7 & 0 & 0 & 0 & 0 \\
0 & I_3 & 0 & 0 & 0 \\
0 & 0 & 0 & I_4 & 0 \\
0 & 0 & I_3 & 0 & 0 \\
0 & 0 & 0 & 0 & I_4
\end{bmatrix},
$$
whose column blocks conform to the partition of $\check{y}$ into the 7 temporal aggregates of $x$, the 3 temporal aggregates of $w$ and of $z$, and the quarterly values of $w$ and of $z$, respectively. Given the orthogonality of matrix $Q$, it is $\check{y} = Q'y$, and then the constraints (A.3) can be re-stated as $\check{H}' Q' y = 0_{(13 \times 1)}$, that is $H'y = 0_{(13 \times 1)}$, where $H' = \left( Q\check{H} \right)'$ is a $(13 \times 21)$ full row-rank matrix.

The cross-temporal constraints can be formulated according to expression (34) as well, where $\breve{H}'$ is the $(16 \times 21)$ matrix
$$
\breve{H}' = \begin{bmatrix} U' \otimes I_7 \\ I_3 \otimes Z' \end{bmatrix}
= \begin{bmatrix}
I_7 & -I_7 & -I_7 \\
Z' & 0 & 0 \\
0 & Z' & 0 \\
0 & 0 & Z'
\end{bmatrix}.
$$
The rank of $\breve{H}'$ is 13, which means that the matrix is not full row-rank. The choice of the rows to remove is not unique: for example, in this simple case a different full row-rank $H'$ can be obtained either by removing the last three rows, or by eliminating rows 5, 10 and 15 from matrix $\breve{H}'$. In real life applications the elimination of linearly dependent relationships from the cross-temporal constraint set might not be as simple as in this toy example. In general, the computation of $H'$ as in (35) seems rather quick and easy: the $H'$ matrix according to that procedure is simply matrix $\breve{H}'$ without the first three rows, that is:
$$
H' = \begin{bmatrix}
I^* & -I^* & -I^* \\
Z' & 0 & 0 \\
0 & Z' & 0 \\
0 & 0 & Z'
\end{bmatrix},
$$
where $I^*$ is the $(4 \times 7)$ matrix $I^* = \left[\, 0_{(4 \times 3)} \;\; I_4 \,\right]$.

A.6 An alternative heuristic cross-temporal reconciliation procedure
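Before describing the procedure, it may help to verify the constraint algebra of the toy example above numerically. The following is a minimal numpy sketch (variable names are ours, not from the paper): it builds $\check{C}$, $\check{S}$, $\check{H}'$ and $\breve{H}'$, and checks that $\breve{H}'$ has rank 13 while structurally coherent vectors satisfy $\check{H}'\check{S}b = 0$.

```python
import numpy as np

# Toy cross-temporal hierarchy: X = W + Z, quarterly data over one year.
# Bottom series b = (w1..w4, z1..z4)'.
I3, I4, I7 = np.eye(3), np.eye(4), np.eye(7)

# Temporal aggregation of one variable: annual and semi-annual sums of quarters
K = np.array([[1, 1, 1, 1],            # annual
              [1, 1, 0, 0],            # first semester
              [0, 0, 1, 1]])           # second semester
Z_T = np.hstack([I3, -K])              # Z' = [I3  -K], (3 x 7)

# Cross-sectional constraint X - W - Z = 0 at each of the 7 temporal nodes
U_T = np.array([[1, -1, -1]])          # U' = [1 -1 -1]

# Ccheck maps the 8 bottom quarterly series into the 13 upper nodes
agg_one = np.vstack([K, I4])                    # (7 x 4): annual, semesters, quarters
x_rows = np.hstack([agg_one, agg_one])          # x-series nodes (x = w + z)
w_rows = np.hstack([K, np.zeros((3, 4))])       # w annual/semi-annual nodes
z_rows = np.hstack([np.zeros((3, 4)), K])       # z annual/semi-annual nodes
Ccheck = np.vstack([x_rows, w_rows, z_rows])    # (13 x 8)

Scheck = np.vstack([Ccheck, np.eye(8)])         # (21 x 8) structural matrix
Hcheck_T = np.hstack([np.eye(13), -Ccheck])     # (13 x 21), full row-rank

# Redundant formulation: Hbreve' = [U' kron I7 ; I3 kron Z'], (16 x 21)
Hbreve_T = np.vstack([np.kron(U_T, I7), np.kron(I3, Z_T)])

print(np.linalg.matrix_rank(Hcheck_T))          # 13
print(np.linalg.matrix_rank(Hbreve_T))          # 13: not full row-rank
print(np.allclose(Hcheck_T @ Scheck, 0))        # coherent by construction: True
```

Removing three linearly dependent rows from the 16-row matrix, as discussed above, leaves the same 13-dimensional constraint set.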
Let us consider a cross-temporal reconciliation procedure based on the reversal of the order in which the single-dimension forecast reconciliation procedures are applied by KA. The procedure consists of the following steps (it is assumed $h = 1$):

Step 1
Transform $\widehat{Y}$ by computing time-by-time cross-sectionally reconciled forecasts $\breve{Y}$ for all the temporal aggregation levels: $\widehat{Y} \rightarrow \breve{Y}$. The $[n \times (k^* + m)]$ matrix $\widehat{Y}$ can be re-written also as:
$$
\widehat{Y} = \left[ \widehat{Y}^{[m]} \;\; \widehat{Y}^{[k_{p-1}]} \;\; \cdots \;\; \widehat{Y}^{[k_2]} \;\; \widehat{Y}^{[1]} \right],
$$
where $\widehat{Y}^{[k]}$, $k \in \mathcal{K}$, has dimension $(n \times M_k)$. Cross-sectionally reconciled forecasts can be computed by transforming each $\widehat{Y}^{[k]}$ as:
$$
\breve{Y}^{[k]} = M^{[k]} \widehat{Y}^{[k]}, \quad k \in \mathcal{K},
$$
where $M^{[k]}$ are $p$ transformation matrices, each of dimension $(n \times n)$, given by:
$$
M^{[k]} = I_n - W^{[k]} U \left( U' W^{[k]} U \right)^{-1} U', \quad k \in \mathcal{K},
$$
and $W^{[k]}$ is a $(n \times n)$ p.d. known matrix. Since it is $U' M^{[k]} = 0_{(n_a \times n)}$, $k \in \mathcal{K}$, the reconciled forecasts are cross-sectionally coherent, i.e. $U' \breve{Y} = 0_{[n_a \times (k^* + m)]}$, but not temporally: $Z' \breve{Y}' \neq 0_{(k^* \times n)}$.

Step 2
For each individual variable, compute the temporally reconciled forecasts $\check{Y}$: $\breve{Y} \rightarrow \check{Y}$. This result can be obtained by applying the point forecast reconciliation formula for temporal hierarchies (24) to each column of matrix $\breve{Y}'$. In fact, using the notation of section 4, it is
$$
\breve{Y}' = \begin{bmatrix}
\breve{t}_{a_1} & \cdots & \breve{t}_{a_{n_a}} & \breve{t}_{b_1} & \cdots & \breve{t}_{b_{n_b}} \\
\breve{a}^{[1]}_1 & \cdots & \breve{a}^{[1]}_{n_a} & \breve{b}^{[1]}_1 & \cdots & \breve{b}^{[1]}_{n_b}
\end{bmatrix}.
$$
The $n_a$ vectors of temporally reconciled forecasts of the uts can be obtained as:
$$
\begin{bmatrix} \check{t}_{a_j} \\ \check{a}^{[1]}_j \end{bmatrix}
= M_{a_j} \begin{bmatrix} \breve{t}_{a_j} \\ \breve{a}^{[1]}_j \end{bmatrix}, \qquad
M_{a_j} = I_{k^*+m} - \Omega_{a_j} Z \left( Z' \Omega_{a_j} Z \right)^{-1} Z', \quad j = 1, \ldots, n_a.
$$
Likewise, the $n_b$ vectors of temporally reconciled forecasts of the bts are given by:
$$
\begin{bmatrix} \check{t}_{b_i} \\ \check{b}^{[1]}_i \end{bmatrix}
= M_{b_i} \begin{bmatrix} \breve{t}_{b_i} \\ \breve{b}^{[1]}_i \end{bmatrix}, \qquad
M_{b_i} = I_{k^*+m} - \Omega_{b_i} Z \left( Z' \Omega_{b_i} Z \right)^{-1} Z', \quad i = 1, \ldots, n_b,
$$
where the $n_a + n_b$ matrices $M_{a_j}$ and $M_{b_i}$ have dimension $[(k^*+m) \times (k^*+m)]$, and each $\Omega_{a_j}$, $j = 1, \ldots, n_a$, and $\Omega_{b_i}$, $i = 1, \ldots, n_b$, respectively, is a known p.d. $[(k^*+m) \times (k^*+m)]$ matrix.

The mapping performing the transformation of the step 1 forecasts into the temporally reconciled ones can be expressed in compact form as:
$$
\operatorname{vec}\left( \check{Y}' \right) =
\begin{bmatrix}
M_{a_1} & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & M_{a_{n_a}} & 0 & \cdots & 0 \\
0 & \cdots & 0 & M_{b_1} & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & 0 & \cdots & M_{b_{n_b}}
\end{bmatrix}
\operatorname{vec}\left( \breve{Y}' \right).
$$
The temporally reconciled forecasts can then be collected in the matrix $\check{Y}'$:
$$
\check{Y}' = \begin{bmatrix}
\check{t}_{a_1} & \cdots & \check{t}_{a_{n_a}} & \check{t}_{b_1} & \cdots & \check{t}_{b_{n_b}} \\
\check{a}^{[1]}_1 & \cdots & \check{a}^{[1]}_{n_a} & \check{b}^{[1]}_1 & \cdots & \check{b}^{[1]}_{n_b}
\end{bmatrix}
= \begin{bmatrix}
(\check{A}^{[m]})' & (\check{B}^{[m]})' \\
\vdots & \vdots \\
(\check{A}^{[k_2]})' & (\check{B}^{[k_2]})' \\
(\check{A}^{[1]})' & (\check{B}^{[1]})'
\end{bmatrix},
$$
which is in line with the temporal aggregation constraints, i.e. $Z' \check{Y}' = 0_{(k^* \times n)}$, but in general it is not in line with the cross-sectional (contemporaneous) constraints: $U' \check{Y} \neq 0_{[n_a \times (k^*+m)]}$.

Step 3
Transform again the step 1 forecasts $\breve{Y}$, by computing temporally reconciled forecasts for all $n$ variables using the $[(k^*+m) \times (k^*+m)]$ matrix $M^{cst}$, where ‘cst’ stands for ‘cross-sectional-then-temporal’, given by the average of the matrices $M_i$ obtained at step 2: $\breve{Y} \Rightarrow \widetilde{Y}^{cst}$. Matrix $M^{cst}$ can be expressed as:
$$
M^{cst} = \frac{1}{n} \sum_{i=1}^{n} M_i.
$$
The final cross-temporal reconciled forecasts are given by:
$$
\widetilde{Y}^{cst} = \left( M^{cst} \breve{Y}' \right)' = \breve{Y} \left( M^{cst} \right)'. \qquad \text{(A.4)}
$$
Since $U' \breve{Y} = 0_{[n_a \times (k^*+m)]}$, and $Z' M^{cst} = \frac{1}{n} \sum_{i=1}^{n} Z' M_i = 0_{[k^* \times (k^*+m)]}$, the reconciled forecasts (A.4) fulfill both cross-sectional and temporal aggregation constraints:
$$
U' \widetilde{Y}^{cst} = U' \breve{Y} \left( M^{cst} \right)' = 0_{[n_a \times (k^*+m)]}, \qquad
Z' \left( \widetilde{Y}^{cst} \right)' = Z' M^{cst} \breve{Y}' = 0_{(k^* \times n)}.
$$

A.7 Average relative accuracy indices for selected groups of variables/time frequencies/forecast horizons, in a rolling forecast experiment
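The three-step cross-sectional-then-temporal heuristic of section A.6 can be sketched numerically on the toy system $X = W + Z$ with quarterly data. This is a sketch under simplifying assumptions, not code from the paper: identity weighting matrices ($W^{[k]} = I$, $\Omega_i = I$) are used, so the $n$ step-2 matrices $M_i$ coincide and their average is trivial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy system: n = 3 series (X = W + Z), K = {4, 2, 1}, so each series has
# k* + m = 3 + 4 = 7 stacked values (annual, two semesters, four quarters).
U = np.array([[1.0], [-1.0], [-1.0]])            # U (n x n_a): X - W - Z = 0
K = np.array([[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
Z = np.hstack([np.eye(3), -K]).T                 # Z, ((k*+m) x k*)

Y_hat = rng.normal(size=(3, 7))                  # base forecasts, one row per series

# Step 1: time-by-time cross-sectional reconciliation, M = I - U (U'U)^{-1} U'
M_cs = np.eye(3) - U @ np.linalg.solve(U.T @ U, U.T)
Y_breve = M_cs @ Y_hat                           # now U' Y_breve = 0

# Step 2: temporal reconciliation of each series (each column of Y_breve')
M_t = np.eye(7) - Z @ np.linalg.solve(Z.T @ Z, Z.T)
# (with Omega_i = I, all n matrices M_i equal M_t)

# Step 3: average the step-2 matrices and apply to the step-1 forecasts
M_cst = M_t                                      # average of n identical M_i
Y_tilde = Y_breve @ M_cst.T                      # final reconciled forecasts

print(np.allclose(U.T @ Y_tilde, 0))             # cross-sectionally coherent: True
print(np.allclose(Z.T @ Y_tilde.T, 0))           # temporally coherent: True
```

With heterogeneous p.d. matrices $\Omega_i$ the step-2 matrices would differ across series, but their average would still satisfy $Z' M^{cst} = 0$, which is all that step 3 requires.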
Let
$$
\hat{e}^{[k],h}_{i,j,t} = y^{[k]}_{i,t+h} - \hat{y}^{[k],h}_{i,j,t}, \quad i = 1, \ldots, n, \; j = 0, \ldots, J, \; t = 1, \ldots, q, \; k \in \mathcal{K}, \; h = 1, \ldots, M_k,
$$
be the forecast error, where $y$ and $\hat{y}$ are the actual and the forecasted values, respectively, suffix $i$ denotes the variable of interest, $j$ is the forecasting technique, where $j = 0$ is the benchmark forecasting procedure, $t$ is the forecast origin, $\mathcal{K}$ is the set of the time frequencies at which the series is observed, and $h$ is the forecast horizon, whose lead time depends on the time frequency $k$.

Denote by $A^{[k],h}_{i,j}$ the forecasting accuracy of the technique $j$, computed across $q$ forecast origins, for the $h$-step-ahead forecasts of the variable $i$ at the temporal aggregation level $k$. For example, $A^{[k],h}_{i,j} = MSE^{[k],h}_{i,j}$, as defined in expression (57); otherwise we might have $A^{[k],h}_{i,j} = MAE^{[k],h}_{i,j}$ or $A^{[k],h}_{i,j} = RMSE^{[k],h}_{i,j}$, where
$$
MAE^{[k],h}_{i,j} = \frac{1}{q} \sum_{t=1}^{q} \left| \hat{e}^{[k],h}_{i,j,t} \right|, \qquad
RMSE^{[k],h}_{i,j} = \sqrt{ \frac{1}{q} \sum_{t=1}^{q} \left( \hat{e}^{[k],h}_{i,j,t} \right)^2 }.
$$
In any case, we consider the relative version of the accuracy index $A^{[k],h}_{i,j}$, given by:
$$
r^{[k],h}_{i,j} = \frac{A^{[k],h}_{i,j}}{A^{[k],h}_{i,0}}, \quad i = 1, \ldots, n, \; j = 0, \ldots, J, \; k \in \mathcal{K}, \; h = 1, \ldots, M_k,
$$
and use it to compute the Average relative accuracy index of the forecasting procedure $j$, for given $k$ and $h$, through the geometric mean:
$$
\text{AvgRelA}^{[k],h}_{j} = \left( \prod_{i=1}^{n} r^{[k],h}_{i,j} \right)^{\frac{1}{n}}, \quad j = 0, \ldots, J.
$$
We may consider the following average relative accuracy indices for selected groups of variables/time frequencies and forecast horizons:
Average relative accuracy indices for a single variable at a given time frequency, for multiple forecast horizons
$$
\text{AvgRelA}^{[k],h_1:h_2}_{i,j} = \left( \prod_{h=h_1}^{h_2} r^{[k],h}_{i,j} \right)^{\frac{1}{h_2 - h_1 + 1}}, \quad i = 1, \ldots, n, \; j = 0, \ldots, J, \; k \in \mathcal{K}.
$$

Average relative accuracy indices for a group of variables (either all, or selected groups, e.g. a: uts, b: bts) at a given time frequency, either for a single forecast horizon or across them
$$
\text{AvgRelA}^{[k],h}_{j} = \left( \prod_{i=1}^{n} r^{[k],h}_{i,j} \right)^{\frac{1}{n}}, \quad j = 0, \ldots, J, \; k \in \mathcal{K}, \; h = 1, \ldots, M_k,
$$
$$
\text{AvgRelA}^{[k],h}_{a,j} = \left( \prod_{i=1}^{n_a} r^{[k],h}_{i,j} \right)^{\frac{1}{n_a}}, \quad j = 0, \ldots, J, \; k \in \mathcal{K},
$$
$$
\text{AvgRelA}^{[k],h}_{b,j} = \left( \prod_{i=n_a+1}^{n} r^{[k],h}_{i,j} \right)^{\frac{1}{n_b}}, \quad j = 0, \ldots, J, \; k \in \mathcal{K},
$$
$$
\text{AvgRelA}^{[k]}_{j} = \left( \prod_{i=1}^{n} \prod_{h=1}^{M_k} r^{[k],h}_{i,j} \right)^{\frac{1}{n M_k}}, \quad j = 0, \ldots, J, \; k \in \mathcal{K},
$$
$$
\text{AvgRelA}^{[k]}_{a,j} = \left( \prod_{i=1}^{n_a} \prod_{h=1}^{M_k} r^{[k],h}_{i,j} \right)^{\frac{1}{n_a M_k}}, \quad j = 0, \ldots, J, \; k \in \mathcal{K},
$$
$$
\text{AvgRelA}^{[k]}_{b,j} = \left( \prod_{i=n_a+1}^{n} \prod_{h=1}^{M_k} r^{[k],h}_{i,j} \right)^{\frac{1}{n_b M_k}}, \quad j = 0, \ldots, J, \; k \in \mathcal{K}.
$$

Average relative accuracy indices for a single variable or for a group of variables (all, a: uts, b: bts), across all time frequencies and forecast horizons
$$
\text{AvgRelMSE}_{i,j} = \left( \prod_{k \in \mathcal{K}} \prod_{h=1}^{M_k} \text{RelMSE}^{[k],h}_{i,j} \right)^{\frac{1}{k^*+m}}, \quad i = 1, \ldots, n, \; j = 0, \ldots, J,
$$
$$
\text{AvgRelMSE}_{j} = \left( \prod_{i=1}^{n} \prod_{k \in \mathcal{K}} \prod_{h=1}^{M_k} \text{RelMSE}^{[k],h}_{i,j} \right)^{\frac{1}{n(k^*+m)}}, \quad j = 0, \ldots, J,
$$
$$
\text{AvgRelMSE}_{a,j} = \left( \prod_{i=1}^{n_a} \prod_{k \in \mathcal{K}} \prod_{h=1}^{M_k} \text{RelMSE}^{[k],h}_{i,j} \right)^{\frac{1}{n_a(k^*+m)}}, \quad j = 0, \ldots, J,
$$
$$
\text{AvgRelMSE}_{b,j} = \left( \prod_{i=n_a+1}^{n} \prod_{k \in \mathcal{K}} \prod_{h=1}^{M_k} \text{RelMSE}^{[k],h}_{i,j} \right)^{\frac{1}{n_b(k^*+m)}}, \quad j = 0, \ldots, J.
$$

A.8 Forecast reconciliation experiment: supplementary tables and graphs

A.8.1 Selected forecast reconciliation procedures: performance results using AvgRelMAE

Table A.1: AvgRelMAE at any temporal aggregation level and any forecast horizon.
(Columns: Quarterly h = 1, 2, 3, 4, 1-4; Semi-annual h = 1, 2, 1-2; Annual h = 1; All. Incomplete rows are marked with …)

all 95 series
base          1       1       1       1       1       1       1       1       1       1
cs-shr        0.9769  0.9842  0.9863  0.9893  0.9842  0.9733  0.9871  0.9802  0.9840  0.9830
t-wlsv        0.9997  0.9988  0.9967  0.9953  0.9976  0.9212  0.9617  0.9412  0.8714  0.9624
t-acov        0.9893  0.9951  …
ite-sar1-shr  0.9751  0.9819  …
32 upper series
base          1       1       1       1       1       1       1       1       1       1
cs-shr        0.9484  …
ite-acov-shr  0.9528  0.9674  0.9627  0.9684  0.9628  …

63 bottom series
base          1       1       1       1       1       1       1       1       1       1
cs-shr        0.9917  0.9953  …
tcs-sar1-shr  0.9930  0.9891  0.9875  0.9844  0.9885  0.9117  0.9559  0.9335  0.8704  0.9550
ite-wlsv-shr  0.9898  0.9874  0.9869  …

Figure A.4: Top panel: Average Relative MAE across all series and forecast horizons, by frequency of observation. Bottom panel: Rankings by frequency of observation and forecast horizon.

Figure A.5: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MAE mean rank (i) across all time frequencies and forecast horizons (top), and (ii) for one-step-ahead quarterly forecasts (bottom). In both panels the critical distance is 2.04 and the Friedman test rejects equal ranks (p ≈ 0).
Top panel mean ranks: ite-sar1-shr 5.14; ite-wlsv-shr 5.14; ite-acov-shr 5.55; tcs-sar1-shr 5.68; tcs-acov-shr 5.71; kah-wlsv-shr 5.72; oct-acov 6.96; oct-wlsv 6.97; oct-bdshr 7.39; t-acov 8.45; t-sar1 8.56; t-wlsv 8.57; cs-shr 12.11; base 13.07.
Bottom panel mean ranks: ite-wlsv-shr 5.52; ite-sar1-shr 5.54; cs-shr 5.93; ite-acov-shr 6.31; kah-wlsv-shr 6.61; tcs-sar1-shr 6.61; tcs-acov-shr 7.33; oct-wlsv 7.41; oct-acov 7.49; oct-bdshr 8.43; base 8.92; t-acov 9.34; t-sar1 9.78; t-wlsv 9.8.

Figure A.6: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MSE mean rank for two-step-ahead (top) and three-step-ahead (bottom) quarterly forecasts. Critical distance = 2.04; Friedman test p ≈ 0.
Top panel mean ranks: ite-wlsv-shr 6.25; ite-acov-shr 6.35; tcs-acov-shr 6.38; ite-sar1-shr 6.49; oct-acov 6.71; tcs-sar1-shr 6.72; kah-wlsv-shr 6.81; cs-shr 7.25; oct-wlsv 7.34; oct-bdshr 8.4; base 8.87; t-acov 8.99; t-wlsv 9.15; t-sar1 9.29.
Bottom panel mean ranks: ite-wlsv-shr 6.09; ite-sar1-shr 6.27; kah-wlsv-shr 6.45; tcs-sar1-shr 6.47; oct-bdshr 6.72; oct-wlsv 6.81; ite-acov-shr 6.95; tcs-acov-shr 7.27; oct-acov 7.89; t-wlsv 8.56; cs-shr 8.58; t-sar1 8.73; base 9.08; t-acov 9.12.

Figure A.7: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MSE mean rank for four-step-ahead (top) and one-to-four-step-ahead (bottom) quarterly forecasts. Critical distance = 2.04; Friedman test p ≈ 0.
Top panel mean ranks: tcs-sar1-shr 5.94; kah-wlsv-shr 6.03; ite-wlsv-shr 6.28; ite-sar1-shr 6.41; oct-wlsv 6.51; tcs-acov-shr 7.11; oct-bdshr 7.39; ite-acov-shr 7.68; oct-acov 8.15; t-wlsv 8.19; t-sar1 8.29; cs-shr 8.68; t-acov 8.79; base 9.55.
Bottom panel mean ranks: ite-wlsv-shr 5.62; ite-sar1-shr 5.88; ite-acov-shr 6.17; tcs-sar1-shr 6.45; kah-wlsv-shr 6.49; tcs-acov-shr 6.67; oct-acov 7.29; oct-wlsv 7.33; cs-shr 7.68; oct-bdshr 8.33; t-wlsv 9.09; t-sar1 9.23; t-acov 9.31; base 9.44.

Figure A.8: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MSE mean rank for one-step-ahead (top) and two-step-ahead (bottom) six-months forecasts. Critical distance = 2.04; Friedman test p ≈ 0.
Top panel mean ranks: ite-acov-shr 5.03; ite-wlsv-shr 5.29; ite-sar1-shr 5.39; tcs-acov-shr 5.49; oct-acov 5.68; kah-wlsv-shr 6.17; tcs-sar1-shr 6.31; oct-wlsv 6.98; t-acov 7.92; oct-bdshr 8.08; t-wlsv 8.65; t-sar1 8.78; cs-shr 11.91; base 13.32.
Bottom panel mean ranks: kah-wlsv-shr 5.97; tcs-sar1-shr 6.08; ite-wlsv-shr 6.18; tcs-acov-shr 6.39; ite-sar1-shr 6.42; ite-acov-shr 6.62; oct-wlsv 6.66; oct-bdshr 6.89; oct-acov 7.63; t-wlsv 7.87; t-sar1 7.98; t-acov 8.34; cs-shr 10.57; base 11.39.

Figure A.9: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MSE mean rank for one-to-two-step-ahead (top) six-months forecasts and one-step-ahead twelve-months forecasts (bottom). Critical distance = 2.04; Friedman test p ≈ 0.
Top panel mean ranks: ite-acov-shr 5.24; ite-wlsv-shr 5.48; tcs-acov-shr 5.58; ite-sar1-shr 5.78; oct-acov 6.19; kah-wlsv-shr 6.25; tcs-sar1-shr 6.41; oct-wlsv 6.87; t-acov 7.83; oct-bdshr 7.96; t-wlsv 8.25; t-sar1 8.35; cs-shr 11.72; base 13.08.
Bottom panel mean ranks: ite-acov-shr 5.72; ite-wlsv-shr 5.79; ite-sar1-shr 6; tcs-acov-shr 6.15; tcs-sar1-shr 6.15; kah-wlsv-shr 6.21; oct-acov 6.52; oct-wlsv 6.78; oct-bdshr 7.21; t-wlsv 7.84; t-acov 7.89; t-sar1 7.98; cs-shr 11.98; base 12.79.

Figure A.10: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MAE mean rank for two-step-ahead (top) and three-step-ahead (bottom) quarterly forecasts. Critical distance = 2.04; Friedman test p ≈ 0.
Top panel mean ranks: tcs-acov-shr 5.75; ite-wlsv-shr 6.18; ite-acov-shr 6.43; cs-shr 6.44; kah-wlsv-shr 6.46; tcs-sar1-shr 6.47; ite-sar1-shr 6.47; oct-wlsv 7.55; oct-acov 7.61; oct-bdshr 8.44; base 8.75; t-acov 9.35; t-wlsv 9.52; t-sar1 9.58.
Bottom panel mean ranks: ite-sar1-shr 5.81; ite-wlsv-shr 5.93; tcs-sar1-shr 6.23; kah-wlsv-shr 6.37; oct-bdshr 6.72; ite-acov-shr 6.76; tcs-acov-shr 7.2; oct-wlsv 7.37; cs-shr 7.99; oct-acov 8.28; t-sar1 8.81; t-wlsv 8.94; base 9.08; t-acov 9.52.

Figure A.11: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MAE mean rank for four-step-ahead (top) and one-to-four-step-ahead (bottom) quarterly forecasts. Critical distance = 2.04; Friedman test p ≈ 0.
Top panel mean ranks: ite-sar1-shr 5.69; ite-wlsv-shr 5.74; tcs-sar1-shr 5.97; kah-wlsv-shr 6.02; tcs-acov-shr 6.79; ite-acov-shr 6.93; oct-wlsv 7.38; oct-bdshr 7.41; cs-shr 8.09; oct-acov 8.45; t-wlsv 8.79; t-sar1 8.92; base 9.24; t-acov 9.58.
Bottom panel mean ranks: ite-wlsv-shr 5.49; ite-sar1-shr 5.55; tcs-sar1-shr 6.2; ite-acov-shr 6.24; kah-wlsv-shr 6.28; tcs-acov-shr 6.47; cs-shr 7.41; oct-wlsv 7.49; oct-acov 7.89; oct-bdshr 8.07; base 9.36; t-sar1 9.41; t-wlsv 9.48; t-acov 9.63.

Figure A.12: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MAE mean rank for one-step-ahead (top) and two-step-ahead (bottom) six-months forecasts. Critical distance = 2.04; Friedman test p ≈ 0.
Top panel mean ranks: ite-acov-shr 5.29; ite-wlsv-shr 5.31; tcs-acov-shr 5.41; ite-sar1-shr 5.59; kah-wlsv-shr 5.88; tcs-sar1-shr 5.92; oct-acov 6.14; oct-wlsv 7.08; oct-bdshr 7.82; t-acov 8.04; t-wlsv 8.63; t-sar1 8.68; cs-shr 11.92; base 13.28.
Bottom panel mean ranks: kah-wlsv-shr 5.75; tcs-sar1-shr 5.81; ite-wlsv-shr 5.99; tcs-acov-shr 6.02; ite-sar1-shr 6.12; ite-acov-shr 6.2; oct-bdshr 6.88; oct-wlsv 7.32; oct-acov 7.84; t-sar1 8.22; t-acov 8.24; t-wlsv 8.34; cs-shr 10.67; base 11.6.

Figure A.13: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MAE mean rank for one-to-two-step-ahead (top) six-months forecasts and one-step-ahead twelve-months forecasts (bottom). Critical distance = 2.04; Friedman test p ≈ 0.
Top panel mean ranks: tcs-acov-shr 5.27; ite-acov-shr 5.28; ite-sar1-shr 5.47; ite-wlsv-shr 5.47; kah-wlsv-shr 5.76; tcs-sar1-shr 5.77; oct-acov 6.6; oct-wlsv 7.19; oct-bdshr 7.57; t-acov 8.02; t-wlsv 8.71; t-sar1 8.82; cs-shr 12.04; base 13.02.
Bottom panel mean ranks: ite-wlsv-shr 5.42; ite-acov-shr 5.53; ite-sar1-shr 5.68; tcs-sar1-shr 5.8; kah-wlsv-shr 5.82; tcs-acov-shr 5.82; oct-acov 6.64; oct-wlsv 7.23; oct-bdshr 7.36; t-acov 8.2; t-sar1 8.22; t-wlsv 8.46; cs-shr 12.16; base 12.65.

Table A.2: AvgRelMSE at any temporal aggregation level and any forecast horizon for all 95 time series and selected reconciliation procedures.
(Columns: Quarterly h = 1, 2, 3, 4, 1-4; Semi-annual h = 1, 2, 1-2; Annual h = 1; All. Incomplete rows are marked with …)

cs-shr
Gdp                     0.9740 0.9397 0.9028 0.8924 0.9267 0.8382 0.8713 0.8546 0.7116 0.8719
Tfi                     0.8316 0.9002 0.8769 0.8335 0.8600 0.8232 0.8760 0.8492 0.7162 0.8348
TfiGos                  0.9170 0.8834 0.9140 0.9008 0.9037 …
TfiGosCop               0.8994 0.9722 …
GneDfdFceHfcMis         …
GneDfdFceHfcTpt         0.9166 0.9311 0.9593 0.9482 0.9387 0.8779 0.8693 0.8736 …
TfiGosCopNfnPvt         0.9149 0.9766 0.9871 …
TfiCoeWns               0.9782 …
TfiCoeEsc               0.9635 0.9746 0.9866 0.9909 0.9788 0.9925 …
GneDfdFceGvtNatNdf      0.9019 0.8936 0.9145 …
GneDfdGfcPubGvtNatDef   0.9251 …
GneDfdGfcPubPcpCmw      …
GneDfdGfcPubPcpSnl      0.9308 0.9414 0.8722 0.8440 0.8962 0.8914 0.8616 0.8764 0.8955 0.8904
GneDfdGfcPvtTdwNnu      0.9714 0.9673 0.9850 0.9995 0.9807 0.9027 0.9781 0.9396 …
GneDfdGfcPvtPbiIprRnd   …
GneDfdGfcPvtPbiIprMnp   …
GneDfdGfcPvtPbiIprCom   0.9789 0.9792 0.9906 0.9954 0.9860 0.9327 0.9847 0.9584 0.9131 0.9673
GneDfdGfcPvtPbiIprArt   …
GneDfdGfcPvtPbiNdcNec   0.8576 0.8409 0.8613 0.8666 0.8566 0.8683 0.9379 0.9024 0.7908 0.8596
GneDfdGfcPvtPbiNdcSha   0.9030 0.9850 …
GneDfdGfcPvtPbiNdmNew   …
GneDfdGfcPvtOtc         0.9625 …
GneDfdFceHfcAbtCig      0.8864 0.9089 0.9201 0.9864 0.9247 0.9083 0.9672 0.9373 0.9889 0.9372
GneDfdFceHfcMisOgd      0.9932 0.9826 0.9538 0.9366 0.9663 …
GneDfdFceHfcTptOvh      0.9588 0.9439 0.9461 0.9333 0.9455 …
GneDfdFceHfcFheApp      …
GneDfdFceHfcHweRnt      …
GneDfdFceHfcCnf         0.9780 …
GneDfdFceHfcRnc         …
GneDfdFceHfcCom         …
GneCiiPnfMin            0.9405 0.9335 …
GneCiiPfm               0.9696 0.9319 0.9780 …
ExpMinImp               0.9393 0.8813 0.9690 0.9741 0.9402 0.9531 …

t-acov
Gdp                     …
TfiGosGvt               0.4874 0.7062 …

kah-wlsv-shr
Gdp                     …
GneDfd                  0.9112 …
GneDfdFceHfcMis         0.9363 0.9543 …
TfiCoeWns               …
GneDfdGfcPubGvtNatDef   0.9778 …
GneDfdGfcPubPcpCmw      0.9434 0.9811 …
GneDfdGfcPvtPbiNdcNec   0.8632 0.8604 0.8526 0.8464 0.8556 0.7980 0.9080 0.8512 0.492 0.7894
GneDfdGfcPvtPbiNdcSha   0.9393 0.9627 …
GneDfdFceHfcHweRnt      …
GneCiiPfm               0.8704 0.8621 …
ExpMinImp               …

ite-acov-shr
Gdp                     …
GneDfdFceHfcMis         0.9939 0.9567 …
TfiCoeWns               …
GneDfdGfcPubPcpCmw      0.9304 …
GneDfdGfcPvtPbiIprRnd   0.7186 0.8211 …
GneDfdGfcPvtPbiNdcNec   0.8956 0.8537 0.8668 0.8709 0.8716 0.8109 0.9229 0.8651 0.5003 0.8034
GneDfdGfcPvtPbiNdcSha   0.9375 …
GneDfdFceHfcHweRnt      …
GneCiiPfm               0.8413 0.8623 …
ExpMinImp               0.9817 0.8665 0.9291 0.9253 0.9247 0.7558 0.9985 0.8687 0.9526 0.9122

oct-acov
Gdp                     …
GneDfdFceHfcFhe         0.9782 0.9749 0.9617 0.9142 0.9569 0.8415 0.7836 0.8121 0.7530 0.8823
GneDfdFceHfcHwe         0.7829 0.8061 0.8020 0.8759 0.8160 0.5803 0.7330 0.6522 0.5549 0.7244
GneDfdGfcPubGvtNat      0.8693 0.8991 0.8749 0.8987 0.8854 0.9940 0.9754 0.9846 0.8661 0.9098
GneDfdGfcPvtPbiIpr      0.7368 0.7365 0.8195 0.8122 0.7752 0.629 0.7903 0.7051 0.4687 0.7022
GneDfdGfcPvtPbiNdc      0.9218 0.8003 0.8633 0.8812 0.8655 0.7415 0.8047 0.7725 0.5741 0.7901
GneDfdGfcPvtPbiNdm      0.9173 0.9770 0.9587 0.9495 0.9504 0.9245 0.9733 0.9486 0.7686 0.9215
TfiGosCopNfnPub         …
TfiCoeWns               …
GneDfdGfcPubGvtNatDef   0.9315 …
GneDfdGfcPvtPbiIprCom   0.9976 0.8462 …
GneDfdGfcPvtPbiNdcNec   0.8851 0.9479 0.9126 0.8933 0.9094 0.8731 0.9312 0.9017 0.5098 0.8352
GneDfdGfcPvtPbiNdcSha   0.9358 0.9322 …
GneCiiPfm               0.8615 0.8361 …

Table A.3: AvgRelMAE at any temporal aggregation level and any forecast horizon for all 95 time series and selected reconciliation procedures.
(Columns as in Table A.2; incomplete rows are marked with …)

cs-shr
Gdp                     …
GneDfd                  0.9952 0.9952 …
GneCii                  0.9950 …
GneDfdFceHfcMis         …
GneDfdFceHfcTpt         0.9312 0.9336 0.9345 0.9201 0.9298 0.8731 0.9258 0.8991 0.9612 0.9253
GneDfdFceHfcHcr         0.9848 0.9764 0.9839 0.9814 0.9816 0.9473 …
TfiGosCopNfnPvt         0.9961 0.9738 0.9851 0.9951 0.9875 0.9981 0.9902 0.9941 0.9622 0.9857
TfiGosCopFin            …
TfiGosGvt               …
TfiCoeWns               0.9607 0.9761 0.9613 0.9663 0.9661 0.9885 0.9824 0.9855 0.9578 0.9704
TfiCoeEsc               0.9668 0.9596 0.9669 0.9749 0.9670 0.9921 …
Sdi                     …
GneDfdFceGvtNatNdf      0.9545 0.9463 0.9565 0.9726 0.9574 0.9527 0.9429 0.9478 0.9275 0.9503
GneDfdFceGvtNatDef      …
GneDfdGfcPubPcpCmw      0.9841 …
GneDfdGfcPubPcpSnl      0.9130 0.9451 0.9185 0.9009 0.9192 0.9268 0.9239 0.9254 0.9294 0.9224
GneDfdGfcPvtTdwNnu      0.9738 0.9968 …
GneDfdGfcPvtPbiIprCom   0.9767 0.9865 0.9878 0.9937 0.9862 0.9805 0.9841 0.9823 0.9498 0.9798
GneDfdGfcPvtPbiIprArt   0.9983 0.9924 0.9968 0.9902 0.9944 …
GneDfdGfcPvtPbiNdcNec   0.9535 0.9619 0.9639 0.9509 0.9575 …
GneDfdGfcPvtPbiNdmNew   …
GneDfdGfcPvtPbiCbr      …
GneDfdGfcPvtOtc         0.9948 …
GneDfdFceHfcAbtCig      0.9480 0.9524 0.9609 0.9875 0.9621 0.9481 0.9702 0.9591 …
GneDfdFceHfcTptOvh      0.9909 0.9799 0.9655 0.9624 0.9746 …
GneDfdFceHfcHltMed      …
GneDfdFceHfcFheTls      0.9949 …
GneDfdFceHfcFheApp      …
GneDfdFceHfcHweRnt      …
GneDfdFceHfcCnf         …
GneDfdFceHfcRnc         0.9870 …
GneCiiPnfMin            0.9887 0.9692 0.9822 …
GneCiiPnfMan            0.9925 0.9595 …
GneCiiPfm               0.9820 0.9804 0.9993 0.9917 0.9883 0.9737 …
ExpMinImp               0.9521 0.9408 0.9716 0.9735 0.9594 0.9274 0.9920 0.9592 …

t-acov
Gdp                     …
TfiGosGvt               0.7047 0.8383 …
GneDfdGfcPvtPbiNdmNew   0.9996 …
GneDfdGfcPvtPbiCbr      0.8780 0.9541 0.9540 0.9216 0.9264 0.7348 0.8803 0.8043 0.7498 0.8632
GneDfdGfcPvtOtc         0.9980 0.9829 0.9671 0.9836 0.9829 0.8562 0.9890 0.9202 0.8118 0.9385
GneDfdFceHfcAbtAlc      …

kah-wlsv-shr
Gdp                     …
GneDfd                  …
TfiCoeWns               …
GneDfdFceGvtNatNdf      0.9127 0.9511 0.9237 0.9385 0.9314 0.9121 0.9409 0.9264 0.8655 0.9203
GneDfdFceGvtNatDef      …
GneDfdGfcPvtPbiNdcNec   0.9799 0.9740 0.9759 0.9379 0.9668 …
GneDfdFceHfcHweRnt      …
GneCiiPfm               0.9531 0.9136 …
ExpMinImp               0.9775 0.9254 0.9352 0.9435 0.9452 0.8551 0.9516 0.9021 0.8757 0.9226

ite-acov-shr
Gdp                     …
TfiCoeWns               0.9691 0.9719 …
GneDfdFceGvtNatNdf      0.9121 0.9305 0.9169 0.9406 0.9250 0.9047 0.9425 0.9234 0.8622 0.9153
GneDfdFceGvtNatDef      …
GneDfdGfcPubPcpCmw      0.9490 …
GneDfdGfcPvtPbiIprRnd   0.8281 0.9210 …
GneDfdGfcPvtPbiNdmNew   0.9953 0.9995 0.9837 0.9783 0.9892 0.9125 0.9716 0.9416 0.8667 0.9571
GneDfdGfcPvtPbiNdmSha   …
GneDfdGfcPvtPbiCbr      0.8936 0.9616 0.9512 0.9088 0.9284 0.7420 0.8748 0.8057 0.7479 0.8644
GneDfdGfcPvtOtc         0.9913 0.9886 0.9669 0.9878 0.9836 0.8668 …
GneDfdFceHfcHweRnt      …
GneCiiPfm               0.9658 0.9413 …
ExpMinImp               0.9767 0.9343 0.9321 0.9457 0.9470 0.8567 0.9550 0.9045 0.8722 0.9238

oct-acov
Gdp                     …
GneDfdFceHfcFhe         0.9850 …
TfiCoeWns               …
GneDfdFceGvtNatNdf      0.9341 0.9588 0.9289 0.9537 0.9438 0.9193 0.9471 0.9331 0.8734 0.9304
GneDfdFceGvtNatDef      …
GneDfdGfcPvtPbiNdcNec   0.9837 …
GneDfdGfcPvtPbiNdmNew   0.9894 …
GneDfdGfcPvtPbiCbr      0.8738 0.9485 0.9550 0.9182 0.9233 0.7302 0.8794 0.8013 0.748 0.8604
GneDfdGfcPvtOtc         0.9914 …
GneCiiPfm               0.9783 0.9406 …

Table A.4: Variables, series IDs and their descriptions for the Expenditure Approach

Variable                Series ID   Description
Gdp                     A2302467A   GDP - Gross Domestic Product
Sde                     A2302566J   Statistical Discrepancy (E)
Exp                     A2302564C   Exports of goods and services
Imp                     A2302565F   Imports of goods and services
Gne                     A2302563A   Gross national exp.
GneDfdFceGvtNatDef      A2302523J   Gen. gov. - National; Final consumption exp. - Defence
GneDfdFceGvtNatNdf      A2302524K   Gen. gov. - National; Final consumption exp. - Non-defence
GneDfdFceGvtNat         A2302525L   Gen. gov. - National; Final consumption exp.
GneDfdFceGvtSnl         A2302526R   Gen. gov. - State and local; Final consumption exp.
GneDfdFceGvt            A2302527T   Gen. gov.; Final consumption exp.
GneDfdFce               A2302529W   All sectors; Final consumption exp.
GneDfdGfcPvtTdwNnu      A2302543T   Pvt.; Gross fixed capital formation (GFCF)
GneDfdGfcPvtTdwAna      A2302544V   Pvt.; GFCF - Dwellings - Alterations and additions
GneDfdGfcPvtTdw         A2302545W   Pvt.; GFCF - Dwellings - Total
GneDfdGfcPvtOtc         A2302546X   Pvt.; GFCF - Ownership transfer costs
GneDfdGfcPvtPbiNdcNbd   A2302533L   Pvt.; GFCF - Non-dwelling construction - New building
GneDfdGfcPvtPbiNdcNec   A2302534R   Pvt.; GFCF - Non-dwelling construction - New engineering construction
GneDfdGfcPvtPbiNdcSha   A2302535T   Pvt.; GFCF - Non-dwelling construction - Net purchase of second hand assets
GneDfdGfcPvtPbiNdc      A2302536V   Pvt.; GFCF - Non-dwelling construction - Total
GneDfdGfcPvtPbiNdmNew   A2302530F   Pvt.; GFCF - Machinery and equipment - New
GneDfdGfcPvtPbiNdmSha   A2302531J   Pvt.; GFCF - Machinery and equipment - Net purchase of second hand assets
GneDfdGfcPvtPbiNdm      A2302532K   Pvt.; GFCF - Machinery and equipment - Total
GneDfdGfcPvtPbiCbr      A2716219R   Pvt.; GFCF - Cultivated biological resources
GneDfdGfcPvtPbiIprRnd   A2716221A   Pvt.; GFCF - Intellectual property products - Research and development
GneDfdGfcPvtPbiIprMnp   A2302539A   Pvt.; GFCF - Intellectual property products - Mineral and petroleum exploration
GneDfdGfcPvtPbiIprCom   A2302538X   Pvt.; GFCF - Intellectual property products - Computer software
GneDfdGfcPvtPbiIprArt   A2302540K   Pvt.; GFCF - Intellectual property products - Artistic originals
GneDfdGfcPvtPbiIpr      A2716220X   Pvt.; GFCF - Intellectual property products - Total
GneDfdGfcPvtPbi         A2302542R   Pvt.; GFCF - Total private business investment
GneDfdGfcPvt            A2302547A   Pvt.; GFCF
GneDfdGfcPubPcpCmw      A2302548C   Plc. corporations - Commonwealth; GFCF
GneDfdGfcPubPcpSnl      A2302549F   Plc. corporations - State and local; GFCF
GneDfdGfcPubPcp         A2302550R   Plc. corporations; GFCF Total
GneDfdGfcPubGvtNatDef   A2302551T   Gen. gov. - National; GFCF - Defence
GneDfdGfcPubGvtNatNdf   A2302552V   Gen. gov. - National; GFCF - Non-defence
GneDfdGfcPubGvtNat      A2302553W   Gen. gov. - National; GFCF Total
GneDfdGfcPubGvtSnl      A2302554X   Gen. gov. - State and local; GFCF
GneDfdGfcPubGvt         A2302555A   Gen. gov.; GFCF
GneDfdGfcPub            A2302556C   Plc.; GFCF
GneDfdGfc               A2302557F   All sectors; GFCF
Source: Athanasopoulos et al. (2019).

Table A.5: Variables, series IDs and their descriptions for Household Final Consumption - Expenditure Approach

Variable                Series ID   Description
GneDfdHfc               A2302254W   Household Final Consumption Expenditure
GneDfdFceHfcFud         A2302237V   Food
GneDfdFceHfcAbt         A3605816F   Alcoholic beverages and tobacco
GneDfdFceHfcAbtCig      A2302238W   Cigarettes and tobacco
GneDfdFceHfcAbtAlc      A2302239X   Alcoholic beverages
GneDfdFceHfcCnf         A2302240J   Clothing and footwear
GneDfdFceHfcHwe         A3605680F   Housing, water, electricity, gas and other fuels
GneDfdFceHfcHweRnt      A3605681J   Actual and imputed rent for housing
GneDfdFceHfcHweWsc      A3605682K   Water and sewerage charges
GneDfdFceHfcHweEgf      A2302242L   Electricity, gas and other fuel
GneDfdFceHfcFhe         A2302243R   Furnishings and household equipment
GneDfdFceHfcFheFnt      A3605683L   Furniture, floor coverings and household goods
GneDfdFceHfcFheApp      A3605684R   Household appliances
GneDfdFceHfcFheTls      A3605685T   Household tools
GneDfdFceHfcHlt         A2302244T   Health
GneDfdFceHfcHltMed      A3605686V   Medicines, medical aids and therapeutic appliances
GneDfdFceHfcHltHsv      A3605687W   Total health services
GneDfdFceHfcTpt         A3605688X   Transport
GneDfdFceHfcTptPvh      A2302245V   Purchase of vehicles
GneDfdFceHfcTptOvh      A2302246W   Operation of vehicles
GneDfdFceHfcTptTsv      A2302247X   Transport services
GneDfdFceHfcCom         A2302248A   Communications
GneDfdFceHfcRnc         A2302249C   Recreation and culture
GneDfdFceHfcEdc         A2302250L   Education services
GneDfdFceHfcHcr         A2302251R   Hotels, cafes and restaurants
GneDfdFceHfcHcrCsv      A3605694V   Catering services
GneDfdFceHfcHcrAsv      A3605695W   Accommodation services
GneDfdFceHfcMis         A3605696X   Miscellaneous goods and services
GneDfdFceHfcMisOgd      A3605697A   Other goods
GneDfdFceHfcMisIfs      A2302252T   Insurance and other financial services
GneDfdFceHfcMisOsv      A3606485T   Other services
Source: Athanasopoulos et al. (2019).

Table A.6: Variables, series IDs and their descriptions for Changes in Inventories - Expenditure Approach

Variable                Series ID   Description
GneCii                  A2302562X   Changes in Inventories
GneCiiPfm               A2302560V   Farm
GneCiiPba               A2302561W   Public authorities
GneCiiPnf               A2302559K   Private; Non-farm Total
GneCiiPnfMin            A83722619L  Private; Mining (B)
GneCiiPnfMan            A3348511X   Private; Manufacturing (C)
GneCiiPnfWht            A3348512A   Private; Wholesale trade (F)
GneCiiPnfRet            A3348513C   Private; Retail trade (G)
GneCiiPnfOnf            A2302273C   Private; Non-farm; Other non-farm industries
Source: Athanasopoulos et al. (2019).
Table A.7 : Variables, series IDs and their descriptions for the Income approach
Variable Series ID Description
Sdi A2302413V Statistical discrepancy (I)Tsi A2302412T Taxes less subsidies (I)TfiCoeWns A2302399K Compensation of employees; Wages and salariesTfiCoeEsc A2302400J Compensation of employees; Employers’ social contributionsTfiCoe A2302401K Compensation of employeesTfiGosCopNfnPvt A2323369L Private non-financial corporations; Gross operating surplusTfiGosCopNfnPub A2302403R Public non-financial corporations; Gross operating surplusTfiGosCopNfn A2302404T Non-financial corporations; Gross operating surplusTfiGosCopFin A2302405V Financial corporations; Gross operating surplusTfiGosCop A2302406W Total corporations; Gross operating surplusTfiGosGvt A2298711F General government; Gross operating surplusTfiGosDwl A2302408A Dwellings owned by persons; Gross operating surplusTfiGos A2302409C All sectors; Gross operating surplusTfiGmi A2302410L Gross mixed incomeTfi A2302411R Total factor incomeSource: Athanasopoulos et al. (2019). .8.2 Cross-sectional & temporal reconciliation procedures able A.8 : AvgRelMSE at any temporal aggregation level and any forecast horizon. Quarterly Semi-annual Annual AllProcedure all 95 series base 1 1 1 1 1 1 1 1 1 1cs-ols cs-struc cs-wls 0.9619 t-struc t-strar1
[Table A.8 entries: AvgRelMSE of the base forecasts and of the cross-sectional (cs-ols, cs-struc, cs-wls, cs-shr) and temporal (t-bu, t-ols, t-struc, t-strar1, t-wlsv, t-wlsh, t-sar1, t-har1, t-acov, t-shr) reconciliation procedures, reported separately for all 95 series, the 32 upper series and the 63 bottom series, by temporal aggregation level (Quarterly, Semi-annual, Annual, All) and forecast horizon.]

Table A.9: AvgRelMAE at any temporal aggregation level and any forecast horizon.
[Table A.9 entries: AvgRelMAE of the same procedures, reported separately for all 95 series, the 32 upper series and the 63 bottom series, by temporal aggregation level and forecast horizon.]

Figure A.14: Top panel: Average Relative MSE across all series and forecast horizons, by frequency of observation. Bottom panel: Rankings by frequency of observation and forecast horizon.
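The AvgRelMSE and AvgRelMAE indices reported in these tables are relative to the base forecasts, which is why the base row equals 1 throughout. A minimal sketch, under the assumption that the index is the geometric mean across series of each procedure's MSE divided by the base forecasts' MSE (the function name and simulated data are illustrative, not the paper's):

```python
import numpy as np

def avg_rel_mse(err_rec, err_base):
    """Geometric mean across series of relative MSEs.

    err_rec, err_base: arrays of shape (n_series, n_forecasts) holding the
    forecast errors of the reconciled and base forecasts, respectively.
    """
    mse_rec = np.mean(np.asarray(err_rec) ** 2, axis=1)    # MSE per series
    mse_base = np.mean(np.asarray(err_base) ** 2, axis=1)
    rel = mse_rec / mse_base                               # relative MSE
    return float(np.exp(np.mean(np.log(rel))))             # geometric mean

# Toy check: if every reconciled error is 0.9 times the base error,
# the relative MSE is 0.81 for each series, so the index is 0.81.
rng = np.random.default_rng(0)
base_err = rng.normal(size=(95, 4))
rec_err = 0.9 * base_err
print(round(avg_rel_mse(rec_err, base_err), 4))  # 0.81
```

A value below 1 thus indicates that, on (geometric) average, the reconciled forecasts improve on the base forecasts.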
Figure A.15: Top panel: Average Relative MAE across all series and forecast horizons, by frequency of observation. Bottom panel: Rankings by frequency of observation and forecast horizon.

[Figure A.16, top panel, MSE mean ranks across all time frequencies and forecast horizons: t-wlsv 4.85, t-sar1 5.07, t-wlsh 5.22, t-acov 5.32, t-har1 5.46, t-bu 6.84, t-shr 6.89, cs-shr 8.43, t-struc 8.54, t-strar1 8.55, cs-wls 8.57, cs-struc 10.87, cs-ols 11.43, t-ols 11.77, base 12.18. Critical distance = 2.2; Friedman test p-value ≈ 0 (Ha: different).]
[Figure A.16, bottom panel, MSE mean ranks for one-step-ahead quarterly forecasts: cs-shr 4.89, cs-wls 4.89, t-har1 6.95, t-wlsh 6.95, t-acov 7.13, t-sar1 7.24, t-wlsv 7.37, t-bu 7.52, base 7.52, cs-struc 7.79, cs-ols 8.27, t-shr 8.34, t-strar1 10.92, t-struc 10.99, t-ols 13.23. Critical distance = 2.2; Friedman test p-value ≈ 0 (Ha: different).]
Figure A.16: Nemenyi test results at 5% significance level for all 95 series. The cross-sectional and temporal reconciliation procedures are sorted vertically according to the MSE mean rank (i) across all time frequencies and forecast horizons (top), and (ii) for one-step-ahead quarterly forecasts (bottom).

[Figure A.17, top panel, MAE mean ranks across all time frequencies and forecast horizons: t-sar1 5.17, t-wlsv 5.19, t-har1 5.31, t-wlsh 5.43, t-acov 5.51, t-shr 6.09, t-bu 6.48, cs-shr 7.92, cs-wls 8.24, t-struc 9.06, t-strar1 9.08, cs-struc 11.09, cs-ols 11.19, base 12, t-ols 12.23. Critical distance = 2.2; Friedman test p-value ≈ 0 (Ha: different).]
[Figure A.17, bottom panel, MAE mean ranks for one-step-ahead quarterly forecasts: cs-shr 4.8, cs-wls 4.82, t-har1 6.64, t-wlsh 6.73, t-sar1 7.37, t-bu 7.41, base 7.41, t-wlsv 7.41, t-acov 7.52, cs-struc 8.32, t-shr 8.52, cs-ols 8.84, t-struc 10.54, t-strar1 10.55, t-ols 13.15. Critical distance = 2.2; Friedman test p-value ≈ 0 (Ha: different).]
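The Nemenyi plots rest on two quantities: the mean rank of each procedure (obtained by ranking, for each series, the competing procedures by their error measure and averaging across series) and a Friedman test of whether those mean ranks differ significantly. A minimal sketch with a hypothetical error matrix standing in for the paper's 95 × 15 MSE/MAE matrices (function name and simulated data are assumptions):

```python
import numpy as np

def friedman_mean_ranks(errors):
    """Mean rank of each procedure (column) over the series (rows), plus the
    Friedman chi-square statistic. Rank 1 = smallest error; assumes no ties."""
    errors = np.asarray(errors)
    n, k = errors.shape
    ranks = errors.argsort(axis=1).argsort(axis=1) + 1  # within-series ranks
    mean_ranks = ranks.mean(axis=0)
    stat = 12 * n / (k * (k + 1)) * np.sum(mean_ranks ** 2) - 3 * n * (k + 1)
    return mean_ranks, stat

# Hypothetical errors: 95 series x 3 procedures, with procedure 0
# systematically more accurate and procedure 2 less accurate.
rng = np.random.default_rng(1)
errs = rng.gamma(2.0, size=(95, 3)) * np.array([0.8, 1.0, 1.3])
mr, stat = friedman_mean_ranks(errs)
print(mr, stat)  # mean ranks always sum to k*(k+1)/2 = 6 here
```

Two procedures are then declared significantly different when their mean ranks differ by more than the critical distance shown in each figure.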
Figure A.17: Nemenyi test results at 5% significance level for all 95 series. The cross-sectional and temporal reconciliation procedures are sorted vertically according to the MAE mean rank (i) across all time frequencies and forecast horizons (top), and (ii) for one-step-ahead quarterly forecasts (bottom).

A.8.3 Heuristic KA, alternatives and iterative cross-temporal reconciliation procedures

Figure A.18: Average Relative MSE across all series and forecast horizons, by frequency of observation (selected procedures).

Figure A.19: Average Relative MAE across all series and forecast horizons, by frequency of observation (selected procedures).
Figure A.20: Rankings (Average Relative MSE) by frequency of observation and forecast horizon.

Figure A.21: Rankings (Average Relative MAE) by frequency of observation and forecast horizon.

Table A.10: AvgRelMSE for the 63 bottom series at any temporal aggregation level and any forecast horizon.
[Table A.10 entries: AvgRelMSE of the tcs- and ite-type cross-temporal reconciliation procedures for the 63 bottom series, by temporal aggregation level (Quarterly, Semi-annual, Annual, All) and forecast horizon.]

Table A.11: AvgRelMSE for the 32 upper series at any temporal aggregation level and any forecast horizon.
[Table A.11 entries: AvgRelMSE of the tcs- and ite-type cross-temporal reconciliation procedures for the 32 upper series, by temporal aggregation level and forecast horizon.]

Table A.12: AvgRelMSE for all 95 series at any temporal aggregation level and any forecast horizon.
[Table A.12 entries: AvgRelMSE of the tcs- and ite-type cross-temporal reconciliation procedures for all 95 series, by temporal aggregation level and forecast horizon.]

Table A.13: AvgRelMAE for the 63 bottom series at any temporal aggregation level and any forecast horizon.
[Table A.13 entries: AvgRelMAE of the tcs- and ite-type cross-temporal reconciliation procedures for the 63 bottom series, by temporal aggregation level and forecast horizon.]

Table A.14: AvgRelMAE for the 32 upper series at any temporal aggregation level and any forecast horizon.
[Table A.14 entries: AvgRelMAE of the tcs- and ite-type cross-temporal reconciliation procedures for the 32 upper series, by temporal aggregation level and forecast horizon.]

Table A.15: AvgRelMAE for the 95 series at any temporal aggregation level and any forecast horizon.
[Table A.15 entries: AvgRelMAE of the tcs- and ite-type cross-temporal reconciliation procedures for all 95 series, by temporal aggregation level and forecast horizon.]

[Figure A.22, MSE mean ranks across all time frequencies and forecast horizons: best-ranked ite-wlsv-shr 21.26, ite-sar1-shr 21.68, ite-acov-shr 22.25, ite-wlsh-shr 22.39 and tcs-acov-shr 22.76; base 59.69; worst-ranked tcs-ols-ols 68.88 and ite-ols-ols 68.96. Critical distance = 14.37; Friedman test p-value ≈ 0 (Ha: different).]
Figure A.22: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MSE mean rank across all time frequencies and forecast horizons.

[Figure A.23, MAE mean ranks across all time frequencies and forecast horizons: best-ranked ite-wlsv-shr 19.76, ite-sar1-shr 20.08, ite-wlsh-shr 20.44 and ite-har1-shr 20.49; base 58.98; worst-ranked tcs-ols-ols 70.63 and ite-ols-ols 70.68. Critical distance = 14.37; Friedman test p-value ≈ 0 (Ha: different).]
Figure A.23: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MAE mean rank across all time frequencies and forecast horizons.

[Figure A.24, MSE mean ranks for one-step-ahead quarterly forecasts: best-ranked ite-sar1-shr 20.57, ite-har1-shr 20.62, ite-wlsv-shr 21 and ite-wlsh-shr 21.01; base 40.6; worst-ranked tcs-ols-ols 74.29 and ite-ols-ols 74.29. Critical distance = 14.37; Friedman test p-value ≈ 0 (Ha: different).]
Figure A.24: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MSE mean rank for the one-step-ahead quarterly forecasts.

[Figure A.25, MAE mean ranks for one-step-ahead quarterly forecasts: best-ranked ite-sar1-shr 17.84, ite-wlsv-shr 17.95, ite-har1-shr 18.59 and ite-wlsh-shr 18.76; base 40.99; worst-ranked tcs-ols-ols 74.1 and ite-ols-ols 74.16. Critical distance = 14.37; Friedman test p-value ≈ 0 (Ha: different).]
Figure A.25: Nemenyi test results at 5% significance level for all 95 series. The reconciliation procedures are sorted vertically according to the MAE mean rank for the one-step-ahead quarterly forecasts.

A.8.4 Optimal combination forecast cross-temporal reconciliation procedures

Figure A.26: Top panel: Average Relative MSE across all series and forecast horizons, by frequency of observation. Bottom panel: Rankings by frequency of observation and forecast horizon.

Table A.16: AvgRelMSE at any temporal aggregation level and any forecast horizon.
[Table A.16 entries: AvgRelMSE of the base and optimal combination (oct-ols, oct-struc, oct-wlsv, oct-wlsh, oct-shr, oct-Sshr, oct-bdshr, oct-acov) forecasts, reported separately for all 95 series, the 32 upper series and the 63 bottom series, by temporal aggregation level (Quarterly, Semi-annual, Annual, All) and forecast horizon.]
Figure A.27: Top panel: Average Relative MAE across all series and forecast horizons, by frequency of observation. Bottom panel: Rankings by frequency of observation and forecast horizon.

Table A.17: AvgRelMAE at any temporal aggregation level and any forecast horizon.
[Table A.17 entries: AvgRelMAE of the base and optimal combination forecasts, reported separately for all 95 series, the 32 upper series and the 63 bottom series, by temporal aggregation level and forecast horizon.]

[Nemenyi test, MSE mean ranks across all time frequencies and forecast horizons: oct-wlsh 3.68, oct-acov 3.72, oct-wlsv 3.77, oct-shr 3.78, oct-bdshr 4.14, oct-Sshr 4.38, oct-struc 5.98, base 7.59, oct-ols 7.97. Critical distance = 1.23; Friedman test p-value ≈ 0 (Ha: different).]
[Nemenyi test, MSE mean ranks for one-step-ahead quarterly forecasts: oct-wlsh 3.55, oct-Sshr 3.83, oct-acov 3.85, oct-wlsv 3.98, oct-shr 4.57, oct-bdshr 4.76, base 5.22, oct-struc 6.77, oct-ols 8.47. Critical distance = 1.23; Friedman test p-value ≈ 0 (Ha: different).]
Figure A.28: Nemenyi test results at 5% significance level for all 95 series. The optimal combination reconciliation procedures are sorted vertically according to the MSE mean rank (i) across all time frequencies and forecast horizons (top), and (ii) for one-step-ahead quarterly forecasts (bottom).

[Nemenyi test, MAE mean ranks across all time frequencies and forecast horizons: oct-shr 3.58, oct-wlsv 3.64, oct-wlsh 3.81, oct-acov 3.83, oct-Sshr 4.01, oct-bdshr 4.02, oct-struc 6.38, base 7.53, oct-ols 8.2. Critical distance = 1.23; Friedman test p-value ≈ 0 (Ha: different).]
[Nemenyi test, MAE mean ranks for one-step-ahead quarterly forecasts: oct-wlsh 3.67, oct-Sshr 3.91, oct-acov 3.96, oct-wlsv 3.96, oct-bdshr 4.41, oct-shr 4.62, base 5.32, oct-struc 6.71, oct-ols 8.45. Critical distance = 1.23; Friedman test p-value ≈ 0 (Ha: different).]
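The oct-ols procedure compared above is the simplest member of the optimal combination family: an ordinary least-squares projection of the incoherent base forecasts onto the coherent subspace spanned by a summing matrix S, i.e. y_tilde = S (S'S)^{-1} S' y_hat. A minimal sketch of that projection with a toy two-level hierarchy (not the paper's cross-temporal summing matrix):

```python
import numpy as np

# Toy summing matrix: Total = A + B, so S maps the 2 bottom series
# to the full 3-series vector (Total, A, B).
S = np.array([[1.0, 1.0],   # Total
              [1.0, 0.0],   # A
              [0.0, 1.0]])  # B

def ols_reconcile(S, yhat):
    """OLS projection of incoherent base forecasts onto span(S)."""
    G = np.linalg.solve(S.T @ S, S.T)   # (S'S)^{-1} S'
    return S @ (G @ yhat)

yhat = np.array([10.0, 4.0, 5.0])       # incoherent: 4 + 5 != 10
ytilde = ols_reconcile(S, yhat)
print(ytilde)                            # coherent: Total = A + B exactly
```

The weighted variants (oct-wlsv, oct-wlsh, oct-shr, ...) replace the identity weighting implicit in this projection with an estimate of the base-forecast error covariance, which is what drives the ranking differences in Figure A.28.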