Rediscovering Bottom-Up: Effective Forecasting in Temporal Hierarchies

Lukas Neubauer
TU Wien
[email protected]
Peter Filzmoser
TU Wien
[email protected]
Abstract

Forecast reconciliation has become a prominent topic in recent forecasting literature, with a primary distinction made between cross-sectional and temporal hierarchies. This work focuses on temporal hierarchies, such as aggregating monthly time series data to annual data. We explore the impact of various forecast reconciliation methods on temporally aggregated ARIMA models, thereby bridging the fields of hierarchical forecast reconciliation and temporal aggregation both theoretically and experimentally. Our paper is the first to theoretically examine the effects of temporal hierarchical forecast reconciliation, demonstrating that the optimal method aligns with a bottom-up aggregation approach. To assess the practical implications and performance of the reconciled forecasts, we conduct a series of simulation studies, confirming that the findings extend to more complex models. This result helps explain the strong performance of the bottom-up approach observed in many prior studies. Finally, we apply our methods to real data examples, where we observe similar results.

Keywords Temporal Hierarchical Forecast Reconciliation, Temporal Aggregation, Bottom-Up

1 Introduction

Forecast reconciliation has been a very popular topic in recent forecasting literature. It addresses the question of how to properly forecast time series that have been aggregated in a certain way. This aggregation could be cross-sectional, where a collection of time series is aggregated across different variables such as location or organizational unit. Alternatively, the time series could be aggregated on a temporal basis, yielding for example monthly, quarterly, and annual time series. Naturally, both types of aggregation can be combined, leading to cross-temporal hierarchies.

The field of hierarchical forecast reconciliation investigates how to forecast such hierarchies so that the resulting forecasts respect the aggregation constraints of the hierarchy. In addition, the performance of reconciliation methods, which yield so-called coherent forecasts, is often compared with that of the original, possibly incoherent forecasts. A very recent and extensive review of forecast reconciliation is given in Athanasopoulos et al. (2024). Many extensions are discussed, such as adding complex constraints (non-negativity, integer-based time series, …) or probabilistic forecasting.

In this paper we investigate temporal hierarchies as introduced by Athanasopoulos et al. (2017). The authors argue that existing forecast reconciliation methods can be applied to temporally aggregated time series in a straightforward manner. However, no assumptions beyond unbiased base forecasts are investigated, and no work is available that examines the theoretical implications of reconciliation methods under specific data-generating processes. We fill this research gap and examine the performance of forecast reconciliation in temporal hierarchies within the theoretical framework of temporally aggregated time series models such as ARIMA models.

The effects of temporal aggregation in autoregressive models were first studied by Amemiya and Wu (1972). The authors prove that if some data is generated by an autoregressive model of order $p$, then a non-overlapping aggregate of these data will also follow a similar generating process. Namely, the autoregressive order of the aggregate remains at the same order $p$, while there might exist a moving average part of a certain order as well. In fact, the authors give a maximum order for this moving average part of the process. Silvestrini and Veredas (2008) give a generalized overview of this theory and extend it to general SARIMA models.
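The simplest case, $p=1$, can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the textbook autocovariance of a stationary AR(1) process with a hypothetical coefficient `phi = 0.8`, sums it over non-overlapping blocks of length `k = 3`, and inspects ratios of consecutive autocovariances of the aggregate. The function names are illustrative and not taken from the cited works.

```python
import numpy as np

def ar1_acov(h, phi, sigma2=1.0):
    """Autocovariance of a stationary AR(1) process at lag h."""
    return sigma2 * phi ** abs(h) / (1.0 - phi ** 2)

def aggregate_acov(j, k, phi):
    """Autocovariance at lag j of non-overlapping k-period sums of an AR(1)."""
    return sum(ar1_acov(j * k + a - b, phi) for a in range(k) for b in range(k))

phi, k = 0.8, 3                      # hypothetical AR coefficient and block length
g0, g1, g2 = (aggregate_acov(j, k, phi) for j in range(3))

ratio_21 = g2 / g1                   # decay from lag 1 to lag 2 of the aggregate
ratio_10 = g1 / g0                   # decay from lag 0 to lag 1 of the aggregate

print(np.isclose(ratio_21, phi ** k))   # geometric decay with rate phi**k beyond lag 1
print(np.isclose(ratio_10, phi ** k))   # lag 1 deviates from a pure AR(1) pattern
```

Beyond lag 1 the autocovariances of the aggregate decay geometrically at rate `phi**k`, as for an AR(1) process, while the lag-1 autocorrelation does not fit that pattern. This is consistent with an autoregressive part of unchanged order accompanied by a moving average part.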

In temporal hierarchies, simple reconciliation techniques such as bottom-up approaches are often applied. A bottom-up forecast is generated by aggregating the forecasts of the disaggregated series. Ramírez et al. (2014) suggest that forecasts of aggregated time series can be improved by using bottom-up forecasts, as long as the aggregated model includes a significant moving average component. Without this component, the improvements may be minimal or nonexistent. In this work, we extend this analysis by considering more complex models and more intricate temporal hierarchies.

We take an additional step to analyze the performance of the bottom-up approach compared to more sophisticated reconciliation methods, thereby linking the fields of temporal forecast reconciliation and temporally aggregated time series models. Although this was experimentally examined in Athanasopoulos et al. (2017), the results have yet to be theoretically justified. In general, the connection between these two fields has not been established from a theoretical perspective.

The paper is structured as follows. In Section 2 we briefly discuss the ideas of hierarchical forecast reconciliation (Section 2.1) and recent advances, in particular regarding temporal hierarchies (Section 2.2), as well as the basics of temporally aggregated time series models (Section 2.3). This is followed by the linkage of those two topics in Section 3, where we discuss the theoretical implications of forecast reconciliation on temporally aggregated time series. In Section 4, we investigate the discussed implications in a simulation study, followed by real data applications in Section 5. Finally, we give concluding remarks in Section 6.

2 Related Work

2.1 Hierarchical Forecast Reconciliation

First introduced by Hyndman et al. (2011), optimal forecast reconciliation is formulated as follows. Consider a multivariate time series $\mathbf{y}_{1},\dots,\mathbf{y}_{T}\in\mathbb{R}^{n}$ fulfilling certain linear constraints, namely $\mathbf{y}_{t}=S\mathbf{b}_{t}$, where $S$ is an $n\times n_{b}$ summing matrix with $n_{b}<n$, and $\mathbf{b}_{t}$ denotes the bottom-level series of the hierarchy. The summing matrix is defined by the type of hierarchy of interest. For example, a matrix with $n=7$ and $n_{b}=4$ given by

$$
S=\begin{pmatrix}1&1&1&1\\ 1&1&0&0\\ 0&0&1&1\\ 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{pmatrix} \tag{1}
$$

could be understood as a $3$-level hierarchy of $4$ districts and $2$ states of one country, whereby the first two districts are part of the first state and the remaining two are part of the second. Such linear constraints are naturally fulfilled by the observed data because the hierarchy is constructed that way. When forecasting such series, we want the forecasts to adhere to the same constraints, which leads to so-called coherent forecasts, namely $\hat{\mathbf{y}}_{t+h|t}=S\hat{\mathbf{b}}_{t+h|t}$, where $\hat{\mathbf{y}}_{t+h|t}$ and $\hat{\mathbf{b}}_{t+h|t}$ denote the corresponding $h$-step forecasts. However, by forecasting each time series of the hierarchy individually we will most likely not obtain such coherent forecasts. This is where forecast reconciliation proves crucial.
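To make the notion of coherence concrete, the following minimal sketch builds the summing matrix of Eq. (1) with NumPy and checks whether a stacked vector satisfies $\mathbf{y}=S\mathbf{b}$. The district values and the helper `is_coherent` are hypothetical and serve only as an illustration.

```python
import numpy as np

# Summing matrix of Eq. (1): country, state 1, state 2, districts 1 to 4.
S = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
])

def is_coherent(y, S, tol=1e-9):
    """True if y equals S applied to its own bottom-level block (last n_b entries)."""
    n_b = S.shape[1]
    return bool(np.allclose(y, S @ y[-n_b:], atol=tol))

b = np.array([3.0, 5.0, 2.0, 4.0])   # hypothetical district values
y = S @ b                            # coherent by construction
y_hat = y.copy()
y_hat[0] += 1.0                      # a top-level forecast produced independently

print(y)                             # [14.  8.  6.  3.  5.  2.  4.]
print(is_coherent(y, S), is_coherent(y_hat, S))
```

The perturbed vector `y_hat` mimics what typically happens when each series is forecast on its own: the top-level value no longer equals the sum of the district values.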

Historically, simple reconciliation methods such as bottom-up or top-down approaches have been and remain in use. The bottom-up approach starts at the bottom level of the hierarchy, using forecasts from this level to construct forecasts for the entire hierarchy. This method avoids information loss due to aggregation but can be challenging because the bottom-level time series may be harder to forecast accurately due to noise or other factors. On the other hand, top-down reconciliation uses only top-level forecasts and requires a proportion vector $\mathbf{p}$ of size $n_{b}$ to break down these forecasts into coherent lower-level forecasts, with the main challenge being the identification of an appropriate proportion vector.

In the seminal work by Hyndman et al. (2011), the following regression problem was proposed to achieve least-squares reconciliation. Let $\hat{\mathbf{y}}_{h}=\hat{\mathbf{y}}_{t+h|t}$ represent a vector containing the $h$-step base forecasts in a stacked manner, and let $S$ be the summing matrix defined by the hierarchy of interest. Base forecasts refer to any appropriate and possibly incoherent forecasts for the corresponding time series, which we assume are available at this stage. Then write

$$
\hat{\mathbf{y}}_{h}=S\bm{\beta}_{h}+\bm{\epsilon}_{h}, \tag{2}
$$

where $\bm{\beta}_{h}$ are the regression coefficients representing the unknown mean of the bottom level, and $\bm{\epsilon}_{h}$ is the unobservable reconciliation error with zero mean and covariance matrix $V_{h}$.

Solving this regression problem using generalized least squares leads to the generalized linear solution $\hat{\bm{\beta}}_{h}=G_{h}\hat{\mathbf{y}}_{h}$ and the reconciled forecasts $\tilde{\mathbf{y}}_{h}=SG_{h}\hat{\mathbf{y}}_{h}$. The $n_{b}\times n$ matrix $G_{h}$ maps the base forecasts to appropriate bottom-level forecasts and is given by

$$
G_{h}=(S^{\prime}V_{h}^{-1}S)^{-1}S^{\prime}V_{h}^{-1}. \tag{3}
$$

The regression problem was inspired by the authors' finding that simple reconciliation methods, such as bottom-up or top-down, can all be expressed as $\tilde{\mathbf{y}}=SG\hat{\mathbf{y}}$ with an appropriate mapping matrix $G$. For example, setting $G=(0_{n_{b}\times(n-n_{b})}~I_{n_{b}})$ or $G=(\mathbf{p}~0_{n_{b}\times(n-1)})$, where $0_{r\times q}$ denotes an $r\times q$ matrix of zeros, $I_{q}$ is the identity matrix of size $q$, and $\mathbf{p}$ is a proportion vector of size $n_{b}$, yields the bottom-up or top-down method, respectively. The regression problem (2) was introduced to determine the optimal mapping matrix in a least-squares sense.
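The two simple methods can be written out for the hierarchy of Eq. (1). The sketch below is an illustration under assumptions: the base forecasts `y_hat` and the proportions `p` are hypothetical, and the mapping matrices are built with shape $n_{b}\times n$, matching the dimensions stated for $G_{h}$.

```python
import numpy as np

S = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
])
n, n_b = S.shape

# Bottom-up: discard the upper levels, keep the last n_b base forecasts.
G_bu = np.hstack([np.zeros((n_b, n - n_b)), np.eye(n_b)])

# Top-down: split the first (top-level) base forecast with proportions p.
p = np.array([0.4, 0.3, 0.2, 0.1])                  # hypothetical proportions
G_td = np.hstack([p[:, None], np.zeros((n_b, n - 1))])

# Hypothetical incoherent base forecasts (top level 15, districts sum to 14).
y_hat = np.array([15.0, 8.0, 6.0, 3.0, 5.0, 2.0, 4.0])

y_bu = S @ G_bu @ y_hat   # [14, 8, 6, 3, 5, 2, 4]
y_td = S @ G_td @ y_hat   # [15, 10.5, 4.5, 6, 4.5, 3, 1.5]
print(y_bu)
print(y_td)
```

Both results are coherent, but they differ markedly: bottom-up ignores the top-level forecast, while top-down ignores everything except it.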

It is further argued in Hyndman et al. (2011) that if the base forecasts are unbiased, that is, $\mathbb{E}[\hat{\mathbf{y}}_{h}]=\mathbb{E}[\mathbf{y}_{t+h}]$, and $G$ is such that $SGS=S$, then the reconciled forecasts are also unbiased. The condition $SGS=S$ is equivalent to $SG$ being a projection matrix (Panagiotelis et al., 2021), ensuring that already coherent forecasts remain unchanged by this transformation.
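The argument can be spelled out in a few lines. The following is a sketch in the notation of this section, where we additionally assume that $\bm{\beta}_{h}$ stands for the mean of the bottom-level series at time $t+h$, so that $\mathbb{E}[\mathbf{y}_{t+h}]=S\bm{\beta}_{h}$ by coherence of the observed data.

```latex
\begin{align*}
\mathbb{E}[\tilde{\mathbf{y}}_{h}]
  &= SG\,\mathbb{E}[\hat{\mathbf{y}}_{h}]
   = SG\,\mathbb{E}[\mathbf{y}_{t+h}]
   = SGS\,\bm{\beta}_{h}
   = S\bm{\beta}_{h}
   = \mathbb{E}[\mathbf{y}_{t+h}], \\
(SG)^{2}
  &= (SGS)\,G
   = SG .
\end{align*}
```

The first line uses unbiasedness of the base forecasts and then $SGS=S$; the second line shows that the same condition makes $SG$ idempotent, which is the projection property.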

One essential problem is that $V_{h}$ is not known and not even identifiable, as shown in Wickramasuriya et al. (2019). In Hyndman et al. (2011), the authors avoided this by setting $V_{h}=k_{h}I_{n}$ with some consistency constant $k_{h}$ (which need not be computed since it cancels out in further calculation steps), hence weighting all series equally and disregarding any level of aggregation or performance of the base forecasts. This simplification results in an OLS solution with $G=(S^{\prime}S)^{-1}S^{\prime}$. The transformation matrix $SG=S(S^{\prime}S)^{-1}S^{\prime}$ is then an orthogonal projection with respect to the Euclidean distance, ensuring minimal change of the forecasts while reducing squared forecast errors of all levels of the hierarchy (Panagiotelis et al., 2021).
A scaled reconciliation method is introduced in Hyndman et al. (2016), where the authors set $V_{h}=k_{h}\text{diag}(W_{h})$ with $W_{h}=\text{Cov}(\mathbf{y}_{t+h|h}-\hat{\mathbf{y}}_{h})$ being the covariance matrix of the base forecast errors, leading to a weighted linear solution.
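Both variants are instances of Eq. (3) with a different choice of $V_{h}$. The sketch below illustrates this for the hierarchy of Eq. (1); the base forecasts and the error variances in `w` are hypothetical, and `reconcile` is an illustrative helper rather than an existing library function.

```python
import numpy as np

S = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)
n, n_b = S.shape

def reconcile(y_hat, S, V):
    """Return (S G y_hat, G) with G = (S' V^-1 S)^-1 S' V^-1 as in Eq. (3)."""
    V_inv = np.linalg.inv(V)
    G = np.linalg.solve(S.T @ V_inv @ S, S.T @ V_inv)
    return S @ G @ y_hat, G

y_hat = np.array([15.0, 8.0, 6.0, 3.0, 5.0, 2.0, 4.0])   # hypothetical base forecasts

# OLS: V = k * I; the constant k cancels, so k = 1 and k = 3 agree.
y_ols, G_ols = reconcile(y_hat, S, np.eye(n))
y_ols_scaled, _ = reconcile(y_hat, S, 3.0 * np.eye(n))

# Scaled (WLS): V = diag of hypothetical base forecast error variances.
w = np.array([4.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0])
y_wls, G_wls = reconcile(y_hat, S, np.diag(w))

P_ols = S @ G_ols     # symmetric and idempotent: orthogonal projection
P_wls = S @ G_wls     # idempotent only: projection that is not orthogonal in general
print(y_ols)
print(y_wls)
```

In this sketch the OLS projection matrix is symmetric, whereas the weighted one is merely idempotent, which matches the distinction between orthogonal and more general projections.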

In the work of Wickramasuriya et al. (2019), the so-called minimum trace estimator is proposed by minimizing the trace of the covariance of the reconciled forecast errors subject to unbiasedness, thus

$$
\begin{aligned}
\min_{G}\text{tr}~\text{Cov}(\mathbf{y}_{T+h|h}-\tilde{\mathbf{y}}_{h}) &=\min_{G}\text{tr}~\text{Cov}(\mathbf{y}_{T+h|h}-SG\hat{\mathbf{y}}_{h})\\
&=\min_{G}\text{tr}~SGW_{h}G^{\prime}S^{\prime},
\end{aligned} \tag{4}
$$

subject to $SGS=S$. The trace of an $n\times n$ matrix $A$ is $\text{tr}(A)=\sum_{i=1}^{n}A_{ii}$. This leads to $G_{h}=(S^{\prime}W_{h}^{-1}S)^{-1}S^{\prime}W_{h}^{-1}$. Thus, instead of estimating $V_{h}$, we now need to estimate the covariance of the base forecast errors, $W_{h}$, which is more feasible. This method is equivalent to the generalized linear solution, with the regression-based solution being a special case. The transformation matrix $SG$ now represents an oblique projection. By dropping the assumption of an orthogonal projection, we allow for greater forecast improvements on average. However, Panagiotelis et al. (2021) argue that for some realizations, the performance of the reconciled forecasts may be worsened.
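The optimality claim can be checked numerically for a given $W_{h}$. In the sketch below, `W` is a hypothetical positive-definite matrix generated from a fixed random seed, not an estimate from data; the code compares the objective $\text{tr}~SGW_{h}G^{\prime}S^{\prime}$ of Eq. (4) for the minimum trace, OLS, and bottom-up mapping matrices, all of which satisfy $SGS=S$.

```python
import numpy as np

S = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)
n, n_b = S.shape

rng = np.random.default_rng(0)
A = rng.normal(size=(n, n))
W = A @ A.T + n * np.eye(n)          # hypothetical positive-definite covariance

def mint_G(S, W):
    """G = (S' W^-1 S)^-1 S' W^-1, computed via linear solves."""
    W_inv_S = np.linalg.solve(W, S)                    # W^-1 S
    return np.linalg.solve(S.T @ W_inv_S, W_inv_S.T)   # W symmetric, so (W^-1 S)' = S' W^-1

def objective(G):
    """Trace objective of Eq. (4)."""
    return float(np.trace(S @ G @ W @ G.T @ S.T))

G_mint = mint_G(S, W)
G_ols = np.linalg.solve(S.T @ S, S.T)
G_bu = np.hstack([np.zeros((n_b, n - n_b)), np.eye(n_b)])

for name, G in [("MinT", G_mint), ("OLS", G_ols), ("bottom-up", G_bu)]:
    print(name, np.allclose(S @ G @ S, S), objective(G))
```

For any positive-definite `W`, the minimum trace objective should not exceed that of the other two choices. This statement concerns the expected error under the assumed `W`, so it does not contradict the remark that individual realizations may be worsened.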

Estimating $W_{h}$ presents difficulties, especially for complex hierarchies and forecast horizons $h>1$, due to the limited sample size determined by the number of top-level observations. Therefore, it may be practical to revert to simpler estimates as previously described. Additionally, Wickramasuriya et al. (2019) propose sample and shrinkage estimators by setting $W_{h}=k_{h}\hat{W}_{1}$ and $W_{h}=k_{h}(\lambda\,\text{diag}(\hat{W}_{1})+(1-\lambda)\hat{W}_{1})$, $\lambda\in(0,1)$, respectively, with appropriate consistency constants. The shrinkage estimator is particularly useful when $n>T$, which can result in a singular sample covariance matrix.
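The singularity issue and the effect of shrinkage can be illustrated as follows. This is a sketch under assumptions: the error matrix `E` is simulated noise standing in for one-step base forecast errors, the errors are treated as having zero mean, and the shrinkage intensity `lam` is fixed by hand purely for illustration (how to choose it is outside the scope of this sketch).

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 5, 7                          # fewer observations than series
E = rng.normal(size=(T, n))          # hypothetical one-step base forecast errors

# Sample covariance of the errors (zero mean assumed): the role of W_1-hat.
W_sample = E.T @ E / T

# Shrinkage towards the diagonal: lam * diag(W_1-hat) + (1 - lam) * W_1-hat.
lam = 0.5                            # hypothetical shrinkage intensity
W_shrink = lam * np.diag(np.diag(W_sample)) + (1 - lam) * W_sample

rank_sample = np.linalg.matrix_rank(W_sample)
min_eig_shrink = np.linalg.eigvalsh(W_shrink).min()

print(rank_sample, n)                # rank is at most T, hence below n
print(min_eig_shrink > 0)            # the shrunk matrix is positive definite
```

Because the sample covariance has rank at most $T<n$, it cannot be inverted, whereas the shrunk matrix combines it with a positive diagonal and therefore can be used in the minimum trace formula.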

Wickramasuriya et al. (2019) also give a different type of estimator, referred to as structural scaling. It is proposed to set $W_{h}=k_{h}\text{diag}(S\mathbf{1}_{n_{b}})$, implying that each forecast is scaled according to the number of bottom-level series that are aggregated into it. Here, $\mathbf{1}_{n_{b}}$ is a vector with $n_{b}$ entries of one.
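For the hierarchy of Eq. (1), the structural scaling weights can be read off directly. The short sketch below computes them and applies the resulting reconciliation to hypothetical base forecasts; variable names are illustrative.

```python
import numpy as np

S = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)
n, n_b = S.shape

# Structural weights: row sums of S, one weight per series in the hierarchy.
w_struct = S @ np.ones(n_b)          # [4, 2, 2, 1, 1, 1, 1]

# Reconciliation with W = diag(w_struct); S.T / w_struct equals S' W^-1.
SW = S.T / w_struct
G_struct = np.linalg.solve(SW @ S, SW)

y_hat = np.array([15.0, 8.0, 6.0, 3.0, 5.0, 2.0, 4.0])   # hypothetical base forecasts
y_struct = S @ G_struct @ y_hat
print(w_struct)
print(y_struct)
```

No forecast errors are needed to form these weights, which makes the estimator usable even when residuals are unavailable or unreliable.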

Overall, the minimum trace method addresses three key aspects. Firstly, it produces coherent forecasts, which is the most crucial factor. Secondly, as long as the base forecasts are unbiased, the reconciled forecasts will also be unbiased. Lastly, it enhances forecast performance by minimizing the forecast error variance across all series on average.

2.2 Temporal Hierarchical Forecast Reconciliation

While forecast reconciliation was not developed with temporal hierarchies in mind, it can be applied to them naturally, as discussed in Athanasopoulos et al. (2017). Temporal hierarchies allow for even more sophisticated methods for estimating the covariance matrix of the base forecast errors.

Let $y_{t}$ with $t=1,\dots,T$ be a univariate time series of interest observed at a certain frequency $m$. A $k$-aggregate, where $k$ is a factor of $m$, is defined to be

$$
y_{j}^{[k]}=\sum_{t=t^{\ast}+(j-1)k}^{t^{\ast}+jk-1}y_{t},\quad j=1,\dots,\lfloor T/k\rfloor, \tag{5}
$$

where $t^{\ast}$ is the starting point of the aggregation to ensure non-overlapping aggregates. The resulting frequency is then $M_{k}=m/k$. To have a common index across all levels of aggregation, the authors set $i=1,\dots,\lfloor T/m\rfloor$ and

$$
y_{M_{k}(i-1)+z}^{[k]}=y_{j}^{[k]},\quad z=1,\dots,M_{k}, \tag{6}
$$

such that $i$ controls the top-level steps and $z$ determines the steps within each aggregation period. That way we can write one time step of the hierarchy as the vector given by

$$
\mathbf{y}_{i}=\left(y_{i}^{[m]},\dots,{\mathbf{y}_{i}^{[k_{2}]}}^{\prime},{\mathbf{y}_{i}^{[k_{1}]}}^{\prime}\right)^{\prime}, \tag{7}
$$

where $\mathbf{y}_{i}^{[k]}=\left(y_{M_{k}(i-1)+1}^{[k]},y_{M_{k}(i-1)+2}^{[k]},\dots,y_{M_{k}i}^{[k]}\right)^{\prime}$ denotes the stacked entries of the time series at aggregation level $k$. This implies that $\mathbf{y}_{i}=S\mathbf{y}_{i}^{[1]}$, where $S$ is an appropriate summing matrix as defined in general forecast reconciliation.

Following Athanasopoulos et al. (2017), we write the levels of aggregation in descending order as $\{k_{p},\dots,k_{2},1\}$ with $k_{p}=m$. For a quarterly-biannual-annual aggregation scheme this yields $k\in\{4,2,1\}$. A corresponding visualization is available in Figure 1.

[Figure 1: tree diagram. Top level: Annual, $y_{i}^{[4]}$. Middle level ($\mathbf{y}_{i}^{[2]}$): Biannual$_{1}$, $y_{2(i-1)+1}^{[2]}$, and Biannual$_{2}$, $y_{2i}^{[2]}$. Bottom level ($\mathbf{y}_{i}^{[1]}$): Quarterly$_{1}$, $y_{4(i-1)+1}^{[1]}$, and Quarterly$_{2}$, $y_{4(i-1)+2}^{[1]}$, below Biannual$_{1}$; Quarterly$_{3}$, $y_{4(i-1)+3}^{[1]}$, and Quarterly$_{4}$, $y_{4i}^{[1]}$, below Biannual$_{2}$.]
Figure 1: Visualization of an annual-biannual-quarterly temporal hierarchy.
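The construction of Eqs. (5) to (7) can be traced on a toy example. The sketch below assumes two years of hypothetical quarterly data, aggregation starting at the first observation, and illustrative helper names (`k_aggregate`, `temporal_S`) that are not taken from any library.

```python
import numpy as np

def k_aggregate(y, k, t_star=0):
    """Non-overlapping sums of k consecutive observations, Eq. (5) (0-based start)."""
    J = (len(y) - t_star) // k
    return y[t_star:t_star + J * k].reshape(J, k).sum(axis=1)

def temporal_S(m, ks):
    """Summing matrix for aggregation levels ks, given in descending order."""
    return np.vstack([np.kron(np.eye(m // k), np.ones((1, k))) for k in ks])

m, ks = 4, [4, 2, 1]                  # quarterly data: annual, biannual, quarterly
y = np.arange(1.0, 9.0)               # hypothetical quarterly values 1, ..., 8
S = temporal_S(m, ks)

i = 2                                 # second year (1-based top-level index)
levels = [k_aggregate(y, k) for k in ks]
# Stack the M_k = m / k entries of each level that belong to year i, Eq. (7).
y_i = np.concatenate([
    agg[(m // k) * (i - 1):(m // k) * i] for agg, k in zip(levels, ks)
])

print(S)
print(y_i)                            # [26. 11. 15.  5.  6.  7.  8.]
print(np.allclose(y_i, S @ y[m * (i - 1):m * i]))
```

For this aggregation scheme the resulting summing matrix coincides with the one in Eq. (1), which is why the reconciliation formulas of the cross-sectional setting carry over unchanged.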

The fact that $\mathbf{y}_{i}=S\mathbf{y}_{i}^{[1]}$ suggests we can set up a very similar regression problem based on the base forecasts as in Eq. (2). The minimum trace approach then yields

$$
\tilde{\mathbf{y}}_{h}=S(S^{\prime}W_{h}^{-1}S)^{-1}S^{\prime}W_{h}^{-1}\hat{\mathbf{y}}_{h}, \tag{8}
$$

where $\hat{\mathbf{y}}_{h}$ are the stacked base forecasts across the entire hierarchy, and $W_{h}=\text{Cov}(\mathbf{y}_{h}-\hat{\mathbf{y}}_{h})$ denotes the covariance matrix of the stacked base forecast errors. Specifically, this means that on each aggregation level, we require $M_{k}h$-step forecasts, which can already be challenging to obtain properly.

As in conventional forecast reconciliation, the estimation of $W_h$ can be difficult because the sample size is bounded by the number of observations on the top level of the aggregation hierarchy. Thus, the authors propose several simplified covariance estimators. One of them is similar to the scaled reconciliation of Hyndman et al., (2011) by setting $W_h = k_h \text{diag}(\hat{W}_1)$, while structural scaling is also proposed as in Wickramasuriya et al., (2019).
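To make the role of a diagonal $W_h$ concrete, the following minimal sketch reconciles a two-level hierarchy (one aggregate and $k$ bottom-level periods). It uses the fact that, for a diagonal $W_h$, Eq. (8) coincides with the weighted least squares solution that spreads the incoherence between the top-level forecast and the sum of the bottom-level forecasts in proportion to the error variances, so the scalar $k_h$ cancels. The function name, the variances, and the forecast values are illustrative assumptions and are not taken from the paper.

```python
from fractions import Fraction

def reconcile_diagonal(top_hat, bottom_hat, var_top, var_bottom):
    """Two-level reconciliation for a diagonal error covariance matrix.

    Hypothetical helper: minimises the variance-weighted squared adjustments
    subject to top = sum(bottom), which is what Eq. (8) yields for diagonal W.
    """
    incoherence = top_hat - sum(bottom_hat)
    total_var = var_top + sum(var_bottom)
    # Each node absorbs a share of the incoherence proportional to its variance.
    top_tilde = top_hat - incoherence * var_top / total_var
    bottom_tilde = [y + incoherence * v / total_var
                    for y, v in zip(bottom_hat, var_bottom)]
    return top_tilde, bottom_tilde

# Illustrative base forecasts: annual forecast 10, quarterly forecasts summing to 9.
top_hat = Fraction(10)
bottom_hat = [Fraction(2), Fraction(2), Fraction(2), Fraction(3)]

# Assumed variances: the annual forecast error is taken to be noisier than each quarter.
top_tilde, bottom_tilde = reconcile_diagonal(
    top_hat, bottom_hat, Fraction(4), [Fraction(1)] * 4
)
```

The larger the variance attached to a node, the more its base forecast is adjusted. If the top-level variance grows relative to the bottom-level variances, the solution moves towards the bottom-up forecast.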

Temporal aggregation allows for more refined methods to enhance the estimation of the covariance matrix. Nystrup et al., (2020) suggest modeling the autocorrelation structure of the forecasts, leading to four different estimators. The autocovariance scaling estimator estimates the full autocovariance matrix at each aggregation level, while the Markov scaling assumes a first-order Markov structure, estimating only lag-1 correlations per aggregation level. Additionally, the authors propose using GLASSO to estimate the inverse cross-correlation matrix and a cross-correlation shrinkage estimator, similar to Wickramasuriya et al., (2019). It is worth noting that all correlation-based estimators can be combined with variance and structural scaling variances.

In a subsequent work by Nystrup et al., (2021), the authors explore dimension reduction. They propose an eigendecomposition of the cross-correlation matrix and construct a filtered precision matrix by selecting the first few eigenvectors and applying shrinkage to the eigenvalues. Such an estimator is especially useful when forecasting a very deep and complex hierarchy.

2.3 Temporal Aggregation

Temporal aggregation of series was first studied in the seminal work of Amemiya and Wu, (1972). A rather recent review of the most relevant advances in this field can be found in Silvestrini and Veredas, (2008). The models discussed in these works are mostly ARIMA-based, and we will briefly explain the essential ideas and results.

Consider a univariate time series $y_t$, $t = 1, \dots, T$, observed at some frequency. A $k$-aggregate series is defined, equivalent to Eq. (5), by

$$y_t^{\ast} = \sum_{i=0}^{k} w_i y_{t-i}. \tag{9}$$

To obtain non-overlapping aggregates, a new time scale is introduced by setting $T = kt$, and thus $y^{\ast}_T = y^{\ast}_{kt}$ with $y^{\ast}_{T+1} = y^{\ast}_{k(t+1)}$. Hence, $y^{\ast}$ is a series at lower frequency because observations are only available every $k$ time steps.

The more general definition of Eq. (9) allows for different types of aggregation. The most common one is the so-called flow aggregation with $w_i = 1$. This type of aggregation is just the sum in each aggregation period. Another type is stock aggregation. One usually sets $k = 0, w_0 = 1$. Thus, the period's aggregate is equal to the last observation in each period. As in most literature, we also focus on the flow type of aggregation.
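The two aggregation types can be illustrated with a short sketch on non-overlapping blocks of $k$ observations. The function names and the toy series are assumptions made for illustration only.

```python
def flow_aggregate(y, k):
    """Sum of each non-overlapping block of k observations (flow aggregation)."""
    n_blocks = len(y) // k  # an incomplete trailing block is dropped
    return [sum(y[i * k:(i + 1) * k]) for i in range(n_blocks)]

def stock_aggregate(y, k):
    """Last observation of each non-overlapping block of k observations."""
    n_blocks = len(y) // k
    return [y[(i + 1) * k - 1] for i in range(n_blocks)]

quarterly = [1, 2, 3, 4, 5, 6, 7, 8]          # toy quarterly series
annual_flow = flow_aggregate(quarterly, 4)    # [10, 26]
annual_stock = stock_aggregate(quarterly, 4)  # [4, 8]
```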

Now assume that the higher frequency series $y$ seen as a random process is an $\text{ARIMA}(p,d,q)$ model. We are interested in the model specification of $y^{\ast}$ after aggregation. The theory gives us that $y^{\ast}$ is again an ARIMA model as discussed in Silvestrini and Veredas, (2008, Section 3.3). We have that

$$y^{\ast} \sim \text{ARIMA}(p,d,r), \quad r \leq \left\lfloor \frac{p(k-1) + (d+1)(k-1) + q}{k} \right\rfloor. \tag{10}$$

The autoregressive and integrated orders of the aggregated series remain unchanged, while the moving average order increases. The theory also provides a method to compute the exact parameters of the aggregated series. Specifically, the roots of the autoregressive polynomial of the AR component of the aggregated series are equal to the $k$-th power of the AR roots of the disaggregated model. Thus, assuming stationarity, the AR effect in the aggregate model diminishes as the aggregation period increases. Simultaneously, the MA effect becomes more significant. However, calculating the MA coefficients is more complex. These coefficients can be determined by comparing the autocorrelation functions of the aggregated model and the transformed disaggregated model, leading to several potentially non-linear equations. The unknowns in these equations include the MA coefficients, the innovation variance, and a possible non-zero mean.
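As a small worked example of Eq. (10) and of the root relation, the sketch below evaluates the bound on the MA order and, for the special case of an $\text{AR}(1)$ model, the AR coefficient of the aggregate, which is the $k$-th power of $\phi$. The function names are hypothetical, and $\phi = 0.8$, $k = 4$ are chosen only for illustration.

```python
def ma_order_bound(p, d, q, k):
    """Upper bound on the MA order r of the k-aggregated ARIMA model, Eq. (10)."""
    return (p * (k - 1) + (d + 1) * (k - 1) + q) // k

def aggregated_ar1_coefficient(phi, k):
    """AR coefficient of a k-aggregated AR(1) model: the k-th power of phi."""
    return phi ** k

# An AR(1) model aggregated over k = 4 periods has MA order at most 1,
# i.e. the aggregate is at most an ARMA(1,1) model.
r_max = ma_order_bound(p=1, d=0, q=0, k=4)

# The AR effect shrinks with the aggregation period: 0.8 becomes roughly 0.41.
phi_star = aggregated_ar1_coefficient(0.8, 4)
```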

This theory has also been extended to more complex ARIMA models like ARIMAX or even SARIMA, where the results are very similar. There are even results for volatility models such as GARCH.

The reason why the aggregated MA order in Eq. (10) is only bounded above by the right-hand side is due to the possibility of polynomial term cancellation in the disaggregated model, which can result in much simpler models. An extreme example is provided in Ramírez et al., (2014), where the authors show that if the disaggregated model is an $\text{AR}(9)$ model with non-zero coefficients at lags $3, 6, 9$, then the $3$-aggregated series will simplify to an $\text{AR}(3)$ model. This simplification is reasonable because the disaggregated series already contains the essential aggregation information.

In the same work of Ramírez et al., (2014), the forecast performance of aggregation is also investigated. The authors argue that if the aggregated series exhibits a moving average part, then its forecast error can be reduced by performing a corresponding bottom-up forecast using the disaggregated series. This makes sense since aggregation leads to a loss of information. However, this is only the case if the moving average part is significant. If not, then the improvements are very small or even non-existent.

Since it might not be clear how such model aggregation works on paper, we put a thorough calculation of the simple $\text{AR}(1)$ model in Appendix A.

3 Temporal Hierarchical Forecast Reconciliation in Temporally Aggregated Models

In this section, we will theoretically integrate the fields of temporal forecast reconciliation and temporally aggregated ARIMA models. To the best of our knowledge, this is the first time such an integration has been attempted. While Athanasopoulos et al., (2017) utilized the theory of temporally aggregated ARIMA models, their approach was primarily experimental. They examined the performance of temporal forecast reconciliation methods, such as variance scaling, and compared them to a simple bottom-up approach under varying levels of uncertainty. Specifically, they conducted experiments with fixed model orders and parameters, fixed orders alone, or automatically selected models based on model selection criteria. The authors found that temporal forecast reconciliation and bottom-up methods perform equally well in highly certain settings, but the performance of bottom-up methods declines when models are misspecified.

In general, the data-generating process has not been of much interest so far in the field of temporal forecast reconciliation because it has been developed as a post-hoc procedure to transform base forecasts coherently. In the theory of temporally aggregated models, the combination of forecasts of different levels to achieve coherent or even better forecasts has not been looked at.

Our contribution is as follows: Utilizing the theoretical model of aggregation, we will derive the theoretical covariance matrix of the base forecast errors, denoted as $W$, given in Lemma 1. This covariance matrix will then be employed to perform the minimum trace estimation manually. Through matrix algebra, we will demonstrate in Theorem 1 that the resulting mapping matrix $G$ corresponds to a bottom-up forecast. Consequently, we show that within the framework of aggregated ARIMA models, the optimal forecast reconciliation technique is indeed the bottom-up approach.

Building on the insights from Section 2.3, we aim to manually implement the minimum trace reconciliation method. To do this, we need the covariance matrix of the base forecast errors, which we can readily compute. To maintain simplicity, we will initially focus on the straightforward case of an $\text{AR}(1)$ model and subsequently discuss more complex models. The first result in Lemma 1 is about the covariance structure of the aggregated model. Its proof can be found in Appendix A.

Lemma 1.

The covariance matrix $W_1$ of $1$-step forecast errors in a $k$-aggregated $\text{AR}(1)$ model with parameter $\phi$ and innovation variance $\sigma^2$ is equal to

$$W_1 = \begin{pmatrix} \sigma_{\ast}^2 & \sigma^2 \mathbf{1}_k' \Phi\Phi' \\ \sigma^2 \Phi\Phi' \mathbf{1}_k & \sigma^2 \Phi\Phi' \end{pmatrix} \tag{11}$$

where $\mathbf{1}_k$ denotes a vector of ones of length $k$, $\Phi$ is a lower triangular matrix given by

$$\Phi = \begin{pmatrix} 1 & 0 & 0 & \dots & 0 \\ \phi & 1 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \phi^{k-2} & \ddots & \ddots & \ddots & 0 \\ \phi^{k-1} & \phi^{k-2} & \dots & \phi & 1 \end{pmatrix}, \tag{12}$$

and $\sigma_{\ast}^2$ denotes the innovation variance of the aggregated model.
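The structure of Eqs. (11) and (12) can be made tangible with a short sketch that builds $\Phi$ and the bottom-level block $\sigma^2 \Phi\Phi'$ in exact arithmetic. It rests on our reading that the bottom-level block collects the errors of the $1$- to $k$-step-ahead $\text{AR}(1)$ forecasts within one aggregation period, so that its diagonal should match the familiar multi-step error variances $\sigma^2 \sum_{i<j} \phi^{2i}$. The helper names and the values $\phi = 0.8$, $k = 4$ are illustrative assumptions.

```python
from fractions import Fraction

def phi_matrix(phi, k):
    """Lower triangular matrix of Eq. (12) with entries phi^(i-j) for j <= i."""
    return [[phi ** (i - j) if j <= i else 0 * phi for j in range(k)]
            for i in range(k)]

def bottom_block(phi, sigma2, k):
    """sigma^2 * Phi Phi', the lower-right block of W_1 in Eq. (11)."""
    Phi = phi_matrix(phi, k)
    return [[sigma2 * sum(Phi[i][m] * Phi[j][m] for m in range(k))
             for j in range(k)] for i in range(k)]

phi, sigma2, k = Fraction(4, 5), Fraction(1), 4
C = bottom_block(phi, sigma2, k)

# Variance of the j-step-ahead AR(1) forecast error, j = 1, ..., k.
multi_step_var = [sigma2 * sum(phi ** (2 * i) for i in range(j))
                  for j in range(1, k + 1)]

# Cross-covariance with the top level: sigma^2 Phi Phi' 1_k, the row sums of C.
cross_cov = [sum(row) for row in C]
```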

Based on Lemma 1 we now manually compute the optimal unbiased reconciliation matrix, summarised in Theorem 1. The proof is available in Appendix A.

Theorem 1.

The minimum trace reconciliation method in a $k$-aggregated $\text{AR}(1)$ model is equal to a bottom-up approach, implying that

$$SG^{\ast} = \begin{pmatrix} 0 & \mathbf{1}_k' \\ \mathbf{0}_k & I_k \end{pmatrix},$$

where $\mathbf{0}_k$ is a vector of zeros of length $k$ and $G^{\ast}$ denotes the optimal mapping matrix from problem (2.1).

Theorem 1 indicates that the optimal unbiased reconciliation method for the aggregated $\text{AR}(1)$ model is the bottom-up approach. Consequently, the forecasts at the bottom level remain unchanged, with no potential for enhancing forecast accuracy. Conversely, the aggregated forecast is disregarded in any form of combination. This outcome elucidates why the bottom-up approach frequently demonstrates effectiveness in both simulation studies and real-world data applications, thus bolstering its practicality.
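Theorem 1 can be checked numerically with a minimal sketch in exact rational arithmetic: build $W_1$ from Eq. (11), plug it into the minimum trace formula of Eq. (8), and compare $SG$ with the bottom-up matrix. The helper functions are hypothetical and written only for this illustration. The value used for $\sigma_{\ast}^2$ is an assumption, namely an arbitrary number above the variance of the summed bottom-level errors so that $W_1$ is positive definite; the outcome of this check does not depend on that particular choice.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse(A):
    """Gauss-Jordan inverse of a square matrix with Fraction entries."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # non-zero pivot row
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def mint_projection(S, W):
    """S (S' W^{-1} S)^{-1} S' W^{-1}, the transformation matrix SG of Eq. (8)."""
    A = matmul(transpose(S), inverse(W))   # S' W^{-1}
    G = matmul(inverse(matmul(A, S)), A)   # mapping matrix G
    return matmul(S, G)

k, phi, sigma2 = 4, Fraction(4, 5), Fraction(1)
Phi = [[phi ** (i - j) if j <= i else Fraction(0) for j in range(k)] for i in range(k)]
C = [[sigma2 * x for x in row] for row in matmul(Phi, transpose(Phi))]
b = [sum(row) for row in C]                # sigma^2 Phi Phi' 1_k
sigma_star2 = sum(b) + 1                   # assumed value, see the text above

W1 = [[sigma_star2] + b] + [[b[i]] + C[i] for i in range(k)]
S = [[Fraction(1)] * k] + [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]

SG = mint_projection(S, W1)
bottom_up = [[0] + [1] * k] + [[0] + [int(i == j) for j in range(k)] for i in range(k)]
```

Comparing `SG` with `bottom_up` entry by entry reproduces the statement of the theorem for this parameter choice: the first column, which carries the weight of the top-level base forecast, is zero.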

Before moving on to the experimental part of this study, we aim to illustrate how this theorem works using a sample-based approach. In Figure 2, the average transformation matrix $SG$ for a two-level hierarchy is presented. To do this, we simulated 100 models and estimated the complete sample covariance matrix based on the simulations. The models used consist of an $\text{AR}(1)$ model with parameters $\phi = 0.8, \sigma^2 = 1$ at the lower level, which is then aggregated into an $\text{ARMA}(1,1)$ model at the higher level of the hierarchy with $k \in \{4, 1\}$. The nodes of the hierarchy are shown on both axes, with 1-1 representing the entry at the top level and 2-$i$ representing the $i$-th step of the lower level. This precisely specifies the transformation matrix as used in Theorem 1. The first row shows the effects of the base forecasts on the reconciled top-level forecast. It is evident that there is little impact from the top-level base forecast, with nearly equal weights close to 1 for the bottom level base forecasts. Similarly, the following 4 rows demonstrate the weights for the reconciled bottom level forecasts, with a zero column followed by the identity matrix $I_4$. This indicates that the reconciled bottom level forecasts closely match the bottom level base forecasts. In summary, the tendency for a bottom-up reconciliation approach is clear.

Figure 2: Full sample transformation matrix $SG$ for $n = 100, \phi = 0.8, h = 1, k \in \{4, 1\}, \sigma^2 = 1$. The colors correspond to the mean value over 100 repetitions. The standard errors are given in parentheses.

In Section 4, we further investigate this theorem experimentally to gain a deeper understanding. A natural extension of Theorem 1 is to increase the depth of the hierarchy. Figure 13 in Appendix B shows the transformation matrix $SG$ for a three-level hierarchy with $k \in \{4, 2, 1\}$, similar to Figure 2. While the results are less clear-cut, the tendency towards a bottom-up approach remains evident. Specifically, the reconciled first-level forecast is constructed using similar components from the lowest level, whereas the reconciled bottom level relies solely on base bottom level data. The standard errors, indicated in parentheses, show that the first three columns are close to zero, meaning that the forecasts for the first and second levels of the hierarchy do not carry much weight. In other words, the forecast for the first half-year is derived from the first two quarters, and similarly for the second half-year.

4 Experiments

In this section, we experimentally investigate different types of forecast reconciliation methods in the framework of temporally aggregated time series models and beyond.

We evaluate the results based on percentage errors, namely for aggregation parameter $k$ we obtain a relative mean squared error of

$$\text{rMSE}^{[k]}(\tilde{\mathbf{y}}, \hat{\mathbf{y}}) = \frac{\sum_i \left\| \tilde{\mathbf{y}}_i^{[k]} - \mathbf{y}_i^{[k]} \right\|_2^2}{\sum_i \left\| \hat{\mathbf{y}}_i^{[k]} - \mathbf{y}_i^{[k]} \right\|_2^2} - 1,$$

where $\tilde{\mathbf{y}}_i^{[k]}$ denotes the $i$-th vector of reconciled forecasts of aggregation level $k$, $\hat{\mathbf{y}}_i^{[k]}$ is the $i$-th vector of the base forecasts of aggregation level $k$, and $\|\cdot\|_2^2$ is the squared Euclidean norm. We analyze both in-sample (training) reconciliation errors and out-of-sample (test) reconciliation errors to assess generalizability, aggregating the corresponding observations accordingly. Depending on the level of aggregation, we may encounter multi-step ahead forecasts. To simplify, we aggregate these multi-step forecasts, providing a single error measure for each aggregation level.

The test reconciliation forecasts are acquired through the following procedure. The reconciliation method employed is trained exclusively on the training data, meaning that the covariance matrix and the corresponding base ARIMA models are estimated solely based on the training data. Subsequently, forecasts for $h$ steps ahead are generated for the test data in a cumulative manner, effectively utilizing the test data for the base test forecasts.

MSE values are computed for each level of the hierarchy as well as on an overall level by taking the sum of MSEs across all levels. The reason we consider MSE instead of a different error measure is that the minimum trace reconciliation method exactly minimizes the sum of the error variances.

For a robustness check of the results, we also consider a relative mean absolute error and use it to calculate percentage errors. Namely,

$$\text{rMAE}^{[k]}(\tilde{\mathbf{y}}, \hat{\mathbf{y}}) = \frac{\sum_i \left\| \tilde{\mathbf{y}}_i^{[k]} - \mathbf{y}_i^{[k]} \right\|_1}{\sum_i \left\| \hat{\mathbf{y}}_i^{[k]} - \mathbf{y}_i^{[k]} \right\|_1} - 1, \tag{13}$$

where $\|\cdot\|_1$ is the absolute-value norm. This error measure is inherently less sensitive to outliers. We have focused on reporting results for rMSE to keep things concise. The conclusions remain consistent even when considering rMAE or similar relative error measures.

Overall, if a percentage error is below $0$, it indicates that the reconciled forecasts perform better, whereas errors above $0$ suggest the opposite. It is important to note that we are only examining relative errors, focusing on the performance of the temporally reconciled forecasts rather than the base forecasts. Our aim is to evaluate how different types of temporal forecast reconciliation methods perform.
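Both relative error measures can be computed with one small function, sketched below for a single aggregation level. The function name and the toy forecasts are assumptions for illustration; `power=2` corresponds to rMSE and `power=1` to rMAE.

```python
def relative_error(reconciled, base, actual, power):
    """Relative error of reconciled vs. base forecasts on one aggregation level.

    Each argument is a list of forecast vectors (lists of floats).
    power=2 gives rMSE, power=1 gives rMAE as in Eq. (13).
    """
    def total(forecasts):
        # Sum of |error|^power over all vectors and all their components.
        return sum(abs(f - a) ** power
                   for f_vec, a_vec in zip(forecasts, actual)
                   for f, a in zip(f_vec, a_vec))
    return total(reconciled) / total(base) - 1

actual = [[1.0, 2.0], [3.0, 4.0]]
base = [[2.0, 2.0], [3.0, 6.0]]         # absolute errors 1, 0, 0, 2
reconciled = [[1.5, 2.0], [3.0, 5.0]]   # absolute errors 0.5, 0, 0, 1

rmse = relative_error(reconciled, base, actual, power=2)  # 1.25 / 5 - 1 = -0.75
rmae = relative_error(reconciled, base, actual, power=1)  # 1.5 / 3 - 1 = -0.5
```

Both values are negative in this toy example, meaning the reconciled forecasts have smaller errors than the base forecasts.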

4.1 Autoregressive Models of Order 1

In the first experiment, we want to demonstrate the implications of Theorem 1. We simulate stationary $\text{AR}(1)$ data on the bottom level of the hierarchy and aggregate them to obtain the remaining levels of the hierarchy. The parameters we vary are

  • Sample size on the top level $n = 20, 50, 100$,

  • AR parameter $\phi = -0.9, \dots, 0.9$,

  • Innovation variance on the bottom level $\sigma^2 = 1, 5$,

  • Hierarchy size $k \in \{4, 1\}, \{5, 1\}, \{12, 4, 1\}$,

  • Forecast horizon $h = 1, 2$, and

  • Fixed order of the ARMA models to remove model uncertainty which corresponds to Scenario 2 of Athanasopoulos et al., (2017), or automated model selection (Scenario 3).

For each setting we simulate $N = 50$ time series and compute training and test rMSE values. The training data always consist of 75% of the total data.
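The data-generating step for one setting can be sketched as follows. This is a minimal illustration under stated assumptions and not the simulation code used in the study: the series is started from its stationary distribution, the seed is arbitrary, and the 75% split is taken over top-level periods so that all levels are cut at the same point in time.

```python
import random

def simulate_ar1(phi, sigma2, length, rng):
    """Simulate a stationary AR(1) path, started from its stationary distribution."""
    sd = sigma2 ** 0.5
    y = rng.gauss(0.0, sd / (1.0 - phi ** 2) ** 0.5)
    path = []
    for _ in range(length):
        y = phi * y + rng.gauss(0.0, sd)
        path.append(y)
    return path

def build_hierarchy(bottom, levels):
    """Flow-aggregate the bottom-level series for every aggregation level k."""
    return {k: [sum(bottom[i * k:(i + 1) * k]) for i in range(len(bottom) // k)]
            for k in levels}

rng = random.Random(0)        # fixed seed, an arbitrary choice for reproducibility
n_top, levels = 20, (4, 1)    # one setting from the list of parameters
m = max(levels)
bottom = simulate_ar1(phi=0.8, sigma2=1.0, length=n_top * m, rng=rng)
hierarchy = build_hierarchy(bottom, levels)

n_train = int(0.75 * n_top)   # assumed: split by top-level periods
train = {k: series[:n_train * (m // k)] for k, series in hierarchy.items()}
test = {k: series[n_train * (m // k):] for k, series in hierarchy.items()}
```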

The covariance estimators we focus on in this simulation are

  • OLS: $\hat{W}_h = k_h I$,

  • Full Cov.: $\hat{W}_h = \frac{1}{0.75n} \sum_{i=1}^{0.75n} \left(\hat{\mathbf{e}}_i^{(h)}\right)\left(\hat{\mathbf{e}}_i^{(h)}\right)'$, where $\hat{\mathbf{e}}_i^{(h)}$ denotes the $i$-th vector of $h$-step residuals of the base forecasts, and

  • Spectral Scaling (Nystrup et al., 2021):

    1. Shrink the empirical cross-correlation matrix $R$ to $R_{\text{shrink}} = (1 - \nu)R + \nu I$.

    2. Eigen-decompose this shrunk cross-correlation matrix by $R_{\text{shrink}} = V \Lambda_{\text{shrink}} V'$ where $R = V \Lambda V'$.

    3. Reconstruct the filtered precision matrix by $Q = (WAW' + cI)^{-1}$ such that $W$ contains the first $n_{\text{eig}}$ columns of $V$ and $A = \text{diag}((1 - \nu)\lambda_1 + \nu - c, \dots, (1 - \nu)\lambda_{n_{\text{eig}}} + \nu - c)$ with $c$ being the average of the remaining smallest shrunken eigenvalues.

    4. Set $\hat{W}_h^{-1} = D_{\text{var}}^{-1/2} Q D_{\text{var}}^{-1/2}$ where $D_{\text{var}}$ corresponds to variance scaling.

    The two hyperparameters $\nu, n_{\text{eig}}$ are chosen in a time series cross-validation procedure. The authors do not follow this procedure and rather rely on an optimally chosen shrinkage parameter $\nu$ (Ledoit and Wolf, 2012) and a fixed number of chosen eigenvectors $n_{\text{eig}}$.

Other estimators, including various shrinkage estimators and scaling variants, were initially considered in this simulation but produced results very similar to those listed. Additionally, the bottom-up approach was also examined.
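The two simplest estimators from the list, OLS and the full covariance matrix, can be sketched in a few lines. The function names and the residual vectors are made up for illustration. As in the formula above, the full covariance estimator averages outer products of the residual vectors without centering them.

```python
def full_covariance(residuals):
    """Average outer product of the stacked h-step residual vectors.

    residuals: list of m vectors, one per top-level training period,
    each stacking the base forecast errors of all nodes of the hierarchy.
    """
    m, dim = len(residuals), len(residuals[0])
    return [[sum(e[i] * e[j] for e in residuals) / m for j in range(dim)]
            for i in range(dim)]

def ols_covariance(dim):
    """Identity matrix; the scalar k_h cancels in the reconciliation formula."""
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]

# Made-up residual vectors: top level first, then two bottom-level periods.
residuals = [[1.0, 0.5, 0.5], [-1.0, -1.0, 0.0], [0.0, 0.5, -0.5]]
W_full = full_covariance(residuals)
W_ols = ols_covariance(3)
```

With only as many residual vectors as there are top-level training periods, `W_full` is estimated from few observations relative to its dimension, which is the motivation for the simplified estimators.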

4.1.1 One-Step Ahead

At first, we take a look at the performance of the bottom-up approach compared to using the full covariance matrix for reconciliation. Figure 3 shows the difference of in-sample rMSE values for $h = 1, k \in \{4, 1\}, \sigma^2 = 1$ as well as fixed orders of the models to remove model uncertainty. We clearly observe that both methods result in very similar improvements once the covariance matrix can be estimated properly. The differences are driven by the top level of the hierarchy since most changes are to be expected there. Thus, the theoretical results also hold in this simulation setting. Figure 4 shows the test differences in rMSE. While the differences are indeed higher than expected, the theoretical results still hold on the test sets, and we can conclude that the full covariance matrix reconciliation method is equivalent to the bottom-up approach. Interestingly, most differences are present at larger values of $\phi$.

Figure 3: In-sample rMSE differences of the full covariance matrix and bottom-up reconciliation for $h = 1, k \in \{4, 1\}, \sigma^2 = 1$ and fixed-order models
Refer to caption
Figure 4: Out-of-sample rMSE differences of full covariance matrix and bottom-up reconciliation for h=1,k{4,1},σ2=1formulae-sequence1formulae-sequence𝑘41superscript𝜎21h=1,k\in\{4,1\},\sigma^{2}=1italic_h = 1 , italic_k ∈ { 4 , 1 } , italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = 1 and fixed-order models

Table 1 presents the training and test rMSE values for the selected reconciliation methods and parameters, grouped by buckets of the AR parameter. This allows us to distinguish between high negative or positive correlation as well as almost random walks. We observe that most improvements occur at the top level of the hierarchy, while reconciliation at the bottom level yields worse results, especially out-of-sample. Overall, we notice similar improvements for the bottom-up approach compared to more sophisticated methods once the sample size is sufficiently large. Note that the highest improvements are observed for a large AR parameter across all methods.

Table 1: Mean rMSE per bucket of $\phi$ for $h=1$, $k\in\{4,1\}$, $\sigma^2=1$ and fixed order of the used models. The standard errors are given in parentheses.
| Level | n | Recon. Type | Training [-0.9,-0.5] | Training (-0.5,0.5] | Training (0.5,0.9] | Test [-0.9,-0.5] | Test (-0.5,0.5] | Test (0.5,0.9] |
|---|---|---|---|---|---|---|---|---|
| Level 1 | 20 | Bottom-Up | 0.15 (0.02) | 0.16 (0.01) | 0.03 (0.02) | -0.08 (0.03) | -0.04 (0.02) | -0.11 (0.03) |
| Level 1 | 20 | Full Cov. | -0.03 (0.01) | 0.01 (0.01) | -0.10 (0.01) | 0.11 (0.04) | 0.09 (0.02) | -0.06 (0.03) |
| Level 1 | 20 | Spectral | -0.03 (0.00) | 0.01 (0.00) | -0.08 (0.01) | -0.02 (0.02) | -0.01 (0.01) | -0.04 (0.04) |
| Level 1 | 20 | OLS | -0.01 (0.00) | 0.01 (0.00) | -0.04 (0.00) | -0.06 (0.01) | -0.04 (0.00) | -0.08 (0.01) |
| Level 1 | 50 | Bottom-Up | 0.02 (0.01) | 0.06 (0.00) | -0.08 (0.01) | -0.11 (0.01) | -0.07 (0.01) | -0.14 (0.01) |
| Level 1 | 50 | Full Cov. | -0.05 (0.00) | -0.01 (0.00) | -0.12 (0.01) | -0.05 (0.01) | 0.01 (0.01) | -0.13 (0.01) |
| Level 1 | 50 | Spectral | -0.03 (0.00) | 0.00 (0.00) | -0.11 (0.01) | -0.05 (0.01) | -0.02 (0.01) | -0.13 (0.01) |
| Level 1 | 50 | OLS | -0.02 (0.00) | 0.00 (0.00) | -0.05 (0.00) | -0.04 (0.00) | -0.03 (0.00) | -0.06 (0.00) |
| Level 1 | 100 | Bottom-Up | -0.03 (0.00) | 0.01 (0.00) | -0.11 (0.01) | -0.09 (0.01) | -0.04 (0.00) | -0.17 (0.01) |
| Level 1 | 100 | Full Cov. | -0.05 (0.00) | -0.01 (0.00) | -0.13 (0.01) | -0.06 (0.01) | 0.00 (0.00) | -0.15 (0.01) |
| Level 1 | 100 | Spectral | -0.04 (0.00) | -0.01 (0.00) | -0.12 (0.01) | -0.05 (0.01) | -0.02 (0.00) | -0.15 (0.01) |
| Level 1 | 100 | OLS | -0.02 (0.00) | 0.00 (0.00) | -0.05 (0.00) | -0.04 (0.00) | -0.01 (0.00) | -0.06 (0.00) |
| Level 2 | 20 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 20 | Full Cov. | -0.06 (0.01) | -0.06 (0.01) | -0.08 (0.01) | 0.17 (0.02) | 0.24 (0.03) | 0.17 (0.03) |
| Level 2 | 20 | Spectral | -0.04 (0.00) | -0.04 (0.00) | -0.05 (0.01) | 0.07 (0.01) | 0.07 (0.01) | 0.12 (0.02) |
| Level 2 | 20 | OLS | -0.01 (0.00) | -0.03 (0.00) | 0.00 (0.01) | 0.02 (0.00) | 0.04 (0.01) | 0.16 (0.03) |
| Level 2 | 50 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 50 | Full Cov. | -0.02 (0.00) | -0.03 (0.00) | -0.03 (0.00) | 0.05 (0.01) | 0.09 (0.01) | 0.04 (0.01) |
| Level 2 | 50 | Spectral | -0.01 (0.00) | -0.02 (0.00) | -0.01 (0.00) | 0.02 (0.01) | 0.05 (0.01) | 0.03 (0.01) |
| Level 2 | 50 | OLS | 0.00 (0.00) | -0.01 (0.00) | 0.04 (0.01) | 0.01 (0.00) | 0.03 (0.01) | 0.11 (0.01) |
| Level 2 | 100 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 100 | Full Cov. | -0.01 (0.00) | -0.02 (0.00) | -0.02 (0.00) | 0.02 (0.00) | 0.02 (0.00) | 0.02 (0.01) |
| Level 2 | 100 | Spectral | -0.01 (0.00) | -0.01 (0.00) | 0.00 (0.00) | 0.01 (0.00) | 0.01 (0.00) | 0.02 (0.01) |
| Level 2 | 100 | OLS | 0.00 (0.00) | 0.00 (0.00) | 0.05 (0.00) | 0.01 (0.00) | 0.01 (0.00) | 0.10 (0.01) |
| Overall | 20 | Bottom-Up | 0.07 (0.01) | 0.12 (0.01) | 0.02 (0.01) | -0.08 (0.01) | -0.05 (0.01) | -0.12 (0.03) |
| Overall | 20 | Full Cov. | -0.04 (0.01) | -0.01 (0.01) | -0.11 (0.01) | 0.10 (0.02) | 0.10 (0.01) | -0.05 (0.02) |
| Overall | 20 | Spectral | -0.03 (0.00) | -0.01 (0.00) | -0.08 (0.01) | 0.01 (0.01) | 0.00 (0.01) | -0.04 (0.04) |
| Overall | 20 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.04 (0.00) | -0.03 (0.00) | -0.07 (0.00) |
| Overall | 50 | Bottom-Up | 0.01 (0.00) | 0.04 (0.00) | -0.08 (0.01) | -0.07 (0.01) | -0.06 (0.01) | -0.14 (0.01) |
| Overall | 50 | Full Cov. | -0.04 (0.00) | -0.01 (0.00) | -0.11 (0.01) | -0.01 (0.01) | 0.02 (0.00) | -0.12 (0.01) |
| Overall | 50 | Spectral | -0.02 (0.00) | 0.00 (0.00) | -0.10 (0.01) | -0.02 (0.00) | -0.01 (0.01) | -0.12 (0.01) |
| Overall | 50 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.02 (0.00) | -0.02 (0.00) | -0.05 (0.00) |
| Overall | 100 | Bottom-Up | -0.02 (0.00) | 0.01 (0.00) | -0.10 (0.01) | -0.06 (0.01) | -0.03 (0.00) | -0.15 (0.01) |
| Overall | 100 | Full Cov. | -0.03 (0.00) | -0.01 (0.00) | -0.12 (0.01) | -0.03 (0.00) | 0.00 (0.00) | -0.14 (0.01) |
| Overall | 100 | Spectral | -0.03 (0.00) | -0.01 (0.00) | -0.11 (0.00) | -0.02 (0.00) | -0.01 (0.00) | -0.14 (0.01) |
| Overall | 100 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.02 (0.00) | -0.01 (0.00) | -0.05 (0.00) |

4.1.2 Deeper Hierarchy

Table 2: Mean rMSE per bucket of $\phi$ for $h=1$, $k\in\{12,4,1\}$, $\sigma^2=1$ and fixed order of the used models. The standard errors are given in parentheses.
| Level | n | Recon. Type | Training [-0.9,-0.5] | Training (-0.5,0.5] | Training (0.5,0.9] | Test [-0.9,-0.5] | Test (-0.5,0.5] | Test (0.5,0.9] |
|---|---|---|---|---|---|---|---|---|
| Level 1 | 20 | Bottom-Up | 0.16 (0.02) | 0.20 (0.01) | 0.08 (0.02) | -0.05 (0.04) | -0.06 (0.02) | -0.13 (0.03) |
| Level 1 | 20 | Full Cov. | - | - | - | - | - | - |
| Level 1 | 20 | Spectral | -0.03 (0.01) | 0.01 (0.00) | -0.07 (0.01) | -0.06 (0.01) | 0.00 (0.01) | 0.02 (0.05) |
| Level 1 | 20 | OLS | 0.00 (0.00) | 0.01 (0.00) | -0.02 (0.00) | -0.05 (0.01) | -0.05 (0.00) | -0.08 (0.01) |
| Level 1 | 50 | Bottom-Up | 0.02 (0.01) | 0.07 (0.00) | -0.04 (0.01) | -0.10 (0.01) | -0.05 (0.01) | -0.15 (0.01) |
| Level 1 | 50 | Full Cov. | - | - | - | - | - | - |
| Level 1 | 50 | Spectral | -0.04 (0.00) | 0.00 (0.00) | -0.09 (0.01) | -0.04 (0.01) | 0.00 (0.01) | -0.07 (0.02) |
| Level 1 | 50 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.03 (0.00) | -0.05 (0.00) | -0.02 (0.00) | -0.06 (0.00) |
| Level 1 | 100 | Bottom-Up | -0.03 (0.00) | 0.03 (0.00) | -0.08 (0.01) | -0.08 (0.01) | -0.04 (0.00) | -0.13 (0.01) |
| Level 1 | 100 | Full Cov. | - | - | - | - | - | - |
| Level 1 | 100 | Spectral | -0.04 (0.00) | 0.00 (0.00) | -0.10 (0.01) | -0.04 (0.01) | -0.01 (0.00) | -0.10 (0.01) |
| Level 1 | 100 | OLS | -0.02 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.03 (0.00) | -0.02 (0.00) | -0.05 (0.00) |
| Level 2 | 20 | Bottom-Up | -0.02 (0.00) | 0.01 (0.00) | -0.02 (0.00) | -0.05 (0.01) | -0.01 (0.00) | -0.04 (0.01) |
| Level 2 | 20 | Full Cov. | - | - | - | - | - | - |
| Level 2 | 20 | Spectral | -0.07 (0.00) | -0.05 (0.00) | -0.09 (0.01) | 0.02 (0.01) | 0.06 (0.01) | 0.18 (0.05) |
| Level 2 | 20 | OLS | -0.03 (0.00) | -0.03 (0.00) | -0.04 (0.01) | 0.00 (0.01) | 0.02 (0.01) | 0.12 (0.02) |
| Level 2 | 50 | Bottom-Up | -0.03 (0.00) | 0.00 (0.00) | -0.02 (0.00) | -0.04 (0.00) | -0.01 (0.00) | -0.04 (0.00) |
| Level 2 | 50 | Full Cov. | - | - | - | - | - | - |
| Level 2 | 50 | Spectral | -0.05 (0.00) | -0.03 (0.00) | -0.05 (0.00) | 0.02 (0.01) | 0.03 (0.01) | 0.03 (0.01) |
| Level 2 | 50 | OLS | -0.02 (0.00) | -0.01 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.01 (0.00) | 0.05 (0.01) |
| Level 2 | 100 | Bottom-Up | -0.04 (0.00) | 0.00 (0.00) | -0.03 (0.00) | -0.05 (0.00) | -0.01 (0.00) | -0.03 (0.00) |
| Level 2 | 100 | Full Cov. | - | - | - | - | - | - |
| Level 2 | 100 | Spectral | -0.04 (0.00) | -0.01 (0.00) | -0.04 (0.00) | -0.02 (0.00) | 0.01 (0.00) | -0.01 (0.00) |
| Level 2 | 100 | OLS | -0.02 (0.00) | -0.01 (0.00) | 0.01 (0.00) | -0.01 (0.00) | 0.01 (0.00) | 0.04 (0.01) |
| Level 3 | 20 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 3 | 20 | Full Cov. | - | - | - | - | - | - |
| Level 3 | 20 | Spectral | -0.03 (0.00) | -0.04 (0.00) | -0.06 (0.01) | 0.04 (0.01) | 0.05 (0.01) | 0.19 (0.04) |
| Level 3 | 20 | OLS | 0.00 (0.00) | -0.02 (0.00) | -0.01 (0.01) | 0.01 (0.00) | 0.01 (0.00) | 0.14 (0.02) |
| Level 3 | 50 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 3 | 50 | Full Cov. | - | - | - | - | - | - |
| Level 3 | 50 | Spectral | -0.02 (0.00) | -0.02 (0.00) | -0.03 (0.00) | 0.03 (0.01) | 0.02 (0.00) | 0.06 (0.01) |
| Level 3 | 50 | OLS | 0.00 (0.00) | -0.01 (0.00) | 0.02 (0.00) | 0.01 (0.00) | 0.01 (0.00) | 0.07 (0.01) |
| Level 3 | 100 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 3 | 100 | Full Cov. | - | - | - | - | - | - |
| Level 3 | 100 | Spectral | -0.01 (0.00) | -0.01 (0.00) | -0.01 (0.00) | 0.01 (0.00) | 0.01 (0.00) | 0.02 (0.00) |
| Level 3 | 100 | OLS | 0.00 (0.00) | 0.00 (0.00) | 0.03 (0.00) | 0.01 (0.00) | 0.00 (0.00) | 0.05 (0.01) |
| Overall | 20 | Bottom-Up | 0.06 (0.01) | 0.14 (0.01) | 0.06 (0.02) | -0.09 (0.01) | -0.06 (0.01) | -0.14 (0.02) |
| Overall | 20 | Full Cov. | - | - | - | - | - | - |
| Overall | 20 | Spectral | -0.04 (0.00) | -0.01 (0.00) | -0.08 (0.01) | -0.03 (0.01) | 0.00 (0.01) | 0.01 (0.04) |
| Overall | 20 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.03 (0.00) | -0.04 (0.00) | -0.04 (0.00) | -0.07 (0.01) |
| Overall | 50 | Bottom-Up | 0.00 (0.00) | 0.05 (0.00) | -0.04 (0.01) | -0.08 (0.01) | -0.04 (0.01) | -0.14 (0.01) |
| Overall | 50 | Full Cov. | - | - | - | - | - | - |
| Overall | 50 | Spectral | -0.04 (0.00) | -0.01 (0.00) | -0.09 (0.01) | -0.02 (0.01) | 0.00 (0.01) | -0.06 (0.02) |
| Overall | 50 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.03 (0.00) | -0.03 (0.00) | -0.02 (0.00) | -0.05 (0.00) |
| Overall | 100 | Bottom-Up | -0.02 (0.00) | 0.02 (0.00) | -0.07 (0.01) | -0.06 (0.01) | -0.03 (0.00) | -0.12 (0.01) |
| Overall | 100 | Full Cov. | - | - | - | - | - | - |
| Overall | 100 | Spectral | -0.04 (0.00) | -0.01 (0.00) | -0.09 (0.01) | -0.03 (0.00) | -0.01 (0.00) | -0.10 (0.01) |
| Overall | 100 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.03 (0.00) | -0.02 (0.00) | -0.01 (0.00) | -0.04 (0.00) |

Table 2 displays the training and test errors for a three-level hierarchy using fixed-order models. Note that in this scenario, the full covariance matrix cannot be estimated because the simple models produce a singular covariance matrix of the base forecast errors. This issue also arises with automatically selected base models. For the other methods, we observe similar improvements at the top level. Interestingly, the spectral method based on dimension reduction performs exceptionally well in-sample, yielding better results than the bottom-up reconciliation method. Out-of-sample, this relationship reverses and the bottom-up approach generalizes better.

4.1.3 Multi-Step Ahead

Table 3: Mean rMSE per bucket of $\phi$ for $h=2$, $k\in\{4,1\}$, $\sigma^2=1$ and fixed order of the used models. The standard errors are given in parentheses.
| Level | n | Recon. Type | Training [-0.9,-0.5] | Training (-0.5,0.5] | Training (0.5,0.9] | Test [-0.9,-0.5] | Test (-0.5,0.5] | Test (0.5,0.9] |
|---|---|---|---|---|---|---|---|---|
| Level 1 | 20 | Bottom-Up | 0.02 (0.01) | 0.06 (0.00) | -0.11 (0.02) | -0.01 (0.02) | -0.02 (0.01) | 0.02 (0.02) |
| Level 1 | 20 | Full Cov. | 0.00 (0.02) | 0.02 (0.01) | -0.14 (0.03) | 0.18 (0.05) | 0.16 (0.03) | 0.21 (0.05) |
| Level 1 | 20 | Spectral | -0.01 (0.00) | 0.00 (0.00) | -0.15 (0.02) | -0.01 (0.01) | 0.00 (0.01) | 0.15 (0.05) |
| Level 1 | 20 | OLS | 0.00 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.01 (0.00) | -0.01 (0.00) | -0.01 (0.00) |
| Level 1 | 50 | Bottom-Up | -0.01 (0.00) | 0.03 (0.00) | -0.14 (0.01) | -0.04 (0.01) | -0.02 (0.01) | -0.04 (0.01) |
| Level 1 | 50 | Full Cov. | -0.04 (0.00) | - | -0.19 (0.01) | 0.02 (0.01) | - | 0.01 (0.01) |
| Level 1 | 50 | Spectral | -0.02 (0.00) | 0.00 (0.00) | -0.17 (0.01) | -0.02 (0.00) | -0.01 (0.01) | -0.02 (0.01) |
| Level 1 | 50 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.01 (0.00) | -0.01 (0.00) | -0.02 (0.00) |
| Level 1 | 100 | Bottom-Up | -0.02 (0.00) | 0.01 (0.00) | -0.16 (0.01) | -0.02 (0.00) | -0.02 (0.00) | -0.02 (0.01) |
| Level 1 | 100 | Full Cov. | -0.04 (0.00) | 0.00 (0.00) | -0.22 (0.01) | 0.01 (0.01) | 0.01 (0.00) | 0.02 (0.01) |
| Level 1 | 100 | Spectral | -0.02 (0.00) | 0.00 (0.00) | -0.20 (0.01) | -0.02 (0.00) | 0.00 (0.00) | 0.01 (0.01) |
| Level 1 | 100 | OLS | -0.01 (0.00) | 0.00 (0.00) | -0.05 (0.00) | -0.01 (0.00) | -0.01 (0.00) | -0.01 (0.00) |
| Level 2 | 20 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 20 | Full Cov. | -0.08 (0.01) | -0.05 (0.01) | -0.06 (0.02) | 0.18 (0.03) | 0.23 (0.03) | 0.13 (0.02) |
| Level 2 | 20 | Spectral | -0.03 (0.00) | -0.02 (0.00) | -0.04 (0.01) | 0.01 (0.00) | 0.03 (0.01) | 0.06 (0.01) |
| Level 2 | 20 | OLS | 0.00 (0.00) | -0.01 (0.00) | 0.12 (0.02) | 0.00 (0.00) | 0.01 (0.00) | 0.02 (0.01) |
| Level 2 | 50 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 50 | Full Cov. | -0.09 (0.01) | - | -0.06 (0.00) | 0.11 (0.01) | - | 0.04 (0.01) |
| Level 2 | 50 | Spectral | -0.03 (0.00) | -0.01 (0.00) | -0.04 (0.00) | 0.02 (0.00) | 0.01 (0.00) | 0.02 (0.00) |
| Level 2 | 50 | OLS | 0.00 (0.00) | -0.01 (0.00) | 0.12 (0.01) | 0.00 (0.00) | 0.01 (0.00) | 0.03 (0.01) |
| Level 2 | 100 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 100 | Full Cov. | -0.09 (0.00) | -0.01 (0.00) | -0.06 (0.00) | 0.08 (0.01) | 0.03 (0.00) | 0.03 (0.00) |
| Level 2 | 100 | Spectral | -0.04 (0.00) | 0.00 (0.00) | -0.04 (0.00) | 0.01 (0.00) | 0.01 (0.00) | 0.02 (0.00) |
| Level 2 | 100 | OLS | 0.00 (0.00) | 0.00 (0.00) | 0.14 (0.01) | 0.00 (0.00) | 0.00 (0.00) | 0.01 (0.00) |
| Overall | 20 | Bottom-Up | 0.01 (0.00) | 0.04 (0.00) | -0.11 (0.02) | -0.01 (0.01) | -0.02 (0.01) | 0.01 (0.02) |
| Overall | 20 | Full Cov. | -0.04 (0.01) | 0.00 (0.01) | -0.14 (0.03) | 0.12 (0.02) | 0.16 (0.02) | 0.18 (0.04) |
| Overall | 20 | Spectral | -0.02 (0.00) | 0.00 (0.00) | -0.15 (0.02) | 0.00 (0.00) | 0.00 (0.01) | 0.11 (0.04) |
| Overall | 20 | OLS | 0.00 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.01 (0.00) | -0.01 (0.00) | -0.01 (0.00) |
| Overall | 50 | Bottom-Up | 0.00 (0.00) | 0.02 (0.00) | -0.13 (0.01) | -0.02 (0.00) | -0.02 (0.00) | -0.04 (0.01) |
| Overall | 50 | Full Cov. | -0.06 (0.00) | - | -0.19 (0.01) | 0.06 (0.01) | - | 0.01 (0.01) |
| Overall | 50 | Spectral | -0.03 (0.00) | 0.00 (0.00) | -0.17 (0.01) | 0.00 (0.00) | 0.00 (0.00) | -0.02 (0.01) |
| Overall | 50 | OLS | 0.00 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.01 (0.00) | -0.01 (0.00) | -0.01 (0.00) |
| Overall | 100 | Bottom-Up | -0.01 (0.00) | 0.01 (0.00) | -0.15 (0.01) | -0.01 (0.00) | -0.01 (0.00) | -0.02 (0.01) |
| Overall | 100 | Full Cov. | -0.07 (0.00) | 0.00 (0.00) | -0.21 (0.01) | 0.04 (0.00) | 0.01 (0.00) | 0.02 (0.01) |
| Overall | 100 | Spectral | -0.03 (0.00) | 0.00 (0.00) | -0.19 (0.01) | 0.00 (0.00) | 0.00 (0.00) | 0.01 (0.01) |
| Overall | 100 | OLS | 0.00 (0.00) | 0.00 (0.00) | -0.04 (0.00) | 0.00 (0.00) | 0.00 (0.00) | -0.01 (0.00) |

As we extend the forecast horizon, the results shift, with the bottom-up approach performing worse compared to using the full covariance matrix or even the reduced spectral-based one, as shown in Table 3. This trend holds in-sample; however, out-of-sample, the situation changes. The bottom-up method then produces the best test relative errors, as previously observed.

4.1.4 Odd Hierarchy Width

Table 4: Mean rMSE per bucket of $\phi$ for $h=1$, $k\in\{5,1\}$, $\sigma^2=1$ and fixed order of the used models. The standard errors are given in parentheses.
| Level | n | Recon. Type | Training [-0.9,-0.5] | Training (-0.5,0.5] | Training (0.5,0.9] | Test [-0.9,-0.5] | Test (-0.5,0.5] | Test (0.5,0.9] |
|---|---|---|---|---|---|---|---|---|
| Level 1 | 20 | Bottom-Up | 0.05 (0.02) | 0.17 (0.01) | 0.02 (0.02) | -0.15 (0.03) | -0.09 (0.02) | -0.17 (0.02) |
| Level 1 | 20 | Full Cov. | -0.10 (0.01) | 0.02 (0.01) | -0.11 (0.01) | 0.00 (0.04) | 0.07 (0.02) | -0.07 (0.02) |
| Level 1 | 20 | Spectral | -0.09 (0.01) | 0.01 (0.00) | -0.08 (0.01) | -0.08 (0.02) | 0.01 (0.01) | -0.11 (0.02) |
| Level 1 | 20 | OLS | -0.03 (0.00) | 0.00 (0.00) | -0.03 (0.00) | -0.07 (0.00) | -0.04 (0.00) | -0.07 (0.00) |
| Level 1 | 50 | Bottom-Up | -0.07 (0.01) | 0.06 (0.00) | -0.08 (0.01) | -0.16 (0.01) | -0.07 (0.01) | -0.15 (0.02) |
| Level 1 | 50 | Full Cov. | -0.12 (0.01) | 0.00 (0.01) | -0.12 (0.01) | -0.14 (0.01) | 0.02 (0.01) | -0.12 (0.01) |
| Level 1 | 50 | Spectral | -0.10 (0.01) | 0.00 (0.00) | -0.10 (0.01) | -0.11 (0.01) | -0.02 (0.01) | -0.10 (0.02) |
| Level 1 | 50 | OLS | -0.04 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.06 (0.00) | -0.02 (0.00) | -0.05 (0.00) |
| Level 1 | 100 | Bottom-Up | -0.11 (0.01) | 0.02 (0.00) | -0.12 (0.01) | -0.15 (0.01) | -0.05 (0.00) | -0.13 (0.01) |
| Level 1 | 100 | Full Cov. | -0.13 (0.01) | -0.01 (0.00) | -0.13 (0.01) | -0.14 (0.01) | -0.01 (0.00) | -0.12 (0.01) |
| Level 1 | 100 | Spectral | -0.11 (0.01) | 0.00 (0.00) | -0.13 (0.01) | -0.12 (0.01) | -0.02 (0.00) | -0.12 (0.01) |
| Level 1 | 100 | OLS | -0.04 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.05 (0.00) | -0.01 (0.00) | -0.05 (0.00) |
| Level 2 | 20 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 20 | Full Cov. | -0.06 (0.00) | -0.06 (0.01) | -0.09 (0.01) | 0.16 (0.02) | 0.21 (0.02) | 0.24 (0.03) |
| Level 2 | 20 | Spectral | -0.03 (0.00) | -0.04 (0.00) | -0.05 (0.01) | 0.07 (0.01) | 0.11 (0.02) | 0.16 (0.03) |
| Level 2 | 20 | OLS | 0.00 (0.00) | -0.03 (0.00) | 0.01 (0.01) | 0.02 (0.00) | 0.05 (0.01) | 0.21 (0.03) |
| Level 2 | 50 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 50 | Full Cov. | -0.02 (0.00) | -0.03 (0.00) | -0.03 (0.00) | 0.03 (0.00) | 0.07 (0.01) | 0.05 (0.01) |
| Level 2 | 50 | Spectral | -0.01 (0.00) | -0.02 (0.00) | -0.01 (0.00) | 0.01 (0.00) | 0.03 (0.01) | 0.06 (0.01) |
| Level 2 | 50 | OLS | 0.00 (0.00) | -0.01 (0.00) | 0.04 (0.01) | 0.01 (0.00) | 0.02 (0.00) | 0.12 (0.01) |
| Level 2 | 100 | Bottom-Up | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) |
| Level 2 | 100 | Full Cov. | -0.01 (0.00) | -0.02 (0.00) | -0.01 (0.00) | 0.01 (0.00) | 0.03 (0.00) | 0.01 (0.00) |
| Level 2 | 100 | Spectral | -0.01 (0.00) | -0.01 (0.00) | 0.00 (0.00) | 0.01 (0.00) | 0.01 (0.00) | 0.01 (0.00) |
| Level 2 | 100 | OLS | 0.00 (0.00) | 0.00 (0.00) | 0.06 (0.00) | 0.01 (0.00) | 0.01 (0.00) | 0.08 (0.01) |
| Overall | 20 | Bottom-Up | 0.02 (0.01) | 0.13 (0.01) | 0.02 (0.02) | -0.13 (0.02) | -0.10 (0.01) | -0.17 (0.02) |
| Overall | 20 | Full Cov. | -0.09 (0.01) | 0.00 (0.01) | -0.11 (0.01) | 0.02 (0.02) | 0.07 (0.01) | -0.06 (0.02) |
| Overall | 20 | Spectral | -0.07 (0.01) | 0.00 (0.00) | -0.08 (0.01) | -0.04 (0.01) | 0.01 (0.01) | -0.10 (0.02) |
| Overall | 20 | OLS | -0.02 (0.00) | 0.00 (0.00) | -0.03 (0.00) | -0.04 (0.00) | -0.03 (0.00) | -0.06 (0.00) |
| Overall | 50 | Bottom-Up | -0.05 (0.01) | 0.04 (0.00) | -0.08 (0.01) | -0.11 (0.01) | -0.06 (0.01) | -0.15 (0.01) |
| Overall | 50 | Full Cov. | -0.08 (0.00) | -0.01 (0.01) | -0.12 (0.01) | -0.08 (0.01) | 0.02 (0.01) | -0.11 (0.01) |
| Overall | 50 | Spectral | -0.07 (0.00) | 0.00 (0.00) | -0.10 (0.01) | -0.07 (0.01) | -0.02 (0.00) | -0.09 (0.02) |
| Overall | 50 | OLS | -0.02 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.03 (0.00) | -0.02 (0.00) | -0.05 (0.00) |
| Overall | 100 | Bottom-Up | -0.07 (0.00) | 0.02 (0.00) | -0.11 (0.01) | -0.10 (0.01) | -0.04 (0.00) | -0.13 (0.01) |
| Overall | 100 | Full Cov. | -0.08 (0.00) | -0.01 (0.00) | -0.13 (0.01) | -0.08 (0.01) | 0.00 (0.00) | -0.12 (0.01) |
| Overall | 100 | Spectral | -0.07 (0.00) | 0.00 (0.00) | -0.12 (0.01) | -0.07 (0.01) | -0.01 (0.00) | -0.11 (0.01) |
| Overall | 100 | OLS | -0.02 (0.00) | 0.00 (0.00) | -0.04 (0.00) | -0.03 (0.00) | -0.01 (0.00) | -0.04 (0.00) |

So far, we have only considered even hierarchy widths such as $\{4,1\}$ or $\{12,4,1\}$. These even aggregations result in a non-negative AR parameter at the top level, even if the bottom-level model is generated with a negative one. Table 4 shows the training and test relative errors for the odd-width hierarchy $\{5,1\}$. We observe that for a negative AR parameter, the overall improvements are considerably larger. In-sample, covariance-based methods still perform better in low sample size settings, with the difference becoming marginal for larger sample sizes. However, the bottom-up method yields better results on the test set.
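The sign behaviour of the top-level AR parameter can be checked numerically. The sketch below assumes the standard temporal aggregation result that non-overlapping sums of an $\text{AR}(1)$ process with parameter $\phi$ follow an $\text{ARMA}(1,1)$ process with AR parameter $\phi^k$, and recovers that parameter as the ratio of the lag-2 to the lag-1 autocovariance of the aggregate. The value $\phi=-0.8$ and the function names are illustrative choices.

```python
import numpy as np

def aggregated_acov(phi, k, lag, sigma2=1.0):
    """Exact autocovariance at `lag` of non-overlapping k-period sums of a stationary AR(1)."""
    i = np.arange(k)
    # Absolute lags between all pairs of bottom-level periods in two windows `lag` apart.
    h = np.abs(k * lag + i[:, None] - i[None, :])
    return sigma2 / (1.0 - phi**2) * np.sum(np.power(phi, h))

def top_level_ar(phi, k):
    """AR parameter implied at the top level: gamma(2) / gamma(1) of the aggregate."""
    return aggregated_acov(phi, k, 2) / aggregated_acov(phi, k, 1)

phi = -0.8  # hypothetical negative bottom-level AR parameter
even_width = top_level_ar(phi, 4)  # equals phi**4, hence non-negative
odd_width = top_level_ar(phi, 5)   # equals phi**5, hence keeps the negative sign
```

With an even width the negative sign is lost at the top level, whereas an odd width preserves it, which is the case examined in Table 4.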

4.2 ARMA Models of Higher Order

For more complex models such as $\text{ARMA}(2,2)$ and its aggregates, computing the covariance matrices of forecast errors becomes very tedious. Therefore, we focus on experimental evaluation for these cases to investigate if the implications of Theorem 1 still hold.

As the complexity of an ARMA model increases, identifying the parameter space that yields stationary models becomes non-trivial. It is particularly challenging to define stationary parameter combinations for $p,q>2$. To address this, we randomly draw stationary parameters using the partial correlation function as described by Jones (1987).

For each combination of $p\in\{1,2\}$ and $q\in\{0,1,2\}$, we randomly draw $100$ sets of parameters $\phi_1,\dots,\phi_p,\theta_1,\dots,\theta_q$. To mitigate the randomness of each realization, we further simulate $20$ time series for each of the $100$ random parameter sets.
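The idea of drawing stationary parameters through partial autocorrelations can be sketched as follows. Any set of partial autocorrelations in $(-1,1)$ maps to a stationary AR polynomial via the Durbin-Levinson recursion. This sketch draws the partial autocorrelations uniformly, which is a simplifying assumption and not necessarily the sampling distribution used by Jones (1987) or in the simulations here. The function names are illustrative.

```python
import numpy as np

def pacf_to_ar(pacf):
    """Map partial autocorrelations in (-1, 1) to AR coefficients (Durbin-Levinson recursion)."""
    phi = np.array([], dtype=float)
    for r in pacf:
        # phi_{k,j} = phi_{k-1,j} - r * phi_{k-1,k-j} for j < k, and phi_{k,k} = r
        phi = np.append(phi - r * phi[::-1], r)
    return phi

def is_stationary(phi):
    """Check that all roots of z^p - phi_1 z^(p-1) - ... - phi_p lie inside the unit circle."""
    roots = np.roots(np.concatenate(([1.0], -np.asarray(phi))))
    return bool(np.all(np.abs(roots) < 1.0))

rng = np.random.default_rng(0)
p = 2
# 100 random AR(2) parameter sets; uniform draws are an assumption made for illustration.
draws = [pacf_to_ar(rng.uniform(-0.99, 0.99, size=p)) for _ in range(100)]
```

Because the mapping itself guarantees stationarity, no rejection step is needed, which is what makes this approach practical for higher orders.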

Figure 5: In-sample rMSE for various ARMA models and $h=1$, $k\in\{4,1\}$, $\sigma^2=1$ and fixed-order models.

Figure 5 shows the in-sample rMSE values for the full covariance estimator as well as the bottom-up approach for various sample sizes of the top level. The setting is $h=1$, $k\in\{4,1\}$ and $\sigma^2=1$ with fixed-order models. As in the $\text{AR}(1)$ case with varying AR parameters, we observe equivalent reconciliation performance at larger sample sizes for every $\text{ARMA}(p,q)$ model considered. In the low sample size case, the bottom-up approach performs worse, and this difference becomes larger for higher model complexity. We also observe that the full covariance method can produce better forecasts on the bottom level. This improvement also increases with the complexity of the bottom-level base model. Overall, the MA order does not seem as impactful as the AR order.

Figure 6: Out-of-sample rMSE for various ARMA models and $h=1$, $k\in\{4,1\}$, $\sigma^2=1$ and fixed-order models.

Figure 6 shows the test errors for the very same setting. As in the simple $\text{AR}(1)$ case, the roles of the bottom-up approach and the full covariance matrix estimator switch, and the bottom-up approach performs better the more complex the bottom-level base model is.

In this analysis, we aggregate over the whole space of stationary models of a certain order. Hence, we also take a more detailed look at the performance of $2$-dimensional base models. Figure 7 shows the mean training rMSE differences between the full covariance-based reconciliation and the bottom-up approach for the randomly drawn stationary $\text{AR}(2)$ models. Based on this plot, performance shows no systematic dependence on the location within the space of stationary parameters. Test errors are available in the Appendix in Figure 14.

Figure 7: In-sample mean rMSE differences of the full covariance matrix and bottom-up reconciliation for $h=1$, $k\in\{4,1\}$, $\sigma^2=1$ and fixed-order models.

Similarly, Figure 8 shows the training mean rMSE differences for $\text{ARMA}(1,1)$ models. Test errors are available in the Appendix in Figure 15.

Figure 8: In-sample mean rMSE differences of the full covariance matrix and bottom-up reconciliation for $h=1$, $k\in\{4,1\}$, $\sigma^2=1$ and fixed-order models.

5 Real Data Applications

5.1 A&E Emergency Service Demand

Following the data example of Athanasopoulos et al. (2017), we illustrate this paper's work on the Accident & Emergency Service Demand dataset, available from the thief package in R. In this dataset, a number of demand statistics of A&E departments are recorded on a weekly basis from 2010-11-07 to 2015-06-07.

Before any modeling, we perform some preprocessing. To ensure complete observations for the hierarchy, we remove the incomplete years 2010 and 2015, resulting in 208 weeks of data. Next, we decompose the weekly time series of interest into seasonal, trend, and remainder components using the stl function in R, and remove the seasonal component. For interpretability, we also demean the resulting non-seasonal weekly time series.

We analyze the Total Attendances time series and aggregate it on a monthly basis, resulting in a small hierarchy with 52 months of data. The training data consists of the first 41 months, or 164 weeks, with the remaining data designated as test data. As before, we focus on cumulative one-step-ahead forecasts at the top level of the hierarchy, which in this case are month-by-month forecasts. Using automated model selection, the chosen models are $\text{ARIMA}(0,0,0)$ and $\text{ARIMA}(1,1,1)$, respectively.

To stay within the framework of temporally aggregated ARIMA models, we fix the orders of the used models accordingly. This yields an $\text{ARIMA}(1,1,2)$ model for the monthly time series. The resulting model on the top level gives an AICc value of $406.47$, which is only around $0.6\%$ worse than the automatically selected model, so it still seems to be an appropriate model. Table 5 shows the corresponding errors. We observe better generalization of the bottom-up approach compared to using the full covariance matrix. The spectral method performs quite well out-of-sample, leading to results similar to those of the bottom-up approach. A common aspect is that each covariance-based reconciliation method achieves worse forecasts on the test set for the bottom-level time series.

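Table 5: Results for A&E Total Attendances in units of $\text{People}^2$ and Wool Production in units of $(100~\text{tonnes})^2$ with fixed-order models.
| Dataset | Level | Training Base MSE | Test Base MSE | Recon. Type | Training Recon. MSE | Test Recon. MSE | Training rMSE | Test rMSE |
|---|---|---|---|---|---|---|---|---|
| A&E Total Attendances | Annual | 1125.60 | 2166.15 | Bottom-Up | 1219.43 | 1981.83 | 0.08 | -0.09 |
| A&E Total Attendances | Annual | 1125.60 | 2166.15 | Full Cov. | 1124.78 | 2222.40 | 0.00 | 0.03 |
| A&E Total Attendances | Annual | 1125.60 | 2166.15 | Spectral | 1169.64 | 2021.56 | 0.04 | -0.07 |
| A&E Total Attendances | Annual | 1125.60 | 2166.15 | OLS | 1132.26 | 2112.09 | 0.01 | -0.02 |
| A&E Total Attendances | Quarterly | 150.19 | 170.12 | Bottom-Up | 150.19 | 170.12 | 0.00 | 0.00 |
| A&E Total Attendances | Quarterly | 150.19 | 170.12 | Full Cov. | 148.26 | 186.73 | -0.01 | 0.10 |
| A&E Total Attendances | Quarterly | 150.19 | 170.12 | Spectral | 148.44 | 171.55 | -0.01 | 0.01 |
| A&E Total Attendances | Quarterly | 150.19 | 170.12 | OLS | 147.74 | 176.55 | -0.02 | 0.04 |
| A&E Total Attendances | Overall | 1275.79 | 2336.26 | Bottom-Up | 1369.62 | 2151.95 | 0.07 | -0.08 |
| A&E Total Attendances | Overall | 1275.79 | 2336.26 | Full Cov. | 1273.04 | 2409.12 | 0.00 | 0.03 |
| A&E Total Attendances | Overall | 1275.79 | 2336.26 | Spectral | 1318.08 | 2193.11 | 0.03 | -0.06 |
| A&E Total Attendances | Overall | 1275.79 | 2336.26 | OLS | 1280.00 | 2288.64 | 0.00 | -0.02 |
| Wool Production | Annual | 149.18 | 131.83 | Bottom-Up | 156.23 | 293.50 | 0.05 | 1.23 |
| Wool Production | Annual | 149.18 | 131.83 | Full Cov. | 119.19 | 330.31 | -0.20 | 1.51 |
| Wool Production | Annual | 149.18 | 131.83 | Spectral | 134.30 | 200.81 | -0.10 | 0.52 |
| Wool Production | Annual | 149.18 | 131.83 | OLS | 141.12 | 146.46 | -0.05 | 0.11 |
| Wool Production | Biannual | 75.42 | 37.98 | Bottom-Up | 80.55 | 80.23 | 0.07 | 1.11 |
| Wool Production | Biannual | 75.42 | 37.98 | Full Cov. | 50.70 | 87.30 | -0.33 | 1.30 |
| Wool Production | Biannual | 75.42 | 37.98 | Spectral | 59.84 | 53.83 | -0.21 | 0.42 |
| Wool Production | Biannual | 75.42 | 37.98 | OLS | 66.77 | 40.65 | -0.11 | 0.07 |
| Wool Production | Quarterly | 24.59 | 23.54 | Bottom-Up | 24.59 | 23.54 | 0.00 | 0.00 |
| Wool Production | Quarterly | 24.59 | 23.54 | Full Cov. | 16.43 | 24.89 | -0.33 | 0.06 |
| Wool Production | Quarterly | 24.59 | 23.54 | Spectral | 19.11 | 16.60 | -0.22 | -0.29 |
| Wool Production | Quarterly | 24.59 | 23.54 | OLS | 21.15 | 13.64 | -0.14 | -0.42 |
| Wool Production | Overall | 249.19 | 193.35 | Bottom-Up | 261.37 | 397.27 | 0.05 | 1.05 |
| Wool Production | Overall | 249.19 | 193.35 | Full Cov. | 186.32 | 442.49 | -0.25 | 1.29 |
| Wool Production | Overall | 249.19 | 193.35 | Spectral | 213.25 | 271.25 | -0.14 | 0.40 |
| Wool Production | Overall | 249.19 | 193.35 | OLS | 229.04 | 200.75 | -0.08 | 0.04 |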
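The rMSE columns of Table 5 are consistent with the relative change in MSE against the base forecast, that is, the reconciled MSE divided by the base MSE minus one, so that negative values indicate an improvement through reconciliation. A minimal check under this assumed definition, using the top-level A&E values from Table 5 (the function name is illustrative):

```python
def relative_mse(mse_reconciled, mse_base):
    """Relative change in MSE after reconciliation; negative values indicate improvement."""
    return mse_reconciled / mse_base - 1.0

# Top-level A&E Total Attendances, bottom-up reconciliation (values from Table 5).
train_bu = round(relative_mse(1219.43, 1125.60), 2)  # reported training rMSE: 0.08
test_bu = round(relative_mse(1981.83, 2166.15), 2)   # reported test rMSE: -0.09

# Same level, full covariance reconciliation.
test_full = round(relative_mse(2222.40, 2166.15), 2)  # reported test rMSE: 0.03
```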

Figure 9 shows the transformed time series as well as the base and reconciled forecasts, split by training and test set for the bottom-up and full covariance approach.

Figure 9: Transformed weekly and monthly time series of the A&E Total Attendances data with base and reconciled forecasts for the bottom-up and full covariance method using fixed-order models. The vertical line indicates the split between training and test data.

5.2 Wool Production

Another popular dataset is the woolyrnq dataset, available from the forecast package in R. It contains the quarterly production of woolen yarn in Australia, given in units of tonnes, from March 1965 to September 1994. We aggregate the data to biannual as well as annual frequency, yielding a $3$-level hierarchy with $k\in\{4,2,1\}$. In order to have complete observations, we remove the partially observed last year 1994. This gives us 116 quarters, 58 half-years, and 29 years of data. As previously, we split the data into $80\%$ training data, leading to 23 training years.
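The construction of this hierarchy and the training split can be sketched as follows, assuming aggregation by non-overlapping sums. The quarterly series below is a random placeholder with the stated length, not the woolyrnq data, and the function name is illustrative.

```python
import numpy as np

def temporal_hierarchy(y, ks=(4, 2, 1)):
    """Aggregate a bottom-level series into non-overlapping sums of k consecutive periods."""
    y = np.asarray(y, dtype=float)
    return {k: y.reshape(-1, k).sum(axis=1) for k in ks}

rng = np.random.default_rng(0)
quarterly = rng.normal(size=116)  # placeholder for the 116 complete quarters

levels = temporal_hierarchy(quarterly)  # k=4: annual, k=2: biannual, k=1: quarterly
n_years = levels[4].size                # 29 years
n_train_years = int(0.8 * n_years)      # 23 training years

# Split every level at the same calendar point so the hierarchy stays aligned.
train = {k: series[: n_train_years * (4 // k)] for k, series in levels.items()}
test = {k: series[n_train_years * (4 // k):] for k, series in levels.items()}
```

Splitting on whole years keeps the training and test sets coherent across levels: 23 training years correspond to 46 half-years and 92 quarters.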

In contrast to the A&E data, we do not perform any preprocessing besides de-meaning for interpretability purposes. A seasonality decomposition such as stl is not suitable for the annual time series, hence we do not perform it at all.

Table 5 presents the results for fixed-order models. According to AICc, the most suitable model for the quarterly time series is an $\text{ARIMA}(3,1,2)$ model, which is already quite complex. The theory of aggregated ARIMA models then gives us $\text{ARIMA}(3,1,3)$ and $\text{ARIMA}(3,1,4)$ models for the biannual and annual time series, respectively. Despite the relatively small sample sizes for the biannual and annual data, these high-complexity models do not seem to suffer from overfitting. Using automated model selection, the corresponding models would be $\text{ARIMA}(0,1,0)$ and $\text{ARIMA}(1,1,1)$, respectively, which produce very similar results. Therefore, we only present the results for the fixed-order case.

Nevertheless, we observe similar effects as with the A&E data. The bottom-up approach performs worse on the training data compared to covariance-based reconciliation methods. On the test data, both the bottom-up approach and the full covariance method exhibit poor generalization, while the spectral and OLS methods perform better. Notably, the full covariance method generalizes even worse than the bottom-up approach, a consistent finding across all data examples and simulations.

Figure 10 shows the transformed time series as well as the base and reconciled forecasts, split by training and test set for the bottom-up and full covariance approach.

Figure 10: Quarterly, biannual, and annual time series of the Wool Production data with base and reconciled forecasts for the bottom-up and full covariance method using fixed-order models. The vertical line indicates the split between training and test data.

5.3 Additional Datasets

We run experiments on some additional datasets and give an overall summary of the results. Based on the forecasting literature, especially hierarchical forecast reconciliation, we select the following 5 datasets.

  • Energy (Panagiotelis et al., 2023): Daily electricity generation per source, available from the author's GitHub repository (https://github.com/PuwasalaG/Probabilistic-Forecast-Reconciliation).

  • Food (Neubauer and Filzmoser, 2024): Daily data from smart fridges with the goal of forecasting the demand for each fridge for the upcoming week in a one-step-ahead fashion.

  • M3 (Makridakis and Hibon, 2000): Quarterly data of the M3 competition. The data was obtained from the R package Mcomp (Hyndman, 2018).

  • Prison (Hyndman and Athanasopoulos, 2018): Quarterly data about Australian prison population per state.

  • Tourism (Wickramasuriya et al., 2019; Girolimetto et al., 2023): Monthly data about visitor nights in Australian districts, taken from GitHub (https://github.com/daniGiro/ctprob).

This selection of datasets covers a wide range of frequencies and domains, summarised in Table 6. To ensure a non-singular covariance matrix estimate, which is required to compute the full covariance reconciliation method, we maintain a relatively low order of aggregation. Specifically, we aggregate the energy data into weekly data, the M3 data into annual data, and so on. For each time series, we hold out $20\%$ of the data as test data. Table 7 presents the training and test rMSE values for the selected reconciliation methods, summarized by trimmed means and corresponding standard errors. However, this presentation of the results does not provide much insight into the underlying dynamics. We observe that in-sample, the full covariance method performs well, but it does not generalize effectively. Similarly, the bottom-up approach does not produce the best results on the training data and also yields sub-optimal forecasts on the test data, contrary to the simulations. Comparing the two approaches, we do observe that the full covariance method generalizes worse than the bottom-up method, confirming our simulation findings. Finally, the more sophisticated approach of utilizing the spectral decomposition performs well out-of-sample.

Table 6: Dataset properties. $N$ denotes the number of total time series in the dataset, and $n_{\text{top}}, n_{\text{bottom}}$ give the range of the available lengths in the hierarchy given by $k$.
| Dataset | $N$ | $n_{\text{top}}$ | $n_{\text{bottom}}$ | $k\in$ |
|---|---|---|---|---|
| Energy | 23 | 51-51 | 357-357 | $\{7,1\}$ |
| Food | 122 | 7-107 | 35-535 | $\{5,1\}$ |
| M3 | 756 | 8-18 | 32-72 | $\{4,1\}$ |
| Prison | 8 | 12-12 | 48-48 | $\{4,1\}$ |
| Tourism | 525 | 76-76 | 228-228 | $\{3,1\}$ |
Table 7: $10\%$-trimmed overall means for $5$ datasets and selected reconciliation methods. The standard errors are available in parentheses.

| Dataset | Training rMSE, Bottom-Up | Training rMSE, Full Cov. | Training rMSE, OLS | Training rMSE, Spectral | Test rMSE, Bottom-Up | Test rMSE, Full Cov. | Test rMSE, OLS | Test rMSE, Spectral |
|---|---|---|---|---|---|---|---|---|
| Energy | -0.03 (0.02) | -0.06 (0.01) | -0.02 (0.00) | -0.06 (0.01) | -0.02 (0.05) | -0.02 (0.05) | -0.02 (0.00) | -0.04 (0.03) |
| Food | 0.04 (0.01) | -0.03 (0.01) | -0.01 (0.00) | -0.01 (0.00) | 0.04 (0.02) | 0.01 (0.01) | -0.01 (0.00) | 0.00 (0.01) |
| M3 | -0.17 (0.02) | -0.28 (0.01) | -0.11 (0.00) | -0.27 (0.01) | -0.09 (0.03) | -0.13 (0.03) | -0.11 (0.01) | -0.19 (0.02) |
| Prison | -0.18 (0.12) | -0.12 (0.17) | -0.11 (0.02) | 0.00 (0.18) | -0.40 (0.12) | -0.30 (0.14) | -0.14 (0.03) | -0.01 (0.20) |
| Tourism | 0.03 (0.00) | -0.05 (0.00) | 0.00 (0.00) | -0.01 (0.00) | 0.01 (0.01) | -0.02 (0.01) | -0.01 (0.00) | -0.01 (0.00) |

Figure 11: MCB test with confidence level $0.95$ on the overall level.

We conduct an accuracy ranking based on multiple comparisons with the best (MCB) test, introduced by Koning et al., (2005), for each dataset, divided into training and test data. Figure 11 clearly demonstrates the statistically superior performance of the full covariance method compared to the bottom-up approach in-sample, while the performance difference becomes practically negligible on the test data, consistent with our theory and simulations.

Additionally, Figure 12 presents percentile plots comparing the four different approaches. These plots further illustrate that while the full covariance method performs well in-sample, its performance significantly deteriorates out-of-sample. Specifically, on the training data, more forecasts are improved by full covariance reconciliation, but this relationship largely reverses on the test data.

Figure 12: Percentile plots for each dataset, split by training and test set.

6 Conclusions

In this paper, we explored the theoretical implications of applying the minimum trace reconciliation method within the context of temporal hierarchies. By examining temporally aggregated ARMA models, we demonstrated that the optimal reconciliation method, when based on the true covariance matrix, is equivalent to a bottom-up approach. Our extensive simulation studies tested this theory across various scenarios involving different model complexities, hierarchy structures, and levels of uncertainty. The findings support our theory, indicating that the bottom-up method is a viable approach. This aligns with numerous findings in the literature, where the bottom-up approach consistently produces useful results in suitable settings.

The simulation results also reveal that in-sample, covariance-based minimum trace reconciliation methods outperform the simple bottom-up approach. However, this relationship reverses out-of-sample, with the bottom-up approach generalizing better on the test data than the full covariance method across simulations and data examples. Further research is necessary to understand why this effect occurs so markedly. Additionally, other estimators were tested and showed improved performance over the full covariance matrix in certain settings, highlighting the potential of further research on new temporal hierarchical covariance estimators for the minimum trace approach.

Overall, our work contributes to the field of temporal forecast reconciliation by linking it to temporally aggregated ARMA models. We have theoretically established that the bottom-up approach is the optimal reconciliation method and reinforced this with comprehensive simulation studies and data illustrations. This supports the use of the bottom-up method in both theoretical and practical applications.

Computational details

The simulations and data examples were carried out in R 4.3.0. The corresponding source code of this paper in the form of an R package is available from GitHub at https://github.com/neubluk/FTATS. For convenience, all datasets except the M3 dataset are included in the package.

Declaration of Generative AI and AI-assisted technologies in the writing process

During the preparation of this work the authors used ChatGPT in order to improve readability and language. After using this tool/service, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Acknowledgments and Disclosure of Funding

We acknowledge support from the Austrian Research Promotion Agency (FFG), Basisprogramm project “Meal Demand Forecast” and Schrankerl GmbH for the cooperation and access to their data. We further acknowledge funding from the Austrian Science Fund (FWF) for the project “High-dimensional statistical learning: New methods to advance economic and sustainability policies” (ZK 35), jointly carried out by WU Vienna University of Economics and Business, Paris Lodron University Salzburg, TU Wien, and the Austrian Institute of Economic Research (WIFO).

Appendix A Calculations and Proofs

As in Silvestrini and Veredas, (2008), we illustrate this framework based on an $\text{AR}(1)$ model. Let $y_t \sim \text{AR}(1)$ be centered at $0$ with AR parameter $\phi \in (-1,1)$ and innovation variance $\sigma^2$. According to Eq. (10) we obtain $y_T^{\ast} \sim \text{ARMA}(1,1)$ for any $k > 1$ and AR parameter $\beta = \phi^k$. The MA parameter $\eta$ as well as the noise $\sigma_{\ast}^2$ are computed as follows.

For lags $0, 1$ we compute the autocovariances of $(1+\eta B)\epsilon^{\ast}_T$ with $B = L^k$ and $T(L)\epsilon_t$ with the aggregation polynomial $T(L)$ given by

\begin{align}
T(L) &= \frac{1-\delta^k L^k}{1-\delta L}\,\frac{1-L^k}{1-L} \tag{14} \\
&= \sum_{i=0}^{k-1}\delta^i L^i \sum_{j=0}^{k-1} L^j, \tag{15}
\end{align}

with $\delta = \phi$ being the inverse root of the corresponding AR polynomial $1 - \phi L$ and $L$ being the lag operator such that $L y_t = y_{t-1}$.

Because the MA order is $1$, all autocovariances at lags greater than $1$ are zero. First note that

\begin{align*}
T(L)\epsilon_{t}=(1,\phi,\dots,\phi^{k-1})\overbrace{\begin{pmatrix}1&\dots&\dots&\dots&1&0&\dots&\dots&0\\ 0&1&\dots&\dots&\vdots&1&0&\dots&0\\ \vdots&\ddots&\ddots&\ddots&\vdots&\vdots&\ddots&\ddots&\vdots\\ \vdots&\ddots&\ddots&\ddots&\vdots&\vdots&\ddots&\ddots&\vdots\\ \makebox[0.0pt][l]{$\smash{\underbrace{\phantom{\begin{matrix}0&\dots&\dots&0&1&\end{matrix}}}_{\text{$k\times k$}}}$}0&\dots&\dots&0&1&\makebox[0.0pt][l]{$\smash{\underbrace{\phantom{\begin{matrix}1&\dots&\dots&1\end{matrix}}}_{\text{$k\times(k-1)$}}}$}1&\dots&\dots&1\end{pmatrix}}^{=A}\begin{pmatrix}\epsilon_{t}\\ \vdots\\ \epsilon_{t-(2k-2)}\end{pmatrix}.
\end{align*}

Next, we set up the equations based on the auto-covariance functions to determine $\eta$ and $\sigma_{\ast}^2$.

To this end, the variances are computed to be

\begin{align}
\gamma^{\ast}(0) &= \text{Var}((1+\eta B)\epsilon^{\ast}_T) \notag \\
&= (1+\eta^2)\sigma_{\ast}^2, \tag{16}
\end{align}

which must be equal to

\begin{align}
\gamma(0) &= \text{Var}(T(L)\epsilon_t) \notag \\
&= \sigma^2 (1,\phi,\dots,\phi^{k-1}) A A' (1,\phi,\dots,\phi^{k-1})' \notag \\
&= \sigma^2 \left( \sum_{j=0}^{k-1}\left(\sum_{i=0}^{j}\phi^i\right)^2 + \sum_{j=1}^{k-1}\left(\sum_{i=j}^{k-1}\phi^i\right)^2 \right). \tag{17}
\end{align}

Similarly, the lag $1$ auto-covariances are

\begin{align}
\gamma^{\ast}(1) &= \text{Cov}((1+\eta B)\epsilon^{\ast}_T, (1+\eta B)\epsilon^{\ast}_{T-1}) \notag \\
&= \eta\sigma_{\ast}^2, \tag{18}
\end{align}

which must be equal to

\begin{align}
\gamma(1) &= \text{Cov}(T(L)\epsilon_t, T(L)\epsilon_{t-k}) \notag \\
&= \sigma^2 (1,\phi,\dots,\phi^{k-1}) A C A' (1,\phi,\dots,\phi^{k-1})' \notag \\
&= \sigma^2 \left( \sum_{j=1}^{k-1}\left(\sum_{i=j}^{k-1}\phi^i \sum_{l=0}^{j-1}\phi^l\right)\right), \tag{19}
\end{align}

where

\begin{align*}
C &= \frac{1}{\sigma^2}\text{Cov}\left((\epsilon_t, \dots, \epsilon_{t-(2k-2)})', (\epsilon_{t-k}, \dots, \epsilon_{t-k-(2k-2)})'\right) \\
&= \begin{pmatrix} 0_{k\times(k-1)} & 0_{k\times k} \\ I_{k-1} & 0_{(k-1)\times k} \end{pmatrix}.
\end{align*}
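To make the matrix notation concrete, the following Python sketch builds $A$ and $C$ for a given $k$ and evaluates the two quadratic forms for $\gamma(0)$ and $\gamma(1)$. The function names are hypothetical and chosen for illustration only, and $\sigma^2 = 1$ is merely a default. The last function evaluates the double sum of Eq. (19) so that it can be compared with the matrix expression.

```python
def build_A(k):
    """k x (2k-1) matrix A: row i has ones in columns i, ..., i+k-1."""
    return [[1.0 if i <= c <= i + k - 1 else 0.0 for c in range(2 * k - 1)]
            for i in range(k)]


def build_C(k):
    """(2k-1) x (2k-1) shift matrix C: entry (a, b) is 1 if a == k + b."""
    n = 2 * k - 1
    return [[1.0 if a == k + b else 0.0 for b in range(n)] for a in range(n)]


def autocovariances(phi, k, sigma2=1.0):
    """Return (gamma(0), gamma(1)) of T(L) eps_t via the quadratic forms."""
    A, C = build_A(k), build_C(k)
    n = 2 * k - 1
    v = [phi ** i for i in range(k)]                      # (1, phi, ..., phi^(k-1))
    u = [sum(v[i] * A[i][c] for i in range(k)) for c in range(n)]   # row vector v A
    gamma0 = sigma2 * sum(x * x for x in u)               # v A A' v'
    uC = [sum(u[a] * C[a][b] for a in range(n)) for b in range(n)]  # row vector v A C
    gamma1 = sigma2 * sum(uC[b] * u[b] for b in range(n))  # v A C A' v'
    return gamma0, gamma1


def gamma1_closed_form(phi, k, sigma2=1.0):
    """Double sum of Eq. (19)."""
    return sigma2 * sum(
        sum(phi ** i for i in range(j, k)) * sum(phi ** l for l in range(j))
        for j in range(1, k)
    )


g0, g1 = autocovariances(phi=0.5, k=2)
print(g0, g1, gamma1_closed_form(phi=0.5, k=2))
```

For $k = 2$ the aggregation polynomial is $1 + (1+\phi)L + \phi L^2$, so the two values can also be checked by hand.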

Solving the system of equations $\gamma(0) = \gamma^{\ast}(0), \gamma(1) = \gamma^{\ast}(1)$ using Eqs. (16)-(19) yields

\begin{align*}
\sigma_{\ast}^2 &= \sigma^2 \frac{(1,\phi,\dots,\phi^{k-1}) A A' (1,\phi,\dots,\phi^{k-1})'}{1+\eta^2}, \\
\eta &= (1+\eta^2)\rho_1,
\end{align*}

where $\rho_1 = \frac{\gamma(1)}{\gamma(0)} = \frac{\gamma^{\ast}(1)}{\gamma^{\ast}(0)}$ denotes the auto-correlation value at lag $1$.
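As a numerical illustration of this moment-matching step, the sketch below computes $\gamma(0)$ and $\gamma(1)$ from the coefficients of $T(L)$ and then solves the quadratic $\rho_1\eta^2 - \eta + \rho_1 = 0$. The function name and the returned dictionary are hypothetical. Selecting the root with $|\eta| \leq 1$ is an assumption here (the usual invertibility convention), since the text above does not say which root is taken.

```python
import math


def aggregated_ar1_params(phi, k, sigma2=1.0):
    """Moment matching for an AR(1) aggregated over k periods (illustrative)."""
    # Coefficients of T(L) = (1 + phi L + ... + phi^(k-1) L^(k-1)) (1 + L + ... + L^(k-1)).
    coeffs = [0.0] * (2 * k - 1)
    for i in range(k):
        for j in range(k):
            coeffs[i + j] += phi ** i
    # Lag 1 in aggregated time corresponds to a shift of k bottom-level steps.
    gamma0 = sigma2 * sum(c * c for c in coeffs)
    gamma1 = sigma2 * sum(coeffs[m] * coeffs[m + k] for m in range(k - 1))
    rho1 = gamma1 / gamma0
    # Solve rho1 * eta^2 - eta + rho1 = 0; taking the root with |eta| <= 1 is an assumption.
    if rho1 == 0.0:
        eta = 0.0
    else:
        disc = max(0.0, 1.0 - 4.0 * rho1 ** 2)
        eta = (1.0 - math.sqrt(disc)) / (2.0 * rho1)
    sigma2_star = gamma0 / (1.0 + eta ** 2)
    return {"beta": phi ** k, "eta": eta, "sigma2_star": sigma2_star,
            "gamma0": gamma0, "gamma1": gamma1}


params = aggregated_ar1_params(phi=0.5, k=2)
print(params)
```

By construction the returned values satisfy $(1+\eta^2)\sigma_{\ast}^2 = \gamma(0)$ and $\eta\sigma_{\ast}^2 = \gamma(1)$ up to floating-point error.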

Proof of Lemma 1.

First, we compute the $h$-step forecasts of the disaggregated series for $h = 1,\dots,k$. For the $\text{AR}(1)$ process this can be done recursively and we obtain residuals given by

\begin{align}
e_t^{(h)} = \sum_{i=0}^{h-1}\phi^i\epsilon_{t+h-i}. \tag{20}
\end{align}

The corresponding pairwise covariances are quickly computed for $h_1 \leq h_2$ by

\begin{align}
\text{Cov}\left(e_t^{(h_1)}, e_t^{(h_2)}\right) &= \sigma^2\sum_{l=0}^{h_1-1}\phi^{h_2-h_1+2l} \tag{21} \\
&= \sigma^2\phi^{h_2-h_1}\frac{1-\phi^{2h_1}}{1-\phi^2}, \tag{22}
\end{align}

hence for $\mathbf{e}_t = \left(e_t^{(1)},\dots,e_t^{(k)}\right)'$ we obtain the covariance matrix on the bottom level $\text{Cov}(\mathbf{e}_t) = \sigma^2\Phi\Phi'$.
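The factorization can be checked numerically. The sketch below assumes that $\Phi$ is the $k \times k$ lower-triangular matrix with entries $\phi^{h-j}$ for $j \leq h$, which is what Eq. (20) implies when the errors are written in terms of $\epsilon_{t+1},\dots,\epsilon_{t+k}$; the definition of $\Phi$ itself lies outside this appendix, so this is an assumption. Function names are hypothetical.

```python
def phi_matrix(phi, k):
    """Assumed lower-triangular Phi with entries phi^(h-j) for j <= h (0-based)."""
    return [[phi ** (h - j) if j <= h else 0.0 for j in range(k)] for h in range(k)]


def error_cov_from_phi(phi, k, sigma2=1.0):
    """sigma^2 * Phi Phi' computed entry by entry."""
    P = phi_matrix(phi, k)
    return [[sigma2 * sum(P[a][m] * P[b][m] for m in range(k)) for b in range(k)]
            for a in range(k)]


def error_cov_closed_form(phi, k, sigma2=1.0):
    """Entries from Eq. (22), applied symmetrically with h1 = min and h2 = max."""
    cov = [[0.0] * k for _ in range(k)]
    for a in range(1, k + 1):
        for b in range(1, k + 1):
            h1, h2 = min(a, b), max(a, b)
            cov[a - 1][b - 1] = (sigma2 * phi ** (h2 - h1)
                                 * (1 - phi ** (2 * h1)) / (1 - phi ** 2))
    return cov


lhs = error_cov_from_phi(phi=0.8, k=4)
rhs = error_cov_closed_form(phi=0.8, k=4)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
print(max_gap)
```

The printed gap is the largest entrywise difference between the two constructions and should be at the level of floating-point rounding.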

For $y^{\ast}_T$ we perform a $1$-step forecast, thus ${e^{\ast}_T}^{(1)} = \epsilon^{\ast}_{T+1}$ with $\text{Var}({e^{\ast}_T}^{(1)}) = \sigma_{\ast}^2$. To compute $\text{Cov}({e^{\ast}_T}^{(1)}, e_t^{(h)})$, we proceed as follows. First, write $\epsilon^{\ast}_{T+1} = y^{\ast}_{T+1} - \beta y^{\ast}_T - \eta\epsilon^{\ast}_T$, then for $T = tk$ and $j = 1,\dots,k$ we have

\begin{align}
\text{Cov}(\epsilon^{\ast}_{T+1}, \epsilon_{tk+j}) &= \sum_{i=0}^{k-1}\text{Cov}(y_{tk+k-i}, \epsilon_{tk+j}) \tag{24} \\
&= \sum_{i=0}^{k-1}\sum_{l=0}^{tk+k-i}\phi^l\,\text{Cov}(\epsilon_{tk+k-i-l}, \epsilon_{tk+j}) \tag{25} \\
&= \sigma^2\sum_{i=0}^{k-j}\phi^i \tag{26} \\
&= \sigma^2\frac{1-\phi^{k-j+1}}{1-\phi}, \tag{27}
\end{align}

since $\text{Cov}(\epsilon_{tk+k-i-l}, \epsilon_{tk+j}) = \sigma^2$ if $l = k-i-j$ and $0$ otherwise. Together, we obtain the temporal cross-covariances of

\begin{align}
\text{Cov}({e^{\ast}_T}^{(1)}, e_{tk}^{(h)}) &= \text{Cov}\left({e^{\ast}_T}^{(1)}, \sum_{i=0}^{h-1}\phi^i\epsilon_{tk+h-i}\right) \tag{28} \\
&= \frac{\sigma^2}{1-\phi}\left(\frac{1-\phi^h}{1-\phi} - \phi^{k-h+1}\frac{1-\phi^{2h}}{1-\phi^2}\right), \tag{29}
\end{align}

hence the cross-covariance vector is given by

\begin{align}
\text{Cov}(e^{\ast}_T, \mathbf{e}_{tk}) = \sigma^2 (1,\dots,1)\Phi\Phi'. \tag{30}
\end{align}
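The step from Eq. (29) to the vector form can be verified numerically: the $h$-th cross-covariance should equal the sum over $h'$ of the bottom-level covariances from Eq. (22), which is the identity the proof of Theorem 1 relies on. The sketch below is illustrative only, and both function names are hypothetical.

```python
def cross_cov_closed_form(phi, k, h, sigma2=1.0):
    """Right-hand side of Eq. (29) for horizon h in 1..k."""
    return sigma2 / (1 - phi) * (
        (1 - phi ** h) / (1 - phi)
        - phi ** (k - h + 1) * (1 - phi ** (2 * h)) / (1 - phi ** 2)
    )


def cross_cov_from_sums(phi, k, h, sigma2=1.0):
    """Sum over h' = 1..k of Cov(e^(h'), e^(h)), using Eq. (22) symmetrically."""
    total = 0.0
    for hp in range(1, k + 1):
        h1, h2 = min(h, hp), max(h, hp)
        total += sigma2 * phi ** (h2 - h1) * (1 - phi ** (2 * h1)) / (1 - phi ** 2)
    return total


for h in range(1, 5):
    print(h, cross_cov_closed_form(0.8, 4, h), cross_cov_from_sums(0.8, 4, h))
```

Each printed row shows the closed form next to the summed covariances for one horizon; the two columns should agree up to rounding.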

Proof of Theorem 1.

The minimizer of Eq. (2.1) is given by $G^{\ast}=(S^{\prime}W_{1}^{-1}S)^{-1}S^{\prime}W_{1}^{-1}$. First, note that

$$W_{1}^{-1}S=\begin{pmatrix}\mathbf{0}_{k}^{\prime}\\ (\sigma^{2}\Phi\Phi^{\prime})^{-1}\end{pmatrix}, \tag{31}$$

due to $\text{Cov}(e^{\ast}_{T},\mathbf{e}_{tk})=\sigma^{2}\mathbf{1}_{k}^{\prime}\Phi\Phi^{\prime}$. Then the minimizing $G^{\ast}$ matrix is obtained to be $G^{\ast}=(\mathbf{0}_{k}~I_{k})$ and hence

$$SG^{\ast}=\begin{pmatrix}0&\mathbf{1}_{k}^{\prime}\\ \mathbf{0}_{k}&I_{k}\end{pmatrix}, \tag{32}$$

which is exactly the bottom-up forecast for the aggregated series. ∎
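The algebra behind (31) and (32) can be checked numerically. The following is a minimal sketch, not part of the original derivation, and it rests on several assumptions: $\Phi$ is taken to be the lower-triangular matrix with entries $\phi^{i-j}$ (the argument only needs $\Phi$ to be invertible), the variance of the aggregated forecast error is set to an arbitrary value that keeps $W_{1}$ positive definite, and the names `bottom_up_check` and `extra_var` are hypothetical.

```python
import numpy as np

def bottom_up_check(k=4, phi=0.8, sigma2=1.0, extra_var=1.0):
    """Return (G_star, S @ G_star) for a two-level temporal hierarchy."""
    ones = np.ones(k)
    idx = np.arange(k)
    # Assumed form of Phi: lower-triangular with entries phi^(i-j).
    Phi = np.tril(phi ** np.abs(idx[:, None] - idx[None, :]))
    A = sigma2 * Phi @ Phi.T              # covariance of disaggregated errors
    c = A @ ones                          # Cov(e_tk, e*_T) = sigma^2 Phi Phi' 1_k
    # Arbitrary aggregate error variance (assumption); extra_var > 0 keeps
    # W1 positive definite.
    v = ones @ A @ ones + extra_var
    W1 = np.block([[np.array([[v]]), c[None, :]],
                   [c[:, None], A]])
    S = np.vstack([ones[None, :], np.eye(k)])   # summing matrix (1_k'; I_k)
    W1_inv_S = np.linalg.solve(W1, S)           # W1^{-1} S, cf. (31)
    # G* = (S' W1^{-1} S)^{-1} S' W1^{-1}; W1 is symmetric, so
    # S' W1^{-1} equals the transpose of W1^{-1} S.
    G = np.linalg.solve(S.T @ W1_inv_S, W1_inv_S.T)
    return G, S @ G

G_star, SG_star = bottom_up_check()
print(np.round(G_star, 6))
print(np.round(SG_star, 6))
```

Under these assumptions the first column of `G_star` vanishes up to floating-point error, so the base forecast of the aggregated series receives zero weight. This holds for any `extra_var > 0`, which mirrors the fact that (31) does not depend on the aggregate error variance.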

Appendix B Additional Plots

Figure 13: Full sample transformation matrix $SG$ for $n=100$, $\phi=0.8$, $h=1$, $k\in\{4,2,1\}$, $\sigma^{2}=1$. The colors correspond to the mean value over $100$ repetitions. The standard errors are given in parentheses.
Figure 14: Out-of-sample mean rMSE differences of the full covariance matrix and bottom-up reconciliation for $h=1$, $k=(1,4)$, $\sigma^{2}=1$ and fixed-order models.
Figure 15: Out-of-sample mean rMSE differences of the full covariance matrix and bottom-up reconciliation for $h=1$, $k=(1,4)$, $\sigma^{2}=1$ and fixed-order models.

References

  • Amemiya and Wu, (1972) Amemiya, T. and Wu, R. Y. (1972). The effect of aggregation on prediction in the autoregressive model. Journal of the American Statistical Association, 67(339):628–632.
  • Athanasopoulos et al., (2024) Athanasopoulos, G., Hyndman, R. J., Kourentzes, N., and Panagiotelis, A. (2024). Forecast reconciliation: A review. International Journal of Forecasting, 40(2):430–456.
  • Athanasopoulos et al., (2017) Athanasopoulos, G., Hyndman, R. J., Kourentzes, N., and Petropoulos, F. (2017). Forecasting with temporal hierarchies. European Journal of Operational Research, 262(1):60–74.
  • Girolimetto et al., (2023) Girolimetto, D., Athanasopoulos, G., Di Fonzo, T., and Hyndman, R. J. (2023). Cross-temporal probabilistic forecast reconciliation: Methodological and practical issues. International Journal of Forecasting.
  • Hyndman, (2018) Hyndman, R. (2018). Mcomp: Data from the M-Competitions. R package version 2.8.
  • Hyndman et al., (2011) Hyndman, R. J., Ahmed, R. A., Athanasopoulos, G., and Shang, H. L. (2011). Optimal combination forecasts for hierarchical time series. Computational Statistics & Data Analysis, 55(9):2579–2589.
  • Hyndman and Athanasopoulos, (2018) Hyndman, R. J. and Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts.
  • Hyndman et al., (2016) Hyndman, R. J., Lee, A. J., and Wang, E. (2016). Fast computation of reconciled forecasts for hierarchical and grouped time series. Computational Statistics & Data Analysis, 97:16–32.
  • Jones, (1987) Jones, M. C. (1987). Randomly choosing parameters from the stationarity and invertibility region of autoregressive-moving average models. Journal of the Royal Statistical Society. Series C (Applied Statistics), 36(2):134–138.
  • Koning et al., (2005) Koning, A. J., Franses, P. H., Hibon, M., and Stekler, H. (2005). The M3 competition: Statistical tests of the results. International Journal of Forecasting, 21(3):397–409.
  • Ledoit and Wolf, (2012) Ledoit, O. and Wolf, M. (2012). Nonlinear shrinkage estimation of large-dimensional covariance matrices. The Annals of Statistics, 40(2):1024–1060.
  • Makridakis and Hibon, (2000) Makridakis, S. and Hibon, M. (2000). The M3-Competition: results, conclusions and implications. International Journal of Forecasting, 16(4):451–476.
  • Neubauer and Filzmoser, (2024) Neubauer, L. and Filzmoser, P. (2024). Improving forecasts for heterogeneous time series by “averaging”, with application to food demand forecasts. International Journal of Forecasting.
  • Nystrup et al., (2021) Nystrup, P., Lindström, E., Møller, J. K., and Madsen, H. (2021). Dimensionality reduction in forecasting with temporal hierarchies. International Journal of Forecasting, 37(3):1127–1146.
  • Nystrup et al., (2020) Nystrup, P., Lindström, E., Pinson, P., and Madsen, H. (2020). Temporal hierarchies with autocorrelation for load forecasting. European Journal of Operational Research, 280(3):876–888.
  • Panagiotelis et al., (2021) Panagiotelis, A., Athanasopoulos, G., Gamakumara, P., and Hyndman, R. J. (2021). Forecast reconciliation: A geometric view with new insights on bias correction. International Journal of Forecasting, 37(1):343–359.
  • Panagiotelis et al., (2023) Panagiotelis, A., Gamakumara, P., Athanasopoulos, G., and Hyndman, R. J. (2023). Probabilistic forecast reconciliation: Properties, evaluation and score optimisation. European Journal of Operational Research, 306(2):693–706.
  • Ramírez et al., (2014) Ramírez, O. A., Mullen, J., and Collart, A. J. (2014). Insights into the appropriate level of disaggregation for efficient time series model forecasting. Journal of Applied Statistics, 41:2298–2311.
  • Silvestrini and Veredas, (2008) Silvestrini, A. and Veredas, D. (2008). Temporal aggregation of univariate and multivariate time series models: A survey. Journal of Economic Surveys, 22(3):458–497.
  • Wickramasuriya et al., (2019) Wickramasuriya, S. L., Athanasopoulos, G., and Hyndman, R. J. (2019). Optimal forecast reconciliation for hierarchical and grouped time series through trace minimization. Journal of the American Statistical Association, 114(526):804–819.