Ensemble Forecasting for Intraday Electricity Prices: Simulating Trajectories
EEnsemble Forecasting for IntradayElectricity Prices: Simulating Tra jectories
Micha(cid:32)l NarajewskiUniversity of Duisburg-EssenandFlorian ZielUniversity of Duisburg-EssenJuly 8, 2020
Abstract
Recent studies concerning the point electricity price forecasting have shown ev-idence that the hourly German Intraday Continuous Market is weak-form efficient.Therefore, we take a novel, advanced approach to the problem. A probabilistic fore-casting of the hourly intraday electricity prices is performed by simulating trajectoriesin every trading window to receive a realistic ensemble to allow for more efficient in-traday trading and redispatch. A generalized additive model is fitted to the pricedifferences with the assumption that they follow a zero-inflated distribution, pre-cisely a mixture of the Dirac and the Student’s t-distributions. Moreover, the mixingterm is estimated using a high-dimensional logistic regression with lasso penalty. Wemodel the expected value and volatility of the series using i.a. autoregressive andno-trade effects or load, wind and solar generation forecasts and accounting for thenon-linearities in e.g. time to maturity. Both the in-sample characteristics and fore-casting performance are analysed using a rolling window forecasting study. Multipleversions of the model are compared to several benchmark models and evaluated usingprobabilistic forecasting measures and significance tests. The study aims to forecastthe price distribution in the German Intraday Continuous Market in the last 3 hoursof trading, but the approach allows for application to other continuous markets, es-pecially in Europe. The results prove superiority of the mixture model over thebenchmarks gaining the most from the modelling of the volatility. They also indicatethat the introduction of XBID reduced the market volatility.
Keywords: electricity price forecasting, power markets, intraday market, continuous-trademarkets, XBID, ensemble forecasting, probabilistic forecasting, short-term forecasting, tra-jectories, generalized additive models, lasso, logistic regression, zero-inflated distribution,scenario simulation 1 a r X i v : . [ q -f i n . S T ] J u l Introduction
Intraday continuous electricity markets gain on importance every day [1]. Their primarypurpose is to handle the uncertainty in electricity generation and load arisen since theday-ahead markets [2]. A number of events can cause the uncertainty, e.g. unexpectedpower plant outage or changing weather conditions. The latter one is the result of theglobal trend of investing in weather-dependent renewable power sources and is a subjectof modelling and forecasting [3]. The need of intraday continuous trading is fulfilled bythe power exchanges and transmission system operators (TSO) [4]. They allow the marketparticipants to trade the energy continuously up to 5 minutes before the delivery, e.g.in France or Germany, and to trade it cross-border, e.g. using the cross-border intraday(XBID) market [5]. Even though there is a clear evidence of the importance of this kindof markets, the researchers do not investigate them in terms of forecasting as willingly asthe day-ahead market.The day-ahead market is the main electricity spot market with a long history of researchon electricity price forecasting [6]. Recent studies on the electricity price forecasting (EPF)in day-ahead markets consider i.a. the probabilistic forecasting and forecasting combina-tion. Nowotarski and Weron [7] present a review of probabilistic EPF and Muniain andZiel [8] use it to simulate peak and off-peak prices. Marcjasz et al. [9] combine point fore-casts achieved using different calibration windows while Uniejewski et al. [10] and Serafinet al. [11] do it for probabilistic forecasts. A very big part of the recent EPF literatureare also hybrid models [12–14] and neural networks [15–17]. Also the market integrationplays an important role in price formation in both day-ahead and intraday markets whatis elaborated by Lago et al. [18] and Kath [5].The role of the intraday markets in the balancing of electricity systems was emphasizedand explained by Ocker and Ehrhart [19] and Koch and Hirth [20] on the basis of the Ger-man electricity market. They observed that the introduction of the intraday continuousmarket in Germany partially led to a substantial decrease in the demand for balancingenergy while the wind and solar energy generation increased. Karanfil and Li [21] clarifythe reason for the spread between day-ahead and intraday prices in Denmark, while Ma-ciejowska et al. [22] forecast the price spread between the day-ahead and intraday markets2ased on the Polish and German data. The continuity of the intraday market has encour-aged the researchers to investigate the transaction arrival process [23], bidding behaviour[24–26] and optimal trading strategies [27–29]. The impact of fundamental regressors onthe price formation in the intraday market was examined by Pape et al. [30], G¨urtler andPaulsen [31], and Kremer et al. [32].The literature on the EPF in the intraday markets is not that broad as in the day-ahead markets or as the one regarding other aspects of the intraday markets. Monteiroet al. [33] and Andrade et al. [34] conducted the EPF for the Iberian intraday market,however it is not a continuous market, and thus their studies are more similar to these onday-ahead markets. Uniejewski et al. [35], Narajewski and Ziel [36] and Marcjasz et al. [37]performed the EPF in the German Intraday Continuous Market, while Oksuz and Ugurlu[38] in the Turkish Intraday market. An outcome of the second one was an indication of theweak-form efficiency of the investigated market. This was partially confirmed by Janke andSteinke [39], who forecasted the distribution of prices during the last three hours of tradingand concluded that forecasting of the central quantiles yields marginal improvement to thenaive benchmark. However, Marcjasz et al. [37] managed to outperform the most recentprice by using an ensemble of it and a lasso-estimated model.The only four papers on EPF in the German intraday market considered the ID -Price(a volume-weighted average price of transactions in the last three hours before delivery) asthe most important price index in the German intraday market and conducted forecastingof it. This paper focuses on the ID index as well, but not directly. Instead of forecastingits price we simulate the paths of 5-minute volume weighted average price during the time-frame of the index. This way we obtain a distribution forecast of the prices in every 5 minwindow during the last three hours before the delivery. An example of this approach canbe seen in Figure 1. We motivate our research with results on the weak-form efficiencyof the market concluded by Narajewski and Ziel [36] and a possible application of themethodology to trading of the electricity and optimal redispatch management.In purpose of modelling and forecasting of the trajectories, we utilize the generalizedadditive models for location scale and shape (GAMLSS) [40] which extends the generalizedadditive models (GAM) [41]. This methodology found applications to the electricity load3 Time to delivery (minutes) x I D m p r i c e ( hou r l y , : ) Figure 1: Price trajectory for the hourly product with delivery on 15.07.2016 at 12:00. Theblack part is the realization and the colourful part consists of 100 simulations from theGaussian random walk. Time of forecasting is indicated by the green dashed line.[42, 43] and day-ahead price [44–46] forecasting, but never to the intraday electricity mar-kets. The model for price difference ∆ P is fitted to the Student’s t-distribution and mixedwith the Dirac distribution, i.e. ∆ P ∼ (1 − α ) δ + α t. α is assumed to be a Bernoullivariable with probability π and is modelled using the logistic regression. We estimate itwith the lasso method [47]. A broader description of the modelling exercise can be foundin Section 4.The forecasting part utilizes a rolling window study. This is a very common studytype in the EPF and is widely utilized by researchers [35, 36]. We analyse both in-samplecharacteristics and evaluate the out-of-sample forecasting performance. The evaluationmeasures that we consider are: energy score (ES), continuous ranked probability score(CRPS), mean absolute error (MAE), root mean squared error (RMSE) and the Dieboldand Mariano [48] test. Moreover, we evaluate the empirical coverage.The major contributions of this paper are as follows:(1) It is the first work on the price trajectories in intraday continuous markets which area new and developing part of the electricity markets.(2) A rigourous presentation and discussion of all characteristics of the market, like trad-ing frequency and volatility. 43) We propose a model that utilizes a mixture of GAMLSS and logit-lasso estimationmethods and generates realistic ensembles what allows for efficient decision-making,especially for trading and redispatch.(4) The components of the proposed model are interpreted with respect to the marketbehaviour, highlighting the impact of the XBID introduction and relevant features,like wind and solar generation, load, calendar effects, trading activity and historicprices.(5) The high-quality predictive performance of the proposed model is compared withsimple benchmarks and sophisticated models with respect to point and probabilisticforecasting.The remainder of this paper has the following structure. In the next section, we describethe market. The third section consists of the data description and descriptive statistics.Then, a broader explanation of the estimation methods is presented, followed by the de-scription of the considered models and benchmarks. In the fifth section, the forecastingstudy and evaluation measures are introduced. In the sixth section, we present the resultswhich consist of the in-sample analysis with relevant model interpretations and the out-of-sample evaluation. The final section concludes this paper. The methodology used inthe paper is very innovative, especially in regard to the intraday electricity markets. Wepresent it with an application to the German Intraday Continuous Market, but it can beeasily used with any other intraday electricity continuous market. The German Intraday Continuous Market allows to trade hourly, half-hourly and quarter-hourly products. We conduct the study using the most liquid part of the market – thehourly one. This is in line with other EPF studies in intraday markets. Trading of hourlyproducts in the German Intraday Continuous begins every day at 15:00 for the 24 productsof the following day. It is possible to trade the electricity until 30 minutes (in the wholemarket) and up to 5 minutes (within respective control zones) before the delivery. In the5eantime, between hour 22:00 and 60 minutes before the delivery the cross-border tradingwithin XBID system is possible Kath [5]. This system went live on 18th June 2018. Avisualization of the trading timeline can be seen in Figure 2. For more details on theGerman electricity market, we recommend the paper of Viehmann [4].The most important price measure in the German intraday market is the volume-weighted average price of transactions in the last three hours of trading, called ID . Theindex takes into account only these transactions that happen until the gate closure 30minutes before the delivery, so in fact it measures the last two and a half hours of tradingbefore the gate closure. The relevance of ID is an outcome of the behaviour of traders inthe intraday market – most of the transactions are held in this time period making it veryliquid. This results in a high interest of practitioners and researchers in the ID -Price. Formore details on the index visit the webpage of EPEX SPOT or see e.g. Narajewski andZiel [36].To measure the prices during the trading period, we use the x ID y defined by Narajewskiand Ziel [36]. Let us recall the definition of x ID y . Let b ( d, s ) be the start of the delivery ofa product s on day d . By T d,sx,y = [ b ( d, s ) − x − y, b ( d, s ) − x ), x ≥ y >
0, we denotethe time interval between x + y and x minutes before the delivery, and by T d,s we denotea set of timestamps of transactions on the product. The x ID y is defined by x ID d,sy := 1 (cid:80) k ∈ T d,sx,y ∩T d,s V d,sk (cid:88) k ∈ T d,sx,y ∩T d,s V d,sk P d,sk , (1)where V d,sk and P d,sk are the volume and the price of k -th trade within the transaction set d − d − d − d − d , s −
60 minXBIDcloses Marketcloses d , s −
30 minControl zonesclose d , s − d , s Figure 2: The daily routine of the German spot electricity market. d corresponds to theday of the delivery and s corresponds to the hour of the delivery.6 d,sx,y ∩ T d,s respectively. Let us note that the x ID y is simply a volume-weighted averageprice of transactions in the time interval of length y hours and ending x hours before thedelivery.In the case of T d,sx,y ∩ T d,s = ∅ we use the value of x + y ID y , that is to say the previousobserved volume-weighted average price measured on the time period of the same length. In the case of no trades appearing since the start of trading, the price is set to the price ofthe corresponding Day-Ahead Auction.
The data used in purpose of this study consists of all transactions on hourly products inthe German Intraday Continuous Market between 16th July 2015 and 1st October 2019.A more general descriptive statistics were presented by Narajewski and Ziel [36]. As men-tioned in the previous section, the XBID system started to function on 18th June 2018.This means that XBID trades were possible only on around 30% of the days in the data.In the forecasting study, we use D = 365 days of the data as in-sample, and thereforethe analysis in this section is based only on the initial in-sample, i.e. the data between16th July 2015 and 14th July 2016. The start of the data is set to the first day of leadchange in Germany from 45 min to 30 min in order to avoid this structural break. Inthis paper, we aggregate the transactions using the x ID d,sy with y = 5 min, and this waywe obtain dense time series data. As said before, we are particularly interested in theevolution of prices during the last 2.5 hours of trading before the gate closure, so we use x ∈ J = { , , . . . , , } , where x is denoted in minutes. This way we observe T = 31price points a day, what results in T D = 31 ×
365 = 11315 in-sample observations and T -dimensional simulated trajectories. Subsequently, we use a very specific setting, but itcan be applied to any other continuous intraday market with other input variables.As the market shows strong indications of weak-form efficiency, we focus on modelling In Narajewski and Ziel [36] this value is set to the price of the last transaction. This adjustment iscaused by the fact that in this paper we work with 5-minute time intervals, leading to a significant numberof the events of no trade in the time interval. This would often result in an artificial change of the price,compared to the previously observed x ID y .
7f the price differences ∆ P d,st = ( T − t ) y +30 ID d,sy − ( T − ( t − y +30 ID d,sy instead of pure prices P d,st = ( T − t ) y +30 ID d,sy . We also introduce the P d,st notation for simplicity. Due to the usageof price differences and to the fact that the data is aggregated using 5-minutes grid, weobserve a high frequency of observations with no trade, and thus price differences equal to 0.This is depicted in Figure 3. One can see that lack of transactions happens more often to thenight and morning hours. In Figure 4, we zoom in the tails of the histograms from Figure 3.We also plot there densities of 4 distributions fitted to the data: the normal distribution N (0 , (cid:98) σ ) and the t-distribution t(0 , (cid:98) σ, ν ) with fixed ν ∈ { . , , } and estimated (cid:98) σ usingmaximum likelihood estimation ignoring the no-trade observations. Based on Figure 4 itis clear that the price differences ∆ P d,st are heavy-tailed. One can see that even the t-distribution with ν = 4 seems to be not heavy-tailed enough for the data. This indicatesthat the tail-index of the price differences may be lower than 4 which would mean that thefourth moment of the ∆ P d,st might not exist what is a strong indication for heavy tails.Figure 5 shows the frequency of the no-trade event over time to delivery. We see thatthe overall behaviour is very similar across all products – the closer to the delivery, the lessobservations without transactions. What is different among the products is the level of thefrequency. It is clear that the frequency decreases as the product time increases and thereason for it may be the time distance from the Day-Ahead and Intraday Auctions. It isintuitive that since these auctions the uncertainty could be smaller for the first productsand higher for the last ones, but the smallest values of frequency are achieved not for the Hour 00:00 -5.0 -2.5 0.0 2.5 5.0
Hour 06:00 -5.0 -2.5 0.0 2.5 5.0
Hour 12:00 -5.0 -2.5 0.0 2.5 5.0
Hour 18:00 Δ P td,s Figure 3: Histograms of the initial in-sample price differences ∆ P d,st for selected hours.Blue colour corresponds to the no-trade cases.8 .0000.0050.0100.015 -10 -8 -6 -4 Hour 00:00 -10 -8 -6 -4
Hour 06:00 -10 -8 -6 -4
N(0, σ ) t(0, σ , ν = σ , ν =
3) t(0, σ , ν = Hour 12:00 -10 -8 -6 -4
Hour 18:00 Δ P td,s Figure 4: Histograms of the tails of the initial in-sample price differences ∆ P d,st for selectedhours. Solid lines depict densities of the distributions according to the legend. Time to delivery (minutes) F r equen cy Figure 5: Frequency of no-trade event in the initial in-sample price differences ∆ P d,st overtime to delivery for all 24 products.evening, but for the day-peak hours. This can be explained by higher activity in the marketdue to higher expected demand.Figure 6 shows the in-sample standard deviation of price differences ∆ P d,st over time todelivery. The dashed lines depict the standard deviation of the whole samples, independentof time. If the price processes would be similar to random walk, the sample standard de-viation over time should be oscillating around these dashed lines. The behaviour in Figure6 is clearly different, with a spike in the last 30 minutes before gate closure. This suggeststhat the variance should be a subject of modelling. Figure 7 presents the partial autocor-relation function of the absolute price differences (cid:12)(cid:12)(cid:12) ∆ P d,st (cid:12)(cid:12)(cid:12) to explore potential conditional9
23 50100150
Time to delivery (minutes) S t anda r d de v i a t i on Figure 6: Standard deviation of the initial in-sample price differences ∆ P d,st for selectedhours over time to delivery. Dashed lines indicate the standard deviation independent oftime. PA C F o f | Δ P t d , s | Hour 00:00
Hour 06:00
Hour 12:00
Hour 18:00
Lag
Figure 7: Partial autocorrelation function of the initial in-sample absolute price differences (cid:12)(cid:12)(cid:12) ∆ P d,st (cid:12)(cid:12)(cid:12) for selected hours. Blue, dashed lines indicate the confidence intervals.heteroscedasticity in the heavy-tailed data. Figure 7 shows that the most significant arethe first three lags. Also, lags up to 6 may contain some information. Surprisingly, lagsaround 31 seem to be significant too, but this is most likely some daily dependence. We assume the price differences ∆ P d,st to follow a 4-parametric distribution – a mixtureof the Dirac δ distribution and the 3-parametric t-distribution, sometimes referred as10ero-inflated t-distribution: G d,st = (1 − α d,st ) δ + α d,st F d,st (2)where α d,st = ( V d,st (cid:54) = 0) is a Bernoulli variable of the event that there is non-zero vol-ume of energy traded on product s on day d at time t with probability π d,st and F d,st isthe 3-parametric t-distribution t ( µ d,st , σ d,st , ν d,st ) where µ d,st ∈ R is the mean, σ d,st > ν d,st > Y be a random variable with a density function f ( y | Θ), where Θ is a set of up to four distribution parameters. Then each θ i ∈ Θ may bemodelled by g i ( θ i ) = J i (cid:88) j =1 h ji ( x ji ) (3)where g i is some link function, J i is a number of explanatory variables and h ji is a smoothfunction of explanatory variable x ji . Note that function h ji does not have to be a parametricfunction. In our exercise, we use the following link functions g ( µ ) = µg ( σ ) = log( σ ) ( σ ≤
1) + ( σ − ( σ > g ( ν ) = log( ν − . (4) g is a standard link function for the expected value. g is a link function that we call”logident” and we introduce it in order to avoid exponential inverse function for highvalues of estimates. The third link function is simply a natural logarithm shifted to 2 forpreserving the condition that ν >
2. The three link functions are plotted in Figure 8. Themodels for F d,st are estimated using the gamlss package in R [49].Due to the novelty of the exercise, we cannot use any literature benchmarks, as well asany standard approaches to the modelling of volatility, e.g. GARCH. Even though the datalooks like time series, the biggest problem lies in the gap between days. We model eachproduct separately, and for each product we have 31 observations every day. In the corre-11ponding time series, the observations on day d appear in 5-minute breaks, while the timedifference between the last observation on day d and the first on day d +1 is around 21 hours. -2.50.02.55.0 0 1 2 3 4 5 x g1g2g3 Figure 8: An illustration of the threelink functions g , g and g .Furthermore, there is no direct link betweenthe prices on day d and day d + 1 as theyare for different delivery periods with potentiallydifferent fundamental market situations. Thus,the usage of GARCH-type components to ad-dress conditional heteroscedasticity is not straight-forward. Instead, as simple benchmarks we usemodels that assume the distribution of ∆ P d,s =(∆ P d,s , ∆ P d,s , . . . , ∆ P d,sT ) to be multivariate, ran-dom walk models, and a model that uses in-sampleprice differences to create an ensemble forecast. Also, as advanced benchmarks linear quan-tile regression with copula models are considered.In the following subsection, the more complicated models are considered. We modelexplicitly the probability of non-zero number of transactions, the mean, and the varianceof fitted distribution. We present the models from the least to the most complex and showthe results similarly. This allows us to observe the gain caused by every new part of themodel. We introduce a dependency structure between the first three parameters of the G d,st dis-tribution, i.e. π d,st , µ d,st and σ d,st , and the data. For the fourth parameter, the degreesof freedom ν d,st , we assume the constancy. The G d,st distribution is estimated in a 2-stepapproach. First, the π d,st parameter is estimated, and then the F d,st distribution is fitted tothe in-sample price differences ∆ P d,st for which the value of α d,st is 1.12n the first step, we build a logistic model for π d,st log (cid:32) π d,st − π d,st (cid:33) = β + (cid:88) j =1 β j ∆ P d,st − j + (cid:88) j =1 β j (cid:12)(cid:12)(cid:12) ∆ P d,st − j (cid:12)(cid:12)(cid:12) + β
10 12 (cid:88) j =7 (cid:12)(cid:12)(cid:12) ∆ P d,st − j (cid:12)(cid:12)(cid:12)(cid:124) (cid:123)(cid:122) (cid:125) price differences + β Mon( d ) + β Sat( d ) + β Sun( d ) + (cid:88) j =1 β j TtM j ( t ) (cid:124) (cid:123)(cid:122) (cid:125) time dummies + β DA d,s Load + β DA d,s Sol + β DA d,s WiOn + β DA d,s WiOff (cid:124) (cid:123)(cid:122) (cid:125) fundamental regressors + (cid:88) j =1 β j ¯ α d,st − j (cid:124) (cid:123)(cid:122) (cid:125) regression on α d,st . (5)The model explains the logit function with 4 main components: price difference impact,time dummies, fundamental regressors and regression on α d,st . Price difference impactconsists of 3 most recent price differences, 6 most recent absolute price differences and asum of absolute prices differences lagged by 7 to 12. This component addresses the overallimpact of price volatility on π d,st . We expect to observe more trades when the prices are morevolatile. Time dummies consist of three weekday dummies and time to maturity dummies.The weekday dummies for Monday, Saturday and Sunday are chosen literature-based. Anumber of studies [50–52] have proven that usage of these dummies in EPF substantiallyimproves the forecasting performance. These three dummies indicate the end of the weekwith Monday being a transition day. The use of time to maturity dummies is clear when wetake a look again at Figure 5. It is expected that π d,st rises as we approach the gate closure.Fundamental regressors consist of day-ahead forecasts of total load, solar generation, windonshore generation and wind offshore generation. It is expected that higher load and shareof renewables should rise the uncertainty in the market, and encourage market participantsto trade more. The last, but not the least is the regression on α d,st . We do not use theregression directly, but instead we use the average of last j observed values of α d,st whichwe denote by ¯ α d,st − j . We expect these values to have a significant impact on the predictionof π d,st . Intuitively, the higher these averages, the higher the value of π d,st .Model (5) consists of 61 regressors in total. To avoid overfitting problems, we estimatethe model using the least absolute shrinkage and selection operator (lasso) of Tibshirani[47]. Let us recall that if we possess a logistic model log (cid:0) π − π (cid:1) = X (cid:48) β for the Bernoulli13ariable α with P ( α = 1) = π , then the lasso estimator (cid:98) β lasso is given by (cid:98) β lasso = arg min β (cid:110) − l (cid:16) β , (cid:101) X (cid:17) + λ ( || β || − | β | ) (cid:111) , (6)where l is the corresponding log-likelihood l ( β , (cid:101) X ) = 1 N N (cid:88) i =1 α i (cid:101) X (cid:48) i β − log(1 + e (cid:101) X (cid:48) i β ) , (7) (cid:101) X is a standardization of X and λ is a tunable shrinkage parameter. This method foundalready many successful applications to the EPF and intraday markets [35, 36, 53]. In thisexercise, we utilize the glmnet package in R by Friedman et al. [54]. The estimation isconducted using a BIC-tuned λ value chosen from an exponential grid of 100 values.Let us now take a look at the F d,st distribution in equation (2). We consider fourversions of it. In the first one, we assume that F d,st follows t (0 , σ d,st , ν d,st ) with constant σ d,st and ν d,st . We denote it simply by Mix.RW.t . The F d,st distribution is fitted to the in-sample price differences with non-zero transaction number using the GAMLSS. With thismodel we can observe the gain of using a complex model for the π d,st parameter. Figure 9shows fitted densities to the histograms presented in Figure 3. They were obtained withmodel Mix.RW.t .The second model utilizes F d,st with modelled µ d,st and constant σ d,st and ν d,st , and wedenote it by Mix.t.mu . This model helps us understand the outcome of modelling of
Hour 00:00 -5.0 -2.5 0.0 2.5 5.0
Hour 06:00 -5.0 -2.5 0.0 2.5 5.0
Hour 12:00 -5.0 -2.5 0.0 2.5 5.0
Hour 18:00 Δ P td,s Figure 9: Histograms of the initial in-sample price differences ∆ P d,st with fitted densitiesof model Mix.RW.t for selected hours. Blue colour corresponds to the no-trade cases.14he expected value of ∆ P t . However, a preliminary analysis has shown that most of theregressors used in model (5) were not significant for modelling of µ d,st . The only significantwere the three most recent price differences. Therefore, we model the expected value with g ( µ d,st ) = β ∆ P d,st − + β ∆ P d,st − + β ∆ P d,st − . (8)The next model uses F d,st with µ d,st ≡
0, modelled σ d,st and constant ν d,st . We denote itby Mix.t.sigma . The formula for the standard deviation is as follows g ( σ d,st ) = β + (cid:88) j =1 β j (cid:12)(cid:12)(cid:12) ∆ P d,st − j (cid:12)(cid:12)(cid:12) + β (cid:88) j =7 (cid:12)(cid:12)(cid:12) ∆ P d,st − j (cid:12)(cid:12)(cid:12)(cid:124) (cid:123)(cid:122) (cid:125) absolute price differences + β Mon( d ) + β Sat( d ) + β Sun( d ) (cid:124) (cid:123)(cid:122) (cid:125) weekday dummies + β DA d,s Load + β DA d,s Sol + β DA d,s WiOn + β DA d,s WiOff (cid:124) (cid:123)(cid:122) (cid:125) fundamental regressors + β α d,st − + β α d,st − (cid:124) (cid:123)(cid:122) (cid:125) lagged α d,st + h ( P d,st − ) + h ( t ) (cid:124) (cid:123)(cid:122) (cid:125) non-linear effects (9)where h and h are smooth non-linear P-spline functions. The P-splines simply combineequally-spaced B-splines and discrete penalties. More information on P-splines can be foundin Eilers et al. [55]. Let us note that the model described by equation (9) uses much moreregressors than in equation (8). The explanation of the choice of the variables is very similarto the one of the model described by equation (5). We explain the standard deviation ofprice differences with: lagged absolute price differences, weekday dummies, fundamentalregressors, lagged values of α d,st and non-linearities in most recent price and time to maturityvariables. We expect that the absolute price changes are a suitable explanatory variablefor the standard deviation as motivated through Figure 7. The fundamental regressors aresupposed to have a positive linear correlation with the σ d,st . For the Saturday and Sundaydummies we might expect a negative impact due to lower trading activity on weekends, butalso a positive impact due to the fact that higher bid-ask spreads are plausible. The laggedvalues of α d,st indicate if the market participants traded lately, and thus we believe that itcould identify higher price difference’s variance. The last two regressors are expected tohave a non-linear impact on the formation of σ d,st , and therefore they are estimated usingP-splines. Figure 6 provides already an evidence that the standard deviation varies over15ime to maturity. Moreover, we suspect that extreme values of most recent price P d,st − resultin a higher variance due to a relatively inelastic supply curve in extreme price areas.The last and at the same point the most complicated model uses F d,st with µ d,st and σ d,st modelled and constant ν d,st . We denote it by Mix.t.mu.sigma . The µ d,st is modelledusing the formula from equation (8) and the σ d,st using the formula from equation (9).Let us mention that we could make the mixture model even more complex by modellingthe degrees of freedom parameter ν d,st . However, a preliminary analysis has shown thatit does not yield any significant improvement while increasing heavily the computationalcost. Thus, in the forecasting study we analyse the performance of 8 models described inthis section. The first benchmark model uses one of D = 365 historical trajectories to model the pricedifference vector ∆ P d,s = (∆ P d,s , ∆ P d,s , . . . , ∆ P d,sT ). We denote it by Naive and its for-mula is given by ∆ P d,s = ∆ P d (cid:48) ,s (10)where d (cid:48) ∼ U ( { d − , . . . , d − D } ) is a uniform random variable indicating the day used tomodel the price difference. Let us note that a fixed d (cid:48) index is used to model the whole pricetrajectory, i.e. for every t ∈ { , , . . . , T } . This model assumes that the future trajectoriescan be forecasted using simply the past ones.The second and the third benchmark models assume that the price difference vector∆ P d,s follows a multivariate normal and t-distributions, respectively. They are denoted by MV.N and
MV.t and are given by ∆ P d,s = ε d,s (11)where ε d,s ∼ N (cid:0) , Σ d,s (cid:1) in the case of MV . N and ε d,s ∼ t (cid:0) , Σ d,s , ν d,s (cid:1) in the caseof MV.t . Let us note that the covariance matrix Σ d,s and degrees of freedom ν d,s areestimated by fitting the respective distributions to the in-sample observations. Moreover,the degrees of freedom ν d,s is assumed to be constant for all t ∈ { , , . . . , T } . RW.t.mix.D . The formula is as follows∆ P d,st = P d,st − P d,st − = ε d,st (12)where ε d,st ∼ G d,st with (cid:98) π d,st = DT (cid:80) d − i = d − D (cid:80) Tt =1 ( V i,st (cid:54) = 0), µ d,st ≡ σ d,st and ν d,st . These values are estimated based on the in-sample data.The fifth benchmark model is a modification of the RW.t.mix.D . We denote it by
RW.t , and we simply set π d,st ≡ N (0 , ( σ d,st ) ) and is denoted by RW.N . In terms of the G d,st distribution, we simply modifythe RW.t model by taking ν d,st → ∞ .Later, we consider the random walk models from the simplest RW.N to the mostcomplex
RW.t.mix.D . This allows us to observe the gain of introducing more complexstructure of the distribution. Let us note that model
RW.N assumes exponentially decay-ing tails of the price differences ∆ P d,st . Comparing it to model RW.t we measure the gainof assuming heavier, polynomially decaying tails. Based on the number of outlier observa-tions in the German intraday market and on Figure 4, we expect it to perform better thanthe Gaussian random walk. Then, considering the
RW.t.mix.D helps us to understandthe gain of the introduction of the mixture.
As mentioned, we are unable to use any literature-based benchmarks as this is the firstpaper on ensemble forecasting in intraday electricity markets. However, it is possible toimplement scenario generating methods that are utilized in other research areas. Thus,as advanced benchmark we utilize a smoothed linear quantile regression model with twocopulas: Gaussian and independence, and we denote them by
LQR.Gauss and
LQR.ind ,respectively. A very similar approach was applied recently in the purpose of generatingdensity forecasts of significant sea wave height and peak wave period [56].First, we build the linear quantile regression (LQR) model using the same set of regres-17ors as for the mixture models. The formula is as follows Q τ (cid:16) ∆ P d,st (cid:17) = β + (cid:88) j =1 β j ∆ P d,s − j + (cid:88) j =1 β j (cid:12)(cid:12)(cid:12) ∆ P d,s − j (cid:12)(cid:12)(cid:12) + β
10 12 (cid:88) j =7 (cid:12)(cid:12)(cid:12) ∆ P d,s − j (cid:12)(cid:12)(cid:12) + β P d,s (cid:124) (cid:123)(cid:122) (cid:125) price components + β Mon( d ) + β Sat( d ) + β Sun( d ) (cid:124) (cid:123)(cid:122) (cid:125) + β α d,s − + β α d,s − (cid:124) (cid:123)(cid:122) (cid:125) lagged α d,st + β DA d,s Load + β DA d,s Sol + β DA d,s WiOn + β DA d,s WiOff (cid:124) (cid:123)(cid:122) (cid:125) fundamental regressors (13)for τ ∈ { . , . , . . . , . } and t = 1 , . . . , T . That is to say, we build separate modelsfor each quantile τ and each time point t . Let us note that due to the design of the model,we can use only the regressor values available at the time of forecasting t = 0 (i.e. 3 h 5 minbefore the delivery). This results in the fact that here we model all T time points usingthe same data, what is contrary to the mixture models where we can use autoregressivevariables due to the recursive character of the models. We estimate the LQR models usingthe quantreg package in R [57].In the next step, a spline interpolation is applied over all fitted Q τ (cid:16) ∆ P d,st (cid:17) for τ ∈{ . , . , . . . , . } and for every t = 1 , . . . , T . In order to preserve the monotonicity ofthe estimated cumulative distribution function (CDF) we compute a monotonic cubic splineusing Hyman filtering [58]. This way we obtain a smooth and monotonic T -dimensionalCDF function Ψ d,s ( x ) = (cid:16) Ψ d,s ( x ) , Ψ d,s ( x ) , . . . , Ψ d,sT ( x T ) (cid:17) (14)where Ψ d,st (cid:16) min { d − ,...,d − D } (cid:16) ∆ P d,st (cid:17)(cid:17) = 0 and Ψ d,st (cid:16) max { d − ,...,d − D } (cid:16) ∆ P d,st (cid:17)(cid:17) = 1. Toassess the dependency structure of the price differences ∆ P d,st over t we use two copulas:Gaussian and independence. The Gaussian copula for a given correlation matrix R can bewritten as C R ( u ) = Φ R (cid:0) Φ − ( u ) , Φ − ( u ) , . . . , Φ − ( u T ) (cid:1) (15)where Φ − is the inverse CDF of a standard normal distribution and Φ R is a joint CDF ofa multivariate normal distribution with mean vector zero and covariance matrix equal tothe correlation matrix R . We estimate the correlation matrix R simply by calculating thein-sample correlation matrix. 18 Forecasting study and evaluation
We use a rolling window forecasting study approach with D = 365 days in-sample sizeand N = 1173 days out-of-sample size. The in-sample data consists of DT data pointswhere T = 31 in this study. We model each of the S = 24 hourly products separatelyand our forecasting time is 185 minutes before the delivery of product s on day D + 1.That is to say, we can utilize all the information from the in-sample data and from the day D + 1 until 185 minutes before the delivery. At this time we forecast M = 1000 times thefirst price difference ∆ P d,s = ID d,s − ID d,s . Based on these forecasts and explanatorydata we simulate M second price differences ∆ P d,s and we continue this recursive processuntil we reach the gate closure. Figure 10 provides an outline of the exercise. This givesus M simulated trajectories, each consisting of T = 31 points. After that, we move thewindow forward by one day and repeat the exercise until the end of out-of-sample data.However, in the case of benchmark models we do not use the recursive algorithm as thereis no recursion in their formulas.As forecasting measures we utilize the mean absolute error (MAE) and the root meansquared error (RMSE) to evaluate the median and mean trajectories, respectively. Forevaluation of the probabilistic forecasting performance the energy score (ES) and the con-tinuous ranked probability score (CRPS) are used. Moreover, the energy score evaluates inaddition the quality of generated scenarios. For a broader explanation of these, see Gneit-ing and Raftery [59]. We also calculate the empirical coverage of the simulated trajectories.To draw statistically significant conclusions on the outperformance of the forecasts of theconsidered models we utilize also the Diebold and Mariano [48] test. The RMSE is theoptimal least squares measure, i.e. it is the strictly proper scoring rule for mean evalua- d − d, s − h m d, s − h d, s − h m d, s − h m . . . d, s − m d, s − m d, s Timeof forecastingStartof trading
Delivery ID ID ID ID . . . . . . ID Figure 10: An outline of the ensemble forecasting exercise.19ion while MAE is strictly proper for median evaluation. They are widely used both byresearchers and practitioners. Their formulas are given byRMSE = (cid:118)(cid:117)(cid:117)(cid:116) SN T S (cid:88) s =1 N (cid:88) d =1 T (cid:88) t =1 (cid:32) P d,st − M M (cid:88) j =1 (cid:98) P d,st,j (cid:33) (16)and MAE = 1 SN T S (cid:88) s =1 N (cid:88) d =1 T (cid:88) t =1 (cid:12)(cid:12)(cid:12) P d,st − med j =1 ,...,M (cid:16) (cid:98) P d,st,j (cid:17)(cid:12)(cid:12)(cid:12) (17)where (cid:98) P d,st,j is the j -th simulation of P d,st and med j =1 ,...,M (cid:16) (cid:98) P d,st,j (cid:17) is the median of M simu-lated (cid:98) P d,st,j prices.We approximate the CRPS using the pinball lossCRPS d,st = 1 R (cid:88) τ ∈ r PB d,st,τ (18)for a dense equidistant grid of probabilities r between 0 and 1 of size R , see e.g. [7]. In thisstudy, we consider r = { . , . , . . . , . } of size R = 99. PB d,st,τ is the pinball loss withrespect to the probability τ . Its formula is given byPB d,st,τ = (cid:16) τ − { P d,st 12 EI d,s (22)20here ED d,s = 1 M M (cid:88) j =1 (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) P d,s − (cid:98) P d,sj (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (23)and EI d,s = 1 M ( M − M (cid:88) j =1 M (cid:88) i = j +1 (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:98) P d,sj − (cid:98) P d,si (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (24)with P d,s = (cid:16) P d,s , P d,s , . . . , P d,sT (cid:17) and (cid:98) P d,sj = (cid:16) (cid:98) P d,s ,j , (cid:98) P d,s ,j , . . . , (cid:98) P d,sT,j (cid:17) . The ED d,s componentmeasures the distance between the simulated trajectories and the observed prices. On theother hand, EI d,s measures the spread between the simulations. Again, to calculate theoverall energy score we use an averageES = 1 SN S (cid:88) s =1 N (cid:88) d =1 ES d,s . (25)Let us note that in our exercise both ES and CRPS are strictly proper scoring rules [59]. TheCRPS evaluates the marginal probability distribution while the ES evaluates in additionthe generated scenarios. To illustrate it and to prove the appropriateness of the energy scoreto evaluate correctly an ensemble forecasting study, we perform a short experiment in theresults section. We take the best performing model and modify it with 3 different copulaswhich we refer as maximum dependency, minimum dependency and independence. For themaximum dependency copula we consider the T -dimensional comonotonicity copula definedby M max ( u ) = min( u , u , . . . , u T ). The minimum dependecy copula M min is constructedusing pairwise bivariate countermonotonicity copulas defined by W ( u , u ) = max( u + u − , M min is the coupla that is associated with the T -dimensionl uniform randomvariable U = ( U , . . . , U T ) that satisfies ( U t , U t +1 ) ∼ W for all t = 1 , . . . , T − 1. The T -dimensional independence copula is defined by M ind ( u ) = Π Tt =1 u t . The 3 new models areevaluated using all considered measures and compared to the original one.As mentioned, we use also the empirical coverage to evaluate the forecasts. The τ %-coverage is calculated using the following formula τ %-cov = 1 SN T S (cid:88) s =1 N (cid:88) d =1 T (cid:88) t =1 (cid:110) Q (1 − τ ) / j =1 ,...,M ( (cid:98) P d,st,j )
0. Let us note that these testsare complementary, and we assume that the loss differential series is covariance stationary. We divided this section into two subsections: in the first one, we inspect the in-samplecharacteristics and in the second one, we present the out-of-sample simulation results. We start our study with an analysis of the initial in-sample characteristics. Table 1 showsthe estimated coefficient values of model Mix.t.mu.sigma based on the initial in-sample22 ( µ d,st ) 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23∆ P d,st − -0.09 -0.09 -0.11 -0.09 -0.11 -0.08 -0.06 -0.02 0.02 0.07 0.06 0.07 0.04 0.03 -0.01 0 -0.01 0.02 0.01 0.03 -0.02 -0.03 -0.03 -0.07∆ P d,st − -0.05 -0.06 -0.04 -0.07 -0.05 -0.05 -0.04 -0.04 -0.02 0 0.01 0.02 0.01 0 0 0 -0.03 -0.01 0 -0.03 -0.01 -0.01 -0.02 -0.05∆ P d,st − -0.02 -0.03 -0.03 -0.03 -0.03 -0.04 -0.01 -0.01 0 0.01 0.02 0.01 0.01 0.01 0.01 0 0 0 0 -0.01 -0.01 0 0.01 -0.02 g ( σ d,st ) 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23Intercept 0.49 0.52 0.5 0.42 0.47 0.56 0.83 0.66 0.55 0.29 0.45 0.48 0.36 0.31 -1.78 -1.75 -1.65 0.39 0.62 0.36 0.36 0.59 0.61 0.51 (cid:12)(cid:12)(cid:12) ∆ P d,st − (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ∆ P d,st − (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ∆ P d,st − (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ∆ P d,st − (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ∆ P d,st − (cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ∆ P d,st − (cid:12)(cid:12)(cid:12) (cid:80) j =7 (cid:12)(cid:12)(cid:12) ∆ P d,st − j (cid:12)(cid:12)(cid:12) d,s Load -0.02 -0.03 -0.02 -0.03 0 0.02 0.05 -0.01 -0.02 -0.02 -0.01 0.02 0.03 -0.01 -0.03 -0.04 -0.02 -0.03 -0.04 0 -0.01 0 0.01 0.03DA d,s Sol d,s WiOn d,s WiOff d ) 0.01 0.05 0.05 0.1 0.12 0.21 0.15 0.1 0.02 0.06 0.02 0.03 0.01 -0.01 0 -0.03 -0.08 -0.1 -0.05 -0.03 -0.06 -0.05 0.01 0Sat( d ) -0.06 0.06 0.05 0.02 0.06 0.06 0.01 0.03 0.04 0.06 0.15 0.24 0.26 0.13 0.16 0.23 0.19 0.15 0.12 0.16 0.12 0.11 0.1 0.04Sun( d ) 0.11 0.07 0.18 0.09 0.08 0.06 -0.01 0.01 0.02 0.15 0.16 0.24 0.3 0.2 0.2 0.27 0.34 0.21 0.21 0.24 0.2 0.12 0.13 0.04 α d,st − -0.45 -0.44 -0.45 -0.39 -0.36 -0.39 -0.5 -0.4 -0.42 -0.38 -0.44 -0.35 -0.34 -0.37 -0.38 -0.34 -0.38 -0.4 -0.36 -0.32 -0.4 -0.4 -0.5 -0.39 α d,st − -0.18 -0.19 -0.18 -0.16 -0.17 -0.19 -0.22 -0.15 -0.15 -0.18 -0.15 -0.13 -0.15 -0.08 -0.04 -0.05 -0.17 -0.12 -0.2 -0.12 -0.13 -0.18 -0.16 -0.18 g ( ν d,st ) 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23Intercept 0.53 0.74 0.69 0.86 0.84 0.52 0.56 0.7 1.11 1.25 1.52 1.37 1.41 1.37 1.07 0.87 0.93 1.06 1.24 1.18 1.02 0.74 0.76 0.57 0% 1% 2% 3% 4% 5% 6% 7% 8% 9% 10% ≥ Significance level Table 1: Initial in-sample coefficient values of model Mix.t.mu.sigma reported for everyhourly product. The price and generation variables were scaled for better clarity. Thep-value of the test for significance of the values is indicated by the colour. The legend isexplained by the table in the bottom.data. The table reports the values for every hourly product, and it is split to 3 sub-tables, each regarding different parameter of the t-distribution. The first sub-table presentscoefficients of the model described by equation 8. Variable ∆ P d,st − appears to be statisticallysignificant for most of the hours. However, raising the lag decreases the significance. Thisbehaviour goes in the direction of weak-form efficiency concluded by Narajewski and Ziel[36].The second sub-table shows coefficients of the model presented in equation 9. Here,we see that all the variables using lagged absolute price differences are mostly significantlydifferent from 0. Moreover, the coefficients of | ∆ P d,st − | and | ∆ P d,st − | are relatively high.Surprisingly, the day-ahead forecast of total load is mostly irrelevant. The day-aheadforecast of solar generation is significant mainly during the day-peak and in the evening.23he day-ahead forecast of wind onshore generation appears to have a big positive impact onthe volatility of price differences, in contrast to the wind offshore forecast. The behaviourof weekday dummies gives some light to our mixed expectancies – they indicate a differentbehaviour of traders on Monday at night and on Saturday and Sunday during the day. Onweekends the volatility is higher, likely due to higher bid-ask spreads on weekends. Thelagged values of α d,st have significant, negative impact on the volatility of price differences.It means that if there was trading at times t − t − 2, then the standard deviationwould be lower. Let us note that the values of the coefficients are very similar among allhours except hours 14 to 16. For these hours the estimates of intercept and absolute pricedifferences deviate heavily from the estimates of the remaining hours. A possible reason forthis may be a few extreme outliers which were observed for these hours and for the othersnot. The table presents no values for the P-splines, because they are non-parametricfunctions. The last sub-table shows the estimate values for g ( ν d,st ) which we assumed tobe constant. We show it anyway to gain an insight in the magnitude of the degrees offreedom. Let us recall that g − ( ν ) = exp( ν ) + 2. Applying this to the estimate results invalues of ν d,st between around 3.7 and 6.6. Thus, the innovations are not extremely heavytailed, and it is reasonable to apply asymptotic statistic for validation and interpretation.Figure 11 shows the initial in-sample P-splines h ( P d,st − ) and h ( t ). We see that in case P t-1d,s h ( P t - , s ) (a) -1.5-1.0-0.50.00.5 50100150 Time to delivery (minutes) h ( t ) (b) -1.5-1.0-0.50.00.5 50100150 Time to delivery (minutes) h ( t ) (c) Figure 11: Initial in-sample smoothing effects of variables (a) P d,st − and (b) time to maturityin the model described in equation (9). (c) is analogous to (b), but as in-sample consideringthe first year of XBID. Note that the support of P d,st − differs among products.24f both variables, the smoothing functions are non-linear. Extreme values of most recentprice P d,st − result in most cases in high rise of volatility. On the other hand, the valuesbetween 0 and 50 EUR/MWh have rather marginal impact on the variance of the pricedifferences. An interesting effect can be seen in Figure 11b. We see that until 60 minutesbefore the delivery the impact on the volatility is on a similar, negative level among allproducts. Then, in the last 30 minutes of trading the volatility rises substantially abovezero. This behaviour can be misinterpreted as a result of the closure of XBID as in Figure 2.However, this plot is based on the initial in-sample data, i.e. the data between 16th July2015 and 14th July 2016. Therefore, the effect of XBID could not be in the data as it wasintroduced on 18th June 2018. Figure 11c is analogous to Figure 11b, but based on thefirst year of XBID, i.e. the data between 18th June 2018 and 17th June 2019. Comparingthe two figures concludes that the introduction of XBID has an impact on the volatilityof the price differences decreasing it even lower before the XBID closure and rising it evenhigher just after it. Interestingly, this is in contrary to the paper of Kath [5] who concludedthat there is no evidence for the influence of XBID on the price volatility. Y t d , s -10123 08/07 08:55 09/07 08:55 10/07 08:55 11/07 08:55 12/07 08:55 13/07 08:55 14/07 08:55 Time (year 2016) g ( σ t d , s ) | Δ P t-id,s | No-trade indicator DA.Load DA.Solar DA.Wind Intercept h (P t-1d,s ) h (t) Weekday dummies Figure 12: Price trajectory (top) and decomposition of fitted g ( σ d,st ) (bottom) of hourlyproduct with delivery at 12:00 for 7 consecutive in-sample days. The end of trading andthe time of forecasting are indicated by the dashed black and grey lines, respectively.25igure 12 presents a price trajectory and a decomposition of fitted g ( σ d,st ) of the prod-uct with delivery at 12:00 for the last 7 days of in-sample data. For the sake of readabilitywe grouped the components of the model for standard deviation similarly as in equation (9).Let us note that the absolute price differences and fundamental regressors have big, positiveimpact on the volatility of price differences. We also observe overall higher volatility on theweekend, i.e. the second and third day on the plot, than on the week. Note that in thisspecific example the impact of the non-linear price due to h looks rather negligible. How-ever, the price level in these seven trading sessions is always between 0 and 40 EUR/MWhwhere we expect minor impacts. Now, we turn ourselves to the analysis of the simulated trajectories. Figure 13 showsthe first out-of-sample simulation exercise of prices of product with delivery at 12:00 on15.07.2016. The trajectories are simulated from Mix.t.mu.sigma model and it can beeasily compared to the simulations from Gaussian random walk presented in Figure 1. Itis clear that in this example the trajectories of the mixture model are less volatile than therandom walk.Table 2 shows the values of utilized error measures. The Naive model performs very well Time to delivery (minutes) x I D m p r i c e ( hou r l y , : ) Figure 13: Price trajectory for the hourly product with delivery on 15.07.2016 at 12:00.The black part is the realization and the colourful part consists of 100 simulations fromthe Mix.t.mu.sigma . Time of forecasting is indicated by the green dashed line.26 S CRPS MAE RMSE 50%-cov 90%-cov 99%-covNaive 16.428 1.179 3.075 5.805 Mix.t.mu.sigma Table 2: Error measures of the considered models. Colour indicates the performancecolumnwise (the greener, the better). With bold, we depicted the best values in eachcolumn.overall. Moreover, it gives the best results in terms of 50% coverage. Very similar resultsto the naive model gives the MV.t which assumes the trajectories to follow a multivariatet-distribution. Having a look at the performance of MV.N we see that indeed the t-distribution is better in modelling of the trajectories. The LQR-based models are overallworse than the Naive or MV.t . It is worth to emphasize the very bad ability to model themean and median trajectories and on the other hand quite good 99% coverage. Let us notea very bad performance of the Gaussian random walk. Model RW.N is clearly the worst.Having a look at its coverage values, we conclude that its simulations are too volatile. Theintroduction of t-distribution to random walk yields already a big improvement. Anotherstep in our modelling, the usage of simple mixture distribution of the Dirac distribution δ and the random walk with innovations from t-distribution do not improve the results.However, the next step, i.e. modelling of the probability π d,st with model (5) improves theresults, but still they are not better than the ones of model RW.t . Moreover, modelling ofthe expected value as in equation (8) also worsens the performance substantially. All these27odels are clearly worse than the Naive considering every measure. The last change tothe mixture model, i.e. modelling of the standard deviation according to the formula inequation (9) lowers the errors significantly. Modelling of the expected value in addition tothe standard deviation brings a little improvement. Model Mix.t.mu.sigma is marginallybetter than Mix.t.sigma and it turns out to be the best model in terms of ES, CRPS,MAE and RMSE. A little disturbing are the values of the 50%- and 90%-coverage whichare too high for the mixture models. This means that it is very likely that the results canbe still improved. On the other hand, they capture better the behaviour in the tails thanthe Naive model.The values of the error measures in Table 2 may suggest that both ES and CRPSevaluate the same thing – the marginal distribution. To emphasize that ES evaluatesalso the quality of the generated scenarios, we perform a small experiment on the model Mix.t.mu.sigma . In Table 3, we compare the performance of the Mix.t.mu.sigma withits copies modified using 3 copulas: maximum dependency, minimum dependency and in-dependent. This results in the same marginal distributions, mean and median trajectories,and coverage values, but in completely different ensembles. This is depicted by the valuesof the measures – only the ES is different among the modifications.Figure 14 shows the models’ performance over all products in terms of energy score.A very interesting is the case of model RW.N . Usually it is not that much worse thanthe other random walks, but for hours 14-16 the error explodes. A look into the dataexplains the situation clearly – there were a few in-sample observations of extreme pricedifferences. The normal distribution assumes exponentially decaying tails, and thus themodel overestimated the variance. This indicates clearly that the t-distribution is better ES CRPS MAE RMSE 50%-cov 90%-cov 99%-covoriginal 15.956 1.144 3.073 5.804 0.5482 0.9277 0.9928max dependency 31.625 1.144 3.073 5.804 0.5482 0.9277 0.9928min dependency 33.263 1.144 3.073 5.804 0.5482 0.9277 0.9928independent 16.925 1.144 3.073 5.804 0.5482 0.9277 0.9928 Table 3: Error measures of 4 Mix.t.mu.sigma models with different copulas.28n this purpose as it was unaffected by these events. Furthermore, we observe that modelswith modelled σ d,st are uniformly better than the others.Figure 15 presents the models’ performance over time to delivery. The values rise as thetime goes, but it is rather not surprising. An interesting behaviour can be observed from150 to 100 minutes before the delivery. In this time range the errors of the random walkmodels and the mixture models that assume constant standard deviation rise significantlyin comparison to the other models. It is also the time of decreasing volatility in Figure 11b.Pinball Score values over quantiles τ are depicted in Figure 16. Let us note that thegain from the forecasting of central quantiles is marginal, and it is inline with other studies Product E ne r g y S c o r e NaiveMV.NMV.tRW.NRW.tRW.t.mix.D LQR.GaussLQR.indMix.RW.tMix.t.muMix.t.sigmaMix.t.mu.sigma 0.951.001.051.101.15 00 06 12 18 00 Product R a t i o t o t he N a i v e Figure 14: Energy score (left) and its ratio to the Naive (right) over 24 hourly products.The right graph is shown without RW.N for better clarity. 12 50100150 Time to delivery (min) CR PS t NaiveMV.NMV.tRW.NRW.tRW.t.mix.D LQR.GaussLQR.indMix.RW.tMix.t.muMix.t.sigmaMix.t.mu.sigma 0.961.001.041.081.12 50100150 Time to delivery (min) R a t i o t o t he N a i v e Figure 15: Continuous ranked probability score (left) and its ratio to the Naive (right)over time to delivery. The right graph is shown without RW.N for better clarity.29 .51.01.5 0 25 50 75 100 τ PB τ NaiveMV.NMV.tRW.NRW.tRW.t.mix.D LQR.GaussLQR.indMix.RW.tMix.t.muMix.t.sigmaMix.t.mu.sigma 0.70.80.91.01.1 0 25 50 75 100 τ R a t i o t o t he N a i v e Figure 16: Pinball score (left) and its ratio to the Naive (right) over quantiles τ ∈ r . Theright graph is shown without RW.N for better clarity. l p−valueNaiveMV.NMV.tRW.NRW.tRW.t.mix.DLQR.GaussLQR.indMix.RW.tMix.t.muMix.t.sigmaMix.t.mu.sigma N a i v e M V . N M V . t R W . N R W . t R W . t . m i x . D L Q R . G a u ss L Q R . i nd M i x . R W . t M i x . t . m u M i x . t . s i g m a M i x . t . m u . s i g m a (a) l p−valueNaiveMV.NMV.tRW.NRW.tRW.t.mix.DLQR.GaussLQR.indMix.RW.tMix.t.muMix.t.sigmaMix.t.mu.sigma N a i v e M V . N M V . t R W . N R W . t R W . t . m i x . D L Q R . G a u ss L Q R . i nd M i x . R W . t M i x . t . m u M i x . t . s i g m a M i x . t . m u . s i g m a (b) Figure 17: Results of the Diebold-Mariano test. (a) presents the p -values for the ES d,s loss,(b) the values for the CRPS d,s loss. The figures use a heat map to indicate the range of the p -values. The closer they are to zero ( → dark green), the more significant the difference isbetween forecasts of X-axis model (better) and forecasts of the Y-axis model (worse).regarding the ID -Price in the German intraday market [36, 39]. On the other hand, models Mix.t.sigma and Mix.t.mu.sigma gain a lot from the forecasting of quantiles outsidethe centre, performing especially well in the tails. In relation to the naive benchmark, the30rror is around 30% lower in the lower tail and around 25% lower in the upper tail. Let usalso note that the LQR-based models give quite good results in the tails, but lose a lot inthe centre, compared to the naive or to the models with non-constant σ d,st .Figure 17 shows the results of the DM test using two types of losses: the ES d,s and theCRPS d,s . Before applying the test we conducted on the multivariate loss differential series∆ dA,B three tests that evaluate the null hypothesis that a unit root is present in the seriesagainst the alternative that the data is stationary or trend-stationary. We used the Dickey-Fuller test [61], the Augmented Dickey-Fuller test [62], and the Phillips-Perron test [63]. In99% of cases the obtained p-values were smaller than 0.01, so we reject the null hypothesis.This indicates in our case that the loss differential series is covariance stationary. Only forthe differences with RW.N the Dickey-Fuller test reported no significance for rejecting thenull hypothesis. This may be caused by the bad capturing of the marginal distribution ofthe RW.N . The results of the DM test show that the difference between the forecasts ofmodels Mix.t.mu.sigma and Mix.t.sigma is insignificant. Moreover, these models givebetter forecasts than all the other considered models. It is worth to emphasize a very goodperformance of the Naive model, but it is not surprising after taking a look at Table 2. We conducted an ensemble forecasting study in the German Intraday Continuous Marketwhich is novel in two ways. The first way, this study is the first one that raises the issueof price trajectory simulation and ensemble forecasting in continuous intraday electricitymarkets. The second way, the study uses a very clever mixture of distributions that is fittedto the data. The results are very satisfying and showing that it is possible to successfullymodel the volatility in the German Intraday Continuous Market. The study was carriedout using the data from the German market, but the generality of this method and theorganization of the European electricity markets ensure a possible application to othermarkets, especially the markets participate in XBID which covers exchanges like EPEX,Nordpool and OMIE.Obviously, the proposed method can be developed further. One of possible directions isusing other external processes like the traded volume or price of nearby hours as regressors.31lthough, this is a non-trivial task and could easily lead to the accumulation of errors.Another possibility is utilization of other probability distribution. The not perfect coverageof the best performing model indicates that there is still some space for improvement. Thisissue could be addressed with some post-processing method as well. Acknowledgments This research article was partially supported by the German Research Foundation (DFG,Germany) and the National Science Center (NCN, Poland) through BEETHOVEN grantno. 2016/23/G/HS4/01005. References [1] S. Goodarzi, H. N. Perera, and D. Bunn. The impact of renewable energy forecasterrors on imbalance volumes and electricity spot prices. Energy Policy , 134:110827,2019.[2] C. Kath and F. Ziel. The value of forecasts: Quantifying the economic gains of accuratequarter-hourly electricity price forecasts. Energy Economics , 76:411–423, 2018.[3] K. Maciejowska. Assessing the impact of renewable energy sources on the electricityprice level and variability–A quantile regression approach. Energy Economics , 85:104532, 2020.[4] J. Viehmann. State of the German Short-Term Power Market. Zeitschriftf¨ur Energiewirtschaft , 41(2):87–103, Jun 2017. ISSN 1866-2765. doi: 10.1007/s12398-017-0196-9. URL https://doi.org/10.1007/s12398-017-0196-9 .[5] C. Kath. Modeling intraday markets under the new advances of the cross-borderintraday project (XBID): Evidence from the German intraday market. Energies , 12(22):4339, 2019.[6] R. Weron. Electricity price forecasting: A review of the state-of-the-art with a lookinto the future. International journal of forecasting , 30(4):1030–1081, 2014.327] J. Nowotarski and R. Weron. Recent advances in electricity price forecasting: A reviewof probabilistic forecasting. Renewable and Sustainable Energy Reviews , 81:1548–1568,2018.[8] P. Muniain and F. Ziel. Probabilistic forecasting in day-ahead electricity markets:Simulating peak and off-peak prices. International Journal of Forecasting , 2020.[9] G. Marcjasz, T. Serafin, and R. Weron. Selection of calibration windows for day-aheadelectricity price forecasting. Energies , 11(9):2364, 2018.[10] B. Uniejewski, G. Marcjasz, and R. Weron. On the importance of the long-termseasonal component in day-ahead electricity price forecasting: Part II—Probabilisticforecasting. Energy Economics , 79:171–182, 2019.[11] T. Serafin, B. Uniejewski, and R. Weron. Averaging predictive distributions acrosscalibration windows for day-ahead electricity price forecasting. Energies , 12(13):2561,2019.[12] Z. Yang, L. Ce, and L. Lian. Electricity price forecasting by a hybrid model, combin-ing wavelet transform, ARMA and kernel-based extreme learning machine methods. Applied Energy , 190:291–305, 2017.[13] D. Wang, H. Luo, O. Grunder, Y. Lin, and H. Guo. Multi-step ahead electricity priceforecasting using a hybrid model based on two-layer decomposition technique and BPneural network optimized by firefly algorithm. Applied Energy , 190:390–407, 2017.[14] J. Zhang, Z. Tan, and Y. Wei. An adaptive hybrid model for short term electricityprice forecasting. Applied Energy , 258:114087, 2020.[15] L. Xiao, W. Shao, M. Yu, J. Ma, and C. Jin. Research and application of a hybridwavelet neural network model with the improved cuckoo search algorithm for electricalpower system forecasting. Applied Energy , 198:203–222, 2017.[16] P. Bento, J. Pombo, M. Calado, and S. Mariano. A bat optimized neural networkand wavelet transform approach for short-term price forecasting. Applied Energy , 210:88–97, 2018. 3317] D. Keles, J. Scelle, F. Paraschiv, and W. Fichtner. Extended forecast methods forday-ahead electricity spot prices applying artificial neural networks. Applied Energy ,162:218–230, 2016.[18] J. Lago, F. De Ridder, P. Vrancx, and B. De Schutter. Forecasting day-ahead electricityprices in Europe: the importance of considering market integration. Applied Energy ,211:890–903, 2018.[19] F. Ocker and K.-M. Ehrhart. The “German Paradox” in the balancing power markets. Renewable and Sustainable Energy Reviews , 67:892–898, 2017.[20] C. Koch and L. Hirth. Short-term electricity trading for system balancing: An empir-ical analysis of the role of intraday trading in balancing Germany’s electricity system. Renewable and Sustainable Energy Reviews , 113:109275, 2019.[21] F. Karanfil and Y. Li. The role of continuous intraday electricity markets: The inte-gration of large-share wind power generation in Denmark. The Energy Journal , 38(2),2017.[22] K. Maciejowska, W. Nitka, and T. Weron. Day-ahead vs. Intraday—Forecasting theprice spread to maximize economic benefits. Energies , 12(4):631, 2019.[23] M. Narajewski and F. Ziel. Estimation and Simulation of the Transaction ArrivalProcess in Intraday Electricity Markets. Energies , 12(23):4518, 2019.[24] R. Kiesel and F. Paraschiv. Econometric analysis of 15-minute intraday electricityprices. Energy Economics , 64:77–90, 2017.[25] N. Graf von Luckner and R. Kiesel. Modeling market order arrivals on the intradaymarket for electricity deliveries in Germany with the Hawkes process. Available atSSRN , 2020.[26] T. Rintam¨aki, A. S. Siddiqui, and A. Salo. Strategic offering of a flexible producer inday-ahead and intraday power markets. European Journal of Operational Research ,2020. 3427] R. A¨ıd, P. Gruet, and H. Pham. An optimal trading problem in intraday electricitymarkets. Mathematics and Financial Economics , 10(1):49–85, 2016.[28] X. Ay´on, M. ´A. Moreno, and J. Usaola. Aggregators’ optimal bidding strategy insequential day-ahead and intraday electricity spot markets. Energies , 10(4):450, 2017.[29] S. Glas, R. Kiesel, S. Kolkmann, M. Kremer, N. G. von Luckner, L. Ostmeier, et al.Intraday renewable electricity trading: advanced modeling and numerical optimal con-trol. Journal of Mathematics in Industry , 10(1):3, 2020.[30] C. Pape, S. Hagemann, and C. Weber. Are fundamentals enough? Explaining pricevariations in the German day-ahead and intraday power market. Energy Economics ,54:376–387, 2016.[31] M. G¨urtler and T. Paulsen. The effect of wind and solar power forecasts on day-aheadand intraday electricity prices in Germany. Energy Economics , 75:150–162, 2018.[32] M. Kremer, R. Kiesela, and F. Paraschivc. An Econometric Model for Intraday Elec-tricity Trading. 2020.[33] C. Monteiro, I. J. Ramirez-Rosado, L. A. Fernandez-Jimenez, and P. Conde. Short-Term Price Forecasting Models Based on Artificial Neural Networks for Intraday Ses-sions in the Iberian Electricity Market. Energies , 9(9):721, 2016.[34] J. R. Andrade, J. Filipe, M. Reis, and R. J. Bessa. Probabilistic Price Forecastingfor Day-Ahead and Intraday Markets: Beyond the Statistical Model. Sustainability , 9(11):1990, 2017.[35] B. Uniejewski, G. Marcjasz, and R. Weron. Understanding intraday electricity mar-kets: Variable selection and very short-term price forecasting using LASSO. Interna-tional Journal of Forecasting , 35(4):1533–1547, 2019.[36] M. Narajewski and F. Ziel. Econometric modelling and forecasting of intraday elec-tricity prices. Journal of Commodity Markets , page 100107, 2019.3537] G. Marcjasz, B. Uniejewski, and R. Weron. Beating the Na¨ıve—Combining LASSOwith Na¨ıve Intraday Electricity Price Forecasts. Energies , 13(7):1667, 2020.[38] I. Oksuz and U. Ugurlu. Neural network based model comparison for intraday elec-tricity price forecasting. Energies , 12(23):4557, 2019.[39] T. Janke and F. Steinke. Forecasting the price distribution of continuous intradayelectricity trading. Energies , 12(22):4262, 2019.[40] R. A. Rigby and D. M. Stasinopoulos. Generalized additive models for location, scaleand shape. Journal of the Royal Statistical Society: Series C (Applied Statistics) , 54(3):507–554, 2005.[41] T. Hastie and R. Tibshirani. Generalized Additive Models , volume 43. CRC Press,1990.[42] A. Pierrot and Y. Goude. Short-term electricity load forecasting with generalizedadditive models. Proceedings of ISAP power , 2011, 2011.[43] P. Gaillard, Y. Goude, and R. Nedellec. Additive models and robust aggregation forGEFCom2014 probabilistic electric load and electricity price forecasting. InternationalJournal of forecasting , 32(3):1038–1050, 2016.[44] F. Serinaldi. Distributional modeling and short-term forecasting of electricity pricesby generalized additive models for location, scale and shape. Energy Economics , 33(6):1216–1226, 2011.[45] A. Gianfreda and D. Bunn. A stochastic latent moment model for electricity priceformation. Operations Research , 66(5):1189–1203, 2018.[46] E. Abramova and D. Bunn. Forecasting the Intra-Day Spread Densities of ElectricityPrices. Energies , 13(3):687, 2020.[47] R. Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the RoyalStatistical Society. Series B (Methodological) , 58(1):267–288, 1996. ISSN 00359246.URL .3648] F. Diebold and R. Mariano. Comparing Predictive Accuracy. Journal of Business &Economic Statistics , 13(3):253–63, 1995.[49] D. M. Stasinopoulos, R. A. Rigby, et al. Generalized additive models for location scaleand shape (GAMLSS) in R. Journal of Statistical Software , 23(7):1–46, 2007.[50] A. Misiorek, S. Trueck, and R. Weron. Point and interval forecasting of spot electricityprices: Linear vs. non-linear time series models. Studies in Nonlinear Dynamics &Econometrics , 10(3), 2006.[51] B. Uniejewski, J. Nowotarski, and R. Weron. Automated variable selection and shrink-age for day-ahead electricity price forecasting. Energies , 9(8):621, 2016.[52] F. Ziel and R. Weron. Day-ahead electricity price forecasting with high-dimensionalstructures: Univariate vs. multivariate modeling frameworks. Energy Economics , 70:396–420, 2018.[53] F. Ziel. Forecasting electricity spot prices using lasso: On capturing the autoregressiveintraday structure. IEEE Transactions on Power Systems , 31(6):4977–4987, 2016.[54] J. Friedman, T. Hastie, and R. Tibshirani. Regularization paths for generalized linearmodels via coordinate descent. Journal of statistical software , 33(1):1, 2010.[55] P. H. Eilers, B. D. Marx, and M. Durb´an. Twenty years of P-splines. SORT: statisticsand operations research transactions , 39(2):0149–186, 2015.[56] C. Gilbert, J. Browell, and D. McMillan. Probabilistic access forecasting for improvedoffshore operations. International Journal of Forecasting , 2020.[57] R. Koenker. quantreg: Quantile Regression , 2019. URL https://CRAN.R-project.org/package=quantreg . R package version 5.54.[58] J. M. Hyman. Accurate monotonicity preserving cubic interpolation. SIAM Journalon Scientific and Statistical Computing , 4(4):645–654, 1983.[59] T. Gneiting and A. E. Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American statistical Association , 102(477):359–378, 2007.3760] S. Lerch, S. Baran, A. M¨oller, J. Groß, R. Schefzik, S. Hemri, and M. Graeter.Simulation-based comparison of multivariate ensemble post-processing methods. Non-linear Processes in Geophysics , 27(2):349–371, 2020.[61] D. A. Dickey and W. A. Fuller. Distribution of the estimators for autoregressive timeseries with a unit root. Journal of the American statistical association , 74(366a):427–431, 1979.[62] S. E. Said and D. A. Dickey. Testing for unit roots in autoregressive-moving averagemodels of unknown order. Biometrika , 71(3):599–607, 1984.[63] P. C. Phillips and P. Perron. Testing for a unit root in time series regression.