Modified Auto Regressive Technique for Univariate Time Series Prediction of Solar Irradiance
Umar Marikkar, A. S. Jameel Hassan, Mihitha S. Maithripala, Roshan I. Godaliyadda, Parakrama B. Ekanayake, Janaka B. Ekanayake
© 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Department of Electrical and Electronic Engineering, Faculty of Engineering
University of Peradeniya, Peradeniya 20400, Sri Lanka
{e14219, jameel.hassan.2014, e14215}@eng.pdn.ac.lk, {roshangodd, mpb.ekanayake, jbe}@ee.pdn.ac.lk
Abstract—The integration of renewable resources in power generation has increased as a means to reduce fossil fuel usage and mitigate its adverse effects on the environment. However, renewables such as solar energy are stochastic in nature due to their high dependency on weather patterns. This uncertainty vastly diminishes the benefit of solar panel integration and increases operating costs through larger energy reserve requirements. To address this issue, a Modified Auto Regressive model, a Convolutional Neural Network and a Long Short Term Memory neural network that can accurately predict the solar irradiance are proposed. The proposed techniques are compared against each other by means of multiple validation error metrics. The Modified Auto Regressive model has a mean absolute percentage error of 14.2%, 19.9% and 22.4% for the 10 minute, 30 minute and 1 hour prediction horizons respectively. Therefore, the Modified Auto Regressive model is proposed as the most robust method, matching the state-of-the-art neural networks for the solar forecasting problem.
Index Terms—Univariate irradiance forecast, Auto Regression, Neural networks
I. INTRODUCTION
Over the years, the penetration of renewable energy into the electricity grid has vastly increased. The increase in energy demand, the adverse effects of fossil fuel generation and awareness of climate change have advanced the use of renewable resources [1]. The growing concern about environmental pollution has rendered the sustainable development goal
II. DATA PREPARATION
A. Study Area
The solar irradiance data was obtained from the PV plant stationed at the Faculty of Engineering, University of Peradeniya, Sri Lanka. The city of Peradeniya is surrounded by the hills of Hantana and has a tropical climate. This results in a fluctuating solar irradiance curve rather than the typical "bell" shaped curve. This setting gives a more challenging data set which strongly reflects the volatile nature of solar irradiance, in contrast to data sets often encountered in the literature. The data is collected over a period of one year with data points at every 10 minute interval.
B. Training/Testing Split and Data Standardisation
For all forecasting models, the training/testing data split is 70/30 %, following conventional deep learning practice. As the collected data spans a whole year, this gives a sufficiently large data set (≈ 110 days) for testing.

For efficient training and forecast performance, the input data is standardised as in equation (1) as a pre-processing step, and de-standardised in the post-processing stage.

$z = \frac{x - \mu}{\sigma}$    (1)

where,
$z$ = normalised signal value
$x$ = irradiance level at each timestamp
$\mu$ = mean of the dataset
$\sigma$ = standard deviation of the dataset
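To make the pre- and post-processing concrete, the following Python sketch (illustrative only; not the paper's implementation) standardises a series as in equation (1) and inverts the transform. The sample values are placeholders.

```python
import numpy as np

def standardise(x):
    """Standardise a 1-D irradiance series as in equation (1)."""
    mu, sigma = x.mean(), x.std()
    return (x - mu) / sigma, mu, sigma

def destandardise(z, mu, sigma):
    """Invert the standardisation in the post-processing stage."""
    return z * sigma + mu

# Placeholder values in W/m^2; the real data is sampled every 10 minutes.
irradiance = np.array([0.0, 120.0, 480.0, 760.0, 510.0, 90.0])
z, mu, sigma = standardise(irradiance)
recovered = destandardise(z, mu, sigma)
assert np.allclose(recovered, irradiance)
```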
III. METHODOLOGY

The short-term prediction of solar irradiance is implemented for time horizons of 10 minute, 30 minute and 1 hour intervals. The forecasting schemes are developed using machine learning techniques, namely Convolutional Neural Networks (CNN) and Long-Short Term Memory (LSTM) networks, and in addition a Modified Auto Regressive (MAR) forecast model is implemented. Out of the three techniques, the MAR approach is highlighted as the best model for solar prediction.
A. Convolutional Neural Network (CNN)
CNNs are a type of neural network most prominently deployed in image processing and classification problems. The key notion of the CNN is its ability to learn abstract features and recognise them during training by means of kernels in the network [17]. Therefore, in this paper, the CNN has been employed in a time series model to identify temporal abstract-level features in order to predict the next time step.

To encapsulate the complex modelling of features, the CNN utilises three separate layers, namely: the convolutional layer, the pooling layer and the fully connected layer. The convolutional layer is responsible for identifying relationships between the inputs in the locality of the convolution operation that takes place between the inputs and the kernels. The pooling layer performs a down-sampling of the output from the convolution operation. This is then fed to a fully connected layer which is responsible for predicting the output depending on the features. A series of convolution-pooling layers can be used if necessary.

In this paper, one convolution layer and an average pooling layer are used. These layers are designed to extract the features of the input variables, which are the past 4 samples (selected as in section III-D2) of the time series sequence, as in equation (2).

$h^{k}_{ij} = g((W^{k} * x)_{ij} + b^{k})$    (2)

where $W^{k}$ is the weight of the kernel connected to the $k$-th feature map, $g$ is the activation function and $b^{k}$ is the bias unit. The Rectified Linear Unit (ReLU) function is used as the activation function after evaluating its performance against other activation functions. The ReLU function is defined by equation (3).

$g(x) = \max(0, x)$    (3)

The Adam optimisation algorithm, an efficient implementation of the gradient descent algorithm, is used as the training function [18]. Finally, two dense (fully connected) layers are implemented following the pooling layer, with one final dense layer with a single neuron that outputs the prediction. An abstraction of the CNN architecture implemented is shown in Fig. 1. The hyper-parameters of the model are chosen by a grid search algorithm as highlighted in Table I.
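A minimal Keras sketch of such a network is shown below for illustration; the paper's models were built with the MATLAB deep learning toolbox, so this is not the original implementation. The filter count and kernel size are assumed placeholders, while the 116- and 28-neuron dense layers follow Fig. 1.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative 1-D CNN: 4 past samples in, next sample out.
# Filter count and kernel size are assumptions; dense sizes follow Fig. 1.
model = tf.keras.Sequential([
    layers.Conv1D(filters=16, kernel_size=3, activation="relu",
                  input_shape=(4, 1)),        # past 4 standardised samples
    layers.AveragePooling1D(pool_size=2),      # down-sample convolution output
    layers.Flatten(),
    layers.Dense(116, activation="relu"),
    layers.Dense(28, activation="relu"),
    layers.Dense(1),                           # single-neuron output: prediction
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```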
1) Pre processing and Post processing stages:
The solar irradiance curve has an underlying "bell" shaped trend. In order to remove this trend from the input data, pre-processing is performed at the input stage. In addition to the data standardisation described in II-B, a difference transform of lag 1 is applied to the input signal after standardisation. The transformed input is fed to the CNN and the predicted signal is obtained. The predicted signal is passed through a post-processing stage to reconstruct the predicted solar irradiance curve. The pre-processing difference transform and post-processing reconstruction equations are given in equation (4).

$X = [x_1, x_2, \ldots, x_n]$
$\tilde{X} = [(x_1 - 0), (x_2 - x_1), \ldots, (x_n - x_{n-1})]$
$\tilde{Y} = [\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_n]$
$Y = [(\tilde{y}_1 + 0), (\tilde{y}_2 + x_1), \ldots, (\tilde{y}_n + x_{n-1})]$    (4)

where,
$X$ = standardised input signal
$\tilde{X}$ = difference transformed input
$\tilde{Y}$ = predicted signal
$Y$ = reconstructed predicted signal
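The lag-1 difference transform and its reconstruction in equation (4) can be sketched as follows (illustrative Python, not the original code).

```python
import numpy as np

def difference(x):
    """Lag-1 difference transform of a standardised series (equation (4))."""
    return np.concatenate(([x[0]], np.diff(x)))   # first element differenced against 0

def reconstruct(y_tilde, x):
    """Undo the transform by adding back the previous observed sample."""
    return np.concatenate(([y_tilde[0]], y_tilde[1:] + x[:-1]))

x = np.array([0.1, 0.4, 0.9, 0.7])                # placeholder standardised samples
x_tilde = difference(x)
assert np.allclose(reconstruct(x_tilde, x), x)    # exact recovery when y_tilde == x_tilde
```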
B. Long-Short Term Memory Neural Network (LSTM)

The LSTM network is a type of Recurrent Neural Network (RNN) used for time series prediction. A major drawback of RNNs is their inability to capture long-term dependencies in a signal, due to memory constraints. The LSTM cell selectively stores trends in its memory and hence ignores repetitive information. The cell state defines which information is stored or discarded. This is controlled by means of three gates: the input gate $i_t$, output gate $O_t$ and forget gate $f_t$. The output of the LSTM network depends on the current input and the cell state [19]. The working mechanism and cell architecture of the LSTM network are shown in Fig. 2 and Fig. 3 respectively.

Fig. 1. CNN architecture (input layer, convolutional layer with filters, average pooling layer, flattening layer, dense layers of 116 and 28 neurons, and output layer)

Fig. 2. Working mechanism of LSTM cells (the cell is unrolled over inputs X(0), X(1), ..., X(t) to produce hidden states h0, h1, ..., ht)

At time $t$, the inputs to the network are the sequence vector $X_t$, the hidden state output $h_{t-1}$ and the cell state $C_{t-1}$. The outputs of the network are the LSTM hidden state $h_t$ and the cell state $C_t$. The forget gate, input gate and output gate are calculated using equations (5), (6) and (7). Here, $i_t$ is the input gate and $O_t$ is the output gate. The forget gate $f_t$ is used to update, maintain or delete the cell state information.

$f_t = \sigma(W_f \times [h_{t-1}, x_t] + b_f)$    (5)
$i_t = \sigma(W_i \times [h_{t-1}, x_t] + b_i)$    (6)
$O_t = \sigma(W_O \times [h_{t-1}, x_t] + b_o)$    (7)

The current candidate cell state $\bar{C}_t$ is calculated by equation (8), and is updated to produce the output cell state $C_t$ as in equation (9). Using the output cell state, the current hidden state $h_t$ is calculated by equation (10).

$\bar{C}_t = \tanh(W_C \times [h_{t-1}, x_t] + b_c)$    (8)
$C_t = f_t \times C_{t-1} + i_t \times \bar{C}_t$    (9)
$h_t = O_t \times \tanh(C_t)$    (10)

$W_f$, $W_i$, $W_O$, $W_C$ and $b_f$, $b_i$, $b_o$, $b_c$ are the weight and bias parameters of each gate. $\sigma$ is the sigmoid activation function.
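For clarity, a single LSTM cell update following equations (5)-(10) can be written directly in NumPy; this is a didactic sketch with randomly initialised weights, not the trained network used in the paper.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell update as in equations (5)-(10)."""
    z = np.concatenate([h_prev, x_t])            # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])           # forget gate, eq. (5)
    i_t = sigmoid(W["i"] @ z + b["i"])           # input gate, eq. (6)
    o_t = sigmoid(W["o"] @ z + b["o"])           # output gate, eq. (7)
    c_bar = np.tanh(W["c"] @ z + b["c"])         # candidate cell state, eq. (8)
    c_t = f_t * c_prev + i_t * c_bar             # cell state update, eq. (9)
    h_t = o_t * np.tanh(c_t)                     # hidden state, eq. (10)
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4                            # univariate input; 4 hidden units (illustrative)
W = {k: rng.normal(size=(n_hidden, n_hidden + n_in)) for k in "fioc"}
b = {k: np.zeros(n_hidden) for k in "fioc"}
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in [0.2, 0.5, -0.1]:                     # a short standardised sequence
    h, c = lstm_step(np.array([x_t]), h, c, W, b)
```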
C. Network Design for Deep-Learning Models

All simulations are run on an Intel Core i7 @ 4.5 GHz computer. The implemented deep-learning networks are designed using the MATLAB deep learning toolbox.

Neural networks, if poorly trained, lead to over-fitting or under-fitting of the training data, resulting in a disparity between training data prediction and actual prediction performance. Similarly, a badly designed neural network architecture could lead to error propagation, high computational cost, or simply be overkill.

Fig. 3. LSTM cell architecture
TABLE I
HYPER-PARAMETER OPTIMIZATION FOR IMPLEMENTED NETWORKS

Network Model | Model Hyper-Parameter Names | Search Space for Optimal Hyper-Parameters
CNN | Optimizer | Adam
CNN | Learning rate (α) | [0.1 0.01 …]
CNN | … | [… 50 150 300]
CNN | Number of Kernels | [3 … 80 150]
CNN | Kernel Size | [3 …]
CNN | Batch Size | [16 32 64 128 …]
CNN | … | [… 100 500]
LSTM | Optimizer | Adam
LSTM | Initial learning rate (α) | [0.1 …]
LSTM | … | [… 100 300]
LSTM | LSTM Layers | [… 64 128 256]
LSTM | Fully Connected Layers | [… 16 32 64 128]
LSTM | Epochs | [50 … 300 500]
Hyper-parameter optimization plays an important role in choosing the optimal neural network architecture and training parameters. Brute-force methods such as grid search, probabilistic methods such as Bayesian optimization, and random searches are widely used. As high computational power was available for training, the grid search algorithm was implemented. Initially, a coarse search was carried out on a large search space as shown in Table I. Then, a fine search was implemented on a smaller search space. As all hyper-parameters were well optimized throughout the smaller search space, the coarse-search hyper-parameters were chosen, as highlighted in Table I.
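The coarse grid search can be pictured as iterating over the Cartesian product of the search space and keeping the configuration with the lowest validation error. The sketch below is illustrative: the search-space values and the validation_error stand-in are assumptions, not the authors' MATLAB routine.

```python
from itertools import product

# Placeholder coarse search space (illustrative values only).
search_space = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [16, 32, 64, 128],
    "epochs": [50, 150, 300],
}

def validation_error(params):
    """Stand-in for training a network with `params` and returning its validation loss."""
    return (params["learning_rate"] - 0.01) ** 2 + params["batch_size"] / 1e4

best_params, best_err = None, float("inf")
for values in product(*search_space.values()):
    params = dict(zip(search_space.keys(), values))
    err = validation_error(params)
    if err < best_err:
        best_params, best_err = params, err

print(best_params, best_err)
```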
D. Modified Auto-Regressive Model (MAR)
In the AR model, the predicted signal value at the next time step is linearly dependent on the observed values at a set number of previous time steps. However, our proposed model does not operate on the standardized irradiance measurements directly, but on ensemble-deducted values, as described in section III-D1. The AR model equation relating the predicted value to the previously observed values is given by equation (11).

$x_{n,pred} = \sum_{k=1}^{m} w_k \times x_{n-k}$    (11)

where,
$m$ = order of the AR model
$x_{n,pred}$ = predicted signal value for the next timestamp
$w_k$ = model weights
$x_{n-k}$ = past signal values
1) Feature Engineering by Ensemble Deduction:
Prior to the prediction, the expected value of each of the $m$ past signal values at its timestamp is deducted from the corresponding irradiance measurement, as shown in equation (12). This ensures that the periodic nature of the days, governed by the bell shaped curve, is unaffected at the time of prediction. The ensemble deduction in a given day to predict the 20th timestamp of the day is illustrated in Fig. 4.

$x_{n-i,ens} = x_{n-i} - E[x_{n-i}]$    (12)

where,
$i = 1, \ldots, m$
$n$ = prediction timestamp
$x_{n-i,ens}$ = ensemble-deducted signal value at $n-i$
$x_{n-i}$ = actual standardized signal value at $n-i$
$E[\cdot]$ = statistical expectation operator
Fig. 4. Ensemble deduction to predict the 20th timestamp (previous signals, timestamps needed for prediction, and the ensemble-deducted signal)
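As a sketch of equation (12), the following Python snippet deducts the per-timestamp ensemble mean from the m samples preceding the prediction timestamp. It assumes the standardised data is arranged as a days × timestamps matrix, which is an assumption made for illustration.

```python
import numpy as np

def ensemble_deduct(history, n, m=4):
    """
    Ensemble-deduct the m samples preceding timestamp n (equation (12)).

    history : (days, T) array of standardised irradiance, last row = current day
    n       : timestamp of the day to be predicted
    m       : model order (number of past samples used)
    """
    ensemble_mean = history[:-1].mean(axis=0)   # E[x] per time-of-day slot
    idx = np.arange(n - m, n)
    return history[-1, idx] - ensemble_mean[idx]

rng = np.random.default_rng(1)
history = rng.normal(size=(30, 144))            # 30 days, 144 ten-minute slots per day
features = ensemble_deduct(history, n=20, m=4)  # features to predict the 20th timestamp
```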
2) Parameter Optimisation:
The order of the model ($m$) depends on the Partial Auto-Correlation Function (PACF) of the given data. The PACF provides the correlation between a fixed time series value $x_n$ and its lagged values $x_{n-\tau}$ relative to the fixed value. The equation to compute the PACF is described in equation (13). Fig. 5 shows a graphical representation of equation (13). As observed, $m = 4$ was chosen as the optimal order.

$R_\tau = E[x_{n-\tau} \cdot x_n]$    (13)

where,
$E[\cdot]$ = statistical expectation operator
$R_\tau$ = correlation function
$\tau$ = lag of the previous timestamp
Fig. 5. Partial Auto-Correlation Function (PACF): correlation coefficient vs. lag (time-steps)
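The model order can be checked with a standard PACF routine; the sketch below uses statsmodels (an external library not referenced in the paper) on a synthetic stand-in series, and the significance rule of thumb is an assumption.

```python
import numpy as np
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(2)
# Synthetic stand-in for the ensemble-deducted irradiance series.
series = rng.normal(size=500)
coeffs = pacf(series, nlags=10)

# Keep the lags whose partial autocorrelation is still significant
# (roughly |coefficient| > 2/sqrt(N)); the paper selects m = 4 from Fig. 5.
threshold = 2 / np.sqrt(len(series))
significant = [lag for lag, c in enumerate(coeffs) if lag > 0 and abs(c) > threshold]
print(significant)
```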
The prediction error $x_{n,pred} - x_{n,real}$ is used to calculate the model parameters. They are calculated by optimisation, where a positive, monotonically increasing error function is minimized. A squared error function, as given by equation (14), exhibits these characteristics. Therefore, the Yule-Walker equation given by equation (15) is used to calculate the model parameters.

$f(e_n) = (x_{n,pred} - x_{n,real})^2$    (14)

where,
$f(e_n)$ = error function
$e_n$ = error at a given time step $n$
$x_{n,pred}$ = predicted value at $n$
$x_{n,real}$ = observed value at $n$

$W = (X^T X)^{-1} X^T Y$    (15)

where,
$W$ = weights matrix
$X$ = design matrix (dependent on order $m$)
$Y$ = output matrix ($X_{real}$)

The design matrix $X$ contains the training examples as its rows, and the features for each example as its columns. The number of columns depends on the order $m$. After optimizing the model parameters, a finite loop is run for each time step of the day, predicting the signal value $x_{n,pred}$ at the next time step. To calculate the predicted solar irradiance, $x_{n,pred}$ is de-standardized.
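A compact sketch of the parameter estimation in equation (15) and a one-step prediction with equation (11) is given below; this is illustrative NumPy code with synthetic data standing in for the ensemble-deducted series, not the authors' implementation.

```python
import numpy as np

def fit_mar(series, m=4):
    """Least-squares weights W = (X^T X)^{-1} X^T Y for an order-m model (equation (15))."""
    # Row n of X holds [x_{n-1}, x_{n-2}, ..., x_{n-m}]; Y holds x_n.
    X = np.column_stack([series[m - k - 1 : len(series) - k - 1] for k in range(m)])
    Y = series[m:]
    return np.linalg.solve(X.T @ X, X.T @ Y)      # solve the normal equations directly

def predict_next(series, w):
    """One-step prediction x_{n,pred} = sum_k w_k * x_{n-k} (equation (11))."""
    m = len(w)
    return w @ series[-1 : -m - 1 : -1]           # most recent m samples, newest first

rng = np.random.default_rng(3)
ens = rng.normal(size=300)                        # stand-in for ensemble-deducted samples
w = fit_mar(ens, m=4)
x_next = predict_next(ens, w)                     # de-standardise afterwards to get irradiance
```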
IV. RESULTS AND DISCUSSION

Irradiance predictions for two randomly chosen successive days for the 10 minute, 30 minute and 1 hour prediction horizons are shown in Fig. 6, Fig. 7 and Fig. 8 respectively. The deep learning models and the MAR model are each designed as one specific model to forecast across all time horizons discussed in the paper. As observed, when the prediction horizon increases, the tendency of the predicted curves to follow sudden changes in irradiance decreases. Therefore, the dependency on the ensemble (bell-shaped) features of the data increases.

Fig. 6. Observed vs. predicted irradiance for the 10 minute prediction horizon, extracted for 1 day from the predicted dataset

Fig. 7. Observed vs. predicted irradiance for the 30 minute prediction horizon, extracted for 2 consecutive days from the predicted dataset

Fig. 8. Observed vs. predicted irradiance for the 1 hour prediction horizon, extracted for 2 consecutive days from the predicted dataset

The performance of the four models is evaluated with respect to the Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) in Table II.

TABLE II
ERROR COMPARISON OF PREDICTION MODELS

Error | Horizon | CNN | AR | LSTM | MAR
RMSE / W m⁻² | 10 min | 113.62 | 115.62 | 114.79 | 110.38
RMSE / W m⁻² | 30 min | 164.63 | 170.05 | 146.50 | 148.25
RMSE / W m⁻² | 1 h | 181.98 | 182.17 | 161.40 | 158.56
MAE / W m⁻² | 10 min | 64.66 | 74.52 | 69.70 | 68.21
MAE / W m⁻² | 30 min | 102.82 | 123.16 | 98.34 | 99.06
MAE / W m⁻² | 1 h | 124.52 | 138.07 | 111.98 | 112.09
MAPE / % | 10 min | 14.56 | 16.02 | 14.71 | 14.20
MAPE / % | 30 min | 21.81 | 23.67 | 19.36 | 19.94
MAPE / % | 1 h | 24.59 | 27.45 | 22.18 | 22.42

It can be observed that the error increases for all models as the prediction horizon increases. However, the performance of the CNN and the conventional AR model deteriorates faster than that of the other two. It is noteworthy that the MAR model, despite being a simple implementation with pre-processing, delivers a robust performance as the time horizon changes while matching the performance of the deep learning LSTM model in all aspects: both error levels and increased time horizons.
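The three error metrics reported in Table II can be computed as in the following sketch (the observed/predicted arrays are placeholders, and the zero-division guard in MAPE is an implementation choice, not from the paper).

```python
import numpy as np

def rmse(obs, pred):
    return np.sqrt(np.mean((obs - pred) ** 2))

def mae(obs, pred):
    return np.mean(np.abs(obs - pred))

def mape(obs, pred, eps=1e-6):
    # eps guards against division by zero at zero-irradiance samples (assumed handling)
    return 100.0 * np.mean(np.abs((obs - pred) / np.maximum(np.abs(obs), eps)))

obs = np.array([520.0, 610.0, 480.0, 350.0])    # placeholder observed irradiance (W/m^2)
pred = np.array([500.0, 640.0, 455.0, 380.0])   # placeholder predictions
print(rmse(obs, pred), mae(obs, pred), mape(obs, pred))
```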
V. CONCLUSION
In this paper we propose three models for solar prediction: a Modified Auto Regressive (MAR) model and two deep learning models based on CNN and LSTM neural networks respectively. The performance of the models is quantified by the error metrics RMSE, MAE and MAPE, which affirm that the MAR model fits best for the case of very short term prediction of solar irradiance.

In a setting such as a tropical environment, the variability of irradiance at a given timestamp is high, reducing the correlation between consecutive samples. Hence, deep neural networks tend to mostly capture the bell-shaped nature of solar irradiance, as intra-day variations are highly uncorrelated. By means of the ensemble mean curve deduction, the MAR, having the least computational cost, is capable of predicting solar irradiance with a performance similar to the LSTM, the state-of-the-art prediction scheme, across all tested prediction horizons.

Existing prediction models use multi-sensory data such as temperature, humidity, cloud cover and irradiance. The proposed MAR uses a single sensor measurement as the input for the prediction while sufficing in performance for most use cases, with a MAPE of less than 15% for 10 minute prediction and less than 20% for 30 minute prediction. This enables easy acquisition of data, which facilitates an easily deployable forecast system. Thus, taking into account the aforementioned conditions, MAR is chosen as the optimal solar irradiance prediction model.
REFERENCES

[1] D. Abbot, "Keeping the energy debate clean: How do we supply the world's energy needs?" Proceedings of the IEEE, vol. 98, no. 1, pp. 42–66, Jan 2010.
[2] UNDP, "UNDP strategic plan, 2018–2021. Special session (Executive Board of the United Nations Development Programme, the United Nations Population Fund and the United Nations Office for Project Services)," New York: United Nations, 2017.
[3] A. Kaur, L. Nonnenmacher, H. Pedro, and C. Coimbra, "Benefits of solar forecasting for energy imbalance markets," Renewable Energy, vol. 86, pp. 819–830, Feb 2016.
[4] M. Zieher and M. Lange, "Variable Renewable Energy Forecasting - Integration into Electricity Grid and Markets - A Best Practice Guide," Federal Ministry for Economic Cooperation and Development, Germany, Tech. Rep., 2015.
[5] Y. Wang, Y. Shen, S. Mao, X. Chen, and H. Zou, "LASSO and LSTM integrated temporal model for short-term solar intensity forecasting," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2933–2944, April 2019.
[6] W. Lee, K. Kim, J. Park, J. Kim, and Y. Kim, "Forecasting solar power using Long-Short Term Memory and Convolutional Neural Networks," IEEE Access, vol. 6, pp. 73068–73080, 2018.
[7] Q. Huang and S. Wei, "Improved Quantile Convolutional Neural Network with two-stage training for daily-ahead probabilistic forecasting of photovoltaic power," Energy Conversion and Management, vol. 220, p. 113085, Sep 2020.
[8] Dong et al., "A novel Convolutional Neural Network framework based solar irradiance prediction method," International Journal of Electrical Power & Energy Systems, vol. 114, p. 105411, Jan 2020.
[9] D. M. L. H. Dissawa et al., "Cross-correlation based cloud motion estimation for short-term solar irradiation predictions," 2017, pp. 1–6.
[10] M. Cervantes, H. Krishnaswami, W. Richardson, and R. Vega, "Utilization of low cost, sky-imaging technology for irradiance forecasting of distributed solar generation," April 2016, pp. 142–146.
[11] A. Moreno-Munoz, J. J. G. de la Rosa, R. Posadillo, and F. Bellido, "Very short term forecasting of solar radiation," in IEEE Photovoltaic Specialists Conference, San Diego, CA, USA, May 2009.
[12] M. Rafi, M. T. Wahab, M. Bilal Khan, and H. Raza, "ATM cash prediction using time series approach," 2020, pp. 1–6.
[13] H. Matsila and P. Bokoro, "Load forecasting using statistical time series model in a medium voltage distribution network," in IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society, 2018, pp. 4974–4979.
[14] N. Chen-xu and W. Jie-sheng, "Auto Regressive Moving Average (ARMA) prediction method of bank cash flow time series," 2015, pp. 4928–4933.
[15] W. Ji, C. Chan, J. Loh, F. Choo, and L. Chen, "Solar radiation prediction using statistical approaches," 2009, pp. 1–5.
[16] H. Nazaripouya, B. Wang, Y. Wang, P. Chu, H. R. Pota, and R. Gadh, "Univariate time series prediction of solar power using a hybrid wavelet-ARMA-NARX prediction method," 2016, pp. 1–5.
[17] S. Oehmcke, O. Zielinski, and O. Kramer, "Input quality aware Convolutional LSTM networks for virtual marine sensors," Neurocomputing, Nov 2017.
[18] L. Wen, K. Zhou, S. Yang, and X. Lu, "Optimal load dispatch of community microgrid with deep learning based solar power and load forecasting," Energy, vol. 171, Mar 2019.
[19] H. Zhou, Y. Zhang, L. Yang, Q. Liu, K. Yan, and Y. Du, "Short-term photovoltaic power forecasting based on Long Short Term Memory neural network and attention mechanism."