PSO based Neural Networks vs. Traditional Statistical Models for Seasonal Time Series Forecasting
Ratnadip Adhikari
Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India
R. K. Agrawal
Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India
Laxmi Kant
Department of Mathematics, IIT Roorkee, Roorkee, India
Abstract —Seasonality is a distinctive characteristic which is often observed in many practical time series. Artificial Neural Networks (ANNs) are a class of promising models for efficiently recognizing and forecasting seasonal patterns. In this paper, the Particle Swarm Optimization (PSO) approach is used to enhance the forecasting strengths of feedforward ANN (FANN) as well as Elman ANN (EANN) models for seasonal data. Three widely popular versions of the basic PSO algorithm, viz. Trelea-I, Trelea-II and Clerc-Type1 are considered here. The empirical analysis is conducted on three real-world seasonal time series. Results clearly show that each version of the PSO algorithm achieves notably better forecasting accuracies than the standard Backpropagation (BP) training method for both FANN and EANN models. The neural network forecasting results are also compared with those from the three traditional statistical models, viz. Seasonal Autoregressive Integrated Moving Average (SARIMA), Holt-Winters (HW) and Support Vector Machine (SVM). The comparison demonstrates that both PSO and BP based neural networks outperform SARIMA, HW and SVM models for all three time series datasets. The forecasting performances of ANNs are further improved through combining the outputs from the three PSO based models.
Keywords—time series forecasting; seasonality; Box-Jenkins models; ANN; Elman ANN; particle swarm optimization

I. INTRODUCTION
Trends and seasonal effects are frequently observed in time series, especially those pertaining to economics, business, finance and natural phenomena. Seasonality is a regular, repetitive fluctuation which occurs within a year, often on a quarterly or monthly basis. Seasonal patterns introduce additional intricacies into a time series and thereby make appropriate modeling and forecasting considerably more difficult [1]. To date, there are a few recognized approaches to systematically analyze and forecast seasonal data. Popular older methods include the moving average [2], exponential smoothing [2] and Holt-Winters (HW) [3]. Over the past four decades, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model [4] has occupied the leading position among the statistical methods for forecasting seasonal data. More recently, the Support Vector Machine (SVM) [3, 5] has also found notable applications in this area.

Most of these traditional methods were developed decades ago and they adopt a rather fixed model for a seasonal time series [3]. This restricts their flexibility and makes them inadequate to cope with frequent dynamic changes in seasonal patterns. Furthermore, the traditional models require the seasonal effect to be removed from a time series, through techniques such as seasonal differencing and deseasonalization, before future forecasts are made. Such techniques often distort fundamental properties of the time series and so they have been criticized by a number of researchers [1, 6, 7].

It is therefore not surprising that Artificial Neural Networks (ANNs) have gained wide popularity in modeling seasonal data. ANNs have already become a research hotspot in time series analysis and forecasting due to several attractive characteristics, a comprehensive study of which can be found in the work of Zhang et al. [8]. A number of researchers, such as Alon et al. [9], Tseng et al. [10] and Kihoro et al. [11], observed that ANNs are well able to recognize and forecast seasonal fluctuations. However, a comparable number of researchers disagreed with these findings. For example, according to Faraway and Chatfield [12], Hill et al. [13] and Zhang and Qi [14], ANNs are not appropriate for seasonal time series forecasting. Despite these earlier contradictory conclusions, many recent studies strongly support the view that ANNs are superior to the traditional statistical models in forecasting seasonal data [3, 15, 16].

Over the past few years, evolutionary computing techniques have attracted considerable attention from artificial intelligence researchers. One such technique is Particle Swarm Optimization (PSO), developed by Kennedy and Eberhart [17]. PSO provides a mathematical formulation of the flocking behavior of birds in order to iteratively optimize nonlinear multidimensional functions. It has high search power in the state space, a fast convergence rate and the ability to provide a global optimal solution [18, 19]. Due to these salient properties, PSO has been effectively used by some researchers as an alternative to the Backpropagation (BP) algorithm for training ANN models [19 ,

II. TRADITIONAL SEASONALITY FORECASTING METHODS
The identification of the nature and period of seasonal variation is crucial for appropriate modeling and forecasting. In many cases, the correlogram of the time series provides a fairly good idea about its seasonal structure [3, 4]. The correlogram of a time series $\{y_1, y_2, \ldots, y_N\}$ with mean $\bar{y}$ is a plot of the sample autocorrelation coefficients $r_k$ against the successive time lags $k$, where:

$$r_k = \frac{\sum_{t=1}^{N-k} (y_t - \bar{y})(y_{t+k} - \bar{y})}{\sum_{t=1}^{N} (y_t - \bar{y})^2}, \quad \forall\, k = 0, 1, \ldots, N-1 \tag{1}$$

The correlogram of a seasonal time series shows the same kind of oscillations at the seasonal lags and so can be used to detect the seasonal patterns manually. Other, more robust automatic seasonality detection tools are also used. Recently, Yan [23] employed a simple rule of thumb for identifying seasonality. According to this rule, a time series of length $N$ has a seasonality of period $s$ only if

$$r_s > \frac{2}{\sqrt{N}} \ \ (N \le 60) \quad \text{and} \quad r_s,\ r_{2s} > \frac{2}{\sqrt{N}} \ \ \text{for } N > 60 \tag{2}$$

Through this criterion, Yan [23] could correctly identify 45 out of all 54 seasonal series and 53 out of all 57 nonseasonal series of the NN3 competition data. Next, we discuss the three well recognized traditional statistical models for seasonal time series forecasting.

A. Box-Jenkins Model
The most well-known statistical technique for seasonal time series forecasting is the Seasonal Autoregressive Integrated Moving Average (SARIMA) model, which is also commonly known as the Box-Jenkins model [4, 10, 16]. Mathematically, a SARIMA$(p, d, q) \times (P, D, Q)_s$ model is given by:

$$\phi_p(B)\,\Phi_P(B^s)\,W_t = \theta_q(B)\,\Theta_Q(B^s)\,Z_t \tag{3}$$

where $s$ is the period of seasonality; $B$ is the backshift operator, defined as $B y_t = y_{t-1}$; $\phi_p$, $\Phi_P$, $\theta_q$, $\Theta_Q$ are the lag polynomials in $B$ of orders $p$, $P$, $q$ and $Q$ respectively; $Z_t$ is a series of random errors and $W_t$ is the stationary, nonseasonal series which is obtained after the ordinary and seasonal differencing processes, i.e.

$$W_t = (1 - B)^d (1 - B^s)^D y_t \tag{4}$$

The SARIMA model transforms a seasonal time series to a stationary nonseasonal one by applying ordinary and seasonal differencing to the series [4, 16]. The most appropriate SARIMA model for a particular forecasting problem is usually determined through the well-known Box-Jenkins three step iterative model building methodology [4].

B. Holt-Winters (HW) Model
The Holt-Winters (HW) model belongs to the family of exponential smoothing techniques. The multiplicative HW model iteratively updates the local mean level ($L_t$), trend ($T_t$) and seasonal index ($I_t$) of a time series as follows [3, 24]:

$$\begin{aligned}
L_t &= \alpha \left( y_t / I_{t-s} \right) + (1 - \alpha)(L_{t-1} + T_{t-1}) \\
T_t &= \gamma \left( L_t - L_{t-1} \right) + (1 - \gamma)\, T_{t-1} \\
I_t &= \delta \left( y_t / L_t \right) + (1 - \delta)\, I_{t-s}
\end{aligned} \qquad \forall\, t = 1, 2, \ldots, N \tag{5}$$

where $\alpha$, $\gamma$ and $\delta$ are the smoothing parameters and $s$ is the seasonal period. Using the multiplicative HW model, the $k$-step ahead forecast of $y(t)$ is given by:

$$\hat{y}(t) = (L_t + k T_t)\, I_{t-s+k}, \qquad \forall\, k = 1, 2, \ldots, s \tag{6}$$

There are also analogous formulae for the additive case. Chatfield and Yar [24] presented a detailed study of the fitting of the HW model, and in this paper we follow their guidelines and recommendations.

C. Support Vector Machine (SVM) Model
During the past few years, SVMs [25] have found notable applications in the domain of time series modeling and forecasting. They are based on the Structural Risk Minimization (SRM) principle, with the objective of finding a linear decision rule with good generalization ability. SVM maps the input space into a higher dimensional feature space by using kernel functions [3, 5, 25]. A time series forecasting problem with $N$ training pairs $\{\mathbf{x}_i, y_i\}_{i=1}^{N}$, $\mathbf{x}_i \in \mathbb{R}^n$, $y_i \in \mathbb{R}$, falls in the category of Support Vector Regression (SVR), whose aim is to find a maximum margin hyperplane that approximates the real-valued outputs. Using Vapnik's $\varepsilon$-insensitive loss function [3, 25], the SVR is converted to a Quadratic Programming Problem (QPP) in order to minimize the empirical risk:

$$J(\mathbf{w}, \xi_i, \xi_i^*) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \tag{7}$$

Here, $C$ is the positive regularization constant which acts as a penalty for misfit, $\xi_i, \xi_i^*$ are the slack variables and $\mathbf{w}$ is the weight vector. A solution of the QPP yields the optimal decision hyperplane as follows:

$$y(\mathbf{x}) = \sum_{i=1}^{N_s} (\alpha_i - \alpha_i^*)\, K(\mathbf{x}, \mathbf{x}_i) + b_{\text{opt}} \tag{8}$$

where $N_s$ is the number of support vectors, $\alpha_i, \alpha_i^*$ are the Lagrange multipliers, $b_{\text{opt}}$ is the optimal bias and $K(\mathbf{x}, \mathbf{x}_i)$ is the kernel function. In this paper, we use the Radial Basis Function (RBF), defined as $K(\mathbf{x}, \mathbf{y}) = \exp(-\|\mathbf{x} - \mathbf{y}\|^2 / 2\sigma^2)$, as the SVM kernel. The optimal SVM parameters, i.e. $C$ and $\sigma$, are selected through the usual grid search technique [3, 5].

III. ANN APPROACH FOR TIME SERIES FORECASTING
ANNs are the most widely used computational intelligence models for time series analysis and forecasting. They differ from the traditional statistical forecasting methods due to their data-driven and self-adaptive nature [2, 3, 8]. ANNs can truly be referred to as model-free structures because they do not need any prior knowledge about the intrinsic data generating process. The appropriate network structure is determined solely on the basis of the available input and target patterns. ANNs are also favored due to their distinctive ability to model nonlinear relationships with remarkable accuracy. Different types of neural network structures have been proposed in the literature, and among them the Feedforward ANN (FANN) is the most popular in forecasting applications [2, 8]. Recurrent neural networks have also recently been used in time series forecasting problems, although to a limited extent [2, 28]. In this paper, we consider the recurrent ANN model of Elman type (EANN) to forecast seasonal data.

A. FANN model
A typical FANN consists of many processing units, or nodes, which are distributed in multiple layers, viz. an input layer, one or more hidden layers and an output layer. The nodes in each layer are connected to those in the immediate next layer through acyclic feedforward connections. A single layer of hidden units is enough to provide the desired accuracy in most forecasting situations [8]. In a fully connected FANN model with $p$ input, $h$ hidden and a single output node, the relationship between the inputs $y_{t-i}$ ($i = 1, 2, \ldots, p$) and the output $y_t$ is given by:

$$y_t = G\left(\alpha_0 + \sum_{j=1}^{h} \alpha_j\, F\left(\beta_{0j} + \sum_{i=1}^{p} \beta_{ij}\, y_{t-i}\right)\right) \tag{9}$$

where $\alpha_j$, $\beta_{ij}$ ($i = 1, 2, \ldots, p$; $j = 1, 2, \ldots, h$) are the connection weights, $\alpha_0$, $\beta_{0j}$ are the bias terms and $F$, $G$ are the network activation functions. The common choices for the activation functions are the logistic function for the hidden layer and the identity function for the output layer [2, 8]. In order to ensure a nonlinear input-output mapping, the parameter $h$ is taken to be non-zero. The model presented in Eq. 9 is commonly referred to as a $p \times h \times 1$ FANN model, the architecture of which is depicted in Fig. 1.

Fig. 1. A $p \times h \times 1$ FANN model with the slant arrows as the network weights

The associated network weights and biases in an ANN model are optimized through the training process. ANN training is an unconstrained nonlinear minimization problem which iteratively updates the network parameters with the goal of minimizing the overall Sum Squared Error (SSE) between the desired and actual output values. The best known training algorithm is the standard backpropagation (BP). It modifies the weights and biases in the direction of the fastest decrease of the error function, i.e. along the negative of the gradient [8].
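As a minimal sketch of Eq. 9 and of the SSE training objective, the forward pass of a $p \times h \times 1$ FANN can be written in NumPy as follows. The function names (`fann_forward`, `sse`), the array layout of the weights and the random example weights are illustrative assumptions, not the implementation used in this paper; $F$ is taken as the logistic function and $G$ as the identity.

```python
import numpy as np

def logistic(z):
    """Logistic activation F applied at the hidden nodes."""
    return 1.0 / (1.0 + np.exp(-z))

def fann_forward(lags, alpha0, alpha, beta0, beta):
    """Output of a p x h x 1 FANN for one input pattern (Eq. 9).

    lags   : shape (p,), the inputs (y_{t-1}, ..., y_{t-p})
    alpha0 : scalar output bias
    alpha  : shape (h,), hidden-to-output weights
    beta0  : shape (h,), hidden biases
    beta   : shape (p, h), input-to-hidden weights
    """
    hidden = logistic(beta0 + lags @ beta)  # F at each hidden node
    return alpha0 + hidden @ alpha          # G is the identity

def sse(series, p, alpha0, alpha, beta0, beta):
    """Sum squared error of one-step predictions over a series."""
    total = 0.0
    for t in range(p, len(series)):
        lags = series[t - p:t][::-1]        # y_{t-1}, ..., y_{t-p}
        pred = fann_forward(lags, alpha0, alpha, beta0, beta)
        total += (series[t] - pred) ** 2
    return float(total)

# Hypothetical usage: a 2 x 3 x 1 network with random weights
rng = np.random.default_rng(0)
series = np.array([0.1, 0.4, 0.3, 0.6, 0.5, 0.8])
error = sse(series, 2, 0.0, rng.normal(size=3),
            rng.normal(size=3), rng.normal(size=(2, 3)))
```

Any training algorithm, whether BP or a population based method, then amounts to searching for the weights and biases that make this SSE as small as possible.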
In spite of its immense popularity, the standard BP algorithm has some crucial shortcomings, which include a fairly large number of computations, a slow rate of convergence, a complex error surface and the risk of getting trapped in local minima [8, 19, 20]. Although many modifications have been developed in the literature, none of them overcomes all the drawbacks of the BP algorithm; for example, none can currently guarantee the global optimal solution [8]. These issues led to the use of evolutionary computation methods for ANN training.

B. EANN model
Together with the most common feedforward neural networks, other types of network architectures have also been investigated for time series modeling and forecasting. Prominent among them is the Elman ANN (EANN) model. An EANN has a recurrent neural network structure, with an additional context layer and feedback connections [26]. At each step, the outputs from the hidden layer are fed back to the context layer, which enables the network to perform dynamic, time-varying mappings of the associated nodes. There seems to be no recognized study which firmly establishes the superiority of either FANN or EANN for time series forecasting. However, EANNs are often more robust in adequately modeling the temporal relationships among the input data [2, 26].

C. Seasonal Time Series Forecasting through ANNs
The traditional statistical methods for seasonal time series forecasting suffer from various drawbacks. These include the assumption of linearity, a fixed model form, the removal of the seasonal effect through deseasonalization or seasonal differencing, etc. ANNs can overcome many of these limitations and so have found numerous effective applications in modeling and forecasting of seasonal time series. Faraway and Chatfield [12] studied several ANN structures for appropriately modeling the well-known monthly airline data. Alon et al. [9] compared the performance of ANNs with four traditional methods for predicting US aggregate retail sales, which have strong trends and seasonal fluctuations. Zhang and Qi [14] comprehensively investigated the effects of various data preprocessing techniques on neural network forecasting of seasonal time series. More recently, Zhang and Kline [1] studied the forecasting capabilities of 48 different neural network models for a large set of 756 quarterly time series, collected from the M3 competition. Combinations of ANNs and other methods were also examined for modeling seasonal data. For example, Tseng et al. [10] proposed the SARIMABP model, which combines SARIMA and BP-ANN models. Their findings show that the SARIMABP model outperformed the SARIMA, the BP-ANN with deseasonalized data and the BP-ANN with differenced data [10]. Despite the great potential of ANNs in forecasting seasonal data, the earlier studies yielded mixed results. While some find that ANNs can directly model seasonal effects without removing them, others hold just the opposite view [9, 10, 12 –

Fig. 2. The structure of a typical $s \times h \times s$ SANN model

D. Designing of the Appropriate ANN Model
The successful forecasting through an ANN largely depends on the appropriate model designing which is however not a trivial task [8, 11, 12]. As mentioned earlier, the prime benefit of using SANN is that with this model, the architecture selection actually boils down to the selection of the optimal number of hidden nodes only. In this paper, the number of hidden nodes is selected through the widely popular
Bayesian Information Criterion (BIC) [11, 12]. For the $s \times h \times s$ SANN model, the BIC is mathematically given by:

$$\text{BIC} = N_{s,h} + N_{s,h} \ln(n) + n \ln\left(\frac{S(\mathbf{W})}{n}\right) \tag{10}$$

where $N_{s,h} = s + h(2s + 1)$ is the total number of network parameters, $n = N - s$ is the number of effective observations, $N$ being the size of the training set, $\mathbf{W}$ is the space of all connection weights and biases and $S(\mathbf{W})$ is the network misfit function, which is commonly taken as the SSE. The BIC effectively controls the network size by penalizing each increase in the number of network parameters [12]. Out of several feedforward SANN structures, the one which minimizes the BIC is chosen as the optimal one. It should however be noted that the use of BIC is popular in feedforward neural networks only and there seems to be no similar criterion for EANN models. It is, however, well-known that EANNs require more hidden nodes than their feedforward counterparts [2]. We use $2s$ hidden nodes for each EANN model, $s$ being the seasonal period. The choice of the appropriate training algorithm is another crucial point in ANN model designing. In this paper, we use the Levenberg-Marquardt (LM) [27] and traingdx [28] as the BP training algorithms for feedforward and recurrent ANN models respectively.

IV. PSO ALGORITHM FOR TRAINING NEURAL NETWORKS
The celebrated PSO algorithm is based on the evolutionary computation meta-heuristic [17, 20]. In this paper, we use PSO for training feedforward as well as recurrent SANN models. Let us consider the $s \times h \times s$ SANN structure, $s$ and $h$ respectively being the period of seasonality and the number of hidden nodes. The PSO algorithm starts with a population (also called a swarm), consisting of a predefined number of particles which are initialized with random positions and velocities. If there are $N_{\text{swarm}}$ swarm particles, then each of them is of dimension $D = s + h(2s + 1)$, which is equal to the total number of network parameters. The quality of each particle's position is assessed by evaluating a fitness function for it. The particles are moved through the search space on the basis of two best positions, viz. the personal and the global best. The personal best position of a particle is the position with the best fitness achieved by it so far, whereas the global best position is the position with the best fitness achieved so far across the whole swarm. Then, the velocity and position of the $i$th particle at the $d$th dimension are updated as:

$$v_{id}(t+1) = a\,v_{id}(t) + b_1 r_1 \left(p_{id} - x_{id}(t)\right) + b_2 r_2 \left(p_{gd} - x_{id}(t)\right) \tag{11}$$

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \tag{12}$$

Here, $x_{id}$ and $v_{id}$ are respectively the position and velocity of the $i$th particle at the $d$th dimension; $p_{id}$ and $p_{gd}$ are respectively the personal and global best positions achieved so far at the $d$th dimension; $b_1$ and $b_2$ are the acceleration coefficients; $r_1$ and $r_2$ are two uniform random variables in the [0, 1] interval and $a$ is the inertia weight. The updating process presented through Eq. 11 and Eq. 12 continues until some predefined stopping criterion, e.g. the maximum number of iterations or the maximum increase in the validation error, is attained [19, 20].

[Fig. 2 diagram: input layer with $s$ nodes ($y_t$, $y_{t-1}$, …) and a bias node; hidden layer with $h$ nodes and a bias node; output layer with $s$ nodes ($y_{t+1}$, $y_{t+2}$, …, $y_{t+s}$)]
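The update rules in Eq. 11 and Eq. 12 can be sketched as a small NumPy routine that minimizes an arbitrary fitness function. This is only an illustration under stated assumptions: the function name `pso_minimize`, the values of $a$, $b_1$ and $b_2$, the initialization ranges, the iteration count, and the choice of drawing $r_1$ and $r_2$ independently for every particle and dimension are all hypothetical and are not taken from the PSO toolbox used in this paper. For ANN training, the fitness would be the SSE of the network whose weights and biases are given by the particle position.

```python
import numpy as np

def pso_minimize(fitness, dim, n_particles=24, n_iter=200,
                 a=0.7, b1=1.5, b2=1.5, seed=0):
    """Basic PSO (Eq. 11 and Eq. 12) minimizing `fitness` over R^dim."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n_particles, dim))  # positions
    v = rng.uniform(-0.1, 0.1, size=(n_particles, dim))  # velocities

    p_best = x.copy()                                    # personal bests
    p_best_fit = np.array([fitness(xi) for xi in x])
    g_idx = int(np.argmin(p_best_fit))
    g_best = p_best[g_idx].copy()                        # global best
    g_best_fit = p_best_fit[g_idx]

    for _ in range(n_iter):
        r1 = rng.uniform(size=(n_particles, dim))
        r2 = rng.uniform(size=(n_particles, dim))
        # Eq. 11: inertia + pull towards personal and global bests
        v = a * v + b1 * r1 * (p_best - x) + b2 * r2 * (g_best - x)
        # Eq. 12: move each particle with its new velocity
        x = x + v

        fit = np.array([fitness(xi) for xi in x])
        improved = fit < p_best_fit
        p_best[improved] = x[improved]
        p_best_fit[improved] = fit[improved]

        g_idx = int(np.argmin(p_best_fit))
        if p_best_fit[g_idx] < g_best_fit:
            g_best = p_best[g_idx].copy()
            g_best_fit = p_best_fit[g_idx]

    return g_best, float(g_best_fit)

# Hypothetical usage: minimize a simple quadratic test function
best, best_fit = pso_minimize(lambda z: float(np.sum(z ** 2)), dim=3)
```

The sketch stops after a fixed number of iterations; a validation-error based stopping rule would need a second, held-out fitness evaluation inside the loop.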
In order to improve the performance of the PSO algorithm in practical problems, several variants of it have been developed in the literature. Among them, the versions of Trelea [21] and Clerc [22] are used in this paper.

A. PSO Trelea-I and PSO Trelea-II
Trelea [21] proposed an improved deterministic version of the basic PSO algorithm. The associated formulae for velocity and position updating are given by:

$$v_{id}(t+1) = a\,v_{id}(t) + b \left(p_d - x_{id}(t)\right) \tag{13}$$

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \tag{14}$$

where

$$b = \frac{b_1 + b_2}{2}, \quad r_1 = r_2 = \frac{1}{2}, \quad p_d = \frac{b_1}{b_1 + b_2}\, p_{id} + \frac{b_2}{b_1 + b_2}\, p_{gd}$$

After extensive analysis and simulation experiments, Trelea recommended two sets of values for the parameters $a$ and $b$, which respectively correspond to PSO Trelea-I ($a = 0.6$, $b = 1.7$) and PSO Trelea-II ($a = 0.729$, $b = 1.494$).

B. PSO with Clerc-Type 1 Constriction
In order to constrain as well as control the velocities of the swarm particles, Clerc and Kennedy [22] suggested the use of a constriction factor in the basic PSO algorithm. The constriction factor is given by:

$$\chi = \begin{cases} \dfrac{2\kappa}{\left|\phi - 2 + \sqrt{\phi^2 - 4\phi}\right|}, & \text{if } \phi \ge 4 \\[2ex] \sqrt{\kappa}, & \text{if } 0 < \phi < 4 \end{cases} \tag{15}$$

where $0 < \kappa < 1$ and $\phi = b_1 + b_2$. The velocity calculated through Eq. 11 is multiplied at each step by the constriction factor. This formulation is known as Clerc-Type1 PSO. The values $\phi = 4.1$ and $\kappa = 1$ are often found suitable in many practical applications.

V. EMPIRICAL RESULTS AND DISCUSSIONS
Three time series with dominant seasonal patterns are used in our experiment. These are:
(1) Airline: contains the number of monthly international airline passengers (in thousands) from January 1949 to December 1960,
(2) Red wine: contains the monthly sales of red wine (in thousands of liters) in Australia from January 1980 to December 1995,
(3) Industry: contains quarterly industrial observations, starting from the first quarter of 1977 and ending at the last quarter of 1992.
The Airline and Red wine series are collected from the online Time Series Data Library (TSDL) [29], whereas the Industry series is the 219th quarterly time series, having ID N 863, of the M3 forecasting competition [1]. The description of these series is presented in Table I and their respective time plots are shown in Fig. 3.

TABLE I. DESCRIPTION OF THE THREE SEASONAL TIME SERIES
Dataset    Total size   Training size (N)   s    r_s     r_2s    2/√N
Airline    144          132                 12   0.748   0.514   0.174
Red wine   187          168                 12   0.777   0.658   0.154
Industry   64           48                  4    0.888   0.775   0.289

[Figure: three time plots. Airline (y-axis: number of passengers, in thousands); Red wine (y-axis: red wine sales, in thousands of liters); Industry (y-axis: quarterly observations)]

Fig. 3. Time plots of the three seasonal time series

Each time series shows strong seasonal fluctuations, which is evident from the time plots of Fig. 3 as well as from the rule of thumb discussed in Section II. The rule of thumb for each time series can be verified from Table I. We fit the SARIMA, ANN and SVM models through the MATLAB software, whereas the HW models are fitted through the R environment [30]. The neural network toolbox [28] is used for BP training and the PSO toolbox of Birge [31] is used for implementing the three variants of the PSO algorithm. The observations of each time series are normalized to lie in the interval [0, 1] as follows:

$$y_i^{(\text{new})} = \frac{y_i - y^{(\min)}}{y^{(\max)} - y^{(\min)}}, \qquad \forall\, i = 1, 2, \ldots, N \tag{16}$$

where $\mathbf{Y}_{\text{train}} = [y_1, y_2, \ldots, y_N]^{\mathrm{T}}$ is the training dataset and $\mathbf{Y}^{(\text{new})} = [y_1^{(\text{new})}, y_2^{(\text{new})}, \ldots, y_N^{(\text{new})}]^{\mathrm{T}}$ is the normalized dataset, $y^{(\min)}$ and $y^{(\max)}$ respectively being the minimum and maximum values of the dataset $\mathbf{Y}_{\text{train}}$. After the forecasting experiment, all the normalized observations are transformed back to their original values.

The SARIMA$(0,1,1) \times (0,1,1)_s$ is found to be the suitable Box-Jenkins model for the three time series. The feedforward SANN structure for the Airline, Red wine and Industry series is determined to be 12×1×12, 12×2×12 and 4×3×4 respectively. It is observed that increasing the PSO swarm size beyond the range 24 to 30 often deteriorates the performance [20]. As such, the swarm size is kept to the minimum, i.e. 24, for all time series. The forecasting accuracies are evaluated through two well-known error measures: Mean Absolute Error (MAE) and Mean Squared Error (MSE) [16, 20]. The obtained forecasting results are presented in Tables II, III and IV. In particular, the results for the Red wine and Industry datasets are given in transformed scale (original MAE=MAE×10 and original MSE=MSE×10 ).

TABLE II. FORECASTING RESULTS OF THE THREE STATISTICAL MODELS
Method        Airline            Red wine          Industry
              MAE      MSE       MAE     MSE       MAE     MSE
SARIMA        17.08    411.75    2.42    9.01      1.53    4.84
HW            10.48    254.44    2.40    9.60      1.80    5.49
SVM           10.85    176.89    3.00    12.85     1.96    5.53

TABLE III. FORECASTING RESULTS OF THE SFANN MODELS

Training      Airline            Red wine          Industry
algorithm     MAE      MSE       MAE     MSE       MAE     MSE
BP            10.41    175.69    1.73    4.79      1.18    2.17
PSO-Trelea1   9.12     150.41    1.47    3.62      1.05    1.65
PSO-Trelea2   9.98     146.18    1.71    4.29      0.90    1.59
PSO-Clerc     9.50     142.12    1.44    3.57      1.03    1.72
PSO-Average   9.34     133.70    1.33    3.06      0.92    1.44
PSO-Median    9.16     138.34    1.30    2.98      0.93    1.49

TABLE IV. FORECASTING RESULTS OF THE SEANN MODELS

Training      Airline            Red wine          Industry
algorithm     MAE      MSE       MAE     MSE       MAE     MSE
BP            10.07    171.24    2.13    6.97      1.48    2.81
PSO-Trelea1   9.85     161.76    1.94    5.57      0.95    1.40
PSO-Trelea2   9.50     143.35    1.96    4.69      1.05    1.79
PSO-Clerc     9.84     161.96    2.02    5.93      1.01    1.65
PSO-Average   9.45     147.29    1.53    3.49      0.96    1.48
PSO-Median    9.50     144.90    1.84    4.76      0.92    1.38
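The PSO-Average and PSO-Median rows are obtained by combining, point by point, the forecasts of the three PSO based models. A minimal sketch of this combination and of the two error measures is given below. The helper names (`mae`, `mse`, `combine`) are hypothetical and the numbers are made-up placeholders chosen only to make the arithmetic easy to follow; they are not values from the three datasets.

```python
import numpy as np

def mae(actual, forecast):
    """Mean Absolute Error between two equally long sequences."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(forecast))))

def mse(actual, forecast):
    """Mean Squared Error between two equally long sequences."""
    return float(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2))

def combine(forecasts, how="average"):
    """Combine several forecast vectors point by point."""
    stacked = np.vstack(forecasts)          # one row per model
    if how == "average":
        return stacked.mean(axis=0)
    if how == "median":
        return np.median(stacked, axis=0)
    raise ValueError("how must be 'average' or 'median'")

# Placeholder data: three model forecasts for three test observations
actual = np.array([10.0, 12.0, 14.0])
trelea1 = np.array([9.0, 12.5, 15.0])
trelea2 = np.array([11.0, 11.0, 14.5])
clerc = np.array([10.5, 12.5, 13.0])

avg = combine([trelea1, trelea2, clerc], "average")
med = combine([trelea1, trelea2, clerc], "median")

for name, f in [("average", avg), ("median", med)]:
    print(name, mae(actual, f), mse(actual, f))
```

Because individual model errors can partly cancel, a combined forecast may score better than any single model, although this is not guaranteed for every series or error measure.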
A careful study of Tables II, III and IV reveals the following facts:
• The forecast errors obtained through the neural network models are notably less than those obtained through the traditional statistical models for all three seasonal time series.
• Each of the three PSO variants achieves remarkably better forecasting accuracies than the standard BP algorithm for both feedforward and Elman SANN models.
• Among the three variants of the PSO algorithm, viz. Trelea-I, Trelea-II and Clerc-Type1, none can be declared the best one.
• Overall better accuracies are obtained by combining the forecasting outputs of the three PSO based ANN models through simple average and median.
For visual illustration, the actual values (solid line) and their corresponding forecasts (dotted line) through PSO Trelea-I based SFANN models are presented in Fig. 4.

VI. CONCLUSIONS
In this paper, an effort was made to assess the ability of PSO based neural networks to forecast seasonal time series. Three versions of the PSO algorithm, viz. Trelea-I, Trelea-II and Clerc-Type1, were used to train two types of ANN models, viz. feedforward ANN (FANN) and Elman ANN (EANN). Empirical analysis was conducted on three time series, of which two had monthly and one had quarterly seasonal patterns. The obtained results demonstrated that each of the three PSO variants achieved reasonably better forecasting accuracies than the standard backpropagation (the Levenberg-Marquardt) algorithm for FANN as well as EANN models in terms of MAE and MSE. It was also observed that the issue of choosing the best PSO variant can be resolved by combining the forecasting outputs from all three PSO based ANN models. Moreover, our study showed that both FANN and EANN provided remarkably better forecasting accuracies than the traditional SARIMA, HW and SVM methods for all three seasonal datasets. This study also supports the view that ANNs are indeed capable of directly forecasting the inherent seasonal pattern without removing it from the time series. In future, the strength of PSO based ANN models can be further explored for other varieties of seasonal data.

ACKNOWLEDGMENT
The first author is grateful to the Council of Scientific and Industrial Research (CSIR) for the financial support received to carry out the present research work.
Fig. 4. Actual and forecasted observations of the three seasonal time series datasets (panels: Airline, Red wine, Industry)

REFERENCES
[1] G. P. Zhang and D. M. Kline, “Quarterly time-series forecasting with neural networks,” IEEE Transactions on Neural Networks, vol. 18, no. 6, pp. 1800–1814, 2007.
[2] C. Lemke and B. Gabrys, “Meta-learning for time series forecasting and forecast combination,” Neurocomputing, vol. 73, pp. 2006–2016, 2010.
[3] P. Cortez, “Sensitivity analysis for time lag selection to forecast seasonal time series using neural networks and support vector machines,” in Proceedings of the International Joint Conference on Neural Networks, IJCNN 2010, Barcelona, Spain, 2010, pp. 1–8.
[4] G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control, 3rd Edition. California: Holden-Day, 1970.
[5] J. A. K. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural Processing Letters, vol. 9, no. 3, pp. 293–300, 1999.
[6] E. Ghysels, C. W. Granger, and P. L. Siklos, “Is seasonal adjustment a linear or nonlinear data filtering process?” Journal of Business and Economic Statistics, vol. 14, no. 3, pp. 374–386, 1996.
[7] D. M. Miller and D. Williams, “Damping seasonal factors: Shrinkage estimators for the X-12-ARIMA program,” International Journal of Forecasting, vol. 20, no. 4, pp. 529–549, 2004.
[8] G. Zhang, B. E. Patuwo, and M. Y. Hu, “Forecasting with artificial neural networks: The state of the art,” International Journal of Forecasting, vol. 14, pp. 35–62, 1998.
[9] I. Alon, M. Qi, and R. J. Sadowski, “Forecasting aggregate retail sales: a comparison of artificial neural networks and traditional methods,” Journal of Retailing and Consumer Services, vol. 8, no. 3, pp. 147–156, 2001.
[10] F. M. Tseng, H. C. Yu, and G. H. Tzeng, “Combining neural network model with seasonal time series ARIMA model,” Technological Forecasting and Social Change, vol. 69, no. 1, pp. 71–87, 2002.
[11] J. M. Kihoro, R. O. Otieno, and C. Wafula, “Seasonal time series forecasting: A comparative study of ARIMA and ANN models,” African Journal of Science and Technology (AJST), vol. 5, no. 2, pp. 41–49, 2004.
[12] J. Faraway and C. Chatfield, “Time series forecasting with neural networks: a comparative study using the airline data,” Applied Statistics, vol. 47, no. 2, pp. 231–250, 1998.
[13] T. Hill, M. O’Connor, and W. Remus, “Neural network models for time series forecasts,” Management Science, vol. 42, no. 7, pp. 1082–1092, 1996.
[14] G. Zhang and M. Qi, “Neural network forecasting for seasonal and trend time series,” European Journal of Operational Research, vol. 160, no. 2, pp. 501–514, 2005.
[15] G. P. Zhang and D. M. Kline, “Quarterly time-series forecasting with neural networks,” IEEE Transactions on Neural Networks, vol. 18, no. 6, pp. 1800–1814, 2007.
[16] C. Hamzaçebi, “Improving artificial neural networks’ performance in seasonal time series forecasting,” Information Sciences, vol. 178, pp. 4550–4559, 2008.
[17] J. Kennedy and R. C. Eberhart, “Particle swarm optimization,” in IEEE International Conference on Neural Networks (ICNN), Piscataway, NJ, 1995, pp. 1942–1948.
[18] A. P. Chen, C. H. Huang, and Y. C. Hsu, “Particle swarm optimization with inertia weight and constriction factor,” in International Conference on Swarm Intelligence (ICSI), Cergy, France, 2011, pp. 1–11.
[19] R. Adhikari and R. K. Agrawal, “Effectiveness of PSO based neural network for seasonal time series forecasting,” in Indian International Conference on Artificial Intelligence (IICAI), Tumkur, India, 2011, pp. 232–244.
[20] G. K. Jha, P. Thulasiraman, and R. K. Thulasiram, “PSO based neural network for time series forecasting,” in IEEE International Joint Conference on Neural Networks (IJCNN), June 14–19, Atlanta, Georgia, USA, 2009, pp. 1422–1427.
[21] I. Trelea, “The particle swarm optimization algorithm: convergence analysis and parameter selection,” Information Processing Letters, vol. 85, 2003.
[22] M. Clerc and J. Kennedy, “The particle swarm—explosion, stability, and convergence in a multidimensional complex space,” IEEE Transactions on Evolutionary Computation, vol. 6, pp. 58–73, 2002.
[23] W. Yan, “Toward automatic time-series forecasting using neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 7, pp. 1028–1039, 2012.
[24] C. Chatfield and M. Yar, “Holt-Winters forecasting: some practical issues,” The Statistician, vol. 37, pp. 129–140, 1988.
[25] V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
[26] C. P. Lim and W. Y. Goh, “The application of an ensemble of boosted Elman networks to time series prediction: A benchmark study,” Journal of Computational Intelligence, vol. 3, pp. 119–126, 2005.
[27] M. Hagan and M. Menhaj, “Training feedforward networks with the Marquardt algorithm,” IEEE Transactions on Neural Networks, vol. 5, pp. 989–993, 1994.
[28] H. Demuth, M. Beale, and M. Hagan, Neural Network Toolbox User’s Guide. The MathWorks, Natick, 2010.
[29] R. J. Hyndman, Time Series Data Library (TSDL), URL: http://robjhyndman.com/TSDL/, Jan. 2011.
[30] R Development Core Team, R: A Language and Environment for Statistical Computing