A Hybrid Deep Learning-Based State Forecasting Method for Smart Power Grids
Shahrzad Hadayeghparast, Amir Namavar Jahromi, Hadis Karimipour
S. Hadayeghparast, A. N. Jahromi and H. Karimipour, "A Hybrid Deep Learning-Based Power System State Forecasting," Toronto, ON, 2020, pp. 893-898.
Shahrzad Hadayeghparast
School of Engineering, University of Guelph, Guelph, Canada ([email protected])
Amir Namavar Jahromi
School of Engineering, University of Guelph, Guelph, Canada ([email protected])
Hadis Karimipour
School of Engineering, University of Guelph, Guelph, Canada ([email protected])
Abstract—Smart power grids are among the most complex cyber-physical systems, delivering electricity from power generation stations to consumers. It is critically important to know the current state of the system as well as its state variation tendency; consequently, state estimation and state forecasting are widely used in smart power grids. Because state forecasting predicts the system state ahead of time, it can enhance state estimation, which is highly sensitive to measurement corruption caused by bad data or communication failures. In this paper, a hybrid deep learning-based method is proposed for power system state forecasting. The proposed method leverages a Convolutional Neural Network (CNN) for predicting voltage magnitudes and a deep Recurrent Neural Network (RNN) for predicting phase angles. The proposed CNN-RNN model is evaluated on the IEEE 118-bus benchmark. The results demonstrate that the proposed model outperforms the existing techniques in the literature, reducing the normalized Root Mean Squared Error (RMSE) of the predicted voltages by 10%. The results also show a 65% and 35% decrease in the average and maximum absolute errors of voltage magnitude forecasting, respectively.
Keywords—forecasting, smart power grid, deep learning, recurrent neural network, convolutional neural network

I. INTRODUCTION
In smart power grids, it is critically important to have accurate information about the current state of the system and its state variation tendency [1], [2]. State estimation is extensively employed in energy management systems for monitoring the smart power grid state [3]-[5]. It is used to retrieve the unknown system state, including voltage magnitudes and phase angles, from a set of measurements provided by the Supervisory Control and Data Acquisition (SCADA) system [6], [7]. However, state estimation is highly sensitive to measurement corruption caused by bad data or temporary malfunctions of the communication system [8], [9]. Therefore, state forecasting is adopted to address this issue by predicting system states ahead of time, which in turn enhances state estimation accuracy and security [8], [10], [11]. State forecasting aims to produce sufficiently accurate forecasts of the voltage at all buses of the system. The forecasted quantities are mostly utilized for debugging incoming data obtained from the SCADA system [8].

A considerable number of studies have been carried out on power system state forecasting since the 1970s; they can be divided into two categories: traditional methods and learning-based methods. The first research in this field was presented by Debs in [12], who introduced the idea of the tracking state estimator. That study adopted the Kalman filtering technique and used an identity matrix as the state transition matrix. Nishiya [13] improved state forecasting by introducing a trend factor to the model; however, the state transition matrix was still an identity matrix. The model was further improved by estimating the diagonal elements of the transition matrix with Holt's Exponential Smoothing (HES) method. Morvaj [14] also used a Kalman filter for power system dynamic state estimation, but the elements of the diagonal transition matrix were updated hourly based on load forecasting.
A power flow-based dynamic model driven by load forecasts was presented in [15], which introduces a new term into the state transition model. A block-diagonal state transition matrix based on regression analysis was presented in [1] for power system state forecasting; in that study, the elements of the transition matrix are updated whenever new measurement data are received. Another approach was proposed by Hassanzadeh in [3] for short-term nodal voltage phasor forecasting using first-order vector autoregressive (VAR) modeling. However, the assumption of linear dynamics in all of these methods is practically unrealistic, since the relationships between system states are nonlinear.

Artificial Neural Networks (ANNs) have also been used in power system state forecasting. An ANN was utilized by Filho in [16] to achieve state forecasting; however, the adopted architecture corresponds to a linear model. The ANN-based state forecasting model presented in [8], [17] is the same as the one proposed in [16]: the state transition matrix elements correspond to the ANN interconnection weights. The drawback of this model is that the number of model parameters grows with the length of the input sequences. The study presented in [18] utilizes an ANN to predict bus loads in the prediction stage of dynamic state estimation; power flow equations are then used to determine the voltage phasors. A deep Recurrent Neural Network (RNN) was proposed in [10] for power system state forecasting. The RNN model outperforms existing alternatives by considering nonlinear dependencies between system states. In addition, contrary to the study presented in [17], the number of model parameters is fixed for variable-length input sequences.

In this paper, a hybrid deep learning-based method is proposed for power system state forecasting in order to achieve high prediction accuracy.
The proposed method consists of a Convolutional Neural Network (CNN) for predicting voltage magnitudes and a deep RNN for predicting phase angles. The CNN model is chosen for predicting voltage magnitudes because CNNs are specialized in processing data with a grid-like topology, which fits our smart power grid data. The proposed method, called the CNN-RNN model, is evaluated on the IEEE 118-bus benchmark, and the results are compared with the RNN model in [10], which has the best performance among the existing alternatives in the literature. The results demonstrate that the CNN-RNN model improves prediction accuracy: the normalized Root Mean Squared Error (RMSE) of the predicted voltages is reduced by 10%, and the average and maximum absolute errors of the voltage magnitude forecasts decrease by 65% and 35%, respectively.

This paper is organized as follows: Section II presents power system state forecasting. The proposed CNN-RNN model is described in Section III. Section IV provides simulation results and discussion. Finally, the paper is concluded in Section V.

II. POWER SYSTEM STATE FORECASTING
The purpose of power system state forecasting is to accurately predict the next state of the system, $s_{t+1}$, using the previous system states $\{s_i\}_{i=t-r+1}^{t}$, as expressed in (1) [10]:

$$s_{t+1} = f_{\text{state forecasting}}(s_{t-r+1}, \ldots, s_{t-1}, s_t) + e_t \quad (1)$$

where $s_t$ is the state of the system at time instance $t$. The number of lagged states is represented by the parameter $r \geq 1$. The function $f_{\text{state forecasting}}$ is an unknown nonlinear function that forecasts the next state $s_{t+1}$ from the lagged system states $\{s_i\}_{i=t-r+1}^{t}$, and $e_t$ models inaccuracies. The state of the system is represented by the bus voltages (magnitudes and phase angles) at all buses [8]. The number of state variables is twice the number of buses ($n$): $n$ voltage magnitudes and $n$ phase angles. The state variables at time instance $t$ are shown in (2):

$$s_t = \left[\, |V_t^1| \;\; |V_t^2| \;\; \cdots \;\; |V_t^n| \;\; \angle V_t^1 \;\; \angle V_t^2 \;\; \cdots \;\; \angle V_t^n \,\right] \quad (2)$$

where $|V_t^n|$ and $\angle V_t^n$ represent the voltage magnitude and phase angle at time instance $t$ and bus $n$, respectively.

III. PROPOSED HYBRID DEEP LEARNING-BASED MODEL
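To make the data layout in (1) and (2) concrete, the following sketch builds lagged-window inputs and next-state targets from a $T \times 2n$ matrix of recorded states. It is illustrative only: the function name, variable names, and toy dimensions are our own, not from the paper.

```python
import numpy as np

def make_windows(states, r):
    """Build (input, target) pairs from a T x 2n state matrix.

    Each input stacks r consecutive states s_{t-r+1}..s_t; the
    target is the next state s_{t+1}, matching (1).
    """
    T = states.shape[0]
    X = np.stack([states[t - r + 1:t + 1] for t in range(r - 1, T - 1)])
    y = states[r:]
    return X, y

# Toy example: 3 buses -> 6 state variables, 20 time instances.
states = np.arange(20 * 6, dtype=float).reshape(20, 6)
X, y = make_windows(states, r=4)
```

With $r = 10$ and the 118-bus system, each input window would be a $10 \times 236$ slice and the target a 236-element vector.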
The architecture of the proposed CNN-RNN model is illustrated in Fig. 1. The model consists of a CNN model developed for predicting voltage magnitudes and a deep RNN model developed for predicting phase angles. The upper half uses a one-dimensional (1D) CNN model for multivariate time-series forecasting, followed by dense, fully connected layers. The lower half uses a three-layer deep RNN followed by a dense, fully connected layer. The model inputs, consisting of the state variables from time instance $t-r+1$ to $t$, are shown as green circles. According to (1) and (2), the input can be expressed as a matrix of state variables (voltages), as presented in (3):

$$\mathit{Input} = \left[\, s_{t-r+1} \;\; s_{t-r+2} \;\; \cdots \;\; s_{t-1} \;\; s_t \,\right] =
\begin{bmatrix}
|V_{t-r+1}^1| & |V_{t-r+2}^1| & \cdots & |V_{t-1}^1| & |V_t^1| \\
|V_{t-r+1}^2| & |V_{t-r+2}^2| & \cdots & |V_{t-1}^2| & |V_t^2| \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
|V_{t-r+1}^n| & |V_{t-r+2}^n| & \cdots & |V_{t-1}^n| & |V_t^n| \\
\angle V_{t-r+1}^1 & \angle V_{t-r+2}^1 & \cdots & \angle V_{t-1}^1 & \angle V_t^1 \\
\angle V_{t-r+1}^2 & \angle V_{t-r+2}^2 & \cdots & \angle V_{t-1}^2 & \angle V_t^2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\angle V_{t-r+1}^n & \angle V_{t-r+2}^n & \cdots & \angle V_{t-1}^n & \angle V_t^n
\end{bmatrix} \quad (3)$$

Similarly, the output of the model, shown as blue circles, can be expressed as a vector of state variables in (4):

$$s_{t+1} = \left[\, |V_{t+1}^1| \;\; \cdots \;\; |V_{t+1}^n| \;\; \angle V_{t+1}^1 \;\; \cdots \;\; \angle V_{t+1}^n \,\right] \quad (4)$$

A. Convolutional Neural Network (CNN) Model
CNNs perform very well at processing data with a grid-like topology. Therefore, they can be adopted for time-series forecasting by treating the time series as a 1D grid that takes samples at regular time intervals [19]. The input to the CNN model is the $2n \times r$ matrix of state variables presented in (3). In this paper, $n$ and $r$ account for 118 buses and 10 lagged states, respectively. As shown in Fig. 1, the first stage of the CNN model is a convolutional layer. In this layer, 118 filters with a kernel size of 2 are applied to the matrices of dimension $2n \times 2$ given in (5) to produce feature maps, which then pass through a nonlinear ReLU function. The equation describing the convolutional layer is presented in (6):

$$I_p = \left[\, s_{t-p} \;\; s_{t-p+1} \,\right] \quad (5)$$

$$h_{pk} = f_{\text{ReLU}}(w_k * I_p + b_k), \qquad 1 \leq p \leq r-1, \; 1 \leq k \leq 118 \quad (6)$$

where $f_{\text{ReLU}}$ is the activation function, selected as a ReLU in this paper. The matrix $w_k$ and the scalar $b_k$ are the weight matrix and bias of the $k$th filter, respectively. Also, $h_{pk}$ stands for the $p$th element of the $k$th feature map, and $I_p$ represents the $p$th region of the input to which the filter is applied.

Fig. 1. The architecture of the proposed CNN-RNN model for power system state forecasting.
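As a rough illustration of (5) and (6), the sketch below slides kernel-size-2 filters along the time axis of a lagged-state window and applies a ReLU. The shapes, weights, and helper name are made up for the example and do not reproduce the trained model:

```python
import numpy as np

def conv1d_relu(inp, w, b):
    """Kernel-size-2 1D convolution over the time axis, as in (6).

    inp: (r, m) window of r lagged states with m features each.
    w:   (k, 2, m) weights for k filters spanning two time steps.
    b:   (k,) biases.
    Returns the (r-1, k) feature maps after ReLU.
    """
    r = inp.shape[0]
    k = w.shape[0]
    h = np.empty((r - 1, k))
    for p in range(r - 1):
        patch = inp[p:p + 2]   # I_p: two consecutive state vectors
        h[p] = np.tensordot(w, patch, axes=([1, 2], [0, 1])) + b
    return np.maximum(h, 0.0)  # ReLU

inp = np.ones((10, 6))         # r = 10 lags, 6 state variables
w = np.full((3, 2, 6), 0.1)    # 3 illustrative filters
b = np.array([0.0, -2.0, 1.0])
h = conv1d_relu(inp, w, b)
```

With $r = 10$, each feature map has $r - 1 = 9$ elements, which matches the pooling discussion that follows.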
The convolutional layer is followed by a pooling layer, which replaces each element of its input with a summary statistic of the nearby elements [19]. The max-pooling operation is used in the pooling layer of this model, as expressed in (7):

$$z_k = \max\text{-}\mathit{pooling}(h_{pk}), \qquad 1 \leq p \leq r-1, \; 1 \leq k \leq 118 \quad (7)$$

where $z_k$ is the output of the pooling layer corresponding to the $k$th feature map. The pool size is selected to be 2; therefore, $z_k$ is a vector with 4 elements for each feature map. The pooling layer is followed by a flatten layer, which converts the output of the pooling layer into a vector with 472 elements, as expressed in (8). The next stage is a dense, fully connected layer with 236 neurons; its output is passed through a ReLU activation function, as presented in (9):

$$o_{\text{flat}} = \mathit{Flatten}(z) \quad (8)$$

$$o_{\text{dense1}} = f_{\text{ReLU}}(W_{\text{dense1}}\, o_{\text{flat}}) \quad (9)$$

where $o_{\text{flat}}$ and $o_{\text{dense1}}$ are the output vectors of the flatten and dense layers, respectively, and $W_{\text{dense1}}$ is the weight matrix of the dense layer. The final stage of the CNN architecture is another dense, fully connected layer with 118 neurons. Its output is passed through a linear activation function to produce the outputs of the CNN model, which are the voltage magnitudes of all buses, as presented in (10):

$$\left[\, |V_{t+1}^1| \;\; |V_{t+1}^2| \;\; \cdots \;\; |V_{t+1}^n| \,\right] = f_{\text{Linear}}(W_{\text{dense2}}\, o_{\text{dense1}} + b_{\text{dense2}}) \quad (10)$$

where $W_{\text{dense2}}$ and $b_{\text{dense2}}$ are the weight matrix and bias vector of the second dense, fully connected layer, respectively.

B. Deep Recurrent Neural Network (RNN) Model
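The pooling and flattening steps of (7) and (8) can be sketched as follows, a minimal numpy version with toy sizes. In the paper, each of the 118 feature maps has 9 elements, so pooling with size 2 leaves 4 per map and flattening yields 472; here the counts are scaled down for illustration:

```python
import numpy as np

def max_pool(h, pool=2):
    """Non-overlapping max pooling along the time axis, as in (7)."""
    p, k = h.shape
    q = p // pool                       # leftover rows are dropped
    return h[: q * pool].reshape(q, pool, k).max(axis=1)

h = np.arange(18, dtype=float).reshape(9, 2)  # 9-step maps, 2 filters
z = max_pool(h)                  # one 4-element vector per feature map
flat = z.reshape(-1)             # flatten before the dense layers
```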
RNNs, as members of the neural network family, are specialized to process sequential data with variable-length inputs or outputs [19], [20]. In principle, recurrent networks can use their feedback connections to store representations of recent input events in the form of activations (short-term memory). This is potentially significant for many applications, including speech processing, non-Markovian control, music composition, and natural language processing (NLP) [21]. A dynamical system influenced by an external signal can be described by (11) [19]:

$$s_t = f_{\theta}(s_{t-1}, x_t) \quad (11)$$

where $s_t$ and $x_t$ account for the system state and the external input at time instance $t$, respectively. The function $f_{\theta}$, which includes the parameters $\theta$, maps $s_{t-1}$ to $s_t$. The equation expressed in (11) is visually represented in Fig. 2: the left half and right half illustrate the circuit form and its unfolded flow graph, respectively. In the right half, each node indicates a time instance, while a delay of one time instance on the recurrent connection is indicated by a black square in the left half. A simple RNN architecture is shown in Fig. 3 [19]. The equations describing this architecture are expressed in (12) and (13) [22]:

$$h_t = f_h(x_t, h_{t-1}) = \phi_h(U x_t + W h_{t-1}) \quad (12)$$

$$o_t = f_o(x_t, h_t) = \phi_o(V h_t) \quad (13)$$

Fig. 2. Correlation between the current state and the whole past sequence.
Fig. 3. A simple RNN.

where $U$, $W$, and $V$ are the weight matrices for the input-to-hidden, hidden-to-hidden, and hidden-to-output connections, respectively. The input and output of the dynamical system are denoted by $x_t$ and $o_t$, respectively. The functions $f_o$ and $f_h$ stand for the output function and the transition function, and are defined by the sets of parameters $\theta_o$ and $\theta_h$, respectively. The functions $\phi_o$ and $\phi_h$ are element-wise nonlinear functions. Deep RNNs are constructed from the RNN architecture with multiple processing layers [10]. The deep RNN used in this paper is a stacked RNN, which includes multiple levels of transition functions, one on top of another [22]. The architecture of the deep RNN and its corresponding equations are presented in Fig. 4 and (14), respectively:

$$h_t^l = f(W_v^{l-1} h_t^{l-1} + W_h^l h_{t-1}^l + b^l) \quad (14)$$

where $l$ stands for the layer index and $f$ is a nonlinear activation function (ReLU in this paper). Also, $\{W_v^l, W_h^l, b^l\}$ are the weight matrices and bias vectors, and $h_t^{l-1}$ is the hidden state at time $t$ and layer $l-1$.

Fig. 4. The architecture of the deep RNN in the proposed method.
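A bare-bones numpy version of the stacked recurrence in (14) is sketched below. The dimensions, random weights, and helper names are illustrative assumptions, not the trained network:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def stacked_rnn_step(x_t, h_prev, params):
    """One time step of an L-layer stacked RNN, following (14).

    h_t^l = f(Wv h_t^{l-1} + Wh h_{t-1}^l + b), with h_t^0 = x_t.
    params is a list of (Wv, Wh, b) tuples, one per layer.
    """
    h_in = x_t
    h_new = []
    for (Wv, Wh, b), h_l in zip(params, h_prev):
        h_l_new = relu(Wv @ h_in + Wh @ h_l + b)
        h_new.append(h_l_new)
        h_in = h_l_new          # feed this layer's state upward
    return h_new

rng = np.random.default_rng(0)
d, L = 4, 3                     # toy hidden size, 3 layers as in the paper
params = [(rng.normal(size=(d, d)), rng.normal(size=(d, d)), np.zeros(d))
          for _ in range(L)]
h = [np.zeros(d) for _ in range(L)]
for x_t in np.ones((10, d)):    # run over r = 10 lagged inputs
    h = stacked_rnn_step(x_t, h, params)
```

The final top-layer state (here `h[-1]`) is what the dense output layer in (15) consumes.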
In this paper, the deep RNN has three layers ($l = 3$) and ten lagged states ($r = 10$). It is followed by a dense, fully connected layer with 118 neurons, as shown in Fig. 1. The equation describing the phase angle forecasting is presented in (15):

$$\left[\, \angle V_{t+1}^1 \;\; \angle V_{t+1}^2 \;\; \cdots \;\; \angle V_{t+1}^n \,\right] = f_{\text{Linear}}(W_{\text{dense3}}\, h_t^l + b_{\text{dense3}}) \quad (15)$$

where $\angle V_{t+1}^n$ stands for the phase angle forecast at time $t+1$ and bus $n$. In addition, $W_{\text{dense3}}$ and $b_{\text{dense3}}$ represent the weight matrix and bias vector of the dense, fully connected layer.

IV. SIMULATION RESULTS AND DISCUSSION
The proposed CNN-RNN model developed for power system state forecasting is evaluated on the IEEE 118-bus benchmark. The dataset, which consists of 236 state variables (bus voltages, including 118 voltage magnitudes and 118 phase angles) at 18,528 time instances, is adopted from [10]. The training and test sets include 80% and 20% of the dataset, with 14,822 and 3,706 samples, respectively. The dataset is built on the real load data of the 2012 Global Energy Forecasting Competition (GEFC) [23]. The simulation results of the proposed CNN-RNN model are compared with the deep RNN model presented in [10], which outperforms the existing techniques in the literature. The efficient Adam variant of stochastic gradient descent is used to fit the model and learn the weights. The model is implemented in the Python programming language on a system with an Intel(R) Core(TM) i5-7400 CPU and 16 GB of RAM.

The CNN-RNN and RNN models are evaluated in terms of the normalized RMSE. The RMSE of the proposed model is computed as the average RMSE over 20 independent runs. The average RMSEs of the CNN-RNN and RNN models are both on the order of $10^{-3}$; the proposed model reduces the RMSE by 10% and thus improves the prediction accuracy. The reason for the superiority of the proposed model is the use of the CNN architecture for predicting voltage magnitudes: CNNs specialize in processing data with a grid-like topology and in mining spatial features, which fits our smart power grid data. The average and maximum Absolute Errors (AE) of the voltage predictions over the whole test set are presented in TABLE I. A 65% and 35% decrease in the average and maximum AEs of the voltage magnitudes are observed, respectively. It is clear that the proposed CNN-RNN model significantly reduces the voltage magnitude prediction error, while the voltage angle prediction error is almost the same in both cases.
The reason is the use of the same three-layer deep RNN in both methods for forecasting the phase angles. The results are confirmed by plotting the AE over the whole test set: the AEs of the voltage magnitude and voltage angle forecasts are shown in Fig. 5 and Fig. 6, respectively.
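For reference, the error metrics used above can be computed as in the short sketch below, with plain RMSE and absolute errors on toy arrays; the exact normalization behind the reported normalized RMSE is not reproduced here:

```python
import numpy as np

def rmse(pred, true):
    """Root mean squared error over all buses and test instances."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))

# Toy forecasts vs. ground truth (2 test instances, 2 buses).
true = np.array([[1.00, 1.02], [0.99, 1.01]])
pred = np.array([[1.01, 1.02], [0.99, 1.03]])
err = rmse(pred, true)
avg_ae = float(np.mean(np.abs(pred - true)))  # average absolute error
max_ae = float(np.max(np.abs(pred - true)))   # maximum absolute error
```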
TABLE I. ABSOLUTE ERROR OF VOLTAGE FORECASTING

Method    | Voltage Magnitude AE (p.u.) | Voltage Angle AE (degree)
          | Average       Max           | Average       Max
CNN-RNN   | ~10^-4        ~10^-2        | ~10^-3        ~10^-2

Fig. 5. The absolute error of voltage magnitude forecasting for the CNN-RNN and RNN models.
Fig. 6. The absolute error of voltage angle forecasting for the CNN-RNN and RNN models.
Larger errors are evident in the voltage magnitudes forecasted by the RNN model compared to the proposed CNN-RNN model in Fig. 5. The forecasted voltages for all buses at test instance 3222 are shown in Fig. 7. The proposed CNN-RNN model predicts the voltage magnitudes with considerably better accuracy than the RNN model. However, due to the use of the same three-layer deep RNN in both methods for phase angle forecasting, the accuracy of both models in predicting the phase angles is the same. Fig. 8 illustrates the voltage magnitude and angle forecasts for bus 16 from test instance 500 to 1000 (note that the test instance numbering starts from the 11th test example, since r = 10). The figures clearly show that the proposed CNN-RNN model considerably improves the accuracy of the voltage magnitude predictions, while the phase angle forecasting accuracy is almost the same.

Fig. 7. Forecasted voltage magnitudes and angles for all buses at test instance 3222.
Fig. 8. Forecasted voltage magnitudes and angles of bus 16 from test instance 500 to 1000.

V. CONCLUSION
A hybrid deep learning-based method was proposed for power system state forecasting. The proposed method consists of a CNN model for predicting voltage magnitudes and a deep RNN model for predicting voltage angles. The CNN-RNN model produced more accurate predictions, reducing the RMSE of the forecasted voltages by 10% compared to the best existing technique in the literature. In addition, using a CNN for voltage magnitude forecasting considerably decreased the average and maximum absolute errors of the voltage magnitude forecasts, by 65% and 35%, respectively. Therefore, CNNs can be a better option than the existing alternatives for voltage magnitude forecasting. The higher accuracy achieved by CNNs is due to the fact that they specialize in processing data with a grid-like topology and in mining spatial features, which fits our smart power grid data. Our future work will adopt series and parallel CNN-RNN architectures to further improve power system state forecasting accuracy.

REFERENCES
[1] M. Hassanzadeh and C. Y. Evrenosoglu, "Power system state forecasting using regression analysis," IEEE Power Energy Soc. Gen. Meet., pp. 12-17, 2012.
[2] H. Karimipour and V. Dinavahi, "Extended Kalman filter-based parallel dynamic state estimation," IEEE Trans. Smart Grid, vol. 6, no. 3, pp. 1539-1549, 2015.
[3] M. Hassanzadeh, C. Y. Evrenosoğlu, and L. Mili, "A short-term nodal voltage phasor forecasting method using temporal and spatial correlation," IEEE Trans. Power Syst., vol. 31, no. 5, pp. 3881-3890, 2016.
[4] H. Karimipour and V. Dinavahi, "Accelerated parallel WLS state estimation for large-scale power systems on GPU," no. 1, 2013.
[5] I. Al-Omari, A. Rahimnejad, A. Gadsden, M. Moussa, and H. Karimipour, "Power system dynamic state estimation using smooth variable structure filter," 2019 7th IEEE Glob. Conf. Signal Inf. Process. Proc., 2019.
[6] A. J. Wood and B. Wollenberg, Power Generation, Operation and Control, 2nd ed., 1996.
[7] H. Karimipour and V. Dinavahi, "Parallel domain decomposition based distributed state estimation for large-scale power systems," IEEE Trans. Ind. Appl., 2015.
[8] M. B. Do Coutto Filho and J. C. Stacchini de Souza, "Forecasting-aided state estimation - Part I: Panorama," IEEE Trans. Power Syst., vol. 24, no. 4, pp. 1667-1677, 2009.
[9] H. Karimipour and V. Dinavahi, "Parallel relaxation-based joint dynamic state estimation of large-scale power systems," IET Gener. Transm. Distrib., vol. 10, no. 2, pp. 452-459, 2016.
[10] L. Zhang, G. Wang, and G. B. Giannakis, "Real-time power system state estimation and forecasting via deep unrolled neural networks," IEEE Trans. Signal Process., vol. 67, no. 15, pp. 4069-4077, 2019.
[11] H. Karimipour and V. Dinavahi, "Robust massively parallel dynamic state estimation of power systems against cyber-attack," IEEE Access, vol. 6, pp. 2984-2995, 2018.
[12] A. S. Debs and R. E. Larson, "A dynamic estimator for tracking a power system," IEEE Trans. Power Appar. Syst., vol. 89, no. 7, pp. 1670-1678, 1970.
[13] K. Nishiya, J. Hasegawa, and T. Koike, "Dynamic state estimation including anomaly detection and identification for power systems," IEE Proc. C Gener. Transm. Distrib., 1982.
[14] Z. Morvaj, "A mathematical model of an electric power system for dynamic state estimation," Electr. Power Syst. Res., vol. 8, no. 3, pp. 207-217, 1985.
[15] E. A. Blood, B. H. Krogh, and M. D. Ilić, "Electric power system static state estimation through Kalman filtering and load forecasting," IEEE Power Energy Soc. Gen. Meet., pp. 1-6, 2008.
[16] M. B. Do Coutto Filho, J. C. S. Souza, R. S. G. Matos, and M. T. Schilling, "Revealing gross errors in critical measurements and sets via forecasting-aided state estimators," Electr. Power Syst. Res., vol. 57, no. 1, pp. 25-32, 2001.
[17] M. B. Do Coutto Filho, J. C. Stacchini de Souza, and R. S. Freund, "Forecasting-aided state estimation - Part II: Implementation," IEEE Trans. Power Syst., vol. 24, no. 4, pp. 1678-1685, 2009.
[18] A. K. Sinha and J. K. Mondal, "Dynamic state estimator using ANN based bus load prediction," IEEE Power Eng. Rev., vol. 19, no. 1, p. 54, 1999.
[19] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[20] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, 1986.
[21] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., 1997.
[22] R. Pascanu, C. Gulcehre, K. Cho, and Y. Bengio, "How to construct deep recurrent neural networks," 2014.
[23] H. Herrmann and H. Bucksch, "Load history," Dictionary Geotechnical Engineering / Wörterbuch GeoTechnik, pp. 812-812, 2014.