[PDF] A Conditional Model of Wind Power Forecast Errors and Its Application in Scenario Generation

Abstract

In power system operation, characterizing the stochastic nature of wind power is an important albeit challenging issue. It is well known that distributions of wind power forecast errors often exhibit significant variability with respect to different forecast values. Therefore, appropriate probabilistic models that can provide accurate information for conditional forecast error distributions are of great need. On the basis of Gaussian mixture model, this paper constructs analytical conditional distributions of forecast errors for multiple wind farms with respect to forecast values. The accuracy of the proposed probabilistic models is verified by using historical data. Thereafter, a fast sampling method is proposed to generate scenarios from the conditional distributions which are non-Gaussian and interdependent. The efficiency of the proposed sampling method is verified.

Full PDF

1 A Conditional Model of Wind Power Forecast Errors and Its Application in Scenario Generation

Zhiwen Wang, Chen Shen*, Feng Liu Dep. of Electrical Engineering, Tsinghua University, Beijing, 100084, China

A B S T R A C T

K E Y W O R D S

Conditional distribution Gaussian mixture model Scenario generation Wind power

H I G H L I G H T S

Conditional distributions of forecast errors for multiple wind farms under different forecast values. A fast scenario generation method for non-Gaussian interdependent distributions. Introduction

Nowadays, a large amount of wind power has been integrated into power systems. In power system operation, a wind power forecasting tool plays an important role. Since the forecast values always deviate from the true ones more or less, the resulting forecast errors should be taken into account in generation scheduling [1]. In industrial practice, systems operators usually allocate reserves to compensate the forecast errors [2]. On one hand, if the forecast errors are overestimated, reserves will be overcommitted, increasing operation costs; on the other hand, if the forecast errors are underestimated, reserves will be undercommitted, causing wind spillage and load shedding. Therefore, modeling the wind power forecast errors is a crucial issue for unit commitment (UC) and economical dispatch (ED). Given an effective point forecasting tool, distributions of wind power forecast errors are conditioned on forecast values. In the literature [3-7], various probabilistic distributions have been adopted to model conditional distributions of wind power forecast errors. In [3], the authors point out that forecast errors of a single wind farm are far from Gaussian distributions, as the kurtosis could be over 10 ( 3 for the Gaussian). Beta distribution is suggested to model forecast error uncertainties. In a relevant study [4], the authors combine the Beta distribution and Dirac delta function, and obtain a “mixed beta distribution”, improving the model accuracy. Further, Bruninx et al find that Beta distribution is not able to fully characterize the skewed and heavy-tailed forecast errors [5]. To solve this problem, the Levy α -stable distribution is adopted. The test results in [5] show that the Levy α -stable distribution outperforms Beta distribution. As the distributions of forecast errors are quite various under different forecast levels, the “versatile distribution” with three adjustable parameters is proposed in [6], achieving higher flexibility. Because the “versatile distribution” has more adjustable parameters than Beta/Gaussian distributions, it can better represent forecast error uncertainties. Following a similar idea, Menemenlis et al use the time-varying Gamma-like distribution, whose parameters are adjusted as functions of forecast levels, to model the forecast errors [7]. These detailed conditional models [3-7] help the generation scheduling dynamically adjust reserves to different forecast levels. However, they are applicable to the single wind farm case only. They cannot handle multivariate random variables. To model a joint distribution for adjacent wind farms, the Copula technique has drawn much attention lately. In [8], the Gaussian Copula is used to model the spatial interdependence structure in forecast uncertainties for multiple wind farms across a region. In a similar study [9], applying the Gaussian Copula theory, the authors conduct a multivariate probabilistic analysis for spatial correlated wind generation in the European grid. A remarkable advantage of using the Copula theory to model forecast error uncertainties is made in [10]. Different types of Copula functions, e.g., Gaussian , t , Clayton , Frank , Gumbel , are adopted to model the stochastic dependence of uncertainties. The Copula-based conditional distributions of forecast errors for multiple wind farms are obtained. For applications, authors in [11] propose a Copula-based chance-constrained optimization model for power system planning. Further, in order to deal with different dependency structures between pairs of random variables, e.g., wind and solar, the vine-Copula methods are investigated in [12], [13], improving the accuracy in high-dimension cases. Although constructing the Copula-based conditional distributions for multiple wind farms has been investigated in the literature [8-13], it is hard to ensure that the constructed distributions have some desirable attributes. For instance, in terms of the scenario generation , the Gaussian Copula method generates original scenarios from a Gaussian distribution, transforms the original scenarios into the Copula domain, and obtains final scenarios by using inverse transformations of marginal cumulative distribution functions (CDF). The procedure is time-consuming relative to sampling directly from a joint distribution. The scenario generation procedures of other types of Copulas are more complicated. In order to incorporate forecast error uncertainties into UC and ED, scenarios generated from the conditional distributions of multiple wind farms are needed [14], [15]. Generally, when random variables are non-Gaussian and interdependent, generating scenarios, i.e., sampling , from their joint distribution is difficult [16]. Many existing techniques are either not efficient enough or less accurate [17], [18]. For example, the acceptance-rejection method and conditional sampling method need many steps and multiple transformations, which are time-consuming. The affine transformation method does not ensure that the generated scenarios strictly follow the predefined joint distribution, which may lead to inaccurate results. The Nataf technique is used in [19] to produce wind power scenarios. A time-varying correlation matrix is used in [20] for generating short-term wind uncertainty scenarios. Neither of them proves that the generated scenarios follow a predefined joint distribution. Using historical time series data of wind power and the kernel density estimator, Xydas et al propose a generation method for forecast scenarios [21]. Alternatively, Morales et al adopt the autoregressive model to generate time series data for wind power scenarios [22]. However, these techniques do not retain the original distributions of uncertainties [12]. To the best of the authors’ knowledge, when the conditional forecast error distribution of multiple wind farms is available, there is not an accurate and efficient sampling method that can generate scenarios from the non-Gaussian and interdependent joint distribution. To address these important issues, this paper aims at a systematic methodology that can accurately model conditional A scenario can be understood as a plausible realization of uncertainty [14]. The uncertainty could be formulated as random variables or a stochastic process. Wind power scenario generation means producing a set of possible realizations of wind power uncertainty. From the prospective of the probability theory, the “scenario generation” indeed means generating samples from a given probabilistic distribution. In this sense, we abuse the terminology “ sampling ” to stand for “scenario generation” in this paper. A rigorous definition and several illustrative examples can be found in [14]. distributions and generate scenarios. The original contributions are twofold: (1) On the basis of Gaussian mixture model (GMM), this paper constructs conditional distributions of wind power forecast errors for multiple wind farms under different forecast values. With the proposed distributions, non-Gaussianity and correlations of forecast error uncertainties can be handled. What’s more, operator can conveniently obtain the conditional distribution of the aggregated forecast errors across a region. (2) Based on the proposed probabilistic model, a fast method is developed to generate scenarios for wind power forecast errors with high accuracy and efficiency. The method is proved to be an exactly accurate sampling method for interdependent random variables. With the proposed sampling method, tens of thousands of scenarios can be generated within milliseconds. The rest of this paper is organized as follows. In Section 2, a framework for the proposed methodology is provided. In Section 3, the GMM is used to represent a joint distribution of actual wind power outputs and forecast values for multiple wind farms. In Section 4, analytical formulae are derived to construct a conditional distribution of wind power forecast errors. Advantages of the proposed probabilistic model are discussed. In Section 5, a method that generates scenarios from the constructed conditional distribution is proposed. Case study results are presented in Section 6. Conclusions and limitations are shown in Section 7. Framework

The proposed methodology consists of three phases. A flow chart is shown in Fig. 1:

Wind farmsHistorical data ofGMM parameter estimation for T [ X , Y ] T T Forecastvalues Constructing the conditionaldistribution of [ X | Y ] and [ Z | Y ]Wind powerforecasting toolΓ={ ω m , μ m , σ m ; m =1, ,M}Weather informationGenerating scenarios from the conditional distribution [ Z | Y ] X Historical data of

Y Phase 1Phase 2Phase 3

Fig. 1

Implementation procedure of the proposed methodology.

Phase 1 : Modeling a probability density functions (PDF) of the actual wind power outputs and forecast values by a GMM. Without loss of generality, let a random vector X denote actual wind power outputs of multiple wind farms, Y denote the corresponding forecast values, and Z denote the forecast errors. A GMM is used to represent the joint PDF of an aggregated random vector [ X T Y T ] T . Phase 2 : Constructing conditional distributions of wind power with respect to forecast values [ X | Y ], as well as the conditional distributions of wind power forecast errors [ Z | Y ]. Phase 3 : Generating scenarios from the constructed conditional distributions of [ Z | Y ]. In the implementation procedure, the GMM parameter set Γ of [ X T Y T ] T is estimated according to historical data in the first phase. That is, the parameter set Γ of a GMM is obtained off-line. The parameter set Γ only needs to be updated when there is a need to update the historical dataset, e.g. once a day. Joint distribution of actual power outputs and forecast values

The hourly wind power data used in this paper is from the National Renewable Energy Laboratory (NREL) public dataset “ eastern wind integration data set ”. The data set consists of wind resource and plant output data for the eastern United States. The meteorological data, e.g., wind speed, were generated on the basis of two meteorological models: the Mesoscale Atmospheric Simulation System and the Weather Research and Forecasting model. Then, the wind power outputs were produced using turbine power curves of IEC Class 1 and 2. The forecast values were produced by running a statistical point forecasting tool called

SynForecast . There are three forecasting lead time horizons in the dataset: next-day, 6-hour, and 4-hour. NREL has compared the simulated data (wind power outputs and forecasts) with real measurements. The results show that the simulated data and the measurements are very similar. That is to say, although the wind power outputs and their forecasts were simulated, they were verified to well represent the stochastic nature of real-world wind power uncertainties.

A detailed description of the dataset is available in [23] and [24]. It should be noted that the NREL wind power data records do not have outliers. When other raw data sources are used, the outliers should be carefully preprocessed. Two alternative approaches are presented in [25] and [26]. The dates of the NREL data range from 20040101 to 20061231. There are 26304 data records. In this paper, the data of 2004 (8784 records) is used as a training set for modeling uncertainty, while the data of 2005 (8760records) is used as a test set for the scenario generation test.

Usually, forecasting tools have forecast errors. That is, the forecast values deviate from the actual wind power outputs. Given a point forecasting tool, t he distribution of forecast errors varies significantly with respect to different forecast

To clearly illustrate the phenomenon, this paper uses the historical data of a wind farm in Illinois and generates the histograms of the forecast errors under different forecasts. The ID of this wind farm in the NREL dataset is 4209. The installed capacity is 1014 MW. The maximum values of actual wind power and forecast values are 984MW, 983MW, respectively. The maximum forecast error is 654MW. The data is hourly. The forecast lead time is 4 hour. The 8784 data records in 2004 are used. The histograms of the forecast errors with respect to different forecast values are obtained as follows: (1) The historical data pairs of actual wind power and forecasts [ X T Y T ] T are normalized to the installed capacity. (2) The historical data pairs of [ X T Y T ] T are transformed into wind power forecast errors and forecasts [ Z T Y T ] T . (3) The historical data pairs of [ Z T Y T ] T are categorized into several bins on the basis of the forecasts. A bin consists of a central value y * and a width wd . The bin ranges from y *- wd to y *+ wd . In this paper, the number of bins is 9. It can be changed to other values, if needed. The value of y * is set to be 0.1 through 0.9. The width of each bin is 0.05. For example, the first bin is [0.05, 0.15], the second bin is [0.15, 0.25], and the last bin is [0.85, 0.95]. (4) The data pairs, whose forecast values are within the bin [ y *- wd , y *+ wd ], constitute the histograms of Z conditioned on y *. For demonstration purpose, the histograms of the 1 st , 5 th , and 9 th bins are shown in Fig.2, while the others are omitted. It can be observed that the distributions of forecast errors conditioned on y * are quite different. In [3], [27], Beta distributions are used to approximate those histograms one by one, while the authors in [4], [5], and [6] report that the accuracy of Beta still needs improvement. Forecast y *=0.1 p.u. Forecast error(p.u.) 0 0.5 1 P D F P D F Forecast error(p.u.) -0.5 0 0.5

012 Forecast error(p.u.) -1 -0.5 0 P D F Forecast y *=0.5 p.u. Forecast y *=0.9 p.u. Fig. 2

Histograms of forecast errors of the 1 st , 5 th , and 9 th bins. Others are omitted. The forecast lead time is 4 hour. The data is from a single wind farm (ID: 4209) in the NREL “eastern wind integration data set”. The data is normalized to the installed capacity. Beyond the variability, wind power uncertainties of adjacent wind farms have stochastic dependence. In generation scheduling, there are many cases wherein the outputs and forecast errors of multiple wind farms are needed. For example, when transmission limits are considered, one needs to know the output of each wind farm. So far, the only practical option for modeling a multidimensional random vector has been limited to a multivariate Gaussian distribution. However, the Gaussian assumption of wind uncertainty is not accurate [5], [6]. Modeling the dependence of non-Gaussian interdependent random variables (actual wind power outputs, forecast values, and forecast errors) with acceptable accuracy remains challenging. To handle the variability and dependence of forecast errors appropriately, this paper models forecast errors in the following sequence: first, a GMM is adopted to represent a joint distribution of actual wind power outputs and forecast values. Then, a conditional distribution of forecast errors is constructed analytically. Some factors contribute to conditional distributions of forecast errors, e.g., the forecast lead time, the technique that a forecasting tool utilizes, and wind farm locations. This paper does not discuss how those factors affect the probabilistic distributions of the forecast errors, nor is it aimed at developing a new probabilistic forecasting tool. Rather, this paper is focused on modeling conditional distributions of forecast errors, utilizing historical data of a given point forecasting tool. This paper uses the GMM to model forecast errors. The GMM parameters are estimated directly from the historical data of [ X T Y T ] T . When the forecasting tool and the lead time change, the historical data records of Y change accordingly. In such a situation, the GMM parameters should be estimated again using the new historical data of [ X T Y T ] T . In other words, different forecasting tools and lead times correspond to different historical data records of [ X T Y T ] T , which lead to different GMM parameters. For a given forecasting tool with a certain forecast lead time, as long as the historical data of [ X T Y T ] T is available, the proposed method can be used to model forecast errors. Therefore, the two factors, i.e., the forecasting tool and the lead time, do not limit the practicality of the proposed method. As far as wind farm location is concerned, the parameters of the GMMs for different wind farms might be different depending on the historical data. For example, the parameters of a GMM for wind farm A are estimated using historical data of A, while the parameters for wind farm B are estimated using historical data of B. Since the historical data of A and B are different, the estimated parameters of GMMs are different. As a result, the conditional distributions of A and B are different. In data clustering and machine learning, the GMM is known for its high level accuracy in characterizing multiple random variables [28]. Recently, several researchers have applied the GMM technique to power system uncertainty analysis and verified its superiority in modeling stochastic power outputs of renewable energy and loads [29], [30]. A GMM for an aggregated random vector, [ X T Y T ] T , is defined as a convex combination of multivariate Gaussian distribution functions with an adjustable parameter set Γ={ ω m , μ m , σ m ; m =1,…,M}. A mathematical expression of the GMM is given as follows:     M1 , , ; , m m m mm f N     XY x y x y μ  (1) M1

1, 0 m mm       (2)       T 1

12 1/2W , ; , 2π det m m m m m m m eN                           x xy y x y μ μ μ    (3) , m m mm mm m m            x xx xyy yx yy μμ μ     (4) where f XY ( x , y ) is the joint PDF of [ X T Y T ] T ; ω m is the weight coefficient; W in the denominator of Eq. (3) denotes the number of wind farms; N m (·) denotes a multivariate Gaussian distribution function, which is called the m th Gaussian component of the GMM; M is the total number of Gaussian components; μ and σ with subscripts and superscripts are parameters. Determining the parameter set Γ of the GMM is a typical parameter estimation problem. With historical data of X and Y , one can obtain the parameter set Γ using the maximum likelihood estimation (MLE) technique. A well-known algorithm is the Expectation Maximization (EM) algorithm. Many commercial software tools provide reliable off-the-shelf solvers for estimating the parameters of GMM, e.g., gmdistribution.fit in MATLAB. Guidelines about the GMM parameter estimation are available in [29]. Conditional distributions of forecast errors

In theory, a conditional distribution of X can be computed as a joint PDF of [ X T Y T ] T divided by a marginal distribution of Y . That is:      | ff f  XYX|Y Y x, yx y y (5) However, computing the marginal distribution of Y usually requires a multiple integral operation, which cannot be computed easily in practice. As a consequence, constructing the conditional distribution of X from the joint PDF of [ X T Y T ] T is not trivial. To circumvent the problem, it is helpful to use some properties of the GMMs. In the following, the details for constructing the conditional distribution from a GMM are given, followed by discussions of the possible extensions and advantages of the proposed method. Proposition 1 : if the joint PDF of a random vector [ X T Y T ] T is represented by a GMM in Eq. (1), then the marginal distribution of Y is also a GMM:     M1 ; , m m m mm f N     yy yyY y y μ  (6) Proof : Note that the random vector Y can be regarded as a linear transformation of [ X T Y T ] T :        XY Y (7) On the basis of the so-called “ linear invariance ” property of the GMM [31], the distribution of Y can be computed as follows:             M T1M1 ; ,; , m m m mm m m m mm f NN      Y yy yy y yy  (8) Proposition 2 : if the joint PDF of a random vector [ X T Y T ] T is represented by a GMM in Eq. (1), then the conditional distribution of X with respect to Y = y is also a GMM:     M1 | | ; , m m m mm f N      x y xx yX|Y x y x y μ  (9) where    M1 ; ,; , m m mm m l l l ll N N        yy yyyy yy y y μ μ   (10)     - m m m m m   x y x xy yy y y μ μ μ    (11)   m m m m m        xx y xx xy yy yx (12) Proof : Substituting Eq. (1) and Eq. (6) into Eq. (5), one can obtain the derivations as follows          

M1M1M M1 1M1 , ; ,| ; ,; , | ; ,; ,| ; , m m m ml l l l llm m m m m m ml l l l llm m m mm

Nf NN NNN                  

X|Y yy yyyy yy x y xx yyy yyx y xx y x yx y yy x yyx y μμμ μμμ   (13) In the second step of Eq. (13), an equation of a multivariable Gaussian distribution function is used, which is shown in Eq. (14). The proof of Eq. (14) can be found in [32].      , ; ,; , | ; , m m mm m m m m m

N N N     yy yy x y xx y x yy x y μμ μ   (14) Once the joint PDF of [ X T Y T ] T is represented by a GMM, the conditional distribution of X with respect to Y = y is analytically computed using Eqs. (9)-(12) Note that the forecast error Z can be regarded as a difference between the actual wind power and the corresponding forecast value. Hence, a conditional distribution of forecast errors with respect to forecast values is given as follows:   Z X Y (15)     M1 | ; , l m m ml f N       x y xx yZ|Y z y z y y | μ  (16) Wind power forecasts are often performed several times per day. For example, for the wind power at period t , forecasts are issued at t -24, t -6, and t -4. An example is provided in Fig. 3. Let X t denote the actual wind power at period t . Let Y t -24 , Y t -6 , and Y t -4 denote the forecast values of 24h-ahead, 6h-ahead, and Y t -24 , Y t -6 , and Y t -4 ) are correlated. The resulting forecast errors ( X t − Y t -24 , X t − Y t -6 , and X t − Y t -4 ) are also correlated. In order to model the interdependence of the forecast values/errors, joint distributions of forecast values/errors are needed. tt -4 t -6 t -24 Y t -6 Y t -24 Y t -4 X t Fig. 3

Forecast values with different lead times

In this subsection, let Y represent an aggregated random vector [ Y T t -24 , Y T t -6 Y T t -4 ] T and X represent X t . The GMM is used to represent the joint PDF of [ X T Y T ] T . For brevity, the joint distribution is still denoted by Eq. (1). It should be noted that the dimension of X remains at W, while the dimension of Y changes from W to 3W. The power of 2π in the denominator of Eq. (3) changes from W to 2W. Note that the joint distribution of Y = [ Y T t -24 , Y T t -6 Y T t -4 ] T is the marginal distribution of [ X T Y T ] T . On the basis of Proposition 1 , the joint distribution of Y = [ Y T t -24 , Y T t -6 Y T t -4 ] T can be computed as follows:     M1 ; , m m m mm f N     yy yyY y y μ  (17) The forecast errors of Y t -24 , Y t -6 , and Y t -4 are defined as follows: t tt tt t         X YZ X YX Y (18) Note that Z is a linear transformation of [ X T Y T ] T :             XZ Y A (19) On the basis of the “ linear invariance ” property of the GMM [31], the joint distribution of forecast errors can be immediately obtained by Eq. (20):     M T1 ; , m m m mm f N     Z z z Aμ A A  (20) This paper uses GMM to model the non-Gaussian distribution of actual wind power outputs X and forecast values Y . Note that when the Gaussian component number of the GMM is set to be M=1, the GMM will naturally degenerate to be a Gaussian distribution. As a result, the GMM is able to handle the case that the random variables [ X T Y T ] T are Gaussian. In several papers [3-7], a number of distributions are adopted to model conditional forecast errors separately at different forecast levels. Compared with these traditional methods, the proposed method has several advantages: (1) Because the traditional methods need to estimate parameters of conditional distributions at every forecast level, they have a heavy computational burden when there are multiple wind farms. For example, if there are 3 wind farms and 10 forecast levels for each wind farm, then there are 10 combinations of different forecast levels, which means 10 parameters estimations. In contrast, the proposed method estimates parameters of the joint distribution of [ X T Y T ] T only one time, while constructing conditional distributions over all forecast levels. (2) For a particular forecast level y *, traditional methods usually collect historical data of forecast errors into a bin if their forecast values are near y *, and extract a conditional distribution from the collected data of the bin. With multiple wind farms, the historical data records near y * may not be sufficient. For example, according the public dataset from NREL, the number of a 1h-resolution dataset of three wind farms (IDs: 4209, 4208, and 4468) for one year is 8784, while the number of the data records near y *=[0.5 0.5 0.5] with a width 0.05 is only 37. With the limited number of data records, it is not easy for traditional methods to construct a conditional distribution of y *. In contrast, the proposed method utilizes all 8784 historical data records to extract the joint distribution of [ X T Y T ] T , and computes the conditional distribution of y *. (3) It is important that the joint distribution is a GMM, which comes with useful attributes. For example, in additional to the distribution of forecast errors for multiple wind farms, operators also want to know the aggregated forecast errors of these wind farms across a region. Define the aggregated wind power X sum and the aggregated forecast value Y sum as follows: sumsum XY                XY (21) where are unit/zero vectors with proper dimensions. Suppose that [ X T Y T ] T is modeled by Copula. Since entries of [ X T Y T ] T are not independent, the convolution technique does not apply. Hence, computing the distributions of [ X sum Y sum ] T and [ X sum | Y sum ] is difficult [16]. In contrast, if the distribution of [ X T Y T ] T is a GMM, it can be proved that [ X sum Y sum ] T is also a GMM [32]. Its distribution function is:   M, 1 ,, ; , sum sum sum sum mX Y mm sum sum m m f x yN x y                         (22) X sum | Y sum ]. Note that the aggregated forecast error Z sum is a difference between X sum and Y sum : sum sum sum Z X Y   (23)

Finally, the conditional distribution of Z sum with respect to Y sum can be obtained using Eq. (16). (4) It is important that the conditional distribution of Z is still a GMM, which facilitates the scenario generation. Scenario generation

In this section, a theoretical foundation and sampling procedure of a multivariate Gaussian distribution is briefly introduced, followed by details of the proposed method that sample from the constructed conditional GMM.

If a random vector Z follows a multivariate Gaussian distribution N m ( μ m , σ m ), then Z can be converted into an independent Gaussian random vector Z’ , which can be easily sampled using the following linear transformation [33]:   -1 T , m m    Z C Z C C μ   (24) Based on this principle, one can generate samples S Z’ of Z’ , and obtain samples S Z of Z by Eq. (24). A standard sampling procedure of a Gaussian distribution is provided in [33]. Usually, if a random vector Z does not follow a multivariate Gaussian distribution, then a transformation, such as Eq. (24), cannot guarantee the independence. This is one of the main reasons why sampling from a joint distribution of interdependent random variables is difficult. Such an obstacle also exists for GMMs. To circumvent this issue, a new idea is to sample individually from each Gaussian component of the GMM, and then assemble those samples. The procedure of generating C total scenarios from a GMM Σ ω m N m (·) is detailed in as follows. Step 1:

For each Gaussian component N m (·) of the GMM, the number of samples to generate is assigned as ω m C total . Step 2:

Generating ω m C total samples from the multivariate Gaussian distribution N m (·). Those samples are denoted by S m Z . Step 3:

Collecting all those samples S m Z together results in a new sample set S GMM Z :   GMM 1 M  Z Z Z

S S S ，， (25) In Appendix A, it is proved that the sample set S GMM Z follows the predefined joint distribution Σ ω m N m (·). Therefore, the proposed method is an exactly accurate sampling methodology. In the scenario generation procedure, one only needs to sample from a series of Gaussian distributions. This task can be done using a standard sampling function mvnrnd in MATLAB within milliseconds. In this regard, the proposed method is a fast sampling methodology. In several references [8-13], different Copulas are adopted to model joint distributions. In terms of accuracy, the Copula methods use different dependence structures to achieve a satisfactory fitting performance. The GMM can increase the number of components and adjust the parameter set Γ to improve its accuracy. In this regard, both Copula and GMM are potentially appropriate tools for uncertainty modeling. However, in terms of scenario generation, the GMM has one significant advantage, which is explained as follows. U GaussianCopula N Φ ( N )= U Z F ( U ) = Z Z -1 Copula domainGaussian samples Generated scenarios

GMM

N Z

Generated scenariosGaussian samples

Assembling as shown in Eq. (25)

Fig. 4

Illustration of Gaussian Copula and GMM in scenarios generation

The Gaussian Copula method is used as an example. The sampling procedures of the Gaussian Copula and the GMM methods are illustrated in Fig. 4. The Gaussian Copula transforms original Gaussian samples N to the Copula domain as U , and obtains final scenarios Z using the inverse cumulative distribution function (CDF) operation. In this case, one has to construct CDFs for each random variable, and then tautologically find values of inverse CDFs by numerical search techniques. These steps are time-consuming. In contrast, the GMM method directly generates samples in Z domain. No transformation or numerical search is needed. Thus, the proposed GMM method saves computational effort and time. (1) In [35], Ke et al use a customized GMM to represent unconditional distributions of wind power. The conditional distributions of forecast errors are not discussed in [34]. (2) In [31], the authors use the GMM to model X , considering the correlation among adjacent wind farms. The conditional distributions of forecast errors are not discussed. This paper models [ X T Y T ] T by a GMM. The proposed method, which is used to construct conditional distributions of [ X | Y ] and [ Z | Y ] from the joint distribution of [ X T Y T ] T , is not reported in [31]. (3)The scenario generation method is not reported in [31] or [34]. Case study

The data of a wind farm (ID: 4209) with a 4-hour forecast lead time horizon is used for this test. The joint distribution of [ X T Y T ] T is modeled by the GMM with 20 components. Thereafter, conditional distributions of forecast errors Z with respect to forecasts Y= y * from 0.1 to 0.9 are computed. The constructed conditional distributions are compared with the histograms of conditional forecast errors. The way to obtain the histograms conditioned on different y * is detailed in Section 3.2. Comparative results are shown in Fig. 5. It can be seen that the conditional distributions constructed from the GMM are consistent with the histograms under different forecasts. Histograms GMMForecast y* =0.1 p.u. Forecast error(p.u.) 0 0.5 1 P D F Forecast error(p.u.) 0 0.5 1 P D F

024 Forecast error(p.u.) -0.5 0 0.5 P D F -0.5 0 0.5 P D F -0.5 0 0.5 P D F

012 Forecast error(p.u.) -0.5 0 0.5 P D F Forecast error(p.u.) -0.5 0 0.5 P D F

23 Forecast error(p.u.) -1 -0.5 0 P D F -1 -0.5 0 P D F Forecast y* =0.2 p.u. Forecast y* =0.3 p.u.Forecast y* =0.4 p.u. Forecast y* =0.5 p.u. Forecast y* =0.6 p.u.Forecast y* =0.7 p.u. Forecast y* =0.8 p.u. Forecast y* =0.9 p.u. Fig. 5

Histograms of forecast errors of the 9 bins. The forecast lead time is 4 hour. The data is from a single wind farm (ID: 4209) in the NREL “eastern wind integration data set”. The data is normalized to the installed capacity.

To quantify the accuracy of different methods modeling wind power forecast errors, two indices are used. The first one is the log-likelihood function value [35] for evaluating the fitting performance of historical data of [ X T Y T ] T . The second one is the root-mean-square error (RMSE) for conditional distributions of [ Z | Y ]. The RMSE is defined as follows:   n i ii pdf pdfn    (26) where pdf Cons i is the PDF of the constructed conditional distribution; pdf His i is the PDF of the histogram; n is the total number of points on the PDF curve. In this test, Gaussian Copula and t Copula are compared with the GMM. There are three reasons why this paper chooses the Gaussian/t Copulas. First, the Gaussian/t Copula methods are the most popular ones in the literature [8-13]. Second, the Gaussian/t Copula methods not only model joint distributions, but also construct conditional distributions. They match well with the scope of this paper. The method using Gaussian/t Copulas to construct conditional distributions is detailed in [10]. Third, the Gaussian/t Copula methods have standard testing functions in commercial software tools, e.g., MATLAB. Hence, it is convenient for readers to reproduce the test results in this paper. The code used in the tests follows standard guidelines from MATLAB documentation on Copulas [36]. Quantitative test results are shown in Fig. 6 and Table 1. It can be seen that: 1) The GMM is better at fitting [ X T Y T ] T than the Gaussian/t Copula methods, as the GMM increases the log-likelihood values by 22%, 18%, respectively. 2) The GMM has an advantage in representing conditional distributions than the Gaussian/t Copula methods, as its RMSEs under every forecast level from 0.1 to 0.9 p.u. are the lowest. Gaussian Copula t Copula GMM

Forecast (p.u.) R M S E Fig. 6

Accuracy comparison for modeling conditional distributions

Table 1

Fitting test of modeling [ X T Y T ] T Methods Log-likelihood function values(10 ) Gaussian Copula 5.6905 t Copula 5.9104 GMM 6.9614 The data of three wind farms (IDs: 4209, 4208, and 4468) with a 4-hour forecasting lead time horizon is used for this test. Since the dimensions of [ X | Y ] are 3, it is difficult to use a picture like Fig. 5 to visualize the fitting performance. However, if the joint distribution of [ X T Y T ] T is represented by the GMM, it is feasible to analytically compute the conditional distributions of the aggregated forecast errors Z sum with respect to the aggregated forecast values Y sum using the derivations in Section 4.3. On the basis of this idea, this paper uses the aggregation Z sum | Y sum to visualize the fitting performance. Results are shown in Fig. 7. Note that if [ X T Y T ] T is modeled by Copula methods, it is difficult to compute the conditional distributions of the aggregated wind power forecast errors. Hence, results of Copula methods are not provided. From Fig. 7, it can be seen that the conditional distributions of the aggregated forecast errors under different aggregated forecasts fit the histograms well, indicating that the GMM has a satisfactory level of accuracy. Histograms GMM

Forecast error(p.u.) P D F Forecast error(p.u.) -1 0 1 2 3 P D F P D F Forecast error(p.u.) -1 0 1 2 P D F Forecast error(p.u.) -2 -1 0 1 2 P D F Forecast error(p.u.) -2 -1 0 1 P D F Forecast error(p.u.) -2 -1 0 1 P D F Forecast error(p.u.) -3 -2 -1 0 1 P D F Forecast error(p.u.) -3 -2 -1 0 P D F Forecast y* =0.3 p.u. Forecast y* =0.6 p.u. Forecast y* =0.9 p.u.Forecast y* =1.2 p.u. Forecast y* =1.5 p.u. Forecast y* =1.8 p.u.Forecast y* =2.1 p.u. Forecast y* =2.4 p.u. Forecast y* =2.7 p.u. Fig. 7

Histograms of the aggregated forecast errors. The forecast lead time is 4 hour. The data is from three single wind farms (ID: 4209, 4208, and 4468) in the NREL “eastern wind integration data set”. The data is normalized to the installed capacities.

Furthermore, this paper provides an illustrative example in Fig. 8 for the joint distributions of forecast errors on the condition of given forecast values: 0.2 p.u . for WF1, 0.4 p.u. for WF2, and 0.4 p.u. for WF3. It can be seen that conditional forecast errors are interdependent among different wind farms. Results of Fig. 8 are consistent with those reported in [10]. Fig. 8 . In addition to the NREL data, in this paper, the proposed modeling method is also tested with real-world data from Bonneville Power Administration (BPA) [37]. The data is hourly. The forecast lead time is 4 hour. The dates of the BPA data range from 20160101 to 20161231. The maximum actual wind power output is 4493MW. The maximum forecast wind power is 4491MW. The data is rated to 4500MW. The test results are shown in Fig. 9. It can be seen that the GMM-based conditional distributions coincide with the histograms.

Histograms GMMForecast y* =0.1 p.u. Forecast error(p.u.) Forecast error(p.u.) Forecast error(p.u.) Forecast error(p.u.) Forecast error(p.u.) Forecast error(p.u.)

Forecast error(p.u.) Forecast error(p.u.) Forecast error(p.u.)

Forecast y* =0.2 p.u. Forecast y* =0.3 p.u.Forecast y* =0.4 p.u. Forecast y* =0.5 p.u. Forecast y* =0.6 p.u.Forecast y* =0.7 p.u. Forecast y* =0.8 p.u. Forecast y* =0.9 p.u.-0.2 0 0.2 P D F P D F P D F -0.2 0 0.2 P D F -0.2 0 0.2 P D F -0.2 0 0.2 P D F -0.2 0 0.2 P D F -0.2 0 0.2 P D F -0.2 0 0.2 P D F Fig. 9

Histograms of the BPA forecast errors. The forecast lead time is 4 hour. The data is from BPA. The data is normalized to 4500MW. (1) Different forecast lead times There are test results of different forecast lead times, e.g., 6-hour, the next-day. They are provided in Appendix B. The test results show that the GMM has a satisfactory performance. (2) Different wind farm locations The proposed method is tested using the data of four wind farms, which are located in different places: Texas (ID: 81), Oklahoma (ID: 764), Minnesota (ID: 2943), and Iowa (ID: 391), respectively. Their installed capacities are 1118, 1120, 1107, and 1001 MW. The forecast lead time is 4 hour. The data records are normalized to the installed capacities. The test results are provided in Appendix C. According to Fig C1, the GMMs match well with the histograms of the four wind farms. That is to say, the GMM can represent wind forecast error uncertainties in different locations.

In Section 4.2, the proposed method is modified to compute the joint distributions of forecasts and the forecast errors with different lead times. The test results of the possible extensions are provided in Appendix D.

In scenario generation, the proposed method is compared with three other methods in the literature. The efficiency and effectiveness of the four methods are tested, respectively. As far as the efficiency is concerned, different methods are used to generate 1000, 5000, 10000, and 50000 scenarios. The tests are implemented on a Core-i5 2.39-GHz processor. The test results are listed in Table 2. It can be seen that the proposed method greatly improves the sampling efficiency with respect to the Copula methods, especially when the number of scenarios is large. The efficient performance benefits from the GMM-based sampling method that generates scenarios directly in Z domain. As a comparison, the Copula methods have to transform original scenarios from U domain to Z domain by numerically searching inverse CDFs. The transformations are not time-saving. Table 2

Efficiency comparison Number of scenarios Time cost (seconds) 1000 5000 10000 50000 t Copula method 3.83 19.74 41.58 217.47 Gaussian Copula method 3.91 18.64 38.52 203.81 ARMA [35] 0.0040 0.0052 0.0066 0.0353 The proposed method 0.0023 0.0032 0.0042 0.0263

Remark 1 : The ARMA method [38] is a popular one used in the literature to generate time series data of wind power uncertainty. However, unlike the GMM and Copula methods, the ARMA method cannot provide probabilistic distributions of forecast errors for operators. Therefore, it does not match the scope of this paper. It is provided in the Table 2 for reference purpose only. What’s more, although the ARMA method is more efficient than the Copula methods, it does not outperform the proposed GMM method.

Remark 2 : in the literature, there are some other methods for wind power scenario generation [20], [39]. Because these methods require to compute the inverse CDFs in the scenario generation procedure, they suffer from the same problem as the Copula methods.

Remark 3 : If the number of wind farms increases, the time cost will increase, too. A test is conducted to show the time costs of different scenario generation methods with different number of wind farms. According to the results shown in Fig. 10, it takes about 750s for the Gaussian Copula method to generate 50000 scenarios for 10 wind farms. As a comparison, the GMM-based method still costs very little time (0.07232s).

GMM (×10000)

Wind farm number

Gaussian Copula T i m e c o s t ( s ) Fig. 10

Time costs of scenario generation with different wind farm number. The time cost of the GMM method is multiplied by 10000.

When a larger amount of scenarios are generated, one wants to retain K representative scenarios. The K-means clustering [40] method is used for scenario reduction in this paper. In order to evaluate the effectiveness of generated scenarios, a 52-week test is performed. In every week, two indices, namely, mean absolute error (MAE) and variance (VAR), are used to assess the quality of the generated scenarios:  

K t t tk kt k z y x        (27) Fig. 11

MAEs of different methods in 52 weeks.

Fig. 12

VARs of different methods in 52 weeks.  

168 21 1

K t t tk kt k z y x        (28) where t stands for period. There are 168 periods in a week. z k is the forecast error in scenario k . π k is the probability of scenario k . x is the measured wind power. y is the forecast value. Both MAE and VAR are negatively oriented quantification criteria: the smaller the better. Table 3

Performance of the one-year test Methods MAE VAR t Copula method 0.5168 0.2982 Gaussian Copula method 0.5087 0.2814 ARMA [35] 0.4121 0.1654 The proposed method 0.3542 0.1379

The test results of MAEs and VARs in every week are shown in Figs. 11 and 12. Overall, the MAE and VAR of the proposed method are smaller than those of the other three methods, indicating the scenarios generated by the proposed method can better represent real uncertainties. Further, this paper compute MAEs and VARs for the whole year 2005. The test results are listed in Table 3. In terms of the MAE, the proposed method has 31%, 30%, and 14% decrements with respect to the t Copula, Gaussian Copula, and ARMA method, respectively. As far as the VAR is concerned, the proposed method has 53%, 51%, and 17% decrements. The rest results further validate that the proposed method outperforms other methods in a long-time test. Conclusion

This paper develops a method that can characterize non-Gaussian and interdependent wind power forecast errors under different forecast conditions. The conditional distribution is modeled as a GMM, enhancing the scenario generation. The proposed method has one limitation. In order to model [ X T Y T ] T accurately, the number of Gaussian components of a GMM may be large. It is necessary to find a way to reduce the component number without undermining the accuracy. In [29], [30], two possible solutions are discussed. For future applications, the proposed methodology could be used to generate scenarios for various kinds of uncertain analyses, such as stochastic optimizations in the day-ahead scheduling and Monte-Carlo-based reliability analysis. Acknowledgements

This work was supported in part by the Foundation for Innovative Research Groups of National Natural Science Foundation of China under Grant 51621065, and in part by the Special Fund of the National Basic Research Program (973) of China under Grant 2013CB228201. The authors would like to thank Dr. Chen-Ching Liu for his valuable discussions and suggestions. Zhiwen Wang would like to thank China Scholarship Council for supporting his study at Washington State University.

Appendix A: Proof of the sampling method

The proof for a bivariate case is provided in the following. Those derivations are applicable to a multivariate case without major differences. As far as the m th Gaussian component of a GMM is concerned, there are ω m C total samples, denoted as S m Z , generated from the m th multivariate Gaussian distribution N m (·). Without loss of generality, for arbitrary given values z and z , it is assumed that there are C m samples, out of ω m C total , falling into the area [-∞ z ]×[-∞ z ]. An illustration is shown in Fig. A1. ( z , z ) v v For every Gaussian component N ( · ), there are C samples, out of ω C , falling into the dash line area m m total m Fig. A1 C m samples, out of ω m C total , fall into the area [-∞ z ]×[-∞ z ] Since those samples are generated from the m th multivariate Gaussian distribution N m (·), the following equation holds:   total 1 12 2 lim , m mC m v zv z C N v v dv dvC      (A1) Consider a new sample set S GMM Z , consisting of all samples S m X generated from each Gaussian component. The total number of samples of S GMM Z is C total . The number of the samples of S GMM Z falling into the area [-∞ z ]×[-∞ z ], denoted as C GMM , is the summation of all C m . In other words, Eq. (A2) holds: total total total GMM 1 1total total total lim lim lim

M m Mm mC C Cm

CC CC C C        (A2) Substituting (A1) into (A2) yields:     total 1 12 21 12 21 12 2

GMM 1 2 1 21total 1 2 1 21 1 2 1 21 lim ,,,

M m mC m v zv zM m mm u zu zM m mmv zv z

C N v v dv dvC N v v dv dvN v v dv dv                           (A3) Equation (A3) indicates that the new sample set S GMM Z follows the GMM Σ ω m N m (·). Appendix B: Different lead times

GMM

Forecast error(p.u.) P D F Forecast error(p.u.) P D F Forecast error(p.u.) -0.5 0 0.5 P D F Forecast error(p.u.) -0.5 0 0.5 P D F Forecast error(p.u.) -0.5 0 0.5 P D F Forecast error(p.u.) -0.5 0 0.5 P D F Forecast error(p.u.) -0.5 0 0.5 P D F Forecast error(p.u.) -1 -0.5 0 P D F Forecast error(p.u.) -1 -0.5 0 P D F HistogramsForecast y* =0.1 p.u. Forecast y* =0.2 p.u. Forecast y* =0.3 p.u.Forecast y* =0.4 p.u. Forecast y* =0.5 p.u. Forecast y* =0.6 p.u. Forecast y* =0.7 p.u. Forecast y* =0.8 p.u. Forecast y* =0.9 p.u. Fig. B1

Histograms of forecast errors of the 9 bins. The forecast lead time is 6 hour. The data is from a single wind farm (ID: 4209) in the NREL “eastern wind integration data set”. The data is normalized to the installed capacity.

GMM

Histograms of forecast errors of the 9 bins. The forecast lead time is 24 hour. The data is from a single wind farm (ID: 4209) in the NREL “eastern wind integration data set”. The data is normalized to the installed capacity.

Appendix C: Different wind farm locations -0.4 -0.2 0 0.2 0.4 P D F P D F -0.4 -0.2 0 0.2 0.4 P D F P D F -0.4 -0.2 0 0.2 0.4 P D F -0.4 -0.2 0 0.2 0.4 P D F P D F P D F -0.4 -0.2 0 0.2 0.4 0.6 P D F Forecast y* =0.1 p.u. Forecast error(p.u.) Forecast error(p.u.) Forecast error(p.u.)

Forecast y* =0.2 p.u. Forecast y* =0.3 p.u.Forecast y* =0.1 p.u. Forecast error(p.u.) Forecast error(p.u.) Forecast error(p.u.)

Forecast y* =0.2 p.u. Forecast y* =0.3 p.u. Forecast error(p.u.) Forecast error(p.u.) Forecast error(p.u.)

Forecast y* =0.1 p.u. Forecast y* =0.2 p.u. Forecast y* =0.3 p.u. -0.2 0 0.2 0.4 0.60

34 -0.4 -0.2 0 0.2 0.4 0.600.5 P D F P D F P D F Texas (ID:81)Oklahoma (ID:764)Minnesota (ID:2943)

Iowa (ID:391)

Fig. C1

Histograms of forecast errors of the 1 st , 2 nd , and 3 rd bins. Others are omitted. The forecast lead time is 4 hour. The data is normalized to the installed capacity. Appendix D: Joint distributions of forecasts and forecast errors with different lead times

In this test, the data is from a single wind farm (ID: 4209). The data is normalized to the installed capacity. (1) Joint distributions of forecasts with different lead times

The joint distribution of forecasts Y = [ Y T t -24 , Y T t -6 Y T t -4 ] T with different lead times is obtained using Eq. (17). To demonstrate the effectiveness of the proposed modeling method, this paper compares the cumulative distribution function of Eq. (17) with the empirical cumulative distribution of the historical data. The cumulative distribution function of Eq. (17) is obtained as follows:     M1 ; , m m m mm F N d             yy yyY v y y v v μ  (D1) where v is the integral variable; y is the upper limit. The multiple integral in Eq. (D1) can be computed using the mvncdf function in MATLAB. The empirical cumulative distribution of the historical data is obtained as follows:     n i ii F In    Y y y , y (D2)   ii i I  y yy , y (D3) where n is the total number of the historical data records; 𝒚̃ 𝑖 denotes the i th data record. I i ( 𝒚̃ 𝑖 , 𝒚 ) is an indicator function: if every entry of 𝒚̃ 𝑖 is smaller than that of 𝒚 , I i ( 𝒚̃ 𝑖 , 𝒚 ) is 1; otherwise, I i ( 𝒚̃ 𝑖 , 𝒚 ) is 0. This paper compares 𝐹 𝒀 (𝒚) and 𝐹̃ 𝒀 (𝒚) in 1000 different values of 𝒚 . The test results are shown in Fig. D1. It can be observed that the GMM-based cumulative distribution matches well with the empirical cumulative distribution. The maximum error is 0.0087. Such results indicate that the joint distribution in Eq. (17) is a good representation for the wind power forecasts. Serial number of y C u m u l a ti v e d i s t r i bu ti on EmpiricalGMM Error

Fig. D1

Cumulative distributions of forecasts in 1000 different 𝒚 . (2) Joint distributions of forecast errors with different lead times The joint distributions of forecast errors with different lead times are obtained using Eqs. (18)-(20). The obtained distribution is also compared with the empirical cumulative distribution of the historical data of forecast errors. The test results are shown in Fig. D2.

Serial number of y C u m u l a ti v e d i s t r i bu ti on EmpiricalGMM Error

Fig. D2

Cumulative distributions of forecast errors in 1000 different 𝒚 . It can be seen that the GMM-based joint distribution of forecast errors coincides with the historical data, indicating this joint distribution can well represent the uncertainty of forecast errors with different lead times.

References [1]

H. Wang, G. Li, G. Wang, J. Peng, H. Jiang, Y. Liu, Deep learning based ensemble approach for probabilistic wind power forecasting, Appl Energy 2017; 188: 56-70 [2]

F. Liu, Z. Bie, S. Liu, T. Ding, Day-ahead optimal dispatch for wind integrated power system considering zonal reserve requirements, Appl Energy 2017; 188: 399-408. [3]

H. Bludszuweit, J. A. Dominguez-Navarro, A. Llombart. Statistical analysis of wind power forecast error. IEEE Trans Power Syst 2008; 23(3): 983-991. [4]

S. Tewari, C. J. Geyer, N. Mohan. A statistical model for wind power forecast error and its application to the estimation of penalties in liberalized markets. IEEE Trans Power Syst 2011; 26(4): 2031-2039. [5]

K. Bruninx, E. Delarue. A statistical description of the error on wind power forecasts for probabilistic reserve sizing. IEEE Trans Sustain Energy 2014; 5(3): 995-1002. [6]

Z. S. Zhang, Y. Z. Sun, D. W. Gao, J. Lin, L. Cheng. A versatile probability distribution model for wind power forecast errors and its application in economic dispatch. IEEE Trans Power Syst 2013; 28(3): 3114-3125. [7]

N. Menemenlis, M. Huneault, A. Robitaille. Computation of dynamic operating balancing reserve for wind power integration for the time-horizon 1-48 hours. IEEE Trans Sustain Energy 2012; 3(4): 692–702. [8]

G. Papaefthymiou, P. Pinson. Modeling of spatial dependence in wind power forecast uncertainty. in Proc. 10th Int. Conf. Probabilistic Methods Applied to Power Systems, Rincon, Puerto Rico, May 2008. [9]

S. Hagspiel, A. Papaemannouil, M. Schmid, G. Andersson, Copula-based modeling of stochastic wind power in Europe and implications for the Swiss power grid, Appl Energy 2012; 96: 33-44. [10]

N. Zhang, C. Kang, Q. Xia, J. Liang. Modeling conditional forecast error for wind power in generation scheduling. IEEE Trans. Power Syst 2014; 29(3): 1316-1324. [11]

F. Chen, G. H. Huang, Y. R. Fan, J. P. Chen, A copula-based fuzzy chance-constrained programming model and its application to electric power generation systems planning, Appl Energy 2017; 187: 291-309,. [12]

H. V. Haghi, S. Lotfifard. Spatiotemporal modeling of wind generation for optimal energy storage sizing. IEEE Trans Sustain Energy 2015; 6(1): 113-121. [13]

G. Diaz, J. G. Aleixandre, J. Coto. Wind power scenario generation through state-space specifications for uncertainty analysis of wind power plants, Appl Energy 2016; 162: 21-30. [14]

A. J. Conejo, M. Carrion, J. M. Morales, Decision making under uncertainty in electricity markets. New York: Springer; 2010. [15]

H. Quan, D. Srinivasan, A. M. Khambadkone, A. Khosravi, A computational framework for uncertainty integration in stochastic unit commitment with intermittent renewable energy sources, Appl Energy 2015; 152: 71-82. [16]

S. M. Ross, Introduction to probability models. 11th ed. San Diego: Elsevier; 2015. [17]

R. Y. Rubinstein and D. P. Kroese, Simulation and the Monte Carlo method, California: Wiley; 2007. [18]

L. Devroye, Non-uniform random variate generation. New York: Springer; 1986. [19]

Y. Chen, J. Wen, and S. Cheng. Probabilistic load flow method based on Nataf transformation and Latin hypercube sampling. IEEE Trans Sustain Energy 2013; 4(2): 294-301. [20]

X. Y. Ma, Y. Z. Sun, and H. L. Fang. Scenario generation of wind power based on statistical uncertainty and variability. IEEE Trans Sustain Energy 2013; 4(4): 894-904. [21]

E. Xydas, M. Qadrdan, C. Marmaras, L. Cipcigan, N. Jenkins, H. Ameli, Probabilistic wind power forecasting and its application in the scheduling of gas-fired generators, Appl Energy 2017; 192: 382-394. [22]

J. M. Morales, R. Mínguez, A. J. Conejo, A methodology to generate statistically dependent wind speed scenarios, Appl Energy 2010; 87(3): 843-855. [23]

C. Draxl, B. M. Hodge, A. Clifton, and J. McCaa. Overview and meteorological validation of the wind integration national dataset toolkit. Technical Report, NREL/TP-5000-61740. National Renewable Energy Laboratory. Golden, CO. 2015. [24]

L. Zheng, W. Hu and Y. Min, “Raw wind data preprocessing: a data-mining approach,” IEEE Trans Sustain Energy 2016; 6(1): 11-19. [26]

M. Schlechtingen and I. F. Santos, “Comparative analysis of neural network and regression based condition monitoring approaches for wind turbine fault detection,” Mech. Syst. Signal Process., 2011; 6(1): 1849-1875. [27]

S. Bofinger, A. Luig, and H. G. Beyer. Qualification of wind power forecasts. in Proc Global Wind Power Conference, 2002. [28] R. Li, Z. Wang, C. Gu, F. Li, H. Wu, A novel time-of-use tariff design based on Gaussian Mixture Model, Appl Energy 2016; 162: 1530-1536. [29]

R. Singh, B. C. Pal, and R. A. Jabr. Statistical representation of distribution system loads using Gaussian mixture model. IEEE Trans Power Syst; 2010; 25(1): 29-37. [30]

G. Valverde and A.T. Saric. Probabilistic load flow with non-Gaussian correlated random variables using Gaussian mixture models. IET Gen, Trans & Distri 2012; 6(7): 701–709. [31]

Z. Wang, C. Shen, F. Liu, X. Wu, C. C. Liu, F. Gao. Chance-constrained economic dispatch with non-Gaussian correlated wind power uncertainty. IEEE Trans Power Syst. early access article. [32]

M. L. Eaton, Multivariate Statistics: A vector space approach. Beachwood: Institute of mathematical statistics; 2007. [33]

G. A. Holton, Value at risk: theory and practice. San Diego: Academic press; 2003. [34]

D. Ke, C. Y. Chung, and Y. Sun. A novel probabilistic optimal power flow model with uncertain wind power generation described by customized Gaussian mixture model. IEEE Trans Sustain Energy 2016; 7(1): 200-212. [35]

A. Hald. On the history of maximum likelihood in relation to inverse probability and least squares. Statistical Science 1999; 14(2): 214-222. [36]

Wind power data of Bonneville Power Administration, 2016. [Online]. Available: https://transmission.bpa.gov/Business/operations/Wind/ [38]

R. Karki, P. Hu, and R. Billinton. A simplified wind power generation model for reliability evaluation. IEEE Trans. Energy Conversion 2006; 21(2): 533-540. [39]