Rule extraction based on extreme learning machine and an improved ant-miner algorithm for transient stability assessment

Abstract

To overcome the poor understandability of pattern recognition-based transient stability assessment (PRTSA) methods, a new rule extraction method based on extreme learning machine (ELM) and an improved Ant-miner (IAM) algorithm is presented in this paper. First, the basic principles of ELM and the Ant-miner algorithm are introduced. Then, based on the selected optimal feature subset, an example sample set is generated by the trained ELM-based PRTSA model. Finally, a set of classification rules is obtained by the IAM algorithm to replace the original ELM network. The novelty of this proposal is that transient stability rules are extracted, by using the IAM algorithm, from an example sample set generated by the trained ELM-based transient stability assessment model. The effectiveness of the proposed method is shown by the application results on the New England 39-bus power system and a practical power system, the southern power system of Hebei province.


RESEARCH ARTICLE


Yang Li*, Guoqing Li, Zhenhao Wang

School of Electrical Engineering, Northeast Dianli University, Jilin, Jilin, P.R. China

* [email protected]

Introduction

Transient stability is concerned with the ability of the power system to maintain synchronism when subjected to a severe disturbance, such as a short circuit on a transmission line [1]. Transient stability assessment (TSA) has been recognized as an important issue in ensuring the secure operation of power systems. With the interconnection of large-scale power grids, electricity market reform and the growing presence of large-scale intermittent renewable energy, the dynamic behaviors of power systems are becoming more complex and more difficult to control, and the consequences of transient instability are becoming more serious [2]. The available TSA methods, such as time domain simulation methods [3], energy function based methods [4] and the extended equal-area criterion [5], cannot meet the demands of online applications required by modern power systems. With the rapid development of computational intelligence, the pattern recognition-based TSA (PRTSA) methods have shown much potential for on-line application to power systems [6 –

PLOS ONE | DOI:10.1371/journal.pone.0130814 June 19, 2015

OPEN ACCESS

Citation: Li Y, Li G, Wang Z (2015) Rule Extraction Based on Extreme Learning Machine and an Improved Ant-Miner Algorithm for Transient Stability Assessment. PLoS ONE 10(6): e0130814. doi:10.1371/journal.pone.0130814

Academic Editor: Daoqiang Zhang, Nanjing University of Aeronautics and Astronautics, CHINA

Received: September 7, 2014

Accepted: May 26, 2015

Published: June 19, 2015

Copyright: © 2015 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All relevant data are within the paper.

Funding: The authors received no specific funding for this work.

Competing Interests: The authors have declared that no competing interests exist.

acquisition of potentially useful information. They have good prospects in the field of on-line security and stability analysis of power systems.

However, it has also been observed that current research on PRTSA mainly focuses on applying machine learning techniques such as artificial neural networks (ANN) and support vector machines (SVM) to classify the stability of power systems [8, 9]. Although these PRTSA methods can perform well for TSA, the black-box nature of the generated classifiers makes them rather incomprehensible and opaque to humans [10, 11], since the predictive models are described as complex mathematical functions. This opacity hinders the understanding and analysis of the mechanism behind power system stability problems [12], and prevents the methods from being used in practical applications.

Rule extraction is an effective way to make the "black-box" PRTSA approaches comprehensible; its purpose is to express the knowledge that is implicit in the learning machine in an easily understandable way. Unfortunately, there has been little research on this issue. In [10, 11], a decision tree is employed to extract rules, but the assessment results are sensitive to the construction of the samples, and the generalization ability and robustness are poor. In [12], IF-THEN rules are extracted from a trained multilayer perceptron (MLP) by scrutinising the weights between the hidden and output layers and the weights between the input and hidden layers; however, the resulting rules are not fine enough.

Extreme learning machine (ELM), proposed by Huang [13, 14], is a new learning scheme for single hidden layer feedforward networks (SLFNs). Compared to other traditional machine learning techniques such as ANN and SVM, ELM has better generalization performance with a much faster learning speed, and it has been widely used in many engineering applications [15, 16].

However, the knowledge acquired by ELM is contained in the connection weights, and the reasoning process cannot be given a clear explanation, which limits further application of ELM in data mining and engineering. Representing the knowledge contained in ELM in the form of rules would enhance the understandability and interpretability of the reasoning process, but so far no literature on this issue has been reported.

The Ant-miner algorithm is a new rule mining algorithm [17] with good robustness and ability to find optimal solutions, which has been successfully applied in engineering applications [18]. Meanwhile, wide-area measurement systems (WAMS) make it possible to obtain synchronized real-time state information, and this brings new ideas and opportunities to transient stability assessment and prediction [19 – "black-box" PRTSA models.

The remainder of this paper is structured as follows. First, the basic principles of ELM and the Ant-miner algorithm are briefly introduced. Details of the proposed rule extraction scheme using ELM and IAM for TSA are presented next. Application of the proposed scheme is demonstrated using the New England 39-bus power system and a practical power system, the southern power system of Hebei province, and finally the conclusions are drawn.

Principle of ELM and Ant-miner Algorithm

Principle of ELM

For $N$ arbitrary distinct samples $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^m$, where $x_i = [x_{i1}, x_{i2}, \cdots, x_{in}]^T \in \mathbb{R}^n$ is the feature vector and $y_i = [y_{i1}, y_{i2}, \cdots, y_{im}]^T \in \mathbb{R}^m$ is the target vector, standard SLFNs with $L$ hidden nodes can be mathematically modeled as

$$\sum_{i=1}^{L} \beta_i G(w_i \cdot x_j + d_i) = y_j, \quad j = 1, \cdots, N \tag{1}$$

where $w_i = [w_{i1}, w_{i2}, \cdots, w_{in}]^T$ is the weight vector connecting the $i$th hidden node and the input nodes, $\beta_i = [\beta_{i1}, \beta_{i2}, \cdots, \beta_{im}]^T$ is the weight vector connecting the $i$th hidden node and the output nodes, $d_i$ is the threshold of the $i$th hidden node, and $G(\cdot)$ is the activation function.

Eq (1) can be written compactly as

$$H\beta = Y \tag{2}$$

where $H$ is the hidden layer output matrix of the neural network,

$$H(w_1, \cdots, w_L, d_1, \cdots, d_L, x_1, \cdots, x_N) = \begin{bmatrix} G(w_1 \cdot x_1 + d_1) & \cdots & G(w_L \cdot x_1 + d_L) \\ \vdots & \cdots & \vdots \\ G(w_1 \cdot x_N + d_1) & \cdots & G(w_L \cdot x_N + d_L) \end{bmatrix}_{N \times L}$$

the only parameter that needs to be tuned is $\beta = [\beta_1, \cdots, \beta_L]^T$, and $Y = [y_1, \cdots, y_N]^T$.

ELM minimizes the training error as well as the norm of the output weights [13, 14]:

$$\text{Minimize: } \|H\beta - Y\| \text{ and } \|\beta\| \tag{3}$$

The minimal norm least square solution of (2) is as follows.

$$\hat{\beta} = H^{\dagger} Y \tag{4}$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the matrix $H$.

Given a training set, the activation function and the number of hidden nodes, the learning process of ELM is as follows: (a) randomly generate the parameters of the hidden layer nodes $(w_i, d_i)$, $i = 1, \cdots, L$; (b) calculate the hidden layer output matrix $H$; (c) calculate the output weights $\beta$.

Introduction of Ant-miner algorithm
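As an aside before the Ant-miner description, the ELM learning steps (a) to (c) can be illustrated with a minimal NumPy sketch. It is only an illustration under stated assumptions, not the authors' MATLAB implementation: the sigmoid activation, the uniform range [-1, 1] for the random hidden parameters, the toy data, and the names `hidden_output`, `train_elm` and `predict_elm` are all assumptions introduced here.

```python
import numpy as np


def hidden_output(X, W, d):
    """Hidden-layer output matrix H (N x L); a sigmoid is assumed for G."""
    return 1.0 / (1.0 + np.exp(-(X @ W + d)))


def train_elm(X, Y, n_hidden, rng):
    """Return (W, d, beta) following learning steps (a)-(c)."""
    # (a) randomly generate the hidden-node parameters (w_i, d_i)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    d = rng.uniform(-1.0, 1.0, size=n_hidden)
    # (b) calculate the hidden-layer output matrix H
    H = hidden_output(X, W, d)
    # (c) output weights: minimum-norm least-squares solution pinv(H) @ Y
    beta = np.linalg.pinv(H) @ Y
    return W, d, beta


def predict_elm(X, W, d, beta):
    """Class label in {-1, +1}, taken from the sign of the network output."""
    return np.where(hidden_output(X, W, d) @ beta >= 0.0, 1.0, -1.0)


# Toy two-class data (hypothetical, not taken from the paper)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y = np.where(X[:, 0] + X[:, 1] > 0.0, 1.0, -1.0).reshape(-1, 1)

W, d, beta = train_elm(X, Y, n_hidden=20, rng=rng)
train_accuracy = float(np.mean(predict_elm(X, W, d, beta) == Y))
```

Only `beta` is fitted; `W` and `d` keep their random values, which is why ELM training reduces to a single linear least-squares solve.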

The Ant-miner algorithm maps the process of rule extraction onto the foraging process of ants, and the optimal path is chosen as the optimal classification rule [17]. The specific steps are described as follows.

Step 1: Initialization of pheromones. The initial pheromone of the path $\tau_{ij}$ is set as follows.

$$\tau_{ij}(t = 0) = \frac{1}{\sum_{i=1}^{a} b_i} \tag{5}$$

where $t$ is the iteration number, $a$ is the number of attributes, and $b_i$ is the number of values in the domain of the $i$th attribute. The pheromone level is updated in two phases: evaporation and reinforcement. Evaporation is accomplished by a pheromone evaporation rate $\rho$, and reinforcement of the pheromone trail is only applied to the best ant's path.

Step 2: Selection of attributes. Let $term_{ij}$ be a rule condition of the form $A_i = V_{ij}$, where $A_i$ is the $i$th attribute and $V_{ij}$ is the $j$th value of the domain of $A_i$. The probability that $term_{ij}$ is selected to be added to the current partial rule is determined by $P_{ij}$:

$$P_{ij}(t) = \frac{\tau_{ij}(t) \cdot \eta_{ij}}{\sum_{i=1}^{a} \sum_{j=1}^{b_i} \left( \tau_{ij}(t) \cdot \eta_{ij} \right)} \tag{6}$$

where $\tau_{ij}(t)$ is the pheromone of $term_{ij}$ in the $t$th iteration, and the heuristic function $\eta_{ij}$ is represented by (7), (8).

$$\eta_{ij} = \frac{\log_2(k) - InfoT_{ij}}{\sum_{i=1}^{a} \sum_{j=1}^{b_i} \left( \log_2(k) - InfoT_{ij} \right)} \tag{7}$$

$$InfoT_{ij} = -\sum_{e=1}^{k} \left[ \frac{freqT_{ij}^{e}}{|T_{ij}|} \right] \cdot \log_2 \left[ \frac{freqT_{ij}^{e}}{|T_{ij}|} \right] \tag{8}$$

where $k$ is the number of categories, $InfoT_{ij}$ is the information entropy of $term_{ij}$, $|T_{ij}|$ is the number of samples in the partition $T_{ij}$, and $freqT_{ij}^{e}$ is the number of samples of class $e$ in $T_{ij}$.

Step 3: Rule pruning. To counter over-fitting, the obtained rules need to be pruned. The effectiveness of a rule, $Q$, is determined by (9).

$$Q = \frac{TP}{TP + FN} \cdot \frac{TN}{TN + FP} \tag{9}$$

where TP (true positives) is the number of cases covered by the rule that have the class predicted by the rule, FP (false positives) is the number of cases covered by the rule that have a class different from the class predicted by the rule, FN (false negatives) is the number of cases that are not covered by the rule but that have the class predicted by the rule, and TN (true negatives) is the number of cases that are not covered by the rule and that do not have the class predicted by the rule.

Step 4: Pheromone updating. The pheromone is updated according to (10).

$$\tau_{ij}(t + 1) = (1 - \rho)\,\tau_{ij}(t) + \left( \frac{Q}{1 + Q} \right) \cdot \tau_{ij}(t) \tag{10}$$

where $\rho$ is the pheromone evaporation rate.

Step 5: Choose the best rule $R_{best}$ and add it to the rule set.

Step 6: Delete the training samples covered by the existing rules.

Step 7: Repeat Steps 1 to 6 until the number of remaining training samples is no greater than the preset maximum number of uncovered samples.

Rule Extractions for TSA
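Before turning to the proposed scheme, the Ant-miner quantities of Steps 1 to 4 (pheromone initialization, the entropy-based heuristic of (7) and (8), term selection probability, rule quality (9) and pheromone update (10)) can be made concrete with a small pure-Python sketch. The toy class counts, the term keys and all function names are hypothetical and chosen only for illustration; degenerate cases such as empty partitions are not handled.

```python
import math


def entropy(class_counts):
    """InfoT_ij of (8): entropy of the class distribution inside T_ij."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)


def entropy_heuristic(counts_by_term, k):
    """eta_ij of (7): log2(k) - InfoT_ij, normalised over all terms."""
    raw = {term: math.log2(k) - entropy(c) for term, c in counts_by_term.items()}
    norm = sum(raw.values())
    return {term: value / norm for term, value in raw.items()}


def selection_probabilities(tau, eta):
    """P_ij: proportional to tau_ij * eta_ij over all terms."""
    weights = {term: tau[term] * eta[term] for term in tau}
    norm = sum(weights.values())
    return {term: w / norm for term, w in weights.items()}


def rule_quality(tp, fp, fn, tn):
    """Q of (9): sensitivity multiplied by specificity."""
    return (tp / (tp + fn)) * (tn / (tn + fp))


def update_pheromone(tau_ij, rho, q):
    """Update of (10): evaporation term plus quality-based reinforcement."""
    return (1.0 - rho) * tau_ij + (q / (1.0 + q)) * tau_ij


# Hypothetical toy problem: 2 attributes with 2 values each, k = 2 classes.
# Each list holds the per-class sample counts of the partition T_ij.
counts_by_term = {
    ("A1", "low"): [8, 2],
    ("A1", "high"): [1, 9],
    ("A2", "x"): [5, 5],
    ("A2", "y"): [4, 6],
}
n_values = {"A1": 2, "A2": 2}                # b_i for each attribute
tau0 = 1.0 / sum(n_values.values())          # initial pheromone, 1 / sum(b_i)
tau = {term: tau0 for term in counts_by_term}

eta = entropy_heuristic(counts_by_term, k=2)
prob = selection_probabilities(tau, eta)
best_term = max(prob, key=prob.get)

q = rule_quality(tp=8, fp=1, fn=2, tn=9)
tau_next = update_pheromone(tau[best_term], rho=0.85, q=q)
```

In this toy case the evenly split term `("A2", "x")` carries no class information, so its heuristic value and selection probability are zero, while the purest partition `("A1", "high")` is the most likely choice.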

Steps of rule extraction

From a functional point of view, a rule extraction method based on ELM and IAM is proposed for TSA, as shown in Fig 1. The proposed method focuses on the mapping relationship between the state information and the stability result of power systems, and emphasizes the ability to reproduce the function of the ELM classifier without considering the type and structure of ELM. The basic idea of the proposed method is to regard the trained ELM as a new sample space, and then use the IAM algorithm to extract the hidden knowledge into rules with good understandability. In this way, the classification rules are generated to functionally replace the original ELM network.

The proposed rule extraction approach can be divided into three steps: feature selection, acquisition of the example sample set and rule learning.

Feature selection

As is well known, feature selection is an issue of paramount importance for PRTSA. In [23], a feature selection method based on kernelized fuzzy rough sets (KFRS) and the memetic algorithm is proposed for TSA. By defining a KFRS-based generalized classification function as the separability criterion, the memetic algorithm based on binary differential evolution and Tabu search is employed to obtain the optimal feature subsets with the maximized classification capability (see [23] for further details).

The approach presented here uses the same feature selection method as in [23], and the optimal feature subsets obtained for the New England 39-bus system and the southern power system of Hebei province are used as the input features in this study, as shown in Tables 1 and 2, respectively. Here, $t_0$ and $t_{cl}$ denote the fault occurrence and clearing times in turn, and $t_{cl+3c}$, $t_{cl+6c}$ and $t_{cl+9c}$ respectively denote the 3rd, 6th and 9th cycles after the fault clearance.

Acquisition of the example sample set

An example consists of an input mode and an output mode. If an example is judged by the trained ELM model, the assessment result is used as the output mode. A new example can then be obtained by combining the obtained output mode with the original input mode, which reflects the response characteristic of ELM to a certain extent [24]. If the examples are sufficient and cover the entire sample space, the rules obtained will have functions similar to those of the original ELM, i.e., these rules can describe the functions of the original network. The steps of generating the example sample set are listed as follows.

Step 1: Based on the training set A with class labels, the TSA model is obtained by using ELM.

Step 2: Determine the range of the input features, and then generate a random data set B (the input modes) without the corresponding class labels in the range. Note that the data set B has the same input features as the training set A, but their values are different.

Step 3: The obtained data set B is assessed by using the trained ELM-based TSA model, and the corresponding class labels (the output modes) are obtained.

Step 4: Finally, the example sample set B used for the follow-up rule learning is acquired by combining the input and output modes obtained in Step 2 and Step 3, respectively.

Fig 1. Flowchart of ELM-based rule extraction. doi:10.1371/journal.pone.0130814.g001

Rule learning
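As a bridge from sample acquisition to rule learning, Steps 2 to 4 of the example sample set generation can be sketched as follows. This is a minimal illustration, not the authors' code: the trained ELM of Step 1 is replaced by a hypothetical stand-in `classify` function, the feature ranges are taken as the per-feature minimum and maximum of training set A, and uniform sampling within those ranges is assumed.

```python
import random


def generate_example_set(train_inputs, classify, n_samples, seed=0):
    """Build (input mode, output mode) pairs labelled by a trained classifier."""
    rng = random.Random(seed)
    n_features = len(train_inputs[0])
    # Step 2: range of every input feature observed in training set A
    lows = [min(row[j] for row in train_inputs) for j in range(n_features)]
    highs = [max(row[j] for row in train_inputs) for j in range(n_features)]
    examples = []
    for _ in range(n_samples):
        # Step 2: random input mode inside the feature ranges (no label yet)
        x = [rng.uniform(lows[j], highs[j]) for j in range(n_features)]
        # Step 3: the trained model supplies the output mode (class label)
        label = classify(x)
        # Step 4: combine input and output modes into one example
        examples.append((x, label))
    return examples


# Hypothetical training inputs and a stand-in for the trained ELM classifier
train_inputs = [[0.0, 10.0], [1.0, 30.0], [0.4, 20.0]]


def classify(x):
    """Stand-in decision function; a real run would call the trained ELM."""
    return 1 if x[1] < 20.0 else -1


example_set = generate_example_set(train_inputs, classify, n_samples=500)
```

Because the labels come from the model rather than from time-domain simulation, the example set can be made as large as needed to cover the input space.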

IAM algorithm. The "No free lunch" theorem [25] shows that no optimization algorithm is optimal for every problem in terms of properties such as global optimization ability and convergence speed. The original Ant-Miner algorithm is improved in two respects in this paper. On the one hand, an adaptive tuning strategy is adopted to tune the pheromone evaporation rate ρ; on the other hand, the heuristic function is improved to reduce the computational overhead.

Table 1. The input features for the New England 39-bus test system.
No.     Input features
Tz1     Mean value of all the mechanical power before the fault incipient time
Tz2     Mean value of all the initial acceleration power
Tz3     Rotor angular velocity of the machine with the biggest difference relative to the centre of inertia at $t_{cl+3c}$
Tz4     Rotor angle of the machine with the biggest difference relative to the centre of inertia at $t_{cl+6c}$
Tz5     Rotor angular velocity of the machine with the biggest difference relative to the centre of inertia at $t_{cl+6c}$
Tz6     Rotor angle of the machine with the biggest difference relative to the centre of inertia at $t_{cl+9c}$
Tz7     Rotor angular velocity of the machine with the biggest difference relative to the centre of inertia at $t_{cl+9c}$
doi:10.1371/journal.pone.0130814.t001

Table 2. The input features for the southern power system of Hebei province.
No.     Input features
Tz1     Mean value of all the initial acceleration power
Tz2     Maximum value of all the rotor kinetic energies at $t_{cl}$
Tz3     Rotor angle of the machine with the biggest difference relative to the centre of inertia at $t_{cl+}$
Tz4     Maximum value of the difference of rotor angles at $t_{cl+}$
Tz5     Kinetic energy of the machine with the maximum rotor angle at $t_{cl+}$
Tz6     Rotor angle of the machine with the biggest difference relative to the centre of inertia at $t_{cl+}$
Tz7     Maximum value of the difference of rotor angles at $t_{cl+}$
Tz8     Rotor angular velocity of the machine with the biggest difference relative to the centre of inertia at $t_{cl+}$
Tz9     Rotor angle of the machine with the biggest difference relative to the centre of inertia at $t_{cl+}$
Tz10    Maximum value of the difference of rotor angles at $t_{cl+}$
Tz11    Rotor angular velocity of the machine with the biggest difference relative to the centre of inertia at $t_{cl+}$
doi:10.1371/journal.pone.0130814.t002

(1) Improvement of pheromone evaporation rate ρ. The control parameter ρ plays an important role in the performance of the Ant-miner algorithm, so an adaptive tuning strategy is adopted here. If ρ is large, the algorithm does not easily fall into a local optimum, but its convergence is slow; otherwise, its convergence is fast, but it easily falls into a local optimum. In this paper, a self-adaptive adjustment is employed to improve the performance of the algorithm,

$$1 - \rho(t + 1) = \begin{cases} c \cdot (1 - \rho(t)), & \text{if } c \cdot (1 - \rho(t)) \ge \rho_{\min} \\ \rho_{\min}, & \text{else} \end{cases} \tag{11}$$

where $\rho_{\min}$ is the minimum of ρ, and $c$ stands for the constant decimal coefficient of the equation, whose digits are missing here.

By setting the dynamic parameter ρ, the useful information in the last search can be preserved, which facilitates a finer search in a better area. By this means, both the convergence speed and the searching ability are enhanced.

(2) Improvement of the heuristic function. The heuristic value in Ant-miner is defined as an information theoretic measure in terms of the entropy. Unfortunately, the information entropy-based heuristic function is complex and time-consuming. With the assumption that the small induced errors are compensated by the pheromone level [26], a density-based heuristic function is employed as shown in (12), which makes IAM computationally less expensive without a significant degradation of the stated performance.
$$\eta_{ij}(t) = \frac{|T_{ij}(t)|}{|Ts(t)|} \tag{12}$$

where $|T_{ij}(t)|$ and $|Ts(t)|$ are, at iteration $t$, the number of samples in the partition $T_{ij}$ and the total number of samples in the training set, respectively.

Process of IAM algorithm. An iteration of the IAM algorithm mainly consists of three steps: rule construction, rule pruning, and pheromone updating, as detailed in Fig 2.
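As a small illustration of the density-based heuristic in (12), the following sketch counts, for one term, the share of the current training samples that fall into its partition. The discretized toy samples and the name `density_heuristic` are hypothetical; the adaptive evaporation rate is deliberately left out of the sketch.

```python
def density_heuristic(samples, attribute, value):
    """Heuristic of (12): |T_ij| / |Ts| for the term attribute == value."""
    covered = sum(1 for s in samples if s[attribute] == value)
    return covered / len(samples)


# Hypothetical discretized training samples still uncovered at this iteration
samples = [
    {"Tz1": "low", "Tz2": "high"},
    {"Tz1": "low", "Tz2": "low"},
    {"Tz1": "high", "Tz2": "high"},
    {"Tz1": "low", "Tz2": "high"},
    {"Tz1": "high", "Tz2": "low"},
]

eta_low = density_heuristic(samples, "Tz1", "low")
eta_high = density_heuristic(samples, "Tz1", "high")
```

Unlike the entropy-based heuristic, this value needs only one counting pass and no logarithms, which is the computational saving the improvement aims at.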

Rule learning based on IAM. The rule learning steps are listed as follows:

Step 1: Data pre-processing. The z-score standardization method is used as the data pre-processing method for the obtained sample set.

$$f' = (f - \bar{F}) / \sigma_F \tag{13}$$

where $\bar{F}$ and $\sigma_F$ are the mean and standard deviation of any feature $F$ in the sample data, respectively; $f'$ is the normalized value of $f$, $f \in F$.

Step 2: Based on the trained ELM-based TSA model, an example sample set $S$ is obtained. To approximate the functions of the original ELM network, the obtained samples should be sufficient and cover the entire sample space with a uniform distribution.

Step 3: If some examples in $S$ with the same attribute combination belong to the same class, a new rule $R_u$ is established, i.e. the attribute combination and the class are used as the rule antecedent and the rule consequent, respectively.

Step 4: Based on the trained IAM-based TSA model, a new example sample set $S'$ is generated to verify the fidelity of the rule set. If the fidelity of the rule set meets the requirements, the rule $R_u$ is retained in the discovered rule list and the cases covered by $R_u$ are removed from $S$; otherwise, $R_u$ is eliminated from the rule list.

Step 5: If $S = \emptyset$, stop the learning process; otherwise, go to Step 3 and repeat the process. This makes the generated rules gradually approach the functions of the original ELM classifier.

Step 6: Output the generated rule set.

Case Studies
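Before the case studies, the z-score standardization of Step 1 and the fidelity check of Step 4 can be illustrated with a short sketch. Two assumptions are made here that the text does not spell out: the population standard deviation is used in the z-score, and fidelity is taken in its usual rule-extraction sense, namely the fraction of examples on which the rule list reproduces the ELM label. The rule representation (an ordered list with a default class) and all names are hypothetical.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize one feature: subtract its mean, divide by its std. dev."""
    mu, sigma = mean(values), pstdev(values)  # population std. dev. assumed
    return [(v - mu) / sigma for v in values]


def classify_with_rules(rules, default_class, example):
    """Return the class of the first rule whose conditions all match."""
    for conditions, label in rules:
        if all(example[attr] == val for attr, val in conditions.items()):
            return label
    return default_class


def fidelity(rules, default_class, examples, elm_labels):
    """Fraction of examples where the rule list agrees with the ELM label."""
    agree = sum(
        classify_with_rules(rules, default_class, x) == y
        for x, y in zip(examples, elm_labels)
    )
    return agree / len(examples)


standardized = z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Hypothetical rule list over discretized features, plus ELM-assigned labels
rules = [({"Tz1": "low"}, 1), ({"Tz1": "high", "Tz2": "high"}, -1)]
examples = [
    {"Tz1": "low", "Tz2": "low"},
    {"Tz1": "high", "Tz2": "high"},
    {"Tz1": "high", "Tz2": "low"},
    {"Tz1": "low", "Tz2": "high"},
]
elm_labels = [1, -1, -1, 1]
rule_fidelity = fidelity(rules, default_class=1, examples=examples, elm_labels=elm_labels)
```

Here the third example is covered by no rule and falls back to the default class, which disagrees with the ELM label, so three of the four examples match.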

The effectiveness of the proposed method is tested on the New England 39-bus power system and a practical power system, the southern power system of Hebei province. All programs are implemented in MATLAB on a PC with a 1.81 GHz processor and 1 GB of main memory.

In order to properly assess the performance of the proposed predictive model, the well-known model validation technique of cross-validation is employed in the following case studies, as it provides a nearly unbiased estimate of the generalization ability of predictive models and helps to detect overfitting and underfitting.

Case 1: The New England 39-bus power system

The New England 39-bus power system is a widely used test system for TSA studies reported in the literature [10–12, 22]. The one-line diagram of the power system is shown in Fig 3.

Fig 2. Flowchart of IAM algorithm. doi:10.1371/journal.pone.0130814.g002

Generation of the sample sets. Extensive time domain simulation work has been carried out to create the training and test sample sets. The simulation is done with the fourth-order machine model and the IEEE DC1 excitation system model, as well as the constant impedance load model. A three-phase short-circuit fault is created at instant 0.1 s and cleared at 0.2 s. A successful reclosure of the faulted line is applied after fault clearance, with no topology change from the fault. A total of 4800 arbitrary samples at 80 different fault locations are created under 75%, 80%, 85%, ..., 130% of the basic load levels. Corresponding to each loading level, 5 different generator outputs are randomly set.
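The stated total can be cross-checked with a few lines of arithmetic. The check assumes that every load level, generator output pattern and fault location is combined exactly once; the text does not say so explicitly, but this reading reproduces the reported sample count.

```python
# Load levels from 75% to 130% of the basic load in 5% steps
load_levels = list(range(75, 131, 5))
generator_outputs_per_level = 5
fault_locations = 80

n_samples = len(load_levels) * generator_outputs_per_level * fault_locations
```

With 12 load levels this gives 12 x 5 x 80 = 4800 samples, matching the figure quoted for Case 1.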

Fig 3. New England 39-bus test system. doi:10.1371/journal.pone.0130814.g003

A class label "-1" or "+1" is assigned to each sample according to the maximum relative rotor angle deviation during the transient period. If the maximum relative rotor angle deviation exceeds 360 degrees [22, 23], the class label is "-1" and the system is considered to be transiently unstable; otherwise, the label is "+1" and the system is stable. In Figs 4 and 5, a transiently stable case and an unstable case are respectively shown by plotting the rotor angle swing curves.

Fig 4. Transient stable case. doi:10.1371/journal.pone.0130814.g004

Parameter settings. In the proposed approach, only two parameters need to be set: the evaporation rate ρ and the number of ants. To obtain the optimal parameters, a large number of experiments over a range of parameter settings have been carried out, with the results shown in Figs 6–8. Fig 6 shows the influence of these parameters on accuracy, with a wider line used for the experiments with 400 ants and varying ρ, and for the experiments with ρ set to 0.85 and a varying number of ants. Fig 7 and Fig 8 show the surface lines for a constant evaporation rate (0.85) and a constant number of ants (400), respectively.

It might be taken for granted that better results will be obtained if the parameters are higher, since more ants mean that more candidate rules are generated and increasing ρ will cause the convergence process to be slower. However, an interesting phenomenon called "a flat-maximum effect" can be found: as the parameters increase, accuracy increases at first, but it shows no significant increase beyond a certain threshold, as shown in Figs 6 – ρ and the number of ants are respectively set to 0.85 and 400 (indicated with the white dot in Fig 6), since this choice provides the best performance for the proposal in most cases. Therefore, this choice is employed as the parameter setting for our experiments.

Evaluation measures.

In order to properly assess the performance of the proposed method (ELM-rules), it is tested by using the well-known 5-fold cross-validation method. Since the predictive accuracy Acc is subject to some randomness, the test results should be assessed on a statistical basis [23]. Therefore, measures such as the precision Prec and the area under the receiver operating characteristic (ROC) curve AUC are also taken into account for the performance assessment of the proposed method. If a classifier model is perfect, AUC will be 1. If the model is just a random guess, AUC will be 0.5. The greater the AUC, the better the model. Considering the above three classification performance indicators, a composite indicator η is used to comprehensively evaluate the performance of the TSA classifier models [23]. η is defined as

$$\eta = \frac{Acc + Prec + AUC}{3} \tag{14}$$

Fig 5. Transient unstable case. doi:10.1371/journal.pone.0130814.g005

Results and Discussion

(1) Test results.
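The evaluation measures behind the results reported here can be computed as in the following pure-Python sketch. It assumes that the composite indicator η is the arithmetic mean of Acc, Prec and AUC (all expressed as fractions), which is consistent with the η values close to 0.96 quoted for Case 1, and it treats the stable class "+1" as the positive class, a choice the text does not state. AUC is computed with the rank-based (pairwise comparison) formulation; the labels and scores are made-up toy values.

```python
def accuracy(y_true, y_pred):
    """Share of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """TP / (TP + FP) for the chosen positive class."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    return sum(t == positive for t in predicted_pos) / len(predicted_pos)


def auc(y_true, scores, positive=1):
    """Probability that a positive sample outscores a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def composite_indicator(acc, prec, auc_value):
    """Composite indicator, assumed to be the mean of the three measures."""
    return (acc + prec + auc_value) / 3.0


# Toy labels and classifier scores (hypothetical, not results from the paper)
y_true = [1, 1, 1, -1, -1, 1, -1, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else -1 for s in scores]

acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
auc_value = auc(y_true, scores)
eta = composite_indicator(acc, prec, auc_value)
```

For the toy data this yields Acc = 0.75, Prec = 0.8 and AUC = 13/15, so η is about 0.806; in a 5-fold cross-validation the same computation would be repeated per fold and averaged.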

The presented method is tested by using cross-validation, and comparative tests are also carried out with other related state-of-the-art methods, including RuleFit (a rule-based ensemble method) [27], Rotation forest [28], ELM [16] and MLP. The test results are reported in Table 3, with the ROC curve shown in Fig 9.

In Table 3, Acc is the average validation accuracy of the 5-fold cross validation, and the values after the "±" symbol are the standard deviations of the corresponding evaluation measures. The parameters of the predictive models are set as follows: the parameters of RuleFit and Rotation forest are set according to [27] and [28], respectively; for the ELM classifier, the number of hidden layer nodes is set to 50; the MLP is designed with 15 hidden neurons and trained by the back-propagation algorithm with a learning rate of 0.8 and a momentum factor of 0.7.

As shown in Table 3, the assessment results of the predictive models differ from each other, and they can be summarized in terms of classification ability and rule list simplicity. Concerning classification ability, ELM is the best, with the composite indicator η reaching the maximum value 0.9697; MLP is the worst, with η at the minimum value 0.9386; and the values for RuleFit, ELM-rules and Rotation forest are 0.9670, 0.9643 and 0.9643, respectively. Concerning rule list simplicity, ELM and MLP are black-box, incomprehensible predictive models; among the rule-based methods, the presented method is superior to the others, and Rotation forest is the worst.

Table 3 clearly shows a tradeoff between classification ability and rule list simplicity: 1) the most accurate results are obtained by the incomprehensible nonlinear ELM models; 2) the most accurate rule-based classifier, RuleFit, is slightly better than the second best rule-based classifier, ELM-rules; however, ELM-rules discovers simpler rule lists than RuleFit, with fewer rules and fewer terms per rule on average; 3) ELM-rules and Rotation forest are roughly equivalent in terms of predictive accuracy (their η values are equal), and ELM-rules discovers much simpler rule lists than Rotation forest. The reason is that the proposed approach extracts transient stability rules from an example sample set generated by the trained ELM-based TSA model by using the IAM algorithm. In this way, the proposal combines the advantages of ELM and IAM, resulting in the good predictive performance of the obtained rules. Therefore, it can be concluded from this evidence that the proposed method is an effective way to extract transient stability rules, since the simplicity of a rule list tends to be even more important than its predictive ability in TSA, as motivated earlier in this paper.

Fig 6. Influence of the two parameters on accuracy. doi:10.1371/journal.pone.0130814.g006

Fig 7. Influence of the number of ants on accuracy (with ρ set to 0.85). doi:10.1371/journal.pone.0130814.g007

Case 2: The southern power system of Hebei province

In order to further justify the efficacy and applicability of the proposed approach for a more complex and practical system, it is examined on the model of a practical large power system, the southern power system of Hebei province, China. The power system, covering an area of 84,000 square kilometers, is a highly interconnected grid with an approximate installed capacity of 28260 MW. The modeled system comprises 83 generators, some series compensated lines and static var compensators (SVCs).

Fig 8. Influence of the pheromone evaporation rate on accuracy (with 400 ants). doi:10.1371/journal.pone.0130814.g008

Table 3. Test results in Case-1.
Method             Acc (%)    Prec (%)    AUC    R    T/R    η
ELM-rules          ±          ±           ±      ±    ±      ±
RuleFit            ±          ±           ±      ±    ±      ±
Rotation forest    ±          ±           ±      ±    ±      ±
ELM                ±          ±           ±      -    -      ±
MLP                ±          ±           ±      -    -      ±
doi:10.1371/journal.pone.0130814.t003

Generation of the sample sets

Extensive simulation has been carried out to generate the sample sets. Of all 83 generators in the system, 11 are represented by the sixth-order model, with the excitation systems and governors considered; the others are represented by the classical machine model. The load is represented by a comprehensive model with 40% constant impedance and 60% constant power. In the range from 90% to 120% of the basic load level, the active and reactive powers of the generators are set correspondingly. The contingencies created are three-phase to ground faults, and the fault clearing times are varied from five to ten cycles. A successful reclosure of the faulted line is applied after fault clearance, with no topology change from the fault. The fault locations lie at 0, 25%, 50%, and 75% of the length of the transmission lines. The stability criterion is the same as in Case-1. A total of 5000 arbitrary samples are created.

Prediction Results and Performance

In this case, the performance of the proposed approach is again evaluated by using the 5-fold cross-validation method. Following the parameter setting method used in Case-1, the evaporation rate ρ and the number of ants are set to 0.85 and 600, respectively, through extensive experiments. The ROC curve of the proposal is shown in Fig 10, and the test results of the presented method and the related state-of-the-art methods in this case are summarized in Table 4.

Fig 9. The ROC curve in Case-1. doi:10.1371/journal.pone.0130814.g009
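The 5-fold cross-validation protocol mentioned above can be sketched in plain Python as follows. This is a generic illustration, not the authors' code: the helper names `k_fold_indices` and `cross_validate` are hypothetical, the folds are not stratified by class, and the `train_and_score` callback stands in for training a model on one index set and scoring it on the other.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=5, seed=0):
    """Average a user-supplied score over k train/test splits."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k
```

With 5000 samples and k = 5, each fold holds 1000 test samples and the remaining 4000 are used for training.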

From Table 4, the results show that the proposed algorithm is applicable to real power systems, and that it is able to extract transient stability rules for a real power system, the southern power system of Hebei province. Concerning classification ability, the proposed method performs roughly on par with the related state-of-the-art work in this area; concerning rule list simplicity, the presented approach is superior to the others. Considering classification ability and rule list simplicity together, the presented method is a good choice for solving the rule extraction problem for PRTSA.

Fig 10. The ROC curve in Case-2. doi:10.1371/journal.pone.0130814.g010
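To make the notion of rule list simplicity concrete, the following Python sketch shows how an ordered rule list can classify a sample and how the number of rules and the average number of terms per rule can be computed. It rests on several assumptions: R and T/R are read here as the rule count and the mean terms per rule, the default rule is not counted, terms are written as threshold comparisons, and the feature names `f1` and `f2` and the rules themselves are hypothetical rather than rules extracted in the paper.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    terms: list   # each term is (feature_name, operator, threshold)
    label: str    # class predicted when every term holds


OPS = {"<=": lambda a, b: a <= b, ">": lambda a, b: a > b}


def classify(rule_list, default_label, sample):
    """Return the label of the first rule whose terms all match the sample."""
    for rule in rule_list:
        if all(OPS[op](sample[name], thr) for name, op, thr in rule.terms):
            return rule.label
    return default_label  # no rule fired


def simplicity(rule_list):
    """Number of rules and mean number of terms per rule."""
    n_rules = len(rule_list)
    n_terms = sum(len(rule.terms) for rule in rule_list)
    return n_rules, n_terms / n_rules


# Hypothetical rules and feature names, for illustration only.
rules = [
    Rule([("f1", ">", 0.8), ("f2", ">", 0.5)], "unstable"),
    Rule([("f1", "<=", 0.3)], "stable"),
]
print(classify(rules, "stable", {"f1": 0.9, "f2": 0.6}))
print(simplicity(rules))
```

A shorter list with fewer terms per rule is easier for an operator to read, which is the sense in which one rule list is called simpler than another.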

Table 4. Test results in Case-2. Columns: Method, Acc (%), Prec (%), AUC, R, T/R, η. Rows: ELM-rules, RuleFit, Rotation forest, ELM, MLP (R and T/R are not applicable to ELM and MLP). doi:10.1371/journal.pone.0130814.t004

Conclusions

To improve the understandability of PRTSA approaches, a novel rule extraction method based on ELM and the IAM algorithm is presented in this paper. The key point of the proposal is that the IAM algorithm extracts transient stability rules from an example sample set generated by the trained ELM-based TSA model. Using the well-known cross-validation method, the effectiveness of the proposed method is tested on the New England 39-bus power system and the southern power system of Hebei province, and the following conclusions can be drawn from this work:

1. The proposed method can extract transient stability rules not only for the New England 39-bus power system, but also for real power systems.
2. The classification ability of the proposed method is roughly equivalent to that of the related state-of-the-art work in this area, including RuleFit and Rotation forest; however, its rule lists are simpler than those of the other methods.
3. The proposed approach may find potential applications in real-time transient stability prediction of power systems. Furthermore, the rule extraction methodology may be applied to similar pattern classification problems in the engineering field.

Author Contributions

Conceived and designed the experiments: YL. Performed the experiments: YL GL ZW. Analyzed the data: YL GL. Contributed reagents/materials/analysis tools: YL ZW. Wrote the paper: YL.
