A Statistical Study on Parameter Selection of Operators in Continuous State Transition Algorithm
Xiaojun Zhou, Chunhua Yang, Member, IEEE, and Weihua Gui, Member, IEEE
Abstract—State transition algorithm (STA) has been emerging as a novel metaheuristic method for global optimization in recent years. In our previous studies, the parameter of a transformation operator in continuous STA is either kept constant or decreased periodically. In this paper, the optimal parameter selection of the STA is taken into consideration. Firstly, a statistical study with four benchmark two-dimensional functions is conducted to show how these parameters affect the search ability of the STA. Then, based on the experience gained from the statistical study, a new continuous STA with an optimal parameter selection strategy is proposed to accelerate its search process. The proposed STA is successfully applied to twelve benchmarks in 20-, 30- and 50-dimensional spaces. Comparison with other metaheuristics has also demonstrated the effectiveness of the proposed method.
Index Terms—State transition algorithm, statistical study, metaheuristic, global optimization.
I. INTRODUCTION

STATE TRANSITION ALGORITHM (STA) [1], [2] is a recently emerging metaheuristic method for global optimization and has found applications in nonlinear system identification and control [3], water distribution network configuration [4], sensor network localization [5], PID controller design [6], [7], overlapping peaks resolution [8], image segmentation [9], wind power prediction [10], dynamic optimization [11], [12], bi-level optimization [13], modeling and control of complex industrial processes [14]-[19], etc. In STA, a solution to an optimization problem is considered as a state, and an update of a solution is regarded as a state transition. Unlike population-based evolutionary algorithms [20]-[22], the standard STA is an individual-based optimization method. Based on an incumbent best solution, a neighborhood with special characteristics is formed automatically when a certain state transformation operator is applied. A variety of state transformation operators, for example, rotation, translation, expansion and axesion in continuous STA, or swap, shift, symmetry and substitute in discrete STA, are designed purposely for both global and local search. On the basis of the neighborhood, a sampling technique is then used to generate a candidate set, and the next best solution is selected using the previous best solution and the candidate set. This process is repeated, applying the state transformation operators alternately, until some terminal condition is satisfied.

In this paper, the continuous state transition algorithm is studied. As aforementioned, in continuous STA, there are four
X. Zhou, C. Yang and W. Gui are with the School of Information Science and Engineering, Central South University, Changsha, Hunan, China (e-mail: [email protected], [email protected], [email protected]). Manuscript received XX, 2017; revised XX, 2018.

state transformation operators, and each transformation operator has a certain geometric significance, i.e., the neighborhood formed by each transformation operator has a certain geometric characteristic. To be more specific, the rotation transformation has the functionality to search in a hypersphere with the maximal radius α, called the rotation factor; the translation transformation has the functionality to search along a line with the maximal length β, called the translation factor; the expansion transformation has the functionality to search in a broader space controlled by the expansion factor γ; and the axesion transformation is designed to strengthen single-dimensional search, regulated by the axesion factor δ. In our previous studies, the rotation factor is exponentially decreased from a maximum value to a minimum value in a periodic way, and the other transformation factors are kept constant at one [1]. To gain a better exploitation ability, all state transformation factors are exponentially decreased from a maximum value to a minimum value in a periodic way in [5].

As is well known, there exist several parameters in metaheuristic methods, and parameter selection plays a significant role in their performance: for instance, the crossover and mutation probabilities in genetic algorithms (GAs) [23], the inertia weight and acceleration factors in particle swarm optimization (PSO) [24], [25], the amplification factor and crossover rate in differential evolution (DE) [26]-[28], and the neighborhood radius in artificial bee colony (ABC) [29]. In general, parameter setting can be summarized into two types: parameter tuning and parameter control. The former is to find good parameter values before running these algorithms, and they remain fixed during the run. On the contrary, the latter is to update parameter values during the run, and the update mechanisms can be deterministic, adaptive, or self-adaptive (for details, please refer to [30]-[32]).

To gain a better understanding of how the parameters of the transformation operators in continuous STA affect its performance, this study focuses on parameter selection in continuous STA. With four commonly used benchmark functions as cases, several properties of the operator parameters are observed from a statistical study. With the experience gained from the statistical results, a new continuous STA with an optimal operator parameter selection strategy is proposed, and the proposed STA is successfully applied to other benchmarks with higher dimensions.

The remainder of this paper is organized as follows. In Section II, the standard continuous STA is described. Section III gives a statistical study to show how the operator parameters in continuous STA affect its performance. The proposed STA with the optimal operator parameter selection strategy is given in Section IV. In Section V, experimental results are given to verify the effectiveness of the proposed STA. Finally, the conclusion is drawn in Section VI.

II. STANDARD CONTINUOUS STATE TRANSITION ALGORITHM
Consider the following continuous optimization problem with simple constraints:

    min_{x ∈ Ω} f(x)    (1)

where Ω ⊆ R^n is a closed and compact set, usually composed of the lower and upper bounds of x, i.e., Ω = {x ∈ R^n | x̲_i ≤ x_i ≤ x̄_i, i = 1, ..., n}.

In classical iterative methods for numerical optimization, a new candidate is generated from a previous solution by different optimization operators. In the state transition view, a solution can be regarded as a state, and an update of a solution can be considered as a state transition. On the basis of state space representation, the unified form of solution generation in the state transition algorithm can be described as follows:

    s_{k+1} = A_k s_k + B_k u_k,
    y_{k+1} = f(s_{k+1}),    (2)

where s_k and s_{k+1} stand for the current state and the next state, respectively, corresponding to solutions of the optimization problem; u_k is a function of s_k and historical states; y_{k+1} is the fitness value at s_{k+1}; A_k and B_k are state transition matrices, which can be considered as transformation operators; and f is the objective (fitness) function.

A. State transition operators
Using state space representation and state transformation for reference, four special state transformation operators are designed to generate candidate solutions for an optimization problem [1], [33].

(1) Rotation transformation

    s_{k+1} = s_k + α (1 / (n ||s_k||_2)) R_r s_k,    (3)

where α is a positive constant, called the rotation factor; R_r ∈ R^{n×n} is a random matrix with its entries being uniformly distributed random variables on the interval [-1, 1]; and ||·||_2 is the L2 (Euclidean) norm of a vector. The rotation transformation has the functionality to search in a hypersphere with the maximal radius α, which has been verified. It is designed for local search and can be used to guarantee local optimality and control solution accuracy.

(2) Translation transformation

    s_{k+1} = s_k + β R_t (s_k − s_{k−1}) / ||s_k − s_{k−1}||_2,    (4)

where β is a positive constant, called the translation factor, and R_t ∈ R is a uniformly distributed random variable on the interval [0, 1]. It is not difficult to see that the translation transformation searches along the line from s_{k−1} to s_k, starting at s_k, with maximum length β. The translation operator is actually a line search, and it can be considered a heuristic operator, since a better solution may exist along this line when s_k is better than s_{k−1}.

(3) Expansion transformation

    s_{k+1} = s_k + γ R_e s_k,    (5)

where γ is a positive constant, called the expansion factor, and R_e ∈ R^{n×n} is a random diagonal matrix with its entries obeying the Gaussian (normal) distribution. In the standard STA, the mean equals zero and the standard deviation equals one, i.e., the standard normal distribution is used.
The expansion transformation has the functionality to search the whole space in probability, and it is designed for global search.

(4) Axesion transformation

    s_{k+1} = s_k + δ R_a s_k,    (6)

where δ is a positive constant, called the axesion factor, and R_a ∈ R^{n×n} is a random diagonal matrix whose entries obey the Gaussian distribution, with only one random position having a nonzero value. The axesion transformation is designed to search along the axes, aiming to strengthen single-dimensional search [34].

B. A sampling technique
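As a preliminary, the four transformation operators of Section II-A, Eqs. (3)-(6), can be sketched in NumPy as follows. This is a minimal illustration, not the authors' MATLAB implementation; the function names are ours, and states are taken as 1-D arrays.

```python
import numpy as np

def rotation(s, alpha):
    # Eq. (3): search within a hypersphere of radius at most alpha around s.
    n = len(s)
    Rr = np.random.uniform(-1.0, 1.0, size=(n, n))
    return s + alpha / (n * np.linalg.norm(s)) * Rr @ s

def translation(s, s_prev, beta):
    # Eq. (4): line search from s_prev towards s, step length at most beta.
    d = s - s_prev
    Rt = np.random.uniform(0.0, 1.0)
    return s + beta * Rt * d / np.linalg.norm(d)

def expansion(s, gamma):
    # Eq. (5): random diagonal Gaussian perturbation; can reach the whole
    # space in probability.
    Re = np.diag(np.random.standard_normal(len(s)))
    return s + gamma * Re @ s

def axesion(s, delta):
    # Eq. (6): Gaussian perturbation along one randomly chosen axis only.
    Ra = np.zeros((len(s), len(s)))
    i = np.random.randint(len(s))
    Ra[i, i] = np.random.standard_normal()
    return s + delta * Ra @ s
```

Note that, because the entries of R_r lie in [-1, 1], its Frobenius norm is at most n, so the rotation step never exceeds α, in line with the hypersphere interpretation above.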
The idea of sampling incorporated in continuous STA was first illustrated in [35]. For a given solution, a neighborhood is formed automatically. To avoid enumerating all possible candidate solutions, representative samples can be used to reflect the characteristics of the neighborhood. Taking the rotation transformation as an example, when the rotation operator is executed independently SE times, a total of SE samples are generated, in pseudocode as follows:

    for i ← 1 to SE do
        State(:, i) ← Best + α (1 / (n ||Best||_2)) R_r Best
    end for

where Best is the incumbent best solution, and the SE samples are stored in the matrix State.

C. An update strategy
As mentioned above, a total of SE candidate solutions are generated based on the incumbent best solution, but these candidate solutions do not always belong to the domain Ω. To address this issue, the samples are projected into Ω through

    x_i ← x̄_i,  if x_i > x̄_i
    x_i ← x̲_i,  if x_i < x̲_i
    x_i ← x_i,  otherwise    (7)

As a result, the candidate solutions are guaranteed to be feasible. Next, a new best solution is selected from the candidate set by virtue of the fitness function, denoted as newBest. Finally, an update strategy based on the greedy criterion is used to update the incumbent best, as shown below:

    Best ← newBest,  if f(newBest) < f(Best)
    Best ← Best,  otherwise    (8)

D. Algorithm procedure of the standard continuous STA
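Assembling the operators, the sampling technique, the projection of Eq. (7) and the greedy update of Eq. (8), the standard continuous STA procedure can be sketched end-to-end in NumPy. This is an illustrative sketch under our own naming, not the authors' MATLAB code; the default arguments match the parameter settings stated for the standard STA (α_max = 1, α_min = 1e-4, β = γ = δ = 1, SE = 30, fc = 2).

```python
import numpy as np

def sta_standard(fun, lower, upper, iters=100, se=30, alpha_max=1.0,
                 alpha_min=1e-4, beta=1.0, gamma=1.0, delta=1.0, fc=2.0,
                 seed=0):
    """Sketch of the standard continuous STA main loop."""
    rng = np.random.default_rng(seed)
    n = len(lower)
    best = rng.uniform(lower, upper)   # initialization inside the box
    alpha = alpha_max

    def sample(op, s):
        # SE candidates from one operator, projected into the box (Eq. 7).
        out = np.empty((se, n))
        nrm = max(np.linalg.norm(s), 1e-12)
        for i in range(se):
            if op == 'rotation':                        # Eq. (3)
                out[i] = s + alpha / (n * nrm) * rng.uniform(-1, 1, (n, n)) @ s
            elif op == 'expansion':                     # Eq. (5)
                out[i] = s + gamma * rng.standard_normal(n) * s
            else:                                       # axesion, Eq. (6)
                d = np.zeros(n)
                d[rng.integers(n)] = rng.standard_normal()
                out[i] = s + delta * d * s
        return np.clip(out, lower, upper)

    for _ in range(iters):
        if alpha < alpha_min:                           # periodic reset
            alpha = alpha_max
        for op in ('expansion', 'rotation', 'axesion'):
            cand = sample(op, best)
            vals = np.apply_along_axis(fun, 1, cand)
            k = int(np.argmin(vals))
            if vals[k] < fun(best):                     # greedy update, Eq. (8)
                new = cand[k]
                # translation line search from old best towards new best, Eq. (4)
                d = new - best
                nrm = np.linalg.norm(d)
                if nrm > 0:
                    t = np.clip(new + beta * rng.uniform(0, 1, (se, 1)) * d / nrm,
                                lower, upper)
                    tv = np.apply_along_axis(fun, 1, t)
                    j = int(np.argmin(tv))
                    if tv[j] < fun(new):
                        new = t[j]
                best = new
        alpha /= fc                                     # exponential lessening
    return best
```

As in the pseudocode of this subsection, the translation step runs only after one of the other operators has produced a strictly better candidate.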
With the state transformation operators for both local and global search, the sampling technique for time saving, and the update strategy for convergence, the standard continuous STA can be described by the following pseudocode:

    State ← initialization(SE, Ω)
    Best ← fitness(funfcn, State)
    repeat
        if α < α_min then
            α ← α_max
        end if
        Best ← expansion(funfcn, Best, SE, β, γ)
        Best ← rotation(funfcn, Best, SE, β, α)
        Best ← axesion(funfcn, Best, SE, β, δ)
        α ← α / fc
    until the specified termination criterion is met

For a detailed explanation, rotation(·) in the above pseudocode is given for illustration purposes as follows:

    oldBest ← Best
    fBest ← feval(funfcn, oldBest)
    State ← op_rotate(Best, SE, α)
    [newBest, fnewBest] ← fitness(funfcn, State)
    if fnewBest < fBest then
        fBest ← fnewBest
        Best ← newBest
        State ← op_translate(oldBest, newBest, SE, β)
        [newBest, fnewBest] ← fitness(funfcn, State)
        if fnewBest < fBest then
            fBest ← fnewBest
            Best ← newBest
        end if
    end if

As shown in the above pseudocode, initialization(·) is used to make sure the initial solution is in the range Ω. The rotation factor α decreases periodically from a maximum value α_max to a minimum value α_min in an exponential way with base fc, which is called the lessening coefficient. op_rotate(·) and op_translate(·) represent the implementations of the proposed sampling technique for the rotation and translation operators, respectively, and fitness(·) represents the implementation of selecting the new best solution from the SE samples. It should be emphasized that the translation operator is only executed when a solution better than the incumbent best solution is found among the SE samples from the rotation, expansion or axesion transformation. In the standard continuous STA, the parameter settings are given as follows: α_max = 1, α_min = 1e-4, β = 1, γ = 1, δ = 1, SE = 30, fc = 2.

III. STATISTICAL STUDY OF THE STATE TRANSFORMATION FACTORS
As described in Section II, in the standard continuous STA, state transformation factors such as the expansion factor γ and the axesion factor δ are kept constant, while the rotation factor α decreases periodically from a maximum value α_max to a minimum value α_min in an exponential way. In order to select the values of these parameters in a more effective manner, a statistical study of the state transformation factors is carried out to investigate the effect of parameter selection on the performance of the state transition operators.

Four well-known benchmark functions are listed below:

(1) Spherical function

    f(x) = Σ_{i=1}^n x_i²,

where the global optimum is x* = (0, ..., 0) and f(x*) = 0.

(2) Rosenbrock function

    f(x) = Σ_{i=1}^{n−1} (100(x_{i+1} − x_i²)² + (x_i − 1)²),

where the global optimum is x* = (1, ..., 1) and f(x*) = 0.

(3) Rastrigin function

    f(x) = Σ_{i=1}^n (x_i² − 10 cos(2πx_i) + 10),

where the global optimum is x* = (0, ..., 0) and f(x*) = 0, −5.12 ≤ x_i ≤ 5.12, i = 1, ..., n.

(4) Griewank function

    f(x) = (1/4000) Σ_{i=1}^n x_i² − Π_{i=1}^n cos(x_i / √i) + 1,

where the global optimum is x* = (0, ..., 0) and f(x*) = 0.

For a given solution Best, three state transition operators (rotation, expansion and axesion) are each performed independently SE times (yielding SE samples) on each benchmark function using different values of the state transformation factors. To be more specific, five groups of given two-dimensional solutions Best are considered; the total number of samples is set at SE = 1e6; and the value of each state transformation factor is chosen from the set {1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8}. To evaluate the influence of parameter selection on the performance of the state transition operators, the following two indexes are introduced:

    ρ_s = N_s / SE    (9)

    ρ_d = |ave − f_Best| / |f_Best|    (10)

where ρ_s and ρ_d are called the success rate and the descent rate, respectively; N_s is the number of samples whose objective function values are smaller than that of Best; ave is the average function value of the N_s samples; and f_Best represents the function value at Best.

The statistical results for different values of the state transformation factors can be found in Table I to Table IV.
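The two indexes above can be estimated by direct Monte Carlo sampling. The following minimal NumPy sketch (rotation operator only; the function name is ours) computes ρ_s and ρ_d from Eqs. (9) and (10) for a given Best and rotation factor α:

```python
import numpy as np

def rotation_stats(fun, best, alpha, se=20000, seed=0):
    # Estimate success rate rho_s (Eq. 9) and descent rate rho_d (Eq. 10)
    # from `se` independent rotation samples around `best`.
    rng = np.random.default_rng(seed)
    n = len(best)
    f_best = fun(best)
    better = []
    for _ in range(se):
        Rr = rng.uniform(-1.0, 1.0, (n, n))            # Eq. (3)
        cand = best + alpha / (n * np.linalg.norm(best)) * Rr @ best
        v = fun(cand)
        if v < f_best:
            better.append(v)
    if not better:
        return 0.0, 0.0
    rho_s = len(better) / se
    rho_d = abs(np.mean(better) - f_best) / abs(f_best)
    return rho_s, rho_d
```

For instance, on the Spherical function with Best = (0.01, 0.01), a small α yields a high ρ_s but low ρ_d, while α = 1 yields the opposite, mirroring the trade-off discussed around Fig. 1.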
TABLE I. Statistical results of success rate and descent rate for the rotation transformation (Spherical problem).

TABLE II. Statistical results of success rate and descent rate for the rotation transformation (Rosenbrock problem).

TABLE III. Statistical results of success rate and descent rate for the rotation transformation (Rastrigin problem).

TABLE IV. Statistical results of success rate and descent rate for the rotation transformation (Griewank problem).

[In each of Tables I-IV, the rows are the five given solutions Best, and the columns report ρ_s and ρ_d for α = 1, 0.1, 0.01, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7 and 1e-8; the numeric entries are not recoverable from the source.]

Fig. 1. The changes of success rate ρ_s and descent rate ρ_d with α, γ and δ, respectively, when approaching the global minima.

As indicated in these tables, the following properties can be observed:
1) As a state transformation factor decreases below a certain threshold, the descent rate ρ_d shows a declining trend.
2) The success rate remains almost steadily high once a state transformation factor is below a certain threshold.
3) When the current solution Best is approaching the global optimal solution, the success rate of the rotation transformation is not high until the rotation factor falls below a threshold.

To be more specific, take the rotation transformation as an example: the changes of the success rate ρ_s and the descent rate ρ_d with the rotation factor α when the current solution Best is approaching the global optimum are illustrated in Fig. 1. Here, Best equals (0.01, 0.01), (0.99, 0.99), (0.01, 0.01) and (0.01, 0.01) for f1, f2, f3 and f4, respectively. Taking a closer look at these figures, it is not difficult to find that there exists a trade-off between the success rate and the descent rate. For instance, when α = 1, the success rate ρ_s is quite low while the descent rate ρ_d is quite high. On the contrary, for the smallest values of α in the candidate set, the success rate ρ_s is quite high while the descent rate ρ_d is quite low.

Remark 1:
Property 3) provides additional support for the way the rotation factor is changed in the standard continuous STA, i.e., α is not kept constant but decreases periodically from a maximum value α_max to a minimum value α_min. Nevertheless, it is obvious that this way of changing the state transition factors is not optimal.

IV. STATE TRANSITION ALGORITHM WITH OPTIMAL PARAMETER SELECTION
Inspired by the statistical study of the state transformation factors, in this section an optimal parameter selection strategy is proposed to accelerate the search of the standard continuous state transition algorithm.
A. Optimal parameter selection for the state transformation factors
In classical iterative methods for numerical optimization, the following iterative formula is usually adopted:

    x_{k+1} = x_k + a_k d_k    (11)

where d_k is the search direction and a_k is the step size. For gradient-based algorithms, the search direction is related to the gradient at the current iterate; for instance, in the steepest descent method, d_k = −∇f(x_k), and the step size a_k is often restricted to the range [0, 1]. It can be found that the pattern of the iterative formulas in continuous STA is similar to that of Eq. (11), with the correspondence

    { (1/(n||s_k||_2)) R_r s_k,  R_t (s_k − s_{k−1})/||s_k − s_{k−1}||_2,  R_e s_k,  R_a s_k } ⇒ d̃_k,
    { α, β, γ, δ } ⇒ ã_k,

the big difference being that the search direction is not deterministic. Compared with gradient-based algorithms, the STA can be used for global optimization for at least two reasons: 1) the search is in all directions; 2) the search can go to any length. Compared with the traditional trust region method, the similarity is that parts of the STA (except the translation transformation) can be considered a special kind of trust region method, but the differences are: i) the STA uses the original function rather than its quadratic approximation; ii) the search direction in STA is stochastic.

For the rotation and translation transformations, the search zone is restricted to a hypersphere or a line, controlled by the corresponding transformation factors. For the expansion and axesion transformations, although the search zone can be expanded to the whole space in probability due to the Gaussian distribution, the search zone is still restricted and manipulated by the expansion and axesion factors. That is to say, in practical numerical computation, the neighborhood formed by the state transformation operators is controlled by the transformation factors to a large extent, as also verified by the statistical study.
To simplify parameter selection and accelerate the search process, the values of these parameters are all taken from the set Ω = {1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8}, and the parameter value with the smallest corresponding objective function value is chosen.

B. The proposed STA
Denote the optimal parameter as ã*; then we have

    ã* = arg min_{ã_k ∈ Ω} f(x_k + ã_k d̃_k)    (12)

In theory, the neighborhood formed by the state transformation operators contains infinitely many candidate solutions; however, only SE samples are used for evaluation in practice. That is to say, for a given parameter value, only SE samples are taken into consideration. In order to utilize the selected parameter more fully, its value is kept for a period of time, denoted T_p. To be more specific, the proposed STA can be outlined as follows:

    repeat
        Best ← expansion_w(funfcn, Best, SE, Ω)
        Best ← rotation_w(funfcn, Best, SE, Ω)
        Best ← axesion_w(funfcn, Best, SE, Ω)
    until the specified termination criterion is met

Meanwhile, rotation_w(·) in the above pseudocode is given for further explanation:

    [Best, α] ← update_alpha(funfcn, Best, SE, Ω)
    for i ← 1 to T_p do
        Best ← rotation(funfcn, Best, SE, α)
    end for

where the function update_alpha represents the implementation of selecting the optimal value of the rotation factor. The proposed STA differs from the standard STA in three aspects: 1) the periodical way of diminishing the transformation factors is no longer used; 2) the optimal parameter is selected for each state transformation; 3) the optimal parameter is kept in use for a period of time.

V. EXPERIMENTAL RESULTS
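Before turning to the experiments, the update_alpha step described in Section IV-B can be made concrete with a minimal NumPy sketch. This is an illustrative reading of Eq. (12) for the rotation factor only (the function and variable names are ours): each candidate factor from the set Ω is tried with SE rotation samples, and the factor whose best sample attains the smallest objective value is returned and then reused for T_p iterations.

```python
import numpy as np

OMEGA = [10.0 ** (-k) for k in range(9)]   # candidate factors {1, ..., 1e-8}

def update_alpha(fun, best, se, rng):
    # Eq. (12): evaluate SE rotation samples for every candidate factor and
    # return the (possibly improved) best solution together with the factor
    # that produced the smallest objective value.
    n = len(best)
    alpha_star, val_star, point_star = OMEGA[0], np.inf, best
    for alpha in OMEGA:
        for _ in range(se):
            Rr = rng.uniform(-1.0, 1.0, (n, n))        # Eq. (3)
            cand = best + alpha / (n * np.linalg.norm(best)) * Rr @ best
            v = fun(cand)
            if v < val_star:
                alpha_star, val_star, point_star = alpha, v, cand
    if val_star < fun(best):
        best = point_star
    return best, alpha_star
```

The greedy comparison at the end mirrors Eq. (8): the incumbent best solution is replaced only if a strictly better candidate was found during the parameter sweep.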
To verify the effectiveness of the proposed STA, the following additional benchmark functions are used for testing.

(5) Ackley function

    f(x) = 20 + e − 20 exp(−0.2 √((1/n) Σ_{i=1}^n x_i²)) − exp((1/n) Σ_{i=1}^n cos(2πx_i)),

where the global optimum is x* = (0, ..., 0) and f(x*) = 0.

(6) High Conditioned Elliptic function

    f(x) = Σ_{i=1}^n (10^6)^{(i−1)/(n−1)} x_i²,

where the global optimum is x* = (0, ..., 0) and f(x*) = 0.

(7) Michalewicz function

    f(x) = −Σ_{i=1}^n sin(x_i) sin^{20}(i x_i² / π),

where the global optimum is unknown, 0 ≤ x_i ≤ π, i = 1, ..., n.

(8) Trid function

    f(x) = Σ_{i=1}^n (x_i − 1)² − Σ_{i=2}^n x_i x_{i−1},

where the global optimum is x*_i = i(n + 1 − i) and f(x*) = −n(n + 4)(n − 1)/6.

(9) Schwefel function

    f(x) = Σ_{i=1}^n [−x_i sin(√|x_i|)],

where the global optimum is x* = (420.9687, ..., 420.9687) and f(x*) = −418.9829n, −500 ≤ x_i ≤ 500, i = 1, ..., n.

(10) Schwefel 1.2 function

    f(x) = Σ_{i=1}^n (Σ_{j=1}^i x_j)²,

where the global optimum is x* = (0, ..., 0) and f(x*) = 0.

(11) Schwefel 2.4 function

    f(x) = Σ_{i=1}^n [(x_i − 1)² + (x_1 − x_i²)²],

where the global optimum is x* = (1, ..., 1) and f(x*) = 0.

(12) Weierstrass function

    f(x) = Σ_{i=1}^n Σ_{k=0}^{k_max} [a^k cos(2πb^k(x_i + 0.5))] − n Σ_{k=0}^{k_max} a^k cos(πb^k),

where a = 0.5, b = 3, k_max = 20, the global optimum is x* = (0, ..., 0) and f(x*) = 0, −0.5 ≤ x_i ≤ 0.5, i = 1, ..., n.

Other metaheuristics are used for comparison, including GL-25 [36], CLPSO [37], SaDE [38] and ABC [39], with the same parameter settings as in those papers. The parameters in the proposed STA are set by experience as follows: SE = 30, T_p = 10 (additional experiments have verified the validity of these parameter values).
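For reference, three of the benchmarks above can be written in NumPy as follows. This is a sketch for checking the formulas; the function names are ours, and the inputs are 1-D arrays.

```python
import numpy as np

def ackley(x):
    # Benchmark (5): global minimum f(0) = 0.
    n = len(x)
    return (20.0 + np.e
            - 20.0 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / n))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / n))

def schwefel(x):
    # Benchmark (9): global minimum about -418.9829 * n at x_i = 420.9687.
    return np.sum(-x * np.sin(np.sqrt(np.abs(x))))

def weierstrass(x, a=0.5, b=3.0, kmax=20):
    # Benchmark (12): global minimum f(0) = 0.
    k = np.arange(kmax + 1)
    inner = np.sum(a**k * np.cos(2 * np.pi * b**k * (x[:, None] + 0.5)), axis=1)
    return np.sum(inner) - len(x) * np.sum(a**k * np.cos(np.pi * b**k))
```

Evaluating each function at its stated optimum is a quick sanity check on the reconstructed formulas.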
TABLE V. Comparisons among various algorithms on test functions. For each function f1-f12 and each dimension (20, 30, 50), the mean ± standard deviation over the independent runs is reported for GL-25, CLPSO, SaDE, ABC, the standard STA and the proposed STA. [The numeric entries are not recoverable from the source.]
Fig. 2. The average elapsed time for different metaheuristic methods with respect to f3 and f10, respectively.

Fig. 3. The average iterative curves for different metaheuristic methods with respect to the Rosenbrock function.
Fig. 4. The average iterative curves for different metaheuristic methods with respect to the Schwefel 2.4 function.

The number of decision variables n of the benchmark functions is set to 20, 30 and 50, and the corresponding maximum number of function evaluations is set at 5e4·n·log(n). A total of 20 independent runs are conducted in MATLAB (Version R2010b) on an Intel(R) Core(TM) i3-2310M CPU @ 2.10 GHz under the Windows 7 environment. The statistical results are given in Table V, and some typical instances with respect to elapsed time and iterative curves are illustrated in Figs. 2-4.

From the experimental results, it can be found that the proposed STA is superior to the standard STA on most of these test problems. The global search ability (see the Michalewicz function) and the solution accuracy (see the Schwefel 1.2 and Schwefel 2.4 functions) have been greatly improved. The proposed STA is also comparable to other metaheuristics, except on the Michalewicz function and the Schwefel 2.4 function. However, it should be noted that only the mean and standard deviation are given for comparison. In fact, for the Michalewicz function, the results obtained by the proposed STA hit the known global solution in more than 50% of the total runs.

VI. CONCLUSION AND FUTURE WORK
In this study, the optimal parameter selection of the operators in continuous STA was considered to improve its search performance. Firstly, a statistical study with four benchmark cases was conducted to investigate how these parameters affect the performance of continuous STA, and several properties were observed from this study. With the experience gained from the statistical results, a new continuous STA with an optimal parameter selection strategy was then proposed to accelerate its search process. The proposed STA was successfully applied to other benchmarks, and comparison with other metaheuristics demonstrated the effectiveness of the proposed method as well. It should be noted that the selection of the parameter T_p is left for future work.

ACKNOWLEDGMENT
This study is supported by the National Natural Science Foundation of China (Grant Nos. 61503416, 61533021, 61621062 and 61725306), the Innovation-Driven Plan in Central South University (Grant No. 2018CX12), the 111 Project (Grant No. B17048) and the Hunan Provincial Natural Science Foundation of China (Grant No. 2018JJ3683).

REFERENCES

[1] X. Zhou, C. Yang, and W. Gui, "State transition algorithm," Journal of Industrial and Management Optimization, vol. 8, no. 4, pp. 1039-1056, 2012.
[2] X. Zhou, D. Y. Gao, C. Yang, and W. Gui, "Discrete state transition algorithm for unconstrained integer optimization problems," Neurocomputing, vol. 173, pp. 864-874, 2016.
[3] X. Zhou, C. Yang, and W. Gui, "Nonlinear system identification and control using state transition algorithm," Applied Mathematics and Computation, vol. 226, pp. 169-179, 2014.
[4] X. Zhou, D. Y. Gao, and A. R. Simpson, "Optimal design of water distribution networks by a discrete state transition algorithm," Engineering Optimization, vol. 48, no. 4, pp. 603-628, 2016.
[5] X. Zhou, P. Shi, C.-C. Lim, C. Yang, and W. Gui, "A dynamic state transition algorithm with application to sensor network localization," Neurocomputing, vol. 273, pp. 237-250, 2018.
[6] F. Zhang, C. Yang, X. Zhou, and W. Gui, "Fractional-order PID controller tuning using continuous state transition algorithm," Neural Computing and Applications, pp. 1-10, 2016.
[7] G. Saravanakumar, K. Valarmathi, M. P. Rajasekaran, S. Srinivasan, M. W. Iruthayarajan, and V. E. Balas, "Tuning multivariable decentralized PID controller using state transition algorithm," Studies in Informatics and Control, vol. 24, no. 4, pp. 367-378, 2015.
[8] G. Wang, C. Yang, H. Zhu, Y. Li, X. Peng, and W. Gui, "State-transition-algorithm-based resolution for overlapping linear sweep voltammetric peaks with high signal ratio," Chemometrics and Intelligent Laboratory Systems, vol. 151, pp. 61-70, 2016.
[9] J. Han, C. Yang, X. Zhou, and W. Gui, "A new multi-threshold image segmentation approach using state transition algorithm," Applied Mathematical Modelling, vol. 44, pp. 588-601, 2017.
[10] C. Wang, H. Zhang, W. Fan, and X. Fan, "A new wind power prediction method based on chaotic theory and Bernstein neural network," Energy, vol. 117, pp. 259-271, 2016.
[11] J. Han, C. Yang, X. Zhou, and W. Gui, "Dynamic multi-objective optimization arising in iron precipitation of zinc hydrometallurgy," Hydrometallurgy, vol. 173, pp. 134-148, 2017.
[12] M. Huang, X. Zhou, T. Huang, C. Yang, and W. Gui, "Dynamic optimization based on state transition algorithm for copper removal process," Neural Computing and Applications, pp. 1-13, 2017.
[13] Z. Huang, C. Yang, X. Zhou, and W. Gui, "A novel cognitively inspired state transition algorithm for solving the linear bi-level programming problem," Cognitive Computation, pp. 1-11, 2018.
[14] Y. Wang, H. He, X. Zhou, C. Yang, and Y. Xie, "Optimization of both operating costs and energy efficiency in the alumina evaporation process by a multi-objective state transition algorithm," The Canadian Journal of Chemical Engineering, vol. 94, no. 1, pp. 53-65, 2016.
[15] Y. Xie, S. Wei, X. Wang, S. Xie, and C. Yang, "A new prediction model based on the leaching rate kinetics in the alumina digestion process," Hydrometallurgy, vol. 164, pp. 7-14, 2016.
[16] S. Xie, C. Yang, X. Wang, and Y. Xie, "Data reconciliation strategy with time registration for the evaporation process in alumina production," The Canadian Journal of Chemical Engineering, vol. 96, no. 1, pp. 189-204, 2018.
[17] C. Yang, S. Deng, Y. Li, H. Zhu, and F. Li, "Optimal control for zinc electrowinning process with current switching," IEEE Access, vol. 5, pp. 24688-24697, 2017.
[18] F. Zhang, C. Yang, X. Zhou, and H. Zhu, "Fractional order fuzzy PID optimal control in copper removal process of zinc hydrometallurgy," Hydrometallurgy, vol. 178, pp. 60-76, 2018.
[19] X. Zhou, J. Zhou, C. Yang, and W. Gui, "Set-point tracking and multi-objective optimization-based PID control for the goethite process," IEEE Access, 2018.
[20] J. Wang, G. Liang, and J. Zhang, "Cooperative differential evolution framework for constrained multiobjective optimization," IEEE Transactions on Cybernetics, 2018.
[21] Y. Wang, D.-Q. Yin, S. Yang, and G. Sun, "Global and local surrogate-assisted differential evolution for expensive constrained optimization problems with inequality constraints," IEEE Transactions on Cybernetics, 2018.
[22] J. Zhang, X. Zhu, Y. Wang, and M. Zhou, "Dual-environmental particle swarm optimizer in noisy and noise-free environments," IEEE Transactions on Cybernetics, 2018.
[23] A. E. Eiben and S. K. Smit, "Parameter tuning for configuring and analyzing evolutionary algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 19-31, 2011.
[24] J. Sun, W. Fang, X. Wu, V. Palade, and W. Xu, "Quantum-behaved particle swarm optimization: Analysis of individual particle behavior and parameter selection," Evolutionary Computation, vol. 20, no. 3, pp. 349-393, 2012.
[25] W. Zhang, D. Ma, J.-J. Wei, and H.-F. Liang, "A parameter selection strategy for particle swarm optimization based on particle positions," Expert Systems with Applications, vol. 41, no. 7, pp. 3576-3584, 2014.
[26] R. Mallipeddi, P. N. Suganthan, Q.-K. Pan, and M. F. Tasgetiren, "Differential evolution algorithm with ensemble of parameters and mutation strategies," Applied Soft Computing, vol. 11, no. 2, pp. 1679-1696, 2011.
[27] R. A. Sarker, S. M. Elsayed, and T. Ray, "Differential evolution with dynamic parameters selection for optimization problems," IEEE Transactions on Evolutionary Computation, vol. 18, no. 5, pp. 689-707, 2014.
[28] Y. Wang, Z. Cai, and Q. Zhang, "Differential evolution with composite trial vector generation strategies and control parameters," IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 55-66, 2011.
[29] D. Karaboga and B. Gorkemli, "A quick artificial bee colony (qABC) algorithm and its performance on optimization problems,"
Applied SoftComputing , vol. 23, pp. 227–238, 2014.[30] ´A. E. Eiben, R. Hinterding, and Z. Michalewicz, “Parameter control inevolutionary algorithms,”
IEEE Transactions on evolutionary computa-tion , vol. 3, no. 2, pp. 124–141, 1999.[31] K. De Jong, “Parameter setting in eas: a 30 year perspective,” in
Parameter setting in evolutionary algorithms , pp. 1–18, Springer, 2007.[32] G. Karafotias, M. Hoogendoorn, and ´A. E. Eiben, “Parameter control inevolutionary algorithms: Trends and challenges,”
IEEE Transactions onEvolutionary Computation , vol. 19, no. 2, pp. 167–187, 2015.[33] X. Zhou, C. Yang, and W. Gui, “Initial version of state transition al-gorithm,” in
Second International Conference on Digital Manufacturingand Automation (ICDMA) , pp. 644–647, IEEE, 2011.[34] X. Zhou, C. Yang, and W. Gui, “A new transformation into state tran-sition algorithm for finding the global minimum,” in ,vol. 2, pp. 674–678, IEEE, 2011.[35] X. Zhou, C. Yang, and W. Gui, “A comparative study of sta on largescale global optimization,” in , pp. 2115–2119, IEEE, 2016.[36] C. Garc´ıa-Mart´ınez, M. Lozano, F. Herrera, D. Molina, and A. M.S´anchez, “Global and local real-coded genetic algorithms based onparent-centric crossover operators,”
European Journal of OperationalResearch , vol. 185, no. 3, pp. 1088–1113, 2008.[37] J. J. Liang, A. K. Qin, P. N. Suganthan, and S. Baskar, “Comprehensivelearning particle swarm optimizer for global optimization of multimodalfunctions,”
IEEE transactions on evolutionary computation , vol. 10,no. 3, pp. 281–295, 2006.[38] A. K. Qin, V. L. Huang, and P. N. Suganthan, “Differential evolutionalgorithm with strategy adaptation for global numerical optimization,”
IEEE transactions on Evolutionary Computation , vol. 13, no. 2, pp. 398–417, 2009.[39] D. Karaboga and B. Basturk, “A powerful and efficient algorithm fornumerical function optimization: artificial bee colony (abc) algorithm,”
Journal of global optimization , vol. 39, no. 3, pp. 459–471, 2007. Xiaojun Zhou received his Bachelor’s degree inAutomation in 2009 from Central South University,Changsha, China and received the Ph.D. degreein Applied Mathematics in 2014 from FederationUniversity Australia.He is currently an Associate Professor at CentralSouth University, Changsha, China. His main inter-ests include modeling, optimization and control ofcomplex industrial process, optimization theory andalgorithms, state transition algorithm, duality theoryand their applications.
Chunhua Yang received the M.S. degree in automatic control engineering and the Ph.D. degree in control science and engineering from Central South University, Changsha, China, in 1988 and 2002, respectively. From 1999 to 2001, she was a Visiting Professor with the University of Leuven, Leuven, Belgium. Since 1999, she has been a Full Professor with the School of Information Science and Engineering, Central South University. From 2009 to 2010, she was a Senior Visiting Scholar with the University of Western Ontario, London, Canada. Her current research interests include modeling and optimal control of complex industrial processes, fault diagnosis, and intelligent control systems.