General Subpopulation Framework and Taming the Conflict Inside Populations
Danilo Vasconcellos Vargas, Junichi Murata, Hirotaka Takano, Alexandre Claudio Botazzo Delbem
Danilo Vasconcellos Vargas ([email protected])
School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, 819-0395, Japan

Junichi Murata ([email protected])
School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, 819-0395, Japan

Hirotaka Takano ([email protected])
School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, 819-0395, Japan

Alexandre Cláudio Botazzo Delbem ([email protected])
Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, 13566-590, Brazil
Abstract
Structured evolutionary algorithms have been investigated for some time. However, they have been under-explored, especially in the field of multi-objective optimization. Despite their good results, the use of complex dynamics and structures makes their understanding and adoption rate low. Here, we propose the general subpopulation framework, which has the capability of integrating optimization algorithms without restrictions as well as aiding the design of structured algorithms. The proposed framework is capable of generalizing most of the structured evolutionary algorithms, such as cellular algorithms, island models, spatial predator-prey and restricted mating based algorithms, under its formalization. Moreover, we propose two algorithms based on the general subpopulation framework, demonstrating that with the simple addition of a number of single-objective differential evolution algorithms, one for each objective, the results improve greatly, even when the combined algorithms behave poorly when evaluated alone in the tests. Most importantly, the comparison between the subpopulation algorithms and their related panmictic algorithms suggests that the competition between different strategies inside one population can have deleterious consequences for an algorithm, revealing a strong benefit of using the subpopulation framework. The code for SAN, the multi-objective algorithm which has the current best results in the hardest benchmark, is available at the following link.
Keywords
Structured Evolutionary Algorithms, Parallel Evolutionary Algorithms, Hybridization, Multi-objective Algorithms, Novelty Search, General Subpopulation Framework, General Differential Evolution.
Although particle swarm optimization algorithms, differential evolution and genetic algorithms follow different lines of thought, they can all be seen from the same framework or structure. Not only these types but most of the algorithms in evolutionary computation share the same framework: they are based on a single population of individuals, which interacts in some form to produce new individuals inside the same population. These types of algorithms are usually given the name of unstructured or panmictic EAs [42].

On the other hand, island based models and cellular algorithms achieved relevant improvements, indicating that the evolutionary bioinspiration, when extended to include concepts of subpopulations and neighborhood aspects, can be advantageous [46]. These types of algorithms are called structured models.

Nonetheless, the use of structured algorithms in multi-objective optimization has been under-explored [37]. Lately, researchers have started asking what could be the next step (future research trends) [5], since very simple and effective algorithms have been developed and it is hard to improve them without losing any of their benefits. This article tackles this problem from a different perspective. Here, we switch the focus from algorithms to frameworks. Moreover, when changing from a panmictic to a structured framework, small and simple changes may give relevant improvements to algorithms of the state of the art.

This article proposes the subpopulation framework, which has the following features:

• Integration Capability - It allows for the addition of any number of algorithms, which are integrated as subpopulations of the framework. Although this feature is not new (for example, it was explored similarly in island models [32]), here we show that not only evolutionary algorithms (EAs) but any optimization algorithm can be integrated in this framework. It is not required for these algorithms to be population based either (examples of how this can be constructed are given in Section 6.2).

• General Formulation - This framework is a general case for most of the structured approaches, including but not limited to cellular algorithms and island based models (Section 6.1). The formalized subpopulation framework also generalizes the panmictic framework, because the panmictic framework is its special case when the number of subpopulations is fixed to one and the IM matrix set (which describes the interaction among subpopulations of the proposed framework, further explained in Section 6) can be ignored.

• State of the Art Solutions - Experimentally, it was shown that algorithms based on the subpopulation framework can achieve state of the art results (Section 9). In fact, the results of the Subpopulation Algorithm based on Novelty (SAN) (Section 7.2) can reasonably be regarded as among the most robust to date in multi-objective optimization, solving different types of problems in bi-objective and many-objective settings with excellent results and surpassing the third version of the Generalized Differential Evolution algorithm (in short GDE3, currently one of the most suitable MOEAs of the state of the art [11]) in most of the tests.

Experiments are conducted with two novel algorithms that implement the proposed subpopulation framework. These algorithms are developed based on single population (panmictic) ones. (The definition of framework used in this article refers to a basic structure underlying a set of algorithms that is formalized and exemplified, enabling the understanding and analysis of a class of algorithms rather than a single one.) The chosen panmictic algorithms, which were also used
for comparison, are the GDE3 and a simple novelty search algorithm called the Multi-Objective Novelty Algorithm (which is also a contribution of this article, described in Section 5.2). Here, the intention is to choose algorithms as different as possible to show some aspects of the subpopulation framework and its applicability to any type of algorithm. Notice that the dissimilarities between the GDE3 and the Multi-Objective Novelty Algorithm arise from the fact that the former is objective-based while the latter is novelty-based (further explanation of novelty search is given in Section 5). In fact, it will be shown that the differences present in the strategies of two or more subpopulations benefit their integration, in contrast with the competition which arises when different strategies are present in a single population.

This article shows that simple subpopulation dynamics can give relevant improvements when combined with an algorithm of the state of the art in the proposed framework, demonstrating the strong benefits of the subpopulation framework. Additionally, the competition between different strategies inside the traditional single-population framework can have deleterious consequences for an algorithm. This is analyzed and verified experimentally in Section 9.5. Such problems confronted by the panmictic algorithms are similar to the ones confronted by objective-based algorithms when contrasted with novelty-search based algorithms [30], since they are easily trapped in deceptive fitness landscapes.

The solution provided by the subpopulation framework is that the presence of multiple populations with different dynamics will let the algorithm be less sensitive to local optima.

Finally, this article presents a discussion of an unexpected result, where the experimental results with a combination of three simple subpopulations achieved state of the art quality in the WFG Toolkit [21] (presented and explained in Sections 9.3 and 9.5).

Sections 2, 3, 4 and 5 briefly review the literature on, respectively, similar structured EAs, differential evolution in single-objective optimization, differential evolution multi-objective algorithms and novelty search. Thereafter, Section 6 proposes the general subpopulation framework. Section 7 describes two subpopulation algorithms which use the general subpopulation framework as their basis. Section 8 presents the methodology used for comparison, and Section 9 describes the problems' characteristics and shows the results obtained on them. Lastly, the conclusions are presented in Section 10.

Evolutionary Computation Volume x, Number x
On one hand, the usual type of EAs pertains to a class of single population algorithms, which we call here the single-population framework; they are also known as panmictic EAs. On the other hand, there are other algorithms which spread their population into a structure with some defined interrelationship [1]. This paper will follow the definition that structured algorithms are any procedure whose population may be formulated with subpopulation groups, with the number of possible non-trivial subpopulation groups necessarily greater than one. For example, the simple EA cannot be seen as a structured algorithm, since the number of possible subpopulation groups can never be formulated as greater than one [15]. Multi-objective ELSA is a local selection algorithm which also cannot be seen as a structured algorithm [35]. Note that some procedures, such as restricted mating, fit the previous definition of structured algorithms [52]. Therefore, restricted mating based algorithms can be seen as structured algorithms (see Section 6.1 for the complete description).

Parallel EAs are usually examples of structured EAs, which are sometimes divided into three classes [17], [42]:
1. Island Model: The basic structure used by this model consists of multiple subpopulations, where a limited amount of genetic information is exchanged arbitrarily between any of them;

2. Stepping Stone Model: In this model a neighborhood relation is defined, where only adjacent subpopulations can exchange information. Aside from that, it is defined in the same way as the Island Model;

3. Neighborhood Model: A complex single population structure, where individuals interact only with adjacent individuals.

The cellular algorithm [34] (also called the fine grained model or lattice model), for example, pertains to the third class.

According to [6], parallel MOEA models can also be divided into three classes: global parallelization, coarse grain and fine grain. Global parallelization does not present any structured population aspect, while coarse grain (also called island GAs) and fine grain (also called cellular GAs) are parallel versions of the structured algorithms already mentioned. In [45], the classifications of the parallel models differ from the previous three classes, though from a population structure point of view they can still be converted to them.

Other types of EAs were also developed where the evolutionary conditions differed from subpopulation to subpopulation. These were called nonstandard structured EAs and they were reviewed by Alba and Tomassini in [1]. Another extensive review of single-objective structured EAs can be found in the book of Tomassini [46].

Regarding multi-objective algorithms, there are also some algorithms which are structured. To cite some: multi-objective cellular algorithms [37], some rudimentary subpopulation algorithms [41], [8], the spatial predator-prey MOEA [28] and multi-colony ant algorithms [23]. The spatial predator-prey MOEA defines an adjacency matrix with edges as solutions, where the predator makes a random walk and erases the worst solution in the neighborhood with respect to a given objective [28]. There are as many predators walking as there are objectives. Ant colony optimization algorithms construct a population of solutions by sampling from a probabilistic model (usually in the form of a matrix of pheromone). This matrix of pheromone is constantly updated by the ants. Although they cannot be defined as structured algorithms by the definition above, their multi-colony version can be. Multi-colony optimization algorithms normally use multiple matrices of pheromone with some rules to decide how and which pheromone matrix is to be updated/used.

Moreover, a generalized framework for structured algorithms is still nonexistent. This article fills this gap by formalizing a general unifying framework capable of representing most if not all of these structured models.
Although it is a single population algorithm, AMALGAM is related to the proposed framework, since both can be used to integrate algorithms. AMALGAM is a panmictic multi-objective algorithm that creates a number of offspring points using genetic operators from different algorithms. Fast non-dominated sorting is used to rank the offspring together with the previous population, subsequently defining the next population [49]. As told before, one important difference between AMALGAM and the proposed framework is that the former is panmictic. Therefore, it has the disadvantage that multiple algorithms joined together may conflict with each other in the single pool of solutions.
Another important difference is that AMALGAM can only define the integration of algorithms with biological models for population evolution, since genetic operators are necessary for the integration. Here, the proposed framework defines the integration of any optimization algorithm.

The portfolio design proposed by [16] runs different algorithms (strategies) or copies of the same strategy with the objective of selecting the best strategy for the given problem. The details of how strategies are selected and evaluated, as well as the strategies themselves, depend on the problem at hand [19]. The strategies run without communication between each other. Therefore, when considered in the light of the framework described here, the set of interaction matrices is null and can be ignored (interaction matrices are part of the framework defined in Section 6). The similarities between this method and the proposed framework are limited to the use of multiple algorithms together.
Differential Evolution (DE) is a meta-heuristic contained in the subfield of evolutionary computation, which can be employed for optimizing multi-dimensional real-valued functions, where these functions are neither required to be continuous nor to be differentiable. It solves problems using a simple algorithm similar to the ones used by EAs, but the operators used by DE are not based on the evolution of species [43]. The algorithm is described succinctly in Table 1, and the procedures of mutation, crossover and selection are explained in the following subsections.

Table 1: Differential Evolution Algorithm

1. Initialize population with random samples uniformly distributed over the search space
2. Repeat for each individual until a criterion of convergence is met:
   (a) Mutation
   (b) Crossover
   (c) Selection
3. Return solution
For each vector x_{i,g}, where i is the index of this vector (which relates to the individual index in the population, since each individual has its own vector) and g is the current generation where the vector takes place, the mutation is applied by creating a mutant vector based on the numerical operator described in Equation 1:

    v_{i,g+1} = x_{r1,g} + F (x_{r2,g} − x_{r3,g}),   (1)

where r1, r2 and r3 are randomly selected individuals of the population, which must differ from the individual i. F is a parameter which should meet the condition F ∈ [0, 2].

During the crossover, trial vectors u_{i,g+1} are created from a combination of the mutant vector v_{i,g+1} and the original vector x_{i,g}. The trial vector created is expressed in Equation 2:

    u_{i,j,g+1} = { x_{i,j,g}    if rand() > CR and j ≠ rnd_i;
                  { v_{i,j,g+1}  if rand() ≤ CR or j = rnd_i,   (2)

where rand() ∈ [0, 1] is a uniformly distributed random number, CR ∈ [0, 1] is a parameter passed to DE, j is the vector component index and rnd_i is a randomly chosen index, with the objective of choosing at least one component from the vector v_{i,g+1}.

The selection is the last step of the generation, where it is determined for each vector whether the trial vector u_{i,g+1} will substitute the original vector x_{i,g} or not. For this, both u_{i,g+1} and x_{i,g} are evaluated and the vector with the better fitness is kept, forming the next generation vector x_{i,g+1}.

The DE algorithm and its variations are known for their robustness, quality of solutions, short running time, ease of use and applicability to a wide range of problems not limited by the type of the objective function [43], [3]. Promising results were obtained in numerous different experiments. Two variations of it achieved the best solutions on all problems from ICEC'96 [44].
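The mutation, crossover and selection steps described above can be put together in a minimal sketch of the classic DE/rand/1/bin scheme (Equations 1 and 2). The sphere test function and all parameter values below are illustrative assumptions, not the settings used in this article's experiments:

```python
import numpy as np

def de_generation(pop, fitness, f, F=0.8, CR=0.9, rng=None):
    """Apply one DE generation (mutation, crossover, selection) to `pop`."""
    if rng is None:
        rng = np.random.default_rng()
    n, d = pop.shape
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(n):
        # Mutation (Eq. 1): three distinct individuals, all different from i
        r1, r2, r3 = rng.choice([j for j in range(n) if j != i], size=3, replace=False)
        v = pop[r1] + F * (pop[r2] - pop[r3])
        # Crossover (Eq. 2): binomial crossover, at least one mutant component
        rnd_i = rng.integers(d)
        mask = rng.random(d) <= CR
        mask[rnd_i] = True
        u = np.where(mask, v, pop[i])
        # Selection: the trial vector survives only if it is at least as good
        fu = f(u)
        if fu <= fitness[i]:
            new_pop[i], new_fit[i] = u, fu
    return new_pop, new_fit

def sphere(x):                                   # illustrative test function
    return float(np.sum(x ** 2))

rng = np.random.default_rng(0)
pop = rng.uniform(-5, 5, size=(20, 3))           # step 1: uniform initialization
fit = np.array([sphere(x) for x in pop])
for _ in range(100):                             # step 2: repeat until a stop criterion
    pop, fit = de_generation(pop, fit, sphere, rng=rng)
print(fit.min())                                 # best objective value found
```

Because selection is greedy, the best fitness in the population is non-increasing from one generation to the next.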
In the work of [48] it was shown to achieve more accurate solutions, faster and with greater robustness than Particle Swarm Optimization (PSO) and Evolutionary Algorithms (EAs). At the state of the art, DE is still compared on equal grounds to complex optimization algorithms (e.g., Estimation of Distribution Algorithms) [14].

DE was shown to achieve significant improvements over other single-objective [48] as well as multi-criteria optimization algorithms [47], [11]. The reason behind these overall better results lies partially in the rotationally invariant behavior of DE's operators, which adapts to the fitness landscape, when compared with NSGA-II's genetic operators and other algorithms with similar genetic operators [22]. Recent studies show that in multi-objective problems, DE is one of the best approaches when the problem size increases in scale [11].

There are various multi-objective methods based on differential evolution [4]. They can be divided into old versions of algorithms, which used only Pareto dominance to select individuals, and modern methods, which use Pareto dominance together with a diversity measure for selection [47]. It is generally accepted that the third version of the generalized differential evolution algorithm (GDE3 [25]) and the differential evolution multi-objective algorithm (DEMO) [40] are the representatives of the modern class of multi-objective algorithms based on DE [47], [11]. Taking into account that DEMO [40] is similar to the GDE3 [25] algorithm, without constraint handling and without a fallback to the original DE in the single-objective case, we will conduct the comparison and study on GDE3 solely.

Recently, a comparison between eight modern multi-objective algorithms was made [11]. It showed evidence that GDE3 is currently one of the most suitable MOEAs of the state of the art. Among the results, it is stated that GDE3 tends not only
to be faster, but also to scale better in relation to the number of decision variables. In the tests made, there was only one other algorithm of the state of the art, based on the PSO approach, with similar performance.
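GDE3's modified selection, described next, reduces in the unconstrained case to a Pareto-dominance comparison between each solution and its trial vector. A minimal sketch of this comparison (the function names are our own, and minimization is assumed):

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def gde3_select(parent_objs, trial_objs):
    """Compare each parent with its trial vector and return the kept solutions."""
    kept = []
    for p, t in zip(parent_objs, trial_objs):
        if dominates(t, p):
            kept.append(t)              # trial replaces the parent
        elif dominates(p, t):
            kept.append(p)              # parent survives
        else:
            kept.extend([p, t])         # mutually non-dominated: keep both
    return kept
```

The last branch is what allows the population to grow, which in turn is what makes a pruning stage necessary.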
GDE3 has the same basic loop as DE, with a modification in the selection phase and the addition of a pruning stage. In the selection phase, the algorithm considers Pareto dominance and the constraints. Let s and t correspond respectively to a solution and its trial solution. Then, in the selection phase the following statements apply:

• If both s and t are infeasible, the trial solution t substitutes s only when it dominates the solution s in unconstrained space;
• If one solution is feasible and the other is infeasible, the feasible solution is chosen;
• Finally, if both solutions are feasible, the solution which dominates the other is kept. However, if neither one dominates the other, both solutions are added to the next population, increasing the size of the population.

As a consequence of the modifications in the selection stage, the pruning stage was added to keep the population to a minimum, because GDE3's selection phase described above can make the population grow in size. The pruning stage consists of sorting based on a diversity measure, consecutively selecting the first individuals to fill the next population.

In its first version, GDE3 used the crowding distance as its diversity measure [25], similar to NSGA-II [7]. In its most recent version, however, the k-nearest neighbors measure was used as the distance measure. This metric was shown to be more consistent than the crowding distance measure when the number of objectives is greater than two [26]. The experiments conducted in this paper use GDE3 with the k-nearest neighbors measure.

In nature, evolution is usually observed as an open-ended process which continually creates individuals with greater complexity and diversity [33].
Novelty search is a method developed by Lehman and Stanley that mimics the open-ended evolutionary process with a simple novelty metric [30], [29], rewarding novel individuals with a direct measure of novelty.

Moreover, from the perspective of optimization, problems are sometimes deceptive. This is usually the case for real world problems, because when problems increase in size and complexity it is improbable that a fitness function exists which can drive the algorithm directly to the goal. Novelty search aids the optimization in these deceptive spaces by identifying stepping stones, which are the novel individuals found by the novelty metric.

Recently, novelty search has been used in very distinct areas such as neuro-evolution [36], [30], genetic programming [31], multi-objective evolution [36] and robotics [9], [10]. Moreover, there is an ever increasing number of articles with further evidence of novelty search's benefits in deceptive problems. Some papers even showed the astonishing finding that novelty search can sometimes be used as a substitute for objective-based search [30], [50]. The good results of novelty search in relation to objective-based search revealed that objective-based search may have deleterious effects on the search.
For measuring the novelty of a solution, novelty search relies on a metric, which can be any equation capable of describing how novel an individual is in comparison with the past individuals of the archive. The usual metric is the k-nearest neighbors, which was also employed by Lehman and Stanley in their pioneering work on novelty search [29]. The following equation defines it exactly:

    p(x) = (1/k) Σ_{i=1}^{k} dist(x, μ_i),   (3)

where k is a parameter defined arbitrarily and μ_i is the i-th nearest neighbor of x according to the distance measure dist(). The distance measure is problem dependent. Usually, it is calculated in the behavior space rather than the fitness space, where the behavior space is composed of the small set of features which identifies a unique behavior (reducing the search space and differing in this way from exhaustive enumeration). The archive is an incremental set of individuals, receiving new individuals only if they surpass a novelty threshold n_min adjusted automatically by some rule.

It often goes unnoticed, but one of the problems of this novelty metric lies in its dynamic adjustment, i.e., the parameters used to update the archive. The following dynamics are commonly used to update the metric:

• if more than n_a individuals entered the archive, multiply n_min by n_inc;
• if n_r individuals did not enter the archive, multiply n_min by n_dec;

where n_a and n_r are positive integers (referring to numbers of individuals) and n_inc, n_dec ∈ R with n_inc > 1 and 0 < n_dec < 1 (referring to values of the novelty metric). These parameters define the rate of individuals which enter the archive.
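The k-nearest-neighbor metric of Equation 3 and the threshold-update rule above can be sketched as follows (the default parameter values and the batch-wise update are illustrative assumptions):

```python
import math

def novelty(x, archive, k=3):
    """Mean distance from behaviour `x` to its k nearest archive members (Eq. 3)."""
    dists = sorted(math.dist(x, a) for a in archive)
    nearest = dists[:k]                 # if the archive holds fewer than k, use all
    return sum(nearest) / len(nearest)

def update_threshold(n_min, added, rejected, n_a=4, n_r=10, n_inc=1.2, n_dec=0.95):
    """Adapt the archive-entry threshold n_min after a batch of evaluations."""
    if added > n_a:                     # too many individuals entered: raise the bar
        n_min *= n_inc
    if rejected >= n_r:                 # too many were rejected: lower the bar
        n_min *= n_dec
    return n_min
```

A larger archive, produced by a smaller n_min, makes the metric more sensitive but also slower to evaluate, as discussed next.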
It follows that the bigger the archive is, the more sensitive the novelty metric is in identifying new individuals, because the higher the number of points, the less separated the points will be from each other. A bigger archive is thus a direct result of a smaller n_min and consequently of a more sensitive search, with fewer chances of letting new individuals go unnoticed. On the other hand, a bigger archive makes the metric evaluation slower.

In this section, we propose MONA, the first algorithm to use novelty in a multi-objective context. The algorithm uses solely novelty search. Therefore, it follows the same line as the Lehman and Stanley study [30], hypothesizing that an algorithm based on novelty alone might be better than objective based methods. MONA is a very simple algorithm proposed in this article, where the space of all the objectives is taken to be the behavior space of the novelty, differently from the Mouret approach [36], where novelty was seen as an additional objective. Table 2 describes the algorithm.

The purpose of this algorithm is to be a very simple algorithm, which will be compared as well as used in the general subpopulation framework, showing that from very simple bases efficient and robust algorithms can be constructed.
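The MONA loop (summarized in Table 2) can be sketched compactly. The bi-objective test function, the parameter values and the concrete n_min adaptation constants below are illustrative assumptions:

```python
import math
import random

def objectives(x):
    """Illustrative bi-objective function; MONA's behaviour space is the objective space."""
    return (sum(v * v for v in x), sum((v - 1.0) ** 2 for v in x))

def novelty(b, archive, k=3):
    """Mean distance in objective space to the k nearest archive members."""
    d = sorted(math.dist(b, objectives(a)) for a in archive)[:k]
    return sum(d) / len(d)

def mona(dim=2, pop_size=20, gens=40, F=0.5, CR=0.9, n_min=0.05, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(-2.0, 2.0) for _ in range(dim)] for _ in range(pop_size)]
    archive = [list(x) for x in pop]
    for _ in range(gens):
        added = rejected = 0
        for i, x in enumerate(pop):
            # Step 2a: DE-style mutation and crossover
            r1, r2, r3 = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            rnd = rng.randrange(dim)
            u = [r1[j] + F * (r2[j] - r3[j])
                 if rng.random() <= CR or j == rnd else x[j]
                 for j in range(dim)]
            # Steps 2b-2c: archive entry controlled by the novelty threshold
            if novelty(objectives(u), archive, k=3) > n_min:
                archive.append(u)
                added += 1
            else:
                rejected += 1
        # Step 2d: crude n_min adaptation (constants are assumptions)
        if added > 4:
            n_min *= 1.2
        if rejected >= 10:
            n_min *= 0.95
        # Step 2e: resample the population uniformly (with replacement) from the archive
        pop = [list(rng.choice(archive)) for _ in range(pop_size)]
    # Step 3: return the archive's non-dominated solutions
    objs = [objectives(a) for a in archive]
    return [a for a, oa in zip(archive, objs)
            if not any(all(o[m] <= oa[m] for m in range(len(oa))) and o != oa
                       for o in objs)]
```

Note that the archive is unlimited in size, so the per-individual novelty evaluation becomes more expensive as the run progresses.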
The General Subpopulation Framework (GSF) is proposed here as an underlying structure of a class of multi-objective algorithms which unifies a number of structured EAs
in its formalization. Additionally, it is capable of integrating different optimization algorithms without restrictions. This flexible ability of joining algorithms together is important, as will be shown in the experiments, mostly because this type of cooperation between algorithms can sum their benefits while the competition between them in each subpopulation is decreased to a minimum.

Table 2: Multi-Objective Novelty Algorithm

1. Initialize population with random samples uniformly distributed over the search space
2. Repeat for each individual in the population until a criterion of convergence is met:
   (a) Apply the same mutation and crossover operators as used by DE
   (b) Calculate the novelty metric
   (c) Verify if its novelty metric is above the n_min threshold; if it is, insert the individual into the archive (unlimited in size)
   (d) Update n_min (see Section 5.1)
   (e) Create a new population by sampling uniformly with replacement from the archive
3. Return the archive's non-dominated solutions as the solution set

In this context, we define:

Definition 1
Subpopulation

A subpopulation is a finite set of individuals related by a group of well defined dynamics. These dynamics are usually (although not necessarily) composed of interactions of these individuals with either themselves or individuals of other subpopulations, but they are not in any way limited to this.

When connecting these subpopulations together, a new matrix appears, to which the name IM is given. It is formally defined as follows:

Definition 2 IM - Subpopulation Interaction Probability Matrix Set

The subpopulation interaction probability matrix set IM is a set of matrices of the form:

    IM = {IM_1, IM_2, ..., IM_m},   (4)

where m is the number of types of interactions used in an optimization algorithm, and each IM_i corresponds to the following matrix:

    IM_i = [ p_{i,1,1}  p_{i,1,2}  ...  p_{i,1,s} ]
           [ p_{i,2,1}  p_{i,2,2}  ...  p_{i,2,s} ]
           [    ...        ...     ...     ...    ]
           [ p_{i,s,1}  p_{i,s,2}  ...  p_{i,s,s} ],   (5)

where s is the number of subpopulations and p_{i,a,b} is the probability of an interaction i occurring in subpopulation a and taking as parameters the individuals of subpopulation b or the subpopulation b itself.

The evolutionary operators are examples of interactions. For example, in the case of a subpopulation based version of DE's operators, let us assume their interactions are described by IM_d. Then, for each individual of this subpopulation, the trial vector would be composed of three individuals chosen based on the probabilities of the IM_d matrix. Recall that the IM matrix set can be ignored in the case of only one subpopulation, and this is why it can be ignored for panmictic algorithms.

Notice also that the interactions can differ from subpopulation to subpopulation. In the case of just one subpopulation k having an interaction i, IM_i would be of the following form:

    IM_i = [    0          0       ...     0      ]
           [    ...        ...     ...     ...    ]
           [ p_{i,k,1}  p_{i,k,2}  ...  p_{i,k,s} ]
           [    ...        ...     ...     ...    ]
           [    0          0       ...     0      ].   (6)

Naturally, more complicated global dynamics might also be present, such as dynamical probabilities that depend on time t:

    IM_i = [ p_{t,i,1,1}  p_{t,i,1,2}  ...  p_{t,i,1,s} ]
           [ p_{t,i,2,1}  p_{t,i,2,2}  ...  p_{t,i,2,s} ]
           [     ...          ...      ...      ...     ]
           [ p_{t,i,s,1}  p_{t,i,s,2}  ...  p_{t,i,s,s} ].   (7)

Additionally, the population size variable is extended to a vector version, because the proposed framework has a number of subpopulations, each with a given size. This vector is hereby called S and is defined as follows:

Definition 3 S - Vector of Subpopulation Sizes

The subpopulations' sizes are defined by the vector S, which corresponds to:

    S = (ňp_1, ňp_2, ..., ňp_s),   (8)

where ňp_a is the size of subpopulation a. The total subpopulation size (ts) is naturally:

    ts = Σ_{j=1}^{s} ňp_j.   (9)

An equivalent and more convenient representation exists which is independent of the total subpopulation size. Let np_a = ňp_a / ts, corresponding to the ratio of the total subpopulation. Then, the following representation is also verified:

    np_a ∈ {x ∈ R : 0 < x < 1},   Σ_{j=1}^{s} np_j = 1.   (10)

With the previous definitions it is possible to describe explicitly the GSF:

Definition 4
GSF - General Subpopulation Framework

Suppose we have s subpopulations; then P is the set of subpopulations P = {P_1, ..., P_s} and A is the set of panmictic algorithms A = {A_1, ..., A_s}, where a subpopulation P_i is constructed by an algorithm (strategy) A_i. Therefore, the GSF is defined as a 4-tuple <P, A, S, IM>, where S and IM were previously defined as, respectively, the vector of subpopulation sizes and the set of interaction probability matrices.

The subpopulations may even be used to join arbitrary algorithms which may not even be based on populations. That is, as long as each algorithm can generate a set of solutions to compose the subpopulation which is representative of its dynamics, the subpopulation framework can handle the joining process (examples are given in Sections 6.2 and 7). For example, in the case of the random search algorithm, the subpopulation can be constructed from the last generated solutions. Therefore, to the knowledge of the authors, any algorithms can be joined (mixed) by using this framework. Naturally, for the inclusion of an algorithm in this framework it is also relevant, but not necessary, to have:

• Dynamics taking into account different individuals of its population (which can be modified to handle any individual of any population by the IM set of matrices);

• Dynamics different from those of the other subpopulations present in the framework. This can be relevant, since the higher the similarities between subpopulations are, the less important the subpopulations become; in other words, multiple subpopulations with similar dynamics will produce results similar to a single population.

The following subsections demonstrate how GSF can represent most of the optimization algorithms.
Section 6.1 shows how GSF can represent various types of structured EAs, while Section 6.2 gives two examples of famous algorithms (one a panmictic EA and the other a non-evolutionary algorithm) and shows how they can be transformed to the GSF approach without losing many of their characteristics. In Section 7, two new algorithms are proposed based on their related panmictic algorithms. This time, however, the objective is not merely illustrative, since the algorithms described possess important features, described in detail later on, that enable them to surpass algorithms of the state of the art.
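The 4-tuple of Definition 4 can be sketched as a simple data structure, with one helper showing how a row of an IM matrix is sampled to pick the subpopulation that parameterizes an interaction. The class name, the dict-based encoding of IM and the helper method are our own assumptions, not part of the formalization:

```python
import random

class GeneralSubpopulationFramework:
    """Minimal container for the 4-tuple <P, A, S, IM> of Definition 4."""

    def __init__(self, subpops, algorithms, sizes, im):
        assert len(subpops) == len(algorithms) == len(sizes)
        assert abs(sum(sizes) - 1.0) < 1e-9   # Eq. 10: size ratios sum to one
        self.P = subpops        # list of subpopulations (lists of individuals)
        self.A = algorithms     # one strategy (algorithm) per subpopulation
        self.S = sizes          # subpopulation size ratios
        self.IM = im            # interaction name -> s x s probability matrix

    def partner_subpop(self, interaction, a, rng=random):
        """Sample the subpopulation b that supplies parameters for `interaction`
        occurring in subpopulation `a`, using row a of IM[interaction]."""
        row = self.IM[interaction][a]
        return rng.choices(range(len(row)), weights=row, k=1)[0]
```

For instance, with two subpopulations and a migration interaction whose matrix has a zero diagonal, `partner_subpop("migration", 0)` always returns 1, since an individual selected for exchange must go to the other subpopulation.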
The general subpopulation framework can represent various types of structured EAs, including:

• Island-Based Models [46] - Each panmictic island forms a subpopulation P_i, with the set of algorithms A containing identical algorithms for all subpopulations. Let the number of panmictic islands be s; then |A| = |P| = s and S is the s-dimensional vector containing the size of each island. Between the subpopulations, an interaction defined by the exchange of genetic information can be formalized with an IM matrix of the form:

    IM_i =
        |    0      p_i,1,2  ...  p_i,1,s |
        | p_i,2,1      0     ...  p_i,2,s |
        |   ...       ...    ...    ...   |
        | p_i,s,1   p_i,s,2  ...     0    |   (11)

That is, each individual selected for exchange must necessarily go to another subpopulation; therefore, the diagonal entries are zero. This is usually the only dynamic in which other subpopulations can participate. Inside the algorithms, other dynamics can take place (e.g., crossover), and these would also have a trivial set of IM_i's with the only non-null probabilities residing on the diagonal (i.e., the interactions happen only inside the same subpopulation).

• Cellular Algorithms [34] - This type of algorithm can be thought of as the opposite line of thought in comparison with island-based models: the number of subpopulations is maximized with the minimum possible subpopulation size, i.e., cellular algorithms can be seen as a large number of subpopulations P_i, each holding a single individual. Let the number of cells in a given cellular algorithm be s; then S = (1, ..., 1) and |P| = s, with each individual cell corresponding to a subpopulation P_i, and the update of each cell can be divided into s algorithms forming the set A of panmictic algorithms. Consider the case of a cellular algorithm with nine individuals and a von Neumann neighborhood: it possesses nine subpopulations and a 9x9 matrix IM_c whose row x has the value 1/4 at the four von Neumann neighbours of cell x and 0 elsewhere:

    IM_c =
        |  0   1/4  ...  1/4 |
        | 1/4   0   ...  ... |
        | ...  ...  ...  ... |
        | 1/4  ...  1/4   0  |   (12)

Moreover, all interactions of cellular algorithms use the same neighborhood; therefore, the set of matrices IM is given by:

    IM = { IM_c, IM_c, ..., IM_c }.   (13)

In certain cellular algorithms, a dynamical IM_c has to be used to represent the change of neighborhood of each cell.

• Restricted Mating [52] - Some procedures, although not related to subpopulations at first glance, can be converted to this formalization. Restricted mating, for example, can be formalized with subpopulations. By considering each subpopulation as containing only one individual, we have the restricted mating interaction defined by:

    IM =
        | p_1,1  p_1,2  ...  p_1,s |
        | p_2,1  p_2,2  ...  p_2,s |
        |  ...    ...   ...   ...  |
        | p_s,1  p_s,2  ...  p_s,s |   (14)

where for any pair (a, b), p_a,b becomes:

    u_a,b = 1 if dist(a, b) < σ; 0 otherwise,   (15)

    p_a,b = u_a,b / Σ_{i=1}^{s} u_a,i.   (16)

σ is an arbitrary threshold and dist(a, b) is usually the Euclidean distance between solutions a and b [52].

• Spatial Predator-Prey MOEA [28] - This algorithm defines an adjacency matrix G with edges holding solutions, over which the predator makes a random walk. This algorithm can be reformulated into the subpopulation framework by considering as interaction the replacement of the preys selected by the predators. Although the replacement can be done in multiple ways, only the edges in the predator's neighborhood participate. Therefore, for the replacement interaction, each position (x, y) of the interaction matrix becomes:

    IM(x, y) = min{ G(k, x), G(k, y) },   (17)

where k is the edge of the predator responsible for this interaction matrix. Basically, two solutions can only interact if they are both in the neighborhood of k (the predator's edge).

• Multi-colony Ant Algorithms - Ant colony optimization algorithms in general are difficult to map into the subpopulation framework because they use population models instead of the solutions themselves.
This problem is faced similarly when trying to convert estimation of distribution algorithms [39], [27]. Additionally, some of these methods do not possess a population structure. For example, ant colony optimization algorithms with one colony do not use a structured approach to optimization following the definition above, i.e., although the construction of the solutions by the ants uses solution components organized in a structured way, the population of solutions itself is not structurally formulated [23]. However, some of them, such as the multi-colony ant algorithms, do have a population structure. In this case, it is possible to roughly approximate the population model (e.g., the pheromone matrix) as a subpopulation and to consider the interrelations between them as interactions with their respective interaction matrices. That is, the pheromone matrix update interaction can be represented as the identity matrix:

    IM =
        | 1  0  ...  0 |
        | 0  1  ...  0 |
        | ... ... ... ...|
        | 0  0  ...  1 |   (18)

when the update is only realized at the original colony. And when the update is done by region ({L_1, L_2, ..., L_s}) in the nondominated front, for a given solution a we have:

    IM =
        | [a ∈ L_1]  [a ∈ L_2]  ...  [a ∈ L_s] |
        | [a ∈ L_1]  [a ∈ L_2]  ...  [a ∈ L_s] |
        |    ...        ...     ...     ...    |
        | [a ∈ L_1]  [a ∈ L_2]  ...  [a ∈ L_s] |   (19)

where [a ∈ L_j] is 1 when solution a belongs to region L_j and 0 otherwise.

This subsection shows how optimization algorithms of almost any type can be converted to multi-population versions represented by the GSF. Examples of both the simple genetic algorithm [15] and simulated annealing [24] will be presented. Their IM matrix sets will be defined and, among other things, it will be shown how their dynamics could be used to affect other subpopulations. Notice that the conversions will not make explicit the vector of subpopulation sizes S, since this parameter is not related to the representation and thus can be established independently.

There are three basic procedures in a simple genetic algorithm: crossover, mutation and selection. However, mutation does not depend on other individuals, and selection is executed over a set of individuals of its own population. Hence, it does not make sense to define an interaction matrix for them. Mutation and selection can be applied normally, with the only difference from the single-population version being that the target is now the current subpopulation (i.e., not the entire population). In fact, this slight modification defines the algorithm A_i which constructs its respective subpopulation P_i under the GSF formulation.

Thus, let us define the IM set, which consists of only the crossover interaction (IM = {IM_1}). The crossover interaction matrix IM_1 defines the probabilities that an individual of a given subpopulation participates in the crossover. Its exact value is the trivial IM_1 = 1. Note that the simple GA is not a structured algorithm (there is no other subpopulation to interact with). However, the designer might want to modify IM_1 when joining this algorithm with other algorithms.

One of the main difficulties with simulated annealing is that it is not a population-based algorithm. This problem can be circumvented by adding the recently accepted values of the variables to a First In First Out data structure, creating a subpopulation derived from its dynamics. Therefore, the simulated annealing algorithm plus the creation of a subpopulation defines the algorithm A_i to be applied on its created subpopulation P_i.

Lastly, the interaction matrices are defined by an empty set (IM = {}), since there is no interaction between solutions in its dynamics. An empty IM might be unappealing at first glance, but when joined with the subpopulations of other algorithms, the subpopulation constructed by this algorithm might be used by other interactions and consequently influence the global dynamics.

It was shown above that panmictic algorithms can be converted to the GSF. However, they possess a trivial IM, and the conversion by itself brings little insight.
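The FIFO-based conversion of simulated annealing into a subpopulation could be sketched as follows; the neighbour move, cooling schedule and all parameter names are simplifying assumptions, not the authors' implementation.

```python
# Sketch: a minimal simulated annealing whose recently accepted solutions
# form a subpopulation through a fixed-size FIFO buffer.
import math
import random
from collections import deque

def sa_subpopulation(f, x0, size=10, iters=200, t0=1.0, alpha=0.99, seed=0):
    rng = random.Random(seed)
    fifo = deque(maxlen=size)   # the derived subpopulation P_i
    x, t = x0, t0
    for _ in range(iters):
        cand = x + rng.uniform(-0.5, 0.5)   # neighbour move (assumption)
        delta = f(cand) - f(x)
        # Metropolis acceptance: always take improvements, sometimes worsenings.
        if delta < 0 or rng.random() < math.exp(-delta / t):
            x = cand
            fifo.append(x)   # FIFO: the oldest accepted solution leaves first
        t *= alpha           # geometric cooling (assumption)
    return list(fifo)

subpop = sa_subpopulation(lambda v: v * v, x0=5.0)
```

The returned list plays the role of P_i in the GSF; the annealer itself never reads it, which is why its IM set is empty, yet other subpopulations may still draw individuals from it.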
Thus, one might question the usefulness of such a conversion. The answer is that, once converted to the GSF, any panmictic algorithm can be integrated seamlessly as a subpopulation in other GSF-based algorithms. Section 7 will show some examples of algorithms constructed using the GSF. Last but not least, the pressures of different panmictic algorithms can be compared by weighting their subpopulations' sizes. Comparison of algorithms is an important and complicated subject which is aided by the GSF. The GSF also enables a relatively easy evaluation of the cooperation between algorithms, facilitating the construction of hybrid algorithms with the simple addition or deletion of subpopulations.

One feature of the subpopulation framework is the division of interactions over interaction matrices. Thus, one can isolate only the interactions of interest and compare their structural behavior by looking at those matrices. For example, it is possible to see that the spatial predator-prey and cellular algorithms are similar in the sense that both use similar interaction matrices (neighborhood matrices).

Moreover, designing structured algorithms may become easier by looking at different interactions and interaction matrices instead of multiple structures and their internal behavior. The framework also supports other abstractions, such as a mix between structures (i.e., sometimes the structure behaves like a cellular algorithm and sometimes like an island model), through the simple inclusion of additional interaction matrices; for example, the inclusion of a cellular algorithm's interaction matrix into an island model algorithm.
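As a concrete illustration of the interaction matrices compared above, the island-model matrix of Equation 11 and the cellular von Neumann matrix of Equation 12 could be constructed as follows (a sketch; function names and the toroidal grid are assumptions).

```python
# Sketch: interaction matrices for an island model (zero diagonal, uniform
# off-diagonal migration) and a cellular EA (von Neumann neighbourhood on a
# toroidal grid, entries of 1/4).
def island_im(s):
    # Each migrant must leave its island: zero diagonal, uniform elsewhere.
    return [[0.0 if i == j else 1.0 / (s - 1) for j in range(s)]
            for i in range(s)]

def cellular_im(rows, cols):
    s = rows * cols
    im = [[0.0] * s for _ in range(s)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # von Neumann
                j = ((r + dr) % rows) * cols + (c + dc) % cols  # toroidal wrap
                im[i][j] = 0.25
    return im
```

Each row of both matrices sums to one, so a row can be read directly as the probability distribution over the subpopulations with which cell (or island) i interacts.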
To evaluate the subpopulation framework appropriately, we elaborate two subpopulation algorithms: one based on GDE3 (see Section 4.1) and the other based on MONA (see Section 5.2). We will hereafter call these GSAs, respectively, the Subpopulation Algorithm based on General Differential Evolution (SAGDE) and the Subpopulation Algorithm based on Novelty (SAN).

Both SAGDE and SAN are motivated by the fact that single-objective DEs evolved on each objective usually achieve good results. Take, for example, the WFG1 problem [21]. If we apply a GSA made uniquely of subpopulations of single-objective DEs, each evolving a different single objective, we usually obtain the result plotted in Figure 1. Note that the DEs achieve good results on each single objective, with the resultant individuals very close to the Pareto front, but the front is hardly covered. Then, what if another subpopulation is added to this algorithm, which might wisely "mix" these DEs' solutions? The following algorithms are motivated by this question, and in Section 9 an extensive answer is given based on the experiments.
[Figure 1 plot: WFG1, objective 1 vs. objective 2, showing the Pareto front and the solutions of the single-objective DEs.]

Figure 1: Solutions of single-objective DEs, each one evolving a different objective. For the test, the WFG1 problem is used with two objectives. Each DE had its own population, with fixed CR, F and maximum number of generations.

In a problem with n objectives, SAGDE has n+1 subpopulations P = {P_1, ..., P_{n+1}}, where {A_1, ..., A_n} are single-objective DEs, each one evolving a different objective, and GDE3 (a multi-objective algorithm) is used as the algorithm A_{n+1} for the subpopulation P_{n+1}. The GDE3 subpopulation, as well as the n single-objective differential evolution subpopulations, behave in the usual way, aside from the fact that a uniform matrix IM_1 (shown in Equation 20) is used to determine which individuals will be part of the trial vector in the differential operator, i.e., IM = {IM_1} where:

    IM_1 =
        | 1/(n+1)  1/(n+1)  ...  1/(n+1) |
        | 1/(n+1)  1/(n+1)  ...  1/(n+1) |
        |   ...      ...    ...    ...   |
        | 1/(n+1)  1/(n+1)  ...  1/(n+1) |   (20)

In the same way as SAGDE, SAN has n+1 subpopulations P = {P_1, ..., P_{n+1}}, with n of them made of single-objective DEs ({A_1, ..., A_n}), where n is the number of objectives of the problem. Each single-objective DE optimizes a different objective, and there is an additional subpopulation corresponding to MONA (A_{n+1}), the multi-objective algorithm based on the novelty search approach proposed by this article (see Section 5.2). Both MONA and the n single-objective DE subpopulations behave in the usual way, with the unique difference being the use of the same uniform matrix IM_1 described in Equation 20 (i.e., a chosen individual has a uniform probability of 1/(n+1) of coming from any subpopulation) to select individuals for the trial vector in the DE operator used in both algorithms.
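The role of the uniform matrix of Equation 20 can be sketched as follows; this is an illustration under assumptions (function names, and the omission of the binomial crossover with the target vector), not the authors' code.

```python
# Sketch: with the uniform IM of Equation 20, each donor individual for a
# DE mutant vector is drawn from any of the n+1 subpopulations with equal
# probability 1/(n+1).
import random

def pick_donor(subpops, rng):
    # Uniform IM: every subpopulation is equally likely to supply the donor.
    sp = rng.choice(subpops)
    return rng.choice(sp)

def mutant_vector(subpops, f_weight, rng):
    a, b, c = (pick_donor(subpops, rng) for _ in range(3))
    # DE/rand/1 mutant: v = a + F * (b - c), per dimension; the binomial
    # crossover with the target vector is omitted in this sketch.
    return [ai + f_weight * (bi - ci) for ai, bi, ci in zip(a, b, c)]

rng = random.Random(1)
subpops = [[[0.0, 0.0]], [[1.0, 1.0]], [[2.0, 2.0]]]
v = mutant_vector(subpops, f_weight=0.3, rng=rng)
```

Because the donors may come from different subpopulations, the differential operator is exactly the channel through which the single-objective DEs and the multi-objective subpopulation exchange information.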
Moreover, MONA verifies any new individual generated by any subpopulation for inclusion in the novelty archive (i.e., not only its own generated individuals). In other words, the inclusion of solutions in the novelty archive is a different interaction, defined by IM_2. It is activated every time a new solution is created in any subpopulation. The IM_2 matrix is defined below:

    IM_2 =
        | 0  0  ...  0  1 |
        | 0  0  ...  0  1 |
        | ... ... ... ... |
        | 0  0  ...  0  1 |   (21)

where the last column refers to MONA's subpopulation.

To compare algorithms, the following procedure is used:

1. Realize multiple runs of the algorithm and store the solution sets.

2. For each solution set:
   • Compute the hypervolume indicator (Section 8.1.1);
   • Compute the ε indicator (Section 8.1.2);
   • Store each quality indicator result in a separate vector.

3. Algorithms are compared in three ways:
   • A group of algorithms is compared using their respective quality indicators' mean values and standard deviations. Algorithms with a mean value inside the standard deviation of the best mean value are considered equally good.
   • Verify the statistical significance between a pair of algorithms with a non-parametric Mann-Whitney test [20]. The alternative hypothesis, that one method has a better (smaller) quality indicator than the other, is accepted if the p-value is lower than the chosen significance level.
   • Calculate the 50% attainment surface (Section 8.2) based on the solution sets.
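The pairwise test in step 3 is built around the Mann-Whitney U statistic, which could be sketched as below; handling of large-sample p-values is omitted (in practice a statistics library such as scipy.stats.mannwhitneyu would be used, which is an assumption about tooling, not a statement about the authors' setup).

```python
# Minimal sketch of the Mann-Whitney U statistic: over all pairs, count how
# often a value from x is smaller than one from y (ties count one half).
def mann_whitney_u(x, y):
    u = sum(1 for xi in x for yj in y if xi < yj)
    u += 0.5 * sum(1 for xi in x for yj in y if xi == yj)
    return u
```

For two samples of sizes m and n, U ranges from 0 to m*n; values near m*n indicate that x tends to be smaller (i.e., better, for the minimized quality indicators used here).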
In this article, to compare the quality of the algorithms, the hypervolume indicator [51], [2] and the ε indicator [53] are used. These unary quality indicators were recommended by [13], since they are based on different preference information. The following subsections define these quality indicators.
The hypervolume indicator (I_h) is defined as the difference between the hypervolume of the Pareto front and the hypervolume of the non-dominated solution set in objective space [51], [2]. This indicator requires a reference point for the calculation; therefore, the nadir point is used in this article.

The ε indicator (I_ε) is defined as the minimum factor ε by which a non-dominated approximation set (i.e., a set of objective vectors which do not dominate each other) is worse than the Pareto optimal front. Let a and p be vectors in Z (the objective space) with Z ⊆ R_+^d, where d is the number of objectives; then the ε dominance between two vectors is defined by Equation 22.

    a ⪰_ε p  ≡  ∀i ∈ [1, d]: a_i ≤ ε · p_i.   (22)

Then, according to [53], the ε indicator is formally defined in Equation 23.

    I_ε(T) = inf_{ε ∈ R} { ∀p ∈ O ∃a ∈ T : a ⪰_ε p },   (23)

where T is the target approximation set and O is the Pareto optimal set. In this paper, O refers to a reference set which approximates the Pareto optimal set.

As shown in [38], quality indicators may be misleading. Therefore, when visually possible, attainment surfaces were also computed for the comparison.

The attainment surface (AS) is the boundary in objective space of the area dominated by a single run of an algorithm. Such surfaces are important because they show detailed information about the performance differences between algorithms. To infer a statistically significant attainment surface, multiple runs of the algorithms are required and an approximate mean result is calculated. Usually, the 50% attainment surface is used as a mean measure approximation, which is defined as the area dominated by at least 50% of the approximation sets [18], [12]. In this paper, the code provided by [?] is used to obtain the 50% attainment surfaces.
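For strictly positive objective values, Equation 23 reduces to a max-min expression, which could be computed as in the following sketch (a generic illustration, not the toolkit used in the experiments).

```python
# Sketch of the multiplicative epsilon indicator of Equation 23 for
# minimisation with strictly positive objective values: the worst-case,
# over reference points p in O, of the best scaling factor with which some
# a in T epsilon-dominates p.
def epsilon_indicator(T, O):
    return max(
        min(max(a_i / p_i for a_i, p_i in zip(a, p)) for a in T)
        for p in O
    )
```

A value of 1.0 means the approximation set already weakly dominates every reference point; larger values quantify how far it is from doing so.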
Some of the usual multi-objective benchmarks poorly represent important classes such as non-separable and multimodal problems. Therefore, this paper makes use of a relatively recent set of tests called WFG [21]. The WFG set of problems presents a
varied set of properties which can test the scalability of algorithms in both parameters and number of objectives. Table 3 summarizes the characteristics of its test problems.

Table 3: Properties of the WFG test problems.

    Problem   Obj.      Separable  Modality          Bias                 Geometry
    WFG1      f_1:M     yes        uni               polynomial, flat     convex, mixed
    WFG2      f_1:M-1   no         uni               -                    convex, disconnected
              f_M       no         multi             -
    WFG3      f_1:M     no         uni               -                    linear, degenerate
    WFG4      f_1:M     yes        multi             -                    concave
    WFG5      f_1:M     yes        deceptive         -                    concave
    WFG6      f_1:M     no         uni               -                    concave
    WFG7      f_1:M     yes        uni               parameter dependent  concave
    WFG8      f_1:M     no         uni               parameter dependent  concave
    WFG9      f_1:M     no         multi, deceptive  parameter dependent  concave

The WFG Toolkit makes use of position and distance parameters. On the one hand, when a distance parameter is modified, the new solution may dominate, be dominated by, or be equivalent to the previous one. On the other hand, when a position parameter is modified, the new solution is either incomparable or equivalent to the previous one. Tests were performed for the WFG problems with fixed numbers of distance and position parameters.

Each empirical attainment surface and quality indicator was calculated based on solution sets obtained from multiple independent runs of the algorithm in question. Different seeds were used for each algorithm run. Both the maximum number of generations and the total subpopulation size (or population size in the case of panmictic algorithms) were fixed, which assures that all algorithms have the same number of evaluations.

Table 4 shows the parameters used for GDE3. They correspond to the same ones used by Kukkonen and Lampinen [26]. The reader may observe that, when compared with usual single-objective DE settings, the parameters of all algorithms possess lower values of CR and F. This happens because multi-objective optimization maintains a high diversity.
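The distance/position-parameter behaviour described above can be made concrete with a generic Pareto-relation classifier (a sketch for minimization problems; it is not part of the WFG toolkit).

```python
# Sketch: classify the Pareto relation between two objective vectors a and b
# under minimisation.  Modifying a distance parameter can yield any of the
# four relations; modifying a position parameter yields only "equivalent"
# or "incomparable".
def relation(a, b):
    better = any(x < y for x, y in zip(a, b))  # a strictly better somewhere
    worse = any(x > y for x, y in zip(a, b))   # a strictly worse somewhere
    if not better and not worse:
        return "equivalent"
    if better and not worse:
        return "dominates"
    if worse and not better:
        return "dominated"
    return "incomparable"
```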
Therefore, it is not necessary to have higher values of F or CR for better exploration of the search space, because individuals are different enough and the trial vectors are also suitably different. Tests with even smaller values of F were shown to improve the coverage, but with a great impact on the distance to the Optimal Pareto Front (OPF). The gain in coverage was not enough to surpass SAN's coverage, and the distance to the front was poor enough that GDE3 was surpassed by SAN in all problems tested (even on some problems where it had performed similarly to SAN with the original F).

In the case of the GSA algorithms, F should logically have an even lower value. This is justified by the fact that GSA subpopulations are usually very different from each other. We conducted preliminary tests with a lower F, and many of the results were the same (note that the variables subpopulation size and total subpopulation size are different from each other; the total subpopulation size is defined in Section 6).
Table 4: Parameter table. The first two ratios of the S vector correspond to the subpopulations of DEs used, and the third ratio is either MONA (for SAN) or GDE3 (for SAGDE).

Some problems, as expected, showed a slightly worse result with the lower F. For MONA and SAN, the novelty parameters were decided upon a quality-efficiency trade-off, with both algorithms having the same fixed parameters.

Regarding the chosen subpopulation sizes of SAN and SAGDE, they are directly related to the subpopulation algorithm's strength to "mix" the solutions of the single-objective DEs' subpopulations. Some subpopulations "mix" the solutions better than others (directly related to the coverage of the OPF), requiring a smaller subpopulation size (the MONA subpopulation), while other subpopulations require a bigger subpopulation size to get a similar coverage (GDE3). This happens especially because GDE3 has various strategies, and coverage is just one of them. Recall that in SAN and SAGDE there are two single-objective DEs. These algorithms explore the problems as shown in Figure 1 and discussed in Section 7. Therefore, "mixing" the solutions is necessary for coverage, and this is only achieved by the other subpopulations (the GDE3 and MONA subpopulations for SAGDE and SAN, respectively).

Tests were performed for the WFG problems with two objectives. The parameters used by the algorithms are fixed and summarized in Table 4.
[Figure 2 panels: WFG1, WFG2, WFG3 and WFG4, objective 1 vs. objective 2, showing the OPF and the solution sets of SAN, SAGDE, GDE3 and MONA.]
Figure 2: 50% attainment surfaces for the WFG Toolkit problems (minimization problems), calculated from the independent runs.

The comparison between the 50% attainment surfaces of SAN, SAGDE, GDE3 and MONA is shown in Figures 2 and 3. Before discussing the results, it is necessary to show Tables 5 and 6, with the mean and standard deviation (sd) of the ε and hypervolume quality indicators, as well as Tables 7 and 8, with the statistical significance of both quality indicators. Most of the time, the tables and figures agree with each other. Therefore, when not stated otherwise, the discussion concerns the overall behavior of all three comparisons (attainment surfaces, mean/sd and statistical hypothesis testing). For more information on the construction of these tables and figures, please refer to Section 8 or to the tables and figures themselves.

Regarding the comparison between SAGDE and GDE3: SAGDE is significantly better than GDE3 in WFG1 for both quality indicators, clearly observable in Tables 7 and 8 but also present in the other tables and figures.
Table 5: ε indicator's mean and standard deviation for SAN, SAGDE, GDE3 and MONA. For each problem, the best mean value, as well as any other mean value inside the standard deviation of the best mean value, is marked in bold.

Table 6: Hypervolume indicator's mean and standard deviation for SAN, SAGDE, GDE3 and MONA. For each problem, the best mean value, as well as any other mean value inside the standard deviation of the best mean value, is marked in bold.

Table 7: P-values of the comparison between the SAN, SAGDE, GDE3 and MONA algorithms with the Mann-Whitney significance test using the ε indicator. Results are marked in bold when the null hypothesis is rejected at significance level α. The alternative hypothesis is that the algorithm in the row is statistically better (smaller quality indicator) than the algorithm in the column.
Table 8: P-values of the comparison between the SAN, SAGDE, GDE3 and MONA algorithms with the Mann-Whitney significance test using the hypervolume indicator. Results are marked in bold when the null hypothesis is rejected at significance level α. The alternative hypothesis is that the algorithm in the row is statistically better (smaller quality indicator) than the algorithm in the column.
[Figure 3 panels: WFG5, WFG6, WFG7, WFG8 and WFG9, objective 1 vs. objective 2, showing the OPF and the solution sets of SAN, SAGDE, GDE3 and MONA.]
Figure 3: 50% attainment surfaces for the WFG Toolkit problems (minimization problems), calculated from the independent runs.

However, the quality indicators do not agree in the remaining problems, which suggests that there is just a trade-off, but not an explicit advantage, in these problems. SAGDE tends to achieve a better coverage of the OPF, while GDE3 is closer to the OPF, albeit with a slightly poorer coverage of the front. Consequently, depending on whether coverage or proximity to the front is more important, the algorithm designer may choose one or the other algorithm.

MONA achieved poor outcomes on all problems against all algorithms, with the possible exception of its better coverage when compared against GDE3 in the WFG3, WFG5, WFG6 and WFG7 problems (see Table 7 and Table 5). Even so, the combined subpopulations of MONA and the single-objective DEs in SAN obtained state-of-the-art quality.
Notice also that inside SAGDE and SAN there is, respectively, a GDE3 and a MONA subpopulation. The GDE3 subpopulation inside SAGDE is bigger than the MONA subpopulation inside SAN; however, the GDE3 subpopulation still "mixes" the solutions worse than MONA (resulting in poorer coverage), demonstrating MONA's good ability to expand and mix results.
[Figure 4 panels: WFG1 through WFG9, objective 1 vs. objective 2, showing the solution sets of SAN and GDE3.]
Figure 4: 50% attainment surfaces for the WFG Toolkit problems (minimization problems), calculated from the independent runs.

Concerning the comparison of both GSAs, the experiments demonstrate a surprisingly better overall result of SAN over SAGDE, as SAN is simpler and based on MONA, an algorithm which achieved poor results on all tests. This fact might seem surprising at first glance but, looking from a different point of view, it is possible to understand those results if we take the GSF's structure into account. Recall that the more different two strategies are, the more the subpopulation framework benefits from them. This happens because a similar strategy will also produce similar individuals in different subpopulations, using more resources for less exploration and diversity.

[Figure 5 plots: ε indicator and hypervolume indicator versus generation (0 to 25000) for SAN and GDE3 on WFG1.]

Figure 5: Hypervolume and ε indicators throughout the generations for both the SAN and GDE3 algorithms in the WFG1 problem (the confidence interval of one standard deviation away from the mean is shown in grey). The curve was averaged over the independent runs.

The comparison between SAN and GDE3 is a bit more complicated. First of all, Tables 7 and 8 show that SAN outperforms GDE3 according to both quality indicators in WFG1, WFG2 and WFG3, while the remaining problems have contrasting results for the ε and hypervolume indicators. Consequently, GDE3 is comparable with SAN only in the concave problems, which have easier shapes of Pareto front. However, the statistical hypothesis testing does not tell us how big the difference is (it only tells whether it is bigger or not, with some significance). Table 5 shows, unsurprisingly, that SAN outperforms GDE3 by a great difference with respect to the ε indicator. But according to Table 6, in all problems where GDE3 surpassed SAN statistically, GDE3 is close (inside GDE3's standard deviation) to SAN in all problems but WFG8 (the reason why this happens is shown in Section 9.6, where it is demonstrated that both algorithms have not yet converged in WFG8). In fact, to the knowledge of the authors, SAN achieved the best performance to date over all of the WFG problems with two objectives. Additionally, for a clearer analysis, Figure 4 shows only the SAN and GDE3 attainment surfaces, and Figure 5 delineates the behavior of the quality indicators throughout the evolution.
Table 9: Parameter table. The first five ratios of the S vector correspond to the subpopulations of the DEs used, and the last ratio corresponds to MONA.

In this study, we increased the number of objectives to five. Aside from that, the same WFG problems were used, and all other problem parameters were kept as before. Most of the algorithms' parameters remained the same as well, with the only exception being the vector S, which depends on the number of objectives. The new set of parameters is shown in Table 9.

Tests with many-objective problems were realized using SAN, the most prominent algorithm in the bi-objective study of Section 9.3, and a reference from the state of the art, GDE3.

Tables 10 and 11 display the mean and standard deviation values, while Table 12 shows the statistical results of the comparison. SAN is able to converge better in all problems according to both quality indicators except WFG8, where the quality indicators differ in their results (even so, WFG8's hypervolume indicator mean values for SAN and GDE3 are close to each other). Moreover, in all other problems SAN had very small p-values. A negative value of the hypervolume indicator means that the samples acquired from the Pareto optimal front dominate a hypervolume smaller than the hypervolume dominated by SAN. This result may be related to the number and distribution of the samples of the OPF generated by the WFG toolkit.

Table 10: ε indicator's mean and standard deviation for SAN and GDE3. For each problem, the best mean value, as well as any other mean value inside the standard deviation of the best mean value, is marked in bold.
Table 11: Hypervolume indicator's mean and standard deviation for SAN and GDE3. For each problem, the best mean value, as well as any other mean value within the standard deviation of the best mean value, is marked in bold.
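The hypervolume values reported above are measured relative to a reference. For the bi-objective case the dominated hypervolume can be computed with a simple sweep, shown here for clarity (a sketch assuming minimization and a user-chosen reference point; this is not the paper's exact many-objective computation, which requires more elaborate algorithms):

```python
def hypervolume_2d(points, ref):
    """Hypervolume dominated by a 2-D point set relative to a reference
    point `ref`, assuming minimization of both objectives.

    Sorts the non-dominated region by the first objective and sums the
    rectangular slabs between consecutive attained second-objective
    levels. Points outside the region bounded by `ref` contribute nothing.
    """
    # keep only points that strictly dominate the reference point
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:  # point is not dominated by an earlier one
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv
```

For instance, with reference point (3, 3), the set {(1, 2), (2, 1)} dominates the union of two rectangles minus their overlap, giving a hypervolume of 3.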
Table 12: P-values of the comparison between the SAN and GDE3 algorithms on many-objective problems with the Mann-Whitney significance test, using the ε and hypervolume indicators. Results are marked in bold when the null hypothesis is rejected with a significance level of α = 0.05. The alternative hypothesis is that the algorithm in the row is statistically better (smaller quality indicator) than the algorithm in the column.
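The one-sided comparison behind these p-values asks whether one algorithm's quality-indicator samples are stochastically smaller (better) than the other's. For small sample sizes this can be computed exactly by enumerating rank assignments; the sketch below is a self-contained exact version (function names are ours, and the paper's actual test configuration, e.g. tie handling and sample counts, may differ):

```python
from itertools import combinations


def u_statistic(a, b):
    """Mann-Whitney U: number of pairs (x, y) with x < y, ties count 1/2."""
    return sum((x < y) + 0.5 * (x == y) for x in a for y in b)


def mann_whitney_less(a, b):
    """Exact one-sided p-value that sample `a` is stochastically smaller
    than sample `b` (i.e., `a` has the better, smaller indicator values).

    Enumerates every way of splitting the pooled samples, so it is only
    feasible for small samples; larger samples need a normal
    approximation or a statistics library.
    """
    combined = a + b
    observed = u_statistic(a, b)
    n = len(a)
    count = total = 0
    for idx in combinations(range(len(combined)), n):
        xs = [combined[i] for i in idx]
        ys = [combined[i] for i in range(len(combined)) if i not in idx]
        # larger U means more pairs where the a-side beats the b-side
        count += u_statistic(xs, ys) >= observed
        total += 1
    return count / total
```

With five fully separated samples per side, the exact p-value is 1/252 (one favorable split out of C(10, 5)), well below α = 0.05.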
Table 13: Comparison of the SAN and GDE3 algorithms with the Mann-Whitney significance test on many-objective problems. The respective meanings of ⇑, ↓ and ≈ are that SAN is statistically better than, worse than, or equal to GDE3. SAN was statistically better (⇑) than GDE3 according to both indicators on WFG1 through WFG7 and WFG9, and according to the ε indicator on WFG8, while GDE3 was better (↓) according to the hypervolume indicator on WFG8.

were used to compare both GDE3 and SAN, and therefore there is no bias in the comparison (i.e., GDE3 could have had a negative hypervolume as well).

This suggests that SAN should achieve better results as problems increase in complexity. Recall that on bi-objective problems, GDE3 was shown to be comparable with SAN only when concave Pareto fronts were present. Naturally, with the increase in the number of functions to be optimized, the number of conflicting objectives inside panmictic algorithms is expected to increase as well. This explains the better overall solutions of SAN in all the many-objective problems with many different properties (see Table 3).

It has been argued before that algorithms based on the GSF achieve better results because they divide different strategies (algorithms) into distinct populations, avoiding both the undesirable conflicts and the prevalence of one strategy over another. Here, we present a detailed justification.

Consider a bi-objective optimization problem being solved with SAN and GDE3. For this problem, SAN may be divided into three strategies (i.e., |A| = 3): one single-objective DE for each of the two objectives, and MONA.
GDE3 has one strategy composed of two steps: first, selecting individuals based on Pareto dominance (main strategy), and second, pruning the population based on a diversity measure (secondary strategy).

If we see the strategies as a collection of forces capable of changing the positions of solutions, it is possible to draw the most salient force vectors produced by SAN and GDE3 (Figures 6 and 7). For GDE3, the main force points directly to the Pareto front, with secondary forces pointing sideways (caused by the pruning strategy). In SAN, the single-objective DE subpopulations' forces point directly toward their respective objective's coordinate, while the MONA subpopulation's force points away from the previous individuals, which corresponds approximately to vectors pointing in all directions with the same strength.

This analysis reveals the main problem with GDE3: its forces responsible for spreading are relatively weak. The first consequence is that, when the problem has a disconnected geometry or bias, the solutions may spread over only a small subset of the optimal front (see problems WFG1 and WFG2 of Figures 2 and 3, or Figure 4).
Another consequence is that the forces necessary to solve a problem naturally depend on the problem itself, and if a given problem needs stronger spreading forces, GDE3 has great difficulty spreading the solutions. For example, across all the WFG problems GDE3 covered the extremes of the Pareto front poorly (see Figures 2 and 3, or Figure 4), and in the case of many-objective problems, where the Pareto front becomes wider as it expands along various dimensions, it achieved poor results in all tests for both quality indicators (see Table 13).

Notice that the vectors of GDE3 are a consequence of its panmictic design, which inevitably causes one force to be stronger or weaker relative to the others. That is, this analysis is inherently connected with the conflicting strategies of panmictic algorithms.

Figure 6: Diagram of GDE3 with its strategy exposed explicitly as three components of a force. The length of each arrow is related to its intensity.

Figure 7: Diagram of SAN with its strategies exposed explicitly as forces. The single-objective DE forces (dashed gray lines) are perpendicular to each other, and the MONA force (circular dash-dotted gray line) is a force field that becomes stronger as the distance from the previous individuals increases.
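The axis-aligned DE forces in Figure 7 arise because each single-objective subpopulation selects offspring using its own objective alone, so accepted moves always improve that objective. A minimal DE/rand/1/bin sketch of one such subpopulation step (the parameter values here are illustrative, not the paper's settings):

```python
import random


def de_rand_1_step(subpop, fitness, F=0.5, CR=0.9):
    """One DE/rand/1/bin generation for a single-objective subpopulation.

    Selection keeps a child only if it does not worsen the
    subpopulation's own objective, so the net movement ("force") of the
    subpopulation points toward that objective's optimum.
    """
    dim = len(subpop[0])
    new_pop = []
    for i, parent in enumerate(subpop):
        # three distinct individuals, none of them the current parent
        a, b, c = random.sample([x for j, x in enumerate(subpop) if j != i], 3)
        j_rand = random.randrange(dim)  # at least one dimension mutates
        child = [
            a[k] + F * (b[k] - c[k])
            if (random.random() < CR or k == j_rand)
            else parent[k]
            for k in range(dim)
        ]
        # greedy one-to-one selection on this subpopulation's objective
        new_pop.append(child if fitness(child) <= fitness(parent) else parent)
    return new_pop
```

Because selection is greedy per individual, the best fitness in the subpopulation never worsens across generations, which is exactly what makes its force consistently point toward one objective.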
Measuring the forces empirically can be done in various ways. If the average solution of each subpopulation in objective space is considered, it is possible to analyze the subpopulation forces throughout the evolution. However, single-population algorithms may be unfairly represented by just one force (i.e., much of the
Table 14: Percentage of solutions which are either unfeasible or result in zero-modulus forces, for SAN and GDE3 on problems WFG1, WFG4, WFG5 and WFG8. Both types of solutions are excluded from the calculation of forces and are therefore not present in Figure 8.
behavior is lost with just one "mean subpopulation force"). Therefore, plotting the forces between parent and offspring in objective space seems like a better possibility, although some aspects of the global movement are lost.

Here, the forces are calculated by measuring the vector from the DE operator's main parent to its offspring in objective space (other genetic operators with no main parent might require the computation of a set of forces for each individual, with each force related to one parent). The experiment is composed of evaluation samples taken throughout one run of the algorithm (multiple runs of the algorithm presented no significant difference from each other, as one would expect, since the number of samples in one run is already representative). Figure 8 shows the cumulative angles of the forces for four problems with both the GDE3 and SAN algorithms. The directions given by 0° and 90° are, respectively, parallel to the increasing x-axis and the increasing y-axis (i.e., 180° is improving objective 1 and 270° is improving objective 2). The measurement is taken right before the selection phase of the differential evolution operator; otherwise the arc from 0° to 90° would be nonexistent. Naturally, a long bin means a higher number of solutions moving in that direction. Notice, however, that some histograms may have more individuals than others. This happens because two conditions cause some solutions or forces to be discarded:

• Unfeasible solutions - They are excluded from the calculation, since unfeasible solutions cannot be mapped to a point in objective space.

• Forces with zero modulus - When the resulting child occupies the same point in objective space as its main parent, the resulting force has a zero modulus.
In fact, this means that no force was applied at all, and therefore it is reasonable to exclude it.

To give an idea of how many solutions were discarded and of which type (unfeasible solutions or solutions which result in a zero-modulus force), Table 14 was constructed. Setting problem WFG8 aside, GDE3 always has a high number of discarded solutions (especially solutions which result in a zero-modulus force). This happens because GDE3 converges prematurely on these problems. In WFG8, however, the number of solutions which result in a zero-modulus force is extremely small. This points to the fact that both algorithms have not yet converged in WFG8, explaining why GDE3 surpassed SAN in this problem.

Bear in mind that the forces seen are not just a "DNA" of the algorithm. They are intensively affected by the problem at hand. Therefore, the higher the bias of the problem, the higher the influence of the problem on the measured forces. The results on WFG1 and WFG8 show exactly this interference of the problem, which is
strongly biased (see Table 3 for the bias properties of all problems). Therefore, analysing the behavior on problems with less bias (such as WFG4 and WFG5) gives a less noisy perspective on the "DNA" of the algorithm. In fact, there are many similarities between the second and third rows of Figure 8 and Figures 6 and 7. For example, the spread of forces in all directions can be seen in SAN, i.e., every direction has a bin of noticeable length, while GDE3 has bins in fewer directions. These results were predicted by our previous analysis.

Figure 8: Cumulative angles of the forces measured for algorithms GDE3 (left column) and SAN (right column) on problems WFG1 (first row), WFG4 (second row), WFG5 (third row) and WFG8 (fourth row). The forces are measured by calculating the vector from the parent to the offspring in objective space. The scale is linear, and the unfeasible solutions as well as forces with zero modulus were excluded from the graph.
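The angle histograms of Figure 8 can be reproduced from parent-to-offspring displacement vectors in objective space, discarding zero-modulus forces as described above. A minimal sketch for two objectives (the function name and the 30° bin width are our own choices, not taken from the paper):

```python
import math
from collections import Counter


def force_angle_histogram(parents, offspring, bin_width=30):
    """Histogram of parent-to-offspring "force" directions in a
    two-objective space.

    Angles follow the paper's convention: 0 degrees points along the
    increasing x-axis, so 180 degrees improves objective 1 and
    270 degrees improves objective 2 (minimization). Zero-modulus
    forces (child at the same objective-space point as its parent)
    are discarded, mirroring the filtering described in the text.
    """
    bins = Counter()
    for (px, py), (cx, cy) in zip(parents, offspring):
        dx, dy = cx - px, cy - py
        if dx == 0 and dy == 0:
            continue  # zero-modulus force: no movement in objective space
        angle = math.degrees(math.atan2(dy, dx)) % 360
        bins[int(angle // bin_width) * bin_width] += 1
    return bins
```

For example, a child that only reduces objective 1 contributes to the 180° bin, and one that only reduces objective 2 contributes to the 270° bin.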
The essential idea behind all these explanations is that a panmictic population may be seen as a niche. Once no proper division is placed between strategies, no matter what strategies and procedures are involved, the population develops forces of different intensities together, i.e., a conflict of forces appears inside the population. This internal population conflict is hardly solved without a division; that is, a division into subpopulations.
The objective of this paper is to propose the framework together with some examples of algorithms based on it, demonstrating some of its aspects and strengths. This is, however, not an exhaustive exposition. There is still an extensive number of topics to be covered, for example:

• Studies on variations of IM and S, as well as self-adaptive modifications;

• The effect of different and/or complex dynamics between subpopulations;

• The integration of different types of algorithms and comparisons between them.
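The role of the S vector mentioned above (the subpopulation size ratios from Table 9) can be illustrated with a minimal budget-allocation sketch; the helper name and the rounding rule are ours, not the paper's:

```python
def allocate_subpopulations(total_size, ratios):
    """Divide a shared population budget among subpopulations according
    to a ratio vector S.

    Each ratio gets the floor of its share; any remainder left by the
    rounding is assigned to the last subpopulation so that the sizes
    always sum to `total_size` (a hypothetical but simple policy).
    """
    sizes = [int(total_size * r) for r in ratios]
    sizes[-1] += total_size - sum(sizes)
    return sizes
```

For instance, a budget of 100 with S = (0.125, 0.125, 0.125, 0.125, 0.5) gives four subpopulations of 12 and one of 52, the last absorbing the rounding remainder.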
10 Conclusions
We have presented here a justification of why structured EAs, and especially the GSF, achieve better results in multi-objective optimization. This derives from the fact that well-designed structured EAs better separate the conflicting strategies, avoiding the deleterious consequences of the competition between them.
Additionally, this article presented a new framework, called the GSF, which can aid the understanding and design of structured optimization algorithms. The GSF can easily join any optimization algorithms; therefore, any algorithm can, with little effort, be combined and tested together with others, yielding a very flexible framework.

Moreover, to the knowledge of the authors, SAN is among the most robust algorithms of the state of the art, either surpassing GDE3 in the tests or achieving a comparable solution in terms of a trade-off between the ε and hypervolume quality indicators. In fact, when the problems increased in the number of objectives (which also increased the number of conflicting strategies inside a panmictic algorithm), the advantage of SAN over GDE3 became more pronounced. In other words, the proposed subpopulation framework showed that, by integrating simple algorithms, it was possible to achieve better solutions, surpassing or at least matching the original panmictic algorithms in all the tests realized. Another interesting result is that a simple algorithm such as MONA, which had poor results on all tests, was shown to attain state-of-the-art quality Pareto fronts when combined with two simple single-objective DEs in the subpopulation framework.

Thus, motivated by the internal conflicts of populations, structured optimization algorithms should receive increasing attention from the optimization community. In this respect, the proposed subpopulation framework will hopefully aid the development of new structured algorithms and open new possibilities for the algorithms to come. Consequently, further studies on multiple subpopulation dynamics, as well as on global interactions, are hereby encouraged for the further understanding of the framework's frontiers.
References

[1] Alba, E. and Tomassini, M. (2002). Parallelism and evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 6(5):443–462.

[2] Beume, N., Fonseca, C., López-Ibáñez, M., Paquete, L., and Vahrenhold, J. (2009). On the complexity of computing the hypervolume indicator. IEEE Transactions on Evolutionary Computation, 13(5):1075–1082.

[3] Brest, J., Greiner, S., Boskovic, B., Mernik, M., and Zumer, V. (2006). Self-adapting control parameters in differential evolution: A comparative study on numerical benchmark problems. IEEE Transactions on Evolutionary Computation, 10(6):646–657.

[4] Chakraborty, U. (2008). Advances in Differential Evolution. Springer Verlag.

[5] Coello, C. et al. (2006). Evolutionary multi-objective optimization: a historical view of the field. IEEE Computational Intelligence Magazine, 1(1):28–36.

[6] De Toro, F., Ortega, J., Fernández, J., and Díaz, A. (2002). PSFGA: a parallel genetic algorithm for multiobjective optimization. In Proceedings of the 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing, pages 384–391. IEEE.

[7] Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197.

[8] Delbem, A., de Carvalho, A., and Bretas, N. (2005). Main chain representation for evolutionary algorithms applied to distribution system reconfiguration. IEEE Transactions on Power Systems, 20(1):425–436.

[9] Doncieux, S. and Mouret, J. (2010). Behavioral diversity measures for evolutionary robotics. In 2010 IEEE Congress on Evolutionary Computation (CEC), pages 1–8. IEEE.

[10] Doncieux, S., Mouret, J., and Bredeche, N. (2009). Exploring new horizons in evolutionary design of robots. In Workshop on Exploring New Horizons in Evolutionary Design of Robots at IROS, volume 2009, pages 5–12.

[11] Durillo, J., Nebro, A., Coello, C., García-Nieto, J., Luna, F., and Alba, E. (2010). A study of multiobjective metaheuristics when solving parameter scalable problems. IEEE Transactions on Evolutionary Computation, 14(4):618–635.

[12] Fonseca, C. and Fleming, P. (1996). On the performance assessment and comparison of stochastic multiobjective optimizers. In Parallel Problem Solving from Nature - PPSN IV, pages 584–593.

[13] Fonseca, C., Knowles, J., Thiele, L., and Zitzler, E. (2005). A tutorial on the performance assessment of stochastic multiobjective optimizers. In Third International Conference on Evolutionary Multi-Criterion Optimization (EMO 2005), volume 216.

[14] García, S., Molina, D., Lozano, M., and Herrera, F. (2009). A study on the use of non-parametric tests for analyzing the evolutionary algorithms' behaviour: a case study on the CEC'2005 special session on real parameter optimization. Journal of Heuristics, 15(6):617–644.

[15] Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning.

[16] Gomes, C. and Selman, B. (1997). Algorithm portfolio design: Theory vs. practice. In Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, pages 190–197. Morgan Kaufmann Publishers Inc.

[17] Gorges-Schleuter, M. (1991). Genetic algorithms and population structures: a massively parallel algorithm. Master's thesis.

[18] Grunert da Fonseca, V., Fonseca, C., and Hall, A. (2001). Inferential performance assessment of stochastic optimisers and the attainment function. In Evolutionary Multi-Criterion Optimization, pages 213–225. Springer.

[19] Guerri, A. and Milano, M. (2004). Learning techniques for automatic algorithm portfolio selection. In ECAI, volume 16, page 475.

[20] Hollander, M. and Wolfe, D. (1999). Nonparametric Statistical Methods.

[21] Huband, S., Hingston, P., Barone, L., and While, L. (2006). A review of multiobjective test problems and a scalable test problem toolkit. IEEE Transactions on Evolutionary Computation, 10(5):477–506.

[22] Iorio, A. and Li, X. (2005). Solving rotated multi-objective optimization problems using differential evolution. In AI 2004: Advances in Artificial Intelligence, pages 861–872.

[23] Iredi, S., Merkle, D., and Middendorf, M. (2001). Bi-criterion optimization with multi colony ant algorithms. In Evolutionary Multi-Criterion Optimization, pages 359–372. Springer.

[24] Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by simulated annealing. Science, 220(4598):671.

[25] Kukkonen, S. and Lampinen, J. (2005). GDE3: The third evolution step of generalized differential evolution. In The 2005 IEEE Congress on Evolutionary Computation, volume 1, pages 443–450. IEEE.

[26] Kukkonen, S. and Lampinen, J. (2007). Performance assessment of generalized differential evolution 3 (GDE3) with a given set of problems. In 2007 IEEE Congress on Evolutionary Computation (CEC 2007), pages 3593–3600. IEEE.

[27] Larranaga, P. and Lozano, J. (2002). Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation, volume 2. Springer.

[28] Laumanns, M., Rudolph, G., and Schwefel, H. (1998). A spatial predator-prey approach to multi-objective optimization: A preliminary study. In Parallel Problem Solving from Nature - PPSN V, pages 241–249. Springer.

[29] Lehman, J. and Stanley, K. (2008). Exploiting open-endedness to solve problems through the search for novelty. Artificial Life, 11:329.

[30] Lehman, J. and Stanley, K. (2010a). Abandoning objectives: Evolution through the search for novelty alone. Evolutionary Computation, pages 1–34.

[31] Lehman, J. and Stanley, K. (2010b). Efficiently evolving programs through the search for novelty. In Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation, pages 837–844. ACM.

[32] Li, C. and Yang, S. (2008). An island based hybrid evolutionary algorithm for optimization. In Simulated Evolution and Learning, pages 180–189.

[33] Maley, C. (1999). Four steps toward open-ended evolution. In GECCO-99: Proceedings of the Genetic and Evolutionary Computation Conference. Citeseer.

[34] Manderick, B. and Spiessens, P. (1989). Fine-grained parallel genetic algorithms. In ICGA'89, pages 428–433.

[35] Menczer, F., Degeratu, M., and Street, W. (2000). Efficient and scalable pareto optimization by evolutionary local selection algorithms. Evolutionary Computation, 8(2):223–247.

[36] Mouret, J. and Doncieux, S. (2009). Using behavioral exploration objectives to solve deceptive problems in neuro-evolution. In Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, pages 627–634. ACM.

[37] Nebro, A., Durillo, J., Luna, F., Dorronsoro, B., and Alba, E. (2006). A cellular genetic algorithm for multiobjective optimization. In NICSO 2006, page 25.

[38] Okabe, T., Jin, Y., and Sendhoff, B. (2003). A critical survey of performance indices for multi-objective optimisation. In The 2003 Congress on Evolutionary Computation (CEC'03), volume 2, pages 878–885. IEEE.

[39] Pelikan, M. (2005). Bayesian optimization algorithm. In Hierarchical Bayesian Optimization Algorithm, pages 31–48.

[40] Robič, T. and Filipič, B. (2005). DEMO: Differential evolution for multiobjective optimization. In Evolutionary Multi-Criterion Optimization, pages 520–533. Springer.

[41] Santos, A., Delbem, A., London, J., and Bretas, N. (2010). Node-depth encoding and multiobjective evolutionary algorithm applied to large-scale distribution system reconfiguration. IEEE Transactions on Power Systems, 25(3):1254–1265.

[42] Sprave, J. (1999). A unified model of non-panmictic population structures in evolutionary algorithms. In Proceedings of the 1999 Congress on Evolutionary Computation (CEC 99), volume 2. IEEE.

[43] Storn, R. and Price, K. (1997). Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4):341–359.

[44] Storn, R. and Price, K. (2002). Minimizing the real functions of the ICEC'96 contest by differential evolution. In Proceedings of the IEEE International Conference on Evolutionary Computation, 1996, pages 842–844. IEEE.

[45] Talbi, E., Mostaghim, S., Okabe, T., Ishibuchi, H., Rudolph, G., and Coello Coello, C. (2008). Parallel approaches for multiobjective optimization. In Multiobjective Optimization, pages 349–372.

[46] Tomassini, M. (2005). Spatially Structured Evolutionary Algorithms. Springer.

[47] Tušar, T. and Filipič, B. (2007). Differential evolution versus genetic algorithms in multiobjective optimization. In Evolutionary Multi-Criterion Optimization, pages 257–271. Springer.

[48] Vesterstrom, J. and Thomsen, R. (2004). A comparative study of differential evolution, particle swarm optimization, and evolutionary algorithms on numerical benchmark problems. In Congress on Evolutionary Computation (CEC2004), volume 2, pages 1980–1987. IEEE.

[49] Vrugt, J. and Robinson, B. (2007). Improved evolutionary optimization from genetically adaptive multimethod search. Proceedings of the National Academy of Sciences, 104(3):708–711.

[50] Woolley, B. and Stanley, K. (2011). On the deleterious effects of a priori objectives on evolution and representation. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pages 957–964. ACM.

[51] Zitzler, E. and Thiele, L. (1998). Multiobjective optimization using evolutionary algorithms - a comparative case study. In Parallel Problem Solving from Nature - PPSN V, pages 292–301. Springer.

[52] Zitzler, E. and Thiele, L. (1999). Multiobjective evolutionary algorithms: A comparative case study and the strength pareto approach. IEEE Transactions on Evolutionary Computation, 3(4):257–271.

[53] Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C., and da Fonseca, V. (2003). Performance assessment of multiobjective optimizers: An analysis and review. IEEE Transactions on Evolutionary Computation, 7(2):117–132.