Towards Large Scale Automated Algorithm Design by Integrating Modular Benchmarking Frameworks
Amine Aziz-Alaoui, ISAE-SUPAERO, Université de Toulouse, France∗
Carola Doerr, Sorbonne Université, CNRS, LIP6, Paris, France
Johann Dreo, Thales Research & Technology, Palaiseau, France†

February 15, 2021

∗ This work was partially done during the M2 master internship of Amine Aziz-Alaoui at École Polytechnique, Institut Polytechnique de Paris, CNRS, LIX, Palaiseau, France. He is now a PhD student at Institut de Recherche Technologique Saint Exupéry, Toulouse, France.
† Corresponding author, [email protected].
Abstract
We present a first proof-of-concept use-case that demonstrates the efficiency of interfacing the algorithm framework Paradiseo with the automated algorithm configuration tool irace and the experimental platform IOHprofiler. By combining these three tools, we obtain a powerful benchmarking environment that allows us to systematically analyze large classes of algorithm spaces on complex benchmark problems. Key advantages of our pipeline are fast evaluation times, the possibility to generate rich data sets to support the analysis of the algorithms, and a standardized interface that can be used to benchmark very broad classes of sampling-based optimization heuristics.

In addition to enabling systematic algorithm configuration studies, our approach paves a way for assessing the contribution of new ideas in interplay with already existing operators – a promising avenue for our research domain, which at present may have a too strong focus on comparing entire algorithm instances.
When confronted with an optimization problem in practice, one of the major challenges that we face is the selection (and the configuration) of an algorithm that corresponds well to the given problem structure, optimization objective(s), and the available resources (compute, possibility to parallelize computations, accessibility of the problem, etc.). A vast number of different optimization techniques exist, which renders this algorithm selection problem non-trivial.

In practice, algorithm selection is often biased by personal preferences and experiences, as well as by practical aspects such as the availability of ready-to-use implementations. Supporting practitioners in making more systematic choices is one of the key objectives of our research domain. A key tool for deriving such recommendations is algorithm benchmarking, i.e., the analysis of empirical performance data and search trajectories of one or several algorithms on one or several optimization problems [BDB+20, HAR+20, LJD+16, RT18, DWY+18, WKB+14, EPK20, FGLP11], and performance extrapolation [KHNT19]. However, most of these tools are developed in isolation, paying little attention to building compatible interfaces to other benchmarking modules. This significantly hinders their wider adoption.

With this work we demonstrate the benefits of a fully modular benchmarking pipeline design, which keeps the different steps of the benchmarking study in mind. We see our work as a proof of concept for better compatibility between our benchmarking software. On the practical side, our pipeline paves a way for assessing the benefits of new algorithmic ideas in the context of and in interplay with the other operators and ideas that our community offers.

Our contribution:
Concretely, we propose in this work a benchmarking pipeline that integrates the modular algorithm framework Paradiseo [KMRS02, CMT04] with the algorithm configuration tool irace [LDC+16], the experimental platform IOHexperimenter [DWY+18], and the data analysis and visualization tool IOHanalyzer [WVY+20].

Quality of the results:
We show that irace is capable of finding algorithm instances which outperform all baseline algorithms selected by hand, and this for each of the 19 problem instances that we consider. The relative advantage of the best out of 15 irace suggestions over the best baseline algorithm, measured in terms of area under the ECDF curve (see Sec. 3.1), varies between 1% and 30%, with a median gain of 13%.
Scalability:
Our algorithmic framework is capable of generating large sets of solvers, up to several million unique configurations. We show that it is possible to tackle such spaces thanks to fast computations. For instance, we give irace a budget of 100,000 target runs for each of 19 problems, and it completes the full task in approximately 3 hours on a laptop. In our experience, our C++ pipeline is at least 10 times faster than heavily optimized counterparts in Python, not to mention that most of the available modular frameworks are not heavily optimized in the first place.
Take-away for instance selection:
As a side result, we observe that irace can suggest similar algorithm instances for several different problems, suggesting that the diversity in performance profiles sought in [WCLW20] may be weaker than intended. Our work suggests that an approach like ours may result in a more reliable instance selection, since it is less biased by a small set of baseline algorithms and is instead built on a large and diverse set of possible algorithm instances.
Extendability:
Our pipeline is ready to perform large benchmark studies, covering large classes of continuous and discrete optimization algorithms, for example local searches, particle swarm optimization, and estimation of distribution algorithms, using numerical or bitstring encodings. Similarly, the pipeline gives direct access to all problems collected in IOHprofiler, which comprises in particular the BBOB functions from the COCO framework [HAR+20]. Any other problem available within IOHprofiler could easily be used to further extend this study.
Comparison to Previous Works:
Our work is a top-down approach for automatic algorithm design [MLDS14], which uses a parametrized algorithmic framework to instantiate many algorithm instances. Following [LIKS17], we observe that this differs from bottom-up "grammar-based" approaches, like Grammatical Evolution [RCN98, LPC12], which allow for easily designed algorithm spaces, but complicate algorithm instantiation and optimization. In our case, the width of the design space is already large and we target fast algorithm instantiation; we thus favor the top-down approach.

A similar approach to ours was suggested in [LS12, BLIS20, BLIS16] for multi-objective optimization. Those studies also use irace, but the authors implemented their own modular algorithm framework, which is restricted to multi-objective optimization. Our work significantly scales up this kind of study, by leveraging larger algorithm design spaces and larger sets of benchmark problems.
[Figure 1: a UML diagram showing the irace entry point ("Select, Run irace: +run()"), the Paradiseo modules, and the IOHexperimenter classes IOH_problem, W_Model_OneMax (+epistasis, +neutrality, +ruggedness, +max_target, +dimension: int; +operator()(sol: Bits): double), IOH_logger (+do_log(problem_info)), IOH_csv_logger, IOH_observer_combine, IOH_ecdf_logger (+target_range, +budget_range: RangeLinear; +data(): IOH_AttainSuite), and IOH_ecdf_sum (+operator()(ecdf: IOH_AttainSuite): double).]

Figure 1: Summary diagram of the FastGA evaluation pipeline involving the Paradiseo (upper part, red colors) and IOHexperimenter (lower part, blue colors) frameworks, along with the irace entry point. The execution starts from the irace run command on the left and goes through the Paradiseo modules, which call the IOHexperimenter problem (in blue) and loggers (in green). After the run of the algorithm, a statistic is computed on the ECDF data (in cyan), which is then returned to irace as the performance metric (i.e., this is the "fitness value" that the evaluation associates with the configuration under evaluation). The involved classes are represented using the UML convention. For the sake of clarity, the IOHprofiler prefix is written as IOH and the types of the eoAlgoFoundryFastGA slots are indicated as {double} instead of the full eoOperatorFoundry template type.
Sec. 2 briefly introduces the individual modules of our algorithm design pipeline and how they interplay with each other. The use-case on which we apply this pipeline, as well as the experimental setup, are summarized in Sec. 3. The results of our empirical analysis are described in Sec. 4. We conclude our paper in Sec. 5 with a discussion on promising avenues for future work.
Availability of Code and Data:
Our code is available on GitHub at https://github.com/nojhan/paradiseo.

Figure 1 summarizes our automated algorithm design pipeline for the concrete use-case that will be studied in Sec. 3. The pipeline links an algorithm selector with an algorithm generator and a benchmark platform. The algorithm selector asks the algorithm generator to instantiate an algorithm, which then solves a problem of the benchmark platform while being observed by a logger. After this run, the logger's data are summarized as a scalar performance measure, which is sent back to the algorithm selector. We briefly present in this section the different components of our pipeline, and explain the reasons behind our choices.
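To make this loop concrete, the following is a minimal sketch of what the entry point of such a pipeline can look like from the algorithm selector's side: irace invokes a target-runner executable with a candidate configuration on the command line and reads a single scalar cost from its standard output. The flag names and the fastga_auc() helper below are illustrative assumptions, not the actual interface of our code base.

```cpp
// Hedged sketch of a "target runner": irace passes a candidate configuration
// as command-line flags and reads one scalar cost from stdout (lower is
// better). The flag names and fastga_auc() are hypothetical placeholders.
#include <cstdlib>
#include <iostream>
#include <map>
#include <string>

// Stub standing in for "instantiate the GA from operator indices, run it on
// the given problem while the ECDF logger observes it, return the AUC".
double fastga_auc(const std::map<std::string, int>& /*config*/, int /*pb*/) {
    return 0.0;  // the real pipeline computes the AUC of the 2D ECDF here
}

int main(int argc, char** argv) {
    std::map<std::string, int> config;
    int problem = 1;
    for (int i = 1; i + 1 < argc; i += 2) {   // parse "--name value" pairs
        std::string key(argv[i]);
        int value = std::atoi(argv[i + 1]);
        if (key == "--problem") problem = value;
        else config[key] = value;             // e.g. "--crossover 5"
    }
    // We maximize the AUC, so the cost reported to irace is its negation.
    std::cout << -fastga_auc(config, problem) << "\n";
    return 0;
}
```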
Algorithm Framework: Paradiseo

Many evolutionary algorithms share similar design patterns and are often composed of similar operators. This has given rise to several platforms which aim at supporting their users in designing evolutionary heuristics by compiling a set of readily-available operators within a standardized software environment. Given the substantial work that has been put into these frameworks, we decided to build our pipeline around one of the most powerful toolboxes. To this end, we have ranked 39 frameworks among the ones easily available on the web, based on an ad-hoc metric combining rapidity, activity, features, and license, e.g., [NDV15, SL19, GP06, Jen, ECF, FDG+12, CEP08], to name only a few (see Table 4 in the appendix). Since speed is a major concern for our pipeline, we favor frameworks written in C++. To select an up-to-date framework and to ensure availability of support in case of technical issues, we also checked the contribution activity in recent years. These two criteria reduced our choices to Paradiseo [KMRS02, CMT04], OpenBeagle [GP06], and ECF [ECF]. Among these three, Paradiseo covers the largest portfolio of algorithm families, which are composed in the framework by assembling atomic functions (called operators). Paradiseo is also the most actively maintained framework among the three, so we decided to use it for our work.

The upper part of Figure 1 shows the core classes of Paradiseo involved in our setting.
Algorithm Configuration: irace

Several algorithm configuration tools have been developed in the last decade. Among the most common ones used in our community are irace [LDC+16], SPOT [BBFKK10], SMAC [HHLB11], GGA [AMS+15], and Hyperband [LJD+16]. We have chosen irace for this study, for practical considerations (previous experience, availability of documentation, support from the development team).
Experimental Environment: IOHexperimenter

The IOHprofiler project [DWY+18] is a modular platform for the benchmarking of iterative optimization heuristics (IOHs). Within this project, IOHexperimenter provides synthetic benchmarks which are very fast to execute, and a standardized way of observing algorithm behavior through so-called loggers. We have chosen this platform because it is fast and because its modular design made it particularly easy for us to integrate the algorithm design framework (being written in C++, as is Paradiseo). IOHprofiler is also actively maintained, and provides access to broad ranges of different optimization processes.

Compared to Nevergrad [RT18], we particularly like the detailed logging options, which provide information about the anytime behavior of the algorithms, information that is currently not available in Nevergrad. Compared to the COCO [HAR+20] environment, IOHprofiler makes it considerably easier to test algorithms' performance on our own benchmark problems or suites. Finally, the project also supports an interactive performance analysis and visualization module, IOHanalyzer, which we used for the interpretation of our data.

The lower part of Figure 1 shows the classes related to the loggers and the problems that are used in our experimental study in Sec. 3.
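For illustration, the following minimal sketch shows, under simplified naming assumptions, how a solver interacts with an IOHexperimenter-style problem and logger. The class shapes loosely follow the summary in Figure 1, but the real API differs in its details.

```cpp
// Hedged sketch of the problem/logger interaction from Figure 1: a problem
// is a callable that notifies its attached logger on every evaluation.
// Names and signatures are simplified assumptions, not the actual API.
#include <functional>
#include <vector>

using Bits = std::vector<int>;

struct Logger {  // cf. IOH_logger and its do_log(problem_info) hook
    virtual void do_log(double fitness, long evaluations) = 0;
    virtual ~Logger() = default;
};

struct Problem {  // cf. IOH_problem / W_Model_OneMax
    std::function<double(const Bits&)> objective;
    Logger* logger = nullptr;
    long evaluations = 0;

    double operator()(const Bits& x) {
        double value = objective(x);
        ++evaluations;
        if (logger) logger->do_log(value, evaluations);  // observe each call
        return value;
    }
};
```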
Data Records: fast ECDF Logger

In our use-case, we decided to tune algorithms for good anytime performance, and to use the area under the empirical cumulative distribution function (ECDF) curve as the objective. To this end, we have implemented within IOHexperimenter an efficient way of computing these values. This "ECDF logger" will be described in Sec. 3.1.
Data Analysis and Visualization with IOHanalyzer

Data analysis and visualization are performed via IOHanalyzer [WVY+20], which is also part of the IOHprofiler project [DWY+18].

Our use-case is the optimization of the anytime performance of a genetic algorithm on selected instances of the W-model problem. Our performance measure (Sec. 3.1), the algorithmic framework (Sec. 3.2), the benchmark problems (Sec. 3.3), and the experimental setup (Sec. 3.4) are described below. (We use version 3.4.1 of irace, https://cran.r-project.org/web/packages/irace/, run with R.)
In order to allow for large scale experiments, we implement a fast logger within IOHexperimenter, which essentially stores a histogram of the two-dimensional distribution of the ratio of runs having reached a quality/time target. The time dimension is given as the number of calls to the objective function, linearly discretized between zero and the allowed budget. The quality dimension is given as the absolute value of the best solution found during the run, linearly discretized between zero and the known V_max bound (see Table 1). The W-model problem is here converted into a minimization problem, where the solver seeks to optimize −OM(x) (see Sec. 3.3). Figure 2 shows two arbitrarily chosen examples of such histograms. The matrix defines the considered quality/time targets (v, t). The color of each cell corresponds to the probability that the algorithm has identified, within the first t function evaluations, a solution of quality at least v. The darker a cell, the larger the fraction of runs that could successfully meet the quality/time target.

Using the histogram of the performance ECDF instead of its continuous counterpart allows us to keep the data in memory, in compact data structures, without having to rely on slow disk accesses.

The performance of the considered algorithm is computed as a statistic on this histogram. In our study, we use the area under the curve (AUC) of the discretized ECDF, approximated as the sum of the histogram. This allows for a compromise between quality and time, which is easily available because we consider synthetic benchmarks with known bounds.
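The following is a minimal sketch of such a discretized attainment histogram for a single run. The actual ECDF logger in IOHexperimenter is more general (it aggregates several runs and supports configurable ranges), but the principle is the same: fill a budget-by-target grid of counters and sum it to approximate the AUC.

```cpp
// Sketch of the discretized 2D ECDF ("attainment histogram") described
// above: a budget x target grid, marked whenever a quality/time target is
// met. The AUC used as tuning objective is the normalized sum over cells.
// Simplified single-run version under our own naming assumptions.
#include <vector>

struct EcdfHistogram {
    int nb_budgets, nb_targets;
    double max_budget, max_target;            // known bounds (B and V_max)
    std::vector<std::vector<long>> hits;      // hits[t][b]: target t met by budget b

    EcdfHistogram(int nb, int nt, double mb, double mt)
        : nb_budgets(nb), nb_targets(nt), max_budget(mb), max_target(mt),
          hits(nt, std::vector<long>(nb, 0)) {}

    // Record one evaluation: best-so-far value `best` after `evals` calls.
    void log(double best, double evals) {
        int b = static_cast<int>(evals / max_budget * nb_budgets);
        if (b >= nb_budgets) b = nb_budgets - 1;
        int t_max = static_cast<int>(best / max_target * nb_targets);
        // Every target below the best-so-far value is attained from now on:
        for (int t = 0; t < t_max && t < nb_targets; ++t)
            for (int bb = b; bb < nb_budgets; ++bb)
                if (hits[t][bb] == 0) hits[t][bb] = 1;
    }

    // AUC of the discretized ECDF, i.e. the (normalized) histogram sum.
    double auc() const {
        long s = 0;
        for (const auto& row : hits) for (long c : row) s += c;
        return static_cast<double>(s) / (nb_budgets * nb_targets);
    }
};
```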
(µ + λ) "Fast" GA Family

We chose for our use-case a family of (µ + λ) GAs, which is to a large extent inspired by the study [YWDB20]. Algorithm 1 summarizes the framework, called "FastGA" in the implementation.

Essentially, given a parent population of µ points, each of the λ offspring is created by first deciding which variation operator is applied (line 9): with probability p_c the offspring is generated by first recombining two search points from the parent population (lines 11-13) and then randomly deciding (with probability p_m) whether or not to apply a mutation operator to the so-created offspring (lines 15-18). When crossover was not selected in line 10, the offspring is created by mutation (lines 20-21). When all λ offspring have been created, the iteration is completed by a replacement step (line 25).

Implementation of this Family in Paradiseo: We implement this family of GAs through Paradiseo's "foundries", which allow one to register a set of operators (e.g., several kinds of mutations) within a "slot" (e.g., the step at which mutation is called within the algorithm). Before each call, it is possible to instantiate a specific operator among the registered ones, for each slot, thus assembling one algorithm instance among all the possible combinations of operators. Note that operators can be simple numbers, like a probability. Operators are referenced within slots by their indices.

Most of the operators we use were already available in Paradiseo, with the exception of the mutation operators with indices 1-5 (see below), which we implemented for this study. We also implemented Algorithm 1 as the eoFastGA class, in which to plug the operators. We consider the following operators and parametrizations, which result in a total number of 1 630 475 different configurations of Algorithm 1. Numbers in brackets indicate the indices of the corresponding operators within their slot. All our code is contributed to the Paradiseo project.
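The following toy sketch illustrates the foundry principle, i.e., registering operator variants in a slot and instantiating one of them by its index. The types are simplified stand-ins, not the actual eoOperatorFoundry classes of Paradiseo.

```cpp
// Toy version of the "foundry" idea: each slot holds a list of registered
// operator variants, and a configuration is just one index per slot. This
// mirrors how eoAlgoFoundryFastGA assembles an algorithm instance, but the
// types below are simplified stand-ins, not the real Paradiseo classes.
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Bits = std::vector<int>;
using Mutation = std::function<void(Bits&)>;

struct Slot {
    std::vector<Mutation> variants;            // registered operators
    void add(Mutation op) { variants.push_back(std::move(op)); }
    Mutation& instantiate(std::size_t index) { // pick one by its index
        assert(index < variants.size());
        return variants[index];
    }
};

int main() {
    Slot mutation_slot;
    mutation_slot.add([](Bits& x) { x[0] ^= 1; });                // 1-bit flip
    mutation_slot.add([](Bits& x) { for (auto& b : x) b ^= 1; }); // flip all
    Bits x(10, 0);
    mutation_slot.instantiate(0)(x);           // index chosen by irace
    return 0;
}
```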
Paradiseo project. lgorithm 1: A Configurable Family of ( µ + λ ) Genetic Algorithms. Input:
Budget B , configuration ( µ, λ, p c , p m ), choice of the operators and conditionalparameters. Note that P and P (cid:48) are multi-sets, i.e the same point may appearmultiple times; Initialization: P ← InitialSampling( µ ); evaluate the µ points in P ; Evals ← µ ; Optimization: for t = 1 , , , . . . until Evals = B do P (cid:48) ← ∅ ; for i = 1 , . . . , λ do Sample r c ∈ [0 ,
1] u.a.r.; if r c ≤ p c then (cid:0) y ( i, , y ( i, (cid:1) ← SelectC( P ); (cid:0) y (cid:48) ( i, , y (cid:48) ( i, (cid:1) ← Crossover (cid:0) y ( i, , y ( i, (cid:1) ; Sample z ( i, ∈ (cid:0) y (cid:48) ( i, , y (cid:48) ( i, (cid:1) u.a.r. ; Sample r m ∈ [0 ,
1] u.a.r. ; if r m ≤ p m then z ( i, ← Mutation (cid:0) z ( i, (cid:1) ; else z ( i, ← z ( i, ; else z ( i, ← SelectM( P ); z ( i, ← Mutation (cid:0) z ( i, (cid:1) ; Evaluate z ( i, ; Evals ← Evals+1; P (cid:48) ← P (cid:48) ∪ (cid:8) z ( i, (cid:9) ; P ← Replace(
P, P (cid:48) , µ ); InitialSampling ( µ ) : Initialization of the Algorithm (1 option) We only consider in-dependent uniform sampling, i.e. , the µ points are i.i.d. uniform samples. The correspondingParadisEO operator is eoInitFixedLength . Crossover rate p c (6 options): We consider p c ∈ { , . , . , . , . , . } . Being only ableto use the integer and categorical interface for irace , we predefine the set of rates, which irace will see as integers. SelectC ( P ) : Selection of two points for the crossover operation (7 options). Notethat in the implementation, the selection operator (line 11) is called twice to select the twocandidate points. [0] eoRandomSelect() : Uniformly select a point from P (without removing the first selectedindividual from the set P used by the second selection). (1 option). [1] eoStochTournamentSelect( k ) : Select a point from P with tournament selection, i.e. , weselect uniformly at random k different points in P and the best one of these is selected. k denotes the tournament size as percentage of population ( i.e., k ∈ [0 , k = 0 . [2] eoSequentialSelect() : Select the best point from P (with respect to the objective func-tion value). This operator is sometimes referred to as elitist selection or truncation selec-tion . When called twice, it selects the two distinct best points from P . (1 option). [3] eoProportionalSelect() : Select a point from P with so-called fitness-proportional selec-tion, i.e. , point x ∈ P is chosen with probability f ( x ) / (cid:80) y ∈ P f ( y ). (1 option). [4--6] eoDetTournamentSelect( k ) : Like eoDetTournamentSelect , but k is deterministic.(3 different options, each one for k ∈ [2 , , Crossover ( x, y ) : Bivariate Variation Operators (11 options). [0--4] eoUBitXover( b c ) : Uniform crossover with bias (or “preference” in ParadisEO) b c , set-ting (independently for each position i ∈ [1 ..n ]) z i = x i with probability b c and setting z i = y i otherwise. z denotes the offspring element coming from the crossover of x and y .(5 different options, b c ∈ [0 . , . , . , . , . [5--9] eoNPtsBitXover( k ) : k -point crossover, which selects i , . . . , i k uniformly at randomand without replacement from [1 ..n ] and sets z i = x i for i ∈ [1 ..i ] ∪ [ i + 1 ..i ] ∪ . . . andsets z i = y i for i ∈ [ i + 1 ..i ] ∪ [ i + 1 ..i ] ∪ . . . (5 different options, k ∈ [1 , , , , [10] eo1PtBitXover() : Classic 1-point crossover. (1 option). Mutation probability p m (6 options): We consider p m ∈ { , . , . , . , . , . } . Mutation ( x ) : Univariate Variation Operator (11 options) All mutation operators areunary unbiased in the sense proposed in [LW12]. For a compact representation, we follow thecharacterization suggested in [DDY20] and define the mutation operators via the distributionsthat they define over the possible mutation strengths k ∈ [0 ..n ]. After sampling k from theoperator-specific distribution, the k -bit flip operator, flip k ( · ), is applied; it flips the entries in k uniformly chosen, pairwise different bits ( i.e. , the k bits are chosen u.a.r. without replacement).7
[0] eoUniformBitMutation(): The "uniform" mutation operator, which samples k uniformly at random in the set [0..n]. (1 option).

[1] eoStandardBitMutation(p = 1/n): This is the standard bit mutation with mutation rate p. It chooses k from the binomial distribution Bin(n, p). (1 option).

[2] eoConditionalBitMutation(p = 1/n): A conditional standard bit mutation operator with mutation rate p. It chooses k′ from Bin(n − 1, p) and applies the flip_k(·) operator with k = k′ + 1. (1 option).

[3] eoShiftedBitMutation(p = 1/n): The "shifted" standard bit mutation with mutation rate p, suggested in [CD18]. It samples k′ from the binomial distribution Bin(n, p). When k′ = 0, it uses k = 1, and it uses k = k′ otherwise. (1 option).

[4] eoNormalBitMutation(p, σ²): The "normal" mutation operator suggested in [YDB19]. It samples k from the normal distribution N(pn, σ²). When k > n, k is replaced by a value chosen uniformly at random in the set [0..n]. (1 option, p = 1/n and fixed σ²).

[5] eoFastBitMutation(β): The "fast" mutation operator suggested in [DLMN17]. It samples k′ from the power-law distribution P[L = k] = (C_β^{n/2})^{-1} k^{-β} with C_β^{n/2} = Σ_{i=1}^{n/2} i^{-β}. When k′ is larger than n, it samples a uniform value k in [0..n], and it uses k = k′ otherwise. (1 option, with fixed β).

[6--10] eoDetSingleBitFlip(k): Deterministically applies flip_k(·). (5 different options for k).

SelectM(P): Selection of one point for the mutation operation if crossover was not chosen (7 options). We essentially have the same selection operators as for crossover; the only difference is that we select only one point instead of two.
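All the unary operators above follow the same pattern: sample a mutation strength k from an operator-specific distribution, then apply flip_k(·). The following sketch shows this pattern for the "shifted" standard bit mutation; the actual Paradiseo implementations differ in their details.

```cpp
// Hedged sketch of the common "sample k, then flip_k" pattern described
// above, instantiated for the shifted standard bit mutation of [CD18].
// Not the actual eoShiftedBitMutation implementation.
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

std::mt19937 rng{42};

// Flip k distinct, uniformly chosen positions of x (u.a.r., no replacement).
void flip_k(std::vector<int>& x, int k) {
    std::vector<int> idx(x.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::shuffle(idx.begin(), idx.end(), rng);
    for (int i = 0; i < k; ++i) x[idx[i]] ^= 1;
}

// Shifted standard bit mutation: k' ~ Bin(n, p); use k = 1 when k' = 0.
void shifted_bit_mutation(std::vector<int>& x, double p) {
    std::binomial_distribution<int> bin(static_cast<int>(x.size()), p);
    int k = bin(rng);
    flip_k(x, k == 0 ? 1 : k);
}
```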
Replace(P, P′, µ): Replacement of the population (11 options).

[0] eoPlusReplacement(): The best µ points of the multiset P ∪ P′ are chosen. (1 option).

[1] eoCommaReplacement(): The best µ points of the offspring multiset P′ are chosen. (1 option).

[2] eoSSGAWorseReplacement(): The min(λ, µ) points of the offspring multiset P′ replace the worst points in P. (1 option).

[3--5] eoSSGAStochTournamentReplacement(k): Like eoSSGADetTournamentReplacement(), with k being the tournament size as a percentage of the population. (3 different options for k).

[6--10] eoSSGADetTournamentReplacement(k): The µ points are selected through tournament selection. Each tournament involves k uniformly chosen points in P ∪ P′, and the best one of these k points is selected. This procedure is repeated µ times, each time removing an already selected point from the multi-set P ∪ P′. (5 different options for k).

The set of all combinations generates the algorithm design space on which we let irace search for the configuration(s) that best solve a given problem instance.

Baseline Algorithms: We consider four baseline algorithms, against which we compare the results of the automated design:
1. (µ + λ) EA: no crossover, plus replacement, standard bit mutation, random selector for mutations.
2. (µ + λ) fEA: no crossover, plus replacement, fast bit mutation, random selector for mutations.
3. (µ + λ) xGA: sequential selections, uniform crossover, standard bit mutation, plus replacement, with fixed p_c and b_c.
4. (µ + λ) 1ptGA: sequential selections, 1-point crossover, standard bit mutation, plus replacement, with fixed p_c and b_c.

We evaluate our automated algorithm design pipeline on the W-model functions originally suggested in [WW18]. In a nutshell, the W-model is a benchmark problem generator which allows one to tune different characteristics of the problems; see below for a description. We selected from this family of benchmark problems the 19 instances suggested in [WCLW20], which are summarized in Table 1.
Note here that the description differs from that given in [WCLW20], since we use the implementation within IOHexperimenter, which was made available in the context of the work [DYH+20] to superpose the W-model transformations onto different optimization problems. The instances selected in [WCLW20], however, were only selected from transformations applied to the OneMax problem OM: {0,1}^n → [0..n], x ↦ Σ_{i=1}^n x_i. The OneMax problem has a very smooth and non-deceptive fitness landscape. Due to the well-known coupon collector effect [DP09], it is relatively easy to make progress when the function values are small, and the probability to obtain an improving move decreases considerably with increasing function values. The complexity of the OneMax problem can be considerably increased through the following W-model transformations.

(1) Neutrality W(·, µ, ·, ·): The bit string (x_1, ..., x_n) is reduced to a string (y_1, ..., y_m) with m := n/µ, where µ is a parameter of the transformation. For each i ∈ [m], the value of y_i is the majority of the bit values in the size-µ substring (x_{(i−1)µ+1}, x_{(i−1)µ+2}, ..., x_{iµ}) of x. That is, y_i = 1 if and only if at least µ/2 of these bits are 1. When n/µ ∉ ℕ, the last bits of x are copied to y.

(2) Epistasis W(·, ·, ν, ·): Epistasis introduces local perturbations to the bit strings. It first "cuts" the input string (x_1, ..., x_n) into subsequent blocks of size ν. Using a permutation e_ν: {0,1}^ν → {0,1}^ν, each substring (x_{(i−1)ν+1}, x_{(i−1)ν+2}, ..., x_{iν}) is mapped to another string (y_{(i−1)ν+1}, y_{(i−1)ν+2}, ..., y_{iν}) = e_ν((x_{(i−1)ν+1}, x_{(i−1)ν+2}, ..., x_{iν})). The permutation e_ν is chosen in a way that Hamming-1 neighbors are mapped to strings of Hamming distance at least ν − 1.

(3) Ruggedness and Deceptiveness W(·, ·, ·, γ): This layer perturbs the fitness values by applying a permutation σ(γ) to the possible fitness values [0..n]. The parameter γ can be thought of as a parameter which controls the distance of the permutation to the identity. The permutations σ(γ) are chosen in a way such that the "hardness" of the instances monotonically increases with increasing γ; see [WW18] for details.
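As an illustration of the first layer, the following sketch implements OneMax and the neutrality transformation as described above; the actual IOHexperimenter W-model implementation may differ in corner cases such as tie-breaking.

```cpp
// Sketch of OneMax and the neutrality layer described above: blocks of
// size mu are reduced to a single bit by majority vote, and an incomplete
// last block is copied verbatim. Tie-breaking towards 1 is our assumption.
#include <cstddef>
#include <numeric>
#include <vector>

int one_max(const std::vector<int>& x) {
    return std::accumulate(x.begin(), x.end(), 0);
}

std::vector<int> neutrality(const std::vector<int>& x, int mu) {
    std::vector<int> y;
    std::size_t i = 0;
    for (; i + mu <= x.size(); i += mu) {
        int ones = 0;
        for (int j = 0; j < mu; ++j) ones += x[i + j];
        y.push_back(2 * ones >= mu ? 1 : 0);  // majority of the block
    }
    for (; i < x.size(); ++i) y.push_back(x[i]);  // copy leftover bits
    return y;
}
```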
Our test bed is the automated design of Algorithm 1 with the options specified in Sec. 3.2, with the objective to maximize the AUC as defined in Sec. 3.1, and this for each of the 19 problems listed in Table 1. These instances of the W-model problem were suggested in [WCLW20] based on an empirical study using clustering of algorithm performance data, with the goal to select a diverse collection of benchmark problems. Note here that we tune the algorithms for each problem individually. That is, we apply our algorithm design pipeline 19 independent times.

Table 1: Test problems on which the pipeline is evaluated, taken from [WCLW20]. In column "best" we list the baseline algorithm with the largest average AUC value, reported in column AUC_b. AUC_i is the average AUC of the elite configurations suggested by the 15 independent runs of irace. AUC values are w.r.t. at least 50 validation runs, and "rel." indicates the relative gain (AUC_i − AUC_b) / AUC_b.

FID  dim  µ  ν   γ    V_max  best  AUC_b  AUC_i  rel.
1    20   2  6   10   10     xGA   8378   8740   4%
2    20   2  6   18   10     fEA   8402   8754   4%
3    16   1  5   72   16     fEA   8352   8397   1%
4    48   3  9   72   16     EA    8299   8914   7%
5    25   1  23  90   25     fEA   8003   8510   6%
6    32   1  2   397  32     1pt   7055   7311   4%
7    128  4  11  0    32     1pt   6833   8183   20%
8    128  4  14  0    32     EA    6885   8499   23%
9    128  4  8   128  32     xGA   8154   8786   8%
10   50   1  36  245  50     fEA   7216   8122   13%
11   100  2  21  256  50     EA    8314   9139   10%
12   150  3  16  613  50     EA    8034   8730   9%
13   128  2  32  256  64     fEA   8076   9345   16%
14   192  3  21  16   64     fEA   6173   7677   24%
15   192  3  21  256  64     fEA   6797   8292   22%
16   192  3  21  403  64     fEA   7273   8592   18%
17   256  4  52  2    64     xGA   6935   9028   30%
18   75   1  60  16   75     EA    5958   7089   19%
19   150  2  32  4    75     EA    7399   8717   18%

We fix the population sizes to λ = µ = 5, both for the search performed by irace and for our baseline algorithms. For each use-case, we set the budget of the algorithms to 5n function evaluations (FEs). To compute the AUC, we evaluate the performance at 100 linearly distributed budgets b_1, ..., b_100 ∈ [1, 5n] and at 100 linearly distributed target values v_1, ..., v_100 ∈ [0, V_max]. The linearization computes the bucket index i = ⌊(x − x_min) / (x_max − x_min) · 100⌋ for both budgets and targets.

To find the best algorithm design, we allow irace a budget of 100,000 target runs, and we ensure that it performs at least 50 independent validation runs for the elite configurations. We run this irace search 15 independent times, to check the robustness of its selection. To compare this performance to the four baseline algorithms, we run each of these 50 independent times, on each of the 19 test problems. Running irace with this budget on the 19 problems on a computer with four Intel i5-7300HQ CPU cores at 2.50GHz and Crucial P1 solid-state disks takes approximately 3 hours.
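For reference, the bucket-index formula used for this discretization can be sketched as follows (the clamping at the upper edge is our assumption):

```cpp
// Linear discretization used for the AUC computation: a raw budget or
// target value x in [x_min, x_max] is mapped to one of `buckets` bins,
// mirroring the bucket-index formula above (100 buckets in our setup).
#include <algorithm>

int bucket_index(double x, double x_min, double x_max, int buckets = 100) {
    int i = static_cast<int>((x - x_min) / (x_max - x_min) * buckets);
    return std::clamp(i, 0, buckets - 1);  // clamp x = x_max into the last bin
}
```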
Comparison of AUC Values by Function: Table 1 compares the AUC values of the best of the four baseline algorithms against those of the elite configurations suggested by irace. We observe that, for each of the 19 functions, the elite configurations suggested by irace perform better than the best baseline algorithm. We report in Table 1 the average values, but the differences between the individual irace runs are very small: less than 2.1% difference in AUC value between the best and the worst elite configuration for all 19 problems, and less than 1% performance difference for 9 out of the 19 functions. The relative advantage of the irace recommendations over the best baseline algorithms varies between 1% and 30%. When looking at each of the 15 elite configurations suggested per function, the best relative advantage is 31% for F17, whereas two of the irace elites performed worse than the best of the four baseline algorithms on function F3. For all other functions, all 15 irace elites have a better AUC value than the best of the four baseline algorithms. However, although we see a clear advantage of the irace configurations, we should keep in mind that the irace configurations are specifically tuned for each function, whereas the configurations of the four baseline algorithms are identical for all 19 W-model functions.

Unfortunately, our pipeline does not yet allow us to tune a single best solver, i.e., a single configuration which maximizes the AUC under the aggregated ECDF curve. Adding this functionality is a straightforward extension of our framework, which we plan to address in future work. The key challenge in implementing this extension is that Paradiseo does not have the feature to easily reset the states of solvers on the fly between two runs on different problems.

Table 2: Configuration of the best out of the elite recommendations suggested by 15 independent runs of irace, for each of the 19 benchmark problems specified in Table 1, compared against the configurations of the four baseline algorithms. The "op." column gives the number of options per operator. All other integer values correspond to the indices with which the different options are listed in Sec. 3.2, and "-" indicates a non-applicable element (for instance, no crossover operator is used when p_c = 0). The rows list the chosen option indices for p_c, SelectC, Crossover, p_m, SelectM, Mutation, and Replace. [Table body omitted.]

Table 3: Distribution of the operator variants recommended by the 15 runs of irace, for problem 5 (left), problem 17 (center), and all problems (right). The most selected indices are highlighted in bold, and the darker the background color, the more often the operator instance is selected. Empty cells indicate that irace never selected the operator instance; cells with a "-" entry mark indices which are not defined for this operator. [Table body omitted.]
Comparison of the Configurations.
Table 2 summarizes the best of the 15 elite configurations that were suggested by irace and compares them against the four baseline algorithms. Table 3 shows the distribution of the operators chosen by the 15 irace runs. For the latter, we have chosen problem 17 as an example because we observed there the largest relative gain (see Table 1). We have added problem 5 for comparison, because the distribution of operators suggested for it is very distinct from that of problem 17.

[Figure 3: Convergence plots for the baseline algorithms (EA, fEA, xGA, 1ptGA) and the elite configurations suggested by irace, on problem 5 (left) and problem 17 (right); x-axis: function evaluations, y-axis: best-so-far f(x)-value.]

It is worth noting that each operator is selected at least once in the 19 × 15 elite configurations suggested by irace (Table 3, right), which seems to confirm i) that different operators work well on different problems and ii) that irace searches the full design space, giving some indication that the space is not too large or too complex for automated tuning approaches.

We can see that, among all the best configurations proposed by irace across the 15 runs, none is similar to one of the baseline algorithms. The probability of mutation p_m is most frequently set to higher values, and the most often chosen mutation is the deterministic bit flip with a larger number of bits (index 10 in the Mutation slot, which is the flip_k mutation operator). This indicates that larger mutation strengths could have been worth investigating, a result that has surprised us, since in most benchmark studies we see small mutation rates as defaults. The results confirm the superiority of the plus replacement (id. 0 in the Replace slot) and support the use of an elitist selection for the crossover candidates (id. 2 in SelectC). We can also see that the uniform crossover (in the Crossover slot) is often chosen, along with a small probability of performing a crossover (id. 1 in p_c).

For some problems, irace almost always suggests a similar algorithm. On problem 17, for example, it often selects a GA with a large probability of applying uniform crossover in combination with deterministic bit flip mutations. For some other problems, a larger variance in the selected operators can be observed. For instance, on problem 5, irace selects a high mutation probability along with an elitist mutation selection, but does not show a clear preference for the other slots.

These results support the idea that there is not always a single best solver (i.e., "No Free Lunch"), even when considering limited design and benchmarking spaces. We also see that some problems seem to require certain design choices, whereas others can be solved well by a broad range of configurations. A more detailed analysis of how these preferences correlate with the characteristics of the problems should offer plenty of interesting insights, but is left for future work.
Fixed-budget solution qualities: Figure 3 shows two examples of convergence plots, where we plot the values of the best solutions found so far against the number of objective function evaluations performed, for each baseline algorithm and for the best elite configuration selected by irace. Problems 5 and 17 are chosen to allow for comparison with Table 2. We observe that the elite configuration on problem 17 is largely more efficient than any of the baseline algorithms. However, on problem 5, the elite configuration is only the most efficient until 140 evaluations. It is selected nonetheless, because we consider the AUC of the 2D ECDF, which takes into account the average performance (across all budgets and targets) rather than the terminal budget of the best target. We believe that, whatever performance metric we choose, there will always exist such artifacts, where some algorithm would have been the best, had we chosen another metric. It is clear, however, that even in this plot the elite configuration performs better most of the time.

By interfacing three state-of-the-art benchmarking modules from the evolutionary computation literature, irace [LDC+16], Paradiseo [KMRS02], and IOHprofiler [DWY+18], we obtain a powerful environment for the automated design of optimization heuristics, which produces rich data sets that can hint at where to look for interesting structures.

The modular design of the pipeline and its components makes our approach very broadly applicable. It is not restricted to particular types of problems, nor to specific algorithms. In particular, extensions to continuous or mixed-integer problems are rather straightforward. Indeed, the Paradiseo framework is designed to separate operators which are independent of the encoding (selection, replacement, etc.) from operators which depend on it (mutation, crossover, etc.), allowing for an easy reuse of components and extensions to other algorithmic paradigms (estimation of distribution, local search, multi-objective, etc.). Additionally, IOHexperimenter provides loggers for vectorial encodings and benchmarks for both numerical and bitstring encodings.

Our work is partially motivated by an industrial application that requires an automated configuration of hardware products. However, we believe that our pipeline is not only interesting for such practical purposes. For researchers, our pipeline offers an elegant way of assessing new algorithm operators and their interplay with already existing ones.

In terms of further development, we plan to add the necessary features which would i) allow for running the same algorithm on multiple problems, while using a single logger that aggregates the results, and ii) support irace's interface for numerical parameters (in addition to categorical and integer ones).

We then plan to test the approach on different algorithm families, with a possible extension to generic "bottom-up" hybridization grammars [MMLIS13] and studies on the most efficient algorithm designs (e.g., on the correlations between elite algorithms' operators).

We also plan to extend the framework by integrating feature extraction methods that use algorithm trajectory data [DLV+19, BPRH19] and/or samples specifically made for exploratory landscape analysis [MBT+11, KT16], to couple the algorithm design to such information, similar to the per-instance configuration approaches made in [HHHL06, BDSS17].

Our long-term vision is a pipeline for the automated design of algorithms which adjust their behavior during the optimization process, by taking into account the information accumulated so far, similar to the dynamic algorithm configurations studied under the notion of parameter control [KHE15]. In contrast to the static designs considered in this work, the automated design of dynamic algorithms requires selecting suitable update rules (e.g., based on time, on progress, on self-adaptation, etc.).

Finally, we also consider interesting the idea of providing a user-friendly front-end which allows users to assemble a benchmark study by selecting (e.g., through a graphical user interface) one or more algorithms and problems, the budget, etc., and then passing this study on to an automated interface which tunes (if desired) and runs the algorithm(s), and then automatically directs its users to the data summary and visualization platform IOHanalyzer, where the results of the empirical study can be analyzed. We believe that such a pipeline would greatly improve the deployment of evolutionary methods in practice.
References

[AMS+15] Carlos Ansótegui, Yuri Malitsky, Horst Samulowitz, Meinolf Sellmann, and Kevin Tierney, Model-based genetic algorithms for algorithm configuration, Proc. of International Conference on Artificial Intelligence (IJCAI'15), AAAI Press, 2015, pp. 733–739.
[BBFKK10] Thomas Bartz-Beielstein, Oliver Flasch, Patrick Koch, and Wolfgang Konen, SPOT: A toolbox for interactive and automatic tuning in the R environment, Proc. of the 20. Workshop Computational Intelligence, Universitätsverlag Karlsruhe, 2010, pp. 264–273.
[BDB+20] Thomas Bartz-Beielstein, Carola Doerr, Jakob Bossek, Sowmya Chandrasekaran, Tome Eftimov, Andreas Fischbach, Pascal Kerschke, Manuel López-Ibáñez, Katherine M. Malan, Jason H. Moore, Boris Naujoks, Patryk Orzechowski, Vanessa Volz, Markus Wagner, and Thomas Weise, Benchmarking in optimization: Best practice and open issues, CoRR abs/2007.03488 (2020).
[BDSS17] Nacim Belkhir, Johann Dreo, Pierre Savéant, and Marc Schoenauer, Per instance algorithm configuration of CMA-ES with limited budget, Proc. of Genetic and Evolutionary Computation Conference (GECCO'17), ACM, 2017, pp. 681–688.
[BLIS16] Leonardo C. T. Bezerra, Manuel López-Ibáñez, and Thomas Stützle, Automatic component-wise design of multi-objective evolutionary algorithms, IEEE Transactions on Evolutionary Computation (2016), no. 3, 403–417.
[BLIS20] Leonardo C. T. Bezerra, Manuel López-Ibáñez, and Thomas Stützle, Automatically designing state-of-the-art multi- and many-objective evolutionary algorithms, Evolutionary Computation (2020), no. 2, 195–226.
[BPRH19] Lukáš Bajer, Zbyněk Pitra, Jakub Repický, and Martin Holeňa, Gaussian process surrogate models for the CMA evolution strategy, Evolutionary Computation (2019), no. 4, 665–697.
[CD18] Eduardo Carvalho Pinto and Carola Doerr, Towards a more practice-aware runtime analysis of evolutionary algorithms, CoRR abs/1812.00493 (2018).
[CEP08] T. Cloete, Andries Petrus Engelbrecht, and Gary Pampara, CIlib: A collaborative framework for computational intelligence algorithms - part II, Proc. of the International Joint Conference on Neural Networks (IJCNN'08), IEEE, 2008, pp. 1764–1773.
[CMT04] Sébastien Cahon, Nordine Melab, and El-Ghazali Talbi, ParadisEO: A framework for the reusable design of parallel and distributed metaheuristics, J. Heuristics (2004), no. 3, 357–380. Latest release available on https://nojhan.github.io/paradiseo/.
[CSC+19] Borja Calvo, Ofer M. Shir, Josu Ceberio, Carola Doerr, Hao Wang, Thomas Bäck, and Jose A. Lozano, Bayesian performance analysis for black-box optimization benchmarking, Proc. of Genetic and Evolutionary Computation Conference (GECCO'19, Companion), ACM, 2019, pp. 1789–1797.
[DDY20] Benjamin Doerr, Carola Doerr, and Jing Yang, Optimal parameter choices via precise black-box analysis, Theoretical Computer Science (2020), 1–34.
[DLMN17] Benjamin Doerr, Huu Phuoc Le, Régis Makhmara, and Ta Duy Nguyen, Fast genetic algorithms, Proc. of Genetic and Evolutionary Computation Conference (GECCO'17), ACM, 2017, pp. 777–784.
[DLV+19] Bilel Derbel, Arnaud Liefooghe, Sébastien Vérel, Hernán E. Aguirre, and Kiyoshi Tanaka, New features for continuous exploratory landscape analysis based on the SOO tree, Proc. of Foundations of Genetic Algorithms (FOGA'19), ACM, 2019, pp. 72–86.
[DP09] Devdatt P. Dubhashi and Alessandro Panconesi, Concentration of measure for the analysis of randomised algorithms, Cambridge University Press, 2009.
[DWY+18] Carola Doerr, Hao Wang, Furong Ye, Sander van Rijn, and Thomas Bäck, IOHprofiler: A benchmarking and profiling tool for iterative optimization heuristics, CoRR abs/1810.05281 (2018). Available at http://arxiv.org/abs/1810.05281; a more up-to-date documentation of IOHprofiler is available at https://iohprofiler.github.io/.
[DYH+20] Carola Doerr, Furong Ye, Naama Horesh, Hao Wang, Ofer M. Shir, and Thomas Bäck, Benchmarking discrete optimization heuristics with IOHprofiler, Applied Soft Computing (2020), 106027.
[ECF] Evolutionary Computation Framework (ECF), http://ecf.zemris.fer.hr/. Last visited: 2021-02-04.
[EPK20] Tome Eftimov, Gasper Petelin, and Peter Korosec, DSCTool: A web-service-based framework for statistical comparison of stochastic optimization algorithms, Appl. Soft Comput. (2020), 105977.
[FDG+12] Félix-Antoine Fortin, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, and Christian Gagné, DEAP: Evolutionary algorithms made easy, Journal of Machine Learning Research (2012), 2171–2175.
[FGLP11] Carlos M. Fonseca, Andreia P. Guerreiro, Manuel López-Ibáñez, and Luís Paquete, On the computation of the empirical attainment function, Proc. of Evolutionary Multi-Criterion Optimization (EMO'11), LNCS, vol. 6576, Springer, 2011, pp. 106–120.
[GP06] Christian Gagné and Marc Parizeau, Genericity in evolutionary computation software tools: Principles and case study, International Journal on Artificial Intelligence Tools (2006), no. 2, 173–194.
[HAR+20] Nikolaus Hansen, Anne Auger, Raymond Ros, Olaf Mersmann, Tea Tušar, and Dimo Brockhoff, COCO: A platform for comparing continuous optimizers in a black-box setting, Optimization Methods and Software (2020), 1–31.
[HHHL06] Frank Hutter, Youssef Hamadi, Holger H. Hoos, and Kevin Leyton-Brown, Performance prediction and automated tuning of randomized and parametric algorithms, Proc. of Principles and Practice of Constraint Programming (CP'06), LNCS, vol. 4204, Springer, 2006, pp. 213–228.
[HHLB11] Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown, Sequential model-based optimization for general algorithm configuration, Proc. of Learning and Intelligent Optimization (LION'11), Springer, 2011, pp. 507–523.
[Jen] Jenetics, https://jenetics.io/. Last visited: 2021-02-04.
[KHE15] Giorgos Karafotias, Mark Hoogendoorn, and A. E. Eiben, Parameter control in evolutionary algorithms: Trends and challenges, IEEE Transactions on Evolutionary Computation (2015), 167–187.
[KHNT19] Pascal Kerschke, Holger H. Hoos, Frank Neumann, and Heike Trautmann, Automated algorithm selection: Survey and perspectives, Evolutionary Computation (2019), no. 1, 3–45.
[KMRS02] Maarten Keijzer, J. J. Merelo, G. Romero, and M. Schoenauer, Evolving objects: A general purpose evolutionary computation library, Artificial Evolution (2002), 829–888. Latest release available on https://nojhan.github.io/paradiseo/.
[KT16] Pascal Kerschke and Heike Trautmann, The R-package FLACCO for exploratory landscape analysis with applications to multi-objective optimization problems, Proc. of IEEE Congress on Evolutionary Computation (CEC'16), IEEE, 2016, pp. 5262–5269.
[LDC+16] Manuel López-Ibáñez, Jérémie Dubois-Lacoste, Leslie Pérez Cáceres, Mauro Birattari, and Thomas Stützle, The irace package: Iterated racing for automatic algorithm configuration, Operations Research Perspectives (2016), 43–58.
[LIKS17] Manuel López-Ibáñez, Marie-Eléonore Kessaci, and Thomas G. Stützle, Automatic design of hybrid metaheuristics from algorithmic components, Tech. report, 2017.
[LJD+16] Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar, Hyperband: A novel bandit-based approach to hyperparameter optimization, arXiv preprint arXiv:1603.06560 (2016).
[LPC12] Nuno Lourenço, Francisco Pereira, and Ernesto Costa, Evolving evolutionary algorithms, Proc. of Genetic and Evolutionary Computation Conference (GECCO'12, Companion Material), ACM, 2012, pp. 51–58.
[LS12] Manuel López-Ibáñez and Thomas Stützle, The automatic design of multiobjective ant colony optimization algorithms, IEEE Trans. Evol. Comput. (2012), no. 6, 861–875.
[LW12] Per Kristian Lehre and Carsten Witt, Black-box search by unbiased variation, Algorithmica (2012), 623–642.
[MBT+11] Olaf Mersmann, Bernd Bischl, Heike Trautmann, Mike Preuss, Claus Weihs, and Günter Rudolph, Exploratory landscape analysis, Proc. of Genetic and Evolutionary Computation Conference (GECCO'11), ACM, 2011, pp. 829–836.
[MLDS14] Franco Mascia, Manuel López-Ibáñez, Jérémie Dubois-Lacoste, and Thomas Stützle, Grammar-based generation of stochastic local search heuristics through automatic algorithm configuration tools, Comput. Oper. Res. (2014), 190–199.
[MMLIS13] Marie-Eléonore Marmion, Franco Mascia, Manuel López-Ibáñez, and Thomas Stützle, Towards the automatic design of metaheuristics, Proc. of the 10th Metaheuristics International Conference (MIC 2013, Singapore), August 2013, pp. 1–3.
[NDV15] Antonio J. Nebro, Juan J. Durillo, and Matthieu Vergne, Redesigning the jMetal multi-objective optimization framework, Proc. of Genetic and Evolutionary Computation Conference (GECCO'15, Companion), ACM, 2015, pp. 1093–1100.
[RCN98] Conor Ryan, John James Collins, and Michael O'Neill, Grammatical evolution: Evolving programs for an arbitrary language, European Conference on Genetic Programming, Springer, 1998, pp. 83–96.
[RT18] Jérémy Rapin and Olivier Teytaud, Nevergrad - A gradient-free optimization platform, https://GitHub.com/FacebookResearch/Nevergrad, 2018.
[SB15] Kate Smith-Miles and Simon Bowly, Generating new test instances by evolving in instance space, Comput. Oper. Res. (2015), 102–113.
[SL19] Eric O. Scott and Sean Luke, ECJ at 20: Toward a general metaheuristics toolkit, Proc. of Genetic and Evolutionary Computation Conference (GECCO'19, Companion Material), ACM, 2019, pp. 1391–1398.
[WCLW20] Thomas Weise, Yan Chen, Xinlu Li, and Zhize Wu, Selecting a diverse set of benchmark instances from a tunable model problem for black-box discrete optimization algorithms, Appl. Soft Comput. (2020), 106269.
[WKB+14] Stefan Wagner, Gabriel Kronberger, Andreas Beham, Michael Kommenda, Andreas Scheibenpflug, Erik Pitzer, Stefan Vonolfen, Monika Kofler, Stephan Winkler, Viktoria Dorfer, and Michael Affenzeller, Architecture and design of the HeuristicLab optimization environment, Topics in Intelligent Engineering and Informatics, vol. 6, Springer, 2014, pp. 197–261.
[WVY+20] Hao Wang, Diederick Vermetten, Furong Ye, Carola Doerr, and Thomas Bäck, IOHanalyzer: Performance analysis for iterative optimization heuristics, CoRR abs/2007.03953 (2020).
[WW18] Thomas Weise and Zijun Wu, Difficult features of combinatorial optimization problems and the tunable W-Model benchmark problem for simulating them, Proc. of Genetic and Evolutionary Computation Conference (GECCO'18, Companion Material), ACM, 2018, pp. 1769–1776.
[XHHL12] Lin Xu, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown, Evaluating component solver contributions to portfolio-based algorithm selectors, Proc. of Theory and Applications of Satisfiability Testing (SAT'12), LNCS, vol. 7317, Springer, 2012, pp. 228–241.
[YDB19] Furong Ye, Carola Doerr, and Thomas Bäck, Interpolating local and global search by controlling the variance of standard bit mutation, Proc. of IEEE Congress on Evolutionary Computation (CEC'19), IEEE, 2019, pp. 2292–2299.
[YWDB20] Furong Ye, Hao Wang, Carola Doerr, and Thomas Bäck, Benchmarking a (µ + λ) genetic algorithm with configurable crossover probability, Proc. of Parallel Problem Solving from Nature (PPSN'20), LNCS, vol. 12270, Springer, 2020, pp. 699–713.
[ZR20] Martin Zaefferer and Frederik Rehbach, Continuous optimization benchmarks by simulation, Proc. of Parallel Problem Solving from Nature (PPSN'20), LNCS, vol. 12269, Springer, 2020, pp. 273–286.

Comparison of frameworks
Table 4: Comparison of software frameworks for evolutionary computation. The rank is based on an ad-hoc aggregation of subjective metrics covering performance of the programming language, activity of the project (number of contributors), breadth of features (number of modules, number of lines of code), and ease of integration in industrial projects (license). Data was gathered in 2019. Fields lost in extraction are marked "?".

N   Name          Language  Type       Updated  License         Contrib.  kloc
1   ParadisEO     C++       Framework  2019     LGPLv2          33        82
2   jMetal        Java      Framework  2019     MIT             29        60
3   ECF           C++       Framework  2017     MIT             19        15
4   OpenBeagle    C++       Framework  2017     LGPLv3          4         48
5   Jenetics      Java      Framework  2019     Apachev2        10        47
6   ECJ           Java      Framework  2018     AFLv3           33        54
7   DEAP          Python    Framework  2019     LGPLv3          45        9
8   GP.NET        C#        ?          ?        ?               ?         ?
9   DGPF          Java      Framework  2007     LGPLv2          6         ?
10  JGAP          Java      Library    2015     LGPLv2          1         ?
11  Watchmaker    Java      Framework  2013     Apachev2        2         ?
12  GenPro        Java      Framework  2009     Apachev2        1         ?
13  GAlib         C++       Library    1998     MIT             1         ?
14  PyBrain       Python    Module     2017     MIT             33        ?
15  JCLEC         Java      Framework  2014     ?               1         ?
16  HeuristicLab  C#        ?          ?        ?               ?         ?
17  GPE           C#        ?          ?        ?               ?         ?
18  JGAlib        Java      Library    2004     ?               1         ?
19  CIlib         Scala     Framework  2019     Apachev2        17        ?
20  pycma         Python    Solver     2019     BSD             4         ?
21  PyEvolve      Python    Framework  2015     PSF             12        ?
22  GPLAB         Matlab    Library    2018     LGPLv2          8         ?
23  Clojush       Clojure   Framework  2019     EPLv1           17        ?
24  pySTEP        Python    Framework  2013     MIT             1         ?
25  µGP3          C++       Framework  2016     GPLv2           2         ?
26  Pyvolution    Python    Framework  2012     Apachev2        1         ?
27  PISA          C++       Library    2008     *               4         ?
28  EvoJ          Java      Framework  2015     CC-BY-NC-SA-3   1         ?
29  Galapagos     Java      Framework  2013     GPLv2           1         ?
30  branecloud    C#        ?          ?        ?               ?         ?
31  JAGA          Java      Framework  2008     GPLv2           1         ?
32  PMDGP         C++       Framework  2002     GPLv2           1         ?
33  GPC++         C++       Framework  1997     GPLv2           2         ?
34  PonyGE        Python    Framework  2014     ?               3         ?
35  Platypus      Python    Framework  2019     GPLv3           9         ?
36  DCTG-GP       Prolog    Library    2001     ?               1         ?
37  Desdeo        Python    Framework  2019     MPLv2           6         ?
38  PonyGE2       Python    Framework  2018     GPLv3           9         ?
39  EvoGrad       Python    Framework  2019     Proprietary     1         ?

Average AUC values
Table 5: Average AUC values of the elites returned by irace in each of the 15 independent runs (columns 1-15) and of the four baseline algorithms (1ptGA, EA, fEA, xGA) on each of the 19 W-model instances. [Table body omitted.]

Figure 4: Distances between the AUCs of the elite algorithms and the baseline algorithms.
Diagram of Paradiseo classes

Figure 5: Summary UML diagram of the FastGA family of algorithms, as modeled with Paradiseo classes. Aggregation arrows show the cardinality of instances (arrow tail side) and slots (arrow head side) involved in the final combination. No cardinality is indicated when it equals one.
Convergence plots for all problems

The following 19 figures show the convergence plots of the baseline algorithms against the best elite selected by irace. Algorithms are denoted in the legend by the set of indices for each slot, using the following code (see Table 2 for the corresponding algorithms):
P: population size (always 5 in this study),
C: crossover probability,
s: crossover selector,
c: crossover,
a: selector after crossover (always 0 in this study),
M: mutation probability,
u: mutation selector,
m: mutation,
r: replacement,
O: stopping criterion (always 0 in this study).
[Plot omitted. Legend: FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0, FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0, FastGA_P=5_C=1_s=2_c=1_a=0_M=2_u=2_m=8_r=8_O=0, FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0, FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0. Axes: function evaluations (x) vs. best-so-far f(x)-value (y).]
Figure 6: Convergence plot of baseline algorithms and elite, for problem 1.
[Plot omitted. Legend: FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0, FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0, FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0, FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0, FastGA_P=5_C=4_s=5_c=2_a=0_M=3_u=4_m=9_r=9_O=0. Axes: function evaluations (x) vs. best-so-far f(x)-value (y).]
Figure 7: Convergence plot of baseline algorithms and elite, for problem 2.
[Plot omitted. Legend: FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0, FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0, FastGA_P=5_C=1_s=3_c=8_a=0_M=2_u=6_m=3_r=2_O=0, FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0, FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0. Axes: function evaluations (x) vs. best-so-far f(x)-value (y).]
Figure 8: Convergence plot of baseline algorithms and elite, for problem 3.
[Plot omitted. Legend: FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0, FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0, FastGA_P=5_C=2_s=1_c=1_a=0_M=2_u=6_m=9_r=0_O=0, FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0, FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0. Axes: function evaluations (x) vs. best-so-far f(x)-value (y).]
Figure 9: Convergence plot of baseline algorithms and elite, for problem 4.
Figure 10: Convergence plot of baseline algorithms and elite, for problem 5. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=4_s=2_c=2_a=0_M=4_u=3_m=7_r=0_O=0
Figure 11: Convergence plot of baseline algorithms and elite, for problem 6. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=0_s=3_c=2_a=0_M=4_u=2_m=6_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
100 200 300 400 500 6001618202224262830
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0 FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0 FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0FastGA_P=5_C=3_s=0_c=3_a=0_M=4_u=3_m=10_r=0_O=0
Function Evaluations B ee
Function Evaluations B ee s t - s o - f a r f ( x ) - v a l u ee
Function Evaluations B ee s t - s o - f a r f ( x ) - v a l u ee Figure 12: Convergence plot of baseline algorithms and elite, for problem 7.27
Figure 13: Convergence plot of baseline algorithms and elite, for problem 8. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=0_s=1_c=0_a=0_M=3_u=2_m=10_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
Figure 14: Convergence plot of baseline algorithms and elite, for problem 9. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=4_u=5_m=10_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
Figure 15: Convergence plot of baseline algorithms and elite, for problem 10. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=4_s=2_c=2_a=0_M=4_u=5_m=9_r=0_O=0
Figure 16: Convergence plot of baseline algorithms and elite, for problem 11. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=3_s=2_c=10_a=0_M=4_u=2_m=10_r=0_O=0
Figure 17: Convergence plot of baseline algorithms and elite, for problem 12. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=4_u=3_m=9_r=0_O=0
Figure 18: Convergence plot of baseline algorithms and elite, for problem 13. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=3_s=6_c=2_a=0_M=4_u=1_m=10_r=0_O=0
Figure 19: Convergence plot of baseline algorithms and elite, for problem 14. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=1_s=5_c=9_a=0_M=4_u=2_m=8_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
Figure 20: Convergence plot of baseline algorithms and elite, for problem 15. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=5_c=2_a=0_M=4_u=6_m=8_r=0_O=0
Figure 21: Convergence plot of baseline algorithms and elite, for problem 16. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=10_a=0_M=4_u=6_m=10_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
Figure 22: Convergence plot of baseline algorithms and elite, for problem 17. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=3_s=2_c=2_a=0_M=4_u=5_m=10_r=0_O=0
Figure 23: Convergence plot of baseline algorithms and elite, for problem 18. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=4_s=2_c=2_a=0_M=4_u=1_m=8_r=0_O=0
Figure 24: Convergence plot of baseline algorithms and elite, for problem 19. Legend:
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=1_r=0_O=0
FastGA_P=5_C=0_s=0_c=0_a=0_M=0_u=0_m=5_r=0_O=0
FastGA_P=5_C=2_s=2_c=2_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=2_s=2_c=5_a=0_M=2_u=2_m=1_r=0_O=0
FastGA_P=5_C=4_s=2_c=2_a=0_M=4_u=6_m=9_r=0_O=0