Evolutionary Multitask Optimization: a Methodological Overview, Challenges and Future Research Directions
Eneko Osaba (a,∗), Aritz D. Martinez (a), and Javier Del Ser (a,b)
(a) TECNALIA, Basque Research and Technology Alliance (BRTA), P. Tecnologico, Ed. 700, 48160 Derio, Spain
(b) University of the Basque Country (UPV/EHU), 48013 Bilbao, Spain
Abstract
In this work we consider multitasking in the context of solving multiple optimization problems simultaneously by conducting a single search process. The principal goal when dealing with this scenario is to dynamically exploit the existing complementarities among the problems (tasks) being optimized, helping each other through the exchange of valuable knowledge. Additionally, the emerging paradigm of Evolutionary Multitasking tackles multitask optimization scenarios by using as inspiration concepts drawn from Evolutionary Computation. The main purpose of this survey is to collect, organize and critically examine the abundant literature published so far in Evolutionary Multitasking, with an emphasis on the methodological patterns followed when designing new algorithmic proposals in this area (namely, multifactorial optimization and multipopulation-based multitasking). We complement our critical analysis with an identification of challenges that remain open to date, along with promising research directions that can stimulate future efforts in this topic. The discussions held throughout this manuscript are offered to the audience as a reference of the general trajectory followed by the community working in this field in recent times, as well as a self-contained entry point for newcomers and researchers interested in joining this exciting research avenue.
Keywords:
Transfer Optimization, Multitasking Optimization, Evolutionary Multitasking, Multifactorial Evolutionary Algorithm, Multi-population Multitasking.
1. Introduction
Traditionally, optimization problems have been solved by different methods, part of which do not assume any a priori knowledge about the task under consideration. Over the years, this approach has proven to be highly efficient in almost all real-world situations. Today, the scientific community has realized that this traditional way of solving problems has some limitations. Indeed, the growing complexity of optimization problems and the fact that real-world optimization problems hardly appear in isolation have uncovered the need for exploiting knowledge gathered beforehand related to the problems themselves. This is the main reason why the incipient research area known as Transfer Optimization (TO, [1]) has gained momentum within the Artificial Intelligence research community [2]. The fundamental aim of TO is to exploit the knowledge learned from the optimization of one problem (task) when addressing other related (or unrelated) problems, thus aligning much with the previously noted needs.

∗ Corresponding author. TECNALIA, Basque Research & Technology Alliance (BRTA), P. Tecnologico, Ed. 700. 48170 Derio (Bizkaia), Spain.
Email address: [email protected] (Eneko Osaba)
Preprint submitted to Applied Soft Computing, February 5, 2021

Up to now, three different conceptualizations of TO have been formulated in the literature. The first one, coined as sequential transfer [3], aims at solving problems that occur sequentially. To this end, the knowledge obtained when tackling preceding tasks is employed as external information when dealing with new problems/instances. The second of these categories, referred to as multitasking [4], is devoted to the simultaneous solving of different tasks by dynamically exploiting synergies existing among them. Finally, multiform optimization relates to the discovery of a solution for a single task found by using diverse alternative formulations.

Specifically, this work focuses on multitasking tackled through the perspective of Evolutionary Multitasking (EM, [5]), also referred to as Evolutionary Multitask Optimization. In short, EM seeks the development of efficient multitasking methods by relying on search procedures and operators drawn from Evolutionary Computation [6, 7] and Swarm Intelligence [8]. A significant effort has been made by the community to solve a wide variety of continuous, combinatorial, single-objective and multi-objective optimization problems through the perspective of EM [9, 10, 11, 12]. Another research direction for dealing with multitasking in the context of TO is multitask Bayesian optimization [13], which extends Bayesian optimization approaches to multitasking environments [14, 15, 16, 17]. Although it falls outside the focus of this paper due to its non-evolutionary nature, we note that Bayesian solvers, along with those within EM, constitute the core of the contributions reported in the field of multitasking, with a significantly higher presence of EM methods.

A closer inspection of the most reputed scientific databases reveals that efforts carried out in EM are growing exponentially in recent times.
This upsurge of activity demands a reference material to summarize achievements so far, detect and analyze research trends, perform a profound reflection on them to identify current limitations, and prescribe future research directions that push forward valuable advances in the field. This is the rationale for this survey, which motivates its ultimate goal: to offer a unified, self-contained and end-to-end outline of the work done in EM. Specifically, the contribution of this work can be synthesized as follows:

• We perform a systematic review of the literature on Evolutionary Multitask Optimization published to date. For this purpose, we design a three-fold classification criterion to organize the corpus of reviewed contributions around a comprehensive taxonomy. To begin with, we pause at theoretical studies, gravitating on several application-agnostic aspects of EM. In a second step of our analysis, we classify the literature considering the knowledge sharing pattern adopted in the outlined works, namely, implicit versus explicit knowledge transfer, as well as the capability of the underlying algorithm to actively adapt the amount of exchanged knowledge along the search (static versus adaptive).
Finally, we distinguish between the two algorithmic design templates used to realize the multitasking search: Multifactorial Optimization (MFO, [18]) and Multipopulation-based Multitasking (MM).
• We next provide a methodological overview of the field, highlighting the main methodologies followed by researchers and practitioners in the different phases of EM algorithmic development, as well as the current limitations and points of improvement inferred therefrom.
• We conclude our overview with a prospect of opportunities and challenges that should guide the scientific efforts invested by the related research community in the next years.

Even though a clear interest in EM has arisen lately in the related community, to the best of our knowledge only one recent work has conducted a study similar to the one carried out in this manuscript. Specifically, [19] exposes the work already done around the generic field of Evolutionary Transfer Optimization (ETO), providing an overview of existing studies gravitating on different topics related to ETO, namely, ETO for optimization in uncertain environments, ETO for multitask optimization, ETO for complex optimization, ETO for multi/many-objective optimization, and ETO for machine learning applications.
The overview is supplemented by a set of challenges in the generic ETO research field.

Having said this, this paper takes a major step beyond [19] by elaborating on different directions that make our work differential on its own: a) a study fully focused on the stream known as Evolutionary Multitasking (ETO for Multitask Optimization in [19]), stressing the algorithmic perspective; b) a manifold taxonomy based on three different pivotal axes: knowledge sharing pattern adopted (implicit or explicit), dynamic nature of the solving schemes (static or adaptive) and the design template of the search algorithm (MFO and MM); c) a critical analysis of the methodological trends followed by researchers when designing and implementing EM-based methods; and d) an insightful discussion around challenges and opportunities fully focused on EM, in which we deal with topics ranging from possible applications to algorithmic enhancements and benchmarking issues.

The remainder of this manuscript is structured as follows: Section 2 briefly poses the essential concepts of EM and introduces the reader to the main approaches used so far to face this paradigm. Next, Section 3 delves into the survey itself, departing from the presentation of the taxonomy criteria, to arrive at a careful examination of the recent bibliography related to EM. Equally important is the methodological overview done in Section 4, bringing to the fore the main methodologies followed in the different phases of EM algorithmic development. Section 5 gravitates on the current limitations and discusses several challenges stemming therefrom. Finally, Section 6 concludes our survey with a summary of the main conclusions and an outline towards the future of this exciting field.
2. Evolutionary Multitasking: Definition and Essential Concepts
As introduced before, multitasking is devoted to the simultaneous solving of different optimization problems or tasks. It is important to emphasize at this point that the main goal of this paradigm is to find a promising solution to each of the problems at hand. This specific TO category is characterized by an omnidirectional knowledge sharing among tasks, potentially reaching a synergistic push between the problems being tackled [1]. In this way, multitask optimization is rooted in the premise that these complementarities among tasks lead to a competitive advantage over the case where the same problems are solved in isolation, either in terms of the optimality of the discovered solutions, or in terms of convergence and consumption of computational resources.

Mathematically, a multitask optimization scenario consists of $K$ optimization tasks $\{T_k\}_{k=1}^{K}$, which are to be simultaneously solved. In this way, this environment can be characterized by the existence of as many search spaces $\Omega_k$ as tasks. Furthermore, each task $T_k$ has its own fitness function (objective) $f_k : \Omega_k \to \mathbb{R}$, where $\Omega_k$ is the search space over which $f_k(\cdot)$ is defined. Assuming that all problems should be maximized, the main objective of multitask optimization is to discover a set of solutions $\{\mathbf{x}_1^*, \ldots, \mathbf{x}_K^*\}$ such that $\mathbf{x}_k^* = \arg\max_{\mathbf{x} \in \Omega_k} f_k(\mathbf{x})$.

We now shift our focus to EM, in which two main characteristics have stimulated researchers to deal with multitask optimization scenarios by means of evolutionary search operators. On the one hand, the intrinsic parallelism brought by a population of individuals that evolve together is well suited to deal with concurrent problems. In fact, several papers have already highlighted the benefits of this structure for dynamically unveiling synergistic relationships between tasks [2, 20]. On the other hand, the continuous exchange of genetic material along the evolutionary search allows all tasks to benefit from each other [21].
Considering the formulation introduced above, there are several ways of dealing with multitasking environments through the prism of EM, two of which are the most used approaches in the state of the art (schematically depicted in Figure 1):

• The execution of a single search process over a unique population $\mathcal{P} = \{\mathbf{x}^p\}_{p=1}^{P}$ that contains the solutions to all problems, and that fosters the exchange of information among them through the application of crossover operators (as in e.g. Multifactorial Optimization, MFO). In this case, an aspect of paramount importance is that each solution $\mathbf{x}^p$ in the population should be evolved over a unified search space $\Omega_U$. Thus, each independent search space $\Omega_k$ belonging to task $T_k$ can be translated to $\Omega_U$ by means of an encoding/decoding function $\xi_k : \Omega_k \mapsto \Omega_U$. For this reason, each individual $\mathbf{x}^p \in \Omega_U$ in $\mathcal{P}$ should be decoded to yield a task-specific solution $\mathbf{x}_k^p$ for each of the $K$ tasks. In this context, the appropriate encoding strategy used for the individuals and the capability of the designed unified search space to represent all solutions in every $\Omega_k$ are crucial for an effective knowledge transfer between tasks. Specifically, the formulation of $\Omega_U$ should be consistent with the level of overlap among the problems being solved.

• The deployment of several search processes that run in parallel, one for every task under consideration, which exchange information periodically as per a defined knowledge sharing policy (as in e.g. Multipopulation-based Multitasking, MM). In this case, each search process operates on a task-specific population $\mathcal{P}_k = \{\mathbf{x}_k^p\}_{p=1}^{P_k}$, whose size $P_k$ and search operators can be particular for task $T_k$ and hence differ from those used for other concurrent tasks. In accordance with previous notation, $\mathbf{x}_k^p \in \Omega_k$ $\forall p \in \{1, \ldots, P_k\}$. In this case, the exchange of information is usually made in terms of solutions eventually exchanged between populations belonging to different tasks, so that a mapping function $\Gamma_{k,k'} : \Omega_k \mapsto \Omega_{k'}$ is needed to translate an individual $\mathbf{x}_k^p$ to the search space of task $T_{k'}$. This mapping function can be defined and particularized for every task pair or, instead, can rely on an intermediate unified search space, such that $\Gamma_{k,k'}(\mathbf{x}_k^p) = \xi_{k'}^{-1}(\xi_k(\mathbf{x}_k^p))$, with $\xi_k(\mathbf{x}_k^p) \in \Omega_U$.

[Figure 1: Schematic diagram showing the different ways Transfer Optimization can be realized, along with the two main families of algorithms by which multitask optimization can be approached using concepts from Evolutionary Computation. Panels: Sequential Optimization, Multitask Optimization and Multiform Optimization; algorithm families: Multifactorial Optimization, Multipopulation-based Multitasking and Others.]
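The encoding/decoding function $\xi_k$ and the mapping $\Gamma_{k,k'}$ described above can be illustrated with a minimal sketch. It assumes a unified space $\Omega_U = [0,1]^D$ with random-key decoding, which is only one possible design; all function names, the padding value and the example bounds are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]  # per-variable (lower, upper) limits of a task


def encode_continuous(x: Sequence[float], bounds: Bounds, dim_u: int) -> List[float]:
    """Hypothetical xi_k: rescale a task solution to [0, 1] keys and pad up to dim_u."""
    keys = [(v - lo) / (hi - lo) for v, (lo, hi) in zip(x, bounds)]
    return keys + [0.5] * (dim_u - len(keys))  # padding value 0.5 is arbitrary


def decode_continuous(u: Sequence[float], bounds: Bounds) -> List[float]:
    """Hypothetical inverse of xi_k: keep the first len(bounds) keys and rescale them."""
    return [lo + key * (hi - lo) for key, (lo, hi) in zip(u, bounds)]


def decode_permutation(u: Sequence[float], n: int) -> List[int]:
    """Random-key decoding: sort positions 0..n-1 by their key to get a permutation."""
    return sorted(range(n), key=lambda i: u[i])


def map_between_tasks(x, bounds_src, bounds_dst, dim_u) -> List[float]:
    """Gamma_{k,k'}(x) = xi_{k'}^{-1}(xi_k(x)), routed through the unified space."""
    return decode_continuous(encode_continuous(x, bounds_src, dim_u), bounds_dst)


u = [0.9, 0.1, 0.5, 0.3]  # one individual in the unified space (D = 4)
print(decode_continuous(u, [(-5.0, 5.0), (0.0, 10.0)]))  # 2-variable continuous task
print(decode_permutation(u, 4))  # 4-element permutation task -> [1, 3, 2, 0]
```

The same individual is thus read as a two-variable real vector by one task and as a permutation by another, which is what allows a single population to serve several tasks. How well such a shared representation transfers knowledge depends on the overlap between tasks, as discussed in the text.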
Based on the work published by Ong and Gupta in [2], we can measure the overlap of two problems based on the number of variables in the task-specific solution space which have the same phenotypical meaning. Thus, three different superposition levels can be identified depending on the amount of overlap in the phenotype space of the optimization tasks: 1) complete overlap, when the tasks to solve differ only in their task-specific auxiliary variables; 2) partial overlap, when problems share some characteristics, or tasks in which the distribution of variables is similar; and 3) no overlap, when the problems to be tackled do not share any aspect of their structure. In any case, despite the relevance of the level of superposition when designing EM approaches, it is important to be aware that in many real applications it is not possible to measure the level of complementarity among the tasks being solved without actually solving them [1].

Having introduced these concepts, it is appropriate to highlight that there is a common point of agreement in the related community, which states that EM was only materialized by means of MFO until late 2017 [22]. From that moment on, this incipient branch of TO has gathered a growing number of contributions centered on the proposal of new EM solvers. Nowadays, it is widely agreed that the two most recurring approaches for dealing with EM environments are MM and MFO, which conform to the two main design trends described above.

On the one hand, we can define MM approaches in a generalist way as techniques organized around different populations, in which each deme is devoted to the resolution of one specific task. MM can be heterogeneous, giving rise to different solving strategies relying on evolutionary and/or swarm intelligence heuristics or knowledge sharing protocols.
Among these strategies, the one known as coevolutionary optimization (CoEV, [23]) is arguably the most frequently used today, in which knowledge sharing among populations (in terms of e.g., member migration or intra-deme crossovers) helps the evolution of each task. Examples of MM techniques are the multitasking multi-swarm optimization proposed in [24], the coevolutionary multitasking scheme introduced in [25] or the coevolutionary variable neighborhood search presented in [26].

On the other hand, the design of MFO techniques hinges on the definition of four different albeit interrelated concepts for each solution $\mathbf{x}^p \in \Omega_U$ of the single population $\mathcal{P}$ over which the search is performed:

• Concept 1 (Factorial Cost): the factorial cost $\Psi_k^p \in \mathbb{R}$ of an individual $\mathbf{x}^p \in \mathcal{P}$ is equal to its fitness value $f_k(\mathbf{x}_k^p)$ for a given task $T_k$, which can be computed after decoding $\mathbf{x}^p$ to $\mathbf{x}_k^p$ via $\xi_k(\cdot)$. Each member of the population has a list $\{\Psi_1^p, \Psi_2^p, \ldots, \Psi_K^p\}$ of factorial costs, each one associated with an optimization task $T_k$.
• Concept 2 (Factorial Rank): the factorial rank $r_k^p \in \mathbb{N}$ of an individual $\mathbf{x}^p$ for task $T_k$ is the position of this member within the whole population sorted in ascending order of $\Psi_k^p$. Every individual also has a factorial rank list $\{r_1^p, r_2^p, \ldots, r_K^p\}$.
• Concept 3 (Scalar Fitness): the scalar fitness $\varphi^p$ of $\mathbf{x}^p$ is computed based on the best factorial rank among the optimization tasks, i.e., $\varphi^p = 1 / \min_{k \in \{1, \ldots, K\}} r_k^p$. This value is used for comparing individuals in an MFO algorithm.
• Concept 4 (Skill Factor): denoted as $\tau^p$, the skill factor is the index of the task in which member $\mathbf{x}^p$ performs best, that is, $\tau^p = \arg\min_{k \in \{1, \ldots, K\}} r_k^p$.

The above four concepts are the cornerstone on which all MFO techniques rely.
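As a worked illustration of the four definitions above, the following minimal sketch computes factorial ranks, scalar fitness and skill factors from a small table of factorial costs. It rests on assumptions made here for illustration only: lower factorial cost is treated as better (consistent with sorting in ascending order; under a maximization convention one could negate $f_k$ first), ties are broken by the lowest index, task indices are 0-based, and the function names and cost values are hypothetical.

```python
from typing import List


def factorial_ranks(costs: List[List[float]]) -> List[List[int]]:
    """costs[p][k] is the factorial cost of individual p on task k (lower is better here).
    Returns ranks[p][k]: 1-based position of p in the population sorted ascending on task k."""
    n_pop, n_tasks = len(costs), len(costs[0])
    ranks = [[0] * n_tasks for _ in range(n_pop)]
    for k in range(n_tasks):
        order = sorted(range(n_pop), key=lambda p: costs[p][k])
        for position, p in enumerate(order, start=1):
            ranks[p][k] = position
    return ranks


def scalar_fitness(ranks: List[List[int]]) -> List[float]:
    """phi^p = 1 / min_k r_k^p."""
    return [1.0 / min(r) for r in ranks]


def skill_factor(ranks: List[List[int]]) -> List[int]:
    """tau^p = argmin_k r_k^p (0-based task index, first minimum wins on ties)."""
    return [min(range(len(r)), key=lambda k: r[k]) for r in ranks]


# Three individuals, two tasks (made-up costs).
costs = [[3.0, 10.0],
         [1.0, 30.0],
         [2.0, 20.0]]
ranks = factorial_ranks(costs)
print(ranks)                  # [[3, 1], [1, 3], [2, 2]]
print(scalar_fitness(ranks))  # [1.0, 1.0, 0.5]
print(skill_factor(ranks))    # [1, 0, 0]
```

In this toy population the first two individuals are each the best on one task and therefore share the top scalar fitness, whereas the third, second-best on both tasks, obtains 0.5. This shows how scalar fitness makes individuals specialized in different tasks comparable within a single population.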
In fact, these definitions are used for different purposes, such as 1) deciding how population individuals interact with each other; 2) determining which solutions survive in the population between successive generations; 3) assigning tasks to individuals; or 4) classifying and sorting the whole population. With all this, these four concepts (either in their seminal form or in modified formulations, such as those proposed in [27, 28]) have led to several efficient MFO techniques for solving multitasking scenarios.

Furthermore, it is interesting to highlight here that two different knowledge sharing strategies can be found in EM methods, which can be distinguished as per the level of explicitness of the exchanged knowledge with respect to the evolved solutions. As such, implicit transfer refers to those cases where knowledge sharing is materialized through search operators, such as crossover functions. An example of implicit genetic transfer is the assortative mating used in most MFO techniques. By contrast, explicit knowledge transfer is conducted by migrating complete solutions from one task to another, which is often adopted in multipopulation schemes. It should also be noted that explicit transfer could also be materialized through the use of mapping functions for transforming solutions before transferring, or by making use of Estimation of Distribution Algorithms (EDA [29]). We will revolve around these alternative paths for knowledge transfer when discussing our prospects for the field in Section 5.

Notwithstanding the proven efficiency of EM solvers (including those related to MFO), it is appropriate to finish this section by underscoring that multitasking has been the focus of diverse debates questioning the efficiency of techniques proposed to date. Today, there is a clear consensus regarding the paramount relevance of the correlation among the tasks to solve.
The existence of these interrelationships is essential for positively capitalizing on the shared knowledge over the search. Many studies have analyzed from different perspectives the similarities and possible synergies among problems [30]. However, in many practical environments it is not possible to quantify the existing complementarity among tasks in a preemptive fashion, without any knowledge of the optimal solution to each problem under consideration. This fact creates a latent problem for multitasking solvers, as the sharing of genetic material among non-related tasks is known to potentially lead to performance downturns. This phenomenon is known by the community as negative transfer [31], and has motivated a significant research upsurge towards alternative EM methods capable of avoiding and/or counteracting its effects on the convergence of the multitasking search. Such alternative methods will be reviewed and discussed in Section 3.

For the sake of a solid understanding of the EM paradigm, the following two subsections clarify the main differences between multitask optimization, multi-objective optimization (Subsection 2.1) and multitask learning (Subsection 2.2).

2.1. Multi-objective Optimization versus Multitask Optimization

An insightful reader can immediately relate EM to the Multi-objective Optimization (MOO) paradigm which, when approached via evolutionary computation, spans the wide family of multi-objective evolutionary algorithms. Indeed, it is possible to discern a conceptual overlap between EM and MOO, since both aim at the optimization of a set of objective functions. However, as shown in Figures 2.a and 2.b, these paradigms are clearly distinct from each other. On the one hand, EM aims to leverage the inherent parallelism enabled by a population of individuals for exploiting the synergies among related or unrelated tasks defined in different domains, each with its own solution space $\Omega_k$ that potentially requires an encoding/decoding function for knowledge transfer.
Moreover, EM also pursues the discovery of the best solution for every task. By contrast, the goal of MOO is to find a set of solutions that balance in different ways several conflicting objectives, defined over a single domain (and hence, over a single search space). In other words, MOO assumes the existence of a Pareto trade-off between the objectives, for which the devised MOO algorithm produces an estimation in the form of a set of possible solutions. Therefore, there is no unique solution to each problem, but rather different solutions that meet every objective to a certain degree. In fact, EM setups where the tasks themselves are MOO problems can be found in the literature [12].

[Figure 2: Diagram showcasing the core differences between (a) multitask optimization; (b) multi-objective optimization; and (c) multitask learning.]

2.2. Multitask Learning versus Multitask Optimization

Multitask learning and multitask optimization work on similar scenarios, in which a set of solutions $\{\mathbf{x}_1^*, \ldots, \mathbf{x}_K^*\}$ is sought for a set of tasks $\{T_k\}_{k=1}^{K}$. However, they mainly differ in terms of the optimization target, and the way in which the knowledge transfer is carried out. In multitask learning, the goal is to yield a model $M_\theta$ (with $\theta$ representing the parameters of the model) such that it can tackle the goal imposed by different tasks (e.g. classification of images of diverse kinds). Here, the challenge is to determine a model structure and a value of its constituent parameters that best favor not only a good performance on every task under consideration, but also the exploitation of the synergies between modeling tasks. This dual functionality sought in multitask learning underlies the design of multi-headed neural networks with shared losses trained via backpropagation, which are arguably the most utilized approach in the field: sharing part of the neural architecture allows part of the knowledge to be common to all tasks, whereas the definition of a shared loss function ensures that the optimization of the parameters of the network is driven by the performance over all tasks.

This being said, there is a clear connection between multitask learning and multitask optimization, in the sense that multitask learning can be stated as a multitask optimization problem, provided that 1) solutions $\{\mathbf{x}_k^*\}_{k=1}^{K}$ elicited by multitask optimization represent the parameters of a model, and 2) solutions are constrained to part of their genotype being shared among tasks, so that they jointly embed a single model.
This last constraint can be lifted so as to produce a set of models that collaborate to solve several learning tasks more efficiently than in isolation. In this case, each optimization task would aim to seek the parameters of the model that best performs over the modeling problem defined for the task, and the implicit/explicit knowledge transfer mechanisms used in EM could be effectively employed to transfer the knowledge learned in a certain task to another. All in all, multitask optimization must be conceived as a possible way of approaching multitask learning, but not the only one.
3. Taxonomy and Literature Review on Evolutionary Multitasking
As mentioned in the introduction of this paper, the research activity around EM has been growing at a remarkable pace since the first formulation of this vibrant paradigm. In any case, all the work done specifically around EM has not yet been organized in a scientific paper, which gives rise to the need for a work of these characteristics. Indeed, this is precisely the main objective of this section, in which we systematically review the most important works published to date in the field of EM.

In order to appropriately guide this section, we first present in Figure 3 a taxonomy covering all the studies contemplated in this review section. For organizing this taxonomy, and subsequently this section, we have classified all the published material using a two-level approach. First, we have considered the knowledge transfer strategy employed by the proposed solving method (implicit or explicit). The second level regards the capacity of the method to proactively analyze the negative knowledge sharing among tasks and dynamically react to this issue, seeking to reduce its impact on the algorithmic search. On the one hand, if the solving approach does not include any analyzing mechanism, we have considered it static. On the other hand, we consider an algorithm as adaptive if it not only employs this kind of analyzing strategy but also adapts its structure to the unveiled synergies among tasks (by modifying the parameters of the algorithm, for example). Lastly, once this categorization is conducted, we have further classified the published contributions taking as reference the algorithmic approaches used and proposed by researchers and practitioners. In this regard, we have considered MFO based schemes, MM based approaches and other methods.

With all this, this taxonomy sorts the literature according to these algorithmic schemes, being also valuable for distinguishing at a glance those areas on which the community has so far placed most of its attention. This literature overview, together with the methodological review conducted in Section 4, sets a stepping stone towards the critical discussion that will be held in Section 5 around the main limitations, opportunities and challenges that this area brings.

3.1. Theoretical Studies on Multitask Optimization
As mentioned, this whole section revolves around the systematic overview of all the work done up to now on Evolutionary Multitask Optimization. This overview has been conducted through the perspective of both algorithm proposals and their knowledge sharing patterns. Nevertheless, it would be a big mistake to leave aside the large number of paramount articles which have contributed in a crucial way to the establishment, advancement and understanding of this field. Specifically, we refer to the theoretical papers, which address the knowledge area from a less applied point of view, in order to adequately understand the ins and outs of the field. These works are essential for establishing in the community the main pillars that allow the research stream to advance in an orchestrated and efficient way.
[Figure 3: Taxonomy of the literature related to Evolutionary Multitask Optimization reviewed in this survey. A two-level classification has been made depending on the knowledge transfer scheme used (implicit or explicit) and the adaptability of the proposed methods (static or adaptive). Furthermore, solvers have been categorized into MFO and MM based ones, with an additional 'other' classification.]
Probably the most valuable paper in this context is the one published by Ong et al. in [32], which is devoted to the introduction and presentation of the EM field. This work is a cornerstone in the research community, establishing the basic concepts that have guided all the work conducted in recent years. Apart from this influential and pioneering contribution, several remarkable theoretical works have been published on EM delving into different aspects such as the influence of complementarities between function landscapes on the search performance [33, 30], or just highlighting the main ingredients that make this knowledge stream interesting for the research community [4]. Further works on EM from a theoretical point of view can be found in [34, 2].

It is interesting to mention again at this point the study published by Gupta et al. in [18]. That paper is not only significant for introducing to the community the most important method to date, the Multifactorial Evolutionary Algorithm (MFEA), but also for establishing the principal building blocks that make up MFO. As will be shown in Section 3.2 and Section 3.3, both MFO and MFEA have been the source of inspiration for an abundant number of valuable works. As part of the work carried out, several published papers have also delved into theoretical aspects of these paradigms. In the recent [35], for example, an analysis of the efficiency of MFEA is carried out. The main objectives of that study are twofold: to theoretically unveil why MFEA based methods perform better than classical techniques, and to provide some findings on the parameter setup of the MFEA algorithm. In [36], the impact of three different MFEA parameters is analyzed: probability of individual learning, probability of intra-crossover and probability of inter-crossover. In [37], a rigorous analysis is carried out on the relationship between MFEA and the conceptually similar multipopulation evolution models. To do so, the authors make an in-depth comparison of their performance and working procedures.
A similar study is also proposed in [38], revolving around the relationship between MFEA and island-based models. Also interesting is the brief study proposed in [39], focused on presenting insights into the measurement of task relationships in MFEA.

Also significant is the position paper published by Gupta and Ong in [40]. The main objective of that research is to return to the roots of Evolutionary Computation. From there, the authors provide an interesting review of the field to properly understand the inspirations of what can be classified under the umbrella of the multi-X evolutionary computation concept. Thus, multitasking is again analyzed in this paper from its theoretical perspective.

Finally, it is interesting to mention in this category the studies described in both [41] and [22]. These reports contribute to the EM field by introducing some valuable test problems for both single-objective MFO and multi-objective MFO. The main intention of the authors of those works is to present to the community some heterogeneous benchmarks and baseline results, in order to use them in subsequent studies.
As mentioned in Section 2, implicit transfer is often materialized through the application of dedicated search operators such as crossover functions. The principal standard-bearer for this type of transfer is known as assortative mating, which is used in most MFO techniques. The first paper revolving around the assortative mating procedure is the same work that introduces the Multifactorial Optimization paradigm [18]. This paper instantly became a reference, not only because of the introduction of the MFO concept, but also for the formulation of the most used and influential EM technique: the Multifactorial Evolutionary Algorithm (MFEA). From that moment on, many diverse adaptations and applications of the canonical MFEA have been proposed in the literature. In [42], for example, the first discrete adaptation of MFEA is proposed, using as benchmarks four well-known permutation-based combinatorial optimization problems: the Traveling Salesman Problem, the Quadratic Assignment Problem, the Job-Shop Scheduling Problem and the Linear Ordering Problem. After that pioneering study, multiple additional discrete adaptations of the method have been proposed, such as the one focused on solving the Capacitated Vehicle Routing Problem in [43] or the series of works published principally by Thanh and Binh for tackling clustered shortest path tree problems [44, 45, 46, 47, 48].
Several adaptations of MFEA have also been proposed to efficiently deal with real-world optimization problems, such as the permutation-based MFEA proposed in [49] for cloud computing service composition. A closely related approach was presented in [9], devoted in that case to efficient semantic web service composition. In [50], the authors develop an MFEA embedded with a greedy-based allocation operator for solving the large-scale virtual machine placement problem in heterogeneous environments. 
An additional interesting application of MFEA has been recently proposed in [51], with the main goal of simultaneously evolving concurrent deep reinforcement learning models.
Due to the success of MFEA, multiple variants of this method have been recently proposed. These methods also rely on implicit transfer strategies for sharing genetic material, most of them using the aforementioned assortative mating. In [52], for example, the authors introduced the so-called Generalized MFEA. The main reason for the formulation of this technique is that MFEA experiences performance downturns when dealing with tasks with different dimensions, or with problems whose optima do not lie in the same region of the solution space. The Generalized MFEA tries to overcome these issues by implementing two different mechanisms related to decision variable translation and shuffling.
Another interesting variant of MFEA is developed in [53]. The improved MFEA proposed in that work explores the integration of a novel cross-task implicit transfer operator, which is based on a search direction instead of an individual. The main objective of this method is to accelerate the convergence of the search process, especially in environments where the optima of the tasks are far from each other. In another vein, the authors of [54] modeled an interesting hybrid MFEA which combines MFEA and the Linkage Tree Genetic Algorithm. In [55], a variant of MFEA coined as polygenic evolutionary algorithm is designed, which curtails the cultural aspects of the evolutionary procedure in the models of multifactorial inheritance. The main objective of that work is to understand the importance of both assortative mating and vertical cultural transmission towards effective evolutionary multitasking. 
Further interesting MFEA variants have been proposed in [56], by means of the so-called (4+2) MFEA, and in [57], with the MFEA with Priority-based Encoding.
Soon after the proposal of the single-objective MFEA, the same authors proposed the Multi-Objective MFEA (MO-MFEA, [12]), which marked an inflection point in the related research community. It should be noted here that this Multi-Objective MFEA also employs the assortative mating procedure for implicit knowledge sharing purposes. Furthermore, this specific method has already been used in a heterogeneous range of applications, such as the multi-objective pollution-routing problem [58] or electric power dispatch [59]. A further application of MO-MFEA was presented in [60] for solving operational indices optimization.
Furthermore, improved variants of the basic method have already been proposed, such as the one modeled in [61]. In that paper, the authors introduce a MO-MFEA with a two-stage assortative mating method. This procedure introduces a preliminary division of the decision variables into diversity-related variables and convergence-related variables. After this first step, both types of variables undergo assortative mating. Another adaptation was developed in [62], coined as decomposition-based MO-MFEA (MFEA/D-M2M). The main ingredient that characterizes this method is the adoption of an M2M approach for decomposing multi-objective optimization problems into multiple constrained sub-problems. The main goal of this procedure is to enhance the diversity of the population and the convergence of sub-regions. Also valuable is the study carried out in [63], focused on solving the well-known multi-objective vehicle routing problem with time windows using an improved MO-MFEA that integrates bone route and large neighborhood local search. Furthermore, the authors of [64] introduced a so-called Guided Differential Evolutionary (DE) MO-MFEA. 
This method has two main novel ingredients: a) an improved crossover operator using guided differential evolution, and b) a modified Powell mechanism for mutation operations. An additional improved version of the MO-MFEA can be found in [65], devoted to solving interval multiobjective optimization problems (cases in which the coefficients in the objectives and/or constraints are intervals). Finally, it is interesting to highlight the single and multi-objective optimization multifactorial evolutionary algorithm (S&M-MFEA) proposed in [66]. The main purpose of that solving scheme is to combine in a single multitasking environment the original single-objective MFEA formulation together with its associated multi-objective reformulation.
Despite the huge success and proven efficiency of MFEA, researchers have rapidly detected the main limitations inherent to the canonical scheme of this method. As reported in previously published papers [31], the main limitation of MFEA is its difficulty in facing potential incompatibilities between unrelated tasks. To deal with this issue, two principal research streams have been followed by the community up to now. The first one is the development of adaptive methods (as will be seen in Sections 3.3 and 3.5). The other approach is the design of alternative solving schemes. Within this last category, different techniques can be found in the literature that address EM through the lens of MFO but using a different scheme than MFEA. Another limitation of MFEA is that it resorts to non-structured populations, even though we assume that such a structure exists. 
Thus, more advanced and sophisticated structures could favor the design of alternative search operators, promoting a more controlled exchange of knowledge between related and unrelated tasks.
Therefore, to overcome these limitations, practitioners have taken a step forward, proposing novel mechanisms which have led to numerous methods based on the essential concepts of MFO. The first alternative MFEA scheme was proposed in [67], just a few months after the seminal work presenting the canonical MFEA. The main motivation of that work is to demonstrate that the practicality of population-based bi-level optimization can be enhanced by incorporating the EM paradigm within the search process. To do that, the authors embedded the principal MFO concepts into the scheme of the well-known Nested Bi-Level Evolutionary Algorithm, giving rise to the so-called N-BLEA. Some months later, Sagarna and Ong introduced in [68] an MFO method for search-based software test data generation. In an attempt to leverage the knowledge from different sources and enhance the search process, the authors of that work proposed an MFO algorithm which bases the complete search procedure on mutation operations. Thus, the authors show that the selection operator and the preference relation used to compare individuals enable inter-task knowledge transfer for an effective search.
Also proposed shortly after the introduction of MFEA, we can find in [69] an interesting work delving into the main concepts of MFO. The principal goal of that work is to explore the generality of the MFO paradigm, employing different population-based schemes. To do that, the authors proposed the first multifactorial formulations of the hugely popular particle swarm optimization (PSO, [70]) and differential evolution (DE, [71]). 
Regarding the knowledge sharing strategies used in these DE- and PSO-based methods, they also employ the widely used concept of assortative mating, adapted to the mechanisms of the metaheuristics at hand. Thus, they also rely on implicit transfer mechanisms. Indeed, that interesting work has served as a guiding light for subsequent studies, such as the one conducted in [72], in which the performance of different mutation strategies in the knowledge transfer of multifactorial DE is studied. Furthermore, this same method is used as the basis of the remarkable investigation carried out in [73], whose main goal is to identify the essential characteristics of task landscapes through the implementation of an inter-task evolutionary mechanism in a low-dimensional subspace.
Another example of this scientific trend is the Multifactorial Cellular Genetic Algorithm (MFCGA, [28, 74]), which hybridizes the main concepts of MFO with the structural design and behavior patterns of the well-known Cellular Genetic Algorithms. The main inspiration of that method is to have a more controlled implicit mating process among different tasks, favoring in this way the exploration and quantitative examination of synergies among the problems being solved. Also interesting is the approach introduced in [75], which proposes a multifactorial particle swarm optimization - firefly algorithm hybrid technique. The main feature of this method is that individuals of the population can behave as a particle or a firefly, depending on the search performance. In any case, although each member of the population can eventually move following either pattern, each individual maintains its nature throughout the complete execution. 
Further alternative MFO schemes can be found in [76], presenting a method for solving large-scale optimization problems called evolutionary multitasking assisted random embedding; in [77], which introduces an MFO method hybridizing genetic transform and hyper-rectangle search strategies; and in [78], which proposes a unified framework of evolutionary multitasking graph-based hyper-heuristics based on MFO concepts. Additional MFO-based techniques can be found in [79, 80, 81].
It is worth mentioning that alternative MFO solving schemes have also been proposed for tackling multi-objective optimization problems. In this regard, we can highlight the recent research conducted in [82] by Shen et al., introducing a novel multitasking multiobjective memetic algorithm for learning fuzzy cognitive maps, inspired by the principal concepts of MFO. Furthermore, in [83] a novel multitasking method is proposed, which is fully devoted to solving the sparse reconstruction problem. The method developed in that work was coined as multitasking sparse reconstruction (MTSR), and it also relies on MFO concepts such as skill factor, factorial rank and scalar fitness for the multi-objective solving of the problem. It is also interesting to mention the genetic transfer scheme developed for that MTSR method, which is an enhanced variant of the assortative mating procedure coined as
Within-Task and Between-Task Genetic Transfer.
Finally, it is convenient to end this section dedicated to implicit knowledge transfer-based solvers by highlighting a few recently proposed EM alternatives which do not embrace the MFO paradigm. Instead, they adopt the previously described MM schemes. A representative approach is presented in [24]. In that work, the authors develop a dynamic multi-swarm method for EM. In that algorithm, the complete population is divided into as many swarms as tasks to solve. Furthermore, each subpopulation is divided into different sub-swarms. Thus, within each task subpopulation, a dynamic multi-swarm method is conducted. The knowledge sharing is realized through probabilistic crossover procedures with particles from other task groups, giving way to the coevolutionary factor of the method. Moreover, a parallel DE is proposed in [84], which introduces knowledge transfer patterns based on the archives of each DE solver. The same evolutionary metaheuristic, namely DE, is considered in [85] for modeling a similarity-guided evolutionary multitask optimization. Also interesting is the MM technique developed in [86], focused on the multitasking adaptation of the well-known Fireworks Algorithm [87].
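To make the assortative mating procedure referred to throughout this section more concrete, the following minimal sketch shows the decision rule as it is commonly described for the canonical MFEA: two parents are recombined if they share the same skill factor or if a random draw falls below the random mating probability (RMP); otherwise each parent is mutated separately. The operators are deliberate simplifications assumed for illustration (uniform crossover and clipped Gaussian mutation over a unified space in [0, 1]), and the names `assortative_mating` and `mutate` are hypothetical.

```python
import random


def mutate(genes, rng, sigma=0.1):
    """Clipped Gaussian mutation (a stand-in for the operator of a real solver)."""
    return [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in genes]


def assortative_mating(pa, pb, skill_a, skill_b, rmp, rng=random):
    """Produce two (genes, skill_factor) offspring from two parents.

    pa, pb: lists of floats encoded in a unified search space [0, 1].
    rmp: random mating probability controlling cross-task recombination.
    """
    if skill_a == skill_b or rng.random() < rmp:
        # Recombination, possibly across tasks: this is where implicit
        # knowledge transfer takes place. Uniform crossover is an assumption.
        mask = [rng.random() < 0.5 for _ in pa]
        c1 = [a if m else b for a, b, m in zip(pa, pb, mask)]
        c2 = [b if m else a for a, b, m in zip(pa, pb, mask)]
        # Vertical cultural transmission: each child imitates the skill
        # factor of one of its parents, chosen at random.
        s1 = rng.choice((skill_a, skill_b))
        s2 = rng.choice((skill_a, skill_b))
    else:
        # No cross-task mating: each parent yields a mutated copy that
        # keeps the parent's skill factor.
        c1, s1 = mutate(pa, rng), skill_a
        c2, s2 = mutate(pb, rng), skill_b
    return (c1, s1), (c2, s2)


rng = random.Random(0)
child_1, child_2 = assortative_mating([0.1, 0.2], [0.8, 0.9], 0, 1, rmp=0.3, rng=rng)
```

Setting `rmp` close to 0 isolates the tasks, whereas values close to 1 allow nearly unrestricted cross-task recombination, which is the trade-off that the adaptive methods reviewed in the next section try to tune online.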
As mentioned in Section 2 of this work, a significant effort has been made by the community to overcome the problems related to the so-called negative transfer. Prominent among the resulting schemes are the adaptive EM methods. These methods are mainly conceived to dynamically calculate the synergies among tasks, and subsequently to measure how much knowledge should be transferred across different tasks. Thus, in this section we outline the MFO methods proposed up to now to dynamically cope with the curse of negative transfer.
To start with, it is appropriate to mention the recently proposed MFEA-II [31], conceived as the evolved version of the standard-bearer method of the field: MFEA. Two main ingredients are embedded in the basic MFEA to evolve it into its adaptive variant MFEA-II. First, the parameter which dictates the extent of transfers (RMP) is now codified as a matrix, with a dedicated value for each pair of tasks. Second, this matrix is continuously adapted based on the performance of the multitasking search. It is also noteworthy that this method has already been adapted to discrete problems, as can be seen in the recent work [88]. Furthermore, the same authors who developed the single-objective MFEA-II also introduced its multi-objective version in [89]. As in the case of the static MFEA, these adaptive schemes also base their knowledge sharing on implicit procedures based on genetic crossover functions.
Another adaptive MFEA is proposed in [90]. In that paper, the method is endowed with a self-regulation mechanism. The main objective of this mechanism is to automatically capture the useful knowledge that the tasks at hand have in common. To materialize this goal, this approach introduces the concept of ability vector, which substitutes the skill factor τ_p, and which reflects the capability of a solution for tackling each of the tasks being optimized. 
Furthermore, two years before MFEA-II was proposed in 2019, some of its authors had introduced a Linearized Domain Adaptation MFEA (LDA-MFEA) [20]. This variant can be considered an adaptive one, since it employs a linear transformation strategy for mapping the landscapes of simpler tasks to the search space of complex ones. In that way, the authors try to conduct efficient knowledge transfer between the problems while they are being optimized in concert. In the same year, 2017, the authors of [91] proposed an MFEA with parting ways detection and resource reallocation mechanisms. The first of these functionalities is in charge of detecting the occurrence of parting ways, at which the sharing of knowledge becomes unproductive, while the second mechanism reallocates fitness function evaluations to different types of generated solutions by ceasing the knowledge transfer once ways have parted.
Also interesting is the work carried out in [92], in which a Group-Based MFEA (GMFEA) is modeled and implemented. The main characteristic of GMFEA is that it divides tasks into different conceptual groups depending on their proven synergy. Thus, GMFEA restricts the implicit genetic transfer to problems belonging to the same group. Its most important feature is that the grouping is performed dynamically, without requiring any prior knowledge. Also remarkable is the research recently conducted in [93]. In that paper, the authors first explore how diverse kinds of crossover impact the implicit knowledge transfer in MFEA for solving continuous optimization problems. After that, they introduce a novel MFEA with adaptive knowledge transfer (MFEA-AKT), in which the mating function used for sharing genetic material is autonomously adapted employing the information gathered throughout the complete search process.
Furthermore, in addition to the above-mentioned multi-objective MFEA-II, a further adaptive variant of the MO-MFEA has been proposed in [94]. 
The specific method implemented in that work is characterized by two novel ingredients: a) the use of a set of reference points to determine the diversity of the current population (instead of using the crowding distance), and b) the online adaptation of the Random Mating Probability (RMP) with the intention of improving the genetic transfer between highly similar tasks. A further interesting method of this kind is developed in [95]. Specifically, that work is devoted to the implementation of a so-called MO-MFEA with decomposition and dynamic resource allocation strategy (MFEA/D-DRA).
Analogously to static MFO algorithms, researchers and practitioners have also proposed several adaptive multifactorial methods inspired by the main concepts of this EM paradigm. As mentioned before, MFO methods are the main exponents of implicit knowledge transfer-based approaches. We can highlight first the adaptive multifactorial memetic algorithm proposed in [96], which combines a) the use of local search mechanisms influenced by the knowledge learned among problems, b) a re-initialization procedure for overcoming premature convergence issues and c) a self-adaptive parent selection strategy based on search performance. Also valuable is the work conducted in [27], which is focused on developing an adaptive variant of the above-mentioned MFCGA. The so-called Adaptive Transfer-guided MFCGA introduces two dynamic ingredients: a) a dynamic reorganization of cellular grids based on search performance and b) a self-adaptive multi-mutation mechanism. 
A further MFO adaptive variant can be found in [97], devoted to the presentation of a multifactorial Particle Swarm Optimization method with a self-adaptation strategy for adjusting the inter-task learning probability.
As multi-objective alternatives, we can highlight the adaptive multiobjective and multifactorial DE algorithm (AdaMOMFDE) proposed in [98], based on multiple mutation operators which are selected following an adaptive strategy according to their search results. Also significant is the multiobjective and multifactorial subspace alignment and self-adaptive DE (MOMFEA-SADE) recently introduced in [99]. The principal ingredients of that method are a) a mapping matrix obtained by subspace learning and employed for modifying the search space and minimizing the impact of negative transfers, and b) a self-adaptive trial vector used in the DE for generating new solutions influenced by previous experiences. Finally, the work conducted in [100] revolves around the multitasking adaptive formulation of the well-known multiobjective optimization evolutionary algorithm based on decomposition (MOEA/D, [101]).
Finally, it should be highlighted that, also in this category, a few MM solvers have been proposed in recent years in addition to those based on MFO. In this line, it is worth describing first the coevolutionary multitasking framework proposed in [102] and [103], coined as evolution of biocoenosis through symbiosis (EBS). Inspired by the symbiosis in biocoenosis, EBS is composed of multiple populations, each of them running an independent Evolutionary Algorithm. The information exchange among tasks constitutes the so-called symbiosis, and it is conducted through an implicit transfer procedure coined as Information Exchange through Concatenate Offspring. 
Finally, this method introduces adaptive mechanisms for controlling information exchange, mainly based on the search performance. An additional valuable coevolutionary framework is proposed in [104], named many-task evolutionary algorithm (MaTEA). This framework is similar to EBS in that it also features multiple populations governed by an Evolutionary Algorithm, each one dedicated to the optimization of one task. The main characteristic of MaTEA is an adaptive selection mechanism for choosing a suitable assisted task for a given problem based on the accumulated rewards of positive knowledge sharing during the search. Moreover, a genetic material transfer scheme via crossover is used for sharing information between problems to improve the efficiency of the search, giving rise to the coevolutionary nature of the method.
3.4. Explicit Knowledge Transfer Based Static Solvers
All the works mentioned in this paper up to now clearly attest to the importance that the EM field has in the current scientific community. Furthermore, the intense activity highlighted in Sections 3.2 and 3.3 also unveils the importance of MFO in this specific branch of knowledge. In any case, this success cannot overshadow the fact that researchers and practitioners have proposed alternative schemes to MFO to deal with EM environments. Most of these schemes are MM-based approaches, principally characterized by embracing explicit knowledge sharing strategies. In this section, we outline the main work conducted in recent years around explicit transfer based static solvers. As introduced in Section 2, this kind of knowledge transmission is usually conducted by migrating complete solutions among populations, namely from one task to another. Additionally, explicit transfer can also be realized through the use of mapping functions or by making use of EDA-style probabilistic models instead of raw solutions.
Arguably, the most successful alternative trend to the MFO paradigm is the one related to MM approaches. Going deeper, the most used MM methods fall inside the category known as coevolutionary. These multitasking methods are composed of multiple populations of individuals, each of which is usually dedicated independently to the optimization of a single task. Thus, the autonomous evolution of these subpopulations, together with the occasional sharing of genetic material or the sporadic collaboration among them, results in a better joint evolution of all of them.
Some exponents of these methods can be found in the works [25], [26] and [105]. All three algorithms are multipopulation approaches, governed by separate Genetic Algorithms, Variable Neighborhood Search and Bat Algorithms, respectively. 
These three methods have demonstrated promising performance, using a scheme in which each subpopulation is devoted to solving one single task. Furthermore, the genetic material exchange is materialized through the occasional migration of complete solutions among the multiple populations. The same trend is also adopted in [106], where an MM method named Differential Evolutionary Multitask Optimization is proposed, in which the knowledge sharing is conducted through the migration of individuals among populations. A similar philosophy is followed in the Multitasking Genetic Algorithm modeled in [107], in which a population of solutions is created for each problem being optimized, and the knowledge sharing is realized at each iteration through the transfer of different chromosomes among populations.
In the research conducted in [108], an EM algorithm with explicit genetic transfer is presented. Also known as EM via autoencoding, or Explicit EM Algorithm, this method is composed of as many independent populations as tasks being optimized. The knowledge sharing is materialized along the search through the injection of good solutions found by any of the subpopulations during their execution. To appropriately conduct this genetic transfer, a multiplication operation is used with a previously learned task mapping. The authors of this last work extend their research in [109] by applying their EM via autoencoding to the well-known Capacitated Vehicle Routing Problem. A further evolution of this method is proposed in [110], with an algorithm coined as EMT/ET. That enhanced technique explores a novel selection of transferred solutions, based on the dominance of those solutions over the problems being optimized. Additionally, [111] describes a generalist multipopulation optimization scheme, based on concepts similar to those described above. 
The authors empirically demonstrate the efficiency of their scheme using a DE algorithm as its basis, giving rise to a so-called multipopulation multitask DE optimization. More concretely, this method realizes the sharing of information by sporadically creating overlapping populations. Also interesting is the work proposed in [112]. In that work a specific instantiation of a multitasking genetic fuzzy system is presented and developed: a multitasking evolutionary optimization algorithm for Mamdani fuzzy systems with fully overlapping triangle membership functions (FOTMF-M-MTGFS). Further MM schemes can be found in [113] and [114].
3.5. Explicit Knowledge Transfer Based Adaptive Solvers
We finish this systematic review of the state of the art related to EM by delving into the last category that can be found in the literature: explicit knowledge transfer based adaptive solvers. In this case, it is also interesting to mention that the methods that can be framed in this last category mainly embrace the above-introduced MM philosophy.
In [115], an interesting adaptive version of the above-described Explicit EM Algorithm [108] is proposed. Specifically, the authors explore the use of the feedback gathered from the solutions transferred across tasks as a guide for task selection. This feedback is updated along the search process, making it possible to estimate the usefulness of transfers across tasks. An additional valuable algorithm is the novel EM algorithm with dynamic resource allocating strategy (MTO-DRA) introduced in [10]. The adaptive mechanism considered in this EM method is similar to those presented in [91] or [95]. The main novelty of MTO-DRA in comparison with those similar methods is its multipopulation nature. More concretely, at each iteration, subpopulations are generated from the overall main population, each one fully devoted to solving one specific task. After this step, resources are allocated to every subpopulation based on the index of improvement of the tasks. This index is calculated online based on the performance feedback of previous generations.
The authors of [116] propose an online similarity learning strategy, named adaptive model-based transfer (AMT). To demonstrate its good performance, the authors instantiate an EM algorithm, called AMT-enabled
EA. The main characteristic of AMT is its capability to dynamically learn and exploit the similarities across black-box optimization problems, minimizing negative transfers. It is also interesting to mention that the algorithm proposed in that work relies on a single population, and is therefore classifiable as neither MFO nor MM. This algorithm is further studied and enhanced in [117] by means of online data-driven learning of non-linear mapping functions. Additional alternative adaptive EM schemes can be found in [118] and [119].
Throughout this systematic literature review section, we have conducted a deep analysis of the efforts made so far in the Evolutionary Multitask Optimization field. In Section 4, we further analyze and discuss the common methodological trends observed in the literature. This methodological overview should also serve as guidance for the upcoming challenges related to this promising field.
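As a concrete illustration of the migration-based explicit transfer used by many of the MM solvers reviewed in Sections 3.4 and 3.5, the sketch below copies the best solutions of each task-specific population into a neighboring population, replacing its worst members. The ring topology, the replace-worst policy and the function name `migrate` are assumptions made for this example only; the cited methods differ in when, what and where they migrate, and several of them first pass the migrants through a learned mapping between tasks.

```python
def migrate(populations, objectives, n_migrants=1):
    """Exchange complete solutions among task-specific populations.

    populations: one list of solutions (lists of floats) per task.
    objectives: one cost function per task (lower is better).
    Returns new populations of unchanged size in which the n_migrants worst
    members of each population are replaced by copies of the best members
    of the previous population in a ring (illustrative topology).
    """
    n_tasks = len(populations)
    # Rank every population on its own task before any exchange takes
    # place, so that a migrant is never forwarded twice in one call.
    ranked = [sorted(pop, key=obj) for pop, obj in zip(populations, objectives)]
    new_populations = []
    for t in range(n_tasks):
        donor = ranked[(t - 1) % n_tasks]
        migrants = [list(solution) for solution in donor[:n_migrants]]
        survivors = ranked[t][:len(ranked[t]) - n_migrants]
        new_populations.append(survivors + migrants)
    return new_populations


# Two one-dimensional toy tasks with optima at 0.2 and 0.8.
toy_objectives = [lambda x: (x[0] - 0.2) ** 2, lambda x: (x[0] - 0.8) ** 2]
toy_populations = [[[0.1], [0.9], [0.5]], [[0.7], [0.0], [0.4]]]
toy_result = migrate(toy_populations, toy_objectives, n_migrants=1)
```

In this toy case the tasks have distant optima, so the migrants are unlikely to be useful to the receiving population; this is the negative transfer that adaptive solvers try to detect, for instance by tracking whether migrants survive selection in their new population.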
4. Current Methodological Trends in Evolutionary Multitask Optimization
We provide in this section a methodological overview of the current state of the EM research field. The studies published in this area to date have been abundant, giving rise to a significant number of techniques which share common practices, mechanisms and resources. The main reason for the existence of these different research trends is that they are dedicated to tackling some latent implementation challenges that should be addressed when dealing with EM environments. Thus, the main goal of this section is to briefly highlight the principal methodologies adopted by practitioners in the different phases of algorithmic development.
• How to design the unified search space: One of the most important issues when facing EM environments is the way in which solutions are encoded. This is essential mainly in approaches that fall inside the MFO paradigm. The main challenge at this point is that wide and generalist encodings will result in superficial representations of solutions, not concrete enough to scrutinize interesting regions of task-specific search spaces. On the contrary, very specific representations can make genetic sharing between tasks coming from different optimization problems impossible. In this sense, it should be taken into account that, depending on the type of EM technique to be implemented, it is possible that generated individuals are evaluated on different tasks throughout the whole search process. This is common in methods in which the sharing of knowledge is conducted by explicit transfer. In other cases, although individuals are dedicated to solving one exclusive task, the existence of implicit transfer procedures makes it essential that solutions devoted to different problems are capable of sharing knowledge with each other. This situation unveils the need for a unified search space, even more so when the tasks to solve are not completely related or belong to different types of optimization problems. In the literature, many approaches for the efficient design of the unified search space can be found. If the tasks to solve are codified by continuous variables, the most used method for encoding individuals is the well-known random-keys representation [120], as can be seen in works such as [12, 75, 90, 2, 97]. 
Furthermore, for discrete problems, two alternatives have been mainly followed by researchers: the transformation of the discrete search space into a continuous one through the random-keys representation, as mentioned in [18] and adopted in works such as [43]; or the use of discrete search spaces such as the one introduced in [42] and used in works such as [27, 109]. Additional examples of encoding strategies can be found in the literature, but they are mainly constructed ad hoc for a specific type of problem. Examples of this are the codification used in [44, 46, 47] for solving clustered shortest path tree problems, or the one employed in [26] for community detection over graphs. More recently, the use of neural auto-encoders has been proposed as a means to realize information transfer between tasks explicitly through the exchange of problem solutions, rather than delegating this exchange implicitly to the crossover operation over a unified search space [108, 109].
• How to evolve the population(s) along the execution: As mentioned before, Evolutionary Multitask Optimization refers to the design and implementation of multitasking solvers based on search procedures and operators drawn from Evolutionary Computation and Swarm Intelligence. Thus, since these are population-based iterative methods, a crucial aspect that defines them is the selection of the chromosomes that survive from one generation to the next. In EM, several procedures have been proposed to date, with the scalar fitness based selection of MFEA being the most often employed one. Specifically, scalar fitness based selection is an elitist survivor function in which the best P individuals in terms of scalar fitness ϕ_p among those in the current population and the newly produced offspring survive for the next generation. 
Another alternative is the so-called local improvement selection, by which a newly generated solution can only replace its direct parent if it improves upon it. This strategy is followed in methods such as the MFCGA [28, 74], or those based on PSO or DE, as in [75, 11]. In another vein, in most MM schemes survivor selection is conducted within each subpopulation, following traditional evolutionary computation and/or swarm intelligence selection operators.
• How to share knowledge between tasks: The effective transfer of genetic material is arguably the most important factor for EM methods to work efficiently. This procedure is what makes a multitasking technique superior to classical metaheuristics and solving schemes. In any case, the design of adequate knowledge sharing mechanisms is not trivial, and it usually depends on several issues, such as the encoding strategy employed or the nature of the problems being solved. The main challenge on this point is twofold: i) to share as much valuable knowledge as possible among tasks, with an acceptable frequency, and ii) to define which genetic material should be shared among the tasks being optimized. Following these principles, knowledge transfer can be realized in several ways. Probably the most widely used strategy, having shown great performance so far, is the generation of new individuals using genetic material coming from solutions with different skill tasks. An example of this trend is the well-known assortative mating, which can be materialized through i) a common crossover operation, as in MFEA, MFEA-II and many other MFO techniques [18, 31, 27]; ii) mutation strategies, as in DE-inspired techniques [98, 72, 73]; or iii) the velocity-based movements of PSO-inspired methods [69, 75, 97].
Another commonly used mechanism for conducting inter-task knowledge transfer is the one used by MM methods, in which multiple populations coexist, each one devoted to solving one specific task [25, 105, 107, 114, 26]. In those cases, knowledge sharing is conducted mainly by migrating solutions among subpopulations, thereby changing the task that the migrated individuals optimize. Other, less frequently used genetic transfer schemes are the one based on a single-layer denoising autoencoder, used in the so-called EM via autoencoder [108, 109]; the generation of temporary overlapping populations [111]; or those based on the archives of DE solvers [84].
• How to adapt the algorithm to negative transfer: As seen in Sections 3.3 and 3.5, a common trend for overcoming the curse of negative transfer is the design and implementation of adaptive mechanisms. The motivation behind these mechanisms is also twofold: i) to share as much valuable knowledge as possible among synergistic tasks, and ii) to avoid the inefficient transfer of genetic material among non-complementary tasks. In this regard, several promising alternatives have been proposed in the literature so far. More concretely, we can distinguish two methodological trends: i) soft negative transfer avoidance mechanisms, which are devoted to discouraging knowledge sharing among unrelated tasks, and ii) hard negative transfer avoidance mechanisms, which aim at prohibiting the transfer of genetic material among incompatible tasks. Arguably, the most commonly used soft mechanism is the online fine-tuning of algorithm parameters. This is the focal point of the influential MFEA-II [31] and its discrete and multi-objective variants [88, 89], for example. Additional examples of this trend can be found in [94, 96, 97], which use mechanisms similar to those of MFEA-II, or in [93, 27, 98], in which the adaptation is applied not to the parameters but to the search operators in use. Another common strategy is resource allocation [91, 10, 95], which dynamically analyzes the complementarities among tasks and allocates computational resources based on them. Other soft strategies include the reinitialization of algorithm structures such as populations [27, 96], or the control of the amount of genetic material exchanged among solutions [102, 103]. As for hard mechanisms, we can highlight the dynamic generation of conceptual groups based on the synergies that arise during the search [92, 104].
In any case, it should be highlighted that hard mechanisms are more complex to implement, since they must be aware of the similarities among the tasks being optimized (either known beforehand, obtained dynamically, or derived from a study of the corresponding landscapes).
Despite these methodological trends, other great challenges and niches persist in the field. Some of these research opportunities are closely linked to the trends discussed in this section, while others are devoted to finding new methodological approaches or to applying EM techniques to new and more complex domains. We review these challenges in Section 5, along with an outline of several research opportunities that are bound to attract much of the activity of the related community in the coming years.
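The first three design questions above (encoding, survivor selection and knowledge sharing) can be illustrated with a minimal Python sketch in the spirit of MFEA. It is not the reference implementation of any cited method. It assumes the usual definition of scalar fitness as the inverse of an individual's best rank over all tasks, and all function names, the parameter name rmp (random mating probability), the uniform crossover and the Gaussian mutation step are illustrative choices.

```python
import numpy as np

def decode_continuous(keys, lower, upper):
    """Scale the first len(lower) random keys from [0, 1] to the task's own bounds."""
    d = len(lower)
    return lower + keys[:d] * (upper - lower)

def decode_permutation(keys, n):
    """Read the first n random keys as a permutation: sort positions by key value."""
    return np.argsort(keys[:n], kind="stable")

def scalar_fitness(factorial_costs):
    """factorial_costs: (P, K) array of costs, lower is better.
    Returns 1 / (best rank of each individual over the K tasks), rank 1 = best.
    Assumed definition, following the usual MFEA formulation."""
    order = np.argsort(factorial_costs, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable") + 1
    return 1.0 / ranks.min(axis=1)

def elitist_survivors(population, factorial_costs, P):
    """Keep the P individuals with the highest scalar fitness."""
    phi = scalar_fitness(factorial_costs)
    keep = np.argsort(-phi, kind="stable")[:P]
    return population[keep], factorial_costs[keep]

def assortative_mating(parent_a, parent_b, skill_a, skill_b, rmp, rng):
    """Recombine parents if they share a skill task, or otherwise with probability rmp.
    If no recombination takes place, a mutated copy of parent_a is returned."""
    if skill_a == skill_b or rng.random() < rmp:
        mask = rng.random(parent_a.shape) < 0.5           # uniform crossover (illustrative)
        child = np.where(mask, parent_a, parent_b)
        skill = skill_a if rng.random() < 0.5 else skill_b  # child imitates one parent
    else:
        noise = rng.normal(0.0, 0.05, parent_a.shape)      # Gaussian mutation (illustrative)
        child = np.clip(parent_a + noise, 0.0, 1.0)
        skill = skill_a
    return child, skill
```

In this sketch the same vector of keys can be decoded as a real-valued solution for one task and as a permutation for another, which is what allows crossover in the unified space to act as an implicit transfer channel. An adaptive (soft) mechanism against negative transfer would replace the fixed rmp by a value updated online.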
5. Evolutionary Multitask Optimization: Challenges and Research Directions
Considering the review of the activity discussed in the preceding sections, there is little doubt that Evolutionary Multitasking has brought a fresh breeze to the community working on Evolutionary Computation and Swarm Intelligence. Advances so far in this area have been notable, exposing the benefits of embracing multitasking in optimization problems close to reality. However, the relative youth of this field has left several challenges and research niches still insufficiently addressed. In this section we enumerate a series of open research questions, and postulate research paths that can be followed to tackle them effectively in years to come. We complement each identified challenge with a brief explanation of its scope, relevance and alignment with current research efforts made in other fields, summarizing all this information in Figure 4 for a quick visual reference of contents:
As discussed throughout the survey, so far the exchange of information between tasks has been done either implicitly or explicitly. In both cases, the similarity between tasks has been used to dictate which individuals are mated with each other (and, to a point, how), or to establish which tasks exchange explicit genotype information with each other. Notwithstanding this general usage pattern, an open question remains as to whether an a priori assessment of the similarity between tasks is really needed in the context of Evolutionary Multitasking. Adaptive approaches such as MFEA-II or AT-MFCGA have exposed the capability of the evolutionary search process itself to elicit a progressively better estimation of the similarities between the problems being solved. However, there is no certainty whether this estimation of the similarity between tasks effectively avoids counter-synergies among them all along the process, particularly in early evolutionary stages. The availability of a priori information on how tasks relate to each other, by any means, should be exploited from the very beginning of the EM approach for the initial evolution to be informed properly.
[Figure 4 content: diagram centered on "Evolutionary Multitask Optimization", with one branch per challenge]
• Measures of Similarity (Subsection 5.1): meta-learning to infer the similarity between tasks; studies showing whether similarity-driven search can be counterproductive in early stages of evolution.
• Solution representation learning (Subsection 5.2): flexible, learnable encoding strategies; subspace learning methods to better align encoding, search operators and knowledge transfer.
• Generative Evolutionary Multitasking (Subsection 5.3): confluence with the family of EDA-based solvers; new models learning to generate new solutions for synergistically related tasks.
• Scaling up Evolutionary Multitasking (Subsection 5.4): experiments with increased scales; experimental conditions with realistic functional and non-functional constraints.
• Benchmark and Methodological guidelines (Subsection 5.5): benchmark comprising diverse problem families, with both synergistic and non-synergistic relationships; more informed discussions when evaluating different algorithms, according to good methodological practices.
• New Problems (Subsection 5.6): new problem flavors (multimodal, multiparty, dynamic optimization); Machine Learning optimization problems (Neural Architecture Search, Meta-learning).
Figure 4: Conceptual diagram summarizing the identified set of challenges and research directions.
Departing from this last intuition, we foresee that further efforts should be invested in advanced methods to estimate the similarity between optimization tasks without actually solving them. Clearly, a well-behaved measure of similarity between optimization tasks should roughly depend on the closeness of their optimal solutions. However, it is important to note that in the context of EM, the similarity between optimization tasks has no unique definition, and depends on the search and transfer operators being deployed. For instance, small differences in the solutions of two tasks can be amplified if the encoding strategy is not designed suitably, eventually leading to a counterproductive exchange of knowledge. This possibility is often overlooked in the literature in favor of the design of unified representational strategies for all tasks under consideration. Conversely, given certain search operators, the tasks can be claimed to be related/similar to each other only in the context of the encoding strategy and operators in use, and provided that multitasking leads to faster convergence than in the case of isolated problem solving. The same set of problems may lead to negative knowledge transfer if different operators are used.
We definitely advocate for further research in this direction. A first research direction to follow is the incorporation of meta-learning algorithms capable of inferring the similarity between pairs of tasks based on meta-features extracted from the problems (e.g. based on fitness landscapes or on solution space sampling). This similarity estimation should also be complemented by an encoding alignment between tasks that ensures maximally aligned individuals in multi-population EM approaches. For this latter purpose, non-linear methods from domain adaptation have been recently explored from the transfer optimization perspective [117], leaving a door open to the consideration of further ingredients from subspace learning.
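As a simple illustration of a similarity estimate obtained by solution space sampling, without solving either task, the sketch below evaluates two tasks on the same random points of a unified space [0, 1]^dim and returns the rank correlation of their fitness values. This is a crude proxy chosen here for illustration, not a measure proposed in the surveyed literature. The function names and the sample size are assumptions, and the result is only meaningful for the encoding under which the tasks are evaluated.

```python
import numpy as np

def rank(values):
    """Ranks 0..n-1 in ascending order (ties broken by position)."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def landscape_similarity(task_a, task_b, dim, n_samples=500, seed=0):
    """Spearman-style rank correlation of two fitness functions evaluated
    on the same random points of the unified search space [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    samples = rng.random((n_samples, dim))
    fa = np.array([task_a(x) for x in samples])
    fb = np.array([task_b(x) for x in samples])
    # Pearson correlation of the ranks equals Spearman's coefficient (no ties).
    return float(np.corrcoef(rank(fa), rank(fb))[0, 1])

# Illustrative tasks: a sphere function and two transformations of it.
def sphere(x):
    return float(np.sum((x - 0.5) ** 2))

def scaled_sphere(x):
    return 3.0 * sphere(x) + 1.0   # same ordering of solutions

def inverted_sphere(x):
    return -sphere(x)              # opposite ordering of solutions
```

A value close to 1 suggests that good solutions of one task tend to be good for the other under the shared encoding, while a value close to -1 suggests that transferring solutions is likely to be counterproductive.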
Grounded on the two schools of thought about how to face information transfer between tasks (explicit versus implicit), a further step should be taken towards finding not only solutions to the problems, but also representations of the solutions for each problem that are more efficient for conveying knowledge among tasks. This resonates with the reflections made in the previous subsection, by which similarity is strongly subject to the set of operators and the encoding strategy in use. Indeed, knowledge exchange between tasks can be beneficial only under appropriate solution representations. For instance, in graph coloring/community detection problems, a permutation-invariant encoding approach has been noted to be of utmost necessity for implicit information transfer through crossover [26]. Otherwise, information transfer cannot lead to better convergence, even if the networks to be clustered are rotated versions of the same network.
Solution representation learning is therefore vital for effective knowledge transfer. This is an exceedingly important topic for future research in evolutionary multitasking: learning solution representations, either based on prior data or adaptively during the course of the search, to enhance positive transfer. This unleashes an interesting opportunity for learnable encoding strategies, especially for those that can be evolved jointly with the solution itself (e.g. genetic programming). Otherwise, when allowed by the application domain where tasks are defined, tailored alignment methods or flexible encoding approaches should be utilized instead, always coupled tightly to the heuristic search operators in use.
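To make the point about permutation-invariant encodings concrete, the following sketch shows one common way to remove label symmetry from a community (or color) assignment: relabeling groups in order of first appearance. It is given only as an illustration and is not claimed to be the encoding used in [26]. The function name is hypothetical.

```python
def canonical_labels(labels):
    """Relabel group identifiers in order of first appearance, so that
    assignments describing the same partition share one genotype."""
    mapping = {}
    canonical = []
    for label in labels:
        if label not in mapping:
            mapping[label] = len(mapping)   # next unused canonical id
        canonical.append(mapping[label])
    return canonical

# Two genotypes describing the same partition {0,1}, {2,4}, {3} of five nodes.
partition_a = [2, 2, 0, 1, 0]
partition_b = [1, 1, 2, 0, 2]
```

Without such a canonical form, crossover between partition_a and partition_b would mix labels that denote different groups in each parent, so the offspring would not inherit the shared structure even though both parents encode exactly the same solution.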
Most EM approaches reported to date are based on sampling the space of possible solutions to the problem, without any attempt at learning the distribution of good solutions. In other words, the space of possible solutions is traversed by resorting to evolutionary and/or swarm operators, so that new solutions stem from the application of such operators to one or multiple populations of individuals. An additional degree of intelligence in how the space is sampled could be achieved by creating synthetic solutions along the search that reinforce and push forward the convergence of synergistically related tasks. If two tasks were found to be related to each other during the search, a generative machine learning model could progressively learn the distribution of good solutions for both problems. Once learned, this generative model could be queried over the search, replacing (fully or partly) the application of evolutionary operators. As a result, synthetic solutions that are potentially good for related tasks could be produced and fed to the population, ultimately accelerating the convergence.
The adoption of latent generative models already underlies renowned EM methods, such as the probability mixture models used in MFEA-II to model the relationships between tasks. It is our belief that a profitable research path for EM lies in the long history of EDA algorithms, which address the concept of generative modeling and sampling for single-task optimization. It is only a matter of time before the EDA and EM realms come together to spawn a new generation of intelligent multitask solvers, producing not only solutions, but also distributions that can be exported to other EM setups comprising task instances of a similar kind.
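A minimal sketch of this idea, in the style of a Gaussian EDA, is given below. It assumes that two tasks have already been found to be related and that both are encoded in a unified space [0, 1]^dim. A single multivariate Gaussian is fitted to the pooled elites of both tasks and then sampled to produce synthetic candidates. The choice of model, the function names and the regularization constant are assumptions made for illustration only.

```python
import numpy as np

def fit_gaussian(elites):
    """Estimate mean and (regularized) covariance of a set of elite solutions."""
    mean = elites.mean(axis=0)
    cov = np.cov(elites, rowvar=False) + 1e-6 * np.eye(elites.shape[1])
    return mean, cov

def sample_synthetic(mean, cov, n_new, rng):
    """Draw synthetic solutions and clip them back into the unified space."""
    samples = rng.multivariate_normal(mean, cov, size=n_new)
    return np.clip(samples, 0.0, 1.0)

def generative_step(pop_a, cost_a, pop_b, cost_b, n_elite, n_new, rng):
    """Pool the best n_elite solutions of two related tasks, learn their joint
    distribution and sample n_new candidates to be fed to both populations."""
    elite_a = pop_a[np.argsort(cost_a)[:n_elite]]
    elite_b = pop_b[np.argsort(cost_b)[:n_elite]]
    mean, cov = fit_gaussian(np.vstack([elite_a, elite_b]))
    return sample_synthetic(mean, cov, n_new, rng)
```

The sampled candidates could replace, fully or partly, the offspring produced by evolutionary operators. The fitted mean and covariance are also an exportable by-product of the search, in line with the idea of reusing learned distributions in other setups with similar tasks.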
In reduced experimentation setups, the use of EM methods has been shown to yield benefits in terms of convergence with respect to single-task optimization. However, when scaling up EM environments to realistic levels in terms of the number and diversity of tasks, these observed benefits can diminish for several reasons. To begin with, the computational resources required to scale up the search nicely with the number of tasks can become unaffordable if the search over all tasks is to be made in a centralized fashion. An opportunity arises at this point for multipopulation schemes at the expense of multifactorial approaches, as they naturally allow for decentralized implementations of the search and thereby a more balanced share of the computational cost among stakeholders. This alternative, however, would come along with other aspects to be considered, such as the selection of a synchronous/asynchronous knowledge transfer policy or the reliability of the fitness evaluation made locally, among other issues noted in the field of distributed evolutionary computation [121].
Even if the above matters are eventually solved by advances in research, we definitely need to formulate an additional question: when and where can it be beneficial to solve thousands (potentially, millions) of tasks at once? Is there any realistic setup comprising these scales, in which several problems are related to each other so that this synergy leads to quantifiable performance gains? It is an undeniable truth that so far, experimental setups utilized in the community working in EM have been restricted to a few problem instances selected as per their relationship known beforehand. Non-functional aspects inherent to a distributed setup are, therefore, left aside in favor of a more focused pursuit of algorithmic advantages. In real settings, however, aspects such as privacy should be guaranteed in a similar fashion to what is sought in related modeling fields (e.g. federated learning). Delving into this aspect, federated optimization would aim to jointly evolve different distributed tasks, without the tasks revealing their actual best solutions to each other. A possible approach would be to define an encrypted unified search space, so that each task could only decipher its own solution. The challenge in this direction is how to include this encryption functionality without jeopardizing the transfer of knowledge via implicit genetic transfer, or hindering the overall multitasking search efficiency.
One aspect of EM research that has been called into question is the quality of the experimental benchmarks designed to assess the performance of every proposed approach. Most contributions to date have considered experimental setups comprising a few tasks, belonging at best to 4-5 problem formulations over which inter-task similarities and synergies can be analyzed and discussed. Despite recent efforts in the heat of competitions held in frontline conferences, common methodological guidelines and benchmarking tools are still to be agreed upon and widely adopted in prospective contributions. Otherwise, there will be no clear grounds to ensure the fair evaluation, replication and comparison of new advances in the field.
Therefore, new benchmarks, quantitative metrics and methodological guidelines should be proposed, discussed and embraced by researchers working in EM. On the one hand, scores should relate to the quality of the produced solutions for the tasks under consideration, as well as to the computational efficiency of the joint search, the gains with respect to single-task optimization, and the number of positive/negative transfer episodes registered for every task pair. On the other hand, methodologically speaking, all aspects that impact the obtained simulation outcomes should be reported, especially those that are often overlooked when describing the experimental setup (e.g. parameter tuning of all solvers under comparison, the imposed convergence criterion, and a solid justification of why the selected parameters make the comparison fair among solvers). Furthermore, the usual discussion among approaches held on the basis of global performance statistics (e.g. average fitness per task) should be informed with additional null hypothesis tests [122] and/or a Bayesian characterization of the obtained results [123] to guarantee the statistical significance of the gaps claimed to exist among different approaches.
All in all, a major push is needed towards crystal clear comparisons in all measurable aspects of multitasking.
In the last couple of years, a growing corpus of literature has addressed EM for tasks that go beyond real-valued single-objective optimization problems. This is the case of permutation-based combinatorial and multi-objective optimization, which have been tackled with EM approaches that incorporate algorithmic ingredients suited to deal with these problems. However, other flavors have been addressed more scarcely to date. These include multimodal optimization, where synergies emerge as per the number of and distance between global optima shared by the tasks; dynamic optimization, particularly the case when changes undergone by two tasks occur in the same direction yet at different instants over time; and multiparty optimization, where several stakeholders participate, sharing part of the objectives and/or the solutions of their related Pareto front approximations.
Footnote: Competitions on Evolutionary Multi-task Optimization held at the IEEE Congress on Evolutionary Computation (CEC’2017 to CEC’2021) and the Genetic and Evolutionary Computation Conference (GECCO’2020).
An application domain that deserves a separate mention at this point is the confluence between Machine Learning (ML) and EM. Indeed, many learning algorithms can be formulated as an optimization problem (e.g. loss minimization in Deep Neural Networks), therefore unleashing an opportunity for undertaking setups consisting of several interrelated ML problems with EM approaches. For instance, it has been widely postulated that Evolutionary Computation and Swarm Intelligence solvers can be used as a scalable alternative for solving optimization problems related to Deep Neural Networks [124, 125, 126]. Initial explorations have shown that EM can be used in multitask reinforcement learning environments to jointly train the neural models and exploit synergies between them [51]. However, other avenues at this crossroads are worth exploring further, such as neural architecture search, where the joint evolution can serve as mutual guidance for avoiding regions representing underperforming network configurations; and meta-learning, where the goal is to optimize models that can perform well on unseen tasks. ML problems will surely unfold an interesting playground for multitask optimization in the near future.
6. Concluding Remarks and Outlook
This overview has elaborated on the research area known as Evolutionary Multitask Optimization. Framed within the wider Transfer Optimization field, the main goal of this incipient paradigm is to exploit the knowledge learned throughout the optimization of one problem towards addressing other related or unrelated problems, so that they are solved more effectively than in isolation. The youth of this area contrasts with the relatively high number of contributions reported by the community to date. Consequently, our study aims at analyzing the past, present and future of this area, with an emphasis on the methodological patterns, practices and concepts followed by the community. To this end, we have first delved into the essentials of EM, establishing mathematical grounds that allow discerning the aforementioned methodological patterns in subsequent discussions. Furthermore, a clear distinction between multitask optimization, multiobjective optimization and multitask learning has been drawn. Departing from these prior definitions, we have performed a systematic review of the literature related to EM, focusing on remarkable studies published in the last few years, and establishing a landmark taxonomy that allows the audience to easily understand the algorithmic choices mostly embraced in the reviewed bibliography. Specifically, our study has informed about the prominent role of multifactorial optimization and multipopulation multitasking approaches. We have also conducted a methodological analysis of the different phases followed when designing EM solvers. Finally, we have built upon our critical literature analysis to initiate a discussion around research niches and challenges that remain insufficiently addressed to date.
On a prescriptive note, each of the identified challenges has been associated with several possible research directions, which should inspire efforts in years to come.
Exchange of knowledge between researchers should be encouraged via a common space of understanding in which to unify their views, synchronize their research agendas and energetically push their efforts towards valuable advances in the field. The ultimate purpose of this work is to promote synergistic knowledge transfer between researchers working in Evolutionary Multitasking, much in line with what is sought in multitasking itself. We also hope that this material serves as a useful point of reference for newcomers and practitioners interested in a smooth landing in this fascinating area.
Acknowledgements
The authors would like to thank the Basque Government for its funding support through the ELKARTEK program (3KIA project, KK-2020/00049) and the consolidated research group MATHMODE (ref. T1294-19).
References