Manifold Interpolation for Large-Scale Multi-Objective Optimization via Generative Adversarial Networks
Zhenzhong Wang, Haokai Hong, Kai Ye, Min Jiang, Kay Chen Tan
Abstract—Large-scale multiobjective optimization problems (LSMOPs) are characterized as involving hundreds or even thousands of decision variables and multiple conflicting objectives. An excellent algorithm for solving LSMOPs should find Pareto-optimal solutions with diversity and escape from local optima in the large-scale search space. Previous research has shown that these optimal solutions are uniformly distributed on a manifold structure in the low-dimensional space. However, traditional evolutionary algorithms for solving LSMOPs have some deficiencies in dealing with this structural manifold, resulting in poor diversity, local optima, and inefficient searches. In this work, a generative adversarial network (GAN)-based manifold interpolation framework is proposed to learn the manifold and generate high-quality solutions on this manifold, thereby improving the performance of evolutionary algorithms. We compare the proposed algorithm with several state-of-the-art algorithms on large-scale multiobjective benchmark functions. Experimental results have demonstrated the significant improvements achieved by this framework in solving LSMOPs.
Index Terms—evolutionary algorithm, multiobjective optimization, large-scale optimization, generative adversarial networks, manifold learning
I. INTRODUCTION

Many real-world optimization problems involve hundreds or even thousands of decision variables and multiple conflicting objectives; such problems are called large-scale multiobjective optimization problems (LSMOPs) [1]–[3]. For example, the design of telecommunication networks [4] involves several tens of thousands of variables, such as the locations of many network nodes, the transmission capacity between nodes, and the power allocations of nodes. These numerous decision variables determine the energy consumption, applicability, and stability of the network. Finding optimal solutions for this type of network is challenging because the volume of the search space grows exponentially with the number of decision variables, exhibiting the curse of dimensionality [5]. Therefore, an excellent large-scale multiobjective optimization algorithm (LSMOA) should overcome this issue to search the space efficiently and effectively, an ability that is crucial to solving LSMOPs.
Z. Wang, H. Hong, K. Ye, and M. Jiang are with the School of Informatics, Xiamen University, Fujian 361005, China. M. Jiang is the corresponding author (Email: [email protected]). Kay Chen Tan is with the Department of Computer Science, City University of Hong Kong, and the City University of Hong Kong Shenzhen Research Institute.
To solve LSMOPs, a variety of LSMOAs have been proposed in recent years. These LSMOAs can be roughly grouped into the following three categories. The first category of LSMOAs is based on decision variable analysis. A representative algorithm of this category is MOEA/DVA [6], where the original LSMOP is decomposed into a number of simpler subproblems. Then, the decision variables in each subproblem are optimized as an independent subcomponent. The second category of LSMOAs is based on decision variable grouping. This category divides the decision variables into several groups and then optimizes each group of decision variables successively. For example, C-CGDE3 [7] maintains several independent subpopulations, each handling a subset of the equal-length decision variables obtained by variable grouping. The above decision variable analysis or grouping methods easily lead to excessive computational complexity due to the large-scale decision variables [8]. The third category is based on problem transformation. A generic framework named the large-scale multiobjective optimization framework (LSMOF) [9] is representative of this category. In LSMOF, the original LSMOP is reformulated as a low-dimensional single-objective optimization problem with some direction vectors and weight variables, aimed at guiding the population towards the optimal solutions.

Recently, research has shown that the Pareto-optimal solutions are uniformly distributed on a manifold structure in the low-dimensional space [10], [11], and utilizing such a manifold can be an efficient way to improve evolutionary performance [6], [12]–[14]. However, traditional algorithms for solving LSMOPs have some deficiencies in dealing with this structural manifold, which may blight the manifold structure and result in poor diversity, local optima, and inefficient search. For example, as shown in Fig. 1 (a), the solutions are piecewise on the manifold due to the large-scale search space. When solutions from the same segment, which have similar genes, are mutated or mated, the offspring solutions are likely to lie around that segment. In this case, the diversity of the population cannot be improved significantly, and it is difficult to escape from local optima. Alternately, as shown in Fig. 1 (b), when solutions from different segments, which have dissimilar genes, are mated, the diversity of the population can be improved and the offspring solutions are likely to escape from local optima; however, these offspring solutions could be made obsolete by nondominated sorting, which leads to inefficient search. We believe that the key factor for solving LSMOPs is to find an effective way to learn the manifold and search for solutions on this manifold.

Fig. 1. Solutions are piecewise on this low-dimensional manifold, which results in poor diversity, local optima and inefficient search.

An intuitive idea is to learn the manifold and interpolate solutions along the manifold, as shown in Fig. 2. These interpolated solutions fill gaps of the manifold and are uniformly distributed on it, thereby leading the population to draw on the optimal area efficiently. Unfortunately, it is difficult to learn the characteristics of the entire manifold and fill its gaps because we can only learn characteristics from existing solutions. For solutions lying in the gaps that have not yet been found, their characteristics cannot be learned directly.
Fig. 2. Interpolating solutions on the manifold.
Developments from the neural network community have already shown that a group of generative models, GANs [15], provides a powerful tool for learning the high-dimensional characteristics of samples and further generating meaningful samples. From the perspective of manifold data, GANs can learn the latent characteristics lying on the manifold and interpolate previously nonexistent but meaningful samples located on this manifold [16]–[18]. With this interpolation method, a series of samples with continuous changes are synthesized on the manifold; thus, the gaps of the manifold can be filled, and we can utilize the knowledge of the manifold to guide the search direction. Therefore, the GAN can be utilized as a powerful tool in our work to interpolate solutions on the manifold.

In this paper, we argue that the integration of GANs into multiobjective optimization algorithms can offer significant benefits for designing better LSMOAs, and a framework called the GAN-based large-scale multiobjective evolutionary framework (GAN-LMEF) is proposed. Initially, GAN-LMEF uses nondominated solutions to construct a series of approximate piecewise manifolds. Then, by adopting the proposed GAN-based interpolation method, various solutions are synthesized along the manifolds to fill the gaps between them. Next, the entire manifold is updated from the generative population, and more promising solutions are reinterpolated along the new manifolds. The above procedures are repeated until the termination conditions are met. The contributions of this work are summarized as follows.

1) The proposed framework uses manifold knowledge and utilizes a GAN as a powerful tool to learn the characteristics of solutions and synthesize solutions along the manifold. These generative solutions are of high quality and can be helpful in the evolutionary process. This provides a novel way to solve LSMOPs.

2) Considering that the gaps on the manifold vary with the evolutionary process, we propose three different interpolation strategies to better fill the gaps and maintain the manifold.

3) In the selection process, we propose a manifold selection mechanism to predict whether a generative solution is excellent without evaluating objective functions, which can reduce evaluations and avoid many meaningless evaluations.

The rest of this paper is organized as follows. Section II introduces existing LSMOAs for LSMOPs and briefly reviews GANs. Section III details the proposed GAN-LMEF. In Section IV, the empirical results of GAN-LMEF and various state-of-the-art LSMOAs on LSMOPs are presented. Finally, conclusions are drawn in Section V.

II. PRELIMINARY STUDIES
A. Large-scale Multiobjective Optimization
The mathematical form of an LSMOP is as follows:

  min F(x) = < f_1(x), f_2(x), ..., f_m(x) >
  s.t. x ∈ Ω,   (1)

where x = (x_1, x_2, ..., x_n) is the n-dimensional decision vector and F = (f_1, f_2, ..., f_m) is the m-dimensional objective vector. It should be noted that the dimension n of the decision vector is greater than 100 [19]. The goal of large-scale multiobjective optimization algorithms is to find solutions for which all objectives are as small as possible. However, conflicts exist between the multiple objectives, and one solution cannot attain the minimum of all objectives simultaneously. Therefore, a trade-off relation called Pareto dominance is introduced to compare solutions. The set of optimal trade-off solutions is called the Pareto-optimal set (POS) in the decision space and the Pareto-optimal front (POF) in the objective space.

Definition 1: (Decision Vector Domination)
A decision vector x_1 Pareto-dominates another vector x_2, denoted by x_1 ≻ x_2, if and only if

  f_i(x_1) ≤ f_i(x_2), ∀ i = 1, ..., m
  f_i(x_1) < f_i(x_2), ∃ i = 1, ..., m.   (2)

Fig. 3. Step 1. Procedure Computing Central Solutions: Map nondominated solutions F into an (m − 1)-dimensional manifold, cluster them into several clusters, and identify the central solution C_i of each cluster. Step 2. Procedure Manifold Interpolation: A GAN is utilized to interpolate solutions between the central solutions C_i. Step 3. Procedure Selection: Select the excellent solutions from the original population and interpolated solutions to form a new population to guide future searches.
Definition 2: (Pareto-Optimal Set, POS)
If a decision vector x* satisfies

  POS = { x* | ∄ x, x ≻ x* },   (3)

then all such x* are called Pareto-optimal solutions, and the set of Pareto-optimal solutions is called the POS.
The POF is the corresponding set of objective vectors of the POS:
  POF = { y* | y* = F(x*), x* ∈ POS }.
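To make Definitions 1–3 concrete, the following sketch (our illustration, not part of the proposed algorithm; all names are ours) checks Pareto dominance between objective vectors and filters a small population down to its nondominated set:

  import numpy as np

  def dominates(f1: np.ndarray, f2: np.ndarray) -> bool:
      # Definition 1 (minimization): no worse in every objective,
      # strictly better in at least one.
      return bool(np.all(f1 <= f2) and np.any(f1 < f2))

  def nondominated(F: np.ndarray) -> np.ndarray:
      # Definitions 2-3: keep the rows of the objective matrix F
      # (pop_size x m) that no other row dominates.
      keep = [i for i, fi in enumerate(F)
              if not any(dominates(fj, fi)
                         for j, fj in enumerate(F) if j != i)]
      return F[keep]

  F = np.array([[1.0, 4.0], [2.0, 3.0], [3.0, 3.5]])
  print(nondominated(F))  # the third point is dominated by the second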
B. Theory of the Manifold Structure

The Karush–Kuhn–Tucker condition induces the POS of a continuous multiobjective problem to be a continuous (m − 1)-dimensional manifold in the decision space; the theorem is given in detail in [10], [11]. Suppose that the objectives f_i(x), i = 1, ..., m are continuous and differentiable at a Pareto-optimal solution x*. Then there exists α = (α_1, ..., α_m)^T (with ||α|| = 1) satisfying

  Σ_{i=1}^{m} α_i ∇f_i(x*) = 0.   (4)

Points satisfying Equation (4) are Karush–Kuhn–Tucker points. Equation (4) imposes n + 1 equality constraints on the n + m variables x_1, ..., x_n, α_1, ..., α_m. Thus, under certain smoothness conditions, the POS of a multiobjective problem is a continuous (m − 1)-dimensional manifold. Test instances in the LSMOP benchmark [20] are continuous and differentiable and meet this basic theorem.
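As a concrete illustration of this theorem, consider the following example of ours (not one of the benchmark instances): let m = 2 with f_1(x) = ||x||^2 and f_2(x) = ||x − a||^2, x ∈ R^n, for a fixed vector a. Condition (4) with weights α_1 + α_2 = 1, α_1, α_2 ≥ 0 reads 2α_1 x* + 2α_2 (x* − a) = 0, which gives x* = α_2 a with α_2 ∈ [0, 1]. The POS is therefore the line segment { t a | t ∈ [0, 1] }, a one-dimensional manifold, matching the predicted dimension m − 1 = 1 regardless of the decision space dimension n.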
C. Related Work

Existing approaches for large-scale multiobjective optimization can be roughly classified into the following three categories.

The first category comprises the decision variable analysis-based approaches. Zhang et al. [21] proposed a decision variable clustering-based large-scale evolutionary algorithm (LMEA). In LMEA, the decision variables are divided into two types using a clustering method. Then, for diversity-related variables and convergence-related variables, two different strategies are implemented to improve convergence and diversity. Ma et al. [6] presented a multiobjective evolutionary algorithm based on decision variable analysis (MOEA/DVA). The LSMOP is decomposed into a number of simpler subproblems, each regarded as an independent subcomponent to optimize. The disadvantage of decision variable analysis-based LSMOAs is that they may incorrectly identify the linkages between decision variables when updating solutions, which may lead to local optima [21].

The second category applies the decision variable grouping framework [22]. Antonio et al. [7] maintained several independent subpopulations, each handling a subset of the equal-length decision variables obtained by variable grouping (e.g., random grouping, linear grouping, ordered grouping, or differential grouping). All of the subpopulations work cooperatively to optimize the LSMOP in a divide-and-conquer manner. Tian et al. [19] used a competitive swarm optimizer to solve LSMOPs. The proposed LMOCSO adopts a two-stage strategy to update the positions of particles: a competitive mechanism is adopted to determine the particles to be updated, and the proposed updating strategy is used to update each particle. However, for decision variable grouping-based LSMOAs, two related variables may be divided into different groups, which may lead to local optima [22]. Shang et al. [23] presented the idea of cooperative coevolution, which is adopted to address large-scale multiobjective capacitated arc routing problems. Bergh et al. [24] proposed a method of casting particle swarm optimization into a cooperative framework. It partitions the search space into lower-dimensional subspaces to overcome the exponential increase in difficulty and guarantees the ability to search every possible region of the search space.

The third category is based on problem transformation. He et al. [9] introduced a generic framework called the large-scale multiobjective optimization framework (LSMOF). LSMOF reformulates the original LSMOP into a low-dimensional single-objective optimization problem with some direction vectors and weight variables, aimed at guiding the population towards the POS. Zille et al. [25] proposed a weighted optimization framework (WOF) for solving LSMOPs. Decision variables are divided into many groups, and each group is assigned a weight variable. Then, within the same group, the weight variables are regarded as a subproblem over a subspace of the original decision space.
All of these categories heavily depend on genetic operators (crossover and mutation) to reproduce offspring. Recently, He et al. [26] designed a GAN-driven evolutionary multiobjective optimization algorithm (GMOEA) that generates offspring by GANs. However, as previously mentioned, these algorithms ignore the fact that Pareto-optimal solutions are uniformly distributed on a manifold in the low-dimensional space. Therefore, in this work, we focus on how to employ GANs to fix and maintain the manifold of solutions to better solve LSMOPs.
D. Generative Adversarial Networks
GANs are generative models used to learn the characteristics of complicated real-world data. The basic structure of a GAN contains two neural networks: a discriminator and a generator. In addition to using the generator network to synthesize meaningful instances from a prior distribution, a discriminator is trained to distinguish fake samples synthesized by the generator from real samples in the training dataset. The generator aims to learn the distribution of training instances and generate fake instances to deceive the discriminator. The discriminator is a classifier that determines whether an input sample is a generated fake sample or a real sample. The training procedure culminates in a balance between the generator and discriminator, i.e., the discriminator can no longer reliably decide whether a particular sample is fake or real. Compared with other generative models, GANs provide a novel and efficient framework for learning and generating data. Therefore, GANs have recently been successfully applied in many applications [26]–[29].

To learn the distribution of the training data, the generator G takes a random noise variable z ∼ P_z drawn from a prior distribution as input to generate the data distribution P_G(z) approximating the real data distribution P_data(x), and tries to make the difference between the distributions of the generated data and the true data as small as possible. In existing GANs, the Kullback–Leibler divergence [30], Jensen–Shannon divergence [31], and Wasserstein distance [29] can be used to measure the difference between two probability distributions. The discriminator determines whether the input is a generated fake instance or a true instance; its output probability D(G(z)) tends to 0 for fake instances, while D(x) tends to 1 for true instances. Finally, the discriminator cannot distinguish whether the data belong to real samples or generated samples. Mini-batch stochastic gradient descent [32] is used to train the adversarial networks: the discriminator is updated by stochastic gradient ascent, and the generator is updated by stochastic gradient descent. The training process is a game between the two neural networks, and the model eventually tends to the global optimum. The objective function of the game can be expressed as

  min_G max_D V(G, D) = E_{x∼P_data(x)}[log D(x)] + E_{z∼P_z(z)}[log(1 − D(G(z)))].   (5)

G(z) is the data generated from the input noise variable z, and x is the data from true instances. x ∼ P(x) means that x obeys the distribution P.
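The following PyTorch sketch shows the alternating optimization of Equation (5); it is a minimal illustration rather than the exact networks used in this paper. The layer widths and the training data are placeholders, while the U(−1, 1) noise prior, the 200 epochs, and the full-batch updates follow the settings stated later in Section IV-C.

  import torch
  import torch.nn as nn

  n = 64          # solution (decision vector) dimension; illustrative
  noise_dim = 32  # latent noise dimension; illustrative

  # Generator G: noise z -> fake solution. Discriminator D: solution -> P(real).
  G = nn.Sequential(nn.Linear(noise_dim, 128), nn.ReLU(), nn.Linear(128, n))
  D = nn.Sequential(nn.Linear(n, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid())
  opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
  opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
  bce = nn.BCELoss()

  real = torch.rand(100, n)  # stand-in for the nondominated solutions F

  for epoch in range(200):
      z = torch.rand(real.size(0), noise_dim) * 2 - 1  # z ~ U(-1, 1)

      # Discriminator step (gradient ascent on V): D(x) -> 1, D(G(z)) -> 0.
      opt_D.zero_grad()
      loss_D = bce(D(real), torch.ones(real.size(0), 1)) \
             + bce(D(G(z).detach()), torch.zeros(real.size(0), 1))
      loss_D.backward()
      opt_D.step()

      # Generator step (gradient descent on V): make D label its fakes real.
      opt_G.zero_grad()
      loss_G = bce(D(G(z)), torch.ones(real.size(0), 1))
      loss_G.backward()
      opt_G.step()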
III. PROPOSED FRAMEWORK

This work proposes a GAN-driven manifold interpolation method to address LSMOPs. The details of the proposed algorithm, GAN-LMEF, are presented in this section. Briefly, GAN-LMEF consists of three subroutines: Procedure Computing Central Solutions, Procedure Manifold Interpolation, and Procedure Selection. Fig. 3 gives a detailed illustration of the main loop of GAN-LMEF. The main purpose of Procedure Computing Central Solutions is to map solutions into a manifold and find some representative solutions, called central solutions, on this manifold. In Procedure Manifold Interpolation, a GAN is utilized to interpolate solutions between the central solutions. Procedure Selection selects excellent solutions from the original population and the generative solutions to form a new population to guide future searches. The following subsections introduce these three main components in turn.

A. Computing Central Solutions Procedure
Central solutions are the representatives on the manifold, and interpolating between these central solutions may yield better solutions. First, to obtain the manifold, the nondominated solutions F are mapped to the manifold data φ(F) by principal component analysis (PCA). Then, the k-means algorithm [33] is performed to cluster solutions with similar manifold characteristics: the mapped data φ(F) are clustered into k clusters, each denoted φ(F)_i, i = 1, ..., k, and in each cluster a central solution is identified to represent the cluster. The central solution C_i of cluster φ(F)_i is defined as the solution whose mapping φ(C_i) has the minimum distance to the centroid of the cluster, calculated as

  C_i = arg min_x || φ(x) − mean(φ(F)_i) ||,   (6)

where the centroid mean(φ(F)_i) of the cluster φ(F)_i is

  mean(φ(F)_i) = ( Σ_{j=1}^{|φ(F)_i|} φ(F)_{i,j} ) / |φ(F)_i|,   (7)

and φ(F)_{i,j} represents the j-th datum in the cluster φ(F)_i. The identified central solutions and the piecewise manifolds are shown in Fig. 3 (b). The details are given in Procedure Computing Central Solutions.

Procedure (Computing Central Solutions)
Input: F (the nondominated set), k (the number of clusters)
Output: C (the central solutions)
  C = ∅;
  Use PCA with F to get the manifold data φ(F);
  Cluster φ(F) into k clusters using k-means;
  for i = 1, ..., k do
    Calculate the central solution C_i of cluster φ(F)_i according to Equation (6);
    C = C ∪ C_i;
  end for
  return C;
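A compact sketch of this procedure, assuming scikit-learn's PCA and k-means implementations (the function and variable names are our illustrative choices):

  import numpy as np
  from sklearn.cluster import KMeans
  from sklearn.decomposition import PCA

  def central_solutions(F: np.ndarray, m: int, k: int) -> np.ndarray:
      # Project the nondominated set F (rows are decision vectors) onto
      # an (m-1)-dimensional manifold with PCA, cluster with k-means,
      # and return, per cluster, the member closest to the cluster
      # centroid, following Equations (6)-(7).
      pca = PCA(n_components=m - 1)
      phi = pca.fit_transform(F)                  # manifold data phi(F)
      labels = KMeans(n_clusters=k, n_init=10).fit_predict(phi)
      centers = []
      for i in range(k):
          members = np.where(labels == i)[0]
          centroid = phi[members].mean(axis=0)    # Equation (7)
          dists = np.linalg.norm(phi[members] - centroid, axis=1)
          centers.append(F[members[np.argmin(dists)]])  # Equation (6)
      return np.asarray(centers)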
B. Manifold Interpolation Procedure

The manifold interpolation operations produce a number of previously nonexistent solutions that are distributed on the manifold and fill gaps of this manifold. These solutions are promising and beneficial to the evolutionary process.

Before interpolating, the GAN is trained with the nondominated solutions F and noise variables {z} ∼ U(−1, 1) to learn the manifold characteristics of the solutions. After that, the well-trained generator G and the noise variables {z} are saved for interpolation. In our work, three interpolation strategies are used: 1) interpolating between clusters, 2) interpolating in the cluster, and 3) perturbation interpolation.

Fig. 4. The flowchart of the manifold interpolation model. Step 1. Train the GAN with nondominated solutions F and noise variables z. Step 2. Save the well-trained generator G. Step 3. Apply interpolation strategies to z and input these interpolating noise variables into the generator G. Step 4. Generate interpolating solutions Q.

1) Interpolating between clusters: This method is used to fill the gap between two clusters. Noise variables z_{C_i} are extracted, where z_{C_i} is the noise variable that generates the fake central solution Ĉ_i (G(z_{C_i}) = Ĉ_i). For every two noise variables z_{C_i} and z_{C_j} of fake central solutions, interpolating noise variables between them can be described as [17]

  z_{ij} = (1 − α) z_{C_i} + α z_{C_j},  i, j = 1, ..., k, i ≠ j,   (8)

where α = l/n, l = 0, 1, ..., n, and n represents the dimension of the decision variable. Because the z_{ij} change gradually between z_{C_i} and z_{C_j}, the generative solutions G(z_{ij}) will also change gradually between the two clusters. For k clusters, the GAN interpolates solutions between every pair of central solutions, so it interpolates k(k − 1)/2 times.

2) Interpolating in the cluster: This method is used to fill gaps between individuals from the same cluster. The interpolating noise variables are described as

  z_{ab} = z_a + z_b,   (9)

where z_a and z_b are the noise variables that generate the fake solutions x_a and x_b, respectively, and x_a and x_b are from the same cluster. These noise variables z_{ab} are likewise input into the generator G. This interpolation method imitates the semantic operation on noise variables [34].

3) Perturbation interpolation: This method applies a disturbance to a noise variable to explore the neighbor space of the corresponding solution. Perturbation interpolation adds Gaussian noise to a random dimension of a noise variable. These noise variables are also input into the generator G.

The details of the manifold interpolation are given in Procedure GAN Manifold Interpolation, and an illustration is shown in Fig. 4.

Procedure (GAN Manifold Interpolation)
Input: C (the central solutions), P (the population), k (the number of clusters)
Output: Q (the interpolation solutions)
  Q = ∅;
  F = Fast-Non-Dominated-Sort(P);
  Sample noise variables {z} from the noise prior U(−1, 1);
  Train a GAN with F and {z}, and save its generator G;
  for i = 1, ..., k do
    Extract the corresponding variable z_{C_i} from {z} where G(z_{C_i}) = Ĉ_i;
  end for
  // Interpolating between clusters
  for i = 1, ..., k do
    for j = 1, ..., k and j ≠ i do
      Interpolate the noise variable z_{ij} according to Equation (8);
      Q = Q ∪ G(z_{ij});
    end for
  end for
  // Interpolating in the cluster
  for i = 1, ..., k do
    for x_a, x_b in the i-th cluster do
      Interpolate the noise variable z_{ab} according to Equation (9);
      Q = Q ∪ G(z_{ab});
    end for
  end for
  // Perturbation interpolation
  Add disturbance to the noise variables {z} to generate variables {z′};
  Q = Q ∪ G({z′});
  return Q;
C. Procedure of Selection
In the selection procedure, high-quality solutions are identified from the interpolated solution set Q and treated as the parental population P for the next loop.

Fig. 5. Manifold selection mechanism: each solution is scored by the distances to its two nearest central solutions. If two solutions should be selected, A and B are selected due to having the two minimum manifold distances.

There are a huge number of generative solutions in Q, and it is wasteful to evaluate all of them to identify which are of high quality. To settle this issue, a manifold selection mechanism is proposed to select the solutions that have the minimum distance to the manifold, as shown in Fig. 5. First, PCA is employed on C and Q to obtain the mapped manifold data φ(C) and φ(Q), respectively. Next, the manifold distance (MD) from each mapped solution x ∈ φ(Q) to its two nearest mapped central solutions φ(C_i) and φ(C_j) (φ(C_i), φ(C_j) ∈ φ(C)) is computed:

  MD(x) = || x − φ(C_i) || + || x − φ(C_j) ||.   (10)

Then, the top-(σ · |P|) solutions with the minimum MD are selected from Q and denoted Q′. Solutions in Q′ are promising and can improve convergence or diversity because they are close to the existing piecewise manifolds or lie in the gaps between them. Finally, only the solutions in Q′ are evaluated, and solutions x_i ∈ P are replaced with x_j ∈ Q′ where x_j ≻ x_i. The new population P is regarded as the parental population for the next round.

Fig. 6 illustrates the selection process. The details are given in Procedure Selection.

Fig. 6. Step 1. Select solutions from Q as Q′ by the manifold selection mechanism. Step 2. Evaluate the solutions in Q′. Step 3. Replace solutions in P with those in Q′.

Procedure (Selection)
Input: C (the central solutions), Q (the interpolated solutions), σ (the proportion of introduced interpolated solutions)
Output: P (the population)
  Use PCA with C and Q to get φ(C) and φ(Q), respectively;
  for x ∈ φ(Q) do
    Find the two nearest φ(C_i), φ(C_j) ∈ φ(C) to x;
    MD(x) = || x − φ(C_i) || + || x − φ(C_j) ||;
  end for
  Select the top-(σ · |P|) minimum-MD solutions from Q as Q′;
  Evaluate the solutions in Q′;
  Replace solutions x_i ∈ P with x_j ∈ Q′ where x_j ≻ x_i;
  return P;
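A sketch of the manifold selection mechanism, assuming NumPy and scikit-learn; fitting the PCA on the stacked C and Q, and the function name, are our illustrative choices:

  import numpy as np
  from sklearn.decomposition import PCA

  def manifold_select(C, Q, P_size, sigma=0.4, m=2):
      # Map central solutions C and generated solutions Q with PCA,
      # score each q in Q by the summed distance to its two nearest
      # mapped central solutions (Equation (10)), and return the
      # sigma*|P| candidates with the smallest scores.
      pca = PCA(n_components=m - 1).fit(np.vstack([C, Q]))
      phiC, phiQ = pca.transform(C), pca.transform(Q)
      # pairwise distances, shape |Q| x |C|
      d = np.linalg.norm(phiQ[:, None, :] - phiC[None, :, :], axis=2)
      md = np.sort(d, axis=1)[:, :2].sum(axis=1)   # MD(x), Equation (10)
      top = np.argsort(md)[: int(sigma * P_size)]
      return Q[top]   # only these candidates are evaluated afterwards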
D. Framework of the Proposed GAN-LMEF

The main scheme of the proposed GAN-LMEF is presented in Algorithm 1. First, a random population P is initialized. We allocate ε · e evaluations to the interpolation of solutions, where the parameter ε controls the number of iterations of interpolation and e is the maximum number of evaluations for solving the problem.

In every iteration of the main loop, a new population P is generated through mating selection and mutation. Then, the fast nondominated sort [35] is employed to extract the nondominated solution set F. Next, F is input into Procedure Computing Central Solutions, in which the nondominated solutions are mapped into an (m − 1)-dimensional manifold, the mapped solutions are clustered into several clusters, and a central solution is identified for each cluster. Afterwards, in Procedure Manifold Interpolation, a GAN is utilized to interpolate the solutions Q between the central solutions C. Finally, Procedure Selection selects excellent solutions as the parental population P for the next round. When the ε · e evaluations have been used, the population P is treated as the initial population initP and optimized by any population-based MOA with the remaining (1 − ε) · e evaluations.
Algorithm 1: GAN-LMEF
Input: N (population size), σ (the proportion of interpolated solutions), MOA (a multiobjective optimization algorithm), F(·) (the LSMOP), k (the number of clusters), e (the maximum evaluations)
Output: POS (the POS)
  Set the parameter ε controlling the number of iterations of interpolation;
  Randomly initialize the population P;
  while used evaluations ≤ ε · e do
    P′ = Mating-Selection(P);
    P = P ∪ Variation(P′);
    F = Fast-Non-Dominated-Sort(P);
    C = Computing-Central-Solutions(F, k);
    Q = GAN-Manifold-Interpolation(C, P, k);
    P = Selection(C, Q, σ);
  end while
  initP = P;
  POS = MOA(initP, F(·), (1 − ε) · e);
  return POS;
E. Complexity Analysis
In this subsection, we present a time complexity analysis for each component of GAN-LMEF and give the overall time complexity.

For Procedure Computing Central Solutions, the k-means algorithm requires O(k · n · N), and calculating the central solutions of the clusters requires O(N), where k is the number of clusters and n and N are the dimension of the decision variables and the size of the population, respectively.

For Procedure Manifold Interpolation, the generator of the GAN contains a three-layer fully-connected neural network [36], and the discriminator contains a three-layer fully-connected neural network in which a sigmoid unit is utilized as the output layer. Fully connected neural networks can be regarded as special convolutional neural networks. For each convolutional layer, the time complexity is O(n_l · n_{l−1} · M_l · S_l) [37], where n_l is the number of output channels of the l-th layer, n_{l−1} is the number of input channels of the l-th layer, M_l is the size of the output feature map, and S_l is the length of the convolution kernel. A layer of a fully-connected neural network is a special convolutional layer whose S_l is equal to the size of the input data X_l and whose output feature map has size M_l = 1. Therefore, it takes O(Σ_{l=1}^{d} n_l · n_{l−1} · X_l) to execute the GAN once, where d is the depth of the neural networks. Interpolating solutions between the central solutions of the clusters requires executing the GAN k(k − 1)/2 times, so the total time complexity of executing the GAN in our work is O(k^2 · Σ_{l=1}^{d} n_l · n_{l−1} · X_l). The time complexity of Procedure Manifold Interpolation is therefore O(k · n · N) + O(N) + O(k^2 · Σ_{l=1}^{d} n_l · n_{l−1} · X_l) = O(k^2 · Σ_{l=1}^{d} n_l · n_{l−1} · X_l).

The time complexity of Procedure Selection is O(N · k). Because k is a user-defined constant, the overall time complexity of GAN-LMEF simplifies to O(Σ_{l=1}^{d} n_l · n_{l−1} · X_l).

IV. EXPERIMENTS

A. Algorithms in Comparison and Test Problems
The proposed GAN-LMEF is compared against several popular algorithms, including WOF [25], LSMOF [9], GMOEA [26], LMOCSO [38], and LMEA [21]. The parameters of WOF, LMEA, LSMOF, and LMOCSO are set to the defaults according to [39] or their references. For a fair comparison, the training parameters and the structure of the GAN in GMOEA are the same as ours. The basic solvers of the proposed GAN-LMEF, WOF [25], and LSMOF [9] are unified as NSGA2 [40], MOEADDE [41], and CMOPSO [42], respectively, and we rename the resulting algorithms according to the basic solver, e.g., GAN-NSGA2, WOF-NSGA2, and LSMOF-NSGA2.

The widely used large-scale multiobjective test problems LSMOP1–LSMOP9 [20] are employed in our experiments. For all competitive comparisons, the number of objectives is set to 2 and the number of decision variables is set from 500 to 2000. For the ablation study, the number of objectives varies from 2 to 6, and the number of decision variables is fixed at 500.
B. Performance Indicators
In this study, the following metrics are utilized to evaluate the performance of the compared algorithms.

1) Inverted Generational Distance (IGD) [43]: The IGD is a metric quantifying the convergence of the solutions obtained by a multiobjective optimization algorithm. The smaller the IGD value, the better the convergence of the solutions. IGD is defined as
  IGD(POF*, POF) = (1/n) Σ_{p*∈POF*} min_{p∈POF} || p* − p ||,   (11)

where POF* is the true POF of a multiobjective optimization problem, POF is the approximation of the POF obtained by a multiobjective optimization algorithm, and n is the number of solutions in POF*.

2) Schott's Spacing (SP) [44]: SP measures the uniformity of the solutions found by the algorithm. The smaller the SP value, the more uniform the distribution of the obtained solutions. It is calculated as

  SP = sqrt( (1/(n_{POF*} − 1)) Σ_{i=1}^{n_{POF*}} (E_i − Ē)^2 ),   (12)

where E_i denotes the Euclidean distance between the i-th solution in POF* and its closest solution, and Ē represents the mean of the E_i.
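Both indicators translate directly into a few lines of NumPy; the sketch below follows Equations (11) and (12), with function names of our choosing:

  import numpy as np

  def igd(pof_true: np.ndarray, pof_approx: np.ndarray) -> float:
      # Equation (11): mean distance from each reference point on the
      # true POF to its nearest obtained point; smaller is better.
      d = np.linalg.norm(pof_true[:, None, :] - pof_approx[None, :, :], axis=2)
      return float(d.min(axis=1).mean())

  def spacing(pof: np.ndarray) -> float:
      # Equation (12): spread of nearest-neighbor distances within the
      # set; smaller means a more uniform distribution.
      d = np.linalg.norm(pof[:, None, :] - pof[None, :, :], axis=2)
      np.fill_diagonal(d, np.inf)
      E = d.min(axis=1)              # E_i: distance to closest neighbor
      return float(np.sqrt(((E - E.mean()) ** 2).sum() / (len(pof) - 1)))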
C. Parameter Settings

1) Population Size: The population size N is set to 100 for all test instances.
2) Termination Condition: The maximum number of evaluations e is set to 100,000 for all compared LSMOAs.
3) Structure of the GAN: The numbers of nodes of the three-layer fully-connected neural network in the generator are set to n − 1, ⌈n × 0.5⌉, and n, respectively. The numbers of nodes of the fully-connected neural network in the discriminator are set to n, ⌈n × 0.5⌉, and 1, respectively.
4) Parameters of GAN-LMEF: The number of clusters k is set to 3. The proportion of introduced interpolated solutions σ is set to 0.4. The ε is set to 0.1. The learning rate of the GAN is set to 0.001, the number of epochs is set to 200, and the GAN is trained by full-batch learning.

D. Performance on LSMOP Problems
The statistical results of the average IGD and SP values over 10 runs can be found in Tables I and II, respectively. In the tables, (+), (=), and (−) indicate that GAN-NSGA2 is statistically significantly better than, indifferent from, or significantly worse than the compared algorithm, respectively, according to the Wilcoxon test [45] (0.05 significance level).

As can be observed from Table I, the proposed GAN-NSGA2 exhibits better convergence than the five compared algorithms. GAN-NSGA2 achieved 15 of the 27 best results, LSMOF-NSGA2 achieved 6 of the best results, GMOEA achieved 4 of the best results, and WOF-NSGA2 achieved 2 of the best results for IGD values. Specifically, GAN-NSGA2 performs better on LSMOP1, LSMOP4, and LSMOP9 under all decision variable settings but is slightly worse than LSMOF-NSGA2 on the LSMOP3 and LSMOP6 problems with 1000 and 2000 decision variables. GMOEA can obtain a set of well-converged solutions for the LSMOP7 problems with 1000 and 2000 decision variables. WOF-NSGA2 achieves better convergence over the POF on LSMOP7 and LSMOP8 with 500 decision variables.

Table II presents the SP values of the six compared algorithms. GAN-NSGA2 performs the best on 14 out of 27 test instances, followed by LMOCSO with 6 best results, LMEA with 4 best results, and WOF-NSGA2, LSMOF-NSGA2, and GMOEA with 1 best result each. Specifically, GAN-NSGA2 performs better on LSMOP1, LSMOP3, LSMOP5, and LSMOP9 under all decision variable settings. On the other test instances, GAN-LMEF falls slightly behind the corresponding best-performing algorithms.

The above experimental results suggest that GAN-NSGA2 can obtain a set of solutions with good convergence and diversity for most of the test problems. However, GAN-NSGA2 is worse in terms of IGD values on LSMOP3 and LSMOP6. This may be attributed to the fitness landscapes: LSMOP3 adopts Rosenbrock's function and Rastrigin's function, and LSMOP6 mixes Rosenbrock's function and Ackley's function [20]. The mix of separable and nonseparable functions leads to a complicated fitness landscape and increases the difficulty of learning characteristics, thus degrading the quality of the interpolated solutions.
E. Ablation Study
In this subsection, we perform ablation experiments to demonstrate the effectiveness of applying the proposed framework to NSGA2, MOEADDE, and CMOPSO. In addition, we embed the basic solvers into two popular frameworks for solving LSMOPs, WOF and LSMOF. Please note that NSGA2, MOEADDE, and CMOPSO alone are not suitable for solving LSMOPs. The statistics of the IGD values can be found in Table III.

As seen from Table III, GAN-NSGA2 (GAN-MOEADDE and GAN-CMOPSO) achieves significantly better IGD values than WOF-NSGA2 (WOF-MOEADDE and WOF-CMOPSO) and LSMOF-NSGA2 (LSMOF-MOEADDE and LSMOF-CMOPSO). Specifically, GAN-NSGA2 performs significantly better than WOF-NSGA2 on 16 out of 27 test functions and LSMOF-NSGA2 on 13 out of 27 test functions. GAN-MOEADDE has better IGD values than WOF-MOEADDE on 21 out of 27 test functions and LSMOF-MOEADDE on 19 out of 27 instances. GAN-CMOPSO obtains better IGD values than WOF-CMOPSO on 15 out of 27 instances and LSMOF-CMOPSO on 13 out of 27 functions. These pairwise comparisons demonstrate the ability of our proposed GAN-LMEF framework to improve the performance of existing MOAs on LSMOPs.

To verify the effectiveness of interpolation on the latent noise variables z, we develop an algorithm called Ip-LMEF that directly interpolates solutions between the decision variables of the central solutions rather than the latent noise variables. The piecewise linear interpolation method is used to generate solutions directly. The experimental results are shown in Table IV.

As seen from Table IV, GAN-NSGA2 achieves 7 of the 9 best IGD values, while Ip-NSGA2 achieves 2 of the best results. For the optimizer MOEADDE, GAN-MOEADDE achieves 6 of the best results. For the basic solver CMOPSO, GAN-CMOPSO performs significantly better than both Ip-CMOPSO and CMOPSO on 7 out of 9 instances. We also draw the convergence profiles of the IGD values obtained by GAN-NSGA2 and NSGA2 in Fig. 7; the convergence rate of GAN-NSGA2 is faster than that of NSGA2. As seen from these experimental results, incorporating the GAN into NSGA2 greatly improves the quality and convergence speed of solutions when solving LSMOPs. These results confirm the effectiveness of our proposed interpolation method: the GAN learns the mapping from the latent low-dimensional manifold space to the high-dimensional solutions, and interpolation of the latent variables z can generate high-quality solutions lying on the manifold.

F. Performance of Different Interpolation Strategies
To investigate the quality of solutions interpolated via the different interpolation strategies, we record the number of interpolated nondominated solutions of each kind selected by Procedure Selection during evolutionary optimization.
TABLE I
MEAN AND VARIANCE VALUES OF IGD METRIC OBTAINED BY COMPARED ALGORITHMS OVER LSMOP PROBLEMS
(Columns: Problems, Dec., GAN-NSGA2, WOF-NSGA2, LSMOF-NSGA2, GMOEA, LMOCSO, LMEA)
TABLE II
MEAN AND VARIANCE VALUES OF SP METRIC OBTAINED BY COMPARED ALGORITHMS OVER LSMOP PROBLEMS
(Columns: Problems, Dec., GAN-NSGA2, WOF-NSGA2, LSMOF-NSGA2, GMOEA, LMOCSO, LMEA)
TABLE III
MEAN AND VARIANCE VALUES OF IGD METRIC OBTAINED BY COMPARED ALGORITHMS OVER LSMOP PROBLEMS
(Columns: Problems, Dec., GAN-NSGA2, WOF-NSGA2, LSMOF-NSGA2, GAN-MOEADDE, WOF-MOEADDE, LSMOF-MOEADDE, GAN-CMOPSO, WOF-CMOPSO, LSMOF-CMOPSO)
Fig. 7. Convergence profiles of IGD values obtained by GAN-NSGA2 and NSGA2 on LSMOP problems with 500 decision variables.

TABLE IV
MEAN AND VARIANCE VALUES OF IGD METRIC OBTAINED BY COMPARED ALGORITHMS OVER LSMOP PROBLEMS WHEN THE NUMBER OF DECISION VARIABLES IS 500
(Columns: Problems, GAN-NSGA2, Ip-NSGA2, NSGA2, GAN-MOEADDE, Ip-MOEADDE, MOEADDE, GAN-CMOPSO, Ip-CMOPSO, CMOPSO)

(Fig. 8 panels: LSMOP1–LSMOP9; y-axis: number of nondominated solutions interpolated; legend: interpolate between clusters, interpolate in the cluster, perturbation interpolation)
Fig. 8. Number of selected nondominated solutions interpolated by different strategies on LSMOP problems with 500 decision variables.
As shown in Fig. 8, interpolating between central solutions can produce much better individuals than the perturbation interpolation method and the interpolating-in-the-cluster method. This indicates that the gaps between clusters are always huge in solving LSMOPs, and the population lacks diversity and convergence during the entire evolutionary optimization. Therefore, interpolating between central solutions can produce more promising solutions.
G. Sensitivity Study
In GAN-LMEF, the parameters k and ε decide the number of interpolated solutions and affect the performance. In this sensitivity study, the influence of k and ε on the convergence of GAN-NSGA2 is investigated. Table V shows the IGD values obtained by different choices of k and ε for GAN-NSGA2 on the LSMOP benchmark. When k increases from 2 to 3, the IGD values improve significantly on 12 out of 27 instances and degrade significantly on 7 out of 27 instances. When ε varies from 0.1 to 0.5, the IGD values of half of the instances become smaller, while those of the other half increase. Based on the experimental results, a larger k value can generate more piecewise manifolds, thus interpolating many more solutions into the gaps, while the choice of the parameter ε depends on the specific problem and needs to be set carefully.

TABLE V
MEAN AND VARIANCE VALUES OF IGD METRIC OBTAINED BY DIFFERENT ε AND k VALUES OF GAN-NSGA2 WHEN THE NUMBER OF DECISION VARIABLES IS 500

Problems | ε   | k = 2            | k = 3
LSMOP1   | 0.1 | 2.84e-1(4.13e-3) | 3.23e-1(2.27e-4)
LSMOP1   | 0.2 | 2.59e-1(1.71e-3) | 2.64e-1(2.27e-3)
LSMOP1   | 0.5 | 2.93e-1(1.81e-3) | 2.77e-1(1.65e-3)
LSMOP2   | 0.1 | 2.01e-2(7.81e-6) | 1.70e-2(8.71e-6)
LSMOP2   | 0.2 | 1.96e-2(6.00e-6) | 1.59e-2(6.19e-6)
LSMOP2   | 0.5 | 1.81e-2(1.86e-5) | 1.56e-2(2.26e-5)
LSMOP3   | 0.1 | 1.90e+0(1.04e-3) | 1.85e+0(2.51e-4)
LSMOP3   | 0.2 | 1.56e+0(3.45e-2) | 1.79e+0(3.72e-3)
LSMOP3   | 0.5 | 1.76e+0(1.18e-3) | 1.68e+0(4.10e-4)
LSMOP4   | 0.1 | 3.85e-2(2.17e-5) | 3.62e-2(3.82e-7)
LSMOP4   | 0.2 | 3.66e-2(2.35e-5) | 3.68e-2(1.57e-5)
LSMOP4   | 0.5 | 4.51e-2(1.40e-5) | 3.18e-2(1.42e-5)
LSMOP5   | 0.1 | 7.42e-1(1.63e-6) | 7.42e-1(3.17e-4)
LSMOP5   | 0.2 | 7.42e-1(5.83e-7) | 7.42e-1(2.56e-6)
LSMOP5   | 0.5 | 7.42e-1(2.80e-7) | 7.42e-1(2.76e-6)
LSMOP6   | 0.1 | 5.47e-1(9.47e-6) | 3.31e-1(1.36e-4)
LSMOP6   | 0.2 | 6.09e-1(3.65e-5) | 7.80e-1(3.82e-4)
LSMOP6   | 0.5 | 7.93e-1(4.02e-4) | 7.83e-1(2.75e-4)
LSMOP7   | 0.1 | 1.51e+0(4.30e-2) | 1.82e+0(8.07e-4)
LSMOP7   | 0.2 | 1.93e+0(6.78e-2) | 2.27e+0(2.68e-2)
LSMOP7   | 0.5 | 2.02e+0(2.71e-2) | 2.01e+0(1.75e-2)
LSMOP8   | 0.1 | 7.42e-1(1.50e-7) | 7.42e-1(3.17e-7)
LSMOP8   | 0.2 | 7.42e-1(2.14e-7) | 7.42e-1(1.13e-7)
LSMOP8   | 0.5 | 7.42e-1(2.52e-7) | 7.42e-1(1.12e-7)
LSMOP9   | 0.1 | 5.39e-1(2.15e-3) | 5.56e-1(9.51e-4)
LSMOP9   | 0.2 | 5.41e-1(3.00e-3) | 5.46e-1(8.94e-4)
LSMOP9   | 0.5 | 5.47e-1(3.34e-4) | 5.38e-1(4.35e-4)

Because the POS can be represented as an (m − 1)-dimensional manifold, the number of objectives m determines the complexity of the manifold and thus affects the performance of interpolation. Therefore, we conduct an experiment concerning m, and the experimental results in terms of the IGD metric are shown in Table VI. GAN-NSGA2 achieves better IGD values on 12 out of 27 instances. However, GAN-NSGA2 has more difficulty obtaining well-converged solutions with more objective functions because the manifold becomes more complicated and more difficult to learn.

TABLE VI
MEAN AND VARIANCE VALUES OF IGD METRIC OBTAINED BY COMPARED ALGORITHMS OVER LSMOP PROBLEMS WHEN THE NUMBER OF DECISION VARIABLES IS 500
(Columns: Problems, m, GAN-NSGA2, WOF-NSGA2, LSMOF-NSGA2, GMOEA, LMOCSO, LMEA)

V. CONCLUSION

This paper has presented a GAN-based evolutionary search framework for solving LSMOPs, called GAN-LMEF. The aim of GAN-LMEF is to interpolate solutions on the manifold via a GAN to maintain an effective manifold. Based on the generative population, higher-quality offspring are reproduced, thereby improving evolutionary performance. In the proposed GAN-LMEF, a GAN is employed to learn characteristics from nondominated solutions and generate a number of solutions by three different interpolation strategies. Then, a manifold selection mechanism selects promising solutions from the generative solutions for the next round. We integrated the proposed framework with NSGA2, MOEADDE, and CMOPSO to evaluate its performance. The experimental results have demonstrated that the proposed GAN-LMEF can better solve LSMOPs than several state-of-the-art LSMOAs.

This paper presents preliminary work, and several possible directions may be taken in the future. For large-scale problems with complicated search spaces, such as constraints and disconnected Pareto-optimal regions, the proposed framework still has much room for improvement. In addition, a rich body of transfer techniques, such as evolutionary transfer optimization [46]–[49] and advanced machine learning [50], can inspire further innovations for solving real-world applications with large-scale decision variables.

ACKNOWLEDGMENT

REFERENCES

[1] A. Ponsich, A. L. Jaimes, and C. A. C. Coello, "A survey on multiobjective evolutionary algorithms for the solution of the portfolio optimization problem and other finance and economics applications," IEEE Transactions on Evolutionary Computation, vol. 17, no. 3, pp. 321–344, June 2013.
[2] Z. P. Stanko, T. Nishikawa, and S. R. Paulinski, "Large-scale multi-objective optimization for the management of seawater intrusion, Santa Barbara, CA," in AGU Fall Meeting, 2015.
[3] Z. Zhao, M. Jiang, S. Guo, Z. Wang, F. Chao, and K. C. Tan, "Improving deep learning based optical character recognition via neural architecture search," in 2020 International Joint Conference on Neural Networks (IJCNN), 2020, pp. 1–7.
[4] J. Cheney, "The application of optimisation methods to the design of large scale telecommunication networks," in IEE Colloquium on Large-Scale and Hierarchical Systems, March 1988, pp. 2/1–2/2.
[5] H. Wang, L. Jiao, R. Shang, S. He, and F. Liu, "A memetic optimization strategy based on dimension reduction in decision space," Evolutionary Computation, vol. 23, no. 1, pp. 69–100, 2015.
[6] X. Ma, F. Liu, Y. Qi, X. Wang, L. Li, L. Jiao, M. Yin, and M. Gong, "A multiobjective evolutionary algorithm based on decision variable analyses for multiobjective optimization problems with large-scale variables," IEEE Transactions on Evolutionary Computation, vol. 20, no. 2, pp. 275–298, 2016.
[7] L. M. Antonio and C. A. C. Coello, "Use of cooperative coevolution for solving large scale multiobjective optimization problems," in 2013 IEEE Congress on Evolutionary Computation, June 2013, pp. 2758–2765.
[8] H. Hong, K. Ye, M. Jiang, and K. Tan, "Solving large-scale multiobjective optimization via probabilistic prediction model," 2021.
[9] C. He, L. Li, Y. Tian, X. Zhang, R. Cheng, Y. Jin, and X. Yao, "Accelerating large-scale multi-objective optimization via problem reformulation," IEEE Transactions on Evolutionary Computation, pp. 1–1, 2019.
[10] C. Hillermeier, Nonlinear Multiobjective Optimization: A Generalized Homotopy Approach. Birkhäuser, 2001.
[11] S. Mardle and K. M. Miettinen, "Nonlinear multiobjective optimization," Journal of the Operational Research Society, vol. 51, no. 2, p. 246, 1999.
[12] Q. Zhang, A. Zhou, and Y. Jin, "RM-MEDA: A regularity model-based multiobjective estimation of distribution algorithm," IEEE Transactions on Evolutionary Computation, vol. 12, no. 1, pp. 41–63, Feb 2008.
[13] M. Jiang, Z. Wang, L. Qiu, S. Guo, X. Gao, and K. C. Tan, "A fast dynamic evolutionary multiobjective algorithm via manifold transfer learning," IEEE Transactions on Cybernetics, 2020.
[14] J. Zhang, S. Li, M. Jiang, and K. C. Tan, "Learning from weakly labeled data based on manifold regularized sparse model," IEEE Transactions on Cybernetics, pp. 1–14, 2020.
[15] D. P. Kingma, S. Mohamed, D. Jimenez Rezende, and M. Welling, "Semi-supervised learning with deep generative models," in Advances in Neural Information Processing Systems 27. Curran Associates, Inc., 2014, pp. 3581–3589.
[16] M. Khayatkhoei, M. K. Singh, and A. Elgammal, "Disconnected manifold learning for generative adversarial networks," in Advances in Neural Information Processing Systems 31. Curran Associates, Inc., 2018, pp. 7343–7353.
[17] C. Wang, C. Xu, X. Yao, and D. Tao, "Evolutionary generative adversarial networks," IEEE Transactions on Evolutionary Computation, pp. 1–1, 2019.
[18] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Advances in Neural Information Processing Systems 27. Curran Associates, Inc., 2014, pp. 2672–2680.
[19] Y. Tian, X. Zheng, X. Zhang, and Y. Jin, "Efficient large-scale multiobjective optimization based on a competitive swarm optimizer," IEEE Transactions on Cybernetics, pp. 1–13, 2019.
[20] R. Cheng, Y. Jin, M. Olhofer, and B. Sendhoff, "Test problems for large-scale multiobjective and many-objective optimization," IEEE Transactions on Cybernetics, vol. 47, no. 12, pp. 4108–4121, 2017.
[21] X. Zhang, Y. Tian, R. Cheng, and Y. Jin, "A decision variable clustering-based evolutionary algorithm for large-scale many-objective optimization," IEEE Transactions on Evolutionary Computation, vol. 22, no. 1, pp. 97–112, Feb 2018.
[22] Z. Yang, K. Tang, and X. Yao, "Large scale evolutionary optimization using cooperative coevolution," Information Sciences, vol. 178, no. 15, pp. 2985–2999, 2008.
[23] R. Shang, K. Dai, L. Jiao, and R. Stolkin, "Improved memetic algorithm based on route distance grouping for multiobjective large scale capacitated arc routing problems," IEEE Transactions on Cybernetics, vol. 46, no. 4, pp. 1000–1013, April 2016.
[24] F. van den Bergh and A. P. Engelbrecht, "A cooperative approach to particle swarm optimization," IEEE Transactions on Evolutionary Computation, vol. 8, no. 3, pp. 225–239, 2004.
[25] H. Zille, H. Ishibuchi, S. Mostaghim, and Y. Nojima, "A framework for large-scale multiobjective optimization based on problem transformation," IEEE Transactions on Evolutionary Computation, vol. 22, no. 2, pp. 260–275, 2017.
[26] C. He, S. Huang, R. Cheng, K. C. Tan, and Y. Jin, "Evolutionary multiobjective optimization driven by generative adversarial networks (GANs)," IEEE Transactions on Cybernetics, 2020.
[27] H.-W. Dong, W.-Y. Hsiao, L.-C. Yang, and Y.-H. Yang, "MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment," in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[28] F. Chao, J. Lv, D. Zhou, L. Yang, C.-M. Lin, C. Shang, and C. Zhou, "Generative adversarial nets in robotic Chinese calligraphy," IEEE, 2018, pp. 1104–1110.
[29] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville, "Improved training of Wasserstein GANs," in Advances in Neural Information Processing Systems 30. Curran Associates, Inc., 2017, pp. 5767–5777.
[30] M. N. Do and M. Vetterli, "Wavelet-based texture retrieval using generalized Gaussian density and Kullback–Leibler distance," IEEE Transactions on Image Processing, vol. 11, no. 2, pp. 146–158, 2002.
[31] J. Lin, "Divergence measures based on the Shannon entropy," IEEE Transactions on Information Theory, vol. 37, no. 1, pp. 145–151, 1991.
[32] T. Zhang, "Solving large scale linear prediction problems using stochastic gradient descent algorithms," in Proceedings of the Twenty-First International Conference on Machine Learning. ACM, 2004, p. 116.
[33] J. A. Hartigan and M. A. Wong, "Algorithm AS 136: A k-means clustering algorithm," Journal of the Royal Statistical Society, Series C (Applied Statistics), vol. 28, no. 1, pp. 100–108, 1979.
[34] A. Radford, L. Metz, and S. Chintala, "Unsupervised representation learning with deep convolutional generative adversarial networks," arXiv preprint arXiv:1511.06434, 2015.
[35] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, "A fast and elitist multiobjective genetic algorithm: NSGA-II," IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, April 2002.
[36] T. N. Sainath, O. Vinyals, A. Senior, and H. Sak, "Convolutional, long short-term memory, fully connected deep neural networks," in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015, pp. 4580–4584.
[37] K. He and J. Sun, "Convolutional neural networks at constrained time cost," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015, pp. 5353–5360.
[38] Y. Tian, X. Zheng, X. Zhang, and Y. Jin, "Efficient large-scale multiobjective optimization based on a competitive swarm optimizer," IEEE Transactions on Cybernetics, pp. 1–13, 2019.
[39] Y. Tian, R. Cheng, X. Zhang, and Y. Jin, "PlatEMO: A MATLAB platform for evolutionary multi-objective optimization [educational forum]," IEEE Computational Intelligence Magazine, vol. 12, no. 4, pp. 73–87, Nov 2017.
[40] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, "A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II," in Parallel Problem Solving from Nature PPSN VI. Berlin, Heidelberg: Springer, 2000, pp. 849–858.
[41] H. Li and Q. Zhang, "Multiobjective optimization problems with complicated Pareto sets, MOEA/D and NSGA-II," IEEE Transactions on Evolutionary Computation, vol. 13, no. 2, pp. 284–302, 2009.
[42] X. Zhang, X. Zheng, R. Cheng, J. Qiu, and Y. Jin, "A competitive mechanism based multi-objective particle swarm optimizer with fast convergence," Information Sciences, vol. 427, pp. 63–76, 2018.
[43] E. Zitzler, L. Thiele, M. Laumanns, C. M. Fonseca, and V. G. da Fonseca, "Performance assessment of multiobjective optimizers: an analysis and review," IEEE Transactions on Evolutionary Computation, vol. 7, no. 2, pp. 117–132, April 2003.
[44] G. Ismayilov and H. R. Topcuoglu, "Dynamic multi-objective workflow scheduling for cloud computing based on evolutionary algorithms," Dec 2018, pp. 103–108.
[45] J. Derrac, S. García, D. Molina, and F. Herrera, "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3–18, 2011.
[46] A. Gupta, Y. Ong, and L. Feng, "Insights on transfer optimization: Because experience is the best teacher," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 51–64, Feb 2018.
[47] M. Jiang, Z. Wang, H. Hong, and G. G. Yen, "Knee point based imbalanced transfer learning for dynamic multi-objective optimization," IEEE Transactions on Evolutionary Computation, pp. 1–1, 2020.
[48] M. Jiang, Z. Huang, L. Qiu, W. Huang, and G. G. Yen, "Transfer learning-based dynamic multiobjective optimization algorithms," IEEE Transactions on Evolutionary Computation, vol. 22, no. 4, pp. 501–514, 2018.
[49] M. Jiang, Z. Wang, S. Guo, X. Gao, and K. C. Tan, "Individual-based transfer learning for dynamic multiobjective optimization," IEEE Transactions on Cybernetics, pp. 1–14, 2020.
[50] G. Chi, M. Jiang, X. Gao, W. Hu, S. Guo, and K. C. Tan, "Online bagging for anytime transfer learning," in 2019 IEEE Symposium Series on Computational Intelligence (SSCI).