Data-driven topology design using a deep generative model

Abstract

In this paper, we propose a sensitivity-free and multi-objective structural design methodology called data-driven topology design. It is designed to obtain high-performance material distributions from initially given material distributions in a given design domain. Its basic idea is to iterate the following processes: (i) selecting material distributions from a dataset of material distributions according to eliteness, (ii) generating new material distributions using a deep generative model trained with the selected elite material distributions, and (iii) merging the generated material distributions with the dataset. Because of the nature of a deep generative model, the generated material distributions are diverse and inherit features of the training data, that is, the elite material distributions. Therefore, it is expected that some of the generated material distributions are superior to the current elite material distributions, and that merging the generated material distributions with the dataset improves the performances of the newly selected elite material distributions. The performances are further improved by iterating the above processes. The usefulness of data-driven topology design is demonstrated through numerical examples.

Shintaro Yamasaki a,∗, Kentaro Yaji a, Kikuo Fujita a
a Department of Mechanical Engineering, Graduate School of Engineering, Osaka University, 2-1 Yamadaoka, Suita 565-0871, Japan

June 9, 2020

Abstract

In this paper, we propose a structural design methodology called data-driven topology design, which aims to obtain high-performance material distributions for a multi-objective optimization problem from the initially given material distributions in a given design domain. Its basic idea is to iterate the following processes: (i) selecting the material distributions from a dataset according to Pareto optimality, (ii) generating new material distributions using a deep generative model with the selected material distributions as the training data, and (iii) integrating the generated material distributions into the dataset. Because of the nature of a deep generative model, the generated material distributions are diverse and inherit features of the training data, which are material distributions on the Pareto front at that specific point. Therefore, it is expected that some of the generated material distributions are superior to the training data, whereas some are inferior, and the Pareto front is improved by integrating the generated material distributions into the dataset. The Pareto front is further improved by iterating the above processes. Data-driven topology design is used to enhance a support system for determining appropriate formulations of topology optimization problems, and its usefulness is demonstrated through numerical examples.

Keywords

Data-driven design · Topology optimization · Deep generative model · Formulation support system · Estimation of distribution algorithm

Structural design is to determine the structural shape and topology of artifacts on the basis of physics, mathematics, designer intuition, and so on. Among previously proposed methodologies for structural design, topology optimization, originated by Bendsøe and Kikuchi (1988), is a promising one because of its potential to yield high-performance structures while considering both the shape and topology.

There are two basic concepts in topology optimization. One is replacing a structural design problem with a material distribution problem in a given design domain. The other is exploiting the optimal, or at least locally optimal, material distribution using mathematical programming under a given objective function and constraints, i.e., a given formulation.

Topology optimization has been applied to various engineering problems and has achieved immense success because of its versatility. Nevertheless, it has an intrinsic difficulty, as pointed out by Yamasaki et al. (2019): it is often difficult for designers to determine appropriate formulations from design problems that are ambiguously described. This is because there are often tacit constraints in such design problems, and it is difficult to articulate them without trial and error.

To tackle this intrinsic difficulty, Yamasaki et al. (2019) proposed a support system for formulating topology optimization problems. This formulation support system has a database constructed by collecting data of material distributions, which were obtained by solving various topology optimization problems, and it provides useful knowledge for determining appropriate formulations on the basis of knowledge discovery in databases (Fayyad et al., 1996; Tsai et al., 2014; Adhikari and Adhikari, 2015).

∗ Corresponding author: Shintaro Yamasaki

More specifically, a user inputs multiple functions as candidates of the objective and constraint functions (hereafter called candidate functions) into the formulation support system, and the system then outputs material distributions having Pareto optimality from the database. By checking the outputs, the user decides whether the set of input candidate functions is appropriate or not. This process is repeated until an appropriate set is determined. In this way, the trial and error for determining an appropriate formulation is supported.

Their study is the first attempt to tackle the intrinsic difficulty described above, and some issues remain. One major issue is the diversity of the material distributions in the database. When the diversity is insufficient, the formulation support system will output material distributions that may seem unusual for those having Pareto optimality, even if the set of input candidate functions is appropriate. This is not preferable, because the user may consider that such unusual material distributions are output owing to the inappropriateness of the input set. The formulation support system should output reasonable material distributions; at minimum, it should be clear to the user why such material distributions are suitable to the input set.

Regarding the above issue, the utilization of deep generative models (Kingma and Welling, 2013; Goodfellow et al., 2014) is promising. They can generate material distributions using the outputs of the formulation support system as the training data. Because of their generative nature, the generated material distributions are diverse and inherit features of the original material distributions, which are the current Pareto optimal solutions to the set of input candidate functions.
Therefore, it is expected that some of the generated material distributions will be superior to the original material distributions, and that the Pareto front will be improved by integrating the generated material distributions into the original material distributions. In addition, the Pareto front will be further improved by iterating the above processes, and as a result, it is expected that unusual material distributions in the Pareto front will be suppressed.

On the basis of the above idea, in this paper, we propose to iteratively conduct the following processes on the database of the formulation support system: generating material distributions using a deep generative model from the Pareto optimal material distributions in the database, and integrating the generated material distributions into the database. We call this structural design methodology data-driven topology design. The essence of this methodology is to improve the structural shape and topology by iterating the data generation using a deep generative model and the data evaluation based on Pareto optimality.

Many deep generative models have recently been proposed, and variational autoencoders (VAEs) (Kingma and Welling, 2013) and generative adversarial networks (GANs) (Goodfellow et al., 2014) are representative. When compared to a GAN, a VAE is suitable for data-driven topology design because its neural network architecture is relatively simple and a VAE is therefore robust (Atienza, 2018). This robustness is particularly important because we train the neural network many times while updating the training data. We therefore adopt a VAE as a deep generative model for implementation and demonstrate that the formulation support system is enhanced by incorporating the implemented method.

The rest of this paper is organized as follows. We briefly introduce related studies in Section 2 and describe the overall procedure in Section 3.
Next, we detail its implementation in Section 4 and provide numerical examples in Section 5. Finally, we provide some concluding remarks in Section 6.

Data-driven approaches based on deep learning have recently gained significant attention from researchers in various fields, and some studies incorporating them into topology optimization have been proposed. Ulu et al. (2014) proposed to predict optimized material distributions of the minimum compliance problem using a neural network. In their study, various optimized material distributions were prepared using topology optimization while changing the load boundary condition. The network is then trained with the load boundary condition as the input and the corresponding optimized structure as the output. Using the trained network, the optimized material distribution for a given load boundary condition is predicted.

Zhang et al. (2019b) also proposed to predict the optimized material distributions of the minimum compliance problem using a neural network. In their study, the displacement and strain fields of the initial material distributions are used as the input, the corresponding optimized material distributions are used as the output, and the neural network is trained using the input and output data. When an initial material distribution and its displacement and strain fields are given, the optimized material distribution is predicted using the trained network. They demonstrated that their proposed method covers a change in the location where the displacement fixed boundary condition is imposed, in addition to the load boundary condition.

Similar to the above studies, Yu et al. (2019) proposed a prediction method for the minimum compliance problem. Optimized material distributions are predicted through two steps in their study. First, an optimized material distribution under a given boundary condition is predicted in a low-resolution mesh, such as that described in the studies of Ulu et al. (2014) and Zhang et al. (2019b).
Next, the predicted material distribution is refined in a high-resolution mesh using a conditional GAN (Mirza and Osindero, 2014).

Sasaki and Igarashi (2019) proposed a topology optimization method for a structural design problem of inner permanent magnet motors (IPMs). In their study, quasi-optimal material distributions are exploited using a genetic algorithm (GA). Although topology optimization incorporating a GA is generally time-consuming, they reduced the computational costs by utilizing a neural network that predicts the performances of IPMs.

Although these studies utilized deep learning for regression, some studies have focused on deep-learning-based generative models. Oh et al. (2019) proposed a topology optimization method for a wheel design problem in which the diversity of the optimized material distributions is ensured by referring to the material distributions generated by a GAN. They also used an autoencoder (Hinton and Salakhutdinov, 2006) to evaluate the novelty of the optimized material distributions.

Guo et al. (2018) proposed a structural design method for the thermal compliance minimization problem, which consists of two steps. First, a VAE is trained using various material distributions, which are obtained using topology optimization while changing the boundary conditions. Next, the latent space of the trained VAE is exploited using a GA, and as a result of the exploitation, quasi-optimal material distributions are obtained. In addition, a style transfer network (Gatys et al., 2016) is used to reduce noise included in the material distributions generated by the VAE.

Zhang et al. (2019a) proposed a structural design method for the three-dimensional shape of a glider. In their study, a VAE is trained using airplane models registered in a three-dimensional structure database (Wu et al., 2015), and the latent space of the trained VAE is exploited using a GA in a similar manner as the study of Guo et al.
(2018).

Data-driven topology design may seem to be similar to the above studies, particularly the studies of Oh et al. (2019), Guo et al. (2018), and Zhang et al. (2019a). However, its novelty can be clearly explained using the estimation of distribution algorithm (EDA) (Larrañaga and Lozano, 2001). Therefore, we introduce the EDA in the next section.

Because of its generative nature for structures, data-driven topology design may seem to be an image-based GA in which only elite individuals are selected. Indeed, it can be regarded as an EDA, which is a type of GA, on the basis of the following two points: (i) probabilistic models are constructed with elite individuals, and new individuals are generated using these models, and (ii) this generative process is iteratively performed. Recently, Garciarena et al. (2018) and Bhattacharjee and Gras (2019) proposed adopting a VAE as the probabilistic model of an EDA, although their targets are well-studied test problems in the field of GAs rather than structural design problems. The EDAs incorporating a VAE work well in their studies, and this fact supports the validity of data-driven topology design.

Whereas the initial individuals are randomly generated in numerous studies on EDAs, the initial material distributions are given according to a certain guideline in data-driven topology design. That is, the outputs of the formulation support system are used as the initial material distributions. This is an important distinction between many studies conducted on EDAs and data-driven topology design. Because the latter deals with structural design problems having an extremely large number of design variables (typically, several thousands or more), it is difficult to prepare suitable initial material distributions using a random number generator. Therefore, the guideline for the initial material distributions plays an important role in data-driven topology design.
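The two points that characterize an EDA can be illustrated with a toy loop. The sketch below is only an illustration under stated assumptions: it replaces the VAE with a diagonal Gaussian model, handles a single objective instead of Pareto optimality, and the function name `toy_eda` and its parameters are hypothetical. As in data-driven topology design, the initial population is supplied by the caller rather than drawn inside the loop.

```python
import numpy as np

def toy_eda(objective, init_pop, n_iter=20, elite_frac=0.2, n_new=50, seed=0):
    """Minimal estimation-of-distribution loop (single objective, minimization)."""
    rng = np.random.default_rng(seed)
    pop = np.asarray(init_pop, dtype=float)
    history = []  # best objective value seen at the start of each iteration
    for _ in range(n_iter):
        scores = np.array([objective(x) for x in pop])
        history.append(scores.min())
        # (i) select the elite individuals
        n_elite = max(2, int(elite_frac * len(pop)))
        elite = pop[np.argsort(scores)[:n_elite]]
        # (ii) fit a probabilistic model to the elites and sample from it
        #      (a diagonal Gaussian here; the paper uses a VAE instead)
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6
        new = rng.normal(mean, std, size=(n_new, pop.shape[1]))
        # (iii) merge: elites are carried over, so the best value never worsens
        pop = np.vstack([elite, new])
    scores = np.array([objective(x) for x in pop])
    history.append(scores.min())
    return pop[np.argmin(scores)], history
```

Because the elites are always carried into the next population, the recorded best value is non-increasing by construction, which mirrors the role of the integrated data in the procedure of Section 3.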

As discussed in Section 2.2, data-driven topology design is novel in terms of its application to structural design and its guideline for the initial individuals, when compared to previously proposed EDAs incorporating a deep generative model (Garciarena et al., 2018; Bhattacharjee and Gras, 2019).

Furthermore, data-driven topology design can be clearly distinguished from the studies of Oh et al. (2019), Guo et al. (2018), and Zhang et al. (2019a) from the viewpoint of an EDA. That is, the former can be regarded as a type of EDA, whereas the latter cannot. This is because a deep generative model is trained only by material distributions having Pareto optimality in the former, whereas various material distributions are used for training in the latter. This is a critically important difference for our purposes, and we investigate the results caused by such a difference in Section 5.

Figure 1: Data process flow of data-driven topology design

In this section, we describe the overall procedure of data-driven topology design. It is designed to obtain suitable material distributions for a given design problem, which is defined by the shape of the design domain, boundary conditions, and multiple objective functions. The data process flow starts from the preparation of the material distributions in the design domain, which are labeled the original data. Because data-driven topology design is used to enhance the formulation support system, the multiple objective functions correspond to the candidate functions, and the original data are prepared as described in the study of Yamasaki et al. (2019).

After preparing the original data, the data are processed as follows, as indicated in Fig. 1:

Step 1

Evaluate the performances of the material distributions in the original data by computing the objective function values of these material distributions. Here, the data including the performance values are labeled the integrated data because they will be iteratively integrated with the generated data (see Step 6).

Step 2

Select the material distributions having Pareto optimality from the integrated data. The selected data are labeled as the Pareto optimal data. Meanwhile, the integrated data are stored for integration with the generated data.

Step 3

Judge whether the Pareto optimal data satisfy the convergence criteria. If so, the current Pareto optimal data are output as the final results. Otherwise, the material distributions of the Pareto optimal data are converted to conform to a normalized reference domain, which is a 1 × × ×

Step 4

Train a deep generative model using the material distributions of the Pareto optimal data, and newly generate the material distributions using the trained deep generative model. These material distributions are labeled as the generated data.

Figure 2: Example of material distribution conversion using DDM: (a) a material distribution in the design domain, and (b) the converted material distribution conforming to the reference domain, where the material and void are shown in black and white, respectively

Step 5

Inversely convert the material distributions of the generated data to conform to the design domain, using the DDM.

Step 6

Evaluate the performances of the material distributions of the generated data, in the same manner as Step 1. The generated data including the performance values are integrated into the integrated data, and we return to Step 2.

Through the above iterative procedure, we aim to obtain Pareto optimal data consisting of high-performance material distributions.
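The data flow of Steps 1 to 6 can be sketched as follows. This is a simplified illustration, not the authors' implementation: the DDM conversions of Steps 3 and 5 and the convergence test are omitted, all objectives are assumed to be minimized, and `evaluate` and `generate` are hypothetical callables standing in for the performance evaluation and the trained deep generative model.

```python
import numpy as np

def pareto_mask(objs):
    """Boolean mask of non-dominated rows; every objective is minimized."""
    objs = np.asarray(objs, dtype=float)
    mask = np.ones(len(objs), dtype=bool)
    for i in range(len(objs)):
        # Row j dominates row i if it is no worse everywhere and better somewhere.
        no_worse = np.all(objs <= objs[i], axis=1)
        better = np.any(objs < objs[i], axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask

def data_driven_loop(designs, evaluate, generate, n_iter=5):
    """Steps 1, 2, 4 and 6 for a fixed number of iterations."""
    designs = np.asarray(designs, dtype=float)
    objs = np.array([evaluate(d) for d in designs])        # Step 1: integrated data
    for _ in range(n_iter):
        elite = designs[pareto_mask(objs)]                  # Step 2: Pareto optimal data
        new = np.asarray(generate(elite), dtype=float)      # Step 4: generated data
        new_objs = np.array([evaluate(d) for d in new])     # Step 6: evaluate ...
        designs = np.vstack([designs, new])                 # ... and integrate
        objs = np.vstack([objs, new_objs])
    keep = pareto_mask(objs)
    return designs[keep], objs[keep]
```

The pairwise dominance check costs O(n²) comparisons per selection, which is adequate for datasets of a few thousand material distributions.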

As described in Section 1, in this study, data-driven topology design is implemented using a VAE. Regarding the useof the VAE, some important implementation details are described in the following.

In data-driven topology design, we use two domains, i.e., the design and reference domains, as described in Section 3. In the design domain $D$, the material distributions are represented using the density function $\rho(\mathbf{x})$, where $\mathbf{x}$ are the coordinates of an arbitrary point in $D$. $\rho(\mathbf{x})$ is continuous and takes a value from 0 to 1, where $\rho(\mathbf{x}) = 0$, $0 < \rho(\mathbf{x}) < 1$, and $\rho(\mathbf{x}) = 1$ represent the void, the intermediate state, and the material, respectively. Similarly, the material distributions are represented using $\rho(\boldsymbol{\xi})$ in the reference domain $\bar{D}$, where $\boldsymbol{\xi}$ are the coordinates of an arbitrary point in $\bar{D}$.

When using the above representation model, we must consider preferable features of the training data for the VAE. In conventional density-based topology optimization, it is necessary to reduce the intermediate state while maintaining the smoothness of the material distribution. From this perspective, the material distributions in Fig. 2, for example, are preferable. By contrast, it is thought that the intermediate state has a positive effect when training the VAE because it provides information regarding the outline of the structure. In fact, MNIST (Deng, 2012), one of the most important datasets in the field of deep learning, consists of tens of thousands of grayscale images of handwritten digits.

Therefore, we blur the outline in the reference domain $\bar{D}$ as follows. First, we compute a scalar function $\phi(\boldsymbol{\xi})$ as

$$\phi(\boldsymbol{\xi}) = \rho(\boldsymbol{\xi}) - 0.5. \tag{1}$$

Next, we give $\phi(\boldsymbol{\xi})$ the signed distance characteristic to the iso-contour of $\phi(\boldsymbol{\xi}) = 0$, using a geometry-based re-initialization scheme (Yamasaki et al., 2010). Finally, we update $\rho(\boldsymbol{\xi})$ using the following equation:

$$\rho(\boldsymbol{\xi}) = \begin{cases} 0 & (\phi(\boldsymbol{\xi}) < -h) \\ H(\phi(\boldsymbol{\xi})) & (-h \le \phi(\boldsymbol{\xi}) \le h) \\ 1 & (h < \phi(\boldsymbol{\xi})) \end{cases} \tag{2}$$

where $h$ is the parameter for the bandwidth of the transition zone from the void to the material, and $H(\phi)$ is given as follows:

$$H(\phi) = \frac{1}{2} + \frac{15}{16}\left(\frac{\phi}{h}\right) - \frac{5}{8}\left(\frac{\phi}{h}\right)^{3} + \frac{3}{16}\left(\frac{\phi}{h}\right)^{5}. \tag{3}$$

This process is a type of normalization of the material distribution; as an example, the material distribution in Fig. 2b is processed as shown in Fig. 3a by setting h to 0 . h to a small value. We set $h$ to 0.08 for normalization in the reference domain because the smoothness of the material distributions generated by the VAE was improved by this setting in a preliminary study.

Figure 3: Example of material distributions including wide transition zones: (a) material distribution normalized from that in Fig. 2(b), and (b) material distribution in the design domain, which is inversely converted from that in (a)

Figure 4: Architecture of VAE
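The density update of Eqs. (2) and (3) can be sketched as below. Two assumptions are made here and should be checked against the published paper: the polynomial coefficients of $H$ are taken to be those of the standard quintic that satisfies $H(-h) = 0$, $H(h) = 1$ and has vanishing first and second derivatives at $\phi = \pm h$, and the input `phi` is assumed to already be a signed distance field (the geometry-based re-initialization scheme is not implemented). The function names are hypothetical.

```python
import numpy as np

def smoothed_heaviside(phi, h):
    """Quintic transition polynomial H(phi), intended for -h <= phi <= h."""
    s = phi / h
    return 0.5 + (15.0 / 16.0) * s - (5.0 / 8.0) * s**3 + (3.0 / 16.0) * s**5

def normalize_density(phi, h=0.08):
    """Map a signed distance field to a density in [0, 1] as in Eq. (2)."""
    phi = np.asarray(phi, dtype=float)
    # H(-h) = 0 and H(h) = 1, so clipping phi to [-h, h] reproduces the
    # void (0) and material (1) branches of the piecewise definition.
    return smoothed_heaviside(np.clip(phi, -h, h), h)
```

A larger `h` widens the grayscale transition zone around the structural boundary, whereas a small `h` yields a nearly binary material distribution.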

Figure 4 shows the architecture of the VAE used in the numerical examples of Section 5. As shown in the figure, this is a type of multilayer perceptron including two hidden layers. The reference domain $\bar{D}$ is discretized with a 50 × 50 lattice, and the material distributions in $\bar{D}$ are represented using the values of the density function $\rho(\boldsymbol{\xi})$ at the lattice points. Therefore, the input layer has 2,601 = 51 × 51 neurons. This input layer is fully connected to a hidden layer having 1,700 neurons.

After activating these neurons using the ReLU function, this layer is also fully connected to two layers having 2 neurons each, one corresponding to $\boldsymbol{\mu}$, which is the mean value vector of the latent variables $\mathbf{z}$, and the other corresponding to $\log(\boldsymbol{\sigma} \circ \boldsymbol{\sigma})$, where $\boldsymbol{\sigma}$ is the standard deviation vector of $\mathbf{z}$, and $\circ$ represents the element-wise product. We then obtain the latent variables $\mathbf{z}$ as follows:

$$\mathbf{z} = \boldsymbol{\mu} + \boldsymbol{\sigma} \circ \boldsymbol{\varepsilon}, \tag{4}$$

where $\boldsymbol{\varepsilon}$ is a random vector following the standard normal distribution.

The layer of the latent variables $\mathbf{z}$ is further fully connected to a hidden layer having 1,700 neurons. After activating these neurons using the ReLU function, this layer is fully connected to the output layer having 2,601 neurons, and output data of size 51 × 51 are obtained after the sigmoid activation. The output data are interpreted as material distributions in $\bar{D}$ in the same manner as the input data. Note that the architecture described in this section is for two-dimensional material distribution problems because we focus solely on two-dimensional problems in this paper; however, there are no technical limitations in extending the architecture to three-dimensional problems.

The VAE having the above architecture is trained using the material distributions of the Pareto optimal data as the input and output data, and the latent space composed of the latent variables is constructed through the training. In more detail, the training is conducted by minimizing the following loss function $L$ using the Adam optimizer (Kingma and Ba, 2014):

$$L := L_{\mathrm{recon}} + L_{\mathrm{KL}}, \tag{5}$$

where $L_{\mathrm{recon}}$ is the mean value of the reconstruction loss measured by the mean-squared error, and $L_{\mathrm{KL}}$ is a term corresponding to the Kullback-Leibler divergence. $L_{\mathrm{KL}}$ is computed as follows:

$$L_{\mathrm{KL}} = -\frac{1}{2 N_{\mathrm{mt}}} \sum_{j=1}^{N_{\mathrm{mt}}} \sum_{i=1}^{N_{\mathrm{lt}}} \left( 1 + \log\left(\sigma_{i,j}^{2}\right) - \mu_{i,j}^{2} - \sigma_{i,j}^{2} \right), \tag{6}$$

where $\mu_{i,j}$ and $\sigma_{i,j}$ are the $i$-th components of $\boldsymbol{\mu}$ and $\boldsymbol{\sigma}$ in the $j$-th material distribution, respectively. $N_{\mathrm{mt}}$ and $N_{\mathrm{lt}}$ are the number of material distributions and the size of the latent space, respectively.

Because the dimensionality is drastically compressed from the input and output layers into a two-dimensional latent space, it is expected that important features of the training data are extracted into this space. Furthermore, the range of the latent space that we should focus on is restricted because the latent variables corresponding to the training data do not take extremely large or small values according to the probability distribution $N(0, 1)$.

On the basis of the above discussion, we generate material distributions by uniformly sampling in the latent space; the sampling range is [ − , + ] for each component of $\mathbf{z}$, and the number of samples is 20 × 20. Thus, we obtain material distributions that are diverse and inherit important features of the training data.

As another notable issue, we did not prepare validation data in this paper because the number of the Pareto optimal data is extremely small in early iterations of the numerical examples (less than 100) and the number of training data further decreases if we prepare validation data. In addition, an appropriate strategy for dividing the Pareto optimal data into the training and validation data remains unclear in such a situation. Therefore, we simply train the VAE with the epoch number 600 in the numerical examples. As for other parameters, the mini batch size and the learning rate are set to 10 and 1 × − , respectively. These parameters were determined through a preliminary study.

It is not preferable to train a VAE using a dataset in which some of the material distributions are extremely similar (or perfectly identical) whereas other material distributions are unique. In such a case, the training result will be biased to the former. Therefore, it is necessary to thin out the material distributions according to the similarity. Here, we consider material distributions $\boldsymbol{\rho}_j$ and $\boldsymbol{\rho}_k$, which are the $j$-th and $k$-th material distributions in the discrete system. We then thin out one of them if the following condition is satisfied:

$$\sum_{i=1}^{N_{\mathrm{in}}} \left| \rho_{i,j} - \rho_{i,k} \right| \le (1 - t) \cdot N_{\mathrm{in}}, \tag{7}$$

where $\rho_{i,j}$ and $\rho_{i,k}$ are the $i$-th components of $\boldsymbol{\rho}_j$ and $\boldsymbol{\rho}_k$, respectively. $t$ is the threshold used to judge the similarity and is set to 0.999 in this paper. $N_{\mathrm{in}}$ is the number of components, i.e., 2,601.

In this section, we provide three numerical examples to demonstrate the usefulness of data-driven topology design. Herein, we consider two design problems of structural mechanics, the design domains and boundary conditions of which are shown in Fig. 5.

Design problem 1 is a two-dimensional high-stiffness and light-weight structure design problem, where a vertical load is applied to the bottom-right boundary and the displacement is completely fixed on the left side boundary of the design domain (see Fig. 5a). In this design problem, two objective functions are set: one is the volume of the structure, and the other is the mean compliance to the applied load.

Design problem 2 is a two-dimensional low-stress and light-weight structure design problem, where a vertical load is applied to the center-right boundary and the displacement is completely fixed on the top side boundary of the design domain, as shown in Fig. 5b. For this design problem, two objective functions are set: the volume of the structure and the maximum value of the von Mises stress generated in the structure. Furthermore, the mean compliance is imposed as a constraint to ensure the mechanical connection from the displacement fixed boundary to the load imposed boundary; this constraint is crucial to obtain meaningful structures, as discussed by Yamasaki et al. (2019).

Design problem 1 has been well studied in numerous studies of topology optimization; therefore, we provide example 1 targeted to design problem 1 to investigate the basic potential of data-driven topology design. By contrast, it is difficult to directly solve design problem 2 using topology optimization because it is difficult to accurately evaluate the von Mises stress and it is necessary to solve a min-max problem.
This difficult problem is targeted in example 2.

Whereas examples 1 and 2 are provided to demonstrate that data-driven topology design can enhance the formulation support system, another aspect is investigated in example 3 using design problem 2.
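Before turning to the individual examples, the VAE-side operations of Section 4 (the sampling of Eq. (4), the Kullback-Leibler term, the uniform grid in the two-dimensional latent space, and the similarity-based thinning of Eq. (7)) can be sketched in NumPy. This is an illustrative sketch only: the network itself is not implemented, the KL term is written in its standard batch-averaged Gaussian form, `half_width` is a free parameter standing in for the sampling range, and the greedy rule of keeping the first member of each similar pair is an assumption, since the text does not state which of the two material distributions is removed.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Eq. (4): z = mu + sigma * eps, with log_var = log(sigma * sigma)."""
    sigma = np.exp(0.5 * log_var)
    return mu + sigma * rng.standard_normal(mu.shape)

def kl_term(mu, log_var):
    """Batch-averaged KL divergence between N(mu, sigma^2) and N(0, 1).

    mu and log_var have shape (number of material distributions, latent size).
    """
    per_sample = np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)
    return -0.5 * np.mean(per_sample)

def latent_grid(half_width, n_per_axis=20):
    """Uniform n x n grid over [-half_width, half_width]^2 in the latent space."""
    axis = np.linspace(-half_width, half_width, n_per_axis)
    zx, zy = np.meshgrid(axis, axis)
    return np.column_stack([zx.ravel(), zy.ravel()])

def thin_out(rhos, t=0.999):
    """Return indices kept after thinning with the criterion of Eq. (7).

    A sample is dropped if its summed absolute difference to an already
    kept sample is at most (1 - t) * N_in.
    """
    rhos = np.asarray(rhos, dtype=float)
    n_in = rhos.shape[1]
    kept = []
    for j in range(len(rhos)):
        if all(np.abs(rhos[j] - rhos[k]).sum() > (1.0 - t) * n_in for k in kept):
            kept.append(j)
    return kept
```

Each row returned by `latent_grid` would be passed through the trained decoder to obtain one generated material distribution, so the default setting yields 400 candidates per iteration.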

As described above, we solve the simple high-stiffness and light-weight structure design problem in example 1, thedesign domain and boundary conditions of which are shown in Fig. 5a. The design domain is discretized with 128 × h in (2) is set to 0 . × − in the void to avoid the singular stiffness matrix, and Poisson’s ratio is set to 0 .

3. We compute the displacement andstress ﬁelds under the plane stress condition.First, we collect material distributions obtained by solving various topology optimization problems and constructa database of the formulation support system in the same manner as the study of Yamasaki et al. (2019). The materialdistributions in the database are converted to conform to the design domain using the DDM. By doing so, we obtain2 ,

271 material distributions as the original data. In step 1, we evaluate the volume and mean compliance of thesematerial distributions using the ﬁnite element method and obtain the integrated data. In step 2, we select the materialdistributions having Pareto optimality from the integrated data and obtain the Pareto optimal data. The materialdistributions of the Pareto optimal data are shown in Fig. 6. In step 3, we convert these material distributions toconform to the reference domain using the DDM, and in step 4, generate material distributions using the VAE asdescribed in Section 4.2. Figure 7 shows the material distributions of the generated data. In step 5, we inversely8igure 6: Material distributions of the Pareto optimal data at iteration 0 in example 1Figure 7: Material distributions of the generated data at iteration 0 in example 1convert the generated material distributions to conform to the design domain using the DDM. In step 6, we evaluatethe performances of the generated material distributions and integrate them into the integrated data and then return tostep 2.We iterate the above data generation procedure 50 times. Figure 8 shows that the Pareto front gradually improveswhen iterating the data generation. Because the Pareto front is clearly improved after iteration 1, iterating the datageneration procedure is signiﬁcantly important to obtain high-performance material distributions.Figure 9 shows representative material distributions at iterations 0 and 50. As this ﬁgure indicates, the Pareto frontat iteration 0 includes unusual material distributions. For example, the load imposed boundary is not mechanicallyconnected to the displacement ﬁxed boundary despite an adequate amount of material in material distribution A.Material distribution B also seems to be unusual because a long bar sticks out from the base structure. 
In material distribution C, the material on the bottom side does not connect to the displacement-fixed boundary although an adequate amount of material exists. In material distribution D, the material at the top-right of the design domain is not needed to support the load. The performance of material distribution E would be further improved by moving the connecting point to the displacement-fixed boundary from the center to the bottom side. Material distribution F is a fluid channel. These unusual material distributions are suppressed at iteration 50, and those at iteration 50 seem to be comparable to the well-known topology-optimized structures of the minimum compliance problem.

Figure 8: History of data-driven topology design in example 1: iteration 0 (blue), iteration 1 (green), iteration 5 (orange), and iteration 50 (red)

Figure 9: Pareto front and representative material distributions at iteration 0 in example 1 (blue) and at iteration 50 (red)

Next, we discuss the importance of training the VAE using only the Pareto optimal data. For comparison, we generate the material distributions using a VAE trained with all material distributions of the integrated data, and finally obtain the Pareto front colored black in Fig. 10. Clearly, this Pareto front is inferior to that obtained using data-driven topology design. More importantly, the material distributions obtained seem to be very poor; in particular, the material distributions encircled with the dotted blue line remain in the Pareto front through iterations 0–50. These results indicate the disadvantage of training a VAE using all material distributions. If a VAE is trained using all material distributions, various features of low-performance material distributions will be reflected in the latent space. Therefore, a VAE trained in this way can hardly be expected to efficiently generate high-performance material distributions with a limited number of samples. Thus, the results shown in Fig. 10 indicate the importance of training a VAE using only the Pareto optimal data.

Figure 10: Pareto front and representative material distributions at iteration 50 in example 1 (red) and those obtained using a VAE trained using all of the material distributions (black)

Figure 11: Pareto front and representative material distributions at iteration 50 in example 1 (red) and those obtained using density-based topology optimization (black)

Finally, we compare the results of data-driven topology design with those of density-based topology optimization. The Pareto front colored black in Fig. 11 is obtained by directly solving the well-known minimum compliance problem while randomly changing the allowable upper limit of the volume 100 times. As shown in Fig. 11, data-driven topology design generates similar material distributions when the volume is greater than 0 .

In this section, we solve a low-stress and light-weight structure design problem, the design domain and boundary conditions of which are given in Fig. 5b. The design domain is discretized with 78282 triangular elements whose representative length is 0 . h in (2) is set to 0.025 to binarize the material distributions in the design domain. The material properties are set to the same values as in example 1. In this example, we use a mesh conforming to the structural boundary, as proposed by Yamasaki et al. (2017), to accurately compute the von Mises stress while excluding the so-called grayscale elements.

We prepare the original data in the same manner as in example 1 and compute the performances of the material distributions in the original data, that is, the volume, the maximum value of the von Mises stress, and the mean compliance. Next, we select the Pareto optimal material distributions with respect to the volume and the maximum value of the von Mises stress under the constraint that the mean compliance is less than 10. Figure 12 shows the selected material distributions as the Pareto optimal data. In the same manner as in example 1, we iterate the data generation procedure and obtain the result shown in Fig. 13.

As shown in this figure, some unusual material distributions exist on the Pareto front of iteration 0. For example, the two holes of material distribution A seem to be useless for avoiding the stress concentration. Material distribution B also seems to be unusual because the narrow part on the top side of the design domain is unreasonable for avoiding a stress concentration. In material distributions C and D, the material on the bottom side of the design domain is not needed to support the load. Similar to example 1, these unusual material distributions are suppressed as a result of the improvement in the Pareto front.

Figure 14: Material distributions of the Pareto optimal data at iteration 0 in example 3

Furthermore, structures shaped like the letter "J" are found as light-weight structures on the Pareto front of iteration 50. Indeed, this type of structure is superior in avoiding the stress concentration at the inner corner. Thus, it is confirmed that reasonable low-stress and light-weight structures are obtained using data-driven topology design.
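The selection step used in this example, Pareto optimality with respect to the volume and the maximum von Mises stress under the mean-compliance constraint, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `pareto_mask` and `select_elite` and the toy performance values are assumptions introduced here, and all objectives are assumed to be minimized.

```python
import numpy as np


def pareto_mask(objectives):
    """Boolean mask of non-dominated rows (every objective is minimized)."""
    objectives = np.asarray(objectives, dtype=float)
    mask = np.ones(objectives.shape[0], dtype=bool)
    for i in range(objectives.shape[0]):
        # Row j dominates row i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(objectives <= objectives[i], axis=1)
        better = np.any(objectives < objectives[i], axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask


def select_elite(volume, max_stress, compliance, compliance_limit=10.0):
    """Indices of feasible designs that are Pareto optimal in (volume, stress).

    The default limit of 10 mirrors the constraint quoted in the text;
    everything else here is a hypothetical helper.
    """
    volume = np.asarray(volume, dtype=float)
    max_stress = np.asarray(max_stress, dtype=float)
    compliance = np.asarray(compliance, dtype=float)
    # Discard designs violating the mean-compliance constraint first.
    feasible = np.flatnonzero(compliance < compliance_limit)
    objs = np.column_stack([volume[feasible], max_stress[feasible]])
    return feasible[pareto_mask(objs)]


if __name__ == "__main__":
    # Toy performance values (made up for illustration only).
    volume = [0.3, 0.5, 0.4, 0.2, 0.6]
    max_stress = [5.0, 3.0, 6.0, 4.0, 2.0]
    compliance = [8.0, 9.0, 7.0, 12.0, 9.5]
    print(select_elite(volume, max_stress, compliance))
```

In the toy data, the fourth design has the smallest volume but violates the compliance constraint, so it is removed before the Pareto comparison and does not dominate the remaining designs.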

In example 2, we provided the initial material distributions from the database of the formulation support system. However, we can provide higher-performance material distributions as the initial material distributions on the basis of another reasonable guideline, for example, by utilizing optimized structures of the minimum compliance problem whose design domain and boundary conditions are given in Fig. 5b. Because this minimum compliance problem and design problem 2 are correlated, data-driven topology design may generate quasi-optimal material distributions, although design problem 2 is difficult to solve directly.

The essence of this idea is to indirectly solve a topology optimization problem that is difficult to solve directly, using material distributions obtained by solving another problem that is easy to solve directly and is correlated with the former problem. The methodology based on this idea was originated by Yaji et al. (2020) and is called multifidelity topology design. In the original study, the best material distribution among the prepared material distributions is simply chosen. Therefore, data-driven topology design may have potential as a new version of multifidelity topology design.

To investigate this potential, example 3 is provided using problem settings similar to those in example 2, the only difference being the guideline for the initial material distributions. As described above, we solve the minimum compliance problem while randomly changing the allowable upper limit of the structural volume. By doing so, we obtain 80 material distributions, and further select 48 material distributions according to Pareto optimality of design problem 2. Figure 14 shows the material distributions of the Pareto optimal data.

Using these material distributions, we obtain the result shown in Fig. 15. As this figure shows, it may be difficult to assert a drastic improvement of the Pareto front.
We consider the reason for this to be the strong correlation between the two types of topology optimization problems. However, some interesting facts can be seen in Fig. 15. For example, material distribution A' is more rounded than material distribution A, although they are very similar. Similarly, material distribution B' is more rounded than material distribution B. In general, rounded structures are preferable for avoiding the stress concentration, and therefore these results seem to be reasonable as low-stress and light-weight structures. From these results, we consider that data-driven topology design has potential as a type of multifidelity topology design.

Figure 15: Pareto front and representative material distributions at iteration 0 in example 3 (blue) and at iteration 50 (red)
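The procedure used in example 3, seeding the dataset with solutions of an easier problem and then iterating selection, generation, and integration, can be sketched as follows. This is a minimal sketch under strong assumptions and not the authors' implementation: `evaluate` is a hypothetical toy stand-in for the finite element analysis, and `generate` replaces sampling from a trained VAE with a simple Gaussian perturbation of the elite designs. Only the structure of the loop follows the text.

```python
import numpy as np


def pareto_mask(objectives):
    """Boolean mask of non-dominated rows (every objective is minimized)."""
    objectives = np.asarray(objectives, dtype=float)
    mask = np.ones(objectives.shape[0], dtype=bool)
    for i in range(objectives.shape[0]):
        no_worse = np.all(objectives <= objectives[i], axis=1)
        better = np.any(objectives < objectives[i], axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask


def evaluate(designs):
    """Hypothetical stand-in for the analysis: two conflicting toy objectives."""
    f1 = np.mean(designs, axis=1)               # plays the role of the volume
    f2 = np.mean((designs - 1.0) ** 2, axis=1)  # plays the role of the stress
    return np.column_stack([f1, f2])


def generate(elite, n_new, rng, scale=0.05):
    """Stand-in for sampling a trained VAE: perturb randomly chosen elites."""
    parents = elite[rng.integers(0, len(elite), size=n_new)]
    noise = rng.normal(0.0, scale, size=parents.shape)
    return np.clip(parents + noise, 0.0, 1.0)


def data_driven_design(initial, n_iter=50, n_new=100, seed=0):
    """Iterate (i) selection, (ii) generation, and (iii) integration."""
    rng = np.random.default_rng(seed)
    data = np.asarray(initial, dtype=float)
    for _ in range(n_iter):
        elite = data[pareto_mask(evaluate(data))]  # (i) select Pareto optimal data
        new = generate(elite, n_new, rng)          # (ii) generate new designs
        # (iii) integrate; dominated designs can never re-enter the front,
        # so keeping only the elites and the new designs is sufficient.
        data = np.vstack([elite, new])
    return data[pareto_mask(evaluate(data))]


if __name__ == "__main__":
    # Initial designs would come from the easier (low-fidelity) problem;
    # random vectors in [0, 1] are used here purely for illustration.
    initial = np.random.default_rng(1).random((80, 16))
    front = data_driven_design(initial, n_iter=10, n_new=50)
    print(front.shape[0], "designs on the final toy Pareto front")
```

Because the elite designs are always carried over, the front obtained in this sketch can never become worse than the initial one; whether it improves noticeably depends on how good the initial designs already are.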

In this paper, we proposed data-driven topology design to enhance the formulation support system and demonstrated its usefulness through numerical examples. However, some issues remain.

First, other deep generative models should be investigated. Although we adopted a VAE in this paper, more suitable deep generative models may exist for data-driven topology design. In addition, a suitable architecture for the VAE should be further investigated. Although we adopted the architecture shown in Fig. 4 on the basis of the results of a preliminary study, there may be room for improvement. Furthermore, the theoretical foundation of data-driven topology design should be further investigated from the viewpoint of an EDA. We plan to tackle these issues in future studies and aim to develop a more sophisticated data-driven topology design.

The information necessary to replicate the results is presented in the manuscript. Interested readers may contact the corresponding author for further details regarding the implementation.

Conflict of interest

The authors declare that they have no conflicts of interest.

References

Adhikari A, Adhikari J (2015) Advances in Knowledge Discovery in Databases. Springer, DOI 10.1007/978-3-319-13212-9

Atienza R (2018) Advanced Deep Learning with Keras: Apply Deep Learning Techniques, Autoencoders, GANs, Variational Autoencoders, Deep Reinforcement Learning, Policy Gradients, and More. Packt Publishing

Bendsøe MP (1989) Optimal shape design as a material distribution problem. Structural Optimization 1(4):193–202

Bendsøe MP, Kikuchi N (1988) Generating optimal topologies in structural design using a homogenization method. Computer Methods in Applied Mechanics and Engineering 71(2):197–224

Bhattacharjee S, Gras R (2019) Estimation of distribution using population queue based variational autoencoders. In: Proceedings of 2019 IEEE Congress on Evolutionary Computation, IEEE, Wellington, pp 1406–1414

Deng L (2012) The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine 29(6):141–142

Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery in databases. AI Magazine 17:37–54

Garciarena U, Santana R, Mendiburu A (2018) Expanding variational autoencoders for learning and exploiting latent representations in search distributions. In: Proceedings of the Genetic and Evolutionary Computation Conference, Kyoto, pp 849–856

Gatys LA, Ecker AS, Bethge M (2016) Image style transfer using convolutional neural networks. In: Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, pp 2414–2423

Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in Neural Information Processing Systems 27, Curran Associates, Inc., pp 2672–2680

Guo T, Lohan DJ, Allison JT, Cang R, Ren Y (2018) An indirect design representation for topology optimization using variational autoencoder and style transfer. In: Proceedings of AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA, Kissimmee

Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507

Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv: 1412.6980

Kingma DP, Welling M (2013) Auto-encoding variational bayes. arXiv: 1312.6114

Larrañaga P, Lozano JA (2001) Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Genetic Algorithms and Evolutionary Computation, Springer US

Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv: 1411.1784

Oh S, Jung Y, Kim S, Lee I, Kang N (2019) Deep generative design: Integration of topology optimization and generative models. Journal of Mechanical Design 141(11):111405–1–111405–13

Sasaki H, Igarashi H (2019) Topology optimization accelerated by deep learning. IEEE Transactions on Magnetics 55(6):1–5

Tsai CW, Lai CF, Chiang MC, Yang LT (2014) Data mining for internet of things: A survey. IEEE Communications Surveys & Tutorials 16(1):77–97

Ulu E, Zhang R, Yumer ME, Kara LB (2014) A data-driven investigation and estimation of optimal topologies under variable loading configurations. In: Zhang YJ, Tavares JMRS (eds) Computational Modeling of Objects Presented in Images. Fundamentals, Methods, and Applications, Springer International Publishing, Cham, pp 387–399

Wu Z, Song S, Khosla A, Yu F, Zhang L, Tang X, Xiao J (2015) 3D ShapeNets: A deep representation for volumetric shapes. In: Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, pp 1912–1920

Yaji K, Yamasaki S, Fujita K (2020) Multifidelity design guided by topology optimization. Structural and Multidisciplinary Optimization 61(3):1071–1085

Yamasaki S, Yamanaka S, Fujita K (2017) Three-dimensional grayscale-free topology optimization using a level-set based r-refinement method. International Journal for Numerical Methods in Engineering 112(10):1402–1438

Yamasaki S, Yaji K, Fujita K (2019) Knowledge discovery in databases for determining formulation in topology optimization. Structural and Multidisciplinary Optimization 59(2):595–611

Yu Y, Hur T, Jung J, Jang IG (2019) Deep learning for determining a near-optimal topological design without any iteration. Structural and Multidisciplinary Optimization 59(3):787–799