Multi-scale Deep Neural Network (MscaleDNN) Methods for Oscillatory Stokes Flows in Complex Domains
Bo Wang (LCSM(MOE), School of Mathematics and Statistics, Hunan Normal University, Changsha, Hunan, 410081, P. R. China), Wenzhong Zhang and Wei Cai (Dept. of Mathematics, Southern Methodist University, Dallas, TX 75275)
Email addresses: [email protected] (Bo Wang), [email protected] (Wenzhong Zhang), [email protected] (Wei Cai, corresponding author)
Summary.
In this paper, we study a multi-scale deep neural network (MscaleDNN) as a meshless numerical method for computing oscillatory Stokes flows in complex domains. The MscaleDNN employs a multi-scale structure in the design of its DNN, using radial scalings to convert the approximation of high frequency components of the highly oscillatory Stokes solution to one of lower frequencies. The MscaleDNN solution to the Stokes problem is obtained by minimizing a loss function in terms of the $L^2$ norm of the residual of the Stokes equation. Three forms of loss functions are investigated, based on the vorticity-velocity-pressure, velocity-stress-pressure, and velocity-gradient of velocity-pressure formulations of the Stokes equation. We first conduct a systematic study of the MscaleDNN methods with various loss functions on the Kovasznay flow, in comparison with normal fully connected DNNs. Then, Stokes flows with highly oscillatory solutions in a 2-D domain with six randomly placed holes are simulated by the MscaleDNN. The results show that the MscaleDNN has faster convergence and consistent error decays in the simulation of the Kovasznay flow for all four tested loss functions. More importantly, the MscaleDNN is capable of learning highly oscillatory solutions when the normal DNNs fail to converge.

AMS subject classifications: 35Q68, 65N99, 68T07, 76M99
Key words: deep neural network, Stokes equation, multi-scale, meshless methods.
1 Introduction

Numerical methods for incompressible flow are one of the major topics in computational fluid dynamics and have been studied intensively over the last five decades. Various techniques have been proposed to address the incompressibility condition of the flow, including projection methods [4, 18], gauge methods [6], and time-splitting methods [13], among others.
Finite element and spectral element methods [3] are mostly used to discretize the Navier-Stokes equations, where special attention is needed for the approximation spaces of velocity and pressure to satisfy the Babuška-Brezzi inf-sup condition for a saddle point problem [8]. Besides, for large scale engineering applications, body-fitted mesh generation for 3-D objects and efficient linear solvers for the resulting linear systems have been a major issue for computational resources.

The emerging deep neural network (DNN) has found many applications beyond its traditional ones, such as image classification and speech recognition. Recent work extending DNNs to the field of scientific and engineering computing has shown much promise [7, 9, 17]. DNN-based numerical methods are usually formulated as an optimization problem, where the loss function could be an energy functional, as in a Ritz formulation of a self-adjoint differential equation [7], or simply the least squares residual of the PDEs [10, 2, 11]. The DNN technique provides a powerful approximation method to represent solutions of high dimensional variables, while the traditional finite element and spectral element methods encounter the well known curse of dimensionality. Also, there are several advantages of using DNNs to approximate the solution of incompressible flows. Firstly, the stochastic optimization algorithm employed by DNN-based methods relies on losses calculated on randomly sampled points in the computational domain, rather than over an unstructured mesh fitting the geometry of the complex objects in the fluid problem. This feature renders the DNN-based methods for solving PDEs truly meshless methods. Secondly, due to the capability of the DNN in handling high dimensional functions, the approximation of a time dependent solution can be carried out in the temporal-spatial four dimensional space. Thirdly, boundary conditions for the fluid problems can be simply enforced by introducing penalty terms in the loss function, with no need to find and implement appropriate and non-trivial boundary conditions for the pressure [16] or vorticity variables in the corresponding formulations of the Stokes or Navier-Stokes equations.

Normal fully connected DNNs used for image classification and data science applications have been shown to be ineffective in learning high frequency contents of the solution, as illustrated in recent works on DNNs' frequency dependent convergence properties [19]. Unfortunately, fluid flow at high Reynolds number will contain many scales, which is the hallmark of the onset of turbulent flow from a laminar one. Therefore, in order to make DNN based approaches competitive numerical methods, in terms of resolution power, with popular spectral [3] and spectral element methods [12], it is important to develop new classes of DNNs which can represent scales of drastic disparity arising from the study of turbulent flows. For this purpose, we have recently developed strategies to speed up the convergence of DNNs in learning high frequency content of the solutions of PDEs. Two new DNNs have been proposed: a PhaseDNN [2] and a MscaleDNN [11]. The PhaseDNN uses a series of phase shifts to convert high frequency contents to a low frequency range before the learning is carried out.
This method has been shown to be very effective in simulating high frequency Helmholtz equations in acoustic wave scattering. On the other hand, the MscaleDNN uses a radial scaling technique in the frequency domain (or a corresponding scaling in the physical domain) to convert solution content of a range of higher frequencies to a lower frequency one, which will be learned quickly with a small size DNN; the latter is then scaled back in the physical space to approximate the original solution content. The MscaleDNN is more effective in handling higher dimensional PDEs and has already been shown to be superior over traditional fully connected DNNs for solving the Poisson-Boltzmann equation in complex and singular domains [11]. In this paper, we will extend the MscaleDNN approach to find the solution of the Stokes problem, as a first step toward developing DNN based numerical methods for time-dependent incompressible Navier-Stokes equations.

The rest of the paper is organized as follows. In Section 2, we will present the structure of the MscaleDNN to be used for solving the Stokes problems. Section 3 will propose several loss functions for training, based on three different first order system reformulations of the Stokes equation. A benchmark test on a low frequency Kovasznay flow will be conducted in Section 4 to evaluate the performance of a normal fully connected DNN and MscaleDNNs, as well as different loss functions. Section 5 will present numerical tests of highly oscillatory Stokes flows with multiple frequencies in a complex domain. Finally, a conclusion and discussion of future work are given in Section 6.

2 Multi-scale deep neural network (MscaleDNN)

In a recent work [11], a multi-scale DNN was proposed, which consists of a series of parallel normal sub-neural networks. Each of the sub-networks receives a scaled version of the input, and their outputs are then combined to make the final output of the MscaleDNN (refer to Fig. 1). The individual sub-network in the MscaleDNN with a scaled input is designed to approximate a segment of the frequency content of the targeted function, and the effect of the scaling is to convert a specific high frequency content to a lower frequency range so that the learning can be accomplished much more quickly. Recent work [19] on the frequency dependence of DNN convergence shows that much faster convergence occurs in approximating low frequency functions than in approximating high frequency ones; the MscaleDNN takes advantage of this property. In addition, in order to produce scale separation and identification capability for a MscaleDNN, we borrowed the idea of compact mother scaling and wavelet functions from wavelet theory [5], and found that activation functions with a localized frequency profile work better than normal activation functions, e.g., ReLU, tanh, etc.

Fig. 1 shows the schematics of a MscaleDNN consisting of $n$ networks. Each scaled input passing through a sub-network can be expressed by the following formula
$$f_{\theta}(\mathbf{x}) = W^{[L-1]}\sigma\circ(\cdots(W^{[1]}\sigma\circ(W^{[0]}\mathbf{x} + b^{[0]}) + b^{[1]})\cdots) + b^{[L-1]}, \qquad (2.1)$$
where $W^{[0]},\dots,W^{[L-1]}$ and $b^{[0]},\dots,b^{[L-1]}$ are the weight matrices and bias unknowns, respectively, to be optimized via the training, and $\sigma(x)$ is the activation function. In this work, the following plane wave activation function will be used for its localized frequency property [11],
$$\sigma(x) = \sin(x). \qquad (2.2)$$

Figure 1: Illustration of a MscaleDNN.

For the input scales, we could select the scale for the $i$-th sub-network to be $i$ (as shown in Fig. 1) or $2^{i-1}$.
Mathematically, a MscaleDNN solution $f(\mathbf{x})$ is represented by the following sum of sub-networks $f_{\theta^{n_i}}$ with network parameters denoted by $\theta^{n_i}$ (i.e., weight matrices and biases),
$$f(\mathbf{x}) \sim \sum_{i=1}^{M} f_{\theta^{n_i}}(\alpha_i \mathbf{x}), \qquad (2.3)$$
where $\alpha_i$ is the chosen scale for the $i$-th sub-network in Fig. 1. For more details on the design and discussion of the MscaleDNN, please refer to [11].

For comparison studies in this paper, we will refer to a "normal" network as a fully connected DNN with the same total number of neurons as the MscaleDNN, but without the multi-scale features. We will perform extensive numerical experiments to examine the effectiveness of different settings and select efficient ones to solve complex problems. All DNN models are trained by Adam [14].
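To make the construction in (2.1)-(2.3) concrete, the following is a minimal PyTorch sketch of a MscaleDNN; the paper does not specify its software framework, and the layer sizes and scale set below are illustrative defaults rather than the authors' actual code.

```python
# Sketch of a MscaleDNN: parallel sub-networks with radially scaled inputs,
# each using the plane-wave activation sigma(x) = sin(x) of (2.2).
import torch
import torch.nn as nn

class SubNet(nn.Module):
    """One fully connected sub-network f_theta as in (2.1)."""
    def __init__(self, dim_in=2, dim_out=2, width=50, depth=4):
        super().__init__()
        layers, d = [], dim_in
        for _ in range(depth):
            layers.append(nn.Linear(d, width))
            d = width
        self.hidden = nn.ModuleList(layers)
        self.out = nn.Linear(d, dim_out)

    def forward(self, x):
        for lin in self.hidden:
            x = torch.sin(lin(x))          # localized-frequency activation (2.2)
        return self.out(x)

class MscaleDNN(nn.Module):
    """Sum of sub-networks on scaled inputs, f(x) = sum_i f_i(alpha_i * x), cf. (2.3)."""
    def __init__(self, scales=(1, 2, 4, 8, 16, 32), **kw):
        super().__init__()
        self.scales = scales
        self.subnets = nn.ModuleList(SubNet(**kw) for _ in scales)

    def forward(self, x):
        return sum(net(a * x) for a, net in zip(self.scales, self.subnets))
```

Each sub-network sees the input multiplied by its scale $\alpha_i$, so a high frequency component of the target appears as a low frequency component to that sub-network, which is the mechanism behind the faster convergence.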
3 Loss functions for the Stokes equation

The following two-dimensional (2-D) Stokes problem
$$-\nu\Delta\mathbf{u} + \nabla p = \mathbf{f}, \quad \text{in } \Omega, \qquad (3.1)$$
$$\nabla\cdot\mathbf{u} = 0, \quad \text{in } \Omega, \qquad (3.2)$$
$$\mathbf{u} = \mathbf{g}, \quad \text{on } \partial\Omega, \qquad (3.3)$$
will be solved by the MscaleDNN. Here $\Omega$ is an open bounded domain in $\mathbb{R}^2$, and the boundary condition $\mathbf{g}$ satisfies the compatibility condition
$$\int_{\partial\Omega} \mathbf{g}\cdot\mathbf{n}\, ds = 0. \qquad (3.4)$$
The MscaleDNN solution will be found as in the traditional least squares finite element method [1], where the solution is obtained by minimizing a loss function in terms of the residual of the Stokes problem (3.1). To introduce loss functions for the DNN algorithms, we first reformulate (3.1)-(3.3) into a first order system, as in least squares finite element methods for solving the Stokes problem. There are various possible ways of recasting (3.1) into a first order system, and we will focus on the following three popular approaches used in the development of least squares finite element methods [1].

• Vorticity-velocity-pressure (ωVP) formulation.
The first approach introduces a vorticity variable, a scalar quantity for 2-D flows,
$$\omega = \nabla\times\mathbf{u} = \partial_x u_y - \partial_y u_x, \qquad (3.5)$$
arriving at a vorticity-velocity-pressure (ωVP) system:
$$\nu\nabla\times\omega + \nabla p = \mathbf{f}, \quad \text{in } \Omega, \qquad (3.6a)$$
$$\omega = \nabla\times\mathbf{u}, \quad \text{in } \Omega, \qquad (3.6b)$$
$$\nabla\cdot\mathbf{u} = 0, \quad \text{in } \Omega. \qquad (3.6c)$$
Here, for a scalar field $\omega$ in 2-D, the curl is the vector $\nabla\times\omega = (\partial_y\omega, -\partial_x\omega)^{\top}$.

• Velocity-stress-pressure (VSP) formulation.
The second approach introduces a stress tensor
$$\mathbf{T} = \sqrt{2\nu}\,(\nabla\mathbf{u} + \nabla\mathbf{u}^{\top})/2, \qquad (3.7)$$
and a velocity-stress-pressure (VSP) system
$$-\sqrt{2\nu}\,\nabla\cdot\mathbf{T} + \nabla p = \mathbf{f}, \quad \text{in } \Omega, \qquad (3.8a)$$
$$\mathbf{T} = \sqrt{2\nu}\,(\nabla\mathbf{u} + \nabla\mathbf{u}^{\top})/2, \quad \text{in } \Omega, \qquad (3.8b)$$
$$\nabla\cdot\mathbf{u} = 0, \quad \text{in } \Omega, \qquad (3.8c)$$
is obtained.

• Velocity-gradient of velocity-pressure (VgVP) formulation.
The third approach simply introduces a variable $\mathbf{U} = \nabla\mathbf{u}$ (by taking the gradient of each component of the velocity field), which leads to a velocity-gradient of velocity-pressure (VgVP) system
$$-\nu\nabla\cdot\mathbf{U} + \nabla p = \mathbf{f}, \quad \text{in } \Omega, \qquad (3.9a)$$
$$\mathbf{U} = \nabla\mathbf{u}, \quad \text{in } \Omega, \qquad (3.9b)$$
$$\nabla\cdot\mathbf{u} = 0, \quad \text{in } \Omega. \qquad (3.9c)$$

It is well known that it is more difficult to compute the pressure than the velocity in computational fluid dynamics. We find that the velocity also converges faster than the pressure in the DNN-based methods. In order to take care of the pressure, we take the divergence of both sides of the Stokes equation (3.1) to obtain a Poisson equation
$$\Delta p = \nabla\cdot\mathbf{f}, \quad \text{in } \Omega. \qquad (3.10)$$
The residual of this equation will be an extra term in the loss function, and a tunable weight on the loss due to pressure is introduced. To be consistent with the first order systems above, we also reformulate the Poisson equation (3.10) as
$$\mathbf{q} = \nabla p, \quad \text{in } \Omega, \qquad (3.11a)$$
$$\nabla\cdot\mathbf{q} = \nabla\cdot\mathbf{f}, \quad \text{in } \Omega. \qquad (3.11b)$$
Together with the first order systems (3.6), (3.8) or (3.9), respectively, we can design the MscaleDNN algorithms. In each algorithm, a total of four MscaleDNNs will be used: one for the velocity vector $\mathbf{u}$, one for the pressure $p$, one for the gradient of pressure $\mathbf{q}$, and one for the vorticity $\omega$, the stress $\mathbf{T}$, or the gradient of velocity $\mathbf{U}$, respectively. The DNN solutions are denoted by $\mathbf{u}(\mathbf{x},\theta_u)$, $p(\mathbf{x},\theta_p)$, $\omega(\mathbf{x},\theta_\omega)$, $\mathbf{T}(\mathbf{x},\theta_T)$, $\mathbf{U}(\mathbf{x},\theta_U)$, $\mathbf{q}(\mathbf{x},\theta_q)$ accordingly. Based on the first order systems, we define loss functions as follows:
$$L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q) := \|\nu\nabla\times\omega + \mathbf{q} - \mathbf{f}\|^2_{\Omega} + \alpha\|\nabla\cdot\mathbf{q} - \nabla\cdot\mathbf{f}\|^2_{\Omega} + \|\nabla\times\mathbf{u} - \omega\|^2_{\Omega} + \|\nabla\cdot\mathbf{u}\|^2_{\Omega} + \|\nabla p - \mathbf{q}\|^2_{\Omega} + \beta\|\mathbf{u} - \mathbf{g}\|^2_{\partial\Omega},$$
$$L_{VSP}(\theta_u,\theta_p,\theta_T,\theta_q) := \|\sqrt{2\nu}\,\nabla\cdot\mathbf{T} - \mathbf{q} + \mathbf{f}\|^2_{\Omega} + \alpha\|\nabla\cdot\mathbf{q} - \nabla\cdot\mathbf{f}\|^2_{\Omega} + \big\|\sqrt{2\nu}\,(\nabla\mathbf{u} + \nabla\mathbf{u}^{\top})/2 - \mathbf{T}\big\|^2_{\Omega} + \|\nabla\cdot\mathbf{u}\|^2_{\Omega} + \|\nabla p - \mathbf{q}\|^2_{\Omega} + \beta\|\mathbf{u} - \mathbf{g}\|^2_{\partial\Omega},$$
$$L_{VgVP}(\theta_u,\theta_p,\theta_U,\theta_q) := \|\nu\nabla\cdot\mathbf{U} - \mathbf{q} + \mathbf{f}\|^2_{\Omega} + \alpha\|\nabla\cdot\mathbf{q} - \nabla\cdot\mathbf{f}\|^2_{\Omega} + \|\nabla\mathbf{u} - \mathbf{U}\|^2_{\Omega} + \|\nabla\cdot\mathbf{u}\|^2_{\Omega} + \|\nabla p - \mathbf{q}\|^2_{\Omega} + \beta\|\mathbf{u} - \mathbf{g}\|^2_{\partial\Omega}, \qquad (3.12)$$
where $\alpha$, $\beta$ are penalty constants. We emphasize that the Poisson residual
$$\alpha\|\nabla\cdot\mathbf{q} - \nabla\cdot\mathbf{f}\|^2_{\Omega} + \|\nabla p - \mathbf{q}\|^2_{\Omega}$$
in the loss function is important for the convergence of the pressure, as will be shown via numerical results in Section 5.3.

For brevity of notation, the loss functions in (3.12) are named the ωVP-loss, VSP-loss and VgVP-loss, accordingly. In the rest of this paper, these loss functions will be compared with the simple loss function directly obtained from the original Stokes equation:
$$L_{VP}(\theta_u,\theta_p) = \|\nu\Delta\mathbf{u} - \nabla p + \mathbf{f}\|^2_{\Omega} + \|\nabla\cdot\mathbf{u}\|^2_{\Omega} + \beta\|\mathbf{u} - \mathbf{g}\|^2_{\partial\Omega}, \qquad (3.13)$$
which is named the VP-loss. In the DNN algorithms using this loss function, a total of two MscaleDNNs will be used: one for the velocity vector $\mathbf{u}$, where the output is $y = \mathbf{u}$ in Fig. 1, and one for the scalar pressure $p$.
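To illustrate how such residual losses can be assembled in practice, the following is a hedged PyTorch sketch of the ωVP-loss in (3.12), with the $L^2$ norms replaced by Monte Carlo means over collocation points; the helper names, tensor shapes, and the assumption that the source $\mathbf{f}$ is implemented as a torch-differentiable function are ours, not the paper's.

```python
# Collocation version of the omega-VP loss; derivatives via autograd.
import torch

def grad(y, x):
    """Row-wise gradient dy/dx of a scalar field y evaluated at points x."""
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               create_graph=True)[0]

def omega_vp_loss(u_net, p_net, w_net, q_net,
                  x_in, x_bc, g_bc, f, nu, alpha, beta):
    x_in.requires_grad_(True)
    u, p = u_net(x_in), p_net(x_in)        # (N,2) velocity, (N,1) pressure
    w, q = w_net(x_in), q_net(x_in)        # (N,1) vorticity, (N,2) grad p
    gw = grad(w[:, 0], x_in)
    curl_w = torch.stack([gw[:, 1], -gw[:, 0]], dim=1)   # curl of a scalar field
    gu1, gu2 = grad(u[:, 0], x_in), grad(u[:, 1], x_in)
    curl_u = gu2[:, 0] - gu1[:, 1]                       # scalar curl of u, cf. (3.5)
    div_u = gu1[:, 0] + gu2[:, 1]
    fx = f(x_in)                           # f assumed torch-differentiable
    div_q = grad(q[:, 0], x_in)[:, 0] + grad(q[:, 1], x_in)[:, 1]
    div_f = grad(fx[:, 0], x_in)[:, 0] + grad(fx[:, 1], x_in)[:, 1]
    grad_p = grad(p[:, 0], x_in)

    def sq(t):                             # mean squared residual over the batch
        return (t ** 2).sum(dim=-1).mean() if t.dim() > 1 else (t ** 2).mean()

    return (sq(nu * curl_w + q - fx) + alpha * sq(div_q - div_f)
            + sq(curl_u - w[:, 0]) + sq(div_u) + sq(grad_p - q)
            + beta * sq(u_net(x_bc) - g_bc))
```

Dropping the two $\mathbf{q}$-terms (the $\alpha$-weighted Poisson residual and $\|\nabla p - \mathbf{q}\|^2_{\Omega}$) in this sketch recovers the reduced loss (5.3) studied in Section 5.3.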
4 A benchmark test: Kovasznay flow

As a benchmark test, we first consider the Stokes problem in a square domain $\Omega$ with an exact solution coinciding with the analytical solution of the incompressible Navier-Stokes equations obtained by Kovasznay [15], i.e.,
$$u_1 = 1 - e^{\lambda x_1}\cos(2\pi x_2), \quad u_2 = \frac{\lambda}{2\pi}\, e^{\lambda x_1}\sin(2\pi x_2), \quad p = \frac{1}{2}\big(1 - e^{2\lambda x_1}\big), \qquad (4.1)$$
where
$$\lambda = \frac{Re}{2} - \sqrt{\frac{Re^2}{4} + 4\pi^2}, \qquad Re = \frac{1}{\nu}.$$
The source term $\mathbf{f}$ is obtained by substituting the exact solution into the Stokes equation (3.1) for a given viscosity $\nu$. The MscaleDNNs are set to have six scales $\{x, 2x, 4x, 8x, 16x, 32x\}$, and their fully connected sub-networks all have 4 hidden layers and 50 neurons in each hidden layer. On the other hand, a fully connected DNN with 4 hidden layers and 300 neurons in each hidden layer was tested for comparison. Therefore, the total numbers of neurons in the fully connected DNN and the MscaleDNNs are the same. Nevertheless, the fully connected DNN does have more connectivity, with more parameters. In the loss functions, we fix $\alpha = \beta$, and we randomly sample a set of points inside $\Omega$ and 10000 points on the boundary for learning. In the learning process, we set the batch size to 1000 points inside the domain and randomly pick 400 points on the boundary for each step.
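For reference, the following is a small NumPy sketch of the Kovasznay solution (4.1); the pressure normalization $p = (1 - e^{2\lambda x_1})/2$ used here is the common convention for this flow and should be read as an assumption.

```python
# Kovasznay exact solution (4.1) for a given viscosity nu, with Re = 1/nu.
import numpy as np

def kovasznay(x1, x2, nu):
    re = 1.0 / nu
    lam = re / 2.0 - np.sqrt(re ** 2 / 4.0 + 4.0 * np.pi ** 2)
    u1 = 1.0 - np.exp(lam * x1) * np.cos(2.0 * np.pi * x2)
    u2 = lam / (2.0 * np.pi) * np.exp(lam * x1) * np.sin(2.0 * np.pi * x2)
    p = 0.5 * (1.0 - np.exp(2.0 * lam * x1))
    return u1, u2, p
```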
• Adaptive learning rates. We have found that reducing the learning rate as the training progresses can give a noticeable improvement in the reduction of the loss. In our numerical tests, the learning rate of the first 100 epochs is set to 0.001. Then, the learning rate is reduced by a factor of 10 after every 100 epochs. The change of learning rate can be seen clearly in the history of the losses.
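Continuing the sketches above, one way to realize Adam with this stepwise decay is shown below; the mini-batch sampler `batches` and the 300-epoch count are illustrative stand-ins, not the authors' training script.

```python
# Adam with learning rate 1e-3, divided by 10 every 100 epochs.
import itertools
import torch

nets = [u_net, p_net, w_net, q_net]          # MscaleDNNs from the sketches above
params = itertools.chain(*(n.parameters() for n in nets))
optimizer = torch.optim.Adam(params, lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)

for epoch in range(300):
    for x_in, x_bc, g_bc in batches():       # hypothetical mini-batch sampler
        optimizer.zero_grad()
        loss = omega_vp_loss(u_net, p_net, w_net, q_net,
                             x_in, x_bc, g_bc, f, nu, alpha, beta)
        loss.backward()
        optimizer.step()
    scheduler.step()                          # triggers the decay every 100 epochs
```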
In order to check the accuracy of the algorithms, we define $\ell^2$-errors
$$Err(\mathbf{u}) = \Big(\frac{1}{N}\sum_{j=1}^{N} |\mathbf{u}_{DNN}(\mathbf{x}_j) - \mathbf{u}(\mathbf{x}_j)|^2\Big)^{1/2}, \quad Err(p) = \Big(\frac{1}{N}\sum_{j=1}^{N} |p_{DNN}(\mathbf{x}_j) - p(\mathbf{x}_j)|^2\Big)^{1/2}, \qquad (4.2)$$
between the DNN solution $\{\mathbf{u}_{DNN}(\mathbf{x}), p_{DNN}(\mathbf{x})\}$ and the exact solution $\{\mathbf{u}(\mathbf{x}), p(\mathbf{x})\}$ given in (4.1). Here, $\{\mathbf{x}_j = (x_{j,1}, x_{j,2})\}_{j=1}^{N}$ are the nodes of a uniform $200\times 200$ mesh of the domain $\Omega$.

The DNN solutions obtained by minimizing the different loss functions in (3.12)-(3.13) are compared in Figs. 2 and 3. The results show that both the fully connected DNN and the MscaleDNNs converge within 300 epochs with any one of the loss functions in (3.12). However, the simple VP-loss in (3.13) has a very poor performance no matter whether the fully connected DNN or the MscaleDNNs are used. In particular, neither the fully connected DNN nor the MscaleDNNs can produce reasonable results within 300 epochs if the VP-loss function is used.

Figure 2: Normal DNN with different loss functions: (a) loss; (b) Err(u); (c) Err(p).
Figure 3: MscaleDNN with different loss functions: (a) loss; (b) Err(u); (c) Err(p).

Figure 4: Normal DNN and MscaleDNN with loss function $L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

More detailed differences can be seen from the comparison of the losses and errors between the normal DNN and the MscaleDNN for the three loss functions in Figs. 4-6. The results show that the MscaleDNNs have much faster convergence no matter which loss function is used. In fact, the MscaleDNNs can achieve much better accuracy than the normal DNN, as we can see in Figs. 4(b)-6(b). In particular, the MscaleDNN solutions obtained by minimizing the ωVP-loss are compared with the exact solution along a fixed horizontal line; the errors at epoch 300 are shown in Fig. 7.

Figure 5: Normal DNN and MscaleDNN with loss function $L_{VSP}(\theta_u,\theta_p,\theta_T,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

Figure 6: Normal DNN and MscaleDNN with loss function $L_{VgVP}(\theta_u,\theta_p,\theta_U,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

Figure 7: Errors of MscaleDNN solutions at epoch 300 with loss function $L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q)$: (a) $u_x$; (b) $u_y$; (c) $p$.

5 Oscillatory Stokes flows in a complex domain

The MscaleDNN is more powerful than a normal DNN due to the former's capability of solving complicated problems with oscillatory solutions. Here, we consider the Stokes flow in a rectangular domain with six randomly placed cylindrical holes inside the domain (refer to Fig. 8). The radii of the cylinders are set to 0.2, 0.15, 0.18, 0.2, 0.18, 0.15, respectively. We will test two exact solutions with highly oscillatory velocity fields. All examples are set to run 1500 epochs using Adam.

Figure 8: An oscillatory solution over a domain with six cylindrical voids: (a) computational domain; (b) an example of $\mathbf{u}$ in (5.2).

The adaptive learning rate technique will be used in the numerical tests below, where the learning rate of the first 500 epochs is set to 0.001; then, the learning rate is reduced by a factor of 10 after every 500 epochs. The change of learning rate can be seen clearly in the history of the losses later.
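Because the method is meshless, collocation points in the perforated domain can be generated by simple rejection sampling, as in the sketch below; the bounding box and hole centers there are hypothetical placeholders (only the radii are quoted in the text above).

```python
# Rejection sampling of interior collocation points for the perforated domain:
# draw uniform points in the bounding box, discard those inside any hole.
import numpy as np

box = np.array([[0.0, 2.0], [0.0, 2.0]])                # assumed bounding box
centers = np.array([[0.5, 0.5], [1.5, 0.5], [0.5, 1.5],
                    [1.5, 1.5], [1.0, 1.0], [1.0, 0.3]])  # hypothetical centers
radii = np.array([0.2, 0.15, 0.18, 0.2, 0.18, 0.15])      # radii from the text

def sample_interior(n, rng=np.random.default_rng(0)):
    pts = np.empty((0, 2))
    while len(pts) < n:
        cand = rng.uniform(box[:, 0], box[:, 1], size=(2 * n, 2))
        d2 = ((cand[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        keep = (d2 > radii[None, :] ** 2).all(axis=1)     # outside every hole
        pts = np.vstack([pts, cand[keep]])
    return pts[:n]
```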
In the loss functions, we fix the penalty parameter $\beta = 100$ and set an initial value for the penalty parameter $\alpha$. During training, we monitor $Err(\mathbf{u})$ and $Err(p)$ and adjust the parameter $\alpha$ as follows (see the sketch after this list):
• If $Err(\mathbf{u}) > Err(p)$, $\alpha$ is increased by a fixed increment.
• If $Err(p) > Err(\mathbf{u})$ and $\alpha$ is larger than a prescribed lower bound, $\alpha$ is decreased by the same increment.
The $\ell^2$-errors defined in (4.2) are now computed with 34,072 randomly selected points in the computational domain.
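A sketch of this adaptive rule is given below; the increment and the lower bound on $\alpha$ are assumed values chosen for illustration.

```python
# Adjust the penalty alpha based on the monitored errors (4.2).
def update_alpha(alpha, err_u, err_p, step=1.0, alpha_min=1.0):
    if err_u > err_p:
        alpha += step                      # assumed fixed increment
    elif err_p > err_u and alpha > alpha_min:
        alpha -= step                      # assumed lower bound alpha_min
    return alpha
```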
5.1 Oscillatory flow with a single frequency

The first case has an exact solution given by
$$\begin{aligned} u_1 &= 1 - e^{\lambda x_1}\cos(n\pi x_1 + m\pi x_2), \\ u_2 &= \frac{\lambda}{m\pi}\, e^{\lambda x_1}\sin(n\pi x_1 + m\pi x_2) + \frac{n}{m}\, e^{\lambda x_1}\cos(n\pi x_1 + m\pi x_2), \\ p &= \frac{1}{2}\big(1 - e^{2\lambda x_1}\big), \qquad \lambda = \frac{Re}{2} - \sqrt{\frac{Re^2}{4} + 4\pi^2}, \qquad Re = \frac{1}{\nu}, \end{aligned} \qquad (5.1)$$
with frequencies $n = m = 55$. In the simulations of this example, the MscaleDNNs for $\mathbf{u}$, $\omega$, $\mathbf{T}$ and $\mathbf{U}$ are set to have 11 scales $\{x, 2x, \cdots, 2^{10}x\}$, and the embedded fully connected DNN for each scale is set to have 8 hidden layers and 150 neurons in each hidden layer. As the pressure does not have high oscillations, the MscaleDNNs for $p$ and $\mathbf{q}$ are set to have 6 scales $\{x, 2x, \cdots, 2^{5}x\}$, and the embedded fully connected DNN for each scale is set to have 8 hidden layers and 50 neurons in each hidden layer. We randomly sample 850621 points inside $\Omega$ and 140000 points on the boundary for learning. In the learning process, we set the batch size to 10000 points inside the domain and randomly pick 2000 points on the boundary for each step.

The MscaleDNN solutions of $\mathbf{u}$ are compared with the exact $\mathbf{u}$ in Figs. 9-11. Here, we plot the solutions along a fixed horizontal line. The errors of $\mathbf{u}$ and $p$ using different losses are depicted in Fig. 12.
We can see that the ωVP-loss or the VgVP-loss with the MscaleDNN can produce very accurate solutions in just 1500 epochs, while the VSP-loss needs more training to achieve similar accuracy.

Figure 9: Exact $u_x$ and its MscaleDNN approximation with ωVP-loss $L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q)$.

For comparison, we also test the DNN-based algorithm using only fully connected DNNs. For $\mathbf{u}$ and the intermediate variables $\omega$, $\mathbf{T}$ and $\mathbf{U}$, we use fully connected DNNs with 8 hidden layers and 1650 neurons in each hidden layer. For $p$ and $\mathbf{q}$, we use fully connected DNNs with 8 hidden layers and 300 neurons in each hidden layer. Therefore, the total numbers of neurons in the fully connected DNNs and the MscaleDNNs are the same. The losses and $\ell^2$-errors obtained by minimizing the different loss functions in (3.12) are compared in Figs. 13-15. For this highly oscillatory solution, the algorithms using fully connected DNNs cannot learn anything within 1500 epochs. However, the ones using MscaleDNNs converge very fast within 1500 epochs.

Figure 10: Exact $u_x$ and its MscaleDNN approximation with VSP-loss $L_{VSP}(\theta_u,\theta_p,\theta_T,\theta_q)$.

Figure 11: Exact $u_x$ and its MscaleDNN approximation with VgVP-loss $L_{VgVP}(\theta_u,\theta_p,\theta_U,\theta_q)$.

5.2 Oscillatory flow with multiple frequencies

Our second test problem is a case where the velocity field has multiple high frequencies, as follows:
$$\begin{aligned} u_1 &= 1 - e^{\lambda x_1}\cos(n_1\pi x_1 + n_1\pi x_2) - e^{\lambda x_1}\cos(n_2\pi x_1 + n_2\pi x_2), \\ u_2 &= \frac{\lambda}{n_1\pi}\, e^{\lambda x_1}\sin(n_1\pi x_1 + n_1\pi x_2) + e^{\lambda x_1}\cos(n_1\pi x_1 + n_1\pi x_2) \\ &\quad + \frac{\lambda}{n_2\pi}\, e^{\lambda x_1}\sin(n_2\pi x_1 + n_2\pi x_2) + e^{\lambda x_1}\cos(n_2\pi x_1 + n_2\pi x_2), \\ p &= \frac{1}{2}\big(1 - e^{2\lambda x_1}\big), \qquad \lambda = \frac{Re}{2} - \sqrt{\frac{Re^2}{4} + 4\pi^2}, \qquad Re = \frac{1}{\nu}, \end{aligned} \qquad (5.2)$$
with two distinct frequency parameters $n_1$ and $n_2$.
Figure 12: Errors of MscaleDNN approximations using different loss functions: (a) Err(u); (b) Err(p).

Figure 13: Comparison of a normal DNN and the MscaleDNN with loss function $L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

Figure 14: Comparison of a normal DNN and the MscaleDNN with loss function $L_{VSP}(\theta_u,\theta_p,\theta_T,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

Figure 15: Comparison of a normal DNN and the MscaleDNN with loss function $L_{VgVP}(\theta_u,\theta_p,\theta_U,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

For this test, the MscaleDNNs for $\mathbf{u}$, $\omega$, $\mathbf{T}$ and $\mathbf{U}$ are set to have 10 scales $\{x, 2x, \cdots, 2^{9}x\}$, and the embedded fully connected DNN for each scale is set to have 8 hidden layers and 120 neurons in each hidden layer. As in the last numerical test, the MscaleDNNs for $p$ and $\mathbf{q}$ are set to have 6 scales $\{x, 2x, \cdots, 2^{5}x\}$, and the embedded fully connected DNN for each scale is set to have 8 hidden layers and 50 neurons in each hidden layer. We randomly sample 425290 points inside $\Omega$ and 140000 points on the boundary for learning. In the learning process, we set the batch size to 5000 points inside the domain and randomly select 2000 points on the boundary for each step.

The MscaleDNN solutions of $\mathbf{u}$ are compared with the exact $\mathbf{u}$ in Figs. 16-18. Here, we again plot the solutions along a fixed horizontal line. The errors of $\mathbf{u}$ and $p$ using different losses are depicted in Fig. 19.
We can see that the ωVP-loss or the VgVP-loss with the MscaleDNN can obtain very accurate solutions within 1500 epochs. Again, the VSP-loss needs more training to achieve similar accuracy.

Figure 16: Exact $u_x$ and its MscaleDNN approximation with loss function $L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q)$.

For comparison, we test algorithms using only fully connected DNNs. For $\mathbf{u}$ and the variables $\omega$, $\mathbf{T}$ and $\mathbf{U}$, we use fully connected DNNs with 8 hidden layers and 1200 neurons in each hidden layer. For $p$ and $\mathbf{q}$, we use fully connected DNNs with 8 hidden layers and 300 neurons in each hidden layer. Again, the total numbers of neurons in the fully connected DNNs and the MscaleDNNs are the same. The losses and $\ell^2$-errors obtained by minimizing the different loss functions in (3.12) are compared in Figs. 20-22, which clearly show the fast convergence of the MscaleDNNs when the normal fully connected DNNs fail to converge at all.

Figure 17: Exact $u_x$ and its MscaleDNN approximation with loss function $L_{VSP}(\theta_u,\theta_p,\theta_T,\theta_q)$.

Figure 18: Exact $u_x$ and its MscaleDNN approximation with loss function $L_{VgVP}(\theta_u,\theta_p,\theta_U,\theta_q)$.
Figure 19: Errors of MscaleDNN approximations using different loss functions: (a) Err(u); (b) Err(p).

Figure 20: Comparison of a normal DNN and the MscaleDNN with loss function $L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

Figure 21: Comparison of a normal DNN and the MscaleDNN with loss function $L_{VSP}(\theta_u,\theta_p,\theta_T,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

Figure 22: Comparison of a normal DNN and the MscaleDNN with loss function $L_{VgVP}(\theta_u,\theta_p,\theta_U,\theta_q)$: (a) loss; (b) Err(u); (c) Err(p).

5.3 Pressure p and the Poisson equation

It is a well-known fact that the traditional projection methods for incompressible flow may experience an error degeneration for the pressure near the boundaries, depending on the type of pressure boundary conditions used for the Poisson equation (3.10) [13]. To show the importance of the pressure's Poisson equation in the DNN-based approaches for the Stokes problem, we will study a loss function without explicitly including the residual of the Poisson equation. Here, we consider a modification of the loss function $L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q)$ given by
$$\widetilde{L}_{\omega VP}(\theta_u,\theta_p,\theta_\omega) := \|\nu\nabla\times\omega + \nabla p - \mathbf{f}\|^2_{\Omega} + \|\nabla\times\mathbf{u} - \omega\|^2_{\Omega} + \|\nabla\cdot\mathbf{u}\|^2_{\Omega} + \beta\|\mathbf{u} - \mathbf{g}\|^2_{\partial\Omega}. \qquad (5.3)$$
The input data, sizes of the MscaleDNNs and other settings are exactly the same as those used in the numerical tests in Section 5.2. The loss and errors of the MscaleDNN solutions are compared with those of the algorithm using the loss function $L_{\omega VP}$ in Fig. 23. We can see that the loss and $Err(\mathbf{u})$ are comparable. However, $Err(p)$ is significantly improved if the loss function with a Poisson equation residual is used.
Figure 23: Effect of the Poisson equation in the loss function: loss functions $L_{\omega VP}(\theta_u,\theta_p,\theta_\omega,\theta_q)$ (red lines and diamonds) vs. $\widetilde{L}_{\omega VP}(\theta_u,\theta_p,\theta_\omega)$ (blue lines and circles); (a) loss; (b) Err(u); (c) Err(p).

6 Conclusion and future work

In this paper, we have studied MscaleDNN methods for solving highly oscillatory Stokes flows in complex domains, and demonstrated the capability of the MscaleDNN as a meshless and high resolution numerical method for simulating flows in complex domains. Several least squares formulations of the Stokes equations, using different forms of first order systems, are used to construct the loss functions for the MscaleDNN learning. The numerical results have clearly demonstrated the increased resolution power of the MscaleDNN in capturing the fine structures in the flow fields when a normal fully connected network of the same overall size fails to converge at all. The MscaleDNN shows the potential of DNN machine learning as a practical alternative to traditional finite element methods. The DNN-based methods have the obvious advantage of requiring neither expensive mesh generation and matrix solvers, as in traditional mesh-based numerical methods, nor the delicate treatment of pressure boundary conditions and incompressibility constraints of the flow field.

There are many unresolved issues for solving the Navier-Stokes equations; among them, the most important one is to understand the convergence properties of the MscaleDNN learning. A related issue is to find adaptive strategies to dynamically select the penalty constants for the various terms in the loss functions, to which the performance of DNN based machine learning PDE algorithms is sensitive. It should also be mentioned that the structure of the MscaleDNN is amenable to adaptive selection of scales, by either adding or removing a scale dynamically during learning; future work will explore this feature, as well as apply the MscaleDNN to 3-D time-dependent incompressible flows.

Acknowledgments
W.C. is supported by the U.S. Army Research Office (grant W911NF-17-1-0368). B.W. acknowledges the financial support provided by NSFC (grants 11771137, 12022104).
References

[1] P. B. Bochev and M. D. Gunzburger, Finite element methods of least-squares type, SIAM Rev., 40 (1998), pp. 789-837.
[2] W. Cai, X. G. Li, and L. Z. Liu, A phase shift deep neural network for high frequency approximation and wave problems, to appear in SIAM J. Sci. Comput., arXiv:1909.11759, 2019.
[3] C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. Zang, Spectral Methods in Fluid Dynamics, Springer-Verlag, New York/Berlin, 1987.
[4] A. J. Chorin, On the convergence of discrete approximations to the Navier-Stokes equations, Math. Comp., 23 (1969), pp. 341-353.
[5] I. Daubechies, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, 1992.
[6] W. N. E and J. G. Liu, Gauge method for viscous incompressible flows, Commun. Math. Sci., 1 (2003), pp. 317-332.
[7] W. N. E and B. Yu, The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems, Commun. Math. Stat., 6 (2018), pp. 1-12.
[8] V. Girault and P. A. Raviart, Finite Element Methods for Navier-Stokes Equations: Theory and Algorithms, Springer Science & Business Media, 2012.