A Machine-Learning Method for Time-Dependent Wave Equations over Unbounded Domains
CHANGJIAN XIE, JINGRUN CHEN, AND XIANTAO LI
Abstract.
Time-dependent wave equations represent an important class of partial differential equations (PDEs) for describing wave propagation phenomena, which are often formulated over unbounded domains. Given a compactly supported initial condition, classical numerical methods reduce such problems to bounded domains using artificial boundary conditions (ABCs). In this work, we present a machine-learning method to solve this equation as an alternative to ABCs. Specifically, the mapping from the initial conditions to the PDE solution is represented by a neural network, trained using wave packets that are parameterized by their bandwidth and wave numbers. The accuracy is tested for both the second-order wave equation and the Schrödinger equation, including the nonlinear Schrödinger equation. We examine the accuracy from both interpolations and extrapolations. For initial conditions lying in the training set, the learned map has good interpolation accuracy, due to the approximation property of deep neural networks. The learned map also exhibits some good extrapolation accuracy. Therefore, the proposed method provides an interesting alternative for finite-time simulation of wave propagation.

Date: January 18, 2021.
2010 Mathematics Subject Classification.
Key words and phrases. Machine learning, wave equation, unbounded domain.

1. Introduction
Wave propagation is a ubiquitous phenomenon and, for a long time, the associated properties have been a subject of interest in many disciplines [47]. Aside from the well-known acoustic waves, the Schrödinger equation that describes electronic waves and the elastodynamics that embodies stress waves [20] are also important examples. These models share the common ground that waves often propagate in an unbounded domain, even though they are triggered locally, e.g., by the presence of a wave source or an external forcing.

One classical numerical approach to treat wave propagation in an unbounded domain is the absorbing boundary condition (ABC), which confines the computation to a finite domain; an ABC is then imposed on the boundary to minimize undesirable reflections [2, 8, 19, 23, 26]. Rather than simply removing the exterior region, the ABC provides an efficient approach to mimic the influence of the surrounding environment. There are several different approaches to construct and implement ABCs, most of which involve the derivation and approximation of the Dirichlet-to-Neumann map. There has been a large body of work on ABCs and interested readers may refer to the review articles [1, 23] for
details and references therein. The integration of ABCs with finite difference or finite element methods has also been extensively studied [37, 40, 44].

Recently, the rapid progress in deep learning has driven the development of solution techniques for PDEs under the framework of deep learning, especially in high-dimensional cases where deep neural networks (DNNs) are expected to overcome the curse of dimensionality; see [15] for a review and [6, 7, 10, 16, 18, 27, 30, 34, 36, 39, 41, 45, 55] for specific examples. One remarkable application of neural networks is physics-informed neural networks (PINNs) [41], which have demonstrated their accuracy in solving both forward problems and inverse problems where model parameters are inferred from the observed data. PINNs have already been applied to a range of problems, including those in fluid dynamics [32, 42], meta-material design [12], biomedical engineering [53], uncertainty quantification [50] and free boundary problems, besides high-dimensional PDEs and stochastic differential equations. Typically, the loss function is defined over a finite domain in most methods, such as the deep Ritz method [18], the deep Galerkin method [45], physics-informed neural networks [41], and the deep mixed residual method [39]. To the best of our knowledge, the only exceptions are the full history recursive multilevel Picard approximation method [6, 30] and the deep backward stochastic differential equation method [16, 27], where the solution of the underlying PDE is approximated through the solution of a suitable stochastic optimization problem on an appropriate function space. Typical equations are (semilinear) parabolic PDEs. These recent works have demonstrated the strong representability of DNNs for solving PDEs.

The current work aims to solve time-dependent wave equations on unbounded domains using deep learning. One natural approach is to build an artificial neural network (ANN) that takes the most important ABC, the perfectly matched layer (PML), into account [51, 52]. The basic idea behind the approach in [51] is as follows. Given the electromagnetic field at the current step, one can predict the field on the PML boundary at the next time step. Then, one computes the field in a slightly larger domain, called the object domain, at the next step with the Finite-Difference Time-Domain (FDTD) method, using the output from the PML, which subsequently becomes the new input data. Furthermore, one can embed the network model into the FDTD method and replace the PML. The data groups are collected at the interface with a conventional PML. The Long Short-Term Memory (LSTM) network based on the PML model in [52] can achieve higher accuracy than the ANN based on the PML model, thanks to the sequence-dependence feature of LSTM networks. Compared to the conventional PML approach, the machine-learning methods in [51, 52] decrease the size of the boundary region and the complexity of the FDTD method, due to the introduction of a one-cell boundary layer. However, the data generation involves a prior PML computation. This process involves the history of solutions at the boundary and may be rather complicated in general.

We propose a different machine-learning strategy to solve the wave propagation problem over an unbounded domain. Given a compactly supported initial condition, we restrict the full problem to a solution mapping over a finite region.
More specifically, the mapping from the initial condition, expressed as wave packets with bandwidth and wave numbers as parameters, to the PDE solution in the same compactly supported domain at later times, is represented by a fully connected neural network or residual
neural network. The parameters in the network are then trained using data consisting of either the reference solution or a numerically computed solution.

On one hand, the mapping can generate accurate results in cases where the specific initial condition is not included in the training set but can be interpolated using initial conditions in the training set. On the other hand, the method also allows extrapolations, e.g., when the wave packet arrives at the boundary of the finite region although the training set only contains temporal instances prior to that event. Compared to existing works, the proposed method can be easily implemented. The solutions represented by the DNN also exhibit absorbing properties, but there is no need to determine the coefficients in ABCs, or to incorporate ABCs into finite difference or finite element methods. The proposed method provides an alternative for finite-time simulation of wave propagation.

This paper is organized as follows. In section 2, we describe the machine-learning method for two representative wave equations: the second-order wave equation and the Schrödinger equation. Numerous examples are provided to show the interpolative and extrapolative properties of the proposed method in section 3. Conclusions are drawn in section 4.
2. Methodology
To elaborate the approach of constructing solution representations by a neural network, we consider, as specific examples, the time-dependent acoustic wave equation and the Schrödinger equation, due to the fact that they have been treated extensively in mathematical analysis and numerical approximation. But we expect that the idea can be extended to other types of wave equations. We express these two models as time-dependent PDEs over the entire space $\mathbb{R}^d$:

(I) Time-dependent wave equation:
$$u_{tt} = \Delta u, \quad \mathbf{x} \in \mathbb{R}^d, \ t > 0, \qquad u(\mathbf{x}, 0) = u_0(\mathbf{x}), \quad u_t(\mathbf{x}, 0) = v_0(\mathbf{x}). \tag{2.1}$$

(II) Time-dependent Schrödinger equation:
$$i\partial_t u(\mathbf{x}, t) = -\Delta u(\mathbf{x}, t) + V(\mathbf{x}, t) u(\mathbf{x}, t) + f(|u(\mathbf{x}, t)|^2) u(\mathbf{x}, t), \quad \mathbf{x} \in \mathbb{R}^d, \ t > 0, \qquad u(\mathbf{x}, 0) = u_0(\mathbf{x}). \tag{2.2}$$

There are important cases that deserve particular attention:

(a) The linear Schrödinger equation: $f \equiv 0$. This describes the dynamics of a free electron.

(b) The cubic Schrödinger equation:
$$f(\rho) = \beta\rho, \quad \rho \in [0, \infty), \tag{2.3}$$
where $\beta$ (positive for repulsive or defocusing interaction and negative for attractive or focusing interaction) is a given dimensionless constant describing the strength of the interaction. It has been widely used to model nonlinear wave interactions in a dispersive medium.
We have expressed these models in their non-dimensionalized forms. For example, the wave speed in (2.1) and the Planck constant in (2.2) have both been set to unity. In addition, we set $\beta = -1$. We assume that the initial data and the potential are compactly supported in a bounded domain $\Omega \subset \mathbb{R}^d$, that is, $\mathrm{supp}(u_0), \mathrm{supp}(v_0), \mathrm{supp}(V) \subset \Omega$. Our aim is to determine the solution $u(\cdot, t)$ in the same domain $\Omega$ at later times.

2.1. The training procedure.
In this section, we describe how the solution is trained using neural networks. One key step in a machine learning procedure is the preparation of a dataset, which will subsequently be fed into the machine learning model. We first prepare a dataset consisting of the initial condition and the corresponding solutions at later times. We denote the initial data by $U_0 = (u_0, \partial_t u_0)$ for the time-dependent wave equation (2.1) and $U_0 = (u_0^{\mathrm{re}}, u_0^{\mathrm{im}})$ for the time-dependent Schrödinger equation (2.2). In principle, the mapping from $U_0$ to the solution at a later time can be expressed as an operator $S$,
$$u(\mathbf{x}, t)|_\Omega = S U_0. \tag{2.4}$$
For example, in the linear case, this can be written as an integral operator using the Green's function [21]. But such an expression is of limited value in practice since the direct evaluation is rather expensive. Here we represent such a mapping using a neural network and determine the parameters through training.

In the training step, we consider three cases, as motivated by the terminology in control systems:

(I) Single-input single-output (SISO) datasets
$$\left\{ U_0^\ell, \, U_T^\ell \right\}_{\ell=1}^N. \tag{2.5}$$

(II) Single-input multiple-output (SIMO) datasets
$$\left\{ U_0^\ell, \, \{u^\ell(t_i)\}_{i=1}^p \right\}_{\ell=1}^N. \tag{2.6}$$

(III) Exogenous-input multiple-output (XIMO) datasets
$$\left\{ \{V^\ell(t_i)\}_{i=1}^p, \, \{u^\ell(t_i)\}_{i=1}^p \right\}_{\ell=1}^N. \tag{2.7}$$

Here the integer $N$ denotes the number of training samples, and $p$ refers to the time instances where the solutions are observed. The input simply refers to the initial conditions and the output involves the resulting solutions at a later time (or at multiple time instances). Namely, $U_T := \{u(\cdot, T)\}$. These solutions will be collected at grid points that lie in the domain of interest $\Omega$. For simplicity, we also work with $U_T^\ell$ in the same domain, but in practice one might be interested in $U_T$ in a different domain. In the case of XIMO, one may consider the Schrödinger equation with the initial condition fixed; the dynamics is then entirely driven by the external potential.
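To make the three layouts concrete, the following sketch assembles SISO and SIMO datasets as arrays. The grid, the sample sizes, the observation times, and the helper functions are our own illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, Nx = 100, 5, 201                  # samples, time instances, grid points (assumed)
x = np.linspace(-10.0, 10.0, Nx)        # spatial grid (assumed domain)
t_obs = np.linspace(0.2, 1.0, p)        # observation times t_1, ..., t_p (assumed)

def wave_packet(k, sigma):
    """Gaussian wave packet of the form used in the training sets below."""
    return np.exp(-x**2 / sigma**2) * np.cos(k * x)

def solve_pde(u0, t):
    """Stand-in for a reference solver returning u(., t) on the grid; it
    returns the initial data unchanged so the shapes can be inspected."""
    return u0

ks = rng.uniform(1.0, 10.0, size=N)
sigmas = rng.uniform(0.5, 2.0, size=N)

# (I) SISO dataset (2.5): initial condition -> solution at a single time T.
U0 = np.stack([wave_packet(k, s) for k, s in zip(ks, sigmas)])      # (N, Nx)
UT = np.stack([solve_pde(u, t_obs[-1]) for u in U0])                # (N, Nx)

# (II) SIMO dataset (2.6): initial condition -> solutions at p instances.
U_multi = np.stack([[solve_pde(u, t) for t in t_obs] for u in U0])  # (N, p, Nx)
# A XIMO dataset (2.7) would instead collect samples of the potential V at
# the p instances as the input, with the initial condition held fixed.
```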
We will discuss the construction of datasets in more detail in the next section. In particular, properties of wave propagation, e.g., wave length and dispersion relations, are built into the training set. Since $U_T$ is fully determined by $U_0$, we approximate the mapping from $U_0$ to $U_T$ using a neural network, denoted by $\mathcal{N}_{MD}(U_0, W)$, i.e.,
$$U_T \approx \mathcal{N}_{MD}(U_0, W). \tag{2.8}$$
The function $\mathcal{N}_{MD}$ is determined by a network consisting of $D$ layers with width $M$, and the associated parameters are denoted by $W$. For a fully connected neural network (FCNN), the mapping (2.8) from input to output is explicitly given by
$$\mathcal{N}_{MD}(U_0, W) = W_D^T H_{M,D-1}(U_0, \widetilde{W}) + b_D, \tag{2.9}$$
where
$$H_{M,D-1}(U_0, \widetilde{W}) = \phi\big(W_{D-1}^T \cdots \phi(W_2^T \phi(W_1^T U_0 + b_1) + b_2) \cdots + b_{D-1}\big),$$
with $\phi$ being the activation function and $\{W_j, b_j\}_{j=1}^D$ being the parameters specified by the network.

The residual neural network (ResNet) structure [28] will also be considered in our numerical studies. In this case, the mapping can be expressed with the following steps,
$$y_0 = W_1^T U_0 + b_1,$$
$$y_1 = y_0 + H_{M,D}^1(y_0),$$
$$y_2 = y_1 + H_{M,D}^2(y_1),$$
$$\cdots$$
$$y_s = y_{s-1} + H_{M,D}^s(y_{s-1}),$$
$$\widetilde{\mathcal{N}}_{MD}(U_0, W) = W_D^T y_s + b_D,$$
where $s$ is the number of residual blocks with skip connections.

The neural network underlying the mapping (2.8) is illustrated in Figure 1. Also shown in the diagram is the case where solutions at multiple instances $\{t_i\}_{i=1}^p$ are included as the output.

Figure 1. Schematic of the learned mapping $\mathcal{N}_{MD}$ (FCNNs) or $\widetilde{\mathcal{N}}_{MD}$ (ResNets), from an initial condition given by Gaussian wave packets with compact support, to the output data from the exact solution, a numerical solution, or the neural network prediction. Top row, (A1) and (A2): the time-dependent wave equation; bottom row, (B1) and (B2): the time-dependent Schrödinger equation. Single-snapshot output (left panels) and multiple-snapshot output (right panels).

The next step is to formulate the problem as a supervised learning problem by means of minimizing the population risk (expected risk), elaborated in [9], by
$$\min_W \mathbb{E}_{(u_0, u_T) \sim \mu} \left[ \|\mathcal{N}_{MD}(u_0, W) - u_T\|^2 \right], \quad \text{or} \quad \min_W \mathbb{E}_t \, \mathbb{E}_{(u_0, u(t)) \sim \mu} \left[ \|\mathcal{N}_{MD}(u_0, W) - u(t)\|^2 \right],$$
with $\mu$ being a probability distribution, which in practice can be discretized by the mean squared error as the empirical loss for the training samples. For example, for a SISO dataset, this leads to the cost function
$$L(U_0, W) = \frac{1}{N} \sum_{\ell=1}^N \left[ \mathcal{N}_{MD}(U_0^\ell, W) - u_T^\ell \right]^2. \tag{2.10}$$
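A minimal PyTorch sketch of the two architectures is given below. The class names, the default sizes, and the activation table are illustrative assumptions; the structure follows (2.9) and the residual steps above:

```python
import torch
import torch.nn as nn

ACTIVATIONS = {"relu": nn.ReLU(), "tanh": nn.Tanh(),
               "sigmoid": nn.Sigmoid(), "elu": nn.ELU()}

class FCNN(nn.Module):
    """Fully connected network realizing (2.9): D layers of width M."""
    def __init__(self, m0, m1, M=100, D=5, act="relu"):
        super().__init__()
        phi = ACTIVATIONS[act]
        layers = [nn.Linear(m0, M), phi]
        for _ in range(D - 2):               # hidden layers
            layers += [nn.Linear(M, M), phi]
        layers += [nn.Linear(M, m1)]         # linear output layer
        self.net = nn.Sequential(*layers)

    def forward(self, U0):
        return self.net(U0)

class ResNet(nn.Module):
    """Residual network: s blocks with skip connections, as in the steps above."""
    def __init__(self, m0, m1, M=100, s=2, depth_per_block=2, act="relu"):
        super().__init__()
        phi = ACTIVATIONS[act]
        self.lift = nn.Linear(m0, M)         # y0 = W1^T U0 + b1
        self.blocks = nn.ModuleList()
        for _ in range(s):
            block = []
            for _ in range(depth_per_block):
                block += [nn.Linear(M, M), phi]
            self.blocks.append(nn.Sequential(*block))
        self.out = nn.Linear(M, m1)          # W_D^T y_s + b_D

    def forward(self, U0):
        y = self.lift(U0)
        for H in self.blocks:
            y = y + H(y)                     # skip connection
        return self.out(y)
```

Here $m_0$ and $m_1$ are the input and output widths listed in Table 1; for instance, Example 3.1 below uses $m_0 = 2N_x$ and $m_1 = N_x$.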
Similarly, for a SIMO dataset, we can define the loss function as follows,
$$L(U_0, W) = \frac{1}{Np} \sum_{i=1}^p \sum_{\ell=1}^N \left[ \mathcal{N}_{MD}(U_0^\ell, W) - u^\ell(t_i) \right]^2. \tag{2.11}$$
Notations for the parameters of our model and algorithm are summarized in Table 1.

Table 1. Notations for the parameters in the model (2.9) and the algorithm.
  $d$ — dimension of the problem
  $K$ — set of wave numbers of the wave packets
  $\Sigma$ — set of widths of the wave packets
  $D$ — number of layers
  $M$ — number of neurons in each hidden layer
  $N$ — number of initial conditions (training samples)
  $m_0$ — number of columns of the input matrix
  $m_1$ — number of columns of the output matrix
  $p$ — number of time instances
  $\mathcal{N}_{MD}$ — representation by fully connected neural networks
  $\widetilde{\mathcal{N}}_{MD}$ — representation by residual neural networks
  $\beta$ — constant describing the strength of the interaction
  $s$ — number of residual blocks
In addition to the structure of the network, another factor that may play a significant role in the approximation (2.8) is the choice of the activation function. In this paper, we first pick the FCNNs with the widely used relu activation function. Then we implement a number of other nonlinear activation functions to test their accuracy, including:
$$\mathrm{relu}(x) = x_+ = \begin{cases} x, & x > 0, \\ 0, & \text{otherwise}, \end{cases} \qquad \tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)},$$
$$\mathrm{sigmoid}(x) = \frac{1}{1 + \exp(-x)}, \qquad \mathrm{elu}(x) = \begin{cases} x, & x > 0, \\ \alpha(\exp(x) - 1), & \text{otherwise}. \end{cases}$$
Our remaining task is to (i) collect a suitable set of initial conditions and the corresponding solutions at later times, so that the nonlinear representation (2.8) can be trained; and (ii) test the approximation (2.8) against analytical or numerical solutions.
2.2. The training set for the time-dependent wave equation.
Typical analysis of wave propagation starts with the dispersion properties obtained via the Fourier transform [47]. For the time-dependent wave equation (2.1), the dispersion relation is given by $\omega(\mathbf{k}) = |\mathbf{k}|$, with $\mathbf{k}$ being the wave number. Often observed in practice are wave packets that are confined by an envelope and travel as a unit. Here we use wave packets to form the training set. More specifically, we consider those wave packets with a Gaussian envelope, which can be derived, e.g., by using the Fourier transform. For instance, for the acoustic wave equation (2.1), from the initial conditions
$$u(\mathbf{x}, 0) = \exp\left(-\frac{|\mathbf{x}|^2}{\sigma^2}\right) \cos(\mathbf{k} \cdot \mathbf{x}), \quad \mathbf{x} \in \mathbb{R}^d, \ \mathbf{k} \in K, \ \sigma \in \Sigma, \tag{2.12}$$
one obtains
$$u(\mathbf{x}, t) = \exp\left(-\frac{\sum_{i=1}^d (x_i - t)^2}{\sigma^2}\right) \cos(\mathbf{k} \cdot \mathbf{x} - |\mathbf{k}| t), \quad \mathbf{x} \in \mathbb{R}^d. \tag{2.13}$$
This implies that the initial velocity is given by
$$u_t(\mathbf{x}, 0) = \exp\left(-\frac{|\mathbf{x}|^2}{\sigma^2}\right) \left[ \frac{2\sum_{i=1}^d x_i}{\sigma^2} \cos(\mathbf{k} \cdot \mathbf{x}) + |\mathbf{k}| \sin(\mathbf{k} \cdot \mathbf{x}) \right], \quad \mathbf{x} \in \mathbb{R}^d. \tag{2.14}$$
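For the 1D case, the training pairs can be generated directly from (2.12)–(2.14). The small sketch below does this, with the grid, the parameter subsets, and the output time chosen arbitrarily for illustration:

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 201)        # 1D spatial grid (assumed domain)
T = 2.0                                  # output time, as in Example 3.1
K = [1.0, 2.0, 3.0]                      # assumed subset of wave numbers
Sigma = [0.5, 1.0, 2.0]                  # assumed subset of widths

inputs, outputs = [], []
for k in K:
    for sigma in Sigma:
        u0 = np.exp(-x**2 / sigma**2) * np.cos(k * x)                  # (2.12)
        v0 = np.exp(-x**2 / sigma**2) * (2 * x / sigma**2 * np.cos(k * x)
                                         + k * np.sin(k * x))          # (2.14)
        uT = np.exp(-(x - T)**2 / sigma**2) * np.cos(k * x - k * T)    # (2.13), d = 1
        inputs.append(np.concatenate([u0, v0]))  # U0 = (u0, v0), so m0 = 2 Nx
        outputs.append(uT)                       # U_T, so m1 = Nx

U0 = np.stack(inputs)     # training inputs,  shape (N, 2*Nx)
UT = np.stack(outputs)    # training targets, shape (N, Nx)
```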
2.3. The training set for the time-dependent Schrödinger equation.
2.3.1. The linear case.
For the linear Schrödinger equation,
$$i u_t = -\Delta u, \quad \mathbf{x} \in \mathbb{R}^d, \ t > 0, \tag{2.15}$$
the dispersion relation is given by $\omega(\mathbf{k}) = |\mathbf{k}|^2$.

To form the training set, we first pick the initial conditions from a family of wave packets,
$$u_0(\mathbf{x}) = \exp\left[ -\frac{(\mathbf{x}/\sqrt{2})^2}{\sigma^2} + i \mathbf{k} \cdot \frac{\mathbf{x}}{\sqrt{2}} \right], \quad \mathbf{k} \in K, \ \sigma \in \Sigma, \tag{2.16}$$
representing a Gaussian wave packet centered at the origin with wave number $\mathbf{k}$. The width parameter of the Gaussian envelope will be drawn from a pre-selected set:
$$\sigma \in \Sigma, \ \Sigma \subset \Sigma_0 := \mathbb{R}_+, \qquad \mathbf{k} \in K, \ K \subset K_0 := \mathbb{R}^d.$$
Theoretically, we can go through the whole spaces $K_0$ and $\Sigma_0$. For practical purposes, we take some representative elements from the finite sets $K$ and $\Sigma$. We will discuss more details about the selection of $\Sigma$ and $K$ in the next section, and demonstrate how they impact the accuracy.

For each $\sigma$ and $\mathbf{k}$, the exact solution to (2.15) can be constructed directly. For one-dimensional problems ($d = 1$),
$$u(x, t) = \left(1 + \frac{2it}{\sigma^2}\right)^{-1/2} \exp\left( -\frac{(x/\sqrt{2} - kt)^2}{\sigma^2 (1 + 2it/\sigma^2)} \right) \exp\left( i\left( \frac{kx}{\sqrt{2}} - \frac{k^2 t}{2} \right) \right). \tag{2.17}$$
For two-dimensional problems ($d = 2$), we have
$$u(x_1, x_2, t) = \left(1 + \frac{2it}{\sigma^2}\right)^{-1} \exp\left( -\frac{(x_1/\sqrt{2} - k_1 t)^2 + (x_2/\sqrt{2} - k_2 t)^2}{\sigma^2 (1 + 2it/\sigma^2)} \right) \exp\left( i\left( \frac{k_1 x_1 + k_2 x_2}{\sqrt{2}} - \frac{(k_1^2 + k_2^2) t}{2} \right) \right). \tag{2.18}$$
These formulas can be generalized to arbitrary dimension $d$,
$$u(\mathbf{x}, t) = \left(1 + \frac{2it}{\sigma^2}\right)^{-d/2} \exp\left( -\frac{|\mathbf{x}/\sqrt{2} - \mathbf{k} t|^2}{\sigma^2 (1 + 2it/\sigma^2)} \right) \exp\left( i\left( \frac{\mathbf{k} \cdot \mathbf{x}}{\sqrt{2}} - \frac{|\mathbf{k}|^2 t}{2} \right) \right). \tag{2.19}$$
This family of solutions will constitute the datasets defined in (2.5) and (2.6), which will be used in the training step (2.10).
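The linear training data can then be generated by evaluating the closed form on a grid. The sketch below evaluates the 1D formula (2.17) with complex NumPy arithmetic; the grid and the parameter values are assumptions:

```python
import numpy as np

def packet_exact(x, t, k, sigma):
    """Free-Schrodinger evolution of the Gaussian packet (2.16),
    via the 1D closed form (2.17)."""
    z = 1.0 + 2.0j * t / sigma**2                     # complex width factor
    envelope = np.exp(-((x / np.sqrt(2) - k * t) ** 2) / (sigma**2 * z))
    phase = np.exp(1j * (k * x / np.sqrt(2) - 0.5 * k**2 * t))
    return envelope * phase / np.sqrt(z)

x = np.linspace(-10.0, 10.0, 201)                     # assumed grid
u0 = packet_exact(x, 0.0, k=3.0, sigma=1.0)           # initial condition (2.16)
uT = packet_exact(x, 1.0, k=3.0, sigma=1.0)           # solution at T = 1
```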
2.3.2. The nonlinear case. Since analytical solutions are difficult to obtain for the nonlinear PDE (2.2), we generate the solutions $U_T^\ell$ using numerical methods. Here we use a finite difference scheme with uniform grid size, together with an operator-splitting scheme in time [4]. More specifically, we use the Strang splitting, which at each time step involves the following operations:

(a) Solve
$$i u_t + \Delta u = 0,$$
for half of the time step, $\Delta t / 2$. Due to the linearity, this can be done exactly using the Fourier transform to diagonalize the Laplacian term.

(b) Using the solution from the previous step, solve
$$i u_t + |u|^2 u = 0,$$
for one step. Using the fact that
$$\frac{d}{dt} |u|^2 = 0,$$
this equation can also be solved exactly. This can also be extended to include an external scalar potential, that is,
$$i u_t + \left( |u|^2 + V(x, t) \right) u = 0,$$
for one step.

(c) Solve $i u_t + \Delta u = 0$ again for another half step.

The symmetric operator splitting is known to have second-order accuracy in time. In principle, one can also use higher-order methods [54], but the current numerical method is already adequate to test the neural network approximation. We also pick initial conditions from (2.16). The solutions at time $T$, together with the initial conditions (2.16), will form the data set.
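A compact sketch of one Strang step on a periodic grid is given below, with the linear sub-steps applied in Fourier space. The function signature and grid handling are our own assumptions:

```python
import numpy as np

def strang_step(u, dt, x, V=None, t=0.0):
    """One Strang step for i u_t + u_xx + (|u|^2 + V) u = 0 on a periodic grid.
    Steps (a)/(c): exact free flight via FFT; step (b): exact phase rotation,
    which is valid because |u|^2 is conserved by the nonlinear sub-flow."""
    N, L = x.size, x[-1] - x[0] + (x[1] - x[0])      # grid size and period
    xi = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)    # Fourier wave numbers
    half_flight = np.exp(-1j * xi**2 * dt / 2.0)     # exact half-step multiplier

    u = np.fft.ifft(half_flight * np.fft.fft(u))     # (a) half step, linear part
    phase = np.abs(u) ** 2
    if V is not None:
        phase = phase + V(x, t + dt / 2.0)
    u = u * np.exp(1j * phase * dt)                  # (b) full step, nonlinear part
    u = np.fft.ifft(half_flight * np.fft.fft(u))     # (c) half step, linear part
    return u
```

A full trajectory up to time $T$ is then obtained by composing `strang_step` over uniform time steps.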
Remark 2.1. Another interesting approach to building the training set is to design an embedded neural network to obtain the solutions to PDEs; see [18, 39] for examples. In this case, one employs the solutions corresponding to the Gaussian wave packet initial conditions as datasets to feed into FCNNs and ResNets in high dimensions.
2.4. Optimization.
Formulated as an optimization problem, the parameters in the network can be obtained by using stochastic or batch optimization algorithms, applied to the expected or empirical risks (2.10) and (2.11). For a comparison of these methods, one can refer to [9]. The prototypical stochastic optimization method is the stochastic gradient descent method [43], which, in the context of minimizing $L(U_0, W)$, with $W$ initialized as in [29], is defined by
$$W_{k+1} \leftarrow W_k - \alpha_k \nabla L_{i_k}(U_0, W_k), \tag{2.20}$$
for all $k \in \mathbb{N}$. The index $i_k$ is chosen randomly and $\alpha_k$ is a positive stepsize known as the learning rate. Each iteration of this method is thus very cheap, involving only the computation of the gradient $\nabla L_{i_k}(U_0, W_k)$ corresponding to one sample. In many cases, a batch approach is a more natural fit. In this paper, we employ the Adam method [35].
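A minimal training loop matching the SISO loss (2.10) and the Adam choice above might look as follows; the batch size, learning rate, and the relative error helper (cf. (3.5) below) are illustrative assumptions:

```python
import torch

def train(model, U0, UT, epochs=20000, lr=1e-3, batch=32):
    """Minimize the empirical SISO loss (2.10) with Adam.
    U0: (N, m0) inputs; UT: (N, m1) target solutions at time T."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    N = U0.shape[0]
    for epoch in range(epochs):
        idx = torch.randperm(N)[:batch]   # random mini-batch
        opt.zero_grad()
        loss = loss_fn(model(U0[idx]), UT[idx])
        loss.backward()
        opt.step()
    return model

def relative_l2(u_pred, u_exact):
    """Relative L2 error of a prediction, cf. (3.5)."""
    return torch.linalg.norm(u_pred - u_exact) / torch.linalg.norm(u_exact)
```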
3. Numerical Experiments
In this section, we present numerical examples to test the effectiveness of the neural network representation (2.8). Extensive tests are performed to study the accuracy of the approximation and to examine extrapolations by the neural networks. The training samples for the first two examples are based on analytical solutions of the wave equation (2.1), with the first example in 1D and the second example in 3D. We also extend the numerical test to wave equations in 8 and 16 dimensions where the wave propagation occurs mainly in two dimensions. For the third example, we consider the linear Schrödinger equation (2.2) and build the training set from analytical solutions. In the remaining two examples, we test our method for the cubic Schrödinger equation with solutions computed numerically.
Example 3.1 (The 1D wave equation). We first consider the wave equation (2.1) in 1D. The training sets are gathered as in (2.5) and (2.6). In the numerical experiments, we take the neural network with $D = 5$, $m_0 = 2N_x$, $m_1 = N_x$ and $M = 100$, both for the FCNNs and the ResNets; in the latter case, we choose ResNets with two residual blocks, each block with $D = 2$, $M = 100$ and a skip connection, so that the numbers of parameters of both networks are the same. Meanwhile, we take $N_x = 201$ as the number of grid points for both the training and testing samples in the spatial domain. We also need to specify the sets $K$ and $\Sigma$. We choose $K = \{1, 2, \cdots, 10\}$. For $\Sigma$, we consider two types of selections: a set with linearly spaced widths, and an exponentially spaced set $\{h, 2h, 4h, 8h, 16h, 32h\}$, where the spacing is doubled each time. We train the networks for 20000 epochs.

After the parameters in the network are determined, the performance of the network approximation is tested on solutions with the following initial conditions,
$$u^{\mathrm{I}}(x, 0) = \exp(-x^2) \cos(6x), \quad u_t^{\mathrm{I}}(x, 0) = \exp(-x^2) \left[ 2x \cos(6x) + 6 \sin(6x) \right], \tag{3.1}$$
$$u^{\mathrm{II}}(x, 0) = \exp(-x^2/0.75) \cos(6.5x), \quad u_t^{\mathrm{II}}(x, 0) = \exp(-x^2/0.75) \left[ \frac{2x}{0.75} \cos(6.5x) + 6.5 \sin(6.5x) \right], \tag{3.2}$$
$$u^{\mathrm{III}}(x, 0) = \mathrm{sech}(x) \cos(10x), \quad u_t^{\mathrm{III}}(x, 0) = \mathrm{sech}(x) \left[ \tanh(x) \cos(10x) + 10 \sin(10x) \right], \tag{3.3}$$
$$u^{\mathrm{IV}}(x, 0) = \exp(-x^2) \cos(\tilde{k}x), \quad u_t^{\mathrm{IV}}(x, 0) = \exp(-x^2) \left[ 2x \cos(\tilde{k}x) + \tilde{k} \sin(\tilde{k}x) \right], \tag{3.4}$$
with $\mathrm{sech}(x) = 2/(\exp(x) + \exp(-x))$.

These initial conditions are selected based on the following rationale. We notice that (3.1) is of the same type of initial condition as those in the training sets presented in (2.12) and (2.14); it can be used to verify the training procedure. The initial condition (3.2) has a similar functional form as those in the training set, but the wave number $k$ and the width $\sigma$ do not belong to $K$ and $\Sigma$. In view of the selection of $K$ and $\Sigma$, this can be interpreted as an interpolation in terms of the wave number, but an extrapolation in terms of the width parameter. The initial condition (3.3) is outside of the training sets in the sense that the functional form is completely different. For the last initial condition (3.4), the wave number $\tilde{k}$ will be selected as $\tilde{k} = 10.025, 10.05, \ldots$, slightly outside the range of $K$. The accuracy is measured by the relative error
$$\frac{\|u_{\mathrm{DNN}} - u_{\mathrm{exact}}\|}{\|u_{\mathrm{exact}}\|}. \tag{3.5}$$
The results, in terms of the relative error of the solutions at time $T = 2$, are shown in Table 2, Table 3 and Table 4, where we collected the results for the solutions that correspond to the initial conditions (3.1), (3.2) and (3.3), respectively. The results for the first initial condition are hardly surprising, since the initial condition (3.1) is very similar to those in the training set. But our numerical experiments suggest that this method also yields reasonable accuracy for the initial conditions (3.2) and (3.3), which are not of the type in the training set. We also observe that in most cases the ResNets yield slightly better accuracy. In this case, using the relu function with one of the exponentially spaced choices of $\Sigma$ yields the best result. For the third case (3.3), we observe that the choice of $\Sigma$ with linear spacing produces poor results, and it seems important to have a larger range of width parameters in the training set. We observe that the choice of the activation function also plays a role. For example, with the choice of the relu function, the FCNN and the ResNet yield similar results, while for the sigmoid function, the ResNet has much better accuracy.

The relative error of the solutions at time $T = 2$ from the initial condition (3.4), with various choices of $\tilde{k}$ outside the training set, is shown in Table 5. The training sets are constructed with $K = \{1, 2, \cdots, 10\}$ and a linearly spaced $\Sigma$. As $\tilde{k}$ increases from
10.025, the error grows as $\tilde{k}$ moves further away from $K$. Interestingly, for the ResNet with the activation functions sigmoid(x) and elu(x), the error grows much more slowly.

Example 3.2 (High-dimensional wave equations). In high dimensions, in general, the wave modes are represented by many wave numbers. Here we consider a special case where the variation of the solutions of (2.1) is mainly in the first two dimensions. To this end, we choose training samples specified by wave vectors $\mathbf{k} = (k_1, k_2, k_3, \cdots, k_d)$, with $k_1, k_2$ both in $\{1, 2, \cdots, 10\}$ and $k_i = 1$ for $i = 3, \cdots, d$, which indicates that the wave propagation is mostly restricted to the $xy$-plane. We also choose $\Sigma = \{h, 2h, 4h, 8h, 16h, 32h\}$. The test initial condition is the wave packet with $\mathbf{k} = (2, 1, 1, \cdots, 1)$ and $2\sigma = 1$. We take the neural network with $D = 5$, $m_0 = 2N_x N_y$, $m_1 = N_x N_y$ and $M = 100$ for the FCNNs. We also take $N_x = N_y = 64$ as the number of grid points for both training and testing. The solutions in the $xy$-plane are compared with the exact solutions in Figure 2. One can observe that the error grows as the dimension increases.

Example 3.3 (The 1D linear Schrödinger equation). We consider the linear Schrödinger equation (2.15) in the 1D case. The training sets are constructed from (2.16) and (2.17). We consider the following initial conditions,
$$u_0^{\mathrm{I}}(x) = \exp\left[ -\left(\frac{x}{\sqrt{2}}\right)^2 + i k_{\mathrm{I}} \frac{x}{\sqrt{2}} \right], \tag{3.6}$$
$$u_0^{\mathrm{II}}(x) = \exp\left[ -\left(\frac{x}{\sqrt{2}}\right)^2 \big/ \sigma_{\mathrm{II}}^2 + i k_{\mathrm{II}} \frac{x}{\sqrt{2}} \right], \tag{3.7}$$
$$u_0^{\mathrm{III}}(x) = \exp\left[ -\left(\frac{x}{\sqrt{2}}\right)^2 + i \tilde{k} \frac{x}{\sqrt{2}} \right], \tag{3.8}$$
where $k_{\mathrm{I}}$, $k_{\mathrm{II}}$ and the width $\sigma_{\mathrm{II}}$ are fixed parameter values.
1) and 2 σ = 1. We take the neural network with D = 5, m = 2 N x N y , m = N x N y and M = 100 of FCNNs. We also take N x = N y = 64 as thenumber of grid points for both training and testing in domain Ω = [ − , × [ − , xy − plane at T = 0 . . × − (3D), 1 . × − (8D) and 6 . × − (16D). One canobserve that the error grows as the dimension increase. Example 3.3 (The 1D linear Schr¨odinger equation) . We consider the linear Schr¨odingerequation (2.15) in the 1D case. The training sets are constructed from (2.16) and (2.17).We consider the following initial conditions, u I0 ( x ) = exp (cid:20) − ( x √ + i √ x (cid:21) , (3.6) u II0 ( x ) = exp (cid:20) − ( x √ / . i . √ x (cid:21) , (3.7) u III0 ( x ) = exp (cid:20) − ( x √ + i ˜ k x √ (cid:21) . (3.8) Width Relative error ( × − )Activations Σ u FCNNs u ResNets relu( x ) { h, h, h, h, h, h }| h =0 . .
72 0 . { h, h, h, h, h, h }| h =0 . .
09 6 . { h, h, h, h, h, h }| h =1 .
404 1 . { . , . , , . , . , . } .
389 0 . x ) { h, h, h, h, h, h }| h =0 . .
84 2 . { h, h, h, h, h, h }| h =0 . .
14 2 . { h, h, h, h, h, h }| h =1 .
44 2 . { . , . , , . , . , . } .
34 2 . x ) { h, h, h, h, h, h }| h =0 . .
82 1 . { h, h, h, h, h, h }| h =0 . .
56 1 . { h, h, h, h, h, h }| h =1 .
634 1 . { . , . , , . , . , . } .
08 1 . x ) { h, h, h, h, h, h }| h =0 . .
951 2 . { h, h, h, h, h, h }| h =0 . .
45 2 . { h, h, h, h, h, h }| h =1 .
45 2 . { . , . , , . , . , . } .
45 2 . Table 2.
The approximation error for various choices of activation func-tions and band width of wave packets for the 1D wave equation (2.1) withFCNNs and ResNets for the initial condition (3.1). In the training sets, thewave numbers are chosen from K = { , , · · · , } .In the third case, ˜ k is to be selected to examine the extrapolation error.We consider the neural network with D = 5, m = N x , m = N x and M = 100 ofFCNNs and two residual blocks for ResNets, each block with D = 2, M = 100. N x = 201is the number of grid points for both training and testing in the domain Ω = [ − , K and Σ. We train the networks for 20000 epochs.We first consider initial conditions (3.6) and (3.7) as input, and the corresponding solu-tions at a single time instance T = 1 as the output. The results are shown in Table 6 andTable 7, respectively. For the initial condition (3.6) the best result is obtained by using therelu function for both FCNN and ResNet. For the initial condition (3.7), the accuracy isnot as satisfactory as the previous test, especially when the relu function is used. Anotherobservation is that the result is quite sensitive to the selection of Σ for the training set.Next we discuss the results from the extrapolation. In this context, an extrapola-tion can be interpreted in terms of the wave number ˜ k in the initial condition (3.8),or in terms of predicting solutions at time instances that are beyond the training pe-riod. In the former case, we consider a training set determined by K = { , , · · · , } and Σ = { . , . , , . , . , . } , and then we pick an initial condition (3.8), where MACHINE-LEARNING METHOD FOR WAVE EQUATIONS 13
$\tilde{k} \in \{10.025, 10.05, \ldots\}$. The results are summarized in Table 8. One can see that the error is reasonable for wave numbers in this range, but in all cases the error increases as $\tilde{k}$ moves further away from $K$. In the latter case, we consider the solutions with the initial condition (3.6). The training set consists of solutions $(x, t) \in \Omega \times [0, 0.6]$ with $N_x = 201$ and $N_t = 51$. Then we test the solution at times $T$ from 0.625 to 0.7, and the results are presented in Table 9. The best result comes from the FCNN using the tanh(x) activation function. But the error grows with $T$, indicating that the accuracy of the extrapolation can only be guaranteed for a finite time period. To visualize the evolution in time, we take the FCNN with the best performance, together with the activation function tanh(x). The results are shown in Figure 3. A good performance can be guaranteed over a short time, but we can see a noticeable error for longer times.

Table 3. The 1D wave equation (2.1) approximated by FCNNs and ResNets for the initial condition (3.2): the approximation error for various choices of activation functions and bandwidths of the wave packets. In the training sets, the wave number set is $K = \{1, 2, \cdots, 10\}$.

Next we test the accuracy of a network trained using a dataset that consists of multiple snapshots of the solutions in the time interval $t \in [0, 3]$, using a uniform step size with $N_t = 51$. The network is a FCNN with $D = 5$, $m_0 = N_x$, $m_1 = N_t N_x$ and $M = 100$. Starting from the initial condition (3.6), Figure 4 shows the prediction by the FCNN, compared to the exact solution. The relative errors for the density, the real part, and the imaginary part of the wave function remain small. Such examples appear frequently in testing an absorbing boundary condition [2, 31, 48], and the main emphasis is usually on the reflection at the boundary. The results in Figure 4 suggest that the approximation by a FCNN exhibits an absorbing property that is similar to an absorbing boundary condition.

Table 4. Testing the accuracy of the approximation of the 1D wave equation with the initial condition (3.3) using FCNNs and ResNets. The table shows the error for different choices of the activation functions and width parameters. The training set includes wave numbers from $K = \{1, 2, \cdots, 10\}$.

Example 3.4 (The 1D cubic Schrödinger equation). Here we test the method on the cubic Schrödinger equation (2.2) in 1D. We use solutions from two initial conditions to test the accuracy,
$$u_0^{\mathrm{I}}(x) = \exp(-x^2 + ix), \tag{3.9}$$
$$u_0^{\mathrm{II}}(x) = \mathrm{sech}(x) \exp(ix). \tag{3.10}$$
As demonstrated in the previous section, to generate data, the Strang splitting method, combined with the spectral method [4], is used in the domain $[-\pi L, \pi L]$ with $L = 16$ and $N_x$ being the number of Fourier modes. The numerical solution is computed up
to the single time $T = 1$ with $N_t = 1000$. The training samples are generated by taking $K = \{1, 2, \cdots, 10\}$ and $\Sigma = \{h, 2h, 4h, 8h, 16h, 32h\}$ with $h = 0.5$. We take the neural network with $D = 5$, $m_0 = N_x$, $m_1 = N_x$ and $M = 100$, and train the network for 20000 epochs. We pick $N_x = 8192$ both for the training and the testing. The results for solutions from the initial conditions (3.9) and (3.10) are presented in Figure 5. The relative errors for the density, the real part and the imaginary part of the wave function remain small for both initial conditions.

Table 5. Extrapolating the wave number: the approximation of the 1D wave equation with the initial condition (3.4), using different activation functions and bandwidth parameters, with FCNNs and ResNets. $K = \{1, 2, \cdots, 10\}$ and a linearly spaced $\Sigma$.

Example 3.5 (1D nonlinear Schrödinger equation with a time-dependent potential). For problems where physical processes are initiated by an external potential, such as the Gross–Pitaevskii equation for Bose–Einstein condensates, there are two interesting scenarios: (a) mapping the initial condition to the solution at a later time given a potential; (b) mapping the potential to the solution at a later time given an initial condition. Motivated by (a), we consider (2.2) with a potential that is compactly supported in space. Absorbing boundary conditions for this type of problem can be derived [1].
Figure 2.
Solution of high-dimensional wave equation by DNN (Left), ex-act solution (Middle), and the error (Right) evaluated at time T = 0 . d = 3; Middle: d = 8; Bottom: d = 16.As a specific example, we consider V ( x, t ) = E ( t ) U ( x ), with time-dependent modulation, E ( t ) = E exp( − γ ( t − t ) ) cos( ωt ) . (3.11)The spatial part given by,(3.12) U ( x ) = (cid:40) E x (1 − x ) , x ∈ [0 , , , otherwise , where E denotes the constant intensity of current. In our test, we take γ = 1, E =1, t = 0 and ω = 1. The network setup is the same as that of example 3.4 except MACHINE-LEARNING METHOD FOR WAVE EQUATIONS 17
that we use the multiple-time output. As a comparison, the reference solution is computed numerically using the Strang splitting method combined with a spectral method, with the initial condition (3.10). The training samples are generated with $K = \{1, 2, \cdots, 10\}$ and $\Sigma = \{h, 2h, 4h, 8h, 16h, 32h\}$ with $h = 0.5$.

Table 6. Approximation error for the 1D linear Schrödinger equation with the initial condition (3.6), using FCNNs and ResNets. From top to bottom: different choices of the activation functions relu(x), tanh(x), sigmoid(x) and elu(x), and various choices of $\Sigma$. The wave numbers are drawn from $K = \{1, 2, \cdots, 10\}$ to build the training set.
The solution learned by the neural network, along with the directly computed solution, is shown in Figure 6. The external potential widens the initial wave packet, which subsequently propagates toward the right boundary. The solution represented by the neural network shows excellent agreement with the direct solution.
Example 3.6 (The 2D cubic Schrödinger equation). The last example we consider is the cubic Schrödinger equation (2.2) in 2D. The problem is set up as follows. We consider the solution of (2.2) in a compact domain $\Omega = [-L\pi, L\pi] \times [-L\pi, L\pi]$ with $L = 2$. The solutions in $\Omega$ are represented at grid points with $N_x = N_y = 64$. The solution in time up to $T = 1$ is represented at equally spaced time steps with $N_t = 100$. The training and testing are both handled within the domain $\Omega$. For the parameters in the wave packets, we
55 18 . . .
45 9 .
84 7 . { h, h, h, h, h, h }| h =1 . .
84 6 .
78 10 . . . { . , . , , . , . , . } . . .
97 14 . . . { h, h, h, h, h, h }| h =0 . .
80 7 .
04 7 .
02 6 .
34 4 .
42 4 . { h, h, h, h, h, h }| h =0 . .
53 5 .
66 3 .
68 4 .
18 5 .
84 2 . { h, h, h, h, h, h }| h =1 .
87 5 .
12 3 .
74 3 .
68 4 .
52 2 . { . , . , , . , . , . } .
08 5 .
72 3 .
25 4 .
65 5 .
49 3 . { h, h, h, h, h, h }| h =0 . . . . . .
47 7 . { h, h, h, h, h, h }| h =0 . . . . .
98 4 .
73 4 . { h, h, h, h, h, h }| h =1 . .
35 4 .
50 8 .
71 6 .
38 4 . { . , . , , . , . , . } .
87 11 . .
24 4 .
87 4 .
40 3 . { h, h, h, h, h, h }| h =0 . . .
61 7 .
99 16 . .
71 12 . { h, h, h, h, h, h }| h =0 . .
68 5 .
94 5 .
07 8 .
44 4 .
74 4 . { h, h, h, h, h, h }| h =1 .
19 3 .
25 1 .
86 4 .
32 3 .
60 3 . { . , . , , . , . , . } .
76 5 .
56 3 .
25 6 .
39 3 .
41 3 . Table 7.
Approximation error for the solution of the 1D linear Schr¨odingerequation with initial condition (3.7). From top to bottom: Different choicesof activation functions relu( x ), tanh( x ), sigmoid( x ) and elu( x ), and variouschoices of the width parameter Σ. From left to right: FCNNs and ResNets.In the training, the wave number set K = { , , · · · , } is used.choose K = { , , · · · , } and we pick ( k , k ) ∈ K × K and Σ = { h, h, h, h, h, h } with h = 0 . − Lπ, Lπ ] × [ − Lπ, Lπ ] sothat it represents solutions over the entire space within the time period under consideration.We choose a FCNN with D = 4, m = N x N y , m = N x N y and M = 100. The networkis trained for 20000 epochs. Then we use the network to predict the solution of the 2Dcubic Schr¨odinger equation (2.2) with initial condition: u ( x , x ) = exp[ − ( x + x ) + i (3 x + 3 x )] . (3.13)The results are presented in Figure 7. The corresponding relative errors for the density,the real and imaginary parts of the wave function are 1 . × − , 1 . × − , and 1 . × − , respectively. The results will improve if we choose K = { , , · · · , } but Σ = { h, . h, . h, . h, . h, . h } with h = 0 . . × − , 9 . × − and 1 . × − for the density, the real part, and the imaginary parts, respectively.Notice that at time T = 1, part of the wave packets have moved out of Ω. Therefore, the MACHINE-LEARNING METHOD FOR WAVE EQUATIONS 19
neural network has demonstrated an absorbing property, allowing the waves to propagate out of the domain.

Table 8. Extrapolation in the wave number with FCNNs and ResNets for the initial condition (3.6). From top to bottom: different choices of the activation functions relu(x), tanh(x), sigmoid(x) and elu(x). We choose $K = \{1, 2, \cdots, 10\}$ and a linearly spaced width set $\Sigma$.

4. Discussion and Conclusions
In this paper, we have proposed a machine-learning method to solve wave equations over unbounded domains without introducing artificial boundary conditions. As examples, we considered the Schrödinger equation and the second-order acoustic wave equation. The results show that the proposed method has good interpolative accuracy and some extrapolative accuracy.

All simulations are run on a MacBook Pro with an Intel Core i5 (1 CPU, 4 cores, and 8 GB of random access memory). The method provides an alternative for finite-time simulation of wave propagation. On the other hand, we found that wave propagation over long time periods still remains a challenge for the neural network approximation.

Unlike conventional numerical methods for solving wave equations, e.g., [3, 14], results for rigorous error bounds from neural network approximations of PDEs over unbounded
domains are scarce. Therefore we rely on numerical experiments and report some direct observations here.

Table 9. Illustration of extrapolation in time, with training sets in $[0, 0.6]$ and the activation functions relu(x), tanh(x), sigmoid(x) and elu(x) from the top row to the bottom row, for the 1D linear Schrödinger equation with the initial condition (3.6) as input.

Neural network as a PDE solver.
Recently, the machine learning approach has been applied to wave equations in [10]. It is important to point out, though, that most of those efforts focused on solving PDEs using neural networks on bounded domains [34, 36, 39, 45], while the current work is aimed at representing solutions of time-dependent hyperbolic PDEs on unbounded domains.
The choice of the data set and network structures.
It is apparent from the numerical tests that the accuracy of the neural network representation depends crucially on the choice of the training set, as well as on the network structure and the activation functions. When the test set lies in the training sets (or their range), we interpret the representation (2.8) as an interpolation. The accuracy is generally satisfactory, and we observe that the relu activation function stood out as the best choice with robust overall performance in both the FCNN and the ResNet, as suggested by the comparison in Tables 2, 3 and 6, with the exception of the results in Table 7. The comparison between the performance of the FCNN and the ResNet seems
more subtle: the winner seems to depend on the choice of the activation function and the specific test problem.

Figure 3. Time evolution of the learned solution for the 1D linear Schrödinger equation, associated with time extrapolation, using the tanh(x) activation of the FCNNs.

In contrast to interpolations, extrapolations can arise in different ways. For instance, the initial condition can be of a different functional form than those in the training set; the results in Table 4 and Figure 5 are two such examples. FCNNs seem to offer consistent results in this case. An extrapolation also comes up when the network (2.8) is used to predict solutions at a later time. We observed from Table 9 that the accuracy of such an approximation can be guaranteed for a short time period, and the FCNN with the tanh activation function yields much better results than the other choices. Another scenario of an extrapolation is when the wave number (or the width of the packet) in the initial condition is outside the range of the set $K$ (or $\Sigma$) that is associated with the training set. Generally, the
error grows when the wave number is further away from the set $K$. But for a specific case, the results among the different choices of the activation functions are mixed. The activation function elu seems to give reasonable accuracy in all the cases tested.

Figure 4. A wave packet propagating outside the domain. Left: prediction by a FCNN; middle: exact solution; right: the error. Top: the real part of the wave packet solution; middle: the imaginary part; bottom: the electron density $|u|^2$.

Due to the wave propagation nature of the problem, we proposed to use wave packets to create the training set. So far, our numerical tests have not singled out an optimal strategy. Although larger selections of $K$ and $\Sigma$ generally give better results, they inevitably lead to a larger training dataset. One possible direction is to start with a larger set of training data, and then use the proper orthogonal decomposition (POD) to extract the most relevant basis.
In high dimensions, we might use (quasi-)Monte Carlo methods to handle and extract the representative elements of $\Sigma$ and $K$.

The current approach excludes nonlinear waves, e.g., shock waves [13], contact discontinuities, solitons [22], etc. It would be interesting to investigate the performance of the neural network in those scenarios as well.

Figure 5. The learned solution for the 1D cubic NLS (2.2), the direct numerical solution, and the error profile at the single time $T = 1$, with the initial conditions (3.9) (top row) and (3.10) (bottom row). The training samples are created with $K = \{1, 2, \cdots, 10\}$ and $\Sigma = \{h, 2h, 4h, 8h, 16h, 32h\}$ with $h = 0.5$. Left panel: the comparison of the solutions; right panel: the error.

The relation to absorbing boundary conditions.
The current approach targets the same type of problems as absorbing boundary conditions, viz., wave propagation processes that occur in an unbounded domain but are triggered by initial conditions or external signals that are localized in a bounded domain. However, rather than using the neural
network to incorporate the absorbing boundary condition into the FDTD procedure [11, 46, 51, 52], which involves the history of solutions at the boundary, we directly map the initial condition to the solution at the time instances of interest. The numerical results suggest that such an approximation also exhibits absorbing properties. In addition to the wave equations we discussed in this paper, the results suggest that this framework can be extended to other wave propagation problems, e.g., those from fluid mechanics [25], elasticity [5, 24], and molecular dynamics [17, 33, 38].

Figure 6. The learned solution at multiple time instances for the NLS with the potential $U(x)$ specified by (3.12) (top row), and with the time-dependent potential $V(x, t)$ specified by (3.11) (bottom row). Learned solution (left panel); numerical Strang splitting solution (right panel).

Figure 7. Solution by the DNN (left), the direct numerical method (middle), and the error (right), evaluated at time $T = 1$ for the initial condition (3.13). Top: the real part of the wave packet solution; middle: the imaginary part; bottom: the electron density $|u|^2$.

High-dimensional problems.
One of the distinct advantages of neural networks is the ability to treat high-dimensional problems. This has been demonstrated for various types of PDEs. A potential application of the current approach is to many-particle Schrödinger equations. For instance, in ionization problems, electrons can be driven away from nuclei by a laser field; traditionally, such problems have been treated using effective models and absorbing boundary conditions, e.g., in the context of time-dependent density-functional theory [49]. This work is currently underway.
Acknowledgments
This work is supported in part by the program of the China Scholarship Council No. 201906920043 (C. Xie), National Science Foundation of China Grant No. 11971021 (J. Chen), and National Science Foundation Grants DMS-1819011 and DMS-1953120 (X. Li).
References

[1] X. Antoine, A. Arnold, C. Besse, M. Ehrhardt, and A. Schädle, A review of artificial boundary conditions for the Schrödinger equation, PAMM: Proc. Appl. Math. Mech., 2007, pp. 1023201–1023202.
[2] A. Arnold, M. Ehrhardt, and I. Sofronov, Discrete transparent boundary conditions for the Schrödinger equation: fast calculation, approximation, and stability, Commun. Math. Sci. (2013), no. 3, 501–556.
[3] G. Bao and H. Wu, Convergence analysis of the perfectly matched layer problems for time-harmonic Maxwell's equations, SIAM J. Numer. Anal. (2005), no. 5, 2121–2143.
[4] W. Bao, S. Jin, and P.A. Markowich, On time-splitting spectral approximations for the Schrödinger equation in the semiclassical regime, J. Comput. Phys. (2002), no. 2, 487–524.
[5] E. Becache, P. Joly, and C. Tsogka, Fictitious domains, mixed finite elements and perfectly matched layers for 2-D elastic wave propagation, J. Comput. Acous. (2001), no. 03, 1175–1201.
[6] C. Beck, W. E, and A. Jentzen, Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations, J. Nonlinear Sci. (2019), no. 4, 1563–1619.
[7] S. Becker, R. Braunwarth, M. Hutzenthaler, A. Jentzen, and P. Wurstemberger, Numerical simulations for full history recursive multilevel Picard approximations for systems of high-dimensional partial differential equations, Commun. Comput. Phys. (2020), 2109–2138.
[8] J.P. Berenger, A perfectly matched layer for the absorption of electromagnetic waves, J. Comput. Phys. (1994), no. 2, 185–200.
[9] L. Bottou, F. Curtis, and J. Nocedal, Optimization methods for large-scale machine learning, SIAM Rev. (2018), no. 2, 223–311.
[10] W. Cai, X. Li, and L. Liu, A phase shift deep neural network for high frequency approximation and wave problems, SIAM J. Sci. Comput. (2019), no. 5, A3285–A3312.
[11] Y. Chen and N. Feng, Learning unsplit-field-based PML for the FDTD method by deep differentiable forest, arXiv:2004.04815 (2020).
[12] Y. Chen, L. Lu, G. Karniadakis, and L. Negro, Physics-informed neural networks for inverse problems in nano-optics and metamaterials, Opt. Express (2020), no. 8, 11618–11633.
[13] C.M. Dafermos, Hyperbolic conservation laws in continuum physics, Vol. 3, Springer, 2005.
[14] J. Diaz and P. Joly, A time domain analysis of PML models in acoustics, Comput. Methods Appl. Mech. Eng. (2006), no. 29-32, 3820–3853.
[15] W. E, Machine learning and computational mathematics, Commun. Comput. Phys. (2020), no. 5, 1639–1670.
[16] W. E, J. Han, and A. Jentzen, Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations, Commun. Math. Statist. (2017), no. 4, 349–380.
[17] W. E and Z. Huang, Matching conditions in atomistic-continuum modeling of materials, Phys. Rev. Lett. (2001), no. 13, 135501.
[18] W. E and B. Yu, The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems, Commun. Math. Statist. (2018), no. 1, 1–12.
[19] B. Engquist and A. Majda, Absorbing boundary conditions for numerical simulation of waves, Proceed. National Acad. Sci. (1977), no. 5, 1765–1766.
[20] A.C. Eringen and G.A. Maugin, Electrodynamics of continua I: foundations and solid media, Springer-Verlag, New York, 1990.
[21] L.C. Evans, Partial differential equations, 2nd ed., American Mathematical Society, 2010.
[22] M.G. Forest and J.E. Lee, Geometry and modulation theory for the periodic nonlinear Schrödinger equation, 1986, pp. 35–69.
[23] D. Givoli, Computational absorbing boundaries, in: S. Marburg and B. Nolte (eds.), Computational acoustics of noise propagation in fluids—finite and boundary element methods, 2008, pp. 145–166.
[24] M.N. Guddati and J.L. Tassoulas, Continued-fraction absorbing boundary conditions for the wave equation, J. Comput. Acous. (2000), no. 01, 139–156.
[25] H. Han and W. Bao, An artificial boundary condition for two-dimensional incompressible viscous flows using the method of lines, Inter. J. Numer. Methods Fluids (1996), no. 6, 483–493.
[26] H. Han and X. Wu, Artificial boundary method, Springer Science and Business Media, 2013.
[27] J. Han, A. Jentzen, and W. E, Solving high-dimensional partial differential equations using deep learning, Proceed. National Acad. Sci. (2018), no. 34, 8505–8510.
[28] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, CoRR (2015).
[29] K. He, X. Zhang, S. Ren, and J. Sun, Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, Proc. IEEE Inter. Conf. Computer Vision, 2015, pp. 1026–1034.
[30] M. Hutzenthaler, A. Jentzen, and P. Wurstemberger, Overcoming the curse of dimensionality in the approximative pricing of financial derivatives with default risks, Electron. J. Probab. (2020), no. 101, 73.
[31] S. Jiang and L. Greengard, Fast evaluation of nonreflecting boundary conditions for the Schrödinger equation in one dimension, Comput. Math. Appl. (2004), no. 6-7, 955–966.
[32] X. Jin, S. Cai, H. Li, and G. Karniadakis, NSFnets (Navier-Stokes flow nets): physics-informed neural networks for the incompressible Navier-Stokes equations, J. Comput. Phys. (2020), 109951.
[33] E.G. Karpov, G.J. Wagner, and W.K. Liu, A Green's function approach to deriving non-reflecting boundary conditions in molecular dynamics simulations, Inter. J. Numer. Methods Eng. (2005), no. 9, 1250–1262.
[34] Y. Khoo, J. Lu, and L. Ying, Solving parametric PDE problems with artificial neural networks, European J. Appl. Math. (2020).
[35] D.P. Kingma and J. Ba, Adam: a method for stochastic optimization, CoRR (2015).
[36] I. Lagaris, A. Likas, and D. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Trans. Neur. Netw. (1998), no. 5, 987–1000.
[37] B. Li, J. Zhang, and C. Zheng, An efficient second-order finite difference method for the one-dimensional Schrödinger equation with absorbing boundary conditions, SIAM J. Numer. Anal. (2018), no. 2, 766–791.
[38] X. Li and W. E, Variational boundary conditions for molecular dynamics simulations of solids at low temperature, Commun. Comput. Phys. (2006), no. 1, 135–175.
[39] L. Lyu, Z. Zhang, M. Chen, and J. Chen, MIM: a deep mixed residual method for solving high-order partial differential equations, arXiv:2006.04146 (2020).
[40] F. Moxley, D. Chuss, and W. Dai, A generalized finite-difference time-domain scheme for solving nonlinear Schrödinger equations, Comput. Phys. Commun. (2013), no. 8, 1834–1841.
[41] M. Raissi, P. Perdikaris, and G. Karniadakis, Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys. (2019), 686–707.
[42] M. Raissi, A. Yazdani, and G. Karniadakis, Hidden fluid mechanics: learning velocity and pressure fields from flow visualizations, Science (2020), 1026–1030.
[43] H. Robbins and S. Monro, A stochastic approximation method, Ann. Math. Statist. (1951), 400–407.
[44] T. Shibata, Absorbing boundary conditions for the finite-difference time-domain calculation of the one-dimensional Schrödinger equation, Phys. Rev. B (1991), no. 8, 6760.
[45] J. Sirignano and K. Spiliopoulos, DGM: a deep learning algorithm for solving partial differential equations, J. Comput. Phys. (2018), 1339–1364.
[46] F. Wang, Z. Yang, and C. Yuan, Practical absorbing boundary conditions for wave propagation on arbitrary domain, Adv. Appl. Math. Mech. (2020), no. 6, 1384–1415.
[47] G.B. Whitham, Linear and nonlinear waves, Vol. 42, John Wiley and Sons, 2011.
[48] X. Wu and X. Li, Absorbing boundary conditions for the time-dependent Schrödinger-type equations in $\mathbb{R}^3$, Phys. Rev. E (2020), no. 1, 013304.
[49] K. Yabana, T. Nakatsukasa, J.I. Iwata, and G.F. Bertsch, Real-time, real-space implementation of the linear response time-dependent density-functional theory, Physica Status Solidi (B) (2006), no. 5, 1121–1138.
[50] L. Yang, X. Meng, and G. Karniadakis, B-PINNs: Bayesian physics-informed neural networks for forward and inverse PDE problems with noisy data, J. Comput. Phys. (2021), 109913.
[51] H. Yao and L. Jiang, Machine-learning-based PML for the FDTD method, IEEE Ante. Wire. Prop. Lett. (2018), no. 1, 192–196.
[52] H. Yao and L. Jiang, Enhanced PML based on the long short term memory network for the FDTD method, IEEE Access (2020), 21028–21035.
[53] A. Yazdani, L. Lu, M. Raissi, and G. Karniadakis, Systems biology informed deep learning for inferring parameters and hidden dynamics, PLoS Comput. Bio. (2020), no. 11, e1007575.
[54] H. Yoshida, Construction of higher order symplectic integrators, Phys. Lett. A (1990), no. 5-7, 262–268.
[55] Y. Zang, G. Bao, X. Ye, and H. Zhou, Weak adversarial networks for high-dimensional partial differential equations, J. Comput. Phys. (2020), 109409.
School of Mathematical Sciences, Soochow University, Suzhou, China.
Email address:

School of Mathematical Sciences, Soochow University, Suzhou, China.
Email address: [email protected] (Corresponding author)

Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA.
Email address: