A holistic approach to computing first-arrival traveltimes using neural networks
Umair bin Waheed^a, Tariq Alkhalifah^b, Ehsan Haghighat^c, Chao Song^b

^a Department of Geosciences, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia.
^b Physical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia.
^c Department of Civil Engineering, Massachusetts Institute of Technology, MA 02139, USA.
Abstract
Since the original algorithm by John Vidale in 1988 to numerically solve the isotropic eikonal equation, there has been tremendous progress on the topic addressing an array of challenges, including improvement of the solution accuracy, incorporation of surface topography, adding more accurate physics by accounting for anisotropy/attenuation in the medium, and speeding up computations using multiple CPUs and GPUs. Despite these advances, these algorithms have no mechanism to carry information gained by solving one problem over to the next. Moreover, these approaches may break down for certain complex forms of the eikonal equation, requiring approximation methods to estimate the solution. Therefore, we seek an alternative approach that addresses the challenge in a holistic manner, i.e., a method that not only makes it simpler to incorporate topography, allows accounting for any level of complexity in the physics, and benefits from the computational speedup afforded by multiple CPUs or GPUs, but is also able to transfer knowledge gained from solving one problem to the next. We develop an algorithm based on the emerging paradigm of physics-informed neural networks to solve various forms of the eikonal equation. We show how transfer learning and surrogate modeling can be used to speed up computations by utilizing information gained from prior solutions. We also propose a two-stage optimization scheme to expedite the training process in the presence of sharper heterogeneity in the velocity model and recommend using a locally adaptive activation function for faster convergence. Furthermore, we demonstrate how the proposed approach makes it simpler to incorporate additional physics and other features, in contrast to conventional methods that took years and often decades to make these advances. Such an approach not only makes the implementation of eikonal solvers much simpler but also puts us on a much faster path to progress. The method paves the pathway to solving complex forms of the eikonal equation that have remained unsolved using conventional algorithms or, at best, have been solved using approximation techniques, thereby creating new possibilities for advancement in the field of numerical eikonal solvers.

Email address: [email protected] (Umair bin Waheed)

Preprint submitted to Elsevier, January 29, 2021

Keywords:
Eikonal equation, anisotropy, traveltimes, neural networks, scientific machine learning
1. Introduction
A mechanism to accurately compute traveltimes is an essential ingredient for the success of a wide range of seismic processing and imaging tools, including statics and moveout correction [1], traveltime tomography for initial velocity model building [2, 3], microseismic source localization [4], and ray-based migration [5]. Ray tracing and finite-difference based solutions of the eikonal equation are the most popular approaches for computing traveltimes. The eikonal equation is a nonlinear partial differential equation (PDE) obtained from the first term of the Wentzel-Kramers-Brillouin expansion of the wave equation and represents a class of the Hamilton-Jacobi equations [6].

Ray tracing methods compute traveltimes along the characteristics of the eikonal equation by solving a system of ordinary differential equations [7]. The approach is generally efficient for a sparse source-receiver geometry, but the computational cost increases dramatically with the number of source-receiver pairs. Moreover, for practical applications such as imaging and velocity model building, traveltime solutions need to be interpolated onto a regular grid. This requirement not only adds to the computational cost of the method but also poses a challenge, particularly in complex media where rays may diverge from one another, leading to large spatial gaps between rays and creating regions known as shadow zones [8]. Additionally, in strongly varying velocity models, multiple ray-paths may connect a source-receiver pair, making it easy to miss the path with the minimum traveltime. Therefore, numerical solution of the eikonal equation has been a topic of continued research interest over the years.

Vidale [9] led the development of numerical eikonal solvers by proposing an expanding box strategy to compute first-arrival traveltimes in heterogeneous media.
Subsequently, the method was improved and extended to three dimensions [8], to incorporate anisotropy [10, 11], and to high-order accurate solutions [12]. The instability of the expanding box method due to turning rays led to the development of the expanding wavefront scheme [13]. This was further improved to obtain maximum-energy traveltimes [14] and to incorporate anisotropy in the model [15].

Another algorithm that became popular during the late 1990s was the fast-marching method [16]. The popularity of the method was due to its accuracy, stability, and efficiency properties. The fast-marching method saw great interest and development in the subsequent period. This included extensions of the method to improve traveltime accuracy [17, 18, 19], to incorporate anisotropy [20, 21, 22], to parallelize computations using multiple CPUs [23], and even to accelerate computations using GPUs [24].

Despite its success, the fast marching method was overtaken in popularity by the fast sweeping method [25] in the mid-2000s. This was mainly due to the flexibility and robustness of the fast sweeping method for various forms of the eikonal equation. Numerous advances to the fast sweeping method have since been proposed to improve the accuracy of the method [26, 27], to incorporate anisotropy [28, 29, 30, 31], to account for attenuation [32], to tackle surface topography [33], and to parallelize computations for speedup [34, 35].

Several other hybrid strategies have also been proposed to solve the eikonal equation. For a detailed review of these methods, we refer the interested reader to [36].

In light of these developments, it is beyond doubt that there has been tremendous progress since the original eikonal solver by Vidale [9].
This huge and growing body of literature on the numerical solution of the eikonal equation, spanning over three decades, required significant research effort to address an array of challenges, including improvement of the solution accuracy, incorporation of surface topography, adding more accurate physics by accounting for anisotropy/attenuation in the medium, and speeding up computations by using multiple CPUs and GPUs. Therefore, we seek an alternative approach that could address these challenges in a holistic manner: a method that makes it simpler to incorporate topography, allows accounting for more accurate physics, and benefits from the computational speedup afforded by multiple CPUs or GPUs. Such an approach would not only make the implementation of eikonal solvers much simpler but also put us on a much faster path to progress in solving complex forms of the eikonal equation.

Furthermore, a major drawback of conventional eikonal solvers is that there is no mechanism to carry the information gained by solving one problem over to the next. Therefore, the same amount of computational effort is needed even for a small perturbation in the source position and/or the velocity model. This can lead to a computational bottleneck, particularly for imaging/inversion applications that require repeated computations, often with thousands of source positions and multiple updated velocity models. Therefore, a method that could use information gained from one solution to speed up the computation of the next can potentially remedy this situation. With these objectives in mind, we look to the machine learning literature for inspiration.

Having shown remarkable success across multiple research domains [37], machine learning has recently shown promise in tackling problems in scientific computing. The idea of using an artificial neural network for solving PDEs has been around since the 1990s [38].
However, due to recent advances in the theory of deep learning, coupled with a massive increase in computational power, efficient graph-based implementations of new algorithms, and automatic differentiation, we are witnessing a resurgence of interest in using neural networks to approximate solutions of PDEs. Recently, Raissi et al. [39] developed a deep learning framework for the solution and discovery of PDEs. The so-called physics-informed neural network (PINN) leverages the capabilities of deep neural networks (DNNs) as universal function approximators. In contrast to conventional deep learning approaches, PINNs restrict the space of admissible solutions by enforcing the validity of the underlying PDE governing the actual physics of the problem. This is achieved by using a simple feed-forward neural network leveraging automatic differentiation (AD). PINNs have already demonstrated success in solving a wide range of non-linear PDEs, including the Burgers, Schrödinger, Navier-Stokes, and Helmholtz equations [39, 40, 41].

In this chapter, we present a neural network approach to solve various forms of the eikonal equation. We use the PINN framework, where the governing equation is incorporated into the loss function of the neural network. We also show how the proposed method addresses the highlighted challenges compared to conventional algorithms. Specifically, we show that by simply updating the loss function of the neural network, we can account for more accurate physics in the traveltime solution. Moreover, since the proposed method is mesh-free, no special treatment is needed to incorporate topography, as opposed to conventional finite-difference methods. In addition, the use of computational graphs allows us to run the same piece of code on different platforms (CPUs, GPUs) and architectures (desktops, clusters) without worrying about the implementation details.
Most importantly, the proposed method allows us to use information gained while solving for a particular source position and velocity model to speed up computations for perturbations in the velocity model and/or source position. We demonstrate this through the use of machine learning techniques like transfer learning and surrogate modeling.

The rest of the chapter is organized as follows. We begin by presenting the theoretical foundations of the proposed method and discuss how it can be used to solve more complex forms of the eikonal equation. Next, we test the method on a diverse set of 2D and 3D benchmark synthetic models and compare its performance with the popular fast sweeping method. Finally, we conclude the chapter by discussing the strengths of the method and identifying future research opportunities.
2. Theory
In this section, we describe how neural networks can be used to compute traveltime solutions for eikonal equations corresponding to isotropic and anisotropic media. We do so by first introducing the different forms of the eikonal equation and the concept of factorization. Next, we outline the general mechanism of a feed-forward neural network, followed by its capability as a function approximator. This is followed by a brief overview of the concept of automatic differentiation, which is used to compute the derivatives of the network's output with respect to its inputs. Finally, putting these concepts together, we present the proposed algorithm for solving various forms of the eikonal equation.
The eikonal equation is a non-linear first-order PDE that is, for an isotropic medium, given as:

$$\left(\frac{\partial T}{\partial x}\right)^2 + \left(\frac{\partial T}{\partial z}\right)^2 = \frac{1}{v^2(x,z)}, \quad (1)$$

subject to a point-source boundary condition:

$$T(x_s, z_s) = 0, \quad (2)$$

where T(x, z) is the traveltime from the source point (x_s, z_s) to a point (x, z) in the computational domain, and v(x, z) is the phase velocity of the isotropic medium. Since the curvature of the wavefront near the point-source is extremely large, previous studies [27, 42] have shown that it is better to solve the factored eikonal equation instead of equation (1). The idea is to factor the unknown traveltime into two multiplicative factors, where one factor is specified analytically to capture the source singularity such that the other factor varies gently in the source neighborhood. Therefore, we factor T(x, z) into two multiplicative functions:

$$T(x, z) = T_0(x, z)\,\tau(x, z), \quad (3)$$

where T_0(x, z) is the known function and τ(x, z) is the unknown function. Plugging this into equation (1), we get the factored eikonal equation for an isotropic model:

$$\left(T_0 \frac{\partial \tau}{\partial x} + \tau \frac{\partial T_0}{\partial x}\right)^2 + \left(T_0 \frac{\partial \tau}{\partial z} + \tau \frac{\partial T_0}{\partial z}\right)^2 = \frac{1}{v^2}, \quad (4)$$

subject to the updated point-source condition:

$$\tau(x_s, z_s) = 1. \quad (5)$$

The known factor T_0 is the traveltime solution in a homogeneous isotropic model, given as:

$$T_0(x, z) = \frac{\sqrt{(x - x_s)^2 + (z - z_s)^2}}{v_s}, \quad (6)$$

where v_s is taken to be the velocity at the point-source location.

Equation (4) is the factored eikonal equation for an isotropic medium; however, sedimentary rocks exhibit at least some degree of anisotropy due to a number of factors, including thin layering and the preferential alignment of grains and cracks [43].
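As a quick sanity check of the factored formulation, note that in a homogeneous medium with v = v_s everywhere, τ ≡ 1 is the exact solution, so the residual of equation (4) must vanish. The short NumPy sketch below verifies this; the grid, source position, and velocity are illustrative choices, not the settings used in the examples later in the chapter.

```python
import numpy as np

# Known factor T0 (equation (6)) for a homogeneous medium and its
# analytic spatial derivatives. All numerical values are illustrative.
v_s = 2.0                     # velocity (km/s) at the hypothetical source
x_s, z_s = 0.5, 0.0           # hypothetical source coordinates (km)
x, z = np.meshgrid(np.linspace(0.0, 1.0, 101), np.linspace(0.0, 1.0, 101))

r = np.sqrt((x - x_s) ** 2 + (z - z_s) ** 2)
r[r == 0] = np.finfo(float).eps           # guard the source point itself
T0 = r / v_s
dT0_dx = (x - x_s) / (v_s * r)
dT0_dz = (z - z_s) / (v_s * r)

# Factored residual (equation (4)) with the exact solution tau = 1,
# whose spatial derivatives vanish.
tau, dtau_dx, dtau_dz = 1.0, 0.0, 0.0
residual = ((T0 * dtau_dx + tau * dT0_dx) ** 2
            + (T0 * dtau_dz + tau * dT0_dz) ** 2
            - 1.0 / v_s ** 2)

off_source = r > 1e-12                     # exclude the guarded source point
print(np.abs(residual[off_source]).max())  # ~0, up to round-off
```

The same check fails immediately if one drops the factorization and feeds T itself to the network near the source, which is exactly the singularity the factored form is designed to remove.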
Such anisotropy results in the velocity being a function of the wave-propagation direction, making the isotropic approximation of the Earth invalid. Therefore, traveltime computation algorithms must honor the anisotropic nature of the Earth for accurate subsurface imaging and other applications. Thus, we consider a realistic approximation of subsurface anisotropy known as tilted transverse isotropy (TTI). The factored eikonal equation for a TTI medium is considerably more complex than the isotropic case and is given, under the acoustic assumption, as [42]:

$$(1 + 2\epsilon)\left(\cos\theta\left(T_0\frac{\partial\tau}{\partial x} + \tau\frac{\partial T_0}{\partial x}\right) + \sin\theta\left(T_0\frac{\partial\tau}{\partial z} + \tau\frac{\partial T_0}{\partial z}\right)\right)^2 + \left(\cos\theta\left(T_0\frac{\partial\tau}{\partial z} + \tau\frac{\partial T_0}{\partial z}\right) - \sin\theta\left(T_0\frac{\partial\tau}{\partial x} + \tau\frac{\partial T_0}{\partial x}\right)\right)^2 \times \left(1 - \frac{2\eta\,v_t^2(1 + 2\epsilon)}{1 + 2\eta}\left(\cos\theta\left(T_0\frac{\partial\tau}{\partial x} + \tau\frac{\partial T_0}{\partial x}\right) + \sin\theta\left(T_0\frac{\partial\tau}{\partial z} + \tau\frac{\partial T_0}{\partial z}\right)\right)^2\right) = \frac{1}{v_t^2}, \quad (7)$$

where v_t(x, z) is the velocity along the symmetry axis, ε(x, z) and η(x, z) are the anisotropy parameters, and θ(x, z) is the tilt angle that the symmetry axis makes with the vertical. The point-source condition is the same as the one given in equation (5). Again, τ(x, z) is the unknown function we solve equation (7) for, whereas T_0(x, z) is the known function, which may be taken as the solution for a homogeneous, tilted elliptically isotropic medium, given as [28]:

$$T_0(x, z) = \sqrt{\frac{b_s (x - x_s)^2 + 2 c_s (x - x_s)(z - z_s) + a_s (z - z_s)^2}{a_s b_s - c_s^2}}, \quad (8)$$

with

$$a_s = v_{ts}^2 (1 + 2\epsilon_s)\cos^2\theta_s + v_{ts}^2 \sin^2\theta_s, \qquad b_s = v_{ts}^2 \cos^2\theta_s + v_{ts}^2 (1 + 2\epsilon_s)\sin^2\theta_s, \qquad c_s = \left(v_{ts}^2 - v_{ts}^2(1 + 2\epsilon_s)\right)\cos\theta_s \sin\theta_s. \quad (9)$$

In the above expressions, v_ts and ε_s are the velocity along the symmetry axis and the anisotropy parameter, respectively, at the point-source location. Similarly, θ_s is the tilt angle at the source point.

The reason for considering eikonal equations corresponding to two different media is to highlight, in comparison with conventional methods, how easy it is to adapt the proposed method to solve a relatively more complex eikonal equation (more on this in Section 2.4).

A feed-forward neural network, also known as a multi-layer perceptron, is a set of neurons organized in layers, in which evaluations are performed sequentially through the layers. It can be seen as a computational graph with an input layer, an output layer, and an arbitrary number of hidden layers. In a fully connected neural network, neurons in adjacent layers are connected with each other, but neurons within a single layer share no connections. It is called a feed-forward neural network because information flows from the input through each successive layer to the output. Moreover, there are no feedback or recursive loops in a feed-forward neural network.

Neural networks are well known for their strong representational power. A neural network with n neurons in the input layer and m neurons in the output layer can be used to represent a function u : R^n → R^m. In fact, it has been shown that a neural network with a finite number of neurons in the hidden layer can represent any bounded continuous function to the desired accuracy, a result known as the universal approximation theorem [44, 45]. In addition, it was later shown that, by using a deep network with multiple hidden layers and a non-linear activation function, the total number of neurons needed to represent a given function can be significantly reduced [46].
Therefore, our goal here is to train a deep neural network (DNN) that represents the mapping between the spatial coordinates (x, z), as inputs to the network, and the unknown traveltime function τ(x, z), representing the output of the DNN, i.e.,

$$\tau(x, z) \approx \mathrm{DNN}(x, z). \quad (10)$$

We formulate the problem here for a 2D model for simplicity of illustration. In a 3D model, one would need a neural network with three input neurons, one for each spatial dimension. It must also be noted that while DNNs are, in theory, capable of representing very complex functions, finding the actual parameters (weights and biases) needed to solve a given PDE can be very challenging.

Solving a PDE using neural networks requires a mechanism to accurately compute derivatives of the network's output(s) with respect to the input(s). There are multiple ways to compute derivatives, including hand-coded analytical derivatives, symbolic differentiation, numerical approximation, and automatic differentiation (AD) [47]. While manually working out the derivatives is exact, it is often time-consuming to code and error-prone. Symbolic differentiation, while also exact, may result in exponentially large expressions and, therefore, can be prohibitively slow and memory intensive. Numerical differentiation, on the other hand, is easy to implement but can be highly inaccurate due to round-off errors. Contrary to these approaches, AD uses exact expressions with floating-point values instead of symbolic strings, and it involves no approximation errors. This results in an accurate evaluation of derivatives at machine precision. Therefore, we evaluate the partial derivatives of the unknown traveltime factor τ with respect to the inputs (x, z) using AD.

However, it must be noted that an efficient implementation of the AD algorithm can be non-trivial. Fortunately, many existing computational frameworks, such as Tensorflow [48] and
PyTorch [49], provide efficiently implemented AD libraries.
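To illustrate why AD is exact to machine precision, here is a minimal forward-mode AD sketch based on dual numbers: every primitive operation propagates a value together with a derivative, so there is no symbolic expression swell and no finite-difference truncation error. (TensorFlow and PyTorch primarily use the reverse mode, which is more efficient when a scalar loss depends on many parameters; the exactness argument is the same.) The function differentiated below is an arbitrary illustrative choice.

```python
import math
from dataclasses import dataclass

@dataclass
class Dual:
    """A value together with its derivative w.r.t. one chosen input."""
    val: float
    dot: float

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other, 0.0)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other, 0.0)
        # product rule, applied exactly in floating-point arithmetic
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

    __rmul__ = __mul__

def sin(d):
    # chain rule for an elementary primitive
    return Dual(math.sin(d.val), math.cos(d.val) * d.dot)

# Differentiate f(x) = x*sin(x) + 2x at x = 1.3 by seeding dot = 1.
x = Dual(1.3, 1.0)
y = x * sin(x) + 2 * x
print(y.dot)   # equals sin(1.3) + 1.3*cos(1.3) + 2 to machine precision
```

A finite-difference estimate of the same derivative would carry a truncation error that depends on the step size; the dual-number result carries only floating-point round-off.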
We begin by considering how the different pieces of the puzzle outlined in the previous subsections can be combined to solve eikonal equations. First, we illustrate this using the factored isotropic eikonal equation (4) and then demonstrate how simple it is, under the proposed framework, to solve a more complex eikonal equation, such as the factored TTI eikonal equation (7).

To solve equation (4), we leverage the capabilities of neural networks as function approximators and define a loss function that minimizes the residual of the underlying PDE on a chosen set of training (collocation) points. This is achieved using the following components:

i. a DNN approximation of the unknown traveltime field variable τ(x, z),
ii. a differentiation algorithm, i.e., AD in this case, to evaluate partial derivatives of τ(x, z) with respect to the spatial coordinates (x, z),
iii. a loss function incorporating the underlying eikonal equation, sampled on a collocation grid, and
iv. an optimizer to minimize the loss function by updating the neural network parameters.

To illustrate the idea, let us consider a two-dimensional domain Ω ⊂ R². A point-source is located at coordinates (x_s, z_s), where τ(x_s, z_s) = 1. The unknown traveltime factor τ(x, z) is approximated using a DNN, N_τ, such that:

$$\tau(x, z) \approx \hat{\tau}(x, z) = \mathcal{N}_\tau(x, z; \boldsymbol{\theta}), \quad (11)$$

where x, z are network inputs, τ̂ is the network output, and θ ∈ R^D represents the set of all trainable parameters of the network, with D as the total number of parameters.

The loss function is given by the mean-squared error norm as

$$\mathcal{J} = \frac{1}{N_I}\sum_{(x^*, z^*) \in I} \|\mathcal{L}\|^2 + \frac{1}{N_I}\sum_{(x^*, z^*) \in I} \left\|H(-\hat{\tau})\,|\hat{\tau}|\right\|^2 + \|\hat{\tau}(x_s, z_s) - 1\|^2, \quad (12)$$

where L represents the residual of the factored isotropic eikonal equation (4), given by

$$\mathcal{L} = \left(T_0 \frac{\partial \hat{\tau}}{\partial x} + \hat{\tau} \frac{\partial T_0}{\partial x}\right)^2 + \left(T_0 \frac{\partial \hat{\tau}}{\partial z} + \hat{\tau} \frac{\partial T_0}{\partial z}\right)^2 - \frac{1}{v^2}. \quad (13)$$
The first term on the right side of equation (12) imposes validity of the factored eikonal equation (4) on a given set of training points (x*, z*) ∈ I, where N_I is the number of training samples. The second term forces the solution τ̂ to be positive by penalizing negative solutions using the Heaviside function H(·). The last term enforces the boundary condition by imposing the solution τ̂ to be unity at the source point (x_s, z_s). The set of network parameters θ* that minimizes the loss function (12) on this set of training points, (x*, z*) ∈ I, is then identified by solving the optimization problem:

$$\boldsymbol{\theta}^* = \arg\min_{\boldsymbol{\theta} \in \mathbb{R}^D} \mathcal{J}(x^*, z^*; \boldsymbol{\theta}). \quad (14)$$

Once the DNN is trained, we evaluate the network on a set of regular grid-points in the computational domain to obtain the unknown traveltime factor. The final traveltime solution is obtained by multiplying it with the known traveltime part, i.e.,

$$\hat{T}(x, z) = T_0(x, z)\,\hat{\tau}(x, z). \quad (15)$$

Figure 1: A workflow of the proposed factored TTI eikonal solver in 2D: A randomly initialized neural network is trained on a set of randomly selected collocation points (x*, z*) in the model space with given model parameters v_t(x*, z*), ε(x*, z*), η(x*, z*), θ(x*, z*), the known traveltime function T_0(x*, z*), and its spatial derivative ∇T_0(x*, z*) to minimize the loss function given in equation (12). Once the network is trained, it is evaluated on a regular grid of points (x, z) to yield an estimate of the traveltime factor τ̂, which is then multiplied with the known traveltime part T_0 to yield the estimated first-arrival traveltime solution T̂.

This yields traveltimes corresponding to an isotropic approximation of the Earth. However, it is well known that the subsurface is anisotropic in nature. Therefore, a significant amount of research effort has been spent over the years on extending numerical eikonal solvers to anisotropic media. The complication in numerically solving the anisotropic eikonal equation arises due to the anellipticity of the wavefront [50], which results in high-order nonlinear terms in the eikonal equation. These high-order terms dramatically increase the complexity of solving the anisotropic eikonal equation and, therefore, have been a topic of immense research interest. On the contrary, the proposed neural network formulation allows solving the anisotropic eikonal equation by simply replacing the residual in equation (12) with the one corresponding to the anisotropic eikonal equation. Therefore, to solve the factored TTI eikonal equation (7), we would, instead of equation (13), use the following:

$$\mathcal{L} = (1 + 2\epsilon)\left(\cos\theta\left(T_0\frac{\partial\hat{\tau}}{\partial x} + \hat{\tau}\frac{\partial T_0}{\partial x}\right) + \sin\theta\left(T_0\frac{\partial\hat{\tau}}{\partial z} + \hat{\tau}\frac{\partial T_0}{\partial z}\right)\right)^2 + \left(\cos\theta\left(T_0\frac{\partial\hat{\tau}}{\partial z} + \hat{\tau}\frac{\partial T_0}{\partial z}\right) - \sin\theta\left(T_0\frac{\partial\hat{\tau}}{\partial x} + \hat{\tau}\frac{\partial T_0}{\partial x}\right)\right)^2 \times \left(1 - \frac{2\eta\,v_t^2(1 + 2\epsilon)}{1 + 2\eta}\left(\cos\theta\left(T_0\frac{\partial\hat{\tau}}{\partial x} + \hat{\tau}\frac{\partial T_0}{\partial x}\right) + \sin\theta\left(T_0\frac{\partial\hat{\tau}}{\partial z} + \hat{\tau}\frac{\partial T_0}{\partial z}\right)\right)^2\right) - \frac{1}{v_t^2}. \quad (16)$$

This is a highly desirable feature of this approach because eikonal equations corresponding to models with even lower symmetry than TTI can be easily solved by simply using a different residual term.
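To make the residual-swapping idea concrete, the sketch below implements the isotropic residual (13) and the acoustic TTI residual (16) behind one common interface; switching physics then amounts to passing a different callable to the same training loop. Here `px` and `pz` stand for the factored gradient terms T_0 ∂τ̂/∂x + τ̂ ∂T_0/∂x and T_0 ∂τ̂/∂z + τ̂ ∂T_0/∂z, and the numerical values are illustrative. A useful consistency check: with ε = η = θ = 0, the TTI residual must reduce to the isotropic one.

```python
import numpy as np

def residual_iso(px, pz, v):
    # equation (13): px, pz are the factored traveltime-gradient terms
    return px ** 2 + pz ** 2 - 1.0 / v ** 2

def residual_tti(px, pz, vt, eps, eta, theta):
    # equation (16): rotate the gradient into the symmetry frame first
    qx = np.cos(theta) * px + np.sin(theta) * pz
    qz = np.cos(theta) * pz - np.sin(theta) * px
    return ((1 + 2 * eps) * qx ** 2
            + qz ** 2 * (1 - (2 * eta * vt ** 2 * (1 + 2 * eps))
                             / (1 + 2 * eta) * qx ** 2)
            - 1.0 / vt ** 2)

# With eps = eta = theta = 0 the TTI residual reduces to the isotropic
# one; (0.3, 0.4) lies on the slowness circle for v = 2 km/s, so both
# residuals are ~0 here, up to round-off.
px, pz, v = 0.3, 0.4, 2.0
print(residual_tti(px, pz, v, 0.0, 0.0, 0.0), residual_iso(px, pz, v))
```

Nothing else in the training loop needs to change when the residual callable is swapped, which is the point made in the text.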
By contrast, conventional algorithms such as the fast marching or fast sweeping methods would require significant effort to incorporate such changes, thereby resulting in much slower scientific progress.

A workflow summarizing the proposed solver for the factored TTI eikonal equation is shown in Figure 1.
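Putting the components together, the loss of equation (12) can be sketched as a plain function of a candidate solution and its derivatives, evaluated on a set of collocation points. In the actual solver, τ̂ and its derivatives come from the network and AD; here they are supplied directly as arrays, and the collocation points, source, and velocity are illustrative. For a homogeneous medium, where τ̂ ≡ 1 is exact, the loss vanishes up to round-off.

```python
import numpy as np

def eikonal_loss(tau, dtau_dx, dtau_dz, T0, dT0_dx, dT0_dz, v, tau_src):
    """Equation (12): PDE term + positivity penalty + source condition."""
    L = ((T0 * dtau_dx + tau * dT0_dx) ** 2
         + (T0 * dtau_dz + tau * dT0_dz) ** 2
         - 1.0 / v ** 2)                                   # equation (13)
    pde_term = np.mean(L ** 2)
    # Heaviside penalty on negative traveltime factors
    pos_term = np.mean((np.heaviside(-tau, 0.0) * np.abs(tau)) ** 2)
    # point-source boundary condition tau(x_s, z_s) = 1
    src_term = (tau_src - 1.0) ** 2
    return pde_term + pos_term + src_term

# Homogeneous medium, exact solution tau = 1: the loss vanishes up to
# round-off. Collocation points, source, and velocity are illustrative.
rng = np.random.default_rng(0)
x, z = rng.random(1000), rng.random(1000)   # random collocation points
v_s, x_s, z_s = 2.0, 0.5, 0.25
r = np.sqrt((x - x_s) ** 2 + (z - z_s) ** 2)
T0 = r / v_s
J = eikonal_loss(np.ones_like(x), 0.0, 0.0, T0,
                 (x - x_s) / (v_s * r), (z - z_s) / (v_s * r), v_s, 1.0)
print(J)   # ~0
```

In the PINN, an optimizer drives exactly this scalar toward zero by adjusting the network parameters, rather than being handed the exact τ̂ as above.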
3. Numerical Tests
In this section, we test the neural network based eikonal solver and compare its performance with the first-order fast sweeping method, which is routinely used in geophysical (and other) applications for traveltime computation. We consider several isotropic and anisotropic 2D/3D models for these tests and also include a model with topography to demonstrate the flexibility of the proposed method.

For each of the examples presented below, we use a neural network having 20 hidden layers with 20 neurons in each layer and minimize the neural network's loss function using full-batch optimization. A locally adaptive inverse-tangent activation function is used for all hidden layers except the final layer, which uses a linear activation function. Locally adaptive activation functions have been shown to achieve superior optimization performance and convergence speed over base methods. The introduction of a scalable parameter in the activation function for each neuron changes the slope of the activation function and, therefore, alters the loss landscape of the neural network for improved performance. For more information on locally adaptive activation functions, we refer the interested reader to [51].

We choose the aforementioned configuration of the neural network based on some initial tests and keep it fixed for the entire study to minimize the need for hyper-parameter tuning for each new velocity model.

The following examples are prepared and trained using the neural network library SciANN [52], a Keras/Tensorflow API that is designed and optimized for physics-informed deep learning. SciANN leverages the latest advancements of Tensorflow while keeping the interface close to the mathematical description of the problem.
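A minimal sketch of a locally adaptive inverse-tangent activation in the spirit of [51] is given below: each neuron k carries a trainable slope parameter a_k, and the activation becomes arctan(n·a_k·z_k) for a fixed scaling factor n. The layer width, the value n = 10, and the initialization are illustrative assumptions, not the exact settings used in this study.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10                                    # fixed slope-scaling factor
W = 0.1 * rng.standard_normal((20, 20))   # weights of one 20-neuron layer
b = np.zeros(20)
a = np.full(20, 1.0 / n)                  # trainable per-neuron slopes;
                                          # initialized so that n*a = 1,
                                          # i.e., the standard arctan

def layer(x):
    # locally adaptive inverse-tangent activation: arctan(n * a_k * z_k)
    return np.arctan(n * a * (x @ W + b))

out = layer(rng.standard_normal((5, 20)))  # a batch of 5 inputs
print(out.shape)   # (5, 20)
```

In training, the slopes `a` would be updated by the optimizer alongside `W` and `b`, which is what reshapes the loss landscape and speeds up convergence.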
Example 1: An isotropic model with a constant vertical velocity gradient
First, we consider a vertically varying 1 × 1 km² isotropic model. The velocity at zero depth is 2 km/s, and it increases linearly with a gradient of 1 s⁻¹ to 3 km/s at a depth of 1 km. We compute traveltime solutions using the neural network and the first-order fast sweeping method by considering a point-source located at the position marked by the black star in Figure 2, on a 101 × 101 grid with a 10 m grid interval along both axes.

Figure 2: A vertically varying velocity model with a constant velocity gradient of 1 s⁻¹. The velocity at zero depth is equal to 2 km/s and it increases linearly to 3 km/s at a depth of 1 km. The black star indicates the point-source location used for the test.

For training the neural network, we begin with randomly initialized parameters, train on 50% of the total grid points, selected randomly, and use the Adam optimizer [54] for 10,000 epochs. Once the network is trained, we evaluate it on the regularly sampled (101 × 101) grid to obtain the unknown traveltime factor τ̂, which is then multiplied with the corresponding factored traveltime field T_0 to obtain the final traveltime solution. We compare the accuracy of the neural network solution and the first-order fast sweeping solution, computed on the same regular grid, in Figure 3. We observe significantly better accuracy for the neural network based solution despite using only half of the total grid points for training. In Figure 4, we confirm this observation by plotting the corresponding traveltime contours.

Example 2: Smoothly varying TTI model
Next, we train the neural network to solve the TTI eikonal equation.Compared to the fast sweeping method that requires significant modifications13 .0 0.2 0.4 0.6 0.8 1.0
Offset (km) D e p t h ( k m ) s e c o n d s
1e 5 (a)
Offset (km) D e p t h ( k m ) s e c o n d s (b)Figure 3: Absolute traveltime errors for the neural network solution (a) and the first-order fast sweeping solution (b) for the isotropic model and the source location shown inFigure 2. Offset (km) D e p t h ( k m ) ReferenceNeural networkFast sweeping
Figure 4: Traveltime contours for the reference solution (solid red) computed analytically,neural network solution (dashed black), and the first-order fast sweeping solution (dottedblue) for the vertically varying isotropic model. The black star shows the location of thepoint-source. .0 0.2 0.4 0.6 0.8 1.0 Offset (km) D e p t h ( k m ) k m / s Figure 5: A velocity model for the parameter v t with a constant vertical gradient of1.5 s − and a horizontal velocity gradient of 0.5 s − . A homogeneous model is used forthe anisotropy parameters ( (cid:15) =0.2, η = 0.083) and for the tilt angle ( θ = 30 ◦ ). The blackstar indicates the point-source location used for the test. Epochs L o ss Random initial modelPre-trained initial model
Figure 6: A comparison of loss history for training of the TTI model in example 2 using pre-trained weights from example 1 (blue) and random initialization (orange). An L-BFGS-Boptimizer is used for both cases. .0 0.2 0.4 0.6 0.8 1.0 Offset (km) D e p t h ( k m ) s e c o n d s (a) Offset (km) D e p t h ( k m ) s e c o n d s (b)Figure 7: Absolute traveltime errors for the neural network solution (a) and the fastsweeping solution (b) for the TTI model considered in example 2. to the isotropic eikonal solver, the neural network based approach requiresonly an update to the loss function by incorporating the appropriate residualbased on the TTI eikonal equation. For the velocity parameter v t , we considera linear velocity model with a vertical gradient of 1.5 s − and a horizontalgradient of 0.5 s − as shown in Figure 5. We use homogeneous models forthe anisotropy parameters with (cid:15) =0.2 and η = 0.083. We also consider ahomogeneous tilt angle of θ = 30 ◦ . These models are also discretized on a101 ×
101 grid with 10 m grid interval along both axes.Instead of training the network from scratch, we use transfer learning,which is a machine learning technique that relies on storing knowledge gainedwhile solving one problem and applying it to a different but related problem.Starting with a pre-trained network from example 1, we fine-tune the neuralnetwork parameters for the TTI model using 50% of the total grid points,selected randomly, using the L-BFGS-B solver [55] for 100 epochs. Start-ing with a pre-trained network allows us to use the L-BFGS-B method forfaster convergence as opposed to starting with the Adam optimizer and thenswitching to L-BFGS-B as suggested by previous studies [39]. For compari-son, we also train a neural network from scratch and the convergence history,shown in Figure 6, confirms that the solution converges much faster whenusing transfer learning.Figure 7 compares absolute traveltime errors computed using the neuralnetwork and the first-order fast sweeping method using the iterative solver16 .0 0.2 0.4 0.6 0.8 1.0
Offset (km) D e p t h ( k m ) ReferenceNeural networkFast sweeping
Figure 8: Traveltime contours for the reference solution (solid red), neural network solution(dashed black), and the first-order fast sweeping solution (dotted blue) for the smoothlyvarying TTI model. The black star shows the location of the point-source. of Waheed et al. [29]. The reference solution is obtained using a high-orderfast sweeping method on a finer grid. We observe that, despite using transferlearning, the accuracy of the neural network solution is considerably betterthan the fast sweeping method. We confirm this observation visually bycomparing the corresponding traveltime contours in Figure 8.It is worth noting that while the complexity of the fast sweeping solversand their computational cost increases dramatically when switching from anisotropic to a TTI model, for the neural network both cases require simi-lar complexity and computational cost. Therefore, the proposed method isparticularly suited to model complex physics involving media with complexanisotropy, attenuation, etc.One of the main challenges in seismic imaging and inversion is the needfor repeated traveltime computations for thousands of source locations andmultiple updated velocity models. Unfortunately, conventional techniquesdo not allow the transfer of information between one solution to the nextand, therefore, the same amount of computational effort is needed for evensmall perturbations in the source location and/or the velocity model. Wenoted above how transfer learning can be used to speed up convergence fora new velocity model and source position. This could be further extended17 .0 0.2 0.4 0.6 0.8 1.0
by adding the source location (x_s, z_s) as an input to the network and training a surrogate model.

Figure 9: A velocity model for the parameter v_t with a constant vertical velocity gradient of 1.5 s⁻¹ and a horizontal velocity gradient of 0.5 s⁻¹. A homogeneous model is used for the anisotropy parameters (ε = 0.2, η = 0.083) and for the tilt angle (θ = 30°). Black stars indicate locations of sources used to train the network as a surrogate model.

To do so, we train a neural network on solutions computed for 16 sources located at regular intervals in the considered TTI model, as shown in Figure 9. Through the training process, the network learns the mapping between a given source location and the corresponding traveltime field. Once the surrogate model is trained, traveltime fields for additional source locations can be computed instantly using a single evaluation of the trained network. This is similar to having an analytic solver, as no further training is needed to compute traveltimes for new source locations. This feature is particularly advantageous for large 3D models that need thousands of such computations.

After training the surrogate model, we test its performance by computing the traveltime field corresponding to a randomly chosen source location. Figure 10 compares the absolute traveltime errors for the solution predicted by the surrogate model and the fast sweeping TTI solver. We observe that even without any additional training for this new source position, we obtain remarkably high accuracy compared to the fast sweeping method.
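The idea of appending the source coordinates to the network input can be sketched in a few lines. This is purely illustrative: a homogeneous-medium analytic traveltime stands in for the PINN-computed training fields, and a fixed random-feature regressor with a linear readout stands in for the deep network; none of the specific values below are from the chapter's experiments.

```python
import numpy as np

rng = np.random.default_rng(1)
v = 2.0  # constant velocity (km/s); an assumption for this sketch

def traveltime(x, z, xs, zs):
    # Homogeneous-medium traveltime, standing in for PINN solutions.
    return np.hypot(x - xs, z - zs) / v

# Training data: traveltime fields for a grid of source positions, with the
# source coordinates (xs, zs) appended to the network input as in the text.
samples = []
for xs in np.linspace(0.2, 0.8, 4):
    for zs in (0.25, 0.75):
        x, z = rng.random(200), rng.random(200)
        inp = np.stack([x, z, np.full_like(x, xs), np.full_like(x, zs)], axis=1)
        samples.append(np.column_stack([inp, traveltime(x, z, xs, zs)]))
data = np.vstack(samples)

# Surrogate: fixed random tanh features plus a linear readout, a lightweight
# stand-in for the trained network; fitting is a single least-squares solve.
W = rng.normal(size=(4, 200))
b = rng.normal(size=200)
feats = np.tanh(data[:, :4] @ W + b)
coef, *_ = np.linalg.lstsq(feats, data[:, 4], rcond=None)

# A new, unseen source location now costs a single forward evaluation.
query = np.array([0.9, 0.1, 0.5, 0.5])  # (x, z, xs, zs)
pred = np.tanh(query @ W + b) @ coef
print(pred)
```

Once the readout is fit, queries for new source positions require no optimization at all, mirroring the single-evaluation property of the trained surrogate.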
This is also confirmed by visually comparing the corresponding traveltime contours in Figure 11.

Figure 10: Absolute traveltime errors for the solution computed using the surrogate model (a) and the fast sweeping solution (b) for the TTI model considered in example 2 for a randomly chosen source location.

Example 3: VTI SEAM model
Next, we test the performance of the proposed method on a portion of the VTI SEAM model, shown in Figure 12. This is a particularly interesting example due to sharper variations in the velocity model and the anisotropy parameters. The model is discretized on a 101 × 101 grid with a grid spacing of 100 m along both axes. Based on our recent efforts in using the neural network solver [41, 56], we observed that the convergence of the neural network approach slows down considerably in the presence of sharp variations in the velocity model. We have already seen above that using a pre-trained neural network yields faster convergence by allowing the use of the second-order optimization method (L-BFGS-B) directly. Therefore, in this example, we use the pre-trained network obtained from the TTI eikonal solver in example 2.

For further speedup, we propose a two-stage training scheme to obtain an accurate traveltime solution. In the first stage, when the neural network is learning a smooth representation of the underlying function, we use only a small percentage of grid points for training. In this case, we use only 1% of the total grid points, chosen randomly, for the first 200 epochs and then switch to using 50% of the total grid points in stage 2 for another 1000 epochs to
update the learned function to better approximate sharp features in the resulting traveltime field. Again, since we start training with a pre-trained network, we use the L-BFGS-B optimizer for faster convergence.

Figure 11: Traveltime contours for the reference solution (solid red), the neural network surrogate model (dashed black), and the first-order fast sweeping solver (dotted blue) for a randomly chosen source point in the smoothly varying TTI model.

Figure 13 shows traveltime contours for a point-source located at (5 km, 1 km) for the reference solution, the neural network solution, and the first-order fast sweeping solution. In Figure 13(a), we show the neural network solution at the end of stage 1. It can be seen that the solution is quite smooth and misses sharp features visible in the reference solution. In Figure 13(b), we observe that the neural network solution captures these features as additional training points are added in the second stage of training. Therefore, using a small number of training points in stage 1 reduces the training cost without compromising the solution accuracy. By comparing the absolute traveltime errors in Figure 14, we observe that the neural network solution after stage 2 is considerably more accurate than the first-order fast sweeping method, even for a realistic VTI model.
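The two-stage scheme with a warm start can be sketched with SciPy's L-BFGS-B interface. Everything here is a stand-in chosen only to keep the sketch self-contained: a 1D target with a sharp feature replaces the traveltime field, and a degree-7 polynomial replaces the network; stage 2 is warm-started from the stage-1 parameters, in the same spirit as starting from a pre-trained network.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Target with a sharp feature, standing in for a traveltime field over a
# strongly heterogeneous model; a degree-7 polynomial stands in for the
# network (illustrative, not the chapter's actual setup).
x_all = np.linspace(0.0, 1.0, 10_000)
y_all = np.tanh(20.0 * (x_all - 0.5))

def make_loss(idx):
    x, y = x_all[idx], y_all[idx]
    def loss(c):
        return np.mean((np.polyval(c, x) - y) ** 2)
    return loss

# Stage 1: a small random subset (~1% of points) captures the smooth trend.
idx1 = rng.choice(x_all.size, size=x_all.size // 100, replace=False)
stage1 = minimize(make_loss(idx1), np.zeros(8), method="L-BFGS-B",
                  options={"maxiter": 200})

# Stage 2: warm-start from the stage-1 parameters and refine the sharp
# feature on a much larger subset (~50% of points).
idx2 = rng.choice(x_all.size, size=x_all.size // 2, replace=False)
stage2 = minimize(make_loss(idx2), stage1.x, method="L-BFGS-B",
                  options={"maxiter": 1000})

full_loss = make_loss(np.arange(x_all.size))
print(full_loss(stage1.x), full_loss(stage2.x))
```

The cheap first stage does most of the work of locating a good region of parameter space, so the expensive second stage, which sees fifty times more points, starts close to a good solution.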
Example 4: BP TTI model
We have already seen how the neural network-based approach is flexible in incorporating complex physics compared to conventional techniques. In
Figure 12: (a) The vertical velocity v_t, (b) the ε parameter, and (c) the η parameter for the considered portion of the VTI SEAM model. The white star indicates the position of the point-source.
Figure 13: Traveltime contours for the reference solution (solid red), the neural network solution (dashed black), and the first-order fast sweeping solution (dotted blue) for the VTI SEAM model. Red arrows indicate the improvement in accuracy due to the additional training points used in the neural network solution in going from stage 1 (a) to stage 2 (b) of the proposed training process. The black star shows the location of the point-source.
Figure 14: Absolute traveltime errors for the neural network solution (a) and the fast sweeping solution (b) for the SEAM VTI model.

Figure 15: (a) The velocity along the symmetry axis v_t, (b) the ε parameter, (c) the η parameter, and (d) the tilt angle θ for the considered portion of the BP TTI model with a layer of non-flat topography.

Figure 16: Absolute traveltime errors for the neural network solution (a) and the fast sweeping solution (b) for the BP TTI model.

this example, we will see how incorporating irregular topography is straightforward using the proposed approach. The free-surface encountered in land seismic surveys is often non-flat, and accurate traveltime computation requires taking this into account. One approach is to transform from the Cartesian to a curvilinear coordinate system to mathematically flatten the free-surface and solve the resulting topography-dependent eikonal equation [33]. This adds computational cost and may result in instabilities when the topography varies sharply. On the contrary, the neural network approach outlined here is mesh-free and does not require any modification to the algorithm. We demonstrate this through a test on a portion of the BP TTI model, shown in Figure 15. The model is discretized on a 161 ×
161 grid using a grid spacing of 6.25 m along both axes. Points above the considered topography layer are then removed from the model.

For a point-source located at (5 km, 0.5 km), we compare the neural network and fast sweeping traveltime solutions. For training the neural network, we use about 50% of the grid points below the topography; once the network is trained, we evaluate the solution on the regular grid points that fall below the topography. We start training using pre-trained parameters from example 2 and train for 10,000 epochs using the L-BFGS-B optimizer. Figure 16 shows absolute traveltime errors for the two cases, indicating that the neural network solution is again considerably more accurate
than the fast sweeping method. We confirm this observation visually through the traveltime contours plotted in Figure 17.

Figure 17: Traveltime contours for the reference solution (solid red), the neural network solution (dashed black), and the first-order fast sweeping solution (dotted blue) for the BP TTI model. The black star shows the location of the point-source, and the solid black curve indicates the topography layer.
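Because the solver is mesh-free, handling topography reduces to filtering the candidate training points against the free surface. The sinusoidal surface and 1 km model extent below are hypothetical, chosen only to make the filtering step concrete; the 161 × 161 grid and the ~50% subsampling mirror the setup described above.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical free-surface topography: depth of the surface as a function
# of offset (both in km); everything shallower than it counts as "air".
def surface_depth(x):
    return 0.10 + 0.05 * np.sin(2.0 * np.pi * x)

# Regular evaluation grid (161 x 161 as in the example; the 1 km extent
# is illustrative).
x, z = np.meshgrid(np.linspace(0.0, 1.0, 161), np.linspace(0.0, 1.0, 161),
                   indexing="ij")

# Mesh-free handling of topography: drop points above the surface, then
# randomly subsample ~50% of the remainder as training points.
below = z.ravel() >= surface_depth(x.ravel())
candidates = np.flatnonzero(below)
train_idx = rng.choice(candidates, size=candidates.size // 2, replace=False)

print(candidates.size, train_idx.size)
```

No coordinate transform or equation modification is involved; the same loss is simply evaluated on the surviving points, which is why sharply varying topography poses no stability issue here.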
Example 5: 3D TTI model
Finally, we show an example of extending the proposed method to a 3D TTI model. The model for the velocity parameter v_t is shown in Figure 18. A homogeneous model is used for the anisotropy parameters with ε = 0.2 and η = 0.1. We also consider a homogeneous tilt angle of 30°. The model is discretized on a 101 × 101 × 51 grid with a grid spacing of 50 m along each axis. The workflow for obtaining the neural network solution is essentially the same. In this case, the neural network takes three input parameters corresponding to the spatial axes (x, y, z) and outputs the corresponding unknown traveltime field τ̂(x, y, z). As before, the neural network output is multiplied by the known traveltime factor T(x, y, z) to obtain the final traveltime solution.

Starting with a neural network pre-trained on a smoothly varying TTI model, we train the network using 50% of the total grid points, chosen randomly, with 20,000 L-BFGS-B epochs during the training process.

Figure 18: A velocity model for the parameter v_t for the 3D TTI model test. A homogeneous model is used for the anisotropy parameters (ε = 0.2, η = 0.1) and for the tilt angle (θ = 30°).

For a point-source located at (x, y, z) = (2 km, 2 km, 1 km), we compare the accuracy of the neural network solution and the first-order fast sweeping solution in Figure 19. We observe that the proposed method is capable of computing accurate traveltimes for 3D TTI models without requiring any major alteration to the underlying algorithm.
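The factored construction used above, multiplying the network output by a known traveltime factor, can be illustrated in isolation. In this sketch the known factor is taken to be the homogeneous-medium traveltime from the source at a reference velocity v0 (an assumption for illustration), and a simple analytic function stands in for the trained network's output.

```python
import numpy as np

# Factored form: the network predicts a smooth multiplicative factor
# tau_hat, while a known traveltime factor carries the point-source
# behavior. Here the known factor T0 is the homogeneous-medium traveltime
# (distance from the source over a reference velocity v0), an assumption
# made for this sketch.
v0 = 2.0
src = np.array([2.0, 2.0, 1.0])  # source at (x, y, z) = (2, 2, 1) km

def T0(p):
    return np.linalg.norm(p - src, axis=-1) / v0

def traveltime(p, tau_hat):
    # tau_hat would come from the trained network; a stand-in is used here.
    return T0(p) * tau_hat(p)

tau_hat = lambda p: 1.0 + 0.1 * np.sin(p[..., 0])  # hypothetical factor

pts = np.array([[2.0, 2.0, 1.0],   # at the source
                [3.0, 2.0, 1.0]])  # 1 km away along x
tt = traveltime(pts, tau_hat)
print(tt)
```

Whatever the network outputs, the product vanishes exactly at the source, which is the point of the factorization: the network only has to learn a smooth correction, not the singular behavior near the source.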
4. Discussion and conclusions
We proposed a neural network approach to computing first-arrival traveltimes based on the framework of physics-informed neural networks. By leveraging the capabilities of neural networks as universal function approximators, we define a loss function to minimize the residual of the governing eikonal equation at a chosen set of training points. This is achieved with a simple feed-forward neural network using automatic differentiation.

We demonstrated the flexibility of the proposed framework in incorporating anisotropy into the model simply by updating the loss function of the neural network according to the underlying PDE. Since the method is mesh-free, we also saw how easy it is to incorporate non-flat topography into the solver compared to conventional methods. Another attractive feature, due to the mesh-free nature of the algorithm, is that sources and receivers do

Figure 19: Absolute traveltime errors for the neural network solution (a, b) and the fast sweeping solution (c, d) for the 3D TTI model.
Tensorflow at the backend, which allows for easy deployment of computations across a variety of platforms (CPUs, GPUs) and architectures (desktops, clusters). On the contrary, significant effort is needed to adapt conventional algorithms to benefit from different computational platforms or architectures.

In short, the approach tackles, in a holistic manner, many problems associated with obtaining fast and accurate traveltimes that required decades of research. In fact, it opens new possibilities for solving complex forms of the eikonal equation that have remained unsolved by conventional algorithms, or solved through approximation techniques at best. Examples of such equations include the double-square-root eikonal equation [57] and the attenuating anisotropic eikonal equation [32].

Nevertheless, there are a few challenges associated with the method that require further research. Chief among them is the slow convergence of the solution in the presence of sharp heterogeneity in the velocity and/or anisotropy models. We proposed a two-stage optimization process in this chapter that alleviates part of the problem by using only a small fraction of the training points during the initial training phase. Since the network is learning a smooth representation of the underlying function at this stage, we can save computational cost by using a small number of training points initially. We also used a locally adaptive activation function that has been shown to achieve faster convergence. Other possible solutions may include adaptive sampling of the velocity model, with denser sampling of training points around parts of the model with large velocity gradients. Another challenge concerns the optimal choice of the neural network's hyper-parameters. In this study, we alleviated the problem by choosing them through quick initial tests and keeping them fixed for all the test examples.
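To make the loss construction discussed above concrete: the quantity minimized is the mean squared residual of the eikonal equation over the training points. The sketch below evaluates that residual for the isotropic equation on a homogeneous medium, using finite differences as a dependency-free stand-in for the automatic differentiation a PINN framework would apply to the network output; since the analytic solution is used in place of a network, the residual is near zero by construction.

```python
import numpy as np

# Isotropic eikonal residual |grad T|^2 - 1/v^2 over a set of points.
# Finite differences stand in here for automatic differentiation.
v = 2.0
h = 0.01
x, z = np.meshgrid(np.arange(0.5, 1.5, h), np.arange(0.5, 1.5, h),
                   indexing="ij")

# Analytic traveltime for a homogeneous medium with the source at the
# origin (kept outside the grid so the singularity is avoided).
T = np.hypot(x, z) / v

Tx = np.gradient(T, h, axis=0, edge_order=2)
Tz = np.gradient(T, h, axis=1, edge_order=2)
residual = Tx ** 2 + Tz ** 2 - 1.0 / v ** 2

# The PINN training loss is the mean squared residual over training points.
loss = np.mean(residual ** 2)
print(loss)
```

For an anisotropic variant, only this residual expression changes, which is exactly the flexibility argued for above: the rest of the training machinery is untouched.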
Recent advances in the field of meta-learning may, potentially, enable automated selection of these parameters in the future.

References

[1] D. C. Lawton, Computation of refraction static corrections using first-break traveltime differences, Geophysics 54 (1989) 1289–1296.
[2] J. Hole, B. Zelt, 3-D finite-difference reflection traveltimes, Geophysical Journal International 121 (1995) 427–434.
[3] C. Taillandier, M. Noble, H. Chauris, H. Calandra, First-arrival traveltime tomography based on the adjoint-state method, Geophysics 74 (2009) WCB1–WCB10.
[4] V. Grechka, A. De La Pena, E. Schisselé-Rebel, E. Auger, P.-F. Roux, Relative location of microseismicity, Geophysics 80 (2015) WC1–WC9.
[5] G. Lambare, S. Operto, P. Podvin, P. Thierry, 3D ray+Born migration/inversion—part 1: Theory, Geophysics 68 (2003) 1348–1356.
[6] M. G. Crandall, P.-L. Lions, Viscosity solutions of Hamilton-Jacobi equations, Transactions of the American Mathematical Society 277 (1983) 1–42.
[7] V. Cerveny, Seismic ray theory, Cambridge University Press, 2001.
[8] J. E. Vidale, Finite-difference calculation of traveltimes in three dimensions, Geophysics 55 (1990) 521–526.
[9] J. Vidale, Finite-difference calculation of travel times, Bulletin of the Seismological Society of America 78 (1988) 2062–2076.
[10] J. Dellinger, Anisotropic finite-difference traveltimes, in: SEG Technical Program Expanded Abstracts 1991, Society of Exploration Geophysicists, 1991, pp. 1530–1533.
[11] J. Dellinger, W. Symes, Anisotropic finite-difference traveltimes using a Hamilton-Jacobi solver, in: SEG Technical Program Expanded Abstracts 1997, Society of Exploration Geophysicists, 1997, pp. 1786–1789.
[12] S. Kim, R. Cook, 3-D traveltime computation using second-order ENO scheme, Geophysics 64 (1999) 1867–1876.
[13] P. Podvin, I. Lecomte, Finite difference computation of traveltimes in very contrasted velocity models: a massively parallel approach and its associated tools, Geophysical Journal International 105 (1991) 271–284.
[14] D. E. Nichols, Maximum energy traveltimes calculated in the seismic frequency band, Geophysics 61 (1996) 253–263.
[15] Y. Wang, T. Nemeth, R. T. Langan, An expanding-wavefront method for solving the eikonal equations in general anisotropic media, Geophysics 71 (2006) T129–T135.
[16] J. A. Sethian, A. M. Popovici, 3-D traveltime computation using the fast marching method, Geophysics 64 (1999) 516–523.
[17] J. Rickett, S. Fomel, A second-order fast marching eikonal solver, Stanford Exploration Project Report 100 (1999) 287–293.
[18] T. Alkhalifah, S. Fomel, Implementing the fast marching eikonal solver: spherical versus Cartesian coordinates, Geophysical Prospecting 49 (2001) 165–178.
[19] A. M. Popovici, J. A. Sethian, 3-D imaging using higher order fast marching traveltimes, Geophysics 67 (2002) 604–609.
[20] J. A. Sethian, A. Vladimirsky, Ordered upwind methods for static Hamilton–Jacobi equations: Theory and algorithms, SIAM Journal on Numerical Analysis 41 (2003) 325–363.
[21] E. Cristiani, A fast marching method for Hamilton-Jacobi equations modeling monotone front propagations, Journal of Scientific Computing 39 (2009) 189–205.
[22] U. bin Waheed, T. Alkhalifah, H. Wang, Efficient traveltime solutions of the acoustic TI eikonal equation, Journal of Computational Physics 282 (2015) 62–76.
[23] M. Breuß, E. Cristiani, P. Gwosdek, O. Vogel, An adaptive domain-decomposition technique for parallelization of the fast marching method, Applied Mathematics and Computation 218 (2011) 32–44.
[24] J. Monsegny, J. Monsalve, K. León, M. Duarte, S. Becerra, W. Agudelo, H. Arguello, Fast marching method in seismic ray tracing on parallel GPU devices, in: Latin American High Performance Computing Conference, Springer, 2018, pp. 101–111.
[25] H. Zhao, A fast sweeping method for eikonal equations, Mathematics of Computation 74 (2005) 603–627.
[26] Y.-T. Zhang, H.-K. Zhao, J. Qian, High order fast sweeping methods for static Hamilton–Jacobi equations, Journal of Scientific Computing 29 (2006) 25–56.
[27] S. Fomel, S. Luo, H. Zhao, Fast sweeping method for the factored eikonal equation, Journal of Computational Physics 228 (2009) 6440–6455.
[28] S. Luo, J. Qian, Fast sweeping methods for factored anisotropic eikonal equations: multiplicative and additive factors, Journal of Scientific Computing 52 (2012) 360–382.
[29] U. B. Waheed, C. E. Yarman, G. Flagg, An iterative, fast-sweeping-based eikonal solver for 3D tilted anisotropic media, Geophysics 80 (2015) C49–C58.
[30] S. Han, W. Zhang, J. Zhang, Calculating qP-wave traveltimes in 2-D TTI media by high-order fast sweeping methods with a numerical quartic equation solver, Geophysical Journal International 210 (2017) 1560–1569.
[31] P. Le Bouteiller, M. Benjemaa, L. Métivier, J. Virieux, A discontinuous Galerkin fast-sweeping eikonal solver for fast and accurate traveltime computation in 3D tilted anisotropic media, Geophysics 84 (2019) C107–C118.
[32] Q. Hao, U. Waheed, T. Alkhalifah, A fast sweeping scheme for P-wave traveltimes in attenuating VTI media, in: 80th EAGE Conference and Exhibition 2018, volume 2018, European Association of Geoscientists & Engineers, 2018, pp. 1–5.
[33] H. Lan, Z. Zhang, A high-order fast-sweeping scheme for calculating first-arrival travel times with an irregular surface, Bulletin of the Seismological Society of America 103 (2013) 2070–2082.
[34] H. Zhao, Parallel implementations of the fast sweeping method, Journal of Computational Mathematics (2007) 421–429.
[35] M. Detrixhe, F. Gibou, C. Min, A parallel fast sweeping method for the eikonal equation, Journal of Computational Physics 237 (2013) 46–55.
[36] J. V. Gómez, D. Álvarez, S. Garrido, L. Moreno, Fast methods for eikonal equations: an experimental survey, IEEE Access 7 (2019) 39005–39029.
[37] M. I. Jordan, T. M. Mitchell, Machine learning: Trends, perspectives, and prospects, Science 349 (2015) 255–260.
[38] I. E. Lagaris, A. Likas, D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Transactions on Neural Networks 9 (1998) 987–1000.
[39] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707.
[40] S. Karimpouli, P. Tahmasebi, Physics informed machine learning: Seismic wave equation, Geoscience Frontiers 11 (2020) 1993–2001.
[41] C. Song, T. Alkhalifah, U. b. Waheed, Solving the acoustic VTI wave equation using physics-informed neural networks, arXiv preprint arXiv:2008.01865 (2020).
[42] U. b. Waheed, T. Alkhalifah, A fast sweeping algorithm for accurate solution of the tilted transversely isotropic eikonal equation using factorization, Geophysics 82 (2017) WB1–WB8.
[43] L. Thomsen, Weak elastic anisotropy, Geophysics 51 (1986) 1954–1966.
[44] G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems 2 (1989) 303–314.
[45] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989) 359–366.
[46] Z. Lu, H. Pu, F. Wang, Z. Hu, L. Wang, The expressive power of neural networks: A view from the width, in: Advances in Neural Information Processing Systems, 2017, pp. 6231–6239.
[47] A. G. Baydin, B. A. Pearlmutter, A. A. Radul, J. M. Siskind, Automatic differentiation in machine learning: a survey, The Journal of Machine Learning Research 18 (2017) 5595–5637.
[48] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng, TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. URL: