Gaussian processes for data fulfilling linear differential equations
Christopher G. Albert
Max-Planck-Institut für Plasmaphysik, Boltzmannstr. 2, 85748 Garching
[email protected]
September 10, 2019
Abstract
A method to reconstruct fields, source strengths and physical parameters based on Gaussian process regression is presented for the case where data are known to fulfill a given linear differential equation with localized sources. The approach is applicable to a wide range of data from physical measurements and numerical simulations. It is based on the well-known invariance of Gaussian processes under linear operators, in particular differentiation. Instead of using a generic covariance function to represent data from an unknown field, the space of possible covariance functions is restricted to allow only Gaussian random fields that fulfill the homogeneous differential equation. The resulting tailored kernel functions lead to more reliable regression compared to a generic kernel and make some hyperparameters directly interpretable. For differential equations representing laws of physics such a choice limits realizations of random fields to physically possible solutions. Source terms are added by superposition and their strength estimated in a probabilistic fashion, together with possibly unknown hyperparameters with physical meaning in the differential operator.
Introduction

The larger context of the present work is the goal to construct reduced complexity models as emulators or surrogates that retain mathematical and physical properties of the underlying system. Similar to usual numerical models, such methods aim to represent infinite systems by exploiting finite information in some optimal sense. In the spirit of structure-preserving numerics the aim here is to move errors to the "right place", in order to retain laws such as conservation of mass, energy or momentum.

This article deals with Gaussian process (GP) regression on data with additional information known in the form of linear, generally partial differential equations (PDEs). An illustrative application is the reconstruction of an acoustic sound pressure field and its sources from discrete microphone measurements. GPs, a special class of random fields, are used in a probabilistic rather than a stochastic sense: to approximate a fixed but unknown field from possibly noisy local measurements. Uncertainties in this reconstruction are modeled by a normal distribution. In the limit of zero measured data a prior has to be chosen whose realizations take values of the expected order of magnitude. An appropriate choice of a covariance function or kernel guarantees that all fields drawn from the GP at any stage fulfill the underlying PDE. This may require giving up stationarity of the process.

Techniques to fit GPs to data from PDEs have been known for some time, especially in the field of geostatistics [1]. A general analysis including a number of important properties is given by [2]. In these earlier works GPs are usually referred to as Kriging, and stationary covariance functions / kernels as covariograms. A number of more recent works from various fields [3, 4, 5] use the linear operator of the problem to obtain a new kernel function for the source field by applying it twice to a generic, usually squared exponential, kernel.
In contrast to the present approach, that method is best suited for source fields that are non-vanishing across the whole domain. In terms of deterministic numerical methods one could say that it corresponds to meshless variants of the finite element method (FEM). The approach in the present work instead represents a probabilistic variant of a procedure related to the boundary element method (BEM), also known as the method of fundamental solutions (MFS) or regularized BEM [6, 7, 8]. As in the BEM, the MFS builds on fundamental solutions, but allows placing sources outside the boundary rather than localizing them on a layer. Thus the MFS avoids singularities in boundary integrals of the BEM while retaining a similar ratio of numerical effort to accuracy for smooth solutions. To the author's knowledge the probabilistic variant of the MFS via GPs was first introduced by [9] to solve the boundary value problem of the Laplace equation and dubbed Bayesian boundary elements estimation method ((BE)²M). That work also provides a detailed treatment of kernels for the 2D Laplace equation. A more extensive and general treatment of the Bayesian context as well as kernels and their connection to fundamental solutions is available in [10] under the term probabilistic meshless methods (PMM).

While [9] is focused on boundary data of a single homogeneous equation, and [10] provides a detailed mathematical foundation, the present work aims to explore the topic further for application and extends the recent work in [11]. Starting from general notions, some regression techniques are introduced with emphasis on the role of localized sources. For this purpose the Poisson, Helmholtz and heat equations are considered and several kernels are derived and tested. To fit a GP to a homogeneous (source-free) PDE, kernels are built via the according fundamental solutions. Possible singularities (sources) are moved outside the domain of interest. In particular, boundary conditions on a finite domain can be either supplied or reconstructed in this fashion. In addition, contributions by internal sources are superimposed, using again fundamental solutions in the free field. For that part, boundary conditions of the actual problem are irrelevant. The specific approach taken here is most efficient for source-free regions with possibly few localized sources that are represented by monopoles or dipoles.

Gaussian processes (GPs) are a useful tool to represent and update incomplete information on scalar fields u(x), i.e. a real number u depending on a (multi-dimensional) independent variable x. A GP with mean m(x) and covariance function or kernel k(x, x′) is denoted as

u(x) ∼ 𝒢𝒫(m(x), k(x, x′)).   (1)

The choice of an appropriate kernel k(x, x′) restricts realizations of (1) to respect regularity properties of u(x) such as continuity or characteristic length scales. Often regularity of u does not appear by chance, but rather reflects an underlying law. We are going to exploit such laws in the construction and application of Gaussian processes describing u for the case of linear (partial) differential equations

L̂u(x) = q(x).   (2)

Here L̂ is a linear differential operator and q(x) an inhomogeneous source term. In physical laws, dimensions of x usually consist of space and/or time. Physical scalar fields u include e.g. pressure p, temperature T or the electrostatic potential φ_e. Corresponding laws, valid under certain conditions, include Gauss' law of electrostatics for φ_e with Laplacian L̂ = εΔ, frequency-domain acoustics for p with Helmholtz operator L̂ = Δ + k₀², or thermodynamics for T with heat/diffusion operator L̂ = ∂/∂t − DΔ. These operators contain free parameters, namely permittivity ε, wavenumber k₀ and diffusivity D, respectively. While ε may be absorbed inside q in a uniform material model of electrostatics, estimation of the parameters k₀ or D is useful for material characterization.

For the representation of PDE solutions the weight-space view of Gaussian process regression is useful. There the kernel k is represented via a tuple φ(x) = (φ₁(x), φ₂(x), …) of basis functions φ_i(x) that underlie a linear regression model

u(x) = φ(x)ᵀw = Σ_i φ_i(x) w_i.   (3)

Bayesian inference starting from a Gaussian prior with covariance matrix Σ_p for the weights w yields a Mercer kernel

k(x, x′) ≡ φᵀ(x) Σ_p φ(x′) = Σ_{i,j} φ_i(x) Σ_p^{ij} φ_j(x′).   (4)

The existence of such a representation is guaranteed by Mercer's theorem in the context of reproducing kernel Hilbert spaces (RKHS) [8].
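As a minimal numerical sketch of the weight-space construction (3)-(4) (the basis, its size and the diagonal prior below are arbitrary illustrative choices, not yet tied to any PDE), a kernel assembled from a finite basis and a weight prior is symmetric and positive semidefinite by construction:

```python
import numpy as np

def phi(x):
    """Hypothetical finite basis (phi_1, phi_2, phi_3) = (1, x, x^2)."""
    return np.stack([np.ones_like(x), x, x**2], axis=-1)  # shape (N, 3)

Sigma_p = np.diag([1.0, 0.5, 0.25])  # prior covariance of the weights w

def k(x, xp):
    """Mercer kernel k(x, x') = phi(x)^T Sigma_p phi(x'), cf. eq. (4)."""
    return phi(x) @ Sigma_p @ phi(xp).T

x = np.linspace(-1.0, 1.0, 20)
K = k(x, x)

# Symmetry and positive semidefiniteness hold by construction, up to round-off.
print(np.allclose(K, K.T), np.linalg.eigvalsh(K).min() >= -1e-10)  # True True
```

Realizations of the corresponding GP are obtained by sampling w ∼ 𝒩(0, Σ_p) and forming u(x) = φ(x)ᵀw; restricting the φ_i to solutions of the homogeneous PDE is what the tailored kernels below exploit.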
More generally one can also define kernels on an uncountably infinite number of basis functions in analogy to (3) via

f(x) = φ̂[w(ζ)] = ⟨φ(x, ζ), w(ζ)⟩ = ∫ φ(x, ζ) w(ζ) dζ,   (5)

where φ̂ is a linear operator acting on elements w(ζ) of an infinite-dimensional weight space parametrized by an auxiliary index variable ζ that may be multi-dimensional. We represent φ̂ via an inner product ⟨φ(x, ζ), w(ζ)⟩ in the respective function space, given by an integral over ζ. The infinite-dimensional analogue to the prior covariance matrix is a prior covariance operator Σ̂_p that defines the kernel as a bilinear form

k(x, x′) ≡ ⟨φ(x, ζ), Σ̂_p φ(x′, ζ′)⟩ ≡ ∫∫ φ(x, ζ) Σ_p(ζ, ζ′) φ(x′, ζ′) dζ dζ′.   (6)

(The more general case of complex-valued fields and vector fields is left open for future investigations in this context.) Kernels of the form (6) are known as convolution kernels. Such a kernel is at least positive semidefinite, and positive definiteness follows in the case of linearly independent basis functions φ(x, ζ) [8].

For the treatment of PDEs, possible choices of index variables in (4) or (6) include separation constants of analytical solutions, or the frequency variable of an integral transform. In accordance with [10], using basis functions that satisfy the underlying PDE, a probabilistic meshless method (PMM) is constructed. In particular, if ζ parameterizes positions of sources, and φ(x, ζ) = G(x, ζ) in (6) is chosen to be a fundamental solution / Green's function G(x, ζ) of the PDE, one may call the resulting scheme a probabilistic method of fundamental solutions (pMFS). In [10] sources are placed across the whole computational domain, and the resulting kernel is called natural.
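A discrete sketch of such a pMFS kernel, as a finite-sum stand-in for (6) with φ(x, ζ) = G(x, ζ) (the number of sources, their ring radius and the unit diagonal prior are illustrative assumptions): with all sources on a circle of radius 2, the kernel is harmonic in each argument everywhere inside the unit disk.

```python
import numpy as np

def G(x, z):
    """Fundamental solution of the 2D Laplace equation, G(x, z) = -ln|x - z|/(2 pi)."""
    return -np.log(np.linalg.norm(x - z, axis=-1)) / (2 * np.pi)

# Discrete sources zeta_i on a circle of radius 2, outside the unit-disk domain.
angles = np.linspace(0.0, 2.0 * np.pi, 32, endpoint=False)
zeta = 2.0 * np.stack([np.cos(angles), np.sin(angles)], axis=-1)

def k(x, xp):
    """pMFS kernel with unit diagonal prior: k(x, x') = sum_i G(x, zeta_i) G(x', zeta_i)."""
    return G(x, zeta) @ G(xp, zeta)

# Five-point stencil: the Laplacian of k in its first argument vanishes in the domain.
x0, xp0, h = np.array([0.3, -0.2]), np.array([-0.1, 0.4]), 1e-3
lap = (k(x0 + [h, 0.0], xp0) + k(x0 - [h, 0.0], xp0)
       + k(x0 + [0.0, h], xp0) + k(x0 - [0.0, h], xp0) - 4.0 * k(x0, xp0)) / h**2
print(abs(lap) < 1e-5)  # numerically harmonic inside the domain
```

Because every basis function G(·, ζ_i) is harmonic away from its source, every GP realization built from this kernel satisfies the homogeneous equation inside the domain.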
Here we will instead place sources in the exterior to fulfill the homogeneous interior problem, as in the classical MFS [6, 7, 8]. Technically, this is also achieved by setting Σ_p(ζ, ζ′) = 0 for ζ or ζ′ in the interior. For discrete sources localized at ζ = ζ_i one obtains again discrete basis functions φ_i(x) = G(x, ζ_i) for (4).

More generally, according to theorem 2 of [2], for linear PDE operators L̂ in (2), mean and kernel satisfy

L̂m(x) = q(x),   (7)
L̂k(x, x′) = 0.   (8)

Here L̂ acts on the first argument of k(x, x′). Sources affect only the mean m(x) of the Gaussian process, whereas the kernel k(x, x′) should be based on the homogeneous equation. This hints at the technique of [12], discussed in chapter 2.7 of [13], to treat m(x) via a linear model added on top of a zero-mean process for the homogeneous equation. In that case we consider the superposition

u(x) = u_h(x) + u_p(x),   (9)
u_h(x) ∼ 𝒢𝒫(0, k(x, x′)),   (10)
u_p(x) = hᵀ(x) b,   (11)
b ∼ 𝒩(b₀, B),   (12)

where hᵀ(x) b is a linear model for m(x) with Gaussian prior mean b₀ and covariance B for the model coefficients. The homogeneous part (10) corresponds to a random process u_h(x) where a source-free k is constructed according to (8). The inhomogeneous part (11) may be given by any particular solution u_p(x) for arbitrary boundary conditions. Using the limit of a vague prior with |B⁻¹| → 0, i.e. minimum information / infinite prior covariance [12, 13], the posteriors for mean ū and covariance matrix cov(u, u) based on given training data y = u(X) + ε with measurement noise variance σ_n² are

ū(X⋆) = K⋆ᵀ K_y⁻¹ (y − Hᵀb̄) + H⋆ᵀb̄ = K⋆ᵀ K_y⁻¹ y + Rᵀb̄,   (13)
cov(u(X⋆), u(X⋆)) = K⋆⋆ − K⋆ᵀ K_y⁻¹ K⋆ + Rᵀ (H K_y⁻¹ Hᵀ)⁻¹ R.   (14)

Here X = (x₁, x₂, … x_N) contains the training points and X⋆ = (x⋆₁, x⋆₂, …, x⋆_{N⋆}) the evaluation or test points. Functions of X and X⋆ are to be understood as vectors or matrices resulting from evaluation at the different positions, i.e. ū(X⋆) ≡ (ū(x⋆₁), ū(x⋆₂), …, ū(x⋆_{N⋆})) is a tuple of predicted expectation values. The matrix K ≡ k(X, X) is the kernel covariance of the training data with entries K_ij ≡ k(x_i, x_j), and cov(u(X⋆), u(X⋆))_ij ≡ cov(u(x⋆_i), u(x⋆_j)) are the entries of the predicted covariance matrix for u evaluated at the test points x⋆_i. Furthermore K_y ≡ k(X, X) + σ_n² I, K⋆ ≡ k(X, X⋆), K⋆⋆ ≡ k(X⋆, X⋆), R ≡ H⋆ − H K_y⁻¹ K⋆, the entries of H and H⋆ are H_ij ≡ h_i(x_j) and H⋆_ij ≡ h_i(x⋆_j), and b̄ ≡ (H K_y⁻¹ Hᵀ)⁻¹ H K_y⁻¹ y.

A linear model for m(x) fulfilling a PDE according to (8) follows directly from the source representation. Consider sources modeled as a linear superposition over basis functions

q(x) = Σ_i ϕ_i(x) q_i   (15)

with unknown source strength coefficients q = (q_i). To model the mean instead of the source functions themselves, one uses an according superposition

m(x) = Σ_i u_{p,i}(x) q_i   (16)

of particular solutions u_{p,i}(x) from the inhomogeneous equations

L̂u_{p,i}(x) = ϕ_i(x).   (17)

For the linear model (9) this means that b = q and h_i(x) = u_{p,i}(x). The posterior mean of the source strengths and their uncertainty are

q̄ = (H K_y⁻¹ Hᵀ)⁻¹ H K_y⁻¹ y,   (18)
cov(q, q) = (H K_y⁻¹ Hᵀ)⁻¹.   (19)

One can easily check that the predicted mean ū(x⋆) = ū_h(x⋆) + ū_p(x⋆) at a specific point x⋆ in (13) fulfills the linear differential equation (2). In the homogeneous part ū_h(x⋆) = k(x⋆, X) K_y⁻¹ (y − Hᵀq̄) sources are absent, with L̂ū_h(x⋆) = 0, where L̂ acts on x⋆. The particular solution ū_p(x⋆) = hᵀ(x⋆) q̄ = Σ_i u_{p,i}(x⋆) q̄_i adds source contributions q̄_i ϕ_i(x⋆) due to (17). For point monopole sources ϕ_i(x) = δ(x − x_{q,i}) placed at positions x_{q,i}, the particular solution u_{p,i}(x) equals the fundamental solution G(x, x_{q,i}) evaluated for the respective source. In the absence of sources the part described in this subsection is not modeled, R vanishes, and (13-14) reduce to the posteriors of a GP with prior mean m(x) = 0.

Here the general results described in the previous section are applied to specific equations. Regression is performed based on values measured at a set of sampling points x_i and may also include optimization of hyperparameters β appearing as auxiliary variables inside the kernel k(x, x′; β). The optimization step is usually performed in a maximum a posteriori (MAP) sense, choosing β_MAP as fixed rather than providing a joint probability distribution including β as random variables. We note that depending on the setting this choice may lead to underestimation of uncertainties in the reconstruction of u, in particular for sparse, low-quality measurements.

First we explore the construction of kernels in (10) for a purely homogeneous problem in a finite- and an infinite-dimensional index space, depending on the mode of separation. Consider Laplace's equation

Δu(x) = 0.   (20)

In contrast to the Helmholtz equation, Laplace's equation has no scale, i.e. it permits all length scales in the solution. In the 2D case using polar coordinates the Laplacian becomes

(1/r) ∂/∂r (r ∂u/∂r) + (1/r²) ∂²u/∂θ² = 0.   (21)

A well-known family of solutions for this problem based on separation of variables is

u = r^{±m} e^{±imθ},   (22)

leading to a family of real solutions

r^m cos(mθ), r^m sin(mθ), r^{−m} cos(mθ), r^{−m} sin(mθ).   (23)

Since our aim is to work in bounded regions, we discard the solutions with negative exponent that diverge at r = 0. Choosing a diagonal prior that weights sine and cosine terms equivalently [9] and introducing a length scale s as a free parameter, we obtain a kernel according to (4) with

k(x, x′; s) = Σ_{m=0}^∞ (rr′/s²)^m σ_m² (cos(mθ) cos(mθ′) + sin(mθ) sin(mθ′))
            = Σ_{m=0}^∞ (rr′/s²)^m σ_m² cos(m(θ − θ′)).   (24)

A flat prior σ_m² = 1, with the length scale s as a hyperparameter, yields

k(x, x′; s) = (1 − (rr′/s²) cos(θ − θ′)) / (1 − 2(rr′/s²) cos(θ − θ′) + (rr′)²/s⁴)
            = (1 − x·x′/s²) / (1 − 2x·x′/s² + |x|²|x′|²/s⁴).   (25)

This kernel is not stationary, but isotropic around a fixed coordinate origin. Introducing a mirror point x̄′ with polar angle θ̄′ = θ′ and radius r̄′ = s²/r′, we notice that (25) can be written as

k(x, x′; s) = (|x̄′|² − x·x̄′) / (x − x̄′)²,   (26)

making a dipole singularity apparent at x = x̄′. In addition k is normalized to 1 at x = 0. Choosing s > R larger than the radius R of a circle centered at the origin and enclosing the computational domain, we have r̄′ = s²/r′ > s²/s = s > R. Thus all mirror points and the according singularities are moved outside the domain. Choosing instead a slowly decaying σ_m² = 1/m, excluding m = 0, yields

k(x, x′; s) = −(1/2) ln(1 − 2x·x′/s² + |x|²|x′|²/s⁴) = −ln(|x − x̄′|/|x̄′|).   (27)

Instead of a dipole singularity that expression features a monopole singularity at x = x̄′ that is avoided as mentioned above.

Using instead Cartesian coordinates x, y to separate the Laplacian provides harmonic functions like

u = e^{±κx} e^{±iκy}.   (28)

Here all solutions yield finite values at x = 0, so we don't have to exclude any of them a priori. Introducing again a diagonal covariance operator in (6) and taking the real part yields

k(x, x′) = ∫ φ(x, κ) σ²(κ) φ(x′, κ) dκ = Re ∫_{−∞}^{∞} σ²(κ) e^{κ(x ± x′)} e^{iκ(y ± y′)} dκ.   (29)

Choosing a Gaussian σ²(κ) with a characteristic length scale s, together with a possible rotation angle θ of the coordinate frame, yields the kernel

k(x, x′; s, θ) = (1/2) Re exp((((x + x′) ± i(y − y′)) e^{iθ})²/s²).   (30)

Other sign combinations do not yield a positive definite kernel. Similar to the polar kernel (26) before, we could not obtain a fully stationary expression that depends only on differences between the coordinates of x and x′.

For demonstration purposes we consider an analytical solution to a boundary value problem of Laplace's equation on a square domain Ω with corners at (x, y) = (±1, ±1). The reference solution is

u_ref(x, y) = e^y cos x + e^x cos y   (31)

and depicted in the upper left of Fig. 1 together with its extension outside the boundaries. This figure also shows results from a GP fitted to data with artificial noise σ_n and length scale s = 2, as well as the reconstructed source density q = Δū of the prediction (bottom right). Inside Ω the solution is represented with errors below 5%. This is also reflected in the error predicted by the posterior variance of the GP, which remains small in the region enclosed by the measurement points. The analogy in classical analysis is the theorem that the solution of a homogeneous elliptic equation is fully determined by its boundary values.

In comparison, a reconstruction using a generic squared exponential kernel k ∝ exp(−(x − x′)²/(2s²)) yields a result of similar approximation quality in Fig. 2. The posterior covariance of that reconstruction is however not able to capture the vanishing error inside the enclosed domain due to the given boundary data. More severely, in contrast to the previous case, the posterior mean ū does not satisfy Laplace's equation Δū = 0 inside Ω, showing up in the difference to the reconstruction in Fig. 2. This kind of error is quantified by computing the reconstructed charge density q̄ = Δū. This is fine if data from Poisson's equation Δu = q with distributed charges should be fitted instead. However, to keep Δu = 0 inside Ω, one requires more specialized kernels such as (26).

To demonstrate the proposed method in full we now consider the Helmholtz equation with sources,

Δu(x) + k₀² u(x) = q(x).   (32)

Stationary kernels based on Bessel functions for the homogeneous equation have been presented in [11]. These functions provide smoothing regularization on the order of the wavelength λ = 2π/k₀ and have been demonstrated to produce excellent field reconstruction from point measurements. Here we consider the two-dimensional case. The method of source strength reconstruction is improved compared to [11], as it now constitutes a linear problem according to (18-19). Non-linear optimization is instead applied to the wavenumber k₀ as a free hyperparameter to be estimated during the GP regression.

[Fig. 3: source strengths q̄ with 95% confidence interval according to the posterior (18-19); negative log-likelihood (bottom right) with its optimum for the Bessel kernel [11] (solid line), with the actual value of k₀ shown as dotted line; the length scale of a squared exponential kernel (dashed line) is less peaked.]
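The source-strength estimate (18) and its covariance (19) can be exercised on a toy 1D example. Everything concrete here is a hedged stand-in: a generic squared exponential kernel instead of the Bessel kernel, and a hypothetical basis h(x) = (1, x) in place of particular solutions. When the data lie exactly in the span of the basis, (18) recovers the coefficients:

```python
import numpy as np

def k(x, xp):
    """Generic squared exponential kernel (stand-in for a PDE-tailored kernel)."""
    return np.exp(-0.5 * (x[:, None] - xp[None, :])**2)

def h(x):
    """Hypothetical explicit basis h_i(x) playing the role of particular solutions."""
    return np.stack([np.ones_like(x), x])  # shape (2, N)

X = np.linspace(-2.0, 2.0, 15)
b_true = np.array([2.0, 0.5])
y = h(X).T @ b_true          # noiseless data lying exactly in the span of h

sigma_n = 0.1                # noise level, here acting only as regularization
Ky = k(X, X) + sigma_n**2 * np.eye(X.size)
H = h(X)
A = H @ np.linalg.solve(Ky, H.T)
b_bar = np.linalg.solve(A, H @ np.linalg.solve(Ky, y))   # eq. (18)
cov_b = np.linalg.inv(A)                                 # eq. (19)
print(np.round(b_bar, 6))    # recovers the true coefficients (2.0, 0.5)
```

With noisy data, b_bar acquires the uncertainty (19), and the full posterior mean (13) adds the kernel part on top of the recovered linear model.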
The setup is the same as in [11]: a 2D cavity with various boundary conditions and two sound sources of strengths 0.5 and 1, respectively. Results for the sound pressure fulfilling (32) are normalized to have a maximum of p/p₀ = 1. Fig. 3 shows the error in the field reconstruction depending on the number of measurement positions. Here noise of σ_n = 0.01 has been added to the samples. The obtained negative log-likelihood depending on k₀ permits an accurate reconstruction of this quantity that has the physical meaning of a wavenumber. A generic squared exponential kernel k ∝ exp(−(x − x′)²/(2(2π/k₀)²)) leads to results of similar quality and a slightly less peaked spatial length scale hyperparameter without a direct physical interpretation.

Consider the homogeneous heat/diffusion equation

∂u/∂t − DΔu = 0   (33)

for (x, t) ∈ ℝ × ℝ⁺. Integrating the fundamental solution

G = 1/√(4πD(t − τ)) exp(−(x − ξ)²/(4D(t − τ)))

from ξ = −∞ to ∞ at τ = 0, i.e. placing sources everywhere at a single point in time, leads to the kernel

k_n(x − x′, t + t′; D) = 1/√(4πD(t + t′)) e^{−(x − x′)²/(4D(t + t′))}.   (34)

In terms of x this is a stationary squared exponential kernel and the natural kernel over the domain x ∈ ℝ. The kernel broadens with increasing t and t′. Non-stationarity in time can also be considered natural to the heat equation, since its solutions show a preferred time direction on each side of the singularity at t = 0. The only difference of (34) to the singular heat kernel is the positive sign between t and t′. If both of them are positive, k is guaranteed to take finite values.

As for the Laplace equation it is also convenient to define a spatially non-stationary kernel by cutting out a finite source-free domain. Evaluating the integral over the fundamental solution in ℝ \ (a, b), i.e. without our domain interval (a, b), we obtain

k_n(x, t, x′, t′) = k_n(x − x′, t + t′; D) [1 − (g(x, t, x′, t′; D, b) − g(x, t, x′, t′; D, a))/2],   (35)

where

g(x, t, x′, t′; D, s) ≡ erf( ((s − x)/t + (s − x′)/t′) / (2√D √(1/t + 1/t′)) ).   (36)

Incorporating the prior knowledge that there are no domain sources could potentially improve the reconstruction. Initial investigations on the initial-boundary value problem of the heat equation based on those kernels produce stable results showing natural regularization within the limits of the strongly ill-posed setting. Reconstruction of the diffusivity D has proven to be a difficult task and requires further investigation.

Summary and Outlook
A framework for the application of Gaussian process regression to data from an underlying partial differential equation has been presented. The method is based on Mercer kernels constructed from fundamental solutions and produces realizations that match the homogeneous problem exactly. Contributions from sources are superimposed via an additional linear model. Several examples of suitable kernels have been given for Laplace's equation, the Helmholtz equation and the heat equation. Regression performance has been shown to yield results of similar or higher quality compared to a squared exponential kernel in the considered application cases. Advantages of the specialized kernel approach are the possibility to represent the exact absence of sources as well as the physical interpretability of hyperparameters.

In a next step, reconstruction of vector fields via GPs could be formulated, taking laws such as Maxwell's equations or Hamilton's equations of motion into account. A starting point could be squared exponential kernels for divergence- and curl-free vector fields [14]. Such kernels have been used in [15] to perform statistical reconstruction, and [16] apply them to GPs for source identification in the Laplace/Poisson equation. In order to model Hamiltonian dynamics in phase space, vector-valued GPs could possibly be extended to represent not only volume-preserving (divergence-free) maps but retain full symplectic properties, thereby conserving integrals of motion such as energy or momentum.
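The divergence-free idea behind [14, 15] can be sketched without the matrix-valued kernel machinery (centers, weights and length scale below are illustrative assumptions): any 2D field built as the rotated gradient u = (∂ψ/∂y, −∂ψ/∂x) of a scalar squared exponential model is divergence-free by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
centers = rng.uniform(-1.0, 1.0, size=(8, 2))  # illustrative bump centers
alpha = rng.standard_normal(8)                 # illustrative weights
s2 = 0.25                                      # squared length scale (assumed)

def grad_psi(x):
    """Gradient of the potential psi(x) = sum_i alpha_i exp(-|x - c_i|^2/(2 s2))."""
    w = alpha * np.exp(-np.sum((x - centers)**2, axis=-1) / (2 * s2))
    return np.sum(w[:, None] * (-(x - centers) / s2), axis=0)

def u(x):
    """Divergence-free 2D field: rotated gradient u = (d psi/dy, -d psi/dx)."""
    g = grad_psi(x)
    return np.array([g[1], -g[0]])

# Central-difference check that div u vanishes.
x0, h = np.array([0.2, -0.3]), 1e-5
div = ((u(x0 + [h, 0.0])[0] - u(x0 - [h, 0.0])[0])
       + (u(x0 + [0.0, h])[1] - u(x0 - [0.0, h])[1])) / (2 * h)
print(abs(div) < 1e-6)  # divergence is numerically zero
```

Matrix-valued kernels encode the same constraint directly at the covariance level, so that every GP realization of the vector field is divergence-free; curl-free fields follow analogously from u = ∇ψ.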
Acknowledgments
I would like to thank Dirk Nille, Roland Preuss and Udo von Toussaint for insightful discussions. This study is a contribution to the Reduced Complexity Models grant, number ZT-I-0010, funded by the Helmholtz Association of German Research Centers.
References

[1] A. Dong, "Kriging variables that satisfy the partial differential equation ΔZ = Y," in Geostatistics, pp. 237–248, 1989.
[2] K. G. van den Boogaart, "Kriging for processes solving partial differential equations," in IAMG2001, Cancun, Mexico, pp. 1–21, 2001.
[3] T. Graepel, "Solving noisy linear operator equations by Gaussian processes: Application to ordinary and partial differential equations," in Proc. Int. Conf. Mach. Learn. (T. Fawcett and N. Mishra, eds.), pp. 234–241, 2003.
[4] S. Särkkä, "Linear operators and stochastic partial differential equations in Gaussian process regression," Lect. Notes Comput. Sci., vol. 6792, part 2, pp. 151–158, 2011.
[5] M. Raissi, P. Perdikaris, and G. E. Karniadakis, "Inferring solutions of differential equations using noisy multi-fidelity data," J. Comput. Phys., vol. 335, pp. 736–746, 2017.
[6] K. Lackner, "Computation of ideal MHD equilibria," Comput. Phys. Commun., vol. 12, no. 1, pp. 33–44, 1976.
[7] M. A. Golberg, "The method of fundamental solutions for Poisson's equation," Eng. Anal. Bound. Elem., vol. 16, no. 3, pp. 205–213, 1995.
[8] R. Schaback and H. Wendland, "Kernel techniques: From machine learning to meshless methods," Acta Numer., vol. 15, pp. 543–639, 2006.
[9] F. M. Mendes and E. A. da Costa Junior, "Bayesian inference in the numerical solution of Laplace's equation," AIP Conf. Proc., vol. 1443, pp. 72–79, 2012.
[10] J. Cockayne, C. Oates, T. Sullivan, and M. Girolami, "Probabilistic numerical methods for partial differential equations and Bayesian inverse problems," arXiv preprint, 2016.
[11] C. Albert, "Physics-informed transfer path analysis with parameter estimation using Gaussian processes," 2019.
[12] A. O'Hagan, "Curve fitting and optimal design for prediction," J. R. Stat. Soc. Ser. B, vol. 40, no. 1, pp. 1–24, 1978.
[13] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. MIT Press, 2006.
[14] F. J. Narcowich and J. D. Ward, "Generalized Hermite interpolation via matrix-valued conditionally positive definite functions," Math. Comput., vol. 63, no. 208, p. 661, 1994.
[15] I. Macêdo and R. Castro, "Learning divergence-free and curl-free vector fields with matrix-valued kernels," Tech. Rep., Inst. Nac. Mat. Pura e Apl., Brazil, 2008.
[16] A. D. Cobb, R. Everett, A. Markham, and S. J. Roberts, "Identifying sources and sinks in the presence of multiple agents with Gaussian process vector calculus," in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min. (KDD '18), pp. 1254–1262, 2018.