Modern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport
MModern Monte Carlo Variants for UncertaintyQuantification in Neutron Transport
Ivan G. Graham, Matthew J. Parkinson, and Robert ScheichlDedicated to Ian H. Sloan on the occasion of his 80th birthday.
Abstract
We describe modern variants of Monte Carlo methods for UncertaintyQuantification (UQ) of the Neutron Transport Equation, when it is approximated bythe discrete ordinates method with diamond differencing. We focus on the mono-energetic 1D slab geometry problem, with isotropic scattering, where the cross-sections are log-normal correlated random fields of possibly low regularity. Thepaper includes an outline of novel theoretical results on the convergence of the dis-crete scheme, in the cases of both spatially variable and random cross-sections. Wealso describe the theory and practice of algorithms for quantifying the uncertaintyof a linear functional of the scalar flux, using Monte Carlo and quasi-Monte Carlomethods, and their multilevel variants. A hybrid iterative/direct solver for comput-ing each realisation of the functional is also presented. Numerical experiments showthe effectiveness of the hybrid solver and the gains that are possible through quasi-Monte Carlo sampling and multilevel variance reduction. For the multilevel quasi-Monte Carlo method, we observe gains in the computational ε -cost of up to 2 ordersof magnitude over the standard Monte Carlo method, and we explain this theoreti-cally. Experiments on problems with up to several thousand stochastic dimensionsare included. Keywords : Reactor Modelling, Neutron (Boltzmann) Transport Equation, Ra-diative Transport, Monte Carlo, QMC, MLMC, Source Iteration.
Ivan G. Graham ( (cid:0) ) · Matthew J. Parkinson · Robert ScheichlUniversity of Bath, Claverton Down, Bath, BA2 7AY, UKe-mail: [email protected]; [email protected]; [email protected] 1 a r X i v : . [ m a t h . NA ] O c t Ivan G. Graham, Matthew J. Parkinson, and Robert Scheichl
In this paper we will consider the Neutron Transport equation (NTE), sometimesreferred to as the Boltzmann transport equation. This is an integro-differential equa-tion which models the flux of neutrons in a reactor. It has particular applications fornuclear reactor design, radiation shielding and astrophysics [44]. There are manypotential sources of uncertainty in a nuclear reactor, such as the geometry, materialcomposition and reactor wear. Here, we will consider the problem of random spatialvariation in the coefficients (the cross-sections ) in the NTE, represented by corre-lated random fields with potentially low smoothness. Our aim is to understand how uncertainty in the cross-sections propagates through to (functionals of) the neutronflux . This is the forward problem of Uncertainty Quantification.We will quantify the uncertainty using Monte Carlo (MC) type methods, that is,by simulating a finite number of pseudo-random instances of the NTE and by aver-aging the outcome of those simulations to obtain statistics of quantities of interest.Each statistic can be interpreted as an expected value of some (possibly nonlinear)functional of the neutron flux with respect to the random cross-sections. The inputrandom fields typically need to be parametrised with a significant number of randomparameters leading to a problem of high-dimensional integration. MC methods areknown to be particularly well-suited to this type of problem due to their dimensionindependent convergence rates.However, convergence of the MC algorithm is slow and determined by (cid:112) V ( · ) / N ,where V ( · ) is the variance of the quantity of interest and N is the number of samples.For this reason, research is focussed on improving the convergence, whilst retain-ing dimensional independence. Advances in MC methods can broadly be split intotwo main categories: improved sampling and variance reduction. Improved sam-pling methods attempt to find samples that perform better than the pseudo-randomchoice. Effectively, they aim to improve the (cid:112) / N term in the error estimate. A ma-jor advance in sampling methods has come through the development of quasi-MonteCarlo (QMC) methods. Variance reduction methods, on the other hand, attempt toreduce the V ( · ) term in the error estimate and thus reduce the number of samplesneeded for a desired accuracy. Multilevel Monte Carlo (MLMC) methods (initiatedin [28, 18] and further developed in, e.g., [20, 7, 10, 9, 34, 32, 47, 27]) fall into thiscategory. A comprehensive review of MLMC can be found in [19].The rigorous theory of all of the improvements outlined above requires regularityproperties of the solution, the verification of which can be a substantial task. Thereare a significant number of published papers on the regularity of parametric ellipticPDEs, in physical and parameter space, as they arise, e.g., in flow in random modelsof porous media [9, 33, 12, 13, 24, 34, 32]. However, for the NTE, this regularityquestion is almost untouched. Our complementary paper [25] contains a full regu-larity and error analysis of the discrete scheme for the NTE with spatially variableand random coefficients. Here we restrict to a summary of those results.The field of UQ has grown very quickly in recent years and its application toneutron transport theory is currently of considerable interest. There are a number ofgroups that already work on this problem, e.g. [4, 17, 21] and references therein. odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 3 Up to now, research has focussed on using the polynomial chaos expansion (PCE),which comes in two forms; the intrusive and non-intrusive approaches. Both ap-proaches expand the random flux in a weighted sum of orthogonal polynomials.The intrusive approach considers the expansion directly in the differential equation,which in turn requires a new solver (‘intruding’ on the original solver). In contrast,the non-intrusive approach attempts to estimate the coefficients of the PCE directly,by projecting onto the PCE basis cf. [4, eq.(40)]. This means the original solver canbe used as a ‘black box’ as in MC methods. Both of the approaches then use quadra-ture to estimate the coefficients in the PCE. The main disadvantage of standard PCEis that typically the number of terms grow exponentially in the number of stochasticdimensions and in the order of the PCE, the so-called curse of dimensionality .Fichtl and Prinja [17] were some of the first to numerically tackle the 1D slabgeometry problem with random cross-sections. Gilli et al. [21] improved upon thiswork by using (adaptive) sparse grid ideas in the collocation method, to tackle thecurse of dimensionality. Moreover, [5] constructed a hybrid PCE using a combi-nation of Hermite and Legendre polynomials, observing superior convergence incomparison to the PCE with just Hermite polynomials. More recently [4] tackledthe (time-independent) full criticality problem in three spatial, two angular and oneenergy variable. They consider a second expansion, the high-dimensional modelrepresentation (HDMR), which allows them to expand the response (e.g. function-als of the flux) in terms of low-dimensional subspaces of the stochastic variable. ThePCE is used on the HDMR terms, each with their own basis and coefficients. Wenote however, that none of these papers provide any rigorous error or cost analysis.The structure of this paper is as follows. In Section 2, we describe the modelproblem, a 1D slab geometry simplification of the Neutron Transport Equation withspatially varying and random cross-sections. We set out the discretisation of thisequation and discuss two methods for solving the resultant linear systems; a directand an iterative solver. In Section 3, the basic elements of a fully-discrete erroranalysis of the discrete ordinates method with diamond differencing applied to themodel problem are summarised. The full analysis will be given in [25]. In Section4, we introduce a number of variations on the Monte Carlo method for quantify-ing uncertainty. This includes a summary of the theoretical computational costs foreach method. Finally, Section 5 contains numerical results relating to the rest of thepaper. We first present a hybrid solver that combines the benefits of both direct anditerative solvers. Its cost depends on the particular realisation of the cross-sections.Moreover, we present simulations for the UQ problem for the different variants ofthe Monte Carlo methods, and compare the rates with those given by the theory.
The Neutron Transport Equation (NTE) is a physically derived balance equation,that models the angular flux ψ ( r , Θ , E ) of neutrons in a domain, where r is position, Θ is angle and E is energy. Neutrons are modelled as non-interacting particles trav- Ivan G. Graham, Matthew J. Parkinson, and Robert Scheichl elling along straight line paths with some energy E . They interact with the largernuclei via absorption, scattering and fission. The rates σ A , σ S and σ F at which theseevents occur are called the absorption, scattering and fission cross-sections , respec-tively. They can depend on the position r and the energy E of the neutron. Thescattering cross-sections also depend on the energy E (cid:48) after the scattering event, aswell as on the angles Θ and Θ (cid:48) before and after the event.The two main scenarios of interest in neutron transport are the so-called fixedsource problem and the criticality problem . We will focus on the former, whichconcerns the transport of neutrons emanating from some fixed source term f . It hasparticular applications in radiation shielding. We will further simplify our model tothe
1D slab geometry case by assuming • no energy dependence; • dependence only on one spatial dimension and infinite extent of the domain inthe other two dimensions; • no dependence of any cross-sections on angle; • no fission.The resulting simplified model is an integro-differential equation for the angularflux ψ ( x , µ ) such that µ d ψ d x ( x , µ ) + σ ( x ) ψ ( x , µ ) = σ S ( x ) φ ( x ) + f ( x ) , (1)where φ ( x ) = (cid:90) − ψ ( x , µ (cid:48) ) d µ (cid:48) , (2)for any x ∈ ( , ) and µ ∈ [ − , ] , subject to the no in-flow boundary conditions ψ ( , µ ) = , for µ > ψ ( , µ ) = , for µ < . (3)Here, the angular domain is reduced from S to the unit circle S and parametrisedby the cosine µ ∈ [ − , ] of the angle. The equation degenerates at µ =
0, i.e. forneutrons moving perpendicular to the x -direction. The coefficient function σ ( x ) isthe total cross-section given by σ = σ S + σ A . For more discussion on the NTE see[11, 37]. An important problem in industry is to quantify the uncertainty in the fluxes due touncertainties in the cross-sections. Most materials, in particular shielding materialssuch as concrete, are naturally heterogeneous or change their properties over timethrough wear. Moreover, the values of the cross-sections are taken from nucleardata libraries across the world and they can differ significantly between libraries[36]. This means there are large amounts of uncertainty on the coefficients, and thiscould have significant consequences on the system itself. odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 5
To describe the random model, let ( Ω , A , P ) be a probability space with ω ∈ Ω denoting a random event from this space. Consider a (finite) set of partitions ofthe spatial domain, where on each subinterval we assume that σ S = σ S ( x , ω ) and σ = σ ( x , ω ) are two (possibly dependent or correlated) random fields. Then theangular flux and the scalar flux become random fields and the model problem (1),(2) becomes µ d ψ d x ( x , µ , ω ) + σ ( x , ω ) ψ ( x , µ , ω ) = σ S ( x , ω ) φ ( x , ω ) + f ( x ) , (4)where φ ( x , ω ) = (cid:90) − ψ ( x , µ (cid:48) , ω ) d µ (cid:48) (5)and ψ ( · , · , ω ) satisfies the boundary conditions (3). The set of equations (4), (5), (3)have to hold for almost all realisations ω ∈ Ω .For simplicity, we restrict ourselves to deterministic σ A = σ A ( x ) with0 < σ A , min ≤ σ A ( x ) ≤ σ A , max < ∞ , for all x ∈ [ , ] , (6)and assume a log-normal distribution for σ S ( x , ω ) . The total cross-section σ ( x , ω ) isthen simply the log-normal random field with values σ ( x , ω ) = σ S ( x , ω ) + σ A ( x ) . Inparticular, we assume that log σ S is a correlated zero mean Gaussian random field,with covariance function defined by C ν ( x , y ) = σ var − ν Γ ( ν ) (cid:18) √ ν | x − y | λ C (cid:19) ν K ν (cid:18) √ ν | x − y | λ C (cid:19) . (7)This class of covariances is called the Mat´ern class. It is parametrised by the smooth-ness parameter ν ≥ . λ C is the correlation length, σ var is the variance, Γ isthe gamma function and K ν is the modified Bessel function of the second kind.The limiting case, i.e. ν → ∞ , corresponds to the Gaussian covariance function C ∞ ( x , y ) = σ var exp ( −| x − y | / λ C ) .To sample from σ S we use the Karhunen-Lo`eve (KL) expansion of log σ S , i.e.,log σ S ( x , ω ) = ∞ ∑ i = (cid:112) ξ i η i ( x ) Z i ( ω ) , (8)where Z i ∼ N ( , ) i.i.d. Here ξ i and η i are the eigenvalues and the L ( , ) -orthogonal eigenfunctions of the covariance integral operator associated with ker-nel given by the covariance function in (7). In practice, the KL expansion needs tobe truncated after a finite number of terms (here denoted d ). The accuracy of thistruncation depends on the decay of the eigenvalues [38]. For ν < ∞ , this decay isalgebraic and depends on the smoothness parameter ν . In the Gaussian covariancecase the decay is exponential. Note that for the Mat´ern covariance with ν = .
5, theeigenvalues and eigenfunctions can be computed analytically [38]. For other casesof ν , we numerically compute the eigensystem using the Nystr¨om method - see, forexample, [16]. Ivan G. Graham, Matthew J. Parkinson, and Robert Scheichl
The goal of stochastic uncertainty quantification is to understand how the ran-domness in σ S and σ propagates to functionals of the scalar or angular flux. Suchquantities of interest may be point values, integrals or norms of φ or ψ . They arerandom variables and the focus is on estimating their mean, variance or distribution. For each realisation ω ∈ Ω , the stochastic 1D NTE (4), (5), (3) is an integro-differential equation in two variables, space and angle. For ease of presentation,we suppress the dependency on ω ∈ Ω for the moment.We use a 2 N -point quadrature rule (cid:82) − f ( µ ) d µ ≈ ∑ N | k | = w k f ( µ k ) with nodes µ k ∈ [ − , ] \{ } and positive weights w k to discretise in angle, assuming the (anti-)symmetry properties µ − k = − µ k and w − k = w k . (In later sections, we construct sucha rule by using N -point Gauss-Legendre rules on each of [ − , ) and ( , ] .)To discretise in space, we introduce a mesh 0 = x < x < . . . < x M = σ , σ S and is alsoquasiuniform - i.e. the subinterval lengths h j : = x j − x j − satisfy γ h ≤ h j ≤ h : = max j = ,... M h j , for some constant γ >
0. Employing a simple Crank-Nicolson methodfor the transport part of (4), (5) and combining it with the angular quadrature ruleabove we obtain the classical diamond-differencing scheme: µ k Ψ k , j − Ψ k , j − h j + σ j − / Ψ k , j + Ψ k , j − = σ S , j − / Φ j − / + F j − / , j = , ..., M , | k | = , . . . , N , (9)where Φ j − / = N ∑ | k | = w k Ψ k , j + Ψ k , j − , j = , ..., M . (10)Here σ j − / denotes the value of σ at the mid-point of the interval I j = ( x j − , x j ) ,with the analogous meaning for σ S , j − / and F j − / . The notation reflects the factthat (in the next section) we will associate the unknowns Ψ k , j in (9) with the nodalvalues ψ k , h ( x j ) of continuous piecewise-linear functions ψ k , h ≈ ψ ( · , µ k ) .Finally, (9) and (10) have to be supplemented with the boundary conditions Ψ k , =
0, for k > Ψ k , M =
0, for k <
0. If the right-hand side of (9) were known,then (9) could be solved simply by sweeping from left to right (when k >
0) and fromright to left (when k < Φ j − / on the right-hand side meansthat (9) and (10) consitute a coupled system with solution ( Ψ , Φ ) ∈ R NM × R M . Itis helpful to think of Ψ as being composed of 2 N subvectors Ψ k , each with M entries Ψ k , j , consisting of approximations to ψ ( x j , µ k ) with x j ranging over all free nodes.The coupled system (9) and (10) can be written in matrix form as odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 7 (cid:18) T − Σ S − P I (cid:19) (cid:18) ΨΦ (cid:19) = (cid:18) F (cid:19) . (11)Here, the vector Φ ∈ R M contains the approximations of the scalar flux at the M midpoints of the spatial mesh. The matrix T is a block diagonal 2 NM × NM matrix,representing the left hand side of (9). The 2 N diagonal blocks of T , one per angle,are themselves bi-diagonal. The 2 NM × M matrix Σ S simply consists of 2 N identicaldiagonal blocks, one per angle, representing the multiplication of Φ by σ S at themidpoints of the mesh. The M × NM matrix P represents the right hand side of(10), i.e. averaging at the midpoints and quadrature. The matrix I denotes the M × M identity matrix. The vector F ∈ R NM contains 2 N copies of the source termevaluated at the M midpoints of the spatial mesh. We now wish to find the (approximate) fluxes in the linear system (11). We note thatthe matrix T is invertible and has a useful sparsity structure that allows its inverseto be calculated in O ( MN ) operations. However, the bordered system (11) is not aseasy to invert, due to the presence of Σ S and P .To exploit the sparsity of T , we do block elimination on (11) obtaining the Schurcomplement system for the scalar flux, i.e., (cid:0) I − PT − Σ S (cid:1) Φ = PT − F , (12)which now requires the inversion of a smaller (dense) matrix. Note that (12) is afinite-dimensional version of the reduction of the integro-differential equation (4),(5) to the integral form of the NTE, see (20). In this case, the two dominant computa-tions with O ( M N ) and O ( M ) operations respectively, are the triple matrix product PT − Σ S in the construction of the Schur complement and the LU factorisation ofthe M × M matrix (cid:0) I − PT − Σ S (cid:1) . This leads to a totaltheoretical cost of the direct solver ∼ O ( M ( M + N )) . (13)We note that for stability reasons (see §
3, also [42] in a simpler context), the numberof spatial and angular points should be related. A suitable choice is M ∼ N , leadingto a cost of the direct solver of O ( M ) in general.The second approach for solving (11) is an iterative solver commonly referred toas source iteration , cf. [8]. The form of (12) naturally suggests the iteration Φ ( k ) = PT − (cid:16) Σ S Φ ( k − ) + F (cid:17) , (14)where Φ ( k ) is the approximation at the k th iteration, with Φ ( ) = PT − F . This canbe seen as a discrete version of an iterative method for the integral equation (20). Ivan G. Graham, Matthew J. Parkinson, and Robert Scheichl
In practice, we truncate after K iterations. The dominant computations in thesource iteration are the K multiplications with PT − Σ S . Exploiting the sparsity ofall the matrices involved, these multiplications cost O ( MN ) operations, leading toan overall theoretical cost of source iteration ∼ O ( M N K ) . (15)Our numerical experiments in Section 5 show that for N = M the hidden constantsin the two estimates (13) and (15) are approximately the same. Hence, whether theiterative solver is faster than the direct solver depends on whether the number ofiterations K to obtain an accurate enough solution is smaller or larger than M .There are sharp theoretical results on the convergence of source iteration forpiecewise smooth cross-sections [8, Thm 2.20]. In particular, if φ ( K ) ( ω ) denotesthe approximation to φ ( ω ) after K iterations, then (cid:13)(cid:13)(cid:13)(cid:13) σ / (cid:16) φ − φ ( K ) (cid:17) (cid:13)(cid:13)(cid:13)(cid:13) ≤ C (cid:48) (cid:18) η (cid:13)(cid:13)(cid:13)(cid:13) σ S σ (cid:13)(cid:13)(cid:13)(cid:13) ∞ (cid:19) K , (16)for some constant C (cid:48) and η ≤
1. That is, the error decays geometrically with rateno slower than the spatial maximum of σ S / σ . This value depends on ω and willchange pathwise. Using this result as a guide together with (6), we assume that theconvergence of the L -error with respect to K can be bounded by (cid:107) φ − φ ( K ) (cid:107) ≤ C (cid:13)(cid:13)(cid:13)(cid:13) σ S σ (cid:13)(cid:13)(cid:13)(cid:13) K ∞ , (17)for some constant C that we will estimate numerically in Section 5. The rigorous analysis of UQ for PDEs with random coefficients requires estimatesfor the error when discretisations in physical space (e.g. by finite differences) andprobability space (e.g. by sampling techniques) are combined. The physical errorestimates typically need to be probabilistic in form (e.g. estimates of expectationof the physical error). Such estimates are quite well-developed for elliptic PDEs -see for example [9] but this question is almost untouched for the transport equation(or more specifically the NTE). We outline here some results which are proved inthe forthcoming paper [25]. This paper proceeds by first giving an error analysis for(1), (2) with variable cross-sections, which is explicit in σ , σ S , and then uses this toderive probabilistic error estimates for the spatial discretisation (9), (10).The numerical analysis of the NTE (and related integro-differential equationproblems such as radiative transfer) dates back at least as far as the work of H.B.Keller [30]. After a huge growth in the mathematics literature in the 1970’s and1980’s, progress has been slower since. This is perhaps surprising, since discon-tinuous Galerkin (DG) methods have enjoyed a massive recent renaissance and the odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 9 solution of the neutron transport problem was one of the key motivations behindthe original introduction of DG [43]. Even today, an error analysis of the NTE withvariable (even deterministic) cross-sections (with explicit dependence on the data)is still not available, even for the model case of mono-energetic 1D slab geometryconsidered here.The fundamental paper on the analysis of the discrete ordinates method for theNTE is [42]. Here a full analysis of the combined effect of angular and spatial dis-cretisation is given under the assumption that the cross-sections σ and σ S in (4) areconstant. The delicate relation between spatial and angular discretisation parame-ters required to achieve stability and convergence is described there. Later researche.g. [2], [3] produced analogous results for models of increasing complexity and inhigher dimensions, but the proofs were mostly confined to the case of cross-sectionsthat are constant in space. A separate and related sequence of papers (e.g. [35], [48],and [1]) allow for variation in cross-sections, but error estimates explicit in this dataare not available there.The results outlined here are orientated to the case when σ , σ S have relativelyrough fluctuations. As a precursor to attacking the random case, we first considerrough deterministic coefficients defined as follows. We assume that there is somepartition of [ , ] and that σ , σ S are C η functions on each subinterval of the partition(with η ∈ ( , ] ), but that σ , σ S may be discontinuous across the break points. Weassume that the mesh x j introduced in § C η isthe usual H¨older space of index η with norm (cid:107) · (cid:107) η .) We also assume that the sourcefunction f ∈ C η .When discussing the error when (9), (10) is applied to (1), (2), it is useful toconsider the “pure transport” problem: µ d u d x + σ u = g , with u ( ) = , when µ > u ( ) = µ < , (18)and with g ∈ C a generic right-hand side (where µ is now a parameter). Applicationof the Crank-Nicolson method (as in (9)) yields µ (cid:18) U j − U j − h j (cid:19) + σ j − / (cid:18) U j + U j − (cid:19) = g j − / , for j = , ..., M , (19)with analogous boundary conditions, where, for any continuous function c , we use c j − / to denote c ( x j − / ) . Letting V h denote the space of continuous piecewiselinear functions with respect to the mesh { x j } , (19) is equivalent to seeking a u h ∈ V h (with nodal values U j ) such that (cid:90) I j (cid:18) µ d u h d x + (cid:101) σ u h (cid:19) = (cid:90) I j (cid:101) g , j = , . . . , M , where I j = ( x j − , x j ) , and (cid:101) c denotes the piecewise constant function with respect to the grid { x j } whichinterpolates c at the mid-points of subintervals. It is easy to show that both (18) and (19) have unique solutions and we denotethe respective solution operators by S µ and S h µ , i.e. u = S µ g and u h = S h µ g . Bearing in mind the angular averaging process in (2) and (10), it is useful to thenintroduce the corresponding continuous and discrete spatial operators: ( K g )( x ) : = (cid:90) − (cid:0) S µ g (cid:1) ( x ) d µ , and ( K h , N g )( x ) = N ∑ | k | = w k ( S h µ k g )( x ) . It is easy to see (and well known classically - e.g. [29]) that ( K g )( x ) = (cid:90) E ( | τ ( x , y ) | ) g ( y ) d y , where E is the exponential integral and the function τ ( x , y ) = (cid:82) yx σ is known as theoptical path. In fact (even when σ is merely continuous), K is a compact Fredholmintegral operator on a range of function spaces and K h , N is a finite rank approxima-tion to it. The study of these integral operators in the deterministic case is a classicaltopic, e.g. [45]. In the case of random σ , K is an integral operator with a randomkernel which merits further investigation. Returning to (1), (2), we see readily that ψ ( x , µ ) = S µ ( σ S φ + f ) , so that φ = K ( σ S φ + f ) . (20)Moreover (9) and (10) correspond to a discrete analogue of (20) as follows. Intro-duce the family of functions ψ h , Nk ∈ V h , | k | = , . . . , N , by requiring ψ h , Nk to havenodal values Ψ k , j . Then set φ h , N : = N ∑ | k | = w k ψ h , Nk ∈ V h , and it follows that (9) and (10) may be rewritten (for each j = , ..., M ) (cid:90) I j (cid:32) µ k d ψ h , Nk d x + (cid:101) σ ψ h , Nk (cid:33) = (cid:90) I j (cid:103) g h , N , where g h , N = σ S φ h , N + f . and thus ψ h , Nk = S h µ k (cid:16) σ S φ h , N + f (cid:17) , so that φ h , N = K h , N ( σ S φ h , N + f ) . (21)The numerical analysis of (9) and (10) is done by analysing (the second equa-tion in) (21) as an approximation of the second equation in (20). This is studied indetail in [42] for constant σ , σ S . In [25] we discuss the variable case, obtaining allestimates explicitly in σ , σ S . Elementary manipulation on (20) and (21) shows that odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 11 φ − φ h , N = ( I − K h , N σ S ) − ( K − K h , N )( σ S φ + f ) , (22)and so (cid:107) φ − φ h , N (cid:107) ∞ ≤ (cid:107) ( I − K h , N σ S ) − (cid:107) ∞ (cid:107) ( K − K h , N )( σ S φ + f ) (cid:107) ∞ . (23)The error analysis in [25] proceeds by estimating the two terms on the right-handside of (23) separately. We summarise the results in the lemmas below. To avoidwriting down the technicalities (which will be given in detail in [25]), in the follow-ing results, we do not give the explicit dependence of the constants C i , i = , , . . . , on the cross sections σ and σ S . For simplicity we restrict our summary to the casewhen the right-hand side of (19) is the average of g over I j (rather than the pointvalue g j − / ). The actual scheme (19) is then analysed by a perturbation argument,see [25]. Lemma 1.
Suppose N is sufficiently large and h log
N is sufficiently small. Then (cid:107) ( I − K h , N σ S ) − (cid:107) ∞ ≤ C , (24) where C depends on σ and σ S , but is independent of h and N.Sketch of proof The proof is obtained by first obtaining an estimate of the form(24) for the quantity (cid:107) ( I − K σ S ) − (cid:107) ∞ , and then showing that the perturbation (cid:107) K − K h , N (cid:107) ∞ is small, when N is sufficiently large and h log N is sufficiently small.(The constraint linking h and log N arises because the transport equation (1) has asingularity at µ = h , N which are sufficient to ensure thatthe bound (24) holds depend on the cross-sections σ , σ S . Lemma 2. (cid:107) ( K − K h , N )( σ S φ + f ) (cid:107) ∞ ≤ (cid:18) C h log N + C h η + C N (cid:19) (cid:107) f (cid:107) η , where C , C , C depend again on σ and σ S , but are independent of h , N and f .Sketch of proof
Introducing the semidiscrete operator: ( K N g )( x ) = N ∑ | k | = w k ( S µ k g )( x ) (corresponding to applying quadrature in angle but no discretisation in space), wethen write K − K h , N = ( K − K N ) + ( K N − K h , N ) and consider, separately, thesemidiscrete error due to quadrature in angle: ( K − K N )( σ S φ + f ) = (cid:32) (cid:90) − ψ ( x , µ ) d µ − N ∑ | k | = w k ψ ( x , µ k ) (cid:33) , (25)and the spatial error for a given N : ( K N − K h , N )( σ S φ + f ) = N ∑ | k | = w k (cid:16) S µ k − S h µ k (cid:17) ( σ S φ + f ) . (26)The estimate for (25) uses estimates for the regularity of ψ with respect to µ (which are explicit in the cross-sections), while (26) is estimated by proving stabilityof the Crank-Nicolson method and a cross-section-explicit bound on (cid:107) φ (cid:107) η .Putting together Lemmas 1 and 2, we obtain the following. Theorem 1.
Under the assumptions outlined above, (cid:107) φ − φ h , N (cid:107) ∞ ≤ C (cid:18) C h log N + C h η + C N (cid:19) (cid:107) f (cid:107) η . Returning to the case when σ , σ S are random functions, this theorem providespathwise estimates for the error. In [25], these are turned into estimates in the cor-responding Bochner space provided the coefficients C i are bounded in probabilityspace. Whether this is the case depends on the choice of the random model for σ , σ S .In particular, using the results in [9, § C i ∈ L p ( Ω ) ,for all 1 ≤ p < ∞ , for the specific choices of σ and σ S in §
2. Hence, we have:
Corollary 1.
For all ≤ p < ∞ , (cid:107) φ − φ h , N (cid:107) L p ( Ω , L ∞ ( , )) ≤ C (cid:18) h log N + h η + N (cid:19) (cid:107) f (cid:107) η , where C is independent of h , N and f .
Let Q ( ω ) ∈ R denote a functional of φ or ψ representing a quantity of interest. Wewill focus on estimating E [ Q ] , the expected value of Q . Since we are not specificabout what functionals we are considering, this includes also higher order momentsor CDFs of quantities of interest. The expected value is a high-dimensional inte-gral and the goal is to apply efficient quadrature methods in high dimensions. Weconsider Monte Carlo type sampling methods.As outlined above, to obtain samples of Q ( ω ) the NTE has to be approximatednumerically. First, the random scattering cross section σ S in (4) is sampled usingthe KL expansion of log σ S in (8) truncated after d terms. The stochastic dimension d is chosen sufficiently high so that the truncation error is smaller than the otherapproximation errors. For each n ∈ N , let Z n ∈ R d be a realisation of the multivari-ate Gaussian coefficient Z : = ( Z i ) i = ,..., d in the KL expansion (8). Also, denote by Q h ( Z n ) the approximation of the n th sample of Q obtained numerically using a spa-tial grid with mesh size h and 2 N angular quadrature points. We assume throughoutthat N ∼ / h , so there is a single discretisation parameter h . odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 13 We will consider various unbiased, sample-based estimators (cid:98) Q h for the expectedvalue E [ Q ] and we will quantify the accuracy of each estimator by its mean squareerror (MSE) e ( (cid:98) Q h ) . Since (cid:98) Q h is assumed to be an unbiased estimate of E [ Q h ] , i.e. E [ (cid:98) Q h ] = E [ Q h ] , the MSE can be expanded as e ( (cid:98) Q h ) = E (cid:104) ( (cid:98) Q h − E [ Q ]) (cid:105) = ( E [ Q − Q h ]) + V [ (cid:98) Q h ] , (27)i.e., the squared bias due to the numerical approximation plus the sampling (orquadrature) error V [ (cid:98) Q h ] = E [( (cid:98) Q h − E [ Q h ]) ] . In order to compare computationalcosts of the various methods we will consider their ε -cost C ε , that is, the number offloating point operations to achieve a MSE e ( (cid:98) Q h ) less than ε .To bound the ε -cost for each method, we make the following assumptions on thediscretisation error and on the average cost to compute a sample from Q h : (cid:12)(cid:12)(cid:12) E [ Q − Q h ] (cid:12)(cid:12)(cid:12) = O ( h α ) , (28) E [ C ( Q h )] ≤ O ( h − γ ) , (29)for some constants α , γ >
0. We have seen in Section 2 that (29) holds with γ be-tween 2 and 3. The new theoretical results in Section 3 guarantee that (28) also holdsfor some 0 < α ≤
1. Whilst the results of Section 3 (and [25]) are shown to be sharpin some cases, the practically observed values for α in the numerical experimentshere are significantly bigger, with values between 1.5 and 2.In recent years, many alternative methods for high-dimensional integrals haveemerged that use tensor product deterministic quadrature rules combined withsparse grid techniques to reduce the computational cost [49, 6, 40, 26, 4, 17, 21].The efficiency of these approaches relies on high levels of smoothness of the param-eter to output map and in general their cost may grow exponentially with the numberof parameters (the curse of dimensionality ). Such methods are not competitive withMonte Carlo type methods for problems with low smoothness in the coefficients,where large numbers of parameters are needed to achieve a reasonable accuracy.For example, in our later numerical tests we will consider problems in up to 3600stochastic dimensions.However, standard Monte Carlo methods are notoriously slow to converge, re-quiring thousands or even millions of samples to achieve acceptable accuracies. Inour application, where each sample involves the numerical solution of an integro-differential equation this very easily becomes intractable. The novel Monte Carloapproaches that we present here, aim to improve this situation in two complemen-tary ways. Quasi-Monte Carlo methods reduce the number of samples to achieve acertain accuracy dramatically by using deterministic ideas to find well distributedsamples in high dimensions. Multilevel methods use the available hierarchy of nu-merical approximations to our integro-differential equation to shift the bulk of thecomputations to cheap, inaccurate coarse models while providing the required ac-curacy with only a handful of expensive, accurate model solves. The (standard) Monte Carlo (MC) estimator for E [ Q ] is defined by (cid:98) Q MCh : = N MC N MC ∑ n = Q h ( Z n ) , (30)where N MC is the number of Monte Carlo points/samples Z n ∼ N ( , I ) . The sam-pling error of this estimator is V [ (cid:98) Q MCh ] = V [ Q h ] / N MC .A sufficient condition for the MSE to be less than ε is for both the squaredbias and the sampling error in (27) to be less than ε /
2. Due to assumption (28),a sufficient condition for the squared bias to be less than ε / h ∼ ε / α . Since V [ Q h ] is bounded with respect to h →
0, the sampling error of (cid:98) Q MCh is less than ε / N MC ∼ ε − . With these choices of h and N MC , it follows from Assumption (29)that the mean ε -cost of the standard Monte Carlo estimator is E (cid:104) C ε ( (cid:98) Q MCh ) (cid:105) = E (cid:34) N MC ∑ n = C ( Q h ( Z n )) (cid:35) = N MC E [ C ( Q h )] = O (cid:16) ε − − γα (cid:17) . (31)Our aim is to find alternative methods that have a lower ε -cost. The first approach to reduce the ε -cost is based on using quasi-Monte Carlo (QMC)rules, which replace the random samples in (30) by carefully chosen deterministicsamples and treat the expected value with respect to the d -dimensional Gaussian Z in (8) as a high-dimensional integral with Gaussian measure.Initially interest in QMC points arose within number theory in the 1950’s, andthe theory is still at the heart of good QMC point construction today. Nowadays, thefast component-by-component construction (CBC) [41] provides a quick methodfor generating good QMC points, in very high-dimensions. Further information onthe best choices of deterministic points and QMC theory can be found in e.g. [46,15, 39, 14].The choice of QMC points can be split into two categories; lattice rules anddigital nets. We will only consider randomised rank-1 lattice rules here. In particular,given a suitable generating vector z ∈ Z d and R independent, uniformly distributedrandom shifts ( ∆ r ) Rr = in [ , ] d , we construct N QMC = R P lattice points in the unitcube [ , ] d using the simple formula v ( n ) = frac (cid:16) nzP + ∆ r (cid:17) , n = , . . . , P , r = , . . . , R odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 15 where “frac” denotes the fractional part function applied componentwise and thenumber of random shifts R is fixed and typically small e.g. R = ,
16. To trans-form the lattice points v n ∈ [ , ] d into “samples” (cid:101) Z n ∈ R d , n = , . . . , N QMC , of themultivariate Gaussian coefficients Z in the KL expansion (8) we apply the inversecumulative normal distribution. See [23] for details.Finally, the QMC estimator is given by (cid:98) Q QMCh : = N QMC N
QMC ∑ n = Q h ( (cid:101) Z n ) , Note that this is essentially identical in its form to the standard MC estimator (30),but crucially with deterministically chosen and then randomly shifted (cid:101) Z n . The ran-dom shifts ensure that the estimator is unbiased, i.e. E [ (cid:98) Q QMCh ] = E [ Q h ] .The bias for this estimator is identical to the MC case, leading again to a choiceof h ∼ ε / α to obtain a MSE of ε . Here the MSE corresponds to the mean squareerror of a randomised rank-1 lattice rule with P points averaged over the shift ∆ ∼ U ([ , ] d ) . In many cases, it can be shown that the quadrature error, i.e., the secondterm in (27), converges with O ( N − / λ QMC ) , with λ ∈ ( , ] . That is, we can potentiallyachieve O ( N − QMC ) convergence for (cid:98) Q QMCh as opposed to the O ( N − / MC ) convergencefor (cid:98) Q MCh . A rigorous proof of the rate of convergence requires detailed analysis ofthe quantity of interest (the integrand), in an appropriate weighted Sobolev space,e.g. [24]. Such an analysis is still an open question for this class of problems, andwe do not attempt it here. Moreover, the generating vector z does in theory haveto be chosen problem specific. However, standard generating vectors, such as thoseavailable at [31], seem to also work well (and better than MC samples). Furthermore,we note the recent developments in “higher-order nets” [22, 12], which potentiallyincrease the convergence of QMC methods to O ( N − qQMC ) , for q ≥ R =
8, it suffices to choose P ∼ ε − λ for the quadratureerror to be O ( ε ) . Therefore it follows again from Assumption (29) that the ε -costof the QMC method satisfies E ∆ (cid:104) C ε ( (cid:98) Q QMC ) (cid:105) = O (cid:16) ε − λ − γα (cid:17) . (32)When λ → , this is essentially a reduction in the ε -cost by a whole order of ε . Inthe case of non-smooth random fields, we typically have λ ≈ ε -cost growswith the same rate as that of the standard MC method. However, in our experimentsand in experiments for diffusion problems [23], the absolute cost is always reduced. The main issue with the above methods is the high cost for computing the samples { Q h ( Z ( n ) ) } , each requiring us to solve the NTE. The idea of the multilevel MonteCarlo (MLMC) method is to use a hierarchy of discrete models of increasing costand accuracy, corresponding to a sequence of decreasing discretisation parameters h > h > ... > h L = h . Here, only the most accurate model on level L is designedto give a bias of O ( ε ) by choosing h L = h ∼ ε / α as above. The bias of the othermodels can be significantly higher.MLMC methods were first proposed in an abstract way for high-dimensionalquadrature by Heinrich [28] and then popularised in the context of stochastic differ-ential equations in mathematical finance by Giles [18]. MLMC methods were firstapplied in uncertainty quantification in [7, 10]. The MLMC method has quicklygained popularity and has been further developed and applied in a variety of otherproblems. See [19] for a comprehensive review. In particular, the multilevel ap-proach is not restricted to standard MC estimators and can also be used in conjunc-tion with QMC estimators [20, 34, 32] or with stochastic collocation [47]. Here, weconsider multilevel variants of standard MC and QMC.MLMC methods exploit the linearity of the expectation, writing E [ Q h ] = L ∑ (cid:96) = E [ Y (cid:96) ] , where Y (cid:96) : = Q h (cid:96) − Q h (cid:96) − and Q h − : = . Each of the expected values on the right hand side is then estimated separately. Inparticular, in the case of a standard MC estimator with N (cid:96) samples for the (cid:96) th term,we obtain the MLMC estimator (cid:98) Q MLMCh : = L ∑ (cid:96) = (cid:98) Y MC (cid:96) = L ∑ (cid:96) = N (cid:96) N (cid:96) ∑ n = Y (cid:96) ( Z (cid:96), n ) . (33)Here, { Z (cid:96), n } N (cid:96) n = denotes the set of i.i.d. samples on level (cid:96) , chosen independentlyfrom the samples on the other levels.The key idea in MLMC is to avoid estimating E [ Q h ] directly. Instead, the expec-tation E [ Y ] = E [ Q h ] of a possibly strongly biased, but cheap approximation of Q h is estimated. The bias of this coarse model is then estimated by a sum of correctionterms E [ Y (cid:96) ] using increasingly accurate and expensive models. Since the Y (cid:96) representsmall corrections between the coarse and fine models, it is reasonable to conjecturethat there exists β > V [ Y (cid:96) ] = O ( h β (cid:96) ) , (34)i.e., the variance of Y (cid:96) decreases as h (cid:96) →
0. This is verified for diffusion problems in[9]. Therefore the number of samples N (cid:96) to achieve a prescribed accuracy on level (cid:96) can be gradually reduced, leading to a lower overall cost of the MLMC estimator.More specifically, we have the following cost savings: odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 17 • On the coarsest level, using (29), the cost per sample is reduced from O ( h − γ ) to O ( h − γ ) . Provided V [ Q h ] ≈ V [ Q h ] and h can be chosen independently of ε , thecost of estimating E [ Q h ] to an accuracy of ε in (33) is reduced to O ( ε − ) . • On the finer levels, the number of samples N (cid:96) to estimate E [ Y (cid:96) ] to an accuracyof ε in (33) is proportional to V [ Y (cid:96) ] ε − . Now, provided V [ Y (cid:96) ] = O ( h β (cid:96) ) , for some β >
0, which is guaranteed if Q h (cid:96) converges almost surely to Q pathwise, thenwe can reduce the number of samples as h (cid:96) →
0. Depending on the actual valuesof α , β and γ , the cost to estimate E [ Y L ] on the finest level can, in the best case,be reduced to O ( ε − γ / α ) .The art of MLMC is to balance the number of samples across the levels to min-imise the overall cost. This is a simple constrained optimisation problem to achieve V [ (cid:98) Q MLMCh ] ≤ ε /
2. As shown in [18], using the technique of Lagrange Multipliers,the optimal number of samples on level (cid:96) is given by N (cid:96) = (cid:38) ε − (cid:32) L ∑ (cid:96) = (cid:112) V [ Y (cid:96) ] / C (cid:96) (cid:33) (cid:112) V [ Y (cid:96) ] C (cid:96) (cid:39) , (35)where C (cid:96) : = E [ C ( Y (cid:96) )] . In practice, it is necessary to estimate V [ Y (cid:96) ] and C (cid:96) in (35)from the computed samples, updating N (cid:96) as the simulation progresses.Using these values of N (cid:96) it is possible to establish the following theoretical com-plexity bound for MLMC [10]. Theorem 2.
Let us assume that (28) , (34) and (29) hold with α , β , γ > . Then, withL ∼ log ( ε − ) and with the choice of { N (cid:96) } Ll = in (35) we have E (cid:104) C ε ( (cid:98) Q MLMCh L ) (cid:105) = O (cid:16) ε − − max (cid:16) , γ − βα (cid:17) (cid:17) . (36) When β = γ , then there is an additional factor log ( ε − ) . Using lattice points (cid:101) Z (cid:96), n , as defined in Section 4.2, instead of the random sam-ples Z (cid:96), n we can in the same way define a multilevel quasi-Monte Carlo (MLQMC)estimator (cid:98) Q MLQMCh : = L ∑ (cid:96) = (cid:98) Y QMC (cid:96) = L ∑ (cid:96) = (cid:101) N (cid:96) (cid:101) N (cid:96) ∑ n = Y (cid:96) ( (cid:101) Z (cid:96), n ) . (37)The optimal values for (cid:101) N (cid:96) can be computed in a similar way to those in the MLMCmethod. However, they depend strongly on the rate of convergence of the lattice ruleand in particular on the value of λ which is difficult to estimate accurately. We willgive a practically more useful approach below.It is again possible to establish a theoretical complexity bound, cf. [34, 32]. Theorem 3.
Let us assume that (28) and (29) hold with α , γ > and that thereexists λ ∈ ( , ] and β > such that V ∆ [ (cid:98) Y QMC (cid:96) ] = O (cid:16) (cid:101) N − / λ (cid:96) h β (cid:96) (cid:17) . (38) Let the number of random shifts on each level be fixed to R and let L ∼ log ( ε − ) .Then, there exists a choice of { N (cid:96) } Ll = such that E ∆ (cid:104) C ε ( (cid:98) Q MLQMCh L ) (cid:105) = O (cid:16) ε − λ − max (cid:16) , γ − βλα (cid:17) (cid:17) . (39) When β λ = γ , then there is an additional factor log ( ε − ) + λ . The convergence rate can be further improved by using higher order QMC rules[13], but we will not consider this here.It can be shown, for the theoretically optimal values of N (cid:96) , that there exists aconstant C such that V ∆ [ (cid:98) Y QMC (cid:96) ] C (cid:96) = C , (40)independently of the level (cid:96) and of the value of λ (cf. [32, Sect. 3.3]). The same holdsfor MLMC. This leads to the following adaptive procedure to choose N (cid:96) suggestedin [20], which we use in our numerical experiments below instead of (35) .In particular, starting with an initial number of samples on all levels, we alternatethe following two steps until V [ (cid:98) Q MLMCh ] ≤ ε / C (cid:96) and V ∆ [ (cid:98) Y QMC (cid:96) ] (resp. V [ (cid:98) Y MC (cid:96) ] ).(ii) Compute (cid:96) ∗ = L argmax (cid:96) = (cid:32) V ∆ [ (cid:98) Y QMC (cid:96) ] C (cid:96) (cid:33) and double the number of samples on level (cid:96) ∗ .This procedure ensures that, on exit, (40) is roughly satisfied and the numbers ofsamples across the levels N (cid:96) are quasi-optimal.We use this adaptive procedure for both the MLMC and the MLQMC method.The lack of optimality typically has very little effect on the actual computationalcost. Since the optimal formula (35) for MLMC also depends on estimates of C (cid:96) and V [ Y (cid:96) ] , it sometimes even leads to a better performance. An additional benefit inthe case of MLQMC is that the quadrature error in rank-1 lattice rules is typicallylowest when the numbers of lattice points is a power of 2. We now present numerical results to confirm the gains that are possible with thenovel multilevel and quasi-Monte Carlo method applied to our 1D NTE model (1),(2), (3). We assume that the scattering cross-section σ S is a log-normal randomfield as described in Section 2.1 and that the absorption cross section is constant, σ A ≡ exp ( . ) . We assume no fission, σ F ≡
0, and a constant source term f = exp ( ) . We consider two cases, characterised by the choice of smoothness parameter odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 19 ν in the Mat´ern covariance function (7). For the first case, we choose ν = . ν = .
5. The correlation length and the variance are λ C = σ var =
1, respectively.The quantity of interest we consider is Q ( ω ) = (cid:90) φ ( x , ω ) d x . (41)For the discretisation, we choose a uniform spatial mesh with mesh width h = / M and a quadrature rule (in angle) with 2 N = M points. The KL expansion oflog ( σ S ) in (8) is truncated after d terms. We heuristically choose d to ensure thatthe error due to this truncation is negligible compared to the discretisation error.In particular, we choose d = h − for the Mat´ern field and d = h − / for theexponential field, leading to a maximum of 2048 and 3600 KL modes, respectively,for the finest spatial resolution in each case. Even for such large numbers of KLmodes, the sampling cost does not dominate because the randomness only exists inthe (one) spatial dimension.We introduce a hierarchy of levels (cid:96) = , ..., L corresponding to a sequence ofdiscretisation parameters h (cid:96) = − (cid:96) h with h = /
4, and approximate the quantityof interest in (41) by Q h ( ω ) : = M M ∑ j = Φ j − / ( ω ) . To generate our QMC points we use an (extensible) randomised rank-1 latticerule (as presented in Section 4.2), with R = lattice-32001-1024-1048576.3600 , which is downloaded from [31]. To compute samples of the neutron flux and thus of the quantity of interest, we pro-pose a hybrid version of the direct and the iterative solver for the Schur complementsystem (12) described in Section 2.3.The cost of the iterative solver depends on the number K of iterations that wetake. For each ω , we aim to choose K such that the L -error (cid:107) φ ( ω ) − φ ( K ) ( ω ) (cid:107) is less than ε . To estimate K we fix h = / d = φ h for each sample ω . Let ρ ( ω ) : = (cid:107) σ S ( · , ω ) / σ ( · , ω ) (cid:107) ∞ . For asufficiently large number of samples, we then evaluatelog (cid:16)(cid:13)(cid:13) φ h ( ω ) − φ ( K ) h ( ω ) (cid:13)(cid:13) (cid:17) K log (cid:0) ρ ( ω ) (cid:1) and find that this quotient is less than log ( . ) in more than 99% of the cases, for K = , . . . , C = . h and smaller values of d to verify that this bound holds inat least 99% of the cases independently of the discretisation parameter h and of thetruncation dimensions d .Hence, a sufficient, a priori condition to achieve (cid:107) φ h ( ω ) − φ ( K ) h ( ω ) (cid:107) < ε in atleast 99% of the cases is K = K ( ε , ω ) = max (cid:26) , (cid:24) log ( ε ) log (cid:0) ρ ( ω ) (cid:1) (cid:25) (cid:27) , (42)where (cid:100)·(cid:101) denotes the ceiling function. It is important to note that K is no longera deterministic parameter for the solver (like M or N ). Instead, K is a randomvariable that depends on the particular realisation of σ S . It follows from (42), us-ing the results in [9, § E [ K ( ε , · )] = O ( log ( ε )) and V [ K ( ε , · )] = O (cid:0) log ( ε ) (cid:1) , with more variability in the case of the exponential field.Recall from (13) and (15) that, in the case of N = M , the costs for the direct anditerative solvers are C M and C KM , respectively. In our numerical experiments,we found that in fact C ≈ C , for this particular relationship between M and N .This motivates a third “hybrid” solver, presented in Algorithm 1, where the iterativesolver is chosen when K ( ω ) < M and the direct solver when K ( ω ) ≥ M . This allowsus to use the optimal solver for each particular sample.We finish this section with a study of timings in seconds (here referred to as thecost) of the three solvers. In Fig. 1, we plot the average cost (over 2 samples)divided by M (cid:96) , against the level parameter (cid:96) . We observe that, as expected, the(scaled) expected cost of the direct solver is almost constant and the iterative solveris more efficient for larger values of M (cid:96) . Over the range of values of M (cid:96) consideredin our experiments, a best fit for the rate of growth of the cost with respect to thediscretisation parameter h (cid:96) in (29) is γ ≈ .
2, for both fields. Thus our solver hasa practical complexity of O ( n . ) , where n ∼ M is the total number of degrees offreedom in the system. Algorithm 1
Hybrid direct-iterative solver of (12), for one realisation . Require:
Given σ S , σ and a desired accuracy ε K = (cid:24) log ( ε ) / log ( ρ ) (cid:25) if K < M then Solve using K source iterations else Solve using the direct method end if odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 21 ℓ -8 -7 -6 E ( C o s t ) / M ℓ DirectIterativeHybrid ℓ -8 -7 -6 E ( C o s t ) / M ℓ DirectIterativeHybrid
Fig. 1
Comparison of the average costs of the solvers (actual timings in seconds divided by M (cid:96) )for the Mat´ern field (left) and for the exponential field (right). ℓ -6 -5 -4 -3 -2 | E ( Q − Q ℓ ) | DirectHybrid ℓ -10 -8 -6 -4 -2 V a r i an c e V ( Q ℓ ), Direct V ( Q ℓ ), Hybrid V ( Y ℓ ), Direct V ( Y ℓ ), Hybrid Fig. 2
Estimates of the bias due to discretisation errors (left) and of the variances of Q h (cid:96) and Y (cid:96) (right), in the case of the Mat´ern field. Studying the complexity theorems of Section 4, we can see that the effectiveness ofthe various Monte Carlo methods depends on the parameters α , β , γ and λ in (28),(29), (34) and (38). In this section, we will (numerically) estimate these parametersin order to estimate the theoretical computational cost for each approach.We have already seen that γ ≈ . E [ Q − Q h (cid:96) ] , as well as of the variances of Q h (cid:96) and of Y (cid:96) , computedvia sample means and sample variances over a sufficiently large set of samples. Weonly explicitly show the curves for the Mat´ern field. The curves for the exponentialfield look similar. From these plots, we can estimate α ≈ . β ≈ .
1, for theMat´ern field, and α ≈ . β ≈ .
9, for the exponential field.To estimate λ in (38), we need to study the convergence rate of the QMC methodwith respect to the number of samples N QMC . This study is illustrated in Fig. 3.As expected, the variance of the standard MC estimator converges with O ( N − MC ) .On the other hand, we observe that the variance of the QMC estimator converges Samples -6 -4 -2 V a r i an c e MCQMC Samples -6 -4 -2 V a r i an c e MCQMC
Fig. 3
Convergence of standard Monte Carlo and quasi-Monte Carlo estimators: Mat´ern field (left)and exponential field (right). α β γ λ
Mat´ern field 1.9 4.1 2.2 0.62Exponential field 1.7 1.9 2.2 0.71
Table 1
Summary of estimated rates in (28), (29), (34) and (38). approximately with O ( N − . QMC ) and O ( N − . QMC ) (or λ = .
62 and λ = .
71) for theMat´ern field and for the exponential field, respectively.We summarise all the estimated rates in Table 1.
For a fair comparison of the complexity of the various Monte Carlo estimators, wenow use the a priori bias estimates in Section 5.2 to choose a suitable tolerance ε L for each choice of h = h L . Let τ (cid:96) be the estimated bias on level (cid:96) . Then, for each L = , . . . ,
6, we choose h = h L and ε L : = √ τ L , and we plot in Fig. 4 the actualcost of each of the estimators described in Section 4 against the estimated bias onlevel L . The numbers of samples for each of the estimators are chosen such that V [ (cid:98) Q h ] ≤ ε (cid:96) /
2. The coarsest mesh size in the multilevel methods is always h = / ε -cost of each of the methods withthe ε -cost predicted theoretically using the estimates for α , β , γ and λ in Sec-tion 5.2. Assuming a growth of the ε -cost proportional to ε − r , for some r >
0, wecompare in Table 2 estimated and actual rates r for all the estimators. Some of theestimated rates in Section 5.2 are fairly crude, so the good agreement between esti-mated and actual rates is quite impressive. odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 23 -6 -4 -2 ǫ -4 -2 C o s t MCQMCMLMCMLQMC -6 -4 -2 ǫ -4 -2 C o s t MCQMCMLMCMLQMC
Fig. 4
Actual cost plotted against estimated bias on level L for standard Monte Carlo, QMC,multilevel MC and multilevel QMC: Mat´ern field (left) and exponential field (right). MC QMC MLMC MLQMC
Field Estimated Actual Estimated Actual Estimated Actual Estimated ActualMat´ern 3.2 3.4 2.4 2.7 2.0 2.1 1.2 1.5Exponential 3.3 3.6 2.7 2.4 2.2 2.5 1.9 1.9
Table 2
Comparison of the estimated theoretical and actual computational ε -cost rates, for differ-ent Monte Carlo methods, using the hybrid solver. To summarise, we have presented an overview of novel error estimates for the 1Dslab geometry simplification of the Neutron Transport Equation, with spatially vary-ing and random cross-sections. In particular, we consider the discrete ordinatesmethod with Gauss quadrature for the discretisation in angle, and a diamond dif-ferencing scheme on a quasi-uniform grid in space. We represent the spatial uncer-tainties in the cross-sections by log-normal random fields with Mat´ern covariances,including cases of low smoothness. These error estimates are the first of this kind.They allow us to satisfy key assumptions for the variance reduction in multilevelMonte Carlo methods.We then use a variety of recent developments in Monte Carlo methods to studythe propagation of the uncertainty in the cross-sections, through to a linear func-tional of the scalar flux. We find that the Multilevel Quasi Monte Carlo method givesus significant gains over the standard Monte Carlo method. These gains can be aslarge as almost two orders of magnitude in the computational ε -cost for ε = − .As part of the new developments, we present a hybrid solver, which automaticallyswitches between a direct or iterative method, depending on the rate of convergenceof the iterative solver which varies from sample to sample. Numerically, we ob-serve that the hybrid solver is almost an order of magnitude cheaper than the directsolver on the finest mesh, on the other hand the direct solver is almost an order ofmagnitude cheaper than the iterative solver on the coarsest mesh we considered. We conclude that modern variants of Monte Carlo based sampling methods areextremely useful for the problem of Uncertainty Quantification in Neutron Trans-port. This is particularly the case when the random fields are non-smooth and a largenumber of stochastic variables are required for accurate modelling.
Acknowledgements
We thank EPSRC and AMEC Foster Wheeler for financial support for thisproject and we particularly thank Professor Paul Smith (AMECFW) for many helpful discussions.Matthew Parkinson is supported by the EPSRC Centre for Doctoral Training in Statistical AppliedMathematics at Bath (SAMBa), under project EP/L015684/1. This research made use of the BalenaHigh Performance Computing (HPC) Service at the University of Bath.
References
1. Allen, E.J., Victory Jr, H.D., Ganguly, K.: On the convergence of finite-differenced multi-group, discrete-ordinates methods for anisotropically scattered slab media. SIAM J. Numer.Anal. , 88–106 (1989).2. Asadzadeh, M.: A finite element method for the neutron transport equation in an infinitecylindrical domain. SIAM J. Numer. Anal. , 1299–1314 (1998).3. Asadzadeh, M., Thevenot, L.: On discontinuous Galerkin and discrete ordinates approxima-tions for neutron transport equation and the critical eigenvalue. Nuovo Cimento C , 21–29(2010).4. Ayres, D.A.F., Eaton, M.D.: Uncertainty quantification in nuclear criticality modelling usinga high dimensional model representation. Ann. Nucl. Energy , 379–402 (2015).5. Ayres, D.A.F., Park, S., Eaton, M.D.: Propagation of input model uncertainties with differentmarginal distributions using a hybrid polynomial chaos expansion. Ann. Nucl. Energy ,1–4 (2014).6. Babuska, I., Nobile, F., Tempone, R.: A stochastic collocation method for elliptic partial dif-ferential equations with random input data. SIAM J. Numer. Anal. , 1005–1034 (2007).7. Barth, A., Schwab, C., Zollinger, N.: Multi-level Monte Carlo finite element method for el-liptic PDEs with stochastic coefficients. Numer. Math. , 123–161 (2011).8. Blake, J.C.H.: Domain decomposition methods for nuclear reactor modelling with diffusionacceleration. PhD Thesis, University of Bath (2016).9. Charrier, J., Scheichl, R., Teckentrup, A.L.: Finite element error analysis of elliptic PDEs withrandom coefficients and its application to multilevel Monte Carlo methods. SIAM J. Numer.Anal. , 322–352 (2013).10. Cliffe, K.A., Giles, M.B., Scheichl, R., Teckentrup, A.L.: Multilevel Monte Carlo methodsand applications to elliptic PDEs with random coefficients. Comput. Vis. Sci. , 3–15 (2011).11. Dautray, R., Lions, J.L.: Mathematical Analysis and Numerical Methods for Science andTechnology: Volume 1, Physical Origins and Classical Methods. Springer, Heidelberg (2012).12. Dick, J., Kuo, F.Y., Le Gia, Q.T., Nuyens, D., Schwab, C.: Higher order QMC Petrov–Galerkin discretization for affine parametric operator equations with random field inputs.SIAM J. Numer. Anal. , 2676–2702 (2014).13. Dick, J., Kuo, F.Y., Le Gia, Q.T., Schwab, C.: Multi-level higher order QMC Galerkin dis-cretization for affine parametric operator equations. SIAM J. Numer. Anal. , 2541–2568(2016).14. Dick, J., Kuo, F.Y., Sloan, I.H.: High-dimensional integration: the quasi-Monte Carlo way.Acta Numer. , 133–288 (2013).15. Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and QuasiV-Monte Carlo Integration. Cambridge University Press, Cambridge (2010).odern Monte Carlo Variants for Uncertainty Quantification in Neutron Transport 2516. Eiermann, M., Ernst, O.G., Ullmann, E.: Computational aspects of the stochastic finite ele-ment method. Comput. Vis. Sci. , 3–15, (2007).17. Fichtl, E.D., Prinja, A.K.: The stochastic collocation method for radiation transport in randommedia. J. Quant. Spectrosc. Radiat. Transfer (4), 646–659 (2011).18. Giles, M.B.: Multilevel Monte Carlo path simulation. Oper. Res. , 607–617 (2008).19. Giles, M.B.: Multilevel Monte Carlo methods. Acta Numer. , 259–328 (2015).20. Giles, M.B., Waterhouse, B.J.: Multilevel quasi-Monte Carlo path simulation. AdvancedFinancial Modelling, Radon Series on Computational and Applied Mathematics, 165–181(2009).21. Gilli, L., Lathouwers, D., Kloosterman, J.L., van der Hagen, T.H.J.J., Koning, A.J., Rochman,D.: Uncertainty quantification for criticality problems using non-intrusive and adaptive poly-nomial chaos techniques. Ann. Nucl. Energy , 71–80 (2013).22. Goda, T., Dick, J.: Construction of interlaced scrambled polynomial lattice rules of arbitraryhigh order. Found. Comput. Math. , 1245–1278 (2015).23. Graham, I.G., Kuo, F.Y., Nuyens, D., Scheichl, R., Sloan, I.H.: Quasi-Monte Carlo methodsfor elliptic PDEs with random coefficients and applications. J. Comput. Phys. , 3668–3694 (2011).24. Graham, I.G., Kuo, F.Y., Nichols, J.A., Scheichl, R., Schwab, C. and Sloan, I.H.: Quasi-MonteCarlo finite element methods for elliptic PDEs with lognormal random coefficients. Numer.Math. , 329–368 (2015).25. Graham. I.G., Parkinson M.J., Scheichl R.: Error Analysis and Applications for the heteroge-nous transport equation in slab geometry. In preparation (2017).26. Gunzburger, M., Webster, C.G., Zhang, G.: Stochastic finite element methods for PDEs withrandom input data. Acta Numer. , 521–650 (2014).27. Haji-Ali, A.L., Nobile, F., Tempone, R.: Multi-index Monte Carlo: when sparsity meets sam-pling. Numer. Math. , 767–806 (2016).28. Heinrich, S.: Multilevel Monte Carlo methods. Lecture Notes in Computer Science , vol. 2179,Springer, Heidelberg (2001).29. Kaper, H.G., Kellogg, R.B.: Asymptotic behavior of the solution of the integral transportequation in slab geometry. SIAM J. Appl. Math. (1), 191–200 (1977).30. Keller, H.B.: On the pointwise convergence of the discrete-ordinates method. SIAM J. Appl.Math. , 560–567 (1960).31. Kuo, F.Y.: http://web.maths.unsw.edu.au/ ∼ fkuo/lattice/index.html
32. Kuo, F.Y., Scheichl, R., Schwab, C., Sloan, I.H., Ullmann, E.: Multilevel quasi-Monte Carlomethods for lognormal diffusion problems. Math. Comp., (2017).33. Kuo, F.Y., Schwab, C., Sloan, I.H.: Quasi-Monte Carlo finite element methods for a classof elliptic partial differential equations with random coefficient. SIAM J. Numer. Anal. ,3351–3374 (2012).34. Kuo, F.Y., Schwab, C., Sloan, I.H.: Multi-level quasi-Monte Carlo finite element methods fora class of elliptic PDEs with random coefficients. Found. Comput. Math. , 411–449 (2015).35. Larsen E.W., Nelson, P.: Finite difference approximations and superconvergence for thediscrete-ordinate equations in slab geometry. SIAM J. Numer. Anal. , 334–348 (1982).36. Lee, C.W., Lee, Y.O., Cho. Y.S.: Comparison of the nuclear data libraries in the shieldingcalculation for the accelerator facility of the Proton Engineering Frontier Project in Korea.Int. Conf. Nuclear Data for Science and Technology. EDP Sciences, (2007).37. Lewis, E.E., Miller, W.F.: Computational methods of Neutron Transport. John Wiley andSons, New York (1984).38. Lord, G.J., Powell, C.E. Shardlow, T.: An Introduction to Computational Stochastic PDEs.Cambridge University Press, Cambridge (2014).39. Niederreiter, H.: Quasi-Monte Carlo Methods. John Wiley and Sons, New York (2010).40. Nobile, F., Tempone, R., Webster, C.G.: An anisotropic sparse grid stochastic collocationmethod for partial differential equations with random input data. SIAM J. Numer. Anal. ,2411–2442 (2008).6 Ivan G. Graham, Matthew J. Parkinson, and Robert Scheichl41. Nuyens, D., Cools, R.: Fast algorithms for component-by-component construction of rank-1lattice rules in shift-invariant reproducing kernel Hilbert spaces. Math. Comput. , 903–920(2006).42. Pitkaranta, J., Scott, L.R.: Error estimates for the combined spatial and angular approxima-tions of the transport equation for slab geometry. SIAM J. Numer. Anal. , 922–950 (1983).43. Reed, W.H., Hill, T.R.: Triangular mesh methods for the neutron transport equation. TechnicalReport LA-UR-73-479, Los Alamos National Laboratory (1973).44. Sanchez, R., McCormick, N.J.: Review of neutron transport approximations. Nucl. Sci. Eng. , 481–535 (1982).45. Sloan, I.H.: Error analysis for a class of degenerate-kernel methods, Numer. Math. , 231–238 (1975).46. Sloan, I.H., Wozniakowski, H.: When are quasi-Monte Carlo algorithms efficient for highdimensional integrals? J. Complexity , 1–33 (1998).47. Teckentrup, A.L., Jantsch, P., Webster, C.G., Gunzburger, M.: A multilevel stochastic collo-cation method for partial differential equations with random input data. SIAM/ASA JUQ. ,1046–1074 (2015).48. Victory Jr, H.D.: Convergence of the multigroup approximations for subcritical slab mediaand applications to shielding calculations. Adv. Appl. Math. , 227–259 (1984).49. Xiu, D., Karniadakis, G.E.: The Wiener-Askey polynomial chaos for stochastic differentialequations. SIAM J. Sci. Comput.24