[PDF] An efficient distribution method for nonlinear two-phase flow in highly heterogeneous multidimensional stochastic porous media

Abstract

In the context of stochastic two-phase flow in porous media, we introduce a novel and efficient method to estimate the probability distribution of the wetting saturation field under uncertain rock properties in highly heterogeneous porous systems, where streamline patterns are dominated by permeability heterogeneity, and for slow displacement processes (viscosity ratio close to unity). Our method, referred to as the frozen streamline distribution method (FROST), is based on a physical understanding of the stochastic problem. Indeed, we identify key random fields that guide the wetting saturation variability, namely fluid particle times of flight and injection times. By comparing saturation statistics against full-physics Monte Carlo simulations, we illustrate how this simple, yet accurate FROST method performs under the preliminary approximation of frozen streamlines. Further, we inspect the performance of an accelerated FROST variant that relies on a simplification about injection time statistics. Finally, we introduce how quantiles of saturation can be efficiently computed within the FROST framework, hence leading to robust uncertainty assessment.

Full PDF

AAn efﬁcient distribution method for nonlinear two-phase ﬂow inhighly heterogeneous multidimensional stochastic porous media

Fayadhoi Ibrahima ∗ Hamdi A. Tchelepi † Daniel W. Meyer ‡ November 10, 2018

Abstract

In the context of stochastic two-phase ﬂow in porous media, we introduce a novel and efﬁcient method to esti-mate the probability distribution of the wetting saturation ﬁeld under uncertain rock properties in highly heteroge-neous porous systems, where streamline patterns are dominated by permeability heterogeneity, and for slow displace-ment processes (viscosity ratio close to unity). Our method, referred to as the frozen streamline distribution method(FROST), is based on a physical understanding of the stochastic problem. Indeed, we identify key random ﬁeldsthat guide the wetting saturation variability, namely ﬂuid particle times of ﬂight and injection times. By comparingsaturation statistics against full-physics Monte Carlo simulations, we illustrate how this simple, yet accurate FROSTmethod performs under the preliminary approximation of frozen streamlines. Further, we inspect the performance ofan accelerated FROST variant that relies on a simpliﬁcation about injection time statistics. Finally, we introduce howquantiles of saturation can be efﬁciently computed within the FROST framework, hence leading to robust uncertaintyassessment.

The need for assessing and quantifying uncertainty in subsurface ﬂow has driven research in the stochastic aspectsof hydrology and multiphase ﬂow physics. Regardless of whether the interest lies in simulating the spreading of acontaminant in aquifers or predicting the oil recovery in a reservoir, precise and robust numerical methods have beendeveloped to provide quantitative answers for these ﬂow responses. However, the complex nature of subsurface sys-tems, together with inherent incomplete information about their properties, have resulted in the surge of probabilisticmodeling of these uncertainties. The uncertainty propagation from input subsurface properties to output ﬂows is thennaturally recast into a stochastic partial differential equation (PDE) problem. Within this stochastic context, signiﬁcantwork have addressed the estimation of the response ﬂow statistics. The fundamental work of Gelhar (1986), Dagan(1989) and Zhang (2002) have demonstrated how geostatistical treatments of subsurface properties can be used toderive statistical information of the ﬂow response. These statistics (generally mean and variance) have been estimatedin different ways.Statistical Moment Equation (SME) methods, also known as Low Order Approximations (LOA), build determin-istic PDEs for the moments by averaging stochastic PDEs. Li and Tchelepi (2006) used these methods to estimate thestatistical moments of the pressure ﬁeld for single phase ﬂow under uncertain permeability ﬁeld. However, since thisclass of methods relies on perturbations, the moment estimates are only satisfactory for very small variances, whichlimits their applicability towards uncertainty quantiﬁcation. Stochastic spectral methods are another class of popularmethods to estimate statistical moments (see e.g. Babuska et al., 2007; B¨ack et al., 2011) and rely on the truncationof the Karhunen-Lo`eve (KL) expansion of random processes (Lo`eve, 1977). Stochastic spectral methods can lead torigorous convergence analysis (e.g., Charrier, 2012), but they suffer from an exponentially increasing complexity withthe dimension and the correlation scales of the problem (this is known as the curse of dimensionality). Furthermore,they require further sophistications for highly nonlinear problems (see e.g. Abgrall, 2007; Foo et al., 2008; Le Maitre ∗ Institute for Computational & Mathematical Engineering, Stanford University, Stanford, CA † Department of Energy Resources Engineering, Stanford University, Stanford, CA, USA ‡ Department of Mechanical and Process Engineering, Institute of Fluid Dynamics, ETH Z¨urich, Z¨urich, Switzerland a r X i v : . [ phy s i c s . f l u - dyn ] A ug t al., 2004). Finally, distribution methods aim at estimating the Probability Density Function (PDF) and Cumula-tive Distribution Function (CDF) of the ﬂow response usually by solving deterministic PDEs for the distributions.These methods are attractive because they offer a complete picture of the variability of the ﬂow response. Distributionmethods have led to an abundant literature, especially for the study of stochastic turbulent ﬂows (e.g., Pope, 1985) ormore recently for tracer ﬂows (e.g., Meyer et al., 2010; Meyer and Tchelepi, 2010; Meyer et al., 2013) and advection-reaction problems (e.g., Tartakovsky and Broyda, 2011; Venturi et al., 2013). However, these methods typically relyon closure approximations and hence may not be generally applicable. The most popular and reliable method in un-certainty propagation remains Monte Carlo Simulation (MCS). However, due to its slow convergence rate, it requiresa prohibitively large number of samples to be accurate for large-scale applications, such as reservoir simulation andhydrology (Ballio and Guadagnini, 2004).For stochastic two-phase ﬂow in heterogeneous porous media, which is what we are discussing in this paper withthe Buckley-Leverett (BL) model (Buckley and Leverett, 1941; Peaceman, 1977; Aziz and Settari, 1979), additionalchallenges occur. Indeed, the governing transport equation is nonlinear due to the coupling between the saturation andthe total velocity, as well as the presence of fractional ﬂows. This nonlinearity affects the quality of the stochasticmethods aforementioned. For the stochastic BL problem, to provide ﬁrst and second moments of the water saturation,Zhang and Tchelepi (1999) and Zhang et al. (2000) had to assume that the distribution of the travel time was known,while still relying on an LOA approach. The key of their approach, though, was to identify the travel time as theunderlying random ﬁeld that explains the saturation statistics, and hence give a physical insight of the stochastic ﬂowproblem. Instead, Jarman and Russell (2002) used an Eulerian formulation to express ﬁrst and second order momentSMEs for the BL problem. They showed that the solution develops bimodal behaviors that are non-physical, thusdemonstrating the potential fragility of SME methods for nonlinear ﬂow problems when no physical information isexploited (see also Liu et al. (2007) and Jarman and Tartakovsky (2008)). For spectral methods, Liao and Zhang (2013)and Liao and Zhang (2014) used physically motivated transforms to build efﬁcient probabilistic collocation methodsfor nonlinear ﬂow in porous media. For MCS, M¨uller et al. (2013) show that Multilevel Monte Carlo (MLMC) meth-ods, together with the use of streamlines, speed up the naive MCS method when estimating the moments of the watersaturation under uncertain lognormal permeability ﬁelds. Regarding distribution-based strategies, Wang et al. (2013)extended the ideas of Tartakovsky and Broyda (2011) for the advective-reactive transport problem to successfullybuild a CDF framework for the one-dimensional BL case with known distribution for the total ﬂux. However thereis no clear extension to multiple spatial dimensions and little physical insight is explored. Furthermore, if often thepermeability ﬁeld is taken to be log normally distributed in the literature, (which, according to Gu and Oliver (2006)and Chen et al. (2009), does not prevent the saturation distribution to be highly non-Gaussian,) a geostatistical model-ing conveys more realistic scenarios (see e.g. Deutsch, 2002; Deutsch and Journel, 1998). This alternative stochasticmodeling, however, reinforces the pitfalls of the aforementioned methods since no Gaussian assumption is valid forthe closure-based SME methods, and the covariance function used for KL expansions in stochastic spectral methodscan require a prohibitively large number of KL terms to be informative.Because distribution methods provide a complete description of the stochastic variability of random ﬁelds via thePDF and CDF estimates, and because this full description is important to describe the complexity of the stochasticsaturation ﬁeld, our intent is to built efﬁcient distribution methods for the saturation, especially for high variancescenarios. In addition, we are interested in exploiting the physics of the problem to reach our goal.We propose a new distribution method, called the FROST, to treat the stochastic two-phase ﬂow problem. TheFROST is an extension of the work by Ibrahima et al. (2015) to higher spatial dimensions, where a similar approachwas ﬁrst derived for one-dimensional two-phase ﬂow. The FROST produces reliable and fast estimates of the PDFand CDF of the (water) saturation by using a ﬂow-driven approach to express the saturation as a (semi-analytical)nonlinear function of a smooth random ﬁeld. The FROST offers saturation distribution estimates that are suitablefor high permeability variances, and that stay reliable with geostatistically driven permeability ﬁelds. What’smore, the saturation PDF estimates, though discontinuous, stay stable by construction, while direct Kernel DensityEstimation (KDE) with MCS fails to converge or converge at very slow rates (Cline and Hart, 1991; Wied andWeissbach, 2012). Finally, since the saturation ﬁeld has a complex distribution, the study of probability of exceedance(Dagan and Neuman, 1997) rather than ﬁrst statistical moments appears to be more suited for uncertainty quantiﬁ-cation. The FROST also leads to fast computable probabilities of exceedance, which we refer to as saturation quantiles.The paper proceeds as follows. First, section 2 introduces the two-phase ﬂow problem and recalls the main sourcesof uncertainties considered in this study. Then, section 3 describes how, as a critical ﬁrst step, we use the ﬂow velocityto cast the transport problem into multiple one-dimensional problems along streamlines, following the methodology2dopted in Ibrahima et al. (2015) to derive saturation PDF and CDF for one-dimensional problems. We then take anextra step and extend the methodology to be applicable to two or three dimensional porous media. In these multi-dimensional stochastic porous media, we show that the saturation PDF/CDF can be estimated based on the statisticsof two random ﬁelds, the time of ﬂight (TOF) — the travel time of a particle to a given position, and the equivalentinjection time (EIT) — an injection time accounting for changes in streamtube capacity, that can both be efﬁcientlyestimated. Moreover, the FROST relies on the approximation of frozen streamlines over time, which is applicablefor highly heterogeneous porous media. This approximation leads to a unique and deterministic mapping betweenthe EIT and the TOF on the one hand, and the saturation on the other hand. Consequently, section 4 showcases theaccuracy and robustness of the FROST approach. More speciﬁcally, we assess the different approximations made inour method by means of numerical tests and sensitivities against MCS. In addition, the analysis leads to the proposalof a faster FROST implementation, named gFROST, where the TOF distribution is approximated by a log-Gaussianrandom ﬁeld. To further illustrate the performance of the FROST, section 5 gathers numerical comparisons of theaverage and standard deviation of the saturation for more challenging stochastic permeability ﬁelds. In addition, weprovide the corresponding saturation quantiles from the FROST. Finally, this paper closes with a discussion on themethod and results in section 6, along with some concluding remarks in section 7. To simulate two-phase ﬂow in porous media, Aziz and Settari (1979) demonstrated that the BL equation proves to bea simple but practical enough model. We assume that we have a wetting phase (e.g. water) and a non-wetting phase(e.g. oil), and we are interested in estimating the water saturation ﬁeld in the porous medium, i.e. the fraction of waterin a given pore space. The BL equation uses a few parameters to model the saturation transports in porous media.These parameters are the porosity φ ( (cid:126) x ) and the permeability ﬁeld K ( (cid:126) x ) ; the water and oil viscosities µ w and µ o , andthe pressure (resp. rate) on controlled zones (injectors, producers) p in j (resp. q in j ) and p prod (resp. q in j ); the relativepermeability ﬁelds k rw ( S w ) and k ro ( S o ) for the interaction between the ﬂuids and the medium. We assume no chemicalreactions and no change of state.In the ideal case, the prediction of the saturation ﬁeld relies on a complete knowledge of the aforementionedparameters. While experimental estimations of ﬂuid parameters are accessible, due to limited measurements in thesubsurface, a lack of information on the rock properties is usually the dominant cause of uncertainty in the ﬂowresponse. We will therefore focus on uncertainties in the porosity and permeability ﬁelds in the present paper. Wemodel these uncertainties by assigning probabilistic distributions to these ﬁelds that reﬂect many likely scenarios. Thegoal of our study is to estimate the probabilistic distribution of the saturation ﬁeld under these stochastic porosity andpermeability ﬁelds. The saturation ﬁeld is governed by a coupled system of equations: a nonlinear transport equation(mass conservation) and a Darcy equation (pressure equation). The total Darcy ﬂux or velocity q tot is deﬁned asthe sum of Darcy ﬂuxes for each phase, and relates the discharge rate through the porous medium to the gradient ofpressure in a proportional manner involving the previously deﬁned parameters, i.e. q tot = q w + q o with q w = − K ( (cid:126) x ) k rw ( S w ) µ w ∇ p = − λ w ∇ p and (1) q o = − K ( (cid:126) x ) k ro ( S o ) µ o ∇ p = − λ o ∇ p . The total Darcy ﬂux is assumed to verify the incompressibility condition, ∇ · q tot = . (2)Furthermore, we can deﬁne the water fractional ﬂow f w = λ w / ( λ w + λ o ) and the viscosity ratio m = µ w / µ o , so that q w = f w q tot and f w = k rw k rw + mk ro . With the condition S w + S o =

1, the conservation of mass can be reduced to one equation solely involving the watersaturation, φ ( (cid:126) x ) ∂ S w ∂ t + q tot · ∇ f w ( S w ) = . (3)3 x x sl ( ) L (Production) r ( τ ( (cid:126) x )) r = x sl ( τ ) streamlines A streamtube Figure 1

Illustration of streamline related notations. For each streamline, (cid:126) x sl ( τ ) corresponds to the position of a tracer at traveltime τ and r ( τ ) is its travel distance, which goes from 0 at the injection to L , the ﬁnal length at the production end. Also, for eachstreamtube, A ( r ) corresponds to its cross-section area at distance r . Finally, we have boundary conditions either on the pressure ﬁeld (pressure control), p ( (cid:126) x ) = p in j , (cid:126) x ∈ Γ inj , p ( (cid:126) x ) = p prod , (cid:126) x ∈ Γ prod , (4)where p in j and p prod are respectively prescribed pressures at the injector and at the producer, or on the injec-tion/pumping rates (rate control), q tot ( (cid:126) x ) = q in j , (cid:126) x ∈ Γ inj , q tot ( (cid:126) x ) = q prod , (cid:126) x ∈ Γ prod , as well as initial/boundary conditions on the water saturation, S w ( (cid:126) x , t ) = − s oi , (cid:126) x ∈ Γ inj and t > , S w ( (cid:126) x , ) = s wi , (cid:126) x ∈ Ω , where Ω is the reservoir, Γ inj is the boundary of the reservoir where the water is injected from the injector, Γ prod isthe boundary of the reservoir where water and oil are collected from the producer, and s wi and s oi are respectively theirreducible saturations of water and oil. We can recast Equation 3 into a 1D framework by considering the following characteristic curves or streamlines (e.g.,Muskat and Wyckoff, 1934; Cvetkovic and Dagan, 1994; Thiele et al., 1996), (cid:126) x sl ( τ ) = (cid:126) x , sl + (cid:90) τ q tot [ (cid:126) x sl ( t (cid:48) ) , t (cid:48) ] dt (cid:48) (5)with τ being the TOF and (cid:126) x , sl being the origin of the streamline at τ =

0. Considering the directional derivative alonga streamline, we can write q tot · ∇ = | q tot | ∂∂ r (Batycky, 1997, equation (3.35)), where r is the distance traveled along the streamline parallel to q tot and deﬁned as dr = | q tot | d τ (see Figure 1). Note that this leads to d τ / dr = / | q tot | , which is the transformation proposed by (Zhangand Tchelepi, 1999, equation (15)). We can further deﬁne the volumetric ﬂow rate q sl , tot along a streamline by q sl , tot = | q tot | A ( r ) , (6)4here A ( r ) is the cross-section of a streamtube bounded by streamlines (see Figure 1). By deﬁnition of a streamtube,we have the fundamental property, ∂ q sl , tot ∂ r = , which is a restatement of Equation 2. This means that it is possible, using a parameterization along streamlines,to reduce Equation 3 to a sum of independent 1D problems along streamlines by the change of variables S w ( t ,(cid:126) x ) = ˜ S w ( t , r ) , φ ( r ) A ( r ) ∂ ˜ S w ∂ t + q sl , tot ( t ) ∂ f w ( ˜ S w ) ∂ r = . (7)Finally, following Hewett and Yamada (1997) by introducing the cumulative injection volume Q and the cumulativepore volume V per streamline, Q ( t ) = (cid:90) t q sl , tot ( t (cid:48) ) dt (cid:48) and V ( r ) = (cid:90) r φ ( r (cid:48) ) A ( r (cid:48) ) dr (cid:48) , (8)the change of variables ˜ S w ( t , r ) = ˇ S w ( Q , V ) reduces Equation 7 to ∂ ˇ S w ∂ Q + ∂ f w ( ˇ S w ) ∂ V = S w ( , V ) = s wi and ˇ S w ( Q , ) = − s oi = s B , (10)respectively, where s wi is the initial water saturation of the reservoir and s B is the water injection (both assumed to beknown, and uniform and constant, respectively). Hence, ˇ S w : ( Q , V ) (cid:55)→ ˇ S w ( Q , V ) is a deterministic mapping from theuncertain reservoir characteristics (permeability and porosity) to the random water saturation ﬁeld. The key advantageof this reformulation lies in the fact that the mapping equation (along each streamline) is a 1D equation in the stochasticvariables, and can therefore be solved efﬁciently (Ibrahima et al., 2015). Moreover, the mapping Equation 9 admits ananalytical solution in terms of the ratio Z between V and Q (Hewett and Yamada, 1997), simplifying even further theinterpretation of the mapping function,ˇ S w (cid:18) Z = VQ (cid:19) =  s B if Z < f (cid:48) w ( s B ) s w ( Z ) if f (cid:48) w ( s B ) ≤ Z < f (cid:48) w ( s ∗ ) s wi if Z ≥ f (cid:48) w ( s ∗ ) , (11)where s ∗ is deﬁned by the Rankine-Hugoniot condition f (cid:48) w ( s ∗ ) = f w ( s ∗ ) − f w ( s wi ) s ∗ − s wi , s B is deﬁned in Equation 10, and s w ( y ) is deﬁned for y < f (cid:48) w ( s ∗ ) as s w ( y ) = ( f (cid:48) w ) − ( y ) . Let us call α ∗ = f (cid:48) w ( s ∗ ) . Thisself-similarity behavior sketched in Figure 2a is a key factor in our method, as this leads to a tremendous reduction inthe complexity of the stochastic problem. Indeed, even if the permeability ﬁeld is a full stochastic tensor, Z is alwaysa stochastic scalar ﬁeld. Once the deterministic water saturation mapping ˇ S w is solved, the uncertainties in the inputparameters V and Q translate through the mapping or its inverse to the uncertainty in S w . The mapping equation showsthat physically, propagating the uncertainties in the porosity and permeability ﬁelds is equivalent to propagating theuncertainties in Z along (uncertain) streamlines.Because the saturation is deﬁned by a nonlinear equation (Equation 3), the stochastic saturation ﬁeld S w may beelaborate (e.g. far from Gaussian). Therefore, instead of working directly with S w as a stochastic ﬁeld, we regard thesaturation as a nonlinear mapping function ˇ S w ( Z ) of ”smoother” stochastic ﬁelds that constitute Z . This mapping func-tion, deterministic and either analytically (in Equation 11) or numerically (from Equation 9-Equation 10) determined,entirely embodies the nonlinearity of the saturation. The ”smooth” stochastic ﬁelds drive the probabilistic distributionof the saturation. This reformulation was used in (Ibrahima et al., 2015) for estimating the saturation distribution inthe spatial 1D case and is extended to the multi-dimensional case in this section. Indeed, V and Q are deﬁned foreach streamline and cannot be used in a computationally efﬁcient manner to estimate the distribution of Z in multiple5 S w s wi s ∗ s B f (cid:48) w ( s B ) α ∗ s w (a) Sketch of the solution mapping ˇ S w ( Z ) S w Z s wi s ∗ s B f (cid:48) w ( s B ) α ∗ s < − > w = f (cid:48) w (b) Sketch of the inverse mapping of ˇ S w ( Z ) Figure 2

Snapshot of the solution ˇ S w ( Z ) (2a) and its inverse ˇ S < − > w ( s ) (reproduction from [ﬁg.1 in Ibrahima et al. (2015)]) (2b) spatial dimensions since the streamline paths are time dependent and also stochastic. We are going to rewrite Z into aratio of quantities that can be more easily and efﬁciently estimated. Along a streamline (cid:126) x sl ( τ ) , as pictured in Figure 1,the water saturation is S w [ (cid:126) x = (cid:126) x sl ( τ ( (cid:126) x )) , t ] = ˇ S w [ Z ( r ( τ ( (cid:126) x )) , t )] , (12)with (cid:126) x sl ( τ ) deﬁned in Equation 5 and r ( τ ) = (cid:90) τ (cid:12)(cid:12) q tot (cid:2) (cid:126) x sl ( t (cid:48) ) , t (cid:48) (cid:3)(cid:12)(cid:12) dt (cid:48) = (cid:90) τ q tot (cid:2) (cid:126) x sl ( t (cid:48) ) , t (cid:48) (cid:3) dt (cid:48) (13)being the streamline travel distance to position (cid:126) x and Z ( r , t ) = V ( r ) / Q ( t ) , with Q ( t ) and V ( r ) deﬁned in Equation 8.More precisely, we should write Q sl ( t ) ( t ) and V sl ( t ) ( r ) as the streamlines are also evolving in time. This last remarkhighlights why it is difﬁcult to efﬁciently estimate the distributions of these two quantites in multidimensional porousmedia as these distributions depend on the streamline geometry and saturation distribution. To alleviate this difﬁculty,we make a key approximation: the frozen streamline approximation. We assume that the streamline pattern is dominated by the permeability heterogeneity and is only weakly inﬂuencedby local viscosity effects due to the wetting phase displacing the non-wetting phase. Within this frozen streamlineapproximation, q tot ( (cid:126) x , t ) can be written as q tot ( (cid:126) x , t ) = q tot ( (cid:126) x , t ) e q ( (cid:126) x ) , (14)where e q ( (cid:126) x ) is a time-independent unity direction vector parallel to q tot . We introduce (cid:126) x sl ( τ ) = (cid:126) ¯ x sl [ r ( τ )] , so that (cid:126) ¯ x sl isa function of r only. We can then write d (cid:126) x sl d τ = d (cid:126) ¯ x sl dr drd τ = d (cid:126) ¯ x sl dr q tot , (15)where the second equality results from Equation 13, and from Equation 5 and Equation 14 d (cid:126) x sl d τ = q tot = q tot e q . (16)From Equation 15 and Equation 16, we can derive the following streamline-generating relation, d (cid:126) ¯ x sl dr = e q (cid:0) (cid:126) ¯ x sl (cid:1) . (17)6ence, under the ﬁxed or frozen streamline approximation, the streamline pattern is time-invariant, sl ( t ) = sl ( ) , andthe streamtube cross-sectional areas do not change in time A ( r ; t ) = A ( r ) . Therefore, V only depends on the streamlinedistance from the injection, V ( r ; t ) = V ( r ) , and Z ( r , t ) is a separable function in r and t .The frozen streamline approximation, though simplistic, is well suited for highly heterogeneous permeability ﬁelds(Zhang and Tchelepi, 1999; M¨uller et al., 2013), which are typically the case in applications. Therefore the followingframework is remarkably appropriate for high input variances σ K .Based on Equation 6, the cross-section area at r ( τ ( (cid:126) x )) along the streamtube can be expressed as A [ r ( τ ( (cid:126) x ))] = q sl , tot ( t ; (cid:126) x ) q tot [ r ( τ ( (cid:126) x )) , t ] . (18)Within the frozen streamline approximation, A [ r ( τ ( (cid:126) x ))] is time-independent and thus can be determined for exampleby q tot at time t =

0. Therefore, for each spatial location (cid:126) x in the reservoir, we have from Equation 8 Z ( (cid:126) x , t ) = V ( (cid:126) x ) Q ( (cid:126) x , t ) = (cid:90) r ( τ ( (cid:126) x )) φ ( r (cid:48) ) q tot ( r (cid:48) , t = ) dr (cid:48) (cid:90) t q sl , tot ( t (cid:48) ; (cid:126) x ) q sl , tot ( t = (cid:126) x ) dt (cid:48) . (19)The denominator (cid:82) t q sl , tot ( t (cid:48) ) / q sl , tot ( t = ) dt (cid:48) will thereafter be referred to as the Equivalent Injection Time (EIT) as itcorresponds to the time needed to inject at the rate q sl , tot ( t = ) a volume equivalent to (cid:82) t q sl , tot ( t (cid:48) ) dt (cid:48) . The volumetricﬂow rate q sl , tot is an integral quantity of a streamtube, i.e. q sl , tot = ∆ p / R , with resistance R = (cid:82) L [ A ( r (cid:48) ) λ tot ( r (cid:48) )] − dr (cid:48) and pressure drop ∆ p . Correspondingly, we may further approximate the EIT to be an almost deterministic time-dependent function that can be estimated in terms of a mean value, (cid:104) EIT (cid:105) , from few MCS samples. The scope of thisEIT approximation is studied in section 4. In the case of rate control boundary conditions, q sl , tot ( t ; x ) and consequentlyEIT may become a deterministic quantity.The numerator (cid:82) r φ ( r (cid:48) ) / q tot ( r (cid:48) , t = ) dr (cid:48) , on the other hand, is exactly the TOF τ for the initial streamline pattern.The TOF captures both uncertainties in the porosity and permeability ﬁelds. Its distribution can be estimated withsmall computational cost, since no transport sampling is needed. To estimate the distribution of the TOF, we generaterealizations of streamlines that go through every point (cid:126) x of interest using Pollock’s algorithm (Pollock, 1988; Hewettand Yamada, 1997). This is accomplished by backtracking streamlines in time from (cid:126) x to the injection location. Theresulting statistics of the TOF are needed to determine the distribution of Z ( (cid:126) x , t ) by means of kernel density estimation(KDE) (Botev et al., 2010). The TOF can also be estimated by solving an elliptic steady-state equation in Euleriancoordinates (Shahvali et al., 2012).Similarly, thanks to the frozen streamline approximation and Equation 18, the EIT can be approximated as follows, EIT ( (cid:126) x , t ) = (cid:90) t q tot [ r ( τ ( (cid:126) x )) , t (cid:48) ] A [ r ( τ ( (cid:126) x ))] q tot [ r ( τ ( (cid:126) x )) , t = ] A [ r ( τ ( (cid:126) x ))] dt (cid:48) , = (cid:90) t q tot [ r ( τ ( (cid:126) x )) , t (cid:48) ] q tot [ r ( τ ( (cid:126) x )) , t = ] dt (cid:48) , = (cid:90) t q tot ( (cid:126) x , t (cid:48) ) q tot ( (cid:126) x , t = ) dt (cid:48) , (20)which requires the total velocity ﬁeld, but not the construction of streamlines. The TOF is strongly linked to the velocity ﬁeld, which is described by a linear equation when the saturation is ﬁxed.If the permeability ﬁeld has a ‘smooth’ distribution, we expect the TOF to have a similar behavior. In the cases studiesin this work, the permeability ﬁeld is assumed to be lognormal, so that the logarithm of the TOF is the appropriatestochastic ﬁeld to be considered. We therefore approximate Z to be Z = exp ( log τ ) (cid:104) EIT (cid:105) (21)and estimate the CDF of the water-saturation at position (cid:126) x and time t , as follows, F FROSTS w ( s ; (cid:126) x , t ) = P { S w ( (cid:126) x , t ) ≤ s } P (cid:8) ˇ S w [ Z ( (cid:126) x , t )] ≤ s (cid:9) = P { Z ( (cid:126) x , t ) ≥ χ F ( s ) } = − P { Z ( (cid:126) x , t ) ≤ χ F ( s ) } (22) = − P (cid:110) e log τ ( (cid:126) x ) / (cid:104) EIT (cid:105) ( (cid:126) x , t ) ≤ χ F ( s ) (cid:111) = − P { log τ ( (cid:126) x ) ≤ log [ χ F ( s ) (cid:104) EIT (cid:105) ( (cid:126) x , t )] } = − F log τ ( (cid:126) x ) [ log ( χ F ( s ) (cid:104) EIT (cid:105) ( (cid:126) x , t ))] , where F FROSTS w ( s ; (cid:126) x , t ) is the FROST estimate of the saturation CDF at saturation level s , F log τ ( (cid:126) x ) is the log(TOF) CDF atposition (cid:126) x , P is the probability measure, and χ F ( s ) corresponds to the inverse mapping ˇ S < − > w illustrated in Figure 2b.The latter is deﬁned as follows, χ F ( s ) =  + ∞ , s < s wi , α ∗ , s ∈ ( s wi , s ∗ ) , f (cid:48) w ( s ) , s ∈ ( s ∗ , s B ) , , s > s B . The derivation of Equation 22 is deduced from Equation 21, Equation 12 that gives the mapping from Z to S w , andthe analytical expression of the mapping in Equation 11 . The saturation PDF is given by taking the derivative ofEquation 22 with respect to s , p FROSTS w ( s ; (cid:126) x , t ) = F ∗ S w ( (cid:126) x , t ) δ ( s ) − π ( s ; (cid:126) x , t ) , with F ∗ S w ( (cid:126) x , t ) = − F log τ ( (cid:126) x ) [ log ( α ∗ (cid:104) EIT (cid:105) ( (cid:126) x , t ))] , π ( s ; (cid:126) x , t ) = f (cid:48)(cid:48) w ( s ) f (cid:48) w ( s ) p log τ ( (cid:126) x ) [ log ( χ p ( s ) (cid:104) EIT (cid:105) ( (cid:126) x , t ))] , (23) δ being the Dirac measure, and χ p ( s ) = (cid:26) f (cid:48) w ( s ) , s ∈ ( s ∗ , s B ) , , otherwise . In subsection 4.4 we argue that, for time-independent pressure control, the mean EIT approximately follows apower law and is only a function of time (and implicitly, of the viscosity ratio m ), (cid:104) EIT (cid:105) ( (cid:126) x , t ) ≈ c ( m ) t β ( m ) , (24)where c ( m ) and β ( m ) are estimated using two time steps at times t = ∆ t and 2 ∆ t as c = (cid:104) EIT (cid:105) ( (cid:126) x , ∆ t ) / ∆ t β and β = log (cid:20) (cid:104) EIT (cid:105) ( (cid:126) x , ∆ t ) (cid:104) EIT (cid:105) ( (cid:126) x , ∆ t ) (cid:21)(cid:30) log ( ) (25)for any (cid:126) x . For viscosity ratios m < β ( m ) >

1. For m close to unity, the mean EIT scales linearly with t as β ( m ) ≈

1, and for m >

1, we expect β ( m ) < F FROSTS w ( s ; (cid:126) x , t ) = − F log τ ( (cid:126) x ) (cid:104) log (cid:16) χ F ( s ) c ( m ) t β ( m ) (cid:17)(cid:105) . (26)Besides the frozen streamline approximation, Equation 22 and Equation 26 are valid if the EIT has a small varianceso that we can replace it by its average. This constraint can restrict the range of possible viscosity ratios m . Weinvestigate the effect of m on the EIT variance in subsection 4.4. To summarize, the FROST saturation one-pointdistribution estimate solely depends on two key statistics: the logarithm of the TOF ﬁeld (log(TOF)), log τ ( (cid:126) x ) and theaverage EIT, (cid:104) EIT (cid:105) ( (cid:126) x , t ) . Once these quantities are obtained, the saturation distribution is readily available throughEquation 22. Therefore, the efﬁciency and accuracy of the FROST distribution method strongly depends on thecomputational costs of generating accurate estimates of these two statistics. The mean EIT has a parametric formula(see Equation 24 and Equation 25) that a priori involves solving the ﬂow and transport problem for two time steps,and the log(TOF) distribution can be evaluated using KDE on samples at time t =

0. Alternatively, for example underconditions studied in section 4, a parametric PDF model or surrogate for the log(TOF) may be applied.8 .4 The FROST algorithm

Equation 26 provides a closed-form, semi-analytical method to estimate the saturation distribution that relies on thedistribution of the log(TOF), computed from a linear problem, and on the mean EIT that can be estimated by solving thetwo-phase ﬂow problem for two time steps. We verify that this formula produces accurate estimates of the saturationdistribution. The FROST algorithm is exposed in Algorithm 1, where N s is the number of initial travel time realizations, M s is the number of times we solve the transport problem for two times steps, and N t is the number of time stepsconsidered during the entire simulation. Algorithm 1

Frozen streamline-based distribution method (FROST) for i = · · · N s do Solve initial tracer equationStore time-of-ﬂights τ ( (cid:126) x ) ( i ) if i ≤ M s then Solve transport eq. at times T / N t and 2 T / N t Store velocity ﬁelds (cid:110) q ( i ) tot ( kT / N t ) (cid:111) k = , end ifend for Estimate p log τ ( (cid:126) x ) and F log τ ( (cid:126) x ) with KDE using { τ ( (cid:126) x ) ( i ) } ≤ i ≤ N s Estimate c , β and (cid:104) EIT (cid:105) ( t ) = ct β using (cid:110) q ( i ) tot ( kT / N t ) (cid:111) k = , ≤ i ≤ N s Compute (cid:110) F FROSTS w ( s ; (cid:126) x , t j ) (cid:111) j = ... N t and (cid:110) p FROSTS w ( s ; (cid:126) x , t j ) (cid:111) j = ... N t for t j = jT / N t Before focusing on two-phase ﬂow, we explore the implications of the previously derived distribution method inthe case of advection-dominated tracer dispersion. The hydraulic conductivity ﬁeld is uncertain and assumed to bemodeled as a stochastic ﬁeld with known (log Gaussian) distribution. If C ( (cid:126) x , t ) denotes the tracer concentration atposition (cid:126) x and time t , we are interested in the following stochastic problem,  ∂ C ∂ t + ∇ · ( u C ) = , ∇ · ( K ∇ h ) = , q = − K ∇ h , u = q / n , h ( (cid:126) x , t ) = h in , C ( (cid:126) x , t ) = C , (cid:126) x ∈ Γ in j , ∀ th ( (cid:126) x , t ) = h out , (cid:126) x ∈ Γ prod , ∀ tC ( (cid:126) x , ) = , (cid:126) x / ∈ Γ in j , where K is the hydraulic conductivity ﬁeld, h is the hydraulic head, n is the porosity ﬁeld, q is the speciﬁc dischargeand u is the velocity ﬁeld. Since the conductivity ﬁeld is random, so is the hydraulic head, and the speciﬁc dischargeby construction. The velocity ﬁeld inherits randomness possibly from both the speciﬁc discharge and the porosityﬁeld. And ﬁnally the tracer concentration becomes random from transport with random velocity.The tracer case can be deduced from the two-phase ﬂow case. Indeed, this case corresponds to the situation wherethe ‘fractional’ ﬂow function is the identity f w ( s ) = s . In this case, no shocks are developed during the transport.Furthermore, since we only have one phase, the EIT is deterministic and EIT ( (cid:126) x , t ) = t . Using the distribution method,we deduce the following formula for the expected tracer concentration, (cid:104) C (cid:105) ( (cid:126) x , t ) = C P ( Z < )= C F log τ ( (cid:126) x ) [ log ( t )] The tracer concentration variance is also deduced from the CDF of the logarithm of the TOF, σ C ( (cid:126) x , t ) = C F log τ ( (cid:126) x ) [ log ( t )] (cid:0) − F log τ ( (cid:126) x ) [ log ( t )] (cid:1) . x injector producer ••• ••• ••• ( , ) ( , ) ( , )( , ) ( , ) ( , )( , ) ( , ) ( , ) Figure 3

Computational quarter-ﬁve spot setting. The solid blue line South-West corresponds to the injection spot (e.g. contaminantor water). The solid red line North-East corresponds to the pumping (or production) well. The red dots in the porous mediumcorresponds to locations where all parameters (e.g. TOF, EIT, PDF and CDF) are investigated.

Throughout the rest of the manuscript, we perform tests on a 2D quarter-ﬁve spot conﬁguration. This conﬁgurationcorresponds to a squared porous medium where ﬂuid is injected in the south-west corner and a pumping well is locatedin the north-east corner (see Figure 3). We set the dimensions of the porous domain as L x = L y = L =

1. The typicalgrid used for simulations is 128 × ( i , j ) , ≤ i , j ≤ N s , with pressure control deﬁned in Equation 4 and Γ inj and Γ prod spanning on 4 gridsegments, p in j = p prod = T = K is a stochastic ﬁeld with log ( K ) being a Gaussian ﬁeld with exponential covari-ance, while φ = . K ∼ N ( , C K ) , where C K ( (cid:126) x ,(cid:126) y ) = σ K exp  − (cid:118)(cid:117)(cid:117)(cid:116) d ∑ i = ( x i − y i ) λ i  , with σ K being the variance of log ( K ) , λ i the correlation length in each dimension and d = φ were partially studied in (Ibrahima et al., 2015) and are deferred to futurework as they only affect the distribution of log(TOF). We use the streamline solver from (M¨uller et al., 2013) to solvefor all quantities of interest.The goal of this section is to evaluate the performance of the FROST method for log-Gaussian permeability ﬁeldsand assess the accuracy of the aforementioned approximations and metrics used to derive saturation statistics. Insubsection 4.1, we verify that, in this setting, the TOF distribution is indeed smooth and can be approximated by alog-Gaussian ﬁeld. In subsection 4.2, we compare saturation CDF and PDF estimates from both MCS and FROST. Weargue that the FROST method produces accurate results compared to MCS despite the frozen streamline approxima-tion. Furthermore, the FROST saturation PDF estimation is more stable than the KDE-based MCS results, especiallyin areas where the injected water front is likely to be located. Finally we verify that, ﬁxing the initial streamlines inMCS, the frozen streamline approximation generates exactly matching saturation CDFs between the FROST and MCSmethods. In light of the observations from the latter subsections, we propose, in subsection 4.3, a computationally ac-celerated version of FROST based on a surrogate Gaussian distribution for the log(TOF). Finally, in subsection 4.4,we evaluate the scope of the EIT approximation proposed in Equation 24. log (TOF) distribution We record the TOF using Pollock’s algorithm (Pollock (1988)). We then use state-of-the-art KDE (Botev et al., 2010)to estimate the distribution of the log(TOF). With σ K = λ = λ = λ = . L , m = .

25 and N s = p log τ ( (cid:126) x ) , at the nine locations shown in Figure 3, are displayed in Figure 4. We can see that at all nineobservation spots (and really in the entire grid), p log τ ( (cid:126) x ) is smooth and close to a Gaussian surrogate, p G ( log τ ( (cid:126) x )) , simplybuilt as follows, p G ( log τ ( (cid:126) x )) ∼ N ( m ( N s ) ( (cid:126) x ) , ( s ( N s ) ( (cid:126) x )) ) with sample mean m ( N s ) ( (cid:126) x ) = N s N s ∑ k = log τ ( (cid:126) x ) ( k ) and sample variance ( s ( N s ) ( (cid:126) x )) = N s − N s ∑ k = (cid:16) log τ ( (cid:126) x ) ( k ) − m ( N s ) ( (cid:126) x ) (cid:17) based on available realizations [ log τ ( (cid:126) x ) ( k ) ] ≤ k ≤ N s of the log(TOF). This illustrates that under certain conditions thegeneration of the log(TOF) distribution can be further accelerated by means of Gaussian surrogates.To provide a numerical measure of the distance between the sampled log(TOF) distribution and its Gaussiansurrogate, we use the statistical (TVD) distance δ G ( (cid:126) x ) deﬁned as follows, δ G ( (cid:126) x ) = (cid:90) + ∞ − ∞ (cid:12)(cid:12) p log τ ( (cid:126) x ) ( θ ) − p G ( log τ ( (cid:126) x )) ( θ ) (cid:12)(cid:12) d θ . Intuitively, noticing that 0 ≤ δ G ( (cid:126) x ) ≤ δ G can be interpreted as the maximal probability of being able to distinguishthe two random ﬁelds. Figure 5 demonstrates how close both ﬁelds are and supports the use of a surrogate model forthe log(TOF). The maximum TVD distance is just above 0 .

08, while about 80 % of the grid cells have a TVD distancesmaller than 0 . Direct MCS is the most straightforward way to estimate the saturation distribution. The method is outlined in Algo-rithm 2. Though simple, it requires a large number of realizations to be accurate, especially because the two-phaseﬂow problem is nonlinear. Furthermore, a nonlinear numerical solver is needed to evolve the saturation in time foreach realization. We consider N t = N i =

10 intermediate time steps to evolve the saturation along streamlines and then interpolate the saturation valuesback to the Eulerian grid (M¨uller et al., 2013).

Algorithm 2

MCS method for i = · · · N MCSs do Solve pressure equation for n = · · · N t dofor k = · · · N i do t ← (( n − ) + ( k − ) / N i ) N t T Upwind update S kw ( (cid:126) x , t ) ( i ) S w ( (cid:126) x , t ) ( i ) ← S kw ( (cid:126) x , t ) ( i ) end for Store S w ( (cid:126) x , t ) ( i ) Update pressure at time t end forend forfor n = · · · N t do t ← nN t T Compute F MCSS w ( s ; (cid:126) x , t ) and p MCSS w ( s ; (cid:126) x , t ) withKDE using { S w ( (cid:126) x , t ) ( i ) } ≤ i ≤ N MCSs end for log(TOF) values den s i t y -2 0 2 log(TOF) values den s i t y -4 -2 0 log(TOF) values den s i t y -2 0 2 log(TOF) values den s i t y -2 0 2 log(TOF) values den s i t y -2 0 2 log(TOF) values den s i t y -2 0 2 log(TOF) values den s i t y -2 0 2 log(TOF) values den s i t y -2 0 2 log(TOF) values den s i t y Figure 4

Illustration of smoothness of log(TOF) in the case of log Gaussian permeability ﬁeld at the nine locations depicted inFigure 3. The permeability ﬁeld distribution has a uniform correlation length λ = . L , variance σ K = m = . N s = x x Figure 5

Statistical distance between the log(TOF) random ﬁeld and its Gaussian surrogate on the entire grid. The permeabilityﬁeld distribution has a uniform correlation length λ = . L , variance σ K = m = . N s = We compare the CDF and PDF of the saturation obtained using direct MCS on the saturation realizations versusthose obtained by performing the FROST with log(TOF) distribution estimated using log(TOF) realizations and KDE,and the mean EIT coming from the power law ﬁt with M s = m = . σ K = λ = . L and N s = C on ( , ) and can have a jump at s = (cid:104) EIT (cid:105) ( t ) = t exactly. Figure 9 shows that theFROST-based CDF indeed correspond to the MCS-based CDF. In Figure 10, we illustrate that the MCS-based PDFestimate may not converge, while in Figure 11, we display a “smoothed” version of these MCS-based PDFs obtainedby removing realizations where the saturation is zero, hence avoiding the non-smooth behavior of the distribution at s =

0. The displayed “smoothed” MCS-based PDF is in agreement with the shape of the estimated FROST basedPDFs, while, as expected, overestimating the peak of the density where the PDF is non-smooth.

We previously argued that the log(TOF) distributions could be well approximated by Gaussian distributions (see sub-section 3.3, subsection 4.1, Figure 4, and Figure 5). Figure 12 shows that a reliable estimate of the log(TOF) distri-bution can be achieved already with few realizations (in the order of 10 times less as compared to KDE) for the meanand variance sample estimates. This leads to further computational cost savings of the FROST method. Figure 13demonstrates the accuracy of the Gaussian surrogate FROST versus direct MCS and FROST for the saturation distri-bution. This means that we can preprocess the distribution of the log(TOF) by solving a few steady-state problems,and reuse this distribution to estimate the saturation distribution for any desired time. We can therefore formulate afast and efﬁcient algorithm to estimate the saturation distribution in Algorithm 3, where N gs , the number of realizationsneeded, is typically smaller than N s , the number of realizations used in the FROST method, and M s is the same as in13 saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y Figure 6

Comparison between the saturation CDFs obtained with MCS (crossed lines) and FROST (solid lines) at 3 different times( t = . T in blue, t = . T in green and t = T in red), and at the nine locations depicted in Figure 3. The permeability ﬁelddistribution has a uniform correlation length λ = . L , variance σ K = m = . N MCSs = N s = saturation den s i t y Figure 7

25. 2000 realizations were used to generate the distributions. saturation den s i t y Figure 8

Comparison between the saturation PDFs obtained with MCS (crossed lines) and FROST (solid lines) at 3 different times( t = . T in blue, t = . T in green and t = T in red) and at position ( , ) in Figure 3. The MCS-based PDFs are oscillatory.The permeability ﬁeld distribution has a uniform correlation length λ = . L , variance σ K = m = . N MCSs = N s = saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y Figure 9

Comparison between the saturation CDFs obtained with MCS (crossed lines) and FROST (solid lines) at 3 differenttimes ( t = . T in blue, t = . T in green and t = T in red) and at the nine positions depicted in Figure 3, with ﬁxed ﬂow ﬁeld.The permeability ﬁeld distribution has a uniform correlation length λ = . L , variance σ K = m = . N MCSs = N s = saturation den s i t y Figure 10

Comparison between the saturation PDFs obtained with MCS (crossed lines) and FROST (solid lines) at 3 different times( t = . T in blue, t = . T in green and t = T in red) with frozen streamlines, at position ( , ) in Figure 3. The permeability ﬁelddistribution has a uniform correlation length λ = . L , variance σ K = m = . N MCSs = N s = saturation den s i t y Figure 11

Comparison between the saturation PDFs obtained with “smoothed” MCS (crossed lines) and FROST (solid lines) at3 different times ( t = . T in blue, t = . T in green and t = T in red) with frozen streamlines, at position ( , ) in Figure 3.The permeability ﬁeld distribution has a uniform correlation length λ = . L , variance σ K = m = .

25. When smoothed, the PDF is adequately captured. N MCSs = N s = log(TOF) values den s i t y Figure 12

Comparison of the logTOF PDFs, at location ( , ) in Figure 3, using samples directly (solid lines) and surrogateGaussian distributions using 80 (triangles), 400 (circles) and 2000 (pluses) realizations. The permeability ﬁeld distribution has auniform correlation length λ = . L , variance σ K = m = . the FROST method. We will refer to this method as gFROST. Algorithm 3

Gaussian surrogate based FROST (gFROST) for i = · · · N gs do Solve initial tracer equationStore time-of-ﬂights τ ( (cid:126) x ) ( i ) if i ≤ M s then Solve transport eq. at times T / N t and 2 T / N t Store velocity ﬁelds (cid:110) q ( i ) tot ( kT / N t ) (cid:111) k = , end ifend for Compute sample mean, m ( N gs ) ( (cid:126) x ) , and standard deviation, s ( N gs ) ( (cid:126) x ) of (cid:110) τ ( (cid:126) x ) ( i ) (cid:111) ≤ i ≤ N gs Estimate p log τ ( (cid:126) x ) and F log τ ( (cid:126) x ) with log τ ( (cid:126) x ) ∼ N [ m ( N gs ) ( (cid:126) x ) , ( s ( N gs ) ( (cid:126) x )) ] Estimate c , β and (cid:104) EIT (cid:105) ( t ) = ct β using (cid:110) q ( i ) tot ( kT / N t ) (cid:111) k = , ≤ i ≤ N s Compute (cid:110) F gFROSTS w ( s ; (cid:126) x , t j ) (cid:111) j = ... N t and (cid:110) p gFROSTS w ( s ; (cid:126) x , t j ) (cid:111) j = ... N t for t j = jT / N t In section 3, we argued that the EIT was almost deterministic (and hence could be estimated by its average (cid:104)

EIT (cid:105) with a few samples) and that it approximately followed a power law in time. We study here when this is indeedthe case by investigating the sensitivity of the EIT with respect to the viscosity ratio m and the permeability ﬁeldvariance σ K . We assume that the permeability ﬁeld has a log normal distribution with exponential covariance and σ K ∈ { . , , } , λ = . L . We use M s =

200 ﬂow realizations with pressure updates. We vary the viscosity ratio tobe m = µ w / µ o ∈ { . , . , . , , , , } . Figure 14 shows that, under pressure control, the mean EIT is followinga power law to a good approximation. On this plot, we display the mean EIT and its standard deviation at position ( , ) only since, remarkably, these two statistics do not depend on the position (cid:126) x in the reservoir. This spatial independenceis another advantage of the FROST. However, the EIT is not always guaranteed to have a small variance. The EITstandard deviation is negligible for viscosity ratios m close to 1 (either slightly smaller or larger than 1). Hence, thisregime corresponds to the domain of applicability of the FROST with the approximation EIT ≈ (cid:104)

EIT (cid:105) . Indeed, forviscosity ratios close to unity, the EIT variance remains small even for large input variance σ K and the power lawﬁtting is satisfactory. In addition, Figure 16 and Figure 17 show that the power law ﬁtting parameters for the mean17 saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y Figure 13

Comparison of the saturation CDFs using MCS (black dashed) with N MCSs = N s = N gs =

80 (triangles), N gs =

400 (circles) and N gs = ( , ) (topleft), ( , ) (top right), ( , ) (bottom left) and ( , ) (bottom right) from Figure 3. The permeability ﬁeld distribution has a uniformcorrelation length λ = . L , variance σ K = m = . β and c , not only decay exponentially with increasing m but are also weakly sensitive to the input variance σ K ,at least for m ≤

4. This means that we can efﬁciently ﬁt c ( m ) and β ( m ) , and only sample for the TOF in the FROSTto be able to estimate the saturation distribution. Moreover, for rate control, Figure 15 shows that the EIT is in thiscase almost deterministic regardless of the level of σ K and its mean is linear, rather than following a power law. Thenon-zero EIT variance for m < (cid:104) EIT (cid:105) is accurate for m of the order of 1. While for m = . . . . m = . σ K =

4, the standard deviations become comparable to (cid:104)

EIT (cid:105) . Hence, the(simpliﬁed) FROST is accurate for viscosity ratios close to unity, and the mean EIT can be preprocessed thanks to itspower law parametrization. Nonetheless, for small and large viscosity ratios, we can still use the simpliﬁed FROSTbut lose accuracy in the saturation distribution estimation or work directly with the full EIT distribution. In the ratecontrol case, the EIT approximation is almost exact. 19 time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T Figure 14

Plots of (cid:104)

EIT (cid:105) (crosses) and σ EIT (pluses) over time for pressure control. Row 1, 2, 3 and 4 respectively correspond to m = . m = . m = m =

10, while column 1, 2 and 3 respectively correspond to σ K = . σ K = σ K =

4. Thesolid line corresponds to the ﬁtted power law EIT and the dashed line indicates the slope (cid:104)

EIT (cid:105) ( ∆ t ) / ∆ t for reference. M s = EIT were used. time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T time < E I T > , E I T Figure 15

Plots of (cid:104)

EIT (cid:105) (crosses) and σ EIT (pluses) over time for rate control ( q in j = e y and q prod = ). Row 1, 2, 3 and 4respectively correspond to m = . m = . m = m =

10, while column 1, 2 and 3 respectively correspond to σ K = . σ K = σ K =

4. The solid line corresponds to the ﬁtted power law EIT and the dashed line indicates the slope (cid:104)

EIT (cid:105) ( ∆ t ) / ∆ t for reference. M s =

200 realizations of the

EIT were used. m for K2 = 0.25 for K2 = 1 for K2 = 4 Figure 16

Power law ﬁtting exponent β of (cid:104) EIT (cid:105) ( t ) as a function of the viscosity ratio m . m c c for K2 = 0.25c for K2 = 1c for K2 = 4 Figure 17

Power law ﬁtting intercept c of (cid:104) EIT (cid:105) ( t ) as a function of the viscosity ratio m . Using FROST with geologically realistic porous systems

We illustrate how the FROST and gFROST methods, with EIT approximated by the power law in Equation 24, canbe used to estimate saturation statistics. We provide qualitative comparisons of the FROST, gFROST and MCS whenthe permeability ﬁeld is assumed to be highly heterogeneous due to anisotropy. We consider the same setting as insection 4, i.e. the quarter-ﬁve spot conﬁguration with pressure control, but the permeability ﬁeld is generated bya variogram-based geostatistical model instead of log-normal distribution. This is done by using the GeostatisticalEarth Modeling Software (GEMS), ﬁrst introduced in Deutsch and Journel (1998). We generate 3000 realizations of(anisotropic) oriented features for the permeability ﬁeld with exponential variogram in the x direction, γ ( (cid:126) h ) = c  − exp  − (cid:13)(cid:13)(cid:13) (cid:126) h (cid:13)(cid:13)(cid:13) a  , and analogously in the x direction, with sills c = c =

1, and practical ranges a = L / ≈ . L and a = L / ≈ . L . The distribution of permeabilities has variance σ K = . Figure 19 displays the FROST-based and MCS-based saturation CDFs at different times for the numerical test. We usehere N s = N MCSs = M s =

100 two-time steps solves of transportwere performed to estimate (cid:104)

EIT (cid:105) . We can see that, as explained in subsection 4.2, the MCS results tend to smoothenplateaus. But overall, the trends are captured.

In section 4, we discussed how the FROST and gFROST performed against MCS in estimating the saturation PDF andCDF. Even though estimating the saturation distributions is the main purpose of this paper, we showcase results for theﬁrst (mean) and second (standard deviation) moments of the saturation using the three approaches. These two statisticsare broadly used in the literature as uncertainty quantiﬁcation tools and provide convenient information to constructrapid conﬁdence intervals of the saturation variability. Equation 26 enables the computation of any saturation moment. x x Log pe r m eab ili t y −4−3−2−101 Figure 18

A realization of anisotropic layered permeability ﬁeld generated by variogram-based geostatistics. saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y saturation p r obab ili t y Figure 19

Comparison between the saturation CDFs obtained with MCS (crossed lines) and FROST (solid lines) at 3 differenttimes ( t = . T in blue, t = . T in green and t = T in red) and at the nine positions from Figure 3. The stochastic permeabilityﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N s = x s a t u r a t i o n x x s a t u r a t i o n x x s a t u r a t i o n Figure 20

Comparison between the mean saturation obtained with gFROST (left), FROST (middle) and MCS (right) at time t = . T . The stochastic permeability ﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N s = N gs =

300 for gFROST. x x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 21

300 for gFROST.

In particular, for the mean and standard deviation we have (cid:104) S w (cid:105) ( (cid:126) x , t ) = − (cid:90) F S w ( s ; (cid:126) x , t ) ds and σ S w ( (cid:126) x , t ) = − (cid:90) F S w ( √ s ; (cid:126) x , t ) ds − (cid:104) S w (cid:105) ( (cid:126) x , t ) . Numerically, we compute the integrals by using a simple trapezoidal rule with 400 quadrature points. Results forthe mean are displayed in Figure 20, Figure 21, Figure 22, and Figure 23 respectively at times t = . T , t = . T , t = . T , and t = T . For each ﬁgure, we compare FROST-based, gFROST-based and MCS-based mean saturations.While MCS uses N MCSs = N s = N gs = M s =

100 two-time steps solves of transport to estimate (cid:104)

EIT (cid:105) . We can see thatfor all times FROST and MCS are almost undistinguishable, and slight discrepancies with gFROST are noticeable,especially close to the producer.Similar illustrations for the standard deviation are depicted in Figure 24, Figure 25, Figure 26 and Figure 27. Forall times, the match between the three methods is apparent. Again, most discrepancies lie close to the producer. Thisexample showcases the accuracy of the FROST for complex permeability ﬁelds, as the saturation standard deviationis non-obvious.

Mean and standard deviation are useful information for a quick assessment of the average saturation response due touncertain permeability. However, these statistics do not inform enough about the statistical variability of the saturation,25 x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 22

300 for gFROST. x x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 23

Comparison between the mean saturation obtained with gFROST (left), FROST (middle) and MCS (right) at time t = T .The stochastic permeability ﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N s = N gs =

300 for gFROST. x x s a t u r a t i o n x x s a t u r a t i o n x x s a t u r a t i o n Figure 24

Comparison between the saturation standard deviation obtained with gFROST (left), FROST (middle) and MCS (right) attime t = . T . The stochastic permeability ﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N s = N gs =

300 for gFROST. x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 25

Comparison between the saturation standard deviation obtained with gFROST (left), FROST (middle) and MCS (right)at time t = . T . The stochastic permeability ﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N s = N gs =

300 for gFROST. x x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 26

300 for gFROST. x x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 27

Comparison between the saturation standard deviation obtained with gFROST (left), FROST (middle) and MCS (right)at time t = T . The stochastic permeability ﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N s = N gs =

300 for gFROST. q ∈ ( , ) , we deﬁne the saturation (ﬁrst) 1 / q -quantile as follows, [ S w ] q ( (cid:126) x , t ) = F − S w ( q ; (cid:126) x , t ) . (27) [ S w ] q corresponds to the likely saturation ﬁeld such that there is probability q that the true (unknown) saturationﬁeld has a smaller saturation at each point x . Or equivalently, with probability 1 − q , the true saturation ﬁeld isabove the saturation 1 / q -quantile. Equation 27 involves solving an inverse problem, which can be done efﬁcientlyby any algorithm exploiting the monotonicity of the saturation CDF, F S w , and the one-dimensionality of the problem.Nonetheless, in the gFROST case, we can directly evaluate Equation 27 with the help of the inverse of the errorfunction Θ , for which we give a name to emphasize that there are efﬁcient ways to evaluate the inverse directly.Indeed, using the Gaussian approximation of log(TOF), within gFROST, the saturation 1 / q -quantile is [ S w ] q ( (cid:126) x , t ) = ˇ S w  exp (cid:16) m ( N gs ) ( (cid:126) x ) + σ ( N gs ) ( (cid:126) x ) √ Θ ( − q ) (cid:17) ct β  , (28)where ˇ S w is deﬁned in Equation 11.Figure 28 and Figure 29 compare the likely saturation ﬁeld for q = . q = . q = .

9, using the gFROST andMCS, respectively at two different times t = . T and t = T . The results are similar and we can see that the gFROSTquantiles have sharper fronts than the MCS ones. Informally, if we are concerned about the water breakthrough at theproducer, q = . q = . q = . t = t wc , as it is given by the time theshock front reaches the producer for q = .

5, which leads to t wc =  exp (cid:16) m ( N gs ) ( (cid:126) x prod ) (cid:17) c α ∗  / β , where (cid:126) x prod is the location of the producer. The saturation quantile proﬁles at time t = t wc are displayed in Figure 30.This information can be used in risk assessment for instance. Equation 26 provides a simple but reliable estimate of the saturation distribution under stochastic permeability andporosity ﬁelds. The main advantage of the proposed FROST method resides in the observation that, in the highlyheterogeneous case, the streamlines are essentially ﬁxed. Hence, we can solve the nonlinear aspect of the distributionsemi-analytically with a mapping, while the linear part contains the log(TOF) and EIT random ﬁelds to be estimated.Moreover, because of its smoothness, the log(TOF) distribution can be efﬁciently and cheaply estimated by KDE.Therefore, no (or little for the EIT) transport sampling is needed. Besides, for each spatial location, the log(TOF)density is close to Gaussian. This motivates the introduction of an even more efﬁcient sampling method, the gFROST.The gFROST drastically reduces the cost of the distribution method by approximating the distribution of the log(TOF)at each point by a Gaussian distribution with mean and variance given by the mean and variance of the log(TOF).Due to the smoothness of the distribution of log(TOF), only a few samples of the saturation are needed to accuratelyestimate its mean and standard deviation and consequently form the Gaussian surrogate distribution.Within the gFROST, most key statistical quantities are available in a semi-analytical form, such as the saturationquantiles. But equally important is the fact that the EIT can be replaced by its mean, and the mean EIT does notdepend on the spatial location and follows a power law in time. Replacing the EIT by its mean requires the EIT standarddeviation to be relatively small, which can reduce the range of viscosity ratios and input variance that produce accuratesaturation distribution estimates. However, within this range of viscosity ratios, no (or little for determining the EITparameters) transport sampling is needed. Indeed, the (g)FROST is reusable for all times only at the cost of function28 x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n x x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 28

Illustration of saturation quantiles with q = . q = . q = . t = . T with gFROST(top) and MCS (bottom). The stochastic permeability ﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N gs = K were used. x x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n x x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 29

Illustration of saturation quantiles with q = . q = . q = . t = T with gFROST(top) and MCS (bottom). The stochastic permeability ﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N gs = K were used. x x x x x s a t u r a t i o n s a t u r a t i o n s a t u r a t i o n Figure 30

Illustration of saturation quantiles with q = . q = . q = . t = t wc with gFROST.The stochastic permeability ﬁeld is highly anisotropic in the x direction. The viscosity ratio is set to be m = . N MCSs = N gs = K were used. evaluations, provided we precompute the power law ﬁtted mean EIT, since no further sampling is needed to estimatethe saturation distribution at different times. On the other hand, in MCS, each time step requires to numerically andaccurately solve for the nonlinear coupled system comprised of Equation 2 and Equation 3.Finally, the FROST always produces stable estimates of the saturation PDF because the saturation nonlinearity istaken care of by the analytical mapping and the density estimation is used for the TOF that results from a linear Darcyﬂow problem and is hence smooth. Direct estimation of the saturation PDF by MCS and KDE, though, is not reliablesince the strong nonlinearity of the transport problem results in non-smooth saturation PDFs that KDE cannot manage.Regarding computational times, on a personal laptop (Mac OS X, 2 . ,

343 seconds, whileit corresponds to 23 ,

067 seconds for FROST. In other words, we obtain a 3 .

3x speed-up mainly due to the fact that,in the FROST case, we only solve for the TOF at initial time. Using FROST with 1000 realizations reduces the totalrunning time to 7 ,

754 seconds and gives a speed-up of 9 .

8x compared to MCS. Even more, using gFROST with300 realizations requires 2 ,

316 seconds and corresponds to a 32 .

9x speed-up compared to MCS. The typical timeto run one saturation realization with N t = . . N t times morethan the logTOF distribution estimation (82 seconds). Finally, the FROST requires a last step to estimate the saturationdistribution and statistics. This last step takes 19 seconds to output the CDFs, PDFs, means and standard deviationsof the saturation for N t = ∆ t = T / N t with N t =

10, we only need to run the laststep, which takes 51 seconds). In contrast, for MCS, we would have to run a new set of realizations to estimate thesaturation at reﬁned or further times.

We have designed an efﬁcient distribution method for the water saturation in two-phase ﬂow problems with highlyheterogeneous porous media. This strategy extends the ideas presented in Ibrahima et al. (2015) and demonstrateshow we can build fast algorithms to estimate the saturation distribution under uncertain geology for spatial dimensionslarger than 1.The (g)FROST method is motivated by the physics of the two-phase ﬂow problem. The developed algorithmbeneﬁts from highly heterogeneous porous systems, where the streamline patterns are only weakly time-dependent,and from slow displacement processes (viscosity ratio close to unity). The saturation distribution is estimated byefﬁciently determining the distribution of the log(TOF) and approximating the EIT by its statistical mean. The FROSTmethod does not suffer from the limitations of perturbation methods in terms of the magnitude of the variance andcorrelation lengths of geological features, nor from the curse of dimensionality as it does not rely on any spectral30xpansion. Besides, the method always gives stable estimates of the saturation PDF and CDF. However, it requiresto run full MCS for two time steps to estimate the power law for the mean EIT, and the mean EIT approximation isaccurate for a restricted range of viscosity ratios.In terms of comparing the saturation distributions and moments, the (g)FROST is in good agreement with MCSand is attractive, mainly thanks to its stability, especially for the PDF estimations. Furthermore, the (g)FROST leadsto efﬁcient estimates of saturation quantiles, which are more appropriate quantities for uncertainty assessment anddecision making. More importantly, the method does not require a speciﬁc structure of the input distribution andcomplex geostatistical models for the permeability and porosity ﬁelds can be used without signiﬁcantly damaging thequality of the estimates.The FROST is guided by a streamline interpretation of the ﬂow in the porous media, though no actual streamlinesare needed to compute the FROST estimates. However, the model relies on the frozen streamline approximation. Astreamline solver was used in this study, but examples of the FROST algorithm with a ﬁnite volume solver can befound in (Ibrahima et al., 2016).This last remark suggests the ﬁrst step of future developments. First, the self-similar and analytical aspects ofthe nonlinear mapping rely on the fact that the initial saturation was considered to be constant in the porous medium.From Equation 22, we can see that additional sources of uncertainty can be included in the FROST framework, suchas stochastic relative permeability with random coefﬁcients independent from the saturation random ﬁeld, or constant,but random initial saturation ﬁeld. However, because these extra sources of uncertainties increase the dimension of thestochastic space to sample, no such tests have yet been studied and further investigation is needed for possible efﬁcientsampling strategies for these cases. For more general initial saturation proﬁles, we need to rethink the mapping inEquation 11.Second, the frozen streamline approximation is valid here for highly heterogeneous media because of the absenceof gravity. In problems with gravity, the streamlines are expected to be highly time dependent. In this case, or generallyin cases where the frozen streamline approximation is no longer applicable, we could relax this approximation bydesigning a two step method in which the FROST estimate can be recursively used for a time span and then correctedby re-sampling the TOF, provided that we design an efﬁcient strategy for the ﬁrst problem, that is the nonlinearmapping with general initial saturation.Third, the EIT approximation by its mean implies a restriction in ﬂuid viscosity ratios. We can investigate strategiesfor efﬁcient sampling of Z deﬁned in Equation 21 instead, thus not requiring any further approximation on the EIT. Inparallel, we will investigate the limitations of the method vis a vis the complexity of the geological input distributions.Mariethoz and Caers (2014) argue that the permeability ﬁeld should rather be modeled using training images generatedwith multiple-point geostatistics, which for instance conserve the geological consistency of channelized systems underuncertainty. We shall therefore investigate the quality of the FROST for these complex, yet essential, input models.Finally, in terms of algorithm, further considerations are necessary. We shall investigate methods to further reduceor avoid the use of sampling. A ﬁrst step can be to use stochastic spectral methods to estimate the log(TOF) ﬁeld. Amore involved strategy would consist in building a parametric model for the distribution of the random log(TOF) onthe entire domain. All in all, the distribution method presented in this paper is an encouraging step towards designingefﬁcient uncertain quantiﬁcation tools for large scale ﬂow transport problems and may also be extended to a largerclass of physical problems. Acknowledgements

Daniel Meyer is very thankful to Florian M¨uller who kindly provided the streamline code that was applied in thiswork. Fayadhoi Ibrahima is thankful to Per Pettersson for his constant support in the early stage of the manuscript andto Matthias Cremon for providing SGEMS simulations. The authors are also grateful to the SUPRI-B research groupat Stanford University for their ﬁnancial support.