Primary-Space Adaptive Control Variates using Piecewise-Polynomial Approximations
MIGUEL CRESPO,
Universidad de Zaragoza - I3A
FELIX BERNAL,
Universidad de Zaragoza - I3A
ADRIAN JARABO,
Universidad de Zaragoza - I3A, and Centro Universitario de la Defensa Zaragoza
ADOLFO MUÑOZ,
Universidad de Zaragoza - I3A
Fig. 1. Bistro: Unbiased rendering of a complex scene with global illumination (22 indirect bounces, resulting in a 48-dimensional integration domain). Panels: Reference, MC, Ours, Error MC, Error Ours. Traditional Monte Carlo rendering results in high variance even with importance sampling techniques. In contrast, our technique combines multiple importance sampling with an adaptive piecewise-polynomial control variate (4D in this example): our control variate closely approximates the low-frequency regions of the signal, while leaving the high-frequency details to the residual, which is estimated using Monte Carlo integration. This results in lower variance and faster convergence. Except for the reference, the images were generated using 512 samples per pixel.
We present an unbiased numerical integration algorithm that handles both low-frequency regions and high-frequency details of multidimensional integrals. It combines quadrature and Monte Carlo integration by using a quadrature-based approximation as a control variate of the signal. We adaptively build the control variate as a piecewise polynomial, which can be analytically integrated and accurately reconstructs the low-frequency regions of the integrand. We then recover the high-frequency details missed by the control variate by using Monte Carlo integration of the residual. Our work leverages importance sampling techniques by working in primary space, allowing the combination of multiple mappings; this enables multiple importance sampling in quadrature-based integration. Our algorithm is generic and can be applied to any complex multidimensional integral. We demonstrate its effectiveness with four applications with low dimensionality: transmittance estimation in heterogeneous participating media, low-order scattering in homogeneous media, direct illumination computation, and rendering of distributed effects. Finally, we show how our technique is extensible to integrands of higher dimensionality by computing the control variate on Monte Carlo estimates of the high-dimensional signal, and accounting for such additional dimensionality in the residual as well. In all cases, we show accurate results and faster convergence compared to previous approaches.
Authors’ addresses: Miguel Crespo, [email protected], Universidad de Zaragoza - I3A; Felix Bernal, [email protected], Universidad de Zaragoza - I3A; Adrian Jarabo, [email protected], Universidad de Zaragoza - I3A, and Centro Universitario de la Defensa Zaragoza; Adolfo Muñoz, [email protected], Universidad de Zaragoza - I3A.
Numerical integration forms the basis of rendering algorithms, as light arriving at a sensor (pixel) is formulated as an integral. Given the specific nature of this integrand, Monte Carlo (MC) integration [Cook et al. 1984] is the most commonly applied numerical integration method. However, while general and robust, MC might converge slowly to the desired solution, introducing significant variance that leads to high-frequency noise even in smooth regions.

Several methods have been proposed to successfully reduce such variance, including (multiple) importance sampling [Veach 1997], low-discrepancy sequences [Owen 2013], or MC variants based on Markov chains [Šik and Křivánek 2018]. However, variance is still a visible artifact in low-frequency areas, where stochastic methods suffer the most. In contrast, deterministic integration methods, and in particular quadrature integration [Burden and Faires 2005], excel at such smooth integrals, providing significantly faster convergence rates for relatively smooth low-dimensional integrands. Unfortunately, these methods introduce bias in the results, and perform poorly at discontinuities and high-frequency details.

In this work we present a new unbiased numerical integration technique for low-dimensional integrals, capable of accurately handling both low-frequency and high-frequency areas of the signal. Our technique combines quadrature- and Monte Carlo-based methods, which allows us to leverage the strengths of both. We first adaptively build a low-dimensional multivariate polynomial approximation of the signal using nested (adaptive) quadrature rules [Burden and Faires 2005].
Then, we use this approximation as a control variate, and compute the residual using Monte Carlo integration. Intuitively, the control variate accurately approximates the low-frequency low-dimensional content, while Monte Carlo integration recovers the residual high-frequency details.

Our technique performs the integration in primary space, which allows us to take advantage of any importance sampling technique for error reduction in both the polynomial approximation and the residual estimation. Moreover, we demonstrate that several sampling (i.e. warping) techniques can be combined in quadrature via multiple importance sampling (MIS) [Veach and Guibas 1995], which generalizes the potential of MIS for error reduction to quadrature-based integration. In addition, our control variate is computed adaptively by using an accurate error estimation, allowing for importance sampling of the residual.

Our integration technique is generic, not necessarily tied to rendering, agnostic to the integrand, and can be combined with any importance sampling technique. We demonstrate its performance in four rendering applications with different dimensionality, with results showing reduced variance and faster convergence in multidimensional integrals with low dimensionality, and better results for the same number of samples than competing methods. Finally, we demonstrate that our technique is competitive in higher-dimensional light transport integrals by building low-dimensional quadrature-based control variates using Monte Carlo estimates of the function.

In summary, our work presents the following contributions:
• A new unbiased integration technique for low-dimensional integrals that combines the strengths of MC and quadrature methods. Our technique is adaptive, leverages any importance sampling strategy for variance reduction, and amortizes samples between different pixels (or frames).
• A generalization of multiple importance sampling to quadrature-based integration, which we leverage in our integration technique.
• Several practical rendering applications of our technique, including transmittance estimation in heterogeneous media, low-order scattering in homogeneous media, direct illumination computation, and rendering of distributed effects.
Limitations:
Our technique presents some limitations. First and foremost, given the curse of dimensionality in quadrature-based methods, the control variate is only generated on low-dimensional subdomains of the integrand. However, as we show in our applications, there is a large number of subproblems in rendering that can benefit from our technique. Additionally, we show that additional dimensions (e.g. high-order light bounces) can be included in our framework, taking advantage of the variance reduction in lower dimensions while enabling integrals of higher dimensionality. In addition, our technique introduces an overhead with respect to plain Monte Carlo, which is nevertheless amortized by the variance reduction achieved with our technique, and becomes negligible compared to costly integrand evaluations (such as rendering complex scenes). Finally, given the nature of our control variate, our technique is offline, and it does not refine the control variate when additional samples are introduced.
Numerical integration in rendering.
Monte Carlo integration is the standard for simulating light transport [Cook et al. 1984; Veach 1997]. To reduce variance, several importance sampling strategies have been developed, from strategies targeting low-dimensional subproblems (e.g. area light sampling [Ureña et al. 2013] or low-order volume scattering [Kulla and Fajardo 2012; Novák et al. 2012]) to high-dimensional path-guiding methods [Müller et al. 2017; Müller et al. 2019; Vorba et al. 2014; Zheng and Zwicker 2019]. Our work is complementary to those, and can leverage any importance sampling strategy (even multiple ones) by working in primary-sample space. Other works aim to reduce variance by carefully positioning samples adaptively to the signal and using advanced reconstruction techniques on those samples [Zwicker et al. 2015]. Several approaches exist, based either on partitioning the sample space [Hachisuka et al. 2008; Kajiya 1986], on-the-fly frequency analysis of the signal [Belcour et al. 2013; Durand et al. 2005], gradient information [Jarosz et al. 2008; Ramamoorthi et al. 2007; Ward et al. 1988], or machine learning [Gharbi et al. 2019]. Our technique also positions samples adaptively for constructing the control variate, based on multivariate nested quadrature rules. Gradient-based techniques [Hua et al. 2019; Kettunen et al. 2015] reconstruct an unbiased final image by computing its gradients via Monte Carlo estimation, followed by a Poisson reconstruction. In contrast, our work focuses on unbiased integration; it could potentially operate in the gradient domain to leverage the good properties of gradient-based methods. Finally, denoising techniques trade variance for bias, and remove noise from the final image using sophisticated filters with adaptive kernel bandwidths [Rousselle et al. 2012], local regression to low-order functions [Bitterli et al. 2016], or machine learning [Bako et al. 2017]. Our technique works in sample space and focuses on unbiased integration of light transport sub-problems. Potentially, it could be followed by a denoising pass to remove the remaining variance.
Quadrature rules.
There has been a lot of research on quadrature rules [Stroud and Secrest 1966; Ziegel 1987] and on developing adaptive schemes to increase their accuracy [Berntsen et al. 1991; Genz and Malik 1980]. In computer graphics, quadrature integration is somewhat less explored. A notable widespread exception is the integration of distant lighting through spherical harmonics [Ramamoorthi and Hanrahan 2001, 2002]. Brouillat et al. [2009] and Marques et al. [2013] proposed to use Bayesian quadrature for integrating the incident illumination. In the context of rendering participating media, rectangle quadrature rules have been used for ray marching [Perlin and Hoffert 1989] or volumetric photon mapping [Jensen and Christensen 1998]. Later, Muñoz [2014] proposed using higher-order quadrature rules, while Johnson et al. [2011] used Gaussian quadrature to accelerate the photon beams algorithm. All these works are case-specific for low-dimensional integrals, and introduce bias in the solution. Our work proposes an unbiased and generic (not tied to any specific problem) numerical integration method by devising quadrature integration as a control variate. Moreover, we demonstrate how multiple importance sampling can be applied in the context of quadrature integration.
Control variates.
Control variates have remained relatively unexplored in rendering compared to other variance reduction techniques like importance sampling. Lafortune and Willems proposed using an ambient term [Lafortune and Willems 1995b], and a directional piecewise approximation of indirect radiance [Lafortune and Willems 1995a], as control variates for diffuse illumination. Fan et al. [2006] present an estimator based on control variates that varies over the scene depending on surface properties and lighting conditions, unlike previous work that uses a single generic estimator for all scenes. Clarberg and Akenine-Möller [2008] used an approximation of the visibility function as a control variate for computing illumination from environment maps. Rousselle et al. [2016] explored two sophisticated applications of control variates in rendering: re-rendering when changing material properties, and a gradient-domain rendering reconstruction strategy. In both cases the control variate is constructed in image space, while our approach can explore any required dimensions of light transport, as illustrated in several applications. Keller [2001] proposed using Multilevel Monte Carlo [Heinrich 2001] for rendering, leveraging low-resolution renderings as a control variate of higher-resolution ones. Our approach shares a similar idea, but uses adaptive quadrature to build the control variate, and works over arbitrary sub-domains of the light transport integral. Recently, Kondapaneni et al. [2019] showed that optimal weights for multiple importance sampling can be interpreted as carefully-chosen control variates.

Spherical harmonics-based control variates have been applied to integrate environment lighting with anisotropic geometry using tangent environment maps [Mehta et al. 2012], and with polygonally-clipped incident lighting such as area lights, where the control variate accounts for the higher-bandwidth lighting [Belcour et al. 2018]. Vévoda et al. [2018] used control variates to obtain an unbiased approximation of the incident direct illumination computed using a Bayesian regression model. In contrast, our method is agnostic to the signal being integrated, and the control variate handles multidimensional integrals because it is obtained with a multidimensional nested quadrature rule, therefore accounting for more phenomena besides incident lighting.

Finally, carefully chosen constant control variates have also been used to reduce variance in transmittance estimation in the presence of participating media [Kutz et al. 2017; Novák et al. 2014]. We demonstrate that our adaptive polynomial can easily be plugged into these frameworks, resulting in significant variance reduction in some cases.
Any general integration problem can be expressed as

  F = ∫_Ω f(x) dμ(x),    (1)

where F is the integral, Ω is the integration domain, x ∈ Ω represents a differential element of the domain, f(x) is the integrand (the function being integrated), and μ(x) is the measure of the variable within the domain. Monte Carlo integration numerically approximates Equation (1) as

  F ≈ ⟨F⟩_N = (1/N) ∑_{i=1}^{N} f(x_i) / p(x_i),    (2)

where N is the number of samples used to estimate ⟨F⟩_N, x_i is a randomly sampled element of the domain, and p(x_i) is the probability density function (pdf) that describes the probability of selecting x_i as the i-th sample. Choosing a good pdf that approximates the integrand is key to reducing the variance of ⟨F⟩_N; this is often called importance sampling.

The integration domain Ω can be difficult to treat (it may present manifolds or high-order complex structures). However, by considering the pdf p(x_i) in Equation (2) as a change of variable [Muñoz 2014], it is possible to transform the integral over the domain Ω into an integral over a primary space Ω_U of random numbers, defined as the unit hypercube Ω_U = [0, 1]^D [Kelemen et al. 2002]. The domains Ω and Ω_U are related by the mapping x = P⁻¹(ū), where P⁻¹(ū) is the inverse of the cumulative distribution function of p(x). By applying the change of variables defined by the mapping P⁻¹(·), and given that dū = p(x) dμ(x), we can redefine Equation (1) as

  F = ∫_{Ω_U} f(P⁻¹(ū)) / p(P⁻¹(ū)) dū.    (3)

Multiple mappings in primary space.
Equation (3) assumes a single mapping P⁻¹: Ω_U ↦ Ω. However, multiple mappings can be used in practice, and their choice (i.e. the sampling technique used when sampling x) can dramatically affect the variance of the estimate ⟨F⟩_N. Multiple importance sampling (MIS) [Veach and Guibas 1995] allows us to optimally combine multiple mappings, by weighting the contribution of each sample x_i depending on the technique used to generate it. We can generalize Equation (3) to an arbitrary number of mappings T:

  F = ∫_{Ω_U} ∑_{t=1}^{T} W_t(P_t⁻¹(ū)) f(P_t⁻¹(ū)) / p_t(P_t⁻¹(ū)) dū,    (4)

where P_t⁻¹(ū) and p_t(x̄) are the mapping of technique t and its associated pdf, and W_t(x̄) is the weight of technique t at x̄. These weights need to satisfy ∑_{t=1}^{T} W_t(x̄) = 1 wherever f(x̄) ≠ 0, and W_t(x̄) = 0 wherever p_t(x̄) = 0.

Another strategy for variance reduction is a control variate: a function h(x) of known expected value H = ∫_Ω h(x) dμ(x). We can then reformulate Equation (1) as

  F = ∫_Ω (f(x) − α h(x)) dμ(x) + α H,    (5)

where f(x) − α h(x) is the residual with respect to the control variate, and the strength of the control variate h(x) is controlled by the parameter α.

Fig. 2. This figure illustrates our approach on a one-dimensional integral. The algorithm starts (a) from the integrand (in red) and a pdf (in blue). The pdf provides a mapping to primary space (b). Then the piecewise control variate (in green) is calculated by iteratively selecting the region with the highest error (c) and splitting it into two subregions (d) for a specified number of iterations. Once the control variate is obtained (e), the final integral is computed by sampling the residual difference between the primary-space integrand and the control variate (f).

Then, we can compute the Monte Carlo estimate ⟨F⟩_N for N samples by numerically integrating the residual as

  ⟨F⟩_N = (1/N) ∑_{i=1}^{N} (f(x_i) − α h(x_i)) / p(x_i) + α H.    (6)

By minimizing the variance of Equation (6), we obtain that the optimal choice for α is α = Cov[⟨F⟩, ⟨H⟩] / Var[⟨H⟩] (see [Robert and Casella 2004, Section 4.2.2]), which leads to a variance of the estimate of Equation (6), denoted ⟨F⟩_cv, of

  Var[⟨F⟩_cv] = Var[⟨F⟩] (1 − Corr[⟨F⟩, ⟨H⟩]²).    (7)

To leverage the variance reduction of both control variates and importance sampling, we build a control variate that approximates the integrand in primary space. By plugging Equation (3) into Equation (6) we get

  ⟨F⟩_N = (1/N) ∑_{i=1}^{N} ( f(P⁻¹(ū_i)) / p(P⁻¹(ū_i)) − α h(ū_i) ) / p_h(ū_i) + α H,    (8)

where the new pdf p_h(ū) should be as proportional to the residual as possible. Since obtaining a globally optimal h(ū) is unlikely, we instead define a piecewise control variate along the whole domain Ω_U. For that, we draw inspiration from quadrature-based integration [Burden and Faires 2005].
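To make the primary-space control-variate estimator of Equation (8) concrete, the following toy sketch (not the paper's implementation) integrates f(x) = x e⁻ˣ over [0, ∞) with the pdf p(x) = e⁻ˣ, maps the problem to primary space via the inverse CDF, and uses a hand-picked low-order polynomial as control variate h(ū) with α = 1 and a uniform residual pdf p_h(ū) = 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: F = ∫_0^∞ x e^{-x} dx = 1, importance-sampled with p(x) = e^{-x}.
# Inverse-CDF mapping to primary space: x = P^{-1}(u) = -ln(1 - u).
def g(u):                    # primary-space integrand f(P^{-1}(u)) / p(P^{-1}(u))
    return -np.log1p(-u)     # here the ratio f/p simplifies to x itself

# Hypothetical low-order control variate h(u) ≈ g(u) with known integral H.
def h(u):
    return u + 0.5 * u**2    # truncated Taylor series of -ln(1 - u)

H = 0.5 + 0.5 / 3.0          # ∫_0^1 h(u) du = 2/3, analytic

N = 100_000
u = rng.random(N)
alpha = 1.0                  # fixed here; the paper estimates the optimal alpha
# Equation (8) with a uniform residual pdf p_h(u) = 1:
estimate = np.mean(g(u) - alpha * h(u)) + alpha * H

plain_mc = np.mean(g(u))     # same samples, no control variate
print(estimate, plain_mc)    # both near 1; the residual has much lower variance
```

The residual g(u) − h(u) keeps only the high-order tail of the integrand, so the control-variate estimate has markedly lower variance than plain Monte Carlo on the same samples.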
Quadrature integration approximates the expected value F of the function f(x) by means of a linear combination of samples of f(x), weighted by carefully chosen weights (the quadrature rule) as

  F ≈ ∑_{i=1}^{N_h} w_i f(x_i),    (9)

where N_h is the number of samples x_i, with associated weights w_i. The samples and corresponding weights depend on the chosen quadrature rule. Several quadrature rules exist: the simplest ones (Newton-Cotes rules) approximate the function f(x) by a piecewise polynomial, subdividing the space into deterministic, evenly distributed regions. These techniques can be made adaptive by using nested quadrature rules [Press et al. 2007].

While quadrature rules are biased, their convergence depends on the nature of the signal and is strongly affected by the curse of dimensionality. However, polynomial approximations similar to Newton-Cotes rules satisfy many properties that make them interesting as the control variate h(x): their evaluation is efficient, their integral is analytic, their construction is lightweight and adaptive, they can approximate any function f(x) up to a certain degree of accuracy, and they provide an estimate of the error that can be used as p_h(ū_i) for importance sampling the residual. Our method is illustrated in Figure 2.

In the following, we first describe the (multidimensional) polynomial approximation of f(x) and its adaptive generalization. Then, we describe how we use h(x) as a control variate to solve Equation (6), which we later move to primary space as in Equation (8). Finally, we analyze the convergence of our technique as a function of the dimensionality of the signal.

Let us assume for now that f(x) ∈ ℝ, with x ∈ Ω = ℝ (we generalize to ℝ^D later in this subsection). Based on Newton-Cotes composite rules, we build our control variate h(x) as a piecewise approximation of the signal.
We divide the integration domain Ω into M smaller disjoint subdomains Ω_r = [a_r, b_r], so that ∪_{r=1}^{M} Ω_r = Ω and Ω_r ∩ Ω_s = ∅ for all r ≠ s. For each disjoint subdomain Ω_r, we approximate f(x), with x ∈ Ω_r, by a polynomial

  f(x) ≈ h_r(x) = ∑_{i=0}^{n} c_{r,i} x^i,    (10)

where n is the order of the polynomial defined in Ω_r (order two in our case) and c_{r,i} are its coefficients. The coefficients c_{r,i} are calculated by interpolating a set of uniformly distributed samples f(x_{r,i}), with x_{r,i} ∈ Ω_r for i ∈ [0, n], where x_{r,0} = a_r, x_{r,i+1} = x_{r,i} + h_r and h_r = (b_r − a_r)/n. We interpolate through a precomputed linear system of equations over a monomial basis, by inverting the Vandermonde matrix that defines such a system. This approach naturally extends to higher-order rules and multiple dimensions.

As the polynomial can be integrated analytically, through integration by substitution we can obtain weights w_{r,i} that define the order-n quadrature rule as

  ∫_{Ω_r} f(x) dx ≈ H_r = ∑_{i=0}^{n} w_{r,i} f(x_{r,i}),    (11)

which is the standard approach for deriving the weights of quadrature rules. In general, for low-order known quadrature rules (such as Simpson's rule, used in this paper) there is no need to derive such weights, because they can be found in the corresponding literature. We can compute the integral over the full domain Ω as the sum of the integrals of all regions, H = ∑_r H_r.

Generalizing to ℝ^D. For the multidimensional case, where Ω_r ∈ ℝ^D = {[a_{r,1}, b_{r,1}] × ··· × [a_{r,D}, b_{r,D}]}, we generalize Equation (10), with x ∈ ℝ^D and x = {x_1 ··· x_D}, as

  h_r(x) = ∑_{i_1=0}^{n} ··· ∑_{i_D=0}^{n} c_{r,{i_1···i_D}} ∏_{j=1}^{D} x_j^{i_j},    (12)

where c_{r,{i_1···i_D}} are the polynomial coefficients.
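The per-region construction of Equations (10) and (11) can be sketched in a few lines. This is an illustrative re-implementation, not the paper's code: it fits the interpolating polynomial by solving the Vandermonde system over the monomial basis and then integrates the polynomial analytically. For n = 2 the result coincides exactly with Simpson's rule:

```python
import numpy as np

def piecewise_poly_region(f, a, b, n=2):
    """Fit an order-n polynomial to f on [a, b] through n+1 equally spaced
    samples (Eq. 10) and return its coefficients and its analytic integral
    over [a, b] (Eq. 11)."""
    x = np.linspace(a, b, n + 1)
    V = np.vander(x, n + 1, increasing=True)   # Vandermonde, monomial basis
    c = np.linalg.solve(V, f(x))               # coefficients c_{r,i}
    # Analytic integral of sum_i c_i x^i over [a, b]:
    k = np.arange(n + 1)
    H_r = np.sum(c * (b**(k + 1) - a**(k + 1)) / (k + 1))
    return c, H_r

# For n = 2 this reproduces Simpson's rule, weights (b-a)/6 * [1, 4, 1]:
c, H = piecewise_poly_region(np.sin, 0.0, np.pi / 2)
simpson = (np.pi / 2) / 6 * (np.sin(0) + 4 * np.sin(np.pi / 4) + np.sin(np.pi / 2))
print(H, simpson)   # both ≈ 1.0023 (true integral of sin on [0, π/2] is 1)
```

In practice the inverse Vandermonde matrix is precomputed once per rule, as the sample positions are fixed relative to each region.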
We calculate the coefficients using the same approach as for a single dimension, by interpolating a multidimensional grid of samples using a linear system over a multivariate monomial basis. For integration, we apply Fubini's theorem and build the multidimensional rule as the tensor product of one-dimensional rules:

  ∫_{Ω_r} h_r(x) dx = ∑_{i_1=0}^{n} ··· ∑_{i_D=0}^{n} ( ∏_{d=1}^{D} w_{r,i_d} ) f(x_{r,{i_1···i_D}}),    (13)

where the weights are obtained from the product of the one-dimensional rule's weights, and the x_{r,{i_1···i_D}} form a D-dimensional grid of sample points in Ω_r.

Multiple mappings.
We can leverage the variance reduction provided by multiple importance sampling (MIS) in Monte Carlo integration [Veach and Guibas 1995] by combining multiple mappings to reduce the error when computing H. Assuming the integration domain is the primary space (i.e. Ω_r ⊂ Ω_U), we introduce h(x) into Equation (4) and move the sum out of the integral:

  H_r = ∑_{t=1}^{T} ∫_{Ω_r} W_t(P_t⁻¹(x)) h(x) dx = ∑_{i=0}^{n} w_{r,i} ∑_{t=1}^{T} W_t(P_t⁻¹(x_{r,i})) f(P_t⁻¹(x_{r,i})) / p_t(P_t⁻¹(x_{r,i})).    (14)

So far, we have not assumed any specific distribution of the regions {r} within the domain Ω. Such a distribution might be uniform (equal partitioning of the domain), but this would be suboptimal. Ideally, we would like a finer sampling rate in regions where our order-n polynomial fails at approximating f(x), while leaving a coarser sampling in areas with less error. In this context, nested quadrature rules provide the tool for adaptive numerical approximation. The key idea is to use two quadrature rules of different order for approximating the same integral, using the higher-order rule as an oracle of the integrated signal F_r for each region r. The difference between the two is the estimate of the error Ê_r. This error estimate is then used to select the region to subdivide.

More formally, let H^h_r and H^l_r be the two estimates computed using quadrature rules of orders n_h and n_l respectively, with n_h > n_l:

  H^h_r = ∑_{i=0}^{n_h} w^h_{r,i} f(x^h_i)   and   H^l_r = ∑_{i=0}^{n_l} w^l_{r,i} f(x^l_i),    (15)

where x^h_i and x^l_i are the samples of each rule, and w^h_{r,i} and w^l_{r,i} their corresponding weights. For the rules to be nested, it is required that {x^l_i} ⊂ {x^h_i}, which allows reusing samples when computing both rules.
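A one-dimensional sketch of the nested evaluation in Equation (15), assuming the Simpson-Trapezoidal pair named in the text: the trapezoid's samples (the endpoints) are a subset of Simpson's samples, so the error estimate costs no additional evaluations of f.

```python
import numpy as np

def nested_error(f, a, b):
    """Estimate the integral of f on [a, b] with Simpson (n_h = 2) and the
    trapezoidal rule (n_l = 1); their absolute difference is the per-region
    error estimate. All three evaluations of f are shared between the rules."""
    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    H_high = (b - a) / 6.0 * (fa + 4.0 * fm + fb)   # Simpson
    H_low = (b - a) / 2.0 * (fa + fb)               # trapezoid, reuses fa, fb
    return H_high, abs(H_high - H_low)

H, E = nested_error(np.exp, 0.0, 1.0)
print(H, E)   # H ≈ e - 1; a large E flags a region worth subdividing
```

The higher-order estimate H_high is kept as the region's integral, while E drives the subdivision described next.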
Then, the estimate of the error is Ê_r = |H^h_r − H^l_r|. We use the Simpson-Trapezoidal nested rule (n_h = 2, n_l = 1).

Subdivision strategy.
Most nested quadrature rules use a tolerance parameter, subdividing until the error falls below a threshold. In our context we cannot use this approach, since we would like to specify a sample budget instead. Our algorithm iteratively subdivides the region r with the highest Ê_r, until we reach the input budget of samples N_h. To efficiently obtain the region with maximum error, we store the regions at a given step in a heap structure, which is updated on each iteration. For each subdivision, we split the top of the heap using binary splitting along the dimension of highest error. Taking into account that a subset of the samples of each subregion comes from the split region, the sample count N_h is linear in the number of regions M, following

  N_h = (n_h + 1)^D + (M − 1) n_h (n_h + 1)^{D−1}.    (16)

Note that, depending on the (deterministic) positions of the samples {x^h_i}, high-frequency features might be missed by the error estimation. This can lead to regions with an inaccurate polynomial approximation h_r(x) that are kept stagnant (i.e. never subdivided). To avoid this pitfall, we add a term to the error that accounts for the size of the region, so that larger inaccurate regions can also be subdivided. As the error estimation must be calculated per dimension d (to split the dimension of highest error), the final form of Ê_{r,d} is

  Ê_{r,d} = |H^{h,d}_r − H^{l,d}_r| + (b_{r,d} − a_{r,d}) ϵ,    (17)

where H^{l,d}_r is the integral of the control variate h_r(x) using the higher-order rule h for all dimensions except dimension d (which applies the lower-order rule l), a_{r,d} and b_{r,d} are the lower and upper limits of the integration domain Ω_r for dimension d, and ϵ is a positive constant. Intuitively, ϵ is related to the uniformity of the subdivisions: larger values lead to a more uniform distribution of region sizes, while smaller values lead to subdivisions proportional to the estimated error.
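The heap-driven subdivision loop can be sketched in one dimension as follows. This is an illustrative sketch, not the paper's implementation: the budget counts regions rather than samples, the EPS value is arbitrary (the paper's constant is not reproduced here), and sample reuse between parent and child regions is omitted for brevity.

```python
import heapq
import numpy as np

EPS = 1e-5  # region-size term of Eq. (17); value chosen for illustration

def region_error(f, a, b):
    """Simpson-trapezoid difference plus a size term (1D analogue of Eq. 17)."""
    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    simpson = (b - a) / 6.0 * (fa + 4.0 * fm + fb)
    trapezoid = (b - a) / 2.0 * (fa + fb)
    return abs(simpson - trapezoid) + (b - a) * EPS, simpson

def adaptive_regions(f, a, b, budget):
    """Repeatedly split the region with the largest estimated error, kept in
    a max-heap, until `budget` regions exist; return the summed integral."""
    e, H = region_error(f, a, b)
    heap = [(-e, a, b, H)]                # negate: heapq is a min-heap
    while len(heap) < budget:
        _, a, b, _ = heapq.heappop(heap)  # region with the highest error
        m = 0.5 * (a + b)
        for lo, hi in ((a, m), (m, b)):   # binary split
            e, H = region_error(f, lo, hi)
            heapq.heappush(heap, (-e, lo, hi, H))
    return sum(H for _, _, _, H in heap)  # H = sum of per-region integrals

H = adaptive_regions(lambda x: np.sqrt(x), 0.0, 1.0, budget=64)
print(H)   # close to 2/3; splits concentrate near the sqrt singularity at 0
```

Because the error estimate is largest where the quadratic fit fails, the subdivision automatically clusters regions around the non-smooth part of the integrand.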
We set ϵ empirically to a small positive value. Figure 3 shows our polynomial approximation (the control variate) and the residual for two two-dimensional functions: the control variate accurately captures the low-frequency regions of the function, while the high-frequency details remain in the residual.

Control variate for subdomains and bucketing.
While the control variate h(x) is defined over the full integration domain Ω, it can also be applied to any subdomain Ω_b ⊂ Ω. While the integral over the whole domain Ω is H = ∑_r H_r, the integral over the subdomain is

  ∫_{Ω_b} h(x) dx = ∑_r ∫_{Ω_r ∩ Ω_b} h_r(x) dx.    (18)
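Since each region stores an analytically integrable polynomial, Equation (18) amounts to clipping every region against the subdomain and integrating the clipped piece. A minimal 1D sketch, assuming each region is stored as an interval plus monomial coefficients (illustrative data layout, not the paper's):

```python
import numpy as np

def integrate_bucket(regions, b_lo, b_hi):
    """Integrate a 1D piecewise polynomial over a bucket [b_lo, b_hi] by
    clipping each region against the bucket (Eq. 18). `regions` is a list of
    (a_r, b_r, coeffs) with coeffs over the monomial basis."""
    total = 0.0
    for a, b, c in regions:
        lo, hi = max(a, b_lo), min(b, b_hi)   # Ω_r ∩ Ω_b
        if lo >= hi:
            continue                          # region outside the bucket
        k = np.arange(len(c))
        total += np.sum(c * (hi**(k + 1) - lo**(k + 1)) / (k + 1))
    return total

# h(x) = x on [0, 0.5) and h(x) = 1 - x on [0.5, 1]; bucket [0.25, 0.75]:
regions = [(0.0, 0.5, np.array([0.0, 1.0])), (0.5, 1.0, np.array([1.0, -1.0]))]
val = integrate_bucket(regions, 0.25, 0.75)
print(val)   # 0.1875, i.e. 2 * ∫_{0.25}^{0.5} x dx
```

The same clipping generalizes to axis-aligned boxes in higher dimensions, since each region is itself an axis-aligned box.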
Fig. 3. Integration of two two-dimensional functions (a), their piecewise polynomial approximation used as control variate (b, boundaries of each region in green), and the corresponding residual (c, where red and blue are positive and negative residual, respectively).
This is especially useful when bucketing the same integrand into a set of bins (e.g. the pixels of an image or video). In these cases, the same control variate h(x) can be applied for computing all buckets, effectively amortizing the construction of the control variate over multiple buckets. In Sections 6 to 8 we apply this strategy in image space, where each pixel is an independent bucket but the control variate is shared among all pixels. Furthermore, in Section 7 we compare this bucketing strategy against computing the control variate per pixel, showing faster convergence and higher pixel coherency when bucketing.

So far we have described the adaptive construction of our piecewise polynomial approximation of the integrand in the primary domain. Now we describe how we compute the estimate in Equation (8). In order to reduce the variance of the estimate, we would like to draw samples with a pdf p_h(ū) that is approximately proportional to the residual, so that p_h(ū) ∝ f(P⁻¹(ū))/p(P⁻¹(ū)) − α h(ū). Assuming that the error Ê_{r,d} guiding the construction of our control variate (Equation (17)) is a good estimate of the residual, and that the regions r subdividing the primary domain have roughly similar error, we can uniformly sample a region with probability 1/M, and then sample uniformly within the chosen region. The resulting pdf is p_h(ū) = 1/(M |Ω_r(ū)|), where |Ω_r(ū)| is the hypervolume of the selected region r, and M is the number of regions. Note that this pdf is applied only for integrating the residual in primary space, on top of any other importance sampling strategy used for the corresponding application. When bucketing (see the last paragraph of the previous section), we first stratify per bucket (pixel), search all regions of the control variate falling in the bucket, and then uniformly sample each region within the bucket using p_h(ū).
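The residual pdf p_h(ū) = 1/(M |Ω_r(ū)|) is straightforward to sample and evaluate. A 1D sketch under illustrative assumptions (a hand-picked partition of [0, 1]; the paper's partition comes from the adaptive subdivision):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_residual(regions, n):
    """Sample the residual pdf of the text: pick one of the M regions
    uniformly, then sample uniformly inside it, so that
    p_h(u) = 1 / (M * |Ω_r(u)|). `regions` is a list of 1D intervals
    (a_r, b_r) partitioning [0, 1]."""
    M = len(regions)
    idx = rng.integers(0, M, size=n)
    a = np.array([r[0] for r in regions])[idx]
    b = np.array([r[1] for r in regions])[idx]
    u = a + (b - a) * rng.random(n)
    pdf = 1.0 / (M * (b - a))
    return u, pdf

# Estimating ∫_0^1 u du = 0.5 with a deliberately uneven partition:
regions = [(0.0, 0.1), (0.1, 0.4), (0.4, 1.0)]
u, pdf = sample_residual(regions, 200_000)
est = np.mean(u / pdf)
print(est)   # ≈ 0.5: dividing by p_h keeps the estimator unbiased
```

Dividing each sample by its pdf cancels the non-uniform region sizes, which is what makes the residual estimate unbiased regardless of how the adaptive subdivision partitioned the domain.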
We select a per-bucket optimal value α = Cov[⟨F⟩, ⟨H⟩]/Var[⟨H⟩] (see Section 3.3), where we estimate Cov[⟨F⟩, ⟨H⟩] and Var[⟨H⟩] from the set of random samples falling within the bucket.

Here we analyze the performance of our technique as a function of the samples used for building the control variate (built using a Simpson-Trapezoidal nested rule) and the samples used for computing the residual. We integrate a number of functions of increasing dimensionality (from 2D to 4D), and include the boundary cases, i.e. pure Monte Carlo and pure quadrature, for comparison. An analysis for each individual function can be found in the supplemental material (Section S.1).

Figure 4 shows the average error, cost, and the product of cost and error for each function dimensionality when integrating the full domain (top), and when projecting the integral into buckets (bottom). The horizontal and vertical axes represent the number of samples used for generating the control variate and for computing the residual, respectively. As expected, the increased dimensionality slows down the convergence rate of the control variate, while the residual converges at the usual Monte Carlo rate. In terms of cost, the samples generating the control variate are more expensive than Monte Carlo samples, especially when integrating the full domain (top row). However, this cost is amortized when subdividing the integration domain into buckets (bottom row). Pure Simpson-Trapezoidal quadrature integration (bottom row of each graph, marked as ST) seems to converge relatively fast, but its convergence is irregular and it introduces bias that translates into perceivable artifacts. These artifacts, as well as higher-order nested rules, are explored in the supplemental material (Section S.4). By computing the efficiency of the integration (as a function of time and error, and of the number of samples and error), we found that there is an optimal trade-off between the samples allocated to the control variate and to the residual.
Such optimal trade-off is, on average, one sample for the control variate out of three for full integrals, and one sample out of 16 when amortizing among different buckets (white dashed line in the efficiency maps). These ratios are used for all the results in this paper.

We implemented our adaptive control variate as a generic template in C++. It is agnostic to the nature of the integrated function, and easy to integrate into other systems. We plug it into Mitsuba [Jakob 2010], which provides the function to be integrated. We compute the results on an Intel Xeon Gold 6400 3.7 GHz CPU workstation with 256 GB of RAM. We measure the error using the root mean square error (RMSE).

We build the control variate using a Simpson-Trapezoidal nested rule, which results in an order-two polynomial. For each iteration, we deterministically draw three samples per dimension. For bucketing we use a box filter as the reconstruction kernel; including other kernels with analytical integration is left as future work. For the residual, we randomly sample the regions as described in Section 4.3, using a 64-bit Mersenne Twister random number generator. Based on our analysis in Section 4.4, in all our results we allocate 1/3 (full integrals) and 1/16 (samples amortized by bucketing) of the total samples to building the control variate, while the rest are used to compute the residual.
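For reference, a 1D sketch of the nested Simpson-Trapezoidal rule: the same three evaluations yield both an order-two (Simpson) estimate and an order-one (trapezoidal) estimate, and their difference serves as a per-region error estimate of the kind that drives adaptive subdivision. The names below are illustrative, not our actual interface:

```cpp
#include <cmath>
#include <functional>

struct NestedEstimate {
    double simpson;   // order-two estimate (the one kept for the control variate)
    double trapezoid; // order-one estimate from the same three samples
    double error;     // |simpson - trapezoid|, usable as a subdivision criterion
};

// Evaluate f at the two endpoints and the midpoint of [a, b], and return
// both nested quadrature estimates (three evaluations, fully reused).
NestedEstimate simpson_trapezoid(const std::function<double(double)>& f,
                                 double a, double b) {
    const double fa = f(a), fm = f(0.5 * (a + b)), fb = f(b);
    NestedEstimate e;
    e.simpson   = (b - a) * (fa + 4.0 * fm + fb) / 6.0;
    e.trapezoid = (b - a) * (fa + fb) / 2.0;
    e.error     = std::fabs(e.simpson - e.trapezoid);
    return e;
}
```

On f(x) = x² over [0, 1] the Simpson estimate is exact (1/3), while the trapezoidal estimate (1/2) exposes the curvature of the integrand through the error term.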
Fig. 4. Average error, cost, and efficiency maps (brighter means higher, in logarithmic scale) for a set of integrals of increasing dimensionality, as a function of the number of samples allocated to building the control variate and to integrating the residual (horizontal and vertical axes, respectively). The leftmost column of each map represents Monte Carlo integration, while the bottom row of each map represents nested quadrature (Simpson-Trapezoidal). Top row: integration over the full 2D domain. Bottom row: integration into buckets (pixels); the sample count represents samples per bucket. The white lines show the optimal ratio between the number of samples allocated to computing the control variate and the residual.

Detailed cost breakdown for all our results can be found in the supplemental (Section S.3).
Here we apply our technique to the computation of transmittance in heterogeneous participating media. As light travels from position x_0 to x_1 through a participating medium, it is attenuated following the one-dimensional integral T(x_0, x_1):

T(x_0, x_1) = exp(−τ) = exp(−∫_0^t µ(x_s) ds), (19)

with t = ∥x_1 − x_0∥, µ(x) the extinction coefficient at x, x_s = x_0 + s ω, and ω = (x_1 − x_0)/t.

Several unbiased Monte Carlo-based methods have been proposed to numerically solve Equation (19), based on the key idea of introducing null virtual particles that fill the medium, resulting in a constant virtual extinction (the majorant µ̄; see [Novák et al. 2018] for an in-depth overview of the topic), at the cost of introducing variance. Residual ratio tracking [Novák et al. 2014] reduces variance by introducing a control extinction µ_c, transforming Equation (19) into

T(x_0, x_1) = exp(−∫_0^t (µ(x_s) − µ_c) ds − µ_c t). (20)

Note that the estimate of τ in Equation (20) is essentially Equation (5) with α = 1. Unfortunately, this approach uses a constant µ_c, which works well if the extinction µ(x_s) varies only slightly around µ_c, but might increase variance if µ_c diverges from the actual extinction. While in practice this is partially solved by using a piecewise-constant (or linear) estimate of µ_c, doing so requires precomputing a supervoxel hierarchy, which limits its applicability to voxelized media, while still requiring heuristics to solve special cases. Instead, we propose to use our adaptive polynomial approximation as the control extinction µ_c(x_s).

We analyze the performance of our technique against residual ratio tracking with a constant precalculated µ_c (set to the average extinction µ_c = (1/t) ∫_0^t µ(x_s) ds, which is the optimal parameter according to the authors) and delta tracking [Woodcock et al. 1965]. In all cases we use the same tight majorant µ̄ = max_x µ(x).

Fig. 5. Renders of two purely absorbing media, with high (first row, Hetvol) and low (second row, Smoke) densities, computed using delta tracking [Woodcock et al. 1965], residual ratio tracking [Novák et al. 2014], and our adaptive residual ratio tracking (left image). The three methods use approximately the same number of media queries. The convergence of each method on both scenes can be found in Figure 6.
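To make the connection to Equation (5) concrete, the following sketch (illustrative names; a plain N-sample estimator rather than our full tracking-based integration) estimates the optical depth τ with a control extinction µ_c whose line integral is known analytically:

```cpp
#include <functional>
#include <random>

// Estimate tau = \int_0^t mu(s) ds using mu_c as a control variate:
//   tau ~= \int_0^t mu_c(s) ds + (t/N) * sum_i (mu(s_i) - mu_c(s_i)),
// with s_i uniform in [0, t]. If mu_c matches mu, the estimate is exact.
double estimate_optical_depth(const std::function<double(double)>& mu,
                              const std::function<double(double)>& mu_c,
                              double tau_c,  // analytic \int_0^t mu_c(s) ds
                              double t, int N, std::mt19937_64& rng) {
    std::uniform_real_distribution<double> u(0.0, t);
    double residual = 0.0;
    for (int i = 0; i < N; ++i) {
        const double s = u(rng);
        residual += mu(s) - mu_c(s);
    }
    return tau_c + t * residual / N;
}
// Caveat: exp(-tau) of an unbiased tau estimate is a biased transmittance
// estimate, which is why unbiased methods such as (residual) ratio tracking
// estimate exp(-tau) directly.
```

The better µ_c approximates µ along the ray, the smaller the residual and its variance, which is exactly what our adaptive polynomial control extinction targets.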
Fig. 6. Convergence for the scenes in Figure 5 as a function of media queries (left and middle right) and core time (middle left and right) for delta tracking, residual ratio tracking using the average extinction as the control extinction, and our adaptive control variate.
Fig. 7. Equal-samples (64 spp) comparison between Monte Carlo, Simpson-Trapezoid quadrature [Muñoz 2014], and our technique for computing single scattering from a point light source in isotropic homogeneous media. Our technique yields more accurate results, recovering both the smooth global structure of light transport and the high-frequency details of the scene, while remaining unbiased.
Fig. 8. Convergence for the scenes in Figure 7 for Monte Carlo integration, Simpson-Trapezoid quadrature [Muñoz 2014], and our technique, as a function of the number of samples and core time.

We build our control variate performing three iterations, which results in a small overhead (just nine additional medium queries). Figure 5 shows a comparison between the three techniques at an equal number of media queries, for two absorbing heterogeneous media with high (Hetvol, left) and low density (Smoke, right). Without introducing a spatially-varying control extinction µ_c (using e.g. supervoxels), residual ratio tracking introduces noise in regions where the extinction deviates significantly from µ_c, resulting in higher variance than delta tracking. While this could be alleviated by subdividing the space into subvolumes with tighter majorants and control extinctions, these would also benefit our method.

Figure 6 shows the convergence of the three methods. As expected, the performance of residual ratio tracking and of our method relates to the quality of the approximation. When residual ratio tracking performs well, our technique in general performs similarly. However, when a constant control fails at representing the medium extinction (e.g., in cases with non-uniform densities), our technique adapts to the signal without introducing a significant overhead. We refer to the supplemental (Section S.2) for more examples.

We apply our technique to computing one- and two-bounce scattering in homogeneous media from a point light source (1D integral) and a collimated beam (2D integral), respectively.
In both cases, we want to compute the radiance at a point x as seen from direction ω as

L(x, ω) = ∫_0^t T(x, x_s) σ_s L_i(x_s, ω) ds, (21)

where t is the distance to the intersection of the ray, x_s = x − ω s, T(x, x_s) = e^{−σ_t ∥x_s − x∥} is the transmittance between x and x_s, σ_t and σ_s are the extinction and scattering coefficients, and L_i(x_s, ω) is the in-scattered radiance. For light incoming from a point source,

L_i(x_s, ω) = Φ_l / ∥x_s − x_l∥² · V(x_l, x_s) T(x_l, x_s) ρ(x_l → x_s → x), (22)

where x_l and Φ_l are the light's position and intensity, V(x_l, x_s) is the binary visibility term, and ρ(x_l → x_s → x) is the phase function at x_s.

In the case of the light source being a collimated beam defined by position x_l and direction ω_l, L_i(x_s, ω) becomes an additional 1D integral [Novák et al. 2012],

L_i(x_s, ω) = ∫_0^{t'} Φ_l / ∥x_s − x_{s'}∥² · V(x_s, x_{s'}) T(x_l, x_{s'}) T(x_s, x_{s'}) σ_s ρ(x_l → x_{s'} → x_s) ρ(x_{s'} → x_s → x) ds', (23)

where t' is the distance to the intersection of the light beam, with x_{s'} = x_l + ω_l s'. We amortize the cost of generating the control variate across pixels, by bucketing an additional integration domain (the image plane). This results in two integration problems of three (point light) and four dimensions (collimated beam).

Figure 7 shows the results for single scattering in isotropic homogeneous media. We compare against pure Monte Carlo, as well as the quadrature-based integration proposed by Muñoz [2014] for single scattering. In all cases, we use equiangular sampling for mapping to primary space [Kulla and Fajardo 2012]. Our technique outperforms both competitors, since it is able to adaptively generate a smooth control variate along the three dimensions of the problem, while recovering high-frequency details by means of the Monte Carlo estimate of the residual.
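Equiangular sampling maps a uniform random number to a distance along the ray with pdf proportional to the inverse squared distance to the light, following the construction of Kulla and Fajardo [2012]. A 1D sketch of the mapping and its pdf (variable names are ours):

```cpp
#include <cmath>

// Sample a distance s in [a, b] along a ray, with pdf proportional to
// 1 / (D^2 + s^2), where D is the perpendicular distance from the ray to
// the light and s is measured from the point on the ray closest to it.
double equiangular_sample(double u, double D, double a, double b, double& pdf) {
    const double ta = std::atan2(a, D);
    const double tb = std::atan2(b, D);
    const double s  = D * std::tan(ta + u * (tb - ta));
    pdf = D / ((tb - ta) * (D * D + s * s));
    return s;
}
```

In primary space, our control variate is built over the uniform variable u, so the smooth equiangular-warped integrand is exactly the kind of signal the polynomial approximation captures well.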
As shown in Figure 8, the ability to handle both low- and high-frequency parts of the integrand results in better convergence than both alternative limit cases.

Fig. 9. Equal-samples (64 spp) comparison between Monte Carlo and our technique for computing two-bounce scattering from a collimated beam in isotropic homogeneous media. While pure Monte Carlo generates high-frequency noise, our approach excels at the smooth regions, while accurately handling the sharp details.
Fig. 10. Convergence for the scenes in Figure 9 for Monte Carlo integration and our technique, as a function of the number of samples and core time.
Similar trends can be found for the case of two-bounce scattering, as shown in Figures 9 and 10. In this case, we use the two-dimensional mapping proposed by Novák et al. [2012]. Again, our control variate is able to recover most of the low frequencies common in scattering media, while the details are handled by means of Monte Carlo integration of the residual.
Here we compute direct illumination at a point x as seen from a direction ω. We solve the rendering equation as an integral over all points x_l on the surface of the emitters A:

L(x, ω) = ∫_A Φ(x_l → x) B(x_l → x → ω) V(x ↔ x_l) G(x ↔ x_l) dx_l, (24)

where Φ(x_l → x) is the radiance emitted at x_l towards x, B(x_l → x → ω) is the bidirectional reflectance distribution function (BRDF) at x, V(x ↔ x_l) is the binary visibility term, and G(x ↔ x_l) is the geometric attenuation.

As discussed in Section 3.2, our technique leverages the use of multiple mappings in primary space (Equation (4)) in our adaptive polynomial control variate. We solve Equation (24) by combining BRDF and emitter sampling techniques using the power heuristic [Veach and Guibas 1995]; we illustrate the effect of each technique in Figure 11. Note that other sophisticated sampling methods could be applied on top of our technique.

Fig. 11. MIS Test: Comparison of our approach with individual specific mappings to primary space (left column), only sampling the emitter (top) or the BRDF (bottom). The right column shows results with both combined mappings (MIS) with our technique (top) and Monte Carlo (bottom). Notice how the ability to exploit multiple mappings better fits our control variate to the integrand.
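The power heuristic used to combine the two mappings can be sketched as follows (β = 2, as recommended by Veach and Guibas [1995]; the function name is ours):

```cpp
// Power heuristic (beta = 2) for combining two sampling techniques:
// weight of a sample drawn from the technique with pdf `p_self`, given
// that the other technique would have generated it with pdf `p_other`.
double power_heuristic(double p_self, double p_other) {
    const double s = p_self * p_self;
    const double o = p_other * p_other;
    return s / (s + o);
}
```

By construction, the weights of the two techniques for the same sample sum to one, so the combined estimator remains unbiased regardless of how the pdfs overlap.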
Figure 12 shows a visual comparison of several scenes with different types of emitters and materials. We compare the performance of computing the control variate per pixel ("Ours 2D", resulting in a 2D integral per pixel) against building a single control variate on the full image ("Ours 4D", resulting in a single bucketed 4D integral). Both approaches result in less noise than Monte Carlo for the same number of samples. In addition, bucketing the full image space ("Ours 4D") results in both less error and less structure in the noise. Figure 13 shows the convergence for the three scenes: In all cases there is a similar trend, with faster convergence of our technique, especially when bucketing the full 4D integration domain.
As a final application, we use our technique for rendering distributed effects such as motion blur or depth of field [Cook et al. 1984], which increase the dimensionality of the light transport problem by one and two dimensions (time and aperture, respectively). We assume a constant shutter time for motion blur, and a thin-lens model for depth of field. In all cases, we amortize samples along pixels, increasing the dimensionality of our control variate with the additional dimensions of the sensor.

We compare our method against Monte Carlo integration and Hachisuka et al.'s multidimensional adaptive technique [2008] in four different test scene setups (Figure 14): Pool (3D) includes motion blur, Chess (4D) includes depth of field, Helicopter (5D) features both motion blur and area lighting (Section 7), and Volley Balls (6D) includes both depth of field and area lighting. Our approach generates low-noise results even in challenging scenarios such as rotational motion blur (Helicopter). In contrast, Hachisuka
Fig. 12. Comparison of the different approaches of our technique against Monte Carlo integration for the same number of evaluations of the direct illumination. In all cases, Monte Carlo produces noisier images even with MIS. In contrast, our technique leverages MIS, adapting the control variate to the integrand, yielding better results both per pixel ("Ours 2D") and for the whole image space ("Ours 4D"). Furthermore, amortizing the control variate over the whole image space reduces noise in low-frequency areas, removes structured noise, and serves as antialiasing. All results are computed using 155 spp.
Fig. 13. Convergence of the scenes in Figure 12 as a function of the number of samples and core time for Monte Carlo, our technique applied per pixel ("Ours 2D"), and our technique extended to the full image space and bucketed per pixel ("Ours 4D"). Notice that extending our control variate to 4D results in faster convergence.

et al.'s method [2008], being biased, overblurs the result due to its reconstruction kernel (e.g., the focused ball in Volley Balls or the glossy reflections in Helicopter), although it produces noiseless results in smoother areas.

Figure 15 shows the convergence of our method, compared with Monte Carlo and Hachisuka et al.'s. Our method converges faster than Monte Carlo in all the scenes. However, the additional cost of building and evaluating the control variate might result in a time penalty in scenes with relatively cheap sampling (e.g., scenes with simple geometry like Helicopter or Volley Balls). Still, note that the slope of convergence shows that our approach pays off in the long run. We refer to Section S.3 of the supplemental material for the per-scene, per-stage temporal cost breakdown. Our method also converges faster than the method by Hachisuka et al. [2008], with better or on-par performance with respect to samples per pixel, and outperforming it in terms of computational cost. Finally, while Hachisuka et al.'s technique introduces a heavy memory overhead (×140 on average compared with Monte Carlo), our method introduces a significantly smaller memory footprint.

Since our control variate is based on quadrature rules, it is limited by the curse of dimensionality. This unfortunately limits its applicability to relatively low-dimensional integration domains. However, general integration problems in rendering are of arbitrary dimensionality. In this section we analyze the performance of our control variate in high-dimensional problems, by building a low-dimensional control variate on top of an estimate of the high-dimensional integral.

Let us rewrite Equation (3) as two nested integrals on orthogonal subdomains,

F = ∫_{Ω_U^L} ( ∫_{Ω_U^*} g({ū | û}) dû ) dū = ∫_{Ω_U^L} g(ū) dū, (25)

where the inner integral defines g(ū), with g({ū | û}) = f(P^{-1}({ū | û})) / p(P^{-1}({ū | û})); the integration domain Ω_U ⊂ R^D is Ω_U = Ω_U^L × Ω_U^*, with Ω_U^L ⊂ R^L and Ω_U^* ⊂ R^{D−L}, and {ū | û} ∈ R^D is the concatenation of the variables ū and û. We will construct our control variate on Ω_U^L. For that, we need to evaluate the integrand function g(ū) over the set of samples ū ∈ Ω_U^L. Unfortunately, this
Fig. 14. Comparison between our approach (left column), Monte Carlo, and previous related work [Hachisuka et al. 2008] in 4 different scenes (with increasing dimensionality) at an equal number of samples (64 spp). The scenes feature several distributed effects, including motion blur, depth of field, and soft shadows. In all cases, Monte Carlo produces renders with high variance, while Hachisuka et al.'s approach achieves good results in smooth regions, but tends to overblur the sharp regions of the scene. In contrast, our unbiased method outperforms previous work, keeping the high-contrast areas sharp.
Fig. 15. Convergence for the results in Figure 14 as a function of the number of samples and core time, for pure Monte Carlo, Hachisuka et al.'s [2008], and our method. The incomplete graphs of Hachisuka et al.'s technique are due to impractical memory consumption at high sample counts.

requires solving a (D − L)-dimensional integral, which is unlikely to have an analytical form. Instead, we rely on simple Monte Carlo integration of this high-dimensional domain, so that g(ū) ≈ (1/N*) Σ_i g({ū | û_i}). Note that this has two main drawbacks: it only leverages the variance reduction of our control variate for the first L dimensions of the integral, and the control variate is itself built on imperfect samples of the integral, which might result in an inaccurate control variate. In fact, the Monte Carlo estimate introduces variance on top of the error driving the construction of the control variate (Equation (17)). While in our experiments we have found that this additional variance has a small effect on the final result, even at relatively low N*, exploring a variance-aware version of Equation (17) that accounts for the uncertainty of the control variate when computing α is an interesting avenue for future work.

Figures 1 and 16 illustrate the results of this approach with high-order indirect illumination, where our control variate is only four-dimensional, accounting for image space and direct illumination (Cornell Box I, Cornell Box II, and Bistro) or image space and depth of field (Chess GI). Four Monte Carlo samples are used for computing g(ū) when building the control variate (i.e., N* = 4).

Finally, we show that bucketing (Section 4.2) is not limited to image space (pixels), but can be generalized to higher-dimensional functions. We add the temporal dimension by rendering a video, amortizing the samples of the control variate across all pixels and frames.
Figure 18 shows a set of frames of a video rendered with a moving area light source in the Violin scene, while the supplementary video shows the full sequence, plus videos from other applications, including single scattering (Pumpkin) and varying distributed effects (Chess). As expected, our integration technique produces less noise than Monte Carlo, significantly reducing flickering (temporal noise) by exploiting temporal consistency.
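The projection strategy of this section, evaluating g(ū) by Monte Carlo integration of the remaining D − L dimensions, can be sketched as follows (illustrative interface; `g_full` plays the role of g({ū | û}), with the orthogonal subdomain taken as [0,1]^{D−L}):

```cpp
#include <functional>
#include <random>
#include <vector>

// Estimate the low-dimensional integrand g(u) by Monte Carlo integration of
// the remaining dimensions: g(u) ~= (1/N) * sum_i g_full(u, v_i), where the
// v_i are uniform samples of the orthogonal subdomain [0,1]^(D-L).
double project_integrand(
    const std::function<double(const std::vector<double>&,
                               const std::vector<double>&)>& g_full,
    const std::vector<double>& u, size_t extra_dims, int N, std::mt19937_64& rng) {
    std::uniform_real_distribution<double> u01(0.0, 1.0);
    double sum = 0.0;
    std::vector<double> v(extra_dims);
    for (int i = 0; i < N; ++i) {
        for (size_t d = 0; d < extra_dims; ++d) v[d] = u01(rng);
        sum += g_full(u, v);
    }
    return sum / N;
}
```

The control variate is then fit to these noisy projected values; the residual estimator evaluates the full-dimensional g({ū | û}) so the overall estimate remains unbiased.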
10 DISCUSSION
In this paper we have presented a novel Monte Carlo-based integration technique that takes advantage of variance reduction through both adaptive control variates and importance sampling. We combine both by working in primary sample space, which seamlessly allows the use of any sample distribution. We design our control variates as a multidimensional adaptive piecewise-polynomial approximation of the signal, inspired by nested quadrature rules. This allows
Fig. 16. Comparison between Monte Carlo and our approach (left column) while dealing with high-dimensional integrals. Notice that even when our control variate is four-dimensional, our approach can handle fifteen indirect bounces without incurring the curse of dimensionality (results are computed using an average of 256 spp).
Fig. 17. Convergence curves of the scenes in Figure 16, as a function of the number of samples and core time, for both Monte Carlo and our approach.

us to accurately reconstruct the low frequencies of the integral using the control variate, and to leverage Monte Carlo integration of the residual for handling high frequencies. The combination of both allows for faster convergence than previous approaches, while remaining unbiased.

We have demonstrated the applicability of our technique in four different complementary rendering applications: transmittance estimation in heterogeneous participating media, low-order scattering in homogeneous media, direct illumination computation, and rendering of distributed effects. All of them show fast convergence, accurate results, and reasonably low memory requirements. Note that our technique is generic, not tied to any specific integrand, and could be used in other problems involving numerical computation of multidimensional integrals with complex structure. We will provide the source code, aiming to inspire other applications of our method.

The presented integration technique works in primary space, and is orthogonal to specific importance sampling strategies. Therefore, it can be used in combination with other works that introduce sophisticated sampling strategies [Vévoda et al. 2018; Vorba et al. 2014; West et al. 2020]. Furthermore, another avenue of future work could be to combine our work with modern denoising techniques [Bako et al. 2017; Gharbi et al. 2019], which can remove the high-frequency noise coming from the integration of the residual.
A preliminary test in this direction can be found in Section S.5 of the supplemental.

The main limitation of our technique comes from the curse of dimensionality: The generation of our control variate is based on nested quadrature rules, which scale poorly when the number of
Fig. 18. Violin: Selected frames (0, 18, 36, and 50) of the same sequence rendered independently with Monte Carlo and with our method. All videos use 16 spp per frame, and we computed 60 frames in total. Note how distributing samples in time, as our adaptive stage does, helps to reduce variance in the final video. Full sequences can be seen in the supplementary video.

dimensions is very high. While our approach allows the sampling rate to be linear with respect to iterations, it is still exponential in the dimensionality. Therefore, our control variate is fixed to a finite dimensionality (we tested up to six dimensions on the control variate, in the Volley Balls scene), which contrasts with the infinite dimensionality of the path integral. However, in Section 9 we have demonstrated that our technique can be applied to integrals of arbitrary dimensionality, by using Monte Carlo estimates to project high-dimensional functions onto our low-dimensional piecewise-polynomial control variate. As we have shown in our examples, this still allows for faster convergence than traditional Monte Carlo.
Future work.
To generate the control variate, we use the Simpson-Trapezoidal nested rule. Higher-order rules (Boole-Simpson) were tested, but they introduced additional costs and resulted in unwanted oscillations (Runge's phenomenon), as illustrated in Section S.4 of the supplemental material. More sophisticated nested rules (e.g., Clenshaw-Curtis or Gauss-Kronrod) were considered, but the regular distribution of samples of Newton-Cotes formulas allows for a high rate of sample reuse. Still, experimenting with other nested rules as control variates is an interesting path for future work. In addition, exploring how to fit polynomial rules from unstructured samples could lead to an on-line refinement of our control variate.

Finally, some of our findings might inspire further research. We have presented how to include multiple importance sampling within quadrature rules, through multiple mappings to primary space (Sections 3.2 and 7). We have also glimpsed a strategy to combine two different variance reduction approaches (control variates and multiple importance sampling); exploring alternative combinations is an exciting avenue for future work.
ACKNOWLEDGMENTS
We thank Ibón Guillén for comments and discussion throughout the project; Manuel Lagunas for help with figures; all the members of the Graphics & Imaging Lab that helped with proof-reading; and the reviewers for their in-depth reviews. The Pool and Chess scenes are by Hachisuka et al.; Cornell Box, House, Classroom, and MIS Test are by Benedikt Bitterli; Violin was modeled by Tahseen; Helicopter was modeled by Mond; Volley Balls models are by Shri; Dragon and Buddha are from the Stanford 3D Scanning Repository; Bistro was modeled by Amazon Lumberyard. Lightfields used in Figure 4 are courtesy of Jarabo et al. [2014]. This project has been funded by the European Research Council (ERC) under the EU's Horizon 2020 research and innovation programme (project CHAMELEON, grant No 682080) and DARPA (project REVEAL).
REFERENCES
Steve Bako, Thijs Vogels, Brian McWilliams, Mark Meyer, Jan Novák, Alex Harvill, Pradeep Sen, Tony Derose, and Fabrice Rousselle. 2017. Kernel-predicting convolutional networks for denoising Monte Carlo renderings. ACM Trans. Graph. 36, 4 (2017), 97.
Laurent Belcour, Cyril Soler, Kartic Subr, Nicolas Holzschuch, and Fredo Durand. 2013. 5D covariance tracing for efficient defocus and motion blur. ACM Transactions on Graphics (TOG) 32, 3 (2013), 31.
Laurent Belcour, Guofu Xie, Christophe Hery, Mark Meyer, Wojciech Jarosz, and Derek Nowrouzezahrai. 2018. Integrating Clipped Spherical Harmonics Expansions. ACM Trans. Graph. 37, 2 (March 2018).
Jarle Berntsen, Terje O. Espelid, and Alan Genz. 1991. An adaptive algorithm for the approximate calculation of multiple integrals. ACM Transactions on Mathematical Software (TOMS) 17, 4 (1991), 437–451.
Benedikt Bitterli, Fabrice Rousselle, Bochang Moon, José A. Iglesias-Guitián, David Adler, Kenny Mitchell, Wojciech Jarosz, and Jan Novák. 2016. Nonlinearly Weighted First-order Regression for Denoising Monte Carlo Renderings. In Computer Graphics Forum, Vol. 35. Wiley Online Library, 107–117.
Jonathan Brouillat, Christian Bouville, Brad Loos, Charles Hansen, and Kadi Bouatouch. 2009. A Bayesian Monte Carlo approach to global illumination. In Computer Graphics Forum, Vol. 28. Wiley Online Library, 2315–2329.
R. L. Burden and J. Douglas Faires. 2005. Numerical Analysis (8th ed.). Thomson Brooks/Cole.
Petrik Clarberg and Tomas Akenine-Möller. 2008. Exploiting Visibility Correlation in Direct Illumination. Computer Graphics Forum 27, 4 (2008), 1125–1136.
Robert L. Cook, Thomas Porter, and Loren Carpenter. 1984. Distributed ray tracing. In Proc. of SIGGRAPH '84. 137–145.
Frédo Durand, Nicolas Holzschuch, Cyril Soler, Eric Chan, and François X. Sillion. 2005. A frequency analysis of light transport. ACM Transactions on Graphics (TOG) 24, 3 (2005), 1115–1126.
Shaohua Fan, Stephen Chenney, Bo Hu, Kam-Wah Tsui, and Yu-chi Lai. 2006. Optimizing control variate estimators for rendering. In Computer Graphics Forum, Vol. 25. Wiley Online Library, 351–357.
A. C. Genz and A. A. Malik. 1980. Remarks on algorithm 006: An adaptive algorithm for numerical integration over an N-dimensional rectangular region. J. Comput. Appl. Math. 6, 4 (1980), 295–302.
Michaël Gharbi, Tzu-Mao Li, Miika Aittala, Jaakko Lehtinen, and Frédo Durand. 2019. Sample-based Monte Carlo denoising using a kernel-splatting network. ACM Trans. Graph. 38, 4 (2019), 1–12.
Toshiya Hachisuka, Wojciech Jarosz, Richard Peter Weistroffer, Kevin Dale, Greg Humphreys, Matthias Zwicker, and Henrik Wann Jensen. 2008. Multidimensional adaptive sampling and reconstruction for ray tracing. In ACM Transactions on Graphics (TOG), Vol. 27. ACM, 33.
Stefan Heinrich. 2001. Multilevel Monte Carlo methods. In International Conference on Large-Scale Scientific Computing. Springer, 58–67.
Binh-Son Hua, Adrien Gruson, Victor Petitjean, Matthias Zwicker, Derek Nowrouzezahrai, Elmar Eisemann, and Toshiya Hachisuka. 2019. A Survey on Gradient-Domain Rendering. In Computer Graphics Forum.
ACM Transactions on Graphics (SIGGRAPH 2014) 33, 4 (2014).
Wojciech Jarosz, Craig Donner, Matthias Zwicker, and Henrik Wann Jensen. 2008. Radiance caching for participating media. ACM Trans. Graph. 27, 1 (2008), 1–11.
Henrik Wann Jensen and Per H. Christensen. 1998. Efficient simulation of light transport in scenes with participating media using photon maps. In Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques. 311–320.
Jared M. Johnson, Dylan Lacewell, Andrew Selle, and Wojciech Jarosz. 2011. Gaussian quadrature for photon beams in Tangled. In ACM SIGGRAPH 2011 Talks. ACM, 54.
James T. Kajiya. 1986. The rendering equation. In ACM SIGGRAPH Computer Graphics, Vol. 20. ACM, 143–150.
Csaba Kelemen, László Szirmay-Kalos, György Antal, and Ferenc Csonka. 2002. A simple and robust mutation strategy for the metropolis light transport algorithm. In Computer Graphics Forum, Vol. 21. Wiley Online Library, 531–540.
Alexander Keller. 2001. Hierarchical Monte Carlo image synthesis. Mathematics and Computers in Simulation 55, 1-3 (2001), 79–92.
Markus Kettunen, Marco Manzi, Miika Aittala, Jaakko Lehtinen, Frédo Durand, and Matthias Zwicker. 2015. Gradient-Domain Path Tracing. ACM Trans. Graph. 34, 4 (2015).
Ivo Kondapaneni, Petr Vévoda, Pascal Grittmann, Tomáš Skřivan, Philipp Slusallek, and Jaroslav Křivánek. 2019. Optimal multiple importance sampling. ACM Trans. Graph. 38, 4 (2019).
Christopher D. Kulla and Marcos Fajardo. 2012. Importance Sampling Techniques for Path Tracing in Participating Media. Comput. Graph. Forum 31 (2012), 1519–1528.
Peter Kutz, Ralf Habel, Yining Karl Li, and Jan Novák. 2017. Spectral and Decomposition Tracking for Rendering Heterogeneous Volumes. ACM Trans. Graph.
36, 4, Article111 (2017), 111:1–111:16 pages.Eric P Lafortune and Yves D Willems. 1995a. A 5D tree to reduce the variance of MonteCarlo ray tracing. In
Rendering TechniquesâĂŹ 95 . Springer, 11–20.Eric P Lafortune and Yves D Willems. 1995b. The ambient term as a variance reducingtechnique for Monte Carlo ray tracing. In
Photorealistic Rendering Techniques .Springer, 168–176.Ricardo Marques, Christian Bouville, Mickaël Ribardière, Luís Paulo Santos, and KadiBouatouch. 2013. A spherical Gaussian framework for Bayesian Monte Carlorendering of glossy surfaces.
IEEE transactions on visualization and computer graphics
19, 10 (2013), 1619–1632.Soham Mehta, Ravi Ramamoorthi, Mark Meyer, and Christophe Hery. 2012. AnalyticTangent Irradiance Environment Maps for Anisotropic Surfaces.
Computer GraphicsForum
31, 4 (June 2012), 1501–1508.Adolfo Muñoz. 2014. Higher Order Ray Marching.
Computer Graphics Forum
33, 8(2014), 167–176.Thomas Müller, Markus Gross, and Jan Novák. 2017. Practical Path Guiding for EfficientLight-Transport Simulation.
Computer Graphics Forum
36, 4 (2017).Thomas Müller, Brian Mcwilliams, Fabrice Rousselle, Markus Gross, and Jan Novák.2019. Neural importance sampling.
ACM Trans. Graph.
38, 5 (2019).Jan Novák, Iliyan Georgiev, Johannes Hanika, and Wojciech Jarosz. 2018. Monte Carlomethods for volumetric light transport simulation. 37, 2 (2018), 551–576.Jan Novák, Derek Nowrouzezahrai, Carsten Dachsbacher, and Wojciech Jarosz. 2012.Virtual ray lights for rendering scenes with participating media.
ACM Trans. Graph.
31, 4 (2012), 60:1–60:11.Jan Novák, Andrew Selle, and Wojciech Jarosz. 2014. Residual Ratio Tracking forEstimating Attenuation in Participating Media.
ACM Trans. Graph.
33, 6 (2014).Art B. Owen. 2013.
Monte Carlo theory, methods and examples .Ken Perlin and Eric M Hoffert. 1989. Hypertexture. In
ACM Siggraph Computer Graphics ,Vol. 23. ACM, 253–262.William H Press, Saul A Teukolsky, William T Vetterling, and Brian P Flannery. 2007.
Numerical recipes 3rd edition: The art of scientific computing . Cambridge universitypress.Ravi Ramamoorthi and Pat Hanrahan. 2001. An efficient representation for irradianceenvironment maps. In
Proceedings of the 28th annual conference on Computer graphicsand interactive techniques . 497–500.Ravi Ramamoorthi and Pat Hanrahan. 2002. Frequency Space Environment MapRendering.
ACM Trans. Graph.
21, 3 (July 2002), 517âĂŞ526.Ravi Ramamoorthi, Dhruv Mahajan, and Peter Belhumeur. 2007. A first-order analysisof lighting, shading, and shadows.
ACM Transactions on Graphics (TOG)
26, 1 (2007),2.Christian P Robert and George Casella. 2004. Monte Carlo statistical methods. (2004).Fabrice Rousselle, Wojciech Jarosz, and Jan Novák. 2016. Image-space control variatesfor rendering.
ACM Transactions on Graphics (TOG)
35, 6 (2016), 169.Fabrice Rousselle, Claude Knaus, and Matthias Zwicker. 2012. Adaptive rendering withnon-local means filtering.
ACM Trans. Graph.
31, 6 (2012), 195.Martin Šik and Jaroslav Krivanek. 2018. Survey of Markov chain Monte Carlo methods inlight transport simulation.
IEEE transactions on visualization and computer graphics (2018).Arthur H Stroud and Don Secrest. 1966. Gaussian quadrature formulas. (1966).Carlos Ureña, Marcos Fajardo, and Alan King. 2013. An area-preserving parametrizationfor spherical rectangles.
Computer Graphics Forum
32, 4 (2013), 59–66.Eric Veach. 1997.
Robust Monte Carlo methods for light transport simulation . Vol. 1610.Stanford University PhD thesis.Eric Veach and Leonidas J Guibas. 1995. Optimally combining sampling techniques forMonte Carlo rendering. In
Proceedings of SIGGRAPH’ 95 . ACM, 419–428.Petr Vévoda, Ivo Kondapaneni, and Jaroslav Křivánek. 2018. Bayesian online regressionfor adaptive direct illumination sampling.
ACM Transactions on Graphics (TOG)
ACMTransactions on Graphics (TOG)
33, 4 (2014), 101.Gregory J Ward, Francis M Rubinstein, and Robert D Clear. 1988. A ray tracing solutionfor diffuse interreflection. In
Proceedings of SIGGRAPH .Rex West, Iliyan Georgiev, Adrien Gruson, and Toshiya Hachisuka. 2020. ContinuousMultiple Importance Sampling.
ACM Transactions on Graphics (Proceedings ofSIGGRAPH)
39, 4 (July 2020).E. Woodcock, T. Murphi, P. Hemmings, and S. Longworth. 1965. Techniques used in theGEM code for Monte Carlo neutronics calculations in reactors and other systems ofcomplex geometry.. In
Proc. Conf. Applications of Computing Methods to Reactors,ANL-7050 .Quan Zheng and Matthias Zwicker. 2019. Learning to Importance Sample in PrimarySample Space.
Computer Graphics Forum
38, 2 (2019).Eric Ziegel. 1987. Numerical recipes: The art of scientific computing.Matthias Zwicker, Wojciech Jarosz, Jaakko Lehtinen, Bochang Moon, Ravi Ramamoorthi,Fabrice Rousselle, Pradeep Sen, Cyril Soler, and S-E Yoon. 2015. Recent advancesin adaptive sampling and reconstruction for Monte Carlo rendering. In