An unbiased ray-marching transmittance estimator
MARKUS KETTUNEN, NVIDIA
EUGENE D’EON, NVIDIA
JACOPO PANTALEONI, NVIDIA
JAN NOVÁK, NVIDIA
[Figure 1: equal-cost renders of a glass scene with variance insets, comparing Ratio tracking [Cramer 1978], P-series CMF [Georgiev et al. 2019], Unbiased ray marching (ours), and Biased ray marching (ours). Variance (MSE) at equal cost: 1.52e-3, 1.24e-3, 9.49e-5, 4.87e-5.]
Fig. 1. We propose a new unbiased Monte Carlo estimator for volumetric transmittance based on a power-series expansion. The zeroth-order term in our estimator corresponds to a novel variant of ray marching. Higher-order terms ensure a bias-free estimate and are evaluated infrequently. The result can have multiple orders of magnitude less variance than previous work with a similar number of density evaluations.
We present an in-depth analysis of the sources of variance in state-of-the-art unbiased volumetric transmittance estimators, and propose several new methods for improving their efficiency. These combine to produce a single estimator that is universally optimal relative to prior work, with up to several orders of magnitude lower variance at the same cost, and has zero variance for any ray with non-varying extinction. We first reduce the variance of truncated power-series estimators using a novel efficient application of U-statistics. We then greatly reduce the average expansion order of the power series and redistribute density evaluations to filter the optical depth estimates with an equidistant sampling comb. Combined with the use of an online control variate built from a sampled mean density estimate, the resulting estimator effectively performs ray marching most of the time while using rarely-sampled higher-order terms to correct the bias.

CCS Concepts: • Computing methodologies → Visibility; Ray tracing.

Additional Key Words and Phrases: transmittance, Poisson estimator, U-statistics, comb filter, power series
ACM Reference Format:
Markus Kettunen, Eugene d’Eon, Jacopo Pantaleoni, and Jan Novák. 2021. An unbiased ray-marching transmittance estimator. 1, 1 (February 2021), 20 pages.
Authors’ addresses: Markus Kettunen, NVIDIA; Eugene d’Eon, NVIDIA; Jacopo Pantaleoni, NVIDIA; Jan Novák, NVIDIA.
© 2021 Copyright held by the owner/author(s).
The visibility between two points in a scene is a fundamental quantity in light transport simulation. In a vacuum, it takes on a binary value. In a participating medium, however, scalar radiative transfer [Chandrasekhar 1960] is used to statistically account for the presence of scattering and absorbing particles. The number of particles intersecting a given ray is a random variable and visibility becomes a fractional quantity: the probability of traversing uncollided from 𝑎 to 𝑏,

$T(a,b) = \exp\left(-\int_a^b \mu(x)\,dx\right),$  (1)

where 𝜇(𝑥) is a known deterministic non-negative function (the extinction coefficient at position 𝑥). The probability 𝑇(𝑎,𝑏) is sometimes called transmittance, and efficiently computing this value is essential for rendering scenes with haze, fog, and clouds.

The integral in Equation 1 is rarely known in closed form. Exceptions include piecewise-homogeneous volumes and simple atmospheric models [Novák et al. 2018]. The general-purpose approach, therefore, is to use Monte Carlo to estimate transmittance by point-sampling 𝜇(𝑥) at a number of locations 𝑥 along the ray. A number of estimators have been proposed for this purpose, but no one estimator is optimal in all cases, and their efficiency depends on several parameters that are difficult to determine automatically.

In this paper, we present new methods for unbiased estimation of Equation 1. After reviewing previous work in Section 2, we present a new parametric variance analysis in Section 3 that reveals several
key factors that limit the performance of these estimators. This inspires a number of novel variance-reduction methods, which we detail in Section 4. Combining all of these methods together, we propose a new estimator in Section 4.7 that seems universally more efficient than prior work, and can in many cases yield transmittance estimates with orders of magnitude less variance at the same cost.

Our proposed estimator is based on a low-order Taylor series expansion of the exponential function near a relatively accurate estimate of the real optical depth obtained by multiple density lookups. The use of a low-order expansion frees up sampling budget for more accurate evaluation of both the expansion point and the Taylor series terms, which further allows lowering the evaluation order and improving the samples. This self-reinforcing loop leads to an unbiased low-variance estimator that most of the time only evaluates the already quite accurate zeroth-order term. The proposed evaluation of this term can be identified with the classical jittered ray-marching solution, whereas the remaining terms can be seen as probabilistically-sampled correction terms that make it unbiased, so we refer to our technique as unbiased ray marching.

Table 1. Symbols.
𝑥 ∈ [𝑎, 𝑏] : coordinate along a ray
𝜇(𝑥) : extinction coefficient
¯𝜇(𝑥) : majorant extinction
𝜇c(𝑥) : control extinction coefficient
𝜇r(𝑥) = 𝜇(𝑥) − 𝜇c(𝑥) : residual extinction coefficient
¯𝜇r(𝑥) = ¯𝜇(𝑥) − 𝜇c(𝑥) : majorant residual extinction coefficient
𝜏 = ∫𝑎𝑏 𝜇(𝑥) d𝑥 : optical depth
¯𝜏 = ∫𝑎𝑏 ¯𝜇(𝑥) d𝑥 : majorant optical depth
¯𝜏r = ∫𝑎𝑏 ¯𝜇r(𝑥) d𝑥 : majorant residual optical depth
ℓ = 𝑏 − 𝑎 : length of interval

In this section we review the main approaches for transmittance estimation in light and particle transport literature and also identify new connections to work outside of transport theory.
For brevity, we will sometimes abuse the term “density” to mean the extinction coefficient 𝜇(𝑥) (which is the product of the number density of particles at 𝑥 with the total cross section), and the interval will sometimes be omitted (e.g., 𝑇 refers to 𝑇(𝑎, 𝑏)).

Transmittance is the exponential of the negative optical depth,

$T(a,b) = \exp\left(-\tau(a,b)\right) = \exp\left(-\int_a^b \mu(x)\,dx\right).$  (2)

The optical depth 𝜏 can be easily approximated using ray marching (uniformly-spaced samples along the interval) or by jittered and unbiased Monte Carlo approaches, but the exponential of these estimates will result in a biased estimator of exp(−𝜏) [Raab et al. 2006]. The jackknife method and its generalizations [Miller 1974] can be used to reduce the bias in some cases, but the error may still not be acceptable for certain applications. The key challenge of transmittance estimation, then, is to form unbiased estimates of 𝑇 given only point samples of 𝜇(𝑥). This relates more broadly to estimating a functional exp(−𝜆) when 𝜆 is easily estimated in an unbiased fashion (see Jacob et al. [2015] for an extensive analysis of the challenges posed by general unbiased functional integration).

Many unbiased methods have been devised to estimate exponentiated integrals like Equation 1, and these methods are closely related to the zero-order estimation problem for point processes. A point process 𝑁(ℓ) is a stochastic counting process of the number of events (such as particles) occurring in some time (or along a ray of length) ℓ. For a Poisson point process (PPP), the events are independent and 𝑁(ℓ) ∼ Po(𝜆ℓ) is Poisson-distributed with rate 𝜆ℓ [Cox and Lewis 1966]. This rate is the integral of the intensity function 𝜆(𝑥) of the process over the interval,

$\lambda_\ell = \int_0^\ell \lambda(x)\,dx,$  (3)

and allows the mean density of points to vary over the domain.
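Before turning to the point-process view, the bias noted by Raab et al. [2006] is easy to reproduce numerically. The following Python sketch (with a synthetic density of our own choosing, not one from the paper) exponentiates an unbiased Monte Carlo estimate of 𝜏; since exp is convex, Jensen's inequality makes the averaged result overestimate 𝑇:

```python
import math
import random

def mu(x):
    # Synthetic, illustrative extinction coefficient (not from the paper).
    return 1.0 + 0.8 * math.sin(6.0 * x)

def tau_hat(a, b, n, rng):
    # Unbiased Monte Carlo estimate of the optical depth integral.
    ell = b - a
    return ell * sum(mu(a + rng.random() * ell) for _ in range(n)) / n

a, b = 0.0, 2.0
rng = random.Random(1)

# Ground truth via dense midpoint quadrature.
steps = 100_000
tau = sum(mu(a + (i + 0.5) * (b - a) / steps) for i in range(steps)) * (b - a) / steps
T_true = math.exp(-tau)

# Averaging exp(-tau_hat) is biased: exp is convex, so
# E[exp(-tau_hat)] >= exp(-E[tau_hat]) = T.
trials = 20_000
T_biased = sum(math.exp(-tau_hat(a, b, 4, rng)) for _ in range(trials)) / trials
```

With these settings the biased estimate comes out noticeably above exp(−𝜏), which is precisely why estimators that are unbiased for 𝑇 itself are needed.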
It is well known that this PPP is exactly the process governing the scattering and absorption events in classical radiative transfer [Cox and Lewis 1966; Mikhailov 1992], due to the assumption of independent scattering centers. The correspondence between the two is established by equating the rate of the point process 𝜆(𝑥) to the extinction coefficient of the medium 𝜇(𝑥) as the particle moves across the interval when starting from 𝑎, 𝜆(𝑥) = 𝜇(𝑎 + 𝑥). Transmittance is then the probability of finding no points/particles along the interval,

$T(a,b) = \Pr\left[N(\ell) = 0\right], \quad \ell = b - a.$  (4)

Since the mean of a PPP is the rate 𝜆ℓ = E[𝑁(ℓ)] = 𝜏(𝑎,𝑏), the exponential free paths of classical radiative transfer follow from the zero-order probability of the Poisson distribution (the probability mass function of a Poisson distribution with rate 𝜏 is 𝑒^{−𝜏} 𝜏^𝑘 / 𝑘!, which is an exponential for 𝑘 = 0).

The most well-known unbiased transmittance estimators are called tracking estimators due to the fact that they track a particle moving from 𝑎 to 𝑏 by sampling a PPP to determine an ordered sequence of collisions with the medium. For a constant-density medium, the exponentially-distributed free-path lengths between collisions are easily sampled [Novák et al. 2018]. For a nonhomogeneous medium, the PPP can be sampled using the method of delta-tracking [Bertini 1963; Butcher and Messel 1958; Coleman 1968; Galtier et al. 2013; Mikhailov 1970; Skullerud 1968; Woodcock et al. 1965; Zerby et al. 1961]. Using a majorant ¯𝜇(𝑥) ≥ 𝜇(𝑥), a denser process is sampled whose rate/optical-depth is easily computable,

$\bar\tau(a,b) = \int_a^b \bar\mu(x)\,dx.$  (5)

A rejection process is then used to thin the denser process down to the desired result, whereby each sampled point 𝑥ᵢ is kept with probability 𝜇(𝑥ᵢ)/¯𝜇(𝑥ᵢ). This rejection embodies the fictitious/null collision concept of the transport literature. We note that this is
equivalent to a method in the point process literature known as thinning [Pasupathy 2010] (identification of this correspondence appears to be new). The earliest use of either method would appear to be attributed to von Neumann shortly after the war, according to Carter et al. [1972].

While the majorant ¯𝜇 is often a constant, the efficiency of delta-tracking is improved with a majorant that more tightly bounds the target density. Piecewise-linear [Klein and Roberts 1984] or piecewise-polynomial [Szirmay-Kalos et al. 2011] majorants can be efficiently sampled. For a general survey of methods for sampling nonhomogeneous PPPs, see [Pasupathy 2010].

Somewhat remarkably, without knowing 𝜏, delta-tracking can sample the number of collisions 𝑁 from the Poisson distribution 𝑁(ℓ) ∼ Po(𝜏). Given 𝑛 samples of 𝑁(ℓ) with mean ¯𝑣, the minimum-variance unbiased estimator for transmittance (given only ¯𝑣) is [Johnson 1951]

$\widehat{T}_\mathrm{J} = \left(1 - \frac{1}{n}\right)^{n \bar v}.$  (6)

The single-sample (𝑛 = 1) form of this estimator produces (assuming 0⁰ = 1) the delta-tracking transmittance estimator [Cramer 1978; Novák et al. 2018], which returns a binary estimate depending upon whether or not 𝑁 = 0. The case 𝑛 > 1 (we refer to Equation 6 as Johnson’s estimator) has, to the best of our knowledge, not been applied to light transport. While this estimator is optimal (given only ¯𝑣), in practice it can be improved upon by using the sampled densities 𝜇(𝑥ᵢ) directly. Another related estimator had been proposed by Raab et al. [2006], obtained by averaging together 𝑛 partially stratified delta-tracking estimates.

Weighted tracking on a line [Cramer 1978] (also known as ratio tracking in graphics [Novák et al. 2014]) applies an expected-value optimization to the 𝑛 = 1 delta-tracking estimator to form a product of ratios of densities (null density 𝜇n(𝑥) = ¯𝜇(𝑥) − 𝜇(𝑥) to total density ¯𝜇(𝑥)).
This is closely related to a distance-sampling scheme known as weighted delta tracking (see e.g. Galtier et al. [2013] or Legrady et al. [2017]). Like delta tracking, a majorant PPP samples 𝑁 ∼ Po(¯𝜏) points 𝑥ᵢ in the interval (𝑎, 𝑏). Instead of returning 0 as soon as a real particle is sampled, the ratio-tracking estimator imparts a fractional opacity to each sampled particle based on its probability of being fictitious,

$\widehat{T}_\mathrm{rt} = \prod_{i=1}^{N} \left(1 - \frac{\mu(x_i)}{\bar\mu(x_i)}\right) = \prod_{i=1}^{N} \frac{\mu_\mathrm{n}(x_i)}{\bar\mu(x_i)}.$  (7)

Ratio tracking (also known as the track-length transmittance estimator [Georgiev et al. 2019]) outperforms delta tracking in most cases. However, delta tracking can use early termination after the first real particle is sampled and avoid many unnecessary density evaluations. Therefore it can be beneficial to switch to delta tracking after the running product in Equation 7 goes below some threshold [Novák et al. 2014].

A common theme in transmittance estimation is the utilization of auxiliary density functions (null-collision density, control density, etc.). While these auxiliary functions can serve different purposes, for example to facilitate sampling of collisions and/or to reduce variance, they (or the combination of them) can be interpreted as a control variate (CV) [Georgiev et al. 2019]. Given an analytically integrable control variate 𝜇c(𝑥) with 𝜏c = ∫𝑎𝑏 𝜇c(𝑥) d𝑥, the optical depth integral can be rewritten as

$\tau(a,b) = \tau_\mathrm{c}(a,b) + \int_a^b \mu(x) - \mu_\mathrm{c}(x)\,dx.$  (8)

We will refer to 𝜇c(𝑥) and 𝜇r(𝑥) = 𝜇(𝑥) − 𝜇c(𝑥) as the control and residual extinction coefficients, and to their respective integrals 𝜏c and 𝜏r as the control and residual optical depth. The majorant residual coefficient ¯𝜇r(𝑥) = ¯𝜇(𝑥) − 𝜇c(𝑥) and optical depth ¯𝜏r = ¯𝜏 − 𝜏c follow.
The transmittance is then

$T(a,b) = T_\mathrm{c}(a,b)\,T_\mathrm{r}(a,b) = \exp(-\tau_\mathrm{c}) \exp\left(-\int_a^b \mu_\mathrm{r}(x)\,dx\right).$  (9)

This transformation can dramatically reduce variance, particularly when the control closely matches the true density. The first application of control variates in transmittance estimation was with ratio tracking to produce the residual ratio tracking (RRT) estimator [Novák et al. 2014]. Applying the transformation in Equation 9, the estimator reads

$\widehat{T}_\mathrm{rrt} = \exp(-\tau_\mathrm{c}) \prod_{i=1}^{N} \left(1 - \frac{\mu_\mathrm{r}(x_i)}{\bar\mu_\mathrm{r}(x_i)}\right),$  (10)

where 𝑁 ∼ Po(¯𝜏r) points 𝑥ᵢ in the interval (𝑎, 𝑏) are generated by sampling a PPP with residual intensity 𝜆(𝑥) = ¯𝜇(𝑥) − 𝜇c(𝑥) ≥
0. While different articles may propose different approaches for setting the residual intensity 𝜆(𝑥) of the PPP, this estimator is conceptually equivalent to the one known as the Poisson estimator [Beskos et al. 2006; Chen and Huang 2012; Fearnhead et al. 2008; Jacob et al. 2015; Jonsson et al. 2020; Papaspiliopoulos 2011], first presented by Wagner [1987].

Concurrently, Jonsson et al. [2020] have also connected the ratio tracking and Poisson estimator literature and proposed several new variants of RRT that use online estimation of a constant control. We also propose online control estimation, but include additional variance-reduction techniques such as comb-filtering. We then apply this idea to a power-series formulation, which improves performance and naturally includes ray marching as a biased member of the general formalism.
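As a concrete reference, here is a minimal Python sketch of ratio tracking (Equation 7) and residual ratio tracking (Equation 10), assuming for simplicity a constant majorant ¯𝜇 and a constant control 𝜇c (both simplifications are ours, for illustration only):

```python
import math
import random

def ratio_tracking(mu, a, b, mu_bar, rng):
    # Equation 7: sample a PPP with constant majorant rate mu_bar on (a, b)
    # via exponential free paths; each point contributes 1 - mu(x)/mu_bar.
    T, x = 1.0, a
    while True:
        x += -math.log(1.0 - rng.random()) / mu_bar
        if x >= b:
            return T
        T *= 1.0 - mu(x) / mu_bar

def residual_ratio_tracking(mu, a, b, mu_bar, mu_c, rng):
    # Equation 10 with a constant control mu_c: exp(-tau_c) times a
    # ratio-tracking estimate of the residual extinction mu - mu_c,
    # sampled with the residual majorant mu_bar - mu_c.
    tau_c = mu_c * (b - a)
    mu_bar_r = mu_bar - mu_c
    T, x = 1.0, a
    while True:
        x += -math.log(1.0 - rng.random()) / mu_bar_r
        if x >= b:
            return math.exp(-tau_c) * T
        T *= 1.0 - (mu(x) - mu_c) / mu_bar_r

# Toy usage on an illustrative density (our own choice).
rng = random.Random(7)
mu_f = lambda x: 1.0 + 0.5 * math.sin(4.0 * x)
steps = 100_000
tau = sum(mu_f((i + 0.5) * 2.0 / steps) for i in range(steps)) * 2.0 / steps
n = 20_000
est_rt = sum(ratio_tracking(mu_f, 0.0, 2.0, 1.5, rng) for _ in range(n)) / n
est_rrt = sum(residual_ratio_tracking(mu_f, 0.0, 2.0, 1.5, 1.0, rng) for _ in range(n)) / n
```

Averaging many runs of either function converges to exp(−𝜏); with a control close to the mean density, the residual estimator typically has markedly lower variance.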
Another family of unbiased transmittance estimators follows from a power-series (Taylor) expansion of the exponential in Equation 1. This approach has been suggested as early as [Cameron 1954] and has been used in estimation problems involving transformed observations [Neyman and Scott 1960]. One such form, used to estimate the exponential of the Hamiltonian in particle-physics Markov-chain Monte Carlo simulations and due to Bhanot and Kennedy [Bhanot and Kennedy 1985; Lin et al. 2000; Wagner 1987, 1988], can be applied directly to transmittance estimation. Related applications of this idea to transmittance estimation were independently presented by several authors [El-Hafi et al. 2018; Georgiev et al. 2019; Jonsson et al. 2020; Longo 2002]. Similar work has been proposed by [Lyne et al. 2015] in the context of Bayesian inference.

Georgiev et al. [2019] first introduced this formulation to computer graphics, showing how it can in fact be seen as a very general framework for expressing and analysing all transmittance estimators. Following their derivation [Georgiev et al. 2019, Equations (15) and (16)], transmittance (1) can be expressed as:

$T(a,b) = \sum_{k=0}^{\infty} \frac{(-\tau)^k}{k!} = \sum_{k=0}^{\infty} \frac{1}{k!} \prod_{i=1}^{k} \left(-\int_a^b \mu(x)\,dx\right) = 1 - \frac{\tau}{1!} + \frac{\tau^2}{2!} - \frac{\tau^3}{3!} + \cdots.$  (11)

Monte Carlo estimation is then applied to each integer power of the optical depth 𝜏^𝑘 in the expansion. This is typically achieved by using 𝑘 numerical estimates {𝑋₁, ..., 𝑋ₖ} of the negative optical depth. As long as {𝑋₁, ..., 𝑋ₖ} are independent and unbiased, i.e.

$\mathrm{E}[X_i] = -\tau(a,b) = -\int_a^b \mu(x)\,dx,$  (12)

it follows that their product provides an unbiased estimate of the 𝑘-th power of −𝜏 (we drop the index since all 𝑋ᵢ have the same expectation):

$\mathrm{E}\left[\prod_{i=1}^{k} X_i\right] = \prod_{i=1}^{k} \mathrm{E}[X_i] = \mathrm{E}[X]^k = (-\tau)^k.$  (13)

This observation allows formulating the transmittance function as the series of products of unbiased, independent estimates of (negative) optical depth:

$T(a,b) = e^{\mathrm{E}[X]} = \sum_{k=0}^{\infty} \frac{1}{k!}\,\mathrm{E}\left[\prod_{i=1}^{k} X_i\right].$  (14)

Various estimators then follow from estimating random finite portions of this expansion with the appropriate weight corrections (explained below).

The above derivation highlights the importance of using independent and unbiased estimates of 𝜏 within a single term 𝜏^𝑘 of the power series. Correlations across the terms of the sum, however, are perfectly acceptable. In fact, high-order terms are typically computed from the low-order ones using the recursive formulation [Bhanot and Kennedy 1985; Georgiev et al. 2019]

$T(a,b) = 1 - \tau (1 - \frac{\tau}{2} (1 - \frac{\tau}{3} (\cdots,$  (15)

which has been used in Equation 14. See the appendix of Glasser [1962] for an interesting alternative.

Footnote: In standard literature the CV and its integral are typically weighted by a coefficient that controls the strength of applying the CV. Since we design our CVs heuristically with the goal of maximizing positive correlations, we simply absorb the scaling factor into the CV for brevity.
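The independence requirement in Equations (12)–(13) matters in practice: reusing one estimate 𝑋 across a product estimates E[𝑋²] = 𝜏² + Var[𝑋] rather than 𝜏². A small Python check (with a toy density of our own choosing) makes this concrete:

```python
import random

def X_sample(rng):
    # Single-sample unbiased estimate of -tau for mu(x) = 0.5 + 0.4 x
    # on (0, 2); the exact optical depth of this toy setup is tau = 1.8.
    x = 2.0 * rng.random()
    return -2.0 * (0.5 + 0.4 * x)

rng = random.Random(3)
trials = 200_000
tau = 1.8

# Product of two *independent* estimates: unbiased for (-tau)^2 = tau^2.
indep = sum(X_sample(rng) * X_sample(rng) for _ in range(trials)) / trials

# Squaring a *single* estimate: biased upward by Var[X].
same = 0.0
for _ in range(trials):
    x = X_sample(rng)
    same += x * x
same /= trials
```

Here `indep` converges to 𝜏² = 3.24, while `same` converges to 𝜏² + Var[𝑋] ≈ 3.45 for this density.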
A control variate 𝜇c(𝑥) is often applied to Equation 14 to yield

$T(a,b) = T_\mathrm{c}(a,b)\,T_\mathrm{r}(a,b) = e^{-\tau_\mathrm{c}} e^{-\tau_\mathrm{r}} = e^{-\tau_\mathrm{c}} \sum_{k=0}^{\infty} \frac{1}{k!} \prod_{i=1}^{k} \left(-\int_a^b \mu_\mathrm{r}(x)\,dx\right) = e^{-\tau_\mathrm{c}} \sum_{k=0}^{\infty} \frac{1}{k!}\,\mathrm{E}\left[\prod_{i=1}^{k} Y_i\right],$  (16)

where 𝑌ᵢ are unbiased, independent estimates of (negative) residual optical depth. Galtier et al. [2013], El-Hafi et al. [2018] and Georgiev et al. [2019] proposed to set the control variate to a strict majorant, 𝜇c(𝑥) = ¯𝜇(𝑥), to avoid sign oscillations in power-series estimates of 𝑇r(𝑎,𝑏). Based on our analysis from Section 3, we will revisit this decision and propose a new way of setting the control variate in Section 4. Notice how −𝜏c effectively acts as a pivot for the Taylor series expansion. This interpretation is central to our investigations and we will refer to −𝜏c as the pivot in the rest of the text.

In practice, the evaluation of the infinite power series needs to be limited to sampling a finite number of terms; Georgiev et al. [2019] have shown that virtually all unbiased transmittance estimators can be ultimately related to sampling this power-series expansion. In this respect, existing unbiased estimators can be classified into two broad categories.
Single-term estimation.
Georgiev et al. [2019] showed that the delta-tracking and ratio-tracking estimators can be described in the power-series formulation by noting that these estimators estimate a single term in Equation 11 at a time; when 𝑁 points are sampled by the ¯𝜇-driven PPP, these estimators estimate (−𝜏)^𝑁 / 𝑁!. The general form of the single-term power-series estimator is called the generalized Poisson estimator [Fearnhead et al. 2008],

$\widehat{T}_\mathrm{single} = \frac{e^{-\tau_\mathrm{c}}}{N!\,P(N)} \prod_{i=1}^{N} Y_i,$  (17)

where 𝑃(𝑁) is the probability mass function of 𝑁. Using delta-tracking results in a Poisson distribution, 𝑁 ∼ Po(¯𝜏r), but other distributions can be used [Fearnhead et al. 2008; Jonsson et al. 2020]. For the standard delta-tracking estimator, the random variable 𝑌 is replaced by a 𝜇c-weighted Bernoulli random variable and the power-series derivation of this estimator is a special case of a more general derivation [Glasser 1962, appendix].

Truncated-series estimators.
The recursive power-series relation in Equation 15 directly produces a truncated-series estimator that estimates all terms in the Taylor expansion up to and including 𝜏^𝑁. If 𝑁 is a discrete random variable and 𝑄(𝑘) = Pr[𝑁 ≥ 𝑘], then the truncated estimator for Equation 16 is:

$\widehat{T}_\mathrm{trunc} = e^{-\tau_\mathrm{c}} \sum_{k=0}^{N} \frac{1}{k!\,Q(k)} \prod_{i=1}^{k} Y_i.$  (18)

Instead of selecting 𝑁 from a Poisson process, Russian roulette is commonly employed and 𝑄(𝑘) becomes the product of the continuation probabilities.
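A minimal Python sketch of Equation 18 with a fixed continuation probability q (so that Q(k) = q^k), using independent samples 𝑌ᵢ of the negative residual optical depth. The homogeneous test at the end isolates pure roulette variance; all parameter values are our own illustrative choices:

```python
import math
import random

def truncated_series(Y_sampler, tau_c, q, rng):
    # Equation 18: Russian-rouletted power series around the pivot -tau_c.
    # Q(k) = q**k is the probability that at least k orders are evaluated.
    T, prod, Q, k = 1.0, 1.0, 1.0, 0
    while rng.random() < q:
        k += 1
        Q *= q
        prod *= Y_sampler(rng)  # independent unbiased sample of -tau_r
        T += prod / (math.factorial(k) * Q)
    return math.exp(-tau_c) * T

rng = random.Random(11)
tau_c, tau_r = 1.0, 0.5   # illustrative pivot and residual optical depth
# Homogeneous case: Y is exactly -tau_r (zero Y-variance), so any
# remaining variance comes purely from the roulette on the order.
trials = 100_000
est = sum(truncated_series(lambda r: -tau_r, tau_c, 0.7, rng)
          for _ in range(trials)) / trials
```

The mean of `est` converges to exp(−𝜏c − 𝜏r) = exp(−1.5), demonstrating that the 1/Q(k) weights remove the truncation bias.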
A useful scheme, where the expansion is always evaluated up to order 𝐾 and then terminated using term-wise roulette decisions, follows from the equivalence (which we generalize here) [Bhanot and Kennedy 1985]

$e^x = \sum_{k=0}^{K} \frac{x^k}{k!} + \frac{c}{K+1} \Big( \frac{x^{K+1}}{c\,K!} + \frac{c}{K+2} \Big( \frac{x^{K+2}}{c^2\,K!} + \cdots.$  (19)

Here 𝑐 is a roulette control parameter restricted to 0 < 𝑐 < 𝐾 + 1, and 𝑐/(𝐾 + 𝑘) is the probability of expanding from order 𝐾 + 𝑘 − 1 to order 𝐾 + 𝑘. Bhanot and Kennedy proposed using a continuous expansion parameter 𝑐 >
0, setting 𝐾 = ⌊𝑐⌋, which we will refer to as the Bhanot & Kennedy (BK) estimator. The BK roulette, specifically the parameter 𝑐, provides very explicit control over the cost of the estimator.

Independently, Georgiev et al. [2019] proposed the p-series CMF estimator that sets 𝑐 = ¯𝜏, the majorant optical depth, and selects 𝐾 such that 99% of the majorant cumulative mass function (CMF) is accumulated (assuming ¯𝜏 is a safe and reasonable guess for the true optical depth when selecting 𝐾).

Delta tracking and ratio tracking each have variations known as the next-flight estimators [Cramer 1978; Novák et al. 2018] that fall somewhat in between the tracking and truncated-series forms. Also, Georgiev et al. [2019] introduced a number of additional estimators, including the p-series cumulative estimator that employs a different roulette strategy than described above, but concluded the p-series CMF was best overall. We refer the reader to these works for further details.

In this work we use multiple correlated density evaluations per estimate of optical depth, which was mentioned by Georgiev et al. [2019] but, to the best of our knowledge, has not been applied before.
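Under our reading of Equation 19, the BK roulette can be sketched as follows. This is a hedged illustration for the zero-Y-variance case, where each series term uses the exact value x = −𝜏r:

```python
import math
import random

def bk_roulette(x, c, rng):
    # Bhanot & Kennedy roulette for e^x (Equation 19): evaluate the series
    # deterministically up to K = floor(c), then expand from order k-1 to
    # order k with probability c/k, compensating with weight k/c.
    K = int(math.floor(c))
    total = sum(x**k / math.factorial(k) for k in range(K + 1))
    term = x**K / math.factorial(K)
    k = K + 1
    while rng.random() < c / k:
        term *= x / c            # term is now x**k / (c**(k-K) * K!)
        total += term
        k += 1
    return total

rng = random.Random(5)
x, c = -1.2, 2.5                 # illustrative values; here K = 2
trials = 200_000
est = sum(bk_roulette(x, c, rng) for _ in range(trials)) / trials
```

The mean of `bk_roulette` converges to e^x; larger 𝑐 evaluates more orders deterministically and in expectation, trading cost for lower roulette variance.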
In this section, we investigate the efficiency of single-term and truncated power-series estimators. We measure the sensitivity of each estimator’s variance to various factors, which leads to key insights that inform the design of new estimators.
Following prior work, we define the efficiency of an estimator to be the reciprocal of variance times cost,

$\mathrm{Eff}[\widehat{T}] = \frac{1}{\mathrm{Var}[\widehat{T}]\,\mathrm{Cost}[\widehat{T}]},$  (20)

where Cost[𝑇̂] is the mean number of density 𝜇(𝑥) evaluations. For both single-term and truncated power-series estimators, this will depend on 𝑁: the number of unbiased estimates of negative residual optical depth (𝑌ᵢ) needed to estimate a subset of the power series in Equation 16. By abandoning the physical picture of tracking estimators, the power-series formulation permits a new parameter 𝑀 that we call the query size, which is the number of density evaluations per estimate 𝑌ᵢ (see Figure 2). The estimator for 𝑌 is then

$\widehat{Y} = -\frac{1}{M} \sum_{i=1}^{M} \frac{\mu_\mathrm{r}(x_i)}{p(x_i)} = -\frac{1}{M} \sum_{i=1}^{M} \frac{\mu(x_i) - \mu_\mathrm{c}(x_i)}{p(x_i)},$  (21)

where 𝑝(𝑥) is the density for sampling 𝑥 ∈ (𝑎, 𝑏), and the total cost is Cost[𝑇̂] = E[𝑁] · 𝑀.

The efficiency of a given estimator will depend on a number of parameters that can be adjusted: the control variates, the query size 𝑀, and (in the case of BK roulette) the power-series expansion parameter 𝑐. Ideally, an automatic procedure would optimally configure these parameters given only limited knowledge of the density statistics in the scene. To discern more about how this could be achieved, we need a detailed picture of how these parameters influence variance. While it is known that the efficiency of residual ratio tracking improves with increasing majorant [Georgiev et al. 2019] and basic heuristics for setting control variates have been discussed [Jonsson et al. 2020; Novák et al. 2014], to the best of our knowledge, little to no detailed investigation of query size 𝑀 or expansion parameter 𝑐 has been presented for any truncated estimator.

Ultimately, variance in a transmittance estimator will arise due to two factors, which we will call Y-variance and roulette variance, and we will show that they are in fact weakly coupled.
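Before analyzing these two variance sources, here is a minimal sketch of the query estimator of Equation 21 with uniform sampling p(x) = 1/ℓ, which reduces to averaging 𝑀 residual density lookups (the density and constant control are our own toy choices):

```python
import math
import random

def Y_hat(mu, mu_c, a, b, M, rng):
    # Equation 21 with p(x) = 1/(b - a): one query of size M produces a
    # single unbiased estimate of the negative residual optical depth.
    ell = b - a
    s = sum(mu(a + rng.random() * ell) - mu_c for _ in range(M))
    return -ell * s / M

rng = random.Random(2)
mu_f = lambda x: 1.0 + 0.5 * math.sin(4.0 * x)   # toy density
mu_c = 1.0                                        # constant control
queries = 50_000
mean_Y = sum(Y_hat(mu_f, mu_c, 0.0, 2.0, 4, rng) for _ in range(queries)) / queries
```

Increasing 𝑀 leaves the expectation −𝜏r unchanged while dividing the Y-variance by 𝑀, which is exactly the query-size knob studied in this section.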
Transmittance estimators are random functions f_T̂(𝑌₁, ..., 𝑌_𝑁) of 𝑁 random variables 𝑌ᵢ. By Y-variance, we mean the variance in the optical depth estimates 𝑌ᵢ themselves, which leads to variance in 𝑇̂ upon insertion into f_T̂. To better understand the influence of Y-variance, we can turn it off by considering a uniform medium and uniform sampling 𝑝(𝑥ᵢ) = 1/ℓ. The only variance that remains is then due to 𝑁 being a random variable, causing f_T̂ to evaluate different portions of the power series. This variance arises due to the roulette scheme of a truncated estimator (or PPP sampling for a single-term estimator), and we call it roulette variance.

In Figure 3 we compare the roulette variance of single-term and truncated power-series estimators. We use residual ratio tracking, with known variance, and a control 𝜇c such that ¯𝜏r = ¯𝜏 − 𝜏c = ℓ(¯𝜇 − 𝜇c) = E[𝑁_BK], where ℓ is the length of the estimation interval (see also Equation 55). The most efficient estimator is the one with the lowest variance. We observe two important trends as the pivot −𝜏c is varied. First, we can achieve arbitrarily low variance by moving the negative pivot close to the true optical depth of the medium (𝜏c = 𝜏). Second, when the pivot is near this optimal value, the truncated (BK) estimator is universally better than the single-term estimator (RRT).

This analysis hints at the possibility of finding a single transmittance estimator that performs best in all cases (truncated), but also highlights the need for an accurate pivot estimator in order to achieve this. It is known that ratio tracking can outperform truncated estimators in some cases [Georgiev et al. 2019] and we see this again here for the uniform medium: when the pivot is far from its optimal value, the truncated estimator sees a significant explosion of variance, while the single-term estimator sees far less. This issue is lessened by estimating more of the series via the expansion parameter 𝑐, but at the cost of more density evaluations. Increasing 𝑐 also widens the performance gap between the two estimators (Figure 3, bottom row) and extends the range of pivots where the truncated estimator is better than the single-term one.

Fig. 2. Transmittance estimation via truncated power series using a constant control variate and multiple (𝑀 = 4) density evaluations per estimate 𝑌ᵢ of the negative residual optical depth.

Fig. 3. Variance of cost-matched RRT (black) and BK (dashed) estimators for uniform media as a function of negative pivot 𝜏c in three configurations of optical depth 𝜏 and expansion parameter 𝑐.

Fig. 4. Inverse efficiency of the BK estimator in a uniform medium as a function of negative pivot 𝜏c in two configurations of optical depth 𝜏 and five configurations of expansion parameter 𝑐.

Figure 4 further demonstrates how the expansion parameter 𝑐 influences the efficiency of power-series estimators by plotting inverse efficiency. We see that increasing 𝑐 widens the range of pivots where high efficiency can be obtained. It also shows a universal trend shared with RRT: increasing E[𝑁] monotonically improves the overall efficiency, regardless of the pivot.

While rays passing through uniform density are common in practice (especially empty space with 𝑇 = 1), the analysis above is not representative of the full picture. We now introduce increasing amounts of Y-variance to observe how and when the total variance changes.

Fig. 5. Variance of the BK (thick) and U-BK (thin) estimators (𝐾 = 𝑐 = 2) for two different optical thicknesses. The uniform medium (black) is compared to five different levels of density fluctuations (colored). The pivot with the lowest variance for the BK estimator is indicated by black dots and shifts to the right with increasing density fluctuations, making the optimal pivot difficult to predict. Application of U-statistics always reduces the variance and also widens the range of pivots where the minimum variance is achieved.

In Figure 5 we compare the variance of the truncated estimator as different amounts of fluctuation in the density 𝜇(𝑥) are introduced while preserving the mean of 𝜇(𝑥).
Each thick colored line corresponds to a different amount of fluctuation in 𝜇(𝑥) (thereby introducing Y-variance). The uniform medium (pure roulette variance) is shown in black for reference. This comparison is comprehensive in that, like with ratio tracking, it follows from the power-series formulation that the variance of the BK estimator is purely a function of the mean and variance of the 𝑌 estimates (together with the pivot value); the exact profile of the density fluctuations is irrelevant (this is because the BK roulette is independent of 𝑌ᵢ).

We find that the variance of the BK estimator is dominated by either Y-variance or roulette variance: they are weakly coupled. Far from the optimal pivot (where the black curves merge with the rest), the variance is essentially the same as that of a medium with constant density, so increasing 𝑀 will have little impact. Conversely, no matter how good the pivot, variance in the samples 𝑌ᵢ limits the minimum-achievable variance. Further, as Y-variance decreases (by increasing 𝑀, say), the pivot needs to be closer to the optimal value to avoid roulette variance limiting the gains (to stay inside the black curves in Figure 5), suggesting that the sample budget of any online pivot estimation should be positively correlated with 𝑀.

From the analysis in this section we take away several key insights:

• Regardless of the optical depth or variation of density along a ray, the pivot is a critical parameter for achieving optimal efficiency with either ratio tracking or truncated estimators.
• Near the optimal pivot value, truncated estimators outperform ratio tracking when Y-variance is low.
• The lowest achievable variance of the estimator is ultimately limited by variance in the optical depth estimates.

In the next section we introduce a novel truncated power-series estimator that builds on these insights with the goal of “climbing down into the valley of zero variance” in Figure 5: the region between the black curves. This is achieved by combining accurate online pivot estimation with Y-variance reduction, and by symmetrizing the power-series estimator to further reduce variance and sensitivity to the estimated pivot.

In this section, we propose new truncated power-series estimators inspired by the previous analysis. These estimators build on previous work through the introduction of several novel methods, largely under the theme of Y-variance reduction and pivot estimation. We describe each of these methods separately, with both experimental and theoretical motivation for each, before detailing their combination. For notational simplicity, and unless stated otherwise, we will discuss estimation of transmittance without the application of the control variate. Extending the proposed improvements to residual transmittance 𝑇r is trivial, requiring mere substitution of the corresponding terms.

In order to estimate transmittance with a power-series estimator that evaluates all terms up to order 𝑁 (Equation 14), we need to obtain 𝑁 estimates of the negative optical thickness, 𝑋₁ to 𝑋_𝑁, and evaluate the following sum:

$\widehat{T}_\mathrm{trunc} = 1 + \frac{X_1}{1!\,Q(1)} + \frac{X_1 X_2}{2!\,Q(2)} + \cdots + \frac{X_1 \cdots X_N}{N!\,Q(N)},$  (22)

where 𝑄(𝑘) is the probability of evaluating at least 𝑘 orders. This specific estimator follows from the recursive formulation of the power series [Bhanot and Kennedy 1985; Georgiev et al. 2019] but is not the only unbiased estimator with the correct expectation.
We show how to reduce the variance of this estimator with no additional density evaluations.

The key insight in reducing the variance of Equation 22 is noting that the first estimate X_1 appears in all of the terms, but the last estimate X_N is used only once, and so increasing N has little impact on the variance of the linear term, and so on. Our goal is to ensure that all estimates are in a symmetric position with respect to impacting the sum, and that we utilize the estimates maximally for each term in the estimator. We can achieve this for the first-order term in Equation 22 by replacing X_1 by the mean of all estimates:

m_1 := (X_1 + ··· + X_N)/N.   (23)

Analogously, we replace the X_1 X_2 product in the second-order term by the mean of all two-term products X_i X_j:

m_2 := (X_1 X_2 + ··· + X_1 X_N + X_2 X_3 + ··· + X_{N−1} X_N) / C(N, 2).   (24)

In order to generalize this idea to the k-th order, we sum the products of all possible k-wide combinations, the k-th elementary
Fig. 6. Variance of BK (left) and U-BK (right) estimators for a ray with optical depth τ = 4 and small Y-variance. The pivot that minimizes variance for each value of expansion parameter c is indicated (approximately) by the black dots. In addition to reducing variance, U-statistics flattens the variance profile with respect to the pivot. This helps to mitigate any increase in variance due to errors in online pivot estimation and provides a simple common goal for pivot estimation: the negative optical depth of the ray.

symmetric sum:

s_k := Σ_{1 ≤ i_1 < ··· < i_k ≤ N} X_{i_1} ··· X_{i_k},   (25)

where s_0 :=
1, and divide it by the number of k-wide combinations; this yields a general formula for computing the k-th symmetric mean:

m_k := s_k / C(N, k).   (26)

This variance-reduction procedure is well-known in probability theory: Equation 22 is a statistic f(X_1, ···, X_N) of N independent and identically-distributed random variables. It is known [Halmos 1946] that the unique minimum-variance estimator of such a statistic is the symmetric function f^[N](X_1, ···, X_N) that is invariant to the order of the X_i inputs. The generalized means m_k are known as U-statistics [Lee 1990].

Utilizing U-statistics as the numerators in Equation 22 yields a novel U-statistics power-series estimator of transmittance:

T̂_U = 1 + m_1/Q(1) + m_2/(2! Q(2)) + ··· + m_N/(N! Q(N)).   (27)

This approach can lower the variance of any estimator that estimates more than one term of the power series at the same time (tracking estimators are already fully symmetric). When combined with the generalized Bhanot & Kennedy roulette scheme (19) we refer to this estimator as the U-BK estimator.

In addition to reducing variance, U-statistics makes it easier to find the optimal pivot. This can be seen in the variance comparisons in Figure 5 (varying Y-variance with c fixed) and Figure 6 (varying c with Y-variance fixed). For BK, the negative pivot τ_c that achieves minimum variance shifts to the right of τ as Y-variance or c increases, making this a difficult parameter to automatically determine. In addition to universally lowering the variance, we see that U-statistics flattens the variance profile in regions not dominated by roulette variance.
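The effect of symmetrization in Equations 25-27 is easy to check numerically. The brute-force sketch below (ours, exponential in N and for illustration only) disables the roulette (Q(k) = 1, fixed N = 4) so that the non-symmetric estimator of Equation 22 and the U-statistics estimator of Equation 27 target the same truncated sum; they then agree in expectation while the symmetric version has lower variance:

```python
import itertools
import math
import random

def sym_means(xs):
    """Elementary symmetric means m_0..m_N (Eqs. 25-26), brute force."""
    N = len(xs)
    return [sum(math.prod(c) for c in itertools.combinations(xs, k))
            / math.comb(N, k) for k in range(N + 1)]

def T_plain(xs):
    # Eq. 22 with Q(k) = 1: non-symmetric prefix products
    return sum(math.prod(xs[:k]) / math.factorial(k)
               for k in range(len(xs) + 1))

def T_sym(xs):
    # Eq. 27 with Q(k) = 1: U-statistics numerators
    return sum(m / math.factorial(k) for k, m in enumerate(sym_means(xs)))

rng = random.Random(2)
tau = 0.5
# Noisy i.i.d. estimates of the negative optical depth -tau.
trials = [[-tau + rng.uniform(-0.3, 0.3) for _ in range(4)]
          for _ in range(50_000)]
a = [T_plain(t) for t in trials]
b = [T_sym(t) for t in trials]
mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
var_a = sum((x - mean_a) ** 2 for x in a) / len(a)
var_b = sum((x - mean_b) ** 2 for x in b) / len(b)
```

Both means converge to the truncated series Σ_{k=0}^{4} (−τ)^k/k!, while var_b stays well below var_a, mirroring the flattened curves in Figure 6.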
Importantly, regardless of Y-variance or c, the true optical depth τ is a (near) optimal setting for τ_c in all cases, making the goal of pivot estimation simple: to estimate the negative optical depth.

The main caveat of naively evaluating a U-statistics estimator is the exponential computational complexity: the total number of combinations required for evaluating the series up to order N is C(N, 1) + C(N, 2) + ··· + C(N, N) = 2^N − 1. A simple approximation of the optimal estimator could be built in O(N^2) time by averaging N rotations of the X_i estimates (taking the average of {f(X_1, X_2, ..., X_N), f(X_2, X_3, ..., X_1), ...}). However, efficient full symmetrization is possible by using the Girard-Newton formulas [Mead 1992], independently found by Albert Girard and Isaac Newton in the 17th century, which relate the numbers x_1 to x_N to their elementary symmetric sums s_k. By precomputing the power sums P_k = Σ_{i=1}^{N} x_i^k, we have

s_k = (1/k) Σ_{i=1}^{k} (−1)^{i−1} s_{k−i} P_i,   (28)

a simple and efficient recurrence relation for the elementary symmetric sums.

Although the Girard-Newton formulas provide a convenient way for directly calculating the elementary symmetric means, we found them to suffer from numerical precision problems. Algorithm 1 provides pseudo-code for a novel incremental algorithm to compute the elementary symmetric means that we designed to address these robustness issues while potentially allowing to add new samples on the fly. An explanation of the algorithm is provided in Appendix B. We recommend using this version in practical implementations. Our algorithm and the Girard-Newton formulas both run in time O(NZ), where N is the number of samples and Z is the number of orders evaluated.
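The Girard-Newton recurrence (Equation 28) can be sketched as follows; the brute-force sums over index combinations serve only to verify the recurrence on a small input (the sketch is ours):

```python
import itertools
import math

def elementary_sums(xs, Z):
    """s_0..s_Z via the Girard-Newton recurrence (Eq. 28), O(N*Z) time."""
    # Precompute the power sums P_i = sum_n x_n**i.
    P = [sum(x**i for x in xs) for i in range(Z + 1)]
    s = [1.0] + [0.0] * Z
    for k in range(1, Z + 1):
        s[k] = sum((-1) ** (i - 1) * s[k - i] * P[i]
                   for i in range(1, k + 1)) / k
    return s

xs = [0.3, -1.2, 0.7, 2.5, -0.4]
s = elementary_sums(xs, len(xs))
# Brute-force elementary symmetric sums (Eq. 25), exponential cost.
brute = [sum(math.prod(c) for c in itertools.combinations(xs, k))
         for k in range(len(xs) + 1)]
```

The alternating products of large power sums in the recurrence are the source of the cancellation problems mentioned above, which is why the incremental algorithm below is preferable in practice.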
Normally we evaluate all orders (Z = N) but if the sample count N is high enough, the highest orders might not contribute and we might want to make Z smaller than N to save time. These algorithms reduce the time of evaluating the elementary symmetric means from O(2^N) to O(N^2), or O(NZ), and make the combination estimator practical.

ALGORITHM 1:
ElementaryMeans
Input: Samples x_1, ···, x_N; evaluation order Z
Output: Elementary symmetric means m_1, ..., m_Z

m_0 = 1; m_k = 0 (for k = 1 to Z);
for n = 1 to N do
    for k = min(n, Z) down to 1 do
        m_k = m_k + (k/n)(m_{k−1} x_n − m_k);
    end
end

We presented empirical evidence in Section 3 that the pivot plays a key role in minimizing the variance of transmittance estimators and that, with U-statistics, the negative optical depth is a universally good choice. We now present additional theoretical motivation for this observation before discussing online pivot estimation.

Interpreting the negative control thickness −τ_c as the expansion point, or pivot, of the Taylor series of the exponential provides an insightful new way to analyze power series estimation. Consider the general problem of evaluating e^x, for a given x ∈ R, using its

Fig. 7. Moving the pivot (black) of the Taylor series expansion closer to the point where we want to evaluate it (red) has a dramatic effect on its convergence. Under the power-series formulation of transmittance estimation this means that a more accurate control variate permits more aggressive roulette on higher order terms in the series, lowering the cost.
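Algorithm 1 translates directly into Python (our transcription); the brute-force combination averages below only verify the incremental update on a small input:

```python
import itertools
import math

def elementary_means(xs, Z):
    """Incremental elementary symmetric means m_0..m_Z (Algorithm 1).

    Iterating k in descending order lets the update run in place: m[k-1]
    is still the previous sample count's value when m[k] is refreshed.
    Each new sample x_n refines all means, so samples can be added on
    the fly.
    """
    m = [1.0] + [0.0] * Z  # m[0] = 1 by convention
    for n, x in enumerate(xs, start=1):
        for k in range(min(n, Z), 0, -1):
            m[k] += (k / n) * (m[k - 1] * x - m[k])
    return m

xs = [0.25, -0.8, 0.5, 1.75, -0.3, 0.9]
m = elementary_means(xs, 4)
# Reference: averages of products over all k-wide combinations (Eq. 26).
brute = [sum(math.prod(c) for c in itertools.combinations(xs, k))
         / math.comb(len(xs), k) for k in range(5)]
```

Because every update blends m_k toward m_{k−1}·x_n with weight k/n, the running values stay on the scale of the final means instead of the potentially huge symmetric sums, which is what gives the algorithm its robustness.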
Fig. 8. The variance of a non-symmetric p-series (BK) estimator quickly reaches a plateau as the expansion order is raised (increasing c) while maintaining a fixed pivot. This is because the additional samples are used only in the higher-order terms, which have an insignificant contribution in the Taylor expansion of T. Moving the pivot closer to the true optical depth worsens the efficiency because even fewer terms play a significant role in the expansion. Our symmetric estimator (U-BK) continually improves with larger c because additional samples improve all terms in the expansion.

series expansion centered at point p:

e^x = e^p Σ_{k=0}^{∞} (x − p)^k / k! ≈ e^p Σ_{k=0}^{N} (x − p)^k / k!.   (29)

Different values of the pivot p correspond to different polynomial fits: the closer p is to x, the faster the Taylor polynomial converges to the true value at x (see Figure 7 for a demonstration). For transmittance estimation, this is another way of saying that the "mass" that each term in the series contributes to the final estimate of T shifts as the pivot changes (see [Georgiev et al. 2019, Figures 5 and 6]).

For the uniform medium, where all optical depth estimates are deterministic (zero variance), we found that the optimal estimator is the one with the pivot that has the Taylor expansion converge at the zeroth order: e^x = e^{−τ}(1 + τ_r + ···) where τ_r = 0. This may seem like a purely theoretical curiosity, because knowing τ immediately gives T, but it does ensure that any estimator using an estimate of −τ for p will have zero variance for rays with uniform density, such as through empty portions of the volume.

Whether or not p = −τ is a good choice in general, though, is more complicated.
Without a symmetric estimator, once the optical depth estimates are random, a pivot derived from a majorant density to suppress alternating signs at consecutive orders tends to result in lower variance [El-Hafi et al. 2018; Galtier et al. 2013; Georgiev et al. 2019]. This results from a complicated interaction between the sampling probabilities for each term in the series and their expected contributions (masses). We note that (lack of) symmetry can shed more light on why this happens.

In Figure 5 and Figure 6 we compare the variance of the truncated BK estimator as the pivot changes. As the expansion parameter c or the level of Y-variance changes, the optimal pivot (black dots) moves. Note how in Figure 6 (left), at each of the optimal pivot locations, an increase in c, which provides more samples to the estimator, has no effect on the variance: it plateaus. We show this as well in Figure 8 with the medium and pivot held fixed as more samples are used by the estimator. For either the tight majorant pivot (blue line) or the true optical depth τ (yellow line), eventually the extra samples are wasteful. In the latter case, the plateau is reached instantly.

To a first order, the explanation for this effect is the lack of estimator symmetry, resulting in bad sampling of the low order terms: the k-th term in the Taylor series is estimated by a single product of noisy samples Y_1 ··· Y_k, and this sampling error is never corrected by sampling higher orders: i.e., sampling the order k + 1 with a new density sample Y_{k+1} will not improve the estimates of the previous orders {1, ..., k}.

This means that regardless of the pivot, the convergence will plateau as the expected contribution of the higher orders eventually tends to zero and additional samples stop improving the result.
With a more accurate pivot, which results in an even higher relative expected contribution of the low order terms, the problem is exacerbated, making the convergence plateau even sooner.

The situation changes once we utilize U-statistics because each additional sample improves all of the terms in the Taylor expansion. Computing the pivot using the majorant density no longer improves performance, while the approximate density mean always yields better variance, especially in low sample-count settings (Figure 8). As we noted above, in addition to lowering the variance relative to the non-symmetric estimator, U-statistics creates a range of pivots containing p = −τ where the variance is near optimal, and so we will refer to p = −τ as the optimal pivot. Appendix A contains a more thorough analysis of the effects of using the approximate mean pivot.

Assuming we have obtained a good pivot, a natural next question is to understand how the Y-variance relates to the required order of the Taylor expansion. Let us assume a relatively accurate pivot, p ≈ E[X], such that our shifted samples Y_i := X_i − p have approximately zero expectation. For simplicity, let us also assume that our estimates for the terms E[Y]^k are given by simple products Y_1 ··· Y_k. The variance of the product is

Var[Y_1 ··· Y_k] ≈ E[Y^2]^k ≈ Var[Y]^k,   (30)

which relies on the assumption that E[Y] ≈
0. Since shifting a random variable does not change its variance, the immediate follow-up is that if we decrease the variance of our samples X_i to a factor s, the variance of our estimate for E[X − p]^k will fall geometrically to the factor of s^k, quickly making the higher-order terms insignificant. For instance, decreasing the variance of our samples by 50% would result in a decrease of the variance of the 10th order term to around 1/1000 of its original value.

This means that even a small reduction in Y-variance makes the Taylor series converge with fewer terms. This, in turn, allows us to save computation by more aggressive Russian roulette. We can use this freed sampling budget for bringing the variance of the samples down even more: a potential self-amplifying feedback loop. However, all of this needs a good pivot, which might not always be available.

One way of obtaining accurate approximations of the optimal pivot is to subdivide the volume, precompute localized statistics, and query them along each ray. We take a lighter approach and propose to estimate the optimal pivot on the fly by taking an additional independent sample X_{N+1} of the integral −τ.

A single sample might not seem enough to estimate a mean; however, we observe that we can apply a procedure analogous to the rotations briefly mentioned in Section 4.1 to effectively increase the total number of samples to N + 1: If we denote by X the entire set of samples {X_1, ···, X_{N+1}}, we may consider all N + 1 estimators resulting from taking each unbiased sample X_i as the pivot in turn, using the samples X \ {X_i} to build our symmetrized estimator from Section 4.1, and averaging the result. Formally, such an unbiased estimator of order N reads:

T̂ = (1/(N+1)) Σ_{i=1}^{N+1} f_N(X_i, X \ {X_i}),   (31)

where the function f_N is given by:

f_N(p, Y) = e^p Σ_{k=0}^{N} m_k(Y − p)/(k!
Q(k)),   (32)

where m_k is the k-th symmetric mean, each X_i is an independent unbiased estimator of ∫_a^b −μ(x) dx, and Y − p subtracts p from each of the remaining samples. This ensures that all samples in the expanded set have a symmetric contribution in the new estimator.

In this section, we focus on reducing the Y-variance. Equation 30 suggests that the variance of the higher order terms k is proportional to the k-th power of the variance of the estimators X_i: hence, even a small reduction in the variance of each individual X_i will transform into much larger reductions for the higher order terms of the series expansion. We propose to use an unbiased, multi-sample estimator that strikes a better quality-cost tradeoff than single-sample estimators: X_i = −μ(x_i)/p(x_i), where x_i ∼ p. The estimator is based on randomized Cranley-Patterson (CP) rotations of equidistant points; other sampling patterns are briefly discussed in Section 6.

Without loss of generality, we assume the integration interval to be [0, ℓ). We use an M-tuple of equidistant points {u_j := ℓ(j−1)/M}_{j=1..M} that we randomly offset and wrap around the [0, ℓ) interval using the CP rotation. For each order i, we use a single random number x_i ∈ [0, ℓ) to obtain the rotated set {x_{ij} := (x_i + u_j) mod ℓ}_{j=1..M} and estimate the optical depth as:

X_i = −(1/M) Σ_{j=1}^{M} μ(x_{ij})/p(x_{ij}).   (33)

Notice that this estimator is equivalent to convolving the integrand with an M-point Dirac comb:

μ_⊗(s) = (1/M) Σ_{j=1}^{M} μ((s + u_j) mod ℓ).   (34)

Henceforth, we will refer to the resulting estimators as combed estimators.

Fast convergence rate of equidistant sampling.
Using multiple density evaluations induces a higher evaluation cost than single-sample optical-depth estimators. It is thus important to consider whether the U-statistics estimator, which utilizes all estimates maximally, yields lower variance with few high-quality estimates or with many low-quality ones.

The reason why using combs dedicating M evaluations to each Y estimate is advantageous is to be found in the very fast convergence rate of integration by equidistant sampling: whereas with pure random sampling M evaluations would yield an integration error reduction of only O(1/√M), if the integrand has bounded slope (which is common in practice, at least locally) integrating with equidistant combs features a convergence rate of O(1/M), meaning that variance goes down as fast as O(1/M^2). Hence, even if at equal sample count we are reducing the number of combinations available for the U-statistics estimator, we have observed that this is more than compensated by the much lower variance of the individual estimates.

Our proposed algorithms will take this idea to a logical maximum: we try to maximally benefit from the improved convergence rate by utilizing as dense sampling combs as possible (i.e., a large M), and compensate for the larger M by a very aggressive Russian roulette to keep the truncation order N low.

As shown by Equation 34, an M-point equidistant sample of the density function corresponds to a single evaluation of the convolution of the density with an M-point Dirac comb. This convolution does not change the value of the density integral: it merely reshuffles the density into a form that is more suitable for Monte Carlo estimation (see Figure 9). This inspires a new general invariance principle for transmittance estimation: we can alter the density along the ray in any integral-preserving way that we like and not change the result.
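The gap between random and equidistant sampling is easy to observe numerically. The sketch below (our construction) compares M i.i.d. uniform samples against one CP-rotated M-point comb for ∫_0^1 e^{−2x} dx; the comb's error is limited only by the endpoint mismatch of the periodic extension and its RMSE comes out several times smaller:

```python
import math
import random

def comb_estimate(f, M, rng):
    # One CP-rotated equidistant comb (Eq. 33 with ell = 1, uniform p).
    # An offset in [0, 1/M) is equivalent to a full rotation of the comb.
    u = rng.random()
    return sum(f((u + j) / M) for j in range(M)) / M

def random_estimate(f, M, rng):
    # Plain Monte Carlo with M i.i.d. uniform samples.
    return sum(f(rng.random()) for _ in range(M)) / M

f = lambda x: math.exp(-2.0 * x)
true = (1.0 - math.exp(-2.0)) / 2.0
rng, M, trials = random.Random(3), 16, 2000
rmse_comb = math.sqrt(sum((comb_estimate(f, M, rng) - true) ** 2
                          for _ in range(trials)) / trials)
rmse_rand = math.sqrt(sum((random_estimate(f, M, rng) - true) ** 2
                          for _ in range(trials)) / trials)
```

Both estimators are unbiased; the difference is purely in variance, which is exactly the Y-variance reduction the combed estimators exploit.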
With this principle, we can maintain the physical picture of a particle traversing the interval or use the Volterra integral formulation of transmittance [Georgiev et al. 2019] and still benefit from Y-variance reduction using a query size M. We can also design additional density-reshuffling transformations that further reduce Y-variance.

The CP rotation utilized in the combed estimator may introduce an artificial discontinuity. This is easy to realize when noticing that rotating the set of samples around the integration interval is equivalent to rotating the integrand (while keeping the set of samples

Fig. 9. Top row: M-tap Dirac comb filtering is used to reshuffle the original density μ(x) (left) to reduce Y-variance. Bottom row: an affine transformation (red) that preserves optical depth is applied to the density to match the endpoints and remove the discontinuities in the combed densities with only two extra density evaluations.

fixed). The original interval endpoints a and b coincide at a new location x_i where the rotated integrand μ_cp features a discontinuity:

lim_{s → x_i^−} μ_cp(s) = μ(0),   lim_{s → x_i^+} μ_cp(s) = μ(ℓ).   (35)

Fig. 10.
If the original μ had a bounded maximum slope, which is often the case in practice, μ_cp no longer does (see Figure 10). This breaks the assumption that guarantees the improved convergence rate of equidistant sampling. We can remedy the discontinuity by another density-reshuffling operation, namely subtracting a zero-mean affine control variate that interpolates the endpoints:

μ★(s) := μ(s) + (1/2 − s/ℓ)(μ(ℓ) − μ(0)).   (36)

This modification eliminates the discontinuity caused by the random offset (μ★(0) = μ★(ℓ)) while the integral remains unchanged, re-enabling the improved convergence rate of equidistant sampling. See the bottom row of Figure 9 for an illustration. Appendix D provides further formulas and simplifications.

Sampling the order of the series expansion N is typically performed incrementally by Russian roulette, and several different methods have been discussed [Bhanot and Kennedy 1985; Booth 2007; Georgiev et al. 2019; Girolami et al. 2013; Moka et al. 2019; Papaspiliopoulos 2011]. Our work builds on the Bhanot & Kennedy roulette described in Section 2.5.2, which we modify to incorporate the following observations.

Using a sampled pivot that is close to the true optical thickness reduces the expected contributions of the first and higher orders of the Taylor expansion. Moreover, with the combed and endpoint-matched M-sample U-BK estimator, both the bias and variance of the
zeroth order term are often very small, while most of the variance comes from the higher order (correction) terms.

Using all the samples for the pivot would improve the estimator (which has superlinear convergence in M), but it would also make the method biased; however, we can still take advantage of this observation by allocating a larger portion of the samples to the zeroth order, and sampling the higher order terms more infrequently. In order to do that, we terminate the series at the zeroth order with probability p_Z, and only sample the first and higher order terms with probability 1 − p_Z times the original BK roulette probabilities. This is equivalent to using the original BK roulette with the acceptance probability of the first-order term multiplied by 1 − p_Z, leading to the following probabilities of evaluating the first K terms:

P_0 = 1,   (37)
P_1 = ··· = P_K = 1 − p_Z,   (38)

and the following conditional probability for adding the subsequent terms:

P_{k|k−1} = c/k, where k > K.   (39)

This amortizes the cost of the correction terms, allowing us to use larger tuple sizes which in turn improve the pivot and exponentially reduce the expected contribution and variance of the higher-order terms.

In practice, we found that truncating the series at the zeroth term in 90% of cases, i.e., p_Z = 0.
9, provides a large increase in efficiency across all of our tests. Using the BK scheme with parameters K = c = 2, this lowers the expected evaluation order from approximately 3.19 to approximately 0.32, freeing the budget for a roughly 10× larger tuple size M. The superlinear convergence obtained by increased tuple sizes more than offsets the variance increase caused by the higher weights of the correction terms.

Raising this probability to 99% yields even lower variance, but at the cost of occasional outliers (manifesting as "fireflies"). Moreover, evaluating only the zeroth order term 90% of the time already provides 90% of the possible cost savings, so higher values are unlikely to strike much better efficiency.

With all the above improvements, we obtain an estimator that has superlinear convergence properties in the tuple size. While in itself this is very powerful, designing a strategy to determine optimal tuple sizes may depend on all the sources of noise surrounding transmittance estimation (for example, in a rendering problem, all the sources of noise in path sampling), and we consider it outside the scope of this paper. Our objective is designing an estimator that works well even at relatively low sample counts. We found that a sample count related to the one used in the p-series CMF estimator works well in practice, and we describe its evaluation and use here.

As a first step, we employed a simple grid search to obtain a fit for the expected sample count used by the p-series CMF with 99% mass (given the control / majorant optical thickness τ̄):

E[N_CMF] ≈ ⌈((0.015 + τ̄)(0.65 + τ̄)(60.3 + τ̄))^{1/3}⌉.   (40)

ALGORITHM 2:
AggressiveBKRoulette
Input: p_Z = 0.9; K = c = 2
Output: maximum order k and weights w_0, ..., w_k

w_0 = 1; P = 1 − p_Z; u = rand();
// Stop at the zeroth order term with probability p_Z
if P ≤ u then
    return 0;
end
// BK with K = c = 2
for i = 1 to K do
    w_i = 1/P;
end
for i = K + 1 to ∞ do
    // Compute the continuation probability
    c_i = min(c/i, 1);
    // Update the probability of sampling at least order i
    P = P · c_i;
    // Russian roulette termination
    if P ≤ u then
        return i − 1;
    end
    // Final weight for order i
    w_i = 1/P;
end

ALGORITHM 3: BKExpectedEvalOrder
Input: p_Z, c; K = ⌊c⌋ assumed
Output: expected evaluation order of our roulette

// Evaluate E[N_BK] = K + (K!/c^K)(e^c − Σ_{k=0}^{K} c^k/k!)
K = ⌊c⌋; t = 1; sum = 1;
for k = 1 to K do
    t = t · c/k;
    sum = sum + t;
end
E_N = K + (exp(c) − sum)/t;
// Non-zero orders are evaluated with probability 1 − p_Z
return (1 − p_Z) · E_N;

ALGORITHM 4: DetermineTupleSize (for unbiased ray marching)
Input: control optical thickness τ̄
Output: tuple size M matching the p-series CMF cost

N_CMF = ⌈((0.015 + τ̄)(0.65 + τ̄)(60.3 + τ̄))^{1/3}⌉;
N_BK = BKExpectedEvalOrder(2); // ≈ 0.32
return max(1, ⌊N_CMF/(N_BK + 1) + 0.5⌋);
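The roulette and its expected cost are easy to cross-check numerically. Below is our Python sketch of Algorithms 2 and 3 (order sampling only): the simulated mean order of the aggressive roulette should match the closed form, which for p_Z = 0.9 and K = c = 2 is 0.1 · (2 + (e^2 − 5)/2) ≈ 0.319:

```python
import math
import random

def bk_expected_eval_order(p_Z, c):
    # Algorithm 3: E[N] = K + (K!/c^K)(e^c - sum_{k<=K} c^k/k!),
    # scaled by the probability 1 - p_Z of evaluating any non-zero order.
    K = math.floor(c)
    t, s = 1.0, 1.0
    for k in range(1, K + 1):
        t *= c / k          # t ends up as c^K / K!
        s += t
    return (1.0 - p_Z) * (K + (math.exp(c) - s) / t)

def aggressive_bk_order(rng, p_Z=0.9, c=2.0, K=2):
    # Algorithm 2, order only: truncate at order 0 with probability p_Z,
    # then run the BK roulette with continuation probabilities c/i.
    u, P = rng.random(), 1.0 - p_Z
    if P <= u:
        return 0
    i = K                   # orders 1..K are kept whenever we continue
    while True:
        i += 1
        P *= min(c / i, 1.0)
        if P <= u:
            return i - 1

rng = random.Random(4)
closed = bk_expected_eval_order(0.9, 2.0)
empirical = sum(aggressive_bk_order(rng) for _ in range(200_000)) / 200_000
```

Note how Pr[order ≥ k] equals the running product P at step k, which is exactly the P_k needed for the 1/P_k term weights in Equations 37-39.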
This approximation is, by empirical analysis, asymptotically correct, has a mean absolute error of 0.34 samples for τ̄ < 10, and has a maximum relative error of 9% for τ̄ ≥ 10. We found this more efficient than the approach used by Georgiev et al. [2019].

We then use Algorithm 3 to solve for M such that a generalized BK roulette with given K produces the same mean number of density evaluations as the p-series CMF. This uses an exact formula, Equation 55, for the expected evaluation order E[N_BK]. We need one sample per order plus one for the pivot, and hence, to achieve the same cost, the expected sample count E[N_BK] + 1 times the tuple size M must match E[N_CMF]; in other words, our desired tuple size is given by

M = E[N_CMF] / (E[N_BK] + 1).   (41)

Algorithm 4 provides pseudo-code for the final algorithm.

In practice, in order to not oversample high-density but low-variance volumes, we recommend using the difference between the majorant and minorant optical thicknesses as the control parameter τ̄, when a minorant is available.

The following paragraphs summarize the construction of our final estimators.
Our final unbiased estimator is summarized in Algorithm 5 and works as follows: We first determine the number of density evaluations, M, for estimating each sample of negative optical thickness X_i (as described in Section 4.6), and we determine the highest order of the power series, N, using the BK roulette (Section 4.5); these first two steps do not impact each other. Then we compute N + 1 combed estimates (X_1, ···, X_{N+1}) using equidistant, CP-rotated evaluations (Section 4.3) and apply endpoint matching (Section 4.4). Finally, we use each X_i as the pivot (Equation 31) and evaluate the Taylor series thereof using the symmetrized estimator (Section 4.1). Specifically, we use our new elementary-means algorithm (Algorithm 1) with the remaining N estimates X \ {X_i}.

Notice that at very low sample counts, e.g., M <
6, we disable endpoint matching, since we found that the additional overhead of its two extra evaluations μ(0) and μ(ℓ) was not worth the resulting variance reduction.

Most of the time our estimator evaluates only the zeroth order term, when the roulette samples N = 0. In this case, the estimate is simply e^{X_1}, where X_1 is a single estimate of the negative optical depth using equidistant evaluations of the density from a to b. On its own, this estimate is virtually the same as jittered ray marching [Pauly et al. 2000], potentially coupled with the endpoint-matching control variate. The higher order terms correct the bias, so we call this estimator unbiased ray marching.

One of the surprising conclusions from Section 4.5 is that with all our optimizations in place and enough equidistant samples and a sampled pivot, we can make the Russian roulette most often truncate at the constant term and still obtain very little variance. This is possible because our
ALGORITHM 5:
Unbiased ray marching
Input :
Interval length ℓ ; control optical thickness ¯ 𝜏 Output :
Transmittance T

M = DetermineTupleSize(τ̄);
N = AggressiveBKRoulette(K = c = 2, p_Z = 90%);
μ_ℓ, μ_0 = EvalDensity(ℓ), EvalDensity(0); // optional
for i = 1 to N + 1 do
    u_i = rand();
    X_i = −(ℓ/M) · Σ_{j=0}^{M−1} EvalDensity((ℓ/M)(u_i + j));
    X_i = X_i − (ℓ/M)(1/2 − u_i)(μ_ℓ − μ_0); // optional
end
T = 0;
for j = 1 to N + 1 do
    m_0, ···, m_N = ElementaryMeans({X \ X_j} − X_j);
    T = T + (1/(N+1)) e^{X_j} Σ_{k=0}^{N} m_k/(k! P_k);
end

sampled pivots become increasingly good estimates for the real integral with the addition of more equidistant samples, and hence even a zeroth order Taylor polynomial often results in a very good, and very cost-effective, estimate of the real integral.

This surprising behavior is partially explained by the following observation: When the pivot X is an unbiased estimate of the integral, the zeroth order approximation e^X is actually in a sense accurate to the first order, essentially gaining an order of accuracy for free:

E[e^X − e^{E[X]}] = e^{E[X]} E[e^{X − E[X]} − 1]
                  = e^{E[X]} E[(X − E[X]) + (X − E[X])^2/2! + ···]
                  = e^{E[X]} E[(X − E[X])^2/2! + ···].   (42)

This suggests that we can build an effective low-bias estimator by always truncating the series at the zeroth order, that is to say evaluating only:

e^{−τ} = e^{E[X]} ≈ e^{X},   (43)

where X is an unbiased estimate of the negative optical depth, which we obtain with combing and by using all of the transmittance budget to increase the tuple size M. We couple this technique with the endpoint-matching control variate (see Algorithm 6). This is again the same as jittered ray marching applied to the endpoint-matching-reshuffled density: a surprisingly simple algorithm.

In this section we compare our proposed unbiased and biased transmittance estimators to ratio tracking (RT) [Cramer 1978], residual ratio tracking (RRT) [Novák et al. 2014], and the p-series CMF [Georgiev et al. 2019] estimators in a variety of scenes featuring participating media. For the unbiased methods we report variance, and for our biased ray marching we measure the mean squared error of one sample.
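Putting the pieces together, both estimators can be sketched as a compact Python prototype (ours; helper names are our own, `mu` stands in for EvalDensity, tuple-size selection and minorant handling are omitted, and the unbiased variant skips endpoint matching for brevity). For a constant density every comb estimate equals −τ exactly, all symmetric means of the shifted samples vanish, and the unbiased estimator returns e^{−τ} with zero variance, as claimed in Section 4.2:

```python
import math
import random

def elementary_means(xs, Z):
    # Incremental elementary symmetric means m_0..m_Z (Algorithm 1).
    m = [1.0] + [0.0] * Z
    for n, x in enumerate(xs, start=1):
        for k in range(min(n, Z), 0, -1):
            m[k] += (k / n) * (m[k - 1] * x - m[k])
    return m

def roulette(rng, p_Z=0.9, c=2.0, K=2):
    # Aggressive BK roulette: order N and probabilities P_k = Pr[order >= k].
    u, P = rng.random(), 1.0 - p_Z
    if P <= u:
        return 0, [1.0]
    Ps = [1.0] + [P] * K
    i = K
    while True:
        i += 1
        P *= min(c / i, 1.0)
        if P <= u:
            return i - 1, Ps
        Ps.append(P)

def comb_sample(mu, ell, M, rng):
    # CP-rotated equidistant estimate of the negative optical depth (Eq. 33).
    u = rng.random()
    return -(ell / M) * sum(mu(ell * (u + j) / M) for j in range(M))

def unbiased_ray_march(mu, ell, M, rng):
    # Sketch of Algorithm 5 (endpoint matching omitted): rotate the pivot
    # over all N+1 samples and evaluate the symmetrized series around each.
    N, Ps = roulette(rng)
    X = [comb_sample(mu, ell, M, rng) for _ in range(N + 1)]
    T = 0.0
    for j in range(N + 1):
        m = elementary_means([X[i] - X[j] for i in range(N + 1) if i != j], N)
        T += math.exp(X[j]) / (N + 1) * sum(
            m[k] / (math.factorial(k) * Ps[k]) for k in range(N + 1))
    return T

def biased_ray_march(mu, ell, M, rng):
    # Sketch of Algorithm 6: zeroth-order truncation with the
    # endpoint-matching control variate (Eq. 36) folded into X.
    u = rng.random()
    X = -(ell / M) * sum(mu(ell * (u + j) / M) for j in range(M))
    X -= (ell / M) * (0.5 - u) * (mu(ell) - mu(0.0))
    return math.exp(X)
```

For varying densities the unbiased estimator remains exact in expectation: each X_i is unbiased for −τ and the 1/P_k weights undo the truncation, which can be verified by averaging many runs against e^{−τ} for a density with a known integral.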
ALGORITHM 6:
Biased ray marching
Input :
Interval length ℓ ; control optical thickness ¯ 𝜏 Output :
Transmittance T

M = ⌈((0.015 + τ̄)(0.65 + τ̄)(60.3 + τ̄))^{1/3}⌉;
u = rand();
X = −(ℓ/M) · Σ_{j=0}^{M−1} EvalDensity((ℓ/M)(u + j));
X = X − (ℓ/M)(1/2 − u)(EvalDensity(ℓ) − EvalDensity(0)); // optional
T = e^X;

In Figure 1 and Figure 11 we study the performance of the individual estimators in a path tracer. All volumes in the figure are stored using the VDB data structure [Museth 2013], which additionally provides aggregate volumetric statistics (minimum, maximum, and mean density) over 8 × 8 × 8 regions. We generate equidistant samples in the [0,
1] primary interval and then transform them into a warped comb along the ray.

The insets in the figure show results for different estimators at one path sample per pixel. Since the efficiency of certain estimators improves with higher lookup counts, we normalize the comparison by adjusting them to yield an approximately equal number of density lookups per transmittance estimate. We use the p-series CMF estimator as the baseline and uniformly increase local (residual) majorants such that the tracking estimators utilize approximately the same number of lookups (predicted by Equation 40). For our methods we employ the automatic tuple size mechanism discussed in Section 4.6. In the following we list the specifics of the individual scenes:

• Plume features absorptive smoke. Transmittance estimation is the only source of noise in this scene; this setting is thus the most representative one (out of the four scenes) of the relative performance between the estimators.
• Box features indirect illumination from an area light surrounded by an absorptive medium. Despite the extra noise from simulating up to four light bounces, the impact of the different transmittance estimators is still clearly visible.
• Cloud features single-scattering illumination due to two point lights. We use equiangular sampling [Kulla and Fajardo 2011] to sample collisions along primary rays. Transmittance along the primary ray and the shadow rays is estimated with the studied estimator. The improvement from the transmittance estimation is partially masked by other sources of noise.
• Glass with Smoke features frequency-dependent absorptive smoke in a reflective glass box and shows how improved transmittance estimation can affect the quality of volume rendering either directly or through reflections.
• Glass (Figure 1) features another frequency-dependent absorptive medium in a glass embedding.
Our unbiased estimator obtains an MSE reduction between 1.5× and 13× across all scenes compared to the previous state-of-the-art method for each scene. Our biased estimator provides an additional improvement of 1.1× to 2× on top of that. Note that the MSE values also include other sources of noise (such as global illumination in the Box scene, or single scattering in the Cloud scene), which partly masks the improvements in transmittance estimation.
Scaling to higher quality.
Figure 12 shows a simple test comparing the variance of our estimators to Georgiev et al.'s p-series CMF as a function of the number of density evaluations, on a single example density that exhibits high frequencies and fractal behavior. For p-series CMF, we examined two methods of increasing the expected number of samples: averaging multiple evaluations, and multiplying the optical thickness (in this case the majorant) by a constant greater than 1. For our methods we increase the control optical thickness similarly, but it is only used for calculating the tuple size. Increasing the majorant helps the p-series CMF estimator in this particular instance, but this is not always the case, as we will see later on. Our method always benefits from increasing the tuple size due to the improved pivot and lower-variance correction samples, but we see bumps in the convergence curves due to the non-uniform frequency response of our equidistant sampling combs. Despite the bumps, we always found equidistant sampling to perform better than breaking the frequency response with, e.g., stratified sampling or a low-discrepancy pattern.

Our methods clearly show a higher rate of convergence, which continually increases their lead by orders of magnitude when targeting noise-free transmittance estimates, with the biased variant featuring slightly lower MSE at the cost of a small amount of bias.
Endpoint matching.
Figure 13 shows another test where we analyze the behavior of our estimator with and without endpoint matching on two different densities. The top of the figure shows a case where the density is very different at the two endpoints, which creates a strong discontinuity in the periodic extension of the function. Our endpoint-matching control variate removes this discontinuity, greatly reducing the variance and improving the convergence rate. The effect is particularly large when the discontinuity is high compared to the other variation in the density function (as in this example). The bottom plot shows a counterexample where the control variate does not yield any improvement. The plot shows that the overhead of performing the extra lookups at the endpoints is relatively low; we see only a mild reduction in efficiency, especially when targeting high-quality transmittance estimates.
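The effect is easy to reproduce in isolation. The sketch below is illustrative only (it uses a hand-picked density and comb size, not the setup of Figure 13): it estimates the optical depth with an M = 4 comb, with and without the endpoint-matching term, and compares empirical variances over random offsets u.

```python
import math
import random

def combed_depth(density, ell, M, u, endpoint_matching):
    """Combed negative-optical-depth estimate, optionally endpoint matched."""
    X = -(ell / M) * sum(density(ell / M * (u + j)) for j in range(M))
    if endpoint_matching:
        X -= (ell / M) * (0.5 - u) * (density(ell) - density(0.0))
    return X

def empirical_variance(density, ell, M, endpoint_matching, n=20000, seed=1):
    """Sample variance of the combed estimate over random offsets u."""
    rng = random.Random(seed)
    xs = [combed_depth(density, ell, M, rng.random(), endpoint_matching)
          for _ in range(n)]
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

# A density whose endpoints differ strongly: mu(0) = 1, mu(1) = e.
mu = lambda x: math.exp(x)
var_plain = empirical_variance(mu, 1.0, 4, endpoint_matching=False)
var_em = empirical_variance(mu, 1.0, 4, endpoint_matching=True)
```

For this density the endpoints differ by e − 1, and the matching term removes the dominant discontinuity of the periodic extension; the measured variance drops by a large factor.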
Pure transmittance estimation.
Figure 14 analyzes the impact of gradually enabling some of our proposed techniques; the results from the p-series CMF estimator are used as a baseline. All methods are adjusted to utilize roughly the same number of lookups. We estimate transmittance through a uniformly lit volumetric slab with a variable density field. The slab is viewed from the +Z direction; its (X, Z) cross-section features a 2D fractal density field with increasing fractal dimension going from left to right, modulated along the vertical Y axis so as to have near-zero density at the bottom, and a maximum optical thickness of 10 at the top. Each pixel in the rendered insets represents transmittance along a single ray through
Fig. 11 data. Columns, left to right: Ratio tracking (RT) [Cramer 1978], Residual RT [Novák et al. 2014], P-series CMF [Georgiev et al. 2019], Unbiased ray march. (ours), Biased ray march. (ours).

Plume:
  Variance (MSE):   0.0277   | 0.0175   | 0.0287   | 3.87e-3  | 2.14e-3
  Cost:             7.39     | 7.39     | 7.22     | 7.44     | 7.39
  Inv. efficiency:  0.231    | 0.143    | 0.231    | 0.0301   | 0.0163

Box:
  Variance (MSE):   0.200    | 0.131    | 0.170    | 0.0518   | 0.0465
  Cost:             46.2     | 46.2     | 45.2     | 46.5     | 46.2
  Inv. efficiency:  8.96     | 5.79     | 7.35     | 2.27     | 2.02

Cloud:
  Variance (MSE):   72.5     | 43.0     | 38.5     | 24.7     | 15.5
  Cost:             22.4     | 22.4     | 22.4     | 22.5     | 22.4
  Inv. efficiency:  3.06e+3  | 1.79e+3  | 1.61e+3  | 1.01e+3  | 645

Glass with Smoke:
  Variance (MSE):   1.24e-3  | 1.12e-3  | 1.43e-3  | 1.02e-4  | 4.15e-5
  Cost:             3.35     | 3.35     | 3.28     | 3.37     | 3.35
  Inv. efficiency:  0.0167   | 0.0152   | 0.0183   | 1.14e-3  | 5.22e-4
Fig. 11. A comparison of our unbiased and biased estimators (two rightmost columns) to ratio tracking, residual ratio tracking, and the p-series CMF estimator on a variety of rendered content featuring participating media. The Plume, Box, and Glass with Smoke scenes contain purely absorptive media, while the Cloud scene shows single-scattering illumination by point lights, rendered using equiangular sampling to sample collisions along primary rays.
[Figure 12 plots: variance/MSE versus evaluation count (10 to 600) for pCMF (avg), pCMF (inc. maj.), ours (unbiased), and ours (biased); the example density profile along the ray has transmittance 0.22.]
Fig. 12. A graph of the variance (respectively, MSE) of our unbiased and biased estimators, as well as that of Georgiev et al.'s p-series CMF, as a function of sample count. For p-series CMF, we display two methods of increasing the expected sample count: the first (dashed blue line) averages multiple evaluations, the second (dashed yellow line) increases the control optical thickness (in this case the majorant). Both our estimators display a faster convergence rate.
[Figure 13 plots: variance versus evaluation count for pCMF (avg), pCMF (inc. maj.), ours (unbiased), and ours + EM, on two densities (transmittance 0.22 and 0.85).]
Fig. 13. Endpoint matching (top, dashed purple) may improve the convergence rate over the base method (green) when the interval ends are at very different densities compared to the general density variation (top right). Endpoint matching is not beneficial when the ends are at similar densities (bottom).

the slab, evaluated using one of the tested estimators. We performed two tests: the first test (two leftmost columns) employs tight per-pixel majorants, whereas the second test (two rightmost columns) employs a single global majorant for the entire volume. The four rows compare:

• the p-series CMF estimator;
• our U-BK estimator with c = 2 using a sampled pivot and combing;
• our U-BK estimator with c = 2 using a sampled pivot, combing, and our endpoint matching control variate;
• our final U-BK estimator using the sampled pivot, combing, the endpoint matching control variate, and our aggressive roulette scheme detailed in Section 4.5.

All variants of our estimators employ the automatic tuple size deduction algorithms described in Section 4.6 in order to match the expected sample count of the p-series CMF estimator. Odd columns show the result of a single evaluation, whereas even columns show a plot of inverse efficiency. We make the following observations:

• Using tight per-pixel majorants causes the p-series CMF estimator to take discrete jumps in the base number of terms evaluated, because it activates RR only after reaching 99% of the mass; this appears as blocky variations in variance/efficiency.
• With enough samples (e.g., with the global majorant), our equidistant sampling combs coupled with the symmetrization provided by U-statistics already provide a significant efficiency improvement.
• Enabling the endpoint matching control variate allows a relatively large variance reduction in some areas, but the largest improvement is obtained by combining the previous techniques with our aggressive roulette, which allows using even larger tuples by sampling fewer orders.
• In regions with a low-frequency density function, we obtain up to 5 orders of magnitude improvement in efficiency. With higher frequencies our final estimator achieves 2 to 3 orders of magnitude lower variance.

Since our estimator gains efficiency with larger and larger tuples, we also compared the evaluation of the p-series CMF estimator with varying majorants μ̄ (respectively 1, 10, and 100 times larger than the tight per-pixel majorant) against single evaluations of our estimators with a tuple size M = DetermineTupleSize(μ̄); see Figure 15. This comparison reveals that using larger majorants with the p-series CMF estimator can be very detrimental at low optical thicknesses; the majorant effectively acts as a worse and worse pivot. This leads to an exponential increase of the error of the low-order terms that is never fully recovered, as the Russian roulette continuation probability after the CMF threshold of 99% approaches zero. In the second and third rows, increasing the majorant appears to squeeze the large variance (low efficiency) bump at the center of the first row (using the tight majorant) towards the bottom of the slab, where the transmittance T approaches 1.

In the Results section we have seen how our novel unbiased ray-marching estimator provides a major efficiency improvement across all our tests compared to the previous state of the art, and how the biased ray-marching solution reaches even lower MSE at equal cost. In the following we discuss a different perspective on our U-statistics estimator, as well as alternative strategies to equidistant combing and connections to the more general theme of sample stratification.
Complex factorization of the truncated Taylor polynomial.
Another path to obtaining our U-statistics estimator is to apply the complex factorization of the truncated power-series polynomial:

\sum_{k=1}^{N} \frac{Q(k)}{k!} \prod_{i=1}^{k} X_i = c_0 \prod_{i=1}^{N} (X_i - c_i)  (44)
Fig. 14. This setup compares our techniques to the power-series estimator (p-series CMF) from Georgiev et al., simulating transmittance through a 3D slab with varying density. The (X, Z) cross-section of the density field, shown in the upper-left corner, features a higher and higher fractal dimension going from left to right. The average density is varied across the vertical axis Y, so as to have near-zero density at the bottom of the slab and a maximum optical thickness of 10 towards the top. The slab is illuminated by a uniform directional light source on the back, so that each pixel in the image plane records the amount of light transmitted through a single ray through the slab. The two leftmost columns show equal-sample-count results using tight per-pixel majorants, whereas the rightmost columns show equal-sample-count results using a single global majorant μ̄ = 25. Note the use of a logarithmic color scale: our final estimators provide a 2 to 5 orders of magnitude efficiency increase over previous state-of-the-art.

and apply the generic estimator of products of unbiased estimators recently suggested by Lee et al. [2019, Eq. (6)]. The resulting permuted estimator matches exactly our U-statistics estimator, and despite the presence of complex coefficients, the imaginary part cancels out. What is most interesting, though, is that unlike the generic estimators of Lee et al. [2019], our algorithms can exploit the structure of the truncated Taylor series to evaluate all combinations in O(N^2) time, whereas the direct evaluation of [Lee et al. 2019, Eq. (6)] is #P-hard.

Connections of combing to stratified sampling.
Combing can be seen as a form of stratified sampling applied to each individual estimate of the integral of the null density. It is important to note that the separate integral estimates are uncorrelated. Using a single stratified set of random numbers across all orders is not possible, as that would result in correlated integral estimates whose products would yield biased estimates of the powers of −τ. Georgiev et al. [2019] had previously suggested another form of stratification, across multiple evaluations of the transmittance integral. This form of stratification is orthogonal and can be combined with our approach: it is sufficient to stratify the Cranley-Patterson rotations (x_1, ..., x_N) across different evaluations of the estimator, for example using Latin hypercube sampling or some other randomized QMC sequence.

Alternative strategies for reducing Y-variance. A regular comb using an equidistant sampling tuple works well under the assumption that the density has bounded slope: in this case it can potentially reduce the integration error to O(1/M) or less. For highly discontinuous densities, or densities with very high fractal dimension, this might no longer be the case. An alternative in these extreme cases could be using CP-rotated low-discrepancy blue-noise combs that react less to the spectrum of the integrand. An example of such a comb can easily be obtained using u_j = ℓ · (φ·j mod 1), where φ is the well-known golden ratio. In practice, however, we have found equidistant sampling to always outperform any other low-discrepancy set we have tried. This might be related to the observations of Ramamoorthi et al. [2012].
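Concretely, the two kinds of combs can be constructed as below. The reduction of the golden-ratio sequence to [0, ℓ) by the fractional part is our assumption, since the text leaves the wrap-around implicit.

```python
import math

def equidistant_comb(ell, M, u):
    """Cranley-Patterson rotated equidistant comb: M points in [0, ell)."""
    return [ell / M * (u + j) for j in range(M)]

def golden_ratio_comb(ell, M, u=0.0):
    """Low-discrepancy comb u_j = ell * frac(u + phi * j)."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    return [ell * ((u + phi * j) % 1.0) for j in range(M)]
```

Stratifying the rotations u across different evaluations (e.g., by Latin hypercube sampling) is orthogonal to the choice of comb, as noted above.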
Fig. 15. This figure uses the same setup as Figure 14 to compare the variance and inverse efficiency of the p-series CMF estimator (left 3 columns) to that of our unbiased estimator (right 3 columns) when the majorant is raised respectively by 1, 10, and 100 times compared to the tight per-pixel majorant, increasing the number of density evaluations. Notice how at low optical thickness values the original p-series CMF estimator can even suffer from raising the majorant above a certain point, as the error from the low-order terms increases exponentially due to the use of a very bad pivot, without ever being fully recovered. Our estimator is able to use all the available density evaluations to reduce variance and improve efficiency.
While in this work we have focused on matching the sample budgets of previously known methods, as briefly mentioned in Section 4.6 a natural and needed extension of this work would be a scheme for adaptive allocation of tuple sizes in the presence of additional sources of noise: the superlinear convergence properties of our estimators might in fact make it highly beneficial to take more samples in important regions of path space, while taking fewer in less important ones. Another potentially related point that deserves attention is a more thorough investigation of the bias/variance tradeoff of our biased and unbiased estimators.

There may be scenarios where negative transmittance estimates are undesired or the sample budget is fixed independently of a majorant optical depth. What estimator performs best in these cases remains an open question. Finally, power-series estimation of zero-order probabilities for random media with non-exponential transmission laws, where μ(x) is a random variable, is an interesting open area [Bitterli et al. 2018; d'Eon 2018; Jarabo et al. 2018], and some steps in this direction using the Master equation for binary mixtures have already been made [Longo 2002].

We presented a novel in-depth variance analysis (Section 3) of existing unbiased transmittance estimators, revealing weaknesses and areas for improvement. We then proposed a series of techniques (Section 4) exploiting these insights, specifically:

• We have presented a novel power-series estimator utilizing all samples efficiently using U-statistics, a recipe for evaluating the estimator in quadratic time, and a numerically robust, incremental elementary symmetric means algorithm.
• We have demonstrated how to further reduce variance by using sampled mean pivots instead of majorant-derived ones, a development enabled by the U-statistics.
• We described a combed estimator for evaluating optical depth using M rotated equidistant samples and proposed an affine CV to preserve its superlinear convergence rate.
• We have proposed to alter the BK roulette and make it vastly more aggressive, enabling us to use larger combs and attain even higher overall efficiency.

Since the zeroth-order term of our final power-series estimator is analogous to the classical ray-marching solution (with the addition of our endpoint matching control variate), we refer to the novel estimator as unbiased ray marching. We have shown that unbiased ray marching is universally faster than any of the previously known unbiased estimators, and often offers several orders of magnitude lower variance at equal sample count. Moreover, we have shown that stopping the power-series evaluation at the zeroth order and effectively getting back to simple ray marching results in a very low-bias estimator that attains lower MSE than any known unbiased estimator, even at relatively low sample counts. This latter result might have interesting consequences for real-time rendering and other applications where unbiasedness is not crucial.
REFERENCES
John Amanatides and Andrew Woo. 1987. A Fast Voxel Traversal Algorithm for Ray Tracing. In EG 1987 Technical Papers. Eurographics Association. https://doi.org/10.2312/egtp.19871000
H. W. Bertini. 1963. Monte Carlo Simulations on Intranuclear Cascades. Technical Report ORNL-3383. Oak Ridge National Laboratory, Oak Ridge, TN, USA. https://doi.org/10.2172/4692927
Alexandros Beskos, Omiros Papaspiliopoulos, Gareth O. Roberts, and Paul Fearnhead. 2006. Exact and computationally efficient likelihood-based estimation for discretely observed diffusion processes (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68, 3 (2006), 333-382.
Gyan Bhanot and Anthony D. Kennedy. 1985. Bosonic lattice gauge theory with noise. Physics Letters B (1985).
Benedikt Bitterli, Srinath Ravichandran, Thomas Müller, Magnus Wrenninge, Jan Novák, Steve Marschner, and Wojciech Jarosz. 2018. A radiative transfer framework for non-exponential media. ACM Transactions on Graphics 37, 6 (2018). https://doi.org/10.1145/3272127.3275103
Thomas E. Booth. 2007. Unbiased Monte Carlo estimation of the reciprocal of an integral. Nuclear Science and Engineering (2007).
N. P. Buslenko, D. I. Golenko, Yu. A. Shreider, I. M. Sobol, and V. G. Sragovich. 1966. The Monte Carlo Method: The Method of Statistical Trials. Vol. 87. Pergamon.
J. C. Butcher and H. Messel. 1958. Electron Number Distribution in Electron-Photon Showers. Phys. Rev. 112, 6 (Dec. 1958), 2096-2106. https://doi.org/10.1103/PhysRev.112.2096
R. H. Cameron. 1954. The generalized heat flow equation and a corresponding Poisson formula. Annals of Mathematics (1954), 434-462.
L. L. Carter, E. D. Cashwell, and W. M. Taylor. 1972. Monte Carlo sampling with continuously varying cross sections along flight paths. Nucl. Sci. Eng. 48 (1972), 403-411. https://doi.org/10.13182/NSE72-1
Subrahmanyan Chandrasekhar. 1960. Radiative Transfer. Dover.
Nan Chen and Zhengyu Huang. 2012. Brownian meanders, importance sampling and unbiased simulation of diffusion extremes. Operations Research Letters 40, 6 (2012), 554-563.
W. A. Coleman. 1968. Mathematical Verification of a Certain Monte Carlo Sampling Technique and Applications of the Technique to Radiation Transport Problems. Nuclear Science and Engineering 32, 1 (1968), 76-81. https://doi.org/10.13182/NSE68-1
D. R. Cox and P. A. W. Lewis. 1966. The Statistical Analysis of Series of Events. Wiley.
S. N. Cramer. 1978. Application of the fictitious scattering radiation transport model for deep-penetration Monte Carlo calculations. Nuclear Science and Engineering 65, 2 (1978), 237-253. https://doi.org/10.13182/NSE78-A27154
Eugene d'Eon. 2018. A reciprocal formulation of nonexponential radiative transfer. 1: Sketch and motivation. Journal of Computational and Theoretical Transport (2018). https://doi.org/10.1080/23324309.2018.1481433
Mouna El-Hafi, Stephane Blanco, Jeremi Dauchet, Mathieu Galtier, Richard Fournier, Jean-Marc Tregan, and Najda Villefranque. 2018. Three viewpoints on null-collision Monte Carlo algorithms. In CTRPM-VI, 6th Computational Thermal Radiation in Participating Media VI.
Paul Fearnhead, Omiros Papaspiliopoulos, and Gareth O. Roberts. 2008. Particle filters for partially observed diffusions. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70, 4 (2008), 755-777.
M. Galtier, S. Blanco, Cyril Caliot, C. Coustet, J. Dauchet, Mouna El-Hafi, Vincent Eymet, R. Fournier, J. Gautrais, A. Khuong, B. Piaud, and Guillaume Terrée. 2013. Integral formulation of null-collision Monte Carlo algorithms. Journal of Quantitative Spectroscopy and Radiative Transfer 125 (Aug. 2013), 57-68. https://doi.org/10.1016/j.jqsrt.2013.04.001
Iliyan Georgiev, Zackary Misso, Toshiya Hachisuka, Derek Nowrouzezahrai, Jaroslav Křivánek, and Wojciech Jarosz. 2019. Integral formulations of volumetric transmittance. ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia) 38, 6 (Nov. 2019). https://doi.org/10/dffn
Mark Girolami, Anne-Marie Lyne, Heiko Strathmann, Daniel Simpson, and Yves Atchade. 2013. Playing Russian roulette with intractable likelihoods. arXiv preprint arXiv:1306.4032 (2013).
Gerald J. Glasser. 1962. Minimum variance unbiased estimators for Poisson probabilities. Technometrics 4, 3 (1962), 409-418.
Paul R. Halmos. 1946. The theory of unbiased estimation. The Annals of Mathematical Statistics (1946), 34-43.
Pierre E. Jacob, Alexandre H. Thiery, et al. 2015. On nonnegative unbiased estimators. The Annals of Statistics 43, 2 (2015), 769-784.
Adrian Jarabo, Carlos Aliaga, and Diego Gutierrez. 2018. A Radiative Transfer Framework for Spatially-Correlated Materials. ACM Transactions on Graphics 37, 4 (2018), 14. https://doi.org/10.1145/3197517.3201282
N. L. Johnson. 1951. Estimators of the probability of the zero class in Poisson and certain related populations. The Annals of Mathematical Statistics 22, 1 (1951), 94-101. https://doi.org/10.1214/aoms/1177729696
Daniel Jonsson, Joel Kronander, Jonas Unger, Thomas B. Schon, and Magnus Wrenninge. 2020. Direct Transmittance Estimation in Heterogeneous Participating Media Using Approximated Taylor Expansions. IEEE Transactions on Visualization and Computer Graphics (2020). https://doi.org/10.1109/TVCG.2020.3035516
Robert W. Klein and Stephen D. Roberts. 1984. A time-varying Poisson arrival process generator. Simulation 43, 4 (1984), 193-195.
Christopher Kulla and Marcos Fajardo. 2011. Importance Sampling of Area Lights in Participating Media. In ACM SIGGRAPH 2011 Talks (SIGGRAPH '11), Article 55. https://doi.org/10.1145/2037826.2037899
Anthony Lee, Simone Tiberi, and Giacomo Zanella. 2019. Unbiased approximations of products of expectations. Biometrika (2019).
A. J. Lee. 1990. U-Statistics: Theory and Practice. Routledge.
David Legrady, Balazs Molnar, Milan Klausz, and Tibor Major. 2017. Woodcock tracking with arbitrary sampling cross section using negative weights. Annals of Nuclear Energy 102 (2017), 116-123. https://doi.org/10.1016/j.anucene.2016.12.003
L. Lin, K. F. Liu, and J. Sloan. 2000. A noisy Monte Carlo algorithm. Physical Review D 61, 7 (Apr. 2000). https://doi.org/10.1103/PhysRevD.61.074505
Savino Longo. 2002. Direct derivation of Skullerud's Monte Carlo method for charged particle transport from the linear Boltzmann equation. Physica A: Statistical Mechanics and its Applications (2002).
Anne-Marie Lyne, Mark Girolami, Yves Atchadé, Heiko Strathmann, and Daniel Simpson. 2015. On Russian roulette estimates for Bayesian inference with doubly-intractable likelihoods. Statistical Science 30, 4 (2015), 443-467.
D. G. Mead. 1992. Newton's Identities. The American Mathematical Monthly 99, 8 (1992), 749-751.
Soviet Atomic Energy 28, 2 (1970), 224-225. https://doi.org/10.1007/BF01162640
Gennadii A. Mikhailov. 1992. Optimization of Weighted Monte Carlo Methods. Springer.
Biometrika 61, 1 (1974), 1-15.
Sarat Babu Moka, Dirk P. Kroese, and Sandeep Juneja. 2019. Unbiased estimation of the reciprocal mean for non-negative random variables. In 2019 Winter Simulation Conference (WSC). IEEE, 404-415.
Ken Museth. 2013. VDB: High-Resolution Sparse Volumes with Dynamic Topology. ACM Trans. Graph. 32, 3, Article 27 (July 2013), 22 pages. https://doi.org/10.1145/2487228.2487235
Jerzy Neyman and Elizabeth L. Scott. 1960. Correction for bias introduced by a transformation of variables. The Annals of Mathematical Statistics 31, 3 (1960), 643-655.
Jan Novák, Iliyan Georgiev, Johannes Hanika, and Wojciech Jarosz. 2018. Monte Carlo Methods for Volumetric Light Transport Simulation. Computer Graphics Forum (Proceedings of Eurographics - State of the Art Reports) 37, 2 (May 2018). https://doi.org/10.1111/cgf.13383
Jan Novák, Andrew Selle, and Wojciech Jarosz. 2014. Residual ratio tracking for estimating attenuation in participating media. ACM Trans. Graph. 33, 6 (2014), Article 179. https://doi.org/10.1145/2661229.2661292
Omiros Papaspiliopoulos. 2011. Monte Carlo probabilistic inference for diffusion processes: A methodological framework. Bayesian Time Series Models (2011), 82-103.
Raghu Pasupathy. 2010. Generating nonhomogeneous Poisson processes. Wiley Encyclopedia of Operations Research and Management Science (2010). https://doi.org/10.1002/9780470400531.eorms0355
Mark Pauly, Thomas Kollig, and Alexander Keller. 2000. Metropolis Light Transport for Participating Media. Rendering Techniques (2000).
Monte Carlo and Quasi Monte Carlo Methods 2006. Springer, 591-601.
Ravi Ramamoorthi, John Anderson, Mark Meyer, and Derek Nowrouzezahrai. 2012. A Theory of Monte Carlo Visibility Sampling. ACM Transactions on Graphics (2012). http://graphics.berkeley.edu/papers/Ramamoorthi-ATO-2012-02/
H. R. Skullerud. 1968. The stochastic computer simulation of ion motion in a gas subjected to a constant electric field. Journal of Physics D: Applied Physics 1, 11 (1968), 1567-1568. https://doi.org/10.1088/0022-3727/1/11/423
László Szirmay-Kalos, Balázs Tóth, and Milán Magdics. 2011. Free path sampling in high resolution inhomogeneous participating media. In Computer Graphics Forum, Vol. 30. Wiley Online Library, 85-97. https://doi.org/10.1111/j.1467-8659.2010.01831.x
Wolfgang Wagner. 1987. Unbiased Monte Carlo evaluation of certain functional integrals. J. Comput. Phys. 71, 1 (1987), 21-33.
Wolfgang Wagner. 1988. Monte Carlo evaluation of functionals of solutions of stochastic differential equations. Variance reduction and numerical examples. Stochastic Analysis and Applications 6, 4 (1988), 447-468.
E. Woodcock, T. Murphy, P. Hemmings, and S. Longworth. 1965. Techniques used in the GEM code for Monte Carlo neutronics calculations in reactors and other systems of complex geometry. Applications of Computing Methods in Reactor Physics, 557.
C. D. Zerby, R. B. Curtis, and H. W. Bertini. 1961. The Relativistic Doppler Problem. Technical Report ORNL-61-7-20. Oak Ridge National Laboratory, Oak Ridge, TN, USA. https://doi.org/10.2172/4836227
A OPTIMALITY OF THE MEAN PIVOT
Earlier, we discussed how non-symmetric power-series estimators benefit from a majorant pivot and how U-statistics changes this behaviour. With this important difference, we perform a similar analysis for the pivot as in earlier work (e.g., [Georgiev et al. 2019]): we analyze the sum of the absolute values of the different order contributions in the Taylor series of e^{E[X]} with different pivots p. The sum of the absolute values of the contributions from all orders with pivot p is

e^{p} \left( 1 + |E[X] - p| + \frac{|E[X] - p|^2}{2!} + \cdots \right) = e^{p + |E[X] - p|}.  (45)

This says that, in some sense, it is optimal to use any pivot less than the expectation, p ≤ E[X], as any such pivot minimizes the above expression. In terms of positive densities, this says that the control density should be at least as high as the mean density. However, the control density does not need to be greater than all of the density samples: there is no need to use a majorant.

However, the picture changes drastically when we take Russian roulette into account: as noted earlier, moving the pivot closer to −τ implies faster convergence for the Taylor series. This means that we need to evaluate fewer orders of the power series for good estimates, which means that we can employ more aggressive Russian roulette. Interestingly, with a good pivot, even aggressive Russian roulette will not make efficiency worse: for N total evaluations of an estimator X, returning the roulette-compensated variable with probability p and otherwise zero, the actual number of evaluations is Np. The inverse efficiency of the estimator is thus proportional to

p \, Var\!\left[ \frac{RX}{p} \right] = p \left( \frac{E[R] \, E[X^2]}{p^2} - \frac{E[R]^2 \, E[X]^2}{p^2} \right) = E[X^2] - p \, E[X]^2 = Var[X] + (1 - p) \, E[X]^2,  (46)

where R is the random binary choice variable. This says that the efficiency of the roulette is maximized when E[X] = 0, that is, when we use the theoretical mean pivot. The efficiency of the rouletted estimator decreases as the pivot moves farther from the real expectation. Therefore, using an approximate mean pivot allows the use of a more aggressive roulette.

The resulting lower mean estimation order from more aggressive roulette means that we now need a smaller number of independent samples, and we can use our density evaluation budget to make those samples higher quality by performing variance reduction techniques such as stratification or numerical integration rules, as discussed in Section 4.

B ELEMENTARY SYMMETRIC MEANS
In this appendix we derive the elementary symmetric means formulas that lead to Algorithm 1. Elementary symmetric sums e_k = e_k(x_1, ..., x_n) are defined as

e_k = \sum_{1 \le i_1 < \cdots < i_k \le n} x_{i_1} \cdots x_{i_k},  (47)

with e_0 = 1. To distinguish between different numbers of parameters, in this appendix we denote the elementary sums of x_1, ..., x_n by

e_k^n = e_k(x_1, \ldots, x_n),  (48)

and the elementary symmetric means by

m_k^n = m_k(x_1, \ldots, x_n) = e_k(x_1, \ldots, x_n) \big/ \binom{n}{k}.  (49)

A simple derivation leads to a formula for iteratively constructing elementary symmetric sums:

e_k^{n+1} = \sum_{1 \le i_1 < \cdots < i_k \le n+1} x_{i_1} \cdots x_{i_k}
          = \sum_{1 \le i_1 < \cdots < i_k \le n} x_{i_1} \cdots x_{i_k} + \left( \sum_{1 \le i_1 < \cdots < i_{k-1} \le n} x_{i_1} \cdots x_{i_{k-1}} \right) x_{n+1}
          = e_k^n + e_{k-1}^n \, x_{n+1}.  (50)

Then, by substituting the definition of elementary symmetric means, we reach the recurrence formula

m_k^{n+1} = m_k^n + \frac{k}{n+1} \left( m_{k-1}^n \, x_{n+1} - m_k^n \right).  (51)

Observing the directions of the dependencies in this formula leads to Algorithm 1.

C EFFICIENCY DERIVATIONS
In this appendix we review known analytic results for the variance and cost of transmittance estimators, as well as present some new derivations for power-series estimators.
C.1 Costs
Tracking estimators.
The expected number E[N] of optical-depth estimates for a tracking estimator follows from the mean of the Poisson distribution, which is simply the rate, E[N] = \lambda \ell = \bar{\tau}_r. Therefore, for residual ratio tracking with query size M,

Cost[\hat{T}_{rrt}] = M \bar{\tau}_r.  (52)

Delta tracking with n = 1 is an exception in that the estimator can perform early termination as soon as the first real (non-thinned) estimate is performed. The cost for n = 1 delta tracking is therefore (assuming \mu_c = 0) [Georgiev et al. 2019]

Cost[\hat{T}_{dt}] = M \bar{\tau} \, \frac{1 - e^{-\tau}}{\tau}, \quad (n = 1)  (53)

and otherwise tracking must completely traverse the interval n times, Cost[\hat{T}_J] = M \bar{\tau} n.

Bhanot and Kennedy roulette.
Using the continuation probabilities of the generalized BK estimator (19), we find the probability Q_{BK}(N) of evaluating term N to be

Q_{BK}(N) = \frac{c}{K+1} \cdot \frac{c}{K+2} \cdots \frac{c}{N} = \frac{c^{N-K}}{N!/K!}, \quad N > K = \lfloor c \rfloor,  (54)

and 1 otherwise. The expected number of evaluated orders is thus

E[N_{BK}] = K + \sum_{N=K+1}^{\infty} \frac{c^{N-K}}{N!/K!} = K + \frac{K!}{c^K} \left( e^{c} - \sum_{N=0}^{K} \frac{c^N}{N!} \right).  (55)
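Equation (55) is easy to sanity-check by simulating the roulette chain directly. The sketch below encodes our reading of the continuation probabilities: after evaluating order N ≥ K, the chain continues with probability c/(N + 1).

```python
import math
import random

def expected_orders_closed_form(c):
    """E[N_BK] from Eq. (55)."""
    K = math.floor(c)
    partial = sum(c**n / math.factorial(n) for n in range(K + 1))
    return K + math.factorial(K) / c**K * (math.exp(c) - partial)

def expected_orders_simulated(c, trials=200_000, seed=7):
    """Monte Carlo estimate of the expected highest evaluated order."""
    K = math.floor(c)
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        N = K  # orders 0..K are evaluated unconditionally
        while rng.random() < c / (N + 1):
            N += 1
        total += N
    return total / trials
```

For, e.g., c = 2.5 the simulated mean agrees with the closed form to within Monte Carlo noise.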
C.2 Variances
Delta tracking.
The exact variance of delta tracking is known [Glasser 1962] and agrees with a derivation for the special case of n = 1 and a uniform medium [Georgiev et al. 2019]:

Var[\hat{T}_{dt}] = e^{-\tau} - e^{-2\tau}.  (56)

This result is exact for any input, and it generalizes for Johnson's n > 1 as

Var[\hat{T}_J] = e^{-2\tau} \left( e^{\tau/n} - 1 \right).  (57)

We investigate the efficiency of Johnson's estimator in the supplementary material.

Residual ratio tracking.
The exact variance for residual ratio tracking is also known. The rate of the Poisson process follows from the difference of the known optical depths, \lambda = \bar{\tau} - \tau_c, which are the integrals of the upper \bar{\mu}(x) and lower \mu_c(x) control variates. The variance is then [Papaspiliopoulos 2011, Eq. (4.17)]

Var[\hat{T}_{rrt}] = e^{-2\bar{\tau} + \lambda + V/\lambda} - e^{-2\tau},  (58)

V = \frac{1}{b-a} \int_a^b \big( (b-a)(\bar{\mu} - \mu(x)) \big)^2 \, dx.  (59)
Roulette variance.
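Before stating the closed form, a minimal Python simulation of the roulette may help (assuming, as in the derivation below, a constant $Y$ estimated with zero variance): orders $0,\ldots,K$ are always summed, and order $K+i$ enters with the inverse-probability weight $(K+1)\cdots(K+i)/c^{i}$, so its contribution simplifies to $Y^{K+i}/(c^{i}K!)$ and the series remains unbiased for $e^{Y}$.

```python
import math
import random

def bk_roulette(Y, c, rng):
    """One sample of the BK roulette series for e^Y with constant Y.
    Orders 0..K (K = floor(c)) are free; order K+i survives with
    probability prod_{j=K+1}^{K+i} c/j and carries the inverse weight."""
    K = math.floor(c)
    S = sum(Y**n / math.factorial(n) for n in range(K + 1))
    i = 0
    while rng.random() < c / (K + i + 1):
        i += 1
        S += Y**(K + i) / (c**i * math.factorial(K))
    return S

def bk_variance(Y, c, jmax=60):
    """Variance series from the text (tau_c = 0): sum over the stopping
    order j of prob * value^2, minus (e^Y)^2."""
    K = math.floor(c)
    second = 0.0
    for j in range(jmax):
        poch = math.prod(K + 1 + m for m in range(j))        # (K+1)_j
        prob = (c**j / poch) * (1.0 - c / (j + K + 1))
        val = sum(Y**n / math.factorial(n) for n in range(K + 1)) \
            + sum(Y**(K + i) / (c**i * math.factorial(K)) for i in range(1, j + 1))
        second += prob * val * val
    return second - math.exp(Y) ** 2

rng = random.Random(3)
Y, c = -0.5, 1.5
xs = [bk_roulette(Y, c, rng) for _ in range(200_000)]
mean = sum(xs) / len(xs)                 # unbiased: approaches e^Y
var = sum((s - mean) ** 2 for s in xs) / len(xs)
```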
With uniform density, the negative residual optical depth $Y$ is estimated with zero variance and the full variance of the generalized BK estimator is (see supplemental material)
$$\mathrm{Var}[\widehat{T}_{BK}] = e^{-2\tau_{\mathrm{c}}} \sum_{j=0}^{\infty} \frac{c^{j}\left(1 - \frac{c}{j+K+1}\right)}{(K+1)_{j}} \left( \sum_{n=0}^{K} \frac{Y^{n}}{n!} + \sum_{i=1}^{j} \frac{Y^{K+i}}{c^{i}\,K!} \right)^{2} - T^{2},$$
where $(K+1)_{j} = (K+1)(K+2)\cdots(K+j)$ is the rising factorial.

Variance at the optimal pivot.
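The benefit of the U-statistic symmetrization analyzed below can be observed empirically. This Python sketch (assuming zero-mean Gaussian $Y$ samples, i.e. the optimal-pivot regime) compares the truncated series that uses one fresh sample per order against the symmetrized version, where order $k$ averages the product over all $k$-subsets of the $N$ samples.

```python
import math
import random

def plain_series(Y):
    """Truncated power series: order k uses the product of the first k samples."""
    total, prod = 1.0, 1.0
    for k, y in enumerate(Y, start=1):
        prod *= y
        total += prod / math.factorial(k)
    return total

def u_series(Y):
    """U-statistic symmetrization: order k averages the product over all
    k-subsets of the N samples, i.e. e_k(Y) / C(N, k)."""
    N = len(Y)
    e = [1.0] + [0.0] * N                # elementary symmetric polynomials
    for y in Y:
        for k in range(N, 0, -1):
            e[k] += y * e[k - 1]
    return sum(e[k] / (math.comb(N, k) * math.factorial(k)) for k in range(N + 1))

rng = random.Random(9)
N, s = 4, 0.5                            # order N, zero-mean Y with std s
batches = [[rng.gauss(0.0, s) for _ in range(N)] for _ in range(100_000)]

def variance(f):
    vals = [f(Y) for Y in batches]
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

v_plain, v_u = variance(plain_series), variance(u_series)
# Both estimate e^0 = 1 without bias; the leading (linear) term of the
# symmetrized variance is divided by C(N, 1) = N, so v_u is roughly N
# times smaller when E[Y^2] is small.
```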
Consider the BK estimator with pivot $p = -\tau_{\mathrm{r}} = \tau_{\mathrm{c}} - \tau$, where the truncation is fixed to a deterministic order $N$. The estimator will then have a biased expectation $T_{b} \approx e^{-\tau}$. In the supplemental material we show that this estimator has variance
$$\mathrm{Var}[\widehat{T}_{BK}] = e^{-2\tau} \sum_{k=0}^{N} \frac{\mathrm{E}[Y^{2}]^{k}}{(k!)^{2}} - T_{b}^{2} \approx e^{-2\tau} \sum_{k=1}^{N} \frac{\mathrm{E}[Y^{2}]^{k}}{(k!)^{2}}. \tag{60}$$
With U-statistics the variance becomes
$$\mathrm{Var}[\widehat{T}_{UBK}] = e^{-2\tau} \sum_{k=0}^{N} \frac{\mathrm{E}[Y^{2}]^{k}}{\binom{N}{k}(k!)^{2}} - T_{b}^{2} \approx e^{-2\tau} \sum_{k=1}^{N} \frac{\mathrm{E}[Y^{2}]^{k}}{\binom{N}{k}(k!)^{2}}. \tag{61}$$
The binomial denominators reduce the variance relative to the non-symmetrized estimator. For small $\mathrm{E}[Y^{2}]$ (low $Y$-variance), the linear term sees a variance reduction of $1/N$ relative to the non-symmetrized version, with diminishing gains for the higher-order terms. So at the optimal pivot, the variance ratio between UBK and BK approaches $1/N$ as the $Y$-variance goes to 0.

Roulette significantly complicates the variance derivation for the UBK estimator, but at the optimal pivot we found
$$\mathrm{E}[\widehat{T}^{2}] = e^{-2\tau} \sum_{k=0}^{\infty} \frac{c^{k}\left(1 - \frac{c}{K+k+1}\right)}{(1+K)_{k}} \left( \sum_{n=0}^{K} \frac{\mathrm{E}[Y^{2}]^{n}}{(n!)^{2}\binom{k+K}{n}} + \sum_{i=1}^{k} \frac{\mathrm{E}[Y^{2}]^{K+i}}{c^{2i}\,(K!)^{2}\binom{K+k}{K+i}} \right),$$
from which the variance follows ($\mathrm{Var}[\widehat{T}] = \mathrm{E}[\widehat{T}^{2}] - \mathrm{E}[\widehat{T}]^{2}$). This can be used to rigorously analyze the tradeoff between decreasing $K$ in favour of reducing the $Y$-variance (by increasing $M$).

D ENDPOINT MATCHING FORMULAS
Integration with endpoint matching and equidistant combs can be further simplified. Since the order of the samples doesn't matter, we can write the integral estimate as
$$X_{i} = -\frac{\ell}{M} \sum_{j=0}^{M-1} \mu^{\star}\!\left(\frac{\ell}{M}(u + j)\right), \tag{62}$$
where $u$ is a uniform random number in $[0, 1)$. Expanding the endpoint-matched density $\mu^{\star}$ yields
$$X_{i} = \underbrace{-\frac{\ell}{M} \sum_{j=0}^{M-1} \mu\!\left(\frac{\ell}{M}(u + j)\right)}_{\text{original estimate}} \; \underbrace{-\,\frac{\ell}{M}\left(\frac{1}{2} - u\right)\bigl(\mu(\ell) - \mu(0)\bigr)}_{\text{endpoint matching}}. \tag{63}$$
The left-hand-side term is the integral estimate without endpoint matching, and the right-hand-side is the zero-expectation term from endpoint matching that often improves the convergence rate. This reshuffling can cause the resulting integrand to assume negative values. If non-negativity is a constraint, an alternate option is to symmetrize the estimator over the interval by using the mean of mirrored lookups [Buslenko et al. 1966] (p. 106).
This is equivalent to blending the interval of scattering material with its reversed copy.
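A minimal Python sketch of Eqs. (62)–(63) (the quadratic test density and $M = 8$ comb are illustrative assumptions): one shared jitter $u$ places the equidistant comb, and since the endpoint-matching term has zero expectation, the matched estimate remains unbiased for $-\int \mu$ while its variance drops.

```python
import random

def comb_estimate(mu, ell, M, u, matched=True):
    """Equidistant-comb estimate of -integral of mu over [0, ell] with one
    shared jitter u in [0, 1). With matched=True, the zero-expectation
    endpoint-matching term -(ell/M)(1/2 - u)(mu(ell) - mu(0)) is added."""
    h = ell / M
    X = -h * sum(mu(h * (u + i)) for i in range(M))
    if matched:
        X -= h * (0.5 - u) * (mu(ell) - mu(0.0))
    return X

mu = lambda x: 1.0 + x * x           # test density; -integral on [0, 1] = -4/3
rng = random.Random(11)
us = [rng.random() for _ in range(100_000)]
plain = [comb_estimate(mu, 1.0, 8, u, matched=False) for u in us]
match = [comb_estimate(mu, 1.0, 8, u, matched=True) for u in us]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

mean = sum(match) / len(match)       # stays ~ -4/3: matching adds no bias
v_plain, v_match = variance(plain), variance(match)
# v_match is far below v_plain: the linear endpoint mismatch is cancelled.
```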