Variance Reduction with Array-RQMC for Tau-Leaping Simulation of Stochastic Biological and Chemical Reaction Networks
Florian Puchhammer · Amal Ben Abdellah · Pierre L'Ecuyer
September 2, 2020
Abstract
We explore the use of Array-RQMC, a randomized quasi-Monte Carlo method designed for the simulation of Markov chains, to reduce the variance when simulating stochastic biological or chemical reaction networks with τ-leaping. We find that when the method is properly applied, variance reductions by factors in the thousands can be obtained. These factors are much larger than those observed previously by other authors who tried RQMC methods for the same examples. Array-RQMC simulates an array of realizations of the Markov chain and requires a sorting function to reorder these chains according to their states after each step. The choice of a good sorting function is a key ingredient for the efficiency of the method. We illustrate this by comparing various choices. The expected number of reactions of each type per step also has an impact on the efficiency gain.

Keywords Chemical reaction networks · stochastic biological systems · variance reduction · Quasi-Monte Carlo · Array-RQMC · Tau-leaping · continuous-time Markov chains · Gillespie
F. Puchhammer: Basque Center for Applied Mathematics, Alameda de Mazarredo 14, 48009 Bilbao, Basque Country, Spain; and DIRO, Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, H3C 3J7, Canada. E-mail: [email protected]
A. Ben Abdellah: DIRO, Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, H3C 3J7, Canada. E-mail: [email protected]
P. L'Ecuyer: DIRO, Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, H3C 3J7, Canada. E-mail: [email protected]

1 Introduction

We consider systems of chemical species whose molecule numbers dynamically change over time as the molecules react via a set of predefined chemical equations. The evolution of such systems is typically modeled by a continuous-time Markov chain (CTMC) (Gillespie, 1977; Anderson, 1991; Anderson and Kurtz, 2011) whose state is a vector that gives the number of copies of each species. Each transition (or jump) of the CTMC corresponds to the occurrence of one reaction, and the occurrence rate of each potential reaction (also called the reaction propensity) is a function of the state of the chain. The probability that any given reaction is the next one to occur is proportional to its propensity, and the time until the next reaction has an exponential distribution whose rate is the sum of these propensities. The stochastic simulation algorithm (SSA) of Gillespie (1977) simulates the successive transitions of this CTMC one by one, by generating the exponential time until the next reaction and determining independently which reaction it is. This method is exact (there is no bias). But when the number of molecules is large, simulating all the reactions one by one is often too slow, because their frequency is too high. One popular alternative is to approximate the CTMC by a discrete-time Markov chain (DTMC), as follows. Fix a time interval τ >
0. Under the simplifying assumption that the rates of the different reactions do not change during the next τ units of time, the numbers of occurrences of each type of reaction are independent Poisson random variables with means equal to τ times the occurrence rates (or propensities) of these reactions. Each step (or transition) of the DTMC corresponds to τ units of time for the CTMC. This DTMC can be simulated by generating a vector of independent Poisson random variables at each step, and updating the state to reflect all the reactions that occurred during this time interval. In the setting of chemical reaction networks, this approach is the τ-leaping method of Gillespie (2001), and it is widely used in practice. This is the method we consider in this paper.

There are several other approximation methods, some of them leading to simpler and faster simulations, but the error and/or bias can also be more significant (Gillespie, 2000; Higham, 2008). One simple approach uses a fluid approximation in which the copy numbers are assumed to take real values that vary in time according to a system of deterministic differential equations called the reaction rate equations, which can be simulated numerically (Gillespie, 2000; Higham, 2008). This type of deterministic model is the primary tool in the field of system dynamics, and it is widely used in many areas. It corresponds to the chemical kinetics equations found in textbooks. But this model ignores randomness, so it cannot capture the stochastic variations observed in experiments with real systems (Beentjes and Baker, 2019). Noise can be introduced via a stochastic differential equations model, which amounts essentially to approximating the Poisson distribution by a normal one, and the denumerable-state CTMC by a continuous-state process.
This leads to the chemical Langevin equation (Gillespie, 2000; Beentjes and Baker, 2019), which can be simulated efficiently by standard methods for stochastic differential equations (Kloeden and Platen, 1992) and may provide a reasonable approximation when the number of molecules of each type is very large, but can otherwise suffer from bias.

The purpose of the stochastic simulations with τ-leaping could be, for example, to estimate the probability distribution of the state at a given time t, or the probability that the state is in a given subset at time t, or perhaps the expectation of some function of the state. The simulations are usually done via Monte Carlo (MC) sampling, using a random number generator that provides a good imitation of independent uniform random variables over the interval (0, 1) (L'Ecuyer, 2012). For MC estimators based on the average over n independent samples, the variance and the standard deviation converge as O(n^{−1}) and O(n^{−1/2}), respectively, which is rather slow.

Randomized quasi-Monte Carlo (RQMC) provides an alternative sampling approach which, under favorable conditions, can improve this convergence rate of the variance to O(n^{−2+ε}) for any ε >
0, and even better in special situations (Owen, 1997, 2003; L'Ecuyer and Lemieux, 2002; L'Ecuyer, 2009, 2018).
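As a minimal self-contained illustration of the RQMC idea (our own toy example, not from the paper): a one-dimensional randomly shifted lattice, whose n points form an equidistant grid shifted by a single uniform random offset modulo 1. Each point is individually uniform over (0, 1), so the estimator is unbiased, yet the point set covers the interval far more evenly than n independent uniforms.

```python
import random

def rqmc_estimate(f, n, rng):
    """One randomly shifted lattice estimate of the integral of f over (0,1)."""
    shift = rng.random()                                # single uniform random shift
    points = [(i / n + shift) % 1.0 for i in range(n)]  # shifted equidistant lattice
    return sum(map(f, points)) / n

rng = random.Random(42)
est = rqmc_estimate(lambda u: u * u, 64, rng)  # integral of u^2 over (0,1) is 1/3
```

Since every length-1/n subinterval contains exactly one point, the absolute error for a function of bounded variation is at most V(f)/n here, versus an O(n^{−1/2}) standard deviation for plain MC.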
Quasi-Monte Carlo (QMC) replaces the n independent vectors of uniform random numbers that drive the simulations by n deterministic vectors with a sufficient number of coordinates to simulate the system and which cover the space (the unit hypercube) more evenly than typical independent random points (Niederreiter, 1992; Dick and Pillichshammer, 2010). RQMC randomizes these points in a way that each individual point becomes a vector of independent uniform random numbers, while at the same time the set of points as a whole retains its structure and high uniformity. As a result, RQMC can provide an unbiased estimator with lower variance.

On the other hand, there are two important limitations. Firstly, the O(n^{−2+ε}) convergence rates for RQMC are proved only under the condition that the integrand is a smooth function of the uniforms, whereas when simulating the CTMC considered here, the sequence of states that are visited is discontinuous in the underlying uniform random variates. Secondly, when the points are high-dimensional and some high-order interactions between the coordinates are important, the variance reduction is usually limited, and this often happens when simulating the CTMCs that model reaction networks via either direct SSA or τ-leaping. Indeed, those simulations require at least one or two random numbers per step of the chain, and the number of steps can be very large in real applications, so the dimension of the points, which is the total number of random numbers required to simulate one realization of the process, can be very large. Beentjes and Baker (2019) investigated the performance of τ-leaping combined with traditional RQMC and found that the gain from RQMC compared to MC was small.
They mentioned the two limitations above as possible explanations for this behavior.

The Array-RQMC algorithm (L'Ecuyer et al, 2006, 2008, 2009) has been developed precisely to recapture the power of RQMC when simulating Markov chains over a large number of steps, as in the problem considered here. The empirical variance under Array-RQMC has been observed to converge faster than under MC in several examples from various areas, sometimes at the n^{−2+ε} rate, even for some examples where the cost function was discontinuous (Demers et al, 2005; L'Ecuyer et al, 2007, 2008, 2009; Dion and L'Ecuyer, 2010; L'Ecuyer et al, 2018; Ben Abdellah et al, 2019). The faster convergence has also been proven theoretically under certain conditions (L'Ecuyer et al, 2008).

Our present work was motivated by Beentjes and Baker (2019), and our aim is to see how Array-RQMC can improve upon MC and classical RQMC, first by using the same examples as in their paper. Hellander (2008) also experimented with Array-RQMC, in combination with uniformization of the CTMC and conditional Monte Carlo (CMC) based on the discrete-time conversion method of Fox and Glynn (1990). Their goal was to estimate the probability distribution of the state at a fixed time t >
0. In this setting, CMC alone provably reduces the variance. Empirically, with CMC, they obtained variance reductions by factors of about 20 in one example and 45 in another example. With the combination of CMC with Array-RQMC, they observed variance reductions by a factor of about 100 for both examples. Thus, Array-RQMC provides an additional gain on top of CMC, by a factor of about 2.5 to 5.

In this paper, we show how to obtain much larger variance-reduction factors with Array-RQMC. We do this in the same setting as Beentjes and Baker (2019), where the τ-leaping method is used to estimate an expectation at a given time t. We find empirically that the combination of τ-leaping with the Array-RQMC algorithm can bring not only a significant variance reduction, but also an improved convergence rate, compared with plain MC.

The main idea of the Array-RQMC algorithm is to simulate n copies of the Markov chain in parallel, in a way that the empirical distribution of the chain's states at any given step is closer to the exact theoretical distribution at that step than with ordinary MC. To achieve this, at each step, the first few coordinates of the RQMC point set are designated to match the points to the states, and the remaining coordinates are used to advance the chains by one step. This matching can be interpreted as sorting the chains in some particular order, to match the ordering of the RQMC points. In the simple case where the state is one-dimensional, it suffices to enumerate the points by increasing order of their first coordinate and sort the chains by increasing order of their state. For higher-dimensional states, one possibility is to use some kind of multivariate sort to order both the points and the states; we will describe some of these sorts in Section 3.2. Another approach is to define an importance function, which maps the state to a one-dimensional representative value, and sort the chains by that value.
The choice of mapping can have a significant impact on the performance. If the mapping is fast to evaluate, this approach can reduce the computing time significantly, because a one-dimensional sort is usually much faster to execute than a multivariate one. To preserve the power of Array-RQMC, on the other hand, the importance function must provide a good estimate (or forecast) of the expected future value or cost, given the state at which it is evaluated. For this, it must be tailored to the problem at hand. A good tradeoff between simplicity and prediction accuracy is not always easy to achieve, but it is a key ingredient for the performance of Array-RQMC. As a proof of concept that this approach can work for reaction networks, we experiment with a very simple one-step look-ahead importance function, and we find that it works very well in all our examples. Empirically, in our experiments, this approach is often competitive with the best multivariate sorts in terms of variance reduction, and the sorting times are shorter, so it often provides the best efficiency improvement. We also discuss how more elaborate importance functions could be defined.

The remainder is structured as follows. In Section 2 we recall the fixed-step τ-leaping method for the simulation of well-mixed reaction networks in its simplest form. In Section 3, we define the Array-RQMC method and discuss some of the most prominent multivariate sorting algorithms. In Section 4, we describe the methodology used for our experiments and provide numerical results, with a discussion. A conclusion follows.

2 τ-Leaping Algorithm for Reaction Networks

We consider a system comprised of ℓ ≥ 1 chemical species S_1, ..., S_ℓ that can react via d ≥ 1 reactions R_1, ..., R_d. We assume that the species are well-mixed within a volume that does not change over time and whose temperature remains constant. Each reaction R_k, k = 1, ..., d, can be written as

  α_{1,k} S_1 + ··· + α_{ℓ,k} S_ℓ → β_{1,k} S_1 + ··· + β_{ℓ,k} S_ℓ,   α_{i,k}, β_{i,k} ∈ N,

where the arrow carries the rate constant c_k > 0 of reaction R_k.
Let X(t) = (X_1(t), ..., X_ℓ(t)) ∈ N^ℓ, where X_i(t) is the copy number (i.e., the number of molecules) of type S_i at time t, for i = 1, ..., ℓ and 0 ≤ t ≤ T. The process {X(t), t ≥ 0} is modeled as a CTMC with fixed initial state X(0) = x_0 and for which each jump corresponds to the occurrence of one reaction. The jump rate (or propensity function) for reaction R_k is a function a_k of the current state; it is a_k(x) when X(t) = x. This means that for a small δ >
0, reaction R_k occurs exactly once during the time interval (t, t + δ] with probability a_k(x)δ + o(δ), and occurs more than once with probability o(δ). When R_k occurs, the state changes from x to x + ζ_k, where ζ_k = (β_{1,k} − α_{1,k}, ..., β_{ℓ,k} − α_{ℓ,k}) is the stoichiometric vector for R_k. The standard form for a_k(x), which we assume in our examples, is a_k(x) = c_k h_k(x), where h_k(x) = ∏_{i=1}^{ℓ} C(x_i, α_{i,k}) (a product of binomial coefficients) represents the number of ways of selecting the molecules for reaction R_k when in state x = (x_1, ..., x_ℓ) (Higham, 2008). When in state x, the time until the next reaction has an exponential distribution with rate λ(x) = ∑_{k=1}^{d} a_k(x), the probability that this reaction is R_k is a_k(x)/λ(x), and these random variables are independent. The SSA of Gillespie (1977) simulates this CTMC directly. However, when a very large number of reactions occur in the time interval of interest, the direct simulation approach may be too slow.

Gillespie (2001) proposed the τ-leaping algorithm as a way to speed up the simulation. This approach discretizes the time into intervals of length τ >
0, and it generates directly the number of occurrences of each type of reaction in each such interval. If X(t) = x at the beginning of an interval, it is assumed (as an approximation) that the rate of each reaction R_k remains equal to a_k(x) during the entire interval [t, t + τ]. Under this simplifying assumption, the number D_k of occurrences of R_k during this time interval has a Poisson distribution with mean a_k(x)τ, and D_1, ..., D_d are independent. These D_k can be simulated easily via the inversion method, by generating independent uniform random numbers over (0, 1) and applying the inverse of the cumulative distribution function (cdf) of the appropriate Poisson distribution (Giles, 2016). The simulated state at time t + τ is then x + ∑_{k=1}^{d} D_k ζ_k. Repeating this at each step gives an approximating discrete-time Markov chain (DTMC) {X_j, j ≥ 0} defined by X_0 = x_0 and

  X_j = X_{j−1} + ∑_{k=1}^{d} D_{j,k} ζ_k = X_{j−1} + ∑_{k=1}^{d} F^{−1}_{j,k}(U_{j,k}) ζ_k =: ϕ(X_{j−1}, U_j),   (1)

where D_{j,k} = F^{−1}_{j,k}(U_{j,k}), F_{j,k} is the cdf of the Poisson distribution with mean a_k(X_{j−1})τ, U_j = (U_{j,1}, ..., U_{j,d}), and the U_{j,k} are independent uniform random numbers over (0, 1), for k = 1, ..., d and j ≥
1. If τ is small enough, X_j has approximately the same distribution as X(jτ), so this DTMC provides an approximate skeleton of a CTMC sample path.

This τ-leaping approximation has some potential problems, because it introduces bias which can propagate across successive steps, and this bias can be important if τ is not small enough. It is also possible to obtain negative copy numbers, i.e., some coordinates of some X_j taking negative values. Adaptive strategies and modifications of the algorithm have been designed to prevent or handle this; see, e.g., Anderson (2008); Anderson and Higham (2012); Beentjes and Baker (2019), and the references given there. We do not discuss these techniques in this paper. Our main goal is to explore how Array-RQMC can be effectively combined with τ-leaping, and we keep the setting simple to avoid distractions. In our experiments, we took τ small enough so we did not observe negative copy numbers.

Following Beentjes and Baker (2019), we suppose that the objective is to estimate µ = E[g(X(T))] for a given time T > 0 and a given function g : N^ℓ → R. These authors only took a coordinate projection for g (i.e., they only estimated expected copy numbers) in their examples, and we do the same, but what we do applies easily to other choices of g. For example, g(x) could be the indicator that x belongs to a given set A, in which case µ = P[X(T) ∈ A]. We take τ = T/s, where s is a positive integer that represents the number of steps of the DTMC that will be simulated. To estimate µ with τ-leaping and MC, we simulate n independent realizations of the DTMC via

  X_{i,0} = x_0,  X_{i,j} = ϕ(X_{i,j−1}, U_{i,j})  for j = 1, ..., s and i = 0, ..., n − 1,   (2)

where the U_{i,j}'s are independent uniform random points over (0, 1)^d. The estimator is

  µ̂_n = (1/n) ∑_{i=0}^{n−1} g(X_{i,s}).   (3)
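As a concrete sketch of the plain-MC τ-leaping estimator (2)–(3) in Python (our own minimal toy, not the paper's SSJ code; the two-reaction network, its unit rate constants, and all helper names are illustrative assumptions), with the Poisson variates generated by inversion as in (1):

```python
import math
import random

def poisson_inv(u, mean):
    """Poisson variate by inversion: smallest k with cdf F(k) >= u."""
    p = math.exp(-mean)  # P(D = 0)
    cdf = p
    k = 0
    while cdf < u:
        k += 1
        p *= mean / k    # recursive Poisson probabilities
        cdf += p
    return k

def tau_leap_path(x0, zetas, prop, tau, s, rng):
    """Simulate one DTMC path of (1): s steps of length tau."""
    x = list(x0)
    for _ in range(s):
        rates = prop(x)  # a_k(x) for each reaction k
        d_ks = [poisson_inv(rng.random(), a * tau) for a in rates]
        for d_k, zeta in zip(d_ks, zetas):  # apply all reactions of this step
            for i in range(len(x)):
                x[i] += d_k * zeta[i]
    return x

# Illustrative reversible two-species network S1 <-> S2 with unit rate constants.
zetas = [(-1, 1), (1, -1)]               # stoichiometric vectors
prop = lambda x: [1.0 * x[0], 1.0 * x[1]]  # mass-action propensities
rng = random.Random(123)
n, s, tau = 1000, 8, 0.01
est = sum(tau_leap_path((100, 100), zetas, prop, tau, s, rng)[0]
          for _ in range(n)) / n         # MC estimator (3) with g(x) = x_1
```

Inversion (rather than, say, acceptance-rejection) matters here because it is what lets RQMC and Array-RQMC later replace the uniforms U_{j,k} by structured points.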
We know that E[µ̂_n] = E[g(X_s)] ≈ E[g(X(T))] = µ (we do not look at the bias E[g(X_s)] − E[g(X(T))] in this paper) and Var[µ̂_n] = Var[g(X_s)]/n.

To use classical RQMC instead of MC, we simply replace the independent random points by a set of n vectors V_i = (U_{i,1}, U_{i,2}, ..., U_{i,s}), i = 0, ..., n − 1, which form an RQMC point set in sd dimensions, as did Beentjes and Baker (2019). The estimator of E[g(X_s)] ≈ µ is again (3), but with a different sampling strategy for the random numbers.

3 The Array-RQMC Algorithm

The algorithm simulates the n sample paths of the DTMC in parallel, using an (l + d)-dimensional RQMC point set to advance all the chains by one step at a time, for some l ∈ {1, ..., ℓ}. The first l coordinates of the points are used to make a one-to-one pairing between the chains and the points, and the other d coordinates are used to advance the chains. When l < ℓ, one must first define a dimension-reduction mapping h : N^ℓ → R^l whose aim is to extract the most important features from the state and summarize them in a lower-dimensional vector which is used for the sort. For l =
1, the mapping h has been called an importance function or sorting function (L'Ecuyer et al, 2006, 2007, Section 3). At each step, both the RQMC points and the chains are ordered using the same l-dimensional sort. Different types of sorts are discussed in Section 3.2.

Specifically, we select a deterministic low-discrepancy (QMC) point set of the form Q̃_n = {(w_i, u_i), i = 0, ..., n − 1}, with w_i ∈ [0, 1)^l and u_i ∈ [0, 1)^d, whose points are already sorted with respect to their first l coordinates with the multivariate sort that we have selected. At each step j, we randomize the last d coordinates of the points of Q̃_n to obtain the RQMC point set

  Q_{n,j} = {(w_i, U_{i,j}) : i = 0, ..., n − 1},   (4)

in which each U_{i,j} is uniformly distributed in [0, 1)^d. We also sort the n states X_{0,j−1}, ..., X_{n−1,j−1} based on their values of h(X_{0,j−1}), ..., h(X_{n−1,j−1}), using the same sorting algorithm as for the QMC points, and let π_j denote the permutation of the indices {0, 1, ..., n − 1} implicitly defined by this reordering. Then the n chains advance to step j via

  X_{i,j} = ϕ(X_{π_j(i),j−1}, U_{i,j}),  i = 0, ..., n − 1.

It is also possible to use a different sorting method at each step j, in which case the QMC points must be sorted differently as well, so this is usually not convenient.

At the end, one computes µ̂_n in (3), which is an unbiased estimator of E[g(X_s)]. The main goal of this procedure is for the empirical distribution of the states X_{0,j}, ..., X_{n−1,j} to better approximate the theoretical distribution of X_j at each step j than if the chains were simulated independently with standard MC, and as a result to reduce the variance of µ̂_n. For further theoretical analysis and empirical evidence, see for example L'Ecuyer et al (2008, 2009, 2018).
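A compact sketch of this sort-then-advance loop in Python, for the one-dimensional case l = 1 (everything here is an illustrative toy of our own: the rank-1 lattice coefficients, the chain, and the function names are assumptions, not the paper's SSJ implementation). The first coordinates are w_i = i/n, already sorted by construction, and the remaining d coordinates receive a fresh random shift modulo 1 at each step:

```python
import random

def array_rqmc(n, s, d, x0, phi, h, g, rng):
    """Array-RQMC with a one-dimensional sorting key h (toy sketch)."""
    gen = [1, 364981, 245389][:d]  # illustrative rank-1 lattice coefficients
    base = [[(i * a % n) / n for a in gen] for i in range(n)]
    states = [x0] * n
    for j in range(s):
        states.sort(key=h)  # pair sorted states with points sorted by w_i = i/n
        shift = [rng.random() for _ in range(d)]  # randomize afresh each step
        states = [phi(states[i],
                      [(base[i][k] + shift[k]) % 1.0 for k in range(d)])
                  for i in range(n)]  # advance chain i with point i
    return sum(g(x) for x in states) / n  # estimator (3)

# Toy usage: a random walk on the real line, sorted by its own state.
rng = random.Random(7)
est = array_rqmc(n=64, s=4, d=1, x0=0.0,
                 phi=lambda x, u: x + u[0] - 0.5,
                 h=lambda x: x, g=lambda x: x, rng=rng)
```

The sort is the only step that distinguishes this from classical RQMC: it re-pairs states and points so that the low-discrepancy structure is exploited afresh at every step, rather than across the full sd-dimensional path.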
To estimate the variance of this Array-RQMC estimator, one can repeat the entire procedure m times, with independent randomizations of the points, and take the empirical variance of the m realizations of µ̂_n as an unbiased estimator of Var[µ̂_n]. This Array-RQMC procedure is stated in Algorithm 1.

Algorithm 1
Array-RQMC Algorithm

  X_{i,0} ← x_0 for i = 0, ..., n − 1;
  for j = 1, 2, ..., s do
    Sort the states X_{0,j−1}, ..., X_{n−1,j−1} by their values of h(X_{i,j−1}), using the selected sort, and let π_j be the corresponding permutation;
    Randomize afresh the last d coordinates of the RQMC points, U_{0,j}, ..., U_{n−1,j};
    for i = 0, 1, ..., n − 1 do
      X_{i,j} = ϕ(X_{π_j(i),j−1}, U_{i,j});
    end for
  end for
  Return the estimator µ̂_n = (1/n) ∑_{i=0}^{n−1} g(X_{i,s}).

3.2 Sorting Strategies

In the special case where l =
1, the RQMC points are sorted by their first coordinate and the states X_{i,j−1} are simply sorted by their value of h(X_{i,j−1}), in increasing order. In this case, one would typically have w_i = i/n, and the points are already sorted by construction (this is true for all the point sets used in this paper).

When ℓ >
1, sorting for good pairing is less obvious. Two related multivariate sorts that gave good results for other applications are the batch sort and the split sort (Lécot and Coulibaly, 1998; El Haddad et al, 2008; L'Ecuyer et al, 2009, 2018). For the batch sort, we factor n = n_1 n_2 ··· n_L with L ≥
1. Each time we sort, we split the set of states into n_1 batches of size n/n_1 such that the first coordinate of every state in one batch is smaller than or equal to the first coordinate of every state in the next batch; then we further subdivide each batch into n_2 batches of size n/(n_1 n_2) in the same way, but now according to the second coordinate of the states. This procedure is repeated L times in total. If L > ℓ, after ℓ steps we begin subdividing the batches with respect to their first coordinate again. The split sort is simply the variant of the batch sort in which n = 2^L and n_1 = n_2 = ... = n_L = 2.

Another approach maps the states to the ℓ-dimensional unit hypercube [0, 1)^ℓ, so we can assume that the state space is now [0, 1)^ℓ instead of N^ℓ, and then uses a discretized version of a space-filling curve for this hypercube. The hypercube is partitioned into a grid of small subcubes so that the event that two states fall in the same small subcube has a very small probability; then the states are sorted in the order that their subcubes are visited by the curve (those in the same subcube can be ordered arbitrarily). With this, we use (d + 1)-dimensional RQMC points sorted by their first coordinate. This approach is in fact an implicit way to map the states to the one-dimensional real line, and then use a one-dimensional sort (with l = 1); with a Hilbert curve, we call it the Hilbert curve sort. To map ℓ-dimensional states to [0, 1)^ℓ, Gerber and Chopin (2015) suggest applying a rescaled logistic transformation Ψ(x_j) = 1/(1 + exp[−(x_j − µ_j + σ_j)/(σ_j)]), 1 ≤ j ≤ ℓ, to each coordinate.
We estimated the means µ_j and the variances σ_j^2 of the copy numbers of each species at every step, from data obtained in preliminary experiments.

A variant that avoids the need for such a transformation is the Hilbert batch sort (L'Ecuyer et al, 2018): one first applies a batch sort to partition R^ℓ into n boxes, each of which contains exactly one of the states; then these boxes are associated with n subcubes in [0, 1)^ℓ, and the states are enumerated in the order that the corresponding boxes are visited by the Hilbert curve.
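As an illustration, a minimal recursive sketch of the split sort described above (our own toy implementation, not the SSJ one): each level sorts the current group on one coordinate, splits it into two halves, and recurses on the halves with the next coordinate, cycling back to the first coordinate after ℓ of them.

```python
def split_sort(states, ell, coord=0):
    """Split sort: recursively halve the group on alternating coordinates."""
    if len(states) <= 1:
        return states
    states = sorted(states, key=lambda x: x[coord])  # order by current coordinate
    mid = len(states) // 2
    nxt = (coord + 1) % ell                          # cycle through the ell coordinates
    return (split_sort(states[:mid], ell, nxt)
            + split_sort(states[mid:], ell, nxt))
```

With n = 2^L states this reproduces the batch sort with n_1 = ··· = n_L = 2: after the first level, every state in the first half has a first coordinate no larger than any state in the second half, and so on recursively.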
All these multivariate sorts can be computationally expensive when n is large. For this reason, we made some effort in this work to explore ways of defining importance functions h : N^ℓ → R that can be computed quickly during the simulations and at the same time provide good representations of the value of a state. An appropriate choice of h is certainly problem-dependent, and good ones have been constructed for some examples in other settings such as computational finance, queueing, and reliability (L'Ecuyer et al, 2007, 2008, 2018; Ben Abdellah et al, 2019).

We adopt the (partly heuristic) idea that at each step j, an ideal importance function h_j should have the property that h_j(x) is a good approximation of E[g(X_s) | X_j = x] for all x ∈ N^ℓ and j = 1, ..., s (L'Ecuyer et al, 2007, 2009). To really do this, we would need to construct a different approximation h_j for each j. We will call it a step-dependent importance function (SDIF). To see how well this general type of approach could perform, we made the following experiment with each of the examples considered in Section 4. First, we generated data by simulating the DTMC for n independent "pilot" sample paths, and we collected the n pairs (X_{i,j}, g(X_{i,s})), i = 0, ..., n −
1, for each j . Then, our aim was to finda function h j : N ℓ → R for which h j ( X i , j ) was a good predictor of g ( X i , s ) conditional on X i , j . For this, we selected a parameterized form of function h j , say h j ( θ , · ) , which dependson a parameter vector θ , and we estimated the best value of θ by least-squares regressionfrom the data. The general form that we explored for h j ( θ , x ) was a linear combination ofpolynomials in the coordinates of x , where θ was the vector of coefficients in the linearcombination. The motivation for this choice is that the expected number of molecules ofa given type at the next step, given the current state, is an affine function of the expectednumber of reactions of each type that will occur at that step, and this expected number forreaction type R k is in turn linear in a k ( x ) , which is a known polynomial in the coordinatesof x .A cruder but less expensive strategy uses the same function h j = h for all j . One specialcase of this is to use h s − at all steps. We had some success with this simple version, whichwe call the one-step look-ahead importance function (OSLAIF). In the special case where g ( x ) is linear in x , say g ( x ) = b t x where b t is the transpose a vector of coefficients, then h s − ( x ) def = E [ g ( X s ) | X s − = x ] = E [ b t X | X = x ] is given by a polynomial in x , and onecan obtain this polynomial exactly, since E [ X | X = x ] = x + d ∑ k = ζ k E [ D , k | X = x ] = x + τ d ∑ k = ζ k a k ( x ) , (5)which is a vector of polynomials in x that are easy to calculate. This includes the case of g ( x ) = x i , the number of molecules of species i , which occurs in all our examples.Extending this to more than one step can be more difficult when the a k are nonlinear.One can write E [ X | X = x ] = x + τ d ∑ k = ζ k [ a k ( x ) + E [ a k ( X ) | X = x ]] , but when a k is nonlinear, the quantity in the last expectation is a nonlinear function of a ran-dom vector. 
Extending to more steps leads to even more complicated embedded conditionalexpectations. This motivated us to try just the OSLAIF rule as a heuristic, and we got somegood results with that. Specific illustrations are given in Section 4.Let ˜ h j denote the functions h j estimated from data as just described, for each j . These˜ h j are noisy estimates, and since they are estimated separately across values of j , we can rray-RQMC for Tau-Leaping Simulation 9 observe some random variation when looking at their sequence as a function of j . To smoothout this variation, we tried fitting a (least-squares) smoothing spline (de Boor, 2001; Pollock,1993) to this sequence of functions ˜ h j to obtain a sequence of functions h j , j = , ... s ,that varies more smoothly across the step number j . This yields a smoothed SDIF . In ourexperiments, we never observed a large improvement by doing this, because with n = pilot simulations, the ˜ h j did not vary much already as a function of j . But the smoothingmight be worthwhile when the number n of pilot simulations is smaller.3.3 RQMC Point SetsThe RQMC point sets considered in this paper are the following (the short names in paren-theses are used to identify them in the next section): (1) a randomly-shifted rank-1 lat-tice rule (Lat+s); (2) a Lat+s with the baker’s transformation applied to the points af-ter the shift (Lat+s+b); (3) a Sobol’ net with a left random matrix scramble followed bya random digital shift (Sob+LMS); (4) a Sobol’ net with the nested uniform scrambleof Owen (1997) (Sob+NUS). These point sets and randomizations are defined and ex-plained in L’Ecuyer and Lemieux (2000); Owen (2003); L’Ecuyer (2009, 2018). They areimplemented in SSJ (L’Ecuyer and Buist, 2005; L’Ecuyer, 2016), which we used for allour experiments. 
For the lattice rules, the parameters were found with the Lattice Builder tool (L'Ecuyer and Munger, 2016), using the weighted P_α criterion with order-dependent weights equal to ρ^k for each projection of order k, with ρ = 0.
05 inall the other cases. The baker’s transformation stretches each coordinate of each point by afactor or 2, then folds back the values by replacing u with 2 − u when u >
1. This is equivalent to transforming the integrand to make it periodic, which may improve the convergence rate (Hickernell, 2002) and may provide huge improvements in some cases, but it also increases the variation of the integrand, so it may increase the variance (moderately) in other cases. For the Sobol' points, we used the parameters (direction numbers) from Joe and Kuo (2008), except for Example 4.1 and the PKAr case in Example 4.3, for which we used the parameters from Lemieux et al (2004).
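The baker's transformation described above can be sketched in a few lines (our own illustration): after the random shift, each coordinate u is stretched to 2u and, whenever the stretched value exceeds 1, folded back to 2 − 2u, which is the familiar tent map.

```python
def baker(u):
    """Baker's transformation: stretch a coordinate by 2, fold values above 1 back."""
    v = 2.0 * u
    return v if v <= 1.0 else 2.0 - v
```

It is applied coordinate-wise to each randomized point, and the transformed points remain uniformly distributed over [0, 1) individually.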
4 Numerical Experiments

For our numerical illustrations, we use two low-dimensional examples taken from Beentjes and Baker (2019), then a higher-dimensional example from Padgett and Ilie (2016). On these examples, we compare the performance of both classical RQMC and Array-RQMC in combination with τ-leaping. All experiments were run using SSJ (L'Ecuyer and Buist, 2005; L'Ecuyer, 2016), which provides the required RQMC tools and also implements the sorting methods discussed in Section 3.2. The MRG32k3a random number generator was used for MC and all the randomizations.

We repeated each Array-RQMC procedure m =
100 times independently to estimate the RQMC variance Var[µ̂_n] for a range of values of n (all powers of 2). We then fitted a model of the form Var[µ̂_n] ≈ κ n^{−β} to these observations by least-squares linear regression in log-log scale. This gave an estimated convergence rate of O(n^{−β̂}) for the variance, where β̂ is the least-squares estimate of β. We report this β̂ in our results. Ordinary MC gives β =
1, so we can compare. We also provide a few plots of Var[µ̂_n] as a function of n, in log-log scale, to illustrate the typical behavior. Our logs are always in base 2, because we always use powers of 2 for n.

We computed the estimated variance reduction factor (VRF) of Array-RQMC compared with MC, which is defined as Var[g(X_s)]/(n Var[µ̂_n]), where Var[g(X_s)] is the MC variance for a single run, estimated separately with independent MC runs. This is the variance per run for MC divided by the variance per run for Array-RQMC. We call VRF
19 this value for n = and we report it in our results. We also computed an efficiencyratio which measures the change in the work-normalized variance (the product of the esti-mator’s variance by its computing cost). It is the VRF multiplied by the CPU time requiredto compute n realizations with MC and divided by the CPU time to compute the RQMC orArray-RQMC estimator with the same n . We call EIF
19 its value for n = and we reportit as well. This measure takes into account both the gain in variance and the extra cost inCPU time which is required to sort the chains at each step of the Array-RQMC algorithm.Note that using RQMC only is generally not slower than MC, but usually a bit faster.4.1 A Reversible Isomerization SystemWe start with the same simple model of a reversible isomerization system as Beentjes and Baker(2019). There are two species, S and S , and d = c = c = − : S c −→←− c S . We start with X ( ) = molecules of type S and X ( ) = molecules of type S .Since the total number of molecules is constant over time, it suffices to know the number ofmolecules of the first type, X ( t ) , at any time t , so we can define the state of the CTMC as X ( t ) = X ( t ) only. This gives us ℓ =
1. Then, we only need a one-dimensional sort for Array-RQMC. We also take g ( X ( t )) = X ( t ) . Note that with our choice of initial state, E [ X ( t )] = for all t >
0, so we already know the answer for this simple example. There are twopossible reactions, so d =
2, and we therefore need RQMC points in 2 s dimensions withclassical RQMC and in ℓ + d = T = . s =
8, so τ = T / s = .
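The fixed-step τ-leaping recursion for this two-reaction system can be sketched as follows. This is a minimal illustration; the rate constants and initial copy numbers below are placeholders chosen so that the two propensities match initially, not necessarily the values used in the experiments.

```python
import numpy as np

def tau_leap_isomerization(x1, x2, c1, c2, tau, s, rng):
    # Fixed-step tau-leaping for S1 <-> S2.  At each of the s steps, the
    # numbers of firings of the two reactions are Poisson with means
    # a1*tau = c1*x1*tau and a2*tau = c2*x2*tau; copy numbers are
    # clamped at 0 to avoid going negative.
    for _ in range(s):
        d1 = rng.poisson(c1 * x1 * tau)   # S1 -> S2 firings
        d2 = rng.poisson(c2 * x2 * tau)   # S2 -> S1 firings
        x1, x2 = max(x1 - d1 + d2, 0), max(x2 + d1 - d2, 0)
    return x1, x2

rng = np.random.default_rng(7)
# Placeholder parameters: c1*x1(0) = c2*x2(0), so E[X1(t)] stays constant.
x1, x2 = tau_leap_isomerization(100, 10**6, 1.0, 1e-4, tau=0.2, s=8, rng=rng)
```

In the Array-RQMC version, the two uniforms that drive the Poisson draws at each step come from the point set rather than from a plain generator, but the state update is the same.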
Figure 1 displays how the variance decreases as a function of n for this case. Notice the steeper slope for the four Array-RQMC variants. Array-RQMC clearly outperforms both MC and classical RQMC in this example.

We also observe with the first three cases in Table 1 that when we increase s with T fixed, the factors VRF19 and EIF19 diminish, and the diminution is much more prominent with RQMC. The latter might be no surprise, because increasing s increases the dimension of the RQMC points. But it was unclear a priori whether it would also occur with Array-RQMC, and how much. However, by doing further experimentation, we found that the decrease of VRF19 is not really due to the increase in the number of steps, but rather to the decrease in τ. To see that, look at the fourth case, with (T, s, τ) = (25.6, 128, 0.2). Here we have the same τ as in the first case, but s is multiplied by 16. For the Array-RQMC methods, the variance reductions and convergence rates are similar to the first case. For RQMC, they are a bit lower, which is not surprising because the dimension has increased. For cases five and six, we have increased τ to 0.8 and we compare two large values of s. The VRF19 depends mainly on τ and not much on s. Why is that? Recall that in this example, at each step we generate a pair of Poisson random variables, which are discrete and therefore discontinuous with respect to the underlying uniforms.

Table 1: Estimated rates β̂, VRF19, and EIF19 for the reversible isomerization example, for various choices of (T, s, τ). MC refers to ordinary MC, RQMC is classical RQMC with Sobol' points and LMS randomization, and the other four rows are for Array-RQMC with different RQMC point sets. "MC Var" is Var[g(X_s)], the variance per run with MC. [The numerical entries of the table are not recoverable in this copy; the MC variance per run is near 107 in the first case and near 111 in a later case.]
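The rates β̂ reported in the tables come from a least-squares fit of log2 Var[µ̂_n] against log2 n. A minimal sketch, using synthetic variance data rather than the paper's measurements:

```python
import numpy as np

def estimate_beta(ns, variances):
    # Fit Var ~ kappa * n**(-beta) by linear least squares in
    # log-log (base-2) scale; the fitted slope estimates -beta.
    x = np.log2(np.asarray(ns, dtype=float))
    y = np.log2(np.asarray(variances, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    return -slope, 2.0 ** intercept   # (beta_hat, kappa_hat)

ns = [2**k for k in range(14, 20)]            # n = 2^14, ..., 2^19
variances = [3.0 * n ** -1.5 for n in ns]     # synthetic data with beta = 1.5
beta_hat, kappa_hat = estimate_beta(ns, variances)
```

On exact power-law data the fit recovers β; on noisy empirical variances it gives the least-squares estimate β̂ reported in the tables.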
Fig. 1: Estimated Var[µ̂_n] as a function of n, in log-log scale, for the reversible isomerization system, with T = 1.6 and s = 8. The curves are for MC, RQMC, Lat+s, Lat+b, Sob+LMS, and Sob+NUS.

The mean of each Poisson random variable is proportional to τ, and the larger the mean, the closer the distribution is to a continuous one. In fact, as τ increases, the Poisson distribution converges to a normal distribution, whose inverse cdf is smooth, so the generated values are smooth functions of the underlying uniforms in the limit. That is, we obtain a better VRF19 when τ is larger because the integrand is closer to a continuous (and smooth) function. When the Poisson distributions have small means, in contrast, the response has larger discontinuities. And it is well known that RQMC is much more effective for smooth functions than for discontinuous functions. This kind of behavior was already pointed out for RQMC in Section 5.2 of Beentjes and Baker (2019). Interestingly, we see that the same effect applies to Array-RQMC as well.

To illustrate this effect "in the limit," we made an experiment in which all the Poisson random variables at each step are replaced by normals with the same mean and variance, and the state vector has real-valued components rather than integer components, using the same parameters as in the first case in the table. The results are in the last (bottom) entry of the table and they are stunning. Firstly, for RQMC and all Array-RQMC methods, the rate β̂ is close to 2, which does not occur for the other cases. Secondly, the VRF19 factor is also very large for RQMC and is huge in particular for Array-RQMC with Lat+s+b. This surprising result for RQMC can be explained as follows. Here the integrand has 16 dimensions, but on a closer look one can see that it is a sum of 16 normal random variables that are almost independent; i.e., almost a sum of one-dimensional functions. This means that the effective dimension is close to 1, and this explains the success of RQMC. Regarding the huge gain with Lat+s+b, it can be explained by the fact that for a smooth one-dimensional function, RQMC with Lat+s+b can provide an O(n^(−4)) convergence rate for the variance (Hickernell, 2002; L'Ecuyer, 2009). Essentially, for one-dimensional smooth functions, the baker's transformation produces a locally antithetic effect which integrates exactly the piecewise linear approximation, and only higher-order error terms remain. The huge VRF19 indicates that part of this effect carries over to Array-RQMC.

We just saw that, as a rough rule of thumb, the RQMC methods bring more gain when the Poisson random variables have larger means. We know (from Section 2) that the mean of the Poisson random variable D_{j,k} is a_k(X_{j−1})τ. This mean can be increased by increasing either τ or the components of the state vector. For the present example, if we denote X_{j−1} = (X^(1)_{j−1}, X^(2)_{j−1})^t, the numbers of molecules of the two types at step j − 1, we have a_k(X_{j−1}) = c_k X^(k)_{j−1} for k = 1, 2, so the Poisson means are increased by a factor γ > 1 either by multiplying τ by γ or by multiplying the vector X_{j−1} by γ. We made experiments whose results agreed with that when all the components of the state were large enough. But if one component of X_{j−1} is small, and we increase τ and simulate the system over a few steps, this component has a good chance of getting close to zero at some step, and this increases the discontinuity. In that situation, a larger τ can worsen the VRF. To further test the above reasoning, we made another set of experiments in which the initial state X_0 had two equal components, X^(1)(0) = X^(2)(0), with c1 = c2, to keep E[X(t)] = X^(1)(0) for all t. In this case, the problem of one component getting close to 0 does not occur, so things remain smoother. We found that the VRFs were larger than in Table 1 for both RQMC and Array-RQMC (we exclude the normal distribution). The VRF for RQMC was also smaller when both T and s were large, but not when s was increased and T remained small. One possible explanation for this is that when T and s are large, the overall change in the state can be large, and then the successive changes in the state are less independent, which increases the effective dimension.

4.2 The Schlögl System

Our second example is the Schlögl system, with three species S1, S2 and S3, and four reaction channels with rate constants c1 = 3 × 10^(−7), c2 = 10^(−4), c3 = 10^(−3) and c4 = 3.5, respectively. The model can be depicted as

2 S1 + S2 ⇄ 3 S1,   S3 ⇄ S1,

with forward and backward rates (c1, c2) for the first pair and (c3, c4) for the second. The propensity functions a_k are given by

a1(x) = c1 x1 (x1 − 1) x2 / 2,  a2(x) = c2 x1 (x1 − 1)(x1 − 2) / 6,  a3(x) = c3 x3,  a4(x) = c4 x1.

We also take x_0 = (250, 10^5, 2 × 10^5)^t, T = 4, and τ = T/16, so s = 16 steps. This is the same model as in Beentjes and Baker (2019), with the same parameters, except that we took a smaller τ to avoid negative copy numbers. We want to estimate E[X^(1)(T)], the expected number of molecules of S1 at time T. Here, this expectation does depend on T, and we will see that Var[X^(1)(T)] also depends very much on T. Since the total number of molecules remains constant over time, the dimension of the state here can be taken as ℓ = 2.
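The propensities above translate directly into code. The following is a minimal sketch of one τ-leap step for this system, with the stoichiometry matrix encoding the net effect of the four reaction channels; the rate constants and initial state are the standard Schlögl values given above.

```python
import numpy as np

def schlogl_propensities(x, c):
    # a_k(x) for the four channels of the Schlogl system,
    # with x = (x1, x2, x3) and rate constants c = (c1, c2, c3, c4).
    x1, x2, x3 = x
    return np.array([c[0] * x1 * (x1 - 1) * x2 / 2.0,         # 2S1+S2 -> 3S1
                     c[1] * x1 * (x1 - 1) * (x1 - 2) / 6.0,   # 3S1 -> 2S1+S2
                     c[2] * x3,                               # S3 -> S1
                     c[3] * x1])                              # S1 -> S3

def tau_leap_step(x, c, nu, tau, rng):
    # One tau-leap step: Poisson reaction counts, then update the state
    # via the stoichiometry matrix nu (one column per reaction channel),
    # clamping at 0 to avoid negative copy numbers.
    d = rng.poisson(schlogl_propensities(x, c) * tau)
    return np.maximum(x + nu @ d, 0)

nu = np.array([[1, -1, 1, -1],    # net change of x1 per channel
               [-1, 1, 0, 0],     # net change of x2
               [0, 0, -1, 1]])    # net change of x3
c = (3e-7, 1e-4, 1e-3, 3.5)
x0 = np.array([250.0, 1e5, 2e5])
a = schlogl_propensities(x0, c)
```

Note that every column of `nu` sums to zero, which is why the total number of molecules is conserved and the state can be reduced to two coordinates.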
We take the state as X = (X^(1), X^(2))^t; X^(3) can be deduced via X^(3) = N − X^(1) − X^(2), where N is the total number of molecules. With d = 4, the RQMC points for Array-RQMC must be five-dimensional when we use an importance function h that maps the state to one dimension, and must be six-dimensional otherwise. For comparison, with classical RQMC, the dimension of the RQMC points is d⌈T/τ⌉ = 64. To sort the chains, we can use a multivariate sort or an importance function h_j : N^2 → R, as discussed in Section 3.2.

To construct an importance function h : N^2 → R using the OSLAIF, when g(x) is a linear function of x, one can compute the conditional expectation exactly by using (5). This gives a polynomial of the form

h_j(x1, x2) = θ0 + θ1 x1 + θ2 x2 + θ3 x1^2 + θ4 x1 x2 + θ5 x1^3 + θ6 x1^2 x2. (6)

With g(x) = x1 (our case), these coefficients can be computed explicitly in terms of τ and the rate constants. If x1 was very large, we could approximate a1(x) ≈ c1 x1^2 x2 / 2 and a2(x) ≈ c2 x1^3 / 6, and then remove the two terms θ3 x1^2 + θ4 x1 x2 from (6), but in our example, x1 is not very large.

To obtain a SDIF for a more general j, one possible heuristic could be to assume the same form of polynomial (even if this is not exact) and select the coefficients θ_i by least-squares fitting to data obtained from pilot runs, as explained in Section 3.2. We did this, and we also tried fitting a more general bivariate polynomial that contains all possible monomials x1^ε1 x2^ε2 with 0 ≤ ε1, ε2 ≤ 3, but this gave us no improvement over OSLAIF. The other SDIF approaches that we tried also did no better than OSLAIF. A plausible explanation is that the functions h_j in this case are based on data obtained from noisy simulations (large variance and dependence on j). A possible alternative could be to use automatic learning with a deep neural network to learn a good h. But this is beyond our scope.

For the batch sort, we kept the coordinates in their natural order and we used n1 = ⌈n^(1/2)⌉ and n2 = ⌈(n/n1)^(1/2)⌉. For n = 2^19, for example, this gives n1 = 725 and n2 = 27.

The OSLAIF, Hilbert curve sort, and batch sort all perform reasonably well, which is not very surprising, because the state space is only two-dimensional. The OSLAIF is very effective for T = 4, but somewhat less effective for T = 32. Globally, the batch sort is the best performer; its VRF19 and EIF19 values are both consistently among the largest ones. The Sobol' points are generally the best performers for each type of sort. The left panel of Figure 2 shows Var[µ̂_n] versus n in log-log scale for the OSLAIF sort, for various point sets. The estimated convergence rates −β̂ are mostly between −1.3 and −1.6, which clearly beats the usual MC rate of −1. The right panel shows Var[µ̂_n] as a function of n under Sob+LMS, in log-log scale.

One important observation is the large difference in MC variance between T = 4 and T = 32; it is much larger at T = 4. The mean E[µ̂_n] also depends on T: it is about 240 at T = 4 and closer to the long-term mean at T = 32. It is plotted as a function of T in the left panel of Figure 3. What happens is that the trajectories have roughly two very different kinds of transient regimes between t = 0 and t = 10. For some trajectories, X^(1)(t) goes up to somewhere between 400 and 600 at around t = 4, then goes down to around the long-term mean, say between 70 and 100. For other trajectories, X^(1)(t) decreases right away to between 70 and 100 at around t = 5. Figure 3 illustrates this behavior, with 16 sample paths. This was already mentioned in Beentjes and Baker (2019), although they say the system is bistable, whereas what we observe is rather two types of transient paths. This behavior explains the much larger variance at T = 4 than at T = 32. It also shows why it is very hard to predict the state at some larger T from the state at an earlier time, hence the difficulty to estimate an "optimal" importance function. Despite all of this, Array-RQMC performs quite well with simple sorts and brings large efficiency improvements compared with MC and RQMC.

Table 2: Estimated rates β̂, VRF19, and EIF19 for the Schlögl system, with various types of sorts for Array-RQMC, for several (T, s) combinations. [The numerical entries of the table are not recoverable in this copy.]
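A batch sort like the one used above can be sketched as follows for two-dimensional states: order the chains by the first coordinate, split them into n1 batches, then order each batch by the second coordinate. This is a minimal sketch; the random integer states below are illustrative, not simulated chain states.

```python
import numpy as np

def batch_sort(states, n1):
    # Batch sort for 2D states: sort the chain indices by the first
    # coordinate, split into n1 batches, and reorder each batch by the
    # second coordinate.  Returns the new ordering of chain indices.
    idx = np.argsort(states[:, 0], kind="stable")
    batches = np.array_split(idx, n1)
    return np.concatenate(
        [b[np.argsort(states[b, 1], kind="stable")] for b in batches])

rng = np.random.default_rng(3)
states = rng.integers(0, 1000, size=(64, 2))  # 64 chains with 2D states
order = batch_sort(states, n1=8)
```

The result is a permutation of the chains in which the first coordinate is sorted at the batch level and the second within each batch, which is what Array-RQMC needs to match chains with points.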
Fig. 2: Empirical variance vs n in log-log scale for the Schlögl system, with T = 4 and s = 16. Left: the OSLAIF sort with various point sets (MC, RQMC, Lat+s, Lat+s+b, Sob+LMS, Sob+NUS). Right: the sorting methods (OSLAIF, batch sort, Hilbert batch, Hilbert sort) under Sob+LMS.

Fig. 3: The mean of X^(1)(t) as a function of t (left) and the trajectories of X^(1)(t) for n = 16 chains (right).

4.3 A PKA Activation Model

In our third example, we have ℓ = d =
6, which are both larger than in the previous examples. The six molecular species S1 to S6 are (in this order) PKA, cAMP, the partially saturated PKA-cAMP2, the saturated PKA-cAMP4, the regulatory subunit PKAr, and the catalytic subunit PKAc. The d = 6 reactions are

PKA + 2 cAMP ⇄ PKA-cAMP2,   PKA-cAMP2 + 2 cAMP ⇄ PKA-cAMP4,   PKA-cAMP4 ⇄ PKAr + 2 PKAc,

with forward and backward rate constants (c1, c2), (c3, c4), and (c5, c6), respectively. We simulate this system with the same parameters as Padgett and Ilie (2016) (in particular c5 = 0.016), except that we assume that the molecules are homogeneously distributed in the volume and we choose a fixed τ as opposed to selecting it adaptively after each step. At time zero there are 33,000 molecules of PKA, 33,030 molecules of cAMP, and 1,100 molecules of each other species. We take T = 0.05 and τ = T/s with s = 256 steps. This problem requires RQMC points in 7 or 12 dimensions with Array-RQMC, compared with 1536 dimensions with classical RQMC.

We report experiments with two different objective functions. The first one is E[X^(1)(T)], the expected number of molecules of PKA at time T, and the second one is E[X^(5)(T)], the expected number of molecules of PKAr at time T. In each case, we implemented and tested the OSLAIF and SDIF methods to select a mapping h to one dimension. We also tried the multivariate batch and split sorts, the Hilbert sort, and the Hilbert batch sort, from Section 3.2. The best performers were the OSLAIF map, the batch sort, and the Hilbert sort.

The OSLAIF for the PKA case is given by the polynomial

h(x) = x1 + τ(−c1 x1 x2 (x2 − 1)/2 + c2 x3).

In this function, the magnitude of x1 outweighs those of the terms −τ c1 x1 x2 (x2 − 1)/2 and τ c2 x3. This suggests taking x1 as the most important coordinate for the sort, followed by x2 and x3. So for the batch sort, we used the three coordinates x1, x2, x3 in this order. We tried a few settings for the batch sizes and ended up with n1 = ⌈n^(1/4)⌉, n2 = ⌈(n/n1)^(1/3)⌉, and n3 = ⌈(n/n1/n2)^(1/2)⌉. For n = 2^14, for example, this gives n1 = n2 = 12 and n3 = 11.

Table 3 gives the results for the PKA case, for which g(x) = x1. The estimated mean and variance per run are about 19663 and 1775, respectively. We find that the three sorting methods reported in the table offer comparable performance in terms of VRF19 and EIF19.
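The one-dimensional OSLAIF sort for this case amounts to sorting the chains by the scalar h(x). A minimal sketch follows; only the form of h comes from the text, while the rate constants and states below are hypothetical placeholders for illustration.

```python
import numpy as np

def oslaif_pka(states, tau, c1, c2):
    # h(x) = x1 + tau * (c2*x3 - c1*x1*x2*(x2-1)/2): the conditional
    # expectation of the PKA count after one tau-leap step, used as a
    # one-dimensional importance function for sorting.
    x1, x2, x3 = states[:, 0], states[:, 1], states[:, 2]
    return x1 + tau * (c2 * x3 - c1 * x1 * x2 * (x2 - 1) / 2.0)

tau, c1, c2 = 2e-4, 1e-7, 0.02        # hypothetical values
rng = np.random.default_rng(5)
states = rng.integers(900, 1100, size=(16, 3)).astype(float)
h = oslaif_pka(states, tau, c1, c2)
order = np.argsort(h)                 # one-dimensional sort of the chains
```

Once the states are mapped to a single real number, the sort is trivial, which is precisely why the OSLAIF is cheaper than a full multivariate sort.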
Sorting on a single value or a restricted set of coordinates, as we do for the batch sort, is also faster than a full multivariate sort, which helps the EIF19. Classical RQMC also performs surprisingly well despite the large number of dimensions, but not as well as Array-RQMC with the best sorts. With Array-RQMC, we also observe empirical convergence rates β̂ consistently better than the MC rate of 1.0. This indicates that the VRF should increase further with n.

Table 4 gives the results for the PKAr case, for which g(x) = x5. The estimated mean and variance per run are about 716 and 47, respectively. For this case, the OSLAIF is given by

h(x) = x5 + τ(c5 x4 − c6 x5 x6 (x6 − 1)/2).

Given that x4, x5, and x6 remain roughly between 500 and 1000 in this model, and that τ = T/s ≈ 2 × 10^(−4), the dominant term is x5, followed by the term −τ c6 x5 x6 (x6 − 1)/2, which is only a small multiple of x5. Based on this, for the batch sort, we initially used the coordinates x5, x4, x6 in this order, and took n1 = ⌈n^(1/4)⌉, n2 = ⌈(n/n1)^(1/3)⌉, and n3 = ⌈(n/n1/n2)^(1/2)⌉ for the batch sizes, as in the previous case. This is denoted by "Batch-5" in the table.

We also tried SDIF with various types of functions, but it did not really perform better. While doing that, we applied the random forest permutation-based statistical procedure of Breiman (2001) to detect the most important variables in a noisy function. This procedure told us that x6 was by far the most important variable for the sort, at all steps. Based on this, we also tried a batch sort with the three coordinates x6, x5, x4 in this order, with the same batch sizes as before. This is named "Batch-6" in the table. We also tried sorting the states by x6 (the number of PKAc molecules) only. This is a degenerate form of batch sort with n1 = n. We call it "By PKAc" in Table 4.

The OSLAIF, Batch-6, and "By PKAc" sorts perform similarly. They provide large improvement factors for both the variance and the efficiency, and empirical convergence rates β̂ that are significantly larger than 1. Their performance is orders of magnitude better than that of RQMC. The Batch-5 and Hilbert sorts are not competitive with the other ones in this case, but they nevertheless reduce the variance by significant factors.

This example illustrates two facts. First, the dimension of the state is not the ultimate criterion for Array-RQMC to perform well. Second, customizing the sorting algorithm based on information about the underlying model can improve results significantly.

Table 3: Estimated rates β̂, VRF19, and EIF19 for PKA, with T = 0.05 and s = 256. [The numerical entries of the table are not recoverable in this copy.]

Table 4: Estimated rates β̂, VRF19, and EIF19 for PKAr, with T = 0.05 and s = 256. [The numerical entries of the table are not recoverable in this copy.]
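The VRF19 and EIF19 entries of Tables 1-4 follow the definitions given at the start of this section. A sketch of the computation, with synthetic replicate values rather than measured ones (the MC variance 107 is the per-run value from the isomerization example, used here only as an illustration):

```python
import numpy as np

def vrf_eif(mc_var_per_run, rqmc_estimates, n, mc_time, rqmc_time):
    # VRF = (MC variance per run) / (n * Var[mu_hat_n]), where Var[mu_hat_n]
    # is estimated across the m independent randomizations; EIF further
    # multiplies by the ratio of CPU times to account for sorting costs.
    var_mu_n = np.var(rqmc_estimates, ddof=1)
    vrf = mc_var_per_run / (n * var_mu_n)
    eif = vrf * mc_time / rqmc_time
    return vrf, eif

rng = np.random.default_rng(11)
reps = 100.0 + rng.normal(0.0, 1e-3, size=100)  # m = 100 synthetic mu_hat_n
vrf, eif = vrf_eif(107.0, reps, n=2**19, mc_time=1.0, rqmc_time=1.25)
```

Dividing by n converts the variance of µ̂_n into a per-run variance, so VRF directly compares one MC run against one Array-RQMC "run".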
5 Conclusion

We have studied the combination of the fixed-step τ-leap algorithm with Array-RQMC for well-mixed chemical reaction networks and found that this combination can reduce the variance significantly in comparison with MC. In contrast to simulation with traditional RQMC, this approach could often also improve the convergence rate of the variance. Array-RQMC requires sorting the chains by their states at each step of the chain. This can be done with a multivariate sort, which may become costly when the state space has large dimensionality. But we also showed that one can construct sorts by mapping the states to the real numbers with an uncomplicated importance function, after which sorting is trivial. A simple variant named OSLAIF performs comparably to or better than several standard sorting algorithms, while being easier and less costly to apply. We have also shown that obtaining additional knowledge of the model, such as identifying important variable projections, and adapting the sort to this information can improve the convergence of the variance tremendously.

Acknowledgements
This work has been supported by a Canada Research Chair, an IVADO Research Grant, and NSERC Discovery Grant number RGPIN-110050 to P. L'Ecuyer. F. Puchhammer was also supported by Spanish and Basque government funding through BCAM (ERDF, ESF, SEV-2017-0718, PID2019-108111RB-I00, PID2019-104927GB-C22, BERC 2018-2021, EXP. 2019/00432, ELKARTEK KK-2020/00049), and by the computing infrastructure of the i2BASQUE academic network and IZO-SGI SGIker (UPV).
References
Anderson D, Higham D (2012) Multilevel Monte Carlo for continuous-time Markov chains, with applications in biochemical kinetics. Multiscale Modeling & Simulation 10(1):146–179, DOI 10.1137/110840546
Anderson DF (2008) Incorporating postleap checks in tau-leaping. The Journal of Chemical Physics 128(5):054103, URL https://doi.org/10.1063/1.2819665
Anderson DF, Kurtz TG (2011) Continuous time Markov chain models for chemical reaction networks. In: Koeppl H, Densmore D, Setti G, di Bernardo M (eds) Design and Analysis of Biomolecular Circuits, vol 117, Springer, New York, pp 3–42
Anderson WJ (1991) Continuous-Time Markov Chains: An Applications-Oriented Approach. Springer-Verlag, New York
Beentjes CHL, Baker RE (2019) Quasi-Monte Carlo methods applied to tau-leaping in stochastic biological systems. Bulletin of Mathematical Biology 81:2931–2959
Ben Abdellah A, L'Ecuyer P, Puchhammer F (2019) Array-RQMC for option pricing under stochastic volatility models. In: Proceedings of the 2019 Winter Simulation Conference, IEEE Press, pp 440–451
Breiman L (2001) Random forests. Machine Learning 45(1):5–32
de Boor C (2001) A Practical Guide to Splines, 2nd edn. Springer-Verlag, New York
Demers V, L'Ecuyer P, Tuffin B (2005) A combination of randomized quasi-Monte Carlo with splitting for rare-event simulation. In: Proceedings of the 2005 European Simulation and Modeling Conference, EUROSIS, Ghent, Belgium, pp 25–32
Dick J, Pillichshammer F (2010) Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge, U.K.
Dion M, L'Ecuyer P (2010) American option pricing with randomized quasi-Monte Carlo simulations. In: Proceedings of the 2010 Winter Simulation Conference, pp 2705–2720
El Haddad R, Lécot C, L'Ecuyer P (2008) Quasi-Monte Carlo simulation of discrete-time Markov chains on multidimensional state spaces. In: Keller A, Heinrich S, Niederreiter H (eds) Monte Carlo and Quasi-Monte Carlo Methods 2006, Springer-Verlag, Berlin, pp 413–429
Fox BL, Glynn PW (1990) Discrete-time conversion for simulating finite-horizon Markov processes. SIAM Journal on Applied Mathematics 50:1457–1473
Gerber M, Chopin N (2015) Sequential quasi-Monte Carlo. Journal of the Royal Statistical Society, Series B 77(Part 3):509–579
Giles MB (2016) Algorithm 955: approximation of the inverse Poisson cumulative distribution. ACM Transactions on Mathematical Software 42:1–22
Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry 81(25):2340–2361, DOI 10.1021/j100540a008
Gillespie DT (2000) The chemical Langevin equation. The Journal of Chemical Physics 113(1):297–306
Gillespie DT (2001) Approximate accelerated stochastic simulation of chemically reacting systems. The Journal of Chemical Physics 115(4):1716–1733, DOI 10.1063/1.1378322
Hellander A (2008) Efficient computation of transient solutions of the chemical master equation based on uniformization and quasi-Monte Carlo. The Journal of Chemical Physics 128:154109
Hickernell FJ (2002) Obtaining O(N^(−2+ε)) convergence for lattice quadrature rules. In: Fang KT, Hickernell FJ, Niederreiter H (eds) Monte Carlo and Quasi-Monte Carlo Methods 2000, Springer-Verlag, Berlin, pp 274–289
Higham DJ (2008) Modeling and simulating chemical reactions. SIAM Review 50(2):347–368, URL https://doi.org/10.1137/060666457
Joe S, Kuo FY (2008) Constructing Sobol sequences with better two-dimensional projections. SIAM Journal on Scientific Computing 30(5):2635–2654
Kloeden PE, Platen E (1992) Numerical Solution of Stochastic Differential Equations. Springer-Verlag, Berlin
Koh W, Blackwell KT (2012) Improved spatial direct method with gradient-based diffusion to retain full diffusive fluctuations. The Journal of Chemical Physics 137(15):154111, DOI 10.1063/1.4758459
Lécot C, Coulibaly I (1998) A quasi-Monte Carlo scheme using nets for a linear Boltzmann equation. SIAM Journal on Numerical Analysis 35(1):51–70
L'Ecuyer P (2009) Quasi-Monte Carlo methods with applications in finance. Finance and Stochastics 13(3):307–349
L'Ecuyer P (2012) Random number generation. In: Gentle JE, Haerdle W, Mori Y (eds) Handbook of Computational Statistics, 2nd edn, Springer-Verlag, Berlin, pp 35–71
L'Ecuyer P (2016) SSJ: Stochastic simulation in Java, http://simul.iro.umontreal.ca/ssj/
L'Ecuyer P (2018) Randomized quasi-Monte Carlo: An introduction for practitioners. In: Glynn PW, Owen AB (eds) Monte Carlo and Quasi-Monte Carlo Methods: MCQMC 2016, Springer, Berlin, pp 29–52
L'Ecuyer P, Buist E (2005) Simulation in Java with SSJ. In: Proceedings of the 2005 Winter Simulation Conference, IEEE Press, Piscataway, NJ, pp 611–620
L'Ecuyer P, Lemieux C (2000) Variance reduction via lattice rules. Management Science 46(9):1214–1235
L'Ecuyer P, Lemieux C (2002) Recent advances in randomized quasi-Monte Carlo methods. In: Dror M, L'Ecuyer P, Szidarovszky F (eds) Modeling Uncertainty: An Examination of Stochastic Theory, Methods, and Applications, Kluwer Academic, Boston, pp 419–474
L'Ecuyer P, Munger D (2016) Algorithm 958: Lattice Builder: A general software tool for constructing rank-1 lattice rules. ACM Transactions on Mathematical Software 42(2):Article 15
L'Ecuyer P, Lécot C, Tuffin B (2006) Randomized quasi-Monte Carlo simulation of Markov chains with an ordered state space. In: Niederreiter H, Talay D (eds) Monte Carlo and Quasi-Monte Carlo Methods 2004, Springer-Verlag, Berlin, pp 331–342
L'Ecuyer P, Demers V, Tuffin B (2007) Rare events, splitting, and quasi-Monte Carlo. ACM Transactions on Modeling and Computer Simulation 17(2):Article 9, 45 pages
L'Ecuyer P, Lécot C, Tuffin B (2008) A randomized quasi-Monte Carlo simulation method for Markov chains. Operations Research 56(4):958–975
L'Ecuyer P, Lécot C, L'Archevêque-Gaudet A (2009) On array-RQMC for Markov chains: Mapping alternatives and convergence rates. In: L'Ecuyer P, Owen AB (eds) Monte Carlo and Quasi-Monte Carlo Methods 2008, Springer-Verlag, Berlin, pp 485–500
L'Ecuyer P, Munger D, Lécot C, Tuffin B (2018) Sorting methods and convergence rates for Array-RQMC: Some empirical comparisons. Mathematics and Computers in Simulation 143:191–201
Lemieux C, Cieslak M, Luttmer K (2004) RandQMC User's Guide: A Package for Randomized Quasi-Monte Carlo Methods in C. Software user's guide, available at