Push-forward method for piecewise deterministic biochemical simulations
Guilherme C.P. Innocentini, Arran Hodgkinson, Fernando Antoneli, Arnaud Debussche, Ovidiu Radulescu
Guilherme C.P. Innocentini
Universidade Federal do ABC, Santo André, Brazil
Arran Hodgkinson
Quantitative Biology and Medicine, University of Exeter, United Kingdom; Hodgkinson Laboratories Limited, Sheffield, United Kingdom
Fernando Antoneli
Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
Arnaud Debussche
University of Rennes, CNRS, IRMAR - UMR 6625, F- 35000 Rennes, France
Ovidiu Radulescu
University of Montpellier, CNRS, LPHI - UMR CNRS 5235, Montpellier, France
Abstract
A biochemical network can be simulated by a set of ordinary differential equations (ODE) under well-stirred reactor conditions, for large numbers of molecules and frequent reactions. This is no longer a robust representation when some molecular species are in small numbers and the reactions changing them are infrequent. In this case, discrete stochastic events trigger changes of the smooth deterministic dynamics of the biochemical network. Piecewise-deterministic Markov processes (PDMP) are well adapted for describing such situations. Although PDMP models are now well established in biology, these models remain computationally challenging. Previously we have introduced the push-forward method to compute how the probability measure is spread by the deterministic ODE flow of PDMPs, through the use of analytic expressions of the corresponding semigroup. In this paper we provide a more general simulation algorithm that works also for non-integrable systems. The method can be used for biochemical simulations with applications in fundamental biology, biotechnology and biocomputing. This work is an extended version of the work presented at the conference CMSB2019.
Preprint submitted to Elsevier 15th September 2020

1. Introduction

Stochastic simulation is a powerful tool in biology. Moreover, stochasticity is a general property of biochemical networks, having multiple origins. Noise is generated intrinsically by these molecular systems when neither the law of large numbers nor the averaging theorem can be applied, for instance when some of the constituent molecular species are present in small numbers and trigger relatively slow reactions [4]. There are also extrinsic sources of noise, resulting from a fluctuating cellular environment. The extrinsic noise paradigm also applies to stochasticity resulting from manipulating biochemical networks in an artificial environment, such as a lab-on-a-chip or simply a titration device [1]. Stochastic simulation is used in single-cell experimental studies, where the amounts of mRNA [25, 19, 3, 24] and protein [8, 9] products of a gene can be determined for each cell. By double- or multiple-fluorophore fluorescence techniques, products from several genes can be quantified simultaneously and one can have access to multivariate probability distributions of mRNA or proteins. The stochastic dynamics of promoters and gene networks can have important consequences for fundamental biology [7], but also for HIV [20] and cancer research [10]. Predicting the probability distributions of biochemical networks' products is also instrumental for lab-on-a-chip applications, when one wants to optimize and control the functioning of these networks. For these reasons we aim to develop effective methods for computing time-dependent distributions for stochastic models. Our main objective is the reduction of computation time, which is a prerequisite for parameter scans and machine learning applications [11]. The traditional stochastic simulation protocol uses the chemical master equation and the Gillespie algorithm.
In such a protocol all the chemical reactions are simulated individually as discrete stochastic events. Simulations on relevant experimental times have to cope with very large numbers of such events and generate many samples in order to get statistically significant estimates of molecular species distributions. Altogether, these simulations are extremely costly in terms of execution time.

An important reduction of the simulation time is obtained by noticing that the sources of noise can be localized in small parts of the system that behave discretely, or in the discrete environmental variables, in the case of intrinsic or extrinsic noise, respectively. The remaining, large part of the system consists of molecular species in large numbers that evolve continuously. This leads to piecewise-deterministic Markov process (PDMP) approximations of the biochemical dynamics, coupling discrete state Markov chain dynamics with continuous state ordinary differential equation (ODE) dynamics. The justification of the PDMP approximations of the chemical master equation can be found in [4, 5]. Although simpler than the chemical master equation, direct simulation of the PDMP remains time consuming because the Markov chains can still have a very large number of states [4].

In the CMSB2019 proceedings paper we introduced new methods for simulating PDMPs for gene networks [15]. A gene network PDMP model can be simulated by numerical integration of ODEs satisfied by the mRNA and the protein species, coupled to the Markov chain describing the successive transitions of the gene promoters [27, 4, 21, 17]. The simulation becomes particularly effective when analytic solutions of the ODEs are available [13].

Probability distributions of PDMPs are solutions of the Liouville-master partial differential equations (PDEs). Numerical integration of these PDEs is an interesting alternative to direct simulation, combining precision and speed for small models.
Finite difference methods, however, are of limited use in this context, as they cannot cope with high-dimensional models (for instance, extant gene network applications are restricted to dimension 2, corresponding to a single promoter, with or without self-regulation, see [13, 16]).

Another interesting method for computing time-dependent distributions is the push-forward method. For gene networks, this method was first introduced in [12] and further adapted for continuous mRNA variables in [13]. It is based on the idea of computing the probability distribution of gene products as the push-forward measure of the semigroup defined by the ODEs. This method is an approximation, as one has to consider that the discrete PDMP variables are piecewise constant on a deterministic time partition. The transition rates between promoter states were computed using a mean field approximation in [13]. In the CMSB2019 proceedings paper, the mean field approximation was replaced by the next order approximation, taking into account the second moment of the protein distribution [15]. In this paper we extend the method to general PDMP models, thus covering all biochemical networks, with both intrinsic and extrinsic noise simulation protocols.
2. Models
A biochemical network is defined by a set of chemical reactions and a set of chemical species. The amounts of the different chemical species form a vector $x = (x_1, x_2, \ldots, x_N) \in \mathbb{R}_+^N$, where $x_i$ is the concentration of the species $i$. Each reaction is characterized by a stoichiometric vector and a reaction rate. For example, a simple transcription model describing the synthesis of a gene product $X$ (mRNA or protein) from a two-state promoter (with an active state $P^*$ and an inactive state $P$) is represented by four reactions
$$P \to P^*, \quad P^* \to P, \quad P^* \to P^* + X, \quad X \to \emptyset,$$
with stoichiometric vectors $\nu_1 = (-1, 1, 0)$, $\nu_2 = (1, -1, 0)$, $\nu_3 = (0, 0, 1)$, $\nu_4 = (0, 0, -1)$ and reaction rates $R_1 = k_1 P$, $R_2 = k_2 P^*$, $R_3 = k_3 P^*$, $R_4 = k_4 X$, where $k_1, \ldots, k_4$ are constants.

When all chemical species are present in large numbers, the biochemical network is well described by ODEs:
$$\frac{\mathrm{d}x}{\mathrm{d}t} = \sum_{j=1}^{r} \nu_j R_j(x), \qquad (1)$$
where $\nu_j \in \mathbb{Z}^N$ and $R_j : \mathbb{R}_+^N \to \mathbb{R}$ are the stoichiometric vector and the reaction rate of the reaction $j$, respectively.

For instance, for large numbers of $P$, $P^*$ and $X$ copies, the simple transcription example model reads:
$$\frac{\mathrm{d}P}{\mathrm{d}t} = -k_1 P + k_2 P^*, \quad \frac{\mathrm{d}P^*}{\mathrm{d}t} = k_1 P - k_2 P^*, \quad \frac{\mathrm{d}X}{\mathrm{d}t} = k_3 P^* - k_4 X. \qquad (2)$$

When some species are present in low numbers and there are also slow reactions, a piecewise-deterministic Markov process description is better adapted. For instance, in the simple transcription model, let us consider that there is only one copy of the promoter $P$, that the switching constants $k_1$ and $k_2$ are small compared to $k_3$, and also that $k_3 V_{cell}/k_4 \gg 1$, where $V_{cell}$ is the cell volume. In this case, using methods from [5], we show that $P$, $P^*$ are discrete variables and take values in the set $S = \{0, 1\}$.
The variable $x = X/V_{cell}$ has a switching behavior, alternating accumulation and degradation periods when $P^* = 1$ and $P^* = 0$, respectively, each one described by ODEs:
$$\frac{\mathrm{d}x}{\mathrm{d}t} = \begin{cases} k_3 - k_4 x & \text{when } P^* = 1 \\ -k_4 x & \text{when } P^* = 0 \end{cases} \qquad (3)$$

The switching between the two discrete states can be described by a continuous time Markov chain. Supposing that $P^*(0) = 1$, then $P^*(t) = 1$ for $0 \le t < T_1$, where $T_1$ is a random time such that $\mathbb{P}[T_1 > t] = \exp(-k_2 t)$, then $P^*(t) = 0$ for $T_1 \le t < T_1 + T_2$, where $\mathbb{P}[T_2 > t] = \exp(-k_1 t)$, etc.

More generally, a PDMP biochemical model has states in the set $E = \mathbb{R}_+^N \times S$, where $S$ is a finite set encoding the discrete states of the biochemical model. It is a process $\zeta_t = (x_t, s_t)$, where $x_t$ is a vector in $\mathbb{R}_+^N$ whose components $x_t^i$ ($1 \le i \le N$) encode the dynamics of the continuous biochemical species, and $s_t$ describes the jump Markov process between the discrete states. The PDMP process $\zeta_t = (x_t, s_t)$ is determined by three characteristics:

1) For each fixed $s \in S$, a vector field $V_s : \mathbb{R}_+^N \to \mathbb{R}^N$ determining a unique global flow $\Phi_s(t, x)$ in $\mathbb{R}_+^N$, such that, for $t > 0$,
$$\frac{\mathrm{d}\Phi_s(t, x)}{\mathrm{d}t} = V_s(\Phi_s(t, x)), \quad \Phi_s(0, x) = x. \qquad (4)$$
The flow $\Phi_s(t, x)$ represents a one-parameter semigroup fulfilling the properties (i) $\Phi_s(0, x) = x$, (ii) $\Phi_s(t + t', x) = \Phi_s(t', \Phi_s(t, x))$.

2) A transition rate matrix $H : \mathbb{R}_+^N \to \mathcal{M}_{N_s \times N_s}(\mathbb{R})$, such that $H_{r,s}(x)$ is the $(r, s)$ element of the matrix $H$, where $N_s = |S|$ is the number of discrete states. If $s \ne r$, $H_{r,s}(x)$ is the probability rate of jumping to the state $r$ from the state $s$. Furthermore, $H_{s,s}(x) = -\sum_{r \ne s} H_{r,s}(x)$ for all $s \in S$ and for all $x \in \mathbb{R}_+^N$.

3) A jump rate $\lambda : E \to \mathbb{R}_+$. The jump rate can be obtained from the transition rate matrix:
$$\lambda(x, s) = \sum_{r \ne s} H_{r,s}(x) = -H_{s,s}(x). \qquad (5)$$

From these characteristics, right-continuous sample paths $\{x_t : t > 0\}$ starting at $\zeta_0 = (x_0, s_0) \in E$ can be constructed as follows. Define
$$x_t(\omega) := \Phi_{s_0}(t, x_0) \quad \text{for } 0 \le t \le T_1(\omega), \qquad (6)$$
where $T_1(\omega)$ is a realisation of the first jump time of $s_t$, with the distribution
$$F(t) = \mathbb{P}[T_1 > t] = \exp\left(-\int_0^t \lambda(\Phi_{s_0}(u, x_0), s_0)\,\mathrm{d}u\right), \quad t > 0, \qquad (7)$$
and $\omega$ is the element of the probability space for which the particular realisation of the process is given. The pre-jump state is $\zeta_{T_1^-(\omega)}(\omega) = (\Phi_{s_0}(T_1(\omega), x_0), s_0)$ and the post-jump state is $\zeta_{T_1(\omega)}(\omega) = (\Phi_{s_0}(T_1(\omega), x_0), s_1)$, where $s_1$ has the distribution
$$\mathbb{P}[s_1 = r] = \frac{H_{r,s_0}(\Phi_{s_0}(T_1(\omega), x_0))}{\lambda(\Phi_{s_0}(T_1(\omega), x_0), s_0)}, \quad \text{for all } r \ne s_0. \qquad (8)$$
We then restart the process from $\zeta_{T_1(\omega)}$ and recursively apply the same procedure at the jump times $T_2(\omega)$, etc.

Note that between any two consecutive jumps $x_t$ follows the deterministic ODE dynamics defined by the vector field $V_s$. At the jumps, the values of $x_t$ are continuous. More general definitions of PDMPs include jumps in the continuous variables, but these will not be discussed here.

We define the multivariate probability density functions $p_s(t, x)$. These functions satisfy the Liouville-master equation, which is a system of partial differential equations:
$$\frac{\partial p_s(t, x)}{\partial t} = -\nabla_x \cdot (V_s(x) p_s(t, x)) + \sum_{s'} H_{s,s'}(x) p_{s'}(t, x). \qquad (9)$$

Two-state ON/OFF gene networks generalize the simple two-state single gene transcription model, by considering more interacting genes and by using separate variables for the two gene products: mRNA and protein. To each gene we associate a discrete variable $\sigma_i$ with two possible values: $\sigma_i = 0$ for a non-productive state (OFF) and $\sigma_i = 1$ for a productive state (ON). Furthermore, a gene $i$ produces proteins and mRNAs in the amounts $y_i$ and $r_i$, respectively. A gene network state is described by the $N$-dimensional vector $x_t = (r_1(t), y_1(t), r_2(t), y_2(t), \ldots, r_{N/2}(t), y_{N/2}(t))$.
The gene product amounts satisfy the ODEs:
$$\frac{\mathrm{d}y_i}{\mathrm{d}t} = b_i r_i - a_i y_i, \quad \frac{\mathrm{d}r_i}{\mathrm{d}t} = k_i(\sigma_i) - \rho_i r_i. \qquad (10)$$

The coupling between genes is modeled at the level of the discrete state transitions. The elements of the matrix $H$ are functions of products from various genes. For instance, if a gene $i$ inhibits a gene $j$, the transition rate from ON to OFF of the gene $j$ is an increasing function of the protein concentration $y_i$.

As a first example, that we denote as model M1, let us consider a two-gene network; the expression of the first gene is constitutive and the expression of the second gene is activated by the first. We consider that the transcription activation rate of the second gene is proportional to the concentration of the first protein, $f_2 y_1$. All the other rates are constant: $f_1$, $h_1$, $h_2$, representing the transcription activation rate of the first gene and the transcription inactivation rates of gene one and gene two, respectively. For simplicity, we consider that the two genes have identical protein and mRNA parameters $b_1 = b_2 = b$, $a_1 = a_2 = a$, $\rho_1 = \rho_2 = \rho$. We further consider that $k_i = k_0$ if the gene $i$ is OFF and $k_i = k_1 > k_0$ if the gene $i$ is ON.

The gene network has four discrete states, in the order $(0,0)$, $(1,0)$, $(0,1)$ and $(1,1)$. The transition rate matrix for the model M1 is
$$H(y_1, y_2) = \begin{pmatrix} -(f_1 + f_2 y_1) & h_1 & h_2 & 0 \\ f_1 & -(h_1 + f_2 y_1) & 0 & h_2 \\ f_2 y_1 & 0 & -(f_1 + h_2) & h_1 \\ 0 & f_2 y_1 & f_1 & -(h_1 + h_2) \end{pmatrix}. \qquad (11)$$

The Liouville-master equation for the model M1 reads
$$\begin{aligned}
\frac{\partial p_1}{\partial t} &= -\frac{\partial [(b r_1 - a y_1) p_1]}{\partial y_1} - \frac{\partial [(k_0 - \rho r_1) p_1]}{\partial r_1} - \frac{\partial [(b r_2 - a y_2) p_1]}{\partial y_2} - \frac{\partial [(k_0 - \rho r_2) p_1]}{\partial r_2} + h_1 p_2 + h_2 p_3 - (f_1 + f_2 y_1) p_1, \\
\frac{\partial p_2}{\partial t} &= -\frac{\partial [(b r_1 - a y_1) p_2]}{\partial y_1} - \frac{\partial [(k_1 - \rho r_1) p_2]}{\partial r_1} - \frac{\partial [(b r_2 - a y_2) p_2]}{\partial y_2} - \frac{\partial [(k_0 - \rho r_2) p_2]}{\partial r_2} + f_1 p_1 + h_2 p_4 - (h_1 + f_2 y_1) p_2, \\
\frac{\partial p_3}{\partial t} &= -\frac{\partial [(b r_1 - a y_1) p_3]}{\partial y_1} - \frac{\partial [(k_0 - \rho r_1) p_3]}{\partial r_1} - \frac{\partial [(b r_2 - a y_2) p_3]}{\partial y_2} - \frac{\partial [(k_1 - \rho r_2) p_3]}{\partial r_2} + h_1 p_4 + f_2 y_1 p_1 - (h_2 + f_1) p_3, \\
\frac{\partial p_4}{\partial t} &= -\frac{\partial [(b r_1 - a y_1) p_4]}{\partial y_1} - \frac{\partial [(k_1 - \rho r_1) p_4]}{\partial r_1} - \frac{\partial [(b r_2 - a y_2) p_4]}{\partial y_2} - \frac{\partial [(k_1 - \rho r_2) p_4]}{\partial r_2} + f_1 p_3 + f_2 y_1 p_2 - (h_1 + h_2) p_4.
\end{aligned} \qquad (12)$$

The model M2 differs from the model M1 by the form of the activation function. Instead of a linear transcription rate $f_2 y_1$ we use a Michaelis-Menten model $f_2 y_1/(K + y_1)$. This model is more realistic, as it takes into account that the protein has to attach to specific promoter sites, which become saturated when the concentration of this protein is high. The transition rate matrix for the model M2 is
$$H(y_1, y_2) = \begin{pmatrix} -\left(f_1 + \frac{f_2 y_1}{K + y_1}\right) & h_1 & h_2 & 0 \\ f_1 & -\left(h_1 + \frac{f_2 y_1}{K + y_1}\right) & 0 & h_2 \\ \frac{f_2 y_1}{K + y_1} & 0 & -(f_1 + h_2) & h_1 \\ 0 & \frac{f_2 y_1}{K + y_1} & f_1 & -(h_1 + h_2) \end{pmatrix}. \qquad (13)$$

The Liouville-master equation for the model M2 is obtained from (12) by replacing each regulation term $f_2 y_1$ with $f_2 y_1/(K + y_1)$.
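As a concrete illustration of the construction (6)-(8), the simple transcription model of this section can be sampled directly: between jumps the flow (3) is available in closed form, and because the switching rates $k_1$, $k_2$ are constant the waiting times are exponential. The sketch below is our own illustration, not the authors' implementation, and the rate values in the usage example are arbitrary.

```python
import math
import random

def simulate_transcription_pdmp(k1, k2, k3, k4, x0=0.0, s0=1, t_end=10.0, seed=1):
    """Sample one path of the two-state transcription PDMP.

    s = 1 (P* = 1, ON):  dx/dt = k3 - k4*x, the state is left at rate k2.
    s = 0 (P* = 0, OFF): dx/dt = -k4*x,     the state is left at rate k1.
    The flow between jumps is available in closed form, so no ODE solver
    is needed; the waiting times are exponential because the switching
    rates are constant in this example.
    """
    rng = random.Random(seed)
    t, x, s = 0.0, x0, s0
    path = [(t, x, s)]
    while t < t_end:
        dt = min(rng.expovariate(k2 if s == 1 else k1), t_end - t)
        decay = math.exp(-k4 * dt)
        # analytic flow: relaxes towards k3/k4 when ON, towards 0 when OFF
        x = x * decay + (k3 / k4) * (1.0 - decay) * s
        t += dt
        if t < t_end:
            s = 1 - s  # with two states, the post-jump state is deterministic
        path.append((t, x, s))
    return path
```

With more than two discrete states, the post-jump state would instead be drawn from the distribution (8).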
3. Simulation methods
The Monte-Carlo method utilizes the direct simulation of the PDMP, based on the iteration of the following equations:
$$\frac{\mathrm{d}x}{\mathrm{d}t} = V_s(x(t)), \quad \frac{\mathrm{d}F}{\mathrm{d}t} = -\lambda(x, s) F, \qquad (14)$$
for $t \in [0, T)$, with initial conditions $x(0) = x_0$, $F(0) = 1$, and stopping condition $F(T) = U$, where $U$ is a random variable, uniform in the range $[0, 1]$, followed by the choice of the next discrete state $s$ by using the matrix $H(x(T))$.

A large number $N_{mc}$ of sample traces is generated and the values of $x_t$ are stored at selected times. Multivariate time-dependent probability distributions are then estimated from these data. The complete algorithm is presented in Algorithms 1 and 2.

The direct simulation of PDMPs needs the solutions of Eqs. (14), which can be obtained by numerical integration. This is not always computationally easy.

Algorithm 1: NextState
Input: 1. $x_0$ initial value of the continuous variable, 2. $s_0$ initial discrete state, 3. $t_0$ initial time, 4. $t_{max}$ maximal time.
Output: 1. $(x_1, x_2, \ldots, x_n)$, 2. $(t_1, t_2, \ldots, t_n)$, 3. $s$ new discrete state.

  $x := x_0$; $s := s_0$
  $U := \mathrm{randunif}([0, 1])$
  solve $\frac{\mathrm{d}x}{\mathrm{d}t} = V_s(x)$, $\frac{\mathrm{d}F}{\mathrm{d}t} = -\lambda(x, s) F$ with initial data $x(t_0) = x_0$, $F(t_0) = 1$, until $F(t) = U$ or $t = t_{max}$; store $x_1 = x(t_1), \ldots, x_n = x(t_n)$, where $t_1 < t_2 < \ldots < t_n$, $t_n = \min(F^{-1}(U), t_{max})$.
  $V := \mathrm{randunif}([0, 1])$
  $CUM := 0$; $I := 0$
  while $CUM < V$ do
    $I := I + 1$
    $CUM := CUM + H_{I,s}(x) / \sum_{J \ne s} H_{J,s}(x)$
  end while
  return $((x_1, x_2, \ldots, x_n), (t_1, t_2, \ldots, t_n), I)$

Problems may arise for fast-switching promoters, when the ODEs have to be integrated many times on small intervals between successive jumps. Alternatively, the numerical integration of the ODEs can be replaced by analytic solutions or quadrature. Analytic expressions are always available for the gene network flow (10) and read
$$\Phi_{y_i}(t, y_0, r_0) = y_0 e^{-a_i t} + b_i \left[ \left( r_0 - \frac{k_i(\sigma_i)}{\rho_i} \right) \frac{e^{-\rho_i t} - e^{-a_i t}}{a_i - \rho_i} + \frac{k_i(\sigma_i)}{\rho_i} \, \frac{1 - e^{-a_i t}}{a_i} \right],$$
$$\Phi_{r_i}(t, y_0, r_0) = (r_0 - k_i/\rho_i) e^{-\rho_i t} + k_i/\rho_i.$$

Let us consider the following general expression for the jump intensity function:
$$\lambda(x, s) = c_0(s) + \sum_{i=1}^{N} c_i(s) y_i + \sum_{i=1}^{N} d_i(s) f_i(y_i),$$
where the $f_i$ are non-linear functions, for instance Michaelis-Menten functions $f_i(y_i) = \frac{y_i}{K_i + y_i}$ or Hill functions $f_i(y_i) = \frac{y_i^{n_i}}{K_i^{n_i} + y_i^{n_i}}$. If $d_i = 0$ for all $1 \le i \le N$, the cumulative distribution function of the waiting time $T$ can be computed analytically [13]; otherwise it can be obtained by quadrature.

Algorithm 2: PDMPmontecarlo
Input: 1. $p_0(x, s)$, initial distribution of $(x_0, s_0)$. 2. $\tau$ time step for computing the time-dependent distribution. 3. $\delta x$ bin size in one direction.
Output: $n_t$ vectors $H_1, \ldots, H_{n_t}$, representing the distribution at times $\tau, 2\tau, \ldots, n_t \tau$.

  for $i := 1$ to $n_t$ do
    $H_i :=$ $n_x^N$-dimensional null vector
  end for
  for $m := 1$ to $N_{mc}$ do
    Draw $(x_0, s_0)$ from the distribution $p_0(x, s)$
    $x := x_0$; $s := s_0$; $t := t_0$; $i := 0$
    while $t < n_t \tau$ do
      $((x_1, \ldots, x_n), (t_1, \ldots, t_n), s_{next}) := \mathrm{NextState}(x, s, t, t_{max})$
      for $j := 1$ to $n$ do
        if $t_j \le i\tau < t_{j+1}$ then
          $X_i := (x_j + x_{j+1})/2$
          increment by $1/((\delta x)^N N_{mc})$ the bin corresponding to $X_i$ in the histogram $H_i$
          $i := i + 1$
        end if
      end for
      $x := x_n$; $s := s_{next}$; $t := t_n$
    end while
  end for
  return $(H_1, \ldots, H_{n_t})$

For example, for the model M2 one has
$$\lambda(x, s) = \left( f_1 + \frac{f_2 y_1}{K + y_1} \right) \delta_{s,1} + \left( h_1 + \frac{f_2 y_1}{K + y_1} \right) \delta_{s,2} + (h_2 + f_1) \delta_{s,3} + (h_1 + h_2) \delta_{s,4},$$
where $\delta_{i,j}$ is Kronecker's delta. In this case, the waiting time $T$ is obtained as the unique solution of the equation
$$-\log(U) = \left[ (f_1 + f_2) T - f_2 K \int_0^T \frac{\mathrm{d}t'}{K + \Phi_{y_1}(t', y_0, r_0)} \right] \delta_{s_0,1} + \left[ (h_1 + f_2) T - f_2 K \int_0^T \frac{\mathrm{d}t'}{K + \Phi_{y_1}(t', y_0, r_0)} \right] \delta_{s_0,2} + (h_2 + f_1) T \, \delta_{s_0,3} + (h_1 + h_2) T \, \delta_{s_0,4}, \qquad (15)$$
where $U$ is a random variable, uniformly distributed in the range $[0, 1]$. In our implementation of the algorithm we solve (15) numerically, using the bisection method.

The finite difference Liouville-master equation method for a given number of genes, $N_g$, uses a discrete approximation of the domain to compute the numerical solution across time for a given higher-dimensional system (12), with initial conditions given by $p_1(0, y_1, r_1, \ldots) := \delta(y_1, r_1, \ldots)$ and $p_2(0, y_1, r_1, \ldots) := \ldots := p_{2^{N_g}}(0, y_1, r_1, \ldots) := 0$, where $\delta(\cdot)$ is the Dirac delta function.
We compute the solution for $2^{N_g}$ distributions, since each gene has both an ON and an OFF state.

In order to achieve the simulation of the equations given by system (12), we begin by discretising each of the domains into $n_\eta$ intervals, whose centres are given by $\{\eta_1, \ldots, \eta_{n_\eta}\}$, meeting the condition that $\eta_1 < \eta_2 < \ldots < \eta_{n_\eta}$, and where, likewise, an arbitrary discrete value in $y_i$ is denoted $\eta_{i,j}$. Each $\eta_{i,j}$ then represents a unique, discrete, $j$th position in the protein abundance domain spanned by $y_i$. Likewise, mRNA has an analogous discretisation given by all $\rho_{i,j}$, $\forall j \in \{1, \ldots, n_\rho\}$. Time is then similarly discretised by a time-step $\tau$ into $n_t + 1$ temporal locations given by $\{0, \tau, \ldots, n_t \tau\}$. The task then becomes the computation of the solution at each of these discrete locations in the domain, such that the sink and source terms may be trivially calculated, but where the derivative terms warrant further explanation.

The solution to the advection equation under a uniform coefficient, $\chi$, is given by a translation in the relevant domain. To achieve this, whilst also maintaining the stability of the system as a whole, we implement a simultaneous forwards- and backwards-difference discrete operator scheme. This means that for an arbitrary probability density function, $p_k(t, y_i, \ldots)$, we write the discrete partial derivative operator as
$$\chi \frac{\partial}{\partial y_i} p_k(t, \eta_{i,j}, \ldots) = \begin{cases} |\chi| \, \dfrac{p_k(t, \eta_{i,j-1}, \ldots) - p_k(t, \eta_{i,j}, \ldots)}{\eta_{i,j} - \eta_{i,j-1}} & \text{if } \chi > 0 \\[2mm] |\chi| \, \dfrac{p_k(t, \eta_{i,j+1}, \ldots) - p_k(t, \eta_{i,j}, \ldots)}{\eta_{i,j+1} - \eta_{i,j}} & \text{otherwise.} \end{cases} \qquad (16)$$
This may be evaluated as such for each term within the system of equations given by (12), and guarantees the probability balance of the system as a whole. Beyond the computation of the equation's solutions at a single time-step, the solutions must be computed robustly across time.
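The upwind operator (16) can be exercised on a small one-dimensional example. The sketch below is our own illustration (uniform grid, zero ghost cells at the boundaries): iterating the operator with an explicit Euler step transports mass in the direction of the advection coefficient while conserving total probability as long as the density has not reached the boundary.

```python
def upwind_term(p, chi, dx):
    """Discrete advection term in the spirit of Eq. (16) on a uniform grid.

    For chi > 0 the backward (upwind) neighbour is used, otherwise the
    forward one; values outside the domain are taken as zero.
    Returns d with d[j] the discrete operator applied at bin j.
    """
    n = len(p)
    d = [0.0] * n
    for j in range(n):
        if chi > 0:
            left = p[j - 1] if j > 0 else 0.0
            d[j] = abs(chi) * (left - p[j]) / dx
        else:
            right = p[j + 1] if j < n - 1 else 0.0
            d[j] = abs(chi) * (right - p[j]) / dx
    return d
```

Advancing the density with `p[j] += dt * d[j]` (with `dt * |chi| / dx <= 1` for stability) translates it towards increasing bins when `chi > 0`, and the bin masses keep summing to the initial total.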
We therefore employ a MacCormack predictor-corrector scheme [18], given explicitly for any $i$th time-step by
$$\begin{aligned} p'_{k,i+1} &= p_{k,i} + \tau F_k(p_{1,i}, \ldots, p_{2^{N_g},i}), \\ p_{k,i+1} &= \frac{1}{2}\left(p_{k,i} + p'_{k,i+1}\right) + \frac{1}{2} \tau F_k(p'_{1,i+1}, \ldots, p'_{2^{N_g},i+1}), \end{aligned} \quad \forall k \in \{1, \ldots, 2^{N_g}\}, \qquad (17)$$
where $p_{k,i} = p_k(t_i, y_1, r_1, \ldots)$, $F_k(p_1, \ldots, p_{2^{N_g}}) = \partial p_k/\partial t$, and $p'_{k,i}$ is the discrete nomenclature for a prediction of the solution for $p_k$ at time $t_i$.

The method for solving the problem, in totality, is then given by Algorithm 3, where, again, the partial derivative terms on the right-hand side of each equation are evaluated using (16). Algorithm 3 is a direct implementation of (17), with the concurrent calculation of the distributions $P_t$, $\forall t \in \{0, \tau, \ldots, n_t \tau\}$, from the individual distributions $p_{k,i}$, $\forall k \in \{1, \ldots, 2^{N_g}\}$, $i \in \{0, \ldots, n_t\}$. For each time-step, we solve the right-hand side of the equation using the predictor-corrector scheme, update the value of each distribution, and calculate the total distribution, $P_t$.

Algorithm 3: PDMPLiouvillemaster
Input: 1. $p_{1,0}, p_{2,0}, p_{3,0}, p_{4,0}$ initial distributions in $(y_1, r_1, \ldots, y_{N_g}, r_{N_g})$, where $p_{i,t} := p_i(t, y_1, r_1, \ldots, y_{N_g}, r_{N_g})$. 2. $\tau$ time step for computing the time-dependent distribution. 3. $\delta y, \delta r$ bin sizes for the protein and mRNA distributions, respectively.
Output: $n_t + 1$ distributions $P_0, \ldots, P_{n_t}$ representing the sums of the individual distributions in $(y_1, r_1, y_2, r_2)$ at times $0, \tau, \ldots, n_t \tau$, respectively.

  for $i := 1$ to $n_t$ do
    for $j := 1$ to $2^{N_g}$ do
      Use $p_{j,i-1}$ and the discretised right-hand side of the PDE in (12) to compute the temporal gradient, $\partial_t p_{j,i-1}$
      Compute the predicted solution at the next time step as $p'_{j,i} = p_{j,i-1} + \tau \, \partial_t p_{j,i-1}$
    end for
    Set $P_i = 0$
    for $j := 1$ to $2^{N_g}$ do
      Use the prediction $p'_{j,i}$ and the discretised right-hand side of the PDE in (12) to compute the temporal gradient, $\partial_t p'_{j,i}$
      Compute the corrected solution at the next time step as $p_{j,i} = \frac{1}{2}(p_{j,i-1} + p'_{j,i}) + \frac{1}{2}\tau \, \partial_t p'_{j,i}$
      Set $P_i = P_i + p_{j,i}$
    end for
  end for
  return $(P_0, \ldots, P_{n_t})$

The push-forward method allows one to compute the multivariate probability distribution of the continuous variable $x$ at a time $\tau$, given the probability distribution of $(x, s)$ at time $0$. In order to achieve this, we consider a deterministic partition $0 = \tau_0 < \tau_1 < \ldots < \tau_M = \tau$ of the interval $[0, \tau]$, such that $\Delta_M = \max_{j \in [1,M]}(\tau_j - \tau_{j-1})$ is small. The main approximation of this method is to assume that $s_t$, for $t \in [0, \tau]$, is piecewise constant on this partition; more precisely, that $s_t = s_j := s_{\tau_j}$, for $t \in [\tau_j, \tau_{j+1})$, $0 \le j \le M-1$. This is rigorously true for intervals $[\tau_j, \tau_{j+1})$ completely contained between two successive random jump times of $s_t$. This situation becomes very frequent for a very fine partition (large $M$). Thus, the error generated by the approximation vanishes in the limit $M \to \infty$ (the rigorous result is Theorem 1, given in the Results section).

For each path realization $S_M := (s_0, s_1, \ldots, s_{M-1}) \in \Omega := S^M$ of the discrete states, we can compute $x(t)$, $t \in [0, \tau]$, as the continuous solution of the following ODE with piecewise-defined right-hand side:
$$\frac{\mathrm{d}x}{\mathrm{d}t} = V_{s_j}(x), \quad \text{for } t \in [\tau_j, \tau_{j+1}), \quad 0 \le j \le M-1, \qquad (18)$$
with the initial condition $x(0) = x_0$.

Algorithm 4: PDMPpushforward
Input: 1. $p_0(x, s)$, initial distribution of $(x_0, s_0)$. 2. $\tau$ time step for computing the time-dependent distribution.
Output: $n_t$ vectors $H_1, \ldots, H_{n_t}$, representing the distribution at times $\tau, 2\tau, \ldots, n_t \tau$.

  compute $H_0$, $P_{s_0}$ from the initial distribution
  for $i := 1$ to $n_t$ do
    for $j := 1$ to $n_x^N$ do
      for $s_0 := 1$ to $N_s$ do
        solve $\frac{\mathrm{d}x}{\mathrm{d}t} = V_{s_0}(x)$, $\frac{\mathrm{d}\Pi^0}{\mathrm{d}t} = H(x)\Pi^0$ with initial conditions $x(0) = C_j$, $\Pi^0(0) = I$, from $t = 0$ to $t = \tau_1$
        for $s_1 := 1$ to $N_s$ do
          solve $\frac{\mathrm{d}x}{\mathrm{d}t} = V_{s_1}(x)$, $\frac{\mathrm{d}\Pi^1}{\mathrm{d}t} = H(x)\Pi^1$ with initial conditions $x(\tau_1)$, $\Pi^1(0) = I$, from $t = \tau_1$ to $t = \tau_2$
          for $s_2 := 1$ to $N_s$ do
            solve $\frac{\mathrm{d}x}{\mathrm{d}t} = V_{s_2}(x)$, $\frac{\mathrm{d}\Pi^2}{\mathrm{d}t} = H(x)\Pi^2$ with initial conditions $x(\tau_2)$, $\Pi^2(0) = I$, from $t = \tau_2$ to $t = \tau_3$
            for $s_3 := 1$ to $N_s$ do
              solve $\frac{\mathrm{d}x}{\mathrm{d}t} = V_{s_3}(x)$ with initial condition $x(\tau_3)$, from $t = \tau_3$ to $t = \tau_4$
              $P_M = \Pi^2_{s_3,s_2} \Pi^1_{s_2,s_1} \Pi^0_{s_1,s_0} P_{s_0}$
              $x = x(\tau_4)$
              $H_i(\mathrm{BIN}(x)) = H_i(\mathrm{BIN}(x)) + P_M H_{i-1}(j)$, where $\mathrm{BIN}(x)$ is the bin containing $x$
            end for
          end for
        end for
      end for
    end for
  end for

Figure 1: Deterministic partition of time in the push-forward method.
Trajectories of the PDMP model (red line) having a jump of the discrete state inside $[\tau_1, \tau_2)$ are replaced by a trajectory (black line) having a jump of the discrete state at $\tau_2$.

In order to compute the probability $\mathbb{P}[S_M]$ of a path realization we can use the fact that, given $x_t$, $s_t$ is a finite state Markov process. Therefore,
$$\mathbb{P}[S_M] = \Pi_{s_{M-1}, s_{M-2}}(\tau_{M-2}, \tau_{M-1}) \ldots \Pi_{s_1, s_0}(\tau_0, \tau_1) \, P_{S_0}(s_0), \qquad (19)$$
where $P_{S_0} : S \to [0, 1]$ is the initial distribution of the discrete state, and $\Pi(\tau_j, \tau_{j+1})$ is the solution, at $t = \tau_{j+1}$, of
$$\frac{\mathrm{d}\Pi(\tau_j, t)}{\mathrm{d}t} = H(x_t) \Pi(\tau_j, t), \qquad (20)$$
with $\Pi(\tau_j, \tau_j) = I$, and $x_t$ given by (18).

In order to compute the probability distribution of $x$ at time $\tau$, one has to sum the contributions of all solutions of (18), obtained for the $N_s^M$ realisations of promoter state paths, with weights given by the probabilities of the paths. Suppose that we want to estimate the distribution of $x(\tau)$ using a multivariate histogram with bin centers $c^{(l_1, \ldots, l_N)} = (c^{l_1}_1, \ldots, c^{l_N}_N)$, $1 \le l_i \le n_x$, $1 \le i \le N$, where $n_x$ is the number of bins in each direction $x_i$, $1 \le i \le N$. The initial distribution of $x$ at time $t = 0$ is given by the bin probabilities $p_0^{(l_1, \ldots, l_N)}$, and the distribution at time $\tau$ by the probabilities $p_\tau^{(l_1, \ldots, l_N)}$, $1 \le l_i \le n_x$.

Let $x^{(l_1, \ldots, l_N)}(t)$ be the solution of (18) with $x^{(l_1, \ldots, l_N)}(0) = c^{(l_1, \ldots, l_N)}$, and let $(l'_1, \ldots, l'_N)$ be the histogram bin containing $x^{(l_1, \ldots, l_N)}(\tau)$. The application $(l_1, \ldots, l_N) \to (l'_1, \ldots, l'_N) := \psi(l_1, \ldots, l_N)$ is in general many-to-one.
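Equations (19)-(20) can be sketched in code. For brevity, the sketch below (our own helper names) integrates the propagators of (20) along a single given trajectory, so that they can be shared by all paths, as in the mean-field variant of the method; in Algorithm 4 the propagators are recomputed per bin and per path. Since the columns of $H$ sum to zero, the weights of all $N_s^M$ paths sum to one, which is checked in the usage example.

```python
import itertools

def path_probabilities(H_of_t, taus, P0):
    """Weights P[S_M] of Eq. (19) for all discrete-state paths.

    H_of_t(t) returns the transition-rate matrix H (columns summing to
    zero) along the relevant continuous trajectory; the propagator Pi on
    each partition interval solves d(Pi)/dt = H(x_t) Pi (Eq. (20)),
    integrated here with small explicit Euler steps. P0 is the initial
    discrete-state distribution.
    """
    ns = len(P0)

    def propagator(t0, t1, steps=200):
        dt = (t1 - t0) / steps
        Pi = [[1.0 if i == j else 0.0 for j in range(ns)] for i in range(ns)]
        for k in range(steps):
            H = H_of_t(t0 + k * dt)
            Pi = [[Pi[i][j] + dt * sum(H[i][l] * Pi[l][j] for l in range(ns))
                   for j in range(ns)] for i in range(ns)]
        return Pi

    props = [propagator(taus[j], taus[j + 1]) for j in range(len(taus) - 1)]
    weights = {}
    for path in itertools.product(range(ns), repeat=len(taus)):
        w = P0[path[0]]
        for j in range(len(taus) - 1):
            w *= props[j][path[j + 1]][path[j]]  # Pi_{s_{j+1}, s_j}
        weights[path] = w
    return weights
```

For a two-state chain with constant rates, `path_probabilities(lambda t: [[-1.0, 2.0], [1.0, -2.0]], [0.0, 0.5, 1.0], [0.7, 0.3])` enumerates the $2^3$ paths with nonnegative weights summing to one.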
Given the probability $\mathbb{P}[S_M]$ of a path $S_M \in \Omega$, the push-forward distribution of $x(\tau)$ is computed as
$$p_\tau^{(l'_1, \ldots, l'_N)} = \sum_{S_M \in \Omega} \; \sum_{\psi(l_1, \ldots, l_N) = (l'_1, \ldots, l'_N)} p_0^{(l_1, \ldots, l_N)} \, \mathbb{P}[S_M]. \qquad (21)$$
The push-forward method can be applied recursively to compute the distribution at times $\tau, 2\tau, \ldots, n_t \tau$. The complete algorithm is schematized in Algorithm 4 for the choice $M = 4$.

Remark 1.
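The assembly step (21) amounts to transporting each initial bin mass along the path flow and depositing it in the receiving bin $\psi(l_1, \ldots, l_N)$. A generic sketch, with hypothetical helper names of our own:

```python
def push_forward_histogram(p0, flows, weights, bin_of):
    """Push-forward update in the spirit of Eq. (21).

    p0      : dict bin index -> probability at time 0
    flows   : dict path -> function mapping a bin centre to x(tau) under (18)
    weights : dict path -> P[S_M], as in Eq. (19)
    bin_of  : function mapping a point x(tau) back to a bin index (psi)
    """
    p_tau = {}
    for path, w in weights.items():
        flow = flows[path]
        for b, p in p0.items():
            b2 = bin_of(flow(b))
            # several source bins may land in the same target bin (psi is
            # many-to-one), so contributions are accumulated
            p_tau[b2] = p_tau.get(b2, 0.0) + p * w
    return p_tau
```

Because every unit of initial mass is deposited exactly once, the output probabilities sum to the input total whenever the path weights sum to one.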
Algorithm 4 can be used for computing the full multivariate probability distribution of the vector $x(t)$, but also for computing marginal distributions. For a gene network, the full multivariate distribution involves the products from all genes, whereas a marginal distribution can select only one, or a small number of genes. For marginals, the dimension $N$ is replaced by $N_m < N$, where $N_m$ is the number of components of interest of the vector $x(t)$.

Remark 2.
For gene networks, the numerical integration of the ODEs (steps 7, 9, 11 in Algorithm 4) can be replaced by symbolic solutions. For each path realization $S_M := (s_0, s_1, \ldots, s_{M-1}) \in \Omega := \{0, 1, \ldots, N_s - 1\}^M$ of promoter states, we can compute the protein and mRNA levels, $y_t$ and $r_t$, respectively, of all genes $i$, at $t = \tau$:
$$r_\tau^i = r_0^i e^{-\rho\tau} + \frac{k_0}{\rho}\left(1 - e^{-\rho\tau}\right) + \frac{k_1 - k_0}{\rho} \sum_{j=0}^{M-1} e^{-\rho\tau}\left(e^{\rho\tau_{j+1}} - e^{\rho\tau_j}\right) s_{i,j}, \qquad (22)$$
$$y_\tau^i = y_0^i e^{-a\tau} + \frac{b r_0^i}{a - \rho}\left(e^{-\rho\tau} - e^{-a\tau}\right) + \frac{b k_0}{\rho}\left(\frac{1 - e^{-a\tau}}{a} + \frac{e^{-a\tau} - e^{-\rho\tau}}{a - \rho}\right) + \frac{b (k_1 - k_0)}{\rho} e^{-a\tau} \sum_{j=1}^{M} s_{i,j-1} \, w_j, \qquad (23)$$
where
$$w_j = \frac{e^{(a-\rho)\tau} - e^{(a-\rho)\tau_j}}{a - \rho}\left(e^{\rho\tau_j} - e^{\rho\tau_{j-1}}\right) - \frac{e^{(a-\rho)\tau_j} - e^{(a-\rho)\tau_{j-1}}}{a - \rho} e^{\rho\tau_{j-1}} + \frac{e^{a\tau_j} - e^{a\tau_{j-1}}}{a},$$
with $s_{i,j} := 0$ if the promoter $i$ is OFF for $t \in [\tau_j, \tau_{j+1})$ and $s_{i,j} := 1$ if the promoter $i$ is ON for $t \in [\tau_j, \tau_{j+1})$.

The complexity of the push-forward algorithm scales as $n_t n_x^N N_s^M (N + N_s^2)$, because there are $N_s^M$ possible paths $S_M$, $n_x^N$ histogram bins, $n_t$ time steps, and the complexity of solving (18), (20) is $O(N + N_s^2)$. The complexity of computing marginal distributions of $N_m < N$ variables is lower and scales as $n_t n_x^{N_m} N_s^M (N + N_s^2)$.

3.3.3. Mean field push-forward method

A way to mitigate the computational burden of Algorithm 4 is to use the mean field approximation.
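The closed-form sum (22) telescopes, so it can equivalently be evaluated interval by interval with the exact one-segment solution; this is also a convenient way to validate an implementation against brute-force numerical integration. A sketch in our own formulation:

```python
import math

def mrna_after_path(r0, k0, k1, rho, taus, s_path):
    """mRNA level at tau_M for a promoter path that is constant on each
    interval [tau_j, tau_{j+1}), using the exact per-interval solution

        r(t + dt) = r(t)*exp(-rho*dt) + (k/rho)*(1 - exp(-rho*dt)),

    with k = k1 on ON intervals and k = k0 on OFF intervals. Chaining
    these exact steps reproduces the closed-form sum (22)."""
    r = r0
    for j, s in enumerate(s_path):
        dt = taus[j + 1] - taus[j]
        k = k1 if s == 1 else k0
        e = math.exp(-rho * dt)
        r = r * e + (k / rho) * (1.0 - e)
    return r
```

A fine Euler integration of $\mathrm{d}r/\mathrm{d}t = k(\sigma) - \rho r$ over the same path converges to the same value, which makes a useful regression test.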
In the mean field approximation, the probabilities $\mathbb{P}[S_M]$ are computed from averaged equations that are identical for each histogram bin. More precisely, equation (20) is replaced by
$$\frac{\mathrm{d}\Pi(t', t)}{\mathrm{d}t} = \mathbb{E}[H(x_t)] \, \Pi(t', t), \qquad (24)$$
where $t' \le t$, $\Pi(t', t') = I$. Using a Taylor expansion of $H(x_t)$ around the expectation $\mathbb{E}[x_t]$, one gets
$$\frac{\mathrm{d}\Pi(t', t)}{\mathrm{d}t} = \left( H(\mathbb{E}[x_t]) + \frac{1}{2} \, H''(\mathbb{E}[x_t]) : \mathrm{Var}(x_t) \right) \Pi(t', t), \qquad (25)$$
where $\mathrm{Var}(x_t)$ is the variance/covariance matrix of $x_t$, $H''$ is the element-wise second derivative matrix of $H$, and $:$ stands for the double dot product. In (25) the moments $\mathbb{E}[x_t]$ and $\mathrm{Var}(x_t)$ are either available analytically or are solutions of ODEs obtained using moment closures, such as in [22].

The main advantage of the mean-field approximation is that the transition matrix elements can be computed outside the discrete variables loop. The complete algorithm is presented in Algorithm 5. The time complexity in this case is $n_t n_x^N N_s^M N$, because in the innermost loop one has to solve only (18), whose time complexity is $O(N)$.

For gene networks, the mean field push-forward procedure can also be applied to the ODE dynamics of the individual genes. This is possible because during ON or OFF periods, the ODE dynamics of one gene is uncoupled from that of another gene. Furthermore, the $N_s \times N_s$ transition matrix $H$ can be replaced by $N/2$ transition matrices of one gene, with elements averaged over the values of the other genes. This approximation reduces the complexity of the calculations to $n_t (n_x)^2 N 2^M$, which is linear in the number of genes. The mean field push-forward algorithm for gene networks is presented in Algorithm 6.

This method has already been used for particular models. In [13] we replaced the regulation term $f_2 y_1(t)$ occurring in the transition matrix of the gene network model M1 by its mean $f_2 \mathbb{E}[y_1(t)]$.
This means that the gene switches between its ON and OFF states with rates given by the mean of the regulatory protein $y$. In this case both $H$ and $\Pi$ can be computed analytically, which leads to a drastic reduction in the execution time. This simple mean field approach is suitable for the model $M_1$, which contains only linear regulation terms. For non-linear regulation terms, $\Pi$ can not generally be computed analytically. Moreover, the naive mean field approach introduces biases. For instance, in the case of the model $M_2$, the approximation $f\,y(t)/(K + y(t)) \approx f\,E[y(t)]/(K + E[y(t)])$ is poor. In the CMSB2019 paper [15] we proposed a better approximation, in which we replace $f\,y(t)/(K + y(t))$ by its mean value and use

$$ E\left[\frac{f\,y(t)}{K + y(t)}\right] \approx \frac{f\,E[y(t)]}{K + E[y(t)]} - \frac{f K}{(K + E[y(t)])^3}\,\mathrm{Var}(y(t)), \quad (26) $$

in order to correct the bias. This approach is generalized by (25).

Algorithm 5 PDMPpushforwardmeanfield
Input:
1. $p_g(x, s)$, initial distribution of $(x_0, s_0)$.
2. $\tau$, time step for computing the time-dependent distribution.
Output: $n_t$ vectors $H_0, \ldots, H_{n_t}$ representing the distribution of $x$ at times $0, \tau, \ldots, n_t\tau$.
compute $H_0$, $P_0$ from the initial distribution
for $i := 1$ to $n_t$ do
  for $s_1 := 1$ to $N_s$ do
    for $s_2 := 1$ to $N_s$ do
      for $s_3 := 1$ to $N_s$ do
        for $s_4 := 1$ to $N_s$ do
          Compute $P_M(s_1, s_2, s_3, s_4) = \Pi_{s_1,s_2}(\tau_1, \tau_2)\,\Pi_{s_2,s_3}(\tau_2, \tau_3)\,\Pi_{s_3,s_4}(\tau_3, \tau_4)\,P_{s_1}$, where $\Pi(t', t)$ is the solution of (25) and $0 = \tau_0 < \tau_1 < \tau_2 < \tau_3 < \tau_4 = \tau$
        end for
      end for
    end for
  end for
  for $s_1 := 1$ to $N_s$ do
    for $j := 1$ to $n_x^N$ do
      solve $\mathrm{d}x_t/\mathrm{d}t = V_{s_1}(x_t)$ with initial condition $x(0) = C_j$, from $t = 0$ to $t = \tau_1$
      for $s_2 := 1$ to $N_s$ do
        solve $\mathrm{d}x_t/\mathrm{d}t = V_{s_2}(x_t)$ with initial condition $x(\tau_1)$, from $t = \tau_1$ to $t = \tau_2$
        for $s_3 := 1$ to $N_s$ do
          solve $\mathrm{d}x_t/\mathrm{d}t = V_{s_3}(x_t)$ with initial condition $x(\tau_2)$, from $t = \tau_2$ to $t = \tau_3$
          for $s_4 := 1$ to $N_s$ do
            solve $\mathrm{d}x_t/\mathrm{d}t = V_{s_4}(x_t)$ with initial condition $x(\tau_3)$, from $t = \tau_3$ to $t = \tau_4$
            $x^* := x(\tau_4)$
            $H_i(\mathrm{BIN}(x^*)) := H_i(\mathrm{BIN}(x^*)) + P_M(s_1, s_2, s_3, s_4)\,H_{i-1}(j)$, where $\mathrm{BIN}(x^*)$ is the bin containing $x^*$
          end for
        end for
      end for
    end for
  end for
end for

Algorithm 6 PDMPpushforwardmeanfieldgenenetwork
Input:
1. for each gene $g \in \{1, 2, \ldots, N/2\}$, $p_g(x, s)$, initial distribution of $(x_0, s_0)$, where $x = (r, p)$.
2. $\tau$, time step for computing the time-dependent distribution.
Output: for each gene $g \in \{1, 2, \ldots, N/2\}$, $n_t$ vectors $H_0, \ldots, H_{n_t}$ representing the distribution at times $0, \tau, \ldots, n_t\tau$.
for $g := 1$ to $N/2$ do
  compute $H_0$, $P_0$ from the initial distribution for gene $g$
  for $i := 1$ to $n_t$ do
    for $s_1 := 0$ to $1$ do
      for $s_2 := 0$ to $1$ do
        for $s_3 := 0$ to $1$ do
          for $s_4 := 0$ to $1$ do
            Compute $P_M(s_1, s_2, s_3, s_4) = \Pi_{s_1,s_2}(\tau_1, \tau_2)\,\Pi_{s_2,s_3}(\tau_2, \tau_3)\,\Pi_{s_3,s_4}(\tau_3, \tau_4)\,P_{s_1}$, where $\Pi(t', t)$ is the solution of (25) for gene $g$
          end for
        end for
      end for
    end for
    for $s_1 := 0$ to $1$ do
      for $j := 1$ to $n_x^2$ do
        solve $\mathrm{d}x_t/\mathrm{d}t = V_{s_1}(x_t)$ with initial condition $x(0) = C_j$, from $t = 0$ to $t = \tau_1$, where $V_s$ is defined by (10) for gene $g$
        for $s_2 := 0$ to $1$ do
          solve $\mathrm{d}x_t/\mathrm{d}t = V_{s_2}(x_t)$ with initial condition $x(\tau_1)$, from $t = \tau_1$ to $t = \tau_2$
          for $s_3 := 0$ to $1$ do
            solve $\mathrm{d}x_t/\mathrm{d}t = V_{s_3}(x_t)$ with initial condition $x(\tau_2)$, from $t = \tau_2$ to $t = \tau_3$
            for $s_4 := 0$ to $1$ do
              solve $\mathrm{d}x_t/\mathrm{d}t = V_{s_4}(x_t)$ with initial condition $x(\tau_3)$, from $t = \tau_3$ to $t = \tau_4$
              $x^* := x(\tau_4)$
              $H_i(\mathrm{BIN}(x^*)) := H_i(\mathrm{BIN}(x^*)) + P_M(s_1, s_2, s_3, s_4)\,H_{i-1}(j)$, where $\mathrm{BIN}(x^*)$ is the bin containing $x^*$
            end for
          end for
        end for
      end for
    end for
  end for
end for
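To make the loop structure of Algorithms 5 and 6 concrete, here is a deliberately stripped-down Python sketch: a single gene, mRNA only, $M = 2$ sub-intervals, exact flows and an analytic two-state (telegraph) transition matrix. All parameter values are illustrative. It propagates a binned density over one step $\tau$, and probability mass is conserved by construction:

```python
import numpy as np

rho, k = 1.0, np.array([4.0, 40.0])   # degradation; OFF/ON synthesis rates
f, h = 0.5, 0.5                        # ON-switching / OFF-switching rates
tau, M = 2.0, 2                        # push-forward step, sub-intervals
nx = 200
edges = np.linspace(0.0, 1.05 * k[1] / rho, nx + 1)
centers = 0.5 * (edges[:-1] + edges[1:])

p_on, lam = f / (f + h), f + h
def Pi(dt):
    # telegraph transition matrix; Pi[s2, s1] = P(state s2 at t+dt | s1 at t)
    e = np.exp(-lam * dt)
    return np.array([[(1 - p_on) + p_on * e, (1 - p_on) * (1 - e)],
                     [p_on * (1 - e),        p_on + (1 - p_on) * e]])

def flow(r, s, dt):
    # exact solution of dr/dt = k_s - rho * r over a sub-interval
    return r * np.exp(-rho * dt) + (k[s] / rho) * (1.0 - np.exp(-rho * dt))

# initial distribution: all mass at r = 0, promoter OFF
H = np.zeros(nx); H[0] = 1.0
P0 = np.array([1.0, 0.0])

dt = tau / M
Pstep = Pi(dt)
Hnew = np.zeros(nx)
for s1 in range(2):
    for s2 in range(2):
        w = P0[s1] * Pstep[s2, s1]               # path probability
        x = flow(flow(centers, s1, dt), s2, dt)  # push bin centers forward
        idx = np.clip(np.searchsorted(edges, x, side='right') - 1, 0, nx - 1)
        np.add.at(Hnew, idx, w * H)              # accumulate pushed mass
print(Hnew.sum())   # ~1: probability mass is conserved
```

The full algorithms differ only in bookkeeping: more sub-intervals, the mean-field matrix (25) instead of the constant-rate telegraph matrix, and a loop over genes.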
As in [13] we can use analytic expressions for $E[y(t)]$, but also for $\mathrm{Var}(y(t))$. These expressions can be found in the Appendix A. Although the elements of matrix $H$ have analytic expressions, the elements of the matrix $\Pi$ contain integrals that must be computed numerically. For the model $M_2$, we have

$$ \Pi_1(\tau, \tau') = \begin{bmatrix} (1 - p_{1,on}) + p_{1,on}\,e^{-\epsilon(\tau'-\tau)} & (1 - p_{1,on})(1 - e^{-\epsilon(\tau'-\tau)}) \\ p_{1,on}(1 - e^{-\epsilon(\tau'-\tau)}) & p_{1,on} + (1 - p_{1,on})\,e^{-\epsilon(\tau'-\tau)} \end{bmatrix}, \quad (27) $$

for the transition rates of the first gene, where $p_{1,on} = f_1/(f_1 + h_1)$, $\epsilon = (f_1 + h_1)/\rho$, and

$$ \Pi_2(\tau, \tau') = \begin{bmatrix} K(\tau, \tau') + h_2\int_\tau^{\tau'} K(t, \tau')\,dt & h_2\int_\tau^{\tau'} K(t, \tau')\,dt \\ 1 - K(\tau, \tau') - h_2\int_\tau^{\tau'} K(t, \tau')\,dt & 1 - h_2\int_\tau^{\tau'} K(t, \tau')\,dt \end{bmatrix}, \quad (28) $$

for the transitions of the second gene, where $K(\tau, \tau') = e^{-\int_\tau^{\tau'} (h_2 + F(t))\,dt}$ and $F(t) = f_2\, E\left[\frac{y_1(t)}{K + y_1(t)}\right]$.
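Two quick consistency checks of the formulas above, in Python (the Gamma distribution and all rate values are illustrative stand-ins): the second-order correction (26) should beat the naive mean-field closure, and the matrix (27) should be stochastic, with columns summing to one and $\Pi_1(\tau, \tau) = I$:

```python
import math
import numpy as np

# (i) naive closure vs second-order correction (26) for E[f*y/(K+y)]
f_rate, K = 1.0, 5.0
shape, scale = 4.0, 2.0                 # y ~ Gamma(4, 2): mean 8, variance 16
mu, var = shape * scale, shape * scale**2
y = np.linspace(1e-6, 200.0, 400001)
dy = y[1] - y[0]
pdf = y**(shape - 1) * np.exp(-y / scale) / (math.gamma(shape) * scale**shape)
exact = np.sum(f_rate * y / (K + y) * pdf) * dy   # numerical reference
naive = f_rate * mu / (K + mu)
corrected = naive - f_rate * K * var / (K + mu)**3   # eq. (26)
print(abs(naive - exact), abs(corrected - exact))    # correction is closer

# (ii) Pi_1 of eq. (27) is a stochastic matrix and equals I at tau' = tau
f1, h1, rho = 0.5, 0.5, 1.0
p_on, eps = f1 / (f1 + h1), (f1 + h1) / rho
def Pi1(dt):
    e = math.exp(-eps * dt)
    return np.array([[(1 - p_on) + p_on * e, (1 - p_on) * (1 - e)],
                     [p_on * (1 - e),        p_on + (1 - p_on) * e]])
print(np.allclose(Pi1(0.0), np.eye(2)), np.allclose(Pi1(1.3).sum(axis=0), 1.0))
```

The column-sum property is what guarantees that the push-forward algorithm conserves probability mass when the analytic $\Pi_1$ is used.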
4. Results
The probability distribution obtained with the push-forward method converges to the exact PDMP distribution in the limit $M \to \infty$. This is a consequence of the following theorem:

Theorem 1.
Let $\Phi_{S_M}(t, x)$ be the flow defined by the formulas (18), such that $x_t = \Phi_{S_M}(t, x(0))$ for $t \in [0, \tau]$. Let $\mu_t^M : \mathcal{B}(\mathbb{R}^N) \to \mathbb{R}_+$ be the probability measure defined as $\mu_t^M(A) = \sum_{S_M \in \Omega} P[S_M]\,\mu_0(\Phi_{S_M}^{-1}(t, A))$, where $\mu_0 : \mathcal{B}(\mathbb{R}^N) \to \mathbb{R}_+$ is the probability distribution of $x$ at $t = 0$, $P[S_M]$ is given by (19), and $\mathcal{B}(\mathbb{R}^N)$ is the $\sigma$-algebra of Borel sets on $\mathbb{R}^N$. Let $\mu_t$ be the exact distribution of $x_t$ for the PDMP defined in Section 2.2, with initial values $(x_0, s_0)$ distributed according to $\mu_0 \times P_{S_0}$. Assume that the vector fields $V_s$, $s \in S$, and the transition matrix $H$ are $C^1$ functions and that there exists a bounded set $B$ of $\mathbb{R}^N$ such that all flows $\Phi_s$, $s \in S$, leave $B$ invariant. Assume that $|\tau_i - \tau_{i-1}| < C/M$ for all $i \in [1, M]$, where $C$ is a positive constant. Then, for all $t \in [0, \tau]$, $\mu_t^M$ converges in distribution to $\mu_t$ when $M \to \infty$. More precisely, for all Lipschitz functions $\varphi$ on $\mathbb{R}^N$, there exists a constant $\kappa$ depending on the data of the PDMP, $\tau$ and the Lipschitz constant of $\varphi$ such that:

$$ \left| \int_B \varphi(x)\,\mu_t^M(dx) - \int_B \varphi(x)\,\mu_t(dx) \right| = \left| E(\varphi(\Phi_{S_M}(t, x_0))) - E(\varphi(x_t)) \right| \leq \kappa/M. $$

Also, for all Borel sets $A$ in $\mathcal{B}(\mathbb{R}^N)$ such that $\mu_t(\partial A) = 0$, we have $\mu_t^M(A) \to \mu_t(A)$ when $M \to \infty$. Here, $\partial A$ denotes the boundary of $A$.

The proof of this theorem is given in Appendix B. It is inspired by the classical proof of weak order of convergence for the Euler scheme for a stochastic differential equation (see [23]).

Remark 3.
If the flows $\Phi_s$ are not known explicitly, a numerical scheme can be used. This introduces another source of error. Our proof easily extends and provides a similar result of convergence. If one wants to investigate the convergence of $p_\tau^{(l_1, \ldots, l_N)}$ given by (21), this follows from the above theorem only under the assumption that the probability that the PDMP is on the boundary of the bins at time $t$ is 0. Note this is not restrictive and happens only in pathological situations. Also, if the initial distribution has a smooth density, we can prove that the error estimate above holds for bounded Borel functions $\varphi$; thus we can choose $\varphi$ to be an indicator function and obtain error bounds for these probabilities without this restriction. More precisely, we have for any Borel set $A$:

$$ \left| \mu_t^M(A) - \mu_t(A) \right| \leq \kappa/M, $$

where now $\kappa$ depends on the Lipschitz constant of the initial density, see Remark 4 in Appendix B.

In order to test the push-forward method, we compared the resulting probability distributions with the ones obtained by the Monte-Carlo method using the direct simulation of the PDMP. We considered the models $M_1$ and $M_2$ with the following parameters: $\rho = 1$, $p_{on} = f/(f+h) = 1/2$, $a = 1/2$, $b = 4$, $k_0 = 4$, $k_1 = 40$ for the two genes. The parameter $\epsilon$ took two values: $\epsilon = 0.5$ for slow genes and $\epsilon = (f+h)/\rho = 5.0$ for fast genes. We tested the slow-slow and the fast-fast combinations of parameters.

The initial distribution of the promoter states was $P_{S_0}((0,0)) = 1$, where the state $(0,0)$ means that both promoters are OFF. The initial probability measure $\mu_0$ was a Dirac delta distribution centered at $r_1 = r_2 = 0$ and $y_1 = y_2 = 0$. This is obtained by always starting the direct simulation of the PDMP from $r_1(0) = r_2(0) = 0$, $y_1(0) = y_2(0) = 0$, and $s_1 = s_2 = 0$. The simulations were performed between $t = 0$ and $t_{max} = 20$ for fast genes and between $t = 0$ and $t_{max} = 90$ for slow genes.
In order to estimate the distributions we have used $M_C = 50000$ samples for the highest sampling. The push-forward method was implemented with $M = 10$ equal length sub-intervals of $[0, \tau]$. The time step $\tau$ was chosen $\tau = 2$ for fast genes and $\tau = 15$ for slow genes. The procedure was iterated 10 times for fast genes (up to $t_{max} = 20$) and 6 times for slow genes (up to $t_{max} = 90$).

The execution times are provided in the Table 1. The comparison of the probability distributions is illustrated in the Figures 2 and 3. In order to quantify the relative difference between methods we use the $L^1$ distance between distributions. More precisely, if $p(x)$ and $\tilde{p}(x)$ are probability density functions to be compared, the distance between distributions is

$$ d = \int |p(x) - \tilde{p}(x)|\,dx. \quad (29) $$

This distance was computed between distributions resulting from the push-forward method and the Monte-Carlo method with the highest sampling. We have also used a reduced sampling Monte-Carlo scheme whose execution time is similar to the one of the push-forward method. The distributions resulting from low sampling and high sampling Monte-Carlo were compared using the same distance. Figures 2 and 3 clearly show that, for the same execution time, the push-forward method outperforms the Monte-Carlo method.

Model | Monte-Carlo high sampling [min] | Push-forward [s]
$M_1$ slow-slow | 45 | 20
$M_1$ fast-fast | 74 | 30
$M_2$ slow-slow | 447 | 20
$M_2$ fast-fast | 758 | 30

Table 1: Execution times for different methods. All the methods were implemented in Matlab R2013b running on a single core (multi-threading inactivated) of an Intel i5-700u 2.5 GHz processor. The Monte-Carlo method computed the next jump waiting time (Algorithm 1) using the analytical solution of Eq. (15) for $M_1$ and the numerical solution of Eq. (15) for $M_2$. The push-forward method used Algorithm 6 and analytic solutions for mRNA and protein trajectories from (23), (18), and numerical computation of the integrals in Eq. (28), for both models.
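The distance (29) is straightforward to evaluate from binned densities. A minimal sketch in Python, with hypothetical Gaussian-shaped histograms on equal-width bins:

```python
import numpy as np

def l1_distance(p, q, dx):
    """Eq. (29) for two probability densities tabulated on equal-width bins."""
    return np.sum(np.abs(p - q)) * dx

edges = np.linspace(0.0, 10.0, 101)
dx = edges[1] - edges[0]
centers = 0.5 * (edges[:-1] + edges[1:])
p = np.exp(-0.5 * (centers - 4.0) ** 2); p /= p.sum() * dx   # normalize
q = np.exp(-0.5 * (centers - 6.0) ** 2); q /= q.sum() * dx

print(l1_distance(p, p, dx))   # 0.0
print(l1_distance(p, q, dx))   # between 0 and 2 for normalized densities
```

For normalized densities $0 \leq d \leq 2$, with $d = 0$ for identical distributions and $d = 2$ for distributions with disjoint supports.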
We have also implemented Algorithm 4 in Python, for the one gene model, to compute the mRNA ($r(t)$) and protein ($y(t)$) probability densities, $P(r,t)$ and $P(y,t)$, respectively, in two switch regimes: slow ($\epsilon = 0.5$) and fast ($\epsilon = 5.0$). Also in Python, we have implemented the Algorithm 5, for the two gene models, to compute the probability densities for the mRNA and protein associated with gene one ($P(r_1,t)$ and $P(y_1,t)$), and for the mRNA and protein associated with gene two ($P(r_2,t)$ and $P(y_2,t)$), considering four different switch regimes: slow-slow, fast-fast, slow-fast and fast-slow. For all implementations each time interval $[0, \tau]$ has been partitioned in four sub-intervals of equal size ($M = 4$), resulting in the sequence $\{s_0, s_1, s_2, s_3\}$ representing the state of the switch inside each sub-interval ($s_j := s_{t_j}$ for $t \in [t_j, t_{j+1})$, $j = 0, \ldots, M-1$), leading to $2^4 = 16$ path realizations per gene. For all slow genes $t_0 = 0$, $\tau = 9$ and we have evaluated the solution up to $t_{max} = 90$, using the composition rule between successive time intervals. For all fast genes $t_0 = 0$, $\tau = 1$ and we have evaluated the solution up to $t_{max} = 20$, using the composition rule between successive time intervals.

We have compared the results obtained by the implementation of the push-forward method (PF), as described above, with Monte-Carlo simulation (MC), Algorithms 1 and 2, the Gillespie algorithm (G), and with the numerical solution of the Liouville-master equation (LME), Algorithm 3. The comparison between execution times for each model and each method can be found in Table 2 (all the execution times are expressed in minutes).

In order to illustrate our results we have produced Figures 4, 5, 6 and 7, for selected models and for the methods: push-forward (PF), Monte-Carlo simulation

Figure 2: Histograms of protein for the second gene, produced by the Monte-Carlo method (green lines) and by the Push-forward method (black lines) for the model $M_1$.
The green dotted line results from low sampling Monte-Carlo with similar execution time as the push-forward method, whereas the solid green line results from high sampling Monte-Carlo. The distances, defined by (29), are between low sampling and high sampling Monte-Carlo ($d^*$) and between push-forward and high sampling Monte-Carlo ($d$).

(MC) and Liouville-master equation (LME). The one gene model (Figures 4 and 5) has as initial conditions $p_0(0) = 1$, where $p_0(0)$ is the probability to find the switch in state 0 at time $t = 0$, and $r(0) = y(0) = 0$, for both switch regimes: slow ($\epsilon = 0.5$) and fast ($\epsilon = 5.0$). The two genes model (Figures 6 and 7) has as initial conditions $p_0^{(1)}(0) = p_0^{(2)}(0) = 1$ (meaning that at time $t = 0$ both genes are in state 0 with probability one), $r_1(0) = y_1(0) = r_2(0) = y_2(0) = 0$, for the switch configurations: slow-slow and slow-fast. We choose the same parameters for all the models and for all the genes, and the values are: $\epsilon_1 = \epsilon_2 = 0.5$ (slow switch), $\epsilon_1 = \epsilon_2 = 5.0$ (fast switch), $p_{1,on} = p_{2,on} = 0.5$, $\rho_1 = \rho_2 = 1$, $k_1(\sigma=0) = k_2(\sigma=0) = 4$, $k_1(\sigma=1) = k_2(\sigma=1) = 40$, $a_1 = a_2 = 1/2$ and $b_1 = b_2 = 4$. We have quantified the relative difference, using Eq. (29), between the methods push-forward and Monte-Carlo (high sampling), expressed in the graphics by $d_{PF}$, and between the methods Liouville-master equation and Monte-Carlo (high sampling), encoded by $d_{LME}$.

Figure 3: Histograms of protein for the second gene, produced by the Monte-Carlo method (green lines) and by the Push-forward method (black lines) for the model $M_2$. The green dotted line results from low sampling Monte-Carlo with similar execution time as the push-forward method, whereas the solid green line results from high sampling Monte-Carlo. The distances, defined by (29), are between low sampling and high sampling Monte-Carlo ($d^*$) and between push-forward and high sampling Monte-Carlo ($d$).
5. Discussion and conclusion
Combining direct simulation of PDMP gene network models and analytic formulae for the ODE flow represents an effective, easy to implement method for computing time dependent, multivariate probability distributions of these models. However, the precision of the Monte-Carlo estimates of the distributions increases with $\sqrt{M_C}$, where $M_C$ is the number of Monte-Carlo samples. For this reason, the execution time of this method, although smaller compared to PDMP simulation methods which implement numerical resolution of the ODEs, such as reported in [17] (data not shown), is large compared to deterministic methods such as the push-forward method.

The push-forward method represents an effective alternative to Monte-Carlo and Liouville-master PDE methods, ensuring reduced execution time.

Model | Monte-Carlo (high sampling) | G | PF | LME
One gene slow | - | 6425 | 2.55 | 97.39
One gene fast | - | 979 | 5.12 | 21.64
Two genes slow-slow | 45 | 15739 | 6.86 | 97.39
Two genes fast-fast | 74 | 1609 | 10.85 | 21.64
Two genes slow-fast | 243 | 37935 | 9.22 | 21.64
Two genes fast-slow | 249 | 41232 | 8.83 | 97.39

Table 2: Execution times (in minutes for all cases) for different methods to compute the probability distributions for the one gene model and the two genes model. Methods: G = Gillespie, PF = Push-forward, LME = Liouville-master equation.

With respect to an earlier implementation of this method for gene networks in [12], we used promoter states instead of mRNA copy numbers as discrete variables of the PDMP. As a consequence, the number of discrete states is lower and we can afford to increase the number $M$ of temporal subdivisions. Compared to the similar work done in [13], we used second moments of the protein distribution, which took into account the correlation of the promoter states and led to increased accuracy in the case of nonlinear regulation. Although the protein moments and the exponential transition rate matrix $\Pi$ can be computed numerically, the effectiveness of the push-forward method is increased when analytic expressions are available for these quantities. In this paper, these expressions were computed for particular cases.

The push-forward method is an approximate method, and its accuracy relies on the careful choice of the temporal and spatial step densities, namely of the integers $M$, $n_t$, $n_x$.
We have rigorously proved that the convergence of the push-forward method is of order one, with errors that scale with $1/M$. We situate our findings in the broader effort of the community to produce new effective simulation algorithms for computational biology.

Figure 4: This set of graphics shows the comparison between the computed probability distribution for mRNA ($P(r,t)$) and protein ($P(y,t)$), using the push-forward method (Algorithm 4), direct Monte-Carlo simulation (Algorithms 1 and 2) and the numerical solution of the Liouville-master equation (Algorithm 3). The model is for one gene in the slow switch regime ($\epsilon = 0.5$). The execution time of PF is 2.55 minutes in Python and 97.39 minutes for LME in MATLAB. The values $d_{LME}$ and $d_{PF}$ are presented in the graphic and measure the distances between the LME solution and the MC simulation, and between PF and the MC simulation, respectively, as given in Eq. (29). The values of the parameters are: $p_{on} = 0.5$, $k_0 = 4$, $k_1 = 40$, $a = 1/2$ and $b = 4$. For initial conditions we set: $r(0) = y(0) = 0$ and $p_0(0) = 1$.

Figure 5: This set of graphics shows the comparison between the computed probability distribution for mRNA ($P(r,t)$) and protein ($P(y,t)$), using PF (Algorithm 4), MC simulation (Algorithms 1 and 2) and the numerical solution of the LME (Algorithm 3). The model is for one gene in the fast switch regime ($\epsilon = 5.0$). The execution time of PF is 5.12 minutes in Python, and 21.64 minutes for LME in MATLAB. The values $d_{LME}$ and $d_{PF}$ are presented in the graphic and measure the distances between the LME solution and the MC simulation, and between PF and the MC simulation, respectively, as given in Eq. (29). The values of the parameters are: $p_{on} = 0.5$, $k_0 = 4$, $k_1 = 40$, $a = 1/2$ and $b = 4$. For initial conditions we set: $r(0) = y(0) = 0$ and $p_0(0) = 1$.
Figure 6: This set of graphics shows the comparison between the computed probability density for mRNA ($P(r_2,t)$) and protein ($P(y_2,t)$) associated with gene two, using PF (Algorithm 5), MC simulation (Algorithms 1 and 2) and the numerical solution of the LME (Algorithm 3). The model is $M_2$ in the slow-slow regime ($\epsilon_1 = \epsilon_2 = 0.5$). The execution time of PF is 6.86 minutes in Python, and 97.39 minutes for LME in MATLAB. The values $d_{LME}$ and $d_{PF}$ are presented in the graphic and measure the distances between the LME solution and the MC simulation, and between PF and the MC simulation, respectively, as given in Eq. (29). The values of the parameters, for both genes, are: $p_{on} = 0.5$, $k_0 = 4$, $k_1 = 40$, $a = 1/2$ and $b = 4$. For initial conditions we set: $r_1(0) = y_1(0) = r_2(0) = y_2(0) = 0$ and $p_0(0) = 1$ for both genes.

Figure 7: This set of graphics shows the comparison between the computed probability density for mRNA ($P(r_2,t)$) and protein ($P(y_2,t)$) associated with gene two, using PF (described in Algorithm 5), MC simulation (Algorithms 1 and 2) and the numerical solution of the LME (Algorithm 3). The model is $M_2$ in the fast-slow regime ($\epsilon_1 = 5.0$ and $\epsilon_2 = 0.5$). The execution time of PF is 8.83 minutes in Python and 97.39 minutes for LME in MATLAB. The values $d_{LME}$ and $d_{PF}$ are presented in the graphic and measure the distances between the LME solution and the MC simulation, and between PF and the MC simulation, respectively, as given in Eq. (29). The values of the parameters, for both genes, are: $p_{on} = 0.5$, $k_0 = 4$, $k_1 = 40$, $a = 1/2$ and $b = 4$. For initial conditions we set: $r_1(0) = y_1(0) = r_2(0) = y_2(0) = 0$ and $p_0(0) = 1$ for both genes.

Appendices

Appendix A. Expectation and variance of the protein

For the sake of completeness, we give explicit formulas for the expectation and the variance of the protein synthesized by a constitutive promoter (gene 1 of models $M_1$ and $M_2$).
The details of the calculation can be found in [15]. The expectation is given by

$$ E[y_t] = M_0 + M_1 e^{-at} + M_2 e^{-\rho t} + M_3 e^{-\rho\epsilon t}, \quad (A.1) $$

where

$$ M_0 = \frac{b(k_0 + (k_1 - k_0)p_{on})}{a\rho}, $$
$$ M_1 = E[y_0] - \frac{b\,E[r_0]}{a-\rho} + \frac{b k_0}{a(a-\rho)} + \frac{b(k_1-k_0)(p_0 - p_{on})}{(a-\rho\epsilon)(a-\rho)} + \frac{b(k_1-k_0)p_{on}}{a(a-\rho)}, $$
$$ M_2 = \frac{b\,E[r_0]}{a-\rho} - \frac{b k_0}{\rho(a-\rho)} - \frac{b(k_1-k_0)(p_0 - p_{on})}{\rho(1-\epsilon)(a-\rho)} - \frac{b(k_1-k_0)p_{on}}{\rho(a-\rho)}, $$
$$ M_3 = \frac{b(k_1-k_0)(p_0 - p_{on})}{\rho(1-\epsilon)(a-\rho\epsilon)}. $$

Here $p_0$ is the initial probability that the promoter is ON and $p_{on} = f/(f+h)$ its stationary value. The variance is given by
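(A.2) below. Before that, the expectation (A.1) can be checked for internal consistency: at $t = 0$ the four terms must sum to $E[y_0]$, and as $t \to \infty$ the mean must relax to $M_0$. A Python sketch with arbitrary, purely illustrative parameter values:

```python
import numpy as np

a, rho, eps = 0.5, 1.0, 0.3        # protein decay, mRNA decay, scaled switch rate
b, k0, k1 = 4.0, 4.0, 40.0         # translation rate, OFF/ON synthesis rates
p_on, p0 = 0.4, 0.0                # stationary and initial ON probabilities
Ey0, Er0 = 0.7, 0.2                # initial protein and mRNA means

dk, dp = k1 - k0, p0 - p_on
M0 = b * (k0 + dk * p_on) / (a * rho)
M1 = (Ey0 - b * Er0 / (a - rho)
      + b * k0 / (a * (a - rho))
      + b * dk * dp / ((a - rho * eps) * (a - rho))
      + b * dk * p_on / (a * (a - rho)))
M2 = (b * Er0 / (a - rho)
      - b * k0 / (rho * (a - rho))
      - b * dk * dp / (rho * (1 - eps) * (a - rho))
      - b * dk * p_on / (rho * (a - rho)))
M3 = b * dk * dp / (rho * (1 - eps) * (a - rho * eps))

def Ey(t):   # eq. (A.1)
    return M0 + M1 * np.exp(-a * t) + M2 * np.exp(-rho * t) + M3 * np.exp(-rho * eps * t)

print(Ey(0.0) - Ey0)    # ~0: consistency with the initial condition
print(Ey(1e3) - M0)     # ~0: relaxation to the stationary mean
```

The same two limits provide a useful regression test for any implementation of the closed-form moments.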
$$ \mathrm{Var}(y_t) = \mathrm{Var}(y_0)\,e^{-2at} + b^2\,\mathrm{Var}(r_0)\left(\frac{e^{-\rho t} - e^{-at}}{a-\rho}\right)^2 - \left[\frac{(p_0 - p_{on})(k_1-k_0)\,b}{\rho(1-\epsilon)}\left(\frac{e^{-\rho\epsilon t} - e^{-at}}{a-\rho\epsilon} - \frac{e^{-\rho t} - e^{-at}}{a-\rho}\right)\right]^2 $$
$$ +\ p_{on}(1-p_{on})(k_1-k_0)^2\,\frac{b^2}{\rho^2}\left(V_1 + V_2 e^{-(a+\rho\epsilon)t} + V_3 e^{-\rho(1+\epsilon)t} + V_4 e^{-at} + V_5 e^{-(a+\rho)t} + V_6 e^{-2\rho t}\right) $$
$$ +\ (1-p_0)(p_0 - p_{on})(k_1-k_0)^2\,\frac{b^2}{\rho^2}\left(V_7 e^{-\rho\epsilon t} + V_8 e^{-(a+\rho\epsilon)t} + V_9 e^{-\rho(\epsilon+1)t} + V_{10} e^{-(a+\rho)t} + V_{11} e^{-\rho t} + V_{12} e^{-2\rho t}\right), \quad (A.2) $$

where the coefficients $V_1, \ldots, V_{12}$ are rational functions of $a$, $\rho$ and $\epsilon$ only; their explicit expressions follow from the calculation detailed in [15].

Appendix B. Proof of the Theorem 1

For simplicity, we consider only the case $N = 1$; the proof generalizes immediately to $N > 1$ at the price of cumbersome notations. We also consider the case when the process $x$ always belongs to $[0, 1]$. Thus, we may assume that $V_s$, $s \in S$, and $H$ are bounded, as well as their derivatives.
We first consider the case of a Lipschitz function $\varphi$. We denote by $(x, s)$ the PDMP starting with the random initial data distributed along $\mu_0$, and by $(x(t,\zeta), s(t,\zeta))$ the PDMP starting from the deterministic initial data $\zeta = (x_0, s_0)$. In this proof, we use the letter $\kappa$ for a constant which depends on $V_s$, $s \in S$, $H$ and $\tau$.

The push-forward algorithm can be rewritten as follows: starting from $\tilde{s}(0) = s_0$, $\tilde{x}(0) = x_0$, we set $\tilde{x}(\tau_{j+1}) = \Phi_{\tilde{s}(\tau_j)}(\tau_{j+1} - \tau_j, \tilde{x}(\tau_j))$ and then choose $\tilde{s}(\tau_{j+1})$ randomly; it is equal to $r \in S$ with probability $\Pi_{\tilde{s}(\tau_j), r}(\tau_j, \tau_{j+1})$, the latter matrix being defined in (20).

In fact, we consider more generally the approximation of $E(\varphi(x(\tau), s(\tau)))$, for a function $\varphi : \mathbb{R} \times S \to \mathbb{R}$, by $E(\varphi(\tilde{x}(\tau), \tilde{s}(\tau)))$. The result is obtained when we choose $\varphi$ independent of $s \in S$. We first assume that $\varphi$ is $C^1$ with respect to $x$. Let us denote by $L$ its Lipschitz constant with respect to $x$: $|\varphi(x_1, s) - \varphi(x_2, s)| \leq L|x_1 - x_2|$ for $x_1, x_2 \in \mathbb{R}$, $s \in S$.

The main tool for the proof is the generator of the process:

$$ \mathcal{L}\varphi(x, s) = V_s(x)\,\partial_x\varphi + \sum_{r \in S} (\varphi(x, r) - \varphi(x, s))\,H_{r,s}(x), $$

and the forward Kolmogorov equation:

$$ \frac{\mathrm{d}u}{\mathrm{d}t}(t, x, s) = \mathcal{L}u(t, x, s), \quad x \in \mathbb{R},\ s \in S,\ t > 0. $$

When $u(0, x, s) = \varphi(x, s)$ for $x \in \mathbb{R}$, $s \in S$, it is well known that $u(t, x_0, s_0) = E(\varphi(x(t,\zeta), s(t,\zeta)))$ (see [6]). Therefore,

$$ E(\varphi(x(t), s(t))) = \int_{\mathbb{R}\times S} E(\varphi(x(t,\zeta), s(t,\zeta)))\,d\mu_0(\zeta) = \int_{\mathbb{R}\times S} u(t, x_0, s_0)\,d\mu_0(x_0, s_0). $$

It follows from Proposition A.1 in [5] that $u$ satisfies the formula:

$$ u(t, x, s) = \varphi(\Phi_s(t, x), s)\exp\left(-\int_0^t \lambda(\Phi_s(\sigma, x))\,d\sigma\right) + \sum_{r \in S}\int_0^t u(t - \sigma, \Phi_s(\sigma, x), r)\,H_{r,s}(\Phi_s(\sigma, x))\exp\left(-\int_0^\sigma \lambda(\Phi_s(\tau, x))\,d\tau\right)d\sigma, $$

where $\lambda$ denotes the jump intensity of the PDMP. Since $\varphi$ is Lipschitz with respect to $x$, so is $u$, with a Lipschitz constant less than $\kappa\|\varphi\|_{\infty,1}$.
Here, $\kappa$ depends on the characteristics of the PDMP and the time horizon $\tau$; $\|\varphi\|_{\infty,1}$ is the sum of the supremum of $\varphi$ and its Lipschitz constant. Moreover, it follows from the forward Kolmogorov equation that its time derivative is also bounded. Hence,

$$ |u(t_1, x_1, s) - u(t_2, x_2, s)| \leq \kappa\|\varphi\|_{\infty,1}\left(|x_1 - x_2| + |t_1 - t_2|\right). \quad (B.1) $$

Now we rewrite the error as follows:

$$ E(\varphi(x(\tau,\zeta), s(\tau,\zeta))) - E(\varphi(\tilde{x}(\tau,\zeta), \tilde{s}(\tau,\zeta))) = E\left(u(\tau, x_0, s_0) - u(0, \tilde{x}(\tau,\zeta), \tilde{s}(\tau,\zeta))\right) $$
$$ = \sum_{j=0}^{M-1} E\left(u(\tau - \tau_j, \tilde{x}(\tau_j,\zeta), \tilde{s}(\tau_j,\zeta)) - u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), \tilde{s}(\tau_{j+1},\zeta))\right). $$

On each interval $[\tau_j, \tau_{j+1}]$ we have, by the chain rule and the forward Kolmogorov equation, that

$$ \frac{\mathrm{d}}{\mathrm{d}t}\left[u(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta))\right] = -\frac{\mathrm{d}u}{\mathrm{d}t}(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta)) + \frac{\mathrm{d}\tilde{x}}{\mathrm{d}t}(t)\,\partial_x u(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta)) $$
$$ = -\frac{\mathrm{d}u}{\mathrm{d}t}(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta)) + V_{\tilde{s}(\tau_j,\zeta)}(\tilde{x}(t,\zeta))\,\partial_x u(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta)) $$
$$ = -\sum_{r \in S}\left(u(\tau - t, \tilde{x}(t,\zeta), r) - u(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta))\right)H_{r,\tilde{s}(\tau_j,\zeta)}(\tilde{x}(t,\zeta)). $$

Hence,

$$ E\left(u(\tau - \tau_j, \tilde{x}(\tau_j,\zeta), \tilde{s}(\tau_j,\zeta)) - u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), \tilde{s}(\tau_{j+1},\zeta))\right) $$
$$ = E\left(\int_{\tau_j}^{\tau_{j+1}} \sum_{r \in S}\left(u(\tau - t, \tilde{x}(t,\zeta), r) - u(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta))\right)H_{r,\tilde{s}(\tau_j,\zeta)}(\tilde{x}(t,\zeta))\,\mathrm{d}t\right) $$
$$ \quad + E\left(u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), \tilde{s}(\tau_j,\zeta)) - u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), \tilde{s}(\tau_{j+1},\zeta))\right) $$
$$ = E\left(\int_{\tau_j}^{\tau_{j+1}} \sum_{r \in S}\left(u(\tau - t, \tilde{x}(t,\zeta), r) - u(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta))\right)H_{r,\tilde{s}(\tau_j,\zeta)}(\tilde{x}(t,\zeta))\,\mathrm{d}t\right) $$
$$ \quad + E\left(\sum_{r \in S}\left(u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), \tilde{s}(\tau_j,\zeta)) - u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), r)\right)\Pi_{r,\tilde{s}(\tau_j,\zeta)}(\tau_j, \tau_{j+1})\right). $$

By eq.
(20), we have that

$$ \Pi(\tau_j, \tau_{j+1}) = I + \int_{\tau_j}^{\tau_{j+1}} H(\tilde{x}(t,\zeta))\,\Pi(\tau_j, t)\,\mathrm{d}t, $$

where $I$ is the identity matrix. Since $H$ is bounded, it follows from Gronwall's lemma that $\Pi$ is bounded and

$$ \|\Pi(\tau_j, t) - I\| \leq \kappa|t - \tau_j| $$

for some constant $\kappa$, where $\|\cdot\|$ is a norm on the space of matrices. For instance we may take the supremum of the modulus of the coefficients of a matrix. Hence,

$$ \left\|\Pi(\tau_j, \tau_{j+1}) - I - \int_{\tau_j}^{\tau_{j+1}} H(\tilde{x}(t,\zeta))\,\mathrm{d}t\right\| = \left\|\int_{\tau_j}^{\tau_{j+1}} H(\tilde{x}(t,\zeta))\left(\Pi(\tau_j, t) - I\right)\mathrm{d}t\right\| \leq \kappa|\tau_{j+1} - \tau_j|^2, $$

for another constant $\kappa$. In particular, for $r \neq \tilde{s}(\tau_j)$:

$$ \left|\Pi_{r,\tilde{s}(\tau_j)}(\tau_j, \tau_{j+1}) - \int_{\tau_j}^{\tau_{j+1}} H_{r,\tilde{s}(\tau_j,\zeta)}(\tilde{x}(t,\zeta))\,\mathrm{d}t\right| \leq \kappa|\tau_{j+1} - \tau_j|^2. $$

Moreover, $|\tilde{x}(\tau_{j+1},\zeta) - \tilde{x}(t,\zeta)| \leq \kappa|\tau_{j+1} - t|$, since its derivative is $V_{\tilde{s}_j}(\tilde{x}(t))$, which is bounded. We have shown in (B.1) that $u$ is Lipschitz with respect to $x$ and $t$, thus

$$ |u(\tau - t, \tilde{x}(t,\zeta), r) - u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), r)| \leq \kappa(\|\varphi\|_{\infty,1} + 1)|\tau_{j+1} - t|. $$
(B.2)

Now we have the following estimate:

$$ \left|E\left(u(\tau - \tau_j, \tilde{x}(\tau_j,\zeta), \tilde{s}(\tau_j,\zeta)) - u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), \tilde{s}(\tau_{j+1},\zeta))\right)\right| $$
$$ = \Bigg| E\left(\int_{\tau_j}^{\tau_{j+1}} \sum_{r \in S}\left(u(\tau - t, \tilde{x}(t,\zeta), r) - u(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta))\right)H_{r,\tilde{s}(\tau_j,\zeta)}(\tilde{x}(t,\zeta))\,\mathrm{d}t\right) $$
$$ \quad + E\left(\sum_{r \in S}\left(u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), \tilde{s}(\tau_j,\zeta)) - u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), r)\right)\Pi_{r,\tilde{s}(\tau_j,\zeta)}(\tau_j, \tau_{j+1})\right)\Bigg| $$
$$ \leq \Bigg| E\Bigg(\int_{\tau_j}^{\tau_{j+1}} \sum_{r \in S}\Big[\left(u(\tau - t, \tilde{x}(t,\zeta), r) - u(\tau - t, \tilde{x}(t,\zeta), \tilde{s}(\tau_j,\zeta))\right) $$
$$ \quad + \left(u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), \tilde{s}(\tau_j,\zeta)) - u(\tau - \tau_{j+1}, \tilde{x}(\tau_{j+1},\zeta), r)\right)\Big] H_{r,\tilde{s}(\tau_j,\zeta)}(\tilde{x}(t,\zeta))\,\mathrm{d}t\Bigg)\Bigg| + \kappa|\tau_{j+1} - \tau_j|^2 $$
$$ \leq \kappa(\|\varphi\|_{\infty,1} + 1)|\tau_{j+1} - \tau_j|^2, $$

and so

$$ \left|E(\varphi(x(\tau,\zeta), s(\tau,\zeta))) - E(\varphi(\tilde{x}(\tau,\zeta), \tilde{s}(\tau,\zeta)))\right| \leq \sum_{j=0}^{M-1} \kappa(\|\varphi\|_{\infty,1} + 1)|\tau_{j+1} - \tau_j|^2 \leq C\kappa\tau(\|\varphi\|_{\infty,1} + 1)/M. $$

The result follows by choosing $\varphi$ depending only on $x$ and integrating with respect to $\mu_0$:

$$ \left|\int_{[0,1]} \varphi(x)\,\mu_t^M(\mathrm{d}x) - \int_{[0,1]} \varphi(x)\,\mu_t(\mathrm{d}x)\right| = \left|\int_{[0,1]\times S} \left(E(\varphi(x(\tau,\zeta))) - E(\varphi(\tilde{x}(\tau,\zeta)))\right)\mathrm{d}\mu_0(\zeta)\right| \leq C\kappa\tau(\|\varphi\|_{\infty,1} + 1)/M. $$

When $\varphi$ is only Lipschitz, we obtain the same result by choosing a sequence of $C^1$ functions which converges uniformly to $\varphi$ and whose Lipschitz constants do not exceed $L$. This finishes the proof of the first statement. It is classical that this implies that the left-hand side converges to zero when $\max_{j=0,\ldots,M-1}|\tau_{j+1} - \tau_j| \to 0$ for any uniformly continuous and bounded function $\varphi$.
We conclude thanks to the Portmanteau theorem [2].

Remark 4.
When $\varphi$ is only a bounded Borel function but $\mu_0$ has a $C^1$ and Lipschitz density $f$, the proof can be adapted. We observe that the only point where the Lipschitz property of $\varphi$ is used is to obtain (B.1), which is used to derive (B.2). To replace (B.2), we can instead write the explicit formula for $E(\varphi(x(t), s(t)))$ and $E(\varphi(\tilde{x}(t), \tilde{s}(t)))$. Since the flows $\Phi_s$, $s \in S$, are $C^1$ diffeomorphisms that depend smoothly on time, we see that (B.2) still holds, with the constant $\kappa(\|\varphi\|_{\infty,1} + 1)$ replaced by a constant which depends on $\|f\|_{\infty,1}$ and the supremum of $\varphi$.

References

[1] A. Abate, L. Cardelli, M. Kwiatkowska, L. Laurenti, and B. Yordanov. Experimental biological protocols with formal semantics.
In International Conference on Computational Methods in Systems Biology, pp. 165–182. Springer, 2018.
[2] P. Billingsley.
Convergence of Probability Measures, Second Edition. Wiley Series in Probability and Statistics, 1999.
[3] L. Cai, N. Friedman, and X.S. Xie. Stochastic protein expression in individual cells at the single molecule level.
Nature, 440(7082):358, 2006.
[4] A. Crudu, A. Debussche, and O. Radulescu. Hybrid stochastic simplifications for multiscale gene networks.
BMC Systems Biology, 3(1):89, 2009.
[5] A. Crudu, A. Debussche, A. Muller, and O. Radulescu. Convergence of stochastic gene networks to hybrid piecewise deterministic processes.
Annals of Applied Probability, 22:1822–1859, 2012.
[6] M. Davis.
Markov Models and Optimization. Chapman and Hall, 1993.
[7] A. Eldar and M.B. Elowitz. Functional roles for noise in genetic circuits. Nature, 467(7312):167, 2010.
[8] M.B. Elowitz, A.J. Levine, E.D. Siggia, and P.S. Swain. Stochastic gene expression in a single cell. Science, 297(5584):1183–1186, 2002.
[9] M.L. Ferguson, D. Le Coq, M. Jules, S. Aymerich, O. Radulescu, N. Declerck, and C.A. Royer. Reconciling molecular regulatory mechanisms with noise patterns of bacterial metabolic promoters in induced and repressed states. Proceedings of the National Academy of Sciences USA, 109:155, 2012.
[10] P.B. Gupta, C.M. Fillmore, G. Jiang, S.D. Shapira, K. Tao, C. Kuperwasser, and E.S. Lander. Stochastic state transitions give rise to phenotypic equilibrium in populations of cancer cells. Cell, 146(4):633–644, 2011.
[11] U. Herbach, A. Bonnaffoux, T. Espinasse, and O. Gandrillon. Inferring gene regulatory networks from single-cell data: a mechanistic approach. BMC Systems Biology, 11(1):105, 2017.
[12] G.C.P. Innocentini, M. Forger, O. Radulescu, and F. Antoneli. Protein synthesis driven by dynamical stochastic transcription. Bulletin of Mathematical Biology, 78(1):110–131, 2016.
[13] G.C.P. Innocentini, A. Hodgkinson, and O. Radulescu. Time dependent stochastic mRNA and protein synthesis in piecewise-deterministic models of gene networks. Frontiers in Physics, 6:46, 2018.
[14] G.C.P. Innocentini, M. Forger, A.F. Ramos, O. Radulescu, and J.E.M. Hornos. Multimodality and flexibility of stochastic gene expression. Bulletin of Mathematical Biology, 75(12):2600–2630, 2013.
[15] G.C.P. Innocentini, A. Hodgkinson, F. Antoneli, and O. Radulescu. Effective computational methods for hybrid stochastic gene networks. Computational Methods in Systems Biology: 17th International Conference, CMSB 2019, Trieste, Italy, September 18–20, 2019, Proceedings, LNBI 11773:60, Springer Nature, 2019.
[16] P. Kurasov, A. Lück, D. Mugnolo, and V. Wolf. Stochastic hybrid models of gene regulatory networks – a PDE approach. Mathematical Biosciences, 305:170–177, 2018.
[17] Y. Ting Lin and N.E. Buchler. Efficient analysis of stochastic gene dynamics in the non-adiabatic regime using piecewise deterministic Markov processes. Journal of The Royal Society Interface, 15(138):20170804, 2018.
[18] R.W. MacCormack. The Effect of Viscosity in Hypervelocity Impact Cratering.
Frontiers of Computational Fluid Dynamics
[19] A. Raj, C.S. Peskin, D. Tranchina, D.Y. Vargas, and S. Tyagi. Stochastic mRNA synthesis in mammalian cells. PLoS Biology, 4(10):e309, 2006.
[20] B.S. Razooky, A. Pai, K. Aull, I.M. Rouzine, and L.S. Weinberger. A hardwired HIV latency program.
Cell, 160(5):990–1001, 2015.
[21] M.G. Riedler. Almost sure convergence of numerical approximations for piecewise deterministic Markov processes. Journal of Computational and Applied Mathematics, 239:50–71, 2013.
[22] A. Singh and J.P. Hespanha. Stochastic hybrid systems for studying biochemical processes. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 368:4995–5011, 2010.
[23] D. Talay. Discrétisation d'une équation différentielle stochastique et calcul approché d'espérances de fonctionnelles de la solution. RAIRO Modél. Math. Anal. Numér., 20:141–179, 1986.
[24] K. Tantale, F. Mueller, A. Kozulic-Pirher, A. Lesne, J.-M. Victor, M.-C. Robert, S. Capozi, R. Chouaib, V. Bäcker, J. Mateos-Langerak, et al. A single-molecule view of transcription reveals convoys of RNA polymerases and multi-scale bursting. Nature Communications, 7:12248, 2016.
[25] M. Thattai and A. Oudenaarden. Stochastic gene expression in fluctuating environments. Genetics, 167(1):523–530, 2004.
[26] P. Thomas, N. Popović, and R. Grima. Phenotypic switching in gene regulatory networks. Proceedings of the National Academy of Sciences USA, 111(19):6994–6999, 2014.
[27] S. Zeiser, U. Franz, O. Wittich, and V. Liebscher. Simulation of genetic networks modelled by piecewise deterministic Markov processes.