Investigating the two-moment characterisation of subcellular biochemical networks
Mukhtar Ullah^a, Olaf Wolkenhauer^{a,b,∗}

^a Systems Biology and Bioinformatics Group, Dept. of Computer Science, University of Rostock, Albert Einstein Str. 21, 18051 Rostock, Germany
^b Stellenbosch Institute for Advanced Study (STIAS), 10 Marais Street, Stellenbosch 7600, South Africa

Abstract

While ordinary differential equations (ODEs) form the conceptual framework for modelling many cellular processes, specific situations demand stochastic models to capture the influence of noise. The most common formulation of stochastic models for biochemical networks is the chemical master equation (CME). While stochastic simulations are a practical way to realise the CME, analytical approximations offer more insight into the influence of noise. Towards that end, the two-moment approximation (2MA) is a promising addition to the established analytical approaches including the chemical Langevin equation (CLE) and the related linear noise approximation (LNA). The 2MA approach directly tracks the mean and (co)variance, which are coupled in general. This coupling is not obvious in the CME and CLE, and is ignored by the LNA and conventional ODE models. We extend previous derivations of the 2MA by allowing a) non-elementary reactions and b) relative concentrations. Often, several elementary reactions are approximated by a single step. Furthermore, practical situations often require the use of relative concentrations. We investigate the applicability of the 2MA approach to the well-established fission yeast cell cycle model. Our analytical model reproduces the clustering of cycle times observed in experiments. This is explained through multiple resettings of MPF, caused by the coupling between mean and (co)variance, near the G2/M transition.

Key words: noise, two-moment approximation, mean, (co)variance, cell cycle
1 Introduction

At a coarse level, cellular functions are largely determined by spatio-temporal changes in the abundance of molecular components. At a finer level, cellular events are triggered by discrete and random encounters of molecules [1]. This suggests a deterministic modelling approach at the coarse level (cell function) and a stochastic one at the finer level (gene regulation) [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. However, stochastic modelling is necessary when noise propagation from processes at the fine level changes cellular behaviour at the coarse level. Stochasticity is not limited to low copy numbers. The binding and dissociation events during transcription initiation are the result of random encounters between molecules [4]. If molecules are present in large numbers and the molecular events occur frequently, the randomness would cancel out (both within a single cell and from cell to cell) and the average cellular behaviour could be described by a deterministic model. However, many subcellular processes, including gene expression, are characterised by infrequent (rare) molecular events involving small copy numbers of molecules [1, 4]. Most proteins in metabolic pathways and signalling networks, realising cell functions, are present in the range of 10–1000 copies per cell [12, 13, 14]. For such moderate/large copy numbers, noise can be significant when the system dynamics are driven towards critical points in cellular systems which operate far from equilibrium [15, 16, 17]. The significance of noise in such systems has been demonstrated for microtubule formation [18], ultrasensitive modification and demodification reactions [12], plasmid copy number control [19], a limit cycle attractor [20], noise-induced oscillations near a macroscopic Hopf bifurcation [21], and intracellular metabolite concentrations [22].

Noise has a role at all levels of cell function. Noise, when undesired, may be suppressed by the network (e.g.
through negative feedback) for robust behaviour [2, 23, 24, 25, 26, 27]. However, not all noise can be rejected, and some noise may even be amplified from process to process, ultimately influencing the phenotypic behaviour of the cell [6, 11, 28, 29, 30]. Noise may even be exploited by the network to generate desired variability (phenotypic and cell-type diversification) [2, 31, 32, 33, 34]. Noise from gene expression can induce new dynamics, including amplification (stochastic focusing) [6, 35, 36], bistability (switching between states) and oscillations [37, 38, 39, 40], that are both quantitatively and qualitatively different from what is predicted or possible deterministically.

The most common formulation of stochastic models for biochemical networks is the chemical master equation (CME). While stochastic simulations [41] are a practical way to realise the CME, analytical approximations offer more insight into the influence of noise on cell function. Formally, the CME describes a continuous-time discrete-state Markov process [42, 43, 44]. For gaining intuitive insight and a quick characterisation of fluctuations in biochemical networks, the CME is usually approximated analytically in different ways [44, 45], including the frequently used chemical Langevin approach [46, 47, 48, 49], the linear noise approximation (LNA) [15, 50, 51, 52] and the two-moment approximation (2MA) [53, 54, 55].

Of the analytical approaches mentioned above, we here focus on the 2MA approach because of its representation of the coupling between the mean and (co)variance. The traditional Langevin approach is based on the assumption that the time-rate of abundance (copy number or concentration), or the flux of a component, can be decomposed into a deterministic flux and a Langevin noise term, which is a Gaussian (white noise) process with zero mean and an amplitude determined by the dynamics of the system.
This separation of noise from the system dynamics may be a reasonable assumption for external noise that arises from the interaction of the system with other systems (such as the environment), but cannot be assumed for internal noise that arises from within the system [4, 5, 11, 14, 56, 57]. As categorically discussed in [47], internal noise is not something that can be isolated from the system, because it results from the discrete nature of the underlying molecular events. Any noise term in the model must be derived from the system dynamics and cannot be presupposed in an ad hoc manner. The chemical Langevin equation (CLE), however, does not suffer from this criticism because Gillespie [46] derived it from the CME description. The CLE allows much faster simulations compared to the exact stochastic simulation algorithm (SSA) [43] and its variants. The CLE is a stochastic differential equation (dealing directly with random variables rather than moments) and has no direct way of representing the mean and (co)variance and the coupling between the two. That does not imply that the CLE ignores the coupling, unlike the LNA, which has the same mean as the solution of the deterministic model.

The merits of the 2MA compared to alternative approximations have been discussed in [53, 54, 58]. In [55], the 2MA is developed as an approximation of the master equation for a generic Markov process. In [54], the 2MA framework is developed under the name “mass fluctuation kinetics” for biochemical networks composed of elementary reactions. The authors demonstrate that the 2MA can reveal new behaviour like stochastic focusing and bistability. Another instance of the 2MA is proposed in [45, 53] under the names “mean-field approximation” and “statistical chemical kinetics”. Again, the authors assume elementary reactions, so that the propensity function is at most quadratic in concentrations.
The authors evaluate the accuracy of the 2MA against the alternatives (such as the LNA) for a few toy models. The derivation of the 2MA for more general systems with non-elementary reactions is one motivation for the present paper.

The 2MA approaches referred to above assume absolute concentrations (copy number divided by some fixed system size parameter). In systems biology, however, models often use relative concentrations that have arbitrary units [59, 60, 61, 62]. In general, the concentration of each component in the system may have been obtained by a different scaling parameter, rather than using a global system size. For such models, the above-mentioned approaches need modification. This was another motivation for our derivation in this paper.

In the present paper we develop a compact derivation of the first two moments, the mean and (co)variance, of the continuous-time discrete-state Markov process that models a biochemical reaction system by the CME. This derivation is an extension of previous derivations, taking into account arbitrary concentrations and non-elementary reactions. The matrix form of our derivation allows for an easy interpretation. Using these analytical results, we develop our 2MA model of the fission yeast cell cycle, which has two sets of ODEs: one set for the mean protein concentrations and the other set for concentration (co)variances. Numerical simulations of our model show a considerably different behaviour. Especially, for the wee1⁻ cdc25∆ mutant (hereafter referred to simply as the double mutant), the timings of S-phase and M-phase are visibly different from those obtained for a deterministic model, because of the oscillatory behaviour of the key regulator. Since the 2MA is only an approximation, we investigate its validity by comparing the statistics computed from the 2MA model with experimental data.

The rest of this paper is organised as follows.
In the first section we introduce the basic terminology and notation. Then the system of ODEs forming the 2MA approach is presented. Next, we introduce an application to the fission yeast cell cycle model [59]. We present a 2MA model of the cell cycle, followed by a comparison to the experimental data and conclusions. The appendices contain full derivations of the 2MA model, further proofs and additional tables.

2 Stochastic modelling of biochemical systems

Imagine a well-mixed homogeneous cellular compartment of a fixed volume V at thermal equilibrium that contains molecules of s different kinds (each kind referred to as a chemical component or species) interacting in r distinct ways (each way referred to as a reaction channel or step). Since these biochemical reactions occur by random encounters of reactant molecules, the copy number of a particular component present in the system at time t fluctuates. The state of the cellular system is described by the s × 1 random vector N(t) whose i-th element is the copy number N_i(t) of the i-th species present in the system at time t. Each (time-varying) element N_i(t) is a stochastic process, where N_i(t) = n_i means that n_i molecules of the i-th species are present in the system at time t. The s × 1 vector n, with elements n_i, is thus a sample (or a value) of the stochastic process N(t). The stochastic process is characterised by the (time-dependent) probability distribution P(n, t), that is, the probability of N(t) = n given a fixed initial condition N(0) = n_0. The probability distribution itself is characterised by its moments.

We can describe the system state at time t by the s × 1 vector X(t) whose i-th element is the concentration X_i(t) of the i-th component. The concentration X_i(t) is, in general, the copy number N_i(t) divided by some fixed scaling parameter Ω_i specific to that component. In other words,

N_i(t) = Ω_i X_i(t),   n_i = Ω_i x_i.
Each concentration X_i(t) is a stochastic process, where X_i(t) = x_i means that the concentration of the i-th component at time t is x_i. The s × 1 vector x, with elements x_i, is thus a sample of the stochastic process X(t). The copy number and concentration (vectors) are related by

N(t) = Ω X(t),   n = Ω x,

where Ω is the diagonal matrix with Ω_i as its i-th diagonal element. Commonly, all components are scaled by a single parameter, in which case Ω is a scalar known as the system size. A common choice for the system size is some multiple of the volume V of the system. For molar concentrations, the system size chosen is Ω = N_A V, where N_A is Avogadro's constant. In systems biology, one often uses relative concentrations x_i, where Ω_i is some fixed copy number specific to component i. The simplest case of relative concentrations uses a single (maximum) copy number n_max for all components. Note that our approach is developed for the general case, which allows for relative concentrations, instead of assuming one global system size Ω as done in [16, 51, 53, 54, 63].

If we assume that the molecules are well mixed and are available everywhere for a reaction (so that space can be ignored), then the probability of a reaction in a short time interval depends almost entirely on the most recent copy numbers (and not on earlier values). In other words, the stochastic process N(t) of copy numbers is Markovian in continuous time. Since changes in the copy numbers require the occurrence of reactions, which are discrete-event phenomena, N(t) is referred to as a jump process.
The Markov property implies that each reaction channel j can be characterised by a reaction propensity a_j(n), defined such that, in state n, the probability of one occurrence of reaction channel j in a vanishingly short time interval of length dt is a_j(n) dt.

The transition from state n to the state determined by the j-th reaction will be represented by the scheme

n  --a_j(n)-->  n + S_{·j},

where S_{·j} is the j-th column of the stoichiometry matrix S, whose element S_ij denotes the change in copy number of the i-th component resulting from one occurrence of the j-th channel. Similarly, the transition towards state n from the state determined by the j-th reaction can be represented by

n − S_{·j}  --a_j(n − S_{·j})-->  n,

where the argument of the propensity function a_j is n − S_{·j}, the assumed current state. Transitions away from state n decrease the probability P(n, t), while those towards state n increase it. Since this is equally true for each reaction channel, during a short time interval of length ∆t the change in the probability is given by

P(n, t + ∆t) − P(n, t) = Σ_{j=1}^{r} P(n − S_{·j}, t) a_j(n − S_{·j}) ∆t − Σ_{j=1}^{r} P(n, t) a_j(n) ∆t + o(∆t),

where o(∆t) represents terms that vanish faster than ∆t as the latter approaches zero. As ∆t approaches zero in the above system of equations, we are led to what is known as the chemical master equation (CME):

d/dt P(n, t) = Σ_{j=1}^{r} [ a_j(n − S_{·j}) P(n − S_{·j}, t) − a_j(n) P(n, t) ].   (1)

We will switch between the two alternative notations (d/dt)φ(t) and dφ/dt for any scalar quantity φ(t), preferring the latter when the dependence on the time variable is clear from the context.

Since there is one equation for each state n, and there is potentially a very large number of possible states, it is impractical to solve the CME.
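Although the CME itself is rarely solvable, sample paths of the jump process N(t) it describes can be generated exactly. As an illustration only (a hypothetical birth–death process with made-up rate constants k1 and k2, not a system from this paper), Gillespie's direct method can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical birth-death process:
#   channel 1: 0 -> X  with propensity a1(n) = k1
#   channel 2: X -> 0  with propensity a2(n) = k2 * n
k1, k2 = 10.0, 0.1
S = np.array([[1, -1]])          # stoichiometry matrix (1 species x 2 channels)

def propensities(n):
    return np.array([k1, k2 * n[0]])

def ssa(n0, t_end):
    """Gillespie's direct method: one sample path of the jump process N(t)."""
    t = 0.0
    n = np.array(n0, dtype=float)
    while True:
        a = propensities(n)
        a0 = a.sum()
        if a0 == 0.0:
            break
        t += rng.exponential(1.0 / a0)         # waiting time to the next reaction
        if t > t_end:
            break
        j = rng.choice(S.shape[1], p=a / a0)   # index of the firing channel
        n += S[:, j]                           # apply the state change n -> n + S_.j
    return n

# one realisation; the stationary mean copy number is k1/k2 = 100
n_final = ssa([0], t_end=200.0)
```

Each iteration draws an exponential waiting time with rate a_0(n) = Σ_j a_j(n) and then picks channel j with probability a_j(n)/a_0(n), which is exactly the transition structure encoded by (1).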
In most cases, we are interested in the first two moments: the component-wise copy number means

E[N_i(t)] = Σ_n n_i P(n, t),

and the covariances

Cov(N_i(t), N_k(t)) = E[ (N_i(t) − E[N_i(t)]) (N_k(t) − E[N_k(t)]) ]

between copy numbers of component pairs. These covariances form the covariance matrix, in which the diagonal elements are the component-wise variances. In the present paper, we are interested in the mean concentration vector µ(t), with elements

µ_i(t) = E[X_i(t)] = E[N_i(t)] / Ω_i,

and the concentration covariance matrix σ(t), with elements

σ_ik(t) = Cov(X_i(t), X_k(t)) = Cov(N_i(t), N_k(t)) / (Ω_i Ω_k).

Hereafter, we leave out the dependence on time to simplify the notation, but include it occasionally where confusion could otherwise arise.

2.1 Continuous approximations of the jump process N(t)

While the stochastic simulation algorithm and its extensions provide a way to generate sample paths of copy numbers for a biochemical system, the need to repeat many simulation runs to get an idea of the probability distribution in terms of its moments (mean and (co)variance) becomes increasingly time-consuming, and even impractical for larger systems. Therefore, attempts have been made towards approximations of the CME, the most notable being the chemical Langevin equation (CLE) by Gillespie [46]. He obtained that continuous approximation for the incremental change in copy number during a short interval [t, t + dt] whose length dt satisfies two conditions: (i) it is small enough that the propensity does not change “appreciably” during the interval, and (ii) it is large enough that the expected number of occurrences E[Z_j(t + dt) − Z_j(t)] of each reaction channel j during the interval is much larger than unity.
That continuous approximation takes the form of the CLE

N^c_i(t + dt) − N^c_i(t) = Σ_{j=1}^{r} S_ij a_j(N^c(t)) dt + Σ_{j=1}^{r} S_ij √(a_j(N^c(t)) dt) 𝒩_j(t).   (2)

Here N^c(t) denotes the continuous Markov process approximating the jump process N(t), and the set {𝒩_j(t)} are statistically independent Gaussian random variables, each with zero mean and unit variance. The probability density function P^c(n, t) of the continuous Markov process N^c(t) obeys the (forward) Fokker–Planck equation (FPE) [46, 64]

∂/∂t P^c(n, t) = Σ_{j=1}^{r} ( −Σ_{i=1}^{s} S_ij ∂/∂n_i + (1/2) Σ_{i,k=1}^{s} S_ij S_kj ∂²/∂n_i ∂n_k ) [ a_j(n) P^c(n, t) ].   (3)

In effect, condition (i) allows a Poissonian approximation of Z_j(t + dt) − Z_j(t), and condition (ii) allows a normal approximation of that Poissonian. The two conditions seem conflicting and require the existence of a domain of macroscopically infinitesimal time intervals. Although the existence of such a domain cannot be guaranteed, Gillespie argues that it can be found for most practical cases, admitting that “it may not be easy to continually monitor the system to ensure that conditions (i) and (ii) [..] are satisfied”, and that this “will not be the first time that Nature has proved to be unaccommodating to our purposes.” [46].

Generating sample paths of (2) is orders of magnitude faster than doing the same for the CME, because it essentially requires only the generation of normal random numbers. See [65] for numerical simulation methods for stochastic differential equations such as (2). However, solving the nonlinear FPE (3) for the probability density is as difficult as solving the CME. Therefore, on the analytical side, the CLE and the associated nonlinear FPE do not provide any significant advantage.
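To make the speed advantage of (2) concrete, a minimal Euler–Maruyama sketch for the same hypothetical birth–death process used above (rate constants k1, k2 are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Euler-Maruyama discretisation of the CLE (2) for a hypothetical
# birth-death process: 0 -> X (a1 = k1), X -> 0 (a2 = k2*n); S = [1, -1].
k1, k2 = 10.0, 0.1
S = np.array([1.0, -1.0])

def cle_path(n0, t_end, dt=0.01):
    """One sample path of the continuous approximation N^c(t)."""
    n = float(n0)
    for _ in range(int(t_end / dt)):
        a = np.array([k1, k2 * max(n, 0.0)])   # propensities (kept non-negative)
        dW = rng.normal(size=2)                # independent unit normals, one per channel
        n += S @ (a * dt) + S @ (np.sqrt(a * dt) * dW)
        n = max(n, 0.0)                        # crude reflection at zero
    return n
```

Each time step costs two normal draws regardless of how many reaction events it lumps together, which is why CLE paths are much cheaper than exact SSA paths when propensities are large. The clipping at zero is a pragmatic device of this sketch, not part of (2).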
That leads to a further simplification, referred to as the linear noise approximation (LNA) [44, 45]. The LNA is a linear approximation of the nonlinear FPE (3), obtained by linearising the propensity function around the mean. The solution of the LNA is a Gaussian distribution with a mean that is equal to the solution of the deterministic ODE model and a covariance matrix that obeys a linear ODE. This is the main drawback of the LNA because, for systems containing at least one bimolecular reaction, the mean of the stochastic model is not equal to the solution of the deterministic ODEs, as shown next.

2.2 Mean of the stochastic model

The mean copy number of the i-th component obeys the ODE

d/dt E[N_i(t)] = Σ_{j=1}^{r} S_ij E[ a_j(N(t)) ],   (4)

which is derived in Appendix A1. In general, the expectation on the right of (4) involves the unknown probability distribution P(n, t). In other words, the mean copy number depends not just on the mean itself, but also on higher-order moments, and therefore (4) is, in general, not closed in the mean, unless the reaction propensity is a linear function of N, which is the case only for zero- and first-order reactions. Take the example of a first-order reaction X --k--> Y, with n denoting the copy number of its reactant and k denoting the reaction coefficient. The reaction propensity a(n) = kn (mass action kinetics) is linear in n. From probability theory, the expectation becomes E(kN) = k E(N), and thus we do not need to know the probability distribution to solve the ODE for the mean.
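For a nonlinear propensity, by contrast, E[a(N)] and a(E[N]) differ. A small numerical illustration (a hypothetical dimerisation propensity a(n) = k n(n−1)/2 evaluated over Poisson-distributed copy numbers; the distribution and the rate k are assumptions of this sketch, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical second-order (dimerisation) propensity, quadratic in n.
k = 1.0

def a(n):
    return k * n * (n - 1) / 2

# Sample copy numbers N from a Poisson distribution with mean 10.
samples = rng.poisson(lam=10, size=200_000).astype(float)

mean_of_a = a(samples).mean()   # E[a(N)] ~ (k/2)(E[N^2] - E[N]) = 50 here
a_of_mean = a(samples.mean())   # a(E[N]) ~ (k/2)(10*9) = 45 here
```

The gap between the two, (k/2)·Var(N), is exactly the kind of (co)variance contribution that the 2MA retains and the deterministic model discards.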
Only if all reactions are elementary and of zeroth or first order do we have exact equations for the evolution of the mean:

d/dt E[N_i(t)] = Σ_{j=1}^{r} S_ij a_j( E[N(t)] ),

which corresponds to the ODE system for the deterministic model, treating the copy numbers n(t) as continuous time-varying quantities that can be uniquely predicted for a given initial condition. For systems containing second- (and higher-) order reactions, a(n) is a nonlinear function and the evolution of the mean cannot be determined by the mean alone. Instead, the mean depends on higher-order moments, and hence the deterministic ODE model and the LNA cannot be used to describe the mean in (4).

2.3 The 2MA approach

The present section provides only a brief outline of the 2MA approach; we refer to Appendix A1 for a detailed derivation.

An exact and closed representation of the mean is not possible in general, as is evident from (4). The same is true for the (co)variance and higher-order moments. One way around this problem is to repeat many stochastic simulation runs based on the CME or the CLE and to compute the desired moments from the ensemble of runs. An alternative is to find approximations to the exact moment ODEs such as (4). The 2MA is one such attempt, which assumes closure at the first two moments: the mean and (co)variance. A scheme of chemical reactions or a system of deterministic ODEs is the starting point. From this are obtained the reaction propensities a_j(n), which appear as coefficients in the CME describing the time derivative of the probability distribution P(n, t). By taking the first two moments of the CME and subsequent simplifications, followed by appropriate scaling, two sets of ODEs, for the mean concentration vector µ(t) and the covariance matrix σ(t), are derived. This is followed by Taylor expansions of any nonlinear functions involving the propensity vector a(n).
Ignoring central moments of 3rd and higher order eventually leads to the 2MA system:

dµ/dt = f(µ) + ε_f(µ, σ),   (5)

dσ/dt = A(µ) σ + σ A(µ)^T + Ω^{−1/2} [ B(µ) + ε_B(µ, σ) ] (Ω^{−1/2})^T,   (6)

where the superscript T denotes the transpose of a matrix and

f_i(x) = (1/Ω_i) Σ_{j=1}^{r} S_ij a_j(Ω x),

ε_{f_i}(µ, σ) = (1/2) Σ_{k,l} [ ∂²f_i / ∂x_k ∂x_l ]_{x=µ} σ_kl,

A_ik(x) = ∂f_i(x) / ∂x_k,

B_ik(x) = (1/√(Ω_i Ω_k)) Σ_{j=1}^{r} S_ij S_kj a_j(Ω x),

ε_{B_ik}(µ, σ) = (1/2) Σ_{i′,k′} [ ∂²B_ik / ∂x_{i′} ∂x_{k′} ]_{x=µ} σ_{i′k′}.   (7)

The derivation of these equations is given in Appendix A1. The effective flux on the right of (5) is the sum of a deterministic flux f(µ) and a stochastic flux ε_f(µ, σ), the latter determined by the dynamics of both the mean and the (co)variance. This influence of the (co)variance implies that knowledge of fluctuations is important for a correct description of the mean. It also indicates an advantage of the stochastic framework over its deterministic counterpart: starting from the same assumptions and approximations, the stochastic framework allows us to describe the influence of fluctuations on the mean. This can be posed as the central phenomenological argument for stochastic modelling. Note that (5) is exact for systems where no reaction has an order higher than two, because then 3rd and higher derivatives of the propensity are zero. In (6), the drift matrix A(µ) reflects the noise dynamics for relaxation to the steady state, and the (Taylor approximation to 2nd order of the) diffusion matrix B(µ) reflects the randomness (fluctuation) of the individual events. The scaling by Ω confirms the inverse relationship between the noise, as measured by the (co)variance, and the system size.
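As a concrete illustration of (5)–(6), consider a hypothetical one-species network (not from the paper) with a birth channel 0 → X (a₁ = k1, S₁ = +1) and a dimerisation decay 2X → 0 (a₂ = k2·x(x−1)/2, S₂ = −2), with Ω = 1 so that σ is a scalar variance; the rate constants are illustrative assumptions. A minimal Euler-integration sketch:

```python
import numpy as np

# Hypothetical one-species 2MA example (Omega = 1):
#   channel 1: 0 -> X,   a1 = k1,             S1 = +1
#   channel 2: 2X -> 0,  a2 = k2*x*(x-1)/2,   S2 = -2
k1, k2 = 100.0, 0.1

def f(x):                 # deterministic flux  f = S1*a1 + S2*a2
    return k1 - 2.0 * (k2 * x * (x - 1.0) / 2.0)

def A(x):                 # drift A = df/dx
    return -2.0 * k2 * x + k2

def B(x):                 # diffusion B = S1^2*a1 + S2^2*a2
    return k1 + 4.0 * (k2 * x * (x - 1.0) / 2.0)

def integrate_2ma(mu0=0.0, sig0=0.0, t_end=50.0, dt=1e-3):
    """Euler integration of dmu/dt = f + eps_f, dsig/dt = 2*A*sig + B + eps_B,
    with eps_f = (1/2) f'' sig and eps_B = (1/2) B'' sig
    (here f'' = -2*k2 and B'' = 4*k2 are constants)."""
    mu, sig = mu0, sig0
    for _ in range(int(t_end / dt)):
        eps_f = 0.5 * (-2.0 * k2) * sig
        eps_B = 0.5 * (4.0 * k2) * sig
        dmu = f(mu) + eps_f
        dsig = 2.0 * A(mu) * sig + B(mu) + eps_B
        mu, sig = mu + dt * dmu, sig + dt * dsig
    return mu, sig

mu_ss, sig_ss = integrate_2ma()
```

In this example the stationary 2MA mean (about 31.7) lies slightly below the deterministic fixed point (about 32.1), because f is concave so the stochastic flux ε_f is negative: the coupling to σ shifts the mean.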
Note also the influence of the mean on the (co)variance in (6).

A deterministic model treats concentrations x(t) as continuous variables that can be predicted entirely from the initial conditions. Hence there is no noise term in the deterministic model, and the ODEs reduce to ẋ = f(x).

Since the 2MA approach is based on the truncation of terms containing 3rd and higher-order moments, any conclusion from the solution of the 2MA must be drawn with care. Ideally, the 2MA should be complemented and checked with a reasonable number of SSA runs.

In [53, 54], the 2MA has been applied to biochemical systems, demonstrating quantitative and qualitative differences between the mean of the stochastic model and the solution of the deterministic model. The examples used in [53, 54] all assume elementary reactions (and hence propensities that are at most quadratic) and the usual interpretation of concentration as moles per unit volume. In the next section, we investigate the 2MA for complex systems with non-elementary reactions and relative concentrations. The reason for our interest in non-elementary reactions is the frequent occurrence of rational propensities (reaction rates), e.g. of Michaelis–Menten and Hill type, in models in the systems biology literature (e.g. [66]).

3 Fission yeast cell cycle modelling

The growth and reproduction of organisms requires a precisely controlled sequence of events known as the cell cycle [67]. On a coarse scale, the cell cycle is composed of four phases: the replication of DNA (S phase), the separation of DNA (mitosis, M phase), and the intervening gap phases (G1 and G2), which allow for preparation, regulation and control of cell division. The central molecular components of the cell cycle control system have been identified [67, 68].

Cell cycle experiments show that cycle times (CTs) have different patterns for the wild type and for various mutants [69, 70].
For the wild type, the CTs have a more or less constant value near 150 min, ensured by a size control mechanism: mitosis happens only when the cell has reached a critical size. The value 150 min has been considered in [48, 63, 70, 71] as the CT of an average WT cell (also referred to as the “mass-doubling time”). The double mutant of fission yeast (namely wee1⁻ cdc25∆) exhibits quantised cycle times: the CTs cluster into three different groups (with mean CTs of 90, 160 and 230 min). The proposed explanation for the quantised cycle times is a weakened positive feedback loop (due to wee1 and cdc25), which means cells are reset (possibly more than once) back to G2 from early stages of mitosis by premature activation of a negative feedback loop [70, 71].

Many deterministic ODE models describing the cell cycle dynamics have been constructed [59, 61, 72, 73]. These models can explain many aspects of the cell cycle, including the size control for both the wild type and mutants. Since deterministic models describe the behaviour of a non-existing ‘average cell’, neglecting the differences among cells in a culture, they fail to explain curious behaviours such as the quantised cycle times in the double mutant. To account for such curiosities in experiments, two stochastic models were constructed by Sveiczer: the first model [70, 71] introduces (external) noise into the rate parameter of the protein Pyp3; the second model [74] introduces noise into the cell and nuclear sizes after division asymmetry. Full stochastic models that treat all the time-varying protein concentrations as random variables are reported in [48, 63]. They provide a reasonable explanation for the size control in the wild type and the quantised CTs in the double-mutant type. Both models employ the Langevin approach and hence require many simulation runs to provide an ensemble for computing the mean and (co)variance.
However, the simulation results of the stochastic models in [48, 63, 70, 71, 74] represent one trajectory (over a large number of successive cycles) of the many possible in the ensemble, from which the CT statistics (time averages) are computed. We will see that the time averages computed from the 2MA simulation are for the ensemble of all trajectories.

3.1 The deterministic model

We base our 2MA model on the deterministic ODE model for the fission yeast cell cycle developed by Tyson and Novák [59]. In this context, the cell cycle control mechanism centres around the M-phase promoting factor (MPF), the active form of the heterodimer Cdc13/Cdc2, and its antagonistic interactions with its enemies (Ste9, Slp1, Rum1) and the positive feedback with its friend Cdc25. These interactions, among many others, define a sequence of checkpoints to control the timing of the cell cycle phases. The result is an oscillation of MPF activity between low (G1 phase), intermediate (S and G2 phases) and high (M phase) levels, which is required for the correct sequence of cell cycle events. For simplicity, it is assumed that the cell divides functionally when MPF drops below 0.1.

Table 1 lists the proteins whose concentrations x_i, together with the MPF concentration, are treated as dynamic variables that evolve according to

dx_i/dt = f⁺_i(x) − f⁻_i(x).   (8)

Here f⁺_i(x) is the production flux and f⁻_i(x) the elimination flux of the i-th protein. Note that the summands in the fluxes f⁺_i(x) and f⁻_i(x) are rates of reactions, most of which are non-elementary (summarising many elementary reactions into a single step). Quite a few of these reaction rates have rational expressions, which requires the extended 2MA approach developed in this paper. The MPF concentration x_mpf can be obtained from the algebraic relation

x_mpf = (x_cdc13T − x_preMPF)(x_cdc13T − x_trim) / x_cdc13T.   (9)

Table 1: Proteins and fluxes.
Here x denotes the vector of protein concentrations. (The individual production and elimination fluxes f⁺_i(x) and f⁻_i(x) of Table 1, rational Michaelis–Menten and Hill-type rate expressions taken from [59], are garbled beyond recovery in this copy and are omitted.) The auxiliary quantities are given by

dM/dt = ρ M,

x_trim = 2 x_cdc13T x_rum1T / ( Σ + √(Σ² − 4 x_cdc13T x_rum1T) ),   Σ = x_cdc13T + x_rum1T + K_diss,

x_tf = G(k₁₅ M, k′₁₆ + k″₁₆ x_mpf, J₁₅, J₁₆),

k_wee = k′_wee + (k″_wee − k′_wee) G(V_awee, V_iwee x_mpf, J_awee, J_iwee),

k_25 = k′_25 + (k″_25 − k′_25) G(V_a25 x_mpf, V_i25, J_a25, J_i25),

G(a, b, c, d) = 2ad / ( b − a + bc + ad + √((b − a + bc + ad)² − 4(b − a)ad) ).   (10)

Note that the cellular mass M is assumed to grow exponentially with rate ρ, and the quantities (x_trim, x_tf, k_wee, k_25) are assumed to be in a pseudo-steady state to simplify the model. Note also that we use a slightly different notation: ρ for the mass growth rate (instead of µ), x_trim for the trimer concentration and x_tf for the TF concentration. We have to emphasise that the concentrations used in this model are relative and dimensionless. When one concentration is divided by another, the ratio is the same as the ratio of the two copy numbers. Hence, such a concentration should not be interpreted as a copy number per unit volume (as misinterpreted in [63]). The parameters of the Tyson–Novák model [59] are listed in Table A3.1 in Appendix A3.
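The function G in (10) is the Goldbeter–Koshland function, the steady-state fraction of modified protein in a covalent-modification cycle. A direct transcription of its closed form (the test values below are illustrative, not model parameters):

```python
import math

def goldbeter_koshland(a, b, c, d):
    """Goldbeter-Koshland function G(a, b, c, d) from (10): forward rate a,
    reverse rate b, and scaled Michaelis constants c and d."""
    q = b - a + b * c + a * d
    return 2.0 * a * d / (q + math.sqrt(q * q - 4.0 * (b - a) * a * d))

# Switch-like behaviour for small Michaelis constants:
hi = goldbeter_koshland(10.0, 0.1, 0.01, 0.01)   # activation dominates -> near 1
lo = goldbeter_koshland(0.1, 10.0, 0.01, 0.01)   # inactivation dominates -> near 0
```

This ultrasensitive, switch-like response is what makes the pseudo-steady-state quantities x_tf, k_wee and k_25 act as sharp thresholds in the cell cycle model.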
Specifically, it fails to explain the experimentally observed clusters of the CT-vs-BM plot and the tri-modal distribution of CT [69, 70, 71, 74].

3.2 Feasibility of Gillespie simulations

Ideally, we should repeat many runs of Gillespie's SSA and compute our desired moments from the ensemble of those runs. At present, there are two problems with this. The first problem is the SSA's requirement of elementary reactions. The elementary reactions underlying the deterministic model [59] are not known: many elementary steps have been simplified away to obtain that model, and attempting to perform the SSA on non-elementary reactions would lose the discrete-event character of the SSA. The second problem arises from the fact that the SSA requires copy numbers, which in turn require knowledge of the measured concentrations. All protein concentrations in the model are expressed in arbitrary units (a.u.) because the actual concentrations of most regulatory proteins in the cell are not known [62]. Tyson and Sveiczer define the relative concentration $x_i$ of the $i$th protein as $x_i = n_i/\Omega_i$ where $\Omega_i = C_i N_A V$. Here $C_i$ is an unknown characteristic concentration of the $i$th component. The idea is to make the relative concentrations $x_i$ free of the scale of the actual (molar) concentrations $n_i/(N_A V)$. Although one would like to vary $C_i$, this is computationally intensive.
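To make concrete what the SSA requires, here is a minimal sketch of Gillespie's direct method for a hypothetical elementary birth-death species (our own illustration, not part of the cell cycle model). It needs exactly the two ingredients the lumped model cannot supply: elementary propensities and integer copy numbers.

```python
import random

def ssa_birth_death(k, gamma, n0, t_end, rng):
    """Gillespie's direct method for the elementary scheme
    0 -> X with propensity k, and X -> 0 with propensity gamma*n.
    Returns the integer copy number at time t_end."""
    t, n = 0.0, n0
    while True:
        a1, a2 = k, gamma * n          # elementary propensities
        a0 = a1 + a2
        t += rng.expovariate(a0)       # exponential waiting time to next event
        if t > t_end:
            return n
        n += 1 if rng.random() * a0 < a1 else -1   # which reaction fired

rng = random.Random(1)
samples = [ssa_birth_death(50.0, 1.0, 0, 20.0, rng) for _ in range(200)]
ensemble_mean = sum(samples) / len(samples)   # near the stationary mean k/gamma = 50
```

With rational (Michaelis-Menten-type) rate laws in place of $a_1$ and $a_2$, the discrete-event bookkeeping above has no rigorous interpretation, which is the first obstacle noted in the text.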
This problem is not so serious for continuous approximations such as the CLE, the LNA and the 2MA, which are all ODEs and can be solved numerically.

3.3 The stochastic model using Langevin's approach

In [63] a stochastic model is proposed that replaces the ODE model (8) with a set of chemical Langevin equations (CLEs)

$$\frac{d}{dt} x_i(t) = f_i^+\big(x(t)\big) - f_i^-\big(x(t)\big) + \frac{1}{\sqrt{\Omega}} \left[ \sqrt{f_i^+\big(x(t)\big)}\, \Gamma_i^+(t) - \sqrt{f_i^-\big(x(t)\big)}\, \Gamma_i^-(t) \right],$$

which uses Langevin noise terms: white noises $\Gamma_i^+$ and $\Gamma_i^-$, scaled by $\sqrt{f_i^+(x)}$ and $\sqrt{f_i^-(x)}$, represent the internal noise. The system parameter $\Omega$ has been described as the volume by the author. As we discussed before, the concentrations are relative levels with different system size parameters; that means the concentrations are not the same as copy numbers per unit volume.

Another stochastic model employing Langevin's approach is reported in [48], which approximates the squared noise amplitudes by linear functions:

$$\frac{d}{dt} x_i(t) = f_i\big(x(t)\big) + \sqrt{D_i\, x_i(t)}\, \Gamma_i(t),$$

where $D_i$ is a constant. The reason why the model dynamics $f(x)$ are missing from the noise amplitudes is that the author wanted to represent both the internal and external noise.¹

3.4 The 2MA model

The flux vector $f$ and the diffusion matrix $B$, defined in (7), have elements

$$f_i(x) = f_i^+(x) - f_i^-(x), \qquad B_{ik}(x) = \begin{cases} f_i^+(x) + f_i^-(x) & \text{if } i = k \\ 0 & \text{if } i \neq k. \end{cases}$$

The off-diagonal elements of $B$ are zero because each reaction changes only one component, so that $S_{ij} S_{kj} = 0$ for $i \neq k$.

¹ Personal communication.
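A CLE of the form above is typically realised with an Euler-Maruyama scheme. The following sketch (our own illustration for a single hypothetical species with production flux $f^+ = k$ and elimination flux $f^- = \gamma x$; all names are ours) shows how the two scaled noise terms enter a time-stepping loop:

```python
import math
import random

def cle_birth_death(k, gamma, omega, x0, t_end, dt, rng):
    """Euler-Maruyama discretisation of
        dx = (f+ - f-) dt + (1/sqrt(Omega)) (sqrt(f+) dW+ - sqrt(f-) dW-),
    with f+(x) = k and f-(x) = gamma*x. The white noises become independent
    Gaussian increments of variance dt, i.e. sqrt(dt)*N(0, 1)."""
    x = x0
    for _ in range(int(t_end / dt)):
        fp, fm = k, gamma * max(x, 0.0)            # clamp keeps the sqrt real
        x += (fp - fm) * dt + (math.sqrt(fp) * rng.gauss(0.0, 1.0)
                               - math.sqrt(fm) * rng.gauss(0.0, 1.0)) \
             * math.sqrt(dt / omega)
    return x

rng = random.Random(0)
runs = [cle_birth_death(5.0, 1.0, 5000.0, 5.0, 10.0, 1e-3, rng) for _ in range(50)]
mean_x = sum(runs) / len(runs)   # fluctuates narrowly around k/gamma = 5 for large Omega
```

Note how the noise amplitude shrinks as $1/\sqrt{\Omega}$: for a large system size the CLE trajectories hug the deterministic solution, which is why the choice of $\Omega$ matters so much in what follows.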
Once these quantities are known, it follows from (5) and (6) that the following set of ODEs

$$\frac{d\mu_i}{dt} = f_i(\mu) + \varepsilon_{f_i}(\mu, \sigma) \qquad (11)$$

$$\frac{d\sigma_{ii}}{dt} = 2 \sum_l A_{il}(\mu)\, \sigma_{li} + \frac{1}{\Omega_i} \big[ B_{ii}(\mu) + \varepsilon_{B_{ii}}(\mu, \sigma) \big] \qquad (12)$$

$$\frac{d\sigma_{ik}}{dt} = \sum_l \big[ A_{il}(\mu)\, \sigma_{lk} + \sigma_{il}\, A_{kl}(\mu) \big], \qquad i \neq k \qquad (13)$$

approximates (correctly to 2nd-order moments) the evolution of the component-wise concentration means and covariances. See Tables A3.2-A3.4 in Appendix A3 for the respective expressions of the drift matrix $A$, the stochastic flux $\varepsilon_f$ and the correction term $\varepsilon_B$ added to the diffusion matrix $B$ in (12). Having at hand the moments involving the eight dynamic variables $x_1$ to $x_8$, the mean MPF concentration can be shown to be, correct to 2nd-order moments,

$$\mu_{mpf} = \mu_1 - \mu_2 - x_{trim} + \frac{x_{trim}}{\mu_1} \left[ \left( 1 + \frac{\sigma_{11}}{\mu_1^2} \right) \mu_2 - \frac{\sigma_{12}}{\mu_1} \right] \qquad (14)$$

with the understanding that $x_{trim}$ is in a pseudo-steady state (see Appendix A2 for the derivation). This expression for the average MPF activity demonstrates the influence of the (co)variance on the mean, as emphasised here: the mean MPF concentration $\mu_{mpf}$ depends on the variance $\sigma_{11}$ and the covariance $\sigma_{12}$ in addition to the means $\mu_1$, $\mu_2$ and $x_{trim}$.
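Equations of the form (11)-(12) can be integrated with any standard ODE scheme. The following sketch (our own toy example, not the cell cycle model) does this for a single hypothetical dimerisation $2X \to \emptyset$ in copy-number units ($\Omega = 1$), where the coupling term $\varepsilon_f = -c\,\sigma^2$ is exact because the propensity is quadratic:

```python
def two_ma_dimerisation(c, mu0, t_end, dt):
    """2MA equations (11)-(12) for the single reaction 2X -> 0 with
    propensity a(n) = c*n*(n-1)/2 and step S = -2 (Omega = 1):
        dmu/dt  = -c*(mu**2 - mu) - c*var     # flux plus coupling eps_f = -c*var
        dvar/dt = 2*A*var + B + eps_B
    with drift A = -c*(2*mu - 1), diffusion B = 2*c*mu*(mu - 1) and
    correction eps_B = 2*c*var. Explicit Euler, for brevity only."""
    mu, var = mu0, 0.0
    mu_det = mu0                    # deterministic mean, coupling dropped
    for _ in range(int(t_end / dt)):
        A = -c * (2.0 * mu - 1.0)
        dmu = -c * (mu * mu - mu) - c * var
        dvar = 2.0 * A * var + 2.0 * c * mu * (mu - 1.0) + 2.0 * c * var
        mu, var = mu + dmu * dt, var + dvar * dt
        mu_det += -c * (mu_det * mu_det - mu_det) * dt
    return mu, var, mu_det

mu, var, mu_det = two_ma_dimerisation(c=0.01, mu0=100.0, t_end=5.0, dt=1e-4)
# The coupling term -c*var makes the 2MA mean decay strictly faster than
# the deterministic mean (mu < mu_det): the coupling the 2MA is built to capture.
```

Even in this minimal setting the mean and variance are visibly coupled, which is exactly the mechanism (14) exposes for MPF.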
Figure 1. The time courses of mass and MPF activity: (a) for the wild type, (b) for the double-mutant type. The 2MA-predicted mean trajectories are plotted as solid lines and the corresponding deterministic trajectories as dashed lines. The difference between the two predictions is negligible for the wild type, but significant for the double-mutant type.

Since the value of $\Omega_i$ used in the definition of the concentrations is not available, we have used $\Omega_i = 5000$ for all $i$. This value has also been used in [63], although there is no clear justification. Note, however, that the 2MA approach developed here will work for any combination of $\{\Omega_i\}$. The time courses of mass and MPF activity are plotted in Figure 1a for the wild type and in Figure 1b for the double-mutant type. For the wild type, the 2MA-predicted mean trajectories do not differ considerably from the corresponding deterministic trajectories. Both plots show a more or less constant CT near 150 min. Thus internal noise does not seem to have a major influence for the wild type.

For the double-mutant type, the difference between the 2MA and deterministic predictions is significant. The deterministic model (8) predicts alternating short and long cycles, because cells born at a larger size have shorter cycles and smaller newborns have longer cycles [59]. This strict alternation due to size control is not observed in experiments: cells of the same mass may have short or long cycles (excluding very large cells, which always have the shortest CT) [69, 71].
This lack of size control is reproduced by the 2MA simulations: the multiple resettings of MPF to G2, induced by the internal noise, result in longer CTs (thus accounting for the 230-min cycles observed experimentally). Such MPF resettings have been proposed in [70, 71] to explain quantised CTs. No such resetting is exhibited by the deterministic model.
Figure 2. Variance $\sigma_{11}$ (of Cdc13$_T$) and covariance $\sigma_{12}$ (between Cdc13$_T$ and preMPF): (a) for the wild type, (b) for the double-mutant type.

Note that the mean $\mu(t)$ of the 2MA describes the average of an ensemble of cells. Yet the MPF resettings observed in Figure 1b, near the G2/M transition, introduce the required variability that explains the clustering of the cycle time observed in experiments. This is in contrast to the alternative stochastic approaches in [48, 63, 70, 71, 74], which use one sample trajectory rather than the ensemble average.

How do we explain this significant effect of noise for the double mutant on the one hand and its negligible effect for the wild type on the other? If we look at expression (14), we see the influence of the variance $\sigma_{11}$ (of Cdc13$_T$) and the covariance $\sigma_{12}$ (between Cdc13$_T$ and preMPF) on the mean MPF concentration $\mu_{mpf}$. The two (co)variances are plotted in Figure 2a for the wild type and in Figure 2b for the double-mutant type. It is clear that the two (co)variances have very small peaks for the wild type compared to the large peaks for the double-mutant type. Note that the larger peaks in Figure 2b are located at the same time points where the MPF activity exhibits oscillations and hence multiple resettings to G2. This suggests that the oscillatory behaviour of MPF near the G2/M transition is due to the influence of the oscillatory (co)variances. This coupling between the mean and the (co)variance is not captured by the deterministic model.

It has to be realised that the above proposition requires validation, since the 2MA approach ignores 3rd- and higher-order moments. We cannot know whether that truncation is responsible for the oscillations in Figures 1 and 2 unless we compare with a few sample trajectories simulated by the SSA. However, as discussed before, the SSA cannot be performed (at present) for the model under consideration.
Therefore we need to compare the 2MA predictions for the double-mutant type cells with experimental data. Towards that end, values of the cycle time (CT), birth mass (BM) and division mass (DM) were computed for 465 successive cycles of double-mutant cells. Figure 3 shows the CT-vs-BM plot and the CT distribution for three different values {5000, 5200, 5300} of the system size $\Omega$.

Table 2
Statistics over 465 successive cell cycles of the double-mutant type cells, predicted by the 2MA model, compared with experimental data; see [69, Table 1].

Case | $\mu_{CT}$ | $\sigma_{CT}$ | $CV_{CT}$ | $\mu_{DM}$ | $\sigma_{DM}$ | $CV_{DM}$ | $\mu_{BM}$ | $\sigma_{BM}$
(1) | 131 | 47 | 0.358 | 2.22 | 0.45 | 0.203 | 1.21 | 0.24
(2) | 138.8 | 12.4 | 0.09 | 3.18 | 0.101 | 0.0319 | 1.59 | 0.0575
(3) | 138.8 | 17.6 | 0.127 | 3.25 | 0.178 | 0.055 | 1.623 | 0.0934
(4) | 138.8 | 23.9 | 0.172 | 3.32 | 0.231 | 0.0697 | 1.657 | 0.12

(1) experimental data, (2) $\Omega = 5000$, (3) $\Omega = 5200$, (4) $\Omega = 5300$.

To make this figure comparable with the experimental data from [69, 70], we assume that 1 unit of mass corresponds to 8.2 µm of cell length [71]. We can see the missing size control (CT clusters), in qualitative agreement with the experimentally observed ones (see [69, Figure 6] and [70, Figure 5] for a comparison). There are more than four clusters, which may have arisen from the truncated higher-order moments. The extreme values of CT, higher than 230 min, suggest more than two MPF resettings. Furthermore, the more than three modes in the CT distribution may also have arisen from the truncated higher-order moments.

Table 2 compares the statistics for the double-mutant type cells, computed with the 2MA approach, with data from [69, Table 1]. Columns 2-4 tabulate, for CT, the mean $\mu_{CT}$, the standard deviation $\sigma_{CT}$ and the coefficient of variation $CV_{CT}$, respectively. The other columns tabulate similar quantities for the division mass (DM) and the birth mass (BM). We see that only the mean CT is in agreement with the experimental data. The mean values of both BM and DM are larger than the corresponding experimental values.
The other statistics are much smaller than the corresponding experimental values. This and the above plots suggest that the 2MA should be used with caution. However, another aspect of the cell cycle model deserves attention here. The way the relative protein concentrations have been defined implies unknown values of the scaling parameters $\{\Omega_i\}$. Since $\Omega_i = C_i N_A V$, knowing the volume $V$ does not solve the problem: the characteristic concentrations $\{C_i\}$ are still unknown. In our simulations we have chosen the typical values $\Omega \in \{5000, 5200, 5300\}$. The corresponding three pairs of plots in Figure 3 and rows in Table 2 demonstrate a dependence of the results on the chosen system size. There is no way to confirm these values. The scaling parameters could be tuned over a wider range to improve the accuracy of our simulation, motivating future work. The conclusion is that the quantitative disagreement of the 2MA predictions can be attributed to two factors: (1) the moments truncated during the derivation of the 2MA, and (2) the unknown values of the scaling parameters.
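The statistics reported in Table 2 (mean, standard deviation and coefficient of variation) are straightforward to compute from a list of simulated cycle times. A minimal helper for reference (our own code; the sample data are made up to mimic clustered, quantised CTs, not taken from the simulations):

```python
import math

def summary_stats(values):
    """Mean, standard deviation and coefficient of variation (CV = std/mean),
    the quantities tabulated for CT, DM and BM in Table 2."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population std
    return mean, std, std / mean

# Hypothetical cycle times (min), clustered the way quantised CTs would be:
cts = [140, 142, 138, 230, 232, 141, 139, 228]
mu_ct, sigma_ct, cv_ct = summary_stats(cts)
```

Note that a bimodal CT sample like this one inflates $\sigma_{CT}$ and hence $CV_{CT}$, which is why the experimental row (1) in Table 2 shows a much larger CV than the unimodal-looking simulation rows.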
Figure 3. Cycle time behaviour over 465 successive cycles of the double-mutant cells, predicted by the 2MA model. (a,c,e): CT vs BM; (b,d,f): CT distribution; (a,b): $\Omega = 5000$; (c,d): $\Omega = 5200$; (e,f): $\Omega = 5300$. The plots are in qualitative agreement with experiments; see [69, Figure 6] and [70, Figure 5] for a comparison.
Conclusions

The recently developed two-moment approximation (2MA) [53, 54] is a promising approach because it accounts for the coupling between the means and (co)variances. We have extended the derivation of the 2MA to biochemical networks and established two advances over previous efforts: a) relative concentrations and b) non-elementary reactions. Both aspects are important in systems biology, where one is often forced to aggregate elementary reactions into single-step reactions. In these situations one cannot assume knowledge of the elementary reactions to formulate a stochastic model. Previous derivations assumed elementary reactions and absolute concentrations; however, numerous existing models in systems biology use relative concentrations.

We investigated the applicability of the 2MA approach to the well-established fission yeast cell cycle model. The simulations of the 2MA model show oscillatory behaviour near the G2/M transition, which is significantly different from the simulations of the deterministic ODE model. One notable aspect of our analytical model is that, although it describes the average of an ensemble, it reproduces enough variability among cycles to reproduce the curious quantised cycle times observed in experiments on double mutants.

Acknowledgements

In the process of preparing this manuscript, Hanspeter Herzel (Humboldt University), Akos Sveiczer (Budapest University of Technology and Economics) and Kevin Burrage (University of Queensland, Brisbane) helped in clarifying questions related to the cell cycle and stochastic modelling. We very much appreciate their willingness to discuss these issues with us. M.U. has been supported by the Deutsche Forschungsgemeinschaft (DFG) through grant (WO 991/3-1). O.W. acknowledges support from the Helmholtz Alliance on Systems Biology and the Stellenbosch Institute for Advanced Study (STIAS).
Appendices
A1 Derivation of the 2MA equations

The progress of a particular reaction can be described by a quantity known as the degree of advancement (DA). We will write $Z_j(t)$ for the DA of the $j$th reaction, where $Z_j(t) = z_j$ means that the $j$th reaction has occurred $z_j$ times during the interval $[0, t)$. In the same interval, the $j$th reaction will contribute a change of $z_j S_{ij}$ molecules to the overall change in the copy number $N_i$ of the $i$th component. Summing up the contributions from all reactions, the copy number can be expressed as

$$N_i(t) = N_i(0) + \sum_{j=1}^{r} S_{ij}\, Z_j(t). \qquad (A1.1)$$

Based on the definition of the reaction propensity, the number of occurrences $Z_j(t + \Delta t) - Z_j(t)$ during a short interval $[t, t + \Delta t)$ has the probability distribution

$$\Pr\big[Z_j(t + \Delta t) - Z_j(t) = z_j \,\big|\, N(t) = n\big] = \begin{cases} a_j(n)\,\Delta t + o(\Delta t) & \text{if } z_j = 1 \\ 1 - a_j(n)\,\Delta t + o(\Delta t) & \text{if } z_j = 0 \\ o(\Delta t) & \text{if } z_j > 1 \end{cases} \qquad (A1.2)$$

where $o(\Delta t)$ represents a quantity that vanishes faster than $\Delta t$ as the latter approaches zero. In effect, (A1.2) gives the conditional probability distribution, in state $n$, of the random progress (DA increment) $Z_j(t + \Delta t) - Z_j(t)$ of the $j$th reaction during the time interval $[t, t + \Delta t)$. The expected value of this short-time DA increment can be obtained from (A1.2) as

$$\mathrm{E}\big[Z_j(t + \Delta t) - Z_j(t) \,\big|\, N(t) = n\big] = \sum_{z_j} z_j \Pr\big[Z_j(t + \Delta t) - Z_j(t) = z_j \,\big|\, N(t) = n\big] = \underbrace{a_j(n)\,\Delta t}_{z_j = 1} + \underbrace{o(\Delta t)}_{z_j > 1} \qquad (A1.3)$$

which is conditioned on $N(t) = n$.
The unconditional expectation of the DA increment can be obtained by summing the above conditional expectation, weighted by the probabilities $P(n, t)$, over all possible states $n$:

$$\mathrm{E}[Z_j(t + \Delta t) - Z_j(t)] = \sum_n \mathrm{E}\big[Z_j(t + \Delta t) - Z_j(t) \,\big|\, N(t) = n\big]\, P(n, t) = \sum_n a_j(n)\, P(n, t)\,\Delta t + o(\Delta t) = \mathrm{E}\big[a_j\big(N(t)\big)\big]\,\Delta t + o(\Delta t),$$

which for vanishingly small $\Delta t$ leads to the ODE

$$\frac{d}{dt} \mathrm{E}[Z_j(t)] = \mathrm{E}\big[a_j\big(N(t)\big)\big]. \qquad (A1.4)$$

Thus the mean propensity of a particular reaction can be interpreted as the average number of occurrences (DA) per unit time of that reaction. Taking the expectation on both sides of the conservation relation (A1.1) gives

$$\frac{d}{dt} \mathrm{E}[N_i(t)] = \sum_{j=1}^{r} S_{ij}\, \mathrm{E}\big[a_j\big(N(t)\big)\big],$$

which proves (4) in the main text. It is interesting to note that the above ODE is a direct consequence of the mass conservation (A1.1) and the definition of the propensity, because we have not referred to the CME (which is the usual procedure) during our derivation.

Dividing (4) by $\Omega_i$ gives the ODE for the component mean concentration,

$$\frac{d}{dt} \mu_i(t) = \mathrm{E}\big[f_i\big(X(t)\big)\big], \qquad (A1.5)$$

where

$$f_i(x) = \frac{1}{\Omega_i} \sum_{j=1}^{r} S_{ij}\, a_j(\Omega x)$$

is the total flux of component $i$ in state $x$.

Suppose the propensity $a_j(n)$ is a smooth function and that central moments $\mathrm{E}[(N - \mu)^m]$ of order higher than $m = 2$ can be ignored. In that case, the Taylor series expansion of the flux $f_i(x)$ around the mean is

$$f_i(x) = f_i(\mu) + \left[\frac{\partial f_i}{\partial x^T}\right]_{x=\mu} (x - \mu) + \frac{1}{2}(x - \mu)^T \left[\frac{\partial^2 f_i}{\partial x\, \partial x^T}\right]_{x=\mu} (x - \mu) + \cdots.$$

The expectation of the 2nd term on the right is zero. The expectation of the 3rd term can be written as

$$\varepsilon_{f_i}(\mu, \sigma) = \frac{1}{2} \sum_{k,l} \left[\frac{\partial^2 f_i}{\partial x_k\, \partial x_l}\right]_{x=\mu} \sigma_{kl}.$$
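The truncation behind $\varepsilon_{f_i}$ can be checked on a case where it is exact: for a quadratic flux $f(x) = x^2$, the expansion gives $\mathrm{E}[f(X)] = f(\mu) + \frac{1}{2} f''(\mu)\,\sigma^2 = \mu^2 + \sigma^2$ with no neglected remainder. A small numerical check over an arbitrary discrete distribution (our own illustration; the distribution is made up):

```python
def moments(values, probs):
    """Mean and variance of a finite discrete distribution."""
    mu = sum(v * p for v, p in zip(values, probs))
    var = sum((v - mu) ** 2 * p for v, p in zip(values, probs))
    return mu, var

# An arbitrary three-point distribution, chosen only for illustration.
values, probs = [1.0, 4.0, 9.0], [0.2, 0.5, 0.3]
mu, var = moments(values, probs)

f = lambda x: x * x                  # a quadratic "flux", so f'' = 2
exact = sum(f(v) * p for v, p in zip(values, probs))   # E[f(X)]
truncated = f(mu) + 0.5 * 2.0 * var                    # f(mu) + (1/2) f''(mu) var
assert abs(exact - truncated) < 1e-9   # no 3rd-order remainder for quadratic f
```

For fluxes of order at most two, i.e. only mono- and bimolecular reactions, the mean equation with $\varepsilon_f$ is therefore exact, as noted below; the rational rate laws of the cell cycle model, by contrast, have non-vanishing higher derivatives, so there the truncation is a genuine approximation.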
Note that the Taylor expansion in powers of $x - \mu$ is more convincing than one in powers of $n - \mathrm{E}(n)$, because the higher-order terms vanish more quickly in the former. Having arrived at this point, ignoring terms (moments) higher than 2nd order, we can write

$$\frac{d\mu_i}{dt} = f_i(\mu) + \varepsilon_{f_i}(\mu, \sigma) \qquad (A1.6)$$

for the mean component concentration, or in vector form

$$\frac{d\mu}{dt} = f(\mu) + \varepsilon_f(\mu, \sigma).$$

The correction term $\varepsilon_f(\mu, \sigma)$ represents the internal noise that arises from the discrete and random nature of chemical reactions. Note that this term has been derived from the CME, instead of being assumed like external noise. This shows that knowledge of the fluctuations (even if small) is important for a correct description of the mean. It also indicates an advantage of the stochastic framework over its deterministic counterpart: starting from the same assumptions and approximations, the stochastic framework allows us to see the influence of the fluctuations on the mean. Note that the above equation is exact for systems where no reaction has an order higher than two, because then the 3rd and higher derivatives of the propensity are zero.

Before we can see how the covariance $\sigma$ evolves in time, let us multiply the CME by $n_i n_k$ and sum over all $n$:

$$\sum_n n_i n_k \frac{dP(n, t)}{dt} = \sum_n n_i n_k \sum_{j=1}^{r} \big[a_j(n - S_{\cdot j})\, P(n - S_{\cdot j}, t) - a_j(n)\, P(n, t)\big]$$
$$= \sum_n \sum_{j=1}^{r} \big[(n_i + S_{ij})(n_k + S_{kj})\, a_j(n)\, P(n, t) - n_i n_k\, a_j(n)\, P(n, t)\big]$$
$$= \sum_n \sum_{j=1}^{r} (n_k S_{ij} + n_i S_{kj} + S_{ij} S_{kj})\, a_j(n)\, P(n, t),$$

where the dependence on time is implicit for all variables except $n$. Dividing by $\Omega_i \Omega_k$ and recognising sums of probabilities as expectations,

$$\frac{d\, \mathrm{E}[X_i X_k]}{dt} = \mathrm{E}[X_k f_i(X)] + \mathrm{E}[X_i f_k(X)] + \frac{1}{\sqrt{\Omega_i \Omega_k}}\, \mathrm{E}[B_{ik}(X)],$$

where $B(x)$ is the diffusion matrix with elements

$$B_{ik}(x) = \frac{1}{\sqrt{\Omega_i \Omega_k}} \sum_{j=1}^{r} S_{ij} S_{kj}\, a_j(\Omega x).$$
The relation $\sigma_{ik} = \mathrm{E}[X_i X_k] - \mu_i \mu_k$ can be utilised to yield

$$\frac{d\sigma_{ik}}{dt} = \mathrm{E}[(X_k - \mu_k)\, f_i(X)] + \mathrm{E}[(X_i - \mu_i)\, f_k(X)] + \frac{1}{\sqrt{\Omega_i \Omega_k}}\, \mathrm{E}[B_{ik}(X)] \qquad (A1.7)$$

for the covariances between the concentrations of component pairs. The argument of the first expectation in (A1.7) has the Taylor expansion

$$f_i(x)(x_k - \mu_k) = f_i(\mu)(x_k - \mu_k) + \left[\frac{\partial f_i}{\partial x^T}\right]_{x=\mu} (x - \mu)(x_k - \mu_k) + \cdots.$$

The expectation of the first term on the right is zero. Ignoring 3rd- and higher-order moments, the first expectation in (A1.7) is then
$$\mathrm{E}[(X_k - \mu_k)\, f_i(X)] = \sum_l A_{il}(\mu)\, \sigma_{lk},$$

where $A(x)$ is the drift matrix (the Jacobian of $f(x)$) with elements

$$A_{ik}(x) = \frac{\partial f_i(x)}{\partial x_k}.$$

By a similar procedure, the second expectation in (A1.7) is
$$\mathrm{E}[(X_i - \mu_i)\, f_k(X)] = \sum_l \sigma_{il}\, A_{kl}(\mu),$$

correct to 2nd-order moments. The element $B_{ik}(x)$ of the diffusion matrix has the Taylor expansion

$$B_{ik}(x) = B_{ik}(\mu) + \left[\frac{\partial B_{ik}}{\partial x^T}\right]_{x=\mu} (x - \mu) + \frac{1}{2}(x - \mu)^T \left[\frac{\partial^2 B_{ik}}{\partial x\, \partial x^T}\right]_{x=\mu} (x - \mu) + \cdots.$$

Taking term-wise expectations, and ignoring 3rd- and higher-order moments,
$$\mathrm{E}[B_{ik}(X)] = B_{ik}(\mu) + \varepsilon_{B_{ik}}(\mu, \sigma), \qquad \text{where} \qquad \varepsilon_{B_{ik}}(\mu, \sigma) = \frac{1}{2} \sum_{i',k'} \left[\frac{\partial^2 B_{ik}}{\partial x_{i'}\, \partial x_{k'}}\right]_{x=\mu} \sigma_{i'k'}.$$

Having these results at hand, we can now write

$$\frac{d\sigma_{ik}}{dt} = \sum_l \big[A_{il}(\mu)\, \sigma_{lk} + \sigma_{il}\, A_{kl}(\mu)\big] + \frac{1}{\sqrt{\Omega_i \Omega_k}} \big[B_{ik}(\mu) + \varepsilon_{B_{ik}}(\mu, \sigma)\big]$$

for the component-wise covariances. In matrix notation,

$$\frac{d\sigma}{dt} = A(\mu)\, \sigma + \sigma A(\mu)^T + \Omega^{-1/2} \big[B(\mu) + \varepsilon_B(\mu, \sigma)\big] \big(\Omega^{-1/2}\big)^T$$

proves (6) in the main text. The drift matrix $A(\mu)$ reflects the dynamics of relaxation (dissipation) to the steady state, and the diffusion matrix $B(\mu)$ the randomness (fluctuation) of the individual events [1]. These terms are borrowed from the fluctuation-dissipation theorem (FDT) [76, 77], which has the same form as (6). Remember that (6) is exact for systems that contain only zero- and first-order reactions, because in that case the propensity is already linear.

A2 Mean MPF concentration

To find the mean MPF concentration, we start with the MPF concentration

$$x_{mpf} = (x_1 - x_2)\left(1 - \frac{x_{trim}}{x_1}\right) = x_1 - x_2 - x_{trim} + x_{trim}\, \frac{x_2}{x_1}.$$

The ratio $x_2/x_1$ can be expanded around the mean,

$$\frac{x_2}{x_1} = \frac{1}{\mu_1} \cdot \frac{x_2}{1 + (x_1 - \mu_1)/\mu_1} = \frac{1}{\mu_1} \left[x_2 - \frac{(x_1 - \mu_1)\, x_2}{\mu_1} + \frac{(x_1 - \mu_1)^2\, x_2}{\mu_1^2} + \cdots\right].$$

Taking the expectation on both sides, and ignoring 3rd- and higher-order moments,

$$\mathrm{E}\left[\frac{X_2}{X_1}\right] = \frac{1}{\mu_1} \left[\mu_2 - \frac{\sigma_{12}}{\mu_1} + \frac{\mu_2\, \sigma_{11}}{\mu_1^2}\right].$$

Finally, the mean MPF concentration follows from the expectation of $x_{mpf}$ to be

$$\mu_{mpf} = \mu_1 - \mu_2 - x_{trim} + \frac{x_{trim}}{\mu_1} \left[\left(1 + \frac{\sigma_{11}}{\mu_1^2}\right) \mu_2 - \frac{\sigma_{12}}{\mu_1}\right],$$

thus proving (14) in the main text.

A3 Parameters and coefficients of the 2MA equations

Table A3.1
Parameter values for the Tyson-Novák cell cycle model of the fission yeast (wild type) [59].
All constants have units min⁻¹, except the $J$s, which are dimensionless Michaelis constants, and $K_{diss}$, which is a dimensionless equilibrium constant for trimer dissociation. For the double-mutant type, one makes the following three changes: $k_{wee}'' = 0.3$, $k_{25}' = k_{25}'' = 0.02$.

$k_1 = 0.03$, $k_2' = 0.03$, $k_2'' = 1$, $k_2''' = 0.1$, $k_3' = 1$, $k_3'' = 10$, $J_3 = 0.01$, $k_4' = 2$, $k_4 = 35$, $J_4 = 0.01$, $k_5' = 0.005$, $k_5'' = 0.3$, $k_6 = 0.1$, $J_5 = 0.3$, $k_7 = 1$, $k_8 = 0.25$, $J_7 = J_8 = 0.001$, $k_9 = 0.1$, $k_{10} = 0.04$, $J_9 = J_{10} = 0.01$, $k_{11} = 0.1$, $k_{12} = 0.01$, $k_{12}' = 1$, $k_{12}'' = 3$, $K_{diss} = 0.001$, $k_{13} = 0.1$, $k_{14} = 0.1$, $k_{15} = 1.5$, $k_{16}' = 1$, $k_{16}'' = 2$, $J_{15} = 0.01$, $J_{16} = 0.01$, $V_{awee} = 0.25$, $V_{iwee} = 1$, $J_{awee} = 0.01$, $J_{iwee} = 0.01$, $V_{a25} = 1$, $V_{i25} = 0.25$, $J_{a25} = 0.01$, $J_{i25} = 0.01$, $k_{wee}' = 0.15$, $k_{wee}'' = 1.3$, $k_{25}' = 0.05$, $k_{25}'' = 5$, $\rho = 0.005$.

Table A3.2
Rows of the drift matrix $A$ of the 2MA cell cycle model.
We here use $e_i$ to denote the $i$th row of the $8 \times 8$ identity matrix.

Index $i$ | $A_{i\cdot}(x) = \partial f_i / \partial x^T$

1: $-(k_2' + k_2'' x_3 + k_2''' x_5)\, e_1 - k_2'' x_1\, e_3 - k_2''' x_1\, e_5$

2: $k_{wee}\, e_1 - (k_{wee} + k_{25} + k_2' + k_2'' x_3 + k_2''' x_5)\, e_2 - k_2'' x_2\, e_3 - k_2''' x_2\, e_5$

3: $-\left[\dfrac{(k_4' x_8 + k_4 x_{mpf})\, J_4}{(J_4 + x_3)^2} + \dfrac{(k_3' + k_3'' x_5)\, J_3}{(J_3 + 1 - x_3)^2}\right] e_3 + \dfrac{(1 - x_3)\, k_3''}{J_3 + 1 - x_3}\, e_5 - \dfrac{k_4' x_3}{J_4 + x_3}\, e_8$

4: $-k_6\, e_4$

5: $\dfrac{k_7 J_7 x_6}{(J_7 + x_4 - x_5)^2}\, e_4 - \left[k_6 + \dfrac{k_7 J_7 x_6}{(J_7 + x_4 - x_5)^2} + \dfrac{k_8 J_8}{(J_8 + x_5)^2}\right] e_5 + \dfrac{(x_4 - x_5)\, k_7}{J_7 + x_4 - x_5}\, e_6$

6: $-\left[\dfrac{k_9 x_{mpf} J_9}{(J_9 + 1 - x_6)^2} + \dfrac{k_{10} J_{10}}{(J_{10} + x_6)^2}\right] e_6$

7: $-(k_{12} + k_{12}' x_8 + k_{12}'' x_{mpf})\, e_7 - k_{12}' x_7\, e_8$

8: $-k_{14}\, e_8$

Table A3.3
Stochastic flux, the correction term added to the deterministic flux in (5). Rows with $\varepsilon_{f_i} = 0$ ($i = 4, 8$) are omitted.

Index $i$ | $\varepsilon_{f_i}(x, \sigma) = \frac{1}{2}\sum_{k,l} \frac{\partial^2 f_i}{\partial x_k\, \partial x_l}\, \sigma_{kl}$

1: $-k_2'' \sigma_{13} - k_2''' \sigma_{15}$

2: $-k_2'' \sigma_{23} - k_2''' \sigma_{25}$

3: $\left[\dfrac{(k_4' x_8 + k_4 x_{mpf})\, J_4}{(J_4 + x_3)^3} - \dfrac{(k_3' + k_3'' x_5)\, J_3}{(J_3 + 1 - x_3)^3}\right] \sigma_{33} - \dfrac{k_3'' J_3\, \sigma_{35}}{(J_3 + 1 - x_3)^2} - \dfrac{k_4' J_4\, \sigma_{38}}{(J_4 + x_3)^2}$

5: $\dfrac{k_7 J_7 x_6\, (2\sigma_{45} - \sigma_{44} - \sigma_{55})}{(J_7 + x_4 - x_5)^3} + \dfrac{k_7 J_7\, (\sigma_{46} - \sigma_{56})}{(J_7 + x_4 - x_5)^2} + \dfrac{k_8 J_8\, \sigma_{55}}{(J_8 + x_5)^3}$

6: $\left[\dfrac{k_{10} J_{10}}{(J_{10} + x_6)^3} - \dfrac{k_9 x_{mpf} J_9}{(J_9 + 1 - x_6)^3}\right] \sigma_{66}$

7: $-k_{12}' \sigma_{78}$

Table A3.4
Correction term added to $B_{ii}(x)$ in (12). Rows with $\varepsilon_{B_{ii}} = 0$ ($i = 4, 8$) are omitted.

Index $i$ | $\varepsilon_{B_{ii}}(x, \sigma) = \frac{1}{2}\sum_{k,l} \frac{\partial^2 B_{ii}}{\partial x_k\, \partial x_l}\, \sigma_{kl}$

1: $k_2'' \sigma_{13} + k_2''' \sigma_{15}$

2: $k_2'' \sigma_{23} + k_2''' \sigma_{25}$

3: $-\left[\dfrac{(k_4' x_8 + k_4 x_{mpf})\, J_4}{(J_4 + x_3)^3} + \dfrac{(k_3' + k_3'' x_5)\, J_3}{(J_3 + 1 - x_3)^3}\right] \sigma_{33} - \dfrac{k_3'' J_3\, \sigma_{35}}{(J_3 + 1 - x_3)^2} + \dfrac{k_4' J_4\, \sigma_{38}}{(J_4 + x_3)^2}$

5: $\dfrac{k_7 J_7 x_6\, (2\sigma_{45} - \sigma_{44} - \sigma_{55})}{(J_7 + x_4 - x_5)^3} + \dfrac{k_7 J_7\, (\sigma_{46} - \sigma_{56})}{(J_7 + x_4 - x_5)^2} - \dfrac{k_8 J_8\, \sigma_{55}}{(J_8 + x_5)^3}$

6: $-\left[\dfrac{k_{10} J_{10}}{(J_{10} + x_6)^3} + \dfrac{k_9 x_{mpf} J_9}{(J_9 + 1 - x_6)^3}\right] \sigma_{66}$

7: $k_{12}' \sigma_{78}$