[PDF] Steady-state fluctuations of a genetic feedback loop with fluctuating rate parameters using the unified colored noise approximation

Abstract

A common model of stochastic auto-regulatory gene expression describes promoter switching via cooperative protein binding, effective protein production in the active state and dilution of proteins. Here we consider an extension of this model whereby colored noise with a short correlation time is added to the reaction rate parameters -- we show that when the size and timescale of the noise is appropriately chosen it accounts for fast reactions that are not explicitly modelled, e.g., in models with no mRNA description, fluctuations in the protein production rate can account for rapid multiple stages of nuclear mRNA processing which precede translation in eukaryotes. We show how the unified colored noise approximation can be used to derive expressions for the protein number distribution that is in good agreement with stochastic simulations. We find that even when the noise in the rate parameters is small, the protein distributions predicted by our model can be significantly different than models assuming constant reaction rates.

Full PDF

SSteady-state ﬂuctuations of a genetic feedback loopwith ﬂuctuating rate parameters using theuniﬁed colored noise approximation

James Holehouse ∗ , Abhishek Gupta ∗ and Ramon Grima † School of Biological Sciences, University of Edinburgh, U.K.April 7, 2020

Abstract

A common model of stochastic auto-regulatory gene expression describes promoter switchingvia cooperative protein binding, eﬀective protein production in the active state and dilution ofproteins. Here we consider an extension of this model whereby colored noise with a short correlationtime is added to the reaction rate parameters – we show that when the size and timescale of thenoise is appropriately chosen it accounts for fast reactions that are not explicitly modelled, e.g.,in models with no mRNA description, ﬂuctuations in the protein production rate can accountfor rapid multiple stages of nuclear mRNA processing which precede translation in eukaryotes.We show how the uniﬁed colored noise approximation can be used to derive expressions for theprotein number distribution that is in good agreement with stochastic simulations. We ﬁnd thateven when the noise in the rate parameters is small, the protein distributions predicted by ourmodel can be signiﬁcantly diﬀerent than models assuming constant reaction rates.

Proteins perform a large range of cellular functions and hence it is of great interest to understandhow the genes that produce them operate. Autoregulation is a mechanism to regulate gene expressionwhereby proteins expressed by a certain gene can subsequently bind to the same gene and cause anincrease or a decrease in its expression (positive and negative feedback, respectively) [1]. Autoreg-ulation is common; for example in

E. coli it is estimated that 40% of all transcription factors areself-regulated [2, 3].For almost two decades, it has been known that gene expression is inherently stochastic [4], andas such the modelling of gene regulatory networks must account for this stochasticity. Noise in geneexpression can be split into two contributions: (i) Fluctuations in the number of messenger RNA(mRNA) and proteins due to inherent randomness in the time at which the processes of transcriptionand translation occur. This is often described as intrinsic noise. For example, if a gene is always activeand produces mRNA with rate parameter k then the time between successive transcription events is anexponentially distributed random variable with mean /k . (ii) Fluctuations in the number of mRNAand proteins due to ﬂuctuations in the rate parameters themselves. For example the transcriptionrate k mentioned previously, while often taken to be a constant, it is generally a ﬂuctuating quantityitself, because of ﬂuctuating numbers of polymerases and transcription factors. Another example ∗ These authors contributed equally. † Correspondence: [email protected]. a r X i v : . [ q - b i o . S C ] A p r ould be ﬂuctuations in the protein numbers due to ﬂuctuations in the translation rate stemmingfrom ﬂuctuating numbers of ribosomes in the cell [4, 5]. This noise is often termed extrinsic noise.The division of noise into these two categories is of course artiﬁcial but it is useful from a conceptualand modelling point of view. The simulation of stochastic biochemical processes is most commonlydone using the stochastic simulation algorithm (SSA) [6] which assumes that the rate parameter of areaction will not change in the interval between two successive reaction events. Hence the prevalentmeans of stochastic simulation assumes that noise is either principally intrinsic or that if there isextrinsic noise it is operating on long timescales such that it is slowly varying. While this may bethe case sometimes, it is not generally true. This is because whenever we have an eﬀective reactionthat lumps together a large number of intermediate reactions (a multi-stage reaction process), we aremaking the inherent assumption that these intermediate reactions occur very fast and hence naturallythe eﬀective rate parameter is ﬂuctuating on a fast timescale.Taking into account these ﬂuctuations is however not a simple feat. The chemical master equation(CME, [7, 8]) describing the Markov process simulated by the SSA has been solved exactly or ap-proximately to obtain the protein number distribution in steady-state for a wide variety of models ofautoregulation [9–21], provided the rate parameters are assumed to be constant, i.e., ignoring extrinsicnoise. There are however a number of studies that have analyzed stochastic models with ﬂuctuatingrate parameters, which we summarize next. Modiﬁcations of the linear noise approximation (a type ofFokker-Planck approximation of the CME) incorporating extrinsic noise have proved popular to ap-proximate moments for systems subject to small magnitudes of extrinsic noise with certain properties:(i) for time-independent Gaussian colored noise [22, 23] and (ii) more realistic lognormally distributednoise [24]. Wentzel-Kramers-Brillouin (WKB) methods have also been utilised for cases where thecorrelation time of the colored noise is tending either to zero or to inﬁnity [25]. These methods provideprobability distributions for systems where the noise on the rate parameters is drawn from a negativebinomial distribution, however their analysis does not easily translate to ﬁnding good approximationsfor steady states probability distributions where the correlation time of colored noise is neither smallor large.The focus of the present article is threefold: (i) to provide a general method by which one canobtain analytical expressions for the steady-state protein distributions of auto-regulatory gene circuitswith ﬂuctuating rate parameters, through the use of the uniﬁed colored noise approximation (UCNA)[26], (ii) to use this method to investigate the eﬀects that extrinsic noise of diﬀerent magnitude andtimescales has on auto-regulatory gene expression and (iii) to show how the colored noise formalism canbe used to describe complex models of autoregulation that involve multi-stage protein production andmulti-stage protein degradation. We note that the UCNA was previously utilised in a gene expressioncontext [27] for linear reaction networks that are deterministically monostable and in which there isno feedback mechanism. Our analysis goes further, exploring the addition of colored noise to a non-linear reaction network which expresses deterministic bistability, whilst also incorporating intrinsicﬂuctuations from the core gene expression processes.The structure of our paper is as follows. In Section 2 we introduce the cooperative auto-regulatoryreaction scheme that we will study in this article. We also show that for non-ﬂuctuating rate pa-rameters, the analytical protein distribution given by the chemical Fokker-Planck equation providesan excellent approximation of the protein distribution solution of the CME, in the limit of fast geneswitching. In Section 3 we add colored noise to each reaction in the auto-regulatory reaction scheme(assuming fast gene switching) and use the UCNA to derive the protein number distribution solu-tion of the chemical Fokker-Planck equation. The solution is shown to be in good agreement with astochastic simulation algorithm modiﬁed to account for extrinsic noise on the rate parameters. Wealso use the solution to investigate the eﬀect that extrinsic noise has on the number of modes of theprotein distribution and clarify the limits of the UCNA derivation, including the three main conditionswhich cause it to breakdown. In Section 4 we extend the analysis to the limit of slow gene switchingby introducing a conditional version of the UCNA. In Section 5 we show two examples of how one cansuccessfully model complex auto-regulatory systems by means of simpler ones with colored noise on thereaction rate parameters, here done for multi-stage protein production and multi-stage degradation.2e conclude in Section 6 with a discussion of our results and further problems to be addressed on thistopic. We consider the reaction scheme for a genetic non-bursty cooperative feedback loop, where for sim-plicity we neglect the presence of mRNA: G r u −→ G + P, G ∗ r b −→ G ∗ + P, G + 2 P s b − (cid:42)(cid:41) − s u G ∗ , P d −→ ∅ . (1)The reactions G r u −→ G + P and G ∗ r b −→ G ∗ + P model the production of protein in each gene state, G + 2 P s b − (cid:42)(cid:41) − s u G ∗ models the binding and unbinding of the gene to the proteins (with cooperativity 2),and P d −→ ∅ models the dilution/degradation of proteins inside the cell. It is assumed there is only onegene present in the system and hence we are either in state G or G ∗ at any one time. Before consideringthe addition of colored noise to the reaction rate parameters above, we ﬁrst consider the solution withconstant rate parameters to provide a reference point for approximations made in Section 3, and toclarify the approximation of a CME by a one variable chemical Fokker-Planck equation (FPE).The CME for the reaction scheme in Eq. (1) does not have a known exact solution, even at steady-state for constant reaction rate parameters, and so approximations are necessary. Note that in whatfollows, we will use the terminology “reaction rate parameters” and “rates” interchangeably. We ﬁrstconsider the limit of fast gene switching – i.e., the frequency of gene activation and inactivation eventsis much larger than the frequency of any other reaction in the system. Later in Section 4 we will discussapproximations for the slow switching limit. Where [ g ∗ ] and [ g ] are the deterministic mean number ofbound and unbound gene respectively and [ n ] is the mean protein number, the rate equations for thereaction scheme in Eq. (1) are: d [ g ] dt = s u [ g ∗ ] − s b Ω [ g ][ n ] , (2) d [ n ] dt = 2 s u [ g ∗ ] − s b Ω [ g ][ n ] + r u [ g ] + r b [ g ∗ ] − d [ n ] , (3)where [ g ] + [ g ∗ ] = 1 . For clarity we state the units of each rate parameter: r u , r b , d and s u have unitsof s − , and s b has units of Volume · s − . This ensures a matching of the units with the left handside of Eqs. (2–3), which has units of s − . In the fast switching limit, the gene rapidly equilibrates toquasi-steady state conditions, i.e. d [ g ] /dt ≈ d [ g ∗ ] /dt ≈ and hence the deterministic rate equation formean protein number reduces to a much simpler form: d [ n ] dt = Lr u + r b ([ n ] / Ω) L + ([ n ] / Ω) − d [ n ] , (4)where L = s u /s b . Note that the reaction scheme here described exhibits deterministic bistability oversome regions of the parameter space. This equation is consistent with a birth-death process whereproteins are produced via a zeroth-order reaction (which is dependent on the number of proteins) andare destroyed by a ﬁrst-order reaction [28]. The CME for this reduced process is given by: dP a ( n, t ) dt = T + ( n − P a ( n − , t ) + T − ( n + 1) P a ( n + 1 , t ) − ( T + ( n ) + T − ( n )) P a ( n, t ) , (5)where P a ( n, t ) is the probability that at a time t there are n proteins in the system; T + ( n ) and T − ( n ) are the propensities of protein production and degradation respectively. The subscript a denotes that3his is the probability for the reduced system, an approximate solution to the master equation of thefull system. T + ( n ) dt is the probability, given n proteins are in the system, that a protein productionreaction occurs, increasing the protein number of the system by 1, in the time interval [ t, t + dt ) .Similarly, T − ( n ) dt is the probability, given n proteins are in the system, that a protein degradationevent occurs, decreasing the protein number by 1, in the time interval [ t, t + dt ) . These propensitiesare given by: T + ( n ) = r u L + r b ( n/ Ω) L + ( n/ Ω) , (6) T − ( n ) = d n. (7)These propensities are deduced directly from the form of the eﬀective rate equation in Eq. (4).Essentially, the probability for a particular reaction per unit time is taken to be the same as thereaction rate in the eﬀective deterministic rate equation with [ n ] replaced by n . We emphasise thatwhile this appears to be a heuristic rule with no apparent fundamental microscopic basis, it has beenshown that the reduced master equation based on it provides an accurate approximation to the SSA ofthe full reaction system in fast gene switching conditions provided the low protein number states arerarely visited [14, 28] .The exact steady state solution of the one variable master equation given by Eq. (5) can be foundusing standard methods [29]: P a ( n ) = P a (0) n (cid:89) z =1 T + ( z − T − ( z ) , (8)where P a (0) is the steady state probability evaluated at n = 0 (acting eﬀectively here as a normalisationconstant). We can further approximate the reduced master equation in Eq. (5) by the chemical FPE(via the Kramers-Moyal expansion) [29, 30]: ∂P ( n, t ) ∂t = − ∂∂n (cid:2) a ( n ) P ( n, t ) (cid:3) + 12 ∂ ∂n (cid:2) a ( n ) P ( n, t ) (cid:3) , (9)where a ( n ) and a ( n ) are the ﬁrst two jump moments, given by a ( n ) = T + ( n ) − T − ( n ) and a ( n ) = T + ( n ) + T − ( n ) respectively, and P ( n ) denotes the FPE solution (a notation used throughout thepaper). The purpose of this further approximation by means of a FPE will be made clear in Section3.1. Eq. (9) has a steady state solution of the form [8]: P ( n ) = NT + ( n ) + T − ( n ) exp (cid:16) (cid:90) n T + ( z ) − T − ( z ) T + ( z ) + T − ( z ) dz (cid:17) , (10)where N is a normalisation constant. Although the integral in the exponent of Eq. (10) can be solvedexactly with propensities of the form of Eq. (6) and Eq. (7) since it is the integral of the ratio of twocubic polynomials, the solution is too complicated to be detailed here. The approximations made bythe FPE approximation are that (i) ﬂuctuations in the protein number are small and (ii) we are inthe fast switching regime between the gene states. Fig. 1 compares the FPE solution Eq. (10) withthe solution of the heuristic CME in Eq. (8) and the solution of the full CME of the reaction schemein Eq. (1) using the ﬁnite space projection method (FSP) [31]. Note that provided the state space istruncated large enough, the FPE solution matches the solution of the heuristic CME almost exactly.Clearly, when gene switching is fast (bottom plot of Fig. 1) all three solutions agree with each other.However, when gene switching is not fast (top and middle plots on Fig. 1) both the reduced CME andFPE solutions are a poor approximation of the true distribution from FSP.4 Figure 1:

Comparison of the heuristic reduced master equation solution from Eq. (8) (red dots), the FPE solutionfrom Eq. (10) (black line) and the solution of the full cooperative network using FSP (green shaded region). Sharedparameters in each plot are r u = 50 , r b = 400 , Ω = 200 and d = 1 . The FSP gives the exact solution for a truncatedstate space chosen such that the neglected probability mass is negligible. The top plot shows distributions for the case s u = s b = 5 , where clearly the heuristic master equation and FPE solutions are a poor approximation of the FSP. Themiddle plot shows distributions for the case s u = s b = 50 where we can observe a convergence of the heuristic masterequation and FPE solutions towards the FSP solution. The bottom plot shows excellent agreement of the FSP with theheuristic master equation and FPE solutions for fast switching where s u = s b = 5 × . Fluctuating rate parameters can be used to include a description of processes not explicitly taken intoaccount in the formulation of a model. In Fig. 2 we illustrate this idea. In this section, we addﬂuctuations to the rate parameters of the FPE description derived earlier and use the UCNA to obtaina new eﬀective FPE that is valid when the timescale of the noise on the rates is either very small orvery large.

We begin by considering the case of a ﬂuctuating degradation rate. These ﬂuctuations could forexample stem from details of the degradation machinery that are not explicitly described in the model,e.g multi-step degradation mediated by enzymatic reactions.The equivalent Langevin equation to the chemical Fokker-Planck equation from Eq. (9) using the5

RNARibosomes + Enzyme

Complex

Figure 2:

Illustration of the cooperative auto-regulatory reaction scheme, with colored noise included on each individualreaction. For the case of non-ﬂuctuating rates explored in Section 2 the noise terms, η i , on the rate parameter can beset to zero. Where colored noise is included in Section 3 these noise terms are not set to zero. The addition of noise ontorate parameters can be thought of as accounting for processes that are not explicitly included in the gene expressionmodel. Here we show two examples, where colored noise on the rate parameters of the reduced model can be usedto account for mRNA number ﬂuctuations during protein translation, or the degradation of proteins via an enzymecatalytic mechanism. propensities from Eqs. (6) and (7) is given by [8, 30]: dndt = r u L + r b ( n/ Ω) L + ( n/ Ω) − d n + (cid:115) r u L + r b ( n/ Ω) L + ( n/ Ω) + d n · Γ( t ) , (11)where Γ( t ) is Gaussian white noise with zero mean and correlator (cid:104) Γ( t )Γ( t (cid:48) ) (cid:105) = δ ( t − t (cid:48) ) . Now weintroduce a ﬂuctuating degradation rate by setting d = d (1 + η ( t )) , where η ( t ) is Gaussian colorednoise with a mean of zero and correlator (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = ( D/τ ) exp( −| t − t (cid:48) | /τ ) [26, 32]. Here, τ is the correlation time of the colored noise, D/τ is the noise strength (the variance of ﬂuctuations) and d isthe mean degradation rate. Since D/τ is the noise strength, i.e., D scales the noise strength at constant τ , we occasionally refer to D itself as the noise strength (where τ is a ﬁxed parameter). In the limitof τ → colored noise becomes white noise since lim τ → (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = Dδ ( t − t (cid:48) ) . Note that | η ( t ) | (cid:28) such that d is a positive quantity (and hence admits physical interpretation as a rate parameter). Theinclusion of colored noise can be shown to be equivalent to the following two component system [26]: dndt = r u L + r b ( n/ Ω) L + ( n/ Ω) − d (1 + η ) n + (cid:115) r u L + r b ( n/ Ω) L + ( n/ Ω) + d n · Γ( t ) , (12) dηdt = − τ η + 1 τ θ ( t ) , (13)where θ ( t ) is Gaussian white noise with zero mean and correlator (cid:104) θ ( t ) θ ( t (cid:48) ) (cid:105) = 2 Dδ ( t − t (cid:48) ) , and thetime dependence on the protein number n ( t ) and noise η ( t ) is suppressed for notational convenience.Note that in the argument of the square root above we have replaced η ( t ) by its mean of zero; thisconstitutes a mean-ﬁeld type of approximation, and is necessary such that one can solve Eqs. (12)-(13)6nalytically – however, where the noise is small this is generally a very good approximation. Note thatwe also use this mean ﬁeld assumption in Sections 3.2 and 3.3. For transparency, we rewrite Eqs.(12)-(13) as: dndt = h ( n ) + g ( n ) η + g ( n )Γ( t ) , (14) dηdt = − τ η + 1 τ θ ( t ) , (15)with h ( n ) = r u L + r b ( n/ Ω) L + ( n/ Ω) − d n, (16) g ( n ) = − d n, (17) g ( n ) = (cid:115) r u L + r b ( n/ Ω) L + ( n/ Ω) + d n. (18)In order to approximately solve Eqs. (14)–(15) we next utilize the UCNA to obtain reduced Langevinequations when the noise η is either very fast or very slow. For completeness, we present a non-rigorous but intuitive proof of the UCNA along the lines found in [26] which essentially consists of adirect adiabatic elimination on the stochastic diﬀerential equations (SDEs) in Eqs. (14)–(15). For amore rigorous derivation of a UCNA-like FPE we advise reader to read the seminal work of Fox, whointroduced a functional calculus approach to the study of colored noise SDEs [33–36]. A review of thediﬀering UCNA-like derivations can be found in [37].It has been discussed in [26, 37] that the adiabatic elimination we employ below is exact for τ → (white noise) or τ → ∞ (highly correlated noise) but that it should give a useful approximation forintermediate values of τ . We note that the theory provided by Roberts et al. [25] does not providesuch a result as they consider separately the cases of τ → and τ → ∞ . For us, the limit of τ → ∞ isnot of biological interest, and we will later focus on the limit of τ small, although the derivation shownhere holds for large τ too. First, where we use overdots to represent derivatives with respect to time t , one should proceed in rearranging Eq. (14) for η : η ( n, ˙ n ) = 1 g ( n ) ( ˙ n − h ( n ) − g ( n )Γ( t )) . (19)In what follows we will utilise a mean-ﬁeld approximation (denoted by the subscript mf ) to approxi-mately calculate the time derivative of η ( n, ˙ n ) . We start by deﬁning the mean-ﬁeld approximation of η ( n, ˙ n ) as: η mf ( n mf , ˙ n mf ) = 1 g ( n mf ) ( ˙ n mf − h ( n mf )) . (20)Taking the time derivative with respect to non-dimensional time ˆ t = t/τ (denoted by the overdot) weobtain: ˙ η mf = 1 g ( n mf ) (cid:18) h ( n mf ) g (cid:48) ( n mf ) g ( n mf ) − h (cid:48) ( n mf ) (cid:19) ˙ n mf + τ − g ( n mf ) (cid:18) ¨ n mf − g (cid:48) ( n mf ) g ( n mf ) ˙ n mf (cid:19) , (21)where the prime on each function of n mf denotes the derivative with respect to n mf . In the limit of τ → , the second term on the right hand side of Eq. (21) goes to inﬁnity and hence the only way tokeep the time derivative ﬁnite is to impose the condition: ¨ n mf − g (cid:48) ( n mf ) g ( n mf ) ˙ n mf = 0 . (22)7his then implies that in this limit we have: ˙ η mf ≈ g ( n mf ) (cid:18) h ( n mf ) g (cid:48) ( n mf ) g ( n mf ) − h (cid:48) ( n mf ) (cid:19) ˙ n mf . (23)Note that taking the limit of τ → ∞ gives the same result and hence the approximation Eq. (23) isvalid in both the limit of small and large τ . This can be shown to be self-consistently true; taking thetime-derivative of Eq. (14) alongside a mean-ﬁeld approximation we get, ¨ n mf = ( h (cid:48) ( n mf ) + g (cid:48) ( n mf ) η mf ) ˙ n mf + g ( n mf ) ˙ η mf . (24)Assuming Eqs. (20) and (23) to be true one then recovers ¨ n mf − g (cid:48) ( n mf ) g ( n mf ) ˙ n mf = 0 . (25)In Eq. (15) we can now substitute η from (19) and ˙ η mf for ˙ η from Eq. (23) giving us the UCNA forthe system with colored noise on the degradation rate, which is exact in the limits τ → or τ → ∞ : ˙ n ≈ h ( n ) C ( n, τ ) + 1 C ( n, τ ) ( g ( n ) θ ( t ) + g ( n )Γ( t )) , (26)where C ( n, τ ) = 1 + τ (cid:18) g (cid:48) ( n ) h ( n ) g ( n ) − h (cid:48) ( n ) (cid:19) . (27)Note that we have dropped oﬀ the mf subscript for clarity. Finally, in order to get a simpliﬁed Langevinequation, we modify Eq. (26) such that we only have one eﬀective Gaussian white noise term. Webegin by proposing: g ( n )˜Γ( t ) = g ( n ) θ ( t ) + g ( n )Γ( t ) , (28)where ˜Γ( t ) is Gaussian white noise with mean zero and correlator (cid:104) ˜Γ( t )˜Γ( t (cid:48) ) (cid:105) = 2 δ ( t − t (cid:48) ) , and thenuse relations between the correlators to ﬁnd our unknown g ( n ) . Note that we assume zero correlationbetween Γ( t ) and θ ( t ) , i.e. (cid:104) Γ( t ) θ ( t (cid:48) ) (cid:105) = (cid:104) Γ( t (cid:48) ) θ ( t ) (cid:105) = 0 . Explicitly, utilising the correlators, we ﬁnd: g ( n ) (cid:104) ˜Γ( t )˜Γ( t (cid:48) ) (cid:105) = g ( n ) (cid:104) θ ( t ) θ ( t (cid:48) ) (cid:105) + g ( n ) (cid:104) Γ( t )Γ( t (cid:48) ) (cid:105) , (29)which gives us g ( n ) = (cid:114) Dg ( n ) + 12 g ( n ) . (30)Hence, our ﬁnal reduced Langevin equation is given by: ˙ n = h ( n ) C ( n, τ ) + g ( n ) C ( n, τ ) ˜Γ( t ) , (31)which corresponds to the result in [32]. Note that Eqs. (26) and (31) are identical . Here we pause tomake a couple of comments on C ( n, τ ) , which can be interpreted as a renormalisation of the Langevinequation in Eq. (14) to account for the addition of colored noise to the rate parameters. In fact, when τ = 0 , Eq. (31) recovers the correct Langevin equation for a process with white noise on the rateparameters. One should also note the independence of C ( n, τ ) from the strength of the noise D ; therenormalisation with respect to the addition of colored noise on the degradation rate is not speciﬁc tothe size of the noise, it simply accounts for the ﬁnite correlation time.8he FPE corresponding to this SDE should be chosen in the Stratonovich form, following from[35, 38, 39], as this is the physical implementation of an SDE with colored noise having a non-zerocorrelation time τ . This FPE is: ∂P ( n, t ) ∂t = − ∂∂n (cid:104)(cid:16) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:17) P ( n, t ) (cid:105) + ∂ ∂n (cid:2) ˜ g ( n ) P ( n, t ) (cid:3) , (32)where ˜ h ( n ) = h ( n ) /C ( n, τ ) and ˜ g ( n ) = g ( n ) /C ( n, τ ) . Following Eqs. (9)–(10) in Section 2 and [8], thesteady state solution to this equation is then given by: P ( n ) = N ˜ g ( n ) exp (cid:32)(cid:90) n ˜ h ( z ) + ˜ g ( z )˜ g (cid:48) ( z )˜ g ( z ) dz (cid:33) = N ˜ g ( n ) exp (cid:32)(cid:90) n ˜ h ( z )˜ g ( z ) dz (cid:33) , (33)where N is the normalisation constant, chosen over the domain n ∈ [0 , ∞ ) .To test the accuracy of the distributions for colored noise provided by the UCNA in Eq. (33), wecompare the UCNA solution to a distribution produced from a modiﬁed SSA that explicitly accountsfor the colored noise on the degradation rate. This modiﬁcation is given in full detail in Appendix A.Essentially, the dilution/degradation reaction P → ∅ is replaced by three new reactions alongside theintroduction of a ghost species Y , these being (i) ∅ − (cid:42)(cid:41) − Y and (ii) P + Y → Y . The rates of thesenew reactions are then chosen to ensure the magnitude of eﬀective external noise on the degradationreaction, due to ﬂuctuations in molecule numbers of the ghost species, match the colored noise SDEgiven in Eq. (13).Fig. 3 shows steady state probability distributions produced by the UCNA for various values of D for a deterministically bistable set of parameters. The UCNA correctly captures the shift of theprobability mass from the equilibrium point of higher molecule number (referred to as the upper mode )to the lower equilibrium point (referred to as the lower mode ) as D is increased. Importantly, thisshows that when gene switching is assumed to be fast, colored noise can induce bimodality – one shouldkeep this in mind for when we look at slow gene switching in Section 4. Readers should also note thatthe parameter choices have been selected such that the Fokker-Planck approximation is good, notablythat the system size is large, i.e., Ω (cid:29) , and the mean number of proteins in the system is alsolarge. In all cases (cid:112) D/τ < so that the degradation rate remains positive. The behaviour seen as D increases in Fig. 3 can be explained as follows. When D is small (Fig. 3A) the colored noise η inEq. (15) is also small compared to the mean number of molecules in the system, and the noise cannotforce the system out of the upper mode. As D gets larger (Figs. 3B and 3C) the ﬂuctuations η atthe upper mode also become larger, allowing the system to explore the lower mode. When the systemis found in the lower mode the pre-factor of the coloured noise in Eq. (14), g ( n ) ∝ n , is lesser inmagnitude, and the ﬂuctuations in η are much smaller than when the system inhabits the upper modehence the increased probability mass at the lower mode. That the system is less noisy at the lowermode means that the system struggles more to get a ﬂuctuation large enough to propel it into theupper mode. These properties of the system as D increases can be further seen through (i) the increasein probability mass found at the lower mode as D increases thoroughout all of Fig 3(A–D), and (ii) theincreased probability mass found in the tail of the distribution for large n (Fig. 3D); while the tail isvery slowly decaying it is still exponential and hence the distribution is not heavy-tailed (see the insetof Fig. 3D). This ability to induce bimodality through a more detailed description of the details of thedegradation process is important in the context of cellular decision-making . It is hence possible forregions of the reaction rate parameter space previously thought unable to induce multiple phenotypicstates to do so with an increasing inﬂuence of more complex degradation mechanisms. Note that forthe majority of cases in Fig. 3, the UCNA provides a much better approximation than the white noiseapproximation, hence one cannot simply assume that since the correlation time τ is relatively smallthat it can be approximated as zero.Fig. 4 shows how the UCNA responds to increasing correlation time τ while the noise strength, D/τ ,remains ﬁxed. For all cases where τ is small, the UCNA performs very well. As τ increases however theUCNA starts to predict ever increasing negative probabilities for some values of n . Notably though,9 .0510 -4 -3 -2 Figure 3:

Comparison of the UCNA (black line) from Eq. (33) and white extrinsic noise (UCNA with τ = 0 , dashedblue line) with stochastic simulations using the modiﬁed SSA (red points) of the cooperative reaction scheme in Eq. (1),where the colored noise is added to the degradation rate. Aside from variation in the strength of noise D (shown on eachplot), the shared parameters are r u = 24 , r b = 464 , s b = s u = 1000 , d = 1 , Ω = 200 and τ = 1 . Parameters s b and s u are chosen to be large compared to other system parameters such that the frequency of gene activation and inactivationevents is much larger than the frequency of other reaction events, i.e. the fast gene switching assumption. Note that forthis choice of rate parameters, the rate equations are bistable with equilibrium points at n = 47 . , . . The criterion (cid:112) D/τ < is required to ensure positivity of the degradation rate. As the extrinsic noise is increased, the mass of thedistribution shifts from the mode at 360.4 to the mode at 47.4. The inset of D shows the same distribution but with they-axis on a log scale, emphasising the exponential tail of the distribution for large n . SSA data in each case comes froma single steady state trajectory of × s. Fig. 4B shows that even where signiﬁcant negative probability is predicted at large τ , the UCNAstill manages to capture the rest of the distribution. This negativity of C ( n, τ ) is commented on inboth [26] and [35]. The former deals with this negativity by taking the absolute magnitude of thepre-factor of the exponential in Eq. (33), while the latter comments that the proof of their UCNA-likeFPE is only formally valid where C ( n, τ ) > , ∀ n . Here we choose not to take the magnitude of thepre-factor in Eq. (33), since although this leads to a positive probability for all n it is nonetheless apoor approximation; but we take careful note of the comment made by Fox in [35], as this indicateswhere the UCNA will perform well. The intuition behind the argument of Fox can be stated as: if forsome n , C ( n, τ ) < there must be a transitory value of n for which C ( n, τ ) = 0 , at this point the Eq.(31) becomes physically ill-deﬁned and our solution is invalid.Finally, we observe that the parameter values chosen for both plots in Fig. 4 correspond to de-terministically monostable systems. The bimodality that is observed in Fig. 4 is hence noise inducedbimodality . The mode that appears for small τ corresponds to the deterministic equilibrium point,whereas the noise induced mode does not correspond to an equilibrium point of the deterministic sys-tem. We notice that the ability to exhibit a noise induced mode as τ becomes large is especially truefor monostable parameter sets which are in close proximity to bistable parameter sets in the parameterspace. This can be explained by occasional jumps between the monostable and bistable regimes dueto suﬃciently large ﬂuctuations in the degradation rate. Hence a measure of the distance here is thediﬀerence in the magnitude of d needed such that the system is deterministically bistable divided by10 . . . . . . Figure 4:

Comparison of the UCNA (black line) against the modiﬁed SSA (colored dots) as the correlation time τ is increased at constant noise size D/τ . Note that the y-axis shows P ( n ) /p max , where P ( n ) is deﬁned in Eq. (33)and p max is equal to the maximum value of P ( n ) . (A) Shows the performance of increasing τ for a system withparameters r u = 20 , r b = 250 , d = 1 , s u = 3 × , s b = 10 and Ω = 100 . Deterministically this system ismonostable with an equilibrium point at n = 194 . , however as τ is increased a shift towards a lower mode is observed.When τ is suﬃciently large, the UCNA predicts a negative probability. (B) Shows similar to (A) but with parameters r u = 25 , r b = 480 , d = 1 , s u = 8 × , s b = 10 and Ω = 200 . This too is a deterministically monostable systemwith equilibrium point n = 406 . . As τ increases, the breakdown of the UCNA is more apparent than for (A) withthe prediction of negative probability for small n more drastic. Both (A) and (B) show that unless τ is large, while D/τ is small, the UCNA provides a very good approximation, even where the colored noise induces bimodality indeterministically monomodal systems. SSA data in each case comes from a single steady state trajectory of × s. the noise strength, deﬁned as ∆ d = | d − d c | / ( D/τ ) , where d c is the closest value of the mean degra-dation rate to d expressing bistability. For example, the parameter set chosen in Fig. 4A, althoughmonostable, is very close to a parameter set that exhibits deterministic bistability ( ∆ d = 2 . ). Onthe other hand, the parameter set of Fig. 4B is far from the bistable parameter regime ( ∆ d = 57 . ) –and hence the bimodality shown is very limited as τ becomes large. The reason for this noise inducedbimodality then can be seen by the ability of a system, through ﬂuctuations in the rate parameters, toaccess parameter regimes which in fact do exhibit deterministic bistability. Importantly, even when itseems bimodality is not induced (e.g., Figs. 3A or 4A), using the extremal equation of P ( n ) from [32],i.e., ˜ h ( n ) = ˜ g ( n )˜ g (cid:48) ( n ) , one can show that the UCNA still predicts the presence of two modes. Thisexplanation of the induced bimodality in cooperative autoregulation is further supported by the lack ofnoise induced bimodality when colored noise is included on the degradation rate of the FPE describingnon-cooperative autoregulation; here the UCNA’s extremal equation only ever predicts the existenceof one mode for the probability distribution. We now extend the analysis from Section 3.1 to the eﬀective protein production rates. Colored noise onthe eﬀective production rates can be used to implicitly model multi-step protein production, includingmultiple stages of mRNA processing before translation (see Fig. 2). We add colored noise onto theeﬀective protein production rates via, r u = r (0) u (1 + η ( t )) and r b = r (0) b (1 + η ( t )) , which uponsubstituting in the Langevin equation describing the feedback loop Eq. (11) we obtain the following11et of SDEs: dndt = r (0) u L + r (0) b ( n/ Ω) L + ( n/ Ω) − d n + r (0) u Lη + r (0) b ( n/ Ω) η L + ( n/ Ω) + (cid:115) r (0) u L + r (0) b ( n/ Ω) L + ( n/ Ω) + d n · Γ( t ) , (34) dη dt = − τ η + 1 τ θ ( t ) , (35) dη dt = − τ η + 1 τ θ ( t ) , (36)where θ ( t ) and θ ( t ) are Gaussian white noise terms with zero mean and correlators (cid:104) θ ( t ) θ ( t (cid:48) ) (cid:105) =2 D δ ( t − t (cid:48) ) and (cid:104) θ ( t ) θ ( t (cid:48) ) (cid:105) = 2 D δ ( t − t (cid:48) ) respectively. Note that here we have used a mean ﬁeldapproximation for the terms under the square root, as was done in Section 3.1. In a similar style toEq. (28) we now propose a new noise term ˜ η ( t ) , which couples η ( t ) and η ( t ) , satisfying: F ( n )˜ η ( t ) = f ( n ) η ( t ) + f ( n ) η ( t ) , (37)where f ( n ) = r (0) u L/ ( L + ( n/ Ω) ) , f ( n ) = r (0) b ( n/ Ω) / ( L + ( n/ Ω) ) and ˜ η ( t ) is colored noise withzero mean and correlator (cid:104) ˜ η ( t )˜ η ( t (cid:48) ) (cid:105) = e −| t − t (cid:48) | /τ /τ , satisfying the following equation: d ˜ ηdt = − τ ˜ η + 1 τ θ ( t ) , (38)where θ ( t ) is Gaussian white noise with correlator (cid:104) θ ( t ) θ ( t (cid:48) ) (cid:105) = 2 δ ( t − t (cid:48) ) . The correlators for η ( t ) and η ( t ) are (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = D e −| t − t (cid:48) | /τ /τ and (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = D e −| t − t (cid:48) | /τ /τ , where we have assumedthat the colored noise on both production rates has the same correlation time but a diﬀering magnitudeof noise strength. Using the properties of the correlators of η , η and ˜ η we then ﬁnd: F ( n ) = (cid:112) f ( n ) D + f ( n ) D . (39)Sharing the notation adopted in Section 3.1, we deﬁne the following: h ( n ) = r (0) u L + r (0) b ( n/ Ω) L + ( n/ Ω) − dn, (40) g ( n ) = (cid:115) r (0) u L + r (0) b ( n/ Ω) L + ( n/ Ω) + dn. (41)This gives us the following SDE which is coupled to Eq. (38): dndt = h ( n ) + F ( n )˜ η + g ( n )Γ( t ) . (42)Then, following the same UCNA procedure as in Eqs. (19)–(26), we obtain the following approximateLangevin equation: ˙ n ≈ h ( n ) C ( n, τ ) + 1 C ( n, τ ) ( F ( n ) θ ( t ) + g ( n )Γ( t )) , (43)where C ( n, τ ) = 1 + τ (cid:18) F (cid:48) ( n ) h ( n ) F ( n ) − h (cid:48) ( n ) (cid:19) . (44)In this case it is interesting to note that unlike the case of a ﬂuctuating degradation rate, here C ( n, τ ) does depend on both the correlation time τ and the strength of the colored noise D , D (unless12 = D in which case there is only dependence on τ ). To simplify Eq. (43) further, we againpropose: g ( n )˜Γ( t ) = F ( n ) θ ( t ) + g ( n )Γ( t ) , (45)where ˜Γ( t )˜Γ( t (cid:48) ) = 2 δ ( t − t (cid:48) ) , and ﬁnd using the correlators that g ( n ) = (cid:112) F ( n ) + g ( n ) / . This leadsto the ﬁnal approximate SDE: ˙ n = h ( n ) C ( n, τ ) + g ( n ) C ( n, τ ) ˜Γ( t ) , (46)which is identical in notation to Eq. (31) but where h ( n ) , C ( n, τ ) and g ( n ) are all deﬁned in thissection. The equivalent FPE for this SDE is then: ∂P ( n, t ) ∂t = − ∂∂n (cid:104)(cid:16) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:17) P ( n, t ) (cid:105) + ∂ ∂n (cid:2) ˜ g ( n ) P ( n, t ) (cid:3) . (47)Again our solution for the probability distribution will then be: P ( n ) = N ˜ g ( n ) exp (cid:32)(cid:90) n ˜ h ( z )˜ g ( z ) dz (cid:33) , (48)with ˜ h ( n ) = h ( n ) /C ( n, τ ) and ˜ g ( n ) = g ( n ) /C ( n, τ ) . Figure 5:

This ﬁgure shows the agreement of the UCNA on the protein production rates to the modiﬁed SSA (detailedin Section 3.2), also compared to the case of white extrinsic noise ( τ = 0 ). The plots show agreement of the UCNAover the three main qualitative regimes of cooperative autoregulation at large molecule number, showing respectively:(i) monostable positive feedback with parameters r (0) u = 150 , r (0) b = 300 , s u = s b = 10 , d = 1 , D = 0 . , D = 0 . , τ = 0 . , Ω = 100 ; (ii) bistable positive feedback with parameters r (0) u = 24 , r (0) b = 468 , s u = s b = 10 , d = 1 , D = 0 . , D = 0 . , τ = 1 , Ω = 200 ; (iii) monostable negative feedback with parameters r (0) u = 470 , r (0) b = 20 , s u = s b = 10 , d = 1 , D = 0 . , D = 0 . , τ = 1 , Ω = 70 . SSA data in each case comes from a single steady state trajectory of × s. We now describe the modiﬁed SSA that takes into account extrinsic noise on the eﬀective proteinproduction rates. This modiﬁcation replaces the protein production reaction in each gene state, i.e., G k → G k + P where G k represents either G or G ∗ , by three new reactions alongside the introduction ofa ghost species Y k for each gene state. These new reactions are ∅ r − (cid:42)(cid:41) − r Y k and G k + Y k r −→ G k + Y k + P .Utilising the LNA (assuming Y k to be abundant), as was done for colored noise on the degradationrate in Appendix A, one ﬁnds these rates to be r = 1 / ( D k Ω) , r = 1 /τ and r = r (0) k D k Ω /τ , whichensure matching to the colored noise SDE given in Eq. (34), where r (0) k represents r (0) u or r (0) b in G and G ∗ respectively.Figure 5 shows a good performance of the UCNA when compared to the modiﬁed SSA describedabove. This performance is shown for each diﬀering qualitative behaviour expressed by cooperativebimodality, i.e., (i) monostable positive feedback, (ii) bistable positive feedback, and (iii) monostablenegative feedback. In all three plots shown the UCNA matches the modiﬁed SSA well, and clearlyperforms better than if one were to approximate the colored noise with white noise (i.e., τ = 0 ).13e ﬁnd the same qualitative behaviour of the creation and eventual destruction of bimodality (seeFig. 6A(i–iii)) as the noise strengths, D and D , become large for the colored noise on the proteinproduction rates as was found in Fig. 3 for colored noise on the degradation rate. Note that for thechosen parameter set in Fig. 6A that the white noise approximation performs generally very wellcompared to the UCNA. For τ ≤ , the white extrinsic noise approximation can typically performquite well compared to the modiﬁed SSA, but note that this is not always the case (e.g., see Fig. 3). Finally, we apply the UCNA to the case of colored noise added to the binding and unbinding rates ofthe protein to the gene. This could be utilised to implicitly model the eﬀect of multiple gene states inthe transition of G to G ∗ , as has been experimentally and theoretically investigated [40–42], accountingfor DNA looping via distal enhancers or chromatin conformational states. For convenience we deﬁne s b = s (0) b (1 + η ( t )) , s u = s (0) u (1 + η ( t )) and L η = L (cid:18) η ( t )1 + η ( t ) (cid:19) , with L = s (0) u s (0) b . (49)Substituting Eq. (49) in the Langevin equation describing the feedback loop Eq. (11) (and making amean-ﬁeld approximation for the terms under the square root) we obtain the following set of SDEs: dndt = L η r u + r b ( n/ Ω) L η + ( n/ Ω) + (cid:115) r u L + r b ( n/ Ω) L + ( n/ Ω) + d n · Γ( t ) , (50) dη dt = − τ η + 1 τ θ ( t ) , (51) dη dt = − τ η + 1 τ θ ( t ) , (52)where θ ( t ) and θ ( t ) are Gaussian white noise terms with zero mean and correlators (cid:104) θ ( t ) θ ( t (cid:48) ) (cid:105) =2 D δ ( t − t (cid:48) ) and (cid:104) θ ( t ) θ ( t (cid:48) ) (cid:105) = 2 D δ ( t − t (cid:48) ) respectively. In order to proceed using the UCNA we mustlinearise the drift term in Eq. (50) with respect to η and η through the small noise approximation η , η (cid:28) : dndt ≈ r u L + r b ( n/ Ω) L + ( n/ Ω) − d n + (cid:18) L n Ω ( r u − r b )( L Ω + n ) (cid:19) ( η − η ) (53) + (cid:115) r u L + r b ( n/ Ω) L + ( n/ Ω) + d n · Γ( t ) . For convenience we now deﬁne: h ( n ) = r u L + r b ( n/ Ω) L + ( n/ Ω) − d n, (54) g ( n ) = L n Ω ( r u − r b )( L Ω + n ) , (55) g ( n ) = (cid:115) r u L + r b ( n/ Ω) L + ( n/ Ω) + d n, (56) F ( n ) = g ( n ) (cid:112) D + D , (57) g ( n ) = (cid:112) F ( n ) + g ( n ) / . (58)14 igure 6: Plots showing the creation and eventual destruction of bimodality in the probability distributions for colorednoise on the (A) protein production rates, (B) binding/unbinding rates (denoted on the ﬁgure by (un)binding rates),analogously to what was observed in Fig. 3 for colored noise on the degradation rate. For A it is clear that the UCNAperforms well where the noise strength is both small in A(i) and large in A(iii). For B we see that the low (B(i)) andintermediate (B(ii)) noise cases are well predicted by the UCNA and white noise approximation, however where the noisebecomes large (B(iii)) the UCNA breaks down, whereas the white noise approximation still performs well compared tothe modiﬁed SSA prediction. Other than the noise strengths given on the ﬁgure, the parameters for both A and B are r (0) u = 24 , r (0) b = 468 , s u = s b = 10 , d = 1 , τ = 1 , Ω = 200 (i.e., the same parameters used in Fig. 3 which expressdeterministic bistability). SSA data in each case comes from a single steady state trajectory of × s. In terms of these new functions Eq. (53) becomes, dndt = h ( n ) + g ( n )( η − η ) + g ( n )Γ( t ) . (59)Following Section 3.2 we then arrive at the UCNA for colored noise on the binding rates where η , η (cid:28) : dndt = h ( n ) C ( n, τ ) + 1 C ( n, τ ) ( F ( n ) θ ( t ) + g ( n )Γ( t )) . (60)Then, using the properties of the correlators of θ ( t ) and Γ( t ) we arrive at: dndt = h ( n ) C ( n, τ ) + g ( n ) C ( n, τ ) ˜Γ( t ) , (61)where, C ( n, τ ) = 1 + τ (cid:18) g (cid:48) ( n ) h ( n ) g ( n ) − h (cid:48) ( n ) (cid:19) , (62)and ˜Γ( t ) is Gaussian white noise with mean zero and correlator (cid:104) ˜Γ( t )˜Γ( t (cid:48) ) (cid:105) = 2 δ ( t − t (cid:48) ) . Here, as forthe UCNA applied to the degradation rate, C ( n, τ ) is again independent of the strengths of the colorednoise terms. This UCNA, as we shall see, should be a good approximation where both D and D aresmall – by ‘small’ we explicitly mean that D and D should be smaller than noise strengths used onthe UCNA for protein production rates or the degradation rate. The solution to Eq. (61) is given by: P ( n ) = N ˜ g ( n ) exp (cid:32)(cid:90) n ˜ h ( z )˜ g ( z ) dz (cid:33) , (63)15ith ˜ h ( n ) = h ( n ) /C ( n, τ ) and ˜ g ( n ) = g ( n ) /C ( n, τ ) .Now we evaluate the performance of the UCNA on the binding and unbinding rates, and compareit with the modiﬁed SSA. In this case the modiﬁed SSA replaces the binding and unbinding reactions, G + 2 P s b − (cid:42)(cid:41) − s u G ∗ , by the following: ∅ r − (cid:42)(cid:41) − r Y , G + Y + 2 P r −→ Y + G ∗ , ∅ r − (cid:42)(cid:41) − r Y , and G ∗ + Y r −→ G + Y + 2 P , where Y and Y are ghost species. The rates of these reactions are determined via theLNA (assuming the ghost species to be numerous) and are r = 1 / ( D Ω) , r = 1 /τ , r = s (0) b D Ω /τ , r = 1 / ( D Ω) , r = 1 /τ and r = s (0) u D Ω /τ .In Fig. 6B we test the UCNA on the binding and unbinding rates compared to the modiﬁed SSAdescribed above. Clearly, the same qualitative behaviour of the creation and destruction of bimodality,as noise strength is increased, is observed, as was also observed for colored noise on the degradationrate (Fig. 3) and protein production rates (Fig. 6A). The resultant expression of bimodality however,is clearly diﬀerent than for these cases. Notably, this UCNA does ascribe to an additional limitationcompared to the UCNA of degradation or production rates; a limitation due to the further small noiseapproximation made in Eq. (53). This limitation is seen in Fig. 6B(iii), showing that the UCNAapplied to the binding and unbinding rates is much more sensitive to increased noise strength than theother UCNA applications. One also observes that the white noise approximation in Fig. 6B performsalmost as well as the UCNA (Figs. 6B(i–ii)) or in some cases better than the UCNA (Fig. 6B(iii));hence, the white noise approximation may be a safer approximation than the UCNA for colored noiseapplied to the binding and unbinding rates. Having now applied the UCNA to approximate distributions for colored noise on the (i) degradationrate, (ii) protein production rates and (iii) the binding/unbinding rates, we now assess the conditionswhich cause the UCNA to breakdown. The application of the UCNA to colored noise on the proteinproduction rates presents a somewhat more complex problem than the application of the UCNA tocolored noise on the degradation rate or the binding/unbinding rates; hence, we more easily see thatthere are three main conditions for the breakdown of the UCNA – conditions beside the large systemsize or large molecule number requirement needed to approximate the discrete master equation by aone variable FPE. Below we detail these three conditions, in each case explaining why the disagreementoccurs. Note that although the analysis of breakdown conditions below is done for the UCNA on theprotein production rates, the same arguments hold for the other applications of the UCNA previouslypresented.

The ﬁrst of these conditions concerns the positivity condition required on C ( n, τ ) , that is C ( n, τ ) > ∀ n . We refer to this as condition 1 . Since we have already discussed this condition in a previoussection we will not repeat the discussion here, and refer the reader to Section 3.1. In Fig. 7A(i) we seea disagreement between the UCNA and the modiﬁed SSA for a parameter set that exhibits bimodality,and in Fig. 7A(ii) it is veriﬁed that this since C ( n, τ ) < where n ≈ . Note however, that if C ( n, τ ) becomes negative outside of the region containing most of the probability mass that the UCNA canstill provide a good approximation to the true modiﬁed SSA solution. The second condition observed for the breakdown of the UCNA concerns the violation of the char-acteristic ‘length’ scale (the length here being a distance measure in the n space), which we nowdiscuss. In Appendix B we show in more detail why the arguments we present below hold. Based onthe noise intensity of the noise term arising from the colored noise in Eq. (43), we can introduce the16 igure 7: This ﬁgure shows the disagreement of the UCNA on the protein production rates to the ground truth modiﬁedSSA predictions (detailed in Section 3.2), also compared to the case of white extrinsic noise ( τ = 0 ). Each disagreementcorresponds to a single breakdown condition of the UCNA being violated. The legend in A(i) applies to A(i), B(i) andC(i). Plots in A show the breakdown of the UCNA due to condition 1. A(i) shows the prediction of negative probabilitydue to the negativity of C ( n, τ ) in A(ii) around the same value of n . Parameters for A are r (0) u = 20 , r (0) b = 470 , s u = s b = 10 , d = 1 , D = 1 , D = 0 . , τ = 10 , Ω = 200 . Plots in B show the breakdown of the UCNA due tocondition 2. B(ii) shows that κ ( n, τ ) > C ( n, τ ) over a large range of n , corresponding to the poor UCNA predictionseen in B(i) over this entire region. Parameters for B are r (0) u = 50 , r (0) b = 450 , s u = s b = 10 , d = 1 , D = 1 , D = 1 , τ = 1 , Ω = 100 . Plots in C show the breakdown of the UCNA due to condition 3. C(ii) shows a relatively large valueof γ ( n ) over most of the deﬁned region D , and also shows the the pre-factors of the total UCNA noise ˜ g ( n ) and thatarising only from the colored noise ˜ F ( n ) = F ( n ) /C ( n, τ ) . Vertical orange lines in C(i) indicate the limits of the region D . The value ρ in the top right-hand corner of C(i) can be compared to the smaller values of ρ seen for other parametersets in A(i) and B(i), indicating that the breakdown observed is truly associated to condition 3. The parameters for Care r (0) u = 2300 , r (0) b = 120 , s u = s b = 10 , d = 1 , D = 0 . , D = 0 . , τ = 2 , Ω = 230 . SSA data in each case comesfrom a single steady state trajectory of × s. characteristic length scale L over which ﬂuctuations in the colored noise term are damped: L ( n, τ ) = F ( n ) C ( n, τ ) , (64)noting that the requirement of condition 1 means that this length is always positive. Our approximateone variable FPE in Eq. (47) will then be valid under the condition that the drift term varies slowlywith respect to L (following Appendix B), meaning that one needs to satisfy L (cid:12)(cid:12)(cid:12) ∂ n (cid:16) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:17)(cid:12)(cid:12)(cid:12) (cid:28) (cid:12)(cid:12)(cid:12) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:12)(cid:12)(cid:12) (65)in order for the UCNA to hold. More succinctly, this condition is: C ( n, τ ) (cid:29) κ ( n, τ ) , (66)where we henceforth deﬁne the function κ ( n, τ ) = F ( n ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ n (cid:16) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:17) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . (67)We refer to Eq. (66) as condition 2 . In Fig. 7B we explore this breakdown for a parameter set thatbreaks condition 2 over a large region of the parameter space, between < n < . Clearly the17CNA is provides a poor approximation in this regime; note however that, similar to condition 1, ifcondition 2 is violated (i) outside of the domain where most of the probability mass is contained, or(ii) over a small region of the domain containing most of the probability mass, then the UCNA canstill provide a good approximation. The ﬁnal condition resulting in the breakdown of the UCNA concerns the underestimation of noise.We refer to this as unaccounted peak noise , and this forms our ﬁnal breakdown condition, condition 3 .The explanation behind condition 3 is that the UCNA in general will always underestimate the Poissonnoise for a particular value of n , arising from the necessary neglection of Poisson noise terms in thederivation of the UCNA: (i) neglection of the noise terms under the square root of the Poisson noisepre-factor in Eqs. (34) (a form of mean-ﬁeld approximation), and (ii) neglection of Poisson noise term g ( n )Γ( t ) and its time derivative from the ˙ η term in Eqs. (19–21) via the use of another mean-ﬁeldapproximation. However, the error on the UCNA caused by condition 3 will be small when colorednoise dominates the Poisson noise. To investigate the degree to which colored noise is dominant,identifying F ( n ) /C ( n, τ ) from Eq. (42) and g ( n ) /C | ( n, τ ) from Eq. (46), we deﬁne γ ( n ) = (cid:12)(cid:12)(cid:12)(cid:12) g ( n ) /C ( n, τ ) − F ( n ) /C ( n, τ ) g ( n ) /C ( n, τ ) (cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12) g ( n ) − F ( n ) g ( n ) (cid:12)(cid:12)(cid:12)(cid:12) (68)where, for some n , if γ ( n ) ≈ then Poisson noise dominates, else if γ ( n ) ≈ then colored noisedominates. Intermediate values of γ ( n ) mean that both Poisson and colored noise is apparent in thesystem. To investigate whether noise is underestimated generally over the region containing most ofthe probability, deﬁned as D = [ n min , n max ] , we further deﬁne ρ = 1 |D| (cid:90) n max n min γ ( n ) dn. (69)Here, if ρ ≈ then Poisson noise dominates over the entire region D , else if ρ ≈ then colored noisedominates over the entire region D . Fig. 7C explores this disagreement, where Fig. 7C(i) showsthe clear underestimation of noise in the UCNA distribution when compared to the modiﬁed SSAdistribution. Sample values of n min and n max are also shown on Fig. 7C(i). Fig. 7C(ii) shows how thetotal UCNA noise ˜ g ( n ) varies with respect to the contribution of colored noise ˜ F ( n ) = F ( n ) /C ( n, τ ) .Also shown on Fig. 7C(ii) is the variation of γ ( n ) . Values of ρ are shown in the top right-hand cornerfor all probability distributions in Fig. 5; unlike the other distributions shown in Fig. 7, in Fig. 7C(i) ρ (cid:54)≈ does not hold, clarifying that the reason for the UCNA’s disagreement for this parameter set isdue to condition 3. τ UCNA distributions

Having successfully identiﬁed the three main conditions causing the breakdown of the UCNA, we arenow able to determine where the UCNA will perform well, even in the large τ limit . In Figure 8 weexplore an example of the UCNA performing exceptionally well for τ = 10 (see Fig. 8(i)). Clearlythe UCNA does not violate any of the three conditions here: (1) C ( n, τ ) is not negative in D (see Fig.8(ii)); (2) C ( n, τ ) (cid:29) κ ( n, τ ) in D (again, see Fig. 8(ii)); (3) γ ( n ) is small for all n in D (see Fig. 8(iii))as evidenced by the small value of ρ = 0 . . As expected, the prediction of white noise on the proteinproduction rates is very poor in the regime where τ (cid:29) . In the previous sections we have focused on fast gene switching, whereby a Hill function can then beused to approximate the production of proteins from two diﬀerent gene states, shown in the reaction18 igure 8:

Plots showing a good performance of the UCNA for τ = 100 . (i) Shows the probability distributions fromthe modiﬁed SSA, UCNA and white noise approximation. Ther vertical orange lines show the limits of the region D inthis case. (ii) Shows that in the region D that condition 1 is satisﬁed since C ( n ∈ D ) > , and condition 2 is satisﬁedsince C ( n ∈ D , τ ) (cid:29) κ ( n ∈ D , τ ) . (iii) Shows that condition 3 is satisifed in D , i.e., γ ( n ) ≈ , since ˜ g ( n ) ≈ ˜ F ( n ) , whichis corroborated by the small value of ρ shown in (i). Other parameters here are r (0) u = 10 , r (0) b = 400 , s u = s b = 10 , d = 1 , D = 0 . , D = 1 , Ω = 70 . SSA data in each case comes from a single steady state trajectory of × s. scheme of Eq. (1). We now consider the case where the switching rates s u and s b are very small; smallenough that the system has two dominant modes of behaviour, one pertaining to each gene state. Theapproach followed here is very similar to the conditional linear noise approximation (cLNA) studiedin [43], but instead of approximating the distribution conditional on each gene state as a Gaussian weinstead utilise the UCNA in each gene state. We shall refer to this method as the conditional UCNA (cUCNA). We begin by stating Bayes’ theorem for the marginal distribution of proteins that we areinterested in approximating: P ( n, t ) = (cid:88) G P ( G i , t ) P ( n | G i , t ) . (70)Here G is the set of possible gene state (in our case G = { G, G ∗ } ), P ( G i , t ) is the marginal distributionof being in gene state G i at a time t and P ( n | G i , t ) is the conditional probability of having n proteinsat a time t given that the system is in state G i . Our task now is to ﬁnd suitable approximationsfor P ( G i , t ) and P ( n | G i , t ) that allow us then to construct an approximation of the full steady statedistribution in Eq. (70). In our case we have two diﬀerent gene states, G and G ∗ , and hence we canconstruct the reaction schemes conditional on each gene state. The reaction scheme conditional ongene state G is (i) G r u −→ G + P, P d −→ ∅ , and the reaction scheme conditional on gene state G ∗ is (ii) G ∗ r b −→ G ∗ + P, P d −→ ∅ . This then allows us to approximately ﬁnd the steady state mean number ofproteins conditional on each gene state when s u and s b are very small (where the subscript a denotesapproximate): (cid:104) n | G (cid:105) a = r u /d and (cid:104) n | G ∗ (cid:105) a = r b /d . We can use these conditional means to ﬁnd themarginal probabilities of being in a speciﬁc gene state at steady state. Note that in this calculation wewill ignore the inﬂuence of noise on the rate parameters; the inherent assumption is that extrinsic noisedoes not much inﬂuence the probability of being in each gene state. First we write an approximatemaster equation for the transitions between diﬀering gene states: ddt P ( G, t ) ≈ s u P ( G ∗ , t ) − s b (cid:104) n | G (cid:105) a Ω P ( G, t ) . (71)We can then solve the above equation at steady state (denoted by the subscript s ) by utilising conser-vation of probability, P s ( G ) = 1 − P s ( G ∗ ) , giving: P s ( G ∗ ) = (cid:32) s u s b (cid:18) d · Ω r u (cid:19) (cid:33) − , (72) P s ( G ) = (cid:18) s b s u (cid:16) r u d · Ω (cid:17) (cid:19) − . (73)19ince now we have the P s ( G i ) needed for Eq. (70) we need to ﬁnd the P s ( n | G i ) terms. Here we showhow to calculate these terms for noise on the degradation rate, although this can be easily extendedto the case where we have noise on the protein production rates. In each gene state, the system weare concerned to study is G i r i −→ G i + P, P d −→ ∅ , where G i and r i represent either gene state G or G ∗ and the corresponding production rate r u or r b respectively. Adding colored noise to the degradationrate, d = d (1 + η ) , we then have the following set of SDEs in each gene state (here we have appliedthe mean ﬁeld approximation to the terms in the square root): dndt = r i − d n − ( d n ) η + (cid:112) r i + d n · Γ( t ) , (74) dηdt = − τ η + 1 τ θ ( t ) , (75)where Γ( t ) and θ ( t ) are Gaussian white noise terms, each with zero mean and correlators (cid:104) Γ( t )Γ( t (cid:48) ) (cid:105) = δ ( t − t (cid:48) ) and (cid:104) θ ( t ) θ ( t (cid:48) ) (cid:105) = 2 Dδ ( t − t (cid:48) ) respectively. Processing the usual steps of the UCNA method,detailed explicitly in Section 3, we ﬁnd the approximate steady state probability for each gene state: P s ( n | G i ) ≈ N exp ( u ( n, r i )) n r i τ − ( r i + d n (2 d Dn + 1)) − − d D − r i τ ( n + r i τ ) , (76)where N is a normalisation constant and we have deﬁned, u ( n, r i ) = ( D (4 − d τ ) r i + 1) tan − (cid:16) d Dn +1 √ Dr i − (cid:17) d D √ Dr i − . (77)We note that since each gene state is individually considered, each can have it’s own associated noisestrength D and correlation time τ on the degradation rate. Hence, using Eqs. (72)-(73) and (76) wecan now approximate Eq. (70) as: P ( n ) ≈ P s ( G ) P s ( n | G ) + P s ( G ∗ ) P s ( n | G ∗ ) . (78)Figure 9 compares the cUCNA with the modiﬁed SSA – which is the same as the modiﬁed SSAfound in Section 3.1. Fig. 9(i) shows that for small switching rates, the cUCNA can correctly capturethe bimodality exhibited where the colored noise on the degradation rate is small. As the noise on thedegradation rate gets larger the cUCNA still provides a decent approximation to the true distributions;it is also clear that the bimodality of the protein distribution is destroyed as the size of this noiseincreases. One can contrast this to the cases observed in Figs. 3 and 6 which showed that wherethe gene switching rates are fast, increased colored noise strength can in fact induce bimodality. Insummary, we ﬁnd that extrinsic noise on the degradation rate of a slow switching auto-regulatory systemgenerally destroys bimodality, but for fast switching it is common to observe the opposite phenomenon.

In this section we explicitly show, by means of two examples, how one can use the colored noise formu-lation that was introduced earlier to describe intricate molecular details of cooperative autoregulation.We ﬁrst show this for multi-stage protein production with fast gene switching, and then for multi-stageprotein degradation with slow gene switching.

The ﬁrst example of using colored noise as a form of model reduction is that of mapping multistageprotein production onto a simpler system, where colored noise accounts for processes not explicitly20

100 200 3000.0000.0050.0100.015

Figure 9:

Comparison of the cUCNA with the modiﬁed SSA for increasing values of the colored noise strength, D . It isseen that the cUCNA (solid lines) is a good approximation to the true distribution (dots, simulated using the modiﬁedSSA described in Appendix A), especially for small values of D . The noise strengths for each plot are (i) D = 0 . , (ii) D = 0 . , (iii) D = 0 . , (iv) D = 0 . , (v) D = 0 . , and the shared parameters are r u = 30 , r b = 75 , s b = 0 . , s u = 0 . , d = 0 . and τ = 1 . Clearly as D gets larger the bimodality exhibited by the slow switching between the gene states isdestroyed by the extrinsic noise added to the degradation rate. considered in the simpler model. Consider multi-stage protein production on the cooperative auto-regulatory feedback loop: G ρ u −→ G + M , G ∗ ρ b −→ G ∗ + M , (79) M i Λ i −→ M i +1 , i ∈ [1 , N − , M N Λ N −−→ ∅ , M N r −→ M N + P,G + 2 P s b − (cid:42)(cid:41) − s u G ∗ , P d −→ ∅ , where it is assumed the system contains only one gene, either in state G or in state G ∗ . The simplermodel that we will then map this system onto the cooperative auto-regulatory feedback loop: G r u −→ G + P, G ∗ r b −→ G ∗ + P, (80) G + 2 P s b − (cid:42)(cid:41) − s u G ∗ , P d −→ ∅ , where r u = r (0) u (1 + η ( t )) and r b = r (0) b (1 + η ( t )) , and assigning the properties of extrinsic noises η ( t ) and η ( t ) such that Eq. (79) can be mapped onto Eq. (80) is the task we have assigned ourselves.One can think of the diﬀerent M i for i < N as the various stages of nascent mRNA, before it iseventually fully transcribed in stage M N (mature mRNA) where it can then begin translation [44].Utilising the slow scale linear noise approximation [18] one can show that if Λ i (cid:29) max { Λ N , ρ u , ρ b } for21 ∈ [1 , N − then the nascent mRNA M , ..., M N − are fast species, and the reaction system in Eq.(79) is consistent with the following reaction scheme describing ﬂuctuations in the slow species G , G ∗ , M N and P : G ρ u −→ G + M N , G ∗ ρ b −→ G ∗ + M N , (81) M N Λ N −−→ ∅ , M N r −→ M N + P,G + 2 P s b − (cid:42)(cid:41) − s u G ∗ , P d −→ ∅ . We now apply the van Kampen ansatz to the number of mature mRNA, M N . In gene state G thisgives us n ( t ) = Ω φ + Ω / (cid:15) ( t ) , and in gene state G ∗ this gives us n ( t ) = Ω φ + Ω / (cid:15) ( t ) , where φ = ρ u / (Λ N Ω) and φ = ρ b / (Λ N Ω) are the steady state solutions to the rate equation describing themature mRNA in the gene states G and G ∗ respectively, and (cid:15) ( t ) and (cid:15) ( t ) describe small ﬂuctuationsabout these means. Note the occurrence of / Ω in φ and φ follows since the concentration of a singlegene in a volume Ω is / Ω . Using these ansatzes allows us to construct the eﬀective protein productionrates in gene states G and G ∗ respectively: r u = r n ( t ) = r ρ u Λ N (cid:18) / Λ N ρ u (cid:15) ( t ) (cid:19) , (82) r b = r n ( t ) = r ρ b Λ N (cid:18) / Λ N ρ b (cid:15) ( t ) (cid:19) . (83)One can then see that r (0) u = r ρ u / Λ N , r (0) b = r ρ b / Λ N and that the noise terms have the form: η ( t ) = Ω / Λ N ρ u (cid:15) ( t ) , (84) η ( t ) = Ω / Λ N ρ b (cid:15) ( t ) . (85)In order to fully specify η ( t ) and η ( t ) we need to ﬁnd the correlators (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) and (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) ,which can be done by application of the linear noise approximation (LNA) [8]. Applying the LNA to n ( t ) and n ( t ) , whose ﬂuctuations are fully speciﬁed by the reactions G ρ u −→ G + M N , G ∗ ρ b −→ G ∗ + M N and M N Λ N −−→ ∅ , gives us the two following one variable FPEs: ∂ Π( (cid:15) , t ) ∂t = Λ N ∂∂(cid:15) ( (cid:15) Π( (cid:15) , t )) + 12 (cid:18) ρ u Ω (cid:19) ∂ Π( (cid:15) , t ) ∂(cid:15) , (86) ∂ Π( (cid:15) , t ) ∂t = Λ N ∂∂(cid:15) ( (cid:15) Π( (cid:15) , t )) + 12 (cid:18) ρ b Ω (cid:19) ∂ Π( (cid:15) , t ) ∂(cid:15) , (87)where Π( (cid:15) i , t ) is the probability of having a ﬂuctuation of size (cid:15) i at a time t . These FPEs, combinedwith Eq. (84) and (85), admit equivalent Langevin equations for η ( t ) and η ( t ) , given by: dη ( t ) dt = Λ N (cid:18) − η ( t ) + (cid:114) ρ u β ( t ) (cid:19) , (88) dη ( t ) dt = Λ N (cid:18) − η ( t ) + (cid:114) ρ b β ( t ) (cid:19) , (89)where β ( t ) and β ( t ) are independent Gaussian white noises with zero mean and correlator (cid:104) β ( t ) β ( t (cid:48) ) (cid:105) = (cid:104) β ( t ) β ( t (cid:48) ) (cid:105) = δ ( t − t (cid:48) ) . From here one can ﬁnd the correlators of η ( t ) and η ( t ) : (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = Λ N ρ u exp ( − Λ N | t − t (cid:48) | ) , (90) (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = Λ N ρ b exp ( − Λ N | t − t (cid:48) | ) . (91)22omparing to the results of Section 3.2 it is clear that η ( t ) and η ( t ) satisfy the deﬁnition of colorednoise, with noise strengths D = 1 /ρ u , D = 1 /ρ b and shared correlation time τ = 1 / Λ N . Thiscompletes the mapping between the full complex system in Eq. (79) and our reduced process in Eq.(80). We can hence utilise our solution for the probability distribution with colored noise on theeﬀective protein production rates in Eq. (48).In Fig. 10A we show how eﬀective the UCNA can be in approximating the protein distributionfrom the full system described in Eq. (79), where we have for simplicity assumed that there are threemRNA states: M , M and M (i.e., N = 3 ). Fig. 10A(i) shows the approximation for a parameterset exhibiting bimodality: the red points represent the standard SSA of the full system in Eq. (79); theblack line represents the distribution predicted from the UCNA (i.e., using Eq. (48) with D = 1 /ρ u , D = 1 /ρ b and τ = 1 / Λ N ); the blue dotted line represents the distribution if one put white noise of thesame magnitude on the protein production rates (i.e., the UCNA at τ = 0 ); and the orange line withcircles shows the distribution if one was to neglect noise on the reaction rates entirely (i.e., r u = r (0) u and r b = r (0) b ). Clearly, in Fig. 10A(i) the UCNA is the only distribution that ﬁts the SSA prediction,showing both the eﬀectiveness of our model reduction as well as the need to properly account forthe correlation time of colored noise in model reduction. This makes sense, since one would expectprocesses occurring in the full system to be correlated over short times, i.e., that noise events in closetemporal proximity are not independent , and one cannot simply neglect these eﬀects. Fig. 10A(ii)instead shows the various approximations for a monomodal parameter set. In this case, white noise isa poor approximation, and it is clear that one cannot neglect the ﬁnite correlation time. However, itis interesting to note that properly accounting for the correlation time using the UCNA returns thesame distribution as if one had not added noise to the production rates at all – this is due to the smallmagnitudes of D /τ and D /τ respectively. These two examples shows that where correlation time isﬁnite, it is imperative that one models it correctly. Proteins in cells are often degraded via multi-step processes. For example, a major degradation pathwayin eukaryotic cells is the ubiquitin-proteasome degradation pathway [45], and more recent experimentshave shown that a subset of proteins in the mammalian proteome have age-dependent degradationrates [46, 47]. From Fig. 2 in [46] we consider a system with two diﬀerent stages of protein withdiﬀering degradation rates combined with the cooperative auto-regulatory feedback loop: G r u −→ G + P , G ∗ , r b −→ G ∗ , + P , (92) P κ −→ P , P d −→ ∅ , P d −→ ∅ ,G + 2 P s b − (cid:42)(cid:41) − s u G ∗ , G + 2 P s b − (cid:42)(cid:41) − s u G ∗ , where G ∗ , indicates either the state G ∗ or G ∗ . This reaction system models age dependent proteinstates, since the protein P is always produced from the gene, and eventually undergoes a transitionto protein state P , where P and P have diﬀering degradation rates. We will show how to map thissystem to the reduced system: G + 2 P s b − (cid:42)(cid:41) − s u G ∗ , G r u −→ G + P, G ∗ r u −→ G ∗ + P, P d −→ ∅ , (93)where the total number of P is given as the sum of the number of P and P , i.e., n = n + n , G ∗ issimply the sum of G ∗ and G ∗ , and d = d (1 + η ( t )) , where η ( t ) is extrinsic noise. Our task is to ﬁndthe properties of the noise η ( t ) such that one can map the full system in Eq. (92) onto the reducedsystem in Eq. (93). Note that although we here look at two diﬀerent stages of protein, the analysispresented below can be easily extended for several diﬀerent stages of protein degradation.23

500 1000 150000.008 200 400 600000 ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo Figure 10: (A) Shows distributions of the standard SSA of the reaction scheme in Eq. (79) for multistage proteinproduction in three intermediate species M , M and M , against (1) the UCNA, (2) white extrinsic noise (i.e., τ = 0 )and (3) no extrinsic noise. Each plot shows the colored noise parameters used for the UCNA solution from Eq. (48),which are determined from the full multi-stage protein production process in Eq. (79). Note the legend in A(ii) appliesonly to distributions in A(i–ii). Parameter values for the standard SSA in A(i) are ρ u = 2 , ρ b = 50 , Λ = 1000 , Λ = 1000 , Λ = 0 . , r = 1 , s u = s b = 1000 , d = 1 and Ω = 230 . Parameter values for the SSA in A(ii) are ρ u = 20 , ρ b = 80 , Λ = 1000 , Λ = 1000 , Λ = 0 . , r = 1 , s u = s b = 1000 , d = 1 and Ω = 100 . (B) Shows distributions of the standard SSA , with protein decay following the multi-step process of Eq. (92), against the cUCNA. The colored noiseparameters used for the cUCNA solution of Eq. (118) are shown on each plot, with these values being determined fromthe full model using Eqs. (99,116,117). The insets show a comparison of the approximate correlator (see Eq. (115))and the full double exponential correlator (see Eq. (114)) for gene state G ∗ . Note the legend in B(ii) applies only todistributions in B(i–ii), and the legend on the inset of B(ii) applies also to the inset in B(i). Parameter values for theSSA in B(i) are r u = 50 , r b = 250 , s b = 2 . × − , s u = 10 − , Ω = 200 , d = 1 , k = 1 and d = 0 . . Parameter valuesfor the SSA in B(ii) are r u = 100 , r b = 400 , s b = 10 − , s u = 10 − , Ω = 200 , d = 1 , k = 1 and d = 5 . SSA datafor A(i) and A(ii) come from a single steady state trajectories of length s and × s respectively. Note that A(i)presents a very long relaxation to the steady state due to the systems inertia in staying in one of the two modes of thedistribution. SSA data for B(i) and B(ii) come from a single steady state trajectory of × s. One ﬁnds that the eﬀective degradation rate of the sum of protein number P and P is: d = n d + n d n + n . (94)In the following analysis we consider gene switching to be slow, which allows us to apply the cUCNAfrom Section 4. We ﬁrst consider the probability of being in each gene state at steady state P s ( G k ) ,where G k represents either gene state G or G ∗ . Note that we assume both protein stages P and P can bind and unbind to the gene at the same respective rates, and note that (cid:104) n | G (cid:105) = (cid:104) n | G (cid:105) + (cid:104) n | G (cid:105) .24ollowing the analysis from Section 4 in Eqs. (72–73) we ﬁnd: P s ( G ∗ ) = (cid:32) s u Ω s b (cid:18) d ( κ + d ) r u ( κ + d ) (cid:19) (cid:33) − , (95) P s ( G ) = (cid:32) s b s u Ω (cid:18) r u ( κ + d ) d ( κ + d ) (cid:19) (cid:33) − . (96)We can now proceed to ﬁnd the probability distribution conditional on each gene state P s ( n , n ) .In gene state G k the conditional reaction system is: G k r k −→ G k + P , P d −→ ∅ , P κ −→ P , P d −→ ∅ , (97)where the protein is always produced in stage P , and r k ≡ r u in gene state G , and r k ≡ r b ingene state G ∗ . Now we employ the van Kampen ansatz [8] on n and n in gene state G k , i.e. n ( k )1 ( t ) = Ω φ ∗ ( k )1 + Ω / (cid:15) ( k )1 ( t ) and n ( k )2 ( t ) = Ω φ ∗ ( k )2 + Ω / (cid:15) ( k )2 ( t ) , where φ ∗ ( k )1 and φ ∗ ( k )2 are thedeterministic steady state mean concentrations of P and P in gene state G k respectively, and (cid:15) ( k )1 ( t ) and (cid:15) ( k )2 ( t ) are ﬂuctuations about these mean values. In the following we drop the superscript ( k ) notation for aesthetic reasons, although one should keep in mind that the process below must beindividually conducted on each gene state. The purpose of using the van Kampen ansatz can be seenupon its substitution into Eq. (94) which for a large system size, Ω , gives: d = d φ ∗ + d φ ∗ φ ∗ + φ ∗ (cid:32) − / (cid:16) (cid:15) ( t ) (cid:16) d d φ ∗ + d φ ∗ − φ ∗ + φ ∗ (cid:17) (98) + (cid:15) ( t ) (cid:16) d d φ ∗ + d φ ∗ − φ ∗ + φ ∗ (cid:17)(cid:17)(cid:33) + O (Ω − ) . By comparing to the eﬀective degradation from the reduced model in gene state G k , d = d (1 + η k ( t )) ,one can see that to match the two models we must have d = ( d φ ∗ + d φ ∗ ) / ( φ ∗ + φ ∗ ) , (99)and η k ( t ) = Ω − / ( (cid:15) ( t ) y ( d , d , φ ∗ , φ ∗ ) + (cid:15) ( t ) y ( d , d , φ ∗ , φ ∗ )) , (100)where we have deﬁned the functions, y ( d , d , φ ∗ , φ ∗ ) = d d φ ∗ + d φ ∗ − φ ∗ + φ ∗ , (101) y ( d , d , φ ∗ , φ ∗ ) = d d φ ∗ + d φ ∗ − φ ∗ + φ ∗ . (102)If the correlators (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) , (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) , (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) and (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) are known, then one can alsoﬁnd the correlator of η k ( t ) , i.e., (cid:104) η k (0) η k ( t ) (cid:105) , given by: (cid:104) η k (0) η k ( t ) (cid:105) = 1Ω (cid:0) y (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) + y (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) + y y ( (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) + (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) ) (cid:1) . (103)Note in Eq. (100) that if d = d = d , then the magnitude of η k ( t ) is zero for all t since the system P κ −→ P , P d −→ ∅ , P d −→ ∅ is equivalent to P d −→ ∅ where one is only interested in the total numberof proteins. 25o proceed in ﬁnding η k ( t ) in Eq. (100), we ﬁrst need to ﬁnd the steady state concentrations φ ∗ and φ ∗ from the deterministic rate equations. These are, dφ dt = r k Ω − ( κ + d ) φ , (104) dφ dt = κφ − d φ , (105)where again the / Ω in Eq. (104) follows since the concentration of a single gene in a volume Ω is / Ω . Enforcing the steady state condition allows us to ﬁnd φ ∗ and φ ∗ , φ ∗ = r k Ω( κ + d ) , φ ∗ = κr k d ( κ + d ) . Note that the linear dependence of φ ∗ and φ ∗ on r k means that the eﬀective degradation rate d fromEq. (99) is independent of the gene state. Assuming that both P and P are numerous, we now proceedto the LNA [8, 48] of the system in Eq. (97), which will allow us to ﬁnd the correlators (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) , (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) , (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) and (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) . Where S is the stoichiometric matrix, φ = ( φ , φ ) and f ( φ ) is the macroscopic rate vector one can computationally ﬁnd the required matrices needed to performthe LNA: (i) the steady state Jacobian matrix A ij = d ( S · f ( φ )) j /dφ i | φ = φ ∗ , and (ii) the steady statediﬀusion matrix ( B · B T ) ij = S · Diag ( f ( φ )) · S T | φ = φ ∗ . The Jacobian matrix then allows us to ﬁnd thetime evolution of both (cid:104) (cid:15) ( t ) (cid:105) and (cid:104) (cid:15) ( t ) (cid:105) since ∂ t (cid:104) (cid:15) (cid:105) = A · (cid:104) (cid:15) (cid:105) , where (cid:104) (cid:15) (cid:105) = ( (cid:104) (cid:15) ( t ) (cid:105) , (cid:104) (cid:15) ( t )) (cid:105) . Solvingthese coupled ﬁrst order ODEs gives us: (cid:104) (cid:15) ( t ) (cid:105) = (cid:104) (cid:15) (0) (cid:105) e − ( d + κ ) t , (106) (cid:104) (cid:15) ( t ) (cid:105) = − κ (cid:104) (cid:15) (0) (cid:105) e − ( d + κ ) t + ( κ (cid:104) (cid:15) (0) (cid:105) + ( d − d + κ ) (cid:104) (cid:15) (0) (cid:105) ) e − d t d − d + κ , (107)where − d and − ( d + κ ) are eigenvalues of A . Clearly, in the limit t → ∞ the ﬂuctuations aboutthe steady state concentrations φ ∗ and φ ∗ , (cid:104) (cid:15) ( t ) (cid:105) and (cid:104) (cid:15) ( t ) (cid:105) , tend to zero as required. The ﬁnal stepof the LNA then requires us to ﬁnd the covariance matrix C at steady state, which has the steadystate variances (cid:104) (cid:15) (cid:105) and (cid:104) (cid:15) (cid:105) as diagonal components and covariance (cid:104) (cid:15) (cid:15) (cid:105) = (cid:104) (cid:15) (cid:15) (cid:105) in the oﬀ-diagonalcomponents. C is then given by the Lyapunov equation [48]: A · C + C · A T + B · B T = 0 , (108)whose solution is given by: C = (cid:32) r k ( d + κ )Ω κr k d ( d + κ )Ω (cid:33) . (109)From van Kampen [8] p. 259 we assert that for some ﬂuctuation (cid:15) i , (cid:104) (cid:15) i (0) (cid:15) j ( t ) (cid:105) = (cid:104) (cid:15) i (0) (cid:104) (cid:15) j ( t ) (cid:105)(cid:105) , andthat at t = 0 we have φ = φ ∗ so that (cid:104) (cid:15) i (0) (cid:15) j (0) (cid:105) = (cid:104) (cid:15) i (cid:15) j (cid:105) . For example, for (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) we have,using (cid:104) (cid:15) ( t ) (cid:105) from Eq. (106) and (cid:104) (cid:15) (cid:105) from Eq. (109), (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) = (cid:104) (cid:15) (0) (cid:104) (cid:15) ( t ) (cid:105)(cid:105) = (cid:104) (cid:15) (cid:105) e − ( d + κ ) t .Explicitly, one can then calculate all the correlators, which are given by: (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) = r k ( d + κ )Ω e − ( d + κ ) t , (110) (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) = κr k d ( d + κ )Ω e − d t , (111) (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) = 0 , (112) (cid:104) (cid:15) (0) (cid:15) ( t ) (cid:105) = κr k ( e − d t − e − ( d + κ ) t )( d + κ )( d − d + κ )Ω . (113)26ow that these correlators have been determined, we can substitute them into Eq. (103) giving us thefollowing for the correlator of η k ( t ) : (cid:104) η k (0) η k ( t ) (cid:105) = ( d − d ) κ (cid:0) κ ( d + κ ) e − ( d + κ ) t + ( d − d ) d e − d t (cid:1) ( d + κ ) ( d − d + κ ) ( d + κ ) r k , (114)noting the only dependence on the gene state G k comes from the pre-factor /r k . Comparing thisequation to the colored noise seen in Eq. (12) in Section 3.1 we see however that we have twoexponentials in the correlator. This sum of exponentials in Eq. (114) can be approximated by a singleexponential through a small t expansion. This gives us: (cid:104) η k (0) η k ( t ) (cid:105) ≈ (cid:104) η k (0) η k ( t ) (cid:105) a = D ( k ) a τ a e − t/τ a , (115)where D ( k ) a and τ a are the approximate noise strength and correlation time given by: D ( k ) a = ( d − d ) κ ( d + κ ) ( d κ + d ( d + κ ) + κ ) r k , (116) τ a = d + κd κ + d ( d + κ ) + κ . (117)Clearly, the small t expansion allows us to roughly interpret the noise η k ( t ) , present in gene state G k ,as colored noise with strength D a /τ a and correlation time τ a . Note that even when both exponentialsequally contribute to the correlator in Eq. (114), this is generally a very good approximation for fewprotein stages. Knowing D ( k ) a and τ a for η k ( t ) we can now substitute them into Eqs. (76–77) in Section4, then using Eqs. (95–96) we ﬁnd P ( n ) ≈ P s ( G ) P s ( n | G ) + P s ( G ∗ ) P s ( n | G ∗ ) . (118)Fig. 10B shows two diﬀerent cases of the cUCNA for predicting distributions for multi-stagedegradation and slow gene switching: in B(i) for the case of d > d (true for around 80% of proteinsin [46]); in B(ii) for the case of d > d (true for around 20% of proteins in [46]). On the main plotsred dots show the standard SSA prediction of the full reaction scheme in Eq. (92), and the black linesshow the cUCNA from Eq. (118), which in both cases is almost indistinguishable from the white noise(cUCNA with τ = 0 ) and no extrinsic noise prediction (discussed further in the following paragraph).The insets show the correlators in gene state G ∗ , where the red dashed line represents (cid:104) η G ∗ (0) η G ∗ ( t ) (cid:105) a and the black line shows (cid:104) η G ∗ (0) η G ∗ ( t ) (cid:105) . Note that the correlators for gene state G are not shownbecause they show very similar to what is seen for state G . Even given the complex model reductionfrom two protein species to one eﬀective protein species the cUCNA performs very well in predictingdistributions from the standard SSA of the full system in Eq. (92). Note that as one considers moreprotein stages with diﬀering degradation rates it becomes more diﬀerent to ﬁt the correlator to a singleexponential, which presents a limitation of this method for more protein stages.However, we ﬁnd that since our analysis is restricted to the large protein number regime, and thenoise strength D a is inversely proportional to the production rate r k which is typically large, that D a is typically very small in both gene states. This means that the cUCNA probability distribution isalmost identical to probability distributions that assume white noise (cUCNA with τ = 0 ) or even noextrinsic noise. However, what the analysis in this section provides is the quantitative reasons why onecould necessarily neglect the contribution of extrinsic noise in model reduction from the full system inEq. (92) to the simpler system in Eq. (93) . In this paper we have explored the addition of colored noise onto the reaction rates for a cooperativeauto-regulatory circuit. Starting from a reduced chemical Fokker-Planck description, we used the27CNA to derive approximate expressions for the probability distribution of protein numbers in thelimits of fast and slow promoter switching. The approximation is valid provided the colored noise on thereaction rates is small and the correlation time is short. By means of stochastic simulations, we veriﬁedthe accuracy of the approximate distributions; we also veriﬁed the predictions of the UCNA, namelythat under fast promoter switching conditions the addition of colored noise can induce bimodalitywhereas under slow promoter switching conditions, noise can destroy bimodality.We also have shown how complex models of gene expression can be mapped onto simpler modelswith noisy rates. In particular we have shown that: (i) An auto-regulatory feedback loop with multi-stage protein production, including diﬀerent stages of mRNA processing, can be mapped onto anauto-regulatory feedback loop with a single protein production reaction step having colored noisedadded to its reaction rate. (ii) A feedback loop with multi-stage protein degradation can be mappedonto a feedback loop with a single protein degradation reaction with a ﬂuctuating rate. We have alsoveriﬁed that in many instances, one cannot simply approximate colored noise with white noise, orelse neglect it entirely, since this does not match behaviour seen from the full underlying models ofmulti-stage protein production or degradation.The UCNA provides an easily extendable analysis for other gene regulatory networks so long as onlyone species eﬀectively describes the system. Our analysis is the ﬁrst to our knowledge, to analyticallyﬁnd steady state probability distributions where colored noise is added to a non-linear reaction (theprotein-gene binding reaction) in a gene regulatory context; a previous study applied the UCNA tostudy the eﬀects of extrinsic noise in genetic circuits composed of purely linear reactions [27]. Giventhat our calculations show that the protein distributions for auto-regulatory circuits with extrinsicnoise on reaction parameters can be dramatically diﬀerent than models assuming constant reactionrates, an interesting future research direction would be to develop UCNA based methods that candirectly infer the properties of colored noise on reaction rates from protein expression data.

Acknowledgments

This work was supported by a BBSRC EASTBIO PhD studentship for J.H., a Darwin Trust scholarshipfor A.G. and a Leverhulme Trust grant (RPG-2018-423) for R.G.

A Stochastic simulations of autoregulation with extrinsic noise

In this paper extrinsic noise is accounted for in the SSA through the introduction of a new ghost species Y and some new ghost reactions. For example, consider the case where we want to model a ﬂuctuatingdegradation rate d = d (1 + η ( t )) , where (cid:104) η ( t ) (cid:105) = 0 and (cid:104) η ( t ) η ( t (cid:48) ) (cid:105) = ( D/τ ) exp ( −| t − t (cid:48) | /τ ) . We willthen replace the degradation reaction, P d −→ ∅ , by the following set of reactions: ∅ / ( D Ω) −−−−− (cid:42)(cid:41) −−−−− /τ Y, P + Y d D Ω /τ −−−−−→ Y. (119)We now show why this equivalence exists. The propensity for the degradation reaction in Eq. (119)is ( n P n Y d D Ω /τ ) / Ω , meaning that the eﬀective degradation rate is d = ( n Y d D Ω /τ ) / Ω . Assumingthere are large numbers of Y , we apply the van Kampen ansatz that ﬂuctuations in Y occur aroundits deterministic steady state mean [8]: n Y Ω = τD Ω + Ω − / (cid:15) ( t ) . (120)Then, employing the system size expansion, and enforcing the linear noise approximation (LNA),we obtain a linear FPE for the probability of having a ﬂuctuation of size (cid:15) ( t ) at a time t , denoted Π( (cid:15), t ) [8, 48]: ∂ Π( (cid:15), t ) ∂t = 1 τ ∂∂(cid:15) ( (cid:15) Π( (cid:15), t )) + 12 2 D Ω ∂ Π( (cid:15), t ) ∂(cid:15) . (121)28his FPE then admits an equivalent Langevin equation given by: d(cid:15) ( t ) dt = − τ (cid:15) ( t ) + (cid:114) D Ω β ( t ) , (122)where β ( t ) is Gaussian white noise with zero mean and correlator (cid:104) β ( t ) β ( t (cid:48) ) (cid:105) = δ ( t − t (cid:48) ) . Hence, fromEq. (120) it follows that d goes as: d = d D Ω τ n Y Ω = d (1 + η ( t )) , (123) dη ( t ) dt = − τ η ( t ) + √ Dτ β ( t ) , (124)where η ( t ) = Ω / D(cid:15) ( t ) /τ . Eqs. (123) and (124) are consistent with the deﬁnition of colored noisedescribed at the beginning of this section. This modiﬁed SSA requires that where τ and D are bothindividually large, that τ (cid:29) D such that slow switching is not enforced between diﬀering numbers ofthe ghost species. B Detailed explanation of condition 2

In order to explain the origin of condition 2 – a condition on the length scale of colored noise ﬂuctuationcompared to the rate of variation of the drift term in Eq. (47) – we will ﬁrst consider a more intuitiveexample. Consider a Brownian particle subject to a force F ( x ) , whose state is speciﬁed by both itsposition x , as well as its velocity v . The set of SDEs governing the state of this particle is then [8]: dxdt = v, (125) dvdt = F ( x ) m − γv + k b T γm Γ( t ) , (126)where m is the mass of the particle, γ is the damping coeﬃcient of the frictional force surroundingthe particle (frictional force is − γmv ), k b T is the thermal energy of the particle, and Γ( t ) is Gaussianwhite noise with zero mean and correlator (cid:104) Γ( t )Γ( t (cid:48) ) (cid:105) = δ ( t − t (cid:48) ) . The equivalent multivariate FPE forthis set of SDEs is [29]: ∂P ( x, v ; t ) ∂t = γ (cid:20) ∂ ( vP ) ∂v + k b Tm ∂ P∂v (cid:21) − v ∂P∂x − F ( x ) m ∂P∂v . (127)Now following van Kampen p. 216–218 [8], one can utilise singular perturbation theory assuming thatthe damping coeﬃcient γ is small (although the same procedure could be done for γ large) in orderto reduce the above FPE in two variables to a FPE in the position variable x alone. The result afterhaving done this procedure is: ∂P ( x ; t ) ∂t = − ∂∂x (cid:18) F ( x ) mγ P (cid:19) + k b Tmγ ∂ P∂x . (128)Aside from the requirement that γ must be small, there is another condition required of Eq. (128)such that it reasonably approximates Eq. (127). This condition arises physically since we realise thatif we are to approximate Eq. (127) by Eq. (128), then the drift term F ( x ) /mγ must be approximatelyconstant over the distance that the velocity is damped. One ﬁnds that the associated ‘length scale’ L over which the velocity is damped is simply the pre-factor of diﬀusion term in Eq. (128), i.e., L = k b Tmγ [49]. Enforcing the requirement that F ( x ) is slowly varying over this length scale we ﬁnd theinequality L | F (cid:48) ( x ) | (cid:28) | F ( x ) | , explicitly: mγk b T (cid:29) (cid:12)(cid:12)(cid:12)(cid:12) F (cid:48) ( x ) F ( x ) (cid:12)(cid:12)(cid:12)(cid:12) , (129)29hich must be satisﬁed for our one variable FPE to be a good approximation.We now return to our colored noise problem and recall the set of SDEs that deﬁne our system: dndt = h ( n ) + F ( n ) η + g ( n )Γ( t ) , (130) dηdt = − τ η + 1 τ θ ( t ) , (131)where all functions of n and t are deﬁned in Section 3.2. This set of SDEs has an equivalent bi-variate(Stratonovich) FPE given by: ∂P ( n, η ; t ) ∂t = − ∂∂n (cid:16)(cid:0) h ( n ) + F ( n ) η − g ( n ) g (cid:48) ( n ) (cid:1) P (cid:17) + 1 τ ∂∂η ( ηP ) (132) + 12 ∂ ∂n (cid:0) g ( n ) P (cid:1) + 1 τ ∂ ∂η P. We now recall Eq. (47), i.e., our UCNA approximated one variable FPE, where η was adiabaticallyeliminated: ∂P ( n ; t ) ∂t = − ∂∂n (cid:104)(cid:16) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:17) P ( n, t ) (cid:105) + ∂ ∂n (cid:2) ˜ g ( n ) P ( n, t ) (cid:3) . (133)Analogously to the case of the Brownian particle, this one variable FPE can only be approximatelycorrect where the variation of the drift term with respect to the length scale of colored noise ﬂuctuationsis small. From Eq. (43), we identify our length scale as the pre-factor of the noise term whose originis the adiabatic elimination of η , i.e., L = F ( n ) C ( n, τ ) . (134)Hence, C ( n, τ ) must satisfy the following length scale condition C ( n, τ ) (cid:29) F ( n ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ n (cid:16) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:17) ˜ h ( n ) + ˜ g ( n )˜ g (cid:48) ( n ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) , (135)if Eq. (133) is to be a good approximation of Eq. (132), as seen in Eq. (66) from the main text.Note that one can also make the argument that the diﬀusion term ˜ g ( n ) should also slowly vary withrespect to L . However, we generally ﬁnd that this is satisﬁed if Eq. (135) is satisﬁed, and hence wedo not include this as an additional condition on the validity of the UCNA. References [1] Bruce Alberts, Alexander Johnson, Julian Lewis, David Morgan, Martin Raﬀ, Keith Roberts, andPeter Walter.

Molecular biology of the cell sixth edition . Garland Science, 2015.[2] Shai S Shen-Orr, Ron Milo, Shmoolik Mangan, and Uri Alon. Network motifs in the transcriptionalregulation network of escherichia coli.

Nature genetics , 31(1):64, 2002.[3] Nitzan Rosenfeld, Michael B Elowitz, and Uri Alon. Negative autoregulation speeds the responsetimes of transcription networks.

Journal of molecular biology , 323(5):785–793, 2002.[4] Michael B Elowitz, Arnold J Levine, Eric D Siggia, and Peter S Swain. Stochastic gene expressionin a single cell.

Science , 297(5584):1183–1186, 2002.305] Peter S Swain, Michael B Elowitz, and Eric D Siggia. Intrinsic and extrinsic contributions tostochasticity in gene expression.

Proceedings of the National Academy of Sciences , 99(20):12795–12800, 2002.[6] Daniel T Gillespie. Exact stochastic simulation of coupled chemical reactions.

The journal ofphysical chemistry , 81(25):2340–2361, 1977.[7] Crispin W Gardiner et al.

Handbook of stochastic methods , volume 3. springer Berlin, 1985.[8] Nicolaas Godfried Van Kampen.

Stochastic processes in physics and chemistry , volume 1. Elsevier,1992.[9] James Holehouse, Zhixing Cao, and Ramon Grima. Stochastic modeling of auto-regulatory geneticfeedback loops: a review and comparative study.

Biophysical Journal , 2020. https://doi.org/10.1016/j.bpj.2020.02.016 .[10] Ramon Grima, Deena R Schmidt, and Timothy J Newman. Steady-state ﬂuctuations of a geneticfeedback loop: An exact solution.

The Journal of chemical physics , 137(3):035104, 2012.[11] Niraj Kumar, Thierry Platini, and Rahul V Kulkarni. Exact distributions for stochastic geneexpression models with bursting and feedback.

Physical review letters , 113(26):268105, 2014.[12] Peijiang Liu, Zhanjiang Yuan, Haohua Wang, and Tianshou Zhou. Decomposition and tunabilityof expression noise in the presence of coupled feedbacks.

Chaos: An Interdisciplinary Journal ofNonlinear Science , 26(4):043108, 2016.[13] Chen Jia, George G Yin, Michael Q Zhang, et al. Single-cell stochastic gene expression kineticswith coupled positive-plus-negative feedback.

Physical Review E , 100(5):052406, 2019.[14] Chen Jia and Ramon Grima. Small protein number eﬀects in stochastic models of autoregulatedbursty gene expression.

The Journal of chemical physics , 152(8):084115, 2020.[15] Zhixing Cao and Ramon Grima. Linear mapping approximation of gene regulatory networks withstochastic dynamics.

Nature communications , 9(1):3305, 2018.[16] Pavel Kurasov, Alexander Lück, Delio Mugnolo, and Verena Wolf. Stochastic hybrid models ofgene regulatory networks–a pde approach.

Mathematical biosciences , 305:170–177, 2018.[17] Alexander Andreychenko, Luca Bortolussi, Ramon Grima, Philipp Thomas, and Verena Wolf.Distribution approximations for the chemical master equation: comparison of the method ofmoments and the system size expansion. In

Modeling Cellular Systems , pages 39–66. Springer,2017.[18] Philipp Thomas, Arthur V Straube, and Ramon Grima. The slow-scale linear noise approximation:an accurate, reduced stochastic description of biochemical networks under timescale separationconditions.

BMC systems biology , 6(1):39, 2012.[19] Nir Friedman, Long Cai, and X Sunney Xie. Linking stochastic dynamics to population dis-tribution: an analytical framework of gene expression.

Physical review letters , 97(16):168302,2006.[20] Anna Ochab-Marcinek and Marcin Tabaka. Transcriptional leakage versus noise: a simple mech-anism of conversion between binary and graded response in autoregulated genes.

Physical ReviewE , 91(1):012704, 2015.[21] Jakub Jędrak and Anna Ochab-Marcinek. Inﬂuence of gene copy number on self-regulated geneexpression.

Journal of theoretical biology , 408:222–236, 2016.3122] Matthew Scott, Brian Ingalls, and Mads Kærn. Estimations of intrinsic and extrinsic noise inmodels of nonlinear genetic networks.

Chaos: An Interdisciplinary Journal of Nonlinear Science ,16(2):026107, 2006.[23] Tina Toni and Bruce Tidor. Combined model of intrinsic and extrinsic variability for computa-tional network design with application to synthetic biology.

PLoS computational biology , 9(3),2013.[24] Emma M Keizer, Björn Bastian, Robert W Smith, Ramon Grima, and Christian Fleck. Extend-ing the linear-noise approximation to biochemical systems inﬂuenced by intrinsic noise and slowlognormally distributed extrinsic noise.

Physical Review E , 99(5):052417, 2019.[25] Elijah Roberts, Shay Be’er, Chris Bohrer, Rati Sharma, and Michael Assaf. Dynamics of simplegene-network motifs subject to extrinsic ﬂuctuations.

Physical Review E , 92(6):062717, 2015.[26] Peter Jung and Peter Hänggi. Dynamical systems: a uniﬁed colored-noise approximation.

Physicalreview A , 35(10):4464, 1987.[27] Vahid Shahrezaei, Julien F Ollivier, and Peter S Swain. Colored extrinsic ﬂuctuations and stochas-tic gene expression.

Molecular systems biology , 4(1), 2008.[28] James Holehouse and Ramon Grima. Revisiting the reduction of stochastic models of geneticfeedback loops with fast promoter switching.

Biophysical journal , 117(7):1311–1330, 2019.[29] Crispin Gardiner.

Stochastic methods , volume 4. Springer Berlin, 2009.[30] Daniel T Gillespie. The chemical langevin equation.

The Journal of Chemical Physics , 113(1):297–306, 2000.[31] Brian Munsky and Mustafa Khammash. The ﬁnite state projection algorithm for the solution ofthe chemical master equation.

The Journal of chemical physics , 124(4):044104, 2006.[32] Cao Li, Wu Da-Jin, and Ke Sheng-Zhi. Bistable kinetic model driven by correlated noises: uniﬁedcolored-noise approximation.

Physical Review E , 52(3):3228, 1995.[33] Ronald F Fox. Functional-calculus approach to stochastic diﬀerential equations.

Physical ReviewA , 33(1):467, 1986.[34] Ronald Forrest Fox. Uniform convergence to an eﬀective fokker-planck equation for weakly colorednoise.

Physical Review A , 34(5):4525, 1986.[35] Ronald F Fox. Stochastic calculus in physics.

Journal of statistical physics , 46(5-6):1145–1157,1987.[36] Ronald F Fox and Rajarshi Roy. Steady-state analysis of strongly colored multiplicative noise ina dye laser.

Physical Review A , 35(4):1838, 1987.[37] P Grigolini, LA Lugiato, Riccardo Mannella, Peter VE McClintock, M Merri, and M Pernigo.Fokker-planck description of stochastic processes with colored noise.

Physical Review A ,38(4):1966, 1988.[38] Eugene Wong and Moshe Zakai. On the convergence of ordinary integrals to stochastic integrals.

The Annals of Mathematical Statistics , 36(5):1560–1564, 1965.[39] Giuseppe Pesce, Austin McDaniel, Scott Hottovy, Jan Wehr, and Giovanni Volpe. Stratonovich-to-itô transition in noisy systems with multiplicative feedback.

Nature communications , 4:2733,2013. 3240] Damien Nicolas, Nick E Phillips, and Felix Naef. What shapes eukaryotic transcriptional bursting?

Molecular BioSystems , 13(7):1280–1290, 2017.[41] Zhixing Cao, Tatiana Filatova, Diego A Oyarzún, and Ramon Grima. Multi-scale bursting instochastic gene expression. bioRxiv , page 717199, 2019.[42] Jiajun Zhang and Tianshou Zhou. Stationary moments, distribution conjugation and pheno-typic regions in stochastic gene transcription.

Mathematical biosciences and engineering: MBE ,16(5):6134–6166, 2019.[43] Philipp Thomas, Nikola Popović, and Ramon Grima. Phenotypic switching in gene regulatorynetworks.

Proceedings of the National Academy of Sciences , 111(19):6994–6999, 2014.[44] Sandeep Choubey, Jane Kondev, and Alvaro Sanchez. Deciphering transcriptional dynamics invivo by counting nascent rna molecules.

PLoS computational biology , 11(11), 2015.[45] Geoﬀrey M Cooper.

The Cell: A Molecular Approach. 2nd edition.

Sunderland (MA): SinauerAssociates, 2000.[46] Erik McShane, Celine Sin, Henrik Zauber, Jonathan N Wells, Neysan Donnelly, Xi Wang, JingyiHou, Wei Chen, Zuzana Storchova, Joseph A Marsh, et al. Kinetic analysis of protein stabilityreveals age-dependent degradation.

Cell , 167(3):803–815, 2016.[47] Juan M Pedraza and Johan Paulsson. Eﬀects of molecular memory and bursting on ﬂuctuationsin gene expression.

Science , 319(5861):339–343, 2008.[48] Johan Elf and Måns Ehrenberg. Fast evaluation of ﬂuctuations in biochemical networks with thelinear noise approximation.

Genome research , 13(11):2475–2484, 2003.[49] Peter Hänggi and Peter Jung. Colored noise in dynamical systems.