Decomposing variability in protein levels from noisy expression, genome duplication and partitioning errors during cell-divisions
Mohammad Soltani, Cesar Augusto Vargas-Garcia, Duarte Antunes, Abhyudai Singh
DDecomposing variability in protein levels from noisy expression,genome duplication and partitioning errors during cell-divisions
M. Soltani , C. A. Vargas-Garcia , D. Antunes , A. Singh , , , ∗ ∗ E-mail: Corresponding [email protected]
Abstract
Inside individual cells, expression of genes is inherently stochastic and manifests as cell-to-cell variability or noise in protein copy numbers. Since proteins half-lives can be comparableto the cell-cycle length, randomness in cell-division times generates additional intercellularvariability in protein levels. Moreover, as many mRNA/protein species are expressed atlow-copy numbers, errors incurred in partitioning of molecules between the mother anddaughter cells are significant. We derive analytical formulas for the total noise in proteinlevels for a general class of cell-division time and partitioning error distributions. Usinga novel hybrid approach the total noise is decomposed into components arising from i)stochastic expression; ii) partitioning errors at the time of cell-division and iii) randomcell-division events. These formulas reveal that random cell-division times not only gener-ate additional extrinsic noise but also critically affect the mean protein copy numbers andintrinsic noise components. Counter intuitively, in some parameter regimes noise in pro-tein levels can decrease as cell-division times become more stochastic. Computations areextended to consider genome duplication, where the gene dosage is increased by two-fold ata random point in the cell-cycle. We systematically investigate how the timing of genomeduplication influences different protein noise components. Intriguingly, results show thatnoise contribution from stochastic expression is minimized at an optimal genome duplica-tion time. Our theoretical results motivate new experimental methods for decomposingprotein noise levels from single-cell expression data. Characterizing the contributions ofindividual noise mechanisms will lead to precise estimates of gene expression parametersand techniques for altering stochasticity to change phenotype of individual cells.
Key index words or phrases
Single cell, stochastic gene expression, cell-division, moment dynamics, hybrid models, noise decomposi-tion, partitioning errors 1 a r X i v : . [ q - b i o . M N ] O c t The level of a protein can deviate considerably from cell-to-cell, in spite of the fact that cells aregenetically-identical and are in the same extracellular environment [1–3]. This intercellular variationor noise in protein counts has been implicated in diverse processes such as corrupting functioning ofgene networks [4–6], driving probabilistic cell-fate decisions [7–12], buffering cell populations from hostilechanges in the environment [13–16], and causing clonal cells to respond differently to the same stimu-lus [17–19]. An important source of noise driving random fluctuations in protein levels is stochastic geneexpression due to the inherent probabilistic nature of biochemical processes [20–23]. Recent experimentalstudies have uncovered additional noise sources that affect protein copy numbers. For example, the timetake to complete cell-cycle (i.e., time between two successive cell-division events) has been observed tobe stochastic across organisms [24–32]. Given that many proteins/mRNAs are present inside cells atlow-copy numbers, errors incurred in partitioning of molecules between the mother and daughter cellsare significant [33–35]. Finally, the time at which a particular gene of interest is duplicated can alsovary between cells [36, 37]. We investigate how such noise sources in the cell-cycle process combine withstochastic gene expression to generate intercellular variability in protein copy numbers (Fig.1).
Time ( minutes ) Probability P r o t e i n c o u n t p e r ce ll Cell-division 𝐶𝑉 = 𝐶𝑉 𝐸2 + 𝐶𝑉 𝑃2 + 𝐶𝑉 𝑅2 𝐶𝑉 𝐸2 = Cell-cycle noise 𝐶𝑉 𝑃2 = Production noise 𝐶𝑉 𝑅2 = Partitioning noise Start cell-cycle 𝐶𝑉 Random cell cycle time
150 200 250 300 0 20
40 60 80
Gene duplication
Figure 1. Sample trajectory of the protein level in a single cell with different sources ofnoise . Stochastically expressed proteins accumulate within the cell at a certain rate. At a random pointin the cell-cycle, gene-duplication results in an increase in production rate. Stochastic cell-division eventslead to random partitioning of protein molecules between the mother and daughter cells with each cellreceiving, on average, half the number of proteins in the mother cell just before division. The steady-stateprotein copy number distribution obtained from a large number of trajectories is shown on the right. Thetotal noise in the protein level, as measured by the Coefficient of Variation ( CV ) squared can be brokeninto contributions from individual noise mechanisms.Prior studies that quantify the effects of cell-division on the protein noise level have been restrictedto specific cases. For example, noise computations have been done in stochastic gene expression mod-els, where cell-divisions occur at deterministic time intervals [33, 38, 39]. Recently, we have analyzed adeterministic model of gene expression with random cell-division events [40]. Building up on this work,we formulate a mathematical model that couples stochastic expression of a stable protein with randomcell-division events that follow an arbitrary probability distribution function. Moreover, at the time ofcell-division, proteins are randomly partitioned between the mother and daughter cells based on a gen-eral framework that allows the partitioning errors to be higher or lower than as predicted by binomialpartitioning. For this class of models, we derive an exact analytical formula for the protein noise levelas quantified by the steady-state Coefficient of Variation ( CV ) squared. This formula is further decom-posed into individual components representing contributions from different noise sources. A systematicinvestigation of this formula leads to novel insights, such as identification of regimes where increasingrandomness in the timing of cell-division events decreases the protein noise level.Next, we extend the above model to include genome duplication events that increase the gene’stranscription rate by two-fold (corresponding to doubling of gene dosage) prior to cell-division [36, 41].To our knowledge, this is the first study integrating randomness in the genome duplication process withstochastic gene expression. An exact formula for the protein noise level is derived for this extendedmodel and used to investigate how the timing of duplication affects different noise components. Counterintuitively, results show that doubling of the transcription rate within the cell-cycle can lead to smallerfluctuations in protein levels as compared to a constant transcription rate through out the cell-cycle.Finally, we discuss how formulas obtained in this study can be used to infer parameters and characterizethe gene expression process from single-cell studies. We consider the standard model of stochastic gene expression [42, 43], where mRNAs are transcribedat exponentially distributed time intervals from a constitutive gene with rate k x . For the time being,we exclude genome duplication and the transcription rate is fixed throughout the cell-cycle. Assumingshort-lived mRNAs, each transcription event results in a burst of proteins [43–45]. The correspondingjump in protein levels is shown as x ( t ) (cid:55)→ x ( t ) + B, (1)where x ( t ) is the protein population count in the mother cell at time t , B is a random burst size drawnfrom a positively-valued distribution and represents the number of protein molecules synthesized in asingle-mRNA lifetime. Motivated by observations in E. coli and mammalian cells, where many proteinshave half-lives considerably longer than the cell-doubling time, we assume a stable protein with no activedegradation [46–48]. Thus, proteins accumulate within the cell till the time of cell-division, at whichpoint they are randomly partitioned between the mother and daughter cells.Let cell-division events occur at times t s , s ∈ { , , . . . } . The cell-cycle time T := t s − t s − , (2)follows an arbitrary positively-valued probability distribution with the following mean and coefficient ofvariation ( CV ) squared (cid:104) T (cid:105) = (cid:104) t s − t s − (cid:105) , CV T = (cid:104) T (cid:105) − (cid:104) T (cid:105) (cid:104) T (cid:105) , (3)where (cid:104) . (cid:105) denotes expected value through out this paper. The random change in x ( t ) during cell-divisionis given by x ( t s ) (cid:55)→ x + ( t s ) , (4)where x ( t s ) and x + ( t s ) denote the protein levels in the mother cell just before and after division, respec-tively. Conditioned on x ( t s ), x + ( t s ) is assumed to have the following statistics (cid:104) x + ( t s ) | x ( t s ) (cid:105) = x ( t s )2 , (cid:28) x ( t s ) − (cid:104) x + ( t s ) (cid:105) (cid:12)(cid:12)(cid:12)(cid:12) x ( t s ) (cid:29) = αx ( t s )4 . (5)The first equation implies symmetric division, i.e., on average the mother cell inherits half the numberprotein molecules just before division. The second equation in (5) describes the variance of (cid:104) x + ( t s ) (cid:105) andquantifies the error in partitioning of molecules through the non-negative parameter α . For example, α = 0 represents deterministic partitioning where x + ( t s ) = x ( t s ) / x + ( t s )Probability { x + ( t s ) = j | x ( t s ) } = x ( t s )! j !( x ( t s ) − j )! (cid:18) (cid:19) x ( t s ) , j ∈ { , , . . . , x ( t s ) } , (6)and corresponds to α = 1 in (5). Interestingly, recent studies have shown that partitioning of proteinsthat form clusters or multimers can result in α > α < x ( t ) takes non-negative integervalues, x ( t ) is continuous in the hybrid models. We will use these hybrid models for decomposing theprotein noise level obtained from the full model into individual components representing contributionsfrom different noise sources. However, before computing the noise, we first determine the average numberof proteins as a function of the cell-cycle time distribution. Bkx x B) Stochastic cell cycle Deterministic partitioning & production C) Stochastic cell cycle & partitioning
Deterministic production A) Stochastic cell cycle, partitioning & production x k Bxx xx x Bkx x xx / xx Cell division Cell division Cell division
Figure 2. Stochastic models of gene expression with cell-division.
Arrows denote stochasticevents that change the protein level by discrete jumps as shown in (1) and (4). The differential equationwithin the circle represents the time evolution of x ( t ) in between events. A) Model with all the differentsources of noise: proteins are expressed in stochastic bursts, cell-division occurs at random times, andmolecules are partitioned between the mother and daughter cells based on (5). The trivial dynamics˙ x = 0 signifies that the protein level is constant in-between stochastic events. B) Hybrid model whererandomness in cell-division events is the only source of noise. Protein production is modeled determin-istically through a differential equation and partitioning errors are absent, i.e., α = 0 in (5). C) Hybridmodel where noise comes from both cell-division events and partitioning errors. Protein production isconsidered deterministically as in Fig. 2B. Since x ( t ) is continuous here, x + ( t s ) has a positively-valuedcontinuous distribution with same mean and variance as in (5) To quantify the steady-state mean protein level we consider the full model illustrated in Fig. 2A. Itturns out that all the models shown in Fig. 2 are identical in terms of finding (cid:104) x ( t ) (cid:105) and in principleany one of them could have been used. To obtain differential equations describing the time evolutionof (cid:104) x ( t ) (cid:105) we model the cell-cycle time through a phase-type distribution, which can be represented by acontinuous-time Markov chain. Phase-type distributions are dense in the class of positively-valued con-tinuous distributions, i.e., one can always construct a sequence of phase-type distributions that convergespoint wise to a given distribution of interest [54]. We use this denseness property as a practical tool formodeling the cell-cycle time. We consider a class of phase-type distribution that consists of a mixture of Erlang distributions. Recallthat an Erlang distribution of order i is the distribution of the sum of i independent and identicalexponential random variables. The cell-cycle time is assumed to have an Erlang distribution of order i with probability p i , i = { , . . . , n } and can be represented by a continuous-time Markov chain withstates G ij , j = { , . . . , i } , i = { , . . . , n } (Fig. 3). Let Bernoulli random variables g ij = 1 if the systemresides in state G ij and 0 otherwise. The probability of transition G ij → G i ( j +1) in the next infinitesimaltime interval [ t, t + dt ) is given by kg ij dt , implying that the time spent in each state G ij is exponentiallydistributed with mean 1 /k . To summarize, at the start of cell-cycle, a state G i , i = { , . . . , n } is chosenwith probability p i and cell-division occurs after transitioning through i exponentially distributed steps.Based on this formulation, the probability of a cell-division event occurring in the next time interval[ t, t + dt ) is given by kp i (cid:80) nj =1 g jj dt , and whenever the event occurs, the protein level changes as per (4).Finally, the mean and the coefficient of variation squared of the cell-cycle time is obtained as (cid:104) T (cid:105) = n (cid:88) i =1 p i ik , CV T = 1 k (cid:104) T (cid:105) (7)in terms of the Markov chain parameters. Our goal is to obtain (cid:104) x (cid:105) := lim t →∞ (cid:104) x ( t ) (cid:105) as a function of (cid:104) T (cid:105) and CV T . Cell division p n p p k k k k k k k Cell division
Cell division
Start cell cycle G G G n G n G nn G Figure 3. A continuous-time Markov chainmodel for the cell-cycle time . The cell-cycle time is assumed to follow a mixture of Er-lang distributions. At the start of cell-cycle, astate G i , i = { , . . . , n } is chosen with probabil-ity p i . The cell-cycle transitions through states G ij , j = { , . . . , i } residing for an exponentiallydistributed time with mean 1 /k in each state.Cell-division occurs after exit from G ii and theabove process is repeated. Time evolution of the statistical moments of x ( t ) can be obtained from the Kolmogorov forward equationscorresponding to the full model in Fig. 2A combined with the cell-division process described in Fig. 3.We refer the reader to [52, 55, 56] for an introduction to moment dynamics for stochastic and hybridsystems. Analysis in Appendix A shows d (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) − k (cid:42) n (cid:88) j =1 xg jj (cid:43) . (8)Note that the time-derivative of the mean protein level (first-order moment) is unclosed, in the sense that,it depends on the second-order moment (cid:104) xg ij (cid:105) . Typically, approximate closure methods are used to solvemoments in such cases [52, 56–61]. However, the fact that g ij is binary can be exploited to automaticallyclose moment dynamics. In particular, since g ij ∈ { , }(cid:104) g nij x m (cid:105) = (cid:104) g ij x m (cid:105) , n ∈ { , , . . . } (9)for any non-negative integer m . Moreover, as only a single state g ij can be 1 at any time (cid:104) g ij g rq x m (cid:105) = 0 , if i (cid:54) = r or j (cid:54) = q. (10)Using (9) and (10), the time evolution of (cid:104) xg ij (cid:105) is obtained as d (cid:104) xg i (cid:105) dt = k x (cid:104) B (cid:105) p i i + k p i (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:104) xg i (cid:105) , (11a) d (cid:104) xg ij (cid:105) dt = k x (cid:104) B (cid:105) p i i − k (cid:104) xg ij (cid:105) + k (cid:104) xg i ( j − (cid:105) , j = { , . . . , i } (11b)and only depends on (cid:104) xg ij (cid:105) (see Appendix A). Thus, (8) and (11) constitute a closed system of lineardifferential equations from which moments can be computed exactly.To obtain an analytical formula for the average number of proteins, we start by performing a steady-state analysis of (8) that yields (cid:42) n (cid:88) j =1 xg jj (cid:43) = 2 k x (cid:104) B (cid:105) k , (12)where (cid:104) . (cid:105) denotes the expected value in the limit t → ∞ . Using (12), (cid:104) xg i (cid:105) is determined from (11a),and then all moments (cid:104) xg ij (cid:105) are obtained recursively by performing a steady-state analysis of (11b) for j = { , . . . , i } . This analysis results in (cid:104) xg ij (cid:105) = k x (cid:104) B (cid:105) k p i (cid:18) ji (cid:19) . (13)Using (7), (13) and the fact that (cid:80) ni =1 (cid:80) ij =1 g ij = 1 we obtain the following expression for the meanprotein level (cid:104) x (cid:105) = (cid:42) x n (cid:88) i =1 i (cid:88) j =1 g ij (cid:43) = n (cid:88) i =1 i (cid:88) j =1 (cid:104) xg ij (cid:105) = k x (cid:104) B (cid:105)(cid:104) T (cid:105) (cid:0) CV T (cid:1) . (14)It is important to point that (14) holds irrespective of the complexity, i.e., the number of sates G ij used in the phase-type distribution to approximate the cell-cycle time distribution. As expected, (cid:104) x (cid:105) increases linearly with the average cell-cycle time duration (cid:104) T (cid:105) with longer cell-cycles resulting in moreaccumulation of proteins. Consistent with previous findings, (14) shows that the mean protein level is alsoaffected by the randomness in the cell-cycle times ( CV T ) [40, 62]. For example, (cid:104) x (cid:105) reduces by 25% as T changes from being exponentially distributed ( CV T = 1) to periodic ( CV T = 0) for fixed (cid:104) T (cid:105) fixed. Next,we determine the noise in protein copy numbers, as quantified by the coefficient of variation squared. Recall that the full model introduced in Fig. 2A has three distinct noise mechanisms. Our strategy forcomputing the protein noise level is to first analyze the model with a single noise source, and then considermodels with two and three sources. As shown below, this approach provides a systematic dissection ofthe protein noise level into components representing contributions from different mechanisms.
We begin with the model shown in Fig. 2B, where noise comes from a single source - random cell-divisionevents. For this model, the time evolution of the second-order moment of the protein copy number isobtained as d (cid:104) x (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) , (15)and depends on third-order moments (cid:104) x g jj (cid:105) (see Appendix B). Using the approach introduced earlier forobtaining the mean protein level, we close moment equations by writing the time evolution of moments (cid:104) x g ij (cid:105) . Using (9) and (10) d (cid:104) x g i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) − k (cid:104) x g i (cid:105) , (16a) d (cid:104) x g ij (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg ij (cid:105) − k (cid:104) x g ij (cid:105) + k (cid:104) x g ( i − j (cid:105) , j = { , . . . , i } . (16b)Note that the moment dynamics for (cid:104) x (cid:105) and (cid:104) xg ij (cid:105) obtained in the previous section (equations (8) and(11)) are identical for all the models in Fig. 2, irrespective of whether the noise mechanism is modeleddeterministically or stochastically. Equations (8), (11), (15) and (16) represent a closed set of lineardifferential equations and their steady-state analysis yields (cid:104) x g ij (cid:105) = k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) CV T (cid:1) k p i + 2 k x (cid:104) B (cid:105) k (cid:18) j + ji (cid:19) p i . (17)From (96) (cid:104) x (cid:105) = (cid:42) x n (cid:88) i =1 i (cid:88) j =1 g ij (cid:43) = n (cid:88) i =1 i (cid:88) j =1 (cid:104) x g ij (cid:105) = k x (cid:104) B (cid:105) (cid:104) T (cid:105) + 4 CV T (cid:104) T (cid:105) + 6 (cid:104) T (cid:105) (cid:104) T (cid:105) , (18a) (cid:104) T (cid:105) = 2 k CV T + 3 k (cid:104) T (cid:105) + (cid:104) T (cid:105) , (18b)where (cid:104) T (cid:105) is the third-order moment of the cell-cycle time. Using (18) and the mean protein countquantified in (14), we obtain the following coefficient of variation squared CV E = 127 + 4 (cid:16) (cid:104) T (cid:105)(cid:104) T (cid:105) − − CV T − CV T (cid:17)
27 (3 + CV T ) , (19)which represents the noise contribution from random cell-division events. Since cell-division is a globalevent that affects expression of all genes, this noise contribution can also be referred to as extrinsicnoise [49, 63–66]. In reality, there would be other sources of extrinsic noise, such as, fluctuations in thegene-expression machinery that we have ignored in this analysis.Note that CV E → /
27 as T approaches a delta distribution, i.e., cell divisions occur at fixed timeintervals. We discuss simplifications of (19) in various limits. For example, if the time taken to completecell-cycle is lognormally distributed, then (cid:104) T (cid:105)(cid:104) T (cid:105) = (cid:0) CV T (cid:1) = ⇒ CV E = 127 + 4 (cid:0) CV T + 20 CV T + 9 CV T (cid:1)
27 (3 + CV T ) (20)and extrinsic noise monotonically increases with CV T . If fluctuations in T around (cid:104) T (cid:105) are small, thenusing Taylor series (cid:104) T (cid:105) / (cid:104) T (cid:105) ≈ CV T . (21)Substituting (21) in (19) and ignoring CV T and higher order terms yields CV E ≈
127 + 28 CV T , (22)where the first term is the extrinsic noise for CV T → Next, we consider the model illustrated in Fig. 2C with both random cell-division events and partitioningof protein between the mother and daughter cells. Thus, the protein noise level here represents thecontribution from both these sources. Analysis in Appendix C shows that the time evolution of (cid:104) x (cid:105) and (cid:104) x g ij (cid:105) are given by d (cid:104) x (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) + 14 αk (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) , (23a) d (cid:104) x g i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) + 14 αkp i (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:104) x g i (cid:105) , (23b) d (cid:104) x g ij (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg ij (cid:105) − k (cid:104) x g ij (cid:105) + k (cid:104) x g ( i − j (cid:105) , j = { , . . . , i } . (23c)Note that (23a)-(23b) are slightly different from their counterparts obtained in the previous section(equations (15) and (16a)) with additional terms that depend on α , where α quantifies the degree ofpartitioning error as defined in (5). As expected, (23) reduces to (15)-(16) when α = 0 (i.e., deterministicpartitioning). Computing (cid:104) x g ij (cid:105) by performing a steady-state analysis of (23) and using a similarapproach as in (18) we obtain (cid:104) x (cid:105) = (cid:104) T (cid:105) + 4 CV T (cid:104) T (cid:105) + 6 (cid:104) T (cid:105) (cid:104) T (cid:105) + 2 αk x (cid:104) B (cid:105)(cid:104) T (cid:105) . (24)Finding CV of the protein level and subtracting the extrinsic noise found in (19) yields CV R = 4 α CV T ) 1 (cid:104) x (cid:105) , (25)where CV R represents the contribution of partitioning errors to the protein noise level. Intriguingly, while CV R increases with α , it decrease with CV T . Thus, as cell-division times become more random for afixed (cid:104) T (cid:105) and (cid:104) x (cid:105) , the noise contribution from partitioning errors decrease. Finally, we consider the full model in Fig. 2A with all the three different noise sources. For this model,moment dynamics is obtained as (see Appendix D) d (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) + 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) + 14 αk (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) , (26a) d (cid:104) x g i (cid:105) dt = k x (cid:104) B (cid:105) p i i + 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) + 14 αkp i (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:104) x g i (cid:105) , (26b) d (cid:104) x g ij (cid:105) dt = k x (cid:104) B (cid:105) p i i + 2 k x (cid:104) B (cid:105)(cid:104) xg ij (cid:105) − k (cid:104) x g ij (cid:105) + k (cid:104) x g ( i − j (cid:105) , j = { , . . . , i } . (26c)Compared to (23), (26) has additional terms of the form k x (cid:104) B (cid:105) , where (cid:104) B (cid:105) is the second-order momentof the protein burst size in (1). Performing an identical analysis as before we obtain (cid:104) x (cid:105) = (cid:104) T (cid:105) + 4 CV T (cid:104) T (cid:105) + 6 (cid:104) T (cid:105) (cid:104) T (cid:105) + 2 αk x (cid:104) B (cid:105)(cid:104) T (cid:105) k x (cid:104) B (cid:105)(cid:104) T (cid:105) (3 CV T + 5)2 , (27) -4 Mean protein level per cell N o i se i n p r o t e i n l eve l s ( 𝐶 𝑉 ) -2 xCVCV TR
133 4 )( noiseng Partitioni xBBCVCVCV TTP
133 53 )( noise Production RPE
CVCVCVCV noise Total E CV noise cycle-Cell Figure 4. Scaling of noise as a function of the mean protein level for different mechanisms .The contribution of random cell-division events to the noise in protein copy numbers (extrinsic noise)is invariant of the mean. In contrast, contributions from partitioning errors at the time of cell-division(partitioning noise) and stochastic expression (production noise) scale inversely with the mean. Thescaling factors are shown as a function of the protein random burst size B , noise in cell-cycle time ( CV T )and magnitude of partitioning errors quantified by α (see (5)). With increasing mean level the total noisefirst decreases and then reaches a baseline that corresponds to extrinsic noise. For this plot, B is assumedto be geometrically-distributed with mean (cid:104) B (cid:105) = 1 . CV T = 0 and α = 1 (i.e., binomial partitioning).0which yields the following total protein noise level CV = CV E + CV R + CV P = CV E + Partitioning noise ( CV R ) (cid:122) (cid:125)(cid:124) (cid:123) α CV T ) 1 (cid:104) x (cid:105) + Production noise ( CV P ) (cid:122) (cid:125)(cid:124) (cid:123) CV T + 53(3 + CV T ) (cid:104) B (cid:105)(cid:104) B (cid:105) (cid:104) x (cid:105) (cid:124) (cid:123)(cid:122) (cid:125) Intrinsic noise , (28)that can be decomposed into three terms. The first is the extrinsic noise CV E representing the contributionfrom random cell-division events and given by (19). The second term CV R is the contribution frompartitioning errors determined in the previous section (partitioning noise), and the final term CV P is theadditional noise representing the contribution from stochastic expression (production noise). We refer tothe sum of the contributions from partitioning errors and stochastic expression as intrinsic noise . Theseintrinsic and extrinsic noise components are generally obtained experimentally using the dual-color assaythat measures the correlation in the expression of two identical copies of the gene [49].Interestingly, for a fixed mean protein level (cid:104) x (cid:105) , CV T has opposite effects on CV R and CV P . While CV R monotonically decreases with increasing CV T , CV P increases with CV T . It turns out that in certaincases these effects can cancel each other out. For example, when B = 1 with probability one, i.e., proteinsare synthesized one at a time at exponentially distributed time intervals and α = 1 (binomial partitioning) CV = CV E + 43(3 + CV T ) 1 (cid:104) x (cid:105) + 3 CV T + 53(3 + CV T ) 1 (cid:104) x (cid:105) = CV E + 1 (cid:104) x (cid:105) . (29)In this limit the intrinsic noise is always 1/Mean irrespective of the cell-cycle time distribution T [33].Note that the average number of proteins itself depends on T as shown in (14). Another important limitis CV T →
0, in which case (28) reduces to CV ≈ CV E (cid:122)(cid:125)(cid:124)(cid:123) (cid:124)(cid:123)(cid:122)(cid:125) Extrinsic noise + CV R (cid:122) (cid:125)(cid:124) (cid:123) α (cid:104) x (cid:105) + CV P (cid:122) (cid:125)(cid:124) (cid:123) (cid:104) B (cid:105)(cid:104) B (cid:105) (cid:104) x (cid:105) (cid:124) (cid:123)(cid:122) (cid:125) Intrinsic noise , (30)and is similar to the result obtained in [38] for deterministic cell-division times and binomial partitioning.Fig. 4 shows how different protein noise components change as a function of the mean protein levelas the gene’s transcription rate k x is modulated. The extrinsic noise is primarily determined by thedistribution of the cell-cycle time and is completely independent of the mean. In contrast, both CV R and CV P scale inversely with the mean, albeit with different scaling factors (Fig. 4). This observationis particularly important since many single-cell studies in E. coli , yeast and mammalian cells have foundthe protein noise levels to scale inversely with the mean across different genes [67–70]. Based on thisscaling it is often assumed that the observed cell-to-cell variability in protein copy numbers is a result ofstochastic expression. However, as our results show, noise generated thorough partitioning errors is alsoconsistent with these experimental observations and it may be impossible to distinguish between thesetwo noise mechanisms based on protein CV versus mean plots unless α is known. The full model introduced in Fig. 2 assumes that the transcription rate (i.e., the protein burst arrivalrate) is constant throughout the cell-cycle. This model is now extended to incorporate gene duplicationduring cell cycle, which is assumed to create a two-fold change in the burst arrival rate (Fig. 5). Asa result of this, accumulation of proteins will be bilinear as illustrated in Fig. 1. We divide the cell-cycle time T into two intervals: time from the start of cell-cycle to gene-duplication ( T ), and time from1gene-duplication to cell-division ( T ). T and T are independent random variables that follow arbitrarydistributions modeled through phase-type processes (see Fig. S2 in the Supplementary Information). Themean cell-cycle duration and its noise can be expressed as (cid:104) T (cid:105) = (cid:104) T (cid:105) + (cid:104) T (cid:105) , β = (cid:104) T (cid:105)(cid:104) T (cid:105) , CV T = β CV T + (1 − β ) CV T , (31)where CV X denotes the coefficient of variation squared of the random variable X . An important variablein this formulation is β , which represents the average time of gene-duplication normalized by the meancell-cycle time. Thus, β values close to 0 (1) imply that the gene is duplicated early (late) in the cell-cycleprocess. Moreover, the noise in the gene-duplication time is controlled via CV T . Cell-division
Gene-duplication x x xx Bxx Bxx x k x k Figure 5. Model illustrating stochasticexpression together with random gene-duplication and cell-division events.
At thestart of cell-cycle, protein production occurs instochastic bursts with rate k x . Genome duplica-tion occurs at a random point T within the cell-cycle and increases the burst arrival rate to 2 k x .Cell-division occurs after time T from genomeduplication, at which point the burst arrival ratereverts back to k x and proteins are randomly par-titioned between cells based on (4).We refer the reader to Appendix E for a detailed analysis of the model in Fig. 5 and only present themain results on the protein mean and noise levels. The steady-state mean protein count is given by (cid:104) x (cid:105) = k x (cid:104) B (cid:105)(cid:104) T (cid:105) (cid:0) − β + βCV T (cid:1) k x (cid:104) B (cid:105)(cid:104) T (cid:105) (cid:0) − β + (1 − β ) CV T (cid:1) , (32)and decreases with β , i.e., a gene that duplicates early has on average, more number of proteins. When β = 1, then the transcription rate is k x throughout the cell-cycle and we recover the mean protein levelobtained in (14). Similarly, when β = 0 the transcription rate is 2 k x and we obtain twice the amountas in (14). As per our earlier observation, more randomness in the timing of genome duplication andcell-division (i.e., higher CV T and CV T values) increases (cid:104) x (cid:105) .Our analysis shows that the total protein noise level can be decomposed into three components CV = CV E + CV R + CV P (33)where CV E is the extrinsic noise from random genome-duplication and cell-division events. Given itscomplexity, we refer the reader to equation (100) in Appendix E2 for an exact formula for CV E . More-over, the intrinsic noise, which represents the sum of contributions from partitioning errors ( CV R ) andstochastic expression ( CV P ) is obtained as CV R + CV P = CV R (cid:122) (cid:125)(cid:124) (cid:123) α (2 − β )3 (cid:0) ( β − β + 6) + β CV T + 2(1 − β ) CV T (cid:1) (cid:104) x (cid:105) + CV P (cid:122) (cid:125)(cid:124) (cid:123) (10 − β + 3 β ) + 6(1 − β ) CV T + 3 β CV T (cid:0) ( β − β + 6) + β CV T + 2(1 − β ) CV T (cid:1) (cid:104) B (cid:105)(cid:104) B (cid:105) (cid:104) x (cid:105) . (34) Note that for β = 0 and 1, we recover the intrinsic noise level in (28) from (34). Interestingly, for B = 1with probability 1 and α = 1, the intrinsic noise is always 1 / Mean irrespective of the values chosen for2 CV T , CV T and β . For high precision in the timing of cell-cycle events ( CV T → , CV T → CV ≈ CV E (cid:122) (cid:125)(cid:124) (cid:123) − β − β β − β + 6) (cid:124) (cid:123)(cid:122) (cid:125) Extrinsic noise + CV R (cid:122) (cid:125)(cid:124) (cid:123) α (2 − β )3 ( β − β + 6) 1 (cid:104) x (cid:105) + CV P (cid:122) (cid:125)(cid:124) (cid:123) (10 − β + 3 β )3 ( β − β + 6) (cid:104) B (cid:105)(cid:104) B (cid:105) (cid:104) x (cid:105) (cid:124) (cid:123)(cid:122) (cid:125) Intrinsic noise , (35)where mean protein level is given by (cid:104) x (cid:105) ≈ k x (cid:104) B (cid:105)(cid:104) T (cid:105) (4 − β )2 + k x (cid:104) B (cid:105)(cid:104) T (cid:105) (3 − β ) . (36)We investigate how different noise components in (35) vary with β as the mean protein level is heldfixed by changing k x . Fig. 6 shows that CV P follows a U-shaped profile with the optima occurring at β = 2 − √ ≈ . ≈
5% lower that its value at β = 0.An implication of this result is that if stochastic expression is the dominant noise source, then gene-duplication can result in slightly lower protein noise levels. In contrast to CV P , CV R has a maxima at β = 2 − √ ≈
6% higher than its value at β = 0 (Fig. 6). Analysis in Appendix E5 reveals that CV R and CV R follow the same qualitative shapes as in Fig. 6 for non-zero CV T and CV T . Interestingly,when CV T = CV T , the maximum and minimum values of CV R and CV P always occur at β = 2 − √ CV T = CV T = 1 (i.e., exponentially distributed T and T ), then the maximum value of CV R is 20% higher and the minimum value of CV P is 10% lower than their respective value for β = 0.Given that the effect of changing β on CV P and CV R is small and antagonistic, the overall affect ofgenome duplication on intrinsic noise may be minimal and hard to detect experimentally. We have investigated a model of protein expression in bursts coupled to discrete gene-duplication andcell-division events. The novelty of our modeling framework lies in describing the size of protein bursts, T (time between cell birth and gene duplication), T (time between gene duplication and cell division)and partitioning of molecules during cell division through arbitrary distributions. Exact formulas con-necting the protein mean and noise levels to these underlying distributions were derived. Furthermore,the protein noise level, as measured by the coefficient of variation squared, was decomposed into threecomponents representing contributions from gene-duplication/cell-division events, stochastic expressionand random partitioning. While the first component is independent of the mean protein level, the othertwo components are inversely proportional to it. Key insights obtained are as follows: • The mean protein level is affected by both the first and second-order moments of T and T . Inparticular, randomness in these times (for a fixed mean) increases the average protein count. • Random gene-duplication/cell-division events create an extrinsic noise term which is completelydetermined by moments of T and T up to order three. • The noise contribution from partitioning errors decreases with increasing randomness in T and T .Thus, if (cid:104) x (cid:105) is sufficiently small and α is large compared to B in (34), increasing noise in the timingof cell-cycle events decreases the total noise level. • Genome duplication has counter intuitive effects on the protein noise level (Fig. 6). For example,if stochastic expression is the dominant source of noise, then doubling of transcription due toduplication results in lower noise as compared to constant transcription throughout the cell-cycle.3
0 Gene-duplication time / Cell-cycle time N o r m a li z e d p r o t e i n n o i se l eve l 𝐶𝑉 𝐸2 ) Production noise ( 𝐶𝑉 𝑃2 ) Partitioning noise ( 𝐶𝑉 𝑅2 ) 0.9 Figure 6. Contributions from different noise sources as a function of the timing of genomeduplication for CV T = CV T = 0. Different noise components in (35) are plotted as a function of β , which represents the fraction of time within the cell-cycle at which gene-duplication occurs. Themean protein level is held constant by simultaneously changing the transcription rate k x . Noise levelsare normalized by their respective value at β = 0. The noise contribution from partitioning errors ismaximized at β ≈ .
6. In contrast, the contribution from stochastic expression is minimum at β ≈ . β ≈ . β ≈ . • For a non-bursty protein production process ( B = 1) and binomial partitioning ( α = 1), the netnoise from stochastic expression and partitioning is always 1 / (cid:104) x (cid:105) , the noise level predicted by aPoisson distribution.We discuss our results on gene duplication in further detail and how noise formulas derived here can beused for estimating model parameters from single-cell expression data. In this first-of-its-kind study, we have investigated how discrete two-fold changes in the transcription ratedue do gene duplication affect the intercellular variability in protein levels. Not surprisingly, the timingof genome duplication has a strong effect on the mean protein level – (cid:104) x (cid:105) changes by two-fold dependingon whether the gene duplicates early ( β = 0) or late ( β = 1) in the cell-cycle. In contrast, the effect of β on noise is quite small. As β is varied keeping (cid:104) x (cid:105) fixed, noise components deviate by ≈
10% fromtheir values at β = 0 (Fig. 6). Recall that these results are for a stable protein, whose intracellular copynumber accumulate in a bilinear fashion. A natural question to ask is how would these results change foran unstable protein?Consider an unstable protein with half-life considerably shorter than the cell-cycle duration. Thisrapid turnover ensures that the protein level equilibrates instantaneously after cell-division and gene-duplication events. Let γ x denote the protein decay rate. Then, the mean protein level before and aftergenome duplication is (cid:104) x (cid:105) = k x (cid:104) B (cid:105) /γ x and (cid:104) x (cid:105) = 2 k x (cid:104) B (cid:105) /γ x , respectively. Note that in the limit of large γ x there is no noise contribution form partitioning errors since errors incurred at the time of cell divisionwould be instantaneously corrected. The extrinsic noise, which can be interpreted as the protein noise4level for deterministic protein production and decay is obtained as (see Appendix F) CV E = (1 − β ) β (2 − β ) . (37)When β = 0 or 1, the transcription rate and the protein level are constant within the cell cycle and CV E = 0. Moreover, CV E is maximized at β = 2 / /
12. Thus, in contrast to a stableprotein, extrinsic noise in an unstable protein is strongly dependent on the timing of gene duplication.Next, consider the intrinsic noise component. Analysis in Appendix F shows that the noise contributionfrom random protein production and decay is CV P = 12 (cid:18) (cid:104) B (cid:105)(cid:104) B (cid:105) + 1 (cid:19) (cid:104) x (cid:105) , (cid:104) x (cid:105) = k x (cid:104) B (cid:105) (2 − β ) γ x . (38)While the mean protein level is strongly dependent on β , the intrinsic noise Fano factor = CV P × (cid:104) x (cid:105) is independent of it. Thus, similar to what was observed for a stable protein, the intrinsic noise in anunstable protein is invariant of β for a fixed (cid:104) x (cid:105) . Overall, these results suggest that studies quantifyingintrinsic noise in gene expression models, or using intrinsic noise to estimate model parameters (see below)can ignore the effects of gene duplication. Finally, note that the mean and noise levels obtained for anunstable protein are independent of the cell-cycle time T . Simple models of bursty expression and decay predict the distribution of protein levels to be negativebinomial (or gamma distributed in the continuous framework) [71,72]. These distributions are character-ized by two parameter – the burst arrival rate k x and the average burst size (cid:104) B (cid:105) , which can be estimatedfrom measured protein mean and noise levels. This method has been used for estimating k x and (cid:104) B (cid:105) across different genes in E. coli [47, 73]. Our detailed model that takes into account partitioning errorspredicts (ignoring gene duplication effects)Intrinsic noise = 4 α CV T ) 1 (cid:104) x (cid:105) + 3 CV T + 53(3 + CV T ) (cid:104) B (cid:105)(cid:104) B (cid:105) (cid:104) x (cid:105) . (39)Using CV T (cid:28) B [50, 74–76], (39) reduces toIntrinsic noise = 4 α (cid:104) x (cid:105) + 59 1 + 2 (cid:104) B (cid:105)(cid:104) x (cid:105) . (40)Given measurements of intrinsic noise and the mean protein level, (cid:104) B (cid:105) can be estimated from (40)assuming α = 1 (i.e., binomial partitioning). Once (cid:104) B (cid:105) is known, k x is obtained from the mean proteinlevel given by (14). Since for many genes (cid:104) B (cid:105) ≈ . − (cid:104) B (cid:105) . Overestimation would be even moresevere if α happen to be much higher than 1, as would be the case for proteins that form aggregatesor multimers [33]. One approach to estimate both (cid:104) B (cid:105) and α is to measure intrinsic noise changes inresponse to perturbing (cid:104) B (cid:105) by, for example, changing the mRNA translation rate through mutations inthe ribosomal-binding sites (RBS). Consider a hypothetical scenario where the Fano Factor (intrinsicnoise times the mean level) is 6. Let mutations in the RBS reduces (cid:104) x (cid:105) by 50%, implying a 50% reductionin (cid:104) B (cid:105) . If the Fano factor changes from 6 to 4 due to this mutation, then (cid:104) B (cid:105) = 3 . (cid:104) α (cid:105) = 3 . (cid:104) B (cid:105) and α .5 An important limitation of our modeling approach is that it does not take into account the size of growingcells. Recent experimental studies have provided important insights into the regulatory mechanismscontrolling cell size [79–81]. More specifically, studies in
E. coli and yeast argue for an “adder” model,where cell-cycle timing is controlled so as to add a constant volume between cell birth and division[82–84]. Assuming exponential growth, this implies that the time taken to complete cell-cycle is negativelycorrelated with cell size at birth. In addition, cell size also affects gene expression – in mammalian cellstranscription rates linearly increase with the cell size [85]. Thus, as cells become bigger they also producemore mRNAs to ensure gene product concentrations remains more or less constant. An importantdirection of future work would to explicitly include cell size with size-dependent expression and timing ofcell division determined by the adder model. This formulation will for the first time, allow simultaneousinvestigation of stochasticity in cell size, protein molecular count and concentration.Our study ignores genetic promoter switching between active and inactive states, which has beenshown to be a major source of noise in the expression of genes across organisms [86–95]. Taking intoaccount promote switching is particularly important for genome duplication studies, where doubling thenumber of gene copies could lead to more efficient averaging of promoter fluctuations. Another direction offuture work will be to incorporate this addition noise source into the modeling framework and investigateits contribution as a function of gene-duplication timing.
AppendixA Mean of protein in the presence of cell-cycle variations
Based on standard stochastic formulation of chemical kinetics [96, 97], the model introduced in Figure2A coupled with phase-type distribution introduced in Figure 3 contains the following stochastic events
Event Reset
Propensity ,)()( tgtg ijij )()( )()( tgtg jiji , ij kg },,,{ ni },,{ ij Phase-type Evolution
Cell-division ),()( ss txtx ,)( sjj tg )()( sisi tgtg },,{ ni nj jji gkp , utxtx )()( Protein production " pk x u Note that x + ( t s ) is protein level after division, characteristics of x + ( t s ) is related to protein level beforedivision as shown in equation (5) of the main text. Whenever an event occurs, protein level and states ofphase-type distribution change based on the stoichiometries shown in the second column of the table. Thethird column of table shows event propensity function f ( x, g ij ), which determines how often reactionsoccur, i.e., the probability that an event occurs in the next infinitesimal time interval ( t, t + dt ] is f ( x, g ij ) dt .Protein production is a stochastic event which happens in bursts, each burst generates B molecules where6 B is a general random variable with distributionProbability { B = u } = p (cid:48)(cid:48) u , u ∈ { , , . . . , ∞} . (41)The probability of having a burst in the time interval ( t, t + dt ] is k x p (cid:48)(cid:48) u dt . Events related to time evolutionof phase-type distribution happen with a constant rate k . Cell-division changes both the level of proteinand states of phase-type. This event contains start of new cell-cycle, hence whenever this event occurs,the last state of phase-type distribution resets to zero, and a new cell-cycle which is sum of i exponentialsstarts with probability p i ; protein count level also resets to x + ( t s ). The probability of cell-division andstarting a new cell-cycle from state g i in the time interval ( t, t + dt ] is kp i (cid:80) nj =1 dt .Theorem 1 of [55] gives the time derivative of the expected value of any function ϕ ( x, g ij ) as d (cid:104) ϕ ( x, g ij ) (cid:105) dt = (cid:42) (cid:88) Events ∆ ϕ ( x, g ij ) × f ( x, g ij ) (cid:43) , (42)where ∆ ϕ ( x, g ij ) is a change in ϕ when an event occurs. Based on this setup, mean dynamics of proteincan be written by choosing ϕ to be xd (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) + k (cid:42) n (cid:88) j =1 ( x − x ) g jj (cid:43) ⇒ d (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) − k (cid:42) n (cid:88) j =1 xg jj (cid:43) , (43)where we replaced conditional expected value of x + by x/ x + and x shown in equation (5).Dynamics of (cid:104) x (cid:105) is not closed and depends to moments (cid:104) xg jj (cid:105) , hence in order to have a closed set ofequations we add new moments dynamics by selecting ϕ to be xg ij . We do it in two steps: first we writethe moment dynamics of (cid:104) xg (cid:105) d (cid:104) xg (cid:105) dt = k x (cid:104) B (cid:105)(cid:104) g (cid:105) + k p (cid:10) xg (cid:11) − kp (cid:10) xg (cid:11) − k n (cid:88) i =2 p i (cid:104) xg (cid:105) . (44)In the equation (9) of the main text it has been shown that (cid:104) g nij x m (cid:105) = (cid:104) g ij x m (cid:105) , n ∈ { , , . . . } , (45)thus the term (cid:10) xg (cid:11) will simplify as (cid:10) xg (cid:11) = (cid:104) xg (cid:105) , (46)and the dynamics of (cid:104) xg (cid:105) can be written as d (cid:104) xg (cid:105) dt = k x (cid:104) B (cid:105)(cid:104) g (cid:105) + k p (cid:104) xg (cid:105) − k (cid:104) xg (cid:105) . (47)In the second step we write dynamics of the moments of the form (cid:104) xg ij (cid:105) other than (cid:104) xg (cid:105) d (cid:104) xg i (cid:105) dt = k x (cid:104) B (cid:105)(cid:104) g i (cid:105) + kp i (cid:42) n (cid:88) j =1 ( x x g i − xg i ) g jj (cid:43) − k (cid:104) xg i (cid:105) , (48a) d (cid:104) xg ij (cid:105) dt = k x (cid:104) B (cid:105)(cid:104) g ij (cid:105) − k (cid:104) xg ij (cid:105) + k (cid:104) xg i ( j − (cid:105) , j ∈ { , . . . , i } , (48b)7where dynamics of (cid:104) xg i (cid:105) can be written as d (cid:104) xg i (cid:105) dt = k x (cid:104) B (cid:105) p i i + kp i (cid:42) n (cid:88) j =1 x g jj (cid:43) + kp i (cid:42) n (cid:88) j =1 − x g i g jj (cid:43) − k (cid:104) xg i (cid:105) . (49)The equation (10) in the main text shows that (cid:104) g ij g rq x m (cid:105) = 0 , if i (cid:54) = r or j (cid:54) = q, (50)hence (cid:68)(cid:80) nj =1 x g i g jj (cid:69) = 0, and equation (49) simplifies to d (cid:104) xg i (cid:105) dt = k x (cid:104) B (cid:105)(cid:104) g i (cid:105) + k p i (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:104) xg i (cid:105) . (51)Further based on Figure 3 in the main text the probability of selecting a branch of i exponentials is p i , and because all the transitions happen with a constant rate k , hence mean of each of these i states is (cid:104) g ij (cid:105) = p i i . (52)Thus equations (47), (48b), and (51) can be compactly written as shown in equation (11). B Moment dynamics of hybrid model introduced in Figure 2B
Stochastic hybrid system introduced in Figure 2B coupled with phase-type distribution contains thefollowing stochastic events
Event
Reset
Propensity ,)()( tgtg ijij )()( )()( tgtg jiji , ij kg },,,{ ni },,{ ij Phase-type Evolution Cell-division ,/)()( ss txtx ,)( sjj tg )()( sisi tgtg },,{ ni nj jji gkp , and deterministic protein production dynamics˙ x = k x (cid:104) B (cid:105) . (53)Time derivative of the expected value of any function ϕ ( x, g ij ) for this hybrid system can be writtenas [55] d (cid:104) ϕ ( x, g ij ) (cid:105) dt = (cid:42) (cid:88) Events ∆ ϕ ( x, g ij ) × f ( x, g ij ) (cid:43) + (cid:28) ∂ϕ ( x, g ij ) ∂x k x (cid:104) B (cid:105) (cid:29) , (54)8where the first term in the right-hand side is contributed from stochastic events and the second term iscontributed from deterministic protein production dynamics. Based on this equation, the mean dynamicsof the protein is calculated by choosing ϕ to be xd (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) − k (cid:42) n (cid:88) j =1 xg jj (cid:43) , (55)which is the same as equation (43). In addition to mean, dynamics of (cid:104) xg ij (cid:105) are also equal to theirequation in the previous section.The second order moment dynamics of protein can be expressed by choosing ϕ to be x d (cid:104) x (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) + k (cid:42) n (cid:88) j =1 (cid:18)(cid:16) x (cid:17) − x (cid:19) g jj (cid:43) , (56)which can be simplified as d (cid:104) x (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) . (57)In order to have a closed set of equations we select ϕ to be of the form x g ij . At the first step we writemoment dynamics of (cid:104) x g (cid:105) d (cid:104) x g (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg (cid:105) + k p (cid:10) x g (cid:11) − kp (cid:10) x g (cid:11) − k n (cid:88) i =2 p i (cid:104) x g (cid:105) . (58)Based on equation (9) of the main text, the term (cid:10) x g (cid:11) simplifies as (cid:10) x g (cid:11) = (cid:10) x g (cid:11) , (59)hence dynamics of (cid:104) x g (cid:105) will be d (cid:104) x g (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg (cid:105) + k p (cid:10) x g (cid:11) − k (cid:104) x g (cid:105) . (60)In the second step, we write dynamics of moments (cid:104) x g ij (cid:105) when g ij (cid:54) = g d (cid:104) x g i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + kp i (cid:42) n (cid:88) j =1 (cid:18) x x g i − x g i (cid:19) g jj (cid:43) − k (cid:104) x g i (cid:105) , (61a) d (cid:104) x g ij (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg ij (cid:105) − k (cid:104) x g ij (cid:105) + k (cid:104) x g ( i − j (cid:105) , j = { , . . . , i } , (61b)where dynamics of (cid:104) x g i (cid:105) can be shown to follow d (cid:104) x g i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) − k p i (cid:42) n (cid:88) j =1 x g i g jj (cid:43) − k (cid:104) x g i (cid:105) . (62)Based on equation (10) in the main text (cid:68)(cid:80) nj =1 x g i g jj (cid:69) = 0, thus equation (62) simplifies to d (cid:104) x g i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) − k (cid:104) x g i (cid:105) . (63)Equations (60), (61b), and (63) can be compactly written as equation (16) in the main text.9 C Moment dynamics of hybrid model introduced in Figure 2C
Stochastic hybrid system introduced in Figure 2C coupled with phase-type distribution contains thefollowing stochastic events
Event
Reset
Propensity ,)()( tgtg ijij )()( )()( tgtg jiji , ij kg },,,{ ni },,{ ij Phase-type Evolution Cell-division ),()( ss txtx ,)( sjj tg )()( sisi tgtg },,{ ni nj jji gkp , and deterministic protein production dynamics˙ x = k x (cid:104) B (cid:105) . (64)Note that in this model x ( t ) is a continuous random variable, thus we also use a continuous distributionto describe x + ( t s ), however statistical properties of x + ( t s ) is still given by (5). For this model we still canuse equation (54) to derive moment dynamics; equations describing time evolution of mean and (cid:104) xg ij (cid:105) are the same as previous models, thus mean of protein for this model is equal to its value in Appendix A.The second order moment dynamics of protein can be written by choosing ϕ to be x in equation (54) d (cid:104) x (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) + k (cid:42) n (cid:88) j =1 (cid:18) x αx − x (cid:19) g jj (cid:43) , (65)where conditional expected value of x is substituted based on equation (5). Dynamics of (cid:104) x (cid:105) can besimplified as d (cid:104) x (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) + αk (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) . (66)The same as before we add dynamics of the form (cid:104) x g ij (cid:105) to have a closed set of dynamics. First we adddynamics of (cid:104) x g (cid:105) d (cid:104) x g (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg (cid:105) + αk p (cid:10) xg (cid:11) + k p (cid:10) x g (cid:11) − kp (cid:10) x g (cid:11) − k n (cid:88) i =2 p i (cid:104) x g (cid:105) , (67)Based on equation (9) of the main text dynamics of (cid:104) x g (cid:105) simplifies to d (cid:104) x g (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg (cid:105) + αk p (cid:104) xg (cid:105) + k p (cid:10) x g (cid:11) − k (cid:104) x g (cid:105) . (68)Now we express dynamics of moments (cid:104) x g ij (cid:105) for g ij (cid:54) = g d (cid:104) x g i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + kp i (cid:42) n (cid:88) j =1 (cid:18) x x g i + αx αx g i − x g i (cid:19) g jj (cid:43) − k (cid:104) x g i (cid:105) , (69a) d (cid:104) x g ij (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg ij (cid:105) − k (cid:104) x g ij (cid:105) + k (cid:104) x g ( i − j (cid:105) , j = { , . . . , i } , (69b)0where dynamics of (cid:104) x g i (cid:105) can be shown as d (cid:104) x g i (cid:105) dt =2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + αk p i (cid:42) n (cid:88) j =1 xg jj (cid:43) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) + αk p i (cid:42) n (cid:88) j =1 xg i g jj (cid:43) − k p i (cid:42) n (cid:88) j =1 x g i g jj (cid:43) − k (cid:104) x g i (cid:105) . (70)Based on equation (10) in the main text (cid:68)(cid:80) nj =1 x g i g jj (cid:69) = 0, and (cid:68)(cid:80) nj =1 xg i g jj (cid:69) = 0, hence equation(70) simplifies to d (cid:104) x g i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + αk p i (cid:42) n (cid:88) j =1 xg jj (cid:43) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) − k (cid:104) x g i (cid:105) . (71)Equations (66), (68), (69b), and (71) can be compactly written as equation (23) in the main text. D Second and third-order moment dynamics of the full model
Based on model introduced in Appendix A, second order moment dynamics of protein is expressed bychoosing ϕ to be x in equation (42), d (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) + 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) + k (cid:42) n (cid:88) j =1 (cid:18) x αx − x (cid:19) g jj (cid:43) , (72)where conditional expected value of x is substituted based on equation (5). Dynamics of (cid:104) x (cid:105) can besimplified as d (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) + 2 k x (cid:104) B (cid:105)(cid:104) x (cid:105) + αk (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) . (73)The same as before we add dynamics of the form (cid:104) x g ij (cid:105) to have a closed set of moments. First we writedynamics of (cid:104) x g (cid:105) d (cid:104) x g (cid:105) dt = k x (cid:104) B (cid:105) p + 2 k x (cid:104) B (cid:105)(cid:104) xg (cid:105) + αk p (cid:10) xg (cid:11) + k p (cid:10) x g (cid:11) − kp (cid:10) x g (cid:11) − k n (cid:88) i =2 p i (cid:104) x g (cid:105) , (74)Based on equation (9) of the main text dynamics of (cid:104) x g (cid:105) simplifies to d (cid:104) x g (cid:105) dt = k x (cid:104) B (cid:105) p + 2 k x (cid:104) B (cid:105)(cid:104) xg (cid:105) + αk p (cid:104) xg (cid:105) + k p (cid:10) x g (cid:11) − k (cid:104) x g (cid:105) . (75)Next, dynamics of moments (cid:104) x g ij (cid:105) when g ij (cid:54) = g can be written as d (cid:104) x g i (cid:105) dt = k x (cid:104) B (cid:105) p i i + 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + kp i (cid:42) n (cid:88) j =1 (cid:18) x x g i + αx αx g i − x g i (cid:19) g jj (cid:43) − k (cid:104) x g i (cid:105) , (76a) d (cid:104) x g ij (cid:105) dt = k x (cid:104) B (cid:105) p i i + 2 k x (cid:104) B (cid:105)(cid:104) xg ij (cid:105) − k (cid:104) x g ij (cid:105) + k (cid:104) x g ( i − j (cid:105) , j = { , . . . , i } , (76b)1where dynamics of (cid:104) x g i (cid:105) can be shown as d (cid:104) x g i (cid:105) dt = k x (cid:104) B (cid:105) p i i + 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + αk p i (cid:42) n (cid:88) j =1 xg jj (cid:43) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) + αk p i (cid:42) n (cid:88) j =1 xg i g jj (cid:43) − k p i (cid:42) n (cid:88) j =1 x g i g jj (cid:43) − k (cid:104) x g i (cid:105) . (77)Based on equation (10) in the main text (cid:68)(cid:80) nj =1 x g i g jj (cid:69) = 0 and (cid:68)(cid:80) nj =1 xg i g jj (cid:69) = 0, hence equation(77) simplifies to d (cid:104) x g i (cid:105) dt = k x (cid:104) B (cid:105) p i i + 2 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + αk p i (cid:42) n (cid:88) j =1 xg jj (cid:43) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) − k (cid:104) x g i (cid:105) . (78)Equations (73), (75), (76b), and (78) can be compactly written as equation (26) in the main text. E Contribution of different sources of stochasticity in proteinby taking into account gene-duplication
We study the contribution of different sources of stochasticity by using models introduced in FigureS1. The cell-cycle time consists of two time intervals: the time interval before gene-duplication andthe time after gene-duplication. These time intervals are modeled by using two independent phase-typedistributions as shown in Figure S2. Based on phase-type characteristics mean of the states of the firstphase-type (cid:104) s ij (cid:105) and the second phase-type (cid:104) g ij (cid:105) are (cid:104) s ij (cid:105) = p i i β, i ∈ { , . . . , n } , j ∈ { , . . . , i } , (cid:104) g ij (cid:105) = p (cid:48) i i (1 − β ) , i ∈ { , . . . , n } , j ∈ { , . . . , i } , (79)where β is defined as β := Mean time interval before gene-duplicationMean cell-cycle time = (cid:104) T (cid:105)(cid:104) T (cid:105) . (80)We start our analysis by deriving mean level of protein in the next section. E.1 Mean of protein count level in the presence of gene-duplication
After gene-duplication the amount of genes expressing a specific protein doubles. Thus the rate of proteinproduction increases by a factor of two as shown in Figure S1A. This model coupled with phase-typedistributions contains the following stochastic eventsNote that in the protein production event, before gene-duplication all the states g ij are zero thus propen-sity function will be k x p (cid:48)(cid:48) u . After gene-duplication and before division, one of the states g ij is one hencepropensity function will be 2 k x p (cid:48)(cid:48) u . In time of gene-duplication, states of the first phase-type will reset tozero and state g i of the second distribution will be selected with probability p (cid:48) i ; hence propensity functionof gene-duplication event is k p (cid:48) i (cid:80) n j =1 s jj . At the end of cell-cycle, states of the second phase-type willreset to zero and a new cell-cycle which is sum of i exponentials will be selected with probability p i ; thuspropensity function of cell-division event is k p i (cid:80) n j =1 g jj .2 B) Stochastic cell-cycle,
Stochastic gene-duplication, Deterministic partitioning, Deterministic production A ) Stochastic cell-cycle, Stochastic gene-duplication, Stochastic partitioning, Stochastic production
Cell-division
Gene-duplication x x xx Bxx Bxx x k x k Cell-division
Gene-duplication / xx Cell-division Gene-duplication xx Bkx x Bkx x Bkx x Bkx x C) Stochastic cell-cycle,
Stochastic gene-duplication, Deterministic partitioning, Deterministic production
Figure S1:
Stochastic hybrid models for quantifying different sources of noise.
Gene-duplicationand cell-division times are random events. A) Protein production happens in random bursts with burstfrequency k x . After gene-duplication event burst frequency doubles (2 k x ). In the time of division proteinswill be distributed between mother and daughter cells randomly, and the protein burst frequency willbe k x again. B) Protein production is considered in a deterministic fashion, and after gene-duplicationdynamics of protein production is multiplied by a factor of two, i.e., ˙ x = 2 k x (cid:104) B (cid:105) . In the division eventproteins are distributed between mother and daughter cells equally. Thus the only stochastic events areduplication and division events. C) Protein is produced in a deterministic fashion, but in time of divisionprotein levels in daughter and mother cells are random. Thus duplication, division, and partitioning arerandom events.Theorem 1 of [55] gives the time derivative of the expected value of any function ϕ ( x, s ij , g ij ) as d (cid:104) ϕ ( x, s ij , g ij ) (cid:105) dt = (cid:42) (cid:88) Events ∆ ϕ ( x, s ij , g ij ) × f ( x, s ij , g ij ) (cid:43) , (81)where ∆ ϕ ( x, s ij , g ij ) is a change in ϕ when an event occurs. The first-order moment dynamic of thismodel can be expressed by selecting ϕ to be x in equation (81) d (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) (cid:42) n (cid:88) i =1 i (cid:88) j =1 g ij (cid:43) − k (cid:42) n (cid:88) j =1 ( x − x ) g jj (cid:43) , (82)where conditional expected value of x + is replaced from equation (5); by using equation (79) mean3 Cell division Cell division
Gene duplication
Gene duplication Cell division ' p n p ' ' p k k k k k k k Gene duplication S S S n S n S nn S Gene duplication p n p p k k k k k k k Start cell cycle G G G n G n G nn G Figure S2:
Cell-cycle time consists of two time intervals: at the end of the first intervalgene duplicates, and at the end of the second one cell divides.
Two independent phase-typedistributions are used to model cell-cycle time in the presence of genome duplication. The states of the firstdistribution are denoted by S ij , i = { , . . . , n } , j = { , . . . , i } ; transition between these states happens ata constant rate k . The states of the second distribution are shown by G ij , i = { , . . . , n } , j = { , . . . , i } ,and transition between these states occurs at a rate k .dynamics can be simplified as d (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) (2 − β ) − k (cid:42) n (cid:88) j =1 xg jj (cid:43) , (83)Mean dynamics is not closed thus we add dynamics of (cid:104) xs ij (cid:105) , i = { , . . . , n } , j = { , . . . , i } and (cid:104) xg ij (cid:105) , i = { , . . . , n } , j = { , . . . , i } to have a closed set of moment equations. These moment dynamicsare simplified by using equations (5), (9), (10) and (79) as d (cid:104) xs i (cid:105) dt = k x (cid:104) B (cid:105) p (cid:48) i βi + k p (cid:48) i (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:104) xs i (cid:105) , (84a) d (cid:104) xs ij (cid:105) dt = k x (cid:104) B (cid:105) p (cid:48) i βi − k (cid:104) xs ij (cid:105) + k (cid:104) xs i ( j − (cid:105) , j = { , . . . , i } , (84b) d (cid:104) xg i (cid:105) dt = 2 k x (cid:104) B (cid:105) p i (1 − β ) i + k p i (cid:42) n (cid:88) j =1 xs jj (cid:43) − k (cid:104) xg i (cid:105) , (84c) d (cid:104) xg ij (cid:105) dt = 2 k x (cid:104) B (cid:105) p i (1 − β ) i − k (cid:104) xg ij (cid:105) + k (cid:104) xg i ( j − (cid:105) , j = { , . . . , i } . (84d)In order to find the mean of protein, first we need to find the moments (cid:104) xs ij (cid:105) , i = { , . . . , n } , j =4 Event Reset
Propensity utxtx )()( Protein production ni ij ijx gpk " ,)()( tsts ijij )()( )()( tsts jiji , ij sk },,,{ ni },,{ ij First phase-type evolution Cell-division ),()( ss txtx ,)( sjj tg )()( sisi tsts },,{ ni Gene-duplication ,)( ts jj )()( tgtg ii nj jj spk ,' },,{ ni ,)()( tgtg ijij )()( )()( tgtg jiji , ij gk },,,{ ni },,{ ij Second phase-type evolution i nj jji gpk , u { , . . . , i } and (cid:104) xg ij (cid:105) , i = { , . . . , n } , j = { , . . . , i } . For calculating these moments we should calculatethe term (cid:68)(cid:80) n j =1 xg jj (cid:69) ; this term can be obtained by analyzing equation (83) in steady-state k x (cid:104) B (cid:105) (2 − β ) = k (cid:42) n (cid:88) j =1 xg jj (cid:43) ⇒ (cid:42) n (cid:88) j =1 xg jj (cid:43) = 2 k x (cid:104) B (cid:105) (2 − β ) k . (85)By having this term, we calculate (cid:104) xs ij (cid:105) by recursion process: we start by calculating (cid:104) xs i (cid:105) by substi-tuting equation (85) in equation (84a). In the next step we use the definition we derived for (cid:104) xs i (cid:105) tocalculate (cid:104) xs i (cid:105) from equation (84b). We continue this process until we derive all the moments (cid:104) xs ij (cid:105) = k x (cid:104) B (cid:105) k p (cid:48) i (cid:18) β ji + (2 − β ) (cid:19) , i = { , . . . , n } , j = { , . . . , i } . (86)Now we need to calculate the moments (cid:104) xg ij (cid:105) , i = { , . . . , n } , j = { , . . . , i } , thus we need the expressionof the term (cid:68)(cid:80) n j =1 xs jj (cid:69) ; from equation (86) we have the following (cid:42) n (cid:88) j =1 xs jj (cid:43) = 2 k x (cid:104) B (cid:105) k . (87)Substituting this term in equations (84c) and (84d) result in (cid:104) xg ij (cid:105) = 2 k x (cid:104) B (cid:105) k p i (cid:18) (1 − β ) ji + 1 (cid:19) , i = { , . . . , n } , j = { , . . . , i } . (88)5Note that n (cid:88) i =1 i (cid:88) j =1 s ij + n (cid:88) i =1 i (cid:88) j =1 g ij = 1 ⇒ (cid:104) x (cid:105) = (cid:42) x n (cid:88) i =1 i (cid:88) j =1 s ij + n (cid:88) i =1 i (cid:88) j =1 g ij (cid:43) ⇒ (cid:104) x (cid:105) = n (cid:88) i =1 i (cid:88) j =1 (cid:104) xs ij (cid:105) + n (cid:88) i =1 i (cid:88) j =1 (cid:104) xg ij (cid:105) . (89)Thus by adding all the term calculated here and using equation (7) mean of protein can be calculated as (cid:104) x (cid:105) = k x (cid:104) B (cid:105)(cid:104) T (cid:105) (cid:0) − β + βCV T (cid:1) k x (cid:104) B (cid:105)(cid:104) T (cid:105) (cid:0) − β + (1 − β ) CV T (cid:1) . (90)6 E.2 Noise in protein count level contributed from cell-cycle time
In order to calculate the noise contributed from cell-cycle time variation, the model introduced in FigureS1B coupled with phase-type distributions is used. This model contains following stochastic events
Event Reset
Propensity ,)()( tsts ijij )()( )()( tsts jiji , ij sk },,,{ ni },,{ ij First phase-type evolution Cell-division ,/)()( ss txtx ,)( sjj tg )()( sisi tsts },,{ ni Gene-duplication ,)( ts jj )()( tgtg ii nj jj spk ,' },,{ ni ,)()( tgtg ijij )()( )()( tgtg jiji , ij gk },,,{ ni },,{ ij Second phase-type evolution nj jji gpk , i and deterministic protein production ˙ x = k x (cid:104) B (cid:105) n (cid:88) i =1 i (cid:88) j =1 g ij . (91)Theorem 1 of [55] gives the time derivative of the expected value of any function ϕ ( x, s ij , g ij ) as d (cid:104) ϕ ( x, s ij , g ij ) (cid:105) dt = (cid:42) (cid:88) Events ∆ ϕ ( x, s ij , g ij ) × f ( x, s ij , g ij ) (cid:43) + (cid:42) ∂ϕ ( x, g ij ) ∂x k x (cid:104) B (cid:105) n (cid:88) i =1 i (cid:88) j =1 g ij (cid:43) , (92)where the first term in the right hand side is contributed from stochastic events, and the second term iscontributed from deterministic protein production. In this model, dynamics of (cid:104) x (cid:105) , (cid:104) xs ij (cid:105) and (cid:104) xg ij (cid:105) arethe same as equations (83) and (E.6), thus mean of protein, (cid:104) xs ij (cid:105) , and (cid:104) xg ij (cid:105) will be equal to their valuein previous section. Further, the second-order moment dynamics of protein can be added by selecting ϕ to be x in equation (92) d (cid:104) x (cid:105) dt = 2 k x (cid:104) B (cid:105) (cid:104) x (cid:105) + (cid:42) n (cid:88) i =1 i (cid:88) j =1 xg ij (cid:43) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) . (93)This equation is not closed thus we add dynamics of (cid:104) x s ij (cid:105) , i = { , . . . , n } , j = { , . . . , i } and7 (cid:104) x g ij (cid:105) , i = { , . . . , n } , j = { , . . . , i } to have a closed set of equations d (cid:104) x s i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xs i (cid:105) + k p i (cid:42) n (cid:88) j =1 x g jj (cid:43) − k (cid:104) x s i (cid:105) , (94a) d (cid:104) x s ij (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xs ij (cid:105) − k (cid:104) x s ij (cid:105) + k (cid:104) x s ( i − j (cid:105) , j = { , . . . , i } , (94b) d (cid:104) x g i (cid:105) dt = 4 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + k p i (cid:42) n (cid:88) j =1 x s jj (cid:43) − k (cid:104) x g i (cid:105) , (94c) d (cid:104) x g ij (cid:105) dt = 4 k x (cid:104) B (cid:105)(cid:104) xg ij (cid:105) − k (cid:104) x g ij (cid:105) + k (cid:104) x g ( i − j (cid:105) , j = { , . . . , i } . (94d)In order to calculate noise we need to express (cid:104) x s ij (cid:105) , and (cid:104) x g ij (cid:105) , which requires calculating the term (cid:104) (cid:80) n j =1 x g jj (cid:105) ; this term can be derived by analyzing equation (93) in steady-state3 k (cid:42) n (cid:88) j =1 x g jj (cid:43) = 2 k x (cid:104) B (cid:105) (cid:104) x (cid:105) + (cid:42) n (cid:88) i =1 i (cid:88) j =1 xg ij (cid:43) ⇒ (cid:42) n (cid:88) j =1 x g jj (cid:43) = 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k + 16 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k , (95)where in deriving this term we used equation (90) and we summed all the terms in equation (88). Byhaving this term, we calculate (cid:104) x s ij (cid:105) by recursion process. we derive (cid:104) x s i (cid:105) by substituting equation(95) in equation (94a). In the next step we use the definition of (cid:104) x s i (cid:105) to calculate (cid:104) x s i (cid:105) from equation(94b). We continue this process until we derive all the moments (cid:104) x s ij (cid:105) = k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k p (cid:48) i + 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k p (cid:48) i + 2 k x (cid:104) B (cid:105) k p (cid:48) i (cid:18) βj + (2 − β ) ji (cid:19) , i = { , . . . , n } , j = { , . . . , i } . (96)Expressing (cid:104) x g ij (cid:105) requires calculation of the term (cid:104) (cid:80) n j =1 x s jj (cid:105) which can be obtained from equation(96) as (cid:42) n (cid:88) j =1 x s jj (cid:43) = 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k + 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k . (97)Thus (cid:104) x g ij (cid:105) can be obtained with a recursion process from equations (94c) and (94d) (cid:104) x g ij (cid:105) = 4 k x (cid:104) B (cid:105) (cid:104)(cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k p i + 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k p i + 8 k x (cid:104) B (cid:105) k p i (cid:18) (1 − β ) j + ji (cid:19) , i = { , . . . , n } , j = { , . . . , i } . (98)Note that (cid:80) n i =1 (cid:80) ij =1 (cid:104) x s ij (cid:105) + (cid:80) n i =1 (cid:80) ij =1 (cid:104) x g ij (cid:105) = (cid:104) x (cid:105) thus the second order moment of protein canbe derived by adding all the terms in equations (96) and (98). (cid:104) x (cid:105) can be simplified by using equations(7) and (18b) in the main article as (cid:104) x (cid:105) = k x (cid:104) B (cid:105) (cid:104) T (cid:105) + 16 (cid:104) T (cid:105) + 2 (cid:104) T (cid:105) (3(2 − β ) + β (5 − β ) CV T + 8(1 − β ) CV T )3 (cid:104) T (cid:105) . (99)8Finally, using the definition of CV results in noise of protein raised from cell-cycle time variations CV E = (4 (cid:104) T (cid:105) + 16 (cid:104) T (cid:105) ) / (cid:104) T (cid:105) − − β + β ) (cid:16) ( β − β + 6) + β CV T + 2(1 − β ) CV T (cid:17) − β ( β ( − CV T )) CV T − β CV T + (1 − β ) CV T )(2 − β + 3 β + 3( β CV T + (1 − β ) CV T ))3 (cid:16) ( β − β + 6) + β CV T + 2(1 − β ) CV T (cid:17) . (100) E.3 Noise in protein count level contributed from random partitioning
In order to take into account noise caused by random partitioning of proteins between two daughter cells,we use the model shown in Figure S1C coupled with phase-type distributions. This model contains thefollowing stochastic events
Event Reset
Propensity ,)()( tsts ijij )()( )()( tsts jiji , ij sk },,,{ ni },,{ ij First phase-type evolution Cell-division ),()( ss txtx ,)( sjj tg )()( sisi tsts },,{ ni Gene-duplication ,)( ts jj )()( tgtg ii nj jj spk ,' },,{ ni ,)()( tgtg ijij )()( )()( tgtg jiji , ij gk },,,{ ni },,{ ij Second phase-type evolution nj jji gpk , i and deterministic protein production ˙ x = k x (cid:104) B (cid:105) n (cid:88) i =1 i (cid:88) j =1 g ij . (101)Note that here x is a continuous random variable, hence x + is also obtained from a continious distribution.Connection between statistical statistical moments of x and x + is given by (5).For this model, (cid:104) x (cid:105) , (cid:104) xs ij (cid:105) , and (cid:104) xg ij (cid:105) are equal to their value in Section E.1 and Section E.2. However,dynamics of (cid:104) x (cid:105) and (cid:104) x s i (cid:105) are different d (cid:104) x (cid:105) dt = 2 k x (cid:104) B (cid:105) (cid:104) x (cid:105) + (cid:42) n (cid:88) i =1 i (cid:88) j =1 xg ij (cid:43) + 14 αk (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) , (102a) d (cid:104) x s i (cid:105) dt = 2 k x (cid:104) B (cid:105)(cid:104) xs i (cid:105) + k p (cid:48) i (cid:42) n (cid:88) j =1 x g jj (cid:43) + 14 αk p (cid:48) i (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:104) x s i (cid:105) , (102b)note that dynamics of (cid:104) x s ij (cid:105) , (cid:104) x g i (cid:105) , and (cid:104) x g ij (cid:105) are identical to equations (94b), (94c), and (94d).Similar to previous section, we start by deriving the term (cid:104) (cid:80) nj =1 x g jj (cid:105) . Analyzing equation (102a) insteady-state gives this term as (cid:42) n (cid:88) j =1 x g jj (cid:43) = 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k + 16 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k + 2 αk x (cid:104) B (cid:105) (2 − β )3 k . (103)0Substituting equation (103) in equations (102b) and (94b) results in (cid:104) x s ij (cid:105) = k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k p (cid:48) i + 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k p (cid:48) i + 2 k x (cid:104) B (cid:105) k p (cid:48) i (cid:18) βj + (2 − β ) ji (cid:19) + 2 αk x (2 − β ) (cid:104) B (cid:105) k p (cid:48) i ,i = { , . . . , n } j = { , . . . , i } . (104)In the next step we derive moments (cid:104) x g ij (cid:105) ; we start by calculating (cid:104) (cid:80) n j =1 x s jj (cid:105) from (104) (cid:42) n (cid:88) j =1 x s jj (cid:43) = 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k + 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k + 2 k x (cid:104) B (cid:105) (2 − β )3 k . (105)By having this term, the moments (cid:104) x g ij (cid:105) are derived by solving equations (94c) and (94d) in steady-state (cid:104) x g ij (cid:105) = 4 k x (cid:104) B (cid:105) (cid:104)(cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k p i + 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k p i + 8 k x (cid:104) B (cid:105) k p i (cid:18) (1 − β ) j + ji (cid:19) + 2 αk x (cid:104) B (cid:105) (2 − β )3 k p i ,i = { , . . . , n } , j = { , . . . , i } . (106)Note that (cid:104) x (cid:105) = (cid:42) x n (cid:88) i =1 i (cid:88) j =1 s ij + n (cid:88) i =1 i (cid:88) j =1 g ij (cid:43) = n (cid:88) i =1 i (cid:88) j =1 (cid:104) x s ij (cid:105) + n (cid:88) i =1 i (cid:88) j =1 (cid:104) x g ij (cid:105) , (107)hence the second-order moment is (cid:104) x (cid:105) = 4 (cid:104) T (cid:105) + 16 (cid:104) T (cid:105) + 2 (cid:104) T (cid:105) (3(2 − β ) + β (5 − β ) CV T + 8(1 − β ) CV T )3 (cid:104) T (cid:105) + 2 αk x (cid:104) B (cid:105) (2 − β ) (cid:104) T (cid:105) . (108)Coefficient of variation squared gives noise raised from partitioning and cell-cycle variations, which sub-tracting equation (100) from results gives partitioning noise as CV R = 4 α (2 − β )3 (cid:0) ( β − β + 6) + β CV T + 2(1 − β ) CV T (cid:1) (cid:104) x (cid:105) . (109)1 E.4 Noise in protein count level contributed from stochastic production
In order to calculate the noise caused by stochastic birth of protein, we use the model introduced inSection C.1. For this model, moments dynamics of (cid:104) x (cid:105) , (cid:104) x s ij (cid:105) , and (cid:104) x g ij (cid:105) can be written as d (cid:104) x (cid:105) dt = k x (cid:104) B (cid:105) (2 − β ) + 2 k x (cid:104) B (cid:105) (cid:32) (cid:104) x (cid:105) + (cid:42) n (cid:88) i =1 i (cid:88) j =1 xg ij (cid:43)(cid:33) + 14 αk (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:42) n (cid:88) j =1 x g jj (cid:43) , (110a) d (cid:104) x s i (cid:105) dt = k x (cid:104) B (cid:105) βp (cid:48) i i + 2 k x (cid:104) B (cid:105)(cid:104) xs i (cid:105) + k p (cid:48) i (cid:42) n (cid:88) j =1 x g jj (cid:43) + 14 αk p (cid:48) i (cid:42) n (cid:88) j =1 xg jj (cid:43) − k (cid:104) x s i (cid:105) , (110b) d (cid:104) x s ij (cid:105) dt = k x (cid:104) B (cid:105) βp (cid:48) i i + 2 k x (cid:104) B (cid:105)(cid:104) xs ij (cid:105) − k (cid:104) x s ij (cid:105) + k (cid:104) x s ( i − j (cid:105) , j = { , . . . , i } , (110c) d (cid:104) x g i (cid:105) dt = 2 k x (cid:104) B (cid:105) (1 − β ) p i i + 4 k x (cid:104) B (cid:105)(cid:104) xg i (cid:105) + k p i (cid:42) n (cid:88) j =1 x s jj (cid:43) − k (cid:104) x g i (cid:105) , , (110d) d (cid:104) x g ij (cid:105) dt = 2 k x (cid:104) B (cid:105) (1 − β ) p i i + 4 k x (cid:104) B (cid:105)(cid:104) xg ij (cid:105) − k (cid:104) x g ij (cid:105) + k (cid:104) x g ( i − j (cid:105) , j = { , . . . , i } . (110e) The same as before we start by expressing the term (cid:104) (cid:80) n j =1 x g jj (cid:105) , this term is calculated by analyzingequation (110a) in steady-state (cid:42) n (cid:88) j =1 x g jj (cid:43) = 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k + 2 αk x (cid:104) B (cid:105) (2 − β )3 k + 16 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k + 4 k x (2 − β ) (cid:104) B (cid:105) k . (111)Substituting this term in equations (110b) and (110c) results in (cid:104) x s ij (cid:105) = k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k p (cid:48) i + 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k p (cid:48) i + 2 k x (cid:104) B (cid:105) k p (cid:48) i (cid:18) βj + (2 − β ) ji (cid:19) + 2 αk x (2 − β ) (cid:104) B (cid:105) k p (cid:48) i + k x (cid:104) B (cid:105) k (cid:18) − β β ji (cid:19) p (cid:48) i , i = { , . . . , n } , j = { , . . . , i } . (112)Similar to previous section, solving equations (110d) and (110e) gives the (cid:104) x g ij (cid:105)(cid:104) x g ij (cid:105) = 4 k x (cid:104) B (cid:105) (cid:104)(cid:104) T (cid:105) (cid:0) (4 − β ) + βCV T (cid:1) k p i + 4 k x (cid:104) B (cid:105) (cid:104) T (cid:105) (cid:0) (3 − β ) + (1 − β ) CV T (cid:1) k p i + 8 k x (cid:104) B (cid:105) k p i (cid:18) (1 − β ) j + ji (cid:19) + 2 αk x (cid:104) B (cid:105) (2 − β )3 k p i + 2 k x (cid:104) B (cid:105) k (cid:18) β − β ) ji (cid:19) p i , i = { , . . . , n } , j = { , . . . , i } . (113)2Finally summing all the moments (cid:104) x s ij (cid:105) , and (cid:104) x g ij (cid:105) results in (cid:104) x (cid:105) as (cid:104) x (cid:105) = 4 (cid:104) T (cid:105) + 16 (cid:104) T (cid:105) + 2 (cid:104) T (cid:105) (3 a ( − ( β + 1) βCV T + a −
4) + 8 CV T + 12)3 (cid:104) T (cid:105) + 2 αk x (cid:104) B (cid:105) (2 − a ) (cid:104) T (cid:105) k x (cid:104) B (cid:105) (cid:18) − β β (cid:18) CV T (cid:19)(cid:19) (cid:104) T (cid:105) + 2 k x (cid:104) B (cid:105) (cid:18) β − β ) (cid:18) CV T (cid:19)(cid:19) (cid:104) T (cid:105) . (114)Steady-state analysis gives the noise from stochastic birth, random partitioning, and cell-cycle timevariations. Subtracting noise of cell-cycle time and partitioning in equations (100) and (109) results innoise caused by stochastic production of protein CV P = (10 − β + 3 β ) + 6(1 − β ) CV T + 3 β CV T (cid:0) ( β − β + 6) + β CV T + 2(1 − β ) CV T (cid:1) (cid:104) B (cid:105)(cid:104) B (cid:105) (cid:104) x (cid:105) . (115)3 E.5 Effect of gene-duplication time in intrinsic noise
We investigate how the noise contributions from random partitioning and stochastic expression ( CV R and CV P terms in equation (34) of the main text) change as β is varied between 0 and 1. Results showthat CV R and CV P follow the same qualitative shapes as reported in Fig. 6. There exists a β ∗ β ∗ = − (cid:113) CV T + 5 CV T CV T + 3 CV T + 2 CV T + 3 CV T + 1) + 2 CV T + 4 CV T + 2 CV T + 2 CV T + 1 , (116) such that CV P is minimized and CV R is maximized when β = β ∗ . Note that when CV T = CV T = 0, β ∗ = 2 − √ CV P and the maximum value of CV R are given by CV P = CV T (3 CV T + 7) − (cid:113) CV T + CV T + 1)( CV T + 2 CV T + 1) + 7 CV T + 33( CV T ( CV T + 3) + 3 CV T + 1) (cid:104) B (cid:105)(cid:104) B (cid:105) (cid:104) x (cid:105) , (117) CV R = √ α (cid:113) (2 CV T + CV T + 1)( CV T + 2 CV T + 1) − √ CV T − √ CV T , (118) respectively. Plots of β ∗ and optimal value of CV R and CV P as a function of CV T are shown in Fig.S4. Note that if noise in T is high and T is deterministic then β ∗ shifts towards zero. Similarly, if noisein T is high and T is deterministic then β ∗ shifts towards one. F Noise level in unstable protein
Consider an unstable protein with sufficiently high degradation rate γ x such that the protein level reachessteady-state instantaneously compared to the cell-cycle time (Fig. S4). Let τ denote the time from thelast division event, then (cid:104) x | τ < T (cid:105) = k x (cid:104) B (cid:105) γ x , (cid:104) x | τ > T (cid:105) = 2 k x (cid:104) B (cid:105) γ x , (119)where T is the time in which duplication happens. The mean level of an unstable protein can becalculated as (cid:104) x (cid:105) = (cid:104) x | τ < T (cid:105) p ( τ < T ) + (cid:104) x | τ > T (cid:105) p ( τ > T ) , (120)
0 0.2 0.4 0.8 0.6 0.2 0.4 0.6 O p t i m a l v a l u e o f 𝛽
0 0.2 0.4 0.8 0.6 L o w e r b o u n d o f 𝐶 𝑉 𝑃 U pp e r b o u n d o f 𝐶 𝑉 𝑅 𝐶𝑉 𝑇 𝐶𝑉 𝑇 𝐶𝑉 𝑇 𝐶𝑉 𝑇 = 1 𝐶𝑉 𝑇 = 𝐶𝑉 𝑇 𝐶𝑉 𝑇 = 0 𝐶𝑉 𝑇 = 1 𝐶𝑉 𝑇 = 𝐶𝑉 𝑇 𝐶𝑉 𝑇 = 0 𝐶𝑉 𝑇 = 1 𝐶𝑉 𝑇 = 𝐶𝑉 𝑇 𝐶𝑉 𝑇 = 0 Figure S3:
Effect of gene-duplication on intrinsic noise level . Left : Value of β where CV P isminimized and CV R is maximized as a function of CV T . When CV T = CV T , noise levels always reachtheir extrema at β = 2 − √ Middle & Right : Extremum values of CV P and CV R as a functions of CV T . Noise levels are normalized by their values at β = 0.4where p ( τ < T ) and p ( τ > T ) denote the probability of being in the time interval before and aftergene-duplication. Using p ( τ < T ) = β, p ( τ > T ) = (1 − β ) , (121)we obtain (cid:104) x (cid:105) = k x (cid:104) B (cid:105) (2 − β ) γ x . (122)To compute the extrinsic noise component we consider deterministic protein production and decay.The second-order moment of x ( t ) is given by (cid:104) x | τ < T (cid:105) = (cid:16) k x (cid:104) B (cid:105) γ x (cid:17) (cid:104) x | τ > T (cid:105) = (cid:16) k x (cid:104) B (cid:105) γ x (cid:17) ⇒ (cid:104) x (cid:105) = (cid:18) k x (cid:104) B (cid:105) γ x (cid:19) β + (cid:18) k x (cid:104) B (cid:105) γ x (cid:19) (1 − β ) . (123)By using definition of CV , extrinsic noise is CV E = (1 − β ) β (2 − β ) , (124)which is zero at β = 0 , β = 2 / γ x , (cid:104) x | τ < T (cid:105) is equal to the steady-state second-order moment of a stochastic model with burst frequency k x (analyzedin [64]) (cid:104) x | τ < T (cid:105) = (cid:18) k x (cid:104) B (cid:105) γ x (cid:19) + k x (cid:104) B (cid:105) γ x + k x (cid:104) B (cid:105) γ x . (125)In comparison with equation (123), there are two extra terms at the right hand side of (cid:104) x | τ < T (cid:105) .The first extra term is due to production of protein in random bursts and the second one is due tostochastic degradation of protein molecules. Further for the same reasons (large degradation rate andrapid equilibration of the distribution), (cid:104) x | τ > T (cid:105) is equal to the second-order moment of a modelcontaining stochastic bursty production of proteins with burst frequency 2 k x which is (cid:104) x | τ > T (cid:105) = (cid:18) k x (cid:104) B (cid:105) γ x (cid:19) + k x (cid:104) B (cid:105) γ x + k x (cid:104) B (cid:105) γ x . (126)Thus the second order moment of an unstable protein can be written as (cid:104) x (cid:105) = (cid:18) k x (cid:104) B (cid:105) γ x (cid:19) β + k x (cid:104) B (cid:105) γ x β + k x (cid:104) B (cid:105) γ x β + (cid:18) k x (cid:104) B (cid:105) γ x (cid:19) (1 − β ) + k x (cid:104) B (cid:105) γ x (1 − β ) + k x (cid:104) B (cid:105) γ x (1 − β ) . (127)Using definition of CV and subtracting extrinsic noise we obtain the following noise contribution fromstochastic expression and decay CV P = 12 (cid:18) (cid:104) B (cid:105)(cid:104) B (cid:105) + 1 (cid:19) (cid:104) x (cid:105) . (128)5 Time T T T Stable protein
Unstable protein P r o t e i n c o u n t l eve l
0 Gene-duplication time / Cell-cycle time 0.2 0.4 0.6 0.8 1 0.12 P r o t e i n n o i se l eve l xx Bk xx Bk Intrinsic noise Extrinsic noise
Figure S4:
Contribution of gene duplication to noise levels of an unstable protein. left : For astable protein, copy numbers accumulate in a bilinear fashion. In contrast, an unstable protein reachesequilibrium rapidly and its level changes in steps.
Right : Extrinsic and intrinsic noise predicted for anunstable protein as a function of β . Solid lines are predictions from (124) and (128), which agree withestimates from 20 ,
000 Monte Carlo simulations. Parameters taken as γ x = 10 hr − , and a geometricburst with (cid:104) B (cid:105) = 6. Burst frequency is changed to have a constant mean protein level of 100 moleculesfor different values of β . 95% confidence intervals are calculated via bootstrapping. Acknowledgements
AS is supported by the National Science Foundation Grant DMS-1312926.
References
1. Blake WJ, Kaern M, Cantor CR, Collins JJ (2003) Noise in eukaryotic gene expression. Nature422: 633-637.2. Raser JM, O’Shea EK (2005) Noise in gene expression: Origins, consequences, and control. Science309: 2010-2013.3. Neuert G, Munsky B, Tan RZ, Teytelman L, Khammash M, et al. (2013) Systematic identificationof signal-activated stochastic gene regulation. Science 339: 584-587.4. Libby E, Perkins TJ, Swain PS (2007) Noisy information processing through transcriptional regu-lation. Proceedings of the National Academy of Sciences 104: 7151-7156.5. Fraser HB, Hirsh AE, Giaever G, Kumm J, Eisen MB (2004) Noise minimization in eukaryoticgene expression. PLOS Biology 2: e137.6. Lehner B (2008) Selection to minimise noise in living systems and its implications for the evolutionof gene expression. Molecular Systems Biology 4: 170.7. Losick R, Desplan C (2008) Stochasticity and cell fate. Science 320: 65-68.8. Arkin A, Ross J, McAdams HH (1998) Stochastic kinetic analysis of developmental pathway bi-furcation in phage λ -infected Escherichia coli cells. Genetics 149: 1633–1648.9. Weinberger L, Burnett J, Toettcher J, Arkin A, Schaffer D (2005) Stochastic gene expression ina lentiviral positive-feedback loop: HIV-1 Tat fluctuations drive phenotypic diversity. Cell 122:169-182.610. Weinberger LS, Dar RD, Simpson ML (2008) Transient-mediated fate determination in a tran-scriptional circuit of HIV. Nature Genetics 40: 466-470.11. Singh A, Weinberger LS (2009) Stochastic gene expression as a molecular switch for viral latency.Current Opinion in Microbiology 12: 460-466.12. Dar RD, Hosmane NN, Arkin MR, Siliciano RF, Weinberger LS (2014) Sreening for noise in geneexpression identifies drug synergies. Science 344: 1392-1396.13. Eldar A, Elowitz MB (2010) Functional roles for noise in genetic circuits. Nature 467: 167-173.14. Veening JW, Smits WK, Kuipers OP (2008) Bistability, epigenetics, and bet-hedging in bacteria.Annual Review of Microbiology 62: 193210.15. Kussell E, Leibler S (2005) Phenotypic diversity, population growth, and information in fluctuatingenvironments. Science 309: 2075-2078.16. Balaban N, Merrin J, Chait R, Kowalik L, Leibler S (2004) Bacterial persistence as a phenotypicswitch. Science 305: 1622-1625.17. Snchez-Romero MA, Casadess J (2014) Contribution of phenotypic heterogeneity to adaptive an-tibiotic resistance. Proceedings of the National Academy of Sciences 111: 355–360.18. Neildez-Nguyen TMA, Parisot A, Vignal C, Rameau P, Stockholm D, et al. (2008) Epigeneticgene expression noise and phenotypic diversification of clonal cell populations. Differentiation 76:33–40.19. Paldi A (2003) Stochastic gene expression during cell differentiation: order from disorder? Cellularand Molecular Life Sciences 60: 1775-1778.20. Raj A, van Oudenaarden A (2008) Nature, nurture, or chance: stochastic gene expression and itsconsequences. Cell 135: 216–226.21. Kaern M, Elston TC, Blake WJ, Collins JJ (2005) Stochasticity in gene expression: from theoriesto phenotypes. Nature Reviews Genetics 6: 451-464.22. Magklara A, Lomvardas S (2013) Stochastic gene expression in mammals: lessons from olfaction.Trends in Cell Biology 23: 449–456.23. Munsky B, B Trinh B, Khammash M (2009) Listening to the noise: random fluctuations revealgene network parameters. Molecular systems biology 5: 318.24. Wang P, Robert L, Pelletier J, Dang WL, Taddei F, et al. (2010) Robust growth of
Escherichiacoli . Current biology 20: 1099–1103.25. Lambert G, Kussell E (2015) Quantifying selective pressures driving bacterial evolution usinglineage analysis. Physical Review X 5: 011016.26. Tsukanov R, Reshes G, Carmon G, Fischer-Friedrich E, Gov NS, et al. (2011) Timing of Z-ringlocalization in
Escherichia coli . Physical Biology 8: 066003.27. Reshes G, Vanounou S, Fishov I, Feingold M (2008) Cell shape dynamics in
Escherichia coli .Biophysical Journal 94: 251–264.28. Reshes G, Vanounou S, Fishov I, Feingold M (2008) Timing the start of division in
E. coli : asingle-cell study. Physical Biology 5: 046001.729. Roeder A, Chickarmane V, Obara B, Manjunath B, Meyerowitz EM (2010) Variability in thecontrol of cell division underlies sepal epidermal patterning in Arabidopsis thaliana. PLOS Biology8: e1000367.30. Zilman A, Ganusov V, Perelson A (2010) Stochastic models of lymphocyte proliferation and death.PLOS ONE 5: e12775.31. Hawkins ED, Markham JF, McGuinness LP, Hodgkin P (2009) A single-cell pedigree analysis ofalternative stochastic lymphocyte fates. Proceedings of the National Academy of Sciences 106:13457-13462.32. Stukalin EB, Aifuwa I, Kim JS, Wirtz D, Sun SX (2013) Age-dependent stochastic models forunderstanding population fluctuations in continuously cultured cells. Journal of The Royal SocietyInterface 10.33. Huh D, Paulsson J (2011) Random partitioning of molecules at cell division. Proceedings of theNational Academy of Sciences 108: 15004–15009.34. Gonze D (2013) Modeling the effect of cell division on genetic oscillators. Journal of TheoreticalBiology 325: 22–33.35. Lloyd-Price J, Tran H, Ribeiro AS (2014) Dynamics of small genetic circuits subject to stochasticpartitioning in cell division. Journal of Theoretical Biology 356: 11-19.36. Zopf CJ, Quinn K, Zeidman J, Maheshri N (2013) Cell-cycle dependence of transcription dominatesnoise in gene expression. PLOS Computational Biology 9: e1003161.37. Narula J, Kuchina A, Lee DyD, Fujita M, Sel GM, et al. (2015) Chromosomal arrangement ofphosphorelay genes couples sporulation and DNA replication. Cell 162: 328–337.38. Schwabe A, Bruggeman FJ (2014) Contributions of cell growth and biochemical reactions to non-genetic variability of cells. Biophysical Journal 107: 301–313.39. Huh D, Paulsson J (2011) Non-genetic heterogeneity from stochastic partitioning at cell division.Nature Genetics 43: 95–100.40. Antunes D, Singh A (2014) Quantifying gene expression variability arising from randomness in celldivision times. Journal of Mathematical Biology 71: 1–27.41. Yu J, Xiao J, Ren X, Lao K, Xie XS (2006) Probing gene expression in live cells, one proteinmolecule at a time. Science 311: 1600-1603.42. Paulsson J (2005) Model of stochastic gene expression. Physics of Life Reviews 2: 157-175.43. Shahrezaei V, Swain PS (2008) Analytical distributions for stochastic gene expression. Proceedingsof the National Academy of Sciences 105: 17256-17261.44. Singh A, Hespanha JP (2009) Optimal feedback strength for noise suppression in autoregulatorygene networks. Biophysical Journal 96: 4013–4023.45. Jia T, Kulkarni RV (2011) Intrinsic noise in stochastic models of gene expression with molecularmemory and bursting. Journal of Mathematical Biology 106: 058102.46. Alon U (2006) An Introduction to Systems Biology: Design Principles of Biological Circuits. Chap-man and Hall/CRC.847. Taniguchi Y, Choi P, Li G, Chen H, Babu M, et al. (2010) Quantifying
E. coli proteome andtranscriptome with single-molecule sensitivity in single cells. Science 329: 533-538.48. Schwanhausser B, Busse D, Li N, Dittmar G, Schuchhardt J, et al. (2011) Global quantification ofmammalian gene expression control. Nature 473: 337–342.49. Swain PS, Elowitz MB, Siggia ED (2002) Intrinsic and extrinsic contributions to stochasticity ingene expression. Proceedings of the National Academy of Sciences 99: 12795-12800.50. Berg OG (1978) A model for the statistical fluctuations of protein numbers in a microbial popula-tion. Journal of Theoretical Biology 71: 587–603.51. Rigney DR (1979) Stochastic model of constitutive protein levels in growing and dividing bacterialcells. Journal of Theoretical Biology 76: 453–480.52. Singh A, Hespanha JP (2010) Stochastic hybrid systems for studying biochemical processes. Philo-sophical Transactions of the Royal Society of London A: Mathematical, Physical and EngineeringSciences 368: 4995-5011.53. Daigle BJ, Soltani M, Petzold LR, Singh A (2015) Inferring single-cell gene expression mechanismsusing stochastic simulation. Bioinformatics 31: 1428–1435.54. Tijms HC (1994) Stochastic models: an algorithmic approach. John Wiley & Sons.55. Hespanha JP, Singh A (2005) Stochastic models for chemically reacting systems using polynomialstochastic hybrid systems. International Journal of Robust and Nonlinear Control 15: 669–689.56. Singh A, Hespanha JP (2011) Approximate moment dynamics for chemically reacting systems.IEEE Transactions on Automatic Control 56: 414-418.57. Gomez-Uribe CA, Verghese GC (2007) Mass fluctuation kinetics: Capturing stochastic effects insystems of chemical reactions through coupled mean-variance computations. Journal of ChemicalPhysics 126: 024109.58. Lee CH, Kim K, Kim P (2009) A moment closure method for stochastic reaction networks. Journalof Chemical Physics 130: 134107.59. Goutsias J (2007) Classical versus stochastic kinetics modeling of biochemical reaction systems.Biophysical Journal 92: 2350-2365.60. Gillespie CS (2009) Moment-closure approximations for mass-action models. IET systems biology3: 52–58.61. Soltani M, Vargas-Garcia CA, Singh A (2015) Conditional moment closure schemes for studyingstochastic dynamics of genetic circuits. IEEE Transactions on Biomedical Circuits and Systems :Accepted to publish.62. Wang H, Yuan Z, Liu P, Zhou T (2015) Division time-based amplifiers for stochastic gene expres-sion. Molecular BioSystems 11: 2417-2428.63. Hilfinger A, Paulsson J (2011) Separating intrinsic from extrinsic fluctuations in dynamic biologicalsystems. Proceedings of the National Academy of Sciences 108: 12167-12172.64. Singh A, Soltani M (2013) Quantifying intrinsic and extrinsic variability in stochastic gene expres-sion models. PLOS ONE 8: e84301.965. Shahrezaei V, Ollivier JF, Swain PS (2008) Colored extrinsic fluctuations and stochastic geneexpression. Molecular Systems Biology 4.66. Scott M, Ingalls B, Kaern M (2006) Estimations of intrinsic and extrinsic noise in models ofnonlinear genetic networks. Chaos 16: 026107.67. Ozbudak EM, Thattai M, Kurtser I, Grossman AD, van Oudenaarden A (2002) Regulation of noisein the expression of a single gene. Nature Genetics 31: 69–73.68. Newman JRS, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, et al. (2006) Single-cell pro-teomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature Genetics 441:840-846.69. Singh A, Razooky B, Cox CD, Simpson ML, Weinberger LS (2010) Transcriptional bursting fromthe HIV-1 promoter is a significant source of stochastic noise in HIV-1 gene expression. BiophysicalJournal 98: L32-L34.70. Bar-Even A, Paulsson J, Maheshri N, Carmi M, O’Shea E, et al. (2006) Noise in protein expressionscales with natural protein abundance. Nature Genetics 38: 636-643.71. Friedman N, Cai L, Xie X (2006) Linking stochastic dynamics to population distribution: ananalytical framework of gene expression. Physical Review Letters 97: 168302.72. Paulsson J (2005) Model of stochastic gene expression. Physics of Life Reviews 2: 157–175.73. Sherman MS, Cohen BA (2014) A computational framework for analyzing stochasticity in geneexpression. PLOS Computational Biology 10: e1003596.74. Golding I, Paulsson J, Zawilski S, Cox E (2005) Real-time kinetics of gene activity in individualbacteria. Cell 123: 1025-1036.75. Yu J, Xiao J, Ren X, Lao K, Xie XS (2006) Probing gene expression in live cells, one proteinmolecule at a time. Science 311: 1600-1603.76. Cai L, Xie NFXS (2006) Stochastic protein expression in individual cells at the single moleculelevel. Nature 440: 358-362.77. Kumar N, Singh A, Kulkarni RV (2015) Transcriptional bursting in gene expression: analyticalresults for general stochastic models. To appear in PLOS Computational Biology .78. Singh A (2014) Transient changes in intercellular protein variability identify sources of noise ingene expression. Biophysical Journal 107: 2214–2220.79. Osella M, Nugent E, Lagomarsino MC (2014) Concerted control of
Escherichia coli cell division.Proceedings of the National Academy of Sciences 111: 3431-3435.80. Robert L, Hoffmann M, Krell N, Aymerich S, Robert J, et al. (2014) Division in