Emergent Lévy behavior in single-cell stochastic gene expression

Abstract

Single-cell gene expression is inherently stochastic; its emergent behavior can be defined in terms of the chemical master equation describing the evolution of the mRNA and protein copy numbers as the latter tends to infinity. We establish two types of "macroscopic limits": the Kurtz limit is consistent with the classical chemical kinetics, while the Lévy limit provides a theoretical foundation for an empirical equation proposed in [Phys. Rev. Lett. 97:168302, 2006]. Furthermore, we clarify the biochemical implications and ranges of applicability for various macroscopic limits and calculate a comprehensive analytic expression for the protein concentration distribution in autoregulatory gene networks. The relationship between our work and modern population genetics is discussed.


Chen Jia, Michael Q. Zhang, Hong Qian

Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX 75080, U.S.A.
Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080, U.S.A.
MOE Key Lab and Division of Bioinformatics, CSSB, TNLIST, Tsinghua University, Beijing 100084, China
Department of Applied Mathematics, University of Washington, Seattle, WA 98195, U.S.A.


Introduction

The mesoscopic stochastic theory of chemical reaction kinetics is a powerful analytic paradigm for single-cell biochemical dynamics [1]. At the center of this theory is a limit theorem, first proved by Kurtz in the 1970s [2], which states that when the size $V$ of the reaction vessel tends to infinity, the kinetics of a well-mixed reaction system can be described by a set of ordinary differential equations (ODEs), as intuitively expected from the macroscopic chemical reaction kinetics. It is the macroscopic limit, instead of the mean value, that should be identified as the emergent behavior of the stochastic dynamics, as incisively pointed out by Anderson [3]: “It is only as it is considered to be a many body system, in what is often called the $N \to \infty$ limit, that such [emergent] behavior is rigorously definable.” Investigating the limit of $V \to \infty$ or $N \to \infty$, therefore, provides a way to reveal the inherent fundamental character of a stochastic biochemical system.

In general, the stochastic biochemical reaction kinetics has two complementary representations: the stochastic trajectory and the probability distribution. The former is governed by a continuous-time Markov chain that can be simulated via Gillespie’s algorithm and the latter is governed by the chemical master equation (CME) first appearing in the work of Delbrück [4]. To emphasize this dual perspective, the underlying stochastic dynamics is usually termed the Delbrück-Gillespie process (DGP) [5].

In recent years, significant progress has been made in the kinetic theory of single-cell stochastic gene expression based on the central dogma of molecular biology [6–17]. A thorough study based on the DGP framework, in terms of the protein copy number, was carried out by Shahrezaei and Swain [11]. However, in bulk experiments and many single-cell experiments without single-molecule resolution such as RNA sequencing and flow cytometry, data are usually obtained as continuous variables at a macroscopic scale. At the center of the kinetic theory in terms of the protein concentration is an empirical equation proposed by Friedman, Cai, and Xie (FCX) [10]. However, the mathematical foundation of the now classical FCX equation still remains unclear. This paper addresses its theoretical foundation.

Emergent behavior in single-cell stochastic gene expression

We consider the canonical three-stage representation of stochastic gene expression in a single cell with size $V$, with $V \to \infty$ corresponding to a macroscopic scale, as illustrated in Fig. 1(a) [11]. The size $V$ in chemistry stands for the reaction volume [2], but in molecular biology it could also be the maximum protein copy number [8], etc. The biochemical state of the gene of interest can be described by three variables: the promoter activity $i$, with $i = 1$ and $i = 0$ corresponding to the active and inactive states of the promoter, respectively, the mRNA copy number $m$, and the protein copy number $n$. Then the kinetics can be described by the DGP depicted in Fig. 1(b). Here $s_1$ and $s_0$ are the transcription rates when the promoter is active and inactive, respectively; $u$, $v$, and $d$ are the rate constants for translation, mRNA degradation, and protein degradation, respectively; $a_n$ and $b_n$ are the switching rates of the promoter between the active and inactive states ($a_n$ from inactive to active, $b_n$ from active to inactive) [18]. In living cells, the products of many genes also regulate their own expression to form an autoregulatory gene network. This suggests that the promoter switching rates $a_n$ and $b_n$ generally depend on the protein copy number $n$.

Experimentally, it has been consistently observed that the mRNA decays substantially faster than its protein counterpart [11]. Then the process of protein synthesis followed by mRNA degradation is essentially instantaneous: protein synthesis in single cells occurs in random bursts [19]. Once an mRNA is synthesized, it can either produce a protein with probability $p = u/(u+v)$ or be degraded with probability $q = v/(u+v)$. Thus the probability that $j$ proteins are synthesized in a single burst is $p^j q$, which is a geometric distribution [20]. The average number of proteins synthesized per mRNA, also called the mean burst size, is then $\sum_{j=0}^{\infty} j p^j q = p/q$.
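The geometric burst-size law can be illustrated with a short simulation. The following is a minimal sketch, not taken from the paper: the function names, the seed, and the rate values used in the usage note are hypothetical. It mimics the competition between translation (probability $p$) and mRNA degradation (probability $q$) and estimates the mean burst size, which should be close to $p/q = u/v$.

```python
import random


def sample_burst_size(u, v, rng):
    """Draw the number of proteins produced by one mRNA before it degrades."""
    p = u / (u + v)              # probability that the next event is translation
    proteins = 0
    while rng.random() < p:      # each translation event adds one protein
        proteins += 1
    return proteins              # the loop ends when the mRNA is degraded


def empirical_mean_burst(u, v, n_samples=50_000, seed=0):
    """Monte Carlo estimate of the mean burst size (illustrative helper)."""
    rng = random.Random(seed)
    total = sum(sample_burst_size(u, v, rng) for _ in range(n_samples))
    return total / n_samples
```

For example, `empirical_mean_burst(9.0, 1.0)` should return a value near $u/v = 9$, up to sampling error.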
These considerations yield the reduced Markov model illustrated in Fig. 1(c) [7].

Fig. 1. (a) The canonical three-stage representation of stochastic gene expression. (b) The transition diagram of the DGP. (c) The transition diagram of the reduced model when $\epsilon \ll 1$. (d) The Kurtz and Lévy limits.

In fact, the reduced model can be derived rigorously from the original DGP. To do this, let $\epsilon = d/v$ denote the ratio of the mRNA and protein lifetimes. Let $q_{(i,m,n)}$ denote the rate at which the system leaves state $(i,m,n)$, defined as the sum of the transition rates from state $(i,m,n)$ to all other states [21]. Since $\epsilon \ll 1$, we say that $(i,m,n)$ is a fast state if $q_{(i,m,n)} \to \infty$ as $\epsilon \to 0$. Otherwise, $(i,m,n)$ is called a slow state. If $(i,m,n)$ is a fast state, then the time that the system stays in this state is very short. By a recently developed simplification method for two-time-scale Markov chains [22–24], the DGP can be simplified by removal of all the fast states. Translation and mRNA degradation together contribute the rate $m(u+v) = md(u/v+1)\epsilon^{-1}$, so it is easy to check that

$$
\begin{aligned}
q_{(0,m,n)} &= md(u/v+1)\epsilon^{-1} + a_n + s_0 + nd,\\
q_{(1,m,n)} &= md(u/v+1)\epsilon^{-1} + b_n + s_1 + nd.
\end{aligned}
$$

This indicates that all the states $(i,m,n)$ with $m \ge 1$ are fast states and can be removed, and only the states $(i,0,n)$ are retained.
Thus the original DGP can be simplified to the reduced model with effective transition rates depicted in Fig. 1(c) [21]. In the reduced model, the biochemical state of the gene is described only by the variables $i$ and $n$. The reduced model allows large increments of the protein number, which reflects the fact that protein synthesis occurs in random bursts.

Let $\alpha(t)$ and $N(t)$ denote the promoter activity and protein copy number in a single cell at time $t$, respectively. Then $X_V(t) = N(t)/V$ stands for the protein concentration. When $\epsilon \ll 1$, $(\alpha(t), N(t))$ can be described by the reduced model. Thus $(\alpha(t), X_V(t))$ is a Markov chain with state space $\{(i, n/V) : i = 0, 1,\ n = 0, 1, \cdots\}$. Under mild conditions, the evolution of a Markov process is uniquely determined by its generator. In particular, writing $f_i(x)$ for the value of a test function at the state $(i, x)$, the generator $A_V$ of the Markov chain $(\alpha(t), X_V(t))$ is given by

$$
\begin{aligned}
A_V f_1\!\left(\tfrac{n}{V}\right) &= nd\left[f_1\!\left(\tfrac{n}{V}-\tfrac{1}{V}\right) - f_1\!\left(\tfrac{n}{V}\right)\right] + b_n\left[f_0\!\left(\tfrac{n}{V}\right) - f_1\!\left(\tfrac{n}{V}\right)\right] + \sum_{j=0}^{\infty} s_1 p^j q\left[f_1\!\left(\tfrac{n}{V}+\tfrac{j}{V}\right) - f_1\!\left(\tfrac{n}{V}\right)\right],\\
A_V f_0\!\left(\tfrac{n}{V}\right) &= nd\left[f_0\!\left(\tfrac{n}{V}-\tfrac{1}{V}\right) - f_0\!\left(\tfrac{n}{V}\right)\right] + a_n\left[f_1\!\left(\tfrac{n}{V}\right) - f_0\!\left(\tfrac{n}{V}\right)\right] + \sum_{j=0}^{\infty} s_0 p^j q\left[f_0\!\left(\tfrac{n}{V}+\tfrac{j}{V}\right) - f_0\!\left(\tfrac{n}{V}\right)\right].
\end{aligned}
$$

Let $x = n/V$ and $y = j/V$ and let $a(x) = a_n$ and $b(x) = b_n$. Under the framework of mesoscopic chemical reaction kinetics, DNA → mRNA is a zero-order reaction and thus the transcription rate should scale with size $V$, that is, $s_i = \hat{s}_i V$ [2]. As $V \to \infty$, the generator $A_V$ converges to another operator $B$:

$$
\begin{aligned}
B f_1(x) &= (\hat{s}_1 p/q - dx) f_1'(x) + b(x)\left[f_0(x) - f_1(x)\right],\\
B f_0(x) &= (\hat{s}_0 p/q - dx) f_0'(x) + a(x)\left[f_1(x) - f_0(x)\right].
\end{aligned}
$$
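The convergence of $A_V$ to $B$ can be made explicit with a first-order Taylor expansion. The following is a sketch under assumptions that the text does not state: the test functions $f_i$ (the test function evaluated in promoter state $i$) are taken to be smooth with bounded derivatives, $p$ and $q$ are held fixed while $s_i = \hat{s}_i V$, and the remainder terms are indicated only by their order.

```latex
\begin{align*}
nd\left[f_i\!\left(x-\tfrac{1}{V}\right)-f_i(x)\right]
  &= dxV\left[-\tfrac{1}{V}f_i'(x)+O(V^{-2})\right]
  \;\longrightarrow\; -dx\,f_i'(x),\\
\sum_{j=0}^{\infty}\hat{s}_iV\,p^jq\left[f_i\!\left(x+\tfrac{j}{V}\right)-f_i(x)\right]
  &= \hat{s}_i\sum_{j=0}^{\infty}j\,p^jq\,f_i'(x)+O(V^{-1})
  \;\longrightarrow\; \hat{s}_i\,\frac{p}{q}\,f_i'(x).
\end{align*}
```

The second line uses $\sum_{j} j p^j q = p/q$ and the finiteness of $\sum_j j^2 p^j q$ for fixed $p < 1$. The switching terms do not depend on $V$ once $a(x) = a_n$ and $b(x) = b_n$, so adding the two limits gives the drift $\hat{s}_i p/q - dx$ of the operator $B$.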
This shows that the discrete-valued Markov chain $(\alpha(t), X_V(t))$ converges to a continuous-valued Markov process $(\alpha(t), X(t))$ with generator $B$. Mathematically, the limiting process is a piecewise deterministic Markov process (PDMP), as illustrated in Fig. 1(d). We call this macroscopic limit the Kurtz limit because it is consistent with the classical chemical kinetics: given a particular promoter state, the protein concentration evolves as an ODE with no fluctuations. The PDMP was introduced in [13, 14] for studying stochastic phenotype switching. In [15], Lin and Doering considered a gene network with positive autoregulation and obtained the PDMP by taking a different but mathematically equivalent limit. Recently, there have been many studies on gene expression kinetics based on the PDMP model; a detailed analysis can be found in [13, 16].

Interestingly, there is another macroscopic limit that is more consistent with single-cell experiments. To see this, we assume that the mean burst size $p/q = V/k$ scales with size $V$. Here we treat $s_i$ and $k$ as constants and take the limit $V \to \infty$. Under these assumptions, with $j = yV$, we have $p \to 1$, $qV \to k$, and $p^j = e^{j\log(1-q)} \to e^{-ky}$. Thus the generator $A_V$ converges to a different operator $A$:

$$
\begin{aligned}
A f_1(x) &= -dx f_1'(x) + b(x)\left[f_0(x) - f_1(x)\right] + s_1 \int_0^{\infty} k e^{-ky}\left[f_1(x+y) - f_1(x)\right]\mathrm{d}y,\\
A f_0(x) &= -dx f_0'(x) + a(x)\left[f_1(x) - f_0(x)\right] + s_0 \int_0^{\infty} k e^{-ky}\left[f_0(x+y) - f_0(x)\right]\mathrm{d}y.
\end{aligned}
$$

This shows that the Markov chain $(\alpha(t), X_V(t))$ converges to a different Markov process $(\alpha(t), X(t))$ with generator $A$. Mathematically, the limiting process is a switching (hybrid) stochastic differential equation (SDE) driven by Lévy noises, as illustrated in Fig. 1(d).
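The burst-size limit $p^j \to e^{-ky}$ can be checked numerically. The sketch below assumes the parametrization $p = V/(V+k)$, $q = k/(V+k)$, which follows from $p/q = V/k$ and $p + q = 1$, and compares the geometric probability of a burst of at least $yV$ proteins with the exponential tail $e^{-ky}$. The function names and parameter values are illustrative and do not come from the paper.

```python
import math


def geometric_tail(y, V, k):
    """P(burst size >= y*V) for the geometric law p^j q with mean burst size V/k."""
    p = V / (V + k)          # follows from p/q = V/k and p + q = 1
    j = math.ceil(y * V)     # smallest copy-number jump of size at least y*V
    return p ** j            # sum over i >= j of p^i q equals p^j


def exponential_tail(y, k):
    """P(Y >= y) for the exponential jump density w(y) = k exp(-k y)."""
    return math.exp(-k * y)


def tail_errors(y=1.0, k=2.0, sizes=(10, 100, 1000, 10000)):
    """Absolute gap between the two tails for increasing system size V."""
    return [abs(geometric_tail(y, V, k) - exponential_tail(y, k)) for V in sizes]


if __name__ == "__main__":
    for V, err in zip((10, 100, 1000, 10000), tail_errors()):
        print(f"V = {V:>5d}  |geometric - exponential| = {err:.2e}")
```

The gap shrinks as $V$ grows, which is the sense in which the exponential distribution is the continuous limit of the geometric one.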
Here $C_i(t)$, the noise term in the SDE $\mathrm{d}X(t) = -dX(t)\,\mathrm{d}t + \mathrm{d}C_i(t)$ for promoter state $i$ shown in Fig. 1(d), is a compound Poisson process, a particular kind of Lévy process, with arrival rate $s_i$ and jump distribution $w(x) = ke^{-kx}$. This can be explained as follows. When the promoter is in state $i$, the process of mRNA synthesis can be described by a Poisson process with arrival rate $s_i$, and each mRNA produces proteins with a burst size having the exponential distribution $w(x)$, which can be viewed as the continuous limit of the geometric distribution. Thus the process of protein synthesis can be described by the compound Poisson process $C_i(t)$. We call this macroscopic limit the Lévy limit. Given a particular promoter state, the protein concentration still evolves as a stochastic process with large fluctuations.

Fig. 2(b) illustrates the simulated trajectories of the two kinds of macroscopic limits when the promoter is always active, that is, $b_n = 0$. It can be seen that the trajectories of the Kurtz limit are continuous, while the Lévy limit has discontinuous trajectories. The jump point of the trajectory corresponds to the burst time and the jump height corresponds to the burst size. For any fixed protein concentration $x$, consider the transition rate $q_{(i,n),(i,n+xV)} = s_i p^{xV} q$ of the reduced model from state $(i,n)$ to $(i,n+xV)$, where $xV$ is assumed to be an integer for simplicity. Under the assumption of the Kurtz limit, $q_{(i,n),(i,n+xV)} = \hat{s}_i qV p^{xV}$, which decays to zero at exponential speed. Under the assumption of the Lévy limit, $q_{(i,n),(i,n+xV)} \approx s_i k e^{-kx}/V$, which decays to zero with a power law. Thus the Lévy limit allows a larger probability of large increments. This explains why the trajectories of the Lévy limit are discontinuous.

Fig. 2. (a) The two-stage representation of stochastic gene expression. (b) Trajectories of the Kurtz (red) and Lévy (blue) limits.
In the simulation, the model parameters are chosen as s = 1, d = 0. , k = 1, and V = 50.

Let $p_i(x)$ denote the probability density of the protein concentration when the promoter is in state $i$ and let $p(x) = p_0(x) + p_1(x)$ denote the total probability density of the protein concentration. Then the evolution of the Lévy limit is governed by the Kolmogorov forward equation $\partial_t p_i(x) = A^* p_i(x)$, that is,

$$
\begin{aligned}
\partial_t p_1(x) &= d\,\partial_x\big(x p_1(x)\big) + s_1 \int_0^x k e^{-k(x-y)} p_1(y)\,\mathrm{d}y + a(x) p_0(x) - \big[b(x) + s_1\big] p_1(x),\\
\partial_t p_0(x) &= d\,\partial_x\big(x p_0(x)\big) + s_0 \int_0^x k e^{-k(x-y)} p_0(y)\,\mathrm{d}y + b(x) p_1(x) - \big[a(x) + s_0\big] p_0(x),
\end{aligned}
\tag{1}
$$

where $A^*$ is the adjoint of $A$. Based on this equation, we can obtain a general form of the steady-state distribution of the protein concentration, instead of the protein copy number as discussed in [11], in autoregulatory networks. For simplicity, we assume that the promoter switching rates have the form $a_n = a$ and $b_n = b + \gamma n$ [12], where $b$ is the spontaneous switching rate from the active to the inactive state and $\gamma$ is the feedback strength. This model can be used to analyze networks with either positive or negative autoregulation. When $s_1 > s_0$, the feedback term $\gamma n$ inhibits protein synthesis and leads to negative feedback. In contrast, $s_1 < s_0$ leads to positive feedback. In autoregulatory networks, the steady-state protein distribution is given by $p_{ss}(x) = u * v(x)$ [21], where $*$ denotes the convolution,

$$
u(x) = \frac{k^{s_0/d}}{\Gamma(s_0/d)}\, x^{s_0/d - 1} e^{-kx}
$$

is the Gamma distribution and

$$
v(x) = \frac{\Gamma(\beta)}{\Gamma(\alpha_1)\Gamma(\alpha_2)\,{}_2F_1(\alpha_1, \alpha_2; \beta; \gamma/dw)}\; w\, e^{\left(\frac{\gamma}{d} - \frac{w}{2}\right)x}\,(wx)^{\frac{\alpha_1+\alpha_2-3}{2}}\; W_{\frac{\alpha_1+\alpha_2+1}{2}-\beta,\ \frac{\alpha_1-\alpha_2}{2}}(wx).
$$

Here ${}_2F_1(\alpha_1, \alpha_2; \beta; x)$ is the Gaussian hypergeometric function, $W_{\alpha,\beta}(x)$ is the Whittaker function, and

$$
\alpha_1 + \alpha_2 = \frac{a + b + s_1 - s_0}{d}, \qquad \alpha_1\alpha_2 = \frac{a(s_1 - s_0)}{d^2}, \qquad \beta = \frac{\gamma s_1 + (a+b)(dk+\gamma)}{d(dk+\gamma)}, \qquad w = k + \frac{\gamma}{d}.
$$
When $a = 0$ or $s_1 = s_0$, there is only one promoter state, one of $\alpha_1$ and $\alpha_2$ vanishes, and $v(x) = \delta(x)$. In this case, the protein concentration has the Gamma distribution: $p_{ss}(x) = u * \delta(x) = u(x)$ [10]. If we assume that the gene is not transcribed when the promoter is inactive, that is, $s_0 = 0$, we have $u(x) = \delta(x)$ and thus the protein concentration has the Whittaker-type distribution: $p_{ss}(x) = \delta * v(x) = v(x)$.

The emergent Lévy behavior of single-cell stochastic gene expression is itself a stochastic process with large fluctuations, which shows that the stochastic effects cannot be averaged out at the macroscopic scale. This provides a mechanistic foundation, from the viewpoint of many-body theoretical physics, for intracellular variations at the epigenetic and phenotypic level. The Lévy limit of the DGP is on par with the Feller-Kimura diffusion limit of the Wright-Fisher random mating model [25], which has become the theoretical foundation for “nearly all of modern population genetics” [26]. As a comparison, the DGP and the Wright-Fisher model are both discrete-valued Markov chains, while the Lévy limit of the former and the diffusion limit of the latter are both continuous-valued Markov processes. However, they are subtly different because the diffusion limit has continuous trajectories, while the Lévy limit has discontinuous ones: the intracellular diversity is much greater. This insight may have far-reaching implications for many biological phenomena such as bacterial drug resistance and non-genetic cancer heterogeneity [27]. The full comparison between our theory and the theory of population genetics is listed in Table 1.

Table 1. Comparison between our theory and the theory of population genetics.
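The single-promoter-state case discussed above lends itself to a direct numerical check. With one promoter state, transcription rate $s$, and no switching, the Lévy limit is the SDE $\mathrm{d}X(t) = -dX(t)\,\mathrm{d}t + \mathrm{d}C(t)$, where $C(t)$ is a compound Poisson process with arrival rate $s$ and exponential jumps of rate $k$; its steady state is the Gamma distribution with shape $s/d$ and rate $k$, hence mean $s/(dk)$ and variance $s/(dk^2)$. The sketch below simulates the SDE exactly (exponential decay between bursts) and also evaluates the deterministic ODE $\dot{x} = -dx + s/k$ for comparison. The function names, the seed, and the parameter values in the usage note are illustrative assumptions and are not the settings used for Fig. 2.

```python
import math
import random


def levy_limit_endpoint(s, d, k, t_end, rng, x0=0.0):
    """Sample X(t_end) for dX = -d X dt + dC(t), C compound Poisson (rate s, Exp(k) jumps)."""
    t, x = 0.0, x0
    while True:
        wait = rng.expovariate(s)                 # waiting time to the next burst
        if t + wait >= t_end:
            return x * math.exp(-d * (t_end - t))  # pure decay until t_end
        t += wait
        x = x * math.exp(-d * wait) + rng.expovariate(k)  # decay, then burst


def kurtz_limit(s, d, k, t, x0=0.0):
    """Solution of the ODE x' = -d x + s/k started at x0."""
    x_inf = s / (d * k)
    return x_inf + (x0 - x_inf) * math.exp(-d * t)


def sample_moments(s, d, k, t_end, n_cells=2000, seed=0):
    """Mean and variance of X(t_end) over independent simulated cells."""
    rng = random.Random(seed)
    xs = [levy_limit_endpoint(s, d, k, t_end, rng) for _ in range(n_cells)]
    mean = sum(xs) / n_cells
    var = sum((x - mean) ** 2 for x in xs) / (n_cells - 1)
    return mean, var
```

With the illustrative values $s = 10$, $d = 1$, $k = 2$ and a long time horizon, `sample_moments(10.0, 1.0, 2.0, 10.0)` should give a mean near $s/(dk) = 5$, matching `kurtz_limit`, and a variance near $s/(dk^2) = 2.5$, which the deterministic limit does not capture.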

Two special cases

There are two special scenarios that are most interesting. The first one occurs when the promoter is always active, that is, $b_n = 0$. In this case, stochastic gene expression in a single cell has the two-stage representation illustrated in Fig. 2(a), where $s$ is the transcription rate. If the transcription rate $s = \hat{s}V$ scales with size $V$, the two-stage model has the following Kurtz limit, which is an ODE:

$$
\dot{x} = -dx + \hat{s}p/q, \tag{2}
$$

where $\hat{s}p/q$ is the mean synthesis rate of the protein. If the mean burst size $p/q = V/k$ scales with size $V$, however, the two-stage model has the following Lévy limit, which is an SDE driven by Lévy noise:

$$
\mathrm{d}X(t) = -dX(t)\,\mathrm{d}t + \mathrm{d}C(t),
$$

where $C(t)$ is a compound Poisson process with arrival rate $s$ and jump distribution $w(x) = ke^{-kx}$. From Eq. (1), the evolution of the Lévy limit is governed by

$$
\partial_t p(x) = d\,\partial_x\big(xp(x)\big) + s\int_0^x w(x-y)p(y)\,\mathrm{d}y - sp(x).
$$

This is exactly the empirical equation proposed by FCX [10], in which the authors made clear that $w(x-y)$ stands for the transition probability of the protein concentration from $y$ to $x$ in a single burst. They further combined experimental observations [19, 28] to show that the burst size $x - y$ has an exponential distribution $w(x-y)$. Our theory shows that the classical FCX equation can be derived theoretically from the fundamental single-cell biochemical reaction kinetics without resorting to experimental information.

To further compare the two kinds of limits, we introduce the Laplace transform $f(\lambda) = \int_0^{\infty} p(x)e^{-\lambda x}\,\mathrm{d}x$. Then the FCX equation is converted to the first-order linear partial differential equation

$$
\partial_t f = -d\lambda\,\partial_\lambda f - \frac{s\lambda}{\lambda + k} f.
$$

It is easy to see that the mean protein concentration $\langle x\rangle$ can be recovered from $f(\lambda)$ as $\langle x\rangle = -\partial_\lambda f(0)$. Thus the evolution of $\langle x\rangle$ is governed by the following ODE:

$$
\frac{\mathrm{d}\langle x\rangle}{\mathrm{d}t} = -d\langle x\rangle + \frac{s}{k}. \tag{3}
$$

Since $s = \hat{s}V$ and $p/q = V/k$, we have $\hat{s}p/q = s/k$. Comparing Eqs. (2) and (3), we clearly see that the Kurtz limit is exactly the mean of the Lévy limit, as illustrated in Fig. 2(b).

The biochemical implications of the two kinds of limits can be seen as follows. Recall that the mean protein copy number in a single cell is the product of the mean burst frequency $s/d$ and the mean burst size $p/q$. If $s/d \gg p/q$, the Kurtz limit is valid. This condition is consistent with bulk experiments in which a large number of cells are ground to form a cell extract and thus the DNA copy number is very large. If $p/q \gg s/d$, the Lévy limit is applicable. In living cells, the mean burst size $p/q$ is relatively large, typically on the order of 100 for an E. coli gene [9]. Thus this condition corresponds to single-cell experiments in which the DNA copy number is very small.

We stress here that our theory can also be applied to model stochastic mRNA expression with transcriptional bursts. Recent bulk [29] and single-cell [30] experiments have shown that mRNA abundances in individual eukaryotic cells generally scale with cellular volume. Single-molecule imaging techniques [30] have further shown that cellular volume affects mRNA abundances through modulation of transcriptional burst size. As a result, the Lévy limit is also applicable to describe mRNA fluctuations in single cells with large volumes.

Many previous studies also focused on the second scenario, in which the promoter switches rapidly between the active and inactive states, that is, $a_n, b_n \gg s_i, d$ [10]. Under this assumption, the promoter reaches a quasi-steady state between the active and inactive states, which suggests that

$$
p_1(x) \approx \frac{a(x)\,p(x)}{a(x) + b(x)}, \qquad p_0(x) \approx \frac{b(x)\,p(x)}{a(x) + b(x)}.
$$

From Eq. (1), the evolution of the Lévy limit is governed by

$$
\partial_t p(x) = d\,\partial_x\big(xp(x)\big) + \int_0^x k e^{-k(x-y)} c(y)p(y)\,\mathrm{d}y - c(x)p(x),
$$

where $c(x) = \big(a(x)s_1 + b(x)s_0\big)/\big(a(x) + b(x)\big)$ is the effective transcription rate. This empirical equation has also appeared in [10], and here we provide a theoretical foundation for it as the emergent behavior of the fundamental biochemical reaction kinetics. In this case, the Lévy limit is no longer an SDE driven by Lévy noise. However, it falls into the category of Lévy-type processes [31], which behave locally like Lévy processes. As a summary, we list all kinds of macroscopic limits and their ranges of applicability in Table 2.

Macroscopic limit | Range of applicability
ODE | $s_i/d \gg p/q$, $b_n = 0$
PDMP | $s_i/d \gg p/q$
Lévy-driven SDE | $p/q \gg s_i/d$, $b_n = 0$
Switching Lévy-driven SDE | $p/q \gg s_i/d$
Lévy-type process | $p/q \gg s_i/d$, $a_n, b_n \gg s_i, d$

Table 2. Macroscopic limits of single-cell gene expression kinetics and their ranges of applicability.

Conclusions

We show that deterministic Kurtz and stochastic Lévy behaviors naturally emerge from the fundamental single-cell gene expression kinetics. When the transcription rate scales with size, the macroscopic limit is a PDMP, which is consistent with the classical deterministic chemical kinetics in aqueous solution. When the mean burst size scales with size, however, the macroscopic limit is a switching Lévy-driven SDE, which captures intracellular variations at the epigenetic level. The Lévy limit provides a theoretical foundation for the classical FCX empirical equation and gives by far the most general form for the steady-state distribution of the protein concentration. Our theory unifies various continuous gene expression models proposed in the previous literature and clarifies their biochemical implications and ranges of applicability.

Acknowledgements

We are grateful to the anonymous referees for their valuable comments and suggestions, which helped us greatly in improving the quality of this paper. This work was supported by NIH Grants MH102616, MH109665, and R01GM109964, and also by NSFC Grants 31671384 and 91329000.

References

[1] Qian, H. Cooperativity in cellular biochemical processes: Noise-enhanced sensitivity, fluctuating enzyme, bistability with nonlinear feedback, and other mechanisms for sigmoidal responses. Annu. Rev. Biophys., 179–204 (2012).
[2] Kurtz, T. G. The relationship between stochastic and deterministic models for chemical reactions. J. Chem. Phys., 2976–2978 (1972).
[3] Anderson, P. W. More is different. Science, 393–396 (1972).
[4] Delbrück, M. Statistical fluctuations in autocatalytic reactions. J. Chem. Phys., 120–124 (1940).
[5] Qian, H. Nonlinear stochastic dynamics of mesoscopic homogeneous biochemical reaction systems: an analytical theory. Nonlinearity, R19 (2011).
[6] Peccoud, J. & Ycart, B. Markovian modeling of gene-product synthesis. Theor. Popul. Biol., 222–234 (1995).
[7] Paulsson, J. & Ehrenberg, M. Random signal fluctuations can reduce random fluctuations in regulated components of chemical regulatory networks. Phys. Rev. Lett., 5447 (2000).
[8] Kepler, T. B. & Elston, T. C. Stochasticity in transcriptional regulation: origins, consequences, and mathematical representations. Biophys. J., 3116–3136 (2001).
[9] Paulsson, J. Models of stochastic gene expression. Phys. Life Rev., 157–175 (2005).
[10] Friedman, N., Cai, L. & Xie, X. S. Linking stochastic dynamics to population distribution: an analytical framework of gene expression. Phys. Rev. Lett., 168302 (2006).
[11] Shahrezaei, V. & Swain, P. S. Analytical distributions for stochastic gene expression. Proc. Natl. Acad. Sci. USA, 17256–17261 (2008).
[12] Kumar, N., Platini, T. & Kulkarni, R. V. Exact distributions for stochastic gene expression models with bursting and feedback. Phys. Rev. Lett., 268105 (2014).
[13] Newby, J. Bistable switching asymptotics for the self-regulating gene. J. Phys. A: Math. Theor., 185001 (2015).
[14] Ge, H., Qian, H. & Xie, X. S. Stochastic phenotype transition of a single cell in an intermediate region of gene state switching. Phys. Rev. Lett., 078101 (2015).
[15] Lin, Y. T. & Doering, C. R. Gene expression dynamics with stochastic bursts: Construction and exact results for a coarse-grained model. Phys. Rev. E, 022409 (2016).
[16] Bressloff, P. C. Stochastic switching in biology: from genotype to phenotype. J. Phys. A: Math. Theor., 133001 (2017).
[17] Jia, C., Xie, P., Chen, M. & Zhang, M. Q. Stochastic fluctuations can reveal the auto-regulatory characteristics of gene networks at the single-molecule level. arXiv:1703.06532 (2017).
[18] Chong, S., Chen, C., Ge, H. & Xie, X. S. Mechanism of transcriptional bursting in bacteria. Cell, 314–326 (2014).
[19] Cai, L., Friedman, N. & Xie, X. S. Stochastic protein expression in individual cells at the single molecule level. Nature, 358–362 (2006).
[20] Berg, O. G. A model for the statistical fluctuations of protein numbers in a microbial population. J. Theor. Biol., 587–603 (1978).
[21] See Supplemental Material for the detailed derivation of all the equations.
[22] Jia, C. Reduction of Markov chains with two-time-scale state transitions. Stochastics, 73–105 (2016).
[23] Jia, C. Simplification of irreversible Markov chains by removal of states with fast leaving rates. J. Theor. Biol., 129–137 (2016).
[24] Jia, C. Simplification of Markov chains with infinite state space and the mathematical theory of random gene expression bursts. Phys. Rev. E, 032402 (2017).
[25] Ewens, W. J. Mathematical Population Genetics 1: Theoretical Introduction (Springer, Berlin, 2004), 2nd edn.
[26] Wakeley, J. The limits of theoretical population genetics. Genetics, 1–7 (2005).
[27] Brock, A., Chang, H. & Huang, S. Non-genetic heterogeneity: a mutation-independent driving force for the somatic evolution of tumours. Nature Rev. Genet., 336–342 (2009).
[28] Yu, J., Xiao, J., Ren, X., Lao, K. & Xie, X. S. Probing gene expression in live cells, one protein molecule at a time. Science, 1600–1603 (2006).
[29] Marguerat, S. & Bähler, J. Coordinating genome expression with cell size. Trends Genet., 560–565 (2012).
[30] Padovan-Merhar, O. et al. Single mammalian cells compensate for differences in cellular volume and DNA copy number through independent global transcriptional mechanisms. Mol. Cell, 339–352 (2015).
[31] Applebaum, D. Lévy Processes and Stochastic Calculus (Cambridge University Press, 2009).