[PDF] Pseudo-Random Fluctuations, Stochastic Cooperativity and Burstiness in Dynamically Unstable High-Dimensional Biochemical Networks

Abstract

The goal of this paper is to outline a scenario of emerging stochasticity in high-dimensional highly nonlinear systems, such as genetic regulatory networks (GRN). We focus attention on the fact that in such systems confluence of all the factors necessary for gene expression is a comparatively rare event, and only massive redundancy makes such events sufficiently frequent. An immediate consequence of this rareness is burstiness in mRNA and protein copy numbers, a well known experimentally observed effect. We introduce the concept of stochastic cooperativity and show that this phenomenon is a natural consequence of high dimensionality coupled with highly nonlinearity of a dynamical system. In mathematical terms, burstiness is associated with heavy-tailed probability distributions of stochastic processes describing the dynamics of the system. The sequence of stochastic cooperativity events allows for transition from continuous deterministic dynamics expressed in terms of ordinary differential equations (ODE) to discrete stochastic dynamics expressed in terms of Langevin and Fokker-Plank equations. We demonstrate also that high-dimensional nonlinear systems, even in the absence of explicit mechanisms for suppressing inherent instability, may nevertheless reside in a state of stationary pseudo-random fluctuations which for all practical purposes may be regarded as stochastic process. This type of stochastic behavior is an inherent property of such systems and requires neither an external random force, nor highly specialized conditions of bistability.

Full PDF

1 Pseudo-Random Fluctuations, Stochastic Cooperativity and Burstiness in Dynamically Unstable High-Dimensional Biochemical Networks

Simon Rosenfeld

National Cancer Institute EPN 3108, 6130 Executive Blvd, Bethesda, MD 20892 (301) 496-7748; [email protected]

Abstract

Two major approaches are known in the field of stochastic dynamics of intra-cellular biochemical networks. The first one places the focus of attention on the fact that many biochemical constituents vitally important for the network functionality may be present only in small quantities within the cell, and therefore the regulatory process is essentially discrete and prone to relatively big fluctuations. The second approach treats the regulatory process as essentially continuous. Stochasticity in such processes may occur due to bistability and oscillatory motion within the limit cycles. The goal of this paper is to outline the third scenario of stochasticity in the regulatory process. This scenario is only conceivable in high-dimensional highly nonlinear systems, such as genetic regulatory networks (GRN). We focus attention on the fact that in the systems with size and link density of GRN (~25,000 and ~100, respectively) confluence of all the factors necessary for gene expression is a comparatively rare event, and only massive redundancy makes such events sufficiently frequent. An immediate consequence of this rareness is “burstiness” in mRNA and protein copy numbers, a well known experimentally observed effect. We introduce the concept of “stochastic cooperativity” and show that this phenomenon is a natural consequence of high 2dimensionality coupled with highly nonlinearity of a dynamical system. In mathematical terms, burstiness is associated with heavy-tailed probability distributions of stochastic processes describing the dynamics of the system. The sequence of stochastic cooperativity events allows for transition from continuous deterministic dynamics expressed in terms of ordinary differential equations (ODE) to discrete stochastic dynamics expressed in terms of Langevin and Fokker-Plank equations. We demonstrate also that high-dimensional nonlinear systems, even in the absence of explicit mechanisms for suppressing inherent instability, may nevertheless reside in a state of stationary pseudo-random fluctuations which for all practical purposes may be regarded as stochastic process. This type of stochastic behavior is an inherent property of such systems and requires neither an external random force, nor highly specialized conditions of bistability.

KEYWORDS : Nonlinear dynamics; Gene expression; Biochemical networks; S-functions; Heavy-tailed distributions; Bursting processes; Langevin equation; Stochastic cooperativity

1. Introduction

High-dimensional biochemical networks are the integral parts of intracellular organization. The most prominent roles in this organization belong to genetic regulatory networks [1] and protein-interaction networks [2]. Also, there are numerous other subsystems, such as metabolic [3] and glycomic networks [4], to name just a few. All these networks have several important features in common. First, they are highly diverse, i.e., contain numerous (up to tens of thousands) different types of molecules. Second, their dynamics is constrained by a highly structured, densely tangled intracellular environment. Third, their constituents are predominantly macro-molecules interacting in accordance with the laws of thermodynamics and chemical kinetics. Fourth, all these networks may be called “unsupervised” in the sense that they do not have an overlaying 3regulatory structure of a non-biochemical nature that would control the behavior of each individual molecule. Although the term “regulation” is frequently used in the description of cellular processes, its actual meaning is different from that in the systems control theory. In this theory, a controller is a device which is largely independent on the system under control. It has an ability to gather the information about the system in a non-invasive manner through sensors and signal exchange, but, importantly, this exchange has a different physical nature than the system-to-be-controlled. Notably, the regulatory signal produced by the controller and the way it directs the system are of a different physical nature than the functions of the system under control. Contrary to this picture, the intra- and inter-cellular regulations are of a biochemical nature themselves (e.g., protein signal transduction [5]); therefore, the subdivision of a system on the regulator and the subsystem-to-be-regulated is largely nominal. Logically, such a subdivision serves as a way of compartmentalizing a big biochemical system into relatively independent parts for the simplification of analysis. However, in biology this compartmentalization is rarely unambiguous, and it is never known for sure what regulates what. Another term frequently used in description of cellular dynamics is “machine” or “machinery.” Although these words are appropriate to express our fascination with the high level of organization of the cells and apparent purposefulness in their actions, yet the analogy of the machine fails to go far enough. The machines per se are being assembled in an environment different from that in which the machines are intended to work. There are always a design, a designer’s mind and a designer’s tools behind the creation of a machine; these tools are of a different physical nature than the operational modalities of the machine itself. On the contrary, the cellular machines are self-assembled; the resources of their creation are the same as the resources the machines are working with, that is the macromolecules and biochemical reactions. Similar reservations should be made regarding the concepts of “information”, “coding”, “signal” and other terms borrowed from the human technological and societal experiences. These analogies and metaphors are useful instruments assisting human imagination in dissecting cellular realities, but none of them should be taken too literally. The drawback of these instruments is that their indiscriminate usage obscures the fundamental fact that intracellular functionality is nothing else than a vast system of interconnecting 4biochemical reactions between billions of molecules belonging to tens of thousands of various molecular species. Therefore, studying general properties of such large biochemical systems is of primary importance for understanding functionality of the cell. In this work, the focus of attention is placed on dynamical stability of the biochemical networks. First, we show that stringent requirements of dynamical stability have very little chance to be satisfied in the biochemical networks of sufficiently high order. Second, we show that a dynamically unstable system does not necessarily end up its existence through explosion or implosion, as prescribed by simple linear considerations. It is possible that such a system would reside in a dynamic state similar to a stationary or slowly evolving stochastic process. Third, we conjecture that the motion in a high-dimensional system of strongly interacting units inevitably includes a pattern of “bursting”, i.e., sporadic changes of the state variables in either positive or negative directions. In biology, burstiness is an experimentally observed phenomenon [6-9], and a number of theoretical works have been performed to understand its origins [10-13]. Two major facts are usually invoked to explain the burstiness. The first one is that some of the molecules critically important for the network functionality are present in the cell in small quantities, thus producing large random fluctuations downstream in the biochemical pathways. Another popular approach places the focus of attention on the oscillatory motion within limit cycles involving a number of simultaneous chemical reactions. The analysis presented in this paper points out to another possible scenario of bursting, in addition to the existing models. Unlike the two approaches just mentioned, the mechanism we consider does not require any special conditions for its realization. Rather, it is seen as a ubiquitous property of any high-dimensional highly nonlinear dynamical system, including biochemical networks. The mechanism of stochastic evolution proposed here allows for some experimentally verifiable predictions regarding global parameters characterizing the system.

2. Nonlinear Model and the State of Equilibrium

A natural basis for the description of chemical kinetics in a multidimensional network is the power-law formalism, also known under the name S-systems [14-17]. A useful 5property of S-systems is that S-functions are the “universal approximators,” i.e., have the capability of representing a wide range of nonlinear functions under mild restrictions on their regularity and differentiability. S-functions are found to be helpful in the analysis of genome-wide data, including those derived from microarray experiments [18]. However, the most important in the context of this work is the fact that in the vicinity of equilibrium any nonlinear dynamical system may be represented as an S-system [19]. Unlike mere linearization which replaces a nonlinear system by the topologically isomorphic linear one, the S-approximation still retains essential traits of nonlinearity but often is much easier to analyze. Without loss of generality, the system of equations of chemical kinetics may be recast in the following form ( ,..., ) , im im

N Npi i n i m i mm m dx F x x x xdt α β = = = = − ∏ ∏ q (1) where i α , i β are the rates of production and degradation, and im p , are the stoichiometric coefficients in the direct and inverse reactions, respectively. Depending on the nature and complexity of the system under investigation, the quantities{ im q } i x , , may represent various biochemical constituents participating in the process, including individual biomolecules or their aggregates. In the context of GRN, these biochemical constituents may include proteins, mRNAs, DNA, and numerous transcription factors such as holoenzymes, promoters, repressors and others [20]. There is no unique way of representing the biochemical machinery in mathematical form: depending on the level of structural “granularity” and temporal resolution, the same process may be seen either as an individual chemical reaction or as a complex system of reactions. For example, the process that is commonly characterized as a singular event of “binding” of a protein to the regulatory site in reality is the sequence of events of enormous complexity involving a large number of rearrangements and supported by numerous transcriptional co-activators. Similarly, on a certain level of abstraction, the process of transcription may be seen as an individual biochemical reaction between RNA polymerase and DNA molecule, whereas a more detailed view reveals a complex “dance” i = N i x . Simple algebra allows for transformation of equation (1) to a more universal and analytically tractable form ( ) ( ) ( ) e e i i U t V ti i i dz F tdt ν ⎡ ⎤= = −⎣ ⎦ (2) where

U t , V t ( ) Ni i m mm

P z = = ∑ ( ) Ni im

P z = = ∑ i m im im P p m m δ= − , i m im im Q q δ= − (see Appendix A for technical details.) It is easy to see that the fixed point is located in the origin of coordinates and that the Jacobian matrix in its vicinity is simply ( im i i m i m J p q ) ν= − (3) No simplifications have been made for the derivation of (2-3). This means that these equations are quite general and may be always derived for any given sets of rates and stoichiometric coefficients.

3. Structure of the Solution in the Vicinity of Equilibrium

Equations (1,2) may be simultaneously viewed as equations of chemical kinetics derived from and governed by the laws of non-equilibrium thermodynamics, and also as the equations of a certain dynamical system, whether originating in chemistry or not. There is a fundamental difference between the dynamic equilibrium resulting from the conditions 0, 1,..., i dz dt i N = = , and the thermodynamic equilibrium expressed in the Acting Mass Law in chemical kinetics [22]. The latter assumes, in addition to the fact that the fixed point is the equilibrium point, existence of the detailed balance, i.e., full compensation of each chemical reaction by the inverse one. Since a system of biochemical reactions within the cell may be seen as a collection of spatially separated loops with mostly unidirectional flow of chemical constituents, we do not expect a global validity of the principle of detailed balance (PDB) in intracellular dynamics. A cascade of spatially separated processes supporting homeostatic equilibrium is a fundamental 7feature of living systems; it is very distinct from the state of “heat death” expressed in PDB. The principles governing the behavior of such systems (known as “dissipative structures”) have been formulated by Prigogine [23]. The most fundamental among these is the principle of minimal entropy production. It should be noted, however, that this principle is phenomenological by nature and cannot be directly derived from the equations of underlying dynamics [24]. Therefore, there are no first principles that would impose any limitations on the structure of the Jacobian matrix, J , in the vicinity of the fixed point. This means in turn that J is just a matrix of general form, and there is no reason to expect that it has only the eigenvalues with negative real parts. Consequently, generally speaking, there are no reasons to assume that the macroscopic law of motion, i.e., , is stable. Although the assumption of stability is frequently introduced in the context of genetic regulation, in fact, it refers to a highly specific condition which is hardly possible in multidimensional systems with many thousands of independent governing parameters. dt dx/ = F(x) It is sometimes argued that this presumed instability contradicts the common perception of smooth behavior of living entities. It should be noted in this regard that there are several qualitatively similar concepts characterizing temporal evolution which are frequently used interchangeably in biology, but should be clearly distinguished in mathematical modeling. It is especially important to distinguish between stability and stationarity. In biology, it is often observed that certain states either do not change at all or evolve slowly. Such a behavior may be the result of a genuine dynamical stability of the system, but also it may be the result of averaging across the population of cells of stationary variations in an inherently unstable dynamical system. On a macro level, these two behaviors may look similar, and thus both deserve to be characterized as “stable.” In this context, it is useful to recall some fundamental results pertaining to stability of nonlinear systems. According to the theorem by Lyapunov [25], the matrix J is stable if and only if the equation has a solution, , and this solution is a positive definite matrix [25]. If such a solution does exist then the criteria of stability are reduced to the sequence of inequalities ′ + = − J V VJ I V

11 12 121 22 211 1211 21 22 1 2 ......0 ; 0;...; 0... ... ... ...... nnn n nn

V V VV V VV VV V V V V V > > > (4) Matrix is a complicated function of all the stoichiometric coefficients and kinetic rates characterizing the network. Inequalities (4) would impose the set of very stringent constraints of high algebraic order on the structure of dynamically stable biochemical networks. Another classical approach to stability consists of the application of the Routh-Hurwitz criterion [25]. In this approach, one first calculates the characteristic polynomial of the Jacobian matrix; then one builds the sequence of the so-called Hurwitz determinants from its coefficients. The system is stable if and only if all the Hurwitz determinants are positive. Again, the Routh-Hurwitz criterion imposes the set of very complex constraints on global structure of a biochemical network. As argued above, apart from the PDB there are no other first principles and/or general laws governing stability of biochemical systems, and neither the Lyapunov nor Routh-Hurwitz criteria are the parts of PDB. As shown in a recent paper by the author [26], the Jacobian matrix of an arbitrary biochemical system may have comparable numbers of eigenvalues with negative and positive real parts. This property holds under widely varying assumptions regarding kinetic rates and stoichiometric coefficients. Therefore, generally, high-dimensional biochemical networks which are not purposefully designed to be stable (like in reactors for biochemical synthesis [27]) are reasonably presumed to be unstable. It is sometimes argued that the cell is not “a bag of molecules” but rather it is a complex “machine.” It is also hypothesized that in the process of evolution the “evolutionary pressure” moved the cell from one stable configuration to another. Not discarding such a possibility, one still has to recognize that the “evolutionary pressure” is not a law by itself but rather is a label for all the unknown laws which brought the living systems into existence. In this capacity, the “evolutionary pressure” is analogous to the “wisdom of nature” or even to the “intelligent design”; it is not an established fact, but rather an expression of hope that the mechanism providing stability will be discovered sooner or later. It is precisely the goal of this paper to show how an apparent stability may appear even in the dynamically unstable high-dimensional biochemical networks. V

4. Stochastic Cooperativity

The term cooperativity is widely used in biology for describing multi-step joint actions of biomolecular constituents to produce a singular step in intracellular regulation [28, 29]. In intra-cellular regulatory dynamics, the term cooperativity reflects the fact that an individual act of gene expression is not possible until all the gene-specific co-activators are accumulated in the quantities sufficient for triggering the transcription machinery. In ODE terms, this means that d dt z in (2) may noticeably deviate from zero only when the majority of arguments in and come to “cooperation” by simultaneously reaching vicinities of their respective maxima. The fact that such an event is not frequent in a multidimensional system is demonstrated by the following simple example. Let’s assume that i U i V ( ) x t and are the Gaussian processes and consider the behavior of the process, ( ) y t ( ) exp[ ( )] exp[ ( )] F t x t y t σ σ = − . The pattern of this behavior is seen in Figure 1 whereby fluctuates in the vicinity of zero most of the time, thus making no contribution to the variations of ( )

F t ( ) z t . However, sometimes makes large excursions in either direction causing fast sporadic changes in ( )

F t ( ) z t . As shown in Figure 2, the distribution of is approximately symmetric; this means that positive excursions are generally balanced by negative ones. This observation helps us to understand how it happens that an inherently unstable system nevertheless behaves decently and does not explode or implode as prescribed by its linear instability. In simplified terms, the reason is that sporadic deviations of concentrations in positive directions are followed, sooner or later, by the balancing responses in degradation, thus maintaining approximate equilibrium. ( )

F t

5. Probabilistic Structure of Burstiness

In order to envision a stochastic structure in the solution to equation (2) we make use of three fundamental results from the theory of stochastic processes, namely, (i) Central Limit Theorem (CLT) under the strong mixing conditions (SMC) [30] ; (ii) asymptotic distribution of level-crossings by stationary stochastic processes [31] and (iii) 10probabilistic structure of heavy-tailed (also known as bursting) processes [32]. We first notice that the arguments of in (2) are combined into two linear forms, ( ) i F t ( ) , ( ) Ni i m m i i m mm

U t P z V t Q z = = = Nm = ∑ ∑ (5) in which only n terms are non-zeros, where is the typical number of proteins facilitating gene expression; as mentioned above, this number may be of order from several dozens to hundreds. Generally, collections of transcription factors are gene-specific, and there is no explicit correlation between transcription rates and transcription stoichiometry. We therefore may view these collections as independently bootstrapped from the totality of the proteome. According to the CLT under the SMC, the sums of weakly dependent random variables are asymptotically normal. Note that according to the Lindeberg theorem [33], the terms in and are not required to be identically distributed: boundedness of the second moments is sufficient for the validity of CLT. Validity of the SMC, as applied to and is easy to demonstrate by simulation. Importantly, the sums (5) are asymptotically normal even when the processes are drastically non-Gaussian. Figures N (cid:19) n ( ) i z t ( ) i U t ( ) i V t ( ) i U t ( ) i V t ( ) i z t i z t ( ) i z t ( ) i z t ( ) i U t ( ) i V t ( ) i U t ( ) i V t expectations (6) [ ] [ ] ( ) ( ) ; ( ) ( ) ,

N Ni im m i im mk k

P P E z t Q Q E z t µ µ= = ∑ ∑ variances (7) [ ][ ] ,2 ,,2 , ( ) var[ ( )] cov ( ) ( )( ) var[ ( )] cov ( ) ( ) ,

N Ni i im ik m kk mN Ni i im ik m kk m

P U t P P z t z tQ V t Q Q z t z t θθ = == = ∑∑ and covariance 11 (8) [ ,, ( , ) ( ), ( ) cov ( ) ( ) , N Nij i j im jk m kk m

P Q cov U t V t P Q z t z t ⎡ ⎤Λ = =⎣ ⎦ ∑ ] where ( ) ( ) ( ) ( ) cov m k m k m k z z E z z E z E z = − Since and are Gaussian, the processes and e are lognormally distributed; their expectations and variances are, respectively, ( ) i U t ( ) i V t exp[ ( )] i U t xp[ ( )] i V t { ( )exp ( ) ; exp 2 ( ) ( ) exp ( ) 12 ii i i i i i θµ µ θ⎡ ⎤⋅ ⎡ ⎤ ⎡Μ = ⋅ + Θ = ⋅ + ⋅ ⋅ −⎢ ⎥ ⎣ ⎦ ⎣⎣ ⎦ } θ ⎤⎦ (9) where dot stands for or Q . The correlation coefficient between two exponentials is P { } ( ) ( ) { } ( , ) exp ( , ) 1 exp ( ) 1 exp ( ) 1 ij ij i j P Q P Q P Q ρ θ − ⎡ ⎤ ⎡⎡ ⎤= Λ − − −⎣ ⎦ ⎣ ⎦ ⎣ θ ⎤⎦ (10) The right-hand side in (2) is the difference of two lognormal random variables. Exact probabilistic distribution of this difference is unknown. We have found by simulation that these distributions may be reasonably well approximated by the Generalized Pareto Distribution (GPD) ( ) ( )

1, , ( ) 1 1 , 0 ; ( ) 1 exp , 0

G x x G x x ξξβ ξβ ξ β ξ β ξ − = − + ≠ = − − = (11) More specifically, the tail distributions of ( ) exp( ) exp( ) h x x y σ σ σ = − (12) may be accurately represented through (11) with appropriately selected parameters ( ) ξ ξσ = and ( ) β β σ = . These dependencies are found in our work by simulation and are shown in Figure 5. Furthermore, very accurate analytical approximations are available for ξ and β . It turns out that ( ) ξ ξσ = is nearly linear ( ) ( ) ( ) ; 2 2 2 0.376; 0.745; 0.088 u v w u v w ξ σ σ σ π π = + + = − − = − = = − (13) and ( ) β β σ = is nearly exponential ( ) [ ] ( ) ( ) exp( ) exp( ) ; 1.162 ; 2.753; 2 1.553 p q p q p q β σ ϕ σ σ ϕ π π= + − − = = = − = (14) Although the primary goal for these approximations is to accurately capture only the tail distributions of , within the interval ( ) h x σ σ≤ ≤ , approximations (13-14) are quite satisfactory down to 0.1-quantile. Essentially, this means that GPD may serve as a very good representation for as a whole, not just for the tails. ( ) h x σ Figure 6 shows an example of fitting the GPD to . The histogram in the right panel depicts empirical ( ) h x σ line belongs to the theoretical density of GPD with parameters ( ) h x σ ( ) ξσ and ( ) β σ obtained from (13-14). The fact that is representable through the “heavy-tailed” GPD is significant. As well known from the literature [32], stochastic processes with heavy-tailed distribution usually possess the property of “burstiness”. This property means that a substantial amount of spectral energy of such processes is contained in the so-called “exceedances”, i.e., in the short sporadic pulses beyond the certain predefined bounds. ( ) h t σ Figure 7 illustrates this concept. The top panel depicts the stochastic process [ ] [ ] ( ) exp ( ) exp ( ) ; h t x t y t σ σ σ= − (15) where ( ) x t and are standardized independent Gaussian processes. The second panel shows the process of exceedances, , defined as the part of jumping outside the interval 0.025 . Although spends only 5% of all the available time outside this interval, its variance is overwhelmingly greater than that of difference, (183 and 7698, respectively). On this basis, we may regard as a small background noise which only slightly distorts the strong signal provided by . If we ignore this noise, then equation (15) acquires a familiar form of the Langevin equation ( ) y t ( ) h t σ (cid:4) ( ) h t σ Prob( ) 0.975 h σ ≤ ≤ ( ) h t σ (cid:4) ( ) ( ) ( ) d t h t h t σ σ σ = − (cid:4) ( ) d t σ ( ) h t σ (cid:4) ( ) ( ) i Li i i ik ikk dz F t t tdt ν µ δ = = = −′ ∑ (16) where ik µ is the matrix of random Pareto-distributed amplitudes and is the set of random point processes coinciding with the events of bursting. ik t Temporal locations of pulses, , are those corresponding to local maxima of and . It is a well known result from the theory of level-crossing processes [31] that the sequence of such events in the interval (0 asymptotically converges to a Poisson process with the parameter ik t ( ) i U t ( ) i V t , ] t { } (1 2 ) (1 ) exp 2 a ζ π τ θ⎡ ⎤= − ⎣ ⎦ (17) 13where is the threshold of excursion; a → ∞ τ and θ are the correlation radius and variance of the generating Gaussian processes, respectively. On the basis of this asymptotic result, it may be reasonably assumed that for a finite, but sufficiently large the sequences, , may also form a set of Poisson processes with appropriately selected parameters. a ik t Figure 8 shows an example of simulation where the threshold is only slightly greater than the standard deviation, i.e., a θ= . The QQ-plot and histogram of waiting times, , clearly follow exponential distribution, which is an indication that the sequence forms a Poisson process. It is also worth mentioning that in this simulation the number of peaks in the interval k k t t t + ∆ = − k k t (0, 100000] T = predicted from the asymptotic theory, 703, is fairly close to the number of peaks actually found, 696. These two findings indicate that equation (17) is practically applicable under much milder conditions than . a → ∞

6. Meaning of Stochastic Stability

Having the Langevin equation (16) in place, we may now derive the corresponding Fokker-Plank equation. For this purpose, we compute increments ( ) ( )0 ( ) (0) i i

T ti i i

U V z T z dt e e ν t ⎡ ⎤− = ⎣ ⎦ − ∫ (18) over the period of time, , encompassing many excursion events. Since T [ ] ( ) (0) 0 i i E z T z − = , we have the following equation for the variances of increments. [ ] { } ( ) ( ) ( ) ( )2 0 0 var ( ) (0) i i i i

T T t t t ti i i

U V U V z T z dt dt E e e e e ν ′ ′ ⎡ ⎤ ⎡ ⎤′− = ⎣ ⎦ ⎣ ⎦ − − ∫ ∫ (19) Denoting { } ( ) ( ) ( ) ( ) ( ) i i i i t t t ti U V U V

R t t E e e e e ′ ′ , ⎡ ⎤ ⎡ ⎤′− = ⎣ ⎦ ⎣ ⎦ − − (20) and using the standard Dirichlet technique, we find [ ] ( ) var ( ) (0) 2 ( ) Ti i i i z T z R T d ν τ τ − = − ∫ τ (21) By definition, the diffusion coefficient is 14 [ ] var ( ) (0) 2 ( ) Ti ii z T zD T i i

R d ν τ τ ∂ −= =∂ ∫ (22) Since the correlation radius is much smaller than the inter-event time, in the above integral T may be extended to ∞ . Therefore, ( ) i i i D R d ν τ τ ∞ = ∫ (23) Expression (20), after some inessential simplifications, may be reduced to ( ) ( ) ( ) ( ) exp 2 var exp var ( ) 1 i k k kk k k R E z z z r τ λ λ λ τ k ⎧ ⎫⎡ ⎤ ⎡= + ⎤ −⎨ ⎬⎢ ⎥ ⎢⎣ ⎦ ⎣⎩ ⎭ ∑ ∑ ∑ i ⎥⎦ (24) where n N λ= ; (see Appendix B for details.) In equation (24), ( ) k r τ are the autocorrelation functions of individual series . Applying the saddle-point approximation to the integral (23) we come to the following expression for the diffusion coefficient (see Appendix C). ( ) k z t ( ) Gi i G G

TD z πν λ λλ G ⎡ ⎤= Θ⎣ ⎦Θ (25) where denotes the network-wide variance of fluctuations and ( ) var G k z Θ = ∑ k ( ) var G G k kk

T z τ⎡= Θ ⎢⎣ ⎦ ∑ ⎤⎥ is the network-wide square of relaxation time. Equation (25) reveals important details of multidimensional diffusion in the biochemical networks. First, there is a collective random force created by the entire network characterized by the factor ( ) ( exp 2 2 G G G G

T z λ λ

Θ + ) Θ , the force acting uniformly upon all the individual constituents. But also there are individual motilities characterized by the factors i ν that reflect the abilities of individual constituents to be excited by this collective random force. Equation (25) also means that all the rescaled constituent-specific concentrations, ( ) ( ) i i Z t z t i ν − = , have the same diffusion coefficient, ( GG G TD π λλ ) G G z ⎡ ⎤= + Θ⎣ ⎦Θ (26) 15Equation (25) reflects general tendencies in the structure of fluctuations. First, the diffusion coefficient rapidly grows with the complexity of the network, i.e., with λ . Further, it is natural to assume that correlation times, k τ , are of the same order of magnitude as the corresponding times of chemical relaxation, k ν − , because both introduce characteristic time scales into the individual chemical reactions. Therefore, the entire system may be stratified in accordance with only one parameter, the kinetic rate k ν Generally, the probabilistic state of a biochemical network may be characterized by joint distribution, of all the chemical constituents which satisfies the multivariate Fokker-Plank equation (FPE) [34]. However, in light of the above simplifications, such a detailed description would be redundant. Instead, we introduce a collection of identical univariate probability distributions, , where ( , )

P t z N ( , ) P Z t Z is any of the i i i Z z ν − = , each satisfying the same FPE with the coefficient of diffusion (24). Implicitly, equations in this system are not independent because the individual diffusion coefficients depend on the state of the entire network; however this moment-level dependence evolves in a much slower time scale then the correlation radius of fluctuations. There are two important consequences of the fact that all the ( ) ( ) i i Z t z t i ν − = satisfy the same FPE. First, it means that variances of chemical constituents, , are directly proportional to the squares of corresponding kinetic rates. Since var( ) i z ln( ) i z i y = , we conclude that [ ] var ln( ) i y i ν ∼ . This is a testable property of all the large scale biochemical networks and may serve as a basis for experimental validation. It gives some meaning to the experimentally observable fact that slow processes have low variances, whereas rapid fluctuations have high variances thus overwhelmingly contributing to unpredictable “noise" [35] . Since i ν is the only temporal scaling constituent-specific parameter in the network, it is natural to surmise that the times of correlation, i τ , are directly proportional to the corresponding times of chemical relaxation, i ν − . This is another macroscopically observable property suitable for experimental validation. 16 As is well known, there are two major types of problems associated with FPE; the stationary problem and the initial value problem [34]. None of these approaches seems to be quite adequate for intracellular networks. Stationary approach is only meaningful when there are permanent influx and outflow of chemical constituents to and from the system, and, in addition, the time frame available for observation is substantially greater than all the i ν − . An alternative approach, i.e., the initial value problem requires knowledge of the initial state, . If such information is available, then the solution to FPE produces a non-stationary process known as multidimensional random walk. Between these two polar cases, there is a vast variety of possible intermediate conditions which can be roughly categorized by the characteristic times, ( , ) P t z i ν − . As known from the biochemical observations, these characteristic times may cover wide spectrum of fluctuations ranging from milliseconds to many days [36]. The FPE framework proposed in this paper has an advantage of flexibility allowing one to treat each process individually, from stationary approach for the rapidly evolving constituents to the initial value approach for slow processes. Although not being trivially simple, the FPE methodology is nevertheless much simpler than tackling the original ODE system (1). Importantly, the stochastic model introduced in this paper does not depend on arbitrary assumptions regarding extraneous random noise of unknown origin. It emerges solely as a consequence of the deterministic equations of chemical kinetics (1) thus better conforming to the Ockham's lex parsimoniae principle. Due to random partitioning and stochasticity of transcription initiation, the initial condition for the system of equations (21) may be considered as random. Starting with these initial conditions, the system is predominantly driven by the sequence of sporadic events of stochastic cooperativity. Although each such event produces a noticeable momentary shift in the system’s evolution, the multitude of such events makes its overall behavior quite smooth; this behavior is illustrated in Figure 7, bottom panel. In principle, solution of the FPE with initial conditions is a non-stationary process in which individual variances of constituent-specific concentrations grow linearly with time. However, this 17non-stationarity is only noticeable in large time scales and reflects slow evolution of the system as a whole. Linear growth of variances with time reflects a progressive desynchronization of the biochemical interactions. This process may be viewed as slow “aging” of the system, whereas the sequence of stochastic cooperativity events itself looks more like the system’s current “business as usual.”

7. Discussion

In this paper, the Pareto representation of exceedances has been derived from the fact that and are Gaussian processes, and, therefore, e and e are lognormally distributed. We have justified the normality of and by the CLT. This assumption, however, served the only goal to simplify the analysis and may be substantially relaxed at the expense of increased complexity of calculations. Conceptually, all the major ideas leading to the notion of stochastic cooperativity would stay in place even without transition to asymptotic normality. Let’s assume again, as we did in the examples in ( ) i U t ( ) i V t xp[ ( )] i U t xp[ ( )] i V t ( ) i U t ( ) i V t

Figures 3 - 4, that { } { } ( ), ( ) , i i im imm

U t V t P Q z = m ∑ , where are lognormal processes. This time, however, we do not assume that the number of non-zero elements in these sums is sufficiently large to equate the distributions of sums to their asymptotic limits. This would reflect the situation when the number of transcription factors is comparatively small (quite a rare case, as far as gene expression is concerned!) Generally, exact analytical expressions for the distributions of sums of lognormals are unknown, but there is a consensus in the literature that they may be accurately modeled as lognormally-distributed themselves [37]. We have performed the simulation for studying probabilistic structure of the exceedances with lognormal and . It is rather remarkable that the GPD turns out to be a good approximation in this case as well, with the only reservation that simple parameterizations (12-13) are no longer valid and should be replaced by more complex ones. ( ) i z t ( ) i U t ( ) i V t

The property of stochastic cooperativity is not limited to a special form of dynamical systems introduced through S-functions. There is a much wider class of nonlinear systems where the above outlined approach may be applied as well. Let, for example, 18{ ( )} i F x be a vector of positive monotonic functions with all ( ) 0 i F x ′ > in x −∞ < < ∞ . Let us also assume that P and Q are two matrices of all positive elements independently drawn from the same distribution. If the parameter σ is sufficiently large, then the dynamics of the system ( ) ( ddt σ σ= − x F Px F Qx ) (27) will possess the properties of stochastic cooperativity and burstiness. The simplest way to envision this fact is to recall that, in accordance to the above mentioned property of S-functions to be “universal approximators”, any nonlinear system may be represented through S-functions in the vicinity of the fixed point. This and other generalizations are currently under development and the results are to be published elsewhere. There is a vast literature devoted to modeling the intracellular dynamics as a sequence of discrete events, famously exemplified by the Boolean Networks [38], Cellular Automata [39] and Artificial Genome [40], to name just a few (see [41] for comprehensive review.) Generally, the discrete and continuous descriptions largely exist as two separate, methodologically alternative realms. In the discrete dynamics, it is difficult, if possible at all, to realistically incorporate biochemical processes; differential equations of chemical kinetics is the only way to explicitly utilize the laws of molecular interactions [42]. On the other hand, when delving into intricate details of the trajectories in the state space with dimension in thousands, it is easy to loose a big picture and overall logic of events. Only the abstract discrete models seem to be able to reveal this logic. The model proposed in this work may serve as a natural bridge between the two approaches. Although in the time scale of the lifecycle of the cell the chemical reactions are essentially continuous processes, the sequence of events of stochastic cooperativity is essentially discrete.

8. Summary

We have outlined the mechanism by which a multidimensional autonomous nonlinear system, despite being dynamically unstable, nevertheless may be stationary, that is, may 19reside in a state of stochastic fluctuations obeying the probabilistic laws of random walk. Importantly, in this mechanism, the transition from the deterministic to probabilistic laws of motion does not require any artificial assumptions regarding the presence of extraneous random noise of unknown origin; stochastic-like behavior is produced by the system itself. An important role in forming this type of fluctuative motion belongs to inherent burstiness of the system associated with the events of stochastic cooperativity. Unlike classical Langevin approach, macroscopic laws of motion of the system are not required to be dynamically stable. The overall properties of stochastic cooperativity and burstiness are largely independent of the structure of the Jacobian’s eigenvalues in the vicinity of a fixed point. In this work, we have selected the S-systems to be an example of a nonlinear system. At least three motivations justify this selection. First, the S-systems are structured after the equations of chemical kinetics, thus being a natural tool for description of high-dimensional biochemical networks. Second, many other nonlinear systems may be represented through the S-systems in the vicinity of a fixed point. Third, despite generality, the S-systems have an advantage of being analytically tractable. However, many results regarding stochastic cooperativity and burstiness may be readily extended to other multidimensional nonlinear systems. In such system, short pulses during the events of stochastic cooperativity may be described in terms of “shot” noise with subsequent derivation of the Fokker-Plank equation. As proposed in this paper, it is possible to indicate some general experimentally verifiable predictions regarding the behavior of this type of system, such as distribution of intensities of fluctuations and distribution of temporal autocorrelations among individual units of the system. 20

Appendix A. Derivation of the Equations (2-3)

Following the standard procedures in nonlinear dynamics [43], we first search for the state of dynamical equilibrium, { } m x , commonly referred to as a “fixed” point, i.e., the point where all the time derivatives turn to zero, and which therefore satisfy the equations

01 1 im im

N Npi m i mm m x α β = = q x ⎡ ⎤ ⎡= ⎤⎣ ⎦ ⎣ ∏ ∏ ⎦ (A1) Taking logarithm of both sides and solving the linear equations, we obtain the vector of solutions exp ( ) ln N mi i m i mm m x p q βα −= ⎡ ⎤⎛ ⎞= −⎢ ⎥⎜ ⎟⎝ ⎠⎣ ⎦ ∑ (A2) Note that stoichiometric coefficients, im p and cannot be identical in all the direct and inverse reactions simultaneously, therefore, the matrix im q im im p q − is always invertible. It is convenient to introduce relative quantities, i i y x x = i , and then, after denoting (A3) -1 -1 U = (p - I)(p - q) ; V = (q - I)(p - q) to obtain the equations , im im N Npi i m i mm m dy A y B ydt = = = − ∏ ∏ q (A4) where exp ln ; exp ln N Nmi i im i i imm mm m

A U B V βα βα α = = ⎧ ⎫ ⎧⎛ ⎞ ⎛ ⎞⎪ ⎪ ⎪= =⎨ ⎬ ⎨⎜ ⎟ ⎜ ⎟⎪ ⎪ ⎪⎝ ⎠ ⎝ ⎠⎩ ⎭ ⎩ ∑ ∑ m β ⎫⎪⎬⎪⎭ i z (A5) Since we are interested only in positive solutions, we replace exp( ) i y = and obtain exp( ) exp( ) , N Ni i i m m i i m mm m dz A P z B Q zdt = = = − ∑ ∑ (A6) where im im im

P p δ = − , and im im im Q q δ = − . We note further that 21 exp ( ) ln Ni i mim immi i m

B V UA β βα α = ⎧ ⎫⎛ ⎞⎪ ⎪= −⎨ ⎬⎜ ⎟⎪ ⎪⎝ ⎠⎩ ⎭ ∑ (A7) and, because , we find that -1 V - U = (q - p)(p - q) = -I i i

B A = , therefore exp( ) exp( ) N Ni i i m m i mm m dz A P z Q zdt = = m ⎧ ⎫= −⎨ ⎬⎩ ⎭ ∑ ∑ (A8) After introducing a more appropriate time scale tA t ′= , where N ii

A N − = = ∑ A , we rewrite equation (A8) as ( ) exp( ) exp( ) , N Ni i i i m m i m mm m dz F z P z Q zdt ν = = ⎧ ⎫= = −⎨ ⎬′ ⎩ ⎭ ∑ ∑ (A9) where i i A A ν = with an important property that 1 i ν = . It is easy to see that now the fixed point is located in the origin of coordinates and that the Jacobian matrix in the vicinity of this point is ( im i i m i m J p q ) ν = − (A10) Appendix B. Derivation of the Autocorrelation Function

We start with two inessential simplifications. First, if the number of components in the sums

U t , V t is sufficiently large then we may ignore the differences between P, and , respectively, in the expressions (7) for variances and . Another simplification is based on the computationally established fact that the coefficients of variation of and across the genes are much smaller than that of . Therefore, we can replace and by their averages across the genes (see [26] for more detail). After these simplifications ( ) Ni i m mm

P z = = ∑ ( ) Ni i m mm

P z = = ∑ Q , p q ( ) i P θ ( ) i Q θ ( ) i P θ ( ) i Q θ var( ) i z ( ) i P θ ( ) i Q θ { } { } { } { } ( ) exp 2 ( ) var[ ] exp cov[ (0), ( )] exp cov[ (0), ( )] i i R E U U U U U V τ τ = + − τ (B1) We have further, ( ) , var( | ) cov , k m k mk m U P p p z z = ∑ (B2) and 22 ( ) ( ) var( ) var cov , kk k m U z z λ λ= + k m z ∑ ∑ (B3) Similarly, [ ] { } [ ] , cov (0), ( ) | cov (0), ( ) k m k mk m U U P p p z z τ τ= ∑ (B4) and [ ] [ ] [ ] cov (0), ( ) cov (0), ( ) cov (0), ( ) k k k mk k m U U z z z z τ λ τ λ= + τ ∑ ∑ (B5) At last, [ ] { } [ ] , cov (0), ( ) | , cov (0), ( ) k m k mk m U V P Q p q z z τ τ= ∑ (B6) and [ ] [ ] cov (0), ( ) cov (0), ( ) k mk m U V z z τ λ= τ ∑ (B7) Putting everything together (B8) ( ) ( ) ( ) [ ] ( ) exp 2 ( ) var cov , cov (0), ( )exp cov (0), ( ) 1 i i k k m k mk k m k mk kk R E U z z z z zz z τ λ λ λλ τ⎧ ⎫⎡ ⎤⎪ ⎪= + + +⎨ ⎬⎢ ⎥⎪ ⎪⎣ ⎦⎩ ⎭⎧ ⎫⎧ ⎫ −⎨ ⎨ ⎬ ⎬⎩ ⎭⎩ ⎭ ∑ ∑ ∑∑ ii τ The terms ( ) ( ) cov , cov (0), ( ) k m k mk m k m z z z z λ λ+ ∑ ∑ τ are small compared to ( ) var kk z λ ∑ ; first, because λ (cid:19) , and second, because the double sums here are of the same order of magnitude as the sums of variances. The latter claim is supported by simulation. We therefore simplify (B8) to ( ) ( ) ( ) exp 2 ( ) var exp var ( ) 1 i i k k kk k R E U z z r τ λ λ τ ⎧ ⎫⎡ ⎤ ⎡= + ⎤ −⎨ ⎬⎢ ⎥ ⎢⎣ ⎦ ⎣⎩ ⎭ ∑ ∑ i ⎥⎦ (B9) Appendix C. Derivation of the Diffusion Coefficient Using the Saddle-Point Approximation

Let ( ) R τ be a decreasing function of τ such that: (0) 1; ( ) 0; (0) 0 R R R ′∞ = (cid:21) = . Then [ ] { }

20 0 exp ( ) 1 exp (0) (0) (0) 12

J R d R R R τ d τ τ τ ∞ ∞ τ⎧ ⎫⎡ ′ ′′= − ≈ + + −⎤⎨ ⎬⎢⎣ ⎦⎩ ⎭ ∫ ∫ ⎥ (C1) 23Denoting R σ ′′= − ) , we find that ( ) exp (0) 2 R τ σ⎡ ⎤−⎣ ⎦ is a good representation of the integrand in (C1) both in the vicinity of zero and at infinity. Therefore, R J R d e eR τ π πτ σσ ∞ ⎡ ⎤= − = =⎢ ⎥ ′′−⎣ ⎦ ∫ Rk (C2) Introducing , and ( ) exp 2 ( ) var i i k E U z λ⎡ ⎤Λ = +⎢ ⎥⎣ ⎦ ∑ ( ) ( ) var ( ) k kk R z r τ λ τ= ∑ we obtain [ ] ( ) exp ( ) 1 i i R R τ τ = Λ i − (C3) Denoting , we get (0) k k r τ − = − (cid:5)(cid:5) ( ) var k kk z σ λ τ − ⎡ ⎤= ⎣ ⎦ ∑ Therefore, ( ) ( )( ) exp varexp 2 ( ) var 2 var kki i i kk kk k zD E U z z λπν λ λ τ⎡ ⎤⎢ ⎥⎡ ⎤ ⎣ ⎦= +⎢ ⎥⎣ ⎦ ∑∑ ∑ (C4) Introducing the parameters ( ) ( ) ( ) var ; var var G k G k kk k k z T z z k τΘ = = ∑ ∑ ∑ (C5) We finally obtain [ { Gi G iG TD π λ νλ ⎡ ⎤= Θ⎣ ⎦Θ ] } i E U (C6) 24 x(t) time0 200 400 600 800 1000 - y(t) time0 200 400 600 800 1000 - exp[1.5*x(t)]-exp[1.5*y(t)] time0 200 400 600 800 1000 - Figure 1. Illustration of the notion of burstiness. -2 0 2 . . . . . . . kurtosis=27.5; degr.freedom=1.13 Figure 2. Histogram of the process depicted in Figure 1. The distribution is close to the Students t with number of degrees of freedom 1.13. This is an indicator of "heavy tails". Solid line belongs to the standard normal distribution .

N(0,1) individual exp(ar1); corr of orig ar1=0.154; corr of lognormals=0.096 time z = e x p ( y ) sums of auto & cross-correlated lognormals time z Figure 3. Convergence of the sums of lognormal processes (top panel) to approximate normality (bottom panel). -4 -2 original lognormal ss= 240000 av=1.62; sd=2.12; sk=6.18; kt=106; -4 -2 0 2 sums of auto & cross-correlated lognor ss= 3000 av=14.7; sd=6.23; sk=1.15; kt=1.95; Figure 4. Histograms of processes shown in Figure 3. Left panel: lognormal processes (skeweness 6.2, kurtosis 106). Right panel: distribution of sums of 80 lognormals (skeweness 1.2, kurtosis 2). In both cases, solid lines belong to standard normal. 'xi' of GPD vs 'sglog' SG X I . . . . 'beta' of GPD vs 'sglog' SG BE T A approximation of abs[exp(x)-exp(y)] through generalized pareto p r o j e c t = G ene D y n ; pg m = f i t. d i f r L OG N O R M . b y G P D ; c a ll ed f r o m :; T ue M a y : : E D T ; Figure 5. Parameters of GPD expressed through the standard deviation, σ . Dots are the parameters obtained by fitting the GPD to the simulated exp( ) exp( ) h x σ σ = − y σ ; solid lines are the parameters obtained through the analytical approximations (13-14). sg=1.8quant apprx: xi=0.675; beta=3.169 theor quant e m p i r quan t