Variability in mRNA Translation: A Random Matrix Theory Approach
arXiv [q-bio.SC]
Michael Margaliot and Wasim Huleihel

The authors are with the Dept. of Electrical Engineering-Systems, Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel, 69978.

This manuscript was compiled on September 30, 2020.
Translation is a fundamental cellular process in gene expression. During translation, complex macromolecules called ribosomes sequentially scan the genetic information encoded in the mRNA molecule and transform it into a chain of amino-acids that is further processed to yield a functional protein. The speed of translation depends on the initiation, elongation, and termination rates of ribosomes along the mRNA. These rates depend on many “local” factors like the abundance of free ribosomes and cognate tRNA molecules in the vicinity of the mRNA. Consequently, copies of the same mRNA molecule located at different parts of the cell may be translated at different rates. An important question is how total protein production in the cell is regulated, despite this considerable variability. We develop a theoretical framework for addressing this question that is based on: (1) considering a computational model for the flow of ribosomes along the mRNA, called the ribosome flow model, but with rates that are random variables; and (2) analyzing the model steady-state behaviour using tools from random matrix theory. Our results show that if all the rates are modeled as i.i.d. random variables bounded from below by a value $B$ then, as the length of the mRNA increases, the total protein production converges, with probability one, to a value that depends only on $B$. This reveals a principle of universality: total protein production is insensitive to many of the details underlying the distribution of the random variables.

Ribosome flow model | Perron-Frobenius theory | Random matrix theory

Gene expression is the process by which the information encoded in the genes is decoded into functional proteins. Gene expression involves several stages. During transcription, instructions encoded in regions of the DNA, called genes, are copied into molecules called messenger-RNA (mRNA) by the enzyme RNA polymerase.
During the translation process, the information encoded in the mRNA is translated into a chain of amino-acids that is further processed to yield a functional protein (1). This transfer of information between the three information-carrying biopolymers (DNA, RNA, and protein) takes place in the cells of numerous organisms, from bacteria to humans.

Each sequence of three consecutive nucleotides in the mRNA, called a codon, encodes a specific amino-acid or a control signal. For example, three codons (UAA, UAG, and UGA), referred to as “stop codons”, signal the termination of the translation process of the current protein. The remaining 61 codons encode the standard 20 amino-acids (1).

During translation, complex molecular machines called ribosomes scan the mRNA codon by codon. The ribosome links amino-acids together in the order specified by the codons to form polypeptide chains. For each codon, the ribosome waits for a transfer RNA (tRNA) molecule that matches the codon and carries the correct amino-acid for incorporation into the growing polypeptide chain. When the ribosome reaches a stop codon, it detaches from the mRNA and releases the amino-acid chain.

Several ribosomes may read the same mRNA molecule simultaneously, as this form of “pipelining” increases the protein production rate. The dynamics of ribosome flow along the mRNA is important, as it strongly affects the production rate and the correct folding of the protein. Variations in protein translation rates are associated with neurodegenerative diseases, viral infection, and cancer.

A ribosome that is stalled for a long time may lead to the formation of a “traffic jam” of ribosomes behind it, and consequently to depletion of the pool of free ribosomes. Cells operate sophisticated regulation mechanisms to avoid and resolve ribosome traffic jams (2–5).
Another testimony to the importance of ribosome flow is the fact that about half of the currently existing antibiotics target the bacterial ribosome by interfering with translation initiation, elongation, termination and other regulatory mechanisms (6, 7). For example, aminoglycosides inhibit bacterial protein synthesis by binding to the 30S ribosomal subunit, stabilizing a normal mismatch in codon–anticodon pairing, and leading to mistranslations (8). Understanding the mechanisms of ribosome-targeting antibiotics and the molecular mechanisms of bacterial resistance is crucial for developing new drugs that can effectively inhibit bacterial protein synthesis (9).

Summarizing, an important problem is to understand the dynamics of ribosome flow along the mRNA, and how it affects the protein production rate. As in many cellular processes, a crucial puzzle is understanding how proper functioning is maintained, and adjusted to the signals that a cell receives and to resource availability, in spite of the large stochasticity in the cell (10, 11). In the context of translation, the question is how an adequate translation rate is maintained in spite of the stochastic nature of chemical reactions, large fluctuations in factors like cognate tRNA availability, structural accessibility of the 5′-end to translation factors, the spatial organization of mRNAs inside the cell (12–14), etc.

Here, we consider a somewhat different problem, namely, how protein production from several copies of the same mRNA is affected by variations in the translation rates due, for example, to the different spatial locations of these mRNAs inside the cell. Indeed, stochastic diffusion of translation substrates plays a key role in determining translation rates (15). We refer to this as spatial variation.

We develop a theoretical approach to analyze translation subject to spatial variation by combining a deterministic computational model, called the ribosome flow model (RFM), with tools from random matrix theory.
We model the variation in the rates in several copies of the same mRNA by assuming that the rate parameters in the RFM are i.i.d. random variables (RVs). Our main results (Theorems 1 and 2 below) reveal a principle of universality: as the length of the mRNA molecule increases, the overall steady-state protein production rate converges, with probability one, to a constant value that depends only on the minimal possible value of the RVs. Roughly speaking, this suggests that much of the variability is “filtered out”. This may explain how the cell overcomes the variations in protein production due to different spatial locations of the same mRNA.

The next section reviews the RFM and its dynamical properties that are relevant in our context. This is followed by our theoretical results. The final section concludes and describes several possible directions for further research. To increase readability, the proofs of the theoretical results are placed in the Appendix.

Significance Statement
Proteins are translated from mRNA molecules. Translation speed depends on the initiation, elongation, and termination rates at which ribosomes bind to, scan, and detach from the mRNA. These rates depend on various factors, like abundance of free ribosomes, that vary across different locations in the cell. We develop a theoretical framework for analyzing the average behavior of protein production subject to spatial variability in the cell. This is based on combining a computational model for the flow of ribosomes along the mRNA with tools from the theory of random matrices, i.e. matrices whose entries are random variables. Our results suggest a general universality principle that may explain how protein production is successfully regulated in spite of the considerable stochasticity in the cell.
Both authors performed the research and wrote the paper together. The authors declare no competing interests. MM (ORCID 0000-0001-8319-8996) and WH (ORCID 0000-0001-7500-1911) contributed equally to this work. To whom correspondence should be addressed. E-mail: [email protected]

Fig. 1. Unidirectional flow along an $n$-site RFM. State variable $x_i(t) \in [0,1]$ represents the normalized density at site $i$ at time $t$. The parameter $\lambda_i > 0$ controls the transition rate from site $i$ to site $i+1$, with $\lambda_0$ [$\lambda_n$] controlling the initiation [termination] rate. $R(t)$ is the output rate from the chain at time $t$.

Ribosome Flow Model (RFM)
Mathematical models of the flow of “biological particles” like RNA polymerase, ribosomes, and molecular motors are becoming increasingly important, as powerful experimental techniques provide rich data on the dynamics of such machines inside the cell (16–18), sometimes in real-time (19). Computational models are particularly important in fields like synthetic biology and biotechnology, as they can provide qualitative and quantitative testable predictions on the effects of various manipulations of the genetic machinery.

The standard computational model for the flow of biological particles is the asymmetric simple exclusion process (ASEP) (20–24). This is a fundamental model from nonequilibrium statistical mechanics describing particles that hop randomly from a site to a neighboring site along an ordered (usually 1D) lattice. Each site may be either free or occupied by a single particle, and hops may take place only to a free target site, representing the fact that the particles have volume and cannot overtake one another. This simple exclusion principle generates an indirect coupling between the moving particles. The motion is assumed to be directionally asymmetric, i.e., there is some preferred direction of motion. In the totally asymmetric simple exclusion process (TASEP) the motion is unidirectional.

TASEP and its variants have been used extensively to model and analyze natural and artificial processes including ribosome flow, vehicular traffic and pedestrian dynamics, molecular motor traffic, the movement of ants along a trail, and more (25–27). However, due to the intricate indirect interactions between the hopping particles, analysis of TASEP is difficult, and closed-form results exist only in some special cases (28, 29).

The RFM (30) is a deterministic, nonlinear, continuous-time ODE model that can be derived via a dynamic mean-field approximation of TASEP (31). It is amenable to rigorous analysis using tools from systems and control theory.
The RFM includes $n$ sites ordered along a 1D chain. The normalized density (or occupancy level) of site $i$ at time $t$ is described by a state variable $x_i(t)$ that takes values in the interval $[0,1]$, where $x_i(t) = 0$ [$x_i(t) = 1$] represents that site $i$ is completely free [full] at time $t$. The transition between site $i$ and site $i+1$ is regulated by a parameter $\lambda_i > 0$. In particular, $\lambda_0$ [$\lambda_n$] controls the initiation [termination] rate into [from] the chain. The rate at which particles exit the chain at time $t$ is a scalar denoted by $R(t)$ (see Fig. 1).

When modeling the flow of biological machines like ribosomes, the chain models an mRNA molecule coarse-grained into $n$ sites. Each site is a codon or group of consecutive codons, and $R(t)$ is the rate at which ribosomes detach from the mRNA, i.e. the protein production rate. The values of the $\lambda_i$s encapsulate many biophysical properties like the number of available free ribosomes, the nucleotide context surrounding initiation codons, the codon compositions in each site and the corresponding tRNA availability, and so on (30, 32, 33). Note that these factors may vary in different locations inside the cell.

The dynamics of the RFM is described by $n$ nonlinear first-order ordinary differential equations:

$$\dot{x}_i = \lambda_{i-1} x_{i-1} (1 - x_i) - \lambda_i x_i (1 - x_{i+1}), \quad i = 1, \dots, n, \qquad [1]$$

where we define $x_0(t) := 1$ and $x_{n+1}(t) := 0$. Every $x_i$ is dimensionless, and every rate $\lambda_i$ has units of 1/time. Eq. [1] can be explained as follows. The flow of particles from site $i$ to site $i+1$ is $\lambda_i x_i(t) (1 - x_{i+1}(t))$. This flow is proportional to $x_i(t)$, i.e. it increases with the occupancy level at site $i$, and to $(1 - x_{i+1}(t))$, i.e. it decreases as site $i+1$ becomes fuller. This is a “soft” version of the simple exclusion principle. The maximal possible flow from site $i$ to site $i+1$ is the transition rate $\lambda_i$. Eq. [1] is thus a simple balance law: the change in the density $x_i$ equals the flow entering site $i$ from site $i-1$, minus the flow exiting from site $i$ to site $i+1$. The output rate from the last site at time $t$ is $R(t) := \lambda_n x_n(t)$.

An important property of the RFM (inherited from TASEP) is that it can be used to model and analyze the formation of “traffic jams” of particles along the chain. Indeed, suppose that there exists an index $j$ such that $\lambda_j$ is much smaller than all the other rates. Then Eq. [1] gives

$$\dot{x}_j = \lambda_{j-1} x_{j-1} (1 - x_j) - \lambda_j x_j (1 - x_{j+1}) \approx \lambda_{j-1} x_{j-1} (1 - x_j).$$

This term is positive when $x \in (0,1)^n$, so we can expect site $j$ to fill up, i.e. $x_j(t) \to 1$. Using Eq. [1] again gives

$$\dot{x}_{j-1} = \lambda_{j-2} x_{j-2} (1 - x_{j-1}) - \lambda_{j-1} x_{j-1} (1 - x_j) \approx \lambda_{j-2} x_{j-2} (1 - x_{j-1}),$$

suggesting that site $j-1$ will also fill up, and so on. In this way, a traffic jam of particles forms behind the bottleneck rate $\lambda_j$.

Note that if $\lambda_j = 0$ for some index $j$ then the RFM splits into two separate chains, so we always assume that $\lambda_j > 0$ for all $j \in \{0, \dots, n\}$.

The asymptotic behavior of the RFM has been analyzed using tools from contraction theory (34), the theory of cooperative dynamical systems (35), continued fractions and Perron-Frobenius theory (36). We briefly review some of these results that are required later on.

Dynamical Properties of the RFM.
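As a numerical illustration of Eq. [1] and of the bottleneck argument, the following minimal sketch integrates the RFM with a classical fourth-order Runge-Kutta scheme in plain Python. The rate values, the step size, the time horizon, and the function names `rfm_rhs` and `simulate` are all illustrative assumptions and are not taken from the paper.

```python
def rfm_rhs(x, lam):
    """Right-hand side of Eq. [1]; x = [x_1..x_n], lam = [lam_0..lam_n]."""
    n = len(x)
    xe = [1.0] + list(x) + [0.0]  # boundary conventions x_0 = 1, x_{n+1} = 0
    return [
        lam[i - 1] * xe[i - 1] * (1.0 - xe[i]) - lam[i] * xe[i] * (1.0 - xe[i + 1])
        for i in range(1, n + 1)
    ]


def simulate(x0, lam, t_end=100.0, dt=0.01):
    """Integrate Eq. [1] from x(0) = x0 with the classical RK4 scheme."""
    x = list(x0)
    for _ in range(int(round(t_end / dt))):
        k1 = rfm_rhs(x, lam)
        k2 = rfm_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k1)], lam)
        k3 = rfm_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k2)], lam)
        k4 = rfm_rhs([xi + dt * k for xi, k in zip(x, k3)], lam)
        x = [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x


# Hypothetical rates (units of 1/time): lam_2 is a bottleneck between sites 2 and 3.
lam = [1.0, 1.0, 0.1, 1.0, 1.0, 1.0]
x_from_empty = simulate([0.0] * 5, lam)  # start with a completely free chain
x_from_full = simulate([1.0] * 5, lam)   # start with a completely full chain
R_end = lam[-1] * x_from_empty[-1]       # output rate R(t) = lam_n * x_n(t)

print("from empty chain:", [round(v, 4) for v in x_from_empty])
print("from full chain: ", [round(v, 4) for v in x_from_full])
print("output rate:", round(R_end, 4))
```

With these assumed rates, the sites upstream of the slow transition end up nearly full and the downstream sites nearly empty, and both initial conditions reach the same final state, in line with the convergence property reviewed in this section.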
Let $x(t, a)$ denote the solution of the RFM at time $t \ge 0$ for the initial condition $x(0) = a$. Since the state-variables correspond to normalized occupancy levels, we always assume that $a$ belongs to the closed $n$-dimensional unit cube:

$$[0,1]^n := \{ x \in \mathbb{R}^n : x_i \in [0,1],\ i = 1, \dots, n \}.$$

Let $(0,1)^n$ denote the interior of $[0,1]^n$.

It was shown in (35) (see also (34)) that there exists a unique $e = e(\lambda_0, \dots, \lambda_n) \in (0,1)^n$ such that for any $a \in [0,1]^n$ the solution satisfies $x(t, a) \in (0,1)^n$ for all $t > 0$ and

$$\lim_{t \to \infty} x(t, a) = e.$$

In other words, every state-variable remains well-defined in the sense that it always takes values in $[0,1]$, and the state converges to a unique steady-state. In particular, the production rate $R(t)$ converges to the steady-state value $R := \lambda_n e_n$. The rate of convergence to the steady-state $e$ is exponential (37).

At the steady-state, the left hand-side of Eq. [1] is zero, and this gives

$$\lambda_i e_i (1 - e_{i+1}) = R, \quad i = 0, 1, \dots, n, \qquad [2]$$

where we define $e_0 := 1$ and $e_{n+1} := 0$. In other words, at the steady-state the flow into and out of each site are equal.

Solving the set of non-linear equations in Eq. [2] is not trivial. Fortunately, there exists a better representation of the mapping from the rates to the steady-state. Let $\mathbb{R}^k_{>0}$ denote the set of $k$-dimensional vectors with all entries positive. Define the $(n+2) \times (n+2)$ tridiagonal matrix

$$T := \begin{bmatrix} 0 & \lambda_0^{-1/2} & 0 & \cdots & 0 & 0 \\ \lambda_0^{-1/2} & 0 & \lambda_1^{-1/2} & \cdots & 0 & 0 \\ 0 & \lambda_1^{-1/2} & 0 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & \lambda_n^{-1/2} \\ 0 & 0 & 0 & \cdots & \lambda_n^{-1/2} & 0 \end{bmatrix}. \qquad [3]$$

This is a symmetric matrix, so all its eigenvalues are real. Since every entry of $T$ is non-negative and $T$ is irreducible, it admits a simple maximal eigenvalue $\sigma > 0$ (the Perron eigenvalue, or Perron root, of $T$), and a corresponding eigenvector $\zeta \in \mathbb{R}^{n+2}_{>0}$ (the Perron eigenvector) that is unique (up to scaling) (38).

Given an RFM with dimension $n$ and rates $\lambda_0, \dots, \lambda_n$, let $T$ be the matrix defined in Eq. [3]. It was shown in (39) that then

$$R = \sigma^{-2} \quad \text{and} \quad e_i = \lambda_i^{-1/2} \sigma^{-1} \frac{\zeta_{i+2}}{\zeta_{i+1}}, \quad i = 1, \dots, n. \qquad [4]$$

In other words, the steady-state density and production rate in the RFM can be directly obtained from the spectral properties of $T$.
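The spectral representation in Eqs. [3] and [4] can be sketched in a few lines of dependency-free Python. The eigen-solver below is not the one used in the paper: it is a swapped-in Sturm-sequence bisection for the Perron root, followed by a forward recurrence on the rows of $T\zeta = \sigma\zeta$ for the Perron eigenvector. The function names `perron_root` and `rfm_steady_state` and the example rates are hypothetical, and the recurrence is intended for short chains only.

```python
def perron_root(b, iters=100):
    """Largest eigenvalue of the symmetric tridiagonal matrix with zero diagonal
    and positive off-diagonal entries b, found by Sturm-sequence bisection."""

    def exceeds_spectrum(x):
        # All LDL^T pivots of (T - x*I) are negative iff x is above every eigenvalue.
        d = -x
        for bk in b:
            d = -x - bk * bk / d
            if d >= 0.0:
                return False
        return True

    lo, hi = 0.0, 2.0 * max(b)  # the maximal row sum bounds the Perron root
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if exceeds_spectrum(mid):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def rfm_steady_state(lam):
    """Steady-state output rate R and densities e_1..e_n via Eqs. [3]-[4]."""
    b = [rate ** -0.5 for rate in lam]  # off-diagonal entries of T
    sigma = perron_root(b)
    # Perron eigenvector from the rows of T*zeta = sigma*zeta, scaled so zeta_1 = 1.
    zeta = [1.0, sigma / b[0]]
    for j in range(1, len(b)):
        zeta.append((sigma * zeta[j] - b[j - 1] * zeta[j - 1]) / b[j])
    n = len(lam) - 1
    # zeta is stored 0-based, so zeta_{i+2}/zeta_{i+1} becomes zeta[i+1]/zeta[i].
    e = [b[i] / sigma * zeta[i + 1] / zeta[i] for i in range(1, n + 1)]
    return sigma ** -2, e


# Hypothetical rates lam_0..lam_5 (n = 5); check the balance law in Eq. [2].
lam = [1.0, 0.8, 1.2, 0.5, 1.5, 1.0]
R, e = rfm_steady_state(lam)
e_ext = [1.0] + e + [0.0]  # conventions e_0 = 1, e_{n+1} = 0
flows = [lam[i] * e_ext[i] * (1.0 - e_ext[i + 1]) for i in range(len(lam))]
print("R     =", R)
print("e     =", e)
print("flows =", flows)  # every entry should agree with R
```

As a sanity check, for the homogeneous chain `rfm_steady_state([1.0, 1.0, 1.0, 1.0])` (so $n = 3$), $T$ is a $5 \times 5$ matrix with $\sigma = 2\cos(\pi/6) = \sqrt{3}$, which gives $R = 1/3$ and $e = (2/3, 1/2, 1/3)$.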
In particular, the spectral representation in Eq. [4] makes it possible to determine $R$ and $e$ even for very large chains using efficient and numerically stable algorithms for computing the Perron eigenvalue and eigenvector of a tridiagonal matrix.

The spectral representation has several useful theoretical implications. It implies that $R = R(\lambda_0, \dots, \lambda_n)$ is a strictly concave function on $\mathbb{R}^{n+1}_{>0}$ (39). Also, it implies that the sensitivity of the steady-state w.r.t. a perturbation in the rates becomes an eigenvalue sensitivity problem. Known results on the sensitivity of the Perron root imply that

$$\frac{\partial}{\partial \lambda_i} R = \frac{2}{\sigma^3 \lambda_i^{3/2} \zeta' \zeta} \zeta_{i+1} \zeta_{i+2}, \quad i = 0, \dots, n, \qquad [5]$$

where $\zeta'$ denotes the transpose of the vector $\zeta$. It follows in particular that $\frac{\partial}{\partial \lambda_i} R > 0$ for all $i$, that is, an increase in any of the transition rates yields an increase in the steady-state production rate.

The RFM has been used to analyze various properties of translation. These include mRNA circularization and ribosome cycling (40), maximizing the steady-state production rate under a constraint on the rates (39, 41), optimal down regulation of translation (42), and the effect of ribosome drop off on the production rate (43). More recent work focused on coupled networks of mRNA molecules. The coupling may be due to competition for shared resources like the finite pool of free ribosomes (44), or due to the effect of the proteins produced on the promoters of other mRNAs (45). Several variations and generalizations of the RFM have also been suggested and analyzed (31, 43, 46–49).

Several studies compared predictions of the RFM with biological measurements, for example, protein levels and ribosome densities in translation (30), and RNAP densities in transcription (50). The results demonstrate high correlation between gene expression measurements and the RFM predictions.

Previous works on the RFM assumed that the transition rates $\lambda_i$ are deterministic. Here, we analyze for the first time the case where the rates are RVs.
This may model, for example, the parallel translation of copies of the same mRNA molecule in different locations inside the cell. The variance of factors like tRNA abundance in these different locations implies that each mRNA is translated with different rates. We model this variability by assuming that the rates are RVs. Our goal is to analyze the resulting protein production rate.

The next section describes our main results on translation with stochastic rates.

Results
Assume that the RFM rates are not constant, but rather are RVs with some known distribution supported over $\mathbb{R}_{\ge \delta} := \{ x \in \mathbb{R} : x \ge \delta \}$, where $\delta > 0$. What will the statistical properties of the resulting protein production rate be? In the context of the spectral representation given in Eq. [3], this amounts to the following question: given the distributions of the RVs $\lambda_i$, what are the statistical properties of the maximal eigenvalue $\sigma$ of the matrix $T$?

Recall that a random variable $X$ is called essentially bounded if there exists $0 \le b < \infty$ such that $P[|X| \le b] = 1$, and then the $L_\infty$ norm of $X$ is

$$\|X\|_\infty := \inf \{ b \ge 0 : P[|X| \le b] = 1 \}.$$

Clearly, this is the relevant case in any biological model. In particular, if $X$ is supported over $\mathbb{R}_{\ge \delta}$, with $\delta > 0$, then the RV defined by $Y := X^{-1/2}$ is essentially bounded and $\|Y\|_\infty \le \delta^{-1/2}$.

We can now state our main results. The proofs are placed in the Appendix.

Theorem 1
Suppose that every rate $\lambda_0, \dots, \lambda_n$ in the RFM is drawn independently according to the distribution of an RV $X$ that is supported on $\mathbb{R}_{\ge \delta}$, with $\delta > 0$. Then as $n \to \infty$, the maximal eigenvalue of the matrix $T$ converges to $2 \|X^{-1/2}\|_\infty$ with probability one, and the steady-state production rate $R$ in the RFM converges to

$$(2 \|X^{-1/2}\|_\infty)^{-2}, \qquad [6]$$

with probability one.

This result may explain how proper functioning is maintained in spite of significant variability in the rates: the steady-state production rate always converges to the value in Eq. [6], which depends only on $\|X^{-1/2}\|_\infty$. This also implies a form of universality with respect to the noises and uncertainties: the exact details of the distribution of $X$ are not relevant, but only the value $\|X^{-1/2}\|_\infty$.

In general, the convergence to the values in Theorem 1 as $n$ increases is slow, and computer simulations may require $n$ values that exhaust the computer's memory before we are close to the theoretical values. The next example demonstrates a case where the convergence is relatively fast.

Example 1
Recall that the probability density function of the half-normal distribution with parameters $(\mu, \sigma)$ is

$$f(x) = \begin{cases} \sqrt{\dfrac{2}{\pi \sigma^2}} \exp\left( -\dfrac{(x - \mu)^2}{2 \sigma^2} \right), & x \ge \mu, \\ 0, & \text{otherwise}. \end{cases}$$

Suppose that $X$ has this distribution with parameters ($\mu = 1$, $\sigma = 0.$…). Note that $\|X^{-1/2}\|_\infty = 1$, so in this case Thm. 1 implies that $R$ converges with probability one to $1/4$ as $n$ goes to infinity. For $n \in \{50, 500, 1000\}$, we numerically computed $R$ using the spectral representation for … random matrices. Fig. 2 depicts a histogram of the results. It may be seen that as $n$ increases the histogram becomes “sharper” and its center converges towards $1/4$, as expected.

Fig. 2. Histograms of the $R$ values in Example 1 for $n = 50$ (green), $n = 500$ (blue), and $n = 1000$ (red).

Theorem 1 does not provide any information on the rate of convergence to the limiting value of $R$. This is important as in practice $n$ is always finite. The next result addresses this issue. For $\epsilon > 0$, let

$$a(\epsilon) := P\left( X^{-1/2} \ge \|X^{-1/2}\|_\infty - \epsilon \right).$$

Note that $a(\epsilon) \in (0, 1]$.

Theorem 2
Suppose that every rate $\lambda_0, \dots, \lambda_n$ in the RFM is drawn independently according to the distribution of an RV $X$ that is supported on $\mathbb{R}_{\ge \delta}$, with $\delta > 0$. Pick two sequences of positive integers $n_1 < n_2 < \dots$ and $k_1 < k_2 < \dots$, with $k_i < n_i$ for all $i$, and a decreasing sequence of positive scalars $\epsilon_i$, with $\epsilon_i \to 0$. Then for any $i$ the steady-state production rate $R_i$ in an RFM with dimension $n_i$ satisfies

$$(2 \|X^{-1/2}\|_\infty)^{-2} \le R_i \le (2 \|X^{-1/2}\|_\infty)^{-2} \left( 1 + O(\epsilon_i + k_i^{-2}) \right), \qquad [7]$$

with probability at least

$$1 - \exp\left( - \left\lfloor \frac{n_i - 1}{k_i} \right\rfloor (a(\epsilon_i))^{k_i} \right). \qquad [8]$$

Note that if we choose the sequences such that

$$\frac{n_i}{k_i} (a(\epsilon_i))^{k_i} \to \infty, \qquad [9]$$

and take $i \to \infty$ then Theorem 2 yields Theorem 1. Yet, we state and prove both results separately in the interest of readability.

Example 2

Suppose that $X$ has a uniform distribution over an interval $[\delta, \gamma]$ with $0 < \delta < \gamma$. From here on we assume for simplicity that $\delta = 1$ and $\gamma = 2$. Then for any $\epsilon > 0$ sufficiently small, we have

$$a(\epsilon) = P\left( X^{-1/2} \ge 1 - \epsilon \right) = P\left( X \le (1 - \epsilon)^{-2} \right) = 2 \epsilon + o(\epsilon).$$

Fix $d \in (0, 1)$ and take $\epsilon_i = n_i^{(d-1)/k_i}$. Then the condition in Eq. [9] becomes

$$\frac{n_i^d}{k_i} \to \infty,$$

and this will hold if $k_i$ does not increase too quickly. We can write $\epsilon_i$ as $\epsilon_i = \exp((d - 1) \log(n_i) / k_i)$, so to guarantee that $\epsilon_i \to 0$, we take $k_i = (\log(n_i))^c$, with $c \in (0, 1)$, and then Eq. [9] indeed holds. Theorem 2 now implies that

$$(2 \|X^{-1/2}\|_\infty)^{-2} \le R_i \le (2 \|X^{-1/2}\|_\infty)^{-2} \times \left( 1 + O\left( \max\left\{ \exp\left( (d - 1) (\log(n_i))^{1-c} \right), (\log(n_i))^{-2c} \right\} \right) \right),$$

with probability at least

$$1 - \exp\left( - \frac{n_i^d}{(\log(n_i))^c} \right). \qquad [10]$$

Example 3
As in Example 1, consider the case where $X$ is half-normal with parameters $(\mu, \sigma)$, where $\mu > 0$. Then $\|X^{-1/2}\|_\infty = \mu^{-1/2}$, so

$$a(\epsilon) = P\left( X^{-1/2} \ge \mu^{-1/2} - \epsilon \right) = P(X \le z),$$

where $z := (\mu^{-1/2} - \epsilon)^{-2}$. Thus,

$$a(\epsilon) = \sqrt{\frac{2}{\pi \sigma^2}} \int_\mu^z e^{-\frac{(x - \mu)^2}{2 \sigma^2}} \, dx = \frac{2}{\sqrt{\pi}} \int_0^{\frac{z - \mu}{\sqrt{2} \sigma}} e^{-x^2} \, dx.$$

It is not difficult to show that this implies that

$$a(\epsilon) = c(\mu, \sigma) \epsilon + o(\epsilon), \qquad [11]$$

where $c(\mu, \sigma) := 2 \sqrt{\frac{2}{\pi \sigma^2}} \mu^{3/2}$. To satisfy Eq. [9], fix $p \in (0, 1)$ and choose $\epsilon_i$ such that $(c \epsilon_i)^{k_i} = n_i^{p-1}$. This implies that

$$\epsilon_i = \frac{1}{c} \exp\left( \frac{p - 1}{k_i} \log(n_i) \right). \qquad [12]$$

Now, pick $q \in (0, 1)$ and take $k_i = (\log(n_i))^q$. Then Eq. [9] holds, and

$$\epsilon_i = \frac{1}{c} \exp\left( (p - 1) (\log(n_i))^{1-q} \right). \qquad [13]$$

Theorem 2 implies that for any $p, q \in (0, 1)$, we have

$$\frac{\mu}{4} \le R_i \le \frac{\mu}{4} \left( 1 + O\left( \max\left\{ \frac{1}{c} \exp\left( (p - 1) (\log(n_i))^{1-q} \right), (\log(n_i))^{-2q} \right\} \right) \right)$$

with probability at least

$$1 - \exp\left( - \frac{n_i^p}{(\log(n_i))^q} \right).$$

Remark 1
Analysis of the proofs of Theorems 1 and 2 shows that these results remain valid if the random variables $X_1, \dots, X_{n-1}$ are not necessarily identically distributed, but are all independent, supported over the positive semi-axis, and satisfy

$$\|X_1^{-1/2}\|_\infty = \dots = \|X_{n-1}^{-1/2}\|_\infty,$$

i.e. they all have the same bound. For example, we may model every rate $\lambda_i$ as an RV distributed with a half-normal distribution with parameters $(\mu_i, \sigma_i)$, where all the $\mu_i$s are equal.

Discussion
Cellular systems are inherently noisy, and it is natural to speculate that they were optimized by evolution to function properly in the presence of stochastic fluctuations.

Many studies analyzed the fluctuations in protein production due to both extrinsic and intrinsic noise (see, e.g. (11, 51–53)). Here, we considered a somewhat different problem, namely, how protein production from several copies of the same mRNA is affected by variations in the translation rates due, for example, to the different spatial locations of these mRNAs inside the cell.

Current experimental techniques fall short of providing accurate estimations for the rates along different copies of the same mRNA molecule in the cell. Furthermore, protein abundance depends not only on translation, but also on the rate of transcription, and mRNA and protein dilution and decay (52). Our results, however, may indicate general principles that can be tested experimentally. For example, the analysis suggests that as the length of the mRNA increases, the translation rate becomes more uniform.

The RFM, just like TASEP, is a phenomenological model for the flow of interacting particles and thus can be used to model and analyze phenomena like the flow of packets in communication networks (54), the transfer of a phosphate group through a serial chain of proteins during phosphorelay (49), and more. The RFM is also closely related to a mathematical model for a disordered linear chain of masses, each coupled to its two nearest neighbors by elastic springs (55), that was originally analyzed in the seminal work of Dyson (56). In many of these applications it is natural to assume that the rates are subject to uncertainties or fluctuations and model them as RVs. Then the results here can be immediately applied.

Appendix: Proofs
The proofs of our main results are based on analyzing the matrix $T$ in Eq. [3] when the $\lambda_i$s are i.i.d. RVs. The problem that we study here is a classical problem in random matrix theory (57), yet the matrix that we consider is somewhat different from the standard matrices analyzed using the existing theory (e.g. the Wigner matrix). Hence, we provide a self-contained analysis based on combining probabilistic arguments with the Perron-Frobenius theory of matrices with non-negative entries (see e.g. (38, Ch. 8)).

A. Proof of Theorem 1.
Let $\{X_i\}_{i=1}^{n-1}$ be a set of i.i.d. random variables supported on $\mathbb{R}_{>0}$. Define a random $n \times n$ matrix:

$$T_n(X_1, \dots, X_{n-1}) := \begin{bmatrix} 0 & X_1 & & & \\ X_1 & 0 & X_2 & & \\ & X_2 & 0 & \ddots & \\ & & \ddots & \ddots & X_{n-1} \\ & & & X_{n-1} & 0 \end{bmatrix}, \qquad [14]$$

i.e. $T_n$ is a symmetric tridiagonal matrix, with zeros on its main diagonal, and the positive random variables $\{X_i\}_{i=1}^{n-1}$ on the super- and sub-diagonals.

Since $T_n$ is symmetric, componentwise non-negative, and irreducible, it admits a simple maximal eigenvalue denoted $\lambda_{\max}(T_n)$, and $\lambda_{\max}(T_n) > 0$. Our goal is to understand the asymptotic behavior of $\lambda_{\max}(T_n)$, as $n \to \infty$. We begin with an auxiliary result that will be used later on.

Proposition 1
Suppose that the random variables $\{X_i\}_{i=1}^{n-1}$ are i.i.d. and essentially bounded. Fix $\epsilon > 0$ and a positive integer $k$. Let $K$ denote the event: there exists an index $\ell \le n - k$ such that $X_\ell, \dots, X_{\ell+k-1} \ge \|X_1\|_\infty - \epsilon$. Then as $n \to \infty$ the probability of $K$ converges to one.

Proof 1
Fix $\epsilon > 0$ and a positive integer $k$. Let $s := \|X_1\|_\infty - \epsilon$. For any $j \in \{1, \dots, n - k\}$, let $K(j)$ denote the event: $X_j, \dots, X_{j+k-1} \ge s$. Then

$$P(K) \ge P\left( K(1) \cup K(k + 1) \cup K(2k + 1) \cup \dots \cup K(pk + 1) \right),$$

where $p$ is the largest integer such that $(p + 1) k \le n - 1$. Since the $X_i$s are i.i.d.,

$$P(K) \ge 1 - (1 - P(K(1)))^{p+1} = 1 - \left( 1 - (P(X_1 \ge s))^k \right)^{p+1}.$$

The probability $P(X_1 \ge s)$ is positive, and when $n \to \infty$, we have $p \to \infty$, so $P(K) \to 1$. □

The next result uses Proposition 1 to provide a tight asymptotic lower bound on the maximal eigenvalue of $T_n$.

Proposition 2
Suppose that the random variables $\{X_i\}_{i=1}^{n-1}$ are i.i.d. and essentially bounded. Fix $\epsilon > 0$ and a positive integer $k$. Then the probability

$$P\left( \lambda_{\max}(T_n) \ge 2 (\|X_1\|_\infty - \epsilon) \cos\left( \frac{\pi}{k + 1} \right) \right) \qquad [15]$$

goes to one as $n \to \infty$.

Proof 2

Let $s := \|X_1\|_\infty - \epsilon$. Conditioned on the event $K$, there exists an index $\ell$ such that $X_\ell, \dots, X_{\ell+k-1} \ge s$. We assume that $\ell = 1$ (the proof in the case $\ell > 1$ is very similar). Let $M_k$ denote the $k \times k$ symmetric tridiagonal matrix:

$$M_k := \begin{bmatrix} 0 & 1 & & & \\ 1 & 0 & 1 & & \\ & 1 & 0 & \ddots & \\ & & \ddots & \ddots & 1 \\ & & & 1 & 0 \end{bmatrix}. \qquad [16]$$

Recall that the maximal eigenvalue of this matrix is $\lambda_{\max}(M_k) = 2 \cos\left( \frac{\pi}{k+1} \right)$ (see e.g. (58)). Let $P_n$ be the matrix obtained by replacing the $k \times k$ leading principal submatrix of $T_n$ by $s M_k$. Note that $T_n \ge P_n$ (where the inequality is componentwise) and thus $\lambda_{\max}(T_n) \ge \lambda_{\max}(P_n)$. By Cauchy's interlacing theorem, the largest eigenvalue of $P_n$ is greater than or equal to the largest eigenvalue of any of its principal submatrices. Thus,

$$\lambda_{\max}(P_n) \ge \lambda_{\max}(s M_k) \ge 2 s \cos\left( \frac{\pi}{k + 1} \right),$$

and this completes the proof of Proposition 2. □

We can now complete the proof of Theorem 1. Recall that if $A$ is an $n \times n$ symmetric and componentwise non-negative matrix then (see, e.g. (38, Ch. 8))

$$|\lambda_{\max}(A)| \le \max_{i \in \{1, \dots, n\}} \sum_{j=1}^n a_{ij}. \qquad [17]$$

As any row of $T_n$ has at most two nonzero elements, Eq. [17] implies that

$$\lambda_{\max}(T_n) \le \max_{i \in \{1, \dots, n-1\}} (X_{i-1} + X_i) \le 2 \max_{i \in \{1, \dots, n-1\}} X_i, \qquad [18]$$

with probability one, where $X_0 := 0$. Combining this with Proposition 2 implies that

$$2 (\|X_1\|_\infty - \epsilon) \cos\left( \frac{\pi}{k + 1} \right) \le \lambda_{\max}(T_n) \le 2 \|X_1\|_\infty, \qquad [19]$$

with probability one. Since this holds for any $\epsilon > 0$ and any integer $k > 0$, this completes the proof of Theorem 1. □
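We can now prove Theorem 2. Fix $\epsilon > 0$ and $k \in \{1, \dots, n - 1\}$. Let $\bar{a}(\epsilon) := P(X_1 \ge \|X_1\|_\infty - \epsilon)$. The proofs of Propositions 1 and 2 imply that

$$\lambda_{\max}(T_n) \ge 2 (\|X_1\|_\infty - \epsilon) \cos\left( \frac{\pi}{k + 1} \right), \qquad [20]$$

with probability $P(K) \ge 1 - \left( 1 - (\bar{a}(\epsilon))^k \right)^{\lfloor \frac{n-1}{k} \rfloor}$.

Fix $b, c > 0$. Then the bound $1 - b < \exp(-b)$ gives

$$1 - (1 - b)^c > 1 - \exp(-bc),$$

so

$$P(K) \ge 1 - \left( 1 - (\bar{a}(\epsilon))^k \right)^{\lfloor \frac{n-1}{k} \rfloor} \ge 1 - \exp\left( - \left\lfloor \frac{n - 1}{k} \right\rfloor (\bar{a}(\epsilon))^k \right). \qquad [21]$$

Pick two sequences of positive integers $n_1 < n_2 < \dots$ and $k_1 < k_2 < \dots$, with $k_i < n_i$ for all $i$, and a decreasing sequence of positive scalars $\epsilon_i$, with $\epsilon_i \to 0$. Using Eq. [20] gives

$$\begin{aligned} (\lambda_{\max}(T_{n_i}))^{-2} &\le \left( 2 (\|X_1\|_\infty - \epsilon_i) \cos\left( \frac{\pi}{k_i + 1} \right) \right)^{-2} \\ &= (2 \|X_1\|_\infty)^{-2} \left( 1 + \frac{2 \epsilon_i}{\|X_1\|_\infty} + o(\epsilon_i) \right) \left( \cos\left( \frac{\pi}{k_i + 1} \right) \right)^{-2} \\ &= (2 \|X_1\|_\infty)^{-2} \left( 1 + \frac{2 \epsilon_i}{\|X_1\|_\infty} + o(\epsilon_i) \right) \left( 1 + \frac{\pi^2}{(k_i + 1)^2} + o(k_i^{-2}) \right) \\ &= (2 \|X_1\|_\infty)^{-2} \left( 1 + O(\epsilon_i + k_i^{-2}) \right). \end{aligned}$$

Combining this with the spectral representation of the steady-state in the RFM completes the proof of Theorem 2. □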
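To get a feel for the quantities in Theorem 2, the following sketch evaluates them for the uniform distribution on $[1, 2]$ of Example 2, with $\epsilon_i = n_i^{(d-1)/k_i}$ and $k_i = (\log n_i)^c$. Several choices here are assumptions made only for illustration: $k_i$ is rounded down to an integer, the values $d = c = 0.5$ and the list of chain lengths are arbitrary, the floor $\lfloor (n-1)/k \rfloor$ follows Eq. [21], and the function names are hypothetical. The term $\epsilon_i + k_i^{-2}$ is reported without the unspecified constant hidden in the $O(\cdot)$ notation.

```python
import math


def a_uniform(eps):
    """a(eps) = P(X^{-1/2} >= 1 - eps) = P(X <= (1 - eps)^{-2}) for X ~ Uniform[1, 2]."""
    if eps >= 1.0:
        return 1.0
    return min(1.0, (1.0 - eps) ** -2 - 1.0)


def theorem2_terms(n, d=0.5, c=0.5):
    """Return (k, eps, eps + k^-2, probability lower bound) for chain length n."""
    k = max(1, int(math.log(n) ** c))  # assumed integer rounding of (log n)^c
    eps = n ** ((d - 1.0) / k)         # eps_i = n_i^{(d-1)/k_i}
    a = a_uniform(eps)
    prob = 1.0 - math.exp(-((n - 1) // k) * a ** k)  # floor as in Eq. [21]
    return k, eps, eps + k ** -2, prob


for n in (10**3, 10**6, 10**12, 10**24):
    k, eps, scale, prob = theorem2_terms(n)
    print(f"n=1e{round(math.log10(n))}  k={k}  eps={eps:.4f}  "
          f"eps+k^-2={scale:.4f}  prob>={prob:.6f}")
```

Under these assumptions the probability bound is essentially one already for moderate $n$, while $\epsilon_i + k_i^{-2}$ shrinks only slowly with $n$, which is consistent with the remark that convergence to the limiting value is slow in general.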
ACKNOWLEDGMENTS.
The authors thank Yoram Zarai and Tamir Tuller for helpful comments. The work of MM is partially supported by a research grant from the ISF.
References.
1. B Alberts, et al., Molecular Biology of the Cell. (Garland Science, New York), 5 edition, (2007).
2. S Juszkiewicz, et al., Ribosome collisions trigger cis-acting feedback inhibition of translation initiation. eLife (2020).
3. S Juszkiewicz, SH Speldewinde, L Wan, J Svejstrup, RS Hegde, The ASC-1 complex disassembles collided ribosomes. Mol. Cell, 603–614 (2020).
4. EW Mills, R Green, Ribosomopathies: There's strength in numbers. Science (2017).
5. T Tuller, et al., An evolutionarily conserved mechanism for controlling the efficiency of protein translation. Cell, 344–54 (2010).
6. AG Myasnikov, et al., Structure-function insights reveal the human ribosome as a cancer target for antibiotics. Nat. Commun., 12856 (2016).
7. M Johansson, J Chen, A Tsai, G Kornberg, J Puglisi, Sequence-dependent elongation dynamics on macrolide-bound ribosomes. Cell Reports, 1534–1546 (2014).
8. T Lambert, Antibiotics that affect the ribosome. Rev. sci. tech. Off. int. Epiz., 57–64 (2012).
9. DN Wilson, Ribosome-targeting antibiotics and mechanisms of bacterial resistance. Nat. Rev. Microbiol., 35–48 (2014).
10. WJ Blake, M Kaern, CR Cantor, JJ Collins, Noise in eukaryotic gene expression. Nature, 633–637 (2003).
11. JRS Newman, et al., Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature, 840–846 (2006).
12. E Korkmazhan, H Teimouri, N Peterman, E Levine, Dynamics of translation can determine the spatial organization of membrane-bound proteins and their mRNA. Proc. Natl. Acad. Sci., 13424–13429 (2017).
13. E Lecuyer, et al., Global analysis of mRNA localization reveals a prominent role in organizing cellular architecture and function. Cell, 174–187 (2007).
14. F Besse, A Ephrussi, Translational control of localized mRNAs: restricting protein synthesis in space and time. Nat. Rev. Mol. Cell Biol., 971–980 (2008).
15. A Nieb, M Siemann-Herzberg, R Takors, Protein production in Escherichia coli is guided by the trade-off between intracellular substrate availability and energy cost. Microb. Cell Fact. (2019).
16. NT Ingolia, Ribosome profiling: new views of translation, from single codons to genome scale. Nat. Rev. Genet., 205–213 (2014).
17. A Newhart, SM Janicki, Seeing is believing: Visualizing transcriptional dynamics in single cells. J. Cell. Physiol., 259–265 (2014).
18. A Mayer, L Churchman, Genome-wide profiling of RNA polymerase transcription at nucleotide resolution in human cells with native elongating transcript sequencing. Nat. Protoc., 813–833 (2016).
19. S Iwasaki, NT Ingolia, Seeing translation. Science, 1391–1392 (2016).
20. CT MacDonald, JH Gibbs, AC Pipkin, Kinetics of biopolymerization on nucleic acid templates. Biopolymers, 1–25 (1968).
21. CT MacDonald, JH Gibbs, Concerning the kinetics of polypeptide synthesis on polyribosomes. Biopolymers, 707–725 (1969).
22. F Spitzer, Interaction of Markov processes. Adv. Math., 246–290 (1970).
23. R Zia, J Dong, B Schmittmann, Modeling translation in protein synthesis with TASEP: A tutorial and recent developments. J. Stat. Phys., 405–428 (2011).
24. LB Shaw, RK Zia, KH Lee, Totally asymmetric exclusion process with extended objects: a model for protein synthesis. Phys. Rev. E Stat. Nonlin. Soft. Matter Phys., 021910 (2003).
25. A Schadschneider, D Chowdhury, K Nishinari, Stochastic Transport in Complex Systems: From Molecules to Vehicles. (Elsevier), (2011).
26. I Pinkoviezky, N Gov, Transport dynamics of molecular motors that switch between an active and inactive state.
Phys. Rev. E , 022714 (2013).27. H Zur, T Tuller, Predictive biophysical modeling and understanding of the dynamics of mRNA translation and its evolution. Nucleic Acids Res. , 9031–9049 (2016).28. B Derrida, E Domany, D Mukamel, An exact solution of a one-dimensional asymmetric exclusion model with open boundaries. J. Stat. Phys. , 667–687 (1992).29. B Derrida, MR Evans, V Hakim, V Pasquier, Exact solution of a 1D asymmetric exclusion model using a matrix formulation. J. Phys. A: Math. Gen. , 1493 (1993).30. S Reuveni, I Meilijson, M Kupiec, E Ruppin, T Tuller, Genome-scale analysis of translation elongation with a ribosome flow model. PLoS Comp. Biol. , e1002127 (2011).31. Y Zarai, M Margaliot, T Tuller, Ribosome flow model with extended objects. J. R. Soc. Interface (2017).32. T Tuller, et al., Composite effects of gene determinants on the translation speed and density of ribosomes. Genome Biol. , R110 (2011).33. A Dana, T Tuller, Efficient manipulations of synonymous mutations for controlling translation rate–an analytical approach. J. Comput. Biol. , 200–231 (2012).34. M Margaliot, ED Sontag, T Tuller, Entrainment to periodic initiation and transition rates in a computational model for gene translation. PLoS ONE , e96039 (2014).35. M Margaliot, T Tuller, Stability analysis of the ribosome flow model. IEEE/ACM Trans. Comput. Biol. Bioinform. , 1545–1552 (2012).36. G Poker, M Margaliot, T Tuller, Sensitivity of mRNA translation. Sci. Rep. (2015).37. M Margaliot, T Tuller, ED Sontag, Checkable conditions for contraction after small transients in time and amplitude in Feedback Stabilization of Controlled Dynamical Systems: In Honor of LaurentPraly , ed. N Petit. (Springer International Publishing, Cham, Switzerland), pp. 279–305 (2017).38. RA Horn, CR Johnson,
Matrix Analysis . (Cambridge), 2 edition, (2013). |
9. G Poker, Y Zarai, M Margaliot, T Tuller, Maximizing protein translation rate in the nonhomogeneous ribosome flow model: A convex optimization approach.
J. R. Soc. Interface , 20140713 (2014).40. M Margaliot, T Tuller, Ribosome flow model with positive feedback. J. R. Soc. Interface , 20130267 (2013).41. Y Zarai, M Margaliot, T Tuller, On the ribosomal density that maximizes protein translation rate. PLoS ONE , 1–26 (2016).42. Y Zarai, M Margaliot, T Tuller, Optimal down regulation of mRNA translation. Sci. Rep. , 41243 (2017).43. Y Zarai, M Margaliot, T Tuller, A deterministic mathematical model for bidirectional excluded flow with langmuir kinetics. PLoS ONE , e0182178 (2017).44. A Raveh, M Margaliot, E Sontag, T Tuller, A model for competition for ribosomes in the cell. J R Soc Interface , 20151062 (2016).45. I Nanikashvili, Y Zarai, A Ovseevich, T Tuller, M Margaliot, Networks of ribosome flow models for modeling and analyzing intracellular traffic. Sci. Rep. (2019).46. A Raveh, Y Zarai, M Margaliot, T Tuller, Ribosome flow model on a ring. IEEE/ACM Trans. Comput. Biol. Bioinform. , 1429–1439 (2015).47. Y Zarai, A Ovseevich, M Margaliot, Optimal translation along a circular mRNA. Sci. Rep. , 9464 (2017).48. Y Zarai, M Margaliot, AB Kolomeisky, A deterministic model for one-dimensional excluded flow with local interactions. PLoS ONE , 1–23 (2017).49. E Bar-Shalom, A Ovseevich, M Margaliot, Ribosome flow model with different site sizes. SIAM J. Appl. Dyn. Syst. , 541–576 (2020).50. S Edri, E Gazit, E Cohen, T Tuller, The RNA polymerase flow model of gene transcription. IEEE Trans Biomed Circuits Syst. , 54–64 (2014).51. J Pviseaulsson, Summing up the noise in gene networks. nature , 415–418 (2004).52. J Hausser, A Mayo, L Keren, U Alon, Central dogma rates and the trade-off between precision and economy in gene expression. Nat. Commun . , 1–15 (2019).53. HH McAdams, A Arkin, Stochastic mechanisms in gene expression. Proc. Natl. Acad. Sci . , 814–819 (1997).54. Y Zarai, O Mendel, M Margaliot, Analyzing linear communication networks using the ribosome flow model in Proc. 2015 IEEE Int. Conf. 
on Computer and Information Technology; UbiquitousComputing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing . pp. 755–761 (2015).55. Y Zarai, M Margaliot, On minimizing the maximal characteristic frequency of a linear chain.
IEEE Trans. Autom. Control . , 4827–4833 (2017).56. F Dyson, The dynamics of a disordered linear chain. Phys. Rev. , 1331–1338 (1953).57. B Zhidong, JW Silverstein, Spectral Analysis of Large Dimensional Random Matrices . (Springer-Verlag, New York), (2010).58. WC Yueh, Eigenvalues of several tridiagonal matrices.
Appl. Math. E-Notes5