Cotranscriptional kinetic folding of RNA secondary structures including pseudoknots
aa r X i v : . [ q - b i o . B M ] J u l Cotranscriptional kinetic folding of RNAsecondary structures
Vo Hong Thanh and Pekka Orponen
Department of Computer Science, Aalto University
Email: { thanh.vo,pekka.orponen } @aalto.fi Computational prediction of RNA structures is animportant problem in computational structural biology.Studies of RNA structure formation often assume thatthe process starts from a fully synthesized sequence.Experimental evidence, however, has shown thatRNA folds concurrently with its elongation. Weinvestigate RNA structure formation, taking intoaccount also the cotranscriptional effects. We pro-pose a single-nucleotide resolution kinetic model ofthe folding process of RNA molecules, where thepolymerase-driven elongation of an RNA strand by anew nucleotide is included as a primitive operation,together with a stochastic simulation method thatimplements this folding concurrently with the tran-scriptional synthesis. Numerical case studies showthat our cotranscriptional RNA folding model canpredict the formation of metastable conformationsthat are favored in actual biological systems. Ournew computational tool can thus provide quantitativepredictions and offer useful insights into the kinetics ofRNA folding.
Keywords:
RNA; Costranscriptional folding; Stochas-tic simulation.
Ribonucleic acid (RNA) is a biopolymer consti-tuted of nucleotides with bases adenine (A), cytosine(C), guanine (G) and uracil (U). The synthesis of anRNA molecule from its DNA template is initiated whenthe corresponding RNA polymerase binds to the DNApromoter region. RNA has been shown to serve diversefunctions in a wide range of cellular processes such asregulating gene expression and acting as an enzymatic catalyst [3, 23], and has also recently been used as anemerging material for nanotechnology [11].Computational prediction of RNA secondary struc-tures given their primary sequences is often based onthe estimation of changes in free energy, which postu-lates that thermodynamically an RNA strand will foldinto a conformation that yields the minimum free en-ergy (MFE) value (see e.g., [5] for a review on thetopic). We limit our focus in this work to pseudoknot-free secondary structures, i.e., structures with no cross-ing base pairs, because finding MFE structures withpseudoknots in a general energy model is an NP-complete problem [15]. Under this simplification, theenergy of an RNA secondary structure can be mod-eled as the sum of energies of strand loops flankedby base pairs. The loop energy parameters have beenmeasured experimentally and are detailed in a nearestneighbor parameter database [27]. Methods groundedin the thermodynamic framework, such as the Zuker al-gorithm [31] and its extensions [30, 17], can be usedto compute pseudoknot-free MFE secondary structureseffectively in a bottom-up manner.The kinetic approach is an alternative way to studythe RNA folding process. It models the folding as arandom process where the additions/deletions of basepairs in the current structure are assigned probabilitiesproportional to the respective changes in free energyvalues [6]. A folding pathway of a sequence is thengenerated by executing a stochastic simulation compu-tation [6, 19, 4]. We refer to the book by Marchetti et al. [16] for a comprehensive review of state-of-the-art stochastic simulation techniques. Each simulationrun on a given RNA sequence can produce a list ofpossible structures that it can fold into. Such dynamicview of RNA folding allows one to capture cases whereocal conformations are progressively folded to createmetastable structures that kinetically trap the folding,thus complementing the prediction of equilibrium MFEstructures produced by the thermodynamic approach.The study of RNA structure formation often as-sumes that the folding process starts from a fully syn-thesized open strand, the denatured state . However,experimental evidence [28, 21] has shown that RNAstarts folding already concurrently with the transcrip-tion. The nucleotide transcription speed varies from200 nt/sec (nucleotides per second) in phages, to 20-80 nt/sec in bacteria, and 5-20 nt/sec in humans [21].The RNA dynamics also occur over a wide range oftime-scales where base pairing takes about 10 − msec;structure formation is about 10-100 msec; and kineti-cally trapped conformations can persist for minutes orhours [28]. One consequence of considering cotran-scriptional folding is that the base pairs at the 5’ end ofthe RNA strand will form first, while the ones at the 3’end can only be formed once the transcription is com-plete, which leads to structural asymmetries. Cotran-scriptional folding can thus form transient structuresthat are only present for a specific time period and in-volved in distinct roles. For instance, gene expressionwhen considering such transient conformations of RNAduring cotranscriptional folding can expose oscillationbehavior [2]. We refer to the review by Lai et al. [14]for further discussion on the importance of cotranscrip-tional effects.In this work, we extend the kinetic approach to takeinto account cotranscriptional effects on the folding ofRNA at single-nucleotide resolution. Unlike existingkinetic methods that either start from a fully synthe-sized sequence [6] or approximate the effects of co-transcriptional folding [22, 20, 10, 29, 9], we explicitlyconsider the elongation of RNA during transcription asa primitive action in the model. The time when a newnucleotide is added to the current RNA chain is speci-fied by the transcription speed of the RNA polymeraseenzyme. The RNA strand in our modeling approachcan elongate with newly synthesized nucleotides addedto the sequence and fold simultaneously. We also pro-pose an exact stochastic simulation method, the CoS-tochFold algorithm, to handle the transcription events.The rest of the paper is organized as follows. Sec. 2reviews some background on kinetic folding of RNA.In Sec. 3, we present our work to extend the model ofRNA folding to incorporate the transcription process.Sec. 4 reports our numerical experiments on case stud-ies. Concluding remarks are in Sec. 5. Let S n be a linear sequence of length n of four basesA, C, G, and U in which the 5’ end is at position 1and the 3’ end is at position n . A base at position i may form a hydrogen bond with a base at j , denotedby ( i , j ) , if they form a Watson-Crick pair
A-U, G-Cor a wobble pair
G-U. A secondary structure formedby intra-molecular interactions between bases in S n is alist of base pairs ( i , j ) with i < j satisfying constraints:a) the i th base and j th base must be separated by atleast 3 (un-paired) bases, i.e., j − i >
3; b) for any basepair ( k , l ) with k < l , if i = k then j = l ; and c) forany base pair ( k , l ) with k < l , if i < k then i < k < l < j . The first condition prevents the RNA backbonefrom bending too sharply. The second one prevents theforming of tertiary structure motifs such as base tripletsand G-quartets. The last constraint ensures that no twobase pairs intersect, i.e. there are no pseudoknots.Let Ω S n be the set of all possible secondary struc-tures formed by S n . Consider a secondary structure x ∈ Ω S n . It can be represented compactly as a stringof dots and brackets (see Fig. 1). Specifically, for abase pair ( i , j ) , an opening bracket ’(’ is put at i th po-sition and a closing bracket ’)’ at j th position. Finally,unpaired positions are represented by dots ’.’. The dot-bracket representation is unambiguous because the basepairs in a secondary structure do not cross each other.The free energy of x can be estimated by the near-est neighbor model [17], in which the free energy of anRNA secondary structure is taken to be the sum of en-ergies of components flanked by base pairs. Formally,for a base pair ( i , j ) in x , we say that base k , i < k < j , is accessible from ( i , j ) if there is no other base pair ( i ′ , j ′ ) such that i < i ′ < k < j ′ < j . The set of accessible basesflanked by base pair ( i , j ) is called the loop L ( i , j ) . Thenumber of unpaired bases in a loop L ( i , j ) is its size ,while the number of enclosed base pairs determines itsdegree. Based on these properties, loops L ( i , j ) can beclassified as stacks , hairpins , bulges , internal loops and multi-loops . The unpaired bases that are not containedin loops constitute the exterior (or external) loop L e .A secondary structure x is thus uniquely decom-posed into a collection of loops x = ∪ ( i , j ) L ( i , j ) ∪ L e .Based on this decomposition, the free energy G x (inkcal) of secondary structure x is computed as: G x = ∑ ( i , j ) G L ( i , j ) + G L e (1)where G L ( i , j ) is the free energy of loop L ( i , j ) . Ex- ) (((((((..((((........)))).(((((.......))))).....(((((.......)))))))))))).… b) Fig. 1. Representation of the tRNA molecule in a) dot-bracket notation and b) graphical visualization. The graphical visualization is made bythe Forna tool [13] .perimental energy values for G L ( i , j ) are available in thenearest neighbor database [27].Let y ∈ Ω S n be a secondary structure derived di-rectly from x by an intramolecular reaction betweenbases i and j in x . Commonly, three operations on apair of bases, referred to as the move set (see Fig. 2),are defined [6]: Addition: y is derived by adding a base pair thatjoins bases i and j in x that are currently unpairedand eligible to pair. Deletion: y is derived by breaking a current basepair ( i , j ) in x . Shifting: y is derived by shifting a base pair ( i , j ) in x to form a new base pair ( i , k ) or ( k , j ) .Let k x → y be the rate (probability per time unit) of thetransition from x to y . In a conformation x , the RNAmolecule may wander vibrationally around its energybasin G x for a long time, before it surmounts an energybarrier to escape to a conformation y in another basin.The dynamics of the transition from x to y characterizes a rare event in Molecular Dynamics (MD). Here, weadopt the coarse-grained kinetic Monte Carlo approxi-mation [18, 12], and model the transition rate k x → y as: k x → y = k e − ∆ G xy / RT (2)where T is absolute temperature in Kelvin (K), R = . × − ( kcal · K − · mol − ) is the gas constantand ∆ G xy = G y − G x denotes the difference betweenfree energies of x and y . k is a constant for calibra-tion of time.Let P ( x , t ) be the probability that the system is atconformation x at time t . The dynamics of P ( x , t ) isformulated by the (chemical) master equation [16] as: dP ( x , t ) dt = ∑ y ∈ Ω Sn h k y → x P ( x , t ) − k x → y P ( x , t ) i (3)Analytically solving Eq.3 requires to enumerate all pos-sible states x and their neighbors y . The size of the statepace k Ω S n k ∼ n − / α n with α = . n , and the numberof neighbors of x is in order of O ( n ) [8]. Thus, dueto the high dimension of the state space, solving Eq.3often involves numerical simulation.Let P ( y , τ | x , t ) be the probability that, given currentstructure x at time t , x will fold into y in the next in-finitesimal time interval [ t + τ , t + τ + d τ ) . We have P ( y , τ | x , t ) = k x → y e − k x τ d τ (4)where k x = ∑ y ∈ Ω Sn k x → y is the sum of transition ratesto single-move neighbors of x . Eq. 4 shows that theprobability that x moves to y is k x → y / k x , and the wait-ing time τ until that transition follows an exponentialdistribution Exp ( k x ) . These facts are the basis for ourkinetic folding algorithm StochFold presented as Algo-rithm 1. We note that StochFold shares the structureof the earlier algorithm Kinfold [6] and its improve-ments [4, 25, 24, 26].
Algorithm 1
StochFold
Require: initial RNA conformation s and ending time T max initialize x = s and time t = repeat enumerate next possible conformations of thecurrent conformation x and put into set Q compute the transition rate k x → y for each y ∈ Q and total rate k x = ∑ y ∈ Q k x → y select next conformation y ∈ Q with probability k x → y / k x sample waiting time to the next folding event τ ∼ Exp ( k x ) set x = y and t = t + τ until t ≥ T max The folding of an RNA strand adapts immediatelyto new nucleotides synthesized during the transcription.The kinetic approach described in Section 2 cannot cap-ture the effects of such cotranscriptional folding, be-cause it considers only interactions between bases al-ready present in the sequence. We outline in this sectionan approach to incorporating these effects in the simu-lation. The transcription process is explicitly taken into account by extending the move set with the new oper-ation of elongation . Our extended move set thus com-prises four operations: addition, deletion, shifting andelongation. The first three operations are defined as inthe previous section. In elongation, the current RNAchain increases in length and a newly synthesized nu-cleotide is added to its 3’ end. Figure 2 illustrates theextended move set. a) Addition b) Deletion c) Shifting d) Elongation
Fig. 2. Extended move set consisting of a) addition, b) deletion, c)shifting and d) elongation. The elongation move models the transcrip-tion process extending the current RNA chain with a new nucleotideat the 3’ end.
Under the extended move set, we define two eventtypes: folding event and transcription event . A fold-ing event is an internal event that occurs when one ofthe three operations addition, deletion or shifting, is ap-plied to a base pair of the current sequence. A tran-scription event happens when the elongation operationis applied. It is an external event whose rate is specifiedby the transcription speed of the RNA polymerase en-zyme. The occurrences of transcription events break theMarkovian property of transitions between conforma-tions. This is because when a new nucleotide is addedto the current RNA conformation, the number of nextpossible conformations increases. The waiting time ofthe next folding event also changes and thus a new fold-ing event has to be recomputed.Algorithm 2 outlines how the CoStochFold algo-ithm handles this situation. The key element of CoS-tochFold (lines 8 - 14) is a race where the event havingthe smallest waiting time will be selected to update thecurrent RNA conformation. More specifically, supposethe current structure is x at time t . Let τ e be the wait-ing time to the next folding event and τ trans the waitingtime to the next transcription event. Assuming that noevents occur earlier, τ e has an exponential distributionwith rate k x which is the sum of all transition rates of ap-plying addition, deletion and shifting operations to basepairs in x . For simplifying the computation of τ trans , weassume that it is the expected time to transcribe one nu-cleotide. Let N trans be the (average) transcription speedof the polymerase. We compute τ trans as: τ trans = / N trans (5)Thus, given current time t , the next folding event willoccur at time t e = t + τ e and, respectively, the transcrip-tion event where a new nucleotide will be added to thecurrent sequence is scheduled at time t trans = t + τ trans .We decide which event will occur by comparing t e and t trans . If t e > t trans , then a new nucleotide is first tran-scribed and added to the current RNA conformation.Otherwise, a folding event is performed where a struc-ture in the set Q of neighboring structures is selected toupdate the current conformation.We remark that one can easily extend Algorithm 2to allow modeling τ trans as a random variable withoutchanging the steps of event selection. Specifically, oneonly needs to change step 2 in Algorithm 2 to generatethe waiting time of the next transcription event, whilekeeping the simulation otherwise unchanged. We illustrate the capability of our cotranscriptionalkinetic folding method on three case studies: a) theswitching molecule [6], b) the E. coli signal recogni-tion particle (SRP) RNA and c) the SV-11 variant inQ β replicase [1]. SRP and SV-11 are real biologicalmolecules that manifest the characteristics of cotran-scriptional folding that thermodynamic/kinetic meth-ods [31, 7] would fail to capture if initiated from fullydenatured sequences. Our method is not only able toproduce these structures, but also provides insight intomechanisms that biological systems may use to guidethe structure formation process. The code for the imple-mentation of our CoStochFold algorithm is available at: https://github.com/vo-hong-thanh/stochfold . Algorithm 2
CoStochFold
Require: initial RNA conformation s , transcriptionspeed N trans , and ending time T max initialize x = s and time t = set τ trans = / N trans repeat enumerate next possible conformations by ap-plying addition, deletion and shifting operationson the current conformation x and put into set Q compute the transition rate k x → y , for y ∈ Q , andtotal rate k x = ∑ y ∈ Q k x → y sample waiting time to the next folding event τ e ∼ Exp ( k x ) and set t e = t + τ e compute the next transcription event t trans = t + τ trans if ( t e > t trans ) then elongate x set t = t trans else select next conformation y ∈ Q with probabil-ity k x → y / k x set x = y and t = t e end if until t ≥ T max We consider the dynamic folding of the RNAsequence S = ”GGCCCCUUUGGGGGCCAGACC-CCUAAAGGGGUC” [6]. Two stable conforma-tions of the sequence are: the MFE structure x = “((((((((((((((.....))))))))))))))” ( − . kcal), and asuboptimal structure y = “((((((....)))))).((((((....))))))”( − . kcal). In this example, we focus on the num-ber of first-hitting time occurrences of a target struc-ture. The number of first-hitting time occurrences of astructure in a time interval divided by the total numberof simulation runs approximates the first-passage timeprobability of the structure, i.e., its folding time [6].Fig. 3 plots the number of first-hitting time oc-currences of the MFE structure x and the suboptimal y with varying transcription speeds. Kinetic foldingstarting from the denatured state was carried out bythe StochFold algorithm, while cotranscriptional fold-ing was conducted by the CoStochFold algorithm. Foreach method, we performed simulation runs onthe sequence S in which each simulation ran until a tar-get structure was observed or the ending time T max = was reached. Fig. 3 shows that changing the transcrip-tion speed of the polymerase significantly affects the N u m b e r o f o cc u rr e n c e s Time((((((....)))))).((((((....))))))
Denatured stateTranscription speed: 200 nt/secTranscription speed: 50nt/secTranscription speed: 5nt/sec
Transcription speed 1nt/sec N u m b e r o f o cc u rr e n c e s Time((((((((((((((.....))))))))))))))
Fig. 3. First-hitting time occurrences of MFE structure x = “((((((((((((((.....))))))))))))))” ( − . kcal, left) and suboptimal y = “((((((....)))))).((((((....))))))” ( − . kcal, right). folding characteristics of the sequence. Specifically, co-transcriptional folding with slow transcription speed fa-vors the suboptimal structure y . It increases the numberof occurrences of y , while reducing the number of oc-currences of the MFE structure x .Fig. 4 compares the number of first-hitting time oc-currences of the MFE structure x with respect to thesuboptimal conformation y . We note that if the sim-ulation starts from the fully denatured state, the occur-rence ratio of the suboptimal conformation y to the MFEstructure x is about 2:1 as also observed by Flamm etal. [6]. However, the ratio increases noticeably whenthe transcription speed decreases. For example, the oc-currence ratio of the suboptimal conformation y to theMFE structure x is about 6.5:1 in the case of transcrip-tion speed nt/sec. This section studies the process of structural forma-tion of the E. coli SRP RNA during transcription. SRPis a nt long molecule, which recognizes the signalpeptide and binds to the ribosome locking the proteinsynthesis. Its active structure is a long helical structurecontaining interspersed inner loops (see S3 in Fig. 5).Experimental work [28] using SHAPE-seq techniqueshas suggested a series of structural rearrangements dur-ing transcription that ultimately result in the SRP heli-cal structure. In particular, the 5’ end of SRP forms ahairpin structure during early transcription. The struc-ture persists until it reaches a length of nt. The unsta- N u m b e r o f O cc u rr e n c e s ((((((((((((((..…))))))))))))))((((((....)))))).((((((....)))))) Fig. 4. First-hitting time occurrences of MFE structure x = ”((((((((((((((.....))))))))))))))” ( − . kcal) and suboptimal y = ”((((((....)))))).((((((....))))))” ( − . kcal) by varying transcriptionspeeds. ble hairpin then rearranges to its active structure. Fig. 5depicts three structural motifs at nt (S1), nt (S2),and nt (S3), respectively, in the formation of SRP. ig. 5. The folding pathway of secondary structures of E. Coli signaling recognition particle (SNP) RNA. The hairpin motif S1 is formed attranscript length nt and form S2 completed at length nt. When reaching the transcript length nt, SRP rearranges into its stablehelical shape S3. The visualization of structures is made by the Forna tool [13]. Specifically, the hairpin motif S1 emerges at transcriptlength nt, and the transcript then continues elongatingto form structure S2 at length nt. When reaching thetranscript length nt, SRP rearranges into its persis-tent helical conformation S3.We validated the prediction of the CoStochFold al-gorithm against the experimental work in [28]. To dothat, we performed simulations of the algorithmto fold SRP transcriptionally. The average transcriptionspeed was set to nt/sec. Fig. 6 shows the frequencyof occurrences of the considered structures during thesimulated time of seconds. The plot on the left showsthe cotranscriptional folding of SRP and the plot on theright presents the folding of SRP starting from the de-natured state. The figures clearly show that the CoS-tochFold algorithm can capture the folding pathway ofSRP. Specifically, the hairpin motif S1 starts to format about t = s when the transcript length is nt andpeaks at about t = s when nt have been transcribed.At about t = s, Structure S2 appears and then rear-ranges to S3 at about t = s. We note that the simulatedfolding without considering transcription only the con-formation S3 is found. SV-11 is a nt long RNA sequence. It is a recom-binant between the plus and minus strands of the natu-ral Q β template MNV-11 RNA [1]. The result of therecombination is a highly palindromic sequence whosemost stable secondary structure is a long hairpin-likestructure, the MFE structure in Fig. 7a). The MFEstructure, however, disables Q β replicase because itsprimer regions are blocked. Experimental work [1] hasshown that an active structure of SV-11 for replicationis when it folds into a metastable conformation. Fig. 7b)depicts a metastable structure with a hairpin-hairpin-multi-loop motif with open primer regions that serve astemplate for replication. Transition from the metastablestructure to the MFE structure has been observed exper-imentally but is rather slow [1], indicating long relax-ation time to equilibrium.We plot in Fig. 8 the energy vs. occurrence fre-quency of structures by the cotranscriptional foldingof SV-11. The result is obtained by simulationruns of our CoStochFold algorithm with average tran-scription speed nt/sec. To determine the frequency ofoccurrence of a structure, we discretize the simulation P e r c (cid:0) n t a g e o f p o p u l a t i o n time CoStochFold S (cid:1) (cid:2) time StochFold
Fig. 6. Prediction of the structural formation of SNP. Left: cotranscriptional folding. Right: folding from denatured state without transcription. a ) M F E s t ruc (cid:3) ure b (cid:4) (cid:5) e (cid:6)a s (cid:7)(cid:8)b le s (cid:9) ruc (cid:10) ure Fig. 7. SV-11 with two conformations a) MFE structure ( − . kcal) and b) metastable structure ( − . kcal). The visualization ofstructures is made by the Forna tool [13]. time [ , t ] into intervals and record how much time wasspent in each structure within each interval. The fre-quency of occurrence of a structure in each time inter-val is then averaged over runs. The figure showsthat the folding favors metastable structures, and disfa-vors the MFE structure. In particular, cotranscriptionalfolding quickly folds SV-11 to its metastable conforma-tions with the mode of the energy distribution at about − kcal.Fig. 9 shows the long-term occurrence frequenciesof structures at different energy levels in the SV-11 fold-ing and Fig. 10 compares the occurrence frequencies ofa specific metastable structure depicted in Fig. 7b) with the MFE structure and two randomly selected subopti-mal structures in the locality of MFE structure. Fig. 10shows that the SV-11 molecule interestingly prefers themetastable structure over the MFE structure. Specif-ically, the metastable structure in the cotranscriptionalfolding regime is in the time interval [ , ] about ten-fold more frequent than the MFE structure. We propose a kinetic model of RNA folding thattakes into account the elongation of an RNA chain dur-ing transcription as a primitive structure-forming op- .000.050.100.150.200.250.300.350.40 - (cid:11) (cid:12) (cid:13) (cid:14) (cid:15) (cid:16) (cid:17) (cid:18) (cid:19) (cid:20) (cid:21) (cid:22) (cid:23) (cid:24) (cid:25) (cid:26) t = 5 (cid:27) (cid:28) (cid:29) (cid:30) (cid:31) ! " $ % & ’ ( * + , t = 10 . / : ; < = > t = 15 ? @ A B C D F G H I J K L N O P Q t = 20 R T U V W X Y Z [ \ ] ^ _ ‘ c d f t = 25 g h i j k l m n o p q r s u v w x t = 50 Fig. 8. Cotranscriptional folding of SV-11. The x-axis denotes the energy level, and y-axis shows the frequency of structures in the sameenergy level. eration alongside the common base-pairing operations.Based on this, we developed a new stochastic simula-tion algorithm CoStochFold to explore RNA structureformation in the cotranscriptional folding regime. Weshowed through numerical case studies that the methodcan quantitatively predict the formation of metastableconformations in an RNA folding pathway. The simu-lation method and tools thus promise to offer useful in-sights into RNA folding kinetics in real biological sys- tems.
Acknowledgements
This work has been supported by Academy ofFinland grant no. 311639, ”Algorithmic Designs forBiomolecular Nanostructures (ALBION).” .000.050.100.150.200.25 y z { | } ~ (cid:127) (cid:128) (cid:129) (cid:130) (cid:131) (cid:132) (cid:133) (cid:134) (cid:135) (cid:136) (cid:137) F r e (cid:138) u e n c (cid:139) (cid:140) ner (cid:141)(cid:142) le (cid:143) el t (cid:144) (cid:145)(cid:146)(cid:147)(cid:148)(cid:149) Fig. 9. Frequency of structures in folding SV-11. (cid:150)(cid:151) (cid:152)(cid:153) (cid:154)(cid:155) (cid:156)(cid:157) (cid:158)(cid:159) (cid:160)¡ ¢£ ⁄¥ ƒ§
03 0 2000 4000 6000 8000 10000 ¤ r e ' u e n c y “«‹ cc u rr e n c e ›fifl e (cid:176) e –† s ‡·(cid:181) le ¶ F •‚ u „”»…‰(cid:190)¿ l 1 (cid:192) u `´ˆ˜¯˘˙ l 2 ¨ e (cid:201)˚ s ¸(cid:204)˝ le ˛ (((((((((((...)))))))..((((((((((....))))))))))........)))).....((((((((........)))))))).((((((((.....))))))))..… ˇ F —(cid:209) (((.(((((((((((((((((((((((((((.(.((((((((((((((..((((...))))..)))))))))))))).).)))))))))))))))))))..))))))))..))). (cid:210) u (cid:211)(cid:212)(cid:213)(cid:214)(cid:215)(cid:216)(cid:217) l 1 (cid:218) (((.(((((((((((((((((((((((((((...((((((((((((((..((((...))))..))))))))))))))...)))))))))))))))))))..))))))))..))). (cid:219) u (cid:220)(cid:221)(cid:222)(cid:223)(cid:224)Æ(cid:226) l 2 ª (((.(((((((((((((((((((((((((((...((((((((((((((..((((...))))..))))))))))))))...))))))))))))))))))..)))))))))..))). Fig. 10. Frequency of the metastable structure in comparison with the MFE structure and two randomly selected suboptimal structures.
References [1] C. K. Biebricher and R. Luce. In vitro recombination and terminal elongationof RNA by Q β replicase. EMBO J. , 11:5129–5135, 1992.[2] Dmitri Bratsun, Dmitri Volfson, Lev S. Tsimring, and Jeff Hasty. Delay- induced stochastic oscillations in gene regulation.
PNAS , 102(41):14593–14598, 2005.[3] Lesley J. Collins and David Penny. The RNA infrastructure: Dark matter ofthe eukaryotic cell?
Trends Genet. , 25(3):120–128, 2009.4] Eric C. Dykeman. An implementation of the Gillespie algorithm for RNAkinetics with logarithmic time update.
Nucleic Acids Res. , 43(12):5708–5715,2015.[5] J¨org Fallmann, Sebastian Will, Jan Engelhardt, Bj¨orn Gr¨uning, Rolf Back-ofenc, and Peter F. Stadler. Recent advances in RNA folding.
J. Biotechnol. ,261:97–104, 2017.[6] Christoph Flamm, Walter Fontana, Ivo L. Hofacker, and Peter Schuster. RNAfolding at elementary step resolution.
RNA , 6:325–338, 2000.[7] Alexander P. Gultyaev, van Batenburg F. H. D., and Cornelis W. A. Pleij. Thecomputer simulation of RNA folding pathways using a genetic algorithm.
J.Mol. Biol. , 250(1):37–51, 1995.[8] Ivo L. Hofacker, Peter Schuster, and Peter F. Stadler. Combinatorics of RNAsecondary structures.
Discrete Appl. Math. , 88(1-3):207–237, 1998.[9] Boyang Hua, Subrata Panja, Yanbo Wang, Sarah A. Woodson, and Taekjip Ha.Mimicking co-transcriptional rna folding using a superhelicase.
J. Am. Chem.Soc. , 140(32):10067–10070, 2018.[10] Herv´e Isambert and Eric D. Siggia. Modeling RNA folding paths with pseu-doknots: Application to hepatitis delta virus ribozyme.
PNAS , 97(12):6515,2000.[11] Daniel Jasinski, Farzin Haque, Daniel W. Binzel, and Peixuan Guo. Advance-ment of the emerging field of RNA nanotechnology.
ACS Nano , 11(2):1142–1164, 2017.[12] Kyozi Kawasaki. Diffusion constants near the critical point for time-dependentIsing models.
Phys. Rev. , 145:224–230, 1966.[13] Peter Kerpedjiev, Stefan Hammer, and Ivo L. Hofacker. Forna (force-directedRNA): Simple and effective online RNA secondary structure diagrams.
Bioin-formatics , 31(20):3377–3379, 2015.[14] Daniel Lai, Jeff R. Proctor, and Irmtraud M. Meyer. On the importance ofcotranscriptional RNA structure formation.
RNA , 19(11):1461–1473, 2013.[15] Rune B. Lyngsøand Christian N. S. Pedersen. RNA pseudoknot prediction inenergy-based models.
J. Comp. Biol. , 7(3-4):409–427, 2000.[16] Luca Marchetti, Corrado Priami, and Vo H. Thanh.
Simulation Algorithms forComputational Systems Biology . Springer, 2017.[17] David H. Mathews, Jeffrey Sabina, Michael Zuker, and Douglas H. Turner.Expanded sequence dependence of thermodynamic parameters improves pre-diction of RNA secondary structure.
J. Mol. Biol. , 288(5):911–940, 1999.[18] Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, andAugusta H. Teller. Equation of state calculations by fast computing machines.
J. Chem. Phys. , 21:1087–1092, 1953.[19] A. A. Mironov and V. F. Lebedev. A kinetic model of RNA folding.
Biosys-tems , 30(1-3):49–56, 1993.[20] Andrey Mironov and Alexander Kister. RNA secondary structure forma-tion during transcription.
Journal of Biomolecular Structure and Dynamics ,4(1):1–9, 1986.[21] Tao Pan and Tobin R. Sosnick. RNA folding during transcription.
Annu. Rev.Biophys. Biomol. Struct. , 35:161–175, 2006.[22] Jeff R. Proctor and Irmtraud M. Meyer. COFOLD: an RNA secondary struc-ture prediction method that takes co-transcriptional folding into account.
Nu-cleic Acids Res. , 41(9):e102, 2013.[23] Gisela Storz. An expanding universe of noncoding RNAs.
Science ,296(5571):1260–1263, 2002.[24] Vo Hong Thanh, Corrado Priami, and Roberto Zunino. Efficient rejection-based simulation of biochemical reactions with stochastic noise and delays.
J.Chem. Phys. , 141(13):10B602, 2014.[25] Vo Hong Thanh and Roberto Zunino. Adaptive tree-based search for stochasticsimulation algorithm.
Int. J. Comput. Biol. Drug. Des. , 74(4):341–357, 2014.[26] Vo Hong Thanh, Roberto Zunino, and Corrado Priami. Efficient stochasticsimulation of biochemical reactions with noise and delays.
J. Chem. Phys. ,146(8):084107, 2017.[27] Douglas H. Turner and David H. Mathews. NNDB: The nearest neighborparameter database for predicting stability of nucleic acid secondary structure.
Nucleic Acids Res. , 38:D280–D282, 2009.[28] Kyle E. Watters, Eric J. Strobel, Angela M. Yu, John T. Lis, and Julius B.Lucks. Cotranscriptional folding of a riboswitch at nucleotide resolution.
Nat.Struct. Mol. Biol. , 23(12):1124–1131, 2016.[29] Peinan Zhao, Wenbing Zhang, and Shi-Jie Chen. Cotranscriptional foldingkinetics of ribonucleic acid secondary structure.
J. Chem. Phys , 135:245101,2011.[30] Michael Zuker. On finding all suboptimal foldings of an RNA molecule.
Sci-ence , 244(4900):48–52, 1989.[31] Michael Zuker and Patrick Stiegler. Optimal computer folding of large RNAsequences using thermodynamics and auxiliary information.