Iterative Annealing Mechanism Explains the Functions of the GroEL and RNA Chaperones

Abstract

Molecular chaperones are ATP-consuming biological machines that facilitate the folding of proteins and RNA molecules that are kinetically trapped in misfolded states for long times. Unassisted folding occurs by the kinetic partitioning mechanism, according to which folding to the native state, with low probability, and misfolding to one of the many metastable states, with high probability, occur rapidly on similar time scales. GroEL is an all-purpose stochastic machine that assists misfolded substrate proteins (SPs) to fold. The RNA chaperones (CYT-19) help the folding of ribozymes that readily misfold. GroEL does not interact with the folded proteins but CYT-19 disrupts both the folded and misfolded ribozymes. Despite this major difference, the Iterative Annealing Mechanism (IAM) quantitatively explains all the available experimental data for assisted folding of proteins and ribozymes. Driven by ATP binding and hydrolysis and GroES binding, GroEL undergoes a catalytic cycle during which it samples three allosteric states, referred to as T (apo), R (ATP bound), and R'' (ADP bound). In accord with the IAM predictions, analyses of the experimental data show that the efficiency of the GroEL-GroES machinery and mutants is determined by the resetting rate $k_{R'' \to T}$, which is largest for the wild type GroEL. Generalized IAM accurately predicts the folding kinetics of Tetrahymena ribozyme and its variants. Chaperones maximize the product of the folding rate and the steady-state yield of the native state by driving the substrates out of equilibrium. Neither the absolute yield nor the folding rate is optimized.



D. Thirumalai, George H. Lorimer, and Changbong Hyeon

Department of Chemistry, The University of Texas at Austin, Texas, 78712
Biophysics Program, Institute For Physical Science and Technology, University of Maryland, College Park, MD 20742
Korea Institute for Advanced Study, Seoul 02455, Korea

(Dated: September 18, 2019)

arXiv [q-bio.BM]

INTRODUCTION

Molecular chaperones have evolved to facilitate the folding of proteins that cannot do so spontaneously under crowded cellular conditions. This important task is accomplished without chaperones imparting any additional information beyond what is contained in the amino acid sequence. Furthermore, chaperones assist the folding of proteins whose folded structures bear no relationship to one another. In other words, chaperones are "blind" to the architecture of the folded proteins. Most of the protein chaperones belong to the family of heat shock proteins (HSPs) that are overexpressed when the cells are under stress. Among the many classes of chaperones, the bacterial chaperonin, GroEL, has been most extensively investigated, possibly because it was the first one to be discovered. Although less appreciated, RNA chaperones have also evolved to enable the folding of ribozymes, which are readily kinetically trapped, implying that in vitro only a very small fraction folds to the functionally competent state on a biologically relevant time scale. Both GroEL and RNA chaperones (CYT-19), which we will collectively refer to as molecular chaperones from now on, are not unlike molecular motors, such as kinesin, myosin, and dynein. There are many similarities between motors and chaperones. (i) Both motors and chaperones are enzymes that undergo a catalytic cycle, which involves binding and hydrolysis of ATP. Molecular motors hydrolyze one ATP per step, thus converting chemical energy to mechanical work in order to walk on the linear cytoskeletal filaments (actin or microtubule). Both GroEL and RNA chaperones consume copious amounts of ATP (see below). They couple the hydrolysis of ATP to perform work by partially unfolding the misfolded RNA or proteins. Indeed, helicase activity is attributed to RNA chaperones, such as CYT-19. Helicases are biological machines that separate double stranded DNA or RNA and translocate on single stranded nucleic acids. (ii) During the catalytic cycle, the enzymes (motor head in the case of motors and the subunits in the GroEL particle) undergo spectacular conformational changes, which are transmitted allosterically (action at a distance) throughout the complex (recently reviewed). Indeed, it is impossible to rationalize the functions of motors or chaperones without allosteric signaling, which we illustrate more fully for GroEL in this article. (iii) Some of the rates in the catalytic cycles of molecular motors also depend on the presence of actin or microtubule. Similarly, ATPase functions of GroEL are stimulated in the presence of substrate proteins (to be referred to as SPs from now on). For these reasons, a quantitative understanding of the functions of molecular chaperones mandates that they be treated as molecular machines.

GroEL-GroES machine

The complete chaperonin system consists of GroEL and the co-chaperonin GroES, which together form both a 2:1 and a 1:1 complex depending on whether SPs are present or absent. For it to function, which means to assist in the folding of a vast number of SPs that otherwise could aggregate, it requires MgATP as well. The availability of a number of structures that GroEL visits during the catalytic cycle and theoretical developments have made it possible to obtain insights into the function of the GroEL-GroES system. GroEL is a homo-oligomer with two rings, each assembled from seven identical subunits, that are stacked back-to-back, which confers on it an unusual seven-fold symmetry in the resting (T or taut) state. Major changes in the structures take place between the allosteric (T, R, and R'') states in response to ATP and GroES binding (Figs. 1 and 2a). The dynamics of allosteric changes in GroEL has been reviewed recently. The ATP binding sites are localized in the base of GroEL corresponding to the equatorial (E) domain, connecting the two rings (Fig. 2a). The E domain carries the bulk (roughly two thirds) of the inertial mass of a single subunit. Binding sites for the co-chaperonin GroES are localized in the apical (A) domain, which also coincides with the region of interaction of the SPs with GroEL in the T state. We present a schematic in Fig. 3b of the reaction cycle in a single ring. We ought to emphasize, right at the outset, that recent advances show that when challenged with SPs the functioning state is the symmetric 14-mer GroEL-GroES complex, resembling a football (see Fig. 2b), and not the asymmetric bullet structure as had been thought for a long time.

The parts list of this complex machine is GroEL, GroES, MgATP, and the SPs that require assistance to reach the folded structures. A few words about the SPs are in order. It was shown long ago that GroEL is a promiscuous machine that interacts with a vast majority of E. coli proteins as long as they are presented in the misfolded states. This observation, and the subsequent demonstration that most of the SPs used to study GroEL-assisted folding are not found in E. coli, further buttress this point. For discussion purposes, we distinguish between permissive and non-permissive conditions. Under permissive conditions, folding to the native state occurs readily in vitro on a biologically relevant time scale ($\tau_B$), which is between 20–40 minutes for E. coli proteins at 37 °C. Under non-permissive conditions, spontaneous folding does not occur in vitro with sufficient yield of the folded protein on the time scale $\tau_B$. The SPs satisfying this criterion are deemed to be stringent substrates for GroEL. Several in vitro experiments show that in most cases the SP folding rates in the wild type GroEL are enhanced only modestly, which is fully explained using theoretical studies. In fact, the folding rate could even decrease, although this has been shown experimentally using only a mutant form of GroEL, referred to as SR1, from which GroES does not easily dissociate. In contrast, it is the native yield over a period of time that is maximized by the GroEL machinery.

RNA Chaperones

The tendency of RNA enzymes (ribozymes) to misfold in vitro is well established. Several in vitro experiments have firmly established that self-splicing ribozymes, such as Tetrahymena ribozyme, fold to the functionally competent state extremely slowly. Only a very small fraction of the initially unfolded ensemble reaches the folded state rapidly (∼ one second). RNAs are kinetically trapped in metastable states because of the high stability of the base-paired nucleotides. In addition, RNA molecules have considerable homopolymer-like characteristics, which do not fully discriminate between a large number of relatively low free energy structures. These factors render the folding landscapes of RNAs more rugged than those of proteins. We showed that only ∼54% of RNA secondary structure is made of helices with intact Watson-Crick base pairs, which implies that a substantial number of nucleotides are engaged in non-canonical base pairs, bulges, and internal multi loops. It is estimated that the life time of a helix made of 6 bps, which gives rise to a free energy barrier of δG‡ ≈ − kcal/mol, can be as large as ∼ sec ( ∼ ).

The arguments given above suggest that the folding landscape of most RNA molecules ought to consist of multiple metastable minima with similar stability that are separated from the native state by large free energy barriers (compared to $k_B T$, where $k_B$ is the Boltzmann constant and T is the temperature, which unfortunately is the same notation used for the T state in GroEL). The structures in the metastable states often share many features with the folded state. Such rugged landscapes govern the functions of many RNA molecules, such as riboswitches that are involved in transcription and translation. These biological processes are associated with switching between at least two alternative structures. In riboswitches the switch between the states is modulated by metabolites or metal ions. Of relevance here is the in vitro folding of Tetrahymena ribozyme, which is a self-splicing intron. For this enzyme it is found that only a fraction (Φ = 0 . ) of the initial population of unfolded molecules directly folds to the functionally competent native state in about 1 s, and the rest (1 − Φ = 0 . ) of the molecules are kinetically trapped in competing basins of attraction for arbitrarily long times. For Tetrahymena ribozyme to function, it is essential that several key native tertiary contacts form. Incorrect formation of these tertiary contacts results in functionally incompetent ribozyme. For example, without the formation of the pseudoknot (P3 helix) the two domains (P5-P4-P6 and P7-P3-P8) cannot be stabilized (see Fig. 4a). Formation of an alternative helix (Alt-P3) and other misfolded structures impairs the function of ribozymes. An introduction of a single point mutation (U273A) stabilizing the P3 pseudoknot helix was shown to increase Φ to as high as 0.8.

DEAD-box protein CYT-19, which belongs to a general class of RNA chaperones, comprises a core helicase domain and an arginine rich C-terminal tail. CYT-19 recognizes surface-exposed RNA helices (duplexes) and unwinds them, like helicases belonging to the SF2 family (see Fig. 4b for the yeast analogue of CYT-19), into single stranded RNA by expending free energy due to ATP hydrolysis. It is likely that ATP triggered conformational changes promote local unwinding of RNA helices. Because of the helicase activity of CYT-19, the microscopic mechanism does involve local unfolding of the accessible helices. Thus, both GroEL and CYT-19 perform work on the misfolded structures by forcibly unfolding them, at least partially. This is another common theme linking the functions of CYT-19 and GroEL.

Our goals in this article are the following: (1) We present a unified theoretical perspective on the functions of GroEL and RNA chaperones. The essence of the assisted folding mechanism of the SPs is illustrated using the well investigated GroEL-GroES system. Although the two enzymes, exhibiting machine-like activity, are quite different, we show that the theory based on the Iterative Annealing Mechanism (IAM) quantitatively explains a vast amount of experimental data in chaperone-assisted folding of proteins as well as ribozymes. A major conclusion of the theory is that these ATP-consuming chaperones are stochastic machines that drive the SPs or ribozymes out of equilibrium. This implies that in the steady state (long time limit) the yield of the folded protein, $P_N^{SS}$, does not correspond to that expected at equilibrium, $P_N^{EQ}$, which would be $\propto \exp(-\beta \Delta G_{NU})$ where $\beta = 1/k_B T$, and $\Delta G_{NU}$ is the free energy of stability of the native state (N) with respect to the unfolded state (U). In other words, $P_N^{SS} \neq P_N^{EQ}$. (2) The differences between the GroEL-GroES system and RNA chaperones naturally arise from the IAM predictions, and highlight the likely inefficiency (large consumption of ATP relative to the production of the folded state) of RNA chaperones. (3) Because the GroEL structures in different nucleotide states are known, we illustrate the conformational changes that occur during the allosteric transitions in GroEL in response to ATP and SP binding and link these changes to the folding of the SPs. (4) Finally, we outline recent developments, which provide incontrovertible evidence for the quantitative validity of the IAM, which establishes that the GroEL-GroES system is a parallel processing stochastic machine that simultaneously anneals two misfolded molecules by sequestering one each in the two chambers of the symmetric complex. Remarkably, the symmetric complex forms only when the GroEL-GroES system is subject to load, i.e., challenged with SPs that require assistance to reach the native state.
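To make the equilibrium reference point in goal (1) concrete, the following minimal sketch evaluates the native-state population for a given stability. It assumes a strictly two-state (N and U only) normalization of the Boltzmann weight, which is an illustrative simplification and not something the text specifies; the function name, the default temperature, and the sign convention ($\Delta G_{NU} = G_N - G_U$, negative when N is stable) are likewise assumptions made here.

```python
import math

# Boltzmann (gas) constant in kcal/(mol K)
K_B_KCAL_PER_MOL_K = 0.0019872041


def equilibrium_native_population(delta_g_nu: float, temperature: float = 310.15) -> float:
    """Two-state equilibrium population of N (assumed normalization).

    delta_g_nu  : G_N - G_U in kcal/mol (negative when the native state is stable)
    temperature : absolute temperature in K (default 37 C, an illustrative choice)

    The ratio P_N / P_U equals exp(-beta * delta_g_nu), so with only two
    states P_N = 1 / (1 + exp(+beta * delta_g_nu)).
    """
    beta = 1.0 / (K_B_KCAL_PER_MOL_K * temperature)
    return 1.0 / (1.0 + math.exp(beta * delta_g_nu))


if __name__ == "__main__":
    # Hypothetical stabilities, chosen only to show the trend.
    for dg in (-5.0, -1.0, 0.0, 1.0):
        print(dg, equilibrium_native_population(dg))
```

A steady-state yield measured in the presence of chaperones and ATP can then be compared against this equilibrium value; the claim in the text is that the two generally differ.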

ITERATIVE ANNEALING MECHANISM (IAM) FOR GROEL-GROES

In this section we systematically develop the physical basis for the IAM by dissecting the fate of the SPs in the absence of the chaperonin machinery. We begin by considering how SPs, which do not recruit GroEL-GroES, fold spontaneously. This is followed by a brief description of the dynamics of allosteric transitions that GroEL undergoes in response to ATP and GroES binding and hydrolysis of ATP. Lastly, the physical picture of the link between allosteric transitions and SP folding is described, which vividly reveals the machine-like characteristics of the GroEL-GroES system. The applications of the theory of the IAM to experimental data cement the quantitative validity of the active role GroEL plays in assisted folding.

E. coli does not have enough GroEL to process the entire proteome

Over twenty years ago Lorimer showed, using data for E. coli B/r growing in minimal glucose medium at 37 °C with a cell doubling time ($\tau_D$) of ∼40 minutes, that the number of GroEL particles can only process between (5-10)% of the proteome. The crux of that argument can be summarized as follows. The rate of protein synthesis is $k_S = N_P/\tau_D$, where $N_P$ is the number of polypeptide chains in a cell. Assuming that the total mass of proteins per cell is ≈ × − g and the average mass is ≈ × g/mol then $N_P$ ≈ × , which implies that $k_S$ = 6 × chains/min. Given there are about 3.5 × ribosomes, it follows that in this strain of E. coli each ribosome synthesizes about one polypeptide chain every 35 seconds. Needless to say, most if not all of the proteins have to reach the folded state in the crowded environment without the assistance of GroEL. The average cell contains about $N_{GroEL}$ = particles, and nearly twice as many GroES molecules. The typical measured values of the rates of assisted folding in vitro, $k_F$, are in the range (1 − ) min$^{-1}$. Thus, the available GroEL particles can assist in the folding of $N_{GroEL} k_F$ ≈ × chains/min. Thus, only about (3 − )% of the proteome can recruit GroEL-GroES in order to fold. Nevertheless, removal of the GroE gene is lethal to the organism, attesting to its importance in E. coli growth. These estimates raise the following two important questions: (a) What are the potential SPs that fold with the assistance of the GroEL-GroES system? (b) How do the vast majority of proteins ( ≈ , except to note that GroEL does not discriminate between proteins based on their folded structures because the very SP residues that interact with GroEL are buried in the folded state.

Stringent SPs and ribozymes fold by the Kinetic Partitioning Mechanism (KPM)

Spontaneous folding of small proteins or those with relatively simple native topology is well understood. Proteins such as the SH3 domain or Chymotrypsin Inhibitor 2 fold in an ostensibly two state manner, although when examined using high spatial and temporal resolution it is found that they too fold by multiple routes to the native state. For these proteins the yield of the native state is sufficiently large that their folding does not require the assistance of chaperones. However, from the perspective of assisted folding, it is more instructive to consider the folding of SPs whose folding landscapes are rugged, containing many free energy minima (Fig. 3(a)) separated by sufficiently large barriers (several $k_B T$) that they cannot be overcome readily. Although the structures in the low energy minima could have considerable overlap with the folded state, they are misfolded because they are likely to contain incorrect tertiary contacts and/or secondary structures. These are targets for recognition by molecular chaperones. After an initial rapid compaction of the SPs or the ribozyme, many of the molecules are trapped in one of several low free energy minima.

The KPM, which explains the folding mechanisms of proteins succinctly, follows immediately from the rugged folding landscape in Fig. 3(a) (see also Fig. 5). According to the KPM, a fraction of molecules Φ folds rapidly without being trapped in one of the low free energy minima. These are sometimes referred to as the fast track molecules for which, following an initial "specific" collapse, folding to the native state is rapid. Explicit simulations using lattice models have shown that the folding characteristics (dynamics of compaction and the increase in the fraction of native contacts as a function of time) of the fast track molecules are identical to those of sequences for which the folding landscape is simple with one dominant minimum. The remaining fraction (1 − Φ) of molecules are trapped in an ensemble of low free energy structures because their initial collapse produces structures containing interactions that are not present in the native state. The resulting misfolded structures have to overcome activation barriers in order to reach the folded state. Thus, after the ensemble of unfolded molecules undergoes rapid collapse, the molecules partition to the native state at a rate $k_{IN}$ or transition to the misfolded ensemble at a rate $k_{IM}$. The fraction of fast track molecules, referred to as the partition factor, is associated with the rates in the 3-state cyclic model for chaperone-assisted folding depicted in Fig. 5 as

$$\Phi = \frac{k_{IN}}{k_{IN} + k_{IM}}.$$

It is the value of Φ, which depends on a number of extrinsic factors such as ionic strength, pH, and temperature, that governs the need of an SP or the ribozyme for the chaperone machinery (see below).

We classify the misfolded structures into slow folders and no folders, depending on the magnitude of the activation free energy barriers separating them from the native state. The time scale for slow folders to reach the native state could range from milliseconds to several minutes, whereas for no folders the transition to the native state could occur on time scales that exceed biologically relevant times. The effects of external conditions might be appreciated by noting that ribulose bisphosphate carboxylase oxygenase (RUBISCO) behaves as a no folder at low ionic strength but becomes a slow folder at high ionic strength. Similarly, by increasing the temperature from about (10 − °C to physiological temperature (37 °C) both malate dehydrogenase and aspartate transaminase transition from being slow folders to no folders. The no folders, with low Φ, are prime candidates that can fold with the aid of the complete chaperonin machinery.

EXPERIMENTAL EVIDENCE FOR KPM

The KPM has been validated in a number of experiments. The value of Φ has been measured in ensemble and single molecule experiments. (i) For example, Kiefhaber showed, using interrupted folding (final folding condition is 0.6 M GdmCl, pH = 5.2 and T = 20 °C), that Φ = Φ was found to increase to about 25% while the time constants for the fast track molecules were roughly identical to the earlier study. Because the time for reaching the folded state by the molecules in the slow track is relatively small compared to biological times, it might be correctly concluded that folding of hen egg white lysozyme (HEWL) would not require the assistance of chaperones. (ii) The yield of the folded RUBISCO obtained in direct folding under non-permissive conditions, inferred from chaperonin-mediated folding (see below), is extremely small and is only of the order of (2-5)%. (iii) An indirect estimate of Φ was first made using theory and experiments for Tetrahymena ribozyme. It was found that Φ ≈ . , which was subsequently confirmed in smFRET experiments. These values were obtained at sufficiently high Mg$^{2+}$ concentration. At cellular Mg$^{2+}$ the value is expected to be much less. Introduction of a single point mutation (U273A) stabilizing the P3 pseudoknot helix was shown to increase Φ to as high as 0.8, which shows that both sequence and external conditions determine the value of Φ. For both Tetrahymena and RUBISCO most of the molecules ought to be classified as no folders, which implies that their folding requires molecular chaperones. In addition to the above mentioned experiments, single molecule pulling experiments on several proteins (Tenascin, Fibronectin, T4 Lysozyme, Calmodulin) using both Atomic Force Spectroscopy and optical tweezer techniques have established the validity of the KPM.

Size and kinetic Constraints

Two constraints must be satisfied for GroEL-GroES assisted folding. The first pertains to the size of the SPs. The radius of gyration, $R_g$, of folded states of globular proteins is fairly accurately given by $R_g = 3N^{1/3}$ Å. Small Angle X-Ray Scattering experiments on a few proteins have shown that the typical size of misfolded SPs is about (5-10)% larger than that of the folded states. This implies that the size of the RUBISCO monomer, with N = 491, in the misfolded state is ≈ ≈ . If the cavity is approximated as a sphere the apparent radius would be 35 Å, which implies that if RUBISCO is fully encapsulated in the expanded cavity there would be room for about one layer of water molecules. Thus, GroEL can process SPs that contain ≲ 500 residues by fully encapsulating them.

The second and more important constraint is kinetic in nature. As argued before, only a small fraction, Φ, of the SPs reaches the folded state rapidly without being kinetically trapped in one of the many metastable states. If the average rate for molecules that fold by the slow track is $k_s$, then in order to prevent aggregation the pseudo first order rate, $k_B$, at which the misfolded SP binds to GroEL must greatly exceed $k_A$, where $k_A$ is a pseudo first order rate for SP aggregation. The kinetic constraint shows clearly that the efficacy of assisted folding depends on the concentrations of both the SPs and GroEL.

Allosteric Transitions in GroEL

Because the equilibrium and non-equilibrium aspects of the spectacular allosteric transitions in GroEL have been recently reviewed, we describe only briefly the key events that impact the nature of assisted folding. Although the functional state of GroEL-GroES in the presence of SPs is the symmetric structure with the co-chaperonin bound to both the rings, let us consider for illustration purposes only the hemicycle, thus allowing us to describe events in one ring. T, R, and R'' are the three major allosteric states (Fig. 1). The misfolded SPs, with exposed hydrophobic residues, preferentially interact with the T state, which has an almost continuous hydrophobic region lining the mouth of the GroEL cavity. The presence of the hydrophobic region is due to the alignment of seven subunits that join several large hydrophobic residues in the two helices (H and I) in the apical domain of each subunit. The T → R transition, resisted by the SP, is triggered by ATP binding to the seven sites in the equatorial domain. The rates of the reversible T ↔ R transition were first measured in pioneering studies by Yifrach and Horovitz, who also established an inverse relation, predicted using computations, between the extent of co-operativity in this transition and the folding rates of slow folding SPs. Binding of GroES, which predominantly occurs only after ATP binds, drives GroEL to the so-called R' state, which is followed by an irreversible non-equilibrium transition to the R'' state after ATP hydrolysis. It is suspected that there is little structural difference between the R' state, with ATP bound, and the R'' state, containing ADP and inorganic phosphate. In both these transitions, strain due to ATP binding and hydrolysis at the catalytic site propagates through a network of inter-residue contacts, thus inducing large scale conformational changes. That such changes must occur during the reaction cycle of GroEL is already evident by comparing the static crystal structures in different allosteric states, such as the T and R'' states. Release of ADP and the inorganic phosphate from the R'' state resets the machine back to the taut state from which a new cycle can begin. The allosteric transitions that GroEL undergoes during the catalytic cycle are intimately related to its function (see below). As we discuss later, it is not sufficient to deal with the catalytic cycle in a single ring because under load it is the symmetric football-like structure that is the functional state.

Iterative Annealing Mechanism integrates GroEL Allostery and assisted SP folding

The importance of GroEL allostery in assisted folding can be appreciated by understand-ing the interaction of the SP with the GroEL-GroES system in different allosteric states. Thechanges in the SP-GroEL interaction occur in three stages corresponding to the allosterictransitions between the three major allosteric states (see Fig.1). (i) The continuous lining ofthe hydrophobic residues in the T state ensnares a misfolded SP with exposed hydrophobicresidues. At this stage in the catalytic cycle the SP is predominantly in a hydrophobic envi-ronment, resulting in the formation of a SP-GroEL complex that is stable but not hyper stableso that the SP can be dislodged in to the cavity upon GroES binding . (ii)The dynamics ofthe T ↔ R transition, upon ATP binding, reveals that there is a downward tilt in two helicesnear the E domain that closes off the ATP binding sites, and which is followed by multiple salt12ridge disruption (within a subunit) and formation of new ones across the adjacent subunits .As these events unfold cooperatively, the stability of the initial SP-GroES complex decreases.More importantly, the adjacent subunits start to move apart, which imparts a moderate forcethat is large enough to at least partially unfold the SPs . (iii) Both GroES and SP bind tothe same sites, which are located in the crevices of helices H and I in the apical domain. Thus,when GroES binds, displacing the SP into the expanded central cavity, there are major struc-tural changes in the GroEL cavity with profound consequences for the annealing mechanism.Only 3–4 of the 7 SP binding sites are needed to capture the SP, leaving 3–4 sites availablefor binding of the mobile loops of GroES. This ensures that the subsequent displacement ofthe SP occurs vectorially into the central cavity of GroEL. First, there is a signiﬁcant confor-mational change in the A domain, which undergoes a rotation and twist motion. 
Each subunitresults in the two helices (K and L) in each subunit undergo an outside-in movement (Fig.1).As a result, polar and charged residues, which are solvent exposed in the T state, line theinside of the GroEL cavity. This in turn creates a polar microenvironment for the SP (Fig.1).Second, these large scale conformational changes are facilitated by the formation of severalinter subunit salt bridges and disruption of intra subunit salt bridges .From the perspective of SP, there are major consequences that occur as a result of theallosteric transitions in GroEL. First, by breaking a number of salt bridges the volume of thecentral cavity increases two fold (85,000 Å → ). In such a large central cavity,enough to fully accommodate a compact protein with ≈

500 residues, folding to the nativestate could occur if given sufﬁcient time as is the case in the SR1 mutant. But in the wildtype the residence time of the encapsulated SP is very short (see below). Second, and mostimportantly, the SP-GroEL interaction changes drastically during the catalytic cycle. In the T state, SP-GroEL complex is (marginally) stabilized predominantly by hydrophobic interactions.However, during the subsequent ATP-consuming and irreversible step R → R (cid:48)(cid:48) transition themicroenvironment of the SP is largely polar (see the discussion in the previous paragraph).Thus, during a single catalytic cycle, that is replicated in both the rings, the microenvironmentof the SP changes from being hydrophobic to polar. We note parenthetically that even duringthe T → R transition there is a change in the SP-GroEL interactions, which explains the ob-servations that GroEL can assist of the folding of certain SPs (non stringent substrates) evenin the absence of GroES. The annealing capacity of GroEL is intimately related to the changesin the SP-GroEL interactions that occur during each catalytic cycle. Hence, the function of the13roEL-GroES system cannot be understood without considering the complex allosteric tran-sitions that occur due to ATP and GroES binding. As a result of these transitions, the SP isplaced stochastically from one region in the folding landscape, in which the misfolded SP istrapped, to another region from which it could undergo kinetic partitioning with small probabilityto the folded state or be trapped in another misfolded state. The cycle of hydrophobic to polarchange is repeated in each catalytic cycle, and hence the GroEL-GroES system iterativelyanneals the misfolded SP enabling it to fold to the native state. Because this process is purelystochastic, GroEL plays no role in guiding the protein to the folded state nor does it sense thearchitecture or any characteristics of the folded state. 
In other words, the information for protein self-assembly is fully encoded in the amino acid sequence, as articulated by Anfinsen. GroEL merely alters the conformation of the SP stochastically as it undergoes the reaction cycle, enabling the SP to explore different regions of the folding landscape. In this sense the action of GroEL is analogous to simulated annealing used in optimization problems, although the latter is a more recent realization of an evolutionary event that took place millions of years ago.

Theory underlying the IAM

The physical picture of the IAM described above can be formulated mathematically to quantitatively describe the kinetics of chaperonin-assisted folding of stringent in vitro substrate proteins. According to the theory underlying the IAM (see Fig. 3), in each cycle the SP folds by the KPM, as the microenvironment of the SP changes while GroEL undergoes the reaction cycle. Thus, with each round of folding the fraction of folded molecules is $\Phi$, and the remaining fraction gets trapped in one of the many misfolded structures. After $n$ such cycles (or iterations) the yield of the native state is,

$$\Psi = \Lambda_{ss}\left[1 - (1-\Phi)^{n}\right], \tag{1}$$

where $\Lambda_{ss}$ is the steady state yield. The mathematical model accounts for all the available experimental data, and shows that for RUBISCO the partition factor $\Phi \approx 0.02$, which means that only about 2% of the SP reaches the folded state in each cycle. From this finding we could surmise that the GroEL chaperonin is an inefficient machine, which consumes ATP lavishly and yet the yield of the folded protein per cycle is small. A prediction of the IAM is that GroEL should reset to the starting $T$ state as rapidly as possible in the presence of SPs. By rapidly resetting to the $T$ state the number of iterations can be maximized, which would maximize the yield of the folded state for a specified amount of time. Indeed, this is the case, which we discuss in detail below.

Rate of $R'' \to T$ transition is a maximum for the wild-type (WT) GroEL

A clear implication of the IAM is that rapid turnover of the catalytic cycle would produce the maximum yield of the native state in a given time. Examination of the reaction cycle shows that the rate determining step (resetting of the machine) should correspond to release of ADP and the inorganic phosphate. In other words, maximization of the rate $k_{R'' \to T}$ returns GroEL to the acceptor state for processing a new SP as quickly as possible.
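The link between turnover and yield can be made concrete with Eq. 1. The sketch below is a minimal illustration and not the kinetic model fitted in the text: it assumes that the number of completed iterations in a time $t$ is simply $n = k_{cycle}\,t$, where `k_cycle` is a hypothetical effective cycling rate standing in for the resetting step, and it holds $\Lambda_{ss}$ fixed. All function and parameter names are illustrative.

```python
def native_yield(phi, lambda_ss, n_cycles):
    """Eq. 1: yield of the native state after n_cycles iterations."""
    return lambda_ss * (1.0 - (1.0 - phi) ** n_cycles)


def yield_at_time(phi, lambda_ss, k_cycle, t):
    """Yield at time t, ASSUMING n = k_cycle * t iterations were completed.

    k_cycle is a hypothetical effective cycling rate (units of 1/t); the
    full model in the text resolves the individual allosteric steps instead.
    """
    return native_yield(phi, lambda_ss, k_cycle * t)


if __name__ == "__main__":
    phi = 0.02       # partition factor quoted in the text for RUBISCO
    lambda_ss = 1.0  # steady state yield, set to 1 purely for illustration
    t = 60.0         # observation time in arbitrary units (assumption)
    for k_cycle in (0.1, 0.5, 1.0):  # illustrative cycling rates
        print(k_cycle, round(yield_at_time(phi, lambda_ss, k_cycle, t), 3))
```

Under these assumptions the yield at a fixed time grows monotonically with the cycling rate, which is the qualitative content of the prediction that faster resetting gives a higher yield in a given time.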
In order to illustrate that this is indeed the case, we first extracted the rates of the allosteric transitions by simultaneously fitting the solutions of the kinetic equations to the experimental data for assisted folding at various GroEL concentrations. For this purpose, we used the data for RUBISCO, for which the yield of the folded state is available at eight values of the GroEL concentration. The excellent fits at various GroEL concentrations (Fig. 6), with a fixed initial concentration of RUBISCO, were used to extract $\Phi$. We find that $\Phi \approx 0.02$, which means that only about 2% of the SP reaches the folded state in each catalytic cycle.

Armed with the rates that describe the allosteric transitions, we used the IAM theory to analyze experimental data on the folding of other SPs. Because the reversible ATP-induced $T \leftrightarrow R$ transition occurs at equilibrium even in the absence of SP, it is reasonable to assume that its rates are relatively insensitive to the nature of the SP. Indeed, the values of the $T \leftrightarrow R$ rates extracted using the RUBISCO data (see Table 1 in ) are very similar to measurements made in the absence of SP. This leaves the rate $k_{R'' \to T}$, which governs the resetting of the machine to the taut ($T$) state after ATP hydrolysis, as the most important factor in determining the efficiency of GroEL or its mutants. Thus, maximizing $k_{R'' \to T}$ should optimize the native state yield at a fixed time. This most significant prediction of the IAM can be quantitatively demonstrated by analyzing the data reported by Lund and coworkers. They measured the activity, which we assume is proportional to the yield of the folded state, as a function of time for GroEL and five mutants including SR1. The two SPs used in these studies were mitochondrial malate dehydrogenase (mtMDH) and citrate synthase (CS).
The results, reproduced in Fig. 7, show that $P^{SS}_N$, and indeed the yield at any time, is largest for the WT and smallest for the SR1 mutant, from which GroES dissociates in ≈ 300 minutes. The curves in Fig. 7 were calculated by adjusting just one parameter, the rate $k_{R'' \to T}$, while keeping the other allosteric rates fixed at the values extracted by analyzing the RUBISCO data. The IAM predictions are in quantitative agreement with experiments for both proteins and for GroEL and its mutants. The value of $k_{R'' \to T} \approx 60$ s$^{-1}$ (one second) is largest for the WT GroEL. This implies, as predicted by the IAM, that the GroEL catalytic cycle is greatly accelerated when SP is present, a point that requires further elaboration.

GENERALIZED IAM FOR RNA CHAPERONES

Compared to the GroEL-GroES chaperonin, details of the catalytic cycle of CYT-19 are not known. Consequently, it is not possible to link the structural transitions that occur during the catalytic cycle to the CYT-19 assisted folding of the misfolded ribozymes, as we did for the GroEL-GroES machinery. We should note that structural and biophysical studies of the DEAD-box protein Mss116p, the Saccharomyces cerevisiae analogue of CYT-19, showed the expected helicase activity, resulting in the disruption of the structure of the misfolded ribozyme. These studies and the still undetermined ATPase cycle could be used in the future to provide a molecular basis for the IAM for CYT-19 assisted folding. Nevertheless, the mathematical formulation of the IAM theory can be adopted to investigate the interesting experimental findings of Bhaskaran and Russell.

The most significant experimental findings on CYT-19 assisted folding of ribozymes are: (i) When incubated in CYT-19 under somewhat destabilizing conditions ([Mg$^{2+}$] < […] $P^{SS}_N \neq 1$). (iii) The deactivation of the ribozyme function was observed at longer pre-incubation times in CYT-19. Deactivation of the native ribozyme was also observed at higher CYT-19 concentration. Taken together, these observations imply that CYT-19 destabilizes the native as well as the misfolded ribozyme. The finding that CYT-19 interacts with the fully folded ribozyme is in stark contrast with GroEL, which does not interact with the folded states of proteins. In light of the experimental observations the IAM theory has to be generalized (see Fig. 8). The results in this study inspired us to generalize the IAM theory using the master equation. More recently, we proposed a simpler version that describes the functions of GroEL and RNA chaperones on equal footing. The resulting theory, which gives rise to a complicated expression for the yield of the folded state of the P5a variant of the Tetrahymena ribozyme, sketched below, is in quantitative agreement (Fig. 9) with the experimental data.

The KPM description of ribozyme folding shows that upon increasing the Mg$^{2+}$ concentration a fraction of the initial unfolded population, $\Phi$, folds to the native state and the remaining fraction, $M_1 = 1 - \Phi$, collapses to one of many misfolded states. Consider the fate of the misfolded states, with population $M_1$, as they interact with CYT-19.
In the presence of the RNA chaperone, a fraction $\Phi$ of $M_1$ reaches the native state ($\Phi(1-\Phi)$) and a fraction $1-\Phi$ of $M_1$ goes to one of the misfolded states ($(1-\Phi)^2$). Because CYT-19 also acts on the native state, we also have to consider the fate of the folded ribozyme as it interacts with CYT-19 (see Fig. 8). Let $\kappa$ denote the fraction of the initially folded ribozyme that reaches the misfolded state (bottom right circle in Fig. 8), while the fraction $1-\kappa$ remains in the native state (top right circle in Fig. 8). In the subsequent round, out of $\kappa\Phi$, $\kappa\Phi^2$ goes to the native state and $\kappa\Phi(1-\Phi)$ reaches the misfolded state. Therefore, $M_2 = \kappa\Phi(1-\Phi) + (1-\Phi)^2$ is the total population of the misfolded ribozyme in the second round of the IAM, which accounts for accumulation from both the folded and misfolded states in the first round. In order to obtain an expression for the yields of both the folded and misfolded states of the ribozyme, the branching process from both the accumulated folded and misfolded states of the ribozyme in the previous round has to be taken into account. A recursion relation for this iterative process may be written down, such that the amount of misfolded state at the $n$-th round is the sum of $M_{n-1}(1-\Phi)$ from the misfolded ensemble and $\kappa(1-M_{n-1})(1-\Phi)$ from the native ensemble. In short,

$$M_n = M_{n-1}(1-\Phi) + \kappa(1-M_{n-1})(1-\Phi)$$

(see Fig. 8). As a result, the total yield of the native state in the $N$-th round of the annealing process, $\Psi_N = 1 - M_N$, can be calculated from the generalized version of the IAM,

$$\Psi_N = \Phi\,\frac{1-(1-\kappa)^{N}(1-\Phi)^{N}}{\kappa+(1-\kappa)\Phi}, \tag{2}$$

and the steady state solution ($N \to \infty$) is

$$\Psi_\infty = \frac{\Phi}{\kappa+(1-\kappa)\Phi}. \tag{3}$$

For $\kappa = 0$, corresponding to the situation in which the RNA chaperone does not recognize the native state, the yield in the $N$-th round is identical to the conventional IAM expression.
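The recursion for $M_n$ and the closed form in Eq. 2 can be checked against each other numerically. The following is a minimal sketch, assuming that all molecules start outside the native state ($M_0 = 1$), a choice made here because it reproduces $\Psi_1 = \Phi$; the function names are illustrative and $\Phi > 0$ is assumed so that the denominator does not vanish.

```python
def misfolded_by_recursion(phi, kappa, rounds, m0=1.0):
    """Iterate M_n = M_{n-1}(1-phi) + kappa(1-M_{n-1})(1-phi).

    m0 = 1.0 is an assumption: no native molecules before the first round.
    """
    m = m0
    for _ in range(rounds):
        m = m * (1.0 - phi) + kappa * (1.0 - m) * (1.0 - phi)
    return m


def native_yield_closed_form(phi, kappa, rounds):
    """Eq. 2: native yield after `rounds` rounds of annealing (phi > 0)."""
    decay = ((1.0 - kappa) * (1.0 - phi)) ** rounds
    return phi * (1.0 - decay) / (kappa + (1.0 - kappa) * phi)


def steady_state_yield(phi, kappa):
    """Eq. 3: the N -> infinity limit of Eq. 2 (phi > 0)."""
    return phi / (kappa + (1.0 - kappa) * phi)


if __name__ == "__main__":
    phi, kappa = 0.1, 0.05  # illustrative values, not fitted parameters
    for n in (1, 10, 100):
        recursive = 1.0 - misfolded_by_recursion(phi, kappa, n)
        closed = native_yield_closed_form(phi, kappa, n)
        print(n, recursive, closed)
    print("steady state:", steady_state_yield(phi, kappa))
```

With $\kappa = 0$ the closed form reduces to $1-(1-\Phi)^N$, and with $\kappa = 1$ it returns $\Phi$ for every $N \geq 1$, matching the two limits discussed in the text.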
For $\kappa = 1$, in which the RNA chaperone recognizes the native state equally as well as the misfolded states, there would be no gain in the native yield by the action of the RNA chaperone.

The action of chaperones on substrate RNA can be mapped onto a 3-state kinetic model of RNA with transitions between the native ($N$), misfolded ($M$), and intermediate ($I$) states. When the partition factor expressed in terms of the rate constants, $\Phi = k_{IN}/(k_{IN}+k_{IM})$, is plugged into Eq. 3, $\Psi_\infty$ is

$$\Psi_\infty = \frac{k_{IN}}{\kappa k_{IM} + k_{IN}}. \tag{4}$$

In addition, the expression for the steady state value of the fraction native ($P^{SS}_N$), which is equivalent to $\Psi_\infty$, can be obtained using the 3-state kinetic model (Fig. 5) under the following conditions: (i) $k_{NM}, k_{MN} \ll k_{IN}, k_{IM}, k_{NI}, k_{MI}$, and (ii) $k_{NI} \ll k_{IN}$. With these assumptions we find that,

$$P^{SS}_N \approx \frac{k_{IN}}{\left(\frac{k_{NI}}{k_{MI}}\right) k_{IM} + k_{IN}}. \tag{5}$$

Therefore, comparison between Eq. 4 and Eq. 5 gives

$$\kappa = \frac{k_{NI}([C],[T])}{k_{MI}([C],[T])}, \tag{6}$$

where the dependence of the unfolding rates $k_{NI}$ and $k_{MI}$ on the chaperone and ATP concentrations is made explicit. It turns out that $\kappa$, defined as the unfolding efficiency of the chaperone for the native state with reference to the misfolded ensemble, is effectively the ratio between the chaperone-induced unfolding rates from the native and misfolded states. A sketch of the native state yield as a function of $\kappa$, which depends on both the chaperone and ATP concentrations, is given in Fig. 10.

DISCUSSION

What do chaperones optimize?

The question of what quantity biological machines optimize, subject to the constraint of available free energy, does not have a general answer. However, in the rare case of chaperones a plausible answer has been recently proposed, which we illustrate here. It is noteworthy that despite the critical difference between CYT-19 and GroEL, with the former disrupting both the folded and misfolded states of ribozymes whereas the latter does not interact with folded proteins, the mechanisms of their functions are in accord with the predictions of the IAM. Both GroEL and RNA chaperones function by driving the SPs and ribozymes out of equilibrium. Remarkably, we showed by analyzing experimental data on ribozymes and MDH that the quantity that is optimized by GroEL and RNA chaperones is,

$$\Delta_{NE} = k_F P^{SS}_N, \tag{7}$$

where $k_F$ is the folding rate and $P^{SS}_N$ is the steady state yield (see Fig. 11). Thus, neither the folding rate nor the steady state yield is maximized; it is the product of the two that is optimized by the molecular chaperones. It follows from Fig. 11 that, for a given SP and external conditions, which would fix $k_F$, the steady state yield is larger for the wild type GroEL than for any mutant. That this is indeed the case is vividly illustrated in Fig. 6. In the case of GroEL, the value of $P^{SS}_N$ (or $P_N(t)$ at any $t$) depends critically only on $k_{R'' \to T}$, which has the largest value for the WT GroEL. The optimality condition given in Eq. 7 is determined by the value of $k_{R'' \to T}$, which in turn depends on the dynamics of the allosteric transitions as well as the presence of SP. Thus, the function of GroEL, and most likely CYT-19 and related RNA chaperones, cannot be understood without considering the details of the reaction cycle and how they are directly related to SP folding. The IAM theory, which accounts for all the complexities of the reaction cycle, explains the available experimental data quantitatively (see for example Fig. 6) using a single parameter ($k_{R'' \to T}$).

When SP folding occurs, it occurs in the expanded GroEL cavity

Does SP folding occur in the expanded cavity or in solution after ejection? This question has unnecessarily plagued the discussion of GroEL-assisted folding, causing substantial confusion, largely because of insistence by some that GroEL merely encapsulates the SP in the cavity until it reaches the native state with unit probability. Such an inference, that GroEL is a passive Anfinsen cage, has been drawn principally from experiments on a single ring mutant (SR1), from which discharge of GroES and the SP occurs on a time scale of 300 minutes, and it is erroneous. For starters, the lifetime of the encapsulated SP in the wild type (WT) cycling GroEL is about 2 seconds, which is four orders of magnitude shorter than the SR1 lifetime! Furthermore, neither the passive nor the active cage model can explain how the communication needed to discharge the ligands (ADP and the inorganic phosphate), GroES, and the folded SP takes place.

Does folding to the native state occur within the cavity in the WT GroEL? We answer this question in the affirmative using the following argument. Assisted folding requires that the kinetic constraint $k_F < k_B$ be satisfied, where $k_B$ is the pseudo first order binding rate of the SP to GroEL. In the opposite limit ($k_F > k_B$), which is relevant at low GroEL concentrations, folding is sufficiently rapid compared to diffusion controlled binding that the chaperonin machinery would not be needed. Thus, assuming that the kinetic constraint ($k_B \gg k_F$) is always satisfied for stringent substrates under non-permissive conditions, the SP upon ejection from the GroEL cavity, roughly every two seconds, rebinds rapidly (presumably to the same GroEL molecule). If the ejected SP is in the folded state then it would not be recognized by GroEL because the hydrophobic recognition motifs would no longer be solvent exposed. Thus, the fate of the SP, which is determined by the KPM, is decided entirely within the cavity during its residence time. Both folding and partitioning to the ensemble of misfolded states occur rapidly while the SP is encapsulated for a brief period in either chamber.

We provide evidence to substantiate the physical arguments given above. The theory underlying the IAM was used to obtain the rates in the catalytic cycle and the intrinsic rates for assisted folding of RUBISCO. The time for RUBISCO molecules to reach the folded state by the fast track, $\tau_F = k_F^{-1}$, is a fraction of a second (Table 1 in ), which is less than the encapsulation time of about 2 seconds. This implies that only the fast track RUBISCO molecules fold in the cavity, because the time for slow track RUBISCO molecules to fold, $\tau_S = k_S^{-1}$, is about 333 minutes (Table 1 in ). The slow track molecules would rapidly rebind upon exiting the cavity, and the process is iterated multiple times until the majority of unfolded SPs reach the native state. The same argument applies to the reconstitution of citrate synthase (CS) using GroEL. The fits to the experimental data in Fig. 6 yield a sub-second $\tau_F$ whereas $\tau_S = 100$ minutes, which again shows that the KPM, resulting in folded and misfolded states, operates while CS is encapsulated in the cavity. Thus, we can conclude that when SP folding occurs it occurs in the expanded cavity. It is worth emphasizing that because the IAM theory takes into account the coupling between the events in the reaction cycle of GroEL and SP folding, it naturally explains the allosteric communication needed for discharge of the SP, whether it is folded or not, and of the other ligands. However, only a very small fraction reaches the folded state in each cycle, hence the need to perform the iterations as rapidly as possible. Remarkably, GroEL has evolved to do just that by functioning as a parallel processing machine in the symmetric complex when challenged with SP.

Symmetric Complex is the Functioning Unit of the GroEL-GroES machine

The IAM predicts that the yield of the folded SP increases with each iteration. It therefore follows that for highly efficacious function it would be optimal if GroEL-GroES functions as a parallel processing machine with one SP in each chamber. This would necessarily involve formation of a symmetric complex, GroEL$_{14}$-GroES$_{14}$, which was shown to be the functioning unit only recently. In particular, using a FRET-based system, Ye and Lorimer have established unequivocally that the response of the GroEL-GroES machinery is dramatically different with and without SP. In order to unveil the differences they had to follow the fate of ADP and P$_i$ release in real time. These experiments showed that in the absence of the SP the rate determining step involves release of P$_i$ before ADP release from the trans ring of the dominant asymmetric complex (GroES bound to the cis ring). In sharp contrast, when challenged with the SP, ADP is released before P$_i$. The symmetric particle, with GroES bound to both rings (Fig. 2b), is the predominant species in the presence of SP. In principle, the symmetric particle can simultaneously facilitate the folding of two SPs, one in each chamber. Thus, it is likely that the functional form in vivo is the symmetric particle, which is activated when there is a job to do, namely, to help SPs fold.

There was one other major finding in the Ye-Lorimer study. They discovered that the ATP hydrolysis rate ($\sim$ […]) is the same in the presence and absence of the SP. In the presence of SP, hydrolysis of ATP is rate limiting, which in the language used to describe motility of motors means that GroEL is ATP-gated. In other words, the symmetry breaking (or inter-ring communication) events that determine the ring from which GroES dissociates depend on the extent of ATP hydrolysis in each ring. Remarkably, the release of ADP from the trans ring is accelerated roughly 100 fold in the presence of SP. We note parenthetically that release of ADP from the nucleotide binding pocket of conventional kinesin is accelerated by nearly 1000 fold in the presence of microtubules, hinting at the possibility that there is a unified molecular basis for nucleotide chemistry in biological machines. By greatly enhancing ADP release in the presence of SP, resetting to the initial SP accepting state occurs rapidly ($k_{R'' \to T}$ is maximized in the WT GroEL), which allows GroEL to process as many SP molecules as fast as possible. Clearly, these findings are in complete accord with the IAM predictions and debunk the Anfinsen cage model.

CONCLUSIONS

In this perspective, we have shown that, despite profound differences, the functions of the GroEL-GroES machine and RNA chaperones are quantitatively described by the theory underlying the Iterative Annealing Mechanism. We are unaware of any experiment on assisted folding of SPs or ribozymes that cannot be explained by the theory. We conclude with the following additional comments.

1. It is sometimes stated that the mechanism of how GroEL functions is controversial because of the proposal that the cavity in GroEL could act as an Anfinsen cage in which folding can be completed unhindered by aggregation. Such a conclusion was reached based mostly on experiments on the SR1 mutant (an asymmetric GroEL complex), from which GroES dissociates in 300 minutes. Although experiments using SR1 (with ADP and P$_i$ locked into the equatorial domain) provide insights into the effects of confinement on SP folding, they are irrelevant for understanding WT GroEL function. Finally, in the Anfinsen cage model there is no necessity for invoking allosteric transitions and how they are linked to assisted folding. In the SR1 mutant, ATP binding and hydrolysis occur once, which means that the SP is trapped in a hydrophilic cavity for extremely long times, and hence lessons from the SR1 mutant neither inform us about the intact WT GroEL nor are they biologically relevant. On the other hand, the stochastic WT GroEL comes alive when presented with SPs and undergoes a series of allosteric transitions by binding and hydrolyzing ATP and releasing the products, which gives the SPs multiple chances to fold in the most optimal fashion (see Eq. 7). The quantitative success of the IAM should put to rest the inadequate and erroneous Anfinsen cage model for describing the function of the WT GroEL. For instance, the results in Fig. 7 cannot be understood within the Anfinsen cage model.

2. The machine-like non-equilibrium characteristics of chaperones are most evident from the demonstration that the steady state yield $P^{SS}_N \neq P^{EQ}_N$, where the equilibrium yield of the folded state, $P^{EQ}_N$, is given by the Boltzmann distribution,

$$P^{EQ}_N = \frac{1}{1 + e^{-\Delta G_{NM}/k_B T}}, \tag{8}$$

where $\Delta G_{NM}$ is the free energy of the folded state with respect to the manifold of misfolded states. The values of $P^{SS}_N$ for the two SPs and for the Tetrahymena ribozyme and its variants depend on both the chaperone and ATP concentrations, which itself is evidence of departure from equilibrium. In addition, the measured value of $\Delta G_{NM}$ for the WT ribozyme is $\sim$ […] $k_B T$, which implies that $P^{EQ}_N \approx$ […] $P^{SS}_N \neq P^{EQ}_N$. The finding that the $P^{SS}_N$ values of RUBISCO and MDH depend on the GroEL concentration also implies that in the presence of GroEL Eq. 8 is not valid. Taken together, these results imply that in the process of assisted folding both GroEL and CYT-19 drive the misfolded SPs and ribozymes out of equilibrium (see also ).

3. GroEL and RNA chaperones burn copious amounts of ATP because in each round only a small fraction ($\Phi \ll 1$) of misfolded molecules reaches the native state. Consider RUBISCO folding, for which $\Phi = 0.02$. The yield of the native state at $t = 20$ min, with the concentration of GroEL roughly equal to that of the initially unfolded RUBISCO (both at 50 nM), is about 0.7 (see Fig. 6). The value of $P^{SS}_N \approx 0.88$, from which we obtain $n \approx 100$ using Eq. 1. In each catalytic cycle between (3 − […]) ATP molecules are consumed, which implies that in order to fold roughly 88% of RUBISCO in the steady state between (300 − […]) ATP molecules are hydrolyzed. As pointed out elsewhere, this is but a very small fraction of the energy required to synthesize RUBISCO, a protein with 491 residues. Thus, the benefits of GroEL assisted folding far outweigh the cost of protein synthesis. However, from a thermodynamic perspective it can be argued that GroEL is less efficient than Myosin V, which consumes one ATP molecule (available energy is about $E_{ATP} \approx (20 − […])\,k_B T$) per step ($s \approx 36$ nm) while walking on an actin filament, resisting forces on the order of $f_s \approx$ […]; the ratio $s f_s / E_{ATP}$ is very high.
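The arithmetic behind the cycle count and ATP budget in comment 3 can be sketched by inverting Eq. 1 for $n$. This is a minimal illustration under stated assumptions: the values $\Phi = 0.02$, $\Lambda_{ss} = 0.88$ and $\Psi = 0.7$ are read from the text, while `atp_per_cycle = 3` uses only the lower end of the quoted range and is a placeholder; the function names are hypothetical.

```python
import math


def cycles_for_yield(psi, phi, lambda_ss):
    """Invert Eq. 1, psi = lambda_ss * (1 - (1 - phi)**n), for n."""
    if not 0.0 <= psi < lambda_ss:
        raise ValueError("psi must satisfy 0 <= psi < lambda_ss")
    return math.log(1.0 - psi / lambda_ss) / math.log(1.0 - phi)


def atp_consumed(n_cycles, atp_per_cycle):
    """Total ATP hydrolyzed, assuming a fixed ATP count per catalytic cycle."""
    return n_cycles * atp_per_cycle


if __name__ == "__main__":
    phi = 0.02         # partition factor for RUBISCO (from the text)
    lambda_ss = 0.88   # steady state yield (from the text)
    psi = 0.7          # yield at t = 20 min (from the text)
    atp_per_cycle = 3  # ASSUMPTION: lower end of the range quoted in the text

    n = cycles_for_yield(psi, phi, lambda_ss)
    print("cycles needed:", n)
    print("ATP hydrolyzed:", atp_consumed(n, atp_per_cycle))
```

With these inputs the inversion gives a cycle count of the same order as the rounded estimate $n \approx 100$ quoted in the text, and the ATP total scales linearly with whatever per-cycle value is assumed.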

ACKNOWLEDGMENTS

We are grateful to Bernard Brooks, Shaon Chakrabarti, Eda Koculi, George Stan, Riina Tehver, and Scott Ye for collaboration on various aspects of the works described here. DT acknowledges useful conversations with Alan Lambowitz on RNA chaperones. This work was supported in part by a grant from the National Science Foundation (CHE 19-00093) and the Collie-Welch Chair (F-0019) administered through the Welch Foundation.

[1] Goloubinoff P, Gatenby AA, Lorimer GH (1989) GroEL heat-shock proteins promote assembly of foreign prokaryotic ribulose bisphosphate carboxylase oligomers in Escherichia coli. Nature 337:44–47.
[2] Sigler PB, Xu Z, Rye HS, Burston SG, Fenton WA, Horwich AL (1998) Structure and function in GroEL-mediated protein folding. Annu. Rev. Biochem. 67:581–608.
[3] Thirumalai D, Lorimer GH (2001) Chaperonin-mediated protein folding. Annu. Rev. Biophys. Biomol. Struct. 30:245–269.
[4] Ellis J (1987) Proteins as molecular chaperones. Nature 328:378–379.
[5] Lorimer GH (2001) A personal account of chaperonin history. Plant Physiology 125:38–41.
[6] Hemmingsen SM, Woolford C, van der Vies SM, Tilly K, Dennis DT, Georgopoulos CP, Hendrix RW, Ellis RJ (1988) Homologous plant and bacterial proteins chaperone oligomeric protein assembly. Nature 333:330.
[7] Mohr S, Stryker JM, Lambowitz AM (2002) A DEAD-box protein functions as an ATP-dependent RNA chaperone in group I intron splicing. Cell 109:769–779.
[8] Woodson SA (2010) Taming free energy landscapes with RNA chaperones. RNA Biology 7:677–686.
[9] Pan J, Thirumalai D, Woodson SA (1997) Folding of RNA involves parallel pathways. J. Mol. Biol. 273:7–13.
[10] Thirumalai D, Hyeon C, Zhuravlev PI, Lorimer GH (2019) Symmetry, rigidity, and allosteric signaling: From monomeric proteins to molecular machines. Chem. Rev. 119:6788–6821.
[11] Xu Z, Horwich AL, Sigler PB (1997) The crystal structure of the asymmetric GroEL-GroES-(ADP)7 chaperonin complex. Nature 388:741.
[12] Fei X, Yang D, LaRonde-LeBlanc N, Lorimer GH (2013) Crystal structure of a GroEL-ADP complex in the relaxed allosteric state at 2.7 Å resolution. Proc. Natl. Acad. Sci. U. S. A. 110:E2958–E2966.
[13] Fei X, Ye X, LaRonde NA, Lorimer GH (2014) Formation and structures of GroEL:GroES2 chaperonin footballs, the protein-folding functional form. Proc. Natl. Acad. Sci. U. S. A. 111:12776–12780.

[14] Todd MJ, Lorimer GH, Thirumalai D (1996) Chaperonin-facilitated protein folding: Optimization of rate and yield by an iterative annealing mechanism. Proc. Natl. Acad. Sci. U. S. A. 93:4030–4035.
[15] Gruber R, Horovitz A (2016) Allosteric mechanisms in chaperonin machines. Chem. Rev. 116:6588–6606.
[16] Thirumalai D, Hyeon C (2018) Signalling networks and dynamics of allosteric transitions in bacterial chaperonin GroEL: implications for iterative annealing of misfolded proteins. Phil. Trans. Royal Soc. B: Biol. Sci. 373:20170182.
[17] Gruber R, Horovitz A (2018) Unpicking allosteric mechanisms of homo-oligomeric proteins by determining their successive ligand binding constants. Phil. Trans. R. Soc. B 373:20170176.
[18] Yang D, Ye X, Lorimer GH (2013) Symmetric GroEL:GroES2 complexes are the protein-folding functional form of the chaperonin nanomachine. Proc. Natl. Acad. Sci. U. S. A. 110:E4298–E4305.
[19] Takei Y, Iizuka R, Ueno T, Funatsu T (2012) Single-molecule observation of protein folding in symmetric GroEL-(GroES)2 complexes. J. Biol. Chem. 287:41118–41125.
[20] Viitanen PV, Gatenby AA, Lorimer GH (1992) Purified chaperonin 60 (GroEL) interacts with the nonnative states of a multitude of Escherichia coli proteins. Protein Science 1:363–369.
[21] Betancourt MR, Thirumalai D (1999) Exploring the kinetic requirements for enhancement of protein folding rates in the GroEL cavity. J. Mol. Biol. 287:627–644.
[22] Baumketner A, Jewett A, Shea J (2003) Effects of confinement in chaperonin assisted protein folding: rate enhancement by decreasing the roughness of the folding energy landscape. J. Mol. Biol. 332:701–713.
[23] Cheung MS, Thirumalai D (2006) Nanopore–protein interactions dramatically alter stability and yield of the native state in restricted spaces. J. Mol. Biol. 357:632–643.
[24] Hofmann H, Hillger F, Pfeil SH, Hoffmann A, Streich D, Haenni D, Nettels D, Lipman EA, Schuler B (2010) Single-molecule spectroscopy of protein folding in a chaperonin cage. Proc. Natl. Acad. Sci. U. S. A. 107:11793–11798.
[25] Fedorov AN, Baldwin TO (1997) GroE modulates kinetic partitioning of folding intermediates between alternative states to maximize the yield of biologically active protein. J. Mol. Biol. 268:712–723.
[26] Treiber DK, Rook M, Zarrinkar PR, Williamson JR (1998) Kinetic Intermediates Trapped by Native Interactions in RNA Folding. Science 279:1943–1946.

[27] Treiber DK, Williamson JR (1999) Exposing the kinetic traps in RNA folding. Curr. Opin. Struct. Biol. 9:339–345.
[28] Thirumalai D, Woodson SA (1996) Kinetics of Folding of Proteins and RNA. Acc. Chem. Res. 29:433–439.
[29] Russell R, Herschlag D (2001) Probing the folding landscape of the Tetrahymena ribozyme: Commitment to form the native conformation is late in the folding pathway. J. Mol. Biol. 308:839–851.
[30] Woodson SA (2005) Structure and assembly of group I introns. Curr. Opin. Struct. Biol. 15:324–330.
[31] Russell R, Herschlag D (1999) New pathways in folding of the Tetrahymena group I RNA enzyme. J. Mol. Biol. 291:1155–1167.
[32] Thirumalai D, Hyeon C (2005) RNA and Protein folding: Common Themes and Variations. Biochemistry 44:4957–4970.
[33] Hyeon C, Denesyuk NA, Thirumalai D (2014) Development and Applications of Coarse-Grained Models for RNA. Israel J. Chem. 54:1358–1373.
[34] Solomatin SV, Greenfeld M, Chu S, Herschlag D (2010) Multiple native states reveal persistent ruggedness of an RNA folding landscape. Nature 463:681–684.
[35] Thirumalai D, Lee N, Woodson SA, Klimov DK (2001) Early Events in RNA Folding. Annu. Rev. Phys. Chem. 52:751–762.
[36] Zhuang X, Bartley L, Babcock A, Russell R, Ha T, Herschlag D, Chu S (2000) A single-molecule study of RNA catalysis and folding. Science 288:2048–2051.
[37] Pan J, Deras ML, Woodson SA (2000) Fast Folding of a Ribozyme by Stabilization Core Interactions: Evidence for Multiple Folding Pathways in RNA. J. Mol. Biol. 296:133–144.
[38] Lorsch JR (2002) RNA chaperones exist and DEAD box proteins get a life. Cell 109:797–800.
[39] Tijerina P, Bhaskaran H, Russell R (2006) Nonspecific binding to structured RNA and preferential unwinding of an exposed helix by the CYT-19 protein, a DEAD-box RNA chaperone. Proc. Natl. Acad. Sci. U. S. A. 103:16698–16703.
[40] Bhaskaran H, Russell R (2007) Kinetic redistribution of native and misfolded RNAs by DEAD-box chaperone. Nature 449:1014–1018.
[41] Grohman JK, Del Campo M, Bhaskaran H, Tijerina P, Lambowitz AM, Russell R (2007) Probing the mechanisms of DEAD-box proteins as general RNA chaperones: the C-terminal domain of CYT-19 mediates general recognition of RNA. Biochemistry 46:3013–3022.

[42] Mallam AL, Jarmoskaite I, Tijerina P, Del Campo M, Seifert S, Guo L, Russell R, Lambowitz AM (2011) Solution structures of DEAD-box RNA chaperones reveal conformational changes and nucleic acid tethering by a basic tail. Proc. Natl. Acad. Sci. U. S. A. 108:12254–12259.
[43] Russell R, Jarmoskaite I, Lambowitz AM (2013) Toward a molecular understanding of RNA remodeling by DEAD-box proteins. RNA Biology 10:44–55.
[44] Lorimer GH (1996) A quantitative assessment of the role of the chaperonin proteins in protein folding in vivo. FASEB J. 10:5.
[45] Stan G, Brooks BR, Lorimer GH, Thirumalai D (2005) Identifying natural substrates for chaperonins using a sequence-based approach. Prot. Sci. 14:193–201.
[46] Chaudhuri TK, Gupta P (2005) Factors governing the substrate recognition by GroEL chaperone: a sequence correlation approach. Cell Stress & Chaperones 10:24.
[47] Stan G, Brooks BR, Lorimer GH, Thirumalai D (2006) Residues in substrate proteins that interact with GroEL in the capture process are buried in the native state. Proc. Natl. Acad. Sci. U. S. A. 103:4433–4438.
[48] Noivirt-Brik O, Unger R, Horovitz A (2007) Low folding propensity and high translation efficiency distinguish in vivo substrates of GroEL from other Escherichia coli proteins. Bioinformatics 23:3276–3279.
[49] Endo A, Kurusu Y (2007) Identification of in vivo substrates of the chaperonin GroEL from Bacillus subtilis. Biosci. Biotech. Biochem. 71:1073–1077.
[50] Guo Z, Thirumalai D (1995) Kinetics of Protein Folding: Nucleation Mechanism, Time Scales, and Pathways. Biopolymers 36:83–102.
[51] Thirumalai D, Klimov DK, Woodson SA (1997) Kinetic partitioning mechanism as a unifying theme in the folding of biomolecules. Theor. Chem. Acc. 96:14–22.
[52] Thirumalai D (1995) From Minimal Models to Real Proteins: Time Scales for Protein Folding Kinetics. J. Phys. I (Fr.) 5:1457–1467.
[53] Ziv G, Thirumalai D, Haran G (2009) Collapse transition in proteins. Phys. Chem. Chem. Phys. 11:83–93.
[54] Schmidt M, Buchner J, Todd MJ, Lorimer GH, Viitanen PV (1994) On the role of GroES in the chaperonin-assisted folding reaction. Three case studies. J. Biol. Chem. 269:10304–10311.
[55] Mattingly JR, Iriarte A, Martinez-Carrion M (1995) Homologous proteins with different affinities for GroEL. The refolding of the aspartate aminotransferase isozymes at varying temperatures. J. Biol. Chem. 270:1138–1148.
[56] Kiefhaber T (1995) Kinetic traps in lysozyme folding. Proc. Natl. Acad. Sci. U. S. A. 92:9029–9033.
[57] Matagne A, Chung EW, Ball LJ, Radford SE, Robinson CV, Dobson CM (1998) The origin of the α-domain intermediate in the folding of hen lysozyme. J. Mol. Biol. 277:997–1005.
[58] Peng Q, Li H (2008) Atomic force microscopy reveals parallel mechanical unfolding pathways of T4 lysozyme: evidence for a kinetic partitioning mechanism. Proc. Natl. Acad. Sci. U. S. A. 105:1885–1890.
[59] Stigler J, Ziegler F, Gieseke A, Gebhardt J, Rief M (2011) The Complex Folding Network of Single Calmodulin Molecules. Science 334:512–516.
[60] Dima RI, Thirumalai D (2004) Asymmetry in the shapes of folded and denatured states of proteins. J. Phys. Chem. B 108:6564–6570.
[61] Lorimer GH, Horovitz A, McLeish T (2018) Allostery and molecular machines. Philos Trans R Soc Lond B Biol Sci. 373:20170173.
[62] Sameshima T, Iizuka R, Ueno T, Funatsu T (2010) Denatured proteins facilitate the formation of the football-shaped GroEL–(GroES)2 complex. Biochem. J. 427:247–254.
[63] Ye X, Lorimer GH (2013) Substrate protein switches GroE chaperonins from asymmetric to symmetric cycling by catalyzing nucleotide exchange. Proc. Natl. Acad. Sci. U. S. A. 110:E4289–E4297.
[64] Horovitz A, Fridmann Y, Kafri G, Yifrach O (2001) Review: allostery in chaperonins. J. Struct. Biol. 135:104–114.
[65] Yifrach O, Horovitz A (1995) Nested cooperativity in the ATPase activity of the oligomeric chaperonin GroEL. Biochemistry 34:5303–5308.
[66] Yifrach O, Horovitz A (2000) Coupling between protein folding and allostery in the GroE chaperonin system. Proc. Natl. Acad. Sci. U. S. A. 97:1521–1524.
[67] Tehver R, Chen J, Thirumalai D (2009) Allostery wiring diagrams in the transitions that drive the GroEL reaction cycle. J. Mol. Biol. 387:390–406.
[68] Thirumalai D (1994) Statistical Mechanics, Protein Structure, and Protein-Substrate Interactions, ed. Doniach S (Plenum, New York), pp. 115–134.
[69] Orland H, Thirumalai D (1997) A kinetic model for chaperonin assisted folding of proteins. J. Phys. I France 7:553–560.

[70] Hyeon C, Lorimer GH, Thirumalai D (2006) Dynamics of allosteric transition in GroEL. Proc. Natl. Acad. Sci. U. S. A. 103:18939–18944.
[71] Corsepius NC, Lorimer GH (2013) Measuring how much work the chaperone GroEL can do. Proc. Natl. Acad. Sci. U. S. A. 110:E2451–E2459.
[72] Anfinsen CB (1973) Principles that govern the folding of protein chains. Science 181:223–230.
[73] Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680.
[74] Tehver R, Thirumalai D (2008) Kinetic Model for the Coupling between Allosteric Transitions in GroEL and Substrate Protein Folding and Aggregation. J. Mol. Biol. 377:1279–1295.
[75] Chakrabarti S, Hyeon C, Ye X, Lorimer G, Thirumalai D (2017) Molecular Chaperones Maximize the Native State Yield on Biological Times by Driving Substrates out of Equilibrium. Proc. Natl. Acad. Sci. U. S. A. 114:E10919–E10927.
[76] Sun Z, Scott DJ, Lund PA (2003) Isolation and characterisation of mutants of GroEL that are fully functional as single rings. J. Mol. Biol. 332:715–728.
[77] Hyeon C, Thirumalai D (2013) Generalized iterative annealing model for the action of RNA chaperones. J. Chem. Phys. 139:121924.
[78] Horwich AL, Apetri AC, Fenton WA (2009) The GroEL/GroES cis cavity as a passive anti-aggregation device. FEBS Letters 583:2654–2662.
[79] Goloubinoff P, Sassi AS, Fauvet B, Barducci A, De los Rios P (2018) Chaperones convert the energy from ATP into the nonequilibrium stabilization of native proteins. Nat. Chem. Biol. 14:388–394.
[80] Hackney DD (1988) Kinesin ATPase: rate-limiting ADP release. Proc. Natl. Acad. Sci. U. S. A. 85:6314–6318.
[81] Brinker A, Pfeifer G, Kerner MJ, Naylor DJ, Hartl FU, Hayer-Hartl M (2001) Dual Function of Protein Confinement in Chaperonin-Assisted Protein Folding. Cell 107:223–233.
FIG. 1. Allosteric states in GroEL. The T → R transition is driven by ATP, and subsequent binding of GroES and ATP hydrolysis result in the R → R'' transition. As a result of the transition from T to R'', the volume of the cavity expands from 85,000 to 185,000 Å³, and the SP experiences a change in microenvironment from hydrophobic in the T state to hydrophilic in the R'' state.
FIG. 2. Structures of the GroEL-GroES chaperonin. a. Structures of GroEL in the T, R, and R'' states (PDB codes: 1OEL, 2C7E, 1AON). Apical, intermediate, and equatorial domains are colored red, green, and blue, respectively. In the R'' state, GroES is bound on top of the apical domain of the GroEL ring structure. b. The football-like structure of the GroEL-GroES complex (PDB code: 4PKN) is the functional state that is populated in the presence of substrate proteins. A view from the bottom highlights the structure with 7-fold symmetry.

FIG. 3. Kinetic partitioning mechanism (KPM) on rugged energy landscapes and chaperones. (a) Rugged folding landscape illustrating the native basin of attraction (NBA) and competing basins of attraction (CBA). In the absence of chaperones, a fraction Φ of the unfolded state ensemble folds into the NBA and 1 − Φ of the ensemble collapses to the CBA. (b) IAM for GroEL-GroES showing the coupling between allosteric transitions and SP folding. The figure illustrates that partitioning to the native state, with probability Φ, and misfolding to a metastable state, with probability (1 − Φ), occur rapidly within the cavity during the two-second lifetime of the encapsulated state. Although not shown explicitly, the functioning state is the symmetric complex (see Fig. 2). (c) IAM for the RNA chaperone (CYT-18/CYT-19) acting on RNA. Both processes shown in (b) and (c) consume energy through ATP hydrolysis.

FIG. 4. a. Folding of T. ribozyme from its secondary structure to the three-dimensional native state. b. Structure of the yeast analogue of CYT-19.

FIG. 5. The model for a unified description of chaperone-assisted folding. Tetrahymena ribozyme and CYT-19 (green) are employed for illustration purposes. The model shows the ribozyme in the I (brown), N (red), and M (blue) states and ribozyme-CYT-19 complexes. The GroEL-associated folding can similarly be accounted for by replacing CYT-19 with the chaperonin machinery. The figure was adapted from Ref. .

FIG. 6. Native-state yield of Rubisco as a function of time at different GroEL concentrations. The chaperonin concentrations for the curves from bottom to top are 1 nM, 2 nM, 5 nM, 10 nM, 20 nM, 30 nM, 50 nM, and 100 nM. The lines are the fits to the experimental data from Ref. using the kinetic model. The figure was adapted from Ref. .

FIG. 7. Yield of SPs as a function of time. (a) The data points are taken from the experiment for folding of mtMDH. The lines are the fits using the kinetic model developed in Ref. . The black line is for spontaneous folding. Assisted folding in the presence of GroES and SR1 (purple), SR-T522I (blue), SR-A399T (red), and SR-D115N (dark red) was used to assess the efficiencies of these three single-ring chaperonins relative to GroEL (green). (b) The same as (a) except that the SP is CS.
FIG. 8. Schematic of the generalized IAM of chaperone-assisted substrate folding. Depicted are the logical steps in a branching process that leads to the recursion relation for the total yield of the misfolded state after the n-th annealing process (M_n), given at the bottom of the figure: $M_n = M_{n-1}(1-\Phi) + \kappa\,(1-M_{n-1})(1-\Phi)$. Y_i (= 1 − M_i) and M_i are, respectively, the yields of the native and misfolded states from the i-th iteration. Φ is the kinetic partitioning factor, and κ is the degree of discrimination by the chaperone between the native and misfolded states.
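The branching process of the generalized IAM can be iterated numerically. The sketch below is only an illustration under stated assumptions: it takes the recursion to be M_n = (1 − Φ)[M_{n−1} + κ(1 − M_{n−1})], starts from a fully misfolded ensemble (M_0 = 1), and treats κ as the fraction of native substrate that the chaperone acts on in each cycle. The function names and the numerical values of Φ and κ are hypothetical and are not fitted parameters from the paper. Under these assumptions the native yield is a geometric sum, which the code uses as a cross-check.

```python
def misfolded_after(n, phi, kappa, m0=1.0):
    """Iterate the assumed recursion M_n = (1 - phi) * (M_{n-1} + kappa * (1 - M_{n-1})).

    phi   : kinetic partitioning factor (fraction that folds correctly per attempt)
    kappa : fraction of native substrate unfolded by the chaperone per cycle
    m0    : initial misfolded fraction (assumed to be 1, i.e. all misfolded)
    """
    m = m0
    for _ in range(n):
        # Misfolded molecules and the kappa-fraction of native molecules are
        # both reset; a fraction (1 - phi) of everything reset misfolds again.
        m = (1.0 - phi) * (m + kappa * (1.0 - m))
    return m


def native_yield_closed_form(n, phi, kappa):
    """Geometric-sum form of the native yield, valid for m0 = 1 and phi > 0."""
    q = (1.0 - phi) * (1.0 - kappa)
    return phi * (1.0 - q ** n) / (1.0 - q)


def steady_state_yield(phi, kappa):
    """Limit of the native yield as the number of cycles grows (phi > 0)."""
    return phi / (kappa + phi * (1.0 - kappa))


if __name__ == "__main__":
    phi, kappa = 0.08, 0.3  # illustrative values only
    for n in (1, 5, 20, 100):
        y_iter = 1.0 - misfolded_after(n, phi, kappa)
        y_closed = native_yield_closed_form(n, phi, kappa)
        print(n, y_iter, y_closed)
    print("steady state:", steady_state_yield(phi, kappa))
```

Setting kappa to zero recovers the original IAM, in which the chaperone ignores the native state and the yield approaches one as the number of cycles grows; a nonzero kappa caps the long-time yield below one in this sketch.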


FIG. 9. Analysis of folding of the P5a variant of T . ribozyme in the presence of CYT-19. (a) CYT-19 (1 µ M)-induced kinetics of the native P5a variant ribozyme in 5 mM Mg at various ATPconcentrations: no ATP (brown), 100 µ M ATP (blue), 200 µ M ATP (red), 400 µ M ATP (green),1 mM ATP (pink), and 2 mM ATP (black). (b) Kinetics of P5a variant folding for diﬀerent CYT-19 concentrations. Starting conditions were native (circles) or misfolded (triangles) P5a variants.CYT-19s are 0.5 µ M (blue) and 1 µ M (red). The curve in green is for a mixture of native andmisfolded P5a variant ribozymes when proteinase K is added to inactivate CYT-19. The ﬁgure wasadapted from Ref. . contrasting behavior is fully explained by our model. Our the-ory predicts that P ( N , ) is a monotonically increasing functionof [C] if the inequality: k ATPcat , N k ATPcat , M < N M k IM k IM + k ATPcat , M [5] is satisﬁed. On the other hand, if: k ATPcat , N k ATPcat , M > N M k IM k IM + k ATPcat , M , [6] then P ( N , ) will be a monotonically decreasing function of [C](see Supporting Information for details).Substituting the parameters from Table S1 (see

CYT-19–Mediated Folding of

Tetrahymena

Ribozymes and

GroEL-Mediated Folding of Rubisco and MDH ), we see that theinequality in Eq. is indeed satisﬁed by Tetrahymena ribozyme.Similarly, the best ﬁt parameters from Table S2 show that Rubiscosatisﬁes the inequality in Eq. , thus explaining the increase innative yield of Rubisco as GroEL concentration is increased. Generalized IAM and the N State Recognition Factor.

Without chap-erones, only a small fraction of the original unfolded ensem-ble reach the N state spontaneously. The rest, , remaintrapped in long-lived metastable states. To rescue these kineti-cally trapped proteins to the N state, the chaperone moleculesrecognize and bind to the exposed hydrophobic regions of themisfolded protein. The remaining fraction, (1 ) , is assisted byGroEL, in all likelihood reverting it to the more expanded form,and the whole process is repeated over and over again. The yieldof the N state as a function of such reaction cycles n is given by Y N ( n ) = 1 (1 ) n . As n becomes large, the native yield cantheoretically reach Y N ( n ) ! .The generalized IAM (35) allows for the possibility of N staterecognition by the RNA chaperone, CYT-19, which was not con-sidered previously (42). The chaperone is allowed to act on theN state in addition to the M states of protein or RNA and redis-tributes  again into  N states and  (1 ) M states,where  (0 <  < is the degree of discrimination by the chap-erone between the N and M states. A fraction (1  ) of theoriginal native population remains unperturbed in the same Nstate. It is easy to show that the net gain in the fraction of Nstate after n iterations is given by ((1 )(1  )) n (where n = 1 , , ... ). The total yield of the N state after n iterations Y N ( n ) (Fig. 3) is therefore Y N ( n ) = + (1 )(1  ) + ... + ((1 )(1  )) n , and the conditions of < and  < lead to: Y N ( n ) = (1  ) n (1 ) n  + (1  ) . [7] The physical meaning of the discrimination factor,  , is evi-dent by making an approximate mapping of the long-time yield Y N ( n ! 1 ) to the equivalent expression in our master-equationframework, P ( N , ) . By substituting = k IN k IN + k IM into Eq. and taking the limit n ! 1 , we obtain Y N ( ) = k IN  k IM + k IN ,while P ( N , ) with k MN , k NM ⌧ , and k NI ⌧ k IN reduces to P ⇤ ( N , ) = k IN ( k NI / k MI ) k IM + k IN . 
Therefore, $\kappa$ is approximately the ratio of two rate constants associated with chaperonin-induced unfolding:

$$\kappa \approx \frac{k_{NI}([C],[T])}{k_{MI}([C],[T])}, \tag{8}$$

which is in accord with the intuitive definition of $\kappa$ given in Eq. . It is worth noting that $\kappa$ depends on the chaperone ($[C]$) and ATP ($[T]$) concentrations, which suggests that it is possible to reduce $\kappa$ by increasing $[C]$ or $[T]$. Evidently, for GroEL, $\kappa \to 0$ because $k_{NI}([C],[T])$ is negligible.

CYT-19–Mediated Folding of Tetrahymena Ribozymes.

Since the discovery of self-splicing enzymatic activity in the group I intron Tetrahymena thermophila ribozyme (43–45), the Tetrahymena ribozyme has been the workhorse used to reveal the general principles of RNA folding. In accord with the KPM (Fig. 1), the value of $\Phi$ of the WT Tetrahymena ribozyme that attains catalytic activity in the absence of CYT-19 is only 6% to 10% at the temperature studied, while the majority of ribozymes remain inactive (11, 13). In the case of the Tetrahymena ribozyme, it is suspected that the formation of incorrect base pairs stabilizes the misfolded conformations (46). For example, to disrupt a six base-paired helix, a secondary structure motif ubiquitous in RNA, the free energy barrier is $\Delta G^{\ddagger} = 10$ to 15 kcal/mol (5 stacks, that is, 2 to 3 kcal/mol per stack). The time scale $\tau \sim \tau_o e^{\Delta G^{\ddagger}/k_B T}$ for a spontaneous disruption of base stacks, with $\tau_o$ on the microsecond scale (47), is therefore many orders of magnitude longer than $\tau_o$. Thus, once trapped in a mispaired conformation, the ribozyme is highly unlikely to resolve the kinetic trap autonomously on a biologically viable time scale (46).

We first analyze the ability of CYT-19 to facilitate the folding of Tetrahymena ribozyme. Time-resolved kinetics of two variants [P5a mutant and P5abc-deleted ($\Delta$P5abc) ribozyme] as well as the WT of the ribozyme were probed by varying CYT-19 and ATP concentrations (30). We establish the validity of our theory by using Eq. S2 to quantitatively fit an array of experimental data on the WT and P5a mutant (Figs. 4 A and 5 A and B, respectively). In the experiments, the fraction of native ribozyme was probed as a function of time, under different initial conditions: (i) starting from completely folded (N) ribozymes, (ii) starting from primarily misfolded (M) ribozymes, and (iii) CYT-19 chaperone inactivated by addition of proteinase K. To probe the effects of CYT-19 and ATP on the production of the active (N) state, CYT-19 was varied for cases i and ii, and ATP concentration was varied for case i. In total, we used our theory to fit five sets of data for the WT (Fig. 4 A) and 11 sets of data for the P5a variant (Fig. 5 A and B) ribozyme. By accounting quantitatively for the dataset, we extracted the best fit parameters, given in Table S1.

The overall trends in the parameters, extracted by simultaneous fit of the available data, are consistent with the direct experimental measurements and estimates (see Table S1). Note that some of the experimental results cited in Table S1 were obtained under different conditions (temperature, Mg$^{2+}$ ion concentration, or absence of CYT-19) than the experiments analyzed using our theory. These differences could affect the various rates and are pointed out in Table S1.
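The barrier-crossing estimate for disrupting a six base-paired helix, quoted earlier in this section, can be reproduced with a few lines of arithmetic. This is only a sketch: the prefactor $\tau_o = 1$ µs and the temperature of 298.15 K are assumptions made here for illustration (the exact values are not legible in the text above), and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K), i.e. k_B per mole


def escape_time_s(barrier_kcal_per_mol, tau0_s=1e-6, temperature_k=298.15):
    """Arrhenius-like estimate tau = tau0 * exp(barrier / (R T)).

    tau0_s and temperature_k are assumed values, not taken from the source.
    """
    return tau0_s * math.exp(barrier_kcal_per_mol / (R_KCAL * temperature_k))


for barrier in (10.0, 15.0):
    tau = escape_time_s(barrier)
    print(f"barrier = {barrier:4.1f} kcal/mol -> tau ~ {tau:.1e} s")
```

Under these assumptions the lower barrier gives a time scale of tens of seconds and the upper barrier a time scale of order $10^5$ s, so the estimate spans roughly four orders of magnitude. The result is exponentially sensitive to both the barrier height and the temperature.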
For the P5a mutant, the fraction of ribozymes that fold directly to the N state was estimated previously (30), while $\Phi$ (Eq. ) calculated from our

Fig. 3. Generalized IAM for proteins and RNA, showing the effect of varying $\kappa$ on the yield of the N state. Shown is the plot of the yield, $Y_N(n)$ (see Eq. 7), as a function of number of cycles $n$. The native fraction in the limit of large $n$ depends on $\kappa$, the efficiency of chaperone recognition of the N state.

FIG. 10. The effect of varying $\kappa$ on the yield of the N state. Shown is the plot of the native yield, $Y_N(n)$, as a function of number of cycles $n$ for varying $\kappa$ values: $\kappa = 0$ (red), $\kappa = 0.01$ (blue), $\kappa = 0.05$ (green), a larger intermediate value (brown), and $\kappa = 1$ (black). The figure was adapted from Ref. .

Effect of Aggregation.

Note that our three-state model neglects any possibility of aggregation, which introduces at least one and likely many additional parameters. Could the different steady-state yields of substrate at different concentrations of chaperone (especially substoichiometric values; see Fig. 6 A and C) be a consequence of substrate aggregation? From the results in Table S2, we calculated the chaperone-driven rate $k_{MI}$ that primarily brings misfolded Rubisco back into circulation ($k_{MN}$ is much smaller and can be neglected for this analysis). For the experiments shown in Fig. 6 A, 50 nM denatured Rubisco was used, so GroEL concentrations of 1, 2, 5, 10, 20, and 30 nM are substoichiometric. The corresponding values of $k_{MI}$ for these GroEL concentrations as predicted by our theory are 0.39, 0.77, 1.86, 3.58, 6.64, and 9.29 min$^{-1}$, respectively. On the other hand, using the estimate (in nM$^{-1}$ min$^{-1}$) of the second-order rate constant for Rubisco aggregation obtained by an elaborate framework (27), and using 50 nM for the misfolded Rubisco concentration, we obtain an upper limit for the aggregation rate (in min$^{-1}$). The true aggregation rate will be smaller, since not all of the 50 nM Rubisco will be in the M state. The aggregation rate is an order of magnitude smaller than $k_{MI}$ for the smallest GroEL concentration used in Fig. 6 A (1 nM), while it is over two orders of magnitude smaller when the GroEL concentration is 30 nM. This demonstrates that even though aggregation could be present, it is much slower than the dynamics within the three states of the substrate. As a result, a steady-state probability current will be set up within the three states of I, M, and N on time scales faster than the cumulative rate of aggregation, resulting in the different steady-state concentrations observed for the different GroEL concentrations.
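The comparison between aggregation and chaperone-driven unfolding described above amounts to a short calculation, sketched here. The $k_{MI}$ values and the 50 nM substrate concentration are those quoted in the text; the second-order aggregation rate constant `K_AGG_PLACEHOLDER` is a hypothetical placeholder (the value from ref. 27 is not legible above), chosen only so that the output is consistent with the order-of-magnitude statements in the text.

```python
# GroEL concentrations (nM) and predicted k_MI (min^-1), as quoted in the text
GROEL_NM = [1, 2, 5, 10, 20, 30]
K_MI_PER_MIN = [0.39, 0.77, 1.86, 3.58, 6.64, 9.29]

# HYPOTHETICAL placeholder for the second-order aggregation rate constant,
# in nM^-1 min^-1. This is NOT the published value.
K_AGG_PLACEHOLDER = 8e-4


def aggregation_rate_upper_bound(k_agg_per_nM_min, misfolded_nM=50.0):
    """Pseudo-first-order aggregation rate, assuming all substrate is misfolded."""
    return k_agg_per_nM_min * misfolded_nM


def separation_of_time_scales(k_agg_per_nM_min):
    """Ratio k_MI / (aggregation upper bound) for each GroEL concentration."""
    bound = aggregation_rate_upper_bound(k_agg_per_nM_min)
    return [k_mi / bound for k_mi in K_MI_PER_MIN]


ratios = separation_of_time_scales(K_AGG_PLACEHOLDER)
for conc, ratio in zip(GROEL_NM, ratios):
    print(f"[GroEL] = {conc:2d} nM: k_MI / k_agg,max ~ {ratio:.0f}")
```

Because the aggregation rate computed this way is an upper bound, the true separation between the two time scales can only be larger than these ratios.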
These calculations are also in accord with the experimental observations of Rubisco (17) and MDH (36), where no visible aggregation was observed. Therefore, while aggregation likely plays some role in establishing the final steady-state yields, our calculations show that it makes a very minor contribution compared with the role of nonequilibrium dynamics.

Maximization of the Finite-Time Yield by Iterative Annealing and in Vivo Regulation of Chaperone Concentration.

Do chaperones maximize either the absolute yield of the N states or the folding rate? Our theory suggests a general answer to this question in the unified scenario of both RNA and protein chaperones. Using the parameters in Tables S1 and S2, the steady-state native yield $P(N,\infty)$ is plotted for both Rubisco and Tetrahymena ribozyme in Fig. 7 A and B. The figure highlights that increasing the chaperone concentration results in completely opposite behavior of the native yield for Rubisco and Tetrahymena ribozyme. This immediately suggests that the absolute value of the yield is not maximized, since increasing CYT-19 concentration decreases the native yield of the ribozyme (Fig. 7 B). The folding rate ($k_{obs}$) is not maximized either: Fig. 4 B for the WT ribozyme and Fig. 5 C for the P5a variant both show that the folding rate is an increasing function of CYT-19 concentration, even at µM concentrations. Our theory predicts that the folding rate saturates only at much larger CYT-19 concentration. At such high concentrations, however, the native yield of ribozyme would become very low (Fig. 7 B), strongly indicating that neither the folding rate nor the nonequilibrium steady-state yield is maximized. This also suggests that the CYT-19 concentration cannot be arbitrarily large under in vivo conditions and that the chaperone concentrations must be regulated.

Interestingly, the product $\Delta_{NE}$ of the magnitude of the folding rate and the steady-state yield $P(N,\infty)$, which quantifies the balance between folding rate and amount of steady-state yield, reaches saturation at low values of chaperone concentration for both protein and RNA, as shown in Fig. 7 C and D. $\Delta_{NE}$ is a monotonically increasing function of the chaperone concentration, reaching saturation at sub-µM concentrations for both the RNA and the protein. The same plot for MDH is shown in Fig. S1. This intriguing result shows that chaperone concentrations may well be regulated to be in the range of a few µM such that $\Delta_{NE}$ is maximized (Fig. 7 C and D), thereby allowing for higher native yields in short, biologically relevant times.

Finally, rough estimates of chaperone concentrations in vivo also support our results suggesting the maximization of $\Delta_{NE}$. There are 10,300 molecules of the yeast RNA chaperone Mss116p (53, 54), which is structurally similar to CYT-19 and catalyzes the efficient splicing of yeast mitochondrial group I and II introns (54). Given an average yeast volume of 37 µm$^3$ (55), the concentration of Mss116p is a few tenths of a µM, which is in the saturation region of Fig. 7 D. GroEL concentration in vivo is about 5.2 µM (there are 1,580 14-mer GroEL molecules in a volume of 1 µm$^3$ in Escherichia coli, with the functional unit being the 7-mer). As can be seen from Fig. 7 C and Fig. S2, a concentration of 5.2 µM is in the saturation region as well. These two results provide additional support to the idea that by functioning out of equilibrium it is possible to maximize the native yield in biologically relevant time scales under in vivo conditions.

Fig. 7. Maximization of the finite-time yield by iterative annealing. (A and B) Steady-state yield of the N state of Rubisco (A) and ribozyme (B), as functions of chaperone concentration. (C and D) Plots of $\Delta_{NE}$ for protein (C) and ribozyme (D), as functions of chaperone concentration. The curves in A and C were obtained using the best fit parameters for the GroEL–Rubisco system, given in Table S2. The curves in B and D were produced using the best fit parameters for the mutant P5a ribozyme, given in Table S1. For all of the curves, the ATP concentration [T] was set to 1 mM. The qualitative results do not change for other concentrations of ATP.

Concluding Remarks

With a doubling time of about 2 hours, Tetrahymena are some of the fastest multiplying free-living eukaryotic cells (56). Therefore, the viable time scale for Tetrahymena ribozyme folding to the N state should be on the order of a few hours. Though a large fraction of the ribozyme ($1-\Phi$, with $\Phi$ only about 6% to 10% as noted earlier) misfolds and stays kinetically trapped over time scales of days in vitro (11), experiments show that the addition of CYT-19 can accelerate the folding process to a matter of minutes (30). Surprisingly, however, increasing the CYT-19 concentration decreases the final yield of the N states, in stark contrast to GroEL-mediated folding of proteins, where increasing the chaperone concentration increases the native yield at long times.

In this work, we have developed a theoretical model to study the widely contrasting experimental results on protein and RNA

Chakrabarti et al., PNAS, published online December 7, 2017, E10925

FIG. 11. Maximization of the finite-time yield by iterative annealing. (a) GroEL and (b) Cyt-19. (Top panels) Steady-state yield of the folded state, $P_N^{SS}$, for Rubisco (a) and ribozyme (b), as a function of chaperone concentration. (Bottom panels) Yield per unit time, $\Delta_{NE} = k_F P_N^{SS}$ (in min$^{-1}$), for Rubisco (a) and ribozyme (b), as a function of chaperone concentration. The horizontal axes are [GroEL] (µM) and [Cyt-19] (µM). For all of the curves, ATP concentration was set to 1 mM. The figure was adapted from Ref. .
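The in vivo concentration estimates quoted in the discussion of the finite-time yield follow from copy numbers and cell volumes by a unit conversion, sketched below. The copy numbers and volumes are the ones given in the text; the function name is hypothetical, and the only added ingredient is the conversion 1 µm$^3$ = $10^{-15}$ L.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
LITERS_PER_UM3 = 1e-15        # 1 cubic micrometer = 1e-15 liters


def micromolar(n_molecules, volume_um3):
    """Concentration in µM of n_molecules confined to volume_um3 (µm^3)."""
    moles = n_molecules / AVOGADRO
    liters = volume_um3 * LITERS_PER_UM3
    return moles / liters * 1e6


# Yeast RNA chaperone Mss116p: 10,300 molecules in an average volume of 37 µm^3
mss116p_uM = micromolar(10_300, 37.0)

# GroEL in E. coli: 1,580 14-mers in 1 µm^3; the functional unit is the 7-mer,
# so each 14-mer counts as two functional units.
groel_14mer_uM = micromolar(1_580, 1.0)
groel_7mer_uM = 2.0 * groel_14mer_uM

print(f"Mss116p       ~ {mss116p_uM:.2f} µM")
print(f"GroEL (7-mer) ~ {groel_7mer_uM:.1f} µM")
```

The second number is consistent with the value of about 5.2 µM quoted for GroEL, and the first falls in the sub-µM range, where the yield per unit time for the ribozyme has already saturated according to the text.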