Thermodynamic uncertainty relation to assess the efficiency of biological processes
Yonghyun Song and Changbong Hyeon a) Korea Institute for Advanced Study, Seoul 02455, Korea
We review the trade-offs between speed, fluctuations, and thermodynamic cost of generating biological processes in nonequilibrium steady states, and discuss how optimal these processes are in light of the universal bound set by the thermodynamic uncertainty relation (TUR). The values of the uncertainty product $Q$ of TUR, which can be used as a measure of the efficiency of enzymatic processes, are suboptimal when the substrate concentration $[S]$ is at the Michaelis constant ($K_M$), and some of the key biological processes are found to work around this condition. We illustrate the utility of $Q$ in assessing the efficiencies of molecular motors and biomass producing machineries, and for the latter cases discuss how their efficiencies quantified in $Q$ are balanced with the error rate in information transfer. We also touch upon the trade-offs in other error-minimizing processes in biology, such as gene regulation and chaperone-assisted protein folding. A spectrum of $Q$ recapitulating the biological processes surveyed here provides glimpses into how biological systems have evolved to optimize and balance the conflicting functional requirements.

From the perspective of thermodynamics, biological machineries required to sustain living matter are at work out of equilibrium. Their operations to carry out specific functions, which must defy the effect of noise inherent to the cellular environment, necessarily incur thermodynamic costs. As such, the quest for the relationships between thermodynamic costs and information processing has been a recurring theme in biological sciences.

Recently, a new class of thermodynamic principle, called the thermodynamic uncertainty relation (TUR), has been derived, giving us quantitative ideas on the tradeoff between the thermodynamic costs and uncertainty (or precision) of dynamic processes generated in nonequilibrium.
For continuous-time Markov jump processes or overdamped Langevin processes in nonequilibrium steady states (NESS), TUR is written as

$$Q = \Delta S_{\rm tot}(t)\,\frac{\langle \delta X(t)^2 \rangle}{\langle X(t) \rangle^2} \geq 2. \tag{1}$$

Here, $\Delta S_{\rm tot}(t) = \langle \Delta s_{\rm tot}(t) \rangle$ is the mean total entropy production with $k_B = 1$ ($\Delta s_{\rm tot}(t) = \Delta s_{\rm sys}(t) + \Delta s_{\rm env}(t)$), calculated over an ensemble of trajectories generated for time $t$. $X(t)$ is a time-integrated current-like observable with odd parity ($X(-t) = -X(t)$) that best captures the functional feature of the process, with $\langle X(t) \rangle$ and $\langle \delta X(t)^2 \rangle$ ($= \langle X(t)^2 \rangle - \langle X(t) \rangle^2$) being its mean and variance. For the process generated from an enzyme reaction, one could choose the net number of catalyses or product formations that have occurred over the time interval $t$ ($\Delta n(t) = n(t) - n(0)$) as a natural output observable of the process, $X(t) = \Delta n(t)$. Upon time reversal, i.e., $t \to -t$, $\Delta n(-t) = n(-t) - n(0) = n(0) - n(t) = -\Delta n(t)$, satisfying the odd parity. If the enzyme exhibits motility along a filament, the net displacement of the enzyme ($\Delta x(t) = x(t) - x(0)$) can be taken as the output observable. From the central limit theorem, the entropy production and the square of the relative uncertainty scale with time $t$ as $\Delta S_{\rm tot}(t) \propto t$ and $\langle \delta X(t)^2 \rangle / \langle X(t) \rangle^2 \propto 1/t$. As a result, the product of the two quantities, termed the uncertainty product $Q$, is a time-independent constant. The inequality sign of TUR (Eq. 1) constrains the value of $Q$, specifying the minimal dissipation for a given uncertainty, or the minimal uncertainty for a given amount of dissipation.
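As a rough numerical illustration of Eq. 1, the Python sketch below estimates $Q$ from simulated trajectories. It assumes a hypothetical single-state biased random walk with forward rate `k_plus` and backward rate `k_minus`; these names and the parameter values are illustrative and are not taken from the text. Under that assumption, each net forward step produces $\ln(k_+/k_-)$ of entropy in units of $k_B$, so the mean entropy production is the mean net step count times this factor.

```python
import math
import random
from statistics import fmean, pvariance


def uncertainty_product(mean_entropy, x_samples):
    """Return Q = <dS_tot> * Var(X) / <X>^2 (Eq. 1 with k_B = 1)."""
    mean_x = fmean(x_samples)
    var_x = pvariance(x_samples)
    return mean_entropy * var_x / mean_x ** 2


def simulate_net_steps(k_plus, k_minus, t_obs, rng):
    """Gillespie simulation of a biased walk; returns X(t) = net steps in [0, t_obs]."""
    t, n = 0.0, 0
    k_tot = k_plus + k_minus
    while True:
        t += rng.expovariate(k_tot)  # waiting time to the next jump
        if t > t_obs:
            return n
        n += 1 if rng.random() < k_plus / k_tot else -1


def exact_q(k_plus, k_minus):
    """Q of the same assumed walk, from its Poisson jump statistics."""
    affinity = math.log(k_plus / k_minus)  # entropy per net forward step
    return affinity * (k_plus + k_minus) / (k_plus - k_minus)


# Illustrative (hypothetical) parameters, not values from the text.
rng = random.Random(0)
k_plus, k_minus, t_obs = 5.0, 1.0, 20.0
samples = [simulate_net_steps(k_plus, k_minus, t_obs, rng) for _ in range(2000)]
mean_entropy = fmean(samples) * math.log(k_plus / k_minus)
q_estimate = uncertainty_product(mean_entropy, samples)
print(f"estimated Q = {q_estimate:.2f}, exact Q = {exact_q(k_plus, k_minus):.2f}")
```

For this assumed model the exact value is $1.5 \ln 5 \approx 2.41$, which lies above the bound of 2, and `exact_q` approaches 2 as `k_plus` approaches `k_minus`. The sampled estimate carries statistical error of a few percent at this sample size.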
While the TUR can be extended beyond the overdamped, Markovian processes to more general conditions with its lower bound being adjusted, the majority of biological processes of interest can be described by employing the language of Markov jump processes on kinetic networks or overdamped Langevin processes. For the purpose of our discussion on biological processes, we confine ourselves to the TUR in its originally proposed form (Eq. 1).

In this paper, we will first clarify the physical significance of the bound set by the TUR from the perspective of stochastic thermodynamics. By quantifying the uncertainty product $Q$ for various biological processes, with particular emphasis on biological motors and biomass producing machineries, we provide our unified perspective on how the functionally critical features of those processes, such as thermodynamic cost, reaction speed, and fluctuation, are balanced in light of TUR. For the case of biomass production processes that transfer the sequence information of DNA or RNA to its downstream, the error in information transfer is another key quantity to be tightly regulated. We will address how error correcting machineries balance the error probability ($\eta$) with the dynamical features integrated in $Q$. Including other dynamical processes associated with information processing that can also be analyzed to determine the value of $Q$, we will construct a spectrum of $Q$ to provide an idea of how different biological processes comprising cellular activities balance their functional needs under the fundamental constraint dictated by the TUR.

PHYSICAL SIGNIFICANCE OF THERMODYNAMIC UNCERTAINTY RELATION
First, to illuminate the significance of TUR, we discuss the condition leading to the equality of Eq. 1. Although it would not be straightforward to directly measure the total entropy production $\Delta s_{\rm tot}(t)$ from each trajectory, if $\Delta s_{\rm tot}(t)$ is accessible for measurement, one can choose the entropy production as the output observable of interest such that $X(t) = \Delta s_{\rm tot}(t)$. Then, the TUR in Eq. 1 can be written as

$$Q = \frac{{\rm Var}(\Delta s_{\rm tot}(t))}{\langle \Delta s_{\rm tot}(t) \rangle} \geq 2. \tag{2}$$

Since the entropy production, an extensive variable, increases with time ($\Delta S_{\rm tot} = \langle \Delta s_{\rm tot}(t) \rangle \propto t$), the inequality between the variance and mean of entropy production, ${\rm Var}(\Delta s_{\rm tot}(t)) \geq 2 \langle \Delta s_{\rm tot}(t) \rangle$, demands that the variance of entropy production also increases with $t$ and that it cannot be smaller than twice the mean entropy production (see Fig. 1).

Meanwhile, nonequilibrium dynamical processes discussed in this article are expected to establish cyclic steady states as long as the time scale of observation is longer than a single cycle time. The total entropy $\Delta s_{\rm tot}(t)$ produced from such processes obeys the detailed fluctuation theorem (DFT),

$$\frac{P(\Delta s_{\rm tot})}{P(-\Delta s_{\rm tot})} = e^{\Delta s_{\rm tot}}, \tag{3}$$

which relates the probability density functions of entropy production from an irreversible process for time $t$ with the exponentiated total entropy production, where we drop the dependence of entropy production on $t$ for simplicity of notation. Rearranging the terms and integrating with respect to $\Delta s_{\rm tot}$ over $-\infty < \Delta s_{\rm tot} < \infty$ gives rise to the integral fluctuation theorem (IFT), $\langle e^{-\Delta s_{\rm tot}} \rangle = 1$:

$$\langle e^{-\Delta s_{\rm tot}} \rangle = \int_{-\infty}^{\infty} d(\Delta s_{\rm tot})\, e^{-\Delta s_{\rm tot}} P(\Delta s_{\rm tot}) = \int_{-\infty}^{\infty} d(\Delta s_{\rm tot})\, P(-\Delta s_{\rm tot}) = 1. \tag{4}$$

Here, $\langle \ldots \rangle$ denotes the average taken over the distribution of total entropy production from the process, $P(\Delta s_{\rm tot})$, and the relation should hold for any dynamical process. However, a special relation follows if the distribution of the total entropy production ($\Delta s_{\rm tot}$) is a Gaussian, i.e., $P(\Delta s_{\rm tot}) \sim \exp\left[ -\frac{(\Delta s_{\rm tot} - \langle \Delta s_{\rm tot} \rangle)^2}{2\,{\rm Var}(\Delta s_{\rm tot})} \right]$:

$$1 = \langle e^{-\Delta s_{\rm tot}} \rangle = \int_{-\infty}^{\infty} d(\Delta s_{\rm tot})\, e^{-\Delta s_{\rm tot}} P(\Delta s_{\rm tot}) = e^{-\langle \Delta s_{\rm tot} \rangle + {\rm Var}(\Delta s_{\rm tot})/2}, \tag{5}$$

from which the equality condition of Eq. 2 (${\rm Var}(\Delta s_{\rm tot}) / \langle \Delta s_{\rm tot} \rangle = Q = 2$) is attained when the distribution of entropy production follows precisely a Gaussian. Since the tail part of the distribution contributes to the evaluation of the exponentiated average, the distribution of the total entropy production $\Delta s_{\rm tot}$, the sum of $\Delta s_i$'s ($i = 1, 2, \ldots, t$), which are independently and identically distributed (i.i.d.) random variables ($\Delta s_{\rm tot} = \sum_{i=1}^{t} \Delta s_i$), should be assessed using the large deviation theory, an extension of the central limit theorem. In order for $P(\Delta s_{\rm tot})$ to be a perfect Gaussian including the tail parts, the distribution of entropy production from each cycle, $P(\Delta s_i)$, should also be a Gaussian, which is highly restrictive. Thus, the inequality ($Q > 2$), or ${\rm Var}(\Delta s_{\rm tot}) > 2 \langle \Delta s_{\rm tot} \rangle$, arises when $P(\Delta s_{\rm tot})$ deviates from Gaussianity. This condition of Gaussianity of the distribution of total entropy production is more general than the condition of detailed balance (DB) or equilibrium, the latter of which is often cited as the one giving rise to the equality condition of TUR for Markov processes. DB conditions giving rise to $Q = 2$ [...] $Q$ could be minimized (or sub-optimized) to a small value, if not $Q = 2$ [...] ($Q = 2$) is a strongly driven colloidal particle in periodic potentials.

FIG. 1. Properties of the distribution of total entropy production demanded by TUR. Two scenarios of time evolution of $P(\Delta s_{\rm tot}(t))$ from $t = t_i$ (blue) to $t = t_f$ (red). (i) As time increases, the distribution of entropy production is broadened. In this case, the TUR is obeyed with ${\rm Var}(\Delta s_{\rm tot}) \geq 2 \langle \Delta s_{\rm tot} \rangle$ at both $t = t_i$ and $t = t_f$. (ii) As time increases, the distribution of entropy production is narrowed down, such that ${\rm Var}(\Delta s_{\rm tot}) < 2 \langle \Delta s_{\rm tot} \rangle$ at $t = t_f$, which contradicts Eq. 2 and violates the TUR.

Second, we underscore the significance of TUR in light of the second law of thermodynamics. Using the definitions of the entropy production rate $\sigma_{\rm tot}(t) = \Delta S_{\rm tot}(t)/t$, and the mean and variance of the current, $\langle J \rangle = \langle X(t) \rangle / t$ and $\langle \delta J^2 \rangle = \langle \delta X(t)^2 \rangle / t$, one can rewrite the expression of TUR in Eq. 1 as follows.

$$Q = \sigma_{\rm tot}(t)\,\frac{\langle \delta J^2 \rangle}{\langle J \rangle^2} \geq 2. \tag{6}$$

Rearranged in the following way,

$$\sigma_{\rm tot}(t) \geq B \langle J \rangle^2 \tag{7}$$

with $B = 2 / \langle \delta J^2 \rangle$, the TUR conveys a remarkable message that the rate of entropy production from the dynamical process is bounded below by the square of the mean current multiplied by a prefactor $B$. This makes the physical bound of a thermodynamic process more explicit and tighter than the second law of thermodynamics ($\Delta S_{\rm tot} \geq 0$ or $\sigma_{\rm tot}(t) \geq 0$).

FIG. 2. $Q$ of simple enzyme reactions as a function of the substrate concentration $[S]$. Schematics of (a) the 1-state and (b) 2-state kinetic networks. (c)(d) $Q([S])$ of the 2-state kinetic network, for three pairs of $\gamma$ and $\xi$ values. In (c), $[S]$, normalized by $K_M$, shows that $Q([S])$ is locally minimized at $[S] \approx K_M$. In (d), $[S]$ is normalized by $[S]_{\rm eq}$ [$= (k_{\rm off} k_{\rm rev}) / (k_{\rm on} k_{\rm cat})$], the substrate concentration at the detailed balance condition, showing that $\lim_{[S] \to [S]_{\rm eq}} Q([S]) = 2$. The rate constants used for the plots are as follows: $k_{\rm on} =$ [...] M$^{-1}$ sec$^{-1}$, $k_{\rm off} = k_{\rm rev} = 10$ sec$^{-1}$, and $k_{\rm cat} =$ [...] sec$^{-1}$.

THE UNCERTAINTY PRODUCT Q AS A MEASURE OF EFFICIENCY OF ENZYMATIC PROCESSES
Many biological processes, which exhibit a net change in terms of chemical species, are driven by enzyme reactions that incur thermodynamic cost. To discuss the TUR in the context of enzyme reactions, it is convenient to recast the expression (Eq. 1) into

$$Q = \frac{\langle \Delta s_{\rm tot}(t) \rangle}{\langle \Delta n(t) \rangle}\,\frac{{\rm Var}(\Delta n(t))}{\langle \Delta n(t) \rangle} = \beta\mathcal{A}\,\frac{\langle \delta J^2 \rangle}{\langle J \rangle} = \beta\mathcal{A} \times \lambda \geq 2, \tag{8}$$

where we choose $\Delta n(t)$, the net number of catalyses that take place for time $t$, as the observable of interest, defining the mean and fluctuation of the reaction current as $\langle J \rangle = \langle \Delta n(t) \rangle / t$ and $\langle \delta J^2 \rangle = {\rm Var}(\Delta n(t)) / t$. The Fano factor associated with the number of catalyses or with the reaction current is defined as $\lambda = {\rm Var}(\Delta n(t)) / \langle \Delta n(t) \rangle = \langle \delta J^2 \rangle / \langle J \rangle$, and $\beta\mathcal{A} = \langle \Delta s_{\rm tot}(t) \rangle / \langle \Delta n(t) \rangle$, with the inverse temperature $\beta = (k_B T)^{-1}$, is the entropy production per cycle (or affinity) of the enzymatic reaction. Since $\langle J \rangle$ and $\langle \delta J^2 \rangle$ are associated with the speed ($\langle J \rangle \sim V$) and diffusivity ($\langle \delta J^2 \rangle \sim D$) of the enzymatic process, a dynamic process that operates at high speed, low fluctuation (high regularity), and low energetic cost is characterized by a small $Q$, and can be deemed more efficient the closer $Q$ is to 2.

Before exploring physically realistic processes, such as molecular motors and biomass producing machineries, we motivate the discussion by computing $Q$ for two simple versions of the enzymatic cycle.

(i) For the catalysis of the substrate $S$ in the ($N = 1$)-state kinetic network (Fig. 2a), the rate constants of the forward and backward steps are $k_{\rm on}$ and $k_{\rm off}$, respectively. The substrate concentration, $[S]$, can be varied independently as a control parameter.
Using the quantities required to evaluate $Q$ in Eq. 8,

$$\langle J \rangle = k_{\rm on}[S] - k_{\rm off}, \tag{9}$$

$$\langle \delta J^2 \rangle = k_{\rm on}[S] + k_{\rm off}, \tag{10}$$

$$\lambda = \frac{e^{\beta\mathcal{A}} + 1}{e^{\beta\mathcal{A}} - 1}, \tag{11}$$

$$\beta\mathcal{A} = \ln \frac{k_{\rm on}[S]}{k_{\rm off}}, \tag{12}$$

we obtain

$$Q = \beta\mathcal{A}\,\frac{e^{\beta\mathcal{A}} + 1}{e^{\beta\mathcal{A}} - 1} \geq 2. \tag{13}$$

At the DB condition ($[S] = [S]_{\rm eq} \equiv k_{\rm off} / k_{\rm on}$), no net catalysis ($\langle J \rangle = 0$) and no dissipation ($\mathcal{A} = 0$) occur, and the uncertainty product reaches its bound, $Q = 2$. As $[S]$ increases, breaking the DB condition, $\langle J \rangle$, $\mathcal{A}$, and $Q$ all increase monotonically, whereas the Fano factor $\lambda$ of Eq. 11 decreases toward 1.

(ii) For substrate catalysis through the ($N = 2$)-state kinetic network (Fig. 2b), the quantities required to evaluate $Q$ are:

$$\langle J \rangle = \frac{k_{\rm on} k_{\rm cat} [S] - k_{\rm rev} k_{\rm off}}{k_{\rm on}[S] + k_{\rm cat} + k_{\rm rev} + k_{\rm off}}, \tag{14}$$

$$\langle \delta J^2 \rangle = \frac{k_{\rm on} k_{\rm cat} [S] + k_{\rm rev} k_{\rm off} - 2 \langle J \rangle^2}{k_{\rm on}[S] + k_{\rm cat} + k_{\rm rev} + k_{\rm off}}, \tag{15}$$

$$\lambda = \frac{k_{\rm on} k_{\rm cat} [S] + k_{\rm rev} k_{\rm off}}{k_{\rm on} k_{\rm cat} [S] - k_{\rm rev} k_{\rm off}} - \frac{2 \left( k_{\rm on} k_{\rm cat} [S] - k_{\rm rev} k_{\rm off} \right)}{k_{\rm on}^2 \left( [S] + K_M + \frac{k_{\rm rev}}{k_{\rm on}} \right)^2}, \tag{16}$$

$$\beta\mathcal{A} = \ln \frac{k_{\rm on} k_{\rm cat} [S]}{k_{\rm rev} k_{\rm off}}, \tag{17}$$

with the Michaelis constant, $K_M = (k_{\rm cat} + k_{\rm off}) / k_{\rm on}$. When $Q$ is written as a function of the thermodynamic drive $\beta\mathcal{A}$,

$$Q = \beta\mathcal{A} \left[ \frac{e^{\beta\mathcal{A}} + 1}{e^{\beta\mathcal{A}} - 1} - \frac{2 \gamma^2 \left( e^{\beta\mathcal{A}} - 1 \right)}{\left( \gamma^2 + \gamma\xi + e^{\beta\mathcal{A}} \right)^2} \right], \tag{18}$$

with dimensionless parameters $\gamma = k_{\rm cat} / \sqrt{k_{\rm off} k_{\rm rev}}$ and $\xi = (k_{\rm off} + k_{\rm rev}) / \sqrt{k_{\rm off} k_{\rm rev}}$ ($\geq 2$). For a given $\xi$ ($\geq 2$) there exists a threshold value of $\gamma$, above which $Q$ becomes non-monotonic with $[S]$ (see Fig. 2c, and supplementary Fig. 2 in Ref. [...]). One can also show that for $\beta\mathcal{A} \ll 1$, $Q = 2 + \left[ \frac{1}{6} - \frac{2\gamma^2}{(1 + \gamma\xi + \gamma^2)^2} \right] (\beta\mathcal{A})^2 + \mathcal{O}\left[ (\beta\mathcal{A})^3 \right] \geq 2$, which confirms that $Q = 2$ at the DB condition ($\beta\mathcal{A} = 0$). For $\beta\mathcal{A} \gg 1$ and $k_{\rm on} \gg k_{\rm rev}$, $\lambda$ is simplified to

$$\lambda([S]) \simeq 1 - \frac{2 k_{\rm cat} [S]}{k_{\rm on} \left( [S] + K_M \right)^2}, \tag{19}$$

which is minimized at $[S] = K_M$ (Fig. 2c). Thus, when $k_{\rm on} k_{\rm cat} [S] \gg k_{\rm rev} k_{\rm off}$ ($\beta\mathcal{A} \gg 0$) and $k_{\rm on} \gg k_{\rm rev}$, $Q([S])$ has a local minimum and is sub-optimized at around $[S] \approx K_M$ (Fig. 2c).

The non-monotonic variations of $Q([S])$ seen in some of the biological motors and copy enzymes discussed below simply mean that the dynamical processes of these molecules display a Michaelis-Menten type hyperbolic dependence on $[S]$. The condition of $[S] \approx K_M$ in the Michaelis-Menten type reaction mechanism, $v([S]) = k_{\rm cat}[S] / (K_M + [S])$, corresponds to the point where the response of enzymatic activity to the logarithmic change in substrate concentration is maximized, i.e., $dv / d\log[S] = k_{\rm cat} (K_M / [S]) / (1 + K_M / [S])^2 \leq k_{\rm cat} / 4$. Importantly, a recent systems level analysis of metabolic pathways in eukaryotic cells has shown that the physiological substrate concentrations of many enzyme reactions are tuned near their respective $K_M$ values, which naturally explains the suboptimality of $Q([S])$ at $[S] \approx K_M$ for some processes discussed in the following sections.

EFFICIENCY OF TRANSPORT MOTORS
Biological motors are a class of enzymes that have an ability to transduce chemical free energy to mechanical motion via the catalysis of molecular fuels (ATP, GTP, etc.) present in the cellular milieu. Among them, kinesin-1 is arguably the most well-studied motor protein that transports cargos or organelles from the minus to plus ends of microtubules with the velocity of $V \approx$ [...] µm/sec, taking an 8 nm step for every ATP hydrolysis. Dynein moves with a similar speed, but in the opposite direction along microtubules, displaying more fluctuations in the time traces. Myosin families are the motor proteins specialized to move along actin filaments and generate mechanical forces. Whereas these motor proteins are specialized for linear movement and force generation, there are also rotary motors (e.g., F$_0$F$_1$-ATPase) that utilize the H$^+$ gradient across the membrane to generate rotational motion (or torque), which is, for example, used to power the beating dynamics of bacterial flagella.

The chemical free energy-driven dynamics of molecular motors is a perfect example whose efficiency can be assessed in light of TUR and the uncertainty product $Q$. The TUR in the original form (Eq. 1) with $X(t) = \Delta x(t)$ can be cast into the following form with $V = \langle \Delta x(t) \rangle / t$ and $D = [\langle \Delta x(t)^2 \rangle - \langle \Delta x(t) \rangle^2] / 2t$,

$$Q = \frac{2 \sigma_{\rm tot} D}{V^2} \geq 2. \tag{20}$$

According to this relation, a molecular motor that transports cargos with high velocity ($V$), low fluctuations ($D$), and low thermodynamic cost ($\sigma_{\rm tot}$) would be characterized by a small value of $Q$, and one could argue that a motor with a smaller $Q$ value is more efficiently designed as a transporter and thus has a higher transport efficiency than a motor with a higher $Q$.

Of note, the TUR can also be used to determine the upper bounds on the thermodynamic efficiency by means of $T \sigma_{\rm tot} \geq T V^2 / D$ (Eq. 20). Pietzonka et al. defined the thermodynamic efficiency ($\eta$) of a molecular motor in the presence of external load ($f$) as the ratio between the amount of work production ($\dot{W} = f V$) and an input chemical potential ($-\dot{\Delta\mu} = \dot{W} + T \sigma_{\rm tot}$), so as to obtain the upper bound of $\eta$ as follows.

$$\eta = \frac{\dot{W}}{\dot{W} + T \sigma_{\rm tot}} \leq \frac{1}{1 + \frac{T V}{f D}} = \eta_{\rm max}. \tag{21}$$

The maximum thermodynamic efficiency, $\eta_{\rm max}$, is obtained in the condition equivalent to $Q = 2$. From time traces $\{x(t)\}$ of a motor, the velocity ($V$) and diffusivity ($D$) are straightforwardly determined. Meanwhile, the rate of total entropy production $\sigma_{\rm tot}$ either (i) can be estimated, for example, from the fact that each step of kinesin-1, which occurs every $\sim 10$ msec, is generated from the hydrolysis free energy of a single ATP molecule $\sim$ [...] $k_B T$ ($\sigma_{\rm tot} \sim$ [...] $k_B T / 10$ msec), or (ii) can be calculated more systematically by knowing all the chemical rate constants on the kinetic network. For example, $\sigma_{\rm tot}$ for the ($N = 2$)-state kinetic network at $f = 0$ is

$$\sigma_{\rm tot} = \langle J \rangle \times \beta\mathcal{A} = \frac{k_{\rm on} k_{\rm cat} [S] - k_{\rm rev} k_{\rm off}}{k_{\rm on}[S] + k_{\rm cat} + k_{\rm rev} + k_{\rm off}} \log \left( \frac{k_{\rm on} k_{\rm cat} [S]}{k_{\rm rev} k_{\rm off}} \right). \tag{22}$$

For the case (ii), $\sigma_{\rm tot}$ of a general kinetic network with known transition rate constants can be obtained by a method developed by Koza.

As clearly gleaned from single molecule time traces of a molecular motor (e.g., kinesin, myosin, dynein), the velocity and fluctuations of time traces vary with ATP concentration ([ATP]) and external force ($f$). Thus, the value of $Q$ depends not only on the type of molecular process of interest, but also on its working condition. Fig. 3 shows that $Q$ is a complicated function of $f$ and [ATP]. Some features of $Q(f, [{\rm ATP}])$ of biological motors are noteworthy: (i) As clearly seen for the case of kinesin-1, besides the trivial minima at low [ATP] corresponding to the DB condition, $Q$ is sub-optimized at $f \approx$ [...] and [ATP] $\approx$ [...] µM. $Q(f =$ [...]$, [{\rm ATP}])$ plotted in Fig. 3e displays non-monotonic variation with [ATP]. A similar behavior is observed in $Q$ for dynein (Fig. 3c). As already discussed in the foregoing section, the non-monotonic variation of $Q([{\rm ATP}])$ is the outcome of the Michaelis-Menten type response of motor or enzymatic activity to substrate concentration. (ii) The cellular condition ($f \approx$ [...], [ATP] $\approx$ [...]) [...] $Q$ value region in the middle of the diagram ($Q >$ [...]) [...]. (iii) [...] ($\sigma_{\rm tot} > 0$). While the motor is motionless at mechanical equilibrium ($V \approx 0$) [...] ($\sigma_{\rm tot} > 0$) [...] $Q$ diverges (Eq. 20). (iv) $Q(f, [{\rm ATP}])$ of the kinesin-1 mutant (Fig. 3d), which has six additional amino acids inserted into the neck-linker, is drastically altered from that of the wild type, such that the value of $Q$ is increased and the stall condition is formed at smaller $f$ values. (v) Lastly, Fig. 3f compares the values of $Q$ for various biological motors at the cellular condition ($f \approx$ [...], [ATP] $\approx$ [...]) [...] $Q =$ [...]$-15$ for wild type motors. In particular, $Q =$ [...] efficiency with $Q = 19$. The efficiency of biological motors specialized for cargo/organelle transport is relatively higher than that of other machineries that will be explained below.

FIG. 3. Diagrams of $Q$ as a function of [ATP] and external load ($f$). Assisting and resisting loads correspond to $f < 0$ and $f > 0$, respectively. (a) kinesin-1, (b) myosin-V, (c) dynein, (d) kinesin-1 mutant. (e) The values of $Q$ as a function of ATP concentration at $f =$ [...]. (f) The tradeoff relations between $\Delta S_{\rm tot}$ and the relative uncertainty of displacement $x(t)$ are plotted for various molecular motors with their uncertainty products calculated at [ATP] $=$ [...] and $f =$ [...].

EFFICIENCY OF BIOLOGICAL COPY PROCESSES
As exemplified in DNA replication, transcription, and translation processes, which comprise the central dogma of molecular biology, some of the key information transfer processes in biology are nearly free from copy error even in the noisy cellular environment. For the case of DNA replication, the error probability ($\eta_{\rm eq}$) estimated solely from the stability difference between the correct and incorrect base pairs is at best $\eta_{\rm eq} \approx$ [...]; however, the actual error probability of incorporating incorrect bases into the copy strand is as small as $\eta =$ [...], which makes the replication of giant human DNA consisting of $N =$ [...] bases effectively error-free. To achieve the substantial error reduction from $\eta_{\rm eq}$ to $\eta$, a host of elaborate energy-expending molecular mechanisms are at work at every step of the biological copy processes.

From the viewpoint of information processing in biology, the error reduction is certainly an important issue; however, it in itself cannot be the sole goal of the biological copy processes given that biomass production, e.g., from DNA to RNA, and from RNA to proteins, is the major outcome of the processes. Excessive operations of error correction machineries would not only incur energetic cost, but also slow down the process of biomass production. There have recently been a number of studies devoted to understanding the strategy of biological copy processes to achieve the mutually contradicting goals of low copy error, high speed, and low energetic cost (Fig. 4a).

Here, we will view each step of a copying enzyme along a template polymer as that of a molecular motor stepping processively along a template filament.
Each substrate incorporating event can be split into several Markov jump steps on a cyclic kinetic network, at the end of which the copying enzyme transitions to the next position along the template strand. The forward motion of the enzyme is driven by the chemical potential of the biosynthetic substrates (dNTPs for the DNA polymerase, NTPs for the RNA polymerase, and charged tRNAs for the ribosome). Under the assumption that cellular homeostasis maintains the substrate concentrations constant but away from the DB condition, the dynamical process associated with synthesizing a copy strand can be modeled as a process operating at NESS with $[S]$ as the key parameter to be controlled.

FIG. 4. Biological copy processes and tradeoff relation. (a) Tradeoff relations in biological copy processes. Excessive amounts of error correction are expected to lead to higher energetic cost and slow biomass production. (b) The illustration of the DNA replication process (top), and a kinetic network to represent general copy processes with the possibility of accommodating either a correct or an incorrect substrate to the copy strand. The $c$-cycle (red) and $i$-cycle (blue) are for the correct and incorrect substrate incorporations, respectively. (c) Reaction currents along the subcycles of the network as a function of substrate concentration. Explicit calculations of currents were conducted using an example of translation of a codon CUG by ribosome. The figures b and c were adapted from Ref. [...].

Instead of the uni-cyclic kinetic network introduced earlier for enzymatic processes on a single type of substrate catalysis, additional cycles due to the chance of copying incorrect types of substrate to the strand are necessary for the copy processes described above. The schematic of the reaction network in Fig. 4b represents the one for copy processes, which takes into account the chance of incorporating incorrect substrates. The $c$ denotes the reaction cycle (red) associated with correct substrate incorporation, whereas $i$ is for the cycle (blue) associated with incorrect substrate incorporation. Provided that correct substrates are always accommodated to the copy enzyme without any futile attempt, the reaction current, more specifically the current associated with polymerization, $J^c_{\rm pol}$, flows only through the $c$-cycle. In this case, the probability of copy error would be 0 (Eq. 24, see below).
However, due to the stochastic nature of biochemical reactions, incorporation of an incorrect substrate to the copy polymer is still inevitable. An incorrect substrate incorporated to the enzyme-cofactor complex engenders the reaction current through the $i$-cycle. A proofreading mechanism expending free energy (e.g., GTP hydrolysis for the case of mRNA translation by ribosome) filters out the incorrect substrate from the system, generating a futile current ($J^i_{\rm fut}$); otherwise the substrate is accommodated into the copy polymer, generating a current associated with polymerization ($J^i_{\rm pol}$). The futile current along the $c$-cycle is conceivable as well, just like the case in which a molecular motor, say kinesin-1, occasionally fails to step in spite of ATP hydrolysis. Explicit calculations of the four currents in Fig. 4b conducted for the translation of the CUG codon by ribosome indicate that the sizes of the four currents are maintained in the following order, effectively over the whole range of GTP concentration (nM $\leq$ [GTP] $\leq 100$ mM), while the cellular concentration of GTP is [GTP] $\approx$ [...]:

$$\langle J^c_{\rm pol} \rangle \gg \langle J^i_{\rm fut} \rangle \gtrsim \langle J^c_{\rm fut} \rangle \gg \langle J^i_{\rm pol} \rangle. \tag{23}$$

Both $\langle J^c_{\rm pol} \rangle$ and $\langle J^i_{\rm pol} \rangle$ are reflected in the copied sequence, such that the error probability of the copy process is associated with the two mean currents as

$$\eta = \frac{\langle J^i_{\rm pol} \rangle}{\langle J^c_{\rm pol} \rangle + \langle J^i_{\rm pol} \rangle}, \tag{24}$$

which gives $\eta \approx$ [...] for the case of the CUG codon.

The free energy cost associated with the copy process represented by the schematic of the kinetic network in Fig. 4b is given as follows.

$$\beta\mathcal{A} = -\beta \left[ \Delta\mu_{\rm pol} + \frac{\langle J_{\rm fut} \rangle}{\langle J_{\rm pol} \rangle} \Delta\mu_{\rm fut} \right] - \eta \ln \eta - (1 - \eta) \ln (1 - \eta), \tag{25}$$

where $\langle J_{\rm pol} \rangle \equiv \langle J^c_{\rm pol} \rangle + \langle J^i_{\rm pol} \rangle$ and $\langle J_{\rm fut} \rangle \equiv \langle J^c_{\rm fut} \rangle + \langle J^i_{\rm fut} \rangle$ are the total reaction currents along the polymerization and futile cycles. Note that since $\Delta\mu_{\rm pol}$ and $\Delta\mu_{\rm fut}$, the chemical potentials associated with each cycle, are state functions, they are identical regardless of the cycle. The value of $\beta\mathcal{A}$ is determined by the ratio of currents $\langle J_{\rm fut} \rangle / \langle J_{\rm pol} \rangle$ as given in Eq. 25. Lastly, the Shannon entropy-like term, $S(\eta) = -\eta \ln \eta - (1 - \eta) \ln (1 - \eta)$, arises from the entropic drive created by a potential disorder in the copied sequence. However, considering that $S(1/2) = \log 2 \approx 0.69$ is the maximum, its actual contribution to the entire free energy ($\mathcal{A}$) is minor in biologically relevant parameter regimes. Eq. 25 clarifies that extra free energy cost is incurred with a larger amount of futile current, when the proofreading mechanisms are at work.

Finally, we look at the problem of biological copy processes through the lens of TUR and study how the error probability is balanced with dissipation, speed, and fluctuations. While there have been a number of studies on the relation between error probability, dissipation, and speed, somewhat less attention has been paid to the fluctuations emanating from the dynamical process. Large fluctuations in the reaction cycles of transcription and translation contributing to higher variability in the protein copy number could be detrimental to the fitness of an organism. Furthermore, given that the DNA replication of developing animals is meticulously synchronized across the cells, disarray of the cell cycle from other cells can be lethal. Overall, fluctuations associated with biological copy processes also have to be suppressed down to biologically acceptable levels, and this aspect is naturally taken into account by calculating the uncertainty product $Q$ of TUR.

FIG. 5. The uncertainty product $Q$ for biological copy processes. (a) DNA replication by T7 DNA polymerase. (b) Translation of codons by E. coli ribosome. The dashed lines indicate the cellular concentrations of aa-tRNA and GTP. (c) Proofreading step-modulated fluctuations in the translation times of E. coli ribosome ($T$) that reads the tufB mRNA sequence encoding the 394 amino acids of EF-Tu. In the two panels at the bottom are shown the error probability and uncertainty product as a function of the perturbation factor ($\alpha_{\rm PR}$) multiplied to the rate of the proofreading step. The figure was adapted from Ref. [...].

DNA replication.
Of the three biological copy processes, the most precise one is DNA replication, at the heart of which is the interaction between the polymerase and the exonuclease domains of the DNA polymerase (DNAP). In a nutshell, the exonuclease domain of the DNAP executes the proofreading step by preferentially cleaving off incorrect nucleotides, the incorporation of which slows down the action of the polymerase. Through a Michaelis-Menten approximation of all the chemical reactions catalyzed by the DNA polymerase, Gaspard has derived the dependence of the error probability, speed, and energetic cost on the substrate concentrations. Along similar lines, Oleg and colleagues further explored the reaction dynamics of the DNAP, by explicitly modeling the switching of the DNAP between the polymerase and exonuclease states. Results from both groups demonstrated that the copy errors were suppressed at the expense of the speed and the energetic cost. Furthermore, the analysis of rate constants of the wild type T7 DNA polymerase suggested that it was optimized for speed rather than accuracy.

Evaluating the fluctuation and the transport efficiency $Q$ of DNA polymerases together with the activity of the exonuclease requires a careful analysis of the kinetics involved with the switching among the polymerizing, proofreading, and paused states of the polymerase. However, since precise knowledge of the exonuclease mechanism still remains elusive, the analysis here is limited to the efficiency of the exonuclease-deficient, proofreading-free T7 DNAP. The analysis of the kinetic network of T7 DNAP finds that $Q$ for T7 DNAP is suboptimal ($Q \approx 10$) at [dNTP] $\approx$ [...] µM, which is in a similar range to dNTP at the cellular condition, $\mathcal{O}(10)$ µM $-$ $\mathcal{O}(1)$ mM. Notably, the error probability is already saturated to its minimal value $\eta \approx$ [...] when [dNTP] $\gtrsim$ [...] M. This suggests that the error probability is not the primary property to be optimized at the cellular condition. Instead, there is more room for molecular functions of T7 DNAP to be optimized for the dynamic properties integrated into the uncertainty product $Q$ of TUR.

RNA polymerases.
While synthesizing the RNA transcript complementary to the DNA sequence, the RNA polymerase (RNAP) translocates along the DNA sequence, maintaining a DNA bubble of 12-14 base pairs and an 8-9 base pair RNA-DNA hybrid double strand. Similarly to the DNAP, the polymerase and the exonuclease activities of the RNAP are combined to suppress copy errors. Upon incorporating an incorrect nucleotide, the polymerase activity slows down and the RNAP converts to a backtracked state that can no longer incorporate a new nucleotide. Only after removal of the erroneous NTP is the RNAP able to continue incorporating new nucleotides. Structural, biochemical, and theoretical studies have characterized the mechanisms of both nucleotide incorporation in the elongation complex and proofreading, highlighting the strong sequence dependence of the error rate and the pausing frequency of the polymerase. The effect of the exonuclease-mediated proofreading reactions on the error probability has been quantitatively evaluated through kinetic modeling. Along with the fluctuations of the RNAP activity, an analysis of RNAP dynamics in light of $\mathcal{Q}$ would also be of great interest.

mRNA translation by E. coli ribosome. The
E. coli ribosome is a well-characterized system that employs KPM to suppress copy errors. The ribosome synthesizes proteins from mRNAs by decoding the sequence information encoded in the 'codons'. For each of the 61 codons (excluding the 3 stop codons from the total of $4^3 = 64$ possible combinations), there exist potentially multiple 'correct' tRNAs with the matching anticodon, in complex with the encoded amino acid (aa), elongation factor (EF), and GTP. The binding of the correct aa-tRNA-EF-GTP complex to the ribosome-mRNA complex initiates the reaction cycle through which the amino acid is added to the elongating polypeptide sequence. Similarly, aa-tRNA-EF-GTP complexes with incorrect amino acids can undergo a reaction cycle parallel to that of the correct substrate, which leads to the incorporation of errors in the polypeptide sequence. Free energy released from GTP hydrolysis is used to execute KPM (see Ref. for the details of the kinetic proofreading mechanism).

The error probabilities associated with codon-anticodon pairings are already saturated to $\eta$ ≈ − − − for all the codons with respect to the variations in [aa-tRNA] and [GTP] (see Fig. S6 in Ref. ). As shown in Fig. 5b, the uncertainty products $\mathcal{Q}$ are again non-monotonic functions of both [aa-tRNA] and [GTP], and the corresponding cellular concentrations of [aa-tRNA] and [GTP] are found to be greater than the concentrations, [aa-tRNA]$^*$ and [GTP]$^*$, that give rise to the suboptimal values of $\mathcal{Q}$. Depending on the codon type, $\mathcal{Q}$ varies between 20 and 40 (Fig. 5b). The mRNA translation is realized when the ribosome translocates through a string of codons, accommodating correct types of aa-tRNA and forming new peptide bonds. The greater forward kinetic rates of the GTP hydrolysis and the polymerization along the correct cycle lower the error probability.

To study the process of mRNA translation by the E. coli ribosome in a more realistic fashion, it is possible to consider an extended version of the network model, which translates 42 species of aa-tRNAs into 20 different amino acids. With information on the concentrations of aa-tRNAs in the cellular milieu, Song et al. classified them into cognate, near-cognate, and non-cognate types and simulated the E. coli ribosome-mediated translation of the tufB mRNA sequence that encodes the 394-amino-acid EF-Tu (Fig. 4c). The effect of the proofreading step on the ribosome dynamics as well as on the error probability can be assessed by modulating the polymerization current, that is, by multiplying the associated rate constants by a factor $\alpha_{\rm PR}$ (see the general kinetic network for proofreading depicted in Fig. 4b). A similar perturbative analysis has previously been used to decipher which feature of the biomolecular process had been optimized throughout evolution. Although rate constants are not easy to tune in experiments, such modification happens throughout evolution by means of mutations to the ribosome, EF-Tu, and tRNA. Their simulation results demonstrated that the variations in the completion times of mRNA translation ($T$) depend critically on the parameter $\alpha_{\rm PR}$ that modulates the polymerization current. For $\alpha_{\rm PR}$ = ≈ 16 aa/sec and the error probability is $\eta$ ≈ − (Fig. 4c), in good agreement with those known from experimental measurements. While $\eta$ decreases monotonically with $\alpha_{\rm PR}$, $\mathcal{Q}$ is non-monotonic with $\alpha_{\rm PR}$, minimized near the wild-type condition. At $\alpha_{\rm PR}$ = E. coli ribosome is $\mathcal{Q} = 45$ (Fig. 4c). For the given kinetic parameters from the wild type (WT), $\mathcal{Q}$ is minimized to $\mathcal{Q} \sim 30$ at $\alpha_{\rm PR} \approx 5$. For $\alpha_{\rm PR}$ = − , the translation times display a much broader distribution than that of the wild type (Fig. 4c). Thus, it could be argued that the extent of proofreading of the WT is in a proper range, such that the fidelity of translation and the fluctuations of protein synthesis are simultaneously regulated. Fluctuations in the completion time for mRNA translation ($\langle \delta T^2 \rangle$) can be critical, as they translate into significant variation in protein copy number.

For the E. coli ribosome, the kinetic rate constants have evolved to optimize the speed ($\langle J_{\rm pol} \rangle$) over the accuracy ($\eta$), while increasing the energetic cost ($A$) only slightly above the minimal cost necessary for the polymerization reaction. Importantly, similar conclusions were drawn by a number of studies, each of which evaluated the translation process of the E. coli ribosome using different kinetic reaction networks. From the findings of maximized current, suppressed fluctuations, and a moderate increase of $A$, captured by the effectively sub-optimized value of $\mathcal{Q}$ (≈ − 50) at $\alpha_{\rm PR} \approx 1$, with $\eta$ being determined at biologically acceptable levels, it could be argued that the wild-type ribosome operates near a semi-optimal condition.

Gene regulation by transcription factors
Gene expression is regulated in large part by the binding of transcription factors (TFs) to the regulatory regions of DNA. The specificity with which TFs can activate target genes originates from the discriminatory binding of TFs to the non-specific regulatory regions of the DNA. With the assumption that the relative binding affinities of the TF to the target and the non-target sequences differ only by their respective dissociation rates, $k^c_{\rm off}$ and $k^i_{\rm off}$, the minimal error probability of transcription can be approximated by $\eta \approx \eta_0 = [1 + k^i_{\rm off}/k^c_{\rm off}]^{-1}$. Shelansky and Boeger recently proposed a mechanism in gene regulation by TFs and nucleosomes that can reduce $\eta$ at the expense of extra free energy cost. Nucleosomes, the regulatory structures that reversibly bind to and unbind from DNA, are generally known to suppress transcriptional activity in the bound state. In Shelansky and Boeger's model, transcription is assumed to occur only when the DNA is bound to the transcription factor and free from the nucleosome (Fig. 6a). Briefly, when the nucleosome can be more easily removed from the TF-bound regulatory sequences than from the TF-absent ones ($\alpha > \beta$ in Fig. 6a), a kinetic proofreading-like mechanism can reduce the minimum error rate below $\eta_0$. The nucleosome removal in the TF-bound sequences is predicted to be driven by the activity of ATP-consuming chromatin remodelers, which are recruited by the TF.

FIG. 6. Error probability and transcription noise in non-equilibrium gene regulation by transcription factors (TFs). (a) Schematic of gene regulation by a combination of the nucleosome (NC) and TF occupancy. Transcriptional activity is assumed to occur only when the regulatory sequence is released from the NC and bound to the transcription factor (TF). A parallel network for incorrect gene expression also exists, in which the TF unbinds with rate $k^i_{\rm off}$. (b) The error probability ($\eta$) and the Fano factor of correct gene expression ($\lambda$) plotted as functions of $\alpha$. When the NC is removed preferentially in the TF-bound state (i.e., $\alpha > \beta$), the error in gene expression can be reduced below $\eta_0 \equiv [1 + k^i_{\rm off}/k^c_{\rm off}]^{-1}$. Other than $\alpha$, the remaining parameters are set as follows: $k_{\rm on}$ = − , $k^c_{\rm off}$ = − , $k^i_{\rm off} = 100$ sec$^{-1}$, $k^{\rm NC}_{\rm on}$ = − , $\beta$ = . − , and $v$ = − .

Fig. 6 plots the error probability and noise in gene expression with an increasing thermodynamic drive. In parallel to Eq. 24, the error probability is defined as $\eta \equiv J^i_{\rm trans}/(J^i_{\rm trans} + J^c_{\rm trans})$, where $J^i_{\rm trans}$ and $J^c_{\rm trans}$ are the reaction currents associated with the incorrect and correct gene expression. Next, the noise in the correct gene expression is defined by the Fano factor $\lambda \equiv \langle (\delta J^c_{\rm trans})^2 \rangle / \langle J^c_{\rm trans} \rangle$. To increase the thermodynamic drive, we increase the nucleosome unbinding rate from TF-bound sequences ($\alpha$), while keeping the rest of the parameters constant. Close to the equilibrium condition of $\alpha = \beta$, increasing $\alpha$ reduces $\lambda$ and $\eta$ simultaneously (Fig. 6b). Thus, similarly to the E. coli ribosome, KPM can simultaneously reduce the error probability and the relative fluctuation of gene expression. An analysis in the framework of the TUR will be useful in quantifying trade-off relations among the error probability, transcription noise, and the energetic cost of gene regulation by TF binding.

Molecular chaperone-assisted folding of proteins
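The error measures defined in the preceding gene-regulation discussion can be evaluated numerically. The following is a minimal Python sketch, under stated assumptions, of the equilibrium bound $\eta_0 = [1 + k^i_{\rm off}/k^c_{\rm off}]^{-1}$, the current-based error probability $\eta = J^i_{\rm trans}/(J^i_{\rm trans} + J^c_{\rm trans})$, and the Fano factor $\lambda$ estimated from sampled values. The function names, the sample data, the currents, and the value $k^c_{\rm off} = 1$ sec$^{-1}$ are illustrative assumptions and are not taken from the source; only $k^i_{\rm off} = 100$ sec$^{-1}$ is quoted in the caption of Fig. 6. The sketch does not implement the Shelansky-Boeger network itself.

```python
from statistics import mean, pvariance
from typing import Sequence


def equilibrium_error(k_off_correct: float, k_off_incorrect: float) -> float:
    """Equilibrium bound eta_0 = [1 + k_off^i / k_off^c]^(-1)."""
    return 1.0 / (1.0 + k_off_incorrect / k_off_correct)


def error_probability(j_incorrect: float, j_correct: float) -> float:
    """eta = J^i / (J^i + J^c) for the incorrect and correct currents."""
    return j_incorrect / (j_incorrect + j_correct)


def fano_factor(samples: Sequence[float]) -> float:
    """lambda = <(dJ)^2> / <J>, estimated from a set of sampled values."""
    return pvariance(samples) / mean(samples)


if __name__ == "__main__":
    K_OFF_CORRECT = 1.0      # sec^-1, hypothetical value for illustration only
    K_OFF_INCORRECT = 100.0  # sec^-1, value quoted in the Fig. 6 caption
    print("eta_0  =", equilibrium_error(K_OFF_CORRECT, K_OFF_INCORRECT))
    # Hypothetical currents and samples, chosen only to exercise the functions.
    print("eta    =", error_probability(j_incorrect=1.0, j_correct=99.0))
    print("lambda =", fano_factor([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

With a 100-fold difference in dissociation rates, the equilibrium bound is $1/101$, so any error probability below that value would have to come from a driven, proofreading-like mechanism such as preferential nucleosome removal.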
In an appropriate environmental condition, small single-domain proteins in general can reversibly fold and unfold, and reach their native state within biologically relevant time scales. Yet, there is still a class of proteins that are prone to misfold and aggregate, whose presence can be detrimental to the organisms. For such proteins (e.g., Rubisco, malate dehydrogenase), only a small fraction ($\Phi \ll 1$; $\Phi \approx 0.05$ for Rubisco) of the population can reach the native state, and the remaining fraction of the population ($1 - \Phi$) is kinetically trapped in misfolded states (Fig. 7). Molecular chaperones, which employ the free energy sources ubiquitous in cells to change their conformations and interact with the molecules in misfolded states, can change the population entirely and sustain the cellular environment in good condition.

FIG. 7. Molecular chaperone as an error-reducing machine. (a) Rugged folding landscape of biomolecules that visualizes the native and misfolded basins of attraction. As a result of spontaneous folding, molecules are partitioned with the fractions $\Phi$ and $1 - \Phi$ to native and misfolded states, respectively. (b) Iterative annealing by GroEL. The figure (a) was adapted from Ref. .

One of the most well-studied protein chaperones, the bacterial GroES-GroEL chaperonin system, interacts exclusively with the misfolded population of proteins, providing them with another chance to repeat the folding process. As a result of an initial interaction of the chaperone with the misfolded protein population, out of the misfolded population $1 - \Phi$ from the first round of the folding process, $\Phi(1 - \Phi)$ would fold into the native state, and $(1 - \Phi)^2$ would again be trapped in the misfolded states. When this process is repeated $N$ ($= t/\tau$) times, where $\tau$ is the time associated with a single cycle and $t$ is the time duration, the fraction $(1 - \Phi)^N$ still remains misfolded and hence the fraction $1 - (1 - \Phi)^N$ is folded. The fraction of native population increases from its originally small yield $Y_1 = \Phi$ ($\ll 1$) at $N = 1$ as

$$Y_N = 1 - (1 - \Phi)^N \xrightarrow{\Phi \ll 1} 1 - e^{-\Phi t/\tau}. \qquad (26)$$

Via this mechanism, called the iterative annealing mechanism (IAM), a native yield of unity is finally reached when $N \to \infty$ or $t \to \infty$. From the perspective of information processing, molecular chaperones are another elegantly designed error-reducing machinery. For every cycle, which lasts about 2 sec, GroEL, made of two heptameric rings, presumably consumes at least 3 − ≈ − $k_BT$ of dissipation. Furthermore, since successful conversion of the misfolded to the folded state is not guaranteed at each cycle, the dissipation per cycle estimated above is only a lower bound. In fact, the full conversion time to the native state is $\tau/\Phi$. Given that $\tau \approx 2$ sec and $\Phi = 0.05$ for Rubisco, $\tau/\Phi = 40$ sec. Since the number of successful conversions to the native state usually obeys Poisson statistics, the Fano factor of the net fraction of conversion $\Delta Y_N(t)$ is $\lambda = \langle (\delta \Delta Y_N)^2(t) \rangle / \langle \Delta Y_N(t) \rangle \approx O(1)$. Thus, a rough estimate of $\mathcal{Q}$ for GroEL-assisted protein folding is $\mathcal{Q} \gtrsim (1/\Phi)$ × (60 − ≈ × .

FIG. 8. A spectrum of uncertainty product ($\mathcal{Q}$) computed for transport motors, T7 DNA polymerase, E. coli ribosome, molecular chaperone, and biochemical oscillators. The theoretical lower bound of TUR is specified at $\mathcal{Q} = 2$. The ranges of $\mathcal{Q}$ for the family of molecular motors and biochemical oscillators are shaded in red and green, respectively.

CONCLUSIONS
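As a numerical companion to the iterative annealing mechanism (Eq. 26) discussed in the chaperone section, the following minimal Python sketch compares the discrete yield $Y_N = 1 - (1 - \Phi)^N$ with its small-$\Phi$ exponential limit and evaluates the full conversion time $\tau/\Phi$. The function names are hypothetical; the values $\Phi = 0.05$ (Rubisco) and a cycle time of about 2 sec are the ones quoted in the text. It is an illustration of the formula only, not a model of GroEL kinetics.

```python
import math


def native_yield(phi: float, n_cycles: int) -> float:
    """Discrete IAM yield Y_N = 1 - (1 - phi)^N after N chaperonin cycles."""
    return 1.0 - (1.0 - phi) ** n_cycles


def native_yield_continuous(phi: float, t: float, tau: float) -> float:
    """Small-phi limit of the IAM yield: Y(t) = 1 - exp(-phi * t / tau)."""
    return 1.0 - math.exp(-phi * t / tau)


def conversion_time(phi: float, tau: float) -> float:
    """Full conversion time to the native state, tau / phi."""
    return tau / phi


if __name__ == "__main__":
    PHI = 0.05  # partition factor quoted for Rubisco
    TAU = 2.0   # seconds per cycle, as quoted in the text
    for n in (1, 10, 20, 100):
        t = n * TAU  # elapsed time after n cycles
        print(n, native_yield(PHI, n), native_yield_continuous(PHI, t, TAU))
    print("conversion time (sec):", conversion_time(PHI, TAU))
```

For small $\Phi$ the two expressions stay close at every $N$, which is why the exponential form with the single time scale $\tau/\Phi$ is a convenient summary of the process.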
Under evolutionary pressure, biological processes are often confronted with situations in which they must balance a number of competing options. Here, we have reviewed the recent studies on the trade-off relations of these features in biological motors and biological dynamics involved in information processing, from the perspective of TUR, which offers the uncertainty product $\mathcal{Q}$ as a measure of the efficiency integrating the cost and precision of the processes. Importantly, the physical lower bound of $\mathcal{Q}$ provides an absolute scale onto which we can map the efficiencies of diverse biomolecular processes. Biological motors, specialized for cargo transport, are found to operate close to the lower bound ($\mathcal{Q} = 2$) even at NESS, with $\mathcal{Q}$ ≈ − . Compared with biological motors that utilize ATP hydrolysis free energy ( ∼ $k_BT$ ≈ that use > $\mathcal{Q}$. The exonuclease-deficient T7 DNA polymerase operates at $\mathcal{Q} \approx 10$, and the ribosome operates at $\mathcal{Q}$ = − . The value of $\mathcal{Q}$ for molecular chaperones is estimated to be rather large, $\mathcal{Q}$ ≳ O(10 ), mainly due to the large cost of operating the reaction cycle of chaperones. Although it was not discussed in this article, Marsland et al. have evaluated the values of $\mathcal{Q}$ for several biochemical oscillators by selecting the oscillation period as their output observable of interest. Some of the biochemical oscillators severely underperform the efficiency bound; for example, the KaiABC system works at $\mathcal{Q}$ ≳ O(10 ). For biological motors, both the thermodynamic cost and the precision of the processes are valuable quantities to be balanced, giving rise to relatively small values of $\mathcal{Q}$; however, as seen in the examples of molecular chaperones and biochemical oscillators, the precision of the processes is functionally more of a priority than the thermodynamic cost for some biological processes. Along with the spectrum of $\mathcal{Q}$ recapitulating our survey of various processes in this article (Fig. 8), the $\mathcal{Q}$ values of more dynamical processes will help to better glean the design principles underlying life-sustaining cellular processes.

ACKNOWLEDGMENTS
This work was supported by the KIAS individual GrantsCG067102 (YS) and CG035003 (CH) at Korea Institute forAdvanced Study. We thank the Center for Advanced Compu-tation in KIAS for providing computing resources. Alberts, B, Johnson, A, Lewis, J, Ra ff , M, Roberts, K, & Walter, P, Molec-ular Biology of the Cell , Garland Science, 5th edition, 2008. Bustamante, C, Liphardt, J, & Ritort, F,
Physics Today , 43 (2005). Mugnai, M. L, Hyeon, C, Hinczewski, M, & Thirumalai, D,
Rev. Mod.Phys. , 025001 (2020). Beard, D. A & Qian, H,
Chemical Biophysics: Quantitative Analysis ofCellular Systems , Cambridge University Press, 2008. Kolomeisky, A. B,
Motor Proteins and Molecular Motors , CRC Press,Boca Raton, 2015. Hyeon, C & Onuchic, J. N,
Biophys. J. , 2749 (2011). Hopfield, J. J,
Proc. Natl. Acad. Sci. USA , 4135 (1974). Ninio, J,
Biochimie , 587 (1975). Bennett, C. H,
Int. J. Theor. Phys. , 905 (1982). Sagawa, T,
Prog. Theor. Phys. , 1 (2012). Lan, G, Sartori, P, Neumann, S, Sourjik, V, & Tu, Y,
Nature physics , 422(2012). Flamholz, A, Noor, E, Bar-Even, A, Liebermeister, W, & Milo, R,
Proc.Natl. Acad. Sci. U. S. A. , 10039 (2013). Sartori, P, Granger, L, Lee, C. F, & Horowitz, J. M,
PLoS Comput Biol ,e1003974 (2014). Cao, Y, Wang, H, Ouyang, Q, & Tu, Y,
Nature Phys. , 772 (2015). Zhang, D, Cao, Y, Ouyang, Q, & Tu, Y,
Nature Phys. , 95 (2020). Hong, H, Jo, J, Hyeon, C, & Park, H,
J. Stat. Mech. , 074001 (2020). Basan, M, Honda, T, Christodoulou, D, H¨orl, M, Chang, Y.-F, Leoncini,E, Mukherjee, A, Okano, H, Taylor, B. R, Silverman, J. M, et al.,
Nature , 470 (2020). Barato, A. C & Seifert, U,
Phys. Rev. Lett. , 158101 (2015). Gingrich, T. R, Horowitz, J. M, Perunov, N, & England, J. L,
Phys. Rev.Lett. , 120601 (2016). Horowitz, J. M & Gingrich, T. R,
Nat. Phys. , 15 (2019). Hasegawa, Y & Van Vu, T,
Phys. Rev. Lett. , 110602 (2019). Hasegawa, Y & Van Vu, T,
Phys. Rev. E , 062126 (2019). Proesmans, K & Van den Broeck, C,
EPL , 20001 (2017). Lee, J. S, Park, J.-M, & Park, H,
Phys. Rev. E , 062132 (2019). Agarwalla, B. K & Segal, D,
Phys. Rev. B , 155438 (2018). Lee, S, Hyeon, C, & Jo, J,
Phys. Rev. E , 032119 (2018). Potts, P. P & Samuelsson, P,
Phys. Rev. E , 052137 (2019). Koyuk, T, Seifert, U, & Pietzonka, P,
J. Phys. A: Math. Theor. , 02LT02(2018). Paneru, G, Dutta, S, Tlusty, T, & Pak, H. K,
Phys. Rev. E , 032126(2020). Hwang, W & Hyeon, C,
J. Phys. Chem. Lett. , 513 (2018). Song, Y & Hyeon, C,
J. Phys. Chem. Lett. , 3136 (2020). Hyeon, C & Hwang, W,
Phys. Rev. E. , 012156 (2017). Pietzonka, P, Ritort, F, & Seifert, U,
Phys. Rev. E. , 012101 (2017). Manikandan, S. K, Gupta, D, & Krishnamurthy, S,
Phys. Rev. Lett. ,120603 (2020). Seifert, U,
Rep. Prog. Phys. , 126001 (2012). Seifert, U,
Phys. Rev. Lett. , 040602 (2005). Touchette, H,
Physics Reports , 1 (2009). Speck, T & Seifert, U,
J. Phys. A: Math. General , L581 (2005). Pigolotti, S, Neri, I, Rold´an, ´E, & J¨ulicher, F,
Phys. Rev. Lett. , 140604(2017). Speck, T, Blickle, V, Bechinger, C, & Seifert, U,
EPL , 30002 (2007). Dechant, A & Sasa, S.-i,
J. Stat. Mech. , 063209 (2018). Dechant, A & Sasa, S.-i,
Phys. Rev. E , 062101 (2018). Fisher, M. E & Kolomeisky, A. B,
Proc. Natl. Acad. Sci. U. S. A. , 7748 (2001). Hwang, W & Hyeon, C,
J. Phys. Chem. Lett. , 250 (2017). Park, J. O, Rubin, S. A, Xu, Y.-F, Amador-Noguez, D, Fan, J, Shlomi, T, & Rabinowitz, J. D,
Nat. Chem. Biol. , 482 (2016). Howard, J,
Mechanics of motor proteins and the cytoskeleton , Sinauer Associates, Sunderland, MA, 2001. Pietzonka, P, Barato, A. C, & Seifert, U,
J. Stat. Mech. Theory Exp. , 124004 (2016). Schnakenberg, J,
Rev. Mod. Phys. , 571 (1976). Koza, Z,
J. Phys. A: Math. Gen. , 7637 (1999). Schnitzer, M. J & Block, S. M,
Nature , 386 (1997). Visscher, K, Schnitzer, M. J, & Block, S. M,
Nature , 184 (1999). Schnitzer, M. J, Visscher, K, & Block, S. M,
Nature Cell Biol. , 718 (2000). Carter, N. J & Cross, R. A,
Nature , 308 (2005). Hyeon, C, Klumpp, S, & Onuchic, J. N,
Phys. Chem. Chem. Phys. , 4899 (2009). Sumi, T & Klumpp, S,
Nano Lett. , 3370 (2019). Goodman, M. F,
Proc. Natl. Acad. Sci. USA , 10493 (1997). Kunkel, T. A & Bebenek, K,
Annu. Rev. Biochem. , 105 (2000). Ibarra, B, Chemla, Y. R, Plyasunov, S, Smith, S. B, Lázaro, J. M, Salas, M, & Bustamante, C,
EMBO J. , 2794 (2009). Shaevitz, J. W, Abbondanzieri, E. A, Landick, R, & Block, S. M,
Nature , 684 (2003). Blanchard, S. C, Gonzalez, R. L, Kim, H. D, Chu, S, & Puglisi, J. D,
Nat. Struct. Biol. , 1008 (2004). Cvetesic, N, Perona, J. J, & Gruic-Sovulj, I,
J. Biol. Chem. , 25381 (2012). Murugan, A, Huse, D. A, & Leibler, S,
Proc. Natl. Acad. Sci. USA , 12034 (2012). Gaspard, P,
Phys. Rev. E , 042420 (2016). Gaspard, P,
Phys. Rev. E , 042419 (2016). Banerjee, K, Kolomeisky, A. B, & Igoshin, O. A,
Proc. Natl. Acad. Sci. USA , 5183 (2017). Wong, F, Amir, A, & Gunawardena, J,
Phys. Rev. E , 012420 (2018). Mallory, J. D, Kolomeisky, A. B, & Igoshin, O. A,
J. Phys. Chem. B , 4718 (2019). Bennett, B. D, Kimball, E. H, Gao, M, Osterhout, R, Van Dien, S. J, & Rabinowitz, J. D,
Nat. Chem. Biol. , 593 (2009). Soltani, M, Vargas-Garcia, C. A, Antunes, D, & Singh, A,
PLoS Comput. Biol. , e1004972 (2016). Co, A. D, Lagomarsino, M. C, Caselle, M, & Osella, M,
Nuc. Acids. Res. , 1069 (2017). Fraser, H. B, Hirsh, A. E, Giaever, G, Kumm, J, & Eisen, M. B,
PLoS Biol. , 0834 (2004). Hausser, J, Mayo, A, Keren, L, & Alon, U,
Nat. Commun. , 68 (2019). Blythe, S. A & Wieschaus, E. F,
Cell , 1169 (2015). Djabrayan, N. J, Smits, C. M, Krajnc, M, Tomer, S, Yamada, S, Lemon, W. C, Keller, P. J, Rushlow, C. A, & Shvartsman, S. Y,
Curr. Biol. , 1193 (2019). Johnson, K. A,
Annu. Rev. Biochem. , 685 (1993). Piñeros, W. D & Tlusty, T,
Phys. Rev. E , 022415 (2020). Hoekstra, T. P, Depken, M, Lin, S. N, Cabanas-Danés, J, Gross, P, Dame, R. T, Peterman, E. J, & Wuite, G. J,
Biophys. J. , 575 (2017). Bochner, B. R & Ames, B. N,
J. Biol. Chem. , 9759 (1982). Buckstein, M. H, He, J, & Rubin, H,
J. Bacteriol. , 718 (2008). Schaaper, R. M & Mathews, C. K,
DNA Repair , 73 (2013). Abbondanzieri, E. A, Greenleaf, W. J, Shaevitz, J. W, Landick, R, & Block, S. M,
Nature , 460 (2005). Chen, J, Darst, S. A, & Thirumalai, D,
Proc. Natl. Acad. Sci. U. S. A. , 12523 (2010). Sydow, J. F & Cramer, P,
Curr. Op. Struct. Biol. , 732 (2009). Mellenius, H & Ehrenberg, M,
Nuc. Acids. Res. , 11582 (2017). Rodnina, M. V,
Csh. Perspect. Biol. , a032664 (2018). Rudorf, S, Thommen, M, Rodnina, M. V, & Lipowsky, R,
PLoS Comput. Biol. , e1003909 (2014). Wohlgemuth, I, Pohl, C, Mittelstaet, J, Konevega, A. L, & Rodnina, M. V,
Philos. Trans. R. Soc. B , 2979 (2011). Mallory, J. D, Igoshin, O. A, & Kolomeisky, A. B,
J. Phys. Chem. B , 9289 (2020). Young, R & Bremer, H,
Biochem. J. , 185 (1976). Bouadloun, F, Donner, D, & Kurland, C,
EMBO J. , 1351 (1983). Shelansky, R & Boeger, H,
Proc. Natl. Acad. Sci. USA , 2456 (2020). Zhou, C. Y, Johnson, S. L, Gamarra, N. I, & Narlikar, G. J,
Annual Review of Biophysics , 153 (2016). Thirumalai, D & Hyeon, C,
Biochemistry , 4957 (2005). Anfinsen, C. B & Scheraga, H. A,
Adv. Protein Chem. , 205 (1975). Thirumalai, D & Lorimer, G. H,
Ann. Rev. Biophys. Biomol. Struct. , 245 (2001). Chakrabarti, S, Hyeon, C, Ye, X, Lorimer, G, & Thirumalai, D,
Proc. Natl. Acad. Sci. U. S. A. , E10919 (2017). Todd, M. J, Lorimer, G. H, & Thirumalai, D,
Proc. Natl. Acad. Sci. U. S. A. , 4030 (1996). Goloubinoff, P, Sassi, A. S, Fauvet, B, Barducci, A, & De los Rios, P,
Nat. Chem. Biol. , 388 (2018). Hyeon, C & Thirumalai, D,
J. Chem. Phys. , 121924 (2013).
Korobko, I, Mazal, H, Haran, G, & Horovitz, A,
Elife , e56511 (2020). Ye, X & Lorimer, G. H,
Proc. Natl. Acad. Sci. U. S. A. , E4289 (2013).
Fei, X, Ye, X, LaRonde, N. A, & Lorimer, G. H,
Proc. Natl. Acad. Sci. U. S. A. , 12776 (2014).
Marsland, R, Cui, W, & Horowitz, J. M,
J. R. Soc. Interface , 20190098 (2019). Kudernac, T, Ruangsupapichat, N, Parschau, M, Maciá, B, Katsonis, N, Harutyunyan, S. R, Ernst, K.-H, & Feringa, B. L,
Nature 479