Protein escape at the ribosomal exit tunnel: Effect of the tunnel shape
Phuong Thuy Bui (1, 2) and Trinh Xuan Hoang (3, 4, a)
1) Institute of Theoretical and Applied Research, Duy Tan University, Hanoi 100000, Vietnam
2) Faculty of Pharmacy, Duy Tan University, Da Nang 550000, Vietnam
3) Institute of Physics, Vietnam Academy of Science and Technology, 10 Dao Tan, Ba Dinh, Hanoi 11108, Vietnam
4) Graduate University of Science and Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Hanoi 11307, Vietnam
(Dated: 25 August 2020)
We study the post-translational escape of nascent proteins at the ribosomal exit tunnel, considering a realistically shaped atomistic tunnel based on the Protein Data Bank (PDB) structure of the large ribosomal subunit of the archaeon Haloarcula marismortui. Molecular dynamics simulations employing a Go-like model for the proteins show that at intermediate and high temperatures, including a presumable physiological temperature, the protein escape process at the atomistic tunnel is quantitatively similar to that at a cylinder tunnel of length L = 72 Å and diameter d = 16 Å. At low temperatures, however, the atomistic tunnel yields an increased probability of protein trapping inside the tunnel, whereas the cylinder tunnel does not cause trapping. All-β proteins tend to escape faster than all-α proteins, but this difference is blurred as the protein's chain length increases. A 29-residue zinc-finger domain is shown to be severely trapped inside the tunnel. Most of the single-domain proteins considered, however, can escape efficiently at the physiological temperature, with the escape time distribution following the diffusion model proposed in our previous works. An extrapolation of the simulation data to a realistic value of the friction coefficient for amino acids indicates that the escape times of globular proteins are at the sub-millisecond scale. It is argued that this time scale is short enough for the smooth functioning of the ribosome, as it does not allow nascent proteins to jam the ribosome tunnel.

I. INTRODUCTION
After the determination of the ribosome structures, there has been growing attention to understanding the role of the ribosomal exit tunnel in protein biosynthesis and in cotranslational protein folding (for recent reviews see Ref. ). Biochemical studies indicate that the tunnel plays an active role in the regulation of the protein translation process by blocking specific peptide sequences. The mechanism of this blocking, or translation arrest, can be associated with the motion of certain ribosomal proteins, which alters the tunnel shape. In cotranslational protein folding, the tunnel imposes a spatial confinement on the traversing nascent peptide while it is being synthesized by the ribosome. The narrow geometry of the tunnel was suggested to entropically promote α-helix formation, whereas it may sterically obstruct the formation of β-sheets. Depending on the location within the tunnel, peptides can form simple α-helices and small tertiary structure units. The latter can be observed near the tunnel exit port, where there is enough space to hold the structure. Simulations and experiments indicate that cotranslational folding starts inside the tunnel, with the structures ranging from a non-native compact conformation and transient tertiary structures to a small protein domain. There are also considerations that the folding inside the ribosome tunnel is negligible, leading to a focus only on the folding of nascent chains as they emerge from the tunnel, as shown in studies with stalled ribosome-bound nascent chain experiments and simulations.

a) Corresponding author. E-mail: [email protected]

In recent works, we suggested that the exit tunnel, as a passive conduit, has a significant impact on the early post-translational folding, i.e., shortly after the protein's C-terminus is released from the peptidyl transferase center (PTC).
This impact corresponds to a vectorial folding induced by the tunnel and is associated with the escape process of a full-length protein from the tunnel. In particular, the folding and escape of a nascent protein at the tunnel are concomitant with each other. Folding accelerates the escape process, whereas a gradual escape improves the protein foldability. Furthermore, we showed that the protein escape at the tunnel is governed by a diffusion mechanism and that the escape time distribution can be captured by a simple model of a Brownian particle in a linear potential field.

The escape process is also important in its own right. It should not be too quick, because this would leave an escaped protein significantly unfolded outside the ribosome, thereby increasing the chance of protein aggregation. Nor can it be too slow, because this would decrease the productivity of the ribosome. Interestingly, our previous study shows that the real length of the ribosome exit tunnel is close to a cross-over tunnel length of 90 to 110 Å for the diffusion of small globular proteins. For tunnels longer than this cross-over length, the diffusion is much slower. Thus, it was suggested that the ribosome tunnel length may have been selected by evolution to facilitate an appropriate escape time.

The previous works on the protein escape process considered a highly simplified model of the ribosomal exit tunnel, namely a hollow cylinder with a repulsive wall. The real exit tunnel is highly porous for water-size molecules and effectively adopts an irregular shape for polypeptides. The tunnel shape also depends on the type of organism. The aim of our present study is to work with a realistic tunnel to test the validity of the previous findings and to investigate the effect of the tunnel shape on the protein escape process.
For this purpose, we consider an atomistic model of the tunnel based on the resolved PDB structure of the large ribosomal subunit of Haloarcula marismortui. The atomistic tunnel incorporates all heavy atoms of the ribosomal RNA but only the Cα atoms of the ribosomal proteins. The Cα-only representation is also used for nascent proteins within a standard Go-like model. We have chosen the same coarse-grained level for ribosomal proteins for consistency with the chosen Go-like model. This choice, however, reduces the roughness of the tunnel's wall where amino acid side-chains are exposed. In order to accelerate the simulations, all the ribosomal atoms are kept fixed; thus, the tunnel acts solely as a passive channel for the escaping proteins in our consideration.

The main focus of our paper is on the effect of the tunnel shape on the escape process. In order to delineate this effect, we compared the escape of a nascent protein at the atomistic tunnel with that at an equivalent cylinder tunnel. The latter is chosen such that it produces a median escape time similar to that obtained with the atomistic tunnel. This comparison shows similarities and differences between the two tunnels. Remarkably, we find that the atomistic tunnel yields a non-zero probability of trapping the protein inside the tunnel, while the cylinder tunnel does not. Other issues discussed are the dependences of the escape time on temperature, on the protein's length and native state topology among small single-domain proteins, and on the friction coefficient for amino acids. An estimation of the real escape time from the simulations is also presented. Interestingly, we find that the estimated escape time is relevant to the functionality of the ribosome.

II. MODELS AND METHODS

A. Protein and tunnel models
Nascent proteins are considered in a Go-like model, in which a protein is represented only by its Cα atoms. The intramolecular potential energy of a protein in a conformation is given by

$$E = \sum_{i=1}^{N-1} K_b \left( r_{i,i+1} - b \right)^2 + \sum_{i=2}^{N-1} K_\theta \left( \theta_i - \theta_i^* \right)^2 + \sum_{n=1,3} \sum_{i=2}^{N-2} K_\phi^{(n)} \left[ 1 + \cos\left( n (\phi_i - \phi_i^*) \right) \right] + \sum_{i+3<j} \cdots$$

FIG. 1. (a) Two projected views of the ribosomal exit tunnel of H. marismortui represented by all heavy atoms within 30 Å of a chosen axis that goes roughly through the middle of the tunnel. (b) A conformation of protein GB1 (red) obtained by simulation after the protein is grown from the PTC at the atomistic tunnel. The latter is shown from a cross-section plane which includes the tunnel axis. Assuming that x is the tunnel axis, the planes of the projected views are y-z and x-y in (a) and x-z in (b). The PTC region is identified as located between the bases A2486, U2620 and C2104 of the ribosomal RNA. The tunnel length of L = 72 Å indicated by arrows corresponds to an axial distance from the opening of the tunnel near the PTC to an inner edge of the tunnel exit port.

The simulations were carried out using a molecular dynamics method based on the Langevin equation of motion and a Verlet algorithm. The amino acids are assumed to have a uniform mass, m. Temperature is given in units of ε/k_B, whereas time is measured in units of τ = √(mσ²/ε). We used the same value of the friction coefficient in the Langevin equation, ζ = 1 mτ⁻¹, throughout the study, except in Subsection III.C, where the dependence on the friction coefficient is investigated. In each simulation, the polypeptide chain was first grown in the tunnel from the PTC at a constant speed given by the growth time t_g = 100τ per amino acid.
This growth time is slow enough to produce converged properties of fully translated protein conformations in terms of the radius of gyration and the number of native contacts, i.e., the distributions of these quantities are similar to those obtained with a much larger growth time. After the protein was completely translated, the simulation was run until the protein had fully escaped from the tunnel. The escape time was measured from the moment of complete translation. Because the exit port is irregular with a complex geometry, in order to capture the essential escape time, we have defined the tunnel region as the space within a cylinder of length L = 72 Å and radius 15 Å centered about the tunnel axis x. This region starts from an opening of the tunnel near the PTC and ends at an inner edge of the exit port (Fig. 1), corresponding to positions from x = 10 Å to x = 82 Å. An amino acid residue is considered to have escaped from the tunnel if it is found outside the tunnel region. Typically, for each temperature about 1000 independent growth and escape trajectories are simulated to obtain the statistics of the escape time.

B. Diffusion model

The escape of a fully translated protein at the ribosome tunnel is driven by: (i) an enthalpic force associated with the folding of the protein outside the tunnel, (ii) an entropy gain as the chain emerges from the tunnel, and (iii) the stochastic motion of a partially unfolded chain. It has been shown that in the Go-like model, the free energy change of a protein along an escape coordinate is a monotonically decreasing function, which is approximately linear at intermediate and high temperatures. Interestingly, all these effects can be effectively captured in a diffusion model, which describes the protein escape as the diffusion of a Brownian particle pulled by a constant force in one dimension.
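The picture in the last sentence, a Brownian particle pulled by a constant force in one dimension, can be sketched numerically. The snippet below is only an illustration under stated assumptions: it integrates overdamped dynamics with a simple Euler-Maruyama step, and the values of the diffusion constant `D`, the drift speed `v` and the time step `dt` are hypothetical placeholders rather than parameters fitted in this work. Only the distance L = 72 Å is taken from the text.

```python
import math
import random


def first_passage_time(L, D, v, dt, rng):
    """Time at which a particle starting at x = 0 first reaches x = L.

    Overdamped motion with constant drift speed v and diffusion constant D,
    advanced with an Euler-Maruyama step (an assumed integration scheme).
    """
    x, t = 0.0, 0.0
    noise = math.sqrt(2.0 * D * dt)  # standard deviation of one random step
    while x < L:
        x += v * dt + noise * rng.gauss(0.0, 1.0)
        t += dt
    return t


def sample_escape_times(n, L=72.0, D=1.0, v=0.3, dt=0.5, seed=0):
    """Draw n independent first passage times (illustrative parameter values)."""
    rng = random.Random(seed)
    return [first_passage_time(L, D, v, dt, rng) for _ in range(n)]


if __name__ == "__main__":
    times = sorted(sample_escape_times(500))
    mean_t = sum(times) / len(times)
    median_t = times[len(times) // 2]
    print(f"mean = {mean_t:.1f}, median = {median_t:.1f}, L/v = {72.0 / 0.3:.1f}")
```

For a positive drift, the sample mean should approach L/v as the time step shrinks and the number of trajectories grows, and the sampled times are right-skewed, so the median lies below the mean.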
The particle diffusion in a potential field U(x) is governed by the Smoluchowski equation

$$\frac{\partial}{\partial t} p(x,t) = \frac{\partial}{\partial x} D \left( \beta \frac{\partial U(x)}{\partial x} + \frac{\partial}{\partial x} \right) p(x,t), \tag{3}$$

where p(x, t) is the probability density of finding the particle at position x at time t > 0, given that it was at position x = 0 at time t = 0; D is the diffusion constant, assumed to be position independent; and β = (k_B T)⁻¹ is the inverse temperature, with k_B the Boltzmann constant. The escape time is described as the first passage time of the particle reaching a distance L from the origin in the drift direction. Given an external potential of the linear form U(x) = −kx, where x is the coordinate of the particle and k is the force, the distribution of the escape time was obtained via an exact solution and is given by

$$g(t) = \frac{L}{\sqrt{4 \pi D t^3}} \exp\left[ -\frac{(L - D \beta k t)^2}{4 D t} \right]. \tag{4}$$

Using the distribution in Eq. (4), one obtains the mean escape time

$$\mu_t \equiv \langle t \rangle = \int_0^\infty t \, g(t) \, dt = \frac{L}{D \beta k}, \tag{5}$$

with Dβk as the mean diffusion speed, and the standard deviation

$$\sigma_t \equiv \sqrt{\langle t^2 \rangle - \langle t \rangle^2} = \frac{\sqrt{2L}}{D (\beta k)^{3/2}}. \tag{6}$$

Note that both μ_t and σ_t diverge when k = 0, for which g(t) becomes a heavy-tailed Lévy distribution. It has been shown that D and βk depend on L, on the protein, and on other conditions such as the crowders' volume fraction outside the tunnel.

FIG. 2. Log-log plots of the temperature dependence of the median escape time t_med (a) and the mean escape time μ_t (b) for protein GB1 at the atomistic tunnel (crosses) and at a cylinder tunnel of length L = 72 Å and diameter d = 16 Å (circles). The straight line (dashed) has a slope of − T ≥ ε/k_B).

III. RESULTS AND DISCUSSION

A. Effect of tunnel shape on escape process

To study the effect of tunnel shape on the escape process, we consider the immunoglobulin binding (B1) domain of protein G (GB1) as a nascent protein. This protein has a length of N = 56 amino acids and was considered in our previous studies. The folding temperature of GB1 was found to be T_f = 1 . ε/k_B. Experimentally, the melting temperature of the wild-type GB1 at pH 5.5 has been reported to be 80.5 °C. We will study the escape process at various temperatures but will focus on the simulation temperature T = 0.85 ε/k_B, which after unit conversion corresponds to a physiologically relevant temperature of 26.3 °C.

In order to delineate the effects of tunnel shape on the escape process, we sought an equivalent cylinder tunnel that yields an escape time similar to that obtained at the atomistic tunnel for the protein GB1. We found that among the cylinder tunnels of the same length as the atomistic tunnel, i.e., L = 72 Å, the one with diameter d = 16 Å satisfies this requirement quite well over a wide range of temperatures. Figure 2(a) shows that the median escape times t_med for the atomistic tunnel and the cylinder tunnel are quite close to each other for various temperatures from 0 . ε/k_B to 1 . ε/k_B. Figure 2(b) also shows that the mean escape times μ_t for the two tunnels agree very well with each other at intermediate and high temperatures (T > 0 . ε/k_B). For T ≤ 0 . ε/k_B, both t_med and μ_t for the atomistic tunnel are larger than for the cylinder tunnel, and the differences increase as the temperature decreases. These differences indicate that at low temperatures, it is more difficult for proteins to escape from the atomistic tunnel than from the cylinder one. It appears that the physiological temperature 0.85
ε/k_B corresponds to a borderline behavior of the escape process, at which the effect of the tunnel shape starts to set in.

Figure 2 also shows the linear dependences of t_med and μ_t on T⁻¹ on log-log scales (the dashed lines), as would be found for a Brownian particle diffusing in a potential field with a constant βk. Our previous study shows that this linear behavior is found for a homopolymer chain with self-repulsion, and thus can be applied to intrinsically disordered proteins. For foldable proteins like GB1, this linear dependence can be observed only at high temperatures, at which the proteins are unfolded during the escape.

Note that one can also have an equivalent cylinder tunnel of a length different from that of the atomistic tunnel. For example, we found that the cylinder tunnel of L = 82 Å and d = 13 . L = 72 Å and d = 16 Å.

We now examine more carefully the escape processes of GB1 at the atomistic and cylinder tunnels at T = 0.85 ε/k_B. Figure 3(a) shows that at the atomistic tunnel, the histogram of the escape time for this protein obtained from the simulations follows quite well the distribution function g(t) given by the diffusion model in Eq. (4). Figure 3(b) shows that the probability of protein escape P_escape has a sigmoidal dependence on the time t, with P_escape reaching the value of 1 at t ≈ τ. This result means that the protein can effectively escape from the tunnel without significant delays compared to the median escape time, t_med ≈ 230τ. Fig. 3(b) also shows the dependence of the probability P_C-term-β of forming the C-terminal β-hairpin inside the tunnel on time. The time dependence of this probability is obtained by averaging over multiple escape trajectories.
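Both time-dependent probabilities in Fig. 3(b) are trajectory averages. As a minimal sketch of that bookkeeping (the function names and the toy numbers are hypothetical, not taken from the simulations), the escape probability at time t is the fraction of trajectories whose escape time does not exceed t, and an occupancy-type probability is the fraction of trajectories for which a yes/no condition holds at a given frame:

```python
from bisect import bisect_right


def escape_probability(escape_times, t_grid):
    """Fraction of trajectories that have escaped by each time in t_grid.

    escape_times holds one entry per trajectory; None marks a trajectory
    that never escaped within the simulated time window.
    """
    n = len(escape_times)
    finished = sorted(t for t in escape_times if t is not None)
    return [bisect_right(finished, t) / n for t in t_grid]


def occupancy_probability(indicator_series):
    """Average a per-trajectory 0/1 indicator at each time frame.

    indicator_series[k][f] is 1 if the condition (for example, a hairpin
    formed inside the tunnel) holds in trajectory k at frame f, else 0.
    """
    n = len(indicator_series)
    return [sum(frame) / n for frame in zip(*indicator_series)]


if __name__ == "__main__":
    toy_times = [100.0, 200.0, 300.0, None]  # hypothetical escape times
    print(escape_probability(toy_times, [50.0, 150.0, 250.0, 1000.0]))
    print(occupancy_probability([[0, 1, 0], [0, 1, 1]]))
```

Counting non-escaped trajectories in the denominator is a deliberate choice here: it lets the escape probability saturate below 1 when some trajectories remain trapped.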
The C-terminal β-hairpin is said to be formed inside the tunnel if it forms at least half of its native contacts and all of its residues (41-56) are located within the tunnel. We tracked this β-hairpin because a previous study showed that at low temperatures the GB1 protein can escape from a cylinder tunnel through two different pathways, depending on whether the C-terminal β-hairpin is formed inside the tunnel or not. Fig. 3(b) shows that at the atomistic tunnel, only a small fraction of about 2% of the escape trajectories have this β-hairpin formed inside the tunnel. We have checked that the trajectories having this hairpin formed typically correspond to longer escape times than other trajectories.

Figure 3(c) shows the histogram of conformations observed during the escape process as a function of the number of native contacts N_c and the number of residues escaped from the tunnel N_out for protein GB1 at the atomistic tunnel. The histogram shows a high-density cloud of conformations having intermediate values of N_c and N_out, indicating that the protein folds during the escape. The blurring of the cloud, however, suggests that the protein adopts a wide range of conformations during the escape process. Given that the maximum N_c for GB1 is 120, the histogram shows that during the escape the protein can form up to two-thirds of all of its native contacts. Note that conformations of N_out = 0, i.e., completely located within the tunnel, are also present in the histogram.

Figures 3(d,e,f) show that the equivalent cylinder tunnel of diameter d = 16 Å produces not only a similar escape time distribution but also a similar dependence of P_escape on time and a similar histogram of escaping protein conformations to those obtained with the atomistic tunnel.
Notice, however, that there are differences. First, the escape time distribution at the atomistic tunnel is slightly narrower than the one at the cylinder tunnel, while the median escape time at the cylinder tunnel is slightly smaller than at the atomistic tunnel (220τ vs. 230τ). Second, for the cylinder tunnel the escape probability P_escape reaches 1 faster, at a time of about 500τ. Third, the histogram in N_c and N_out is less dispersed in the case of the cylinder tunnel. These differences indicate that the protein escapes relatively more easily at the cylinder tunnel than at the atomistic tunnel.

Other differences between the two tunnels can be seen in the probability of forming the C-terminal β-hairpin and in the histogram of conformations during the escape process. Fig. 3(e) shows that for the cylinder tunnel the probability P_C-term-β is zero at all times, indicating that the C-terminal β-hairpin does not form inside the cylinder tunnel. Fig. 3(f) shows that the histogram of conformations during the escape process for the cylinder tunnel does not include conformations of small N_out (less than about 16). These results differ from those at the atomistic tunnel and indicate that the pathways at the atomistic tunnel are more diverse than at the cylinder tunnel.

The differences between the escape processes at the two tunnels are magnified as the temperature is lowered. In Fig. 4 we show the same plots as in Fig. 3 but for T = 0.75 ε/k_B. Figures 4(a) and 4(d) show that the escape time distribution for the atomistic tunnel is significantly broader than for the cylinder tunnel. Figures 4(b) and 4(e) show that for the atomistic tunnel the escape probability P_escape reaches only about 94% at t = 1000τ, while for the cylinder tunnel it reaches 100% at t ≈ τ. Figure 4(b) shows that at T = 0.75 ε/k_B, about 5% of the escape trajectories have the C-terminal β-hairpin formed inside the atomistic tunnel, while this fraction remains zero for the cylinder tunnel (Fig. 4(e)).
Figures 4(c) and 4(f) show that the histogram of escaping conformations for the atomistic tunnel is more complex than for the cylinder tunnel. A significant number of conformations of low N_out, including those of N_out = 0, appear at the atomistic tunnel but not at the cylinder tunnel. We have checked that the trajectories that did not end with a successful escape after a long time compared to t_med are associated with conformations of N_out = 0. These conformations are identified as kinetic traps in the escape process.

Figure 5 shows several trapped conformations obtained at T = 0 . ε/k_B for GB1 at the atomistic tunnel. The numbers of native contacts N_c in these conformations differ, but all of them have the α-helix and at least one β-hairpin formed. The conformation shown in Fig. 5(c) also has a partial tertiary structure established by contacts between the α-helix and the N-terminal β-hairpin. These conformations did not appear at the cylinder tunnel, indicating that the irregular shape of the atomistic tunnel allows and facilitates the formation of trapped conformations inside the tunnel. Note that at the physiological temperature T = 0.85 ε/k_B, there were no kinetic traps. This can be attributed to two reasons: the faster diffusion at this temperature helps the protein to avoid trapped conformations, and the larger thermal fluctuations help the protein to get out of the traps. The trapping at an atomistic ribosome tunnel and the alleviation of trapping by increased thermal fluctuations have also been observed for the chymotrypsin inhibitor 2 (CI2) protein in an early simulation study by Elcock.
Our results here for GB1 are consistent with that previous work.

It is interesting now to compare the protein diffusion properties at the two tunnels using the diffusion model. Figure 6 shows the values of the diffusion constant D and the potential slope βk obtained by fitting the escape time histograms at various temperatures to the diffusion model. It can be seen that for both tunnels D appears to increase linearly with temperature, whereas βk tends to adopt a constant value. The linear dependence of D on temperature agrees with that of an ideal Brownian particle. The atomistic tunnel, however, yields a lower D and a higher βk than the cylinder tunnel at intermediate and high temperatures (T ≥ . ε/k_B), while the average diffusion speed given by Dβk is almost the same for the two tunnels. We have checked that at these temperatures, the escape time distributions at the atomistic tunnel are slightly narrower than at the cylinder tunnel. At low temperatures (T < . ε/k_B), both D and βk for the atomistic tunnel deviate significantly from the average trends due to the impact of kinetic trapping. Also at low temperatures, the escape time distribution for the atomistic tunnel becomes broader than for the cylinder tunnel.

FIG. 3. Distributions of the escape time (a,d), time dependences of the escape probability P_escape (solid) and the probability of C-terminal β-hairpin formation inside the tunnel P_C-term-β (dashed) (b,e), and histograms of conformations as a function of the number of residues escaped from the tunnel N_out and the number of native contacts N_c (c,f) for protein GB1 at the atomistic tunnel (upper panels) and at an equivalent cylinder tunnel (lower panels) of length L = 72 Å and diameter d = 16 Å, at temperature T = 0.85 ε/k_B. The native conformation of GB1, shown in (a) as inset, has 120 native contacts.

FIG. 4. Same as Fig. 3 but for T = 0.75 ε/k_B. The panels show distributions of the escape time (a,d), the time dependences of the escape probability P_escape (solid) and the probability of C-terminal β-hairpin formation inside the tunnel P_C-term-β (dashed) (b,e), and the histograms of conformations as a function of the number of residues escaped from the tunnel N_out and the number of native contacts N_c (c,f) for protein GB1 at the atomistic tunnel (upper panels) and at an equivalent cylinder tunnel (lower panels) of length L = 72 Å and diameter d = 16 Å.

To better understand the differences between the two tunnels, we sought a quantitative comparison between the shapes of the atomistic tunnel and the cylinder tunnel. To that end, we have calculated the area S of the inner cross-section of the atomistic tunnel as a function of the position x along the tunnel axis using a probe sphere.
The effective diameter of the tunnel at a given position was then calculated as d = 2√(S/π). Figure 7(a) shows that the shape of the tunnel's cross-section varies strongly with x. It is typically not circular and deviates significantly from that of the equivalent cylinder tunnel. The tunnel is also quite narrow near the PTC and becomes much wider at the exit port. Figure 7(b) shows that the effective diameter d of the atomistic tunnel varies, but not too strongly, between 15 and 20 Å for positions x between 15 and 75 Å. Notice that for these positions, the diameter d = 16 Å of the equivalent cylinder tunnel lies within the variation range of the atomistic tunnel's diameter but is near the lower bound of this range. This can be understood as the irregular shape of the atomistic tunnel's cross-sections making it effectively smaller for nascent proteins.

FIG. 5. Examples of the conformations of GB1 that are trapped inside the atomistic tunnel during the escape process at T = 0 . ε/k_B. The numbers of native contacts of the conformations are N_c = 27 (a), N_c = 34 (b) and N_c = 47 (c), whereas all of them have N_out = 0, as indicated.

The present model of the atomistic tunnel neglects the presence of amino acid side-chains. We have checked that by considering all the heavy atoms of ribosomal proteins in the tunnel model while keeping the Cα-only representation for GB1, the escape time distribution at T = 0 . ε/k_B changes only slightly, with a small shift towards smaller values, but the escape probability reaches only about 95% at the time of 1000τ (Fig. S2 of the supplementary material). We also found that due to the increased roughness of the tunnel surface, the all-heavy-atom tunnel requires a longer growth time of the protein, with t_g = 400τ per amino acid, to obtain converged properties of fully translated protein conformations.
The results indicate that amino acid side-chains may have a relatively small but detrimental effect on the escape process. A proper consideration of this effect, however, would need models that include side-chain representations for both ribosomal and nascent proteins and allow the degrees of freedom of side-chain rotation. It can be expected that while the side-chain excluded volumes cause some obstruction to the escape process, the freedom of side-chain rotation can make the escape somewhat easier.

B. Dependence of escape time on protein

In order to study the dependence of the escape time on the protein, apart from GB1 we selected 16 additional single-domain proteins of lengths between 37 and 99 residues, belonging to the different classes of all-α, all-β and α/β proteins, and carried out simulations for these proteins. We consider only the atomistic tunnel and the simulation temperature T = 0.85 ε/k_B. Figure 8 shows that the histograms of the escape times for the selected proteins are quite similar in their overall shape and in the range of most probable values. The peak positions of the histograms vary, but within the same order of magnitude. The distribution width is narrowest for the all-β proteins. For most of the proteins, the histogram can be fitted quite well to the distribution function given by Eq. (4) of the diffusion model. For a few proteins, i.e., the ones with PDB codes 2spz and 1wt7, the agreement with the diffusion model is worse than for the others, with the appearance of a thick tail of large escape times (Fig. 8(m and n)). For most of the proteins, we observed trajectories with trapped conformations of N_out = 0. The fraction of non-escaped trajectories at the largest time in the histograms shown in Fig. 8 (1200τ or 1800τ) is below 2% for 10 out of 16 proteins (see the caption of Fig. 8). At a larger time of 8000τ, this fraction falls below 2% for all proteins except 2spz, for which this fraction remains at 9.7%. Thus, most proteins can escape efficiently.
We have checked that the trapped conformations of 2spz typically have a two-helix bundle formed within the tunnel, resulting in an increased difficulty for its escape.

Figure 9(a) shows the median escape time t_med as a function of the chain length N for all 17 proteins considered, including GB1. It can be seen that t_med lies in the range from 200τ to 500τ and does not seem to depend on N. However, the variation of the escape times among the proteins decreases with N. Figure 9(b) plots t_med against a topological parameter of the protein native state, the relative contact order (CO). It shows a weak but visible trend that t_med decreases with CO. From the types of data points shown in Fig. 9(b), one can also see that the all-β proteins on average escape faster than the all-α proteins, whereas the α/β proteins can escape either as fast or as slow as the two other groups.

From the fits of the simulated escape time distributions to the diffusion model, we obtained the values of D and βk for the 17 proteins. Figures 9(c) and 9(d) show that both D and βk vary strongly among the proteins, but as for t_med, the variation decreases with N, indicating a kind of convergence.

FIG. 6. Dependence of the diffusion constant D (circles) and the parameter βk (squares) on temperature for protein GB1 at the atomistic tunnel (a) and at the cylinder tunnel of L = 72 Å and d = 16 Å (b). The data are obtained by fitting the simulated escape time distribution to Eq. (4). Straight lines show fits of the simulation data for T ≥ . ε/k_B to a linear dependence on temperature in the case of D (dashed) and to a constant value in the case of βk (dotted). The functions of the fits are D = 1.10 T and βk = 0.29 Å⁻¹ in (a) and D = 1.59 T and βk = 0.20 Å⁻¹ in (b).
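As a rough consistency check on the GB1 fits quoted for Fig. 6, the closed-form mean and standard deviation of the diffusion model, μ_t = L/(Dβk) and σ_t = √(2L)/[D(βk)^(3/2)], can be evaluated directly. The sketch below assumes L = 72 Å, takes T = 0.85 ε/k_B as the temperature labelled in the Fig. 3 panels, and uses the fit values D = 1.10 T, βk = 0.29 Å⁻¹ (atomistic) and D = 1.59 T, βk = 0.20 Å⁻¹ (cylinder) as read from the Fig. 6 panel labels. The function and variable names are illustrative.

```python
import math


def escape_time_density(t, L, D, beta_k):
    """First-passage density g(t) of Eq. (4) for drift speed D * beta_k."""
    return (L / math.sqrt(4.0 * math.pi * D * t ** 3)
            * math.exp(-(L - D * beta_k * t) ** 2 / (4.0 * D * t)))


def mean_escape_time(L, D, beta_k):
    """Mean first passage time, L / (D * beta_k)."""
    return L / (D * beta_k)


def std_escape_time(L, D, beta_k):
    """Standard deviation of the first passage time, sqrt(2 L) / (D * beta_k**1.5)."""
    return math.sqrt(2.0 * L) / (D * beta_k ** 1.5)


L_TUNNEL = 72.0  # tunnel length in Angstrom
T = 0.85         # temperature in units of epsilon / k_B (assumed, see lead-in)

# (D in Angstrom^2 / tau, beta*k in 1 / Angstrom), from the linear/constant fits
FITS = {
    "atomistic": (1.10 * T, 0.29),
    "cylinder": (1.59 * T, 0.20),
}

if __name__ == "__main__":
    for name, (D, beta_k) in FITS.items():
        speed = D * beta_k
        mu = mean_escape_time(L_TUNNEL, D, beta_k)
        sigma = std_escape_time(L_TUNNEL, D, beta_k)
        print(f"{name}: D*beta_k = {speed:.3f}, mean = {mu:.1f}, std = {sigma:.1f}")
```

With these inputs the mean diffusion speed Dβk comes out nearly equal for the two tunnels, while the atomistic tunnel gives the smaller σ_t, in line with the narrower escape time distribution reported for it at intermediate and high temperatures.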
It can also be noticed that the strongest variation belongs to the α/β proteins, whereas the weakest belongs to the β proteins. It can be expected that proteins of length N > 100 have similar t_med and diffusion properties at the tunnel to the ones of length 70 < N < τ .

FIG. 7. (a) Inner cross-sections (dark color) of the atomistic tunnel obtained by using a probe sphere of radius R = 3 Å at various positions x along the tunnel axis, as indicated. For comparison, the cross-section of an equivalent cylinder tunnel of diameter 16 Å is also shown (gray circle). (b) Dependence of the effective diameter d of the atomistic tunnel (solid) on x. For a given position x, d is calculated as d = 2√(S/π), where S is the area of the tunnel's cross-section. The horizontal dashed line indicates the constant diameter d = 16 Å of the equivalent cylinder tunnel.

A projected median escape time for this domain would be at least about 50 times larger than for GB1. Furthermore, the simulated escape time distribution, as shown in Fig. 10(a), cannot be fitted to the diffusion model. Fig. 10(c) shows that a typical trapped conformation of this zinc-finger domain has the α-helix and the N-terminal β-hairpin formed and is found deep within the tunnel. The partial folding observed here is consistent with a recent experiment which indicates that the 2adr zinc-finger domain can fold completely within the tunnel. The difference between the zinc-finger domain of 2adr and that of 2vy4 is that the latter is 8 residues longer and has a more developed β-hairpin with a coil-like flagging tail near the N-terminus. This flagging tail makes the β-hairpin formation inside the tunnel more difficult, allowing the N-terminus to reach out of the tunnel. It is suggested that the behavior of the 29-residue zinc-finger domain is similar to the diffusion with k = 0 in the diffusion model.
A protein trapped entirely inside the tunnel would feel no free energy gradient and therefore has no preferred direction in which to diffuse. It can be expected that the cross-over tunnel length for the 29-residue zinc-finger domain is significantly shorter than the real length of the exit tunnel and therefore the protein is found in the slow diffusion regime.

[Figure 8: normalized histogram versus escape time (τ), panels (a) to (p).]

FIG. 8. Distribution of the escape time at the atomistic tunnel at T = 0.[…] ε/k_B for 16 small single-domain proteins (not including GB1) named by their PDB codes as 1iur (a), 2jwd (b), 2yxf (c), 2uvs (d), 1wxl (e), 2ql0 (f), 2ci2 (g), 2vxf (h), 2k3b (i), 1shg (j), 1f53 (k), 2zeq (l), 2spz (m), 1wt7 (n), 2vy4 (o), 2rjy (p). The PDB code and the native state of each protein are shown inside the panels with the N-terminus indicated by a blue ball. In each panel, a normalized histogram of the escape times obtained from simulations (boxes) is fitted to the diffusion model (solid line). The fractions of non-escaped trajectories at the largest time in the histograms are 0% (1iur), 1% (2jwd), 0.3% (2yxf), 4.6% (2uvs), 0% (1wxl), 5.6% (2ql0), 0.6% (2ci2), 0.2% (2vxf), 1.2% (2k3b), 1.2% (1shg), 4% (1f53), 0.6% (2zeq), 10.9% (2spz), 13.7% (1wt7), 1% (2vy4), and 2.6% (2rjy).

C. Dependence of escape time on friction coefficient

This subsection concerns only the methodology used in the study, but it helps to interpret the previous results. All the simulations in the previous subsections were done with the friction coefficient ζ = 1 mτ⁻¹ for amino acids. This value of ζ may not be realistic for real proteins inside cells. Thus, we ask how the escape time depends on ζ and whether one can extrapolate this dependence to obtain the real escape time. For this purpose we carried out additional simulations for GB1 with ζ = 2, 4, and 8 mτ⁻¹ at the atomistic tunnel with T = 0.[…] ε/k_B.
In these simulations, because the increased friction slows down the dynamics, the growth time per amino acid t_g was also increased to 200, 400 and 800τ, respectively, for the given values of ζ. Fig. 11(a) shows that the median escape time t_med is an almost perfect linear function of ζ. This linear dependence indicates that the simulation results are in the overdamped (large friction) regime. Fig. 11(b) shows that the diffusion constant D of the escaping protein, obtained by fitting the simulated escape time distribution to the diffusion model, decreases with ζ as D ∼ ζ⁻¹. Together with the approximately linear dependence of D on temperature shown in Fig. 6, this is fully consistent with Einstein's relation D = k_B T/ζ*, where ζ* is the friction coefficient of a Brownian particle. Thus, a protein at the tunnel behaves very much like a Brownian particle if one assumes that the quantity ζ* is proportional to ζ and plays the role of an effective friction coefficient of the whole protein.

Given that σ = 5 Å, m = 110 g/mol, and ε ≈ […], one has τ = √(mσ²/ε) ≈ […] and mτ⁻¹ ≈ […] × 10^[…] g s⁻¹. The realistic friction coefficient of an amino acid in water can be obtained from the Stokes law, ζ_water = 6πησ, where η = 0.01 Poise is the viscosity of water at 25 °C. One obtains ζ_water ≈ […] × 10^[…] g s⁻¹ ≈ […] mτ⁻¹. By extrapolating the linear dependence in Fig. 11(a) one finds that at ζ = ζ_water the median escape time for GB1 is t_med ≈ […] τ ≈ 90 ns. This time appears to be too short for a large-scale motion like the protein escape.

Veitshans et al. suggested that at high friction, inertial terms in the Langevin equation are irrelevant, and the natural time unit is τ_H = ζσ²/(k_B T). For water at room temperature, a direct calculation from the last formula gives τ_H ≈ […] et al. estimated that τ_H ≈ […] t_med is either 18 µs or 90 µs. The experimental refolding time of GB1 at neutral pH is about 1 ms. We have checked that within the same Go-like model at T = 0.[…] ε/k_B, the median refolding time is about 50% larger than t_med. Thus, the model prediction of the refolding time is smaller than, but within the same order of magnitude as, the experimental value, given that some uncertainties are associated with the estimates. With the above estimates, and given the results of the previous section, it can be expected that the escape times of single-domain proteins are of the order of 0.1 ms, i.e., on the sub-millisecond scale.

[Figure 9: (a) t_med (τ) versus N; (b) t_med (τ) versus CO; (c) D (Å²/τ) versus N; (d) βk (Å⁻¹) versus N.]

FIG. 9. (a and b) The median escape time t_med plotted against the protein length N (a) and the relative contact order CO (b) for GB1 and the 16 proteins considered in Fig. 8. The values of t_med are obtained from simulations with the atomistic tunnel at T = 0.[…] ε/k_B. The point type indicates the protein class, i.e., all-α (circles), all-β (squares) and α/β (triangles). (c and d) The diffusion constant D (c) and the potential parameter βk (d) plotted against the protein length N for the 17 proteins considered in (a). D and βk are obtained by fitting the simulated escape time distribution to the diffusion model.

[Figure 10: (a) normalized histogram versus escape time (τ); (b) P_escape versus time (τ); (c) trapped conformation.]

FIG. 10. (a) Histogram of the escape times at T = 0.[…] ε/k_B obtained from simulations at the atomistic tunnel for a 29-residue zinc-finger domain of ADR1 protein (PDB code: 2adr, res. 102-130) with the native structure of the domain shown as inset. (b) Dependence of the escape probability, P_escape, on time for the system considered in (a). (c) A typical trapped conformation of the domain inside the tunnel.

[Figure 11: t_med (τ) and D (Å²/τ) versus ζ (mτ⁻¹); fit D = 1.05/ζ.]

FIG. 11.
Dependence of the median escape time t_med (a) and the diffusion constant D (b) on the friction coefficient ζ for protein GB1 at T = 0.[…] ε/k_B. In (a) the dependence is fitted by a linear function, t_med = 179.[…] ζ + 37.62 (dashed). In (b) the dependence is fitted by the function D = 1.05/ζ (dashed).

IV. CONCLUSION

We close with several remarks. First, the shape of the ribosomal exit tunnel appears to make escape more difficult for nascent proteins than a smooth cylinder tunnel does. This difficulty is reflected in the appearance of kinetic traps in the escape pathways, which lengthen the escape times. We have shown that the trapped conformations are located completely inside the tunnel and usually have a significant development of tertiary structure. The formation of tertiary structure elements inside the tunnel correlates with the modulated shape of the ribosome tunnel, which has some narrow parts but also some wider parts that can hold a tertiary unit. In contrast, the equivalent cylinder tunnel of 13.5 Å diameter does not allow tertiary structure formation and yields no kinetic trapping. Second, thermal fluctuations are important for the escape of nascent proteins. We have shown that for GB1, a significant fraction of escape trajectories get trapped at T = 0.[…] ε/k_B, but not at the physiological temperature T = 0.[…] ε/k_B. Interestingly, at the latter temperature, almost all of the 17 single-domain proteins considered are able to escape efficiently, even the smallest one, the 37-residue 2vy4. Note that the trapping arises solely from the folding of a protein within the tunnel and therefore depends on temperature. At high temperatures, folding is slow and diffusion is fast, so a protein has a low probability of getting trapped before escaping from the tunnel. At low temperatures, folding is fast while diffusion is slow, so the trapping probability is increased.
On the other hand, a protein would get out of a trap more easily at a higher temperature. Third, if a protein or peptide is too small it cannot escape efficiently from the tunnel. The example of the 29-residue zinc-finger domain of 2adr shows that the protein is severely trapped inside the tunnel, with a median escape time about two orders of magnitude larger than that of GB1. The trapped protein is not guided by a potential gradient towards the escape direction. This example reflects a relation between the protein size and a cross-over tunnel length for efficient diffusion, as predicted by our previous study with the cylinder tunnel. Fourth, the escape time of single-domain proteins depends weakly on the native state topology and is almost independent of the protein size. Our model predicts that the protein escape time at the ribosome tunnel is of the order of 0.1 ms. This is much shorter than the time needed by the ribosome to translate one codon (tens of milliseconds), so nascent proteins are not allowed to jam the ribosome tunnel.

One may ask to what extent hydrophobic and electrostatic interactions of a nascent protein with the ribosome exit tunnel can alter the results obtained above. It is well known that the tunnel wall formed by the ribosomal RNA is negatively charged. We found that for the H. marismortui ribosome, the tunnel's inner surface with x < 82 Å has only four hydrophobic side chains that are clearly exposed within the tunnel, belonging to Phe61 of protein L4, Met130 of protein L22, and Met26 and Leu27 of protein L38, and about 10 exposed charged amino acid side chains. These statistics indicate that the effect of hydrophobic interaction on the escape process can be quite small, whereas the Coulomb interaction may have a strong effect on nascent chains. However, if a nascent protein is neutral overall, the electrostatic forces on the protein's positive and negative charges may cancel each other out.
Thus, it is reasonable to expect that the energetic interactions of nascent proteins with the tunnel can lead to specific changes in the escape behavior of individual proteins, but on average they give only higher-order corrections to what is obtained with the excluded volume interaction.

Finally, as for the cylinder tunnel, it is found that the escape time distribution at the atomistic tunnel for various proteins follows very well the one-dimensional diffusion model of a drifting Brownian particle. This consistent finding suggests that the protein escape at the ribosome tunnel may have been designed by Nature to be simple, efficient and predictable for the smooth functioning of the ribosome. This result also demonstrates the usefulness of simple stochastic models for understanding the complex dynamics of biomolecules.

SUPPLEMENTARY MATERIAL

See supplementary material for the dependences of the median escape time and the mean escape time on temperature for protein GB1 at the cylinder tunnel of length L = 82 Å and diameter d = 13.[…] β-hairpin formation inside the tunnel for protein GB1 at a tunnel model that considers all the heavy atoms of the ribosomal RNA and the ribosomal proteins.

ACKNOWLEDGMENTS

This research is funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 103.01-2019.363. T.X.H. also acknowledges the support of the International Centre for Physics at the Institute of Physics, VAST under grant number ICP.2020.05. We thank the VNU Key Laboratory of Multiscale Simulation of Complex Systems for the occasional use of their high performance computer.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.

I. S. Gabashvili, R. K. Agrawal, C. M. Spahn, R. A. Grassucci, D. I. Svergun, J. Frank, and P. Penczek, Cell , 537 (2000).
N. Ban, P. Nissen, J. Hansen, P. B. Moore, and T. A. Steitz, Science , 905 (2000).
D. V. Fedyukina and S.
Cavagnero, Ann. Rev. Biophys. , 337 (2011).
F. Trovato and E. P. O'Brien, Annu. Rev. Biophys. , 345 (2016).
A. Javed, J. Christodoulou, L. D. Cabrita, and E. V. Orlova, Acta Crystallogr. D , 509 (2017).
M. Thommen, W. Holtkamp, and M. V. Rodnina, Curr. Opin. Struct. Biol. , 83 (2017).
H. Nakatogawa and K. Ito, Cell , 629 (2002).
T. Tenson and M. Ehrenberg, Cell , 591 (2002).
R. Berisio, F. Schluenzen, J. Harms, A. Bashan, T. Auerbach, D. Baram, and A. Yonath, Nat. Struct. Mol. Biol. , 366 (2003).
G. Ziv, G. Haran, and D. Thirumalai, Proc. Natl. Acad. Sci. USA , 18956 (2005).
S. Kirmizialtin, V. Ganesan, and D. E. Makarov, J. Chem. Phys. , 10268 (2004).
J. Lu and C. Deutsch, Nat. Struct. Mol. Biol. , 1123 (2005).
A. Kosolapov and C. Deutsch, Nat. Struct. Mol. Biol. , 405 (2009).
A. H. Elcock, PLoS Comp. Biol. , 0824 (2006).
E. P. O'Brien, S.-T. D. Hsu, J. Christodoulou, M. Vendruscolo, and C. M. Dobson, J. Am. Chem. Soc. , 16928 (2010).
W. Holtkamp, G. Kokic, M. Jäger, J. Mittelstaet, A. A. Komar, and M. V. Rodnina, Science , 1104 (2015).
O. B. Nilsson, R. Hedman, J. Marino, S. Wickles, L. Bischoff, M. Johansson, A. Müller-Lucks, F. Trovato, J. D. Puglisi, E. P. O'Brien, et al., Cell Rep. , 1533 (2015).
R. Kudva, P. Tian, F. Pardo-Avila, M. Carroni, R. B. Best, H. D. Bernstein, and G. Von Heijne, Elife , e36326 (2018).
L. D. Cabrita, S.-T. D. Hsu, H. Launay, C. M. Dobson, and J. Christodoulou, Proc. Natl. Acad. Sci. USA , 22239 (2009).
C. Eichmann, S. Preissler, R. Riek, and E. Deuerling, Proc. Natl. Acad. Sci. USA , 9111 (2010).
E. P. O'Brien, J. Christodoulou, M. Vendruscolo, and C. M. Dobson, J. Am. Chem. Soc. , 513 (2011).
H. Krobath, E. I. Shakhnovich, and P. F. Faísca, J. Chem. Phys. , 215101 (2013).
P. T. Bui and T. X. Hoang, J. Chem. Phys. , 095102 (2016).
P. T. Bui and T. X. Hoang, J. Chem. Phys. , 045102 (2018).
C. M. Dobson, Nature , 884 (2003).
N. Voss, M. Gerstein, T. Steitz, and P. Moore, J. Mol. Biol. , 893 (2006).
K. Dao Duc, S. S. Batra, N. Bhattacharya, J. H.
Cate, and Y. S. Song, Nucleic Acids Research , 4198 (2019).
N. Go, Ann. Rev. Biophys. Bioeng. , 183 (1983).
T. X. Hoang and M. Cieplak, J. Chem. Phys. , 6851 (2000).
T. X. Hoang and M. Cieplak, J. Chem. Phys. , 8319 (2000).
C. Clementi, H. Nymeyer, and J. N. Onuchic, J. Mol. Biol. , 937 (2000).
T. M. Schmeing, K. S. Huang, S. A. Strobel, and T. A. Steitz, Nature , 520 (2005).
P. T. Bui and T. X. Hoang, J. Phys.: Conf. Ser. , 012022 (2020).
N. G. Van Kampen, Stochastic Processes in Physics and Chemistry, 3rd ed. (Elsevier, 1992).
D. R. Cox and H. D. Miller, The Theory of Stochastic Processes (Chapman and Hall, 1965) pp. 219-223.
R. Campos-Olivas, R. Aziz, G. L. Helms, J. N. Evans, and A. M. Gronenborn, FEBS Lett. , 55 (2002).
K. W. Plaxco, K. T. Simons, and D. Baker, J. Mol. Biol. , 985 (1998).
T. Veitshans, D. Klimov, and D. Thirumalai, Fold. Des. , 1 (1997).
P. Alexander, J. Orban, and P. Bryan, Biochem. , 7243 (1992).

SUPPLEMENTARY MATERIAL FOR: "PROTEIN ESCAPE AT THE RIBOSOMAL EXIT TUNNEL: EFFECT OF THE TUNNEL SHAPE", P.T. BUI AND T.X. HOANG

[Figure S1: (a) t_med (τ) and (b) µ_t (τ) versus T (ε/k_B) for GB1; atomistic tunnel and cylinder tunnel.]

FIG. S1. Same as for Figure 2 but for a comparison between the atomistic tunnel (crosses) with a cylinder tunnel of length L = 82 Å and diameter d = 13.[…]

[Figure S2: GB1 at T = 0.85 ε/k_B. (a) Normalized histogram versus escape time (τ) for the all-heavy-atom tunnel and the atomistic tunnel; (b) P_escape and P_C-term-β versus time (τ) for the all-heavy-atom tunnel.]

FIG. S2. Same as for panels (a) and (b) of Figure 7 but for a tunnel model that consists of all the heavy atoms of the ribosomal RNA and the ribosomal proteins. The GB1 protein is still considered in the Cα-based Go-like model.
The fit of the simulation data to the diffusion model for the atomistic tunnel (dotted) is shown for comparison with the fit for the all-heavy-atom tunnel (solid). Compared to the atomistic tunnel, the all-heavy-atom tunnel requires a longer growth time of the protein to obtain converged properties of fully translated protein conformations in terms of number of native contacts and radius of gyration. For the results shown in the figure, the protein is grown with the growth time t_g = 400τ.
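Returning to the friction-coefficient extrapolation of subsection C of the main text, the arithmetic behind the first (low-friction time unit) estimate can be sketched as follows. Several inputs are assumptions: the value of ε is missing from the extracted text, so ε = 0.8 kcal/mol is used here purely as a placeholder; the slope of the linear fit in Fig. 11(a) is truncated to 179; and σ enters the Stokes law exactly as written in the text, ζ_water = 6πησ. All function and variable names are hypothetical. With these inputs the result comes out at several tens of nanoseconds, the same order as the 90 ns quoted in the text.

```python
import math

AVOGADRO = 6.02214076e23   # molecules per mole
KCAL_TO_ERG = 4.184e10     # 1 kcal = 4184 J = 4.184e10 erg


def time_unit_seconds(sigma_cm, mass_g_per_mol, eps_kcal_per_mol):
    """Model time unit tau = sqrt(m * sigma^2 / eps), evaluated per molecule in CGS."""
    m = mass_g_per_mol / AVOGADRO                      # mass in g
    eps = eps_kcal_per_mol * KCAL_TO_ERG / AVOGADRO    # energy in erg
    return math.sqrt(m * sigma_cm ** 2 / eps)


def stokes_friction(eta_poise, sigma_cm):
    """Stokes law as written in the text: zeta = 6 * pi * eta * sigma, in g/s."""
    return 6.0 * math.pi * eta_poise * sigma_cm


SIGMA_CM = 5.0e-8      # sigma = 5 Angstrom (from the text)
MASS = 110.0           # m = 110 g/mol (from the text)
ETA = 0.01             # viscosity of water in Poise (from the text)
EPS_ASSUMED = 0.8      # ASSUMPTION: epsilon in kcal/mol, not given in the extracted text
SLOPE = 179.0          # slope of t_med versus zeta, truncated from the Fig. 11 caption
INTERCEPT = 37.62      # intercept of the same fit, in units of tau

tau = time_unit_seconds(SIGMA_CM, MASS, EPS_ASSUMED)   # seconds
zeta_unit = (MASS / AVOGADRO) / tau                     # one unit m/tau in g/s
zeta_water = stokes_friction(ETA, SIGMA_CM)             # g/s
zeta_ratio = zeta_water / zeta_unit                     # zeta_water in units of m/tau

# Linear extrapolation of the median escape time to the friction of water.
t_med_tau = SLOPE * zeta_ratio + INTERCEPT
t_med_seconds = t_med_tau * tau

print(f"tau        = {tau:.3e} s")
print(f"zeta_water = {zeta_water:.3e} g/s = {zeta_ratio:.1f} m/tau")
print(f"t_med      = {t_med_tau:.3e} tau = {t_med_seconds:.3e} s")
```

This reproduces only the first estimate; the text then argues that this time is too short and switches to the high-friction time unit τ_H, which leads to the 18 µs and 90 µs figures.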