A Continuous Effective Model of the Protein Dynamics

Abstract

The theory of elastic rods can be used to describe certain geometric and topological properties of DNA molecules. A similar effective field theory approach was previously suggested to describe the conformations and dynamics of proteins. In this letter we report a detailed study of the basic features of a version of the proposed model, which assumes proteins to be very long continuous curves. In the most appealing case, the model is based on a potential with a pair of minima corresponding to helical and strand-like configurations of the curves. It allows one to derive several predictions about the geometric features of the molecules, and we show that the predictions are compatible with the phenomenology. While the helices represent the ground state configurations, the abundance of beta strands is controlled by a parameter, which can either completely suppress their presence in a molecule, or make them abundant. The few-parameter model investigated in the letter represents a universality class of protein molecules rather than any specific protein. Generalizations accounting for the discrete nature and inhomogeneity of the molecules presumably allow one to model realistic cases.


IITEP-TH-nn/yy


Dmitry Melnikov (1, 2) and Alyson B. F. Neves (3, 4)

(1) International Institute of Physics, Federal University of Rio Grande do Norte, Campus Universitário, Lagoa Nova, Natal-RN 59078-970, Brazil
(2) Institute for Theoretical and Experimental Physics, B. Cheremushkinskaya 25, Moscow 117218, Russia
(3) Department of Theoretical and Experimental Physics, Federal University of Rio Grande do Norte, Campus Universitário, Lagoa Nova, Natal-RN 59078-970, Brazil
(4) Federal University of Maranhão, Campus Balsas, rua José Leão 484, Balsas-MA 65800-000, Brazil


Proteins are long quasi one-dimensional chains of amino acids. At the secondary structure level these chains organize themselves in several characteristic configurations, such as helices and beta sheets, appearing due to hydrogen bonds forming between atoms of beyond-nearest-neighbor amino acids. Consequently, at this level, proteins can be viewed as a sequence of helices and beta strands (elements that form the pleated beta sheets) joined by less regular connections called loops and turns.

In natural conditions, the arrays of the secondary structures further fold to make space-filling configurations, the native states of the proteins, which define their biological function. Predicting the shape of the native states from the sequence of amino acids is a very important, but formidable task. Due to the large number of degrees of freedom in this problem, current all-atom computer simulations can only fold relatively short chains, and even then in a time significantly longer than it takes a natural protein to do so. Different techniques are implemented to effectively reduce the number of degrees of freedom. Although the latter simulations are more efficient, they suffer from a large number of input parameters, which restricts their predictive power.

In Refs. 1,2 it was proposed to apply an effective field theory approach to proteins, constructing effective energy functionals for 3D curves. It was argued that a natural description is in terms of a gauge theory that is a low-dimensional analog of the Abelian Higgs model. Such a model would reproduce helices as ground state configurations and loops as solitons interpolating between the ground states.
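Since the effective energy functionals are written for 3D curves, it may help to recall why a helix is the natural candidate for a homogeneous ground state: it is the curve with constant curvature and constant torsion. The sketch below checks this numerically with the standard Frenet formulas. The function name `helix_curvature_torsion` and the sample radius and pitch are illustrative assumptions and do not come from the letter.

```python
import numpy as np


def helix_curvature_torsion(radius, pitch, t):
    """Curvature and torsion of r(t) = (R cos t, R sin t, c t), c = pitch / (2 pi).

    Uses the standard formulas
        kappa = |r' x r''| / |r'|^3,
        tau   = (r' x r'') . r''' / |r' x r''|^2.
    """
    t = np.asarray(t, dtype=float)
    c = pitch / (2.0 * np.pi)
    zeros = np.zeros_like(t)
    # First, second and third derivatives of the curve with respect to t.
    d1 = np.stack([-radius * np.sin(t), radius * np.cos(t), c + zeros], axis=-1)
    d2 = np.stack([-radius * np.cos(t), -radius * np.sin(t), zeros], axis=-1)
    d3 = np.stack([radius * np.sin(t), -radius * np.cos(t), zeros], axis=-1)
    cross = np.cross(d1, d2)
    cross_norm = np.linalg.norm(cross, axis=-1)
    speed = np.linalg.norm(d1, axis=-1)
    kappa = cross_norm / speed**3
    tau = np.einsum("...i,...i->...", cross, d3) / cross_norm**2
    return kappa, tau


if __name__ == "__main__":
    # Illustrative geometry only (radius and pitch in angstroms); these are
    # assumed sample values, not the fitted parameters of the model.
    samples = np.linspace(0.0, 4.0 * np.pi, 9)
    kappa, tau = helix_curvature_torsion(radius=2.3, pitch=5.4, t=samples)
    print("curvature spread:", np.ptp(kappa), "torsion spread:", np.ptp(tau))
```

Both arrays come out independent of the position along the curve, and they agree with the closed forms R/(R^2 + c^2) and c/(R^2 + c^2). A stretched helix (large pitch at fixed radius) therefore has small curvature relative to its torsion.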
In a series of subsequent works it was shown that a predominant number of secondary structures found in real proteins can be fit with sub-angstrom accuracy by an effective model with an order of magnitude fewer parameters than the conventional coarse-grained models.

In the papers using the effective theory approach the proteins were described by a discrete version of the model, owing to the fact that proteins are discrete chains with no translational symmetry. In this letter we discuss the continuous Abelian Higgs model and explain a number of universal features of proteins that this model predicts. We will claim that the phenomenology is based on a two-minimum potential, whose ground state configurations are helices. Beta strands can appear as metastable configurations or as solitons, and their abundance is controlled by one of the parameters of the model. Moreover, we establish several relations between the geometry of different secondary structure elements. In the end we comment on how discreteness and inhomogeneity of proteins can be accounted for in generalizations of the present model. The various details of the analysis summarized in this letter appeared in a companion paper.

Following Refs. 1,2 we will use curvature $\kappa$ and torsion $\tau$ as the effective fields of the large-scale dynamics of the protein molecules. We will consider them as functions of the (arc) length parameter ($s$) of the curves. For these fields we will write an effective energy functional, cf. Refs. 11,12.

A minimal model necessary for reproducing the features of the protein molecules, relevant for the present discussion, is the one-dimensional Abelian Higgs model with a Chern-Simons term and a Proca mass term:

$$
\begin{aligned}
E ={}& \frac{1}{2}\int_0^L ds\,\left(|\nabla\hat\kappa|^2 - m^2|\hat\kappa|^2 + \lambda|\hat\kappa|^4\right) \\
& - F\int_0^L ds\,\hat\tau + \frac{1}{2}\int_0^L ds\,\epsilon^2\left(\hat\tau - \eta'\right)^2 .
\end{aligned}
\tag{1}
$$

Here the complex $\hat\kappa = \kappa e^{i\eta}$ and the real $\hat\tau$ denote fields subject to local gauge transformations, so that the physical gauge-invariant curvature and torsion are $\kappa = |\hat\kappa|$ and $\tau = \hat\tau - \eta'$. The quantities $m$, $\lambda$, $F$ and $\epsilon$ are phenomenological parameters, which can be fixed by comparing this model with real proteins. $L$ is the length of the curve, which will later be considered infinite.

We note that the first term in the second line of Eq. (1), representing the one-dimensional Chern-Simons action, is not gauge invariant in the case of finite open curves. Consequently, the model possesses physical edge modes corresponding to the choice of normal vectors at the endpoints of the curves.

The torsion field enters quadratically in the energy functional and can be integrated out assuming

$$
\tau = \frac{F}{\kappa^2 + \epsilon^2}. \tag{2}
$$

We also note that all the equations depend on the gauge-invariant quantities $\tau$ and $\kappa$. This situation is equivalent to the Higgs mechanism of the spontaneous breaking of the original $U(1)$ gauge theory. The reduced gauge-invariant energy functional takes the following form,

$$
E = \frac{1}{2}\int_0^L ds\,\left(\kappa'^2 - m^2\kappa^2 + \lambda\kappa^4 - \frac{F^2}{\kappa^2 + \epsilon^2}\right), \tag{3}
$$

with a residual $Z_2$ symmetry $\kappa \to -\kappa$.

The first feature of the model is the special relation (2) between the curvature and the torsion, which emphasizes the importance of the Chern-Simons term for the solutions with non-zero torsion. The second feature is a non-local effective potential for the curvature field obtained upon integration of the torsion. To highlight other features we discuss the classical minimum energy configurations of this theory.

FIG. 1: (Color online) Customary secondary structure representation of a piece of a protein chain. The cyan colored spirals are alpha helices and the twisted purple ribbon is a beta strand, here sandwiched between two helices. The image is produced with the help of the PyMOL software.

It is convenient to rewrite the derivative-free part of Eq. (3) in terms of a potential,

$$
V(\kappa) = \frac{\lambda\,(\kappa^2 - \kappa_0^2)^2(\kappa^2 + \kappa_1^2)}{2(\kappa^2 + \epsilon^2)}, \tag{4}
$$

with another set of phenomenological parameters $(\lambda, \kappa_0, \kappa_1, \epsilon)$. As explained in Ref. 10, the most interesting scenario corresponds to $0 \le \kappa_1 \le \epsilon$. Let us review the properties of this potential in the limit of infinite curves. Finite size effects are important for the stability of possible configurations. They are discussed in some detail in Ref. 10.

Static solutions in the infinite length model include a true ground state with constant $\kappa = \kappa_0$ and $\tau = \tau_0$, stable kink-like solutions interpolating between the minima $\kappa_0$ and $-\kappa_0$, local energy minimum solutions $\kappa = 0$, and unstable sphalerons of the local energy minimum. The last two solutions are only present if $\kappa_1^2 \le \kappa_0^2\epsilon^2/(\kappa_0^2 + 2\epsilon^2)$. The sphalerons are classical bounce solutions of the particle motion in an inverted potential. They characterize the height of the potential barrier separating the true and the false ground states.

These solutions have a natural interpretation in terms of the observed secondary structures of proteins. The constant curvature, $\kappa = \kappa_0$, constant torsion curves are helices, which are the most abundant regular structures (for example, alpha helices). The configurations with zero curvature (but non-zero torsion) can be compared with beta strands in proteins, that is, zigzag-like configurations of the backbone chain, which are typically visualized by quasi-straight and slightly twisted ribbons as in figure 1. The kinks interpolating between global minima are loops (structural motifs) connecting pairs of helices. Finally, sphalerons are non-zero curvature segments connecting two straight pieces. They are unstable in the model with translational symmetry, but can be stabilized by finite size and discretization effects. Thus they can be compared with higher curvature loops (hairpins) connecting the beta strands.

FIG. 2: (Color online) Distribution of the curvature and torsion pairs; the axes are $\kappa$ and $|\tau|$, and the legend entries are helix, 2pne, Rstrand and Lstrand. The color identification is blue for the alpha helices, green for the right-handed beta strands, red for the left-handed strands and orange for the strands of the 2pne protein. The green curve is a fit (2) for the helix and strand points (excluding 2pne). The blue and orange curves are the same fits, but assuming $\epsilon = 0$, for the helices and for the 2pne.

At the next level of the comparison of the model with the protein phenomenology, we can estimate the values of the parameters in the effective energy functional. In Ref. 10 we fitted the positions of the C$_\alpha$ atoms in the helices with continuous curves to obtain the following estimates for the curvature and torsion:

$$
\kappa_0 \simeq \ldots\ \text{Å}^{-1}, \qquad \tau_0 \simeq 0.15\ \text{Å}^{-1}. \tag{5}
$$

One can then ask whether relation (2) is satisfied in real proteins. The relation roughly tells us that structures with lower curvature should have higher torsion. Alpha helices have a rather narrow distribution around the values of Eq. (5), so we checked whether beta strands can be fit in this picture. One can think of a ribbon, such as the one representing the beta strand in figure 1, as a stretched helix with $\kappa/\tau \ll 1$. By fitting the strands as such stretched helices we obtained the distribution shown in figure 2. We stress that in the case of the beta strands we measured the torsion of the ribbon, rather than that of the backbone chain.

We find that strands can have both positive and negative torsion, as shown in figure 2. Moreover, there are some special strands that do not fit the present model, or rather the universality class of the specific relation (2). We found such strands in the 2pne protein. The strands of that protein are bona fide left helices, but with non-standard curvature and torsion.

FIG. 3: (Color online) Size of the kink interpolating between two global minima in potential (4); the axes are $\kappa_1$ (Å$^{-1}$) and $L_{\rm loop}$ (Å). The curves correspond to the sections containing 50% (blue) and 90% (orange) of the soliton energy. The inset shows the shape of the curve for $\kappa_1 = 0.\ldots$ Å$^{-1}$. The part of the curve highlighted in red shows the piece contributing 50% of the energy.

Figure 2 is similar in spirit to the famous Ramachandran plots, cast in a form that allows one to extract the information about relation (2). As in the Ramachandran plots, the distribution of the beta strands is much less localized in comparison to alpha helices, but it is compatible with Eq. (2). In particular, it is consistent with the non-zero Proca mass, which acts as an IR regulator in potential (4). By fitting the data in figure 2 we can obtain the average values of the parameters $F$ and $\epsilon$:

$$
F = 0.70\ \text{Å}^{-3}, \qquad \epsilon = 1.\ldots\ \text{Å}^{-1}. \tag{6}
$$

With $\kappa_0$, $F$ and $\epsilon$ fixed there remains only one free parameter in the model. We can choose it to be $\lambda$ or $\kappa_1$. The remaining parameter controls the size of the solitons. In figure 3 we show how the size of the kink depends on $\kappa_1$. In general, we see a logarithmic decrease of the loop size with increasing $\kappa_1$. For small $\kappa_1$ one observes long loops, which correspond to a characteristic step appearing in the kink solution, as can be seen in figure 4. In the limit $\kappa_1 \to 0$, the global and the local minima become degenerate and the original kink splits into a pair of kinks with infinite separation.

The step appearing in the kink solution in figure 4 is a piece of the curve with much lower curvature (see also the inset in figure 3), which should be interpreted in terms of the configuration shown in figure 1: a pair of alpha helices separated by a beta strand. One can make the following estimate of the size of the small curvature region, assuming that it is characterized by $|\kappa| \ll \kappa_1 \ll \kappa_0$,

$$
R_\beta \simeq \int_{-\kappa_1}^{\kappa_1} \frac{dk}{\sqrt{\lambda}\,\kappa_0^2}\,\frac{\epsilon}{\sqrt{k^2 + \kappa_1^2}} \simeq \frac{\epsilon^2(\kappa_0^2 + \epsilon^2)}{F\kappa_0^2}, \tag{7}
$$

where we assumed the relation between the parameters $\lambda$ and $F$ following from two different parameterizations of the potential. Note that this value does not depend on $\kappa_1$. The estimate gives the numerical value $R_\beta \simeq 12$ Å, which is a prediction of the universal length of the beta strand in the model. Beta strands can appear longer, because they have an attached intermediate region, whose size depends logarithmically on $\kappa_1$.

FIG. 4: (Color online) Kink solutions of the model with potential (4); the axes are $s$ (Å) and $\kappa$ (Å$^{-1}$), and the legend lists three values of $\kappa_1$. For small $\kappa_1 \to$ […]

On the other hand, if $\kappa_1$ is not much smaller than $\kappa_0$, the step is not formed and configurations like the one in figure 1 are not possible. Hence $\kappa_1$ can be viewed as an external parameter, like a chemical potential, controlling the ability of the protein to form beta strands. This chemical potential can either be a characteristic of the medium in which the protein is present, or of the amino acid composition of the backbone chain.

It is interesting to discuss what happens in the limit $\kappa_1 \to$ […] $\kappa = 0$. Such structures become more stable in the $\kappa_1 \to$ […] $\kappa = 0$ segments. In other words, it is plausible that longer chains of the beta strands are formed in that regime. The sphalerons might also stabilize in the discrete case. They would introduce loops of higher curvature connecting beta strands (similar to the known hairpin motifs in proteins).

The continuous model can be deformed to account for the breaking of the translational invariance in actual proteins. Apart from considering finite curves, one can introduce coordinate dependence of the parameters. Besides the discrete periodicity of the backbone chain, this could also take into account the local inhomogeneity of the chemical structure. It is then natural to fit the continuous curves of real proteins using a Fourier expansion. We hope to discuss such generalizations in a future work.

Acknowledgements

DM would like to thank Antti Niemi and Ara Sedrakyan for many useful discussions on the application of effective field theory and topology to protein physics. DM is also grateful to Dionisio Bazeia, Sergei Brazovskii and the participants of the workshop "Physics and Biology of Proteins" held at the International Institute of Physics in Natal in June 2017 for interesting ideas and discussions. This work was supported by the grant No. 16-12-10344 of the Russian Science Foundation.

[1] U. H. Danielsson, M. Lundgren and A. J. Niemi, Phys. Rev. E (2010) 021910 [arXiv:0902.2920 [cond-mat.stat-mech]].
[2] M. Chernodub, S. Hu and A. J. Niemi, Phys. Rev. E (2010) 011916 [arXiv:1003.4481 [bio-ph]].
[3] N. Molkenthin, S. Hu and A. J. Niemi, Phys. Rev. Lett. (2011) 078102.
[4] S. Hu, A. Krokhotin, A. J. Niemi and X. Peng, Phys. Rev. E (2011) 041907.
[5] S. Hu, M. Lundgren and A. J. Niemi, Phys. Rev. E (2011) 061908 [arXiv:1102.5658 [q-bio.BM]].
[6] A. Krokhotin, A. J. Niemi and X. Peng, Phys. Rev. E (2012) 031906.
[7] X. Peng, A. Chenani, S. Hu, Y. Zhou and A. J. Niemi, BMC Struct. Biol. (2014) 27.
[8] A. J. Niemi, Theoret. Math. Phys. (1) (2014) 1235.
[9] A. Molochkov, A. Begun and A. Niemi, EPJ Web Conf. (2017) 04004 [arXiv:1703.04263 [q-bio.BM]].
[10] D. Melnikov and A. B. F. Neves, "Chern-Simons-Higgs Model as a Theory of Protein Molecules". This paper was submitted to arXiv on August 9, 2019. As of August 30, it is still on hold by the arXiv moderators.
[11] S. Hu, Y. Jiang and A. J. Niemi, Phys. Rev. D (2013) 105011.
[12] I. Gordeli, D. Melnikov, A. Niemi and A. Sedrakyan, Phys. Rev. D (2016) no. 2, 021701 [arXiv:1508.03268 [hep-th]].
[13] F. R. Klinkhamer and N. S. Manton, Phys. Rev. D (1984) 2212.
[14] V. A. Rubakov, "Classical Theory of Gauge Fields," Princeton University Press, 2002.
[15] The PyMOL Molecular Graphics System, Version 2.0, Schrödinger, LLC.
[16] G. N. Ramachandran, C. Ramakrishnan and V. Sasisekharan, Journal of Molecular Biology (1) (1963) 95.
[17] C. Ramakrishnan and G. N. Ramachandran, Biophys. J. (1965) 909.
[18] S. C. Lovell, I. W. Davis, W. B. Arendall, P. I. de Bakker, J. M. Word, M. G. Prisant, J. S. Richardson and D. C. Richardson, Proteins (2003) 437.
[19] N. H. Christ and T. D. Lee, Phys. Rev. D (1975) 1606.
[20] D. Bazeia and L. Losano, Phys. Rev. D (2006) 025016 [hep-th/0511193].
[21] D. Bazeia, M. A. Liao and M. A. Marques, arXiv:1908.01085 [hep-th].
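As a numerical companion to the two-minimum potential (4), the following sketch evaluates V(kappa), tests the quoted condition for a local minimum at kappa = 0, and estimates how the extent of a kink changes with kappa_1. It assumes our reading of Eqs. (3) and (4), namely V = lambda (kappa^2 - kappa_0^2)^2 (kappa^2 + kappa_1^2) / (2 (kappa^2 + epsilon^2)). The helper `matched_couplings` is our own coefficient matching between the two parameterizations, not a formula quoted from the letter. `kink_extent` measures the kink by a curvature range rather than by the energy fraction used in figure 3. All function names and parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def potential(kappa, lam, kappa0, kappa1, eps):
    """Two-minimum potential V(kappa), as we read Eq. (4)."""
    k2 = np.asarray(kappa, dtype=float) ** 2
    return lam * (k2 - kappa0**2) ** 2 * (k2 + kappa1**2) / (2.0 * (k2 + eps**2))


def has_local_minimum_at_zero(kappa0, kappa1, eps):
    """Condition kappa1^2 <= kappa0^2 eps^2 / (kappa0^2 + 2 eps^2) from the text."""
    return kappa1**2 <= kappa0**2 * eps**2 / (kappa0**2 + 2.0 * eps**2)


def matched_couplings(lam, kappa0, kappa1, eps):
    """(m^2, F^2) for which the integrand of Eq. (3) and Eq. (4) differ by a constant.

    Obtained by matching polynomial coefficients in kappa^2. This is our own
    derivation under the assumed forms of Eqs. (3) and (4).
    """
    m2 = lam * (eps**2 + 2.0 * kappa0**2 - kappa1**2)
    f2 = lam * (kappa0**2 + eps**2) ** 2 * (eps**2 - kappa1**2)
    return m2, f2


def kink_extent(lam, kappa0, kappa1, eps, fraction=0.9, n=200001):
    """Arc length over which a kink runs from -fraction*kappa0 to +fraction*kappa0.

    Uses the first-order equation kappa' = sqrt(2 V), so ds = dkappa / sqrt(2 V),
    integrated with a plain trapezoid rule. Requires kappa1 > 0 and fraction < 1,
    otherwise the integrand diverges.
    """
    k = np.linspace(-fraction * kappa0, fraction * kappa0, n)
    integrand = 1.0 / np.sqrt(2.0 * potential(k, lam, kappa0, kappa1, eps))
    dk = k[1] - k[0]
    return dk * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))


if __name__ == "__main__":
    # Arbitrary dimensionless sample values, not the fitted parameters.
    lam, kappa0, eps = 1.0, 1.0, 2.0
    for kappa1 in (0.05, 0.2, 0.9):
        print(
            kappa1,
            has_local_minimum_at_zero(kappa0, kappa1, eps),
            kink_extent(lam, kappa0, kappa1, eps),
        )
```

With these sample values the kink becomes longer as kappa_1 decreases, in line with the long loops described for small kappa_1. The local minimum at kappa = 0 disappears once kappa_1 exceeds the quoted bound.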