A Continuous Effective Model of the Protein Dynamics
IITEP-TH-nn/yy
Dmitry Melnikov (1, 2) and Alyson B. F. Neves (3, 4)

(1) International Institute of Physics, Federal University of Rio Grande do Norte, Campus Universitário, Lagoa Nova, Natal-RN 59078-970, Brazil
(2) Institute for Theoretical and Experimental Physics, B. Cheremushkinskaya 25, Moscow 117218, Russia
(3) Department of Theoretical and Experimental Physics, Federal University of Rio Grande do Norte, Campus Universitário, Lagoa Nova, Natal-RN 59078-970, Brazil
(4) Federal University of Maranhão, Campus Balsas, rua José Leão 484, Balsas-MA 65800-000, Brazil
The theory of elastic rods can be used to describe certain geometric and topological properties of DNA molecules. A similar effective field theory approach was previously suggested to describe the conformations and dynamics of proteins. In this letter we report a detailed study of the basic features of a version of the proposed model, which assumes proteins to be very long continuous curves. In the most appealing case, the model is based on a potential with a pair of minima corresponding to helical and strand-like configurations of the curves. It allows one to derive several predictions about the geometric features of the molecules, and we show that the predictions are compatible with the phenomenology. While the helices represent the ground state configurations, the abundance of beta strands is controlled by a parameter, which can either completely suppress their presence in a molecule or make them abundant. The few-parameter model investigated in the letter should rather be viewed as representing a universality class of protein molecules. Generalizations accounting for the discrete nature and inhomogeneity of the molecules presumably allow one to model realistic cases.
Proteins are long quasi one-dimensional chains of amino acids. At the secondary structure level these chains organize themselves in several characteristic configurations, such as helices and beta sheets, which appear due to hydrogen bonds forming between atoms of beyond-nearest-neighbor amino acids. Consequently, at this level, proteins can be viewed as a sequence of helices and beta strands (elements that form the pleated beta sheets) joined by less regular connections called loops and turns.

In natural conditions, the arrays of the secondary structures further fold to make space-filling configurations, the native states of the proteins, which define their biological function. Predicting the shape of the native states from the sequence of amino acids is a very important but formidable task. Due to the large number of degrees of freedom in this problem, current all-atom computer simulations can fold only relatively short chains, and even then in a time significantly longer than it takes a natural protein to do so. Different techniques are implemented to effectively reduce the number of degrees of freedom. Although the latter simulations are more efficient, they suffer from a large number of input parameters, which restricts their predictive power.

In Refs. 1,2 it was proposed to apply an effective field theory approach to proteins, constructing effective energy functionals for 3D curves. It was argued that a natural description is in terms of a gauge theory that is a low-dimensional analog of the Abelian Higgs model. Such a model would reproduce helices as ground state configurations and loops as solitons interpolating between the ground states.
In a series of subsequent works it was shown that a predominant number of secondary structures found in real proteins can be fit with sub-angstrom accuracy by an effective model with an order of magnitude fewer parameters than the conventional coarse-grained models.

In the papers using the effective theory approach, the proteins were described by a discrete version of the model, owing to the fact that proteins are discrete chains with no translational symmetry. In this letter we discuss the continuous Abelian Higgs model and explain a number of universal features of proteins that this model predicts. We will claim that the phenomenology is based on a two-minimum potential, whose ground state configurations are helices. Beta strands can appear as metastable configurations or as solitons, and their abundance is controlled by one of the parameters of the model. Moreover, we establish several relations between the geometry of different secondary structure elements. In the end we comment on how discreteness and inhomogeneity of proteins can be accounted for in generalizations of the present model. The various details of the analysis summarized in this letter appeared in a companion paper.

Following Refs. 1,2 we will use the curvature κ and the torsion τ as the effective fields of the large-scale dynamics of the protein molecules. We will consider them as functions of the arc length parameter s of the curves. For these fields we will write an effective energy functional, cf. Refs. 11,12.

A minimal model necessary for reproducing the features of the protein molecules relevant for the present discussion is the one-dimensional Abelian Higgs model with a Chern-Simons term and a Proca mass term:

$$
\begin{aligned}
E ={}& \frac{1}{2}\int_0^L ds\,\left(|\nabla\hat\kappa|^2 - m^2|\hat\kappa|^2 + \lambda|\hat\kappa|^4\right) \\
& - F\int_0^L ds\,\hat\tau + \frac{1}{2}\int_0^L ds\,\epsilon^2\,(\hat\tau-\eta')^2 .
\end{aligned}
\tag{1}
$$

Here the complex field κ̂ = κ exp(iη) and the real field τ̂ are subject to local gauge transformations, so that the physical gauge-invariant curvature and torsion are κ = |κ̂| and τ = τ̂ − η′. The quantities m, λ, F and ε are phenomenological parameters, which can be fixed by comparing this model with real proteins. L is the length of the curve, which will later be considered infinite.

We note that the first term in the second line of Eq. (1), representing the one-dimensional Chern-Simons action, is not gauge invariant in the case of finite open curves. Consequently, the model possesses physical edge modes corresponding to the choice of normal vectors at the endpoints of the curves.

The torsion field enters quadratically in the energy functional and can be integrated out assuming

$$
\tau = \frac{F}{\kappa^2+\epsilon^2} .
\tag{2}
$$

We also note that all the equations depend on the gauge-invariant quantities τ and κ. This situation is equivalent to the Higgs mechanism of the spontaneous breaking of the original U(1) gauge symmetry. The reduced gauge-invariant energy functional takes the following form,

$$
E = \frac{1}{2}\int_0^L ds\,\left(\kappa'^2 - m^2\kappa^2 + \lambda\kappa^4 - \frac{F^2}{\kappa^2+\epsilon^2}\right),
\tag{3}
$$

with a residual Z₂ symmetry κ → −κ.

The first feature of the model is the special relation (2) between the curvature and the torsion, which emphasizes the importance of the Chern-Simons term for the solutions with non-zero torsion. The second feature is a non-local effective potential for the curvature field obtained upon integration of the torsion. To highlight other features we discuss the classical minimum energy configurations of this theory.

FIG. 1: (Color online) Customary secondary structure representation of a piece of a protein chain. The cyan colored spirals are alpha helices and the twisted purple ribbon is a beta strand, here sandwiched between two helices. The image is produced with the help of the PyMOL software.

It is convenient to rewrite the derivative-free part of Eq.
(3) in terms of a potential,

$$
V(\kappa) = \frac{\lambda\,(\kappa^2-\kappa_0^2)^2\,(\kappa^2+\kappa_1^2)}{2\,(\kappa^2+\epsilon^2)} ,
\tag{4}
$$

with another set of phenomenological parameters (λ, κ₀, κ₁, ε). As explained in Ref. 10, the most interesting scenario corresponds to 0 ≤ κ₁ ≤ ε. Let us review the properties of this potential in the limit of infinite curves. Finite size effects are important for the stability of possible configurations. They are discussed in some detail in Ref. 10.

Static solutions in the infinite length model include a true ground state with constant κ = κ₀ and τ = τ₀, stable kink-like solutions interpolating between the minima κ₀ and −κ₀, local energy minimum solutions κ = 0, and unstable sphalerons of the local energy minimum. The last two solutions are only present if κ₁² ≤ κ₀²ε²/(κ₀² + 2ε²). The sphalerons are classical bounce solutions of the particle motion in an inverted potential. They characterize the height of the potential barrier separating the true and the false ground states.

These solutions have a natural interpretation in terms of the observed secondary structures of proteins. The constant curvature, κ = κ₀, constant torsion curves are helices, which are the most abundant regular structures (for example, alpha helices). The configurations with zero curvature (but non-zero torsion) can be compared with beta strands in proteins: zigzag-like configurations of the backbone chain, which are typically visualized by quasi-straight and slightly twisted ribbons as in figure 1. The kinks interpolating between global minima are loops (structural motifs) connecting pairs of helices. Finally, sphalerons are non-zero curvature segments connecting two straight pieces. They are unstable in the model with translational symmetry, but can be stabilized by finite size and discretization effects. Thus they can be compared with higher curvature loops (hairpins) connecting the beta strands.

[Figure 2 plot: |τ| versus κ; legend entries: helix, 2pne, Rstrand, Lstrand.]
FIG. 2: (Color online) Distribution of the curvature and torsion pairs. The color identification is blue for the alpha helices, green for the right-handed beta strands, red for the left-handed strands and orange for the strands of the 2pne protein. The green curve is a fit of relation (2) to the helix and strand points (excluding 2pne). The blue and orange curves are the same fits, but assuming ε = 0, for the helices and for 2pne.

At the next level of the comparison of the model with the protein phenomenology, we can estimate the values of the parameters in the effective energy functional. In Ref. 10 we fitted the positions of the Cα atoms in the helices with continuous curves to obtain the following estimates for the curvature and torsion:

$$
\kappa_0 \simeq \ldots\ \text{Å}^{-1} , \qquad \tau_0 \simeq 0.15\ \text{Å}^{-1} .
\tag{5}
$$

One can then ask whether relation (2) is satisfied in real proteins. The relation roughly tells us that structures with lower curvature should have higher torsion. Alpha helices have a rather narrow distribution around the values of Eq. (5), so we checked whether beta strands can be fit into this picture. One can think of a ribbon, such as the one representing the beta strand in figure 1, as a stretched helix with κ/τ ≪
1. By fitting the strands as such stretched helices we obtained the distribution shown in figure 2. We stress that in the case of the beta strands we measured the torsion of the ribbon, rather than that of the backbone chain.

We find that strands can have both positive and negative torsion, as shown in figure 2. Moreover, there are some special strands that do not fit the present model, or rather the universality class of the specific relation (2). We found such strands in the 2pne protein. The strands of that protein are bona fide left-handed helices, but with non-standard curvature and torsion.

[Figure 3 plot: loop size L_loop (Å) versus κ₁ (Å⁻¹), with an inset showing the shape of the curve.]
FIG. 3: (Color online) Size of the kink interpolating between two global minima in potential (4). The curves correspond to the sections containing 50% (blue) and 90% (orange) of the soliton energy. The inset shows the shape of the curve for κ₁ = 0.… Å⁻¹. The part of the curve highlighted in red shows the piece contributing 50% of the energy.

Figure 2 is similar in spirit to the famous Ramachandran plots, cast in a form that allows one to extract information about relation (2). As in the Ramachandran plots, the distribution of the beta strands is much less localized in comparison to that of the alpha helices, but it is compatible with Eq. (2). In particular, it is consistent with the non-zero Proca mass, which acts as an IR regulator in potential (4). By fitting the data in figure 2 we can obtain the average values of the parameters F and ε:

$$
F = 0.70\ \text{Å}^{-3} , \qquad \epsilon = 1.\ldots\ \text{Å}^{-1} .
\tag{6}
$$

With κ₀, F and ε fixed there remains only one free parameter in the model. We can choose it to be λ or κ₁. The remaining parameter controls the size of the solitons. In figure 3 we show how the size of the kink depends on κ₁. In general, we see a logarithmic decrease of the loop size with increasing κ₁. For small κ₁ one observes long loops, which correspond to a characteristic step appearing in the kink solution, as can be seen in figure 4. In the limit κ₁ →
0, the global and the local minima become degenerate and the original kink splits into a pair of kinks with infinite separation.

The step appearing in the kink solution in figure 4 is a piece of the curve with much lower curvature (see also the inset in figure 3), which should be interpreted in terms of the configuration shown in figure 1: a pair of alpha helices separated by a beta strand. One can make the following estimate of the size of the small curvature region, assuming that it is characterized by |κ| ≪ κ₁ ≪ κ₀,

$$
R_\beta \simeq \int_{-\kappa_1}^{\kappa_1} \frac{\epsilon\,dk}{\sqrt{\lambda}\,\kappa_0^2\,\sqrt{k^2+\kappa_1^2}} \simeq \frac{\epsilon^2\,(\kappa_0^2+\epsilon^2)}{F\,\kappa_0^2} ,
\tag{7}
$$

where we assumed the relation between the parameters λ and F following from the two different parameterizations of the potential. Note that this value does not depend on κ₁. The estimate gives the numerical value R_β ≃ 12 Å, which is a prediction of the universal length of the beta strand in the model. Beta strands can appear longer, because they have an attached intermediate region, whose size depends logarithmically on κ₁.

[Figure 4 plot: κ (Å⁻¹) versus s (Å) for three values of κ₁.]
FIG. 4: (Color online) Kink solutions of the model with potential (4). For small κ₁ → …

On the other hand, if κ₁ is not much smaller than κ₀, the step is not formed and configurations like the one in figure 1 are not possible. Hence κ₁ can be viewed as an external parameter, like a chemical potential, controlling the ability of the protein to form beta strands. This chemical potential can be a characteristic either of the medium in which the protein is present, or of the amino acid composition of the backbone chain.

It is interesting to discuss what happens in the limit κ₁ → … κ = 0. Such structures become more stable in the κ₁ → … κ = 0 segments. In other words, it is plausible that longer chains of beta strands are formed in that regime. The sphalerons might also stabilize in the discrete case. They would introduce loops of higher curvature connecting beta strands (similar to the known hairpin motifs in proteins).

The continuous model can be deformed to account for the breaking of the translational invariance in actual proteins. Apart from considering finite curves, one can introduce coordinate dependence of the parameters. Besides the discrete periodicity of the backbone chain, this could also take into account the local inhomogeneity of the chemical structure. It is then natural to fit the continuous curves of real proteins using a Fourier expansion. We hope to discuss such generalizations in a future work.

Acknowledgements
DM would like to thank Antti Niemi and Ara Sedrakyan for many useful discussions on the application of effective field theory and topology to protein physics. DM is also grateful to Dionisio Bazeia, Sergei Brazovskii and the participants of the workshop “Physics and Biology of Proteins” held at the International Institute of Physics in Natal in June 2017 for interesting ideas and discussions. This work was supported by the grant No. 16-12-10344 of the Russian Science Foundation.

[1] U. H. Danielsson, M. Lundgren and A. J. Niemi, Phys. Rev. E (2010) 021910 [arXiv:0902.2920 [cond-mat.stat-mech]].
[2] M. Chernodub, S. Hu and A. J. Niemi, Phys. Rev. E (2010) 011916 [arXiv:1003.4481 [bio-ph]].
[3] N. Molkenthin, S. Hu and A. J. Niemi, Phys. Rev. Lett. (2011) 78102.
[4] S. Hu, A. Krokhotin, A. J. Niemi and X. Peng, Phys. Rev. E (2011) 041907.
[5] S. Hu, M. Lundgren and A. J. Niemi, Phys. Rev. A (2011) 061908 [arXiv:1102.5658 [q-bio.BM]].
[6] A. Krokhotin, A. J. Niemi and X. Peng, Phys. Rev. E (2012) 031906.
[7] X. Peng, A. Chenani, S. Hu, Y. Zhou and A. J. Niemi, BMC Struct. Biol. (2014) 27.
[8] A. J. Niemi, Theoret. Math. Phys. (1) (2014) 1235.
[9] A. Molochkov, A. Begun and A. Niemi, EPJ Web Conf. (2017) 04004 [arXiv:1703.04263 [q-bio.BM]].
[10] D. Melnikov and A. B. F. Neves, “Chern-Simons-Higgs Model as a Theory of Protein Molecules”. This paper was submitted to arXiv on August 9, 2019. As of August 30, it is still on hold by the arXiv moderators.
[11] S. Hu, Y. Jiang and A. J. Niemi, Phys. Rev. D (2013) 105011.
[12] I. Gordeli, D. Melnikov, A. Niemi and A. Sedrakyan, Phys. Rev. D (2016) no.2, 021701 [arXiv:1508.03268 [hep-th]].
[13] F. R. Klinkhamer and N. S. Manton, Phys. Rev. D (1984) 2212.
[14] V. A. Rubakov, “Classical Theory of Gauge Fields,” Princeton University Press, 2002.
[15] The PyMOL Molecular Graphics System, Version 2.0, Schrödinger, LLC.
[16] G. N. Ramachandran, C. Ramakrishnan and V. Sasisekharan, Journal of Molecular Biology (1) (1963) 95.
[17] C. Ramakrishnan and G. N. Ramachandran, Biophys. J. (1965) 909.
[18] S. C. Lovell, I. W. Davis, W. B. Arendall, P. I. de Bakker, J. M. Word, M. G. Prisant, J. S. Richardson and D. C. Richardson, Proteins (2003) 437.
[19] N. H. Christ and T. D. Lee, Phys. Rev. D (1975) 1606.
[20] D. Bazeia and L. Losano, Phys. Rev. D (2006) 025016 [hep-th/0511193].
[21] D. Bazeia, M. A. Liao and M. A. Marques, arXiv:1908.01085 [hep-th].
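
As a supplement to the discussion of potential (4), relation (2) and estimate (7), here is a minimal numerical sketch for readers who want to experiment with the model. It assumes the reading V(κ) = λ(κ² − κ₀²)²(κ² + κ₁²)/[2(κ² + ε²)] of Eq. (4) and τ = F/(κ² + ε²) of Eq. (2). Under this reading, matching Eq. (4) to the derivative-free part of Eq. (3) gives m² = λ(2κ₀² + ε² − κ₁²) and F² = λ(κ₀² + ε²)²(ε² − κ₁²). All function names and parameter values below are illustrative assumptions, not the fitted values of Eqs. (5) and (6).

```python
import math


def potential(kappa, lam, k0, k1, eps):
    """Potential V(kappa) in the assumed reading of Eq. (4)."""
    x = kappa * kappa
    return lam * (x - k0**2) ** 2 * (x + k1**2) / (2.0 * (x + eps**2))


def torsion(kappa, F, eps):
    """Torsion tied to curvature by the assumed reading of relation (2)."""
    return F / (kappa * kappa + eps**2)


def mass_squared(lam, k0, k1, eps):
    """m^2 from matching Eq. (4) to Eq. (3) (assumed normalization)."""
    return lam * (2.0 * k0**2 + eps**2 - k1**2)


def coupling_squared(lam, k0, k1, eps):
    """F^2 from the same matching; non-negative only for k1 <= eps."""
    return lam * (k0**2 + eps**2) ** 2 * (eps**2 - k1**2)


def reduced_density(kappa, lam, k0, k1, eps):
    """Half of the derivative-free part of Eq. (3), written via (lam, k0, k1, eps)."""
    x = kappa * kappa
    m2 = mass_squared(lam, k0, k1, eps)
    f2 = coupling_squared(lam, k0, k1, eps)
    return 0.5 * (-m2 * x + lam * x * x - f2 / (x + eps**2))


def has_false_vacuum(k0, k1, eps):
    """True when kappa = 0 is a local minimum of V (a strand-like state exists)."""
    return k1**2 <= k0**2 * eps**2 / (k0**2 + 2.0 * eps**2)


def strand_integral(k1, steps=20000):
    """Midpoint-rule estimate of the integral of dk / sqrt(k^2 + k1^2) over [-k1, k1]."""
    h = 2.0 * k1 / steps
    total = 0.0
    for i in range(steps):
        k = -k1 + (i + 0.5) * h
        total += h / math.sqrt(k * k + k1 * k1)
    return total


if __name__ == "__main__":
    # Hypothetical values: k0, k1, eps in inverse angstroms, lam dimensionless.
    lam, k0, k1, eps = 1.0, 0.4, 0.05, 1.5
    F = math.sqrt(coupling_squared(lam, k0, k1, eps))
    print("torsion at kappa = k0:", torsion(k0, F, eps))
    print("torsion at kappa = 0: ", torsion(0.0, F, eps))
    print("false vacuum at kappa = 0:", has_false_vacuum(k0, k1, eps))
    print("strand integral:", strand_integral(k1), "vs 2*asinh(1) =", 2.0 * math.asinh(1.0))
```

Two points are worth noting. First, `reduced_density` and `potential` differ only by a κ-independent constant, which is the sense in which Eq. (4) rewrites Eq. (3). Second, the integral in estimate (7) equals 2 asinh(1) ≈ 1.76 for any κ₁, so the κ₁-independence of R_β is exact at this level of approximation and the ≃ sign absorbs an order-one factor.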