The role of hydrophobic interactions in folding of β-sheets
Jiacheng Li, Xiaoliang Ma, Hongchi Zhang, Chengyu Hou, Liping Shi, Shuai Guo, Chenchen Liao, Bing Zheng, Lin Ye, Lin Yang, Xiaodong He
Jiacheng Li a,1, Xiaoliang Ma a,1, Hongchi Zhang a,1, Chengyu Hou b,1, Liping Shi a, Shuai Guo a, Chenchen Liao b, Bing Zheng c, Lin Ye d, Lin Yang a,d,*, Xiaodong He a,e
a National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin 150080, China
b School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150080, China
c Key Laboratory of Functional Inorganic Material Chemistry (Ministry of Education) and School of Chemistry and Materials Science, Heilongjiang University, Harbin 150001, P. R. China
d School of Aerospace, Mechanical and Mechatronic Engineering, The University of Sydney, NSW 2006, Australia
e Shenzhen STRONG Advanced Materials Research Institute Co., Ltd, Shenzhen 518035, P. R. China
Exploring the protein-folding problem has been a long-standing challenge in molecular biology. Protein folding depends strongly on the folding of secondary structures, which paves the way for a native folding pathway. Here, we demonstrate that a large hydrophobic surface area covering most side-chains on one side or the other of adjacent β-strands of a β-sheet is prevalent in almost all experimentally determined β-sheets. This indicates that folding of β-sheets is most likely triggered by multistage hydrophobic interactions among neighboring side-chains of unfolded polypeptides, enabling β-sheets to fold reproducibly following explicit physical folding codes in aqueous environments. β-turns often contain five types of residues characterized by relatively small exposed hydrophobic proportions of their side-chains, which we explain by the ability of these residues to block the hydrophobic effect among side-chains neighboring in sequence. The temperature dependence of β-sheet folding is thus attributed to the temperature dependence of the strength of the hydrophobic effect. The hydrophobic-effect-based mechanism of β-sheet folding is verified by bioinformatics analyses of thousands of experimental results. The folding codes in the amino acid sequence that dictate formation of a β-hairpin can be deciphered by evaluating the hydrophobic interaction among side-chains of an unfolded polypeptide starting from a β-strand-like thermodynamically metastable state.
INTRODUCTION
Proteins are the basis of life on Earth and carry out nearly all functions in the essential biochemistry of life. Each nascent protein exists as an unfolded polypeptide or random coil when it is translated by a ribosome from an mRNA sequence into a linear chain of residues. The intrinsic biological functions of a protein are determined by its native three-dimensional (3D) structure, which derives from the physical process of protein folding, by which a polypeptide folds into its characteristic and functional 3D structure in an expeditious and reproducible manner. Protein folding can thereby be considered the most important mechanism, principle, and motivation of biological existence, functionalization, diversity, and evolution. Given the complexity of protein folding, the protein-folding problem has been summarized in three unanswered questions: (i) What is the physical folding code in the amino acid sequence that dictates the particular native 3D structure? (ii) What is the folding mechanism that enables proteins to fold so quickly? (iii) Is it possible to devise a computer algorithm that effectively predicts a protein’s native structure from its amino acid sequence? Another essential question is why protein folding depends so strongly on the solvent (water or lipid bilayer) and the temperature. The protein-folding problem was brought to light over 60 years ago. In particular, since Anfinsen shared the 1972 Nobel Prize in Chemistry for his work revealing the connection between the amino acid sequence and the native conformation, understanding protein sequence-structure relationships has become the most fundamental task in molecular and structural biology. Protein folding is one of the miracles of nature that human technology finds quite difficult to follow, owing to the very large number of rotational degrees of freedom in an unfolded polypeptide chain.
In the 1960s, Cyrus Levinthal pointed out that the apparent contradiction between the astronomical number of possible conformations of a protein chain and the fact that proteins fold quickly into their native structures should be regarded as a paradox (Levinthal's paradox), so there must be mechanisms that allow polypeptide chains to find the native states encoded in their sequences. As stated in Anfinsen's dogma, the well-defined native 3D structures of small globular proteins are uniquely encoded in their primary structures (the amino acid sequences), are kinetically reproducible and stable under a range of physiological conditions, and can therefore be considered a matter of certainty. For many proteins or protein domains, relatively rapid and efficient refolding can be observed in vitro, so proteins may be regarded as "folding themselves" along explicit folding pathways. Protein folding is considered a free energy minimization or relaxation process guided mainly by the following physical forces: (i) formation of intramolecular hydrogen bonds, (ii) van der Waals interactions, (iii) electrostatic interactions, (iv) hydrophobic interactions, (v) chain entropy of the protein, and (vi) thermal motions. Among them, the hydrophobic effect is normally thought to play a decisive role. Currently, the generally accepted hypothesis in the field conceives of protein folding in a funnel-shaped energy landscape, in which every possible conformation is represented by a free energy value. The rapid folding of proteins has been attributed to random thermal motions that cause conformational changes leading energetically downhill toward the native structure, which corresponds to the free energy minimum under the solution conditions. However, there are both enthalpic and entropic contributions to the free energy of a protein that change with temperature and so give rise to heat denaturation and, in some cases, cold denaturation.
So far, these hypotheses have not deciphered the folding code and therefore cannot, in general, read a sequence and predict what shape it will adopt. The water interacting with the protein surface is often referred to as the protein hydration layer (sometimes called the hydration shell) and is fundamental to the structural stability of proteins, because non-aqueous solvents in general denature proteins. The hydration layer around a protein has been found to have dynamics distinct from those of bulk water out to a distance of 1 nm, and water molecules slow down greatly when they encounter a protein. Thus, hydrophilic side-chains of proteins are normally hydrogen bonded to surrounding water molecules in aqueous environments, which prevents the surface hydrophilic side-chains of proteins from randomly hydrogen bonding to each other. This is the reason why proteins usually do not aggregate or crystallize in unsaturated aqueous solutions, even though the solvent-facing surface of proteins is usually composed predominantly of hydrophilic regions. Experiments have also shown that secondary structures of proteins (such as α-helices and β-sheets) are stabilized by hydrogen bonds between the N-H groups and C=O groups of the main chain. This also indicates that the shielding effect of surrounding water molecules prevents hydrophilic side-chains from interfering with the formation of secondary structures during protein folding. Thus, because water molecules are strongly polar, they should be able to saturate the hydrogen-bonding capacity of hydrophilic side-chains and the main chain before protein folding. This is the reason why intrinsically disordered proteins (IDPs) and regions (IDRs) can make up a significant part of the proteome.
Before secondary structures fold, the early steps of protein folding may not be directly dominated by the formation of intramolecular hydrogen bonds, owing to the shielding effect of surrounding water molecules. Thus, the problem may lie in our limited understanding of the hydrophobic interaction among neighboring side-chains of unfolded proteins at the early steps of folding, given the lack of awareness of the importance of the shielding effect of water. Almost all experimentally determined native tertiary structures of water-soluble proteins have a hydrophobic core in which hydrophobic side-chains are buried away from water. Conversely, polar residues interact favorably with water, so the solvent-facing surface of the peptide is usually composed predominantly of hydrophilic regions. Minimizing the number of hydrophobic side-chains exposed to water, namely hydrophobic collapse, has thus been regarded as one of the most important driving forces of the protein folding process. Experimental methods such as laser temperature-jump techniques and single-molecule techniques have revealed that protein folding first leads to the formation of secondary structures (α-helices and β-strands), and that the tertiary structure is then formed by the folding of these secondary structures. It is likely that the nascent polypeptide forms initial secondary structure by creating localized regions of predominantly hydrophobic residues through the hydrophobic effect. The secondary structures interact with water, placing thermodynamic pressure on these regions, which then aggregate or "collapse" into a tertiary conformation with a hydrophobic core.
Therefore, protein folding depends strongly on the folding of secondary structures, which hierarchically paves a native folding pathway that leads to the formation of correct tertiary structures and causes conformational changes leading energetically downhill toward the native globular structure with the minimum free energy. Thus, deciphering the folding codes in the amino acid sequence that dictate secondary structure formation should be regarded as a key to cracking the protein-folding problem. Among the types of secondary structure in proteins, the β-sheet is the most prevalent. If the controlling mechanism of β-sheet folding can be revealed, it would remarkably advance the solution of the protein-folding problem. Currently, several hypotheses have been proposed to explain the folding mechanism of β-sheets. The hydrophobic zipper hypothesis proposes that a hairpin forms first, after which hydrophobic contacts act as constraints that bring other contacts into spatial proximity. This leads to further constraints and causes the rest of the contacts to zip up. Munoz et al. proposed that the folding of a β-hairpin initiates at the turn and propagates toward the tails. In particular, they found that stabilization through hydrophobic contacts between residues and hydrogen bonding interactions is important for the formation of the β-hairpin. Petrovich et al. studied a 37-residue triple-stranded β-sheet protein via MD simulations. Their results indicate that a β-hairpin appears first, before the third strand joins to complete the β-sheet at the end of the folding process. They ascribe the folding mechanism of the β-sheet to a combination of initial hydrophobic collapse and a zipper mechanism, which serve to nucleate hairpin formation. Notably, all three mechanisms above suggest that the folding of a β-sheet is necessarily preceded by the occurrence of a β-turn. We are still missing a "folding mechanism" for β-sheets.
By mechanism, we mean a narrative that explains how the time evolution of β-sheet folding derives from the amino acid sequence and the solution conditions.
Results
β-sheet folding depends strongly on temperature, and β-sheets can form in as little as 1 microsecond after a temperature jump. β-sheets consist of β-strands connected laterally by at least three backbone hydrogen bonds, forming a generally pleated sheet. A β-strand is a stretch of polypeptide chain, typically 3 or more amino acids long, with its backbone in an extended conformation. It is most likely that β-strands exist before β-sheets fold, because it is difficult to explain how the folding of a β-sheet (i.e., the lateral hydrogen bonding of segments of unfolded polypeptide) could be accompanied by the stretching of those segments into β-strands. There must be mechanisms that allow polypeptide chain segments to find the β-strand states encoded in their sequences. There must also be some physical effect providing a long-range attractive force among β-strands for β-sheet formation. Experimental evidence on the folding of unfolded proteins corroborates the hypothesis that folding initiation sites arise from hydrophobic interactions. The folding of β-strands and β-sheets may be driven by hydrophobic interactions, as the nascent polypeptide may form initial secondary structure by creating localized regions of predominantly hydrophobic residues. The hydrophobic effect most likely contributes to the formation of β-sheets through multistage aggregation of neighboring hydrophobic groups of unfolded polypeptides, which leads to the formation of β-strands that consequently fold into β-sheets. A β-sheet is always amphipathic in nature, that is, it contains both hydrophilic and hydrophobic surface areas.
Note that hydrophobic attraction (due to the hydrophobic effect) among adjacent side-chains on one side or the other of a β-strand may be common in experimentally determined protein structures, which should be considered evidence that the hydrophobic effect dominates the formation of β-strands. It has previously been noted that many amino acid side-chains contain considerable nonpolar sections, even if they also contain polar or charged groups. In other words, hydrophilic side-chains are not completely hydrophilic. The hydrophilicity of hydrophilic side-chains is normally expressed by the C=O or NH2 groups at their ends, and the other portions of these side-chains are hydrophobic, because their molecular structures are basically alkyl and benzene ring structures, as shown in Figure 1. Folding initiation sites of β-strands might therefore contain not only the accepted “hydrophobic” amino acids, but also larger hydrophilic side-chains. If the formation of β-strands is driven by hydrophobic interactions among neighboring side-chains of the unfolded polypeptide, we should be able to find experimental evidence of this hydrophobic interaction in the Protein Data Bank (PDB) archive, because hundreds of thousands of β-sheet structures have been experimentally determined. In an aqueous environment, water molecules tend to segregate around the “hydrophobic” side-chains of the nascent protein, creating hydration shells of ordered water molecules. Ordering of water molecules around a hydrophobic region increases the order of the system and therefore contributes a negative change in entropy (less entropy in the system). The water molecules are fixed in these water cages, which drives the hydrophobic collapse, that is, the aggregation of the hydrophobic groups. Thus, the hydrophobic interaction among side-chains neighboring in sequence can return entropy to the system by breaking their water cages, which frees the ordered water molecules.
If hydrophobic interactions among neighboring side-chains in the amino acid sequence provide the structural stability for β-strand formation, we should find that a large hydrophobic surface area covering one side or the other of a β-strand is prevalent in almost all experimentally determined β-sheets. If the clustering of hydrophobic side-chains on one side of adjacent β-strands of a β-sheet is prevalent in almost all experimentally determined β-sheets, this would support the view that the hydrophobic interaction among neighboring side-chains is responsible for initiating β-sheet folding. The capability of an amino acid residue to engage in hydrophobic attraction with residues neighboring it in sequence can be evaluated from the exposed alkyl and benzene ring structures of its side-chain, as shown in Fig. 1, in which the 20 kinds of amino acid residue are divided into four groups. Arginine-R, Histidine-H, and Lysine-K can take part in hydrophobic interaction with adjacent hydrophobic side-chains in sequence because their long hydrophilic side-chains contain long nonpolar alkyl structures (see Fig. 1A). Cysteine-C, Isoleucine-I, Leucine-L, Methionine-M, Tryptophan-W, Phenylalanine-F, Tyrosine-Y, and Valine-V can fully take part in hydrophobic interaction with adjacent side-chains owing to their high hydrophobicity (see Fig. 1B). Glutamate-E, Glutamine-Q, Threonine-T, and Alanine-A allow only limited participation in hydrophobic interaction with side-chains neighboring in sequence because their exposed hydrophobic proportions are relatively small (see Fig. 1C). Aspartate-D, Asparagine-N, Serine-S, Proline-P, and Glycine-G basically cannot participate in hydrophobic interaction with adjacent side-chains in sequence because the hydrophobic proportions of their side-chains are too small or are occluded by hydrophilic groups (see Fig. 1D).
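The four-way grouping above lends itself to a simple lookup table. The following is a minimal sketch, assuming one-letter residue codes; the group labels A to D simply mirror the panels of Fig. 1, and the names RESIDUE_GROUPS, residue_group and annotate are illustrative choices, not taken from the authors' own programs.

```python
# Four residue groups as listed in the text (Fig. 1A-D), keyed by panel label.
# The labels and all names in this sketch are illustrative assumptions.
RESIDUE_GROUPS = {
    "A": set("RHK"),       # long hydrophilic side-chains with nonpolar alkyl sections
    "B": set("CILMWFYV"),  # highly hydrophobic side-chains
    "C": set("EQTA"),      # limited participation: small exposed hydrophobic proportion
    "D": set("DNSPG"),     # basically no participation in hydrophobic interaction
}


def residue_group(code: str) -> str:
    """Return the Fig. 1 panel label (A-D) for a one-letter residue code."""
    code = code.upper()
    for label, members in RESIDUE_GROUPS.items():
        if code in members:
            return label
    raise ValueError(f"not one of the 20 standard residues: {code!r}")


def annotate(sequence: str) -> str:
    """Translate a sequence into its string of group labels."""
    return "".join(residue_group(code) for code in sequence)
```

For a toy sequence such as VKDGET, annotate returns one label per residue, which makes stretches of group-D residues easy to spot by eye.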
A de novo designed protein with a curved β-sheet (PDB ID: 5TPJ) is a good example of hydrophobic attraction (due to the hydrophobic effect) among adjacent side-chains on one side of each β-strand of the protein (see Fig. 2). To illustrate the hydrophobic attraction, we highlight the hydrophobic surface areas of adjacent side-chains on each β-strand of the protein, based on the experimentally determined structure, in Fig. 2C and 2D. Note that every β-strand is characterized by a large hydrophobic surface fully covering one side of the strand (the inner side), and that the hydrophobic interaction causes each side-chain to lie parallel to every other side-chain of the strand. The parallel arrangement of adjacent peptide planes in these β-strands also causes adjacent side-chains to lie on opposite sides of the main chain, and each carbonyl oxygen atom in a peptide plane tends to hydrogen bond with an amide hydrogen atom in an adjacent peptide plane owing to the electrostatic attraction between them, except for Proline-P. The parallel arrangement of neighboring “hydrophobic” side-chains in a β-strand can effectively return entropy to the system by merging the water cages of the side-chains, which frees the ordered water molecules (see Fig. 2D). Thus, the β-strand should be considered a metastable state of the unfolded polypeptide that corresponds to a free energy minimum under the solution conditions, creating localized regions of predominantly hydrophobic side-chains. We use another small protein (PDB ID: 1OUR) as an example to demonstrate the role that hydrophobic interactions among neighboring side-chains play in the formation of β-strands, β-turns and β-sheets (see Fig. 3). The protein is composed mainly of β-strands and 10 β-turns. Every β-strand of the protein is likewise characterized by a large hydrophobic surface fully covering one side or the other of the strand (see Fig. 3A).
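Because adjacent side-chains lie on opposite sides of the main chain, residues i, i+2, i+4 and so on of a strand share one face. A rough way to check whether one face is covered by side-chains able to take part in hydrophobic attraction is sketched below. It assumes that only the residues of Fig. 1A and 1B count as participating, which is a simplification introduced here and not the surface-area analysis performed in PyMOL; the function name face_fractions is hypothetical.

```python
# Residues of Fig. 1A and 1B, treated here (as an assumption) as the ones
# able to take part in hydrophobic attraction with neighbors in sequence.
PARTICIPATING = set("RHKCILMWFYV")


def face_fractions(strand: str) -> tuple[float, float]:
    """Fraction of participating residues on each face of a strand.

    Face 0 holds positions 0, 2, 4, ...; face 1 holds positions 1, 3, 5, ...
    An empty face (single-residue strand) is reported as 0.0.
    """
    if not strand:
        raise ValueError("strand must contain at least one residue")
    faces = (strand[0::2].upper(), strand[1::2].upper())
    return tuple(
        sum(res in PARTICIPATING for res in face) / len(face) if face else 0.0
        for face in faces
    )
```

With the toy string VTVEV (not a sequence from 5TPJ or 1OUR), face 0 is V, V, V and face 1 is T, E, so one face is fully covered and the other is not, which is the pattern the figures illustrate.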
Aspartate-D, Asparagine-N, Serine-S, Proline-P, and Glycine-G most likely contribute to the formation of β-turns in protein folding, because the other side-chains neighboring them in the amino acid sequence tend to attract each other hydrophobically by bypassing these residues (see Fig. 1D). Aspartate-D, Asparagine-N, Serine-S, Proline-P, and Glycine-G can therefore be classified as a hydrophobic blocking (RB) group. It is worth noting that almost all of the 10 β-turns of the protein contain two or more of the residues Aspartate-D, Asparagine-N, Serine-S, Proline-P, and Glycine-G (see Fig. 3A and 3B). This indicates that two or more adjacent RB residues can effectively block hydrophobic attraction among side-chains neighboring in sequence on both sides of a strand. We divide the protein structure into three parts, corresponding to three segments of the amino acid sequence, to illustrate the hydrophobic collapse among β-strands neighboring in sequence (see Fig. 3B and 3C). Hydrophobic interactions among these β-strands may drive them to collapse together by bending the unfolded polypeptide at the locations of the RB residues, that is, by bypassing the RB residues at the turns to achieve hydrophobic collapse. This also indicates that hydrophobic attraction among neighboring side-chains drives β-strand formation, which then causes hydrophobic attraction among neighboring β-strands and the formation of β-sheets, because β-strand formation creates localized regions of predominantly hydrophobic residues and places thermodynamic pressure on these regions under the solution conditions. Formation of β-sheets also makes β-strands aggregate or "collapse" into a tertiary conformation with a hydrophobic core. We therefore speculate that the folding of β-sheets is triggered by multistage hydrophobic interactions among neighboring side-chains of unfolded polypeptides, enabling β-sheets to fold reproducibly following explicit physical folding codes in aqueous environments.
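The idea that two or more adjacent hydrophobic blocking residues (D, N, S, P, G) mark a likely bending point can be turned into a simple sequence scan. The sketch below is only an illustration of that idea under the assumption that every run of at least two blocking residues is treated as a candidate turn; the function name and the min_run parameter are hypothetical, and the example sequences are toy strings rather than the sequence of 1OUR.

```python
import re

BLOCKING = "DNSPG"  # hydrophobic blocking residues named in the text


def split_at_blocking_runs(sequence: str, min_run: int = 2):
    """Split a sequence at runs of >= min_run consecutive blocking residues.

    Returns (segments, turns): segments are the stretches between runs,
    turns are (start_index, run) pairs for each blocking run found.
    """
    pattern = re.compile("[" + BLOCKING + "]{" + str(min_run) + ",}")
    segments, turns = [], []
    start = 0
    for match in pattern.finditer(sequence.upper()):
        if match.start() > start:
            segments.append(sequence[start:match.start()])
        turns.append((match.start(), match.group()))
        start = match.end()
    if start < len(sequence):
        segments.append(sequence[start:])
    return segments, turns
```

A single blocking residue inside a segment does not split it, which reflects the statement that two or more adjacent blocking residues are needed to block the attraction effectively.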
We use 1000 experimentally determined small protein structures to further test the hydrophobic-effect-based folding mechanism of β-sheets. All 1000 small proteins were randomly selected from the PDB. Using the PDB archive and the STRIDE software, 3235 β-strands can be identified in the 1000 protein structures. From analysis of all 3235 β-strands, we find that hydrophobic attraction (due to the hydrophobic effect) among adjacent side-chains on one side or the other of a β-strand, covering the length of the strand, is prevalent in all the experimentally determined β-strands (see Supplementary S5). This indicates that the hydrophobic interaction among neighboring side-chains is responsible for the formation of β-strands. Aspartate-D, Asparagine-N, Serine-S, Proline-P, and Glycine-G cannot effectively attract side-chains neighboring in sequence hydrophobically (see Fig. 1D). Thus, these five residues most likely lead to β-turn formation in protein folding, because the other side-chains neighboring them in the amino acid sequence tend to attract each other hydrophobically by bypassing these residues. The β-turn is the third most important secondary structure after helices and β-strands. β-turns have been classified according to the values of the dihedral angles φ and ψ of the central residue. β-turns can be easily identified between β-strands or α-helices of the protein structures using the PDB archive and the STRIDE software. We identified 5776 β-turns in the 1000 protein structures, including about 1780 β-hairpin turns. We find that about 97.4% of the β-turns contain at least one Aspartate-D, Asparagine-N, Serine-S, Proline-P or Glycine-G residue, as illustrated in Supplementary 2. Most of the remaining non-RB β-turns contain at least one Glutamate-E, Glutamine-Q, Threonine-T, or Alanine-A residue.
This indicates that Glutamate-E, Glutamine-Q, Threonine-T, and Alanine-A may contribute to the formation of β-turns because their exposed hydrophobic proportions are relatively small. Moreover, about 99.3% of β-hairpin turns contain at least one Aspartate-D, Asparagine-N, Serine-S, Proline-P or Glycine-G residue (see Supplementary 2). Two adjacent RB residues normally should not be able to occur in the middle of a long straight β-strand, because the residues of the strand on both sides of the two RB residues tend to aggregate hydrophobically and would thus bend the strand at the two RB residues to achieve the hydrophobic interaction. However, by scanning the 1000 protein structures with the STRIDE software, we can still identify 29 long β-strands (each containing more than 12 residues) that have two adjacent RB residues located in the middle of the strand. By checking these long β-strands using the PyMOL software, we find that 24 of them are actually curved exactly at their two RB residues, demonstrating the capability of RB residues to cause β-turn formation (see Fig. 4). The other 5 long β-strands either have three or more adjacent RB residues or have RB residues located at one end of the strand, which extends the hydrophobic blocking region to the end of the strand and thus undermines the hydrophobic interaction between the two ends of the strand (see Supplementary S3). The long β-strand of the 1YV7 protein is curved at a sequence segment of threonine-threonine-serine-glutamate (TTSE) (see Supplementary S3). This indicates that Glutamate-E, Glutamine-Q, Threonine-T, and Alanine-A may also contribute to the formation of β-turns because their exposed hydrophobic proportions are relatively small.
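The two statistics reported above, the share of turns containing at least one blocking residue and the scan for long strands with two adjacent blocking residues in the middle, could be computed along the following lines once turn and strand sequences have been extracted (for example from STRIDE assignments). This is a sketch under stated assumptions: the text does not define "middle", so the end_margin parameter is a hypothetical choice, and the inputs in the usage note are toy strings, not the 1000-protein data set.

```python
BLOCKING = set("DNSPG")  # hydrophobic blocking residues named in the text


def fraction_with_blocking_residue(turns: list[str]) -> float:
    """Fraction of turn sequences that contain at least one blocking residue."""
    if not turns:
        raise ValueError("no turns given")
    hits = sum(1 for turn in turns if BLOCKING & set(turn.upper()))
    return hits / len(turns)


def middle_blocking_pairs(strand: str, min_length: int = 13,
                          end_margin: int = 3) -> list[int]:
    """Start indices of two adjacent blocking residues in the middle of a long strand.

    min_length=13 encodes "more than 12 residues"; end_margin (an assumption)
    excludes pairs that sit within a few residues of either end.
    """
    strand = strand.upper()
    if len(strand) < min_length:
        return []
    return [
        i
        for i in range(end_margin, len(strand) - end_margin - 1)
        if strand[i] in BLOCKING and strand[i + 1] in BLOCKING
    ]
```

For instance, the toy strand VKVYVEVNGKIEVRV has its NG pair at index 7, well inside the margins, whereas a pair at the very start of a strand would not be reported.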
The spike (S) protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is of great concern owing to the coronavirus disease 2019 (COVID-19) pandemic. The D614G mutation in SARS-CoV-2 has begun to receive widespread attention because of its rising dominance worldwide. This mutation changes the amino acid at position 614 from D (aspartic acid) to G (glycine), so that the initial D614 becomes the G614 variant. It is worth noting that the amino acid at position 614 is located at a β-turn in a tertiary structure of the spike. This is consistent with our theory that both D (aspartic acid) and G (glycine) can result in β-turn formation. The D614G mutation may accelerate the folding of the quaternary structure of the spike, because position 614 is located at the docking site between two tertiary structures of the protein and G614 can most likely contribute more than D614 to the hydrophobic effect between them (see Fig. 1D). A typical β-hairpin structure contains two β-strands with hydrophobic attraction between each side-chain and every other side-chain on the strands. Thus, we might be able to predict β-hairpin structures by evaluating the hydrophobic attraction between each side-chain and every other side-chain in the primary structure of a protein. We may be able to predict a β-hairpin by identifying two neighboring sequences of residues in the polypeptide chain that are both characterized by hydrophobic attraction between each side-chain and every other side-chain and that have two RB residues between them. Using this method, we identified 553 samples with the characteristics above from the 1000 proteins. We find that 158 of the samples are β-hairpins, 36 are strand-turn-strand structures, 296 are strand-turn-helix structures, 23 are coil-turn-strand structures, 23 are coil-turn-coil structures and 6 are α-helices. Thus, the physical folding codes for β-hairpins and strand-turn-strand structures can be deciphered by evaluating the hydrophobic interaction among side-chains of an unfolded polypeptide. The results show that strand-turn-helix structures can also be predicted by the method. This indicates that the folding of an α-helix may be initiated from a β-strand-like thermodynamically metastable state.
Conclusion
Many amino acid residues contain considerable nonpolar sections in their side-chains, even if they also contain polar or charged groups. This makes the hydrophobic interaction among side-chains neighboring in the amino acid sequence of a polypeptide an important driving force for stabilizing the initial thermodynamic state of unfolded proteins. A large hydrophobic surface area covering most side-chains on one side or the other of adjacent β-strands of a β-sheet is prevalent in almost all experimentally determined β-sheets. Minimizing the exposure of the hydrophobic portions of adjacent side-chains to water should be regarded as the most important driving force for β-strand formation, and it causes each side-chain to lie parallel to every other side-chain on the strand. β-turns often contain the residues Aspartate-D, Asparagine-N, Serine-S, Proline-P, and Glycine-G, whose side-chains expose only very small hydrophobic proportions; we explain this by the ability of these residues to block the hydrophobic effect among side-chains neighboring in sequence, thereby contributing to turn formation. The folding of β-sheets is most likely triggered by multistage hydrophobic interactions among neighboring side-chains of unfolded polypeptides, enabling β-sheets to fold reproducibly following explicit physical folding codes in aqueous environments. The temperature dependence of β-sheet folding is thus attributed to the temperature dependence of the strength of the hydrophobic effect. The hydrophobic collapse of β-strands into β-sheets most likely triggers enthalpy-entropy compensation in unfolded polypeptides, enabling the main chains of β-strands to shed their hydrogen-bonded water molecules and hydrogen bond laterally with each other.
The folding codes in amino acid sequence that dictate the formation of a β-hairpin can thus be deciphered through evaluating hydrophobic interaction among side-chains of an unfolded polypeptide from a β-strand-like thermodynamic metastable state.
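One way to make the hairpin search described in the Results concrete is sketched below: it looks for a pair of adjacent hydrophobic blocking residues flanked on both sides by a short segment in which one face (alternating positions) consists entirely of residues able to take part in hydrophobic attraction. This is a minimal sketch and not the authors' program. The arm length, the use of the Fig. 1A and 1B residues as the participating set, and the reading of "every other side-chain" as alternating positions are all assumptions introduced here. Since the text reports that only 158 of the 553 matched samples were β-hairpins, hits from such a scan should be read as candidates rather than predictions.

```python
BLOCKING = set("DNSPG")           # hydrophobic blocking residues named in the text
ATTRACTING = set("RHKCILMWFYV")   # Fig. 1A and 1B residues (assumed participating set)


def has_hydrophobic_face(segment: str) -> bool:
    """True if all even-position or all odd-position residues are attracting."""
    faces = (segment[0::2], segment[1::2])
    return any(
        len(face) > 0 and all(res in ATTRACTING for res in face)
        for face in faces
    )


def hairpin_candidates(sequence: str, arm: int = 5) -> list[int]:
    """Start indices of adjacent blocking pairs flanked by two hydrophobic-faced arms.

    arm is the hypothetical number of residues examined on each side of the pair.
    """
    sequence = sequence.upper()
    hits = []
    for i in range(arm, len(sequence) - arm - 1):
        if sequence[i] in BLOCKING and sequence[i + 1] in BLOCKING:
            left = sequence[i - arm:i]
            right = sequence[i + 2:i + 2 + arm]
            if has_hydrophobic_face(left) and has_hydrophobic_face(right):
                hits.append(i)
    return hits
```

For the toy sequence VEVKVNGITVEV, the NG pair at index 5 is flanked by VEVKV and ITVEV, each with a fully attracting face, so it is reported; replacing both arms with TETET removes the hit.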
Materials and Methods
Protein structures
In this study, many experimentally determined native protein structures are used to study the folding mechanism of β-sheets. All three-dimensional (3D) structure data of protein molecules were obtained from the PDB database. The PDB IDs of these proteins are marked in Fig. 2, Fig. 3, and Fig. 4. To show the distribution of hydrophobic areas on the surfaces of β-strands and β-sheets in these figures, we used the structural biology visualization software PyMOL to display the hydrophobic surface areas of these secondary structures.
Identification of secondary structures of proteins
Secondary structures (β-strands, β-turns, β-sheets and α-helices) were identified in the 1000 proteins using the STRIDE software. We also used the molecular 3D structure display software PyMOL to confirm the identification of secondary structures.
Figure 1. Hydrophobic portions of amino acid side-chains (hydrophobic portions are highlighted in green).
Fig. 2. Hydrophobic attraction among neighboring side-chains of β-strands. (A) A de novo designed protein (PDB ID: 5TPJ). (B) The curved β-sheet of 5TPJ. (C) Hydrophobic attraction among adjacent β-strands via the hydrophobic surface of side-chains of the β-sheet (the hydrophobic surface is highlighted using green surface areas). (D) Hydrophobic surface areas on the 6 β-strands of the sheet (green surface areas).
Fig. 3. (A) Hydrophobic surface areas on the β-strands of the protein 1OUR (the hydrophobic surface of side-chains is highlighted using green surface areas); residues located at turns are highlighted in red in the sequence of the protein. (B) The parts of the protein (residues 1-33 are highlighted in green, residues 34-71 in magenta, residues 72-114 in red). Hydrophobic surface areas on the β-strands of the sheet (green surface areas).
Fig. 4. Long β-strands (more than 12 residues) characterized by two adjacent RB residues located in the middle of the β-strands and curved exactly at their two RB residues in the amino acid sequences.
*Corresponding author. E-mail address: [email protected] (Lin Yang)
1 These authors contributed equally to this work.
ACKNOWLEDGEMENTS
Lin Yang is indebted to Daniel Wagner from the Weizmann Institute of Science and Liyong Tong from the University of Sydney for their support and guidance. Lin Yang is grateful for his research experience in the Weizmann Institute of Science for inspiration. The authors acknowledge the financial support from the National Natural Science Foundation of China (Grant 21601054), Shenzhen Science and Technology Program (Grant No. KQTD2016112814303055), Science Foundation of the National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, the Fundamental Research Funds for the Central Universities of China, and the University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province of China (Grants UNPYSCT-2017126).
Additional information
The authors declare no competing financial interests.
Author Contributions
L.Yang, L.Ye and X.H. formulated the study. X.M., L.Yang, C.H. and L.S. conducted the MD simulation. L.Yang, X.M., C.H., L.S., L.L. and J.L. analyzed the PDB data and coded the protein folding codes. L.Yang, X.M., C.H. and L.S. collected and analyzed the electric charge and rotational resistance data of side-chains. C.H. wrote programs. L.Yang, L.Ye and X.H. wrote the paper, and all authors contributed to revising it. All authors discussed the results and theoretical interpretations.
Supplementary Information
Note S1. Protein structure samples
Table S1. 1000 randomly selected small protein structures from the PDB.
Note S2. Amino acid sequences of β-turns of the 1000 protein samples
PDBID:1AA2 PDBID:1NXV AGYPNV NFT RDG RPDLI KKSN TKL VDHP PD LMLPEID AD KPF PDBID:1ACF PDBID:1O42 NL GAVT LDG SAGF AG DDR GS EEW KI NAENPRG SETTKGA NAK LDSG SR TSK NEKI ADGLCHR PDBID:1AOJ PDBID:1O4C NSS MKD ASG NNI TPE NSSE MKD EEW KI NAENPRG SETTKGA NAK LDSG SR DDRR NASG ADGLCHR PDBID:1AYC PDBID:1O4J RRW HPNI VDG KSNPG NG TGDY LYGG EEW KI NAENPRG SETTKG NAK LDSG SR HHGQ KNG ADGLCHR PDBID:1BU3 PDBID:1O4N AFSGILA AADS DQDK SAGA DSDG SIQAEEW KI NAENPRG SETTKGA NAK LDSG SR PDBID:1CKA ADGLCHR DEE KKG KPEEQ DSEG PDBID:1O4O PDBID:1FB7 SIQAEEW KI NAENPRG SETTKGA NAK LDSG SR LWQR GG DTGA IG CG PTPVN ADGLCHR PDBID:1FB8 PDBID:1OD7 SLGT GGLVK RN KDQMS YSQERV PF PK LSG PADV VSNG DYSGKR SGD DKRFRGR PDBID:1FES KD DS DVKG CSGCSGA PDBID:1ODA DBID:1FNK PADV DYSGKR SGDPKLV RGR MDHK KD DS RDT TPDL LSGWQYV VTGGLKK DVKG PDBID:1FOY PDBID:1PKS TFIT PDL SMG LYD REE HLG SDG PDBID:1GBQ PDBID:1PO8 ADD KRGD DQN NG KNY GCY IDGC EP DSKD PDBID:1GVP PDBID:1PZA KPSQA SRQG NEYP DEGQ GQFG DRL AE PA NPG VDK IKDM PEGA KINE PDBID:1J2V CTPH GDSP EG YDVP NED PDBID:1PZB PDBID:1J4I AE PA NPG VDK IKDM PEGA KINE DGRT KRG EDG GKQEVI SVG ATG PGI CTPH GDSP PPH PDBID:1RNQ PDBID:1J82 DSSTS NLTKDR CKNG TGSS YP GNPY NLTKDR CKNG TGSS YP GNPY PDBID:1RNV PDBID:1K5A NLTKDR CKNG TGSS YP GNPY DAKPQGR LTSPCKD ENKN REN WPPC NG QSA PDBID:1RNX PDBID:1LEA DSSTS NLTKDR CKNG TGSS YP GNPY FRS VSGAS PDBID:1RRI PDBID:1LGP YNR ENPPI RLGAEEGE KR RRGCD DEKSG STSG VKKQ QTG PDBID:1UIG YRKNE LDN NF TQA NTDG GILQ SRWW DGRTPGS PDBID:1MH8 NLCN SSD DGN KGTD RGC VPAR GCY QG KGRN NI PDBID:1UW7 PDBID:1MK0 PNN TTQTAC NSKGG HQDL KSDG TPKG KGL TLNN KD HSS GNV PYE NSKIN PDBID:1WJG PDBID:1NLO FK PDYL GTRG SLFL LYD TET KKG NTEGD LTTG PDBID:2BEZ PDBID:1NNX NVLY ISGI NESL RDD DD AS WNGV TPKD WN PDBID:2BPP PDBID:1O4E IPSS NN GCY KVL NPYTN NN SSEN EEW KI NAENPRG SETTKGA NAK LDSG SR NL ADGLCHR PDBID:2BTI PDBID:1OPY VG GDE GN PKEV VG GDE GN DPFGQ SHNG NG DEHG PKEV PDBID:1P9G PDBID:2CXY CPRP NAG IYG GAGN GK PEV PDBID:1RDS PDBID:2DH1 GS DD DYEG MSDY GDD 
HTGASGDD QGDV LSNG NLNKVNS SSATG IKDY PEDT PAGS PDBID:1S3P GGGTLGV DTKG NSEYN VSGA DSSTK NDNG SLGG AADS GLKKK DKDK SSDA DKDG QGDV LSNG NLNKVNS SSATG IKDY PEDT PAGS PDBID:1SKZ GGGTLGV DTKG NSEYN VSGA DSSTK NDNG SLGG EGS IITD CPH SRYG LEP PEG RLTN SEGNPAI QGDV LSNG NLNKVNS SSATG PEDT PAGS IDI CPN DKLG GGGTLGV DTKG NSEYN VSGA DSSTK NDNG SLGG PDBID:1T1D SEGNPAI QGDV LSNG NLNKVNS SSAT IKDY PEDT SG FPDT PLRN GGR PVNV PAGS GGGTLGV DTKG NSEYN VSGA DSSTK NDNGD PDBID:1TGJ DLKAKL SLGG WKW PK PYLRSA NPEAS GR PDBID:2FD2 PDBID:1TS9 IKCK EVCPVD PN HPDEC EDEV DGVKGK WIG SPN VG TQN TEKG KRGR KG PDBID:2FLT PDBID:1UPJ QDRLT PAGN GG VEYG LWQRP GG DTGA GIGG CG TPVN QIG PDBID:2LYO PDBID:1UVY LDN NF TQA NTDG GILQ SRWW DGRTPGS NGI GPNA ANM NLCN SSD DGN KGTD IRGC PDBID:1VIH PDBID:2O1N HK GKSG SEKS GCY GWG NG LY PDBID:2AIF PDBID:2OUM ADAEP KDGS KELA LPHG DKTG ASF AHKPEGA TF TMG PDBID:2AZ9 PDBID:2P0D LWKRP GG DTGA GIGG CG GPT QIG GG GN LRGA GRHLS RRN TIPG SDH PDBID:2BHK PDBID:2PPN WDDW PL EFPL DPEST DSAN DGRT KRG EDG GKQEVI SVG ATG PGI PDBID:2BHO PPH NEND DGH HQQ LGAD FGEHWPA DG LVGL PDBID:2QR3 PDBID:2BO1 NHFS MNFTSNE YRDL PW ARNT AGES PDBID:2UZR PDBID:2EFF GKYIK NDG KER VAQ ERPRPN WT GG ENG KQG GD DRSG PDBID:2V5F PDBID:2FCD DPEH HHHH DKKG KD DSSLRDAS PDBID:2W1R PDBID:2H46 RD GS RS SDA NG NG KDQ HPW KI HDG ESAPG GN DGAG RST PDBID:3F45 SRNQ TAAD DKDK SSDA DKDG PDBID:2HC8 PDBID:3FZ9 DG AVG RPG KG FGA NP KTWC TVP GG PDBID:2HPL PDBID:3HFF NPSD IGN LPVRG GET PKKA DGP KESNG HEEV DKDG SVI GDH IIG PDBID:2JVH DLGK AMVSI KDS PMDV PDBID:3HFN PDBID:2M6H VTG DPT DRQ KQA SLP VTGD DPT GCPDP DRQ PDBID:2QNW PDBID:3KD0 NPES AGDK ATWC SN VDDS SMP KG PDBID:2VSD PDBID:3LYO PSQG SLG PRM NG DKEQ SEPL LDN NF TQA NTDG GILQ SRWW DGRTPGS DBID:2YWZ NLCN SSD DGNGMNAW KGTD RGC PR ETGE LGS GR KGSK SDL SPSCYSYP PDBID:3O8W PDBID:2ZGD SDD KFGD IRTG DVAA DKNG DKFG PDBID:3Q4Y PDBID:3BJO VPSR GCY KG QG KGDN NI LDYI EE FKGKYE PQRG QS PDBID:3RVC PDBID:3C1R MLI GP QAI FD EEG LPSL NE KTYC KLK TVP NG PIL PDBID:3UB4 PDBID:3CE8 SRWSKD 
GAI QALS PGAEG LGL GRG ND LEFL RE KHN PDBID:4AHN PDBID:3DJU DNS DAKPQGR TSPCKD ENKN RENL WPPC NG NHK PSE PY GEDG QSIFH PDBID:3F3Q PDBID:4BFN HHH ATWC YPQA VDEL AMP NG PSSAS DGS NIAY DGSD SGS VKGI SG PDBID:3GM2 PDBID:4CQI RTDD VM AVT KEGSTV PDBID:3H33 PDBID:4DH0 TRIG IDGF GKGC DGRT KRG EDG GK SVG ATG PGI PDBID:3HAF PPH VGGLGGY PDBID:4F69 PDBID:3K6D FGFKGV NQ VSDRIPGS NENSG DREAIP DENG PDBID:4G4X PDBID:3KQI DYDS NANS KEAPVNP GSMA CVCR DVTR CDACK KKKR PDBID:4K1V PDBID:3KTP DPFGQ TGP SHNG NG DEHG HPT NWPP PG WKGL PDBID:4LBB PDBID:3MCD IPA IAGE QG IPA IAGE QG ERTL LPYM ERTL LPY PDBID:4MJJ PDBID:3NRW GL TLNP GI EDE TDA ARE DEDL PDBID:4UNV PDBID:3T8R SLGQ TSSDVGGY HAGK EVN PSGVPDR GN SGL KEGQ RGAKG RVP IESE SD PDBID:4B9J PDBID:4X1M KEG PENV VY TADDSQ DDKG KD TPN YKDSP VPGK DRSS GG LLQEF SAPT TDK PDBID:4CPV GV FAGVLN AADS GLTSK DQDK FKADA DSDG PDBID:4Y92 PDBID:4E1P TTTD KKG NTEG SLTTG DDFDGSG DG DDFDG DG PDBID:5B1F PDBID:4F2E LDN NF TQA NTDG GILQ SRW DGRTPGS NG MGG PE KKSV PDFG PMG GMNMM LCN SSD DGNGMNAW KGT RGC PDBID:4G3O PDBID:5B3Q PQV DNGLGN GFN YLTY RPER WKD PEEDP VTDLW PDBID:4G4W GH DYDS NANS KEAPVNP PDBID:5EE2 PDBID:4GBC NG FNSN SDY GP GG EV DGD TSI TSI LDD PDBID:4I3I PDBID:5HBL AFGS GDG VIG SPEF DYQM SGD IPE HE QAG APA ED PDBID:4NXR PDBID:5HBQ TAAD DG KETG KAG NN IPE HE DQAG APA ED PDBID:4OOH PDBID:5HJC DSST NLTKDR CKNG TGSS YP GNPY YKP IIKHP PPDH PDBID:4X1L PDBID:5HPA ADG YKDSP VPGK GG LLQDGEF KPT TDK IPE HE QAG APA ED GV PDBID:5HVZ PDBID:4ZC3 PER PER DMN PDBID:5KM6 PDBID:5AI2 KEIP DDR DISP AP LNKG QSVY HWPP KICG EDA SPGT PDDW CPICG PDBID:5KUE PDBID:5C68 NG PS KPISD DRGI SPDG ND PD PG AEPP GAGE GPGG PDBID:5KXF PDBID:5D54 GG PEVV LN LDDG QER TSI FVNQ PDBID:5M9A PDBID:5DZD DAKPQGR GLTSPCKD ENKN REN GGSPWPPC NG PEG TVDG HNRR PRTG PEG TVDG HNRR PDBID:5YDN PRTG TADG ESQP PDBID:5E4X PDBID:6CE8 GDDGKT MSDA PQDNV CPH PAAG DVTQ CGDCG IQE LSCY GRYING PDBID:5H0Y YIDL YYCQ SP AVRG AIN AADD SESG TEEEFV PFYE PDBID:6CEA NDSG PY PWCPH PAAG DVTQ CGDCG IQE LSCY GRYING PDBID:5I72 
YIDL YYCQ PQF KSCW NKG CNNH CPICK PQF KSCWF PDBID:6FGU NKG CNNH CPICK SMSV HEDAWPFL NLKLVPG IKKP EDDS PDBID:5JOE PDBID:6GF0 LKD YPEI GT SDK GD KN GP LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:5L8Z LCN SSD NGMNAW KGTD RGC TDVL AGF NPSTG PDBID:6HA4 PDBID:5RHN KSKN NDAG KDN TYNN RPGG KEIP DDQ DISP AP KKH LKKG PDBID:7PAZ QSVY NWPP AE PA NPG VDK IKDM PEGA KINE PDBID:5RNT CTIH GDSP CD GS GSNS NYEGFDF LSSG ENN HTGASGNN PDBID:1ANU DBID:5UEY SVG GVPSKG DPNV GDII DPN NDL AKK DVEALG IKHP PPDH PDBID:1AWJ PDBID:5USV EETL DPQE DKNGHE SSYL TSI FVNQ PDBID:1B1E PDBID:5YM7 QDN DAKPQGR TSPCQD ENKN REN GGS WPPC EDKSPDS GK NG PDBID:6B25 PDBID:1BEA PQENE RPG NED QD ANF QQNE VRT GPDA PG KEN DG GK PDBID:1BMG PDBID:6IQC RHP DG PP NG KS KDW NSKD KIP VTLEQP PDBID:9RAT PDBID:1CHN DSST NLTKDR CKNG YP GNPY WNMPNMD DGAMSAL EA KEN PF PDBID:9RNT PDBID:1COF CD GS GSNS NYEGFDF LSSG ENN HTGASGNN KKYK NDAK PEND NGNE PDTA LNG DFSEVS PDBID:1AE2 RG GA KPSQA SRQG NEYP DEGQ GQFG DRL PDBID:1DMN PDBID:1AHO DPFGQ SHNG NG DEHG DDVN PY PDHV PDBID:1E6K PDBID:1B1U MPNMD DGAMSAL EAK PF TPSG RLLQ PG LVTEVECN PDBID:1FD2 PDBID:1BHF IKCK EVAPVD PN HPDEC CPAQ DGVKG EPW KNL APGNTHG SESTAGS DQNQG LDNG PR PDBID:1FN5 SDGLCTR LDN NF TQA NTDA GILQ SRWW DGRTPGS PDBID:1BQR NLCN SSD DGNGMNAW KGTD RGC KD PA APG TDK IKGM PDGA KINE PDBID:1HD0 CTPH GDAP SWDT PITG NG PDBID:1CM3 PDBID:1HIK TPNGLD NG AKS TQG GED HKCD TLCTEL DIFAASKN DTRCL PVK PDBID:1ECW PDBID:1HRQ SVLS LRPGG TGTA PAN AADD QDATKS QF ADAAGA DE QL PDBID:1EIG VNG CASW VSKR PENR RSTC KKG DPKQ KK PDBID:1IR9 PDBID:1FD8 LDN NF TQA NTDG GILQ SRWW DGRTPGS VM CSG PD LEKQ NLCN SSD DGN KGTD RGC PDBID:1FDB PDBID:1JER IKCK CPVD PN HPDEC CPAQ EVW DGVKGK VG PSSP VG PANA FVNSDND VGTH PDBID:1FKK PDBID:1KCQ PGDGRT KRG EDG SRDRN GKQEVI SVG ATG RRV NNGD GN GSNS RSGR EGTE PGI PPN PDBID:1L7L DBID:1FNJ ANNEAG NPGD YGP REH NS VNT PNNV RDT TPDL LSGWQYV VTGGLKK PQDQI VPGTYGNN PDBID:1GDC PDBID:1LZ4 CLVCS YG CAGRN NLEA MDG GY TRA AGDR GIFQ SRY DGKTPGA PDBID:1IIZ AAH QDN DPQ QNR VQGC AR TDK NKNG GLFQ 
DKYW GST GKDCN PDBID:1M4B TDD KFDAW NHS NYKNPK KKA PDBID:1IKL PDBID:1M4M CQCIKT HPKF GPHCA SDG DPKE QIW FKNW CACT TENEPDL FFCF EPDD SPGC PDBID:1IRQ VKK PDK PDBID:1ML8 PDBID:1IYU MQA GL ASG GNSGDKA QD RDL KTG EVEQ SAKA SPKA KLGD EG PAAGA KAV PDBID:1J81 PDBID:1N9N NLTKDR CKNG TGSS YP GNPY ATLPDC GEGT RKDG TPDG PDBID:1K40 PDBID:1OOI PAS FNF NKKG ADGQ PDBID:1LKL PDBID:1P0R EPW KNL APGNTHG SESTAGS QNQ LDNG PR RLG NTDD QTG RWNKI KWYT KDHV LGDYE SDGLCTR HDG PDBID:1LRA PDBID:1PAZ CD GS GSNS NYEG LSSG ENN HTGASGNN AE PA NPG VDK IKDM PEGA KINE PDBID:1N9O CTPH GDSP ATLPDC GPDEVLG GEGT RKDG TPDG PDBID:1PBI PDBID:1NEQ CDT SNPP SNPP QCHN KNKSA CDT SNPP NEKARD LER IWPSRYQ SNPP QCHN KN PDBID:1NKO PDBID:1QPV EGM DSD NPAW RDR RDA GN KKYK NDAK NGNE PDTA LNGV DFSEVS PDBID:1O4M PDBID:1RNZ EEW KI NAENPRG SETTKGA NAK LDSG SR DSSTS NLTKDR CKNG TGSS YP GNPY ADGLCHR PDBID:1RRY PDBID:1O7Z NVIDT YNR ENP CIS NPRS SQFCP KKKGE NPES CISI SQFCP PDBID:1RSI KKGE NPES NVIDT YNR ENPPI PDBID:1P65 PDBID:1SFB SDSG SDSG LDN RG NF TQA NTDG YGILQ NSRWWC PDBID:1PZ4 DGRTPGS NLCN LLSSDITA VSDG CKGT WIR GIRM PA GG LKNV AE GALP PDBID:1U3Z PDBID:1QLS DRT VTG KD EENG RQF YKDVNI RKSDL RDGNNT TE DLDS PDBID:1U42 PDBID:1R9H DRT CVTG QKDDI EGV SPTDV YKDVNI KSDL TPKK GG TTG ENG GRGNVI DA PPK PDBID:1UHD G REG VNNG KRGE AIG DEKN LNY PDBID:1RBX PDBID:1UKU DSST NLTKDR CKNGQTN TGSS YP GNPY RLIA EG YDVP PDBID:1SF6 PDBID:1UPR GRCELAAA LDN RG NF TQA NTDG GILQ DPNL DSSGLR GH DSRE LPSY APRGRRF PG SRWWC DGRTPGS NLCN KGT WIRGC PDBID:1V07 PDBID:1SF7 VN FGFKGV LA LDN RG NF TQA NTDG GILQ PDBID:1V46 SRWW DGRTPGS NLCN SSDITAS NAFT PDBID:1TNS PDBID:1YV7 LPGLP RAGVKG GEI TSLGYF TRDVD HCK KGI TSRPCK NQ PDBID:1U36 PDBID:2B4A DRT VTG KD RQF YKDVNI RKSDL CD VD QTKQ SSEH KPF PDBID:1VER PDBID:2B9D PR ETGE DA LGS GR KGSK RDL CAYCE CAYCE RLS PDBID:2CDS PDBID:1W6X LDN NF TQA NTDG GILQ SRW DGRTPGS SKL KAG KD RG SKL KAG KD LCN SSD DGN KGTD IRGC RG PDBID:2DP9 PDBID:1WRP GG KDE RRPGR LSEV SPYS PDBID:2EG2 PDBID:1XW3 FDF RVGD IRTG GAQG PDL PDBID:2EQL 
PDBID:1XW4 EMDG NF TRA NANG GLFQ NKWW KRSS GAQG PDL NACN DPK KDK PDBID:1YAT PDBID:2ES9 PGDGAT KTG ENG GV SVG PRG PGL IKN STY PPN PDBID:2GSP PDBID:1ZIB CD GS GSNS NYEG LSSG ENN HTGASGNN KD PA APG TDK IKGM PDGA KINE PDBID:2H8V CTPH GDAP VNE SL DDEA LPT PDBID:1ZPA PDBID:2INT LWKRP GG DTGA IG CG GPTPVN TLCTEL DIFAASKN DTRCL PDBID:2A3G PDBID:2KRI ASV VKKA QG KNG LHG KEKK DG PKCF PDBID:2BS5 SSLAF DASDV GPA CNSS ACD CEDG GTVP NG DGK GD GS GT DGNG PDBID:2OYZ PDBID:2D58 HG QAP VGE SG EGN SSD NGNG PDBID:2P2E PDBID:2DB7 DSN SLN DYKKDYN NI NTNN DNS SGGYFD DASD DASD PDBID:2RLN PDBID:2DU9 NLTKDR CKNG TGSS YP GNPY GTLS RGI PDBID:2XCZ DBID:2FLS SG IGA NG NC KTSC TVP NG PDBID:3B7X PDBID:2OSN QRM ISGDRG APDA EHM LED TLG PPL YIN GCYC QP RTAD MISASNSC PPN PDBID:2P8V PDBID:3CR5 AR DPATK ATRN GA TPNM SQK EGDKH SHFL DSDG PDBID:2PAL PDBID:3ETW SFAG AADS DQDK SPSA DKDG NTRFY PDBID:2QIU PDBID:3F8C FVN TSI NY GERG NGEME DEGR PDBID:2R39 PDBID:3FRV DP DRNQ FRVAGE LNDV EPG SA PPK GQ PDBID:2WQH PDBID:3H9W YPNN YPNN WQT HRDG DDSG PDBID:2XBD PDBID:3HQA WSD SSAW NGSQ NA SGST PNGS KNGS KH NNDE NNDE PDBID:3C97 PDBID:3IRC TN FD DIQMPVMD IDDD GAEL KPL TQHG DAP EKG TANP KE GEKA PDBID:3CR2 PDBID:3L6W EGDKH SHFL DSDG PEKGDEK DWTQ IRSIN GTDSD GAKE TDDH KYDHT PDBID:3FWU DASVL RPNNAVF LPPK KRGT SVDEAVT PPFS LGKM PEDLV DST FG LGG PLHC NG PEKGDEK DWTQ IRSIN KSDD GAKE TDDH KYDHT PDBID:3K0X DASVL RPNNAVF LPPK KRGT SVDEAVT PPFS LGKM KDG NG GS VTVVLPDV QKH DG AVGI PDBID:3LHC PDBID:3K63 GS RTNG LNSVI DG GSS TAAG DG RLKDT AA PNQ YQNE YPDG KKNN NPYG PDBID:3NZ3 DENG PG KVTN KVTG IDK DIP SDS PDBID:3K9I PDBID:3QD7 AS FPNH DKLDE TGFL PLSQ REGL LLRQ DDKS FDD PDBID:3KHQ PDBID:3RSD CSQ PR KRSS RGSQ GS QNI KCD DSST NLTKDR CKNG TGSS YP GNPY PDBID:3KU7 PDBID:3T8T AATA RTL RTL LPYMEE KEGQ RGAKG RVP IESE PDBID:3Q1D PDBID:3VXW CEEHEEE LSCE GAHKDC FKN AEKS IDKR ADL PPEKA ND PTAA PDBID:3R3K DKDG QG PDBID:4BS7 PDBID:3T3J LDN NF TQA NTDG GILQ SRWW DGRTPGS KPYTFEDY SG GDL PSS TGK SHDG LSSLAY NLCN SSD DGNGMNAW KGTD RGC PDBID:4DP2 
PDBID:4E1R ADDGSLA PS PAG SEED AKG CSPH GM DDFDG DG DDFDG DG PDBID:4DP4 PDBID:4EES ADDG PS PAG SEED AKG CSPH GM DPRLPDN ILG GPET TKSG DQKG PDBID:4EC2 PDBID:4ETD SHG PAF PPNKQ PLS LNGE LRNG LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:4GBN LCN SSD DGNGMNAW KGTD RGC TSI PDBID:4HE7 PDBID:4IDL ENY LAN DEKR YCE QTGG APGK TTG GDFVKGR NANN DSL DGAR PDBID:4HV2 PDBID:4JJD LDN NF TQA NTDG GILQ SRW DGRTPGS DPN LYD ATLS TKG NHNGE KN LCN SSD KGT RGC PDBID:4JZ3 PDBID:4IC9 TETD KKG NTEG SLTTG VGVGGKS EPGD GLLN GLDT PDBID:4KV4 PDBID:4IPF DVEALGLH IKHP PPDH DEKQQH CSND FGVQ VKE PDBID:4MDQ PDBID:4MZ2 DEKQQH CSND VKE VNGT SAFGF SSDA EKS DDR SMNN RDL PDBID:4QYW PDBID:4N5T PD MPEMN DPNA MG PF GEGT EEV DKQRQH CHDD VK WAQSA PDBID:4XO2 PDBID:4P15 MT MT GEG HSY PDBID:5C4P PDBID:4P7U PD PG AEPP GAGE NQD GPGG CEKPQE DEN DPKL HD LEDAAS PG SDE PDBID:5DKN PDBID:4P9E EGDKH SHFL DSDG FFE CKRM NG PSGF NPFV IDGS PDBID:5EH4 PDBID:4PE7 EPE EGDKH SHFL DSDG PDBID:5HDG PDBID:4RLN DAL SPSG FD PC RG LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:5NWX LCN SSD DGNGMNAW KGTD RG GC CR TTG RTG GR LNDG YP PDBID:4RUD PDBID:5OC8 SSRTE PEGE HG SKC GAYN TDLC SRTE EK CSND VKE PEGE HG SKC GAYN TD DL LC PDBID:5UEV CN NK DVEALGL IKHP PPDH PDBID:4WO6 PDBID:5UVY LDN NF TQA NTDG GILQ SRW DGRTPGS DVEALGLH IKHP PPDH LCN SSD DGNGMNAW KGT RGC PDBID:6EVL PDBID:5CPV DPSH FAGVLN AADS GLTSK DQDK FKADA DSDG PDBID:6PAZ PDBID:5E0Z E PA NPG VDK IKDM PEGA KINE VAG PAG PPAG VDS LSG HN PPAG CTIH GDSP RDG PDBID:1B0T PDBID:5E4P IKCK CPVD PN HPDEC DGVKGK LD RG NF TQA NTDG GILQ SRWWC PDBID:1BFE DGRTPGS NLCN SSD NGMNAW CKGTD IRGC RGST GEDGE LAGG RKG NG LRNA GQ PDBID:5EN9 EAN NSSG TSI PDBID:1C0B PDBID:5G25 KET DSST NLTKDRC CKNGQTN KYPN GNPY GG AQDG CPTAGD EE PDBID:1C76 PDBID:5UEP DGKG KPG TAYKE DPSA KNKK TEKG LSEHIKNP DVEALGL IKHP PPDH PDBID:1CUO PDBID:5UVZ STR PASCA PKTGMG GGG SKDE YPGHFSMM DVEALGLH IKHP PPDH PDBID:1DDV PDBID:5VR3 DPNTK STRN GS TPNM SQK PAIP EHNP PDBID:1E5B PDBID:5XWE WSD SSA NGSQ SGST NGSGN KNGS IDSC PEGE YT PKTE YT IDSC PEGE PDBID:1EFQ 
WKI PKTE PD SLG SQS SSNS KPGQ WAS ESGVPDR PDBID:6FGT GT SS YYSTPY SMSV HEDAWPFL NLKLVPG IKKP EDDS PDBID:1EN2 PDBID:6H0L SQGG IWG CGRT RSDH GN GQD VHG LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:1EOT LCN SSD DGNGMNAW KGTD RGC PASVP CFNL PLQR SGKCPQK KLA DPKK PDBID:1A18 PDBID:1FAO CDAFV GD STFK TADD GG DG GD LG GGLVK RN KDQMSPE YSQERV PF KG PDBID:1FAZ PDBID:1B6E DLCTQA NPFGFP QEK RC WENG SQYLFPS NTK NPNG CED PDBID:1FDA PDBID:1C9V IKCK CPVD PN HPDEC CPAQ PEDM LPD DSST NLTKDR CKNG TGSS YP GNPY WDGVKGK PDBID:1CS3 PDBID:1FE3 GTLC DSQ SQ DFL VEAFC GD STFK LG TADD DGDK DG PDBID:1DCF DG GD FTGL SHEHK VEN KPV PDBID:1FER PDBID:1EMN IKCK CPVD PN HPDEC EDEVPEDM DGVKGK DG PFG GN NPCGNG IG TCEEGF GPMM PDBID:1FEV PDBID:1EOQ NLTKDR CKNG TGSS YP GNPY GPSE STL PDBID:1FOW PDBID:1EOW MTFITKT IESG RSMG DSST NLTKDR CKNG TGSS YP GNPY DBID:1GD6 PDBID:1EQV SR TSK NRNG GLFQ DRYW KDCN TDD GD KG LL ETK PKK AKK DKYG RFDAW HCQGS DG AP PDBID:1GHJ PDBID:1FIK PESIA WHKKPGE KRD DKV NEGD LSG YKDSP VPGK GG LLQDGEF TGGA TDK GV PDBID:1GKH PDBID:1FKD AQ SRQG NEYP DEGQ GQFG DRL DGRT KRG EDG GK SVG ATG PGI PDBID:1GV2 PPH GPKR LKGR GNR LPGR PDBID:1FKF PDBID:1GXT DGRT KRG EDG GK SVG ATG PGI VQGV DGDG PPLA SQL PPH PDBID:1H2P PDBID:1HEY RN VPGG FG NTG ST ISGSS IDN DKEL TGKS NN WNMPNMD DGAMSAL KPF QGERDH YR NKGF EH NNDE PDBID:1HME PDBID:1IDI DPNA HPGL TAT NLCY CDAF SRGK KPYE TDKC PDBID:1IOS PDBID:1J7Z LDN NF TQA NTDG GILQ SRWW DGRTPGS NLTKDR CKNG TGSS YP GNPY NLCN SSD DGN KGTD RGC PDBID:1JPO PDBID:1JPE LDN NF TQA NTDG GILQ SRWW DGRTPGS QH EFY DAG NLCN SSD KGTD RGC PDBID:1KM9 PDBID:1KM8 DN GG TGVI PRPC NQ GR DN GG TGV PRPC NQ GR PDBID:1KXX PDBID:1KSM LNN NF TQA NTDG GILQ SRWW DGRTPGS KEGDPNQL DKNG NLCN SSD DGN KGTD RGC PDBID:1KTH PDBID:1LXI PNTK NENK WQDW AFPL NPETV DDSS PDBID:1LOZ PDBID:1MG6 MDG GY TRA AGDR GTFQ SRYW DGKTPGA GCNC NK HL NACH QDN DPQ QNR VQGC PDBID:1O2E PDBID:1MH7 IPSS NN GCY NPYTN NN SSEN NLDMKN VPAR GCY GG QG KGRN NI PDBID:1O45 PDBID:1O4K EEW KI NAENPRG SETTKG NAK LDSG SR EEW KI NAENPRG SETTKGA NAK 
LDSG SR ADGLCHR ADGLCHR PDBID:1O4G PDBID:1O4L SIQAEEW KI NAENPRG SETTKGA NAK LDSG SR EEW KI NAENPRG SETTKG NAK LDSG SR ADGLCHR ADGLCHR PDBID:1Q8B PDBID:1O4Q DLNEI EEG EEW KI NAENPRG SETTKG NAK LDSG SR PDBID:1R26 ADGLCHR MRAR AVWC FPTV ADNN QLP SG GAN PDBID:1RTU PDBID:1RLK PQ GG YP SEDI VYNG RD TNTG KDLD GYTQVE TGA SYDG PDBID:1SDZ PDBID:1RZY FTDW LDWL GD FFCG EQED PIN KEIPA DDQ DISP AP KKH LKKG QSVY PDBID:1SNQ NWPP GD KG LL ETVEKY AKK DKYG DG PDBID:1TUW KG HD RSH AQDW PDBID:1SSC PDBID:1U3J DSST NLTKDR CKNG TGSS YP DRT VTG KD EV RQF YKDVNI KSDL PDBID:1TAY PDBID:1UIC MDG GY TRA AGDR GIFQ SRAW DGKTPGA LDN NF TQA NTDG GILQ SRWW DGRTPGS NACH QDN DPQ QNR VQGC NLCN SSD DGN KGT RGC PDBID:1TXA PDBID:1VHF PDAT KPGV DNC TWKRK RLIA KG YETP LTEY PDBID:1U68 PDBID:1ZIA YNR ENP KD PA APG PTDK IKGM PDGA KINE PDBID:1U9R CTPH GDAP GD KG LL ETVEKY AKK DKYG DG PDBID:1ZJ7 KG LWQRP GG DTGA IG CG PDBID:1UID PDBID:2B8G LDN NF TQA NTDG GILQ SRWW DGRTPGS KAG EKG MK DRS KEG NEG SNST NLCN SSD DGNGMNAW KGTD RGC PDBID:2BRF PDBID:1YEB PPGE PSDGQ PLTQ DRKCSRTQ PETR GQ KPG QQCH EEGG NKVGP LHG SGQVKGY TN IPGT VG NG PDBID:208L PDBID:2BZY MDG GY TRA AGDR GIFQ SRYW DGKTPGA DKT EVG NING NG DKT EVG NING NAAH QDN DPQ QNR QGC NG PQ QN PDBID:2D4M PDBID:2CW4 ATPGS PPN DNDY VNN SQG TDRAP AQ GG IP APDG DE TPPY PDBID:2FO3 PDBID:2HB4 PIN HPNNIR LENTIYAN PDDYPLKP QKP TH YSNG LWKRP GG DTGA IG CG GPTPVN GDDYNPSL PDBID:2J5A PDBID:2FTB KPTL QK DED NGD PN IG SMGG GG TDQ GN PDBID:2JK7 GG PRN FGTW GD FHCG KPSE YPGC PDBID:2H8A PDBID:2NUH NKV NDL VPF SLSG QPNR RL WQGK YRLP PDBID:2JP8 PDBID:2OPY RVYI PRN GTW GD FHCG KPSE PGC PDBID:2MSS PDBID:2PPI SVNT VD DKTTNRHR FES NN KA G EGI PDBID:2OQK PDBID:2V6H EEG NG FDG NPG DF GEIPETT TVGG GKW QH RASK TDA TKDK PDBID:2OUB PDBID:3A0V SS GCY GWG NG LY KDG VLG LPD KGER NAKTQ PDBID:2P61 PDBID:3BI9 SPT KL EWQTI LGQ PNSKCNA DGTR STK KVQFG SNT GWFN PDBID:2PNE AL LV CPG GPG NPGC GT GTPK PDBID:3ENU PDBID:2PVT KNF PRL PFG WENK GPR HNY DAGA GCY GG NG LY ANL FDNF PDBID:2QDB PDBID:3EZM GN KGQ LL 
AKK DKYG DG KG GS RTNG LNSV DG NFIET GSS TRAQ PDBID:2QHW DG GCY WG NG LY PDBID:3FAJ PDBID:2REA KKGD DGR YNPD EGQ RMVLGRT PDBID:3GLW PDBID:2REY RNF TR GQ MEYN MKGG EPG ND FENM PDBID:3GM3 PDBID:2UUX RTDD AQL AVT PIGW MGNC PDBID:3GQU PDBID:2VSL PSS LGAHG PWW IAE PRN FGTW GD FHCG KPSE PGC PDBID:3I3Z PDBID:2W51 TSI FVNQ RPGDC TFS TDDA DSQICEL KKL GETCKGC PDBID:3I7W PDBID:3BKS DSSTS NLTKDR KP CKNG TGSS YP GNPY DRSKR EG FNDL ELAV TADYP KETS PDBID:3I7X PDBID:3CQ1 DSSTS NLTKDR KP CKNG TGSS YP GNPY DPELG VVNLG PP PLHD LPGV FEPP RLL PDBID:3IE9 PDBID:3DML ADGA KM ETP KVG VAGVLG KKE TP QPGC QRD PPGL LARP FTP GD PDBID:3INC PDBID:3G7C TSI WIREYP REES PDBID:3J4G PDBID:3KLU LDNYRGY NF TQA NTDGS GILQ SRWW TPGS LSFFPGQT PISKRF DKEG TTEL PTFKA NLCN SSD KGT IRGC PDBID:3L1M PDBID:3JZR GK KDT DEKQQH CSND VKE PDBID:3LLH PDBID:3KVT EPN GD GQP GD PS GG KIPAT TEGMLN PVLN PTDVC PDBID:3O70 PDBID:3ONH LYFQGL FAGR NECH KS VPEV PQD ASNQ YD EDLNDR GNG EEGDT DELPCNT PDBID:3P2X PDBID:3S9K CTS NLETYEW SI KEG SRTPGT KAIISEN DSPK FDS PDBID:3PAZ GGLVTR SPGI AE PA NPG VDK IKDM PEGA KINE PDBID:3UAF CTPH GDSP CPTD KK DRNG GP HRDSN PS LDR PDBID:3RNT PDBID:3WVT CD GS GSNS NYEGFDF LSSG ENN HTGASGNN PQCKEE GT SPG GELT APR LSKC ST PDBID:3UB2 DQG GD SRWSKD GAI QALS APGAE SGL GRG PDBID:4DP0 PDBID:3UB3 ADDGSLA PS PAG SEED AKG CSPH GM SRWSKD SEE GAI QALS PGAEG SGL GRG PDBID:4F68 PDBID:3UMD FGFKGV EHVAFGS GDG VIG DSPEF DYQM SGD PDBID:4FE6 PDBID:3UME LWQRP GG DTGA IG CG PTPVN AFGS GDG VIG DYQM SGD PDBID:4GQV PDBID:3V1G KPTT DEDW EKT DSDG TSI GRG TSI GRG PDBID:4K59 PDBID:3WUN EGE KADY GN PRG LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:4KGT LCN SSD DGNGMNAW KGT IRGC GG DG GGL DG PDBID:4AYA PDBID:4LFQ VPSIPQNK HLKPSF VPSIPQNK RKTCG PDBID:4CGQ PDBID:4LTT QPQN AEHV RMEE KPF PD DD PDBID:4DH4 PDBID:4NEJ VAA GG GG GD KKA PDBID:4ETA PDBID:4OXW LDN NF TQA NTDG GILQ SRW DGRTPGS EGKSK GG PELV IYSAL LCN SSD DGNGMNAW KGTD RGC PDBID:4TQN PDBID:4ETB VKNP YQE RKTS LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:5B79 LCN SSD DGNGMNAW KGT RGC CDFCL 
SKINKKTG CSDCG KTYRW CIECK NICG SEN PDBID:4ETE CDDCD LTPSMSEP ASIY LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:5BPO LCN SSD DGNGMNAW KGTD RGC TSI TSI PDBID:4EWW PDBID:5PAL TSI DPGT LKGK DKDQ SAHG DSDH PDBID:4EWZ PDBID:5UVS TSI DVEALGL IKHP PPDH PDBID:4F6D PDBID:6BBK VN FGFKGV ATPKDT STL NQDG KNDG PDBID:4GSP PDBID:6CEC CD GS GSNS NYEG LSSG ENN HTGASGNN CPH PAAG DVTQ CGDCG IQE LSCY GRYING PDBID:4HB6 YIDL YYCQ KAKNG VPD GKGC PDBID:6CEF PDBID:4JHB PWCPH PAAG DVTQ CGDCG IQE LSCY GRYING GS NKE CTG KE TASW RQF YEDLEI YIDL YYCQ RLTDG PDBID:6EVM PDBID:4LJN DPSH SFCL KEQNREK CADCG CIECK SSCR ADN CDSCD PDBID:6FH6 CDPP PKGM KK KG SMSV HEDAWPFL NLKLVPG IKKP EDDS PDBID:4P2P PDBID:6IFH IPGS NN GCY LDSC NT NSKN NLDTKKY MKIPGMD TPDV AYGE KPF PDBID:4RLM PDBID:1AB0 LDN NF TQA NTDG GILQ SRW DGRTPGS GDAFV GD STHK TADD GG DG GD LCN SSD DGNGMNAW KGT RG GC CR KG PDBID:4YM8 PDBID:1ACD LDN NF TQA NTDG GILQ SRW DGRTPGS GDAFV GD STHK TADD GG DG GD LCN SSD NGMNAW KGT RGC KG PDBID:4YOP PDBID:1B1J LDN NF TQA NTDG GILQ SRW DGRTPGS QDNS DAKPQGR TSPC ENKN REN WPPC ENGL LCN SSD DGNGMNAW KGTD RGC QSIFRR PDBID:5CO9 PDBID:1BM2 TSI HPW KI HDG SESAPGD GN DGAG LWV PDBID:5DM9 RNQ LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:1BQK LCN SSD NGMNAW KGTD IRGC KD PA APG PTDK IKGM PDGA KINE PDBID:5HBP CTPH GDAP IPE HE APA ED PDBID:1D1M PDBID:5HMB NADG DG NADG DG PD PHG GDR GAT DAAG PDBID:1DDW PDBID:5HQI DPNTK STRN GS SQK SRAN TSI FVNQ PDBID:1E6L PDBID:5LAW DKEL WNMPNMD SAL AEA KPF EKQQH CSND VKE PDBID:1FDD PDBID:5LN2 IKCK EVCPVD PN HPDEC PEDM DGVKGK SQIP SVG DEKQQH CSND VKE PDBID:1FE5 PDBID:6CE6 GCY EP DTAD MISSSTNC CPH PAAG DVTQ CGDCG IQE LSCY GRYING PDBID:1FF2 YIDL YYCQ IKCK PN HPDEC PEDM DGVKG PDBID:6DXH DBID:1FLQ SKHAFSL CSKC REN LTDG YFDG PD YDED LDN NF TQA NTDG GILQ SRWW DGRTPGS DMLKM NLCN SSD DGNGMNAW KATD RGC PDBID:6INS PDBID:1FLY GERG TP PKG TSI LDN NF TQA NTDG GILQ SRWW DGRTPGS PDBID:6JQ4 NLCN SSD DAN KGTD RGC DPEN GKRK PDBID:1GOD PDBID:1A0B KDKT SPDL PDBID:1HEH PDBID:1AF5 VRG WAD SGSSSW NGGQ PNGS KNGS NKE PNQSYKFK 
TQR GS LKLKQ AALN PDBID:1HRC PDBID:1ALB AQCH EKGG HKTGP LHGLFG TGQAPGF IPGT CDAFV GD STFK LG TADD GG DG PDBID:1J73 GD KG GERG PDBID:1B1I PDBID:1JZA DAKPQGR TSPCKD ENKN REN WPPC NG KSTG AKNQG YAF GLPEST PN KSTG AKNQG PDBID:1BAS YAF GLPEST PN KNGG HPDG EKSDPHI RG VSAN KEDG TDEC PDBID:1K1Z ESNN RKYTSW KRTG GPGQKAI AQDKKR ELGL GIPP PGAF NPG AEAEH TATN PDBID:1CYI PCNR AACH NSVMPEKT LDGG GA WADRL PDBID:1M1S PDBID:1DKJ PP PATG SNNEH PV DAKG PAEE APFKAG LDN NF SQA NTDG GVLQ SRWW DGKTPGS PDBID:1O41 NLCN SSD KGT RGC EEW KI NAENPRG SETTKGA NAK LDSG SR PDBID:1EV3 ADGLCHR TSI TSI PDBID:1O47 PDBID:1HE7 EEW KI NAENPRG SETTKGA NAK LDSG SR GQPA NG TSF AANE QP PF NPFE ADGLCHR PDBID:1HQ8 PDBID:1O4R PNNW RN IYS IPANG EDG SYNQL IPKG SIQAEEW KI NAENPRG SETTKGA NAK LDSG SR SSF DCAN ADGLCHR PDBID:1I0V PDBID:1OR5 CD GS GSNS NYEG LSSG ENN HTGASGNN SALT EDDSV RYG PDBID:1I3Z PDBID:1PAL LPY GCL VDG ESVPG KK EKHGY DAHT SFAG AADS DQDK FSPSA DKDG PN PDBID:1Q4R PDBID:1IOR KDGV HQGY LDN NF TQA NTDG GIFQ SRWW DGRTPGS PDBID:1QL2 NLCN SSD DGN KGTD RGC IDT IDT IDT PDBID:1IRW PDBID:1T00 EKGG HKVGP LHGIFG SGQAEGY IPGT HMAG TDD LKN AAWC GDKI IDEN SIP PDBID:1JDL GG CMACH GPDA NLVGP LTGVID AGTAPGF PD PDBID:1UIA PDBID:1JKD N NF TQA NTDG GILQ SRWW DGRTPGS MDG GY TRA AGDR GIFQ SRYW DGKTPGA NLCN SSD GN KGT RGC NACH QDN DPQ QNR VQGC PDBID:1UWM PDBID:1KH0 EHNG KPG AYEPNPAT DG FANG NG FANG NG PDBID:1WAQ PDBID:1KH8 WDDW PL EFPL DPEST DSAN DSSTS NLTKDR CKNG TGSS YP PDBID:1WHI PDBID:1KXI QE GSGR NIG TKRG RPDG RDDK AP LPFI PEGK KKFPL DNC SAL TDKC LPFI PDBID:1WY9 PEGK KKF DNC SAL TDKC SND NGNG SEE PDBID:1L5D PDBID:1YGT LDVR GPGGK GPS DE GT NYLP GSG GGN KP NNDTD NKT AAAG SPDG DP PA AC CD DP PDBID:1Z21 PE EA FS AF SNPP PDBID:1LVE PDBID:2A9I PD SLG SQS SSNS KPGQ WAS ESGVPDR TPST PQE KPSG RYN LQTGL GT SS QAEDV STPY PDBID:2AZ8 PDBID:1MB3 LWKRP GG DTGA LPG IG CG GPT LPEI DDDLAHI KPI PDBID:2BQQ PDBID:1NEH NG RYF SSDRFR NI LPQG IDG EEGE PAN AADD QD QF QLFPGK VNG CASW DDV NPNW PDBID:1O3X PDBID:2COQ NKLI SQG TEDN PR ETGE DTA LGS 
GR RDL AFN PDBID:1O5J PDBID:2EH9 KG YETP LTEY ELG GK GEEP PDBID:1PZC PDBID:2HWV AE PA NPG VDK IKDM PEGA KINE GD AY RG IG FGD ED PS CTPH GDSP PTY RGV PDBID:1REX PDBID:2JIN MDG GY TRA AGDR GIFQ SRYW DGKTPGA DYLV PS LG KENG QEG NG LKNL NACH QDN DPQ QNR VQGC GY PDBID:1RFP PDBID:2O0P LDN NF TQA NTDG GILQ SRWW DGRTPGS RGQAN AEPGED DADG NLCN SSD DGN KGTD RGC PDBID:2O0Q PDBID:1RS2 RGQAN GED DADG NVIDT YNR ENP PDBID:2OAI PDBID:1SNP REDG DGTL NNYH LAG HVG AG GA GD KG LL ETVEKY AKK DKYG DG PDBID:2OJR KG DTNN EGDE DTNN EGDELLA TLTG EPSD AG PDBID:1T2J EDGR LSDYN QKES LR RG QPGG APGK SGSG ADSVKGR NSL DWYGMD PDBID:2ON8 PDBID:1TCY GKTL MDG GY TRA AGDR GIFQ SRFW DGKTPGA PDBID:2P1X NACH QDN DPQ QNR VQGC G ENG KQG GD DRSG PDBID:1TVQ PDBID:2Q1M GD PR LG TMDG NG EK GN GPLPSK SEPP DW NANY NK NKSK HVG GG NSEHQ PDBID:1UW3 PDBID:2X44 LGGY FGN PAVV SSRG AT DS MMGN LDDS SGN PDBID:1VED PDBID:2YHN LDN NF TQA NTDG GILQ SRWW DGRTPGS CMVCCEE PCG CPVCR CMVCC PCG CPVCR NLCN SSD NGMNAW KGT IRGC PDBID:2Z44 PDBID:1Z3L PIQE LPE TAAA NLTKDR CKNG TGSS YP GNPY PDBID:2ZHH PDBID:2A9Q NSGNQ SRSDC PD KPF PDBID:3A7L PDBID:2APB PAEL SKEH EADG EVG SAG VKA PVS AA PR VTG DTGH AG GDIPDG QE DSP EPYAGG SD ES GGGT PDBID:3BOV PDBID:2B29 APKE VG ELE SLQSERA PS GA SEG DTNI GN DGL EQLSSN LKDG SAEAVG PDBID:3BY5 PDBID:2FKE RKGA KADE TFSQASL GAGA GD DGRT EDG GK SVG ATG PGI PPH PDBID:3DP5 PDBID:2FKL ASWS AV NTVHPEKT RNPGPGM GEAMIP DKC RMDV ID PDKC RMDV ID PDBID:3EAZ PDBID:2FMB MPW KI YPPETG TNYPG DG AS IDEEV LEKR ND DTGADTS LKYR VG KG DI ADGLCTR PDBID:2I9V PDBID:3ERS AFGS DDG DGDG DPKQVIG DSPEF DQQM ALSG HENADKL VGQK LVPYYS MG CNL RG TDDG PDBID:2IGP PERM PAG PLND QAP CPSY DTK VGAPH PDBID:3GBQ PDBID:2PK7 ADD KRG QN NG CPICK SADK DG CPICK SADK DG PDBID:3GK2 PDBID:2PWS EGDKH SHFL DSDG GCY WG NG LY PDBID:3GK4 PDBID:2QAS EGDKH SHFL DSDG DLMQ APGG PEPH TKAAGV YPD QHQ GET PDBID:3GSP GG FSA CD GS GSNS NYEG LSSG ENN HTGASGNN PDBID:2QVT PDBID:3HJX SPTA SGL NEQM QAEE GPKLNL SNMDQH SGAT RPMDEYS TT NDIPNF PDBID:3M4T PDBID:2VKS GLFC ED SVA 
SFI IVRV AAA AV EA DV DGPG TD SGA PDBID:3OZZ PDBID:3A0S AK HY LSEHPGG AG KDG VLG LPD GE NAKTQ PDBID:3UNN PDBID:3BIA AH MPDCS WD CGSLNG RPP RDQ AD LG PNSKCNA DGTR STK SN GWFN PDBID:3V19 PDBID:3C4S HSI FPG NVDDTYYR DG GN FPG NVDDTYYR DG PDBID:3VYA GN ED DGN GGM RDE RKV RN IARFKWA PDBID:3C5K PDBID:4AHI PWCPH PAAG DVTQ CGDCG IQE LSCY GRYING DAKPQGR GLTSPCID ENKNG REN GGS WPPC NG YIDL YYCQ GEDM PDBID:4AQI PDBID:3C7I GRDG DKNE HPW KI HDG ESAPG GN DGAG LWV PDBID:4AQJ RST SRNQ RRDG NYLA DKNE PDBID:3DJN PDBID:4EX0 NHK PSE PY GEDG TSI PDBID:3FFY PDBID:4LYO NAT PNEK PQKKGR VDE GPER KI LDN NF TQA NTDG GILQ SRWW DGRTPGS PDBID:3GKY NLCN SSD GNGMNAW KGTD IRGC HSI HSI GERG PDBID:4UNG PDBID:3HQB TSI TSI YKH NNDE NNDE PDBID:5AFG PDBID:3IN2 DAAQQH CSND VKE AECS DQM NTN DKSCKQ PKNVMG TAAD GSG PDBID:5B1G FPGHSALL LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:3LR0 LCN SSD NGMNAW KGT RGC DPD EKTD GDDT ND DD PDBID:5BMH PDBID:3LVE GKTL PD SLG SQS SSNS KPGQ WAS ESGVPDR PDBID:5C6X GT SS QAEDV YYSTPY PD PG AEPP GAGE GPGG PDBID:3MF8 PDBID:5CUL QDRLT PAGN GG VEYG SS PTH RRGETPLP NVD PDBID:3MYA PDBID:5D53 PRGVDPSR LVNT PRGV PS TSI FVNQ PDBID:3Q7Y PDBID:5FD1 TDTG NPDG DRSDPGI DGNG TDTG NPDG DRSDPGI IKCK CPVD PN HPDEC CPAQ DGVKGK DGNG TDTG NPDG DRSDPGI PDBID:5KAZ PDBID:3RHE LPY HGRL VDG SESIPG KN EKHGY AEGS KN PIES PT VKTG IEPKA SNE QDF PS DPDE PDBID:5NGN PDBID:3STM DSCSEYC CCP DSCSEYC DGQ CCP KGKDI GK GS TMTG EGDN KN GD PDBID:5OC4 GD PAQI AE PDBID:3SUL PDBID:5PAZ NADQ VACS GPNG SGTG AGNG NGQ AE PA NPG VDK IKDM PEGA KINE PDBID:3T1X CTAH GDSP SA DRKG ALWA PPP LLG GERM GEHA DETA PDBID:5TAB PDBID:3WW5 GPLGSEV RCICE NDF CEECQ CYVC LDN NF TQA ETDG GILQ SRW DGRTPGS PDBID:5UET LCN SSD DGNGMNAW KGTD RGC DVEALGL IKHP PPDH PDBID:3ZEK PDBID:5XUK LDN NF TQA NTDG GILQ SRWW DGRTPGS KRFK KDK GKELS SPKN AG NLCN SSD DGNGMNAW KGTD RGC PDBID:5ZND PDBID:4AHG RDDVA PDCDDW DPHLCD TSPC ENKN REN GG WPPC NG PDBID:6CEE PDBID:4ET9 CPH PAAG DVTQ CGDCG IQE LSCY GRYING LDN NF TQA NTDG GILQ SRW DGRTPGS YIDL YYCQ LCN SSD DGNGMNAW KGTD 
IRGC PDBID:6EKB PDBID:4GFY CANCEGEG CSQCKGG HFNG KAG CWLCRGK CGDCNGA SS GCY GWG VNGA LYPDFLCK PDBID:6I3S PDBID:4HMB EKQQH CSND VKE GCY WG NG LY PDBID:6QQJ PDBID:4HP9 DPRLPDN ILG GPET TKSG DQKG FQSR ARVENC MQ CTC GPRT RKDG NEDG PDBID:6RNT PDBID:4HRS CD GS GSNS NYEGFDF LSSG ENN HTGASGNN GR VVGE DDDG DG PEDQ VEV RLIK PDBID:1B2O PDBID:4IAS TVC NPGT PDDW CPLCA TVC NPGT PDDW LDN NF TQA NTDG GILQ SRW DGRTPGS CPLCA LCN SSD DGN KGTD RGC PDBID:1BKF PDBID:4KUO DGRT KRG EDG NK GK SVG ATG DPSQPDN VLG GPDT RKDG DPEG PGI PPH PDBID:4LFS PDBID:1BPQ SCI RKTCG IPSS NN GCY NPYTN NN SSEN NLDKKN PDBID:4ML2 PDBID:1C0C TLPL KG PD TDK KET DSST NLTKDRC CKNGQTN KYPN GNPY PDBID:4P9V PDBID:1C9H HPW KI HDG ESAPG GN DGAG LWV DGRT KKG QNG SRDRN GK SLG ATG SRNQ PGV PPN PDBID:4PTA PDBID:1DMM NW DTEG KRPN GD DPFGQ SHNG NG DEHG PDBID:4XC4 PDBID:1DMQ TSI DPFGQ SHNG NG DEHG PDBID:5AEF PDBID:1DYZ GVVI GVVI AQC NDAM NVK DKSCK AKVAMG GGG FPGHWAMM PDBID:5CB9 PDBID:1DZ0 PD PA PG AEPP GAGE NDTACCY GPGG AQC NDAM NVK DKSCK AKVAMG GGG FPGHWAMM PDBID:5J6X PDBID:1E6M DN KQ DN SDKEL DDF WNMPNMD DGAMSAL KPF PDBID:5K2L PDBID:1E97 QPGD DPFGQ SHNG NG DEHG PDBID:5O3A PDBID:1EA2 IKHP PPDH DPFGQ SHNG NG DEHG PDBID:5UER PDBID:1ED1 DVEALG IKHP PPDH LRPGG VPTG TGTA PDBID:5VEA PDBID:1FAV LWKRP GG DTGADD IG CG PTPV EL PDBID:5WPB PDBID:1G2S CPH PAAG DVTQ CGDCG IQE LSCY GRYING SDISKT SNSCSQR KRG HPRK TPK YIDL YYCQ PDBID:1HKF PDBID:5WRB VAGQ TGSL AL KPR TSR TDL RPSDN LDN NF TQA NTDG GILQ SRW DGRTPGS PDBID:1I0C LCN SSD DGNGMNAW KGT RGC NSS MKD RQ NASG NSS MKD RQ PDBID:6FH7 NASG NNI SMSV HEDAWPFL NLKLVPG IKKP EDDS PDBID:1IOT PDBID:6MHN LDN NF TQA NTDG GILQ SRWW DGRTPGS AFGS GDG DPKQVIG DSPEF DYQM ALSGD NLCN SSD DGN KGTD RGC PDBID:8PAZ PDBID:1J4H AE PA NPG VDK IKDM PEGA KINE DGRT KRG EDG GKQEVI SVG ATG PGI CTPH GDSP PPH PDBID:134L PDBID:1JC7 MDG GY TRA AGDR GIFQ SRYW DGKTPGA AN PVHH GGN GDPLICDN STG WPAHRNE NACH QDN DPQ QNR VQGC PDBID:1JHC PDBID:1A6F AAGEPLLA DPSLFKPN MA MDG QDVRNG DD GN NRQ QPEN SKKIG ER KEK SQL NSEF LRQQ DWLV 
PDBID:1BEL PDBID:1JON DSSTS NLTKDRC CKNG TGSS YP GNPY DR LS IN SP GIVK NKDT PDBID:1BGI PDBID:1KC2 LDN NF TQA NTDG GILQ SRWW DGRTPGS EEW KI NPENPRG SETTKGA LNV LDSG SR NLCN SSD NGMNAW KGT RGC ADGLCHR PDBID:1BO0 PDBID:1KP4 PVGI TSSHCPRE KLD KKT TPK DLCTQA NPFGFP PDBID:1BYL PDBID:1KXW RD DD DD DQVV RG TNFRDA PW LDN NF TQA NTDG GILQ SRWW DGRTPGS DPAG NLCN SSD DGN KGTD RGC PDBID:1EIF PDBID:1LSL KVG DG IFE TSS GD LQTY PEGI VTC QMNG EA VTCG FGG DV NK EPG VG PDBID:1LYO PDBID:1FKV LDN NF TQA NTDG GILQ SRWW DGRTPGS KD GY TQA NNDS GLFQ NKIW DDQNPHS NLCN SSD GN KGTD IRGC NICN PDBID:1O4H PDBID:1GP3 EEW KI NAENPRG SETTKG NAK LDSG SR NPSTG SEK CG NENCPPG GQTR KR DQ ADGLCHR CPSKSGL KQTC PDBID:1OAP PDBID:1GS3 QQNN DLDK NPSY KEKPAVL GD NPFGQ SHNG NG DEHG PDBID:1QTO PDBID:1GSW VD DRD GD TD RAV DYADTSG PA AFGSED DDG GDG IG DSPE DYQM ALSGD DPAG PDBID:1HFX PDBID:1RMD NDLAG GY TQA SD GLFQ DKDF NICD CQICE LAD TSCK CPSCR AQDC PDBID:1I9E PDBID:1SV9 EG YPRQ VNG KSNS AS HWSDS FA SS GCY GWG NG LY PDBID:1IGU PDBID:1U3Y DPD EDDG DE LN AEG EDDG DPD DRT VTG KD RQF YKDVNI RKSDL LKK DE LN PPAEG PDBID:1UIE PDBID:1IR8 LDN NF TQA NTDG GILQ SRWW DGRTPGS LDN NF TQA NTDG GILQ SRWW DGRTPGS NLCN SSD DGN KGT RGC NLCN SSD DGN KGTD RGC PDBID:1UIF PDBID:1JOI LDN NF TQA NTDG GILQ SRWW DGRTPGS AEC TDQM DKSCK PKNVMG GAG AAGT FPGHISMM NLCN SSD GN KGTD RGC PDBID:1K58 PDBID:1W2L DAKPQGR LTSPCKD ENKN REN WPPC NG HQSIFR SIDG RLVGP FKGLYG EDG QPGAKV QGYPNV PDBID:1KJT PDBID:1WJX YPD APKARI LDKK SDL RAED NNV PTSA FTGS DG NL PVDPRR LGKVE KG NERG EEDF PDBID:1Y49 PDBID:1KXY FCNAFTG LNN NF TQA NTDG GILQ SRWW DGRTPGS PDBID:1Y9T NLCN SSD DGN KGTD RGC FSIPNDL TTNK SKC AG VDGC NKIN DG PDBID:1LAC PDBID:1Z3P KLPD EGIH KPG NEDD NDKA SPVK LVP NLTKDR CKNG TGSS YP GNPY VGQT PDBID:2A7B PDBID:1MKU NG SNTNPS PKTF PAFNT NPQK KG NNF IPSS NN GCY NPYTN NN SSEN NLDKKN PDBID:2AZB PDBID:1NXT LWKRP GG DTGA CG TPA PD MLPD AD KPF PDBID:2B1Y PDBID:1O43 RAGT KSPI AEGH EEW KI NAENPRG SETTKG NAK LDSG SR PDBID:2BFH NS ADGLCHR KNGG HPDG EKSDPHI RG VCAN 
KEDG KCVTDEC PDBID:1O44 ESNN RKYTSW KRTG GPGQKAI EEW KI NAENPRG SETTKGA NAK LDSG SR PDBID:2EYW ADGLCHR GAM DEED KKGD KPE DSEG VPYV PASA PDBID:1O4A PDBID:2GV2 EEW KI NAENPRG SETTKGA NAK LDSG SR DEKQQH CSND VKE ADGLCHR PDBID:2H2B PDBID:1O4I GSHM PGF FG NG MDNV SGK WRRT EEW KI NAENPRG SETTKGA NAK LDSG SR PDBID:2H36 ADGLCHR MEKV LGNY NG DDD EDN PDBID:1Q4V PDBID:2IIY GV SLYQL GERG GERG AGDK ATWC YSN VDD CMP KG GAN PDBID:1QKF PDBID:2OXK SRR VG NGKQ EK PDBID:1R75 PDBID:2P0F FKNG MFSNI TEAR SKE YDS LGDT IG GG GN REPPPTA GWGPAGS LRGA GRHLS RR RPL AEDERG TIPG DH PDBID:1RAQ PDBID:2PKT LQCH DQGG NKVGP LHGIFG SGQAEGY IPGT TPGS CIG DG TSNM CTD PDBID:1RBI PDBID:2QHE NLTKDR CKNG TGSS YP GNPY NG TYY PDBID:1RBW PDBID:2R2Y DSST NLTKDR CKNGQTN TGSS YP GNPY YL GT TDDS RTSG FPDD PQCPSG KAGS PDBID:1RNN EPKT DSSTS NLTKDR CKNG TGSS YP GNPY PDBID:2R34 PDBID:1RNU TSI TPK TSI GERG NLTKDR CKNG TGSS YP GNPY PDBID:2R36 PDBID:1RNW TSI GERG PKT TSI LENY GERG DSST NLTKDR CKNG TGSS YP GNPY PDBID:2SAK PDBID:1RWJ DSKG KPG TAYKE DPSA KNKK TEKG LSEHIKNP KGMT PK MKGV GMY CHTKLF KAGAKR PDBID:2WOR PDBID:1S7I RRDD NYLA DKNE LAAV QGGR TKE PDBID:2WOS PDBID:1SFG RRDD NYLA DKNE LA LDN RG NTDG GILQ SRWW DGRTPGS PDBID:2X9C NLCN SSDITA NRCKGT WIRGC KPSD KPSD PDBID:1TDY PDBID:2XKH MDG GY TRA AGDR GIFQ SRWW DGKTPGA VN FGFKGV NACH QDN DPQ QNR VQGC PDBID:3AZ5 PDBID:1TEN LDN RG NF TQA NTDG GILQ SRWW TDT KDVPGD EDE KPDT GD DGRTPGS NLCN SSD NGMNAW KGT RGC PDBID:1VAT PDBID:3B84 LDN NF TQA NTDG GILQ SRW DGRTPGS GQYC GG GDG LALTSG LCN SSD DGN KGTD RGC PDBID:3BYR PDBID:1YEA MDEG RGRA GP RGDT FPG RC EEGG NKVGP LHG SGQVKGY IPGT PDBID:3BZS PDBID:1YMV ST PTH KLGETPLP HKG MPNMD DGAMSAL EAK PF PDBID:3C12 PDBID:1YVS DATG DANG DANG TAG SD TGL LANV PDNY VAPGK NREGKLP SG NY SDW TDHYQT PDBID:3CTG PDBID:1Z3M KE KTYC ELN TVP NG PVF NLTKDR CKNG TGSS YP GNPY PDBID:3CYI PDBID:2BWK GAQG DAKPKGR TSPC GAN REN GG RPPC NG PDBID:3HNU PDBID:2BWL NADG DG GG GAAE DCNG LNRI NG DAKPKGR TSPC GAN REN RPPC NG ESF PDBID:3KMJ PDBID:2D4L NDR SPLG RKS EKG 
TALEDY KN LGFGAIF ATPGS PPN DNDY VNN SQG KDP TAGLK IKP AIG GDD RLTQ PDBID:2DYQ PDBID:3KQ6 AR SDSL PL GRDPH LGRQS QPH HSI HSI PDBID:2F3L PDBID:3LZ2 DV LIGE FSGK LTYA NA LTDS FSEA LDN NF THA NTDGS GILQ SRWW TPG LRGA GS LIGA LHGA LTNG FKGA LTNA NLCN SSDI GGNGMNAW KGT IRGC LTEA FDDA ITGA FSLA NPKTG PDBID:3OSE PDBID:2F4G TWSM RF DARQD LDN NF TQA NTDG GILQ SRWW DGRTPGS PDBID:3RTO NLCN SSD NGMNAW KGTD RGC TSI PDBID:2FB0 PDBID:3S8S NET EEG SSTRRD GQIP RLNDNV HPRTR TS HL MG DIKG PDBID:2FQL TPQTV SHG PAF PPNKQ PLS LNGE LRNG PDBID:3WIT PDBID:2M6D NG EE PD GES KD GGS CVYP PDBID:3WRP PDBID:2M6E SAA LKS GC CV VY YP PW WC PDBID:3WX4 PDBID:2M6G AG APGK GCPD PDBID:4DP1 PDBID:2OLI ADDGSLA PS PAG SEED AKG CSPH GM GCY NG LY PDBID:4F1A PDBID:2P6V TSI TPD PDBID:4IP1 PDBID:2PW5 KGK ADWC FSD ADT VTAN GLP DGQG GD KG LL ETKVEKY AKK DKYG DG PDBID:4OV1 KG DQDKC APEL DDYG KGDG PAD CPEN PDBID:2PYK PDBID:4OZL GD KG LL ETKVEKY AKK DKYG DG ADTP EKGD VRTG KG PDBID:4PAZ PDBID:2PZW AE PA NPG VDK IKDM PEGA KINE GD KG LL ETKVEKY AKK DKYG DG TAH GDSP KG PDBID:4TS8 PDBID:2RKN QDPES DPQLLG VKNP GQYQE RKTS CG KENP YKNS PT TC PDBID:4XXL PDBID:2RNS GVDK KDGP RGK FPDG INGK NLTKDR CKNG TGSS YP GNPY PDBID:5B52 PDBID:2VJW RLA AV EA DV DGPG TD SGA PDBID:5ER4 PDBID:2VP7 EGDKH SHFL DSDG SSSD GICT NDDQ EASC EAS CDTCM TE PDBID:5GSP GQVETIVS CD GS GSNS NYEG LSSG ENN HTGASGNN PDBID:2XDY PDBID:5LAZ GSKANEG HQ PRK TFGV STKG EKQQH CSND VKE PDBID:2XFE PDBID:5UES VNG GG TG NG TGV NN GG DVEALGLH IKHP PPDH HCN PDBID:5YI1 PDBID:2XMU ATNDERV VPTI DAQA LTSK VPTI EDAQA LTSK PDBID:6CED PDBID:2YXY CPH PAAG DVTQ CGDCG IQE LSCY GRYING HPG EG GV LKG YIDL YYCQ PDBID:2Z7J PDBID:6FGL IYFA GHR DPNE YLDY AEKS APGQA HGR HDAAWPFL NPRLVSG IKNP EDDS PDBID:3A0D PDBID:6H0K SPN TG GPS QGDC SG TGGLGSGC HNNG LDN NF TQA NTDG GILQ SRWW DGRTPGS DQSN QQDR NLCN SSD DGNGMNAW KGTD IRGC PDBID:3ADY PDBID:155C NTLTIP PSVP PNSQ NEG KCKAC GKT NPDL GKN AXXXX PDBID:3BF2 PDBID:1ANG LTYR DAAGA VTAVI RG QDN DAKPQGR GLTSPCKD ENKN REN WPPC NG PDBID:3FZA PDBID:1AQT NP 
KTWC TVP GG AEQ SEG YPGH QHG PG SHG PDBID:3I7T PDBID:1BXY SR GP TD KIR LIAPGL PIGY RLQ PIGY RLQ AHL PDBID:3I7Y PDBID:1CDP DSSTS NLTKDR KP CKNG TGSS YP GNPY FAGVL AADS GLTSK DQDK FKADA DSDG PDBID:3IJU PDBID:1CDT LDN NF TQA NTDG GILQ SRW DGRTPGS KLIPIA PEGKN ASKKM SAL TDRC KLIPIA PEGK LCN SSD DGN KGTD RGC ASKKM NVC SAL TDRC PDBID:3JTE PDBID:1DPY CNSI MKMPKLS TPHM GHG KPV GCY PG NPNIK QP DSAD ML STSC PDBID:3MAZ PDBID:1DQ7 PAC GSDS IDI GQ LEKP RGNLR GDN PY PTPVP DGDN PY PTPV PDBID:3NGR PDBID:1F1W QSGP KN ATN QPGT DVGRNV SVES EEW KI NPENPRG SETTKGA NAK LDSG SR PDBID:3R5P ADGLCHR TFQK GRKTG GG AEKN KK PDBID:1FLU PDBID:3RH1 LDN NF TQA NTDG GILQ SRWW DARTPGS DSST NLTKDR CKNG TGSS YP GNAY NLCN SSD DGNGMNAW KGTD RGC PDBID:3RSK PDBID:1FLW DSST NLTKDR CANG TGSS YP GNPY LDN NF TQA NTDG GILQ SRWW DGRTPAS PDBID:3S3Y NLCN SSD GNGMNAW KGTD RGC GS RTNG LNSV DG GSS TRAQ DG PDBID:1HQB PDBID:3ZK0 SQFG PVSEF WDT SPGAE SDSM EA DG DR PAH KSGG PDBID:1HSQ KL KQG AH AQREDEL IKS EGG YGGK PDBID:4ET8 PDBID:1I07 LDN NF TQA NTDG GILQ SRW DGRTPGS NSS MKD ASG NNI ARNSSE MKD NASG LCN SSD DGNGMNAW KGTD RGC NNI PDBID:4FC1 PDBID:1IR7 PGDYA LDN NF TQA NTDG GILQ SRWW DGRTPGS PDBID:4FDX NLCN SSD DGN KGTD RGC GVEHF AG GVEHF AG PDBID:1ISU PDBID:4IP6 NG AQ GASPTA KV PGDNQ APGG CDAF KGK ADWC FSD ADT VTAN GLP DGQG NG AQ SPTA KV PGDNQ APGG CDAF PDBID:4N0Z PDBID:1IZA NI KTEC IEKN SVP NK TSI SLYQL FVN TSI PDBID:4WBF PDBID:1LCJ KER ATN APGT SIDENA EPW FKNL APGNTHG SESTAGS QNQ DNG PR PDBID:4YX5 SDGLCTR PTN ANGV ND YA VEG PDBID:1LPI PDBID:5FG4 LDN NF TQA NTDG GILQ SRWW DGRTPGS DTGNIFS IKKP AKDT NLCN SSD DGN KGTD RGC PDBID:5HBO PDBID:1M4A IPE HE APA ED KKA RPR PDBID:5MXY PDBID:1MLI GVDG KFAD PEM FKEKL IAGH HPSS IAGH HPSS IAGH HPSS IAGH PDBID:6BSY HPSS IAGH HPSS IAGH HPSS IAGH HPSS NPE IAGH HPSS IAGH HPSS IAGH HPSS PDBID:6I5A PDBID:1NN7 QFQT QVTV PDSD RYN DT ENQA GTG SG YPDTLLGS PRHE LIPE
Table S2. Amino acid sequences of β-turns of the 1000 protein samples.
Note S3. Amino acid sequences of long β-strands with two adjacent RB residues located in the middle segments of the β-strands.