Zhenling Peng
Tianjin University
Publications
Featured research published by Zhenling Peng.
Current Protein & Peptide Science | 2012
Zhenling Peng; Lukasz Kurgan
Intrinsic disorder is relatively common in proteins, plays important roles in numerous cellular activities, and its prevalence has been implicated in various human diseases. However, annotations of the disorder lag behind the rapidly increasing number of known protein chains. The last decade saw the development of a relatively large number of in-silico methods that predict the disorder using the protein sequence as their input. We perform a first-of-its-kind comprehensive empirical evaluation of the disorder predictors, which is characterized by three novel aspects: (1) we evaluate the quality of the disorder predictions at the residue, segment, and chain levels; (2) we consider a large number of published predictors that are accessible to the end user and evaluate them on a relatively big dataset with close to 500 proteins; and (3) we assess statistical significance of differences between the considered methods. Our study reveals that there is no universally superior predictor and that the top-performing methods are complementary. We show that while recent consensus-based predictors outperform other considered methods for the residue-level predictions, some older methods perform better for the prediction of the disordered segments. Our analysis indicates that certain predictors are biased to under-predict the disorder, while some other solutions tend to over-predict the number of the disordered residues. We also evaluate the utility of the predicted residue-level disorder for prediction of proteins with long disordered segments and prediction of the chain-level disorder content. Lastly, we provide recommendations concerning development of a new generation of consensus-based methods and specialized methods for improved prediction of the disorder content.
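The chain-level disorder content and the under- or over-prediction bias mentioned in this abstract can be illustrated with a minimal sketch. It assumes that disorder annotations are given as per-residue binary labels (1 = disordered, 0 = ordered); the function names `disorder_content` and `content_bias` are hypothetical and are not taken from the paper, which may define its measures differently.

```python
from typing import Sequence


def disorder_content(labels: Sequence[int]) -> float:
    """Fraction of residues labelled disordered (1) in one protein chain."""
    if len(labels) == 0:
        raise ValueError("empty chain")
    return sum(labels) / len(labels)


def content_bias(predicted: Sequence[int], native: Sequence[int]) -> float:
    """Predicted minus native disorder content for one chain.

    Positive values indicate over-prediction of disorder,
    negative values indicate under-prediction.
    """
    if len(predicted) != len(native):
        raise ValueError("prediction and annotation differ in length")
    return disorder_content(predicted) - disorder_content(native)


# Toy 10-residue chain (illustrative data only, not from the study).
native = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
bias = content_bias(predicted, native)
```

Averaging such a bias over a benchmark would show whether a predictor tends to over- or under-predict, which is the kind of tendency the abstract describes.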
Cellular and Molecular Life Sciences | 2015
Zhenling Peng; Jing Yan; Xiao Fan; Marcin J. Mizianty; Bin Xue; Kui Wang; Gang Hu; Vladimir N. Uversky; Lukasz Kurgan
Recent years witnessed increased interest in intrinsically disordered proteins and regions. These proteins and regions are abundant and possess unique structural features and a broad functional repertoire that complements ordered proteins. However, modern studies on the abundance and functions of intrinsically disordered proteins and regions are relatively limited in size and scope of their analysis. To fill this gap, we performed a broad and detailed computational analysis of over 6 million proteins from 59 archaeal, 471 bacterial, 110 eukaryotic and 325 viral proteomes. We used arguably more accurate consensus-based disorder predictions, and for the first time comprehensively characterized intrinsic disorder at proteomic and protein levels from all significant perspectives, including abundance, cellular localization, functional roles, evolution, and impact on structural coverage. We show that intrinsic disorder is more abundant and has a unique profile in eukaryotes. We map disorder into archaeal, bacterial and eukaryotic cells, and demonstrate that it is preferentially located in some cellular compartments. Functional analysis that considers over 1,200 annotations shows that certain functions are exclusively implemented by intrinsically disordered proteins and regions, and that some of them are specific to certain domains of life. We reveal that disordered regions are often targets for various post-translational modifications, but primarily in the eukaryotes and viruses. Using a phylogenetic tree for 14 eukaryotic and 112 bacterial species, we analyzed relations between disorder, sequence conservation and evolutionary speed. We provide a complete analysis that clearly shows that intrinsic disorder is exceptionally and uniquely abundant in each domain of life.
Journal of Theoretical Biology | 2009
Jianyi Yang; Zhenling Peng; Zu-Guo Yu; Ruijie Zhang; Vo Anh; Desheng Wang
In this paper, we predict protein structural classes (alpha, beta, alpha+beta, or alpha/beta) for low-homology data sets. We use two widely used data sets, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence homology of 40% and 25%, respectively. We propose to decompose the chaos game representation of proteins into two kinds of time series. Then, a novel and powerful nonlinear analysis technique, recurrence quantification analysis (RQA), is applied to analyze these time series. For a given protein sequence, a total of 16 characteristic parameters can be calculated with RQA, which are treated as the feature representation of the protein sequence. Based on this feature representation, the structural class for each protein is predicted with Fisher's linear discriminant algorithm. The jackknife test is used to test and compare our method with other existing methods. The overall accuracies with the step-by-step procedure are 65.8% and 64.2% for the 1189 and 25PDB data sets, respectively. With the widely used one-against-others procedure, we compare our method with five other existing methods. In particular, the overall accuracies of our method are 6.3% and 4.1% higher for the two data sets, respectively. Furthermore, only 16 parameters are used in our method, fewer than in the other methods. This suggests that the current method may play a complementary role to the existing methods and is promising for the prediction of protein structural classes.
BMC Bioinformatics | 2010
Jianyi Yang; Zhenling Peng; Xin Chen
Background: Prediction of protein structural classes (α, β, α + β and α/β) from amino acid sequences is of great importance, as it is beneficial to study protein function, regulation and interactions. Many methods have been developed for high-homology protein sequences, and the prediction accuracies can reach up to 90%. However, for low-homology sequences whose average pairwise sequence identity lies between 20% and 40%, they perform relatively poorly, with prediction accuracy often below 60%.
Results: We propose a new method to predict protein structural classes on the basis of features extracted from the predicted secondary structures of proteins rather than directly from their amino acid sequences. It first uses PSIPRED to predict the secondary structure for each protein sequence. Then, the chaos game representation is employed to represent the predicted secondary structure as two time series, from which we generate a comprehensive set of 24 features using recurrence quantification analysis, K-string based information entropy and segment-based analysis. The resulting feature vectors are finally fed into a simple yet powerful Fisher's discriminant algorithm for the prediction of protein structural classes. We tested the proposed method on three low-homology benchmark datasets and achieved overall prediction accuracies of 82.9%, 83.1% and 81.3%, respectively. Comparisons with ten existing methods showed that our method consistently performs better for all the tested datasets and the overall accuracy improvements range from 2.3% to 27.5%. A web server that implements the proposed method is freely available at http://www1.spms.ntu.edu.sg/~chenxin/RKS_PPSC/.
Conclusion: The high prediction accuracy achieved by our proposed method is attributed to the design of a comprehensive feature set on the predicted secondary structure sequences, which is capable of characterizing the sequence order information, local interactions of the secondary structural elements, and spatial arrangements of α helices and β strands. Thus, it is a valuable method to predict protein structural classes, particularly for low-homology amino acid sequences.
Cellular and Molecular Life Sciences | 2014
Zhenling Peng; Christopher J. Oldfield; Bin Xue; Marcin J. Mizianty; A. Keith Dunker; Lukasz Kurgan; Vladimir N. Uversky
Intrinsic disorder (i.e., lack of a unique 3-D structure) is a common phenomenon, and many biologically active proteins are disordered as a whole, or contain long disordered regions. These intrinsically disordered proteins/regions constitute a significant part of all proteomes, and their functional repertoire is complementary to functions of ordered proteins. In fact, intrinsic disorder represents an important driving force for many specific functions. An illustrative example of such a disorder-centric functional class is RNA-binding proteins. In this study, we present the results of comprehensive bioinformatics analyses of the abundance and roles of intrinsic disorder in 3,411 ribosomal proteins from 32 species. We show that many ribosomal proteins are intrinsically disordered or hybrid proteins that contain ordered and disordered domains. Predicted globular domains of many ribosomal proteins contain noticeable regions of intrinsic disorder. We also show that disorder in ribosomal proteins has different characteristics compared to other proteins that interact with RNA and DNA, including overall abundance, evolutionary conservation, and involvement in protein–protein interactions. Furthermore, intrinsic disorder is not only abundant in the ribosomal proteins, but we demonstrate that it is absolutely necessary for their various functions.
Science Signaling | 2014
Jody Groenendyk; Zhenling Peng; Elzbieta Dudek; Xiao Fan; Marcin J. Mizianty; Estefanie Dufey; Hery Urra; Denisse Sepulveda; Diego Rojas-Rivera; Yunki Lim; Do Han Kim; Kayla Baretta; Sonal Srikanth; Yousang Gwack; Joohong Ahnn; Randal J. Kaufman; Sun-Kyung Lee; Claudio Hetz; Lukasz Kurgan; Marek Michalak
Depletion of Ca2+ in the endoplasmic reticulum favors activation of a stress response involving IRE1α.
Responding the Right Way to Cellular Stress
Some proteins must be folded correctly in the endoplasmic reticulum (ER) to function properly. Various stress conditions can cause the buildup of unfolded proteins in the ER, which can cause cell death. There are multiple ways in which cells can respond to deal with the buildup of unfolded proteins. Groenendyk et al. investigated how cells deal with the stress of depletion of calcium ions from the ER and identified a pathway involving a microRNA and an oxidoreductase in the ER. They found that depletion of calcium from the ER resulted in the decreased abundance of a microRNA, which enabled a target mRNA and the oxidoreductase it encoded to accumulate. The oxidoreductase then activated a specific stress response. The authors showed that this pathway could be present in mice and nematodes.
The disruption of the energy or nutrient balance triggers endoplasmic reticulum (ER) stress, a process that mobilizes various strategies, collectively called the unfolded protein response (UPR), which reestablish homeostasis of the ER and cell. Activation of the UPR stress sensor IRE1α (inositol-requiring enzyme 1α) stimulates its endoribonuclease activity, leading to the generation of the mRNA encoding the transcription factor XBP1 (X-box binding protein 1), which regulates the transcription of genes encoding factors involved in controlling the quality and folding of proteins. We found that the activity of IRE1α was regulated by the ER oxidoreductase PDIA6 (protein disulfide isomerase A6) and the microRNA miR-322 in response to disruption of ER Ca2+ homeostasis. PDIA6 interacted with IRE1α and enhanced IRE1α activity as monitored by phosphorylation of IRE1α and XBP1 mRNA splicing, but PDIA6 did not substantially affect the activity of other pathways that mediate responses to ER stress.
ER Ca2+ depletion and activation of store-operated Ca2+ entry reduced the abundance of the microRNA miR-322, which increased PDIA6 mRNA stability and, consequently, IRE1α activity during the ER stress response. In vivo experiments with mice and worms showed that the induction of ER stress correlated with decreased miR-322 abundance, increased PDIA6 mRNA abundance, or both. Together, these findings demonstrated that ER Ca2+, PDIA6, IRE1α, and miR-322 function in a dynamic feedback loop modulating the UPR under conditions of disrupted ER Ca2+ homeostasis.
Cell Death & Differentiation | 2013
Zhenling Peng; Bin Xue; Lukasz Kurgan; Vladimir N. Uversky
It is now recognized that intrinsically disordered proteins (IDPs), which do not have unique 3D structures as a whole or in noticeable parts, constitute a significant fraction of any given proteome. IDPs are characterized by an astonishing structural and functional diversity that defines their ability to be universal regulators of various cellular pathways. Programmed cell death (PCD) is one of the most intricate cellular processes where the cell uses specialized cellular machinery and intracellular programs to kill itself. This cell-suicide mechanism enables metazoans to control cell numbers and to eliminate cells that threaten the animal’s survival. PCD includes several specific modules, such as apoptosis, autophagy, and programmed necrosis (necroptosis). These modules are not only tightly regulated but also intimately interconnected and are jointly controlled via a complex set of protein–protein interactions. To understand the role of the intrinsic disorder in controlling and regulating the PCD, several large sets of PCD-related proteins across 28 species were analyzed using a wide array of modern bioinformatics tools. This study indicates that the intrinsic disorder phenomenon has to be taken into consideration to generate a complete picture of the interconnected processes, pathways, and modules that determine the essence of the PCD. We demonstrate that proteins involved in regulation and execution of PCD possess a substantial amount of intrinsic disorder. We annotate functional roles of disorder across and within apoptosis, autophagy, and necroptosis processes. Disordered regions are shown to be involved in a number of crucial functions, such as protein–protein interactions and interactions with other partners including nucleic acids and other ligands; they are enriched in post-translational modification sites and are characterized by specific evolutionary patterns.
We mapped the disorder into an integrated network of PCD pathways and into the interactomes of selected proteins that are involved in the p53-mediated apoptotic signaling pathway.
Proteins | 2014
Zhenling Peng; Marcin J. Mizianty; Lukasz Kurgan
Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super‐fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster compared to the best currently available disorder predictor. Utilizing our time‐efficient predictor, we characterized abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs, with the majority of proteomes having between 25 and 40%, where higher abundance is characteristic of proteomes that have larger proteins. Our first‐of‐its‐kind large‐scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Proteins 2014; 82:145–158.
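The LDR definition used in this abstract (30 or more consecutive disordered residues) can be checked directly on a residue-level annotation. The sketch below is not SLIDER, which predicts LDRs from sequence-derived features with logistic regression; it only applies the definition to a string of per-residue labels, assuming '1' marks a disordered residue and '0' an ordered one. The names `longest_disordered_run` and `has_ldr` are hypothetical.

```python
from itertools import groupby

LDR_MIN_LENGTH = 30  # threshold stated in the abstract


def longest_disordered_run(labels: str) -> int:
    """Length of the longest run of consecutive '1' (disordered) labels."""
    return max(
        (sum(1 for _ in run) for flag, run in groupby(labels) if flag == "1"),
        default=0,  # no disordered residues at all
    )


def has_ldr(labels: str, min_length: int = LDR_MIN_LENGTH) -> bool:
    """True if the chain contains at least one long disordered region."""
    return longest_disordered_run(labels) >= min_length


# Toy annotations (illustrative only).
with_ldr = "0" * 10 + "1" * 30 + "0" * 5
without_ldr = "1" * 29 + "0" + "1" * 29  # two runs of 29, neither long enough
```

Note that the second toy chain is mostly disordered yet has no LDR, since the definition depends on consecutive residues and not on overall disorder content.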
Nucleic Acids Research | 2015
Zhenling Peng; Lukasz Kurgan
Intrinsically disordered proteins and regions (IDPs and IDRs) lack stable 3D structure under physiological conditions in vitro, are common in eukaryotes, and facilitate interactions with RNA, DNA and proteins. Current methods for prediction of IDPs and IDRs do not provide insights into their functions, except for a handful of methods that address predictions of protein-binding regions. We report a first-of-its-kind computational method, DisoRDPbind, for high-throughput prediction of RNA, DNA and protein binding residues located in IDRs from protein sequences. DisoRDPbind is implemented using a runtime-efficient multi-layered design that utilizes information extracted from physicochemical properties of amino acids, sequence complexity, putative secondary structure and disorder, and sequence alignment. Empirical tests demonstrate that it provides accurate predictions that are competitive with other predictors of disorder-mediated protein binding regions and complementary to the methods that predict RNA- and DNA-binding residues annotated based on crystal structures. Application in Homo sapiens, Mus musculus, Caenorhabditis elegans and Drosophila melanogaster proteomes reveals that RNA- and DNA-binding proteins predicted by DisoRDPbind complement and overlap with the corresponding known binding proteins collected from several sources. Also, the number of the putative protein-binding regions predicted with DisoRDPbind correlates with the promiscuity of proteins in the corresponding protein–protein interaction networks. Webserver: http://biomine.ece.ualberta.ca/DisoRDPbind/
Pacific Symposium on Biocomputing | 2011
Zhenling Peng; Lukasz Kurgan
Intrinsic disorder in proteins plays important roles in transcriptional regulation, translation, and cellular signal transduction. The experimental annotation of the disorder lags behind the rapidly accumulating number of known protein chains, which motivates the development of computational predictors of disorder. Some of these methods address predictions of certain types/flavors of the disorder, and recent years show that consensus-based predictors provide a viable way to improve predictive performance. However, the selection of the base predictors in a given consensus is usually performed in an ad hoc manner, based on their availability and with a premise that more is better. We perform a first-of-its-kind investigation that analyzes complementarity among a dozen recent predictors to identify characteristics of (future) predictors that would lead to further consensus-based improvements in the predictive quality. The complementarity of a given set of three base predictors is expressed by the differences in their predictions when compared with each other and with their majority vote consensus. We propose a regression-based model that quantifies/predicts quality of the majority-vote consensus of a given triplet of predictors based on their individual predictive performance and their complementarity measured at the residue and the disorder segment levels. Our model shows that improved performance is associated with higher (lower) similarity between the three base predictors at the residue (segment) level and to their consensus prediction at the segment (residue) level. We also show that better consensuses utilize higher quality base methods. We use our model to predict the best-performing consensus on an independent test dataset and our empirical evaluation shows that this consensus outperforms individual methods and other consensus-based predictors based on the area under the ROC curve measure. 
Our study provides insights that could lead to the development of a new generation of the consensus-based disorder predictors.
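The majority-vote consensus of three base predictors described in this abstract can be sketched at the residue level as follows. This is a minimal illustration that assumes each predictor outputs one binary label per residue (1 = disordered). The `agreement` function is a simple stand-in for similarity between two predictions and is an assumption; the paper's actual complementarity measures and its regression model are not specified here and are not reproduced.

```python
from typing import List, Sequence


def majority_vote(p1: Sequence[int], p2: Sequence[int], p3: Sequence[int]) -> List[int]:
    """Residue-level consensus: a residue is disordered if at least 2 of 3 agree."""
    if not (len(p1) == len(p2) == len(p3)):
        raise ValueError("all predictions must cover the same residues")
    return [1 if a + b + c >= 2 else 0 for a, b, c in zip(p1, p2, p3)]


def agreement(x: Sequence[int], y: Sequence[int]) -> float:
    """Fraction of residues on which two binary predictions agree (hypothetical measure)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("predictions must be non-empty and of equal length")
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Toy predictions for a 4-residue chain (illustrative only).
pred_a = [1, 0, 1, 0]
pred_b = [1, 1, 0, 0]
pred_c = [0, 1, 1, 0]
consensus = majority_vote(pred_a, pred_b, pred_c)
```

In this toy case each base predictor disagrees with the consensus on exactly one residue, which is the kind of pairwise and predictor-to-consensus difference the abstract uses to express complementarity.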