Slobodan Vucetic
Temple University
Publications
Featured research published by Slobodan Vucetic.
BMC Bioinformatics | 2006
Kang Peng; Predrag Radivojac; Slobodan Vucetic; A. Keith Dunker; Zoran Obradovic
Background: Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research, as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure Prediction (CASP6). Since the initial work by Romero et al. (Identifying disordered regions in proteins from amino acid sequences, IEEE Int. Conf. Neural Netw., 1997), our group has developed several predictors optimized for long disordered regions (>30 residues) with prediction accuracy exceeding 85%. However, these predictors are less successful on short disordered regions (≤30 residues). A probable cause is the length dependence of the amino acid compositions and sequence properties of disordered regions. Results: We proposed two new predictor models, VSL2-M1 and VSL2-M2, to address this length-dependency problem in prediction of intrinsic protein disorder. These two predictors are similar to the original VSL1 predictor used in the CASP6 experiment. In both models, two specialized predictors were first built and optimized for short (≤30 residues) and long (>30 residues) disordered regions, respectively. A meta-predictor was then trained to integrate the specialized predictors into the final predictor model. As the 10-fold cross-validation results showed, the VSL2 predictors achieved well-balanced prediction accuracies of 81% on both short and long disordered regions. Comparisons over the VSL2 training dataset via 10-fold cross-validation and a blind-test set of unrelated recent PDB chains indicated that VSL2 predictors were significantly more accurate than several existing predictors of intrinsic protein disorder. Conclusion: The VSL2 predictors are applicable to disordered regions of any length and can accurately identify the short disordered regions that are often misclassified by our previous disorder predictors. The success of the VSL2 predictors further confirmed the previously observed differences in amino acid compositions and sequence properties between short and long disordered regions, and justified our approach of modelling short and long disordered regions separately. The VSL2 predictors are freely accessible for non-commercial use at http://www.ist.temple.edu/disprot/predictorVSL2.php
Proteins | 2005
Zoran Obradovic; Kang Peng; Slobodan Vucetic; Predrag Radivojac; A. Keith Dunker
During the past few years we have investigated methods to improve predictors of intrinsically disordered regions longer than 30 consecutive residues. Experimental evidence, however, showed that these predictors were less successful on short disordered regions, as observed two years ago during the fifth Critical Assessment of Techniques for Protein Structure Prediction (CASP5). To address this shortcoming, we developed a two‐level model called VSL1 (CASP6 id: 193‐1). At the first level, VSL1 consists of two specialized predictors, one of which was optimized for long disordered regions (>30 residues) and the other for short disordered regions (≤30 residues). At the second level, a meta‐predictor was built to assign weights for combining the two first‐level predictors. As the results of the CASP6 experiment showed, this new predictor has achieved the highest accuracy yet and significantly improved performance on short disordered regions, while maintaining high performance on long disordered regions. Proteins 2005;Suppl 7:176–182.
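The two-level idea described in this abstract can be illustrated with a minimal sketch. It assumes that each first-level predictor emits a per-residue disorder score in [0, 1] and that the meta-predictor emits a per-residue weight for the long-region predictor. The convex-combination form, the function names, and the 0.5 decision threshold are illustrative assumptions, not details taken from the paper.

```python
def combine_predictions(short_scores, long_scores, weights):
    """Blend per-residue scores from two specialized predictors.

    weights[i] is the (hypothetical) meta-predictor weight given to the
    long-region predictor at residue i; the short-region predictor
    receives 1 - weights[i]. This convex blend is an assumption.
    """
    if not (len(short_scores) == len(long_scores) == len(weights)):
        raise ValueError("all inputs must have one value per residue")
    return [
        w * long_score + (1.0 - w) * short_score
        for short_score, long_score, w in zip(short_scores, long_scores, weights)
    ]


def call_disorder(scores, threshold=0.5):
    """Label a residue as disordered when its score reaches the threshold."""
    return [score >= threshold for score in scores]


# Toy values for three residues (not real predictor output).
blended = combine_predictions([0.2, 0.8, 0.9], [0.6, 0.4, 0.1], [0.5, 1.0, 0.0])
labels = call_disorder(blended)
```

With a weight of 1.0 the blend reduces to the long-region predictor alone, and with 0.0 to the short-region predictor alone, which is how a learned weight could switch between the two specialists along a sequence.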
Proteins | 2003
Zoran Obradovic; Kang Peng; Slobodan Vucetic; Predrag Radivojac; Celeste J. Brown; A. Keith Dunker
Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61‐residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence. Proteins 2003;53:566–572.
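The effect of averaging output values over 61-residue windows, mentioned in this abstract, can be seen in a small sketch. It assumes a centered moving average whose window is truncated at the sequence ends; the edge handling and the function name are assumptions, since the abstract does not specify them.

```python
def smooth_scores(scores, window=61):
    """Centered moving average of per-residue scores.

    Near the sequence ends the window is truncated to the residues that
    exist (an assumed edge policy, not taken from the abstract).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        start = max(0, i - half)
        end = min(len(scores), i + half + 1)
        segment = scores[start:end]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# A 10-residue run of high scores inside an otherwise ordered 200-residue chain.
raw = [0.0] * 200
for position in range(95, 105):
    raw[position] = 1.0
smoothed = smooth_scores(raw)
```

In this toy input the smoothed score never exceeds 10/61, so a short disordered segment falls well below a 0.5 cutoff after smoothing. This is consistent with the abstract's remark that window averaging likely widened the gap between long and short regions.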
Proteins | 2003
Slobodan Vucetic; Celeste J. Brown; A. Keith Dunker; Zoran Obradovic
Intrinsically disordered proteins are characterized by long regions lacking 3‐D structure in their native states, yet they have been so far associated with 28 distinguishable functions. Previous studies showed that protein predictors trained on disorder from one type of protein often achieve poor accuracy on disorder of proteins of a different type, thus indicating significant differences in sequence properties among disordered proteins. Important biological problems are identifying different types, or flavors, of disorder and examining their relationships with protein function. Innovative use of computational methods is needed in addressing these problems due to relative scarcity of experimental data and background knowledge related to protein disorder. We developed an algorithm that partitions protein disorder into flavors based on competition among increasing numbers of predictors, with prediction accuracy determining both the number of distinct predictors and the partitioning of the individual proteins. Using 145 variously characterized proteins with long (>30 amino acids) disordered regions, 3 flavors, called V, C, and S, were identified by this approach, with the V subset containing 52 segments and 7743 residues, C containing 39 segments and 3402 residues, and S containing 54 segments and 5752 residues. The V, C, and S flavors were distinguishable by amino acid compositions, sequence locations, and biological function. For the sequences in SwissProt and 28 genomes, their protein functions exhibit correlations with the commonness and usage of different disorder flavors, suggesting different flavor‐function sets across these protein groups. Overall, the results herein support the flavor‐function approach as a useful complement to structural genomics as a means for automatically assigning possible functions to sequences. Proteins 2003;52:573–584.
Journal of Bioinformatics and Computational Biology | 2005
Kang Peng; Slobodan Vucetic; Predrag Radivojac; Celeste J. Brown; A. Keith Dunker; Zoran Obradovic
Protein existing as an ensemble of structures, called intrinsically disordered, has been shown to be responsible for a wide variety of biological functions and to be common in nature. Here we focus on improving sequence-based predictions of long (>30 amino acid residues) regions lacking specific 3-D structure by means of four new neural-network-based Predictors Of Natural Disordered Regions (PONDRs): VL3, VL3H, VL3P, and VL3E. PONDR VL3 used several features from a previously introduced PONDR VL2, but benefitted from optimized predictor models and a slightly larger (152 vs. 145) set of disordered proteins that were cleaned of mislabeling errors found in the smaller set. PONDR VL3H utilized homologues of the disordered proteins in the training stage, while PONDR VL3P used attributes derived from sequence profiles obtained by PSI-BLAST searches. The measure of accuracy was the average between accuracies on disordered and ordered protein regions. By this measure, the 30-fold cross-validation accuracies of VL3, VL3H, and VL3P were, respectively, 83.6 +/- 1.4%, 85.3 +/- 1.4%, and 85.2 +/- 1.5%. By combining VL3H and VL3P, the resulting PONDR VL3E achieved an accuracy of 86.7 +/- 1.4%. This is a significant improvement over our previous PONDRs VLXT (71.6 +/- 1.3%) and VL2 (80.9 +/- 1.4%). The new disorder predictors with the corresponding datasets are freely accessible through the web server at http://www.ist.temple.edu/disprot.
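The accuracy measure used in this abstract, the average of the accuracies on disordered and ordered regions, can be written out as a short sketch. The label encoding (1 for disordered, 0 for ordered) and the function name are assumptions made for illustration.

```python
def balanced_accuracy(true_labels, predicted_labels):
    """Mean of per-class accuracy on disordered (1) and ordered (0) residues."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have equal length")
    per_class = []
    for cls in (1, 0):
        # Indices of residues that truly belong to this class.
        indices = [i for i, label in enumerate(true_labels) if label == cls]
        if not indices:
            raise ValueError("both classes must be present in true_labels")
        correct = sum(1 for i in indices if predicted_labels[i] == cls)
        per_class.append(correct / len(indices))
    return sum(per_class) / len(per_class)


# Toy example: 2 disordered and 4 ordered residues.
truth = [1, 1, 0, 0, 0, 0]
guess = [1, 0, 0, 0, 0, 1]
score = balanced_accuracy(truth, guess)  # (1/2 + 3/4) / 2 = 0.625
```

Because each class contributes equally, a predictor that labels every residue as ordered scores only 0.5 under this measure, even when ordered residues heavily outnumber disordered ones.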
Protein Science | 2004
Predrag Radivojac; Zoran Obradovic; David K. Smith; Guang Zhu; Slobodan Vucetic; Celeste J. Brown; J. David Lawson; A. Keith Dunker
Comparisons were made among four categories of protein flexibility: (1) low‐B‐factor ordered regions, (2) high‐B‐factor ordered regions, (3) short disordered regions, and (4) long disordered regions. Amino acid compositions of the four categories were found to be significantly different from each other, with high‐B‐factor ordered and short disordered regions being the most similar pair. The high‐B‐factor (flexible) ordered regions are characterized by a higher average flexibility index, higher average hydrophilicity, higher average absolute net charge, and higher total charge than disordered regions. The low‐B‐factor regions are significantly enriched in hydrophobic residues and depleted in the total number of charged residues compared to the other three categories. We examined the predictability of the high‐B‐factor regions and developed a predictor that discriminates between regions of low and high B‐factors. This predictor achieved an accuracy of 70% and a correlation of 0.43 with experimental data, outperforming the 64% accuracy and 0.32 correlation of predictors based solely on flexibility indices. To further clarify the differences between short disordered regions and ordered regions, a predictor of short disordered regions was developed. Its relatively high accuracy of 81% indicates considerable differences between ordered and disordered regions. The distinctive amino acid biases of high‐B‐factor ordered regions, short disordered regions, and long disordered regions indicate that the sequence determinants for these flexibility categories differ from one another, whereas the significantly‐greater‐than‐chance predictability of these categories from sequence suggests that flexible ordered regions, short disorder, and long disorder are, to a significant degree, encoded at the primary structure level.
BMC Genomics | 2009
Vladimir N. Uversky; Christopher J. Oldfield; Uros Midic; Hongbo Xie; Bin Xue; Slobodan Vucetic; Lilia M. Iakoucheva; Zoran Obradovic; A. Keith Dunker
Background: Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) lack stable tertiary and/or secondary structure yet fulfill key biological functions. The recent recognition of IDPs and IDRs is leading to an entire field aimed at their systematic structural characterization and at determination of their mechanisms of action. Bioinformatics studies showed that IDPs and IDRs are highly abundant in different proteomes and carry out mostly regulatory functions related to molecular recognition and signal transduction. These activities complement the functions of structured proteins. IDPs and IDRs were shown to participate in both one-to-many and many-to-one signaling. Alternative splicing and posttranslational modifications are frequently used to tune the IDP functionality. Several individual IDPs were shown to be associated with human diseases, such as cancer, cardiovascular disease, amyloidoses, diabetes, neurodegenerative diseases, and others. This raises questions regarding the involvement of IDPs and IDRs in various diseases. Results: IDPs and IDRs were shown to be highly abundant in proteins associated with various human maladies. As the number of IDPs related to various diseases was found to be very large, the concepts of the disease-related unfoldome and unfoldomics were introduced. Novel bioinformatics tools were proposed to populate and characterize the disease-associated unfoldome. Structural characterization of the members of the disease-related unfoldome requires specialized experimental approaches. IDPs possess a number of unique structural and functional features that determine their broad involvement in the pathogenesis of various diseases. Conclusion: Proteins associated with various human diseases are enriched in intrinsic disorder. These disease-associated IDPs and IDRs are real, abundant, diversified, vital, and dynamic. These proteins and regions comprise the disease-related unfoldome, which covers a significant part of the human proteome. Profound association between intrinsic disorder and various human diseases is determined by a set of unique structural and functional characteristics of IDPs and IDRs. Unfoldomics of human diseases utilizes unrivaled bioinformatics and experimental techniques, paves the road for better understanding of human diseases, their pathogenesis and molecular mechanisms, and helps develop new strategies for the analysis of disease-related proteins.
Proteins | 2006
Predrag Radivojac; Slobodan Vucetic; Timothy R. O'Connor; Vladimir N. Uversky; Zoran Obradovic; A. Keith Dunker
Calmodulin (CaM) signaling involves important, widespread eukaryotic protein–protein interactions. The solved structures of CaM associated with several of its binding targets, the distinctive binding mechanism of CaM, and the significant trypsin sensitivity of the binding targets combine to indicate that the process of association likely involves coupled binding and folding for both CaM and its binding targets. Here, we use bioinformatics approaches to test the hypothesis that CaM‐binding targets are intrinsically disordered. We developed a predictor of CaM‐binding regions and estimated its performance. Per‐residue accuracy of this predictor reached 81%, which, in combination with a high recall/precision balance at the binding region level, suggests high predictability of CaM‐binding partners. An analysis of putative CaM‐binding proteins in yeast and human strongly indicates that their molecular functions are related to those of intrinsically disordered proteins. These findings add to the growing list of examples in which intrinsically disordered protein regions are indicated to provide the basis for cell signaling and regulation. Proteins 2006.
IEEE Transactions on Power Systems | 2001
Slobodan Vucetic; Kevin Tomsovic; Zoran Obradovic
This paper reports on characterizing recent price behavior in the California electricity market. Market participants (that is, producers, consumers, and traders) are highly motivated by the potential for profits to develop strategies to explore, and exploit, the limits of system operation. These strategies should be reflected in the market as different price-to-load relationships. We show that a number of regimes, i.e., characteristic behaviors, exist in the price time series and provide a brief analysis of each regime. Knowledge of the number of regimes, their characteristics, and switching dynamics allows insight into the market and power system performance.
Knowledge and Information Systems | 2005
Slobodan Vucetic; Zoran Obradovic
The task of collaborative filtering is to predict the preferences of an active user for unseen items given preferences of other users. These preferences are typically expressed as numerical ratings. In this paper, we propose a novel regression-based approach that first learns a number of experts describing relationships in ratings between pairs of items. Based on ratings provided by an active user for some of the items, the experts are combined by using statistical methods to predict the user’s preferences for the remaining items. The approach was designed to efficiently address the problem of data sparsity and prediction latency that characterise collaborative filtering. Extensive experiments on Eachmovie and Jester benchmark collaborative filtering data show that the proposed regression-based approach achieves improved accuracy and is orders of magnitude faster than the popular neighbour-based alternative. The difference in accuracy was more evident when the number of ratings provided by an active user was small, as is common for real-life recommendation systems. Additional benefits were observed in predicting items with large rating variability. To provide a more detailed characterisation of the proposed algorithm, additional experiments were performed on synthetic data with second-order statistics similar to that of the Eachmovie data. Strong experimental evidence was obtained that the proposed approach can be applied to data over a large range of sparsity scenarios and is superior to non-personalised predictors even when ratings data are very sparse.
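The general idea of item-pair experts described in this abstract can be sketched as follows. The sketch assumes that each expert is a one-variable least-squares line predicting one item's rating from another's, and that the experts are combined by a plain unweighted average. The abstract says only that the experts are combined "using statistical methods", so the averaging rule, the `min_overlap` parameter, and all function names are illustrative assumptions.

```python
def fit_expert(pairs):
    """Least-squares line y = slope * x + intercept from (x, y) rating pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var_x == 0:
        # No spread in the source ratings: fall back to the target mean.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def train_experts(ratings, min_overlap=2):
    """ratings: {user: {item: rating}} -> {(source, target): (slope, intercept)}."""
    items = sorted({item for user in ratings.values() for item in user})
    experts = {}
    for source in items:
        for target in items:
            if source == target:
                continue
            # Use only the users who rated both items of the pair.
            pairs = [(user[source], user[target])
                     for user in ratings.values()
                     if source in user and target in user]
            if len(pairs) >= min_overlap:
                experts[(source, target)] = fit_expert(pairs)
    return experts


def predict(experts, user_ratings, target):
    """Average the estimates of all experts applicable to the active user."""
    estimates = []
    for source, rating in user_ratings.items():
        if (source, target) in experts:
            slope, intercept = experts[(source, target)]
            estimates.append(slope * rating + intercept)
    return sum(estimates) / len(estimates) if estimates else None


# Toy data in which item "B" is always rated one point above item "A".
toy = {"u1": {"A": 1, "B": 2}, "u2": {"A": 3, "B": 4}, "u3": {"A": 5, "B": 6}}
experts = train_experts(toy)
estimate = predict(experts, {"A": 2}, "B")
```

Training the experts offline leaves only a few multiplications and additions per prediction, which is one plausible reading of why such an approach would have low prediction latency.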