Publication


Featured research published by Yasser EL-Manzalawy.


Journal of Molecular Recognition | 2008

Predicting linear B-cell epitopes using string kernels

Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

The identification and characterization of B‐cell epitopes play an important role in vaccine design, immunodiagnostic tests, and antibody production. Therefore, computational tools for reliably predicting linear B‐cell epitopes are highly desirable. We evaluated Support Vector Machine (SVM) classifiers trained with five different kernel methods, using fivefold cross‐validation on a homology‐reduced data set of 701 linear B‐cell epitopes, extracted from the Bcipep database, and 701 non‐epitopes, randomly extracted from SwissProt sequences. Based on the results of our computational experiments, we propose BCPred, a novel method for predicting linear B‐cell epitopes using the subsequence kernel. We show that the predictive performance of BCPred (AUC = 0.758) outperforms 11 SVM‐based classifiers developed and evaluated in our experiments as well as our implementation of AAP (AUC = 0.7), a recently proposed method for predicting linear B‐cell epitopes using amino acid pair antigenicity. Furthermore, we compared BCPred with AAP and ABCPred, a method that uses recurrent neural networks, using two data sets of unique B‐cell epitopes that had been previously used to evaluate ABCPred. Analysis of the data sets used and the results of this comparison show that conclusions about the relative performance of different B‐cell epitope prediction methods drawn on the basis of experiments using data sets of unique B‐cell epitopes are likely to yield overly optimistic estimates of performance of evaluated methods. This argues for the use of carefully homology‐reduced data sets in comparing B‐cell epitope prediction methods to avoid misleading conclusions about how different methods compare to each other. Our homology‐reduced data set and implementations of BCPred as well as the AAP method are publicly available through our web‐based server, BCPREDS, at: http://ailab.cs.iastate.edu/bcpreds/.


Immunome Research | 2010

Recent advances in B-cell epitope prediction methods

Yasser EL-Manzalawy; Vasant G. Honavar

Identification of epitopes that invoke strong responses from B-cells is one of the key steps in designing effective vaccines against pathogens. Because experimental determination of epitopes is expensive in terms of cost, time, and effort involved, there is an urgent need for computational methods for reliable identification of B-cell epitopes. Although several computational tools for predicting B-cell epitopes have become available in recent years, the predictive performance of existing tools remains far from ideal. We review recent advances in computational methods for B-cell epitope prediction, identify some gaps in the current state of the art, and outline some promising directions for improving the reliability of such methods.


Computational Systems Bioinformatics | 2008

Predicting flexible length linear B-cell epitopes.

Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

Identifying B-cell epitopes plays an important role in vaccine design, immunodiagnostic tests, and antibody production. Therefore, computational tools for reliably predicting B-cell epitopes are highly desirable. We explore two machine learning approaches for predicting flexible length linear B-cell epitopes. The first approach utilizes four sequence kernels for determining a similarity score between any arbitrary pair of variable length sequences. The second approach utilizes four different methods of mapping a variable length sequence into a fixed length feature vector. Based on our empirical comparisons, we propose FBCPred, a novel method for predicting flexible length linear B-cell epitopes using the subsequence kernel. Our results demonstrate that FBCPred significantly outperforms all other classifiers evaluated in this study. An implementation of FBCPred and the datasets used in this study are publicly available through our linear B-cell epitope prediction server, BCPREDS, at: http://ailab.cs.iastate.edu/bcpreds/.
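
To illustrate what a sequence kernel over variable length sequences looks like, here is a minimal sketch of a k-spectrum kernel. This is a simpler stand-in chosen for brevity; it is not the subsequence kernel that FBCPred uses, and the function names and the default k are assumptions made for this example only.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every contiguous k-mer in seq (empty if seq is shorter than k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x: str, y: str, k: int = 3) -> float:
    """Inner product of the k-mer count vectors of x and y."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    return float(sum(cx[kmer] * cy[kmer] for kmer in cx.keys() & cy.keys()))


def normalized_spectrum_kernel(x: str, y: str, k: int = 3) -> float:
    """Cosine-normalized kernel so that sequence length does not dominate."""
    kxx = spectrum_kernel(x, x, k)
    kyy = spectrum_kernel(y, y, k)
    if kxx == 0 or kyy == 0:
        return 0.0
    return spectrum_kernel(x, y, k) / sqrt(kxx * kyy)


# Sequences of different lengths can be compared directly.
print(spectrum_kernel("ACDEF", "CDE", k=3))
print(normalized_spectrum_kernel("ABAB", "BABA", k=2))
```

Because the kernel only needs shared k-mer counts, the two sequences never have to be aligned or padded to a common length.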


BMC Bioinformatics | 2012

Protein-RNA interface residue prediction using machine learning: an assessment of the state of the art

Rasna R. Walia; Cornelia Caragea; Benjamin A. Lewis; Fadi Towfic; Michael Terribilini; Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

Background: RNA molecules play diverse functional and structural roles in cells. They function as messengers for transferring genetic information from DNA to proteins, as the primary genetic material in many viruses, as catalysts (ribozymes) important for protein synthesis and RNA processing, and as essential and ubiquitous regulators of gene expression in living organisms. Many of these functions depend on precisely orchestrated interactions between RNA molecules and specific proteins in cells. Understanding the molecular mechanisms by which proteins recognize and bind RNA is essential for comprehending the functional implications of these interactions, but the recognition ‘code’ that mediates interactions between proteins and RNA is not yet understood. Success in deciphering this code would dramatically impact the development of new therapeutic strategies for intervening in devastating diseases such as AIDS and cancer. Because of the high cost of experimental determination of protein-RNA interfaces, there is an increasing reliance on statistical machine learning methods for training predictors of RNA-binding residues in proteins. However, because of differences in the choice of datasets, performance measures, and data representations used, it has been difficult to obtain an accurate assessment of the current state of the art in protein-RNA interface prediction.
Results: We provide a review of published approaches for predicting RNA-binding residues in proteins and a systematic comparison and critical assessment of protein-RNA interface residue predictors trained using these approaches on three carefully curated non-redundant datasets. We directly compare two widely used machine learning algorithms (Naïve Bayes (NB) and Support Vector Machine (SVM)) using three different data representations in which features are encoded using either sequence- or structure-based windows. Our results show that (i) Sequence-based classifiers that use a position-specific scoring matrix (PSSM)-based representation (PSSMSeq) outperform those that use an amino acid identity based representation (IDSeq) or a smoothed PSSM (SmoPSSMSeq); (ii) Structure-based classifiers that use smoothed PSSM representation (SmoPSSMStr) outperform those that use PSSM (PSSMStr) as well as sequence identity based representation (IDStr). PSSMSeq classifiers, when tested on an independent test set of 44 proteins, achieve performance that is comparable to that of three state-of-the-art structure-based predictors (including those that exploit geometric features) in terms of Matthews Correlation Coefficient (MCC), although the structure-based methods achieve substantially higher Specificity (albeit at the expense of Sensitivity) compared to sequence-based methods. We also find that the expected performance of the classifiers on a residue level can be markedly different from that on a protein level. Our experiments show that the classifiers trained on three different non-redundant protein-RNA interface datasets achieve comparable cross-validation performance. However, we find that the results are significantly affected by differences in the distance threshold used to define interface residues.
Conclusions: Our results demonstrate that protein-RNA interface residue predictors that use a PSSM-based encoding of sequence windows outperform classifiers that use other encodings of sequence windows. While structure-based methods that exploit geometric features can yield significant increases in the Specificity of protein-RNA interface residue predictions, such increases are offset by decreases in Sensitivity. These results underscore the importance of comparing alternative methods using rigorous statistical procedures, multiple performance measures, and datasets that are constructed based on several alternative definitions of interface residues and redundancy cutoffs as well as including evaluations on independent test sets into the comparisons.
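
As a rough illustration of a sequence-window, amino acid identity based encoding (in the spirit of IDSeq), the sketch below builds one window per residue and one-hot encodes it. The window size, the padding character, and all function names are assumptions for this example; the abstract does not specify them.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_windows(sequence: str, window: int = 5, pad: str = "-") -> List[str]:
    """Return one window per residue, centered on it and padded at the ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + window] for i in range(len(sequence))]


def one_hot(window_text: str) -> List[int]:
    """Encode each position as 20 indicator values; padding encodes as all zeros."""
    vector: List[int] = []
    for residue in window_text:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


windows = sequence_windows("MKV", window=3)
features = [one_hot(w) for w in windows]
print(windows)
```

Each feature vector describes the target residue together with its sequence neighbors, and a classifier such as Naive Bayes or an SVM would then be trained on these vectors with per-residue interface labels.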


BMC Bioinformatics | 2012

Predicting protein-protein interface residues using local surface structural similarity.

Rafael A. Jordan; Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

Background: Identification of the residues in protein-protein interaction sites has a significant impact on problems such as drug discovery. Motivated by the observation that the set of interface residues of a protein tends to be conserved even among remote structural homologs, we introduce PrISE, a family of local structural similarity-based computational methods for predicting protein-protein interface residues.
Results: We present a novel representation of the surface residues of a protein in the form of structural elements. Each structural element consists of a central residue and its surface neighbors. The PrISE family of interface prediction methods uses a representation of structural elements that captures the atomic composition and accessible surface area of the residues that make up each structural element. Each member of the PrISE family identifies, for each structural element in the query protein, a collection of similar structural elements in its repository of structural elements and weights them according to their similarity to the structural element of the query protein. PrISEL relies on the similarity between structural elements (i.e. local structural similarity). PrISEG relies on the similarity between protein surfaces (i.e. general structural similarity). PrISEC combines local structural similarity and general structural similarity to predict interface residues. These predictors label the central residue of a structural element in a query protein as an interface residue if a weighted majority of the structural elements that are similar to it are interface residues, and as a non-interface residue otherwise. The results of our experiments using three representative benchmark datasets show that PrISEC outperforms PrISEL and PrISEG, and that PrISEC is highly competitive with state-of-the-art structure-based methods for predicting protein-protein interface residues. Our comparison of PrISEC with PredUs, a recently developed method for predicting interface residues of a query protein based on the known interface residues of its (global) structural homologs, shows that performance superior or comparable to that of PredUs can be obtained using only local surface structural similarity. PrISEC is available as a Web server at http://prise.cs.iastate.edu/.
Conclusions: Local surface structural similarity based methods offer a simple, efficient, and effective approach to predict protein-protein interface residues.
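
The weighted majority rule described in the abstract can be sketched as follows. How PrISE actually computes similarity weights is not given here, so the weights are simply taken as inputs, and the function name is hypothetical.

```python
from typing import Iterable, Tuple


def weighted_majority_label(neighbors: Iterable[Tuple[float, bool]]) -> bool:
    """Label a central residue from its similar structural elements.

    Each neighbor is (similarity_weight, is_interface). The residue is called
    an interface residue only if interface neighbors carry strictly more total
    weight; ties and empty inputs fall back to non-interface.
    """
    interface_weight = 0.0
    other_weight = 0.0
    for similarity, is_interface in neighbors:
        if is_interface:
            interface_weight += similarity
        else:
            other_weight += similarity
    return interface_weight > other_weight


# One highly similar interface element outweighs two weaker non-interface ones.
print(weighted_majority_label([(0.9, True), (0.4, False), (0.3, False)]))
```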


PLOS ONE | 2008

On Evaluating MHC-II Binding Peptide Prediction Methods

Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

Choice of one method over another for MHC-II binding peptide prediction is typically based on published reports of their estimated performance on standard benchmark datasets. We show that several standard benchmark datasets of unique peptides used in such studies contain a substantial number of peptides that share a high degree of sequence identity with one or more other peptide sequences in the same dataset. Thus, in a standard cross-validation setup, the test set and the training set are likely to contain sequences that share a high degree of sequence identity with each other, leading to overly optimistic estimates of performance. Hence, to more rigorously assess the relative performance of different prediction methods, we explore the use of similarity-reduced datasets. We introduce three similarity-reduced MHC-II benchmark datasets derived from MHCPEP, MHCBN, and IEDB databases. The results of our comparison of the performance of three MHC-II binding peptide prediction methods estimated using datasets of unique peptides with that obtained using their similarity-reduced counterparts show that the former can be rather optimistic relative to the performance of the same methods on similarity-reduced counterparts of the same datasets. Furthermore, our results demonstrate that conclusions regarding the superiority of one method over another drawn on the basis of performance estimates obtained using commonly used datasets of unique peptides are often contradicted by the observed performance of the methods on the similarity-reduced versions of the same datasets. These results underscore the importance of using similarity-reduced datasets in rigorously comparing the performance of alternative MHC-II peptide prediction methods.
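
A minimal sketch of similarity reduction is shown below: a peptide is kept only if it is not too similar to any peptide already kept. The similarity measure (Python's `difflib.SequenceMatcher` ratio, used here as a crude proxy for sequence identity), the 0.8 threshold, and the greedy order are all assumptions for illustration; the abstract does not describe the actual procedure or cutoff.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def similarity(a: str, b: str) -> float:
    """Proxy for sequence identity in [0, 1]; not an alignment-based score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def similarity_reduce(peptides: Iterable[str], threshold: float = 0.8) -> List[str]:
    """Greedily keep peptides whose similarity to every kept peptide is below threshold."""
    kept: List[str] = []
    for peptide in peptides:
        if all(similarity(peptide, other) < threshold for other in kept):
            kept.append(peptide)
    return kept


unique_peptides = ["AAAAAAAAA", "AAAAAAAAC", "WWWWWWWWW"]
print(similarity_reduce(unique_peptides))
```

In this toy input the first two peptides are unique strings but differ at a single position, which is exactly the situation in which a cross-validation split can place near-duplicates in both the training and the test fold.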


Proteins | 2014

DockRank: Ranking docked conformations using partner-specific sequence homology-based protein interface prediction

Li C. Xue; Rafael A. Jordan; Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

Selecting near‐native conformations from the immense number of conformations generated by docking programs remains a major challenge in molecular docking. We introduce DockRank, a novel approach to scoring docked conformations based on the degree to which the interface residues of the docked conformation match a set of predicted interface residues. DockRank uses interface residues predicted by partner‐specific sequence homology‐based protein–protein interface predictor (PS‐HomPPI), which predicts the interface residues of a query protein with a specific interaction partner. We compared the performance of DockRank with several state‐of‐the‐art docking scoring functions using Success Rate (the percentage of cases that have at least one near‐native conformation among the top m conformations) and Hit Rate (the percentage of near‐native conformations that are included among the top m conformations). In cases where it is possible to obtain partner‐specific (PS) interface predictions from PS‐HomPPI, DockRank consistently outperforms both (i) ZRank and IRAD, two state‐of‐the‐art energy‐based scoring functions (improving Success Rate by up to 4‐fold); and (ii) Variants of DockRank that use predicted interface residues obtained from several protein interface predictors that do not take into account the binding partner in making interface predictions (improving success rate by up to 39‐fold). The latter result underscores the importance of using partner‐specific interface residues in scoring docked conformations. We show that DockRank, when used to re‐rank the conformations returned by ClusPro, improves upon the original ClusPro rankings in terms of both Success Rate and Hit Rate. DockRank is available as a server at http://einstein.cs.iastate.edu/DockRank/. Proteins 2014; 82:250–267.
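
The two evaluation measures defined in the abstract can be sketched as follows. Each docking case is represented as a ranked list of booleans (True for a near-native conformation); this data layout, the function names, and the choice to compute Hit Rate per case are assumptions made for this example.

```python
from typing import Sequence


def success_rate(cases: Sequence[Sequence[bool]], m: int) -> float:
    """Percentage of cases with at least one near-native conformation in the top m."""
    if not cases:
        return 0.0
    successes = sum(1 for ranked in cases if any(ranked[:m]))
    return 100.0 * successes / len(cases)


def hit_rate(ranked: Sequence[bool], m: int) -> float:
    """Percentage of a case's near-native conformations that appear in the top m."""
    total_near_native = sum(ranked)
    if total_near_native == 0:
        return 0.0  # assumption: cases without near-native models score zero
    return 100.0 * sum(ranked[:m]) / total_near_native


cases = [
    [False, True, False, True],    # two near-native models, one in the top 2
    [False, False, True, False],   # one near-native model, ranked third
    [False, False, False, False],  # no near-native model at all
]
print(success_rate(cases, m=2), hit_rate(cases[0], m=2))
```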


Bioinformatics and Biomedicine | 2008

Predicting Protective Linear B-Cell Epitopes Using Evolutionary Information

Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

Mapping B-cell epitopes plays an important role in vaccine design, immunodiagnostic tests, and antibody production. Because the experimental determination of B-cell epitopes is time-consuming and expensive, there is an urgent need for computational methods for reliable identification of putative B-cell epitopes from antigenic sequences. In this study, we explore the utility of evolutionary profiles derived from antigenic sequences in improving the performance of machine learning methods for protective linear B-cell epitope prediction. Specifically, we compare propensity scale based methods with a Naive Bayes classifier using three different representations of the classifier input: amino acid identities, position specific scoring matrix (PSSM) profiles, and dipeptide composition. We find that in predicting protective linear B-cell epitopes, a Naive Bayes classifier trained using PSSM profiles significantly outperforms the propensity scale based methods as well as the Naive Bayes classifiers trained using the amino acid identity or dipeptide composition representations of input data.
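
Of the three input representations compared, dipeptide composition is the simplest to sketch: a sequence is mapped to a 400-dimensional vector of dipeptide frequencies. The normalization by the number of valid dipeptides and the handling of non-standard residues (skipped) are assumptions for this example.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
INDEX = {dipeptide: i for i, dipeptide in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence: str) -> List[float]:
    """Return the frequency of each of the 400 dipeptides in the sequence."""
    vector = [0.0] * len(DIPEPTIDES)
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    valid = [pair for pair in pairs if pair in INDEX]  # skip non-standard residues
    for pair in valid:
        vector[INDEX[pair]] += 1.0
    if valid:
        vector = [count / len(valid) for count in vector]
    return vector


features = dipeptide_composition("AAC")
print(features[INDEX["AA"]], features[INDEX["AC"]])
```

The vector has the same length for every input, so epitopes of any length can be passed to a standard classifier.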


Methods in Molecular Biology | 2017

In Silico Prediction of Linear B-Cell Epitopes on Proteins.

Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

Antibody-protein interactions play a critical role in the humoral immune response. B-cells secrete antibodies, which bind antigens (e.g., cell surface proteins of pathogens). The specific parts of antigens that are recognized by antibodies are called B-cell epitopes. These epitopes can be linear, corresponding to a contiguous amino acid sequence fragment of an antigen, or conformational, in which residues critical for recognition may not be contiguous in the primary sequence, but are in close proximity within the folded protein 3D structure. Identification of B-cell epitopes in target antigens is one of the key steps in epitope-driven subunit vaccine design, immunodiagnostic tests, and antibody production. In silico bioinformatics techniques offer a promising and cost-effective approach for identifying potential B-cell epitopes in a target vaccine candidate. In this chapter, we show how to utilize online B-cell epitope prediction tools to identify linear B-cell epitopes from the primary amino acid sequence of proteins.


Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine | 2011

Ranking docked models of protein-protein complexes using predicted partner-specific protein-protein interfaces: a preliminary study

Li C. Xue; Rafael A. Jordan; Yasser EL-Manzalawy; Drena Dobbs; Vasant G. Honavar

Computational protein-protein docking is a valuable tool for determining the conformation of complexes formed by interacting proteins. Selecting near-native conformations from the large number of possible models generated by docking software presents a significant challenge in practice. We introduce a novel method for ranking docked conformations based on the degree of overlap between the interface residues of a docked conformation formed by a pair of proteins with the set of predicted interface residues between them. Our approach relies on a method, called PS-HomPPI, for reliably predicting protein-protein interface residues by taking into account information derived from both interacting proteins. PS-HomPPI infers the residues of a query protein that are likely to interact with a partner protein based on known interface residues of the homo-interologs of the query-partner protein pair, i.e., pairs of interacting proteins that are homologous to the query protein and partner protein. Our results on Docking Benchmark 3.0 show that the quality of the ranking of docked conformations using our method is consistently superior to that produced using ClusPro cluster-size-based and energy-based criteria for 61 out of the 64 docking complexes for which PS-HomPPI produces interface predictions. An implementation of our method for ranking docked models is freely available at: http://einstein.cs.iastate.edu/DockRank/.
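
The idea of ranking docked conformations by interface overlap can be sketched as below. The abstract does not state which overlap measure is used, so the Jaccard index here is an assumption, as are the residue identifiers and function names.

```python
from typing import Dict, List, Set


def interface_overlap(docked: Set[str], predicted: Set[str]) -> float:
    """Jaccard overlap between a model's interface residues and the predicted ones."""
    union = docked | predicted
    if not union:
        return 0.0
    return len(docked & predicted) / len(union)


def rank_conformations(models: Dict[str, Set[str]], predicted: Set[str]) -> List[str]:
    """Order model names from highest to lowest overlap with the predicted interface."""
    return sorted(
        models,
        key=lambda name: interface_overlap(models[name], predicted),
        reverse=True,
    )


predicted_interface = {"A10", "A11", "B5"}  # hypothetical predicted interface
docked_models = {
    "model1": {"A10", "B7"},
    "model2": {"A10", "A11", "B5"},
    "model3": {"C1"},
}
print(rank_conformations(docked_models, predicted_interface))
```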

Collaboration


Top Co-Authors

Vasant G. Honavar

Pennsylvania State University

Li C. Xue

Iowa State University

Mostafa M. Abbas

Qatar Computing Research Institute

Dokyoon Kim

Geisinger Health System
