Yizhou Li
Sichuan University
Publication
Featured research published by Yizhou Li.
Journal of Theoretical Biology | 2010
Lezheng Yu; Yanzhi Guo; Yizhou Li; Gongbing Li; Menglong Li; Jiesi Luo; Wenjia Xiong; Wenli Qin
Protein secretion plays an important role in bacterial lifestyles. Secreted proteins are crucial for bacterial pathogenesis because they let bacteria interact with their environments, in particular by delivering pathogenic and symbiotic bacteria into their eukaryotic hosts. Identification of bacterial secreted proteins is therefore an important step in the study of various diseases and the corresponding drugs. In this paper, several new features are fused into Chou's pseudo-amino acid composition (PseAAC), and two support vector machine (SVM)-based ternary classifiers are developed to predict secreted proteins of Gram-negative and Gram-positive bacteria. For the two types of bacteria, accuracies of 94.03% and 94.36%, respectively, are obtained in distinguishing classically secreted, non-classically secreted and non-secreted proteins. To compare the practical ability of our method in identifying bacterial secreted proteins with that of six published methods, proteins in Escherichia coli and Bacillus subtilis are collected to construct test sets for Gram-negative and Gram-positive bacteria; the prediction results of our method are comparable to those of existing methods. When applied to two public independent data sets for predicting non-classically secreted proteins (NCSPs), it also yields satisfactory results for Gram-negative bacterial proteins. The prediction server SecretP can be accessed at http://cic.scu.edu.cn/bioinformatics/secretPV2/index.htm.
BMC Bioinformatics | 2011
Yizhou Li; Zhining Wen; Jiamin Xiao; Hui Yin; Lezheng Yu; Li Yang; Menglong Li
Background: The rapid accumulation of data on non-synonymous single nucleotide polymorphisms (nsSNPs, also called SAPs) should allow us to further our understanding of the underlying disease-associated mechanisms. Here, we use complex networks to study the role of an amino acid in both local and global structures and determine the extent to which disease-associated and polymorphic SAPs differ in terms of their interactions with other residues.
Results: We found that SAPs can be well characterized by network topological features. Mutations are probably disease-associated when they occur at a site with a high centrality value and/or high degree value in a protein structure network. We also discovered that studying the neighboring residues around a mutation site can help to determine whether the mutation is disease-related. We compiled a dataset from the Swiss-Prot variant pages and constructed a model to predict disease-associated SAPs based on the random forest algorithm. The total accuracy and MCC were 83.0% and 0.64, respectively, as determined by 5-fold cross-validation. On an independent dataset, our model achieved a total accuracy of 80.8% and an MCC of 0.59.
Conclusions: The satisfactory performance suggests that network topological features can be used as quantitative measures of the importance of a site in a protein, and that this approach can complement existing methods for prediction of disease-associated SAPs. Moreover, the use of this method in SAP studies would help to determine the underlying linkage between SAPs and diseases through extensive investigation of mutual interactions between residues.
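The degree and centrality measures mentioned in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that residues are linked when their representative atoms lie within a distance cutoff and that centrality means closeness centrality; the abstract does not state the network construction or the centrality definition used, and the function names, the 7.0 angstrom cutoff and the toy coordinates are all hypothetical.

```python
from collections import deque
from math import dist


def contact_network(coords, cutoff=7.0):
    """Link residues whose representative atoms lie within `cutoff` angstroms.

    The cutoff value is an assumption, not taken from the paper.
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree(adj):
    """Number of direct contacts of each residue."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def closeness(adj):
    """Closeness centrality: reachable nodes divided by summed path lengths."""
    scores = {}
    for source in adj:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        seen = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen[nbr] = seen[node] + 1
                    queue.append(nbr)
        total = sum(seen.values())
        scores[source] = (len(seen) - 1) / total if total else 0.0
    return scores


# Hypothetical toy structure: five residues on a line, 5 angstroms apart.
toy_coords = [(5.0 * k, 0.0, 0.0) for k in range(5)]
toy_net = contact_network(toy_coords)
toy_degree = degree(toy_net)
toy_closeness = closeness(toy_net)
```

In this toy chain the middle residue has the highest closeness, which is the kind of site the abstract associates with disease-related mutations.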
BMC Bioinformatics | 2011
Jiamin Xiao; Xiaojing Tang; Yizhou Li; Zheng Fang; Daichuan Ma; Yangzhige He; Menglong Li
Background: MicroRNAs (miRNAs) play a key role in regulating various biological processes, such as participating in the post-transcriptional pathway and affecting the stability and/or the translation of mRNA. Current methods have extracted feature information at different levels, among which the characteristic stem-loop structure makes the greatest contribution to the prediction of putative miRNA precursors (pre-miRNAs). We find that none of these features alone is capable of identifying new pre-miRNAs accurately.
Results: In the present work, a pre-miRNA stem-loop secondary structure is translated into a network, which provides a novel perspective for its structural analysis. Network parameters are used to construct a prediction model, achieving an area under the receiver operating characteristic curve (AUC) of 0.956. Moreover, repeating the same method on two independent datasets yields accuracies of 0.976 and 0.913, respectively.
Conclusions: Network parameters effectively characterize pre-miRNA secondary structure, which improves our prediction model in both prediction ability and computational efficiency. Additionally, as a complement to the feature extraction methods of previous studies, these multifaceted features can reflect natural properties of miRNAs and be used for comprehensive and systematic analysis of miRNAs.
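One way to picture the translation of a stem-loop into a network is sketched below. It assumes the secondary structure is given in dot-bracket notation, that each nucleotide is a node, and that edges join backbone neighbours and paired bases; the abstract does not specify the construction, so these choices and the names `stem_loop_to_network` and `average_degree` are hypothetical.

```python
def stem_loop_to_network(dot_bracket):
    """Return an adjacency map with backbone edges plus one edge per base pair."""
    n = len(dot_bracket)
    adj = {i: set() for i in range(n)}
    # Backbone: consecutive nucleotides are linked.
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    # Base pairs: match each ')' with the most recent unmatched '('.
    stack = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            j = stack.pop()
            adj[i].add(j)
            adj[j].add(i)
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return adj


def average_degree(adj):
    """A simple network parameter: mean number of edges per nucleotide."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


# Hypothetical hairpin: a 4-pair stem closing a 3-nucleotide loop.
toy_structure = "((((...))))"
toy_net = stem_loop_to_network(toy_structure)
toy_avg_degree = average_degree(toy_net)
```

Parameters computed on such a graph (average degree here, but others could be used) would then serve as inputs to a classifier.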
Peptides | 2010
Lezheng Yu; Yanzhi Guo; Zheng Zhang; Yizhou Li; Menglong Li; Gongbing Li; Wenjia Xiong; Yuhong Zeng
In contrast to the large number of classically secreted proteins (CSPs) and non-secreted proteins (NSPs), only a few proteins have been experimentally shown to enter non-classical secretory pathways. It is therefore difficult to identify non-classically secreted proteins (NCSPs), and no methods are available for distinguishing the three types of proteins simultaneously. To address this problem, data mining was carried out first, and mammalian proteins exported via ER-Golgi-independent pathways were collected through extensive literature searches. In this paper, a support vector machine (SVM)-based ternary classifier named SecretP is proposed to predict mammalian secreted proteins using pseudo-amino acid composition (PseAA) and five additional features. When distinguishing the three types of proteins, SecretP yielded an accuracy of 88.79%. On an independent test set of 92 human proteins, 76 are correctly predicted as NCSPs. On another public independent data set, the prediction result of SecretP is comparable to those of other existing computational methods. Therefore, SecretP can be a useful supplementary tool for future secretome studies. The web server SecretP and all supplementary tables listed in this paper are freely available at http://cic.scu.edu.cn/bioinformatics/secretp/index.htm.
Amino Acids | 2010
Li Yang; Yizhou Li; Rongquan Xiao; Yuhong Zeng; Jiamin Xiao; Fuyuan Tan; Menglong Li
Membrane transporters are critical in living cells. Discriminating the types of membrane proteins by function is therefore of great importance, both for genome annotation and for helping experimental researchers gain insight into membrane proteins’ function. Many computational methods exist to facilitate the identification of the functional types of membrane proteins. However, these methods do not integrate the local sequence environment into the constructed model. In this study, we describe a new strategy to predict the functional types of membrane proteins using a model based on auto covariance and the position-specific scoring matrix. The novelty of the approach is that it considers the distribution of functionally conserved sites at different positions in protein sequences. The model thereby takes into account the long-range correlation between such sites during sequence evolution. A fivefold cross-validation test shows that this method greatly improves prediction accuracy, reaching 87.51%. The result indicates that the approach might be an effective tool for predicting the functional types of membrane proteins using only primary sequences. The code and dataset used in this article are freely available at http://cic.scu.edu.cn/bioinformatics/predict_membrane.zip.
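The auto covariance transform of a position-specific scoring matrix (PSSM) can be sketched as follows. The sketch uses a common textbook definition, in which each PSSM column is mean-centred and the products of scores `lag` positions apart are averaged; whether the paper uses exactly this normalisation is an assumption, and the function name, the `max_lag` parameter and the toy matrix are hypothetical.

```python
def auto_covariance(pssm, max_lag):
    """Turn a variable-length PSSM into a fixed-length feature vector.

    pssm: list of rows (one per residue), each a list of column scores.
    Returns len(columns) * max_lag features, ordered by lag, then by column.
    """
    length = len(pssm)
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    n_cols = len(pssm[0])
    # Mean score of each column over the whole sequence.
    means = [sum(row[j] for row in pssm) / length for j in range(n_cols)]
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


# Hypothetical toy PSSM with 4 residues and 2 columns (a real PSSM has 20).
toy_pssm = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
toy_features = auto_covariance(toy_pssm, max_lag=2)
```

Because the output length depends only on the number of columns and `max_lag`, proteins of different lengths map to vectors of the same size, which is what makes the transform convenient as classifier input.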
BMC Bioinformatics | 2009
Jiamin Xiao; Yizhou Li; Kelong Wang; Zhining Wen; Menglong Li; Lifang Zhang; Xuanmin Guang
Background: MicroRNA (miRNA), a class of short non-coding RNA, plays a pivotal role in the regulation of many biological processes and affects the stability and/or translation of mRNA. Recently, machine learning algorithms were developed to predict potential miRNA targets. Most of these methods are robust but are not sensitive to redundant or irrelevant features. Despite their good performance, the relative importance of each feature is still unclear. With increasing experimental data becoming available, research interest has shifted from higher prediction performance to uncovering the mechanism of microRNA-mRNA interactions.
Results: Systematic analysis of sequence, structural and positional features was carried out for two different data sets. The dominant functional features were distinguished from uninformative features in single and hybrid feature sets. Models were developed using only statistically significant sequence, structural and positional features, resulting in area under the receiver operating characteristic curve (AUC) values of 0.919, 0.927 and 0.969 for one data set and of 0.926, 0.874 and 0.954 for the other, respectively. Hybrid models were developed by combining various features and achieved AUCs of 0.978 and 0.970 for the two data sets. Functional miRNA information is well reflected in these features, which are expected to be valuable in understanding the mechanism of microRNA-mRNA interactions and in designing experiments.
Conclusions: Unlike previous approaches, this study focused on systematic analysis of all types of features. Statistically significant features were identified and used to construct models that yield accuracy similar to that of previous studies in a shorter computation time.
Scientific Reports | 2015
Qifan Kuang; Xin Xu; Rong Li; Yongcheng Dong; Yan Li; Ziyan Huang; Yizhou Li; Menglong Li
The prediction of drug-target interactions is a key step in the drug discovery process, which serves to identify new drugs or novel targets for existing drugs. However, experimental methods for predicting drug-target interactions are expensive and time-consuming. Therefore, the in silico prediction of drug-target interactions has recently attracted increasing attention. In this study, we propose an eigenvalue transformation technique and apply this technique to two representative algorithms, the Regularized Least Squares classifier (RLS) and the semi-supervised link prediction classifier (SLP), that have been used to predict drug-target interaction. The results of computational experiments with these techniques show that algorithms including eigenvalue transformation achieved better performance on drug-target interaction prediction than did the original algorithms. These findings show that eigenvalue transformation is an efficient technique for improving the performance of methods for predicting drug-target interactions. We further show that, in theory, eigenvalue transformation can be viewed as a feature transformation on the kernel matrix. Accordingly, although we only apply this technique to two algorithms in the current study, eigenvalue transformation also has the potential to be applied to other algorithms based on kernels.
Computers in Biology and Medicine | 2013
Lezheng Yu; Jiesi Luo; Yanzhi Guo; Yizhou Li; Xuemei Pu; Menglong Li
In this study, we focus on different types of Gram-negative bacterial secreted proteins and analyze the relationships and differences among them. Through an extensive literature search, 1612 secreted proteins have been collected as a standard data set from three data sources: Swiss-Prot, TrEMBL and RefSeq. To explore the relationships among different types of secreted proteins, we model this data set as a sequence similarity network. Finally, a multi-classifier named SecretP is proposed to distinguish different types of secreted proteins; it yields a high total sensitivity of 90.12% on the test set. When applied to another public independent dataset for further evaluation, a promising prediction result is obtained. Predictions can be run freely online at http://cic.scu.edu.cn/bioinformatics/secretPv2_1/index.htm.
Analytical Methods | 2013
Jiao Lin; Qifan Kuang; Yizhou Li; Yongqing Zhang; Jing Sun; Zhanling Ding; Menglong Li
Detecting adverse drug reactions (ADRs) is a big challenge for drug development and post-marketing applications. Owing to their low cost and high performance, computational methods are used to predict unknown adverse reactions of drugs. In the present study, a network-based method is developed in which a bipartite network represents associations between ADRs and drugs. The potential ADRs of a drug can be inferred simply from its neighbourhood in the bipartite network. Our method was applied to three datasets compiled from FAERS, SIDER and the intersection of these two databases (gold standard data). Encouraging results were achieved: the area under curve (AUC) values were 0.93, 0.94 and 0.83, respectively. To further evaluate the performance of our method, comparisons were made with an internal link prediction method and a logistic regression method on the gold standard data. Our method achieved an AUC value of 0.83, while the AUC values were 0.75 for both the internal link prediction method and the logistic regression method. The results show that it is feasible to predict unknown drug–ADR associations using only topological features of the drug–ADR network.
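The idea of inferring a drug's potential ADRs from its neighbourhood in the bipartite network can be sketched as below. The scoring rule is an assumption chosen for illustration, not the paper's formula: a candidate ADR receives one point for every ADR shared between the query drug and another drug known to cause the candidate. The function name and the toy drug and ADR labels are hypothetical.

```python
from collections import defaultdict


def score_candidate_adrs(drug_to_adrs, drug):
    """Rank ADRs not yet linked to `drug` by support from neighbouring drugs.

    drug_to_adrs: dict mapping each drug to the set of its known ADRs,
    i.e. the edge list of the bipartite drug-ADR network.
    """
    known = drug_to_adrs[drug]
    scores = defaultdict(int)
    for other, other_adrs in drug_to_adrs.items():
        if other == drug:
            continue
        # Drugs sharing more ADRs with the query drug count as closer neighbours.
        shared = len(known & other_adrs)
        if shared == 0:
            continue
        for adr in other_adrs - known:
            scores[adr] += shared
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy network.
toy_network = {
    "drug_A": {"nausea", "headache"},
    "drug_B": {"nausea", "headache", "rash"},
    "drug_C": {"nausea", "dizziness"},
    "drug_D": {"insomnia"},
}
toy_ranking = score_candidate_adrs(toy_network, "drug_A")
```

Only the network topology enters the score, with no chemical or biological descriptors, which mirrors the abstract's closing claim about topology-only prediction.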
PLOS ONE | 2011
Yizhou Li; Gongbing Li; Zhining Wen; Hui Yin; Mei Hu; Jiamin Xiao; Menglong Li
Owing to their potential for systematic analysis, complex networks have been widely used in proteomics. Representing a protein structure as a topology network provides novel insight into understanding protein folding mechanisms, stability and function. Here, we develop a new feature to reveal correlations between residues using a protein structure network. In an original attempt to quantify the effects of several key residues on catalytic residues, a power function was used to model interactions between residues. The results indicate that focusing on a few residues is a feasible approach to identifying catalytic residues. The spatial environment surrounding a catalytic residue was analyzed in a layered manner. We present evidence that (i) correlation between residues is related to their distance apart, and most environmental parameters of the outer layer make a smaller contribution to prediction, and (ii) catalytic residues tend to be located near key positions in enzyme folds. Feature analysis revealed satisfactory performance for our features, which were combined with several conventional features in a prediction model for catalytic residues using a comprehensive data set from the Catalytic Site Atlas. Values of 88.6% for sensitivity and 88.4% for specificity were obtained by 10-fold cross-validation. These results suggest that these features reveal the mutual dependence of residues and are promising for further study of the structure-function relationship.