Xing-Yu Sun | Researchain

Publication

Featured research published by Xing-Yu Sun.

Molecular BioSystems | 2012

Identifying protein quaternary structural attributes by incorporating physicochemical properties into the general form of Chou's PseAAC via discrete wavelet transform.

Xing-Yu Sun; Shao-Ping Shi; Jian-Ding Qiu; Sheng-Bao Suo; Shu-Yun Huang; Ru-Ping Liang

In vivo, some proteins exist as monomers and others as oligomers. Oligomers can be further classified into homo-oligomers (formed by identical subunits) and hetero-oligomers (formed by different subunits), and they form the structural components of various biological functions, including cooperative effects, allosteric mechanisms and ion-channel gating. Therefore, with the avalanche of protein sequences generated in the post-genomic era, it is very important for both basic research and the pharmaceutical industry to learn the quaternary structural attributes of proteins of interest. In view of this, a high-throughput method (DWT_DT), a 2-layer approach that fuses the discrete wavelet transform (DWT) and a decision-tree algorithm (DT) with physicochemical features, has been developed to predict protein quaternary structures. The 1st layer assigns a query protein to one of the 10 main quaternary structural attributes. The 2nd layer evaluates whether the protein in question is a homo- or hetero-oligomer. The overall accuracy by jackknife test for the 1st layer identification was 89.60%. The overall accuracy of the 2nd layer varies from 88.23% to 100%. The results suggest that this newly developed protocol (DWT_DT) is very promising in predicting quaternary structures with complicated composition.
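The wavelet feature-extraction step described above can be illustrated with a minimal sketch. The abstract does not state which wavelet or which physicochemical properties were used, so the Haar wavelet, the Kyte-Doolittle hydrophobicity scale, the summary statistics and the function names `haar_step`, `summarize` and `dwt_features` are all assumptions made for illustration, not the published DWT_DT implementation.

```python
import math

# Kyte-Doolittle hydrophobicity scale (assumed property; the abstract names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd lengths by repeating the last value
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def summarize(coeffs):
    """Reduce a coefficient list of any length to four statistics."""
    mean = sum(coeffs) / len(coeffs)
    std = math.sqrt(sum((c - mean) ** 2 for c in coeffs) / len(coeffs))
    return [max(coeffs), min(coeffs), mean, std]


def dwt_features(sequence, levels=3):
    """Map a protein sequence to a fixed-length vector of 4 * (levels + 1) values."""
    signal = [KD[aa] for aa in sequence.upper() if aa in KD]
    if not signal:
        raise ValueError("sequence contains no standard amino acids")
    features = []
    for _ in range(levels):
        if len(signal) >= 2:
            signal, detail = haar_step(signal)
        else:
            detail = [0.0]  # sequence too short for this level; keep the length fixed
        features.extend(summarize(detail))
    features.extend(summarize(signal))  # final approximation coefficients
    return features


vector = dwt_features("MKTAYIAKQRQISFVKSHFSRQ")  # hypothetical sequence fragment
```

Because every sequence yields a vector of the same length regardless of how many residues it has, the output can be passed directly to a classifier such as a decision tree.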

PLOS ONE | 2012

PMeS: Prediction of Methylation Sites Based on Enhanced Feature Encoding Scheme

Shao-Ping Shi; Jian-Ding Qiu; Xing-Yu Sun; Sheng-Bao Suo; Shu-Yun Huang; Ru-Ping Liang

Protein methylation is predominantly found on lysine and arginine residues, and carries many important biological functions, including gene regulation and signal transduction. Given their important involvement in gene expression, protein methylation and its regulatory enzymes are implicated in a variety of human disease states such as cancer, coronary heart disease and neurodegenerative disorders. Thus, identification of methylation sites can be very helpful for drug design targeting various related diseases. In this study, we developed a method called PMeS to improve the prediction of protein methylation sites based on an enhanced feature encoding scheme and a support vector machine. The enhanced feature encoding scheme was composed of sparse property coding, normalized van der Waals volume, position weight amino acid composition and accessible surface area. PMeS achieved a promising performance with a sensitivity of 92.45%, a specificity of 93.18%, an accuracy of 92.82% and a Matthews correlation coefficient of 85.69% for arginine, as well as a sensitivity of 84.38%, a specificity of 93.94%, an accuracy of 89.16% and a Matthews correlation coefficient of 78.68% for lysine in 10-fold cross-validation. Compared with other existing methods, PMeS provides better predictive performance and greater robustness. It can be anticipated that PMeS might be useful to guide future experiments needed to identify potential methylation sites in proteins of interest. The online service is available at http://bioinfo.ncu.edu.cn/inquiries_PMeS.aspx.
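The four performance measures quoted above follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are illustrative assumptions and are not data from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Matthews correlation coefficient.

    tp/fn count true sites predicted as sites/non-sites;
    tn/fp count non-sites predicted as non-sites/sites.
    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Made-up counts, for illustration only.
scores = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

The MCC ranges from -1 to 1; the abstracts on this page report it multiplied by 100, as a percentage.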

Biochimica et Biophysica Acta | 2011

Identify submitochondria and subchloroplast locations with pseudo amino acid composition: Approach from the strategy of discrete wavelet transform feature extraction

Shao-Ping Shi; Jian-Ding Qiu; Xing-Yu Sun; Jian-Hua Huang; Shu-Yun Huang; Sheng-Bao Suo; Ru-Ping Liang; Li Zhang

It is very challenging and complicated to predict protein locations at the sub-subcellular level. The key to enhancing the prediction quality for protein sub-subcellular locations is to grasp the core features of a protein that can discriminate among proteins with different subcompartment locations. In this study, a different formulation of pseudo amino acid composition, based on discrete wavelet transform feature extraction, was developed to predict submitochondria and subchloroplast locations. Under jackknife cross-validation, our method efficiently distinguished mitochondrial proteins from chloroplast proteins with a total accuracy of 98.8% and obtained a promising total accuracy of 93.38% for predicting submitochondria locations. In particular, the predictive accuracies for the mitochondrial outer membrane and the chloroplast thylakoid lumen were 82.93% and 82.22%, respectively, improvements of 4.88% and 27.22% over other existing methods. The results indicated that the proposed method might be employed as a useful assistant technique for identifying sub-subcellular locations. We have implemented our algorithm as an online service called SubIdent (http://bioinfo.ncu.edu.cn/services.aspx).
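The jackknife test used here is leave-one-out cross-validation: each protein is held out in turn, the model is trained on all the others, and the held-out protein is then predicted. The sketch below shows the procedure with a 1-nearest-neighbour classifier standing in for the real predictor; that classifier, the function names and the toy data are assumptions for illustration only.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Stand-in classifier: label of the closest training vector (squared Euclidean)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]


def jackknife_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Leave-one-out accuracy: every sample is tested exactly once."""
    correct = 0
    for i in range(len(features)):
        # Train on everything except sample i.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Toy feature vectors with hypothetical location labels.
toy_x = [[0.0], [0.1], [1.0], [1.1]]
toy_y = ["inner", "inner", "outer", "outer"]
accuracy = jackknife_accuracy(toy_x, toy_y)
```

Unlike k-fold cross-validation, the jackknife involves no random partitioning, so it returns the same result every time for a given dataset and predictor.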

PLOS ONE | 2012

Position-Specific Analysis and Prediction for Protein Lysine Acetylation Based on Multiple Features

Sheng-Bao Suo; Jian-Ding Qiu; Shao-Ping Shi; Xing-Yu Sun; Shu-Yun Huang; Xiang Chen; Ru-Ping Liang

Protein lysine acetylation is a type of reversible post-translational modification that plays a vital role in many cellular processes, such as transcriptional regulation, apoptosis and cytokine signaling. To fully decipher the molecular mechanisms of acetylation-related biological processes, an initial but crucial step is the recognition of acetylated substrates and the corresponding acetylation sites. In this study, we developed a position-specific method named PSKAcePred for lysine acetylation prediction based on support vector machines. The residues around the acetylation sites were selected or excluded based on their entropy values. We incorporated features of amino acid composition information, evolutionary similarity and physicochemical properties to predict lysine acetylation sites. The prediction model achieved an accuracy of 79.84% and a Matthews correlation coefficient of 59.72% using the 10-fold cross-validation on balanced positive and negative samples. A feature analysis showed that all features applied in this method contributed to the acetylation process. A position-specific analysis showed that the features derived from the critical neighboring residues contributed profoundly to the acetylation site determination. The detailed analysis in this paper can help us to understand more of the acetylation mechanism and can provide guidance for the related experimental validation.
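The entropy-based selection of flanking residues can be sketched as follows. The abstract does not give the entropy formula, the threshold or the direction of the cut, so the per-position Shannon entropy, the rule "keep positions below a threshold" and the names `position_entropies` and `select_positions` are assumptions, not the PSKAcePred procedure itself.

```python
import math
from collections import Counter


def position_entropies(windows):
    """Shannon entropy (in bits) of the residue distribution at each window position.

    `windows` is a list of equal-length peptide strings centred on the lysine site.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    entropies = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        total = sum(counts.values())
        entropies.append(
            -sum((c / total) * math.log2(c / total) for c in counts.values())
        )
    return entropies


def select_positions(windows, threshold):
    """Indices of positions whose entropy is below `threshold` (assumed rule)."""
    return [i for i, h in enumerate(position_entropies(windows)) if h < threshold]


# Hypothetical 3-residue windows with the modified lysine in the centre.
example = ["AKA", "AKC", "AKG", "AKT"]
kept = select_positions(example, threshold=1.0)
```

A fully conserved position has zero entropy and a position where all 20 amino acids are equally likely has log2(20), about 4.32 bits. The central lysine is always conserved by construction, so in practice it would be left out of the comparison.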

Amino Acids | 2010

Predicting subcellular location of apoptosis proteins based on wavelet transform and support vector machine

Jian-Ding Qiu; San-Hua Luo; Jian-Hua Huang; Xing-Yu Sun; Ru-Ping Liang

Apoptosis proteins have a central role in the development and homeostasis of an organism. These proteins are very important for understanding the mechanism of programmed cell death. As a result of genome and other sequencing projects, the gap between the number of known apoptosis protein sequences and the number of known apoptosis protein structures is widening rapidly. Because of this extremely unbalanced state, it would be worthwhile to develop a fast and reliable method to identify their subcellular locations so as to gain better insight into their biological functions. In view of this, a new method that combines the support vector machine with the discrete wavelet transform has been developed to predict the subcellular location of apoptosis proteins. The results obtained by the jackknife test were quite promising, and indicated that the proposed method can remarkably improve the prediction accuracy of subcellular locations, and might also become a useful high-throughput tool for characterizing other attributes of proteins, such as enzyme class, membrane protein type, and nuclear receptor subfamily, according to their sequences.

Journal of Molecular Graphics & Modelling | 2011

OligoPred: A web-server for predicting homo-oligomeric proteins by incorporating discrete wavelet transform into Chou's pseudo amino acid composition

Jian-Ding Qiu; Sheng-Bao Suo; Xing-Yu Sun; Shao-Ping Shi; Ru-Ping Liang

In vivo, some proteins exist as monomers (single polypeptide chains) and others as oligomers. Unlike monomers, oligomers are composed of two or more chains (subunits) that are associated with each other through non-covalent interactions and, occasionally, through disulfide bonds. These proteins are the structural components of various biological functions, including cooperative effects, allosteric mechanisms and ion-channel gating. With the dramatic increase in the number of protein sequences submitted to public databanks, it is important for both basic research and drug discovery research to learn the homo-oligomeric attributes of proteins of interest in a timely manner. In this paper, a high-throughput method combining support vector machines with the discrete wavelet transform has been developed to predict protein homo-oligomers. The total accuracies obtained by the re-substitution test, jackknife test and independent dataset test are 99.94%, 96.17% and 96.18%, respectively, showing that the proposed method of extracting features from protein sequences is effective and feasible for predicting homo-oligomers. The online service is available at http://bioinfo.ncu.edu.cn/Services.aspx.

Analytical Biochemistry | 2012

PredSulSite: prediction of protein tyrosine sulfation sites with multiple features and analysis.

Shu-Yun Huang; Shao-Ping Shi; Jian-Ding Qiu; Xing-Yu Sun; Sheng-Bao Suo; Ru-Ping Liang

Tyrosine sulfation is a ubiquitous posttranslational modification that regulates extracellular protein-protein interactions, the modulation of intracellular protein transportation, and protein proteolytic processing. However, identifying tyrosine sulfation sites remains a challenge due to the lability of sulfation sequences. In this study, we developed a method called PredSulSite that uses a support vector machine and incorporates protein secondary structure, physicochemical properties of amino acids, and residue sequence order information to predict sulfotyrosine sites. Three types of encoding algorithms (secondary structure, grouped weight, and autocorrelation function) were applied to mine features from tyrosine sulfation proteins. The prediction model with multiple features achieved an accuracy of 92.89% in 10-fold cross-validation. Feature analysis showed that the coil structure, acidic amino acids, and residue interactions around the tyrosine sulfation sites all contributed to the sulfation site determination. The detailed feature analysis in this work can help us to understand the sulfation mechanism and provide guidance for the related experimental validation. PredSulSite is available as a community resource at http://www.bioinfo.ncu.edu.cn/inquiries_PredSulSite.aspx.
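Of the three encodings named above, the autocorrelation function is the easiest to show in a few lines. The sketch uses one common definition, r(n) = (1 / (L - n)) * sum over i of h(i) * h(i + n), where h(i) is a physicochemical value for residue i and L is the window length. Whether PredSulSite uses exactly this form is not stated in the abstract, so both the formula and the function name `autocorrelation` are assumptions.

```python
def autocorrelation(values, max_lag):
    """Autocorrelation coefficients r(1)..r(max_lag) of a numeric residue profile.

    `values` holds one physicochemical value per residue (for example a
    hydrophobicity index); the choice of property is left to the caller.
    """
    length = len(values)
    if not 0 < max_lag < length:
        raise ValueError("max_lag must be between 1 and len(values) - 1")
    return [
        sum(values[i] * values[i + lag] for i in range(length - lag)) / (length - lag)
        for lag in range(1, max_lag + 1)
    ]


# Illustrative profile, not real property values.
coeffs = autocorrelation([1.0, 2.0, 3.0, 4.0], max_lag=2)
```

Each coefficient captures how the property at one residue co-varies with the property `lag` positions away, which is one way to encode sequence-order information around a candidate site.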

Journal of Molecular Graphics & Modelling | 2013

The prediction of palmitoylation site locations using a multiple feature extraction method.

Shao-Ping Shi; Xing-Yu Sun; Jian-Ding Qiu; Sheng-Bao Suo; Xiang Chen; Shu-Yun Huang; Ru-Ping Liang

As an extremely important and ubiquitous post-translational lipid modification, palmitoylation plays a significant role in a variety of biological and physiological processes. Unlike other lipid modifications, protein palmitoylation and depalmitoylation are highly dynamic and can regulate both protein function and localization. The dynamic nature of palmitoylation is poorly understood because of the limitations in current assay methods. The in vivo or in vitro experimental identification of palmitoylation sites is both time consuming and expensive. Due to the large volume of protein sequences generated in the post-genomic era, it is extraordinarily important in both basic research and drug discovery to rapidly identify the palmitoylation sites of a new protein. In this work, a new computational method, WAP-Palm, combining multiple feature extraction, has been developed to predict the palmitoylation sites of proteins. The WAP-Palm model achieved a sensitivity of 81.53%, a specificity of 90.45%, an accuracy of 85.99% and a Matthews correlation coefficient of 72.26% in a 10-fold cross-validation test. The results obtained from both the cross-validation and independent tests suggest that the WAP-Palm model might facilitate the identification and annotation of protein palmitoylation locations. The online service is available at http://bioinfo.ncu.edu.cn/WAP-Palm.aspx.
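In 10-fold cross-validation the dataset is split into ten parts, and each part serves once as the test set while the other nine are used for training. A minimal sketch of the partitioning follows; the function name `k_fold_indices`, the fixed seed and the plain (unstratified) shuffle are assumptions, since the abstract does not describe how the folds were built.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
```

Sensitivity, specificity, accuracy and the Matthews correlation coefficient would then be computed from the predictions collected over all ten test folds.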

Computers in Biology and Medicine | 2012

A novel algorithm combining support vector machine with the discrete wavelet transform for the prediction of protein subcellular localization

Ru-Ping Liang; Shu-Yun Huang; Shao-Ping Shi; Xing-Yu Sun; Sheng-Bao Suo; Jian-Ding Qiu

Knowing the subcellular localization of a protein within the cell is an important step in elucidating its role in biological processes, its function and its potential as a drug target for disease diagnosis. As the number of complete genomes rapidly increases, accurate and efficient methods that automatically predict subcellular localizations become more urgent. In the current paper, we developed a novel method that couples the discrete wavelet transform with a support vector machine, based on amino acid polarity, to predict the subcellular localizations of prokaryotic and eukaryotic proteins. The results obtained by the jackknife test were quite promising, and indicated that the proposed method remarkably improved the prediction accuracy of subcellular locations and could serve as an effective and promising high-throughput method in subcellular localization research.

Journal of Theoretical Biology | 2012

A method to distinguish between lysine acetylation and lysine methylation from protein sequences.

Shao-Ping Shi; Jian-Ding Qiu; Xing-Yu Sun; Sheng-Bao Suo; Shu-Yun Huang; Ru-Ping Liang

Lysine acetylation and methylation are two major post-translational modifications of lysine residues. They play vital roles in both biological and pathological processes. Specific lysine residues in H3 histone protein tails appear to be targeted for either acetylation or methylation. Hence it is very challenging to distinguish between acetylated and methylated lysine residues using computational methods. This work presents a method that incorporates protein sequence information, secondary structure and amino acid properties to differentiate acetyl-lysine from methyl-lysine. We apply an encoding scheme based on grouped weight and position weight amino acid composition to extract sequence information and physicochemical properties around lysine sites. The proposed method achieves an accuracy of 93.3% using a jackknife test. Feature analysis demonstrates that the prediction model with multiple features can take full advantage of the supplementary information from different features to improve classification performance and prediction robustness. Analysis of the characteristics of lysine residues which can be either methylated or acetylated shows that they are more similar to methyl-lysine than to acetyl-lysine.
