Publication


Featured research published by Shen Niu.


Journal of Proteomics | 2012

Predict and analyze S-nitrosylation modification sites with the mRMR and IFS approaches.

Bi-Qing Li; Le-Le Hu; Shen Niu; Yu-Dong Cai; Kuo-Chen Chou

S-nitrosylation (SNO) is one of the most important and universal post-translational modifications (PTMs), regulating various cellular functions and signaling events. Identification of the exact S-nitrosylation sites in proteins may facilitate the understanding of the molecular mechanisms and biological function of S-nitrosylation. Unfortunately, traditional experimental approaches for detecting S-nitrosylation sites are often laborious and time-consuming; computational methods could overcome this drawback. In this work, we developed a novel predictor based on the nearest neighbor algorithm (NNA) with the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). Features of physicochemical/biochemical properties, sequence conservation, residual disorder, amino acid occurrence frequency, secondary structure and solvent accessibility were used to represent the peptides concerned. Feature analysis showed that all features except residual disorder affected identification of the S-nitrosylation sites. Site-specific feature analysis also showed that the features of sites away from the central cysteine might contribute to S-nitrosylation site determination in a subtle manner. It is anticipated that our prediction method may become a useful tool for identifying protein S-nitrosylation sites and that the feature analysis described in this paper may provide useful insights for in-depth investigation into the mechanism of S-nitrosylation.
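The mRMR step named in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes discrete feature values, uses the "relevance minus mean redundancy" criterion, and the function names `mutual_information` and `mrmr_rank` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(columns, labels):
    """Greedy mRMR ordering of feature indices.

    columns: list of feature columns, each a list of discrete values.
    labels:  list of class labels, one per sample.
    """
    relevance = [mutual_information(col, labels) for col in columns]
    remaining = list(range(len(columns)))
    selected = []
    while remaining:
        def score(j):
            # Relevance to the labels minus mean redundancy with chosen features.
            if not selected:
                return relevance[j]
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

In this sketch a feature that duplicates an already selected one is pushed to the end of the ranking, even if it is highly relevant on its own, which is the behaviour the "minimum redundancy" term is meant to produce.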


PLOS ONE | 2011

Predicting Transcriptional Activity of Multiple Site p53 Mutants Based on Hybrid Properties

Tao Huang; Shen Niu; Zhongping Xu; Yun Huang; Xiangyin Kong; Yu-Dong Cai; Kuo-Chen Chou

Mutated p53, an important tumor suppressor protein, has been found in many kinds of human cancers, and restoring active p53 has been found to lead to tumor regression. In this work, we developed a new computational method to predict the transcriptional activity of one-, two-, three- and four-site p53 mutants. With the approach from the general form of pseudo amino acid composition, we used eight types of features to represent the mutation and then selected the optimal prediction features based on the maximum relevance, minimum redundancy, and incremental feature selection methods. The Matthews correlation coefficients (MCC) obtained by using the nearest neighbor algorithm and jackknife cross-validation for one-, two-, three- and four-site p53 mutants were 0.678, 0.314, 0.705, and 0.907, respectively. Further analysis of the optimal feature set revealed that the 2D (two-dimensional) structure features composed the largest part of the optimal feature set and may have played the most important roles in all four types of p53 mutant active status prediction. The optimal feature sets, especially the top-ranked features, also demonstrated that the 3D structure features, conservation, and physicochemical and biochemical properties of amino acids near the mutation site played quite important roles in p53 mutant active status prediction. Our study has provided a new and promising approach for finding functionally important sites and the relevant features for in-depth study of the p53 protein and its action mechanism.
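For reference, the Matthews correlation coefficient reported above is computed from the confusion-matrix counts as MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). A minimal sketch follows; the function name `mcc` and the convention of returning 0.0 when the denominator vanishes are assumptions made here, not details taken from the paper.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

The value ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right), which makes it more informative than plain accuracy on unbalanced data sets.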


Biochimie | 2011

Prediction and analysis of protein palmitoylation sites.

Le-Le Hu; Si-Bao Wan; Shen Niu; Xiao-He Shi; Haipeng Li; Yu-Dong Cai; Kuo-Chen Chou

Palmitoylation is a universal and important lipid modification involved in a series of basic cellular processes, such as membrane trafficking, protein stability and protein aggregation. With the avalanche of new protein sequences generated in the post-genomic era, it is highly desirable to develop computational methods for rapidly and effectively identifying the potential palmitoylation sites of uncharacterized proteins, so as to provide timely information for revealing the mechanism of protein palmitoylation. Using the Incremental Feature Selection approach based on amino acid factors, conservation, disorder features, and specific features of the palmitoylation site, a new predictor named IFS-Palm was developed. The overall success rate achieved by the jackknife test on a newly constructed benchmark dataset was 90.65%. An in-depth analysis showed that palmitoylation was intimately correlated with the features of the upstream residue directly adjacent to the cysteine site as well as with the conservation of the cysteine residue. Meanwhile, protein disorder regions might also play an important role in this post-translational modification. These findings may provide useful insights for revealing the mechanisms of palmitoylation.


Journal of Proteome Research | 2010

Prediction of tyrosine sulfation with mRMR feature selection and analysis.

Shen Niu; Tao Huang; Kai-Yan Feng; Yu-Dong Cai; Yixue Li

Protein tyrosine sulfation is a ubiquitous post-translational modification (PTM) of secreted and transmembrane proteins that pass through the Golgi apparatus. In this study, we developed a new method for protein tyrosine sulfation prediction based on a nearest neighbor algorithm with the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). We incorporated features of sequence conservation, residual disorder, and amino acid factor, 229 features in total, to predict tyrosine sulfation sites. From these 229 features, 145 features were selected and deemed as the optimized features for the prediction. The prediction model achieved a prediction accuracy of 90.01% using the optimal 145-feature set. Feature analysis showed that conservation, disorder, and physicochemical/biochemical properties of amino acids all contributed to the sulfation process. Site-specific feature analysis showed that the features derived from its surrounding sites contributed profoundly to sulfation site determination in addition to features derived from the sulfation site itself. The detailed feature analysis in this paper might help understand more of the sulfation mechanism and guide the related experimental validation.
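The incremental feature selection step, in which the size of the optimal feature set (here 145 of 229) is chosen by evaluating growing prefixes of a ranked feature list, can be sketched as follows. This is an illustration under stated assumptions rather than the published pipeline: it uses squared Euclidean distance for the nearest neighbor search, leave-one-out (jackknife) accuracy as the score, and hypothetical function names.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Label of the training vector closest to query (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]


def jackknife_accuracy(samples, labels, feature_ids):
    """Leave-one-out accuracy of 1-NN restricted to the given feature indices."""
    projected = [[row[j] for j in feature_ids] for row in samples]
    correct = 0
    for i, query in enumerate(projected):
        rest_x = projected[:i] + projected[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        correct += nearest_neighbor_predict(rest_x, rest_y, query) == labels[i]
    return correct / len(samples)


def incremental_feature_selection(samples, labels, ranked_features):
    """Score the top-k ranked features for every k; return (best_k, best_acc, curve)."""
    curve = [
        jackknife_accuracy(samples, labels, ranked_features[:k])
        for k in range(1, len(ranked_features) + 1)
    ]
    best_acc = max(curve)
    best_k = curve.index(best_acc) + 1  # smallest k reaching the best accuracy
    return best_k, best_acc, curve
```

Here `ranked_features` would come from a ranking step such as mRMR, and `curve` is the accuracy-versus-feature-count curve whose peak defines the optimal feature set.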


PLOS ONE | 2010

Prediction and Analysis of Protein Hydroxyproline and Hydroxylysine

Le-Le Hu; Shen Niu; Tao Huang; Kai Wang; Xiao-He Shi; Yu-Dong Cai

Background: Hydroxylation is an important post-translational modification and is closely related to various diseases. Besides biotechnology experiments, in silico prediction methods are alternative ways to identify potential hydroxylation sites.

Methodology/Principal Findings: In this study, we developed a novel sequence-based method for identifying the two main types of hydroxylation sites: hydroxyproline and hydroxylysine. First, feature selection was performed on three kinds of features computed in a sliding window of 13 amino acids: amino acid indices (AAindex), which include various physicochemical and biochemical properties of amino acids; Position-Specific Scoring Matrices (PSSM), which represent the evolutionary information of amino acids; and the structural disorder of amino acids. The prediction model was then built using the incremental feature selection method. The resulting prediction accuracies were 76.0% and 82.1%, evaluated by jackknife cross-validation on the hydroxyproline dataset and the hydroxylysine dataset, respectively. Feature analysis suggested that the physicochemical properties, biochemical properties and evolutionary information of amino acids contribute much to the identification of protein hydroxylation sites, while structural disorder had little relation to protein hydroxylation. It was also found that the amino acid adjacent to the hydroxylation site tends to exert more influence than other sites on hydroxylation determination.

Conclusions/Significance: These findings may provide useful insights for exploiting the mechanisms of hydroxylation.
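The sliding window of 13 amino acids mentioned in this abstract can be illustrated by cutting a fixed-length peptide around each candidate residue. The sketch below is an assumption-laden illustration: the function name `extract_windows` and the use of "X" to pad windows that run past the sequence ends are choices made here, not details stated in the abstract.

```python
def extract_windows(sequence, target, window=13, pad="X"):
    """Return (1-based position, peptide) for every occurrence of target.

    Each peptide has length `window` and is centered on the target residue;
    positions beyond the sequence ends are filled with `pad` (an assumption).
    """
    half = window // 2
    padded = pad * half + sequence + pad * half
    return [
        (i + 1, padded[i:i + window])
        for i, residue in enumerate(sequence)
        if residue == target
    ]
```

For example, `extract_windows("MKPAAK", "K", window=5)` yields `[(2, "XMKPA"), (6, "AAKXX")]`; with the default window of 13, each candidate proline or lysine would be represented by itself plus six residues on either side.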


Biopolymers | 2011

Prediction and analysis of protein methylarginine and methyllysine based on Multisequence features

Le-Le Hu; Zhen Li; Kai Wang; Shen Niu; Xiao-He Shi; Yu-Dong Cai; Haipeng Li

Protein methylation, one of the most important post-translational modifications, typically takes place on arginine or lysine residues. This reversible modification is involved in a series of basic cellular processes. Identification of methylated proteins and their sites will facilitate the understanding of the molecular mechanism of methylation. Alongside experimental methods, computational prediction of methylated sites is highly desirable for its convenience and speed. Here, we propose a method dedicated to predicting methylated sites of proteins. Feature selection was performed on sequence conservation, physicochemical/biochemical properties, and structural disorder by applying the maximum relevance minimum redundancy and incremental feature selection methods. The prediction models were built using the nearest neighbor algorithm and evaluated by jackknife cross-validation. We built 11 and 9 predictors for methylarginine and methyllysine, respectively, and integrated them to predict methylated sites. The resulting average prediction accuracies were 74.25% and 77.02% for the methylarginine and methyllysine training sets, respectively. Feature analysis suggested that evolutionary information and physicochemical/biochemical properties play important roles in the recognition of methylated sites. These findings may provide valuable information for exploiting the mechanisms of methylation. Our method may serve as a useful tool for biologists to find potential methylated sites of proteins.


Journal of Biomolecular Structure & Dynamics | 2012

Predicting protein oxidation sites with feature selection and analysis approach

Shen Niu; Le-Le Hu; Lulu Zheng; Tao Huang; Kai-Yan Feng; Yu-Dong Cai; Haipeng Li; Yixue Li; Kuo-Chen Chou

Protein oxidation is a ubiquitous post-translational modification that plays important roles in various physiological and pathological processes. Because protein oxidation can also arise as an experimental artifact or be caused by oxygen in the air during sample collection and analysis, and because determining protein oxidation sites purely by biochemical experiments is both time-consuming and expensive, it would be of great benefit to develop in silico methods for rapidly and effectively identifying protein oxidation sites. In this study, we developed a computational method to address this problem. Our method was based on the nearest neighbor algorithm, combined with the maximum relevance minimum redundancy and incremental feature selection approaches. From the initial 735 features, 16 features were selected as the optimal feature set. Of these 16 optimized features, 10 were associated with the position-specific scoring matrix conservation scores, three with the amino acid factors, one with the propensity of conservation of residues on the protein surface, one with the deviation of the side chain carbon atom count from the mean, and one with the solvent accessibility. Our prediction model achieved an overall success rate of 75.82%, which is encouraging for practical applications. The 16 optimal features obtained through this study may also provide useful clues and insights for an in-depth understanding of the action mechanism of protein oxidation.


PLOS ONE | 2011

Prediction of protein modification sites of pyrrolidone carboxylic acid using mRMR feature selection and analysis.

Lulu Zheng; Shen Niu; Pei Hao; Kai-Yan Feng; Yu-Dong Cai; Yixue Li

Pyrrolidone carboxylic acid (PCA) is formed during a common post-translational modification (PTM) of extracellular and multi-pass membrane proteins. In this study, we developed a new predictor of PCA modification sites based on maximum relevance minimum redundancy (mRMR) and incremental feature selection (IFS). We incorporated 727 features belonging to 7 kinds of protein properties to predict the modification sites: sequence conservation, residual disorder, amino acid factor, secondary structure and solvent accessibility, gain/loss of amino acids during evolution, propensity of amino acids to be conserved at the protein-protein interface and protein surface, and deviation of side chain carbon atom number. Among these 727 features, 244 were selected by mRMR and IFS as the optimized features for the prediction, with which the prediction model achieved a maximum MCC of 0.7812. Feature analysis showed that all feature types contributed to the modification process. Further site-specific feature analysis showed that the features derived from PCA's surrounding sites contributed more to the determination of PCA sites than other sites. The detailed feature analysis in this paper might provide important clues for understanding the mechanism of PCA formation and guide relevant experimental validations.


PLOS ONE | 2010

Analyses of Copy Number Variation of GK Rat Reveal New Putative Type 2 Diabetes Susceptibility Loci

Zhi-Qiang Ye; Shen Niu; Yang Yu; Hui Yu; Bao-Hong Liu; Rongxia Li; Hua-Sheng Xiao; Rong Zeng; Yixue Li; Jiarui Wu; Yuan-Yuan Li

Great efforts have been made to search for genes responsible for type 2 diabetes (T2D), but they have yielded only about 20 in humans owing to the complexity and heterogeneity of the disease. The GK rat, a spontaneous T2D model, offers us a superior opportunity to search for more diabetic genes. Utilizing array comparative genome hybridization (aCGH) technology, we identified 137 non-redundant copy number variation (CNV) regions in the GK rats when using normal Wistar rats as control. These CNV regions (CNVRs) covered approximately 36 Mb, accounting for about 1% of the whole genome. By integrating information from gene annotations and disease knowledge, we investigated the CNVRs comprehensively to mine new T2D genes. As a result, we prioritized 16 putative protein-coding genes and two microRNA genes (rno-mir-30b and rno-mir-30d) as good candidates. The catalogue of CNVRs between GK and Wistar rats identified in this work serves as a repository for mining genes that might play roles in the pathogenesis of T2D. Moreover, our efforts in utilizing bioinformatics methods to prioritize good candidate genes provided a more specific set of putative candidates. These findings should contribute to research into the genetic basis of T2D, and thus shed light on its pathogenesis.


Protein and Peptide Letters | 2013

Inter- and intra-chain disulfide bond prediction based on optimal feature selection.

Shen Niu; Tao Huang; Kai-Yan Feng; Zhisong He; Weiren Cui; Lei Gu; Haipeng Li; Yu-Dong Cai; Yixue Li

Protein disulfide bonds are formed during post-translational modification and have been implicated in various physiological and pathological processes. Proper localization of disulfide bonds also facilitates the prediction of protein three-dimensional (3D) structure. However, determining disulfide bonds with conventional experimental approaches is both time-consuming and labor-intensive, especially for large-scale data sets. Since disulfide bond prediction based on 3D structure features also has some limitations, it is necessary to develop convenient and fast sequence-based computational methods for both inter- and intra-chain disulfide bond prediction. In this study, we developed a computational method for both types of disulfide bond prediction based on the maximum relevance and minimum redundancy (mRMR) method followed by incremental feature selection (IFS), with the nearest neighbor algorithm as its prediction model. Features of sequence conservation, residual disorder, and amino acid factor are used for inter-chain disulfide bond prediction. In addition to these features, the sequential distance between a pair of cysteines is used for intra-chain disulfide bond prediction. Our approach achieves a prediction accuracy of 0.8702 for inter-chain disulfide bond prediction using 128 features and 0.9219 for intra-chain disulfide bond prediction using 261 features. Analysis of the optimal feature set indicated key features and key sites for disulfide bond formation. Interestingly, comparison of the top features between inter- and intra-chain disulfide bonds revealed the similarities and differences in the mechanisms of forming these two types of disulfide bonds, which might help understand more of the mechanisms and provide clues for further experimental studies in this research field.
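The "sequential distance between a pair of cysteines" feature used for intra-chain prediction can be illustrated by enumerating all cysteine pairs in a chain. This is a minimal sketch, assuming the distance is simply the difference between 1-based residue positions; the abstract does not specify the exact definition, and the function name `cysteine_pairs` is hypothetical.

```python
from itertools import combinations


def cysteine_pairs(sequence):
    """List (pos_a, pos_b, distance) for every pair of cysteines in a chain.

    Positions are 1-based; distance is pos_b - pos_a (an assumed definition).
    """
    positions = [i + 1 for i, aa in enumerate(sequence) if aa == "C"]
    return [(a, b, b - a) for a, b in combinations(positions, 2)]
```

For example, `cysteine_pairs("ACDCKC")` returns `[(2, 4, 2), (2, 6, 4), (4, 6, 2)]`; each candidate pair would then be combined with the conservation, disorder and amino acid factor features before classification.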

Collaboration


Top Co-Authors

Tao Huang, Chinese Academy of Sciences
Yixue Li, Chinese Academy of Sciences
Kai-Yan Feng, University of Manchester
Haipeng Li, CAS-MPG Partner Institute for Computational Biology
Lulu Zheng, Huazhong University of Science and Technology
Xiao-He Shi, Shanghai Jiao Tong University