Publication


Featured research published by ShaoPeng Wang.


BioMed Research International | 2016

Analysis and Identification of Aptamer-Compound Interactions with a Maximum Relevance Minimum Redundancy and Nearest Neighbor Algorithm

ShaoPeng Wang; Yu-Hang Zhang; Jing Lu; Weiren Cui; Jerry Hu; Yu-Dong Cai

The development of biochemistry and molecular biology has revealed an increasingly important role of compounds in several biological processes. Like the aptamer-protein interaction, the aptamer-compound interaction attracts increasing attention. However, it is time-consuming to select proper aptamers against compounds using traditional methods, such as systematic evolution of ligands by exponential enrichment (SELEX). Thus, there is an urgent need to design effective computational methods for finding effective aptamers against compounds. This study attempted to extract important features for aptamer-compound interactions using feature selection methods, such as Maximum Relevance Minimum Redundancy, as well as incremental feature selection. Each aptamer-compound pair was represented by properties derived from the aptamer and compound, including frequencies of single nucleotides and dinucleotides for the aptamer, as well as the constitutional, electrostatic, quantum-chemical, and space conformational descriptors of the compounds. As a result, some important features were obtained. To confirm the importance of the obtained features, we further discussed the associations between them and aptamer-compound interactions. Simultaneously, an optimal prediction model based on the nearest neighbor algorithm was built to identify aptamer-compound interactions, which has the potential to be a useful tool for the identification of novel aptamer-compound interactions. The program is available upon request.
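The incremental feature selection step described in this abstract can be illustrated with a minimal sketch. It assumes a feature ranking is already available (for example from mRMR, which is not implemented here) and evaluates growing prefixes of that ranking with a 1-nearest-neighbor classifier. The function names, the toy data, the Euclidean distance, and the use of leave-one-out accuracy as the selection criterion are all assumptions made for illustration, not details taken from the paper.

```python
import math

def nearest_neighbor_predict(train_X, train_y, x):
    """Return the label of the training sample closest to x (Euclidean distance)."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[best]

def loo_accuracy(X, y, feature_idx):
    """Leave-one-out accuracy of 1-NN using only the given feature columns."""
    proj = [[row[j] for j in feature_idx] for row in X]
    correct = 0
    for i in range(len(proj)):
        train_X = proj[:i] + proj[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_neighbor_predict(train_X, train_y, proj[i]) == y[i]:
            correct += 1
    return correct / len(proj)

def incremental_feature_selection(X, y, ranked_features):
    """Evaluate the top-1, top-2, ... ranked features and keep the best prefix."""
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranked_features) + 1):
        acc = loo_accuracy(X, y, ranked_features[:k])
        if acc > best_acc:  # strict comparison keeps the smallest best prefix
            best_k, best_acc = k, acc
    return ranked_features[:best_k], best_acc

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.8], [1.0, 1.2]]
y = [0, 0, 1, 1]
selected, acc = incremental_feature_selection(X, y, ranked_features=[0, 1])
print(selected, acc)
```

On this toy data the noisy second feature lowers the leave-one-out accuracy, so only the first ranked feature is kept.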


IEEE Access | 2017

Identify Key Sequence Features to Improve CRISPR sgRNA Efficacy

Lei Chen; ShaoPeng Wang; Yu-Hang Zhang; JiaRui Li; Zhihao Xing; Jialiang Yang; Tao Huang; Yu-Dong Cai

The CRISPR/Cas9 system is a creative and innovative gene editing biotechnology tool in genetic engineering. Although several achievements have been attained using the CRISPR/Cas9 system, it is still a challenge to avoid off-target effects and improve the editing efficacy. Previous efforts on evaluating the efficacy and designing the guide RNA mainly focused on DNA properties. However, some DNA features have not been characterized but can be reflected by protein properties, such as the disorder features and the sequence conservation. In this paper, we provided a computational framework to identify important features related to the efficacy of CRISPR/Cas9, focusing on the properties of the proteins encoded by the target DNA fragments. The feature selection method, maximal-relevance-minimal-redundancy, was adopted to analyze these features. Incremental feature selection, together with a support vector machine, was then employed to extract optimal features, on which an optimal classifier can be constructed. As a result, 152 important features were extracted, with which an optimal classifier based on a support vector machine was built. This classifier obtained the highest MCC value of 0.355. Finally, a series of detailed biological analyses were performed on the optimal features. From the results, we found that some key factors may differentially affect the binding activity of sgRNAs to their targets. Among them, the disorder status of the target protein sequences was found to be a major factor that is related to the efficacy of sgRNAs, suggesting that the DNA features associated with the protein disorder status could also affect the CRISPR/Cas9 efficacy.


Journal of Cellular Biochemistry | 2018

Identification of gene expression signatures across different types of neural stem cells with the Monte-Carlo feature selection method

Lei Chen; JiaRui Li; Yu-Hang Zhang; Kai-Yan Feng; ShaoPeng Wang; YunHua Zhang; Tao Huang; Xiangyin Kong; Yu-Dong Cai

Adult neural stem cells (NSCs) are a group of multi‐potent, self‐renewing progenitor cells that contribute to the generation of new neurons and oligodendrocytes. Three subtypes of NSCs can be isolated based on the stages of the NSC lineage, including quiescent neural stem cells (qNSCs), activated neural stem cells (aNSCs) and neural progenitor cells (NPCs). Although it is widely accepted that these three groups of NSCs play different roles in the development of the nervous system, their molecular signatures are poorly understood. In this study, we applied the Monte‐Carlo Feature Selection (MCFS) method to identify the gene expression signatures, which can yield a Matthews correlation coefficient (MCC) value of 0.918 with a support vector machine evaluated by ten‐fold cross‐validation. In addition, some classification rules yielded by the MCFS program for distinguishing the above three subtypes were reported. Our results not only demonstrate a high classification capacity and subtype‐specific gene expression patterns but also quantitatively reflect the pattern of the gene expression levels across the NSC lineage, providing insight into deciphering the molecular basis of NSC differentiation.


International Journal of Molecular Sciences | 2017

Determination of Genes Related to Uveitis by Utilization of the Random Walk with Restart Algorithm on a Protein–Protein Interaction Network

Shiheng Lu; Yan Yan; Zhen Li; Lei Chen; Jing Yang; Yu-Hang Zhang; ShaoPeng Wang; Lin Liu

Uveitis, defined as inflammation of the uveal tract, may cause blindness in both young and middle-aged people. Approximately 10–15% of blindness in the West is caused by uveitis. Therefore, a comprehensive investigation to determine the disease pathogenesis is urgent, as it will make it possible to design effective treatments. Identification of the disease genes that cause uveitis is an important requirement to achieve this goal. To begin to address this need, in this study, a computational method was proposed to identify novel uveitis-related genes. This method was executed on a large protein–protein interaction network and employed a popular ranking algorithm, the Random Walk with Restart (RWR) algorithm. To improve the utility of the method, a permutation test and a procedure for selecting core genes were added, which helped to exclude false discoveries and select the most important candidate genes. Five-fold cross-validation was adopted to evaluate the method, yielding an average F1-measure of 0.189. In addition, we compared our method with a classic GBA-based method to further indicate its utility. Based on our method, 56 putative genes were chosen for further assessment. We have determined that several of these genes (e.g., CCL4, Jun, and MMP9) are likely to be important for the pathogenesis of uveitis.
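The Random Walk with Restart ranking mentioned in this abstract can be sketched as a simple fixed-point iteration. The example below is a minimal illustration on a hypothetical four-node undirected network with unweighted edges. The restart probability of 0.8, the convergence tolerance, the function name, and the node labels are assumptions chosen for the sketch and are not taken from the paper, which also adds a permutation test and a core-gene selection step that are not shown here.

```python
def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-8, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0, where W spreads each node's
    probability equally over its neighbors (column-normalized adjacency)."""
    nodes = sorted(adj)
    # The walker starts, and restarts, uniformly on the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if not adj[u]:
                continue  # isolated node: nothing to propagate
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical path-shaped network A - B - C - D with A as the known disease gene.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
seed_genes = {"A"}
scores = random_walk_with_restart(adj, seed_genes)
ranking = sorted((n for n in scores if n not in seed_genes),
                 key=scores.get, reverse=True)
print(ranking)
```

Non-seed nodes closer to the seed receive higher scores, which is the property that makes the stationary probabilities usable as a candidate-gene ranking.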


Protein and Peptide Letters | 2016

OPMSP: A Computational Method Integrating Protein Interaction and Sequence Information for the Identification of Novel Putative Oncogenes

Lei Chen; Baoman Wang; ShaoPeng Wang; Jing Yang; Jerry Hu; ZhiQun Xie; Yuwei Wang; Tao Huang; Yu-Dong Cai

Oncogenes are genes that have the potential to cause cancer. Oncogene research can provide insight into the occurrence and development of cancer, thereby helping to prevent cancer and to design effective treatments. This study proposes a network method called the oncogene prediction method based on shortest path algorithm (OPMSP) for the identification of novel oncogenes in a large protein network built using protein-protein interaction data. Novel putative genes were extracted from the shortest paths connecting any two known oncogenes. Then, they were filtered by a randomization test, and the linkages among them and known oncogenes were measured by protein interaction and sequence data. Thirty-seven new putative oncogenes were identified by this method. The enrichment analysis of the 37 putative oncogenes indicated that they are highly associated with several biological processes related to the initiation, progression and metastasis of tumors. Six of these genes (ESR1, CDK9, SEPT2, HOXA10, LMX1B, and NR2C2) are extensively discussed. Several lines of evidence indicate that they may be novel oncogenes.
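The idea of extracting putative genes from shortest paths between known oncogenes can be sketched as follows. This is a simplified illustration, not the OPMSP implementation: it assumes an unweighted network, uses breadth-first search to obtain one shortest path per pair of known genes, and counts how often each other node appears inside those paths. The network, the gene labels (`K1`, `X`, and so on), and the function names are hypothetical, and the randomization test and sequence-based filtering from the abstract are omitted.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, source, target):
    """Breadth-first search; returns one shortest path as a node list, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in adj.get(node, ()):
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None  # target is unreachable from source

def candidate_genes(adj, known):
    """Count how often each non-seed node lies inside a shortest path
    connecting two known genes."""
    counts = {}
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(adj, a, b)
        if path is None:
            continue
        for node in path[1:-1]:  # skip the two endpoints
            if node not in known:
                counts[node] = counts.get(node, 0) + 1
    return counts

# Hypothetical network: K1, K2, K3 are known genes; X, Y, Z are unlabeled.
adj = {
    "K1": ["X"],
    "X": ["K1", "K2", "Y"],
    "K2": ["X"],
    "Y": ["X", "K3"],
    "K3": ["Y"],
    "Z": [],
}
counts = candidate_genes(adj, known={"K1", "K2", "K3"})
print(counts)
```

Nodes with high counts are the natural candidates for follow-up filtering. The isolated node `Z` never appears on a path and is therefore never proposed.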


Molecular Genetics and Genomics | 2018

Discriminating cirRNAs from other lncRNAs using a hierarchical extreme learning machine (H-ELM) algorithm with feature selection

Lei Chen; Yu-Hang Zhang; Guohua Huang; Xiaoyong Pan; ShaoPeng Wang; Tao Huang; Yu-Dong Cai

As non-coding RNAs, circular RNAs (cirRNAs) and long non-coding RNAs (lncRNAs) have attracted an increasing amount of attention. They have been confirmed to participate in many biological processes, including playing roles in transcriptional regulation, regulating protein-coding genes, and binding to RNA-associated proteins. Until now, the differences between these two types of non-coding RNAs have not been fully uncovered. It is still quite difficult to distinguish cirRNAs from other lncRNAs using simple techniques. In this study, we investigated these two types of non-coding RNAs using several computational methods. The purpose was to extract important factors that could distinguish cirRNAs from other lncRNAs and build an effective classification model to distinguish them. First, we collected cirRNAs, lncRNAs and their representations from a previous study, in which each cirRNA or lncRNA was represented by 188 features derived from its graph representation, sequence and conservation properties. Second, these features were analyzed by the minimum redundancy maximum relevance (mRMR) method. The obtained mRMR feature list, the incremental feature selection method and the hierarchical extreme learning machine algorithm were employed to build an optimal classification model with a sensitivity of 0.703, a specificity of 0.850, an accuracy of 0.789 and a Matthews correlation coefficient of 0.561. Finally, we analyzed the 16 most important features. Of them, the sequences and structures of the RNA molecule were top ranking, implying they can be potential indicators of differences between cirRNAs and other lncRNAs. Meanwhile, other features of evolutionary conservation and sequence consecution were also important.
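The four performance measures reported in this abstract all derive from a binary confusion matrix. The sketch below shows the standard definitions. The function name and the counts passed to it are hypothetical and are not the paper's data, which the abstract does not provide.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Matthews correlation coefficient
    from the four cells of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)   # fraction of positives recovered
    specificity = tn / (tn + fp)   # fraction of negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=70, fn=30, tn=85, fp=15)
print(metrics)
```

Unlike accuracy, the MCC uses all four cells symmetrically, which is why it is often preferred when the two classes differ in size.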


Combinatorial Chemistry & High Throughput Screening | 2017

Analysis and Prediction of Myristoylation Sites Using the mRMR Method, the IFS Method and an Extreme Learning Machine Algorithm

ShaoPeng Wang; Yu-Hang Zhang; Guohua Huang; Lei Chen; Yu-Dong Cai

BACKGROUND: Myristoylation is an important hydrophobic post-translational modification that is covalently bound to the amino group of Gly residues on the N-terminus of proteins. The many diverse functions of myristoylation on proteins, such as membrane targeting, signal pathway regulation and apoptosis, are largely due to the lipid modification, whereas abnormal or irregular myristoylation on proteins can lead to several pathological changes in the cell. OBJECTIVE: To better understand the function of myristoylated sites and to correctly identify them in protein sequences, this study conducted a novel computational investigation on identifying myristoylation sites in protein sequences. MATERIALS AND METHODS: A training dataset with 196 positive and 84 negative peptide segments was obtained. Four types of features derived from the peptide segments following the myristoylation sites were used to specify myristoylated and non-myristoylated sites. Then, feature selection methods including maximum relevance and minimum redundancy (mRMR), incremental feature selection (IFS), and a machine learning algorithm (extreme learning machine method) were adopted to extract optimal features for the algorithm to identify myristoylation sites in protein sequences, thereby building an optimal prediction model. RESULTS: As a result, 41 key features were extracted and used to build an optimal prediction model. The effectiveness of the optimal prediction model was further validated by its performance on a test dataset. Furthermore, detailed analyses were also performed on the extracted 41 features to gain insight into the mechanism of myristoylation modification. CONCLUSION: This study provided a new computational method for identifying myristoylation sites in protein sequences. We believe that it can be a useful tool to predict myristoylation sites from protein sequences.


International Journal of Cancer | 2018

Gene expression differences among different MSI statuses in colorectal cancer

Lei Chen; Xiaoyong Pan; XiaoHua Hu; Yu-Hang Zhang; ShaoPeng Wang; Tao Huang; Yu-Dong Cai

Colorectal cancer is the third most common cancer in males and second in females. This disease can be caused by genetic and acquired/environmental factors. Microsatellite instability (MSI) is one of the major mechanisms in colorectal cancer. This mechanism is a specific condition of genetic hypermutability that results from incompetent DNA mismatch repair. MSI has been applied to classify different colorectal cancer subtypes. However, the effects of MSI status on gene expression are largely unknown. In our study, we integrated the gene expression profile and MSI status of all CRC samples from the TCGA database, and then categorized the CRC samples into three subgroups, namely, MSI‐stable, MSI‐low, and MSI‐high, according to the MSI status. We applied a novel computational method based on machine learning and screened the genes specifically expressed for the different colorectal cancer subtypes. The results showed the distinct mechanisms of the different colorectal cancer subtypes with MSI status and provided the genes that may be the optimal standards to further classify the various molecular subtypes of colorectal cancer with distinct MSI status.


Scientific Reports | 2017

Identification of the core regulators of the HLA I-peptide binding process

Yu-Hang Zhang; Zhihao Xing; Chenglin Liu; ShaoPeng Wang; Tao Huang; Yu-Dong Cai; Xiangyin Kong

During the display of the peptide/human leukocyte antigen (HLA)-I complex for further immune recognition, the cleaved and transported antigenic peptides have to bind to the HLA-I protein, and the binding affinity between peptide epitopes and HLA proteins directly influences the immune recognition ability in human beings. Key factors affecting the binding affinity during the generation, selection and presentation processes of the HLA-I complex have not yet been fully discovered. In this study, a new method describing the HLA class I-peptide interactions was proposed. Three hundred and forty features of HLA I proteins and peptide sequences were utilized for analysis by four candidate algorithms, screening the optimal classifier. Features derived from the optimal classifier were further selected and systematically analyzed, revealing the core regulators. The results validated the hypothesis that features of HLA I proteins and related peptides simultaneously affect the binding process, though with discrepant redundancy. Besides, the high relative ratio (16/20) of the amino acid composition features suggests the unique role of sequence signatures for the binding processes. Integrating biological, evolutionary and chemical features of both HLA I molecules and peptides, this study may provide a new perspective of the underlying mechanisms of HLA I-mediated immune reactions.


BioMed Research International | 2015

Mining for Candidate Genes Related to Pancreatic Cancer Using Protein-Protein Interactions and a Shortest Path Approach

Fei Yuan; Yu-Hang Zhang; Sibao Wan; ShaoPeng Wang; Xiangyin Kong

Pancreatic cancer (PC) is a highly malignant tumor derived from pancreas tissue and is one of the leading causes of death from cancer. Its molecular mechanism has been partially revealed by validating its oncogenes and tumor suppressor genes; however, the available data remain insufficient for medical workers to design effective treatments. Large-scale identification of PC-related genes can promote studies on PC. In this study, we propose a computational method for mining new candidate PC-related genes. A large network was constructed using protein-protein interaction information, and a shortest path approach was applied to mine new candidate genes based on validated PC-related genes. In addition, a permutation test was adopted to further select key candidate genes. Finally, for all discovered candidate genes, the likelihood that the genes are novel PC-related genes is discussed based on their currently known functions.

Collaboration


Top Co-Authors

Yu-Hang Zhang, Chinese Academy of Sciences

Lei Chen, Shanghai Maritime University

Tao Huang, Chinese Academy of Sciences

Xiangyin Kong, Chinese Academy of Sciences

YunHua Zhang, Anhui Agricultural University

Fei Yuan, Chinese Academy of Sciences