Publication


Featured research published by Yingli Lv.


Genomics | 2011

Predicting human microRNA precursors based on an optimized feature subset generated by GA-SVM.

Yanqiu Wang; Xiaowen Chen; Wei Jiang; Li Li; Wei Li; Lei Yang; Mingzhi Liao; Baofeng Lian; Yingli Lv; Shiyuan Wang; Shuyuan Wang; Xia Li

MicroRNAs (miRNAs) are non-coding RNAs that play important roles in post-transcriptional regulation. Identification of miRNAs is crucial to understanding their biological mechanism. Recently, machine-learning approaches have been employed to predict miRNA precursors (pre-miRNAs). However, the features used differ between methods and consequently lead to different performance. Thus, feature selection is critical for pre-miRNA prediction. We generated an optimized feature subset of 13 features using a hybrid of genetic algorithm and support vector machine (GA-SVM). In five-fold cross-validation with an SVM, the classification performance of the optimized feature subset was much higher than that of the two feature sets used in microPred and miPred. Finally, we constructed the classifier miR-SF to predict the most recently identified human pre-miRNAs in miRBase (version 16). Compared with microPred and miPred, miR-SF achieved much higher classification performance: accuracies were 93.97%, 86.21% and 64.66% for miR-SF, microPred and miPred, respectively. Thus, miR-SF is effective for identifying pre-miRNAs.
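The wrapper idea behind GA-based feature selection can be sketched as follows. This is a minimal illustration and not the authors' implementation: a leave-one-out nearest-centroid classifier stands in for the SVM so that the sketch needs only the standard library, and the function names `centroid_accuracy` and `ga_select`, the toy data, the per-feature penalty and all GA settings are assumptions made for the example.

```python
import random

def centroid_accuracy(rows, labels, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the kept features.
    (Stand-in for the SVM used in the paper; assumes >= 2 samples per class.)"""
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for held in range(len(rows)):
        centroids = {}
        for cls in sorted(set(labels)):
            members = [rows[r] for r in range(len(rows))
                       if r != held and labels[r] == cls]
            centroids[cls] = [sum(m[c] for m in members) / len(members) for c in cols]
        point = [rows[held][c] for c in cols]
        guess = min(centroids,
                    key=lambda cls: sum((p - q) ** 2
                                        for p, q in zip(point, centroids[cls])))
        correct += guess == labels[held]
    return correct / len(rows)

def ga_select(rows, labels, generations=20, pop_size=12, mutation=0.1, seed=0):
    """Evolve boolean feature masks; assumes at least two features."""
    rng = random.Random(seed)
    n = len(rows[0])

    def fitness(mask):
        # Hypothetical penalty of 0.01 per kept feature favours compact subsets.
        return centroid_accuracy(rows, labels, mask) - 0.01 * sum(mask)

    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]        # top half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)                # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation else g for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Toy data (hypothetical): only feature 0 separates the two classes.
rows = [
    [0.9, 0.5, 0.1, 0.3],
    [1.1, 0.4, 0.2, 0.7],
    [1.0, 0.6, 0.3, 0.5],
    [3.0, 0.5, 0.2, 0.4],
    [3.2, 0.4, 0.1, 0.6],
    [2.9, 0.6, 0.3, 0.5],
]
labels = [0, 0, 0, 1, 1, 1]
best = ga_select(rows, labels)
print("selected feature mask:", best)
```

Keeping the top half of each generation means the best fitness seen so far never decreases from one generation to the next. Swapping `centroid_accuracy` for a cross-validated SVM score would bring the sketch closer to the GA-SVM setting described in the abstract.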


Bioinformatics | 2006

Effects of replacing the unreliable cDNA microarray measurements on the disease classification based on gene expression profiles and functional modules

D. Wang; Yingli Lv; Zheng Guo; Xia Li; Yanhui Li; Jing Zhu; Da Yang; Jianzhen Xu; Chenguang Wang; Shaoqi Rao; Baofeng Yang

MOTIVATION: Microarray datasets frequently contain a large number of missing values (MVs), which need to be estimated and replaced for subsequent data mining. The focus of the paper is to study the effects of different MV treatments for cDNA microarray data on disease classification analysis. RESULTS: By analyzing five datasets, we demonstrate that among the three kinds of classifiers evaluated in this study, support vector machine (SVM) classifiers are robust to varied MV imputation methods [e.g. replacing MVs by zero, K nearest-neighbor (KNN) imputation, local least squares imputation and Bayesian principal component analysis], while the classification and regression tree classifiers are sensitive in terms of classification accuracy. The KNN classifiers built on differentially expressed genes (DEGs) are robust to the varied MV treatments, but the performance of the KNN classifiers based on all measured genes can deteriorate significantly when MVs are imputed for genes with a larger missing rate (MR) (e.g. MR > 5%). Generally, while replacing MVs by zero performs relatively poorly, the other imputation algorithms differ little in their effect on the classification performance of the SVM or KNN classifiers. We further demonstrate the power and feasibility of our recently proposed functional expression profile (FEP) approach as a means to handle microarray data with MVs. The FEPs, which are derived from the functional modules that are enriched with sets of DEGs and thus can be consistently identified under varied MV treatments, achieve precise disease classification with better biological interpretation. We conclude that the choice of MV treatment should be determined in the context of the later approaches used for disease classification. The suggested exclusion criterion of ignoring the genes with larger MR (e.g. >5%), while justifiable for some classifiers such as KNN classifiers, might not be considered a general rule for all classifiers.
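As an illustration of one of the MV treatments named above, KNN imputation can be sketched in a few lines. This is a simplified version under stated assumptions rather than the procedure evaluated in the paper: rows are genes, columns are samples, `None` marks a missing value, distance is the root-mean-square difference over columns observed in both genes, and the function name `knn_impute`, the value of `k` and the toy matrix are hypothetical.

```python
import math

def knn_impute(matrix, k=2):
    """Fill each missing entry with the mean of that column taken over the
    k most similar genes that have the column observed."""
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf                      # no common samples to compare
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]         # leave the input untouched
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            # Fall back to zero replacement when no usable neighbour exists.
            imputed[i][j] = sum(nearest) / len(nearest) if nearest else 0.0
    return imputed

# Toy expression matrix (hypothetical): gene 0 lacks its third measurement.
expr = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [5.0, 6.0, 9.0],
]
filled = knn_impute(expr, k=2)
print(filled[0])
```

Here the two genes closest to gene 0 are genes 1 and 2, so the missing entry is filled with the average of 3.0 and 3.2, while the dissimilar gene 3 is ignored.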


Briefings in Bioinformatics | 2012

Dissection of human MiRNA regulatory influence to subpathway

Xia Li; Wei Jiang; Wei Li; Baofeng Lian; Shuyuan Wang; Mingzhi Liao; Xiaowen Chen; Yanqiu Wang; Yingli Lv; Shiyuan Wang; Lei Yang

The relationships between miRNAs and their regulatory influences remain poorly understood at a global level. Moreover, most complex diseases may be attributed to certain local areas of a pathway (subpathways) rather than to the entire pathway. Here, we reviewed the studies on miRNA regulation of pathways and constructed a bipartite network of miRNAs and subpathways to systematically analyze the regulatory influences of miRNAs on subpathways. We found that a small fraction of miRNAs were global regulators, that environmental information processing pathways were preferentially regulated by miRNAs, and that miRNAs had a synergistic effect on regulating groups of subpathways with similar function. Integrating the disease states of miRNAs, we also found that disease miRNAs regulated more subpathways than nondisease miRNAs and that, for all miRNAs, the number of regulated subpathways was not in proportion to the number of related diseases. Therefore, the study not only provided a global view of the relationships among disease, miRNA and subpathway, but also uncovered functional aspects of miRNA regulation and the potential pathogenesis of complex diseases. A web server for querying, visualizing and downloading all the data can be freely accessed at http://bioinfo.hrbmu.edu.cn/miR2Subpath.


Gene | 2014

Computational identification of human long intergenic non-coding RNAs using a GA-SVM algorithm.

Yanqiu Wang; Yang Li; Qi Wang; Yingli Lv; Shiyuan Wang; Xi Chen; Xuexin Yu; Wei Jiang; Xia Li

Long intergenic non-coding RNAs (lincRNAs) are a new type of non-coding RNA and are closely related to the occurrence and development of diseases. In previous studies, most lincRNAs have been identified through next-generation sequencing. Because lincRNAs exhibit tissue-specific expression, the reproducibility of lincRNA discovery across studies is very poor. In this study, without relying on lincRNA expression, we used sequence, structural and protein-coding potential features to construct a classifier that can distinguish lincRNAs from non-lincRNAs. The GA-SVM algorithm was applied to extract the optimized feature subset. Compared with several other feature subsets, the five-fold cross-validation results showed that this optimized feature subset exhibited the best performance for the identification of human lincRNAs. Moreover, the LincRNA Classifier based on Selected Features (linc-SF) was constructed with a support vector machine (SVM) based on the optimized feature subset. The performance of this classifier was further evaluated by predicting lincRNAs from two independent lincRNA sets. Because the recognition rates for the two lincRNA sets were 100% and 99.8%, linc-SF was found to be effective for the prediction of human lincRNAs.


Bioinformatics | 2015

Identifying novel associations between small molecules and miRNAs based on integrated molecular networks

Yingli Lv; Shuyuan Wang; Fanlin Meng; Lei Yang; Zhifeng Wang; Jing Wang; Xiaowen Chen; Wei Jiang; Yixue Li; Xia Li

MOTIVATION: miRNAs play crucial roles in human diseases and have recently been found to be targetable by small molecule (SM) drug compounds. Thus, the identification of SMs that target dysregulated miRNAs in cancers will provide new insight into cancer biology and accelerate drug discovery for cancer therapy. RESULTS: In this study, we aimed to develop a novel computational method to comprehensively identify associations between SMs and miRNAs. To this end, exploiting multiple molecular interaction databases, we first established an integrated SM-miRNA association network based on 690 561 SM to SM interactions, 291 600 miRNA to miRNA associations, as well as 664 known SM to miRNA targeting pairs. Then, by performing the Random Walk with Restart algorithm on the integrated network, we prioritized the miRNAs associated with each of the SMs. Validating our results against an independent dataset, we obtained an area under the ROC curve greater than 0.7. Furthermore, comparisons indicated that our integrated approach significantly improved identification performance over simpler modeling methods. This computational framework, as well as the prioritized SM-miRNA targeting relationships, will promote the further development of targeted cancer therapies. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
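The Random Walk with Restart step can be sketched on a tiny network. This is a minimal illustration, not the authors' pipeline: the node names, the unit edge weights, the restart probability of 0.7 and the helper names `build_adjacency` and `random_walk_with_restart` are all assumptions made for the example.

```python
def build_adjacency(edges):
    """Undirected weighted graph as a dict: node -> {neighbour: weight}."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, {})[b] = w
        adj.setdefault(b, {})[a] = w
    return adj

def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change drops below tol.
    W is the column-normalised adjacency matrix; p0 is uniform over the seeds."""
    nodes = sorted(adj)
    out_weight = {n: sum(adj[n].values()) for n in nodes}
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for i in nodes:
            # Probability mass flowing into i from each neighbour j.
            flow = sum(p[j] * w / out_weight[j]
                       for j, w in adj[i].items() if out_weight[j] > 0)
            nxt[i] = (1 - restart) * flow + restart * p0[i]
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p

# Hypothetical integrated network: two small molecules and three miRNAs.
edges = [
    ("SM1", "SM2", 1.0),      # SM-SM similarity
    ("SM1", "miR_a", 1.0),    # known SM-miRNA targeting pair
    ("SM2", "miR_b", 1.0),    # known SM-miRNA targeting pair
    ("miR_a", "miR_c", 1.0),  # miRNA-miRNA association
]
adj = build_adjacency(edges)
scores = random_walk_with_restart(adj, seeds={"SM1"})
ranked = sorted((n for n in scores if n.startswith("miR")),
                key=scores.get, reverse=True)
print(ranked)
```

The steady-state scores give a ranking of miRNAs for the seed small molecule: nodes that are close to the seed in the network, such as its direct target `miR_a`, receive more probability mass than nodes reachable only through longer paths.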


Journal of Theoretical Biology | 2014

Analysis and identification of toxin targets by topological properties in protein-protein interaction network.

Lei Yang; Jizhe Wang; Huiping Wang; Yingli Lv; Yongchun Zuo; Wei Jiang

Proteins do not exert their function in isolation from one another, but interact in protein-protein interaction (PPI) networks. Analysis of the topological properties of proteins in the PPI network is very helpful for understanding the function of proteins. However, until recently, no one had investigated toxin targets by topological properties. In this study, for the first time, 12 topological properties are used to investigate the characteristics of toxin targets in the PPI network. Most of the topological properties are found to be statistically discriminative between toxin targets and other proteins, and toxin targets tend to play more important roles in the PPI network. In addition, based on the topological properties and the sequence information, a support vector machine (SVM) is used to predict toxin targets. The results obtained by the jackknife test and 10-fold cross-validation are encouraging, indicating that SVM is a useful tool for predicting toxin targets, or at least can play a complementary role in relevant areas.
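Two widely used topological properties, node degree and the local clustering coefficient, can be computed as in the sketch below. The abstract does not list which 12 properties were used, so these two are chosen purely for illustration, and the protein names and helper functions are hypothetical.

```python
from itertools import combinations

def build_graph(edges):
    """Undirected, unweighted graph as a dict: node -> set of neighbours."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degree(graph, node):
    """Number of direct interaction partners."""
    return len(graph[node])

def clustering_coefficient(graph, node):
    """Fraction of neighbour pairs that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))

# Toy PPI network (hypothetical proteins A-D).
ppi = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")])
for protein in sorted(ppi):
    print(protein, degree(ppi, protein), clustering_coefficient(ppi, protein))
```

Per-protein values like these can then be compared between two groups, for example toxin targets and other proteins, with a statistical test, or passed to a classifier as features.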


Scientific Reports | 2017

Prediction of presynaptic and postsynaptic neurotoxins by combining various Chou’s pseudo components

Haiyan Huo; Tao Li; Shiyuan Wang; Yingli Lv; Yongchun Zuo; Lei Yang

Presynaptic and postsynaptic neurotoxins are two groups of neurotoxins. Distinguishing presynaptic from postsynaptic neurotoxins is an important task for the numerous newly found toxins. It is both costly and time consuming to determine these two types of neurotoxins by experimental methods. As a complement, computational methods for predicting presynaptic and postsynaptic neurotoxins could provide useful information in a timely manner. In this study, we described four algorithms for predicting presynaptic and postsynaptic neurotoxins from sequence-derived features: Increment of Diversity (ID), Multinomial Naive Bayes Classifier (MNBC), Random Forest (RF), and K-nearest Neighbours Classifier (IBK). Each protein sequence was encoded by pseudo amino acid (PseAA) compositions and three biological motif features, including MEME, Prosite and InterPro motif features. The Maximum Relevance Minimum Redundancy (MRMR) feature selection method was used to rank the PseAA compositions, and the 50 top-ranked features were selected to improve the prediction accuracy. The PseAA compositions and the three kinds of biological motif features were combined, and 12 different parameters, defined as P1-P12, were selected as the input parameters of ID, MNBC, RF, and IBK. The prediction results obtained in this study were significantly better than those of previously developed methods.


Biochemical and Biophysical Research Communications | 2014

Characterization of essential genes by topological properties in the perturbation sensitivity network

Lei Yang; Jizhe Wang; Huiping Wang; Yingli Lv; Yongchun Zuo; Wei Jiang

Genes that are indispensable for survival are called essential genes. In recent years, the analysis of essential genes has become extremely important for understanding the way a cell functions. With the advent of large-scale gene expression profiling technologies, it is now possible to profile transcriptional changes in the entire genome of Saccharomyces cerevisiae. Notwithstanding the accumulation of gene expression profiling in recent years, only a few studies have used these data to construct the network for S. cerevisiae. In this paper, based on the transcriptional profiling of the S. cerevisiae genome in hundreds of different gene disruptions, the perturbation sensitivity (PS) network is constructed. A scale-free topology with node degree following a power-law distribution is shown in the PS network. Twelve topological properties are used to investigate the characteristics of essential and non-essential genes in the PS network. Most of the properties are found to be statistically discriminative between essential and non-essential genes. In addition, the F-score is used to estimate the essentiality of each property, and the core number demonstrates the highest F-score among all properties.
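The abstract does not spell out its F-score formula. Assuming it is the feature-discrimination F-score commonly used for feature ranking (squared deviations of the two class means from the overall mean, divided by the sum of the within-class sample variances), the calculation can be sketched as follows. The function name `f_score` and the toy property values are hypothetical.

```python
def f_score(pos, neg):
    """Discrimination score of one property between two groups,
    e.g. essential (pos) versus non-essential (neg) genes.
    Assumes at least two values per group."""
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    numerator = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    var_pos = sum((x - mean_pos) ** 2 for x in pos) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg) / (n_neg - 1)
    denominator = var_pos + var_neg
    if denominator == 0:
        # Both groups are constant: perfectly separated or indistinguishable.
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator

# Toy values of two properties (hypothetical numbers).
properties = {
    "core_number": ([5, 6, 7], [1, 2, 3]),   # well separated between groups
    "degree":      ([1, 2, 3], [1, 2, 3]),   # identical in both groups
}
ranking = sorted(properties, key=lambda name: f_score(*properties[name]),
                 reverse=True)
print(ranking)
```

A larger score means the property separates the two groups more clearly, which is the sense in which a single property can have the highest F-score among all properties.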


Journal of Theoretical Biology | 2014

Human proteins characterization with subcellular localizations

Lei Yang; Yingli Lv; Tao Li; Yongchun Zuo; Wei Jiang

Proteins are responsible for performing the vast majority of cellular functions that are critical to a cell's survival. Knowledge of the subcellular localization of proteins can provide valuable information about their molecular functions. Therefore, one of the fundamental goals in cell biology and proteomics is to analyze the subcellular localizations and functions of these proteins. Recent large-scale human genomics and proteomics studies have made it possible to characterize human proteins at the subcellular localization level. In this study, according to the annotation in Swiss-Prot, 8842 human proteins were classified into seven subcellular localizations. Human proteins in the seven subcellular localizations were compared by using topological properties, biological properties, codon usage indices, mRNA expression levels, protein complexity and physicochemical properties. All these properties were found to be significantly different among the seven categories. In addition, based on these properties and pseudo-amino acid compositions, a machine learning classifier was built for the prediction of protein subcellular localization. The study presented here was an attempt to use the aforementioned properties to compare human proteins of different subcellular localizations. We hope that our findings may help in the prediction of protein subcellular localization and in understanding the general function of human proteins in cells.


Gene | 2014

Analysis and identification of essential genes in humans using topological properties and biological information

Lei Yang; Jizhe Wang; Huiping Wang; Yingli Lv; Yongchun Zuo; Xiang Li; Wei Jiang

Genes that are indispensable for survival are termed essential genes. The analysis and identification of essential genes are very important for understanding the minimal requirements of cellular survival and for practical purposes. Proteins do not exert their function in isolation from one another but rather interact in PPI networks. A global analysis of protein interaction networks provides an effective way to elucidate the relationships between proteins. With the recent large-scale identification of essential genes and the production of large amounts of PPI data in humans, we are able to investigate the topological and biological properties of essential genes. However, until recently, no one had investigated human essential genes using topological and biological properties. In this study, for the first time, 28 topological properties and 22 biological properties were used to investigate the characteristics of essential and non-essential genes in humans. Most of the properties were statistically discriminative between essential and non-essential genes. The F-score was used to estimate the essentiality of each property. GO-enrichment analysis was performed to investigate the functions of the essential and non-essential genes. Finally, based on the topological features and the biological characteristics, a machine-learning classifier was constructed to predict the essential genes. The results of the jackknife test and 10-fold cross-validation test are encouraging, indicating that our classifier is an effective method for discovering human essential genes.

Collaboration


Dive into Yingli Lv's collaborations.

Top Co-Authors

Lei Yang, Harbin Medical University

Wei Jiang, Harbin Medical University

Xiaowen Chen, Harbin Medical University

Shiyuan Wang, Harbin Medical University

Xia Li, Harbin Medical University

Shuyuan Wang, Harbin Medical University

Jizhe Wang, Harbin Medical University

Dapeng Hao, Harbin Medical University