Publication


Featured research published by Ao Li.


BMC Bioinformatics | 2006

PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory

Yu Xue; Ao Li; Lirong Wang; Huanqing Feng; Xuebiao Yao

Background: As a reversible and dynamic post-translational modification (PTM) of proteins, phosphorylation plays essential regulatory roles in a broad spectrum of biological processes. Although many studies have addressed the molecular mechanism of phosphorylation dynamics, the intrinsic features of substrate specificity are still elusive and remain to be delineated. Results: In this work, we present a novel, versatile and comprehensive program, PPSP (Prediction of PK-specific Phosphorylation site), built on Bayesian decision theory (BDT). PPSP can accurately predict potential phosphorylation sites for ~70 PK (protein kinase) groups. Compared with four existing tools, Scansite, NetPhosK, KinasePhos and GPS, PPSP is more accurate and powerful. Moreover, PPSP also provides predictions for many novel PKs, such as TRK, mTOR, SyK and MET/RON. The accuracy for these novel PKs is also satisfactory. Conclusion: Taken together, we propose that PPSP could be a powerful tool for experimentalists who focus on phosphorylation substrates and the identification of their PK-specific sites. Moreover, the BDT strategy could also be a general approach for other PTMs, such as sumoylation and ubiquitination.


Nucleic Acids Research | 2005

LOCSVMPSI: a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST

Dan Xie; Ao Li; Minghui Wang; Zhewen Fan; Huanqing Feng

The subcellular location of a protein is one of its key functional characteristics, as proteins must be localized correctly at the subcellular level to function normally. In this paper, a novel method named LOCSVMPSI is introduced, which is based on the support vector machine (SVM) and the position-specific scoring matrix generated from PSI-BLAST profiles. With a jackknife test on the RH2427 data set, LOCSVMPSI achieved a high overall prediction accuracy of 90.2%, which is higher than the results obtained by SubLoc and ESLpred on this data set. In addition, the prediction performance of LOCSVMPSI was evaluated with a 5-fold cross-validation test on the PK7579 data set, and the results were consistently better than those of the previous method based on several SVMs using the composition of both amino acids and amino acid pairs. A further test on the SWISSPROT new-unique data set showed that LOCSVMPSI also performed better than some widely used prediction methods, such as PSORTII, TargetP and LOCnet. All these results indicate that LOCSVMPSI is a powerful tool for the prediction of eukaryotic protein subcellular localization. An online web server (current version 1.3) based on this method has been developed and is freely available to both academic and commercial users.


Nature Genetics | 2006

Differentiated cells are more efficient than adult stem cells for cloning by somatic cell nuclear transfer

Li-Ying Sung; Shaorong Gao; Hongmei Shen; Hui Yu; Yifang Song; Sadie Smith; C.-C. Chang; Kimiko Inoue; Lynn Kuo; Jin Lian; Ao Li; X. Cindy Tian; David Tuck; Sherman M. Weissman; Xiangzhong Yang; Tao Cheng

Since the creation of Dolly via somatic cell nuclear transfer (SCNT), more than a dozen species of mammals have been cloned using this technology. One hypothesis for the limited success of cloning via SCNT (1%–5%) is that the clones are likely to be derived from adult stem cells. Support for this hypothesis comes from the findings that the reproductive cloning efficiency for embryonic stem cells is five to ten times higher than that for somatic cells as donors and that cloned pups cannot be produced directly from cloned embryos derived from differentiated B and T cells or neuronal cells. The question remains as to whether SCNT-derived animal clones can be derived from truly differentiated somatic cells. We tested this hypothesis with mouse hematopoietic cells at different differentiation stages: hematopoietic stem cells, progenitor cells and granulocytes. We found that cloning efficiency increases over the differentiation hierarchy, and terminally differentiated postmitotic granulocytes yield cloned pups with the greatest cloning efficiency.


BMC Bioinformatics | 2006

Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme

Xian Wang; Ao Li; Zhaohui Jiang; Huanqing Feng

Background: Gene expression profiling has become a useful biological resource in recent years, and it plays an important role in a broad range of areas in biology. The raw gene expression data, usually in the form of a large matrix, may contain missing values. Downstream analysis methods that require a complete matrix as input are thus not applicable. Several methods have been developed to solve this problem, such as the K-nearest-neighbor imputation method and the Bayesian principal components analysis imputation method. In this paper, we introduce a novel imputation approach based on the Support Vector Regression (SVR) method. The proposed approach utilizes an orthogonal coding input scheme, which makes use of multiple missing values in one row of a gene expression profile and imputes the missing value in a much higher dimensional space, to obtain better performance. Results: A comparative study of our method and previously developed methods is presented for the estimation of missing values on six gene expression data sets. Among the three input-vector coding schemes we tried, the orthogonal input coding scheme obtains the best estimation results, with the minimum Normalized Root Mean Squared Error (NRMSE). The results also demonstrate that the SVR method has powerful estimation ability on different kinds of data sets, with relatively small NRMSE. Conclusion: The SVR imputation method performs better than, or at least comparably with, the previously developed methods in the present research. The outstanding estimation ability of this method is partly due to its use of most of the missing-value information through the orthogonal input coding scheme. In addition, the solid theoretical foundation of the SVR method, together with the orthogonal input coding scheme, also helps estimation performance. The promising estimation ability demonstrated in the Results section suggests that the proposed approach provides a proper solution to the missing value estimation problem. The source code of the SVR method is available from http://202.38.78.189/downloads/svrimpute.html for non-commercial use.
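The abstract above ranks imputation methods by NRMSE but does not spell out the formula. The following is a minimal sketch of one common definition, assuming the error is normalized by the standard deviation of the true values at the masked positions; the function name `nrmse` and that choice of normalizer are assumptions for illustration, not taken from the paper.

```python
import math
import statistics


def nrmse(estimated, true):
    """Root mean squared error of the imputed entries divided by the
    population standard deviation of the true entries.

    Assumption: normalization by the standard deviation of the true
    values; the abstract itself does not state the normalizer.
    """
    if len(estimated) != len(true) or len(true) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    # Mean squared difference between imputed and true values.
    mse = sum((e - t) ** 2 for e, t in zip(estimated, true)) / len(true)
    spread = statistics.pstdev(true)
    if spread == 0:
        raise ValueError("true values have zero variance")
    return math.sqrt(mse) / spread


# Imputing every masked entry with the mean of the true values gives 1.0
# under this definition, so a useful method should score below that.
true_values = [1.0, 2.0, 3.0, 4.0]
mean_imputed = [2.5] * 4
baseline = nrmse(mean_imputed, true_values)
```

Under this definition a value near 0 means the imputed entries almost match the held-out values, while a value near 1 means the method does no better than filling in the mean.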


Nucleic Acids Research | 2011

GPHMM: an integrated hidden Markov model for identification of copy number alteration and loss of heterozygosity in complex tumor samples using whole genome SNP arrays

Ao Li; Zongzhi Liu; Kimberly Lezon-Geyda; Sudipa Sarkar; Donald Lannin; Vincent Schulz; Ian E. Krop; Lyndsay Harris; David Tuck

There is an increasing interest in using single nucleotide polymorphism (SNP) genotyping arrays for profiling chromosomal rearrangements in tumors, as they allow simultaneous detection of copy number and loss of heterozygosity with high resolution. Critical issues such as signal baseline shift due to aneuploidy, normal cell contamination, and the presence of GC content bias have been reported to dramatically alter SNP array signals and complicate accurate identification of aberrations in cancer genomes. To address these issues, we propose a novel Global Parameter Hidden Markov Model (GPHMM) to unravel tangled genotyping data generated from tumor samples. In contrast to other HMM methods, a distinct feature of GPHMM is that the issues mentioned above are quantitatively modeled by global parameters and integrated within the statistical framework. We developed an efficient EM algorithm for parameter estimation. We evaluated performance on three data sets and show that GPHMM can correctly identify chromosomal aberrations in tumor samples containing as few as 10% cancer cells. Furthermore, we demonstrated that the estimation of global parameters in GPHMM provides information about the biological characteristics of tumor samples and the quality of genotyping signal from SNP array experiments, which is helpful for data quality control and outlier detection in cohort studies.


BMC Bioinformatics | 2013

PKIS: computational identification of protein kinases for experimentally discovered protein phosphorylation sites

Liang Zou; Mang Wang; Yi Shen; Jie Liao; Ao Li; Minghui Wang

Background: Dynamic protein phosphorylation is an essential regulatory mechanism in various organisms. In this capacity, it is involved in a multitude of signal transduction pathways. Kinase-specific phosphorylation data lay the foundation for reconstruction of signal transduction networks. For this reason, precise annotation of phosphorylated proteins is the first step toward simulating cell signaling pathways. However, the vast majority of kinase-specific phosphorylation data remain undiscovered, and existing experimental methods and computational phosphorylation site (P-site) prediction tools have various limitations with respect to addressing this problem. Results: To address this issue, a novel protein kinase identification web server, PKIS, is presented here for the identification of the protein kinases responsible for experimentally verified P-sites at high specificity, which incorporates the composition of monomer spectrum (CMS) encoding strategy and support vector machines (SVMs). Compared to widely used P-site prediction tools including KinasePhos 2.0, Musite, and GPS2.1, PKIS largely outperformed these tools in identifying protein kinases associated with known P-sites. In addition, PKIS was used on all the P-sites in Phospho.ELM that currently lack kinase information. It successfully identified 14 potential SYK substrates with 36 known P-sites. Further literature search showed that 5 of them were indeed phosphorylated by SYK. Finally, an enrichment analysis was performed and 6 significant SYK-related signal pathways were identified. Conclusions: In general, PKIS can identify protein kinases for experimental phosphorylation sites efficiently. It is a valuable bioinformatics tool suitable for the study of protein phosphorylation. The PKIS web server is freely available at http://bioinformatics.ustc.edu.cn/pkis.


PLOS ONE | 2010

MixHMM: Inferring Copy Number Variation and Allelic Imbalance Using SNP Arrays and Tumor Samples Mixed with Stromal Cells

Zongzhi Liu; Ao Li; Vincent P. Schulz; Min Chen; David Tuck

Background: Genotyping platforms such as single nucleotide polymorphism (SNP) arrays are powerful tools to study genomic aberrations in cancer samples. Allele-specific information from SNP arrays provides valuable information for interpreting copy number variation (CNV) and allelic imbalance, including loss-of-heterozygosity (LOH), beyond that obtained from the total DNA signal available from array comparative genomic hybridization (aCGH) platforms. Several algorithms based on hidden Markov models (HMMs) have been designed to detect copy number changes and copy-neutral LOH making use of the allele information on SNP arrays. However, heterogeneity in clinical samples, due to stromal contamination and somatic alterations, complicates analysis and interpretation of these data. Methods: We have developed MixHMM, a novel hidden Markov model using hidden states based on chromosomal structural aberrations. MixHMM allows CNV detection for copy numbers up to 7 and allows more complete and accurate description of other forms of allelic imbalance, such as increased copy number LOH or imbalanced amplifications. MixHMM also incorporates a novel sample mixing model that allows detection of tumor CNV events in heterogeneous tumor samples, where cancer cells are mixed with a proportion of stromal cells. Conclusions: We validate MixHMM and demonstrate its advantages with simulated samples, clinical tumor samples and a dilution series of mixed samples. We have shown that the CNVs of cancer cells in a tumor sample contaminated with up to 80% of stromal cells can be detected accurately using Illumina BeadChip and MixHMM. Availability: MixHMM is available as a Python package provided with some other useful tools at http://genecube.med.yale.edu:8080/MixHMM.
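To see why stromal contamination complicates interpretation, it can help to work through the arithmetic of a simple two-population mixture. The sketch below is not MixHMM's model or API; it only assumes that the observed signal at a heterozygous SNP is a linear mix of tumor and diploid stromal DNA weighted by cell fraction and copy number. The names `expected_baf` and `average_copy_number` are hypothetical.

```python
def average_copy_number(tumor_fraction, tumor_copies, normal_copies=2):
    """Mean copy number per cell in a tumor/stroma mixture."""
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    return tumor_fraction * tumor_copies + (1.0 - tumor_fraction) * normal_copies


def expected_baf(tumor_fraction, tumor_copies, tumor_b_copies,
                 normal_copies=2, normal_b_copies=1):
    """Expected B-allele frequency at a SNP that is heterozygous (AB)
    in the stromal cells, under a linear mixing assumption."""
    total = average_copy_number(tumor_fraction, tumor_copies, normal_copies)
    if total == 0:
        raise ValueError("no DNA at this locus in the mixture")
    b_alleles = (tumor_fraction * tumor_b_copies
                 + (1.0 - tumor_fraction) * normal_b_copies)
    return b_alleles / total


# Hemizygous deletion that removes the B allele in the tumor cells,
# observed in a sample with 80% stromal cells (tumor fraction 0.2).
diluted_baf = expected_baf(0.2, tumor_copies=1, tumor_b_copies=0)
diluted_copies = average_copy_number(0.2, tumor_copies=1)
```

In a pure tumor sample the same deletion would give a B-allele frequency of 0; at 80% stromal content it only shifts to about 0.44 with a mean copy number of 1.8, which illustrates how close to the normal state the signal from a highly contaminated sample can sit.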


Amino Acids | 2014

Prediction of protein kinase-specific phosphorylation sites in hierarchical structure using functional information and random forest

Wenwen Fan; Xiaoyi Xu; Yi Shen; Huanqing Feng; Ao Li; Minghui Wang

Reversible protein phosphorylation is one of the most important post-translational modifications, which regulates various biological cellular processes. Identification of the kinase-specific phosphorylation sites is helpful for understanding the phosphorylation mechanism and regulation processes. Although a number of computational approaches have been developed, currently few studies are concerned about hierarchical structures of kinases, and most of the existing tools use only local sequence information to construct predictive models. In this work, we conduct a systematic and hierarchy-specific investigation of protein phosphorylation site prediction in which protein kinases are clustered into hierarchical structures with four levels including kinase, subfamily, family and group. To enhance phosphorylation site prediction at all hierarchical levels, functional information of proteins, including gene ontology (GO) and protein–protein interaction (PPI), is adopted in addition to primary sequence to construct prediction models based on random forest. Analysis of selected GO and PPI features shows that functional information is critical in determining protein phosphorylation sites for every hierarchical level. Furthermore, the prediction results of Phospho.ELM and additional testing dataset demonstrate that the proposed method remarkably outperforms existing phosphorylation prediction methods at all hierarchical levels. The proposed method is freely available at http://bioinformatics.ustc.edu.cn/phos_pred/.


Bioinformatics | 2014

CLImAT: accurate detection of copy number alteration and loss of heterozygosity in impure and aneuploid tumor samples using whole-genome sequencing data

Zhenhua Yu; Yuanning Liu; Yi Shen; Minghui Wang; Ao Li

Motivation: Whole-genome sequencing of tumor samples has been demonstrated to be an efficient approach for comprehensive analysis of genomic aberrations in cancer genomes. Critical issues such as tumor impurity and aneuploidy, GC-content and mappability bias have been reported to complicate identification of copy number alteration and loss of heterozygosity in complex tumor samples. Therefore, efficient computational methods are required to address these issues. Results: We introduce CLImAT (CNA and LOH Assessment in Impure and Aneuploid Tumors), a bioinformatics tool for identification of genomic aberrations from tumor samples using whole-genome sequencing data. Without requiring a matched normal sample, CLImAT performs an integrated analysis of read depth and allelic frequency and provides extensive data processing procedures, including GC-content and mappability correction of read depth and quantile normalization of B-allele frequency. CLImAT accurately identifies copy number alteration and loss of heterozygosity even for highly impure tumor samples with aneuploidy. We evaluate CLImAT on both simulated and real DNA sequencing data to demonstrate its ability to infer tumor impurity and ploidy and identify genomic aberrations in complex tumor samples. Availability and implementation: The CLImAT software package can be freely downloaded at http://bioinformatics.ustc.edu.cn/CLImAT/. Supplementary information: Supplementary data are available at Bioinformatics online.


Genomics, Proteomics & Bioinformatics | 2016

A Bipartite Network-based Method for Prediction of Long Non-coding RNA-protein Interactions

Mengqu Ge; Ao Li; Minghui Wang

As one large class of non-coding RNAs (ncRNAs), long ncRNAs (lncRNAs) have gained considerable attention in recent years. Mutations and dysfunction of lncRNAs have been implicated in human disorders. Many lncRNAs exert their effects through interactions with the corresponding RNA-binding proteins. Several computational approaches have been developed, but only a few are able to predict these interactions from a network-based point of view. Here, we introduce a computational method named lncRNA–protein bipartite network inference (LPBNI). LPBNI aims to identify potential lncRNA–interacting proteins by making full use of the known lncRNA–protein interactions. A leave-one-out cross-validation (LOOCV) test shows that LPBNI significantly outperforms other network-based methods, including random walk with restart (RWR) and protein-based collaborative filtering (ProCF). Furthermore, a case study was performed to demonstrate the performance of LPBNI using real data in predicting potential lncRNA–interacting proteins.
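The abstract does not describe the inference step itself, so the following is only a generic sketch of how known links in a bipartite network can be used to score unobserved ones: a two-step resource allocation in which resource placed on a query lncRNA's known proteins flows to all lncRNAs and then back to proteins. Whether LPBNI uses exactly this scheme is an assumption; the function names and the toy adjacency matrix are hypothetical.

```python
def bipartite_scores(adjacency, query_row):
    """Score every column (protein) for one row (lncRNA) by two-step
    resource allocation on a 0/1 bipartite adjacency matrix."""
    n_rows, n_cols = len(adjacency), len(adjacency[0])
    col_degree = [sum(adjacency[r][c] for r in range(n_rows)) for c in range(n_cols)]
    row_degree = [sum(row) for row in adjacency]
    initial = adjacency[query_row]  # one unit on each known partner

    # Step 1: each protein splits its resource equally among its lncRNAs.
    row_resource = [
        sum(adjacency[r][c] * initial[c] / col_degree[c]
            for c in range(n_cols) if col_degree[c] > 0)
        for r in range(n_rows)
    ]
    # Step 2: each lncRNA splits its resource equally among its proteins.
    return [
        sum(adjacency[r][c] * row_resource[r] / row_degree[r]
            for r in range(n_rows) if row_degree[r] > 0)
        for c in range(n_cols)
    ]


def rank_candidates(adjacency, query_row):
    """Unlinked columns for the query row, highest score first."""
    scores = bipartite_scores(adjacency, query_row)
    candidates = [(c, s) for c, s in enumerate(scores) if adjacency[query_row][c] == 0]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Toy network: rows are 3 lncRNAs, columns are 3 proteins.
toy = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
toy_scores = bipartite_scores(toy, query_row=0)
toy_ranking = rank_candidates(toy, query_row=0)
```

In the toy network the only protein not yet linked to the first lncRNA receives its score through the second lncRNA, which shares a protein with the query; the total resource is conserved across both steps.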

Collaboration


Dive into Ao Li's collaborations.

Top Co-Authors

Minghui Wang

University of Science and Technology of China

Huanqing Feng

University of Science and Technology of China

Jianing Xi

University of Science and Technology of China

Yuanning Liu

University of Science and Technology of China

Dongdong Sun

University of Science and Technology of China

Yi Shen

University of Science and Technology of China

Zhenhua Yu

University of Science and Technology of China

Zhaohui Jiang

University of Science and Technology of China

Chen Peng

University of Science and Technology of China

Mengqu Ge

University of Science and Technology of China
