Huanqing Feng
University of Science and Technology of China
Publication
Featured research published by Huanqing Feng.
BMC Bioinformatics | 2006
Yu Xue; Ao Li; Lirong Wang; Huanqing Feng; Xuebiao Yao
Background: As a reversible and dynamic post-translational modification (PTM) of proteins, phosphorylation plays essential regulatory roles in a broad spectrum of biological processes. Although many studies have addressed the molecular mechanism of phosphorylation dynamics, the intrinsic features of substrate specificity remain elusive. Results: In this work, we present a novel, versatile and comprehensive program, PPSP (Prediction of PK-specific Phosphorylation site), built on Bayesian decision theory (BDT). PPSP can accurately predict potential phosphorylation sites for ~70 PK (protein kinase) groups. In a comparison with four existing tools (Scansite, NetPhosK, KinasePhos and GPS), PPSP is more accurate and powerful. Moreover, PPSP also provides predictions for many novel PKs, such as TRK, mTOR, SyK and MET/RON. The prediction accuracy for these novel PKs is also satisfactory. Conclusion: Taken together, we propose that PPSP could be a powerful tool for experimentalists who focus on identifying phosphorylation substrates and their PK-specific sites. Moreover, the BDT strategy could also be a general approach for other PTMs, such as sumoylation and ubiquitination.
Nucleic Acids Research | 2005
Dan Xie; Ao Li; Minghui Wang; Zhewen Fan; Huanqing Feng
The subcellular location of a protein is one of its key functional characteristics, as proteins must be localized correctly at the subcellular level to function normally. In this paper, a novel method named LOCSVMPSI is introduced, which is based on the support vector machine (SVM) and the position-specific scoring matrix generated from PSI-BLAST profiles. In a jackknife test on the RH2427 data set, LOCSVMPSI achieved a high overall prediction accuracy of 90.2%, which is higher than the results of SubLoc and ESLpred on this data set. In addition, the prediction performance of LOCSVMPSI was evaluated with a 5-fold cross-validation test on the PK7579 data set, and the results were consistently better than those of the previous method based on several SVMs using the composition of both amino acids and amino acid pairs. A further test on the SWISSPROT new-unique data set showed that LOCSVMPSI also performed better than some widely used prediction methods, such as PSORTII, TargetP and LOCnet. All these results indicate that LOCSVMPSI is a powerful tool for the prediction of eukaryotic protein subcellular localization. An online web server (current version 1.3) based on this method has been developed and is freely available to both academic and commercial users.
BMC Bioinformatics | 2006
Xian Wang; Ao Li; Zhaohui Jiang; Huanqing Feng
Background: Gene expression profiling has become a useful biological resource in recent years, and it plays an important role in a broad range of areas in biology. The raw gene expression data, usually in the form of a large matrix, may contain missing values. Downstream analysis methods that require a complete matrix as input are thus not applicable. Several methods have been developed to solve this problem, such as K-nearest-neighbor imputation and Bayesian principal components analysis imputation. In this paper, we introduce a novel imputation approach based on Support Vector Regression (SVR). The proposed approach utilizes an orthogonal coding input scheme, which makes use of multiple missing values in one row of a gene expression profile and imputes the missing value in a much higher dimensional space, to obtain better performance. Results: A comparative study of our method against previously developed methods is presented for the estimation of missing values on six gene expression data sets. Among the three different input-vector coding schemes we tried, the orthogonal input coding scheme obtains the best estimation results, with the minimum Normalized Root Mean Squared Error (NRMSE). The results also demonstrate that the SVR method has powerful estimation ability on different kinds of data sets, with relatively small NRMSE. Conclusion: The SVR imputation method performs better than, or at least comparably with, the previously developed methods in the present research. The outstanding estimation ability of this method is partly due to its use of most of the missing value information through the orthogonal input coding scheme. In addition, the solid theoretical foundation of the SVR method, together with the orthogonal input coding scheme, also contributes to the estimation performance. The promising estimation ability demonstrated in the results section suggests that the proposed approach provides a proper solution to the missing value estimation problem. The source code of the SVR method is available from http://202.38.78.189/downloads/svrimpute.html for non-commercial use.
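The abstract above compares imputation methods by NRMSE without stating the formula. The following is a minimal sketch of one common definition, assuming the RMSE over the artificially removed entries is normalized by the standard deviation of their true values; the normalization actually used in the paper may differ, and the function name `nrmse` is hypothetical.

```python
import math


def nrmse(true_values, imputed_values):
    """Normalized RMSE between true and imputed entries.

    Assumption: the RMSE is divided by the (population) standard deviation
    of the true values. Other normalizations exist.
    """
    if len(true_values) != len(imputed_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_values)
    # Mean squared error over the entries that were hidden and then imputed.
    mse = sum((t - p) ** 2 for t, p in zip(true_values, imputed_values)) / n
    mean_true = sum(true_values) / n
    var_true = sum((t - mean_true) ** 2 for t in true_values) / n
    if var_true == 0:
        raise ValueError("true values have zero variance")
    return math.sqrt(mse / var_true)


true_vals = [1.0, 2.0, 3.0, 4.0]
perfect_score = nrmse(true_vals, true_vals)            # perfect imputation
mean_fill_score = nrmse(true_vals, [2.5, 2.5, 2.5, 2.5])  # filling with the mean
```

Under this definition a perfect imputation scores 0 and filling every entry with the mean of the true values scores 1, so a useful method should land well below 1.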
Amino Acids | 2014
Wenwen Fan; Xiaoyi Xu; Yi Shen; Huanqing Feng; Ao Li; Minghui Wang
Reversible protein phosphorylation is one of the most important post-translational modifications and regulates various cellular processes. Identifying kinase-specific phosphorylation sites helps in understanding the phosphorylation mechanism and its regulation. Although a number of computational approaches have been developed, few studies currently consider the hierarchical structure of kinases, and most existing tools use only local sequence information to construct predictive models. In this work, we conduct a systematic and hierarchy-specific investigation of protein phosphorylation site prediction in which protein kinases are clustered into a hierarchical structure with four levels: kinase, subfamily, family and group. To enhance phosphorylation site prediction at all hierarchical levels, functional information of proteins, including gene ontology (GO) and protein–protein interaction (PPI), is adopted in addition to primary sequence to construct prediction models based on random forest. Analysis of selected GO and PPI features shows that functional information is critical in determining protein phosphorylation sites at every hierarchical level. Furthermore, the prediction results on Phospho.ELM and an additional testing dataset demonstrate that the proposed method remarkably outperforms existing phosphorylation prediction methods at all hierarchical levels. The proposed method is freely available at http://bioinformatics.ustc.edu.cn/phos_pred/.
Northeast Bioengineering Conference | 2003
Xiaoyan Li; T. Wang; Ping Zhou; Huanqing Feng
Automatic analysis of the ST-T complex in electrocardiogram (ECG) signals was investigated. First, a wavelet adaptive filter structure was used to remove the baseline wandering of the ECG signals, which is critically important for ST segment analysis. Then, taking advantage of the multiresolution ability of the wavelet transform, a method was developed to identify the ST segment fiducial points of the ECG signals at different wavelet decomposition scales or frequency bands. The proposed methods were tested using the standard MIT/BIH ECG ST segment database. The fiducial point identification results were compared with those obtained manually by experienced cardiologists. The comparison showed good agreement, which suggests that the proposed method is reliable.
PLOS ONE | 2013
Chen Peng; Minghui Wang; Yi Shen; Huanqing Feng; Ao Li
Background: As one of the most common types of co-regulatory motifs, feed-forward loops (FFLs) control many cell functions and play an important role in human cancers. Therefore, it is crucial to reconstruct and analyze cancer-related FFLs that are controlled by transcription factors (TFs) and microRNAs (miRNAs) simultaneously, in order to find out how miRNAs and TFs cooperate with each other in cancer cells and how they contribute to carcinogenesis. Current FFL studies rely on predicted regulation information and therefore suffer from false positives in the prediction results. More critically, FFLs generated by existing approaches cannot represent the dynamic and conditional regulation relationships under different experimental conditions. Methodology/Principal Findings: In this study, we proposed a novel filter-wrapper feature selection method to accurately identify co-regulatory mechanisms by incorporating prior information from predicted regulatory interactions with parallel miRNA/mRNA expression datasets. By applying this method, we reconstructed 208 and 110 TF-miRNA co-regulatory FFLs from human pan-cancer and prostate datasets, respectively. Further analysis of these cancer-related FFLs showed that the top-ranking TF STAT3 and miRNA hsa-let-7e are key regulators implicated in human cancers, whose regulated targets are significantly enriched in cellular process regulation and signaling pathways involved in carcinogenesis. Conclusions/Significance: In this study, we introduced an efficient computational approach to reconstruct co-regulatory FFLs by accurately identifying gene co-regulatory interactions. The strength of the proposed feature selection method lies in the fact that it can precisely filter out false positives in predicted regulatory interactions by quantitatively modeling the complex co-regulation of target genes mediated by TFs and miRNAs simultaneously. Moreover, the proposed feature selection method can be generally applied to other gene regulation studies using parallel expression data in different biological contexts.
International Conference of the IEEE Engineering in Medicine and Biology Society | 2005
Zhaohui Jiang; Yan Ning; Bin An; Ao Li; Huanqing Feng
Based on detrended fluctuation analysis (DFA), we explore the characteristics of multichannel electroencephalogram (EEG) signals recorded from many subjects performing different mental tasks. The scaling exponents (alpha) computed from the mental EEG show that it exhibits long-range power-law correlations, and the exponents reflect the kind of mental task. The scaling exponent of letter-composing differs from that of multiplication, especially at positions C3 and C4, and at positions O1 and O2 the scaling exponent of rotation is also distinctly different from that of multiplication. Detrended fluctuation analysis proved robust against noise in our work. The results of this paper can help in designing mental tasks and selecting brain areas in brain-computer interface systems.
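The scaling exponent mentioned in the abstract above can be estimated as follows. This is a minimal sketch of standard first-order DFA, assuming linear detrending in non-overlapping windows; the window sizes, the synthetic test signals and the function names are illustrative assumptions, not the settings used in the paper.

```python
import math
import random


def _linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_alpha(signal, window_sizes):
    """Estimate the DFA scaling exponent alpha of a 1-D signal."""
    # Step 1: integrate the mean-subtracted signal to obtain the profile.
    mean = sum(signal) / len(signal)
    profile, total = [], 0.0
    for value in signal:
        total += value - mean
        profile.append(total)

    log_sizes, log_fluctuations = [], []
    for n in window_sizes:
        n_windows = len(profile) // n
        if n < 4 or n_windows < 1:
            continue  # too short to fit a trend, or longer than the signal
        t = list(range(n))
        squared_residuals = 0.0
        # Step 2: remove a linear trend in each window and pool the residuals.
        for w in range(n_windows):
            segment = profile[w * n:(w + 1) * n]
            slope, intercept = _linear_fit(t, segment)
            squared_residuals += sum(
                (y - (slope * i + intercept)) ** 2 for i, y in enumerate(segment)
            )
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        if fluctuation > 0:
            log_sizes.append(math.log(n))
            log_fluctuations.append(math.log(fluctuation))

    # Step 3: alpha is the slope of log F(n) against log n.
    alpha, _ = _linear_fit(log_sizes, log_fluctuations)
    return alpha


rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
random_walk, position = [], 0.0
for step in white_noise:
    position += step
    random_walk.append(position)

scales = [8, 16, 32, 64, 128, 256]
alpha_noise = dfa_alpha(white_noise, scales)  # theory: about 0.5 (uncorrelated)
alpha_walk = dfa_alpha(random_walk, scales)   # theory: about 1.5 (Brownian)
```

An exponent above 0.5 indicates persistent long-range correlations. In an EEG setting, `dfa_alpha` would be applied per channel and per task, and the resulting exponents compared across electrode positions.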
International Conference of the IEEE Engineering in Medicine and Biology Society | 2006
Bin An; Yan Ning; Zhaohui Jiang; Huanqing Feng; Heqin Zhou
Multichannel electrocorticogram (ECoG) and electroencephalogram (EEG) signals are commonly used to classify two kinds of motor imagery (MI) tasks. In this paper, the ECoG and EEG data sets consist of training and test data recorded at different times or on different days. Power spectral density (PSD) is selected as the feature; Fisher discriminant analysis (FDA) and common spatial patterns (CSP) are used to remove redundancy; a K-Nearest-Neighbor (KNN) classifier is applied to classify the MI tasks; and a new function R(k) is presented to estimate the value of k. Using these methods, we obtain a predictive accuracy of 92% for MI tasks based on ECoG data and 81% based on EEG data. The results show that two kinds of MI tasks can be classified effectively based on EEG as well as ECoG.
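The classification step named in the abstract above can be illustrated with a plain K-Nearest-Neighbor classifier. This is a minimal sketch assuming Euclidean distance and majority voting; the two-dimensional feature vectors and the labels are made up for illustration, and the paper's R(k) function for choosing k is not reproduced because the abstract does not define it.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k):
    """Return the majority label among the k training points nearest to query."""
    if not 1 <= k <= len(train_features):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training points by Euclidean distance to the query.
    neighbors = sorted(
        zip(train_features, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors (for example, band power on two channels).
features = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
labels = ["left", "left", "left", "right", "right", "right"]

prediction_a = knn_predict(features, labels, (1.0, 1.0), k=3)
prediction_b = knn_predict(features, labels, (3.0, 3.0), k=3)
```

Choosing an odd k avoids tied votes in a two-class problem such as this one.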
International Conference on Bioinformatics and Biomedical Engineering | 2009
Kai Lai; Peng Zhao; Yufeng Huang; Junwei Liu; Chang Wang; Huanqing Feng; Chuanfu Li
In computer-aided diagnosis of pulmonary diseases, accurate segmentation of the airway tree from CT images is the basis for subsequent processing and analysis. It remains a challenging task because of image noise, the partial volume effect and the texture similarity of the airway and parenchyma. To solve these problems, various algorithms have been proposed, among which region growing is the most commonly used. However, previous region growing algorithms, whether they use constant or adaptive parameters, suffer from leakage and/or disconnection. This paper presents a novel adaptive region growing approach using two-step processing. The first step is rough segmentation, which divides the sub-volumes surrounding the airway into three types according to their topology; the second step is fine segmentation, which uses a specific method for each type. The experimental results show that the proposed approach can effectively suppress leakage and remedy disconnection.
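For context, the baseline that the abstract above improves on can be sketched as follows. This is a minimal constant-threshold region growing on a 2-D slice with 4-connectivity, not the paper's adaptive two-step method; the intensity values, the thresholds and the function name `region_grow` are illustrative assumptions.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Grow a region from seed over 4-connected pixels with value <= threshold.

    image is a list of rows of intensities (air is dark, so low values are
    accepted). Returns the set of (row, col) pixels in the region.
    """
    rows, cols = len(image), len(image[0])
    if image[seed[0]][seed[1]] > threshold:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] <= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy slice: values near -950 mimic airway lumen, -300 mimics surrounding tissue.
slice_2d = [
    [-300, -950, -300, -300],
    [-300, -960, -940, -300],
    [-300, -300, -955, -300],
    [-920, -300, -300, -300],
]

airway = region_grow(slice_2d, seed=(0, 1), threshold=-900)
leaked = region_grow(slice_2d, seed=(0, 1), threshold=-250)
```

With the tight threshold the dark pixel at (3, 0) is left out because it is not connected to the seed, which is the disconnection problem in miniature. With the loose threshold the region floods the whole slice, which is the leakage problem.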
PLOS ONE | 2014
Ao Li; Yuanning Liu; Qihong Zhao; Huanqing Feng; Lyndsay Harris; Minghui Wang
Genomic copy number alteration and allelic imbalance are distinct features of cancer cells, and recent advances in genotyping technology have greatly boosted research on the cancer genome. However, the complicated nature of tumors usually hampers the analysis of SNP arrays. In this study, we describe a bioinformatic tool, named GIANT, for genome-wide identification of somatic aberrations from paired normal-tumor samples measured with SNP arrays. By efficiently incorporating genotype information from the matched normal sample, it accurately detects different types of aberrations in the cancer genome, even for aneuploid tumor samples with severe normal cell contamination. Furthermore, it allows for the discovery of recurrent aberrations with critical biological properties in tumorigenesis by using a statistical significance test. We demonstrate the superior performance of the proposed method on various datasets, including tumor replicate pairs, simulated SNP arrays and dilution series of normal-cancer cell lines. Results show that GIANT has the potential to detect genomic aberrations even when the cancer cell proportion is as low as 5–10%. Application to a large number of paired tumor samples delivers a genome-wide profile of the statistical significance of the various aberrations, including amplification, deletion and LOH. We believe that GIANT represents a powerful bioinformatic tool for interpreting complex genomic aberrations, thus assisting both academic study and the clinical treatment of cancer.