Publication


Featured research published by Yanchun Liang.


Bioinformatics | 2009

Methods for labeling error detection in microarrays based on the effect of data perturbation on the regression model

Chen Zhang; Chunguo Wu; Enrico Blanzieri; You Zhou; Yan Wang; Wei Du; Yanchun Liang

MOTIVATION: Mislabeled samples often appear in gene expression profiles because of the similarity between different subtypes of a disease and because of subjective misdiagnosis. Mislabeled samples degrade supervised learning procedures. The LOOE-sensitivity algorithm is an approach to mislabeled sample detection for microarrays based on data perturbation. However, it fails to measure the perturbing effect, which leads to poor performance. The purpose of this article is to design a novel detection method for mislabeled microarray samples that takes advantage of measuring the effect of data perturbations. RESULTS: To measure the effect of data perturbation, we define an index named the perturbing influence value (PIV), based on the support vector machine (SVM) regression model. The Column Algorithm (CAPIV), Row Algorithm (RAPIV) and progressive Row Algorithm (PRAPIV), all based on the PIV, are proposed to detect mislabeled samples. Experimental results obtained on six artificial datasets and five microarray datasets demonstrate that all methods proposed in this article are superior to LOOE-sensitivity. Moreover, compared with the simple SVM and CL-stability, the PRAPIV algorithm shows an increase in precision together with high recall. AVAILABILITY: The program and source code (in Java) are publicly available at http://ccst.jlu.edu.cn/CSBG/PIVS/index.htm


PLOS ONE | 2013

Computational Prediction of Human Salivary Proteins from Blood Circulation and Application to Diagnostic Biomarker Identification

Jiaxin Wang; Yanchun Liang; Yan Wang; Juan Cui; Ming Liu; Wei Du; Ying Xu

Proteins can move from blood circulation into salivary glands through active transportation, passive diffusion or ultrafiltration. Some of these proteins are then released into saliva and hence can potentially serve as biomarkers for diseases if accurately identified. We present a novel computational method for predicting salivary proteins that come from circulation. The basis for the prediction is a set of physicochemical and sequence features that we found to discriminate between human proteins known to be movable from circulation to saliva and proteins deemed to be not in saliva. A classifier was trained on these features using a support-vector machine to predict protein secretion into saliva. The classifier achieved 88.56% average recall and 90.76% average precision in 10-fold cross-validation on the training data, indicating that the selected features are informative. Considering the possibility that our negative training data (i.e., proteins deemed to be not in saliva) may not be highly reliable, we have also trained a ranking method based on the same features, aiming to rank the known salivary proteins from circulation highest among the proteins in the general background. This prediction capability can be used to predict potential biomarker proteins for specific human diseases when coupled with information on differentially expressed proteins in diseased versus healthy control tissues and a prediction capability for blood-secretory proteins. Using such integrated information, we predicted 31 candidate biomarker proteins in saliva for breast cancer.


PLOS ONE | 2014

Identification of essential proteins based on ranking edge-weights in protein-protein interaction networks.

Yan Wang; Huiyan Sun; Wei Du; Enrico Blanzieri; Gabriella Viero; Ying Xu; Yanchun Liang

Essential proteins are those that are indispensable to cellular survival and development. Existing methods for essential protein identification generally rely on knock-out experiments and/or the relative density of their interactions (edges) with other proteins in a Protein-Protein Interaction (PPI) network. Here, we present a computational method, called EW, to first rank protein-protein interactions in terms of their Edge Weights, and then identify sub-PPI-networks consisting of only the highly-ranked edges and predict their proteins as essential proteins. We have applied this method to publicly-available PPI data on Saccharomyces cerevisiae (Yeast) and Escherichia coli (E. coli) for essential protein identification, and demonstrated that EW achieves better performance than the state-of-the-art methods in terms of the precision-recall and Jackknife measures. The highly-ranked protein-protein interactions by our prediction tend to be biologically significant in both the Yeast and E. coli PPI networks. Further analyses on systematically perturbed Yeast and E. coli PPI networks through randomly deleting edges demonstrate that the proposed method is robust and the top-ranked edges tend to be more associated with known essential proteins than the lowly-ranked edges.


BioMed Research International | 2013

Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data

Wei Du; Zhongbo Cao; Yan Wang; Ying Sun; Enrico Blanzieri; Yanchun Liang

Phylogenetic trees are used to represent the evolutionary relationships among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple types of genomic information is proposed. The method, called CGCPhy, is based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes potentially involved in horizontal gene transfer (HGT) events are eliminated; these are taken to be the genes that are highly conserved across different species and the genes located on fragments with an abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved practically useful accuracies on different species sets, in agreement with Bergey's taxonomy in quartet topologies. Simulation results show that CGCPhy achieves high average accuracy with a low standard deviation on different datasets, so it has potential for application in phylogenetic analysis.
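
The final step above relies on the standard neighbor-joining method. As a rough illustration only, the sketch below shows one neighbor-joining iteration on a precomputed distance matrix: it builds the Q matrix, picks the pair of taxa to join, and computes the two branch lengths to the new internal node. The function names and the example matrix are assumptions made for this illustration; they are not taken from CGCPhy, and the CGCPhy gene-cluster distance itself is not reproduced here.

```python
def nj_q_matrix(dist):
    """Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
    return q


def nj_pick_pair(dist):
    """Return the pair (i, j), i < j, with the smallest Q value."""
    q = nj_q_matrix(dist)
    n = len(dist)
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    return min(pairs, key=lambda p: q[p[0]][p[1]])


def nj_branch_lengths(dist, i, j):
    """Branch lengths from taxa i and j to the internal node that joins them."""
    n = len(dist)
    length_i = 0.5 * dist[i][j] + (sum(dist[i]) - sum(dist[j])) / (2 * (n - 2))
    return length_i, dist[i][j] - length_i


# Hypothetical symmetric distance matrix for five taxa (illustrative values only).
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair = nj_pick_pair(example)
lengths = nj_branch_lengths(example, *pair)
```

A full tree builder would replace the joined pair with the new node, recompute distances to it, and repeat until only two nodes remain.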


Data Mining in Bioinformatics | 2014

Operon prediction by Markov clustering

Wei Du; Zhongbo Cao; Yan Wang; Enrico Blanzieri; Chen Zhang; Yanchun Liang

The prediction of operons is a critical step in the reconstruction of biochemical and regulatory networks at the whole-genome level. In this paper, a novel operon prediction model based on Markov Clustering (MCL) is proposed. The model uses MCL as a graph-clustering method for prediction and does not need a classifier. In cross-species validation, the accuracies for E. coli K12, Bacillus subtilis and P. furiosus are 92.1%, 86.9% and 87.3%, respectively. Experimental results show that the proposed method is effective at operon prediction. The compiled program and test data sets are publicly available at http://ccst.jlu.edu.cn/JCSB/OPMC/.
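
For readers unfamiliar with Markov Clustering, the following is a minimal sketch of the generic MCL iteration (expansion by matrix squaring, then inflation by element-wise powering and column normalization) on a small weighted graph. It is not the published operon model: how genes and their relationships are turned into graph edges is not described in the abstract, so the adjacency matrix, the parameter defaults, and all function names here are assumptions for illustration.

```python
def normalize_columns(m):
    """Scale every column to sum to 1, giving a column-stochastic matrix."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                out[i][j] = m[i][j] / total
    return out


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster graph nodes with the basic MCL expansion/inflation loop."""
    n = len(adjacency)
    # Add self-loops, a common MCL preprocessing step, then normalize.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)                                 # expansion
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded])  # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries mass lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical graph: two separate triangles of mutually linked nodes.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = markov_cluster(graph)
```

A higher inflation value generally produces finer-grained clusters, which is the main tuning knob in this kind of approach.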


African Journal of Biotechnology | 2012

An entropy-based improved k-top scoring pairs (TSP) method for classifying human cancers

Chunbao Zhou; Shuqin Wang; Enrico Blanzieri; Yanchun Liang

Classification and prediction of different cancers based on gene-expression profiles are important for cancer diagnosis, cancer treatment and medication discovery. However, most data in a gene expression profile do not contribute to cancer classification and prediction. Hence, it is important to find the key genes that are relevant. An entropy-based improved k-top scoring pairs (TSP) (Ik-TSP) method is presented in this study for the classification and prediction of human cancers based on gene-expression data. We compared Ik-TSP classifiers with 5 different machine learning methods and the k-TSP method based on 3 different feature selection methods on 9 binary class gene expression datasets and 10 multi-class gene expression datasets involving human cancers. Experimental results showed that the Ik-TSP method had higher accuracy. The experimental results also showed that the proposed method can effectively find genes that are important for distinguishing different cancers and cancer subtypes. Key words: Cancer classification, gene expression, k-TSP, information entropy, gene selection.
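
To make the baseline concrete, here is a small sketch of the plain k-TSP idea that Ik-TSP builds on: each gene pair is scored by how differently the ordering "gene i is expressed below gene j" occurs in the two classes, and the top-scoring disjoint pairs vote on a new sample. The entropy-based improvement described in the abstract is not reproduced, since its details are not given here; the function names and toy data are assumptions for illustration.

```python
from itertools import combinations


def tsp_scores(expr, labels):
    """Score each gene pair (i, j) by |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|."""
    n_genes = len(expr[0])
    class0 = [s for s, y in zip(expr, labels) if y == 0]
    class1 = [s for s, y in zip(expr, labels) if y == 1]
    scores = {}
    for i, j in combinations(range(n_genes), 2):
        p0 = sum(s[i] < s[j] for s in class0) / len(class0)
        p1 = sum(s[i] < s[j] for s in class1) / len(class1)
        # Store the score and whether "x_i < x_j" points to class 0.
        scores[(i, j)] = (abs(p0 - p1), p0 > p1)
    return scores


def fit_k_tsp(expr, labels, k=1):
    """Pick the k best-scoring gene pairs, keeping the pairs disjoint."""
    ranked = sorted(tsp_scores(expr, labels).items(),
                    key=lambda kv: kv[1][0], reverse=True)
    chosen, used = [], set()
    for (i, j), (_, class0_when_less) in ranked:
        if i in used or j in used:
            continue
        chosen.append((i, j, class0_when_less))
        used.update((i, j))
        if len(chosen) == k:
            break
    return chosen


def predict(pairs, sample):
    """Majority vote over the selected pairs; returns class 0 or 1."""
    votes_for_1 = 0
    for i, j, class0_when_less in pairs:
        less = sample[i] < sample[j]
        votes_for_1 += 0 if less == class0_when_less else 1
    return 1 if votes_for_1 * 2 > len(pairs) else 0


# Toy data: 3 genes, 2 samples per class (values are made up).
expr = [[1, 5, 9], [2, 6, 9], [5, 1, 9], [6, 2, 9]]
labels = [0, 0, 1, 1]
model = fit_k_tsp(expr, labels, k=1)
```

Because the rule only compares expression values within a sample, it is unaffected by monotone rescaling of that sample, which is one reason pair-based classifiers are popular for microarray data.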


Computer Science and Software Engineering | 2008

Improved Quantum-Inspired Evolutionary Algorithm and Its Application to 3-SAT Problems

Xiaoyue Feng; Enrico Blanzieri; Yanchun Liang

An improved quantum-inspired evolutionary algorithm is presented in this paper. A quantum angle is adopted to represent the quantum bit in the proposed algorithm. A novel quantum rotation gate strategy is adopted to adjust the direction of the quantum gate, which is used to update the quantum population. The step size is adjusted adaptively rather than being a fixed angle. Furthermore, particle swarm optimization is added to the improved algorithm to accelerate convergence and improve local search ability. To demonstrate the effectiveness and applicability of the proposed approach, several experiments are performed on 3-SAT problems. The results show that it is feasible and effective to solve the 3-SAT problem using the proposed algorithm.
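
The sketch below illustrates the basic angle-encoded idea only: each variable is held as an angle, observing it yields 1 with probability sin²(angle), and a rotation step moves every angle toward the best assignment seen so far. It deliberately uses a fixed rotation step and omits the adaptive step size, the novel rotation gate strategy, and the particle swarm component described above, none of which are specified in the abstract. All names, defaults, and the sample clauses are assumptions for illustration.

```python
import math
import random


def count_satisfied(clauses, assignment):
    """Literal k > 0 means variable k-1 is true; k < 0 means it is false."""
    return sum(
        any((assignment[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in clauses
    )


def observe(angles, rng):
    """Collapse each angle-encoded bit: 1 with probability sin(angle)^2."""
    return [1 if rng.random() < math.sin(t) ** 2 else 0 for t in angles]


def qea_3sat(clauses, n_vars, pop_size=10, generations=200,
             step=0.05 * math.pi, seed=0):
    """Basic quantum-inspired search with a fixed rotation step (assumption)."""
    rng = random.Random(seed)
    # Start every bit at pi/4, i.e. equal chance of observing 0 or 1.
    population = [[math.pi / 4] * n_vars for _ in range(pop_size)]
    best, best_fit = None, -1
    for _ in range(generations):
        for angles in population:
            candidate = observe(angles, rng)
            fit = count_satisfied(clauses, candidate)
            if fit > best_fit:
                best, best_fit = candidate, fit
        if best_fit == len(clauses):
            break
        # Rotate each angle toward the best-known bit value, clamped so
        # that neither outcome ever becomes impossible.
        for angles in population:
            for k in range(n_vars):
                target = math.pi / 2 if best[k] == 1 else 0.0
                delta = step if target > angles[k] else -step
                angles[k] = min(math.pi / 2 - 0.01,
                                max(0.01, angles[k] + delta))
    return best, best_fit


# Hypothetical 3-SAT instance over four variables.
clauses = [(1, -2, 3), (-1, 2, 4), (2, -3, -4), (-1, -2, -3)]
assignment, satisfied = qea_3sat(clauses, n_vars=4)
```

The returned count is the number of clauses satisfied by the best assignment found; the search is stochastic, so a full solution is not guaranteed within the generation limit.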


BioMed Research International | 2017

Computational Analysis of Specific MicroRNA Biomarkers for Noninvasive Early Cancer Detection

Tianci Song; Yanchun Liang; Zhongbo Cao; Wei Du; Ying Li

Cancer is a complex disease residing in various tissues of the human body, accompanied by many abnormalities and mutations in the genome, transcriptome, and epigenome. Early detection plays a crucial role in extending survival time for all major cancer types. Recent advances in microarray and sequencing techniques have given more support to identifying effective biomarkers for early detection of cancer. MicroRNAs (miRNAs) are increasingly used as biomarker candidates in cancer-related studies because they regulate target gene expression. In this paper, comparative analysis is used to discover miRNA expression patterns in cancer versus normal samples at the early stage of eight prevalent cancer types. Our work focuses on the identification and functional analysis of specific miRNA biomarkers. Several miRNA biomarkers identified in this paper match well with those reported in existing studies, and most of them could serve as potential candidate indicators for clinical early diagnosis applications.


International Journal of Digital Content Technology and Its Applications | 2011

An Algorithm for Recognizing Mislabeled and Abnormal Samples in Cancer Microarray

You Zhou; Enrico Blanzieri; Mengmeng Zhang; Yanchun Liang; Xu Zhou

Microarray technology is a high-throughput experimental technology that has been used in many life-science areas, especially in medical applications. The sample classification problem is crucial for disease diagnosis and treatment. However, the process of sample labeling can be very complex and partially subjective. Existing studies confirm this phenomenon and show that even a very small number of erroneous samples can severely degrade the performance of the resulting classifier, particularly when the dataset is small. More and more microarray data have been collected by organizations or companies and can be used for further investigation, but the detection and correction of mislabeled samples remains hard to do by hand. The problem we address in this paper is to develop a method for automatic detection of mislabeled samples and correction of the suspect samples. An algorithm for detecting and correcting potential error samples is proposed: Iterative-CLSWE. The algorithm is based on the classification stability of each sample in the whole dataset. The experimental results validate the proposed algorithm. This automatic way of detecting mislabeled and abnormal samples can prove significant for large collections of data coming from heterogeneous studies.


Methods | 2015

Essential protein identification based on essential protein-protein interaction prediction by Integrated Edge Weights.

Yuexu Jiang; Yan Wang; Wei Pang; Liang Chen; Huiyan Sun; Yanchun Liang; Enrico Blanzieri

Collaboration

Top Co-Authors

You Zhou

University of Trento

Wei Pang

University of Aberdeen
