Qifeng Zhou | Researchain

Publication

Featured research published by Qifeng Zhou.

Knowledge Based Systems | 2016

Cost-sensitive feature selection using random forest: Selecting low-cost subsets of informative features

Qifeng Zhou; Hao Zhou; Tao Li

Feature selection aims to select a small subset of informative features that contain most of the information related to a given task. Existing feature selection methods often assume that all features have the same cost. However, in many real-world applications, different features may have different costs (e.g., the different tests a patient might take in medical diagnosis). Ignoring feature cost may produce feature subsets that are good in theory but cannot be used in practice. In this paper, we propose a random forest-based feature selection algorithm that incorporates feature cost into the base decision tree construction process to produce low-cost feature subsets. In particular, when constructing a base tree, a feature is randomly selected with a probability inversely proportional to its associated cost. We evaluate the proposed method on a number of UCI datasets and apply it to a medical diagnosis problem where the real feature costs are estimated by experts. The experimental results demonstrate that our feature-cost-sensitive random forest (FCS-RF) is able to select a low-cost subset of informative features and achieves better performance than other state-of-the-art feature selection methods in real-world problems.
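
The sampling step described above (a feature is drawn with probability inversely proportional to its cost) can be sketched as follows. This is only an illustration under stated assumptions, not the FCS-RF implementation: the function name `sample_features`, the without-replacement draw, and the example costs are hypothetical.

```python
import random

def sample_features(costs, k, rng=None):
    """Draw k distinct feature indices; each draw is weighted by 1 / cost.

    Hypothetical helper: the abstract only states that the selection
    probability is inversely proportional to the feature cost.
    """
    rng = rng or random.Random()
    if k > len(costs):
        raise ValueError("cannot draw more features than exist")
    if any(c <= 0 for c in costs):
        raise ValueError("costs must be positive")
    remaining = list(range(len(costs)))
    weights = [1.0 / costs[i] for i in remaining]
    chosen = []
    for _ in range(k):
        # Pick a position in `remaining`, favouring cheap features.
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return chosen

# Example: four features with made-up costs; cheap ones are drawn more often.
candidates = sample_features([1.0, 5.0, 2.0, 10.0], k=2, rng=random.Random(0))
```

In a random forest, a draw like this would replace the uniform choice of candidate features at each tree or split.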

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2011

Improving the Computational Efficiency of Recursive Cluster Elimination for Gene Selection

Linkai Luo; Dengfeng Huang; Lingjun Ye; Qifeng Zhou; Gui-Fang Shao; Hong Peng

Gene expression data usually comprise a large number of genes and a relatively small number of samples, which brings many new challenges. Selecting the informative genes becomes the main issue in microarray data analysis. Recursive cluster elimination based on support vector machine (SVM-RCE) has shown better classification accuracy on some microarray data sets than recursive feature elimination based on support vector machine (SVM-RFE). However, SVM-RCE is extremely time-consuming. In this paper, we propose an improved version of SVM-RCE called ISVM-RCE. ISVM-RCE first trains an SVM model with all clusters, then scores each cluster by the infinity norm of the weight coefficient vector within that cluster, and finally eliminates the gene clusters with the lowest scores. In addition, when the number of clusters is small, ISVM-RCE eliminates genes within the clusters instead of removing whole clusters. We have tested ISVM-RCE on six gene expression data sets and compared its performance with SVM-RCE and linear-discriminant-analysis-based RFE (LDA-RFE). The experimental results on these data sets show that ISVM-RCE greatly reduces the time cost of SVM-RCE while obtaining classification performance comparable to that of SVM-RCE, whereas LDA-RFE is not stable.
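
The cluster-scoring step (infinity norm of the weight coefficients in each cluster, then dropping the lowest-scoring clusters) might look like the sketch below. The weight vector is assumed to come from an already trained linear SVM; the names `score_clusters` and `eliminate_lowest` and the sample numbers are hypothetical.

```python
def score_clusters(weights, clusters):
    """Score each cluster by the infinity norm (largest |w|) of its genes.

    weights  : list of linear-SVM weight coefficients, one per gene (assumed given)
    clusters : dict mapping cluster id -> list of gene indices
    """
    return {cid: max(abs(weights[g]) for g in genes)
            for cid, genes in clusters.items()}

def eliminate_lowest(weights, clusters, n_drop=1):
    """Return a copy of `clusters` without the n_drop lowest-scoring clusters."""
    scores = score_clusters(weights, clusters)
    dropped = set(sorted(scores, key=scores.get)[:n_drop])
    return {cid: genes for cid, genes in clusters.items() if cid not in dropped}

# Made-up weights for five genes grouped into three clusters.
w = [0.1, -0.9, 0.2, 0.05, -0.3]
groups = {"a": [0, 1], "b": [2, 3], "c": [4]}
kept = eliminate_lowest(w, groups)  # cluster "b" has the smallest score
```

Because one SVM fit scores every cluster at once, a step like this avoids retraining a model per cluster, which is consistent with the time savings the abstract reports.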

Applied Soft Computing | 2015

Structural damage detection based on posteriori probability support vector machine and Dempster-Shafer evidence theory

Qifeng Zhou; Hao Zhou; Qingqing Zhou; Fan Yang; Linkai Luo; Tao Li

Proposed a stable structural damage detection method based on information fusion. Improved the accuracy and stability of damage detection compared with using a single sensor. Mitigated the impact of individual sensor failures, giving more robust detection results. An intelligent detection method is proposed in this paper to enrich the study of applying machine learning and data mining techniques to building structural damage identification. The proposed method integrates multi-sensory data fusion and classifier ensembles to detect the location and extent of the damage. First, wavelet packet analysis is used to transform the original vibration acceleration signal into energy features. Then posteriori probability support vector machines (PPSVM) and the Dempster-Shafer (DS) evidence theory are combined to identify the damage. An empirical study on a benchmark structure model shows that, compared with popular data mining approaches, the proposed method provides more accurate and stable detection results. Furthermore, this paper compares the detection performance of information fusion at different levels. The experimental analysis demonstrates that the proposed method, with fusion at the decision level, makes good use of multi-sensory information and is more robust in practice.
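
For readers unfamiliar with decision-level fusion, Dempster's rule of combination can be sketched as below. This shows the standard rule only; how the paper turns PPSVM posterior probabilities into mass functions is not described in the abstract, so the two sensor mass functions here are made up.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + pa * pb
            else:
                conflict += pa * pb  # mass assigned to contradictory pairs
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass.
    return {h: v / (1.0 - conflict) for h, v in combined.items()}

# Hypothetical evidence from two sensors about one structural element.
DAMAGED = frozenset({"damaged"})
EITHER = frozenset({"damaged", "intact"})  # full frame: "don't know"
sensor1 = {DAMAGED: 0.8, EITHER: 0.2}
sensor2 = {DAMAGED: 0.6, EITHER: 0.4}
fused = combine(sensor1, sensor2)  # belief in "damaged" rises to about 0.92
```

Two moderately confident sensors reinforce each other here, which is the kind of behaviour that makes fusion at the decision level tolerant of a single weak or failed sensor.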

Expert Systems With Applications | 2015

Two approaches for novelty detection using random forest

Qifeng Zhou; Hao Zhou; Yongpeng Ning; Fan Yang; Tao Li

A framework for novelty detection using random forest is proposed. Two specific approaches using the vote distribution and the proximity matrix are presented. A comprehensive empirical study on both synthetic and real-world datasets is conducted. In many online classification tasks or in non-exhaustive learning, it is often impossible to define a training set with a complete set of classes. The presence of new classes, as well as novelties caused by data errors, can severely affect the performance of classifiers. Traditional proximity-based approaches usually use distance to measure the proximity of different samples. In this study, we propose a framework that uses ensemble learning to detect novelty based on Random Forest (RF). The proposed framework is based on the observation that an ensemble of classifiers can provide a kind of metric to characterize different classes and measure their proximity. In particular, we apply ensemble methods with decision trees as base classifiers and present two specific approaches, RFV and RFP, based on random forest. RFV uses the vote distribution of RF on a testing sample, and RFP takes the proximity matrix of RF as a special kernel metric to discover the novelty. The proposed approaches are compared against two common approaches, support vector domain description (SVDD) and the Gaussian mixture model (GMM), on one artificial data set and five benchmark data sets. The experimental results show that the proposed methods achieve better performance in terms of accuracy and recall.
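
The idea behind RFV, using the forest's vote distribution on a test sample, can be illustrated with a simple rule: if no known class attracts a clear majority of tree votes, flag the sample. The abstract does not give the exact decision rule, so the threshold test, the names `vote_distribution` and `is_novel`, and the value 0.6 are assumptions.

```python
from collections import Counter

def vote_distribution(tree_votes):
    """Fraction of trees voting for each class label."""
    counts = Counter(tree_votes)
    total = len(tree_votes)
    return {label: c / total for label, c in counts.items()}

def is_novel(tree_votes, threshold=0.6):
    """Flag a sample as novel when the top vote share is below `threshold`.

    Assumed rule for illustration only; the paper's criterion may differ.
    """
    return max(vote_distribution(tree_votes).values()) < threshold

# Votes from a hypothetical 10-tree forest for two test samples.
confident = ["a"] * 9 + ["b"]          # trees agree: looks like a known class
scattered = ["a", "b", "c", "a", "b",
             "c", "a", "b", "c", "a"]  # trees disagree: possibly a new class
```

With a real forest, the per-tree votes would come from each base tree's prediction on the sample.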

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2013

Using the Maximum Between-Class Variance for Automatic Gridding of cDNA Microarray Images

Gui-Fang Shao; Fan Yang; Qian Zhang; Qifeng Zhou; Linkai Luo

Gridding is the first and most important step in microarray image analysis: it separates the spots into distinct areas. Human intervention is necessary for most gridding methods, and even some so-called fully automatic approaches need preset parameters. The applicability of these methods is limited in certain domains, and they can cause variations in the gene expression results. In addition, improper gridding, which is influenced by both misalignment and high noise levels, affects high-throughput analysis. In this paper, we present a fully automatic gridding technique that overcomes the limitations of traditional mathematical morphology gridding methods. First, a preprocessing algorithm is applied for noise reduction. Subsequently, the optimal threshold is obtained using an improved Otsu method to locate each spot. To reduce the error, the initial gridding result is optimized with heuristic techniques that estimate the distribution of the spots. Extensive experiments on six different data sets indicate that our method is superior to the traditional morphology method and is robust in the presence of noise. More importantly, the algorithm involved in our method is simple. Furthermore, human intervention and parameter presetting are unnecessary when the algorithm is applied to different types of microarray images.
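
The maximum between-class variance criterion in the title is the basis of Otsu's method. The sketch below is the classical version, not the improved variant the paper proposes (whose details are not in the abstract); the function name and the toy pixel data are illustrative.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance
    when pixels <= t form one class and pixels > t the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0  # pixel count and intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue      # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break         # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# Toy "image": dark background pixels and bright spot pixels.
t = otsu_threshold([10] * 50 + [200] * 50)
```

Any threshold between the two intensity groups separates them equally well in this toy case; the loop keeps the first maximiser it finds.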

Structural Health Monitoring-an International Journal | 2013

Structural damage detection method based on random forests and data fusion

Qifeng Zhou; Yongpeng Ning; Qingqing Zhou; Linkai Luo; Jiayan Lei

A structural damage detection method integrating data fusion and random forests is proposed. The original acceleration signals are transformed into energy features by wavelet packet decomposition. The processed energy features are then fused into new energy features by data fusion, which further enlarges the differences among the damage types. Finally, random forests are used as an effective classifier to detect multiclass damage. A numerical study on the benchmark model and an eight-storey steel shear frame structure model was carried out to validate the accuracy of the proposed damage detection method. The experimental results indicate that the method based on random forests and data fusion improves damage detection accuracy in comparison with random forests alone, support vector machines alone, and support vector machines combined with data fusion. Moreover, the proposed method is significantly more stable than several other methods.

International Conference on Intelligent Computing | 2009

A new SVM-RFE approach towards ranking problem

Qifeng Zhou; Wencai Hong; Guifang Shao; Wei-You Cai

Support Vector Machine Recursive Feature Elimination (SVM-RFE) is a simple and efficient feature selection algorithm that has been used in many fields. Just like SVM itself, SVM-RFE was originally designed to solve binary feature selection problems. In this paper, we propose a new recursive feature elimination method based on SVM for ranking problems. In contrast to standard approaches that treat ranking as a multiclass classification problem, our approach enables the use of standard binary SVM-RFE algorithms for ranking problems. We evaluate our algorithm on both a public dataset and a real-world credit evaluation problem. The results demonstrate the superiority of our algorithm over SVM-RFE extended to multiclass problems using ensemble techniques.
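
The abstract does not say how ranking is reduced to a binary problem. One common reduction, shown here purely as an assumption, is the pairwise transform: each pair of differently ranked items becomes a difference vector with a +1 or -1 label, and a binary method such as SVM-RFE can then be run on those pairs. The function name and data are hypothetical.

```python
def pairwise_transform(X, ranks):
    """Turn ranked items into a binary classification data set.

    X     : list of feature vectors
    ranks : list of rank scores (higher means ranked above)
    Returns difference vectors X[i] - X[j] and labels in {+1, -1}.
    Pairs with equal rank carry no ordering information and are skipped.
    """
    diffs, labels = [], []
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if ranks[i] == ranks[j]:
                continue
            diffs.append([a - b for a, b in zip(X[i], X[j])])
            labels.append(1 if ranks[i] > ranks[j] else -1)
    return diffs, labels

# Three items with two features each; items 0 and 2 share the top rank.
D, y = pairwise_transform([[1, 0], [0, 1], [2, 2]], [2, 1, 2])
```

Features whose weights are small in a linear model trained on such pairs contribute little to the ordering, which is what a recursive elimination loop would exploit.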

International Conference on Control, Automation, Robotics and Vision | 2006

A Study on Piecewise Polynomial Smooth Approximation to the Plus Function

Linkai Luo; Chengde Lin; Hong Peng; Qifeng Zhou

In the smooth support vector machine (SSVM), the plus function must be approximated by some smooth function, and the approximation error affects the classification ability. This paper studies the smooth approximation of the plus function by piecewise polynomials. First, some standard piecewise polynomial smooth approximation problems are formulated. Then, the existence and uniqueness of solutions to these problems are proved and the analytic solutions are obtained. A comparison between the results in this paper and previous ones shows that the piecewise polynomial functions in this paper achieve a better approximation of the plus function.
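
As a concrete picture of what a piecewise polynomial approximation of the plus function looks like, the sketch below uses a simple quadratic patch on [-1/k, 1/k] that joins 0 and x with matching values and first derivatives. It is a well-known textbook-style choice shown for illustration, not necessarily any of the solutions derived in the paper; `k` is a smoothing parameter introduced here.

```python
def plus(x):
    """The plus function (x)_+ = max(x, 0)."""
    return max(x, 0.0)

def smooth_plus(x, k=10.0):
    """Piecewise quadratic C^1 approximation of the plus function.

    Equals 0 for x <= -1/k, x for x >= 1/k, and a quadratic in between.
    The largest error is 1 / (4k), attained at x = 0.
    """
    if x <= -1.0 / k:
        return 0.0
    if x >= 1.0 / k:
        return x
    return k / 4.0 * x * x + x / 2.0 + 1.0 / (4.0 * k)

# Worst-case gap on a grid over [-0.5, 0.5] for k = 10.
grid = [i / 1000.0 for i in range(-500, 501)]
max_err = max(abs(smooth_plus(x) - plus(x)) for x in grid)
```

Increasing `k` narrows the patched interval and shrinks the error, at the price of a larger second derivative inside the patch.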

Expert Systems With Applications | 2017

VRer: Context-Based Venue Recommendation using embedded space ranking SVM in location-based social network

Bin Xia; Zhen Ni; Tao Li; Qianmu Li; Qifeng Zhou

Venue recommendation has attracted a lot of research attention with the rapid development of Location-Based Social Networks (LBSNs). The effectiveness of venue recommendation largely depends on how well it captures users’ contexts or preferences. However, it is quite difficult, if not impossible, to capture complete information about users’ preferences. In addition, users’ preferences are often heterogeneous (i.e., some preferences are static and common to all users while others are dynamic and diverse). Existing venue recommendation methods do not address these issues well and often recommend the most popular, the cheapest, or the closest venues based on simple contexts. In this paper, we cast venue recommendation as a ranking problem and propose a recommendation framework named VRer (Context-Based Venue Recommendation using embedded space ranking SVM), which employs an embedded space ranking SVM model to separate the venues in terms of different characteristics. Our proposed approach makes use of ‘check-in’ data to capture users’ preferences and utilizes a machine learning model to tune the importance of different factors in ranking. The major contributions of this paper are: (1) VRer combines various contexts (e.g., the temporal influence and the category of locations) with the check-in records to capture individual heterogeneous preferences; (2) we propose an embedded space ranking SVM that optimizes the learning function to reduce the time needed to train the personalized recommendation model for each group or user; (3) we evaluate our proposed approach on a real-world LBSN and compare it with other baseline methods. Experimental results demonstrate the benefits of our proposed approach.

Neurocomputing | 2017

Cluster ensemble selection with constraints

Fan Yang; Tao Li; Qifeng Zhou; Han Xiao

Clustering ensembles have emerged as an important tool for data analysis, by which a more robust and accurate consensus clustering can be generated. On forming the ensembles, empirical studies have suggested that better ensembles can be obtained by simultaneously considering the quality of the ensembles and the diversity among ensemble members. However, little research effort has been devoted to incorporating prior background knowledge. In this paper, we first provide a theoretical analysis of the effect of the diversity and quality of the ensemble members. We then propose a unified framework to solve the constraint-based clustering ensemble selection problem, where some instance-level must-link and cannot-link constraints are given as prior knowledge or background information. We formalize this problem as a combinatorial optimization problem in terms of the consistency under the constraints, the diversity among ensemble members, and the overall quality of the ensembles. Our proposed framework brings together two distinct yet interrelated themes from clustering: ensemble clustering and semi-supervised clustering. We study different techniques for searching for high-quality solutions. Experiments on benchmark datasets demonstrate the effectiveness of our framework.
