Publications


Featured research published by Zhi-Hua Zhou.


Knowledge and Information Systems | 2007

Top 10 algorithms in data mining

Xindong Wu; Vipin Kumar; J. Ross Quinlan; Joydeep Ghosh; Qiang Yang; Hiroshi Motoda; Geoffrey J. McLachlan; Angus F. M. Ng; Bing Liu; Philip S. Yu; Zhi-Hua Zhou; Michael Steinbach; David J. Hand; Dan Steinberg

This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.
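
As a small illustration of the list above, here is a minimal sketch of one of the ten algorithms, k-Means, using scikit-learn; the synthetic data and the parameter choices are assumptions for demonstration only, not from the paper:

```python
# Minimal k-Means illustration with scikit-learn; the two Gaussian
# blobs and k=2 are arbitrary choices for demonstration.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic Gaussian clusters in 2-D.
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster centers:\n", km.cluster_centers_)
print("first 10 assignments:", km.labels_[:10])
```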


Pattern Recognition | 2007

ML-KNN: A lazy learning approach to multi-label learning

Min-Ling Zhang; Zhi-Hua Zhou

Multi-label learning originated from the investigation of the text categorization problem, where each document may belong to several predefined topics simultaneously. In multi-label learning, the training set is composed of instances each associated with a set of labels, and the task is to predict the label sets of unseen instances by analyzing training instances with known label sets. In this paper, a multi-label lazy learning approach named ML-KNN is presented, which is derived from the traditional K-nearest neighbor (KNN) algorithm. In detail, for each unseen instance, its K nearest neighbors in the training set are first identified. Then, based on statistical information gained from the label sets of these neighboring instances, i.e., the number of neighboring instances belonging to each possible class, the maximum a posteriori (MAP) principle is used to determine the label set for the unseen instance. Experiments on three different real-world multi-label learning problems, i.e., yeast gene functional analysis, natural scene classification, and automatic web page categorization, show that ML-KNN achieves performance superior to some well-established multi-label learning algorithms.
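
The following is a compact numpy/scikit-learn sketch of the ML-KNN idea as described in the abstract, not the authors' code: count how many of a query's K nearest neighbors carry each label, then apply the MAP rule with frequency-based estimates. The smoothing constant s and K=10 follow common defaults and are assumptions here.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mlknn_fit(X, Y, K=10, s=1.0):
    """X: (m, d) features; Y: (m, L) binary label matrix."""
    m, L = Y.shape
    nn = NearestNeighbors(n_neighbors=K + 1).fit(X)
    idx = nn.kneighbors(X, return_distance=False)[:, 1:]   # drop self
    prior1 = (s + Y.sum(axis=0)) / (2 * s + m)             # P(H_l = 1)
    c = Y[idx].sum(axis=1).astype(int)                     # neighbor label counts
    kh1 = np.zeros((L, K + 1)); kh0 = np.zeros((L, K + 1))
    for l in range(L):
        for j in range(K + 1):
            kh1[l, j] = np.sum((Y[:, l] == 1) & (c[:, l] == j))
            kh0[l, j] = np.sum((Y[:, l] == 0) & (c[:, l] == j))
    # Smoothed likelihoods P(C_l = j | H_l) estimated by counting.
    lik1 = (s + kh1) / (s * (K + 1) + kh1.sum(1, keepdims=True))
    lik0 = (s + kh0) / (s * (K + 1) + kh0.sum(1, keepdims=True))
    return dict(nn=nn, Y=Y, K=K, prior1=prior1, lik1=lik1, lik0=lik0)

def mlknn_predict(model, Xq):
    nn, Y, K = model["nn"], model["Y"], model["K"]
    idx = nn.kneighbors(Xq, n_neighbors=K, return_distance=False)
    c = Y[idx].sum(axis=1).astype(int)                     # (n, L)
    L = Y.shape[1]
    post1 = model["prior1"] * model["lik1"][np.arange(L), c]
    post0 = (1 - model["prior1"]) * model["lik0"][np.arange(L), c]
    return (post1 > post0).astype(int)                     # MAP decision

# Toy usage on random data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)); Y = (rng.random((100, 3)) < 0.3).astype(int)
model = mlknn_fit(X, Y)
print(mlknn_predict(model, X[:3]))
```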


Artificial Intelligence | 2002

Ensembling neural networks: many could be better than all

Zhi-Hua Zhou; Jianxin Wu; Wei Tang

A neural network ensemble is a learning paradigm in which many neural networks are jointly used to solve a problem. In this paper, the relationship between an ensemble and its component neural networks is analyzed in the context of both regression and classification, which reveals that it may be better to ensemble many instead of all of the neural networks at hand. This result is interesting because, at present, most approaches ensemble all the available neural networks for prediction. Then, to show that the appropriate neural networks for composing an ensemble can be effectively selected from a set of available neural networks, an approach named GASEN is presented. GASEN first trains a number of neural networks. It then assigns random weights to those networks and employs a genetic algorithm to evolve the weights so that they characterize, to some extent, the fitness of the neural networks in constituting an ensemble. Finally, it selects some neural networks based on the evolved weights to make up the ensemble. A large empirical study shows that, compared with popular ensemble approaches such as Bagging and Boosting, GASEN can generate neural network ensembles with far smaller sizes but stronger generalization ability. Furthermore, to understand the working mechanism of GASEN, the bias-variance decomposition of the error is provided, which shows that the success of GASEN may lie in its ability to significantly reduce both the bias and the variance.
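
Below is a simplified sketch of the GASEN idea under several assumptions (scikit-learn MLPs as the component networks, a toy dataset, and a deliberately basic genetic algorithm); it is illustrative, not the paper's implementation:

```python
# Simplified GASEN sketch: evolve a weight vector over trained networks
# with a basic GA, then keep networks whose weight exceeds the average.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, random_state=0)
Xtr, Xva, ytr, yva = train_test_split(X, y, test_size=0.3, random_state=0)

N = 10   # number of candidate networks
nets = [MLPClassifier(hidden_layer_sizes=(8,), max_iter=500,
                      random_state=i).fit(Xtr, ytr) for i in range(N)]
P = np.stack([m.predict_proba(Xva)[:, 1] for m in nets])    # (N, n_va)

def fitness(w):
    """Validation accuracy of the w-weighted ensemble."""
    pred = (w @ P / w.sum()) > 0.5
    return (pred == yva).mean()

# Tiny GA: truncation selection, blend crossover, Gaussian mutation.
pop = rng.random((30, N))
for _ in range(50):
    scores = np.array([fitness(w) for w in pop])
    elite = pop[np.argsort(scores)[-10:]]                   # keep best 10
    parents = elite[rng.integers(0, 10, (30, 2))]
    pop = parents.mean(axis=1) + rng.normal(0, 0.05, (30, N))
    pop = np.clip(pop, 1e-6, None)

w = max(pop, key=fitness)
# GASEN-style selection: weight above the average (1/N after normalizing).
chosen = [m for m, wi in zip(nets, w) if wi > w.mean()]
print(f"selected {len(chosen)} of {N} networks")
```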


IEEE Transactions on Knowledge and Data Engineering | 2014

A Review on Multi-Label Learning Algorithms

Min-Ling Zhang; Zhi-Hua Zhou

Multi-label learning studies the problem where each example is represented by a single instance while being associated with a set of labels simultaneously. During the past decade, a significant amount of progress has been made on this emerging machine learning paradigm. This paper aims to provide a timely review of the area, with emphasis on state-of-the-art multi-label learning algorithms. First, fundamentals of multi-label learning, including the formal definition and evaluation metrics, are given. Second, and primarily, eight representative multi-label learning algorithms are scrutinized under common notations, with relevant analyses and discussions. Third, several related learning settings are briefly summarized. In conclusion, online resources and open research problems in multi-label learning are outlined for reference purposes.
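
Since the review covers evaluation metrics, here is a minimal numpy sketch of two standard multi-label metrics on toy predictions (the toy matrices are assumptions for illustration):

```python
# Hamming loss and subset accuracy, two common multi-label metrics.
import numpy as np

Y_true = np.array([[1, 0, 1], [0, 1, 0]])
Y_pred = np.array([[1, 0, 0], [0, 1, 0]])

hamming_loss = np.mean(Y_true != Y_pred)           # fraction of wrong labels
subset_acc   = np.mean((Y_true == Y_pred).all(1))  # exact-match ratio
print(hamming_loss, subset_acc)                    # 0.1666..., 0.5
```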


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2007

Automatic Age Estimation Based on Facial Aging Patterns

Xin Geng; Zhi-Hua Zhou; Kate Smith-Miles

While recognition of most facial variations, such as identity, expression, and gender, has been extensively studied, automatic age estimation has rarely been explored. In contrast to other facial variations, aging presents several unique characteristics that make age estimation a challenging task. This paper proposes an automatic age estimation method named AGES (AGing pattErn Subspace). The basic idea is to model the aging pattern, defined as the sequence of a particular individual's face images sorted in time order, by constructing a representative subspace. The proper aging pattern for a previously unseen face image is determined by the projection in the subspace that can reconstruct the face image with minimum reconstruction error, while the position of the face image in that aging pattern then indicates its age. In the experiments, AGES and its variants are compared with the limited existing age estimation methods (WAS and AAS) and some well-established classification methods (kNN, BP, C4.5, and SVM). Moreover, a comparison with human perception ability on age is conducted. It is interesting to note that the performance of AGES is not only significantly better than that of all the other algorithms but also comparable to that of the human observers.
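
The sketch below illustrates the AGES idea under strong simplifying assumptions (toy feature vectors instead of face images, complete training patterns, plain PCA): an aging pattern stacks one feature vector per age, a query face is slotted into each age position in turn, the subspace reconstructs the pattern from that single observed slot, and the slot with the lowest reconstruction error gives the age estimate.

```python
import numpy as np

A, d, k = 5, 16, 8          # ages per pattern, feature dim, subspace dim
rng = np.random.default_rng(0)
patterns = rng.normal(size=(40, A * d))      # toy complete aging patterns

mu = patterns.mean(axis=0)
# Principal directions via SVD of the centered data.
_, _, Vt = np.linalg.svd(patterns - mu, full_matrices=False)
W = Vt[:k]                                   # (k, A*d)

def estimate_age(face):
    """face: (d,) feature vector of an unseen person."""
    errs = []
    for a in range(A):
        obs = slice(a * d, (a + 1) * d)      # slot the face at age a
        # Least-squares projection using only the observed coordinates.
        z, *_ = np.linalg.lstsq(W[:, obs].T, face - mu[obs], rcond=None)
        recon = mu[obs] + W[:, obs].T @ z
        errs.append(np.sum((face - recon) ** 2))
    return int(np.argmin(errs))              # age index with min error

print("estimated age index:", estimate_age(rng.normal(size=d)))
```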


IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) | 2009

Exploratory Undersampling for Class-Imbalance Learning

Xu-Ying Liu; Jianxin Wu; Zhi-Hua Zhou

Undersampling is a class-imbalance learning method that uses only a subset of the majority-class examples and is therefore very efficient. Its main deficiency is that many majority-class examples are ignored. We propose two algorithms to overcome this deficiency. EasyEnsemble samples several subsets from the majority class, trains a learner using each of them, and combines the outputs of those learners. BalanceCascade is similar to EasyEnsemble except that it removes from further consideration the majority-class examples correctly classified by the trained learners. Experiments show that both of the proposed algorithms have better AUC scores than many existing class-imbalance learning methods. Moreover, they have approximately the same training time as undersampling, which trains significantly faster than other methods.
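
Here is a minimal EasyEnsemble sketch following the description above (simplified; the synthetic dataset, AdaBoost as the base learner, and T=10 subsets are assumptions):

```python
# EasyEnsemble sketch: draw balanced subsets of the majority class,
# train one AdaBoost learner per subset, and average their scores.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=0)
maj, mino = np.where(y == 0)[0], np.where(y == 1)[0]

T = 10                                        # number of balanced subsets
learners = []
for _ in range(T):
    sub = rng.choice(maj, size=len(mino), replace=False)   # undersample
    idx = np.concatenate([sub, mino])
    learners.append(AdaBoostClassifier(random_state=0).fit(X[idx], y[idx]))

# Combine by averaging the positive-class probabilities of all learners.
score = np.mean([m.predict_proba(X)[:, 1] for m in learners], axis=0)
print("training AUC:", roc_auc_score(y, score))
```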


IEEE Transactions on Knowledge and Data Engineering | 2006

Training cost-sensitive neural networks with methods addressing the class imbalance problem

Zhi-Hua Zhou; Xu-Ying Liu

This paper studies empirically the effect of sampling and threshold-moving in training cost-sensitive neural networks. Both oversampling and undersampling are considered. These techniques modify the distribution of the training data such that the costs of the examples are conveyed explicitly by the appearances of the examples. Threshold-moving tries to move the output threshold toward inexpensive classes such that examples with higher costs become harder to misclassify. Moreover, hard-ensemble and soft-ensemble, i.e., the combination of the above techniques via hard or soft voting schemes, are also tested. Twenty-one UCI data sets with three types of cost matrices and a real-world cost-sensitive data set are used in the empirical study. The results suggest that cost-sensitive learning with multiclass tasks is more difficult than with two-class tasks, and that a higher degree of class imbalance may increase the difficulty. The study also reveals that almost all the techniques are effective on two-class tasks, while most are ineffective, and may even cause a negative effect, on multiclass tasks. Overall, threshold-moving and soft-ensemble are relatively good choices for training cost-sensitive neural networks. The empirical study also suggests that some methods that have been believed to be effective in addressing the class imbalance problem may, in fact, only be effective on learning with imbalanced two-class data sets.
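
A minimal sketch of threshold-moving, one of the techniques studied: rescale a trained network's class probabilities by misclassification costs and predict the cost-adjusted argmax. The probability and cost values below are toy assumptions.

```python
import numpy as np

# Probabilities from some trained classifier over 3 classes (toy values).
proba = np.array([[0.5, 0.3, 0.2],
                  [0.6, 0.25, 0.15]])
cost = np.array([1.0, 3.0, 9.0])       # cost of misclassifying each class

plain = proba.argmax(axis=1)           # cost-blind prediction
moved = (proba * cost).argmax(axis=1)  # threshold-moving toward costly classes
print(plain, moved)                    # [0 0] [2 2]
```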


IEEE Transactions on Knowledge and Data Engineering | 2006

Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization

Min-Ling Zhang; Zhi-Hua Zhou

In multilabel learning, each instance in the training set is associated with a set of labels, and the task is to output a label set whose size is unknown a priori for each unseen instance. In this paper, this problem is addressed by proposing a neural network algorithm named BP-MLL, i.e., backpropagation for multilabel learning. It is derived from the popular backpropagation algorithm by employing a novel error function that captures the characteristics of multilabel learning, i.e., that the labels belonging to an instance should be ranked higher than those not belonging to that instance. Applications to two real-world multilabel learning problems, i.e., functional genomics and text categorization, show that the performance of BP-MLL is superior to that of some well-established multilabel learning algorithms.
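
The following numpy sketch shows the pairwise ranking error that BP-MLL minimizes, as described above (not the authors' code): for each instance, every relevant/irrelevant label pair (p, q) contributes exp(-(c_p - c_q)), normalized by the number of pairs.

```python
import numpy as np

def bpmll_error(c, y):
    """c: (L,) network outputs; y: (L,) binary relevance vector."""
    pos, neg = c[y == 1], c[y == 0]
    if len(pos) == 0 or len(neg) == 0:
        return 0.0
    diffs = pos[:, None] - neg[None, :]     # c_p - c_q for all pairs
    return np.exp(-diffs).sum() / (len(pos) * len(neg))

# Relevant labels 0 and 2 are ranked above irrelevant label 1: small error.
print(bpmll_error(np.array([0.9, 0.1, 0.8]), np.array([1, 0, 1])))
```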


IEEE Transactions on Knowledge and Data Engineering | 2005

Tri-training: exploiting unlabeled data using three classifiers

Zhi-Hua Zhou; Ming Li

In many practical data mining applications, such as Web page classification, unlabeled training examples are readily available, but labeled ones are fairly expensive to obtain. Therefore, semi-supervised learning algorithms such as co-training have attracted much attention. In this paper, a new co-training style semi-supervised learning algorithm, named tri-training, is proposed. This algorithm generates three classifiers from the original labeled example set. These classifiers are then refined using unlabeled examples in the tri-training process. In detail, in each round of tri-training, an unlabeled example is labeled for a classifier if the other two classifiers agree on the labeling, under certain conditions. Since tri-training neither requires the instance space to be described with sufficient and redundant views nor does it put any constraints on the supervised learning algorithm, its applicability is broader than that of previous co-training style algorithms. Experiments on UCI data sets and application to the Web page classification task indicate that tri-training can effectively exploit unlabeled data to enhance the learning performance.
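
Below is a deliberately simplified tri-training sketch (it omits the error-rate conditions of the full algorithm; the dataset, decision-tree learners, and round count are assumptions): three bootstrap-trained classifiers, where in each round a classifier receives the unlabeled examples on which the other two agree, labeled with that agreed prediction, and is retrained.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, random_state=0)
lab, unl = np.arange(50), np.arange(50, 1000)   # few labels, many unlabeled

def bootstrap_fit():
    idx = rng.choice(lab, size=len(lab), replace=True)
    return DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx])

clfs = [bootstrap_fit() for _ in range(3)]
for _ in range(5):                               # a few tri-training rounds
    preds = [c.predict(X[unl]) for c in clfs]
    for i in range(3):
        j, k = [t for t in range(3) if t != i]
        agree = preds[j] == preds[k]             # the other two concur
        Xi = np.vstack([X[lab], X[unl][agree]])
        yi = np.concatenate([y[lab], preds[j][agree]])
        clfs[i] = DecisionTreeClassifier(random_state=0).fit(Xi, yi)

# Final prediction by majority vote of the three refined classifiers.
vote = np.round(np.mean([c.predict(X) for c in clfs], axis=0))
print("accuracy after tri-training:", (vote == y).mean())
```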


Neurocomputing | 2005

(2D)^2PCA: Two-directional two-dimensional PCA for efficient face representation and recognition

Daoqiang Zhang; Zhi-Hua Zhou

Recently, a new technique called two-dimensional principal component analysis (2DPCA) was proposed for face representation and recognition. The main idea behind 2DPCA is that it is based on 2D matrices, as opposed to standard PCA, which is based on 1D vectors. Although 2DPCA obtains higher recognition accuracy than PCA, a vital unresolved problem of 2DPCA is that it needs many more coefficients for image representation than PCA. In this paper, we first show that 2DPCA essentially works in the row direction of images, and then propose an alternative 2DPCA that works in the column direction. By simultaneously considering the row and column directions, we develop the two-directional 2DPCA, i.e., (2D)^2PCA, for efficient face representation and recognition. Experimental results on the ORL database and a subset of the FERET face database show that (2D)^2PCA achieves the same or even higher recognition accuracy than 2DPCA, while requiring far fewer coefficients for image representation.
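
A minimal numpy sketch of (2D)^2PCA as described above (random matrices stand in for face images; the image size and the number of retained directions are assumptions): 2DPCA in the row direction gives a projection X, the column-direction variant gives Z, and an image A is represented by the small matrix C = Z^T A X.

```python
import numpy as np

imgs = np.random.default_rng(0).normal(size=(50, 32, 32))  # toy "face" images
mean = imgs.mean(axis=0)
D = imgs - mean

# Row-direction scatter (standard 2DPCA) and column-direction scatter.
G_row = np.mean([d.T @ d for d in D], axis=0)
G_col = np.mean([d @ d.T for d in D], axis=0)

def top_eigvecs(G, k):
    vals, vecs = np.linalg.eigh(G)      # eigenvalues in ascending order
    return vecs[:, -k:]                 # keep the k largest directions

X = top_eigvecs(G_row, 8)               # (32, 8) row-direction projection
Z = top_eigvecs(G_col, 8)               # (32, 8) column-direction projection

A = imgs[0]
C = Z.T @ A @ X                          # (8, 8) compact representation
A_hat = Z @ C @ X.T                      # approximate reconstruction
print("coefficients:", C.size, "vs pixels:", A.size)
```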

Collaboration


Dive into Zhi-Hua Zhou's collaborations.

Top Co-Authors

Songcan Chen (Nanjing University of Aeronautics and Astronautics)
Chao Qian (University of Science and Technology of China)