Publication


Featured research published by Hanchuan Peng.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy

Hanchuan Peng; Fuhui Long; Chris H. Q. Ding

Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion based on mutual information. Because of the difficulty in directly implementing the maximal dependency condition, we first derive an equivalent form, called the minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection. Then, we present a two-stage feature selection algorithm that combines mRMR with other, more sophisticated feature selectors (e.g., wrappers). This allows us to select a compact set of superior features at very low cost. We perform an extensive experimental comparison of our algorithm and other methods using three different classifiers (naive Bayes, support vector machine, and linear discriminant analysis) and four different data sets (handwritten digits, arrhythmia, NCI cancer cell lines, and lymphoma tissues). The results confirm that mRMR leads to promising improvements in feature selection and classification accuracy.
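The first-order incremental selection described in this abstract can be sketched as a greedy loop: at each step, pick the feature whose mutual information with the class label, minus its mean mutual information with the features already chosen, is largest. The sketch below is illustrative only and is not the authors' code. It assumes discrete feature values, and the names mutual_information and mrmr_select, as well as the toy data, are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy incremental selection of k features.

    features: dict mapping a feature name to its list of discrete values.
    Returns the selected feature names in the order they were picked.
    """
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:

        def score(f):
            if not selected:
                return relevance[f]  # first pick: relevance only
            redundancy = sum(
                mutual_information(features[f], features[s]) for s in selected
            ) / len(selected)
            return relevance[f] - redundancy

        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): "a_dup" duplicates "a", "b" is weakly relevant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 1, 1, 1, 1, 1],
    "a_dup": [0, 0, 0, 1, 1, 1, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 1, 1],
}
print(mrmr_select(features, labels, 2))  # ['a', 'b']
```

Ranking by relevance alone would return "a" and "a_dup" on this toy data. The redundancy term penalizes the duplicate, so the second pick is "b" even though its individual relevance is lower.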


Journal of Bioinformatics and Computational Biology | 2005

Minimum redundancy feature selection from microarray gene expression data

Chris H. Q. Ding; Hanchuan Peng

Selecting a small subset out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their ...


Computational Systems Bioinformatics | 2003

Minimum redundancy feature selection from microarray gene expression data

Chris H. Q. Ding; Hanchuan Peng

Selecting a small subset of genes out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their differential expression among phenotypes and pick the top-ranked genes. We observe that feature sets so obtained have certain redundancy and study methods to minimize it. Feature sets obtained through the minimum redundancy-maximum relevance framework represent a broader spectrum of characteristics of phenotypes than those obtained through standard ranking methods; they are more robust, generalize well to unseen data, and lead to significantly improved classifications in extensive experiments on five gene expression data sets.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

Document Image Recognition based on template matching of component block projections

Hanchuan Peng; Fuhui Long; Zheru Chi

Document Image Recognition (DIR), a very useful technique in office automation and digital library applications, finds the most similar template for any input document image in a prestored template document image data set. Existing methods use both local features and global layout information. In this paper, we propose a novel algorithm based on the global matching of Component Block Projections (CBP), which are the concatenated directional projection vectors of the component blocks of a document image. Compared to existing methods, CBP-based template-matching methods possess two major advantages: (1) the spatial relationship among the component blocks of a document image is better represented, hence a very high matching accuracy can be obtained even for a large template set and seriously distorted input images; and (2) the effective matching distance of each template and the triangle inequality are proposed to significantly reduce the computational cost. Our experimental results confirm these advantages and show that the CBP-based template-matching methods are very suitable for DIR applications.
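The two ideas in this abstract, concatenated block projections and triangle-inequality pruning, can be illustrated with a rough sketch. It makes several assumptions that the abstract does not confirm: blocks are binary pixel grids, the directional projections are simply row and column sums, the distance is Euclidean, and the pruning is the generic triangle-inequality bound rather than the paper's "effective matching distance". All function names are hypothetical.

```python
from math import dist


def block_projection(block):
    """Concatenate the row sums and column sums of a binary pixel block."""
    rows = [sum(r) for r in block]
    cols = [sum(c) for c in zip(*block)]
    return rows + cols


def cbp_vector(blocks):
    """Concatenate the projections of all component blocks of one image."""
    vec = []
    for b in blocks:
        vec.extend(block_projection(b))
    return vec


def nearest_template(query, templates, pairwise):
    """Nearest-template search with triangle-inequality pruning.

    pairwise[i][j] is the precomputed distance between templates i and j.
    Returns (best index, best distance, number of distances evaluated).
    """
    best, best_d, evaluated = 0, dist(query, templates[0]), 1
    for i in range(1, len(templates)):
        # d(q, t_i) >= d(t_best, t_i) - d(q, t_best). If that lower bound
        # is already no better than best_d, template i cannot win.
        if pairwise[best][i] - best_d >= best_d:
            continue
        d = dist(query, templates[i])
        evaluated += 1
        if d < best_d:
            best, best_d = i, d
    return best, best_d, evaluated


# Hypothetical template vectors; pairwise distances are computed once, offline.
templates = [[0, 0], [10, 0], [0, 10], [10, 10]]
pairwise = [[dist(a, b) for b in templates] for a in templates]
print(nearest_template([1, 0], templates, pairwise))
```

Because the template-to-template distances are computed ahead of time, a query that lands close to one template can skip most of the remaining comparisons, which is where the cost reduction comes from in this sketch.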


International Conference on Data Mining | 2003

Structure search and stability enhancement of Bayesian networks

Hanchuan Peng; Chris H. Q. Ding

Learning Bayesian network structure from large-scale data sets, without any expert-specified ordering of variables, remains a difficult problem. We propose systematic improvements to automatically learn Bayesian network structure from data. (1) We propose a linear parent search method to generate a candidate graph. (2) We propose a comprehensive approach to eliminate cycles using minimal likelihood loss, a short-cycle-first heuristic, and cut-edge repair. (3) We propose structure perturbation to assess the stability of the network and a stability-improvement method to refine the network structure. The algorithms are easy to implement and efficient for large networks. Experimental results on two data sets show that our new approach outperforms existing methods.
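Step (2) above can be pictured with a small sketch: repeatedly find the shortest directed cycle in the candidate graph and drop the edge in it whose removal costs the least likelihood. This is only one plausible reading of the abstract, not the paper's algorithm. The per-edge loss values are assumed to be given, the cut-edge repair step is omitted, and the function names are hypothetical.

```python
from collections import deque


def shortest_cycle(edges):
    """Return the shortest directed cycle as a list of edges, or None."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    best = None
    for start in sorted(succ):
        # Breadth-first search from start; the first edge found that leads
        # back to start closes a shortest cycle through start.
        parent = {start: None}
        queue = deque([start])
        found = None
        while queue and found is None:
            u = queue.popleft()
            for v in succ.get(u, []):
                if v == start:
                    found = u
                    break
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        if found is None:
            continue
        path = [(found, start)]
        node = found
        while parent[node] is not None:
            path.append((parent[node], node))
            node = parent[node]
        path.reverse()
        if best is None or len(path) < len(best):
            best = path
    return best


def break_cycles(loss):
    """Remove cycles, shortest first, cutting the cheapest edge each time.

    loss: dict mapping a directed edge (u, v) to the likelihood loss
    incurred by dropping it (assumed to be precomputed).
    Returns (set of kept edges, list of removed edges in removal order).
    """
    kept = dict(loss)
    removed = []
    while True:
        cycle = shortest_cycle(kept)
        if cycle is None:
            return set(kept), removed
        weakest = min(cycle, key=lambda e: kept[e])
        removed.append(weakest)
        del kept[weakest]


# Hypothetical candidate graph with per-edge likelihood losses.
loss = {("a", "b"): 5, ("b", "a"): 1, ("b", "c"): 4, ("c", "a"): 2}
print(break_cycles(loss))
```

On this toy graph the two-node cycle is handled first by cutting b -> a, after which the remaining three-node cycle is broken by cutting c -> a.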


Congress on Evolutionary Computation | 2000

A hierarchical distributed genetic algorithm for image segmentation

Hanchuan Peng; Fuhui Long; Zheru Chi; Wan-Chi Siu

A novel hierarchical distributed genetic algorithm is proposed for image segmentation. First, a histogram dichotomy technique is proposed to explore the statistical properties of the input image and produce a hierarchically quantized image. Then a hierarchical distributed genetic algorithm (HDGA) is applied to the quantized image to explore the spatial connectivity and produce the final segmentation result. HDGA is a major improvement over the original distributed genetic algorithm (DGA) and the multiscale distributed genetic algorithm (MDGA) in four aspects: (1) HDGA does not require the number of image regions to be known a priori, yet it can effectively and adaptively control the segmentation quality; (2) the chromosome structure is revised from the original label (multilabel)-condition-fitness format to a more compact (storage-efficient) label-fitness format; (3) the fitness function is revised to utilize the spatial connectivity rather than the original reconstruction error; (4) three revised genetic operations are presented to make the algorithm computationally efficient. Our experiments demonstrate the advantages of HDGA.


International Journal of Neural Systems | 2000

A Semi-Parametric Hybrid Neural Model for Nonlinear Blind Signal Separation

Hanchuan Peng; Zheru Chi; Wan-Chi Siu

Nonlinear blind signal separation is an important but rather difficult problem. Any general nonlinear independent component analysis algorithm for such a problem should specify which solution it tries to find. Several recent neural networks for separating post-nonlinear blind mixtures are limited to diagonal nonlinearity, where there is no cross-channel nonlinearity. In this paper, a new semi-parametric hybrid neural network is proposed to separate post-nonlinearly mixed blind signals where cross-channel disturbance is included. This hybrid network consists of two cascading modules: a neural nonlinear module for approximating the post-nonlinearity and a linear module for separating the predicted linear blind mixtures. The nonlinear module is a semi-parametric expansion made up of two sub-networks, one of which is a linear model and the other a three-layer perceptron. These two sub-networks together produce a weak nonlinear operator and can approach relatively strong nonlinearity by tuning parameters. A batch learning algorithm based on entropy maximization and the gradient descent method is derived. This model is successfully applied to a blind signal separation problem with two sources. Our simulation results indicate that this hybrid model can effectively approach the cross-channel post-nonlinearity and achieve good visual quality as well as a high signal-to-noise ratio in some cases.
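The cascade described in this abstract can be sketched as a forward pass: a nonlinear module (a linear model plus a three-layer perceptron) followed by a linear unmixing module. The sketch below assumes the two sub-network outputs are summed and that the hidden layer uses tanh; neither detail is stated in the abstract. The entropy-maximization training step is not shown, and all names and weights are hypothetical.

```python
from math import tanh


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def nonlinear_module(x, linear_w, hidden_w, output_w):
    """Linear sub-network plus a three-layer perceptron, summed.

    The perceptron is input -> tanh hidden layer -> linear output layer.
    Summing the two sub-network outputs is an assumption of this sketch.
    """
    linear_part = matvec(linear_w, x)
    hidden = [tanh(h) for h in matvec(hidden_w, x)]
    mlp_part = matvec(output_w, hidden)
    return [a + b for a, b in zip(linear_part, mlp_part)]


def separate(x, linear_w, hidden_w, output_w, unmix_w):
    """Cascade: undo the post-nonlinearity, then apply linear unmixing."""
    z = nonlinear_module(x, linear_w, hidden_w, output_w)
    return matvec(unmix_w, z)


# Hypothetical two-channel weights; in practice these would be learned.
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden_w = [[0.5, -0.5], [0.3, 0.8], [-0.2, 0.1]]
output_w = [[0.1, 0.0, -0.1], [0.0, 0.2, 0.1]]
print(separate([1.0, 2.0], identity, hidden_w, output_w, identity))
```

With small perceptron output weights the module stays close to linear, which matches the abstract's description of a weak nonlinear operator that can approach stronger nonlinearity as parameters are tuned.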


Knowledge Discovery and Data Mining | 2005

Finding cliques in protein interaction networks via transitive closure of a weighted graph

Chris H. Q. Ding; Xiaofeng He; Hanchuan Peng

Finding protein functional modules in protein interaction networks amounts to finding densely connected subgraphs. Standard methods such as cliques and k-cores produce very small subgraphs due to the highly sparse connections in most protein networks. Furthermore, standard methods are not applicable to weighted protein networks. We propose a method to identify cliques on weighted graphs. To overcome the sparsity problem, we introduce the concept of transitive closure on weighted graphs, which is based on enforcing a transitive affinity inequality on the connection weights, and an algorithm to compute it. Using a protein network from a TAP-MS experiment on yeast, we discover a large number of cliques that are densely connected protein modules, with clear biological meanings as shown by Gene Ontology analysis.
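One way to picture a transitive closure on a weighted graph is as a max-min closure: every weight w(i, j) is raised until w(i, j) >= min(w(i, k), w(k, j)) holds for all k. The sketch below assumes that this is the form of the transitive affinity inequality, which the abstract does not spell out, and computes the closure with a Floyd-Warshall style pass. The function names, the threshold step, and the toy weights are hypothetical.

```python
def transitive_closure(weights, nodes):
    """Max-min closure of an undirected weighted graph.

    weights: dict mapping an edge (i, j) to a non-negative affinity.
    Returns a nested dict w where w[i][j] is the best bottleneck weight
    over all paths between i and j (0.0 if they are not connected).
    """
    w = {i: {j: 0.0 for j in nodes} for i in nodes}
    for (i, j), x in weights.items():
        w[i][j] = w[j][i] = max(w[i][j], x)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if i != j:
                    w[i][j] = max(w[i][j], min(w[i][k], w[k][j]))
    return w


def modules_at(closure, nodes, threshold):
    """Group nodes whose closed affinity is at least the threshold.

    After closure, each group is fully connected at that threshold,
    so it forms a clique in the closed graph.
    """
    groups, seen = [], set()
    for i in nodes:
        if i in seen:
            continue
        group = [i] + [j for j in nodes if j != i and closure[i][j] >= threshold]
        seen.update(group)
        if len(group) > 1:
            groups.append(group)
    return groups


# Hypothetical sparse interaction weights among four proteins.
nodes = ["A", "B", "C", "D"]
weights = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "D"): 0.2}
closure = transitive_closure(weights, nodes)
print(closure["A"]["C"])                 # 0.8
print(modules_at(closure, nodes, 0.5))   # [['A', 'B', 'C']]
```

In the original sparse graph A and C share no edge, so {A, B, C} is not a clique. After closure the missing edge receives weight 0.8, and the three proteins form a clique at threshold 0.5.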


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy

Hanchuan Peng; Fuhui Long; Chris H. Q. Ding


Archive | 2002

Feature selection based on mutual information: Criteria of max-dependency

Hanchuan Peng; Fuhui Long; Chris H. Q. Ding

Collaboration


Dive into Hanchuan Peng's collaborations.

Top Co-Authors

Chris H. Q. Ding

University of Texas at Arlington

Fuhui Long

Lawrence Berkeley National Laboratory

Zheru Chi

Hong Kong Polytechnic University

Xiaofeng He

Lawrence Berkeley National Laboratory

Wan-Chi Siu

Hong Kong Polytechnic University
