Chris H. Q. Ding
University of Texas at Arlington
Publications
Featured research published by Chris H. Q. Ding.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005
Hanchuan Peng; Fuhui Long; Chris H. Q. Ding
Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion based on mutual information. Because of the difficulty in directly implementing the maximal dependency condition, we first derive an equivalent form, called minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection. Then, we present a two-stage feature selection algorithm by combining mRMR and other more sophisticated feature selectors (e.g., wrappers). This allows us to select a compact set of superior features at very low cost. We perform extensive experimental comparison of our algorithm and other methods using three different classifiers (naive Bayes, support vector machine, and linear discriminant analysis) and four different data sets (handwritten digits, arrhythmia, NCI cancer cell lines, and lymphoma tissues). The results confirm that mRMR leads to promising improvement on feature selection and classification accuracy.
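A minimal sketch of the first-order incremental mRMR selection described above, assuming discrete features and labels. The histogram-based mutual-information estimator, the toy data, and the greedy loop are illustrative choices, not taken from the paper:

```python
import numpy as np

def mutual_information(x, y):
    """I(x; y) in nats for discrete 1-D arrays, estimated from the joint histogram."""
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xs.size, ys.size))
    np.add.at(joint, (xi, yi), 1.0)          # joint count table
    p = joint / joint.sum()
    px, py = p.sum(1, keepdims=True), p.sum(0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def mrmr(X, y, k):
    """Greedy first-order incremental mRMR: maximize relevance minus mean redundancy."""
    relevance = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    selected = [int(np.argmax(relevance))]   # start with the most relevant feature
    while len(selected) < k:
        best, best_score = -1, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([mutual_information(X[:, j], X[:, s]) for s in selected])
            if relevance[j] - redundancy > best_score:
                best, best_score = j, relevance[j] - redundancy
        selected.append(best)
    return selected

# Toy data: features 0 and 1 are identical copies of one half of the label,
# feature 2 carries the other, independent half.
y = np.array([0, 1, 2, 3, 0, 1, 2, 3])
X = np.stack([y // 2, y // 2, y % 2], axis=1)
print(mrmr(X, y, 2))  # [0, 2] -- the duplicate feature 1 is skipped as redundant
```

The key behavior is visible in the toy run: a plain relevance ranking would pick the duplicate feature second, while the redundancy term steers the selection toward the complementary one.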
Journal of Bioinformatics and Computational Biology | 2005
Chris H. Q. Ding; Hanchuan Peng
Selecting a small subset out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their ...
international conference on machine learning | 2004
Chris H. Q. Ding; Xiaofeng He
Principal component analysis (PCA) is a widely used statistical technique for unsupervised dimension reduction. K-means clustering is a commonly used clustering method for unsupervised learning tasks. Here we prove that principal components are the continuous solutions to the discrete cluster membership indicators for K-means clustering. New lower bounds for the K-means objective function are derived: the total variance minus the sum of the K-1 largest eigenvalues of the data covariance matrix. These results indicate that unsupervised dimension reduction is closely related to unsupervised learning. Several implications are discussed. On dimension reduction, the result provides new insights into the observed effectiveness of PCA-based data reductions, beyond the conventional noise-reduction explanation that PCA, via singular value decomposition, provides the best low-dimensional linear approximation of the data. On learning, the result suggests effective techniques for K-means data clustering. DNA gene expression and Internet newsgroups are analyzed to illustrate our results. Experiments indicate that the new bounds are within 0.5-1.5% of the optimal values.
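The lower bound above can be checked numerically. A minimal sketch for K = 2, where the bound is the total sum of squares minus the largest eigenvalue of the centered scatter matrix; the synthetic data, Lloyd's iteration, and initialization are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs in 2-D.
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(4, 0.5, (50, 2))])
Xc = X - X.mean(0)

# Total sum of squares and eigenvalues of the scatter matrix Xc^T Xc.
tss = (Xc ** 2).sum()
eigvals = np.linalg.eigvalsh(Xc.T @ Xc)[::-1]   # descending order

# Lloyd's algorithm for K = 2, initialized from one point of each blob.
centers = X[[0, -1]].copy()
for _ in range(20):
    labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([X[labels == k].mean(0) for k in range(2)])

J = sum(((X[labels == k] - centers[k]) ** 2).sum() for k in range(2))
bound = tss - eigvals[0]   # for K = 2: J >= TSS - lambda_1
print(f"bound {bound:.1f} <= J {J:.1f} <= TSS {tss:.1f}")
```

On data like this the within-cluster objective J sits between the PCA-derived bound and the total variance, as the theorem predicts.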
knowledge discovery and data mining | 2006
Chris H. Q. Ding; Tao Li; Wei Peng; Haesun Park
Currently, most research on nonnegative matrix factorization (NMF) focuses on 2-factor X = FG^T factorization. We provide a systematic analysis of 3-factor X = FSG^T NMF. While unconstrained 3-factor NMF is equivalent to unconstrained 2-factor NMF, constrained 3-factor NMF brings new features to constrained 2-factor NMF. We study the orthogonality constraint because it leads to a rigorous clustering interpretation. We provide new rules for updating F, S, G and prove the convergence of these algorithms. Experiments on 5 datasets and a real-world case study are performed to show the capability of bi-orthogonal 3-factor NMF in simultaneously clustering rows and columns of the input data matrix. We provide a new approach to evaluating the quality of clustering on words using class aggregate distribution and multi-peak distribution. We also provide an overview of various NMF extensions and examine their relationships.
international conference on data mining | 2001
Chris H. Q. Ding; Xiaofeng He; Hongyuan Zha; Ming Gu; Horst D. Simon
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2010
Chris H. Q. Ding; Tao Li; Michael I. Jordan
computational systems bioinformatics | 2003
Chris H. Q. Ding; Hanchuan Peng
international conference on machine learning | 2006
Chris H. Q. Ding; Ding Zhou; Xiaofeng He; Hongyuan Zha
international acm sigir conference on research and development in information retrieval | 2008
Dingding Wang; Tao Li; Shenghuo Zhu; Chris H. Q. Ding
Computational Statistics & Data Analysis | 2008
Chris H. Q. Ding; Tao Li; Wei Peng
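The 3-factor model X = FSG^T from the NMF tri-factorization abstract above can be sketched with plain multiplicative updates. This is not the paper's orthogonality-constrained algorithm; it is a minimal unconstrained sketch minimizing ||X - FSG^T||_F^2, with the data, ranks, and initialization assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((20, 12))       # nonnegative data matrix
r1, r2 = 4, 3                  # ranks for the row factor F and column factor G

F = rng.random((20, r1)) + 0.1
S = rng.random((r1, r2)) + 0.1
G = rng.random((12, r2)) + 0.1
eps = 1e-9                     # guard against division by zero

err0 = np.linalg.norm(X - F @ S @ G.T)
for _ in range(300):
    # Generic multiplicative rule: factor *= (negative gradient part) / (positive part).
    F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
    S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
err = np.linalg.norm(X - F @ S @ G.T)
print(f"reconstruction error: {err0:.3f} -> {err:.3f}")
```

Because every update multiplies a nonnegative factor by a ratio of nonnegative terms, nonnegativity is preserved automatically, and the reconstruction error decreases monotonically over the iterations.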