
Publication


Featured research published by Kin-On Cheng.


BMC Bioinformatics | 2008

Identification of coherent patterns in gene expression data using an efficient biclustering algorithm and parallel coordinate visualization

Kin-On Cheng; Ngai-fong Bonnie Law; Wan-Chi Siu; Alan Wee-Chung Liew

Background: DNA microarray technology allows the measurement of expression levels of thousands of genes under tens or hundreds of different conditions. In microarray data, genes with similar functions usually co-express under certain conditions only [1]. Thus, biclustering, which clusters genes and conditions simultaneously, is preferred over traditional clustering techniques for discovering these coherent genes. Various biclustering algorithms have been developed using different bicluster formulations. Unfortunately, many useful formulations result in NP-complete problems. In this article, we investigate an efficient method for identifying a popular type of bicluster, the additive model. Furthermore, parallel coordinate (PC) plots are used for bicluster visualization and analysis.
Results: We develop a novel and efficient biclustering algorithm which can be regarded as a greedy version of an existing algorithm known as the pCluster algorithm. By relaxing the homogeneity constraint, the proposed algorithm has polynomial-time complexity in the worst case, instead of the exponential-time complexity of the pCluster algorithm. Experiments on artificial datasets verify that our algorithm can identify both additive-related and multiplicative-related biclusters in the presence of overlap and noise. Biologically significant biclusters have been validated on the yeast cell-cycle expression dataset using Gene Ontology annotations. A comparative study shows that the proposed approach outperforms several existing biclustering algorithms. We also provide an interactive exploratory tool based on PC plot visualization for determining the parameters of our biclustering algorithm.
Conclusion: We have proposed a novel biclustering algorithm which works with PC plots for an interactive exploratory analysis of gene expression data. Experiments show that the biclustering algorithm is efficient and is capable of detecting co-regulated genes. The interactive analysis enables the parameters of the biclustering algorithm to be tuned so as to achieve the best result. In future, we will modify the proposed algorithm for other bicluster models such as the coherent evolution model.
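To make the additive model concrete, here is a minimal sketch of the pairwise coherence test used by pCluster-style methods: a set of genes and conditions is additive-coherent when every 2x2 submatrix has a small pScore, |(a - b) - (c - d)|. This is not the greedy algorithm proposed in the paper; the function names, the toy matrix and the threshold `delta` are assumptions made for illustration.

```python
from itertools import combinations


def p_score(a, b, c, d):
    """pScore of the 2x2 submatrix [[a, b], [c, d]]."""
    return abs((a - b) - (c - d))


def max_p_score(matrix, rows, cols):
    """Largest pScore over every 2x2 submatrix of the chosen rows and columns."""
    worst = 0.0
    for r1, r2 in combinations(rows, 2):
        for c1, c2 in combinations(cols, 2):
            score = p_score(matrix[r1][c1], matrix[r1][c2],
                            matrix[r2][c1], matrix[r2][c2])
            worst = max(worst, score)
    return worst


def is_additive_bicluster(matrix, rows, cols, delta=0.5):
    """True when all gene pairs differ by a near-constant offset (delta is hypothetical)."""
    return max_p_score(matrix, rows, cols) <= delta


# Toy expression matrix (genes x conditions). Genes 0-2 are shifted copies
# of each other on conditions 0-2; gene 3 and condition 3 break the pattern.
expr = [
    [1.0, 3.0, 2.0, 9.0],
    [2.0, 4.0, 3.0, 1.0],
    [4.0, 6.0, 5.0, 7.0],
    [0.0, 5.0, 1.0, 2.0],
]

coherent = is_additive_bicluster(expr, [0, 1, 2], [0, 1, 2])
with_outlier = is_additive_bicluster(expr, [0, 1, 2, 3], [0, 1, 2])
```

Checking every 2x2 submatrix is quadratic in both the number of genes and the number of conditions, which is acceptable for validating a candidate bicluster but not for searching for one. For strictly positive data, applying the same test to log-transformed values checks the multiplicative model instead.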


Bioinformatics | 2007

BiVisu: software tool for bicluster detection and visualization

Kin-On Cheng; Ngai-Fong Law; Wan-Chi Siu; T. H. Lau

BiVisu is an open-source software tool for detecting and visualizing biclusters embedded in a gene expression matrix. Through the use of appropriate coherence relations, BiVisu can detect constant, constant-row, constant-column, additive-related as well as multiplicative-related biclusters. The biclustering results are then visualized under a 2D setting for easy inspection. In particular, parallel coordinate (PC) plots for each bicluster are displayed, from which objective and subjective cluster quality evaluation can be performed.
Availability: BiVisu has been developed in Matlab and is available at http://www.eie.polyu.edu.hk/~nflaw/Biclustering/.


Pattern Recognition | 2012

Iterative bicluster-based least square framework for estimation of missing values in microarray gene expression data

Kin-On Cheng; Ngai-fong Bonnie Law; Wan-Chi Siu

DNA microarray experiments inevitably generate gene expression data with missing values. An important and necessary pre-processing step is thus to impute these missing values. Existing imputation methods exploit gene correlation across all experimental conditions to estimate the missing values. However, related genes co-express in subsets of experimental conditions only. In this paper, we propose to use biclusters, which contain similar genes under a subset of conditions, to characterize gene similarity and then estimate the missing values. To further improve the accuracy of missing value estimation, an iterative framework is developed with a stopping criterion based on minimizing uncertainty. Extensive experiments have been conducted on artificial datasets, real microarray datasets and one non-microarray dataset. Our proposed bicluster-based approach is able to reduce errors in missing value estimation.
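The core idea, fitting by least squares only over the conditions of a bicluster rather than over all conditions, can be sketched as follows. This is a simplified single-predictor version and not the iterative framework of the paper; the function names, the toy data and the choice to average the per-gene estimates are assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def impute(matrix, target, missing_col, similar_genes, conditions):
    """Estimate matrix[target][missing_col] from genes in the same bicluster.

    Only the bicluster's conditions are used for fitting, not every column.
    """
    fit_cols = [c for c in conditions if c != missing_col]
    estimates = []
    for g in similar_genes:
        x = [matrix[g][c] for c in fit_cols]
        y = [matrix[target][c] for c in fit_cols]
        slope, intercept = fit_line(x, y)
        estimates.append(slope * matrix[g][missing_col] + intercept)
    return sum(estimates) / len(estimates)


# Gene 0 has a missing value at condition 2. Genes 1 and 2 are additive
# shifts of gene 0 on conditions 0-3 only; condition 4 is unrelated.
expr = [
    [1.0, 2.0, None, 4.0, 9.0],
    [2.0, 3.0, 4.0, 5.0, 0.0],
    [0.0, 1.0, 2.0, 3.0, 7.0],
]

estimate = impute(expr, target=0, missing_col=2,
                  similar_genes=[1, 2], conditions=[0, 1, 2, 3])
```

Because the toy genes are exact shifts of each other on conditions 0-3, the estimate comes out at approximately 3.0. Including condition 4 in the fit would pull the regression away from that value, which is the motivation for restricting the fit to the bicluster.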


Pattern Recognition | 2007

Multiscale directional filter bank with applications to structured and random texture retrieval

Kin-On Cheng; Ngai-Fong Law; Wan-Chi Siu

In this paper, the multiscale directional filter bank (MDFB) is investigated for texture characterization and retrieval. First, the problem of aliasing in decimated bandpass images on directional decomposition is addressed. The MDFB is then designed to suppress the aliasing effect as well as to minimize the reduction in frequency resolution. Second, an entropy-based measure on energy signatures is proposed to classify structured and random textures. With the use of this measure for texture pre-classification, retrieval performance can be improved by selecting the MDFB-based method for retrieving structured textures and a statistical or model-based method for retrieving random textures. In addition, a feature reduction scheme and a rotation-invariant conversion method are developed. The former finds the most representative features, while the latter provides a set of rotation-invariant features for texture characterization. Experimental results confirm that both are effective for texture retrieval.
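One plausible form of an entropy-based measure on energy signatures is the Shannon entropy of the normalized subband energies. The sketch below assumes that a structured texture concentrates its energy in a few directional subbands (low entropy) while a random texture spreads it evenly (high entropy). The decision rule, the threshold ratio and the example energies are assumptions, not values taken from the paper.

```python
import math


def energy_entropy(energies):
    """Shannon entropy in bits of subband energies normalized to sum to 1."""
    total = sum(energies)
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log2(p) for p in probs)


def is_structured(energies, threshold_ratio=0.8):
    """Assumed rule: entropy below a fraction of its maximum means 'structured'."""
    max_entropy = math.log2(len(energies))
    return energy_entropy(energies) < threshold_ratio * max_entropy


# Hypothetical energies of four directional subbands at one scale.
striped = [90.0, 4.0, 3.0, 3.0]    # energy concentrated in one direction
noisy = [25.0, 25.0, 25.0, 25.0]   # energy spread over all directions

striped_is_structured = is_structured(striped)
noisy_is_structured = is_structured(noisy)
```

Normalizing the energies first makes the measure independent of overall image contrast, so only the distribution across subbands affects the decision.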


IEEE Transactions on Image Processing | 2007

A Novel Fast and Reduced Redundancy Structure for Multiscale Directional Filter Banks

Kin-On Cheng; Ngai-fong Bonnie Law; Wan-Chi Siu

The multiscale directional filter bank (MDFB) improves the radial frequency resolution of the contourlet transform by introducing an additional decomposition in the high-frequency band. The increase in frequency resolution is particularly useful for texture description because of the quasi-periodic property of textures. However, the MDFB needs an extra set of scale and directional decomposition, which is performed on the full image size. The rise in computational complexity is, thus, prominent. In this paper, we develop an efficient implementation framework for the MDFB. In the new framework, directional decomposition on the first two scales is performed prior to the scale decomposition. This allows sharing of directional decomposition among the two scales and, hence, reduces the computational complexity significantly. Based on this framework, two fast implementations of the MDFB are proposed. The first one can maintain the same flexibility in directional selectivity in the first two scales while the other has the same redundancy ratio as the contourlet transform. Experimental results show that the first and the second schemes can reduce the computational time by 33.3%-34.6% and 37.1%-37.5%, respectively, compared to the original MDFB algorithm. Meanwhile, the texture retrieval performance of the proposed algorithms is comparable to that of the original MDFB approach, which outperforms the steerable pyramid and the contourlet transform approaches.


Artificial Intelligence Review | 2013

Use of biclustering for missing value imputation in gene expression data

Kin-On Cheng; Ngai-Fong Law; Wan-Chi Siu

DNA microarray data always contains missing values. As subsequent analysis such as biclustering can only be applied to complete data, these missing values have to be imputed before any biclusters can be detected. Existing imputation methods exploit coherence among expression values in the microarray data. Given that biclustering attempts to find correlated expression values within the data, we propose to combine missing value imputation and biclustering into a single framework in which the two processes are performed iteratively. In this way, the missing value imputation can improve bicluster analysis and the coherence in detected biclusters can be exploited for better missing value estimation. Experiments have been conducted on artificial datasets and real datasets to verify the effectiveness of the proposed algorithm in reducing estimation errors of missing values.


Pattern Recognition | 2010

Fast extraction of wavelet-based features from JPEG images for joint retrieval with JPEG2000 images

Kin-On Cheng; Ngai-Fong Law; Wan-Chi Siu

In this paper, some fast feature extraction algorithms are addressed for joint retrieval of images compressed in JPEG and JPEG2000 formats. In order to avoid full decoding, three fast algorithms that convert block-based discrete cosine transform (BDCT) into wavelet transform are developed, so that wavelet-based features can be extracted from JPEG images as in JPEG2000 images. The first algorithm exploits the similarity between the BDCT and the wavelet packet transform. For the second and third algorithms, the first algorithm or an existing algorithm known as multiresolution reordering is first applied to obtain bandpass subbands at fine scales and the lowpass subband. Then for the subbands at the coarse scale, a new filter bank structure is developed to reduce the mismatch in low frequency features. Compared with the extraction based on full decoding, there is more than 72% reduction in computational complexity. Retrieval experiments also show that the three proposed algorithms can achieve higher precision and recall than the multiresolution reordering, especially around the typical range of compression ratio.
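A toy illustration of why block DCT coefficients can stand in for wavelet coefficients without full decoding: the DC coefficient of an orthonormal 8x8 DCT equals the lowpass coefficient of a three-level orthonormal Haar transform of the same block. This covers only the simplest case and is not one of the three algorithms in the paper; JPEG2000 uses the 5/3 or 9/7 wavelet rather than Haar, and the function names and test block are assumptions.

```python
import math


def dct2_coeff(block, u, v):
    """Coefficient (u, v) of the orthonormal 2D DCT-II of an N x N block."""
    n = len(block)
    cu = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    cv = math.sqrt(1.0 / n) if v == 0 else math.sqrt(2.0 / n)
    total = 0.0
    for x in range(n):
        for y in range(n):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                      * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
    return cu * cv * total


def haar_lowpass(block):
    """One level of the orthonormal 2D Haar lowpass (LL) band."""
    half = len(block) // 2
    return [[(block[2 * i][2 * j] + block[2 * i][2 * j + 1]
              + block[2 * i + 1][2 * j] + block[2 * i + 1][2 * j + 1]) / 2.0
             for j in range(half)] for i in range(half)]


# Arbitrary 8x8 test block (hypothetical pixel values).
block = [[float((3 * x + 5 * y) % 11) for y in range(8)] for x in range(8)]

dc = dct2_coeff(block, 0, 0)

ll = block
for _ in range(3):          # 8x8 -> 4x4 -> 2x2 -> 1x1
    ll = haar_lowpass(ll)
haar_ll = ll[0][0]          # agrees with dc up to rounding error
```

Both values reduce to the block sum divided by 8, so the coarsest lowpass feature is available directly from the compressed data. Higher-frequency subbands and longer wavelet filters do not line up this neatly, which is the kind of mismatch the proposed conversion algorithms address.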


COMPUTATIONAL MODELS FOR LIFE SCIENCES—CMLS '07: 2007 International Symposium on Computational Models of Life Sciences | 2007

Biclusters Visualization and Detection Using Parallel Coordinate Plots

Kin-On Cheng; Ngai-Fong Law; Wan-Chi Siu; Alan Wee-Chung Liew

The parallel coordinate (PC) plot is a powerful visualization tool for high-dimensional data. In this paper, we explore its use in gene expression data analysis. We found that both additive-related and multiplicative-related coherent genes exhibit special patterns in PC plots. One-dimensional clustering can then be applied to detect these patterns. In addition, a split-and-merge mechanism is employed to find the largest coherent subsets inside the gene expression matrix. Experimental results showed that our proposed algorithm is effective in detecting various types of biclusters. The biclustering results can also be visualized under a 2D setting, in which objective and subjective cluster quality evaluation can be performed.
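In a PC plot, additive-related genes appear as parallel line segments between two adjacent axes, so their differences between the two conditions are nearly equal. A minimal sketch of detecting such groups with one-dimensional clustering follows. The gap-based clustering rule, the tolerance `tol` and the toy data are assumptions, and the split-and-merge step of the paper is not shown.

```python
def cluster_1d(values, tol):
    """Group indices whose sorted values lie within tol of their neighbour.

    Assumes values is non-empty.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups, current = [], [order[0]]
    for prev, nxt in zip(order, order[1:]):
        if values[nxt] - values[prev] <= tol:
            current.append(nxt)
        else:
            groups.append(current)
            current = [nxt]
    groups.append(current)
    return groups


def parallel_gene_groups(matrix, col_a, col_b, tol=0.25):
    """Genes whose segments between two PC axes are (nearly) parallel."""
    diffs = [row[col_b] - row[col_a] for row in matrix]
    return [sorted(group) for group in cluster_1d(diffs, tol)]


# Toy data: 5 genes x 2 conditions. Genes 0-2 rise by about 2,
# genes 3-4 fall by about 3.
expr = [
    [1.0, 3.0],
    [2.0, 4.1],
    [5.0, 7.0],
    [4.0, 1.0],
    [6.0, 3.1],
]

groups = parallel_gene_groups(expr, 0, 1)   # -> [[3, 4], [0, 1, 2]]
```

Repeating this for further pairs of conditions and intersecting the resulting gene groups is one way to grow a bicluster across more conditions. For multiplicative-related genes the same idea applies to ratios rather than differences.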


International Symposium on Intelligent Multimedia, Video and Speech Processing | 2004

Structured and random texture patterns characterization using multiscale directional filter bank

Kin-On Cheng; Ngai-Fong Law; Wan-Chi Siu

The use of multiscale directional decomposition, achieved by combining a Laplacian pyramid and a directional filter bank, is studied for texture classification. We first demonstrate the importance of the multiscale analysis of directional texture features. Then, it is found that directional analysis is suitable for characterizing structured textures, but not random textures. Thus, structured and random textures are separated by employing an entropy-based measure on the multiscale directional features. Through this pre-filtering step, structured textures are extracted for further classification, so that the overall retrieval performance can be enhanced. Experimental results showed that this pre-filtering step can significantly improve the overall retrieval accuracy.


Bioinformation | 2006

On relationship of Z-curve and Fourier approaches for DNA coding sequence classification

Ngai-Fong Law; Kin-On Cheng; Wan-Chi Siu

Z-curve features are among the most popular features used in exon/intron classification. We showed that although both the Z-curve and Fourier approaches are based on detecting 3-periodicity in coding regions, there are significant differences in their spectral formulation. From the spectral formulation of the Z-curve, we obtained three modified sequences that characterize different biological properties. Spectral analysis of the modified sequences showed a much more prominent 3-periodicity peak in coding regions than the Fourier approach. For long sequences, prominent peaks at 2π/3 are observed in coding regions, and for short sequences, clearly discernible peaks are still visible. Better classification can be obtained using spectral features derived from the modified sequences.
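For reference, the Fourier side of this comparison is commonly computed from the four base indicator sequences, summing their spectral power at the frequency 2π/3. The sketch below shows that standard measure only. It does not implement the Z-curve-derived modified sequences of the paper, and the normalization by sequence length and the toy sequences are assumptions.

```python
import cmath


def period3_power(seq):
    """Summed power of the A, C, G, T indicator sequences at frequency 2*pi/3."""
    power = 0.0
    for base in "ACGT":
        # DFT of the 0/1 indicator sequence of this base, evaluated at 2*pi/3.
        coeff = sum(cmath.exp(-2j * cmath.pi * pos / 3)
                    for pos, s in enumerate(seq) if s == base)
        power += abs(coeff) ** 2
    return power / len(seq)   # length normalization is an assumption


codon_like = "ATG" * 20     # perfectly 3-periodic toy sequence
not_periodic = "ACGT" * 15  # period 4, same length

strong = period3_power(codon_like)
weak = period3_power(not_periodic)
```

The 3-periodic toy sequence yields a large value, while the period-4 sequence yields a value near zero because its indicator terms cancel at this frequency. Real coding regions sit between these extremes, which is why the prominence of the peak matters for classification.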

Collaboration

Top Co-Authors

Wan-Chi Siu, Hong Kong Polytechnic University
Ngai-Fong Law, Hong Kong Polytechnic University
Ngai-fong Bonnie Law, Hong Kong Polytechnic University
Paula Wu, Hong Kong Polytechnic University
T. H. Lau, Hong Kong Polytechnic University