Publication


Featured research published by Mengsu Yang.


International Conference of the IEEE Engineering in Medicine and Biology Society | 2004

Cluster analysis of gene expression data based on self-splitting and merging competitive learning

Shuanhu Wu; Alan Wee-Chung Liew; Hong Yan; Mengsu Yang

Cluster analysis of gene expression data from a cDNA microarray is useful for identifying biologically relevant groups of genes. However, finding the natural clusters in the data and estimating the correct number of clusters are still two largely unsolved problems. In this paper, we propose a new clustering framework that is able to address both of these problems. By using the one-prototype-take-one-cluster (OPTOC) competitive learning paradigm, the proposed algorithm can find natural clusters in the input data, and the clustering solution is not sensitive to initialization. In order to estimate the number of distinct clusters in the data, we propose a cluster splitting and merging strategy. We have applied the new algorithm to simulated gene expression data for which the correct distribution of genes over clusters is known a priori. The results show that the proposed algorithm can find natural clusters and give the correct number of clusters. The algorithm has also been tested on real gene expression changes during the yeast cell cycle, for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies. Comparative studies with several clustering algorithms illustrate the effectiveness of our method.


Chemical Physics Letters | 2003

DB-Curve: a novel 2D method of DNA sequence visualization and representation

Yonghui Wu; Alan Wee-Chung Liew; Hong Yan; Mengsu Yang

The large number of bases in a DNA sequence and the cryptic nature of the 4-alphabet representation make graphical visualization of DNA sequences useful for biologists. However, existing 3D graphical representations are complicated, whereas existing 2D graphical representations suffer from high degeneracy, and many features in a DNA sequence cannot be visualized clearly. This Letter introduces a novel 2D method of DNA representation: the DB-Curve (Dual-Base Curve), which overcomes some of the limitations in existing 2D graphical representations. Many properties of DNA sequences can be observed and visualized easily using a combination of DB-Curves. Unlike existing 2D graphical representations of DNA sequences, the new representation avoids degeneracy completely. Unlike 3D graphical representations, no 2D projection is required for the DB-Curve, and this allows for easier analysis of DNA sequences. The DB-Curve provides a useful graphical tool for the visualization and study of DNA sequences.
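
To make the idea of a degeneracy-free 2D curve concrete, here is a minimal Python sketch. The abstract does not give the step rule, so the rule used here is an assumption for illustration only: one chosen base moves the curve by (+1, +1), a second chosen base by (-1, +1), and the remaining two bases by (0, +1). The function names `db_curve` and `all_db_curves` are hypothetical.

```python
from itertools import combinations


def db_curve(seq, base_a, base_b):
    """Return the (x, y) points of one dual-base curve for seq.

    Assumed step rule (illustrative, not taken from the abstract):
    base_a -> (+1, +1), base_b -> (-1, +1), any other base -> (0, +1).
    """
    x, y = 0, 0
    points = [(x, y)]
    for base in seq.upper():
        if base == base_a:
            x += 1
        elif base == base_b:
            x -= 1
        y += 1  # y grows on every base, so no point is ever visited twice
        points.append((x, y))
    return points


def all_db_curves(seq):
    """Build one curve per unordered base pair (six curves in total)."""
    return {a + b: db_curve(seq, a, b) for a, b in combinations("ACGT", 2)}


curve = db_curve("ATGGCA", "A", "T")
curves = all_db_curves("ATGGCA")
```

Because y increases by one at every position under this assumed rule, the curve cannot loop back on itself. Looking at several base pairs together is one way to read the abstract's "combination of DB-Curves".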


Pattern Recognition | 2003

Robust adaptive spot segmentation of DNA microarray images

Alan Wee-Chung Liew; Hong Yan; Mengsu Yang

The rapid advancement of DNA chip (microarray) technology has revolutionized genetic research in bioscience. However, the enormous amount of data produced from a microarray image makes automatic computer analysis indispensable. An important first step in analyzing a microarray image is the accurate determination of the DNA spots in the image. We report here a novel spot segmentation method for DNA microarray images. The algorithm makes use of adaptive thresholding and statistical intensity modeling to: (i) generate the grid structure automatically, where each subregion in the grid contains only one spot, and (ii) segment the spot, if any, within each subregion. The algorithm is fully automatic, robust, and can aid in the high-throughput computer analysis of microarray data.
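
The per-subregion segmentation step can be illustrated with a small sketch. This is not the authors' algorithm: it substitutes a generic iterative (isodata-style) threshold for the paper's adaptive thresholding and statistical intensity modeling, and it assumes the grid has already been found. The names `isodata_threshold` and `segment_subregion` and the toy pixel values are hypothetical.

```python
def isodata_threshold(pixels, tol=0.5):
    """Iteratively place the threshold midway between the mean of the
    darker pixels and the mean of the brighter pixels."""
    t = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t  # uniform region: nothing to separate
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < tol:
            return new_t
        t = new_t


def segment_subregion(region):
    """Return a binary mask (1 = spot pixel) for one grid subregion."""
    flat = [p for row in region for p in row]
    t = isodata_threshold(flat)
    return [[1 if p > t else 0 for p in row] for row in region]


# Toy 4x4 subregion: dim background with a bright 2x2 spot in the middle.
region = [
    [10, 12, 11, 10],
    [11, 90, 95, 12],
    [10, 92, 97, 11],
    [12, 11, 10, 10],
]
mask = segment_subregion(region)
```

The threshold is computed separately for each subregion, which is what makes it adaptive to local background levels. A real pipeline would also need a test for whether a subregion contains a spot at all; this sketch only handles the trivial case of a perfectly uniform subregion.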


International Journal of Bioinformatics Research and Applications | 2005

Effective statistical features for coding and non-coding DNA sequence classification for yeast, C. elegans and human

Alan Wee-Chung Liew; Yonghui Wu; Hong Yan; Mengsu Yang

This study performs a quantitative evaluation of the different coding features in terms of their information content for the classification of coding and non-coding regions for three species. Our study indicated that coding features that are effective for yeast or C. elegans are generally not very effective for human, which has a short average exon length. By performing a correlation analysis, we identified a subset of human coding features with high discriminative power, but complementary in their information content. For this subset, a classification accuracy of up to 90% was obtained using a simple kNN classifier.
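
A simple kNN classifier of the kind mentioned above can be sketched in a few lines of Python. The two-dimensional feature vectors and their values below are made-up toy numbers, not features or results from the study, and `knn_predict` is a hypothetical helper name.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points.

    train: list of (feature_vector, label) pairs; query: feature_vector.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set: each sequence is summarized by two hypothetical features.
train = [
    ((0.62, 0.30), "coding"),
    ((0.58, 0.27), "coding"),
    ((0.60, 0.33), "coding"),
    ((0.41, 0.05), "non-coding"),
    ((0.38, 0.08), "non-coding"),
    ((0.44, 0.04), "non-coding"),
]

label_a = knn_predict(train, (0.57, 0.25))
label_b = knn_predict(train, (0.40, 0.06))
```

Because kNN relies on distances in feature space, it benefits from a feature subset that is both discriminative and complementary, which is the kind of subset the correlation analysis in the abstract is used to identify.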


Bioinformatics Technologies | 2005

Microarray Data Analysis

Alan Wee-Chung Liew; Hong Yan; Mengsu Yang; Y.-P. Phoebe Chen

Microarray analysis is an emerging field, simultaneously harnessing advances in semiconductor manufacturing, biochemistry, medicine, computation, and algorithms research. Microarrays now provide a platform for an unprecedented genome-wide view of a biological sample. Microarray analysis makes use of the vast amounts of data that the microarray platform provides. It is through the intelligent combination of mathematical algorithms and clinical validation that microarray analysis provides a real opportunity to realize the goal of targeted personalized medicine. One day, the information from a single microarray might be able to tell a doctor if a patient has cancer, what type of cancer it is, what the prognosis is, and what drug to use to best fight the cancer. The foundation of this story is being built in laboratories across the world today and it starts with sound microarray analysis. Microarray analysis is a multistep process that converts raw microarray data into biomarkers for clinical use. First, noise must be removed from raw data using preprocessing methods, such as normalization and artifact removal. Clean data can then be used to select important features or to build predictive rules called classifiers. The results of feature selection and classification are lists of biomarkers that are appropriate for classifying the data into groups such as benign or malignant. These biomarkers must then be validated clinically or through knowledge-based approaches. The results of validation can then be used as feedback in order to select better features or build better classifiers.

Keywords: microarray analysis; pattern recognition; bioinformatics; cancer; personalized medicine; biomarker; DNA; RNA; computational biology
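
The first two stages of the multistep process described above (normalization, then feature selection) can be sketched as follows. The choices here are assumptions for illustration: per-gene z-score normalization and ranking by the absolute difference of class means are generic techniques, not methods named in the abstract, and the gene names and expression values are toy data.

```python
from statistics import mean, pstdev


def zscore(values):
    """Normalize one gene's expression values to zero mean, unit variance."""
    m, s = mean(values), pstdev(values)
    return [0.0 if s == 0 else (v - m) / s for v in values]


def rank_genes(expr, labels):
    """Rank genes by how far apart the two class means are after z-scoring.

    expr: {gene: [expression per sample]}; labels: class of each sample.
    """
    scores = {}
    for gene, values in expr.items():
        z = zscore(values)
        benign = [v for v, lab in zip(z, labels) if lab == "benign"]
        malignant = [v for v, lab in zip(z, labels) if lab == "malignant"]
        scores[gene] = abs(mean(benign) - mean(malignant))
    return sorted(scores, key=scores.get, reverse=True)


labels = ["benign", "benign", "malignant", "malignant"]
expr = {
    "gene_a": [1.0, 1.1, 5.0, 5.2],  # clearly different between classes
    "gene_b": [3.0, 2.9, 3.1, 3.0],  # small but consistent shift
    "gene_c": [2.0, 4.0, 2.1, 3.9],  # varies, but not by class
}
ranking = rank_genes(expr, labels)
```

The top-ranked genes would be the candidate biomarkers passed on to classifier building and, as the text stresses, to clinical or knowledge-based validation before any use.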


Signal Processing Systems | 2004

A Computational Approach to Gene Expression Data Extraction and Analysis

Alan Wee-Chung Liew; Lap Keung Szeto; Sy-sen Tang; Hong Yan; Mengsu Yang

The rapid advancement of DNA microarray technology has revolutionized genetic research in bioscience. Due to the enormous amount of gene expression data generated by such technology, computer processing and analysis of such data has become indispensable. In this paper, we present a computational framework for the extraction, analysis and visualization of gene expression data from microarray experiments. A novel, fully automated spot segmentation algorithm for DNA microarray images, which makes use of adaptive thresholding, morphological processing and statistical intensity modeling, is proposed to: (i) segment the blocks of spots, (ii) generate the grid structure, and (iii) segment the spot within each subregion. For data analysis, we propose a binary hierarchical clustering (BHC) framework for the clustering of gene expression data. The BHC algorithm involves two major steps. Firstly, the fuzzy C-means algorithm and the average linkage hierarchical clustering algorithm are used to split the data into two classes. Secondly, Fisher linear discriminant analysis is applied to the two classes to assess whether the split is acceptable. The BHC algorithm is applied to the sub-classes recursively and ends when no cluster can be split any further. BHC does not require the number of clusters to be known in advance. It does not make any assumption about the number of samples in each cluster or the class distribution. The hierarchical framework naturally leads to a tree structure representation for effective visualization of gene expressions.
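
The recursive split-then-assess structure of BHC can be sketched on one-dimensional toy data. This is a simplification under stated assumptions, not the published algorithm: plain 2-means stands in for the fuzzy C-means plus average-linkage split, the acceptance test is a 1-D Fisher criterion (squared mean difference over summed variances), and the `min_ratio` and `min_size` thresholds are hypothetical.

```python
def two_means_split(values, iters=20):
    """Split 1-D values into two groups with plain 2-means."""
    c1, c2 = min(values), max(values)
    g1, g2 = list(values), []
    for _ in range(iters):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        if not g1 or not g2:
            break
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return g1, g2


def fisher_ratio(g1, g2):
    """1-D Fisher criterion: between-group separation over within-group spread."""
    m1, m2 = sum(g1) / len(g1), sum(g2) / len(g2)
    v1 = sum((v - m1) ** 2 for v in g1) / len(g1)
    v2 = sum((v - m2) ** 2 for v in g2) / len(g2)
    within = v1 + v2
    return float("inf") if within == 0 else (m1 - m2) ** 2 / within


def bhc(values, min_ratio=20.0, min_size=2):
    """Recursively split until no proposed split passes the acceptance test."""
    if len(values) < 2 * min_size:
        return [values]
    g1, g2 = two_means_split(values)
    if len(g1) < min_size or len(g2) < min_size:
        return [values]
    if fisher_ratio(g1, g2) < min_ratio:
        return [values]  # split rejected: keep this cluster as a leaf
    return bhc(g1, min_ratio, min_size) + bhc(g2, min_ratio, min_size)


values = [1.0, 1.2, 0.8, 4.0, 4.2, 3.8, 20.0, 20.2, 19.8]
clusters = bhc(values)
```

The number of clusters is never passed in; it emerges from which splits pass the test, and the sequence of accepted splits forms the binary tree that the abstract mentions for visualization.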


Physical Review E | 2003

Classification of short human exons and introns based on statistical features

Yonghui Wu; Alan Wee-Chung Liew; Hong Yan; Mengsu Yang


Archive | 2007

Microarray Image Analysis and Spot Segmentation

Alan Wee-Chung Liew; Hong Yan; Mengsu Yang


Archive | 2003

Rapid and Brief Communication: Robust adaptive spot segmentation of DNA microarray images

Alan Wee-Chung Liew; Hong Yan; Mengsu Yang

Collaboration


Dive into Mengsu Yang's collaborations.

Top Co-Authors

Hong Yan
City University of Hong Kong

Yonghui Wu
City University of Hong Kong

Lap Keung Szeto
City University of Hong Kong

Shuanhu Wu
City University of Hong Kong

Sy-sen Tang
City University of Hong Kong