Keng Hoong Ng | Researchain

Publication

Featured researches published by Keng Hoong Ng.

international conference on neural information processing | 2009

Protein Fold Prediction Problem Using Ensemble of Classifiers

Abdollah Dehzangi; Somnuk Phon Amnuaisuk; Keng Hoong Ng; Ehsan Mohandesi

Predicting the tertiary structure of a protein from its primary structure (its amino acid sequence) without relying on sequence similarity is a challenging task in bioinformatics and the biological sciences. Protein fold prediction can be expressed as a classification problem and solved with machine learning techniques. In this paper, a new method based on an ensemble of five classifiers (Naive Bayes, Multi Layer Perceptron (MLP), Support Vector Machine (SVM), LogitBoost and AdaBoost.M1) is proposed for the protein fold prediction problem. The experiments use the standard dataset provided by Ding and Dubchak. Experimental results show that the proposed method raises the prediction accuracy to as much as 64% on an independent test dataset, which the authors report as the highest accuracy among the methods in the literature they compare against.
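The abstract does not say how the outputs of the five classifiers are combined. As a minimal sketch, assuming plain majority voting (an assumption, not necessarily the authors' combination rule), an ensemble decision for one protein could look like this. The fold labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most base classifiers.

    Ties are broken in favour of the label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical fold labels from five base classifiers for one protein.
votes = ["alpha", "beta", "alpha", "alpha/beta", "alpha"]
print(majority_vote(votes))
```

Weighted voting or stacking are equally plausible combination schemes; majority voting is shown only because it is the simplest.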

PLOS ONE | 2012

A hybrid distance measure for clustering expressed sequence tags originating from the same gene family.

Keng Hoong Ng; Chin Kuan Ho; Somnuk Phon-Amnuaisuk

Background: Clustering is a key step in the processing of Expressed Sequence Tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt alignment-free distance measures, which tend to yield acceptable clustering accuracy in reasonable computational time. Although these clustering methods work satisfactorily on the majority of EST datasets, they share a common weakness: they are prone to deliver unsatisfactory results when dealing with ESTs from genes that belong to the same family. The root cause is that the distance measures they apply are not sensitive enough to separate these closely related genes.

Methodology/Principal Findings: We propose a hybrid distance measure that combines global and local features extracted from ESTs, with the aim of addressing the clustering problem faced by ESTs derived from the same gene family. The clustering process is implemented using the DBSCAN algorithm. We test the hybrid distance measure on ten EST datasets, and the clustering results are compared with two alignment-free EST clustering tools, wcd and PEACE. The results indicate that the proposed hybrid distance measure achieves better clustering accuracy than both tools.

Conclusions/Significance: The clustering results support the effectiveness of the proposed hybrid distance measure in solving the clustering problem for ESTs that originate from the same gene family. The improvement in clustering accuracy on the experimental datasets supports the claim that the hybrid distance measure is sensitive enough to solve the clustering problem.
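To illustrate how DBSCAN can cluster sequences under an arbitrary distance function, here is a minimal sketch. The k-mer Jaccard distance is a stand-in chosen for brevity; it is not the hybrid measure from the paper, whose global and local features the abstract does not define. The sequences, `eps` and `min_pts` values are hypothetical.

```python
def kmers(seq, k=3):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b, k=3):
    """Stand-in alignment-free distance (NOT the paper's hybrid measure)."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return 1.0 - len(ka & kb) / len(union) if union else 0.0


def dbscan(items, distance, eps, min_pts):
    """Plain DBSCAN; returns one cluster id per item, -1 for noise."""
    noise = -1
    labels = [None] * len(items)

    def neighbours(i):
        return [j for j in range(len(items))
                if distance(items[i], items[j]) <= eps]

    cluster = 0
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = noise
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


# Hypothetical toy reads drawn from two unrelated "genes".
reads = ["ACGTACGTAC", "ACGTACGTAA", "CGTACGTACG",
         "TTTTGGGGTT", "TTTGGGGTTT", "TTTTGGGGTA"]
print(dbscan(reads, jaccard_distance, eps=0.3, min_pts=2))
```

Because DBSCAN only needs pairwise distances, a more sensitive measure can be swapped in by passing a different `distance` function.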

Information Systems and E-business Management | 2017

StockProF: a stock profiling framework using data mining approaches

Keng Hoong Ng; Kok-Chin Khor

Analysing stock financial data and drawing insight from it are not easy tasks for many stock investors, particularly individual investors. Building a good stock portfolio from a pool of stocks therefore often requires Herculean effort. This paper proposes a stock profiling framework, StockProF, for building stock portfolios rapidly. StockProF utilizes two data mining approaches: (1) Local Outlier Factor (LOF) and (2) Expectation Maximization (EM). LOF first detects outliers, that is, stocks with superior or poor financial performance. After the outliers are removed, EM clusters the remaining stocks. Investors can then profile the resulting clusters using the mean and the five-number summary. This study used the financial data of the plantation stocks listed on Bursa Malaysia. The authors used 1-year stock price movements to evaluate the performance of the outliers as well as the clusters. The results showed that StockProF is effective, as the profiling corresponded to the average capital gain or loss of the plantation stocks.
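The profiling step relies on the mean and the five-number summary (minimum, first quartile, median, third quartile, maximum) of each financial ratio within a cluster. A minimal sketch of that calculation using only the Python standard library follows. The function name and the sample values are hypothetical, and the inclusive quartile method is one of several conventions; the paper does not state which one it used.

```python
import statistics


def profile_ratio(values):
    """Mean and five-number summary of one financial ratio in a cluster.

    Needs at least two values, because statistics.quantiles requires that.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": statistics.mean(data),
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
    }


# Hypothetical return-on-equity values (%) for the stocks in one cluster.
print(profile_ratio([5, 1, 4, 2, 3]))
```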

computational intelligence | 2016

An Improvement to StockProF: Profiling Clustered Stocks with Class Association Rule Mining

Kok-Chin Khor; Keng Hoong Ng

Using StockProF, developed in our previous work, we are able to identify outliers in a pool of stocks and form clusters from the remaining stocks based on their financial performance. Financial performance is measured using financial ratios obtained directly or derived from financial reports. The resulting clusters are then profiled manually using the mean and five-number summary calculated from the financial ratios. However, this is time-consuming and a disadvantage to novice investors who lack skill in interpreting financial ratios. In this study, we utilized class association rule mining to overcome these problems. Class association rule mining was used to form rules by finding financial ratios that were strongly associated with a particular cluster. The resulting rules were more intuitive to investors than the profiles in our previous work, so the profiling process became easier. The evaluation results also showed that profiling stocks using class association rules helps investors make better investment decisions.
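A class association rule has the form "ratio conditions → cluster" and is usually ranked by its support and confidence. The sketch below computes both for a single candidate rule. The discretised ratio items (`ROE=high`, `DY=low`) and the cluster labels are hypothetical, and the abstract does not say which mining algorithm or thresholds the authors used.

```python
def rule_metrics(records, antecedent, cluster):
    """Support and confidence of the rule `antecedent -> cluster`.

    records: non-empty list of (set_of_items, cluster_label) pairs.
    antecedent: set of items that must all be present.
    """
    matched = [label for items, label in records if antecedent <= items]
    both = sum(1 for label in matched if label == cluster)
    support = both / len(records)
    confidence = both / len(matched) if matched else 0.0
    return support, confidence


# Hypothetical discretised financial ratios for four stocks.
records = [
    ({"ROE=high", "DY=high"}, "C1"),
    ({"ROE=high", "DY=low"}, "C1"),
    ({"ROE=low", "DY=low"}, "C2"),
    ({"ROE=high", "DY=low"}, "C2"),
]
support, confidence = rule_metrics(records, {"ROE=high"}, "C1")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A rule with high confidence for a cluster reads as a plain-language profile of that cluster, which is the property the abstract describes as more intuitive for investors.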

biomedical engineering and informatics | 2010

Clustering of expressed sequence tags with distance measure based on Burrows-Wheeler transform

Keng Hoong Ng; Somnuk Phon-Amnuaisuk; Chin Kuan Ho

Expressed sequence tags (ESTs) are used for gene discovery and transcriptome analysis. They are short, single-read fragments of expressed genes produced from mRNA extracted from a living cell. Clustering is a vital computational step in the processing of ESTs; its main goal is to ensure that all ESTs originating from the same mRNA are grouped together. EST clustering algorithms can be classified into two approaches: alignment-based and alignment-free. The latter approach has been preferred in recent years because of its speed and satisfactory results. In this paper, we propose and implement an EST clustering algorithm based on the alignment-free approach, in which we introduce a measure of distance between ESTs that combines the Burrows-Wheeler transform, window length and word-tuple. We assessed the proposed method with a dataset downloaded from UniGene. The preliminary results show high clustering quality, with a clustering accuracy (evaluated using F-measure) of up to 0.9671.
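For readers unfamiliar with the Burrows-Wheeler transform, the sketch below shows the transform itself using the textbook sorted-rotations construction. It does not reproduce the paper's distance measure, which additionally involves a window length and word-tuples that the abstract does not specify. The `$` terminator is a common convention and an assumption here.

```python
def bwt(text, terminator="$"):
    """Burrows-Wheeler transform via sorted rotations.

    Quadratic in memory, so suitable for short reads and illustration only.
    """
    if terminator in text:
        raise ValueError("terminator must not occur in the input")
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


print(bwt("banana"))  # annb$aa
```

The transform tends to group identical characters into runs, which is why it is useful as a preprocessing step for compression and for comparing sequences.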

soft computing and pattern recognition | 2009

A Review of Recent Alignment-Free Clustering Algorithms in Expressed Sequence Tag

Keng Hoong Ng; Somnuk Phon-Amnuaisuk; Chin Kuan Ho

Expressed sequence tags (ESTs) are short single-pass sequence reads derived from cDNA libraries. They have been used for gene discovery, detection of splice variants, gene expression studies and transcriptome analysis. Clustering of ESTs is a vital step before they can be processed further. Many EST clustering algorithms are currently available, and they can be grouped into two broad approaches: alignment-based and alignment-free. The former approach is reliable but inefficient in terms of running time, while the latter is gaining popularity and is under rapid development because of its speed and acceptable results. In this paper, we propose a taxonomy for sequence comparison algorithms and another taxonomy for EST clustering algorithms. In addition, we highlight the peculiarities of recently introduced alignment-free EST clustering algorithms by focusing on their features, distance measures, advantages and disadvantages.

Archive | 2009

Clustering of Expressed Sequence Tag Using Global and Local Features: A Performance Study

Keng Hoong Ng; Somnuk Phon-Amnuaisuk; Chin Kuan Ho

Clustering of expressed sequence tags (ESTs) plays an important role in gene analysis. Alignment-based sequence comparison is commonly used to measure the similarity between sequences, and some alignment-free comparisons have recently been introduced. In this paper, we evaluate the role of global and local features extracted with two alignment-free approaches: the compression-based method and the generalized relative entropy method. The evaluation is done from the perspective of EST clustering quality. Our evaluation shows that the local features of ESTs yield much better clustering quality than the global features. Furthermore, we compared the best clustering result achieved in our experiments with another EST clustering algorithm, wcd, and found that our performance is comparable to the latter.
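One widely used compression-based, alignment-free measure is the normalized compression distance (NCD). The abstract does not name the exact compression-based method evaluated, so the sketch below is an illustration under that assumption, using `zlib` as the compressor. The sequences are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance; smaller values mean more similar."""
    bx, by = x.encode(), y.encode()
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sequences: `a` is repetitive, `b` is unrelated to it.
a = "ACGTTGCA" * 10
b = "GATTACAGGCTTAACCGGTATCGATCGGATCCTAGGCATGCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCC"
print(ncd(a, a), ncd(a, b))
```

A general-purpose compressor adds fixed overhead on short inputs, so values for short reads are noisy and should be interpreted relative to each other rather than as absolute distances.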

international symposium on information technology | 2008

Clustering of Expressed Sequence Tag (EST) with Markov models and self-organizing maps: An exploratory study

Keng Hoong Ng; Somnuk Phon-Amnuaisuk; Chin Kuan Ho

Expressed Sequence Tags (ESTs) play an important role in discovering the full length of a gene, so clustering of ESTs remains an interesting area for further exploration. We investigate several available clustering algorithms, and then propose and evaluate an unsupervised clustering method that uses Markov models and self-organizing maps. The initial evaluation of the method gives satisfactory results, with a clustering accuracy of up to 80.08%.
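The abstract does not describe how the Markov models feed the self-organizing map. One plausible reading, shown here purely as an assumption, is that each EST is summarised by the transition probabilities of a first-order Markov chain over nucleotides, giving a fixed-length vector that a map can be trained on. The function name and feature layout are hypothetical.

```python
from collections import Counter

ALPHABET = "ACGT"


def transition_features(seq):
    """Flattened 4x4 first-order transition matrix of a nucleotide sequence.

    Entry (a, b) is the estimated P(next = b | current = a). Rows for
    nucleotides that never appear as a predecessor are left at zero.
    """
    seq = seq.upper()
    pair_counts = Counter(zip(seq, seq[1:]))
    features = []
    for a in ALPHABET:
        row_total = sum(pair_counts[(a, b)] for b in ALPHABET)
        for b in ALPHABET:
            features.append(pair_counts[(a, b)] / row_total if row_total else 0.0)
    return features


# Hypothetical short read; yields a 16-dimensional feature vector.
print(transition_features("ACGTACGTTG"))
```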

Archive | 2016