Publication


Featured research published by Worrawat Engchuan.


Neurocomputing | 2015

Pathway activity transformation for multi-class classification of lung cancer datasets

Worrawat Engchuan; Jonathan H. Chan

Pathway-based microarray analysis has been found to be a powerful tool for studying disease mechanisms and identifying biological markers of complex diseases such as lung cancer. Previous studies have shown that pathway activity transformed from gene expression data is more informative in disease classification. However, existing pathway activity transformation methods are designed for binary-class classification. In this study, we propose a pathway activity transformation method for multi-class data termed Analysis-of-Variance-based Feature Set (AFS). Pathway activity derived from the proposed method shows high classification power in three-fold cross-validation and robustness in cross-dataset validation for all four lung cancer datasets used.
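
The abstract does not spell out how AFS is computed, so the following is only a minimal sketch of the statistic its name refers to: a one-way analysis-of-variance F-statistic computed per gene across several classes and used to rank genes. The function name `anova_f`, the gene labels and the toy expression values are all hypothetical and are not taken from the paper.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F-statistic for one gene.

    `groups` is a list of lists, one list of expression values per class.
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand = mean(v for g in groups for v in g)
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical toy data: two genes, three classes, three samples per class.
expr = {
    "geneA": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]],
    "geneB": [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [1.0, 2.0, 3.0]],
}

# Genes whose expression differs most between classes come first.
ranked = sorted(expr, key=lambda g: anova_f(expr[g]), reverse=True)
print(ranked)
```

A larger F value means the class means differ more relative to the spread within each class, which is why such a statistic can be used to pick class-discriminating genes in a multi-class setting.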


BMC Medical Genomics | 2015

Performance of case-control rare copy number variation annotation in classification of autism

Worrawat Engchuan; Kiret Dhindsa; Anath C. Lionel; Stephen W. Scherer; Jonathan H. Chan; Daniele Merico

Background: A substantial proportion of Autism Spectrum Disorder (ASD) risk resides in de novo germline and rare inherited genetic variation. In particular, rare copy number variation (CNV) contributes to ASD risk in up to 10% of ASD subjects. Despite the striking degree of genetic heterogeneity, case-control studies have detected specific burden of rare disruptive CNV for neuronal and neurodevelopmental pathways. Here, we used machine learning methods to classify ASD subjects and controls, based on rare CNV data and comprehensive gene annotations. We investigated performance of different methods and estimated the percentage of ASD subjects that could be reliably classified based on presumed etiologic CNV they carry.

Results: We analyzed 1,892 Caucasian ASD subjects and 2,342 matched controls. Rare CNVs (frequency 1% or less) were detected using Illumina 1M and 1M-Duo BeadChips. Conditional Inference Forest (CF) typically performed as well as or better than other classification methods. We found a maximum AUC (area under the ROC curve) of 0.533 when considering all ASD subjects with rare genic CNVs, corresponding to 7.9% correctly classified ASD subjects and less than 3% incorrectly classified controls; performance was significantly higher when considering only subjects harboring de novo or pathogenic CNVs. We also found rare losses to be more predictive than gains and that curated neurally-relevant annotations (brain expression, synaptic components and neurodevelopmental phenotypes) outperform Gene Ontology and pathway-based annotations.

Conclusions: CF is an optimal classification approach for case-control rare CNV data and it can be used to prioritize subjects with variants potentially contributing to ASD risk not yet recognized. The neurally-relevant annotations used in this study could be successfully applied to rare CNV case-control data-sets for other neuropsychiatric disorders.
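
For readers unfamiliar with the AUC figure quoted above, here is a minimal sketch of how an area under the ROC curve can be computed from classifier scores, using the pairwise-comparison (Mann-Whitney) formulation. The function name `auc` and the scores are hypothetical; this is not the evaluation code used in the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case scores higher
    than a randomly chosen control; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores for three cases and three controls.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.4]
print(auc(cases, controls))
```

An AUC of 0.5 corresponds to chance-level ranking, so a value such as 0.533 indicates only a small overall separation between cases and controls, even if a minority of cases is classified reliably.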


Journal of Bioinformatics and Computational Biology | 2016

Gene-set activity toolbox (GAT): A platform for microarray-based cancer diagnosis using an integrative gene-set analysis approach

Worrawat Engchuan; Asawin Meechai; Sissades Tongsima; Narumol Doungpan; Jonathan H. Chan

Cancer is a complex disease that cannot be diagnosed reliably using single-gene expression analysis alone. Gene-set analysis of high-throughput gene expression profiles controlled by various environmental factors is a technique commonly adopted by the cancer research community. This work develops a comprehensive gene expression analysis tool, the Gene-set Activity Toolbox (GAT), which provides a data retriever, traditional data pre-processing, several gene-set analysis methods, network visualization and data mining tools. The gene-set analysis methods are used to identify subsets of phenotype-relevant genes that are then used to build a classification model. To evaluate GAT performance, we performed a cross-dataset validation study on three common cancers, namely colorectal, breast and lung cancer. The results show that GAT can be used to build a reasonable disease diagnostic model and that the predicted markers have biological relevance. GAT can be accessed at http://gat.sit.kmutt.ac.th, where GAT's Java library for gene-set analysis, simple classification and a database with three cancer benchmark datasets can be downloaded.


International Journal of Advanced Intelligence Paradigms | 2016

Handling batch effects on cross-platform classification of microarray data

Worrawat Engchuan; Asawin Meechai; Sissades Tongsima; Jonathan H. Chan

Gene-set-based microarray analysis is commonly applied in the classification of complex diseases. However, the robustness of a classifier is normally limited by the small number of samples in many microarray datasets. Although a merged dataset from multiple experiments may improve classification performance, batch effects or technical/biological variations among these experiments may confound the analysis. Besides batch effects, merging multiple microarray datasets from different platforms can generate missing values because the platforms cover different numbers of genes. In this work, we extend previous works that focused on the missing-value issue by further exploring the impact of batch effects on cross-platform classification. Two quality measures of data purity are proposed and two data imputation methods are compared. The results show that batch correction significantly improves the quality of the merged data. Furthermore, the classification performance is high when the normalised purity is above a certain threshold.
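
The abstract does not name the batch correction method used, so the sketch below only illustrates the general idea with the simplest possible approach: subtracting each batch's mean from its samples, one gene at a time. The function name `center_by_batch` and the values are hypothetical.

```python
from statistics import mean

def center_by_batch(values, batches):
    """Subtract each batch's mean from the samples in that batch.

    `values` holds one gene's expression per sample and `batches`
    holds the batch label of each sample, in the same order.
    """
    batch_means = {
        b: mean(v for v, label in zip(values, batches) if label == b)
        for b in set(batches)
    }
    return [v - batch_means[b] for v, b in zip(values, batches)]

# Hypothetical gene measured in two experiments with a systematic offset.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batches = ["A", "A", "A", "B", "B", "B"]
print(center_by_batch(values, batches))
```

Mean-centering removes a constant per-batch offset, but it can also remove real biological differences when the class proportions differ between batches, which is one reason more elaborate correction methods exist.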


International Conference on Neural Information Processing | 2012

Pathway-Based multi-class classification of lung cancer

Worrawat Engchuan; Jonathan H. Chan

Advances in high-throughput microarray technology have enabled genome-wide expression analysis to identify diagnostic biomarkers of various disease states. In this work, multi-class classification of lung cancer data is developed based on our previous accurate and robust binary-class classification using pathway activity data. In particular, the activity of each pathway was inferred using a Negatively Correlated Feature Set (NCFS) method based on curated pathway data from MSigDB, which combines pathway data from many public databases such as KEGG, PubMed, BioCarta, etc. The developed technique was tested on three independent datasets as well as a merged dataset. The results show that using a two-stage binary classification process on independent datasets provided the best performance. Nonetheless, the multi-class SVM technique also yielded acceptable results.


International Conference on Knowledge and Smart Technology | 2015

Clustering-based multi-class classification of complex disease

Thiptanawat Phongwattana; Worrawat Engchuan; Jonathan H. Chan

Pathway activity data transformed from gene expression profiles may be used to identify tumors, complex disease progression, cellular response to stimuli, and so on. Previous studies applied data mining techniques to pathway activity data to distinguish subjects or to predict the phenotype outcome of a subject directly. However, in multi-class classification, learning from data that mixes populations from different groups may result in a contaminated model because excessive information is presented. In this research, we use a two-stage approach that applies clustering to homogenize the training data before building the classification model. Hierarchical Clustering is used as the clustering method and Random Forest is used as the classifier for evaluating the performance of the proposed method. The results are promising and show that using a clustering technique before classifying improves classification performance in general.


INNS-CIIS | 2015

Cross-Platform Pathway Activity Transformation and Classification of Microarray Data

Worrawat Engchuan; Asawin Meechai; Sissades Tongsima; Jonathan H. Chan

One of the most challenging problems in microarray studies is analyzing data from different platforms. Combining platforms improves the reliability of a study because the number of samples is larger, and it enables the study of rare diseases for which only a few microarray datasets have been published. Because different microarray platforms cover different numbers of genes, an integrative study of two platforms must be able to deal with missing values. Much work has been done on cross-platform microarray data utilization, but none has focused on gene-set-based microarray data classification. In this study, we applied a Bayesian-based method to reconstruct the expression levels of the missing genes before transforming them into gene-set activity. Two gene-set activity transformation methods, Negatively Correlated Feature Set (NCFS-i) and Analysis-of-Variance Feature Set (AFS), were used to evaluate the performance of this method on actual microarray datasets. The results show that imputing missing data can improve the classification performance of cross-platform studies.
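
To make the missing-value problem concrete, the sketch below merges two hypothetical platforms that cover different genes and marks the entries that an imputation method would then have to fill. The function name `merge_platforms`, the gene labels and the values are illustrative assumptions; the Bayesian-based reconstruction used in the study is not reproduced here.

```python
def merge_platforms(platform_a, platform_b):
    """Merge two gene -> per-sample expression tables.

    Genes missing from one platform get None for that platform's
    samples, which marks the values that would need imputation.
    """
    genes = sorted(set(platform_a) | set(platform_b))
    n_a = len(next(iter(platform_a.values())))  # samples on platform A
    n_b = len(next(iter(platform_b.values())))  # samples on platform B
    merged = {}
    for g in genes:
        merged[g] = (platform_a.get(g, [None] * n_a)
                     + platform_b.get(g, [None] * n_b))
    return merged

# Hypothetical data: platform A has two samples, platform B has one.
a = {"TP53": [1.0, 2.0], "EGFR": [3.0, 4.0]}
b = {"TP53": [5.0], "KRAS": [6.0]}
merged = merge_platforms(a, b)
print(merged)
```

Only genes covered by both platforms are fully observed after the merge; every other gene has a block of missing values whose size equals the number of samples on the platform that lacks it.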


Archive | 2016

Intelligent and Evolutionary Systems

Kittichai Lavangnananda; Somnuk Phon-Amnuaisuk; Worrawat Engchuan; Jonathan H. Chan



International Symposium on Neural Networks | 2015

Clustering-based gene-subnetwork biomarker identification using gene expression data

Narumol Doungpan; Worrawat Engchuan; Asawin Meechai; Jonathan H. Chan

The identification of predictive biomarkers of complex disease with robustness and specificity is an ongoing challenge. Gene expression provides information on how the cell reacts to a particular state, and the relationships among genes may lead to novel information. A network-based approach integrating expression data with a protein-protein interaction network can be used to identify gene-subnetwork biomarkers for a particular disease. However, cancer datasets are heterogeneous in nature, containing unknown or undefined subtypes of cancers. In this study, we propose a gene-subnetwork biomarker identification approach that uses an Expectation-Maximization (EM) clustering technique to homogenize the dataset. To validate the proposed method, lung cancer expression datasets are used to identify gene-subnetwork biomarkers. The gene-subnetwork biomarkers are evaluated by 5-fold cross-validation on an independent dataset. The comparison between non-clustering and clustering-based gene-subnetwork identification showed that clustering produced improved classification performance at a statistically significant level. Furthermore, preliminary functional analysis results showed that more significant subnetworks were identified using the proposed approach.


International Conference on Neural Information Processing | 2015

Comparative Study of Web-Based Gene Expression Analysis Tools for Biomarkers Identification

Worrawat Engchuan; Preecha Patumcharoenpol; Jonathan H. Chan

The flood of publicly available data allows scientists to explore and make new discoveries. Gene expression is one type of biological data that captures the activity inside the cell. Studying gene expression data may expose the mechanisms of disease development. However, limited computing resources or knowledge of computer programming leave many research groups unable to utilize the data effectively. For about a decade now, various web-based data analysis tools have been developed to analyze gene expression data. Different tools implement different analytical approaches, often resulting in different outcomes. This study compares three existing web-based gene expression analysis tools, namely Gene-set Activity Toolbox (GAT), NetworkAnalyst and GEO2R, using six publicly available cancer data sets. Results of our case study show that NetworkAnalyst has the best performance, followed by GAT and GEO2R, respectively.

Collaboration


Dive into Worrawat Engchuan's collaborations.

Top Co-Authors

Jonathan H. Chan
King Mongkut's University of Technology Thonburi

Asawin Meechai
King Mongkut's University of Technology Thonburi

Narumol Doungpan
King Mongkut's University of Technology Thonburi

Sissades Tongsima
Thailand National Science and Technology Development Agency

Pornchai Mongolnam
King Mongkut's University of Technology Thonburi

Praisan Padungweang
King Mongkut's University of Technology Thonburi

Preecha Patumcharoenpol
King Mongkut's University of Technology Thonburi

Thammarsat Visutarrom
King Mongkut's University of Technology Thonburi

Thiptanawat Phongwattana
King Mongkut's University of Technology Thonburi