Peerapon Vateekul | Researchain

Publication

Featured research published by Peerapon Vateekul.

intelligent data analysis | 2011

Irrelevant attributes and imbalanced classes in multi-label text-categorization domains

Sareewan Dendamrongvit; Peerapon Vateekul; Miroslav Kubat

An interesting issue in machine learning is induction in multi-label domains where each example can be labeled with two or more classes at the same time. In a work focusing on text categorization, we followed the most commonly used approach and induced a binary classifier for each class. Analyzing the results, we noticed that performance had been impaired by two factors. First, in text domains, each class is characterized by a different set of attributes; an appropriate attribute-selection technique thus has to be applied separately to each of them. Second, the individual classes often have to be induced from imbalanced training sets, a circumstance we addressed here by majority-class undersampling. The paper provides details of the induction system and reports the results of systematic experimentation.
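
The two steps the abstract names, one binary problem per class and majority-class undersampling, can be sketched as follows. This is a minimal illustration, not the paper's induction system: the helper names `binary_view` and `undersample_majority`, the toy documents, and the 1:1 balancing ratio are all assumptions, and per-class attribute selection is omitted.

```python
import random

def binary_view(examples, target):
    """Turn multi-label examples into a binary problem for one class.

    `examples` is a list of (features, set_of_labels) pairs; the result
    is a list of (features, 0/1) pairs.
    """
    return [(x, 1 if target in labels else 0) for x, labels in examples]

def undersample_majority(binary_examples, seed=0):
    """Randomly drop majority-class examples until both classes have equal size.

    The 1:1 target ratio is an assumption made for this sketch.
    """
    pos = [e for e in binary_examples if e[1] == 1]
    neg = [e for e in binary_examples if e[1] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    rng = random.Random(seed)
    kept = rng.sample(majority, len(minority))
    return minority + kept

# Hypothetical toy data: documents tagged with one or more topics.
docs = [
    ("doc0", {"sports"}),
    ("doc1", {"sports", "politics"}),
    ("doc2", {"politics"}),
    ("doc3", {"economy"}),
    ("doc4", {"economy", "politics"}),
    ("doc5", {"economy"}),
]
sports_view = binary_view(docs, "sports")       # 2 positives, 4 negatives
balanced = undersample_majority(sports_view)    # 2 positives, 2 negatives
```

Each class would get its own balanced training set in this way before a binary classifier is induced for it.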

computer science and software engineering | 2014

An evaluation of feature extraction in EEG-based emotion prediction with support vector machines

Itsara Wichakam; Peerapon Vateekul

Electroencephalography (EEG) data are recordings of the brain's electrical activity and are commonly used in emotion prediction. Obtaining good accuracy requires suitable data preprocessing; however, different works have employed different procedures and features. In this paper, we investigate various feature extraction techniques for EEG signals. To identify the best choice, the experiments examine four factors: (i) the number of channels, (ii) signal transformation methods, (iii) feature representations, and (iv) feature transformation techniques. A Support Vector Machine (SVM) is chosen as the baseline classifier due to its promising performance. The experiments were conducted on the DEAP benchmark dataset. The results showed that prediction on EEG signals from 10 channels, represented by one-minute band power features, gave the best accuracy and F1.
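
As a rough illustration of what a band power feature is, the sketch below computes the mean spectral power of a signal inside a frequency band with a plain discrete Fourier transform. The function name `band_power`, the 128 Hz sampling rate, the band edges (8 to 13 Hz for alpha, 13 to 30 Hz for beta), and the synthetic sine input are assumptions chosen for the example; the paper's actual preprocessing pipeline is not described in the abstract.

```python
import cmath
import math

def band_power(signal, fs, low, high):
    """Mean spectral power of `signal` in the band [low, high) Hz, via a plain DFT."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        freq = k * fs / n  # frequency in Hz of DFT bin k
        if low <= freq < high:
            coeff = sum(
                signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
            )
            powers.append(abs(coeff) ** 2 / n)
    return sum(powers) / len(powers) if powers else 0.0

fs = 128  # assumed sampling rate in Hz
# One second of a synthetic 10 Hz sine, standing in for a single EEG channel.
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]

alpha = band_power(signal, fs, 8, 13)   # band containing the 10 Hz component
beta = band_power(signal, fs, 13, 30)   # band with essentially no energy here
```

A per-channel feature vector would collect one such value per band; a real pipeline would normally use an FFT or Welch estimate rather than this direct DFT.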

intelligent data analysis | 2014

Hierarchical multi-label classification with SVMs: A case study in gene function prediction

Peerapon Vateekul; Miroslav Kubat; Kanoksri Sarinnapakorn

Hierarchical multi-label classification is a relatively new research topic in the field of classifier induction. What distinguishes it from earlier tasks is that it allows each example to belong to two or more classes at the same time, and that it assumes the classes are mutually related by generalization/specialization operators. The paper first investigates the problem of performance evaluation in these domains. After this, it proposes a new induction system, HR-SVM, built around support vector machines. In our experiments, we demonstrate that this system's performance compares favorably with that of earlier attempts, and then we proceed to an investigation of how HR-SVM's individual modules contribute to the overall system's behavior. As a testbed, we use a set of benchmark domains from the field of gene-function prediction.
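
The generalization/specialization relation implies that an example labeled with a class also belongs to every more general class above it. The sketch below shows that closure step only; it is not HR-SVM. The helper name `with_ancestors`, the `parent` mapping, and the class codes are hypothetical, and a tree-shaped hierarchy (one parent per class) is assumed.

```python
def with_ancestors(labels, parent):
    """Return `labels` plus every ancestor reachable through the `parent` map."""
    closed = set()
    for label in labels:
        node = label
        # Walk upward until the root is passed or an already-visited node is met.
        while node is not None and node not in closed:
            closed.add(node)
            node = parent.get(node)
    return closed

# Hypothetical hierarchy: each class maps to its parent, roots map to None.
parent = {"01": None, "01.01": "01", "01.01.03": "01.01", "02": None}

deep = with_ancestors({"01.01.03"}, parent)
mixed = with_ancestors({"02", "01.01"}, parent)
```

Applying the same closure to predicted label sets is one simple way to keep predictions consistent with the hierarchy before evaluating them.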

International Journal on Artificial Intelligence Tools | 2013

IMPROVING SVM PERFORMANCE IN MULTI-LABEL DOMAINS: THRESHOLD ADJUSTMENT

Peerapon Vateekul; Sareewan Dendamrongvit; Miroslav Kubat

In “multi-label domains,” where the same example can simultaneously belong to two or more classes, it is customary to induce a separate binary classifier for each class, and then use them all in parallel. As a result, some of these classifiers are induced from imbalanced training sets where one class outnumbers the other, a circumstance known to hurt some machine learning paradigms. In the case of Support Vector Machines (SVM), this suboptimal behavior is explained by the fact that SVM seeks to minimize error rate, a criterion that is misleading in domains of this type. This is why several research groups have studied mechanisms to readjust the bias of the SVM's hyperplane. The best of these achieves very good classification performance at the price of impractically high computational costs. We propose here an improvement where these costs are reduced to a small fraction without significantly impairing classification.
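
To make the idea of readjusting the hyperplane's bias concrete, the sketch below shifts the decision threshold applied to a classifier's scores and keeps the value that maximizes F1. This is a naive exhaustive scan shown for illustration only; it is not the reduced-cost procedure the paper proposes. The function names, scores, and labels are made up for the example, and in practice the threshold would be tuned on held-out data.

```python
def f1_score(y_true, y_pred):
    """F1 of the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_threshold(scores, y_true):
    """Scan candidate thresholds; return the (threshold, F1) pair with the highest F1."""
    # Start from the default SVM decision rule: predict positive when score >= 0.
    best = (0.0, f1_score(y_true, [1 if s >= 0.0 else 0 for s in scores]))
    for thr in sorted(set(scores)):
        f1 = f1_score(y_true, [1 if s >= thr else 0 for s in scores])
        if f1 > best[1]:
            best = (thr, f1)
    return best

# Hypothetical decision values from a classifier trained on imbalanced data:
# most positives fall on the negative side of the default hyperplane.
scores = [-1.2, -0.9, -0.6, -0.4, -0.3, -0.1, 0.2, -0.8]
labels = [0, 0, 0, 1, 0, 1, 1, 0]

default_f1 = f1_score(labels, [1 if s >= 0.0 else 0 for s in scores])
thr, f1 = best_threshold(scores, labels)
```

On this toy data the default threshold recovers only one of the three positives, while the shifted threshold recovers all three at the cost of one false positive.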

Archive | 2015

Software Defect Prediction in Imbalanced Data Sets Using Unbiased Support Vector Machine

Teerawit Choeikiwong; Peerapon Vateekul

In the software assurance process, it is crucial to prevent a program with defective modules from being released to users, since doing so saves maintenance cost and increases software quality and reliability. There have been many prior attempts to capture errors automatically by employing conventional classification techniques, e.g., Decision Tree, k-NN, and Naive Bayes. However, their detection performance was limited by class imbalance, since the number of defective modules is very small compared to that of non-defective modules. This paper aims to achieve a high prediction rate by employing an unbiased SVM called “R-SVM,” our version of SVM tailored to domains with imbalanced classes. The experiment was conducted on the NASA Metric Data Program (MDP) data set. The results showed that our proposed system outperformed all of the major traditional approaches.

international conference on data mining | 2009

Tree-Based Approach to Missing Data Imputation

Peerapon Vateekul; Kanoksri Sarinnapakorn

Missing data is a well-recognized issue in data mining, and imputation is one way to handle the problem. In this paper, we propose a novel tree-based imputation algorithm called “Imputation Tree” (ITree). It first studies the predictability of missingness using all observations by constructing a binary classification tree called “Missing Pattern Tree” (MPT). Then, missing values in each cluster or terminal node are estimated by a regression tree of observations at that node. We present empirical results using both synthetic and real data. Almost all experiments demonstrate that ITree is superior to other commonly used methods in estimating missing values. The algorithm not only achieves impressive accuracy, but also provides information on the nature of missingness.

international joint conference on computer science and software engineering | 2016

A study of sentiment analysis using deep learning techniques on Thai Twitter data

Peerapon Vateekul; Thanabhat Koomsubha

Sentiment analysis is very important for social listening, especially now that there are millions of Twitter users in Thailand. Almost all prior works are based on classical classification techniques, e.g., SVM and Naïve Bayes. Recently, deep learning techniques have shown promising accuracy in this domain on English tweet corpora. In this paper, we present the first study that applies deep learning techniques to classify the sentiment of Thai Twitter data. Two deep learning techniques are included in our study: Long Short-Term Memory (LSTM) and Dynamic Convolutional Neural Network (DCNN). Proper data preprocessing has been conducted. Moreover, we also investigate the effect of word order in Thai tweets. The results show that the deep learning techniques significantly outperform the classical techniques Naïve Bayes and SVM, but not Maximum Entropy.

information reuse and integration | 2008

A conflict-based confidence measure for associative classification

Peerapon Vateekul; Mei Ling Shyu

Associative classification has attracted significant attention recently and achieved promising results. In the rule ranking process, the confidence measure is usually used to sort the class association rules (CARs). However, it may not be good enough for a classification task because of its low discrimination power with respect to instances in the other classes. In this paper, we propose a novel conflict-based confidence measure with an interleaving ranking strategy for re-ranking CARs in an associative classification framework, which better captures the conflict between a rule and a training data instance. In the experiments, the traditional confidence measure and our proposed conflict-based confidence measure with the interleaving ranking strategy are applied as the primary sorting criterion for CARs. The experimental results show that the proposed associative classification framework achieves promising classification accuracy with the use of the conflict-based confidence measure, particularly for an imbalanced data set.
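
For reference, the traditional confidence of a class association rule is the fraction of instances matching the rule body that also carry the rule's class. The sketch below ranks a few rules by that standard measure only; the conflict-based measure and the interleaving strategy are not defined in the abstract and are not reproduced here. The function name `confidence`, the toy data, and the rules are hypothetical.

```python
def confidence(rule_items, rule_class, data):
    """Fraction of instances covered by the rule body that also have the rule's class."""
    covered = [cls for items, cls in data if rule_items <= items]
    if not covered:
        return 0.0
    return sum(1 for cls in covered if cls == rule_class) / len(covered)

# Hypothetical training data: (set of items, class label).
data = [
    ({"a", "b"}, "yes"),
    ({"a", "c"}, "yes"),
    ({"a", "b", "c"}, "no"),
    ({"b", "c"}, "no"),
]

# Hypothetical class association rules: (rule body, predicted class).
rules = [({"a"}, "yes"), ({"b", "c"}, "no"), ({"b"}, "yes")]

# Traditional ranking: sort rules by confidence, highest first.
ranked = sorted(rules, key=lambda r: confidence(r[0], r[1], data), reverse=True)
```

Here the rule with body `{"b", "c"}` ranks first because every instance it covers has its class, which is the baseline ordering the proposed measure is meant to refine.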

international conference on knowledge and smart technology | 2016

Combining deep convolutional networks and SVMs for mass detection on digital mammograms

Itsara Wichakam; Peerapon Vateekul

It is important to detect breast cancers as early as possible; they are commonly diagnosed as a mass region on mammograms. Deep convolutional networks (ConvNets) have been specially designed for various computer vision tasks. In image classification, a ConvNet contains many layers to automatically extract image features and employs the softmax function at the last layer to predict a probability. Although it excels in feature extraction, its classification is still limited. In this paper, we propose to integrate SVMs into ConvNets to detect a mass on mammograms. To overcome the scarcity of training images, a data augmentation technique is employed to increase the sample data. To further enhance the accuracy, two recent ConvNet techniques are applied: (i) rectified linear units and (ii) dropout. The experiment was conducted on the INbreast data set. The results showed that the proposed method achieved an accuracy of 98.44%, which is superior to the baseline (ConvNets) by 8%.

international joint conference on computer science and software engineering | 2016

Enhancing accuracy of multi-label classification by applying one-vs-one support vector machine

Suthipong Daengduang; Peerapon Vateekul

Multi-label classification is a supervised learning task in which one example can belong to several classes. In the case of Support Vector Machines (SVM), One-versus-All (OVA) is the most common approach to tackle this problem. However, its accuracy is very limited due to extremely imbalanced training sets. Interestingly, only very few works have applied One-versus-One (OVO) in the multi-label domain, even though it has been shown to provide better accuracy than OVA in the multiclass domain. In this paper, we propose a multi-label classification framework that employs OVO combined with an undersampling technique to alleviate the imbalance issue. The experiments use five standard benchmarks. The results show that our proposed algorithm outperforms OVA and traditional OVO on all data sets in terms of accuracy and F1.
