Choochart Haruechaiyasak

Publication

Featured research published by Choochart Haruechaiyasak.

International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology | 2008

A comparative study on Thai word segmentation approaches

Choochart Haruechaiyasak; Sarawoot Kongyoung; Matthew N. Dailey

In this paper, we analyze and compare various approaches for Thai word segmentation. The word segmentation approaches can be classified into two distinct types: dictionary based (DCB) and machine learning based (MLB). The DCB approach relies on a set of terms for parsing and segmenting input texts, whereas the MLB approach relies on a model trained from a corpus by using machine learning techniques. We compare two algorithms from the DCB approach, longest matching and maximal matching, and four algorithms from the MLB approach: Naive Bayes (NB), decision tree, support vector machine (SVM), and conditional random field (CRF). In the experimental results, the DCB approach yielded better performance than the NB, decision tree, and SVM algorithms from the MLB approach. However, the best performance was obtained from the CRF algorithm, with precision and recall of 95.79% and 94.98%, respectively.
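
The longest-matching algorithm named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `longest_matching`, the toy dictionary, and the fallback of emitting an unknown character as its own token are all assumptions made for illustration.

```python
def longest_matching(text, dictionary):
    """Greedy left-to-right segmentation.

    At each position, take the longest dictionary word that starts there.
    Characters that start no dictionary word are emitted one at a time
    (an assumption of this sketch, not something stated in the abstract).
    """
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first and shrink until a word is found.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                match = text[i:j]
                break
        if match is None:
            match = text[i]  # unknown character becomes its own token
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy dictionary; a real system would load a full Thai lexicon.
toy_dictionary = {"ตา", "กลม", "ตาก", "ลม"}
print(longest_matching("ตากลม", toy_dictionary))
```

With this toy dictionary the greedy strategy segments "ตากลม" as "ตาก" + "ลม" even though "ตา" + "กลม" is also a valid split. Ambiguity of this kind is what separates the dictionary-based algorithms from each other and from the learned models.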

Information Sciences | 2003

Category cluster discovery from distributed WWW directories

Mei Ling Shyu; Choochart Haruechaiyasak; Shu-Ching Chen

Due to the inherently distributed nature of many networks, including the Internet, information and knowledge are generated and organized independently by different groups of people. To discover and exploit all the knowledge from different sources, a method of knowledge integration is usually required. Considering the document category sets as information sources, we define a problem of information integration called category merging. The purpose of category merging is to automatically construct a unified category set which represents and exploits document information from several different sources. This merging process is based on the clustering concept where categories with similar characteristics are merged into the same cluster under certain distributed constraints. To evaluate the quality of the merged category set, we measure the precision and recall values under three classification methods, Naive Bayes, Vector Space Model, and K-Nearest Neighbor. In addition, we propose a performance measure called cluster entropy, which determines how well the categories from different sources are distributed over the resulting clusters. We perform the merging process by using the real data sets collected from three different Web directories. The results show that our merging process improves the classification performance over the non-merged approach and also provides a better representation for all categories from distributed directories.

International Workshop on Challenges in Web Information Retrieval and Integration | 2005

Collaborative Filtering by Mining Association Rules from User Access Sequences

Mei Ling Shyu; Choochart Haruechaiyasak; Shu-Ching Chen; Na Zhao

Recent research in mining user access patterns for predicting Web page requests focuses only on consecutive sequential Web page accesses, i.e., pages which are accessed by following the hyperlinks. In this paper, we propose a new method for mining user access patterns that allows the prediction of multiple non-consecutive Web pages, i.e., any pages within the Web site. Our approach consists of two major steps. First, the shortest path algorithm in graph theory is applied to find the distances between Web pages. In order to capture user access behavior on the Web, the distances are derived from user access sequences, as opposed to static structural hyperlinks. We refer to these distances as minimum reaching distance (MRD) information. The association rule mining (ARM) technique is then applied to form a set of predictive rules, which are further refined and pruned by using the MRD information. The proposed approach is applied as a collaborative filtering technique to recommend Web pages within a Web site. Experimental results demonstrate that our approach improves performance over the existing Markov model approach in terms of precision and recall, and also has better potential for reducing user access time on the Web.
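
The first step described above, deriving page-to-page distances from access sequences with a shortest-path algorithm, can be sketched as follows. This is an illustrative reading, not the published MRD algorithm. It assumes that each pair of consecutively visited pages in a session forms a directed edge of unit length and that distance is the minimum number of hops. The helper names `build_graph` and `reaching_distances` and the sample sessions are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(sessions):
    """Build a directed graph with an edge a -> b whenever some user
    visited page b immediately after page a (assumed unit edge length)."""
    graph = defaultdict(set)
    for session in sessions:
        for a, b in zip(session, session[1:]):
            if a != b:
                graph[a].add(b)
    return graph


def reaching_distances(graph, source):
    """Breadth-first search: minimum number of hops from source to every
    page reachable through the observed access sequences."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        for nxt in graph.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


# Hypothetical access sequences, one list of pages per user session.
sessions = [
    ["home", "products", "cart"],
    ["home", "about"],
    ["products", "checkout"],
]
graph = build_graph(sessions)
print(reaching_distances(graph, "home"))
```

Under these assumptions, a candidate rule that predicts a page far from the current page, or unreachable from it, could be pruned. The abstract does not give the exact pruning criterion.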

Systems, Man and Cybernetics | 2001

Mining user access behavior on the WWW

Mei Ling Shyu; Shu-Ching Chen; Choochart Haruechaiyasak

In this paper, we propose an affinity-based approach that provides good similarity measures for Web document clustering to discover user access behavior on the World Wide Web (WWW). The proposed approach generates the similarity measures for groups of Web documents by considering the user access patterns. Any clustering algorithm using better similarity measures should yield better clusters for discovering user access behavior. By utilizing the discovered user access behavior, companies in electronic commerce can, for example, precisely target potential customers and convince them to purchase their products or services. An experiment on a real data set shows that the proposed approach yields better performance than the cosine coefficient and the Euclidean distance method under the partitioning around medoids (PAM) method.

Knowledge and Information Systems | 2006

Mining user access patterns with traversal constraint for predicting web page requests

Mei Ling Shyu; Choochart Haruechaiyasak; Shu-Ching Chen

The recent increase in HyperText Transfer Protocol (HTTP) traffic on the World Wide Web (WWW) has generated an enormous amount of log records on Web server databases. Applying Web mining techniques on these server log records can discover potentially useful patterns and reveal user access behaviors on the Web site. In this paper, we propose a new approach for mining user access patterns for predicting Web page requests, which consists of two steps. First, the Minimum Reaching Distance (MRD) algorithm is applied to find the distances between the Web pages. Second, the association rule mining technique is applied to form a set of predictive rules, and the MRD information is used to prune the results from the association rule mining process. Experimental results from a real Web data set show that our approach improved the performance over the existing Markov-model approach in precision, recall, and the reduction of user browsing time.

IEEE International Conference on e-Technology, e-Commerce and e-Service | 2005

A dynamic framework for maintaining customer profiles in e-commerce recommender systems

Choochart Haruechaiyasak; Chatchawal Tipnoe; Sarawoot Kongyoung; Chaianun Damrongrat; Niran Angkawattanawit

Recommender systems have been successfully applied to enhance the quality of service for customers, and more importantly, to increase the sale of products and services in e-commerce business. In order to provide effective recommendation results within an acceptable response time, a recommender system is required to have the scalability to handle a large customer population in real time. In this paper, we propose a new recommender system framework based on the incremental clustering algorithm in order to dynamically maintain the customer profiles. Using the incremental clustering technique, the dynamic changes in the number of customers and products purchased could be handled effectively. Experiments on real data sets showed that the proposed framework helps to reduce the recommendation time, while retaining accuracy.
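
A minimal sketch of incremental clustering for customer profiles is shown below. The abstract does not specify the algorithm, so this uses a generic leader-style scheme as a stand-in. The class name `IncrementalClusters`, the cosine-similarity measure, the threshold value, and the running-mean centroid update are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class IncrementalClusters:
    """Leader-style incremental clustering (illustrative stand-in).

    Each new profile joins the most similar existing cluster if the
    similarity reaches the threshold; otherwise it starts a new cluster.
    No previously seen profile is revisited.
    """

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.centroids = []  # one centroid vector per cluster
        self.counts = []     # number of profiles absorbed by each cluster

    def add(self, profile):
        best, best_sim = None, -1.0
        for idx, centroid in enumerate(self.centroids):
            sim = cosine(profile, centroid)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is None or best_sim < self.threshold:
            self.centroids.append(list(profile))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Update the centroid as a running mean of its members.
        n = self.counts[best]
        self.centroids[best] = [
            (c * n + x) / (n + 1) for c, x in zip(self.centroids[best], profile)
        ]
        self.counts[best] = n + 1
        return best


# Hypothetical purchase-count vectors over three products.
clusters = IncrementalClusters(threshold=0.8)
for profile in ([1, 0, 0], [0.9, 0.1, 0], [0, 1, 0]):
    print(clusters.add(profile))
```

Because each arriving profile is compared only against the current centroids, new customers can be absorbed without reclustering the whole population, which is the scalability property the abstract emphasizes.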

Information Reuse and Integration | 2004

A data mining framework for building a Web-page recommender system

Choochart Haruechaiyasak; Mei Ling Shyu; Shu-Ching Chen

In this paper, we propose a new framework based on data mining algorithms for building a Web-page recommender system. A recommender system is an intermediary program (or an agent) with a user interface that automatically and intelligently generates a list of information that suits an individual's needs. Two information filtering methods for providing the recommended information are considered: (1) analyzing the information content, i.e., content-based filtering, and (2) referencing other users' access behaviors, i.e., collaborative filtering. By using the data mining algorithms, the information filtering processes can be performed prior to the actual recommending process. As a result, the system response time can be improved, making the framework scalable.

Improving Non-English Web Searching | 2008

LearnLexTo: a machine-learning based word segmentation for indexing Thai texts

Choochart Haruechaiyasak; Sarawoot Kongyoung; Chaianun Damrongrat

Thai is considered an unsegmented language, in which words are written continuously without the use of word delimiters. To index Thai texts via the inverted index, a word segmentation algorithm is usually required to tokenize a text into a series of terms. Recent work on word segmentation reported Conditional Random Fields (CRFs) as the best machine learning algorithm, outperforming the dictionary-based approach and other machine learning algorithms. Our main contribution is to propose a new hybrid approach, LearnLexTo, which further improves the CRF model by integrating the dictionary-based approach. The key idea is to solve the ambiguity problem in the CRF model by using the dictionary-based approach, which relies on a valid word set. Experimental results showed that the proposed hybrid approach yields the highest F1 value of 88.46%, compared to 82.07% for the dictionary-based approach and 85.71% for the CRF model.

International Journal of Innovation and Technology Management | 2014

The Role of Social Media During a Natural Disaster: A Case Study of the 2011 Thai Flood

Alisa Kongthon; Choochart Haruechaiyasak; Jaruwat Pailai; Sarawoot Kongyoung

Recently, social media has become a key platform that allows people to interact and share information. The use of social media is expanding significantly and can serve a variety of purposes. Over the last few years, users of social media have played an increasing role in the dissemination of emergency and disaster information. In this paper, we conduct a case study exploring how Thai people used social media such as Twitter in response to one of the country's worst disasters in recent history: the 2011 Thai Flood. We combine multiple analysis methods in this study, including content analysis of Twitter messages, trend analysis of different message categories, and analysis of influential Twitter users. This study helps us understand the role of social media in times of natural disaster.

Meeting of the Association for Computational Linguistics | 2006

A Collaborative Framework for Collecting Thai Unknown Words from the Web

Choochart Haruechaiyasak; Chatchawal Sangkeettrakarn; Pornpimon Palingoon; Sarawoot Kongyoung; Chaianun Damrongrat

We propose a collaborative framework for collecting Thai unknown words found on Web pages over the Internet. Our main goal is to design and construct a Web-based system which allows a group of interested users to participate in constructing a Thai unknown-word open dictionary. The proposed framework provides supporting algorithms and tools for automatically identifying and extracting unknown words from Web pages of given URLs. The system yields a set of unknown-word candidates, which are presented to the users for verification. The approved unknown words can be combined with the set of existing words in the lexicon to improve the performance of many NLP tasks such as word segmentation, information retrieval, and machine translation. Our framework includes word segmentation and morphological analysis modules for handling the non-segmenting characteristic of written Thai. To take advantage of the large text resources available on the Web, our unknown-word boundary identification approach is based on a statistical string pattern-matching algorithm.
