
Publication


Featured research published by Wirote Aroonmanakun.


Proceedings of the 7th Workshop on Asian Language Resources | 2009

Thai National Corpus: A Progress Report

Wirote Aroonmanakun; Kachen Tansiri; Pairit Nittayanuparp

This paper presents problems and solutions in developing the Thai National Corpus (TNC). The TNC is designed to be comparable to the British National Corpus. The project aims to collect eighty million words. Since 2006, however, the project has collected only fourteen million words. The data are accessible from the TNC Web. The delay in creating the TNC is mainly caused by obtaining authorization for copyrighted texts. Methods used for collecting data and the results are discussed. Errors that arise while encoding the data, and how to handle them, are described.


Natural Language Processing and Knowledge Engineering | 2009

Thai named entity recognition based on conditional random fields

Nutcha Tirasaroj; Wirote Aroonmanakun

This paper presents Thai named entity recognition (NER) systems using Conditional Random Fields (CRFs). Previous studies of Thai NER have used only word-segmented data as input, never syllable-segmented data. Since research on NER in other languages such as Chinese shows that character-based systems outperform word-based ones, this study also investigates whether syllable-segmented input helps improve Thai NER. To compare the system that takes word-segmented input with the one that takes syllable-segmented input, two sets of features are used. The experimental results show that the systems do not perform well enough because few features are used. However, they reveal that the syllable-based system is slightly better than the word-based one. The corpus, training data preparation, and system overview are also described.


International Conference on Asian Language Processing | 2009

Extracting Thai Compounds Using Collocations and POS Bigram Probabilities without a POS Tagger

Wirote Aroonmanakun

This paper presents a simple method to extract compounds using statistical collocations and POS bigram probabilities without a POS tagger. Statistical collocation was used to determine the strength of word co-occurrences. Probabilities of POS sequences were used to adjust the strength of collocation within a possible compound. These probabilities were estimated from compounds found in the dictionary. Bigram and trigram words extracted from a corpus of 28 million words were ranked in two ways: by collocation scores, and by collocation scores weighted by POS pattern probabilities. Cutoff precision at every 200 points was calculated for both methods. The results showed that probabilities of POS sequences could increase the precision of compound extraction to a certain extent. The system can extract 2-word and 3-word compounds with precision of up to 63% and 35%, respectively. When bigram extractions that could be parts of trigram extractions are eliminated, precision increases to as much as 71%.
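The ranking described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the collocation measure, so pointwise mutual information is assumed here, and the names `pmi_scores`, `weight_by_pos`, `precision_at_cutoffs`, `lexicon_pos`, and `pattern_prob` are hypothetical. Because no POS tagger is used, each word's possible tags are looked up in a lexicon and the most probable tag pattern is taken as the weight.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Pointwise mutual information for each adjacent word pair.

    PMI is an assumed stand-in for the unspecified collocation measure.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


def weight_by_pos(scores, lexicon_pos, pattern_prob):
    """Scale each score by the best POS-pattern probability the lexicon allows.

    lexicon_pos maps a word to its possible tags (hypothetical lexicon);
    pattern_prob maps a tag pair to a probability estimated elsewhere.
    Pairs with an unknown word or an unseen pattern get weight 0.
    """
    weighted = {}
    for (w1, w2), score in scores.items():
        best = max(
            (pattern_prob.get((t1, t2), 0.0)
             for t1 in lexicon_pos.get(w1, ())
             for t2 in lexicon_pos.get(w2, ())),
            default=0.0,
        )
        weighted[(w1, w2)] = score * best
    return weighted


def precision_at_cutoffs(ranked, gold, step):
    """Precision of the top-k candidates for k = step, 2 * step, and so on."""
    results = []
    hits = 0
    for i, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
        if i % step == 0:
            results.append((i, hits / i))
    return results
```

Calling `precision_at_cutoffs(ranked, gold, 200)` on a ranked candidate list would mirror the cutoff evaluation mentioned above. Note that PMI can be negative, so a real system would need to decide how such scores interact with the probability weight.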


Natural Language Processing and Knowledge Engineering | 2009

A linguistic study of product names in Thai economic news

Nattadaporn Lertcheva; Wirote Aroonmanakun

This paper reports a linguistic analysis of product names in Thai economic news. This basic knowledge will be useful for solving the named entity identification problem. The structure of product names used in economic news is described. The analysis shows not only the structure of each product name but also the varying forms found in continuous text. The results can be applied in future work on product named entity recognition.


International Conference on Asian Language Processing | 2009

A Bi-directional Translation Approach for Building Thai Wordnet

Prissana Akaraputthiporn; Krit Kosawat; Wirote Aroonmanakun

In this paper we introduce a bi-directional translation approach for building Thai WordNet automatically. The 2nd Order Entity of common base concepts was selected as the target for constructing Thai WordNet in this study. Manual construction was carried out to set up a gold standard for evaluating the bi-directional translation approach as well as other automatic Thai WordNet construction methods. The bi-directional translation method was found to be good for precision but not recall. Issues relating to the number of word senses (whether a word is monosemic or polysemic) and the relation between source and target words (whether it is 1:1, 1:many, many:1, or many:many) were investigated.
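One plausible reading of a bi-directional translation check is sketched below. This is an assumption, not the authors' procedure: a Thai candidate obtained from an English-to-Thai dictionary is kept only if a Thai-to-English dictionary translates it back to a word in the same English synset. The function name and the dictionary structures are hypothetical.

```python
def bidirectional_candidates(synset_words, en_th, th_en):
    """Return Thai candidates whose back-translation lands in the synset.

    synset_words: English words of one synset.
    en_th: dict mapping an English word to Thai translations (toy data).
    th_en: dict mapping a Thai word to English translations (toy data).
    """
    synset = set(synset_words)
    kept = set()
    for en_word in synset:
        for th_word in en_th.get(en_word, ()):
            back_translations = set(th_en.get(th_word, ()))
            # Keep the candidate only if translating it back to English
            # reaches at least one member of the original synset.
            if back_translations & synset:
                kept.add(th_word)
    return kept
```

A filter of this kind would tend to discard candidates rather than add them, which is consistent with the precision-over-recall behaviour reported in the abstract.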


International Conference on Asian Language Processing | 2009

Building Thai WordNet with a Bi-directional Translation Method

Dhanon Leenoi; Thepchai Supnithi; Wirote Aroonmanakun

This research presents a method of building Thai WordNet using an automatic bi-directional translation system with two English-Thai dictionaries, LEXiTRON and Thiengburanathum Dictionary. The former was compiled using a corpus-based approach, whilst the latter was compiled on the basis of the author’s expertise. The results show that using LEXiTRON gives an F-measure of 50.36 for synset aspect and 25.01 for word aspect, while using the Thiengburanathum Dictionary results in an F-measure of 64.51 for synset aspect and 34.54 for word aspect. Furthermore, for a combination of the two dictionaries, the F-measure increases to 67.16 for synset aspect and 36.27 for word aspect.
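For readers unfamiliar with the metric, the sketch below shows how an F-measure is typically computed from a set of system outputs and a gold standard. It assumes the balanced form (the harmonic mean of precision and recall); the abstract does not state which weighting was used, and the function name is hypothetical.

```python
def precision_recall_f(predicted, gold):
    """Precision, recall, and balanced F-measure over two collections of items."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

The function returns fractions between 0 and 1; multiplying by 100 gives percentages.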


Natural Language Processing | 2002

Collocation and Thai Word Segmentation

Wirote Aroonmanakun


Computer and Information Technology | 2006

A Chunk-based n-gram English to Thai Transliteration

Wirote Aroonmanakun


Pacific Asia Conference on Language, Information and Computation | 2004

A Unified Model of Thai Romanization and Word Segmentation

Wirote Aroonmanakun; Wanchai Rivepiboon


Proceedings of the 3rd Named Entities Workshop (NEWS 2011) | 2011

Product Name Identification and Classification in Thai Economic News

Nattadaporn Lertcheva; Wirote Aroonmanakun

Collaboration


Dive into Wirote Aroonmanakun's collaborations.

Top Co-Authors


Kachen Tansiri

Chulalongkorn University
