Palakorn Achananuparp

Publication

Featured research published by Palakorn Achananuparp.

data warehousing and knowledge discovery | 2008

The Evaluation of Sentence Similarity Measures

Palakorn Achananuparp; Xiaohua Hu; Xiajiong Shen

The ability to accurately judge the similarity between natural language sentences is critical to the performance of several applications such as text mining, question answering, and text summarization. Given two sentences, an effective similarity measure should be able to determine whether the sentences are semantically equivalent or not, taking into account the variability of natural language expression. That is, the correct similarity judgment should be made even if the sentences do not share a similar surface form. In this work, we evaluate fourteen existing text similarity measures which have been used to calculate similarity scores between sentences in many text applications. The evaluation is conducted on three different data sets: the TREC9 question variants, the Microsoft Research paraphrase corpus, and the third Recognizing Textual Entailment data set.
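To make the surface-form problem concrete, the sketch below computes a plain word-overlap (Jaccard) score between two sentences. It is an assumed, minimal example of a surface-level measure and is not claimed to be one of the fourteen measures evaluated in the paper; the function names `tokenize` and `jaccard_similarity` are hypothetical.

```python
import re


def tokenize(sentence: str) -> set:
    """Lowercase the sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Word-overlap score |A & B| / |A | B|, ranging from 0.0 to 1.0."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # assumption: two empty sentences count as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

A measure of this kind scores two paraphrases with no shared words as 0.0, which is exactly the failure case the abstract describes: the judgment depends entirely on shared surface tokens.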

hawaii international conference on system sciences | 2012

Tweets and Votes: A Study of the 2011 Singapore General Election

Marko M. Skoric; Nathaniel D. Poor; Palakorn Achananuparp; Ee-Peng Lim; Jing Jiang

This study focuses on the uses of Twitter during the elections, examining whether the messages posted online are reflective of the climate of public opinion. Using Twitter data obtained during the official campaign period of the 2011 Singapore General Election, we test the predictive power of tweets in forecasting the election results. In line with some previous studies, we find that during the elections the Twitter sphere represents a rich source of data for gauging public opinion and that the frequency of tweets mentioning names of political parties, political candidates and contested constituencies could be used to make predictions about the share of votes at the national level, although the accuracy of the predictions was significantly lower than in the studies done in Germany and the UK. At the level of constituency the predictive power of tweets was much weaker, although still better than chance. The findings suggest that the context in which the elections take place also matters, and that issues like media freedoms, competitiveness of the elections and specifics of the electoral system may lead to certain over- and under-estimations of voting sentiment. The implications for future research are discussed.
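One simple way to read the frequency-based prediction is to treat each party's share of all party mentions as its predicted vote share and then measure the gap to the actual result. The sketch below follows that reading; it is an assumption about the general approach rather than the study's exact procedure, and the function names and placeholder party labels are hypothetical.

```python
def predicted_vote_share(mention_counts: dict) -> dict:
    """Map each party to its share of all party mentions (shares sum to 1.0)."""
    total = sum(mention_counts.values())
    if total == 0:
        raise ValueError("no mentions to base a prediction on")
    return {party: count / total for party, count in mention_counts.items()}


def mean_absolute_error(predicted: dict, actual: dict) -> float:
    """Average absolute gap between predicted and actual vote shares."""
    return sum(abs(predicted[p] - actual[p]) for p in actual) / len(actual)
```

For example, `predicted_vote_share({"A": 30, "B": 10})` assigns placeholder party "A" three quarters of the predicted vote; comparing that against actual shares with `mean_absolute_error` gives a single accuracy figure that can be computed at the national level or per constituency.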

international conference on software maintenance | 2012

Automatic classification of software related microblogs

Philips Kokoh Prasetyo; David Lo; Palakorn Achananuparp; Yuan Tian; Ee-Peng Lim

Millions of people, including those in the software engineering communities, have turned to microblogging services, such as Twitter, as a means to quickly disseminate information. A number of past studies by Treude et al., Storey, and Yuan et al. have shown that a wealth of interesting information is stored in these microblogs. However, microblogs also contain a large amount of noisy content that is less relevant to software developers in engineering software systems. In this work, we perform a preliminary study to investigate the feasibility of automatic classification of microblogs into two categories: relevant and irrelevant to engineering software systems. We extract features from the textual content of the microblogs and the titles of any URLs mentioned in the microblogs. These features are then used to learn a discriminative model used in classifying relevant and irrelevant microblogs. We show that our trained model can achieve a promising classification performance.

mining software repositories | 2012

What does software engineering community microblog about

Yuan Tian; Palakorn Achananuparp; Ibrahim Nelman Lubis; David Lo; Ee-Peng Lim

Microblogging is a new trend to communicate and to disseminate information. One microblog post could potentially reach millions of users. Millions of microblogs are generated on a daily basis on popular sites such as Twitter. The popularity of microblogging among programmers, software engineers, and software users has also led to their use of microblogs to communicate software engineering issues apart from using emails and other traditional communication channels. Understanding how millions of users use microblogs in software engineering related activities would shed light on ways we could leverage the fast evolving microblogging content to aid software development efforts. In this work, we perform a preliminary study on what the software engineering community microblogs about. We analyze the content of microblogs from Twitter and categorize the types of microblogs that are posted. We investigate the relative popularity of each category of microblogs. We also investigate what kinds of microblogs are diffused more widely in the Twitter network via the “retweet” feature. Our experiments show that microblogs commonly contain job openings, news, questions and answers, or links to download new tools and code. We find that microblogs concerning real-world events are more widely diffused in the Twitter network.

Social Networks | 2012

Influentials, Novelty, and Social Contagion: The Viral Power of Average Friends, Close Communities, and Old News

Nicholas Harrigan; Palakorn Achananuparp; Ee-Peng Lim

What is the effect of (1) popular individuals, and (2) community structures on the retransmission of socially contagious behavior? We examine a community of Twitter users over a five month period, operationalizing social contagion as ‘retweeting’, and social structure as the count of subgraphs (small patterns of ties and nodes) between users in the follower/following network. We find that popular individuals act as ‘inefficient hubs’ for social contagion: they have limited attention, are overloaded with inputs, and therefore display limited responsiveness to viral messages. We argue this contradicts the ‘law of the few’ and ‘influentials hypothesis’. We find that community structures, particularly reciprocal ties and certain triadic structures, substantially increase social contagion. This contradicts the theory that communities display lower internal contagion because of the inherent redundancy and lack of novelty of messages within a community. Instead, we speculate that the reasons community structures show increased social contagion are, first, that members of communities have higher similarity (reflecting shared interests and characteristics, increasing the relevance of messages), and second, that communities amplify the social bonding effect of retransmitted messages.

acm transactions on management information systems | 2012

Who is Retweeting the Tweeters? Modeling, Originating, and Promoting Behaviors in the Twitter Network

Palakorn Achananuparp; Ee-Peng Lim; Jing Jiang; Tuan-Anh Hoang

Real-time microblogging systems such as Twitter offer users an easy and lightweight means to exchange information. Instead of writing formal and lengthy messages, microbloggers prefer to frequently broadcast several short messages to be read by other users. Readers propagate messages further only when they find them interesting. In this article, we examine user behavior relevant to information propagation through microblogging. We specifically use retweeting activities among Twitter users to define and model originating and promoting behavior. We propose a basic model for measuring the two behaviors, a mutual dependency model, which considers the mutual relationships between the two behaviors, and a range-based model, which considers the depth and reach of users’ original tweets. Next, we compare the three behavior models and contrast them with the existing work on modeling influential Twitter users. Last, to demonstrate their applicability, we further employ the behavior models to detect interesting events from sudden changes in aggregated information propagation behavior of Twitter users. The results show that the proposed behavior models can be effectively applied to detect interesting events in the Twitter stream, compared to the baseline tweet-based approaches.

acm symposium on applied computing | 2007

Semantically enhanced user modeling

Palakorn Achananuparp; Hyoil Han; Olfa Nasraoui; R. M. Johnson

Content-based implicit user modeling techniques usually employ a traditional term vector as a representation of the user's interest. However, due to the problem of dimensionality in the vector space model, a simple term vector is not a sufficient representation of the user model as it ignores the semantic relations between terms. In this paper, we present a novel method to enhance a traditional term-based user model with WordNet-based semantic similarity techniques. To achieve this, we use word definitions and relationship hierarchies in WordNet to perform word sense disambiguation and employ domain-specific concepts as category labels for the derived user models. We tested our method on Windows to the Universe, a public educational website covering subjects in the Earth and Space Sciences, and performed an evaluation of our semantically enhanced user models against human judgment. Our approach is distinguishable from existing work because we automatically narrow down the set of domain-specific concepts from initial domain concepts obtained from Wikipedia and because we automatically create semantically enhanced user models.

conference on human interface | 2007

A framework for text processing and supporting access to collections of digitized historical newspapers

Robert B. Allen; Andrea Japzon; Palakorn Achananuparp; Ki Jung Lee

Large quantities of historical newspapers are being digitized and OCR'd. We describe a framework for processing the OCR'd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that facilitate their automatic indexing. For this processing, we employ lexical semantics, structural models, and community content. Furthermore, we describe visualization and summarization techniques that can be used to present the extracted events.

international conference on asian digital libraries | 2011

On modeling virality of twitter content

Tuan Anh Hoang; Ee-Peng Lim; Palakorn Achananuparp; Jing Jiang; Feida Zhu

Twitter is a popular microblogging site where users can easily use mobile phones or desktop machines to generate short messages to be shared with others in real time. Twitter has seen heavy usage in many recent international events, including the Japan earthquake and the Iran election. In such events, many tweets may become viral for different reasons. In this paper, we study the virality of socio-political tweet content in Singapore's 2011 general election (GE2011). We collected tweet data generated by about 20K Singapore users from 1 April 2011 to 12 May 2011, and the follow relationships among them. We introduce several quantitative indices for measuring the virality of tweets that are retweeted. Using these indices, we identify the most viral messages in GE2011 as well as the users behind them.

International Journal of Data Warehousing and Mining | 2008

A Graph-Based Biomedical Literature Clustering Approach Utilizing Term's Global and Local Importance Information

Xiaodan Zhang; Xiaohua Hu; Jiali Xia; Xiaohua Zhou; Palakorn Achananuparp

In this article, we present a graph-based knowledge representation for biomedical digital library literature clustering. An efficient clustering method is developed to identify the ontology-enriched k-highest density term subgraphs that capture the core semantic relationship information about each document cluster. The distance between each document and the k term graph clusters is calculated. A document is then assigned to the closest term cluster. The extensive experimental results on two PubMed document sets (Disease10 and OHSUMED23) show that our approach is comparable to spherical k-means. The contributions of our approach are the following: (1) we provide two corpus-level graph representations to improve document clustering, a term co-occurrence graph and an abstract-title graph; (2) we develop an efficient and effective document clustering algorithm by identifying k distinguishable class-specific core term subgraphs using terms' global and local importance information; and (3) the identified term clusters give a meaningful explanation for the document clustering results.
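The final assignment step, in which each document goes to its closest term cluster, can be sketched as follows. The abstract does not specify the distance function, so cosine distance over sparse term-weight vectors is used here purely as an assumption; the function names and the dictionary representation of documents and clusters are likewise hypothetical.

```python
import math


def cosine_distance(doc: dict, cluster: dict) -> float:
    """Return 1 - cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * cluster.get(term, 0.0) for term, w in doc.items())
    norm_doc = math.sqrt(sum(w * w for w in doc.values()))
    norm_cluster = math.sqrt(sum(w * w for w in cluster.values()))
    if norm_doc == 0.0 or norm_cluster == 0.0:
        return 1.0  # assumption: an empty vector is maximally distant
    return 1.0 - dot / (norm_doc * norm_cluster)


def assign_to_closest(doc: dict, term_clusters: list) -> int:
    """Return the index of the term cluster at the smallest distance."""
    distances = [cosine_distance(doc, cluster) for cluster in term_clusters]
    return distances.index(min(distances))
```

Because each cluster is represented by its core terms, the terms shared between a document and its assigned cluster double as a readable explanation of the assignment, which is the interpretability benefit the abstract lists as its third contribution.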
