Gan Keng Hoon | Researchain

Publication

Featured researches published by Gan Keng Hoon.

Procedia Computer Science | 2015

Personalization of Trending Tweets using Like-dislike Category Model

Lu Weilin; Gan Keng Hoon

Twitter is a popular social media platform where millions of tweets are generated every day. On Twitter, users can tweet about a certain topic as events occur. This results in trending topics, such as #MH370, #MH17, and #South Korea ferry. When viewing a trending topic, not all the tweets are relevant to oneself. Therefore, it is important to classify these tweets based on individual preference for better information retrieval. To address this problem, this paper focuses on automated personalization of tweets for popular trending topics. The main objective is to classify tweets as “Like” or “Dislike” on a particular topic, depending on personal preference, by feature selection. For enhancement, topic-related keywords are selected as features to represent a category model built from user-preferred tweets and other sources such as news. Finally, the experimental results are promising: training on a small number of tweets can give quick personalization of the next incoming tweets.
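The keyword-based category model described in this abstract can be illustrated with a small sketch. This is not the authors' method: the stopword list, the `top_k` cutoff, the `min_overlap` threshold, and both function names are assumptions made for illustration. It only shows how a handful of liked tweets could yield a keyword set that labels incoming tweets.

```python
from collections import Counter

# Hypothetical stopword list; the paper's actual feature selection is not given here.
STOPWORDS = {"the", "a", "is", "of", "for", "in", "to", "and", "on"}

def build_category_model(liked_tweets, top_k=5):
    """Collect the top_k most frequent non-stopword terms from liked tweets."""
    counts = Counter()
    for tweet in liked_tweets:
        counts.update(w for w in tweet.lower().split() if w not in STOPWORDS)
    return {word for word, _ in counts.most_common(top_k)}

def classify(tweet, model, min_overlap=1):
    """Label a tweet "Like" if it shares at least min_overlap keywords with the model."""
    words = set(tweet.lower().split())
    return "Like" if len(words & model) >= min_overlap else "Dislike"
```

A model built from two or three liked tweets on a trending topic can then be applied to each new tweet with `classify(tweet, model)`.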

web intelligence | 2004

MICE: Aggregating and Classifying Meta Search Results into Self-Customized Categories

Saravadee Sae Tan; Gan Keng Hoon; Tang Enya Kong; Cheong Sook Lin; Chan Siew Lin; Foo Wen Ying

Given the broad coverage of search results returned by various search sources, combining and organizing these results in a meaningful way has become a common issue in the field of information retrieval. In this demo paper, we describe our meta search system, MICE, which is able to aggregate and classify search results based on user-customized categories. Categories help users focus on search results with respect to the category concepts they have customized.

2011 International Conference on Semantic Technology and Information Retrieval | 2011

Automated query transformation for searching semantically rich structured collections

Gan Keng Hoon; Phang Keat Keong

The availability of semantically rich structured resources makes searching such resources on the web even more interesting, because a meaningful structured query can now be formed for better retrieval. Incorporating concepts such as role, category, topic, class, and attributes to indicate the search target and constraints in a structured query enables a better definition of information needs. However, the variety of usages across application contexts creates heterogeneity in how structures are defined and presented, making it hard to use this information when formulating a structured query for a search process. Hence, we are motivated to automate the transformation of unstructured queries (which are more common for web users) into structured queries (which support better specification of information needs). This paper first explores two important issues that arise in the process of query transformation, i.e. multiple structures (the need to handle multiple types of structured queries) and multiple semantics (the need to handle different collections of structured resources). Along with these issues, we define research problems, followed by an approach for mediating multiple structures and multiple semantics to support query transformation from unstructured to structured form.

web intelligence | 2006

A Semantic Learning Approach for Mapping Unstructured Query to Web Resources

Gan Keng Hoon; Phang Keat Keong; Tang Enya Kong

Search that involves structured Web resources, such as XML data and services, still lacks its own method and relies on contemporary search systems. This paper presents a method that learns semantics from the structured information of these resources. Instead of committing the semantic meaning of resources to strict and formal vocabularies such as an ontology or data dictionary, we are interested in interpreting the meaning based on the natural context of the resources. The semantics are used in the search process, i.e. query reasoning and resource selection, to provide better answers in terms of context relevancy and clearer result descriptions.

Pattern Analysis and Applications | 2018

A text representation model using Sequential Pattern-Growth method

Suraya Alias; Siti Khaotijah Mohammad; Gan Keng Hoon; Tan Tien Ping

Text representation is an essential task in transforming text input into features that can later be used for further Text Mining and Information Retrieval tasks. The commonly used text representation models are the Bag-of-Words (BOW) and N-gram models. Nevertheless, some known issues of these models, namely inaccurate semantic representation of text and the high dimensionality of word combinations, should be investigated. A pattern-based model named Frequent Adjacent Sequential Pattern (FASP) is introduced to represent the text using a set of sequences of adjacent words that are frequently used across the document collection. The purpose of this study is to discover the similarity of textual patterns between documents, which can later be converted to a set of rules to describe the main news event. FASP is based on Pattern-Growth’s divide-and-conquer strategy, where the main difference between FASP and the prior technique is in the Pattern Generation phase. This approach is tested against the BOW and N-gram text representation models using Malay and English news datasets with different term weightings in the Vector Space Model (VSM). The findings demonstrate that the FASP model has promising performance in finding similarities between documents, with an average vector size reduction of 34% against the BOW model and 77% against the N-gram model using the Malay dataset. Results using the English dataset are also consistent, indicating that the FASP approach is also language independent.
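To make the idea of "adjacent word sequences that are frequent across the collection" concrete, here is a minimal sketch. It uses brute-force enumeration of contiguous word sequences, not the Pattern-Growth strategy that FASP is built on, and the function name, whitespace tokenization, and the `min_support` and `max_len` parameters are assumptions for illustration only.

```python
from collections import Counter

def frequent_adjacent_sequences(docs, min_support=2, max_len=3):
    """Map each contiguous word sequence (as a tuple) to the number of
    documents containing it, keeping only those found in at least
    min_support documents. Brute-force stand-in, not Pattern-Growth."""
    support = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        seen = set()  # count each sequence at most once per document
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(tuple(tokens[i:i + n]))
        support.update(seen)
    return {seq: count for seq, count in support.items() if count >= min_support}
```

The surviving sequences can serve as the feature vocabulary in place of every single word or every N-gram, which is where a reduction in vector size would come from.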

Neural Computing and Applications | 2018

Term weighting scheme for short-text classification: Twitter corpuses

Issa Alsmadi; Gan Keng Hoon

Term weighting is a well-known preprocessing step in text classification that assigns appropriate weights to each term in all documents to enhance the performance of text classification. Most methods proposed in the literature use traditional approaches that emphasize term frequency. These methods perform reasonably with traditional documents. However, these approaches are unsuitable for social network data, where text length is limited and sparsity and noise are characteristic of short text. A simple supervised term weighting approach, SW, which considers the special nature of short texts based on term strength and term distribution, is introduced in this study, and its effect in a high-dimensional vector space is assessed against term weighting schemes that represent baseline term weighting in traditional text classification. Two datasets are employed with support vector machine, decision tree, k-nearest neighbor, and logistic regression algorithms. The first dataset, the Sanders dataset, is a benchmark dataset that includes approximately 5000 tweets in four categories. The second, self-collected dataset contains roughly 1500 tweets distributed in six classes, collected using the Twitter API. The evaluation applied tenfold cross-validation on the labeled data to compare the proposed approach with state-of-the-art methods. The experimental results indicate that supervised approaches vary in performance but are predominantly better than the unsupervised approaches. The proposed approach, SW, achieves better accuracy than the others. SW can deal with the limitations of short texts and mitigate the limitations of traditional approaches in the literature, thus improving performance to 80.83 and 90.64 (F-measure) on the Sanders dataset and the self-collected dataset, respectively.
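For readers unfamiliar with frequency-based term weighting, the sketch below shows TF-IDF, chosen here as an assumed example of the traditional, unsupervised kind of scheme the abstract contrasts with. It is not the SW scheme, whose formula is not given in this abstract. The function name and whitespace tokenization are illustrative assumptions.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document, with
    weight = (term count / document length) * log(N / document frequency)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # each term counted once per document
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights
```

In tweets most terms occur once, so the term-frequency factor carries little signal, which is consistent with the abstract's point that such schemes suit short text poorly.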

First EAI International Conference on Computer Science and Engineering | 2017

Mining Points of Interests with Popular Travel Patterns and Spatial Guidance from Social and Credible Sources

Erum Haris; Gan Keng Hoon

This work aims to utilize social and credible tourism content to develop a gazetteer of worth-seeing points of interest (POIs) in a region, and to mine the most popular visit patterns of these POIs followed by experienced travelers. It offers a new perspective on frequent travel pattern mining by enriching these routes with spatial relations among the POIs to facilitate navigation.

2016 Third International Conference on Information Retrieval and Knowledge Management (CAMP) | 2016

An efficient similarity matching for clustering XML element

Saravadee Sae Tan; Gan Keng Hoon

In recent years, XML has become the major means for information representation and exchange on the Web. Due to the increasing number of XML documents, XML similarity has become essential in a wide range of applications such as information extraction and data integration. In this paper we present a clustering approach that calculates similarity between XML elements to identify the appropriate information unit for a retrieval task. Preliminary experimental results on XML data sets from a bibliographical site and a conference site show that the proposed approach is promising and can be further extended to other possible XML sources.

web intelligence | 2015

Classifly: Classification of Experts by Their Expertise on the Fly

Gan Keng Hoon; Gan Kian Min; Oscar Wong; Ooi Bong Pin; Chan Ying Sheng

Expert finding is commonly required in organizations. Classifying experts well by their expertise helps in giving better access to, and use of, human resources. In this paper, we showcase Classifly, a tool that enables quick learning by using available sources related to an expert's expertise, avoiding the need for manual labeling of training data. We demonstrate our tool in the context of academic search.

web intelligence | 2006

MICE^3: An Information Desktop on the Web

Gan Keng Hoon; Saravadee Sae Tan; Bryan Gan

Resembling the desktop of a personal computer in which files and applications are stored, the MICE3 desktop is a Web-based information desktop where individuals can keep and manage resources such as links to Web sites and services, as well as Web sources and documents. Resources on the desktop are represented in a Web directory structure and annotated with information such as concepts and entities. Categories are used to organize resources in the directory. Each category is annotated with a concept learnt from the contents of the category. The MICE3 desktop allows users to exploit information and knowledge from their resources. Users can retrieve information by navigating the directory structure, looking up the resource index, or searching the resource contents. The desktop also allows users to manage their resources effectively by enabling them to enhance categories with concepts. The concepts are further used to assist users in organizing resources into their directory.