Lay-Ki Soon
Multimedia University
Publication
Featured research published by Lay-Ki Soon.
2012 Third FTRA International Conference on Mobile, Ubiquitous, and Intelligent Computing | 2012
Mohammed Ibrahim Alowais; Lay-Ki Soon
The banking industry loses millions of dollars each year to credit card fraud. Tremendous effort, time and money have been spent on detecting fraud, including studies that build a personalized model for each credit card holder. These studies claim that each card holder exhibits different spending behavior, which necessitates a personalized model. However, to the best of our knowledge, no study has been conducted to verify this hypothesis. Hence, in this paper, we investigate the effectiveness of personalized models compared to aggregated models in identifying fraud for different individuals. For this purpose, we collected some actual transactions and some other data through an online questionnaire. We then constructed personalized and aggregated models. The performance of these models is evaluated on a test data set to compare their accuracy in identifying fraud for different individuals. To our surprise, the experimental results show that aggregated models outperform personalized models. We also compared the performance of random forest and Naïve Bayes in creating the models for fraud detection. Generally, random forest performs better than Naïve Bayes for the aggregated model, while Naïve Bayes performs better for the personalized models.
Multimedia Tools and Applications | 2015
Hui Ngo Goh; Lay-Ki Soon; Su Cheng Haw
The verb is the most important word in a sentence, as it asserts an action, event or feeling about the subject and object discussed in the sentence. In news articles, it is observable that there is always at least one verb attached to the person(s) mentioned in the news. As such, we hypothesize that there must exist some verbs that specifically describe human conduct within a news article. In this paper, we propose an approach that aims to automatically identify named entities (NEs) that perform human activity. More specifically, our approach attempts to identify person-related NEs in general, and the predefined “person name” type in particular, by studying the nature of verbs associated with human activity via TreeTagger, Stanford packages and WordNet. The experimental results show that it is viable to use verbs to identify the “person name” entity type. In addition, our empirical study shows that the approach is applicable to short articles. Another significant contribution of our approach is that it requires neither a training data set nor anaphora resolution.
IEEE International Symposium on Telecommunication Technologies | 2014
Samini Subramaniam; Su-Cheng Haw; Lay-Ki Soon
XML has become the de facto standard for real-world applications on the WWW. Thus, efficient data and query processing is critical to ensure fast response times for user queries. Response time is often influenced by the complexity of the labeling scheme, which is used not only to identify XML nodes uniquely but also to determine the structural relationships between them. The labeling scheme adopted is therefore vital to ensure that query processing is done correctly and promptly. In this paper, we introduce ReLab, a subtree-based labeling scheme that generates labels using depth-first traversal. Our experimental evaluation indicates that ReLab outperforms the Dietz and region numbering schemes in terms of the time taken to generate labels for XML nodes.
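To make the idea of a labeling scheme concrete, here is a minimal Python sketch of a generic region numbering scheme, the kind of baseline the abstract mentions. It is not ReLab, whose label format the abstract does not specify. The function names (`region_labels`, `is_ancestor`, `is_parent`) and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def region_labels(root):
    """Assign a (start, end, level) label to every element in one depth-first pass."""
    labels = {}
    counter = 0

    def visit(node, level):
        nonlocal counter
        counter += 1              # number taken when the node is entered
        start = counter
        for child in node:
            visit(child, level + 1)
        counter += 1              # number taken when the node is left
        labels[node] = (start, counter, level)

    visit(root, 0)
    return labels


def is_ancestor(a, d):
    """True if the region of label a strictly contains the region of label d."""
    return a[0] < d[0] and d[1] < a[1]


def is_parent(a, d):
    """Parent-child: containment plus exactly one level of difference."""
    return is_ancestor(a, d) and d[2] == a[2] + 1


# Hypothetical sample document, used only to exercise the functions.
doc = ET.fromstring("<lib><book><title/><author/></book><cd/></lib>")
labels = region_labels(doc)
by_tag = {element.tag: label for element, label in labels.items()}
```

With labels of this shape, a structural relationship between two nodes can be decided by comparing integers, without walking the tree again, which is why label generation time and label design matter for query response time.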
International Conference on Artificial Intelligence | 2014
Wei Yen Chong; Bhawani Selvaretnam; Lay-Ki Soon
In this paper, we present our preliminary experiments on tweet sentiment analysis. The experiment is designed to extract sentiment based on the subjects that appear in tweets. It detects the sentiment that refers to a specific subject using Natural Language Processing techniques. To classify sentiment, our experiment consists of three main steps: subjectivity classification, semantic association, and polarity classification. The experiment utilizes sentiment lexicons by defining the grammatical relationship between sentiment lexicons and the subject. Experimental results show that the proposed system performs better than current text sentiment analysis tools, as the structure of tweets differs from that of regular text.
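The three steps named above can be sketched in Python as a toy pipeline. This is only an illustration under strong assumptions: the lexicons are tiny and invented, and semantic association is approximated by a fixed word window around the subject, whereas the abstract describes grammatical relationships. All names here (`POSITIVE`, `NEGATIVE`, `subject_sentiment`, and the helpers) are hypothetical.

```python
# Toy sentiment lexicons; illustrative only, not the lexicons used in the paper.
POSITIVE = {"love", "great", "good", "awesome"}
NEGATIVE = {"hate", "bad", "terrible", "slow"}


def tokenize(tweet):
    """Lowercase the tweet and strip common punctuation and Twitter markers."""
    words = (w.strip(".,!?#@").lower() for w in tweet.split())
    return [w for w in words if w]


def is_subjective(tokens):
    """Step 1, subjectivity classification: does any lexicon word occur?"""
    return any(t in POSITIVE or t in NEGATIVE for t in tokens)


def associated_words(tokens, subject, window=2):
    """Step 2, semantic association, approximated by a word window around the subject."""
    nearby = []
    for i, token in enumerate(tokens):
        if token == subject:
            nearby.extend(tokens[max(0, i - window):i])
            nearby.extend(tokens[i + 1:i + 1 + window])
    return [w for w in nearby if w in POSITIVE or w in NEGATIVE]


def polarity(words):
    """Step 3, polarity classification: compare positive and negative counts."""
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def subject_sentiment(tweet, subject):
    tokens = tokenize(tweet)
    if not is_subjective(tokens):
        return "objective"
    return polarity(associated_words(tokens, subject.lower()))
```

For a tweet such as "I love the camera but the battery is terrible", this sketch returns a different polarity for "camera" than for "battery", which is the point of tying sentiment to a subject rather than to the whole tweet.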
Data Mining and Optimization | 2012
Lay-Ki Soon; Yee-Ern Ku; Sang-Ho Lee
URL signature was proposed for web crawling to avoid processing duplicated web pages in further crawling. In this paper, we present our performance study on an open source web crawler, WebSPHINX, in which we have embedded URL signature. The experimental results indicate that URL signature significantly reduces the processing of duplicated web pages for further web crawling, at a negligible cost compared to the crawler without URL signature.
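The abstract does not define how a URL signature is computed, so the following Python sketch is a simplified stand-in: it assumes the signature is a hash of a lightly normalized URL and shows how a crawler could consult a set of seen signatures before processing a page. The names `url_signature` and `SignatureFilter` are hypothetical and do not come from WebSPHINX.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def url_signature(url):
    """Hash a normalized form of the URL (assumed definition, for illustration)."""
    parts = urlsplit(url)
    normalized = urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path or "/",
        parts.query,
        "",                      # drop the fragment; it does not change the page
    ))
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class SignatureFilter:
    """Remembers signatures so the crawler can skip pages it has already handled."""

    def __init__(self):
        self._seen = set()

    def is_new(self, url):
        signature = url_signature(url)
        if signature in self._seen:
            return False
        self._seen.add(signature)
        return True
```

A crawler loop would call `is_new(url)` before fetching or parsing. The set lookup is constant time on average, which is consistent with the abstract's observation that the check adds negligible cost.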
International Journal of Software Engineering and Knowledge Engineering | 2017
Samini Subramaniam; Su-Cheng Haw; Lay-Ki Soon; Kok-Leong Koong
Dependence on XML has increased tremendously over the years. As such, the need for efficient query processing techniques is certainly important. Although existing techniques are able to process queries with various edge combinations, they still suffer from processing overheads caused by buffering large amounts of intermediate results, particularly for parent–child (P–C) edges. Therefore, in this paper, we propose an architecture named ReLaQ, which comprises two components, ReLab+ (node annotator) and QTwig (query processor), for efficient XML query processing. QTwig improves retrieval time by incorporating a pruning technique that avoids accessing irrelevant data during query processing. Experimental results indicate that ReLaQ outperforms TwigStack for both path and twig queries on both regular- and skewed-structured datasets. In addition, this is supported by a correctness analysis of ReLaQ.
Expert Systems With Applications | 2017
Chong Chai Chua; Tek Yong Lim; Lay-Ki Soon; Enya Kong Tang; Bali Ranaivo-Malançon
The main tasks in Example-based Machine Translation (EBMT) comprise source text decomposition, followed by translation example matching and selection, and finally adaptation and recombination of the target translation. As natural language is inherently ambiguous, preserving the source text’s meaning throughout these processes is complex and challenging. A structural semantics is introduced as an attempt at a meaning-based approach to improving the EBMT system. The structural semantics is used to support deeper semantic similarity measurement and to impose structural constraints in translation example selection. A semantic compositional structure is derived from the structural semantics of the selected translation examples. This semantic compositional structure serves as a representation structure to preserve the consistency and integrity of the input sentence’s meaning structure throughout the recombination process. In this paper, an English-to-Malay EBMT system is presented to demonstrate the practical application of this structural semantics. Evaluation of the translation test results shows that the new translation framework based on the structural semantics outperforms the previous EBMT framework.
Conference on Information and Knowledge Management | 2014
Saravadee Sae Tan; Tek Yong Lim; Lay-Ki Soon; Enya Kong Tang
This paper addresses the problem of matching between highly heterogeneous structures. The problem is modeled as a classification task in which training examples are used to learn the matching between structures. In our approach, training is performed using partially labeled data. We propose a Greedy Mapping approach to generate training examples from partially labeled data. Different types of structures may have different types of attributes that can be exploited to improve matching. We utilize three types of attributes, namely text content, structure name and path correspondence. Experiments are performed on two types of structures: semantic domain and semantic role. We evaluate the effectiveness of Greedy Mapping as well as the performance of the different types of attributes. Finally, the results are presented and discussed.
International Conference on Computational Linguistics | 2013
Suhaila Saee; Lay-Ki Soon; Tek Yong Lim; Bali Ranaivo-Malançon; Enya Kong Tang
In this paper, we describe a semi-automatic approach to acquiring morphological rules for a morphological analyser for an under-resourced language, namely Iban. We modify ideas from previous automatic morphological rule acquisition approaches, whose input requirements have become constraints on developing an analyser for an under-resourced language. This work introduces three main steps in acquiring the rules from the under-resourced language: morphological data acquisition, morphological information validation and morphological rule extraction. The experiment shows that this approach gives successful results, with a precision of 0.76 and a recall of 0.99. Our findings also suggest that the availability of linguistic references and the selection of assorted techniques for morphology analysis could lead to the design of the workflow. We believe this workflow will assist other researchers in building morphological analysers with validated morphological rules for under-resourced languages.
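For readers who want a single summary figure, the reported precision and recall can be combined into an F1 score. The abstract does not report this value; it is derived here only from the two stated numbers, assuming the standard harmonic-mean definition.

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.76 \times 0.99}{0.76 + 0.99}
    = \frac{1.5048}{1.75}
    \approx 0.86
```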
Knowledge Based Systems | 2013
Hui-Ngo Goh; Lay-Ki Soon; Su-Cheng Haw
Named entity recognition (NER) is a subtask of information extraction that aims to locate atomic elements and classify them into predefined types. Various NER techniques and tools have been developed to suit the applications of interest. However, most NER work focuses on the non-fiction domain. The fiction domain presents a complex context for locating NEs, specifically because its characters can be represented across a diverse spectrum, ranging from living things (animals, plants and persons) to non-living things (vehicles, furniture). Motivated by the hypothesis that there always exist verbs that specifically describe human conduct, in this paper we propose an NER system that aims to autonomously identify NEs that perform human activity based on verb analysis (VAHA). More specifically, our approach attempts to identify the dominant character (DC) by studying the nature of verbs associated with human activity via TreeTagger, Stanford packages and WordNet. Our experimental results validate our initial hypothesis that NEs can be accurately identified by referring to the verbs associated with human activity. Our empirical study also shows that the approach is applicable to short articles. Another significant contribution of our approach is that it requires neither a training data set nor anaphora resolution.