Lay-Ki Soon | Researchain

Publication

Featured researches published by Lay-Ki Soon.

2012 Third FTRA International Conference on Mobile, Ubiquitous, and Intelligent Computing | 2012

Credit Card Fraud Detection: Personalized or Aggregated Model

Mohammed Ibrahim Alowais; Lay-Ki Soon

The banking industry loses millions of dollars each year to credit card fraud. Tremendous effort, time and money have been spent on fraud detection, including studies that build a personalized model for each credit card holder. These studies claim that each card holder has distinct spending behavior, which necessitates a personalized model. However, to the best of our knowledge, no study has been conducted to verify this hypothesis. Hence, in this paper, we investigate the effectiveness of personalized models compared with aggregated models in identifying fraud for different individuals. For this purpose, we collected some actual transactions and some other data through an online questionnaire. We then constructed personalized and aggregated models. The performance of these models is evaluated on a test data set to compare their accuracy in identifying fraud for different individuals. To our surprise, the experimental results show that aggregated models outperform personalized models. We also compared the performance of random forest and Naïve Bayes in creating the models for fraud detection. Generally, random forest performs better than Naïve Bayes for the aggregated models, while Naïve Bayes performs better for the personalized models.

Multimedia Tools and Applications | 2015

Automatic discovery of person-related named-entity in news articles based on verb analysis

Hui Ngo Goh; Lay-Ki Soon; Su Cheng Haw

The verb is the most important word in a sentence, as it asserts an action, event or feeling about the subject and object discussed in the sentence. In news articles, it is observable that at least one verb is always attached to the person(s) mentioned in the news. We therefore hypothesize that some verbs specifically describe human conduct within a news article. In this paper, we propose an approach that automatically identifies named entities (NEs) that perform human activity. More specifically, our approach attempts to identify person-related NEs in general, and the predefined “person name” type in particular, by studying the nature of verbs associated with human activity via TreeTagger, Stanford packages and WordNet. The experimental results show that it is viable to use verbs to identify the “person name” entity type. In addition, our empirical study shows that the approach is applicable to short articles. Another significant contribution of our approach is that it requires neither a training data set nor anaphora resolution.

IEEE International Symposium on Telecommunication Technologies | 2014

ReLab: A subtree based labeling scheme for efficient XML query processing

Samini Subramaniam; Su-Cheng Haw; Lay-Ki Soon

XML has become the de facto standard for real-world applications on the WWW. Efficient data and query processing is therefore critical to ensure a fast response time for user queries. Response time is often influenced by the complexity of the labeling scheme, which is used not only to identify XML nodes uniquely but also to determine structural relationships between them. The labeling scheme adopted is vital to ensure that query processing is done correctly and promptly. In this paper, we introduce ReLab, a subtree-based labeling scheme that generates labels using depth-first traversal. Our experimental evaluation indicates that ReLab outperforms the Dietz and region numbering schemes in terms of the time taken to generate labels for XML nodes.
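To make the idea of a labeling scheme concrete, here is a minimal sketch of a generic region numbering scheme, one of the baselines named above. It is not ReLab itself, whose label format is not given here. The function names `region_labels`, `is_ancestor` and `is_parent` and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def region_labels(root):
    """Assign a (start, end, level) label to every element in one depth-first pass."""
    labels = {}
    counter = 0

    def visit(node, level):
        nonlocal counter
        counter += 1
        start = counter          # number assigned when the node is entered
        for child in node:
            visit(child, level + 1)
        counter += 1             # number assigned when the node is left
        labels[node] = (start, counter, level)

    visit(root, 0)
    return labels


def is_ancestor(a, d):
    """a is an ancestor of d if a's region strictly contains d's region."""
    return a[0] < d[0] and d[1] < a[1]


def is_parent(a, d):
    """A parent is an ancestor exactly one level above."""
    return is_ancestor(a, d) and a[2] + 1 == d[2]


# Hypothetical sample document, used only to exercise the sketch.
doc = ET.fromstring("<lib><book><title/><author/></book><journal/></lib>")
labels = region_labels(doc)
by_tag = {node.tag: label for node, label in labels.items()}
```

With labels of this kind, a structural relationship between two nodes can be decided by comparing their numbers alone, without walking the tree again.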

International Conference on Artificial Intelligence | 2014

Natural Language Processing for Sentiment Analysis: An Exploratory Analysis on Tweets

Wei Yen Chong; Bhawani Selvaretnam; Lay-Ki Soon

In this paper, we present our preliminary experiments on sentiment analysis of tweets. The experiment is designed to extract sentiment based on the subjects that appear in tweets. It detects the sentiment that refers to a specific subject using Natural Language Processing techniques. To classify sentiment, our experiment consists of three main steps: subjectivity classification, semantic association, and polarity classification. The experiment uses sentiment lexicons and defines the grammatical relationship between sentiment words and the subject. Experimental results show that the proposed system performs better than current text sentiment analysis tools, because the structure of tweets differs from that of regular text.
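The three steps named above can be illustrated with a toy sketch. It is not the authors' system: the tiny lexicons are invented, and a fixed word window stands in for the grammatical relationships that the paper uses for semantic association. All function names are hypothetical.

```python
# Toy lexicons, assumed for illustration only.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "terrible", "bad"}
PUNCTUATION = ".,!?#@"


def tokenize(tweet):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    stripped = (t.strip(PUNCTUATION).lower() for t in tweet.split())
    return [t for t in stripped if t]


def is_subjective(tokens):
    """Step 1, subjectivity classification: does any sentiment word occur?"""
    return any(t in POSITIVE or t in NEGATIVE for t in tokens)


def associated_words(tokens, subject, window=2):
    """Step 2, semantic association: sentiment words near the subject.

    A word window is a crude stand-in for grammatical relations.
    """
    hits = []
    for i, token in enumerate(tokens):
        if token == subject:
            nearby = tokens[max(0, i - window): i + window + 1]
            hits.extend(w for w in nearby if w in POSITIVE or w in NEGATIVE)
    return hits


def polarity(words):
    """Step 3, polarity classification: sign of the summed word scores."""
    score = sum(1 if w in POSITIVE else -1 for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify(tweet, subject, window=2):
    tokens = tokenize(tweet)
    if not is_subjective(tokens):
        return "neutral"
    return polarity(associated_words(tokens, subject.lower(), window))
```

Because association is tied to the subject, one tweet can yield different sentiments for different subjects, which is the behavior the abstract describes.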

Data Mining and Optimization | 2012

Web crawler with URL signature — A performance study

Lay-Ki Soon; Yee-Ern Ku; Sang-Ho Lee

URL signature was proposed for use in web crawling to avoid processing duplicated web pages during further crawling. In this paper, we present a performance study on an open source web crawler, WebSPHINX, in which we have embedded URL signature. The experimental results indicate that, compared with the crawler without URL signature, URL signature significantly reduces the processing of duplicated web pages at a negligible cost.
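The abstract does not define how a URL signature is computed, so the following is only a sketch of the general idea of skipping already-seen pages in a crawl frontier. A hash of a lightly normalized URL is used as a stand-in signature. The names `url_signature` and `Frontier` are hypothetical and are not part of WebSPHINX.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def url_signature(url):
    """Stand-in signature: SHA-1 of the URL with case-folded scheme and host
    and the fragment removed. This is an assumption, not the paper's definition."""
    parts = urlsplit(url)
    normalized = urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path or "/",
        parts.query,
        "",  # drop the fragment, which does not change the fetched page
    ))
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class Frontier:
    """Queue of URLs to crawl that rejects URLs whose signature was seen before."""

    def __init__(self):
        self.seen = set()
        self.queue = []

    def add(self, url):
        signature = url_signature(url)
        if signature in self.seen:
            return False  # duplicate: skip further processing
        self.seen.add(signature)
        self.queue.append(url)
        return True
```

The set lookup is constant time on average, which is consistent with the abstract's point that the check adds little cost relative to fetching and parsing a duplicate page.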

International Journal of Software Engineering and Knowledge Engineering | 2017

QTwig: A Structural Join Algorithm for Efficient Query Retrieval Based on Region-Based Labeling

Samini Subramaniam; Su-Cheng Haw; Lay-Ki Soon; Kok-Leong Koong

Reliance on XML has increased tremendously over the years, so efficient query processing techniques are important. Although existing techniques can process queries with various edge combinations, they still suffer from processing overheads caused by buffering large amounts of intermediate results, particularly for parent–child (P–C) edges. Therefore, in this paper, we propose an architecture named ReLaQ, which comprises two components, ReLab+ (node annotator) and QTwig (query processor), for efficient XML query processing. QTwig improves retrieval time by incorporating a pruning technique that avoids accessing irrelevant data during query processing. Experimental results indicate that ReLaQ outperforms TwigStack for both path and twig queries on both regular- and skewed-structured datasets. This is further supported by a correctness analysis of ReLaQ.

Expert Systems With Applications | 2017

Meaning preservation in Example-based Machine Translation with structural semantics

Chong Chai Chua; Tek Yong Lim; Lay-Ki Soon; Enya Kong Tang; Bali Ranaivo-Malançon

The main tasks in Example-based Machine Translation (EBMT) comprise source text decomposition, followed by matching and selection of translation examples, and finally adaptation and recombination of the target translation. As natural language is inherently ambiguous, preserving the source text’s meaning throughout these processes is complex and challenging. A structural semantics is introduced as an attempt at a meaning-based approach to improve the EBMT system. The structural semantics is used to support deeper semantic similarity measurement and to impose structural constraints in the selection of translation examples. A semantic compositional structure is derived from the structural semantics of the selected translation examples. This semantic compositional structure serves as a representation structure that preserves the consistency and integrity of the input sentence’s meaning structure throughout the recombination process. In this paper, an English to Malay EBMT system is presented to demonstrate the practical application of this structural semantics. Evaluation of the translation test results shows that the new translation framework based on the structural semantics outperforms the previous EBMT framework.

Conference on Information and Knowledge Management | 2014

Learning to Match Heterogeneous Structures using Partially Labeled Data

Saravadee Sae Tan; Tek Yong Lim; Lay-Ki Soon; Enya Kong Tang

This paper addresses the problem of matching between highly heterogeneous structures. The problem is modeled as a classification task in which training examples are used to learn the matching between structures. In our approach, training is performed using partially labeled data. We propose a Greedy Mapping approach to generate training examples from partially labeled data. Different types of structures may have different types of attributes that can be exploited to improve matching. We use three types of attributes in the matching problem: text content, structure name and path correspondence. Experiments are performed on two types of structures: semantic domain and semantic role. We evaluate the effectiveness of Greedy Mapping as well as the performance of the different types of attributes. Finally, the results are presented and discussed.

International Conference on Computational Linguistics | 2013

Semi-automatic acquisition of two-level morphological rules for Iban language

Suhaila Saee; Lay-Ki Soon; Tek Yong Lim; Bali Ranaivo-Malançon; Enya Kong Tang

In this paper, we describe a semi-automatic acquisition of morphological rules for a morphological analyser for an under-resourced language, namely the Iban language. We modify ideas from previous approaches to automatic acquisition of morphological rules, whose input requirements have become constraints on developing an analyser for an under-resourced language. This work introduces three main steps for acquiring the rules from the under-resourced language: morphological data acquisition, morphological information validation and morphological rules extraction. The experiment shows that this approach gives successful results, with a precision of 0.76 and a recall of 0.99. Our findings also suggest that the availability of linguistic references and the selection of assorted techniques for morphology analysis could lead to the design of the workflow. We believe this workflow will help other researchers build morphological analysers with validated morphological rules for under-resourced languages.

Knowledge Based Systems | 2013

Automatic dominant character identification in fables based on verb analysis - Empirical study on the impact of anaphora resolution

Hui-Ngo Goh; Lay-Ki Soon; Su-Cheng Haw

Named entity recognition (NER) is a subtask of information extraction that aims to locate atomic elements and classify them into predefined types. Various NER techniques and tools have been developed to suit the needs of particular applications. However, most NER work focuses on the non-fiction domain. The fiction domain presents a complex context for locating NEs, because its characters can span a diverse spectrum, ranging from living things (animals, plants, and persons) to non-living things (vehicles, furniture). Motivated by the hypothesis that there always exist verbs that specifically describe human conduct, in this paper we propose an NER system that aims to identify NEs that perform human activity based on verb analysis (VAHA) in an autonomous manner. More specifically, our approach attempts to identify the dominant character (DC) by studying the nature of verbs associated with human activity via TreeTagger, Stanford packages and WordNet. Our experimental results validate our initial hypothesis that NEs can be accurately identified by referring to the verbs associated with human activity. Our empirical study also shows that the approach is applicable to short articles. Another significant contribution of our approach is that it requires neither a training data set nor anaphora resolution.