Xiuming Yu | Researchain

Publication

Featured research published by Xiuming Yu.

Journal of Information Science | 2014

MapReduce-based web mining for prediction of web-user navigation

Meijing Li; Xiuming Yu; Keun Ho Ryu

Predicting web user behaviour is a typical application of frequent-sequence-pattern mining. With the rapid growth of the Internet, a large amount of information is stored in web logs, and traditional frequent-sequence-pattern-mining algorithms are hard pressed to analyse big datasets. In this paper, we propose an efficient way to predict the navigation patterns of web users by improving frequent-sequence-pattern-mining algorithms based on the MapReduce programming model, which can handle huge datasets efficiently. In the experiments, we show that our proposed MapReduce-based algorithm is more efficient than traditional frequent-sequence-pattern-mining algorithms. By comparing our proposed algorithms with existing web-usage-mining algorithms, we also show that using the MapReduce programming model saves time.
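The abstract does not give the algorithm itself, so the following is only a minimal single-process sketch of how a map step and a reduce step could count frequent navigation sequences. The function names, the restriction to contiguous subsequences of a fixed length, and the sample sessions are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from itertools import chain


def map_session(session, length):
    """Map step: emit (subsequence, 1) once per distinct contiguous subsequence."""
    seen = {tuple(session[i:i + length]) for i in range(len(session) - length + 1)}
    return [(seq, 1) for seq in seen]


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def frequent_sequences(sessions, length, min_support):
    """Keep subsequences that appear in at least min_support sessions."""
    pairs = chain.from_iterable(map_session(s, length) for s in sessions)
    counts = reduce_counts(pairs)
    return {seq: n for seq, n in counts.items() if n >= min_support}


# Hypothetical sessions, each an ordered list of visited pages.
sessions = [
    ["home", "news", "sport"],
    ["home", "news", "weather"],
    ["news", "sport", "home"],
]
result = frequent_sequences(sessions, length=2, min_support=2)
```

In a real MapReduce deployment the map calls would run in parallel over log partitions and the framework would group keys before the reduce step; here both run in one process only to show the data flow.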

Archive | 2012

Application of Closed Gap-Constrained Sequential Pattern Mining in Web Log Data

Xiuming Yu; Meijing Li; Dong Gyu Lee; Kwang Deuk Kim; Keun Ho Ryu

Discovery of information in web log data is a very popular research area in the field of data mining. Two common objectives are to obtain useful information about web users’ behavior and to analyze the structure of web sites. In this paper, we suggest a novel approach to generating web sequential patterns from web log data using the gap-constrained method. The mining process in the proposed approach is as follows. First, the raw web log data are pre-processed by removing irrelevant or redundant items, grouping records by user, and transforming the web log data into a set of tuples (sequence identifier, sequence) constrained by visiting time. Second, web access patterns, which are closed sequential patterns with gap constraints, are generated from the web log data using the Gap-BIDE algorithm with two parameters: a minimum support threshold and a gap constraint. In the experiment, the data set is derived from http://www.vtsns.edu.rs/maja/, as proposed in [1]. The results show that, by applying sequential pattern mining to web log data as presented in this paper, we can find information about the navigational behavior of web users, and that the structure of the web pages can be designed more reasonably according to the order of the obtained patterns.
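The two steps described above can be sketched roughly as follows. This is not Gap-BIDE: it only shows pre-processing into per-user sequences and a support count for one candidate pattern under a maximum-gap constraint, without the closedness check or the visiting-time constraint. The record layout, the list of ignored file suffixes, and all function names are assumptions.

```python
from collections import defaultdict

# Assumed examples of "irrelevant" items: embedded resources rather than pages.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")


def build_sequences(records):
    """Group (user, timestamp, url) records into one time-ordered sequence per user."""
    by_user = defaultdict(list)
    for user, timestamp, url in records:
        if not url.lower().endswith(IGNORED_SUFFIXES):
            by_user[user].append((timestamp, url))
    return {user: [url for _, url in sorted(visits)] for user, visits in by_user.items()}


def occurs_with_gap(sequence, pattern, max_gap):
    """True if pattern occurs with at most max_gap items between consecutive elements."""
    if not pattern:
        return True
    # Positions in the sequence where the pattern prefix matched so far can end.
    ends = {i for i, item in enumerate(sequence) if item == pattern[0]}
    for symbol in pattern[1:]:
        ends = {
            j
            for i in ends
            for j in range(i + 1, min(i + max_gap + 2, len(sequence)))
            if sequence[j] == symbol
        }
        if not ends:
            return False
    return True


def support(sequences, pattern, max_gap):
    """Number of user sequences that contain the pattern under the gap constraint."""
    return sum(occurs_with_gap(seq, pattern, max_gap) for seq in sequences.values())


# Hypothetical log records: (user, timestamp, url).
records = [
    ("u1", 1, "/index.html"), ("u1", 2, "/logo.png"),
    ("u1", 3, "/news.html"), ("u1", 4, "/contact.html"),
    ("u2", 1, "/index.html"), ("u2", 5, "/contact.html"),
    ("u3", 2, "/index.html"), ("u3", 3, "/a.html"),
    ("u3", 4, "/b.html"), ("u3", 6, "/contact.html"),
]
sequences = build_sequences(records)
pattern = ["/index.html", "/contact.html"]
```

A pattern would be reported as frequent when `support(sequences, pattern, max_gap)` reaches the chosen minimum support threshold.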

Database and Expert Systems Applications | 2012

Prediction of Web User Behavior by Discovering Temporal Relational Rules from Web Log Data

Xiuming Yu; Meijing Li; Incheon Paik; Keun Ho Ryu

The Web has become a very popular and interactive medium in our lives. With the rapid development and proliferation of e-commerce and Web-based information systems, web mining has become an essential tool for discovering specific information on the Web. Many web mining techniques have been proposed previously. In this paper, temporal interval relational rule mining is applied to discover knowledge from web log data. Unlike previous web mining techniques, our approach takes the timestamp attribute of web log data into account. First, temporal intervals of web page accesses are formed by folding over a periodicity. Then, relational rules are discovered under the constraint of these temporal intervals. In the experiment, we analyze the resulting relational rules and the effect of the important parameters used in the mining approach.
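As a rough illustration of folding timestamps over a periodicity, the sketch below maps each access time onto its position within a daily period and then into fixed-width intervals. The daily period, the six-hour interval width, and the function names are assumptions; the abstract does not state how the authors form their intervals.

```python
from collections import defaultdict

DAY = 24 * 60 * 60  # assumed periodicity, in seconds


def fold(timestamp, period=DAY):
    """Position of a timestamp within one period."""
    return timestamp % period


def interval_of(timestamp, period=DAY, width=6 * 60 * 60):
    """Index of the fixed-width interval that the folded timestamp falls into."""
    return fold(timestamp, period) // width


def group_by_interval(visits, period=DAY, width=6 * 60 * 60):
    """Collect the pages accessed in each folded interval."""
    groups = defaultdict(set)
    for timestamp, page in visits:
        groups[interval_of(timestamp, period, width)].add(page)
    return dict(groups)


# Hypothetical visits on different days: (timestamp in seconds, page).
visits = [
    (1 * 3600, "/news"),
    (DAY + 2 * 3600, "/news"),
    (2 * DAY + 13 * 3600, "/shop"),
    (3 * DAY + 14 * 3600, "/cart"),
]
groups = group_by_interval(visits)
```

Accesses from different days that fall in the same part of the day end up in the same interval, so rules can then be searched for within each interval separately.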

Mathematical Problems in Engineering | 2015

A Novel Approach for Protein-Named Entity Recognition and Protein-Protein Interaction Extraction

Meijing Li; Tsendsuren Munkhdalai; Xiuming Yu; Keun Ho Ryu

Many researchers focus on developing protein-named entity recognition (Protein-NER) or PPI extraction systems. However, work on these two topics has not been well integrated, and the Protein-NER component of existing PPI extraction systems still needs improvement. In this paper, we developed a protein-protein interaction extraction system named PPIMiner based on Support Vector Machine (SVM) and parsing tree. PPIMiner consists of three main models: a natural language processing (NLP) model, a Protein-NER model, and a PPI discovery model. The Protein-NER model, named ProNER, identifies protein names using two methods: a dictionary-based method and a machine learning-based method. ProNER is capable of identifying more proteins than the dictionary-based Protein-NER models in other existing systems. The final PPIs extracted by the PPI discovery model are presented in detail, showing the protein interaction types and the occurrence frequency through two different methods. In the experiments, the results show that the performance achieved by our ProNER and PPI discovery models is better than that of other existing tools. PPIMiner applies this protein-named entity recognition approach and the parsing-tree-based PPI extraction method to improve the performance of PPI extraction. We also provide an easy-to-use interface for accessing the PPI database and an online system for PPI extraction and Protein-NER.
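The dictionary-based half of such a recognizer can be pictured with a small longest-match lookup over tokens. This is a generic sketch, not ProNER: the dictionary entries, the `max_len` limit, and the function name are hypothetical, and the machine learning-based method is not covered.

```python
def tag_proteins(tokens, dictionary, max_len=3):
    """Return (start, end, text) spans whose lowercased text is in the dictionary.

    At each position the longest candidate (up to max_len tokens) is tried first.
    """
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            text = " ".join(tokens[i:i + n])
            if text.lower() in dictionary:
                matches.append((i, i + n, text))
                i += n
                break
        else:
            # No dictionary entry starts at this token.
            i += 1
    return matches


# Hypothetical dictionary of lowercased protein names.
dictionary = {"protein kinase c", "p53"}
tokens = "Protein kinase C binds p53 in vitro".split()
matches = tag_proteins(tokens, dictionary)
```

A pure lookup like this misses any name absent from the dictionary, which is the usual motivation for pairing it with a learned tagger.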

Cluster Computing | 2018

Face recognition technology development with Gabor, PCA and SVM methodology under illumination normalization condition

Meijing Li; Xiuming Yu; Keun Ho Ryu; Sanghyuk Lee; Nipon Theera-Umpon

Face recognition is a challenging research field in computer science, and numerous approaches have been proposed. However, no effective solution has been reported for full illumination variation of face images. In this paper, we propose a methodology to solve the problem of full illumination variation by combining histogram equalization (HE) with a Gaussian low-pass filter (GLPF). Following illumination normalization, feature extraction is performed using both Gabor wavelets and principal component analysis. Next, a Support Vector Machine classifier is used for face classification. In the experiments, the performance of our proposed approach was compared with that of conventional approaches on three different face databases. Experimental results show that our proposed illumination normalization approach (HE_GLPF) performs better than the conventional illumination normalization approaches on face images with the full illumination variation problem.
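To make the two normalization ingredients concrete, here is a small pure-Python sketch of histogram equalization followed by a spatial Gaussian low-pass filter on a grayscale image stored as a list of rows. The order of the two steps, the kernel radius and sigma, and the spatial (rather than frequency-domain) filtering are assumptions; the abstract does not specify how HE and GLPF are combined.

```python
import math


def equalize_histogram(image, levels=256):
    """Spread the grey levels of an integer image using its cumulative histogram."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # constant image: nothing to equalize
        return [row[:] for row in image]
    lut = [round(max(c - cdf_min, 0) / (n - cdf_min) * (levels - 1)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    blurred = []
    for row in image:
        last = len(row) - 1
        blurred.append([
            sum(k * row[min(max(x + d - radius, 0), last)] for d, k in enumerate(kernel))
            for x in range(len(row))
        ])
    return blurred


def gaussian_low_pass(image, radius=1, sigma=1.0):
    """Separable Gaussian smoothing: blur rows, then columns."""
    kernel = gaussian_kernel(radius, sigma)
    rows_done = _blur_rows(image, kernel)
    transposed = [list(col) for col in zip(*rows_done)]
    cols_done = _blur_rows(transposed, kernel)
    return [list(col) for col in zip(*cols_done)]


def normalize_illumination(image):
    """Assumed pipeline: equalize first, then smooth."""
    return gaussian_low_pass(equalize_histogram(image))


# Hypothetical 2x2 grayscale image.
image = [[50, 50], [100, 200]]
equalized = equalize_histogram(image)
smoothed = normalize_illumination(image)
```

The normalized image would then be passed to feature extraction and classification, which are outside the scope of this sketch.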

Soft Computing | 2012

An application of improved gap-BIDE algorithm for discovering access patterns

Xiuming Yu; Meijing Li; Taewook Kim; Seon-Phil Jeong; Keun Ho Ryu

Discovering access patterns from web log data is a typical sequential pattern mining application, and many access pattern mining algorithms have been proposed. In this paper, we propose an improved version of the Gap-BIDE algorithm to extract user access patterns from web log data. Compared with the previous Gap-BIDE algorithm, the proposed algorithm adds a step that obtains a large event set: before generating candidate patterns, it finds the frequent events by discarding the infrequent events that do not occur continuously within an accessing time. In the experiment, we compare the previous access pattern mining algorithm with the proposed one; the comparison shows that our approach is very efficient at discovering access patterns in large databases.
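The idea of discarding infrequent events before candidate generation can be sketched as a simple pre-filter. The code below is an assumption-laden illustration, not the improved Gap-BIDE algorithm: it treats an event as frequent when it appears in at least `min_support` sequences and ignores the "accessing time" condition, whose exact definition is not given in the abstract.

```python
from collections import Counter


def frequent_events(sequences, min_support):
    """Events that appear in at least min_support sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))  # count each event once per sequence
    return {event for event, n in counts.items() if n >= min_support}


def prune_sequences(sequences, min_support):
    """Drop infrequent events so that later candidate generation has less to scan."""
    keep = frequent_events(sequences, min_support)
    return [[event for event in seq if event in keep] for seq in sequences]


# Hypothetical access sequences.
sequences = [["a", "b", "c"], ["a", "c", "d"], ["a", "e", "c"]]
large_events = frequent_events(sequences, min_support=2)
pruned = prune_sequences(sequences, min_support=2)
```

Note that physically removing events changes the distance between the remaining ones, so a gap-constrained miner would need to keep the original positions alongside the pruned events.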

International Conference on Computational Science | 2015

Clustering of Web Users Based on Matrix of Influence Degree

Xiuming Yu; Meijing Li; Keun Ho Ryu

Clustering of web users is an important research field in web mining. Information about web user clusters has been widely used in many applications, such as website structure optimization, website reconstruction, and advertisement distribution. In this paper, we convert web log data into a sparse matrix and propose a novel approach that calculates the influence degree of each web page for all web users, building a Matrix of Influence Degree (MID) from the generated sparse matrix. Web users can then be clustered simply from the generated MID. In the experiments, the results show that our proposed approach can serve as a basis for clustering web users in web log data.
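The abstract does not define the influence degree, so the sketch below uses a stand-in: the share of a user's visits that go to each page. It shows only the general shape of the pipeline (log to sparse matrix, matrix to weights, weights to groups). The weighting formula, the grouping rule, and all names are hypothetical.

```python
from collections import defaultdict


def build_sparse_matrix(log):
    """Sparse user-by-page visit counts, stored as nested dicts."""
    matrix = defaultdict(lambda: defaultdict(int))
    for user, page in log:
        matrix[user][page] += 1
    return {user: dict(pages) for user, pages in matrix.items()}


def influence_matrix(matrix):
    """Stand-in influence degree: fraction of a user's visits made to each page."""
    result = {}
    for user, pages in matrix.items():
        total = sum(pages.values())
        result[user] = {page: count / total for page, count in pages.items()}
    return result


def cluster_by_top_page(mid):
    """Group users by their highest-weighted page (ties go to the first page by name)."""
    clusters = defaultdict(list)
    for user, weights in mid.items():
        top = max(sorted(weights), key=weights.get)
        clusters[top].append(user)
    return {page: sorted(users) for page, users in clusters.items()}


# Hypothetical log entries: (user, page).
log = [
    ("u1", "/a"), ("u1", "/a"), ("u1", "/b"),
    ("u2", "/a"),
    ("u3", "/b"), ("u3", "/b"), ("u3", "/a"),
]
matrix = build_sparse_matrix(log)
mid = influence_matrix(matrix)
clusters = cluster_by_top_page(mid)
```

Storing only the non-zero entries keeps memory use proportional to the number of distinct (user, page) pairs rather than to users times pages.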

IEEE International Conference on Adaptive Science Technology | 2011