Bay Vo | Researchain

Publication

Featured research published by Bay Vo.

Expert Systems With Applications | 2013

A new method for mining Frequent Weighted Itemsets based on WIT-trees

Bay Vo; Frans Coenen; Bac Le

The mining of frequent itemsets plays an important role in the mining of association rules. Frequent itemsets are typically mined from binary databases, although each item in a transaction may have a different significance. Mining Frequent Weighted Itemsets (FWI) from weighted item transaction databases addresses this issue. This paper therefore proposes algorithms for the fast mining of FWI from weighted item transaction databases. Firstly, an algorithm for directly mining FWI using WIT-trees is presented. After that, some theorems are developed concerning the fast mining of FWI. Based on these theorems, an advanced algorithm for mining FWI is proposed. Finally, a Diffset strategy for the efficient computation of the weighted support for itemsets is described, and an algorithm for mining FWI using Diffsets is presented. A complete evaluation of the proposed algorithms is also presented.
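As a rough illustration of the weighted-support and Diffset ideas mentioned in this abstract, the sketch below is a minimal Python example. It is not the paper's algorithm. It assumes a common definition that the abstract itself does not state: a transaction's weight is the average weight of its items, and an itemset's weighted support is the share of total transaction weight held by the transactions containing it. All function names are hypothetical.

```python
from typing import Dict, List, Set


def transaction_weights(db: List[Set[str]], weights: Dict[str, float]) -> List[float]:
    """Assumed definition: a transaction's weight is the mean weight of its items.

    Transactions are assumed to be non-empty.
    """
    return [sum(weights[item] for item in t) / len(t) for t in db]


def tidset(db: List[Set[str]], itemset: Set[str]) -> Set[int]:
    """Ids of the transactions that contain every item of `itemset`."""
    return {tid for tid, t in enumerate(db) if itemset <= t}


def weighted_support(tids: Set[int], tw: List[float]) -> float:
    """Weight of the transactions in `tids` as a fraction of the total weight."""
    return sum(tw[t] for t in tids) / sum(tw)


def weighted_support_via_diffset(tids_x: Set[int], tids_y: Set[int], tw: List[float]) -> float:
    """Weighted support of X union Y computed from X's support and a diffset.

    The diffset holds the transactions that contain X but not Y, so removing
    their weight from X's weighted support leaves the support of X union Y.
    """
    diffset = tids_x - tids_y
    return weighted_support(tids_x, tw) - sum(tw[t] for t in diffset) / sum(tw)


# Hypothetical toy data, not taken from the paper.
db = [{"A", "B"}, {"A", "C"}, {"A", "B", "C"}, {"B"}]
weights = {"A": 0.5, "B": 0.25, "C": 1.0}
tw = transaction_weights(db, weights)

direct = weighted_support(tidset(db, {"A", "B"}), tw)
via_diffset = weighted_support_via_diffset(tidset(db, {"A"}), tidset(db, {"B"}), tw)
```

Under these assumptions, `direct` and `via_diffset` agree up to floating-point rounding. The diffset route only touches the transactions that drop out when the itemset is extended, which is typically a smaller set on dense data.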

Knowledge-Based Systems | 2013

A lattice-based approach for mining most generalization association rules

Bay Vo; Tzung-Pei Hong; Bac Le

Traditional association rules contain some redundant information. Some variants based on support and confidence measures, such as non-redundant rules and minimal non-redundant rules, were thus proposed to reduce the redundant information. In the past, we proposed most generalization association rules (MGARs), which were more compact than (minimal) non-redundant rules in that they considered the condition of equal or higher confidence, instead of only equal confidence. However, the execution time for generating MGARs increased with an increasing number of frequent closed itemsets. Since lattices are an effective data structure widely used in data mining, in this paper we propose a lattice-based approach for the fast mining of most generalization association rules. Firstly, a new algorithm for building a frequent-closed-itemset lattice is introduced. After that, a theorem on pruning nodes in the lattice for rule generation is derived. Finally, an algorithm for the fast mining of MGARs from the constructed lattice is developed. The proposed algorithm is tested on several databases, and the results show that it is more efficient than mining MGARs directly from frequent closed itemsets.

Expert Systems With Applications | 2013

CAR-Miner

Loan T. T. Nguyen; Bay Vo; Tzung-Pei Hong; Hoang Chi Thanh

Highlights:
- We propose the MECR-tree data structure for mining class-association rules.
- Some theorems for fast joining of itemsets and computing supports of rules are developed.
- An efficient algorithm for mining class-association rules based on the MECR-tree and theorems is proposed.
- Our proposed algorithm is always faster than ECR-CARM.

Building a high-accuracy classifier is a problem in real applications. One high-accuracy classifier used for this purpose is based on association rules. In the past, some studies showed that classification based on association rules (or class-association rules, CARs) has higher accuracy than that of other rule-based methods such as ILA and C4.5. However, mining CARs consumes more time because it mines a complete rule set. Therefore, improving the execution time for mining CARs is one of the main problems with this method that needs to be solved. In this paper, we propose a new method for mining class-association rules. Firstly, we design a tree structure for storing the frequent itemsets of datasets. Some theorems for pruning nodes and computing information in the tree are then developed, and, based on the theorems, we propose an efficient algorithm for mining CARs. Experimental results show that our approach is more efficient than those used previously.

2012 IEEE RIVF International Conference on Computing & Communication Technologies, Research, Innovation, and Vision for the Future | 2012

An Efficient Incremental Mining Approach Based on IT-Tree

Thien-Phuong Le; Tzung-Pei Hong; Bay Vo; Bac Le

The itemset-tidset-tree (IT-tree) is an efficient data structure for association-rule mining. Zaki et al. designed a mining algorithm based on the IT-tree structure, which traversed an IT-tree in depth-first order, generated itemsets by using the concept of equivalence classes, and computed the support values of itemsets quickly by tidset intersection. It, however, needed to process all transactions in batch mode. In this paper, we propose the Pre-FUIT algorithm (Fast-Update algorithm based on the IT-tree structure and the concept of PRE-large itemsets), which not only updates the IT-tree when new transactions are inserted, but also mines all frequent itemsets easily. Experimental results show the good performance of the proposed algorithm.
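The batch-style traversal described in this abstract (depth-first search over equivalence classes, with supports obtained by tidset intersection) can be sketched roughly as follows. This is a simplified illustration of the general idea only. It does not implement the incremental Pre-FUIT update, and the function name and data layout are assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def mine_frequent_itemsets(db: List[Set[str]], min_count: int) -> Dict[Tuple[str, ...], int]:
    """Depth-first mining over itemset/tidset pairs using tidset intersection."""
    # Build the vertical layout: item -> set of ids of transactions containing it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, transaction in enumerate(db):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    # The first level of the tree holds the frequent single items, in sorted order.
    root = [((item,), tids) for item, tids in sorted(tidsets.items())
            if len(tids) >= min_count]

    frequent: Dict[Tuple[str, ...], int] = {}

    def extend(eq_class: List[Tuple[Tuple[str, ...], Set[int]]]) -> None:
        # All nodes in eq_class share the same prefix (one equivalence class).
        for i, (itemset, tids) in enumerate(eq_class):
            frequent[itemset] = len(tids)
            children = []
            for other, other_tids in eq_class[i + 1:]:
                common = tids & other_tids  # support by tidset intersection
                if len(common) >= min_count:
                    children.append((itemset + (other[-1],), common))
            extend(children)  # depth-first descent into the child class

    extend(root)
    return frequent


# Hypothetical toy database, not taken from the paper.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
result = mine_frequent_itemsets(db, min_count=2)
```

Because every node carries its tidset, the support of a child is available directly from the intersection, without rescanning the database. The cost is that the whole database must be available up front, which is the batch limitation the abstract refers to.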

International Conference on Computational Collective Intelligence | 2012

MSGPs: a novel algorithm for mining sequential generator patterns

Thi-Thiet Pham; Jiawei Luo; Tzung-Pei Hong; Bay Vo

Sequential generator pattern mining is an important task in data mining. Sequential generator patterns used together with closed sequential patterns can provide additional information that closed sequential patterns alone are not able to provide. In this paper, we propose an algorithm called MSGPs, which is based on the characteristics of sequential generator patterns and sequence extensions and performs a depth-first search on the prefix tree to find all of the sequential generator patterns. This algorithm uses a vertical approach to listing and counting the support, based on the prime block encoding approach of prime factorization theory, to represent candidate sequences and determine the frequency of each candidate. Experimental results show that the proposed algorithm is effective.

International Journal of Intelligent Information and Database Systems | 2013

An effective algorithm for mining closed sequential patterns and their minimal generators based on prefix trees

Thi-Thiet Pham; Jiawei Luo; Bay Vo

Sequential generator patterns and closed sequential patterns play an important role in data mining tasks. They were proposed to address difficult problems in mining sequential patterns and have often been used together to generate non-redundant rules. Given this role, this paper proposes an efficient algorithm called CloGen for mining closed sequential patterns and their minimal sequential generator patterns. The CloGen algorithm uses the parent-child relationship in the prefix tree structure and inserts fields into each node of the prefix tree to determine whether that node is a minimal sequential generator pattern or a closed sequential pattern. Experimental results show that the CloGen algorithm runs more than one order of magnitude faster than other algorithms.

International Conference on Computational Collective Intelligence | 2012

A tree-based approach for mining frequent weighted utility itemsets

Bay Vo; Bac Le; Jason J. Jung

In this paper, we propose a method for mining Frequent Weighted Utility Itemsets (FWUIs) from quantitative databases. Firstly, we introduce the WIT (Weighted Itemset Tidset) tree data structure, used for mining high utility itemsets in the work of Le et al. (2009), and modify it into the MWIT (M stands for Modification) tree for mining FWUIs. Next, we propose an algorithm for mining FWUIs using the MWIT-tree. We test the proposed algorithm on many databases and show that it is very efficient.

International Conference on Computational Collective Intelligence | 2012

Interestingness measures for classification based on association rules

Loan T. T. Nguyen; Bay Vo; Tzung-Pei Hong; Hoang Chi Thanh

This paper proposes a new algorithm for classification based on association rules with interestingness measures. The proposed algorithm uses a tree structure to maintain related information in each node, thus making the process of generating rules fast. In addition, the proposed algorithm can be easily extended to integrate several measures together for ranking rules. Experiments are also conducted to show the efficiency of the proposed approach under different settings. The mining time for different interestingness measures varies only slightly when ten measures are integrated.

Asian Conference on Intelligent Information and Database Systems | 2013

A space-time trade off for FUFP-trees maintenance

Bac Le; Chanh-Truc Tran; Tzung-Pei Hong; Bay Vo

In the past, Hong et al. proposed an algorithm to maintain the fast updated frequent pattern tree (FUFP-tree), which was an efficient data structure for association-rule mining. However, in the maintenance process, the counts of infrequent items and the IDs of transactions with those items were determined by rescanning all the transactions in the original database. This step might be quite time-consuming, depending on the number of transactions in the original database and the number of rescanned items. This study improves that approach by storing infrequent 1-items during the maintenance process and exploiting the properties of FUFP-trees, so that the rescanned items and inserted items are processed more efficiently to reduce execution time. Experimental results show that the improved algorithm needs some more memory to store infrequent 1-items, but its performance is better than that of the original one.

International Conference on Innovations in Bio-Inspired Computing and Applications | 2012

An Enhanced FUFP-Tree Maintenance Approach for Transaction Deletion

Chanh-Truc Tran; Bay Vo; Tzung-Pei Hong; Chun-Wei Lin; Bac Le

The fast updated frequent pattern tree (FUFP-tree) is an efficient data structure for association-rule mining. Hong et al. (2009) proposed an approach for the maintenance of the FUFP-tree structure after the deletion of transactions. However, all transactions in the original database might need to be rescanned to determine the occurrence of infrequent items, which were not stored during the mining and maintenance process. The rescanning steps can be time-consuming, depending on the original database size and the number of rescanned items. The study in this paper enhances Hong et al.'s approach. In the proposed algorithm, the infrequent 1-itemsets are stored during the maintenance process and the rescanned items are pruned step by step to reduce execution time. Experimental results verify the performance of the proposed algorithm.
