Guo-Cheng Lan | Researchain

Publication

Featured research published by Guo-Cheng Lan.

Expert Systems With Applications | 2012

An incremental mining algorithm for high utility itemsets

Chun-Wei Lin; Guo-Cheng Lan; Tzung-Pei Hong

Association-rule mining, which is based on the frequency values of items, is the most common topic in data mining. In real-world applications, however, customers may buy many copies of a product, and each product may have different factors, such as profit and price. Mining only frequent itemsets in binary databases is thus not suitable for some applications. Utility mining was therefore proposed to consider additional measures, such as profits or costs, according to user preference. In the past, a two-phase mining algorithm was designed for quickly discovering high utility itemsets from databases. When data arrive intermittently, that approach needs to reprocess all the transactions in batch mode. In this paper, an incremental mining algorithm for efficiently mining high utility itemsets is proposed to handle this situation. It is based on the concept of the fast-update (FUP) approach, which was originally designed for association mining. The proposed approach first partitions itemsets into four parts according to whether they are high transaction-weighted utilization itemsets in the original database and in the newly inserted transactions. Each part is then handled by its own procedure. Experimental results show that the proposed algorithm executes faster than the two-phase batch mining algorithm in the intermittent data environment.
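The four-way partition described in this abstract can be illustrated with a minimal Python sketch. It assumes the usual definitions from two-phase utility mining (transaction utility is the sum of quantity times unit profit; the transaction-weighted utilization of an itemset is the sum of the transaction utilities of the transactions containing it) and a threshold expressed as a ratio of the total database utility. The function names, the `ratio` parameter, and the sample data are hypothetical, and the per-case procedures of the paper are not reproduced here.

```python
from typing import Dict, FrozenSet, List, Tuple

Transaction = Dict[str, int]  # item -> purchased quantity (illustrative layout)


def transaction_utility(tx: Transaction, profit: Dict[str, float]) -> float:
    """Sum of quantity * unit profit over all items in one transaction."""
    return sum(qty * profit[item] for item, qty in tx.items())


def twu(itemset: FrozenSet[str], db: List[Transaction],
        profit: Dict[str, float]) -> float:
    """Transaction-weighted utilization: total utility of the
    transactions that contain every item of the itemset."""
    return sum(transaction_utility(tx, profit)
               for tx in db if itemset.issubset(tx))


def is_high_twu(itemset: FrozenSet[str], db: List[Transaction],
                profit: Dict[str, float], ratio: float) -> bool:
    """Assumed threshold: a fixed ratio of the database's total utility."""
    total = sum(transaction_utility(tx, profit) for tx in db)
    return twu(itemset, db, profit) >= ratio * total


def fup_case(itemset: FrozenSet[str], original: List[Transaction],
             inserted: List[Transaction], profit: Dict[str, float],
             ratio: float) -> Tuple[bool, bool]:
    """Return (high in original database, high in inserted transactions).
    The four possible tuples correspond to the four parts."""
    return (is_high_twu(itemset, original, profit, ratio),
            is_high_twu(itemset, inserted, profit, ratio))


# Hypothetical data, for illustration only.
profit = {"A": 3, "B": 10, "C": 1}
original = [{"A": 1, "B": 1}, {"A": 2, "C": 6}, {"B": 2, "C": 1}]
inserted = [{"A": 1, "C": 2}, {"C": 3}]

for items in (["A"], ["B"], ["A", "C"], ["A", "B"]):
    print(items, fup_case(frozenset(items), original, inserted, profit, 0.5))
```

With this sample data each of the four itemsets falls into a different part, which is the situation an incremental algorithm has to distinguish before deciding whether the original database must be rescanned.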

Knowledge and Information Systems | 2014

An efficient projection-based indexing approach for mining high utility itemsets

Guo-Cheng Lan; Tzung-Pei Hong; Vincent S. Tseng

Recently, utility mining has been widely discussed in the field of data mining. It finds high utility itemsets by considering both the profits and quantities of items in transactional data sets. However, most existing approaches are based on the principle of levelwise processing, as in the traditional two-phase utility mining algorithm, to find high utility itemsets. In this paper, we propose an efficient utility mining approach that adopts an indexing mechanism to speed up execution and reduce the memory requirement of the mining process. The indexing mechanism imitates traditional projection algorithms to achieve the aim of projecting sub-databases for mining. In addition, a pruning strategy is applied to reduce the number of unpromising itemsets in mining. Finally, experimental results on synthetic data sets and on a real data set show the superior performance of the proposed approach.

Expert Systems With Applications | 2011

Discovery of high utility itemsets from on-shelf time periods of products

Guo-Cheng Lan; Tzung-Pei Hong; Vincent S. Tseng

Utility mining has recently been an emerging topic in the field of data mining. It finds high utility itemsets by considering both the profits and quantities of items in transactions. The results may be biased, however, if items are not always on shelf. In this paper, we thus design a new kind of pattern, named high on-shelf utility itemsets, which considers not only the individual profit and quantity of each item in a transaction but also the common on-shelf time periods of a product combination. We also propose a two-phase mining algorithm to effectively and efficiently discover high on-shelf utility itemsets. In the first phase, the possible candidate on-shelf utility itemsets within each time period are found level by level. In the second phase, the candidate on-shelf utility itemsets are further checked for their actual utility values by an additional database scan. Finally, experimental results on synthetic datasets show that the proposed approach performs well.

Expert Systems With Applications | 2014

Applying the maximum utility measure in high utility sequential pattern mining

Guo-Cheng Lan; Tzung-Pei Hong; Vincent S. Tseng; Shyue-Liang Wang

Recently, high utility sequential pattern mining has become a popular emerging issue because it considers the quantities, profits and time orders of items. In the existing approach, the utilities of subsequences in sequences are difficult to calculate because three kinds of utility calculations are involved. To simplify the utility calculation, this work presents a maximum utility measure, which is derived from the principle of traditional sequential pattern mining that a subsequence is counted only once per sequence. The maximum measure is thus used to simplify the utility calculation for subsequences in mining. Meanwhile, an effective upper-bound model is designed to avoid information loss in mining, and an effective projection-based pruning strategy is designed to obtain more accurate sequence-utility upper bounds for subsequences. An indexing strategy is also developed to quickly find the relevant sequences for prefixes in mining, so that unnecessary search time can be reduced. Finally, experimental results on several datasets show that the proposed approach has good performance in both pruning effectiveness and execution efficiency.
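One way to read the maximum utility measure is that, when a subsequence occurs several times in the same sequence, only the occurrence with the largest utility is counted. The Python sketch below illustrates that reading under simplifying assumptions that are not taken from the paper: each element of a sequence holds a single (item, quantity) pair, and utility is quantity times unit profit. The function name and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def max_utility(pattern: List[str],
                sequence: List[Tuple[str, int]],
                profit: Dict[str, float]) -> float:
    """Largest utility among all occurrences of `pattern` as a
    subsequence of `sequence`; 0 if the pattern does not occur."""
    m = len(pattern)
    # best[j] = best utility found so far for matching pattern[:j].
    best = [0.0] + [float("-inf")] * m
    for item, qty in sequence:
        # Walk j downwards so one sequence element extends each prefix
        # at most once.
        for j in range(m, 0, -1):
            if pattern[j - 1] == item and best[j - 1] != float("-inf"):
                best[j] = max(best[j], best[j - 1] + qty * profit[item])
    return best[m] if best[m] != float("-inf") else 0.0


# Hypothetical data: <a, b> occurs three times with utilities 12, 7 and 11.
profit = {"a": 2, "b": 5}
seq = [("a", 1), ("b", 2), ("a", 3), ("b", 1)]
print(max_utility(["a", "b"], seq, profit))  # 12.0
```

Taking a single maximum per sequence mirrors the counting rule of traditional sequential pattern mining quoted in the abstract, where a sequence contributes at most one to a subsequence's support.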

Applied Intelligence | 2014

Incrementally mining high utility patterns based on pre-large concept

Chun-Wei Lin; Tzung-Pei Hong; Guo-Cheng Lan; Jia-Wei Wong; Wen-Yang Lin

In traditional association rule mining, most algorithms are designed to discover frequent itemsets from a binary database. Utility mining was proposed to measure the utility values of purchased items and thereby reveal high utility itemsets in a quantitative database. In the past, a two-phase high utility mining algorithm was proposed for efficiently discovering high utility itemsets from a quantitative database. In dynamic data mining, transactions may be inserted into, deleted from, or modified in a database. In this case, a batch mining procedure must rescan the whole updated database to keep the mined information up to date. Designing an efficient approach for handling dynamic databases is thus a critical research issue in utility mining. In this paper, an incremental mining algorithm is proposed for efficiently maintaining discovered high utility itemsets based on pre-large concepts. Itemsets are first partitioned into three parts according to whether they have large (high), pre-large, or small transaction-weighted utilization in the original database and in inserted transactions. Individual procedures are then executed for each part. Experimental results show that the proposed incremental high utility mining algorithm outperforms existing algorithms.
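The large / pre-large / small split can be sketched in Python as follows. The sketch assumes the pre-large concept is driven by two thresholds, an upper and a lower ratio of the total utility; the names `upper_ratio` and `lower_ratio`, the function names, and the numbers are hypothetical, and the maintenance procedures attached to each part are not shown.

```python
from typing import Tuple


def classify(twu_value: float, total_utility: float,
             upper_ratio: float, lower_ratio: float) -> str:
    """Label an itemset by its transaction-weighted utilization (TWU)
    relative to two assumed thresholds."""
    if not 0 <= lower_ratio <= upper_ratio <= 1:
        raise ValueError("expected 0 <= lower_ratio <= upper_ratio <= 1")
    if twu_value >= upper_ratio * total_utility:
        return "large"
    if twu_value >= lower_ratio * total_utility:
        return "pre-large"
    return "small"


def pre_large_case(twu_orig: float, total_orig: float,
                   twu_new: float, total_new: float,
                   upper_ratio: float, lower_ratio: float) -> Tuple[str, str]:
    """Pair of labels: (in the original database, in the inserted transactions)."""
    return (classify(twu_orig, total_orig, upper_ratio, lower_ratio),
            classify(twu_new, total_new, upper_ratio, lower_ratio))


# Hypothetical numbers: TWU 30 of 100 in the original data,
# TWU 1 of 20 in the inserted transactions.
print(pre_large_case(30, 100, 1, 20, upper_ratio=0.4, lower_ratio=0.2))
```

Pre-large itemsets act as a buffer: an itemset that is not yet large but close to it is tracked, so a small batch of insertions is less likely to force a rescan of the original database.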

Advanced Engineering Informatics | 2015

Efficient updating of discovered high-utility itemsets for transaction deletion in dynamic databases

Chun-Wei Lin; Tzung-Pei Hong; Guo-Cheng Lan; Jia-Wei Wong; Wen-Yang Lin

Most algorithms related to association rule mining are designed to discover frequent itemsets from a binary database. Other factors such as profit, cost, or quantity are not considered in binary databases. Utility mining was thus proposed to measure the utility values of purchased items for finding high-utility itemsets from a static database. In real-world applications, the transactions in a dynamic database change through insertion or deletion. An existing maintenance approach for handling high-utility itemsets in dynamic databases with transaction deletion must rescan the database when necessary. In this paper, an efficient algorithm, called PRE-HUI-DEL, is proposed for updating high-utility itemsets under transaction deletion based on the pre-large concept. The pre-large concept is used to partition transaction-weighted utilization itemsets into three sets with nine cases according to whether they have large (high), pre-large, or small transaction-weighted utilization in the original database and in the deleted transactions. Specific procedures are then applied to each case for maintaining and updating the discovered high-utility itemsets. Experimental results show that the proposed PRE-HUI-DEL algorithm outperforms a batch two-phase algorithm and a FUP2-based algorithm in maintaining high-utility itemsets.

International Journal of Information Technology and Decision Making | 2012

Efficiently mining high average-utility itemsets with an improved upper-bound strategy

Guo-Cheng Lan; Tzung-Pei Hong; Vincent S. Tseng

Utility mining has recently been discussed in the field of data mining. A utility itemset considers both profits and quantities of items in transactions, and thus its utility value increases with increasing itemset length. To reveal a better utility effect, an average-utility measure, which is the total utility of an itemset divided by its itemset length, is proposed. However, existing approaches use the traditional average-utility upper-bound model to find high average-utility itemsets, and thus generate a large number of unpromising candidates in the mining process. The present study proposes an improved upper-bound approach that uses the prefix concept to create tighter upper bounds of average-utility values for itemsets, thus reducing the number of unpromising itemsets for mining. Results from experiments on two real databases show that the proposed algorithm outperforms other mining algorithms under various parameter settings.
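The average-utility measure defined in this abstract, the total utility of an itemset divided by its length, can be made concrete with a short Python sketch. The transaction layout (item mapped to quantity), the function names, and the sample data are assumptions made for illustration; the upper-bound models discussed in the abstract are not implemented here.

```python
from typing import Dict, FrozenSet, List

Transaction = Dict[str, int]  # item -> purchased quantity (illustrative layout)


def utility(itemset: FrozenSet[str], db: List[Transaction],
            profit: Dict[str, float]) -> float:
    """Total utility of the itemset over the transactions that contain
    all of its items; only the itemset's own items are counted."""
    return sum(sum(tx[item] * profit[item] for item in itemset)
               for tx in db if itemset.issubset(tx))


def average_utility(itemset: FrozenSet[str], db: List[Transaction],
                    profit: Dict[str, float]) -> float:
    """Total utility divided by the number of items in the itemset."""
    if not itemset:
        raise ValueError("itemset must contain at least one item")
    return utility(itemset, db, profit) / len(itemset)


# Hypothetical data, for illustration only.
profit = {"A": 3, "B": 10, "C": 1}
db = [{"A": 1, "B": 1}, {"A": 2, "C": 6}, {"B": 2, "C": 1}]

print(average_utility(frozenset(["B"]), db, profit))       # 30.0
print(average_utility(frozenset(["B", "C"]), db, profit))  # 10.5
```

Dividing by the length keeps longer itemsets from looking attractive merely because they sum the utilities of more items.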

Expert Systems With Applications | 2014

On-shelf utility mining with negative item values

Guo-Cheng Lan; Tzung-Pei Hong; Jen-Peng Huang; Vincent S. Tseng

We introduce a new research topic, on-shelf utility mining with negative item values. We propose the TS-HOUN algorithm for mining the new type of utility itemsets. The derived itemsets are expected to be more reliable in business terms. Synthetic and real datasets are used to show that TS-HOUN is effective. Experiments show that TS-HOUN can find more utility itemsets with negative values.

On-shelf utility mining has recently received interest in the data mining field due to its practical considerations. On-shelf utility mining considers not only the profits and quantities of items in transactions but also their on-shelf time periods in stores. Profit values of items in traditional on-shelf utility mining are assumed to be positive. However, in real-world applications, items may be associated with negative profit values. This paper proposes a three-scan mining approach to efficiently find high on-shelf utility itemsets with negative profit values from temporal databases. In particular, an effective itemset generation method is developed to avoid generating a large number of redundant candidates and to effectively reduce the number of data scans in mining. Experimental results for several synthetic and real datasets show that the proposed approach has good performance in pruning effectiveness and execution efficiency.

Applied Intelligence | 2014

An efficient approach for finding weighted sequential patterns from sequence databases

Guo-Cheng Lan; Tzung-Pei Hong; Hong-Yu Lee

Weighted sequential pattern mining has recently been discussed in the field of data mining. Unlike traditional sequential pattern mining, this kind of mining considers the different significances of items in real applications, such as cost or profit. Most related studies adopt the maximum weighted upper-bound model to find weighted sequential patterns, but they generate a large number of unpromising candidate subsequences. In this study, we thus propose an efficient approach for finding weighted sequential patterns from sequence databases. In particular, a tightening strategy is introduced to obtain more accurate weighted upper bounds for subsequences in mining. The experimental evaluation shows that the proposed approach has good performance in terms of pruning effectiveness and execution efficiency.

Applied Soft Computing | 2016

Mining fuzzy temporal association rules by item lifespans

Chun-Hao Chen; Guo-Cheng Lan; Tzung-Pei Hong; Shih-Bin Lin

We propose a fuzzy temporal association rule mining algorithm (FTARM). Information inside transactions can be found correctly by using the lifespans of items. Three datasets are used to show that FTARM is effective. Experiments show that FTARM can derive more rules than FAR. The derived rules are better than those of FAR in terms of supports and confidences.

Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. In real-world applications, transactions may contain quantitative values, and each item may have a lifespan in a temporal database. In this paper, we thus propose a data mining algorithm for deriving fuzzy temporal association rules. It first transforms each quantitative value into a fuzzy set using the given membership functions. Meanwhile, item lifespans are collected and recorded in a temporal information table through a transformation process. The algorithm then calculates the scalar cardinality of each linguistic term of each item. A mining process based on fuzzy counts and item lifespans is then performed to find fuzzy temporal association rules. Experiments are finally performed on two simulation datasets and the foodmart dataset to show the effectiveness and efficiency of the proposed approach.
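The first steps named in this abstract (fuzzifying quantities, recording item lifespans, and computing scalar cardinalities) can be sketched in Python under several assumptions that are not taken from the paper: triangular membership functions with three linguistic terms, a lifespan that runs from the first transaction containing the item to the end of the database, and a fuzzy support equal to the scalar cardinality divided by the lifespan length. All names, term boundaries, and data below are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed triangular membership functions (a, b, c) per linguistic term.
TERMS: Dict[str, Tuple[float, float, float]] = {
    "Low": (0, 1, 6), "Middle": (1, 6, 11), "High": (6, 11, 16),
}


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership degree of x in a triangle rising on [a, b], falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def lifespan_start(item: str, db: List[Dict[str, int]]) -> int:
    """Index of the first transaction that contains the item (assumed lifespan start)."""
    return next(i for i, tx in enumerate(db) if item in tx)


def scalar_cardinality(item: str, term: str, db: List[Dict[str, int]]) -> float:
    """Sum of the membership degrees of the item's quantities in one term."""
    return sum(triangular(tx[item], *TERMS[term]) for tx in db if item in tx)


def fuzzy_support(item: str, term: str, db: List[Dict[str, int]]) -> float:
    """Scalar cardinality divided by the number of transactions in the lifespan."""
    span = len(db) - lifespan_start(item, db)
    return scalar_cardinality(item, term, db) / span


# Hypothetical time-ordered transactions: item -> quantity.
db = [{"milk": 6}, {"milk": 1, "bread": 11}, {"bread": 6}, {"milk": 6, "bread": 6}]

print(fuzzy_support("milk", "Middle", db))   # 0.5
print(fuzzy_support("bread", "Middle", db))  # 2 of the 3 transactions in bread's lifespan
```

Under these assumptions, an item that enters the database late is measured only against the transactions recorded during its lifespan, so its support is not diluted by earlier transactions in which it could not have appeared.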
