
Wen-Yang Lin

National University of Kaohsiung


Publication

Featured research published by Wen-Yang Lin.

International Journal of Approximate Reasoning | 2005

Mining association rules with multiple minimum supports using maximum constraints

Yeong-Chyi Lee; Tzung-Pei Hong; Wen-Yang Lin

Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Most previous approaches set a single minimum support threshold for all items or itemsets. In real applications, however, different items may have different criteria for judging their importance, so the support requirements should vary from item to item. In this paper, we provide another point of view on defining the minimum supports of itemsets when items have different minimum supports. The maximum constraint is used, which is well explained and may be suitable for some mining domains. We then propose a simple algorithm based on the Apriori approach to find the large itemsets and association rules under this constraint. The proposed algorithm is easy and efficient when compared to Wang et al.’s under the maximum constraint. The numbers of association rules and large itemsets obtained by the proposed mining algorithm using the maximum constraint are also smaller than those obtained using the minimum constraint. Whether to adopt the proposed approach thus depends on the requirements of the mining problem. In addition, the granular computing technique of bit strings is used to speed up the proposed data mining algorithm.
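The maximum constraint described in this abstract can be illustrated with a minimal sketch: an itemset's minimum support is taken to be the largest of its items' minimum supports. The function names, toy transactions, and thresholds below are hypothetical, and the sketch checks single itemsets only; it does not reproduce the paper's Apriori-based algorithm or its bit-string speedup.

```python
def itemset_min_support(itemset, item_min_support):
    """Maximum constraint: the itemset's threshold is the largest
    minimum support among the items it contains."""
    return max(item_min_support[item] for item in itemset)


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(items.issubset(t) for t in transactions) / len(transactions)


def is_large(itemset, transactions, item_min_support):
    """An itemset is large if its support reaches its own threshold."""
    threshold = itemset_min_support(itemset, item_min_support)
    return support(itemset, transactions) >= threshold


# Hypothetical toy data: four transactions, per-item minimum supports.
transactions = [
    {"bread", "milk"},
    {"bread", "caviar"},
    {"bread", "milk", "caviar"},
    {"milk"},
]
item_min_support = {"bread": 0.5, "milk": 0.5, "caviar": 0.25}

for candidate in ({"bread", "milk"}, {"milk", "caviar"}):
    print(sorted(candidate), is_large(candidate, transactions, item_min_support))
```

With these toy values, {milk, caviar} has support 0.25 but a threshold of 0.5 under the maximum constraint, so it is not large. Under a minimum constraint the threshold would be 0.25 and the itemset would qualify, which is consistent with the abstract's remark that the maximum constraint yields fewer large itemsets.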

Applied Intelligence | 2001

Evolution of Appropriate Crossover and Mutation Operators in a Genetic Process

Tzung-Pei Hong; Hong-Shung Wang; Wen-Yang Lin; Wen-Yuan Lee

Traditional genetic algorithms use only one crossover and one mutation operator to generate the next generation. The chosen crossover and mutation operators are critical to the success of genetic algorithms. Different crossover or mutation operators, however, are suitable for different problems, and even for different stages of the genetic process in a single problem. Determining which crossover and mutation operators should be used is quite difficult and is usually done by trial and error. In this paper, a new genetic algorithm, the dynamic genetic algorithm (DGA), is proposed to solve this problem. The dynamic genetic algorithm simultaneously uses several crossover and mutation operators to generate the next generation. The crossover and mutation ratios change according to the evaluation results of the respective offspring in the next generation. In this way, we expect the really good operators to have an increasing effect in the genetic process. Experiments were also conducted, with results showing that the proposed algorithm performs better than algorithms with a single crossover and a single mutation operator.
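The idea of letting operator ratios follow offspring quality can be sketched as below. The abstract does not give the DGA update rule, so the proportional blend, the `learning_rate` parameter, the operator names, and the assumption of non-negative fitness values are all hypothetical choices made for illustration only.

```python
def update_ratios(ratios, offspring_fitness, learning_rate=0.5):
    """Shift operator ratios toward operators whose offspring scored well.

    ratios: dict mapping operator name -> current share (shares sum to 1).
    offspring_fitness: dict mapping operator name -> list of non-negative
        fitness values of the offspring that operator produced.
    This blending rule is an assumption, not the rule from the paper.
    """
    # Average offspring fitness per operator (0 if it produced no offspring).
    avg = {
        op: (sum(vals) / len(vals) if vals else 0.0)
        for op, vals in offspring_fitness.items()
    }
    total = sum(avg.get(op, 0.0) for op in ratios)
    if total == 0:
        return dict(ratios)  # no information, keep the ratios unchanged

    # Blend the old ratio with the operator's share of total average fitness.
    blended = {
        op: (1 - learning_rate) * ratios[op]
        + learning_rate * avg.get(op, 0.0) / total
        for op in ratios
    }
    norm = sum(blended.values())
    return {op: value / norm for op, value in blended.items()}


# Hypothetical example with two crossover operators.
ratios = {"one_point": 0.5, "uniform": 0.5}
fitness = {"one_point": [1.0, 1.0], "uniform": [3.0, 3.0]}
new_ratios = update_ratios(ratios, fitness)
print(new_ratios)
```

In this toy run the `uniform` operator produced fitter offspring, so its share grows while the shares still sum to 1.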

Knowledge and Information Systems | 2004

A genetic selection algorithm for OLAP data cubes

Wen-Yang Lin; I-Chung Kuo

Multidimensional data analysis, as supported by OLAP (online analytical processing) systems, requires the computation of many aggregate functions over a large volume of historically collected data. To decrease the query time and to provide various viewpoints for the analysts, these data are usually organized in a multidimensional data model, called data cubes. Each cell in a data cube corresponds to a unique set of values for the different dimensions and contains the metric of interest. The data cube selection problem is, given a set of user queries and a storage space constraint, to select a set of materialized cubes from the data cubes so as to minimize the query cost and/or the maintenance cost. This problem is known to be NP-hard. In this study, we examined the application of genetic algorithms to the cube selection problem. We proposed a greedy-repaired genetic algorithm, called the genetic greedy method. According to our experiments, the solution obtained by our genetic greedy method is superior to that found using the traditional greedy method. That is, within the same storage constraint, the solution can greatly reduce the query cost as well as the cube maintenance cost.

Data Warehousing and Knowledge Discovery | 2001

Mining Generalized Association Rules with Multiple Minimum Supports

Ming-Cheng Tseng; Wen-Yang Lin

Mining generalized association rules in the presence of a taxonomy has been recognized as an important model in data mining. Earlier work on generalized association rules required the minimum support to be specified uniformly for all items or for items within the same taxonomy level. In this paper, we extended the scope of mining generalized association rules in the presence of a taxonomy to allow any form of user-specified multiple minimum supports. We discussed the problems of using classic Apriori itemset generation and presented two algorithms, MMS_Cumulate and MMS_Stratify, for discovering the generalized frequent itemsets. Empirical evaluation showed that these two algorithms are very effective and have good linear scale-up characteristics.

Applied Intelligence | 2014

Incrementally mining high utility patterns based on pre-large concept

Chun-Wei Lin; Tzung-Pei Hong; Guo-Cheng Lan; Jia-Wei Wong; Wen-Yang Lin

In traditional association rule mining, most algorithms are designed to discover frequent itemsets from a binary database. Utility mining was proposed to measure the utility values of purchased items and reveal high utility itemsets in a quantitative database. A two-phase high utility mining algorithm was later proposed for efficiently discovering high utility itemsets from a quantitative database. In dynamic data mining, transactions may be inserted into, deleted from, or modified in a database. In this case, a batch mining procedure must rescan the whole updated database to keep the information up to date. Designing an efficient approach for handling dynamic databases is thus a critical research issue in utility mining. In this paper, an incremental mining algorithm is proposed for efficiently maintaining discovered high utility itemsets based on the pre-large concept. Itemsets are first partitioned into three parts according to whether they have large (high), pre-large, or small transaction-weighted utilization in the original database and in the inserted transactions. Individual procedures are then executed for each part. Experimental results show that the proposed incremental high utility mining algorithm outperforms existing algorithms.
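The three-way partition mentioned in this abstract can be sketched as follows, using the standard definition of transaction-weighted utilization (the sum of the utilities of all transactions that contain the itemset). The threshold names `upper` and `lower`, the toy database, and the profit table are hypothetical, and the sketch classifies itemsets within a single database only; the paper's procedure additionally compares the status in the original database with that in the inserted transactions.

```python
def transaction_utility(transaction, unit_profit):
    """Utility of one transaction given as a dict of item -> quantity."""
    return sum(qty * unit_profit[item] for item, qty in transaction.items())


def twu(itemset, database, unit_profit):
    """Transaction-weighted utilization: total utility of the
    transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(
        transaction_utility(t, unit_profit)
        for t in database
        if items.issubset(t)
    )


def classify(itemset, database, unit_profit, upper, lower):
    """Label an itemset as large, pre-large, or small by its TWU ratio.
    `upper` and `lower` are assumed threshold ratios with lower < upper."""
    total = sum(transaction_utility(t, unit_profit) for t in database)
    ratio = twu(itemset, database, unit_profit) / total
    if ratio >= upper:
        return "large"
    if ratio >= lower:
        return "pre-large"
    return "small"


# Hypothetical quantitative database and unit profits.
unit_profit = {"a": 2, "b": 5, "c": 1}
database = [
    {"a": 1, "b": 1},
    {"a": 2, "c": 3},
    {"b": 2},
    {"c": 4},
]

for candidate in ({"a"}, {"c"}, {"a", "b"}):
    print(sorted(candidate), classify(candidate, database, unit_profit, 0.5, 0.3))
```

Here the total utility is 28, so {a} (TWU 14) is large, {c} (TWU 11) is pre-large, and {a, b} (TWU 7) is small under the assumed thresholds of 0.5 and 0.3.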

Intelligent Systems Design and Applications | 2008

An Incremental FUSP-Tree Maintenance Algorithm

Chun Wei Lin; Tzung-Pei Hong; Wen Hsiang Lu; Wen-Yang Lin

In this paper, we attempt to handle the maintenance of sequential patterns. New transactions may come from both new and old customers. A fast updated sequential pattern tree (called FUSP-tree) structure is proposed to make the tree update process easy. An incremental FUSP-tree maintenance algorithm is also proposed for reducing the execution time in reconstructing the tree. The proposed approach is expected to achieve a good trade-off between execution time and tree complexity.

Advanced Engineering Informatics | 2015

Efficient updating of discovered high-utility itemsets for transaction deletion in dynamic databases

Chun-Wei Lin; Tzung-Pei Hong; Guo-Cheng Lan; Jia-Wei Wong; Wen-Yang Lin

Most algorithms related to association rule mining are designed to discover frequent itemsets from a binary database. Other factors such as profit, cost, or quantity are not considered in binary databases. Utility mining was thus proposed to measure the utility values of purchased items for finding high-utility itemsets from a static database. In real-world applications, transactions in a dynamic database change through insertion or deletion. An existing maintenance approach for handling high-utility itemsets in dynamic databases with transaction deletion must rescan the database when necessary. In this paper, an efficient algorithm, called PRE-HUI-DEL, for updating high-utility itemsets based on the pre-large concept for transaction deletion is proposed. The pre-large concept is used to partition transaction-weighted utilization itemsets into three sets with nine cases according to whether they have large (high), pre-large, or small transaction-weighted utilization in the original database and in the deleted transactions. Specific procedures are then applied to each case for maintaining and updating the discovered high-utility itemsets. Experimental results show that the proposed PRE-HUI-DEL algorithm outperforms a batch two-phase algorithm and a FUP2-based algorithm in maintaining high-utility itemsets.

Journal of Information Science | 2006

Automated support specification for efficient mining of interesting association rules

Wen-Yang Lin; Ming-Cheng Tseng

In recent years, the weakness of the canonical support-confidence framework for association mining has been widely studied. One of the difficulties in applying association rule mining is the setting of support constraints. A high-support constraint avoids the combinatorial explosion in discovering frequent itemsets, but at the expense of missing interesting patterns of low support. Instead of seeking a way to set the appropriate support constraints, all current approaches leave the users in charge of the support setting, which, however, puts the users in a dilemma. This paper is an effort to answer this long-standing open question. Based on the notions of confidence and lift, we propose an automatic support specification for efficiently mining high-confidence and positive-lift associations without consulting the users. Experimental results show that the proposed method is not only good at discovering high-confidence and positive-lift associations, but also effective in reducing spurious frequent itemsets.
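The confidence and lift measures that this abstract builds on can be computed as in the sketch below, using their standard definitions. The toy transactions and function names are hypothetical, "positive lift" is read here as lift above 1 (an assumption), and the paper's automatic support specification itself is not reproduced.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(items.issubset(t) for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """lift(A -> B) = confidence(A -> B) / support(B)."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


# Hypothetical toy transactions.
transactions = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}]

rule_confidence = confidence({"a"}, {"b"}, transactions)
rule_lift = lift({"a"}, {"b"}, transactions)
print(round(rule_confidence, 3), round(rule_lift, 3))
```

For the rule a -> b in this toy data, the confidence is 2/3 and the lift is 4/3, so the rule would count as positively correlated under the reading assumed above.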

Hawaii International Conference on System Sciences | 2004

CBW: an efficient algorithm for frequent itemset mining

Ja-Hwung Su; Wen-Yang Lin

Frequent itemset generation is the prerequisite and most time-consuming process for association rule mining. Nowadays, most efficient Apriori-like algorithms rely heavily on the minimum support constraint to prune a vast number of non-candidate itemsets. This pruning technique, however, becomes less useful for some real applications where the supports of interesting itemsets are extremely small, such as medical diagnosis and fraud detection, among others. In this paper, we propose a new algorithm that maintains its performance even at relatively low supports. Empirical evaluations show that our algorithm is, on average, more than an order of magnitude faster than Apriori-like algorithms.

Expert Systems With Applications | 2009

A practical extension of web usage mining with intentional browsing data toward usage

Yu-Hui Tao; Tzung-Pei Hong; Wen-Yang Lin; Wen-Yuan Chiu

Intentional browsing data is a new data component for improving Web usage mining that uses Web log files as the primary data source. Previously, the Web transaction mining algorithm was used in e-commerce applications to demonstrate how it could be enhanced by intentional browsing data on pages with item purchase and complemented by intentional browsing data on pages without item purchase. Although these two intention-based algorithms satisfactorily illustrated the benefits of intentional browsing data on the original Web transaction mining algorithm, three potential issues remain: Why is there a need to separate the source data into purchased-item and not-purchased-item segments to be processed by two intention-based algorithms? Moreover, can the algorithms handle more than one browsing data type? Finally, can the numeric intention-based data counts be made more user friendly for decision-making practices? To address these three issues, we propose a unified intention-based Web transaction mining algorithm that can efficiently process the whole data set simultaneously with multiple intentional browsing data types, as well as transform the intentional browsing data counts into easily understood linguistic items using the fuzzy set concept. Comparisons and implications for e-commerce are also discussed. Rather than emphasizing technical innovation, this extension study offers a revised intention-based Web usage mining algorithm that should make its applications much easier and more useful in practice.
