Chun-Hao Chen
Tamkang University
Publications
Featured research published by Chun-Hao Chen.
Soft Computing | 2006
Tzung-Pei Hong; Chun-Hao Chen; Yu-Lung Wu; Yeong-Chyi Lee
Data mining is most commonly used in attempts to induce association rules from transaction data. Transactions in real-world applications, however, usually consist of quantitative values. This paper thus proposes a fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions. We present a GA-based framework for finding membership functions suitable for mining problems and then use the final best set of membership functions to mine fuzzy association rules. The fitness of each chromosome is evaluated by the number of large 1-itemsets generated from part of the previously proposed fuzzy mining algorithm and by the suitability of the membership functions. Experimental results also show the effectiveness of the framework.
IEEE Transactions on Evolutionary Computation | 2008
Tzung-Pei Hong; Chun-Hao Chen; Yeong-Chyi Lee; Yu-Lung Wu
Data mining is most commonly used in attempts to induce association rules from transaction data. Most previous studies focused on binary-valued transaction data. Transaction data in real-world applications, however, usually consist of quantitative values. This paper, thus, proposes a fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions. A genetic algorithm (GA)-based framework for finding membership functions suitable for mining problems is proposed. The fitness of each set of membership functions is evaluated by the fuzzy supports of the linguistic terms in the large 1-itemsets and by the suitability of the derived membership functions. Evaluation by the fuzzy supports of large 1-itemsets is much faster than evaluation that considers all itemsets or interesting association rules. It also allows the derivation of the membership functions for different items to be handled by divide-and-conquer. The proposed GA framework, thus, maintains multiple populations, one for each item's membership functions. The final best sets of membership functions in all the populations are then gathered together to be used for mining fuzzy association rules. Experiments are conducted to analyze different fitness functions and different settings of supports and confidences. Experiments are also conducted to compare the proposed algorithm, the one with uniform fuzzy partition, and the existing one without divide-and-conquer, with results validating the performance of the proposed algorithm.
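The fuzzy-support part of the fitness evaluation described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes triangular membership functions given as (left, peak, right) triples, and the names `triangular`, `fuzzy_supports` and `large_one_itemsets` are hypothetical. The suitability term of the fitness function is not specified in the abstract and is left out.

```python
from typing import Dict, List, Tuple

# Assumption: each linguistic term is a triangle (left, peak, right).
Triangle = Tuple[float, float, float]


def triangular(x: float, tri: Triangle) -> float:
    """Membership degree of quantity x in one triangular fuzzy set."""
    left, peak, right = tri
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_supports(quantities: List[float],
                   terms: Dict[str, Triangle]) -> Dict[str, float]:
    """Fuzzy support of each linguistic term of one item:
    the average membership degree over all transactions."""
    n = len(quantities)
    return {name: sum(triangular(q, tri) for q in quantities) / n
            for name, tri in terms.items()}


def large_one_itemsets(supports: Dict[str, float],
                       min_support: float) -> Dict[str, float]:
    """Keep the terms whose fuzzy support reaches the minimum support."""
    return {t: s for t, s in supports.items() if s >= min_support}


# Hypothetical purchase quantities of a single item in four transactions.
quantities = [0, 5, 10, 5]
terms = {"low": (0, 0, 5), "middle": (0, 5, 10), "high": (5, 10, 10)}
supports = fuzzy_supports(quantities, terms)
large = large_one_itemsets(supports, min_support=0.3)
```

Because each item is scored only through its own linguistic terms, one population per item can be evolved independently, which is what makes the divide-and-conquer arrangement possible.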
IEEE Transactions on Fuzzy Systems | 2008
Chun-Hao Chen; Vincent S. Tseng; Tzung-Pei Hong
Data mining is commonly used in attempts to induce association rules from transaction data. Most previous studies focused on binary-valued transaction data. Transactions in real-world applications, however, usually consist of quantitative values. In the past, we proposed a fuzzy-genetic data-mining algorithm for extracting both association rules and membership functions from quantitative transactions. It used a combination of large 1-itemsets and membership-function suitability to evaluate the fitness values of chromosomes. The calculation for large 1-itemsets could take a lot of time, especially when the database to be scanned could not be fed entirely into main memory. In this paper, an enhanced approach, called the cluster-based fuzzy-genetic mining algorithm, is thus proposed to speed up the evaluation process while keeping nearly the same solution quality as the previous one. It divides the chromosomes in a population into clusters by the k-means clustering approach and evaluates each individual according to both its cluster's information and its own. Experimental results also show the effectiveness and efficiency of the proposed approach.
Applied Soft Computing | 2012
Chun-Hao Chen; Tzung-Pei Hong; Vincent S. Tseng
Time series analysis has always been an important and interesting research field due to its frequent appearance in different applications. In the past, many approaches based on regression, neural networks and other mathematical models were proposed to analyze time series. In this paper, we attempt to use data mining techniques to analyze time series. Many previous studies on data mining have focused on handling binary-valued data. Time series data, however, are usually quantitative values. We thus extend our previous fuzzy mining approach to handle time-series data and find linguistic association rules. The proposed approach first uses a sliding window to generate continuous subsequences from a given time series and then analyzes the fuzzy itemsets from these subsequences. Appropriate post-processing is then performed to remove redundant patterns. Experiments are also made to show the performance of the proposed mining algorithm. Since the final results are represented by linguistic rules, they will be friendlier to humans than quantitative representations.
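The sliding-window step can be illustrated with a minimal sketch. The function name `sliding_windows` and the window size are assumptions made for illustration; the abstract does not give the window size, and the fuzzification and post-processing steps are not shown.

```python
from typing import List, Sequence


def sliding_windows(series: Sequence[float],
                    window_size: int) -> List[List[float]]:
    """Return every contiguous subsequence of length window_size,
    moving the window forward one data point at a time."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [list(series[i:i + window_size])
            for i in range(len(series) - window_size + 1)]


# Hypothetical series of six data points, window of three points.
subsequences = sliding_windows([3, 5, 4, 8, 9, 7], window_size=3)
```

Each subsequence then plays the role of one transaction: position 1, position 2 and so on act as items, and their values are converted into fuzzy sets before mining.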
Pattern Recognition Letters | 2009
Vincent S. Tseng; Chun-Hao Chen; Pai-Chieh Huang; Tzung-Pei Hong
A time series is composed of lots of data points, each of which represents a value at a certain time. Many phenomena can be represented by time series, such as electrocardiograms in medical science, gene expressions in biology and video data in multimedia. Time series have thus been an important and interesting research field due to their frequent appearance in different applications. This paper proposes a time series segmentation approach by combining the clustering technique, the discrete wavelet transformation and the genetic algorithm to automatically find segments and patterns from a time series. The genetic algorithm is used to find the segmentation points for deriving appropriate patterns. In fitness evaluation, the proposed approach first divides the segments in a chromosome into k groups according to their slopes by using clustering techniques. The Euclidean distance is then used to calculate the distance of each subsequence and evaluate a chromosome. The discrete wavelet transformation is also used to adjust the length of the subsequences for calculating the similarity since their length may be different. The evaluation results are utilized to choose appropriate chromosomes for mating. The offspring then undergo recursive evolution until a good result has been obtained. Experimental results show that the proposed approach can get good results in finding appropriate segmentation patterns in time series.
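The idea of adjusting subsequence lengths before measuring Euclidean distance can be sketched as below. This is a simplified stand-in rather than the paper's procedure: it assumes the longer subsequence can be reduced to the shorter one's length by repeated halving, and it uses pairwise means, which equal the Haar wavelet approximation coefficients divided by the square root of 2. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def haar_approximation(values: Sequence[float]) -> List[float]:
    """One Haar level, keeping only the approximation part
    (rescaled to pairwise means); halves the length."""
    if len(values) % 2:
        raise ValueError("length must be even")
    return [(values[i] + values[i + 1]) / 2
            for i in range(0, len(values), 2)]


def shrink_to(values: Sequence[float], target_len: int) -> List[float]:
    """Halve the sequence repeatedly until it has target_len points."""
    out = list(values)
    while len(out) > target_len:
        out = haar_approximation(out)
    if len(out) != target_len:
        raise ValueError("lengths must differ by a power-of-two factor")
    return out


def subsequence_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance after bringing both subsequences
    to the length of the shorter one."""
    n = min(len(a), len(b))
    a2, b2 = shrink_to(a, n), shrink_to(b, n)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a2, b2)))


d_same_shape = subsequence_distance([1, 1, 3, 3], [1, 3])
d_simple = subsequence_distance([0, 0], [3, 4])
```

In a fitness function of the kind described above, such distances would be aggregated per cluster of segments, so that chromosomes whose segments fall into tight groups score better.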
Soft Computing | 2008
Chun-Hao Chen; Tzung-Pei Hong; Vincent S. Tseng; Chang-Shing Lee
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Mining association rules from transaction data is most commonly seen among the mining techniques. Most of the previous mining approaches set a single minimum support threshold for all the items and identify the relationships among transactions using binary values. In the past, we proposed a genetic-fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions under a single minimum support. In real applications, different items may have different criteria to judge their importance. In this paper, we thus propose an algorithm which combines clustering, fuzzy and genetic concepts for extracting reasonable multiple minimum support values, membership functions and fuzzy association rules from quantitative transactions. It first uses the k-means clustering approach to gather similar items into groups. All items in the same cluster are considered to have similar characteristics and are assigned similar values for initializing a better population. Each chromosome is then evaluated by the criteria of requirement satisfaction and suitability of membership functions to estimate its fitness value. Experimental results also show the effectiveness and the efficiency of the proposed approach.
Expert Systems With Applications | 2009
Chun-Hao Chen; Tzung-Pei Hong; Vincent S. Tseng
Fuzzy mining approaches have recently been discussed for deriving fuzzy knowledge. Since items may have their own characteristics, different minimum supports and membership functions may be specified for different items. In the past, we proposed a genetic-fuzzy data-mining algorithm for extracting minimum supports and membership functions for items from quantitative transactions. In that paper, the minimum supports and membership functions of all items are encoded in a single chromosome, so convergence may not be easy. In this paper, an enhanced approach is proposed, which processes the items with a divide-and-conquer strategy. The approach is called the divide-and-conquer genetic-fuzzy mining algorithm for items with Multiple Minimum Supports (DGFMMS), and is designed for finding minimum supports, membership functions, and fuzzy association rules. Possible solutions are evaluated by their requirement satisfaction divided by the suitability of their derived membership functions. The proposed GA framework maintains multiple populations, one for each item's minimum support and membership functions. The final best minimum supports and membership functions in all the populations are then gathered together to be used for mining fuzzy association rules. Experimental results also show the effectiveness of the proposed approach.
International Symposium on Computers and Communications | 2004
Tzung-Pei Hong; Chun-Hao Chen; Yu-Lung Wu; Yeong-Chyi Lee
Data mining is most commonly used in attempts to induce association rules from transaction data. Transactions in real-world applications, however, usually consist of quantitative values. This work thus proposes a fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions. A GA-based framework for finding membership functions suitable for mining problems is proposed. The fitness of each set of membership functions is evaluated using the fuzzy supports of the linguistic terms in the large 1-itemsets and the suitability of the derived membership functions. The proposed framework thus maintains multiple populations of membership functions, with one population for each item's membership functions. The final best set of membership functions gathered from all the populations is used to effectively mine fuzzy association rules.
IEEE International Conference on Fuzzy Systems | 2007
Chun-Hao Chen; Tzung-Pei Hong; Vincent S. Tseng; Chang-Shing Lee
In the past, we proposed a genetic-fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions under a single minimum support. In real applications, different items may have different criteria to judge their importance. In this paper, we thus propose an algorithm which combines clustering, fuzzy and genetic concepts for extracting reasonable multiple minimum support values, membership functions and fuzzy association rules from quantitative transactions. It first uses the k-means clustering approach to gather similar items into groups. All items in the same cluster are considered to have similar characteristics and are assigned similar values for initializing a better population. Each chromosome is then evaluated by the criteria of requirement satisfaction and suitability of membership functions to estimate its fitness value. Experimental results also show the effectiveness and the efficiency of the proposed approach.
International Conference on Data Mining | 2006
Vincent S. Tseng; Chun-Hao Chen; Chien-Hsiang Chen; Tzung-Pei Hong
This paper proposes a time series segmentation approach by combining the clustering technique, the discrete wavelet transformation and the genetic algorithm to automatically find segments and patterns from a time series. The genetic algorithm is used to find the segmentation points for deriving appropriate patterns. In the fitness evaluation, the proposed algorithm first divides subsequences in a chromosome into k clusters by using the k-means clustering approach. The Euclidean distance is then used to calculate the distance of each subsequence and evaluate a chromosome. The discrete wavelet transformation is also used to adjust the length of the subsequences for comparing their similarity since their lengths may differ. Experimental results show that the proposed approach achieves good results in finding appropriate segmentation patterns in time series.