Shyue-Liang Wang | Researchain

Publication

Featured research published by Shyue-Liang Wang.

Applied Intelligence | 2013

Using TF-IDF to hide sensitive itemsets

Tzung-Pei Hong; Chun-Wei Lin; Kuo-Tung Yang; Shyue-Liang Wang

Data mining technology helps extract usable knowledge from large data sets. The process of data collection and data dissemination may, however, result in an inherent risk of privacy threats. Some sensitive or private information about individuals, businesses and organizations needs to be suppressed before it is shared or published. Privacy-preserving data mining (PPDM) has thus become an important issue in recent years. In this paper, we propose an algorithm called SIF-IDF for modifying original databases in order to hide sensitive itemsets. It is a greedy approach based on a concept borrowed from Term Frequency and Inverse Document Frequency (TF-IDF) in text mining. This concept is used to evaluate the degree of similarity between the items in transactions and the desired sensitive itemsets, and then to select appropriate items in some transactions to hide. The proposed algorithm can easily make good trade-offs between privacy preservation and execution time. Experimental results also show the performance of the proposed approach.

Expert Systems With Applications | 2014

Applying the maximum utility measure in high utility sequential pattern mining

Guo-Cheng Lan; Tzung-Pei Hong; Vincent S. Tseng; Shyue-Liang Wang

Recently, high utility sequential pattern mining has become an emerging and popular issue because it considers the quantities, profits and time orders of items. In the existing approach, the utilities of subsequences in sequences are difficult to calculate because three kinds of utility calculations are involved. To simplify the utility calculation, this work presents a maximum utility measure, which is derived from the principle of traditional sequential pattern mining that a subsequence is counted only once per sequence. Hence, the maximum measure is used to simplify the utility calculation for subsequences in mining. Meanwhile, an effective upper-bound model is designed to avoid information loss in mining, and an effective projection-based pruning strategy is designed to obtain more accurate sequence-utility upper bounds of subsequences. An indexing strategy is also developed to quickly find the relevant sequences for prefixes in mining, so that unnecessary search time can be reduced. Finally, the experimental results on several datasets show that the proposed approach performs well in both pruning effectiveness and execution efficiency.
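The maximum utility measure described in the abstract above can be illustrated with a small brute-force sketch. It is not the paper's mining algorithm: the data layout (a sequence as a list of dicts mapping each item to its utility) and the function name `max_utility` are assumptions made only for this illustration. The utility of a subsequence in a sequence is taken as the largest utility over all of its occurrences.

```python
from typing import Dict, List, Optional, Set

Element = Dict[str, float]  # assumed layout: item -> utility of that item in one itemset


def max_utility(pattern: List[Set[str]], sequence: List[Element]) -> Optional[float]:
    """Return the maximum utility over all occurrences of `pattern` in `sequence`.

    Each itemset of the pattern must be contained in a distinct element of the
    sequence, in order. Returns None when the pattern does not occur.
    """
    best: Optional[float] = None

    def search(p_idx: int, s_idx: int, acc: float) -> None:
        nonlocal best
        if p_idx == len(pattern):
            # A complete occurrence was found; keep the largest utility seen.
            best = acc if best is None else max(best, acc)
            return
        for pos in range(s_idx, len(sequence)):
            elem = sequence[pos]
            if all(item in elem for item in pattern[p_idx]):
                gained = sum(elem[item] for item in pattern[p_idx])
                search(p_idx + 1, pos + 1, acc + gained)

    search(0, 0, 0.0)
    return best


# Hypothetical sequence with four elements.
sequence = [{"a": 5, "b": 2}, {"a": 1}, {"b": 6, "c": 3}, {"b": 1}]
print(max_utility([{"a"}, {"b"}], sequence))
```

In this example the pattern <a, b> occurs four times with utilities 11, 6, 7 and 2, so the measure reports the single value 11 instead of tracking every occurrence.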

Expert Systems With Applications | 2009

Fuzzy rough sets with hierarchical quantitative attributes

Tzung-Pei Hong; Yan-Liang Liou; Shyue-Liang Wang

Machine learning can extract desired knowledge and ease the development bottleneck in building expert systems. Among the proposed approaches, deriving classification rules from training examples is the most common. Given a set of examples, a learning program tries to induce rules that describe each class. Rough-set theory has served as a good mathematical tool for dealing with data classification problems. It adopts the concept of equivalence classes to partition training instances according to some criteria. In the past, we proposed a fuzzy-rough approach to produce a set of certain and possible rules from quantitative data. Attributes are, however, usually organized into hierarchies in real applications. This paper thus extends our previous approach to deal with the problem of producing a set of cross-level maximally general fuzzy certain and possible rules from examples with hierarchical and quantitative attributes. The proposed approach combines rough-set theory and fuzzy-set theory to learn. It is more complex than learning from single-level values, but may derive more general knowledge from data. Fuzzy boundary approximations, instead of upper approximations, are used to find possible rules, thus reducing some subsumption checking. Some pruning heuristics are adopted in the proposed algorithm to avoid unnecessary search. A simple example is also given to illustrate the proposed approach.

Expert Systems With Applications | 2009

An effective mining approach for up-to-date patterns

Tzung-Pei Hong; Yi-Ying Wu; Shyue-Liang Wang

Association-rule mining is among the most commonly used techniques for knowledge discovery from databases (KDD). It is used to discover relationships among items or itemsets. Temporal data mining, in turn, is concerned with the analysis of temporal data and the discovery of temporal patterns and regularities. In this paper, a new concept of up-to-date patterns is proposed, which is a hybrid of association rules and temporal mining. An itemset may not be frequent (large) for an entire database but may be large up-to-date, since items that seldom occurred early may occur often recently. An up-to-date pattern is thus composed of an itemset and its up-to-date lifetime, in which the user-defined minimum-support threshold must be satisfied. The proposed approach can mine more useful large itemsets than conventional approaches, which discover large itemsets valid only for the entire database. Experimental results show that the proposed algorithm is more effective than the traditional ones in discovering such up-to-date temporal patterns, especially when the minimum-support threshold is high.
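The idea of an up-to-date lifetime in the abstract above can be sketched as a simple backward scan. This is only an illustration, not the paper's algorithm: it assumes transactions are stored oldest first, that the lifetime always ends at the most recent transaction, and that the hypothetical helper `up_to_date_lifetime` should return the earliest start index whose suffix still meets the minimum support.

```python
from typing import Iterable, List, Optional, Set


def up_to_date_lifetime(
    itemset: Iterable[str],
    transactions: List[Set[str]],
    min_support: float,
) -> Optional[int]:
    """Return the earliest index `start` such that the itemset's support within
    transactions[start:] is at least `min_support`, or None if no suffix qualifies."""
    target = set(itemset)
    n = len(transactions)
    count = 0
    best_start: Optional[int] = None
    # Walk backwards from the newest transaction, growing the candidate lifetime.
    for start in range(n - 1, -1, -1):
        if target <= transactions[start]:
            count += 1
        if count / (n - start) >= min_support:
            best_start = start
    return best_start


# Hypothetical database, oldest transaction first.
transactions = [{"a"}, {"b"}, {"b"}, {"a", "c"}, {"a"}, {"a", "b"}]
print(up_to_date_lifetime({"a"}, transactions, 0.7))
```

Here item a has support 4/6 over the whole database, which is below the 0.7 threshold, yet it reaches 3/4 over the last four transactions, so it is reported as large up-to-date with a lifetime starting at index 2.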

Systems, Man and Cybernetics | 2009

Mining high average-utility itemsets

Tzung-Pei Hong; Cho-Han Lee; Shyue-Liang Wang

The average utility measure is adopted in this paper to reveal a better utility effect of combining several items than the original utility measure. A mining algorithm is then proposed to efficiently find the high average-utility itemsets. It uses the summation of the maximal utility among the items in each transaction that includes the target itemset as an upper bound to overestimate the actual average utility of the itemset, and processes it in two phases. As expected, the high average-utility itemsets mined in the proposed way will be fewer than the high utility itemsets under the same threshold. Experimental results also show the performance of the proposed algorithm.
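A minimal sketch of the two quantities named in the abstract above, under assumptions that are not taken from the paper: each transaction is a dict mapping an item to its utility in that transaction, and the function names `average_utility` and `upper_bound` are hypothetical. The average utility divides an itemset's utility by its number of items, and the upper bound sums the largest single-item utility of every transaction that contains the itemset.

```python
from typing import Dict, Iterable, List

Transaction = Dict[str, float]  # assumed layout: item -> utility in this transaction


def average_utility(itemset: Iterable[str], db: List[Transaction]) -> float:
    """Sum, over transactions containing the itemset, of its utility divided by its size."""
    items = list(itemset)
    total = 0.0
    for t in db:
        if all(i in t for i in items):
            total += sum(t[i] for i in items) / len(items)
    return total


def upper_bound(itemset: Iterable[str], db: List[Transaction]) -> float:
    """Sum of the maximal item utility of each transaction containing the itemset."""
    items = list(itemset)
    return sum(max(t.values()) for t in db if all(i in t for i in items))


# Hypothetical utility database.
db = [
    {"a": 6, "b": 2, "c": 4},
    {"a": 3, "c": 9},
    {"b": 8},
]
print(average_utility({"a", "c"}, db), upper_bound({"a", "c"}, db))
```

Because an average of item utilities never exceeds the largest item utility in the same transaction, the bound can only overestimate, which is what makes it safe for pruning candidates in a first phase before exact values are computed.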

Expert Systems With Applications | 2009

An ACS-based framework for fuzzy data mining

Tzung-Pei Hong; Ya-Fang Tung; Shyue-Liang Wang; Min-Thai Wu; Yu-Lung Wu

Data mining is often used to find interesting and meaningful patterns in huge databases. It may generate different kinds of knowledge, such as classification rules, clusters and association rules, among others. Much research on data mining has been proposed, most of it focused on mining from binary-valued data. Fuzzy data mining was thus proposed to discover fuzzy knowledge from linguistic or quantitative data. Recently, ant colony systems (ACS) have been successfully applied to optimization problems. However, little work has been done on applying ACS to fuzzy data mining. This thesis thus attempts to propose an ACS-based framework for fuzzy data mining. In the framework, the membership functions are first encoded into binary bits and then fed into the ACS to search for the optimal set of membership functions. The problem is then transformed into a multi-stage graph, with each route representing a possible set of membership functions. When the termination condition is reached, the best membership function set (with the highest fitness value) can then be used to mine fuzzy association rules from a database. Finally, experiments are conducted to compare the proposed framework with other approaches and to show its performance.

Information Sciences | 2012

A multi-level ant-colony mining algorithm for membership functions

Tzung-Pei Hong; Ya-Fang Tung; Shyue-Liang Wang; Yu-Lung Wu; Min-Thai Wu

Fuzzy data mining is used to extract fuzzy knowledge from linguistic or quantitative data. It is an extension of traditional data mining, and the derived knowledge is relatively meaningful to human beings. In the past, we proposed a mining algorithm to find suitable membership functions for fuzzy association rules based on ant colony systems. In that approach, precision was limited by the use of binary bits to encode the membership functions. This paper elaborates on the original approach to increase the accuracy of results by adding multi-level processing. A multi-level ant colony framework is thus designed, and an algorithm based on this structure is proposed. The proposed approach first transforms the fuzzy mining problem into a multi-stage graph, with each route representing a possible set of membership functions. The new approach then extends the previous one, using multi-level processing to handle cases in which the maximum quantities of item values in the transactions may be large. The membership functions derived in a given level are refined in the subsequent level. The final membership functions in the last level are then output to the rule-mining phase to find fuzzy association rules. The experimental results show that the proposed multi-level ant colony systems mining approach can obtain improved results.

Asian Conference on Intelligent Information and Database Systems | 2011

Anonymizing shortest paths on social network graphs

Shyue-Liang Wang; Zheng-Ze Tsai; Tzung-Pei Hong; I-Hsien Ting

Social networking has gained enormous popularity in the past few years. However, this popularity may also bring unexpected consequences for users regarding safety and privacy. To prevent privacy breaches, many effective anonymization techniques have been proposed that model a social network as a weighted graph. In this work, we consider the edge weight anonymity problem. In particular, to protect the weight privacy of the shortest path between two vertices on a weighted graph, we present a new concept called k-anonymous path privacy. A published social network graph with k-anonymous path privacy has at least k indistinguishable shortest paths between the source and destination vertices. Greedy-based modification algorithms are presented, together with experimental results showing the feasibility and characteristics of the proposed approach.
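One way to check the property described in the abstract above is to count the shortest paths between two vertices. The sketch below is only an illustration under stated assumptions: "indistinguishable" is read as "having the same minimum total weight", the graph is undirected with strictly positive edge weights, and the function names `count_shortest_paths` and `is_k_anonymous` are hypothetical. It does not reproduce the paper's greedy modification algorithms.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str, int]  # (u, v, weight), undirected, weight > 0 (assumed)


def count_shortest_paths(edges: List[Edge], source: str, target: str) -> Tuple[Optional[int], int]:
    """Dijkstra's algorithm extended to count minimum-weight paths.

    Returns (shortest distance or None if unreachable, number of shortest paths).
    """
    graph: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for u, v, w in edges:
        graph[u].append((v, w))
        graph[v].append((u, w))

    dist: Dict[str, int] = {source: 0}
    ways: Dict[str, int] = {source: 1}
    done = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in graph[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                ways[v] = ways[u]       # a strictly better route replaces the count
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                ways[v] += ways[u]      # an equally short route adds to the count
    return dist.get(target), ways.get(target, 0)


def is_k_anonymous(edges: List[Edge], source: str, target: str, k: int) -> bool:
    """True when at least k shortest paths connect source and target."""
    return count_shortest_paths(edges, source, target)[1] >= k


# Hypothetical weighted graph: s-a-t and s-b-t both have total weight 3.
edges = [("s", "a", 1), ("a", "t", 2), ("s", "b", 2), ("b", "t", 1), ("s", "t", 5)]
print(count_shortest_paths(edges, "s", "t"), is_k_anonymous(edges, "s", "t", 2))
```

Integer weights are used so that the equality test between path lengths is exact; with floating-point weights a tolerance would be needed.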

IEEE International Conference on Fuzzy Systems | 2002

Mining from quantitative data with linguistic minimum supports and confidences

Tzung-Pei Hong; Ming-Jer Chiang; Shyue-Liang Wang

Most conventional data-mining algorithms identify the relationships among transactions using binary values and set the minimum supports and minimum confidences at numerical values. This paper proposes a new mining approach for extracting linguistic weighted association rules from quantitative transactions when the parameters needed in the mining process are given in linguistic terms. Items are also evaluated by managers in linguistic terms to reflect their importance, and these terms are then transformed into fuzzy sets of weights. Fuzzy operations are then used to find weighted fuzzy large itemsets and fuzzy association rules. An example is given to clearly illustrate the proposed approach.

Vietnam Journal of Computer Science | 2014

Feature selection and replacement by clustering attributes

Tzung-Pei Hong; Yan-Liang Liou; Shyue-Liang Wang; Bay Vo

Feature selection aims to find useful and relevant features in an original feature space to effectively represent and index a given dataset. It is very important for classification and clustering problems, which may be quite difficult to solve when the number of attributes in a given training dataset is very large. Such problems usually need a very time-consuming search to get the desired features. In this paper, we select features based on attribute clustering. A distance measure for a pair of attributes based on relative dependency is proposed. An attribute clustering algorithm, called Most Neighbors First, is also presented to cluster the attributes into a fixed number of groups. The representative attributes found in the clusters can be used for classification, so that the whole feature space can be greatly reduced. In addition, if the values of some representative attributes cannot be obtained from the current environment for inference, other attributes in the same clusters can be used to achieve approximate inference results.
