Jieh-Shan Yeh
Providence College
Publication
Featured research published by Jieh-Shan Yeh.
Data and Knowledge Engineering | 2008
Yu-Chiang Li; Jieh-Shan Yeh; Chin-Chen Chang
Traditional methods of association rule mining treat the appearance of an item in a transaction as a binary variable: the item is either purchased or not. However, customers may purchase more than one of the same item, and the unit cost may vary among items. Utility mining, a generalized form of the share mining model, attempts to overcome this problem. Since the Apriori pruning strategy cannot identify high utility itemsets, developing an efficient algorithm is crucial for utility mining. This study proposes the Isolated Items Discarding Strategy (IIDS), which can be applied to any existing level-wise utility mining method to reduce candidates and to improve performance. The most efficient known models for share mining are ShFSM and DCG, which also work adequately for utility mining. By applying IIDS to ShFSM and DCG, the two methods FUM and DCG+ were implemented, respectively. For both synthetic and real datasets, experimental results reveal that FUM and DCG+ are more efficient than ShFSM and DCG, respectively. Therefore, IIDS is an effective strategy for utility mining.
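The point that Apriori-style pruning fails for utility can be illustrated with a small sketch. It is a brute-force illustration only, not IIDS, FUM, or DCG+; the transactions, the unit profits, and the function names utility and high_utility_itemsets are assumptions made up for this example.

```python
from itertools import combinations

# Hypothetical data: each transaction maps an item to its purchased quantity.
transactions = [
    {"A": 1, "B": 2},
    {"A": 1, "C": 1},
    {"B": 1, "C": 3},
    {"A": 2, "B": 1, "C": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"A": 10, "B": 1, "C": 5}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over the transactions containing every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Exhaustively list the itemsets whose utility reaches min_utility."""
    items = sorted(unit_profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, transactions, unit_profit)
            if u >= min_utility:
                result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in high_utility_itemsets(transactions, unit_profit, 26).items():
        print(itemset, u)
```

With a minimum utility of 26, the itemset (A, B, C) qualifies with utility 26 even though its subsets (B) and (C) have utilities of only 4 and 25. A level-wise method that discarded B and C after the first pass would therefore miss it, which is why utility mining needs a different pruning strategy.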
Fuzzy Systems and Knowledge Discovery | 2005
Yu-Chiang Li; Jieh-Shan Yeh; Chin-Chen Chang
The value of the itemset share is one way of evaluating the magnitude of an itemset. From a business perspective, itemset share values better reflect the significance of itemsets for mining association rules in a database. The Share-counted FSM (ShFSM) algorithm is one of the best algorithms for discovering all share-frequent itemsets efficiently. However, ShFSM wastes computation time on the join and prune steps of candidate generation in each pass, and generates too many useless candidates. Therefore, this study proposes the Direct Candidates Generation (DCG) algorithm, which generates candidates directly, without the join and prune steps in each pass. Moreover, DCG generates fewer candidates than ShFSM. Experimental results reveal that the proposed method performs significantly better than ShFSM.
Asia-Pacific Web Conference | 2005
Yu-Chiang Li; Jieh-Shan Yeh; Chin-Chen Chang
Itemset share has been proposed as a measure of the importance of itemsets for mining association rules. The value of the itemset share can provide useful information, such as the total profit or the total purchased quantity associated with an itemset in a database. The discovery of share-frequent itemsets does not have the downward closure property. Existing algorithms for discovering share-frequent itemsets are inefficient or do not find all share-frequent itemsets. Therefore, this study proposes a novel Fast Share Measure (FSM) algorithm to efficiently generate all share-frequent itemsets. Instead of the downward closure property, FSM satisfies the level closure property. Simulation results reveal that the FSM algorithm outperforms the ZSP algorithm by two to three orders of magnitude for minimum share thresholds between 0.2% and 2%.
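A small sketch may help show what the share measure computes and why downward closure does not hold for it. It assumes the usual definition, in which the share of an itemset is its local measure value (the quantities of its items in the transactions that contain the whole itemset) divided by the total measure value of the database. It is not the FSM algorithm itself, and the data and function names are made up for illustration.

```python
# Hypothetical data: each transaction maps an item to its purchased quantity.
transactions = [
    {"A": 1, "B": 1, "C": 4},
    {"A": 1, "B": 2},
    {"C": 1},
    {"A": 1, "B": 3, "C": 2},
]


def local_measure_value(itemset, transactions):
    """Total quantity of the itemset's items in transactions containing all of them."""
    return sum(
        sum(t[item] for item in itemset)
        for t in transactions
        if all(item in t for item in itemset)
    )


def total_measure_value(transactions):
    """Total quantity of every item in the whole database."""
    return sum(sum(t.values()) for t in transactions)


def share(itemset, transactions):
    """Fraction of the total measure value attributable to the itemset."""
    return local_measure_value(itemset, transactions) / total_measure_value(transactions)


if __name__ == "__main__":
    for itemset in [("A",), ("B",), ("A", "B")]:
        print(itemset, share(itemset, transactions))
```

On this data the share of (A) is 3/16 = 0.1875, while the share of its superset (A, B) is 9/16 = 0.5625. With a minimum share of 0.25, (A, B) is share-frequent although (A) is not, so subsets cannot simply be pruned as in support-based mining.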
Information Sciences | 2009
Chin-Chen Chang; Pei-Yu Lin; Jieh-Shan Yeh
Watermarking techniques are applied to digital media to protect their integrity and copyright. The embedding of a watermark, however, often distorts the quality of the protected image. This may be intolerable when the protected media are artistic or otherwise valuable images. Hence, engineers have proposed removable solutions that permit authorized users to restore watermarked images to unmarked images with satisfactory quality. Unfortunately, these mechanisms cannot resist signal processing attacks and therefore fail to protect ownership. In this article, we propose a novel watermarking mechanism that utilizes pair-difference correlations upon subsampling and the just noticeable difference (JND) technique. This new approach can guarantee the essential robustness of watermarking schemes. Experimental results reveal that the new method outperforms others in terms of restored image quality. More specifically, this novel approach can resist various attacks to which related works are vulnerable.
Expert Systems With Applications | 2010
Jieh-Shan Yeh; Po-Chiang Hsu
Privacy preserving data mining (PPDM) is a popular topic in the research community. How to strike a balance between privacy protection and knowledge discovery in the sharing process is an important issue. This study focuses on privacy preserving utility mining (PPUM) and presents two novel algorithms, HHUIF and MSICF, to achieve the goal of hiding sensitive itemsets so that the adversaries cannot mine them from the modified database. The work also minimizes the impact on the sanitized database of hiding sensitive itemsets. The experimental results show that HHUIF achieves lower miss costs than MSICF on two synthetic datasets. On the other hand, MSICF generally has a lower difference ratio than HHUIF between original and sanitized databases.
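The general idea of hiding a sensitive itemset can be sketched as follows. This is a simplified greedy illustration and should not be read as the published HHUIF or MSICF procedure: it repeatedly lowers the quantity of the highest-utility item occurrence among the transactions that support the sensitive itemset, until the itemset's utility falls below the threshold. The data, the function names, and the assumptions of positive integer unit profits and a positive threshold are all choices made for this example.

```python
def itemset_utility(itemset, db, profit):
    """Sum quantity * unit profit over the transactions containing every item."""
    return sum(
        sum(t[item] * profit[item] for item in itemset)
        for t in db
        if all(item in t for item in itemset)
    )


def hide_itemset(itemset, db, profit, min_utility):
    """Return a sanitized copy of db in which the itemset's utility is below min_utility.

    Assumes positive integer unit profits and min_utility > 0.
    """
    if min_utility <= 0:
        raise ValueError("min_utility must be positive")
    db = [dict(t) for t in db]  # work on a copy; the original stays untouched
    while True:
        excess = itemset_utility(itemset, db, profit) - min_utility
        if excess < 0:
            return db
        supporting = [t for t in db if all(item in t for item in itemset)]
        # Pick the item occurrence that contributes the most utility.
        t, item = max(
            ((t, item) for t in supporting for item in itemset),
            key=lambda pair: pair[0][pair[1]] * profit[pair[1]],
        )
        # Remove just enough units to push the utility strictly below the threshold.
        units = excess // profit[item] + 1
        remaining = t[item] - units
        if remaining > 0:
            t[item] = remaining
        else:
            del t[item]  # the transaction no longer supports the itemset


if __name__ == "__main__":
    db = [{"A": 2, "B": 1}, {"A": 1, "B": 4}, {"B": 2}]
    profit = {"A": 10, "B": 1}
    sanitized = hide_itemset(("A", "B"), db, profit, min_utility=30)
    print(sanitized, itemset_utility(("A", "B"), sanitized, profit))
```

In this example the utility of (A, B) drops from 35 to 25 after a single unit of A is removed from the first transaction. The number of changed quantities is one rough proxy for the kind of side effect on the sanitized database that the abstract describes.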
International Conference on Ubiquitous Information Management and Communication | 2008
Jieh-Shan Yeh; Chih-Yang Chang; Yao-Te Wang
Temporal data mining is the activity of finding interesting correlations or patterns in large temporal data sets. Utility mining, by contrast, aims at identifying the itemsets with high utilities. In 2006, Tseng et al. introduced temporal utility mining, which extends both temporal association rule mining and utility mining. In this study, we investigated incremental utility mining, which can identify all high temporal utility itemsets in a specified time period on an incremental transaction database. Two algorithms, Incremental Utility Mining (IUM) and Fast Incremental Utility Mining (FIUM), were proposed. The experimental results showed that both algorithms are efficient.
Intelligent Systems Design and Applications | 2008
Jieh-Shan Yeh; Po-Chiang Hsu; Ming-Hsun Wen
Privacy preserving data mining (PPDM) has become a popular topic in the research community. How to strike a balance between privacy protection and knowledge discovery in the sharing process is an important issue. This study focuses on privacy preserving utility mining (PPUM) and presents two novel algorithms, HHUIF and MSICF, to achieve the goal of hiding sensitive itemsets so that adversaries cannot mine them from the modified database. In addition, we minimize the impact on the sanitized database in the process of hiding sensitive itemsets. The experimental results show that HHUIF achieves lower miss costs than MSICF on two synthetic datasets. On the other hand, MSICF generally yields a smaller difference between the original and sanitized databases than HHUIF.
International Conference on Ubiquitous Information Management and Communication | 2009
Jieh-Shan Yeh; Szu-Chen Lin
Periodic pattern mining aims to discover valid periodic patterns in a time-related dataset. Many methods for mining periodic patterns have been proposed in the literature, and most previous studies concern synchronous periodic patterns. Recently, however, asynchronous periodic pattern mining has received increasing attention. In this paper, we propose an efficient linked structure and the OEOP algorithm to discover all kinds of valid segments in each single event sequence. Then, following the general model of asynchronous periodic pattern mining proposed by Huang and Chang, we combine the valid segments found by OEOP into 1-patterns with multiple events, multiple patterns with multiple events, and asynchronous periodic patterns. We also implement these algorithms on two real datasets. The experimental results show that the algorithms have good performance and scalability.
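As a rough illustration of what a valid segment in a single event sequence might look like, the sketch below finds maximal runs of occurrences of one event that are spaced exactly one period apart and repeat at least a minimum number of times. This is a simplified reading of the notion, not the OEOP algorithm or its linked structure, and the function name, parameters, and data are assumptions for this example.

```python
def valid_segments(times, period, min_rep):
    """Return (start_time, repetitions) for each maximal run of occurrences
    spaced exactly `period` apart that repeats at least `min_rep` times."""
    occupied = set(times)
    segments = []
    for start in sorted(occupied):
        if start - period in occupied:
            continue  # this occurrence belongs to a run that started earlier
        reps = 1
        while start + reps * period in occupied:
            reps += 1
        if reps >= min_rep:
            segments.append((start, reps))
    return segments


if __name__ == "__main__":
    # Hypothetical time stamps at which a single event occurs.
    occurrences = [1, 4, 7, 10, 12, 15, 20]
    print(valid_segments(occurrences, period=3, min_rep=3))
```

For the time stamps above, a period of 3 and a minimum of 3 repetitions, the only segment found starts at time 1 and repeats 4 times (1, 4, 7, 10). The run 12, 15 is too short and is discarded.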
International Conference on Genetic and Evolutionary Computing | 2011
Sheng Xiang Fan; Jieh-Shan Yeh; Yaw-Ling Lin
Both sequential pattern mining and temporal pattern mining have become highly relevant data mining topics in this decade. In 2009, Wu and Chen proposed a representation for hybrid events and an HTPM mining method. However, their approach neither addresses nor analyzes the length of event time. An event representation may stand for the same event with extremely different time lengths, which may induce a loss of accuracy in the mining results. This paper addresses this difficulty and explores different models and solutions. Firstly, this paper introduces the concept of the time grain, and proposes new hybrid models as well as pattern mining algorithms associated with the concept of an event length limit. Events in hybrid sequences are divided or distinguished according to a given threshold, to enable a detailed exploration of the more frequent hybrid sequences of events. Secondly, this paper uses the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) as test data to examine the proposed model and the feasibility and effectiveness of the algorithm.
Artificial Intelligence Applications and Innovations | 2005
Yu-Chiang Li; Chin-Chen Chang; Jieh-Shan Yeh
Most existing algorithms employ a uniform minimum support for mining association rules. Nevertheless, each item in a publication database, and indeed each set of items, has its own exhibition period. A reasonable minimum support threshold has to be adjusted according to the exhibition period of each k-itemset. Accordingly, this paper proposes a new algorithm, called WMS, for mining association rules with weighted minimum supports in publication databases. WMS discovers all frequent itemsets that satisfy their individual minimum support thresholds. WMS applies the group closure property to prune futile itemsets, to reduce the number of candidates generated, and thus to generate the candidate sets efficiently.
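One plausible reading of adjusting the threshold to the exhibition period is sketched below. It assumes that an itemset's exhibition period begins when its most recently published item appears, and that the required support count is the minimum support ratio multiplied by the number of transactions in that period. This is an interpretation for illustration, not the WMS algorithm or its group closure pruning; the data layout and function names are hypothetical.

```python
def exhibition_start(itemset, published_at):
    """An itemset can only be bought once all of its items have been published."""
    return max(published_at[item] for item in itemset)


def is_frequent(itemset, transactions, published_at, min_support):
    """Check support against a threshold scaled to the itemset's exhibition period."""
    start = exhibition_start(itemset, published_at)
    in_period = [items for time, items in transactions if time >= start]
    if not in_period:
        return False
    count = sum(1 for items in in_period if set(itemset) <= items)
    return count >= min_support * len(in_period)


if __name__ == "__main__":
    # Hypothetical database of (time, items) pairs; item B is published at time 5.
    transactions = [
        (1, {"A"}), (2, {"A"}), (3, {"A"}), (4, {"A"}),
        (5, {"A", "B"}), (6, {"B"}), (7, {"A", "B"}), (8, {"A"}),
    ]
    published_at = {"A": 1, "B": 5}
    print(is_frequent(("B",), transactions, published_at, min_support=0.5))
```

In this example item B appears in 3 of the 8 transactions, which falls short of a uniform 50% minimum support. Measured over its own exhibition period (the 4 transactions from time 5 onward), it appears in 3 of 4 and is therefore counted as frequent.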