Vincent S. Tseng | Researchain

Publication

Featured research published by Vincent S. Tseng.

IEEE Transactions on Knowledge and Data Engineering | 2013

Efficient Algorithms for Mining High Utility Itemsets from Transactional Databases

Vincent S. Tseng; Bai En Shie; Cheng Wei Wu; Philip S. Yu

Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility like profits. Although a number of relevant algorithms have been proposed in recent years, they incur the problem of producing a large number of candidate itemsets for high utility itemsets. Such a large number of candidate itemsets degrades the mining performance in terms of execution time and space requirement. The situation may become worse when the database contains lots of long transactions or long high utility itemsets. In this paper, we propose two algorithms, namely utility pattern growth (UP-Growth) and UP-Growth+, for mining high utility itemsets with a set of effective strategies for pruning candidate itemsets. The information of high utility itemsets is maintained in a tree-based data structure named utility pattern tree (UP-Tree) such that candidate itemsets can be generated efficiently with only two scans of database. The performance of UP-Growth and UP-Growth+ is compared with the state-of-the-art algorithms on many types of both real and synthetic data sets. Experimental results show that the proposed algorithms, especially UP-Growth+, not only reduce the number of candidates effectively but also outperform other algorithms substantially in terms of runtime, especially when databases contain lots of long transactions.
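
As a minimal illustration of the problem this abstract describes (not of UP-Growth or UP-Tree themselves), the sketch below computes itemset utility as purchased quantity times unit profit and enumerates every itemset by brute force. The toy transactions, the profit table, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def itemset_utility(itemset, transactions, profit):
    """Sum quantity * unit profit over the transactions containing every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, profit, min_util):
    """Brute-force enumeration of all itemsets; exponential, illustration only."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, transactions, profit)
            if u >= min_util:
                result[itemset] = u
    return result


print(high_utility_itemsets(transactions, profit, min_util=20))
```

On this toy data, {a} has utility 20, {a, b} has 16, and {a, c} has 26, so utility neither grows nor shrinks consistently as items are added. That is why plain enumeration cannot be pruned the way frequency-based mining can, and why candidate-pruning strategies matter.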

Knowledge Discovery and Data Mining | 2010

UP-Growth: an efficient algorithm for high utility itemset mining

Vincent S. Tseng; Cheng Wei Wu; Bai En Shie; Philip S. Yu

Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility like profits. Although a number of relevant approaches have been proposed in recent years, they incur the problem of producing a large number of candidate itemsets for high utility itemsets. Such a large number of candidate itemsets degrades the mining performance in terms of execution time and space requirement. The situation may become worse when the database contains lots of long transactions or long high utility itemsets. In this paper, we propose an efficient algorithm, namely UP-Growth (Utility Pattern Growth), for mining high utility itemsets with a set of techniques for pruning candidate itemsets. The information of high utility itemsets is maintained in a special data structure named UP-Tree (Utility Pattern Tree) such that the candidate itemsets can be generated efficiently with only two scans of the database. The performance of UP-Growth was evaluated in comparison with the state-of-the-art algorithms on different types of datasets. The experimental results show that UP-Growth not only reduces the number of candidates effectively but also outperforms other algorithms substantially in terms of execution time, especially when the database contains lots of long transactions.

IEEE Transactions on Knowledge and Data Engineering | 2011

Efficient Relevance Feedback for Content-Based Image Retrieval by Mining User Navigation Patterns

Ja Hwung Su; Wei Jyun Huang; Philip S. Yu; Vincent S. Tseng

Nowadays, content-based image retrieval (CBIR) is the mainstay of image retrieval systems. To be more profitable, relevance feedback techniques were incorporated into CBIR such that more precise results can be obtained by taking users' feedback into account. However, existing relevance feedback-based CBIR methods usually require a number of feedback iterations to produce refined search results, especially in a large-scale image database. This is impractical and inefficient in real applications. In this paper, we propose a novel method, Navigation-Pattern-based Relevance Feedback (NPRF), to achieve high efficiency and effectiveness of CBIR in coping with large-scale image data. In terms of efficiency, the iterations of feedback are reduced substantially by using the navigation patterns discovered from the user query log. In terms of effectiveness, our proposed search algorithm NPRFSearch makes use of the discovered navigation patterns and three kinds of query refinement strategies, Query Point Movement (QPM), Query Reweighting (QR), and Query Expansion (QEX), to converge the search space toward the user's intention effectively. By using the NPRF method, high quality of image retrieval on RF can be achieved in a small number of feedback rounds. The experimental results reveal that NPRF outperforms other existing methods significantly in terms of precision, coverage, and number of feedback rounds.
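
The abstract names Query Point Movement without defining it. A rough sketch of one common formulation, assumed here rather than taken from the paper, moves the query's feature vector part of the way toward the centroid of the images the user marked as relevant. The function name, the `alpha` step size, and the two-dimensional feature vectors are hypothetical.

```python
def move_query_point(query, relevant, alpha=0.5):
    """Shift the query vector toward the centroid of the relevant examples.

    alpha is an assumed step size: 0 keeps the query, 1 jumps to the centroid.
    """
    if not relevant:
        return list(query)
    dim = len(query)
    centroid = [sum(v[d] for v in relevant) / len(relevant) for d in range(dim)]
    return [(1 - alpha) * q + alpha * c for q, c in zip(query, centroid)]


# Hypothetical feature vectors of two images marked relevant by the user.
new_query = move_query_point([0.0, 0.0], [[1.0, 2.0], [3.0, 4.0]], alpha=0.5)
print(new_query)
```

Query Reweighting and Query Expansion would instead adjust per-feature weights or add further query points; neither is shown here.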

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2005

Efficiently Mining Gene Expression Data via a Novel Parameterless Clustering Method

Vincent S. Tseng; Ching-Pin Kao

Clustering analysis has been an important research topic in the machine learning field due to its wide applications. In recent years, it has even become a valuable and useful tool for in-silico analysis of microarray or gene expression data. Although a number of clustering methods have been proposed, they are confronted with difficulties in meeting the requirements of automation, high quality, and high efficiency at the same time. In this paper, we propose a novel, parameterless and efficient clustering algorithm, namely, correlation search technique (CST), which is suited to the analysis of gene expression data. The unique feature of CST is that it incorporates the validation techniques into the clustering process so that high quality clustering results can be produced on the fly. Through experimental evaluation, CST is shown to outperform other clustering methods greatly in terms of clustering quality, efficiency, and automation on both synthetic and real data sets.

International Symposium on Methodologies for Intelligent Systems | 2014

FHM: Faster High-Utility Itemset Mining Using Estimated Utility Co-occurrence Pruning

Philippe Fournier-Viger; Cheng-Wei Wu; Souleymane Zida; Vincent S. Tseng

High utility itemset mining is a challenging task in frequent pattern mining, which has wide applications. The state-of-the-art algorithm is HUI-Miner. It adopts a vertical representation and performs a depth-first search to discover patterns and calculate their utility without performing costly database scans. Although this approach is effective, mining high-utility itemsets remains computationally expensive because HUI-Miner has to perform a costly join operation for each pattern that is generated by its search procedure. In this paper, we address this issue by proposing a novel strategy based on the analysis of item co-occurrences to reduce the number of join operations that need to be performed. An extensive experimental study with four real-life datasets shows that the resulting algorithm named FHM (Fast High-Utility Miner) reduces the number of join operations by up to 95% and is up to six times faster than the state-of-the-art algorithm HUI-Miner.
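
The co-occurrence idea can be sketched as follows, assuming the estimate kept for each item pair is its transaction-weighted utility (TWU): the summed utility of all transactions containing both items. Since that sum is an upper bound on the utility of any itemset containing the pair, a pair whose bound falls below the threshold never needs a join. The abstract does not give this detail, so the data, the function names, and the dictionary layout are illustrative assumptions rather than FHM's actual structures.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: item -> quantity per transaction, and unit profits.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def transaction_utility(t, profit):
    """Total utility of one transaction."""
    return sum(qty * profit[item] for item, qty in t.items())


def pair_twu(transactions, profit):
    """Map each co-occurring item pair to the summed utility of its transactions."""
    twu = defaultdict(int)
    for t in transactions:
        tu = transaction_utility(t, profit)
        for x, y in combinations(sorted(t), 2):
            twu[(x, y)] += tu
    return dict(twu)


def can_skip_join(x, y, twu, min_util):
    """True if no itemset containing both x and y can reach min_util."""
    pair = (x, y) if x < y else (y, x)
    return twu.get(pair, 0) < min_util


twu = pair_twu(transactions, profit)
print(twu)
print(can_skip_join("a", "d", twu, min_util=20))
```

With a threshold of 20, the pairs (a, d) and (c, d) have a bound of 12 on this toy data, so any extension containing either pair could be discarded before the join is attempted.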

IEEE Transactions on Knowledge and Data Engineering | 2011

Mining Cluster-Based Temporal Mobile Sequential Patterns in Location-Based Service Environments

Eric Hsueh Chan Lu; Vincent S. Tseng; Philip S. Yu

Research on Location-Based Services (LBS) has been emerging in recent years due to a wide range of potential applications. One of the active topics is the mining and prediction of mobile movements and associated transactions. Most existing studies focus on discovering mobile patterns from the whole logs. However, this kind of pattern may not be precise enough for predictions since the differentiated mobile behaviors among users and temporal periods are not considered. In this paper, we propose a novel algorithm, namely, Cluster-based Temporal Mobile Sequential Pattern Mine (CTMSP-Mine), to discover the Cluster-based Temporal Mobile Sequential Patterns (CTMSPs). Moreover, a prediction strategy is proposed to predict the subsequent mobile behaviors. In CTMSP-Mine, user clusters are constructed by a novel algorithm named Cluster-Object-based Smart Cluster Affinity Search Technique (CO-Smart-CAST) and similarities between users are evaluated by the proposed measure, Location-Based Service Alignment (LBS-Alignment). Meanwhile, a time segmentation approach is presented to find segmenting time intervals where similar mobile characteristics exist. To the best of our knowledge, this is the first work on mining and prediction of mobile behaviors with considerations of user relations and temporal property simultaneously. Through experimental evaluation under various simulated conditions, the proposed methods are shown to deliver excellent performance.

Knowledge Discovery and Data Mining | 2012

Mining top-K high utility itemsets

Cheng Wei Wu; Bai En Shie; Vincent S. Tseng; Philip S. Yu

Mining high utility itemsets from databases is an emerging topic in data mining, which refers to the discovery of itemsets with utilities higher than a user-specified minimum utility threshold min_util. Although several studies have been carried out on this topic, setting an appropriate minimum utility threshold is a difficult problem for users. If min_util is set too low, too many high utility itemsets will be generated, which may cause the mining algorithms to become inefficient or even run out of memory. On the other hand, if min_util is set too high, no high utility itemset will be found. Setting appropriate minimum utility thresholds by trial and error is a tedious process for users. In this paper, we address this problem by proposing a new framework named top-k high utility itemset mining, where k is the desired number of high utility itemsets to be mined. An efficient algorithm named TKU (Top-K Utility itemsets mining) is proposed for mining such itemsets without setting min_util. Several features were designed in TKU to solve the new challenges raised in this problem, like the absence of anti-monotone property and the requirement of lossless results. Moreover, TKU incorporates several novel strategies for pruning the search space to achieve high efficiency. Results on real and synthetic datasets show that TKU has excellent performance and scalability.
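
The top-k framing can be illustrated with a small sketch in which the utility of the k-th best itemset found so far acts as an internally rising threshold, so the user supplies k instead of min_util. This is a brute-force illustration under hypothetical toy data, not the TKU algorithm: it enumerates every itemset and uses none of TKU's pruning strategies.

```python
import heapq
from itertools import combinations

# Hypothetical toy data: item -> quantity per transaction, and unit profits.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def utility(itemset, transactions, profit):
    """Sum quantity * unit profit over the transactions containing every item."""
    return sum(
        sum(t[item] * profit[item] for item in itemset)
        for t in transactions
        if all(item in t for item in itemset)
    )


def top_k_high_utility(transactions, profit, k):
    """Return the k highest-utility itemsets as (utility, itemset) pairs."""
    items = sorted({item for t in transactions for item in t})
    heap = []  # min-heap; heap[0][0] is the current k-th best utility
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, profit)
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:  # beats the current internal threshold
                heapq.heapreplace(heap, (u, itemset))
    return sorted(heap, reverse=True)


print(top_k_high_utility(transactions, profit, k=2))
```

A practical algorithm would use the rising threshold to prune the search space instead of visiting every itemset, which is the part that requires the strategies the abstract mentions.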

Advances in Geographic Information Systems | 2012

Personalized trip recommendation with multiple constraints by mining user check-in behaviors

Eric Hsueh Chan Lu; Ching-Yu Chen; Vincent S. Tseng

In recent years, research on travel recommendation has attracted extensive attention due to its wide applications. Among them, one of the active topics is constraint-based trip recommendation for meeting users' personal requirements. Although a number of studies on this topic have been proposed in the literature, most of them only regard the user-specific constraints as some filtering conditions for planning the trip. In fact, immersing the constraints into travel recommendation systems to provide a personalized trip is desired for users. Furthermore, time complexity of trip planning from a set of attractions is sensitive to the scalability of travel regions. Hence, how to reduce the computational cost by parallel cloud computing techniques is also a critical issue. In this paper, we propose a novel framework named Personalized Trip Recommendation (PTR) to efficiently recommend the personalized trips meeting multiple constraints of users by mining users' check-in behaviors. In PTR, a mining-based module is first proposed to estimate the scores of attractions by considering both user-based preferences and temporal-based properties. Then, a trip planning algorithm named Parallel Trip-Mine+ is proposed to efficiently plan the trip that satisfies multiple user-specific constraints. To the best of our knowledge, this is the first work on travel recommendation that considers the issues of multiple constraints, social relationship, temporal property and parallel computing simultaneously. Through comprehensive experimental evaluations on a real check-in dataset obtained from Gowalla, PTR is shown to deliver excellent performance.

Journal of Systems and Software | 2008

An efficient algorithm for mining temporal high utility itemsets from data streams

Chun-Jung Chu; Vincent S. Tseng; Tyne Liang

Utility of an itemset is considered as the value of this itemset, and utility mining aims at identifying the itemsets with high utilities. The temporal high utility itemsets are the itemsets whose support is larger than a pre-specified threshold in the current time window of the data stream. Discovery of temporal high utility itemsets is an important process for mining interesting patterns like association rules from data streams. In this paper, we propose a novel method, namely THUI (Temporal High Utility Itemsets)-Mine, for mining temporal high utility itemsets from data streams efficiently and effectively. To the best of our knowledge, this is the first work on mining temporal high utility itemsets from data streams. The novel contribution of THUI-Mine is that it can effectively identify the temporal high utility itemsets by generating fewer candidate itemsets such that the execution time can be reduced substantially in mining all high utility itemsets in data streams. In this way, the process of discovering all temporal high utility itemsets under all time windows of data streams can be achieved effectively with less memory space and execution time. This meets the critical requirements on time and space efficiency for mining data streams. Through experimental evaluation, THUI-Mine is shown to significantly outperform other existing methods like the Two-Phase algorithm under various experimental conditions.

Knowledge and Information Systems | 2014

An efficient projection-based indexing approach for mining high utility itemsets

Guo-Cheng Lan; Tzung-Pei Hong; Vincent S. Tseng

Recently, utility mining has been widely discussed in the field of data mining. It finds high utility itemsets by considering both profits and quantities of items in transactional data sets. However, most of the existing approaches are based on the principle of levelwise processing, as in the traditional two-phase utility mining algorithm, to find high utility itemsets. In this paper, we propose an efficient utility mining approach that adopts an indexing mechanism to speed up the execution and reduce the memory requirement in the mining process. The indexing mechanism can imitate the traditional projection algorithms to achieve the aim of projecting sub-databases for mining. In addition, a pruning strategy is also applied to reduce the number of unpromising itemsets in mining. Finally, the experimental results on synthetic data sets and on a real data set show the superior performance of the proposed approach.
