Carson Kai-Sang Leung

Publication

Featured research published by Carson Kai-Sang Leung.

knowledge discovery and data mining | 2008

A tree-based approach for frequent pattern mining from uncertain data

Carson Kai-Sang Leung; Mark Anthony F. Mateo; Dale A. Brajczuk

Many frequent pattern mining algorithms find patterns from traditional transaction databases, in which the content of each transaction--namely, items--is definitely known and precise. However, there are many real-life situations in which the content of transactions is uncertain. To deal with these situations, we propose a tree-based mining algorithm to efficiently find frequent patterns from uncertain data, where each item in the transactions is associated with an existential probability. Experimental results show the efficiency of our proposed algorithm.
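To make the notion of an existential probability concrete, here is a minimal sketch of how the expected support of an itemset is commonly computed over uncertain transactions. It assumes that items occur independently within a transaction, which the abstract does not state; the function name `expected_support` and the toy database are hypothetical, and this is not the tree-based algorithm itself.

```python
from math import prod

def expected_support(itemset, transactions):
    """Sum, over all transactions, of the product of the existential
    probabilities of the items in `itemset`.

    Assumption: items are independent within a transaction, and an item
    missing from a transaction has probability 0.
    """
    return sum(
        prod(t.get(item, 0.0) for item in itemset)
        for t in transactions
    )

# Hypothetical toy database: each transaction maps item -> existential probability.
db = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.4},
    {"a": 1.0, "b": 0.5, "c": 0.2},
]

if __name__ == "__main__":
    print(expected_support(["a"], db))
    print(expected_support(["a", "b"], db))
```

Under this definition, an itemset would be reported as frequent when its expected support meets a user-chosen minimum threshold.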

international conference on data mining | 2006

DSTree: A Tree Structure for the Mining of Frequent Sets from Data Streams

Carson Kai-Sang Leung; Quamrul I. Khan

With advances in technology, a flood of data can be produced in many applications such as sensor networks and Web click streams. This calls for efficient techniques for extracting useful information from streams of data. In this paper, we propose a novel tree structure, called DSTree (Data Stream Tree), that captures important data from the streams. By exploiting its nice properties, the DSTree can be easily maintained and mined for frequent itemsets as well as various other patterns like constrained itemsets.
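The following is a rough sketch of how a prefix tree can be maintained over a sliding window of stream batches. It assumes that each node keeps one frequency count per batch and that the oldest count is dropped when the window slides; the abstract does not spell out these details, so the class and method names (`BatchWindowTree`, `add_batch`, `support`) are illustrative rather than the paper's own design.

```python
class Node:
    def __init__(self, window):
        self.counts = [0] * window  # one frequency per batch in the window
        self.children = {}


class BatchWindowTree:
    """Prefix tree over the most recent `window` batches (illustrative only)."""

    def __init__(self, window):
        self.window = window
        self.root = Node(window)
        self.filled = 0  # number of batches currently held in the window

    def add_batch(self, batch):
        if self.filled == self.window:
            self._slide(self.root)  # window is full: drop the oldest batch
        else:
            self.filled += 1
        slot = self.filled - 1
        for transaction in batch:
            node = self.root
            for item in sorted(set(transaction)):  # fixed item order
                node = node.children.setdefault(item, Node(self.window))
                node.counts[slot] += 1

    def _slide(self, node):
        for item in list(node.children):
            child = node.children[item]
            child.counts = child.counts[1:] + [0]
            if sum(child.counts) == 0:
                del node.children[item]  # no transaction in the window uses it
            else:
                self._slide(child)

    def support(self, itemset):
        """Number of transactions in the window that contain `itemset`."""
        target = sorted(set(itemset))
        if not target:
            return 0

        def walk(node, i):
            total = 0
            for item, child in node.children.items():
                if item == target[i]:
                    if i + 1 == len(target):
                        total += sum(child.counts)
                    else:
                        total += walk(child, i + 1)
                elif item < target[i]:
                    total += walk(child, i)  # target may appear deeper
            return total

        return walk(self.root, 0)
```

Because items are inserted in a fixed order, sliding the window only shifts counts and prunes empty nodes; the tree never has to be reordered.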

Knowledge and Information Systems | 2007

CanTree: a canonical-order tree for incremental frequent-pattern mining

Carson Kai-Sang Leung; Quamrul I. Khan; Zhan Li; Tariqul Hoque

Since its introduction, frequent-pattern mining has been the subject of numerous studies, including incremental updating. Many existing incremental mining algorithms are Apriori-based and are not easily adaptable to FP-tree-based frequent-pattern mining. In this paper, we propose a novel tree structure, called CanTree (canonical-order tree), that captures the content of the transaction database and orders tree nodes according to some canonical order. By exploiting its nice properties, the CanTree can be easily maintained when database transactions are inserted, deleted, and/or modified. For example, the CanTree does not require adjustment, merging, and/or splitting of tree nodes during maintenance. No rescan of the entire updated database or reconstruction of a new tree is needed for incremental updating. Experimental results show the effectiveness of our CanTree in the incremental mining of frequent patterns. Moreover, the applicability of CanTrees is not confined to incremental mining; CanTrees are also applicable to other frequent-pattern mining tasks, including constrained mining and interactive mining.
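The maintenance property described above can be illustrated with a small sketch. It assumes lexicographic order as the canonical order (the abstract only says "some canonical order"), and the names `CanonicalTree`, `insert`, and `delete` are hypothetical, not the authors' implementation.

```python
class CanNode:
    def __init__(self):
        self.count = 0
        self.children = {}


class CanonicalTree:
    """Prefix tree whose paths follow a fixed item order (illustrative only)."""

    def __init__(self):
        self.root = CanNode()

    def insert(self, transaction):
        node = self.root
        # Assumption: lexicographic order serves as the canonical order.
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, CanNode())
            node.count += 1

    def delete(self, transaction):
        """Remove one earlier-inserted copy of `transaction`."""
        node = self.root
        for item in sorted(set(transaction)):
            child = node.children[item]
            child.count -= 1
            if child.count == 0:
                # Nothing else passes through this node, so drop the branch.
                del node.children[item]
                return
            node = child


if __name__ == "__main__":
    tree = CanonicalTree()
    for t in (["c", "a"], ["a", "b"], ["a", "c"]):
        tree.insert(t)
    tree.delete(["a", "b"])
    print(tree.root.children["a"].count)
```

Because the item order does not depend on item frequencies, an insertion or deletion touches only the nodes on one path; a modification can be handled as a deletion followed by an insertion.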

international conference on data mining | 2005

CanTree: a tree structure for efficient incremental mining of frequent patterns

Carson Kai-Sang Leung; Quamrul I. Khan; Tariqul Hoque

Since its introduction, frequent-pattern mining has been the subject of numerous studies, including incremental updating. Many existing incremental mining algorithms are Apriori-based and are not easily adaptable to FP-tree-based frequent-pattern mining. In this paper, we propose a novel tree structure, called CanTree (canonical-order tree), that captures the content of the transaction database and orders tree nodes according to some canonical order. By exploiting its nice properties, the CanTree can be easily maintained when database transactions are inserted, deleted, and/or modified. For example, the CanTree does not require adjustment, merging, and/or splitting of tree nodes during maintenance. No rescan of the entire updated database or reconstruction of a new tree is needed for incremental updating. Experimental results show the effectiveness of our CanTree.

international conference on data engineering | 2009

Mining of Frequent Itemsets from Streams of Uncertain Data

Carson Kai-Sang Leung; Boyu Hao

Frequent itemset mining plays an essential role in the mining of various patterns and is in demand in many real-life applications. Hence, mining of frequent itemsets has been the subject of numerous studies since its introduction. Generally, most of these studies find frequent itemsets from traditional transaction databases, in which the content of each transaction--namely, items--is definitely known and precise. However, there are many real-life situations in which one is uncertain about the content of transactions. This calls for the mining of uncertain data. Moreover, due to advances in technology, a flood of precise or uncertain data can be produced in many situations. This calls for the mining of data streams. To deal with these situations, we propose two tree-based mining algorithms to efficiently find frequent itemsets from streams of uncertain data, where each item in the transactions in the streams is associated with an existential probability. Experimental results show the effectiveness of our algorithms in mining frequent itemsets from streams of uncertain data.

international conference on data mining | 2007

Efficient Mining of Frequent Patterns from Uncertain Data

Carson Kai-Sang Leung; Christopher L. Carmichael; Boyu Hao

Since its introduction, mining of frequent patterns has been the subject of numerous studies. Generally, they focus on improving algorithmic efficiency for finding frequent patterns or on extending the notion of frequent patterns to other interesting patterns. Most of these studies find patterns from traditional transaction databases, in which the content of each transaction--namely, items--is definitely known and precise. However, there are many real-life situations in which one is uncertain about the content of transactions. To deal with these situations, we propose a tree-based mining algorithm to efficiently find frequent patterns from uncertain data, where each item in the transactions is associated with an existential probability. Experimental results show the efficiency of our algorithm over its non-tree-based counterpart.

ACM Transactions on Database Systems | 2003

Efficient dynamic mining of constrained frequent sets

Laks V. S. Lakshmanan; Carson Kai-Sang Leung; Raymond T. Ng

Data mining is supposed to be an iterative and exploratory process. In this context, we are working on a project with the overall objective of developing a practical computing environment for the human-centered exploratory mining of frequent sets. One critical component of such an environment is the support for the dynamic mining of constrained frequent sets of items. Constraints enable users to impose a certain focus on the mining process; dynamic means that, in the middle of the computation, users are able to (i) change (such as tighten or relax) the constraints and/or (ii) change the minimum support threshold, thus having a decisive influence on subsequent computations. In a real-life situation, the available buffer space may be limited, thus adding another complication to the problem. In this article, we develop an algorithm, called DCF, for Dynamic Constrained Frequent-set computation. This algorithm is enhanced with a few optimizations, exploiting a lightweight structure called a segment support map. It enables DCF to (i) obtain sharper bounds on the support of sets of items, and to (ii) better exploit properties of constraints. Furthermore, when handling dynamic changes to constraints, DCF relies on the concept of a delta member generating function, which generates precisely the sets of items that satisfy the new but not the old constraints. Our experimental results show the effectiveness of these enhancements.

SIGKDD Explorations | 2002

Exploiting succinct constraints using FP-trees

Carson Kai-Sang Leung; Laks V. S. Lakshmanan; Raymond T. Ng

Since its introduction, frequent-set mining has been generalized to many forms, which include constrained data mining. The use of constraints permits user focus and guidance, enables user exploration and control, and leads to effective pruning of the search space and efficient mining of frequent itemsets. In this paper, we focus on the use of succinct constraints. In particular, we propose a novel algorithm called FPS to mine frequent itemsets satisfying succinct constraints. The FPS algorithm avoids the generate-and-test paradigm by exploiting succinctness properties of the constraints in an FP-tree-based framework. In terms of functionality, our algorithm is capable of handling not just the succinct aggregate constraint, but any succinct constraint in general. Moreover, it handles multiple succinct constraints. In terms of performance, our algorithm is more efficient and effective than existing FP-tree-based constrained frequent-set mining algorithms.
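As an illustration of why succinct constraints avoid generate-and-test, consider the hypothetical constraint max(S.price) <= v: every itemset built only from items priced at most v satisfies it, so the valid items can be selected once, before any counting. The sketch below uses brute-force subset enumeration in place of an FP-tree and is not the FPS algorithm; the function name, the price table, and the thresholds are all assumptions made for the example.

```python
from itertools import combinations

def frequent_itemsets_with_max_price(transactions, price, max_price, minsup):
    """Frequent itemsets S with max(S.price) <= max_price.

    The constraint is enforced by selecting valid items up front, so no
    candidate itemset is ever generated and then tested against it.
    """
    allowed = {item for item, p in price.items() if p <= max_price}
    counts = {}
    for t in transactions:
        projected = sorted(allowed & set(t))  # drop items that violate the constraint
        for k in range(1, len(projected) + 1):
            for combo in combinations(projected, k):
                counts[combo] = counts.get(combo, 0) + 1
    return {s: c for s, c in counts.items() if c >= minsup}


if __name__ == "__main__":
    price = {"a": 10, "b": 60, "c": 20}  # hypothetical item prices
    db = [["a", "b", "c"], ["a", "c"], ["a", "b"], ["c"]]
    print(frequent_itemsets_with_max_price(db, price, max_price=50, minsup=2))
```

Exhaustive enumeration is exponential in transaction length, so it is only suitable for toy inputs like the one shown.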

Trans. Large-Scale Data- and Knowledge-Centered Systems | 2013

Discovering Frequent Patterns from Uncertain Data Streams with Time-Fading and Landmark Models

Carson Kai-Sang Leung; Alfredo Cuzzocrea; Fan Jiang

Streams of data can be continuously generated by sensors in various real-life applications such as environment surveillance. Partially due to the inherent limitations of the sensors, data in these streams can be uncertain. To discover useful knowledge in the form of frequent patterns from streams of uncertain data, a few algorithms have been developed. They mostly use the sliding window model for processing and mining data streams. However, for some applications, other stream processing models such as the time-fading model and the landmark model are more appropriate. In this paper, we propose mining algorithms that use (i) the time-fading model and (ii) the landmark model to discover frequent patterns from streams of uncertain data.
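One common way to realize a time-fading model is to scale the accumulated expected support by a fading factor each time a new batch arrives; with a factor of 1, nothing fades and the update behaves like a landmark model that keeps everything since the starting point. The sketch below assumes that update rule and independent item probabilities; neither is given in the abstract, and the names `fade_update` and `alpha` are hypothetical.

```python
from math import prod

def batch_expected_support(itemset, batch):
    """Expected support of `itemset` within one batch of uncertain transactions."""
    return sum(prod(t.get(item, 0.0) for item in itemset) for t in batch)

def fade_update(old, batch, itemsets, alpha):
    """Return updated expected supports after a new batch arrives.

    Assumed rule: new = alpha * old + expected support in the new batch,
    with 0 < alpha <= 1 (alpha = 1 gives landmark-style accumulation).
    """
    return {
        x: alpha * old.get(x, 0.0) + batch_expected_support(x, batch)
        for x in itemsets
    }


if __name__ == "__main__":
    a = frozenset({"a"})
    supports = {}
    for batch in ([{"a": 0.5}], [{"a": 1.0}, {"a": 0.2}]):
        supports = fade_update(supports, batch, [a], alpha=0.9)
    print(supports[a])
```

A smaller `alpha` makes older batches count for less, which suits applications where recent data matter more than historical data.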

Future Generation Computer Systems | 2014

Mining constrained frequent itemsets from distributed uncertain data

Alfredo Cuzzocrea; Carson Kai-Sang Leung; Richard Kyle MacKinnon

Nowadays, high volumes of data can be generated from various sources (e.g., sensor data from environmental surveillance). Many existing distributed frequent itemset mining algorithms do not allow users to express the itemsets to be mined according to their intention via the use of constraints. Consequently, these unconstrained mining algorithms can yield numerous itemsets that are not interesting to users. Moreover, due to inherent measurement inaccuracies and/or network latencies, the data are often riddled with uncertainty. These call for both constrained mining and uncertain data mining. In this article, we propose a data-intensive computer system for tree-based mining of frequent itemsets that satisfy user-defined constraints from a distributed environment such as a wireless sensor network of uncertain data. Our system allows users to specify constraints for expressing their interests. It finds frequent itemsets that satisfy succinct constraints from distributed uncertain data. It also handles non-succinct (e.g., inductive succinct, anti-monotone) constraints.
