Dongming Liang
York University
Publication
Featured research published by Dongming Liang.
International Conference on Data Engineering | 2003
Jan Chomicki; Parke Godfrey; Jarek Gryz; Dongming Liang
The skyline, or Pareto, operator selects those tuples that are not dominated by any others. Extending relational systems with the skyline operator would offer a basis for handling preference queries. Good algorithms are needed for skyline, however, to make this efficient in a relational setting. We propose a skyline algorithm, SFS, based on presorting that is general, for use with any skyline query, efficient, and well behaved in a relational setting.
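The dominance test and the presorting idea in this abstract can be illustrated with a small sketch. This is not the authors' SFS implementation: it assumes in-memory tuples where larger values are preferred on every attribute, and it uses the attribute sum as the presort key (any monotone scoring function would serve). The names `dominates` and `skyline_presorted` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b everywhere and strictly better somewhere.

    Assumption: larger values are preferred on every attribute.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline_presorted(points: List[Point]) -> List[Point]:
    """Compute the skyline after sorting by a monotone score (here, the sum).

    If a dominates b, then sum(a) > sum(b), so a is visited before b.
    A tuple therefore only needs to be compared against the skyline
    tuples collected so far, and it is never removed once added.
    """
    ordered = sorted(points, key=sum, reverse=True)
    window: List[Point] = []
    for candidate in ordered:
        if not any(dominates(kept, candidate) for kept in window):
            window.append(candidate)
    return window


if __name__ == "__main__":
    data = [(1, 5), (2, 4), (3, 3), (1, 1), (2, 2), (5, 1)]
    print(skyline_presorted(data))
```

The sort guarantees that no later tuple can dominate an earlier one, which is why the window only ever grows in this sketch.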
Intelligent Information Systems | 2005
Jan Chomicki; Parke Godfrey; Jarek Gryz; Dongming Liang
There has been interest recently in skyline queries, also called Pareto queries, on relational databases. Relational query languages do not support search for “best” tuples, beyond the order by statement. The proposed skyline operator allows one to query for best tuples with respect to any number of attributes as preferences. In this work, we explore what the skyline means, and why skyline queries are useful, particularly for expressing preference. We describe the theoretical aspects and possible optimizations of an efficient algorithm for computing skyline queries presented in [6].
Theoretical Computer Science | 2003
Jeff Edmonds; Jarek Gryz; Dongming Liang; Renée J. Miller
Many data mining approaches focus on the discovery of similar (and frequent) data values in large data sets. We present an alternative, but complementary approach in which we search for empty regions in the data. We consider the problem of finding all maximal empty rectangles in large, two-dimensional data sets. We introduce a novel, scalable algorithm for finding all such rectangles. The algorithm achieves this with a single scan over a sorted data set and requires only a small bounded amount of memory. We extend the algorithm to find all maximal empty hyper-rectangles in a multi-dimensional space. We consider the complexity of this search problem and present new bounds on the number of maximal empty hyper-rectangles. We briefly overview experimental results obtained by applying our algorithm to real and synthetic data sets and describe one application of empty-space knowledge to query optimization.
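To make the notion of a maximal empty rectangle concrete, here is a brute-force sketch. It is not the single-scan algorithm the abstract describes: it assumes the data set has already been discretized into a small 0/1 grid (1 marks a cell that contains a data point), and it simply enumerates every rectangle. The function names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def is_empty(grid: List[List[int]], top: int, left: int, bottom: int, right: int) -> bool:
    """Return True if no cell in the rectangle contains a data point."""
    return all(
        grid[r][c] == 0
        for r in range(top, bottom + 1)
        for c in range(left, right + 1)
    )


def maximal_empty_rectangles(grid: List[List[int]]) -> List[Rect]:
    """Enumerate empty rectangles that cannot be extended in any direction."""
    rows, cols = len(grid), len(grid[0])
    found: List[Rect] = []
    for top in range(rows):
        for bottom in range(top, rows):
            for left in range(cols):
                for right in range(left, cols):
                    if not is_empty(grid, top, left, bottom, right):
                        continue
                    # An empty rectangle is maximal when adding one more
                    # row or column on any side would include a data point
                    # or leave the grid.
                    can_grow = (
                        (top > 0 and is_empty(grid, top - 1, left, top - 1, right))
                        or (bottom < rows - 1 and is_empty(grid, bottom + 1, left, bottom + 1, right))
                        or (left > 0 and is_empty(grid, top, left - 1, bottom, left - 1))
                        or (right < cols - 1 and is_empty(grid, top, right + 1, bottom, right + 1))
                    )
                    if not can_grow:
                        found.append((top, left, bottom, right))
    return found


if __name__ == "__main__":
    example = [
        [0, 0, 1],
        [0, 0, 0],
        [1, 0, 0],
    ]
    for rect in maximal_empty_rectangles(example):
        print(rect)
```

The enumeration is far too slow for large data; it is only meant to pin down the definition that a scalable algorithm has to satisfy.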
International Conference on Database Theory | 2001
Jeff Edmonds; Jarek Gryz; Dongming Liang; Renée J. Miller
Many data mining approaches focus on the discovery of similar (and frequent) data values in large data sets. We present an alternative, but complementary approach in which we search for empty regions in the data. We consider the problem of finding all maximal empty rectangles in large, two-dimensional data sets. We introduce a novel, scalable algorithm for finding all such rectangles. The algorithm achieves this with a single scan over a sorted data set and requires only a small bounded amount of memory. We also describe an algorithm to find all maximal empty hyper-rectangles in a multi-dimensional space. We consider the complexity of this search problem and present new bounds on the number of maximal empty hyper-rectangles. We briefly overview experimental results obtained by applying our algorithm to a synthetic data set.
Intelligent Information Systems | 2004
Jarek Gryz; Dongming Liang
Estimating the result size of a join is an important query optimization problem as it determines the choice of a good query evaluation strategy. Yet, there are few efficient techniques that solve this problem. We propose a new approach to join selectivity estimation. Our strategy relies on information extracted from stored data in the form of empty joins which represent portions of the two joined tables that produce an empty result. We present experimental results indicating that empty joins are common in real data sets and propose a simple strategy that uses information about empty joins for an improved join selectivity estimation.
Intelligent Information Systems | 2006
Jarek Gryz; Dongming Liang
A join of two relations in real databases is usually much smaller than their Cartesian product. This means that most of the combinations of tuples in the cross product of the respective relations do not appear together in the join result. We characterize these combinations as ranges of attributes that do not appear together. We sketch an algorithm for finding such combinations and present experimental results from real data sets. We then explore two potential applications of this knowledge in query processing. In the first application, we model empty joins as materialized views and show how they can be used for query optimization. In the second application, we propose a strategy that uses information about empty joins for an improved join selectivity estimation.
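The idea of an empty join as a pair of attribute ranges can be illustrated with a short check. This is only a sketch under assumptions not stated in the abstract: relations are small in-memory lists of `(join_key, attribute)` pairs, ranges are closed intervals, and the join is an equijoin on the key. The name `is_empty_join` is hypothetical and this is not the discovery algorithm the authors outline.

```python
from typing import List, Tuple

Row = Tuple[int, float]      # (join_key, attribute value)
Range = Tuple[float, float]  # closed interval (low, high)


def is_empty_join(r: List[Row], s: List[Row], a_range: Range, b_range: Range) -> bool:
    """Return True if no tuple of r with its attribute in a_range joins
    with any tuple of s whose attribute lies in b_range."""
    # Join keys of the r tuples that fall inside the first range.
    keys_r = {key for key, a in r if a_range[0] <= a <= a_range[1]}
    # The join is empty when no qualifying s tuple shares one of those keys.
    return not any(
        key in keys_r
        for key, b in s
        if b_range[0] <= b <= b_range[1]
    )


if __name__ == "__main__":
    r = [(1, 10), (2, 20), (3, 30)]
    s = [(1, 100), (2, 200), (4, 300)]
    print(is_empty_join(r, s, (5, 15), (150, 350)))
    print(is_empty_join(r, s, (5, 25), (50, 150)))
```

A query whose range predicates fall entirely inside a known empty pair of ranges could, under these assumptions, be answered without evaluating the join at all.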
Database and Expert Systems Applications | 2002
Jarek Gryz; Dongming Liang
A join of two relations in real databases is usually much smaller than their Cartesian product. This means that most of the combinations of tuples in the cross product of the respective relations do not appear together in the join result. We characterize these missing combinations as ranges of attributes that do not appear together and present experimental results on their discovery from real data sets. We then explore potential applications of this knowledge to query optimization. By modeling empty joins as materialized views, we show how knowledge of these regions can be used to improve query performance.
Archive | 2004
Jarek Gryz; Dongming Liang
Lecture Notes in Computer Science | 2002
Jarek Gryz; Dongming Liang
Lecture Notes in Computer Science | 2001
Jeff Edmonds; Jarek Gryz; Dongming Liang; Renée J. Miller