Pauray S. M. Tsai
National Tsing Hua University
Publication
Featured research published by Pauray S. M. Tsai.
knowledge discovery and data mining | 1999
Pauray S. M. Tsai; Chih-Chong Lee; Arbee L. P. Chen
In this paper, we study the issue of maintaining association rules in a large database of sales transactions. The maintenance of association rules can be mapped into the problem of maintaining large itemsets in the database. Because the mining of association rules is time-consuming, we need an efficient approach to maintain the large itemsets when the database is updated. In this paper, we present efficient approaches to solve the problem. Our approaches store the itemsets that are not large at present but may become large itemsets after updating the database, so that the cost of processing the updated database can be reduced. Moreover, we discuss the cases where the large itemsets can be obtained without scanning the original database. Experimental results show that our algorithms outperform other algorithms, especially when the original database need not be scanned in our algorithms.
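The idea of storing itemsets that are not yet large can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the `keep_ratio` threshold, and the brute-force counting are all assumptions made for illustration. The sketch keeps counts for every itemset whose support reaches a lower `keep_ratio`, then updates those counts by scanning only the newly added transactions. It does not handle itemsets that were never stored, which in general would still require a scan of the original database.

```python
from itertools import combinations

def count_itemsets(transactions, max_size):
    """Count every itemset of up to max_size items that occurs in the transactions."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return counts

def build_store(transactions, keep_ratio, max_size=2):
    """Store counts for itemsets with support >= keep_ratio (hypothetical lower threshold)."""
    counts = count_itemsets(transactions, max_size)
    n = len(transactions)
    store = {s: c for s, c in counts.items() if c / n >= keep_ratio}
    return store, n

def apply_increment(store, n_old, increment, max_size=2):
    """Update the stored counts by scanning only the new transactions."""
    inc_counts = count_itemsets(increment, max_size)
    updated = {s: c + inc_counts.get(s, 0) for s, c in store.items()}
    return updated, n_old + len(increment)

def large_itemsets(store, n, min_support):
    """Return the stored itemsets whose support reaches min_support."""
    return {s for s, c in store.items() if c / n >= min_support}

original = [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]
increment = [["a", "b"], ["a", "b", "c"]]

store, n = build_store(original, keep_ratio=0.25)
before = large_itemsets(store, n, min_support=0.6)

store, n = apply_increment(store, n, increment)
after = large_itemsets(store, n, min_support=0.6)
```

In this toy data, the pair `("a", "b")` is below the minimum support in the original database but is stored anyway, so it can be recognized as large after the update without rescanning the original transactions.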
Distributed and Parallel Databases | 1996
Arbee L. P. Chen; Pauray S. M. Tsai; Jia Ling Koh
In a multidatabase system that consists of object databases, the same real-world entity can be stored as objects in different databases with incompatible object identifiers. Identifying and integrating the objects that represent the same entity, so that (a) object duplication in the query result can be avoided, (b) information for the entity can be gathered, and (c) the specialization of multiple classes can be built, is important for providing a well-structured global object schema and a more informative query result. In this paper, we extend our results on probabilistic query processing and joining relations on incompatible keys to solve the problem. Various data and schema conflicts, such as missing data, inconsistent data, and domain mismatch, which may exist in classes from different databases, are considered in the process of identification.
IEEE Transactions on Knowledge and Data Engineering | 2002
Pauray S. M. Tsai; Arbee L. P. Chen
Foreign functions have been considered in advanced database systems to support complex applications. We consider optimizing queries with foreign functions in a distributed environment. In traditional distributed query processing, selection operations are locally processed before joins as much as possible so that the size of relations being transmitted and joined can be reduced. However, if selection predicates involve foreign functions, the cost of evaluating selections cannot be ignored. As a result, the execution order of selections and joins becomes significant, and the trade-off among the costs of data transmission, join processing, and selection predicate evaluation needs to be carefully considered in query optimization. A response time model is developed for estimating the cost of distributed query processing involving foreign functions. We explore the properties of the problem and find an optimal algorithm with polynomial complexity for a special case of it. However, finding the optimal execution plan for the general case is NP-hard. We propose an efficient heuristic algorithm for solving the problem, and the simulation results show its good quality. The research results can also be applied to advanced database systems and to multidatabase systems, where the conversion functions defined for schema integration can be considered a type of foreign function.
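The trade-off between evaluating an expensive selection first and joining first can be made concrete with a toy cost comparison. This is a simplified total-cost sketch, not the response time model developed in the paper; the function names and the parameters `f` (cost of one foreign function call), `t` (cost of transmitting one tuple), `j` (cost of joining one tuple), `sel` (fraction of tuples passing the selection), and `join_frac` (fraction of tuples surviving the join) are all assumptions.

```python
def cost_select_first(n, f, t, j, sel):
    """Evaluate the foreign function on all n tuples, then ship and join the survivors."""
    return n * f + sel * n * (t + j)

def cost_join_first(n, f, t, j, join_frac):
    """Ship and join all n tuples, then evaluate the foreign function on the join survivors."""
    return n * (t + j) + join_frac * n * f

# Expensive foreign function: deferring the selection until after the join is cheaper.
expensive_select_first = cost_select_first(n=1000, f=5.0, t=1.0, j=0.5, sel=0.5)
expensive_join_first = cost_join_first(n=1000, f=5.0, t=1.0, j=0.5, join_frac=0.1)

# Cheap predicate: the traditional select-before-join order is cheaper.
cheap_select_first = cost_select_first(n=1000, f=0.01, t=1.0, j=0.5, sel=0.5)
cheap_join_first = cost_join_first(n=1000, f=0.01, t=1.0, j=0.5, join_frac=0.1)
```

With these made-up numbers, the preferred order flips as the per-tuple function cost grows, which is the effect the abstract describes.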
data and knowledge engineering | 1997
Pauray S. M. Tsai; Arbee L. P. Chen
Heterogeneities exist in a multidatabase environment. For example, a real world entity may be differently represented in relations of different databases. In particular, keys of these relations may be incompatible. In this paper, we consider processing entity join queries when data transmission cost dominates. An entity join operation ‘integrates’ tuples representing the same entities from different relations in which inconsistent data may exist. A natural way to process the entity join is to transmit both relations to a site, resolve the possible conflicts between corresponding attributes and process the join, which is very costly. In this paper, an approach is proposed to correctly transform a global query into local subqueries to preprocess entity join queries in multiple sites with an attempt to lower the cost of data transmission. Besides, an extension of the traditional semijoin, named extended semijoin, is proposed to further reduce the cost of data transmission for entity join query processing.
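For background, the traditional semijoin that the abstract builds on can be sketched as follows. This shows only the classic technique with compatible keys; the extended semijoin is not specified in the abstract and is not reproduced here. The relation contents, the `semijoin` helper, and the `emp_id` attribute are hypothetical.

```python
def semijoin(s_rows, r_rows, key):
    """Return the rows of S whose key value also appears in R."""
    r_keys = {row[key] for row in r_rows}  # only this key projection is shipped to S's site
    return [row for row in s_rows if row[key] in r_keys]

# R lives at site 1, S lives at site 2 (made-up data).
r = [{"emp_id": 1, "name": "Ann"}, {"emp_id": 2, "name": "Bob"}]
s = [
    {"emp_id": 2, "dept": "Sales"},
    {"emp_id": 3, "dept": "IT"},
    {"emp_id": 4, "dept": "HR"},
]

# Step 1: ship R's keys to site 2 and reduce S there.
reduced_s = semijoin(s, r, "emp_id")

# Step 2: ship only the reduced S back to site 1 and finish the join.
joined = [
    {**r_row, **s_row}
    for r_row in r
    for s_row in reduced_s
    if r_row["emp_id"] == s_row["emp_id"]
]
```

Here only one of the three S tuples is transmitted for the final join, at the price of first shipping R's key values.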
international workshop on research issues in data engineering | 1993
Pauray S. M. Tsai; Arbee L. P. Chen
conference on management of data | 1994
Pauray S. M. Tsai; Arbee L. P. Chen
Journal of Information Science and Engineering | 1993
Pauray S. M. Tsai; Arbee L. P. Chen
Journal of Information Science and Engineering | 2000
Pauray S. M. Tsai; Arbee L. P. Chen
international conference on parallel and distributed systems | 1994
Pauray S. M. Tsai; Arbee L. P. Chen
Archive | 2000
Pauray S. M. Tsai; Arbee L. P. Chen