Donovan A. Schneider
University of Wisconsin-Madison
Publication
Featured research published by Donovan A. Schneider.
IEEE Transactions on Knowledge and Data Engineering | 1990
David J. DeWitt; Shahram Ghandeharizadeh; Donovan A. Schneider; Allan Bricker; Hui-I Hsiao; Rick Rasmussen
The design of the Gamma database machine and the techniques employed in its implementation are described. Gamma is a relational database machine currently operating on an Intel iPSC/2 hypercube with 32 processors and 32 disk drives. Gamma employs three key technical ideas which enable the architecture to be scaled to hundreds of processors. First, all relations are horizontally partitioned across multiple disk drives, enabling relations to be scanned in parallel. Second, parallel algorithms based on hashing are used to implement the complex relational operators, such as join and aggregate functions. Third, dataflow scheduling techniques are used to coordinate multioperator queries. By using these techniques, it is possible to control the execution of very complex queries with minimal coordination. The design of the Gamma software is described and a thorough performance evaluation of the iPSC/2 hypercube version of Gamma is presented.
international conference on management of data | 1989
Donovan A. Schneider; David J. DeWitt
In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a looping algorithm with hashing, and our last algorithm is the more traditional sort-merge. The performance of each of the algorithms with different tuple distribution policies, the addition of bit vector filters, varying amounts of main-memory for joining, and non-uniformly distributed join attribute values is studied. The Hybrid hash-join algorithm is found to be superior except when the join attribute values of the inner relation are non-uniformly distributed and memory is limited. In this case, a more conservative algorithm such as the sort-merge algorithm should be used. The Gamma database machine serves as the host for the performance comparison.
international conference on management of data | 1990
Richard J. Lipton; Jeffrey F. Naughton; Donovan A. Schneider
Recently we have proposed an adaptive, random sampling algorithm for general query size estimation. In earlier work we analyzed the asymptotic efficiency and accuracy of the algorithm; in this paper we investigate its practicality as applied to selects and joins. First, we extend our previous analysis to provide significantly improved bounds on the amount of sampling necessary for a given level of accuracy. Next, we provide “sanity bounds” to deal with queries for which the underlying data is extremely skewed or the query result is very small. Finally, we report on the performance of the estimation algorithm as implemented in a host language on a commercial relational system. The results are encouraging, even with this loose coupling between the estimation algorithm and the DBMS.
international conference on management of data | 1988
David J. DeWitt; S. Ghandeharizadeh; Donovan A. Schneider
This paper presents the results of an initial performance evaluation of the Gamma database machine. In our experiments we measured the effect of relation size and indices on response time for selection, join, and aggregation queries, and single-tuple updates. A Teradata DBC/1012 database machine of similar size is used as a basis for interpreting the results obtained. We also analyze the performance of Gamma relative to the number of processors employed and study the impact of varying the memory size and disk page size on the execution time of a variety of selection and join queries. We analyze and interpret the results of these experiments based on our understanding of the system hardware and software, and conclude with an assessment of the strengths and weaknesses of Gamma.
international conference on database theory | 1993
Richard J. Lipton; Jeffrey F. Naughton; Donovan A. Schneider; S. Seshadri
Recently, we have proposed an adaptive, random-sampling algorithm for general query size estimation in databases. In an earlier work we analyzed the asymptotic efficiency and accuracy of the algorithm; in this paper we investigate its practicality as applied to the relational database operations select, project, and join. We extend our previous analysis to provide significantly improved bounds on the amount of sampling necessary for a given level of accuracy. Also, we provide “sanity bounds” to deal with queries for which the underlying data are extremely skewed or the query result is very small. We investigate how the existence of indices can be used to generate more efficient sampling algorithms for the operations of project and join. Finally, we report on the performance of the estimation algorithm, both as implemented in “stand alone” C programs and as implemented in a host language on a commercial relational system.
Parallel architectures for database systems | 1989
David J. DeWitt; Shahram Ghandeharizadeh; Donovan A. Schneider; Rajiv Jauhari; M. Muralikrishna; Anoop Sharma
This paper presents the results of an initial performance evaluation of the Gamma database machine based on an expanded version of the single-user Wisconsin benchmark. In our experiments we measured the effect of relation size and indices on response time for selection, join, and aggregation queries, and single-tuple updates. A Teradata DBC/1012 database machine of similar size is used as a basis for interpreting the results obtained. We analyze and interpret the results of these experiments based on our understanding of the system hardware and software, and conclude with an assessment of the strengths and weaknesses of the two machines.
ieee computer society international conference | 1989
Donovan A. Schneider; David J. DeWitt; Shahram Ghandeharizadeh
Gamma is a relational database machine that utilizes dataflow query processing techniques. Selection query results are presented which show that response time increases in a linear fashion as the size of the input relations increases and that speedup is linear as processors are added. The authors present results from a performance analysis of join queries. Speedups very close to linear are attained as processors are added.
very large data bases | 1990
Donovan A. Schneider; David J. DeWitt
Archive | 1990
Donovan A. Schneider; David J. DeWitt
Archive | 1992
David J. DeWitt; Jeffrey F. Naughton; Donovan A. Schneider; Sridhar Seshadri