Goetz Graefe | Researchain

Publication

Featured researches published by Goetz Graefe.

ACM Computing Surveys | 1993

Query evaluation techniques for large databases

Goetz Graefe

Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate the problem: In order to manipulate large sets of complex objects as efficiently as today's database systems manipulate simple records, query-processing algorithms and software will become more complex, and a solid understanding of algorithmic and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.

international conference on management of data | 1990

Encapsulation of parallelism in the Volcano query processing system

Goetz Graefe

Volcano is a new dataflow query processing system we have developed for database systems research and education. The uniform interface between operators makes Volcano extensible by new operators. All operators are designed and coded as if they were meant for a single-process system only. When attempting to parallelize Volcano, we had to choose between two models of parallelization, called here the bracket and operator models. We describe the reasons for not choosing the bracket model, introduce the novel operator model, and provide details of Volcano's exchange operator that parallelizes all other operators. It allows intra-operator parallelism on partitioned datasets and both vertical and horizontal inter-operator parallelism. The exchange operator encapsulates all parallelism issues and therefore makes implementation of parallel database algorithms significantly easier and more robust. Included in this encapsulation is the translation between demand-driven dataflow within processes and data-driven dataflow between processes. Since the interface between Volcano operators is similar to the one used in “real,” commercial systems, the techniques described here can be used to parallelize other query processing engines.
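The idea of an exchange operator sitting between otherwise single-process operators can be illustrated with a minimal sketch. It is not Volcano's actual code: the class names `Scan`, `Filter`, and `Exchange`, the `run` helper, and the queue capacity are hypothetical, threads stand in for processes, and `None` is assumed to mark end of stream (so rows themselves must not be `None`).

```python
import queue
import threading

class Scan:
    """Leaf operator over an in-memory partition (illustrative only)."""
    def __init__(self, rows):
        self.rows = rows
    def open(self):
        self.pos = 0
    def next(self):
        if self.pos >= len(self.rows):
            return None  # end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row
    def close(self):
        pass

class Filter:
    """Single-process operator; it knows nothing about parallelism."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate
    def open(self):
        self.child.open()
    def next(self):
        # Demand-driven: pull from the child until a row qualifies.
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None
    def close(self):
        self.child.close()

_END = object()  # per-producer end-of-stream marker

class Exchange:
    """Runs each child subplan in its own thread and merges the outputs."""
    def __init__(self, children, capacity=4):
        self.children = children
        self.capacity = capacity
    def open(self):
        self.buffer = queue.Queue(maxsize=self.capacity)
        self.live = len(self.children)
        self.threads = [threading.Thread(target=self._produce, args=(c,))
                        for c in self.children]
        for t in self.threads:
            t.start()
    def _produce(self, child):
        # Data-driven side: push rows as fast as the bounded buffer allows.
        child.open()
        while (row := child.next()) is not None:
            self.buffer.put(row)
        child.close()
        self.buffer.put(_END)
    def next(self):
        # Demand-driven side: the consumer pulls one row at a time.
        while self.live > 0:
            item = self.buffer.get()
            if item is _END:
                self.live -= 1
                continue
            return item
        return None
    def close(self):
        # Assumes the consumer drained the stream before closing.
        for t in self.threads:
            t.join()

def run(plan):
    """Drive any plan through the open/next/close interface."""
    plan.open()
    out = []
    while (row := plan.next()) is not None:
        out.append(row)
    plan.close()
    return out
```

Because `Exchange` exposes the same open/next/close interface as every other operator, a plan such as `Exchange([Filter(Scan(p), pred) for p in partitions])` parallelizes the filter over partitions without any change to `Filter` itself. Row order across partitions is not deterministic in this sketch.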

international conference on data engineering | 1993

The Volcano optimizer generator: extensibility and efficient search

Goetz Graefe; William J. McKenna

The Volcano project, which provides efficient, extensible tools for query and request processing, particularly for object-oriented and scientific database systems, is reviewed. In particular, one of its tools, the optimizer generator, is discussed. The data model, logical algebra, physical algebra, and optimization rules are translated by the optimizer generator into optimizer source code. It is shown that, compared with the EXODUS optimizer generator prototype, the search engine of the Volcano optimizer generator is more extensible and powerful. It provides effective support for non-trivial cost models and for physical properties such as sorting order. At the same time, it is much more efficient, as it combines dynamic programming with goal-directed searching and branch-and-bound pruning. Compared with other rule-based optimization systems, it provides complete data model independence and more natural extensibility.

international conference on management of data | 1995

Multi-table joins through bitmapped join indices

Patrick E. O'Neil; Goetz Graefe

This technical note shows how to combine some well-known techniques to create a method that will efficiently execute common multi-table joins. We concentrate on a commonly occurring type of join known as a star-join, although the method presented will generalize to any type of multi-table join. A star-join consists of a central detail table with large cardinality, such as an orders table (where an order row contains a single purchase) with foreign keys that join to descriptive tables, such as customers, products, and (sales) agents. The method presented in this note uses join indices with compressed bitmap representations, which allow predicates restricting columns of descriptive tables to determine an answer set (or foundset) in the central detail table; the method uses different predicates on different descriptive tables in combination to restrict the detail table through compressed bitmap representations of join indices, and easily completes the join of the fully restricted detail table rows back to the descriptive tables. We outline realistic examples where the combination of these techniques yields substantial performance improvements over alternative, more traditional query evaluation plans.
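A small sketch may help make the star-join method concrete. It is an illustration under stated assumptions, not the note's implementation: the tables, column names, and helper functions are invented, and plain Python integers serve as uncompressed bitmaps (bit i set means detail row i qualifies), whereas the note relies on compressed bitmap representations.

```python
# Hypothetical descriptive (dimension) tables keyed by primary key.
customers = {
    1: {"name": "Ann", "city": "Boston"},
    2: {"name": "Bob", "city": "Denver"},
    3: {"name": "Cat", "city": "Boston"},
}
products = {
    10: {"name": "bolt", "category": "hardware"},
    11: {"name": "pen", "category": "office"},
}
# Hypothetical central detail table; the list position is the row id.
orders = [
    {"customer": 1, "product": 10, "qty": 5},
    {"customer": 2, "product": 10, "qty": 1},
    {"customer": 3, "product": 11, "qty": 2},
    {"customer": 3, "product": 10, "qty": 7},
    {"customer": 1, "product": 11, "qty": 4},
]

def build_join_index(detail, fk):
    """Map each foreign-key value to a bitmap of matching detail rows."""
    index = {}
    for row_id, row in enumerate(detail):
        index[row[fk]] = index.get(row[fk], 0) | (1 << row_id)
    return index

def restrict(join_index, dimension, predicate):
    """OR together the bitmaps of all dimension rows passing the predicate."""
    bitmap = 0
    for key, dim_row in dimension.items():
        if predicate(dim_row):
            bitmap |= join_index.get(key, 0)
    return bitmap

def row_ids(bitmap):
    """Yield the positions of the set bits in ascending order."""
    row_id = 0
    while bitmap:
        if bitmap & 1:
            yield row_id
        bitmap >>= 1
        row_id += 1

def star_join(city, category):
    by_customer = build_join_index(orders, "customer")
    by_product = build_join_index(orders, "product")
    # AND the per-dimension bitmaps to get the foundset in the detail table.
    foundset = (restrict(by_customer, customers, lambda c: c["city"] == city)
                & restrict(by_product, products,
                           lambda p: p["category"] == category))
    # Complete the join from the restricted detail rows back to dimensions.
    return [(customers[orders[i]["customer"]]["name"],
             products[orders[i]["product"]]["name"],
             orders[i]["qty"])
            for i in row_ids(foundset)]
```

In this sketch the join indices are rebuilt on every call for brevity; a real system would build and maintain them ahead of time so that a query touches only bitmaps until the foundset is known.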

international conference on management of data | 1987

The EXODUS optimizer generator

Goetz Graefe; David J. DeWitt

This paper presents the design and an initial performance evaluation of the query optimizer generator designed for the EXODUS extensible database system. Algebraic transformation rules are translated into an executable query optimizer, which transforms query trees and selects methods for executing operations according to cost functions associated with the methods. The search strategy avoids exhaustive search and it modifies itself to take advantage of past experience. Computational results show that an optimizer generated for a relational system produces access plans almost as good as those produced by exhaustive search, with the search time cut to a small fraction.

IEEE Transactions on Knowledge and Data Engineering | 1994

Volcano - an extensible and parallel query evaluation system

Goetz Graefe

To investigate the interactions of extensibility and parallelism in database query processing, we have developed a new dataflow query execution system called Volcano. The Volcano effort provides a rich environment for research and education in database systems design, heuristics for query optimization, parallel query execution, and resource allocation. Volcano uses a standard interface between algebra operators, allowing easy addition of new operators and operator implementations. Operations on individual items, e.g., predicates, are imported into the query processing operators using support functions. The semantics of support functions is not prescribed; any data type including complex objects and any operation can be realized. Thus, Volcano is extensible with new operators, algorithms, data types, and type-specific methods. Volcano includes two novel meta-operators. The choose-plan meta-operator supports dynamic query evaluation plans that allow delaying selected optimization decisions until run-time, e.g., for embedded queries with free variables. The exchange meta-operator supports intra-operator parallelism on partitioned datasets and both vertical and horizontal inter-operator parallelism, translating between demand-driven dataflow within processes and data-driven dataflow between processes. All operators, with the exception of the exchange operator, have been designed and implemented in a single-process environment, and parallelized using the exchange operator. Even operators not yet designed can be parallelized using this new operator if they use and provide the iterator interface. Thus, the issues of data manipulation and parallelism have become orthogonal, making Volcano the first implemented query execution engine that effectively combines extensibility and parallelism.

OODS '86 Proceedings on the 1986 international workshop on Object-oriented database systems | 1986

The architecture of the EXODUS extensible DBMS

Michael J. Carey; David J. DeWitt; Daniel Frank; M. Muralikrishna; Goetz Graefe; Joel E. Richardson; Eugene J. Shekita

With non-traditional application areas such as engineering design, image/voice data management, scientific/statistical applications, and artificial intelligence systems all clamoring for ways to store and efficiently process larger and larger volumes of data, it is clear that traditional database technology has been pushed to its limits. It also seems clear that no single database system will be capable of simultaneously meeting the functionality and performance requirements of such a diverse set of applications. In this paper we describe the preliminary design of EXODUS, an extensible database system that will facilitate the fast development of high-performance, application-specific database systems. EXODUS provides certain kernel facilities, including a versatile storage manager and a type manager. In addition, it provides an architectural framework for building application-specific database systems, tools to partially automate the generation of such systems, and libraries of software components (e.g., access methods) that are likely to be useful for many application domains.

international conference on management of data | 1989

Dynamic query evaluation plans

Goetz Graefe; Karen Ward

In most database systems, a query embedded in a program written in a conventional programming language is optimized when the program is compiled. The query optimizer must make assumptions about the values of the program variables that appear as constants in the query, the resources that can be committed to query evaluation, and the data in the database. The optimality of the resulting query evaluation plan depends on the validity of these assumptions. If a query evaluation plan is used repeatedly over an extended period of time, it is important to determine when reoptimization is necessary. Our work aims at developing criteria when reoptimization is required, how these criteria can be implemented efficiently, and how reoptimization can be avoided by using a new technique called dynamic query evaluation plans. We experimentally demonstrate the need for dynamic plans and outline modifications to the EXODUS optimizer generator required for creating dynamic query evaluation plans.

international conference on management of data | 1994

Optimization of dynamic query evaluation plans

Richard L. Cole; Goetz Graefe

Traditional query optimizers assume accurate knowledge of run-time parameters such as selectivities and resource availability during plan optimization, i.e., at compile time. In reality, however, this assumption is often not justified. Therefore, the “static” plans produced by traditional optimizers may not be optimal for many of their actual run-time invocations. Instead, we propose a novel optimization model that assigns the bulk of the optimization effort to compile-time and delays carefully selected optimization decisions until run-time. Our previous work defined the run-time primitives, “dynamic plans” using “choose-plan” operators, for executing such delayed decisions, but did not solve the problem of constructing dynamic plans at compile-time. The present paper introduces techniques that solve this problem. Experience with a working prototype optimizer demonstrates (i) that the additional optimization and start-up overhead of dynamic plans compared to static plans is dominated by their advantage at run-time, (ii) that dynamic plans are as robust as the “brute-force” remedy of run-time optimization, i.e., dynamic plans maintain their optimality even if parameters change between compile-time and run-time, and (iii) that the start-up overhead of dynamic plans is significantly less than the time required for complete optimization at run-time. In other words, our proposed techniques are superior to both techniques considered to date, namely compile-time optimization into a single static plan as well as run-time optimization. Finally, we believe that the concepts and technology described can be transferred to commercial query optimizers in order to improve the performance of embedded queries with host variables in the query predicate and to adapt to run-time system loads unpredictable at compile time.
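The run-time half of this model, a choose-plan operator that picks among alternatives prepared at compile time once the host variable is bound, can be sketched roughly as follows. Everything here is assumed for illustration: the `ChoosePlan` class, the two alternative plans, the uniform-data selectivity estimate, and the cost constants are hypothetical and are not taken from the paper.

```python
import bisect

# Hypothetical sorted column; it doubles as its own index for this sketch.
VALUES = list(range(1000))

def file_scan(limit):
    """Alternative 1: read every value and apply the predicate."""
    return [v for v in VALUES if v < limit]

def index_scan(limit):
    """Alternative 2: locate the cutoff with binary search."""
    return VALUES[:bisect.bisect_left(VALUES, limit)]

def file_scan_cost(limit):
    # Assumed cost: one unit per value, independent of the binding.
    return 1.0 * len(VALUES)

def index_scan_cost(limit):
    # Assumed cost: four units per matching value (random access),
    # with matches estimated from the binding for uniform data.
    matches = max(0, min(limit, len(VALUES)))
    return 4.0 * matches

class ChoosePlan:
    """Holds compile-time alternatives; decides when the binding is known."""
    def __init__(self, alternatives):
        self.alternatives = alternatives  # list of (name, cost_fn, plan_fn)
    def choose(self, binding):
        return min(self.alternatives, key=lambda alt: alt[1](binding))
    def run(self, binding):
        name, _, plan_fn = self.choose(binding)
        return name, plan_fn(binding)

# Built once, at "compile time", for the query: value < :limit
dynamic_plan = ChoosePlan([
    ("file_scan", file_scan_cost, file_scan),
    ("index_scan", index_scan_cost, index_scan),
])
```

With these assumed costs, a selective binding such as `dynamic_plan.run(10)` picks the index scan, while an unselective one such as `dynamic_plan.run(900)` falls back to the file scan. Both alternatives return the same rows; only the start-up decision differs.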

symposium on applied computing | 1991

Data compression and database performance

Goetz Graefe; Leonard D. Shapiro

Data compression is widely used in data management to save storage space and network bandwidth. The authors outline the performance improvements that can be achieved by exploiting data compression in query processing. The novel idea is to leave data in compressed state as long as possible, and to only uncompress data when absolutely necessary. They show that many query processing algorithms can manipulate compressed data just as well as decompressed data, and that processing compressed data can speed query processing by a factor much larger than the compression factor.
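One way to picture operating on data in its compressed state is the sketch below. It uses dictionary encoding, chosen here purely for illustration and not necessarily the scheme the authors evaluate; the function names and sample column are hypothetical. Equality selection and duplicate elimination run on the codes, and values are decoded only for the final output.

```python
def build_dictionary(values):
    """Assign a small integer code to each distinct value."""
    return {v: code for code, v in enumerate(sorted(set(values)))}

def encode(values, dictionary):
    """Compress a column by replacing each value with its code."""
    return [dictionary[v] for v in values]

def select_equal(codes, dictionary, constant):
    """Equality selection on codes: encode the constant once, compare codes."""
    code = dictionary.get(constant)
    if code is None:
        return []  # the constant does not occur in the column at all
    return [row_id for row_id, c in enumerate(codes) if c == code]

def distinct_decoded(codes, dictionary):
    """Eliminate duplicates on codes; decode only the surviving values."""
    reverse = {code: v for v, code in dictionary.items()}
    return [reverse[c] for c in sorted(set(codes))]

# Hypothetical column of city names.
cities = ["Portland", "Boulder", "Portland", "Madison", "Boulder", "Portland"]
city_dict = build_dictionary(cities)
city_codes = encode(cities, city_dict)
```

Here `select_equal(city_codes, city_dict, "Portland")` compares small integers rather than strings and never decodes a row, which is the sense in which the predicate is evaluated on compressed data.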
