Timothy R. Malkemus | Researchain

Publication

Featured research published by Timothy R. Malkemus.

international conference on management of data | 1996

Fundamental techniques for order optimization

David E. Simmen; Eugene J. Shekita; Timothy R. Malkemus

Decision support applications are growing in popularity as more business data is kept on-line. Such applications typically include complex SQL queries that can test a query optimizer's ability to produce an efficient access plan. Many access plan strategies exploit the physical ordering of data provided by indexes or sorting. Sorting is an expensive operation, however. Therefore, it is imperative that sorting is optimized in some way or avoided altogether. Toward that goal, this paper describes novel optimization techniques for pushing down sorts in joins, minimizing the number of sorting columns, and detecting when sorting can be avoided because of predicates, keys, or indexes. A set of fundamental operations is described that provides the foundation for implementing such techniques. The operations exploit data properties that arise from predicate application, uniqueness, and functional dependencies. These operations and techniques have been implemented in IBM's DB2/CS.
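As a rough illustration of "minimizing the number of sorting columns", the sketch below drops sort columns that cannot affect row order. It rests on two assumptions taken from the abstract's wording: a column bound to a constant by an equality predicate has a single value, and once all columns of a unique key are fixed or sorted on, any further columns are redundant. The function `reduce_order` and its parameters are hypothetical names for illustration, not DB2's implementation.

```python
def reduce_order(sort_cols, constant_cols=frozenset(), keys=()):
    """Return a shorter sort-column list that yields the same row order.

    sort_cols     -- requested sort columns, most significant first
    constant_cols -- columns bound to a constant by an equality predicate
    keys          -- iterable of unique keys, each a tuple of column names
    (All three names are illustrative assumptions.)
    """
    constant_cols = set(constant_cols)
    reduced = []
    for col in sort_cols:
        bound = set(reduced) | constant_cols
        # If a unique key is fully covered, each row is already
        # distinguished, so the remaining columns add nothing.
        if any(set(key) <= bound for key in keys):
            break
        # A constant column, or one already sorted on, is a no-op.
        if col in bound:
            continue
        reduced.append(col)
    return reduced
```

For example, with the predicate `a = 5` and a unique key on `(a, b)`, a request to sort on `a, b, c` reduces to sorting on `b` alone. An empty result means no sort is needed at all.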

international conference on data engineering | 2001

Block oriented processing of relational database operations in modern computer architectures

Sriram Padmanabhan; Timothy R. Malkemus; Anant Jhingran; R. Agarwal

Database systems are not well-tuned to take advantage of modern superscalar processor architectures. In particular, the clocks per instruction (CPI) for rather simple database queries are quite poor compared to scientific kernels or SPEC benchmarks. The lack of performance of database systems has been attributed to poor utilization of caches and processor function units as well as higher branching penalties. In this paper, we argue that a block-oriented processing strategy for database operations can lead to better utilization of the processors and caches, generating significantly higher performance. We have implemented the block-oriented processing technique for aggregation expression evaluation and sorting operations as a feature in the DB2 Universal Database (UDB) system. We present results from representative queries on a 30-GB TPC-H (Transaction Processing Performance Council Benchmark H) database to show the value of this technique.
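The control-flow shape of block-oriented processing can be sketched as follows: rather than pushing one row at a time through every operator, each step runs over a whole block of rows in a tight loop. This is only an illustration of the structure. Python will not reproduce the cache or CPI effects the paper measures, and the helper names and the `price * qty` expression are assumptions made for the example.

```python
from itertools import islice


def blocks(rows, block_size):
    """Yield successive lists of at most block_size rows."""
    it = iter(rows)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def sum_expr_blockwise(rows, block_size=1024):
    """Compute SUM(price * qty) one block at a time."""
    total = 0
    for block in blocks(rows, block_size):
        # Step 1: evaluate the expression for every row in the block.
        values = [price * qty for price, qty in block]
        # Step 2: fold the block's values into the running aggregate.
        total += sum(values)
    return total
```

The result is the same as a row-at-a-time loop. Only the order of work changes, which is what lets a real engine keep each operator's code and data hot in cache.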

very large data bases | 2009

Efficient index compression in DB2 LUW

Bishwaranjan Bhattacharjee; Lipyeow Lim; Timothy R. Malkemus; George A. Mihaila; Kenneth A. Ross; Sherman Lau; Cathy Mcarthur; Zoltan Toth; Reza Sherkat

In database systems, the costs of data storage and retrieval are important components of the total cost and response time of the system. A popular mechanism to reduce the storage footprint is by compressing the data residing in tables and indexes. Compressing indexes efficiently, while maintaining response time requirements, is known to be challenging. This is especially true when designing for a workload spectrum covering both data warehousing and transaction processing environments. DB2 Linux, UNIX, Windows (LUW) recently introduced index compression for use in both environments. This uses techniques that are able to compress index data efficiently while incurring virtually no performance penalty for query processing. On the contrary, for certain operations, the performance is actually better. In this paper, we detail the design of index compression in DB2 LUW and discuss the challenges that were encountered in meeting the design goals. We also demonstrate its effectiveness by showing performance results on typical customer scenarios.

international conference on management of data | 2003

Multi-dimensional clustering: a new data layout scheme in DB2

Sriram Padmanabhan; Bishwaranjan Bhattacharjee; Timothy R. Malkemus; Leslie A. Cranston; Matthew A. Huras

We describe the design and implementation of a new data layout scheme, called multi-dimensional clustering, in DB2 Universal Database Version 8. Many applications, e.g., OLAP and data warehousing, process a table or tables in a database using a multi-dimensional access paradigm. Currently, most database systems can only support organization of a table using a primary clustering index. Secondary indexes are created to access the tables when the primary key index is not applicable. Unfortunately, secondary indexes perform many random I/O accesses against the table for a simple operation such as a range query. Our work in multi-dimensional clustering addresses this important deficiency in database systems. Multi-Dimensional Clustering is based on the definition of one or more orthogonal clustering attributes (or expressions) of a table. The table is organized physically by associating records with similar values for the dimension attributes in a cluster. We describe novel techniques for maintaining this physical layout efficiently and methods of processing database operations that provide significant performance improvements. We show results from experiments using a star-schema database to validate our claims of performance with minimal overhead.

international conference on data engineering | 2007

Increasing Buffer-Locality for Multiple Relational Table Scans through Grouping and Throttling

Christian A. Lang; Bishwaranjan Bhattacharjee; Timothy R. Malkemus; Sriram Padmanabhan; Kwai Wong

Decision support (DSS) workloads generally contain multiple large concurrent scan operations. These are often executed as relational table scans which can take up a lot of I/O bandwidth. This is especially true for ad-hoc queries where the workload is not known in advance. Common database management systems have only limited ability to reuse memory buffer content across multiple running queries due to their treatment of queries in isolation. Previous attempts to coordinate scans for better buffer reuse were less than satisfactory due to drifting between scans and the required radical DBMS architecture changes. In this paper, we describe a new mechanism to keep similar table scans closer together during scanning. This is achieved via dynamic grouping and regrouping of scans based on their runtime behavior and via adaptive throttling of scan speeds based on scan group characteristics. The required memory footprint is very small and the effort required to extend existing database management systems is minimal, as shown in our DB2 UDB prototype. Our experiments show significant gains in end-to-end response times as well as average response times for TPC-H workloads.
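The grouping and throttling idea can be pictured with a small sketch. It assumes each scan reports its current page number, that scans within `max_gap` pages of a neighbour belong to one group, and that every scan ahead of its group's trailing scan is a candidate for slowing down. These rules, and the names `group_scans` and `scans_to_throttle`, are simplifying assumptions for illustration; the paper's actual grouping and throttling policies are not given in the abstract.

```python
def group_scans(positions, max_gap):
    """Group scans whose current pages lie close together.

    positions -- dict mapping scan id to current page number
    max_gap   -- largest page distance between neighbours in one group
    """
    ordered = sorted(positions.items(), key=lambda item: item[1])
    groups = []
    for scan_id, page in ordered:
        if groups and page - groups[-1][-1][1] <= max_gap:
            groups[-1].append((scan_id, page))
        else:
            groups.append([(scan_id, page)])
    return groups


def scans_to_throttle(groups):
    """Return ids of scans running ahead of their group's trailing scan."""
    slow = set()
    for group in groups:
        trailer_page = group[0][1]  # groups are sorted by page
        slow.update(sid for sid, page in group if page > trailer_page)
    return slow
```

Re-running `group_scans` periodically stands in for the dynamic regrouping the abstract mentions: a scan that drifts too far from its group simply lands in a different group on the next pass.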

very large data bases | 2003

Efficient query processing for multi-dimensionally clustered tables in DB2

Bishwaranjan Bhattacharjee; Sriram Padmanabhan; Timothy R. Malkemus; Tony Wen Hsun Lai; Leslie A. Cranston; Matthew A. Huras

We have introduced a Multi-Dimensional Clustering (MDC) physical layout scheme in DB2 version 8.0 for relational tables. Multi-Dimensional Clustering is based on the definition of one or more orthogonal clustering attributes (or expressions) of a table. The table is organized physically by associating records with similar values for the dimension attributes in a cluster. Each clustering key is allocated one or more blocks of physical storage with the aim of storing the multiple records belonging to the cluster in almost contiguous fashion. Block oriented indexes are created to access these blocks. In this paper, we describe novel techniques for query processing operations that provide significant performance improvements for MDC tables. Current database systems employ a repertoire of access methods including table scans, index scans, index ANDing, and index ORing. We have extended these access methods for efficiently processing the block based MDC tables. One important concept at the core of processing MDC tables is the block oriented access technique. In addition, since MDC tables can include regular record oriented indexes, we employ novel techniques to combine block and record indexes. Block oriented processing is extended to nested loop joins and star joins as well. We show results from experiments using a star-schema database to validate our claims of performance with minimal overhead.
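A toy model may help picture the layout described here. In the sketch below, each block holds records from exactly one cell (one combination of dimension values), and a per-dimension block index maps a value to the blocks containing it. A lookup on several dimensions intersects the block id sets, in the spirit of index ANDing. The class `MdcTable`, its fields, and the tiny block capacity are illustrative assumptions and do not reflect DB2's storage format.

```python
from collections import defaultdict


class MdcTable:
    """Toy multi-dimensionally clustered table (illustrative only)."""

    def __init__(self, dims, block_capacity=4):
        self.dims = list(dims)
        self.block_capacity = block_capacity
        self.blocks = []                      # block id -> list of records
        self.cell_blocks = defaultdict(list)  # cell -> block ids, oldest first
        # One block index per dimension: value -> set of block ids.
        self.block_index = {d: defaultdict(set) for d in self.dims}

    def insert(self, record):
        cell = tuple(record[d] for d in self.dims)
        ids = self.cell_blocks[cell]
        # Allocate a new block when the cell has none or its last is full.
        if not ids or len(self.blocks[ids[-1]]) >= self.block_capacity:
            self.blocks.append([])
            block_id = len(self.blocks) - 1
            ids.append(block_id)
            for dim, value in zip(self.dims, cell):
                self.block_index[dim][value].add(block_id)
        self.blocks[ids[-1]].append(record)

    def lookup(self, **preds):
        """Return records matching equality predicates on dimensions."""
        candidate = None
        for dim, value in preds.items():
            ids = self.block_index[dim].get(value, set())
            # Block index ANDing: intersect block id sets.
            candidate = set(ids) if candidate is None else candidate & ids
        if candidate is None:
            candidate = range(len(self.blocks))
        # Every record in a qualifying block matches, so no row re-check.
        return [rec for b in sorted(candidate) for rec in self.blocks[b]]
```

Because the index entries point at blocks rather than individual rows, the indexes stay small and a qualifying block can be read in full without filtering.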

international conference on data engineering | 2005

Predicate derivation and monotonicity detection in DB2 UDB

Timothy R. Malkemus; Sriram Padmanabhan; Bishwaranjan Bhattacharjee; Leslie A. Cranston; T. Lai; F. Koo

DB2 Universal Database allows database schema designers to specify generated columns. These generated columns are useful for maintaining rollup hierarchy variables in warehouses (e.g., date, month, quarter). In order for the generated columns to be useful for query processing, queries must automatically make use of such columns when applicable. In particular, query predicates on the original columns should be rewritten to make use of the generated columns. In this paper, we describe two main aspects of this predicate rewriting technique that allows usage of the generated columns for a variety of query predicate types. The first aspect, monotonicity detection, allows for rewrites in the case of range predicates. The second aspect, predicate derivation, is the technique for using generating expressions for query processing. We show the value of this technique for providing significant performance improvement when combined with indexing or multidimensional clustering in DB2.
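The reasoning behind such rewrites can be sketched briefly. If a generated column is computed as `g = f(col)`, then `col = c` always implies `g = f(c)`. If `f` is also monotonically non-decreasing, a range predicate such as `col > c` implies `g >= f(c)` (non-strict, since distinct inputs may map to the same output). The sketch below takes monotonicity as a given flag, because the abstract does not say how DB2 detects it. The names `derive_predicate` and `month_of` are hypothetical, and the derived predicate is assumed to be added alongside the original predicate rather than replacing it.

```python
import datetime as dt


def month_of(d):
    """Example generating expression: non-decreasing in the date."""
    return d.year * 100 + d.month


def derive_predicate(op, constant, gen_expr, monotonic):
    """Derive a predicate on a generated column from `col <op> constant`.

    Returns (op, value) for the generated column, or None when no
    safe derivation exists under these simplified rules.
    """
    if op == "=":
        # Holds for any deterministic generating expression.
        return ("=", gen_expr(constant))
    if monotonic and op in (">", ">="):
        return (">=", gen_expr(constant))
    if monotonic and op in ("<", "<="):
        return ("<=", gen_expr(constant))
    return None
```

For instance, `date > 2004-03-15` yields the extra predicate `month >= 200403`, which an index or clustering dimension on `month` can then use to skip irrelevant data.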

extending database technology | 1996

Fundamental Techniques for Order Optimization

David E. Simmen; Eugene J. Shekita; Timothy R. Malkemus

This paper briefly describes some of the novel techniques used by the query optimizer of IBM's DB2 to process and optimize the way order requirements are satisfied.

international conference on data engineering | 2007

Poster Session: Improved Buffer Size Adaptation through Cache/Controller Coupling

Christian A. Lang; Bishwaranjan Bhattacharjee; Timothy R. Malkemus; I. Slanoi

Database workloads seldom remain static. A system tuned by an expert for the current environment might not always remain optimal. To deal with this situation, database systems have been incorporating self-tuning features. An important component is self-tuning memory or bufferpools. These often work on a feedback loop and take some time to converge to an optimal state. The transition to the optimal state could be accelerated if future access patterns could be taken into account when making decisions on the bufferpool size. In this paper, we describe a caching algorithm for scans on bufferpools, which keeps track of ongoing scans and the state of each scan. The proposed algorithm results in a better hit ratio and, more importantly, provides a way to predict future access patterns. This property is beneficial for providing feedback to a self-tuning memory controller to make better allocation decisions.

Archive | 2002