Peter Zabback
Microsoft
Publication
Featured research published by Peter Zabback.
very large data bases | 1998
Peter Scheuermann; Gerhard Weikum; Peter Zabback
Abstract. Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper, we discuss the main issues in performance tuning of such systems, namely striping and load balancing, and show their relationship to response time and throughput. We outline the main components of an intelligent, self-reliant file system that aims to optimize striping by taking into account the requirements of the applications, and performs load balancing by judicious file allocation and dynamic redistributions of the data when access patterns change. Our system uses simple but effective heuristics that incur only little overhead. We present performance experiments based on synthetic workloads and real-life traces.
Information Systems | 1994
Gerhard Weikum; Christof Hasse; Axel Mönkeberg; Peter Zabback
Abstract. This paper reports on results and experiences from the COMFORT automatic tuning project. The objective of the project has been to investigate architectural principles of self-tuning database and transaction processing systems, and to develop self-tuning methods for specific performance tuning problems. A particular concern of the project has been to cope with workload dynamics and workload heterogeneity in multi-user systems. As a general guideline, an adaptive feedback control approach has been adopted, where observations of the current load characteristics are used to predict performance trends and to drive the dynamic adjustment of tuning parameters. As examples of these general principles, the paper discusses adaptive approaches to two specific tuning problems and the developed solutions. First, we present a self-tuning load control method that copes with overload caused by excessive lock conflicts that may occur during load surges. This conflict-driven load control method adapts the multiprogramming level of the system to the evolving load characteristics dynamically and automatically. Second, we discuss the self-tuning LRU-K database buffering method, which makes intelligent buffering decisions by dynamically tracking the access frequency of pages, thus coping well with evolving access patterns. We discuss the rationale of these two self-tuning methods in great detail and we present comprehensive performance evaluation experiments for both synthetic and trace-driven workloads.
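The LRU-K idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a simplified variant: the buffer evicts the resident page whose K-th most recent reference is oldest, and pages with fewer than K references are evicted first. The class name `LRUKBuffer` and its methods are hypothetical, and refinements of the published method such as correlated reference periods and bounded history retention are left out.

```python
from collections import deque
from itertools import count


class LRUKBuffer:
    """Simplified LRU-K buffer (illustrative sketch, not the COMFORT code)."""

    def __init__(self, capacity, k=2):
        self.capacity = capacity
        self.k = k
        self.clock = count()      # logical time: one tick per reference
        self.history = {}         # page -> deque of its last k reference times
        self.resident = set()     # pages currently held in the buffer

    def _priority(self, page):
        times = self.history[page]
        # Pages with fewer than k references are treated as having an infinitely
        # old k-th reference, so they sort first; ties fall back to plain LRU.
        kth = times[0] if len(times) == self.k else -1
        return (kth, times[-1])

    def access(self, page):
        """Reference a page and return the evicted page, or None."""
        now = next(self.clock)
        # Assumption: history is kept for evicted pages too and never pruned.
        self.history.setdefault(page, deque(maxlen=self.k)).append(now)
        if page in self.resident:
            return None
        victim = None
        if len(self.resident) >= self.capacity:
            victim = min(self.resident, key=self._priority)
            self.resident.remove(victim)
        self.resident.add(page)
        return victim
```

With a capacity of two and the reference string a, a, b, c, this sketch evicts b rather than a, because a has already been referenced twice while b has been seen only once. Plain LRU would evict a.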
very large data bases | 2009
Mohamed H. Ali; C. Gerea; Balan Sethu Raman; Beysim Sezgin; T. Tarnavski; Tomer Verona; Ping Wang; Peter Zabback; Asvin Ananthanarayan; Anton Kirilov; M. Lu; Alex Raizman; R. Krishnan; Roman Schindlauer; Torsten Grabs; S. Bjeletich; Badrish Chandramouli; Jonathan Goldstein; S. Bhat; Ying Li; V. Di Nicola; Xiaoyang Sean Wang; David Maier; S. Grell; O. Nano; Ivo Santos
In this demo, we present the Microsoft Complex Event Processing (CEP) Server, Microsoft CEP for short. Microsoft CEP is an event stream processing system distinguished by its declarative query language and its multiple consistency levels of stream query processing. Query composability, query fusing, and operator sharing are key features in the Microsoft CEP query processor. Moreover, the debugging and supportability tools of Microsoft CEP provide visibility of system internals to users. Web click analysis has been crucial to behavior-based online marketing. Streams of web click events provide a typical workload for a CEP server. Meanwhile, a CEP server with its processing capabilities plays a key role in web click analysis. This demo highlights the features of Microsoft CEP under a workload of web click events.
FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms | 1993
Peter Scheuermann; Gerhard Weikum; Peter Zabback
Large arrays of small disks are providing an attractive approach for high performance I/O systems. In order to make effective use of disk arrays and other multi-disk architectures, it is necessary to develop intelligent software tools that allow automatic tuning of the disk arrays to varying workloads. In this paper we describe an adaptive method for data allocation and load balancing in disk arrays. Our method deals with dynamically changing access frequencies of files by reallocating file extents, thus "cooling down" hot disks. In addition, the method takes into account the fact that some files may exhibit periodical access patterns, and considers explicitly the cost of performing the "cooling" operations. Preliminary performance studies based on real-life I/O traces demonstrate the effectiveness of this approach.
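One cooling step of the kind described in the abstract above might look like the following sketch. It is a greedy illustration under stated assumptions, not the authors' algorithm: "heat" is taken to be an access-frequency estimate per extent, the function name `cool_once` is hypothetical, and `migration_cost` is an assumed threshold expressed in the same units as heat.

```python
def cool_once(disks, migration_cost=0.0):
    """Move one extent from the hottest disk to the coldest disk if it pays off.

    `disks` maps a disk name to a dict {extent name: heat}. The mapping is
    updated in place. Returns (extent, source, target), or None if no move
    lowers the peak heat by more than `migration_cost`.
    """
    heat = {name: sum(extents.values()) for name, extents in disks.items()}
    hot = max(heat, key=heat.get)
    cold = min(heat, key=heat.get)

    best, best_gain = None, 0.0
    for extent, h in disks[hot].items():
        # Peak heat of the two disks after the move, compared with the current peak.
        new_peak = max(heat[hot] - h, heat[cold] + h)
        gain = heat[hot] - new_peak
        if gain > best_gain:
            best, best_gain = extent, gain

    # Assumed cost/benefit rule: skip moves whose gain does not exceed the cost.
    if best is None or best_gain <= migration_cost:
        return None
    disks[cold][best] = disks[hot].pop(best)
    return best, hot, cold
```

Calling `cool_once` repeatedly until it returns `None` gives a simple rebalancing loop. A real system would also have to refresh the heat estimates as access patterns change.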
international conference on management of data | 2007
Per-Ake Larson; Wolfgang Lehner; Jingren Zhou; Peter Zabback
Accurate cardinality estimation is critically important to high-quality query optimization. It is well known that conventional cardinality estimation based on histograms or similar statistics may produce extremely poor estimates in a variety of situations, for example, queries with complex predicates, correlation among columns, or predicates containing user-defined functions. In this paper, we propose a new, general cardinality estimation technique that combines random sampling and materialized view technology to produce accurate estimates even in these situations. As a major innovation, we exploit feedback information from query execution and process control techniques to assure that estimates remain statistically valid when the underlying data changes. Experimental results based on a prototype implementation in Microsoft SQL Server demonstrate the practicality of the approach and illustrate the dramatic effects improved cardinality estimates may have.
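The sampling half of the technique in the abstract above can be sketched as follows. This is only a plain uniform-sampling estimator under assumed names (`estimate_cardinality`, `rows`, `predicate`); it does not model the materialized sample views, the execution feedback, or the process control techniques the paper describes.

```python
import random


def estimate_cardinality(rows, predicate, sample_size, rng=None):
    """Estimate how many rows satisfy `predicate` from a uniform random sample."""
    rng = rng or random.Random()
    sample = rng.sample(rows, min(sample_size, len(rows)))
    if not sample:
        return 0.0
    hits = sum(1 for row in sample if predicate(row))
    # Scale the sample selectivity up to the full table size.
    return len(rows) * hits / len(sample)


# Correlated columns: the second value is fully determined by the first.
rows = [(i, i % 10) for i in range(1000)]
estimate = estimate_cardinality(rows, lambda r: r[0] < 100 and r[1] == 0, 200)
```

Because the predicate is evaluated directly on sampled rows, correlation between columns or an opaque user-defined function does not bias the estimate, although a small sample can still give a noisy one.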
international workshop on research issues in data engineering | 1992
Gerhard Weikum; Peter Zabback
Striping files across the disks of a disk array is a promising approach to improve the I/O performance of data management systems. An important tuning parameter of this method is the striping unit, that is, the maximum number of logically consecutive blocks that are allocated on one disk. The striping unit determines the degree of parallelism in servicing a request by multiple disks, and it affects the achievable throughput of I/O requests. Since a good choice of a file's striping unit depends on the file's access characteristics such as average request size, it is proposed that file-specific striping units be chosen rather than choosing the same global striping unit for all files. The paper presents a method for tuning file-specific striping units, based on the access characteristics of the individual files and the throughput requirements of the application. Performance experiments are presented, based on a synthetic benchmark that was run on the file system prototype FIVE and a simulation testbed for disk arrays. The experiments indicate significant performance gains of file-specific striping units compared to an optimally chosen global striping unit.
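To make the role of the striping unit concrete, the sketch below maps logical blocks to disks under plain round-robin striping and counts how many disks serve a request. The function names and the round-robin layout are assumptions for illustration; the sketch does not reproduce the paper's tuning method or the FIVE prototype.

```python
def locate_block(logical_block, striping_unit, num_disks):
    """Map a file's logical block number to (disk index, block offset on that disk)."""
    unit_index, offset_in_unit = divmod(logical_block, striping_unit)
    disk = unit_index % num_disks            # units are dealt out round-robin
    unit_on_disk = unit_index // num_disks   # units already placed on that disk
    return disk, unit_on_disk * striping_unit + offset_in_unit


def disks_touched(first_block, request_size, striping_unit, num_disks):
    """Count the distinct disks that serve a request for consecutive blocks."""
    blocks = range(first_block, first_block + request_size)
    return len({locate_block(b, striping_unit, num_disks)[0] for b in blocks})
```

For an 8-block request on three disks, a striping unit of 1 spreads the work over all three disks, a unit of 4 uses two disks, and a unit of 8 keeps the request on a single disk. Small units favor intra-request parallelism, while large units leave more disks free for other requests.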
very large data bases | 2008
Srini Acharya; Peter Carlin; Cesar A. Galindo-Legaria; Krzysztof Kozielczyk; Pawel Terlecki; Peter Zabback
Efficient support for applications that deal with data heterogeneity, hierarchies and schema evolution is an important challenge for relational engines. In this paper we show how this flexibility can be handled in Microsoft SQL Server. For this purpose, the engine has been equipped with an integrated package of relational extensions. The package includes sparse storage, column set operations, filtered indices, filtered statistics and hierarchy querying with OrdPath labeling. In addition, economical loading of metadata allows us to answer queries independently of the number of columns in a table and drastically improves scaling capabilities. The design of a prototypical content and collaboration application based on a wide table is described, along with experiments validating its performance.
international conference on parallel and distributed information systems | 1993
Gerhard Weikum; Christof Hasse; A. Moenkeberg; Michael Rys; Peter Zabback
The COMFORT project, which is intended to automate the performance tuning of database systems, is discussed. The project addresses several important tuning issues of parallel database systems in a multiuser environment, and it aims to develop general architectural principles for a self-tuning database system. The overall goals and the rationale are examined, and a brief overview of the current work is given.
international conference on data engineering | 2008
Cesar A. Galindo-Legaria; Torsten Grabs; Sreenivas Gukal; Steve Herbert; Aleksandras Surna; Shirley Wang; Wei Yu; Peter Zabback; Shin Zhang
As mainstream data warehouses are growing into the multi-terabyte range, adequate performance for decision support queries remains challenging for database query processors. Proper choice of query plan is essential in data warehouses where fact tables often store billions of rows. This paper discusses query optimization and execution strategies that Microsoft SQL Server employs for decision support queries in dimensionally modeled relational data warehouses. Our approach is based on pattern matching to detect typical star query patterns. When matching the pattern, the optimizer generates additional query plan alternatives specifically optimized for data warehouse performance. For high selectivity queries, the plans use nested loops joins and seeks. Medium selectivity queries in turn rely on right-deep hash joins with bitmap filters. Bitmap filters perform semi-join reductions to efficiently prune out non-qualifying rows early. Final plan choice is left for cost-based optimization which also compares the data warehouse specific plans against conventional query plans. We conducted an extensive experimental investigation using both synthetic workloads and several customer workloads. As our results show, the new plan shapes and execution strategies yield significant performance improvements across the targeted workloads as compared to earlier versions of Microsoft SQL Server.
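The semi-join reduction performed by a bitmap filter, as mentioned in the abstract above, can be sketched in a few lines. This is an illustration under assumptions, not SQL Server's implementation: the function names, the column names `id` and `dim_id`, and the single-hash bitmap of `num_bits` bits are all hypothetical.

```python
def build_bitmap(keys, num_bits):
    """Set one bit per key; distinct keys may collide on the same bit."""
    bitmap = 0
    for key in keys:
        bitmap |= 1 << (hash(key) % num_bits)
    return bitmap


def star_join(fact_rows, dim_rows, dim_predicate, num_bits=1024):
    """Join fact rows to the qualifying dimension rows, pruning early with a bitmap."""
    # Build side: hash table of dimension rows that pass the filter.
    qualifying = {d["id"]: d for d in dim_rows if dim_predicate(d)}
    bitmap = build_bitmap(qualifying, num_bits)
    for row in fact_rows:
        # Cheap bitmap probe: drops most non-qualifying fact rows before the join.
        if not (bitmap >> (hash(row["dim_id"]) % num_bits)) & 1:
            continue
        # The hash lookup removes any false positives the bitmap let through.
        dim = qualifying.get(row["dim_id"])
        if dim is not None:
            yield {**row, **dim}
```

The bitmap may admit rows that do not qualify but never rejects one that does, so the join result is unchanged and only the amount of work reaching the join shrinks.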
IEEE Transactions on Knowledge and Data Engineering | 1998
Peter Zabback; I. Onyuksel; Peter Scheuermann; G. Weikum
We present a model for data reorganization in parallel disk systems that is geared toward load balancing in an environment with periodic access patterns. Data reorganization is performed by disk cooling, i.e., migrating files or extents from the hottest disks to the coldest ones. We develop an approximate queueing model for determining the effective arrival rates of cooling requests and discuss its use in assessing the costs versus benefits of cooling actions.