Knut Stolze | Researchain

Publication

Featured research published by Knut Stolze.

Business Intelligence for the Real-Time Enterprises | 2011

Blink: Not Your Father's Database!

Ronald J. Barber; Peter Bendel; Marco Czech; Oliver Draese; Frederick Ho; Namik Hrle; Stratos Idreos; Min-Soo Kim; Oliver Koeth; Jae-Gil Lee; Tianchao Tim Li; Guy M. Lohman; Konstantinos Morfonios; Rene Mueller; Keshava Murthy; Ippokratis Pandis; Lin Qiao; Vijayshankar Raman; Sandor Szabo; Richard S. Sidle; Knut Stolze

The Blink project’s ambitious goals are to answer all Business Intelligence (BI) queries in mere seconds, regardless of the database size, with an extremely low total cost of ownership. It takes a very innovative and counter-intuitive approach to processing BI queries, one that exploits several disruptive hardware and software technology trends. Specifically, it is a new, workload-optimized DBMS aimed primarily at BI query processing, and exploits scale-out of commodity multi-core processors and cheap DRAM to retain a (copy of a) data mart completely in main memory. Additionally, it exploits proprietary compression technology and cache-conscious algorithms that reduce memory bandwidth consumption and allow most SQL query processing to be performed on the compressed data. Ignoring the general wisdom of the last three decades that the only way to scalably search large databases is with indexes, Blink always performs simple, “brute force” scans of the entire data mart in parallel on all nodes, without using any indexes or materialized views, and without any query optimizer to choose among them. The Blink technology has thus far been incorporated into two products: (1) an accelerator appliance product for DB2 for z/OS (on the “mainframe”), called the IBM Smart Analytics Optimizer for DB2 for z/OS, V1.1, which was generally available in November 2010; and (2) the Informix Warehouse Accelerator (IWA), a software-only version that was generally available in March 2011. We are now working on the next generation of Blink, called BLink Ultra, or BLU, which will significantly expand the “sweet spot” of Blink technology to much larger, disk-based warehouses and allow BLU to “own” the data, rather than copies of it.

Datenbanksysteme in Büro, Technik und Wissenschaft (BTW), 9. GI-Fachtagung, | 2001

SQL/MM Part 5: Still Image - The Standard and Implementation Aspects

Knut Stolze

ISO/IEC 13249 SQL/MM is the effort to standardize SQL extensions. As the name suggests, this project focuses on multimedia and application-specific packages such as text, spatial, and image data, together with extended search facilities for them. Part 1 is the framework, Part 2 the full-text standard, Part 3 addresses spatial data, and Part 5 still images. SQL extensions for data mining applications are handled in Part 6. The withdrawn Part 4 addressed general-purpose facilities. This paper presents the upcoming ISO/IEC 13249-5 SQL/MM Part 5: Still Image standard [ISO00a], its features and functionality. The standard is also discussed critically. A prototypical implementation of the standard, based on IBM's database system DB2 Universal Database Version 7.1, is presented. The prototype is designed with both performance and standard conformance in mind. The implications of this approach are described here.

Web-Age Information Management | 2012

WYSIWYE: An Algebra for Expressing Spatial and Textual Rules for Information Extraction

Vijil Chenthamarakshan; Ramakrishna Varadarajan; Prasad M. Deshpande; Raghuram Krishnapuram; Knut Stolze

The visual layout of a webpage can provide valuable clues for certain types of Information Extraction (IE) tasks. In traditional rule-based IE frameworks, these layout cues are mapped to rules that operate on the HTML source of the webpages. In contrast, we have developed a framework in which the rules can be specified directly at the layout level. This has many advantages, since the higher level of abstraction leads to simpler extraction rules that are largely independent of the source code of the page and, therefore, more robust. It can also enable the specification of new types of rules that are not otherwise possible. To the best of our knowledge, there is no general framework that allows declarative specification of information extraction rules based on spatial layout. Our framework is complementary to traditional text-based rule frameworks and allows a seamless combination of spatial-layout-based rules with traditional text-based rules. We describe the algebra that enables such a system and its efficient implementation using the standard relational and text indexing features of a relational database. We demonstrate the simplicity and efficiency of this system for a task involving the extraction of software system requirements from software product pages.

International Conference on Management of Data | 2011

Online reorganization in read optimized MMDBS

Felix Beier; Knut Stolze; Kai-Uwe Sattler

Query performance is a critical factor in modern business intelligence and data warehouse systems. An increasing number of companies use detailed analyses for conducting daily business and supporting management decisions. Thus, several techniques have been developed for achieving near real-time response times: techniques that try to alleviate I/O bottlenecks while increasing the throughput of the available processing units, i.e., by keeping relevant data in compressed main-memory data structures and exploiting the read-only characteristics of analytical workloads. However, update processing and skew in the data distribution cause these densely packed and highly compressed data structures to degenerate, which negatively affects memory efficiency and query performance. Reorganization tasks can repair these data structures but, since they are usually costly operations, require a well-considered decision about which of several possible strategies should be executed and when, in order to reduce system downtime. In this paper, we address these problems by presenting an approach for online reorganization in main-memory database systems (MMDBS). Based on a discussion of the necessary reorganization strategies in IBM Smart Analytics Optimizer, a read-optimized parallel MMDBS, we introduce a framework for executing arbitrary reorganization tasks online, i.e., in the background of normal user workloads without disrupting query results or performance.

Datenbank-Spektrum | 2011

Integrating Cluster-Based Main-Memory Accelerators in Relational Data Warehouse Systems

Knut Stolze; Felix Beier; Oliver Koeth; Kai-Uwe Sattler

Today, data warehouse systems face the challenge of providing near real-time response times even for complex analytical queries on enormous data volumes. Highly scalable computing clusters in combination with parallel in-memory processing of compressed data are valuable techniques to address these challenges. In this paper, we give an overview of the core techniques of the IBM Smart Analytics Optimizer, an accelerator engine for IBM's mainframe database system DB2 for z/OS. We particularly discuss aspects of a seamless integration between the two worlds and describe techniques exploiting features of modern hardware such as parallel processing, cache utilization, and SIMD. We describe issues encountered during the development and evaluation of our system and outline current research activities for solving them.

Informatik - Forschung und Entwicklung | 2005

Efficient interval management using object-relational database servers

Christoph Brochhaus; Jost Enderle; Achim Schlosser; Thomas Seidl; Knut Stolze

User-defined data types such as intervals require specialized access methods to be efficiently searched and queried. As database implementors cannot provide appropriate index structures and query processing methods for each conceivable data type, present-day object-relational database systems offer extensible indexing frameworks that enable developers to extend the set of built-in index structures with custom access methods. Although these frameworks permit a seamless integration of user-defined indexing techniques into query processing, they do not facilitate the actual implementation of the access method itself. In order to leverage the applicability of indexing frameworks, relational access methods such as the Relational Interval Tree (RI-tree), an efficient index structure for processing interval intersection queries, mainly rely on the functionality, robustness and performance of built-in indexes, thus simplifying the index implementation significantly. To investigate the behavior and performance of the recently released IBM DB2 indexing framework, we use this interface to integrate the RI-tree into the DB2 server. The standard implementation of the RI-tree, however, does not fit the narrow constraints of the DB2 framework, which is restricted to the use of a single index. We therefore present our adaptation of the original two-tree technique to the single-index constraint as well as an approximate adaptation which conceptually only needs a single index. As experimental results with interval intersection queries show, the plugged-in access methods deliver excellent performance compared to other techniques.

Information Technology | 2017

Architecture of a data analytics service in hybrid cloud environments

Felix Beier; Knut Stolze

DB2 for z/OS is the backbone of many transactional systems in the world. IBM DB2 Analytics Accelerator (IDAA) is IBM's approach to enhancing DB2 for z/OS with very fast processing of OLAP and analytical SQL workloads. While IDAA was originally designed as an appliance to be connected directly to System z, the trend in the IT industry is towards cloud environments, which offer a broad range of tools for analytical data processing tasks. This article presents the architecture for offering a hybrid IDAA, which continues the seamless integration with DB2 for z/OS and now also runs as a specialty engine in cloud environments. Both approaches have their merit and will remain important for customers in the coming years. The specific challenges for accelerating query processing for relational data in the cloud are highlighted. Specialized hardware options are not readily available, and that has a direct impact on the system architecture, the offered functionality and its implementation.

International Conference on Data Engineering | 2011

Autonomous workload-driven reorganization of column groupings in MMDBS

Felix Beier; Knut Stolze; Kai-Uwe Sattler

A current trend for achieving high query performance even in huge data warehouse and business intelligence systems is to exploit main-memory-based processing techniques such as compression, cache-conscious strategies, and optimized data structures. However, update processing and skew in the data distribution can cause such densely packed and highly compressed data structures to degenerate, which negatively affects memory efficiency and query performance. Thus, reorganization tasks for repairing these data structures are necessary but should be applied carefully so as not to significantly impact query execution or even system availability. In this paper, we consider the special problem of tuple layout in banked storage structures. Based on runtime statistics capturing typical access patterns in the current workload, we present a bank reassignment approach that can be piggybacked onto maintenance tasks without any administration overhead. We have implemented this approach in IBM Smart Analytics Optimizer (ISAOPT). The results of our experimental evaluation show that a simple automatic restructuring of the considered hybrid row-column-store structures offers opportunities to improve query runtimes when a slight memory overhead is acceptable.

Archive | 2005