Publication


Featured research published by Jens Krueger.


very large data bases | 2011

Fast updates on read-optimized databases using multi-core CPUs

Jens Krueger; Changkyu Kim; Martin Grund; Nadathur Satish; David Schwalb; Jatin Chhugani; Hasso Plattner; Pradeep Dubey; Alexander Zeier

Read-optimized columnar databases use differential updates to handle writes by maintaining a separate write-optimized delta partition which is periodically merged with the read-optimized and compressed main partition. This merge process introduces significant overheads and unacceptable downtimes in update-intensive systems that aspire to combine transactional and analytical workloads in one system. In the first part of the paper, we report data analyses of 12 SAP Business Suite customer systems. In the second part, we present an optimized merge process that reduces the merge overhead of current systems by a factor of 30. Our linear-time merge algorithm exploits the high compute and bandwidth resources of modern multi-core CPUs with architecture-aware optimizations and efficient parallelization. This enables compressed in-memory column stores to handle the transactional update rates required by enterprise applications, while keeping the properties of read-optimized databases for analytic-style queries.
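The linear-time merge described above can be sketched as a two-pointer pass over the sorted main partition and a freshly sorted delta. This is an illustrative simplification only: all names are assumptions, and the paper's compression-aware, parallel implementation is far more involved.

```python
# Toy sketch of a linear-time delta merge (illustrative, not the paper's code):
# the sorted main partition and the (small, unsorted) delta are combined into
# a new sorted main partition in a single pass.

def merge_partitions(main_sorted, delta):
    """Merge a sorted main partition with an unsorted delta partition.

    main_sorted: list of values, already sorted (read-optimized main).
    delta: list of recently written values, unsorted (write-optimized).
    Returns the new sorted main partition.
    """
    delta_sorted = sorted(delta)  # delta is small relative to main
    merged = []
    i = j = 0
    # classic two-pointer merge: linear in the total number of values
    while i < len(main_sorted) and j < len(delta_sorted):
        if main_sorted[i] <= delta_sorted[j]:
            merged.append(main_sorted[i]); i += 1
        else:
            merged.append(delta_sorted[j]); j += 1
    merged.extend(main_sorted[i:])
    merged.extend(delta_sorted[j:])
    return merged

new_main = merge_partitions([1, 3, 5, 9], [4, 2])
```

In the real system each column is dictionary-compressed, so the merge also has to rebuild dictionaries and re-encode values, which is where the architecture-aware optimizations apply.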


database systems for advanced applications | 2010

Optimizing write performance for read optimized databases

Jens Krueger; Martin Grund; Christian Tinnefeld; Hasso Plattner; Alexander Zeier; Franz Faerber

Compression in column-oriented databases has been proven to offer both performance enhancements and reductions in storage consumption. This is especially true for read access, as compressed data can be processed directly during query execution. Nevertheless, compression is disadvantageous for write access due to unavoidable re-compression: a write requires significantly more data to be read than is involved in the particular operation, more tuples may have to be modified depending on the compression algorithm, and table-level locks have to be acquired instead of row-level locks as long as no second version of the data is stored. As a result, the duration of a single modification, both insert and update, limits both throughput and response time significantly. In this paper, we propose an additional write-optimized buffer to maintain the delta that, in conjunction with the compressed main store, represents the current state of the data. This buffer uses an uncompressed, column-oriented data structure. To address the mentioned disadvantages of data compression, we trade write performance for query performance and memory consumption by using the buffer as an intermediate store for several modifications, which are then propagated in bulk by a merge operation. Thereby, the overhead of a single re-compression is shared among all recent modifications. We evaluated our implementation inside SAP's in-memory column store. We then analyze the different parameters influencing the merge process and provide a complexity analysis. Finally, we show optimizations regarding resource consumption and merge duration.
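The buffered-write scheme can be illustrated with a toy column that routes writes to an uncompressed delta and answers scans from both structures. `BufferedColumn` and every detail below are hypothetical stand-ins; a plain sorted list takes the place of the actual compressed main store.

```python
# Minimal sketch (assumed names) of the write-optimized delta buffer: writes
# append to an uncompressed delta, reads consult main + delta, and a merge
# folds the buffer into the main store in bulk.

class BufferedColumn:
    def __init__(self, main_values):
        self.main = sorted(main_values)   # stands in for the compressed main store
        self.delta = []                   # uncompressed write-optimized buffer

    def insert(self, value):
        self.delta.append(value)          # cheap: no re-compression per write

    def scan_equal(self, value):
        # a query must consult both structures to see the current state
        return self.main.count(value) + self.delta.count(value)

    def merge(self):
        # one re-compression, amortized over all buffered modifications
        self.main = sorted(self.main + self.delta)
        self.delta = []
```

The design choice being traded here is visible even in the toy: `insert` is O(1), while `scan_equal` pays for touching two structures until `merge` runs.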


2008 IEEE Symposium on Advanced Management of Information for Globalized Enterprises (AMIGE) | 2008

Shared Table Access Pattern Analysis for Multi-Tenant Applications

Martin Grund; Matthieu Schapranow; Jens Krueger; Jan Schaffner; Anja Bog

With the rise of applications in the area of software as a service, it becomes crucial to operate profitably even with small profit margins. The ability of customers to switch between different hosted solutions makes the calculation of the total cost of ownership even more important. Not only is the application design an important factor in this calculation; the storage techniques used to create and maintain thousands of instances of the same application also play one of the most important roles. None of the well-known commercial off-the-shelf databases such as MySQL or PostgreSQL provides an optimized solution for multi-tenancy problems. In this paper, we analyze typical operations in the area of multi-tenant databases and show how they are wired together. Based on this analysis, we propose an architecture for a multi-tenant storage engine capable of overcoming the problems of typical relational databases in this area.


enterprise distributed object computing | 2010

Enterprise Application-Specific Data Management

Jens Krueger; Martin Grund; Alexander Zeier; Hasso Plattner

Enterprise applications are presently built on a 20-year-old data management infrastructure that was designed to meet a specific set of requirements for OLTP systems. In the meantime, enterprise applications have become more sophisticated, data set sizes have increased, requirements on the freshness of input data have been strengthened, and the time allotted for completing business processes has been reduced. To meet these challenges, enterprise applications have become increasingly complicated to make up for shortcomings in the data management infrastructure. To address this issue, we investigate recent trends in data management, such as main memory databases, column stores, and compression techniques, with regard to the workload requirements and data characteristics derived from actual customer systems. We show that a main memory column store is better suited for today's enterprise systems, which we validate using SAP's NetWeaver Business Warehouse Accelerator and a realistic set of data from an inventory management application.


international database engineering and applications symposium | 2010

How to juggle columns: an entropy-based approach for table compression

Marcus Paradies; Christian Lemke; Hasso Plattner; Wolfgang Lehner; Kai-Uwe Sattler; Alexander Zeier; Jens Krueger

Many relational databases exhibit complex dependencies between data attributes, caused either by the nature of the underlying data or by explicitly denormalized schemas. In data warehouse scenarios, calculated key figures may be materialized or hierarchy levels may be held within a single dimension table. Such column correlations and the resulting data redundancy may result in additional storage requirements. They may also result in bad query performance if inappropriate independence assumptions are made during query compilation. In this paper, we tackle the specific problem of detecting functional dependencies between columns to improve the compression rate for column-based database systems, which both reduces main memory consumption and improves query performance. Although a huge variety of algorithms have been proposed for detecting column dependencies in databases, we maintain that increased data volumes and recent developments in hardware architectures demand novel algorithms with much lower runtime overhead and smaller memory footprint. Our novel approach is based on entropy estimations and exploits a combination of sampling and multiple heuristics to render it applicable for a wide range of use cases. We demonstrate the quality of our approach by means of an implementation within the SAP NetWeaver Business Warehouse Accelerator. Our experiments indicate that our approach scales well with the number of columns and produces reliable dependence structure information. This both reduces memory consumption and improves performance for nontrivial queries.
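The entropy criterion behind this approach can be sketched in a few lines: a column A functionally determines B exactly when the joint entropy H(A, B) equals H(A), since then B adds no information beyond A. The sketch below is a naive exact computation on toy data; the paper's contribution lies precisely in the sampling and heuristics that avoid this full-column cost.

```python
# Toy illustration of entropy-based functional dependency detection.
# A -> B holds iff H(A, B) == H(A); names and data are purely illustrative.
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a list of values."""
    n = len(values)
    counts = Counter(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def determines(col_a, col_b, eps=1e-9):
    """True if col_a functionally determines col_b (H(A,B) - H(A) ~ 0)."""
    joint = list(zip(col_a, col_b))
    return entropy(joint) - entropy(col_a) < eps

# country -> currency is an FD in this toy table; the reverse is not
country  = ["DE", "DE", "FR", "AT"]
currency = ["EUR", "EUR", "EUR", "EUR"]
```

A column store can then encode the dependent column relative to the determining one instead of compressing both independently.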


international conference on computer sciences and convergence information technology | 2010

Data structures for mixed workloads in in-memory databases

Jens Krueger; Martin Grund; Martin Boissier; Alexander Zeier; Hasso Plattner

Traditionally, enterprise data management is divided into separate systems. Online Transaction Processing (OLTP) systems focus on day-to-day business and are optimized for retrieving and modifying complete entities. Online Analytical Processing (OLAP) systems issue queries on specific attributes, as these applications are optimized to support decision making based on information gathered from many instances. In parallel, both hardware and database applications are subject to steady improvement. For example, today's main memory capacities, in combination with column-oriented organization of data, offer completely new possibilities such as real-time analytical ad hoc queries on transactional data. In particular, the latest developments in main memory database systems raise the question of whether these databases can handle both OLAP and OLTP workloads in one system. This paper discusses requirements for main memory database systems that manage both workloads and analyzes appropriate data structures.
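The OLTP/OLAP layout trade-off described above can be illustrated with the same toy data stored both row-wise and column-wise; all names and data here are illustrative assumptions, not the paper's structures.

```python
# Row layout favors fetching whole entities (OLTP-style access);
# column layout favors scanning one attribute across many tuples (OLAP-style).

rows = [("Alice", 30, "Berlin"), ("Bob", 25, "Paris")]

# column-oriented layout of the same data: one array per attribute
columns = {
    "name": ["Alice", "Bob"],
    "age":  [30, 25],
    "city": ["Berlin", "Paris"],
}

def oltp_fetch(i):
    # one contiguous read returns the complete entity
    return rows[i]

def olap_avg_age():
    # the scan touches only the attribute the query needs
    ages = columns["age"]
    return sum(ages) / len(ages)
```

A mixed-workload system must serve both access patterns from one store, which is why the choice of data structures is the central question of the paper.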


data management on new hardware | 2010

The effects of virtualization on main memory systems

Martin Grund; Jan Schaffner; Jens Krueger; Jan Brunnert; Alexander Zeier

Virtualization is mainly employed to increase the utilization of lightly-loaded systems through consolidation, but also to ease administration by making it possible to rapidly provision or migrate virtual machines. These facilities are crucial for efficiently managing large data centers. At the same time, modern hardware, such as Intel's Nehalem microarchitecture, changes critical assumptions about performance bottlenecks, and software systems that explicitly exploit the underlying hardware, such as main memory databases, gain increasing momentum. In this paper, we address the question of how these specialized software systems perform in a virtualized environment. To do so, we present a set of experiments looking at several different variants of in-memory databases: the MonetDB Calibrator, a fine-grained hybrid row/column in-memory database running an OLTP workload, and an in-memory column store database running a multi-user OLAP workload. We examine how memory management in virtual machine monitors affects these three classes of applications. For the multi-user OLAP experiment, we also experimentally compare a virtualized Nehalem server to one of its predecessors. We show that saturation of the memory bus is a major limiting factor but has much less impact on the new architecture.


conference on information and knowledge management | 2012

Efficient logging for enterprise workloads on column-oriented in-memory databases

Johannes Wust; Joos-Hendrick Boese; Frank Renkes; Sebastian Blessing; Jens Krueger; Hasso Plattner

The introduction of a 64 bit address space in commodity operating systems and the constant drop in hardware prices made large capacities of main memory in the order of terabytes technically feasible and economically viable. Especially column-oriented in-memory databases are a promising platform to improve data management for enterprise applications. As in-memory databases hold the primary persistence in volatile memory, some form of recovery mechanism is required to prevent potential data loss in case of failures. Two desirable characteristics of any recovery mechanism are (1) that it has a minimal impact on the running system, and (2) that the system recovers quickly and without any data loss after a failure. This paper introduces an efficient logging mechanism for dictionary-compressed column structures that addresses these two characteristics by (1) reducing the overall log size by writing dictionary-compressed values and (2) allowing for parallel writing and reading of log files. We demonstrate the efficiency of our logging approach by comparing the resulting log-file size with traditional logical logging on a workload produced by a productive enterprise system.
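The core idea of dictionary-compressed logging can be sketched as follows: for each write, the log records only a small value-id, plus a dictionary entry the first time a value appears, and recovery replays the log. `DictionaryLogger` is a hypothetical illustration, not the paper's implementation, which additionally covers parallel writing and reading of log files.

```python
# Sketch (assumed names) of logging dictionary-compressed values: repeated
# values cost only a compact value-id record, shrinking the log for the
# highly repetitive data typical of enterprise workloads.

class DictionaryLogger:
    def __init__(self):
        self.dictionary = {}   # value -> value-id
        self.log = []          # compact log records

    def write(self, value):
        if value not in self.dictionary:
            vid = len(self.dictionary)
            self.dictionary[value] = vid
            self.log.append(("dict", vid, value))   # new entry logged once
        self.log.append(("val", self.dictionary[value]))  # small record per write

    def replay(self):
        # recovery: rebuild the column contents from the log alone
        decode, column = {}, []
        for rec in self.log:
            if rec[0] == "dict":
                decode[rec[1]] = rec[2]
            else:
                column.append(decode[rec[1]])
        return column
```

Because each record is independent once its dictionary entry is known, such logs lend themselves to the parallel reading the abstract mentions.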


industrial engineering and engineering management | 2011

Main memory databases for enterprise applications

Jens Krueger; Florian Huebner; Johannes Wust; Martin Boissier; Alexander Zeier; Hasso Plattner

Enterprise applications are traditionally divided into transactional and analytical processing. This separation was essential, as growing data volumes and increasingly complex queries could no longer be handled feasibly by conventional relational databases.


industrial engineering and engineering management | 2009

Vertical partitioning in insert-only scenarios for enterprise applications

Martin Grund; Jens Krueger; Christian Tinnefeld; Alexander Zeier

Today's applications have a specific demand for operational reporting. It becomes more important to gain information using analytic-style queries on current transactional data. In addition, enterprises must keep track of historical data for legal reasons, so they are forced to track any changes in the system. One possibility to record all changes is an insert-only data management approach. When using a main-memory database, efficient use of main memory is a very important factor. In this paper, we show how a combination of insert-only data management and vertical partitioning achieves a seamless integration of time-based data management and efficient memory usage. To validate our approach, we present the results of a customer analysis.
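The insert-only idea can be sketched with a toy table in which every change appends a new row carrying a validity timestamp, so past states remain queryable. The class and schema below are illustrative assumptions, not the paper's design.

```python
# Sketch of insert-only data management: updates never overwrite; each change
# is a new version row with a valid-from timestamp, keeping history for free.

class InsertOnlyTable:
    def __init__(self):
        self.rows = []  # (key, value, valid_from)

    def write(self, key, value, ts):
        self.rows.append((key, value, ts))  # append-only, no in-place update

    def value_at(self, key, ts):
        # time-travel read: latest version of `key` valid at time `ts`
        versions = [(t, v) for k, v, t in self.rows if k == key and t <= ts]
        return max(versions)[1] if versions else None
```

In a main-memory setting this history is what makes memory efficiency critical, which is where the vertical partitioning of rarely-read historical attributes comes in.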

Collaboration


Jens Krueger's top co-authors and their affiliations:

Hasso Plattner (Hasso Plattner Institute)
Alexander Zeier (Massachusetts Institute of Technology)
Anja Bog (University of Potsdam)
Johannes Wust (Hasso Plattner Institute)
David Schwalb (Hasso Plattner Institute)