Publication


Featured research published by Martin Grund.


Very Large Data Bases | 2010

HYRISE: a main memory hybrid storage engine

Martin Grund; Jens H. Krüger; Hasso Plattner; Alexander Zeier; Philippe Cudré-Mauroux; Samuel Madden

In this paper, we describe a main memory hybrid database system called HYRISE, which automatically partitions tables into vertical partitions of varying widths depending on how the columns of the table are accessed. For columns accessed as a part of analytical queries (e.g., via sequential scans), narrow partitions perform better, because, when scanning a single column, cache locality is improved if the values of that column are stored contiguously. In contrast, for columns accessed as a part of OLTP-style queries, wider partitions perform better, because such transactions frequently insert, delete, update, or access many of the fields of a row, and co-locating those fields leads to better cache locality. Using a highly accurate model of cache misses, HYRISE is able to predict the performance of different partitionings, and to automatically select the best partitioning using an automated database design algorithm. We show that, on a realistic workload derived from customer applications, HYRISE can achieve a 20% to 400% performance improvement over pure all-column or all-row designs, and that it is both more scalable and produces better designs than previous vertical partitioning approaches for main memory systems.
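
The layout trade-off that HYRISE's cost model captures can be illustrated with a small sketch. The code below is illustrative only (the table and field names are invented, and HYRISE's actual partitioner interpolates between these two extremes rather than choosing one):

    #include <cstdint>
    #include <iostream>
    #include <vector>

    // All-row extreme: the fields of a tuple are co-located, which favors
    // OLTP-style access to whole records.
    struct RowRecord { std::uint32_t id, customer, quantity, price; };

    // All-column extreme: each attribute is contiguous, which favors
    // sequential scans of a single column.
    struct ColumnTable { std::vector<std::uint32_t> id, customer, quantity, price; };

    int main() {
        std::vector<RowRecord> rows(1000, {1, 2, 3, 4});
        ColumnTable cols{{}, {}, {}, std::vector<std::uint32_t>(1000, 4)};
        std::uint64_t a = 0, b = 0;
        // Row scan of one attribute drags all four fields through the cache.
        for (const auto& r : rows) a += r.price;
        // Column scan touches only the 4 bytes per tuple it actually needs.
        for (auto p : cols.price) b += p;
        std::cout << a << ' ' << b << '\n';   // both print 4000
    }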


Very Large Data Bases | 2011

Fast updates on read-optimized databases using multi-core CPUs

Jens Krueger; Changkyu Kim; Martin Grund; Nadathur Satish; David Schwalb; Jatin Chhugani; Hasso Plattner; Pradeep Dubey; Alexander Zeier

Read-optimized columnar databases use differential updates to handle writes by maintaining a separate write-optimized delta partition which is periodically merged with the read-optimized and compressed main partition. This merge process introduces significant overhead and unacceptable downtime in update-intensive systems that aspire to combine transactional and analytical workloads in one system. In the first part of the paper, we report data analyses of 12 SAP Business Suite customer systems. In the second part, we present an optimized merge process that reduces the merge overhead of current systems by a factor of 30. Our linear-time merge algorithm exploits the high compute and bandwidth resources of modern multi-core CPUs with architecture-aware optimizations and efficient parallelization. This enables compressed in-memory column stores to handle the transactional update rate required by enterprise applications, while keeping the properties of read-optimized databases for analytic-style queries.
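
As a rough illustration of the linear-time flavor of such a merge (this is not the paper's algorithm, which also rewrites the compressed value arrays and parallelizes across cores), merging the sorted dictionary of the main partition with the sorted distinct values of the delta is a single sequential pass:

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <vector>

    int main() {
        std::vector<std::string> main_dict  = {"Berlin", "Hamburg", "Potsdam"};
        std::vector<std::string> delta_dict = {"Aachen", "Hamburg", "Munich"};
        std::vector<std::string> merged;
        // One sequential pass over both sorted inputs: O(|main| + |delta|).
        std::set_union(main_dict.begin(), main_dict.end(),
                       delta_dict.begin(), delta_dict.end(),
                       std::back_inserter(merged));
        for (const auto& v : merged) std::cout << v << '\n';
        // A second linear pass would then re-encode the value arrays against
        // the merged dictionary; the paper parallelizes both passes.
    }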


Database Systems for Advanced Applications | 2010

Optimizing write performance for read optimized databases

Jens Krueger; Martin Grund; Christian Tinnefeld; Hasso Plattner; Alexander Zeier; Franz Faerber

Compression in column-oriented databases has been proven to offer both performance enhancements and reductions in storage consumption. This is especially true for read access, as compressed data can be processed directly during query execution. Nevertheless, compression is disadvantageous for write access due to unavoidable re-compression: a write requires significantly more data to be read than is involved in the particular operation, more tuples may have to be modified depending on the compression algorithm, and table-level locks have to be acquired instead of row-level locks as long as no second version of the data is stored. As a result, the duration of a single modification, whether insert or update, limits both throughput and response time significantly. In this paper, we propose an additional write-optimized buffer that maintains a delta which, in conjunction with the compressed main store, represents the current state of the data. This buffer is an uncompressed, column-oriented data structure. To address the mentioned disadvantages of data compression, we trade query performance and memory consumption for write performance by using the buffer as intermediate storage for several modifications, which are then propagated to the main store in bulk during a merge operation. In this way, the overhead of a single re-compression is shared among all recent modifications. We evaluate our implementation inside SAP's in-memory column store, analyze the different parameters influencing the merge process, and provide a complexity analysis. Finally, we show optimizations regarding resource consumption and merge duration.
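
A minimal sketch of the proposed scheme, with invented names and none of the actual compression: writes land in an uncompressed delta buffer, reads span both stores, and a merge periodically folds the delta into the main store so that one re-compression is amortized over many modifications:

    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Column {
        std::vector<std::string> main_values;  // stands in for the compressed main store
        std::vector<std::string> delta;        // write-optimized, uncompressed buffer

        void insert(std::string v) { delta.push_back(std::move(v)); }  // cheap write

        std::size_t count(const std::string& v) const {  // reads span both stores
            std::size_t n = 0;
            for (const auto& x : main_values) n += (x == v);
            for (const auto& x : delta)       n += (x == v);
            return n;
        }

        void merge() {  // the amortized re-compression point
            main_values.insert(main_values.end(), delta.begin(), delta.end());
            delta.clear();
        }
    };

    int main() {
        Column city;
        city.main_values = {"Berlin", "Berlin"};
        city.insert("Potsdam");                      // no re-compression needed
        std::cout << city.count("Berlin") << '\n';   // 2, found across both stores
        city.merge();
    }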


IEEE Symposium on Advanced Management of Information for Globalized Enterprises (AMIGE) | 2008

Shared Table Access Pattern Analysis for Multi-Tenant Applications

Martin Grund; Matthieu Schapranow; Jens Krueger; Jan Schaffner; Anja Bog

With the rise of software-as-a-service applications, it becomes crucial to operate profitably even with small profit margins. The ability of customers to switch between different hosted solutions makes the total cost of ownership an even more important calculation. Application design is an important factor in this calculation, but the storage techniques used to create and maintain thousands of instances of the same application play an equally important role. None of the well-known commercial off-the-shelf databases such as MySQL or PostgreSQL provides an optimized solution for multi-tenancy. In this paper, we analyze typical operations on multi-tenant databases and show how they interact. Based on this analysis, we propose an architecture for a multi-tenant storage engine capable of overcoming the problems typical relational databases face in this area.
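
One of the shared-table patterns analyzed in this line of work can be sketched as follows (the schema and names are hypothetical): all tenants share one physical table, rows carry a tenant discriminator, and every access path filters on it first, which is exactly what a multi-tenant-aware storage engine can exploit by clustering or partitioning rows per tenant:

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Order {
        std::uint32_t tenant_id;   // discriminator column added to every row
        std::uint32_t order_id;
        std::string   product;
    };

    // Every query is implicitly scoped to the requesting tenant; an engine
    // aware of this pattern can avoid scanning other tenants' rows entirely.
    std::vector<Order> orders_for_tenant(const std::vector<Order>& shared,
                                         std::uint32_t tenant) {
        std::vector<Order> out;
        for (const auto& o : shared)
            if (o.tenant_id == tenant) out.push_back(o);
        return out;
    }

    int main() {
        std::vector<Order> shared = {{7, 1, "bolt"}, {9, 2, "nut"}, {7, 3, "gear"}};
        std::cout << orders_for_tenant(shared, 7).size() << '\n';   // 2
    }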


International Conference on Data Engineering | 2013

CPU and cache efficient management of memory-resident databases

Holger Pirk; Florian Funke; Martin Grund; Thomas Neumann; Ulf Leser; Stefan Manegold; Alfons Kemper; Martin L. Kersten

Memory-Resident Database Management Systems (MRDBMS) have to be optimized for two resources: CPU cycles and memory bandwidth. To optimize for bandwidth in mixed OLTP/OLAP scenarios, the hybrid, or Partially Decomposed, Storage Model (PDSM) has been proposed. However, in current implementations, the bandwidth savings achieved by partial decomposition come at increased CPU cost. To achieve the desired bandwidth savings without sacrificing CPU efficiency, we combine partially decomposed storage with Just-in-Time (JiT) compilation of queries, thus eliminating CPU-inefficient function calls. Since existing cost-based optimization components are not designed for JiT-compiled query execution, we also develop a novel approach to cost modeling and subsequent storage layout optimization. Our evaluation shows that the JiT-based processor maintains the bandwidth savings of previously presented hybrid query processors but outperforms them by two orders of magnitude due to increased CPU efficiency.
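
The CPU-efficiency argument can be made concrete with a small sketch (illustrative, not the paper's code): a tuple-at-a-time interpreter pays an indirect call per value, while a specialized loop, which is what JiT compilation produces at runtime, exposes straight-line, vectorizable code to the compiler:

    #include <cstdint>
    #include <functional>
    #include <iostream>
    #include <vector>

    // Interpreted plan: one indirect call per tuple, opaque to the compiler.
    std::uint64_t scan_interpreted(const std::vector<std::uint32_t>& col,
                                   const std::function<bool(std::uint32_t)>& pred) {
        std::uint64_t hits = 0;
        for (auto v : col) hits += pred(v);
        return hits;
    }

    // "Compiled" plan: the predicate is baked into the loop, so the whole
    // scan becomes a tight loop the compiler can inline and vectorize.
    std::uint64_t scan_compiled(const std::vector<std::uint32_t>& col) {
        std::uint64_t hits = 0;
        for (auto v : col) hits += (v < 42);
        return hits;
    }

    int main() {
        std::vector<std::uint32_t> col(1000000, 7);
        std::cout << scan_interpreted(col, [](std::uint32_t v) { return v < 42; })
                  << ' ' << scan_compiled(col) << '\n';   // 1000000 1000000
    }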


Enterprise Distributed Object Computing | 2010

Enterprise Application-Specific Data Management

Jens Krueger; Martin Grund; Alexander Zeier; Hasso Plattner

Enterprise applications are presently built on a 20-year-old data management infrastructure that was designed to meet a specific set of requirements for OLTP systems. In the meantime, enterprise applications have become more sophisticated, data set sizes have increased, requirements on the freshness of input data have been strengthened, and the time allotted for completing business processes has been reduced. To meet these challenges, enterprise applications have become increasingly complicated to make up for shortcomings in the data management infrastructure. To address this issue, we investigate recent trends in data management such as main memory databases, column stores, and compression techniques with regard to the workload requirements and data characteristics derived from actual customer systems. We show that a main memory column store is better suited for today's enterprise systems, which we validate using SAP's NetWeaver Business Warehouse Accelerator and a realistic set of data from an inventory management application.
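
Dictionary encoding, one of the compression techniques surveyed here, can be sketched in a few lines (a simplified illustration; production column stores add bit-packing and sorted dictionaries on top):

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    struct DictColumn {
        std::vector<std::string> dict;                    // each distinct value once
        std::unordered_map<std::string, std::uint32_t> code;
        std::vector<std::uint32_t> values;                // small per-row codes

        void append(const std::string& v) {
            auto it = code.find(v);
            if (it == code.end()) {                       // first occurrence
                it = code.emplace(v, static_cast<std::uint32_t>(dict.size())).first;
                dict.push_back(v);
            }
            values.push_back(it->second);                 // scans compare integers
        }
    };

    int main() {
        DictColumn c;
        for (const char* v : {"INVOICE", "ORDER", "INVOICE", "ORDER"}) c.append(v);
        std::cout << c.dict.size() << " distinct values, "
                  << c.values.size() << " rows\n";        // 2 distinct values, 4 rows
    }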


Extending Database Technology | 2013

Elastic online analytical processing on RAMCloud

Christian Tinnefeld; Donald Kossmann; Martin Grund; Joos-Hendrik Boese; Frank Renkes; Vishal Sikka; Hasso Plattner

A shared-nothing architecture is the state of the art for deploying a distributed analytical in-memory database management system: it preserves the in-memory performance advantage by processing data locally on each node, but it is difficult to scale out elastically. Modern switched-fabric communication links such as InfiniBand narrow the performance gap between local and remote DRAM access to a single order of magnitude. Based on these premises, we introduce a distributed in-memory database architecture that separates the query execution engine from data access. This enables a) the use of a large-scale DRAM-based storage system such as Stanford's RAMCloud and b) the push-down of bandwidth-intensive database operators into the storage system. We address the resulting challenges, such as finding the optimal operator execution strategy and partitioning scheme, and demonstrate that such an architecture delivers both the elasticity of a shared-storage approach and the performance characteristics of operating on local DRAM.
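
The architectural split can be sketched as follows; the names are hypothetical and RAMCloud's real API differs, but the contrast between pulling raw data and pushing a bandwidth-intensive operator to the storage side is the one the paper evaluates:

    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct StorageNode {                 // stands in for one DRAM storage server
        std::vector<std::uint32_t> partition;

        // Data-pull strategy: ship the whole partition over the network
        // to the query execution engine.
        std::vector<std::uint32_t> scan_all() const { return partition; }

        // Operator-push strategy: run the bandwidth-intensive filter next to
        // the data and ship only qualifying values.
        std::vector<std::uint32_t> scan_less_than(std::uint32_t bound) const {
            std::vector<std::uint32_t> out;
            for (auto v : partition)
                if (v < bound) out.push_back(v);
            return out;
        }
    };

    int main() {
        StorageNode node{{5, 40, 7, 90}};
        std::cout << node.scan_all().size() << " values pulled, "
                  << node.scan_less_than(10).size() << " after push-down\n";  // 4, 2
    }

Which strategy wins depends on selectivity and the local-versus-remote bandwidth ratio, which is why the paper treats the operator execution strategy as an optimization problem rather than a fixed choice.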


International Conference on Computer Sciences and Convergence Information Technology | 2010

Data structures for mixed workloads in in-memory databases

Jens Krueger; Martin Grund; Martin Boissier; Alexander Zeier; Hasso Plattner

Traditionally, enterprise data management is divided into separate systems. Online Transaction Processing (OLTP) systems focus on day-to-day business and are optimized for retrieving and modifying complete entities. Online Analytical Processing (OLAP) systems issue queries on specific attributes and are optimized to support decision making based on information gathered from many instances. In parallel, both hardware and database applications are subject to steady improvement. For example, today's main memory capacities, in combination with column-oriented data organization, offer completely new possibilities such as real-time analytical ad hoc queries on transactional data. In particular, the latest developments in main memory database systems raise the question of whether such databases are capable of handling both OLAP and OLTP workloads in one system. This paper discusses requirements for main memory database systems that manage both workloads and analyzes appropriate data structures.
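
The two access patterns being weighed against each other can be shown on a single column-organized table (a deliberately simplified sketch with invented names):

    #include <array>
    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct Table { std::vector<std::uint32_t> cols[3]; };  // column-oriented storage

    // OLTP pattern: materialize one complete entity, one lookup per column.
    std::array<std::uint32_t, 3> get_row(const Table& t, std::size_t row) {
        return {t.cols[0][row], t.cols[1][row], t.cols[2][row]};
    }

    // OLAP pattern: aggregate a single attribute across all rows.
    std::uint64_t sum_col(const Table& t, std::size_t c) {
        std::uint64_t s = 0;
        for (auto v : t.cols[c]) s += v;
        return s;
    }

    int main() {
        Table t;
        for (auto& c : t.cols) c = {1, 2, 3};
        auto row = get_row(t, 1);                            // whole entity: {2, 2, 2}
        std::cout << row[0] + row[1] + row[2] << ' '
                  << sum_col(t, 0) << '\n';                  // 6 6
    }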


Data Management on New Hardware | 2010

The effects of virtualization on main memory systems

Martin Grund; Jan Schaffner; Jens Krueger; Jan Brunnert; Alexander Zeier

Virtualization is mainly employed to increase the utilization of lightly loaded systems through consolidation, but also to ease administration through the ability to rapidly provision or migrate virtual machines. These facilities are crucial for efficiently managing large data centers. At the same time, modern hardware such as Intel's Nehalem microarchitecture changes critical assumptions about performance bottlenecks, and software systems that explicitly exploit the underlying hardware, such as main memory databases, gain increasing momentum. In this paper, we address the question of how these specialized software systems perform in a virtualized environment. To do so, we present a set of experiments covering several different variants of in-memory databases: the MonetDB Calibrator, a fine-grained hybrid row/column in-memory database running an OLTP workload, and an in-memory column store running a multi-user OLAP workload. We examine how memory management in virtual machine monitors affects these three classes of applications. For the multi-user OLAP experiment, we also experimentally compare a virtualized Nehalem server to one of its predecessors. We show that saturation of the memory bus is a major limiting factor but has much less impact on the newer architecture.
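
The memory-bus saturation result suggests a calibrator-style microbenchmark, in the spirit of the MonetDB Calibrator but not its code: stream over a buffer far larger than the caches and derive an effective read bandwidth, running the same binary natively and inside a virtual machine to compare:

    #include <chrono>
    #include <cstdint>
    #include <iostream>
    #include <numeric>
    #include <vector>

    int main() {
        // 64 Mi values of 4 bytes each = 256 MiB, far beyond any CPU cache,
        // so the scan is bounded by memory bandwidth rather than compute.
        std::vector<std::uint32_t> buf(1u << 26, 1);
        auto t0 = std::chrono::steady_clock::now();
        std::uint64_t sum = std::accumulate(buf.begin(), buf.end(), std::uint64_t{0});
        auto t1 = std::chrono::steady_clock::now();
        double secs = std::chrono::duration<double>(t1 - t0).count();
        double gib  = buf.size() * sizeof(buf[0]) / (1024.0 * 1024.0 * 1024.0);
        std::cout << "read bandwidth: " << gib / secs << " GiB/s"
                  << " (sum=" << sum << ")\n";
    }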


International Conference on Big Data | 2013

Big data analytics on high velocity streams: A case study

Thibaud Chardonnens; Philippe Cudré-Mauroux; Martin Grund; Benoit Perroud

Big data management is often characterized by three Vs: Volume, Velocity, and Variety. While traditional batch-oriented systems such as MapReduce are able to scale out and process very large volumes of data in parallel, they also introduce significant latency. In this paper, we focus on the second V (Velocity) of the big data triad. We present a case study in which we use a popular open-source stream processing engine (Storm) to perform real-time integration and trend detection on Twitter and Bitly streams. We describe our trend detection solution and experimentally demonstrate that our architecture can effectively process data in real time, even for high-velocity streams.
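
The windowed trend-detection idea can be sketched independently of Storm (the paper's pipeline runs as Storm topologies; this standalone sketch with an invented threshold rule only illustrates the counting): a term trends when its frequency in the current window jumps well above its frequency in the previous one:

    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    using Counts = std::unordered_map<std::string, std::size_t>;

    // A term trends if its count grew by at least `factor` between windows.
    std::vector<std::string> trending(const Counts& prev, const Counts& cur,
                                      double factor = 3.0) {
        std::vector<std::string> out;
        for (const auto& [term, n] : cur) {
            auto it = prev.find(term);
            std::size_t before = (it == prev.end()) ? 0 : it->second;
            if (n >= factor * (before + 1))   // +1 smooths never-seen terms
                out.push_back(term);
        }
        return out;
    }

    int main() {
        Counts prev = {{"storm", 2}, {"cat", 5}};
        Counts cur  = {{"storm", 9}, {"cat", 6}};
        for (const auto& t : trending(prev, cur)) std::cout << t << '\n';  // storm
    }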

Collaboration


Dive into Martin Grund's collaboration.

Top Co-Authors

Hasso Plattner, Hasso Plattner Institute
Jens Krueger, Hasso Plattner Institute
Jan Schaffner, Hasso Plattner Institute
Johannes Wust, Hasso Plattner Institute
David Schwalb, Hasso Plattner Institute
Anja Bog, Hasso Plattner Institute