Publication


Featured research published by Christian Tinnefeld.


Database Systems for Advanced Applications | 2010

Optimizing write performance for read optimized databases

Jens Krueger; Martin Grund; Christian Tinnefeld; Hasso Plattner; Alexander Zeier; Franz Faerber

Compression in column-oriented databases has been proven to offer both performance enhancements and reductions in storage consumption. This is especially true for read access, as compressed data can be processed directly during query execution. Nevertheless, compression is disadvantageous for write access due to unavoidable re-compression: a write requires significantly more data to be read than is involved in the particular operation, more tuples may have to be modified depending on the compression algorithm, and table-level locks have to be acquired instead of row-level locks as long as no second version of the data is stored. As a result, the duration of a single modification, both insert and update, significantly limits throughput and response time. In this paper, we propose an additional write-optimized buffer that maintains a delta which, in conjunction with the compressed main store, represents the current state of the data. The buffer uses an uncompressed, column-oriented data structure. To address the mentioned disadvantages of data compression, we trade write performance for query performance and memory consumption by using the buffer as intermediate storage for several modifications, which are then propagated in bulk by a merge operation. The overhead of a single re-compression is thereby shared among all recent modifications. We evaluate our implementation inside SAP's in-memory column store, analyze the different parameters influencing the merge process, and provide a complexity analysis. Finally, we show optimizations regarding resource consumption and merge duration.
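The following sketch (not the authors' implementation; all class and method names are illustrative) shows the core idea: an uncompressed, write-optimized delta buffer absorbs inserts while the main store stays dictionary-compressed, and a periodic merge re-compresses once for all buffered modifications.

```python
class ToyColumn:
    """Dictionary-compressed main store plus uncompressed, write-optimized delta."""

    def __init__(self):
        self.dictionary = []   # sorted distinct values of the main store
        self.main = []         # value ids pointing into the dictionary
        self.delta = []        # uncompressed buffer for recent modifications

    def insert(self, value):
        # Writes only touch the delta; no re-compression of the main store.
        self.delta.append(value)

    def scan(self, predicate):
        # Reads see main store and delta combined (the current state of the data).
        hits = [self.dictionary[vid] for vid in self.main
                if predicate(self.dictionary[vid])]
        return hits + [v for v in self.delta if predicate(v)]

    def merge(self):
        # One bulk re-compression, amortized over all buffered modifications.
        values = [self.dictionary[vid] for vid in self.main] + self.delta
        self.dictionary = sorted(set(values))
        positions = {v: i for i, v in enumerate(self.dictionary)}
        self.main = [positions[v] for v in values]
        self.delta = []
```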


Extending Database Technology | 2013

Elastic online analytical processing on RAMCloud

Christian Tinnefeld; Donald Kossmann; Martin Grund; Joos-Hendrik Boese; Frank Renkes; Vishal Sikka; Hasso Plattner

A shared-nothing architecture is the state of the art for deploying a distributed analytical in-memory database management system: it preserves the in-memory performance advantage by processing data locally on each node, but it is difficult to scale out. Modern switched-fabric communication links such as InfiniBand narrow the performance gap between local and remote DRAM access to a single order of magnitude. Based on these premises, we introduce a distributed in-memory database architecture that separates the query execution engine from data access: this enables a) the use of a large-scale DRAM-based storage system such as Stanford's RAMCloud and b) the push-down of bandwidth-intensive database operators into the storage system. We address the resulting challenges, such as finding the optimal operator execution strategy and partitioning scheme. We demonstrate that such an architecture delivers both the elasticity of a shared-storage approach and the performance characteristics of operating on local DRAM.
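A minimal sketch of the two execution strategies mentioned above, using hypothetical class names rather than the RAMCloud API: pulling raw data to the query engine versus pushing a bandwidth-intensive operator (here, a filtering scan) down into the storage system.

```python
class StorageNode:
    """Hypothetical storage node holding a list of rows (not the RAMCloud API)."""

    def __init__(self, rows):
        self.rows = rows

    def read_all(self):
        # Data pull: every row is shipped over the network to the query engine.
        return list(self.rows)

    def scan(self, predicate):
        # Operator push-down: filter locally, ship only the qualifying rows.
        return [row for row in self.rows if predicate(row)]


class QueryEngine:
    def pull_then_filter(self, node, predicate):
        # Network volume equals the full relation size.
        return [row for row in node.read_all() if predicate(row)]

    def push_down(self, node, predicate):
        # Network volume shrinks with the predicate's selectivity.
        return node.scan(predicate)
```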


Industrial Engineering and Engineering Management | 2009

Vertical partitioning in insert-only scenarios for enterprise applications

Martin Grund; Jens Krueger; Christian Tinnefeld; Alexander Zeier

Today's applications have a specific demand for operational reporting: it becomes more important to gain information using analytic-style queries on the current transactional data. In addition, enterprises must keep track of historical data for legal reasons, so they are forced to track any changes in the system. One possibility to record all changes is to use an insert-only data management approach. When using a main-memory database, efficient usage of main memory is a very important factor. In this paper we show how a combination of insert-only data management and vertical partitioning achieves a seamless integration of time-based data management and efficient memory usage. To validate our approach, we present the results of a customer analysis.
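A toy sketch of the insert-only idea (names are illustrative, not the paper's implementation): updates never overwrite a record but append a new, timestamped version, so the full change history remains available while the latest version answers transactional queries.

```python
import time
from collections import defaultdict

class InsertOnlyTable:
    """Toy insert-only store: every change appends a new version of the record."""

    def __init__(self):
        self.versions = defaultdict(list)   # key -> list of (timestamp, row)

    def upsert(self, key, row):
        # Nothing is overwritten, so all changes remain traceable.
        self.versions[key].append((time.time(), row))

    def current(self, key):
        # The most recent version represents the current transactional state.
        entries = self.versions[key]
        return entries[-1][1] if entries else None

    def history(self, key):
        # Time-based access to all recorded versions.
        return list(self.versions[key])
```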


International Conference on Data Engineering | 2014

Parallel join executions in RAMCloud

Christian Tinnefeld; Donald Kossmann; Joos-Hendrik Boese; Hasso Plattner

Modern large-scale storage systems provide not only storage capacity but also processing power. When such a storage system serves as the persistence layer for a database application, it is desirable to utilize its processing power for supporting query execution. In this paper, we evaluate the parallel execution of join operations in Stanford's RAMCloud, a DRAM-based storage system connected via RDMA-enabled network adapters. We a) provide a system model to derive the execution costs for the Grace Join, the Distributed Block Nested Loop Join, and the Cyclo Join algorithms and their corresponding implementations in RAMCloud; b) describe how the execution time for a single join operation depends on factors such as relation sizes, the number of nodes used for a join, and the chosen algorithm; and c) introduce and evaluate a set of heuristics for parameterizing the execution of many join operations in parallel with the goal of maximizing throughput.
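As a point of reference, here is a single-process sketch of a Grace-style hash join, one of the algorithm families named above: both relations are hash-partitioned on the join key so that matching tuples fall into the same partition, after which each partition pair can be joined independently and, in a distributed setting, in parallel on different nodes. This is an illustration, not the RAMCloud implementation.

```python
from collections import defaultdict

def grace_hash_join(r, s, key_r, key_s, partitions=4):
    """Hash-partition both inputs on the join key, then join partition pairs."""
    r_parts, s_parts = defaultdict(list), defaultdict(list)
    for row in r:
        r_parts[hash(key_r(row)) % partitions].append(row)
    for row in s:
        s_parts[hash(key_s(row)) % partitions].append(row)

    result = []
    for p in range(partitions):          # each pair could be handled by a different node
        build = defaultdict(list)
        for row in r_parts[p]:           # build a hash table on one input
            build[key_r(row)].append(row)
        for row in s_parts[p]:           # probe with the other input
            result.extend((match, row) for match in build[key_s(row)])
    return result
```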


Datenbank-Spektrum | 2010

Hauptspeicherdatenbanken für Unternehmensanwendungen (Main-Memory Databases for Enterprise Applications)

Jens Krueger; Martin Grund; Christian Tinnefeld; Benjamin Eckart; Alexander Zeier; Hasso Plattner

Enterprise applications are traditionally divided into OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing). While many research activities of recent years have focused on optimizing this separation, both databases and hardware have evolved, particularly during the last decade. On the one hand, there are data management systems that organize data in a column-oriented fashion and thereby ideally match the access profile of analytical queries. On the other hand, applications today have considerably more main memory at their disposal, which, combined with the likewise greatly increased computing power, makes it possible to keep the complete database of an enterprise in memory in compressed form. Both developments enable complex analytical queries to be processed in fractions of a second and thus enable completely new business processes and applications. Consequently, the question arises whether the artificially introduced separation of OLTP and OLAP can be abolished so that all queries operate on a unified data set. To this end, this article examines the characteristics of data processing in enterprise applications and shows how selected technologies can optimize that processing. A further trend is the use of cloud computing and thus the outsourcing of the data center for cost optimization. This entails requirements on data management regarding dynamic extension and scaling in order to do justice to the concept of cloud computing. The properties of column-oriented main-memory databases offer advantages here, also with regard to more effective utilization of the available hardware resources. An important aspect is that all queries are answered within a defined response time even though the load can fluctuate strongly. Experience shows that the load on existing database systems rises particularly at the end of a quarter. Cloud computing is well suited to providing exactly the right hardware resources at such times. The desired elasticity results in requirements on data management that are examined in this article.


International Database Engineering and Applications Symposium | 2011

Cache-conscious data placement in an in-memory key-value store

Christian Tinnefeld; Alexander Zeier; Hasso Plattner

Key-value stores that keep the data entirely in main memory can serve applications whose performance criteria cannot be met by disk-based key-value stores. This paper evaluates the performance implications of cache-conscious data placement in an in-memory key-value store by examining how many values have to be stored consecutively in blocks in order to fully exploit memory locality during bandwidth-bound operations. We contribute by introducing a random block traversal main-memory access pattern, by describing the corresponding memory access costs, and by formally and experimentally deriving the correlation between block size and throughput. Our calculations and experiments vary the value and block sizes as well as their placement in memory and derive their impact on cache misses throughout the different memory hierarchies, the ability to prefetch data, and the number of CPU cycles needed to perform a certain set of data operations. The paper closes with the insight that a block-wise grouping of relatively few key-value pairs increases throughput by up to a factor of six, and with a discussion of the implications that block-wise grouping of data has on the system design of a key-value store.
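A simplified sketch of block-wise grouping (illustrative only, not the paper's store): several fixed-size values are laid out contiguously per block, so a bandwidth-bound operation that traverses one block reads consecutive memory and can benefit from cache lines and prefetching instead of chasing scattered pointers.

```python
class BlockedValueStore:
    """Toy store that groups fixed-size values block-wise in one contiguous buffer."""

    def __init__(self, values_per_block, value_size):
        self.values_per_block = values_per_block
        self.value_size = value_size
        self.buffer = bytearray()

    def append_block(self, values):
        # All values of a block are placed consecutively in memory.
        assert len(values) == self.values_per_block
        for value in values:
            self.buffer += value.ljust(self.value_size, b"\0")

    def process_block(self, block_id):
        # A bandwidth-bound pass over one block touches only contiguous bytes.
        block_bytes = self.values_per_block * self.value_size
        start = block_id * block_bytes
        return sum(self.buffer[start:start + block_bytes])
```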


Workshop on Big Data Benchmarks | 2014

And All of a Sudden: Main Memory Is Less Expensive Than Disk

Martin Boissier; Carsten Alexander Meyer; Matthias Uflacker; Christian Tinnefeld

Even today, the conventional wisdom for storage is that keeping data in main memory is more expensive than keeping it on disk. While this is true for the price per byte, the picture looks different for the price per bandwidth. For data-driven applications with high throughput demands, I/O bandwidth can easily become the major bottleneck. Comparing the costs of different storage types for a given bandwidth requirement shows that the old wisdom of inexpensive disks and expensive main memory is no longer valid in every case. The higher the bandwidth requirements become, the more cost-efficient main memory is. And all of a sudden: main memory is less expensive than disk.
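A back-of-the-envelope version of this argument, with purely illustrative prices and bandwidths (assumptions, not figures from the paper): once a workload's bandwidth requirement forces you to buy storage devices for their throughput rather than their capacity, main memory can come out cheaper.

```python
# All numbers are illustrative assumptions, not measurements from the paper.
CAPACITY_GB = 1000        # size of the data set
REQUIRED_BW = 50          # aggregate bandwidth the workload needs, in GB/s

def storage_cost(price_per_gb, unit_capacity_gb, unit_bw_gbps):
    # Buy enough units to satisfy BOTH the capacity and the bandwidth requirement.
    units_for_capacity = CAPACITY_GB / unit_capacity_gb
    units_for_bandwidth = REQUIRED_BW / unit_bw_gbps
    units = max(units_for_capacity, units_for_bandwidth)
    return units * unit_capacity_gb * price_per_gb

disk = storage_cost(price_per_gb=0.05, unit_capacity_gb=1000, unit_bw_gbps=0.2)
dram = storage_cost(price_per_gb=8.00, unit_capacity_gb=16, unit_bw_gbps=10.0)
print(f"disk: ${disk:,.0f}, dram: ${dram:,.0f}")  # at 50 GB/s the DRAM setup is cheaper here
```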


Archive | 2016

Related Work and Background

Christian Tinnefeld

This chapter presents the background as well as the related work. Instead of separating related work and background into two chapters, both topics are presented together in one chapter, giving the reader the advantage of understanding the underlying concepts and getting to know the respective related work in one stroke. Each section and subsection is preceded by a short summary. Additional related work is presented where appropriate; for example, Chap. 6 starts by discussing system models that are related to the system model presented in the remainder of this chapter, while Chap. 7 contains an overview of state-of-the-art distributed join algorithms. The four major areas that influence this work are current computing hardware trends, in-memory database management systems, parallel database management systems, and cloud storage systems. The subsequent sections and subsections also include discussions of how the different areas influence each other.


Archive | 2016

Operator Execution on Two Relations

Christian Tinnefeld

The previous chapter describes the possibility of pushing down the execution of database operators into RAMCloud, but the corresponding model is limited to operating on a single database column or relation at a time. This is practical for the execution of a scan operation, but it necessitates breaking a join operation up into two separate operations.


Archive | 2016

Operator Execution on One Relation

Christian Tinnefeld

In this chapter we introduce an execution cost model for AnalyticsDB to analyze the impact of the different parameters induced by the data mapping, the column partitioning, and the design of the operators themselves. We first derive an abstract system model which is later used to predict execution costs analytically for different scenarios. Afterwards, we use our cost model to evaluate the operator push-down and data pull execution strategies and show how the cost model can be used to decide between them. We start with operators that operate on one relation at a time and continue with two or more relations in the next chapter.
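A back-of-the-envelope version of such a cost model (the parameters and values are illustrative assumptions, not the ones derived in this chapter): the data pull strategy pays for shipping the whole column over the network, while operator push-down pays for a remote scan plus shipping only the qualifying fraction.

```python
GB = 1024 ** 3

def data_pull_seconds(column_bytes, net_bw, local_scan_bw):
    # Ship the entire column to the query node, then scan it there.
    return column_bytes / net_bw + column_bytes / local_scan_bw

def push_down_seconds(column_bytes, selectivity, remote_scan_bw, net_bw):
    # Scan on the storage node, ship only the qualifying fraction of the column.
    return column_bytes / remote_scan_bw + selectivity * column_bytes / net_bw

column = 4 * GB
print(data_pull_seconds(column, net_bw=3 * GB, local_scan_bw=8 * GB))
print(push_down_seconds(column, selectivity=0.05, remote_scan_bw=8 * GB, net_bw=3 * GB))
```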

Collaboration


An overview of Christian Tinnefeld's top co-authors.

Top Co-Authors

Hasso Plattner, Hasso Plattner Institute
Jens Krueger, Hasso Plattner Institute
Anja Bog, Hasso Plattner Institute