Gregor Hackenbroich | Researchain

Publication

Featured research published by Gregor Hackenbroich.

international conference on data engineering | 2007

Representing Data Quality for Streaming and Static Data

Anja Klein; Hong-Hai Do; Gregor Hackenbroich; M. Kamstedt; Wolfgang Lehner

In smart item environments, a multitude of sensors are applied to capture data about product conditions and usage to guide business decisions as well as production automation processes. A big issue in this application area is posed by the restricted quality of sensor data due to limited sensor precision as well as sensor failures and malfunctions. Decisions derived from incorrect or misleading sensor data are likely to be faulty. The issue of how to efficiently provide applications with information about data quality (DQ) is still an open research problem. In this paper, we present a flexible model for the efficient transfer and management of data quality for streaming as well as static data. We propose a data stream metamodel to allow for the propagation of data quality from the sensors up to the respective business application without a significant overhead of data. Furthermore, we present the extension of the traditional RDBMS metamodel to permit the persistent storage of data quality information in a relational database. Finally, we demonstrate a data quality metadata mapping to close the gap between the streaming environment and the target database. Our solution maintains a flexible number of DQ dimensions and supports applications directly consuming streaming data or processing data filed in a persistent database.

very large data bases | 2012

A storage advisor for hybrid-store databases

Philipp Rösch; Lars Dannecker; Franz Färber; Gregor Hackenbroich

With the SAP HANA database, SAP offers a high-performance in-memory hybrid-store database. Hybrid-store databases, that is, databases supporting row- and column-oriented data management, are becoming more and more prominent. While the columnar management offers high-performance capabilities for analyzing large quantities of data, the row-oriented store can handle transactional point queries as well as inserts and updates more efficiently. To effectively take advantage of both stores at the same time, the novel question arises of whether to store the given data in a row- or column-oriented fashion. We tackle this problem with a storage advisor tool that supports database administrators in this decision. Our proposed storage advisor recommends the optimal store based on data and query characteristics; its core is a cost model to estimate and compare query execution times for the different stores. Besides a per-table decision, our tool also considers partitioning the data horizontally and vertically and managing the partitions on different stores. We evaluated the storage advisor for use in the SAP HANA database; we show the recommendation quality as well as the benefit of having the data in the optimal store with respect to increased query performance.

statistical and scientific database management | 2011

Context-aware parameter estimation for forecast models in the energy domain

Lars Dannecker; Robert Schulze; Matthias Böhm; Wolfgang Lehner; Gregor Hackenbroich

Continuous balancing of energy demand and supply is a fundamental prerequisite for the stability and efficiency of energy grids. This balancing task requires accurate forecasts of future electricity consumption and production at any point in time. For this purpose, database systems need to be able to rapidly process forecasting queries and to provide accurate results in short time frames. However, time series from the electricity domain pose the challenge that measurements are constantly appended to the time series. Using a naive maintenance approach for such evolving time series would mean a re-estimation of the employed mathematical forecast model from scratch for each new measurement, which is very time consuming. We speed up the forecast model maintenance by exploiting the particularities of electricity time series to reuse previously employed forecast models and their parameter combinations. These parameter combinations and information about the context in which they were valid are stored in a repository. We compare the current context with contexts from the repository to retrieve parameter combinations that were valid in similar contexts as starting points for further optimization. An evaluation shows that our approach improves the maintenance process especially for complex models by providing more accurate forecasts in less time than comparable estimation methods.

IEEE Communications Magazine | 2009

Performance control in wireless sensor networks: the ginseng project - [Global communications news letter]

C. Sreenan; J. Sa Silva; Lars C. Wolf; R. Eiras; Thiemo Voigt; Utz Roedig; Vasos Vassiliou; Gregor Hackenbroich

Research on wireless sensor networks (WSNs) has mainly been focused on protocols and architectures for applications in which network performance assurances are not considered essential, such as agriculture and environmental monitoring. However, for many important areas, such as plant automation and health monitoring, performance assurances are crucial, especially for metrics such as delay and reliability.

very large data bases | 2016

VDDA: automatic visualization-driven data aggregation in relational databases

Uwe Jugel; Zbigniew Jerzak; Gregor Hackenbroich; Volker Markl

Contemporary RDBMS-based systems for visualization of high-volume numerical data have difficulty coping with the hard latency requirements and high ingestion rates of interactive visualizations. Existing solutions for lowering the volume of large data sets disregard the spatial properties of visualizations, resulting in visualization errors. In this work, we introduce VDDA, a visualization-driven data aggregation that models visual aggregation at the pixel level as data aggregation at the query level. Based on the M4 aggregation for producing pixel-perfect line charts from highly reduced data subsets, we define a complete set of data reduction operators that simulate the overplotting behavior of the most frequently used chart types. Relying only on the relational algebra and the common data aggregation functions, our approach is generic and applicable to any visualization system that consumes data stored in relational databases. We demonstrate our visualization-driven data aggregation using real-world data sets from high-tech manufacturing, stock markets, and sports analytics, reducing data volumes by up to two orders of magnitude, while preserving pixel-perfect visualizations, as producible from the raw data.
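
As a rough illustration of the M4 aggregation that the abstract builds on: for each pixel column of a line chart, only the first, last, minimum, and maximum points are kept. The sketch below is plain Python rather than the relational queries the paper describes, and the function name `m4_aggregate` and its parameters are hypothetical, chosen only for this example.

```python
def m4_aggregate(points, t_start, t_end, width):
    """Reduce (t, v) points to at most four points per pixel column.

    points  : iterable of (timestamp, value) pairs
    t_start : left edge of the visible time range (inclusive)
    t_end   : right edge of the visible time range (exclusive)
    width   : chart width in pixel columns
    """
    span = t_end - t_start
    groups = {}
    for t, v in points:
        if not (t_start <= t < t_end):
            continue  # outside the visible range
        # Map the timestamp to its pixel column.
        col = int((t - t_start) * width / span)
        g = groups.get(col)
        if g is None:
            groups[col] = {"first": (t, v), "last": (t, v),
                           "min": (t, v), "max": (t, v)}
            continue
        if t < g["first"][0]:
            g["first"] = (t, v)
        if t >= g["last"][0]:
            g["last"] = (t, v)
        if v < g["min"][1]:
            g["min"] = (t, v)
        if v > g["max"][1]:
            g["max"] = (t, v)

    # Emit the distinct extreme points of each column in time order.
    result = []
    for col in sorted(groups):
        result.extend(sorted(set(groups[col].values())))
    return result


series = [(0, 5.0), (1, 1.0), (2, 9.0), (3, 4.0),
          (4, 2.0), (5, 7.0), (6, 3.0), (7, 8.0)]
reduced = m4_aggregate(series, t_start=0, t_end=8, width=2)
```

With two pixel columns, the second column collapses to two points because its first point is also its minimum and its last point is also its maximum.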

sensor mesh and ad hoc communications and networks | 2009

Performance Control in Wireless Sensor Networks

Cormac J. Sreenan; Utz Roedig; James Brown; Carlo Alberto Boano; Adam Dunkels; Zhitao He; Thiemo Voigt; Vasos Vassiliou; Jorge Sá Silva; Lars C. Wolf; Oliver Wellnitz; Ruben Eiras; Gregor Hackenbroich; Anja Klein; Deepak Agrawal

Most of the currently deployed wireless sensor network applications do not require performance control. The goal of the GINSENG project is sensor networks that meet application-specific performance targets, in particular with respect to latency and reliability. We present scenarios within the GALP oil refinery where the system will be deployed and some initial technical insights with respect to deterministic communication.

international conference on management of data | 2015

Quality-Driven Continuous Query Execution over Out-of-Order Data Streams

Yuanzhen Ji; Hongjin Zhou; Zbigniew Jerzak; Anisoara Nica; Gregor Hackenbroich; Christof Fetzer

Executing continuous queries over out-of-order data streams, where tuples are not ordered according to timestamps, is challenging because high result accuracy and low result latency are two conflicting performance metrics. Although many applications allow trading exact query results for lower latency, they still expect the produced results to meet a certain quality requirement. However, none of the existing disorder handling approaches has considered minimizing the result latency while meeting user-specified requirements on the quality of query results. In this demonstration, we showcase AQ-K-slack, an adaptive, buffer-based disorder handling approach, which supports executing sliding window aggregate queries over out-of-order data streams in a quality-driven manner. By adapting techniques from the field of sampling-based approximate query processing and control theory, AQ-K-slack dynamically adjusts the input buffer size at query runtime to minimize the result latency, while respecting a user-specified threshold on relative errors in produced query results. We demonstrate a prototype stream processing system, which extends SAP Event Stream Processor with the implementation of AQ-K-slack. Through an interactive interface, the audience will learn the effect of different factors, such as the aggregate function, the window specification, the result error threshold, and stream properties, on the latency and the accuracy of query results. Moreover, they can experience the effectiveness of AQ-K-slack in obtaining user-desired latency vs. result accuracy trade-offs, compared to naive disorder handling approaches that make extreme trade-offs. For instance, by sacrificing 1% result accuracy, our system can reduce the result latency by 80% when compared to the state of the art.
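
To make the idea of buffer-based disorder handling concrete, here is a minimal sketch of a plain K-slack buffer with a fixed slack `k`. It is not AQ-K-slack itself: the adaptive, quality-driven adjustment of the buffer size described in the abstract is not implemented, and the function name `k_slack` and the tuple layout are assumptions made for this example.

```python
import heapq


def k_slack(stream, k):
    """Reorder (timestamp, payload) tuples using a fixed slack of k time units.

    A tuple is released once the largest timestamp seen so far is at least
    k ahead of it. A larger k repairs more disorder but adds latency.
    """
    heap = []
    t_max = float("-inf")
    for seq, (ts, payload) in enumerate(stream):
        t_max = max(t_max, ts)
        # seq breaks ties so payloads are never compared directly.
        heapq.heappush(heap, (ts, seq, payload))
        while heap and heap[0][0] <= t_max - k:
            ts_out, _, p = heapq.heappop(heap)
            yield ts_out, p
    # End of stream: flush whatever is still buffered, in timestamp order.
    while heap:
        ts_out, _, p = heapq.heappop(heap)
        yield ts_out, p


arrivals = [(1, "a"), (3, "b"), (2, "c"), (5, "d"), (4, "e"), (7, "f")]
ordered = list(k_slack(arrivals, k=2))
unbuffered = list(k_slack(arrivals, k=0))
```

With `k=2` the output is fully ordered by timestamp; with `k=0` every tuple is released on arrival, so the disorder of the input passes through unchanged. This is the latency versus accuracy trade-off that the abstract says AQ-K-slack tunes at runtime.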

advances in databases and information systems | 2011

Forecasting evolving time series of energy demand and supply

Lars Dannecker; Matthias Böhm; Wolfgang Lehner; Gregor Hackenbroich

Real-time balancing of energy demand and supply requires accurate and efficient forecasting in order to take future consumption and production into account. These balancing capabilities are motivated by emerging energy market developments, which also pose new challenges to forecasting in the energy domain not addressed so far: First, real-time balancing requires accurate forecasts at any point in time. Second, the hierarchical market organization motivates forecasting in a distributed system environment. In this paper, we present an approach that adapts forecasting to the hierarchical organization of today's energy markets. Furthermore, we introduce a forecasting framework, which allows efficient forecasting and forecast model maintenance of time series that evolve due to continuous streams of measurements. This framework includes model evaluation and adaptation techniques that enhance the model maintenance process by exploiting context knowledge from previous model adaptations. With this approach (1) more accurate forecasts can be produced within the same time budget, or (2) forecasts with similar accuracy can be produced in less time.

statistical and scientific database management | 2012

Partitioning and multi-core parallelization of multi-equation forecast models

Lars Dannecker; Matthias Boehm; Wolfgang Lehner; Gregor Hackenbroich

Forecasting is an important analysis technique used in many application domains such as electricity management, sales and retail, and traffic prediction. The employed statistical models already provide very accurate predictions, but recent developments in these domains pose new requirements on the calculation speed of the forecast models. In particular, the frequently used multi-equation models tend to be very complex and their estimation is very time consuming. To still allow the use of these highly accurate forecast models, it is necessary to improve the data processing capabilities of the involved data management systems. For this purpose, we introduce a partitioning approach for multi-equation forecast models that considers the specific data access pattern of these models to optimize the data storage and memory access. With the help of our approach we avoid the redundant reading of unnecessary values and improve the utilization of the CPU cache. Furthermore, we utilize the capabilities of modern multi-core hardware and parallelize the model estimation. Our experimental results on real-world data show speedups of up to 73x for the initial model estimation. Thus, our partitioning and parallelization approach significantly increases the efficiency of multi-equation models.

Archive | 2011

GINSENG Data Processing Framework

Zbigniew Jerzak; Anja Klein; Gregor Hackenbroich

For many applications guided by sensor networks, such as production automation and health monitoring, efficient data processing with performance assurance is crucial, especially for metrics such as delay and reliability. Our study of current middleware approaches showed that they allow neither sophisticated complex event processing nor performance monitoring. In this chapter we present the GINSENG middleware architecture that provides a 3-tier data processing framework to exploit the benefits of basic publish/subscribe systems, traditional event stream processing and complex business rule processing. Furthermore, the GINSENG middleware architecture provides performance control mechanisms, i.e., monitoring metrics and improvement methods, for both the underlying sensor network and the middleware itself. Finally, it supports the constraints of industrial environments by allowing for distributed middleware deployment and data processing.