Thomas Heinze | Researchain

Publication

Featured research published by Thomas Heinze.

distributed event-based systems | 2014

Latency-aware elastic scaling for distributed data stream processing systems

Thomas Heinze; Zbigniew Jerzak; Gregor Hackenbroich; Christof Fetzer

Elastic scaling allows a data stream processing system to react to a dynamically changing query or event workload by automatically scaling in or out. Thereby, both unpredictable load peaks and underload situations can be handled. However, each scaling decision comes with a latency penalty due to the required operator movements. Therefore, in practice, an elastic system might improve the system utilization, but it cannot provide the latency guarantees defined by a service level agreement (SLA). In this paper we introduce an elastic scaling system that optimizes the utilization under certain latency constraints defined by an SLA. Specifically, we present a model that estimates the latency spike created by a set of operator movements. We use this model to build a latency-aware elastic operator placement algorithm, which minimizes the number of latency violations. We show that our solution is able to reduce the 90th percentile of the end-to-end latency by up to 30% and reduce the number of latency violations by 50%. The system utilization achieved by our approach is comparable to that of a scaling strategy that does not use latency as an optimization target.

symposium on cloud computing | 2015

Online parameter optimization for elastic data stream processing

Thomas Heinze; Lars Roediger; Andreas Meister; Yuanzhen Ji; Zbigniew Jerzak; Christof Fetzer

Elastic scaling allows data stream processing systems to dynamically scale in and out to react to workload changes. As a consequence, unexpected load peaks can be handled and the extent of overprovisioning can be reduced. However, the strategies used for elastic scaling of such systems need to be tuned manually by the user. This is an error-prone and cumbersome task, because it requires detailed knowledge of the underlying system and workload characteristics. In addition, the resulting quality of service for a specific scaling strategy is unknown a priori and can be measured only at runtime. In this paper we present an elastic scaling data stream processing prototype that allows trading off monetary cost against the offered quality of service. To that end, we use an online parameter optimization, which minimizes the monetary cost for the user. Using our prototype, a user can specify the expected quality of service as an input to the optimization, which automatically detects significant changes in the workload pattern and adjusts the elastic scaling strategy based on the current workload characteristics. Our prototype is able to reduce the costs for three real-world use cases by 19% compared to a naive parameter setting and by 10% compared to a manually tuned system. In contrast to state-of-the-art solutions, our system provides a stable and good trade-off between monetary cost and quality of service.

international conference on data engineering | 2014

Auto-scaling techniques for elastic data stream processing

Thomas Heinze; Valerio Pappalardo; Zbigniew Jerzak; Christof Fetzer

An elastic data stream processing system is able to handle changes in workload by dynamically scaling out and scaling in. This allows for handling of unexpected load spikes without the need for constant overprovisioning. One of the major challenges for an elastic system is to find the right point in time to scale in or to scale out. Finding such a point is difficult, as it depends on constantly changing workload and system characteristics. In this paper we investigate the application of different auto-scaling techniques for solving this problem. Specifically: (1) we formulate basic requirements for an auto-scaling technique used in an elastic data stream processing system, (2) we use the formulated requirements to select the best auto-scaling techniques, and (3) we evaluate the selected auto-scaling techniques using real-world data. Our experiments show that the auto-scaling techniques used in existing elastic data stream processing systems perform worse than the strategies used in our work.

distributed event-based systems | 2014

Cloud-based data stream processing

Thomas Heinze; Leonardo Aniello; Leonardo Querzoni; Zbigniew Jerzak

In this tutorial we present the results of recent research about the cloud enablement of data streaming systems. We illustrate, based on both industrial and academic prototypes, newly emerging use cases and research trends. Specifically, we focus on novel approaches for (1) scalability and (2) fault tolerance in large-scale distributed streaming systems. In general, new fault tolerance mechanisms strive to be more robust and at the same time introduce less overhead. Novel load balancing approaches focus on elastic scaling over hundreds of instances based on the data and query workload. Finally, we present open challenges for the next generation of cloud-based data stream processing engines.

international conference on distributed computing systems | 2014

Elastic Scaling of a High-Throughput Content-Based Publish/Subscribe Engine

Raphaël Barazzutti; Thomas Heinze; André Martin; Emanuel Onica; Pascal Felber; Christof Fetzer; Zbigniew Jerzak; Marcelo Pasin; Etienne Rivière

Publish/subscribe (pub/sub) infrastructures running as a service on cloud environments offer simplicity and flexibility for composing distributed applications. Provisioning them appropriately is however challenging. The amount of stored subscriptions and incoming publications varies over time, and the computational cost depends on the nature of the applications and in particular on the filtering operation they require (e.g., content-based vs. topic-based, encrypted vs. non-encrypted filtering). The ability to elastically adapt the amount of resources required to sustain given throughput and delay requirements is key to achieving cost-effectiveness for a pub/sub service running in a cloud environment. In this paper, we present the design and evaluation of an elastic content-based pub/sub system: E-STREAMHUB. Specific contributions of this paper include: (1) a mechanism for dynamic scaling, both out and in, of stateful and stateless pub/sub operators, (2) a local and global elasticity policy enforcer maintaining high system utilization and stable end-to-end latencies, and (3) an evaluation using real-world tick workload from the Frankfurt Stock Exchange and encrypted content-based filtering.

distributed event-based systems | 2015

An adaptive replication scheme for elastic data stream processing systems

Thomas Heinze; Mariam Zia; Robert Krahn; Zbigniew Jerzak; Christof Fetzer

A major challenge for cloud-based systems is to be fault tolerant in order to cope with an increasing probability of faults in cloud environments. This is especially true for in-memory computing solutions like data stream processing systems, where a single host failure might result in an unrecoverable information loss. In state-of-the-art data streaming systems, either active replication or upstream backup is applied to ensure fault tolerance; these incur a high resource overhead or a high recovery time, respectively. This paper combines these two fault tolerance mechanisms in one system to minimize the number of violations of a user-defined recovery time threshold and to reduce the overall resource consumption compared to active replication. The system dynamically switches individual operators between the two replication techniques based on the current workload characteristics. Our approach is implemented as an extension of an elastic data stream processing engine, which is able to reduce the number of used hosts due to the smaller replication overhead. Based on a real-world evaluation, we show that our system is able to reduce the resource usage by up to 19% compared to an active replication scheme.

international conference on data engineering | 2017

Revisiting the Design of Data Stream Processing Systems on Multi-Core Processors

Shuhao Zhang; Bingsheng He; Daniel Dahlmeier; Amelie Chi Zhou; Thomas Heinze

Driven by the rapidly increasing demand for handling real-time data streams, many data stream processing (DSP) systems have been proposed. Regardless of their different architectures, those DSP systems mostly aim at scaling out using a cluster of commodity machines and are built around a number of key design aspects: a) pipelined processing with message passing, b) on-demand data parallelism, and c) JVM-based implementation. However, a study of those key design aspects on modern scale-up architectures is lacking, where more CPU cores are being put on the same die and the on-chip cache hierarchies are getting larger, deeper, and more complex. Multiple sockets bring non-uniform memory access (NUMA) effects. In this paper, we revisit the aforementioned design aspects on a modern scale-up server. Specifically, we use a series of applications as micro-benchmarks to conduct detailed profiling studies on Apache Storm and Flink. From the profiling results, we observe two major performance issues: a) the massively parallel execution model causes serious front-end stalls, which are a major performance bottleneck on a single CPU socket, and b) the lack of NUMA-aware mechanisms is a major drawback for the scalability of DSP systems on multi-socket architectures. Addressing these issues should allow DSP systems to exploit modern scale-up architectures, which also benefits scale-out environments. We present our initial efforts on resolving the above-mentioned performance issues, which have shown up to 3.2x and 3.1x improvement on the performance of Storm and Flink, respectively.

distributed event-based systems | 2013

Demo: measuring and estimating monetary cost for cloud-based data stream processing

Thomas Heinze; Patrick Meyer; Zbigniew Jerzak; Christof Fetzer

Recently, thanks to the availability of cloud-based data streaming systems like Yahoo! S4 or Twitter Storm and of virtually unlimited resources in public cloud infrastructures, it has become possible to run stream processing tasks with a new dimension of computational complexity. However, the required resources in terms of CPU, memory, and network bandwidth differ depending on the use case and the applied data streaming system. For the user of such a system, this is directly visible in the monetary cost to be spent on the used resources. The user would therefore like to maximize the ratio between gained performance and monetary cost. In our demonstration we present an approach to measure and estimate the monetary cost for data streaming systems. We present a general scheme to model monetary cost for any combination of a cloud-based data streaming system and a major public cloud provider. Our model can be used as a starting point for optimizing the ratio between monetary cost and performance of streaming systems in general.

extending database technology | 2012

Fault-tolerant complex event processing using customizable state machine-based operators

Thomas Heinze; Zbigniew Jerzak; André Martin; Lenar Yazdanov; Christof Fetzer

Modern Complex Event Processing (CEP) systems often need a high degree of customization in order to implement the required application logic. The use of declarative languages, such as CQL, often leads to complicated and hard-to-maintain application code. In this demo, we show how state machine-based CEP operators help to cope with these problems. State machine-based CEP operators allow for high flexibility as well as reusability of application logic components. A major benefit of the presented solution is its easy integration with existing streaming engines, which we demonstrate using StreamMine, a highly parallel and fault-tolerant streaming engine prototype. In this demo we show: (1) how state machine-based operators allow for an easy definition of custom, reusable CEP operators, (2) how the resulting state machines can be easily combined with existing fault-tolerance techniques within StreamMine, and (3) how the resulting CEP applications can be tested in a cost-efficient way.

distributed event-based systems | 2013

HUGO: real-time analysis of component interactions in high-tech manufacturing equipment (industry article)

Yuanzhen Ji; Thomas Heinze; Zbigniew Jerzak

One of the major problems faced by the high-tech manufacturing industry is the need for automated and timely detection of anomalies that can lead to failures of the manufacturing equipment. Failures of high-tech manufacturing equipment have a direct negative impact on the operating margin and consequently on the profit of the high-tech manufacturing industry. Automated and timely detection of anomalies is a difficult problem, the major challenge being the need to understand the interactions between a large number of machine components. Even very experienced system engineers are not aware of all interactions, especially if those need to be derived from high-velocity sensor data. This, in turn, makes it impossible to recognize early warning signals and take action before a failure happens. In this paper we present HUGO, a system for real-time analysis of component interactions in high-tech manufacturing equipment. HUGO automatically discovers (based on the available sensor data) correlations between machine components and helps engineers analyze them in real time so that they can detect deterioration of the manufacturing equipment conditions in a timely fashion.
