Cagri Balkesen | Researchain

Publication

Featured research published by Cagri Balkesen.

International Conference on Data Engineering | 2013

Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware

Cagri Balkesen; Jens Teubner; Gustavo Alonso; M. Tamer Özsu

The architectural changes introduced with multi-core CPUs have triggered a redesign of main-memory join algorithms. In the last few years, two diverging views have appeared. One approach advocates careful tailoring of the algorithm to the architectural parameters (cache sizes, TLB, and memory bandwidth). The other approach argues that modern hardware is good enough at hiding cache and TLB miss latencies and, consequently, the careful tailoring can be omitted without sacrificing performance. In this paper we demonstrate through experimental analysis of different algorithms and architectures that hardware still matters. Join algorithms that are hardware conscious perform better than hardware-oblivious approaches. The analysis and comparisons in the paper show that many of the claims regarding the behavior of join algorithms that have appeared in the literature are due to selection effects (relative table sizes, tuple sizes, the underlying architecture, using sorted data, etc.) and are not supported by experiments run under different parameter settings. Through the analysis, we shed light on how modern hardware affects the implementation of data operators and provide the fastest implementation of radix join to date, reaching close to 200 million tuples per second.

Very Large Data Bases | 2013

Multi-core, main-memory joins: sort vs. hash revisited

Cagri Balkesen; Gustavo Alonso; Jens Teubner; M. Tamer Özsu

In this paper we experimentally study the performance of main-memory, parallel, multi-core join algorithms, focusing on sort-merge and (radix-)hash join. The relative performance of these two join approaches has been a topic of discussion for a long time. With the advent of modern multi-core architectures, it has been argued that sort-merge join is now a better choice than radix-hash join. This claim is justified based on the width of SIMD instructions (sort-merge outperforms radix-hash join once SIMD is sufficiently wide), and NUMA awareness (sort-merge is superior to hash join in NUMA architectures). We conduct extensive experiments on the original and optimized versions of these algorithms. The experiments show that, contrary to these claims, radix-hash join is still clearly superior, and sort-merge approaches the performance of radix only when very large amounts of data are involved. The paper also provides the fastest implementations of these algorithms, and covers many aspects of modern hardware architectures relevant not only for joins but for any parallel data processing operator.
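For readers unfamiliar with the sort-merge side of this comparison, the following is a minimal single-threaded sketch of a sort-merge equi-join. It is not the paper's implementation, which is parallel and uses SIMD and NUMA-aware optimizations; the function name `sort_merge_join` and the `(key, payload)` tuple layout are assumptions made only for illustration.

```python
def sort_merge_join(r, s):
    """Equi-join two lists of (key, payload) tuples by sorting, then merging.

    Illustrative only: single-threaded, no SIMD, no NUMA awareness.
    """
    r = sorted(r, key=lambda t: t[0])
    s = sorted(s, key=lambda t: t[0])
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][0] < s[j][0]:
            i += 1
        elif r[i][0] > s[j][0]:
            j += 1
        else:
            key = r[i][0]
            # Find the end of the run of equal keys on the s side.
            j_end = j
            while j_end < len(s) and s[j_end][0] == key:
                j_end += 1
            # Pair every r tuple with this key against that run.
            while i < len(r) and r[i][0] == key:
                for k in range(j, j_end):
                    out.append((key, r[i][1], s[k][1]))
                i += 1
            j = j_end
    return out


r = [(3, "a"), (1, "b"), (3, "c")]
s = [(3, "x"), (2, "y"), (3, "z"), (1, "w")]
result = sort_merge_join(r, s)
```

The sort phase dominates the cost here, which is why the width of SIMD instructions features in the argument summarized in the abstract.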

Distributed Event-Based Systems | 2013

Adaptive input admission and management for parallel stream processing

Cagri Balkesen; Nesime Tatbul; M. Tamer Özsu

In this paper, we propose a framework for adaptive admission control and management of a large number of dynamic input streams in parallel stream processing engines. The framework takes as input any available information about input stream behaviors and the requirements of the query processing layer, and adaptively decides how to adjust the entry points of streams to the system. As the optimization decisions propagate early from the input management layer to the query processing layer, the size of the cluster is minimized, the load balance is maintained, and latency bounds of queries are met in a more effective and timely manner. Declarative integration of external meta-data about data sources makes the system more robust and resource-efficient. Additionally, exploiting knowledge about queries moves data partitioning to the input management layer, where better load balance for query processing can be achieved. We implemented these techniques as a part of the Borealis stream processing system and conducted experiments showing the performance benefits of our framework.

Distributed Event-Based Systems | 2013

RIP: run-based intra-query parallelism for scalable complex event processing

Cagri Balkesen; Nihal Dindar; Matthias Wetter; Nesime Tatbul

Recognition of patterns in event streams has become important in many application areas of Complex Event Processing (CEP) including financial markets, electronic health-care systems, and security monitoring systems. In most applications, patterns have to be detected continuously and in real-time over streams that are generated at very high rates, imposing high-performance requirements on the underlying CEP system. For scaling CEP systems to increasing workloads, parallel pattern matching techniques that can exploit multi-core processing opportunities are needed. In this paper, we propose RIP - a Run-based Intra-query Parallelism technique for scalable pattern matching over event streams. RIP distributes input events that belong to individual run instances of a pattern's Finite State Machine (FSM) to different processing units, thereby providing fine-grained partitioned data parallelism. We compare RIP to a state-based alternative which partitions individual FSM states to different processing units instead. Our experiments demonstrate that RIP's partitioned parallelism approach outperforms the pipelined parallelism approach of this state-based alternative, achieving near-linear scalability that is independent of the query pattern definition.
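The idea of partitioning by run instance can be illustrated with a small sketch. This is a simplified illustration and not the RIP algorithm as published: the sequence pattern `PATTERN`, the round-robin assignment, the skip-till-next-match semantics, and all function names are assumptions, and the per-worker loops run sequentially here, although each could in principle run on its own core.

```python
from collections import defaultdict

# Hypothetical sequence pattern: an A event, later a B, later a C.
PATTERN = ["A", "B", "C"]


def assign_runs(events, num_workers):
    """Open a new run at every event matching the first FSM state and
    assign it to a worker round-robin. Returns {worker: [start_index]}."""
    assignment = defaultdict(list)
    run_id = 0
    for i, ev in enumerate(events):
        if ev == PATTERN[0]:
            assignment[run_id % num_workers].append(i)
            run_id += 1
    return assignment


def match_run(events, start):
    """Advance the FSM for one run that begins at events[start].
    Returns the indices of the matched events, or None if incomplete."""
    state, matched = 0, []
    for i in range(start, len(events)):
        if events[i] == PATTERN[state]:
            matched.append(i)
            state += 1
            if state == len(PATTERN):
                return matched
    return None


def run_partitioned_match(events, num_workers=2):
    results = []
    for _worker, starts in assign_runs(events, num_workers).items():
        # Each worker handles only its own runs; no state is shared.
        for s in starts:
            m = match_run(events, s)
            if m is not None:
                results.append(m)
    return sorted(results)


events = ["A", "B", "A", "C", "B", "C"]
matches = run_partitioned_match(events, num_workers=2)
```

Because each run carries its own FSM state, the set of matches does not depend on how many workers the runs are spread over; only the distribution of work changes.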

IEEE Transactions on Knowledge and Data Engineering | 2015

Main-Memory Hash Joins on Modern Processor Architectures

Cagri Balkesen; Jens Teubner; Gustavo Alonso; M. Tamer Özsu

Existing main-memory hash join algorithms for multi-core can be classified into two camps. Hardware-oblivious hash join variants do not depend on hardware-specific parameters. Rather, they consider qualitative characteristics of modern hardware and are expected to achieve good performance on any technologically similar platform. The assumption behind these algorithms is that hardware is now good enough at hiding its own limitations, through automatic hardware prefetching, out-of-order execution, or simultaneous multi-threading (SMT), to make hardware-oblivious algorithms competitive without the overhead of carefully tuning to the underlying hardware. Hardware-conscious implementations, such as (parallel) radix join, aim to maximally exploit a given architecture by tuning the algorithm parameters (e.g., hash table sizes) to the particular features of the architecture. The assumption here is that explicit parameter tuning yields enough performance advantages to warrant the effort required. This paper compares the two approaches under a wide range of workloads (relative table sizes, tuple sizes, effects of sorted data, etc.) and configuration parameters (VM page sizes, number of threads, number of cores, SMT, SIMD, prefetching, etc.). The results show that hardware-conscious algorithms generally outperform hardware-oblivious ones. However, on specific workloads and special architectures with aggressive simultaneous multi-threading, hardware-oblivious algorithms are competitive. The main conclusion of the paper is that, in existing multi-core architectures, it is still important to carefully tailor algorithms to the underlying hardware to get the necessary performance. But processor developments may require revisiting this conclusion in the future.
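The structural difference between the two camps can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names and the `(key, payload)` layout are hypothetical, the radix variant partitions in a single pass (real radix joins typically use multiple passes), and Python will not reproduce the cache and TLB effects the paper measures. The `bits` parameter stands in for the kind of hardware-dependent tuning knob the abstract mentions.

```python
from collections import defaultdict


def no_partition_join(build, probe):
    """Hardware-oblivious style: one hash table over the whole build side."""
    table = defaultdict(list)
    for key, payload in build:
        table[key].append(payload)
    return [(key, b, p) for key, p in probe for b in table.get(key, [])]


def radix_partition(rel, bits):
    """Split a relation into 2**bits partitions on the low bits of the key."""
    mask = (1 << bits) - 1
    parts = [[] for _ in range(1 << bits)]
    for key, payload in rel:
        parts[key & mask].append((key, payload))
    return parts


def radix_join(build, probe, bits=4):
    """Hardware-conscious style: partition both inputs first so that each
    per-partition hash table is small (ideally cache-resident), then join
    matching partitions independently."""
    out = []
    for b_part, p_part in zip(radix_partition(build, bits),
                              radix_partition(probe, bits)):
        out.extend(no_partition_join(b_part, p_part))
    return out


build = [(i, f"r{i}") for i in range(100)]
probe = [(i % 50, f"s{i}") for i in range(200)]
joined = radix_join(build, probe, bits=4)
```

Both functions produce the same set of result tuples; the choice of `bits` affects only how the work is laid out in memory, which is where the tuning to cache and TLB sizes would enter in a native implementation.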

IEEE Pervasive Computing | 2009

Event Processing Support for Cross-Reality Environments

Nihal Dindar; Cagri Balkesen; Katinka Kromwijk; Nesime Tatbul

Complex event processing (CEP) is an essential functionality for cross-reality environments. Through CEP, we can turn raw sensor data generated in the real world into more meaningful information that has some significance for the virtual world. In this article, the authors present DejaVu, a general-purpose event processing system built at ETH Zurich. SmartRFLib, a cross-reality application, builds on DejaVu and enables real-time event detection over RFID data streams feeding a virtual library in Second Life.

Archive | 2011