Is this you? Create Your Porfile

Iraklis Psaroudakis

École Polytechnique Fédérale de Lausanne

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Iraklis Psaroudakis is active.

Explore More

Publication

Featured researches published by Iraklis Psaroudakis.

very large data bases | 2013

Sharing data and work across concurrent analytical queries

Iraklis Psaroudakis; Manos Athanassoulis; Anastasia Ailamaki

Todays data deluge enables organizations to collect massive data, and analyze it with an ever-increasing number of concurrent queries. Traditional data warehouses (DW) face a challenging problem in executing this task, due to their query-centric model: each query is optimized and executed independently. This model results in high contention for resources. Thus, modern DW depart from the query-centric model to execution models involving sharing of common data and work. Our goal is to show when and how a DW should employ sharing. We evaluate experimentally two sharing methodologies, based on their original prototype systems, that exploit work sharing opportunities among concurrent queries at run-time: Simultaneous Pipelining (SP), which shares intermediate results of common sub-plans, and Global Query Plans (GQP), which build and evaluate a single query plan with shared operators. First, after a short review of sharing methodologies, we show that SP and GQP are orthogonal techniques. SP can be applied to shared operators of a GQP, reducing response times by 20%-48% in workloads with numerous common sub-plans. Second, we corroborate previous results on the negative impact of SP on performance for cases of low concurrency. We attribute this behavior to a bottleneck caused by the push-based communication model of SP. We show that pull-based communication for SP eliminates the overhead of sharing altogether for low concurrency, and scales better on multi-core machines than push-based SP, further reducing response times by 82%-86% for high concurrency. Third, we perform an experimental analysis of SP, GQP and their combination, and show when each one is beneficial. We identify a trade-off between low and high concurrency. In the former case, traditional query-centric operators with SP perform better, while in the latter case, GQP with shared operators enhanced by SP give the best results.

very large data bases | 2015

Scaling up concurrent main-memory column-store scans: towards adaptive NUMA-aware data and task placement

Iraklis Psaroudakis; Tobias Scheuer; Norman May; Abdelkader Sellami; Anastasia Ailamaki

Main-memory column-stores are called to efficiently use modern non-uniform memory access (NUMA) architectures to service concurrent clients on big data. The efficient usage of NUMA architectures depends on the data placement and scheduling strategy of the column-store. Most column-stores choose a static strategy that involves partitioning all data across the NUMA architecture, and employing a stealing-based task scheduler. In this paper, we implement different strategies for data placement and task scheduling for the case of concurrent scans. We compare these strategies with an extensive sensitivity analysis. Our most significant findings include that unnecessary partitioning can hurt throughput by up to 70%, and that stealing memory-intensive tasks can hurt throughput by up to 58%. Based on our analysis, we envision a design that adapts the data placement and task scheduling strategy to the workload.

data management on new hardware | 2014

Dynamic fine-grained scheduling for energy-efficient main-memory queries

Iraklis Psaroudakis; Thomas Kissinger; Danica Porobic; Thomas Ilsche; Erietta Liarou; Pinar Tözün; Anastasia Ailamaki; Wolfgang Lehner

Power and cooling costs are some of the highest costs in data centers today, which make improvement in energy efficiency crucial. Energy efficiency is also a major design point for chips that power whole ranges of computing devices. One important goal in this area is energy proportionality, arguing that the systems power consumption should be proportional to its performance. Currently, a major trend among server processors, which stems from the design of chips for mobile devices, is the inclusion of advanced power management techniques, such as dynamic voltage-frequency scaling, clock gating, and turbo modes. A lot of recent work on energy efficiency of database management systems is focused on coarse-grained power management at the granularity of multiple machines and whole queries. These techniques, however, cannot efficiently adapt to the frequently fluctuating behavior of contemporary workloads. In this paper, we argue that databases should employ a fine-grained approach by dynamically scheduling tasks using precise hardware models. These models can be produced by calibrating operators under different combinations of scheduling policies, parallelism, and memory access strategies. The models can be employed at run-time for dynamic scheduling and power management in order to improve the overall energy efficiency. We experimentally show that energy efficiency can be improved by up to 4x for fundamental memory-intensive database operations, such as scans.

tpc technology conference | 2014

Scaling Up Mixed Workloads: A Battle of Data Freshness, Flexibility, and Scheduling

Iraklis Psaroudakis; Florian Wolf; Norman May; Thomas Neumann; Alexander Böhm; Anastasia Ailamaki; Kai-Uwe Sattler

The common “one size does not fit all” paradigm isolates transactional and analytical workloads into separate, specialized database systems. Operational data is periodically replicated to a data warehouse for analytics. Competitiveness of enterprises today, however, depends on real-time reporting on operational data, necessitating an integration of transactional and analytical processing in a single database system. The mixed workload should be able to query and modify common data in a shared schema. The database needs to provide performance guarantees for transactional workloads, and, at the same time, efficiently evaluate complex analytical queries. In this paper, we share our analysis of the performance of two main-memory databases that support mixed workloads, SAP HANA and HyPer, while evaluating the mixed workload CH-benCHmark. By examining their similarities and differences, we identify the factors that affect performance while scaling the number of concurrent transactional and analytical clients. The three main factors are (a) data freshness, i.e., how recent is the data processed by analytical queries, (b) flexibility, i.e., restricting transactional features in order to increase optimization choices and enhance performance, and (c) scheduling, i.e., how the mixed workload utilizes resources. Specifically for scheduling, we show that the absence of workload management under cases of high concurrency leads to analytical workloads overwhelming the system and severely hurting the performance of transactional workloads.

very large data bases | 2016

Adaptive NUMA-aware data placement and task scheduling for analytical workloads in main-memory column-stores

Iraklis Psaroudakis; Tobias Scheuer; Norman May; Abdelkader Sellami; Anastasia Ailamaki

Non-uniform memory access (NUMA) architectures pose numerous performance challenges for main-memory column-stores in scaling up analytics on modern multi-socket multi-core servers. A NUMA-aware execution engine needs a strategy for data placement and task scheduling that prefers fast local memory accesses over remote memory accesses, and avoids an imbalance of resource utilization, both CPU and memory bandwidth, across sockets. State-of-the-art systems typically use a static strategy that always partitions data across sockets, and always allows inter-socket task stealing. In this paper, we show that adapting data placement and task stealing to the workload can improve throughput by up to a factor of 4 compared to a static approach. We focus on highly concurrent workloads dominated by operators working on a single table or table group (copartitioned tables). Our adaptive data placement algorithm tracks the resource utilization of tasks, partitions of tables and table groups, and sockets. When a utilization imbalance across sockets is detected, the algorithm corrects it by moving or repartitioning tables. Also, inter-socket task stealing is dynamically disabled for memory-intensive tasks that could otherwise hurt performance.

statistical and scientific database management | 2015

Extending database task schedulers for multi-threaded application code

Florian Wolf; Iraklis Psaroudakis; Norman May; Anastasia Ailamaki; Kai-Uwe Sattler

Modern databases can run application logic defined in stored procedures inside the database server to improve application speed. The SQL standard specifies how to call external stored routines implemented in programming languages, such as C, C++, or JAVA, to complement declarative SQL-based application logic. This is beneficial for scientific and analytical algorithms because they are usually too complex to be implemented entirely in SQL. At the same time, database applications like matrix calculations or data mining algorithms benefit from multi-threading to parallelize compute-intensive operations. Multi-threaded application code, however, introduces a resource competition between the threads of applications and the threads of the database task scheduler. In this paper, we show that multi-threaded application code can render the databases workload scheduling ineffective and decrease the core throughput of the database by up to 50%. We present a general approach to address this issue by integrating shared memory programming solutions into the task schedulers of databases. In particular, we describe the integration of OpenMP into databases. We implement and evaluate our approach using SAP HANA. Our experiments show that our integration does not introduce overhead, and can improve the throughput of core database operations by up to 15%.

international conference on data engineering | 2015

How to stop under-utilization and love multicores

Anastasia Ailamaki; Erietta Liarou; Pinar Tözün; Danica Porobic; Iraklis Psaroudakis

Hardware trends oblige software to overcome three major challenges against systems scalability: (1) taking advantage of the implicit/vertical parallelism within a core that is enabled through the aggressive micro-architectural features, (2) exploiting the explicit/horizontal parallelism provided by multicores, and (3) achieving predictively efficient execution despite the variability in communication latencies among cores on multisocket multicores. In this three hour tutorial, we shed light on the above three challenges and survey recent proposals to alleviate them. The first part of the tutorial describes the instruction- and data-level parallelism opportunities in a core coming from the hardware and software side. In addition, it examines the sources of under-utilization in a modern processor and presents insights and hardware/software techniques to better exploit the micro-architectural resources of a processor by improving cache locality at the right level of the memory hierarchy. The second part focuses on the scalability bottlenecks of database applications at the level of multicore and multisocket multicore architectures. It first presents a systematic way of eliminating such bottlenecks in online transaction processing workloads, which is based on minimizing unbounded communication, and shows several techniques that minimize bottlenecks in major components of database management systems. Then, it demonstrates the data and work sharing opportunities for analytical workloads, and reviews advanced scheduling mechanisms that are aware of non-uniform memory accesses and alleviate bandwidth saturation.

international conference on management of data | 2014

Reactive and proactive sharing across concurrent analytical queries

Iraklis Psaroudakis; Manos Athanassoulis; Matthaios Olma; Anastasia Ailamaki

Today an ever increasing amount of data is collected and analyzed by researchers, businesses, and scientists in data warehouses (DW). In addition to the data size, the number of users and applications querying data grows exponentially. The increasing concurrency is itself a challenge in query execution, but also introduces an opportunity favoring synergy between concurrent queries. Traditional execution engines of DW follows a query-centric approach, where each query is optimized and executed independently. On the other hand, workloads with increased concurrency have several queries with common parts of data and work, creating the opportunity for sharing among concurrent queries. Sharing can be reactive to the inherently existing sharing opportunities, or proactive by redesigning query operators to maximize the sharing opportunities. This demonstration showcases the impact of proactive and reactive sharing by comparing and integrating representative state-of-the-art techniques: Simultaneous Pipelining (SP), for reactive sharing, which shares intermediate results of common sub-plans, and Global Query Plans (GQP) for proactive sharing, which build and evaluate a single query plan with shared operators. We visually demonstrate, in an interactive interface, the behavior of both sharing approaches on top of a state-of-the-art storage engine using the original prototypes. We show that pull-based sharing for SP eliminates the serialization point imposed by the original push-based approach. Then, we compare, through a sensitivity analysis, the performance of SP and GQP. Finally, we show that SP can improve the performance of GQP for a query mix with common sub-plans.

very large data bases | 2013