Is this you? Create Your Porfile

Ioannis Konstantinou

National Technical University of Athens

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Ioannis Konstantinou is active.

Explore More

Publication

Featured researches published by Ioannis Konstantinou.

international world wide web conferences | 2012

H2RDF: adaptive query processing on RDF data in the cloud.

Nikolaos Papailiou; Ioannis Konstantinou; Dimitrios Tsoumakos; Nectarios Koziris

In this work we present H2RDF, a fully distributed RDF store that combines the MapReduce processing framework with a NoSQL distributed data store. Our system features two unique characteristics that enable efficient processing of both simple and multi-join SPARQL queries on virtually unlimited number of triples: Join algorithms that execute joins according to query selectivity to reduce processing; and adaptive choice among centralized and distributed (MapReduce-based) join execution for fast query responses. Our system efficiently answers both simple joins and complex multivariate queries and easily scales to 3 billion triples using a small cluster of 9 worker nodes. H2RDF outperforms state-of-the-art distributed solutions in multi-join and nonselective queries while achieving comparable performance to centralized solutions in selective queries. In this demonstration we showcase the systems functionality through an interactive GUI. Users will be able to execute predefined or custom-made SPARQL queries on datasets of different sizes, using different join algorithms. Moreover, they can repeat all queries utilizing a different number of cluster resources. Using real-time cluster monitoring and detailed statistics, participants will be able to understand the advantages of different execution schemes versus the input data as well as the scalability properties of H2RDF over both the data size and the available worker resources.

conference on information and knowledge management | 2011

On the elasticity of NoSQL databases over cloud management platforms

Ioannis Konstantinou; Evangelos Angelou; Christina Boumpouka; Dimitrios Tsoumakos; Nectarios Koziris

NoSQL databases focus on analytical processing of large scale datasets, offering increased scalability over commodity hardware. One of their strongest features is elasticity, which allows for fairly portioned premiums and high-quality performance and directly applies to the philosophy of a cloud-based platform. Yet, the process of adaptive expansion and contraction of resources usually involves a lot of manual effort during cluster configuration. To date, there exists no comparative study to quantify this cost and measure the efficacy of NoSQL engines that offer this feature over a cloud provider. In this work, we present a cloud-enabled framework for adaptive monitoring of NoSQL systems. We perform a study of the elasticity feature on some of the most popular NoSQL databases over an open-source cloud platform. Based on these measurements, we finally present a prototype implementation of a decision making system that enables automatic elastic operations of any NoSQL engine based on administrator or application-specified constraints.

ieee/acm international symposium cluster, cloud and grid computing | 2013

Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA

Dimitrios Tsoumakos; Ioannis Konstantinou; Christina Boumpouka; Spyros Sioutas; Nectarios Koziris

This work presents TIRAMOLA, a cloud-enabled, open-source framework to perform automatic resizing of NoSQL clusters according to user-defined policies. Decisions on adding or removing worker VMs from a cluster are modeled as a Markov Decision Process and taken in real-time. The system automatically decides on the most advantageous cluster size according to user-defined policies, it then proceeds on requesting/releasing VM resources from the provider and orchestrating them inside a NoSQL cluster. TIRAMOLAs modular architecture and standard API support allows interaction with most current IaaS platforms and increased customization. An extensive experimental evaluation on an HBase cluster confirms our assertions: The system resizes clusters in real-time and adapts its performance through different optimization strategies, different permissible actions, different input and training loads. Besides the automation of the process, it exhibits a learning feature which allows it to make very close to optimal decisions even with input loads 130% larger or alternating 10 times faster compared to the accumulated information.

international conference on big data | 2013

H 2 RDF+: High-performance distributed joins over large-scale RDF graphs

Nikolaos Papailiou; Ioannis Konstantinou; Dimitrios Tsoumakos; Panagiotis Karras; Nectarios Koziris

The proliferation of data in RDF format calls for efficient and scalable solutions for their management. While scalability in the era of big data is a hard requirement, modern systems fail to adapt based on the complexity of the query. Current approaches do not scale well when faced with substantially complex, non-selective joins, resulting in exponential growth of execution times. In this work we present H2RDF+, an RDF store that efficiently performs distributed Merge and Sort-Merge joins over a multiple index scheme. H2RDF+ is highly scalable, utilizing distributed MapReduce processing and HBase indexes. Utilizing aggressive byte-level compression and result grouping over fast scans, it can process both complex and selective join queries in a highly efficient manner. Furthermore, it adaptively chooses for either single- or multi-machine execution based on join complexity estimated through index statistics. Our extensive evaluation demonstrates that H2RDF+ efficiently answers non-selective joins an order of magnitude faster than both current state-of-the-art distributed and centralized stores, while being only tenths of a second slower in simple queries, scaling linearly to the amount of available resources.

IEEE Transactions on Parallel and Distributed Systems | 2011

Fast and Cost-Effective Online Load-Balancing in Distributed Range-Queriable Systems

Ioannis Konstantinou; Dimitrios Tsoumakos; Nectarios Koziris

Distributed systems such as Peer-to-Peer overlays have been shown to efficiently support the processing of range queries over large numbers of participating hosts. In such systems, uneven load allocation has to be effectively tackled in order to minimize overloaded peers and optimize their performance. In this work, we detect the two basic methodologies used to achieve load-balancing: Iterative key redistribution between neighbors and node migration. We identify these two key mechanisms and describe their relative advantages and disadvantages. Based on this analysis, we propose NIXMIG, a hybrid method that adaptively utilizes these two extremes to achieve both fast and cost-effective load-balancing in distributed systems that support range queries. We theoretically prove its convergence and as a case study, we offer an implementation on top of a Skip Graph, where we thoroughly validate our findings in a variety of static, dynamic and realistic workloads. We compare NIXMIG with an existing load-balancing algorithm proposed by Karger and Ruhl [1] and our experimental analysis shows that, NIXMIG can be as much as three times faster, requiring only one sixth and one third of message and item exchanges, respectively, to bring the system to a balanced state.

international conference on management of data | 2012

TIRAMOLA: elastic nosql provisioning through a cloud management platform

Ioannis Konstantinou; Evangelos Angelou; Dimitrios Tsoumakos; Christina Boumpouka; Nectarios Koziris; Spyros Sioutas

NoSQL databases focus on analytical processing of large scale datasets, offering increased scalability over commodity hardware. One of their strongest features is elasticity, which allows for fairly portioned premiums and high-quality performance. Yet, the process of adaptive expansion and contraction of resources usually involves a lot of manual effort, often requiring the definition of the conditions for scaling up or down to be provided by the users. To date, there exists no open-source system for automatic resizing of NoSQL clusters. In this demonstration, we present TIRAMOLA, a modular, cloud-enabled framework for monitoring and adaptively resizing NoSQL clusters. Our system incorporates a decision-making module which allows for optimal cluster resize actions in order to maximize any quantifiable reward function provided together with life-long adaptation to workload or infrastructural changes. The audience will be able to initiate HBase clusters of various sizes and apply varying workloads through multiple YCSB clients. The attendees will be able to watch, in real-time, the system perform automatic VM additions and removals as well as how cluster performance metrics change relative to the optimization parameters of their choice.

Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud | 2010

Distributed indexing of web scale datasets for the cloud

Ioannis Konstantinou; Evangelos Angelou; Dimitrios Tsoumakos; Nectarios Koziris

In this paper, we present a distributed architecture for indexing and serving large and diverse datasets. It incorporates and extends the functionality of Hadoop, the open source MapReduce framework, and of HBase, a distributed, sparse, NoSQL database, to create a fully parallel indexing system. Experiments with structured, semi-structured and unstructured data of various sizes demonstrate the flexibility, speed and robustness of our implementation and contrast it with similarly oriented projects. Our 11 node cluster prototype managed to keep full-text indexing time of 150GB raw content in less than 3 hours, whereas the systems response time under sustained query load of more than 1000 queries/sec was kept in the order of milliseconds.

international conference on management of data | 2014

H 2 RDF+: an efficient data management system for big RDF graphs

Nikolaos Papailiou; Dimitrios Tsoumakos; Ioannis Konstantinou; Panagiotis Karras; Nectarios Koziris

The proliferation of data in RDF format has resulted in the emergence of a plethora of specialized management systems. While the ability to adapt to the complexity of a SPARQL query -- given their inherent diversity -- is crucial, current approaches do not scale well when faced with substantially complex, non-selective joins, resulting in exponential growth of execution times. In this demonstration we present H2 RDF+, an RDF store that efficiently performs distributed Merge and Sort-Merge joins using a multiple-index scheme over HBase indexes. Through a greedy planner that incorporates our cost-model, it adaptively commands for either single or multi-machine query execution based on join complexity. In this paper, we present its key scientific contributions and allow participants to interact with an H2RDF+ deployment over a Cloud infrastructure. Using a web-based GUI we allow users to load different datasets (both real and synthetic), apply any query (custom or predefined) and monitor its execution. By allowing real-time inspection of cluster status, response times and committed resources the audience will evaluate the validity of H2RDF+s claims and perform direct comparisons to two other state-of-the-art RDF stores.

ieee/acm international symposium cluster, cloud and grid computing | 2015

Dependable Horizontal Scaling Based on Probabilistic Model Checking

Athanasios Naskos; Emmanouela Stachtiari; Anastasios Gounaris; Panagiotis Katsaros; Dimitrios Tsoumakos; Ioannis Konstantinou; Spyros Sioutas

The focus of this work is the on-demand resource provisioning in cloud computing, which is commonly referredto as cloud elasticity. Although a lot of effort has been invested in developing systems and mechanisms that enable elasticity, the elasticity decision policies tend to be designed without quantifying or guaranteeing the quality of their operation. We present an approach towards the development of more formalized and dependable elasticity policies. We make two distinct contributions. First, we propose an extensible approach to enforcing elasticity through the dynamic instantiation and online quantitative verification of Markov Decision Processes(MDP) using probabilistic model checking. Second, various concrete elasticity models and elasticity policies are studied. We evaluate the decision policies using traces from a realNoSQL database cluster under constantly evolving externalload. We reason about the behaviour of different modelling and elasticity policy options and we show that our proposal can improve upon the state-of-the-art in significantly decreasing under-provisioning while avoiding over-provisioning.

international conference on management of data | 2013

DBalancer: distributed load balancing for NoSQL data-stores

Ioannis Konstantinou; Dimitrios Tsoumakos; Ioannis Mytilinis; Nectarios Koziris

Unanticipated load spikes or skewed data access patterns may lead to severe performance degradation in data serving applications, a typical problem of distributed NoSQL data-stores. In these cases, load balancing is a necessary operation. In this demonstration, we present the DBalancer, a generic distributed module that can be installed on top of a typical NoSQL data-store and provide an efficient and highly configurable load balancing mechanism. Balancing is performed by simple message exchanges and typical data movement operations supported by most modern NoSQL data-stores. We present the systems architecture, we describe in detail its modules and their interaction and we implement a suite of different algorithms on top of it. Through a web-based interactive GUI we allow the users to launch NoSQL clusters of various sizes, to apply numerous skewed and dynamic workloads and to compare the implemented load balancing algorithms. Videos and graphs showcasing each algorithms effect on a number of indicative performance and cost metrics will be created on the fly for every setup. By browsing the results of different executions users will be able to grasp each algorithms balancing mechanisms and performance impact in a number of representative setups.

Explore More