Herald Kllapi
National and Kapodistrian University of Athens
Publications
Featured research published by Herald Kllapi.
international conference on management of data | 2011
Herald Kllapi; Eva Sitaridi; Manolis M. Tsangaris; Yannis E. Ioannidis
Scheduling data processing workflows (dataflows) on the cloud is a very complex and challenging task. It is essentially an optimization problem, very similar to query optimization, but it differs from traditional problems in two respects: its space of alternative schedules is very rich, due to the various optimization opportunities that cloud computing offers; and its optimization criterion is at least two-dimensional, with the monetary cost of using the cloud being at least as important as query completion time. In this paper, we study scheduling of dataflows that involve arbitrary data processing operators in the context of three different problems: 1) minimize completion time given a fixed budget, 2) minimize monetary cost given a deadline, and 3) find trade-offs between completion time and monetary cost without any a priori constraints. We formulate these problems and present an approximate optimization framework to address them that uses resource elasticity in the cloud. To investigate the effectiveness of our approach, we incorporate the devised framework into a prototype system for dataflow evaluation and instantiate it with several greedy, probabilistic, and exhaustive search algorithms. Finally, through several experiments that we have conducted with the prototype elastic optimizer on numerous scientific and synthetic dataflows, we identify several interesting general characteristics of the space of alternative schedules as well as the advantages and disadvantages of the various search algorithms. The overall results are quite promising and indicate the effectiveness of our approach.
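The unconstrained trade-off exploration in problem 3 amounts to computing the Pareto front (skyline) of candidate schedules in the (completion time, monetary cost) plane. The sketch below illustrates that idea only; the candidate schedules and the brute-force skyline computation are illustrative assumptions, not the paper's actual optimization framework.

```python
def dominates(a, b):
    # a dominates b if it is no worse on both criteria
    # and strictly better on at least one
    return (a["time"] <= b["time"] and a["cost"] <= b["cost"]
            and (a["time"] < b["time"] or a["cost"] < b["cost"]))

def skyline(schedules):
    # keep only the schedules not dominated by any other candidate
    return [s for s in schedules
            if not any(dominates(o, s) for o in schedules)]

# hypothetical candidate schedules for one dataflow
candidates = [
    {"name": "1-vm", "time": 100, "cost": 10},   # cheap but slow
    {"name": "4-vm", "time": 30,  "cost": 25},   # balanced
    {"name": "8-vm", "time": 28,  "cost": 60},   # fast but expensive
    {"name": "bad",  "time": 110, "cost": 40},   # dominated by 1-vm
]
print([s["name"] for s in skyline(candidates)])  # → ['1-vm', '4-vm', '8-vm']
```

Problems 1 and 2 then reduce to picking, from this front, the fastest schedule within the budget or the cheapest one meeting the deadline.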
international conference on data engineering | 2011
Konstantinos Tsakalozos; Herald Kllapi; Eva Sitaridi; Mema Roussopoulos; Dimitris Paparas; Alex Delis
Modern frameworks, such as Hadoop, combined with the abundance of computing resources from the cloud, offer a significant opportunity to address long-standing challenges in distributed processing. Infrastructure-as-a-Service clouds reduce the investment cost of renting a large data center, while distributed processing frameworks are capable of efficiently harvesting the rented physical resources. Yet, the performance users get out of these resources varies greatly because the cloud hardware is shared by all users. The value for money that cloud consumers achieve makes resource-sharing policies a key factor in both cloud performance and user satisfaction. In this paper, we employ microeconomics to direct the allotment of cloud resources for consumption in highly scalable master-worker virtual infrastructures. Our approach is developed on two premises: the cloud consumer always has a budget, and cloud physical resources are limited. Using our approach, the cloud administration is able to maximize per-user financial profit. We show that there is an equilibrium point at which our method achieves resource sharing proportional to each user's budget. Ultimately, this approach allows us to answer the question of how many resources a consumer should request from the seemingly endless pool provided by the cloud.
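The equilibrium property the abstract mentions, resource shares proportional to each user's budget, can be illustrated with a toy allocation over a finite pool. The function name and numbers below are illustrative, not the paper's microeconomic model.

```python
def proportional_shares(total_resources, budgets):
    # at equilibrium each consumer receives a share of the finite
    # resource pool proportional to their budget
    total_budget = sum(budgets.values())
    return {user: total_resources * b / total_budget
            for user, b in budgets.items()}

shares = proportional_shares(100, {"alice": 30, "bob": 10, "carol": 60})
print(shares)  # → {'alice': 30.0, 'bob': 10.0, 'carol': 60.0}
```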
extended semantic web conference | 2013
Evgeny Kharlamov; Ernesto Jiménez-Ruiz; Dmitriy Zheleznyakov; Dimitris Bilidas; Martin Giese; Peter Haase; Ian Horrocks; Herald Kllapi; Manolis Koubarakis; Özgür Lütfü Özçep; Mariano Rodriguez-Muro; Riccardo Rosati; Michael Schmidt; Rudolf Schlatte; Ahmet Soylu; Arild Waaler
The recently started EU FP7-funded project Optique will develop an end-to-end OBDA system providing scalable end-user access to industrial Big Data stores. This paper presents an initial architectural specification for the Optique system along with the individual system components.
extended semantic web conference | 2013
Diego Calvanese; Martin Giese; Peter Haase; Ian Horrocks; Thomas Hubauer; Yannis E. Ioannidis; Ernesto Jiménez-Ruiz; Evgeny Kharlamov; Herald Kllapi; Johan W. Klüwer; Manolis Koubarakis; Steffen Lamparter; Ralf Möller; Christian Neuenstadt; T. Nordtveit; Özgür L. Özçep; Mariano Rodriguez-Muro; Mikhail Roshchin; F. Savo; Michael Schmidt; Ahmet Soylu; Arild Waaler; Dmitriy Zheleznyakov
Accessing the relevant data in Big Data scenarios is increasingly difficult for both end-users and IT experts, due to the volume, variety, and velocity dimensions of Big Data. This imposes a high cost overhead in data access for large enterprises. For instance, in the oil and gas industry, IT experts spend 30-70% of their time gathering and assessing the quality of data [1]. The Optique project ( http://www.optique-project.eu/ ) advocates a next generation of the well-known Ontology-Based Data Access (OBDA) approach to address the Big Data dimensions and in particular the data access problem. The project aims at solutions that reduce the cost of data access dramatically.
international conference on data engineering | 2014
Herald Kllapi; Boulos Harb; Cong Yu
An increasing number of Web applications, such as friend recommendation, depend on the ability to join objects at scale. The traditional approach is nearest neighbor join (also called similarity join), whose goal is to find, based on a given join function, the closest set of objects, or all the objects within a distance threshold, to each object in the input. The scalability of techniques based on this approach often depends on the characteristics of the objects and the join function. However, many real-world join functions are intricately engineered and constantly evolving, which makes white-box methods that rely on understanding the join function impractical. Finding a technique that can join an extremely large number of objects under complex join functions has always been a tough challenge. In this paper, we propose a practical alternative approach called near neighbor join that, although it does not find the closest neighbors, finds close neighbors, and can do so at extremely large scale when the join functions are complex. In particular, we design and implement a super-scalable system, named SAJ, that is capable of best-effort joining of billions of objects with complex functions. Extensive experimental analysis over large real-world datasets shows that SAJ is scalable and generates good results.
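One best-effort flavor of near neighbor join can be sketched by routing objects to pivot-based partitions and evaluating the expensive, black-box join function only within each partition, trading exactness for scalability. The pivot scheme, parameters, and toy distance below are assumptions for illustration, not SAJ's actual design.

```python
import random

def near_neighbor_join(objects, join_fn, num_pivots=4, seed=0):
    # best-effort: send each object to its closest pivot, then run the
    # (expensive, black-box) join function only within each partition
    rng = random.Random(seed)
    pivots = rng.sample(objects, num_pivots)
    partitions = {i: [] for i in range(num_pivots)}
    for obj in objects:
        closest = min(range(num_pivots), key=lambda i: join_fn(obj, pivots[i]))
        partitions[closest].append(obj)
    pairs = []
    for bucket in partitions.values():
        for i, a in enumerate(bucket):
            for b in bucket[i + 1:]:
                pairs.append((a, b, join_fn(a, b)))
    return pairs

# toy 1-D objects with absolute difference as the black-box join function
objs = [1, 2, 3, 50, 51, 52]
pairs = near_neighbor_join(objs, lambda a, b: abs(a - b), num_pivots=2)
```

Objects near a partition boundary may miss their true closest neighbor, which is exactly the "close rather than closest" compromise the approach accepts.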
international conference on big data | 2016
Christoforos Svingos; Theofilos P. Mailis; Herald Kllapi; Lefteris Stamatogiannakis; Yannis Kotidis; Yannis E. Ioannidis
Big Data applications require real-time processing of complex computations over streaming and static information. Applications such as the diagnosis of power-generating turbines require the integration of high-velocity streaming data with large volumes of static data from multiple sources. In this paper, we study various optimisations related to the efficient processing of streaming and static information. We introduce novel indexing structures for stream processing, a query-planner component that decides when their creation is beneficial, and we examine precomputed summarisations of archived measurements to accelerate the processing of streaming and static information. To put our ideas into practice, we have developed ExaStream, a data stream management system that is scalable, has declarative semantics, supports user-defined functions, and allows efficient execution of complex analytical queries on streaming and static data. Our work is accompanied by an empirical evaluation of our optimisation techniques.
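The general benefit of indexing for mixed stream/static workloads can be shown with a minimal hash-index sketch: build an index over the static relation once, then probe it per arriving tuple instead of rescanning the table. The schema and function names are hypothetical; this is not ExaStream's actual indexing structure or planner.

```python
def index_static(static_rows, key):
    # precompute a hash index over the static (archived) relation
    idx = {}
    for row in static_rows:
        idx.setdefault(row[key], []).append(row)
    return idx

def stream_join(stream, idx, key):
    # probe the index for each arriving tuple instead of rescanning
    for tup in stream:
        for match in idx.get(tup[key], []):
            yield {**tup, **match}

# hypothetical turbine metadata (static) and sensor readings (stream)
turbines = [{"id": 1, "site": "A"}, {"id": 2, "site": "B"}]
idx = index_static(turbines, "id")
readings = [{"id": 1, "temp": 70}, {"id": 2, "temp": 90}, {"id": 3, "temp": 50}]
print(list(stream_join(readings, idx, "id")))
```

A planner component, as described above, would decide whether building such an index pays off for a given query and stream rate.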
statistical and scientific database management | 2012
Harry Dimitropoulos; Herald Kllapi; Omiros Metaxas; Nikolas Oikonomidis; Eva Sitaridi; Manolis M. Tsangaris; Yannis E. Ioannidis
AITION is a scalable, user-friendly, and interactive data mining (DM) platform designed for analyzing large heterogeneous datasets. Implementing state-of-the-art machine learning algorithms, it successfully utilizes generative Probabilistic Graphical Models (PGMs), providing an integrated framework targeting feature selection, Knowledge Discovery (KD), and decision support. At the same time, it offers advanced capabilities for multi-scale data distribution representation, analysis, and simulation, as well as for the identification and modelling of variable associations.

AITION is built on top of the Athena Distributed Processing (ADP) engine, a next-generation dataflow language engine capable of supporting large-scale KD on a variety of distributed platforms, such as ad-hoc clusters, grids, or clouds. On the front end, it offers an interactive visual interface that allows users to explore the results of the KD process. The end result is that users understand not only the process that led to a statistical conclusion, but also the impact of that conclusion on their hypotheses.

In the proposed demonstration, we will show AITION in action at various stages of the knowledge discovery process, showcasing its key features regarding interactivity and scalability against a variety of problems.
IEEE Data(base) Engineering Bulletin | 2009
Manolis M. Tsangaris; George Kakaletris; Herald Kllapi; Giorgos Papanikos; Fragkiskos Pentaris; Paul Polydoras; Eva Sitaridi; Vassilis Stoumpos; Yannis E. Ioannidis
networked systems design and implementation | 2016
Alon Michael Shalita; Brian Karrer; Igor Kabiljo; Arun Dattaram Sharma; Alessandro Presta; Aaron B. Adcock; Herald Kllapi; Michael Stumm
owl: experiences and directions | 2013
Herald Kllapi; Dimitris Bilidas; Ian Horrocks; Yannis E. Ioannidis; Ernesto Jiménez-Ruiz; Evgeny Kharlamov; Manolis Koubarakis; Dmitriy Zheleznyakov