Akrivi Vlachou

Norwegian University of Science and Technology

Publication

Featured research published by Akrivi Vlachou.

international conference on data engineering | 2007

SKYPEER: Efficient Subspace Skyline Computation over Distributed Data

Akrivi Vlachou; Christos Doulkeridis; Yannis Kotidis; Michalis Vazirgiannis

Skyline query processing has received considerable attention in the recent past. Mainly, the skyline query is used to find a set of non-dominated data points in a multidimensional dataset. While most previous work has assumed a centralized setting, in this paper we address the efficient computation of subspace skyline queries in large-scale peer-to-peer (P2P) networks, where the dataset is horizontally distributed across the peers. Relying on a super-peer architecture, we propose a threshold-based algorithm, called SKYPEER, which forwards the skyline query requests among peers in such a way that the amount of transferred data is significantly reduced. For efficient subspace skyline processing, we extend the notion of domination by defining the extended skyline set, which contains all data elements that are necessary to answer a skyline query in any arbitrary subspace. We prove that our algorithm provides the exact answers and we present optimization techniques to reduce communication cost and execution time. Finally, we provide an extensive experimental evaluation showing that SKYPEER performs efficiently and provides a viable solution when a large degree of distribution is required.
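To make the notion of a subspace skyline concrete, here is a minimal centralized sketch. It is not the SKYPEER algorithm: it assumes smaller values are better, uses a brute-force quadratic scan, and the names `dominates` and `skyline` are hypothetical. It illustrates why the full-space skyline alone is not enough to answer subspace queries, which is the motivation the abstract gives for the extended skyline set.

```python
from typing import Sequence, Tuple, List

Point = Tuple[float, ...]


def dominates(p: Point, q: Point, dims: Sequence[int]) -> bool:
    """True if p dominates q on the given dimensions (assumption: smaller is better).

    p must be no worse than q on every dimension and strictly better on at least one.
    """
    return all(p[d] <= q[d] for d in dims) and any(p[d] < q[d] for d in dims)


def skyline(points: List[Point], dims: Sequence[int]) -> List[Point]:
    """Brute-force skyline: keep the points that no other point dominates on dims."""
    return [p for p in points if not any(dominates(q, p, dims) for q in points)]


points = [(1, 2), (1, 3), (2, 1), (3, 3)]
full_space = skyline(points, dims=(0, 1))   # skyline over both dimensions
subspace = skyline(points, dims=(0,))       # skyline over the first dimension only
```

In this toy dataset the point `(1, 3)` is dominated by `(1, 2)` in the full space, yet it ties with `(1, 2)` on the first dimension and therefore belongs to that subspace skyline. A peer that shipped only its full-space skyline would lose such points.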

international conference on data engineering | 2010

Reverse top-k queries

Akrivi Vlachou; Christos Doulkeridis; Yannis Kotidis; Kjetil Nørvåg

Rank-aware query processing has become essential for many applications that return to the user only the top-k objects based on the individual user's preferences. Top-k queries have been mainly studied from the perspective of the user, focusing primarily on efficient query processing. In this work, for the first time, we study top-k queries from the perspective of the product manufacturer. Given a potential product, which are the user preferences for which this product is in the top-k query result set? We identify a novel query type, namely reverse top-k query, that is essential for manufacturers to assess the potential market and impact of their products based on the competition. We formally define reverse top-k queries and introduce two versions of the query, namely monochromatic and bichromatic. We first provide a geometric interpretation of the monochromatic reverse top-k query in the solution space that helps to understand the reverse top-k query conceptually. Then, we study in more detail the case of bichromatic reverse top-k query, which is more interesting for practical applications. Such a query, if computed in a straightforward manner, requires evaluating a top-k query for each user preference in the database, which is prohibitively expensive even for moderate datasets. In this paper, we present an efficient threshold-based algorithm that eliminates candidate user preferences, without processing the respective top-k queries. Furthermore, we introduce an indexing structure based on materialized reverse top-k views in order to speed up the computation of reverse top-k queries. Materialized reverse top-k views trade preprocessing cost for query speed-up in a controllable manner. Our experimental evaluation demonstrates the efficiency of our techniques, which reduce the required number of top-k computations by 1 to 3 orders of magnitude.
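The straightforward evaluation mentioned in the abstract can be sketched as follows. This is only the naive baseline, not the threshold-based algorithm of the paper. It assumes that user preferences are weight vectors, that scores are linear weighted sums, and that lower scores rank higher; the abstract states none of these, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def score(w: Sequence[float], p: Sequence[float]) -> float:
    """Assumed scoring function: linear weighted sum, lower is better."""
    return sum(wi * pi for wi, pi in zip(w, p))


def in_top_k(w: Vector, q: Vector, data: List[Vector], k: int) -> bool:
    """q is in the top-k for preference w if fewer than k data points score strictly better."""
    q_score = score(w, q)
    better = sum(1 for p in data if score(w, p) < q_score)
    return better < k


def reverse_top_k_naive(q: Vector, data: List[Vector],
                        weights: List[Vector], k: int) -> List[Vector]:
    """Bichromatic reverse top-k by brute force: one ranking check per user preference."""
    return [w for w in weights if in_top_k(w, q, data, k)]


data = [(1, 9), (9, 1), (5, 5)]                    # competing products
weights = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]     # stored user preferences
q = (4, 4)                                         # candidate product
result_k1 = reverse_top_k_naive(q, data, weights, k=1)
result_k2 = reverse_top_k_naive(q, data, weights, k=2)
```

Each call scans the whole dataset once per weight vector, so the cost grows with the product of the two set sizes. That is the expense the abstract describes as prohibitive and that the threshold-based algorithm is designed to avoid.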

international conference on management of data | 2008

Angle-based space partitioning for efficient parallel skyline computation

Akrivi Vlachou; Christos Doulkeridis; Yannis Kotidis

Recently, skyline queries have attracted much attention in the database research community. Space partitioning techniques, such as recursive division of the data space, have been used for skyline query processing in centralized, parallel and distributed settings. Unfortunately, such grid-based partitioning is not suitable in the case of a parallel skyline query, where all partitions are examined at the same time, since many data partitions do not contribute to the overall skyline set, resulting in a lot of redundant processing. In this paper we propose a novel angle-based space partitioning scheme using the hyperspherical coordinates of the data points. We demonstrate both formally and through an exhaustive set of experiments that this new scheme is very suitable for skyline query processing in a parallel shared-nothing architecture. The intuition of our partitioning technique is that the skyline points are equally spread to all partitions. We also show that partitioning the data according to the hyperspherical coordinates manages to increase the average pruning power of points within a partition. Our novel partitioning scheme alleviates most of the problems of traditional grid partitioning techniques, thus managing to reduce the response time and share the computational workload more fairly. As demonstrated by our experimental study, our technique outperforms grid partitioning in all cases, thus becoming an efficient and scalable solution for skyline query processing in parallel environments.
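The idea of assigning points to partitions by angle can be illustrated in two dimensions, where the single hyperspherical angle is the polar angle of the point. The sketch below is an assumption-laden simplification: it handles only 2D points with non-negative coordinates and uses equal-width angular sectors, and the function name `angle_partition` is hypothetical. The paper's general d-dimensional scheme is not reproduced here.

```python
import math
from typing import Tuple


def angle_partition(point: Tuple[float, float], num_partitions: int) -> int:
    """Map a 2D point with non-negative coordinates to an angular sector index.

    The polar angle lies in [0, pi/2]; that range is cut into num_partitions
    equal-width sectors (an assumption made for this sketch).
    """
    x, y = point
    angle = math.atan2(y, x)                      # 0 on the x-axis, pi/2 on the y-axis
    idx = int(angle / (math.pi / 2) * num_partitions)
    return min(idx, num_partitions - 1)           # keep points on the y-axis in the last sector


near_x_axis = angle_partition((10.0, 1.0), 3)
diagonal = angle_partition((1.0, 1.0), 3)
near_y_axis = angle_partition((1.0, 10.0), 3)
```

Because the sector depends only on the direction of the point and not on its distance from the origin, every sector stretches from the region near the origin outward. That matches the intuition stated in the abstract that skyline points end up spread across all partitions.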

very large data bases | 2012

A survey of skyline processing in highly distributed environments

Katja Hose; Akrivi Vlachou

During the last decades, data management and storage have become increasingly distributed. Advanced query operators, such as skyline queries, are necessary in order to help users to handle the huge amount of available data by identifying a set of interesting data objects. Skyline query processing in highly distributed environments poses inherent challenges and demands and requires non-traditional techniques due to the distribution of content and the lack of global knowledge. This paper surveys this interesting and still evolving research area, so that readers can easily obtain an overview of the state-of-the-art. We outline the objectives and the main principles that any distributed skyline approach has to fulfill, leading to useful guidelines for developing algorithms for distributed skyline processing. We review in detail existing approaches that are applicable for highly distributed environments, clarify the assumptions of each approach, and provide a comparative performance analysis. Moreover, we study the skyline variants each approach supports. Our analysis leads to a taxonomy of existing approaches. Finally, we present interesting research topics on distributed skyline computation that have not yet been explored.

international conference on management of data | 2008

On efficient top-k query processing in highly distributed environments

Akrivi Vlachou; Christos Doulkeridis; Kjetil Nørvåg; Michalis Vazirgiannis

Lately the advances in centralized database management systems show a trend towards supporting rank-aware query operators, like top-k, that enable users to retrieve only the most interesting data objects. A challenging problem is to support rank-aware queries in highly distributed environments. In this paper, we present a novel approach, called SPEERTO, for top-k query processing in large-scale peer-to-peer networks, where the dataset is horizontally distributed over the peers. Towards this goal, we explore the applicability of the skyline operator for efficiently routing top-k queries in a large super-peer network. Relying on a thresholding scheme, SPEERTO returns the exact results progressively to the user, while the number of queried super-peers and transferred data is minimized. Finally, we propose different variations of SPEERTO that allow balancing between transferred data volume and response time. Through simulations we demonstrate the feasibility of our approach.

very large data bases | 2010

Identifying the most influential data objects with reverse top-k queries

Akrivi Vlachou; Christos Doulkeridis; Kjetil Nørvåg; Yannis Kotidis

Top-k queries are widely applied for retrieving a ranked set of the k most interesting objects based on the individual user preferences. As an example, in online marketplaces, customers (users) typically seek a ranked set of products (objects) that satisfy their needs. Reversing top-k queries leads to a query type that instead returns the set of customers that find a product appealing (it belongs to the top-k result set of their preferences). In this paper, we address the challenging problem of processing queries that identify the top-m most influential products to customers, where influence is defined as the cardinality of the reverse top-k result set. This definition of influence is useful for market analysis, since it is directly related to the number of customers that value a particular product and, consequently, to its visibility and impact in the market. Existing techniques require processing a reverse top-k query for each object in the database, which is prohibitively expensive even for databases of moderate size. In contrast, we propose two algorithms, SB and BB, for identifying the most influential objects: SB restricts the candidate set of objects that need to be examined, while BB is a branch-and-bound algorithm that retrieves the result incrementally. Furthermore, we propose meaningful variations of the query for most influential objects that are supported by our algorithms. Our experiments demonstrate the efficiency of our algorithms both for synthetic and real-life datasets.

international conference on management of data | 2013

Branch-and-bound algorithm for reverse top-k queries

Akrivi Vlachou; Christos Doulkeridis; Kjetil Nørvåg; Yannis Kotidis

Top-k queries return to the user only the k best objects based on the individual user preferences and comprise an essential tool for rank-aware query processing. Assuming a stored data set of user preferences, reverse top-k queries have been introduced for retrieving the users that deem a given database object as one of their top-k results. Reverse top-k queries have already attracted significant interest in research, due to numerous real-life applications such as market analysis and product placement. Currently, the most efficient algorithm for computing the reverse top-k set is RTA. RTA has two main drawbacks when processing a reverse top-k query: (i) it needs to access all stored user preferences, and (ii) it cannot avoid executing a top-k query for each user preference that belongs to the result set. To address these limitations, in this paper, we identify useful properties for processing reverse top-k queries without accessing each user's individual preferences or executing the top-k query. We propose an intuitive branch-and-bound algorithm for processing reverse top-k queries efficiently and discuss novel optimizations to boost its performance. Our experimental evaluation demonstrates the efficiency of the proposed algorithm that outperforms RTA by a large margin.

very large data bases | 2010

Efficient processing of top-k spatial preference queries

João B. Rocha-Junior; Akrivi Vlachou; Christos Doulkeridis; Kjetil Nørvåg

Top-k spatial preference queries return a ranked set of the k best data objects based on the scores of feature objects in their spatial neighborhood. Despite the wide range of location-based applications that rely on spatial preference queries, existing algorithms incur non-negligible processing cost resulting in high response time. The reason is that computing the score of a data object requires examining its spatial neighborhood to find the feature object with highest score. In this paper, we propose a novel technique to speed up the performance of top-k spatial preference queries. To this end, we propose a mapping of pairs of data and feature objects to a distance-score space, which in turn allows us to identify and materialize the minimal subset of pairs that is sufficient to answer any spatial preference query. Furthermore, we present a novel algorithm that improves query processing performance by avoiding examining the spatial neighborhood of the data objects during query execution. In addition, we propose an efficient algorithm for materialization and we describe useful properties that reduce the cost of maintenance. We show through extensive experiments that our approach significantly reduces the number of I/Os and execution time compared to the state-of-the-art algorithms for different setups.
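The scoring step that the abstract identifies as the bottleneck can be written down as a brute-force baseline. This sketch is not the distance-score mapping proposed in the paper. It assumes a single set of feature objects, a circular neighborhood of fixed radius under Euclidean distance, and a score of 0 for objects with no feature in range; all names are hypothetical.

```python
import heapq
import math
from typing import List, Tuple

DataObject = Tuple[float, float]              # (x, y)
FeatureObject = Tuple[float, float, float]    # (x, y, score)


def object_score(obj: DataObject, features: List[FeatureObject], radius: float) -> float:
    """Score of a data object: the highest feature score within the given radius.

    Assumption for this sketch: an object with no feature in range scores 0.0.
    """
    best = 0.0
    for fx, fy, fscore in features:
        if math.dist(obj, (fx, fy)) <= radius:
            best = max(best, fscore)
    return best


def top_k_spatial_preference(objects: List[DataObject], features: List[FeatureObject],
                             radius: float, k: int) -> List[DataObject]:
    """Return the k data objects with the highest neighborhood scores (brute force)."""
    return heapq.nlargest(k, objects, key=lambda o: object_score(o, features, radius))


hotels = [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
restaurants = [(0.0, 1.0, 0.9), (5.0, 6.0, 0.4), (20.0, 20.0, 1.0)]
best_two = top_k_spatial_preference(hotels, restaurants, radius=2.0, k=2)
```

Every data object triggers a scan of its neighborhood at query time, which is the cost the paper aims to avoid by materializing a sufficient subset of data and feature object pairs in advance.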

conference on information and knowledge management | 2006

Constrained subspace skyline computation

Evangelos Dellis; Akrivi Vlachou; Ilya Vladimirskiy; Bernhard Seeger; Yannis Theodoridis

In this paper we introduce the problem of Constrained Subspace Skyline Queries. This class of queries can be thought of as a generalization of subspace skyline queries using range constraints. Although both constrained skyline queries and subspace skyline queries have been addressed previously, the implications of constrained subspace skyline queries have not been examined so far. Constrained skyline queries are usually more expensive than regular skylines. In the case of constrained subspace skyline queries, additional performance degradation is caused through the projection. In order to support constrained skylines for arbitrary subspaces, we present approaches exploiting multiple low-dimensional indexes instead of relying on a single high-dimensional index. Effective pruning strategies are applied to discard points from dominated regions. An important ingredient of our approach is the workload-adaptive strategy for determining the number of indexes and the assignment of dimensions to the indexes. Extensive performance evaluation shows the superiority of our proposed technique compared to its most related competitors.

international conference on data management in grid and p2p systems | 2009

AGiDS: A Grid-Based Strategy for Distributed Skyline Query Processing

João B. Rocha-Junior; Akrivi Vlachou; Christos Doulkeridis; Kjetil Nørvåg

Skyline queries help users make intelligent decisions over complex data, where different and often conflicting criteria are considered. A challenging problem is to support skyline queries in distributed environments, where data is scattered over independent sources. The query response time of skyline processing over distributed data depends on the amount of transferred data and the query processing cost at each server. In this paper, we propose AGiDS, a framework for efficient skyline processing over distributed data. Our approach significantly reduces the amount of transferred data by using a grid-based data summary that captures the data distribution on each server. AGiDS consists of two phases to compute the result: in the first phase the querying server gathers the grid-based summary, whereas in the second phase a skyline request is sent only to the servers that may contribute to the skyline result set, asking only for the points of non-dominated regions. We provide an experimental evaluation showing that our approach performs efficiently and outperforms existing techniques.
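The two-phase idea can be sketched with a uniform grid. The abstract does not specify the form of the summary, so everything below is an assumption: each server reports the set of occupied grid cells, smaller values are better, and a cell is discarded when some occupied cell has a strictly smaller index in every dimension (every point in that cell then dominates every point in the discarded one). All names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, ...]
Cell = Tuple[int, ...]


def grid_summary(points: List[Point], cell_width: float) -> Set[Cell]:
    """Phase 1 (per server): summarize local data as the set of occupied grid cells."""
    return {tuple(int(c // cell_width) for c in p) for p in points}


def cell_dominates(a: Cell, b: Cell) -> bool:
    """Cell a dominates cell b if a's index is strictly smaller in every dimension."""
    return all(ai < bi for ai, bi in zip(a, b))


def candidate_cells(summaries: Dict[str, Set[Cell]]) -> Dict[str, Set[Cell]]:
    """Phase 2 planning (querying server): keep, per server, only non-dominated cells."""
    occupied = set().union(*summaries.values())
    return {
        server: {c for c in cells if not any(cell_dominates(o, c) for o in occupied)}
        for server, cells in summaries.items()
    }


summaries = {
    "A": grid_summary([(1, 1), (15, 15)], cell_width=10),
    "B": grid_summary([(25, 25), (2, 18)], cell_width=10),
    "C": grid_summary([(27, 22)], cell_width=10),
}
plan = candidate_cells(summaries)
servers_to_contact = sorted(s for s, cells in plan.items() if cells)
```

In this toy setup server C holds only data in a dominated cell, so it receives no skyline request at all, and servers A and B are asked only for the points in their surviving cells.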
