Marina Drosou | Researchain

Publication

Featured research published by Marina Drosou.

international conference on management of data | 2010

Search result diversification

Marina Drosou; Evaggelia Pitoura

Result diversification has recently attracted much attention as a means of increasing user satisfaction in recommender systems and web search. Many different approaches have been proposed in the related literature for the diversification problem. In this paper, we survey, classify and comparatively study the various definitions, algorithms and metrics for result diversification.

very large data bases | 2012

DisC diversity: result diversification based on dissimilarity and coverage

Marina Drosou; Evaggelia Pitoura

Recently, result diversification has attracted a lot of attention as a means to improve the quality of results retrieved by user queries. In this paper, we propose a new, intuitive definition of diversity called DisC diversity. A DisC diverse subset of a query result contains objects such that each object in the result is represented by a similar object in the diverse subset and the objects in the diverse subset are dissimilar to each other. We show that locating a minimum DisC diverse subset is an NP-hard problem and provide heuristics for its approximation. We also propose adapting DisC diverse subsets to a different degree of diversification. We call this operation zooming. We present efficient implementations of our algorithms based on the M-tree, a spatial index structure, and experimentally evaluate their performance.
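The two conditions in the definition above (every object is represented by a similar selected object, and selected objects are mutually dissimilar) can be illustrated with a minimal sketch. It assumes objects are points in Euclidean space and that two objects are similar when their distance is at most a hypothetical threshold `r`. The function name `greedy_disc` and the simple scan order are illustrative choices, not the heuristics or the M-tree implementation from the paper, and the subset it returns is valid but not necessarily minimum.

```python
from math import dist


def greedy_disc(points, r):
    """Return indices of a DisC diverse subset of `points` for threshold `r`.

    Illustrative greedy scan only: the result satisfies both DisC
    conditions but is not guaranteed to be of minimum size.
    """
    uncovered = set(range(len(points)))  # objects not yet represented
    selected = []
    for i in range(len(points)):
        if i in uncovered:
            # i is farther than r from every object selected so far,
            # so adding it keeps the selected objects mutually dissimilar.
            selected.append(i)
            # Every object within r of i is now represented by i.
            uncovered -= {j for j in uncovered if dist(points[i], points[j]) <= r}
    return selected


points = [(0, 0), (1, 0), (5, 5), (6, 5), (10, 0)]
print(greedy_disc(points, r=2))
```

Calling the function again with a smaller or larger `r` yields a finer or coarser subset, which is the intuition behind the zooming operation mentioned in the abstract.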

extending database technology | 2010

PerK: personalized keyword search in relational databases through preferences

Kostas Stefanidis; Marina Drosou; Evaggelia Pitoura

Keyword-based search in relational databases allows users to discover relevant information without knowing the database schema or using complicated queries. However, such searches may return an overwhelming number of results, often loosely related to the user intent. In this paper, we propose personalizing keyword database search by utilizing user preferences. Query results are ranked based on both their relevance to the query and their preference degree for the user. To further increase the quality of results, we consider two new metrics that evaluate the goodness of the result as a set, namely coverage of many user interests and content diversity. We present an algorithm for processing preference queries that uses the preferential order between keywords to direct the joining of relevant tuples from multiple relations. We then show how to reduce the complexity of this algorithm by sharing computational steps. Finally, we report evaluation results of the efficiency and effectiveness of our approach.

distributed event-based systems | 2009

Preference-aware publish/subscribe delivery with diversity

Marina Drosou; Kostas Stefanidis; Evaggelia Pitoura

In publish/subscribe systems, users describe their interests via subscriptions and are notified whenever new interesting events become available. Typically, in such systems, all subscriptions are considered equally important. However, due to the abundance of information, users may receive an overwhelming number of events. In this paper, we propose using a ranking mechanism based on user preferences, so that only top-ranked events are delivered to each user. Since top-ranked events are often similar to each other, we also propose increasing the diversity of delivered events. Furthermore, we examine a number of different delivery policies for forwarding ranked events to users, namely a periodic, a sliding-window and a history-based one. We have fully implemented our approach in SIENA, a popular publish/subscribe middleware system, and report experimental results of its deployment.

extending database technology | 2012

Dynamic diversification of continuous data

Marina Drosou; Evaggelia Pitoura

Result diversification has recently attracted considerable attention as a means of increasing user satisfaction in recommender systems, as well as in web and database search. In this paper, we focus on the problem of selecting the k most diverse items from a result set. Whereas previous research has mainly considered the static version of the problem, we address the dynamic case, in which the result set changes over time, as happens, for example, in notification services. We define the Continuous k-Diversity Problem along with appropriate constraints that enforce continuity requirements on the diversified results. Our proposed approach is based on cover trees and supports dynamic item insertion and deletion. The diversification problem is in general NP-complete; we provide theoretical bounds that characterize the quality of our solution based on cover trees with respect to the optimal solution. Finally, we report experimental results concerning the efficiency and effectiveness of our approach on a variety of real and synthetic datasets.
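For orientation, the static version of the problem can be sketched with a generic greedy heuristic. The sketch assumes that diversity is measured as the minimum pairwise distance among the selected items (a common choice that the abstract does not state) and that items are points in Euclidean space. The function name `greedy_max_min` is hypothetical, and this is not the cover tree approach of the paper: it handles neither insertions and deletions nor the continuity constraints.

```python
from math import dist


def greedy_max_min(items, k):
    """Pick up to k indices so that the chosen items are far apart.

    Generic farthest-first heuristic for a static set of items, shown
    only to illustrate the selection problem.
    """
    n = len(items)
    if k <= 0 or n == 0:
        return []
    selected = [0]  # arbitrary seed: the first item
    # min_d[i] = distance from item i to its closest selected item
    min_d = [dist(items[0], p) for p in items]
    while len(selected) < min(k, n):
        candidates = [i for i in range(n) if i not in selected]
        # Take the item that lies farthest from everything chosen so far.
        nxt = max(candidates, key=lambda i: min_d[i])
        selected.append(nxt)
        min_d = [min(d, dist(items[nxt], p)) for d, p in zip(min_d, items)]
    return selected


items = [(0, 0), (1, 0), (10, 0), (5, 0)]
print(greedy_max_min(items, k=3))
```

Recomputing such a selection from scratch whenever the result set changes is what an index-based, incremental approach is meant to avoid.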

very large data bases | 2013

YmalDB: exploring relational databases via result-driven recommendations

Marina Drosou; Evaggelia Pitoura

The typical user interaction with a database system is through queries. However, many times users do not have a clear understanding of their information needs or the exact content of the database. In this paper, we propose assisting users in database exploration by recommending to them additional items, called Ymal (“You May Also Like”) results, that, although not part of the result of their original query, appear to be highly related to it. Such items are computed based on the most interesting sets of attribute values, called faSets, that appear in the result of the original query. The interestingness of a faSet is defined based on its frequency in the query result and in the database. Database frequency estimations rely on a novel approach of maintaining a set of representative rare faSets. We have implemented our approach and report results regarding both its performance and its usefulness.
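The abstract says only that interestingness depends on a faSet's frequency in the query result and in the database. One simple reading, used here purely as an assumption, is the ratio of the two relative frequencies, so that attribute values that are over-represented in the result score above 1. The names `support` and `interestingness`, the dictionary-based rows, and the sample data are all hypothetical, and the sketch scans the whole database instead of using the representative rare faSets described in the paper.

```python
def support(rows, faset):
    """Fraction of rows that match every attribute=value pair in `faset`."""
    if not rows:
        return 0.0
    hits = sum(all(row.get(a) == v for a, v in faset.items()) for row in rows)
    return hits / len(rows)


def interestingness(faset, result, database):
    """Assumed score: relative frequency in the result over that in the database."""
    db_support = support(database, faset)
    if db_support == 0:
        return 0.0  # the faSet never occurs in the database
    return support(result, faset) / db_support


# Hypothetical toy data: the query result is a subset of the database rows.
database = [
    {"genre": "drama", "year": 2010},
    {"genre": "drama", "year": 2011},
    {"genre": "comedy", "year": 2010},
    {"genre": "action", "year": 2012},
]
result = database[:2]

print(interestingness({"genre": "drama"}, result, database))
```

Under this assumed score, a faSet that appears in every result row but in only half of the database rows scores 2, marking it as a candidate for driving recommendations.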

ACM Transactions on Database Systems | 2015

Multiple Radii DisC Diversity: Result Diversification Based on Dissimilarity and Coverage

Marina Drosou; Evaggelia Pitoura

Recently, result diversification has attracted a lot of attention as a means to improve the quality of results retrieved by user queries. In this article, we introduce a novel definition of diversity called DisC diversity. Given a tuning parameter r, which we call radius, we consider two items to be similar if their distance is smaller than or equal to r. A DisC diverse subset of a result contains items such that each item in the result is represented by a similar item in the diverse subset and the items in the diverse subset are dissimilar to each other. We show that locating a minimum DisC diverse subset is an NP-hard problem and provide algorithms for its approximation. We extend our definition to the multiple radii case, where each item is associated with a different radius based on its importance, relevance, or other factors. We also propose adapting DisC diverse subsets to a different degree of diversification by adjusting r, that is, increasing the radius (or zooming-out) and decreasing the radius (or zooming-in). We present efficient implementations of our algorithms based on the M-tree, a spatial index structure, and experimentally evaluate their performance.

conference on information and knowledge management | 2011

ReDRIVE: result-driven database exploration through recommendations

Marina Drosou; Evaggelia Pitoura

Typically, users interact with database systems by formulating queries. However, users often do not have a clear understanding of their information needs or the exact content of the database, so their queries are exploratory in nature. In this paper, we propose assisting users in database exploration by recommending to them additional items that are highly related to the items in the result of their original query. Such items are computed based on the most interesting sets of attribute values (or faSets) that appear in the result of the original user query. The interestingness of a faSet is defined based on its frequency both in the query result and in the database instance. Database frequency estimations rely on a novel approach that employs an ε-tolerance closed rare faSets representation. We report evaluation results of the efficiency and effectiveness of our approach on both real and synthetic datasets.

IEEE Transactions on Knowledge and Data Engineering | 2014

Diverse Set Selection Over Dynamic Data

Marina Drosou; Evaggelia Pitoura

Result diversification has recently attracted considerable attention as a means of increasing user satisfaction in recommender systems, as well as in web and database search. In this paper, we focus on the problem of selecting the k most diverse items from a result set. Whereas previous research has mainly considered the static version of the problem, we address the dynamic case, in which the result set changes over time, as happens, for example, in notification services. We define the CONTINUOUS k-DIVERSITY PROBLEM along with appropriate constraints that enforce continuity requirements on the diversified results. Our proposed approach is based on cover trees and supports dynamic item insertion and deletion. The diversification problem is in general NP-hard; we provide theoretical bounds that characterize the quality of our cover tree solution with respect to the optimal one. Since results are often associated with a relevance score, we extend our approach to account for relevance. Finally, we report experimental results concerning the efficiency and effectiveness of our approach on a variety of real and synthetic datasets.

statistical and scientific database management | 2013

DoS: an efficient scheme for the diversification of multiple search results

Hina A. Khan; Marina Drosou; Mohamed A. Sharaf

Data diversification provides users with a concise and meaningful view of the results returned by search queries. In addition to taming information overload, data diversification reduces data communication costs and enables data exploration. The explosion of big data emphasizes the need for data diversification in modern data management platforms, especially for applications based on web, scientific, and business databases. Achieving effective diversification, however, is a rather challenging task due to the inherently high processing costs of current data diversification techniques. This challenge is further accentuated in a multi-user environment, in which multiple search queries are to be executed and diversified concurrently. In this paper, we propose the DoS scheme, which addresses the problem of scalable diversification of multiple search results. Our experimental evaluation shows the scalability exhibited by DoS under various workload settings, and the significant benefits it provides compared to sequential methods.
