Pavlos Fafalios
University of Crete
Publications
Featured research published by Pavlos Fafalios.
Metadata and Semantics Research | 2013
Yannis Tzitzikas; Carlo Allocca; Chryssoula Bekiari; Yannis Marketakis; Pavlos Fafalios; Martin Doerr; Nikos Minadakis; Theodore Patkos; Leonardo Candela
One of the main characteristics of biodiversity data is its cross-disciplinary nature and the extremely broad range of data types, structures, and semantic concepts that it encompasses. Moreover, biodiversity data, especially in the marine domain, is widely distributed, with few well-established repositories or standard protocols for its archiving, access, and retrieval. Our research aims at providing models and methods that allow integrating such information for publishing, browsing, or querying it. To provide a valid and reliable knowledge ground for enabling semantic interoperability of marine data, in this paper we motivate a top-level ontology, called MarineTLO, that we have designed for this purpose, and discuss its use for creating MarineTLO-based warehouses in the context of a research infrastructure.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2013
Pavlos Fafalios; Yannis Tzitzikas
While more and more semantic data are published on the Web, an important question is how typical web users can access and exploit this body of knowledge. Although existing interaction paradigms in semantic search hide the complexity behind an easy-to-use interface, they have not managed to cover common search needs. In this paper, we present X-ENS (eXplore ENtities in Search), a web search application that enhances classical, keyword-based web searching with semantic information, as a means to combine the pros of both Semantic Web standards and common Web Searching. X-ENS identifies entities of interest in the snippets of the top search results which can be further exploited in a faceted interaction scheme, and thereby can help the user to limit the (often very large) search space to those hits that contain a particular piece of information. Moreover, X-ENS permits the exploration of the identified entities by exploiting semantic repositories.
Information Retrieval Facility Conference | 2012
Pavlos Fafalios; Ioannis Kitsos; Yannis Marketakis; Claudio Baldassarre; Michail Salampasis; Yannis Tzitzikas
In this paper we present a method to enrich classical web searching with entity mining that is performed at query time. The results of entity mining (entities grouped in categories) can complement the query answers with information that is useful to the user and that can be further exploited in a faceted search-like interaction scheme. We show that entity mining over the snippets of the top hits of the answers can be performed in real time. However, mining over the snippets returns fewer entities than mining over the full contents of the hits, and for this reason we report comparative results for these two scenarios. In addition, we show how Linked Data can be exploited for specifying the entities of interest and for providing further information about the identified entities, implementing a kind of entity-based integration of documents and (semantic) data. Finally, we discuss the applicability of this approach to professional search, specifically for the domains of fisheries/aquaculture and patents.
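The idea of mining entities from result snippets and grouping them by category can be illustrated with a minimal sketch. It is not the system described in the abstract: the gazetteer, the category names, and the functions `mine_entities` and `restrict` are hypothetical, and a dictionary lookup stands in for a real entity mining component.

```python
from collections import defaultdict

# Hypothetical gazetteer (assumption): entity name -> category.
GAZETTEER = {
    "tuna": "Species",
    "swordfish": "Species",
    "greece": "Country",
    "spain": "Country",
}


def mine_entities(snippets):
    """Group the entities found in the snippets by category.

    Returns {category: {entity: set of hit indexes mentioning it}}.
    """
    facets = defaultdict(lambda: defaultdict(set))
    for hit_id, snippet in enumerate(snippets):
        # Naive tokenization: split on whitespace, strip punctuation, lowercase.
        tokens = {t.strip(".,;:()").lower() for t in snippet.split()}
        for token in tokens:
            category = GAZETTEER.get(token)
            if category is not None:
                facets[category][token].add(hit_id)
    return {cat: dict(ents) for cat, ents in facets.items()}


def restrict(facets, category, entity):
    """Simulate a facet click: return the sorted hits mentioning the entity."""
    return sorted(facets.get(category, {}).get(entity, set()))


snippets = [
    "Tuna catches reported by Spain in 2010.",
    "Swordfish and tuna fisheries in Greece.",
    "Aquaculture statistics for Greece.",
]
facets = mine_entities(snippets)
tuna_hits = restrict(facets, "Species", "tuna")
greece_hits = restrict(facets, "Country", "greece")
```

Selecting an entity narrows the result list to the hits whose snippets mention it, which is the faceted-style interaction the abstract refers to.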
International World Wide Web Conferences | 2012
Pavlos Fafalios; Ioannis Kitsos; Yannis Tzitzikas
In recent years there has been increasing interest in providing the top search results while the user types a query letter by letter. In this paper we present and demonstrate a family of instant search applications which, apart from instantly showing the top search results, can also show various other kinds of precomputed aggregated information. This paradigm is more helpful for the end user (in comparison to the classic search-as-you-type), since it can combine autocompletion, search-as-you-type, results clustering, faceted search, entity mining, etc. Furthermore, apart from being helpful for the end user, it is also beneficial for the server side. However, the instant provision of such services for a large number of queries, large amounts of precomputed information, and a large number of concurrent users is challenging. We demonstrate how this can be achieved using very modest hardware. Our approach relies on (a) a partitioned trie-based index that exploits the available main memory and disk, and (b) dedicated caching techniques. We report performance results over a server running on a modest personal computer (with 3 GB main memory) that provides instant services for millions of distinct queries and terabytes of precomputed information. Furthermore, these services are tolerant to user typos and to word order.
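A trie that maps each logged query to precomputed information can be sketched in a few lines. This is only an in-memory illustration under simplifying assumptions: it has no partitioning across memory and disk, no caching, and no typo tolerance, and the names `TrieNode`, `InstantIndex`, `add`, and `suggest` are hypothetical rather than taken from the paper.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.payload = None  # precomputed info for the query ending here


class InstantIndex:
    """Toy prefix index: query string -> precomputed payload."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, query, payload):
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.payload = payload

    def suggest(self, prefix, limit=5):
        """Return up to `limit` (query, payload) pairs that start with prefix,
        in lexicographic order."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(prefix, node)]
        while stack and len(results) < limit:
            text, current = stack.pop()
            if current.payload is not None:
                results.append((text, current.payload))
            # Push children in reverse order so they pop alphabetically.
            for ch in sorted(current.children, reverse=True):
                stack.append((text + ch, current.children[ch]))
        return results


index = InstantIndex()
index.add("sea", {"top_hits": ["doc1"]})
index.add("seal", {"top_hits": ["doc2"]})
index.add("search", {"top_hits": ["doc3", "doc4"]})
index.add("sun", {"top_hits": ["doc5"]})
completions = [query for query, _ in index.suggest("sea")]
```

Because the payload is computed offline and stored at the node, answering a keystroke only requires walking the prefix and collecting a handful of stored entries.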
International Journal on Artificial Intelligence Tools | 2015
Pavlos Fafalios; Manolis Baritakis; Yannis Tzitzikas
Named Entity Extraction (NEE) is the process of identifying entities in texts and, very commonly, linking them to related (Web) resources. This task is useful in several applications, e.g. for question answering, annotating documents, post-processing of search results, etc. However, existing NEE tools lack an open or easy configuration, although this is very important for building domain-specific applications. For example, supporting a new category of entities, or specifying how to link the detected entities with online resources, is either impossible or very laborious. In this paper, we show how semantic information (Linked Data) can be exploited in real time for conveniently configuring an NEE system, and we propose a generic model for configuring such services. To explicitly define the semantics of the proposed model, we introduce an RDF/S vocabulary, called “Open NEE Configuration Model”, which allows an NEE service to describe (and publish as Linked Data) its entity mining capabilities, but also to be dynamically configured. To allow relating the output of an NEE process with an applied configuration, we propose an extension of the Open Annotation Data Model which also enables an application to run advanced queries over the annotated data. As a proof of concept, we present X-Link, a fully configurable NEE framework that realizes this approach. Contrary to the existing tools, X-Link allows the user to easily define the categories of entities that are interesting for the application at hand by exploiting one or more semantic Knowledge Bases. The user is also able to update a category and specify how to semantically link and enrich the identified entities. This enhanced configurability allows X-Link to be easily configured for different contexts for building domain-specific applications. To test the approach, we conducted a task-based evaluation with users that demonstrates its usability, and a case study that demonstrates its feasibility.
Journal of Web Semantics | 2014
Pavlos Fafalios; Panagiotis Papadakos
Theophrastus is a system that supports the automatic annotation of (web) documents through entity mining and provides exploration services by exploiting Linked Open Data (LOD), in real-time and only when needed. The system aims at assisting biologists in their research on species and biodiversity. It was based on requirements coming from the biodiversity domain and was awarded the first prize in the Blue Hackathon 2013. Theophrastus has been designed to be highly configurable regarding a number of different aspects like entities of interest, information cards and external search systems. As a result it can be exploited in different contexts and other areas of interest. The provided experimental results show that the proposed approach is efficient and can be applied in real-time.
International Journal on Semantic Web and Information Systems | 2016
Michalis Mountantonakis; Nikos Minadakis; Yannis Marketakis; Pavlos Fafalios; Yannis Tzitzikas
In many applications one has to fetch and assemble pieces of information coming from more than one source for building a semantic warehouse offering more advanced query capabilities. In this paper the authors describe the corresponding requirements and challenges, and they focus on the aspects of quality and value of the warehouse. For this reason they introduce various metrics (or measures) for quantifying its connectivity, and consequently its ability to answer complex queries. The authors demonstrate the behaviour of these metrics in the context of a real and operational semantic warehouse, as well as on synthetically produced warehouses. The proposed metrics allow someone to get an overview of the contribution (to the warehouse) of each source and to quantify the value of the entire warehouse. Consequently, these metrics can be used for advancing data/endpoint profiling and for this reason the authors use an extension of VoID (for making them publishable). Such descriptions can be exploited for dataset/endpoint selection in the context of federated search. In addition, the authors show how the metrics can be used for monitoring a semantic warehouse after each reconstruction reducing thereby the cost of quality checking, as well as for understanding its evolution over time.
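To give a feel for what a connectivity measure between sources might look like, the sketch below computes the share of URIs that two sources have in common. This is an assumption made for illustration and not necessarily one of the metrics defined in the paper; the example triples, the `example.org` URIs, and the function names are all hypothetical.

```python
from itertools import combinations


def uris(triples):
    """Collect every term that looks like an HTTP URI in a list of triples."""
    return {term for triple in triples for term in triple
            if term.startswith("http")}


def common_uri_ratio(source_a, source_b):
    """Jaccard overlap of the URI sets of two sources (0.0 if both are empty)."""
    ua, ub = uris(source_a), uris(source_b)
    union = ua | ub
    return len(ua & ub) / len(union) if union else 0.0


def connectivity_matrix(sources):
    """Pairwise overlap for a dict {source name: list of triples}."""
    return {
        (a, b): common_uri_ratio(sources[a], sources[b])
        for a, b in combinations(sorted(sources), 2)
    }


# Hypothetical sources (assumption): two tiny sets of triples.
sources = {
    "A": [("http://example.org/tuna", "http://example.org/livesIn",
           "http://example.org/Mediterranean")],
    "B": [("http://example.org/tuna", "http://example.org/name", "Thunnus")],
}
matrix = connectivity_matrix(sources)
```

A source whose overlap with every other source is zero contributes data that cannot be joined with the rest of the warehouse, which is the kind of signal such measures are meant to surface.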
International Journal of Semantic Computing | 2014
Pavlos Fafalios; Panagiotis Papadakos; Yannis Tzitzikas
The integration of the classical Web (of documents) with the emerging Web of Data is a challenging vision. In this paper we focus on an integration approach during searching which aims at enriching the responses of non-semantic search systems with semantic information, i.e. Linked Open Data (LOD), and exploiting the outcome for offering advanced exploratory search services which provide an overview of the search space and allow the users to explore the related LOD. We use named entities identified in the search results for automatically connecting search hits with LOD, and we consider a scenario where this entity-based integration is performed at query time with no human effort and no a priori indexing, which is beneficial in terms of configurability and freshness. However, the number of identified entities can be high, and the same is true for the semantic information about these entities that can be fetched from the available LOD. To this end, in this paper we propose a Link Analysis-based method which is used for ranking (and thus selecting to show) the most important semantic information related to the search results. We report the promising results of a survey regarding the marine domain, and comparative results that illustrate the effectiveness of the proposed (PageRank-based) ranking scheme. Finally, we report experimental results regarding efficiency, showing that the proposed functionality can be offered even at query time.
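The PageRank idea behind such a ranking can be sketched with plain power iteration over a small graph of entities. This is standard PageRank, not the specific ranking scheme of the paper; the graph, the damping factor of 0.85, and the function name `pagerank` are assumptions chosen for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over {node: [nodes it links to]}.

    Rank held by nodes without outgoing links is spread evenly over all nodes.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in graph.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical entity graph (assumption): edges are links between resources.
graph = {
    "tuna": ["fish"],
    "swordfish": ["fish"],
    "fish": ["animal"],
    "animal": [],
}
scores = pagerank(graph)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Showing only the first few entries of `ranking` is one way to select which of the many connected resources to display next to the search results.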
Web Information Systems Engineering | 2011
Pavlos Fafalios; Yannis Tzitzikas
Search-As-You-Type (or Instant Search) is a recently introduced functionality which shows predictive results while the user types a query letter by letter. In this paper we generalize and propose an extension of this technique which, apart from showing the first page of results on the fly, also shows various other kinds of information, e.g. the outcome of results clustering techniques, or metadata-based groupings of the results. Although this functionality is more informative than the classic search-as-you-type, since it combines Autocompletion, Search-As-You-Type, and Results Clustering, the provision of real-time interaction is more challenging. To tackle this issue we propose an approach based on precomputed information, and we comparatively evaluate various index structures for making real-time interaction feasible, even if the size of the available memory space is limited. This comparison reveals the memory/performance trade-off and allows deciding which index structure to use according to the available main memory and desired performance. Furthermore, we show that an incremental algorithm can be used to keep the index structure fresh.
International Conference on Theory and Practice of Digital Libraries | 2016
Pavlos Fafalios; Thanos Yannakis; Yannis Tzitzikas
A constantly increasing number of data providers publish their data on the Web in the RDF format as Linked Data. SPARQL is the standard query language for retrieving and manipulating RDF data. However, the majority of SPARQL implementations require the data to be available in advance (in main memory or in a repository), thereby not exploiting the real-time and dynamic nature of Linked Data. In this paper we present SPARQL-LD, an extension of SPARQL 1.1 Federated Query that allows one to directly fetch and query RDF data from any Web source. Using SPARQL-LD, one can even query a dataset coming from the partial results of a query (i.e., discovered at query execution time), or RDF data that is dynamically created by Web Services. Such functionality motivates Web publishers to adopt the Linked Data principles and enrich their digital contents and services with RDF, since their data is made directly accessible and exploitable via SPARQL (without needing to set up and maintain an endpoint). In this paper, we showcase the benefits offered by SPARQL-LD through an example related to the Europeana digital library, we report experimental results that demonstrate the feasibility of SPARQL-LD, and we introduce optimizations that improve its efficiency.