Sofia Stamou
University of Patras
Publication
Featured research published by Sofia Stamou.
european conference on information retrieval | 2010
Sofia Stamou; Efthimis N. Efthimiadis
The lack of user activity on search results was until recently perceived as a sign of user dissatisfaction with retrieval performance, and such inactivity was often referred to as a failed search (negative search abandonment). However, recent studies suggest that some search tasks can be accomplished from the contents of the displayed results without the need to click through them (positive search abandonment); thus they emphasize the need to discriminate between successful and failed searches without follow-up clicks. In this paper, we study users’ inactivity on search results in relation to their pursued search goals and investigate the impact of displayed results on user clicking decisions. Our study examines two types of post-query user inactivity, pre-determined and post-determined, depending on whether the user started searching with a preset intention to look for answers only within the result snippets, with no intention of clicking through the results, or decided not to click after reviewing the list of retrieved documents. Our findings indicate that 27% of web searches in our sample are conducted with a pre-determined intention to look for answers in the result list and 75% of them can be satisfied by the contents of the displayed results. Moreover, in nearly half the queries that did not yield result visits, the desired information is found in the result snippets.
web information and data management | 2005
Vlassis Krikos; Sofia Stamou; Pavlos Kokosis; Alexandros Ntoulas; Dimitris Christodoulakis
Web Directories are repositories of Web pages organized in a hierarchy of topics and sub-topics. In this paper, we present DirectoryRank, a ranking framework that orders the pages within a given topic according to how informative they are about the topic. Our method works in three steps: first, it processes Web pages within a topic in order to extract structures that are called lexical chains, which are then used for measuring how informative a page is for a particular topic. Then, it measures the relative semantic similarity of the pages within a topic. Finally, the two metrics are combined for ranking all the pages within a topic before presenting them to the users.
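The final combination step described above can be illustrated with a minimal Python sketch. The abstract does not state how the two metrics are combined, so the weighted sum, the `alpha` parameter, the function name `rank_pages`, and the sample scores are all assumptions made for illustration, not the DirectoryRank formula.

```python
from typing import Dict, List


def rank_pages(
    informativeness: Dict[str, float],
    similarity: Dict[str, float],
    alpha: float = 0.5,
) -> List[str]:
    """Order pages by a weighted sum of two per-page scores.

    The linear combination and `alpha` are illustrative assumptions,
    not the formula used by DirectoryRank.
    """
    combined = {
        page: alpha * informativeness[page] + (1.0 - alpha) * similarity[page]
        for page in informativeness
    }
    # Highest combined score first; ties broken by page name for determinism.
    return sorted(combined, key=lambda page: (-combined[page], page))


# Hypothetical scores for three pages listed under one topic.
pages_info = {"a.html": 0.9, "b.html": 0.4, "c.html": 0.6}
pages_sim = {"a.html": 0.2, "b.html": 0.8, "c.html": 0.7}
ranking = rank_pages(pages_info, pages_sim)
```

Setting `alpha=1.0` ranks by informativeness alone, which makes it easy to see how much the similarity metric changes the order.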
international conference natural language processing | 2005
Sofia Stamou; Vlassis Krikos; Pavlos Kokosis; Alexandros Ntoulas; Dimitris Christodoulakis
Web Directories provide a way of locating relevant information on the Web. Typically, Web Directories rely on humans investing significant time and effort in finding important pages on the Web and categorizing them in the Directory. In this paper we present a way of automating the creation of a Web Directory. At a high level, our method takes as input a subject hierarchy and a collection of pages. We first leverage a variety of lexical resources from the Natural Language Processing community to enrich our hierarchy. After that, we process the pages and identify sequences of important terms, which are referred to as lexical chains. Finally, we use the lexical chains to decide where in the enriched subject hierarchy we should assign every page. Our experimental results with real Web data show that our method is quite promising for assisting humans during page categorization.
international conference on computational linguistics | 2005
Sofia Stamou; Dimitris Christodoulakis
In this paper we experimentally study the impact of normalized query expansion on Web Information Retrieval. To this end, we have implemented a query expansion module, which first normalizes user-submitted queries and then attempts to enrich them with semantically related terms obtained from WordNet. Experimental results demonstrate that for certain query types our module has the potential to improve search results in terms of relevance, compared to the results retrieved for the same queries by other retrieval methods.
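The normalize-then-expand pipeline can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the small `RELATED_TERMS` dictionary is a hypothetical stand-in for WordNet, and the normalization rule (lowercasing and keeping alphanumeric tokens) is a guess at what normalization might involve, not the module described in the paper.

```python
import re
from typing import Dict, List

# Toy stand-in for a lexical resource such as WordNet (hypothetical entries).
RELATED_TERMS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "buy": ["purchase"],
}


def normalize(query: str) -> List[str]:
    """Lowercase the query and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def expand(query: str, max_per_term: int = 2) -> List[str]:
    """Return the normalized tokens followed by related terms not already present."""
    tokens = normalize(query)
    expanded = list(tokens)
    for token in tokens:
        # Cap the number of related terms added per query token.
        for related in RELATED_TERMS.get(token, [])[:max_per_term]:
            if related not in expanded:
                expanded.append(related)
    return expanded


expanded_query = expand("Buy a CAR!")
```

Capping the related terms per token with `max_per_term` is one simple way to keep the expanded query from drifting away from the original intent.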
international conference natural language processing | 2000
Alexandros Ntoulas; Sofia Stamou; I. Tsakou; Christos Tsalidis; Manolis Tzagarakis; Aristides Th. Vagelatos
Greek WordNet is a project that aims to develop a database of wordnets for the Greek language, structured along the same lines as the EuroWordNet project. This contribution presents the morphosyntactic lexicon, which will be used as the basis for the development of the whole project. This lexicon was developed within the framework of a spelling correction system. Later on, it was enhanced by adding syntactic information for each lemma and by using a relational database for the storage and management of the data.
asia pacific web conference | 2006
Sofia Stamou; Alexandros Ntoulas; Vlassis Krikos; Pavlos Kokosis; Dimitris Christodoulakis
Web Directories have emerged as an alternative to search engines for locating information on the Web. Typically, Web Directories rely on humans investing significant time and effort in finding important pages on the Web and categorizing them in the Directory. In this paper, we experimentally study the automatic population of a Web Directory via the use of a subject hierarchy. For our study, we have constructed a subject hierarchy for the top-level topics offered in Dmoz, by leveraging ontological content from available lexical resources. We first describe how we built our subject hierarchy. Then, we analytically present how the hierarchy can help in the construction of a Directory. We also introduce a ranking formula for sorting the pages listed in every Directory topic, based on the pages’ quality, and we experimentally study the efficiency of our approach against other popular methods for creating Directories.
Journal of the Association for Information Science and Technology | 2012
Christos Makris; Yannis Plegas; Sofia Stamou
In this article, we propose new word sense disambiguation strategies for resolving the senses of polysemous query terms issued to Web search engines, and we explore the application of those strategies when used in a query expansion framework. The novelty of our approach lies in the exploitation of the Web page PageRank values as indicators of the significance the different senses of a term carry when employed in search queries. We also aim at scalable query sense resolution techniques that can be applied without loss of efficiency to large data sets such as those on the Web. Our experimental findings validate that the proposed techniques perform more accurately than do the traditional disambiguation strategies and improve the quality of the search results, when involved in query expansion.
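One way to picture PageRank values acting as indicators of sense significance is the Python sketch below. It assumes that each retrieved page has already been labeled with the sense in which it uses the query term, and it scores a sense by the total PageRank of its pages. Both the labeling and the summation rule are assumptions for illustration; the article's actual strategies may differ.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def pick_sense(pages: List[Tuple[str, float]]) -> str:
    """Return the sense whose associated pages carry the most total PageRank.

    Each item is a (sense_label, pagerank) pair. Summing PageRank per sense
    is an illustrative assumption. Raises ValueError on an empty list.
    """
    totals: Dict[str, float] = defaultdict(float)
    for sense, pagerank in pages:
        totals[sense] += pagerank
    # Iterate over sorted labels so that ties resolve alphabetically.
    return max(sorted(totals), key=lambda sense: totals[sense])


# Hypothetical pages matching the query term "jaguar".
jaguar_pages = [("animal", 0.10), ("car", 0.30), ("animal", 0.15), ("car", 0.05)]
chosen = pick_sense(jaguar_pages)
```

A single pass over the pages keeps the cost linear in the number of results, in the spirit of the scalability goal stated above.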
web intelligence | 2011
Nikos Kirtsis; Sofia Stamou
Web searches are driven by information needs and aim at accomplishing specific tasks. Information needs are determined by the topical subject of queries, i.e. what we search, while tasks are determined by the user motives that induce the submission of queries, i.e. why we search. Though there exist numerous studies on how to assist searchers in specifying queries that express their underlying information needs, little has been done to help searchers specify queries that describe the tasks they pursue via their searches. In this paper we propose a query reformulation method to empower task-oriented web searches. Given a query, our method starts with the identification of terms that could serve as descriptors of the potential search tasks the query represents. Based on the identified terms, it generates query reformulations that explicitly verbalize the possible search tasks. Query reformulations are presented to the user, who selects the one that best suits her search intention.
acm symposium on applied computing | 2013
Yannis Plegas; Sofia Stamou
It is well known that the web contains many duplicate and near-duplicate documents. Despite the efforts that have been put into equipping search engines with duplicate detection algorithms, there are still cases where the documents retrieved in response to web queries contain redundant information. In this paper, we are concerned with effectively identifying and reducing redundant information in search results. In particular, we describe how we automatically detect content that is lexically and/or semantically duplicated across search results and we introduce a novel algorithm that, upon detecting significant (i.e., above a given threshold) content duplication, filters out redundant information. Information filtering takes place in two steps depending on whether we are dealing with documents of (nearly) identical lexical content or with documents of lexically distinct but semantically equivalent content. In the first case, our algorithm retains in the result list the document that is the most relevant to the query intention and removes duplicates. In the second case, our algorithm merges the documents of redundant information into a single text, which we call SuperText, in a way that every document contributes diverse semantic content to the generated SuperText. Additionally, the algorithm re-ranks the remaining documents based on their contextual relevance to the query intention. The experimental evaluation of our approach demonstrates that it is very effective in identifying lexical and semantic information redundancy across search results. In addition, we have found that our algorithm successfully filters out content duplication from the result list and that the SuperTexts it generates for reducing information redundancy are syntactically and semantically coherent texts.
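The first case, lexical near-duplicates filtered above a threshold with the most relevant document retained, can be sketched in Python as follows. The abstract does not name a similarity measure, so Jaccard similarity over word shingles is swapped in here as a common stand-in. The function names, the shingle size `k`, and the `threshold` default are assumptions, and the semantic SuperText merging step is not covered.

```python
from typing import List, Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_duplicates(
    results: List[Tuple[str, float]], threshold: float = 0.8
) -> List[str]:
    """Keep the most relevant text from each group of lexical near-duplicates.

    `results` holds (text, relevance) pairs; relevance scores are assumed given.
    """
    kept: List[Tuple[str, Set]] = []
    # Visit the most relevant results first so they win over their duplicates.
    for text, _score in sorted(results, key=lambda r: -r[1]):
        sig = shingles(text)
        if all(jaccard(sig, other) < threshold for _, other in kept):
            kept.append((text, sig))
    return [text for text, _ in kept]


# Hypothetical search results with relevance scores.
search_results = [
    ("the quick brown fox jumps over the lazy dog", 0.7),
    ("The quick brown fox jumps over the lazy dog", 0.9),
    ("search engines rank web pages by relevance", 0.8),
]
filtered = filter_duplicates(search_results)
```

Processing results in descending relevance means each surviving document is the most relevant member of its duplicate group, at the cost of a pairwise comparison against every document kept so far.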
asia-pacific web conference | 2009
Sofia Stamou; Lefteris Kozanidis
A considerable fraction of web queries contain named entities. This, coupled with the fact that a proper name might refer to multiple entities, creates an ever-increasing need for search engines to handle named entity queries efficiently. In this paper, we present a technique that automatically identifies the distinct subject classes to which a named entity query might refer and selects a set of appropriate facets for denoting the query properties within every class. We also suggest a method that examines the distribution of the identified query facets within the contents of the query matching pages and groups search results according to their entity denotation types. Our preliminary study shows that our technique identifies useful facets for representing the named entity query properties in each of their referenced subject classes.