Publication


Featured research published by Jacques Savoy.


European Conference on Information Retrieval | 2003

Term proximity scoring for keyword-based retrieval systems

Yves Rasolofo; Jacques Savoy

This paper suggests the use of proximity measurement in combination with the Okapi probabilistic model. First, using the Okapi system, our investigation was carried out in a distributed retrieval framework to calculate the same relevance score as that achieved by a single centralized index. Second, by applying a term-proximity scoring heuristic to the top documents returned by a keyword-based system, our aim is to enhance retrieval performance. Our experiments were conducted using the TREC8, TREC9 and TREC10 test collections, and show that the suggested approach is stable and generally tends to improve retrieval effectiveness, especially among the top-ranked documents.
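
As a rough illustration of the second step, the sketch below re-ranks the top documents of a keyword-based run by adding a proximity bonus to each base score. The bonus formula (inverse squared distance between distinct query terms), the window of five tokens, and the names `proximity_bonus` and `rerank` are assumptions made for this example; they are not taken from the abstract and should not be read as the authors' exact heuristic.

```python
from itertools import combinations

def proximity_bonus(doc_tokens, query_terms, window=5):
    """Sum 1/d^2 over pairs of distinct query terms at most `window` tokens apart.

    Illustrative formula only; the window size is an assumed parameter.
    """
    positions = [(i, tok) for i, tok in enumerate(doc_tokens) if tok in query_terms]
    bonus = 0.0
    for (i, a), (j, b) in combinations(positions, 2):
        distance = j - i  # positions are in increasing order, so distance >= 1
        if a != b and distance <= window:
            bonus += 1.0 / (distance * distance)
    return bonus

def rerank(top_docs, query_terms, window=5):
    """Re-rank (doc_id, base_score, tokens) triples by base score plus bonus."""
    terms = set(query_terms)
    rescored = [
        (doc_id, base + proximity_bonus(tokens, terms, window))
        for doc_id, base, tokens in top_docs
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

Only the top documents of the initial run are rescored, so the extra cost stays small compared with the first-pass retrieval.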


Journal of the Association for Information Science and Technology | 1999

A stemming procedure and stopword list for general French corpora

Jacques Savoy

Due to the increasing use of network-based systems, there is a growing interest in access to and search mechanisms for text databases in languages other than English. To adapt searching systems to those foreign languages with characteristics similar to the English language, all we need to do for the most part is to establish a general stopword list and a stemming procedure. This article presents the tools needed to establish these in the French language databases and some retrieval experiments that have been carried out using two medium-sized French language test collections. These experiments were conducted to evaluate the retrieval effectiveness of the propositions described.
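
To make the two ingredients concrete, here is a toy sketch of an indexing step that drops stopwords and applies a very light suffix-stripping rule. The stopword set and the single plural rule are hypothetical placeholders chosen for this example; they are not the list or the stemming procedure proposed in the article.

```python
# Tiny illustrative stopword set (assumed, not the article's list).
STOPWORDS = {"le", "la", "les", "de", "des", "du", "un", "une", "et", "à", "en"}

def light_stem(word):
    """Toy rule: strip a final plural 's' or 'x' from words longer than 4 letters."""
    if len(word) > 4 and word[-1] in "sx":
        return word[:-1]
    return word

def index_terms(text):
    """Lowercase, split on whitespace, drop stopwords, then stem each word."""
    return [light_stem(w) for w in text.lower().split() if w not in STOPWORDS]
```

A real procedure would also handle punctuation, elision and further suffixes; the point here is only the order of operations, stopword removal followed by stemming.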


Information Processing and Management | 1997

Statistical inference in retrieval effectiveness evaluation

Jacques Savoy

Evaluation methodology, and particularly its associated statistical tests, plays a central role in the information retrieval domain, which maintains a strong empirical tradition. In an effort to evaluate the retrieval effectiveness of a search algorithm, this paper focuses on the average precision over a set of fixed recall values. After reviewing traditional evaluation methodology through the use of examples, this study suggests applying another statistical inference methodology called the bootstrap, which requires no particular assumption about the distribution of the observations. Moreover, this scheme may be used to assess the accuracy of virtually any statistic, to build approximate confidence intervals, and to verify whether a statistically significant difference exists between two retrieval schemes, even when dealing with a relatively small sample size. This study also suggests selecting the sample median rather than the sample mean in evaluating retrieval effectiveness; the justification for this choice is based on the nature of information retrieval data.
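
A minimal sketch of the idea, assuming the simple percentile variant of the bootstrap: resample the per-query differences in average precision with replacement, recompute the median each time, and read an approximate confidence interval off the sorted estimates. The function names, the number of resamples and the fixed seed are choices made for this example, not details from the paper.

```python
import random
import statistics

def paired_differences(run_a, run_b):
    """Per-query differences (b - a) between two runs' average precision values."""
    return [b - a for a, b in zip(run_a, run_b)]

def bootstrap_ci(values, stat=statistics.median, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` over `values` (a list)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = sorted(
        stat([rng.choice(values) for _ in range(n)])  # one resample of size n
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

If the interval computed on the paired differences excludes zero, the two retrieval schemes can be considered to differ at roughly the chosen level. No distributional assumption is needed, which matches the motivation given in the abstract.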


Information Processing and Management | 2000

Database merging strategy based on logistic regression

Anne Le Calvé; Jacques Savoy

With the development of network technology, users looking for information may send a request to various selected databases and then inspect multiple result lists. To avoid the need for inspecting multiple result lists, the database merging strategy merges the retrieval results produced by separate, autonomous servers into an effective, single ranked list. Our study deals with a particular aspect of this merging process, whereby only the rank of the retrieved records is available, and where a key points to different result lists. On the basis of this rather limited information, this paper describes the theoretical foundation and retrieval performance of our database merging approach based on logistic regression.
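
The following sketch shows one way a rank-only merging scheme based on logistic regression might look. It assumes a per-server logistic model on the logarithm of the rank, with coefficients that would in practice be fitted on training queries; the coefficient values used in any call, the choice of ln(rank) as the explanatory variable, and the function names are all assumptions for illustration.

```python
import math

def relevance_probability(rank, intercept, slope):
    """Estimated probability of relevance from a logistic model on ln(rank)."""
    z = intercept + slope * math.log(rank)
    return 1.0 / (1.0 + math.exp(-z))

def merge_by_rank(result_lists, coefficients):
    """Merge ranked lists using only ranks.

    result_lists: {server: [doc_id, ...]} in rank order.
    coefficients: {server: (intercept, slope)}, hypothetical fitted values.
    """
    scored = []
    for server, docs in result_lists.items():
        intercept, slope = coefficients[server]
        for rank, doc_id in enumerate(docs, start=1):
            p = relevance_probability(rank, intercept, slope)
            scored.append((p, server, doc_id))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(server, doc_id) for _, server, doc_id in scored]
```

Because each server gets its own coefficients, a well-performing server can place several records ahead of the first record from a weaker one, which a plain round-robin interleaving cannot do.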


Conference on Information and Knowledge Management | 2001

Approaches to collection selection and results merging for distributed information retrieval

Yves Rasolofo; Faïza Abbaci; Jacques Savoy

We have investigated two major issues in Distributed Information Retrieval (DIR), namely: collection selection and search results merging. While most published works on these two issues are based on pre-stored metadata, the approaches described in this paper involve extracting the required information at the time the query is processed. In order to predict the relevance of collections to a given query, we analyse a limited number of full documents (e.g., the top five documents) retrieved from each collection and then consider term proximity within them. On the other hand, our merging technique is rather simple since its input requires only document scores and the lengths of the result lists. Our experiments evaluate the retrieval effectiveness of these approaches and compare them with centralised indexing and various other DIR techniques (e.g., CORI). We conducted our experiments using two testbeds: one containing news articles extracted from four different sources (2 GB) and another containing 10 GB of Web pages. Our evaluations demonstrate that the retrieval effectiveness of our simple approaches is worth considering.
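
A merging step that uses only document scores and result-list lengths could be sketched as follows. The weighting function (a log-scaled share of the total number of results, normalised around its mean) and the constant `k` are assumptions for this example; the abstract does not give the formula, so this should be read as one plausible instance rather than the paper's method.

```python
import math

def collection_weights(list_lengths, k=600):
    """Weight each collection by its share of returned results (assumed form)."""
    total = sum(list_lengths.values())
    if total == 0:
        return {c: 1.0 for c in list_lengths}  # nothing returned anywhere
    raw = {c: math.log(1 + n * k / total) for c, n in list_lengths.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: 1 + (s - mean) / mean for c, s in raw.items()}

def merge(results, list_lengths, k=600):
    """results: {collection: [(doc_id, score), ...]} -> single ranked list."""
    weights = collection_weights(list_lengths, k)
    merged = [
        (doc_id, score * weights[c])
        for c, docs in results.items()
        for doc_id, score in docs
    ]
    return sorted(merged, key=lambda pair: pair[1], reverse=True)
```

The intuition is that a collection returning many results for a query is more likely to hold relevant documents, so its scores are boosted slightly, while no pre-stored collection statistics are needed.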


ACM Symposium on Applied Computing | 2006

Light stemming approaches for the French, Portuguese, German and Hungarian languages

Jacques Savoy

This paper describes and evaluates various general stemming approaches for the French, Portuguese (Brazilian), German and Hungarian languages. Based on the CLEF test-collections, we demonstrate that light stemmers for the French, Portuguese and Hungarian languages perform well, and reasonably well for the German language. Variations in mean average precision among the different stemming approaches are also evaluated and are sometimes found to be statistically significant.


Information Processing and Management | 2008

Searching in Medline: Query expansion and manual indexing evaluation

Samir Abdou; Jacques Savoy

Based on a relatively large subset representing one third of the Medline collection, this paper evaluates ten different IR models, including recent developments in both probabilistic and language models. We show that the best-performing IR model is a probabilistic model developed within the Divergence from Randomness framework [Amati, G., & van Rijsbergen, C.J. (2002). Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Transactions on Information Systems 20(4), 357-389], which results in a 170% enhancement in mean average precision when compared to the classical tf-idf vector-space model. This paper also reports on our evaluation of the impact of manually assigned descriptors (MeSH, or Medical Subject Headings) on retrieval effectiveness, showing that including these terms can improve retrieval performance by 2.4% to 13.5%, depending on the underlying IR model. Finally, we design a new general blind-query expansion approach showing improved retrieval performance compared to that obtained using the Rocchio approach.
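
For readers unfamiliar with the measure, the sketch below computes mean average precision and the relative change between two MAP values. The function names are chosen for this example, and any MAP values passed to `relative_change` here are hypothetical numbers used only to show the arithmetic behind a statement such as "a 170% enhancement"; they are not results from the paper.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision values at the rank of each relevant document retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)  # unretrieved relevant documents count as zero

def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

def relative_change(baseline, improved):
    """Relative change in percent, e.g. from a baseline MAP to an improved MAP."""
    return (improved - baseline) / baseline * 100.0
```

With hypothetical values, a baseline MAP of 0.10 and an improved MAP of 0.27 give a relative change of about 170%.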


Information Processing and Management | 2003

Result merging strategies for a current news metasearcher

Yves Rasolofo; David Hawking; Jacques Savoy

Metasearching of online current news services is a potentially useful Web application of distributed information retrieval techniques. We constructed a realistic current news test collection using the results obtained from 15 current news Web sites (including ABC News, BBC and AllAfrica) in response to 107 topical queries. Results were judged for relevance by independent assessors. Online news services varied considerably both in the usefulness of the result sets they returned and in the amount of information they provided that could be exploited by a metasearcher. Using the current news test collection, we compared a range of different merging methods. We found that a low-cost merging scheme based on a combination of available evidence (title, summary, rank and server usefulness) worked almost as well as merging based on downloading and rescoring the actual news articles.


Cross-Language Evaluation Forum | 2002

Report on CLEF-2002 Experiments: Combining Multiple Sources of Evidence

Jacques Savoy

In our second participation in the CLEF retrieval tasks, our first objective was to propose better and more general stopword lists for various European languages (namely, French, Italian, German, Spanish and Finnish) along with improved, simpler and efficient stemming procedures. Our second goal was to propose a combined query-translation approach that could cross language barriers and also an effective merging strategy based on logistic regression for accessing the multilingual collection. Finally, within the Amaryllis experiment, we wanted to analyze how a specialized thesaurus might improve retrieval effectiveness.


Information Processing and Management | 2003

Cross-language information retrieval: experiments based on CLEF 2000 corpora

Jacques Savoy

Search engines play an essential role in the usability of Internet-based information systems and without them the Web would be much less accessible, and at the very least would develop at a much slower rate. Given that non-English users now tend to make up the majority in this environment, our main objective is to analyze and evaluate the retrieval effectiveness of various indexing and search strategies based on test-collections written in four different languages: English, French, German, and Italian. Our second objective is to describe and evaluate various approaches that might be implemented in order to effectively access document collections written in another language. As a third objective, we will explore the underlying problems involved in searching document collections written in the four different languages, and we will suggest and evaluate different database merging strategies capable of providing the user with a single unique result list.

Collaboration


Dive into Jacques Savoy's collaborations.

Top Co-Authors

Claire Fautsch

University of Neuchâtel

Mirco Kocher

University of Neuchâtel

Nada Naji

University of Neuchâtel

Yves Rasolofo

University of Neuchâtel

Mitra Akasereh

University of Neuchâtel
