Nada Naji
University of Neuchâtel
Publication
Featured research published by Nada Naji.
Asia Information Retrieval Symposium | 2011
Nada Naji; Jacques Savoy
This paper describes and evaluates different IR models and search strategies for digitized manuscripts. Written during the thirteenth century, these manuscripts were digitized using an imperfect recognition system with a word error rate of around 6%. Having access to the internal representation during the recognition stage, we were able to produce four automatic transcriptions, each introducing some form of spelling correction in an attempt to improve retrieval effectiveness. We evaluated the retrieval effectiveness of each of these versions using three text representations combined with five IR models, three stemming strategies and two query formulations. We employed a manually transcribed, error-free version to define the ground truth. Based on our experiments, we conclude that taking into account either the single best recognition word or all possible top-k recognition alternatives does not provide the best performance. Selecting all possible words whose log-likelihood is close to that of the best alternative yields the best text surrogate. Within this representation, different retrieval strategies tend to produce similar performance levels.
ASIST '13 Proceedings of the 76th ASIS&T Annual Meeting: Beyond the Cloud: Rethinking Information Boundaries | 2013
Nada Naji; Jacques Savoy
This article tackles the task of retrieving very short documents via even shorter queries. The problem at hand may relate to the retrieval of tweets, image and table captions, short text messages (SMS), and sponsored retrieval, among others. In such cases, document and/or query expansion using thesauri and other external resources (e.g., Wikipedia) usually available on the World Wide Web (WWW) has proven to be an effective approach. However, the focus of this paper is on documents written in lesser-known languages for which the WWW is of limited use. Our experiments are based on two main corpora extracted from historical manuscripts written in Latin and Middle High German. We found that, when very short documents of similar length are retrieved via short queries and no external enrichment resources are available, the classical tf-idf model performs as satisfactorily as the more complex models, and sometimes better.
Archive | 2014
Andreas Fischer; Horst Bunke; Nada Naji; Jacques Savoy; Micheal Baechler; Rolf Ingold
Cross-Language Evaluation Forum | 2012
Mitra Akasereh; Nada Naji; Jacques Savoy
DH | 2012
Micheal Baechler; Andreas Fischer; Nada Naji; Rolf Ingold; Horst Bunke; Jacques Savoy
Annual Conference on Computers | 2011
Jacques Savoy; Nada Naji
CLEF (Working Notes) | 2013
Mitra Akasereh; Nada Naji; Jacques Savoy
Actes 8ème Conférence en Recherche d’Information et Applications CORIA’11 | 2011
Nada Naji; Jacques Savoy; Ljiljana Dolamic
Actes 11e Journées internationales d’analyse statistique des données textuelles JADT 2012 | 2012
Nada Naji; Jacques Savoy
Archive | 2013
Nada Naji