Publication


Featured research published by Giorgio Maria Di Nunzio.


cross language evaluation forum | 2008

GeoCLEF 2008: the CLEF 2008 cross-language geographic information retrieval track overview

Thomas Mandl; Paula Carvalho; Giorgio Maria Di Nunzio; Fredric C. Gey; Ray R. Larson; Diana Santos; Christa Womser-Hacker

GeoCLEF is an evaluation task running under the scope of the Cross Language Evaluation Forum (CLEF). The purpose of GeoCLEF is to test and evaluate cross-language geographic information retrieval (GIR). The GeoCLEF 2008 task presented twenty-five geographically challenging search topics for English, German and Portuguese. Eleven participants submitted 131 runs, based on a variety of approaches, including sample documents, named entity extraction and ontology-based retrieval. The evaluation methodology and results are presented in the paper.


cross language evaluation forum | 2009

CLEF 2009 ad hoc track overview: robust-WSD task

Eneko Agirre; Giorgio Maria Di Nunzio; Thomas Mandl; Arantxa Otegi

The Robust-WSD task at CLEF 2009 aims at exploring the contribution of Word Sense Disambiguation to monolingual and multilingual Information Retrieval. The organizers of the task provide documents and topics which have been automatically tagged with Word Senses from WordNet using several state-of-the-art Word Sense Disambiguation systems. The Robust-WSD exercise follows the same design as in 2008. It uses two languages often used in previous CLEF campaigns (English, Spanish). Documents were in English, and topics in both English and Spanish. The document collections are based on the widely used LA94 and GH95 news collections. All instructions and datasets required to replicate the experiment are available from the organizers' website (http://ixa2.si.ehu.es/clirwsd/). The results show that some top-scoring systems improve their IR and CLIR results with the use of WSD tags, but the best scoring runs do not use WSD.


european conference on research and advanced technology for digital libraries | 2005

DIRECT: a system for evaluating information access components of digital libraries

Giorgio Maria Di Nunzio; Nicola Ferro

Digital Library Management Systems (DLMSs) generally manage collections of multi-media digitalized data and include components that perform the storage, access, retrieval, and analysis of the collections of data. Recently, the new trend of DLMS applications is pushing towards a components/services technology which is becoming more and more standardized [1,2]. The results of this new orientation are ad-hoc solutions for different components and services of DLMS: the data repository, the data manager, the search and retrieval components, etc. We are particularly interested in the evaluation aspects that range from measuring and quantifying the performance of the information access and extraction components of a DLMS to designing and developing an architecture for a system capable of supporting this kind of evaluation in the context of DLMS [3,4].


cross-language evaluation forum | 2004

CLEF 2004: ad hoc track overview and results analysis

Martin Braschler; Giorgio Maria Di Nunzio; Nicola Ferro; Carol Peters

We describe the objectives and organization of the CLEF 2004 ad hoc track and discuss the main characteristics of the experiments. The results are analyzed and discussed, and their statistical significance is investigated. The paper concludes with some observations on the impact of the CLEF campaign on the state-of-the-art in cross-language information retrieval.


acm international conference on digital libraries | 2007

The importance of scientific data curation for evaluation campaigns

Maristella Agosti; Giorgio Maria Di Nunzio; Nicola Ferro

Information Retrieval system evaluation campaigns produce valuable scientific data, which should be preserved carefully so that they can be available for further studies. A complete record should be maintained of all analyses and interpretations in order to ensure that they are reusable in attempts to replicate particular results or in new research and so that they can be referred to or cited at any time. In this paper, we describe the data curation approach for the scientific data produced by evaluation campaigns. The medium/long-term aim is to create a large-scale Digital Library System (DLS) of scientific data which supports services for the creation, interpretation and use of multidisciplinary and multilingual digital content.


Information Processing and Management | 2014

A new decision to take for cost-sensitive Naïve Bayes classifiers

Giorgio Maria Di Nunzio

Practical classification problems often involve some kind of trade-off between the decisions a classifier may take. Indeed, it may be the case that decisions are not equally good or costly; therefore, it is important for the classifier to be able to predict the risk associated with each classification decision. Bayesian decision theory is a fundamental statistical approach to the problem of pattern classification. The objective is to quantify the trade-off between various classification decisions using probability and the costs that accompany such decisions. Within this framework, a loss function measures the rates of the costs and the risk in taking one decision over another. In this paper, we give a formal justification for a decision function under the Bayesian decision framework that comprises (i) the minimisation of Bayesian risk and (ii) an empirical decision function found by Domingos and Pazzani (1997). This new decision function has a very intuitive geometrical interpretation that can be explored on a Cartesian plane. We use this graphical interpretation to analyse different approaches to find the best decision on four different Naive Bayes (NB) classifiers: Gaussian, Bernoulli, Multinomial, and Poisson, on different standard collections. We show that the graphical interpretation significantly improves the understanding of the models and opens new perspectives for new research studies.
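As a rough illustration of the Bayesian risk minimisation mentioned in this abstract, the sketch below picks the decision with the lowest expected loss given class posteriors and a loss matrix. It follows the textbook minimum-risk rule only; it is not the decision function proposed in the paper, and the function name `min_risk_decision`, the posteriors, and the loss values are all hypothetical.

```python
from typing import Sequence


def min_risk_decision(posteriors: Sequence[float],
                      loss: Sequence[Sequence[float]]) -> int:
    """Return the index of the decision with the lowest expected loss.

    posteriors[j] is P(class j | x); loss[i][j] is the cost of taking
    decision i when the true class is j (both are assumed inputs).
    """
    # Conditional risk of each decision: sum over classes of loss * posterior.
    risks = [sum(l_ij * p_j for l_ij, p_j in zip(row, posteriors))
             for row in loss]
    return min(range(len(risks)), key=risks.__getitem__)


# Hypothetical two-class example with made-up numbers.
posteriors = [0.7, 0.3]
zero_one = [[0.0, 1.0], [1.0, 0.0]]    # all errors cost the same
asymmetric = [[0.0, 5.0], [1.0, 0.0]]  # missing class 1 costs five times more

print(min_risk_decision(posteriors, zero_one))
print(min_risk_decision(posteriors, asymmetric))
```

With equal costs the rule reduces to choosing the most probable class; with the asymmetric costs above, the less probable class is chosen because the expected loss of missing it is higher.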


International Journal on Digital Libraries | 2010

Understanding user requirements and preferences for a digital library Web portal

Maristella Agosti; Franco Crivellari; Giorgio Maria Di Nunzio; Silvia Gabrielli

This article reports the findings of a user study conducted in the context of the TELplus project to gain insights about user needs and preferences for the digital library services offered by The European Library Web portal. The user requirements collection for the Web portal was designed by adopting a comprehensive survey approach. This combined explicit user feedback with implicit usage data so as to provide a more in-depth analysis of user experience with the portal. The analysis conducted shed light on likely motivations for both participant usage and reluctance to use the services provided, leading to more informed decisions on how to refine, improve, and present Web portal services to their future users. The lessons learnt from this case study also contributed to the development of an integrated methodological framework which provided insights for the future design and evaluation of digital library Web portals and services.


cross language evaluation forum | 2003

Experiments to Evaluate Probabilistic Models for Automatic Stemmer Generation and Query Word Translation

Giorgio Maria Di Nunzio; Nicola Ferro; Massimo Melucci; Nicola Orio

The paper describes statistical methods and experiments for stemming and for the translation of query words used in the monolingual and bilingual tracks in CLEF 2003. While there is still room for improvement in the method proposed for the bilingual track, the approach adopted for the monolingual track makes it possible to generate stemmers which learn directly how to stem the words in a document from a training word list extracted from the document collection, with no need for language-dependent knowledge. The experiments suggest that statistical approaches to stemming are as effective as classical algorithms which encapsulate predefined linguistic rules.


italian research conference on digital library management systems | 2010

A Digital Library Effort to Support the Building of Grammatical Resources for Italian Dialects

Maristella Agosti; Paola Benincà; Giorgio Maria Di Nunzio; Riccardo Miotto; Diego Pescarini

In this paper we present the results of a project, named ASIt, which provides linguists with a crucial test bed for formal hypotheses concerning human language. In particular, ASIt aims to capture cross-linguistic variants of grammatical structures within a sample of about 200 Italian Dialects. Since dialects are rarely recognized as official languages, first of all linguists need a dedicated digital library system providing the tools for the unambiguous identification of each dialect on the basis of geographical, administrative and geo-linguistic parameters. Secondly, the information access component of the digital library system needs to be designed to allow users to search the occurrences of a specific grammatical structure (e.g. a relative clause or a particular word order) rather than a specific word. Thirdly, since ASIt has been specifically geared to the needs of linguists, user-friendly graphical interfaces need to be created to give easy access to the language resource and to make building it easier and distributed. The paper reports on the ways these three main aims have been achieved.


acm international conference on digital libraries | 2007

Gathering and mining information from web log files

Maristella Agosti; Giorgio Maria Di Nunzio

In this paper, a general methodology for gathering and mining information from Web log files is proposed. A series of tools to retrieve, store, and analyze the data extracted from log files have been designed and implemented. The aim is to form general methods by abstracting from the analysis of logs which use a well-defined standard format, such as the Extended Log File Format proposed by W3C. The methodology has been tested on the Web log files of The European Library portal; the experimental analyses led to personal, technical, geographical and temporal findings about the usage and traffic load. Considerations about a more accurate tracking of users and user profiles, and a better management of crawler accesses using authentication are presented.

Collaboration


Top Co-Authors


Thomas Mandl

University of Hildesheim


Carol Peters

Istituto di Scienza e Tecnologie dell'Informazione
