Massimo Melucci | Researchain

Publication

Featured research published by Massimo Melucci.

ACM Transactions on Information Systems | 2008

A basis for information retrieval in context

Massimo Melucci

Information retrieval (IR) models based on vector spaces have been investigated for a long time. Nevertheless, they have recently attracted much research interest. In parallel, context has been rediscovered as a crucial issue in information retrieval. This article presents a principled approach to modeling context and its role in ranking information objects using vector spaces. First, the article outlines how a basis of a vector space naturally represents context, both its properties and factors. Second, a ranking function computes the probability of context in the objects represented in a vector space, namely, the probability that a contextual factor has affected the preparation of an object.
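The ranking idea can be sketched as follows. This is a minimal illustration, not the article's formulation: it assumes that a context is the subspace spanned by a set of basis vectors and that the "probability of context" in an object is the squared length of the object's projection onto that subspace, divided by the object's squared length. The function names `orthonormalize` and `context_probability` are hypothetical.

```python
import math


def orthonormalize(basis):
    """Gram-Schmidt: return an orthonormal list spanning the same subspace."""
    ortho = []
    for v in basis:
        w = list(v)
        for q in ortho:
            coeff = sum(a * b for a, b in zip(w, q))
            w = [a - coeff * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > 1e-12:  # skip vectors that are linearly dependent
            ortho.append([a / norm for a in w])
    return ortho


def context_probability(doc, basis):
    """Assumed score: squared projection of doc onto span(basis), in [0, 1]."""
    doc_norm_sq = sum(a * a for a in doc)
    if doc_norm_sq == 0:
        return 0.0
    ortho = orthonormalize(basis)
    proj_sq = sum(sum(a * b for a, b in zip(doc, q)) ** 2 for q in ortho)
    return proj_sq / doc_norm_sq


# A document lying half inside the one-dimensional context spanned by [1, 0, 0].
print(context_probability([1.0, 1.0, 0.0], [[1.0, 0.0, 0.0]]))  # 0.5
```

Under this assumption, objects can be ranked by the score: a value near 1 means the object lies almost entirely in the context subspace, and a value near 0 means it is nearly orthogonal to it.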

ACM Conference on Hypertext | 1997

On the use of information retrieval techniques for the automatic construction of hypertext

Maristella Agosti; Fabio Crestani; Massimo Melucci

The first part of the paper briefly introduces what automatic authoring of a hypertext for information retrieval means. The most difficult part of the automatic construction of a hypertext is the creation of links connecting documents or document fragments that are semantically related. Because of this, to many researchers it seemed natural to use IR techniques for this purpose, since IR has always dealt with the construction of relationships between mutually relevant objects. The second part of the paper presents a survey of some of the attempts toward the automatic construction of hypertexts for information retrieval. This part identifies and compares the scope, advantages, and limitations of different approaches. The aim of this survey is to point out the main and most successful current lines of research.

Information Processing and Management | 2005

A probabilistic model for stemmer generation

Michela Bacchin; Nicola Ferro; Massimo Melucci

In this paper we present a language-independent probabilistic model that can automatically generate stemmers. Stemmers can improve the retrieval effectiveness of information retrieval systems; however, designing and implementing stemmers requires considerable effort because documents and queries are often written or spoken in several different languages. The probabilistic model proposed in this paper aims at the development of stemmers for several languages. The proposed model describes the mutual reinforcement relationship between stems and derivations and then provides a probabilistic interpretation. A series of experiments shows that the stemmers generated by the probabilistic model are as effective as the ones based on linguistic knowledge.
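A mutual reinforcement relationship of this kind can be illustrated with a HITS-style iteration over word splits. The sketch below is an assumption made for illustration, not the paper's model: every split of every word links a candidate stem (prefix) to a candidate derivation (suffix), a prefix scores highly when it co-occurs with high-scoring suffixes, and vice versa. The names `split_graph`, `mutual_reinforcement`, and `best_split` are hypothetical, and the paper's probabilistic interpretation is not reproduced here.

```python
from collections import defaultdict


def split_graph(words):
    """Link every proper prefix of each word to the matching suffix."""
    edges = set()
    for w in words:
        for i in range(1, len(w)):
            edges.add((w[:i], w[i:]))
    return edges


def mutual_reinforcement(edges, iterations=50):
    """Alternately update prefix and suffix scores, normalizing each to sum to 1."""
    prefixes = {p for p, _ in edges}
    suffixes = {s for _, s in edges}
    p_score = {p: 1.0 / len(prefixes) for p in prefixes}
    s_score = {s: 1.0 / len(suffixes) for s in suffixes}
    for _ in range(iterations):
        new_p = defaultdict(float)
        for p, s in edges:
            new_p[p] += s_score[s]  # a prefix is reinforced by its suffixes
        total = sum(new_p.values())
        p_score = {p: v / total for p, v in new_p.items()}
        new_s = defaultdict(float)
        for p, s in edges:
            new_s[s] += p_score[p]  # a suffix is reinforced by its prefixes
        total = sum(new_s.values())
        s_score = {s: v / total for s, v in new_s.items()}
    return p_score, s_score


def best_split(word, p_score, s_score):
    """Pick the split whose prefix and suffix scores have the largest product."""
    known = [(word[:i], word[i:]) for i in range(1, len(word))
             if word[:i] in p_score and word[i:] in s_score]
    if not known:
        return word, ""
    return max(known, key=lambda ps: p_score[ps[0]] * s_score[ps[1]])


words = ["walks", "walked", "walking", "talks", "talked", "talking"]
p_score, s_score = mutual_reinforcement(split_graph(words))
print(best_split("walking", p_score, s_score))
```

On a vocabulary this small, several splits can receive equal scores, so the chosen split is only meaningful on a realistically sized word list.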

Multimedia Systems | 1995

Automatic authoring and construction of hypermedia for information retrieval

Maristella Agosti; Massimo Melucci; Fabio Crestani

This paper describes the complete process and a tool for the automatic construction of a multimedia hypertext starting from a large collection of multimedia documents. Through the use of an authoring methodology, the document collection is automatically authored, and the result is a multimedia hypertext, also called a hypermedia, written in hypertext mark-up language (HTML), almost a standard among hypermedia mark-up languages. The resulting hypermedia can be browsed and queried with Mosaic, an interface developed in the framework of the World Wide Web Project. In particular, the set of methods and techniques used for the automatic construction of hypermedia is described in this paper, and their relevance in the context of multimedia information retrieval is highlighted.

Information Processing and Management | 1996

Design and implementation of a tool for the automatic construction of hypertexts for information retrieval

Maristella Agosti; Fabio Crestani; Massimo Melucci

The paper describes the design and implementation of TACHIR, a tool for the automatic construction of hypertexts for Information Retrieval. Through the use of an authoring methodology employing a set of well-known Information Retrieval techniques, TACHIR automatically builds up a hypertext from a document collection. The structure of the hypertext reflects a three-level conceptual model that has proved to be quite effective for Information Retrieval. Using this model it is possible to navigate among documents, index terms, and concepts using automatically determined links. The hypertext is implemented using HTML, the hypertext mark-up language of the World Wide Web project. It can be distributed on different sites and different machines over the Internet, and it can be navigated using any of the interfaces developed in the framework of the World Wide Web project, for example Netscape.

Conference on Information and Knowledge Management | 2003

A novel method for stemmer generation based on hidden Markov models

Massimo Melucci; Nicola Orio

In this paper, we present a method based on Hidden Markov Models (HMMs) to generate statistical stemmers. Using a list of words as a training set, the method estimates the HMM parameters, which are then used to calculate the most probable stem for an arbitrary word. Stemming is performed by computing the most probable path, through the HMM states, corresponding to the input word. Neither linguistic knowledge nor a training set of manually stemmed words is required. We describe the method and the results of the experiments carried out using standard test collections for five different languages.
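The most-probable-path step can be sketched with a toy two-state HMM decoded by the Viterbi algorithm. Everything numeric here is an assumption for illustration: the states `STEM` and `SUFFIX`, the transition probabilities, and the emission probabilities are set by hand, whereas the paper estimates the parameters from a word list. The stem is read off as the letters emitted while the path stays in the `STEM` state.

```python
import math

STATES = ("STEM", "SUFFIX")
NEG_INF = float("-inf")


def log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


# Hand-set toy parameters (assumed, not estimated from data).
START = {"STEM": log(1.0), "SUFFIX": log(0.0)}
TRANS = {
    "STEM": {"STEM": log(0.7), "SUFFIX": log(0.3)},
    "SUFFIX": {"STEM": log(0.0), "SUFFIX": log(1.0)},  # no way back to STEM
}
SUFFIX_LETTERS = set("ingsed")


def emit(state, ch):
    """STEM emits letters uniformly; SUFFIX favors a few common suffix letters."""
    if state == "STEM":
        return log(1 / 26)
    return log(0.15) if ch in SUFFIX_LETTERS else log(0.005)


def viterbi(word):
    """Return the most probable state sequence for a non-empty word."""
    scores = {s: START[s] + emit(s, word[0]) for s in STATES}
    backptrs = []
    for ch in word[1:]:
        new_scores, ptr = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: scores[p] + TRANS[p][s])
            new_scores[s] = scores[prev] + TRANS[prev][s] + emit(s, ch)
            ptr[s] = prev
        scores = new_scores
        backptrs.append(ptr)
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for ptr in reversed(backptrs):  # follow back-pointers to the first letter
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def stem(word):
    """Keep the letters that the best path assigns to the STEM state."""
    if not word:
        return ""
    path = viterbi(word.lower())
    return "".join(ch for ch, s in zip(word, path) if s == "STEM")


print(stem("walking"))  # walk
```

With these hand-set numbers a short word such as "sing" would be over-stemmed, which is one reason the parameters are meant to be estimated from data rather than fixed by hand.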

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2007

On rank correlation in information retrieval evaluation

Massimo Melucci

Some methods for rank correlation in evaluation are examined and their relative advantages and disadvantages are discussed. In particular, it is suggested that different test statistics should be used to provide information about the experiments beyond that provided by statistical significance testing. Kendall's τ is often used for testing rank correlation, yet it is hardly appropriate if the objective of the test differs from what τ was designed for. In particular, attention should be paid to the null hypothesis. Other measures for rank correlation are described. If one test statistic suggests rejecting a hypothesis, other test statistics should be used to support or to revise the decision. The paper then focuses on rank correlation between webpage lists ordered by PageRank in order to apply these general reflections on the test statistics. An interpretation of PageRank behaviour is provided on the basis of the discussion of the test statistics for rank correlation.
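For reference, Kendall's τ between two rankings of the same items can be computed by counting concordant and discordant pairs. The sketch below implements the tie-free variant (τ-a) only; the function name `kendall_tau` is hypothetical, and the other rank correlation measures discussed in the paper are not covered.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a for two rankings of the same items, assuming no ties."""
    if set(rank_a) != set(rank_b) or len(rank_a) < 2:
        raise ValueError("rankings must contain the same two or more items")
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # The pair is concordant when both rankings order x and y the same way.
        sign = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(rank_a)
    return (concordant - discordant) / (n * (n - 1) / 2)


print(kendall_tau(["a", "b", "c"], ["c", "b", "a"]))  # -1.0
```

A value of 1 means the two lists agree on every pair, and -1 means one list is the reverse of the other. As the abstract notes, whether a given value supports a conclusion depends on the null hypothesis being tested.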

Conference on Information and Knowledge Management | 2005

Context modeling and discovery using vector space bases

Massimo Melucci

In this paper, context is modeled by vector space bases and its evolution is modeled by linear transformations from one basis to another. Each document or query can be associated with a distinct basis, which corresponds to one context. Algorithms are also proposed to discover contexts from documents, queries, or groups of them. Linear algebra can thus be employed in a mathematical framework to process context, its evolution, and its application.

International Journal on Digital Libraries | 2006

Appearance and functionality of electronic books

Fabio Crestani; Monica Landoni; Massimo Melucci

We present the results and the lessons learned from two separate and independent studies into the design, development, and evaluation of electronic books for information access: the Visual Book and the Hyper-TextBook. The Visual Book explored the importance of the visual component of the book metaphor in the production of “good” electronic books for referencing. The Hyper-TextBook concentrated on the importance of models and techniques for the automatic production of functional electronic versions of textbooks. Both studies started from similar considerations on what kinds of paper books are suitable for translation into electronic form but differ on the prominence given to book appearance and functionalities. The results of these two research projects are critically presented in this paper, with the aim of helping designers and implementers to better integrate appearance and functional aspects of books into a more general methodology for the automatic production of electronic books for information access.

Information Processing and Management | 1998

Passage retrieval: a probabilistic technique

Massimo Melucci

This paper presents a probabilistic technique to retrieve passages from texts having a large size or heterogeneous semantic content. The proposed technique is independent of any supporting auxiliary data, such as text structure, topic organization, or pre-defined text segments. A Bayesian framework implements the probabilistic technique. We carried out experiments to compare the probabilistic technique to one based on a text segmentation algorithm. In particular, the probabilistic technique is more effective than, or as effective as, the one based on text segmentation at retrieving small passages. Results show that passage size affects passage retrieval performance. Results also suggest that text organization and query generality may have an impact on the difference in effectiveness between the two techniques.
