Publication


Featured research published by Emanuele Di Buccio.


Cross Language Evaluation Forum | 2012

DIRECTions: design and specification of an IR evaluation infrastructure

Maristella Agosti; Emanuele Di Buccio; Nicola Ferro; Ivano Masiero; Gianmaria Silvello

Information Retrieval (IR) experimental evaluation is an essential part of the research on and development of information access methods and tools. Shared data sets and evaluation scenarios allow for comparing methods and systems, understanding their behaviour, and tracking performance and progress over time. On the other hand, experimental evaluation is an expensive activity in terms of the human effort, time, and costs required to carry it out. Software and hardware infrastructures that support the operation of experimental evaluation, as well as the management, enrichment, and exploitation of the scientific data produced, make a key contribution to reducing such effort and costs and to carrying out systematic and thorough analysis and comparison of systems and methods, overall acting as enablers of scientific and technical advancement in the field. This paper describes the specification for an IR evaluation infrastructure by conceptually modeling the entities involved in IR experimental evaluation and their relationships, and by defining the architecture of the proposed evaluation infrastructure and the APIs for accessing it.


Information Processing and Management | 2014

Detecting verbose queries and improving information retrieval

Emanuele Di Buccio; Massimo Melucci; Federica Moro

Although most of the queries submitted to search engines are composed of a few keywords and have a length that ranges from three to six words, more than 15% of the total volume of queries are verbose, introduce ambiguity and cause topic drifts. We consider verbosity a different property of queries from length: a verbose query is not necessarily long and might be succinct, while a short query might be verbose. This paper proposes a methodology to automatically detect verbose queries and conditionally modify queries. The methodology exploits state-of-the-art classification algorithms, combines concepts from a large linguistic database and uses a topic gisting algorithm we designed for verbose query modification purposes. Our experimental results have been obtained using the TREC Robust track collection, thirty topics classified by difficulty degree, four queries per topic classified by verbosity and length, and human assessment of query verbosity. Our results suggest that the methodology for query modification conditioned on query verbosity detection and topic gisting is significantly effective, and that query modification should be refined when topic difficulty and query verbosity are considered, since these two properties interact and query verbosity is not straightforwardly related to query length.


International Journal of Metadata, Semantics and Ontologies | 2014

A linked open data approach for geolinguistics applications

Emanuele Di Buccio; Giorgio Maria Di Nunzio; Gianmaria Silvello

Geolinguistic systems explore the relationship between language and cultural adaptation and change, and they can be used as instructional tools, presenting complex data and relationships in a way accessible to all educational levels. However, the heterogeneity of geolinguistic projects has been recognised as a key problem limiting the reusability of linguistic tools and data collections. We propose an approach based on Linked Open Data (LOD), which moves the focus from the systems handling the data to the data themselves, with the main goal of increasing the level of interoperability of geolinguistic applications and the reuse of the data. We defined an extensible ontology for geolinguistic resources based on the common ground defined by current European linguistic projects. We provide a Geolinguistic Linked Open Dataset based on the data case study of a linguistic project named ASIt. Finally, we show a geolinguistic application which exploits this dataset for dynamically generating linguistic maps.


European Conference on Information Retrieval | 2011

Towards predicting relevance using a quantum-like framework

Emanuele Di Buccio; Massimo Melucci; Dawei Song

In this paper, the user's relevance state is modeled using quantum-like probability, and an interference term is proposed to model the evolution of the state and the user's uncertainty about the assessment. The theoretical framework is formulated and the results of an experimental user study based on a TREC test collection are reported.


Cross Language Evaluation Forum | 2011

To re-rank or to re-query: can visual analytics solve this dilemma?

Emanuele Di Buccio; Marco Dussin; Nicola Ferro; Ivano Masiero; Giuseppe Santucci; Giuseppe Tino

Evaluation has a crucial role in Information Retrieval (IR) since it allows for identifying possible points of failure of an IR approach, thus addressing them to improve its effectiveness. Developing tools to support researchers and analysts when analyzing results and investigating strategies to improve IR system performance can help make the analysis easier and more effective. In this paper we discuss a Visual Analytics-based approach to support the analyst when deciding whether or not to investigate re-ranking to improve the system effectiveness measured after a retrieval run. Our approach is based on effectiveness measures that exploit graded relevance judgements and it provides both a principled and intuitive way to support analysis. A prototype is described and exploited to discuss some case studies based on TREC data.


ACM Multimedia | 2010

A scalable cover identification engine

Emanuele Di Buccio; Nicola Montecchio; Nicola Orio

This paper describes the implementation of a content-based cover song identification system which has been released under an open source license. The system is centered around the Apache Lucene text search engine library, and proves how classic techniques derived from textual Information Retrieval, in particular the bag-of-words paradigm, can successfully be adapted to music identification. The paper focuses on extensive experimentation on the most influential system parameters, in order to find an optimal tradeoff between retrieval accuracy and speed of querying.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2017

Lucene4IR: Developing Information Retrieval Evaluation Resources using Lucene

Leif Azzopardi; Yashar Moshfeghi; Martin Halvey; Rami Suleiman Alkhawaldeh; Krisztian Balog; Emanuele Di Buccio; Diego Ceccarelli; Juan M. Fernández-Luna; Charlie Hull; Jake Mannix; Sauparna Palchowdhury

The workshop and hackathon on developing Information Retrieval Evaluation Resources using Lucene (L4IR) was held on the 8th and 9th of September, 2016 at the University of Strathclyde in Glasgow, UK and funded by the ESF Elias Network. The event featured three main elements: (i) a series of keynote and invited talks on industry, teaching and evaluation; (ii) planning, coding and hacking, where a number of groups created modules and infrastructure to use Lucene to undertake TREC based evaluations; and (iii) a number of breakout groups discussing challenges, opportunities and problems in bridging the divide between academia and industry, and how we can use Lucene for teaching and learning Information Retrieval (IR). The event brought together a mix of academics, experts and students wanting to learn, share and create evaluation resources for the community. The hacking was intense and the discussions lively, creating the basis of many useful tools but also raising numerous issues. It was clear that adopting and contributing to the most widely used and supported Open Source IR toolkit brings many benefits for academics, students, researchers, developers and practitioners - providing a basis for stronger evaluation practices, increased reproducibility, more efficient knowledge transfer, greater collaboration between academia and industry, and shared teaching and training resources.


International Conference on the Theory of Information Retrieval | 2011

Distilling relevant documents by means of dynamic quantum clustering

Emanuele Di Buccio; Giorgio Maria Di Nunzio

Dynamic Quantum Clustering (DQC) is a recent clustering technique based on physical intuition from quantum mechanics. Clusters are identified as the minima of the potential function of the Schrödinger equation. In this poster, we apply this technique to explore the possibility of selecting highly relevant documents relative to a user's query. In particular, we analyze the clusters produced by DQC with a standard test collection.


Conference on Information and Knowledge Management | 2010

Toward the design of a methodology to predict relevance through multiple sources of evidence

Emanuele Di Buccio; Massimo Melucci

Textual queries, often short and ambiguous, can be insufficient when describing complex user information needs. Since users are reluctant or unable to provide long or precise descriptions, a possible solution to the low relevance prediction capability of Information Retrieval (IR) systems is to exploit diverse sources of evidence which are available during the search process. One of the open problems in combining diverse sources of evidence is the need for a uniform formalism which seamlessly describes the sources and the document ranking function within a single model. To this end, this paper discusses an IR view which explicitly considers other sources in addition to the information need and the document, and proposes a methodology to exploit them to support feedback. The IR view is described using the Entity-Relationship (ER) model, which allows us to view the sources as properties of entities -- e.g. of the entity information need, document, or user -- or of their relationships.


ACM Multimedia | 2010

FALCON: FAst Lucene-based Cover sOng identification

Emanuele Di Buccio; Nicola Montecchio; Nicola Orio

We present FALCON, an open-source engine for content-based cover song identification written in Java. The popular Lucene search engine library is used as the core of the software, proving that textual methods in information retrieval can be successfully adapted to multimedia tasks. An overview of the system methodology and of the implementation is provided, along with experimental results on a medium-size test collection.

Collaboration


Dive into Emanuele Di Buccio's collaborations.

Top Co-Authors


Alberto Costa

National University of Singapore
