Roberto Cornacchia | Researchain

Publication

Featured research published by Roberto Cornacchia.

Very Large Data Bases | 2008

Flexible and efficient IR using array databases

Roberto Cornacchia; Sándor Héman; Marcin Zukowski; Arjen P. de Vries; Peter A. Boncz

The Matrix Framework is a recent proposal by Information Retrieval (IR) researchers to flexibly represent information retrieval models and concepts in a single multi-dimensional array framework. We provide computational support for exactly this framework with the array database system SRAM (Sparse Relational Array Mapping), which works on top of a DBMS. Information retrieval models can be specified in its comprehension-based array query language, in a way that directly corresponds to the underlying mathematical formulas. SRAM efficiently stores sparse arrays in (compressed) relational tables and translates array queries into optimized relational queries. In this work, we describe a number of array query optimization rules. To demonstrate their effect on text retrieval, we apply them in the TREC TeraByte track (TREC-TB) efficiency task, using the Okapi BM25 model as our example. It turns out that these optimization rules enable SRAM to automatically translate the BM25 array queries into the relational equivalent of inverted list processing, including compression, score materialization and quantization, as employed by custom-built IR systems. The use of the high-performance MonetDB/X100 relational backend, which provides transparent database compression, allows the system to achieve very fast response times with good precision and low resource usage.
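
To make the idea of evaluating BM25 over a sparse array held in a relational table more concrete, here is a minimal Python sketch. It is not SRAM's query language and not the translation described in the paper. The (term, doc, tf) row layout, the sample rows, the function name bm25_scores, the parameter defaults k1=1.2 and b=0.75, and the particular BM25 variant are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

# Sparse term-frequency array stored relationally: one (term, doc, tf) row per
# non-zero cell. These rows are made-up illustration data.
TF_ROWS = [
    ("array", "d1", 2),
    ("database", "d1", 1),
    ("array", "d2", 1),
    ("retrieval", "d2", 3),
    ("retrieval", "d3", 1),
]


def bm25_scores(query_terms, tf_rows, k1=1.2, b=0.75):
    """Score documents with one common BM25 variant (an assumption, not the paper's).

    Assumes at most one row per (term, doc) pair, so counting rows per term
    yields the document frequency.
    """
    doc_len = defaultdict(int)   # document length = sum of its term frequencies
    doc_freq = defaultdict(int)  # number of documents containing each term
    for term, doc, tf in tf_rows:
        doc_len[doc] += tf
        doc_freq[term] += 1
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs

    scores = defaultdict(float)
    # Only non-zero cells are visited, much like scanning inverted lists.
    for term, doc, tf in tf_rows:
        if term not in query_terms:
            continue
        idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
        norm = tf + k1 * (1 - b + b * doc_len[doc] / avg_len)
        scores[doc] += idf * tf * (k1 + 1) / norm
    return dict(scores)


scores = bm25_scores({"retrieval"}, TF_ROWS)
```

Because the loop touches only the stored rows, the work is proportional to the number of non-zero cells rather than to the full term-by-document array, which is the property that makes a sparse relational layout comparable to inverted-list processing.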

European Conference on Information Retrieval | 2007

A parameterised search system

Roberto Cornacchia; Arjen P. de Vries

This paper introduces the concept of a Parameterised Search System (PSS), which allows flexibility in user queries, and, more importantly, allows system engineers to easily define customised search strategies. Putting this idea into practice requires a carefully designed system architecture that supports a declarative abstraction language for the specification of search strategies. These specifications should stay as close as possible to the problem definition (i.e., the retrieval model to be used in the search application), abstracting away the details of the physical organisation of data and content. We show how extending an existing XML retrieval system with an abstraction mechanism based on array databases meets this requirement.

Proceedings of the 1st International Workshop on Computer Vision meets Databases | 2004

A case study on array query optimisation

Roberto Cornacchia; Alex van Ballegooij; Arjen P. de Vries

The development of applications involving multi-dimensional data sets on top of an RDBMS raises several difficulties that are not directly related to the scientific problem being addressed. In particular, additional effort is needed to resolve the mismatch between the array-based data model typical of such computations and the set-based data model provided by the RDBMS. The RAM (Relational Array Mapping) system fills this gap, silently providing a mapping layer between the two data models. As expected though, a naive implementation of such an automatic translation cannot compete with the efficiency of queries written by an experienced programmer. In order to make RAM a valid alternative to expensive and time-consuming hand-written solutions, this performance gap should be reduced. We study a real-world application aimed at the ranking of multimedia collections to assess the impact of different implementation strategies. The result of this study provides an illustrative outlook for the development of generally applicable optimisation techniques.
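
The mismatch between the array model and the set-based relational model can be illustrated with a small Python sketch that stores a matrix as (i, j, v) rows and computes a matrix-vector product with a join and an aggregation. This is not RAM's actual mapping or generated SQL. The table names m and x, the helper names array_to_rows and matvec_via_sql, and the use of SQLite as a stand-in RDBMS are assumptions made for illustration.

```python
import sqlite3


def array_to_rows(matrix):
    """Flatten a 2-D list into (row index, column index, value) tuples."""
    return [(i, j, v) for i, row in enumerate(matrix) for j, v in enumerate(row)]


def matvec_via_sql(matrix, vector):
    """Compute matrix @ vector as a relational join plus GROUP BY (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE m (i INTEGER, j INTEGER, v REAL)")
    conn.execute("CREATE TABLE x (j INTEGER, v REAL)")
    conn.executemany("INSERT INTO m VALUES (?, ?, ?)", array_to_rows(matrix))
    conn.executemany("INSERT INTO x VALUES (?, ?)", list(enumerate(vector)))
    # The shared index j becomes the join key; summing over j becomes SUM ... GROUP BY i.
    rows = conn.execute(
        "SELECT m.i, SUM(m.v * x.v) FROM m JOIN x ON m.j = x.j "
        "GROUP BY m.i ORDER BY m.i"
    ).fetchall()
    conn.close()
    return [total for _, total in rows]


result = matvec_via_sql([[1, 2], [3, 4]], [10, 1])
```

A hand-written query for a specific application could exploit known array shapes or ordering, which is one plausible source of the performance gap between naive automatic translation and expert-written queries that the abstract refers to.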

Exploiting Semantic Annotations in Information Retrieval | 2010

Search by strategy

Arjen P. de Vries; Wouter Alink; Roberto Cornacchia

This position statement advocates that the integration of information retrieval and databases, a topic that has been studied for many years (see e.g. [3]), is now in a state where the technology is ready to be brought out of the laboratory, and that this technology is an especially good match for the meaningful, semantic annotations that are the topic of this workshop.

European Conference on Information Retrieval | 2006

A declarative DB-Powered approach to IR

Roberto Cornacchia; Arjen P. de Vries

We present a prototype system using array comprehensions to bridge the gap between databases and information retrieval. It allows researchers to express their retrieval models in the General Matrix Framework for Information Retrieval [1], and have these executed on relational database systems with negligible effort.

Lecture Notes in Computer Science | 2005