Roberto Cornacchia
Centrum Wiskunde & Informatica
Publication
Featured research published by Roberto Cornacchia.
Very Large Data Bases | 2008
Roberto Cornacchia; Sándor Héman; Marcin Zukowski; Arjen P. de Vries; Peter A. Boncz
The Matrix Framework is a recent proposal by Information Retrieval (IR) researchers to flexibly represent information retrieval models and concepts in a single multi-dimensional array framework. We provide computational support for exactly this framework with the array database system SRAM (Sparse Relational Array Mapping), which works on top of a DBMS. Information retrieval models can be specified in its comprehension-based array query language, in a way that directly corresponds to the underlying mathematical formulas. SRAM efficiently stores sparse arrays in (compressed) relational tables, and it translates array queries into relational queries and optimizes them. In this work, we describe a number of array query optimization rules. To demonstrate their effect on text retrieval, we apply them in the TREC TeraByte track (TREC-TB) efficiency task, using the Okapi BM25 model as our example. It turns out that these optimization rules enable SRAM to automatically translate the BM25 array queries into the relational equivalent of inverted list processing, including compression, score materialization and quantization, such as employed by custom-built IR systems. The use of the high-performance MonetDB/X100 relational backend, which provides transparent database compression, allows the system to achieve very fast response times with good precision and low resource usage.
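As a rough illustration of the idea in this abstract, the sketch below stores a sparse term-frequency array as relational tuples (doc, term, tf) and scores a query with BM25 by scanning only the non-zero cells. It is not SRAM's query language or its generated plan; the function name `bm25_scores`, the toy data, the parameter defaults (k1 = 1.2, b = 0.75) and the particular IDF variant are all assumptions made for the example.

```python
import math
from collections import defaultdict

# Sparse term-frequency array tf[doc, term] kept as relational tuples.
# Cells with tf = 0 are simply absent (hypothetical toy data).
TF = [
    ("d1", "array", 2), ("d1", "database", 1),
    ("d2", "database", 3),
    ("d3", "retrieval", 1), ("d3", "array", 1),
]

def bm25_scores(tf_rows, query_terms, k1=1.2, b=0.75):
    """Score documents with one common BM25 variant (assumed, not the paper's)."""
    doc_len = defaultdict(int)   # document length = sum of tf over its terms
    doc_freq = defaultdict(int)  # number of documents containing each term
    for doc, term, tf in tf_rows:
        doc_len[doc] += tf
        doc_freq[term] += 1      # assumes one row per (doc, term) pair
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs

    scores = defaultdict(float)
    for doc, term, tf in tf_rows:
        if term not in query_terms:
            continue             # behaves like a selection on the term column
        df = doc_freq[term]
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = tf + k1 * (1 - b + b * doc_len[doc] / avg_len)
        scores[doc] += idf * tf * (k1 + 1) / norm
    return dict(scores)

if __name__ == "__main__":
    print(bm25_scores(TF, {"array"}))
```

Because only stored tuples are visited, the scan over rows matching a query term plays the role of reading that term's inverted list.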
European Conference on Information Retrieval | 2007
Roberto Cornacchia; Arjen P. de Vries
This paper introduces the concept of a Parameterised Search System (PSS), which allows flexibility in user queries, and, more importantly, allows system engineers to easily define customised search strategies. Putting this idea into practice requires a carefully designed system architecture that supports a declarative abstraction language for the specification of search strategies. These specifications should stay as close as possible to the problem definition (i.e., the retrieval model to be used in the search application), abstracting away the details of the physical organisation of data and content. We show how extending an existing XML retrieval system with an abstraction mechanism based on array databases meets this requirement.
Proceedings of the 1st International Workshop on Computer Vision meets Databases | 2004
Roberto Cornacchia; Alex van Ballegooij; Arjen P. de Vries
The development of applications involving multi-dimensional data sets on top of an RDBMS raises several difficulties that are not directly related to the scientific problem being addressed. In particular, an additional effort is needed to resolve the mismatch between the array-based data model typical of such computations and the set-based data model provided by the RDBMS. The RAM (Relational Array Mapping) system fills this gap, silently providing a mapping layer between the two data models. As expected though, a naive implementation of such an automatic translation cannot compete with the efficiency of queries written by an experienced programmer. In order to make RAM a valid alternative to expensive and time-consuming hand-written solutions, this performance gap should be reduced. We study a real-world application aimed at the ranking of multimedia collections to assess the impact of different implementation strategies. The result of this study provides an illustrative outlook for the development of generally applicable optimisation techniques.
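The mismatch between the array model and the set-based model can be made concrete with a small sketch. It assumes the simplest possible mapping, one tuple (i, j, value) per array cell, and expresses element-wise addition as a join on the index columns. The helper names are hypothetical and this is not RAM's actual translation scheme, only an illustration of the kind of mapping layer the abstract describes.

```python
def array_to_relation(matrix):
    """Map a 2-D array to a set of (row_index, col_index, value) tuples."""
    return {(i, j, v) for i, row in enumerate(matrix) for j, v in enumerate(row)}

def add_relations(r, s):
    """Element-wise addition written as an equi-join on the index columns."""
    s_by_index = {(i, j): v for i, j, v in s}
    return {
        (i, j, v + s_by_index[(i, j)])
        for i, j, v in r
        if (i, j) in s_by_index
    }

def relation_to_array(rel, n_rows, n_cols):
    """Rebuild a dense array from its tuples (missing cells default to 0)."""
    out = [[0] * n_cols for _ in range(n_rows)]
    for i, j, v in rel:
        out[i][j] = v
    return out

if __name__ == "__main__":
    a = array_to_relation([[1, 2], [3, 4]])
    b = array_to_relation([[10, 20], [30, 40]])
    print(relation_to_array(add_relations(a, b), 2, 2))
```

A naive translation like this materialises every cell and joins on every index pair, which hints at why an automatic mapping needs optimisation to approach hand-written queries.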
Exploiting Semantic Annotations in Information Retrieval | 2010
Arjen P. de Vries; Wouter Alink; Roberto Cornacchia
This position statement advocates that the integration of information retrieval and databases, a topic that has been studied for many years (see e.g. [3]), is now in a state where the technology is ready to be brought out of the laboratory, and that this technology is especially a good match for the meaningful, semantic annotations that are the topic of this workshop.
European Conference on Information Retrieval | 2006
Roberto Cornacchia; Arjen P. de Vries
We present a prototype system using array comprehensions to bridge the gap between databases and information retrieval. It allows researchers to express their retrieval models in the General Matrix Framework for Information Retrieval [1], and have these executed on relational database systems with negligible effort.
Lecture Notes in Computer Science | 2005
Alex van Ballegooij; Roberto Cornacchia; Arjen P. de Vries; Martin L. Kersten
TREC Video Retrieval Evaluation Online Proceedings | 2005
Tzvetanka I. Ianeva; Lioudmila Boldareva; Thijs Westerveld; Roberto Cornacchia; Djoerd Hiemstra; Arjen P. de Vries
CLEF (Notebook Papers/Labs/Workshop) | 2011
Eva D'hondt; Suzan Verberne; Wouter Alink; Roberto Cornacchia
Text Retrieval Conference | 2011
Jiyin He; Vera Hollink; Corrado Boscarino; Arjen P. de Vries; Roberto Cornacchia
Archive | 2012
Roberto Cornacchia