Paolo Ciaccia | Researchain

Publication

Featured researches published by Paolo Ciaccia.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

WARP: accurate retrieval of shapes using phase of Fourier descriptors and time warping distance

Ilaria Bartolini; Paolo Ciaccia; Marco Patella

Effective and efficient retrieval of similar shapes from large image databases is still a challenging problem in spite of the high relevance that shape information can have in describing image contents. We propose a novel Fourier-based approach, called WARP, for matching and retrieving similar shapes. The unique characteristics of WARP are the exploitation of the phase of Fourier coefficients and the use of the dynamic time warping (DTW) distance to compare shape descriptors. While phase information provides a more accurate description of object boundaries than using only the amplitude of Fourier coefficients, the DTW distance permits us to accurately match images even in the presence of (limited) phase shifts. In terms of classical precision/recall measures, we experimentally demonstrate that WARP can gain up to 35 percent in precision at a 20 percent recall level with respect to Fourier-based techniques that use neither phase nor DTW distance.
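
To make the role of the DTW distance concrete, here is a minimal sketch of the textbook dynamic-programming formulation on two real-valued sequences. It is not the WARP implementation: the function name dtw_distance, the absolute-difference local cost, and the plain-list inputs are assumptions made for illustration, and the construction of shape descriptors from Fourier phase is not shown.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two non-empty sequences of numbers.

    Assumption: local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    # d[i][j] = cost of the best warping path aligning a[:i] with b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance in a, advance in b, advance in both
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

For example, dtw_distance([1, 2, 3], [1, 1, 2, 3]) is 0, because the warping path absorbs the repeated first sample, whereas a point-by-point comparison of the two sequences would not.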

ACM Transactions on Database Systems | 2008

Efficient sort-based skyline evaluation

Ilaria Bartolini; Paolo Ciaccia; Marco Patella

Skyline queries compute the set of Pareto-optimal tuples in a relation, that is, those tuples that are not dominated by any other tuple in the same relation. Although several algorithms have been proposed for efficiently evaluating skyline queries, they either require the relation to be indexed or have to perform the dominance tests on all the tuples in order to determine the result. In this article we introduce SaLSa, a novel skyline algorithm that exploits the idea of presorting the input data so as to effectively limit the number of tuples to be read and compared. This also makes SaLSa attractive when skyline queries are executed on top of systems that do not understand skyline semantics, or when the skyline logic runs on clients with limited power and/or bandwidth. We prove that, if one considers symmetric sorting functions, the number of tuples to be read is minimized by sorting data according to a “minimum coordinate,” minC, criterion, and that performance can be further improved if data distribution is known and an asymmetric sorting function is used. Experimental results obtained on synthetic and real datasets show that SaLSa consistently outperforms state-of-the-art sequential skyline algorithms and that its performance can be accurately predicted.
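
The sort-and-limit idea can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: it assumes smaller values are preferred on every attribute, sorts by minimum coordinate with the coordinate sum as a tie-breaker (an assumption that keeps a dominating tuple ahead of any tuple it dominates), and uses a conservative strict stop test. The names dominates and salsa_skyline are hypothetical.

```python
def dominates(p, q):
    """p dominates q: no worse on every attribute, strictly better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def salsa_skyline(tuples):
    """Return (skyline, number_of_tuples_read) using a minC-style presort."""
    # Presort: minimum coordinate first, sum as tie-breaker (assumption).
    ordered = sorted(tuples, key=lambda t: (min(t), sum(t)))
    skyline = []
    stop_value = float("inf")  # smallest max-coordinate among skyline points
    read = 0
    for t in ordered:
        # If every coordinate of t exceeds the stop value, the stop point
        # dominates t and all tuples sorted after it: stop reading.
        if min(t) > stop_value:
            break
        read += 1
        if not any(dominates(s, t) for s in skyline):
            skyline.append(t)
            stop_value = min(stop_value, max(t))
    return skyline, read
```

On the six tuples (1, 9), (2, 2), (9, 1), (5, 5), (6, 7), (8, 8), this sketch returns the skyline (1, 9), (9, 1), (2, 2) after reading only three tuples, since (2, 2) becomes the stop point and every remaining tuple has a minimum coordinate above 2.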

Symposium on Principles of Database Systems | 1998

A cost model for similarity queries in metric spaces

Paolo Ciaccia; Marco Patella; Pavel Zezula

We consider the problem of estimating CPU (distance computations) and I/O costs for processing range and k-nearest neighbors queries over metric spaces. Unlike the specific case of vector spaces, where information on data distribution has been exploited to derive cost models for predicting the performance of multi-dimensional access methods, in a generic metric space there is no such possibility, which makes the problem quite different and requires a novel approach. We insist that the distance distribution of objects can be profitably used to solve the problem, and consequently develop a concrete cost model for the M-tree access method [10]. Our results rely on the assumption that the indexed dataset comes from a metric space which is “homogeneous” enough (in a probabilistic sense) to allow reliable cost estimations even if the distance distribution with respect to a specific query object is unknown. We experimentally validate the model over both real and synthetic datasets, and show how the model can be used to tune the M-tree in order to minimize a combination of CPU and I/O costs. Finally, we sketch how the same approach can be applied to derive a cost model for the vp-tree index structure [8].
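
As a rough illustration of how a distance distribution could drive such an estimate, the sketch below builds an empirical distribution F from pairwise distances and sums, over index nodes, the probability F(r_N + r_Q) that a node with covering radius r_N intersects a range query of radius r_Q. The abstract does not spell this formula out, so treat it as an assumption: it relies on the homogeneity hypothesis (the query behaves like a typical object), and the function names and inputs are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations


def empirical_distance_distribution(objects, dist):
    """Return F(x) = fraction of object pairs at distance <= x.

    Assumes at least two objects, so that at least one pair exists.
    """
    distances = sorted(dist(a, b) for a, b in combinations(objects, 2))

    def F(x):
        return bisect_right(distances, x) / len(distances)

    return F


def expected_node_accesses(F, covering_radii, query_radius):
    """Estimated number of nodes a range query has to visit.

    A node is visited when the query ball and the node's covering ball
    intersect, i.e. when d(query, routing object) <= r_N + r_Q.
    """
    return sum(F(r + query_radius) for r in covering_radii)
```

With the objects 0, 1, 2, 3 under absolute difference, the pairwise distances are 1, 1, 1, 2, 2, 3, so F(1) = 0.5 and F(2) = 5/6; two nodes with covering radii 0.5 and 1.5 and a query radius of 0.5 give an estimate of about 1.33 node accesses.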

Conference on Information and Knowledge Management | 2006

SaLSa: computing the skyline without scanning the whole sky

Ilaria Bartolini; Paolo Ciaccia; Marco Patella

Skyline queries compute the set of Pareto-optimal tuples in a relation, i.e., those tuples that are not dominated by any other tuple in the same relation. Although several algorithms have been proposed for efficiently evaluating skyline queries, they either require extending the relational server with specialized access methods (which is not always feasible) or have to perform the dominance tests on all the tuples in order to determine the result. In this paper we introduce SaLSa (Sort and Limit Skyline algorithm), which exploits the sorting machinery of a relational engine to order tuples so that only a subset of them needs to be examined for computing the skyline result. This makes SaLSa particularly attractive when skyline queries are executed on top of systems that do not understand skyline semantics or when the skyline logic runs on clients with limited power and/or bandwidth.

Journal of Discrete Algorithms | 2009

Approximate similarity search: A multi-faceted problem

Marco Patella; Paolo Ciaccia

We review the major paradigms for approximate similarity queries and propose a classification schema that easily allows existing approaches to be compared along several independent coordinates. Then, we discuss the impact that scheduling of index nodes can have on performance and show that, unlike exact similarity queries, no provable optimal scheduling strategy exists for approximate queries. On the positive side, we show that optimal-on-the-average schedules are well-defined and that their performance is indeed the best among practical schedules.

Extending Database Technology | 1998

Processing Complex Similarity Queries with Distance-Based Access Methods

Paolo Ciaccia; Marco Patella; Pavel Zezula

Efficient evaluation of similarity queries is one of the basic requirements for advanced multimedia applications. In this paper, we consider the relevant case where complex similarity queries are defined through a generic language L and whose predicates refer to a single feature F. Contrary to the language level, which deals only with similarity scores, the proposed evaluation process is based on distances between feature values, since known spatial or metric indexes use distances to evaluate predicates. The proposed solution suggests that the index should process complex queries as a whole, thus evaluating multiple similarity predicates at a time. The flexibility of our approach is demonstrated by considering three different similarity languages, and showing how the M-tree access method has been extended for this purpose. Experimental results clearly show that performance of the extended M-tree is consistently better than that of state-of-the-art search algorithms.

International Conference on Management of Data | 2013

Skyline queries, front and back

Jan Chomicki; Paolo Ciaccia; Niccolo' Meneghetti

Skyline queries are a popular way to obtain preferred answers from the database by providing only the orderings of attribute values. The result of a skyline query consists of those input tuples for which there is no input tuple having better or equal values in all the attributes and a better value in at least one attribute. In this article, we summarize the basic notions and properties of skyline queries, and discuss their extensions and generalizations. In particular, we consider skyline algorithms and skyline cardinality issues.

European Conference on Principles of Data Mining and Knowledge Discovery | 2004

A unified and flexible framework for comparing simple and complex patterns

Ilaria Bartolini; Paolo Ciaccia; Irene Ntoutsi; Marco Patella; Yannis Theodoridis

One of the most important operations involving Data Mining patterns is computing their similarity. In this paper we present a general framework for comparing both simple and complex patterns, i.e., patterns built up from other patterns. Major features of our framework include the notion of structure and measure similarity, the possibility of managing multiple coupling types and aggregation logics, and the recursive definition of similarity for complex patterns.

String Processing and Information Retrieval | 2002

String Matching with Metric Trees Using an Approximate Distance

Ilaria Bartolini; Paolo Ciaccia; Marco Patella

Searching a large data set for the strings that are most similar, according to the edit distance, to a given one is a time-consuming process. In this paper we investigate the performance of metric trees, namely the M-tree, when they are extended using a cheap approximate distance function as a filter to quickly discard irrelevant strings. Using the bag distance as an approximation of the edit distance, we show an improvement in performance of up to 90% with respect to the basic case. This, along with the fact that our solution is independent of both the distance used in the pre-test and the underlying metric index, demonstrates that metric indices are a powerful solution, not only for many modern application areas, such as multimedia, data mining and pattern recognition, but also for the string matching problem.
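
The filtering idea can be illustrated with a short sketch. It assumes the usual definition of bag distance (the larger of the two multiset differences between the strings' characters), which never exceeds the edit distance and can therefore safely discard candidates. For brevity the sketch filters a flat list rather than an M-tree, and the function names are hypothetical.

```python
from collections import Counter


def bag_distance(x, y):
    """Cheap lower bound on the edit distance, computed on character multisets."""
    cx, cy = Counter(x), Counter(y)
    only_x = sum((cx - cy).values())  # characters of x unmatched in y
    only_y = sum((cy - cx).values())  # characters of y unmatched in x
    return max(only_x, only_y)


def edit_distance(x, y):
    """Standard Levenshtein distance with two rolling rows."""
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]
        for j, cy in enumerate(y, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (cx != cy)))   # substitution or match
        prev = curr
    return prev[-1]


def range_search(strings, query, radius):
    """Return (matches, number of edit-distance computations performed)."""
    matches, edit_calls = [], 0
    for s in strings:
        if bag_distance(s, query) > radius:
            continue  # pre-test: the edit distance must also exceed the radius
        edit_calls += 1
        if edit_distance(s, query) <= radius:
            matches.append(s)
    return matches, edit_calls
```

Searching "cat", "cart", "dog", "act" for strings within edit distance 1 of "cat" returns "cat" and "cart" with three edit-distance computations: "dog" is discarded by the pre-test alone, while "act" passes the filter (bag distance 0) and is rejected only by the exact distance.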

Multimedia Tools and Applications | 2006

Adaptively browsing image databases with PIBE

Ilaria Bartolini; Paolo Ciaccia; Marco Patella

Browsing large image collections is a complex and often tedious task, due to the semantic gap existing between the user's subjective notion of similarity and the one according to which a browsing system organizes the images. In this paper we propose PIBE, an adaptive image browsing system, which provides users with a hierarchical view of images (the Browsing Tree) that can be customized according to user preferences. A key feature of PIBE is that it maintains local similarity criteria for each portion of the Browsing Tree. This makes it possible both to avoid costly global reorganization upon execution of user actions and, combined with a persistent storage of the Browsing Tree, to efficiently support multiple browsing tasks. We present the basic principles of PIBE and report experimental results showing the effectiveness of its browsing and personalization functionalities.
