Antonio Corral | Researchain

Publication

Featured researches published by Antonio Corral.

international conference on management of data | 2000

Closest pair queries in spatial databases

Antonio Corral; Yannis Manolopoulos; Yannis Theodoridis; Michael Vassilakopoulos

This paper addresses the problem of finding the K closest pairs between two spatial data sets, where each set is stored in a structure belonging to the R-tree family. Five different algorithms (four recursive and one iterative) are presented for solving this problem. The case of 1 closest pair is treated as a special case. An extensive study, based on experiments performed with synthetic as well as real point data sets, is presented. A wide range of values for the basic parameters affecting the performance of the algorithms, especially the effect of overlap between the two data sets, is explored. Moreover, an algorithmic as well as an experimental comparison with existing incremental algorithms addressing the same problem is presented. In most settings, the newly proposed algorithms clearly outperform the existing ones.
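To make the query itself concrete, here is a minimal brute-force sketch of the K closest pairs problem for in-memory point sets. It is not one of the R-tree algorithms from the paper; the function name and the tuple layout of the result are assumptions chosen for illustration, and the quadratic scan only serves as a reference definition of the expected output.

```python
import heapq
import math
from itertools import product


def k_closest_pairs_bruteforce(P, Q, k):
    """Return the k pairs (distance, p, q) with the smallest Euclidean distance.

    P and Q are sequences of equal-dimension coordinate tuples.
    Every pair is examined, so the cost is O(|P| * |Q|).
    """
    candidates = ((math.dist(p, q), p, q) for p, q in product(P, Q))
    return heapq.nsmallest(k, candidates)


if __name__ == "__main__":
    P = [(0.0, 0.0), (3.0, 4.0)]
    Q = [(0.0, 1.0), (10.0, 10.0)]
    for dist, p, q in k_closest_pairs_bruteforce(P, Q, 2):
        print(dist, p, q)
```

Index-based algorithms aim to return the same set of distances while examining far fewer pairs.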

data and knowledge engineering | 2004

Algorithms for processing K-closest-pair queries in spatial databases

Antonio Corral; Yannis Manolopoulos; Yannis Theodoridis; Michael Vassilakopoulos

This paper addresses the problem of finding the K closest pairs between two spatial datasets (the so-called K closest pairs query, K-CPQ), where each dataset is stored in an R-tree. There are two different techniques for solving this kind of distance-based query. The first technique is the incremental approach, which returns the output elements one by one in ascending order of distance. The second is the non-incremental alternative, which returns the K elements of the result all together at the end of the algorithm. In this paper, based on distance functions between two MBRs in the multidimensional Euclidean space, we propose a pruning heuristic and two updating strategies for minimizing the pruning distance, and use them in the design of three non-incremental branch-and-bound algorithms for K-CPQ between spatial objects stored in two R-trees. Two of those approaches are recursive, following a Depth-First searching strategy, and one is iterative, obeying a Best-First traversal policy. The plane-sweep method and the search ordering are used as optimization techniques for improving the naive approaches. In addition, a number of interesting extensions of the K-CPQ (K-Self-CPQ, Semi-CPQ, K-FPQ (the K-farthest pairs query), etc.) are discussed. An extensive performance study is also presented. This study is based on experiments performed with real datasets. A wide range of values for the basic parameters affecting the performance of the algorithms is examined in order to designate the most efficient algorithm for each setting of parameter values. Finally, an experimental study of the behavior of the proposed K-CPQ branch-and-bound algorithms in terms of scalability of the dataset size and the K value is also included.
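The pruning idea rests on a lower bound for the distance between any two objects enclosed by a pair of MBRs. The sketch below computes the minimum Euclidean distance between two axis-aligned rectangles and uses it in a simple pruning test. The function names, the `(low_corner, high_corner)` representation, and the pruning distance `z` are assumptions made for illustration; the abstract does not give the paper's exact definitions or its updating strategies.

```python
import math


def min_dist_between_mbrs(a, b):
    """Minimum Euclidean distance between two axis-aligned MBRs.

    Each MBR is a pair (low_corner, high_corner) of coordinate tuples.
    The result is 0.0 when the rectangles touch or overlap.
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    total = 0.0
    for al, ah, bl, bh in zip(a_lo, a_hi, b_lo, b_hi):
        # Per-dimension gap between the two intervals (0 if they overlap).
        gap = max(0.0, al - bh, bl - ah)
        total += gap * gap
    return math.sqrt(total)


def can_prune(a, b, z):
    """True if no object pair inside MBRs a and b can be closer than z.

    z plays the role of the current pruning distance, e.g. the distance
    of the K-th closest pair found so far (an assumption of this sketch).
    """
    return min_dist_between_mbrs(a, b) > z


if __name__ == "__main__":
    left = ((0.0, 0.0), (1.0, 1.0))
    right = ((4.0, 5.0), (6.0, 7.0))
    print(min_dist_between_mbrs(left, right))
    print(can_prune(left, right, z=3.0))
```

A branch-and-bound traversal would apply such a test to pairs of R-tree nodes and skip every pair for which it holds, shrinking `z` as better pairs are found.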

Information Sciences | 2007

A performance comparison of distance-based query algorithms using R-trees in spatial databases

Antonio Corral; Jesús Manuel Almendros-Jiménez

Efficient processing of distance-based queries (DBQs) is of great importance in spatial databases due to the wide range of applications that may issue such queries. The most representative and best-known DBQs are the K Nearest Neighbors Query (KNNQ), @r Distance Range Query (@rDRQ), K Closest Pairs Query (KCPQ) and @r Distance Join Query (@rDJQ). In this paper, we propose new pruning mechanisms and apply them in the design of new Recursive Best-First Search (RBFS) algorithms for DBQs between spatial objects indexed in R-trees. RBFS is a general search algorithm that runs in linear space and expands nodes in best-first order, but it can suffer from node re-expansion overhead (i.e. to expand nodes in best-first order, some nodes can be considered more than once). The R-tree and its variations are commonly cited spatial access methods that can be used for answering such spatial queries. Moreover, an exhaustive experimental study using R-trees is included, which led to several conclusions about the efficiency of the proposed RBFS algorithms and their comparison with other search algorithms (Best-First Search (BFS) and Depth-First Branch-and-Bound (DFBnB)), in terms of disk accesses, response time and main memory requirements, taking into account several important parameters such as the maximum branching factor (Cmax), the cardinality of the final query result (K), the distance threshold (@r) and the size of a global LRU buffer (B). In general, RBFS is competitive for KNNQ and KCPQ when the maximum branching factor (Cmax) is large enough (even better than DFBnB and very close to BFS), and it is a good alternative when main memory is limited due to high process load in the system, since its space consumption is linear with respect to the height of the R-trees. Nevertheless, RBFS is the worst alternative for @rDRQ and @rDJQ. DFBnB is also a linear-space algorithm; it obtains the same behavior as BFS for @rDRQ and @rDJQ, and it is the best when an LRU buffer is included. Finally, we have been able to check experimentally that BFS is the best for all DBQs, but it can consume a large amount of main memory to perform spatial queries.

The Computer Journal | 2005

On Approximate Algorithms for Distance-Based Queries using R-trees

Antonio Corral; Michael Vassilakopoulos

In modern database applications the similarity or dissimilarity of complex objects is examined by performing distance-based queries (DBQs) on data of high dimensionality. The R-tree and its variations are commonly cited multidimensional access methods that can be used for answering such queries. Although the related algorithms work well for low-dimensional data spaces, their performance degrades as the number of dimensions increases (dimensionality curse). In order to obtain acceptable response time in high-dimensional data spaces, algorithms that obtain approximate solutions can be used. Approximation techniques, like N-consider (based on the tree structure), α-allowance and e-approximate (based on distance), or Time-consider (based on time) can be applied in branch-and-bound algorithms for DBQs in order to control the trade-off between cost and accuracy of the result. In this paper, we improve previous approximate DBQ algorithms by applying a combination of the approximation techniques in the same query algorithm (hybrid approximation scheme). We investigate the performance of these improvements for one of the most representative DBQs (the K-closest pairs query, K-CPQ) in high-dimensional data spaces, as well as the influence of the algorithmic parameters on the control of the trade-off between the response time and the accuracy of the result. The outcome of the experimental evaluation, using synthetic and real datasets, is the derivation of the outperforming DBQ approximate algorithm for large high-dimensional point datasets.
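As a rough illustration of how a distance-based approximation can trade accuracy for cost, the sketch below contrasts an exact pruning test with a relaxed one. The relaxation by a factor `(1 + eps)` is an assumption of this sketch, following the usual way relative-error approximate search is formulated; the abstract does not give the paper's definitions, and the function names are hypothetical.

```python
def can_prune_exact(min_dist, z):
    """Exact test: discard a candidate whose lower bound exceeds the pruning distance z."""
    return min_dist > z


def can_prune_relaxed(min_dist, z, eps):
    """Relaxed test (assumed form): also discard candidates that could improve
    the current result by at most a relative factor of eps.

    eps = 0 reduces to the exact test; larger eps prunes more aggressively,
    lowering cost at the price of possibly missing slightly closer pairs.
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")
    return min_dist * (1.0 + eps) > z


if __name__ == "__main__":
    # A candidate whose lower bound is just below the pruning distance.
    print(can_prune_exact(9.5, 10.0))
    print(can_prune_relaxed(9.5, 10.0, eps=0.1))
```

With `eps` as a tunable parameter, the same branch-and-bound algorithm can be run anywhere between exact and heavily approximate behavior.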

Geoinformatica | 2004

Multi-Way Distance Join Queries in Spatial Databases

Antonio Corral; Yannis Manolopoulos; Yannis Theodoridis; Michael Vassilakopoulos

Let a tuple of n objects obeying a query graph (QG) be called the n-tuple. The “D_distance-value” of this n-tuple is the value of a linear function of distances of the n objects that make up this n-tuple, according to the edges of the QG. This paper addresses the problem of finding the K n-tuples between n spatial datasets that have the smallest D_distance-values, the so-called K-multi-way distance join query (K-MWDJQ), where each set is indexed by an R-tree-based structure. This query can be viewed as an extension of the K-closest-pairs query (K-CPQ) [8] for n inputs. In addition, a recursive non-incremental branch-and-bound algorithm following a depth-first search for processing synchronously all inputs without producing any intermediate result is proposed. Enhanced pruning techniques are also applied to the nodes of the n R-trees in order to reduce the total response time and the number of distance computations of the query. Due to the exponential nature of the problem, we also propose a time-based approximate version of the recursive algorithm that combines approximation techniques to adjust the quality of the result and the global processing time. Finally, we give a detailed experimental study of the proposed algorithms using real spatial datasets, highlighting their performance and the quality of the approximate results.
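The D_distance-value defined above can be sketched as a weighted sum of pairwise distances over the edges of the query graph. The edge representation `(i, j, weight)` and the use of point objects with Euclidean distance are assumptions made for this example; the abstract only states that the function is linear in the edge distances.

```python
import math


def d_distance(objects, edges):
    """Linear function of the distances along the query-graph edges.

    objects: the n-tuple, one coordinate tuple per input dataset.
    edges:   iterable of (i, j, weight) triples, where i and j index
             positions in the n-tuple (hypothetical encoding of the QG).
    """
    return sum(w * math.dist(objects[i], objects[j]) for i, j, w in edges)


if __name__ == "__main__":
    # A 3-tuple joined by a chain query graph: 0 -- 1 -- 2, unit weights.
    n_tuple = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
    chain = [(0, 1, 1.0), (1, 2, 1.0)]
    print(d_distance(n_tuple, chain))
```

A K-MWDJQ would return the K n-tuples with the smallest such values; with n = 2 and a single unit-weight edge this reduces to the distance of a pair, as in the K-CPQ.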

advances in databases and information systems | 2002

Approximate Algorithms for Distance-Based Queries in High-Dimensional Data Spaces Using R-Trees

Antonio Corral; Joaquín Cañadas; Michael Vassilakopoulos

In modern database applications the similarity or dissimilarity of complex objects is examined by performing distance-based queries (DBQs) on data of high dimensionality. The R-tree and its variations are commonly cited multidimensional access methods that can be used for answering such queries. Although the related algorithms work well for low-dimensional data spaces, their performance degrades as the number of dimensions increases (dimensionality curse). In order to obtain acceptable response time in high-dimensional data spaces, algorithms that obtain approximate solutions can be used. Three approximation techniques (α-allowance, N-consider and M-consider) and the respective recursive branch-and-bound algorithms for DBQs are presented and studied in this paper. We investigate the performance of these algorithms for the most representative DBQs (the K-nearest neighbors query and the K-closest pairs query) in high-dimensional data spaces, where the point data sets are indexed by tree-like structures belonging to the R-tree family: R*-trees and X-trees. The searching strategy is tuned according to several parameters, in order to examine the trade-off between cost (I/O activity and response time) and accuracy of the result. The outcome of the experimental evaluation is the derivation of the outperforming DBQ approximate algorithm for large high-dimensional point data sets.

advances in databases and information systems | 2011

Performance comparison of xBR-trees and R*-trees for single dataset spatial queries

George Roumelis; Michael Vassilakopoulos; Antonio Corral

Processing of spatial queries has been studied extensively in the literature. In most cases, it is accomplished by indexing spatial data with an access method. For queries involving a single dataset, like the Point Location Query, the Window (Distance Range) Query, and the (Constrained) K Nearest Neighbor Query, the R*-tree (a data-driven structure) is a very popular choice of such a method. In this paper, we compare the performance of the R*-tree for processing single dataset spatial queries to the performance of a disk-based structure that belongs to the Quadtree family, the xBR-tree (a space-driven structure). We present performance results (I/O efficiency and execution time) of extensive experimentation based on real datasets, using these two index structures. The winner depends on several parameters and the results show that the xBR-tree is a promising alternative for these spatial operations.

data and knowledge engineering | 2006

Cost models for distance joins queries using R-trees

Antonio Corral; Yannis Manolopoulos; Yannis Theodoridis; Michael Vassilakopoulos

The K-Closest-Pairs Query (K-CPQ), a type of distance join in spatial databases, discovers the K pairs of objects formed from two different datasets with the K smallest distances. Recently, branch-and-bound algorithms based on R-trees have been developed in order to answer K-CPQs efficiently. For query optimization purposes, analytical models are needed to estimate the processing cost of a specific query in order to evaluate alternative execution plans. In this paper, we combine techniques that have been used for the analysis of nearest neighbor and spatial join queries, and derive the performance cost (in terms of disk accesses) of K-CPQs using R-trees. Moreover, we present two interesting extensions of the cost model for K-CPQs, one exploiting buffer management with R-trees and another for a second type of distance join, the so-called buffer queries. The proposed cost models are verified under a variety of distributions in 2-dimensional space on both synthetic and real datasets, and are shown to achieve accurate estimations of the measured experimental results.

Lecture Notes in Computer Science | 1999

Algorithms for Joining R-Trees and Linear Region Quadtrees

Antonio Corral; Michael Vassilakopoulos; Yannis Manolopoulos

The family of R-trees is suitable for storing various kinds of multidimensional objects and is considered an excellent choice for indexing a spatial database. Region Quadtrees are suitable for storing 2-dimensional regional data and their linear variant is used in many Geographical Information Systems for this purpose. In this report, we present five algorithms suitable for processing join queries between these two successful, although very different, access methods. Two of the algorithms are based on heuristics that aim at minimizing I/O cost with a limited amount of main memory. We also present the results of experiments performed with real data that compare the I/O performance of these algorithms.

conference on current trends in theory and practice of informatics | 2014

A New Plane-Sweep Algorithm for the K-Closest-Pairs Query

George Roumelis; Michael Vassilakopoulos; Antonio Corral; Yannis Manolopoulos

One of the most representative and studied Distance-Based Queries in Spatial Databases is the K-Closest-Pairs Query (KCPQ). This query involves two spatial data sets and a distance function to measure the degree of closeness, along with a given number K of elements of the result. The output is a set of pairs of objects (with one object element from each set), with the K lowest distances. In this paper, we study the problem of processing KCPQs between RAM-based point sets, using Plane-Sweep (PS) algorithms. We utilize two improvements that can be applied to a PS algorithm and propose a new algorithm that minimizes the number of distance computations, in comparison to the classic PS algorithm. By extensive experimentation, using real and synthetic data sets, we highlight the most efficient improvement and show that the new PS algorithm outperforms the classic one, in most cases.
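A simplified plane-sweep flavor of the KCPQ for in-memory point sets can be sketched as follows. This is not the new algorithm of the paper nor its two improvements, which the abstract does not describe; it only shows the basic idea of sorting along one axis and skipping distance computations once the gap on that axis exceeds the current K-th best distance. The function name and result layout are assumptions.

```python
import heapq
import math
from bisect import bisect_left


def kcpq_plane_sweep(P, Q, k):
    """Return the k closest pairs as sorted (distance, p, q) tuples.

    Both point sets are sorted on the first coordinate. For each p, only
    points q whose first coordinate lies within the current threshold of
    p's are compared; the threshold is the k-th smallest distance so far.
    """
    P, Q = sorted(P), sorted(Q)
    qx = [q[0] for q in Q]
    heap = []  # max-heap of the best k pairs, stored as (-distance, p, q)
    for p in P:
        delta = -heap[0][0] if len(heap) == k else math.inf
        # Skip every q lying more than delta to the left of p on the sweep axis.
        start = bisect_left(qx, p[0] - delta)
        for j in range(start, len(Q)):
            q = Q[j]
            delta = -heap[0][0] if len(heap) == k else math.inf
            if q[0] - p[0] > delta:
                break  # all remaining q are even farther along the sweep axis
            d = math.dist(p, q)
            if len(heap) < k:
                heapq.heappush(heap, (-d, p, q))
            elif d < delta:
                heapq.heapreplace(heap, (-d, p, q))
    return sorted((-neg_d, p, q) for neg_d, p, q in heap)


if __name__ == "__main__":
    P = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
    Q = [(1.0, 0.0), (5.0, 6.0), (20.0, 20.0)]
    for dist, p, q in kcpq_plane_sweep(P, Q, 2):
        print(dist, p, q)
```

The number of calls to `math.dist` is the quantity such algorithms try to minimize; the sweep-axis test above avoids them for pairs that are already too far apart in one dimension.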