Cristina Dutra de Aguiar Ciferri

Publication

Featured research published by Cristina Dutra de Aguiar Ciferri.

International Journal of Data Warehousing and Mining | 2013

Cube Algebra: A Generic User-Centric Model and Query Language for OLAP Cubes

Cristina Dutra de Aguiar Ciferri; Ricardo Rodrigues Ciferri; Leticia I. Gómez; Markus Schneider; Alejandro A. Vaisman; Esteban Zimanyi

The lack of an appropriate conceptual model for data warehouses and OLAP systems has led to the tendency to deploy logical models (for example, star, snowflake, and constellation schemas) as conceptual models. ER model extensions, UML extensions, special graphical user interfaces, and dashboards have been proposed as conceptual approaches. However, they introduce their own problems, are somewhat complex and difficult to understand, and are not always user-friendly. They also have a steep learning curve, and most of them address only structural design, not considering associated operations. Therefore, they are not really an improvement and, in the end, only represent a reflection of the logical model. The essential drawback of offering this system-centric view as a user concept is that knowledge workers are confronted with the full and overwhelming complexity of these systems as well as complicated and user-unfriendly query languages such as SQL OLAP and MDX. In this article, the authors propose a user-centric conceptual model for data warehouses and OLAP systems, called the Cube Algebra. It takes the cube metaphor literally and provides the knowledge worker with high-level cube objects and related concepts. A novel query language leverages well-known high-level operations such as roll-up, drill-down, slice, and drill-across. As a result, the logical and physical levels are hidden from the unskilled end user.
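As an illustration of the high-level operations named in the abstract, the following is a minimal Python sketch of roll-up and slice over a toy cube. It is not the Cube Algebra language itself: the cell layout, the sample data, and the names `CELLS`, `CITY_TO_COUNTRY`, `roll_up`, and `slice_cube` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy cube: each cell maps (city, month) coordinates to a sales measure.
CELLS = {
    ("Paris", "2013-01"): 10,
    ("Paris", "2013-02"): 5,
    ("Lyon", "2013-01"): 7,
    ("Berlin", "2013-01"): 4,
}

# Hypothetical hierarchy for the geography dimension: city -> country.
CITY_TO_COUNTRY = {"Paris": "France", "Lyon": "France", "Berlin": "Germany"}


def roll_up(cells, dim, mapping):
    """Sum the measure after mapping one dimension to a coarser hierarchy level."""
    out = defaultdict(int)
    for coords, value in cells.items():
        coarse = list(coords)
        coarse[dim] = mapping[coords[dim]]
        out[tuple(coarse)] += value
    return dict(out)


def slice_cube(cells, dim, member):
    """Fix one dimension to a single member and drop that dimension."""
    return {
        coords[:dim] + coords[dim + 1:]: value
        for coords, value in cells.items()
        if coords[dim] == member
    }


# Roll up from city to country, then slice on January 2013.
by_country = roll_up(CELLS, 0, CITY_TO_COUNTRY)
january = slice_cube(by_country, 1, "2013-01")
```

In this sketch the user works only with cube cells and dimension levels; no star schema or SQL appears, which is the separation of levels the abstract argues for.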

Geoinformatica | 2012

The SB-index and the HSB-index: efficient indices for spatial data warehouses

Thiago Luís Lopes Siqueira; Cristina Dutra de Aguiar Ciferri; Valéria Cesário Times; Ricardo Rodrigues Ciferri

Spatial data warehouses (SDWs) allow for spatial analysis together with analytical multidimensional queries over huge volumes of data. The challenge is to retrieve data related to ad hoc spatial query windows according to spatial predicates, avoiding the high cost of joining large tables. Therefore, mechanisms to provide efficient query processing over SDWs are essential. In this paper, we propose two efficient indices for SDW: the SB-index and the HSB-index. The proposed indices share the following characteristics. They enable multidimensional queries with spatial predicates for SDW and also support predefined spatial hierarchies. Furthermore, they compute the spatial predicate and transform it into a conventional one, which can be evaluated together with other conventional predicates by accessing a star-join Bitmap index. While the SB-index has a sequential data structure, the HSB-index uses a hierarchical data structure to enable spatial object clustering and a specialized buffer-pool to decrease the number of disk accesses. The advantages of the SB-index and the HSB-index over the DBMS resources for SDW indexing (i.e. star-join computation and materialized views) were investigated through performance tests, which issued roll-up operations extended with containment and intersection range queries. The performance results showed that improvements ranged from 68% up to 99% over both the star-join computation and the materialized view. Furthermore, the proposed indices proved to be very compact, adding less than 1% to the storage requirements. Therefore, both the SB-index and the HSB-index are excellent choices for SDW indexing. Choosing between the SB-index and the HSB-index mainly depends on the query selectivity of spatial predicates. While low query selectivity benefits the HSB-index, the SB-index provides better performance for higher query selectivity.

Information Systems | 2011

Slicing the metric space to provide quick indexing of complex data in the main memory

Caio César Mori Carélo; Ives Rene Venturini Pola; Ricardo Rodrigues Ciferri; Agma J. M. Traina; Caetano Traina; Cristina Dutra de Aguiar Ciferri

Searching in a dataset for elements that are similar to a given query element is a core problem in applications that manage complex data, and has been aided by metric access methods (MAMs). A growing number of applications require indices that must be built faster and repeatedly, also providing faster response for similarity queries. The increase in the main memory capacity and its lowering costs also motivate using memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that slices the metric space into disjoint subspaces to provide quick indexing of complex data. It introduces three major characteristics: (i) a partitioning method that controls the number of disjoint subspaces generated at each node; (ii) a replacement technique that can change the leaf node pivots in insertion operations; and (iii) range and k-NN extended query algorithms to support the new partitioning method, including a new visit order of the subspaces in k-NN queries. Performance tests with both real-world and synthetic datasets showed that the Onion-tree is very compact. Comparisons of the Onion-tree with the MM-tree and a memory-based version of the Slim-tree showed that the Onion-tree was always faster to build the index. The experiments also showed that the Onion-tree significantly improved range and k-NN query processing performance and was the most efficient MAM, followed by the MM-tree, which in turn outperformed the Slim-tree in almost all the tests.

Combinatorial Pattern Matching | 2013

External Memory Generalized Suffix and LCP Arrays Construction

Felipe Alves da Louza; Guilherme P. Telles; Cristina Dutra de Aguiar Ciferri

A suffix array is a data structure that, together with the LCP array, allows solving many string processing problems in a very efficient fashion. In this article we introduce eGSA, the first external memory algorithm to construct both generalized suffix and LCP arrays for sets of strings. Our algorithm relies on a combination of buffers, induced sorting and a heap. Performance tests with real DNA sequence sets of size up to 8.5 GB showed that eGSA can indeed be applied to sets of large sequences with efficient running time on a low-cost machine. Compared to eSAIS, the algorithm that most closely resembles eGSA in purpose, eGSA reduced the time spent to construct the arrays by a factor of 2.5 to 4.8.
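To make the two output structures concrete, here is a small in-memory Python sketch that builds a generalized suffix array and its LCP array by naive sorting. It is not eGSA: it uses no external memory, buffers, induced sorting, or heap, and it only shows what the arrays contain. The function names, and the choice to break ties between equal suffixes by string id, are assumptions for this example.

```python
def generalized_suffix_array(strings):
    """Return (string_id, offset) pairs for all suffixes, in lexicographic order.

    Naive in-memory sort for illustration; equal suffixes are ordered by string id.
    """
    suffixes = [
        (s[i:], sid, i)
        for sid, s in enumerate(strings)
        for i in range(len(s))
    ]
    suffixes.sort()
    return [(sid, i) for _, sid, i in suffixes]


def lcp_array(strings, gsa):
    """lcp[k] is the longest-common-prefix length of suffixes gsa[k-1] and gsa[k]."""
    lcp = [0] * len(gsa)
    for k in range(1, len(gsa)):
        a_id, a_off = gsa[k - 1]
        b_id, b_off = gsa[k]
        a, b = strings[a_id][a_off:], strings[b_id][b_off:]
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        lcp[k] = n
    return lcp


strings = ["GAT", "AT"]
gsa = generalized_suffix_array(strings)   # [(0, 1), (1, 0), (0, 0), (0, 2), (1, 1)]
lcp = lcp_array(strings, gsa)             # [0, 2, 0, 0, 1]
```

The naive version materializes every suffix, so its memory use grows quadratically with input length; that cost is what makes a dedicated external memory construction necessary for multi-gigabyte DNA sets.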

Journal of the Brazilian Computer Society | 2009

The impact of spatial data redundancy on SOLAP query performance

Thiago Luís Lopes Siqueira; Cristina Dutra de Aguiar Ciferri; Valéria Cesário Times; Anjolina Grisi de Oliveira; Ricardo Rodrigues Ciferri

Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making processes and spatial analysis, and the literature proposes several conceptual and logical data models for GDW. However, little effort has been focused on studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over GDW. In this paper, we investigate this issue. Firstly, we compare redundant and non-redundant GDW schemas and conclude that redundancy is related to high performance losses. We also analyze the issue of indexing, aiming at improving SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by R-tree and the star-join aided by GiST indicate that the SB-index significantly improves the elapsed time in query processing from 25% up to 99% with regard to SOLAP queries defined over the spatial predicates of intersection, enclosure and containment and applied to roll-up and drill-down operations. We also investigate the impact of the increase in data volume on the performance. The increase did not impair the performance of the SB-index, which highly improved the elapsed time in query processing. Performance tests also show that the SB-index is far more compact than the star-join, requiring only a small fraction of at most 0.20% of the volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy. This enhancement improved performance from 80 to 91% for redundant GDW schemas.

ACM Symposium on Applied Computing | 2009

A spatial bitmap-based index for geographical data warehouses

Thiago Luís Lopes Siqueira; Ricardo Rodrigues Ciferri; Valéria Cesário Times; Cristina Dutra de Aguiar Ciferri

In this paper we propose the Spatial Bitmap Index (SB-index), which is an index based on Bitmap and Minimum Bounding Rectangle (MBR) to provide efficient query processing in Geographical Data Warehouses. The SB-index is built on the primary key of a spatial dimension table, and maintains the MBR of a given spatial attribute. Query processing requires a scan on the index, which compares the query spatial predicate against the current MBR. This scan supplies a set of candidate solutions to a refinement step that evaluates each candidate. Finally, only the index entries from objects that satisfy the spatial predicate must be accessed in order to answer the submitted query. Comparisons between the SB-index and the star-join indexed with R-tree and GiST showed a significant improvement of 25% up to 95% with regard to the query processing time. This performance gain occurs because the SB-index restricts the set of candidates and avoids the star-join calculation.
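The filter-and-refinement flow described in the abstract can be sketched in Python as follows. This is a simplified illustration rather than the authors' implementation: geometries are modelled as small point sets, the spatial predicate is window intersection, and the names `Rect`, `GEOMETRIES`, `INDEX`, and `query` are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Rect, b: Rect) -> bool:
    """True when two axis-aligned rectangles overlap or touch."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def mbr(points) -> Rect:
    """Minimum bounding rectangle of a point set."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Rect(min(xs), min(ys), max(xs), max(ys))


# Hypothetical spatial dimension table: primary key -> exact geometry (point set).
GEOMETRIES = {
    1: [(0, 0), (1, 1)],
    2: [(0, 10), (10, 0)],   # MBR overlaps the window below, but no point lies in it
    3: [(20, 20), (21, 21)],
    4: [(5, 5), (7, 7)],
}

# Sequential index: one (primary key, MBR) entry per spatial object.
INDEX = [(key, mbr(points)) for key, points in GEOMETRIES.items()]


def query(window: Rect):
    """Return (candidates, answer) for an intersection query against the window."""
    # Filter step: sequential scan comparing each stored MBR with the window.
    candidates = [key for key, box in INDEX if intersects(box, window)]
    # Refinement step: test the exact geometry of each candidate.
    answer = [
        key for key in candidates
        if any(window.xmin <= x <= window.xmax and window.ymin <= y <= window.ymax
               for x, y in GEOMETRIES[key])
    ]
    return candidates, answer


candidates, answer = query(Rect(4, 4, 6, 6))   # candidates [2, 4], answer [4]
```

In the approach the abstract describes, the surviving primary keys would then select the corresponding bitmap entries; that bitmap lookup is omitted from this sketch.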

Data Warehousing and Knowledge Discovery | 2010

Benchmarking spatial data warehouses

Thiago Luís Lopes Siqueira; Ricardo Rodrigues Ciferri; Valéria Cesário Times; Cristina Dutra de Aguiar Ciferri

Spatial data warehouses (SDW) enable analytical multidimensional queries together with spatial analysis. Mainly, three operations are related to SDW query processing performance: (i) joining large fact tables and large spatial and non-spatial dimension tables; (ii) computing one or more costly spatial predicates based on spatial ad hoc query windows; and (iii) aggregating data according to different spatial granularity levels. Several techniques to improve the query processing performance over SDW have been proposed in the literature. However, we identified the lack of a benchmark to carry out a controlled experimental evaluation of such techniques and, principally, to effectively measure the costs of the aforementioned three complex operations. In this paper, we propose a novel spatial data warehouse benchmark, called Spadawan, to provide performance evaluation environments for SDW and enable a further investigation on spatial data redundancy. The Spadawan benchmark is available at http://gbd.dc.ufscar.br/spadawan.

Data Warehousing and Knowledge Discovery | 2014

Processing OLAP Queries over an Encrypted Data Warehouse Stored in the Cloud

Claudivan Cruz Lopes; Valéria Cesário Times; Stan Matwin; Ricardo Rodrigues Ciferri; Cristina Dutra de Aguiar Ciferri

Several studies deal with mechanisms for processing transactional queries over encrypted data. However, little attention has been devoted to determining how a data warehouse (DW) hosted in a cloud should be encrypted to enable analytical query processing. In this article, we present a novel method for encrypting a DW and show performance results of this DW implementation. Moreover, an OLAP system based on the proposed encryption method was developed and performance tests were conducted to validate our system in terms of query processing performance. Results showed that the overhead caused by the proposed encryption method decreased when the proposed system was scaled out and compared to a non-encrypted dataset (46.62% with one node and 9.47% with 16 nodes). Also, the computation of aggregates and data groupings over encrypted data in the server produced performance gains (from 84.67% to 93.95%) when compared to their executions in the client, after decryption.

ACM Symposium on Applied Computing | 2007

Horizontal fragmentation as a technique to improve the performance of drill-down and roll-up queries

Cristina Dutra de Aguiar Ciferri; Ricardo Rodrigues Ciferri; Diogo Tuler Forlani; Agma J. M. Traina; Fernando da Fonseca de Souza

In this paper, we focus on the horizontal fragmentation of data warehouses. Our main contribution is the proposal of the MHF-DHA algorithm, which is aimed at improving the performance of drill-down and roll-up queries by horizontally fragmenting data warehouses organized in different levels of aggregation. Besides allowing multiple dimensions to be used as a basis for the fragmentation, the algorithm also exploits the hierarchical structure of these dimensions. The performance tests carried out using the TPC-H benchmark showed that the proposed fragmentation provides a large improvement in query performance, with a reduction in elapsed time and disk accesses between 71% and 99%.

Geoinformatica | 2014

Modeling vague spatial data warehouses using the VSCube conceptual model

Thiago Luís Lopes Siqueira; Cristina Dutra de Aguiar Ciferri; Valéria Cesário Times; Ricardo Rodrigues Ciferri

Although many real-world phenomena are vague and characterized by having uncertain location or vague shape, existing spatial data warehouse models do not support spatial vagueness and therefore cannot properly represent these phenomena. In this paper, we propose the VSCube conceptual model to represent and manipulate shape vagueness in spatial data warehouses, allowing the analysis of business scores related to vague spatial data, and therefore improving the decision-making process. Our VSCube conceptual model is based on the cube metaphor and supports geometric shapes and the corresponding membership values, thus providing more expressiveness to represent vague spatial data. We also define vague spatial aggregation functions (e.g. vague spatial union) and vague spatial predicates to enable vague SOLAP queries (e.g. intersection range queries). Finally, we introduce the concept of vague SOLAP and its operations (e.g. drill-down and roll-up). We demonstrate the applicability of our model by describing an application concerning pest control in agriculture and by discussing the reuse of existing models in the VSCube conceptual model.
