Paulo H. Oliveira
University of São Paulo
Publication
Featured research published by Paulo H. Oliveira.
Tellus B | 2007
Paulo H. Oliveira; Paulo Artaxo; Carlos Pires; Silvia De Lucca; A. S. Procopio; Brent N. Holben; J. S. Schafer; Luiz F. Cardoso; Steven C. Wofsy; Humberto R. Rocha
Aerosol particles associated with biomass burning emissions affect the surface radiative budget and net ecosystem exchange (NEE) over large areas in Amazonia during the dry season. We analysed CO2 fluxes as a function of aerosol loading for two forest sites in Amazonia as part of the LBA experiment. Aerosol optical thickness (AOT) measurements were made with AERONET sun photometers, and CO2 flux measurements were determined by eddy-correlation. The enhancement of the NEE varied with different aerosol loading, as well as cloud cover, solar elevation angles and other parameters. The AOT value with the strongest effect on the NEE in the FLONA-Tapajós site was 1.7, with an enhancement of the NEE of 11% compared with clear-sky conditions. In the RBJ site, the strongest effect was for AOT of 1.6 with an enhancement of 18% in the NEE. For values of AOT larger than 2.7, a strong reduction in the NEE was observed due to the reduction in the total solar radiation. The enhancement in the NEE is attributed to the increase of diffuse versus direct solar radiation. Because aerosols from biomass burning are present in most tropical areas, their effects on the global carbon budget could also be significant.
advances in databases and information systems | 2015
Paulo H. Oliveira; Caetano Traina; Daniel S. Kaster
Metric Access Methods (MAMs) have been shown to perform similarity queries over complex data more efficiently than other access methods. They can be considered dynamic or static depending on the pivot type used in their construction. Global pivots tend to compromise the dynamicity of MAMs, as any pivot-related updates must be propagated through the entire structure, while local pivots allow this maintenance to occur locally. Several applications handle online complex data and, consequently, demand efficient dynamic indexes to be successful. In this context, this work presents two techniques for improving the pruning ability of dynamic MAMs: (i) using cutting local additional pivots to reduce distance calculations and (ii) anticipating information from child nodes to reduce unnecessary disk accesses. The experiments reveal significant improvements in a dynamic MAM, reducing execution time by more than 50% for similarity queries posed on datasets ranging from moderate to high dimensionality and cardinality.
international conference on enterprise information systems | 2016
Paulo H. Oliveira; Antonio C. Fraideinberze; Natan A. Laverde; Hugo Gualdron; André S. Gonzaga; Lucas D. R. Ferreira; Willian D. Oliveira; Jose F. Rodrigues-Jr.; Robson L. F. Cordeiro; Caetano Traina; Agma J. M. Traina; Elaine P. M. de Sousa
Crowdsourcing solutions can be helpful to extract information from disaster-related data during crisis management. However, certain information can only be obtained through similarity operations. Some of them also depend on additional data stored in a Relational Database Management System (RDBMS). In this context, several works focus on crisis management supported by data. Nevertheless, none of them provide a methodology for employing a similarity-enabled RDBMS in disaster-relief tasks. To fill this gap, we introduce a methodology together with the Data-Centric Crisis Management (DCCM) architecture, which employs our methods over a similarity-enabled RDBMS. We evaluate our proposal through three tasks: classification of incoming data regarding current events, identifying relevant information to guide rescue teams; filtering of incoming data, enhancing the decision support by removing near-duplicate data; and similarity retrieval of historical data, supporting analytical comprehension of the crisis context. To make it possible, similarity-based operations were implemented within one popular, open-source RDBMS. Results using real data from Flickr show that our proposal is feasible for real-time applications. In addition to high performance, accurate results were obtained with a proper combination of techniques for each task. Hence, we expect our work to provide a framework for further developments on crisis management solutions.
symposium on applied computing | 2017
Paulo H. Oliveira; Lucas C. Scabora; Caetano Traina; Daniel S. Kaster
Relational Database Management Systems (RDBMS) organize data into relations, representing them as tables. Most queries executed over them are optimized by index structures. However, for queries that require searching across multiple tables, the common approach involves scanning multiple indexes and combining their results, which is potentially costly, especially for similarity queries over complex data. This paper proposes a new type of index for modern RDBMS called domain index. This proposal consists of indexes that allow searching columns of the same type, across multiple tables, with a single index scan, hence with superior performance. To evaluate our proposal, we carried out experiments (i) over a medical image dataset, to evaluate the performance in content-based similarity queries; and (ii) over a flow-based intrusion detection dataset, to evaluate the performance in conventional queries both in a real scenario and over synthetic data, so as to evaluate scalability. The results exhibit the higher performance of domain indexes. Specifically, the gains reached up to 42.9% in similarity queries and up to 65.9% in conventional queries. As the first paper on this subject, we expect this work to provide a basis for further developments on indexing techniques over domains of attributes within modern RDBMS.
computer-based medical systems | 2017
Paulo H. Oliveira; Lucas C. Scabora; Mirela T. Cazzolato; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina
Performing content-based image retrieval over large repositories of medical images demands efficient computational techniques. The use of such techniques is intended to speed up the work of physicians, who often have to deal with information from multiple data repositories. When dealing with multiple data repositories, the common computational approach is to search each repository separately and merge the multiple results into one final response, which slows down the whole process. This can be improved if we build a mechanism able to search several repositories as if they were a single one, i.e. a mechanism to search the whole domain of medical images. Aiming at this goal, we propose the Domain Index, a new category of index structures aimed at efficiently searching domains of data, regardless of the repository to which they belong. To evaluate our proposal, we carried out experiments over multiple mammography repositories involving k Nearest Neighbor (kNN) and Range queries. The results show that images from any repository are seamlessly retrieved, even sustaining gains in performance of up to 36% in kNN queries and up to 7% in Range queries. The experimental evaluation shows that the Domain Index allows fast retrieval from multiple data repositories for medical systems, allowing a better performance in similarity queries over them.
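The contrast drawn in this abstract, between searching each repository separately and merging the results versus searching the whole domain at once, can be sketched as follows. This is only an illustration of the query semantics using linear scans, not the Domain Index itself; the function names, the `repos` dictionary layout, and the use of Euclidean distance over feature vectors are all assumptions made for the example.

```python
import heapq
import math


def knn_per_repository(repos, query, k):
    """Search each repository separately, then merge the partial results."""
    candidates = []
    for name, items in repos.items():
        # One kNN search per repository (hypothetical layout: name -> list of vectors).
        local = heapq.nsmallest(k, items, key=lambda x: math.dist(x, query))
        candidates.extend((math.dist(x, query), name, x) for x in local)
    # Merge step: keep the k best among all per-repository answers.
    return heapq.nsmallest(k, candidates)


def knn_over_domain(repos, query, k):
    """Search all repositories as if they were a single collection."""
    scored = (
        (math.dist(x, query), name, x)
        for name, items in repos.items()
        for x in items
    )
    return heapq.nsmallest(k, scored)


repos = {
    "repo_a": [(0.0, 0.0), (3.0, 3.0)],
    "repo_b": [(1.0, 1.0), (9.0, 9.0)],
}
result = knn_over_domain(repos, (0.0, 0.0), 2)
```

Both functions return the same neighbors, each tagged with the repository it came from; the difference is that the first runs one search per repository plus a merge, which is the overhead the abstract describes.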
database and expert systems applications | 2018
Pedro H. B. Siqueira; Paulo H. Oliveira; Marcos V. N. Bedo; Daniel S. Kaster
The handling of massive data requires the retrieval procedures to be aligned with the storage model. Similarity searching is an established paradigm for querying large datasets by content, in which data elements are compared by means of metric distance functions. Although several strategies have been proposed for the storage of data queried by metrics into relational schemas, no empirical assessment on the suitability of such strategies for similarity searching has been conducted. In this study, we aim at filling this gap by providing an in-depth evaluation of storage models for Relational Database Management Systems (RDBMS) in standard SQL. Accordingly, we propose a taxonomy, which divides approaches into four categories, Binary, Relational, Object-Relational, and Semistructured, and implement a representative storage model for each category within a common framework. We carried out extensive experiments on the four implemented strategies, and results indicate the Relational and Object-Relational storage models outperform the other competitors in most scenarios, whereas the Binary storage model reaches a good performance for queries with costly comparisons. Finally, the Object-Relational approach showed the best compromise between performance and representation, since its behavior is similar to the Relational storage model with a cleaner representation.
international conference on conceptual structures | 2017
Gabriel Spadon; Lucas C. Scabora; Paulo H. Oliveira; Marcus V.S. Araujo; Bruno Brandoli Machado; Elaine P. M. de Sousa; Caetano Traina-Jr.; Jose F. Rodrigues-Jr
Complex networks are commonly used to model urban street networks, which aids the analysis of criminal activities in cities. Although several works focus on this application, there is a lack of a clear methodology focused on the analysis of crime behavior. In this sense, we propose a methodology for employing complex networks in the analysis of criminality spread within criminal areas of a city. Here, we evaluate synthetic cases of crime propagation concerning real criminal data from the North American city of San Francisco, CA. Our results confirm the effectiveness of our methodology in analyzing crime behavior by means of criminality spread. Hence, this paper supports further development and planning on public safety in cities.
Information Systems | 2017
Paulo H. Oliveira; Caetano Traina; Daniel S. Kaster
Constant technological advances in electronic devices have led to the growth of elaborate data such as large texts, time series, georeferenced imagery, genetic sequences, photos, videos and several other types of complex data. Unlike traditional scalar data types such as numbers and strings, complex data do not present the order relation property, which allows identifying whether an element precedes another according to some criterion. Therefore, these data are usually compared by the degree of similarity among them. Metric Access Methods (MAMs) are recognized as well suited to perform similarity queries over this kind of data more efficiently than other access methods. MAMs can be considered dynamic or static depending on the pivot type used to construct them. Pivots are often employed to narrow the search for data. Global pivots can be employed to look into elements in the whole dataset, and thus have a high impact on the process of pruning irrelevant elements, since a single global pivot can be used to discard a large number of irrelevant elements. Nevertheless, MAMs based on global pivots may have their dynamicity compromised by the fact that any pivot-related updates must be propagated through the entire structure. Local pivots, on the other hand, allow the maintenance to occur locally at the price of a lower pruning ability. In this paper, we propose novel techniques for improving the performance of dynamic MAMs without harming their dynamicity, since several applications handle online complex data and, consequently, demand efficient dynamic indexes to be successful.
Specifically, our main contributions are three techniques: (i) CLAP, which consists of employing local additional pivots to reduce distance calculations; (ii) ACIR, which is combined with CLAP and anticipates information from child nodes to reduce unnecessary disk accesses; and (iii) SCOOP, which is combined with CLAP as an extended version of ACIR, anticipating a larger amount of information from child nodes. The techniques have been applied to a dynamic MAM and evaluated over real datasets ranging from moderate to high dimensionality and cardinality. The experimental results show that our techniques were able to reduce query execution time by up to 63% for point queries and by up to 53% for queries retrieving multiple elements.
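The role of a pivot in discarding irrelevant elements can be sketched with the triangle inequality: for a query q, a pivot p and an element x, |d(q, p) - d(x, p)| is a lower bound on d(q, x), so x can be skipped whenever that bound exceeds the query radius. The snippet below is a generic illustration of pivot-based pruning for a range query and not an implementation of CLAP, ACIR or SCOOP; the function name, the single-pivot setup and the Euclidean distance are assumptions made for the example.

```python
import math


def range_query_with_pivot(data, pivot, query, radius, dist=math.dist):
    """Return (matches, number of distance computations at query time)."""
    # In a real index these distances would be precomputed and stored
    # when the elements are inserted; here they are computed up front.
    pivot_dists = [dist(x, pivot) for x in data]

    d_qp = dist(query, pivot)
    computed = 1  # the query-to-pivot distance
    matches = []
    for x, d_xp in zip(data, pivot_dists):
        # Triangle inequality: |d(q, p) - d(x, p)| <= d(q, x).
        # If the lower bound already exceeds the radius, x cannot qualify.
        if abs(d_qp - d_xp) > radius:
            continue
        computed += 1
        if dist(query, x) <= radius:
            matches.append(x)
    return matches, computed


data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0), (0.5, 0.5)]
matches, computed = range_query_with_pivot(
    data, pivot=(0.0, 0.0), query=(1.0, 1.0), radius=1.0
)
```

In this small example three of the five elements are discarded using only the stored pivot distances, so the query computes three distances instead of five while returning the same answer as a full scan.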
Archive | 2006
Paulo Artaxo; Paulo H. Oliveira; Luciene L. Lara; Theotonio Pauliquevis; Luciana V. Rizzo; Carlos Pires Junior; M. Paixao; Karla M. Longo; Saulo De Freitas; Alexandre L. Correia
Amazonia and Global Change | 2013
Paulo Artaxo; Luciana V. Rizzo; M. Paixao; Silvia De Lucca; Paulo H. Oliveira; Luciene L. Lara; K. T. Wiedemann; Meinrat O. Andreae; Brent N. Holben; J. S. Schafer; Alexandre L. Correia; Theotonio Pauliquevis