Willian D. Oliveira | Researchain

Publication

Featured research published by Willian D. Oliveira.

international conference on enterprise information systems | 2016

On the Support of a Similarity-enabled Relational Database Management System in Civilian Crisis Situations

Paulo H. Oliveira; Antonio C. Fraideinberze; Natan A. Laverde; Hugo Gualdron; André S. Gonzaga; Lucas D. R. Ferreira; Willian D. Oliveira; Jose F. Rodrigues-Jr.; Robson L. F. Cordeiro; Caetano Traina; Agma J. M. Traina; Elaine P. M. de Sousa

Crowdsourcing solutions can be helpful to extract information from disaster-related data during crisis management. However, certain information can only be obtained through similarity operations. Some of them also depend on additional data stored in a Relational Database Management System (RDBMS). In this context, several works focus on crisis management supported by data. Nevertheless, none of them provides a methodology for employing a similarity-enabled RDBMS in disaster-relief tasks. To fill this gap, we introduce a methodology together with the Data-Centric Crisis Management (DCCM) architecture, which employs our methods over a similarity-enabled RDBMS. We evaluate our proposal through three tasks: classification of incoming data regarding current events, identifying relevant information to guide rescue teams; filtering of incoming data, enhancing the decision support by removing near-duplicate data; and similarity retrieval of historical data, supporting analytical comprehension of the crisis context. To make this possible, similarity-based operations were implemented within a popular, open-source RDBMS. Results using real data from Flickr show that our proposal is feasible for real-time applications. In addition to high performance, accurate results were obtained with a proper combination of techniques for each task. Hence, we expect our work to provide a framework for further developments in crisis management solutions.

similarity search and applications | 2015

Similarity Joins and Beyond: An Extended Set of Binary Operators with Order

Luiz Olmes Carvalho; Lucio F. D. Santos; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

Similarity joins are troublesome database operators that often produce results much larger than the user really needs or expects. In order to return the similar elements, similarity joins also require sorting during the retrieval process, although order is a concept not supported in the relational model. This paper addresses both issues by extending the similarity join concept to a broader set of binary operators, which aim at retrieving the most similar pairs and embed the sorting operation only as an internal processing step, so as to comply with relational theory. Additionally, our extension allows exploring another useful condition not previously considered in similarity retrieval: the negation of predicates. Experiments performed on real and synthetic data show that our operators are fast enough to be used in real applications and scale well both for multidimensional and non-dimensional metric data.
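The idea of returning only the most similar pairs while keeping the sort as an internal step can be illustrated with a minimal sketch. This is not the operator set proposed in the paper: the function name `top_k_similarity_join`, the Euclidean distance, and the `radius` and `k` parameters are assumptions chosen for illustration, and the brute-force cross product stands in for whatever access method a real system would use.

```python
from itertools import product
from math import dist


def top_k_similarity_join(left, right, radius, k):
    """Return the k closest (a, b) pairs whose distance is at most radius.

    Hypothetical helper for illustration; elements are tuples of floats.
    """
    # Range join: keep every cross pair within the distance threshold.
    candidates = [
        (dist(a, b), a, b)
        for a, b in product(left, right)
        if dist(a, b) <= radius
    ]
    # Sorting happens only internally; the caller receives an unordered set,
    # so the result itself carries no notion of order.
    candidates.sort(key=lambda item: item[0])
    return {(a, b) for _, a, b in candidates[:k]}


left = [(0.0, 0.0), (5.0, 5.0)]
right = [(0.0, 1.0), (5.0, 7.0), (9.0, 9.0)]
pairs = top_k_similarity_join(left, right, radius=3.0, k=1)
```

With `k=1` only the single closest pair survives, whereas a plain range join with the same radius would return both qualifying pairs.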

international symposium on multimedia | 2015

Self Similarity Wide-Joins for Near-Duplicate Image Detection

Luiz Olmes Carvalho; Lucio F. D. Santos; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

Near-duplicate image detection plays an important role in several real applications. This task is usually achieved by applying a clustering algorithm followed by refinement steps, which is a computationally expensive process. In this paper we introduce a framework based on a novel similarity join operator, which is able both to replace and to speed up the clustering step, while also removing the need for further refinement processes. It is based on absolute and relative similarity ratios, ensuring that top-ranked image pairs are in the final result. Experiments performed on real datasets show that our proposal is up to three orders of magnitude faster than the best techniques in the literature, always returning a high-quality result set.

computer-based medical systems | 2017

Efficiently Indexing Multiple Repositories of Medical Image Databases

Paulo H. Oliveira; Lucas C. Scabora; Mirela T. Cazzolato; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

Performing content-based image retrieval over large repositories of medical images demands efficient computational techniques. The use of such techniques is intended to speed up the work of physicians, who often have to deal with information from multiple data repositories. When dealing with multiple data repositories, the common computational approach is to search each repository separately and merge the multiple results into one final response, which slows down the whole process. This can be improved if we build a mechanism able to search several repositories as if they were a single one, i.e. a mechanism to search the whole domain of medical images. With this goal, we propose the Domain Index, a new category of index structures aimed at efficiently searching domains of data, regardless of the repository to which they belong. To evaluate our proposal, we carried out experiments over multiple mammography repositories involving k-Nearest Neighbor (kNN) and Range queries. The results show that images from any repository are seamlessly retrieved, with performance gains of up to 36% in kNN queries and up to 7% in Range queries. The experimental evaluation shows that the Domain Index allows fast retrieval from multiple data repositories for medical systems, yielding better performance in similarity queries over them.
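The contrast between searching each repository separately and searching the whole domain at once can be sketched as follows. This is only an illustration of the two query strategies, not the Domain Index itself: the function names, the tuple-based feature vectors, and the brute-force scans are assumptions, and no index structure is involved.

```python
import heapq
from math import dist


def knn_per_repository(repositories, query, k):
    """Search each repository separately, then merge the partial answers."""
    partial = []
    for repo in repositories:
        partial.extend(heapq.nsmallest(k, repo, key=lambda x: dist(x, query)))
    # Merge step: pick the overall k nearest among the per-repository results.
    return heapq.nsmallest(k, partial, key=lambda x: dist(x, query))


def knn_single_domain(repositories, query, k):
    """Search all repositories as if they were a single collection."""
    domain = [x for repo in repositories for x in repo]
    return heapq.nsmallest(k, domain, key=lambda x: dist(x, query))


def range_query(repositories, query, radius):
    """Return every element of the whole domain within radius of the query."""
    return [x for repo in repositories for x in repo if dist(x, query) <= radius]


repos = [[(0.0, 0.0), (4.0, 4.0)], [(1.0, 0.0), (9.0, 9.0)]]
q = (0.0, 0.0)
merged = knn_per_repository(repos, q, k=2)
unified = knn_single_domain(repos, q, k=2)
nearby = range_query(repos, q, radius=1.5)
```

Both kNN strategies return the same answer; the difference the abstract points to is the extra per-repository search and merge work in the first one.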

international conference on enterprise information systems | 2016

Efficient Self-similarity Range Wide-joins Fostering Near-duplicate Image Detection in Emergency Scenarios

Luiz Olmes Carvalho; Lucio F. D. Santos; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

Crowdsourced information is being increasingly employed to improve and support decision making in emergency situations. However, the gathered records quickly become too similar among themselves, and handling several similar reports does not add valuable knowledge to assist the personnel at the control center in their decision-making tasks. The usual approaches to detect and handle the so-called near-duplicate data rely on costly twofold processing. Aiming to reduce the cost and also improve the ability to detect duplicates, we developed a framework model based on the similarity wide-join database operator. We extended the wide-join definition, empowering it to surpass its restrictions and accomplish the near-duplicate detection task too. In this paper, we also provide an efficient algorithm based on pivots that speeds up the entire process and enables retrieving the top similar elements in single-pass processing. Experiments using real datasets show that our framework is up to three orders of magnitude faster than the competing techniques in the literature, while also improving the quality of the result by about 35 percent.

computer-based medical systems | 2015

Color and Texture Influence on Computer-Aided Diagnosis of Dermatological Ulcers

Marcos Vinicius Naves Bedo; Lucio F. D. Santos; Willian D. Oliveira; Gustavo Blanco; Agma J. M. Traina; Marco Antonio Frade; Paulo M. Azevedo-Marques; Caetano Traina Junior

This study presents an analysis of classification techniques for Computer-Aided Diagnosis (CAD) regarding ulcerated lesions. We focus on determining the influence of both color and texture in the automated image classification and its implications. To do so, we assayed a dataset of dermatological ulcers containing five variations in terms of tissue composition of lesion skin: granulation (red), fibrin (yellow), callous (white), necrotic (black), and a mix of the previous variations (mixed). Every image was previously labelled by experts regarding this red-yellow-black-white-mixed model. We employed specially designed color and texture extractors to represent the dataset images, namely: Color Layout, Color Structure, Scalable Color, Edge Histogram, Haralick, and Texture-Spectrum. The first three are color feature extractors and the last three are texture extractors. Next, we employed the Symmetrical Uncert Attribute Eval method to determine the features suitable for image classification. We tested a set of classifiers that follow distinct paradigms over the selected features, achieving an accuracy ratio of up to 77% in terms of images correctly classified, with the area under the receiver operating characteristic (ROC) curve up to 0.84. The classification performance and the selected features enabled us to determine that texture features predominated over color in the entire classification process.

acm symposium on applied computing | 2015

Combine-and-conquer: improving the diversity in similarity search through influence sampling

Lucio F. D. Santos; Willian D. Oliveira; Luiz Olmes Carvalho; Mônica Ribeiro Porto Ferreira; Agma J. M. Traina; Caetano Traina

Result diversification methods are intended to retrieve elements similar to a given object while also enforcing a certain degree of diversity among them, with the aim of improving answer relevance. Most of the methods are based on optimization, but the underlying problems are NP-hard. Diversity is injected into an otherwise all-too-similar result set in two phases: in the first, the search space is reduced to speed up finding the optimal solution, whereas in the second a trade-off between diversity and similarity over the reduced space is obtained. It is assumed that the first phase is achieved by applying a traditional nearest neighbor algorithm, but no previous investigation evaluated the impact of the first phase on the second. In this paper, we devised alternative techniques to execute the first phase and evaluated how obtaining a better-quality set of elements in the first phase can improve the diversity. Besides the traditional nearest neighbor-based pre-selection, we also considered naive random selection, cluster-based and influence-based ones. Thereafter, extensive experiments evaluated a number of state-of-the-art diversity algorithms employed in the second phase, regarding both processing time and answer quality. The obtained results have shown that although the much more elaborate (and much more time-consuming) methods indeed provide the best answers, other alternatives are able to provide a better trade-off between quality and performance. Moreover, the pre-selection techniques can reduce the total running time by up to two orders of magnitude.
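The two-phase scheme described above can be sketched in a few lines. This is a simplified illustration, not any of the algorithms evaluated in the paper: phase 1 uses the traditional nearest neighbor pre-selection, phase 2 uses a greedy heuristic as a stand-in for the optimization step, and the names `preselect_knn`, `greedy_diversify`, and the `trade_off` weight are assumptions.

```python
from math import dist


def preselect_knn(data, query, m):
    """Phase 1: shrink the search space to the m nearest candidates."""
    return sorted(data, key=lambda x: dist(x, query))[:m]


def greedy_diversify(candidates, query, k, trade_off=0.5):
    """Phase 2: greedily pick k elements balancing similarity and diversity."""
    remaining = list(candidates)
    result = []
    while remaining and len(result) < k:
        def score(x):
            similarity_cost = dist(x, query)
            # Distance to the closest already-chosen element (0 if none yet).
            diversity_gain = min((dist(x, r) for r in result), default=0.0)
            # Lower is better: close to the query, far from chosen elements.
            return (1 - trade_off) * similarity_cost - trade_off * diversity_gain
        best = min(remaining, key=score)
        result.append(best)
        remaining.remove(best)
    return result


data = [(1.0, 0.0), (1.1, 0.0), (0.0, 2.0), (8.0, 8.0)]
query = (0.0, 0.0)
candidates = preselect_knn(data, query, m=3)
answer = greedy_diversify(candidates, query, k=2)
```

Here a plain 2-nearest-neighbor query would return the two almost identical points `(1.0, 0.0)` and `(1.1, 0.0)`, while the diversified answer replaces the second with `(0.0, 2.0)`. Swapping `preselect_knn` for a random, cluster-based, or influence-based selection is the kind of phase 1 variation the abstract discusses.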

similarity search and applications | 2015

Diversity in Similarity Joins

Lucio F. D. Santos; Luiz Olmes Carvalho; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

With the increasing ability of current applications to produce and consume more complex data, such as images and geographic information, the similarity join has attracted considerable attention. However, this operator does not consider the relationship among the elements in the answer, generating results with many pairs similar among themselves, which does not add value to the final answer. Result diversification methods are intended to retrieve elements similar enough to satisfy the similarity conditions, but also considering the diversity among the elements in the answer, producing a more heterogeneous result with smaller cardinality, which improves the meaning of the answer. Still, diversity has been studied only when applied to unary operations. In this paper, we introduce the concept of diverse similarity joins: a similarity join operator that ensures smaller, more diversified and more useful answers. The experiments performed on real and synthetic datasets show that our proposal allows exploiting diversity in similarity joins without diminishing their performance, while providing elements that cover the same data space distribution as the non-diverse answers.

statistical and scientific database management | 2013

Parameter-free and domain-independent similarity search with diversity

Lucio F. D. Santos; Willian D. Oliveira; Mônica Ribeiro Porto Ferreira; Agma J. M. Traina; Caetano Traina

international conference on enterprise information systems | 2015

Techniques for Effective and Efficient Fire Detection from Social Media Images

Marcos Vinicius Naves Bedo; Gustavo Blanco; Willian D. Oliveira; Mirela T. Cazzolato; Alceu Ferraz Costa; José Fernando Rodrigues; Agma J. M. Traina; Caetano Traina
