Lucio F. D. Santos | Researchain

Publication

Featured researches published by Lucio F. D. Santos.

international symposium on multimedia | 2016

A Label-Scaled Similarity Measure for Content-Based Image Retrieval

Gustavo Blanco; Marcos Vinicius Naves Bedo; Mirela T. Cazzolato; Lucio F. D. Santos; Ana Elisa Serafim Jorge; Caetano Traina; Paulo M. Azevedo-Marques; Agma J. M. Traina

Content-Based Image Retrieval (CBIR) has proven to be a suitable complement to traditional text-based searching. CBIR applications rely on two main steps, namely representing the images and measuring the similarity between two represented images. Although modern segmentation and learning algorithms enable the accurate representation of local and global features within an image, how to properly compare the segmented objects is still an open issue. In this study, we propose a new comparison method called Counting-Labels Similarity Measure (CL-Measure). Our approach calculates the similarity between two images by comparing the labeled regions within these images and by balancing the influence of each label according to its predominance, in both a non-metric and a metric fashion. Experiments on a real dataset of dermatological ulcers show that CL-Measure achieves higher Precision at all Recall values than its competitors in retrieval tasks.

similarity search and applications | 2015

Similarity Joins and Beyond: An Extended Set of Binary Operators with Order

Luiz Olmes Carvalho; Lucio F. D. Santos; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

Similarity joins are troublesome database operators that often produce results much larger than the user really needs or expects. To return the similar elements, similarity joins also require sorting during the retrieval process, although order is a concept not supported in the relational model. This paper addresses both issues by extending the similarity join concept to a broader set of binary operators, which retrieve the most similar pairs and embed the sorting operation only as an internal processing step, so as to comply with relational theory. Additionally, our extension allows exploring another useful condition not previously considered in similarity retrieval: the negation of predicates. Experiments performed on real and synthetic data show that our operators are fast enough to be used in real applications and scale well for both multidimensional and non-dimensional metric data.
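
As a rough illustration of the two ideas in this abstract (a join that can return far more pairs than needed, and a variant that keeps only the most similar pairs while hiding the ordering step), here is a minimal Python sketch. It is not the operator set proposed in the paper: the function names `range_join` and `top_k_pairs`, the Euclidean distance, and the sample points are all assumptions made for the example.

```python
import heapq
from itertools import product
from math import dist  # Euclidean distance (Python 3.8+)


def range_join(left, right, radius):
    """Return every pair (a, b) whose distance is at most `radius`."""
    return [(a, b) for a, b in product(left, right) if dist(a, b) <= radius]


def top_k_pairs(left, right, k):
    """Return the k closest pairs; the ordering is only an internal step."""
    scored = ((dist(a, b), a, b) for a, b in product(left, right))
    return [(a, b) for _, a, b in heapq.nsmallest(k, scored)]


# Hypothetical two-dimensional data, used only for illustration.
left = [(0.0, 0.0), (5.0, 5.0)]
right = [(0.0, 1.0), (5.0, 7.0), (9.0, 9.0)]

print(range_join(left, right, radius=2.0))
print(top_k_pairs(left, right, k=1))
```

The caller of `top_k_pairs` receives a plain collection of pairs and never sees the distances used to rank them, which is one simple way to keep sorting out of the operator's interface.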

international symposium on multimedia | 2015

Self Similarity Wide-Joins for Near-Duplicate Image Detection

Luiz Olmes Carvalho; Lucio F. D. Santos; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

Near-duplicate image detection plays an important role in several real applications. This task is usually achieved by applying a clustering algorithm followed by refinement steps, which is a computationally expensive process. In this paper, we introduce a framework based on a novel similarity join operator, which is able to both replace and speed up the clustering step while also removing the need for further refinement processes. It is based on absolute and relative similarity ratios, ensuring that top-ranked image pairs are in the final result. Experiments performed on real datasets show that our proposal is up to three orders of magnitude faster than the best techniques in the literature, always returning a high-quality result set.

international conference on enterprise information systems | 2016

Efficient Self-similarity Range Wide-joins Fostering Near-duplicate Image Detection in Emergency Scenarios

Luiz Olmes Carvalho; Lucio F. D. Santos; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

Crowdsourced information is being increasingly employed to improve and support decision making in emergency situations. However, the gathered records quickly become too similar among themselves, and handling several similar reports does not add valuable knowledge to assist the personnel at the control center in their decision-making tasks. The usual approaches to detect and handle the so-called near-duplicate data rely on costly twofold processing. Aiming to reduce the cost and also improve the ability to detect duplicates, we developed a framework model based on the similarity wide-join database operator. We extended the wide-join definition, enabling it to overcome its restrictions and handle the near-duplicate detection task as well. In this paper, we also provide an efficient pivot-based algorithm that speeds up the entire process and enables retrieving the top similar elements in a single pass. Experiments using real datasets show that our framework is up to three orders of magnitude faster than the competing techniques in the literature, while also improving the quality of the result by about 35 percent.

computer-based medical systems | 2015

Color and Texture Influence on Computer-Aided Diagnosis of Dermatological Ulcers

Marcos Vinicius Naves Bedo; Lucio F. D. Santos; Willian D. Oliveira; Gustavo Blanco; Agma J. M. Traina; Marco Antonio Frade; Paulo M. Azevedo-Marques; Caetano Traina Junior

This study presents an analysis of classification techniques for Computer-Aided Diagnosis (CAD) regarding ulcerated lesions. We focus on determining the influence of both color and texture on automated image classification and its implications. To do so, we assayed a dataset of dermatological ulcers containing five variations in terms of tissue composition of lesion skin: granulation (red), fibrin (yellow), callous (white), necrotic (black), and a mix of the previous variations (mixed). Every image was previously labelled by experts according to this red-yellow-black-white-mixed model. We employed specially designed color and texture extractors to represent the dataset images, namely: Color Layout, Color Structure, Scalable Color, Edge Histogram, Haralick, and Texture-Spectrum. The first three are color feature extractors and the last three are texture extractors. Next, we employed the Symmetrical Uncert Attribute Eval method to determine the features suitable for image classification. We tested a set of classifiers that follow distinct paradigms over the selected features, achieving an accuracy of up to 77% in terms of images correctly classified, with an area under the receiver operating characteristic (ROC) curve of up to 0.84. The classification performance and the selected features enabled us to determine that texture features were more predominant than color in the entire classification process.
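
The name of the feature selection method suggests the symmetrical uncertainty measure, which scores a discrete feature X against the class Y as SU = 2 · I(X; Y) / (H(X) + H(Y)). The Python sketch below computes that score under stated assumptions: features are already discretized, and the function names and toy values are invented for the example. It is not the study's pipeline.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def symmetrical_uncertainty(feature, labels):
    """SU = 2 * I(X; Y) / (H(X) + H(Y)); 0 means independent, 1 means fully predictive."""
    h_x, h_y = entropy(feature), entropy(labels)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing can be learned
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_information = h_x + h_y - h_xy
    return 2.0 * mutual_information / (h_x + h_y)


# Hypothetical discretized feature values and tissue labels.
labels = ["red", "red", "black", "black"]
informative = ["low", "low", "high", "high"]
uninformative = ["low", "high", "low", "high"]

print(symmetrical_uncertainty(informative, labels))
print(symmetrical_uncertainty(uninformative, labels))
```

Ranking features by this score and keeping the highest ones is one common way to use the measure; continuous descriptors would need to be binned first.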

acm symposium on applied computing | 2015

Combine-and-conquer: improving the diversity in similarity search through influence sampling

Lucio F. D. Santos; Willian D. Oliveira; Luiz Olmes Carvalho; Mônica Ribeiro Porto Ferreira; Agma J. M. Traina; Caetano Traina

Result diversification methods are intended to retrieve elements similar to a given object while also enforcing a certain degree of diversity among them, with the aim of improving answer relevance. Most of these methods are based on optimization, but the resulting problems are NP-hard. Diversity is injected into an otherwise all-too-similar result set in two phases: in the first, the search space is reduced to speed up finding the optimal solution, whereas in the second, a trade-off between diversity and similarity over the reduced space is obtained. It is assumed that the first phase is achieved by applying a traditional nearest neighbor algorithm, but no previous investigation evaluated the impact of the first phase on the second. In this paper, we devised alternative techniques to execute the first phase and evaluated how obtaining a better-quality set of elements in the first phase can improve the diversity. Besides the traditional nearest neighbor-based pre-selection, we also considered naive random selection, cluster-based, and influence-based ones. Thereafter, extensive experiments evaluated a number of state-of-the-art diversity algorithms employed in the second phase, regarding both processing time and answer quality. The results show that although the more elaborate (and much more time-consuming) methods indeed provide the best answers, other alternatives are able to provide a better trade-off between quality and performance. Moreover, the pre-selection techniques can reduce the total running time by up to two orders of magnitude.
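
The two-phase structure described in this abstract can be sketched as follows. This is only an illustration: phase one uses the traditional nearest neighbor pre-selection, and phase two uses a simple greedy heuristic in the spirit of maximal marginal relevance, not any of the specific algorithms evaluated in the paper. The function names, the `trade_off` parameter, and the sample points are assumptions.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def knn_candidates(query, data, m):
    """Phase 1: shrink the search space to the m elements nearest to the query."""
    return sorted(data, key=lambda x: dist(query, x))[:m]


def greedy_diversify(query, candidates, k, trade_off=0.5):
    """Phase 2: greedily pick k candidates, balancing closeness to the query
    against distance to the elements already chosen."""
    remaining = list(candidates)
    chosen = []
    while remaining and len(chosen) < k:
        def score(x):
            similarity = -dist(query, x)  # closer to the query is better
            diversity = min((dist(x, c) for c in chosen), default=0.0)
            return (1 - trade_off) * similarity + trade_off * diversity

        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical two-dimensional data, used only for illustration.
query = (0.0, 0.0)
data = [(1.0, 0.0), (1.1, 0.0), (0.0, 1.2), (5.0, 5.0), (-1.3, 0.0)]

candidates = knn_candidates(query, data, m=4)
print(greedy_diversify(query, candidates, k=2))
```

With `trade_off=0.0` the second phase degenerates to plain nearest neighbor selection, which makes it easy to compare a diversified answer against the non-diversified one over the same candidate set.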

conference on information and knowledge management | 2018

Exploring Diversified Similarity with Kundaha

Lucio F. D. Santos; Gustavo Blanco; Daniel de Oliveira; Agma J. M. Traina; Caetano Traina; Marcos V. N. Bedo

Exploring large medical image sets by means of traditional similarity query criteria (e.g., neighborhood) can be fruitless if the retrieved images are too similar among themselves. This demonstration introduces Kundaha, an exploration tool that assists experts in retrieving and navigating the results of user-posed queries from a diversified similarity perspective. Its implementation includes a wide set of metrics, descriptors, and indexes for enhancing query execution. Users can combine such features with diversified similarity criteria for the organized exploration of result sets and also employ relevance feedback cycles for finding new query-based viewpoints.

international symposium on multimedia | 2016

When Similarity is Not Enough, Ask for Diversity: Grouping Elements Based on Influence

Lucio F. D. Santos; Luiz Olmes Carvalho; Marcos Vinicius Naves Bedo; Agma J. M. Traina; Caetano Traina

Crowdsourced images have been increasingly employed for mapping emergency scenarios, which helps rescue forces in choosing contingency plans. In this scenario, similarity searching can be used to retrieve related images from past situations. However, the retrieved images are often similar among themselves and, therefore, add little to no new information to the rescue decision-making process. In this paper, we take advantage of diversity queries to increase the variety of the representative elements about an incident, while the remaining related data are grouped according to the set of representatives. Thus, our approach enables content retrieval, grouping, and an easier exploration of the result set. Experiments performed on real datasets show that our proposal outperforms the existing methods regarding both quality and performance, being at least three orders of magnitude faster.

international conference on enterprise information systems | 2016

Pivot-Based Similarity Wide-Joins Fostering Near-Duplicate Detection

Luiz Olmes Carvalho; Lucio F. D. Santos; Agma J. M. Traina; Caetano Traina

Monitoring systems that aim to improve decision making in emergency scenarios are currently benefiting from crowdsourced information. The main issue with this kind of data is that the gathered reports quickly become too similar among themselves. Hence, overly similar reports, namely near-duplicates, do not add valuable knowledge to assist crisis control committees in their decision-making tasks. The current approaches to detect near-duplicates are usually based on twofold processing, where the first phase relies on similarity queries or clustering techniques, whereas the second and most computationally costly phase refines the result from the first one. Aiming to reduce that cost and also improve the ability to detect near-duplicates, we developed a framework model based on the similarity wide-join database operator. This paper extends the wide-join definition, enabling it to overcome its restrictions, and provides an efficient pivot-based algorithm that speeds up the entire process while retrieving the most similar elements in a single pass. We also investigate alternatives and propose efficient algorithms to choose the pivots. Experiments using real datasets show that our framework is up to three orders of magnitude faster than the competing techniques in the literature, while also improving the quality of the result by about 35%.
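
The abstract does not give the algorithm, but the general idea behind pivot-based pruning in metric spaces is standard: for any pivot p, the triangle inequality gives |d(a, p) - d(b, p)| ≤ d(a, b), so precomputed pivot distances yield a cheap lower bound that can rule out pairs without computing d(a, b). The sketch below applies this to a plain range self-join, not to the wide-join operator itself. The function names, the single hand-picked pivot, and the data are assumptions, and elements must be distinct hashable tuples with at least one pivot supplied.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def pivot_table(data, pivots):
    """Precompute the distance from every element to every pivot."""
    return {x: [dist(x, p) for p in pivots] for x in data}


def range_self_join(data, pivots, radius):
    """Return the pairs within `radius` and how many pairs the pivots pruned."""
    table = pivot_table(data, pivots)
    pairs, pruned = [], 0
    for i, a in enumerate(data):
        for b in data[i + 1:]:
            # Lower bound on d(a, b) from the triangle inequality.
            lower_bound = max(abs(da - db) for da, db in zip(table[a], table[b]))
            if lower_bound > radius:
                pruned += 1  # d(a, b) cannot be within the radius
                continue
            if dist(a, b) <= radius:
                pairs.append((a, b))
    return pairs, pruned


# Hypothetical data and pivot choice, used only for illustration.
data = [(0.0, 0.0), (0.5, 0.0), (10.0, 0.0), (10.4, 0.0), (20.0, 0.0)]
pairs, pruned = range_self_join(data, pivots=[(0.0, 0.0)], radius=1.0)

print(pairs)
print(pruned)
```

How well the bound prunes depends heavily on which pivots are chosen, which is consistent with the abstract's remark that pivot selection deserves its own investigation.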

similarity search and applications | 2015

Diversity in Similarity Joins

Lucio F. D. Santos; Luiz Olmes Carvalho; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina

With the increasing ability of current applications to produce and consume more complex data, such as images and geographic information, the similarity join has attracted considerable attention. However, this operator does not consider the relationship among the elements in the answer, generating results with many pairs similar among themselves, which does not add value to the final answer. Result diversification methods are intended to retrieve elements similar enough to satisfy the similarity conditions while also considering the diversity among the elements in the answer, producing a more heterogeneous result with smaller cardinality, which improves the meaning of the answer. Still, diversity has been studied only when applied to unary operations. In this paper, we introduce the concept of diverse similarity joins: a similarity join operator that ensures smaller, more diversified, and more useful answers. The experiments performed on real and synthetic datasets show that our proposal allows exploiting diversity in similarity joins without diminishing their performance, while providing elements that cover the same data space distribution as the non-diverse answers.
