Lucas C. Scabora
University of São Paulo
Publication
Featured research published by Lucas C. Scabora.
arXiv: Social and Information Networks | 2018
Gabriel Spadon; Lucas C. Scabora; Marcus V.S. Araujo; Paulo H. Oliveira; Bruno Brandoli Machado; Elaine P. M. de Sousa; Caetano Traina; José Fernando Rodrigues
Complex networks are nowadays employed in several applications. Modeling urban street networks is one of them, in particular for analyzing criminal aspects of a city. Several research groups have focused on this application, but there is still no well-defined methodology for employing complex networks across a whole crime analysis process, i.e., from data preparation to a deep analysis of criminal communities. Furthermore, the “toolset” available for such work is incomplete: it lacks techniques for maintaining up-to-date, complete crime datasets, as well as proper assessment measures. We therefore propose a threefold methodology for employing complex networks in the detection of highly criminal areas within a city. Our methodology comprises three tasks: (i) Mapping of Urban Crimes; (ii) Criminal Community Identification; and (iii) Crime Analysis. Moreover, it provides a proper set of assessment measures for analyzing the intrinsic criminality of communities, especially when considering different crime types. We demonstrate our methodology by applying it to a real crime dataset from the city of San Francisco, CA, USA. The results confirm its effectiveness in identifying and analyzing high-criminality areas within a city. Hence, our contributions provide a basis for further developments on complex networks applied to crime analysis.
symposium on applied computing | 2017
Paulo H. Oliveira; Lucas C. Scabora; Caetano Traina; Daniel S. Kaster
Relational Database Management Systems (RDBMS) organize data into relations, representing them as tables. Most queries executed over them are optimized by index structures. However, for queries that must search columns across multiple tables, the common approach is to scan multiple indexes and combine their results, which is potentially costly, especially for similarity queries over complex data. This paper proposes a new type of index for modern RDBMS called the domain index. A domain index allows searching columns of the same type, across multiple tables, with a single index scan, and hence with superior performance. To evaluate our proposal, we carried out experiments (i) over a medical image dataset, to evaluate the performance in content-based similarity queries; and (ii) over a flow-based intrusion detection dataset, to evaluate the performance in conventional queries, both in a real scenario and over synthetic data, the latter to evaluate scalability. The results show the higher performance of domain indexes. Specifically, the gains reached up to 42.9% in similarity queries and up to 65.9% in conventional queries. As the first paper on this subject, we expect this work to provide a basis for further developments on indexing techniques over domains of attributes within modern RDBMS.
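The idea of answering a multi-table search with a single index scan can be illustrated with a minimal sketch. This is not the paper's implementation: the table names, the values, and the helpers `build_domain_index` and `range_scan` are hypothetical, and a sorted in-memory list stands in for a real index structure.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy tables: each maps a row id to the value of a column
# that has the same type (here, an integer) in every table.
tables = {
    "patients": {1: 37, 2: 52, 3: 45},
    "visitors": {1: 41, 2: 29},
    "staff": {1: 52, 2: 60},
}


def build_domain_index(tables):
    """Return one sorted list of (value, table_name, row_id) covering all tables."""
    entries = [
        (value, name, row_id)
        for name, rows in tables.items()
        for row_id, value in rows.items()
    ]
    entries.sort()
    return entries


def range_scan(index, low, high):
    """Answer low <= value <= high with one pass over the shared index."""
    # The key list is rebuilt on every call for simplicity; a real index
    # would keep its keys in a persistent search structure.
    keys = [entry[0] for entry in index]
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return index[start:end]


domain_index = build_domain_index(tables)
hits = range_scan(domain_index, 40, 55)
for value, table_name, row_id in hits:
    print(value, table_name, row_id)
```

Each hit carries the table it came from, so one scan replaces three per-table scans followed by a merge step.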
computer-based medical systems | 2017
Paulo H. Oliveira; Lucas C. Scabora; Mirela T. Cazzolato; Willian D. Oliveira; Agma J. M. Traina; Caetano Traina
Performing content-based image retrieval over large repositories of medical images demands efficient computational techniques. Such techniques are intended to speed up the work of physicians, who often have to deal with information from multiple data repositories. With multiple repositories, the common computational approach is to search each repository separately and merge the results into one final response, which slows down the whole process. This can be improved by building a mechanism able to search several repositories as if they were a single one, i.e., a mechanism to search the whole domain of medical images. With this goal, we propose the Domain Index, a new category of index structures for efficiently searching domains of data, regardless of the repository to which they belong. To evaluate our proposal, we carried out experiments over multiple mammography repositories involving k Nearest Neighbor (kNN) and Range queries. The results show that images from any repository are seamlessly retrieved, with performance gains of up to 36% in kNN queries and up to 7% in Range queries. The experimental evaluation shows that the Domain Index allows fast retrieval from multiple data repositories for medical systems, enabling better performance in similarity queries over them.
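The two query types named above, kNN and Range, can be sketched over several repositories treated as one pool. This is only an illustration under stated assumptions: the repository names, the two-dimensional feature vectors, and the helpers `pooled`, `knn_query`, and `range_query` are hypothetical, and a plain linear scan with Euclidean distance stands in for the Domain Index itself.

```python
import heapq
import math

# Hypothetical feature vectors, grouped by repository and image id.
repositories = {
    "repo_a": {"img1": (0.0, 0.0), "img2": (3.0, 4.0)},
    "repo_b": {"img1": (1.0, 1.0), "img2": (6.0, 8.0)},
}


def pooled(repositories):
    """Yield every image from every repository as one combined collection."""
    for repo, images in repositories.items():
        for image_id, features in images.items():
            yield repo, image_id, features


def knn_query(repositories, query, k):
    """Return the k entries closest to the query, regardless of repository."""
    scored = (
        (math.dist(query, features), repo, image_id)
        for repo, image_id, features in pooled(repositories)
    )
    return heapq.nsmallest(k, scored)


def range_query(repositories, query, radius):
    """Return all entries within the given radius of the query, nearest first."""
    scored = (
        (math.dist(query, features), repo, image_id)
        for repo, image_id, features in pooled(repositories)
    )
    return sorted(entry for entry in scored if entry[0] <= radius)


nearest = knn_query(repositories, (0.0, 0.0), k=2)
in_range = range_query(repositories, (0.0, 0.0), radius=5.5)
print([(repo, image_id) for _, repo, image_id in nearest])
print([(repo, image_id) for _, repo, image_id in in_range])
```

Because each result records its repository, the caller receives one ranked answer without a separate merge step per repository.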
international conference on enterprise information systems | 2016
Lucas C. Scabora; Jaqueline Joice Brito; Ricardo Rodrigues Ciferri; Cristina Dutra de Aguiar Ciferri
Nowadays, data warehousing and online analytical processing (OLAP) are core technologies in business intelligence and have therefore drawn much interest from researchers in the last decade. However, these technologies have been mainly developed for relational database systems in centralized environments. In other words, they have not been designed to be applied in scalable systems such as NoSQL databases. Adapting a data warehousing environment to NoSQL databases introduces several advantages, such as scalability and flexibility. This paper investigates three physical data warehouse designs to adapt the Star Schema Benchmark for use in NoSQL databases. In particular, our main investigation concerns OLAP query processing over column-oriented databases using the MapReduce framework. We analyze the impact of distributing attributes among column-families in HBase on OLAP query performance. Our experiments showed how the processing time of OLAP queries was affected by the physical data warehouse design, with respect to the number of dimensions accessed and the data volume. We conclude that using distinct distributions of attributes among column-families can improve OLAP query performance in HBase and consequently make the benchmark more suitable for OLAP over NoSQL databases.
acm symposium on applied computing | 2018
Daniel Yoshinobu Takada Chino; Lucas C. Scabora; Caetano Traina; Agma J. M. Traina
Signature-based bag-of-visual-words techniques have been employed in image retrieval and analysis, with the benefit of dispensing with expensive clustering processes. However, such techniques are limited by their need for multiple parameters, which may be unintuitive and in most cases depend on the application domain. In this paper, we overcome these limitations by proposing Bag-of-Superpixel Signatures (BoSS), which extracts visual signatures using local features from superpixels. Moreover, our proposal employs a fractal analysis to extract intrinsic information about the application domain and to reduce the number of parameters needed. The results demonstrated that BoSS achieved an improvement of up to 31.2% in image retrieval precision during experimental evaluations over five distinct datasets. We conclude that BoSS introduces an intuitive, self-contained, scalable and effective approach for image retrieval using bags-of-visual-words.
international conference on conceptual structures | 2017
Gabriel Spadon; Lucas C. Scabora; Paulo H. Oliveira; Marcus V.S. Araujo; Bruno Brandoli Machado; Elaine P. M. de Sousa; Caetano Traina-Jr.; Jose F. Rodrigues-Jr
Complex networks are commonly used to model urban street networks, which aids the analysis of criminal activities in cities. Although several works focus on this application, there is no clear methodology focused on the analysis of crime behavior. We therefore propose a methodology for employing complex networks in the analysis of criminality spread within criminal areas of a city. Here, we evaluate synthetic cases of crime propagation concerning real criminal data from the North American city of San Francisco, CA. Our results confirm the effectiveness of our methodology in analyzing crime behavior by means of criminality spread. Hence, this paper supports further development and planning on public safety in cities.
computer-based medical systems | 2017
Mirela T. Cazzolato; Lucas C. Scabora; Alceu Ferraz Costa; Marcos Roberto Nesso Junior; Luis Fernando Milano Oliveira; Daniel S. Kaster; Caetano Traina Junior; Agma J. M. Traina
Computed Tomography (CT) scans are often employed to diagnose lung diseases, as abnormal tissue regions may indicate whether proper treatment is required. However, detecting specific regions containing abnormalities in a CT scan demands time and effort from specialists. Moreover, different parts of a single lung image may present both normal and abnormal characteristics, which makes classifying a whole lung as healthy (normal) or not inaccurate. In this paper we propose the BREATH method, capable of detecting abnormalities in lung tissue regions and highlighting them by means of a heat map visualization. The method starts by segmenting lung tissues using a superpixel-based approach, followed by the training of a statistical model to represent normal tissues and, finally, the generation of a heat map showing abnormal regions that require attention from physicians. We validated our statistical model using a dataset with 246 lung CT scans, of which 40 are healthy and the remainder present varying diseases. Experimental results show that BREATH is accurate for lung segmentation, with an F-Measure of up to 0.99. The statistical modeling of healthy and abnormal lung regions showed almost no overlap, and the detection of superpixels containing abnormalities presented precision values higher than 86% for all values of recall. These values support our claim that the heat map representation of BREATH for abnormality detection can be used as an intuitive method to assist physicians during diagnosis.
computer-based medical systems | 2018
Marcos R. Nesso; Mirela T. Cazzolato; Lucas C. Scabora; Paulo H. Oliveira; Gabriel Spadon; Jéssica Andressa de Souza; Willian D. Oliveira; Daniel Yoshinobu Takada Chino; José Fernando Rodrigues; Agma J. M. Traina; Caetano Traina
computer-based medical systems | 2018
Daniel Yoshinobu Takada Chino; Lucas C. Scabora; Mirela T. Cazzolato; Ana Elisa Serafim Jorge; Caetano Traina; Agma J. M. Traina
brazilian symposium on databases | 2018
Gabriel Spadon; Lucas C. Scabora; Marcos R. Nesso; Caetano Traina; José F. Rodrigues