Priscila T. M. Saito
Federal University of Technology - Paraná
Publication
Featured research published by Priscila T. M. Saito.
Expert Systems With Applications | 2014
Priscila T. M. Saito; Pedro Jussieu de Rezende; Alexandre X. Falcão; Celso Tetsuo Nagase Suzuki; Jancarlo Ferreira Gomes
In the past few years, active learning has been reasonably successful and has drawn a lot of attention. However, recent active learning methods have focused on strategies in which a large unlabeled dataset has to be reprocessed at each learning iteration. As datasets grow, these strategies become inefficient or even computationally prohibitive. To address these issues, we propose an effective and efficient active learning paradigm which attains a significant reduction in the size of the learning set by applying an a priori process of identification and organization of a small relevant subset. Furthermore, the concomitant classification and selection processes enable the classification of a very small number of samples, while selecting the informative ones. Experimental results show that the proposed paradigm achieves high accuracy quickly with minimal user interaction, further improving its efficiency.
acm symposium on applied computing | 2013
Priscila T. M. Saito; Pedro Jussieu de Rezende; Alexandre X. Falcão; Celso Tetsuo Nagase Suzuki; Jancarlo Ferreira Gomes
The labor-intensive and time-consuming process of annotating data is a serious bottleneck in many pattern recognition applications that handle massive datasets. Active learning strategies have been sought to reduce the cost of human annotation by automatically selecting the most informative unlabeled samples for annotation. The critical issue lies in the selection of such samples. As an effective solution, we propose an active learning approach that preprocesses the dataset, efficiently reduces and organizes a learning set of samples, and selects the most representative ones for human annotation. Experiments performed on real datasets show that the proposed approach requires only a few iterations to achieve high accuracy, keeping user involvement to a minimum.
Pattern Recognition | 2015
Priscila T. M. Saito; Celso Tetsuo Nagase Suzuki; Jancarlo Ferreira Gomes; Pedro Jussieu de Rezende; Alexandre X. Falcão
We have developed an automated system for the diagnosis of intestinal parasites from optical microscopy images. The objects (species of parasites and impurities) segmented from these images form a large dataset. We are interested in the active learning problem of selecting a reasonably small number of objects to be labeled under an expert's supervision for use in training a pattern classifier. However, impurities are very numerous, constitute several clusters in the feature space, and can be quite similar to some species of parasites, leading to a significant challenge for active learning methods. We propose a technique that pre-organizes the data and then properly balances the selection of samples from all classes and uncertain samples for training. Early data organization avoids reprocessing of the large dataset at each learning iteration, enabling the halting of sample selection after a desired number of samples per iteration, yielding interactive response time. We validate our method by comparing it with state-of-the-art approaches, using a previously labeled dataset of almost 6000 objects. Moreover, we report results from experiments on a very realistic scenario, consisting of a dataset with over 140,000 unlabeled objects, under unbalanced classes, the absence of some classes, and the presence of a very large set of impurities.
Highlights
A robust active learning method, called RDS, based on a priori data organization.
RDS properly balances sample diversity and uncertainty for useful sample selection.
It provides high classification accuracy for the automated diagnosis of parasites.
Comparisons with different clustering, classification and other literature methods.
RDS was evaluated by an experienced expert in parasitology using a realistic scenario.
brazilian symposium on computer graphics and image processing | 2014
John E. Vargas; Priscila T. M. Saito; Alexandre X. Falcão; Pedro Jussieu de Rezende; Jefersson Alex dos Santos
Very high resolution (VHR) images are large datasets for pixel annotation -- a process that has depended on the supervised training of an effective pixel classifier. Active learning techniques have mitigated this problem, but pixel descriptors are limited to local image information, and the large number of pixels makes the response time to the user's actions impractical during active learning. To circumvent the problem, we present an active learning strategy that relies on superpixel descriptors and a priori dataset reduction. First, we compare VHR image annotation using superpixel- and pixel-based classifiers, as designed by the same state-of-the-art active learning technique -- Multi-Class Level Uncertainty (MCLU). Even with the dataset reduction provided by the superpixel representation, MCLU remains infeasible for user interaction. Therefore, we propose a technique to considerably reduce the superpixel dataset for active learning. Moreover, we subdivide the reduced dataset into a list of subsets with random sample rearrangement to gain both speed and sample diversity during the active learning process.
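The abstract names MCLU but does not describe it. As commonly formulated in the active learning literature, MCLU scores each sample by the gap between its two highest per-class decision values and queries the samples with the smallest gap. The sketch below illustrates only that ranking step; the function names and the toy scores are hypothetical, and the code is not the superpixel pipeline of the paper.

```python
def mclu_confidence(scores):
    """Gap between the two highest per-class decision values.

    A small gap means the classifier hesitates between two classes.
    """
    top, second = sorted(scores, reverse=True)[:2]
    return top - second


def select_most_uncertain(score_table, batch_size):
    """Return indices of the batch_size samples with the smallest gap."""
    ranked = sorted(range(len(score_table)),
                    key=lambda i: mclu_confidence(score_table[i]))
    return ranked[:batch_size]


# Hypothetical decision values for three samples and three classes.
score_table = [
    [2.0, -1.0, -1.5],   # confident
    [0.3, 0.2, -1.0],    # very uncertain
    [-0.5, 1.0, 0.4],    # moderately uncertain
]
queried = select_most_uncertain(score_table, batch_size=2)
```

Because every unlabeled sample must be scored and ranked at each iteration, the cost of this step grows with the dataset size, which is the response-time problem the abstract sets out to reduce.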
PLOS ONE | 2015
Priscila T. M. Saito; Rodrigo Y. M. Nakamura; Willian Paraguassu Amorim; João Paulo Papa; Pedro Jussieu de Rezende; Alexandre X. Falcão
Nowadays, large datasets are common and demand faster and more effective pattern analysis techniques. However, methodologies to compare classifiers usually do not take into account the learning-time constraints required by applications. This work presents a methodology to compare classifiers with respect to their ability to learn from classification errors on a large learning set, within a given time limit. Faster techniques may acquire more training samples, but only when they are more effective will they achieve higher performance on unseen testing sets. We demonstrate this result using several techniques, multiple datasets, and typical learning-time limits required by applications.
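A minimal sketch of the idea of learning from classification errors under a time limit is given below. It is an illustration under stated assumptions, not the methodology of the paper: the nearest-centroid classifier, the `errors_per_round` parameter, and all function names are hypothetical choices made to keep the example self-contained.

```python
import time


def fit_centroids(training):
    """training: list of (features, label). Returns {label: centroid}."""
    sums, counts = {}, {}
    for features, label in training:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict(centroids, features):
    """Label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def learn_from_errors(training, learning_set, time_limit,
                      errors_per_round=2, clock=time.perf_counter):
    """Retrain on misclassified learning samples until time runs out."""
    training = list(training)
    pool = list(learning_set)
    deadline = clock() + time_limit
    model = fit_centroids(training)
    while pool and clock() < deadline:
        errors = [s for s in pool if predict(model, s[0]) != s[1]]
        if not errors:
            break  # nothing left to learn from
        chosen = errors[:errors_per_round]
        training.extend(chosen)
        pool = [s for s in pool if s not in chosen]
        model = fit_centroids(training)
    return model, training


# Toy one-dimensional data (hypothetical).
seed = [((0.0,), "a"), ((4.0,), "b")]
learning = [((1.0,), "a"), ((3.0,), "a"), ((5.0,), "b"), ((2.5,), "a")]
model, final_training = learn_from_errors(seed, learning, time_limit=10.0)
```

Under this scheme a faster classifier completes more rounds before the deadline and so sees more training samples, but its accuracy on an unseen test set still depends on how well it uses them.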
iberoamerican congress on pattern recognition | 2015
Geovana V. L. de Lima; Thullyo Radeli Castilho; Pedro Henrique Bugatti; Priscila T. M. Saito; Fabrício Martins Lopes
Complex networks are a topic that draws on knowledge from many areas and has been applied successfully in all of them. However, their application to image pattern recognition is recent. Few works in the literature use complex networks for image characterization followed by analysis and classification. An image can be interpreted as a complex network in which each pixel represents a vertex and the weighted edges are generated according to the location and intensity of two pixels. Thus, the present paper investigates this type of application and explores different measurements that can be extracted from complex networks to better characterize an image. One special type of measure that we applied is based on motifs, which are employed in several areas. However, to the best of our knowledge, motifs have never been explored in complex networks representing images. The results demonstrate that the proposed methodology has great potential, reaching up to 89.81% accuracy in the classification of public-domain image texture datasets.
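The pixel-to-network mapping described above can be sketched as follows. The abstract does not give the edge-weight formula, so the one used here (an equal mix of normalized distance and normalized intensity difference), the `radius` and `threshold` parameters, and the function names are all assumptions for illustration. Vertex degree is shown as one simple measurement; motif counting is not covered.

```python
from math import hypot


def image_to_network(image, radius=1.5, threshold=0.5):
    """Build weighted edges between nearby pixels of a grayscale image.

    image: list of rows with intensities in [0, 255].
    Returns {((y1, x1), (y2, x2)): weight} for edges kept after thresholding.
    """
    height, width = len(image), len(image[0])
    reach = int(radius)
    edges = {}
    for y in range(height):
        for x in range(width):
            for dy in range(-reach, reach + 1):
                for dx in range(-reach, reach + 1):
                    if (dy, dx) <= (0, 0):
                        continue  # visit each unordered pixel pair once
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue
                    dist = hypot(dy, dx)
                    if dist > radius:
                        continue
                    # Assumed weight: mean of normalized distance and
                    # normalized intensity difference, both in [0, 1].
                    diff = abs(image[y][x] - image[ny][nx]) / 255
                    weight = (dist / radius + diff) / 2
                    if weight <= threshold:
                        edges[((y, x), (ny, nx))] = weight
    return edges


def degrees(edges, height, width):
    """Number of kept edges incident to each pixel."""
    deg = {(y, x): 0 for y in range(height) for x in range(width)}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# Tiny hypothetical image: three dark pixels and one bright pixel.
image = [[0, 0],
         [0, 255]]
edges = image_to_network(image)
deg = degrees(edges, 2, 2)
```

In this toy image the bright pixel ends up isolated because its intensity differs strongly from all of its neighbors, while the three dark pixels are fully connected to one another.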
Briefings in Bioinformatics | 2018
Tatianne da Costa Negri; Wonder Alexandre Luz Alves; Pedro Henrique Bugatti; Priscila T. M. Saito; Douglas Silva Domingues; Alexandre Rossi Paschoal
MOTIVATION: Long noncoding RNAs (lncRNAs) correspond to a eukaryotic noncoding RNA class that has gained great attention in the past years as a higher layer of regulation for gene expression in cells. There is, however, a lack of specific computational approaches to reliably predict lncRNA in plants, which contrasts with the variety of prediction tools available for mammalian lncRNAs. This distinction is not that obvious, given that biological features and mechanisms generating lncRNAs in the cell are likely different between animals and plants. Considering this, we present a machine learning analysis and a classifier approach called RNAplonc (https://github.com/TatianneNegri/RNAplonc/) to identify lncRNAs in plants.
RESULTS: Our feature selection analysis considered 5468 features, and it used only 16 features to robustly identify lncRNA with the REPTree algorithm. That was the base to create the model and train it with lncRNA and mRNA data from five plant species (thale cress, cucumber, soybean, poplar and Asian rice). After an extensive comparison with other tools largely used in plants (CPC, CPC2, CPAT and PLncPRO), we found that RNAplonc produced more reliable lncRNA predictions from plant transcripts with 87.5% of the best result in eight tests in eight species from the GreeNC database and four independent studies in monocotyledonous (Brachypodium) and eudicotyledonous (Populus and Gossypium) species.
iberoamerican congress on pattern recognition | 2017
Daniel H. A. Alves; Luís F. Galonetti; Claiton de Oliveira; Pedro Henrique Bugatti; Priscila T. M. Saito
In this paper, we present a Deep Convolutional Neural Network (DCNN) architecture and evaluate its accuracy against traditional methods on a bioimage classification problem. The main contributions of this work are the application of a DCNN architecture and the comparison of different classification and feature extraction techniques applied to a plant leaf image dataset. Furthermore, we analyze in more depth a cross-domain transfer learning approach using a state-of-the-art deep neural network called Inception-v3. Our results show that we manage to classify a subset of 53 species of leaves with a notable mean accuracy of 98.2%.
computer-based medical systems | 2017
Guilherme Camargo; Rafael Staiger Bressan; Pedro Henrique Bugatti; Priscila T. M. Saito
Nowadays, a huge volume of biomedical data (images, genes, etc.) is generated daily. Interpreting such data requires considerable expertise. The misinterpretation or missed detection of a suspicious clinical finding leads to more negligence claims and redundant procedures (e.g., biopsies). The analysis of biomedical data is a complex task performed by specialists, whose degree of expertise affects the accuracy of their diagnoses. Moreover, due to the huge volume of data, it is a tiresome process. To mitigate these intrinsic drawbacks, computer-aided diagnosis approaches have been proposed in the last decade, but they have been applied without in-depth analysis. It is also very common in the literature for the presentation of experimental results to rely solely on the mean of accuracy values. This procedure is not always reliable, especially for applications that require faster classifiers due to their learning-time constraints. Hence, in this paper we propose an extensive analysis towards effective and efficient learning for biomedical data classification. To do so, several public biomedical datasets were used with different supervised classifiers, taking into account the accuracies and computational times obtained throughout the learning process.
acm symposium on applied computing | 2016
Douglas Felipe Pereira; Priscila T. M. Saito; Pedro Henrique Bugatti
The main challenge regarding agricultural commodities such as soybeans is that they must be uniform in quality for the companies that produce and sell them. Meeting this requirement demands strict parameters and quality control processes. The most important test for this task is based on the definition of seed vigor. However, the great majority of seed vigor analyses are performed by a human specialist, which makes the process extremely tiresome and subjective, and highly susceptible to failures as well as to certain types of deliberate adulteration motivated by monetary interests. To deal with this problem, the present paper proposes an image analysis framework for the effective classification of seeds according to their vigor. The experiments show that the proposed framework makes several notable contributions to the soybean vigor definition process, reaching up to 81% classification accuracy. Moreover, it not only greatly improves the entire process from planting to harvesting, but also makes it possible to automate and accelerate it, and it can be used as counterevidence to the specialist's classification.