Eulanda Miranda dos Santos
Federal University of Amazonas
Publications
Featured research published by Eulanda Miranda dos Santos.
Pattern Recognition | 2008
Eulanda Miranda dos Santos; Robert Sabourin; Patrick Maupin
The overproduce-and-choose strategy, which is divided into the overproduction and selection phases, has traditionally focused on finding the most accurate subset of classifiers at the selection phase, and using it to predict the class of all the samples in the test data set. It is, therefore, a static classifier ensemble selection strategy. In this paper, we propose a dynamic overproduce-and-choose strategy which combines optimization and dynamic selection in a two-level selection phase to allow the selection of the most confident subset of classifiers to label each test sample individually. The optimization level is intended to generate a population of highly accurate candidate classifier ensembles, while the dynamic selection level applies measures of confidence to reveal the candidate ensemble with the highest degree of confidence in the current decision. Experimental results conducted to compare the proposed method to a static overproduce-and-choose strategy and a classical dynamic classifier selection approach demonstrate that our method outperforms both these selection-based methods, and is also more efficient in terms of performance than combining the decisions of all classifiers in the initial pool.
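The dynamic selection level described above can be illustrated with a minimal sketch. It assumes a vote-margin confidence measure (the gap between the two most voted classes), which is only one possible measure of confidence; the function names `majority_vote` and `dynamic_select` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(votes):
    """Return (winning label, margin) for a list of class votes.

    The margin is the difference between the top two vote counts,
    normalised by the number of voters (assumed confidence measure).
    """
    counts = Counter(votes).most_common()
    top_label, top_count = counts[0]
    second_count = counts[1][1] if len(counts) > 1 else 0
    return top_label, (top_count - second_count) / len(votes)


def dynamic_select(candidate_ensembles, sample_votes):
    """Pick the candidate ensemble that is most confident on one test sample.

    candidate_ensembles: list of lists of classifier indices (the population
        produced by the optimization level).
    sample_votes: label predicted by each classifier of the pool for the sample.
    Returns (label, margin, members) of the most confident ensemble.
    """
    best = None
    for members in candidate_ensembles:
        label, margin = majority_vote([sample_votes[i] for i in members])
        if best is None or margin > best[1]:
            best = (label, margin, members)
    return best
```

Each test sample is labelled by whichever candidate ensemble has the largest margin on it, so different samples may be labelled by different ensembles.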
Information Fusion | 2009
Eulanda Miranda dos Santos; Robert Sabourin; Patrick Maupin
Information fusion research has recently focused on the characteristics of the decision profiles of ensemble members in order to optimize performance. These characteristics are particularly important in the selection of ensemble members. However, even though the control of overfitting is a challenge in machine learning problems, much less work has been devoted to the control of overfitting in selection tasks. The objectives of this paper are: (1) to show that overfitting can be detected at the selection stage; and (2) to present strategies to control overfitting. Decision trees and k-nearest neighbors classifiers are used to create homogeneous ensembles, while single- and multi-objective genetic algorithms are employed as search algorithms at the selection stage. In this study, we use bagging and random subspace methods for ensemble generation. The classification error rate and a set of diversity measures are applied as search criteria. We show experimentally that the selection of classifier ensembles conducted by genetic algorithms is prone to overfitting, especially in the multi-objective case. In this study, the partial validation, backwarding and global validation strategies are tailored for the classifier ensemble selection problem and compared. This comparison allows us to show that a global validation strategy should be applied to control overfitting in pattern recognition systems involving an ensemble member selection task. Furthermore, this study has helped us to establish that the global validation strategy can be used to measure the relationship between diversity and classification performance when diversity measures are employed as single-objective functions.
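As a rough illustration of the idea behind global validation, the sketch below assumes that every candidate evaluated during the search carries two error rates: one on the optimization set (which drives the search) and one on a separate validation set. An archive keeps the candidate with the lowest validation error seen in any generation. The tuple layout and the name `select_final` are assumptions for illustration, not the paper's implementation.

```python
def select_final(generations):
    """Compare global validation with taking the last generation's best.

    generations: list of populations; each population is a list of
        (name, optimization_error, validation_error) tuples.
    Returns (archived, last_best), where `archived` has the lowest
    validation error over all generations and `last_best` has the lowest
    optimization error in the final generation.
    """
    archived = None
    for population in generations:
        for candidate in population:
            # Update the archive whenever a candidate validates better.
            if archived is None or candidate[2] < archived[2]:
                archived = candidate
    last_best = min(generations[-1], key=lambda c: c[1])
    return archived, last_best


history = [
    [("e1", 0.12, 0.13), ("e2", 0.11, 0.12)],
    [("e3", 0.09, 0.10), ("e4", 0.10, 0.12)],
    [("e5", 0.07, 0.14), ("e6", 0.08, 0.13)],  # lower search error, worse validation
]
archived, last_best = select_final(history)
```

In this toy history the final generation keeps improving on the optimization set while its validation error grows, which is the overfitting pattern the archive is meant to guard against.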
International Symposium on Computers and Communications | 2012
Angelo Eduardo Nunan; Eduardo Souto; Eulanda Miranda dos Santos; Eduardo Feitosa
The structure of dynamic websites, comprising a set of objects such as HTML tags, script functions, hyperlinks and advanced browser features, leads to numerous resources and interactivity in the services currently provided on the Internet. However, these features have also increased security risks and attacks since they allow malicious code injection, or XSS (Cross-Site Scripting). XSS has remained at the top of the lists of the greatest threats to web applications in recent years. This paper presents the experimental results obtained on automatic XSS classification in web pages using Machine Learning techniques. We focus on features extracted from web document content and URL. Our results demonstrate that the proposed features lead to highly accurate classification of malicious pages.
Genetic and Evolutionary Computation Conference | 2008
Eulanda Miranda dos Santos; Robert Sabourin; Patrick Maupin
The overproduce-and-choose strategy involves the generation of an initial large pool of candidate classifiers and it is intended to test different candidate ensembles in order to select the best performing solution. The ensemble error rate, ensemble size and diversity measures are the most frequent search criteria employed to guide this selection. By applying the error rate, we may accomplish the main objective in Pattern Recognition and Machine Learning, which is to find high-performance predictors. In terms of ensemble size, the hope is to increase the recognition rate while minimizing the number of classifiers in order to meet both the performance and low ensemble size requirements. Finally, ensembles can be more accurate than individual classifiers only when classifier members present diversity among themselves. In this paper we apply two Pareto front spread quality measures to analyze the relationship between the three main search criteria used in the overproduce-and-choose strategy. Experimental results demonstrate that the combination of ensemble size and diversity does not produce conflicting multi-objective optimization problems. Moreover, we cannot decrease the generalization error rate by combining this pair of search criteria. However, when the error rate is combined with diversity or the ensemble size, we found that these measures are conflicting objective functions and that the performance of the solutions is much higher.
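To make "conflicting objectives" concrete, here is a minimal sketch of Pareto dominance for two minimized criteria, assumed here to be (error rate, ensemble size). It does not implement the spread quality measures used in the paper; the names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """True if solution a is no worse than b on every objective and
    strictly better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the non-dominated solutions, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# (error rate, ensemble size) pairs for five made-up candidate ensembles.
solutions = [(0.10, 5), (0.08, 9), (0.12, 4), (0.10, 7), (0.09, 9)]
front = pareto_front(solutions)  # [(0.10, 5), (0.08, 9), (0.12, 4)]
```

When two criteria conflict, the front contains several trade-off solutions, as it does here; when they do not conflict, it tends to collapse to a single point.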
International Symposium on Neural Networks | 2012
Juan Gabriel Colonna; Afonso D. Ribas; Eulanda Miranda dos Santos; Eduardo Freire Nakamura
Anurans (frogs or toads) are commonly used by biologists as early indicators of ecological stress. The reason is that anurans are closely related to the ecosystem. Although several sources of data may be used for monitoring these animals, anuran calls lead to a non-intrusive data acquisition strategy. Moreover, wireless sensor networks (WSNs) may be used for such a task, resulting in a more accurate and autonomous system. However, it is essential to save resources to extend the network lifetime. In this paper, we evaluate the impact of reducing data dimension for automatic classification of bioacoustic signals when a WSN is involved. Such a reduction is achieved through a wrapper-based feature subset selection strategy that uses a genetic algorithm (GA). We use the GA to find the subset of features that maximizes the cost-benefit ratio. In addition, we evaluate the impact of reducing the original feature space when sampling frequencies are also reduced. Experimental results indicate that we can reduce the number of features while increasing classification rates (even when smaller sampling frequencies of transmission are used).
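A wrapper-based GA search over feature subsets can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' configuration: features are encoded as a 0/1 mask, selection uses binary tournaments with one-point crossover, bit-flip mutation and elitism, and `toy_fitness` is a hypothetical stand-in for the real wrapper objective, which would train and score a classifier on the selected features.

```python
import random


def tournament(pop, fitness, rng):
    """Binary tournament: return the fitter of two random individuals."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         mutation_rate=0.1, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness`.

    Requires n_features >= 2. The best mask is copied unchanged into the
    next generation (elitism), so the best fitness never decreases.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


def toy_fitness(mask):
    """Hypothetical objective: features 0 and 2 are useful, and every
    selected feature carries a small cost."""
    return mask[0] + mask[2] - 0.1 * sum(mask)


best_mask = ga_feature_selection(6, toy_fitness)
```

The per-feature cost term plays the role of a resource penalty, so the search is pushed towards small subsets that still score well.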
Information Sciences, Signal Processing and their Applications | 2012
Clayton Santos; Eulanda Miranda dos Santos; Eduardo Souto
Automatic nudity detection strategies play an important role in solutions focusing on controlling access to inappropriate content. These strategies usually apply filters or similar approaches in order to detect nudity in digital images. In this paper we propose a strategy for nudity detection based on applying image zoning. Moreover, we perform feature extraction using color and texture information, as well as feature selection, focusing on pointing out the most relevant features. SVM is used for image classification. Experimental results demonstrate that an effective improvement in terms of accuracy is attained by using the zoning strategy. In addition, the highest classification rates are obtained by combining two color-based and two texture-based features, which shows that it is possible to reduce the number of features while keeping high classification rates.
Genetic and Evolutionary Computation Conference | 2008
Eulanda Miranda dos Santos; Luiz S. Oliveira; Robert Sabourin; Patrick Maupin
Classifier ensemble selection may be formulated as a learning task since the search algorithm operates by minimizing/maximizing the objective function. As a consequence, the selection process may be prone to overfitting. The objectives of this paper are: (1) to show how overfitting can be detected when the selection is performed by two classical search algorithms: Genetic Algorithm and Particle Swarm Optimization; and (2) to verify which algorithm is more prone to overfitting. The experimental results demonstrate that GA appears to be more affected by overfitting.
Congress on Evolutionary Computation | 2011
Eulanda Miranda dos Santos; Robert Sabourin
Dynamic classifier ensemble selection is focused on selecting the most confident classifier ensemble to predict the class of a particular test pattern. The overproduce-and-choose strategy is a dynamic classifier ensemble selection method which is divided into optimization and dynamic selection phases. The first phase involves the test of different candidate ensembles in order to produce a population composed of the highest performing candidate ensembles. Then, the second phase calculates the domain of expertise of each candidate ensemble to pick the solution with the highest degree of certainty in its decision to classify the unknown test samples. It has been shown that the optimization phase decreases the oracle, the upper bound of dynamic selection processes. In this paper we propose a hybrid algorithm to perform the optimization phase of the overproduce-and-choose strategy. The proposed algorithm combines stochastic initialization of candidate ensembles of different sizes with the traditional forward search greedy method. The objective is to apply the oracle as the search criterion during the optimization phase. We show experimentally that choosing the population of classifier ensembles taking into account the population oracle leads to an increase in the upper bound of the dynamic selection phase. Moreover, experimental results conducted to compare the proposed method to a multi-objective genetic algorithm (MOGA) demonstrate that our method outperforms MOGA in generating populations of candidate ensembles with higher oracle rates.
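The oracle and a forward greedy search guided by it can be sketched as below. This is a simplified illustration, not the hybrid algorithm proposed in the paper: it omits the stochastic initialization, and it assumes each candidate is summarised by a row of 0/1 flags marking which validation samples it classifies correctly. The names `oracle_rate` and `greedy_by_oracle` are hypothetical.

```python
def oracle_rate(correct_rows):
    """Fraction of samples that at least one candidate classifies correctly.

    correct_rows: list of rows, one per candidate, each holding one
        truthy/falsy flag per sample.
    """
    n_samples = len(correct_rows[0])
    hits = sum(any(row[j] for row in correct_rows) for j in range(n_samples))
    return hits / n_samples


def greedy_by_oracle(candidates, k):
    """Forward greedy search: repeatedly add the candidate that yields the
    highest oracle rate for the selected population, up to k members.
    Ties are broken in favour of the lowest index."""
    chosen = []
    remaining = list(range(len(candidates)))
    while remaining and len(chosen) < k:
        best = max(
            remaining,
            key=lambda i: oracle_rate([candidates[c] for c in chosen + [i]]),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Four made-up candidates evaluated on four samples (1 = correct).
correct = [[1, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
population = greedy_by_oracle(correct, 2)  # [0, 1]
```

Because the oracle counts a sample as solved when any member gets it right, the search favours candidates whose correct samples complement those already covered.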
International Conference on Machine Vision | 2017
Bernardo B. Gatto; Lincon Sales de Souza; Eulanda Miranda dos Santos
In this paper, we propose a novel deep neural network based on learning subspaces and convolutional neural networks with applications in image classification. Recently, multistage PCA-based filter banks have been successfully adopted in convolutional neural network architectures in many applications including texture classification, face recognition and scene understanding. These approaches have been shown to be powerful, with a straightforward implementation that enables fast prototyping of efficient image classification systems. However, these architectures employ filters based on PCA, which may not achieve highly discriminative features in more complicated computer vision datasets. In order to cope with the aforementioned drawback, we propose a Hybrid Subspace Neural Network (HS-Net). The proposed architecture employs filters from both PCA and discriminative filter banks from more sophisticated subspace methods, therefore achieving more representative and discriminative information. In addition, the use of a hybrid architecture enables the use of supervised and unsupervised samples, depending on the application, making the introduced architecture quite attractive in practical terms. Experimental results on three publicly available datasets demonstrate the effectiveness and the practicability of the proposed architecture.
Brazilian Conference on Intelligent Systems | 2016
Bernardo B. Gatto; Eulanda Miranda dos Santos
In this paper, we present a novel supervised learning algorithm for object recognition from sets of images, where the sets describe most of the variation in an object's appearance caused by lighting, pose and view angle. In this scenario, the generalized mutual subspace method (gMSM) has attracted attention for image-set matching due to its advantages in accuracy and robustness. However, gMSM employs PCA, which has a high computational cost compared with state-of-the-art appearance-based methods. To create a faster method, we replace traditional PCA with 2D-PCA and its variants in the gMSM framework. In general, 2D-PCA and its variants require less memory than conventional PCA since their covariance matrix is calculated directly from two-dimensional matrices. The introduced method has the advantage of representing the subspaces in a more compact manner, providing a reasonably competitive recognition rate compared to traditional MSM, confirming the suitability of employing 2D-PCA and its variants in the gMSM framework. These results have been revealed through experimentation conducted on five widely used datasets.
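The memory argument can be seen from how 2D-PCA builds its covariance matrix. The sketch below follows the standard 2D-PCA definition, G = (1/M) Σ (A_i − Ā)ᵀ(A_i − Ā), computed directly on m × n image matrices; it covers only this step, not the eigendecomposition or the gMSM matching. The function name `image_covariance` is hypothetical.

```python
def image_covariance(images):
    """Compute the 2D-PCA image covariance matrix.

    images: list of m x n matrices given as lists of rows.
    Returns an n x n matrix G. Conventional PCA on flattened images would
    instead need an (m*n) x (m*n) covariance matrix.
    """
    count = len(images)
    m, n = len(images[0]), len(images[0][0])
    # Element-wise mean image.
    mean = [[sum(img[r][c] for img in images) / count for c in range(n)]
            for r in range(m)]
    G = [[0.0] * n for _ in range(n)]
    for img in images:
        for r in range(m):
            d = [img[r][c] - mean[r][c] for c in range(n)]  # centred row
            for i in range(n):
                for j in range(n):
                    G[i][j] += d[i] * d[j] / count
    return G


# Two made-up 2 x 2 "images" that differ only in their first column.
G = image_covariance([[[1, 2], [3, 4]], [[3, 2], [1, 4]]])
# G == [[2.0, 0.0], [0.0, 0.0]]
```

For 100 × 100 images, G is 100 × 100, whereas the covariance of the flattened vectors would be 10,000 × 10,000.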