Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Everton Alvares Cherman is active.

Publication


Featured research published by Everton Alvares Cherman.


Electronic Notes in Theoretical Computer Science | 2013

A Comparison of Multi-label Feature Selection Methods using the Problem Transformation Approach

Newton Spolaôr; Everton Alvares Cherman; Maria Carolina Monard; Huei Diana Lee

Feature selection is an important task in machine learning, which can effectively reduce dataset dimensionality by removing irrelevant and/or redundant features. Although a large body of research deals with feature selection in single-label data, for which measures have been proposed to filter out irrelevant features, this is not the case for multi-label data. This work proposes multi-label feature selection methods that use the filter approach. To this end, two standard approaches that transform the multi-label data into single-label data are used. On top of these two problem transformation approaches, ReliefF and Information Gain are used to measure the goodness of features. This gives rise to four multi-label feature selection methods. A thorough experimental evaluation of these methods was carried out on 10 benchmark datasets. Results show that ReliefF is able to select fewer features without diminishing the quality of the classifiers constructed using the selected features.
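The combination of problem transformation and a filter measure described above can be sketched as follows. This is a simplified, hypothetical illustration rather than the authors' implementation: it applies the binary relevance transformation and scores each feature by its average Information Gain across the resulting single-label problems (function names are our own).

```python
import math

def entropy(labels):
    # Shannon entropy of a single-label column
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def info_gain(feature_col, labels):
    # IG(Y; X) = H(Y) - sum_v p(X = v) * H(Y | X = v)
    n = len(labels)
    cond = 0.0
    for v in set(feature_col):
        idx = [i for i, x in enumerate(feature_col) if x == v]
        cond += (len(idx) / n) * entropy([labels[i] for i in idx])
    return entropy(labels) - cond

def br_ig_scores(X, Y):
    # Binary relevance transformation: one single-label problem per label;
    # a feature's score is its average information gain over those problems.
    n_labels = len(Y[0])
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        per_label = [info_gain(col, [y[l] for y in Y]) for l in range(n_labels)]
        scores.append(sum(per_label) / n_labels)
    return scores
```

Features can then be ranked by score and the top ones kept, independently of any particular classifier.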


Brazilian Conference on Intelligent Systems | 2013

ReliefF for Multi-label Feature Selection

Newton Spolaôr; Everton Alvares Cherman; Maria Carolina Monard; Huei Diana Lee

The feature selection process aims to select a subset of relevant features to be used in model construction, reducing data dimensionality by removing irrelevant and redundant features. Although effective feature selection methods to support single-label learning abound, this is not the case for multi-label learning. Furthermore, most of the multi-label feature selection methods proposed so far first transform the multi-label data into single-label data, to which a traditional feature selection method is then applied. However, applying single-label feature selection methods after transforming the data can hinder the exploration of label dependence, an important issue in multi-label learning. This work proposes a new multi-label feature selection algorithm, RF-ML, which extends the single-label feature selection algorithm ReliefF. Unlike strictly univariate measures for feature ranking, RF-ML takes into account the effect of interacting attributes and deals directly with multi-label data without any data transformation. Using synthetic datasets, the proposed algorithm is experimentally compared to ReliefF applied to multi-label data previously transformed to single-label data using two well-known data transformation approaches. Results show that the proposed algorithm stands out by ranking the relevant features as the best ones more often.
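A much-simplified sketch of the idea behind RF-ML follows. This is not the published algorithm; the weighting scheme below is an assumption for illustration only: feature weights are updated directly from the multi-label data, rewarding features that differ between neighbors whose label sets differ (measured by Hamming distance) and penalizing features that differ between neighbors whose label sets agree.

```python
def hamming_dist(a, b):
    # fraction of labels on which two multi-labels disagree
    return sum(x != y for x, y in zip(a, b)) / len(a)

def rf_ml_weights(X, Y, k=3):
    # ReliefF-style weighting adapted to multi-label data (illustrative):
    # no transformation to single-label is performed.
    n, n_feat = len(X), len(X[0])
    w = [0.0] * n_feat
    for i in range(n):
        # k nearest neighbors in feature space (squared Euclidean)
        nbrs = sorted((sum((a - b) ** 2 for a, b in zip(X[i], X[j])), j)
                      for j in range(n) if j != i)[:k]
        for _, j in nbrs:
            d = hamming_dist(Y[i], Y[j])
            for f in range(n_feat):
                diff = abs(X[i][f] - X[j][f])  # features assumed scaled to [0, 1]
                # reward differences under disagreeing multi-labels,
                # penalize differences under agreeing multi-labels
                w[f] += diff * d - diff * (1.0 - d)
    return [v / (n * k) for v in w]
```

Relevant features (those whose values change exactly when the label sets change) accumulate positive weight, while uninformative features stay near zero.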


Brazilian Symposium on Artificial Intelligence | 2012

Filter Approach Feature Selection Methods to Support Multi-label Learning Based on ReliefF and Information Gain

Newton Spolaôr; Everton Alvares Cherman; Maria Carolina Monard; Huei Diana Lee

In multi-label learning, each example in the dataset is associated with a set of labels, and the task of the generated classifier is to predict the label set of unseen examples. Feature selection is an important task in machine learning, which aims to find a small number of features that describe the dataset as well as, or even better than, the original set of features does. This can be achieved by removing irrelevant and/or redundant features according to some importance criterion. Although effective feature selection methods to support classification for single-label data abound, this is not the case for multi-label data. This work proposes two multi-label feature selection methods that use the filter approach, which evaluates statistics of the data independently of any particular classifier. To this end, ReliefF, a single-label feature selection method, and an adaptation of the Information Gain measure for multi-label data are used to find the features that should be selected. Both methods were experimentally evaluated on ten benchmark datasets, taking into account the reduction in the number of features as well as the quality of the generated classifiers, showing promising results.


Electronic Notes in Theoretical Computer Science | 2014

A Framework to Generate Synthetic Multi-label Datasets

Jimena Torres Tomás; Newton Spolaôr; Everton Alvares Cherman; Maria Carolina Monard

A controlled environment based on known properties of the dataset used by a learning algorithm is useful to empirically evaluate machine learning algorithms. Synthetic (artificial) datasets are used for this purpose. Although there are publicly available frameworks to generate synthetic single-label datasets, this is not the case for multi-label datasets, in which each instance is associated with a set of labels that are usually correlated. This work presents Mldatagen, a multi-label dataset generator framework we have implemented, which is publicly available to the community. Currently, two strategies are implemented in Mldatagen: hypersphere and hypercube. For each label in the multi-label dataset, these strategies randomly generate a geometric shape (hypersphere or hypercube), which is populated with randomly generated points (instances). Afterwards, each instance is labeled according to the shapes it belongs to, which defines its multi-label. Experiments with a multi-label classification algorithm on six synthetic datasets illustrate the use of Mldatagen.
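The hypersphere strategy can be sketched as follows. This is a hypothetical minimal generator in the spirit of Mldatagen, not its actual code: one random hypersphere per label is placed inside the unit hypercube, and each uniformly sampled point receives as its multi-label the set of spheres that contain it.

```python
import random

def synth_multilabel(n_points, n_labels, dim, seed=0):
    rng = random.Random(seed)
    # one random hypersphere per label, inside the unit hypercube
    spheres = []
    for _ in range(n_labels):
        center = [rng.uniform(0.3, 0.7) for _ in range(dim)]
        radius = rng.uniform(0.2, 0.4)
        spheres.append((center, radius))
    X, Y = [], []
    for _ in range(n_points):
        p = [rng.uniform(0.0, 1.0) for _ in range(dim)]
        # the instance's multi-label: one bit per sphere containing it
        y = [int(sum((a - c) ** 2 for a, c in zip(p, ctr)) <= r * r)
             for ctr, r in spheres]
        X.append(p)
        Y.append(y)
    return X, Y
```

Because overlapping spheres yield instances with several active labels, the generated labels are naturally correlated, as the abstract describes.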


Mexican International Conference on Artificial Intelligence | 2010

A simple approach to incorporate label dependency in multi-label classification

Everton Alvares Cherman; Jean Metz; Maria Carolina Monard

In multi-label classification, each example can be associated with multiple labels simultaneously. The task of learning from multi-label data can be addressed by methods that transform the multi-label classification problem into several single-label classification problems. The binary relevance approach is one of these methods, where the multi-label learning task is decomposed into several independent binary classification problems, one for each label in the set of labels, and the final labels for each example are determined by aggregating the predictions from all binary classifiers. However, this approach fails to consider any dependency among the labels. In this paper, we consider a simple approach which can be used to explore label dependency with the aim of accurately predicting label combinations. An experimental study using decision trees, a kernel method and Naive Bayes as base learning techniques shows the potential of the proposed approach to improve multi-label classification performance.
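The augmentation step that lets binary relevance exploit label dependency can be sketched as follows. This is a minimal illustration under our own naming, not the paper's code: the binary classifier for each label is retrained on feature vectors extended with first-stage predictions for the other labels.

```python
def augment_for_label(X, first_stage_preds, target):
    # For the second-stage binary classifier of label `target`, extend each
    # feature vector with the first-stage predictions of all *other* labels,
    # so that this classifier can exploit dependencies among labels.
    return [x + [p for l, p in enumerate(preds) if l != target]
            for x, preds in zip(X, first_stage_preds)]
```

A full pipeline would first train one independent binary classifier per label, collect their predictions, and then train a second round of binary classifiers on the augmented vectors produced above.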


Ibero-American Conference on Artificial Intelligence | 2012

On the Estimation of Predictive Evaluation Measure Baselines for Multi-label Learning

Jean Metz; Luís F. D. de Abreu; Everton Alvares Cherman; Maria Carolina Monard

Machine learning research relies to a large extent on experimental observations. The evaluation of classifiers is often carried out by empirical comparison with classifiers generated by different learning algorithms, allowing the identification of the best algorithm for the problem at hand. Nevertheless, prior to this evaluation, it is important to establish whether the classifiers have truly learned the domain class concepts, which can be done by comparing the classifiers' predictive measures with those of baseline classifiers. A baseline classifier is one constructed by a naive learning algorithm which only uses the class distribution of the dataset. However, finding naive classifiers in multi-label learning is not as straightforward as in single-label learning. This work proposes a simple way to find baseline multi-label classifiers. Three specific and one general naive multi-label classifiers are proposed to estimate the baseline values for multi-label predictive evaluation measures. Experimental results show the suitability of our proposal in revealing the learning power of multi-label learning algorithms.
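One natural naive baseline of this kind can be sketched for the Hamming loss measure. This is an illustrative assumption, not necessarily one of the paper's four proposals: a label is predicted for every instance exactly when its prior frequency in the training data exceeds 0.5, so the baseline uses only the label distribution.

```python
def hamming_baseline(Y_train):
    # Predict label l for every new instance iff its prior frequency in the
    # training set exceeds 0.5 -- uses only the label distribution, never
    # the features, which is what makes it a naive baseline.
    n = len(Y_train)
    n_labels = len(Y_train[0])
    freq = [sum(y[l] for y in Y_train) / n for l in range(n_labels)]
    return [int(f > 0.5) for f in freq]
```

A learned multi-label classifier should beat this constant prediction on Hamming loss; otherwise it has arguably not learned the domain class concepts.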


International Conference on Hybrid Intelligent Systems | 2011

On the estimation of the number of fuzzy sets for fuzzy rule-based classification systems

Marcos Evandro Cintra; Maria Carolina Monard; Everton Alvares Cherman; Heloisa A. Camargo

Defining the attributes in terms of fuzzy sets is an essential part of designing a fuzzy system. The main tasks involved in defining the fuzzy data base include deciding the type of fuzzy set (triangular, trapezoidal, etc.), the number of fuzzy sets for each attribute, and their distribution in each attribute domain. In the absence of an expert, these definitions can be made empirically or by using automatic methods. In this paper, we present four different methods to estimate the number of fuzzy sets for a dataset. The first defines the same number of fuzzy sets for all attributes, while the other three flexibly estimate a different number of fuzzy sets for each attribute of a given dataset. The aim of this paper is to provide fast and practicable methods to define fuzzy data bases prior to the generation of the fuzzy rule base by more costly approaches, such as genetic fuzzy systems. These methods are evaluated on 11 datasets using the FuzzyDT method, which generates a fuzzy decision tree based on the classic C4.5 algorithm, and the results are compared in terms of accuracy and number of generated rules. The results show that the flexible estimation of the number of fuzzy sets obtained better error rates for the datasets used in the experiments.
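A flexible per-attribute estimation can be sketched as follows. The heuristic below (a Sturges-style rule on the number of distinct values) is our own stand-in for illustration and is not one of the paper's four methods.

```python
import math

def fuzzy_sets_per_attribute(X):
    # One count per attribute: a Sturges-style rule on the number of
    # distinct values (hypothetical stand-in for the paper's heuristics),
    # with a floor of 2 fuzzy sets per attribute.
    counts = []
    for j in range(len(X[0])):
        distinct = len({row[j] for row in X})
        counts.append(max(2, 1 + math.ceil(math.log2(distinct))))
    return counts
```

Attributes with many distinct values receive more fuzzy sets; near-constant attributes keep the minimum of two, mirroring the flexible (per-attribute) estimation the paper advocates.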


Artificial Intelligence Applications and Innovations | 2016

Active Learning Algorithms for Multi-label Data

Everton Alvares Cherman; Grigorios Tsoumakas; Maria Carolina Monard

Active learning is an iterative supervised learning task where learning algorithms can actively query an oracle, i.e. a human annotator who understands the nature of the problem, for labels. As the learner is allowed to interactively choose the data from which it learns, it is expected that the learner will perform better with less training. The active learning approach is appropriate for machine learning applications where training labels are costly to obtain but unlabeled data is abundant. Although active learning has been widely considered for single-label learning, this is not the case for multi-label learning, where objects can have more than one class label and a multi-label learner is trained to assign multiple labels simultaneously to an object. We discuss the key issues that need to be considered in pool-based multi-label active learning and how existing solutions in the literature deal with each of them. We further empirically study the performance of the existing solutions, after implementing them in a common framework, on two multi-label datasets with different characteristics and under two different application settings (transductive, inductive). We find interesting results that we attribute mainly to the properties of the datasets and, secondarily, to the application settings.
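A single query-selection step of pool-based multi-label active learning can be sketched as follows. The aggregation strategy shown (mean per-label uncertainty, taking scores closest to 0.5 as most uncertain) is one hypothetical choice for illustration, not a specific method from the paper.

```python
def select_query(pool_scores):
    # pool_scores[i][l]: estimated probability that pool instance i has
    # label l. Mean closeness of the scores to 0.5 measures uncertainty;
    # return the index of the most uncertain instance to ask the oracle.
    def uncertainty(scores):
        return -sum(abs(s - 0.5) for s in scores) / len(scores)
    return max(range(len(pool_scores)),
               key=lambda i: uncertainty(pool_scores[i]))
```

In the full loop, the selected instance is labeled by the oracle, moved from the pool to the training set, and the multi-label learner is retrained before the next query.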


Journal of Intelligent and Robotic Systems | 2015

Lazy Multi-label Learning Algorithms Based on Mutuality Strategies

Everton Alvares Cherman; Newton Spolaôr; Jorge Carlos Valverde-Rebaza; Maria Carolina Monard

Lazy multi-label learning algorithms have become an important research topic within the multi-label community. These algorithms usually consider the set of standard k-nearest neighbors of a new instance to predict its labels (multi-label). The prediction is made by following a voting criterion within the multi-labels of the set of k-nearest neighbors of the new instance. This work proposes the use of two alternative strategies to identify this set of examples: the Mutual and Not Mutual Nearest Neighbors rules, which have already been used by lazy single-label learning algorithms. In this work, we use these strategies to extend the lazy multi-label algorithm BRkNN. An experimental evaluation comparing both mutuality strategies with the original BRkNN algorithm and the well-known MLkNN lazy algorithm on 15 benchmark datasets showed that MLkNN presented the best predictive performance for the Hamming loss evaluation measure, although it was significantly outperformed by the mutuality strategies when F-measure is considered. The best results of the lazy algorithms were also compared with the results obtained by the Binary Relevance approach using three different base learning algorithms.
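The Mutual Nearest Neighbors rule can be sketched as follows (a minimal illustration with our own function names): an example j is kept for voting on instance i only if i and j appear in each other's k-nearest-neighbor sets.

```python
def knn(X, i, k):
    # indices of the k nearest neighbors of X[i] (squared Euclidean distance)
    dists = sorted((sum((a - b) ** 2 for a, b in zip(X[i], X[j])), j)
                   for j in range(len(X)) if j != i)
    return {j for _, j in dists[:k]}

def mutual_neighbors(X, i, k):
    # Mutual kNN rule: keep neighbor j only if i and j are in each
    # other's k-nearest-neighbor sets
    return {j for j in knn(X, i, k) if i in knn(X, j, k)}
```

The mutuality requirement can leave an instance with fewer than k (possibly zero) voting neighbors, which is exactly how it filters out one-sided, unreciprocated neighborhood relations.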


Intelligent Data Analysis | 2017

Towards Automatic Evaluation of Asphalt Irregularity Using Smartphone’s Sensors

Vinícius Mourão Alves de Souza; Everton Alvares Cherman; Rafael Rossi; Rafael A. Souza

The quality of the pavement of roads and streets has a significant influence on the final price of goods and services, on the safety of pedestrians, and on driver comfort. Thus, the development of tools for continuous monitoring of the pavement, aiming at a more precise and adequate maintenance plan, is essential. In order to reduce the manual effort of inspections made by experts and the use of high-cost equipment such as laser profilometers, while allowing evaluation in real time, the use of smartphone motion sensors to monitor asphalt irregularity is proposed. In this paper, the problem is modeled as a classification task that can be performed by supervised learning algorithms, aided by signal processing techniques for extracting features from the acceleration data. The proposed approach shows promising accuracies for the identification of asphalt irregularity (around 99%) and for the identification of obstacles such as speed bumps, raised crosswalks, pavement markers, and asphalt patches (around 87%).
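The feature-extraction step can be sketched as follows: simple time-domain statistics computed over one window of vertical acceleration. These statistics are illustrative stand-ins, not necessarily the paper's exact feature set.

```python
import math

def window_features(accel):
    # mean, standard deviation and RMS of one window of vertical
    # acceleration samples; rough pavement produces larger std and RMS
    n = len(accel)
    mean = sum(accel) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in accel) / n)
    rms = math.sqrt(sum(a * a for a in accel) / n)
    return [mean, std, rms]
```

Feature vectors computed per window would then be fed to a supervised classifier trained to distinguish asphalt conditions and obstacle types.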

Collaboration


Dive into Everton Alvares Cherman's collaboration.

Top Co-Authors

Huei Diana Lee | University of São Paulo

Jean Metz | University of São Paulo

Grigorios Tsoumakas | Aristotle University of Thessaloniki