Marcílio Carlos Pereira de Souto

Publication

Featured research published by Marcílio Carlos Pereira de Souto.

International Conference on Hybrid Intelligent Systems | 2006

Multi-Objective Clustering Ensemble

Katti Faceli; André Carlos Ponce Leon Ferreira de Carvalho; Marcílio Carlos Pereira de Souto

In this paper, we present an algorithm for cluster analysis that provides a robust way to deal with datasets presenting different types of clusters and allows finding more than one structure in a dataset. Our approach is based on ideas from cluster ensembles and multi-objective clustering. We apply a Pareto-based multi-objective genetic algorithm with a special crossover operator. Such an operator combines a number of partitions obtained according to different clustering criteria. As a result, our approach generates a concise and stable set of partitions representing different trade-offs between two validation measures related to different clustering criteria.

Genetics and Molecular Biology | 2004

Comparative analysis of clustering methods for gene expression time course data

Ivan G. Costa; Francisco de A. T. de Carvalho; Marcílio Carlos Pereira de Souto

This work performs a data-driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series). Five clustering methods found in the gene expression analysis literature are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. To evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by comparing the partitions obtained in these experiments with gene annotation, such as protein function and series classification.

Brazilian Symposium on Neural Networks | 2006

A Dynamic Classifier Selection Method to Build Ensembles using Accuracy and Diversity

Alixandre Santana; Rodrigo G. Soares; Anne M. P. Canuto; Marcílio Carlos Pereira de Souto

An ensemble of classifiers is an effective way of improving on the performance of individual classifiers. However, choosing the ensemble members can be a very difficult task and, in some cases, can lead to ensembles with no performance improvement. To avoid this situation, effective methods for selecting the member classifiers are needed. In this paper, a DCS (Dynamic Classifier Selection)-based method is presented, which takes into account both the performance and the diversity of the classifiers in order to choose the ensemble members.

Neurocomputing | 2009

Multi-objective clustering ensemble for gene expression data analysis

Katti Faceli; Marcílio Carlos Pereira de Souto; Daniel de Araújo; André Carlos Ponce Leon Ferreira de Carvalho

In this paper, we present an algorithm for cluster analysis that integrates aspects of cluster ensembles and multi-objective clustering. The algorithm is based on a Pareto-based multi-objective genetic algorithm with a special crossover operator, and it uses clustering validation measures as objective functions. The proposed algorithm can deal with data sets presenting different types of clusters, without requiring expertise in cluster analysis. Its result is a concise set of partitions representing alternative trade-offs among the objective functions. We compare the results obtained with our algorithm, in the context of gene expression data sets, to those achieved with multi-objective clustering with automatic K-determination (MOCK), the algorithm most closely related to ours.
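The "concise set of partitions representing alternative trade-offs" mentioned in the abstract above corresponds to a Pareto (non-dominated) set. The following is a minimal sketch of Pareto filtering only, not the published algorithm: it omits the genetic search and the crossover operator, and it assumes that each candidate partition has already been scored on two validation measures that are both to be minimized. The function names and the score values are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a Pareto-dominates b (all objectives minimized).

    a dominates b when a is no worse on every objective and strictly
    better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of the score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (measure_1, measure_2) scores for five candidate partitions.
candidate_scores = [
    (0.20, 0.90),
    (0.35, 0.40),
    (0.30, 0.95),  # dominated by (0.20, 0.90)
    (0.80, 0.10),
    (0.50, 0.50),  # dominated by (0.35, 0.40)
]

front = pareto_front(candidate_scores)
print(front)
```

The surviving indices identify partitions that each represent a different trade-off: none can be improved on one measure without getting worse on the other.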

Neurocomputing | 2012

Analysis of complexity indices for classification problems: Cancer gene expression data

Ana Carolina Lorena; Ivan G. Costa; Newton Spolaôr; Marcílio Carlos Pereira de Souto

Cancer diagnosis at a molecular level has been made possible through the analysis of gene expression data. More specifically, one usually uses machine learning (ML) techniques to build automatic diagnosis models (classifiers) from cancer gene expression data. Such data often present characteristics that can have a negative impact on the generalization ability of the resulting classifiers, among them data sparsity and an unbalanced class distribution. We investigate the results of a set of indices able to extract intrinsic complexity information from the data. Such measures can be used to analyze, among other things, which particular characteristics of cancer gene expression data most affect the prediction ability of support vector machine classifiers. In this context, we also show that, by applying a proper feature selection procedure to the data, one can reduce the influence of those characteristics on the error rates of the induced classifiers.

International Symposium on Neural Networks | 2009

Use of multi-objective genetic algorithms to investigate the diversity/accuracy dilemma in heterogeneous ensembles

Diogo F. de Oliveira; Anne M. P. Canuto; Marcílio Carlos Pereira de Souto

Classifier ensembles, also known as committees, are systems composed of a set of base classifiers (organized in parallel) and a combination module, which is responsible for providing the final output of the system. The main aim of using ensembles is to provide better performance than the individual classifiers. In order to build robust ensembles, the base classifiers are often required to be both accurate and diverse among themselves - this is known as the diversity/accuracy dilemma. Some works in the literature analyze ensemble performance in the context of this dilemma. However, the majority of them address homogeneous structures, i.e., ensembles composed of a single type of classifier. Motivated by this limitation, this paper presents an empirical investigation of the diversity/accuracy dilemma for heterogeneous ensembles. To do so, multi-objective genetic algorithms are used to guide the building of the ensemble systems.

International Conference on Artificial Neural Networks | 2009
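To make the two quantities in the abstract above concrete, here is a small illustrative sketch. It assumes majority voting as the combination module and mean pairwise disagreement as the diversity measure; the abstract names neither, so both are stand-ins chosen for simplicity, and the labels and predictions are made up.

```python
from collections import Counter
from itertools import combinations
from typing import List, Sequence


def majority_vote(predictions: List[Sequence[int]]) -> List[int]:
    """Combine per-classifier predictions by taking the most common label per sample."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def accuracy(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def mean_pairwise_disagreement(predictions: List[Sequence[int]]) -> float:
    """Average, over all classifier pairs, of the fraction of samples on which they differ."""
    pairs = list(combinations(predictions, 2))
    per_pair = [sum(a != b for a, b in zip(p, q)) / len(p) for p, q in pairs]
    return sum(per_pair) / len(per_pair)


# Hypothetical true labels and the outputs of three base classifiers.
truth = [1, 0, 1, 1, 0]
base_predictions = [
    [1, 0, 1, 0, 0],  # wrong on sample 3
    [1, 1, 1, 1, 0],  # wrong on sample 1
    [0, 0, 1, 1, 0],  # wrong on sample 0
]

ensemble_output = majority_vote(base_predictions)
ensemble_accuracy = accuracy(ensemble_output, truth)
diversity = mean_pairwise_disagreement(base_predictions)
print(ensemble_accuracy, diversity)
```

In this toy case each base classifier is 80% accurate, but because their errors fall on different samples the vote corrects all of them. With an odd number of voters and binary labels no ties arise; other settings would need an explicit tie-breaking rule.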

Mining Rules for the Automatic Selection Process of Clustering Methods Applied to Cancer Gene Expression Data

André C. A. Nascimento; Ricardo Bastos Cavalcante Prudêncio; Marcílio Carlos Pereira de Souto; Ivan G. Costa

Different algorithms have been proposed in the literature to cluster gene expression data; however, no single algorithm can be considered the best independently of the data. In this work, we applied the concepts of Meta-Learning to relate features of gene expression data sets to the performance of clustering algorithms. In our context, each meta-example consists of descriptive features of a gene expression data set and a label indicating the best clustering algorithm for that data set. A set of such meta-examples is given as input to a learning technique (the meta-learner), which is responsible for acquiring knowledge relating the descriptive features to the best algorithms. We performed experiments on a case study in which a meta-learner was applied to discriminate among three competing algorithms for clustering cancer gene expression data. In this case study, a set of meta-examples was generated by applying the algorithms to 30 different cancer data sets. The knowledge extracted by the meta-learner was useful for understanding the suitability of each clustering algorithm for specific problems.

Brazilian Symposium on Bioinformatics | 2007

Multi-objective clustering ensemble with prior knowledge

Katti Faceli; André Carlos Ponce Leon Ferreira de Carvalho; Marcílio Carlos Pereira de Souto

In this paper, we introduce an approach to integrate prior knowledge in cluster analysis, which is different from the existing ones for semi-supervised clustering methods. In order to aid the discovery of alternative structures present in the data, we consider the knowledge of some existing complete classification of such data. The approach proposed is based on our Multi-Objective Clustering Ensemble algorithm (MOCLE). This algorithm generates a concise and stable set of partitions, which represents different trade-offs between several measures of partition quality. The prior knowledge is automatically integrated in MOCLE by embedding it into one of the objective functions. In this case, the function gives as output the quality of a partition, considering the prior knowledge of one of the known structures of the data.

Meta-Learning in Computational Intelligence | 2011

Selecting Machine Learning Algorithms Using the Ranking Meta-Learning Approach

Ricardo Bastos Cavalcante Prudêncio; Marcílio Carlos Pereira de Souto; Teresa Bernarda Ludermir

In this work, we present the use of Ranking Meta-Learning approaches to rank and select algorithms for time series forecasting and for clustering gene expression data. Given a problem (forecasting or clustering), the Meta-Learning approach provides a ranking of the candidate algorithms according to the characteristics of the problem’s dataset. The best-ranked algorithm can be returned as the selected one. To evaluate the Ranking Meta-Learning proposal, prototypes were implemented to rank artificial neural network models for forecasting financial and economic time series and to rank clustering algorithms in the context of cancer gene expression microarray datasets. The case studies comprise experiments that measure the correlation between the suggested rankings of algorithms and the ideal rankings. The results revealed that Meta-Learning was able to suggest more adequate rankings in both application domains considered.

International Conference on Hybrid Intelligent Systems | 2008
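The evaluation described in the abstract above compares a suggested ranking with an ideal one. One common choice for that comparison is Spearman's rank correlation coefficient; the abstract does not state which coefficient was used, so the sketch below is an assumption. It uses the closed form $1 - 6\sum d_i^2 / (n(n^2 - 1))$, which is valid only for rankings without ties, and the example rankings are invented.

```python
from typing import Sequence


def spearman_rank_correlation(rank_a: Sequence[int], rank_b: Sequence[int]) -> float:
    """Spearman's coefficient for two tie-free rankings of the same n items.

    rank_a[i] and rank_b[i] are the positions given to item i by each ranking.
    Returns 1.0 for identical rankings and -1.0 for fully reversed ones.
    """
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have the same length, with at least 2 items")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * squared_diffs / (n * (n**2 - 1))


# Hypothetical ranks for four candidate algorithms (1 = best).
suggested = [1, 2, 3, 4]  # ranking produced by a meta-learner
ideal = [1, 3, 2, 4]      # ranking observed after running every algorithm

rho = spearman_rank_correlation(suggested, ideal)
print(rho)
```

A value close to 1 means the suggested ordering nearly matches the ideal one; here the two rankings differ only by swapping the second and third algorithms.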

A Class-Based Feature Selection Method for Ensemble Systems

Karliane O. Vale; Filipe G. Dias; Anne M. P. Canuto; Marcílio Carlos Pereira de Souto

Diversity is considered one of the main prerequisites for the efficient use of ensemble systems. One way of increasing diversity is through the use of feature selection methods. In this paper, a class-based feature selection method for ensemble systems is proposed. The proposed method follows the filter approach to feature selection and chooses only the attributes that are important for a specific class. The performance of the proposed method is also analyzed, and the analysis shows that it outperformed the standard feature selection method.