André Carlos Ponce Leon Ferreira de Carvalho

Publication

Featured research published by André Carlos Ponce Leon Ferreira de Carvalho.

ACM Computing Surveys | 2013

Data stream clustering: A survey

Jonathan de Andrade Silva; Elaine R. Faria; Rodrigo C. Barros; Eduardo R. Hruschka; André Carlos Ponce Leon Ferreira de Carvalho; João Gama

Data stream mining is an active research area that has recently emerged to discover knowledge from large amounts of continuously generated data. In this context, several data stream clustering algorithms have been proposed to perform unsupervised learning. Nevertheless, data stream clustering imposes several challenges to be addressed, such as dealing with nonstationary, unbounded data that arrive in an online fashion. The intrinsic nature of stream data requires the development of algorithms capable of performing fast and incremental processing of data objects, suitably addressing time and memory limitations. In this article, we present a survey of data stream clustering algorithms, providing a thorough discussion of the main design components of state-of-the-art algorithms. In addition, this work addresses the temporal aspects involved in data stream clustering, and presents an overview of the usually employed experimental methodologies. A number of references are provided that describe applications of data stream clustering in different domains, such as network intrusion detection, sensor networks, and stock market analysis. Information regarding software packages and data repositories is also provided to help researchers and practitioners. Finally, some important issues and open questions that can be the subject of future research are discussed.

Artificial Intelligence Review | 2008

A review on the combination of binary classifiers in multiclass problems

Ana Carolina Lorena; André Carlos Ponce Leon Ferreira de Carvalho; João Gama

Several real problems involve the classification of data into categories or classes. Given a data set containing data whose classes are known, Machine Learning algorithms can be employed for the induction of a classifier able to predict the class of new data from the same domain, performing the desired discrimination. Some learning techniques are originally conceived for the solution of problems with only two classes, also named binary classification problems. However, many problems require the discrimination of examples into more than two categories or classes. This paper presents a survey on the main strategies for the generalization of binary classifiers to problems with more than two classes, known as multiclass classification problems. The focus is on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are combined to obtain the final prediction.
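The decomposition idea in this abstract can be illustrated with a minimal sketch of one common strategy, one-vs-one, where one binary classifier is trained per pair of classes and the final prediction is a majority vote. This is not code from the paper: the function names and the toy nearest-centroid learner are assumptions chosen only to keep the example self-contained (a nearest-centroid rule is already multiclass by itself, so it stands in here for any binary-only learner).

```python
from collections import Counter
from itertools import combinations


def train_centroid(points, labels):
    """Toy binary learner (assumption): store the mean point of each class."""
    centroids = {}
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        centroids[label] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


def predict_centroid(centroids, point):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(centroids[lab], point)),
    )


def train_one_vs_one(points, labels):
    """Train one binary classifier for every pair of classes."""
    models = []
    for a, b in combinations(sorted(set(labels)), 2):
        subset = [(p, l) for p, l in zip(points, labels) if l in (a, b)]
        pts, labs = zip(*subset)
        models.append(train_centroid(pts, labs))
    return models


def predict_one_vs_one(models, point):
    """Combine the binary outputs by majority vote."""
    votes = Counter(predict_centroid(m, point) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three well-separated classes in two dimensions.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
labels = ["a", "a", "b", "b", "c", "c"]
models = train_one_vs_one(points, labels)
print(len(models), predict_one_vs_one(models, (4.5, 5.0)))
```

With three classes the sketch trains three pairwise models. Other decompositions, such as one-vs-all, change only which binary subproblems are built and how their outputs are combined.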

Neurocomputing | 2008

Evolutionary tuning of SVM parameter values in multiclass problems

Ana Carolina Lorena; André Carlos Ponce Leon Ferreira de Carvalho

Support vector machines (SVMs) were originally formulated for the solution of binary classification problems. In multiclass problems, a decomposition approach is often employed, in which the multiclass problem is divided into multiple binary subproblems, whose results are combined. Generally, the performance of SVM classifiers is affected by the selection of values for their parameters. This paper investigates the use of genetic algorithms (GAs) to tune the parameters of the binary SVMs in common multiclass decompositions. The developed GA may search for a set of parameter values common to all binary classifiers or for differentiated values for each binary classifier.

European Journal of Operational Research | 2011

Spectral methods for graph clustering – A survey

Mariá Cristina Vasconcelos Nascimento; André Carlos Ponce Leon Ferreira de Carvalho

Graph clustering is an area in cluster analysis that looks for groups of related vertices in a graph. Due to its wide applicability, several graph clustering algorithms have been proposed in recent years. A particular class of graph clustering algorithms is known as spectral clustering algorithms. These algorithms are mostly based on the eigen-decomposition of Laplacian matrices of either weighted or unweighted graphs. This survey presents different graph clustering formulations, most of which are based on graph cut and partitioning problems, and describes the main spectral clustering algorithms found in the literature that solve these problems.
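As a small illustration of the matrices mentioned in this abstract, the sketch below builds the unnormalised graph Laplacian L = D - W from a symmetric weight matrix W, where D is the diagonal matrix of vertex degrees. It covers only this first step and is not taken from the survey; the function name is an assumption, and the eigen-decomposition itself would normally be delegated to a numerical library (for example `numpy.linalg.eigh`), which is omitted here to keep the example dependency-free.

```python
def unnormalised_laplacian(weights):
    """Return L = D - W for a square, symmetric weight matrix given as nested lists."""
    n = len(weights)
    degrees = [sum(row) for row in weights]
    return [
        [(degrees[i] if i == j else 0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical example: an unweighted path graph 0 - 1 - 2.
W = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
L = unnormalised_laplacian(W)
for row in L:
    print(row)
```

Every row of L sums to zero, so the constant vector is an eigenvector with eigenvalue 0. Spectral methods typically work with the eigenvectors associated with the next smallest eigenvalues.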

Foundations of Computational Intelligence | 2009

A Tutorial on Multi-label Classification Techniques

André Carlos Ponce Leon Ferreira de Carvalho; Alex Alves Freitas

Most classification problems associate a single class to each example or instance. However, there are many classification tasks where each instance can be associated with one or more classes. This group of problems represents an area known as multi-label classification. One typical example of multi-label classification problems is the classification of documents, where each document can be assigned to more than one class. This tutorial presents the most frequently used techniques to deal with these problems in a pedagogical manner, with examples illustrating the main techniques and proposing a taxonomy of multi-label techniques that highlights the similarities and differences between these techniques.
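One widely used way to handle multi-label data is binary relevance, a problem-transformation approach that creates one independent binary task per label. The abstract does not say which techniques the tutorial covers, so the sketch below is only an assumed illustration of that general idea, with hypothetical function names and toy document labels.

```python
def binary_relevance_targets(label_sets):
    """Turn a list of label sets into one 0/1 target vector per label."""
    all_labels = sorted(set().union(*label_sets))
    return {
        label: [int(label in labels) for labels in label_sets]
        for label in all_labels
    }


def combine_predictions(per_label_predictions):
    """Merge per-label 0/1 predictions back into one label set per instance."""
    n = len(next(iter(per_label_predictions.values())))
    return [
        {label for label, preds in per_label_predictions.items() if preds[i]}
        for i in range(n)
    ]


# Hypothetical documents, each assigned to one or more classes.
doc_labels = [
    {"sports", "politics"},
    {"politics"},
    {"science"},
    {"sports", "science"},
]
targets = binary_relevance_targets(doc_labels)
print(targets)
```

Each target vector can be given to any binary classifier, and `combine_predictions` reassembles the per-label outputs into label sets. A known limitation of this transformation is that it ignores correlations between labels.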

ACM Symposium on Applied Computing | 2007

OLINDDA: a cluster-based approach for detecting novelty and concept drift in data streams

Eduardo Jaques Spinosa; André Carlos Ponce Leon Ferreira de Carvalho; João Gama

A machine learning approach that is capable of treating data streams presents new challenges and enables the analysis of a variety of real problems in which concepts change over time. In this scenario, the ability to identify novel concepts and the ability to deal with concept drift are two important attributes. This paper presents a technique based on the k-means clustering algorithm that addresses both situations in a single learning strategy. Experiments performed with data from various domains provide insight into how clustering algorithms can be used for the discovery of new concepts in streams of data.

Neurocomputing | 2012

Combining meta-learning and search techniques to select parameters for support vector machines

Taciana A. F. Gomes; Ricardo Bastos Cavalcante Prudêncio; Carlos Soares; André L. D. Rossi; André Carlos Ponce Leon Ferreira de Carvalho

Support Vector Machines (SVMs) have achieved very good performance on different learning problems. However, the success of SVMs depends on the adequate choice of the values of a number of parameters (e.g., the kernel and regularization parameters). In the current work, we propose the combination of meta-learning and search algorithms to deal with the problem of SVM parameter selection. In this combination, given a new problem to be solved, meta-learning is employed to recommend SVM parameter values based on parameter configurations that have been successfully adopted in previous similar problems. The parameter values returned by meta-learning are then used as initial search points by a search technique, which will further explore the parameter space. In this proposal, we envisioned that the initial solutions provided by meta-learning are located in good regions of the search space (i.e. they are closer to optimum solutions). Hence, the search algorithm would need to evaluate a lower number of candidate solutions when looking for an adequate solution. In this work, we investigate the combination of meta-learning with two search algorithms: Particle Swarm Optimization and Tabu Search. The implemented hybrid algorithms were used to select the values of two SVM parameters in the regression domain. These combinations were compared with the use of the search algorithms without meta-learning. The experimental results on a set of 40 regression problems showed that, on average, the proposed hybrid methods obtained lower error rates when compared to their components applied in isolation.

Genetics and Molecular Biology | 2011

Mechanisms and role of microRNA deregulation in cancer onset and progression

Edenir Inês Palmero; Silvana Gisele P de Campos; Marcelo Campos; Naiara C Nogueira de Souza; Ismael Dale C. Guerreiro; André Carlos Ponce Leon Ferreira de Carvalho; Marcia Maria Chiquitelli Marques

MicroRNAs are key regulators of various fundamental biological processes and, although representing only a small portion of the genome, they regulate a much larger population of target genes. Mature microRNAs (miRNAs) are single-stranded RNA molecules of 20–23 nucleotide (nt) length that control gene expression in many cellular processes. These molecules typically reduce the stability of mRNAs, including those of genes that mediate processes in tumorigenesis, such as inflammation, cell cycle regulation, stress response, differentiation, apoptosis and invasion. MicroRNA targeting is mostly achieved through specific base-pairing interactions between the 5′ end (‘seed’ region) of the miRNA and sites within coding and untranslated regions (UTRs) of mRNAs; target sites in the 3′ UTR diminish mRNA stability. Since miRNAs frequently target hundreds of mRNAs, miRNA regulatory pathways are complex. Calin and Croce were the first to demonstrate a connection between microRNAs and increased risk of developing cancer, and since then the role of microRNAs in carcinogenesis has been firmly established. It should also be considered that the complex mechanism of gene regulation by microRNAs is profoundly influenced by variation in the gene sequence (polymorphisms) of the target sites. Thus, individual variability could cause patients to present differential risks regarding several diseases. Aiming to provide a critical overview of miRNA dysregulation in cancer, this article reviews the growing number of studies that have shown the importance of these small molecules and how these microRNAs can affect or be affected by genetic and epigenetic mechanisms.

ACM Symposium on Applied Computing | 2008

Cluster-based novel concept detection in data streams applied to intrusion detection in computer networks

Eduardo Jaques Spinosa; André Carlos Ponce Leon Ferreira de Carvalho; João Gama

In this paper, a cluster-based novelty detection technique capable of dealing with a large amount of data is presented and evaluated in the context of intrusion detection. Starting with examples of a single class that describe the normal profile, the proposed technique detects novel concepts initially as cohesive clusters of examples and later as sets of clusters in an unsupervised incremental learning fashion. Experimental results with the KDD Cup 1999 data set show that the technique is capable of dealing with data streams, successfully learning novel concepts that are pure in terms of the real class structure.

International Conference on Hybrid Intelligent Systems | 2006

Multi-Objective Clustering Ensemble

Katti Faceli; André Carlos Ponce Leon Ferreira de Carvalho; Marcílio Carlos Pereira de Souto

In this paper, we present an algorithm for cluster analysis that provides a robust way to deal with datasets presenting different types of clusters and allows finding more than one structure in a dataset. Our approach is based on ideas from cluster ensembles and multi-objective clustering. We apply a Pareto-based multi-objective genetic algorithm with a special crossover operator. Such an operator combines a number of partitions obtained according to different clustering criteria. As a result, our approach generates a concise and stable set of partitions representing different trade-offs between two validation measures related to different clustering criteria.
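The notion of trade-offs between two validation measures rests on Pareto dominance. The sketch below is not the authors' genetic algorithm or crossover operator. It only shows, under the assumption that both measures are to be minimised, how a set of candidate partitions scored on two criteria can be filtered down to its non-dominated subset. The function names and scores are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores):
    """Keep only the score tuples that no other tuple dominates (minimisation)."""
    return [s for s in scores if not any(dominates(other, s) for other in scores)]


# Hypothetical (criterion_1, criterion_2) scores for five candidate partitions.
scores = [(0.2, 0.9), (0.4, 0.4), (0.9, 0.1), (0.5, 0.5), (0.9, 0.9)]
front = pareto_front(scores)
print(front)
```

Each partition remaining in `front` represents a different compromise between the two criteria, which is why a multi-objective method returns a set of solutions instead of a single one.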
