
Deborah Ribeiro Carvalho

Pontifícia Universidade Católica do Paraná



Publication

Featured research published by Deborah Ribeiro Carvalho.

Information Sciences | 2004

A hybrid decision tree/genetic algorithm method for data mining

Deborah Ribeiro Carvalho; Alex Alves Freitas

This paper addresses the well-known classification task of data mining, where the objective is to predict the class to which an example belongs. Discovered knowledge is expressed in the form of high-level, easy-to-interpret classification rules. To discover classification rules, we propose a hybrid decision tree/genetic algorithm method. The central idea of this hybrid method involves the concept of small disjuncts in data mining, as follows. In essence, a set of classification rules can be regarded as a logical disjunction of rules, so that each rule can be regarded as a disjunct. A small disjunct is a rule covering a small number of examples. Due to their nature, small disjuncts are error-prone. However, although each small disjunct covers just a few examples, the set of all small disjuncts can cover a large number of examples, so it is important to develop new approaches to cope with the problem of small disjuncts. In our hybrid approach, we have developed two genetic algorithms (GAs) specifically designed for discovering rules covering examples belonging to small disjuncts, whereas a conventional decision tree algorithm is used to produce rules covering examples belonging to large disjuncts. We present results evaluating the performance of the hybrid method on 22 real-world data sets.
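To make the small-disjunct idea concrete, the following minimal Python sketch splits a rule set by coverage and measures how many examples the small disjuncts cover in total. The `Rule` class, the cut-off of 10 examples, and the coverage counts are hypothetical values chosen for illustration; none of them are taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    """One classification rule (a disjunct) and how many examples it covers."""
    name: str
    coverage: int


def split_disjuncts(rules: List[Rule], threshold: int) -> Tuple[List[Rule], List[Rule]]:
    """Return (small, large) disjuncts.

    `threshold` is an assumed cut-off: a rule covering at most this many
    examples is treated as a small disjunct.
    """
    small = [r for r in rules if r.coverage <= threshold]
    large = [r for r in rules if r.coverage > threshold]
    return small, large


def small_disjunct_share(rules: List[Rule], threshold: int) -> float:
    """Fraction of all covered examples that fall under small disjuncts."""
    small, _ = split_disjuncts(rules, threshold)
    total = sum(r.coverage for r in rules)
    return sum(r.coverage for r in small) / total if total else 0.0


# Hypothetical rule set: two large disjuncts and six small ones.
rules = [
    Rule("r1", 120), Rule("r2", 80),
    Rule("r3", 9), Rule("r4", 7), Rule("r5", 6),
    Rule("r6", 8), Rule("r7", 10), Rule("r8", 10),
]

if __name__ == "__main__":
    print(f"share covered by small disjuncts: {small_disjunct_share(rules, 10):.0%}")
```

In this toy rule set no small disjunct covers more than 10 examples, yet together they cover 50 of the 250 examples (20%), which is the effect the abstract describes.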

European Conference on Machine Learning | 2005

Evaluating the correlation between objective rule interestingness measures and real human interest

Deborah Ribeiro Carvalho; Alex Alves Freitas; Nelson F. F. Ebecken

In the last few years, the data mining community has proposed a number of objective rule interestingness measures to select the most interesting rules out of a large set of discovered rules. However, it should be recalled that objective measures are just an estimate of the true degree of interestingness of a rule to the user, the so-called real human interest. The latter is inherently subjective. Hence, it is not clear how effective objective measures are in practice. More precisely, the central question investigated in this paper is: “how effective are objective rule interestingness measures, in the sense of being a good estimate of the true, subjective degree of interestingness of a rule to the user?” This question is investigated through extensive experiments with 11 objective rule interestingness measures across eight real-world data sets.
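One way to picture this kind of evaluation is to score each rule with an objective measure and correlate those scores with ratings given by a user. The sketch below is illustrative only: it assumes rule confidence as the objective measure and Pearson correlation as the agreement statistic, neither of which the abstract specifies, and the counts and user ratings are hypothetical.

```python
import math
from typing import Sequence


def confidence(n_antecedent_and_consequent: int, n_antecedent: int) -> float:
    """Rule confidence: fraction of examples matching the antecedent
    that also match the consequent."""
    return n_antecedent_and_consequent / n_antecedent


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical data: (examples matching antecedent and consequent,
# examples matching antecedent) for four rules.
rule_counts = [(45, 50), (30, 60), (12, 40), (70, 100)]
objective_scores = [confidence(both, ante) for both, ante in rule_counts]

# Hypothetical subjective ratings of the same four rules (1 = dull, 5 = interesting).
human_scores = [4, 2, 1, 5]

if __name__ == "__main__":
    r = pearson(objective_scores, human_scores)
    print(f"correlation between confidence and user ratings: {r:.2f}")
```

A coefficient near 1 would suggest that the objective measure tracks the user's ratings well; a value near 0 would suggest it is a poor proxy for real human interest.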

Applied Soft Computing | 2002

A genetic algorithm for discovering small disjunct rules in data mining

Deborah Ribeiro Carvalho; Alex Alves Freitas

This paper addresses the well-known classification task of data mining, where the goal is to discover rules predicting the class of examples (records of a given dataset). In the context of data mining, small disjuncts are rules covering a small number of examples. Hence, these rules are usually error-prone, which contributes to a decrease in predictive accuracy. At first glance, this is not a serious problem, since the impact on predictive accuracy should be small. However, although each small disjunct covers few examples, the set of all small disjuncts can cover a large number of examples. This paper presents evidence that this is the case in several datasets. This paper also addresses the problem of small disjuncts by using a hybrid decision-tree/genetic-algorithm approach. In essence, examples belonging to large disjuncts are classified by rules produced by a decision-tree algorithm (C4.5), while examples belonging to small disjuncts are classified by a genetic algorithm specifically designed for discovering small-disjunct rules. We present results comparing the predictive accuracy of this hybrid system with the predictive accuracy of three versions of C4.5 alone in eight public domain datasets. Overall, the results show that our hybrid system achieves better predictive accuracy than all three versions of C4.5 alone.
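The routing logic of such a hybrid classifier can be sketched in a few lines of Python. Everything named here is a hypothetical stand-in: `Leaf`, `find_leaf`, `small_disjunct_classifier`, and the coverage threshold are assumptions made for illustration, and the toy functions replace C4.5 and the authors' genetic algorithm rather than reproduce them.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Example = Dict[str, float]


@dataclass(frozen=True)
class Leaf:
    """A decision-tree leaf, i.e. one rule: its class and training coverage."""
    predicted_class: str
    coverage: int


def hybrid_predict(
    example: Example,
    find_leaf: Callable[[Example], Leaf],
    small_disjunct_classifier: Callable[[Example], str],
    threshold: int,
) -> str:
    """Classify with the tree rule if its leaf is a large disjunct;
    otherwise defer to the classifier built for small disjuncts."""
    leaf = find_leaf(example)
    if leaf.coverage > threshold:
        return leaf.predicted_class
    return small_disjunct_classifier(example)


# Toy stand-in for a trained decision tree (not C4.5).
def toy_find_leaf(example: Example) -> Leaf:
    if example["x"] < 5.0:
        return Leaf("A", coverage=200)  # large disjunct
    return Leaf("B", coverage=4)        # small disjunct


# Toy stand-in for rules evolved for small-disjunct examples.
def toy_small_rules(example: Example) -> str:
    return "C" if example["y"] > 1.0 else "B"


if __name__ == "__main__":
    for ex in ({"x": 1.0, "y": 0.0}, {"x": 9.0, "y": 2.0}):
        print(ex, "->", hybrid_predict(ex, toy_find_leaf, toy_small_rules, threshold=10))
```

Keeping the two classifiers behind plain callables means either side can be replaced without touching the routing decision, which depends only on leaf coverage.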

European Conference on Principles of Data Mining and Knowledge Discovery | 2000

A Genetic Algorithm-Based Solution for the Problem of Small Disjuncts

Deborah Ribeiro Carvalho; Alex Alves Freitas

In essence, small disjuncts are rules covering a small number of examples. Hence, these rules are usually error-prone, which contributes to a decrease in predictive accuracy. The problem is particularly serious because, although each small disjunct covers few examples, the set of small disjuncts can cover a large number of examples. This paper proposes a solution to the problem of discovering accurate small-disjunct rules based on genetic algorithms. The basic idea of our method is to use a hybrid decision tree/genetic algorithm approach for classification. More precisely, examples belonging to large disjuncts are classified by rules produced by a decision-tree algorithm, while examples belonging to small disjuncts are classified by a new genetic algorithm, specifically designed for discovering small-disjunct rules.

Cadernos de Saúde Pública | 2010

Mineração de dados e características da mortalidade infantil [Data mining and characteristics of infant mortality]

Rossana Cristina Xavier Ferreira Vianna; Claudia Maria Cabral Moro; Samuel Jorge Moysés; Deborah Ribeiro Carvalho; Julio Cesar Nievola

This study aims to identify patterns in maternal and fetal characteristics for the prediction of infant mortality by incorporating innovative techniques such as data mining, with proven relevance for public health. A database was developed with infant deaths from 2000 to 2004 analyzed by the Committees for the Prevention of Infant Mortality, based on integration of the Information System on Live Births (SINASC), the Mortality Information System, and the Investigation of Infant Mortality in the State of Paraná. The data mining software was WEKA (open source). Data mining searches the database and produces rules, which are then analyzed to transform the data into useful information. After mining, 4,230 rules were selected: teenage pregnancy plus birth weight < 2,500 g, or post-term birth plus a teenage mother with a previous child or intercurrent conditions, increase the risk of neonatal death. The results highlight the need for greater attention to teenage mothers, newborns with birth weight < 2,500 g, post-term neonates, and infants of mothers with intercurrent conditions, thus corroborating other studies.

Revista Gaúcha de Enfermagem | 2013

Sistema especialista para apoiar a decisão na terapia tópica de úlceras venosas [Expert system to support decision-making in the topical therapy of venous ulcers]

Danielle Sellmer; Carina Maris Gaspar Carvalho; Deborah Ribeiro Carvalho; Andreia Malucelli

Although the treatment of venous ulcers requires a body of specific knowledge, non-specialist nurses are unfamiliar with the appropriate therapies, which hinders the topical therapy of these skin lesions. This article presents an expert system to support nurses' decision-making in the topical therapy of venous ulcers. It is a development study carried out in five stages: system modeling, knowledge acquisition, knowledge representation through production rules, implementation, and evaluation of the system. The rule set is presented, along with cases that simulate the behavior of the expert system, showing the feasibility of its use in nursing practice. The system can assist in decisions about the topical management of venous ulcers; however, the ulcer must be assessed correctly for the system to provide appropriate suggestions, allowing better organization and planning of care.

Fisioterapia em Movimento | 2012

Mineração de Dados aplicada à fisioterapia [Data mining applied to physiotherapy]

Deborah Ribeiro Carvalho; Auristela Duarte de Lima Moser; Verônica Andrade da Silva; Marcelo Rosano Dallagassa

INTRODUCTION: The growing amount of data stored in physiotherapy practice, and in health care in general, expands the possibility of obtaining information that supports the decisions of health professionals. However, the volume of data generated is often so large that it is difficult to use, requiring more sophisticated data manipulation procedures. OBJECTIVE: This article presents and discusses the potential use of the KDD (knowledge discovery in databases) process on a set of monitoring data for physical therapy patients, as well as its usefulness in therapeutic or prophylactic decision-making. METHODS: We selected a subset of the records available in a physical therapy clinic, to which three major groups of data mining tasks were applied: association, classification, and clustering. RESULTS: Knowledge was extracted from the data in a way that lets the reader follow the process step by step, broadening their understanding of the results. Knowledge was discovered in various formats, which showed possible relationships among the available variables. Both the discovered knowledge and the importance of the quality of the collected data were discussed. CONCLUSIONS: The classification, association rule, and clustering tasks allowed a better understanding of the characteristics of the patients seen by the clinic in question, thus expanding professionals' knowledge for identifying actions to be adopted.

Revista de Saúde Pública | 2010

Classificação de microáreas de risco com uso de mineração de dados [Classification of risk micro-areas using data mining]

Andreia Malucelli; Altair von Stein Junior; Laudelino Cordeiro Bastos; Deborah Ribeiro Carvalho; Marcia Regina Cubas; Emerson Cabrera Paraiso

OBJECTIVE: To identify, with the assistance of computational techniques, rules concerning the conditions of the physical environment for the classification of risk micro-areas. METHODS: Exploratory research carried out in Curitiba, Southern Brazil, in 2007. It was divided into three phases: the identification of attributes to classify a micro-area; the construction of a database; and the process of discovering knowledge in a database through the use of data mining. The set of attributes included the conditions of infrastructure, hydrography, soil, recreation area, community characteristics, and existence of vectors. The database was constructed with data obtained in interviews with community health workers, using a questionnaire with closed-ended questions developed from the essential attributes selected by specialists. RESULTS: A total of 49 attributes were identified, 41 of which were essential and eight irrelevant. Data mining yielded 68 rules, which were analyzed from the perspectives of performance and quality and divided into two sets: the inconsistent rules and the rules that confirm the knowledge of experts. The comparison between the sets showed that the rules that confirmed expert knowledge, despite having lower computational performance, were considered more interesting. CONCLUSIONS: Data mining provided a set of useful and understandable rules capable of characterizing micro-areas and classifying them by degree of risk, based on the characteristics of the physical environment. The use of the proposed rules allows a faster and less subjective classification of a micro-area, maintaining a standard across health teams and overcoming the influence of each team member's individual perception.

Archive | 2004

New Results for a Hybrid Decision Tree/Genetic Algorithm for Data Mining

Deborah Ribeiro Carvalho; Alex Alves Freitas

This paper proposes a hybrid decision tree/genetic algorithm for solving the problem of small disjuncts in the classification task of data mining. It reports computational results comparing the proposed algorithm with two versions of C4.5 (one of them also specifically designed for solving the problem of small disjuncts) in 22 data sets.

Revista de Saúde Pública | 2010