Renata M. C. R. de Souza
Federal University of Pernambuco
Publication
Featured research published by Renata M. C. R. de Souza.
Pattern Recognition Letters | 2004
Renata M. C. R. de Souza; Francisco de A. T. de Carvalho
The recording of interval data has become a common practice with the recent advances in database technologies. This paper introduces clustering methods for interval data based on the dynamic cluster algorithm. Two methods are considered: one with adaptive distances and the other without.
Pattern Recognition Letters | 2006
Francisco de A. T. de Carvalho; Renata M. C. R. de Souza; Marie Chavent; Yves Lechevallier
This paper presents a partitional dynamic clustering method for interval data based on adaptive Hausdorff distances. Dynamic clustering algorithms are iterative two-step relocation algorithms involving the construction of the clusters at each iteration and the identification of a suitable representation or prototype (means, axes, probability laws, groups of elements, etc.) for each cluster by locally optimizing an adequacy criterion that measures the fitting between the clusters and their corresponding representatives. In this paper, each pattern is represented by a vector of intervals. Adaptive Hausdorff distances are the measures used to compare two interval vectors. Adaptive distances at each iteration change for each cluster according to its intra-class structure. The advantage of these adaptive distances is that the clustering algorithm is able to recognize clusters of different shapes and sizes. To evaluate this method, experiments with real and synthetic interval data sets were performed. The evaluation is based on an external cluster validity index (corrected Rand index) in a framework of a Monte Carlo experiment with 100 replications. These experiments showed the usefulness of the proposed method.
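As an illustration of the distance mentioned in this abstract, the sketch below computes the Hausdorff distance between two closed intervals, which reduces to the larger of the two bound differences, and sums it over the components of an interval vector. The function names and the optional `weights` argument are hypothetical; the weights only stand in for the idea of a per-variable adaptive weighting and do not reproduce the update rule of the paper.

```python
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]  # (lower, upper), with lower <= upper


def hausdorff_interval(a: Interval, b: Interval) -> float:
    """Hausdorff distance between two closed intervals on the real line."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def hausdorff_vector(
    x: Sequence[Interval],
    y: Sequence[Interval],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of per-variable Hausdorff distances.

    `weights` is an illustrative assumption: one non-negative weight per
    variable. With weights=None every variable counts equally.
    """
    if len(x) != len(y):
        raise ValueError("interval vectors must have the same length")
    if weights is None:
        weights = [1.0] * len(x)
    if len(weights) != len(x):
        raise ValueError("one weight per variable is required")
    return sum(w * hausdorff_interval(a, b) for w, a, b in zip(weights, x, y))
```

For example, `hausdorff_interval((1, 3), (2, 6))` compares the lower bounds (difference 1) and the upper bounds (difference 3) and returns the larger value.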
Pattern Recognition Letters | 2010
Francisco de A. T. de Carvalho; Renata M. C. R. de Souza
Unsupervised pattern recognition methods for mixed feature-type symbolic data based on dynamic clustering methodology with adaptive distances are presented. These distances change at each iteration of the algorithm and can either be the same for all clusters or differ from one cluster to another. Moreover, the methods need a pre-processing step to homogenize the mixed feature-type symbolic data into histogram-valued symbolic data. The presented dynamic clustering algorithms then take as input a set of vectors of histogram-valued symbolic data and furnish a partition and a prototype for each cluster by optimizing an adequacy criterion based on suitable adaptive squared Euclidean distances. To show the usefulness of these methods, examples with synthetic symbolic data sets as well as applications with real symbolic data sets are considered. Moreover, various tools suitable for interpreting the partition and the clusters given by these algorithms are also presented.
Applied Soft Computing | 2013
Bruno Almeida Pimentel; Renata M. C. R. de Souza
Fuzzy c-means (FCM) is an important and popular unsupervised partitioning algorithm used in several application domains such as pattern recognition, machine learning and data mining. Although FCM has shown good performance in detecting clusters, the membership values computed for each individual with respect to each cluster cannot indicate how well the individuals are classified. In this paper, a new approach to handling the memberships, based on the inherent information in each feature, is presented. The algorithm produces a membership matrix for each individual; the membership values lie between zero and one and measure the similarity of this individual to the center of each cluster according to each feature. These values can change at each iteration of the algorithm, and they differ from one feature to another and from one cluster to another in order to increase the performance of the fuzzy c-means clustering algorithm. To obtain a fuzzy partition by class of the input data set, a way to compute the class membership values is also proposed in this work. Experiments with synthetic and real data sets show that the proposed approach produces good clustering quality.
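For context, the sketch below shows the membership update of standard fuzzy c-means, the baseline this abstract starts from. It is not the per-feature membership scheme proposed in the paper. The function name, the fuzzifier default `m=2.0`, and the handling of a point that coincides with a center are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def fcm_memberships(
    point: Sequence[float],
    centers: Sequence[Sequence[float]],
    m: float = 2.0,
) -> List[float]:
    """Standard FCM membership of one point in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from the point to center i and m > 1.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centers]
    # Assumption: a point lying exactly on one or more centers shares
    # its full membership equally among those centers.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
            for d_i in dists]
```

The memberships of a point always sum to one across clusters, which is why a single row of this matrix cannot say which features drive the assignment.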
Pattern Recognition Letters | 2010
Marco A. O. Domingues; Renata M. C. R. de Souza; Francisco José A. Cysneiros
This paper introduces a new linear regression method for interval-valued data. The method is based on the symmetrical linear regression methodology, so that the prediction of the lower and upper bounds of the interval value of the dependent variable is not damaged by the presence of interval-valued outliers. The method considers mid-points and ranges of the interval values assumed by the variables in the learning set. The boundaries of an interval are predicted by combining the predictions for the mid-point and the range of the interval values. The evaluation of the method is based on the average behavior of a pooled root mean-square error. Experiments with real and simulated symbolic interval data sets demonstrate the usefulness of this symbolic symmetrical linear regression method.
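The combination of mid-point and range predictions described in this abstract can be sketched as follows. This example swaps in ordinary least squares with a single predictor for both fits, so it does not implement the robust symmetrical regression of the paper. All function names are hypothetical, the range is assumed to be upper minus lower, and clamping the predicted range at zero is an added safeguard rather than something the abstract states.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (lower, upper)
Line = Tuple[float, float]      # (intercept, slope)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Line:
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def midpoint(iv: Interval) -> float:
    return (iv[0] + iv[1]) / 2.0


def width(iv: Interval) -> float:
    return iv[1] - iv[0]


def fit_interval_regression(
    X: List[Interval], Y: List[Interval]
) -> Tuple[Line, Line]:
    """Fit one line on mid-points and a separate line on ranges."""
    mid_model = fit_line([midpoint(x) for x in X], [midpoint(y) for y in Y])
    range_model = fit_line([width(x) for x in X], [width(y) for y in Y])
    return mid_model, range_model


def predict_interval(model: Tuple[Line, Line], x: Interval) -> Interval:
    """Combine the two predictions into lower and upper bounds."""
    (a0, a1), (b0, b1) = model
    mid = a0 + a1 * midpoint(x)
    rng = max(0.0, b0 + b1 * width(x))  # safeguard: keep lower <= upper
    return mid - rng / 2.0, mid + rng / 2.0
```

Fitting the two models separately keeps the location and the spread of the response interval independent, and the final step rebuilds the bounds as mid-point minus and plus half the range.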
Food and Chemical Toxicology | 2011
Luís Cláudio Nascimento da Silva; Carlos Alberto da Silva Júnior; Renata M. C. R. de Souza; Alexandre José Macedo; Márcia Vanusa da Silva; Maria Tereza dos Santos Correia
This study aimed to explore the antioxidant and DNA protection abilities of hydroalcoholic extracts from fruits of Anadenanthera colubrina (ACHE), Libidibia ferrea (LFHE) and Pityrocarpa moniliformis (PMHE). These extracts were tested by five antioxidant methods (phosphomolybdenum and reducing power assays; superoxide, hydrogen peroxide and nitric oxide scavenging) and for DNA protection capacity. Total phenolic content was measured by the Folin-Ciocalteu method. ACHE exhibited the highest phenolic content (578 mg/g GAE), followed by LFHE (460 mg/g GAE) and PMHE (448 mg/g GAE). In the phosphomolybdenum assay, ACHE showed 24.81% of activity in relation to ascorbic acid, whereas LFHE and PMHE had 21.08% and 18.05%, respectively. These plants showed a high ability to inhibit the reactive species tested, with IC50 values ranging from 10.66 to 14.37 μg/mL for superoxide radical; 26.05 to 45.43 μg/mL for hydrogen peroxide; 178.42 to 182.98 μg/mL for reducing power; and 199.2 to 283 μg/mL for nitric oxide. Furthermore, these extracts had the capacity to break the DNA damage induced by hydroxyl radicals. The antioxidant activity of these plants is related to their higher phenolic content, and the results show that they may be used as a source of bioactive compounds relevant to the maintenance of oxidative stability of the food matrix, cosmetics and/or pharmaceutical preparations.
Expert Systems With Applications | 2014
Marcus C. Araújo; Rita de Cássia Fernandes de Lima; Renata M. C. R. de Souza
Breast cancer is one of the leading causes of death in women. Recent studies involving the use of thermal imaging as a screening technique have generated growing interest, especially in cases where mammography is limited, as in young patients who have dense breast tissue. The aim of this work is to evaluate the feasibility of using interval data in the symbolic data analysis (SDA) framework to model breast abnormalities (malignant, benign and cyst) in order to detect breast cancer. SDA allows a more realistic description of the input units by taking into consideration their internal variation. In this direction, a three-stage feature extraction approach is proposed. In the first stage, four interval variables are obtained from the minimum and maximum temperature values of the morphological and thermal matrices. In the second stage, operators based on dissimilarities for intervals are considered and continuous features are obtained. In the last stage, these continuous features are transformed by Fisher's criterion, giving the input data to the classification process. This three-stage approach is applied to a Brazilian breast thermography database and is compared with a statistical feature extraction approach and a texture feature extraction approach widely used in thermal imaging studies. Different classifiers are considered to detect breast cancer, achieving a 16% misclassification rate, 85.7% sensitivity and 86.5% specificity for the malignant class.
Engineering Applications of Artificial Intelligence | 2013
Roberta A. de A. Fagundes; Renata M. C. R. de Souza; Francisco José A. Cysneiros
This paper presents a robust regression model that deals with cases that have interval-valued outliers in the input data set. Each interval of the input data is represented by its range and midpoint, and the fit to interval-valued data is not sensitive to the presence of midpoint and/or range outliers in the interval response. The lower and upper bounds of new intervals are predicted, and simulation studies are carried out to validate these predictions. Two applications with real-life interval data sets are considered. The prediction quality is assessed by a mean magnitude of relative error calculated from a test data set.
Pattern Analysis and Applications | 2011
Renata M. C. R. de Souza; Diego C. F. Queiroz; Francisco José A. Cysneiros
This paper introduces different pattern classifiers for interval data based on the logistic regression methodology. Four approaches are considered, which differ in the way the intervals are represented. The first classifier represents each interval by its center and performs a classic logistic regression on the centers of the intervals. The second one treats each interval as a pair of quantitative variables and performs a joint classic logistic regression on these variables. The third one represents each interval by its vertices and applies a classic logistic regression to the vertices of the intervals. The last one treats each interval as a pair of quantitative variables, performs two separate classic logistic regressions on these variables and combines the results in some appropriate way. Experiments with synthetic data sets and an application with a real interval data set demonstrate the usefulness of these classifiers.
Archive | 1998
Francisco de A. T. de Carvalho; Renata M. C. R. de Souza
In this paper we synthesize the Ichino and Yaguchi (1994) and Moore (1991) metrics to obtain a new logical proximity function between Boolean symbolic objects. Then, we use histograms defined on these objects to obtain a statistical one. For both the logical and the statistical proximity functions, we study their properties and present examples to illustrate the usefulness of our approach.