Ana Carolina Lorena | Researchain

Publication

Featured research published by Ana Carolina Lorena.

Journal of the Brazilian Computer Society | 2013

A systematic review on keystroke dynamics

Paulo Henrique Pisani; Ana Carolina Lorena

Computing and communication systems have improved our way of life, but have also contributed to increased data exposure and, consequently, to identity theft. A possible way to overcome this issue is the use of biometric technologies for user authentication. Among the possible technologies to be analysed, this work focuses on keystroke dynamics, which attempts to recognize users by their typing rhythm. In order to guide future research in this area, a systematic review on keystroke dynamics was conducted and is presented here. The systematic review method adopts a rigorous procedure with the definition of a formal review protocol. Systematic reviews are not commonly used in artificial intelligence, and this work contributes to their use in the area. This paper discusses the process involved in the review along with the results obtained in order to identify the state of the art of keystroke dynamics. We summarize the main classifiers, performance measures, extracted features and benchmark datasets used in the area.

Neurocomputing | 2015

Effect of label noise in the complexity of classification problems

Luís Paulo F. Garcia; André Carlos Ponce de Leon Ferreira de Carvalho; Ana Carolina Lorena

Noisy data are common in real-world problems and may have several causes, like inaccuracies, distortions or contamination during data collection, storage and/or transmission. The presence of noise in data can affect the complexity of classification problems, making the discrimination of objects from different classes more difficult, and requiring more complex decision boundaries for data separation. In this paper, we investigate how noise affects the complexity of classification problems, by monitoring the sensitivity of several indices of data complexity in the presence of different label noise levels. To characterize the complexity of a classification dataset, we use geometric, statistical and structural measures extracted from data. The experimental results show that some measures are more sensitive than others to the addition of noise in a dataset. These measures can be used in the development of new preprocessing techniques for noise identification and novel label noise tolerant algorithms. We thereby show preliminary results on a new filter for noise identification, which is based on two of the complexity measures which were more sensitive to the presence of label noise.
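
The abstract above studies datasets under different label noise levels but does not say how the noise is added. As a rough illustration only, the sketch below assumes uniform random label flipping; the function name `inject_label_noise`, the seed and the noise rates are hypothetical and not taken from the paper.

```python
import random

def inject_label_noise(labels, noise_rate, seed=0):
    """Return a noisy copy of `labels` plus the indices that were flipped.

    Assumption: noise is uniform, i.e. each selected example receives a
    different class drawn at random from the remaining classes.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("need at least two classes to flip labels")
    noisy = list(labels)
    n_flip = round(noise_rate * len(labels))
    flipped = rng.sample(range(len(labels)), n_flip)
    for i in flipped:
        alternatives = [c for c in classes if c != labels[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy, sorted(flipped)

# Illustrative two-class label vector and noise levels.
labels = ["a"] * 10 + ["b"] * 10
for rate in (0.05, 0.10, 0.20):
    noisy, flipped = inject_label_noise(labels, rate)
    print(rate, len(flipped))
```

A complexity measure could then be computed on each noisy copy and compared against its value on the clean labels to see how sensitive it is.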

Knowledge-Based Systems | 2015

Using the One-vs-One decomposition to improve the performance of class noise filters via an aggregation strategy in multi-class classification problems

Luís Paulo F. Garcia; José A. Sáez; Julián Luengo; Ana Carolina Lorena; André Carlos Ponce de Leon Ferreira de Carvalho; Francisco Herrera

Noise filters are preprocessing techniques designed to improve data quality in classification tasks by detecting and eliminating examples that contain errors or noise. However, filtering can also remove correct examples and examples containing valuable information, which could be useful for learning. This fact usually implies a margin of improvement on the noise detection accuracy for almost any noise filter. This paper proposes a scheme to improve the performance of noise filters in multi-class classification problems, based on decomposing the dataset into multiple binary subproblems. Decomposition strategies have proven to be successful in improving classification performance in multi-class problems by generating simpler binary subproblems. Similarly, we adapt the principles of the One-vs-One decomposition strategy to noise filtering, making the noise identification process simpler. In order to integrate the filtering results achieved in the binary subproblems, our proposal uses a soft voting approach considering a reliability level based on the aggregation of the noise degree prediction calculated for each binary classifier. The experimental results show that the One-vs-One decomposition strategy usually increases the performance of the noise filters studied, which can then detect the noisy examples more accurately.
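
The general idea of running a filter on each pair of classes and aggregating the per-pair noise degrees can be sketched as follows. This is not the authors' implementation: the base filter (`knn_noise_degree`, a nearest-neighbour disagreement score), the plain averaging used as the aggregation and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations

def knn_noise_degree(points, labels, k=3):
    """Stand-in base filter: fraction of the k nearest neighbours
    that carry a different label (0.0 = clean, 1.0 = suspicious)."""
    degrees = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(p, points[j])),
        )[:k]
        if not others:
            degrees.append(0.0)
            continue
        degrees.append(sum(labels[j] != labels[i] for j in others) / len(others))
    return degrees

def ovo_noise_filter(points, labels, threshold, k=3):
    """Run the base filter on every class pair and average the noise
    degrees each example receives (a simple form of soft voting)."""
    scores = [[] for _ in points]
    for c1, c2 in combinations(sorted(set(labels)), 2):
        idx = [i for i, y in enumerate(labels) if y in (c1, c2)]
        sub = knn_noise_degree([points[i] for i in idx],
                               [labels[i] for i in idx], k)
        for i, degree in zip(idx, sub):
            scores[i].append(degree)
    mean = [sum(s) / len(s) if s else 0.0 for s in scores]
    flagged = [i for i, m in enumerate(mean) if m > threshold]
    return flagged, mean

# Three well-separated clusters plus one example (index 12) that sits
# inside cluster A but is labelled B.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1),
          (0.5, 0.5)]
labels = ["A"] * 4 + ["B"] * 4 + ["C"] * 4 + ["B"]
flagged, mean = ovo_noise_filter(points, labels, threshold=0.4)
print(flagged)  # [12]
```

In this toy run the mislabelled example looks clean in the B-vs-C subproblem and noisy in the A-vs-B subproblem, so how the per-pair degrees are weighted matters; the reliability-based weighting described in the abstract is not reproduced here.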

Journal of Intelligent and Robotic Systems | 2015

Filter Feature Selection for One-Class Classification

Luiz Henrique Nogueira Lorena; André Carlos Ponce de Leon Ferreira de Carvalho; Ana Carolina Lorena

In one-class classification problems all training examples belong to a single class. The absence of counter-examples represents a challenge to traditional Machine Learning and pre-processing techniques. This is the case of various feature selection techniques for labeled data. The selection of the most relevant features from a dataset usually benefits the performance obtained by classification algorithms. Despite the relevance of this issue, few techniques have been proposed for feature selection in one-class classification problems. Moreover, most of the existing techniques are wrapper approaches, which have to rely on a specific classification algorithm for feature selection, or aggregation techniques. This paper proposes a new filter feature selection approach for one-class classification. First, five feature selection measures from different paradigms are employed or adapted to the one-class scenario. Next, the feature rankings produced by these measures are combined using different aggregation strategies. The proposed approach was able to reduce the size of the feature sets while maintaining or even improving the predictive performance obtained by the one-class classifier.
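
The abstract mentions combining feature rankings through aggregation strategies without naming them. As one possible illustration, the sketch below uses mean-position aggregation (a Borda-style scheme); the choice of strategy, the function name `aggregate_rankings` and the feature names are assumptions, not details from the paper.

```python
def aggregate_rankings(rankings):
    """Combine several rankings (best feature first) by mean position.

    Ties in mean position are broken alphabetically so the result is
    deterministic.
    """
    features = rankings[0]
    positions = {f: [] for f in features}
    for ranking in rankings:
        if set(ranking) != set(features):
            raise ValueError("all rankings must cover the same features")
        for pos, f in enumerate(ranking):
            positions[f].append(pos)
    mean_pos = {f: sum(p) / len(p) for f, p in positions.items()}
    return sorted(features, key=lambda f: (mean_pos[f], f))

# Hypothetical rankings produced by three different filter measures.
rankings = [
    ["f2", "f1", "f3", "f4"],
    ["f1", "f2", "f4", "f3"],
    ["f2", "f3", "f1", "f4"],
]
combined = aggregate_rankings(rankings)
print(combined)      # ['f2', 'f1', 'f3', 'f4']
print(combined[:2])  # keep the two best-ranked features
```

Because the rankings come from filter measures, the combined ranking does not depend on any particular classifier; only the number of features to keep has to be chosen afterwards.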

Clinical EEG and Neuroscience | 2014

Clinician’s Road Map to Wavelet EEG as an Alzheimer’s disease Biomarker

Paulo Afonso Medeiros Kanda; Lucas R. Trambaiolli; Ana Carolina Lorena; Francisco J. Fraga; Luis I. Basile; Ricardo Nitrini; Renato Anghinah

Alzheimer’s disease (AD) is considered the main cause of dementia in Western countries. Consequently, there is a need for an accurate, universal, specific and cost-effective biomarker for early AD diagnosis, to follow disease progression and therapy response. This article describes a new diagnostic approach to quantitative electroencephalogram (QEEG) diagnosis of mild and moderate AD. The data set used in this study was composed of EEG signals recorded from 2 groups: (S1) 74 normal subjects, 33 females and 41 males (mean age 67 years, standard deviation = 8) and (S2) 88 probable AD patients (NINCDS-ADRDA criteria), 55 females and 33 males (mean age 74.7 years, standard deviation = 7.8) with mild to moderate symptoms (DSM-IV-TR). Attention is given to sample size and the use of state-of-the-art open source tools (LetsWave and WEKA) to process the EEG data. This innovative technique consists of combining a Morlet wavelet filter with a support vector machine technique. A total of 111 EEG features (attributes) were obtained for 162 probands. The results were an accuracy of 92.72% and an area under the curve of 0.92 (percentage split test). Most importantly, comparing a single patient versus the total data set resulted in an accuracy of 84.56% (leave-one-patient-out test). Particular emphasis was on clinical diagnosis and feasibility of implementation of this low-cost procedure, because programming knowledge is not required. Consequently, this new method can be useful to support AD diagnosis in resource-limited settings.

Neurocomputing | 2016

Noise detection in the meta-learning level

Luís Paulo F. Garcia; André Carlos Ponce de Leon Ferreira de Carvalho; Ana Carolina Lorena

The presence of noise in real data sets can harm the predictive performance of machine learning algorithms. There are several noise filtering techniques whose goal is to improve the quality of the data in classification tasks. These techniques usually scan the data for noise identification in a preprocessing step. Nonetheless, this is a non-trivial task and some noisy data can remain unidentified, while safe data can also be removed. The bias of each filtering technique influences its performance on a particular data set. Therefore, there is no single technique that can be considered the best for all domains or data distributions, and choosing a particular filter is not straightforward. Meta-learning has been widely used in recent years to support the recommendation of the most suitable machine learning algorithm(s) for a new data set. This paper presents a meta-learning recommendation system able to predict the expected performance of noise filters in noisy data identification tasks. To this end, a meta-base is created, containing meta-features extracted from several corrupted data sets along with the performance of some noise filters when applied to these data sets. Next, regression models are induced from this meta-base to predict the expected performance of the investigated filters in the identification of noisy data. The experimental results show that meta-learning can provide a good recommendation of the most promising filters to be applied to new classification data sets.
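
The meta-base and regression steps described above can be pictured with a very small sketch. The abstract does not state which regression models or meta-features were used, so the k-nearest-neighbour regressor, the two-dimensional meta-feature vectors, the filter names and all performance values below are hypothetical placeholders.

```python
def predict_performance(meta_base, new_meta_features, k=2):
    """k-NN regression: average the recorded performance of the k
    datasets in the meta-base whose meta-features are closest."""
    nearest = sorted(
        meta_base,
        key=lambda row: sum((a - b) ** 2
                            for a, b in zip(row[0], new_meta_features)),
    )[:k]
    return sum(perf for _, perf in nearest) / len(nearest)

def recommend_filter(meta_bases, new_meta_features, k=2):
    """Predict the performance of every filter and return the best one."""
    predictions = {
        name: predict_performance(rows, new_meta_features, k)
        for name, rows in meta_bases.items()
    }
    return max(predictions, key=predictions.get), predictions

# One meta-base per filter: (meta-features of a dataset, observed
# noise-detection performance of that filter on it). Values are made up.
meta_bases = {
    "filter_a": [((0.1, 0.2), 0.90), ((0.2, 0.1), 0.80), ((0.9, 0.8), 0.40)],
    "filter_b": [((0.1, 0.2), 0.60), ((0.2, 0.1), 0.70), ((0.9, 0.8), 0.85)],
}
best, predictions = recommend_filter(meta_bases, (0.15, 0.15))
print(best)  # filter_a
```

Swapping in a different regressor only changes `predict_performance`; the recommendation step stays a comparison of predicted performances.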

Data Mining and Knowledge Discovery | 2016

Ensembles of label noise filters: a ranking approach

Luís Paulo F. Garcia; Ana Carolina Lorena; Stan Matwin; André Carlos Ponce de Leon Ferreira de Carvalho

Label noise can be a major problem in classification tasks, since most machine learning algorithms rely on data labels in their inductive process. Consequently, various techniques for label noise identification have been investigated in the literature. The bias of each technique defines how suitable it is for each dataset. Besides, while some techniques identify a large number of examples as noisy and have a high false positive rate, others are very restrictive and therefore not able to identify all noisy examples. This paper investigates how label noise detection can be improved by using an ensemble of noise filtering techniques. These filters, individual and ensembles, are experimentally compared. Another concern in this paper is the computational cost of ensembles, since, for a particular dataset, an individual technique can have the same predictive performance as an ensemble. In this case the individual technique should be preferred. To deal with this situation, this study also proposes the use of meta-learning to recommend, for a new dataset, the best filter. An extensive experimental evaluation of the use of individual filters, ensemble filters and meta-learning was performed using public datasets with imputed label noise. The results show that ensembles of noise filters can improve noise filtering performance and that a recommendation system based on meta-learning can successfully recommend the best filtering technique for new datasets. A case study using a real dataset from the ecological niche modeling domain is also presented and evaluated, with the results validated by an expert.
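
To make the trade-off between permissive and restrictive filters concrete, the sketch below combines the outputs of several filters by vote counting. Majority and consensus voting are common ensemble schemes for noise filters, but whether they match the ensembles evaluated in the paper is an assumption; the flagged index sets are invented.

```python
def ensemble_votes(filter_flags):
    """Count how many filters flagged each example index as noisy."""
    votes = {}
    for flagged in filter_flags:
        for i in flagged:
            votes[i] = votes.get(i, 0) + 1
    return votes

def select_noisy(votes, n_filters, scheme="majority"):
    """Keep examples flagged by more than half of the filters
    ("majority") or by all of them ("consensus")."""
    needed = n_filters if scheme == "consensus" else n_filters // 2 + 1
    return sorted(i for i, v in votes.items() if v >= needed)

# Hypothetical outputs of three individual filters.
flags = [{1, 4, 7}, {4, 7, 9}, {4, 8}]
votes = ensemble_votes(flags)

# Rank examples from most to least suspicious by number of votes.
ranked = sorted(votes, key=lambda i: (-votes[i], i))
print(ranked)                                       # [4, 7, 1, 8, 9]
print(select_noisy(votes, len(flags)))              # [4, 7]
print(select_noisy(votes, len(flags), "consensus")) # [4]
```

Consensus voting removes fewer examples and so lowers the false positive rate, at the price of missing noisy examples that only some filters detect.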

International Symposium on Neural Networks | 2015

Adaptive approaches for keystroke dynamics

Paulo Henrique Pisani; Ana Carolina Lorena; André Carlos Ponce de Leon Ferreira de Carvalho

Enhanced authentication mechanisms are currently needed in several situations. Mainly due to the widespread use of the Internet, data exposure has become a source of growing concern. Commonly used login and password credentials may not provide enough security in this scenario, as they may be easily stolen or guessed in some cases. Biometrics, such as keystroke dynamics, are a prominent alternative for user authentication. This biometric technology allows the recognition of users by their typing rhythm, which can be performed using data provided by a common keyboard. However, recent work has shown that typing rhythm changes over time. As a result, a static biometric model can become outdated, decreasing the predictive performance of the system. In light of this fact, there is a need for new techniques able to dynamically adapt user models over time. This paper evaluates, in a data stream context, algorithms proposed in the literature for user authentication based on keystroke dynamics. Modifications to these algorithms are also proposed and evaluated. A study of the behaviour of the algorithms over time under several aspects is also performed. According to our experiments, adaptive methods can improve predictive performance of user recognition by keystroke dynamics.

Journal of Intelligent and Robotic Systems | 2015

Adaptive Positive Selection for Keystroke Dynamics

Paulo Henrique Pisani; Ana Carolina Lorena; André Carlos Ponce de Leon Ferreira de Carvalho

Current technologies provide state-of-the-art services but, at the same time, increase data exposure, mainly due to Internet-based applications. In view of this scenario, improved authentication mechanisms are needed. Keystroke dynamics, which recognizes users by their typing rhythm, is a cost-effective alternative. This technology usually only requires a common keyboard in order to acquire authentication data. There are several studies investigating the use of machine learning techniques for user authentication based on keystroke dynamics. However, the majority of them assume a scenario in which the user model is not updated. It is known that typing rhythm changes over time (concept drift). Consequently, classification algorithms in keystroke dynamics have to be able to adapt the user model to these changes. This paper evaluates adaptation methods for an immune positive selection algorithm in a data stream context. Experimental results showed that they improved classification performance, mainly for false rejection rates.
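
The effect of updating a user model under concept drift can be shown with a deliberately simple stand-in. The sketch below is not the immune positive selection algorithm evaluated in the paper: it assumes a mean-template model with a Manhattan distance threshold and a sliding window of accepted samples, and the class name, timing values (in milliseconds) and threshold are all hypothetical.

```python
from collections import deque

class SlidingWindowUserModel:
    """Toy keystroke model: accept a sample when it is close enough to
    the mean of the most recent genuine samples."""

    def __init__(self, enrollment, window_size=3, threshold=40.0, adapt=True):
        self.window = deque(enrollment, maxlen=window_size)
        self.threshold = threshold
        self.adapt = adapt

    def _template(self):
        n = len(self.window)
        return [sum(col) / n for col in zip(*self.window)]

    def verify(self, sample):
        distance = sum(abs(a - b) for a, b in zip(sample, self._template()))
        accepted = distance <= self.threshold
        if accepted and self.adapt:
            # Sliding window: the oldest sample is dropped automatically.
            self.window.append(sample)
        return accepted

# Enrollment samples and a stream from the same user whose typing
# slowly gets slower (drift of 8 ms per feature per sample).
enrollment = [[100, 120], [100, 120], [100, 120]]
stream = [[108, 128], [116, 136], [124, 144], [132, 152]]

adaptive = SlidingWindowUserModel(enrollment, adapt=True)
static = SlidingWindowUserModel(enrollment, adapt=False)
print([adaptive.verify(s) for s in stream])  # [True, True, True, True]
print([static.verify(s) for s in stream])    # [True, True, False, False]
```

The static model starts rejecting the genuine user once the drift exceeds its threshold, which is the false rejection behaviour the abstract refers to. Updating only on accepted samples also means an accepted impostor sample would contaminate the window, a risk any adaptive scheme has to weigh.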

Clinical Neurophysiology | 2017

Feature selection before EEG classification supports the diagnosis of Alzheimer’s disease

Lucas R. Trambaiolli; N. Spolaôr; Ana Carolina Lorena; Renato Anghinah; João Ricardo Sato

OBJECTIVE: In many decision support systems, some input features can be marginal or irrelevant to the diagnosis, while others can be redundant with each other. Thus, feature selection (FS) algorithms are often considered to find relevant/non-redundant features. This study aimed to evaluate the relevance of FS approaches applied to Alzheimer’s Disease (AD) EEG-based diagnosis and compare the selected features with previous clinical findings. METHODS: Eight different FS algorithms were applied to EEG spectral measures from 22 AD patients and 12 healthy age-matched controls. The FS contribution was evaluated by considering the leave-one-subject-out accuracy of Support Vector Machine classifiers built on the datasets described by the selected features. RESULTS: The Filtered Subset Evaluator technique achieved the best performance improvement both on a per-patient basis (91.18% accuracy) and on a per-epoch basis (85.29±21.62%), after removing 88.76±1.12% of the original features. All algorithms found that alpha and beta bands are relevant features, which is in agreement with previous findings from the literature. CONCLUSION: Biologically plausible EEG datasets could achieve improved accuracies with pre-processing FS steps. SIGNIFICANCE: The results suggest that the FS and classification techniques are an attractive complementary tool in order to reveal potential biomarkers aiding the AD clinical diagnosis.
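
Leave-one-subject-out evaluation keeps all epochs of a subject together, so no subject contributes to both training and testing. The sketch below shows the splitting step and one way of turning epoch-level predictions into a per-patient decision; the majority vote and the helper names are assumptions, since the abstract does not say how per-patient results were derived.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held-out subject, train indices, test indices), one split
    per subject, so that epochs of a subject never straddle the split."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

def per_patient_prediction(epoch_predictions):
    """Majority vote over the epoch-level predictions of one subject
    (ties are resolved alphabetically)."""
    candidates = sorted(set(epoch_predictions))
    return max(candidates, key=epoch_predictions.count)

# Each EEG epoch is tagged with the (hypothetical) subject it came from.
subject_ids = ["s1", "s1", "s2", "s2", "s2", "s3"]
for held_out, train, test in leave_one_subject_out(subject_ids):
    print(held_out, train, test)

print(per_patient_prediction(["AD", "AD", "control"]))  # AD
```

Per-epoch accuracy is computed over the individual test indices, while per-patient accuracy compares one aggregated prediction per held-out subject with that subject's diagnosis.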