
Publication


Featured research published by Raquel Sebastião.


Knowledge Discovery and Data Mining | 2009

Issues in evaluation of stream learning algorithms

João Gama; Raquel Sebastião; Pedro Pereira Rodrigues

Learning from data streams is a research area of increasing importance. Nowadays, several stream learning algorithms have been developed. Most of them learn decision models that continuously evolve over time, run in resource-aware environments, and detect and react to changes in the environment generating data. One important issue, not yet conveniently addressed, is the design of experimental work to evaluate and compare decision models that evolve over time. There are no golden standards for assessing performance in non-stationary environments. This paper proposes a general framework for assessing predictive stream learning algorithms. We defend the use of Predictive Sequential methods for error estimation, the prequential error. The prequential error allows us to monitor the evolution of the performance of models that evolve over time. Nevertheless, it is known to be a pessimistic estimator in comparison to holdout estimates. To obtain more reliable estimators we need some forgetting mechanism. Two viable alternatives are sliding windows and fading factors. We observe that the prequential error converges to a holdout estimator when estimated over a sliding window or using fading factors. We present illustrative examples of the use of prequential error estimators, using fading factors, for the tasks of: i) assessing the performance of a learning algorithm; ii) comparing learning algorithms; iii) hypothesis testing using the McNemar test; and iv) change detection using the Page-Hinkley test. In these tasks, the prequential error estimated using fading factors provides reliable estimates. In comparison to sliding windows, fading factors are faster and memoryless, a requirement for streaming applications. This paper is a contribution to the discussion of good practices for performance assessment when learning dynamic models that evolve over time.
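The fading-factor prequential estimator the abstract defends can be sketched as a faded sum of losses divided by a faded count of examples. This is a minimal illustration under the test-then-train protocol; `alpha` (the fading factor) is an assumed parameter name, not the authors' reference implementation.

```python
# Prequential (predictive sequential) error with a fading factor alpha.
# Each loss is computed BEFORE the model trains on that example.

def prequential_fading(losses, alpha=0.999):
    """Return the sequence of fading-factor prequential error estimates."""
    s, n = 0.0, 0.0
    estimates = []
    for loss in losses:
        s = loss + alpha * s   # faded sum of losses
        n = 1.0 + alpha * n    # faded count of examples
        estimates.append(s / n)
    return estimates

# With alpha = 1 this reduces to the plain prequential (running mean) error.
errs = prequential_fading([1, 0, 0, 1, 0], alpha=1.0)
```

With `alpha < 1`, old losses lose weight geometrically, so the estimate tracks the current performance of an evolving model instead of its whole history.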


Machine Learning | 2013

On evaluating stream learning algorithms

João Gama; Raquel Sebastião; Pedro Pereira Rodrigues

Most streaming decision models evolve continuously over time, run in resource-aware environments, and detect and react to changes in the environment generating data. One important issue, not yet convincingly addressed, is the design of experimental work to evaluate and compare decision models that evolve over time. This paper proposes a general framework for assessing predictive stream learning algorithms. We defend the use of the prequential error with forgetting mechanisms to provide reliable error estimates. We prove that, on stationary data and for consistent learning algorithms, the holdout estimator, the prequential error, and the prequential error estimated over a sliding window or using fading factors all converge to the Bayes error. The use of the prequential error with forgetting mechanisms proves advantageous for assessing performance and for comparing stream learning algorithms. It is also worthwhile to use the proposed methods for hypothesis testing and for change detection. In a set of experiments in drift scenarios, we evaluate the ability of a standard change detection algorithm to detect change using three prequential error estimators. These experiments point out that the use of forgetting mechanisms (sliding windows or fading factors) is required for fast and efficient change detection. In comparison to sliding windows, fading factors are faster and memoryless, both important requirements for streaming applications. Overall, this paper is a contribution to the discussion of best practices for performance assessment when learning is a continuous process and the decision models are dynamic and evolve over time.
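The "standard change detection algorithm" used in these drift experiments can be illustrated with a Page-Hinkley test run over a stream of error values. The sketch below is a textbook formulation, not the paper's exact configuration; `delta` (tolerance) and `lam` (threshold) are assumed parameter names.

```python
# Page-Hinkley test for detecting an increase in the mean of a monitored
# signal (e.g. a prequential error stream).

class PageHinkley:
    def __init__(self, delta=0.005, lam=50.0):
        self.delta = delta      # tolerance for normal fluctuations
        self.lam = lam          # detection threshold
        self.mean = 0.0         # running mean of the signal
        self.n = 0
        self.cum = 0.0          # cumulative deviation m_t
        self.min_cum = 0.0      # minimum of m_t seen so far

    def update(self, x):
        """Feed one value; return True when an increase in the mean is detected."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.lam

# Error stays near 0.1, then jumps to 0.9: the detector fires shortly after.
ph = PageHinkley(delta=0.005, lam=5.0)
stream = [0.1] * 200 + [0.9] * 200
alarms = [i for i, x in enumerate(stream) if ph.update(x)]
```

The detection delay depends on how quickly the monitored error estimate reacts, which is why forgetting mechanisms in the estimator matter for fast detection.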


Portuguese Conference on Artificial Intelligence | 2007

Change detection in learning histograms from data streams

Raquel Sebastião; João Gama

In this paper we study the problem of constructing histograms from high-speed, time-changing data streams. Learning in this context requires the ability to process examples once, at the rate they arrive, maintaining a histogram consistent with the most recent data and forgetting outdated data whenever a change in the distribution is detected. To construct histograms from high-speed data streams we use the two-layer structure of the Partition Incremental Discretization (PiD) algorithm. Our contribution is a new method to detect whenever a change in the distribution generating examples occurs. The basic idea is to monitor the distributions of two different time windows: the reference time window, which reflects the distribution observed in the past, and the current time window, which reflects the distribution observed in the most recent data. We compare the two distributions and signal a change whenever their dissimilarity exceeds a threshold value, using three different measures: the Entropy Absolute Difference, the Kullback-Leibler divergence, and the Cosine Distance. The experimental results suggest that the Kullback-Leibler divergence exhibits a high probability of change detection and faster detection rates, with few false alarms.
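The three dissimilarity measures named above can be sketched for two discrete distributions over the same bins. This is a minimal illustration assuming strictly positive probabilities in the second argument so the KL divergence stays finite; it is not the paper's exact implementation.

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def entropy_absolute_difference(p, q):
    return abs(entropy(p) - entropy(q))

def kl_divergence(p, q):
    """KL(p || q); asymmetric, so the argument order matters."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cosine_distance(p, q):
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm = math.sqrt(sum(pi * pi for pi in p)) * math.sqrt(sum(qi * qi for qi in q))
    return 1.0 - dot / norm

reference = [0.25, 0.25, 0.25, 0.25]  # distribution in the reference window
current = [0.70, 0.10, 0.10, 0.10]    # distribution in the current window

# A change is signalled when the chosen measure exceeds a threshold.
divergence = kl_divergence(current, reference)
```

All three measures are zero (or near zero) when the two windows agree and grow as the current window drifts away from the reference.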


Discovery Science | 2009

Regression Trees from Data Streams with Drift Detection

Elena Ikonomovska; João Gama; Raquel Sebastião; Dejan Gjorgjevik

The problem of extracting meaningful patterns from time-changing data streams is of increasing importance for the machine learning and data mining communities. We present an algorithm which is able to learn regression trees from fast and unbounded data streams in the presence of concept drifts. To the best of our knowledge, there is no other algorithm for incrementally learning regression trees that is equipped with change detection abilities. The FIRT-DD algorithm has mechanisms for drift detection and model adaptation, which enable it to maintain accurate and updated regression models at any time. The drift detection mechanism is based on sequential statistical tests that track the evolution of the local error at each node of the tree and inform the learning process of detected changes. As a response to a local drift, the algorithm is able to adapt the model only locally, avoiding the necessity of a global model adaptation. The adaptation strategy consists of building a new tree whenever a change is suspected in a region and replacing the old one when the new tree becomes more accurate. This enables smooth and granular adaptation of the global model. The results from the empirical evaluation, performed over several different types of drift, show that the algorithm consistently detects and properly adapts to concept drifts.
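The replace-when-better adaptation strategy described above can be sketched as a node that, after a suspected drift, tracks faded errors for both its current model and an alternate one. The class and method names here are illustrative, not FIRT-DD's actual API.

```python
# Sketch of local model adaptation: grow an alternate model after a
# suspected drift, replace the old one only once it is more accurate.

class AdaptiveNode:
    def __init__(self, model):
        self.model = model         # current model for this region
        self.alternate = None      # candidate grown after a suspected drift
        self.err_model = 0.0       # faded error of the current model
        self.err_alt = 0.0         # faded error of the alternate
        self.alpha = 0.995         # fading factor for error tracking

    def on_drift_suspected(self, fresh_model):
        self.alternate = fresh_model
        self.err_alt = self.err_model   # start the comparison from parity

    def observe(self, loss_model, loss_alt=None):
        """Update faded errors; swap in the alternate when it wins."""
        self.err_model = self.alpha * self.err_model + (1 - self.alpha) * loss_model
        if self.alternate is not None and loss_alt is not None:
            self.err_alt = self.alpha * self.err_alt + (1 - self.alpha) * loss_alt
            if self.err_alt < self.err_model:
                self.model = self.alternate   # replace only this region;
                self.alternate = None         # the rest of the tree is untouched

# The old model keeps erring after the drift, the alternate does not:
node = AdaptiveNode("old")
node.on_drift_suspected("new")
node.observe(loss_model=1.0, loss_alt=0.0)
```

Because the comparison is local to the node, a drift confined to one region never forces a rebuild of the whole tree.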


Knowledge Discovery and Data Mining | 2008

Monitoring incremental histogram distribution for change detection in data streams

Raquel Sebastião; João Gama; Pedro Pereira Rodrigues; João Bernardes

Histograms are a common technique for density estimation and have been widely used as a tool in exploratory data analysis. Learning histograms from static and stationary data is a well-known topic. Nevertheless, very few works discuss this problem when we have a continuous flow of data generated from dynamic environments. The scope of this paper is to detect changes in high-speed, time-changing data streams. To address this problem, we construct histograms able to process examples once, at the rate they arrive. The main goal of this work is to continuously maintain a histogram consistent with the current state of nature. We study strategies to detect changes in the distribution generating examples, and adapt the histogram to the most recent data by forgetting outdated data. We use the Partition Incremental Discretization algorithm, which was designed to learn histograms from high-speed data streams. We present a method to detect whenever a change in the distribution generating examples occurs. The basic idea is to monitor the distributions of two different time windows: the reference window, reflecting the distribution observed in the past, and the current window, which receives the most recent data. The current window is cumulative and can have a fixed or an adaptive step depending on the distance between the distributions. We compare the two distributions using the Kullback-Leibler divergence, defining a threshold for the change detection decision based on the asymmetry of this measure. We evaluated our algorithm on controlled artificial data sets and compared the proposed approach with nonparametric tests. We also present results on real-world data sets from industrial and medical domains. These results suggest that an adaptive window step yields a high probability of change detection and faster detection rates, with few false alarms.
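The windowing scheme described above, with a fixed reference window and a cumulative current window compared through the Kullback-Leibler divergence, can be sketched as follows. The bin edges, smoothing, and threshold are illustrative choices, not the paper's tuned values.

```python
import math

def histogram(values, edges):
    """Bin probabilities with add-one smoothing so KL stays finite."""
    counts = [1.0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

edges = [0.0, 0.25, 0.5, 0.75, 1.0]
reference = histogram([0.1, 0.2, 0.15, 0.3, 0.22, 0.18], edges)

current_data, threshold, change_at = [], 0.4, None
stream = [0.2, 0.1, 0.3, 0.8, 0.9, 0.85, 0.95, 0.9]  # shifts upward at index 3
for t, v in enumerate(stream):
    current_data.append(v)           # the current window is cumulative
    if kl(histogram(current_data, edges), reference) > threshold:
        change_at = t                # divergence from the reference exceeds
        break                        # the threshold: signal a change
```

After a detected change, the reference window would be rebuilt from the recent data and the current window restarted.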


ACM Symposium on Applied Computing | 2009

Evaluating algorithms that learn from data streams

João Gama; Pedro Pereira Rodrigues; Raquel Sebastião

Learning from data streams is a research area of increasing importance. Nowadays, several stream learning algorithms have been developed. Most of them learn decision models that continuously evolve over time, run in resource-aware environments, and detect and react to changes in the environment generating data. One important issue, not yet conveniently addressed, is the design of experimental work to evaluate and compare decision models that evolve over time. In this paper we propose a general framework for assessing the quality of streaming learning algorithms. We defend the use of Predictive Sequential error estimates over a sliding window to assess the performance of learning algorithms that learn from open-ended data streams in non-stationary environments. This paper studies convergence properties and methods for comparatively assessing algorithms' performance.
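The sliding-window prequential estimate defended here can be sketched with a bounded buffer of recent losses. This is a minimal illustration; the window size `w` is an assumed parameter.

```python
from collections import deque

class SlidingPrequential:
    """Prequential error averaged over the last w test-then-train losses."""

    def __init__(self, w=1000):
        self.window = deque(maxlen=w)  # oldest losses fall out automatically

    def update(self, loss):
        """Record one loss and return the current windowed error estimate."""
        self.window.append(loss)
        return sum(self.window) / len(self.window)

# Early mistakes stop mattering once they leave the window:
est = SlidingPrequential(w=3)
errors = [est.update(l) for l in [1, 1, 0, 0, 0]]
```

Unlike the fading-factor variant, the sliding window stores the last `w` losses explicitly, trading memory for a hard cutoff on the past.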


Mediterranean Conference on Control and Automation | 2009

Total Mass TCI driven by parametric estimation

Margarida Martins da Silva; Claudia Sousa; Raquel Sebastião; João Gama; Teresa Mendonça; Paula Rochak; Simao Esteves

This paper presents the Total Mass Target Controlled Infusion algorithm. The system comprises an On-Line tuned Algorithm for Recovery Detection (OLARD), applied after an initial bolus administration, and a Bayesian identification method for parametric estimation based on sparse measurements of the accessible signal. To design the drug dosage profile, two algorithms are proposed here. During the transient phase, an Input Variance Control (IVC) algorithm is used. It is based on the concept of TCI and aims to steer the drug effect to a predefined target value within an a priori fixed interval of time. After the steady-state phase is reached, the drug dose regimen is controlled by a Total Mass Control (TMC) algorithm. The mass control law for compartmental systems is robust even in the presence of parameter uncertainties. The feasibility of the whole system has been evaluated for the case of the Neuromuscular Blockade (NMB) level and tested both in simulation and in real cases.


Journal of Data Science | 2017

Fading histograms in detecting distribution and concept changes

Raquel Sebastião; João Gama; Teresa Mendonça

The remarkable number of real applications operating under dynamic scenarios is driving an unprecedented ability to generate and gather information. Nowadays, massive amounts of information are generated at a high-speed rate, as data streams. Moreover, data are collected under evolving environments. Due to memory restrictions, data must be promptly processed and then discarded. Therefore, dealing with evolving data streams raises two main questions: (i) how to remember discarded data? and (ii) how to forget outdated data? To maintain an updated representation of time-evolving data, this paper proposes fading histograms. To account for the dynamics of nature, changes in data are detected through a windowing scheme that compares data distributions computed by the fading histograms: the adaptive cumulative windows model (ACWM). The online monitoring of the distance between data distributions is evaluated using a dissimilarity measure based on the asymmetry of the Kullback-Leibler divergence. The experimental results support the ability of fading histograms to provide an updated representation of data. This property favors the detection of distribution changes with a shorter detection delay when compared with standard histograms. With respect to the detection of concept changes, the ACWM is compared with three known algorithms from the literature, using artificial data and public data sets, presenting better results. Furthermore, the proposed method was extended to multidimensional data, and the experiments performed show the ability of the ACWM to detect distribution changes in these settings.
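A fading histogram can be sketched by multiplying every bin count by a fading factor before each update, so old observations lose weight gradually instead of being dropped abruptly. This is a minimal illustration; the bin edges and `alpha` are assumed choices.

```python
class FadingHistogram:
    """Histogram whose bin counts decay by a factor alpha < 1 per update."""

    def __init__(self, edges, alpha=0.997):
        self.edges = edges
        self.alpha = alpha
        self.counts = [0.0] * (len(edges) - 1)

    def update(self, x):
        self.counts = [c * self.alpha for c in self.counts]  # forget a little
        for i in range(len(self.edges) - 1):
            if self.edges[i] <= x < self.edges[i + 1]:
                self.counts[i] += 1.0
                break

    def probabilities(self):
        total = sum(self.counts)
        return [c / total for c in self.counts] if total else list(self.counts)

# After a shift in the data, the faded mass follows the recent values:
h = FadingHistogram([0.0, 0.5, 1.0], alpha=0.9)
for x in [0.1] * 50 + [0.9] * 50:
    h.update(x)
```

Because the decay is continuous, the histogram always reflects mostly recent data, which is what shortens the detection delay relative to a standard histogram.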


Computer-Based Medical Systems | 2012

Contributions to a decision support system based on depth of anesthesia signals

Raquel Sebastião; Margarida Martins da Silva; João Gama; Teresa Mendonça

In clinical practice, concerns about the administration of hypnotics and analgesics for minimally invasive diagnostic and therapeutic procedures have increased enormously in recent years. The automatic detection of changes in the signals used to evaluate the depth of anesthesia is hence of foremost importance for deciding how to adapt the doses of hypnotics and analgesics that should be administered to patients. The aim of this work is to detect, online, drifts in these depth-of-anesthesia signals of patients undergoing general anesthesia. The performance of the proposed method is illustrated using BIS records previously collected from patients subject to abdominal surgery. The results show that the drifts detected by the proposed method are in accordance with the actions of the clinicians, in terms of the times when a change in the hypnotic or analgesic rates occurred. This detection was performed in the presence of noise and sensor faults. The presented algorithm was also validated online. The results encourage the inclusion of the proposed algorithm in a decision support system based on depth-of-anesthesia signals.


Computer Analysis of Images and Patterns | 2009

Decision Trees Using the Minimum Entropy-of-Error Principle

Joaquim Marques de Sá; João Gama; Raquel Sebastião; Luís A. Alexandre

Binary decision trees based on univariate splits have traditionally employed so-called impurity functions as a means of searching for the best node splits. Such functions use estimates of the class distributions. In the present paper we introduce a new concept into binary tree design: instead of working with the class distributions of the data, we work directly with the distribution of the errors originated by the node splits. Concretely, we search for the best splits using a minimum entropy-of-error (MEE) strategy. This strategy has recently been applied with success in other areas (e.g. regression, clustering, blind source separation, neural network training). We show that MEE trees are capable of producing good results with often simpler trees, have interesting generalization properties, and in the many experiments we performed they could be used without pruning.
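The MEE idea above can be sketched for a univariate split on binary labels: code the per-example errors (here e = y - ŷ with labels in {-1, +1}, so e ∈ {-2, 0, 2}) and choose the split whose error distribution has minimum Shannon entropy. Branch predictions are taken as majority classes; all names below are illustrative, not the paper's implementation.

```python
import math
from collections import Counter

def error_entropy(errors):
    """Shannon entropy (bits) of the discrete error distribution."""
    n = len(errors)
    return -sum((c / n) * math.log2(c / n) for c in Counter(errors).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def mee_best_split(xs, ys):
    """Return (threshold, error entropy) of the best split on feature xs."""
    best = None
    for t in sorted(set(xs))[1:]:               # candidate thresholds
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        pl, pr = majority(left), majority(right)
        errors = [y - pl for y in left] + [y - pr for y in right]
        h = error_entropy(errors)
        if best is None or h < best[1]:
            best = (t, h)
    return best

# A perfectly separable feature: the best split yields a pure (zero-entropy)
# error distribution, with all errors equal to 0.
xs = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
ys = [-1, -1, -1, 1, 1, 1]
t, h = mee_best_split(xs, ys)
```

Unlike an impurity function computed on the class distributions, the criterion here concentrates the error values themselves, which is what drives the strategy toward splits whose mistakes are rare or systematic.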
