Francisco J. Ferrer-Troyano

Publication

Featured research published by Francisco J. Ferrer-Troyano.

Journal of Universal Computer Science | 2005

Incremental Rule Learning and Border Examples Selection from Numerical Data Streams

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José Cristóbal Riquelme Santos

Mining data streams is a challenging task that requires online systems based on incremental learning approaches. This paper describes a classification system based on decision rules that may store up-to-date border examples to avoid unnecessary revisions when virtual drifts are present in data. Consistent rules classify new test examples by covering and inconsistent rules classify them by distance as the nearest neighbour algorithm. In addition, the system provides an implicit forgetting heuristic so that positive and negative examples are removed from a rule when they are not near one another.
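The classification step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the interval-based `Rule` representation, the `border` list, and the `classify` function are hypothetical names chosen for illustration, and the abstract does not specify how rules are built or updated.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class Rule:
    # Assumption: a rule is an axis-aligned box with one interval per attribute.
    lower: tuple
    upper: tuple
    label: str
    # Assumption: border examples are stored as (point, label) pairs.
    border: list = field(default_factory=list)

    def covers(self, x):
        return all(lo <= v <= hi for lo, v, hi in zip(self.lower, x, self.upper))

    @property
    def consistent(self):
        # A rule is consistent when no stored example contradicts its label.
        return all(lbl == self.label for _, lbl in self.border)


def classify(rules, x):
    """Return a label for x, or None when no rule covers it."""
    for rule in rules:
        if rule.covers(x):
            if rule.consistent:
                return rule.label  # classification by covering
            # Inconsistent rule: the nearest stored border example decides.
            _, lbl = min(rule.border, key=lambda e: dist(e[0], x))
            return lbl
    return None


rules = [
    Rule((0, 0), (1, 1), "a"),
    Rule((2, 2), (4, 4), "b", border=[((2.5, 2.5), "b"), ((3.5, 3.5), "c")]),
]
print(classify(rules, (0.5, 0.5)))
print(classify(rules, (3.4, 3.4)))
```

In this sketch an example that no rule covers returns `None`; how the actual system handles uncovered examples is not stated in the abstract.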

ACM Symposium on Applied Computing | 2004

Discovering decision rules from numerical data streams

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José C. Riquelme

This paper presents a scalable learning algorithm to classify numerical, low dimensionality, high-cardinality, time-changing data streams. Our approach, named SCALLOP, provides a set of decision rules on demand which improves its simplicity and helpfulness for the user. SCALLOP updates the knowledge model every time a new example is read, adding interesting rules and removing out-of-date rules. As the model is dynamic, it maintains the tendency of data. Experimental results with synthetic data streams show a good performance with respect to running time, accuracy and simplicity of the model.

ACM Symposium on Applied Computing | 2005

Incremental rule learning based on example nearness from numerical data streams

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José C. Riquelme

Mining data streams is a challenging task that requires online systems based on incremental learning approaches. This paper describes a classification system based on decision rules that may store up-to-date border examples to avoid unnecessary revisions when virtual drifts are present in data. Consistent rules classify new test examples by covering and inconsistent rules classify them by distance as the nearest neighbor algorithm. In addition, the system provides an implicit forgetting heuristic so that positive and negative examples are removed from a rule when they are not near one another.

ACM Symposium on Applied Computing | 2006

Data streams classification by incremental rule learning with parameterized generalization

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José C. Riquelme

Mining data streams is a challenging task that requires online systems based on incremental learning approaches. This paper describes a classification system based on decision rules that may store up-to-date border examples to avoid unnecessary revisions when virtual drifts are present in data. Consistent rules classify new test examples by covering and inconsistent rules classify them by distance as the nearest neighbor algorithm. In addition, the system provides an implicit forgetting heuristic so that positive and negative examples are removed from a rule when they are not near one another.

International Conference on Computational Science | 2003

Empirical evaluation of the difficulty of finding a good value of k for the nearest neighbor

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José C. Riquelme

As an analysis of the classification accuracy bound for the Nearest Neighbor technique, in this work we have studied whether it is possible to find a good value of the parameter k for each example according to its attribute values, or at least whether there is a pattern for the parameter k in the original search space. We have carried out different approaches based on the Nearest Neighbor technique and calculated the prediction accuracy for a group of databases from the UCI repository. Based on the experimental results of our study, we can state that, in general, it is not possible to know a priori a specific value of k to correctly classify an unseen example.

ACM Symposium on Applied Computing | 2004

Editorial message: special track on data streams

Jesús S. Aguilar-Ruiz; Francisco J. Ferrer-Troyano

Databases are growing incessantly and many sources produce data continuously. In many cases, we need to extract some sort of knowledge from this continuous stream of data. Examples include customer click streams, telephone records, large sets of web pages, multimedia data, scientific data, and sets of retail chain transactions. These sources are called data streams. The goal of this track is to convene researchers who deal with decision rules, decision trees, association rules, clustering, filtering, pre/post-processing, feature selection, visualization techniques, etc. from data streams and related themes.

Portuguese Conference on Artificial Intelligence | 2003

Mining low dimensionality data streams of continuous attributes

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José C. Riquelme

This paper presents an incremental and scalable learning algorithm in order to mine numeric, low dimensionality, high-cardinality, time-changing data streams. Within the Supervised Learning field, our approach, named SCALLOP, provides a set of decision rules whose size is very near to the number of concepts to be extracted. Experimental results with synthetic databases of different complexity degrees show a good performance from streams of data received at a rapid rate, whose label distribution may not be stationary in time.

ACM Symposium on Applied Computing | 2003

Prototype-based mining of numeric data streams

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José C. Riquelme

Large organizations collect open-ended and time-changing data received at a high speed. The possibility of extracting useful knowledge from these potentially infinite databases is a new challenge in Data Mining. In this paper we propose an anytime incremental learning algorithm for mining numeric data streams. Within Supervised Learning, our approach is based on prototypes and hypercubic decision rules, with the simplicity of the model provided and the time complexity as primary goals. Experimental results with synthetic databases of 100 gigabytes show a good performance from streams of data in continuous transformation.

Portuguese Conference on Artificial Intelligence | 2001

Non-parametric Nearest Neighbor with Local Adaptation

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José Cristóbal Riquelme Santos

The k-Nearest Neighbor algorithm (k-NN) uses a classification criterion that depends on the parameter k. Usually, the value of this parameter must be determined by the user. In this paper we present an algorithm based on the NN technique that does not take the value of k from the user. Our approach evaluates the values of k that classified the training examples correctly and takes the one that classified the most examples. As the user does not take part in the selection of the parameter k, the algorithm is non-parametric. With this heuristic, we propose an easy variation of the k-NN algorithm that is robust to noise present in data. Summarized in the last section, the experiments show that the error rate decreases in comparison with the k-NN technique when the best k for each database has been previously obtained.
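The idea of picking k from the training data rather than from the user can be sketched as follows. This is a simplified, global leave-one-out variant written for illustration only; the paper's local adaptation is not reproduced here, and `knn_predict`, `select_k`, and `max_k` are hypothetical names not taken from the source.

```python
from collections import Counter
from math import dist


def knn_predict(train, x, k):
    """Majority vote among the k training examples nearest to x."""
    neighbours = sorted(train, key=lambda e: dist(e[0], x))[:k]
    votes = Counter(lbl for _, lbl in neighbours)
    return votes.most_common(1)[0][0]


def select_k(train, max_k):
    """Return the k in 1..max_k that classifies the most training
    examples correctly under leave-one-out evaluation."""
    best_k, best_hits = 1, -1
    for k in range(1, max_k + 1):
        hits = 0
        for i, (x, y) in enumerate(train):
            rest = train[:i] + train[i + 1:]  # hold out example i
            if knn_predict(rest, x, k) == y:
                hits += 1
        if hits > best_hits:  # ties keep the smaller k
            best_k, best_hits = k, hits
    return best_k


train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
k = select_k(train, 3)
print(k, knn_predict(train, (0.2, 0.2), k))
```

Keeping the smaller k on ties is a design choice of this sketch, not something the abstract prescribes.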

Journal of Universal Computer Science | 2005

Connecting Segments for Visual Data Exploration and Interactive Mining of Decision Rules

Francisco J. Ferrer-Troyano; Jesús S. Aguilar-Ruiz; José Cristóbal Riquelme Santos

Visualization has become an essential support throughout the KDD process in order to extract hidden information from huge amounts of data. Visual data exploration techniques provide the user with graphic views or metaphors that represent potential patterns and data relationships. However, a single image does not always convey high-dimensional data properties successfully. From such data sets, visualization techniques have to deal with the curse of dimensionality in a critical way, as the number of examples may be very small with respect to the number of attributes. In this work, we describe a visual exploration technique that automatically extracts relevant attributes and displays their ranges of interest in order to support two data mining tasks: classification and feature selection. Through different metaphors with dynamic properties, the user can re-explore meaningful intervals belonging to the most relevant attributes, building decision rules and increasing the model accuracy interactively.
