Elizabeth León
National University of Colombia
Publications
Featured research published by Elizabeth León.
Congress on Evolutionary Computation | 2004
Elizabeth León; Olfa Nasraoui; Jonatan Gómez
We present a new approach to anomaly detection based on Unsupervised Niche Clustering (UNC). UNC is a genetic niching technique for clustering that can handle noise and is able to determine the number of clusters automatically. UNC uses the normal samples to generate a profile of the normal space (clusters). Each cluster can later be characterized by a fuzzy membership function with a Gaussian shape, defined by the evolved cluster center and radius. The set of memberships is aggregated using a max-or fuzzy operator to determine the normalcy level of a data sample. Experiments on synthetic and real data sets, including a network intrusion detection data set, are performed, and the results are analyzed and reported.
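The membership-and-aggregation step described above can be sketched in a few lines. The cluster centers, radii, and anomaly threshold below are illustrative stand-ins for the values UNC would actually evolve:

```python
import math

def membership(x, center, radius):
    # Gaussian-shaped fuzzy membership defined by a cluster
    # center and radius (illustrative values, not evolved here).
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * radius ** 2))

def normalcy(x, clusters):
    # Aggregate memberships with a max ("fuzzy OR") operator:
    # a sample is as normal as its best-matching normal cluster.
    return max(membership(x, c, r) for c, r in clusters)

# Two hypothetical normal-space clusters: (center, radius).
clusters = [((0.0, 0.0), 1.0), ((5.0, 5.0), 0.5)]

near = normalcy((0.1, -0.2), clusters)   # close to a cluster: high normalcy
far = normalcy((10.0, 10.0), clusters)   # far from every cluster: low normalcy
is_anomaly = far < 0.1                   # threshold is an assumption
```

A sample is flagged as anomalous when its aggregated membership falls below a chosen threshold; the threshold itself is a tuning parameter, not part of the clustering.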
Expert Systems with Applications | 2014
Susana Bonilla; Clara Noguera; Carlos Cobos; Elizabeth León
Due to the exponential growth of textual information available on the Web, end users need to be able to access information in summary form, without losing the most important content of the document when the summaries are generated. Automatic generation of extractive summaries from a single document has traditionally been approached as the task of extracting the most relevant sentences from the original document. The methods employed generally assign a score to each sentence in the document based on certain features, and the most relevant sentences are then selected according to their scores. These features include the position of the sentence in the document, its similarity to the title, the sentence length, and the frequency of the terms in the sentence. However, it has still not been possible to achieve a quality of summary that matches that produced by humans, and methods therefore continue to be proposed that aim to improve on existing results. This paper addresses the generation of extractive summaries from a single document as a binary optimization problem, where the quality (fitness) of a solution is based on a weighting of individual statistical features of each sentence, such as position, sentence length, and the relationship of the summary to the title, combined with group features of similarity between the candidate sentences of the summary and the original document, and among the candidate sentences themselves. The paper proposes a method of extractive single-document summarization based on genetic operators and guided local search, called MA-SingleDocSum. A memetic algorithm is used to integrate the population-based search of evolutionary algorithms with a guided local search strategy. The proposed method was compared with the state-of-the-art methods UnifiedRank, DE, FEOM, NetSum, CRF, QCS, SVM, and Manifold Ranking, using ROUGE measures on the DUC2001 and DUC2002 datasets. The results showed that MA-SingleDocSum outperforms the state-of-the-art methods.
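The binary-optimization view of extractive summarization can be illustrated with a toy fitness function. The weights, features, and bag-of-words representation below are simplifications of my own, not the paper's exact model:

```python
from collections import Counter
import math

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def fitness(selection, sentences, title, w=(0.3, 0.3, 0.4)):
    # `selection[i]` is 1 if sentence i enters the summary, else 0.
    # Combines individual features (position, title similarity) with a
    # group feature (summary-to-document similarity). Weights are
    # illustrative assumptions.
    chosen = [i for i, s in enumerate(selection) if s]
    if not chosen:
        return 0.0
    pos = sum(1.0 / (i + 1) for i in chosen) / len(chosen)   # earlier is better
    ttl = sum(cosine(sentences[i], title) for i in chosen) / len(chosen)
    doc = sum(sentences, Counter())
    summ = sum((sentences[i] for i in chosen), Counter())
    cov = cosine(summ, doc)                                  # coverage of the document
    return w[0] * pos + w[1] * ttl + w[2] * cov

sents = [Counter("genetic algorithm summary".split()),
         Counter("the weather today".split())]
title = Counter("genetic summary".split())
on_topic = fitness((1, 0), sents, title)   # keeps the title-related sentence
off_topic = fitness((0, 1), sents, title)  # keeps the unrelated sentence
```

A memetic algorithm would search over such binary selection vectors, with the local search refining candidates the genetic operators produce.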
Information Processing and Management | 2013
Carlos Cobos; Orlando Rodriguez; Jarvein Rivera; John Betancourt; Elizabeth León; Enrique Herrera-Viedma
To carry out effective teaching/learning processes, lecturers in a variety of educational institutions frequently need support. They therefore resort to advice from more experienced lecturers, to formal training processes such as specializations or master's or doctoral degrees, or to self-training. Formal training invariably involves high costs in time and money, while self-training and advice each carry their own specific risks (e.g. following new trends that have not been fully evaluated, or applying techniques that are inappropriate in specific contexts). This paper presents a system that allows lecturers to define their best teaching strategies for use in the context of a specific class. The context is defined by the specific characteristics of the subject being treated, the specific objectives expected to be achieved in the classroom session, the profile of the students on the course, the dominant characteristics of the teacher, and the classroom environment for each session, among others. The system presented is the Recommendation System of Pedagogical Patterns (RSPP). To construct the RSPP, an ontology representing the pedagogical patterns and their interaction with the fundamentals of the educational process was defined. A web information system was also built to record information on courses, students, lecturers, etc.; an option based on a unified hybrid model (for content-based and collaborative filtering) of recommendations for pedagogical patterns was then added to the system. RSPP features a minable view, a tabular structure that summarizes and organizes the information registered in the rest of the system and facilitates the task of recommendation. The data recorded in the minable view is taken to a latent space, where noise is reduced and the essence of the information contained in the structure is distilled.
This process makes use of Singular Value Decomposition (SVD), commonly used by information retrieval and recommendation systems. Satisfactory results both in the accuracy of the recommendations and in the use of the general application open the door for further research and expand the role of recommender systems in educational teacher support processes.
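The SVD step can be sketched as a truncated reconstruction of a ratings-style matrix. The matrix values and the rank `k` below are illustrative, not data from the RSPP system:

```python
import numpy as np

# A small user-item matrix standing in for the "minable view";
# rows are users, columns are items, values are hypothetical scores.
R = np.array([[5., 4., 0., 1.],
              [4., 5., 1., 0.],
              [1., 0., 5., 4.],
              [0., 1., 4., 5.]])

# Full SVD, then keep only the k strongest latent factors. Truncating
# the spectrum discards the weakest directions, which is the
# noise-reduction effect the abstract describes.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```

Recommendations are then read off `R_hat`, where cells that were empty in `R` receive scores inferred from the latent structure shared with similar users.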
Congress on Evolutionary Computation | 2010
Carlos Cobos; Jennifer Andrade; William Constain; Elizabeth León
This paper introduces a new description-centric algorithm for web document clustering based on a hybridization of Global-Best Harmony Search with the K-means algorithm, Frequent Term Sets, and the Bayesian Information Criterion. The new algorithm determines the number of clusters automatically. Global-Best Harmony Search provides a global strategy for searching the solution space, based on Harmony Search and the concept of swarm intelligence. The K-means algorithm is used to find the optimum value in a local search space. The Bayesian Information Criterion is used as the fitness function, while FP-Growth is used to reduce the high dimensionality of the vocabulary. The resulting algorithm, called IGBHSK, was tested with data sets based on Reuters-21578 and DMOZ, obtaining promising results (better precision than a Singular Value Decomposition algorithm). It was then also evaluated by a group of users.
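Using the Bayesian Information Criterion as a fitness function lets solutions with different numbers of clusters be compared on one scale. A minimal sketch, using one common BIC variant for clustering (the paper's exact formulation may differ); `rss` is the residual sum of squares of points around their cluster centers:

```python
import math

def bic(n_points, rss, n_params):
    # Schwarz-style BIC: fit term plus a complexity penalty that grows
    # with the number of free parameters (lower is better under this
    # sign convention). Formula variant is an assumption.
    return n_points * math.log(rss / n_points) + n_params * math.log(n_points)

tight = bic(100, 10.0, 4)    # tighter clustering, same complexity
loose = bic(100, 50.0, 4)    # worse fit, same complexity
complex_fit = bic(100, 10.0, 8)  # same fit, more clusters/parameters
```

Adding clusters always reduces `rss`, so the `n_params * log(n)` penalty is what stops the search from preferring ever-larger cluster counts.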
Archive | 2005
Olfa Nasraoui; Elizabeth León; Raghu Krishnapuram
As a valuable unsupervised learning tool, clustering is crucial to many applications in pattern recognition, machine learning, and data mining. Evolutionary techniques have been used successfully as global searchers in difficult problems, particularly in the optimization of non-differentiable functions; hence, they can improve clustering. However, existing evolutionary clustering techniques suffer from one or more of the following shortcomings: (i) they are not robust in the presence of noise, (ii) they assume a known number of clusters, and (iii) the size of the search space explodes exponentially with the number of clusters or with the number of data points. We present a robust clustering algorithm, called the Unsupervised Niche Clustering algorithm (UNC), that overcomes all of the above difficulties. UNC can successfully find dense areas (clusters) in feature space and determines the number of clusters automatically. The clustering problem is converted to a multimodal function optimization problem within the context of genetic niching. Robust cluster scale estimates are dynamically computed using a hybrid learning scheme coupled with the genetic optimization of the cluster centers, to adapt to clusters of different sizes and noise contamination rates. Genetic optimization enables our approach to handle data with both numeric and qualitative attributes, and general subjective, non-metric, even non-differentiable dissimilarity measures.
Congress on Evolutionary Computation | 2010
Carlos Cobos; Claudia Montealegre; María-Fernanda Mejía; Elizabeth León
This paper introduces a new description-centric algorithm for web document clustering based on memetic algorithms with niching methods, a term-document matrix, and the Bayesian Information Criterion. The algorithm determines the number of clusters automatically. The memetic algorithm provides a combined global and local strategy for searching the solution space, while the niching methods (based on restricted competition replacement and restrictive mating) promote diversity in the population and prevent it from converging too quickly. The memetic algorithm uses K-means to find the optimum value in a local search space. The Bayesian Information Criterion is used as the fitness function, while FP-Growth is used to reduce the high dimensionality of the vocabulary. The resulting algorithm, called WDC-NMA, was tested with data sets based on Reuters-21578 and DMOZ, obtaining promising results (better precision than a Singular Value Decomposition algorithm). It was then given an initial evaluation by a group of users.
Congress on Evolutionary Computation | 2011
Carlos Cobos; Elizabeth León
This paper introduces a new description-centric algorithm for web document clustering called HHWDC. The HHWDC algorithm has been designed from a hyper-heuristic approach and allows the best algorithm for web document clustering to be determined. For heuristic selection, HHWDC offers two options: random selection and roulette wheel selection based on the performance of the low-level heuristics (harmony search, an improved harmony search, a novel global harmony search, global-best harmony search, restrictive mating, roulette wheel selection, and particle swarm optimization). HHWDC uses the k-means algorithm as its local solution improvement strategy and, based on the Bayesian Information Criterion, is able to determine the number of clusters automatically. HHWDC uses two acceptance/replacement strategies: replace-the-worst and restricted competition replacement. HHWDC was tested with data sets based on Reuters-21578 and DMOZ, obtaining promising results (better precision than a Singular Value Decomposition algorithm).
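Performance-proportional selection of low-level heuristics is a standard hyper-heuristic building block. A minimal sketch of roulette wheel selection, with heuristic names and scores chosen only for illustration:

```python
import random

def roulette_select(heuristics, scores, rng=random.random):
    # Pick a low-level heuristic with probability proportional to its
    # observed performance score. `rng` is injectable for testing.
    total = sum(scores)
    r = rng() * total
    acc = 0.0
    for h, s in zip(heuristics, scores):
        acc += s
        if r <= acc:
            return h
    return heuristics[-1]  # guard against floating-point round-off

# Hypothetical performance scores for two low-level heuristics.
first = roulette_select(["harmony_search", "pso"], [1.0, 3.0], rng=lambda: 0.0)
last = roulette_select(["harmony_search", "pso"], [1.0, 3.0], rng=lambda: 0.9)
```

Updating the scores after each application is what lets the hyper-heuristic shift effort toward whichever low-level heuristic is currently paying off.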
IEEE International Conference on Evolutionary Computation | 2006
Elizabeth León; Olfa Nasraoui; Jonatan Gómez
We present an algorithm for Evolutionary Clustering with Self-Adaptive Genetic Operators (ECSAGO). This algorithm is based on the Unsupervised Niche Clustering (UNC) and Hybrid Adaptive Evolutionary (HAEA) algorithms. UNC is a genetic clustering algorithm that is robust to noise and able to determine the number of clusters automatically. HAEA is a parameter adaptation technique that automatically learns the rates of its genetic operators while the individuals are evolved in an Evolutionary Algorithm (EA). ECSAGO uses an EA with real encoding and real genetic operators, and adapts the genetic operator rates as it evolves the cluster prototypes. This has the advantage of reducing the number of parameters required by UNC (thus avoiding the problem of fixing the genetic operator parameter values) and of solving problems where a real representation is required or preferred for the solutions.
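The operator-rate adaptation idea can be sketched as a reward/punish update followed by renormalization. This is a simplified illustration of a HAEA-style step; the function name and the exact update rule are my assumptions, not the published algorithm:

```python
import random

def adapt_rates(rates, op_index, improved, delta=None, rng=random.random):
    # If applying operator `op_index` improved the individual, its rate
    # is rewarded; otherwise it is punished. `delta` is a random
    # learning rate, fixable for testing. Rates are renormalized so
    # they remain a probability distribution over operators.
    delta = rng() if delta is None else delta
    rates = list(rates)
    rates[op_index] *= (1.0 + delta) if improved else (1.0 - delta)
    total = sum(rates)
    return [r / total for r in rates]

# Two operators (e.g. crossover and mutation) starting at equal rates.
rewarded = adapt_rates([0.5, 0.5], 0, improved=True, delta=0.5)
punished = adapt_rates([0.5, 0.5], 1, improved=False, delta=0.5)
```

Because each individual carries and updates its own rates, no global operator probabilities need to be fixed in advance.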
International Conference on Machine Learning and Applications | 2004
Jonatan Gómez; Olfa Nasraoui; Elizabeth León
This paper introduces a generalization of the Gravitational Clustering Algorithm. First, it is extended so that the gravitational law is not the only law that can be applied; instead, any decreasing function of the distance between points can be used. An estimate of the maximum distance between the closest points is calculated in order to reduce the sensitivity of the clustering process to the size of the data set. Finally, a heuristic for setting the interaction strength (the gravitational constant) is introduced in order to reduce the number of parameters of the algorithm. Experiments with benchmark synthetic data sets are performed to show the applicability of the proposed approach.
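One iteration of the generalized gravitational step can be sketched as every point moving toward every other under a pluggable decreasing function of distance. The interaction strength `g` and the inverse-square default below are illustrative choices, not the paper's heuristic:

```python
import math

def gravity_step(points, g=0.1, decay=lambda d: 1.0 / (d * d + 1e-9)):
    # Move each point toward all others. `decay` can be ANY decreasing
    # function of distance; inverse-square recovers the original
    # gravitational law. Returns a new list of points.
    new = []
    for i, p in enumerate(points):
        dx = [0.0] * len(p)
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            w = g * decay(d)  # force magnitude from the chosen law
            for k in range(len(p)):
                dx[k] += w * (q[k] - p[k]) / max(d, 1e-9)  # unit direction
        new.append(tuple(a + b for a, b in zip(p, dx)))
    return new

pts = gravity_step([(0.0, 0.0), (1.0, 0.0)])  # the pair drifts together
```

Iterating this step makes mutually close points collapse into a common location, and points merged this way form a cluster.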
IEEE International Conference on Fuzzy Systems | 2006
Jonatan Gómez; Elizabeth León
This paper develops a notion of quasi-distance between fuzzy sets (rule detectors) that preserves a notion of fuzzy set (rule) dominance. This quasi-distance is used by an evolutionary algorithm during its deterministic crowding mechanism in order to preserve diversity among the evolved fuzzy rule detectors. Moreover, the evolutionary process uses a variable-length encoding that allows the fuzzy set associated with each attribute in the atomic condition of a fuzzy rule to be tuned by evolving the fuzzy set parameters. Experiments with real data sets are performed and the results are reported.