Ernestina Menasalvas | Researchain

Publication

Featured researches published by Ernestina Menasalvas.

Computer Aided Systems Theory | 2005

Web usage mining project for improving web-based learning sites

Marta E. Zorrilla; Ernestina Menasalvas; D. Marín; Elena Mora; Javier Segovia

Despite the great success of data mining applied to personalization in web environments, it has not yet been widely applied in e-learning domains. In this paper, we outline a web usage mining project that has been initiated at the University of Cantabria. The aim of this project is to develop tools that let us improve its Web-based learning environment in two main respects: first, that teachers obtain information allowing them to evaluate the learning process; second, that students feel supported in this task.

Scientific Reports | 2012

Optimizing Functional Network Representation of Multivariate Time Series

Massimiliano Zanin; Pedro Sousa; David Papo; Ricardo Bajo; Juan Garcia-Prieto; Francisco del Pozo; Ernestina Menasalvas; Stefano Boccaletti

By combining complex network theory and data mining techniques, we provide objective criteria for optimization of the functional network representation of generic multivariate time series. In particular, we propose a method for the principled selection of the threshold value for functional network reconstruction from raw data, and for proper identification of the network indicators that unveil the most discriminative information on the system for classification purposes. We illustrate our method by analysing networks of functional brain activity of healthy subjects, and patients suffering from Mild Cognitive Impairment, an intermediate stage between the expected cognitive decline of normal aging and the more pronounced decline of dementia. We discuss extensions of the scope of the proposed methodology to network engineering purposes, and to other data mining tasks.
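The thresholding step mentioned in this abstract can be illustrated with a minimal sketch. It assumes Pearson correlation as the association measure between time series, which the abstract does not specify, and the function names are hypothetical. It only shows how a threshold turns pairwise associations into a network; the paper's criterion for selecting the threshold is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def functional_network(series, threshold):
    """Build a binary adjacency matrix: link i-j when |correlation| >= threshold.

    Assumption: absolute Pearson correlation stands in for whatever
    association measure a real study would use.
    """
    n = len(series)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(series[i], series[j])) >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj


def link_density(adj):
    """Fraction of possible undirected links present (requires at least 2 nodes)."""
    n = len(adj)
    links = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return links / (n * (n - 1) / 2)
```

Scanning `threshold` over a range and computing indicators such as `link_density` for each value yields the candidate representations among which a selection criterion would then choose.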

Physics Reports | 2016

Combining complex networks and data mining: Why and how

Massimiliano Zanin; David Papo; Pedro A. C. Sousa; Ernestina Menasalvas; Andrea Nicchi; Elaine Kubik; Stefano Boccaletti

The increasing power of computer technology does not dispense with the need to extract meaningful information out of data sets of ever growing size, and indeed typically exacerbates the complexity of this task. To tackle this general problem, two methods have emerged, at chronologically different times, that are now commonly used in the scientific community: data mining and complex network theory. Not only do complex network analysis and data mining share the same general goal, that of extracting information from complex systems to ultimately create a new compact quantifiable representation, but they also often address similar problems. In the face of that, a surprisingly low number of researchers turn out to resort to both methodologies. One may then be tempted to conclude that these two fields are either largely redundant or totally antithetic. The starting point of this review is that this state of affairs should be put down to contingent rather than conceptual differences, and that these two fields can in fact advantageously be used in a synergistic manner. An overview of both fields is first provided, some fundamental concepts of which are illustrated. A variety of contexts in which complex network theory and data mining have been used in a synergistic manner are then presented. Contexts in which the appropriate integration of complex network metrics can lead to improved classification rates with respect to classical data mining algorithms and, conversely, contexts in which data mining can be used to tackle important issues in complex network theory applications are illustrated. Finally, ways to achieve a tighter integration between complex networks and data mining, and open lines of research are discussed.

Artificial Intelligence in Medicine | 2004

Bayesian network multi-classifiers for protein secondary structure prediction

Víctor Robles; Pedro Larrañaga; José M. Peña; Ernestina Menasalvas; María S. Pérez; Vanessa Herves; Anita Wasilewska

Successful secondary structure predictions provide a starting point for direct tertiary structure modelling, and can also significantly improve sequence analysis and sequence-structure threading for aiding in structure and function determination. Hence the improvement of predictive accuracy of the secondary structure prediction becomes essential for future development of the whole field of protein research. In this work we present several multi-classifiers that combine the predictions of the best current classifiers available on the Internet. Our results show that combining the predictions of a set of classifiers by creating composite classifiers is a fruitful approach. We have created multi-classifiers that are more accurate than any of the component classifiers. The multi-classifiers are based on Bayesian networks. They are validated with 9 different datasets. Their predictive accuracy results outperform the best secondary structure predictors by 1.21% on average. Our main contributions are: (i) we improved the best known predictive accuracy by 1.21%, (ii) our best results have been obtained with a new semi-naïve Bayes approach named Pazzani-EDA and (iii) our multi-classifiers combine the predictions of previously built classifiers, obtained through the Internet, thanks to our development of a Java application.
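To illustrate the general idea of a multi-classifier that learns from the outputs of other classifiers, here is a minimal sketch using plain naive Bayes, the simplest Bayesian network structure. This is a swapped-in technique, not the Pazzani-EDA approach named in the abstract; the class name, the Laplace smoothing parameter `alpha`, and the label set are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log


class NaiveBayesCombiner:
    """Combine base-classifier outputs with naive Bayes (illustrative only).

    Each training example is the tuple of labels predicted by the base
    classifiers, paired with the true label.
    """

    def __init__(self, labels, alpha=1.0):
        self.labels = list(labels)
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.class_counts = Counter()
        # (classifier index, true label) -> counts of predicted labels
        self.cond_counts = defaultdict(Counter)

    def fit(self, base_predictions, true_labels):
        for preds, y in zip(base_predictions, true_labels):
            self.class_counts[y] += 1
            for k, p in enumerate(preds):
                self.cond_counts[(k, y)][p] += 1
        return self

    def predict(self, preds):
        total = sum(self.class_counts.values())
        n_labels = len(self.labels)
        best, best_score = None, float("-inf")
        for y in self.labels:
            # log prior of the true label
            score = log((self.class_counts[y] + self.alpha)
                        / (total + self.alpha * n_labels))
            # add log likelihood of each base classifier's output given y
            for k, p in enumerate(preds):
                num = self.cond_counts[(k, y)][p] + self.alpha
                den = self.class_counts[y] + self.alpha * n_labels
                score += log(num / den)
            if score > best_score:
                best, best_score = y, score
        return best
```

A combiner of this kind can learn that one base predictor is reliable for some classes and another is uninformative, which simple majority voting cannot express.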

Mobile Data Management | 2012

MARS: A Personalised Mobile Activity Recognition System

João Bártolo Gomes; Shonali Krishnaswamy; Mohamed Medhat Gaber; Pedro A. C. Sousa; Ernestina Menasalvas

Mobile activity recognition focuses on inferring the current activities of a mobile user by leveraging the sensory data that is available on today's smartphones. The state of the art in mobile activity recognition uses traditional classification techniques. Thus, the learning process typically involves: i) collection of labelled sensory data that is transferred and collated in a centralised repository, ii) model building where the classification model is trained and tested using the collected data, iii) a model deployment stage where the learnt model is deployed on-board a mobile device for identifying activities based on new sensory data. In this paper, we demonstrate the Mobile Activity Recognition System (MARS), in which, for the first time, the model is built and continuously updated on-board the mobile device itself using data stream mining. The advantages of the on-board approach are that it allows model personalisation and increased privacy, as the data is not sent to any external site. Furthermore, when the user or their activity profile changes, MARS enables quick model adaptation. One of the standout features of MARS is that training/updating the model takes less than 30 seconds per activity. MARS has been implemented on the Android platform to demonstrate that it can achieve accurate mobile activity recognition. Moreover, we show in practice that MARS quickly adapts to user profile changes while at the same time being scalable and efficient in terms of consumption of the device resources.
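The contrast between batch training in a central repository and on-board incremental updating can be sketched as follows. The abstract does not say which stream learner MARS uses, so this example substitutes a nearest-centroid classifier with running means, chosen only because it updates in constant time per labelled sample. The class name and the feature vectors are hypothetical.

```python
class IncrementalCentroidClassifier:
    """Nearest-centroid classifier updated one labelled sample at a time.

    Illustrative stand-in for an on-device stream learner: no raw data
    is stored, only one running mean vector per activity.
    """

    def __init__(self):
        self.counts = {}  # activity label -> number of samples seen
        self.means = {}   # activity label -> running mean feature vector

    def update(self, features, label):
        if label not in self.means:
            self.counts[label] = 0
            self.means[label] = [0.0] * len(features)
        self.counts[label] += 1
        n = self.counts[label]
        mean = self.means[label]
        # incremental mean update: m <- m + (x - m) / n
        for i, value in enumerate(features):
            mean[i] += (value - mean[i]) / n

    def predict(self, features):
        def squared_distance(mean):
            return sum((a - b) ** 2 for a, b in zip(features, mean))
        return min(self.means, key=lambda lab: squared_distance(self.means[lab]))
```

Because each `update` call touches only the per-class summary, a new user's labelled samples can adapt the model immediately, without retraining from a stored dataset.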

ACM Symposium on Applied Computing | 2011

Learning recurring concepts from data streams with a context-aware ensemble

João Bártolo Gomes; Ernestina Menasalvas; Pedro A. C. Sousa

The dynamic and unstable nature observed in real world applications influences learning systems through changes in data, context and resource availability. Data stream mining systems must be aware of and adapt to such changes so that incoming data can continuously be classified with high accuracy. Ensemble approaches have been shown to be successful in dealing with concept changes. Despite their success in learning under concept changes, context information has not yet been exploited by ensemble approaches in data stream scenarios where concepts reappear. Under these circumstances, context information appropriately integrated with learned concepts would make it possible to anticipate recurring changes in concepts. In this work, we present an ensemble-based approach for the problem of detecting concept changes in data streams where concepts reappear, which dynamically adds and removes weighted classifiers in response to changes not only in concepts but also in context. We identify stable concepts using a change detection method based on the error rate of the learning process. Context information is used in the adaptation to recurring concepts and in the management of knowledge from previously learned concepts while adapting to resource constraints. Consequently, proper representation and storage of context and concepts is a major issue dealt with in the paper. We present and discuss preliminary experimental results with synthetic and real datasets.
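Change detection based on the error rate of the learner can be sketched as below. The abstract does not name the detector used, so this example assumes a rule in the style of the Drift Detection Method: it tracks the running error rate p and its standard deviation s, remembers the lowest p + s seen, and signals a warning or a drift when the current p + s exceeds that minimum by two or three minimum standard deviations. The class name and default parameters are assumptions.

```python
from math import sqrt


class ErrorRateDriftDetector:
    """Signal 'stable', 'warning' or 'drift' from a stream of 0/1 errors."""

    def __init__(self, min_samples=30, warn_factor=2.0, drift_factor=3.0):
        self.min_samples = min_samples
        self.warn_factor = warn_factor
        self.drift_factor = drift_factor
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def add(self, is_error):
        self.n += 1
        self.errors += 1 if is_error else 0
        if self.n < self.min_samples:
            return "stable"  # too few samples for a reliable estimate
        p = self.errors / self.n
        s = sqrt(p * (1 - p) / self.n)
        if p + s < self.p_min + self.s_min:
            # best operating point so far: remember it as the reference
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_factor * self.s_min:
            self.reset()  # start estimating the new concept from scratch
            return "drift"
        if p + s > self.p_min + self.warn_factor * self.s_min:
            return "warning"
        return "stable"
```

In a recurring-concept setting, a "drift" signal would be the point at which a stored model matching the current context is looked up instead of learning from scratch.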

Discovery Science | 2009

C-DenStream: Using Domain Knowledge on a Data Stream

Carlos Ruiz; Ernestina Menasalvas; Myra Spiliopoulou

Stream clustering algorithms are traditionally designed to process streams efficiently and to adapt to the evolution of the underlying population. This is done without assuming any prior knowledge about the data. However, in many cases, a certain amount of domain or background knowledge is available, and instead of simply using it for the external validation of the clustering results, this knowledge can be used to guide the clustering process. In non-stream data, domain knowledge is exploited in the context of semi-supervised clustering. In this paper, we extend the static semi-supervised learning paradigm to streams. We present C-DenStream, a density-based clustering algorithm for data streams that includes domain information in the form of constraints. We also propose a novel method for the use of background knowledge in data streams. The performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method. To our knowledge, this is the first approach to include domain knowledge in clustering for data streams.

IEEE Transactions on Neural Networks | 2014

Mining Recurring Concepts in a Dynamic Feature Space

João Bártolo Gomes; Mohamed Medhat Gaber; Pedro A. C. Sousa; Ernestina Menasalvas

Most data stream classification techniques assume that the underlying feature space is static. However, in real-world applications the set of features and their relevance to the target concept may change over time. In addition, when the underlying concepts reappear, reusing previously learnt models can enhance the learning process in terms of accuracy and processing time at the expense of manageable memory consumption. In this paper, we propose mining recurring concepts in a dynamic feature space (MReC-DFS), a data stream classification system to address the challenges of learning recurring concepts in a dynamic feature space while simultaneously reducing the memory cost associated with storing past models. MReC-DFS is able to detect and adapt to concept changes using the performance of the learning process and contextual information. To handle recurring concepts, stored models are combined in a dynamically weighted ensemble. Incremental feature selection is performed to reduce the combined feature space. This contribution allows MReC-DFS to store only the features most relevant to the learnt concepts, which in turn increases the memory efficiency of the technique. In addition, an incremental feature selection method is proposed that dynamically determines the threshold between relevant and irrelevant features. Experimental results demonstrating the high accuracy of MReC-DFS compared with state-of-the-art techniques on a variety of real datasets are presented. The results also show the superior memory efficiency of MReC-DFS.
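A dynamically weighted ensemble of stored models can be sketched as follows. The abstract does not give the weighting rule, so this example substitutes the classic weighted-majority update, in which a model's weight is multiplied by a factor `beta` each time it misclassifies a labelled sample. The class name, the callable model interface, and the value of `beta` are assumptions.

```python
from collections import defaultdict


class WeightedEnsemble:
    """Weighted vote over stored models, with weights adapted online."""

    def __init__(self, models, beta=0.5):
        self.models = list(models)          # each model is a callable x -> label
        self.weights = [1.0] * len(self.models)
        self.beta = beta                    # penalty factor in (0, 1), assumed

    def predict(self, x):
        votes = defaultdict(float)
        for model, weight in zip(self.models, self.weights):
            votes[model(x)] += weight
        return max(votes, key=votes.get)

    def update(self, x, y):
        # shrink the weight of every model that got this sample wrong
        for i, model in enumerate(self.models):
            if model(x) != y:
                self.weights[i] *= self.beta
        # renormalise so weights stay comparable over a long stream
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]
```

When a previously seen concept returns, the stored model that fits it keeps its weight while the others decay, so the ensemble's vote shifts toward it after a few labelled samples.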

Data Mining and Knowledge Discovery | 2010

Density-based semi-supervised clustering

Carlos Ruiz; Myra Spiliopoulou; Ernestina Menasalvas

Semi-supervised clustering methods guide the data partitioning and grouping process by exploiting background knowledge, for instance in the form of constraints. In this study, we propose a semi-supervised density-based clustering method. Density-based algorithms are traditionally used in applications where the anticipated groups are expected to assume non-spherical shapes and/or differ in cardinality or density. Many such applications, including those on GIS, lend themselves to constraint-based clustering, because there is a priori knowledge on the group membership of some records. In fact, constraints might be the only way to prevent the formation of clusters that do not conform to the applications’ semantics. For example, geographical objects, e.g. houses, separated by a borderline or a river may not be assigned to the same cluster, independently of their physical proximity. We first provide an overview of constraint-based clustering for different families of clustering algorithms. Then, we concentrate on the density-based algorithms’ family and select the algorithm DBSCAN, which we enhance with Must-Link and Cannot-Link constraints. Our enhancement is seamless: we allow DBSCAN to build temporary clusters, which we then split or merge according to the constraints. Our experiments on synthetic and real datasets show that our approach improves the performance of the algorithm.
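How Must-Link and Cannot-Link constraints interact with a set of temporary clusters can be illustrated with a small post-processing sketch. It is not the algorithm from the paper: it only merges clusters joined by a Must-Link pair (using union-find) and reports Cannot-Link pairs that ended up in the same cluster, which are the cases a split step would have to resolve. The function names are hypothetical, and the sketch assumes every point has a cluster label (no noise points).

```python
def merge_by_must_link(labels, must_link):
    """Merge clusters that contain the two ends of any Must-Link pair.

    labels: list of integer cluster ids, one per point (assumes no noise label).
    must_link: iterable of (i, j) point-index pairs.
    Returns a new label list; merged clusters take the smallest id involved.
    """
    parent = {c: c for c in set(labels)}

    def find(c):
        # union-find lookup with path halving
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for i, j in must_link:
        ri, rj = find(labels[i]), find(labels[j])
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)
    return [find(c) for c in labels]


def cannot_link_violations(labels, cannot_link):
    """Return the Cannot-Link pairs whose two points share a cluster."""
    return [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
```

For example, with labels `[0, 0, 1, 1, 2]` and a Must-Link pair `(1, 2)`, clusters 0 and 1 are merged; a Cannot-Link pair `(0, 3)` is then reported as violated, signalling that the merged cluster needs to be split.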

Information Systems | 2009

Toward data mining engineering: A software engineering approach

Oscar Marbán; Javier Segovia; Ernestina Menasalvas; Covadonga Fernández-Baizán

The number, variety and complexity of projects involving data mining or knowledge discovery in databases have recently increased at such a pace that aspects related to their development process need to be standardized for results to be integrated, reused and interchanged in the future. Data mining projects are quickly becoming engineering projects, and current standard processes, like CRISP-DM, need to be revisited to incorporate this engineering viewpoint. This is the central motivation of this paper, which argues that experience gained about the software development process over almost 40 years could be reused and integrated to improve data mining processes. Consequently, this paper proposes to reuse ideas and concepts underlying the IEEE Std 1074 and ISO 12207 software engineering process models to redefine and extend the CRISP-DM process and make it a data mining engineering standard.