Diego García-Saiz
University of Cantabria
Publication
Featured research published by Diego García-Saiz.
Decision Support Systems | 2013
Marta E. Zorrilla; Diego García-Saiz
In today's competitive market, companies need to use knowledge discovery techniques to make better, more informed decisions. However, these techniques are out of reach for most users, since the knowledge discovery process requires considerable expertise. Additionally, business intelligence vendors are moving their systems to the cloud in order to provide services that offer companies cost savings, better performance and faster access to new applications. This work joins both facets: it describes a data mining service addressed to non-expert data miners which can be delivered as Software-as-a-Service. Its main advantage is that, simply by indicating where the data file is located, the service is able to perform the whole process by itself.
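As a rough illustration of the "file in, model out" idea behind such a service, the sketch below reads a CSV and produces a trivial predictor. The function name, the assumption that the last column is the target, and the majority-class baseline are all illustrative, not the paper's actual design:

```python
# Minimal sketch of a "file in, model out" mining service.
# Assumes a CSV whose last column is the class to predict;
# a majority-class predictor stands in for the real algorithm
# selection and training step the service would perform.
import csv
from collections import Counter

def mine(csv_path):
    """Read a CSV and return a baseline model for its last column."""
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]
    labels = [r[-1] for r in data]
    majority = Counter(labels).most_common(1)[0][0]
    return {"target": header[-1], "predict": lambda _row: majority}
```

The point is the interface, not the model: the caller supplies only a file location, and everything else happens inside the service.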
Archive | 2014
Diego García-Saiz; Camilo Palazuelos; Marta E. Zorrilla
With the increasing popularity of social networking services like Facebook, social network analysis (SNA) has re-emerged. Undoubtedly, there is an inherent social network in any learning context, where teachers, learners, and learning resources act as the main actors, among which different relationships can be defined, e.g., “participate in” among blogs, students, and learners. From its analysis, information about group cohesion, participation in activities, and connections among subjects can be obtained. At the same time, there is a well-known need for tools that help instructors, in particular those involved in distance education, to discover their students’ behavior profiles, to model how they participate in collaborative activities or, perhaps most importantly, to know the performance and dropout patterns, with the aim of improving the teaching–learning process. Therefore, the goal of this chapter is to describe our e-learning Web Mining tool and the new services it provides, supported by the use of SNA and classification techniques.
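Two of the SNA measures the chapter mentions, group cohesion and participation, can be sketched over a toy interaction network. The edge list below is invented for illustration; density and degree are standard graph measures, not the tool's specific implementation:

```python
# Toy learning network: actors are a teacher, two learners and a
# blog; each edge is a "participates with / in" relationship.
edges = [("teacher", "ana"), ("teacher", "luis"),
         ("ana", "luis"), ("ana", "blog1"), ("luis", "blog1")]

nodes = sorted({n for e in edges for n in e})
n = len(nodes)

# Participation: how many relationships each actor takes part in.
degree = {v: sum(v in e for e in edges) for v in nodes}

# Cohesion: density, the fraction of possible ties actually present.
density = 2 * len(edges) / (n * (n - 1))
```

From such simple measures an instructor can already see, for example, which learners sit at the periphery of the course network.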
International Conference on Rough Sets and Intelligent Systems Paradigms | 2014
Marta E. Zorrilla; Diego García-Saiz
The use of e-learning platforms is practically universal at all educational levels; moreover, virtual teaching is currently acquiring a relevance never seen before. The information that these systems record is a rich source of information which, once suitably analysed, allows both instructors and academic authorities to make more informed decisions. However, these individuals are not experts in data mining techniques, so they require tools that automate the KDD process and, at the same time, hide its complexity. In this paper, we show how meta-learning can be a suitable alternative for selecting the algorithm to be used in the KDD process, which will later be wrapped and deployed as a web service, making it easily accessible to the educational community. Our case study focuses on predicting student performance from the activity performed by students in courses hosted on the Moodle platform.
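The core meta-learning move can be sketched in a few lines: describe each past experiment by dataset meta-features plus the best algorithm found, then recommend for a new dataset the algorithm of its nearest neighbour in meta-feature space. The meta-features, numbers and algorithm names below are invented, and real systems use richer distance measures:

```python
# Nearest-neighbour algorithm recommendation over a tiny,
# invented experiment database.
import math

experiments = [
    # (n_instances, n_attributes, class_balance) -> best algorithm
    ((120, 8, 0.9), "NaiveBayes"),
    ((5000, 40, 0.5), "J48"),
    ((300, 12, 0.7), "kNN"),
]

def recommend(meta):
    """Return the winning algorithm of the closest past experiment."""
    return min(experiments,
               key=lambda e: math.dist(e[0], meta))[1]
```

In practice the meta-features would be normalised so that instance counts do not dominate the distance, but the recommendation principle is the same.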
Revista Iberoamericana de Tecnologías del Aprendizaje | 2014
Pablo Sánchez Barreiro; Diego García-Saiz; Marta Elena Zorrilla Pantaleón
Applications for e-learning platforms must deal with a certain variability inherent to their domain. For example, these applications must be adapted to the variations of each teaching-learning process; thus, they must be changed manually according to the particular environment in which they will be deployed. This manual adaptation process is costly and error-prone. Our hypothesis is that software product line (SPL) engineering, whose goal is the effective production of families of similar software systems, can help to alleviate this problem. This paper illustrates this idea by refactoring an e-learning application named E-Learning Web Miner into an SPL. The benefits obtained are highlighted and analyzed.
International Symposium on Data-Driven Process Discovery and Analysis | 2013
Roberto Espinosa; Diego García-Saiz; Marta E. Zorrilla; Jose Zubcoff; Jose-Norberto Mazón
Non-expert users find it complex to gain richer insights into the increasing amount of available heterogeneous data, the so-called big data. Advanced data analysis techniques, such as data mining, are difficult to apply because (i) a great number of data mining algorithms can be applied to solve the same problem, and (ii) correctly applying data mining techniques always requires dealing with the inherent features of the data source. Therefore, we are witnessing a novel scenario in which non-experts are unable to take advantage of big data, while data mining experts are: the big data divide. In order to bridge this gap, we propose an approach that offers non-expert miners a tool which, just by uploading their data sets, returns the most accurate mining pattern without their having to deal with algorithms or settings, thanks to the use of a data mining algorithm recommender. We also incorporate a preliminary task to help non-expert users specify data mining requirements and a later task in which users are guided in interpreting data mining results. Furthermore, we experimentally test the feasibility of our approach, in particular the method to build recommenders, in an educational context where instructors of e-learning courses are non-expert data miners who need to discover how their courses are used in order to make informed decisions to improve them.
Symposium on Languages, Applications and Technologies | 2015
Alfonso de la Vega; Diego García-Saiz; Marta E. Zorrilla; Pablo Sánchez
Nowadays, most companies and organizations rely on computer systems to run their work processes. Therefore, the analysis of how these systems are used can be an important source of information for improving these work processes. In the era of Big Data, this is perfectly feasible with current state-of-the-art data analysis tools. Nevertheless, these tools cannot be used by general users, as they require deep and sound knowledge of the algorithms and techniques they implement. In other areas of computer science, domain-specific languages have been created to abstract users from the low-level details of complex technologies. We believe the same solution can be applied to data analysis tools. This article explores this hypothesis by creating a Domain-Specific Language (DSL) for the educational domain.
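To make the DSL idea concrete, the toy parser below turns a one-line query written in educational vocabulary into a technical analysis specification. The query syntax, the concept-to-technique mapping and all names are invented for illustration; they are not the paper's actual language:

```python
# Toy DSL front-end: a domain-vocabulary query is mapped to a
# technical data mining specification the user never sees.
import re

# Hypothetical mapping from domain concepts to mining tasks.
MAPPING = {
    "performance": ("classification", "final_grade"),
    "dropout": ("classification", "dropped_out"),
    "groups": ("clustering", None),
}

def parse(query):
    """Parse e.g. "analyse dropout of students in 'Databases'"."""
    m = re.match(r"analyse (\w+) of students in '(.+)'", query)
    concept, course = m.group(1), m.group(2)
    technique, target = MAPPING[concept]
    return {"course": course, "technique": technique, "target": target}
```

The instructor reasons only about "dropout" and a course name; the choice of classification task and target attribute stays hidden behind the language.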
Model and Data Engineering | 2017
Alfonso de la Vega; Diego García-Saiz; Marta E. Zorrilla; Pablo Sánchez
Data mining techniques are making their way into today's companies, allowing business users to make informed decisions based on their available data. However, these business experts usually lack the knowledge to perform the analysis of the data by themselves, which makes it necessary to rely on experts in the field of data mining. In an attempt to solve this problem, we previously studied the definition of domain-specific languages that allow data mining processes to be specified without requiring experience in the applied techniques. The specification is made through high-level language primitives which refer only to familiar concepts and terms from the original domain of the data; therefore, technical details of the mining processes are hidden from the final user. Although these languages present themselves as a promising solution, their development can become a challenging and costly endeavour. This work describes a development ecosystem devised for the generation of these languages, starting from a generic perspective that can be specialized to the details of each domain.
International Conference on Computational Collective Intelligence | 2015
Marta E. Zorrilla; Diego García-Saiz
One of the most challenging tasks in the knowledge discovery process is the selection of the best classification algorithm for the data set at hand. Thus, tools that help practitioners to choose the best classifier, along with its parameter settings, are in high demand. These are useful not only for trainees but also for the automation of the data mining process. Our approach is based on meta-learning, which relies on applying learning algorithms to meta-data extracted from data mining experiments in order to better understand how these algorithms can become flexible in solving different kinds of learning problems. This paper presents a framework which allows novices to create and feed their own experiment database and, later, to analyse and select the best technique for their target data set. As a case study, we evaluate different sets of meta-features on educational data sets and discuss which ones are more suitable for predicting student performance.
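A minimal sketch of the kind of meta-features such experiment databases store is given below: dataset sizes plus class entropy, which are among the simplest and most common choices in the meta-learning literature. The toy rows and labels are invented; the paper evaluates richer feature sets than this:

```python
# Compute a handful of simple dataset meta-features.
import math
from collections import Counter

def meta_features(rows, labels):
    """Sizes plus class entropy for a labelled data set."""
    counts = Counter(labels)
    total = len(labels)
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in counts.values())
    return {
        "n_instances": total,
        "n_attributes": len(rows[0]),
        "n_classes": len(counts),
        "class_entropy": entropy,
    }
```

A perfectly balanced two-class data set, for instance, has a class entropy of 1 bit, while a heavily imbalanced one (the usual case with dropout data) scores much lower.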
Archive | 2019
Diego García-Saiz; Marta E. Zorrilla; Alfonso de la Vega; Pablo Sánchez
Due to the lack of face-to-face interaction between teachers and students in virtual courses, the identification of at-risk learners among those who appear to show normal activity is a challenge. In particular, we refer to those who are very active in the Learning Management System but whose performance is low in comparison with their peers. To address this issue, we describe a method aimed at discovering learners whose performance is inconsistent with their activity, by using an ensemble of classifiers. Its effectiveness is shown by its application to data from virtual courses and by comparison with the results achieved by two well-known outlier detection techniques.
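The underlying idea can be sketched as follows: an ensemble predicts performance from activity, and a learner whose actual result falls well below the ensemble's prediction is flagged as inconsistent. The three threshold predictors, the thresholds themselves and the student records below are all made up; the chapter uses trained classifiers, not hand-set rules:

```python
# Flag active learners whose performance contradicts their activity.
def ensemble_predict(activity):
    """Majority vote of three toy activity-based predictors."""
    votes = [activity["logins"] > 20,
             activity["posts"] > 5,
             activity["time_h"] > 10]
    return "pass" if sum(votes) >= 2 else "fail"

def inconsistent(students):
    """Learners the ensemble expects to pass but who failed."""
    return [name for name, act, actual in students
            if ensemble_predict(act) == "pass" and actual == "fail"]
```

A plain outlier detector looks at activity alone; the point of this scheme is to look at the *disagreement* between predicted and actual performance.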
Archive | 2018
Alfonso de la Vega; Diego García-Saiz; Carlos Blanco; Marta E. Zorrilla; Pablo Sánchez
In big data contexts, the performance of relational databases can be overwhelmed, usually by numerous concurrent connections over large volumes of data. In these cases, the support of ACID transactions is dropped in favour of NoSQL data stores, which offer quick responses and high data availability. Although NoSQL systems solve this concrete performance problem, they also present some issues. For instance, the NoSQL spectrum covers a wide range of database paradigms, such as key-value, column-oriented or document stores. These paradigms differ so much from the relational model that existing, well-known practices from relational database design cannot be reused. Moreover, this paradigm heterogeneity makes it difficult to define general design practices for NoSQL data stores. We present Mortadelo, a framework devised for the automatic design of NoSQL databases. Mortadelo offers a model-driven transformation process which starts from a technology-agnostic data model and automatically generates a design and implementation for the desired NoSQL data store. The main strength of our framework is its generality, i.e., Mortadelo can be extended to support any kind of NoSQL database. The validity of our approach has been checked through the implementation of a tool which currently supports the generation of column-family data stores and offers preliminary support for document-based ones.
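The flavour of such a model-driven transformation can be sketched with a toy function that maps a technology-agnostic entity model to a column-family layout, embedding referenced entities so a query touches a single row (a common NoSQL denormalisation pattern). The input model shape and the generated schema are illustrative only, not Mortadelo's actual metamodel:

```python
# Toy model-to-column-family transformation: one table per entity,
# with referenced entities' attributes denormalised into it.
def to_column_family(model):
    """Generate a column-family schema from an agnostic data model."""
    schema = {}
    for entity in model["entities"]:
        columns = list(entity["attributes"])
        for ref in entity.get("refs", []):
            target = next(e for e in model["entities"]
                          if e["name"] == ref)
            # Embed the referenced entity's attributes, prefixed.
            columns += [f"{ref}_{a}" for a in target["attributes"]]
        schema[entity["name"]] = columns
    return schema
```

Supporting another paradigm, say document stores, would mean swapping in a different generator over the same technology-agnostic input model, which is exactly the extensibility the framework claims.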