Alberto Abelló
Polytechnic University of Catalonia
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Alberto Abelló.
Information Systems | 2006
Alberto Abelló; José Samos; Fèlix Saltor
This paper presents a multidimensional conceptual Object-Oriented model for Data Warehousing and OLAP tools, its structures, integrity constraints and query operations. It has been developed as an extension of UML core metaclasses to facilitate its usage, and try to fill the absence of a standard model. Being a UML extension allows reusing modeling constructs and techniques, and integrating multidimensional modeling in more general modeling processes. Moreover, while existing multidimensional models are restricted to the modeling of isolated stars, this paper investigates the representation of several semantically related star schemas. Summarizability and identification constraints can also be represented in the model, and a closed and complete set of algebraic operations has been defined in terms of functions (so that mathematical properties of functions can be smoothly applied).
data warehousing and olap | 2007
Oscar Romero; Alberto Abelló
This paper presents a new approach to automate the multidimensional design of Data Warehouses. In our approach we propose a semi-automatable method aimed to find the business multidimensional concepts from a domain ontology representing different and potentially heterogeneous data sources of our business domain. In short, our method identifies business multidimensional concepts from heterogeneous data sources having nothing in common but that they are all described by an ontology.
International Journal of Data Warehousing and Mining | 2009
Oscar Romero; Alberto Abelló
Many methodologies have been presented to support the multidimensional design of the data warehouse. First methodologies introduced were requirement-driven but the semantics of a data warehouse require to also consider data sources along the design process. In the following years, data sources gained relevance in multidimensional modeling and gave rise to several data-driven methodologies that automate the data warehouse design process from relational sources. Currently, research on multidimensional modeling is still a hot topic and we have two main research lines. On the one hand, new hybrid automatic methodologies have been introduced proposing to combine data-driven and requirement-driven approaches. On the other hand, new approaches focus on considering other kind of structured data sources that have gained relevance in the last years such as ontologies or XML. In this article we present the most relevant methodologies introduced in the literature and a detailed comparison showing main features of each approach.
IEEE Transactions on Knowledge and Data Engineering | 2015
Alberto Abelló; Oscar Romero; Torben Bach Pedersen; Rafael Berlanga; Victoria Nebot; María José Aramburu; Alkis Simitsis
This paper describes the convergence of some of the most influential technologies in the last few years, namely data warehousing (DW), on-line analytical processing (OLAP), and the Semantic Web (SW). OLAP is used by enterprises to derive important business-critical knowledge from data inside the company. However, the most interesting OLAP queries can no longer be answered on internal data alone, external data must also be discovered (most often on the web), acquired, integrated, and (analytically) queried, resulting in a new type of OLAP, exploratory OLAP. When using external data, an important issue is knowing the precise semantics of the data. Here, SW technologies come to the rescue, as they allow semantics (ranging from very simple to very complex) to be specified for web-available resources. SW technologies do not only support capturing the “passive” semantics, but also support active inference and reasoning on the data. The paper first presents a characterization of DW/OLAP environments, followed by an introduction to the relevant SW foundation concepts. Then, it describes the relationship of multidimensional (MD) models and SW technologies, including the relationship between MD models and SW formalisms. Next, the paper goes on to survey the use of SW technologies for data modeling and data provisioning, including semantic data annotation and semantic-aware extract, transform, and load (ETL) processes. Finally, all the findings are discussed and a number of directions for future research are outlined, including SW support for intelligent MD querying, using SW technologies for providing context to data warehouses, and scalability issues.
data and knowledge engineering | 2010
Oscar Romero; Alberto Abelló
The data warehouse design task needs to consider both the end-user requirements and the organization data sources. For this reason, the data warehouse design has been traditionally considered a reengineering process, guided by requirements, from the data sources. Most current design methods available demand highly-expressive end-user requirements as input, in order to carry out the exploration and analysis of the data sources. However, the task to elicit the end-user information requirements might result in a thorough task. Importantly, in the data warehousing context, the analysis capabilities of the target data warehouse depend on what kind of data is available in the data sources. Thus, in those scenarios where the analysis capabilities of the data sources are not (fully) known, it is possible to help the data warehouse designer to identify and elicit unknown analysis capabilities. In this paper we introduce a user-centered approach to support the end-user requirements elicitation and the data warehouse multidimensional design tasks. Our proposal is based on a reengineering process that derives the multidimensional schema from a conceptual formalization of the domain. It starts by fully analyzing the data sources to identify, without considering requirements yet, the multidimensional knowledge they capture (i.e., data likely to be analyzed from a multidimensional point of view). Next, we propose to exploit this knowledge in order to support the requirements elicitation task. In this way, we are already conciliating requirements with the data sources, and we are able to fully exploit the analysis capabilities of the sources. Once requirements are clear, we automatically create the data warehouse conceptual schema according to the multidimensional knowledge extracted from the sources.
database and expert systems applications | 2001
Alberto Abelló; José Samos; Fèlix Saltor
The words On-Line Analytical Processing bring together a set of tools, that use multidimensional modeling in the management of information to improve the decision making process. Lately, a lot of work has been devoted to modeling the multidimensional space. The aim of this paper is two fold. On one hand, it compiles and classifies some of that work, with regard to the design phase they are used in. On the other hand, it allows to compare the different terminology used by each author, by placing all the terms in a common framework.
data and knowledge engineering | 2010
Oscar Romero; Alberto Abelló
It is widely accepted that the conceptual schema of a data warehouse must be structured according to the multidimensional model. Moreover, it has been suggested that the ideal scenario for deriving the multidimensional conceptual schema of the data warehouse would consist of a hybrid approach (i.e., a combination of data-driven and requirement-driven paradigms). Thus, the resulting multidimensional schema would satisfy the end-user requirements and would be conciliated with the data sources. Most current methods follow either a data-driven or requirement-driven paradigm and only a few use a hybrid approach. Furthermore, hybrid methods are unbalanced and do not benefit from all of the advantages brought by each paradigm. In this paper we present our approach for multidimensional design. The most relevant step in our framework is Multidimensional Design by Examples (MDBE), which is a novel method for deriving multidimensional conceptual schemas from relational sources according to end-user requirements. MDBE introduces several advantages over previous approaches, which can be summarized as three main contributions. (i) The MDBE method is a fully automatic approach that handles and analyzes the end-user requirements automatically. (ii) Unlike data-driven methods, we focus on data of interest to the end-user. However, the user may not be aware of all the potential analyses of the data sources and, in contrast to requirement-driven approaches, MDBE can propose new multidimensional knowledge related to concepts already queried by the user. (iii) Finally, MDBE proposes meaningful multidimensional schemas derived from a validation process. Therefore, the proposed schemas are sound and meaningful.
data warehousing and knowledge discovery | 2011
Oscar Romero; Alkis Simitsis; Alberto Abelló
At the early stages of a data warehouse design project, the main objective is to collect the business requirements and needs, and translate them into an appropriate conceptual, multidimensional design. Typically, this task is performed manually, through a series of interviews involving two different parties: the business analysts and technical designers. Producing an appropriate conceptual design is an error-prone task that undergoes several rounds of reconciliation and redesigning, until the business needs are satisfied. It is of great importance for the business of an enterprise to facilitate and automate such a process. The goal of our research is to provide designers with a semi-automatic means for producing conceptual multidimensional designs and also, conceptual representation of the extract-transform-load (ETL) processes that orchestrate the data flow from the operational sources to the data warehouse constructs. In particular, we describe a method that combines information about the data sources along with the business requirements, for validating and completing -if necessary- these requirements, producing a multidimensional design, and identifying the ETL operations needed. We present our method in terms of the TPC-DS benchmark and show its applicability and usefulness.
data warehousing and olap | 2009
Oscar Romero; Diego Calvanese; Alberto Abelló; Mariano Rodriguez-Muro
Nowadays, it is widely accepted that the data warehouse design task should be largely automated. Furthermore, the data warehouse conceptual schema must be structured according to the multidimensional model and as a consequence, the most common way to automatically look for subjects and dimensions of analysis is by discovering functional dependencies (as dimensions functionally depend on the fact) over the data sources. Most advanced methods for automating the design of the data warehouse carry out this process from relational OLTP systems, assuming that a RDBMS is the most common kind of data source we may find, and taking as starting point a relational schema. In contrast, in our approach we propose to rely instead on a conceptual representation of the domain of interest formalized through a domain ontology expressed in the DL-Lite Description Logic. We propose an algorithm to discover functional dependencies from the domain ontology that exploits the inference capabilities of DL-Lite, thus fully taking into account the semantics of the domain. We also provide an evaluation of our approach in a real-world scenario.
data warehousing and knowledge discovery | 2007
Oscar Romero; Alberto Abelló
Although multidimensionality has been widely accepted as the best solution for conceptual modelling, there is not such agreement about the set of operators to handle multidimensional data. This paper presents a comparison of the existing multidimensional algebras trying to find a common backbone, as well as it discusses the necessity of a reference multidimensional algebra and the current state of the art.