Lorena Etcheverry | Researchain

Publication

Featured research published by Lorena Etcheverry.

Data Warehousing and Knowledge Discovery | 2014

Modeling and querying data warehouses on the semantic web using QB4OLAP

Lorena Etcheverry; Alejandro A. Vaisman; Esteban Zimanyi

The web is changing the way in which data warehouses are designed and exploited. Nowadays, for many data analysis tasks, data contained in a conventional data warehouse may not suffice, and external data sources, like the web, can provide useful multidimensional information. Also, large repositories of semantically annotated data are becoming available on the web, opening new opportunities for enhancing current decision-support systems. Representation of multidimensional data via semantic web standards is crucial to achieve this goal. In this paper we extend the QB4OLAP RDF vocabulary to represent balanced, recursive, and ragged hierarchies. We also present a set of rules to obtain a QB4OLAP representation of a conceptual multidimensional model, and a procedure to populate the result from a relational implementation of the multidimensional model. We conclude the paper by showing how complex real-world OLAP queries expressed in SPARQL can be posed to the resulting QB4OLAP model.

International Semantic Web Conference | 2012

Enhancing OLAP analysis with web cubes

Lorena Etcheverry; Alejandro A. Vaisman

Traditional OLAP tools have proven to be successful in analyzing large sets of enterprise data. For today's business dynamics, sometimes these highly curated data are not enough. External data (particularly web data) may be useful to enhance local analysis. In this paper we discuss the extraction of multidimensional data from web sources, and their representation in RDFS. We introduce Open Cubes, an RDFS vocabulary for the specification and publication of multidimensional cubes on the Semantic Web, and show how classical OLAP operations can be implemented over Open Cubes using SPARQL 1.1, without the need to map the multidimensional information to the local database (the usual approach to multidimensional analysis of Semantic Web data). We show that our approach is plausible for the data sizes that can usually be retrieved to enhance local data repositories.

Journal of Web Semantics | 2016

Dimensional enrichment of statistical linked open data

Jovan Varga; Alejandro A. Vaisman; Oscar Romero; Lorena Etcheverry; Torben Bach Pedersen; Christian Thomsen

On-Line Analytical Processing (OLAP) is a data analysis technique typically used for local and well-prepared data. However, initiatives like Open Data and Open Government bring new and publicly available data on the web that are to be analyzed in the same way. The use of semantic web technologies for this context is especially encouraged by the Linked Data initiative. There is already a considerable amount of statistical linked open data sets published using the RDF Data Cube Vocabulary (QB), which is designed for these purposes. However, QB lacks some essential schema constructs (e.g., dimension levels) to support OLAP. Thus, the QB4OLAP vocabulary has been proposed to extend QB with the necessary constructs and be fully compliant with OLAP. In this paper, we focus on the enrichment of an existing QB data set with QB4OLAP semantics. We first thoroughly compare the two vocabularies and outline the benefits of QB4OLAP. Then, we propose a series of steps to automate the enrichment of QB data sets with specific QB4OLAP semantics, the most important being the definition of aggregate functions and the detection of new concepts in the dimension hierarchy construction. The proposed steps are defined to form a semi-automatic enrichment method, which is implemented in a tool that enables the enrichment in an interactive and iterative fashion. The user can enrich the QB data set with QB4OLAP concepts (e.g., full-fledged dimension hierarchies) by choosing among the candidate concepts automatically discovered with the steps proposed. Finally, we conduct experiments with 25 users and use three real-world QB data sets to evaluate our approach. The evaluation demonstrates the feasibility of our approach and shows that, in practice, our tool facilitates, speeds up, and guarantees the correct results of the enrichment process.

International Conference on Data Engineering | 2016

QB2OLAP: Enabling OLAP on Statistical Linked Open Data

Jovan Varga; Lorena Etcheverry; Alejandro A. Vaisman; Oscar Romero; Torben Bach Pedersen; Christian Thomsen

Publication and sharing of multidimensional (MD) data on the Semantic Web (SW) opens new opportunities for the use of On-Line Analytical Processing (OLAP). The RDF Data Cube (QB) vocabulary, the current standard for statistical data publishing, however, lacks key MD concepts such as dimension hierarchies and aggregate functions. QB4OLAP was proposed to remedy this. However, QB4OLAP requires extensive manual annotation and users must still write queries in SPARQL, the standard query language for RDF, which typical OLAP users are not familiar with. In this demo, we present QB2OLAP, a tool for enabling OLAP on existing QB data. Without requiring any RDF, QB(4OLAP), or SPARQL skills, it allows semi-automatic transformation of a QB data set into a QB4OLAP one via enrichment with QB4OLAP semantics, exploration of the enriched schema, and querying with the high-level OLAP language QL that exploits the QB4OLAP semantics and is automatically translated to SPARQL.

Latin American Web Congress | 2014

Publishing and Querying Government Multidimensional Data Using QB4OLAP

M. Bouza; B. Elliot; Lorena Etcheverry; Alejandro A. Vaisman

The web is changing the way in which data warehouses are designed, used, and queried. With the advent of initiatives such as Open Data and Open Government, organizations want to share their multidimensional data cubes and make them available to be queried online. The RDF data cube vocabulary (QB), the W3C standard to publish statistical data in RDF, has several limitations that prevent it from fully supporting the multidimensional model. The QB4OLAP vocabulary extends QB to overcome these limitations, and provides the distinctive feature of being able to implement several OLAP operations, such as rollup, slice, and dice, using standard SPARQL queries. In this paper we present QB4OLAP Engine, a tool that transforms multidimensional data stored in relational DWs into RDF using QB4OLAP, and apply the solution to a real-world case, based on the national survey of housing, health services, and income, carried out by the government of Uruguay.

Database and Expert Systems Applications | 2010

Data Quality Metrics for Genome Wide Association Studies

Lorena Etcheverry; Adriana Marotta; Raúl Ruggia

Genome Wide Association Studies (GWAS) are developed to find direct or indirect relations from given genomic configurations to physical characteristics or specific diseases. In order to build new GWAS, avoiding the complexities of field based studies, a statistical technique called meta-analysis can be used. Bad or unknown data quality has been largely identified as a major problem in meta-analysis since it generates lack of confidence and inhibits its exploitation. This paper addresses GWAS data quality issues and presents a domain specific model for data quality assessment, which has been developed taking into account meta-analysis requirements.

Journal on Data Semantics | 2017

Efficient Analytical Queries on Semantic Web Data Cubes

Lorena Etcheverry; Alejandro A. Vaisman

The amount of multidimensional data published on the Semantic Web (SW) is constantly increasing, due to initiatives such as Open Data and Open Government Data, among others. Models, languages, and tools that allow valuable information to be obtained efficiently are thus required. Multidimensional data are typically represented as data cubes and exploited using online analytical processing (OLAP) techniques. The RDF Data Cube Vocabulary, also denoted QB, is the current W3C standard to represent statistical data on the SW. Given that QB does not include key features needed for OLAP analysis, in previous work we have proposed an extension, denoted QB4OLAP, to overcome this problem without the need to modify already published data. Once data cubes are appropriately represented on the SW, we need mechanisms to analyze them. However, in the current state of the art, writing efficient analytical queries over SW data cubes demands a deep knowledge of standards like RDF and SPARQL. These skills are unlikely to be found in typical analytical users. Further, OLAP languages like MDX are far from being easily understood by the final user. The lack of friendly tools to exploit multidimensional data on the SW is a barrier that needs to be broken to promote the publication of such data. This is the problem we address in this paper. Our approach is based on allowing analytical users to write queries using what they know best: OLAP operations over data cubes, without dealing with SW technicalities. For this, we devised CQL (standing for Cube Query Language), a simple, high-level query language that operates over data cubes. Taking advantage of structural metadata provided by QB4OLAP, we translate CQL queries into SPARQL ones. Then, we propose query improvement strategies to produce efficient SPARQL queries, adapting general-purpose SPARQL query optimization techniques. We evaluate our implementation using the Star Schema benchmark, showing that our proposal outperforms others. The QB4OLAP toolkit, a web application that allows exploring and querying (using CQL) SW data cubes, completes our contributions.

International Conference of the Chilean Computer Science Society | 2016

Accelerating the quality measurement of DNA with GPUs

Gonzalo Javiel; Lorena Etcheverry; Pablo Ezzatti

High-throughput DNA sequencing technologies, also known as Next Generation Sequencing (NGS), have a huge impact on the development of bioinformatic tools. NGS technologies generate massive amounts of data that support biomedical research, and assessing the quality of these data is a crucial task but also an important computational challenge. In the last ten years Graphics Processing Units (GPUs) have emerged as one of the major paradigms for solving complex problems using parallel computing techniques on commodity hardware. This paper presents Sodium, a high performance tool that efficiently evaluates the quality of NGS data using GPUs. Our experimental evaluation, performed on low-cost hardware, shows promising results. In particular, we observe acceleration factors of up to two digits when our proposal is compared with FastQC, one of the most popular tools to evaluate the quality of NGS sequences.

International Journal of Data Warehousing and Mining | 2013

Fusion Cubes: Towards Self-Service Business Intelligence

Alberto Abelló; Jérôme Darmont; Lorena Etcheverry; Matteo Golfarelli; Jose-Norberto Mazón; Felix Naumann; Torben Bach Pedersen; Stefano Rizzi; Juan Trujillo; Panos Vassiliadis; Gottfried Vossen

CEUR Workshop Proceedings | 2012