Panos Vassiliadis
University of Ioannina
Publications
Featured research published by Panos Vassiliadis.
Archive | 2010
Matthias Jarke; Maurizio Lenzerini; Yannis Vassiliou; Panos Vassiliadis
From the Publisher: Data warehouses have captured the attention of practitioners and researchers alike. But the design and optimization of data warehouses remains an art rather than a science. This book presents a comparative review of the state of the art and best current practice of data warehouses. It covers source and data integration, multidimensional aggregation, query optimization, update propagation, metadata management, quality assessment, and design optimization. Also, based on results of the European Data Warehouse Quality project, it offers a conceptual framework by which the architecture and quality of data warehouse efforts can be assessed and improved using enriched metadata management combined with advanced techniques from databases, business modeling, and artificial intelligence. For researchers and database professionals in academia and industry, the book offers an excellent introduction to the issues of quality and metadata usage in the context of data warehouses.
international conference on management of data | 1999
Panos Vassiliadis; Timos K. Sellis
In this paper, we present different proposals for multidimensional data cubes, which are the basic logical model for OLAP applications. We have grouped the work in the field into two categories: commercial tools (presented along with terminology and standards) and academic efforts. We further divide the academic efforts into two subcategories: the relational model extensions and the cube-oriented approaches. Finally, we attempt a comparative analysis of the various efforts.
statistical and scientific database management | 1998
Panos Vassiliadis
Online analytical processing (OLAP) is a trend in database technology which has attracted considerable research interest. OLAP is based on the multidimensional view of data, supported either by multidimensional databases (MOLAP) or relational engines (ROLAP). We propose a model for multidimensional databases. Dimensions, dimension hierarchies and cubes are formally introduced. We also introduce cube operations (changing of levels in the dimension hierarchy, function application, navigation, etc.). The approach is based on the notion of the base cube, which is used for the calculation of the results of cube operations. We focus on the support of series of operations on cubes (i.e., the preservation of the results of previous operations and the applicability of aggregate functions in a series of operations). Furthermore, we provide a mapping of the multidimensional model to the relational model and to multidimensional arrays.
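The abstract above describes the model only in prose; the following is a minimal Python sketch of the kind of structure it refers to (dimensions with level hierarchies, a base cube of facts, and a level-changing operation that re-aggregates measures). All class and function names, the toy data, and the recomputation-from-the-base-cube strategy are assumptions made for illustration, not taken from the paper.

```python
from collections import defaultdict

# Illustrative sketch only: a dimension with a linear level hierarchy and a
# mapping from values of one level to values of the next coarser level.
class Dimension:
    def __init__(self, name, levels, rollup_maps):
        self.name = name
        self.levels = levels              # e.g. ["day", "month", "year"]
        self.rollup_maps = rollup_maps    # {("day", "month"): {...}, ...}

    def roll_up_value(self, value, from_level, to_level):
        """Map a value to its ancestor at a coarser level, one step at a time."""
        i, j = self.levels.index(from_level), self.levels.index(to_level)
        for k in range(i, j):
            value = self.rollup_maps[(self.levels[k], self.levels[k + 1])][value]
        return value

# A "base cube" is kept as a list of (coordinates, measure) facts; derived cubes
# are recomputed from it, echoing the idea that results of cube operations are
# always calculable from the base cube.
class Cube:
    def __init__(self, dimensions, levels, facts):
        self.dimensions = dimensions      # [Dimension, ...]
        self.levels = levels              # current level per dimension
        self.facts = facts                # [((v1, v2, ...), measure), ...]

    def roll_up(self, dim_index, to_level, agg=sum):
        """Change the level of one dimension and re-aggregate the measure."""
        dim = self.dimensions[dim_index]
        grouped = defaultdict(list)
        for coords, measure in self.facts:
            coords = list(coords)
            coords[dim_index] = dim.roll_up_value(
                coords[dim_index], self.levels[dim_index], to_level)
            grouped[tuple(coords)].append(measure)
        new_levels = list(self.levels)
        new_levels[dim_index] = to_level
        new_facts = [(c, agg(ms)) for c, ms in grouped.items()]
        return Cube(self.dimensions, new_levels, new_facts)

# Tiny usage example: a time dimension rolled up from day to month.
time = Dimension("time", ["day", "month"],
                 {("day", "month"): {"2024-01-01": "2024-01", "2024-01-02": "2024-01"}})
sales = Cube([time], ["day"],
             [(("2024-01-01",), 10), (("2024-01-02",), 5)])
print(sales.roll_up(0, "month").facts)   # [(('2024-01',), 15)]
```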
international conference on data engineering | 2005
Alkis Simitsis; Panos Vassiliadis; Timos K. Sellis
Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. Usually, these processes must be completed in a certain time window; thus, it is necessary to optimize their execution time. In this paper, we delve into the logical optimization of ETL processes, modeling it as a state-space search problem. We consider each ETL workflow as a state and fabricate the state space through a set of correct state transitions. Moreover, we provide algorithms towards the minimization of the execution cost of an ETL workflow.
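As an illustration of the state-space formulation described above (each ETL workflow is a state, transitions produce equivalent workflows, and the search looks for a minimum-cost state), here is a hedged Python sketch. The activities, the toy cost model, and the adjacent-swap transition rule are invented for the example; the paper's actual transitions and cost model are more elaborate.

```python
import heapq
from itertools import count

# Toy activity costs; a real optimizer would derive these from data volumes
# and selectivities.
COST = {"filter": 1, "join": 10, "surrogate_key": 3, "load": 2}

def cost(workflow):
    # Toy cost model: earlier positions in the pipeline are weighted more
    # heavily, standing in for the larger data volumes they process.
    return sum(COST[a] * (len(workflow) - i) for i, a in enumerate(workflow))

def transitions(workflow):
    """Generate 'equivalent' workflows by swapping adjacent activities.
    A real optimizer would only allow swaps proven to preserve semantics."""
    for i in range(len(workflow) - 1):
        w = list(workflow)
        w[i], w[i + 1] = w[i + 1], w[i]
        yield tuple(w)

def optimize(initial):
    """Exhaustive best-first search over the state space of workflows."""
    tie = count()
    frontier = [(cost(initial), next(tie), initial)]
    seen, best = {initial}, (cost(initial), initial)
    while frontier:
        c, _, state = heapq.heappop(frontier)
        best = min(best, (c, state))
        for nxt in transitions(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (cost(nxt), next(tie), nxt))
    return best

# Usage: find the cheapest reordering of a small workflow.
print(optimize(("join", "filter", "surrogate_key", "load")))
```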
Information Systems | 1999
Matthias Jarke; Manfred A. Jeusfeld; Christoph Quix; Panos Vassiliadis
Most database researchers have studied data warehouses (DW) in their role as buffers of materialized views, mediating between update-intensive OLTP systems and query-intensive decision support. This neglects the organizational role of data warehousing as a means of centralized information flow control. As a consequence, a large number of quality aspects relevant for data warehousing cannot be expressed with the current DW meta models. This paper makes two contributions towards solving these problems. Firstly, we enrich the meta data about DW architectures by explicit enterprise models. Secondly, since many very different mathematical techniques for measuring or optimizing certain aspects of DW quality are being developed, we adapt the Goal-Question-Metric approach from software quality management to a meta data management environment in order to link these special techniques to a generic conceptual framework of DW quality. The approach has been implemented in full on top of the ConceptBase repository system and has undergone some validation by applying it to the support of specific quality-oriented methods, tools, and application projects in data warehousing.
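To make the Goal-Question-Metric adaptation more concrete, the sketch below shows one plausible way to link quality goals on DW metaobjects to questions and computable metrics. It is only an assumed illustration of the GQM structure; the class names and the freshness metric are not from the paper, which implements the framework on top of ConceptBase rather than in Python.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Metric:
    name: str
    measure: Callable[[], float]   # how the metric value is computed

@dataclass
class Question:
    text: str
    metrics: List[Metric] = field(default_factory=list)

@dataclass
class QualityGoal:
    purpose: str                   # e.g. "improve"
    quality_factor: str            # e.g. "freshness"
    dw_object: str                 # the DW metaobject the goal refers to
    questions: List[Question] = field(default_factory=list)

    def evaluate(self):
        """Answer each question by evaluating its metrics."""
        return {q.text: [(m.name, m.measure()) for m in q.metrics]
                for q in self.questions}

# Hypothetical usage: a freshness goal on a materialized view.
goal = QualityGoal(
    purpose="improve", quality_factor="freshness", dw_object="view sales_mv",
    questions=[Question("How stale is the view?",
                        [Metric("hours_since_refresh", lambda: 6.0)])])
print(goal.evaluate())
```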
conference on advanced information systems engineering | 2005
Panos Vassiliadis; Alkis Simitsis; Panos Georgantas; Manolis Terrovitis; Spiros Skiadopoulos
Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. In this paper, we delve into the logical design of ETL scenarios and provide a generic and customizable framework in order to support the DW designer in his task. First, we present a metamodel particularly customized for the definition of ETL activities. We follow a workflow-like approach, where the output of a certain activity can either be stored persistently or passed to a subsequent activity. Also, we employ a declarative database programming language, LDL, to define the semantics of each activity. The metamodel is generic enough to capture any possible ETL activity. Nevertheless, in the pursuit of higher reusability and flexibility, we specialize the set of our generic metamodel constructs with a palette of frequently used ETL activities, which we call templates. Moreover, in order to achieve a uniform extensibility mechanism for this library of built-ins, we have to deal with specific language issues. Therefore, we also discuss the mechanics of template instantiation to concrete activities. The design concepts that we introduce have been implemented in a tool, ARKTOS II, which is also presented.
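As a rough illustration of the template-instantiation idea described above, the sketch below treats an activity template as a parameterized, Datalog-like rule (standing in for the paper's LDL definitions) and binds its parameters to concrete schemata. The template text, class names, and the not-null filter example are assumptions made for the illustration, not ARKTOS II code.

```python
from dataclasses import dataclass
from string import Template

@dataclass
class ActivityTemplate:
    name: str
    rule: Template   # parameterized, Datalog-like rule body

    def instantiate(self, **params):
        """Bind the template parameters to concrete recordsets/attributes."""
        return ConcreteActivity(self.name, self.rule.substitute(**params))

@dataclass
class ConcreteActivity:
    template_name: str
    rule: str

# A generic "not null" filter template...
not_null = ActivityTemplate(
    "NotNullFilter",
    Template("${out}(X) <- ${inp}(X), X.${attr} IS NOT NULL."))

# ...instantiated for a specific recordset and attribute.
activity = not_null.instantiate(out="clean_orders", inp="src_orders",
                                attr="customer_id")
print(activity.rule)
# clean_orders(X) <- src_orders(X), X.customer_id IS NOT NULL.
```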
International Journal of Data Warehousing and Mining | 2009
Panos Vassiliadis
The software processes that facilitate the original loading and the periodic refreshment of the data warehouse contents are commonly known as Extraction-Transformation-Loading (ETL) processes. The intention of this survey is to present the research work in the field of ETL technology in a structured way. To this end, we organize the coverage of the field as follows: (a) first, we cover the conceptual and logical modeling of ETL processes, along with some design methods, (b) we visit each stage of the E-T-L triplet, and examine problems that fall within each of these stages, (c) we discuss problems that pertain to the entirety of an ETL process, and, (d) we review some research prototypes of academic origin.
conference on advanced information systems engineering | 2000
Panos Vassiliadis; Mokrane Bouzeghoub; Christoph Quix
As a decision support information system, a data warehouse must provide a high level of data quality and quality of service. In the DWQ project we have proposed an architectural framework and a repository of metadata which describes all the data warehouse components in a set of metamodels, to which is added a quality metamodel defining, for each data warehouse metaobject, the corresponding relevant quality dimensions and quality factors. Apart from this static definition of quality, we also provide an operational complement, that is, a methodology for using quality factors to achieve user quality goals. This methodology is an extension of the Goal-Question-Metric (GQM) approach, which allows us (a) to capture the inter-relationships between different quality factors and (b) to organize them in order to fulfil specific quality goals. After summarizing the DWQ quality model, this paper describes the methodology we propose for using this quality model, as well as its impact on data warehouse evolution.
international conference on conceptual modeling | 2004
Sergio Luján-Mora; Panos Vassiliadis; Juan Trujillo
In Data Warehouse (DW) scenarios, ETL (Extraction, Transformation, Loading) processes are responsible for the extraction of data from heterogeneous operational data sources, their transformation (conversion, cleaning, normalization, etc.) and their loading into the DW. In this paper, we present a framework for the design of the DW back-stage (and the respective ETL processes) based on the key observation that this task fundamentally involves dealing with the specificities of information at very low levels of granularity, including transformation rules at the attribute level. Specifically, we present a disciplined framework for the modeling of the relationships between sources and targets at different levels of granularity, ranging from coarse mappings at the database and table levels to detailed inter-attribute mappings at the attribute level. In order to accomplish this goal, we extend UML (Unified Modeling Language) to model attributes as first-class citizens. In our attempt to provide complementary views of the design artifacts at different levels of detail, our framework is based on a principled approach to the usage of UML packages, to allow zooming in and out of the design of a scenario.
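The sketch below is one assumed way to represent the multi-granularity mappings discussed above (database-to-warehouse, table-to-table, and attribute-to-attribute, with a coarse "zoomed-out" view). The paper expresses these mappings through a UML extension and packages; the plain Python data structures and names here are only illustrative.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AttributeMapping:
    source_attr: str
    target_attr: str
    transformation: str            # e.g. "trim + uppercase"

@dataclass
class TableMapping:
    source_table: str
    target_table: str
    attribute_mappings: List[AttributeMapping] = field(default_factory=list)

@dataclass
class DatabaseMapping:
    source_db: str
    target_dw: str
    table_mappings: List[TableMapping] = field(default_factory=list)

    def zoom_out(self):
        """Coarse view: which tables feed which, ignoring attribute detail."""
        return [(t.source_table, t.target_table) for t in self.table_mappings]

# Hypothetical mapping from an operational source to the warehouse.
mapping = DatabaseMapping(
    "crm_db", "sales_dw",
    [TableMapping("customers", "dim_customer",
                  [AttributeMapping("cust_name", "customer_name",
                                    "trim + uppercase")])])
print(mapping.zoom_out())   # [('customers', 'dim_customer')]
```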
international conference on data engineering | 2007
Neoklis Polyzotis; Spiros Skiadopoulos; Panos Vassiliadis; Alkis Simitsis; Nils-Erik Frantzell
Active data warehousing has emerged as an alternative to conventional warehousing practices in order to meet the high demand of applications for up-to-date information. In a nutshell, an active warehouse is refreshed on-line and thus achieves a higher consistency between the stored information and the latest data updates. The need for on-line warehouse refreshment introduces several challenges in the implementation of data warehouse transformations, with respect to their execution time and their overhead to the warehouse processes. In this paper, we focus on a frequently encountered operation in this context, namely the join of a fast stream S of source updates with a disk-based relation R, under the constraint of limited memory. This operation lies at the core of several common transformations, such as surrogate key assignment, duplicate detection, or identification of newly inserted tuples. We propose a specialized join algorithm, termed mesh join (MeshJoin), that compensates for the difference in the access cost of the two join inputs by (a) relying entirely on fast sequential scans of R, and (b) sharing the I/O cost of accessing R across multiple tuples of S. We detail the MeshJoin algorithm and develop a systematic cost model that enables the tuning of MeshJoin for two objectives: maximizing throughput under a specific memory budget or minimizing memory consumption for a specific throughput. We present an experimental study that validates the performance of MeshJoin on synthetic and real-life data. Our results verify the scalability of MeshJoin to fast streams and large relations, and demonstrate its numerous advantages over existing join algorithms.
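The following is a simplified, in-memory Python simulation of the MeshJoin idea as described in the abstract: R is scanned cyclically in partitions, each batch of stream tuples stays buffered for exactly one full scan of R, and the cost of reading R is thus shared across all buffered stream tuples. Partition and batch sizes, the dictionary-based tuples, and the key handling are assumptions made for the illustration; the actual algorithm and its cost model are given in the paper.

```python
from collections import deque

def mesh_join(stream, relation, key, partitions=4, batch_size=2):
    """Join stream tuples (dicts) with relation tuples (dicts) on `key`."""
    # Split R into the partitions that would be read sequentially from disk.
    size = max(1, (len(relation) + partitions - 1) // partitions)
    r_parts = [relation[i:i + size] for i in range(0, len(relation), size)]

    window = deque()        # entries: [batch of stream tuples, partitions seen]
    stream_iter = iter(stream)
    results, exhausted, part_idx = [], False, 0
    while not exhausted or window:
        if not exhausted:
            # Read the next batch of stream tuples into the join window.
            batch = []
            for _ in range(batch_size):
                try:
                    batch.append(next(stream_iter))
                except StopIteration:
                    exhausted = True
                    break
            if batch:
                window.append([batch, 0])
        # "Read" the next partition of R and probe every buffered stream tuple.
        part = r_parts[part_idx % len(r_parts)]
        part_idx += 1
        for entry in window:
            for s in entry[0]:
                for r in part:
                    if s[key] == r[key]:
                        results.append({**s, **r})
            entry[1] += 1
        # A batch that has now met every partition of R leaves the window.
        while window and window[0][1] == len(r_parts):
            window.popleft()
    return results

# Tiny usage example: two "stream" updates joined with an 8-tuple "relation".
R = [{"cust_id": i, "name": f"c{i}"} for i in range(8)]
S = [{"cust_id": 3, "amount": 10}, {"cust_id": 5, "amount": 7}]
print(mesh_join(S, R, key="cust_id"))
```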