Marko Niinimäki
Helsinki Institute of Physics
Publications
Featured research published by Marko Niinimäki.
Journal on Data Semantics | 2009
Marko Niinimäki; Tapio Niemi
In this paper, we present an advanced method for on-demand construction of OLAP cubes for ROLAP systems. The method contains the steps from cube design to ETL but focuses on ETL. Actual data analysis can then be done using the tools and methods of the OLAP software at hand. The method is based on RDF/OWL ontologies and design tools. The ontology serves as a basis for designing and creating the OLAP schema and its corresponding database tables, and finally for populating the database. Our starting point is heterogeneous and distributed data sources that are eventually used to populate the OLAP cubes. Mapping between the source data and their OLAP form is done by first converting the data to RDF using ontology maps. Then the data are extracted from their RDF form by queries that are generated using the ontology of the OLAP schema. Finally, the extracted data are stored in the database tables and analysed using OLAP software. Algorithms and examples are provided for all these steps. In our tests, we have used an open source OLAP implementation and a database server. The performance of the system was found satisfactory when testing with a data source of 450 000 RDF statements. We also propose an ontology-based tool that will work as a user interface to the system, from design to actual analysis.
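The RDF-to-table step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: RDF statements are modelled as plain Python tuples, and the property names (`hasProduct`, `hasRegion`, `hasAmount`), the `fact_sales` table, and both helper functions are hypothetical names chosen for illustration. SQLite stands in for the database server.

```python
import sqlite3
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), a stand-in for RDF statements

# Hypothetical sample data; sale3 lacks a measure and is therefore skipped.
SAMPLE_TRIPLES: List[Triple] = [
    ("sale1", "hasProduct", "widget"),
    ("sale1", "hasRegion", "EU"),
    ("sale1", "hasAmount", "120.0"),
    ("sale2", "hasProduct", "gadget"),
    ("sale2", "hasRegion", "EU"),
    ("sale2", "hasAmount", "80.0"),
    ("sale3", "hasProduct", "widget"),
    ("sale3", "hasRegion", "US"),
]

def extract_facts(triples: List[Triple], dim_props: List[str], measure_prop: str) -> List[tuple]:
    """Group triples by subject and keep subjects that have every dimension and the measure."""
    by_subject: Dict[str, Dict[str, str]] = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, {})[predicate] = obj
    rows = []
    for props in by_subject.values():
        if measure_prop in props and all(d in props for d in dim_props):
            rows.append(tuple(props[d] for d in dim_props) + (float(props[measure_prop]),))
    return rows

def load_fact_table(conn: sqlite3.Connection, rows: List[tuple]) -> None:
    """Create the (assumed) fact table and populate it with the extracted rows."""
    conn.execute("CREATE TABLE fact_sales (product TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    conn.commit()
```

Once the rows are loaded, an ordinary `GROUP BY` query over `fact_sales` plays the role of the OLAP analysis step.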
International Journal on Semantic Web and Information Systems | 2007
Tapio Niemi; Santtu Toivonen; Marko Niinimäki; Jyrki Nummenmaa
Traditionally, data used in OLAP (online analytical processing) have been limited to the contents of the data warehouse of a company. However, the needs for analysis are often more demanding and data are needed from different sources. In this article, we study how the semantics of data sources can be described to allow combining data from several sources into an OLAP cube. We apply Semantic Web technologies for defining an OWL/RDF ontology for OLAP data sources and OLAP cubes. These definitions are then utilised in OLAP cube formation by posing an OWL/RDF ontology-based query against them. We use Grid technologies to enhance the efficiency of processing and to ensure security. Our primary interest is in the cube construction (i.e., ETL process), and we assume that standard OLAP methods can be used for the actual analysis. Our tests show that the proposed approach can speed up the construction of an OLAP cube for ad hoc queries by supporting a high-level query language and reducing the amount of required data.
ACM Symposium on Applied Computing | 2010
Tapio Niemi; Marko Niinimäki
Summarizability, i.e. the correctness of aggregation operations, is essential for OLAP analysis. Summarizability has commonly been studied in the context of dimension hierarchies, but the role of semantics of measure attributes and aggregation functions (sum, avg, min, max, count) has received less research interest. In this paper, we focus on the relationship between measure and dimension attributes and its effect on summarizability. We define the concept of measure-dimension consistency and show how it can be concluded from an OLAP ontology constructed by using Semantic Web technologies. Measure-dimension consistency can be used both for OLAP cube construction and queries and it is also very useful when integrating data over the internet.
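As a rough illustration of why the measure type matters for summarizability, the sketch below rejects aggregations that are not meaningful for a given measure and dimension. The rule table is an assumption made for this example (a common textbook distinction between flow and stock measures); it is not the measure-dimension consistency definition or the ontology from the paper, and all names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

ALL_FUNCS: Set[str] = {"sum", "avg", "min", "max", "count"}

# Hypothetical rule table: aggregation functions allowed per (measure kind, dimension kind).
# A stock measure (e.g. an inventory level) must not be summed along time,
# because adding successive snapshots counts the same items repeatedly.
ALLOWED: Dict[Tuple[str, str], Set[str]] = {
    ("flow", "time"): ALL_FUNCS,
    ("flow", "other"): ALL_FUNCS,
    ("stock", "time"): ALL_FUNCS - {"sum"},
    ("stock", "other"): ALL_FUNCS,
}

def is_summarizable(measure_kind: str, dimension_kind: str, func: str) -> bool:
    """Return True if the aggregation is allowed by the rule table above."""
    return func in ALLOWED.get((measure_kind, dimension_kind), set())

def aggregate(values: List[float], func: str, measure_kind: str, dimension_kind: str) -> float:
    """Apply the aggregation only when the rule table permits it."""
    if not is_summarizable(measure_kind, dimension_kind, func):
        raise ValueError(f"{func} over a {dimension_kind} dimension is not valid for a {measure_kind} measure")
    if func == "sum":
        return sum(values)
    if func == "avg":
        return sum(values) / len(values)
    if func == "min":
        return min(values)
    if func == "max":
        return max(values)
    return float(len(values))  # "count"
```

For instance, averaging monthly inventory levels is accepted, while summing them over the time dimension raises an error.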
International Conference on Business Informatics Research | 2011
Peter Thanisch; Tapio Niemi; Marko Niinimäki; Jyrki Nummenmaa
When utilising multidimensional OLAP (On-Line Analytic Processing) analysis models in Business Intelligence analysis, it is common that the users need to add new, unanticipated dimensions to the OLAP cube. In a conventional implementation, this would imply frequent re-designs of the cube’s dimensions. We present an alternative method for the addition of new dimensions. Interestingly, the same design method can also be used to import EAV (Entity-Attribute-Value) tables into a cube. EAV tables have earlier been used to represent extremely sparse data in applications such as biomedical databases. Though space-efficient, EAV-representation can be awkward to query.
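To make the querying difficulty concrete, here is a small sketch of pivoting EAV rows into one record per entity and then counting entities by an attribute. It only illustrates the EAV representation; it is not the design method proposed in the paper, and the function names and sample attributes are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Tuple

EavRow = Tuple[str, str, Any]  # (entity, attribute, value)

def pivot_eav(rows: Iterable[EavRow]) -> Dict[str, Dict[str, Any]]:
    """Turn (entity, attribute, value) triples into one attribute dict per entity."""
    wide: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for entity, attribute, value in rows:
        wide[entity][attribute] = value
    return dict(wide)

def count_by(wide: Dict[str, Dict[str, Any]], attribute: str) -> Dict[Any, int]:
    """Count entities per value of one attribute, treating it as a dimension.

    Entities that lack the attribute are grouped under None, which mirrors
    the sparsity that EAV tables are designed to store compactly.
    """
    counts: Dict[Any, int] = defaultdict(int)
    for record in wide.values():
        counts[record.get(attribute)] += 1
    return dict(counts)
```

Any attribute that appears in the EAV rows can be used as a grouping key without changing a schema, which is the flexibility that unanticipated dimensions call for.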
International Conference on Information and Software Technologies | 2013
Peter Thanisch; Tapio Niemi; Jyrki Nummenmaa; Zheying Zhang; Marko Niinimäki; Pertti Saariluoma
Although conceptual data modelers can "get creative" when designing entities and relationships to meet business requirements, they are highly constrained by the business rules which determine the details of how the entities and relationships combine. Typically, there is a delay in realising which business rules might be relevant and a further delay in obtaining an authoritative statement of these rules. We identify circumstances under which viable database designs can be constructed from conceptual data models which are incomplete in the sense that they lack this "infrastructural" detail normally obtained from the business rules. As such detail becomes available, our approach allows the conceptual model to be incrementally refined so that each refinement can be associated with standard database refactorings, minimising the impact on database operations. Our incremental approach facilitates the implementation of the database earlier in the development cycle.
International Symposium on Parallel and Distributed Computing | 2003
Marko Niinimäki; John White; Juha Herrala
The emergence of the Grid computing model is important for disciplines that are computing and data storage-intensive. A generic web-based interface to a computing grid is presented. The package allows the user to submit a high energy physics simulation to Grid resources, track the progress of the job and, upon job completion, display graphical results.
Data Warehousing and Knowledge Discovery | 2013
Peter Thanisch; Jyrki Nummenmaa; Tapio Niemi; Marko Niinimäki
We present a lazy evaluation technique for computing summarized information from dimensional databases. Our technique works well with a very large number of dimensions. While the traditional approach has been to preprocess analysis models from which the user selects the data of interest, in our approach only the cells required by the user are calculated using a cell-by-cell computation strategy.
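The idea of computing only the requested cells can be sketched as follows. This is a minimal illustration under simple assumptions (facts held in memory, sum as the only aggregate, results memoised per cell); the class `LazyCube` and its interface are hypothetical and do not reproduce the technique from the paper.

```python
from typing import Dict, List, Tuple

Fact = Tuple[Dict[str, str], float]  # (dimension members, measure value)

class LazyCube:
    """Compute aggregate cells on demand instead of precomputing the whole cube."""

    def __init__(self, facts: List[Fact]) -> None:
        self._facts = facts
        self._cache: Dict[tuple, float] = {}
        self.computed = 0  # number of cells actually calculated

    def cell(self, **coords: str) -> float:
        """Return the sum of the measure over all facts matching the given coordinates.

        Dimensions that are not mentioned are aggregated over, so the number of
        dimensions in the data does not affect how many cells are materialised.
        """
        key = tuple(sorted(coords.items()))
        if key not in self._cache:
            self._cache[key] = sum(
                value
                for dims, value in self._facts
                if all(dims.get(dim) == member for dim, member in key)
            )
            self.computed += 1
        return self._cache[key]
```

Requesting the same cell twice triggers only one computation, and cells nobody asks for are never calculated.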
International Conference on Business Informatics Research | 2011
Marko Niinimäki; Tapio Niemi; Stephen Martin; Jyrki Nummenmaa; Peter Thanisch
In business intelligence, reporting is perceived by users as the most important area. Here, we present a case study of data integration for reporting within the World Health Organization (WHO).
Archive | 2011
Peter Thanisch; Tapio Niemi; Marko Niinimäki; Jyrki Nummenmaa
In our previous work, we created an ontology for describing both the structure and the content of on-line analytical processing (OLAP) multidimensional cubes using the Resource Description Framework (RDF) XML sublanguage. In the present paper, we describe how we have mapped our ontology onto a multidimensional database schema which can effectively be used to define the data access layer for Microsoft's Business Intelligence Semantic Model (BISM). This mapping facilitates so-called self-service business intelligence by allowing immediate end-user access to the multidimensional data, from desktop applications such as spreadsheet pivot tables, without the need for any special technical expertise to guide the transformation process.
Data Warehousing and OLAP | 2002
Tapio Niemi; Marko Niinimäki; Jyrki Nummenmaa; Peter Thanisch