Marko Banek | Researchain

Publication

Featured research published by Marko Banek.

Data Warehousing and Knowledge Discovery | 2003

Designing Web Warehouses from XML Schemas

Boris Vrdoljak; Marko Banek; Stefano Rizzi

Web warehousing plays a key role in providing managers with up-to-date and comprehensive information about their business domain. At the same time, since XML is now a de facto standard for the exchange of semi-structured data, integrating XML data into web warehouses is a hot topic. In this paper we propose a semi-automated methodology for designing web warehouses from XML sources modeled by XML Schemas. In the proposed methodology, design is carried out by first creating a schema graph, then navigating its arcs in order to derive a correct multidimensional representation. Unlike previous approaches in the literature, particular relevance is given to the problem of detecting shared hierarchies and convergence of dependencies, and of modeling many-to-many relationships. The approach is implemented in a prototype that reads an XML Schema and outputs the logical schema of the warehouse.

Availability, Reliability and Security | 2006

The security issue of federated data warehouses in the area of evidence-based medicine

Nevena Stolba; Marko Banek; A.M. Tjoa

Healthcare organisations practicing evidence-based medicine strive to unite their data assets in order to achieve a wider knowledge base for more sophisticated research as well as to provide a mature decision support service for caregivers. The central point of such an integrated system is a data warehouse, to which all participants have access. Due to the high confidentiality of healthcare data, and the privacy policy of participating organisations, the proposed warehouse is not created physically but as a federated system. Its conceptual model is based on a widely accepted international standard to overcome the heterogeneity of the components. Any disclosure of health data, especially when related to a particular person, could be irreparably harmful, and their protection is even legally prescribed. Depersonalisation and pseudonymisation are used to ensure that personal identities are concealed before data are sent to the federation. In this paper a case study of a federation of health insurance data warehouses (HEWAF) is described. The protection of data privacy and confidentiality in the underlying warehouse is guaranteed through reliable security measures in the federation.

International Journal of Data Warehousing and Mining | 2008

Automated Integration of Heterogeneous Data Warehouse Schemas

Marko Banek; Boris Vrdoljak; A Min Tjoa; Zoran Skočir

A federated data warehouse is a logical integration of data warehouses applicable when physical integration is impossible due to privacy policy or legal restrictions. In healthcare systems, federated data warehouses are a highly feasible source of data for deducing guidelines for evidence-based medicine from the data of different participating institutions. In order to enable the translation of queries in a federated approach, schemas of the federated warehouse and the local warehouses must be matched. In this paper we present a procedure that enables the matching process for schema structures specific to the multidimensional model of data warehouses: facts, measures, dimensions, aggregation levels and dimensional attributes. Similarities between warehouse-specific structures are computed by using linguistic and structural comparison. The calculated values are used to create the necessary mappings.

Data Warehousing and Knowledge Discovery | 2006

Integrating different grain levels in a medical data warehouse federation

Marko Banek; A Min Tjoa; Nevena Stolba

Healthcare organizations practicing evidence-based medicine strive to unite their data resources in order to achieve a wider knowledge base for sophisticated research and a mature decision support service. The central point of such an integrated system is a data warehouse, to which all participants have access. In order to ensure better protection of highly sensitive healthcare data, the warehouse is not created physically, but as a federated system. The paper describes the conceptual design of a health insurance data warehouse federation (HEWAF) aimed at supporting evidence-based medicine. We address a major domain-specific conceptual design issue: the integration of low-grained, time-segmented data into the traditional warehouse, whose basic grain level is higher than that of the time-segmented data. The conceptual model is based on a widely accepted international healthcare standard. We use ontologies of the data warehouse domain, as well as of the healthcare and pharmacy domains, to provide schema matching between the federation and the component warehouses.

Journal of Web Semantics | 2016

CroMatcher: An ontology matching system based on automated weighted aggregation and iterative final alignment

Marko Gulić; Boris Vrdoljak; Marko Banek

In order to perform ontology matching with high accuracy, while at the same time remaining applicable to highly diverse input ontologies, the matching process generally incorporates multiple methods. Each of these methods is aimed at a particular ontology component, such as annotations, structure, properties or instances. Adequately combining these methods is one of the greatest challenges in designing an ontology matching system. In a parallel composition of basic matchers, the ability to dynamically set the weights of the basic matchers in the final output, thus making the weights optimal for the given input, is the key breakthrough for obtaining first-rate matching performance. In this paper we present CroMatcher, an ontology matching system, introducing several novelties to the automated weight calculation process. We apply substitute values for matchers that are inapplicable for the particular case and use thresholds to eliminate low-probability alignment candidates. We compare the alignments produced by the matchers and give less weight to the matchers producing mutually similar alignments, whereas more weight is given to those matchers whose alignment is distinct and rather unique. We also present a new, iterative method for producing a one-to-one final alignment of ontology structures, which is a significant enhancement of similar non-iterative methods proposed in the literature. CroMatcher has been evaluated against other state-of-the-art matching systems at the OAEI evaluation contest. In a large number of test cases it achieved the highest score, which puts it among the state-of-the-art leaders.
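
The weighting idea described in this abstract (less weight for matchers whose alignments resemble each other, more for distinct ones) can be illustrated with a minimal Python sketch. The abstract does not give CroMatcher's formulas, so everything here is an assumption made for illustration: the function names, the use of mean absolute difference as the distinctiveness measure, and the simple normalisation and thresholding.

```python
def mean_abs_diff(a, b):
    """Mean absolute cell-wise difference between two similarity matrices."""
    cells = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(abs(x - y) for x, y in cells) / len(cells)


def distinctiveness_weights(matrices):
    """Weight each matcher by how much its matrix differs from the others.

    Assumption: distinctiveness = average mean absolute difference to all
    other matchers, normalised so the weights sum to 1.
    """
    n = len(matrices)
    if n == 1:
        return [1.0]
    raw = []
    for i in range(n):
        others = [mean_abs_diff(matrices[i], matrices[j]) for j in range(n) if j != i]
        raw.append(sum(others) / len(others))
    total = sum(raw)
    if total == 0:
        # All matchers agree completely, so fall back to equal weights.
        return [1.0 / n] * n
    return [r / total for r in raw]


def aggregate(matrices, weights, threshold=0.0):
    """Weighted sum of the matrices; cells below the threshold are set to 0."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    combined = [
        [sum(w * m[r][c] for w, m in zip(weights, matrices)) for c in range(cols)]
        for r in range(rows)
    ]
    return [[v if v >= threshold else 0.0 for v in row] for row in combined]


# Hypothetical toy input: two matchers that agree and one that differs.
m1 = [[1.0, 0.0], [0.0, 1.0]]
m2 = [[1.0, 0.0], [0.0, 1.0]]
m3 = [[0.0, 1.0], [1.0, 0.0]]
weights = distinctiveness_weights([m1, m2, m3])
combined = aggregate([m1, m2, m3], weights, threshold=0.3)
print(weights)
print(combined)
```

In this toy setup the two identical matchers share a smaller weight each than the distinct one, which mirrors the stated principle but says nothing about how the real system balances distinctiveness against reliability.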

Database and Expert Systems Applications | 2008

Word Sense Disambiguation as the Primary Step of Ontology Integration

Marko Banek; Boris Vrdoljak; A Min Tjoa

The recommended first step of ontology integration is the annotation of ontology components with entries from WordNet or other dictionary sources in order to disambiguate their meaning. This paper presents an approach to automatically disambiguating the meaning of OWL ontology classes by providing sense annotation from WordNet. A class name is disambiguated using the names of the related classes, by comparing the taxonomy of the ontology with the portions of the WordNet taxonomy corresponding to all possible meanings of the class. The equivalence of the taxonomies is expressed by a probability function called the affinity function. We apply two different basic techniques to compute the affinity coefficients: one based on semantic similarity calculation and the other on analyzing overlaps between word definitions and hyponyms. A software prototype is provided to evaluate the approach, as well as to determine which of the two disambiguation techniques produces better results.

International Conference on Data Engineering | 2006

Integrating XML sources into a data warehouse

Boris Vrdoljak; Marko Banek; Zoran Skočir

Since XML has become a standard for data exchange over the Internet, especially in B2B and B2C communication, there is an increasing need to integrate XML data into data warehousing systems. In this paper we propose a methodology for data warehouse design when the data sources are XML Schemas and conforming XML documents. Particular relevance is given to the conceptual and logical multidimensional design. A prototype tool has been developed to verify and support our methodology. Because of the semi-structured nature of XML data, not all the information needed for design can be safely derived from XML Schema. In these situations, XQuery statements are generated by the tool to examine XML documents. The functionality of the tool is demonstrated on a real-life XML Schema that describes purchase orders.

International Conference on Knowledge Based and Intelligent Information and Engineering Systems | 2008

Uncovering the Deep Web: Transferring Relational Database Content and Metadata to OWL Ontologies

Damir Jurić; Marko Banek; Zoran Skočir

Organizing the publicly available Web content into highly systematized domain ontologies is a necessary step in the evolution of the Semantic Web. A large portion of that content, called the deep Web, is stored in relational databases and is not accessible to Web search engines. Incorporation of the deep Web data results in domain ontologies richer both in content and in semantic relations. In this paper we introduce a framework for an automatic mapping of relational database metadata and content to domain ontologies written in OWL. Relational constructs (relations, attributes and primary-foreign key associations) are translated to OWL classes, datatype properties and object properties, respectively. Database tuples become ontology instances. In order to define reference points for integration with other ontologies, the constructed ontologies are further enriched with additional semantics from the WordNet lexical database using word sense disambiguation mechanisms. A software implementation of the approach has been developed and evaluated on case study examples.
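
The mapping rules stated in this abstract (relations to classes, attributes to datatype properties, foreign-key associations to object properties, tuples to instances) can be sketched in Python as follows. This is an illustrative sketch only, not the framework from the paper: the `Table` structure, the `to_triples` function, the naming scheme for properties and individuals, and the toy data are all hypothetical, and the output is a plain list of subject-predicate-object tuples rather than a serialized OWL document.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical, simplified description of one relation."""
    name: str
    columns: list
    primary_key: str
    foreign_keys: dict = field(default_factory=dict)  # column -> referenced table
    rows: list = field(default_factory=list)          # each row is a dict


def to_triples(tables):
    """Translate tables into (subject, predicate, object) triples."""
    triples = []
    for t in tables:
        # Relation -> OWL class
        triples.append((t.name, "rdf:type", "owl:Class"))
        for col in t.columns:
            prop = f"{t.name}.{col}"
            if col in t.foreign_keys:
                # Foreign-key association -> object property
                triples.append((prop, "rdf:type", "owl:ObjectProperty"))
                triples.append((prop, "rdfs:domain", t.name))
                triples.append((prop, "rdfs:range", t.foreign_keys[col]))
            else:
                # Ordinary attribute -> datatype property
                triples.append((prop, "rdf:type", "owl:DatatypeProperty"))
                triples.append((prop, "rdfs:domain", t.name))
        for row in t.rows:
            # Tuple -> instance, named after the table and its key value
            individual = f"{t.name}_{row[t.primary_key]}"
            triples.append((individual, "rdf:type", t.name))
            for col, value in row.items():
                prop = f"{t.name}.{col}"
                if col in t.foreign_keys:
                    target = f"{t.foreign_keys[col]}_{value}"
                    triples.append((individual, prop, target))
                else:
                    triples.append((individual, prop, value))
    return triples


# Hypothetical two-table database.
customer = Table("customer", ["id", "name"], "id",
                 rows=[{"id": 1, "name": "Ana"}])
order = Table("order", ["id", "customer_id", "total"], "id",
              foreign_keys={"customer_id": "customer"},
              rows=[{"id": 10, "customer_id": 1, "total": 99.5}])
triples = to_triples([customer, order])
for triple in triples:
    print(triple)
```

The WordNet enrichment step mentioned in the abstract is left out, since it depends on a disambiguation procedure that the abstract does not specify.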

Data Warehousing and Knowledge Discovery | 2007

Automating the schema matching process for heterogeneous data warehouses

Marko Banek; Boris Vrdoljak; A Min Tjoa; Zoran Skočir

A federated data warehouse is a logical integration of data warehouses applicable when physical integration is impossible due to privacy policy or legal restrictions. In order to enable the translation of queries in a federated approach, schemas of the federated and the local warehouses must be matched. In this paper we present a procedure that enables the matching process for schema structures specific to the multidimensional model of data warehouses: facts, measures, dimensions, aggregation levels and dimensional attributes. Similarities between warehouse-specific structures are computed by using linguistic and structural comparison, and the calculated values are used to create the necessary mappings. We present restriction rules and recommendations for aggregation level matching, which constitutes the most complex part of the process. A software implementation of the entire process is provided in order to perform its verification, as well as to determine the proper selection metric for mapping different multidimensional structures.
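
As a rough illustration of combining linguistic and structural comparison when matching dimensions, the Python sketch below scores name similarity with the standard library's `difflib.SequenceMatcher` and structural similarity by comparing aggregation level names. The abstract does not state which measures the paper uses, so the similarity functions, the equal weighting (`alpha=0.5`), the threshold, and the example dimensions are all assumptions; the paper's restriction rules for aggregation levels are not modeled.

```python
from difflib import SequenceMatcher


def name_sim(a, b):
    """Linguistic similarity stand-in: case-insensitive string similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def structure_sim(levels_a, levels_b):
    """Structural similarity stand-in: for each level of A, take the best
    name similarity among the levels of B, then average."""
    if not levels_a or not levels_b:
        return 0.0
    best = [max(name_sim(a, b) for b in levels_b) for a in levels_a]
    return sum(best) / len(best)


def dimension_sim(dim_a, dim_b, alpha=0.5):
    """Weighted combination of the two similarities (alpha is an assumption)."""
    return (alpha * name_sim(dim_a["name"], dim_b["name"])
            + (1 - alpha) * structure_sim(dim_a["levels"], dim_b["levels"]))


def match(dims_a, dims_b, threshold=0.5):
    """Map each dimension of A to its most similar dimension of B,
    keeping only mappings at or above the threshold."""
    mappings = {}
    for a in dims_a:
        best = max(dims_b, key=lambda b: dimension_sim(a, b))
        if dimension_sim(a, best) >= threshold:
            mappings[a["name"]] = best["name"]
    return mappings


# Hypothetical dimensions of two warehouses.
local_dims = [
    {"name": "Time", "levels": ["day", "month", "year"]},
    {"name": "Patient", "levels": ["patient", "city", "country"]},
]
federated_dims = [
    {"name": "Date", "levels": ["day", "month", "quarter", "year"]},
    {"name": "Patients", "levels": ["patient", "region", "country"]},
]
mappings = match(local_dims, federated_dims)
print(mappings)
```

Here "Time" and "Date" have dissimilar names, and it is the shared aggregation levels that bring the pair together, which is the motivation for using both kinds of comparison.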

Lecture Notes in Computer Science | 2006

Distributed architecture for association rule mining

Marko Banek; Damir Jurić; Ivo Pejaković; Zoran Skočir

Organizations have adopted various data mining techniques to support their decision-making and business processes. However, the mining analysis is not performed and supervised by the end user, the management of the organization, since it requires knowledge of mathematical models as well as expert database administration skills. This paper describes a distributed architecture for association rule mining analysis in the retail area, designed to be used directly by the management of an organization and implemented as a Java web application. The rule discovery algorithm is executed at the database server that hosts the source data warehouse, while the only client tool used is a web browser. The user interactively initiates the rule discovery process through a simple user interface, which is used later to browse, sort and compare the discovered rules.
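
For readers unfamiliar with association rules, the following Python sketch shows the two standard measures, support and confidence, on a toy set of retail transactions. It is a brute-force illustration over item pairs only and is not the rule discovery algorithm from the paper, which runs at the database server; the function names, thresholds and transactions are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Find rules lhs -> rhs between single items.

    Returns (lhs, rhs, support, confidence) tuples sorted by confidence,
    highest first.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return sorted(rules, key=lambda r: r[3], reverse=True)


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rules = pair_rules(baskets)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

Sorting by confidence corresponds to the kind of browsing and comparison of discovered rules that the abstract describes for the user interface.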
