Publication


Featured research published by Ivan Ermilov.


International Conference on Knowledge Engineering and the Semantic Web | 2013

Linked Open Data Statistics: Collection and Exploitation

Ivan Ermilov; Michael Martin; Jens Lehmann; Sören Auer

This demo presents LODStats, a web application for the collection and exploration of Linked Open Data statistics. LODStats consists of two parts: the core, which collects statistics about the LOD cloud, and the LODStats web portal, a front-end where these statistics are published and can be explored. Statistics are published in both human-readable and machine-readable formats, allowing users to consume the data through the web front-end and services and applications to consume it through an API. As an example of the latter, we showcase how to visualize the statistical data with the CubeViz application.


Sprachwissenschaft | 2016

A fine-grained evaluation of SPARQL endpoint federation systems

Muhammad Saleem; Yasar Khan; Ali Hasnain; Ivan Ermilov; Axel-Cyrille Ngonga Ngomo

The Web of Data has grown enormously over the last years. Currently, it comprises a large compendium of interlinked and distributed datasets from multiple domains. Running complex queries on this compendium often requires accessing data from different endpoints within one query. The abundance of datasets and the need for running complex queries have thus motivated a considerable body of work on SPARQL query federation systems, the dedicated means to access data distributed over the Web of Data. However, the granularity of previous evaluations of such systems has not allowed insights to be derived concerning their behavior in the different steps involved in federated query processing. In this work, we perform extensive experiments to compare state-of-the-art SPARQL endpoint federation systems using the comprehensive performance evaluation framework FedBench. In addition to considering the traditional query runtime as an evaluation criterion, we extend the scope of our performance evaluation by considering criteria that have not received much attention in previous studies. In particular, we consider the number of sources selected, the total number of SPARQL ASK requests used, the completeness of answers, and the source selection time. We show that these criteria have a significant impact on the overall query runtime of existing systems. Moreover, we extend FedBench to mirror a highly distributed data environment and assess the behavior of existing systems using the same performance criteria. As a result, we provide a detailed analysis of the experimental outcomes that reveals novel insights for improving current and future SPARQL federation systems.


international conference on semantic systems | 2014

DataID: towards semantically rich metadata for complex datasets

Martin Brümmer; Ciro Baron; Ivan Ermilov; Markus Freudenberg; Dimitris Kontokostas; Sebastian Hellmann

The constantly growing number of Linked Open Data (LOD) datasets creates the need for rich metadata descriptions that enable users to discover, understand and process the available data. This metadata is often created, maintained and stored in diverse data repositories featuring disparate data models that are often unable to provide the metadata necessary to automatically process the datasets described. This paper proposes DataID, a best practice for LOD dataset descriptions that utilizes RDF files hosted together with the datasets, under the same domain. We describe the data model, which is based on the widely used DCAT and VoID vocabularies, as well as supporting tools to create and publish DataIDs, and use cases that show the benefits of providing semantically rich metadata for complex datasets. As a proof of concept, we generated a DataID for the DBpedia dataset, which we present in the paper.


international conference on semantic systems | 2013

User-driven semantic mapping of tabular data

Ivan Ermilov; Sören Auer; Claus Stadler

Governments and public administrations have recently started to publish large amounts of structured data on the Web, mostly in the form of tabular data such as CSV files or Excel sheets. Various tools and projects have been launched that aim to facilitate the lifting of tabular data to semantically structured and linked data. However, none of these tools supports a truly incremental, pay-as-you-go data publication and mapping strategy that enables effort sharing between data owners, community experts and consumers. In this article, we present an approach for enabling the user-driven semantic mapping of large amounts of tabular data. We devise a simple mapping language for tabular data, which is easy to understand even for casual users, but expressive enough to cover the vast majority of potential tabular mapping use cases. We outline a formal approach for mapping tabular data to RDF. Default mappings are automatically created and can be revised by the community using a semantic wiki. The mappings are executed using a sophisticated streaming RDB2RDF conversion. We report on the deployment of our approach at the pan-European data portal PublicData.eu, where we transformed and enriched almost 10,000 datasets accounting for 7.3 billion triples.


international semantic web conference | 2016

LODStats: The Data Web Census Dataset

Ivan Ermilov; Jens Lehmann; Michael Martin; Sören Auer

Over the past years, the size of the Data Web has increased significantly, which makes obtaining general insights into its growth and structure both more challenging and more desirable. The lack of such insights hinders important data management tasks such as quality, privacy and coverage analysis. In this paper, we present the LODStats dataset, which provides a comprehensive picture of the current state of a significant part of the Data Web. LODStats is based on RDF datasets from the data.gov, publicdata.eu and datahub.io data catalogs and, at the time of writing, lists over 9000 RDF datasets. For each RDF dataset, LODStats collects comprehensive statistics and makes these available in a form adhering to the LDSO vocabulary. This analysis has been regularly published and enhanced over the past five years at the public platform lodstats.aksw.org. We give a comprehensive overview of the resulting dataset.


knowledge acquisition, modeling and management | 2012

SlideWiki: elicitation and sharing of corporate knowledge using presentations

Ali Khalili; Sören Auer; Darya Tarasowa; Ivan Ermilov

Presentations play a crucial role in knowledge management within organizations, in particular to facilitate organizational learning and innovation. Much of the corporate strategy, direction and accumulated knowledge within organizations is encapsulated in presentations. In this paper, we investigate the limitations of current presentation tools for semi-structured knowledge representation and sharing within organizations. We address challenges such as the collaborative creation of presentations, tracking changes within them, and sharing and reusing existing presentations. We then present SlideWiki as a crowd-sourcing platform for the elicitation and sharing of corporate knowledge using presentations. With SlideWiki, users can author, collaborate on and arrange slides in organizational presentations by employing Web 2.0 strategies. Presentations can be organized hierarchically, so as to structure them reasonably according to their content. In line with the wiki paradigm, all content in SlideWiki (i.e. slides, decks, themes, diagrams) is versioned, and users can fork and merge presentations in the same way that modern social coding platforms allow. Moreover, SlideWiki supports social networking activities such as following and discussing presentations for effective knowledge management. The article also includes an evaluation of our SlideWiki implementation involving real users.


Linked Open Data | 2014

Lifting Open Data Portals to the Data Web

Sander van der Waal; Krzysztof Węcel; Ivan Ermilov; Valentina Janev; Uroš Milošević; Mark Wainwright

Recently, a large number of open data repositories, catalogs and portals have been emerging in the scientific and government realms. In this chapter, we characterise this newly emerging class of information systems. We describe the key functionality of open data portals, present a conceptual model and showcase the pan-European data portal PublicData.eu as a prominent example. Using examples from Serbia and Poland, we present an approach for lifting the often semantically shallow datasets registered at such data portals to Linked Data in order to make data portals the backbone of a distributed global data warehouse for our information society on the Web.


international conference on web engineering | 2017

The BigDataEurope Platform – Supporting the Variety Dimension of Big Data

Sören Auer; Simon Scerri; Aad Versteden; Erika Pauwels; Angelos Charalambidis; Stasinos Konstantopoulos; Jens Lehmann; Hajira Jabeen; Ivan Ermilov; Gezim Sejdiu; Andreas Ikonomopoulos; Spyros Andronopoulos; Mandy Vlachogiannis; Charalambos Pappas; Athanasios Davettas; Iraklis A. Klampanos; Efstathios Grigoropoulos; Vangelis Karkaletsis; Victor de Boer; Ronald Siebes; Mohamed Nadjib Mami; Sergio Albani; Michele Lazzarini; Paulo Nunes; Emanuele Angiuli; Nikiforos Pittaras; George Giannakopoulos; Giorgos Argyriou; George Stamoulis; George Papadakis

The management and analysis of large-scale datasets, described with the term Big Data, involves the three classic dimensions of volume, velocity and variety. While the former two are well supported by a plethora of software components, the variety dimension is still rather neglected. We present the BDE platform, an easy-to-deploy, easy-to-use and adaptable (cluster-based and standalone) platform for the execution of big data components and tools like Hadoop, Spark, Flink, Flume and Cassandra. The BDE platform was designed based on the requirements gathered from seven of the societal challenges put forward by the European Commission in the Horizon 2020 programme and targeted by the BigDataEurope pilots. As a result, the BDE platform makes it possible to perform a variety of Big Data flow tasks such as message passing, storage, analysis or publishing. To facilitate the processing of heterogeneous data, a particular innovation of the platform is the Semantic Layer, which makes it possible to process RDF data directly and to map and transform arbitrary data into RDF. The advantages of the BDE platform are demonstrated through seven pilots, each focusing on a major societal challenge.


international semantic web conference | 2016

Detecting Similar Linked Datasets Using Topic Modelling

Michael Röder; Axel-Cyrille Ngonga Ngomo; Ivan Ermilov; Andreas Both

The Web of Data is growing continuously with respect to both the size and number of the datasets published. Porting a dataset to five-star Linked Data, however, requires the publisher of this dataset to link it with the already available linked datasets. Given the size and growth of the Linked Data Cloud, the current mostly manual approach used for detecting relevant datasets for linking is obsolete. We study the use of topic modelling for dataset search experimentally and present Tapioca, a linked dataset search engine that provides data publishers with similar existing datasets automatically. Our search engine uses a novel approach for determining the topical similarity of datasets. This approach relies on probabilistic topic modelling to determine related datasets by relying solely on the metadata of datasets. We evaluate our approach on a manually created gold standard and with a user study. Our evaluation shows that our algorithm significantly outperforms a set of comparable baseline algorithms, including standard search engines, by 6% F1-score. Moreover, we show that it can be used on a large real-world dataset with comparable performance.


international semantic web conference | 2017

Distributed Semantic Analytics Using the SANSA Stack

Jens Lehmann; Gezim Sejdiu; Lorenz Bühmann; Patrick Westphal; Claus Stadler; Ivan Ermilov; Simon Bin; Nilesh Chakraborty; Muhammad Saleem; Axel-Cyrille Ngonga Ngomo; Hajira Jabeen

A major research challenge is to perform scalable analysis of large-scale knowledge graphs to facilitate applications like link prediction, knowledge base completion and reasoning. Analytics methods which exploit expressive structures usually do not scale well to very large knowledge bases, and most analytics approaches which do scale horizontally (i.e., can be executed in a distributed environment) work on simple feature-vector-based input. This software framework paper describes the ongoing Semantic Analytics Stack (SANSA) project, which supports expressive and scalable semantic analytics by providing functionality for distributed computing on RDF data.

Collaboration


Dive into Ivan Ermilov's collaborations.

Top Co-Authors

Hajira Jabeen

National University of Computer and Emerging Sciences

Muhammad Saleem

University of Agriculture

Sandro Rautenberg

Midwestern State University

Aad Versteden

Katholieke Universiteit Leuven
