Publication


Featured research published by Sebastian Neumaier.


Journal of Data and Information Quality | 2016

Automated Quality Assessment of Metadata across Open Data Portals

Sebastian Neumaier; Jürgen Umbrich; Axel Polleres

The Open Data movement has become a driver for publicly available data on the Web. More and more data, from governments and public institutions but also from the private sector, are made available online and are mainly published in so-called Open Data portals. However, with the increasing number of published resources, there are a number of concerns with regard to the quality of the data sources and the corresponding metadata, which compromise the searchability, discoverability, and usability of resources. In order to get a more complete picture of the severity of these issues, the present work aims at developing a generic metadata quality assessment framework for various Open Data portals: We treat data portals independently from the portal software frameworks by mapping the specific metadata of three widely used portal software frameworks (CKAN, Socrata, OpenDataSoft) to the standardized Data Catalog Vocabulary metadata schema. We subsequently define several quality metrics, which can be evaluated automatically and in an efficient manner. Finally, we report findings based on monitoring a set of over 260 Open Data portals with 1.1M datasets. This includes the discussion of general quality issues, for example, the retrievability of data, and the analysis of our specific quality metrics.
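The idea of mapping portal-specific metadata to a shared schema and then scoring it can be illustrated with a minimal sketch. The mapping table, the choice of five properties, and the `completeness` metric below are illustrative assumptions and are not taken from the paper; only the CKAN field names and the DCAT/Dublin Core property names are real.

```python
# Hypothetical mapping from CKAN-style metadata keys to DCAT-style properties.
# The selection of keys is an assumption made for illustration only.
CKAN_TO_DCAT = {
    "title": "dct:title",
    "notes": "dct:description",
    "license_id": "dct:license",
    "author": "dct:publisher",
    "metadata_modified": "dct:modified",
}


def to_dcat(ckan_dataset):
    """Translate a CKAN-style record into DCAT-style properties, dropping empty values."""
    return {
        dcat_key: ckan_dataset[ckan_key]
        for ckan_key, dcat_key in CKAN_TO_DCAT.items()
        if ckan_dataset.get(ckan_key) not in (None, "", [])
    }


def completeness(dcat_record):
    """Share of the mapped DCAT properties that carry a non-empty value."""
    return len(dcat_record) / len(CKAN_TO_DCAT)


# Made-up example record: two of the five mapped fields are filled in.
example = {"title": "Budget 2016", "notes": "", "license_id": "cc-by", "author": None}
record = to_dcat(example)
score = completeness(record)
print(record, score)
```

Because the metric is computed on the mapped record rather than on the raw portal output, the same function could in principle score records from any portal software for which a mapping table exists.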


International Semantic Web Conference | 2016

Multi-level Semantic Labelling of Numerical Values

Sebastian Neumaier; Jürgen Umbrich; Josiane Xavier Parreira; Axel Polleres

With the success of Open Data a huge amount of tabular data sources became available that could potentially be mapped and linked into the Web of (Linked) Data. Most existing approaches to “semantically label” such tabular data rely on mappings of textual information to classes, properties, or instances in RDF knowledge bases in order to link – and eventually transform – tabular data into RDF. However, as we will illustrate, Open Data tables typically contain a large portion of numerical columns and/or non-textual headers; therefore solutions that solely focus on textual “cues” are only partially applicable for mapping such data sources. We propose an approach to find and rank candidates of semantic labels and context descriptions for a given bag of numerical values. To this end, we apply a hierarchical clustering over information taken from DBpedia to build a background knowledge graph of possible “semantic contexts” for bags of numerical values, over which we perform a nearest neighbour search to rank the most likely candidates. Our evaluation shows that our approach can assign fine-grained semantic labels, when there is enough supporting evidence in the background knowledge graph. In other cases, our approach can nevertheless assign high level contexts to the data, which could potentially be used in combination with other approaches to narrow down the search space of possible labels.
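A nearest neighbour ranking over bags of numerical values can be sketched as follows. This is a simplified illustration under stated assumptions: the distance used here is the two-sample Kolmogorov-Smirnov statistic, the `background` dictionary is made-up data standing in for a knowledge graph, and the hierarchical clustering step described in the abstract is omitted entirely.

```python
import bisect


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)

    def ecdf(sample, x):
        # Fraction of the sample that is <= x.
        return bisect.bisect_right(sample, x) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def rank_labels(values, background, k=2):
    """Return the k labels whose reference bags are closest to the given values."""
    scored = sorted((ks_distance(values, ref), label) for label, ref in background.items())
    return [label for _, label in scored[:k]]


# Hypothetical labelled reference bags (illustrative values only).
background = {
    "height of person (cm)": [160, 165, 170, 175, 180, 185],
    "population of city": [50_000, 120_000, 300_000, 1_900_000],
    "year": [1990, 1995, 2001, 2010, 2016],
}

candidates = rank_labels([158, 172, 169, 181], background)
print(candidates)
```

A distribution-based distance is used here because a column of numbers carries no textual cue; any other distance between bags of values could be substituted without changing the ranking logic.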


International Conference on Digital Government Research | 2016

Open Data Portal Quality Comparison using AHP

Sylvain Kubler; Jérémy Robert; Yves Le Traon; Jürgen Umbrich; Sebastian Neumaier

In recent years, more and more Open Data has become available and is being used as part of the Open Data movement. However, there are reported issues with the quality of the metadata in data portals and of the data itself. This is a serious risk that could disrupt the Open Data project, as well as e-government initiatives, since data quality needs to be managed to guarantee the reliability of e-government to the public. The first quality assessment frameworks are emerging to evaluate the quality of a given dataset or portal along various dimensions (e.g., information completeness). Nonetheless, a common problem with such frameworks is providing meaningful ranking mechanisms that can integrate several quality dimensions and user preferences (e.g., a portal provider is likely to have different quality preferences than a portal consumer). To address this multi-criteria decision-making problem, our research applies AHP (Analytic Hierarchy Process) to compare 146 active Open Data portals across 44 countries, all powered by the CKAN software.
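To make the AHP step concrete, here is a minimal sketch of how pairwise preference judgements can be turned into weights and then into a portal ranking. It uses the row geometric mean approximation of the AHP priority vector, which may differ from the computation used in the paper; the two quality dimensions, the pairwise judgement, and the portal scores are all hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method."""
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgement: dimension 0 (completeness) is three times as
# important as dimension 1 (openness). The matrix is reciprocal by construction.
pairwise = [
    [1.0, 3.0],
    [1.0 / 3.0, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical per-dimension quality scores in [0, 1] for two portals.
portals = {
    "portal_a": [0.9, 0.2],
    "portal_b": [0.5, 0.8],
}

# Rank portals by the weighted sum of their dimension scores.
ranking = sorted(
    portals,
    key=lambda name: sum(w * s for w, s in zip(weights, portals[name])),
    reverse=True,
)
print(weights, ranking)
```

Changing the pairwise matrix models a different user: a consumer who values openness over completeness would invert the judgement and could obtain a different ranking from the same quality scores.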


European Semantic Web Conference | 2017

Talking Open Data

Sebastian Neumaier; Vadim Savenkov; Svitlana Vakulenko

Enticing users into exploring Open Data remains an important challenge for the whole Open Data paradigm. Standard stock interfaces often used by Open Data portals are anything but inspiring even for tech-savvy users, let alone those without an articulated interest in data science. To address a broader range of citizens, we designed an open data search interface supporting natural language interactions via popular platforms like Facebook and Skype. Our data-aware chatbot answers search requests and suggests relevant open datasets, bringing a fun factor and the potential for viral dissemination to Open Data exploration. The current system prototype is available for Facebook (this https URL) and Skype (this https URL) users.


2016 2nd International Conference on Open and Big Data (OBD) | 2016

Characteristics of Open Data CSV Files

Johann Mitlöhner; Sebastian Neumaier; Jürgen Umbrich; Axel Polleres

This work analyzes an Open Data corpus containing 200K tabular resources with a total file size of 413 GB from a data consumer perspective. Our study shows that ~10% of the resources in Open Data portals are labelled as tabular data, of which only 50% can be considered CSV files. The study inspects the general shape of these tabular data, reports on column and row distribution, and analyses the availability of (multiple) header rows and whether a file contains multiple tables. In addition, we inspect and analyze the table column types, detect missing values, and report on the distribution of the values.
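The kind of per-file profiling described above (shape, column types, missing values) can be sketched with the standard library alone. This is a simplified illustration, not the study's tooling: it assumes a single header row, a single rectangular table, and a known delimiter, and the three type labels are an assumption of this sketch.

```python
import csv
import io


def infer_type(values):
    """Classify a column as 'numeric', 'text', or 'empty' from its non-missing cells."""
    present = [v for v in values if v.strip() != ""]
    if not present:
        return "empty"
    try:
        for v in present:
            float(v)
        return "numeric"
    except ValueError:
        return "text"


def profile_csv(text, delimiter=","):
    """Report row and column counts, inferred column types, and missing cells per column."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = rows[0], rows[1:]  # assumes exactly one header row
    columns = list(zip(*body))        # assumes every row has the same length
    return {
        "rows": len(body),
        "columns": len(header),
        "types": {name: infer_type(col) for name, col in zip(header, columns)},
        "missing": {
            name: sum(1 for v in col if v.strip() == "")
            for name, col in zip(header, columns)
        },
    }


# Made-up sample with one missing population value.
sample = "city,population,area\nVienna,1900000,414.9\nGraz,,127.6\n"
profile = profile_csv(sample)
print(profile)
```

Real Open Data files often violate the assumptions made here (multiple header rows, several tables per file, ragged rows), which is precisely what the study sets out to measure.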


European Semantic Web Conference | 2018

reboting.com: Towards Geo-search and Visualization of Austrian Open Data.

Erich Heil; Sebastian Neumaier

Data portals mainly publish semi-structured, tabular formats that lack semantic descriptions of geo-entities and therefore do not allow exploration and automated visualization of these datasets. Herein, we present a framework to add geo-semantic labels, based on a constructed geo-entity knowledge graph, and a user interface to query and automatically visualize the resources from the Austrian data portals. The web application is available at https://reboting.com/.


Procedia Computer Science | 2018

Geo-Semantic Labelling of Open Data

Sebastian Neumaier; Vadim Savenkov; Axel Polleres

In the past years, Open Data has become a trend among governments to increase transparency and public engagement by opening up national, regional, and local datasets. However, while many of these datasets come in semi-structured file formats, they use different schemata and lack geo-references or semantically meaningful links and descriptions of the corresponding geo-entities. We aim to address this by detecting and establishing links to geo-entities in the datasets found in Open Data catalogs and their respective metadata descriptions and link them to a knowledge graph of geo-entities. This knowledge graph does not yet readily exist, though, or at least, not a single one: so, we integrate and interlink several datasets to construct our (extensible) base geo-entities knowledge graph: (i) the openly available geospatial data repository GeoNames, (ii) the map service OpenStreetMap, (iii) country-specific sets of postal codes, and (iv) the European Union's classification system NUTS. As a second step, this base knowledge graph is used to add semantic labels to the open datasets, i.e., we heuristically disambiguate the geo-entities in CSV columns using the context of the labels and the hierarchical graph structure of our base knowledge graph. Finally, in order to interact with and retrieve the content, we index the datasets and provide a demo user interface. Currently, we have indexed resources from four Open Data portals, and allow search queries for geo-entities as well as full-text matches at http://data.wu.ac.at/odgraph/.
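The disambiguation step, which uses the context of a column together with the hierarchy of the knowledge graph, can be illustrated with a toy sketch. The voting heuristic, the entity identifiers, and the tiny `GEO_ENTITIES` table below are hypothetical and much simpler than the approach described in the paper.

```python
from collections import Counter

# Hypothetical mini knowledge graph: entity id -> (name, parent entity id).
GEO_ENTITIES = {
    "at-vienna": ("Vienna", "at"),
    "us-vienna": ("Vienna", "us"),
    "at-linz": ("Linz", "at"),
    "at-graz": ("Graz", "at"),
    "at": ("Austria", None),
    "us": ("United States", None),
}


def candidates(name):
    """All entity ids whose name matches the cell value exactly."""
    return [eid for eid, (entity_name, _) in GEO_ENTITIES.items() if entity_name == name]


def label_column(cells):
    """Pick, for each cell, the candidate whose parent is most common across the column."""
    parent_votes = Counter(
        GEO_ENTITIES[eid][1] for cell in cells for eid in candidates(cell)
    )
    labels = []
    for cell in cells:
        options = candidates(cell)
        if not options:
            labels.append(None)  # no matching geo-entity in the graph
            continue
        labels.append(max(options, key=lambda eid: parent_votes[GEO_ENTITIES[eid][1]]))
    return labels


labels = label_column(["Vienna", "Linz", "Graz", "Atlantis"])
print(labels)
```

In this example the ambiguous cell "Vienna" resolves to the Austrian entity because the other cells in the column vote for the same parent.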


Companion of the The Web Conference 2018 on The Web Conference 2018 - WWW '18 | 2018

Search, Filter, Fork, and Link Open Data: The ADEQUATe platform: data- and community-driven quality improvements

Sebastian Neumaier; Lörinc Thurnay; Thomas J. Lampoltshammer; Tomás Knap

The present work describes the ADEQUATe platform: a framework to monitor the quality of (Governmental) Open Data catalogs, to re-publish improved and linked versions of the datasets and their respective metadata descriptions, and to include the community in the quality improvement process. The information acquired by the linking and (meta)data improvement steps is then integrated in a semantic search engine. In the paper, we first describe the requirements of the platform, which are based on focus group interviews and a web-based survey. Second, we use these requirements to formulate the goals and show the architecture of the overall platform, and third, we showcase the potential and relevance of the platform to resolve the requirements by describing exemplary user journeys exploring the system. The platform is available at: https://www.adequate.at/


Reasoning Web International Summer School | 2017

Data Integration for Open Data on the Web

Sebastian Neumaier; Axel Polleres; Simon Steyskal; Jürgen Umbrich

In this lecture we will discuss and introduce challenges of integrating openly available Web data and how to solve them. Firstly, while we will address this topic from the viewpoint of Semantic Web research, not all data is readily available as RDF or Linked Data, so we will give an introduction to different data formats prevalent on the Web, namely, standard formats for publishing and exchanging tabular, tree-shaped, and graph data. Secondly, not all Open Data is really completely open, so we will discuss and address issues around licences, terms of usage associated with Open Data, as well as documentation of data provenance. Thirdly, we will discuss (meta-)data quality issues associated with Open Data on the Web and how Semantic Web techniques and vocabularies can be used to describe and remedy them. Fourthly, we will address issues about searchability and integration of Open Data and discuss how far semantic search can help to overcome these. We close by briefly summarizing further issues not covered explicitly herein, such as multi-linguality, temporal aspects (archiving, evolution, temporal querying), as well as how and whether OWL and RDFS reasoning on top of integrated open data could help.


conference on the future of the internet | 2015

Quality Assessment and Evolution of Open Data Portals

Jürgen Umbrich; Sebastian Neumaier; Axel Polleres

Collaboration


Dive into Sebastian Neumaier's collaborations.

Top Co-Authors

Jürgen Umbrich

Vienna University of Economics and Business

Axel Polleres

Vienna University of Economics and Business

Vadim Savenkov

Vienna University of Technology

Sylvain Kubler

University of Luxembourg

Yves Le Traon

University of Luxembourg

Johann Mitlöhner

Vienna University of Economics and Business

Simon Steyskal

Vienna University of Economics and Business

Svitlana Vakulenko

Vienna University of Economics and Business
