Sebastian Bayerl
University of Passau
Publications
Featured research published by Sebastian Bayerl.
International Conference on Data Engineering | 2014
Kai Schlegel; Florian Stegmaier; Sebastian Bayerl; Michael Granitzer; Harald Kosch
While Linked Open Data has grown enormously in volume, there is still no single point of access for querying the more than 200 SPARQL repositories. In this paper we present Balloon Fusion, a SPARQL 1.1 rewriting and query federation service built on crawling and consolidating co-reference relationships in over 100 reachable Linked Data SPARQL endpoints. This process yielded 17.6M co-reference statements, which have been clustered into 8.4M distinct semantic entities and are now available for download for further analysis. The proposed SPARQL rewriting substitutes every URI occurrence with its synonyms and combines this with automatic endpoint selection based on URI origin for comprehensive query federation. While we show the technical feasibility, we also critically reflect on the current status of the Linked Open Data cloud: although it is huge in size, access via SPARQL endpoints is in most cases complicated by a lack of quality of service.
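The URI-substitution idea behind the rewriting step can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the cluster data, URIs, and function names are invented, and each URI found in a query is expanded into a variable bound by a SPARQL 1.1 VALUES clause listing its co-referent synonyms.

```python
# Hypothetical sketch of URI-synonym rewriting: every clustered URI in a
# query is replaced by a fresh variable bound via VALUES to its synonyms.
# Cluster contents and names are invented for illustration only.
import re

# toy co-reference clusters: URI -> known synonym URIs (including itself)
CLUSTERS = {
    "http://dbpedia.org/resource/Passau": [
        "http://dbpedia.org/resource/Passau",
        "http://sws.geonames.org/2855328/",
    ],
}

def rewrite(query: str) -> str:
    """Replace each clustered URI with a variable bound by a VALUES clause."""
    extra = []          # VALUES clauses to inject into the WHERE block
    counter = [0]

    def subst(match):
        uri = match.group(1)
        synonyms = CLUSTERS.get(uri)
        if not synonyms:
            return match.group(0)      # URI not in any cluster: keep as-is
        var = f"?syn{counter[0]}"
        counter[0] += 1
        values = " ".join(f"<{s}>" for s in synonyms)
        extra.append(f"VALUES {var} {{ {values} }}")
        return var

    body = re.sub(r"<([^>]+)>", subst, query)
    # naive injection: assumes the query ends with the WHERE block's brace
    return body.rstrip()[:-1] + " " + " ".join(extra) + " }"

q = "SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Passau> ?p ?o }"
print(rewrite(q))
```

A full federation service would additionally route the rewritten query to endpoints selected by the origin of each synonym URI, which this sketch omits.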
Revised Selected Papers of the First Workshop on Specifying Big Data Benchmarks - Volume 8163 | 2012
Florian Stegmaier; Christin Seifert; Roman Kern; Patrick Höfler; Sebastian Bayerl; Michael Granitzer; Harald Kosch; Stefanie N. Lindstaedt; Belgin Mutlu; Vedran Sabol; Kai Schlegel; Stefan Zwicklbauer
Research depends to a large degree on the availability and quality of primary research data, i.e., data generated through experiments and evaluations. While the Web in general and Linked Data in particular provide a platform and the necessary technologies for sharing, managing and utilizing research data, an ecosystem supporting those tasks is still missing. The vision of the CODE project is the establishment of a sophisticated ecosystem for Linked Data. Here, the extraction of knowledge encapsulated in scientific research papers, along with its public release as Linked Data, serves as the major use case. Further, Visual Analytics approaches empower end users to analyse, integrate and organize data. During these tasks, specific Big Data issues arise.
Knowledge Discovery and Data Mining | 2013
Christin Seifert; Michael Granitzer; Patrick Höfler; Belgin Mutlu; Vedran Sabol; Kai Schlegel; Sebastian Bayerl; Florian Stegmaier; Stefan Zwicklbauer; Roman Kern
Scientific publications constitute an extremely valuable body of knowledge and can be seen as the roots of our civilisation. However, with the exponential growth of written publications, comparing facts and findings between different research groups and communities becomes nearly impossible. In this paper, we present a conceptual approach and a first implementation for creating an open knowledge base of scientific knowledge mined from research publications. This requires extracting facts - mostly empirical observations - from unstructured texts (mainly PDFs). Given the importance of extracting facts with high accuracy and the imprecision of automatic methods, human quality control is of utmost importance. To establish such quality control mechanisms, we rely on intelligent visual interfaces and on a toolset for crowdsourcing fact extraction, text mining and data integration tasks.
IEEE International Conference on Semantic Computing | 2017
Sebastian Bayerl; Michael Granitzer
The RDF Data Cube Vocabulary [1] is a W3C recommendation for publishing multi-dimensional data in a semantic web format. A large and growing number of such data cubes is already available in the linked data cloud. Merging multiple isolated cubes can lead to new insights, as known from traditional data warehousing. In order to access and integrate the decentralized cubes, a suitable ranking and discovery mechanism is needed. We propose a cube discovery approach based on the structure definition of the cubes. Syntactic and semantic properties of the cubes are considered to develop a similarity measure for cubes. We present a graph-based shortest-path algorithm that utilizes the DBpedia category dataset and a Word2Vec [2] model to carry out the discovery process. The structural mapping for the data cubes generated during our ranking process can be leveraged to support the cube integration process. We use a machine learning approach to combine the implemented similarity measures. The evaluation shows that this combination produces the best results and that the Hungarian algorithm [3] is suitable for finding good structural mappings.
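The structural-mapping step can be illustrated as an optimal assignment problem: dimensions of two cubes are matched so that the total pairwise similarity is maximal. The paper uses the Hungarian algorithm for this; the sketch below solves the same assignment by brute force over permutations (stdlib only, fine for tiny cubes). The dimension names and similarity scores are invented for illustration.

```python
# Toy structural mapping between two cubes' dimensions. A real system would
# derive the similarity matrix from e.g. Word2Vec cosine similarity or
# DBpedia category-path distances; these numbers are made up.
from itertools import permutations

dims_a = ["refArea", "refPeriod", "sex"]
dims_b = ["location", "year", "gender"]

sim = [
    [0.9, 0.1, 0.0],   # refArea   vs location / year / gender
    [0.2, 0.8, 0.1],   # refPeriod vs location / year / gender
    [0.0, 0.1, 0.7],   # sex       vs location / year / gender
]

def best_mapping(sim):
    """Return the dimension assignment maximising total similarity."""
    n = len(sim)
    best, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        score = sum(sim[i][perm[i]] for i in range(n))
        if score > best_score:
            best, best_score = perm, score
    return best, best_score

mapping, score = best_mapping(sim)
for i, j in enumerate(mapping):
    print(dims_a[i], "->", dims_b[j])
```

The Hungarian algorithm computes the same optimum in polynomial time, which matters once cubes have more than a handful of dimensions.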
Archive | 2014
Sebastian Bayerl; Michael Granitzer
Data warehousing refers to the technological realisation of analytical data stores together with the corresponding interfaces for their exploration and analysis. Linked Data, in particular through the recently begun development of the RDF Data Cube Vocabulary, opens up new development opportunities for data warehousing technologies and their range of application. This contribution introduces the fundamentals of data warehouses and presents the RDF Data Cube Vocabulary as their Linked Data equivalent. Both serve as the basis for a discussion of the application of RDF data cubes in data warehousing as well as the extension of traditional data warehousing approaches, e.g. through the integration of open data into data warehousing processes.
Extended Semantic Web Conference | 2013
Kai Schlegel; Sebastian Bayerl; Stefan Zwicklbauer; Florian Stegmaier; Christin Seifert; Michael Granitzer; Harald Kosch
A crucial task in a researcher's daily work is the analysis of primary research data to estimate the evolution of certain fields or technologies, e.g. tables in publications or tabular benchmark results. Due to a lack of comparability and reliability of published primary research data, this becomes more and more time-consuming and leads to contradicting facts, as has been shown for ad-hoc retrieval [1]. The CODE project [2] aims at contributing to a Linked Science Data Cloud by integrating unstructured research information with semantically represented research data. Through crowdsourcing techniques, data-centric tasks such as data extraction, integration and analysis, in combination with sustainable data marketplace concepts, will establish a sustainable, high-impact ecosystem.
International Journal of Multimedia Data Engineering and Management | 2012
Kai Schlegel; Florian Stegmaier; Sebastian Bayerl; Harald Kosch; Mario Döller
Along with the tremendous growth of Social Media, the variety of multimedia sharing platforms on the Web is ever growing, while unified retrieval issues remain unsolved. Besides unified retrieval languages and metadata interoperability, a crucial task in such a retrieval environment is query optimization in federated and distributed retrieval scenarios. This work introduces three different dimensions of query optimization that have been integrated into an external multimedia meta-search engine. The main innovations are query execution planning, various query processing strategies, and a multimedia perceptual caching system.
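The perceptual-caching idea can be sketched as follows: query images are reduced to a coarse perceptual hash, so visually near-identical queries map to the same cache key and can reuse a previous result. This is a hypothetical illustration only; the hash scheme, function names, and the tiny 4x4 "images" below are invented and do not reflect the engine's actual implementation.

```python
# Minimal perceptual-cache sketch: an average-hash keys the cache, so a
# slightly brightened copy of an image still produces a cache hit.

def average_hash(pixels):
    """Bit-string hash: 1 where a pixel is above the image's mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

cache = {}

def retrieve(pixels):
    """Return (result, hit_flag), consulting the perceptual cache first."""
    key = average_hash(pixels)
    if key in cache:
        return cache[key], True          # perceptual cache hit
    result = f"results-for-{key}"        # stand-in for a federated retrieval
    cache[key] = result
    return result, False

img = [[10, 10, 200, 200], [10, 10, 200, 200],
       [10, 10, 200, 200], [10, 10, 200, 200]]
# slightly brightened copy: different pixel values, same perceptual hash
img2 = [[12, 12, 205, 205], [12, 12, 205, 205],
        [12, 12, 205, 205], [12, 12, 205, 205]]

print(retrieve(img))    # first query: computed and cached
print(retrieve(img2))   # near-duplicate query: reuses the cached result
```

A production system would combine such caching with the execution planning and processing strategies mentioned above, which this sketch leaves out.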
STCSN-E-Letter | 2015
Christin Seifert; Nils Witt; Sebastian Bayerl; Michael Granitzer; Rene Kaiser; Elisabeth Lex; Peter Kraker
Archive | 2014
Christin Seifert; Jörg Schlötterer; Nils Witt; Timo Borst; Sebastian Bayerl; Andreas Eisenkolb
Lecture Notes in Computer Science | 2013
Kai Schlegel; Sebastian Bayerl; Stefan Zwicklbauer; Florian Stegmaier; Christin Seifert; Michael Granitzer; Harald Kosch; Philipp Cimiano; Miriam Fernández; Vanessa Lopez; Stefan Schlobach; Johanna Völker