Johann Christoph Freytag

Publication

Featured research published by Johann Christoph Freytag.

international semantic web conference | 2009

Executing SPARQL Queries over the Web of Linked Data

Olaf Hartig; Johann Christoph Freytag

The Web of Linked Data forms a single, globally distributed dataspace. Due to the openness of this dataspace, it is not possible to know in advance all data sources that might be relevant for query answering. This openness poses a new challenge that is not addressed by traditional research on federated query processing. In this paper we present an approach to execute SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources based on URIs in the query and in partial results. The URIs are resolved over the HTTP protocol into RDF data which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the still existing challenges.
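
The link-following idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `WEB` dictionary is a hypothetical in-memory stand-in for HTTP lookups, the `ex:` URIs and all function names are invented for illustration, and patterns are evaluated with plain nested loops rather than the non-blocking iterator pipeline the paper proposes.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for the Web: looking up a URI yields RDF-like triples.
WEB: Dict[str, List[Triple]] = {
    "ex:alice": [("ex:alice", "ex:knows", "ex:bob")],
    "ex:bob": [("ex:bob", "ex:name", "Bob")],
}


def is_var(term: str) -> bool:
    return term.startswith("?")


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def lookup(term: str, dataset: Set[Triple], seen: Set[str]) -> None:
    """Simulated URI lookup: add the triples found for `term` to the dataset."""
    if term not in seen:
        seen.add(term)
        dataset.update(WEB.get(term, []))  # unknown terms and literals yield nothing


def execute(patterns: List[Triple]) -> List[Binding]:
    dataset: Set[Triple] = set()
    seen: Set[str] = set()
    # Seed the dataset by looking up the URIs mentioned in the query itself.
    for pattern in patterns:
        for term in pattern:
            if not is_var(term):
                lookup(term, dataset, seen)
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        extended: List[Binding] = []
        for binding in solutions:
            for triple in sorted(dataset):
                new_binding = match(pattern, triple, binding)
                if new_binding is not None:
                    extended.append(new_binding)
        # Follow links: look up every value bound in the partial results.
        for binding in extended:
            for value in binding.values():
                lookup(value, dataset, seen)
        solutions = extended
    return solutions


query = [("ex:alice", "ex:knows", "?p"), ("?p", "ex:name", "?n")]
print(execute(query))
```

In this toy run the second pattern can only be answered because the partial result of the first pattern triggers a lookup of `ex:bob`, which adds the name triple to the queried dataset.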

very large data bases | 2014

The Stratosphere platform for big data analytics

Alexander Alexandrov; Rico Bergmann; Stephan Ewen; Johann Christoph Freytag; Fabian Hueske; Arvid Heise; Odej Kao; Marcus Leich; Ulf Leser; Volker Markl; Felix Naumann; Mathias Peters; Astrid Rheinländer; Matthias J. Sax; Sebastian Schelter; Mareike Höger; Kostas Tzoumas; Daniel Warneke

We present Stratosphere, an open-source software stack for parallel data analysis. Stratosphere brings together a unique set of features that allow the expressive, easy, and efficient programming of analytical applications at very large scale. Stratosphere’s features include “in situ” data processing, a declarative query language, treatment of user-defined functions as first-class citizens, automatic program parallelization and optimization, support for iterative programs, and a scalable and efficient execution engine. Stratosphere covers a variety of “Big Data” use cases, such as data warehousing, information extraction and integration, data cleansing, graph analysis, and statistical analysis applications. In this paper, we present the overall system architecture design decisions, introduce Stratosphere through example queries, and then dive into the internal workings of the system’s components that relate to extensibility, programming model, optimization, and query execution. We experimentally compare Stratosphere against popular open-source alternatives, and we conclude with a research outlook for the next years.

very large data bases | 1999

Quality-driven Integration of Heterogenous Information Systems

Felix Naumann; Ulf Leser; Johann Christoph Freytag

Integrated access to information that is spread over multiple, distributed, and heterogeneous sources is an important problem in many scientific and commercial domains. While much work has been done on query processing and choosing plans under cost criteria, very little is known about the important problem of incorporating the information quality aspect into query planning. In this paper we describe a framework for multidatabase query processing that fully includes the quality of information in many facets, such as completeness, timeliness, accuracy, etc. We seamlessly include information quality into a multidatabase query processor based on a view-rewriting mechanism. We model information quality at different levels to ultimately find a set of high-quality query-answering plans.

cooperative information systems | 2004

Completeness of integrated information sources

Felix Naumann; Johann Christoph Freytag; Ulf Leser

For many information domains there are numerous World Wide Web data sources. The sources vary both in their extension and their intension: They represent different real-world entities with possible overlap and provide different attributes of these entities. Mediator-based information systems allow integrated access to such sources by providing a common schema against which the user can pose queries. Given a query, the mediator must determine which participating sources to access and how to integrate the incoming results. This article describes how to support mediators in their source selection and query planning process. We propose three new merge operators, which formalize the integration of multiple source responses. A completeness model describes the usefulness of a source to answer a query. The completeness measure incorporates both extensional value (called coverage) and intensional value (called density) of a source. We show how to determine the completeness of single sources and of combinations of sources under the new merge operators. Finally, we show how to use the measure for source selection and query planning.
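
As a rough illustration of how coverage and density might combine into one completeness score, consider the following Python sketch. Several points are assumptions rather than statements from the abstract: the size of the real world (`world_size`) is taken as known, density is computed as the fraction of non-null values over a given attribute list, and the two measures are combined as a product. The function names and sample data are hypothetical, and the merge operators are not modeled.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def coverage(rows: List[Row], world_size: int) -> float:
    """Extensional value: share of real-world entities the source represents."""
    return len(rows) / world_size


def density(rows: List[Row], attributes: List[str]) -> float:
    """Intensional value: share of attribute values that are actually filled in."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows for a in attributes if row.get(a) is not None)
    return filled / (len(rows) * len(attributes))


def completeness(rows: List[Row], attributes: List[str], world_size: int) -> float:
    # Assumption: the two measures are combined as a simple product.
    return coverage(rows, world_size) * density(rows, attributes)


# Hypothetical source: 2 of 4 real-world entities, with one missing value.
source: List[Row] = [
    {"title": "A", "year": "1999"},
    {"title": "B", "year": None},
]
score = completeness(source, ["title", "year"], world_size=4)
print(score)
```

Here the source covers half of the assumed world (0.5) and fills three of four attribute values (0.75), so the product is 0.375. A mediator could rank candidate sources by such a score before deciding which ones to query.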

international conference on management of data | 2007

Efficient exploitation of similar subexpressions for query processing

Jingren Zhou; Per-Åke Larson; Johann Christoph Freytag; Wolfgang Lehner

Complex queries often contain common or similar subexpressions, either within a single query or among multiple queries submitted as a batch. If so, query execution time can be improved by evaluating a common subexpression once and reusing the result in multiple places. However, current query optimizers do not recognize and exploit similar subexpressions, even within the same query. We present an efficient, scalable, and principled solution to this long-standing optimization problem. We introduce a light-weight and effective mechanism to detect potential sharing opportunities among expressions. Candidate covering subexpressions are constructed and optimization is resumed to determine which, if any, such subexpressions to include in the final query plan. The chosen subexpression(s) are computed only once and the results are reused to answer other parts of queries. Our solution automatically applies to optimization of query batches, nested queries, and maintenance of multiple materialized views. It is the first comprehensive solution covering all aspects of the problem: detection, construction, and cost-based optimization. Experiments on Microsoft SQL Server show significant performance improvements with minimal overhead.
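
The notion of a covering subexpression that is computed once and reused can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the optimizer technique from the paper: it handles only selections over one table, always shares the scan instead of making a cost-based decision, and all names and data are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


def covering_scan(table: List[Row], predicates: List[Predicate]) -> List[Row]:
    """Scan the table once, keeping rows needed by at least one consumer."""
    return [row for row in table if any(p(row) for p in predicates)]


def answer_batch(table: List[Row], predicates: List[Predicate]) -> List[List[Row]]:
    shared = covering_scan(table, predicates)  # computed only once
    # Each consumer applies its own, narrower predicate to the shared result.
    return [[row for row in shared if p(row)] for p in predicates]


orders: List[Row] = [
    {"id": i, "price": price} for i, price in enumerate([20, 60, 120, 200])
]


def over_50_pred(row: Row) -> bool:
    return row["price"] > 50


def over_100_pred(row: Row) -> bool:
    return row["price"] > 100


over_50, over_100 = answer_batch(orders, [over_50_pred, over_100_pred])
print([row["id"] for row in over_50], [row["id"] for row in over_100])
```

The two selections are similar but not identical, so the shared result uses the disjunction of both predicates, and each query then filters that smaller intermediate result instead of rescanning the base table.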

ACM Transactions on Database Systems | 1989

On the translation of relational queries into iterative programs

Johann Christoph Freytag; Nathan Goodman

This paper investigates the problem of translating set-oriented query specifications into iterative programs. The translation uses techniques of functional programming and program transformation. We present two algorithms that generate iterative programs from algebra-based query specifications. The first algorithm translates query specifications into recursive programs. Those are simplified by sets of transformation rules before the algorithm generates the final iterative form. The second algorithm uses a two-level translation that generates iterative programs faster than the first algorithm. On the first level a small set of transformation rules performs structural simplification before the functional combination on the second level yields the final iterative form.
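
To make the contrast between a set-oriented specification and its iterative form concrete, here is a small Python sketch. It shows only the input and the desired output of such a translation for one query and does not implement either of the paper's algorithms. The relation, attribute names, and functions are hypothetical, and relations are treated as lists without duplicate elimination.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Relation = List[Row]


# Set-oriented specification: each operator consumes and produces a whole relation.
def select(pred: Callable[[Row], bool], relation: Relation) -> Relation:
    return [row for row in relation if pred(row)]


def project(attrs: List[str], relation: Relation) -> Relation:
    return [{a: row[a] for a in attrs} for row in relation]


def query_spec(emp: Relation) -> Relation:
    return project(["id"], select(lambda row: row["salary"] > 50, emp))


# Iterative form: the same query as one loop, with no intermediate relation.
def query_iterative(emp: Relation) -> Relation:
    result: Relation = []
    for row in emp:
        if row["salary"] > 50:
            result.append({"id": row["id"]})
    return result


employees: Relation = [
    {"id": 1, "salary": 40},
    {"id": 2, "salary": 70},
    {"id": 3, "salary": 90},
]
print(query_spec(employees) == query_iterative(employees))
```

The iterative version fuses selection and projection into a single pass over the input, which is the kind of program a translator from algebra expressions would aim to generate.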

conference on advanced information systems engineering | 2005

Query processing using ontologies

Chokri Ben Necib; Johann Christoph Freytag

Recently, the database and AI research communities have paid increased attention to ontologies. The main motivating reason is that ontologies promise solutions for complex problems caused by the lack of a good understanding of the semantics of data in many cases. In particular, ontologies have extensively been used to overcome the interoperability problem during the integration of heterogeneous information sources. Moreover, many efforts have been put into developing ontology based techniques for improving the query answering process in database and information systems. In this paper, we present a new approach for query processing within single (object) relational databases using ontology knowledge. Our goal is to process database queries in a semantically more meaningful way. In fact, our approach shows how an ontology can be effectively exploited to rewrite a user query into another one such that the new query provides more meaningful results satisfying the intention of the user. To this end, we develop a set of transformation rules which rely on semantic information extracted from the ontology associated with the database. In addition, we propose a semantic model and a set of criteria to prove the validity of the transformation results. We also address the necessary mappings between an ontology and its underlying database w.r.t. our framework.

cooperative information systems | 2003

Ontology based query processing in database management systems

Chokri Ben Necib; Johann Christoph Freytag

The use of semantic knowledge in its various forms has become an important aspect in managing data in database and information systems. In the form of integrity constraints, it has been used intensively in query optimization for some time. Similarly, data integration techniques have utilized semantic knowledge to handle heterogeneity for query processing on distributed information sources in a graceful manner. Recently, ontologies have become a “buzz word” for the semantic web and semantic data processing. In fact, they play a central role in facilitating the exchange of data between several sources. In this paper, we present a new approach using ontology knowledge for query processing within a single relational database to extend the result of a query in a semantically meaningful way. We describe how an ontology can be effectively exploited to rewrite a user query into another query such that the new query provides additional meaningful results that satisfy the intention of the user. We outline a set of query transformation rules and use a semantic model to describe the criteria necessary to prove their validity.

acm conference on hypertext | 2012

Foundations of traversal based query execution over linked data

Olaf Hartig; Johann Christoph Freytag

Query execution over the Web of Linked Data has attracted much attention recently. A particularly interesting approach is link traversal based query execution which proposes to integrate the traversal of data links into the creation of query results. Hence, in contrast to traditional query execution paradigms, this approach does not assume a fixed set of relevant data sources beforehand; instead, the traversal process discovers data and data sources on the fly and, thus, enables applications to tap the full potential of the Web. While several authors have studied possibilities to implement the idea of link traversal based query execution and to optimize query execution in this context, no work exists that discusses theoretical foundations of the approach in general. Our paper fills this gap. We introduce a well-defined semantics for queries that may be executed using a link traversal based approach. Based on this semantics we formally analyze properties of such queries. In particular, we study the computability of queries as well as the implications of querying a potentially infinite Web of Linked Data. Our results show that query computation in general is not guaranteed to terminate and that for any given query it is undecidable whether the execution terminates. Furthermore, we define an abstract execution model that captures the integration of link traversal into the query execution process. Based on this model we prove the soundness and completeness of link traversal based query execution and analyze an existing implementation approach.

very large data bases | 2008

Adaptive workflow scheduling under resource allocation constraints and network dynamics

Artin Avanes; Johann Christoph Freytag

Workflow concepts are well suited for scenarios where many distributed entities work together to achieve a common goal. Today, workflows are mostly used as a computerized model for business processes executed in instances in commercial Workflow Management Systems. However, there are many other application domains where computer-supported cooperative work can be captured and organized by workflows. In this paper, we investigate the task of scheduling workflows in self-organizing wireless networks for disaster scenarios. Most research work in the field of workflow scheduling has been driven by temporal and causality constraints. We present an adaptive scheduling algorithm that finds a suitable execution sequence for workflow activities by additionally considering resource allocation constraints and dynamic topology changes. Our approach utilizes a multi-stage distribution algorithm which we extend with techniques to cope with network dynamics.
