André Hernich | Researchain

Publication

Featured research published by André Hernich.

Symposium on Principles of Database Systems | 2006

Randomized computations on large data sets: tight lower bounds

Martin Grohe; André Hernich; Nicole Schweikardt

We study the randomized version of a computation model (introduced in [9, 10]) that restricts random access to external memory and internal memory space. Essentially, this model can be viewed as a powerful version of a data stream model that puts no cost on sequential scans of external memory (as other models for data streams) and, in addition, (like other external memory models, but unlike streaming models), admits several large external memory devices that can be read and written to in parallel. We obtain tight lower bounds for the decision problems set equality, multiset equality, and checksort. More precisely, we show that any randomized one-sided-error bounded Monte Carlo algorithm for these problems must perform Ω(log N) random accesses to external memory devices, provided that the internal memory size is at most O(∜N / log N), where N denotes the size of the input data. From the lower bound on the set equality problem we can infer lower bounds on the worst case data complexity of query evaluation for the languages XQuery, XPath, and relational algebra on streaming data. More precisely, we show that there exist queries in XQuery, XPath, and relational algebra, such that any (randomized) Las Vegas algorithm that evaluates these queries must perform Ω(log N) random accesses to external memory devices, provided that the internal memory size is at most O(∜N / log N).
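For orientation, the three decision problems named in this abstract can be sketched as plain in-memory Python. This is only an illustration of what is being decided: it ignores the external-memory model and the resource bounds entirely, the function names are hypothetical, and the reading of checksort (is the second list the sorted version of the first?) is an assumption about the paper's definition.

```python
from collections import Counter


def set_equality(xs, ys):
    """True if both lists contain the same elements, ignoring multiplicity."""
    return set(xs) == set(ys)


def multiset_equality(xs, ys):
    """True if both lists contain the same elements with the same multiplicity."""
    return Counter(xs) == Counter(ys)


def checksort(xs, ys):
    """True if ys is the sorted version of xs (assumed definition)."""
    return sorted(xs) == list(ys)
```

The lower bounds concern how often an algorithm must jump around in external memory to decide these properties, not whether they are decidable; in main memory, as here, all three are straightforward.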

ACM Transactions on Database Systems | 2011

Closed world data exchange

André Hernich; Leonid Libkin; Nicole Schweikardt

Data exchange deals with translating data structured in some source format into data structured in some target format, given a specification of the relationship between the source and the target and possibly constraints on the target; and answering queries over the target in a way that is semantically consistent with the information in the source. Theoretical foundations of data exchange have been actively explored recently. It was also noticed that the standard semantics for query answering in data exchange may lead to counterintuitive or anomalous answers. In the present article, we explain that this behavior is due to the fact that solutions can contain invented information (information that is not related to the source instance), and that the presence of incomplete information in target instances has been ignored. In particular, proper query evaluation techniques for databases with nulls have not been used, and the distinction between closed and open world semantics has not been made. We present a concept of solutions, called CWA-solutions, that is based on the closed world assumption. For data exchange settings without constraints on the target, the space of CWA-solutions has two extreme points: the canonical universal solution (the maximal CWA-solution) and the core of the universal solutions (the minimal CWA-solution), both of them well studied in data exchange. In the presence of constraints on the target, the core of the universal solutions is still the minimal CWA-solution, but there may be no unique maximal CWA-solution. We show how to define the semantics of query-answering taking into account incomplete information, and show that some of the well-known anomalies go away with the new semantics. The article also contains results on the complexity of query-answering, upper approximations to queries (maybe-answers), and various extensions.

International Conference on Database Theory | 2010

Answering non-monotonic queries in relational data exchange

André Hernich

Relational data exchange deals with translating a relational database instance over some source schema into a relational database instance over some target schema, according to a schema mapping that specifies the relationship between the source data and the target data. Various semantics for answering queries against the target schema exist, each of them suitable for a certain class of queries, and with respect to certain schema mappings. However, for each of these semantics, there are examples that show that it leads to counter-intuitive answers, or that it does not respect logical equivalence of schema mappings. In this article, we study query answering semantics for deductive databases in the context of relational data exchange. Furthermore, we propose a new semantics, called GCWA*-answers semantics, which seems to be well-suited with respect to a number of schema mappings, including schema mappings defined by st-tgds and egds. We show that the GCWA*-answers semantics coincides with the classical certain answers semantics on monotonic queries, and we further explore the data complexity of computing the GCWA*-answers to non-monotonic queries. In particular, we identify a class of schema mappings for which the GCWA*-answers to universal queries can be computed from the core of the universal solutions in polynomial time (data complexity).

Symposium on Principles of Database Systems | 2013

Well-founded semantics for extended datalog and ontological reasoning

André Hernich; Clemens Kupke; Thomas Lukasiewicz; Georg Gottlob

The Datalog± family of expressive extensions of Datalog has recently been introduced as a new paradigm for query answering over ontologies, which captures and extends several common description logics. It extends plain Datalog by features such as existentially quantified rule heads and, at the same time, restricts the rule syntax so as to achieve decidability and tractability. In this paper, we continue the research on Datalog±. More precisely, we generalize the well-founded semantics (WFS), as the standard semantics for nonmonotonic normal programs in the database context, to Datalog± programs with negation under the unique name assumption (UNA). We prove that for guarded Datalog± with negation under the standard WFS, answering normal Boolean conjunctive queries is decidable, and we provide precise complexity results for this problem, namely, in particular, completeness for PTIME (resp., 2-EXPTIME) in the data (resp., combined) complexity.

Journal of the ACM | 2009

Lower bounds for processing data with few random accesses to external memory

Martin Grohe; André Hernich; Nicole Schweikardt

We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situation are the size of the main memory and the number of random accesses to external memory. We note that sequentially streaming data from external memory through main memory is much less prohibitive. We propose an abstract model of this scenario in which we restrict the size of the main memory and the number of random accesses to external memory, but admit arbitrary sequential access. A distinguishing feature of our model is that it allows the usage of unlimited external memory for storing intermediate results, such as several hard disks that can be accessed in parallel. In this model, we prove lower bounds for the problem of sorting a sequence of strings (or numbers), the problem of deciding whether two given sets of strings are equal, and two closely related decision problems. Intuitively, our results say that there is no algorithm for the problems that uses internal memory space bounded by N^(1−ϵ) and at most o(log N) random accesses to external memory, but unlimited “streaming access”, both for writing to and reading from external memory. (Here, N denotes the size of the input and ϵ is an arbitrary constant greater than 0.) We even permit randomized algorithms with one-sided bounded error. We also consider the problem of evaluating database queries and prove similar lower bounds for evaluating relational algebra queries against relational databases and XQuery and XPath queries against XML-databases.

International Conference on Database Theory | 2012

Computing universal models under guarded TGDs

André Hernich

A universal model of a database D and a set Σ of integrity constraints is a database that extends D, satisfies Σ, and is most general in the sense that it contains sound and complete information. Universal models have a number of applications including answering conjunctive queries, and deciding containment of conjunctive queries, with respect to databases with integrity constraints. Furthermore, they are used in slightly modified form as solutions in data exchange. In general, it is undecidable whether a database possesses a universal model, but in the past few years researchers identified various settings where this problem is decidable, and even efficiently solvable. This paper focuses on computing universal models under finite sets of guarded TGDs, non-conflicting keys, and negative constraints. Such constraints generalize inclusion dependencies, and were recently shown to be expressive enough to capture certain members of the DL-Lite family of description logics. The main result is an algorithm that, given a database without null values and a finite set Σ of such constraints, decides whether there is a universal model, and if so, outputs such a model. If Σ is fixed, the algorithm runs in polynomial time. The algorithm can be extended to cope with databases containing nulls; however, in this case, polynomial running time can be guaranteed only for databases with bounded block size.

Logic in Computer Science | 2017

Foundations of information integration under bag semantics

André Hernich; Phokion G. Kolaitis

During the past several decades, the database theory community has successfully investigated several different facets of the principles of database systems, including the development of various data models, the systematic exploration of the expressive power of database query languages, and, more recently, the study of the foundations of information integration via schema mappings. For the most part, all these investigations have been carried out under set semantics, that is, both the database relations and the answers to database queries are sets. In contrast, SQL deploys bag (multiset) semantics and, as a result, theory and practice diverge at this crucial point. Our main goal in this paper is to embark on the development of the foundations of information integration under bag semantics, thus taking the first step towards bridging the gap between theory and practice in this area. Our first contribution is conceptual, namely, we give rigorous bag semantics to GLAV mappings and to the certain answers of conjunctive queries in the context of data exchange and data integration. In fact, we introduce and explore two different versions of bag semantics that, intuitively, correspond to the maximum-based union of bags and to the sum-based union of bags. After this, we establish a number of technical results, including results about the computational complexity of the certain answers of conjunctive queries under bag semantics and about the existence and computation of universal solutions under these two versions of bag semantics. Our results reveal that the adoption of more realistic semantics comes at a price, namely, algorithmic problems in data exchange and data integration that were tractable under set semantics become intractable under bag semantics.
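The two unions of bags mentioned in this abstract can be illustrated with Python's `collections.Counter`, whose `|` operator takes the maximum multiplicity per element and whose `+` operator adds multiplicities. This is a minimal sketch of the two operations only, not of the paper's semantics for GLAV mappings; the relations `r1` and `r2` are made-up example data.

```python
from collections import Counter

# Two bags of one-column tuples; the values are multiplicities (example data).
r1 = Counter({("alice",): 2, ("bob",): 1})
r2 = Counter({("alice",): 1, ("bob",): 3})

# Maximum-based union: each tuple keeps its larger multiplicity.
max_union = r1 | r2

# Sum-based union: multiplicities are added.
sum_union = r1 + r2

# Under set semantics both unions collapse to the same set of tuples.
as_set = set(r1) | set(r2)
```

Here `max_union` holds `("alice",)` twice and `("bob",)` three times, while `sum_union` holds them three and four times respectively; `as_set` contains each tuple once, which is the distinction that set semantics cannot express.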

Computer Science Logic | 2011

L-Recursion and a new Logic for Logarithmic Space

Martin Grohe; Berit Grußien; André Hernich; Bastian Laubner

We extend first-order logic with counting by a new operator that allows it to formalise a limited form of recursion which can be evaluated in logarithmic space. The resulting logic LREC has a data complexity in LOGSPACE, and it defines LOGSPACE-complete problems like deterministic reachability and Boolean formula evaluation. We prove that LREC is strictly more expressive than deterministic transitive closure logic with counting and incomparable in expressive power with symmetric transitive closure logic STC and transitive closure logic (with or without counting). LREC is strictly contained in fixed-point logic with counting FPC. We also study an extension LREC= of LREC that has nicer closure properties and is more expressive than both LREC and STC, but is still contained in FPC and has a data complexity in LOGSPACE. Our main results are that LREC captures LOGSPACE on the class of directed trees and that LREC= captures LOGSPACE on the class of interval graphs.
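As a rough illustration of deterministic reachability, one of the LOGSPACE-complete problems the abstract mentions, the sketch below follows a path from a source node as long as the current node has exactly one successor. It assumes the usual definition (a target is deterministically reachable if such a forced path leads to it); the function name and the edge-list input format are hypothetical. The step counter bounded by the number of nodes stands in for the small working memory a logarithmic-space procedure would use, although the successor dictionary built here is of course not logarithmic in size.

```python
def deterministic_reachable(edges, source, target):
    """Check whether target is reached from source along a path on which
    every node left behind has exactly one outgoing edge (assumed definition)."""
    successors = {}
    nodes = {source, target}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
        nodes.update((u, v))

    current = source
    steps = 0
    while current != target:
        out = successors.get(current, set())
        # Stop if the walk is not forced, or if it has run long enough
        # that it must be cycling without ever meeting the target.
        if len(out) != 1 or steps >= len(nodes):
            return False
        current = next(iter(out))
        steps += 1
    return True
```

For example, with edges `[(1, 2), (2, 3)]` node 3 is deterministically reachable from node 1, but adding the edge `(1, 4)` breaks this because node 1 then has two successors.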

Symposium on Principles of Database Systems | 2017

Dichotomies in Ontology-Mediated Querying with the Guarded Fragment

André Hernich; Carsten Lutz; Fabio Papacchini; Frank Wolter

We study the complexity of ontology-mediated querying when ontologies are formulated in the guarded fragment of first-order logic (GF). Our general aim is to classify the data complexity on the level of ontologies where query evaluation w.r.t. an ontology O is considered to be in PTime if all (unions of conjunctive) queries can be evaluated in PTime w.r.t. O and coNP-hard if at least one query is coNP-hard w.r.t. O. We identify several large and relevant fragments of GF that enjoy a dichotomy between PTime and coNP, some of them additionally admitting a form of counting. In fact, almost all ontologies in the BioPortal repository fall into these fragments or can easily be rewritten to do so. We then establish a variation of Ladner's Theorem on the existence of NP-intermediate problems and use this result to show that for other fragments, there is provably no such dichotomy. Again for other fragments (such as full GF), establishing a dichotomy implies the Feder-Vardi conjecture on the complexity of constraint satisfaction problems. We also link these results to Datalog-rewritability and study the decidability of whether a given ontology enjoys PTime query evaluation, presenting both positive and negative results.

Logical Methods in Computer Science | 2013