Alan Nash
University of California, San Diego
Publication
Featured research published by Alan Nash.
symposium on principles of database systems | 2008
Alin Deutsch; Alan Nash; Jeffrey B. Remmel
We revisit the standard chase procedure, studying its properties and applicability to classical database problems. We settle (in the negative) the open problem of decidability of termination of the standard chase, and we provide sufficient termination conditions which are strictly less over-conservative than the best previously known. We investigate the adequacy of the standard chase for checking query containment under constraints, constraint implication and computing certain answers in data exchange, gaining a deeper understanding by separating the algorithm from its result. We identify the properties of the chase result that are essential to the above applications, and we introduce the more general notion of F-universal model set, which supports query and constraint languages that are closed under a class F of mappings. By choosing F appropriately, we extend prior results to existential first-order queries and ∀∃-first-order constraints. We show that the standard chase is incomplete for finding universal model sets, and we introduce the extended core chase which is complete, i.e., finds an F-universal model set when it exists. A key advantage of the new chase is that the same algorithm can be applied for all mapping classes F of interest, simply by modifying the set of constraints given as input. Even when restricted to the typical input in prior work, the new chase supports certain answer computation and containment/implication tests in strictly more cases than the incomplete standard chase.
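As a rough illustration of the standard chase discussed in this abstract, the sketch below applies tuple-generating dependencies to a set of facts, inventing a fresh labeled null for each existential variable. It is not the paper's extended core chase. The data encoding (variables as strings starting with "?", nulls named "_N0", "_N1", ...) and the names `matches`, `chase`, and `max_steps` are assumptions made for this example. Because termination is undecidable, the sketch simply gives up after a step budget.

```python
from itertools import count

def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")

def matches(atoms, facts, binding=None):
    """Yield every extension of `binding` that maps all atoms onto facts."""
    binding = binding or {}
    if not atoms:
        yield dict(binding)
        return
    (rel, terms), rest = atoms[0], atoms[1:]
    for fact_rel, values in facts:
        if fact_rel != rel or len(values) != len(terms):
            continue
        new = dict(binding)
        ok = True
        for term, value in zip(terms, values):
            if is_var(term):
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from matches(rest, facts, new)

def chase(facts, tgds, max_steps=100):
    """Run standard chase steps; return (facts, finished_within_budget)."""
    facts = set(facts)
    fresh = count()
    for _ in range(max_steps):
        for body, head in tgds:
            # A trigger is a body match that cannot be extended to the head.
            trigger = next(
                (h for h in matches(body, facts)
                 if next(matches(head, facts, h), None) is None),
                None,
            )
            if trigger is not None:
                break
        else:
            return facts, True  # no applicable trigger: all tgds are satisfied
        h = dict(trigger)
        for _, terms in head:
            for term in terms:
                if is_var(term) and term not in h:
                    h[term] = f"_N{next(fresh)}"  # fresh labeled null
        for rel, terms in head:
            facts.add((rel, tuple(h[t] if is_var(t) else t for t in terms)))
    return facts, False  # budget exhausted without confirming termination

# Emp(x) -> exists y. Mgr(x, y): one step suffices.
emp_tgd = ([("Emp", ("?x",))], [("Mgr", ("?x", "?y"))])
result, done = chase({("Emp", ("alice",))}, [emp_tgd])

# Person(x) -> exists y. Parent(x, y), Person(y): never terminates.
loop_tgd = ([("Person", ("?x",))], [("Parent", ("?x", "?y")), ("Person", ("?y",))])
loop_result, loop_done = chase({("Person", ("a",))}, [loop_tgd], max_steps=5)
```

The second example keeps generating new nulls, so `loop_done` is `False` once the budget of five steps is used up.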
very large data bases | 2006
Philip A. Bernstein; Todd J. Green; Sergey Melnik; Alan Nash
Mapping composition is a fundamental operation in metadata driven applications. Given a mapping over schemas σ1 and σ2 and a mapping over schemas σ2 and σ3, the composition problem is to compute an equivalent mapping over σ1 and σ3. We describe a new composition algorithm that targets practical applications. It incorporates view unfolding. It eliminates as many σ2 symbols as possible, even if not all can be eliminated. It covers constraints expressed using arbitrary monotone relational operators and, to a lesser extent, non-monotone operators. And it introduces the new technique of left composition. We describe our implementation, explain how to extend it to support user-defined operators, and present experimental results which validate its effectiveness.
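For orientation, the composition problem stated in this abstract is usually read semantically, with each mapping viewed as a binary relation on database instances. The notation below ($m_{12}$, $m_{23}$, and instances $A$, $B$, $C$) is assumed for illustration and is not taken from the abstract.

```latex
m_{12} \circ m_{23} \;=\; \{\, (A, C) \;\mid\; \exists B \ \text{over } \sigma_2 :\ (A, B) \in m_{12} \ \text{and}\ (B, C) \in m_{23} \,\}
```

Under this reading, a composition algorithm looks for constraints over σ1 and σ3 alone whose satisfying pairs are exactly this set.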
Theoretical Computer Science | 2007
Alin Deutsch; Bertram Ludäscher; Alan Nash
We study the problem of rewriting queries using views in the presence of access patterns, integrity constraints, disjunction, and negation. We provide asymptotically optimal algorithms for finding minimal containing and maximal contained rewritings and for deciding whether an exact rewriting exists. We show that rewriting queries using views in this case reduces (a) to rewriting queries with access patterns and constraints without views and also (b) to rewriting queries using views under constraints without access patterns. We show how to solve (a) directly and how to reduce (b) to rewriting queries under constraints only (semantic optimization). These reductions provide two separate routes to a unified solution for all three problems, based on an extension of the relational chase theory to queries and constraints with disjunction and negation. We also handle equality and arithmetic comparisons.
ACM Transactions on Database Systems | 2010
Alan Nash; Luc Segoufin; Victor Vianu
We investigate the question of whether a query Q can be answered using a set V of views. We first define the problem in information-theoretic terms: we say that V determines Q if V provides enough information to uniquely determine the answer to Q. Next, we look at the problem of rewriting Q in terms of V using a specific language. Given a view language V and query language Q, we say that a rewriting language R is complete for V-to-Q rewritings if every Q ∈ Q can be rewritten in terms of V ∈ V using a query in R, whenever V determines Q. While query rewriting using views has been extensively investigated for some specific languages, the connection to the information-theoretic notion of determinacy, and the question of completeness of a rewriting language have received little attention. In this article we investigate systematically the notion of determinacy and its connection to rewriting. The results concern decidability of determinacy for various view and query languages, as well as the power required of complete rewriting languages. We consider languages ranging from first-order to conjunctive queries.
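The information-theoretic notion of determinacy described in this abstract can be written out as a formula. Here $D_1$ and $D_2$ range over database instances, and $V(D)$ denotes the result of evaluating all views in V on D; these symbols are introduced for this illustration.

```latex
V \text{ determines } Q
\quad\Longleftrightarrow\quad
\forall D_1, D_2 :\ \; V(D_1) = V(D_2) \;\Longrightarrow\; Q(D_1) = Q(D_2)
```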
Journal of the ACM | 2008
Georg Gottlob; Alan Nash
Data exchange deals with inserting data from one database into another database having a different schema. Fagin et al. [2005] have shown that among the universal solutions of a solvable data exchange problem, there exists—up to isomorphism—a unique most compact one, “the core”, and have convincingly argued that this core should be the database to be materialized. They stated as an important open problem whether the core can be computed in polynomial time in the general setting where the mapping between the source and target schemas is given by source-to-target constraints that are arbitrary tuple generating dependencies (tgds) and target constraints consisting of equality generating dependencies (egds) and a weakly acyclic set of tgds. In this article, we solve this problem by developing new methods for efficiently computing the core of a universal solution. This positive result shows that data exchange based on cores is feasible and applicable in a very general setting. In addition to our main result, we use the method of hypertree decompositions to derive new algorithms and upper bounds for query containment checking and computing cores of arbitrary database instances. We also show that computing the core of a data exchange problem is fixed-parameter intractable with respect to a number of relevant parameters, and that computing cores is NP-complete if the rule bodies of target tgds are augmented by a special predicate that distinguishes a null value from a constant data value.
international conference on database theory | 2007
Alan Nash; Luc Segoufin; Victor Vianu
Suppose we are given a set of exact conjunctive views V and a conjunctive query Q. Suppose we wish to answer Q using V, but the classical test for the existence of a conjunctive rewriting of Q using V answers negatively. What can we conclude: (i) that there is no way Q can be answered using V, or (ii) that a more powerful rewriting language may be needed? This has been an open question, with conventional wisdom favoring (i). Surprisingly, we show that the right answer is actually (ii). That is, even if V provides enough information to answer Q, it may not be possible to rewrite Q in terms of V using just conjunctive queries – in fact, no monotonic language is sufficiently powerful. We also exhibit several well-behaved classes of conjunctive views and queries for which conjunctive rewritings remain sufficient. This continues a previous investigation of rewriting and its connection to semantic determinacy, for various query and view languages.
international conference on database theory | 2010
Marcelo Arenas; Ronald Fagin; Alan Nash
It is known that the composition of schema mappings, each specified by source-to-target tgds (st-tgds), can be specified by a second-order tgd (SO tgd). We consider the question of what happens when target constraints are allowed. Specifically, we consider the question of specifying the composition of standard schema mappings (those specified by st-tgds, target egds, and a weakly-acyclic set of target tgds). We show that SO tgds, even with the assistance of arbitrary source constraints and target constraints, cannot specify in general the composition of two standard schema mappings. Therefore, we introduce source-to-target second-order dependencies (st-SO dependencies), which are similar to SO tgds, but allow equations in the conclusion. We show that st-SO dependencies (along with target egds and target tgds) are sufficient to express the composition of every finite sequence of standard schema mappings, and further, every st-SO dependency specifies such a composition. In addition to this expressive power, we show that st-SO dependencies enjoy other desirable properties. In particular, they have a polynomial-time chase that generates a universal solution. This universal solution can be used to find the certain answers to unions of conjunctive queries in polynomial time. It is easy to show that the composition of an arbitrary number of standard schema mappings is equivalent to the composition of only two standard schema mappings. We show that surprisingly, the analogous result holds also for schema mappings specified by just st-tgds (no target constraints). That is, the composition of an arbitrary number of such schema mappings is equivalent to the composition of only two such schema mappings. This is proven by showing that every SO tgd is equivalent to an unnested SO tgd (one where there is no nesting of function symbols). 
The language of unnested SO tgds is quite natural, and we show that unnested SO tgds are capable of specifying the composition of an arbitrary number of schema mappings, each specified by st-tgds. Similarly, we prove unnesting results for st-SO dependencies, with the same types of consequences. The collapsing result for SO tgds gives us two alternative ways to deal with the composition of multiple schema mappings specified by st-tgds. First, we can replace the composition by a single schema mapping, specified by an unnested SO tgd. Second, we can replace the composition by the composition of only two schema mappings, each specified by st-tgds. A similar comment holds for the composition of standard schema mappings.
symposium on principles of database systems | 2004
Alan Nash; Bertram Ludäscher
We study the problem of answering queries over sources with limited access patterns. Given a first-order query Q, the problem is to decide whether there is an equivalent query which can be executed while observing the access-pattern restrictions. If so, we say that Q is feasible. We define feasibility for first-order queries (previous definitions handled only some existential cases) and characterize the complexity of many first-order query classes. For each of them, we show that deciding feasibility is as hard as deciding containment. Since feasibility is undecidable in many cases and hard to decide in some others, we also define an approximation to it which can be computed in NP for any first-order query and in P for unions of conjunctive queries with negation. Finally, we outline a practical overall strategy for processing first-order queries under limited access patterns.
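To make the idea of access-pattern restrictions concrete, the sketch below greedily reorders the atoms of a positive conjunctive query so that every call has its input positions bound. It is only a sufficient check: a query that can be reordered this way is feasible in the sense of the abstract, but the sketch does not decide feasibility and is not the paper's algorithm. The encoding (patterns as strings of "i" and "o", variables starting with "?") and the name `executable_order` are assumptions made for this example.

```python
def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")

def executable_order(atoms, patterns):
    """Return the atoms in an order where each call has its inputs bound, or None.

    `patterns` maps a relation name to a list of access patterns, each a
    string with one character per position: "i" (input) or "o" (output).
    """
    bound = set()
    remaining = list(atoms)
    plan = []
    while remaining:
        for atom in remaining:
            rel, terms = atom
            # Callable if some pattern has every input position filled by
            # a constant or an already bound variable.
            if any(
                all(mode == "o" or not is_var(t) or t in bound
                    for mode, t in zip(pattern, terms))
                for pattern in patterns.get(rel, [])
            ):
                break
        else:
            return None  # no remaining atom can be called
        remaining.remove(atom)
        plan.append(atom)
        bound.update(t for t in terms if is_var(t))
    return plan

# Hypothetical sources: Book needs an ISBN as input, Catalog lists ISBNs freely.
patterns = {"Book": ["io"], "Catalog": ["o"]}
query = [("Book", ("?i", "?a")), ("Catalog", ("?i",))]
plan = executable_order(query, patterns)
stuck = executable_order([("Book", ("?i", "?a"))], patterns)
```

Here `plan` calls Catalog first to obtain ISBNs and then Book, while `stuck` is `None` because nothing supplies the required ISBN input.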
international conference on database theory | 2005
Alan Nash; Jeffrey B. Remmel; Victor Vianu
The existence of a language expressing precisely the PTIME queries on arbitrary structures remains the central open problem in the theory of database query languages. As it turns out, two variants of this question have been formulated. Surprisingly, despite the importance of the problem, the relationship between these variants has not been systematically explored. A first contribution of the present paper is to revisit the basic definitions and clarify the connection between these two variants. We then investigate two relaxations to the original problem that appear as tempting alternatives in the absence of a language for the PTIME queries. The first consists in trying to express the PTIME queries using a richer language that can also express queries beyond PTIME, but for which there exists a query processor evaluating all PTIME queries in PTIME. The second approach, studied by many researchers, is to focus on PTIME properties on restricted sets of graphs. Our results are mostly negative, and point out limitations to both approaches. Finally, we turn to a natural class of languages that we call finitely generated, whose syntax is obtained by applying a fixed set of constructors to a given set of building blocks. We identify a broad class of such languages that cannot express all the PTIME queries.
Journal of the ACM | 2010
Ronald Fagin; Alan Nash
A schema mapping is a specification that describes how data structured under one schema (the source schema) is to be transformed into data structured under a different schema (the target schema). The notion of an inverse of a schema mapping is subtle, because a schema mapping may associate many target instances with each source instance, and many source instances with each target instance. In PODS 2006, Fagin defined a notion of the inverse of a schema mapping. This notion is tailored to the types of schema mappings that commonly arise in practice (those specified by “source-to-target tuple-generating dependencies”, or s-t tgds). We resolve the key open problem of the complexity of deciding whether there is an inverse. We also explore a number of interesting questions, including: What is the structure of an inverse? When is the inverse unique? How many nonequivalent inverses can there be? When does an inverse have an inverse? How big must an inverse be? Surprisingly, these questions are all interrelated. We show that for schema mappings M specified by full s-t tgds (those with no existential quantifiers), if M has an inverse, then it has a polynomial-size inverse of a particularly nice form, and there is a polynomial-time algorithm for generating it. We introduce the notion of “essential conjunctions” (or “essential atoms” in the full case), and show that they play a crucial role in the study of inverses. We use them to give greatly simplified proofs of some known results about inverses. What emerges is a much deeper understanding about this fundamental and complex operator.