Val Tannen | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Val Tannen is active.

Explore More

Publication

Featured researches published by Val Tannen.

symposium on principles of database systems | 2007

Provenance semirings

Todd J. Green; Grigoris Karvounarakis; Val Tannen

We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why-provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings containment of conjunctive queries is the same as for standard set semantics.

Frontiers in Plant Science | 2011

The iPlant Collaborative: Cyberinfrastructure for Plant Biology

Stephen A. Goff; Matthew W. Vaughn; Sheldon J. McKay; Eric Lyons; Ann E. Stapleton; Damian Gessler; Naim Matasci; Liya Wang; Matthew R. Hanlon; Andrew Lenards; Andy Muir; Nirav Merchant; Sonya Lowry; Stephen A. Mock; Matthew Helmke; Adam Kubach; Martha L. Narro; Nicole Hopkins; David Micklos; Uwe Hilgert; Michael Gonzales; Chris Jordan; Edwin Skidmore; Rion Dooley; John Cazes; Robert T. McLay; Zhenyuan Lu; Shiran Pasternak; Lars Koesterke; William H. Piel

The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanitys projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure enabling plant scientists to perform complex analyses on large datasets without the need to master the command-line or high-performance computational services.

international conference on database theory | 1995

Principles of programming with complex objects and collection types

Peter Buneman; Shamim A. Naqvi; Val Tannen; Limsoon Wong

Abstract We present a new principle for the development of database query languages that the primitive operations should be organized around types. Viewing a relational database as consisting of sets of records, this principle dectates that we should investigate separately operations for records and sets. There are two immediate advantages of this approach, which is partly inspired by basic ideas from category theoryl. First, it provides a language for structures in which record and set types may be freely combined: nested relations or complex objects. Second, the fundamental operations for sets are closely related to those for other “collection types” such as bags or lists, and this suggests how database languages may be uniformly extended to these new types. the most general operation on sets, that of structural recursion , is one in which not all programs are well-defined. In looking for limited forms of this operation that always give rise to well-defined operations, we find a number of close connection with exiting database languages, notably those developed for complex objects. Moreover, even though the general paradigm of structural recursion is shown to be no more expressive than one of the existing languages for complex objects, it possesses certain properties of uniformity that make it a better candidate for an efficient, practical language. Thus rather than developing query languages by extending, for example, relational calculus, we advocate a very powerful paradigm in which a number of well-known languages are to be found as natural sublanguages.

international conference on database theory | 2003

Reformulation of XML Queries and Constraints

Alin Deutsch; Val Tannen

We state and solve the query reformulation problem for XML publishing in a general setting that allows mixed (XML and relational) storage for the proprietary data and exploits redundancies (materialized views, indexes and caches) to enhance performance. The correspondence between published and proprietary schemas is specified by views in both directions, and the same algorithm performs rewriting-with-views, composition-with-views, or the combined effect of both, unifying the Global-As-View and Local-As-View approaches to data integration. We prove a completeness theorem which guarantees that under certain conditions, our algorithm will find a minimal reformulation if one exists. Moreover, we identify conditions when this algorithm achieves optimal complexity bounds. We solve the reformulation problem for constraints by exploiting a reduction to the problem of query reformulation.

international conference on database theory | 1992

Naturally Embedded Query Languages

Val Tannen; Peter Buneman; Limsoon Wong

We investigate the properties of a simple programming language whose main computational engine is structural recursion on sets. We describe a progression of sublanguages in this paradigm that (1) have increasing expressive power, and (2) illustrate robust conceptual restrictions thus exhibiting interesting additional properties. These properties suggest that we consider our sublanguages as candidates for “query languages”. Viewing query languages as restrictions of our more general programming language has several advantages. First, there is no “impedance mismatch” problem; the query languages are already there, so they share common semantic foundation with the general language. Second, we suggest a uniform characterization of nested relational and complex-object algebras in terms of some surprisingly simple operators;and we can make comparisons of expressiveness in a general framework. Third, we exhibit differences in expressive power that are not always based on complexity arguments, but use the idea that a query in one language may not be polymorphically expressible in another. Fourth, ideas of category theory can be profitably used to organize semantics and syntax, in particular our minimal (core) language is a well-understood categorical construction: a cartesian category with a strong monad on it. Finally, we bring out an algebraic perspective, that is, our languages come with equational theories, and categorical ideas can be used to derive a number of rather general identities that may serve as optimizations or as techniques for discovering optimizations.

extending database technology | 2006

Models for incomplete and probabilistic information

Todd J. Green; Val Tannen

We discuss, compare and relate some old and some new models for incomplete and probabilistic databases. We characterize the expressive power of c-tables over infinite domains and we introduce a new kind of result, algebraic completion, for studying less expressive models. By viewing probabilistic models as incompleteness models with additional probability information, we define completeness and closure under query languages of general probabilistic database models and we introduce a new such model, probabilistic c-tables, that is shown to be complete and closed under the relational algebra.

very large data bases | 2003

MARS: a system for publishing XML from mixed and redundant storage

Alin Deutsch; Val Tannen

We present a system for publishing as XML data from mixed (relational+XML) proprietary storage, while supporting redundancy in storage for tuning purposes. The correspondence between public and proprietary schemas is given by a combination of LAV-and GAV-style views expressed in XQuery. XML and relational integrity constraints are also taken into consideration. Starting with client XQueries formulated against the public schema the system achieves the combined effect of rewriting-with-views, composition-with-views and query minimization under integrity constraints to obtain optimal reformulations against the proprietary schema. The paper focuses on the engineering and the experimental evaluation of the MARS system.

international conference on database theory | 2009

Reconcilable differences

Todd J. Green; Zachary G. Ives; Val Tannen

Exact query reformulation using views in positive relational languages is well understood, and has a variety of applications in query optimization and data sharing. Generalizations to larger fragments of the relational algebra (RA) --- specifically, support for the difference operator --- would increase the options available for query reformulation, and also apply to view adaptation (updating a materialized view in response to a modified view definition) and view maintenance. Unfortunately, most questions about queries become undecidable in the presence of difference/negation. We present a novel way of managing this difficulty via an excursion through a non-standard semantics, Z-relations, where tuples are annotated with positive or negative integers. We show that under Z-semantics RA queries have a normal form as a single difference of positive queries and this leads to the decidability of equivalence. In most real-world settings with difference, it is possible to convert the queries to this normal form. We give a sound and complete algorithm that explores all reformulations of an RA query (under Z-semantics) using a set of RA views, finitely bounding the search space with a simple and natural cost model. We investigate related complexity questions, and we also extend our results to queries with built-in predicates. Z-relations are interesting in their own right because they capture updates and data uniformly. However, our algorithm turns out to be sound and complete also for bag semantics, albeit necessarily only for a subclass of RA. This subclass turns out to be quite large and covers generously the applications of interest to us. We also show a subclass of RA where reformulation and evaluation under Z-semantics can be combined with duplicate elimination to obtain the answer under set semantics.

international conference on management of data | 2006

Query reformulation with constraints

Alin Deutsch; Lucian Popa; Val Tannen

Let Σ<inf>1</inf>, Σ<inf>2</inf> be two schemas, which may overlap, C be a set of constraints on the joint schema Σ<inf>1</inf> ∪ Σ<inf>2</inf>, and q<inf>1</inf> be a Σ<inf>1</inf>-query. An (equivalent) reformulation of q<inf>1</inf> in the presence of C is a Σ<inf>2</inf>-query, q<inf>2</inf>, such that q<inf>2</inf> gives the same answers as q<inf>1</inf> on any Σ<inf>1</inf> ∪ Σ<inf>2</inf>-database instance that satisfies C. In general, there may exist multiple such reformulations and choosing among them may require, for example, a cost model.

symposium on principles of database systems | 2008

Annotated XML: queries and provenance

J. Nathan Foster; Todd J. Green; Val Tannen

We present a formal framework for capturing the provenance of data appearing in XQuery views of XML. Building on previous work on relations and their (positive) query languages, we decorate unordered XML with annotations from commutative semirings and show that these annotations suffice for a large positive fragment of XQuery applied to this data. In addition to tracking provenance metadata, the framework can be used to represent and process XML with repetitions, incomplete XML, and probabilistic XML, and provides a basis for enforcing access control policies in security applications. Each of these applications builds on our semantics for XQuery, which we present in several steps: we generalize the semantics of the Nested Relational Calculus (NRC) to handle semiring-annotated complex values, we extend it with a recursive type and structural recursion operator for trees, and we define a semantics for XQuery on annotated XML by translation into this calculus.

Explore More