Publication


Featured research published by Adriane Chapman.


very large data bases | 2002

TIMBER: A native XML database

H. V. Jagadish; Shurug Al-Khalifa; Adriane Chapman; Laks V. S. Lakshmanan; Andrew Nierman; Stelios Paparizos; Jignesh M. Patel; Divesh Srivastava; Nuwee Wiwatwattana; Yuqing Wu; Cong Yu

This paper describes the overall design and architecture of the Timber XML database system currently being implemented at the University of Michigan. The system is based upon a bulk algebra for manipulating trees, and natively stores XML. New access methods have been developed to evaluate queries in the XML context, and new cost estimation and query optimization techniques have also been developed. We present performance numbers to support some of our design decisions. We believe that the key intellectual contribution of this system is a comprehensive set-at-a-time query processing ability in a native XML store, with all the standard components of relational query processing, including algebraic rewriting and a cost-based optimizer.
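
To give a concrete feel for the kind of tree-pattern ("twig") query that a native XML store evaluates set-at-a-time, the sketch below uses Python's standard-library ElementTree as a stand-in; it is not TIMBER's API, and the document and query are invented for the example.

    # Illustrative stand-in for set-at-a-time twig matching over XML,
    # using Python's standard ElementTree; document and query are invented.
    import xml.etree.ElementTree as ET

    doc = ET.fromstring(
        "<library>"
        "<book year='2002'><title>TIMBER paper</title><author>Jagadish</author></book>"
        "<book year='1999'><title>Older paper</title></book>"
        "</library>"
    )

    # Twig pattern: book elements that have an author child and a year >= 2000.
    # All matching bindings are produced in one pass rather than by
    # navigating node by node.
    titles = [book.find("title").text
              for book in doc.iterfind("book[author]")
              if int(book.get("year")) >= 2000]
    print(titles)  # ['TIMBER paper']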


international conference on management of data | 2006

Provenance management in curated databases

Peter Buneman; Adriane Chapman; James Cheney

Curated databases in bioinformatics and other disciplines are the result of a great deal of manual annotation, correction and transfer of data from other sources. Provenance information concerning the creation, attribution, or version history of such data is crucial for assessing its integrity and scientific value. General purpose database systems provide little support for tracking provenance, especially when data moves among databases. This paper investigates general-purpose techniques for recording provenance for data that is copied among databases. We describe an approach in which we track the user's actions while browsing source databases and copying data into a curated database, in order to record the user's actions in a convenient, queryable form. We present an implementation of this technique and use it to evaluate the feasibility of database support for provenance management. Our experiments show that although the overhead of a naive approach is fairly high, it can be decreased to an acceptable level using simple optimizations.
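
A minimal sketch of the general idea of logging copy actions in a queryable form follows; the record shape and field names are assumptions for illustration, not the paper's schema.

    # Minimal sketch: log copy actions from source databases into a curated
    # database as queryable records. Field names are illustrative assumptions.
    from dataclasses import dataclass
    from datetime import datetime, timezone

    @dataclass
    class ProvRecord:
        action: str       # e.g. "copy"
        source_db: str    # database the data was taken from
        source_path: str  # location within the source
        target_path: str  # location in the curated database
        when: datetime

    log: list[ProvRecord] = []

    def record_copy(source_db: str, source_path: str, target_path: str) -> None:
        """Append one provenance record for a copy action."""
        log.append(ProvRecord("copy", source_db, source_path,
                              target_path, datetime.now(timezone.utc)))

    record_copy("SourceDB-A", "/entry/P12345", "/curated/protein/42")

    # Query: where did /curated/protein/42 come from?
    print([(r.source_db, r.source_path)
           for r in log if r.target_path == "/curated/protein/42"])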


international conference on management of data | 2007

Making database systems usable

H. V. Jagadish; Adriane Chapman; Aaron Elkiss; Magesh Jayapandian; Yunyao Li; Arnab Nandi; Cong Yu

Database researchers have striven to improve the capability of a database in terms of both performance and functionality. We assert that the usability of a database is as important as its capability. In this paper, we study why database systems today are so difficult to use. We identify a set of five pain points and propose a research agenda to address these. In particular, we introduce a presentation data model and recommend direct data manipulation with a schema-later approach. We also stress the importance of provenance and of consistency across presentation models.


international conference on management of data | 2008

Efficient provenance storage

Adriane Chapman; H. V. Jagadish; Prakash Ramanan

As the world is increasingly networked and digitized, the data we store has more and more frequently been chopped, baked, diced and stewed. In consequence, there is an increasing need to store and manage provenance for each data item stored in a database, describing exactly where it came from, and what manipulations have been applied to it. Storage of the complete provenance of each data item can become prohibitively expensive. In this paper, we identify important properties of provenance that can be used to considerably reduce the amount of storage required. We identify three different techniques: a family of factorization processes and two methods based on inheritance, to decrease the amount of storage required for provenance. We have used the techniques described in this work to significantly reduce the provenance storage costs associated with constructing MiMI [22], a warehouse of data regarding protein interactions, as well as two provenance stores, Karma [31] and PReServ [20], produced through workflow execution. In these real provenance sets, we were able to reduce the size of the provenance by up to a factor of 20. Additionally, we show that this reduced store can be queried efficiently and further that incremental changes can be made inexpensively.
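
The sketch below conveys only the general factorization idea, not the paper's algorithms or its inheritance-based methods: an identical provenance expression is stored once, and every data item derived by it references the shared copy.

    # Sketch of factorization: store one copy of an identical provenance
    # expression and let each derived data item reference it by index.
    expressions: list[tuple] = []   # stored provenance expressions
    expr_id: dict[tuple, int] = {}  # expression -> index in `expressions`
    item_prov: dict[str, int] = {}  # data item -> expression index

    def attach(item: str, provenance: tuple) -> None:
        """Attach provenance to an item, reusing an identical stored expression."""
        if provenance not in expr_id:
            expr_id[provenance] = len(expressions)
            expressions.append(provenance)
        item_prov[item] = expr_id[provenance]

    # Many items produced by the same workflow step share one stored record.
    step = ("merge", "source=A", "source=B", "tool=warehouse-loader")  # invented
    for item in ("protein:1", "protein:2", "protein:3"):
        attach(item, step)

    print(len(expressions))                     # 1: stored once, referenced 3 times
    print(expressions[item_prov["protein:2"]])  # the shared provenance expression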


Nucleic Acids Research | 2007

Michigan Molecular Interactions (MiMI): putting the jigsaw puzzle together

Magesh Jayapandian; Adriane Chapman; V. Glenn Tarcea; Cong Yu; Aaron Elkiss; Angela Ianni; Bin Liu; Arnab Nandi; Carlos de los Santos; Philip C. Andrews; Brian D. Athey; David J. States; H. V. Jagadish

Protein interaction data exists in a number of repositories. Each repository has its own data format, molecule identifier and supplementary information. Michigan Molecular Interactions (MiMI) assists scientists searching through this overwhelming amount of protein interaction data. MiMI gathers data from well-known protein interaction databases and deep-merges the information. Utilizing an identity function, molecules that may have different identifiers but represent the same real-world object are merged. Thus, MiMI allows the users to retrieve information from many different databases at once, highlighting complementary and contradictory information. To help scientists judge the usefulness of a piece of data, MiMI tracks the provenance of all data. Finally, a simple yet powerful user interface aids users in their queries, and frees them from the onerous task of knowing the data format or learning a query language. MiMI allows scientists to query all data, whether corroborative or contradictory, and specify which sources to utilize. MiMI is part of the National Center for Integrative Biomedical Informatics (NCIBI) and is publicly available at: http://mimi.ncibi.org.
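
A rough sketch of the deep-merge step, not MiMI's implementation: an assumed identity mapping collapses source-specific identifiers onto one canonical molecule, and values from each source are kept side by side so complementary or contradictory information remains visible.

    # Sketch of identifier-based merging; the identity mapping and the
    # records here are invented for illustration.
    from collections import defaultdict

    identity = {"SourceA:P04637": "TP53", "SourceB:01859": "TP53"}

    records = [
        {"id": "SourceA:P04637", "source": "SourceA", "function": "tumor suppressor"},
        {"id": "SourceB:01859", "source": "SourceB", "function": "transcription factor"},
    ]

    merged: dict[str, dict] = defaultdict(lambda: defaultdict(list))
    for rec in records:
        canonical = identity[rec["id"]]
        for field, value in rec.items():
            if field not in ("id", "source"):
                # keep every source's value, tagged with where it came from
                merged[canonical][field].append((rec["source"], value))

    print(dict(merged["TP53"]))
    # {'function': [('SourceA', 'tumor suppressor'), ('SourceB', 'transcription factor')]}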


Nucleic Acids Research | 2009

Michigan molecular interactions r2: from interacting proteins to pathways.

V. Glenn Tarcea; Terry E. Weymouth; Alexander S. Ade; Aaron V. Bookvich; Jing Gao; Vasudeva Mahavisno; Zach Wright; Adriane Chapman; Magesh Jayapandian; Arzucan Özgür; Yuanyuan Tian; James D. Cavalcoli; Barbara Mirel; Jignesh M. Patel; Dragomir R. Radev; Brian D. Athey; David J. States; H. V. Jagadish

Molecular interaction data exists in a number of repositories, each with its own data format, molecule identifier and information coverage. Michigan molecular interactions (MiMI) assists scientists searching through this profusion of molecular interaction data. The original release of MiMI gathered data from well-known protein interaction databases, and deep merged this information while keeping track of provenance. Based on the feedback received from users, MiMI has been completely redesigned. This article describes the resulting MiMI Release 2 (MiMIr2). New functionality includes extension from proteins to genes and to pathways; identification of highlighted sentences in source publications; seamless two-way linkage with Cytoscape; query facilities based on MeSH/GO terms and other concepts; approximate graph matching to find relevant pathways; support for querying in bulk; and a user focus-group driven interface design. MiMI is part of the NIH's National Center for Integrative Biomedical Informatics (NCIBI) and is publicly available at: http://mimi.ncibi.org.


international conference on management of data | 2003

TIMBER: a native system for querying XML

Stelios Paparizos; Shurug Al-Khalifa; Adriane Chapman; H. V. Jagadish; Laks V. S. Lakshmanan; Andrew Nierman; Jignesh M. Patel; Divesh Srivastava; Nuwee Wiwatwattana; Yuqing Wu; Cong Yu

XML has become ubiquitous, and XML data has to be managed in databases. The current industry standard is to map XML data into relational tables and store this information in a relational database. Such mappings create both expressive power problems and performance problems. In the TIMBER [7] project we are exploring the issues involved in storing XML in native format. We believe that the key intellectual contribution of this system is a comprehensive set-at-a-time query processing ability in a native XML store, with all the standard components of relational query processing, including algebraic rewriting and a cost-based optimizer.


international provenance and annotation workshop | 2006

A provenance model for manually curated data

Peter Buneman; Adriane Chapman; James Cheney; Stijn Vansummeren

Many curated databases are constructed by scientists integrating various existing data sources “by hand”, that is, by manually entering or copying data from other sources. Capturing provenance in such an environment is a challenging problem, requiring a good model of the process of curation. Existing models of provenance focus on queries/views in databases or computations on the Grid, not updates of databases or Web sites. In this paper we motivate and present a simple model of provenance for manually curated databases and discuss ongoing and future work.


very large data bases | 2009

Do You Know Where Your Data's Been? --- Tamper-Evident Database Provenance

Jing Zhang; Adriane Chapman; Kristen LeFevre

Database provenance chronicles the history of updates and modifications to data, and has received much attention due to its central role in scientific data management. However, the use of provenance information still requires a leap of faith. Without additional protections, provenance records are vulnerable to accidental corruption, and even malicious forgery, a problem that is most pronounced in the loosely-coupled multi-user environments often found in scientific research. This paper investigates the problem of providing integrity and tamper-detection for database provenance. We propose a checksum-based approach, which is well-suited to the unique characteristics of database provenance, including non-linear provenance objects and provenance associated with multiple fine granularities of data. We demonstrate that the proposed solution satisfies a set of desirable security properties, and that the additional time and space overhead incurred by the checksum approach is manageable, making the solution feasible in practice.
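
As a rough analogue of checksum-based tamper evidence, not the paper's construction (which handles non-linear provenance objects and multiple granularities of data), the sketch below chains each record's digest to the previous one, so that modifying or removing any record breaks verification.

    # Sketch: each provenance record stores a SHA-256 digest over its contents
    # plus the previous record's digest; tampering breaks verification.
    import hashlib
    import json

    def digest(record: dict, prev: str) -> str:
        payload = json.dumps(record, sort_keys=True) + prev
        return hashlib.sha256(payload.encode()).hexdigest()

    chain: list[dict] = []

    def append(record: dict) -> None:
        prev = chain[-1]["digest"] if chain else ""
        chain.append({"record": record, "digest": digest(record, prev)})

    def verify() -> bool:
        prev = ""
        for entry in chain:
            if entry["digest"] != digest(entry["record"], prev):
                return False
            prev = entry["digest"]
        return True

    append({"op": "insert", "item": "gene:BRCA1", "by": "alice"})
    append({"op": "update", "item": "gene:BRCA1", "by": "bob"})
    print(verify())                       # True
    chain[0]["record"]["by"] = "mallory"  # forge the first record
    print(verify())                       # False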


information reuse and integration | 2011

PLUS: A provenance manager for integrated information

Adriane Chapman; Barbara T. Blaustein; Len Seligman; M. David Allen

It can be difficult to fully understand the result of integrating information from diverse sources. When all the information comes from a single organization, there is a collective knowledge about where it came from and whether it can be trusted. Unfortunately, once information from multiple organizations is integrated, there is no longer a shared knowledge of the data and its quality. It is often impossible to view and judge the information from a different organization; when errors occur, notification does not always reach all users of the data. We describe how a multi-organizational provenance store that collects provenance from heterogeneous systems addresses these problems. Unlike most provenance systems, we cope with an open world, where the data usage is not determined in advance and can take place across many systems and organizations.

Collaboration


Dive into Adriane Chapman's collaborations.

Top Co-Authors

Luc Moreau, University of Southampton

James Cheney, University of Edinburgh

Jignesh M. Patel, University of Wisconsin-Madison