Publication


Featured research published by Shu-Yao Chien.


Very Large Data Bases | 2002

Efficient structural joins on indexed XML documents

Shu-Yao Chien; Zografoula Vagena; Donghui Zhang; Vassilis J. Tsotras; Carlo Zaniolo

Queries on XML documents typically combine selections on element contents and, via path expressions, the structural relationships between tagged elements. Structural joins are used to find all pairs of elements satisfying the primitive structural relationships specified in the query, namely, parent-child and ancestor-descendant relationships. Efficient support for structural joins is thus the key to efficient implementations of XML queries. Recently proposed node numbering schemes enable the capturing of the XML document structure using traditional indices (such as B+-trees or R-trees). This paper proposes efficient structural join algorithms in the presence of tag indices. We first concentrate on using B+-trees and show how to expedite a structural join by avoiding collections of elements that do not participate in the join. We then introduce an enhancement (based on sibling pointers) that further improves performance. Such sibling pointers are easily implemented and dynamically maintainable. We also present a structural join algorithm that utilizes R-trees. An extensive experimental comparison shows that the B+-tree structural joins are more robust. Furthermore, they provide drastic performance gains over the current state of the art.
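The role of node numbering in an ancestor-descendant join can be illustrated with a minimal sketch. It assumes a common region encoding in which every element carries a (start, end) pair and an ancestor's interval strictly contains the intervals of its descendants. A sorted Python list searched with `bisect` stands in for a tag index; the function name and the sample data are hypothetical, and this is not the paper's B+-tree or R-tree algorithm.

```python
from bisect import bisect_right

def structural_join(ancestors, descendants):
    """Return (ancestor, descendant) pairs for an ancestor-descendant join.

    Both inputs are lists of (start, end) tuples sorted by start and taken
    from the same well-formed document, so intervals nest and never
    partially overlap (an assumption of this sketch).
    """
    starts = [d[0] for d in descendants]
    pairs = []
    for a_start, a_end in ancestors:
        # Binary search skips every descendant candidate that starts
        # outside the ancestor's interval instead of scanning it.
        lo = bisect_right(starts, a_start)
        hi = bisect_right(starts, a_end)
        for d in descendants[lo:hi]:
            pairs.append(((a_start, a_end), d))
    return pairs

# Hypothetical numbering for:
# <a> <b> <c/> </b> <c/> </a> <a> <c/> </a>
a_elems = [(1, 10), (11, 14)]
b_elems = [(2, 5)]
c_elems = [(3, 4), (6, 7), (12, 13)]

a_c_pairs = structural_join(a_elems, c_elems)
b_c_pairs = structural_join(b_elems, c_elems)
```

Because intervals nest, checking that a candidate starts inside the ancestor's interval is enough; the binary search is what lets the join jump over candidates that cannot match.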


Very Large Data Bases | 2002

Efficient schemes for managing multiversion XML documents

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo

Multiversion support for XML documents is needed in many critical applications, such as software configuration control, cooperative authoring, web information warehouses, and "e-permanence" of web documents. In this paper, we introduce efficient and robust techniques for: (i) storing and retrieving; (ii) viewing and exchanging; and (iii) querying multiversion XML documents. We first discuss the limitations of traditional version control methods, such as RCS and SCCS, and then propose novel techniques that overcome their limitations. Initially, we focus on the problem of managing secondary storage efficiently, and introduce an edit-based versioning scheme that enhances RCS with an effective clustering policy based on the concept of page-usefulness. The new scheme drastically improves version retrieval at the expense of a small (linear) space overhead. However, the edit-based approach falls short of achieving objectives (ii) and (iii). Therefore, we introduce and investigate a second scheme, which is reference-based and preserves the structure of the original document. In the reference-based approach, a multiversion document can be represented as yet another XML document, which can be easily exchanged and viewed on the web; furthermore, simple queries are also expressed and supported well under this representation. To achieve objective (i), we extend the page-usefulness clustering technique to the reference-based scheme. After characterizing the asymptotic behavior of the new techniques proposed, the paper presents the results of an experimental study evaluating and comparing their performance.


Web Information Systems Engineering | 2001

Storing and querying multiversion XML documents using durable node numbers

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo; Donghui Zhang

Managing multiple versions of XML documents represents an important problem for many traditional applications, such as software configuration control, as well as new ones, such as link permanence of web documents. Research on managing multiversion XML documents seeks to provide efficient and robust techniques for storing, retrieving and querying such documents. In this paper we present a novel approach to version management that achieves these objectives by a scheme based on Durable Node Numbers and timestamps for the elements of XML documents. We first present efficient storage and retrieval techniques for multiversion documents. Then, we explore the indexing and clustering strategies needed to assure efficient support for complex queries on content and on document evolution.


International Workshop on the Web and Databases | 2000

Version Management of XML Documents

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo

The problem of ensuring efficient storage and fast retrieval for multi-version structured documents is important because of the recent popularity of XML documents and semistructured information on the web. Traditional document version control systems, e.g. RCS, which model documents as a sequence of lines of text and use the shortest edit script to represent version differences, can be inefficient and do not preserve the logical structure of the original document. Therefore, we propose a new approach where the structure of the documents is preserved intact, and their sub-objects are timestamped hierarchically for efficient reconstruction of current and past versions. Our technique, called the Usefulness Based Copy Control (UBCC), is geared towards efficient version reconstruction while using small storage overhead. Our analysis and experiments illustrate the effectiveness of the overall approach to version control for structured documents. Moreover, UBCC can easily support multiple concurrent versions as well as partial document retrieval.
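The idea of timestamping sub-objects hierarchically can be sketched as follows. This is only an illustration under stated assumptions: each node stores the version in which it was created and, if applicable, the first version that no longer contains it. The `Node` class, its field names, and the sample document are hypothetical, and the sketch omits the page-usefulness copying that UBCC itself relies on.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    tag: str
    created: int                      # first version that contains the node
    deleted: Optional[int] = None     # first version without it; None if still live
    children: List["Node"] = field(default_factory=list)

def alive(node: Node, version: int) -> bool:
    """True if the node belongs to the given version."""
    return node.created <= version and (node.deleted is None or version < node.deleted)

def reconstruct(node: Node, version: int):
    """Return the subtree as nested (tag, children) tuples, or None if absent."""
    if not alive(node, version):
        # Hierarchical timestamps let the whole subtree be skipped here.
        return None
    kept = []
    for child in node.children:
        sub = reconstruct(child, version)
        if sub is not None:
            kept.append(sub)
    return (node.tag, kept)

# Hypothetical history: ch1 exists in versions 1-2, ch2 from version 2 on.
doc = Node("doc", 1, None, [Node("ch1", 1, 3), Node("ch2", 2)])
v1, v2, v3 = (reconstruct(doc, v) for v in (1, 2, 3))
```

Any past version is rebuilt by a single traversal that prunes subtrees whose root is not alive, and the document structure is preserved rather than flattened into lines of text.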


International Workshop on Research Issues in Data Engineering | 2001

Copy-based versus edit-based version management schemes for structured documents

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo

Managing multiple versions of XML documents and semistructured data represents a problem of growing interest. Traditional version control methods, such as RCS, use edit scripts representing changes in the document to support the incremental reconstruction of different versions. The edit-based approaches have been enhanced with a replication scheme called UBCC (Chien et al., 2000). UBCC is based on the notion of page usefulness and ensures effective management for multi-version documents in terms of both retrieval and storage cost. These improvements notwithstanding, the edit-based representation suffers from limited generality and flexibility; for example, it cannot represent changes such as rearranging the document or duplicating parts of its content. To solve these problems, the paper proposes a copy-based UBCC versioning scheme, which also provides a simpler format for the electronic exchange of multi-version documents. With the objective of matching the performance of the edit-based UBCC technique, we develop algorithms that enhance the copy-based UBCC scheme with page usefulness management. We also present results of various experiments that test the storage and retrieval performance of the new copy-based approach, and compare it with that of the edit-based UBCC approach.


ACM Transactions on Internet Technology | 2006

Supporting complex queries on multiversion XML documents

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo; Donghui Zhang

Managing multiple versions of XML documents represents a critical requirement for many applications. Recently, there has been much work on supporting complex queries on XML data (e.g., regular path expressions, structural projections, etc.). In this article, we examine the problem of efficiently implementing such complex queries on multiversion XML documents. Our approach relies on a numbering scheme, whereby durable node numbers (DNNs) are used to preserve the order among the nodes of the XML tree while remaining invariant with respect to updates. Using the document's DNNs, we show that many complex queries are reduced to combinations of range version retrieval queries. We thus examine three alternative storage organizations/indexing schemes to efficiently evaluate range version retrieval queries in this environment. A thorough performance analysis is then presented to reveal the advantages of each scheme.
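A range version retrieval query can be pictured with a small sketch. It assumes that each element is stored as a flat record holding its DNN together with the version in which it was created and, if applicable, the first version that no longer contains it. The `Record` type, the function name, and the sample numbers are hypothetical, and a linear scan stands in for the indexing schemes that the article compares.

```python
from typing import List, NamedTuple, Optional

class Record(NamedTuple):
    dnn: int                  # durable node number; never changes after assignment
    tag: str
    created: int              # first version that contains the element
    deleted: Optional[int]    # first version without it; None if still live

def range_version_query(records: List[Record], lo: int, hi: int, version: int) -> List[str]:
    """Tags of elements with lo <= dnn <= hi that are alive in `version`, in DNN order."""
    hits = [
        r for r in records
        if lo <= r.dnn <= hi
        and r.created <= version
        and (r.deleted is None or version < r.deleted)
    ]
    return [r.tag for r in sorted(hits, key=lambda r: r.dnn)]

# Hypothetical records: "preface" is inserted in version 2 with a DNN that
# falls in the gap between 100 and 200, so no existing node is renumbered.
records = [
    Record(100, "book", 1, None),
    Record(200, "chapter", 1, 3),
    Record(150, "preface", 2, None),
    Record(900, "index", 1, None),
]
```

For example, `range_version_query(records, 100, 500, 2)` returns the elements of version 2 whose DNNs fall in the range 100 to 500, in document order.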


Extending Database Technology | 2002

Efficient Complex Query Support for Multiversion XML Documents

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo; Donghui Zhang

Managing multiple versions of XML documents represents a critical requirement for many applications. Also, there has been much recent interest in supporting complex queries on XML data (e.g., regular path expressions, structural projections, DIFF queries). In this paper, we examine the problem of efficiently supporting complex queries on multiversion XML documents. Our approach relies on a scheme based on durable node numbers (DNNs) that preserve the order among the XML tree nodes and are invariant with respect to updates. Using the document's DNNs, various complex queries are reduced to combinations of partial version retrieval queries. We examine three indexing schemes to efficiently evaluate partial version retrieval queries in this environment. A thorough performance analysis is then presented to reveal the advantages of each scheme.


Very Large Data Bases | 2001

Efficient Management of Multiversion Documents by Object Referencing

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo


Lecture Notes in Computer Science | 2002

Efficient complex query support for multiversion XML documents

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo; Donghui Zhang


Archive | 2000

A Comparative Study of Version Management Schemes for XML Documents

Shu-Yao Chien; Vassilis J. Tsotras; Carlo Zaniolo

Collaboration


Top Co-Authors

Carlo Zaniolo

University of California

Donghui Zhang

University of California
