William M. Shui
University of New South Wales
Publication
Featured research published by William M. Shui.
international world wide web conferences | 2007
Raymond K. Wong; Franky Lam; William M. Shui
As XML database sizes grow, the amount of space used for storing the data and auxiliary data structures becomes a major factor in query and update performance. This paper presents a new storage scheme for XML data that supports all navigational operations in near constant time. In addition to supporting efficient queries, the space requirement of the proposed scheme is within a constant factor of the information theoretic minimum, while insertions and deletions can be performed in near constant time as well. As a result, the proposed structure features a small memory footprint that increases cache locality, whilst still supporting standard APIs, such as DOM, and necessary database operations, such as queries and updates, efficiently. Analysis and experiments show that the proposed structure is space and time efficient.
bioinformatics and bioengineering | 2003
William M. Shui; Raymond K. Wong
Automating the process of information retrieval and integration of heterogeneous biological data is complex and difficult. This paper describes an approach to solve this problem by using XML technologies such as XML Schema and an XML-based active rules system. Current limitations of active rule systems for XML databases are discussed. We then propose a template for defining rules that is consistent with the current XQuery specification, a de facto standard language for querying XML data. Finally, an example scenario is used to illustrate how these techniques can come together in integrating heterogeneous biological data sources.
bioinformatics and bioengineering | 2001
Raymond K. Wong; William M. Shui
Biological databanks have proven useful to bioscience researchers, especially in the analysis of raw data. Computational tools for sequence identification, structural analysis, and visualization have been built to access these databanks. This paper describes a way to utilize these resources (both data and tools) by integrating different biological databanks into a unified XML framework. An interface to access the embedded bioinformatic tools for this common model is built by leveraging the query language of an XML database management system. The proposed framework has been implemented with an emphasis on reusing the existing bioinformatic data and tools. This paper describes the overall architecture of this prototype and some design issues.
conference on information and knowledge management | 2003
Damien K. Fisher; Franky Lam; William M. Shui; Raymond K. Wong
With the increasing popularity of XML, there arises the need for managing and querying information in this form. Several query languages, such as XQuery, have been proposed which return their results in document order. However, most recent efforts focused on query optimization have disregarded order. This paper presents a simple yet elegant method to maintain document ordering for XML data. Analysis of our method shows that it is indeed efficient and scalable, even for changing data.
bioinformatics and bioengineering | 2000
Raymond K. Wong; Franky Lam; Stephen C. Graham; William M. Shui
The emergence of the Extensible Markup Language (XML) as a new standard for data representation and exchange on the World Wide Web has created a new information revolution. Several proposals have been made to formulate molecular sequences in XML; however, none of them addresses the efficient storage and management of the resultant XML sequence data. This paper addresses some implementation issues of an XML repository for molecular sequence data.
bioinformatics and bioengineering | 2003
William M. Shui; Nicole Lam; Raymond K. Wong
The ability to keep track of records for various biological experiments allows for future validation of the current experiments and other non-experimental laboratory procedures. With the increasing popularity of publishing biological data in XML format, there arises the need for the control and management of this data, as well as for dynamically exporting this data to various formats for reporting purposes. As such data is constantly changing, users want to be able to query previous versions, plot data across different versions in the history, query changes in documents, and retrieve a particular document version efficiently. This paper proposes an XML-based version management system for tracking and analyzing data obtained from any laboratory experiments in an effective and meaningful manner. This includes experiments ranging from genomic and proteomic to protein structural. We also present methods for importing non-XML data into the system as well as generating reports in multiple formats dynamically.
database systems for advanced applications | 2004
Franky Lam; William M. Shui; Damien K. Fisher; Raymond K. Wong
The structural join is considered a core operation in processing and optimizing XML queries. Recently, various techniques have been proposed for efficiently finding structural relationships between sets of nodes. This paper presents an adaptive algorithm for efficiently processing structural joins. In contrast to previous work, which usually relies on external index structures such as B-trees, our proposal does not require any such data structures. Hence, our strategy has lower overheads than previous techniques, and can be easily implemented and incorporated into any existing system. Experiments show that our method significantly outperforms previous algorithms.
database and expert systems applications | 2004
William M. Shui; Damien K. Fisher; Franky Lam; Raymond K. Wong
Although clustering problems are in general NP-hard, much research effort on this problem has been invested in the areas of object-oriented databases (OODB) and relational database systems (RDBMS). With the increasing popularity of XML, researchers have been focusing on various aspects of XML data management, including query processing and optimization. However, the clustering issues for XML data storage have been disregarded in their work. This paper provides a preliminary study on data clustering for optimizing XML databases. Different clustering schemes are compared through a set of extensive experiments.
database systems for advanced applications | 2003
William M. Shui; Raymond K. Wong; Stephen C. Graham; Lawrence K. Lee
The development of high-throughput genome sequencing and protein structure determination techniques has provided researchers with a wealth of biological data. Integrated analysis of such data is difficult due to the disparate nature of the repositories used to store this biological data and of the software used for its analysis. This paper presents a framework based upon the use of semi-structured database management systems that would provide an integrated interface for the collection, storage and retrieval of biological data from existing repositories and of biological information generated by existing analysis programs. A simple implementation that integrates information from databases and analytical programs is presented as a proof of concept. In particular, this paper focuses on the data transformation, data integration, and the support of active rules for biological data.
australasian database conference | 2006
Damien K. Fisher; Franky Lam; William M. Shui; Raymond K. Wong