David M. Choy | Researchain

Publication

Featured research published by David M. Choy.

ACM Transactions on Information Systems | 1989

An algebra for structured office documents

Ralf Hartmut Güting; Roberto Zicari; David M. Choy

We describe a data model for structured office information objects, which we generically call “documents,” and a practically useful algebraic language for the retrieval and manipulation of such objects. Documents are viewed as hierarchical structures; their layout (presentation) aspect is to be treated separately. The syntax and semantics of the language are defined precisely in terms of the formal model, an extended relational algebra. The proposed approach has several new features, some of which are particularly useful for the management of office information. The data model is based on nested sequences of tuples rather than nested relations. Therefore, sorting and sequence operations and the explicit handling of duplicates can be described by the model. Furthermore, this is the first model based on a many-sorted instead of a one-sorted algebra, which means that atomic data values as well as nested structures are objects of the algebra. As a consequence, arithmetic operations, aggregate functions, and so forth can be treated inside the model and need not be introduced as query language extensions to the model. Many-sorted algebra also allows arbitrary algebra expressions (with Boolean result) to be admitted as selection or join conditions and the results of arbitrary expressions to be embedded into tuples. In contrast to other formal models, this algebra can be used directly as a rich query language for office documents with precisely defined semantics.

IBM Systems Journal | 1982

OPAS: an office procedure automation system

Vincent Y. Lum; David M. Choy; Nan C. Shu

This paper discusses an experimental system being developed to support office automation. The emphasis of the paper is on a technology that allows people to automate their office and business activities. Specifically, using forms as the interface, the authors propose a powerful data manipulation and restructuring facility that not only allows users to extract and manipulate data in the forms, but can be used to interface between new and existing applications as well. Since business and office procedures are not discrete activities, but a structured sequence of activities, a means to define and execute procedures is required. Such means is described in this paper along with its model and an example of its application.

IBM Systems Journal | 2002

Bringing together content and data management systems: Challenges and opportunities

Amit Somani; David M. Choy; Jim Kleewein

With advances in computing and communication technologies in recent years, two significant trends have emerged in terms of information management: heterogeneity and distribution. Heterogeneity (herein discussed in terms of different types of data, not in terms of schematic heterogeneity) pertains to information use evolving from operational business data (e.g., accounting, payroll, and inventory) to digital assets, communications, and content (e.g., documents, intellectual property, rich media, e-mail, and Web data). Information has also become widely distributed, both in scale and ownership. To manage heterogeneity, two major classes of systems have evolved: database management systems to manage structured data, and content management systems to manage document and rich media information. In this paper, we compare and contrast these different paradigms. We believe it is imperative for any business to exploit value from all information--independent of where it resides or its form. We also identify the technical challenges and opportunities for bringing these different paradigms closer together.

COMPCON '96. Technologies for the Information Superhighway Digest of Papers | 1996

Services and architectures for electronic publishing

David M. Choy; Robert J. T. Morris

We describe changes that are occurring in electronic publishing as a result of the emergence of networked computing. Networks facilitate the exchange of information, but the changes will be far more profound than a simple substitution of media. As new capabilities and new challenges arise, there are corresponding new services that must be developed or else electronic publishing will not meet the needs of the public. After discussing these services, we describe architectures that can be used to implement them. These architectures respond to the issues of performance, scalability, security, and economics that must be addressed before electronic publishing will blossom. We illustrate how current architectures support existing services, enhancements offered by multi-tier servers, and extensions of these architectures to more symmetric models such as the broker model. Many of these concepts are being pursued by the IBM Digital Library family of products.

International Conference on Parallel and Distributed Information Systems | 1991

A distributed catalog for heterogeneous distributed database resources

David M. Choy; Patricia G. Selinger

To support a distributed, heterogeneous computing environment, an inter-system catalog protocol is needed so that remote resources can be located, used, and maintained with little human intervention. This paper describes a scalable catalog framework, which is an extension of previous work in a distributed relational DBMS research prototype called R*. This work builds on the R* concepts to accommodate heterogeneity, to handle partitioned and replicated data, to support non-DBMS resource managers, and to enhance catalog access performance and system extensibility.

Proceedings of the Third Forum on Research and Technology Advances in Digital Libraries | 1996

A digital library system for periodicals distribution

David M. Choy; Cynthia Dwork; Jeffrey Bruce Lotspiech; Laura C. Anderson; Stephen K. Boyer; Thomas D. Griffin; Bruce Albert Hoenig; M. J. Jackson; W. Kaka; James M. McCrossin; Alex Miller; Robert J. T. Morris; Norman J. Pass

As part of IBM's Digital Library Initiative, IBM's Almaden Research Center has teamed with the Institute for Scientific Information in a joint project to deliver on-line access to the bibliographic information and abstracts from the scientific journal articles indexed in Current Contents/Life Sciences as well as articles offered by the respective publishers. This requires both adaptation of existing technologies and development of new capabilities, especially regarding copyright protection. Since the Fall of 1995, a pilot system has been installed at four universities, two corporate libraries, and a major public research library, beginning a study that involves many publishers, libraries, and users to test the system and to experiment with new economic models. This article describes some requirements we identified for this system, and the solutions we have devised for these requirements.

Algorithmica | 1996

Efficiently extendible mappings for balanced data distribution

David M. Choy; Ronald Fagin; Larry J. Stockmeyer

In data storage applications, a large collection of consecutively numbered data “buckets” are often mapped to a relatively small collection of consecutively numbered storage “bins.” For example, in parallel database applications, buckets correspond to hash buckets of data and bins correspond to database nodes. In disk array applications, buckets correspond to logical tracks and bins correspond to physical disks in an array. Measures of the “goodness” of a mapping method include: (1) the time (number of operations) needed to compute the mapping; (2) the storage needed to store a representation of the mapping; (3) the balance of the mapping, i.e., the extent to which all bins receive the same number of buckets; (4) the cost of relocation, that is, the number of buckets that must be relocated to a new bin if a new mapping is needed due to an expansion of the number of bins or the number of buckets. One contribution of this paper is to give a new mapping method, the Interval-Round-Robin (IRR) method. The IRR method has optimal balance and relocation cost, and its time complexity and storage requirements compare favorably with known methods. Specifically, if m is the number of times that the number of bins and/or buckets has increased, then the time complexity is O(log m) and the storage is O(m^2). Another contribution of the paper is to identify the concept of a history-independent mapping, meaning informally that the mapping does not “remember” the past history of expansions to the number of buckets and bins, but only the current number of buckets and bins. Thus, such mappings require very little information to be stored. Assuming that balance and relocation are optimal, we prove that history-independent mappings are possible if the number of buckets is fixed (so only the number of bins can increase), but not possible if the number of bins and buckets can both increase.
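
The balance and relocation measures named in this abstract can be made concrete with a small sketch. The code below does not implement the IRR method; it measures a naive modulo (round-robin) mapping, chosen here only as a baseline. The function names are hypothetical, and the lower bound assumes a fixed number of buckets and a new mapping in which every bin holds at least floor(buckets / bins) buckets.

```python
from collections import Counter


def modulo_mapping(num_buckets, num_bins):
    """Naive baseline (NOT the IRR method): bucket b is stored in bin b mod num_bins."""
    return [bucket % num_bins for bucket in range(num_buckets)]


def imbalance(mapping, num_bins):
    """Gap between the fullest and the emptiest bin; 0 or 1 is the best achievable."""
    counts = Counter(mapping)
    loads = [counts.get(bin_id, 0) for bin_id in range(num_bins)]
    return max(loads) - min(loads)


def relocation_cost(old_mapping, new_mapping):
    """Number of buckets whose bin differs between the two mappings."""
    return sum(1 for old, new in zip(old_mapping, new_mapping) if old != new)


def relocation_lower_bound(num_buckets, old_bins, new_bins):
    """Assumption: each added bin must receive at least floor(buckets / new_bins)
    buckets for the new mapping to be balanced, so at least that many must move."""
    return (new_bins - old_bins) * (num_buckets // new_bins)


old = modulo_mapping(12, 3)
new = modulo_mapping(12, 4)
print(imbalance(old, 3), imbalance(new, 4))   # 0 0
print(relocation_cost(old, new))              # 9
print(relocation_lower_bound(12, 3, 4))       # 3
```

In this example the modulo mapping is perfectly balanced before and after the expansion, yet it relocates 9 of 12 buckets where 3 would suffice. That gap is the kind of cost the abstract says a good mapping method should avoid.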

BIT Numerical Mathematics | 1977

Bounds for optimal α-β binary trees

David M. Choy; C. K. Wong

In this paper we consider a special kind of binary trees where each right edge is associated with a positive number α and each left edge with a positive number β (α ≦ β). Given α, β and the number of nodes n, an optimal tree is one which minimizes the total weighted path length. An algorithm for constructing an optimal tree for given α, β, n is presented, based on which bounds for balances and total weighted path lengths of optimal trees are derived.
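
As a rough illustration of the optimization problem in this abstract, the sketch below computes the minimum total weighted path length by brute-force dynamic programming. It is not the construction algorithm from the paper. It assumes that "total weighted path length" means the sum, over all n nodes, of the edge weights on the path from the root to that node; the function name is hypothetical.

```python
from functools import lru_cache


def optimal_cost(n, alpha, beta):
    """Minimum total weighted path length of an n-node binary tree in which
    right edges weigh alpha and left edges weigh beta (brute-force sketch)."""

    @lru_cache(maxsize=None)
    def cost(size):
        if size <= 1:
            return 0  # an empty tree or a single root has no edges
        best = None
        for left in range(size):
            right = size - 1 - left
            # Hanging a subtree of s nodes below an edge of weight w lengthens
            # every one of its s root paths by w, hence the w * s terms.
            candidate = cost(left) + beta * left + cost(right) + alpha * right
            if best is None or candidate < best:
                best = candidate
        return best

    return cost(n)


print(optimal_cost(7, 1, 1))  # 10, the complete binary tree on 7 nodes
print(optimal_cost(3, 1, 2))  # 3
```

The recursion tries every split of the remaining n - 1 nodes between the left and right subtrees, so it takes O(n^2) time and is suitable only for small n.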

Acta Informatica | 1978

Optimal α-β trees with capacity constraint

David M. Choy; C. K. Wong

We consider a specific kind of binary trees with weighted edges. Each right edge has weight α while each left edge has weight β. Furthermore, no path in the tree is allowed to contain L or more consecutive α-edges, where L ≧ 1 is fixed. Given α, β, L and the number of nodes n, an optimal tree is one which minimizes the total weighted path length. Algorithms for constructing an optimal tree as well as all optimal trees for given α, β, L and n are proposed and analyzed. Timing and storage requirements are also discussed.
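
The capacity constraint described here can be illustrated with a small brute-force dynamic program. This is a sketch under stated assumptions, not one of the algorithms proposed in the paper: total weighted path length is taken to be the sum over all nodes of the edge weights on the root-to-node path, and the function name and the `limit` parameter (standing for L) are hypothetical.

```python
from functools import lru_cache


def constrained_optimal_cost(n, alpha, beta, limit):
    """Minimum total weighted path length of an n-node binary tree with right
    edges of weight alpha and left edges of weight beta, where no path may
    contain `limit` or more consecutive right (alpha) edges."""

    @lru_cache(maxsize=None)
    def cost(size, run):
        # `run` counts the consecutive alpha-edges ending at this subtree's root.
        if size <= 1:
            return 0
        best = None
        for right in range(size):
            if right > 0 and run + 1 >= limit:
                break  # a right child here would complete a forbidden run
            left = size - 1 - right
            candidate = (cost(left, 0) + beta * left            # a left edge resets the run
                         + cost(right, run + 1) + alpha * right)
            if best is None or candidate < best:
                best = candidate
        return best

    return cost(n, 0)


print(constrained_optimal_cost(4, 1, 2, 1))  # 12: with L = 1 only a left chain is allowed
print(constrained_optimal_cost(3, 1, 5, 2))  # 6: the cheaper right chain (cost 3) is forbidden
```

With a large `limit` the constraint never binds and the result matches the unconstrained optimum.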

ADL '95 Selected Papers from the Digital Libraries, Research and Technology Advances | 1995

The Almaden Distributed Digital Library System

David M. Choy; Cynthia Dwork; Jeffrey Bruce Lotspiech; Robert J. T. Morris; Norman J. Pass; Laura C. Anderson; Alan E. Bell; Stephen K. Boyer; Thomas D. Griffin; Bruce Albert Hoenig; James M. McCrossin; Alex Miller; Florian Pestoni; Deidra S. Picciano

In this chapter we describe the architecture for the Almaden Distributed Digital Library System, which is intended to support an emerging “information marketplace”. Using a distributed server approach and accommodating heterogeneous environments, the system is designed to meet the diverse needs of the publishers, distributors, and users of scientific journal information at low cost, while protecting the information assets of the publishers and the privacy of the users. A prototype is currently being implemented in a joint effort by IBM Almaden Research Center and the Institute for Scientific Information. A pilot is planned to test the system and to explore new economic models.
