Donovan A. Schneider | Researchain

Publication

Featured researches published by Donovan A. Schneider.

ACM Transactions on Database Systems | 1996

LH*—a scalable, distributed data structure

Witold Litwin; Marie-Anna Neimat; Donovan A. Schneider

We present a scalable distributed data structure called LH*. LH* generalizes Linear Hashing (LH) to distributed RAM and disk files. An LH* file can be created from records with primary keys, or objects with OIDs, provided by any number of distributed and autonomous clients. It does not require a central directory, and grows gracefully, through splits of one bucket at a time, to virtually any number of servers. The number of messages per random insertion is one in general, and three in the worst case, regardless of the file size. The number of messages per key search is two in general, and four in the worst case. The file supports parallel operations, e.g., hash joins and scans. Performing a parallel operation on a file of M buckets costs at most 2M + 1 messages, and between 1 and O(log₂ M) rounds of messages. We first describe the basic LH* scheme where a coordinator site manages bucket splits, and splits a bucket every time a collision occurs. We show that the average load factor of an LH* file is 65%–70% regardless of file size and bucket capacity. We then enhance the scheme with load control, performed at no additional message cost. The average load factor then increases to 80–95%. These values are about that of LH, but the load factor for LH* varies more. We next define LH* schemes without a coordinator. We show that insert and search costs are the same as for the basic scheme. The splitting cost decreases on the average, but becomes more variable, as cascading splits are needed to prevent file overload. Next, we briefly describe two variants of splitting policy, using parallel splits and presplitting that should enhance performance for high-performance applications. All together, we show that LH* files can efficiently scale to files that are orders of magnitude larger in size than single-site files. LH* files that reside in main memory may also be much faster than single-site disk files. Finally, LH* files can be more efficient than any distributed file with a centralized directory, or a static parallel or distributed hash file.
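
The abstract above does not give the addressing formulas, so the following is only a minimal single-site sketch of classic Linear Hashing, the scheme that LH* generalizes. It assumes non-negative integer keys, an initial file of one bucket, and a split triggered whenever an insertion overflows a bucket. The class name `LinearHashFile` and its fields are hypothetical and chosen for illustration. It does not model LH* clients, servers, or the coordinator.

```python
class LinearHashFile:
    """Single-site Linear Hashing sketch (assumption: non-negative int keys)."""

    def __init__(self, capacity=4):
        self.capacity = capacity  # records a bucket holds before a split is triggered
        self.level = 0            # current file level i
        self.split_ptr = 0        # index n of the next bucket to split
        self.buckets = [[]]       # the file starts with a single bucket

    def address(self, key):
        # Use h_i(key) = key mod 2^i, unless that bucket was already split
        # in this round, in which case use h_{i+1}(key) = key mod 2^(i+1).
        a = key % (2 ** self.level)
        if a < self.split_ptr:
            a = key % (2 ** (self.level + 1))
        return a

    def insert(self, key):
        bucket = self.buckets[self.address(key)]
        bucket.append(key)
        # An overflow triggers a split of the bucket at split_ptr, which is
        # not necessarily the bucket that overflowed (it may stay over capacity).
        if len(bucket) > self.capacity:
            self._split()

    def _split(self):
        n = self.split_ptr
        old = self.buckets[n]
        new_mod = 2 ** (self.level + 1)
        # Bucket n is rehashed with h_{i+1}; keys land in n or in n + 2^i.
        self.buckets.append([])  # the new bucket has index n + 2^i
        self.buckets[n] = [k for k in old if k % new_mod == n]
        self.buckets[n + 2 ** self.level] = [k for k in old if k % new_mod != n]
        self.split_ptr += 1
        if self.split_ptr == 2 ** self.level:
            # Every bucket of this level has been split: start the next round.
            self.level += 1
            self.split_ptr = 0

    def search(self, key):
        return key in self.buckets[self.address(key)]


if __name__ == "__main__":
    f = LinearHashFile(capacity=4)
    for k in range(100):
        f.insert(k)
    print(len(f.buckets), f.level, f.split_ptr)
```

The file grows by one bucket per split, and the number of buckets always equals 2^level + split_ptr. In LH*, as the abstract describes, the buckets reside on different servers and clients compute addresses without a central directory.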

international conference on parallel and distributed information systems | 1991

The papyrus integrated data server

Tim Connors; Waqar Hasan; Curtis P. Kolovson; Marie-Anne Neimat; Donovan A. Schneider; W. Kevin Wilkinson

Summary form only given. The authors focus on the performance of integrated specialized data managers. In particular, they focus on the customizations of parallel executions of data manager operators in a variety of computer configurations. This is done by specifying the glue that connects data manager operators in a way that is independent of the computer configuration; and then providing the ability to transparently target the execution to a variety of computer configurations. Parallelization of data manager operators built using Papyrus services is a challenging problem, but a more challenging problem is the parallelization of data manager operators that are built independently of Papyrus and are therefore black boxes to Papyrus. Papyrus is a set of modules and services that enables the parallelization and integration of specialized data managers into one execution environment. Papyrus programs can be transparently targeted to different hardware configurations and can dynamically adjust at runtime to the number of available resources. A Papyrus System consists of a number of clients interfacing to a Papyrus Server. The Server consists of several integrated data managers executing on a multiprocessor system.

international conference on management of data | 1996

The ins and outs (and everything in between) of data warehousing

Phillip M. Fernandez; Donovan A. Schneider

Data warehousing is the latest “hot topic” in the industry. With market projections of $8 billion by the year 2000, vendors of all flavors are claiming the suitability and superiority of their products for this market segment. This has led to a great deal of confusion, with terms such as OLAP, ROLAP, MDDB, decision support systems (DSS) and data warehousing being defined, re-defined, and sometimes even used interchangeably.

international conference on management of data | 2005

Modeling and querying multidimensional data sources in Siebel Analytics: a federated relational system

Kazi A. Zaman; Donovan A. Schneider

Large organizations have a multitude of data sources across the enterprise and want to obtain business value from all of them. While the majority of these data sources may be consolidated in an enterprise data warehouse, many business units have their own data marts where analysis is carried out against data stored in multidimensional data structures. It is often critical to pose queries which span both these sources. This is a challenge since these sources have differing models and query languages (SQL vs. MDX). The Siebel Analytics Server enables this requirement to be fulfilled. In this paper, we describe how the multidimensional metadata is modeled relationally within Siebel Analytics, efficient SQL to MDX translation algorithms and the conversion protocols required to convert a multidimensional result into a relational rowset.

international conference on parallel and distributed information systems | 1993

Managing query execution for an advanced database programming language

Donovan A. Schneider; Tim Connors

A method for efficiently managing the execution of a query for an advanced database programming language is demonstrated. It is shown that the mechanism gracefully and efficiently manages control flow in both single-processor and multiprocessor environments. A prototype demonstrating the feasibility of this approach is operational on both single-processor and small-scale multiprocessor systems.

international workshop on research issues in data engineering | 1992

The Papyrus query processing engine

Tim Connors; Donovan A. Schneider

This paper describes query processing in the Papyrus integrated data server. Query processing is unique in Papyrus because the underlying language is very powerful (computationally complete), performance on uni-processor systems is not sacrificed in order to support multiprocessor architectures, and queries are self-scheduling in order to support extensibility. A prototype demonstrating the feasibility of this approach is in operation.

international conference on parallel and distributed information systems | 1994

Achieving transaction scaleup on Unix

Marie-Anne Neimat; Donovan A. Schneider

Constructing scalable high-performance applications on commodity hardware running the Unix operating system is a problem that must be addressed in several application domains. We relate our experience in achieving transaction scaleup on Unix for a high-performance OLTP system intended for Service Control Points (SCPs) in a telephone switching network. SCPs are but one example from a class of applications whose requirements cannot be properly handled by today's commercial DBMSs. In addition to high throughput and low response time, SCPs require transaction scaleup on standard hardware and software. Using a main-memory DBMS to obtain high throughput, we focus on the problem of achieving transaction scaleup on a cluster of workstations running Unix while constrained by the low response time requirement of the SCP application.

international workshop on research issues in data engineering | 1993

LH* - a tool for interoperability at the file access level

Witold Litwin; Marie-Anne Neimat; Donovan A. Schneider

Distributed database managers were traditionally designed to interoperate at the query processing level. The rationale for this approach was the limited network bandwidth, prohibiting efficient interoperability at the file or OS levels for database management. New LANs offer order-of-magnitude higher bandwidths. Database managers can now efficiently interoperate at the file management level. One problem to be solved, though, is the scalability of distributed files. The authors report the work in progress on a new class of data structures, extensible distributed data structures. They focus on one such data structure termed LH*. Through the description of the on-going experiences with LH*, they conclude that it appears particularly promising.

very large data bases | 1992

Practical Skew Handling in Parallel Joins

David J. DeWitt; Jeffrey F. Naughton; Donovan A. Schneider; S. Seshadri

very large data bases | 1994