Scott T. Leutenegger | Researchain


Publication

Featured research published by Scott T. Leutenegger.

international conference on management of data | 2000

Indexing the positions of continuously moving objects

Simonas Saltenis; Christian S. Jensen; Scott T. Leutenegger; Mario A. Lopez

The coming years will witness dramatic advances in wireless communications as well as positioning technologies. As a result, tracking the changing positions of objects capable of continuous movement is becoming increasingly feasible and necessary. The present paper proposes a novel, R*-tree based indexing technique that supports the efficient querying of the current and projected future positions of such moving objects. The technique is capable of indexing objects moving in one-, two-, and three-dimensional space. Update algorithms enable the index to accommodate a dynamic data set, where objects may appear and disappear, and where changes occur in the anticipated positions of existing objects. A comprehensive performance study is reported.

international conference on data engineering | 1997

STR: a simple and efficient algorithm for R-tree packing

Scott T. Leutenegger; Mario A. Lopez; Jeffrey Edgington

Presents the results from an extensive comparison study of three R-tree packing algorithms: the Hilbert and nearest-X packing algorithms, and an algorithm which is very simple to implement, called the STR (Sort-Tile-Recursive) algorithm. The algorithms are evaluated using both synthetic and actual data from various application domains including VLSI design, GIS (Tiger files), and computational fluid dynamics. Our studies also consider the impact that various degrees of buffering have on query performance. Experimental results indicate that none of the algorithms is best for all types of data. In general, our new algorithm requires up to 50% fewer disk accesses than the best previously proposed algorithm for point and region queries on uniformly distributed or mildly skewed point and region data, and approximately the same for highly skewed point and region data.
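The tiling idea behind Sort-Tile-Recursive can be sketched as follows. This is a minimal illustration for 2-D points at the leaf level only, not the authors' implementation: the function names `str_pack_leaves` and `mbr` are hypothetical, and upper tree levels (built by applying the same tiling to the bounding rectangles of the level below) are left out.

```python
import math

def str_pack_leaves(points, capacity):
    """Group 2-D points into leaf nodes with Sort-Tile-Recursive style tiling.

    points   -- list of (x, y) tuples
    capacity -- maximum number of points per leaf (assumed >= 1)
    """
    if not points:
        return []
    n = len(points)
    leaf_count = math.ceil(n / capacity)             # leaves needed
    slice_count = math.ceil(math.sqrt(leaf_count))   # vertical slices
    slice_size = slice_count * capacity              # points per slice
    by_x = sorted(points, key=lambda p: p[0])
    leaves = []
    for start in range(0, n, slice_size):
        # Inside each vertical slice, sort by y and cut into runs of `capacity`.
        vertical_slice = sorted(by_x[start:start + slice_size], key=lambda p: p[1])
        for j in range(0, len(vertical_slice), capacity):
            leaves.append(vertical_slice[j:j + capacity])
    return leaves

def mbr(group):
    """Minimum bounding rectangle (xlo, ylo, xhi, yhi) of a group of points."""
    xs = [p[0] for p in group]
    ys = [p[1] for p in group]
    return (min(xs), min(ys), max(xs), max(ys))

if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    for leaf in str_pack_leaves(grid, capacity=4):
        print(mbr(leaf), leaf)
```

Sorting by x first and by y within each slice tends to produce roughly square tiles, which keeps leaf bounding rectangles compact for uniformly distributed data.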

measurement and modeling of computer systems | 1990

The performance of multiprogrammed multiprocessor scheduling algorithms

Scott T. Leutenegger; Mary K. Vernon

Scheduling policies for general purpose multiprogrammed multiprocessors are not well understood. This paper examines various policies to determine which properties of a scheduling policy are the most significant determinants of performance. We compare a more comprehensive set of policies than previous work, including one important scheduling policy that has not previously been examined. We also compare the policies under workloads that we feel are more realistic than previous studies have used. Using these new workloads, we arrive at different conclusions than reported in earlier work. In particular, we find that the “smallest number of processes first” (SNPF) scheduling discipline performs poorly, even when the number of processes in a job is positively correlated with the total service demand of the job. We also find that policies that allocate an equal fraction of the processing power to each job in the system perform better, on the whole, than policies that allocate processing power unequally. Finally, we find that for lock access synchronization, dividing processing power equally among all jobs in the system is a more effective property of a scheduling policy than the property of minimizing synchronization spin-waiting, unless demand for synchronization is extremely high. (The latter property is implemented by coscheduling processes within a job, or by using a thread management package that avoids preemption of processes that hold spinlocks.) Our studies are done by simulating abstract models of the system and the workloads.

job scheduling strategies for parallel processing | 1999

Benchmarks and Standards for the Evaluation of Parallel Job Schedulers

Steve J. Chapin; Walfredo Cirne; Dror G. Feitelson; James Patton Jones; Scott T. Leutenegger; Uwe Schwiegelshohn; Warren Smith; David Talby

The evaluation of parallel job schedulers hinges on the workloads used. It is suggested that this be standardized, in terms of both format and content, so as to ease the evaluation and comparison of different systems. The question remains whether this can encompass both traditional parallel systems and metacomputing systems. This paper is based on a panel on this subject that was held at the workshop, and the ensuing discussion; its authors are both the panel members and participants from the audience. Naturally, not all of us agree with all the opinions expressed here...

international conference on management of data | 1993

A modeling study of the TPC-C benchmark

Scott T. Leutenegger; Daniel M. Dias

The TPC-C benchmark is a new benchmark approved by the TPC council intended for comparing database platforms running a medium complexity transaction processing workload. Some key aspects in which this new benchmark differs from the TPC-A benchmark are in having several transaction types, some of which are more complex than that in TPC-A, and in having data access skew. In this paper we present results from a modelling study of the TPC-C benchmark for both single node and distributed database management systems. We simulate the TPC-C workload to determine expected buffer miss rates assuming an LRU buffer management policy. These miss rates are then used as inputs to a throughput model. From these models we show the following: (i) We quantify the data access skew as specified in the benchmark and show what fraction of the accesses go to what fraction of the data. (ii) We quantify the resulting buffer hit ratios for each relation as a function of buffer size. (iii) We show that close to linear scale-up (about 3% from the ideal) can be achieved in a distributed system, assuming replication of a read-only table. (iv) We examine the effect of packing hot tuples into pages and show that significant price/performance benefit can be thus achieved. (v) Finally, by coupling the buffer simulations with the throughput model, we examine typical disk/memory configurations that maximize the overall price/performance.
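The buffer-simulation step described above can be pictured with a small sketch: replay a page reference stream through an LRU buffer and count misses. Everything here is an illustrative assumption. The helper names `lru_miss_rate` and `skewed_refs` are hypothetical, and the 80/20 hot/cold skew is a stand-in, not the access distribution specified by TPC-C.

```python
import random
from collections import OrderedDict

def lru_miss_rate(page_refs, buffer_pages):
    """Fraction of references that miss an LRU buffer holding `buffer_pages` pages."""
    buffer = OrderedDict()          # keys ordered from least to most recently used
    misses = 0
    total = 0
    for page in page_refs:
        total += 1
        if page in buffer:
            buffer.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1
            buffer[page] = True
            if len(buffer) > buffer_pages:
                buffer.popitem(last=False)  # evict the least recently used page
    return misses / total if total else 0.0

def skewed_refs(num_pages, count, hot_fraction=0.2, hot_prob=0.8, seed=0):
    """Hypothetical skewed workload: `hot_prob` of accesses go to the hot pages.

    Assumes 0 < hot_fraction < 1 so that both hot and cold page sets are non-empty.
    """
    rng = random.Random(seed)
    hot = max(1, int(num_pages * hot_fraction))
    refs = []
    for _ in range(count):
        if rng.random() < hot_prob:
            refs.append(rng.randrange(hot))
        else:
            refs.append(rng.randrange(hot, num_pages))
    return refs

if __name__ == "__main__":
    refs = skewed_refs(num_pages=1000, count=50_000)
    for size in (50, 100, 200, 400):
        print(size, round(lru_miss_rate(refs, size), 3))
```

Miss rates measured this way for a range of buffer sizes are the kind of input a throughput model would consume.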

international conference on data engineering | 2001

High dimensional similarity search with space filling curves

Swanwa Liao; Mario A. Lopez; Scott T. Leutenegger

We present a new approach for approximate nearest neighbor queries for sets of high dimensional points under any $L_t$-metric, $t = 1, \ldots, \infty$. The proposed algorithm is efficient and simple to implement. The algorithm uses multiple shifted copies of the data points and stores them in up to $(d+1)$ B-trees where $d$ is the dimensionality of the data, sorted according to their position along a space filling curve. This is done in a way that allows us to guarantee that a neighbor within an $O(d^{1+1/t})$ factor of the exact nearest, can be returned with at most $(d+1)\log_p n$ page accesses, where $p$ is the branching factor of the B-trees. In practice, for real data sets, our approximate technique finds the exact nearest neighbor between 87% and 99% of the time and a point no farther than the third nearest neighbor between 98% and 100% of the time. Our solution is dynamic, allowing insertion or deletion of points in $O(d \log_p n)$ page accesses and generalizes easily to find approximate k-nearest neighbors.
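The general shape of the approach (several shifted copies, each kept sorted by curve position, with candidates taken from the query's neighbors in each ordering) might look like the sketch below. It makes several simplifying assumptions that are not from the paper: a Z-order (Morton) curve, in-memory sorted lists in place of B-trees, Euclidean distance, and arbitrary caller-supplied shift amounts. The names `morton_key` and `ShiftedCurveIndex` are hypothetical, and the sketch carries none of the paper's approximation guarantees.

```python
import bisect
import math

def morton_key(coords, bits=8):
    """Interleave the low `bits` bits of each non-negative integer coordinate."""
    key = 0
    for bit in range(bits - 1, -1, -1):
        for c in coords:
            key = (key << 1) | ((c >> bit) & 1)
    return key

class ShiftedCurveIndex:
    """One sorted list per shift; each list orders the points by curve position."""

    def __init__(self, points, shifts, bits=8):
        # Assumes every shifted coordinate still fits in `bits` bits.
        self.shifts = list(shifts)
        self.bits = bits
        self.sorted_copies = [
            sorted((morton_key([c + s for c in p], bits), p) for p in points)
            for s in self.shifts
        ]

    def approx_nearest(self, query):
        """Return the closest point among the curve neighbors of `query`."""
        best, best_dist = None, math.inf
        for s, entries in zip(self.shifts, self.sorted_copies):
            key = morton_key([c + s for c in query], self.bits)
            pos = bisect.bisect_left(entries, (key,))
            # Inspect the predecessor and successor along the curve.
            for idx in (pos - 1, pos):
                if 0 <= idx < len(entries):
                    candidate = entries[idx][1]
                    dist = math.dist(candidate, query)
                    if dist < best_dist:
                        best, best_dist = candidate, dist
        return best

if __name__ == "__main__":
    pts = [(1, 1), (5, 9), (20, 3), (40, 40)]
    index = ShiftedCurveIndex(pts, shifts=[0, 7, 13])
    print(index.approx_nearest((6, 8)))
```

Using several shifts reduces the chance that a query and its true nearest neighbor are separated by a curve discontinuity in every ordering at once.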

advances in geographic information systems | 1998

A greedy algorithm for bulk loading R-trees

Yván J. García R; Mario A. Lopez; Scott T. Leutenegger

Off-line loading of R-trees is useful to improve node utilization and query performance. We present an algorithm for bulk loading R-trees which differs from previous ones in two aspects: (a) it partitions input data into subtrees in a top-down fashion (based on the fact that splits close to the root are likely to have a greater impact on performance), (b) at each tree level, it considers all cuts orthogonal to the coordinate axes that result in packed trees and greedily picks those optimizing an arbitrary cost function. Extensive experimentation with both real and synthetic data indicates that for region data our algorithm requires up to three times fewer disk accesses than other algorithms. It is the method of choice for data with skew in locations, areas, or aspect ratios. Such data is common in practice.

Let n = number of input rectangles
Let S = maximum number of rectangles per subtree
Let M = maximum number of entries per node
Let f(r1, r2) be the user-supplied cost function
If n < S return {stop condition}
For each dimension d
    For each ordering considered in this dimension d
        For i from 1 to ⌈n/M⌉ − 1
            Let B0 = MBR of first i·S rectangles
            Let B1 = MBR of the other rectangles
            Remember i if f(B0, B1) is better valued
Split input set and orderings at best position.
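One binary split of the kind listed above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code. Rectangles are 2-D tuples `(xlo, ylo, xhi, yhi)`, the orderings tried are by low and high coordinate in each dimension, the default cost function `area_sum` is a hypothetical choice for the user-supplied f, and the loop bound `ceil(n / S) - 1` is an assumption chosen so that every cut at `i * S` leaves rectangles on both sides.

```python
import math

def mbr(rects):
    """Minimum bounding rectangle of a non-empty list of (xlo, ylo, xhi, yhi)."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def area_sum(b0, b1):
    """Hypothetical cost function: total area of the two bounding rectangles."""
    return area(b0) + area(b1)

def best_binary_split(rects, subtree_size, cost=area_sum):
    """Greedily pick the cheapest axis-orthogonal cut at a multiple of subtree_size.

    Returns (left, right), or None when the input already fits in one subtree.
    Treating n == subtree_size as a stop case is an assumption of this sketch.
    """
    n = len(rects)
    if n <= subtree_size:
        return None
    best = None
    # Orderings assumed here: xlo, ylo, xhi, yhi (tuple indices 0..3).
    for coordinate in range(4):
        ordered = sorted(rects, key=lambda r: r[coordinate])
        for i in range(1, math.ceil(n / subtree_size)):
            cut = i * subtree_size
            left, right = ordered[:cut], ordered[cut:]
            c = cost(mbr(left), mbr(right))
            if best is None or c < best[0]:
                best = (c, left, right)
    return best[1], best[2]

if __name__ == "__main__":
    near = [(0, 0, 1, 1), (1, 5, 2, 6), (0, 2, 1, 3), (2, 7, 3, 8)]
    far = [(100, 1, 101, 2), (101, 4, 102, 5), (100, 6, 101, 7), (102, 3, 103, 4)]
    left, right = best_binary_split(near + far, subtree_size=4)
    print(mbr(left), mbr(right))
```

Applying the split repeatedly to each side until every part holds at most S rectangles, and then recursing into each part, gives the top-down partitioning the abstract describes.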

statistical and scientific database management | 1999

Master-client R-trees: a new parallel R-tree architecture

Bernd Schnitzer; Scott T. Leutenegger

Scientific databases must be able to efficiently run subset retrievals of multidimensional data sets. If the data sets are very large, significant retrieval speedups can be obtained via parallelism. In this paper, we present a new parallel distributed shared-nothing R-tree architecture. We provide experimental results demonstrating actual speedups for several synthetic and real data sets. In addition, we conduct experimental studies to investigate the effect of several declustering strategies and communication parameters.

international conference on data engineering | 1998

The effect of buffering on the performance of R-trees

Scott T. Leutenegger; Mario A. Lopez

Past R-tree studies have focused on the number of nodes visited as a metric of query performance. Since database systems usually include a buffering mechanism, we propose that the number of disk accesses is a more realistic measure of performance. We develop a buffer model to analyze the number of disk accesses required for spatial queries using R-trees. The model can be used to evaluate the quality of R-tree update operations, such as various node splitting and tree restructuring policies, as measured by query performance on the resulting tree. We use our model to study the performance of three well known R-tree packing algorithms. We show that ignoring buffer behavior and using number of nodes accessed as a performance metric can lead to incorrect conclusions, not only quantitatively, but also qualitatively. In addition, we consider the problem of how many levels of the R-tree should be pinned in the buffer.

conference on information and knowledge management | 1995

Experimental evaluation of dynamic data allocation strategies in a distributed database with changing workloads

Anna Brunstrom; Scott T. Leutenegger; Rahul Simha

Traditionally, allocation of data in distributed database management systems has been determined by off-line analysis and optimization. This technique works well for static database access patterns, but is often inadequate for frequently changing workloads. In this paper we address how to dynamically reallocate data for partitionable distributed databases with changing access patterns. Rather than complicated and expensive optimization algorithms, a simple heuristic is presented and shown, via an implementation study, to improve system throughput by 3 based system. Based on artificial wide area network delays, we show that dynamic reallocation can improve system throughput by a factor of two and a half for wide area networks. We also show that individual site load must be taken into consideration when reallocating data, and provide a simple policy that incorporates load in the reallocation decision.
