
Publications


Featured research published by Harold S. Stone.


IEEE Transactions on Computers | 1992

Optimal partitioning of cache memory

Harold S. Stone; John Turek; Joel L. Wolf

A model for studying the optimal allocation of cache memory among two or more competing processes is developed and used to show that, for the examples studied, the least recently used (LRU) replacement strategy produces cache allocations that are very close to optimal. It is also shown that when program behavior changes, LRU replacement moves quickly toward the steady-state allocation if it is far from optimal, but converges slowly as the allocation approaches the steady-state allocation. An efficient combinatorial algorithm for determining the optimal steady-state allocation, which, in theory, could be used to reduce the length of the transient, is described. The algorithm generalizes to multilevel cache memories. For multiprogrammed systems, a cache-replacement policy better than LRU replacement is given. The policy increases the memory available to the running process until the allocation reaches a threshold time beyond which the replacement policy does not increase the cache memory allocated to the running process.
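
As a rough illustration of the partitioning question (a minimal sketch with synthetic reference streams, not the paper's model or traces), the following compares a shared LRU cache against every static split of the same capacity between two competing processes:

```python
import random
from collections import OrderedDict

def lru_hits(trace, capacity):
    """Count hits of an LRU cache with the given capacity over a reference trace."""
    cache, hits = OrderedDict(), 0
    for line in trace:
        if line in cache:
            hits += 1
            cache.move_to_end(line)           # mark most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)     # evict least recently used
            cache[line] = True
    return hits

random.seed(1)
CACHE = 64
# Process A reuses a small hot set; process B sweeps a much larger region.
trace_a = [('A', random.randint(0, 31)) for _ in range(5000)]
trace_b = [('B', random.randint(0, 255)) for _ in range(5000)]
mixed = [r for pair in zip(trace_a, trace_b) for r in pair]   # round-robin interleave

shared = lru_hits(mixed, CACHE)
best_a = max(range(1, CACHE),
             key=lambda k: lru_hits(trace_a, k) + lru_hits(trace_b, CACHE - k))
split = lru_hits(trace_a, best_a) + lru_hits(trace_b, CACHE - best_a)
print(f"shared LRU: {shared} hits; best static split {best_a}/{CACHE - best_a}: {split} hits")
```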


ACM Transactions on Computer Systems | 1987

Footprints in the cache

Dominique Thiebaut; Harold S. Stone

This paper develops an analytical model for cache-reload transients and compares the model to observations based on several address traces. The cache-reload transient is the set of cache misses that occur when a process is reinitiated after being suspended temporarily. For example, an interrupt program that runs periodically experiences a reload transient at each initiation. The reload transient depends on the cache size and on the sizes of the footprints in the cache of the competing programs, where a program footprint is defined to be the set of lines in the cache in active use by the program. The model shows that the size of the transient is related to the normal distribution function. A simulation based on program-address traces shows excellent agreement between the model and the observations.
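
The footprint notion is easy to reproduce in a toy setting. Below is a minimal sketch (synthetic traces and a fully associative LRU cache; all parameters are illustrative assumptions) that measures a program's footprint, how much of it a competitor displaces, and the reload transient seen on resumption:

```python
import random
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity, self.lines, self.misses = capacity, OrderedDict(), 0
    def touch(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)
        else:
            self.misses += 1
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)
            self.lines[line] = True

random.seed(2)
victim   = [('V', random.randint(0, 63)) for _ in range(4000)]
intruder = [('I', random.randint(0, 63)) for _ in range(2000)]

cache = LRUCache(96)
for ref in victim:                     # victim runs and builds its footprint
    cache.touch(ref)
footprint = {l for l in cache.lines if l[0] == 'V'}

for ref in intruder:                   # victim suspended; intruder displaces lines
    cache.touch(ref)
displaced = footprint - set(cache.lines)

before = cache.misses
for ref in victim[:1000]:              # victim resumes: the reload transient
    cache.touch(ref)
print(f"footprint={len(footprint)} displaced={len(displaced)} "
      f"reload-transient misses={cache.misses - before}")
```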


IEEE Transactions on Computers | 1992

Improving disk cache hit-ratios through cache partitioning

Dominique Thiebaut; Harold S. Stone; Joel L. Wolf

An adaptive algorithm for managing fully associative cache memories shared by several identifiable processes is presented. The on-line algorithm extends an earlier model due to H.S. Stone et al. (1989) and partitions the cache storage into disjoint blocks whose sizes are determined by the locality of the processes accessing the cache. Simulation results of traces for 32-MB disk caches show a relative improvement in the overall and read hit-ratios in the range of 1% to 2% over those generated by a conventional least recently used replacement algorithm. The analysis of a queuing network model shows that such an increase in hit-ratio in a system with a heavy throughput of I/O requests can provide a significant decrease in disk response time.
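
A toy version of the adaptive idea (an illustration of block migration between per-process partitions, not the paper's algorithm or its locality estimator; the traces, epoch length, and migration rule are all made-up assumptions) might look like this:

```python
import random
from collections import OrderedDict

class Partition:
    """A private LRU partition whose size can be changed between epochs."""
    def __init__(self, size):
        self.size, self.lines, self.misses = size, OrderedDict(), 0
    def touch(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)
            return
        self.misses += 1
        while len(self.lines) >= self.size:
            self.lines.popitem(last=False)
        self.lines[line] = True

random.seed(5)
parts = {'A': Partition(32), 'B': Partition(32)}   # equal split to start
refs = [('A', random.randint(0, 23)) if random.random() < 0.5
        else ('B', random.randint(0, 47)) for _ in range(20000)]

EPOCH, base = 500, {p: 0 for p in parts}
for i, (pid, line) in enumerate(refs, 1):
    parts[pid].touch((pid, line))
    if i % EPOCH == 0:                             # repartition between epochs
        rate = {p: parts[p].misses - base[p] for p in parts}
        lo, hi = min(rate, key=rate.get), max(rate, key=rate.get)
        if lo != hi and parts[lo].size > 1:
            parts[lo].size -= 1                    # shrink the partition missing less
            parts[hi].size += 1
            while len(parts[lo].lines) > parts[lo].size:
                parts[lo].lines.popitem(last=False)
        base = {p: parts[p].misses for p in parts}

for pid, part in parts.items():
    print(pid, "final size:", part.size, "total misses:", part.misses)
```

Blocks migrate toward the process with the larger working set, which is the mechanism the paper's locality-driven partitioning exploits.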


IEEE Parallel & Distributed Technology: Systems & Applications | 1993

Multiple reservations and the Oklahoma update

Janice M. Stone; Harold S. Stone; Philip Heidelberger; John Turek

A multiple reservation approach that allows atomic updates of multiple shared variables and simplifies concurrent and nonblocking code for managing shared data structures such as queues and linked lists is presented. The method can be implemented as an extension to any cache protocol that grants write access to at most one processor at a time. Performance improvement, automatic restart, and livelock avoidance are discussed. Some sample programs are examined.
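
As a rough software analogue of the reservation idea (an assumption-laden sketch, not the paper's hardware protocol: a Python lock stands in for the cache protocol's exclusive-write phase, and version counters stand in for reservations), an optimistic multi-variable update with automatic restart might be emulated like this:

```python
import threading

class Versioned:
    """A shared variable with a version counter acting as its reservation tag."""
    def __init__(self, value):
        self.value, self.version = value, 0

_commit = threading.Lock()   # stands in for the protocol's exclusive-write phase

def atomic_update(vars_, compute):
    """Optimistically retry until all variables commit together unchanged."""
    while True:
        snapshot = [(v, v.value, v.version) for v in vars_]
        new_values = compute([val for _, val, _ in snapshot])
        with _commit:
            if all(v.version == ver for v, _, ver in snapshot):
                for (v, _, _), nv in zip(snapshot, new_values):
                    v.value, v.version = nv, v.version + 1
                return   # all reservations held: committed atomically
        # a conflicting update slipped in: restart automatically

a, b = Versioned(10), Versioned(0)

def transfer(vals):
    x, y = vals
    return [x - 1, y + 1]   # move one unit from a to b

def worker():
    for _ in range(1000):
        atomic_update([a, b], transfer)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(a.value, b.value)   # 10 - 4000 and 0 + 4000: no lost updates
```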


IBM Journal of Research and Development | 1987

Efficient search techniques—an empirical study of the N-Queens problem

Harold S. Stone; Janice M. Stone

This paper investigates the cost of finding the first solution to the N-Queens Problem using various backtrack search strategies. Among the empirical results obtained are the following: 1) To find the first solution to the N-Queens Problem using lexicographic backtracking requires a time that grows exponentially with increasing values of N. 2) For most even values of N < 30, search time can be reduced by a factor of 2 to 70 by searching lexicographically for a solution to the N+1-Queens Problem. 3) By reordering the search so that the queen placed next is the queen with the fewest possible moves to make, it is possible to find solutions very quickly for all N < 97, improving running time by dozens of orders of magnitude over lexicographic backtrack search. To estimate the improvement, we present an algorithm that is a variant of algorithms of Knuth and Purdom for estimating the size of the unvisited portion of a tree from the statistics of the visited portion.
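
The effect of reordering is easy to reproduce. The sketch below is a toy solver (node counts come from this implementation, not from the paper's experiments) contrasting lexicographic placement with placing the most-constrained queen first:

```python
def solve(n, reorder):
    cols, d1, d2 = set(), set(), set()   # occupied columns and diagonals
    placed = {}                          # row -> column
    nodes = 0

    def legal(r):
        return [c for c in range(n)
                if c not in cols and r - c not in d1 and r + c not in d2]

    def rec():
        nonlocal nodes
        if len(placed) == n:
            return True
        free = [r for r in range(n) if r not in placed]
        # Lexicographic: next row in order. Reordered: row with fewest legal moves.
        row = min(free, key=lambda r: len(legal(r))) if reorder else free[0]
        for c in legal(row):
            nodes += 1
            placed[row] = c
            cols.add(c); d1.add(row - c); d2.add(row + c)
            if rec():
                return True
            del placed[row]
            cols.remove(c); d1.remove(row - c); d2.remove(row + c)
        return False

    rec()
    return nodes

for n in (8, 12, 16):
    print(n, "lexicographic:", solve(n, False), "most-constrained:", solve(n, True))
```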


Winter Simulation Conference | 1990

Parallel trace-driven cache simulation by time partitioning

Philip Heidelberger; Harold S. Stone

The authors describe a technique for performing parallel simulation of a trace of address references for the purpose of evaluating different cache structures. One way to achieve fast parallel simulation is to simulate the individual independent sets of a cache concurrently on different computers, but this technique is not efficient in a statistical sense because of a high correlation of the activity between different sets. Only a small fraction of sets should actually be simulated. To put parallelism to effective use, a trace of the sets to be simulated can be partitioned into disjoint time intervals, and each interval can be simulated concurrently. Because the contents of the cache are unknown at the start of the time intervals, this parallel simulation does not produce the correct counts of cache hits and misses. However, after simulating the trace in parallel, a small amount of resimulation can produce the correct counts. The resimulation effort required is proportional to the size of the cache simulated and not to the length of the trace.
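
The property behind the fix-up pass is concrete for a fully associative LRU cache: once an interval has referenced `capacity` distinct lines, the cache state no longer depends on the unknown starting contents. The sketch below (a synthetic trace, with a sequential loop standing in for the parallel phase) simulates intervals cold and then repairs only each interval's prefix:

```python
import random
from collections import OrderedDict

def simulate(trace, capacity, state=None):
    """Fully associative LRU simulation; returns miss indices and final state."""
    cache, misses = OrderedDict(state or {}), []
    for t, line in enumerate(trace):
        if line in cache:
            cache.move_to_end(line)
        else:
            misses.append(t)
            if len(cache) >= capacity:
                cache.popitem(last=False)
            cache[line] = True
    return misses, cache

random.seed(3)
CAP = 32
trace = [random.randint(0, 99) for _ in range(12000)]
chunks = [trace[i:i + 4000] for i in range(0, len(trace), 4000)]

# "Parallel" phase: each interval simulated from a cold (unknown) cache.
cold = [simulate(c, CAP) for c in chunks]

# Repair phase: cold results are exact once CAP distinct lines have been seen,
# so only the prefix before that point is resimulated with the true state.
total = len(cold[0][0])                 # interval 0 really does start cold
prev_state = cold[0][1]
for chunk, (cold_misses, end_state) in zip(chunks[1:], cold[1:]):
    seen, cut = set(), len(chunk)
    for i, line in enumerate(chunk):
        seen.add(line)
        if len(seen) >= CAP:
            cut = i + 1                 # cold state is exact from here on
            break
    prefix_misses, _ = simulate(chunk[:cut], CAP, prev_state)
    total += len(prefix_misses) + sum(1 for t in cold_misses if t >= cut)
    prev_state = end_state              # also exact, since CAP lines were touched
print("repaired:", total, "serial:", len(simulate(trace, CAP)[0]))
```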


IEEE Computer | 1991

Computer architecture in the 1990s

Harold S. Stone; John Cocke

Some of the technologies that will drive the advances of the 1990s are explored. A brief tutorial is given to explain the fundamental speed limits of metal interconnections. The advantages and disadvantages of optical interconnections and where they may be used are discussed in some detail. Trends in speeding up performance by increasing data-path width and by increasing the number of operations performed are reviewed, and questions of efficiency are examined. The advent of super-reliable machines produced at very low cost by replicating entire processors is examined.


IBM Journal of Research and Development | 1986

The average complexity of depth-first search with backtracking and cutoff

Harold S. Stone; Paolo Sipala

This paper analyzes two algorithms for depth-first search of binary trees. The first algorithm uses a search strategy that terminates the search when a successful leaf is reached. The algorithm does not use internal cutoff to restrict the search space. If N is the depth of the tree, then the average number of nodes visited by the algorithm is as low as O(N) and as high as O(2^N), depending only on the value of the probability parameter that characterizes the search. The second search algorithm uses backtracking with cutoff. A decision to cut off the search at a node eliminates the entire subtree below that node from further consideration. The surprising result for this algorithm is that the average number of nodes visited grows linearly in the depth of the tree, regardless of the cutoff probability. If the cutoff probability is high, then the search has a high probability of failing without examining much of the tree. If the cutoff probability is low, then the search has a high probability of succeeding on the leftmost path of the tree without performing extensive backtracking. This model sheds light on why some instances of NP-complete problems are solvable in practice with a low average complexity.
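
A quick Monte Carlo experiment makes the linear-growth claim plausible. The model below is a simplification, not the paper's exact analysis: each internal node is independently cut off with probability q, and the search succeeds at any node reached at depth N.

```python
import random

def dfs(depth, n, q, counter):
    counter[0] += 1                       # visit this node
    if depth == n:
        return True                       # successful leaf at full depth
    if random.random() < q:
        return False                      # cutoff prunes the whole subtree
    return dfs(depth + 1, n, q, counter) or dfs(depth + 1, n, q, counter)

random.seed(4)
for n in (10, 20, 40):
    for q in (0.1, 0.5, 0.9):
        total = 0
        for _ in range(2000):
            c = [0]
            dfs(0, n, q, c)
            total += c[0]
        print(f"depth={n} cutoff q={q}: avg nodes visited = {total / 2000:.1f}")
```

With low q the search tends to succeed along a leftmost surviving path, with high q it tends to fail almost immediately, and in both regimes the average visit count stays modest relative to the 2^N-node tree.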


IEEE Transactions on Very Large Scale Integration Systems | 1993

Fully differential optical interconnections for high-speed digital systems

Chung-Sheng Li; Harold S. Stone; Y. Kwark; C. M. Olsen

This work presents the design details and experimental results for a parallel optical link. The link is designed for connections within high-speed digital systems, specifically for board- and backplane-level interconnections. The link can contain as many fibers in parallel as technology permits. The unusual aspects of this interconnection system are that it is DC-coupled and uses fully differential inputs, two optical channels per signal, to achieve self-thresholding and noise immunity. A chip set consisting of a 2.5-Gb/s bipolar differential laser driver, an 800-Mb/s GaAs MSM (metal-semiconductor-metal) preamplifier array, an 800-Mb/s GaAs MSM preamplifier-postamplifier array, and a GaAs MSM preamplifier array in which each preamplifier has a different bandwidth varying from 300 Mb/s to 2 Gb/s has been designed, fabricated, and tested to serve as a vehicle for verifying the concept. Although the experimental testing of the entire interconnect system is not yet complete, the experimental studies presented show a bandwidth in excess of 800 MHz and excellent signal isolation between channels.


IEEE Transactions on Computers | 1984

Database Applications of the FETCH-AND-ADD Instruction

Harold S. Stone

The FETCH-AND-ADD instruction provides for synchronization of multiple processes in a parallel manner. This paper explores the use of FETCH-AND-ADD in the context of database systems. We show how to enqueue locks, detect lock conflicts, and release locks without resorting to critical program sections that require mutual exclusion during execution. The scheme is compatible with a variant of lock management proposed by Rosenkrantz and Stearns. A second approach to parallel lock management is based on a reservation scheme by Milenkovic. This methodology uses a FETCH-AND-ADD implementation of a priority queue. An implementation of such a queue originally reported by Gottlieb and Kruskal is used for this purpose, although the storage requirements for queue management may be unacceptably large in specific cases. Both approaches described in the paper suggest that FETCH-AND-ADD is potentially effective for eliminating serial bottlenecks caused by lock conflicts in multiprocessor systems.
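
To make the primitive concrete, here is a ticket-lock sketch in the same spirit (an illustration of FETCH-AND-ADD queueing, not the paper's lock manager; Python has no atomic fetch-and-add, so a small Lock emulates the hardware instruction):

```python
import threading, time

class FetchAndAdd:
    """Software emulation of an atomic FETCH-AND-ADD register."""
    def __init__(self):
        self._v, self._guard = 0, threading.Lock()
    def fetch_and_add(self, delta=1):
        with self._guard:                      # hardware would do this in one step
            old, self._v = self._v, self._v + delta
            return old
    def read(self):
        return self._v

next_ticket, now_serving = FetchAndAdd(), FetchAndAdd()
counter = 0

def worker():
    global counter
    for _ in range(1000):
        my = next_ticket.fetch_and_add(1)      # enqueue: take the next ticket
        while now_serving.read() != my:        # wait until it is our turn
            time.sleep(0)                      # yield; real code would spin
        counter += 1                           # serialized critical section
        now_serving.fetch_and_add(1)           # dequeue: hand off to successor

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                                 # 4000: FIFO handoff, no lost updates
```

Requesters enqueue and dequeue with one atomic operation each, so lock handoff itself needs no mutually exclusive critical section, which is the property the paper exploits.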
