Publication


Featured research published by Scott F. Kaplan.


Measurement and Modeling of Computer Systems | 1999

EELRU: simple and effective adaptive page replacement

Yannis Smaragdakis; Scott F. Kaplan; Paul R. Wilson

Despite the many replacement algorithms proposed throughout the years, approximations of Least Recently Used (LRU) replacement are predominant in actual virtual memory management systems because of their simplicity and efficiency. LRU, however, exhibits well-known performance problems for regular access patterns of size larger than the main memory. In this paper we present Early Eviction LRU (EELRU): an adaptive replacement algorithm based on the principle of detecting when the LRU algorithm underperforms (i.e., when the pages being fetched are often the ones that were recently evicted). In simulations, EELRU proves to be quite effective for many memory sizes and several applications, often decreasing paging by over 30% for programs with large-scale reference patterns and by over 10% for programs with small-scale patterns. Additionally, the algorithm is very robust, rarely underperforming LRU. Our experiments are mostly with traces from the recent research literature to allow for easy comparison with previous results.
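The early-eviction principle is compact enough to sketch. The following Python is a minimal, illustrative rendering of the idea only, not the published algorithm: the fault-on-recently-evicted heuristic, the `early_point` parameter, and the eviction-history size are assumptions made for illustration.

```python
from collections import OrderedDict

class EarlyEvictionSketch:
    """Minimal sketch of the early-eviction idea, not the published
    EELRU algorithm: when faults keep landing on recently evicted
    pages (a looping pattern slightly larger than memory), evict from
    an early position in the recency order instead of the LRU tail,
    so the oldest pages survive until the loop returns to them."""

    def __init__(self, frames, early_point, history_size=4096):
        self.frames = frames            # main-memory page frames
        self.early_point = early_point  # recency position (>= 1) for early eviction
        self.resident = OrderedDict()   # resident pages, oldest first
        self.evicted = OrderedDict()    # bounded history of evicted pages
        self.history_size = history_size
        self.faults = 0

    def access(self, page):
        if page in self.resident:
            self.resident.move_to_end(page)      # hit: refresh recency
            return
        self.faults += 1
        refetch = page in self.evicted           # signal that plain LRU is losing
        self.evicted.pop(page, None)
        if len(self.resident) >= self.frames:
            pages = list(self.resident)          # oldest ... newest
            if refetch:
                # early eviction: take a victim near the recent end
                victim = pages[max(len(pages) - self.early_point, 0)]
            else:
                victim = pages[0]                # ordinary LRU victim
            del self.resident[victim]
            self.evicted[victim] = True
            if len(self.evicted) > self.history_size:
                self.evicted.popitem(last=False) # drop the oldest history entry
        self.resident[page] = True               # fetch the demanded page
```

The journal version below replaces this crude heuristic with an online cost/benefit analysis over aggregate recency information.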


International Symposium on Memory Management | 2004

Automatic heap sizing: taking real memory into account

Ting Yang; Matthew Hertz; Emery D. Berger; Scott F. Kaplan; J. Eliot B. Moss

Heap size has a huge impact on the performance of garbage collected applications. A heap that barely meets the application's needs causes excessive GC overhead, while a heap that exceeds physical memory induces paging. Choosing the best heap size a priori is impossible in multiprogrammed environments, where physical memory allocations to processes change constantly. We present an automatic heap-sizing algorithm applicable to different garbage collectors with only modest changes. It relies on an analytical model and on detailed information from the virtual memory manager. The model characterizes the relation between collection algorithm, heap size, and footprint. The virtual memory manager tracks recent reference behavior, reporting the current footprint and allocation to the collector. The collector uses those values as inputs to its model to compute a heap size that maximizes throughput while minimizing paging. We show that our adaptive heap-sizing algorithm can substantially reduce running time over fixed-size heaps.
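The resizing loop implied by this abstract fits in a few lines. The sketch below is a simplification under assumed names: the abstract promises only that the VM manager reports footprint and allocation, and that a collector-specific model maps the difference to a new heap size; the linear `slope` here stands in for that model.

```python
MIN_HEAP = 4 << 20   # illustrative floor: 4 MB

def adapt_heap_size(heap_size, footprint, allocation, slope=1.0):
    """One resizing step in the spirit of the paper's model (simplified;
    'slope' stands in for the collector-specific relation between heap
    size and footprint, assumed locally linear here).

    footprint  -- process footprint reported by the VM manager (bytes)
    allocation -- main memory the VM manager can give the process (bytes)

    footprint > allocation means paging: shrink the heap.
    footprint < allocation means idle memory: grow the heap to collect less.
    """
    return int(max(MIN_HEAP, heap_size + (allocation - footprint) / slope))
```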


Performance Evaluation | 2003

The EELRU adaptive replacement algorithm

Yannis Smaragdakis; Scott F. Kaplan; Paul R. Wilson

The wide performance gap between processors and disks ensures that effective page replacement remains an important consideration in modern systems. This paper presents early eviction LRU (EELRU), an adaptive replacement algorithm. EELRU uses aggregate recency information to recognize the reference behavior of a workload and to adjust its speed of adaptation. An on-line cost/benefit analysis guides replacement decisions. This analysis is based on the LRU stack model (LRUSM) of program behavior. Essentially, EELRU is an on-line approximation of an optimal algorithm for the LRUSM. We prove that EELRU offers strong theoretical guarantees of performance relative to the LRU replacement algorithm. EELRU can never be more than a factor of 3 worse than LRU, while in a common best case it can be better than LRU by a large factor (proportional to the number of pages in memory).

The goal of EELRU is to provide a simple replacement algorithm that adapts to reference patterns at all scales. Thus, EELRU should perform well for a wider range of programs and memory sizes than other algorithms. Practical experiments validate this claim. For a large number of programs and wide ranges of memory sizes, we show that EELRU outperforms LRU, typically reducing misses by 10-30%, and occasionally by much more--sometimes by a factor of 2-10. It rarely performs worse than LRU, and then only by a small amount.
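Under the LRU stack model, the cost/benefit comparison amounts to summing one recency histogram two different ways. The sketch below is schematic: the names (`hist`, the late boundary `l`) and the fixed region boundaries are my assumptions, whereas the paper maintains this analysis online and optimizes over the eviction points.

```python
def early_eviction_benefit(hist, mem, e, l):
    """Schematic cost/benefit comparison under the LRU stack model.

    hist[i] -- observed touches at recency position i (0 = most recent)
    mem     -- number of page frames
    e       -- early eviction point, e < mem
    l       -- late point bounding the protected old region, l > mem

    Plain LRU keeps positions [0, mem). Early eviction instead keeps the
    e most recent pages plus mem - e pages from the old end of [0, l).
    A positive return value says early eviction should win.
    """
    lru_hits = sum(hist[:mem])
    early_hits = sum(hist[:e]) + sum(hist[l - (mem - e):l])
    return early_hits - lru_hits
```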


ACM Transactions on Modeling and Computer Simulation | 2003

Flexible reference trace reduction for VM simulations

Scott F. Kaplan; Yannis Smaragdakis; Paul R. Wilson

The unmanageably large size of reference traces has spurred the development of sophisticated trace reduction techniques. In this article we present two new algorithms for trace reduction: Safely Allowed Drop (SAD) and Optimal LRU Reduction (OLR). Both achieve high reduction factors and guarantee exact simulations for common replacement policies and for memories larger than a user-defined threshold. In particular, simulation on OLR-reduced traces is accurate for the LRU replacement algorithm, while simulation on SAD-reduced traces is accurate for the LRU and OPT algorithms. Both policies can easily be modified and extended to maintain timing information, thus allowing for exact simulation of the Working Set and VMIN policies. OLR also satisfies an optimality property: for a given original trace and chosen memory size, it produces the shortest possible reduced trace that has the same LRU behavior as the original for a memory of at least the chosen size. We present a proof of this optimality of OLR, and show that SAD, while not optimal, yields nearly optimal performance in practice.

Our approach has multiple applications, especially in simulating virtual memory systems; many page replacement algorithms are similar to LRU in that more recently referenced pages are likely to be resident. For several replacement algorithms in the literature, SAD- and OLR-reduced traces yield exact simulations. For many other algorithms, our trace reduction eliminates information that matters little: we present extensive measurements to show that the error for simulations of the clock and segq (segmented queue) replacement policies (the most common LRU approximations) is under 3% for the vast majority of memory sizes. In nearly all cases, the error is much smaller than that incurred by the well-known stack deletion technique.

SAD and OLR have many desirable properties. In practice, they achieve reduction factors up to several orders of magnitude. The reduction translates to both storage savings and simulation speedups. Both techniques require little memory and perform a single forward traversal of the original trace, making them suitable for online trace reduction. Neither requires that the simulator be modified to accept the reduced trace.
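To make the recency intuition concrete, here is a sketch of the simpler stack-deletion style of reduction that the article compares against, not the exact SAD/OLR rules: any reference to a page already among the k most recently used pages is a guaranteed hit for every memory of at least k frames, so it is dropped. SAD and OLR add the conditions that keep LRU and OPT simulation exact; those are omitted here.

```python
from collections import OrderedDict

def recency_reduce(trace, k):
    """Stack-deletion-style trace reduction (the baseline the article
    compares against, NOT the exact SAD/OLR rules): a single forward
    pass that drops references to pages in the top k of the LRU stack."""
    stack = OrderedDict()          # LRU stack: most recently used last
    reduced = []
    for page in trace:
        if page in stack and page in list(stack)[-k:]:
            stack.move_to_end(page)     # guaranteed hit for mem >= k: drop
        else:
            reduced.append(page)        # keep the reference
            stack[page] = True
            stack.move_to_end(page)
    return reduced

# e.g. recency_reduce("abcabcxaby", 3) keeps only the cold misses and the
# re-references that come from below the top-3 stack positions.
```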


International Symposium on Memory Management | 2002

Adaptive caching for demand prepaging

Scott F. Kaplan; Lyle A. McGeoch; Megan F. Cole

Demand prepaging was long ago proposed as a method for taking advantage of high disk bandwidths and avoiding long disk latencies by fetching, at each page fault, not only the demanded page but also other pages predicted to be used soon. Studies performed more than twenty years ago found that demand prepaging would not be generally beneficial. Those studies failed to examine thoroughly the interaction between prepaging and main memory caching. It is unclear how many main memory page frames should be allocated to cache pages that were prepaged but have not yet been referenced. This issue is critical to the efficacy of any demand prepaging policy.

In this paper, we examine prepaged allocation and its interaction with two other important demand prepaging parameters: the degree, which is the number of extra pages that may be fetched at each page fault, and the predictor that selects which pages to prepage. The choices for these two parameters, the reference behavior of the workload, and the main memory size all substantially affect the appropriate choice of prepaged allocation. In some situations, demand prepaging cannot provide benefit, as any allocation to prepaged pages will increase page faults, while in other situations, a good choice of allocation will yield a substantial reduction in page faults. We will present a mechanism that dynamically adapts the prepaged allocation on-line, as well as experimental results that show that this mechanism typically reduces page faults by 10 to 40% and sometimes by more than 50%. In those cases where demand prepaging should not be used, the mechanism correctly allocates no space for prepaged pages and thus does not increase the number of page faults. Finally, we will show that prepaging offers substantial benefits over the simpler solution of using larger pages, which can substantially increase page faults.
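The allocation question can be framed as an online comparison of two benefits; the sketch below is one simple way to do so, and the ghost-list bookkeeping and unit step size are my assumptions rather than the paper's mechanism: frames holding prepaged-but-unreferenced pages earn credit when such a page is finally referenced, while a ghost list of recently evicted demand pages earns credit when a fault hits it.

```python
def adapt_prepage_allocation(allocation, prepage_hit, ghost_hit,
                             step=1, cap=None):
    """One adaptation step, illustrative only: grow the allocation for
    prepaged-but-unreferenced pages when they are earning references,
    shrink it (to zero if need be) when recently evicted demand pages
    are being faulted back in instead.

    prepage_hit -- this event referenced a prepaged, not-yet-used page
    ghost_hit   -- this event faulted on a recently evicted demand page
    """
    if prepage_hit:
        allocation += step       # prepaging is paying off: give it more
    if ghost_hit:
        allocation -= step       # caching demand pages is worth more
    if cap is not None:
        allocation = min(allocation, cap)
    return max(0, allocation)    # zero allocation disables prepaging
```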


Measurement and Modeling of Computer Systems | 1999

Trace reduction for virtual memory simulations

Scott F. Kaplan; Yannis Smaragdakis; Paul R. Wilson

The unmanageably large size of reference traces has spurred the development of sophisticated trace reduction techniques. In this paper we present two new algorithms for trace reduction: Safely Allowed Drop (SAD) and Optimal LRU Reduction (OLR). Both achieve high reduction factors and guarantee exact simulations for common replacement policies and for memories larger than a user-defined threshold. In particular, simulation on OLR-reduced traces is accurate for the LRU replacement algorithm, while simulation on SAD-reduced traces is accurate for the LRU and OPT algorithms. OLR also satisfies an optimality property: for a given trace and memory size it produces the shortest possible trace that has the same LRU behavior as the original for a memory of at least this size.

Our approach has multiple applications, especially in simulating virtual memory systems; many page replacement algorithms are similar to LRU in that more recently referenced pages are likely to be resident. For several replacement algorithms in the literature, SAD- and OLR-reduced traces yield exact simulations. For many other algorithms, our trace reduction eliminates information that matters little: we present extensive measurements to show that the error for simulations of the CLOCK and SEGQ (segmented queue) replacement policies (the most common LRU approximations) is under 3% for the majority of memory sizes. In nearly all cases, the error is smaller than that incurred by the well-known stack deletion technique.

SAD and OLR have many desirable properties. In practice, they achieve reduction factors up to several orders of magnitude. The reduction translates to both storage savings and simulation speedups. Both techniques require little memory and perform a single forward traversal of the original trace, which makes them suitable for on-line trace reduction. Neither requires that the simulator be modified to accept the reduced trace.


Workshop on Software and Performance | 2004

Collecting whole-system reference traces of multiprogrammed and multithreaded workloads

Scott F. Kaplan

The simulated evaluation of memory management policies relies on reference traces--logs of memory operations performed by running processes. No existing approach to reference trace collection is applicable to a complete system, including the kernel and all processes. Specifically, none gather sufficient information for simulating the virtual memory management, the filesystem cache management, and the scheduling of a multiprogrammed, multithreaded workload. Existing trace collectors are also difficult to maintain and to port, making them partially or wholly unusable on many modern systems.

We present Laplace, a trace collector that can log every memory reference that a system performs. Laplace is implemented through modifications to a simulated processor and an operating system kernel. Because it collects references at the simulated CPU layer, Laplace produces traces that are complete and minimally distorted. Its modified kernel also logs selected events that a post-processor uses to associate every virtual memory and filesystem reference with its thread. It also uses this information to reconcile all uses of shared memory, which is a task that is more complex than previously believed. Laplace is capable of tracing a workload without modifications to any source, library, or executable file. Finally, Laplace is relatively easy to maintain and port to different architectures and kernels.
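The post-processing step described above, associating each reference with its thread, is at heart a merge of two time-ordered logs. The sketch below shows that merge under assumed record formats; Laplace's actual logs and event kinds are richer, and the shared-memory reconciliation the abstract mentions is not shown.

```python
def attribute_references(refs, switches):
    """Tag each memory reference with the thread that the kernel event
    log says was running when it occurred. Record formats are assumed:
    refs     -- time-ordered (timestamp, address) pairs from the CPU layer
    switches -- time-ordered (timestamp, thread_id) context-switch events
    """
    tagged = []
    i, current = 0, None
    for t, addr in refs:
        # advance through switch events that happened at or before t
        while i < len(switches) and switches[i][0] <= t:
            current = switches[i][1]
            i += 1
        tagged.append((t, addr, current))
    return tagged
```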


Measurement and Modeling of Computer Systems | 2004

Complete or fast reference trace collection for simulating multiprogrammed workloads: choose one

Scott F. Kaplan

Trace-driven simulation [8] provides a reproducible, controllable, and verifiable mechanism for evaluating memory management policies used at every level of the memory hierarchy. A reference trace collector gathers the inputs that drive such simulations. Each collector must interfere with normal execution so that it can capture the references—that is, gain control of execution when a reference occurs. Once a reference has been captured, it is then handled—that is, stored, filtered, or otherwise processed.

Most existing trace collectors operate only on a single process [5, 6, 7]. The few collectors that operate on multiple processes [1] are not capable of gathering all of the information needed to drive multiprogrammed simulations. Specifically, they do not record critical kernel events required to associate each reference with its thread, associate each thread with its process, account for file system accesses, and identify all uses of shared memory.

Most collectors have low capturing overhead. However, it is the handling overhead that dominates the total overhead of any collector that captures every reference [3], slowing execution by factors of at least 400. Thus, to reduce the total overhead, a collector must not capture all references. Significant event tracing [5] is the only existing method of selective capturing. It is a binary rewriting strategy that can identify code segments within which references may be inferred by a post-processor. This method, however, can be applied only to binaries whose symbol table information has not been stripped. It also cannot be applied to dynamically generated code, and it is not applicable to multiprogrammed workloads.
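A back-of-envelope model (my construction, not from the paper) shows why handling dominates: every captured reference pays the handling cost, so even cheap capture combined with expensive handling multiplies into slowdowns of the magnitude cited above.

```python
def collection_slowdown(refs_per_instr, capture_cycles, handle_cycles,
                        base_cpi=1.0):
    """Rough slowdown estimate: cycles per instruction with tracing
    divided by cycles per instruction without. All numbers illustrative."""
    traced_cpi = base_cpi + refs_per_instr * (capture_cycles + handle_cycles)
    return traced_cpi / base_cpi

# With ~0.3 memory references per instruction, capture at ~10 cycles, and
# handling (formatting, buffering, compressing) at ~1500 cycles:
# collection_slowdown(0.3, 10, 1500) -> ~454x, consistent with the
# "factors of at least 400" cited above.
```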


USENIX Annual Technical Conference | 1999

The case for compressed caching in virtual memory systems

Paul R. Wilson; Scott F. Kaplan; Yannis Smaragdakis


Operating Systems Design and Implementation | 2006

CRAMM: virtual memory support for garbage-collected applications

Ting Yang; Emery D. Berger; Scott F. Kaplan; J. Eliot B. Moss

Collaboration


Scott F. Kaplan's top co-authors and their affiliations.

Top Co-Authors

Paul R. Wilson, University of Texas at Austin
Yannis Smaragdakis, University of Massachusetts Amherst
Emery D. Berger, University of Massachusetts Amherst
J. Eliot B. Moss, University of Massachusetts Amherst
Ting Yang, University of Massachusetts Amherst
Matthew Hertz, University of Massachusetts Amherst
Tongping Liu, University of Texas at San Antonio