Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Scott W. Devine is active.

Publication


Featured research published by Scott W. Devine.


ACM Transactions on Modeling and Computer Simulation | 1997

Using the SimOS machine simulator to study complex computer systems

Mendel Rosenblum; Edouard Bugnion; Scott W. Devine; Stephen Alan Herrod

SimOS is an environment for studying the hardware and software of computer systems. SimOS simulates the hardware of a computer system in enough detail to boot a commercial operating system and run realistic workloads on top of it. This paper identifies two challenges that machine simulators such as SimOS must overcome in order to effectively analyze large complex workloads: handling long workload execution times and collecting data effectively. To study long-running workloads, SimOS includes multiple interchangeable simulation models for each hardware component. By selecting the appropriate combination of simulation models, the user can explicitly control the tradeoff between simulation speed and simulation detail. To handle the large amount of low-level data generated by the hardware simulation models, SimOS contains flexible annotation and event classification mechanisms that map the data back to concepts meaningful to the user. SimOS has been extensively used to study new computer hardware designs, to analyze application performance, and to study operating systems. We include two case studies that demonstrate how a low-level machine simulator such as SimOS can be used to study large and complex workloads.
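The interchangeable-model design is easy to picture in code. Below is a minimal, hypothetical sketch (not SimOS source) of a single hardware component exposed through one interface, with a fast functional model and a slower detailed model the user can swap between to trade simulation speed for detail; the names and per-instruction costs are invented for illustration.

```c
#include <stdio.h>

/* One common interface per hardware component; the user picks the model. */
typedef struct {
    const char *name;
    /* Simulate one instruction; returns simulated cycles consumed. */
    unsigned long (*step)(void);
} cpu_model;

static unsigned long fast_step(void) { return 1; }  /* functional: fixed cost */

static unsigned long detailed_step(void) {
    /* A detailed model would consult pipeline and cache state here;
     * a higher per-instruction cost stands in for that work. */
    return 4;
}

static cpu_model fast_model     = { "fast-functional", fast_step };
static cpu_model detailed_model = { "detailed-timing", detailed_step };

int main(void) {
    /* Position a workload with the fast model, then rerun the region of
     * interest under the detailed model -- the tradeoff SimOS exposes. */
    cpu_model *models[] = { &fast_model, &detailed_model };
    for (int m = 0; m < 2; m++) {
        unsigned long cycles = 0;
        for (int i = 0; i < 1000000; i++)
            cycles += models[m]->step();
        printf("%s: %lu simulated cycles\n", models[m]->name, cycles);
    }
    return 0;
}
```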


Architectural Support for Programming Languages and Operating Systems | 1996

Operating system support for improving data locality on CC-NUMA compute servers

Ben Verghese; Scott W. Devine; Anoop Gupta; Mendel Rosenblum

The dominant architecture for the next generation of shared-memory multiprocessors is CC-NUMA (cache-coherent non-uniform memory architecture). These machines are attractive as compute servers because they provide transparent access to local and remote memory. However, the access latency to remote memory is 3 to 5 times the latency to local memory. CC-NOW machines provide the benefits of cache coherence to networks of workstations, at the cost of even higher remote access latency. Given the large remote access latencies of these architectures, data locality is potentially the most important performance issue. Using realistic workloads, we study the performance improvements provided by OS-supported dynamic page migration and replication. Analyzing our kernel-based implementation, we provide a detailed breakdown of the costs. We show that sampling of cache misses can be used to reduce cost without compromising performance, and that TLB misses may not be a consistent approximation for cache misses. Finally, our experiments show that dynamic page migration and replication can substantially increase application performance, by as much as 30%, and reduce contention for resources in the NUMA memory system.
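As a rough illustration of the kind of per-page policy the paper evaluates, the hypothetical sketch below classifies a page from sampled remote-miss counts: a hot read-shared page is replicated, while a page hot on a single remote node is migrated there. The thresholds, structures, and four-node machine size are all invented, not taken from the kernel implementation.

```c
#include <stdio.h>

#define NNODES 4
#define MIGRATE_THRESHOLD 64   /* hypothetical tuning knobs */
#define SHARING_THRESHOLD 2

typedef struct {
    int home_node;
    int miss_count[NNODES];    /* sampled cache-miss counts, per node */
    int writable;
} page_info;

typedef enum { KEEP, MIGRATE, REPLICATE } page_action;

/* Decide what to do with a page based on where its misses come from. */
static page_action classify(const page_info *p, int *dest) {
    int hot = -1, hot_misses = 0, sharers = 0;
    for (int n = 0; n < NNODES; n++) {
        if (p->miss_count[n] > 0) sharers++;
        if (p->miss_count[n] > hot_misses) {
            hot_misses = p->miss_count[n];
            hot = n;
        }
    }
    if (hot < 0 || hot == p->home_node || hot_misses < MIGRATE_THRESHOLD)
        return KEEP;
    /* Read-only pages missed from several nodes are replicated;
     * pages hot on one remote node are migrated there. */
    *dest = hot;
    if (!p->writable && sharers >= SHARING_THRESHOLD)
        return REPLICATE;
    return MIGRATE;
}

int main(void) {
    page_info p = { .home_node = 0, .miss_count = {0, 120, 90, 0}, .writable = 0 };
    int dest = -1;
    printf("action=%d dest=%d\n", classify(&p, &dest), dest);
    return 0;
}
```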


Architectural Support for Programming Languages and Operating Systems | 1994

Scheduling and page migration for multiprocessor compute servers

Rohit Chandra; Scott W. Devine; Ben Verghese; Anoop Gupta; Mendel Rosenblum

Several cache-coherent shared-memory multiprocessors have been developed that are scalable and offer a very tight coupling between the processing resources. They are therefore quite attractive for use as compute servers for multiprogramming and parallel application workloads. Process scheduling and memory management, however, remain challenging due to the distributed main memory found on such machines. This paper examines the effects of OS scheduling and page migration policies on the performance of such compute servers. Our experiments are done on the Stanford DASH, a distributed-memory cache-coherent multiprocessor. We show that for our multiprogramming workloads consisting of sequential jobs, the traditional Unix scheduling policy does very poorly. In contrast, a policy incorporating cluster and cache affinity along with a simple page-migration algorithm offers up to two-fold performance improvement. For our workloads consisting of multiple parallel applications, we compare space-sharing policies that divide the processors among the applications to time-slicing policies such as standard Unix or gang scheduling. We show that space-sharing policies can achieve better processor utilization due to the operating point effect, but time-slicing policies benefit strongly from user-level data distribution. Our initial experience with automatic page migration suggests that policies based only on TLB miss information can be quite effective, and useful for addressing the data distribution problems of space-sharing schedulers.
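A cache- and cluster-affinity scheduler of the kind the paper argues for can be sketched in a few lines. In this hypothetical example (not the DASH kernel code), the scheduler prefers the job's last CPU, then any idle CPU in the same cluster, before falling back to any idle CPU, so caches and locally migrated pages stay warm; the CPU counts and structures are invented.

```c
#include <stdio.h>

#define NCPUS 8
#define CPUS_PER_CLUSTER 4   /* hypothetical DASH-like clustering */

typedef struct { int last_cpu; } job;

static int cluster_of(int cpu) { return cpu / CPUS_PER_CLUSTER; }

/* Pick a CPU for `j` from the idle set, favoring cache then cluster affinity. */
static int choose_cpu(const job *j, const int idle[NCPUS]) {
    if (idle[j->last_cpu])
        return j->last_cpu;                      /* cache affinity */
    for (int c = 0; c < NCPUS; c++)
        if (idle[c] && cluster_of(c) == cluster_of(j->last_cpu))
            return c;                            /* cluster affinity */
    for (int c = 0; c < NCPUS; c++)
        if (idle[c])
            return c;                            /* any idle CPU */
    return -1;                                   /* none idle: queue the job */
}

int main(void) {
    int idle[NCPUS] = { 0, 0, 1, 0, 1, 0, 0, 0 };
    job j = { .last_cpu = 1 };                      /* last ran in cluster 0 */
    printf("chose cpu %d\n", choose_cpu(&j, idle)); /* -> 2, same cluster */
    return 0;
}
```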


Symposium on Operating Systems Principles | 1995

Hive: fault containment for shared-memory multiprocessors

John M. Chapin; Mendel Rosenblum; Scott W. Devine; Tirthankar Lahiri; Dan Teodosiu; Anoop Gupta

Reliability and scalability are major concerns when designing operating systems for large-scale shared-memory multiprocessors. In this paper we describe Hive, an operating system with a novel kernel architecture that addresses these issues. Hive is structured as an internal distributed system of independent kernels called cells. This improves reliability because a hardware or software fault damages only one cell rather than the whole system, and improves scalability because few kernel resources are shared by processes running on different cells. The Hive prototype is a complete implementation of UNIX SVR4 and is targeted to run on the Stanford FLASH multiprocessor. This paper focuses on Hive's solution to the following key challenges: (1) fault containment, i.e., confining the effects of hardware or software faults to the cell where they occur, and (2) memory sharing among cells, which is required to achieve application performance competitive with other multiprocessor operating systems. Fault containment in a shared-memory multiprocessor requires defending each cell against erroneous writes caused by faults in other cells. Hive prevents such damage by using the FLASH firewall, a write permission bit-vector associated with each page of memory, and by discarding potentially corrupt pages when a fault is detected. Memory sharing is provided through a unified file and virtual memory page cache across the cells, and through a unified free page frame pool. We report early experience with the system, including the results of fault injection and performance experiments using SimOS, an accurate simulator of FLASH. The effects of faults were contained to the cell in which they occurred in all 49 tests where we injected fail-stop hardware faults, and in all 20 tests where we injected kernel data corruption. The Hive prototype executes test workloads on a four-processor four-cell system with between 0% and 11% slowdown as compared to SGI IRIX 5.2 (the version of UNIX on which it is based).
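The firewall mechanism is simple to picture. In this hypothetical C fragment (not Hive or FLASH code), each page carries a bit-vector naming the cells allowed to write it; when a cell fails, every page that cell could have written is marked potentially corrupt and discarded, which is what confines the fault. The field names and four-cell size are assumptions.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

#define NCELLS 4

typedef struct {
    uint8_t write_mask;        /* bit i set => cell i may write this page */
    bool    possibly_corrupt;  /* set when a permitted writer cell fails */
} page_firewall;

static bool firewall_write_ok(const page_firewall *pf, int cell) {
    return (pf->write_mask >> cell) & 1;
}

/* On a fault in `cell`, discard (mark corrupt) every page that cell
 * could have written, confining the damage to data it had access to. */
static void on_cell_fault(page_firewall pages[], int npages, int cell) {
    for (int i = 0; i < npages; i++)
        if (firewall_write_ok(&pages[i], cell))
            pages[i].possibly_corrupt = true;
}

int main(void) {
    page_firewall pages[3] = { { .write_mask = 0x1 },    /* cell 0 only */
                               { .write_mask = 0x3 },    /* cells 0 and 1 */
                               { .write_mask = 0x2 } };  /* cell 1 only */
    on_cell_fault(pages, 3, 1);  /* cell 1 fails */
    for (int i = 0; i < 3; i++)
        printf("page %d corrupt=%d\n", i, pages[i].possibly_corrupt);
    return 0;
}
```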


ACM Transactions on Computer Systems | 2012

Bringing Virtualization to the x86 Architecture with the Original VMware Workstation

Edouard Bugnion; Scott W. Devine; Mendel Rosenblum; Jeremy Sugerman; Edward Y. Wang

This article describes the historical context, technical challenges, and main implementation techniques used by VMware Workstation to bring virtualization to the x86 architecture in 1999. Although virtual machine monitors (VMMs) had been around for decades, they were traditionally designed as part of monolithic, single-vendor architectures with explicit support for virtualization. In contrast, the x86 architecture lacked virtualization support, and the industry around it had disaggregated into an ecosystem, with different vendors controlling the computers, CPUs, peripherals, operating systems, and applications, none of them asking for virtualization. We chose to build our solution independently of these vendors. As a result, VMware Workstation had to deal with new challenges associated with (i) the lack of virtualization support in the x86 architecture, (ii) the daunting complexity of the architecture itself, (iii) the need to support a broad combination of peripherals, and (iv) the need to offer a simple user experience within existing environments. These new challenges led us to a novel combination of well-known virtualization techniques, techniques from other domains, and new techniques. VMware Workstation combined a hosted architecture with a VMM. The hosted architecture enabled a simple user experience and offered broad hardware compatibility. Rather than exposing I/O diversity to the virtual machines, VMware Workstation also relied on software emulation of I/O devices. The VMM combined a trap-and-emulate direct execution engine with a system-level dynamic binary translator to efficiently virtualize the x86 architecture and support most commodity operating systems. By relying on x86 hardware segmentation as a protection mechanism, the binary translator could execute translated code at near hardware speeds. The binary translator also relied on partial evaluation and adaptive retranslation to reduce the overall overheads of virtualization. Written with the benefit of hindsight, this article shares the key lessons we learned from building the original system and from its later evolution.
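The core dispatch decision described above can be caricatured in a few lines. The hypothetical sketch below (not VMware source) shows the rule the article describes: guest code the hardware can run safely goes through direct execution, while privileged or otherwise unsafe code is routed to the binary translator. The state fields and the exact safety test are invented for illustration.

```c
#include <stdio.h>
#include <stdbool.h>

typedef enum { DIRECT_EXEC, BINARY_TRANSLATE } exec_mode;

/* A few bits of virtual CPU state relevant to the decision (hypothetical). */
typedef struct {
    int  cpl;               /* guest privilege level, 0..3 */
    bool v8086;             /* guest running virtual-8086 code */
    bool interrupts_masked; /* guest has IF cleared */
} vcpu_state;

/* Guest user-mode code behaves under trap-and-emulate, so it can run
 * directly; system-level code hits x86's non-virtualizable instructions
 * and must go through the dynamic binary translator. */
static exec_mode choose_mode(const vcpu_state *v) {
    if (v->cpl == 3 && !v->v8086 && !v->interrupts_masked)
        return DIRECT_EXEC;
    return BINARY_TRANSLATE;
}

int main(void) {
    vcpu_state user   = { .cpl = 3, .v8086 = false, .interrupts_masked = false };
    vcpu_state kernel = { .cpl = 0, .v8086 = false, .interrupts_masked = true  };
    printf("user code:   %s\n", choose_mode(&user)   == DIRECT_EXEC ? "direct" : "BT");
    printf("kernel code: %s\n", choose_mode(&kernel) == DIRECT_EXEC ? "direct" : "BT");
    return 0;
}
```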


Communications of the ACM | 1996

Implementing efficient fault containment for multiprocessors: confining faults in a shared-memory multiprocessor environment

Mendel Rosenblum; John M. Chapin; Dan Teodosiu; Scott W. Devine; Tirthankar Lahiri; Anoop Gupta



Archive | 1998

Virtual machine monitors for scalable multiprocessors

Edouard Bugnion; Scott W. Devine; Mendel Rosenblum


ACM Transactions on Computer Systems | 1997

Disco: running commodity operating systems on scalable multiprocessors

Edouard Bugnion; Scott W. Devine; Kinshuk Govil; Mendel Rosenblum


Archive | 1998

Virtualization system including a virtual machine monitor for a computer with a segmented architecture

Scott W. Devine; Edouard Bugnion; Mendel Rosenblum


Archive | 1998

System and method for virtualizing computer systems

Edouard Bugnion; Scott W. Devine; Mendel Rosenblum

Collaboration


Dive into Scott W. Devine's collaborations.

Top Co-Authors

John M. Chapin

Massachusetts Institute of Technology
