Willy Zwaenepoel
École Polytechnique Fédérale de Lausanne
Publications
Featured research published by Willy Zwaenepoel.
IEEE Computer | 1996
Cristiana Amza; Alan L. Cox; Sandhya Dwarkadas; Peter J. Keleher; Honghui Lu; Ramakrishnan Rajamony; Weimin Yu; Willy Zwaenepoel
Shared memory facilitates the transition from sequential to parallel processing. Since most data structures can be retained, simply adding synchronization achieves correct, efficient programs for many applications. We discuss our experience with parallel computing on networks of workstations using the TreadMarks distributed shared memory system. DSM allows processes to assume a globally shared virtual memory even though they execute on nodes that do not physically share memory. We illustrate a DSM system consisting of N networked workstations, each with its own memory. The DSM software provides the abstraction of a globally shared memory, in which each processor can access any data item without the programmer having to worry about where the data is or how to obtain its value.
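To make the abstract's point concrete, here is a minimal sketch of the programming model it describes: a sequential data structure is kept as-is, and correctness under parallel execution comes from simply adding synchronization around the shared update. Python threads stand in for DSM processes here; this is illustrative only and does not use TreadMarks' actual C API.

```python
# Illustrative sketch only: Python threads stand in for DSM processes.
# TreadMarks itself exposes a C interface; nothing below is its real API.
import threading

grid = [[1.0] * 8 for _ in range(8)]   # sequential data structure, retained as-is
lock = threading.Lock()                 # the only addition: synchronization
total = 0.0

def worker(rows):
    global total
    partial = sum(sum(grid[r]) for r in rows)
    with lock:                          # serialize the one shared update
        total += partial

threads = [threading.Thread(target=worker, args=(range(i, 8, 4),))
           for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(total)  # 64.0
```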
Symposium on Operating Systems Principles | 1991
John B. Carter; John K. Bennett; Willy Zwaenepoel
Munin is a distributed shared memory (DSM) system that allows shared memory parallel programs to be executed efficiently on distributed memory multiprocessors. Munin is unique among existing DSM systems in its use of multiple consistency protocols and in its use of release consistency. In Munin, shared program variables are annotated with their expected access pattern, and these annotations are then used by the runtime system to choose a consistency protocol best suited to that access pattern. Release consistency allows Munin to mask network latency and reduce the number of messages required to keep memory consistent. Munin's multiprotocol release consistency is implemented in software using a delayed update queue that buffers and merges pending outgoing writes. A sixteen-processor prototype of Munin is currently operational. We evaluate its implementation and describe the execution of two Munin programs that achieve performance within ten percent of message-passing implementations of the same programs. Munin achieves this level of performance with only minor annotations to the shared memory programs.
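The delayed update queue is the key mechanism here: writes made between an acquire and a release are buffered, and repeated writes to the same variable merge into a single pending update, so the release sends one message per variable rather than one per write. A minimal sketch of that idea, with a hypothetical interface that is not Munin's actual implementation:

```python
# Minimal sketch of a delayed update queue (hypothetical interface, not
# Munin's code): writes are buffered until release, and later writes to
# the same variable merge with (overwrite) earlier ones.
class DelayedUpdateQueue:
    def __init__(self, send):
        self.pending = {}      # variable name -> latest value (merged writes)
        self.send = send       # callback that ships updates at release time

    def write(self, var, value):
        self.pending[var] = value          # merge: keep only the newest write

    def release(self):
        for var, value in self.pending.items():
            self.send(var, value)          # one message per variable, not per write
        self.pending.clear()

q = DelayedUpdateQueue(send=lambda v, x: print("update", v, "=", x))
q.write("x", 1); q.write("x", 2); q.write("y", 7)
q.release()   # ships x=2 and y=7 only; the write x=1 was merged away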
Architectural Support for Programming Languages and Operating Systems | 1994
Michael Wu; Willy Zwaenepoel
This paper describes the architecture of eNVy, a large non-volatile main memory storage system built primarily with Flash memory. eNVy presents its storage space as a linear, memory-mapped array rather than as an emulated disk in order to provide an efficient and easy-to-use software interface. Flash memories provide persistent storage with solid-state memory access times at a lower cost than other solid-state technologies. However, they have a number of drawbacks. Flash chips are write-once, bulk-erase devices whose contents cannot be updated in place. They also suffer from slow program times and a limit on the number of program/erase cycles. eNVy uses a copy-on-write scheme, page remapping, a small amount of battery-backed SRAM, and high-bandwidth parallel data transfers to provide low-latency, in-place update semantics. A cleaning algorithm optimized for large Flash arrays is used to reclaim space. The algorithm is designed to wear the array evenly, thereby extending its lifetime. Software simulations of a 2-gigabyte eNVy system show that it can support I/O rates corresponding to approximately 30,000 transactions per second on the TPC-A database benchmark. Despite the added work done to overcome the deficiencies associated with Flash memories, average latencies to the storage system are as low as 180 ns for reads and 200 ns for writes. The estimated lifetime of this type of storage system is in the 10-year range when exposed to a workload of 10,000 transactions per second.
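The combination of copy-on-write, page remapping, and cleaning can be illustrated with a toy model: logical pages are redirected through a remap table to fresh flash slots on every update, and a cleaner reclaims the erase segment with the fewest live pages, breaking ties by erase count to spread wear. This is a simplified sketch under assumed sizes and policies, not eNVy's actual algorithm:

```python
# Toy model of flash out-of-place update (illustrative, not eNVy's code).
PAGES_PER_SEG = 4

class FlashArray:
    def __init__(self, nsegs):
        self.segs = [{"live": set(), "erases": 0} for _ in range(nsegs)]
        self.remap = {}                    # logical page -> (segment, slot)
        self.free = [(s, p) for s in range(nsegs) for p in range(PAGES_PER_SEG)]

    def write(self, lpage):
        old = self.remap.get(lpage)
        if old:                            # invalidate the stale copy
            self.segs[old[0]]["live"].discard(lpage)
        seg, slot = self.free.pop(0)       # copy-on-write: always a fresh slot
        self.segs[seg]["live"].add(lpage)
        self.remap[lpage] = (seg, slot)

    def clean(self):
        # victim: fewest live pages, ties broken by wear (fewest erases)
        victim = min(range(len(self.segs)),
                     key=lambda s: (len(self.segs[s]["live"]),
                                    self.segs[s]["erases"]))
        self.free = [fp for fp in self.free if fp[0] != victim]
        for lpage in list(self.segs[victim]["live"]):
            self.write(lpage)              # relocate live data elsewhere
        self.segs[victim]["erases"] += 1   # bulk erase reclaims the segment
        self.free += [(victim, p) for p in range(PAGES_PER_SEG)]

fa = FlashArray(nsegs=2)
for _ in range(3): fa.write(0)             # three updates to one logical page
fa.clean()                                  # reclaims slots held by stale copies
print(fa.segs[0]["erases"] + fa.segs[1]["erases"])  # 1 erase so far
```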
International Symposium on Computer Architecture | 1992
Peter J. Keleher; Alan L. Cox; Willy Zwaenepoel
Relaxed memory consistency models, such as release consistency, were introduced in order to reduce the impact of remote memory access latency in both software and hardware distributed shared memory (DSM). However, in a software DSM, it is also important to reduce the number of messages and the amount of data exchanged for remote memory access. Lazy release consistency is a new algorithm for implementing release consistency that lazily pulls modifications across the interconnect only when necessary. Trace-driven simulation using the SPLASH benchmarks indicates that lazy release consistency reduces both the number of messages and the amount of data transferred between processors. These reductions are especially significant for programs that exhibit false sharing and make extensive use of locks.
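The contrast with eager release consistency can be sketched in a few lines: instead of pushing modifications to all sharers at release time, the releaser only records write notices, and the next acquirer of the lock pulls just the pages it is missing. This is a simplified illustration of the lazy idea under an assumed structure, not the LRC protocol itself:

```python
# Simplified illustration of lazy release consistency (not the actual
# protocol): releases record write notices; acquires pull only the
# modifications the acquirer does not already have.
class LazyLock:
    def __init__(self):
        self.notices = {}              # page -> version at last release

    def release(self, dirty_pages, version):
        for p in dirty_pages:          # record notices, but send no data yet
            self.notices[p] = version

    def acquire(self, cached_versions, fetch):
        for page, ver in self.notices.items():
            if cached_versions.get(page, -1) < ver:
                cached_versions[page] = ver
                fetch(page)            # pull the modification only when needed

lock = LazyLock()
lock.release(dirty_pages={"A", "B"}, version=1)
lock.acquire(cached_versions={"A": 1}, fetch=lambda p: print("fetch", p))
# only page B is fetched: A was already current at this acquirer
```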
Virtual Execution Environments | 2005
Aravind Menon; Jose Renato Santos; Yoshio Turner; Gopalakrishnan Janakiraman; Willy Zwaenepoel
Virtual Machine (VM) environments (e.g., VMware and Xen) are experiencing a resurgence of interest for diverse uses including server consolidation and shared hosting. An application's performance in a virtual machine environment can differ markedly from its performance in a non-virtualized environment because of interactions with the underlying virtual machine monitor and other virtual machines. However, few tools are currently available to help debug performance problems in virtual machine environments. In this paper, we present Xenoprof, a system-wide statistical profiling toolkit implemented for the Xen virtual machine environment. The toolkit enables coordinated profiling of multiple VMs in a system to obtain the distribution of hardware events such as clock cycles and cache and TLB misses. The toolkit facilitates a better understanding of the performance characteristics of Xen's mechanisms, allowing the community to optimize the Xen implementation. We use our toolkit to analyze performance overheads incurred by networking applications running in Xen VMs. We focus on networking applications since virtualizing network I/O devices is relatively expensive. Our experimental results quantify Xen's performance overheads for network I/O device virtualization in uni- and multi-processor systems. With certain Xen configurations, networking workloads in the Xen environment can suffer significant performance degradation. Our results identify the main sources of this overhead, which should be the focus of Xen optimization efforts. We also show how our profiling toolkit was used to uncover and resolve performance bugs, encountered in our experiments, that caused unexpected application behavior.
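The core of system-wide statistical profiling is sample attribution: each hardware-counter overflow yields a sample tagged with the domain (a guest VM or the hypervisor) whose code was executing, and aggregating samples per domain approximates the event distribution across the system. A purely conceptual sketch of that aggregation step, with made-up sample data and none of Xenoprof's real interface:

```python
# Conceptual sketch of statistical sample aggregation (hypothetical data
# and names, not Xenoprof's interface): one sample is taken per N hardware
# events; counting samples per domain approximates the event distribution.
from collections import Counter

samples = [                       # (domain, program counter) per overflow
    ("dom0", 0x1A2B), ("domU1", 0x77F0), ("xen", 0xC0DE),
    ("domU1", 0x77F4), ("domU1", 0x8000), ("dom0", 0x1A2B),
]

by_domain = Counter(dom for dom, _ in samples)
total = sum(by_domain.values())
for dom, n in by_domain.most_common():
    print(f"{dom}: {n} samples ({100 * n / total:.0f}% of events)")
```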
Symposium on Reliable Distributed Systems | 1992
Elmootazbellah Nabil Elnozahy; David B. Johnson; Willy Zwaenepoel
Consistent checkpointing provides transparent fault tolerance for long-running distributed applications. Performance measurements of an implementation of consistent checkpointing are described. The measurements show that consistent checkpointing performs remarkably well. Eight computation-intensive distributed applications were executed on a network of 16 diskless Sun-3/60 workstations, and the performance without checkpointing was compared to the performance with consistent checkpoints taken at two-minute intervals. For six of the eight applications, the running time increased by less than 1% as a result of the checkpointing. The highest overhead measured was 5.8%. Incremental checkpointing and copy-on-write checkpointing were the most effective techniques in lowering the running time overhead. It is argued that these measurements show that consistent checkpointing is an efficient way to provide fault tolerance for long-running distributed applications.
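Incremental checkpointing, one of the two techniques credited with the low overhead, saves only the pages written since the previous checkpoint. A sketch of that idea under an assumed structure (a real implementation would detect dirty pages via page protection; a shadow copy stands in for that mechanism here):

```python
# Sketch of incremental checkpointing (illustrative, not the measured
# implementation): only pages modified since the last checkpoint are saved.
class IncrementalCheckpointer:
    def __init__(self, memory):
        self.memory = memory
        self.shadow = dict(memory)       # state as of the last checkpoint

    def checkpoint(self):
        dirty = {p, := None for p in ()} if False else \
                {p: v for p, v in self.memory.items()
                 if self.shadow.get(p) != v}
        self.shadow.update(dirty)
        return dirty                     # only these pages hit stable storage

mem = {0: "a", 1: "b", 2: "c"}
ckpt = IncrementalCheckpointer(mem)
mem[1] = "B"
print(ckpt.checkpoint())   # {1: 'B'} -- one page written, not the whole image
```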
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 1990
John K. Bennett; John B. Carter; Willy Zwaenepoel
We are developing Munin, a system that allows programs written for shared memory multiprocessors to be executed efficiently on distributed memory machines. Munin attempts to overcome the architectural limitations of shared memory machines, while maintaining their advantages in terms of ease of programming. Our system is unique in its use of loosely coherent memory, based on the partial order specified by a shared memory parallel program, and in its use of type-specific memory coherence. Instead of a single memory coherence mechanism for all shared data objects, Munin employs several different mechanisms, each appropriate for a different class of shared data object. These type-specific mechanisms are part of a runtime system that accepts hints from the user or the compiler to determine the coherence mechanism to be used for each object. This paper focuses on the design and use of Munin's memory coherence mechanisms, and compares our approach to previous work in this area.
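Type-specific coherence amounts to a dispatch from per-object annotations to protocols. The sketch below uses hypothetical pattern names, loosely modeled on Munin's access-pattern categories, and is not the runtime's actual mechanism:

```python
# Sketch of type-specific coherence selection (hypothetical names, loosely
# after Munin's categories): each shared object carries an annotation, and
# the runtime maps the annotation to a suitable coherence protocol.
PROTOCOLS = {
    "read_mostly":  "replicate widely; broadcast the rare updates",
    "migratory":    "single copy; move ownership with the accessing thread",
    "write_shared": "buffer local writes; merge diffs at synchronization",
    "conventional": "invalidate-based, like a hardware cache",
}

shared_objects = {}

def declare(name, pattern):
    # annotation = hint from the user or compiler; fall back to conventional
    shared_objects[name] = PROTOCOLS.get(pattern, PROTOCOLS["conventional"])

declare("global_min", "read_mostly")
declare("work_queue", "migratory")
for obj, proto in shared_objects.items():
    print(f"{obj}: {proto}")
```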
IEEE Transactions on Computers | 1992
Elmootazbellah Nabil Elnozahy; Willy Zwaenepoel
Manetho is a new transparent rollback-recovery protocol for long-running distributed computations. It uses a novel combination of antecedence graph maintenance, uncoordinated checkpointing, and sender-based message logging. Manetho simultaneously achieves the advantages of pessimistic message logging, namely limited rollback and fast output commit, and the advantage of optimistic message logging, namely low failure-free overhead. These advantages come at the expense of a complex recovery scheme.
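Antecedence graph maintenance can be sketched as follows: each process's current state interval records the remote intervals it causally depends on, and the graph is piggybacked on outgoing messages, so a surviving process can tell a recovering one exactly which intervals must be reconstructed. A simplified illustration under assumed structures, not Manetho's protocol:

```python
# Simplified sketch of antecedence graph maintenance (not Manetho's
# protocol): message receipt starts a new state interval whose antecedents
# are the sender's interval and the receiver's previous interval.
class Process:
    def __init__(self, pid):
        self.pid, self.interval = pid, 0
        self.graph = {(pid, 0): set()}     # (pid, interval) -> antecedents

    def send(self, msg):
        # piggyback this process's antecedence graph on the message
        return (msg, (self.pid, self.interval), dict(self.graph))

    def receive(self, packet):
        msg, sender_iv, sender_graph = packet
        self.graph.update(sender_graph)    # merge the sender's graph
        self.interval += 1                 # receipt starts a new interval
        self.graph[(self.pid, self.interval)] = {
            sender_iv, (self.pid, self.interval - 1)}
        return msg

p, q = Process("p"), Process("q")
q.receive(p.send("m1"))
print(q.graph[("q", 1)])   # q's new interval depends on p's and on q's past
```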
International World Wide Web Conference | 2004
Sameh Elnikety; Erich M. Nahum; John M. Tracey; Willy Zwaenepoel
This paper presents a method for admission control and request scheduling for multi-tiered e-commerce Web sites, achieving both stable behavior during overload and improved response times. Our method externally observes the execution costs of requests online, distinguishes different request types, and performs overload protection and preferential scheduling using relatively simple measurements and a straightforward control mechanism. Unlike previous proposals, which require extensive changes to the server or operating system, our method requires no modifications to the host OS, Web server, application server, or database. Since our method is external, it can be implemented in a proxy. We present such an implementation, called Gatekeeper, using it with standard software components on the Linux operating system. We evaluate the proxy using the industry-standard TPC-W workload generator in a typical three-tiered e-commerce environment. We show consistent performance during overload and throughput increases of up to 10 percent. Response time improves by up to a factor of 14, with only a 15 percent penalty to large jobs.
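The control loop described here combines three pieces: online per-type cost measurement, admission only while estimated in-flight cost stays under capacity, and preferential (cheapest-first) scheduling. A sketch of how these fit together in a proxy; the class, its methods, and the numbers are all hypothetical, not Gatekeeper's implementation:

```python
# Sketch of proxy-side admission control with preferential scheduling
# (illustrative, not Gatekeeper's code).
import heapq

class AdmissionProxy:
    def __init__(self, capacity):
        self.capacity, self.in_flight = capacity, 0.0
        self.cost = {}                     # request type -> observed avg cost
        self.queue = []                    # min-heap keyed by estimated cost

    def observe(self, rtype, measured):    # online external cost measurement
        old = self.cost.get(rtype, measured)
        self.cost[rtype] = 0.9 * old + 0.1 * measured   # smoothed estimate

    def submit(self, rtype):
        heapq.heappush(self.queue, (self.cost.get(rtype, 1.0), rtype))

    def dispatch(self):                    # admit while under capacity
        admitted = []
        while self.queue and self.in_flight + self.queue[0][0] <= self.capacity:
            c, rtype = heapq.heappop(self.queue)
            self.in_flight += c
            admitted.append(rtype)         # cheapest requests go first
        return admitted                    # excess load stays queued, not sent

gk = AdmissionProxy(capacity=10.0)
gk.observe("search", 2.0); gk.observe("checkout", 6.0)
gk.submit("checkout"); gk.submit("search"); gk.submit("search")
print(gk.dispatch())   # ['search', 'search', 'checkout']
```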
Journal of Algorithms | 1990
David B. Johnson; Willy Zwaenepoel
Message logging and checkpointing can provide fault tolerance in distributed systems in which all process communication is through messages. This paper presents a general model for reasoning about recovery in these systems. Using this model, we prove that the set of recoverable system states that have occurred during any single execution of the system forms a lattice, and that there is therefore always a unique maximum recoverable system state, which never decreases. Based on this model, we present an algorithm for determining this maximum recoverable state, and prove its correctness. Our algorithm utilizes all logged messages and checkpoints, and thus always finds the maximum recoverable state possible. Previous recovery methods using optimistic message logging and checkpointing have not considered the existing checkpoints, and thus may not find this maximum state. Furthermore, by utilizing the checkpoints, some messages received by a process before it was checkpointed may not need to be logged. Using our algorithm also adds less communication overhead to the system than do previous methods. Our model and algorithm can be used with any message logging protocol, whether pessimistic or optimistic, but their full generality is required only with optimistic logging protocols.
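The intuition behind the algorithm can be shown as a fixpoint computation: start each process at its newest restorable state and repeatedly roll back any state that depends on a state no other process can restore; the lattice result guarantees this converges to the unique maximum. The sketch below is a simplified version of that idea under assumed inputs, not the paper's exact algorithm:

```python
# Simplified sketch of finding the maximum recoverable state (illustrative,
# not the paper's algorithm). avail[i] is the newest state process i can
# restore from checkpoints plus logged messages; deps[(i, s)] lists remote
# states that i's state s depends on (via received messages).
def max_recoverable(avail, deps):
    cut = dict(avail)                       # start optimistically at the top
    changed = True
    while changed:                          # fixpoint: roll back bad states
        changed = False
        for i, s in list(cut.items()):
            while s > 0 and any(cut.get(j, 0) < t
                                for (j, t) in deps.get((i, s), [])):
                s -= 1                      # roll back past the bad dependency
                changed = True
            cut[i] = s
    return cut                              # the unique maximum consistent cut

avail = {"p": 3, "q": 2}                    # newest restorable state per process
deps = {("p", 3): [("q", 3)]}               # p's state 3 needs q's lost state 3
print(max_recoverable(avail, deps))         # {'p': 2, 'q': 2}
```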