Deborah A. Wallach
Massachusetts Institute of Technology
Publications
Featured research published by Deborah A. Wallach.
International Symposium on Computer Architecture | 1993
Michael D. Noakes; Deborah A. Wallach; William J. Dally
The MIT J-Machine multicomputer has been constructed to study the role of a set of primitive mechanisms in providing efficient support for parallel computing. Each J-Machine node consists of an integrated multicomputer component, the Message-Driven Processor (MDP), and 1 MByte of DRAM. The MDP provides mechanisms to support efficient communication, synchronization, and naming. A 512-node J-Machine is operational and is due to be expanded to 1024 nodes in March 1993. In this paper we discuss the design of the J-Machine and evaluate the effectiveness of the mechanisms incorporated into the MDP. We measure the performance of the communication and synchronization mechanisms directly and investigate the behavior of four complete applications.
Symposium on Operating Systems Principles | 1995
Kirk L. Johnson; M. Frans Kaashoek; Deborah A. Wallach
The C Region Library (CRL) is a new all-software distributed shared memory (DSM) system. CRL requires no special compiler, hardware, or operating system support beyond the ability to send and receive messages between processing nodes. It provides a simple, portable, region-based shared address space programming model that is capable of delivering good performance on a wide range of multiprocessor and distributed system architectures. Each region is an arbitrarily sized, contiguous area of memory. The programmer defines regions and delimits accesses to them using annotations. CRL implementations have been developed for two platforms: the Thinking Machines CM-5, a commercial multicomputer, and the MIT Alewife machine, an experimental multiprocessor offering efficient hardware support for both message passing and shared memory. Results are presented for up to 128 processors on the CM-5 and up to 32 processors on Alewife. Using Alewife as a vehicle, this thesis presents results from the first completely controlled comparison of scalable hardware and software DSM systems. These results indicate that CRL is capable of delivering performance that is competitive with hardware DSM systems: CRL achieves speedups within 15% of those provided by Alewife's native hardware-supported shared memory, even for challenging applications (e.g., Barnes-Hut) and small problem sizes. A second set of experimental results provides insight into the sensitivity of CRL's performance to increased communication costs (both higher latency and lower bandwidth). These results demonstrate that even for relatively challenging applications, CRL should be capable of delivering reasonable performance on current-generation distributed systems. Taken together, these results indicate the substantial promise of CRL and other all-software approaches to providing shared memory functionality and suggest that in many cases special-purpose hardware support for shared memory may not be necessary.
(Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
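The region-based model in the CRL abstract above (regions are contiguous buffers, and the programmer brackets each access with annotations) can be illustrated with a toy, single-process simulation. This is only a sketch under stated assumptions: the names `Region`, `Node`, `reading`, `writing`, and `fetches` are hypothetical and are not CRL's API, and the version-number scheme used to detect stale copies is a stand-in, not the coherence protocol CRL actually uses.

```python
from contextlib import contextmanager


class Region:
    """Toy region: a contiguous buffer plus a version number (hypothetical)."""

    def __init__(self, size):
        self.home = bytearray(size)  # authoritative copy on the "home node"
        self.version = 0             # bumped on every completed write


class Node:
    """Toy processing node that caches regions and counts simulated fetches."""

    def __init__(self, name):
        self.name = name
        self.cache = {}    # id(region) -> (version, local copy)
        self.fetches = 0   # number of simulated fetch messages to the home

    def _fresh_copy(self, region):
        cached = self.cache.get(id(region))
        if cached is None or cached[0] != region.version:
            self.fetches += 1  # stale or missing: simulate a message exchange
            cached = (region.version, bytearray(region.home))
            self.cache[id(region)] = cached
        return cached[1]

    @contextmanager
    def reading(self, region):
        # Start-of-read annotation: make sure the local copy is current.
        yield bytes(self._fresh_copy(region))

    @contextmanager
    def writing(self, region):
        # Start-of-write annotation: obtain a current copy to modify.
        local = self._fresh_copy(region)
        yield local
        # End-of-write annotation: publish the data and outdate other copies.
        region.home[:] = local
        region.version += 1
        self.cache[id(region)] = (region.version, local)


a, b = Node("a"), Node("b")
r = Region(4)
with a.writing(r) as buf:
    buf[0] = 7
with b.reading(r) as data:
    first = data[0]
with b.reading(r) as data:   # second read is served from b's cached copy
    pass
```

Because accesses are bracketed, the runtime knows exactly when a copy must be current and when a write is complete, which is what lets a scheme like this work with nothing more than message passing.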
ACM SIGOPS European Workshop | 1996
M. Frans Kaashoek; Dawson R. Engler; Gregory R. Ganger; Deborah A. Wallach
We introduce server operating systems, which are sets of abstractions and runtime support for specialized, high-performance server applications. We have designed and are implementing a prototype server OS with support for aggressive specialization, direct device-to-device access, an event-driven organization, and dynamic compiler-assisted ILP. Using this server OS, we have constructed an HTTP server that outperforms servers running on a conventional OS by more than an order of magnitude and that can safely timeshare the hardware platform with other applications.
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 1995
Deborah A. Wallach; Wilson C. Hsieh; Kirk L. Johnson; M. Frans Kaashoek; William E. Weihl
Low-overhead message passing is critical to the performance of many applications. Active Messages reduce the software overhead for message handling: messages are run as handlers instead of as threads, which avoids the overhead of thread management and the unnecessary data copying of other communication models. Scheduling the execution of Active Messages is typically done by disabling and enabling interrupts, or by polling the network. This primitive scheduling control, combined with the fact that handlers are not schedulable entities, puts severe restrictions on the code that can be run in a message handler. This paper describes a new software mechanism, Optimistic Active Messages (OAM), that eliminates these restrictions; OAMs allow arbitrary user code to execute in handlers, and also allow handlers to block. Despite this gain in expressiveness, OAMs perform as well as Active Messages. We used OAM as the base for an RPC system, Optimistic RPC (ORPC), for the Thinking Machines CM-5 multiprocessor; it consists of an optimized thread package and a stub compiler that hides communication details from the programmer. ORPC is 1.5 to 5 times faster than traditional RPC (TRPC) for small messages and performs as well as Active Messages (AM). Applications that primarily communicate using large data transfers or are fairly coarse-grained perform equally well, independent of whether AMs, ORPCs, or TRPCs are used. For applications that send many short messages, however, the ORPC and AM implementations are up to three times faster than the TRPC implementations. Using ORPC, programmers obtain the benefits of well-proven programming abstractions such as threads, mutexes, and condition variables, do not have to be concerned with communication details, and yet obtain nearly the performance of hand-coded Active Message programs.
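One way to picture how a handler can be allowed to block without paying for a thread on every message is an optimistic fast path with a threaded fallback. The sketch below is an illustration under assumptions, not the paper's implementation: `deliver`, `WouldBlock`, and `Counter` are hypothetical names, Python threads stand in for the CM-5 thread package, and the only blocking condition modeled is a contended lock.

```python
import threading


class WouldBlock(Exception):
    """Raised when the optimistic (inline) attempt would have to wait."""


def deliver(handler, *args):
    """Run the handler inline; if it would block, rerun it in a new thread.

    Returns the fallback thread, or None when the inline run succeeded.
    """
    try:
        handler(*args, optimistic=True)
        return None
    except WouldBlock:
        thread = threading.Thread(
            target=handler, args=args, kwargs={"optimistic": False}
        )
        thread.start()
        return thread


class Counter:
    """Shared state whose handler needs a lock (hypothetical example)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.value = 0

    def increment(self, amount, optimistic):
        # Optimistic mode never waits: it gives up before touching any state.
        if not self.lock.acquire(blocking=not optimistic):
            raise WouldBlock()
        try:
            self.value += amount
        finally:
            self.lock.release()


counter = Counter()
inline = deliver(counter.increment, 1)     # lock is free, so this runs inline
counter.lock.acquire()                     # simulate contention
fallback = deliver(counter.increment, 2)   # would block, so a thread takes over
counter.lock.release()
fallback.join()
```

The important property in this sketch is that the optimistic attempt abandons its work before modifying any shared state, so rerunning the handler in a thread cannot apply the update twice.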
ACM Special Interest Group on Data Communication | 1996
Deborah A. Wallach; Dawson R. Engler; M. Frans Kaashoek
Application-specific safe message handlers (ASHs) are designed to provide applications with hardware-level network performance. ASHs are user-written code fragments that safely and efficiently execute in the kernel in response to message arrival. ASHs can direct message transfers (thereby eliminating copies) and send messages (thereby reducing send-response latency). In addition, the ASH system provides support for dynamic integrated layer processing (thereby eliminating duplicate message traversals) and dynamic protocol composition (thereby supporting modularity). ASHs provide this high degree of flexibility while still providing network performance as good as, or (if they exploit application-specific knowledge) even better than, hard-wired in-kernel implementations. A combination of user-level microbenchmarks and end-to-end system measurements using TCP demonstrates the benefits of the ASH system.
IEEE/ACM Transactions on Networking | 1997
Deborah A. Wallach; Dawson R. Engler; M. Frans Kaashoek
Application-specific safe message handlers (ASHs) are designed to provide applications with hardware-level network performance. ASHs are user-written code fragments that safely and efficiently execute in the kernel in response to message arrival. ASHs can direct message transfers (thereby eliminating copies) and send messages (thereby reducing send-response latency). In addition, the ASH system provides support for dynamic integrated layer processing (thereby eliminating duplicate message traversals) and dynamic protocol composition (thereby supporting modularity). ASHs offer this high degree of flexibility while still providing network performance as good as, or (if they exploit application-specific knowledge) even better than, hard-wired in-kernel implementations. A combination of user-level microbenchmarks and end-to-end system measurements using TCP demonstrates the benefits of the ASH system.
International Symposium on Computer Architecture | 1998
William J. Dally; Andrew A. Chien; Stuart Fiske; Waldemar Horwat; Richard Lethin; Michael D. Noakes; Peter R. Nuth; Ellen Spertus; Deborah A. Wallach; D. Scott Wills; Andrew Chang; John S. Keen
1 Computer Systems Laboratory, Stanford University; 2 Department of Computer Science, University of Illinois, Urbana-Champaign; 3 Department of Electrical and Computer Engineering, Georgia Institute of Technology; 4 Netscape Communications; 5 Equator Technologies Consulting; 6 Hewlett Packard Laboratories; 7 Department of Computer Science, Mills College; 8 DEC, Western Research Laboratory; 9 Silicon Graphics Computer Systems
ACM SIGOPS European Workshop | 1994
M. Frans Kaashoek; William E. Weihl; Deborah A. Wallach; Wilson C. Hsieh; Kirk L. Johnson
Recent networks and network interfaces promise remarkable communication performance with very little overhead, but current software structures impose substantial overhead that prevents applications from achieving the benefits of these new architectures. We propose a new software structure that eliminates much of the overhead while preserving the ease of programming of current systems. Our architecture relies on the compiler to bridge the gap between high-level application programs and low-level communication primitives. The compiler incorporates application code into message handlers using a new runtime mechanism called optimistic active messages.
Computing Systems in Engineering | 1992
William J. Dally; Andrew A. Chien; R.E. Davison; J.A.S. Fiske; S. Furman; G. Fyler; D.B. Gaunce; Waldemar Horwat; S. Kaneshiro; John S. Keen; Richard Lethin; Michael D. Noakes; Peter R. Nuth; Ellen Spertus; Brian Totty; Deborah A. Wallach; D.S. Wills
Most modern computers, whether parallel or sequential, are coarse grained. They are composed of physically large nodes with tens of megabytes of memory. Only a small fraction of the silicon area in the machine is devoted to computation. By increasing the ratio of computation area to memory area, fine-grain computers offer the potential of improving cost/performance by several orders of magnitude. To efficiently operate at such a fine grain, however, a machine must provide mechanisms that permit rapid access to global data and fast interaction between nodes. The MIT J-Machine is a fine-grain concurrent computer that provides low-overhead mechanisms for parallel computing. Prototype J-Machines have been operational since July 1991. The J-Machine communication mechanism permits a node to send a message to any other node in the machine in μ s. On message arrival, a task is created and dispatched in μ s. A translation mechanism supports a global virtual address space. These mechanisms efficiently support most proposed models of concurrent computation and allow parallelism to be exploited at a grain size of 10 operations. The hardware is an ensemble of up to 65,536 nodes each containing a 36-bit processor, 4K 36-bit words of on-chip memory, 256K words of DRAM and a router. The nodes are connected by a high-speed three-dimensional mesh network.
Operating Systems Design and Implementation | 2006
Fay W. Chang; Jeffrey Dean; Sanjay Ghemawat; Wilson C. Hsieh; Deborah A. Wallach; Michael Burrows; Tushar Deepak Chandra; Andrew Fikes; Robert Gruber
