Amos Waterland | Researchain

Publication

Featured research published by Amos Waterland.

European Conference on Computer Systems | 2006

K42: building a complete operating system

Orran Krieger; Marc A. Auslander; Bryan S. Rosenburg; Robert W. Wisniewski; Jimi Xenidis; Dilma Da Silva; Michal Ostrowski; Jonathan Appavoo; Maria A. Butrico; Mark F. Mergen; Amos Waterland; Volkmar Uhlig

K42 is one of the few recent research projects examining operating system design structure issues in the context of new whole-system design. K42 is open source and was designed from the ground up to perform well and to be scalable, customizable, and maintainable. The project was begun in 1996 by a team at IBM Research. Over the last nine years, between six and twenty researchers and developers across IBM, collaborating universities, and national laboratories have worked on K42. K42 supports the Linux API and ABI, and is able to run unmodified Linux applications and libraries. The approach we took in K42 to achieve scalability and customizability has been successful. The project has produced positive research results, has resulted in contributions to Linux and the Xen hypervisor on Power, and continues to be a rich platform for exploring system software technology. Today, K42 is one of the key exploratory platforms in the DOE's FAST-OS program, is being used as a prototyping vehicle in IBM's PERCS project, and is being used by universities and national labs for exploratory research. In this paper, we provide insight into building an entire system by discussing the motivation and history of K42, describing its fundamental technologies, and presenting an overview of the research directions we have been pursuing.

ACM Transactions on Computer Systems | 2007

Experience distributing objects in an SMMP OS

Jonathan Appavoo; Dilma Da Silva; Orran Krieger; Marc A. Auslander; Michal Ostrowski; Bryan S. Rosenburg; Amos Waterland; Robert W. Wisniewski; Jimi Xenidis; Michael Stumm; Livio Soares

Designing and implementing system software so that it scales well on shared-memory multiprocessors (SMMPs) has proven to be surprisingly challenging. To improve scalability, most designers to date have focused on concurrency by iteratively eliminating the need for locks and reducing lock contention. However, our experience indicates that locality is just as, if not more, important and that focusing on locality ultimately leads to a more scalable system. In this paper, we describe a methodology and a framework for constructing system software structured for locality, exploiting techniques similar to those used in distributed systems. Specifically, we found two techniques to be effective in improving scalability of SMMP operating systems: (i) an object-oriented structure that minimizes sharing by providing a natural mapping from independent requests to independent code paths and data structures, and (ii) the selective partitioning, distribution, and replication of object implementations in order to improve locality. We describe concrete examples of distributed objects and our experience implementing them. We demonstrate that the distributed implementations improve the scalability of operating-system-intensive parallel workloads.
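The idea of partitioning an object so that independent requests touch independent data can be illustrated with a small sketch. This is not K42's implementation; the class name `DistributedCounter` and the `cpu` index are hypothetical, chosen only to show one replica per processor with a cheap local update path and a slower aggregate read.

```python
class DistributedCounter:
    """Counter partitioned into one replica per (hypothetical) processor."""

    def __init__(self, num_cpus):
        # One independent slot per processor, so updates never share data.
        self._local = [0] * num_cpus

    def increment(self, cpu, amount=1):
        # Fast path: touches only the caller's own replica.
        self._local[cpu] += amount

    def value(self):
        # Slow path: gathers every replica, much like a query that
        # fans out to all nodes in a distributed system.
        return sum(self._local)


counter = DistributedCounter(num_cpus=4)
for cpu in range(4):
    for _ in range(1000):
        counter.increment(cpu)
total = counter.value()
```

The trade-off in this sketch is typical of such designs: frequent operations stay local, while the rarer global read pays the cost of visiting every replica.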

Operating Systems Review | 2008

Project Kittyhawk: building a global-scale computer: Blue Gene/P as a generic computing platform

Jonathan Appavoo; Volkmar Uhlig; Amos Waterland

This paper describes Project Kittyhawk, an undertaking at IBM Research to explore the construction of a next-generation platform capable of hosting many simultaneous web-scale workloads. We hypothesize that for a large class of web-scale workloads the Blue Gene/P platform is an order of magnitude more efficient to purchase and operate than the commodity clusters in use today. Driven by scientific computing demands, the Blue Gene designers pursued an aggressive system-on-a-chip methodology that led to a scalable platform composed of air-cooled racks. Each rack contains more than a thousand independent computers with high-speed interconnects inside and between racks. We postulate that the same demands of efficiency and density apply to web-scale platforms. This project aims to develop the system software to enable Blue Gene/P as a generic platform capable of being used by heterogeneous workloads. We describe our firmware and operating system work to provide Blue Gene/P with generic system software, one of the results of which is the ability to run thousands of heterogeneous Linux instances connected by TCP/IP networks over the high-speed internal interconnects.

High Performance Distributed Computing | 2010

Providing a cloud network infrastructure on a supercomputer

Jonathan Appavoo; Amos Waterland; Dilma Da Silva; Volkmar Uhlig; Bryan S. Rosenburg; Eric Van Hensbergen; Jan Stoess; Robert W. Wisniewski; Udo Steinberg

Supercomputers and clouds both strive to make a large number of computing cores available for computation. More recently, similar objectives such as low power, manageability at scale, and low cost of ownership are driving their hardware and software toward convergence. Challenges remain, however; one is that current cloud infrastructure does not yield the performance sought by many scientific applications. One source of the performance loss is virtualization, and virtualization of the network in particular. This paper provides an introduction to and analysis of a hybrid supercomputer software infrastructure, which allows direct hardware access to the communication hardware for the necessary components while providing the standard elastic cloud infrastructure for other components.

Journal of Chemical Theory and Computation | 2016

Long-time dynamics through parallel trajectory splicing

Danny Perez; Ekin D. Cubuk; Amos Waterland; Efthimios Kaxiras; Arthur F. Voter

Simulating the atomistic evolution of materials over long time scales is a longstanding challenge, especially for complex systems where the distribution of barrier heights is very heterogeneous. Such systems are difficult to investigate using conventional long-time scale techniques, and the fact that they tend to remain trapped in small regions of configuration space for extended periods of time strongly limits the physical insights gained from short simulations. We introduce a novel simulation technique, Parallel Trajectory Splicing (ParSplice), that aims at addressing this problem through the timewise parallelization of long trajectories. The computational efficiency of ParSplice stems from a speculation strategy whereby predictions of the future evolution of the system are leveraged to increase the amount of work that can be concurrently performed at any one time, hence improving the scalability of the method. ParSplice is also able to accurately account for, and potentially reuse, a substantial fraction of the computational work invested in the simulation. We validate the method on a simple Ag surface system and demonstrate substantial increases in efficiency compared to previous methods. We then demonstrate the power of ParSplice through the study of topology changes in Ag42Cu13 core-shell nanoparticles.
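A toy sketch can make the splicing idea concrete. It assumes, beyond what the abstract states, that work is stored as short segments labeled by the state they start and end in, and that a long trajectory is assembled by appending a segment whose start matches the trajectory's current end. The random walk over integer states stands in for molecular dynamics, and all names (`generate_segment`, `splice`, `database`) are hypothetical.

```python
import random
from collections import defaultdict, deque


def generate_segment(start, rng, length=5):
    """Toy stand-in for a short simulation run.

    Performs a random walk over integer 'states' and returns
    (start_state, end_state, duration).
    """
    state = start
    for _ in range(length):
        # Mostly stays put, loosely mimicking a system trapped in a state.
        state += rng.choice((-1, 0, 0, 1))
    return (start, state, length)


def splice(initial, database):
    """Extend a trajectory by consuming stored segments end to end."""
    state, elapsed, path = initial, 0, [initial]
    while database[state]:
        # Each stored segment is consumed at most once.
        _, end, duration = database[state].popleft()
        state = end
        elapsed += duration
        path.append(state)
    return state, elapsed, path


rng = random.Random(0)
database = defaultdict(deque)
# Speculation stand-in: guess which states will be visited and generate
# segments for them ahead of time (sequentially here; a parallel
# implementation would hand these to many workers).
for guess in (0, 0, 0, 1, -1, 1, -1):
    seg = generate_segment(guess, rng)
    database[seg[0]].append(seg)

final_state, elapsed, path = splice(0, database)
```

Segments generated for states the trajectory has not yet reached stay in `database`, which is how speculative work can be reused later in this sketch.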

International Parallel and Distributed Processing Symposium | 2007

Base Operating System Provisioning and Bringup for a Commercial Supercomputer

David Daly; Jong Hyuk Choi; José E. Moreira; Amos Waterland

Commercial scale-out is a new research project at IBM Research. Its main goal is to investigate and develop technologies for the use of large-scale parallelism in commercial applications, eventually leading to a commercial supercomputer. The project leverages and explores the features of IBM's BladeCenter family of products. A significant challenge in using a large cluster of servers is the installation and provisioning of the base operating system in those servers. Compounding this problem is the issue of maintenance of the software image in each server after its provisioning. This paper describes the system we developed to manage the installation, provisioning, and maintenance process for a cluster of blades, providing a base level of functionality to be used by higher-level management tools. The system leverages the management facilitation features of BladeCenter, and exploits the network and storage architecture of the commercial scale-out prototype cluster. It uses a single shared root filesystem image to reduce management complexity, and completely automates the process of bringing a new blade into the cluster upon its insertion into a BladeCenter chassis.

Operating Systems Review | 2006

K42: an infrastructure for operating system research

Dilma Da Silva; Orran Krieger; Robert W. Wisniewski; Amos Waterland; David K. Tam; Andrew Baumann

K42 is an open-source scalable research operating system well suited to support systems research. The primary goals of K42's design that support such research include flexibility to allow a multitude of policies and implementations to be supported simultaneously, extensibility to allow new policies and implementations to be readily added, and scalability to enable good performance for both small and large applications on both small and large multiprocessor systems. The goals are accomplished via key features including an object-oriented structure that allows specialized resource management implementations and policies on a per-resource, per-application basis, implementation in user-level servers of much of the system functionality, and a sophisticated set of underlying services that provides a programming model for developing system software in a scalable and modular fashion. These characteristics make K42 an attractive framework for prototyping new operating system ideas. In addition, K42 has a sophisticated performance monitoring infrastructure allowing a thorough understanding of new ideas to be gained. The above framework combined with a consistent emphasis on scalability makes K42 well suited for high-end computing initiatives. In this paper, we describe the structure of K42 which contributes to the advantageous prototyping environment, and demonstrate how to utilize it by describing ongoing research efforts.

IBM Journal of Research and Development | 2009

Kittyhawk: enabling cooperation and competition in a global, shared computational system

Jonathan Appavoo; Volkmar Uhlig; Amos Waterland; Bryan S. Rosenburg; Dilma Da Silva; José E. Moreira

Kittyhawk represents our vision for a Web-scale computational resource that can accommodate a significant fraction of the world's computation needs and enable various parties to compete and cooperate in the provisioning of services on a consolidated platform. In this paper, we explain both the vision and the system architecture that supports it. We demonstrate these ideas by way of a prototype implementation that uses the IBM Blue Gene/P platform. In the Kittyhawk prototype, we define a set of basic services that enable the allocation and interconnection of computing resources. By using examples, we show how higher layers of services can be built by using our basic services and standard open-source software.

International Conference on Parallel Architectures and Compilation Techniques | 2015

Towards General-Purpose Neural Network Computing

Schuyler Eldridge; Amos Waterland; Margo I. Seltzer; Jonathan Appavoo; Ajay Joshi

Machine learning is becoming pervasive; decades of research in neural network computation are now being leveraged to learn patterns in data and perform computations that are difficult to express using standard programming approaches. Recent work has demonstrated that custom hardware accelerators for neural network processing can outperform software implementations in both performance and power consumption. However, there is neither an agreed-upon interface to neural network accelerators nor a consensus on neural network hardware implementations. We present a generic set of software/hardware extensions, X-FILES, that allow for the general-purpose integration of feedforward and feedback neural network computation in applications. The interface is independent of the network type, configuration, and implementation. Using these proposed extensions, we demonstrate and evaluate an example dynamically allocated, multi-context neural network accelerator architecture, DANA. We show that the combination of X-FILES and our hardware prototype, DANA, enables generic support and increased throughput for neural-network-based computation in multi-threaded scenarios.

Architectural Support for Programming Languages and Operating Systems | 2014

ASC: automatically scalable computation

Amos Waterland; Elaine Angelino; Ryan P. Adams; Jonathan Appavoo; Margo I. Seltzer

We present an architecture designed to transparently and automatically scale the performance of sequential programs as a function of the hardware resources available. The architecture is predicated on a model of computation that views program execution as a walk through the enormous state space composed of the memory and registers of a single-threaded processor. Each instruction execution in this model moves the system from its current point in state space to a deterministic subsequent point. We can parallelize such execution by predictively partitioning the complete path and speculatively executing each partition in parallel. Accurately partitioning the path is a challenging prediction problem. We have implemented our system using a functional simulator that emulates the x86 instruction set, including a collection of state predictors and a mechanism for speculatively executing threads that explore potential states along the execution path. While the overhead of our simulation makes it impractical to measure speedup relative to native x86 execution, experiments on three benchmarks show scalability of up to a factor of 256 on a 1024-core machine when executing unmodified sequential programs.
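The state-space view of execution can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ASC system: the two-field state `(pc, acc)`, the `step` function, and the cache of speculative results are all hypothetical, and the "predictor" below cheats by computing two correct future states, plus one deliberately wrong guess, purely to show the fast-forward mechanism.

```python
def step(state):
    """One deterministic 'instruction': maps a state to the next state."""
    pc, acc = state
    return (pc + 1, (acc * 3 + pc) % 1_000_003)


def run(state, n):
    """Execute n instructions sequentially starting from state."""
    for _ in range(n):
        state = step(state)
    return state


def speculate(predicted_states, chunk):
    """Run `chunk` instructions from each predicted state.

    A parallel system would run these on other cores; here they run
    one after another for simplicity.
    """
    return {s: run(s, chunk) for s in predicted_states}


def execute_with_cache(state, total, cache, chunk):
    """Advance `total` instructions, fast-forwarding on cache hits."""
    done = 0
    hits = 0
    while done < total:
        if state in cache and done + chunk <= total:
            # The prediction matched the real state exactly, so the
            # speculative result is valid: skip `chunk` instructions.
            state = cache[state]
            done += chunk
            hits += 1
        else:
            state = step(state)
            done += 1
    return state, hits


start = (0, 1)
chunk = 100
# Stand-in for a learned predictor: two correct guesses and one wrong
# guess (pc is never negative, so the last state is never reached).
predictions = [run(start, 100), run(start, 200), (-1, 0)]
cache = speculate(predictions, chunk)
final, hits = execute_with_cache(start, 400, cache, chunk)
```

Because each step is deterministic, a speculative run is usable only when its starting state equals the state the main execution actually reaches; the wrong guess costs wasted work but never affects correctness.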
