Publication


Featured research published by Boon Seong Ang.


European Conference on Parallel Processing | 1995

StarT-NG: Delivering Seamless Parallel Computing

Derek Chiou; Boon Seong Ang; Robert Greiner; Arvind; James C. Hoe; Michael J. Beckerle; James E. Hicks; G. Andrew Boughton

StarT-ng is a joint MIT-Motorola project to build a high-performance message passing machine from commercial systems. Each site of the machine consists of a PowerPC 620-based Motorola symmetric multiprocessor (SMP) running the AIX 4.1 operating system. Every processor is connected to a low-latency, high-bandwidth network that is directly accessible from user-level code. In addition to fast message passing capabilities, the machine has experimental support for cache-coherent shared memory across sites. When the machine requires memory to be kept globally coherent, one processor on each site is devoted to supporting shared memory. When globally coherent shared memory is not required, that processor can be used for normal computation tasks. StarT-ng will be delivered at about the time the base SMP is introduced into the marketplace. The ability to be both a collection of standard SMPs and an aggressive message passing machine with coherent shared memory makes StarT-ng a good building block for incrementally expandable parallel machines.


Journal of Parallel and Distributed Computing | 1993

Performance studies of Id on the Monsoon dataflow system

James Edward Hicks; Derek Chiou; Boon Seong Ang; Arvind

In this paper, we examine the performance of Id, an implicitly parallel language, on Monsoon, an experimental dataflow machine. One of the precepts of our work is that the Id run-time system and compiled Id programs should run on any number of Monsoon processors without change. Our experiments running Id programs on Monsoon show that speedups of more than 7 are easily achieved on 8 processors for most of the applications that we studied. We explain the sources of overhead that limit the speedup of each of our benchmark programs. We also compare the performance of Id on a single Monsoon processor with C/Fortran on a DECstation 5000 (MIPS R3000 processor), to establish a baseline for the efficiency of Id execution on Monsoon. We find that the execution of Id programs on one Monsoon processor takes up to three times as many cycles as the corresponding C or Fortran programs executing on a MIPS R3000 processor. We identify the sources of inefficiency on Monsoon and suggest improvements, where possible. In many cases, however, improving single processor performance will reduce parallel processor performance.


Conference on High Performance Computing (Supercomputing) | 1998

StarT-Voyager: A Flexible Platform for Exploring Scalable SMP Issues

Boon Seong Ang; Derek Chiou; Daniel L. Rosenband; Mike Ehrlich; Larry Rudolph; Arvind

This paper describes StarT-Voyager, a machine designed as an experimental platform for research in cluster system communication. The heart of StarT-Voyager is a network interface unit (NIU) that connects the memory bus of a PowerPC-based SMP to the MIT Arctic network. The NIU is highly flexible, with its set of functions easily modified by firmware or by programmable hardware, making it possible to compare different communication interfaces and implementation strategies on a common platform. Its flexibility comes from a fast embedded processor and large, fast FPGAs that surround a high-speed protected communication core. Its efficiency comes from a set of primitive operations that are implemented in hardware and are designed to reduce the firmware overhead. Our initial configuration of StarT-Voyager implements four forms of message passing along with S-COMA and NUMA shared memory support. With experimentation on the machine, it can be reconfigured to introduce new mechanisms improving usability and performance.


IEEE International Conference on High Performance Computing Data and Analytics | 1998

Message passing support on StarT-Voyager

Boon Seong Ang; Derek Chiou; Larry Rudolph; Arvind

No single message passing mechanism can efficiently support all types of communication that commonly occur in most parallel or distributed programs. MIT's StarT-Voyager, a hybrid message passing/shared memory parallel machine, provides four message passing mechanisms to achieve high performance over a wide spectrum of communication types and sizes. Hardware and address translation enforced protection allows direct user-level access to message passing facilities in a multiuser environment. StarT-Voyager's protection scheme improves upon past designs by not requiring strictly synchronized gang-scheduling, and by supporting non-monolithic protection domains. To minimize the development effort and cost, the machine is designed to use unmodified commercial PowerPC 604-based SMP systems as the building block. A Network End-point Subsystem (NES) card, which plugs into one of each SMP's processor card slots, provides the interface to Arctic, a low-latency, high-bandwidth network developed at MIT. This paper describes StarT-Voyager's message passing mechanisms and their predicted performance.


International Conference on Parallel Architectures and Compilation Techniques | 1998

The START-VOYAGER parallel system

Boon Seong Ang; Derek Chiou; Larry Rudolph; Arvind

This paper presents the communication architecture of the START-VOYAGER system, a parallel machine composed of a cluster of unmodified IBM 604e-based SMPs connected via a high speed interconnection network. A custom network interface unit (NIU) plugs into a processor card slot of each SMP, providing a high-performance message passing substrate that supports both fast user-level message passing and cache-line coherent shared memory. The substrate consists of four hardware-implemented message passing mechanisms to achieve high performance over a wide spectrum of communication patterns. START-VOYAGER also introduces a novel protection scheme which improves upon past designs by not requiring strictly synchronized gang-scheduling and by allowing system code and multiple user applications to share the network simultaneously without compromising protection or performance. Performance predictions based on synthesized Verilog code show that START-VOYAGER's novel message passing mechanisms offer a definitive advantage in a multi-threaded environment without compromising the performance in a single-threaded environment. Preliminary shared memory simulations are also promising.


Journal of Parallel and Distributed Computing | 1993

Performance visualization on Monsoon

Venkat Natarajan; Derek Chiou; Boon Seong Ang

The performance of an application program running on a parallel machine is affected by several factors such as the algorithm, the programming language, the compiler, and the operating system. Performance evaluation of parallel machines requires quick and easy-to-use analysis of large amounts of data. This paper describes a performance evaluation tool built for Monsoon, a multithreaded multiprocessor machine built by Motorola in collaboration with MIT. The tool offers integrated data collection, analysis, and visualization and is designed to be simple but powerful. Software layers built on top of simple hardware monitors offer a flexible, yet nonintrusive performance evaluation tool. Examples of successful use of the tool by both systems and applications programmers are included.


International Parallel and Distributed Processing Symposium | 2000

Micro-architectures of high performance, multi-user system area network interface cards

Boon Seong Ang; Derek Chiou; Larry Rudolph; Arvind

This paper examines two Network Interface Card micro-architectures that support low latency, high bandwidth user-level message passing in multi-user environments. The two are at different ends of a design spectrum: the Resident queues design relies completely on hardware, while the Non-resident queues design is heavily firmware driven. Through actual implementation of these designs and simulation-based micro-benchmark studies, we identify issues critical to the performance and functionality of the firmware-based approach. The firmware-based approach offers much flexibility at a moderate performance penalty, while the Resident design has superior performance for the functions it implements. This leads us to conclude that a hybrid design combining complete hardware support for common operations and a firmware implementation of less common functions achieves both high performance and flexibility.


Archive | 1999

Method and apparatus for curious and column caching

Derek Chiou; Boon Seong Ang


Design Automation Conference | 2000

Dynamic Cache Partitioning via Columnization

Derek Chiou; Larry Rudolph; Srinivas Devadas; Boon Seong Ang


Archive | 2001

Method and apparatus for direct conveyance of physical addresses from user level code to peripheral devices in virtual memory systems

Boon Seong Ang

Collaboration

Top Co-Authors

Derek Chiou

University of Texas at Austin


Arvind

Massachusetts Institute of Technology


Larry Rudolph

Massachusetts Institute of Technology


Daniel L. Rosenband

Massachusetts Institute of Technology


G. Andrew Boughton

Massachusetts Institute of Technology


James C. Hoe

Carnegie Mellon University


James Edward Hicks

Massachusetts Institute of Technology