Publications
Featured research published by Robert A. Iannucci.
international symposium on computer architecture | 1988
Robert A. Iannucci
Dataflow architectures offer the ability to trade program-level parallelism in order to overcome machine-level latency. Dataflow further offers a uniform synchronization paradigm, representing one end of a spectrum wherein the unit of scheduling is a single instruction. At the opposite extreme are the von Neumann architectures, which schedule on a task, or process, basis. This paper examines the spectrum by proposing a new architecture which is a hybrid of dataflow and von Neumann organizations. The analysis attempts to discover those features of the dataflow architecture, lacking in a von Neumann machine, which are essential for tolerating latency and synchronization costs. These features are captured in the concept of a parallel machine language which can be grafted on top of an otherwise traditional von Neumann base. In such an architecture, the units of scheduling, called scheduling quanta, are bound at compile time rather than at instruction set design time. The parallel machine language supports this notion via a large synchronization name space. A prototypical architecture is described, and results of simulation studies are presented. A comparison is made between the MIT Tagged-Token Dataflow machine and the subject machine; the comparison yields a model for understanding the cost of synchronization in a parallel environment.
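To make the scheduling-quantum idea concrete, here is a minimal sketch, assuming a toy runtime in which each quantum carries a synchronization counter and becomes runnable only when all of its operands have arrived. The names (Quantum, deliver, ready) and the slot-based delivery scheme are illustrative, not the paper's design.

    # Minimal sketch (illustrative, not the paper's design): a scheduling
    # quantum is a compiled instruction sequence guarded by a synchronization
    # counter; it becomes runnable only when all of its operands have arrived.
    from collections import deque

    class Quantum:
        def __init__(self, name, n_inputs, body):
            self.name = name
            self.count = n_inputs              # operands still outstanding
            self.inputs = [None] * n_inputs    # operand slots
            self.body = body                   # the quantum's instruction sequence

    ready = deque()                            # enabled quanta, each run to completion

    def deliver(q, slot, value):
        """A result token arrives: fill a slot, enable the quantum on zero."""
        q.inputs[slot] = value
        q.count -= 1
        if q.count == 0:
            ready.append(q)

    def run():
        while ready:
            q = ready.popleft()
            q.body(*q.inputs)                  # von Neumann-style sequential body

    add_q = Quantum("add", 2, lambda a, b: print("add ->", a + b))
    deliver(add_q, 0, 3)                       # one operand: still blocked
    deliver(add_q, 1, 4)                       # both present: enqueued and run
    run()                                      # prints: add -> 7

Even at this scale the hybrid's point is visible: synchronization is paid once per quantum rather than once per instruction, while within a quantum execution proceeds as on a conventional pipeline.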
parallel computing | 1987
Arvind; Robert A. Iannucci
A general-purpose multiprocessor should be scalable, i.e., show higher performance when more hardware resources are added to the machine. Architects of such multiprocessors must address the loss in processor efficiency due to two fundamental issues: long memory latencies and waits due to synchronization events. It is argued that a well-designed processor can overcome these losses provided there is sufficient parallelism in the program being executed. The detrimental effect of long latency can be reduced by instruction pipelining; however, the restriction to a single thread of computation in von Neumann processors severely limits their ability to have more than a few instructions in the pipeline. Furthermore, techniques to reduce the memory latency tend to increase the cost of task switching. The cost of synchronization events in von Neumann machines makes decomposing a program into very small tasks counter-productive. Dataflow machines, on the other hand, treat each instruction as a task, and by paying a small synchronization cost for each instruction executed, offer the ultimate flexibility in scheduling instructions to reduce processor idle time.
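The latency argument can be made quantitative with a back-of-the-envelope model (my illustration, not taken from the paper): if a thread computes for R cycles between memory requests of latency L, one thread keeps the pipeline busy only R/(R+L) of the time, and roughly (R+L)/R interleaved threads are needed to hide the latency.

    # Rough utilization model (illustrative): a thread computes for R cycles,
    # then waits L cycles on memory; T interleaved threads hide the wait
    # once T*R >= R + L.
    def utilization(T, R, L):
        return min(1.0, T * R / (R + L))

    R, L = 10, 90                              # 10 cycles of work per 90-cycle miss
    for T in (1, 2, 5, 10, 20):
        print(f"T={T:2d} threads -> {utilization(T, R, L):.0%} busy")
    # One thread leaves the pipeline 90% idle; about ten threads saturate it,
    # which is why cheap task switching matters so much in the argument above.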
international symposium on computer architecture | 1983
Arvind; Robert A. Iannucci
In recent years, there have been many attempts to construct multiple-processor computer systems. The majority of these systems are based on von Neumann style uniprocessors. To exploit the parallelism in algorithms, any high-performance multiprocessor system must, however, address two very basic issues: the ability to tolerate long latencies for memory requests and the ability to achieve unconstrained, yet synchronized, access to shared data. In this paper, we define these two problems and examine the ways in which they are addressed by some of the current and past von Neumann multiprocessor projects. We then proceed to hypothesize that the problems cannot be solved in a von Neumann context. We offer the data flow model as one possible alternative, and we describe our research in this area.
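As a concrete contrast with program-counter sequencing, the following toy interpreter (node and edge names invented for illustration) applies the dataflow firing rule: an instruction executes as soon as tokens are present on all of its input arcs, so a long latency on one arc never blocks unrelated instructions.

    # Toy dataflow interpreter (illustrative; graph and edge names invented):
    # an instruction fires when tokens are present on all of its inputs,
    # so scheduling is driven by data availability, not a program counter.
    import operator

    graph = {                          # node -> (operation, input edges, output edges)
        "t1": (operator.mul, ["a", "b"], ["p"]),
        "t2": (operator.add, ["p", "c"], ["out"]),
    }
    tokens = {"a": 2, "b": 3, "c": 4}  # initial tokens on the input edges

    fired = True
    while fired:
        fired = False
        for node, (op, ins, outs) in graph.items():
            if all(e in tokens for e in ins) and not any(e in tokens for e in outs):
                result = op(*(tokens.pop(e) for e in ins))
                for e in outs:
                    tokens[e] = result
                fired = True

    print(tokens)                      # {'out': 10}: (2*3)+4, with no explicit ordering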
Archive | 1990
Robert A. Iannucci
This text considers the space of architectures which fit the description of scalable, general purpose parallel computers. The term PARALLEL COMPUTER denotes a collection of computing resources, specifically, some number of identical, asynchronously operating processors, some number of identical memory units, and some means for intercommunication, assembled for the purpose of cooperating on the solution of problems. Such problems are decomposed into communicating parts which are mapped onto the processors. GENERAL PURPOSE means simply that such computers can exploit parallelism, when present, in any program, without appealing to some specific attribute unique to some specific problem domain. SCALABILITY implies that, given sufficient program parallelism, adding hardware resources will result in higher performance without requiring program alteration. The scaling range is assumed to be from a single processor up to a thousand processors. Parallelism significantly beyond this limit demands yet another change in viewpoint for both machines and languages.
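The proviso "given sufficient program parallelism" can be illustrated with a standard work-span model (my gloss, not the text's formalism): with total work T1 and critical path Tinf, a greedy schedule on P processors takes about T1/P + Tinf, so speedup grows nearly linearly until P approaches the program's parallelism T1/Tinf.

    # Hedged illustration of the scalability claim via a work-span model:
    # T1 = total work, Tinf = critical path; a greedy-scheduling bound.
    def time_on(P, T1, Tinf):
        return T1 / P + Tinf

    T1, Tinf = 1_000_000, 1_000        # parallelism T1/Tinf = 1000
    for P in (1, 10, 100, 1000, 10000):
        print(f"P={P:5d}  speedup={T1 / time_on(P, T1, Tinf):7.1f}")
    # Speedup is near-linear up to ~1000 processors, then flattens:
    # "sufficient program parallelism" is the binding assumption.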
Archive | 1990
Robert A. Iannucci
In the previous Chapter, it was concluded that satisfactory solutions to the problems raised for von Neumann architectures can only be had by altering the architecture of the processor itself. It was further observed that dataflow architectures do address these problems satisfactorily. Based on observations of the near-miss behavior of certain von Neumann parallel processors (e.g., the Denelcor HEP [40, 52]), it is reasonable to speculate that dataflow and von Neumann machines actually represent two points on a continuum of architectures. The goal of the present study is to develop a new machine model which differs minimally from the von Neumann model, yet embodies the same latency and synchronization characteristics which make dataflow architectures amenable to parallel processing.
Archive | 1990
Robert A. Iannucci
This chapter considers the task of transforming dataflow program graphs (DFPGs) into partitioned graphs, and thence into PML, the parallel machine language. Section 4.1 extends the work of Section 1.1 by completing the description of DFPGs. Section 4.2 discusses the issues involved in generating partitioned code from DFPGs. Section 4.3 presents the design of a suitable code generator.
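By way of illustration only, here is one plausible partitioning rule in miniature (a reconstruction, not the book's algorithm): traverse the graph in dependency order and end the current scheduling quantum after any long-latency operation, since consumers of its result must wait on a synchronization name rather than on the pipeline.

    # Sketch of one plausible partitioning rule (a reconstruction, not the
    # book's algorithm): cut the instruction stream after long-latency
    # operations, so each quantum contains only pipeline-safe dependencies.
    LONG_LATENCY = {"fetch", "i-structure-read"}   # illustrative operation kinds

    def partition(topo_order, kind):
        """topo_order: nodes in dependency order; kind: node -> operation kind."""
        quanta, current = [], []
        for node in topo_order:
            current.append(node)
            if kind[node] in LONG_LATENCY:         # cut: successors must synchronize
                quanta.append(current)
                current = []
        if current:
            quanta.append(current)
        return quanta

    kind = {"n1": "alu", "n2": "fetch", "n3": "alu", "n4": "alu"}
    print(partition(["n1", "n2", "n3", "n4"], kind))
    # [['n1', 'n2'], ['n3', 'n4']]: n3 and n4 start a fresh quantum that is
    # enabled only when the fetch issued by n2 has delivered its result.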
Archive | 1990
Robert A. Iannucci
Current-day multiprocessors reflect the general belief that processor architecture is of little importance in designing parallel machines. In this Chapter, the fallacy of this assumption will be demonstrated on the basis of the two fundamental issues of latency and synchronization.
Archive | 1993
Steven Lee Gregor; Robert A. Iannucci
Archive | 1992
Steven Lee Gregor; Robert A. Iannucci
Archive | 1985
Robert A. Iannucci