
Publications


Featured research published by Andrew R. Pleszkun.


International Symposium on Computer Architecture | 1985

Implementation of precise interrupts in pipelined processors

James E. Smith; Andrew R. Pleszkun

An interrupt is precise if the saved process state corresponds with the sequential model of program execution where one instruction completes before the next begins. In a pipelined processor, precise interrupts are difficult to achieve because an instruction may be initiated before its predecessors have been completed. This paper describes and evaluates solutions to the precise interrupt problem in pipelined processors. The precise interrupt problem is first described. Then five solutions are discussed in detail. The first forces instructions to complete and modify the process state in architectural order. The other four allow instructions to complete in any order, but additional hardware is used so that a precise state can be restored when an interrupt occurs. All the methods are discussed in the context of a parallel pipeline structure. Simulation results based on the CRAY-1S scalar architecture are used to show that, at best, the first solution results in a performance degradation of about 16%. The remaining four solutions offer similar performance, and three of them result in as little as a 3% performance loss. Several extensions, including virtual memory and linear pipeline structures, are briefly discussed.
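
The in-order-commit idea behind the latter four solutions can be sketched in miniature. This is an illustrative model, not the CRAY-1S hardware evaluated in the paper: each entry is assumed to hold one destination register and one result value, and results are retired to architectural state strictly in program order, so the state saved on an interrupt is always precise.

```python
# Sketch of a reorder buffer: instructions finish out of order, but results
# commit to architectural state only from the head, in program order.

class ReorderBuffer:
    def __init__(self):
        self.entries = []     # one entry per dispatched instruction, program order
        self.head = 0         # index of the next entry to commit
        self.arch_regs = {}   # architectural (precise) register state

    def dispatch(self, dest_reg):
        """Allocate an entry in program order; returns a tag for the entry."""
        self.entries.append({"dest": dest_reg, "value": None, "done": False})
        return len(self.entries) - 1

    def complete(self, tag, value):
        """A functional unit finishes -- possibly out of program order."""
        self.entries[tag].update(value=value, done=True)

    def commit(self):
        """Retire finished instructions from the head only (program order)."""
        while self.head < len(self.entries) and self.entries[self.head]["done"]:
            e = self.entries[self.head]
            self.arch_regs[e["dest"]] = e["value"]
            self.head += 1

    def precise_state(self):
        """On an interrupt: state reflects exactly the committed prefix."""
        return dict(self.arch_regs)
```

An instruction that finishes early simply waits in the buffer until everything before it has committed, which is what makes the saved state correspond to the sequential model.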


International Symposium on Computer Architecture | 1988

The performance potential of multiple functional unit processors

Andrew R. Pleszkun; Gurindar S. Sohi

In this paper, we look at the interaction of pipelining and multiple functional units in single processor machines. When implementing a high performance machine, a number of hardware techniques may be used to improve the performance of the final system. Our goal is to gain an understanding of how each of these techniques contributes to performance improvement. As a basis for our studies we use a CRAY-like processor model and the issue rate (instructions per clock cycle) as the performance measure. We then systematically augment this base non-pipelined machine with more and more hardware features and evaluate the performance impact of each feature. We find, for example, that in non-vector machines, pipelining multiple functional units does not provide significant performance improvements. Dataflow limits are then derived for our benchmark programs to determine the performance potential of each benchmark. In addition, other limits are computed which apply more realistic constraints on a computation. Based on these more realistic limits, we determine that it is worthwhile to investigate the performance improvements that can be achieved from issuing multiple instructions each clock cycle. Several hardware approaches are evaluated for issuing multiple instructions each clock cycle.
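
The notion of a dataflow limit can be illustrated with a toy calculation. Under simplifying assumptions not claimed by the paper (unlimited functional units, single-cycle operations), a program can finish no faster than its longest dependency chain, which in turn bounds the achievable issue rate:

```python
# Toy dataflow limit: with unbounded parallelism, execution time is the
# length of the longest dependency chain, so N / chain bounds instructions
# per cycle.

def dataflow_ipc_limit(deps):
    """deps[i] = list of earlier instructions that instruction i depends on.
    Returns (critical_path_length, ipc_upper_bound)."""
    depth = {}
    for i in range(len(deps)):
        # an instruction's depth is one more than its deepest producer
        depth[i] = 1 + max((depth[j] for j in deps[i]), default=0)
    chain = max(depth.values())
    return chain, len(deps) / chain
```

For example, six instructions whose longest chain is three deep can never exceed two instructions per cycle, no matter how much hardware is added.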


International Symposium on Computer Architecture | 1987

WISQ: a restartable architecture using queues

Andrew R. Pleszkun; James R. Goodman; Wei-Chung Hsu; R. T. Joersz; George E. Bier; Philip J. Woest; P. B. Schechter

In this paper, the WISQ architecture is described. This architecture is designed to achieve high performance by exploiting new compiler technology and using a highly segmented pipeline. By having a highly segmented pipeline, a very-high-speed clock can be used. Since a highly segmented pipeline will require relatively long pipelines, a way must be provided to minimize the effects of pipeline bubbles that are formed due to data and control dependencies. It is also important to provide a way of supporting precise interrupts. These goals are met, in part, by providing a reorder buffer to help restore the machine to a precise state. The architecture then makes the pipelining visible to the programmer/compiler by making the reorder buffer accessible and by explicitly providing that issued instructions cannot be affected by immediately preceding ones. Compiler techniques have been identified that can take advantage of the reorder buffer and permit a sustained execution rate approaching or exceeding one instruction per clock. These techniques include using trace scheduling and providing a relatively easy way to “undo” instructions if the predicted branch path is not taken. We have also studied ways to further reduce the effects of branches by not having them executed in the execution unit. In particular, branches are detected and resolved in the instruction fetch unit. Using this approach, the execution unit is sent a stream of instructions (without branches) that are guaranteed to execute.
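
The fetch-side branch resolution can be caricatured in a few lines. This is a toy model with assumed instruction tuples, not the WISQ pipeline: the fetch unit consumes branches itself and hands the execution unit a branch-free stream, with condition evaluation standing in for however the real fetch unit learns the outcome.

```python
# Toy fetch unit: branches are resolved at fetch time, so only non-branch
# instructions (guaranteed to execute) reach the execution unit.

def fetch_stream(program, flags):
    """program: list of ('op', name) or ('br', flag_name, target_pc).
    Returns the non-branch instructions in executed order."""
    pc = 0
    out = []
    while pc < len(program):
        instr = program[pc]
        if instr[0] == "br":
            _, flag, target = instr
            pc = target if flags[flag] else pc + 1   # resolved at fetch
        else:
            out.append(instr)                        # forwarded to execution
            pc += 1
    return out
```

Because the execution unit never sees a branch, its stream contains no control dependencies to stall on.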


Design Automation Conference | 1985

An Algorithm for Design Rule Checking on a Multiprocessor

George E. Bier; Andrew R. Pleszkun

Design rules and the problem of design rule checking are introduced. The critical problem of design rule checking is the execution time required to check a complete chip. Proposed solutions try to take advantage of hierarchical aspects of a layout. The algorithm presented in this paper takes a different approach. Observing that design rule checking is a very local operation, a method is described for partitioning a design for checking on a multiprocessor. An implementation is described and results are given for runs on a single processor. These results indicate that speedup proportional to the number of processors is possible.
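
The locality observation can be made concrete with a sketch. Assuming a maximum rule interaction distance `d` (a hypothetical parameter, not from the paper), the layout can be cut into strips, each extended by a halo of width `d`, so each strip can be checked independently on its own processor without missing a violation that spans a cut:

```python
# Sketch of locality-based partitioning for parallel design rule checking:
# each strip receives every rectangle within halo distance d of its region,
# so rules spanning a strip boundary are still checked.

def partition_with_halo(rects, n_strips, width, d):
    """rects: list of (x0, y0, x1, y1). Splits the x-range [0, width)
    into n_strips strips, each seeing rectangles overlapping its region
    extended by d on both sides."""
    strip_w = width / n_strips
    strips = []
    for s in range(n_strips):
        lo, hi = s * strip_w - d, (s + 1) * strip_w + d
        strips.append([r for r in rects if r[2] > lo and r[0] < hi])
    return strips
```

A rectangle near a boundary is duplicated into both neighboring strips, which is the price paid for making every strip checkable in isolation.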


Information Processing Letters | 1987

On the structural locality of reference in LISP list access streams

Matthew J. Thazhuthaveetil; Andrew R. Pleszkun

The programming language LISP has been providing programmers with a powerful and invigorating applicative programming environment for over 25 years now [4]. It was adopted as the major programming language of the artificial intelligence community due to its support for symbolic manipulation; most of the large AI programs in widespread use today are LISP programs. LISP programs are organized as collections of user-defined functions that call each other. These functions, as well as the data they manipulate, are represented as lists. Unfortunately, computation based on list manipulation does not map well onto the linear memories of conventional von Neumann machines. Surprisingly, few studies have been conducted on the properties of list manipulation, or on how its efficiency can be increased. This paper describes interesting new results from one such study.


International Symposium on Computer Architecture | 1986

An architecture for efficient Lisp list access

Andrew R. Pleszkun; Matthew Thazhuthaveetil

In this paper, we present a Lisp machine architecture that supports efficient list manipulation. This Lisp architecture is organized as two processing units: a List Processor (LP), which performs all list related operations and manages the list memory, and an Evaluation Processor (EP), which maintains the addressing and control environment. The LP contains a translation table (LPT) that maps a small set of list identifiers into the physical memory addresses of objects. Essentially, the LP and LPT virtualize a list. The EP then operates on these virtualized lists. Such an organization permits the overlap of EP function evaluation with LP memory accesses and management, thus reducing the performance penalties typically associated with Lisp list manipulation activities. We used trace-driven simulations to evaluate this architecture. Our evaluation shows that a relatively small LPT is sufficient and yields “hit rates” on data accesses higher than those of a data cache of comparable size.
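
The LPT idea can be sketched schematically. The structure below is an assumption for illustration (a small table with naive oldest-first eviction), not the paper's design: list identifiers are resolved to physical addresses through the table, falling back to a full memory lookup on a miss.

```python
# Schematic LPT sketch: a small table maps list identifiers to physical
# addresses, so the EP names lists abstractly while the LP tracks where
# their cells actually live.

class ListProcessorTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.table = {}       # list-id -> physical address
        self.hits = self.accesses = 0

    def access(self, list_id, memory_lookup):
        """Resolve list_id to a physical address; on a miss, consult
        memory_lookup and fill the table, evicting the oldest entry."""
        self.accesses += 1
        if list_id in self.table:
            self.hits += 1
        else:
            if len(self.table) >= self.capacity:
                self.table.pop(next(iter(self.table)))  # oldest-first evict
            self.table[list_id] = memory_lookup(list_id)
        return self.table[list_id]

    def hit_rate(self):
        return self.hits / self.accesses
```

Because programs tend to revisit a small working set of lists, even a small table like this can capture most accesses, which is the effect the paper's "hit rate" comparison measures.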


International Symposium on Computer Architecture | 1985

PIPE: a VLSI decoupled architecture

James R. Goodman; Jian-tu Hsieh; Koujuch Liou; Andrew R. Pleszkun; P. B. Schechter; Honesty C. Young


Readings in Computer Architecture | 2000

Implementing precise interrupts in pipelined processors

James E. Smith; Andrew R. Pleszkun


COMPCON | 1986

Features of the Structured Memory Access (SMA) Architecture.

Andrew R. Pleszkun; Gurindar S. Sohi; Bassam Z. Kahhaleh; Edward S. Davidson


Archive | 1981

An address prediction mechanism for reducing processor-memory address bandwidth

Andrew R. Pleszkun; B. Ramakrishna Rau; Edward S. Davidson

Collaboration


Dive into Andrew R. Pleszkun's collaborations.

Top Co-Authors

James E. Smith, University of Wisconsin-Madison
George E. Bier, University of Wisconsin-Madison
James R. Goodman, University of Wisconsin-Madison
P. B. Schechter, University of Wisconsin-Madison
Gurindar S. Sohi, University of Illinois at Urbana–Champaign
Honesty C. Young, University of Wisconsin-Madison
Jian-tu Hsieh, University of Wisconsin-Madison
Koujuch Liou, University of Wisconsin-Madison