Mark A. Nichols | Researchain

Publication

Featured research published by Mark A. Nichols.

Proceedings. Workshop on Heterogeneous Processing | 1992

Augmenting the Optimal Selection Theory for Superconcurrency

Mu-Cheng Wang; Shin-Dug Kim; Mark A. Nichols; Richard F. Freund; Howard Jay Siegel; Wayne G. Nation

An approach for finding the optimal configuration of heterogeneous computer systems to solve supercomputing problems is presented. Superconcurrency, a form of distributed heterogeneous supercomputing, is an approach for matching and managing an optimally configured suite of super-speed machines to minimize the execution time of a given task. The approach performs best when the computational requirements for a given set of tasks are diverse. A supercomputing application task is decomposed into a collection of code segments, where the processing requirement is homogeneous within each code segment. The optimal selection theory has been proposed to choose the optimal configuration of machines for a supercomputing problem. This technique is based on code profiling and analytical benchmarking. Here, the previously presented optimal selection theory approach is augmented in two ways: the performance of code segments on non-optimal machine choices is incorporated, and non-uniform decompositions of code segments are considered.

IEEE Transactions on Parallel and Distributed Systems | 1993

Data management and control-flow aspects of an SIMD/SPMD parallel language/compiler

Mark A. Nichols; Howard Jay Siegel; Henry G. Dietz

Features of an explicitly parallel programming language targeted for reconfigurable parallel processing systems, where the machine's N processing elements (PEs) are capable of operating in both the SIMD and SPMD modes of parallelism, are described. The SPMD (single program-multiple data) mode of parallelism is a subset of the MIMD mode where all processors execute the same program. By providing all aspects of the language with an SIMD mode version and an SPMD mode version that are syntactically and semantically equivalent, the language facilitates experimentation with and exploitation of hybrid SIMD/SPMD machines. Language constructs (and their implementations) for data management, data-dependent control-flow, and PE-address-dependent control-flow are presented. These constructs are based on experience gained from programming a parallel machine prototype and are being incorporated into a compiler under development. Much of the research presented is applicable to general SIMD machines and MIMD machines.

IEEE Transactions on Parallel and Distributed Systems | 1995

Using a multipath network for reducing the effects of hot spots

Mu-Cheng Wang; Howard Jay Siegel; Mark A. Nichols; Seth Abraham

One type of interconnection network for a medium to large-scale parallel processing system (i.e., a system with 2^6 to 2^16 processors) is a buffered packet-switched multistage interconnection network (MIN). It has been shown that the performance of these networks is satisfactory for uniform network traffic. More recently, several studies have indicated that the performance of MINs is degraded significantly when there is hot spot traffic, that is, a large fraction of the messages are routed to one particular destination. A multipath MIN is a MIN with two or more paths between all source and destination pairs. This research investigates how the Extra Stage Cube multipath MIN can reduce the detrimental effects of tree saturation caused by hot spots. Simulation is used to evaluate the performance of the proposed approaches. The objective of this evaluation is to show that, under certain conditions, the performance of the network with the usual routing scheme is severely degraded by the presence of hot spots. With the proposed approaches, although the delay time of hot spot traffic may be increased, the performance of the background traffic, which constitutes the majority of the network traffic, can be significantly improved.

Journal of Parallel and Distributed Computing | 1991

Modeling overlapped operation between the control unit and processing elements in an SIMD machine

Shin-Dug Kim; Mark A. Nichols; Howard Jay Siegel

A model for overlapped operation between the control unit (CU) and processing elements (PEs) in an SIMD machine is presented. The major requirements and structure of the CU for overlapped operation in SIMD mode are described and overlapped operation is formally defined. To use the computing power of both the CU and the PEs most effectively to execute a single program, a balanced work load between the CU and PEs is required. It is assumed that certain computations (e.g., the manipulation of loop index variables, PE-common array index calculations) can be migrated from the PEs to the CU and vice versa. This research demonstrates how to increase the effectiveness of an SIMD machine by allowing overlapped operation between the CU and PEs. The best overlapping can be achieved ideally by assigning an equal amount of work to be executed concurrently on the CU and PEs, resulting in a 2N speedup for an N-PE system. The goal of this research is to develop a model of overlapped operation in SIMD mode so that the actual maximum possible performance of the SIMD machine can be attained.

IEEE Transactions on Parallel and Distributed Systems | 1991

Eliminating memory fragmentation within partitionable SIMD/SPMD machines

Mark A. Nichols; Howard Jay Siegel; Henry G. Dietz; Russell W. Quong; Wayne G. Nation

Efficient data layout is an important aspect of the compilation process. A model for the creation of perfect memory maps for large-scale parallel machines capable of user-controlled partitionable single-instruction-multiple-data/single-program-multiple-data (SIMD/SPMD) operation is developed. The term perfect implies that no memory fragmentation occurs and ensures that the memory map size is kept to a minimum. A major constraint on solving this problem is based on the single program nature of both the SIMD and SPMD modes of parallelism. It is assumed that all processors within the same submachine use identical addresses to access corresponding data items in each of their local memories. Necessary and sufficient conditions are derived for being able to create perfect memory maps, and results are applied to several partitionable interconnection networks.

Journal of Parallel and Distributed Computing | 1994

A Block-Based Mode Selection Model for SIMD/SPMD Parallel Environments

Daniel W. Watson; Howard Jay Siegel; John K. Antonio; Mark A. Nichols; Mikhail J. Atallah

One of the challenges for parallel compilers and compiler-related tools is, given a machine-independent parallel language, to generate executable code for a variety of computational models, and to identify those specific parallel modes for which a program is well-suited. One portion of this problem, developing a method for estimating the relative execution time of a data-parallel algorithm in an environment capable of the SIMD and SPMD (MIMD) modes of parallelism, is presented. Given a data-parallel program in a language whose syntax is mode-independent and empirical information about instruction execution time characteristics, the goal is to use static source-code analysis to determine an implementation that results in an optimal execution time for a mixed-mode machine capable of SIMD and SPMD parallelism. Statistical information about individual operation execution times and paths of execution through a parallel program is assumed. A secondary goal of this study is to indicate language, algorithm, and machine characteristics that must be researched to learn how to provide the information needed to obtain an optimal assignment of parallel modes to program segments.

Journal of Parallel and Distributed Computing | 1994

Multiple quadratic forms: a case study in the design of data-parallel algorithms

Mu-Cheng Wang; Wayne G. Nation; James B. Armstrong; Howard Jay Siegel; Shin-Dug Kim; Mark A. Nichols; Michael Gherrity

Data-parallel implementations of the computationally intensive task of solving multiple quadratic forms (MQFs) have been examined. Coupled and uncoupled parallel methods are investigated, where coupling relates to the degree of interaction among the processors. Also, the impact of partitioning a large MQF problem into smaller non-interacting subtasks is studied. Trade-offs among the implementations for various data-size/machine-size ratios are categorized in terms of complex arithmetic operation counts, communication overhead, and memory storage requirements. Furthermore, the impact on performance of the mode of parallelism used is considered, specifically, SIMD versus MIMD versus SIMD/MIMD mixed-mode. From the complexity analyses, it is shown that none of the algorithms presented in this paper is best for all data-size/machine-size ratios. Thus, to achieve scalability (i.e., good performance as the number of processors available in a machine increases), instead of using a single algorithm, the approach discussed is to have a set of algorithms from which the most appropriate algorithm or combination of algorithms is selected based on the ratio calculated from the scaled machine size. The analytical results have been verified by experiments on the MasPar MP-1 (SIMD), nCUBE 2 (MIMD), and PASM (mixed-mode) prototype.

Proceedings. Workshop on Heterogeneous Processing | 1993

A Framework for Compile-Time Selection of Parallel Modes in an SIMD/SPMD Heterogeneous Environment

Daniel W. Watson; Howard Jay Siegel; John K. Antonio; Mark A. Nichols; Mikhail J. Atallah

A framework for estimating the relative execution time of a data-parallel algorithm in an environment capable of the SIMD and SPMD (Single Program Multiple Data) modes of computation is presented. Given a data-parallel program in a language whose syntax is mode-independent, and empirical information about instruction execution time characteristics, the long-term goal is to determine at compile time an implementation that results in an optimal execution time for a heterogeneous system capable of SIMD and SPMD parallelism.

The Journal of Supercomputing | 1998

Parallel Image Correlation: Case Study to Examine Trade-Offs in Algorithm-to-Machine Mappings

James B. Armstrong; Muthucumaru Maheswaran; Mitchell D. Theys; Howard Jay Siegel; Mark A. Nichols; Kenneth H. Casey

Performance of a parallel algorithm on a parallel machine depends not only on the time complexity of the algorithm, but also on how the underlying machine supports the fundamental operations used by the algorithm. This study analyzes various mappings of image correlation algorithms in SIMD, MIMD, and mixed-mode environments. Experiments were conducted on the Intel Paragon, MasPar MP-1, nCUBE 2, and PASM prototype. The machine features considered in this study include: modes of parallelism, communication/computation ratio, network topology and implementation, SIMD CU/PE overlap, and communication/computation overlap. Performance of an implementation can be enhanced by using algorithmic techniques that match the machine features. Some algorithmic techniques discussed here are additional communication versus redundant computation, data block transfers, and communication/computation overlap. The results presented are applicable to a large class of image processing tasks. Case studies, such as the one presented here, are a necessary step in developing software tools for mapping an application task onto a single parallel machine and for mapping the subtasks of an application task, or a set of independent application tasks, onto a heterogeneous suite of parallel machines.

International Parallel and Distributed Processing Symposium | 1991

Examining the effects of CU/PE overlap and synchronization overhead when using the complete sums approach to image correlation

James B. Armstrong; Mark A. Nichols; H. Jay Siegel; Leah H. Jamieson

A mixed-mode parallel machine's processing elements (PEs) are capable of operating in and switching between the SIMD and MIMD modes of parallelism. The paper analyzes various mappings of image correlation algorithms onto a mixed-mode parallel processing system. The trade-offs that exist between the SIMD and MIMD modes explain why some sequences of instructions are performed better in one mode than in the other and are the primary basis employed in comparing different mappings of a parallel algorithm onto a mixed-mode system.
