Rodney W. Johnson
St. Cloud State University
Publication
Featured research published by Rodney W. Johnson.
international conference on supercomputing | 1994
S. D. Kaushik; Chua-Huang Huang; Rodney W. Johnson; P. Sadayappan
We address the development of efficient methods for performing data redistribution of arrays on distributed-memory machines. Data redistribution is important for the distributed-memory implementation of data parallel languages such as High Performance Fortran. An algebraic representation of regular data distributions is used to develop an analytical model for evaluating the communication cost of data redistribution. Using this algebraic representation and the analytical model, an approach to communication-efficient data redistribution is developed. Implementation results on the Intel iPSC/860 are reported.
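As a rough illustration of what redistribution involves, the sketch below counts how many array elements must move between each pair of processors when a one-dimensional array changes from one block-cyclic distribution to another. It is a brute-force count under assumed names (`owner`, `redistribution_traffic`), not the algebraic representation or analytical cost model described in the abstract.

```python
def owner(i, block, procs):
    """Processor owning element i under a block-cyclic distribution
    with the given block size (hypothetical helper for illustration)."""
    return (i // block) % procs


def redistribution_traffic(n, procs, src_block, dst_block):
    """Count elements moved between each (source, destination) processor
    pair when an n-element array changes block size.
    Elements that stay on the same processor are not counted."""
    traffic = {}
    for i in range(n):
        src = owner(i, src_block, procs)
        dst = owner(i, dst_block, procs)
        if src != dst:
            traffic[(src, dst)] = traffic.get((src, dst), 0) + 1
    return traffic


# Cyclic (block size 1) to block (block size 4) on 2 processors.
print(redistribution_traffic(8, 2, 1, 4))
```

An analytical model would presumably aim to predict such counts without enumerating every element.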
conference on high performance computing (supercomputing) | 1993
S. D. Kaushik; Chua-Huang Huang; John R. Johnson; Rodney W. Johnson; P. Sadayappan
The authors present transposition algorithms for matrices that do not fit in main memory. Transposition is interpreted as a permutation of the vector obtained by mapping a matrix to linear memory. Algorithms are derived from factorizations of this permutation, using a class of permutations related to the tensor product. Using this formulation of transposition, the authors first obtain several known algorithms and then they derive a new algorithm which reduces the number of disk accesses required. The new algorithm was compared to existing algorithms using an implementation on the Intel iPSC/860. This comparison shows the benefits of the new algorithm.
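The view of transposition as a permutation of the linearized matrix can be sketched as follows. This is a minimal in-memory illustration that assumes row-major storage. The function names are hypothetical, and the sketch does not attempt the factorized, disk-based algorithms the abstract describes.

```python
def stride_permutation(m, n):
    """For an m x n row-major matrix, return perm such that
    transposed_vector[k] = vector[perm[k]]."""
    # Position k in the n x m transpose is (row j, column i), k = j*m + i;
    # that element sits at i*n + j in the original vector.
    return [(k % m) * n + (k // m) for k in range(m * n)]


def transpose_flat(vec, m, n):
    """Transpose an m x n matrix stored as a flat row-major list."""
    perm = stride_permutation(m, n)
    return [vec[p] for p in perm]


# [[0, 1, 2], [3, 4, 5]] stored flat, transposed to [[0, 3], [1, 4], [2, 5]].
print(transpose_flat([0, 1, 2, 3, 4, 5], 2, 3))
```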
Journal of Parallel and Distributed Computing | 1996
Sandeep K. S. Gupta; Chua-Huang Huang; P. Sadayappan; Rodney W. Johnson
A framework for synthesizing communication-efficient distributed-memory parallel programs for block recursive algorithms such as the fast Fourier transform (FFT) and Strassen's matrix multiplication is presented. This framework is based on an algebraic representation of the algorithms, which involves the tensor (Kronecker) product and other matrix operations. This representation is useful in analyzing the communication implications of computation partitioning and data distributions. The programs are synthesized under two different target program models. These two models are based on different ways of managing the distribution of data for optimizing communication. The first model uses point-to-point interprocessor communication primitives, whereas the second model uses data redistribution primitives involving collective all-to-many communication. These two program models are shown to be suitable for different ranges of problem size. The methodology is illustrated by synthesizing communication-efficient programs for the FFT. This framework has been incorporated into the EXTENT system for automatic generation of parallel/vector programs for block recursive algorithms.
conference on high performance computing (supercomputing) | 1994
D. L. Dai; Sandeep K. S. Gupta; S. D. Kaushik; J. H. Lu; R. V. Singh; Chua-Huang Huang; P. Sadayappan; Rodney W. Johnson
Presents EXTENT (EXpert system for TENsor product formula Translation), a programming environment for the automatic generation of parallel/vector programs from tensor product formulas. A tensor (Kronecker) product based programming methodology is used for designing high-performance programs on various architectures. In this programming methodology, block recursive algorithms such as the fast Fourier transform and Strassen's matrix multiplication algorithm are expressed as tensor product formulas involving tensor product and other matrix operations. A tensor product formula can be systematically translated into parallel and/or vector code for various parallel architectures. A prototype system which generates programs for the Cray Y-MP, Cray T3D and Intel Paragon has been developed. Performance results for some generated programs are presented.
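To give a flavor of how a tensor product formula maps to loops, the sketch below applies (I_m ⊗ A) and (A ⊗ I_m) to a vector without forming the Kronecker product explicitly. The first form applies A to m contiguous blocks (a natural parallel loop). The second applies A to m strided subvectors (a natural vector operation). The function names are hypothetical and A is assumed to be a square matrix given as a list of rows. This is not EXTENT's actual translation scheme.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def apply_I_kron_A(A, x, m):
    """y = (I_m (x) A) x: apply A independently to m contiguous blocks."""
    n = len(A)
    y = []
    for b in range(m):
        y.extend(matvec(A, x[b * n:(b + 1) * n]))
    return y


def apply_A_kron_I(A, x, m):
    """y = (A (x) I_m) x: apply A to the m subvectors taken at stride m."""
    n = len(A)
    y = [0] * (n * m)
    for s in range(m):
        y[s::m] = matvec(A, x[s::m])
    return y


F2 = [[1, 1], [1, -1]]  # the 2-point Fourier (butterfly) matrix
print(apply_I_kron_A(F2, [1, 2, 3, 4], 2))
print(apply_A_kron_I(F2, [1, 2, 3, 4], 2))
```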
Scientific Programming | 1995
Bharat Kumar; Chua-Huang Huang; P. Sadayappan; Rodney W. Johnson
In this article, we present a program generation strategy for Strassen's matrix multiplication algorithm using a programming methodology based on tensor product formulas. In this methodology, block recursive programs such as the fast Fourier transform and Strassen's matrix multiplication algorithm are expressed as algebraic formulas involving tensor products and other matrix operations. Such formulas can be systematically translated to high-performance parallel/vector codes for various architectures. In this article, we present a nonrecursive implementation of Strassen's algorithm for shared memory vector processors such as the Cray Y-MP. A previous implementation of Strassen's algorithm synthesized from tensor product formulas required working storage of size O(7^n) for multiplying 2^n × 2^n matrices.
international parallel and distributed processing symposium | 1992
Sandeep K. S. Gupta; S. D. Kaushik; Chua-Huang Huang; John R. Johnson; Rodney W. Johnson; P. Sadayappan
conference on high performance computing (supercomputing) | 1992
S. D. Kaushik; Sanjay Sharma; Chua-Huang Huang; Jeremy R. Johnson; Rodney W. Johnson; P. Sadayappan
Concurrency and Computation: Practice and Experience | 1998
Sandeep K. S. Gupta; Chua-Huang Huang; P. Sadayappan; Rodney W. Johnson
languages and compilers for parallel computing | 1993
S. D. Kaushik; Chua-Huang Huang; Rodney W. Johnson; P. Sadayappan
languages and compilers for parallel computing | 1992
Sandeep K. S. Gupta; Chua-Huang Huang; P. Sadayappan; Rodney W. Johnson