Will Denissen
Delft University of Technology
Publications
Featured research published by Will Denissen.
IEEE Transactions on Parallel and Distributed Systems | 1996
K. van Reeuwijk; Will Denissen; Henk J. Sips; Edwin M. R. M. Paalvast
Data parallel languages, like High Performance Fortran (HPF), support the notion of distributed arrays. However, the implementation of such distributed array structures and their access on message passing computers is not straightforward. This holds especially for distributed arrays that are aligned to each other and given a block-cyclic distribution. In this paper, an implementation framework is presented for HPF distributed arrays on message passing computers. Methods are presented for efficient (in space and time) local index enumeration, local storage, and communication. Techniques for local set enumeration provide the basis for constructing local iteration sets and communication sets. It is shown that both local set enumeration and local storage schemes can be derived from the same equation. Local set enumeration and local storage schemes are shown to be orthogonal, i.e., they can be freely combined. Moreover, for linear access sequences generated by our enumeration methods, the local address calculations can be moved out of the enumeration loop, yielding efficient local memory address generation. The local set enumeration methods are implemented by using a relatively simple general transformation rule for absorbing ownership tests. This transformation rule can be repeatedly applied to absorb multiple ownership tests. Performance figures are presented for local iteration overhead, a simple communication pattern, and storage efficiency.
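To make the mapping concrete, the sketch below shows the standard cyclic(m) owner and local-index computation for a block-cyclic distribution over P processors, together with a simple enumeration of the global indices a processor owns. It is an illustrative reconstruction under 0-based indexing and identity alignment, not the implementation framework of the paper; all names are hypothetical.

```python
# Illustrative sketch (not the paper's runtime) of the standard cyclic(m)
# mapping for a block-cyclic distribution of n elements over P processors,
# assuming 0-based indices and identity alignment.

def owner(i, m, P):
    """Processor that owns global index i under a cyclic(m) distribution."""
    return (i // m) % P

def local_index(i, m, P):
    """Compressed local index of global element i on its owning processor."""
    block = i // (m * P)   # which of the owner's local blocks holds i
    offset = i % m         # position inside that block
    return block * m + offset

def owned_global_indices(p, m, P, n):
    """Global indices owned by processor p, in local storage order."""
    g = p * m                      # start of processor p's first block
    while g < n:
        for i in range(g, min(g + m, n)):
            yield i
        g += m * P                 # skip the blocks owned by the other processors

# Example: 20 elements, block size m = 3, P = 4 processors.
if __name__ == "__main__":
    n, m, P = 20, 3, 4
    for p in range(P):
        owned = list(owned_global_indices(p, m, P, n))
        print(p, owned, [local_index(i, m, P) for i in owned])
```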
Parallel Computing | 1998
Henk J. Sips; Will Denissen; Kees van Reeuwijk
In this paper, we analyze the properties and efficiency of three basic local enumeration and three storage compression schemes for cyclic(m) data distributions in High Performance Fortran (HPF). The methods are presented in a unified framework, showing the relations between the various methods. We show that for array accesses that are affine functions of the loop bounds, efficient local enumeration and storage compression schemes can be derived. Furthermore, the basic set enumeration and storage techniques are shown to be orthogonal, if the local storage compression scheme is collapsible. This allows choosing the most appropriate method in parts of the computation and communication phases of parallel loops. Performance figures of the methods show that programs with cyclic(m) data distributions can be executed efficiently even without compile-time knowledge of the relevant access, alignment, and distribution parameters.
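The following sketch illustrates what collapsed local storage amounts to for a cyclic(m) distribution: each processor allocates only the blocks it owns, packed contiguously, rather than the full global extent. The function and the 0-based indexing are assumptions made for the example and do not reproduce the compression schemes analyzed in the paper.

```python
# Sketch of collapsed ("compressed") local storage for a cyclic(m) distribution:
# each processor stores only the blocks it owns, packed contiguously, instead of
# allocating the full global extent on every processor. Assumed 0-based
# indexing; illustrative only, not the paper's compression schemes.

def collapsed_local_size(p, n, m, P):
    """Elements processor p stores for an n-element array, cyclic(m) over P processors."""
    full_cycles, rest = divmod(n, m * P)   # complete rounds over all processors
    size = full_cycles * m                 # one full block per complete round
    size += min(max(rest - p * m, 0), m)   # p's share of the final partial round
    return size

# Example: a 20-element array with block size 3 on 4 processors needs only
# 6, 6, 5 and 3 locally stored elements instead of 20 per processor.
if __name__ == "__main__":
    n, m, P = 20, 3, 4
    sizes = [collapsed_local_size(p, n, m, P) for p in range(P)]
    print(sizes)                           # [6, 6, 5, 3]
    assert sum(sizes) == n                 # every element is stored exactly once
```

Because a collapsed layout depends only on which elements a processor owns, it can in principle be paired with any enumeration order, which is one way to read the orthogonality property stated in the abstract.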
Concurrency and Computation: Practice and Experience | 2002
Will Denissen; Henk J. Sips
High-Performance Fortran (HPF) has been designed to provide portable performance on distributed memory machines. An important aspect of portable performance is the behavior of the available HPF compilers. Ideally, a programmer may expect comparable performance between different HPF compilers, given the same program and the same machine.
Languages and Compilers for Parallel Computing | 2000
Will Denissen; Henk J. Sips
In translating HPF programs, a compiler has to generate local iteration and communication sets. Apart from local enumeration, local storage compression is an issue, because in HPF array alignment functions can introduce local storage inefficiencies. Storage compression, however, should not lead to serious performance penalties. A problem in semi-automatic translation is that a compiler should generate efficient code in all cases where the user may expect efficient translation (no surprises). However, in current compilers this is not always the case. A major cause of these inefficiencies is that compilers use the same fixed enumeration scheme in all cases. In this paper, we present an efficient dynamic local enumeration method, which always selects the optimal solution at run-time and has no need for code duplication. The method is compared with the PGI and Adaptor compilers.
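As an illustration of run-time scheme selection without code duplication, the sketch below chooses between a test-based and a block-stepping enumeration of a processor's local iterations through a single generator interface. The two schemes and the selection heuristic are placeholders invented for the example; the paper's actual method and cost model are not reproduced here.

```python
# Two interchangeable ways to enumerate the iterations i in [lo, hi) whose
# element A[i] is owned by processor p under a cyclic(m) distribution over P
# processors (0-based indices). Selecting a generator at run time keeps one
# loop body instead of duplicating it per scheme. Illustrative sketch only.

def enumerate_by_test(p, m, P, lo, hi):
    """Scan every iteration and keep those owned by p (cheap setup, one test per iteration)."""
    for i in range(lo, hi):
        if (i // m) % P == p:
            yield i

def enumerate_by_blocks(p, m, P, lo, hi):
    """Jump directly between the blocks owned by p (more setup, no wasted tests)."""
    cycle = m * P
    g = (lo // cycle) * cycle + p * m      # p's block in the cycle containing lo
    if g + m <= lo:
        g += cycle                         # that block ends before lo: take the next one
    while g < hi:
        for i in range(max(g, lo), min(g + m, hi)):
            yield i
        g += cycle

def local_iterations(p, m, P, lo, hi):
    """Pick a scheme from run-time parameters; the heuristic is a placeholder, not the paper's."""
    if hi - lo < m * P:                    # short loop: per-iteration test is cheap enough
        return enumerate_by_test(p, m, P, lo, hi)
    return enumerate_by_blocks(p, m, P, lo, hi)
```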
Archive | 2001
Frits Kuijlman; Henk J. Sips; C. Van Reeuwijk; Will Denissen
Archive | 1995
C. Van Reeuwijk; Henk J. Sips; Will Denissen
International Conference on Supercomputing | 1996
Henk J. Sips; C. Van Reeuwijk; Will Denissen
EUROSIM | 1996
Kees van Reeuwijk; Will Denissen; Henk J. Sips