Christoph W. Keßler
University of Trier
Publication
Featured research published by Christoph W. Keßler.
The Journal of Supercomputing | 2000
Christoph W. Keßler
NestStep is a parallel programming language for the BSP (bulk-synchronous parallel) model of parallel computation. Extending the classical BSP model, NestStep supports dynamically nested parallelism by nesting of supersteps and a hierarchical processor group concept. Furthermore, NestStep adds a virtual shared memory realization in software, where memory consistency is relaxed to superstep boundaries. Distribution of shared arrays is also supported. A prototype for a subset of NestStep has been implemented with Java as the sequential base language. The prototype implementation targets a set of Java Virtual Machines coupled by Java socket communication into a virtual parallel computer.
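The idea of relaxing memory consistency to superstep boundaries can be illustrated with a minimal sketch. It is written in Python rather than NestStep, simulates the processors sequentially, and assumes a hypothetical combine rule (summing all contributions); the abstract does not describe NestStep's actual combine semantics, and the names `superstep` and `contribute` are illustrative only.

```python
def superstep(shared, num_procs, work):
    """Simulate one superstep over a dict of shared variables.

    Every processor reads the same snapshot taken at superstep entry.
    Writes are buffered and only become visible at the superstep boundary.
    """
    snapshot = dict(shared)  # consistent view for the whole superstep
    pending = [work(p, snapshot) for p in range(num_procs)]
    # Superstep boundary: combine buffered writes.
    # Assumption: contributions to the same variable are summed.
    for updates in pending:
        for name, value in updates.items():
            shared[name] = shared.get(name, 0) + value
    return shared


seen = []  # what each processor observed during the superstep


def contribute(proc, snapshot):
    seen.append(snapshot["total"])  # still the value from superstep entry
    return {"total": proc + 1}


shared = superstep({"total": 0}, num_procs=4, work=contribute)
print(shared["total"], seen)
```

Within the superstep no processor observes another processor's write; the combined value only appears once the boundary is reached.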
International Journal of Parallel Programming | 1997
Christoph W. Keßler; Helmut Seidl
Fork95 is an imperative parallel programming language intended to express algorithms for synchronous shared memory machines (PRAMs). It is based on ANSI C and offers additional constructs to hierarchically divide processor groups into subgroups and manage shared and private address subspaces. Fork95 makes the assembly-level synchronicity of the underlying hardware available to the programmer at the language level. Nevertheless, it supports locally asynchronous computation where desired by the programmer. We present a one-pass compiler, fcc, which compiles Fork95 and C programs to the SB-PRAM machine. The SB-PRAM is a lock-step synchronous, massively parallel multiprocessor currently being built at Saarbrücken University, with a physically shared memory and uniform memory access time. We examine three important types of parallel computation frequently used for the parallel solution of real-world problems. While farming and parallel divide-and-conquer are directly supported by Fork95 language constructs, pipelining can be easily expressed using existing language features; an additional language construct for pipelining is not required.
Proceedings of the Second International ACPC Conference on Parallel Computation | 1993
Christoph W. Keßler; Wolfgang J. Paul
We present the top-down design of a new system which performs automatic parallelization of numerical Fortran 77 or C source programs for execution on distributed-memory message-passing multiprocessors such as the INTEL iPSC860 or the TMC CM-5.
Code Generation | 1992
Christoph W. Keßler; Wolfgang J. Paul; Thomas Rauber
We present an algorithm to schedule basic blocks of vector three-address-instructions. This algorithm is suited for a special class of vector processors containing a buffer (register file) which may be partitioned arbitrarily into vector registers by the user. The algorithm computes the best ratio of vector register spilling to strip mining, taking the vector length and the buffer size into consideration, as well as several machine parameters of the target architecture. We apply the algorithm to groups of vector instructions within a basic block that are quasiscalar, i.e. all vectors occurring in the group must have one fixed length L.
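The strip-mining side of this trade-off comes down to simple arithmetic, sketched below in Python. This is not the scheduling algorithm from the paper: it only shows how a buffer partitioned evenly among the simultaneously live vectors bounds the strip length. The function names and the example figures (a 256-word buffer, 4 live vectors, L = 1000) are assumptions chosen for illustration.

```python
import math


def strip_plan(vector_length, buffer_words, live_vectors):
    """Return (strip_length, number_of_strips) for an even buffer partition.

    Assumption: the buffer is split into `live_vectors` equally sized
    vector registers, so no spilling is needed within a strip.
    """
    strip = buffer_words // live_vectors
    if strip == 0:
        raise ValueError("buffer too small for this many live vectors")
    return strip, math.ceil(vector_length / strip)


def strip_ranges(vector_length, strip):
    """Half-open index ranges [start, end) covered by each strip."""
    return [(start, min(start + strip, vector_length))
            for start in range(0, vector_length, strip)]


strip, count = strip_plan(vector_length=1000, buffer_words=256, live_vectors=4)
ranges = strip_ranges(1000, strip)
print(strip, count, ranges[-1])
```

Keeping more vectors live shortens each strip and raises the strip count, while spilling some vectors allows longer strips at the cost of extra memory traffic. That is the ratio the algorithm in the abstract optimizes.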
International Symposium on Programming Language Implementation and Logic Programming | 1991
Christoph W. Keßler; Wolfgang J. Paul; Thomas Rauber
We present a randomized algorithm to generate contiguous evaluations for expression DAGs representing basic blocks of straight line code with nearly minimal register need. This heuristic may be used to reorder the statements in a basic block before applying a global register allocation scheme like Graph Coloring. Experiments have shown that the new heuristic produces results which are about 30% better on the average than without reordering.
Proceedings of the First IFIP TC10 International Workshop on Software Engineering for Parallel and Distributed Systems | 1996
Beniamino Di Martino; Christoph W. Keßler
We compare two systems for program comprehension that are targeted towards support of automatic parallelization: the PAP recognizer currently included in the Vienna Fortran Compilation System, and the PARAMAT pattern recognizer developed at Saarbrücken University. We illuminate the main differences, the advantages and disadvantages of each approach, and show how both approaches may be integrated to combine the generality of one approach with the speed of the other one.
IEEE International Conference on High Performance Computing Data and Analytics | 1999
Christoph W. Keßler; Helmut Seidl
ForkLight is an imperative, task-parallel programming language for massively parallel shared memory machines. It is based on ANSI C, follows the SPMD model of parallel program execution, provides a sequentially consistent shared memory, and supports dynamically nested parallelism. While no assumptions are made on uniformity of memory access time or instruction-level synchronicity of the underlying hardware, ForkLight offers a simple but powerful mechanism for coordination of parallel processes in the tradition and notation of PRAM algorithms: beyond its asynchronous default execution mode, ForkLight offers a mode for control-synchronous execution that relates the program's block structure to parallel control flow.
International Symposium on Programming Language Implementation and Logic Programming | 1996
Christoph W. Keßler
Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial O(n!) enumeration algorithm. We present a new algorithm with worst-case complexity O(n 2^{2n}) and very good average behaviour. Applying a dynamic programming scheme and reordering techniques, it is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.
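The contrast between O(n!) enumeration and a dynamic programming scheme can be sketched on a toy model. The Python code below is not the algorithm from the paper. It assumes a simplified register model (a value occupies a register from its evaluation until all of its consumers are evaluated, and root results stay live until the end) and memoizes on the set of already evaluated nodes. The example DAG and all function names are hypothetical.

```python
from functools import lru_cache
from itertools import permutations

# Hypothetical expression DAG for (a + b) * (c + d): node -> operand nodes.
DAG = {
    "a": (), "b": (), "c": (), "d": (),
    "t1": ("a", "b"),
    "t2": ("c", "d"),
    "t3": ("t1", "t2"),
}


def build_users(dag):
    """Map every node to the set of nodes that consume its value."""
    users = {v: set() for v in dag}
    for v, operands in dag.items():
        for o in operands:
            users[o].add(v)
    return {v: frozenset(u) for v, u in users.items()}


def live_count(done, users):
    """Registers occupied once exactly the nodes in `done` are evaluated."""
    return sum(1 for v in done if not users[v] or not users[v] <= done)


def min_registers_dp(dag):
    """Minimal register need, memoized on the set of evaluated nodes."""
    users = build_users(dag)

    @lru_cache(maxsize=None)
    def best(done):
        if not done:
            return 0
        here = live_count(done, users)
        candidates = []
        for v in done:
            if users[v] & done:
                continue  # v cannot be last: one of its consumers is in `done`
            candidates.append(max(best(done - {v}), here))
        return min(candidates)

    return best(frozenset(dag))


def min_registers_enum(dag):
    """Reference answer obtained by trying every topological order."""
    users = build_users(dag)
    best = len(dag)
    for order in permutations(dag):
        done, peak, valid = set(), 0, True
        for v in order:
            if not set(dag[v]) <= done:
                valid = False  # an operand is not yet available
                break
            done.add(v)
            peak = max(peak, live_count(done, users))
        if valid:
            best = min(best, peak)
    return best


print(min_registers_dp(DAG), min_registers_enum(DAG))
```

Because many different orders lead to the same set of evaluated nodes, the memoized version visits each set once, while the enumeration revisits it for every order that reaches it.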
International Symposium on Programming Language Implementation and Logic Programming | 1993
Christoph W. Keßler; Thomas Rauber
We consider the NP-complete problem of generating evaluations for expression DAGs with a minimal number of registers. We restrict our attention to contiguous evaluations, because for nearly all of the DAGs derived from real application programs there exists a contiguous evaluation that is optimal with respect to the register need. We present an algorithm that generates an optimal contiguous evaluation for a given DAG. The algorithm is very fast on average. It generates the evaluation by splitting the DAG into trees with import and export nodes and applying a labeling scheme to the trees.
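For plain expression trees, a labeling scheme of this kind can be sketched with the classic Sethi-Ullman numbering, shown below in Python. Whether the paper uses exactly this labeling is an assumption, and the handling of import and export nodes is omitted. The sketch also assumes that every leaf is loaded into a register; the tree encoding and function names are hypothetical.

```python
# Hypothetical expression tree: a leaf is a string, an inner node is
# (operator, left_subtree, right_subtree).
TREE = ("*", ("+", "a", "b"), ("+", "c", "d"))


def label(tree):
    """Sethi-Ullman-style label: registers needed to evaluate `tree`."""
    if isinstance(tree, str):
        return 1  # assumption: every leaf is loaded into a register
    _, left, right = tree
    l, r = label(left), label(right)
    return max(l, r) if l != r else l + 1


def contiguous_order(tree, out=None):
    """List operations so each subtree is finished before its sibling starts."""
    if out is None:
        out = []
    if isinstance(tree, str):
        out.append(f"load {tree}")
        return out
    op, left, right = tree
    # Evaluate the more demanding subtree first; ties keep left-to-right order.
    if label(left) >= label(right):
        first, second = left, right
    else:
        first, second = right, left
    contiguous_order(first, out)
    contiguous_order(second, out)
    out.append(f"apply {op}")
    return out


print(label(TREE))
print(contiguous_order(("+", "c", ("*", "a", "b"))))
```

The evaluation is contiguous in the sense used above: once a subtree is started, it is completed before any node of its sibling subtree is evaluated.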
European Conference on Parallel Processing | 1997
Christoph W. Keßler