Publication


Featured research published by Christoph W. Keßler.


The Journal of Supercomputing | 2000

NestStep: Nested Parallelism and Virtual Shared Memory for the BSP Model

Christoph W. Keßler

NestStep is a parallel programming language for the BSP (bulk-synchronous parallel) model of parallel computation. Extending the classical BSP model, NestStep supports dynamically nested parallelism by nesting of supersteps and a hierarchical processor group concept. Furthermore, NestStep adds a virtual shared memory realization in software, where memory consistency is relaxed to superstep boundaries. Distribution of shared arrays is also supported. A prototype for a subset of NestStep has been implemented using Java as the sequential base language. The prototype implementation targets a set of Java Virtual Machines coupled by Java socket communication into a virtual parallel computer.
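To illustrate what "consistency relaxed to superstep boundaries" can mean, here is a minimal sequential simulation in Python. It is a sketch under stated assumptions, not NestStep code: the function `superstep`, the per-processor copies, and the `combine` rule applied at the barrier are all hypothetical, since the abstract does not say how concurrent writes are reconciled.

```python
def superstep(local_copies, step_fn, combine):
    """Simulate one superstep over all processors (hypothetical helper).

    local_copies: one copy of a shared variable per processor.
    step_fn(rank, value): value this processor wants to write, or None.
    combine(writes): single value agreed on at the barrier (an assumption).
    """
    # Computation phase: each processor reads only its own local copy,
    # so writes by other processors are not visible yet.
    writes = [step_fn(rank, value) for rank, value in enumerate(local_copies)]
    pending = [w for w in writes if w is not None]
    if not pending:
        return list(local_copies)
    # Barrier: buffered writes are combined and every copy is updated,
    # which restores consistency at the superstep boundary.
    agreed = combine(pending)
    return [agreed] * len(local_copies)


copies = superstep([0, 0, 0, 0], lambda rank, value: value + rank, sum)
print(copies)
```

Within the superstep, the four simulated processors work on their own copies; only after the barrier do all copies hold the same combined value.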


International Journal of Parallel Programming | 1997

The Fork95 parallel programming language: design, implementation, application

Christoph W. Keßler; Helmut Seidl

Fork95 is an imperative parallel programming language intended to express algorithms for synchronous shared memory machines (PRAMs). It is based on ANSI C and offers additional constructs to hierarchically divide processor groups into subgroups and manage shared and private address subspaces. Fork95 makes the assembly-level synchronicity of the underlying hardware available to the programmer at the language level. Nevertheless, it supports locally asynchronous computation where desired by the programmer. We present a one-pass compiler, fcc, which compiles Fork95 and C programs to the SB-PRAM machine. The SB-PRAM is a lock-step synchronous, massively parallel multiprocessor currently being built at Saarbrücken University, with a physically shared memory and uniform memory access time. We examine three important types of parallel computation frequently used for the parallel solution of real-world problems. While farming and parallel divide-and-conquer are directly supported by Fork95 language constructs, pipelining can be easily expressed using existing language features; an additional language construct for pipelining is not required.


Proceedings of the Second International ACPC Conference on Parallel Computation | 1993

Automatic Parallelization by Pattern-Matching

Christoph W. Keßler; Wolfgang J. Paul

We present the top-down design of a new system which performs automatic parallelization of numerical Fortran 77 or C source programs for execution on distributed-memory message-passing multiprocessors such as the Intel iPSC/860 or the TMC CM-5.


Code Generation | 1992

Scheduling Vector Straight Line Code on Vector Processors

Christoph W. Keßler; Wolfgang J. Paul; Thomas Rauber

We present an algorithm to schedule basic blocks of vector three-address instructions. This algorithm is suited for a special class of vector processors containing a buffer (register file) which may be partitioned arbitrarily into vector registers by the user. The algorithm computes the best ratio of vector register spilling to strip mining, taking the vector length and the buffer size into consideration, as well as several machine parameters of the target architecture. We apply the algorithm to groups of vector instructions within a basic block that are quasiscalar, i.e. all vectors occurring in the group must have one fixed length L.
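The trade-off mentioned above can be made concrete with a small Python sketch of strip mining. It assumes a buffer measured in vector elements that is split evenly among the vectors that must be live at once; the helper names `strip_size_for` and `strip_mine` are hypothetical, and the paper's cost model with its machine parameters is not reproduced here.

```python
def strip_size_for(buffer_size, live_vectors):
    """Longest strip such that `live_vectors` registers fit in the buffer."""
    if live_vectors <= 0:
        raise ValueError("live_vectors must be positive")
    return buffer_size // live_vectors


def strip_mine(length, strip_size):
    """Split a vector of `length` elements into (start, stop) strips."""
    if strip_size <= 0:
        raise ValueError("strip_size must be positive")
    return [(start, min(start + strip_size, length))
            for start in range(0, length, strip_size)]


# Keeping 3 vectors live in a 1024-element buffer allows strips of 341.
size = strip_size_for(1024, 3)
print(size, strip_mine(1000, size))
```

Keeping more vectors in registers avoids spills but forces shorter strips, and therefore more strips per vector operation; holding fewer vectors allows longer strips at the price of spilling. This is the ratio the algorithm in the abstract optimizes.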


International Symposium on Programming Language Implementation and Logic Programming | 1991

A Randomized Heuristic Approach to Register Allocation

Christoph W. Keßler; Wolfgang J. Paul; Thomas Rauber

We present a randomized algorithm to generate contiguous evaluations for expression DAGs representing basic blocks of straight-line code with nearly minimal register need. This heuristic may be used to reorder the statements in a basic block before applying a global register allocation scheme such as graph coloring. Experiments have shown that the new heuristic produces results which are about 30% better on average than without reordering.
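The general idea of a randomized search for a low-register evaluation order can be sketched in Python as follows. This is not the paper's heuristic: it samples arbitrary random topological orders rather than contiguous evaluations, and the register model is an assumption (each leaf is loaded into a register, a result may reuse the register of an operand that has no further uses, and roots stay live until the end). All function names are hypothetical.

```python
import random


def register_need(dag, order):
    """Peak number of live values; dag maps node -> list of operand nodes."""
    uses = {n: 0 for n in dag}
    for operands in dag.values():
        for op in dict.fromkeys(operands):
            uses[op] += 1
    live, peak = set(), 0
    for node in order:
        for op in dict.fromkeys(dag[node]):
            uses[op] -= 1
            if uses[op] == 0:
                live.discard(op)  # last use: the register becomes free
        live.add(node)
        peak = max(peak, len(live))
    return peak


def random_topological_order(dag, rng):
    """Pick uniformly among ready nodes until every node is scheduled."""
    pending = {n: len(dict.fromkeys(ops)) for n, ops in dag.items()}
    users = {n: [] for n in dag}
    for n, ops in dag.items():
        for op in dict.fromkeys(ops):
            users[op].append(n)
    ready = [n for n in dag if pending[n] == 0]
    order = []
    while ready:
        node = ready.pop(rng.randrange(len(ready)))
        order.append(node)
        for user in users[node]:
            pending[user] -= 1
            if pending[user] == 0:
                ready.append(user)
    return order


def best_random_schedule(dag, trials=50, seed=0):
    """Keep the sampled order with the smallest register need."""
    rng = random.Random(seed)
    samples = (random_topological_order(dag, rng) for _ in range(trials))
    return min(samples, key=lambda order: register_need(dag, order))


# (a + b) * (c + d): loading all four leaves first needs 4 registers.
dag = {"a": [], "b": [], "c": [], "d": [],
       "s": ["a", "b"], "t": ["c", "d"], "r": ["s", "t"]}
best = best_random_schedule(dag)
print(best, register_need(dag, best))
```

For this DAG, the order that loads all leaves first needs four registers, while an order that finishes one subexpression before starting the other needs three, so even a handful of random samples is likely to find the better schedule.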


Proceedings of the First IFIP TC10 International Workshop on Software Engineering for Parallel and Distributed Systems | 1996

Program comprehension engines for automatic parallelization: a comparative study

Beniamino Di Martino; Christoph W. Keßler

We compare two systems for program comprehension that are targeted towards support of automatic parallelization: the PAP recognizer currently included in the Vienna Fortran Compilation System, and the PARAMAT pattern recognizer developed at Saarbrücken University. We illuminate the main differences, the advantages and disadvantages of each approach, and show how both approaches may be integrated to combine the generality of one approach with the speed of the other.


IEEE International Conference on High Performance Computing, Data, and Analytics | 1999

ForkLight: A Control-Synchronous Parallel Programming Language

Christoph W. Keßler; Helmut Seidl

ForkLight is an imperative, task-parallel programming language for massively parallel shared memory machines. It is based on ANSI C, follows the SPMD model of parallel program execution, provides a sequentially consistent shared memory, and supports dynamically nested parallelism. While no assumptions are made on uniformity of memory access time or instruction-level synchronicity of the underlying hardware, ForkLight offers a simple but powerful mechanism for coordination of parallel processes in the tradition and notation of PRAM algorithms: beyond its asynchronous default execution mode, ForkLight offers a mode for control-synchronous execution that relates the program's block structure to parallel control flow.


International Symposium on Programming Language Implementation and Logic Programming | 1996

Scheduling Expression DAGs for Minimal Register Need

Christoph W. Keßler

Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial O(n!) enumeration algorithm. We present a new algorithm with worst-case complexity O(n 2^(2n)) and very good average behaviour. Applying a dynamic programming scheme and reordering techniques, it is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.
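To show why dynamic programming can beat O(n!) enumeration for this problem, here is a small Python sketch of a textbook-style dynamic program over sets of already computed nodes. It is not the algorithm of the paper (which adds reordering techniques and handles delay slots and multiple functional units); it only relies on the observation that the live values depend on which nodes have been computed, not on the order in which they were computed. The register model is an assumption: leaves are loaded into registers, a result may reuse the register of an operand with no further uses, and roots stay live until the end. The name `min_register_need` is hypothetical.

```python
from functools import lru_cache


def min_register_need(dag):
    """Minimal peak register need; dag maps node -> list of operand nodes."""
    users = {n: set() for n in dag}
    for n, operands in dag.items():
        for op in operands:
            users[op].add(n)

    def live_count(done):
        # A computed node is live if it is a root or still has a pending user.
        return sum(1 for n in done
                   if (not users[n]) or not (users[n] <= done))

    @lru_cache(maxsize=None)
    def best(done):
        if not done:
            return 0
        # A node can be the last one computed if none of its users is done.
        last_candidates = [n for n in done if not (users[n] & done)]
        before = min(best(done - {n}) for n in last_candidates)
        return max(live_count(done), before)

    return best(frozenset(dag))


# (a + b) * (c + d) as a DAG: three registers are necessary and sufficient.
dag = {"a": [], "b": [], "c": [], "d": [],
       "s": ["a", "b"], "t": ["c", "d"], "r": ["s", "t"]}
print(min_register_need(dag))
```

The memo table has at most one entry per subset of nodes, so the work is bounded by the number of subsets times n instead of the number of permutations.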


International Symposium on Programming Language Implementation and Logic Programming | 1993

Efficient Register Allocation for Large Basic Blocks

Christoph W. Keßler; Thomas Rauber

We consider the NP-complete problem of generating evaluations for expression DAGs with a minimal number of registers. We restrict our attention to contiguous evaluations, because for nearly all of the DAGs derived from real application programs there exists a contiguous evaluation that is optimal with respect to register need. We present an algorithm that generates an optimal contiguous evaluation for a given DAG. The algorithm is very fast on average. It generates the evaluation by splitting the DAG into trees with import and export nodes and applying a labeling scheme to the trees.
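For plain expression trees, the classic labeling scheme is Sethi-Ullman numbering, sketched below in Python. Whether the paper's labeling for trees with import and export nodes has this exact form is not stated in the abstract, so treat this only as an illustration of what a labeling scheme for trees looks like. Assumptions: trees are nested tuples `(op, left, right)` with string leaves, and every leaf must be loaded into a register.

```python
def label(tree):
    """Registers needed to evaluate `tree` without spilling."""
    if isinstance(tree, str):
        return 1  # assumption: a leaf is loaded into one register
    _, left, right = tree
    l, r = label(left), label(right)
    # Equal needs force one extra register to hold the first result.
    return l + 1 if l == r else max(l, r)


def evaluation_order(tree):
    """Contiguous evaluation: finish the costlier subtree first."""
    if isinstance(tree, str):
        return [tree]
    op, left, right = tree
    if label(left) >= label(right):
        first, second = left, right
    else:
        first, second = right, left
    return evaluation_order(first) + evaluation_order(second) + [op]


balanced = ("*", ("+", "a", "b"), ("-", "c", "d"))
skewed = ("+", "d", ("+", "c", ("+", "a", "b")))
print(label(balanced), evaluation_order(balanced))
print(label(skewed), evaluation_order(skewed))
```

The balanced tree needs three registers, while the skewed tree needs only two because its deeper subtree is always evaluated first.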


European Conference on Parallel Processing | 1997

Applicability of Program Comprehension to Sparse Matrix Computations

Christoph W. Keßler

Collaboration


Dive into Christoph W. Keßler's collaborations.

Top Co-Authors

Beniamino Di Martino

Seconda Università degli Studi di Napoli
