Publication


Featured research published by Maraike Schellmann.


European Conference on Parallel Processing | 2009

Using OpenMP vs. Threading Building Blocks for Medical Imaging on Multi-cores

Philipp Kegel; Maraike Schellmann; Sergei Gorlatch

We compare two parallel programming approaches for multi-core systems: the well-known OpenMP and the recently introduced Threading Building Blocks (TBB) library by Intel®. The comparison is made using the parallelization of a real-world numerical algorithm for medical imaging. We develop several parallel implementations, and compare them w.r.t. programming effort, programming style and abstraction, and runtime performance. We show that TBB requires a considerable program re-design, whereas with OpenMP simple compiler directives are sufficient. While TBB appears to be less appropriate for parallelizing existing implementations, it fosters a good programming style and higher abstraction level for newly developed parallel programs. Our experimental measurements on a dual quad-core system demonstrate that OpenMP slightly outperforms TBB in our implementation.


The Journal of Supercomputing | 2011

Parallel medical image reconstruction: from graphics processing units (GPU) to Grids

Maraike Schellmann; Sergei Gorlatch; Dominik Meiländer; Thomas Kösters; Klaus P. Schäfers; Frank Wübbeling; Martin Burger

We present and compare a variety of parallelization approaches for a real-world case study on modern parallel and distributed computer architectures. Our case study is a production-quality, time-intensive algorithm for medical image reconstruction used in computer tomography (PET). We parallelize this algorithm for the main kinds of contemporary parallel architectures: shared-memory multiprocessors, distributed-memory clusters, graphics processing units (GPU) using the CUDA framework, and the Cell processor. Finally, we show how various architectures can be accessed in a distributed Grid environment. The main contribution of the paper, besides the parallelization approaches, is their systematic comparison regarding four important criteria: performance, programming comfort, accessibility, and cost-effectiveness. We report results of experiments on particular parallel machines of different architectures that confirm the findings of our systematic comparison.


Computing Frontiers | 2008

Cost-effective medical image reconstruction: from clusters to graphics processing units

Maraike Schellmann; Jürgen Vörding; Sergei Gorlatch; Dominik Meiländer

We demonstrate that for modern medical imaging applications, parallel implementations on traditional parallel architectures (clusters and multiprocessor servers) can be outperformed, both in terms of speed and cost-effectiveness, by new implementations on next-generation architectures like GPUs (Graphics Processing Units). Although GPUs are rather small and much less expensive than clusters and multiprocessor servers, they consist of several SIMD processors and thus provide a high degree of parallelism. For an iterative image reconstruction algorithm, the list-mode OSEM, we demonstrate, first, the limitations of parallel reconstructions with this algorithm on the traditional parallel architectures and, second, how the well-analyzed parallel strategies for traditional architectures can be adapted systematically to achieve fast reconstructions on the GPU.


Concurrency and Computation: Practice and Experience | 2011

Comparing programming models for medical imaging on multi-core systems

Philipp Kegel; Maraike Schellmann; Sergei Gorlatch

Multi-core processors offer a huge potential for parallelism but make it challenging to develop programs that achieve high performance in real applications. We compare three popular parallel programming models, POSIX threads (Pthreads), OpenMP, and Threading Building Blocks (TBB), regarding their use for multi-core systems. We analyze how these models can be employed for implementing various parallelizations of a real-world application from the area of medical imaging, and we conduct extensive runtime experiments to measure performance. Our main contribution is a comprehensive comparison of Pthreads, OpenMP, and TBB with respect to the following criteria: program development effort, programming style, level of abstraction, and runtime performance on multi-cores.


Parallel Computing Technologies | 2009

Parallel Medical Image Reconstruction: From Graphics Processors to Grids

Maraike Schellmann; Sergei Gorlatch; Dominik Meiländer; Thomas Kösters; Klaus P. Schäfers; Frank Wübbeling; Martin Burger

We present a variety of possible parallelization approaches for a real-world case study using several modern parallel and distributed computer architectures. Our case study is a production-quality, time-intensive algorithm for medical image reconstruction used in computer tomography. We describe how this algorithm can be parallelized for the main kinds of contemporary parallel architectures: shared-memory multiprocessors, distributed-memory clusters, graphics processors, and the Cell processor. Finally, we describe how various architectures can be accessed in a distributed Grid environment. The main contribution of the paper, besides the parallelization approaches, is their systematic comparison regarding four important criteria: performance, programming comfort, accessibility, and cost-effectiveness. We report results of experiments on particular parallel machines of different architectures that confirm the findings of our systematic comparison.


European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface | 2008

Communication Optimization for Medical Image Reconstruction Algorithms

Torsten Hoefler; Maraike Schellmann; Sergei Gorlatch; Andrew Lumsdaine

This paper presents experiences and results obtained in optimizing the parallel communication performance of a production-quality medical image reconstruction application. The fundamental communication operations in the application's principal algorithm are collective reductions. The overhead of these operations was reduced by transforming the algorithm to overlap its computation and communication. Several different approaches to communication progress were studied, both user-directed and asynchronous. Experimental results comparing the new approach to the previous implementation show overall application performance improvements of up to 8% when run on 32 nodes.
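
The idea of overlapping computation with a collective reduction can be illustrated with a minimal, single-process sketch. It is not the paper's implementation: the "ranks" are simulated as plain lists, the reduction runs in a background thread instead of a message-passing library, and `compute_block`, `reduce_block`, and `overlapped_reduction` are hypothetical names chosen for this example. In a real message-passing program, a nonblocking collective would play the role of the background thread.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_block(rank_data, block):
    """Hypothetical local computation on one block of one rank's data."""
    return [x * x for x in rank_data[block]]


def reduce_block(partials):
    """Stand-in for a collective reduction: element-wise sum across ranks."""
    return [sum(values) for values in zip(*partials)]


def overlapped_reduction(data_per_rank, n_blocks):
    """Reduce block b in the background while block b + 1 is being computed."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as comm:
        pending = None
        for block in range(n_blocks):
            # Computation for this block runs while the previous
            # block's reduction is still in flight.
            partials = [compute_block(rank, block) for rank in data_per_rank]
            if pending is not None:
                results.append(pending.result())  # wait for the earlier reduction
            pending = comm.submit(reduce_block, partials)
        if pending is not None:
            results.append(pending.result())
    return results


# Two simulated ranks, each holding two blocks of two values.
data = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
print(overlapped_reduction(data, n_blocks=2))
```

The result is the same as reducing each block immediately after computing it; only the timing changes, because the wait for block b is deferred until block b + 1 has been computed.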


IEEE Nuclear Science Symposium | 2006

Parallelization and Runtime Prediction of the ListMode OSEM Algorithm for 3D PET Reconstruction

Maraike Schellmann; Thomas Kösters; Sergei Gorlatch

For high-resolution PET (Positron Emission Tomography) image reconstructions, the LM OSEM (ListMode Ordered Subset Expectation Maximization) algorithm proves to be quite appropriate, but it is very time-consuming. In order to improve its runtime, we parallelized the algorithm and implemented it on different classes of parallel computer architectures: with shared, distributed and hybrid memory. These implementations reduce the reconstruction time from more than two hours to six minutes. We suggest an analytical model for predicting parallel LM OSEM runtimes on distributed-memory machines, and verify our model in runtime experiments on different reconstruction problems, which demonstrate a prediction error of less than 10 %. The model allows the user to achieve a desired reconstruction quality while minimizing resource usage.
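
For readers unfamiliar with the algorithm, the following is a toy, sequential sketch of a list-mode OSEM update in its commonly cited form: the current image is multiplied by a correction image accumulated over one subset of events and divided by a sensitivity image. It is not the authors' code. The sparse event rows, the interleaved subset split, the sensitivity values, and the function name `lm_osem` are assumptions made for illustration, and the sensitivity is assumed to be strictly positive.

```python
def lm_osem(events, sensitivity, n_voxels, n_subsets, n_iterations=1):
    """Toy list-mode OSEM.

    events: list of sparse system-matrix rows, each a dict {voxel: weight}.
    sensitivity: per-voxel normalization, assumed strictly positive.
    """
    f = [1.0] * n_voxels  # uniform start image
    # Interleaved split into subsets (an assumption of this sketch).
    subsets = [events[s::n_subsets] for s in range(n_subsets)]
    for _ in range(n_iterations):
        for subset in subsets:
            c = [0.0] * n_voxels  # correction image for this subset
            for row in subset:
                # Forward projection of the current image along this event.
                forward = sum(w * f[j] for j, w in row.items())
                if forward <= 0.0:
                    continue  # event carries no information for this image
                # Backprojection of the reciprocal forward projection.
                for j, w in row.items():
                    c[j] += w / forward
            # Multiplicative update after each subset.
            f = [f[j] * c[j] / sensitivity[j] for j in range(n_voxels)]
    return f


# Four identical events that only intersect voxel 0.
events = [{0: 1.0}] * 4
print(lm_osem(events, sensitivity=[1.0, 1.0], n_voxels=2, n_subsets=1, n_iterations=2))
```

The inner loop over events is where the work concentrates: each event contributes independently to the correction image, so events can be divided among processes whose partial correction images are then summed.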


Advances in Computer Entertainment Technology | 2005

Rokkatan: scaling an RTS game design to the massively multiplayer realm

Jens Müller; Jan Hendrik Metzen; Alexander Ploss; Maraike Schellmann; Sergei Gorlatch

While massively multiplayer online role-playing games involve large numbers of simultaneous players, two other popular game classes, first-person action and real-time strategy games, are still rarely discussed for massively multiplayer gaming. This paper presents our work on Rokkatan, an online game which implements the common concept of real-time strategy in a scalable multiplayer design. In order to allow hundreds of users to participate in a single game session, Rokkatan uses our proxy-server network architecture, which provides the scalability and responsiveness required for a fast-paced gaming style. An analytical scalability model integrated into Rokkatan makes it possible to forecast the maximum number of simultaneous players. Our experiments demonstrate good prediction quality of the model and high scalability of Rokkatan, which allows several hundred users to participate in a single game session.


IEEE Nuclear Science Symposium | 2007

Towards a grid system for medical image reconstruction

Maraike Schellmann; Dominik Bohm; Stefan Wichmann; Sergei Gorlatch

This paper presents an experimental grid system, the MIRGrid, which enables transparent usage of high-performance computers for medical image reconstruction. MIRGrid provides a comfortable way to perform time-consuming iterative image reconstructions on interconnected high-performance computers; it covers the whole imaging workflow, from reading the raw data acquired by the scanner, through user-transparent parallel reconstruction, to visualization. The system is able to perform and monitor parallel reconstructions on different kinds of parallel architectures with shared and distributed memory, as well as combinations of both. In order to optimally distribute the reconstructions among the high-performance computers, we estimate parallel reconstruction time by using a previously developed performance model. In addition to traditional 3D imaging, the system seamlessly integrates dynamic (4D) and gated studies. The system optimizes parallel 4D reconstruction time by partitioning dynamic and gated reconstructions into independent 3D sub-reconstructions that are computed simultaneously on several high-performance computers.
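
One simple way to distribute independent 3D sub-reconstructions using estimated runtimes is a greedy longest-job-first assignment, sketched below. This is an illustrative heuristic, not the scheduling method used by the system described above. The frame names, the runtime estimates, and the assumption that an estimate is the same on every machine are hypothetical; a real system would obtain per-machine estimates from its performance model.

```python
import heapq


def distribute(sub_reconstructions, machines):
    """Greedily assign sub-reconstructions to the least-loaded machine.

    sub_reconstructions: dict {name: estimated runtime in seconds}.
    machines: list of machine names.
    Returns (plan, makespan), where plan maps machine -> list of names.
    """
    heap = [(0.0, m) for m in machines]  # (accumulated load, machine)
    heapq.heapify(heap)
    plan = {m: [] for m in machines}
    # Longest estimated runtime first, so large jobs are placed early.
    ordered = sorted(sub_reconstructions.items(), key=lambda kv: -kv[1])
    for name, estimate in ordered:
        load, machine = heapq.heappop(heap)
        plan[machine].append(name)
        heapq.heappush(heap, (load + estimate, machine))
    makespan = max(load for load, _ in heap)
    return plan, makespan


# Four hypothetical time frames of a dynamic (4D) study, two machines.
frames = {"frame0": 300, "frame1": 200, "frame2": 200, "frame3": 100}
plan, makespan = distribute(frames, ["A", "B"])
print(plan, makespan)
```

The heuristic does not guarantee an optimal distribution, but it shows how runtime estimates turn the placement of independent sub-reconstructions into a load-balancing problem.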


European Conference on Parallel Processing | 2008

Systematic Parallelization of Medical Image Reconstruction for Graphics Hardware

Maraike Schellmann; Jürgen Vörding; Sergei Gorlatch

Modern Graphics Processing Units (GPUs) consist of several SIMD processors and thus provide a high degree of parallelism at low cost. We introduce a new approach to systematically develop parallel image reconstruction algorithms for GPUs from their parallel equivalents for distributed-memory machines. We use High-Level Petri Nets (HLPN) to intuitively describe the parallel implementations for distributed-memory machines. By annotating the functions of the HLPN with memory requirements and information about data distribution, we are able to identify parallel functions that can be implemented efficiently on the GPU. For an important iterative medical image reconstruction algorithm, the list-mode OSEM algorithm, we demonstrate the limitations of its distributed-memory implementation and show how our HLPN-based approach leads to a fast implementation on GPUs, reusable across different medical imaging devices.
