Jan-Philipp Weiss | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Jan-Philipp Weiss is active.

Explore More

Publication

Featured researches published by Jan-Philipp Weiss.

Archive | 2010

Facing the Multicore-Challenge III

Rainer Keller; David Kramer; Jan-Philipp Weiss

Thank you very much for downloading facing the multicore challenge aspects of new paradigms and technologies in parallel computing. As you may know, people have look hundreds times for their chosen books like this facing the multicore challenge aspects of new paradigms and technologies in parallel computing, but end up in malicious downloads. Rather than enjoying a good book with a cup of coffee in the afternoon, instead they cope with some malicious bugs inside their computer.

european conference on parallel processing | 2009

Optimized Stencil Computation Using In-Place Calculation on Modern Multicore Systems

Werner Augustin; Vincent Heuveline; Jan-Philipp Weiss

Numerical algorithms on parallel systems built upon modern multicore processors are facing two challenging obstacles that keep realistic applications from reaching the theoretically available compute performance. First, the parallelization on several system levels has to be exploited to the full extent. Second, provision of data to the compute cores needs to be adapted to the constraints of a hardware-controlled nested cache hierarchy with shared resources. In this paper we analyze dedicated optimization techniques on modern multicore systems for stencil kernels on regular three-dimensional grids. We combine various methods like a compressed grid algorithm with finite shifts in each time step and loop skewing into an optimized parallel in-place stencil implementation of the three-dimensional Laplacian operator. In that context, memory requirements are reduced by a factor of approximately two while considerable performance gains are observed on modern Intel and AMD based multicore systems.

Concurrency and Computation: Practice and Experience | 2012

A survey on hardware-aware and heterogeneous computing on multicore processors and accelerators

Rainer Buchty; Vincent Heuveline; Wolfgang Karl; Jan-Philipp Weiss

In the last few years, the landscape of parallel computing has been subject to profound and highly dynamic changes. The paradigm shift towards multicore and manycore technologies coupled with accelerators in a heterogeneous environment is offering a great potential of computing power for scientific and industrial applications. However, for one to take full advantage of these new technologies, holistic approaches coupling the expertise ranging from hardware architecture and software design to numerical algorithms are a pressing necessity. Parallel computing is no longer limited to supercomputers and is now much more diversified – with a multitude of technologies, architectures, and programming approaches leading to increased complexity for developers and engineers.

Facing the Multicore-Challenge | 2013

GASPI – A Partitioned Global Address Space Programming Interface

Thomas Alrutz; Jan Backhaus; Thomas Brandes; Vanessa End; Thomas Gerhold; Alfred Geiger; Daniel Grünewald; Vincent Heuveline; Jens Jägersküpper; Andreas Knüpfer; Olaf Krzikalla; Edmund Kügeler; Carsten Lojewski; Guy Lonsdale; Ralph Müller-Pfefferkorn; Wolfgang E. Nagel; Lena Oden; Franz-Josef Pfreundt; Mirko Rahn; Michael Sattler; Mareike Schmidtobreick; Annika Schiller; Christian Simmendinger; Thomas Soddemann; Godehard Sutmann; Henning Weber; Jan-Philipp Weiss

At the threshold to exascale computing, limitations of the MPI programming model become more and more pronounced. HPC programmers have to design codes that can run and scale on systems with hundreds of thousands of cores. Setting up accordingly many communication buffers, point-to-point communication links, and using bulk-synchronous communication phases is contradicting scalability in these dimensions. Moreover, the reliability of upcoming systems will worsen.

Parallel Tools Workshop | 2012

HiFlow 3 : A Hardware-Aware Parallel Finite Element Package

Hartwig Anzt; Werner Augustin; Martin Baumann; Thomas Gengenbach; Tobias Hahn; Andreas Helfrich-Schkarbanenko; Vincent Heuveline; Eva Ketelaer; Dimitar Lukarski; Andreas Nestler; Sebastian Ritterbusch; Staffan Ronnas; Michael Schick; Mareike Schmidtobreick; Chandramowli Subramanian; Jan-Philipp Weiss; Florian Wilhelm; Martin Wlotzka

The goal of this paper is to describe the hardware-aware parallel C++ finite element package HiFlow3. HiFlow3 aims at providing a powerful platform for simulating processes modelled by partial differential equations. Our vision is to solve boundary value problems in an appropriate way by coupling numerical simulations with modern software design and state-of-the-art hardware technologies. The main functionalities for mapping the mathematical model into parallel software are implemented in the three core modules Mesh, DoF/FEM and Linear Algebra (LA). Parallelism is realized on two levels. The modules provide efficient MPI-based distributed data structures to achieve performance on large HPC systems but also on stand-alone workstations. Additionally, the hardware-aware cross-platform approach in the LA module accelerates the solution process by exploiting the computing power from emerging technologies like multi-core CPUs and GPUs. In this context performance evaluation on different hardware-architectures will be demonstrated.

Facing the Multicore-Challenge II | 2012

Parallel smoothers for matrix-based geometric multigrid methods on locally refined meshes using multicore CPUs and GPUs

Vincent Heuveline; Dimitar Lukarski; Nico Trost; Jan-Philipp Weiss

Multigrid methods are efficient and fast solvers for problems typically modeled by partial differential equations of elliptic type. We use the approach of matrix-based geometric multigrid that has high flexibility with respect to complex geometries and local singularities. Furthermore, it adapts well to the exigences of modern computing platforms. In this work we investigate multi-colored Gaus-Seidel type smoothers, the power(q)-pattern enhanced multi-colored ILU(p,q) smoothers with fill-ins, and factorized sparse approximate inverse (FSAI) smoothers. These approaches provide efficient smoothers with a high degree of parallelism. We describe the configuration of our smoothers in the context of the portable lmpLAtoolbox and the HiFlow 3 parallel finite element package. In our approach, a single source code can be used across diverse platforms including multicore CPUs and GPUs. Highly optimized implementations are hidden behind a unified user interface. Efficiency and scalability of our multigrid solvers are demonstrated by means of a comprehensive performance analysis on multicore CPUs and GPUs.

international conference on cluster computing | 2010

A multi-platform linear algebra toolbox for finite element solvers on heterogeneous clusters

Vincent Heuveline; Chandramowli Subramanian; Dimitar Lukarski; Jan-Philipp Weiss

Heterogeneous clusters with multiple sockets and multicore-processors accelerated by dedicated coprocessors like GPUs, Cell BE, FPGAs or others nowadays provide unrivaled computing power in terms of floating point operations. Specific capabilities of additional processor technologies enable dedicated exploitation with respect to particular application and data characteristics. However, resource utilization, programmability, and scalability of applications across heterogeneous platforms is a major concern. In the framework of the HiFlow finite element software package we have developed a portable software approach that implements efficient parallel solvers for partial differential equations by means of unified and modular user interfaces across a variety of heterogeneous platforms — in particular on GPU accelerated clusters. We detail our concept and provide performance analysis for various test scenarios that prove performance capabilities, scalability, viability, and user friendliness.

european conference on parallel processing | 2010

Scalable multi-coloring preconditioning for multi-core CPUs and GPUs

Vincent Heuveline; Dimitar Lukarski; Jan-Philipp Weiss

Krylov space methods like conjugate gradient and GMRES are efficient and parallelizable approaches for solving huge and sparse linear systems of equations. But as condition numbers are increasing polynomially with problem size sophisticated preconditioning techniques are essential building blocks. However, many preconditioning approaches like Gauss-Seidel/SSOR and ILU are based on sequential algorithms. Introducing parallelism for preconditioners is mostly hampering mathematical efficiency. In the era of multi-core and many-core processors like GPUs there is a strong need for scalable and fine-grained parallel preconditioning approaches. In the framework of the multi-platform capable finite element package HiFlow3 we are investigating multi-coloring techniques for block Gauss-Seidel type preconditioners. Our approach proves efficiency and scalability across hybrid multi-core and GPU platforms.

Facing the multicore-challenge | 2010

Parallel 3D multigrid methods on the STI cell BE architecture

Fabian Oboril; Jan-Philipp Weiss; Vincent Heuveline

The STI Cell Broadband Engine (BE) is a highly capable heterogeneous multicore processor with large bandwidth and computing power perfectly suited for numerical simulation. However, all performance benefits come at the price of productivity since more responsibility is put to the programmer. In particular, programming with the IBM Cell SDK is hampered by not only taking care of a parallel decomposition of the problem but also of managing all data transfers and organizing all computations in a performance-beneficial manner. While raising complexity of program development, this approach enables efficient utilization of available resources. In the present work we investigate the potential and the performance behavior of Cells parallel cores for a resource-demanding and bandwidth-bound multigrid solver for a three-dimensional Poisson problem. The chosen multigrid method based on a parallel Gaus-Seidel and Jacobi smoothers combines mathematical optimality with a high degree of inherent parallelism. We investigate dedicated code optimization strategies on the Cell platform and evaluate associated performance benefits by a comprehensive analysis. Our results show that the Cell BE platform can give tremendous benefits for numerical simulation based on well-structured data. However, it is inescapable that isolated, vendor-specific, but performance-optimal programming approaches need to be replaced by portable and generic concepts like OpenCL - maybe at the price of performance loss.

Preprint Series of the Engineering Mathematics and Computing Lab | 2011