Jonas Thies
German Aerospace Center
Publication
Featured research published by Jonas Thies.
SIAM Journal on Scientific Computing | 2015
Melven Röhrig-Zöllner; Jonas Thies; Moritz Kreutzer; Andreas Alvermann; Andreas Pieper; Achim Basermann; Georg Hager; Gerhard Wellein; H. Fehske
Block variants of the Jacobi-Davidson method for computing a few eigenpairs of a large sparse matrix are known to improve the robustness of the standard algorithm when it comes to computing multiple or clustered eigenvalues. In practice, however, they are typically avoided because the total number of matrix-vector operations increases. In this paper we present the implementation of a block Jacobi-Davidson solver. By detailed performance engineering and numerical experiments we demonstrate that the increase in operations is typically more than compensated by performance gains through better cache usage on modern CPUs, resulting in a method that is both more efficient and robust than its single vector counterpart. The steps to be taken to achieve a block speedup involve both kernel optimizations for sparse matrix and block vector operations, and algorithmic choices to allow using blocked operations in most parts of the computation. We discuss the aspect of avoiding synchronization in the algorithm and sho...
Journal of Computational Physics | 2010
Erik Bernsen; Henk A. Dijkstra; Jonas Thies; Fred Wubs
In present-day forward time stepping ocean-climate models, capturing both the wind-driven and thermohaline components, a substantial amount of CPU time is needed in a so-called spin-up simulation to determine an equilibrium solution. In this paper, we present a methodology based on Jacobian-Free Newton-Krylov methods to reduce the computational time for such a spin-up problem. We apply the method to an idealized configuration of a state-of-the-art ocean model, the Modular Ocean Model version 4 (MOM4). It is shown that a typical speed-up of a factor of 10-25 with respect to the original MOM4 code can be achieved and that this speed-up increases with increasing horizontal resolution.
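As a rough illustration of the Jacobian-free Newton-Krylov idea named in the abstract above, the following Python sketch solves F(u) = 0 by Newton's method. Each linear correction is computed with a small GMRES routine, and the Jacobian is never formed: its action on a vector is approximated by a finite difference of F. The toy residual, all function names, and the step size eps are assumptions made for this sketch; none of them is taken from the paper or from MOM4.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def fd_jacobian_vector(F, u, Fu, v, eps=1e-7):
    """Approximate J(u) v by (F(u + h v) - F(u)) / h; J itself is never built."""
    nv = norm(v)
    if nv == 0.0:
        return [0.0] * len(v)
    h = eps / nv
    Fp = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(p - q) / h for p, q in zip(Fp, Fu)]

def gmres(matvec, b, max_iter, tol=1e-10):
    """Unrestarted GMRES from a zero initial guess, using Givens rotations."""
    beta = norm(b)
    if beta == 0.0:
        return [0.0] * len(b)
    V = [[x / beta for x in b]]   # orthonormal Krylov basis
    H = []                        # rotated Hessenberg columns (upper triangular)
    cs, sn, g = [], [], [beta]
    for j in range(max_iter):
        w = matvec(V[j])
        h = []
        for i in range(j + 1):    # modified Gram-Schmidt
            hij = dot(w, V[i])
            h.append(hij)
            w = [wk - hij * vk for wk, vk in zip(w, V[i])]
        h_next = norm(w)
        for i in range(j):        # apply earlier rotations to the new column
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        r = math.hypot(h[j], h_next)
        if r == 0.0:
            break
        c, s = h[j] / r, h_next / r
        cs.append(c)
        sn.append(s)
        h[j] = r
        H.append(h)
        g.append(-s * g[j])
        g[j] = c * g[j]
        if abs(g[j + 1]) <= tol * beta or h_next == 0.0:
            break
        V.append([x / h_next for x in w])
    k = len(H)
    y = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        y[i] = (g[i] - sum(H[m][i] * y[m] for m in range(i + 1, k))) / H[i][i]
    return [sum(y[m] * V[m][i] for m in range(k)) for i in range(len(b))]

def jfnk_solve(F, u0, tol=1e-10, max_newton=20):
    """Newton iteration whose linear solves only need residual evaluations."""
    u = list(u0)
    for _ in range(max_newton):
        Fu = F(u)
        if norm(Fu) < tol:
            break
        du = gmres(lambda v: fd_jacobian_vector(F, u, Fu, v),
                   [-f for f in Fu], max_iter=len(u))
        u = [ui + di for ui, di in zip(u, du)]
    return u

def toy_residual(u):
    """Hypothetical 2-variable residual with root u0 = u1 = sqrt(2)."""
    return [u[0] ** 2 + u[1] ** 2 - 4.0, u[0] - u[1]]

solution = jfnk_solve(toy_residual, [1.0, 2.0])
print(solution)  # both components approach sqrt(2)
```

In a spin-up setting the residual would be defined through the model itself (for example via its time stepper), so that every Jacobian-vector product costs one extra model evaluation; how the paper defines that residual is not specified in the abstract.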
International Journal of Parallel Programming | 2017
Moritz Kreutzer; Jonas Thies; Melven Röhrig-Zöllner; Andreas Pieper; Faisal Shahzad; Martin Galgon; Achim Basermann; H. Fehske; Georg Hager; Gerhard Wellein
While many of the architectural details of future exascale-class high performance computer systems are still a matter of intense research, there appears to be a general consensus that they will be strongly heterogeneous, featuring “standard” as well as “accelerated” resources. Today, such resources are available as multicore processors, graphics processing units (GPUs), and other accelerators such as the Intel Xeon Phi. Any software infrastructure that claims usefulness for such environments must be able to meet their inherent challenges: massive multi-level parallelism, topology, asynchronicity, and abstraction. The “General, Hybrid, and Optimized Sparse Toolkit” (GHOST) is a collection of building blocks that targets algorithms dealing with sparse matrix representations on current and future large-scale systems. It implements the “MPI+X” paradigm, has a pure C interface, and provides hybrid-parallel numerical kernels, intelligent resource management, and truly heterogeneous parallelism for multicore CPUs, Nvidia GPUs, and the Intel Xeon Phi. We describe the details of its design with respect to the challenges posed by modern heterogeneous supercomputers and recent algorithmic developments. Implementation details which are indispensable for achieving high efficiency are pointed out and their necessity is justified by performance measurements or predictions based on performance models. We also provide instructions on how to make use of GHOST in existing software packages, together with a case study which demonstrates the applicability and performance of GHOST as a component within a larger software stack. The library code and several applications are available as open source.
Parallel Computing | 2015
Martin Galgon; Lukas Krämer; Jonas Thies; Achim Basermann; Bruno Lang
Parallel iterative solution of linear systems from the FEAST algorithm. Hybrid parallel implementation. CG variant with multi-coloring approach for better performance on hybrid systems. Methods for the solution of sparse eigenvalue problems that are based on spectral projectors and contour integration have recently attracted more and more attention. Such methods require the solution of many shifted sparse linear systems of full size. In most of the literature concerning these eigenvalue solvers, little is said about the solution of the linear systems, but they turn out to be very hard to solve by iterative linear solvers in practice. In this work we identify a row projection method for the solution of the inner linear systems encountered in the FEAST algorithm and introduce a novel hybrid parallel and fully iterative implementation of the eigenvalue solver. Our approach ultimately aims at achieving extreme parallelism by exploiting the algorithm's potential on several levels. We present numerical examples where graphene modeling is one of the target applications. In this application, several hundred or even thousands of eigenvalues from the interior of the spectrum are required, which is a big challenge for state-of-the-art numerical methods.
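The simplest member of the row projection family is the Kaczmarz sweep, which projects the current iterate onto the hyperplane of one equation at a time. The Python sketch below applies it to a small shifted system (A - sigma I) x = b whose shift lies inside the spectrum, only to show the basic projection step. The test matrix, the shift, and the function names are assumptions for illustration; the CG-accelerated, multi-colored parallel variant mentioned above is not reproduced here.

```python
def kaczmarz(A, b, sweeps=100, relax=1.0):
    """Cyclic Kaczmarz: project x onto each hyperplane a_i . x = b_i in turn."""
    x = [0.0] * len(A[0])
    for _ in range(sweeps):
        for row, b_i in zip(A, b):
            row_norm_sq = sum(a * a for a in row)
            if row_norm_sq == 0.0:
                continue  # an all-zero row carries no information
            residual_i = b_i - sum(a * x_j for a, x_j in zip(row, x))
            step = relax * residual_i / row_norm_sq
            x = [x_j + step * a for x_j, a in zip(x, row)]
    return x

def shifted_laplacian(n, sigma):
    """Dense tridiag(-1, 2, -1) minus sigma * I (hypothetical test matrix)."""
    return [[(2.0 - sigma) if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
             for j in range(n)] for i in range(n)]

# The shift sigma = 2 lies inside the spectrum, so the matrix is indefinite.
A = shifted_laplacian(4, sigma=2.0)
x_true = [1.0, 2.0, 3.0, 4.0]
b = [sum(a * x_j for a, x_j in zip(row, x_true)) for row in A]

x = kaczmarz(A, b)
max_error = max(abs(x_j - t) for x_j, t in zip(x, x_true))
print(max_error)
```

With sigma = 2 the diagonal of the shifted matrix is zero, so Jacobi or Gauss-Seidel iterations are not even defined, whereas the row projection still converges because the system is consistent and nonsingular.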
SIAM Journal on Matrix Analysis and Applications | 2011
Fred Wubs; Jonas Thies
We present a new hybrid direct/iterative approach to the solution of a special class of saddle point matrices arising from the discretization of the steady incompressible Navier-Stokes equations on an Arakawa C-grid. The two-level method introduced here has the following properties: (i) it is very robust, even close to the point where the solution becomes unstable; (ii) a single parameter controls fill and convergence, making the method straightforward to use; (iii) the convergence rate is independent of the number of unknowns; (iv) it can be implemented on distributed memory machines in a natural way; (v) the matrix on the second level has the same structure and numerical properties as the original problem, so the method can be applied recursively; (vi) the iteration takes place in the divergence-free space, so the method qualifies as a “constraint preconditioner”; (vii) the approach can also be applied to Poisson problems. This work is also relevant for problems in which similar saddle point matrices occur, for instance, when simulating electrical networks, where one has to satisfy Kirchhoff's conservation law for currents.
European Conference on Parallel Processing | 2014
Andreas Alvermann; Achim Basermann; H. Fehske; Martin Galgon; Georg Hager; Moritz Kreutzer; Lukas Krämer; Bruno Lang; Andreas Pieper; Melven Röhrig-Zöllner; Faisal Shahzad; Jonas Thies; Gerhard Wellein
The ESSEX project investigates computational issues arising at exascale for large-scale sparse eigenvalue problems and develops programming concepts and numerical methods for their solution. The project pursues a coherent co-design of all software layers where a holistic performance engineering process guides code development across the classic boundaries of application, numerical method, and basic kernel library. Within ESSEX the numerical methods cover widely applicable solvers such as classic Krylov, Jacobi-Davidson, or the recent FEAST methods, as well as domain-specific iterative schemes relevant for the ESSEX quantum physics application. This report introduces the project structure and presents selected results which demonstrate the potential impact of ESSEX for efficient sparse solvers on highly scalable heterogeneous supercomputers.
IEEE International Conference on eScience | 2011
Jonas Thies; Fred Wubs
We discuss the parallel implementation of a hybrid direct/iterative solver for a special class of saddle point matrices arising from the discretization of the steady Navier-Stokes equations on an Arakawa C-grid, the F-matrices. The two-level method described here has the following properties: (i) it is very robust, even at comparatively high Reynolds numbers; (ii) a single parameter controls fill and convergence, making the method straightforward to use; (iii) the convergence rate is independent of the number of unknowns; (iv) it can be implemented on distributed memory machines in a natural way; (v) the matrix on the second level has the same structure and numerical properties as the original problem, so the method can be applied recursively. The implementation focuses on generality, modularity, code reuse and recursiveness. The solver is implemented using building blocks of the Trilinos libraries. We show its performance on a parallel computer for the Navier-Stokes equations.
Software for Exascale Computing | 2016
Jonas Thies; Martin Galgon; Faisal Shahzad; Andreas Alvermann; Moritz Kreutzer; Andreas Pieper; Melven Röhrig-Zöllner; Achim Basermann; H. Fehske; Georg Hager; Bruno Lang; Gerhard Wellein
As we approach the exascale computing era, disruptive changes in the software landscape are required to tackle the challenges posed by manycore CPUs and accelerators. We discuss the development of a new ‘exascale enabled’ sparse solver repository (the ESSR) that addresses these challenges—from fundamental design considerations and development processes to actual implementations of some prototypical iterative schemes for computing eigenvalues of sparse matrices. Key features of the ESSR include holistic performance engineering, tight integration between software layers and mechanisms to mitigate hardware failures.
Software for Exascale Computing | 2016
Moritz Kreutzer; Jonas Thies; Andreas Pieper; Andreas Alvermann; Martin Galgon; Melven Röhrig-Zöllner; Faisal Shahzad; Achim Basermann; A. R. Bishop; H. Fehske; Georg Hager; Bruno Lang; Gerhard Wellein
Numerous challenges have to be mastered as applications in scientific computing are being developed for post-petascale parallel systems. While ample parallelism is usually available in the numerical problems at hand, the efficient use of supercomputer resources requires not only good scalability but also a verifiably effective use of resources on the core, the processor, and the accelerator level. Furthermore, power dissipation and energy consumption are becoming further optimization targets besides time-to-solution. Performance Engineering (PE) is the pivotal strategy for developing effective parallel code on all levels of modern architectures. In this paper we report on the development and use of low-level parallel building blocks in the GHOST library (“General, Hybrid, and Optimized Sparse Toolkit”). We demonstrate the use of PE in optimizing a density of states computation using the Kernel Polynomial Method, and show that reduction of runtime and reduction of energy are literally the same goal in this case. We also give a brief overview of the capabilities of GHOST and the applications in which it is being used successfully.
International Workshop on Eigenvalue Problems: Algorithms, Software and Applications in Petascale Computing | 2015
Martin Galgon; Lukas Krämer; Bruno Lang; Andreas Alvermann; H. Fehske; Andreas Pieper; Georg Hager; Moritz Kreutzer; Faisal Shahzad; Gerhard Wellein; Achim Basermann; Melven Röhrig-Zöllner; Jonas Thies
The ESSEX project is an ongoing effort to provide exascale-enabled sparse eigensolvers, especially for quantum physics and related application areas. In this paper we first briefly summarize some key achievements that have been made within this project.