Martin Kronbichler | Researchain

Publication

Featured research published by Martin Kronbichler.

Journal of Numerical Mathematics | 2016

The deal.II Library, Version 8.4

Wolfgang Bangerth; Denis Davydov; Timo Heister; Luca Heltai; Guido Kanschat; Martin Kronbichler; Matthias Maier; Bruno Turcksin; David Wells

This paper provides an overview of the new features of the finite element library deal.II version 8.4.

ACM Transactions on Mathematical Software | 2011

Algorithms and data structures for massively parallel generic adaptive finite element codes

Wolfgang Bangerth; Carsten Burstedde; Timo Heister; Martin Kronbichler

Today's largest supercomputers have 100,000s of processor cores and offer the potential to solve partial differential equations discretized by billions of unknowns. However, the complexity of scaling to such large machines and problem sizes has so far prevented the emergence of generic software libraries that support such computations, although these would lower the threshold of entry and enable many more applications to benefit from large-scale computing. We are concerned with providing this functionality for mesh-adaptive finite element computations. We assume the existence of an “oracle” that implements the generation and modification of an adaptive mesh distributed across many processors, and that responds to queries about its structure. Based on querying the oracle, we develop scalable algorithms and data structures for generic finite element methods. Specifically, we consider the parallel distribution of mesh data, global enumeration of degrees of freedom, constraints, and postprocessing. Our algorithms remove the bottlenecks that typically limit large-scale adaptive finite element analyses. We demonstrate scalability of complete finite element workflows on up to 16,384 processors. An implementation of the proposed algorithms, based on the open source software p4est as mesh oracle, is provided under an open source license through the widely used deal.II finite element software library.

Computers & Mathematics With Applications | 2013

Numerical and computational efficiency of solvers for two-phase problems

Owe Axelsson; Petia T. Boyanova; Martin Kronbichler; Maya Neytcheva; Xunxun Wu

We consider two-phase flow problems, modelled by the Cahn-Hilliard equation. In this work, the nonlinear fourth-order equation is decomposed into a system of two coupled second-order equations for the concentration and the chemical potential. We analyse solution methods based on an approximate two-by-two block factorization of the Jacobian of the nonlinear discrete problem. We propose a preconditioning technique that reduces the problem of solving the non-symmetric discrete Cahn-Hilliard system to a problem of solving systems with symmetric positive definite matrices where off-the-shelf multilevel and multigrid algorithms are directly applicable. The resulting solution methods exhibit optimal convergence and computational complexity properties and are suitable for parallel implementation. We illustrate the efficiency of the proposed methods by various numerical experiments, including parallel results for large scale three dimensional problems.
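For orientation, the splitting described in this abstract is often written in the following form. This is a sketch under assumptions that the abstract does not state: a constant mobility M, an interface-width parameter ε, a quartic double-well potential f, and no convective term. The paper's own scaling and terms may differ.

```latex
\begin{align}
  \frac{\partial c}{\partial t} &= \nabla \cdot \left( M \nabla \mu \right), \\
  \mu &= f'(c) - \epsilon^{2} \Delta c,
  \qquad f(c) = \tfrac{1}{4} \left( c^{2} - 1 \right)^{2}.
\end{align}
```

Substituting the second equation into the first recovers a single fourth-order equation for the concentration c. Keeping the chemical potential μ as a separate unknown is what yields the coupled second-order system.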

IEEE International Conference on eScience | 2011

Parallel Finite Element Operator Application: Graph Partitioning and Coloring

Katharina Kormann; Martin Kronbichler

We present an efficient implementation of parallel finite element operator application for hexahedral elements. The implementation is tailored to data structures for adaptively refined meshes and exploits parallelism on modern computer systems. The evaluation of local shape functions and gradients is performed with sum-factorization that makes use of the tensor-product form. For shared memory parallelization, we propose a novel two-level partitioning/coloring approach that avoids race conditions when writing into the result vector. We give evidence for the good performance of our implementation. We employ the optimized operator implementation on a problem in quantum dynamics described by the time-dependent Schroedinger equation. We obtain a speedup of more than a factor four over conventional solvers based on sparse matrices for a moderate polynomial order of four in three dimensions.
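The role of coloring in avoiding race conditions can be illustrated with a minimal sketch. It uses a plain one-level greedy coloring, not the two-level partitioning/coloring scheme proposed in the paper. The names `color_cells` and `apply_by_color`, and the toy "local operator", are hypothetical and chosen only for illustration.

```python
def color_cells(cell_dofs):
    """Greedy coloring: cells that share a degree of freedom get different colors."""
    # Map each DoF index to the cells touching it.
    dof_to_cells = {}
    for cell, dofs in enumerate(cell_dofs):
        for dof in dofs:
            dof_to_cells.setdefault(dof, []).append(cell)

    colors = [-1] * len(cell_dofs)
    for cell, dofs in enumerate(cell_dofs):
        # Colors already taken by neighbors (cells sharing at least one DoF).
        taken = {colors[other] for dof in dofs for other in dof_to_cells[dof]
                 if colors[other] >= 0}
        color = 0
        while color in taken:
            color += 1
        colors[cell] = color
    return colors


def apply_by_color(cell_dofs, colors, src):
    """Add each cell's contribution to dst, one color at a time."""
    dst = [0.0] * len(src)
    for color in range(max(colors) + 1):
        # Cells of one color write to disjoint entries of dst, so this inner
        # loop could be split across threads without locks (run serially here).
        for cell, dofs in enumerate(cell_dofs):
            if colors[cell] == color:
                local_sum = sum(src[d] for d in dofs)  # toy "local operator"
                for d in dofs:
                    dst[d] += local_sum
    return dst


# A 1D chain of four linear elements: neighboring cells share one node.
cell_dofs = [[0, 1], [1, 2], [2, 3], [3, 4]]
colors = color_cells(cell_dofs)
result = apply_by_color(cell_dofs, colors, [1.0, 2.0, 3.0, 4.0, 5.0])
```

In this chain the cells alternate between two colors, so all even-numbered cells can be processed together, followed by all odd-numbered cells.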

International Journal for Numerical Methods in Fluids | 2018

Wall modeling via function enrichment within a high-order DG method for RANS simulations of incompressible flow

Benjamin Krank; Martin Kronbichler; Wolfgang A. Wall

We present a novel approach to wall modeling for RANS within the discontinuous Galerkin method. Wall functions are not used to prescribe boundary conditions as usual but they are built into the function space of the numerical method as a local enrichment, in addition to the standard polynomial component. The Galerkin method then automatically finds the optimal solution among all shape functions available. This idea is fully consistent and gives the wall model vast flexibility in separated boundary layers or high adverse pressure gradients. The wall model is implemented in a high-order discontinuous Galerkin solver for incompressible flow complemented by the Spalart–Allmaras closure model. As benchmark examples we present turbulent channel flow starting from Reτ=180 and up to Reτ=100,000 as well as flow past periodic hills at Reynolds numbers based on the hill height of ReH=10,595 and ReH=19,000.

International Supercomputing Conference | 2017

Fast Matrix-Free Discontinuous Galerkin Kernels on Modern Computer Architectures

Martin Kronbichler; Katharina Kormann; Igor Pasichnyk; Momme Allalen

This study compares the performance of high-order discontinuous Galerkin finite elements on modern hardware. The main computational kernel is the matrix-free evaluation of differential operators by sum factorization, exemplified on the symmetric interior penalty discretization of the Laplacian as a metric for a complex application code in fluid dynamics. State-of-the-art implementations of these kernels stress both arithmetics and memory transfer. The implementations of SIMD vectorization and shared-memory parallelization are detailed. Computational results are presented for dual-socket Intel Haswell CPUs at 28 cores, a 64-core Intel Knights Landing, and a 16-core IBM Power8 processor. Up to polynomial degree six, Knights Landing is approximately twice as fast as Haswell. Power8 performs similarly to Haswell, trading a higher frequency for narrower SIMD units. The performance comparison shows that simple ways to express parallelism through for loops perform better on medium and high core counts than a more elaborate task-based parallelization with dynamic scheduling according to dependency graphs, despite less memory transfer in the latter algorithm.
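The idea behind sum factorization can be shown on a small 2D example. The sketch below is unvectorized pure Python and is not the paper's kernel. The names `evaluate_naive` and `evaluate_sum_factorized` are hypothetical, and `S` is assumed to hold the values of n one-dimensional shape functions at q one-dimensional quadrature points.

```python
def evaluate_naive(S, u):
    """v[q1][q2] = sum_{i,j} S[q1][i] * S[q2][j] * u[i][j], O(q^2 n^2) work."""
    q, n = len(S), len(S[0])
    return [[sum(S[q1][i] * S[q2][j] * u[i][j]
                 for i in range(n) for j in range(n))
             for q2 in range(q)]
            for q1 in range(q)]


def evaluate_sum_factorized(S, u):
    """Same result via two sweeps of 1D contractions, O(q n^2 + q^2 n) work."""
    q, n = len(S), len(S[0])
    # Sweep 1: contract the second index j with the 1D matrix.
    t = [[sum(S[q2][j] * u[i][j] for j in range(n)) for q2 in range(q)]
         for i in range(n)]
    # Sweep 2: contract the first index i.
    return [[sum(S[q1][i] * t[i][q2] for i in range(n)) for q2 in range(q)]
            for q1 in range(q)]


# Linear shape functions (1 - x, x) on [0, 1], sampled at x = 0.25 and x = 0.75.
S = [[0.75, 0.25], [0.25, 0.75]]
u = [[1.0, 2.0], [3.0, 4.0]]  # coefficients u[i][j] of the tensor-product basis
v_naive = evaluate_naive(S, u)
v_fast = evaluate_sum_factorized(S, u)
```

Both functions compute the same values at the quadrature points. The factorized version exploits the tensor-product form to replace one large contraction with a sequence of small one-dimensional ones, and the saving grows with polynomial degree and dimension.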

International Journal for Numerical Methods in Fluids | 2018

Efficiency of high-performance discontinuous Galerkin spectral element methods for under-resolved turbulent incompressible flows

Niklas Fehn; Wolfgang A. Wall; Martin Kronbichler

The present paper addresses the numerical solution of turbulent flows with high-order discontinuous Galerkin methods for discretizing the incompressible Navier-Stokes equations. The efficiency of high-order methods when applied to under-resolved problems is an open issue in the literature. This topic is carefully investigated in the present work by the example of the 3D Taylor-Green vortex problem. Our implementation is based on a generic high-performance framework for matrix-free evaluation of finite element operators with one of the best realizations currently known. We present a methodology to systematically analyze the efficiency of the incompressible Navier-Stokes solver for high polynomial degrees. Due to the absence of optimal rates of convergence in the under-resolved regime, our results reveal that demonstrating improved efficiency of high-order methods is a challenging task and that optimal computational complexity of solvers, preconditioners, and matrix-free implementations are necessary ingredients to achieve the goal of better solution quality at the same computational costs already for a geometrically simple problem such as the Taylor-Green vortex. Although the analysis is performed for a Cartesian geometry, our approach is generic and can be applied to arbitrary geometries. We present excellent performance numbers on modern, cache-based computer architectures achieving a throughput for operator evaluation of 3e8 up to 1e9 DoFs/sec on one Intel Haswell node with 28 cores. Compared to performance results published within the last 5 years for high-order DG discretizations of the compressible Navier-Stokes equations, our approach reduces computational costs by more than one order of magnitude for the same setup.

International Journal of High Performance Computing Applications | 2018

A fast massively parallel two-phase flow solver for microfluidic chip simulation

Martin Kronbichler; Ababacar Diagne; Hanna Holmgren

This work presents a parallel finite element solver of incompressible two-phase flow targeting large-scale simulations of three-dimensional dynamics in high-throughput microfluidic separation devices. The method relies on a conservative level set formulation for representing the fluid-fluid interface and uses adaptive mesh refinement on forests of octrees. An implicit time stepping with efficient block solvers for the incompressible Navier–Stokes equations discretized with Taylor–Hood and augmented Taylor–Hood finite elements is presented. A matrix-free implementation is used that reduces the solution time for the Navier–Stokes system by a factor of approximately three compared to the best matrix-based algorithms. Scalability of the chosen algorithms up to 32,768 cores and a billion degrees of freedom is shown.

ACM Transactions on Mathematical Software | 2016

WorkStream -- A Design Pattern for Multicore-Enabled Finite Element Computations

Bruno Turcksin; Martin Kronbichler; Wolfgang Bangerth

Many operations that need to be performed in modern finite element codes can be described as an operation that needs to be done independently on every cell, followed by a reduction of these local results into a global data structure. For example, matrix assembly, estimating discretization errors, or converting nodal values into data structures that can be output in visualization file formats all fall into this class of operations. Using this realization, we identify a software design pattern that we call WorkStream and that can be used to model such operations and enables the use of multicore shared memory parallel processing. We also describe in detail how this design pattern can be efficiently implemented, and we provide numerical scalability results from its use in the deal.II software library.
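The "independent work on every cell, then reduction into a global structure" shape described above can be sketched in a few lines. This is a structural analogy only, not the deal.II implementation. The function names are hypothetical, the example assembles a 1D mass matrix for linear elements, and Python threads are used purely to show the worker/copier split (they give no real speedup for pure-Python arithmetic).

```python
from concurrent.futures import ThreadPoolExecutor


def local_mass_matrix(cell):
    """Worker: compute one cell's contribution without touching global data."""
    left, right, dofs = cell
    h = right - left
    # Exact mass matrix of a 1D linear element of length h.
    matrix = [[h / 3, h / 6], [h / 6, h / 3]]
    return dofs, matrix


def copy_local_to_global(global_matrix, contribution):
    """Copier: add one local result into the shared global matrix."""
    dofs, matrix = contribution
    for a, i in enumerate(dofs):
        for b, j in enumerate(dofs):
            global_matrix[i][j] += matrix[a][b]


def assemble(cells, n_dofs, n_workers=4):
    global_matrix = [[0.0] * n_dofs for _ in range(n_dofs)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Workers run concurrently; map() yields results in cell order,
        # so the serial copy step below is deterministic.
        for contribution in pool.map(local_mass_matrix, cells):
            copy_local_to_global(global_matrix, contribution)
    return global_matrix


# Three elements of length 1 on [0, 3]; each cell is (left, right, dof indices).
cells = [(0.0, 1.0, (0, 1)), (1.0, 2.0, (1, 2)), (2.0, 3.0, (2, 3))]
mass = assemble(cells, n_dofs=4)
```

Only the copier writes to shared data and it runs serially, so the workers need no locking. The entries of the assembled mass matrix sum to the length of the domain, which gives a quick sanity check.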

EuroMPI'10 Proceedings of the 17th European MPI users' group meeting conference on Recent advances in the message passing interface | 2010

Massively parallel finite element programming

Timo Heister; Martin Kronbichler; Wolfgang Bangerth

Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundred cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability.
