Radim Sojka
Technical University of Ostrava
Publication
Featured research published by Radim Sojka.
Computing | 2017
Joseph Schuchart; Michael Gerndt; Per Gunnar Kjeldsberg; Michael Lysaght; David Horák; Lubomír Říha; Andreas Gocht; Mohammed Sourouri; Madhura Kumaraswamy; Anamika Chowdhury; Magnus Jahre; Kai Diethelm; Othman Bouizi; Umbreen Sabir Mian; Jakub Kružík; Radim Sojka; Martin Beseda; Venkatesh Kannan; Zakaria Bendifallah; Daniel Hackenberg; Wolfgang E. Nagel
Energy efficiency is an important aspect of future exascale systems, mainly due to rising energy costs. Although high-performance computing (HPC) applications are compute centric, they still exhibit varying computational characteristics in different regions of the program, such as compute-, memory-, and I/O-bound code regions. Some of today’s clusters already offer mechanisms to adjust the system to the resource requirements of an application, e.g., by controlling the CPU frequency. However, manually tuning for improved energy efficiency is a tedious and painstaking task that is often neglected by application developers. The European Union’s Horizon 2020 project READEX (Runtime Exploitation of Application Dynamism for Energy-efficient eXascale computing) aims at developing a tools-aided approach for improved energy efficiency of current and future HPC applications. To reach this goal, the READEX project combines technologies from two ends of the compute spectrum, embedded systems and HPC, constituting a split design-time/runtime methodology. From the HPC domain, the Periscope Tuning Framework (PTF) is extended to perform dynamic auto-tuning of fine-grained application regions using the system scenario methodology, which was originally developed for improving energy efficiency in embedded systems. This paper introduces the concepts of the READEX project, its envisioned implementation, and preliminary results that demonstrate the feasibility of this approach.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2015
Václav Hapla; David Horák; Lukáš Pospíšil; Martin Čermák; Alena Vašatová; Radim Sojka
PERMON makes use of theoretical results in quadratic programming algorithms and domain decomposition methods. It is built on top of the PETSc framework for numerical computations. This paper describes its fundamental packages and shows their applications. We focus here on contact problems of mechanics decomposed by means of a FETI-type non-overlapping domain decomposition method. These problems lead to inequality constrained quadratic programming problems that can be solved by our PermonQP package.
International Conference of Numerical Analysis and Applied Mathematics (ICNAAM 2016) | 2017
Jan Papuga; Radim Halama; Martin Fusek; Jaroslav Rojíček; František Fojtík; David Horák; Marek Pecha; Jiří Tomčala; Martin Čermák; Václav Hapla; Radim Sojka; Jakub Kružík
In this paper, we present our progress on a project focused on fatigue life prediction under multiaxial loading in the low-cycle fatigue domain, i.e., in cases where plasticity cannot be neglected. First, the elastic-plastic solution in the finite element analysis is enhanced and verified against our own experiments. Second, the method by Jiang, which describes the instantaneous damage increase by analyzing the load at successive time instants, is being implemented. In addition, simplified routines for converting elastic stresses and strains to elastic-plastic ones, as proposed by Firat and by Ye et al., are evaluated on the basis of data gathered from external sources. To produce high-quality complex analyses in an acceptable time, and to leave more time for the subsequent analysis of results, the core of the PragTic fatigue solver used for all fatigue computations is being re-implemented as a fully parallelized, scalable solution.
International Conference on High Performance Computing and Simulation | 2016
David Horák; Lubomír Říha; Radim Sojka; Jakub Kružík; Martin Beseda
The energy consumption of supercomputers is one of the critical problems for the upcoming exascale supercomputing era. Awareness of power and energy consumption is required on both the software and hardware sides. This poster deals with the energy consumption evaluation of Total Finite Element Tearing and Interconnecting (TFETI) based solvers [2] of linear systems implemented in the PERMON toolbox [1], where TFETI is an established method for solving real-world engineering problems, and with the energy consumption evaluation of BLAS routines. The experiments presented in the poster vary the CPU frequency. This work is performed in the scope of the READEX project (Runtime Exploitation of Application Dynamism for Energy-efficient eXascale computing) [6]. The measurements were performed on the Intel Xeon E5-2680 (Intel Haswell micro-architecture) based Taurus system installed at TU Dresden. The system contains over 1400 nodes equipped with FPGA-based power instrumentation called HDEEM (High Definition Energy Efficiency Monitoring), which allows for fine-grained and more accurate power and energy measurements. The measurements can be accessed through the HDEEM library, allowing developers to take energy measurements before and after the region of interest. We have evaluated the effect of the CPU frequency on the energy consumption of the TFETI solver for a linear elasticity 3D cube synthetic benchmark. On the dualized problem $MPFX = MPd$, we have evaluated the effect of frequency tuning on the energy consumption of the essential processing kernels of the TFETI method. There are two main phases in TFETI: preprocessing and solve. In preprocessing, it is necessary to regularize and factorize the stiffness matrix $K$, and to assemble the $G$ and $GG^T$ matrices and factorize the latter. Both factorizations are among the most time- and energy-consuming operations.
The solve phase employs the Preconditioned Conjugate Gradient (PCG) algorithm, which consists of sparse matrix-vector multiplications (by the $F$, $P$, $M_L$, and $M_D$ matrices), vector dot products, and AXPY operations. In each iteration, we need to apply the direct solver twice: forward and backward solves for the action of the pseudoinverse $K^+$, and for the coarse problem solution, i.e., the action of $(GG^T)^{-1}$. The multiplication by the dense Schur complement matrix adds another operator with different computational characteristics, potentially increasing the exploitable dynamism. The poster provides results for two types of frequency tuning: (1) static tuning and (2) dynamic tuning. For static tuning experiments, the frequency is set before execution and kept constant during the runtime. For dynamic tuning, the frequency is changed during the program execution to adapt the system to the actual needs of the application. The poster shows that static tuning brings up to 11.84% energy savings compared to the default CPU settings (the highest clock rate). Dynamic tuning improves this further by up to 2.68%. In total, the approach presented here shows the potential to save up to 14.52% of energy for TFETI based solvers; see Table 1. Further energy consumption evaluations were done with selected Sparse and Dense BLAS Level 1, 2 and 3 routines. For benchmarking, we have used a set of matrices from the University of Florida collection [4]. We have employed AXPY, Sparse Matrix-Vector, Sparse Matrix-Matrix, Dense Matrix-Vector, Dense Matrix-Matrix and Sparse Matrix-Dense Matrix multiplication routines from the Intel Math Kernel Library (MKL) [3]. The measured characteristics illustrate the different energy consumption of BLAS routines, as some operations are memory-bound and others are compute-bound. Based on our recommendations, one can explore dynamic frequency switching to achieve significant energy savings of up to 23%; for more details, see Table 2.
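The kernel mix of the PCG solve described above (matrix-vector products, dot products, AXPY updates, and a preconditioner application) can be seen in a minimal generic sketch. This is not the TFETI solver or PERMON code: the function name `pcg`, the Jacobi (diagonal) preconditioner, and the small 1D Laplacian test matrix are assumptions chosen only to keep the example self-contained, and the projector and coarse problem are omitted.

```python
import numpy as np

def pcg(A, b, M_inv_diag, tol=1e-10, max_iter=1000):
    """Generic preconditioned conjugate gradient for a symmetric positive
    definite matrix A. M_inv_diag is the inverse diagonal of a Jacobi
    preconditioner (an assumption; TFETI uses other preconditioners)."""
    x = np.zeros_like(b)
    r = b - A @ x              # initial residual (matrix-vector product)
    z = M_inv_diag * r         # preconditioner application
    p = z.copy()
    rz = r @ z                 # dot product
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        Ap = A @ p             # matrix-vector product
        alpha = rz / (p @ Ap)  # dot product
        x += alpha * p         # AXPY
        r -= alpha * Ap        # AXPY
        z = M_inv_diag * r     # preconditioner application
        rz_new = r @ z         # dot product
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Hypothetical test problem: 1D Laplacian with a constant right-hand side.
n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x, iters = pcg(A, b, 1.0 / np.diag(A))
```

Each iteration touches kernels with different computational characteristics, which is what makes per-kernel frequency tuning worth exploring.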
Applied Mathematics and Computation | 2018
Radim Sojka; David Horák; Václav Hapla; Martin Čermák
The paper deals with handling multiple subdomains per computational core in the PERMON toolbox, namely in the PermonFLLOP module, to fully exploit the potential of the Total Finite Element Tearing and Interconnecting (TFETI) domain decomposition method (DDM). Most authors researching FETI methods present weak parallel scalability with one subdomain assigned to each computational core, and call it just parallel scalability. Here we present an extension in which each MPI process holds the data of more than one subdomain. Numerical experiments demonstrate the theoretically supported fact that, for a given problem size and number of processors, an increased number of subdomains leads to better conditioning of the system operator, and hence faster convergence. Moreover, numerical, memory, strong parallel, and weak parallel scalability are reported, and optimal numbers of subdomains per core are examined. Finally, new PETSc matrix types dealing with the aforementioned extension are introduced.
International Conference of Numerical Analysis and Applied Mathematics (ICNAAM 2016) | 2017
Radim Sojka; Lubomír Říha; David Horák; Jakub Kružík; Martin Beseda; Martin Čermák
The paper deals with the energy consumption evaluation of selected Sparse and Dense BLAS Level 1, 2 and 3 routines. We have employed AXPY, Sparse Matrix-Vector, Sparse Matrix-Matrix, Dense Matrix-Vector, Dense Matrix-Matrix and Sparse Matrix-Dense Matrix multiplication routines from the Intel Math Kernel Library (MKL). The measured characteristics illustrate the different energy consumption of BLAS routines, as some operations are memory-bound and others are compute-bound. Based on our recommendations, one can explore dynamic frequency switching to achieve significant energy savings of up to 23%.
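The memory-bound versus compute-bound distinction above can be made concrete with a rough arithmetic-intensity estimate (floating-point operations per byte moved). The sketch below is not taken from the paper: the function names are hypothetical, values are assumed to be 8-byte doubles, and the memory traffic is an idealized lower bound that assumes perfect cache reuse.

```python
def axpy_intensity(n, bytes_per_value=8):
    """Flops per byte for y = a*x + y on vectors of length n."""
    flops = 2 * n                          # one multiply and one add per element
    traffic = 3 * n * bytes_per_value      # read x, read y, write y
    return flops / traffic

def gemm_intensity(n, bytes_per_value=8):
    """Flops per byte for C = A*B + C with dense n-by-n matrices."""
    flops = 2 * n**3                       # n multiply-adds per entry of C
    traffic = 4 * n**2 * bytes_per_value   # read A, B, C and write C once each
    return flops / traffic

print(axpy_intensity(1000))   # about 0.083 flops/byte, independent of n
print(gemm_intensity(1000))   # 62.5 flops/byte, grows linearly with n
```

A Level 1 routine such as AXPY stays limited by memory bandwidth at any size, so lowering the core frequency is likely to cost little runtime, whereas a Level 3 dense product can keep the cores busy and benefits from a higher clock.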
International Conference of Numerical Analysis and Applied Mathematics (ICNAAM 2016) | 2017
David Horák; Lubomír Říha; Radim Sojka; Jakub Kružík; Martin Beseda; Martin Čermák; Joseph Schuchart
The energy consumption of supercomputers is one of the critical problems for the upcoming exascale supercomputing era. Awareness of power and energy consumption is required on both the software and hardware sides. This paper deals with the energy consumption evaluation of Finite Element Tearing and Interconnecting (FETI) based solvers of linear systems, where FETI is an established method for solving real-world engineering problems. We have evaluated the effect of the CPU frequency on the energy consumption of the FETI solver using a linear elasticity 3D cube synthetic benchmark. In this problem, we have evaluated the effect of frequency tuning on the energy consumption of the essential processing kernels of the FETI method. The paper provides results for two types of frequency tuning: (1) static tuning and (2) dynamic tuning. For static tuning experiments, the frequency is set before execution and kept constant during the runtime. For dynamic tuning, the frequency is changed during the program execution to adapt the system to the actual needs of the application. The paper shows that static tuning brings up to 12% energy savings compared to the default CPU settings (the highest clock rate). Dynamic tuning improves this further by up to 3%.
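The difference between static and dynamic tuning described above can be sketched as a selection problem over measured energies. Everything in this example is hypothetical: the region names, the frequencies, and the energy values are invented for illustration and are not measurements from the paper, and the cost of switching frequencies is ignored.

```python
# Hypothetical energy in joules per code region at each CPU frequency in GHz.
energy = {
    "factorization": {1.2: 130.0, 1.8: 110.0, 2.5: 100.0},
    "spmv":          {1.2: 40.0,  1.8: 46.0,  2.5: 60.0},
    "dot_axpy":      {1.2: 18.0,  1.8: 20.0,  2.5: 25.0},
}
frequencies = [1.2, 1.8, 2.5]

def total_at(freq):
    """Total energy when every region runs at the same frequency."""
    return sum(region[freq] for region in energy.values())

# Default: the highest clock rate for the whole run.
default = total_at(max(frequencies))

# Static tuning: one frequency chosen before execution and kept constant.
static_freq = min(frequencies, key=total_at)
static = total_at(static_freq)

# Dynamic tuning: the best frequency is chosen separately for each region.
dynamic = sum(min(region.values()) for region in energy.values())

print(default, static_freq, static, dynamic)
```

With these made-up numbers the default run uses 185 J, static tuning selects 1.8 GHz for 176 J, and dynamic tuning reaches 158 J. Dynamic tuning can never do worse than static tuning in this idealized model, because choosing the static frequency for every region is one of its options.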
Programs and Algorithms of Numerical Mathematics 18 | 2017
Alena Vašatová; Jiří Tomčala; Radim Sojka; Marek Pecha; Jakub Kružík; David Horák; Václav Hapla; Martin Čermák
Archive | 2017
Martin Čermák; Václav Hapla; David Horák; Jakub Kružík; Marek Pecha; Radim Sojka; Jiří Tomčala
Advances in Electrical and Electronic Engineering | 2017
David Horák; Václav Hapla; Jakub Kružík; Radim Sojka; Martin Čermák; Jiří Tomčala; Marek Pecha; Zdeněk Dostál