Publication


Featured research published by Martin Bečka.


Parallel Computing | 2002

Dynamic ordering for a parallel block-Jacobi SVD algorithm

Martin Bečka; Gabriel Okša; Marián Vajteršic

A new approach to the parallel computation of the singular value decomposition (SVD) of a matrix A ∈ C^(m×n) is proposed. Contrary to known algorithms that use a static cyclic ordering of the subproblems solved simultaneously in one iteration step, the proposed implementation of the two-sided block-Jacobi method uses a dynamic ordering of subproblems. The dynamic ordering takes into account the actual status of matrix A. In each iteration step, a set of off-diagonal blocks is determined that reduces the Frobenius norm of the off-diagonal elements of A as much as possible and, at the same time, can be annihilated concurrently. This task is equivalent to the maximum-weight perfect matching problem, and a greedy algorithm for its efficient solution is presented. Computational experiments with both types of ordering, incorporated into the two-sided block-Jacobi method, were performed on an SGI-Cray Origin 2000 parallel computer using the Message Passing Interface (MPI). The results confirm that the dynamic ordering requires much less work than the static cyclic ordering to compute the SVD to a given accuracy.
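The greedy pair selection described in the abstract can be sketched as follows. This is an illustrative serial sketch, not the authors' MPI implementation; the function name, the representation of the block partition as index slices, and the exact weight definition are assumptions:

```python
import numpy as np

def greedy_dynamic_ordering(A, blocks):
    """Greedy approximation of the maximum-weight perfect matching used by
    the dynamic ordering: pick disjoint off-diagonal block pairs (i, j) in
    decreasing order of the squared Frobenius norm of the blocks A_ij and
    A_ji that annihilating the pair would remove."""
    l = len(blocks)
    pairs = []
    for i in range(l):
        for j in range(i + 1, l):
            w = (np.linalg.norm(A[blocks[i], blocks[j]]) ** 2
                 + np.linalg.norm(A[blocks[j], blocks[i]]) ** 2)
            pairs.append((w, i, j))
    pairs.sort(reverse=True)              # heaviest pairs first
    matched, ordering = set(), []
    for w, i, j in pairs:
        if i not in matched and j not in matched:
            matched.update((i, j))        # block indices i, j are now taken
            ordering.append((i, j))       # these subproblems run concurrently
    return ordering
```

The returned list of disjoint pairs is exactly a perfect matching on the block indices, so all selected subproblems can be annihilated in the same parallel iteration step.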


Parallel Algorithms and Applications | 1999

BLOCK-JACOBI SVD ALGORITHMS FOR DISTRIBUTED MEMORY SYSTEMS I: HYPERCUBES AND RINGS

Martin Bečka; Marián Vajteršic

The paper presents parallel algorithms for the efficient solution of the singular value decomposition (SVD) problem by the block two-sided Jacobi method. In this part of the work, we show how the method may be used on MIMD computers with hypercube and ring topologies. We analyse three types of orderings for solving the SVD on block-structured submatrices from the point of view of communication requirements and suitability for parallel execution of the computational process. The algorithms map well onto the hypercube topology, and two of the ordering schemes can also be directly implemented on rings. Results obtained on an Intel Paragon are shown and discussed for all three types of orderings.
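For context, the classic static orderings these papers compare against include the round-robin (tournament) scheme, in which every pair of block columns meets exactly once per sweep. A generic sketch, assuming an even number of block columns; the function name is an assumption:

```python
def round_robin_orderings(l):
    """Static cyclic ordering: generate l-1 rounds pairing the l block
    columns so that every pair (i, j) meets exactly once per sweep,
    as in a round-robin tournament (l assumed even)."""
    players = list(range(l))
    rounds = []
    for _ in range(l - 1):
        # pair first with last, second with second-to-last, etc.
        rounds.append([(players[i], players[l - 1 - i]) for i in range(l // 2)])
        # rotate all positions except the first
        players = [players[0]] + [players[-1]] + players[1:-1]
    return rounds
```

Unlike the dynamic ordering, this schedule is fixed in advance and ignores the actual off-diagonal norms of the matrix, which is precisely the limitation the dynamic approach addresses.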


Parallel Algorithms and Applications | 1999

BLOCK-JACOBI SVD ALGORITHMS FOR DISTRIBUTED MEMORY SYSTEMS II: MESHES

Martin Bečka; Marián Vajteršic

This paper deals with a parallelization of the two-sided Jacobi algorithm for computation of Singular Value Decomposition (SVD) on a computer with p processors, which are organized into a two-dimensional √p × √p mesh configuration. This work represents a continuation of our paper (Part I, to appear in J. Parallel Algorithms and Applications), which described a parallelization approach by columns and efficient ordering strategies for the hypercube and ring topologies. Our parallelization approach is based on slicing the matrices by rows and columns. The orderings developed for rings and hypercubes are adopted here and we show a proper assignment of submatrices to processors that enables their efficient parallel execution. A complexity comparison to the column-based algorithm is given. Parallel computational experiments on a Paragon system are presented and discussed for two test matrices.


Parallel Computing | 2003

On variable blocking factor in a parallel dynamic block-Jacobi SVD algorithm

Martin Bečka; Gabriel Okša

The parallel two-sided block-Jacobi singular value decomposition (SVD) algorithm with dynamic ordering, originally proposed in [Parallel Comput. 28 (2002) 243-262], has been extended with respect to the blocking factor l. Unlike the unique blocking factor l = 2p in the original algorithm running on p processors, the blocking factor is now a variable parameter covering values in two different regions, namely l = p/k and l = 2kp for some integer k. Two new parallel two-sided block-Jacobi SVD algorithms with dynamic ordering are described in detail. They arise in those two regions and differ in the logical data arrangement and in the communication complexity of the reordering step. For the case l = 2kp, it is proved that the designed point-to-point communication algorithm is optimal with respect to the amount of communication required per processor as well as to the amount of overall communication. Using the message-passing programming model for distributed memory machines, the new parallel block-Jacobi SVD algorithms were implemented on an SGI-Cray Origin 2000 parallel computer. Numerical experiments were performed on p = 12 and 24 processors using a set of six matrices of order 4000 and blocking factors l, 2 ≤ l ≤ 192. To achieve the minimal total parallel execution time, a blocking factor l ∈ {2, p, 2p} can be recommended for matrices with distinct singular values. However, for matrices with a multiple minimal singular value, the total parallel execution time may increase monotonically with l. In this case, the recommended Jacobi method with l = 2 is just the ScaLAPACK routine with some additional matrix multiplications, and it computes the SVD in one parallel iteration step.


Parallel Processing Letters | 2015

New Dynamic Orderings for the Parallel One–Sided Block-Jacobi SVD Algorithm

Martin Bečka; Gabriel Okša; Marián Vajteršic

Five variants of a new dynamic ordering are presented for the parallel one-sided block-Jacobi SVD algorithm. Similarly to the two-sided algorithm, the dynamic ordering takes into account the actual status of a matrix, this time the mutual orthogonality of its block columns. The variants differ in their computational and communication complexities and in the proposed global and local stopping criteria. Their performance is tested on a square random matrix of order 8192 with a random distribution of singular values using p = 16, 32, 64, 96 and 128 processors. All variants of the dynamic ordering are compared with a parallel cyclic ordering, with the two-sided block-Jacobi method with dynamic ordering, and with the ScaLAPACK routine PDGESVD with respect to the number of parallel iteration steps needed for convergence and the total parallel execution time. Moreover, the relative errors in the orthogonality of the computed left singular vectors and in the matrix assembled from the computed singular triplets are also discussed. It turns out that variant 3, for which local optimality in a precisely defined sense can be proved, and its combination with variant 2 are the most efficient ones. For relatively small blocking factors l = 2p, they outperform the ScaLAPACK procedure PDGESVD and are about 2 times faster.
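The mutual-orthogonality weights that drive the one-sided dynamic ordering can be sketched as below. This is only an illustration of the weight computation, not one of the paper's five variants; the function name and the specific squared-cosine weight are assumptions:

```python
import numpy as np

def orthogonality_weights(V, l):
    """Weight for each pair of the l block columns of V: the sum of
    squared cosines of the angles between their (normalized) columns.
    The dynamic ordering pairs the block columns that are farthest
    from mutual orthogonality first."""
    cols = np.array_split(np.arange(V.shape[1]), l)
    Vn = V / np.linalg.norm(V, axis=0)              # unit columns
    W = np.zeros((l, l))
    for i in range(l):
        for j in range(i + 1, l):
            C = Vn[:, cols[i]].T @ Vn[:, cols[j]]   # pairwise cosines
            W[i, j] = np.sum(C * C)                 # zero iff blocks orthogonal
    return W
```

A weight of zero means the two block columns are already mutually orthogonal, so the ordering gains nothing by pairing them; large weights mark the pairs whose rotation reduces non-orthogonality the most.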


International Conference on Parallel Processing | 2013

Parallel One–Sided Jacobi SVD Algorithm with Variable Blocking Factor

Martin Bečka; Gabriel Okša

The parallel one-sided block-Jacobi algorithm for the matrix singular value decomposition (SVD) requires an efficient computation of symmetric Gram matrices, their eigenvalue decompositions (EVDs), and an update of matrix columns and right singular vectors by matrix multiplication. In our recent parallel implementation with p processors and blocking factor ℓ = 2p, these tasks are computed serially in each processor in a given parallel iteration step, because each processor contains exactly two block columns of an input matrix A. However, as shown in our previous work, with increasing p (hence, with increasing blocking factor) the number of parallel iteration steps needed for the convergence of the whole algorithm increases faster than proportionally to p, so that it is hard to achieve a good speedup. We propose to break the tight relation ℓ = 2p and to use a small blocking factor ℓ = p/k for some integer k that divides p, with ℓ even. The algorithm then works with pairs of logical block columns that are distributed among processors so that all computations inside a parallel iteration step are themselves parallel. We discuss the optimal data distribution for parallel subproblems in the one-sided block-Jacobi algorithm and analyze its computational and communication complexity. Experimental results with full matrices of order 8192 show that the new algorithm with a small blocking factor is well scalable and can be 2-3 times faster than the ScaLAPACK procedure PDGESVD.
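The Gram-matrix/EVD/update cycle named at the start of the abstract can be illustrated by a serial NumPy sketch of one local subproblem; this is an assumption-laden stand-in for the parallel code (the function name and the in-place update convention are illustrative):

```python
import numpy as np

def one_sided_step(A, V, ci, cj):
    """One local subproblem of the one-sided block-Jacobi SVD: form the
    symmetric Gram matrix of two block columns, compute its EVD, and apply
    the eigenvector matrix both to the block columns of A and to the
    accumulated right singular vectors in V. ci, cj are column index arrays."""
    idx = np.concatenate((ci, cj))
    B = A[:, idx]                    # the two block columns (copy)
    G = B.T @ B                      # symmetric Gram matrix
    w, Q = np.linalg.eigh(G)         # EVD of the Gram matrix
    A[:, idx] = B @ Q                # update: these columns become mutually orthogonal
    V[:, idx] = V[:, idx] @ Q        # accumulate right singular vectors
    return A, V
```

After the step, the selected columns of A are mutually orthogonal, and the invariant A = A_original @ V is preserved, which is what makes the accumulated V converge to the right singular vectors.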


Parallel Computing | 1999

Experiments with Parallel One-Sided and Two-Sided Algorithms for SVD

Martin Bečka; Sophie Robert; Marián Vajteršic

The paper reports on testing parallel SVD algorithms for matrices arising from selected scientific and industrial applications. The codes for the SVD are based on the one-sided and the two-sided Jacobi approach, respectively. The matrices come from solving problems of the diffraction process in crystallography and the diffusion equation in reactor physics, and from the aircraft industry. A parallelization of each of these approaches is described. Results from computational experiments performed on a Paragon machine with 56 processors are presented and discussed.


Concurrency and Computation: Practice and Experience | 2017

Performance analysis and optimization of the parallel one-sided block Jacobi SVD algorithm with dynamic ordering and variable blocking

Shuhei Kudo; Yusaku Yamamoto; Martin Bečka; Marián Vajteršic

The one-sided block Jacobi (OSBJ) method is known to be an efficient method for computing the singular value decomposition on a parallel computer. In this paper, we focus on the most recent variant of the OSBJ method, the one with parallel dynamic ordering and variable blocking, and present both theoretical and experimental analyses of the algorithm. In the first part of the paper, we provide a detailed theoretical analysis of its convergence properties. In the second part, based on preliminary performance measurements on the Fujitsu FX10 and SGI Altix ICE parallel computers, we identify two performance bottlenecks of the algorithm and propose new implementations to resolve them. Experimental results show that they are effective and can achieve up to 1.8 and 1.4 times speedup of the total execution time on the FX10 and the Altix ICE, respectively. Comparison with the ScaLAPACK SVD routine PDGESVD shows that our OSBJ solver is efficient when solving small- to medium-sized problems (n < 10000) using a modest number (< 100) of computing nodes.


Proceedings of IEEE International Symposium on Parallel Algorithms Architecture Synthesis | 1997

Block-SVD algorithms and their adaptation to hypercubes and rings

Marián Vajteršic; Martin Bečka

The paper presents parallel algorithms for the efficient solution of the SVD (singular value decomposition) problem by the block two-sided Jacobi method. It is shown how the method can be applied to MIMD computers with hypercube and ring topologies. Three types of orderings for solving the SVD on block-structured submatrices are analysed from the point of view of communication requirements and suitability for parallel execution of the computational process, which is carried out on block columns of the matrix. All three orderings fit the hypercube topology well. Two of them can also be directly implemented on rings; for these, optimality in the parallelization of the method and in the data transfers is achieved within each sweep. For the third scheme, an efficient numbering of processor nodes is discussed. Computer results obtained on an Intel Paragon system are shown for a chosen ordering.


International Conference on Parallel Processing | 2015

New Approach to Local Computations in the Parallel One–Sided Jacobi SVD Algorithm

Martin Bečka; Gabriel Okša

The one-sided block-Jacobi algorithm for the singular value decomposition (SVD) of a matrix can be a method of choice for computing the SVD efficiently and accurately in parallel. A given matrix is logically partitioned into block columns and is subjected to an iteration process. In each iteration step, for a given pair of block columns, their Gram matrix is generated, its symmetric eigenvalue decomposition (EVD) is computed, and the block columns are updated by matrix-matrix multiplication. Another possibility is to omit the computation of the Gram matrix and the update, so that there is no matrix-matrix multiplication at all. A local matrix is formed from the two block columns, and its QR decomposition is computed first to reduce the dimension and decrease the off-diagonal norm. Then the one-sided serial Jacobi SVD is called (either for the local matrix or for its R-factor). No update is necessary, since the result of the serial one-sided Jacobi SVD algorithm is the same as the original matrix after the update. Crucial for this new approach is an efficient implementation of the QR decomposition for tall and skinny matrices, as well as a fast and accurate (serial) one-sided Jacobi SVD algorithm. A further improvement would be to replace the static (fixed) local stopping criterion in the inner EVD or SVD computations by a dynamic (flexible) one, to reduce the work done by these routines in one parallel iteration step. Since the orthogonality of columns is crucial at the end of the algorithm, one could progressively tighten the local stopping criterion as the computation proceeds. We implement the proposed approaches and compare the achieved results with the standard algorithm.
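The QR-first local computation described above can be sketched as follows, with NumPy's np.linalg.svd standing in for the serial one-sided Jacobi SVD (an assumption made for brevity; the function name is also illustrative):

```python
import numpy as np

def local_svd_via_qr(B):
    """Local task of the no-update variant: QR-reduce the tall-and-skinny
    matrix B formed by two block columns, run an SVD on the small R-factor,
    and return the updated (mutually orthogonal) block columns directly,
    so no separate Gram matrix or update multiplication is needed."""
    Q, R = np.linalg.qr(B)            # reduced QR: R is small and square
    U, s, Vt = np.linalg.svd(R)       # stands in for the serial one-sided Jacobi SVD
    return Q @ (U * s), Vt.T          # orthogonal columns scaled by singular values
```

The returned first factor equals B @ V, i.e. exactly the block columns that the Gram-matrix variant would produce after its update multiplication, which is why no separate update step is needed.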

Collaboration


Top co-authors of Martin Bečka.

Gabriel Okša, Slovak Academy of Sciences
Gabriel Oksa, Loughborough University
Yusaku Yamamoto, University of Electro-Communications
Shuhei Kudo, University of Electro-Communications
Eva Vidličková, Charles University in Prague
Tomás Hrúz, Academy of Sciences of the Czech Republic