Kinji Kimura | Researchain

Publication

Featured research published by Kinji Kimura.

Parallel Computing Technologies | 2007

Accelerating the singular value decomposition of rectangular matrices with the CSX600 and the integrable SVD

Yusaku Yamamoto; Takeshi Fukaya; Takashi Uneyama; Masami Takata; Kinji Kimura; Masashi Iwasaki; Yoshimasa Nakamura

We propose an approach to speed up the singular value decomposition (SVD) of very large rectangular matrices using the CSX600 floating point coprocessor. The CSX600-based acceleration board we use offers 50 GFLOPS of sustained performance, which is many times greater than that provided by standard microprocessors. However, this performance can be achieved only when a vendor-supplied matrix-matrix multiplication routine is used and the matrix size is sufficiently large. In this paper, we optimize two of the major components of rectangular SVD, namely, QR decomposition of the input matrix and back-transformation of the left singular vectors by matrix Q, so that large-size matrix multiplications can be used efficiently. In addition, we use the Integrable SVD algorithm to compute the SVD of an intermediate bidiagonal matrix. This helps to further speed up the computation and reduce the memory requirements. As a result, we achieved up to 3.5 times speedup over the Intel Math Kernel Library running on a 3.2 GHz Xeon processor when computing the SVD of a 100,000 × 4000 matrix.

Journal of Physics A | 2011

Conserved quantities of the discrete finite Toda equation and lower bounds of the minimal singular value of upper bidiagonal matrices

Kinji Kimura; Takumi Yamashita; Yoshimasa Nakamura

Some numerical algorithms are known to be related to discrete-time integrable systems, where it is essential that quantities to be computed (for example, eigenvalues and singular values of a matrix, poles of a continued fraction) are conserved quantities. In this paper, a new application of conserved quantities of integrable systems to numerical algorithms is presented. For an N × N (N ≥ 2) real upper bidiagonal matrix B where all the diagonals and the upper subdiagonals are positive, conserved quantities $\mathrm{Tr}(((B^{T}B)^{M})^{-1})$ (M = 1, 2, ...) of the discrete finite Toda equation give a sequence of lower bounds of the minimal singular value of B. Recurrence relations for computing higher order conserved quantities $\mathrm{Tr}(((B^{T}B)^{M})^{-1})$ are also derived.
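
Why the traces give lower bounds can be seen from a short argument: the trace of the M-th inverse power of B^T B is the sum of sigma_i^(-2M) over all singular values, which is at least sigma_min^(-2M), so sigma_min >= Tr^(-1/(2M)). The sketch below illustrates this numerically. It assumes a dense O(N^3) computation, not the recurrence relations derived in the paper, and the function names are hypothetical.

```python
import math

def bidiagonal_inverse(diag, superdiag):
    """Dense inverse of an upper bidiagonal matrix, built column by column."""
    n = len(diag)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        inv[j][j] = 1.0 / diag[j]
        # Row i < j of B x = e_j reads diag[i]*x_i + superdiag[i]*x_{i+1} = 0.
        for i in range(j - 1, -1, -1):
            inv[i][j] = -superdiag[i] * inv[i + 1][j] / diag[i]
    return inv

def matmul(x, y):
    """Plain dense matrix product (illustration only)."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_lower_bounds(diag, superdiag, max_order):
    """Lower bounds Tr((B^T B)^{-M})^{-1/(2M)} for M = 1..max_order."""
    n = len(diag)
    inv = bidiagonal_inverse(diag, superdiag)
    inv_t = [[inv[j][i] for j in range(n)] for i in range(n)]
    c = matmul(inv, inv_t)  # (B^T B)^{-1} = B^{-1} B^{-T}
    power = c
    bounds = []
    for m in range(1, max_order + 1):
        trace = sum(power[i][i] for i in range(n))
        bounds.append(trace ** (-1.0 / (2 * m)))
        power = matmul(power, c)
    return bounds

# 2 x 2 example with diagonal (2, 1) and superdiagonal (1).
bounds = trace_lower_bounds([2.0, 1.0], [1.0], 3)
```

For this small example the bounds increase with M and stay below the true minimal singular value, sqrt(3 - sqrt(5)).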

Numerical Algorithms | 2015

A new subtraction-free formula for lower bounds of the minimal singular value of an upper bidiagonal matrix

Takumi Yamashita; Kinji Kimura; Yusaku Yamamoto

Traces of inverse powers of a positive definite symmetric tridiagonal matrix give lower bounds of the minimal singular value of an upper bidiagonal matrix. In a preceding work, a formula for the traces which gives the diagonal entries of the inverse powers is presented. In this paper, we present another formula which gives the traces based on a quite different idea from the one in the preceding work. An efficient practical implementation of the formula is also presented.

International Conference on Informatics Education and Research for Knowledge-Circulating Society (ICKS 2008) | 2008

Application of the Kato-Temple Inequality for Eigenvalues of Symmetric Matrices to Numerical Algorithms with Shift for Singular Values

Kinji Kimura; Masami Takata; Masashi Iwasaki; Yoshimasa Nakamura

The Kato-Temple inequality for eigenvalues of symmetric matrices gives a lower bound of the minimal eigenvalue $\lambda_m$. Let A be a symmetric positive definite tridiagonal matrix defined by $A = B^{T}B$, where B is bidiagonal. Then the so-called Kato-Temple bound gives a lower bound of the minimal singular value $\sigma_m$ of B. In this paper we discuss how to apply the Kato-Temple inequality to the shift of origin which appears, for example, in the mdLVs algorithm for computing all singular values of B. To make use of the Kato-Temple inequality, a Rayleigh quotient for the matrix $A = B^{T}B$ and a right endpoint of an interval to which $\lambda_m = \sigma_m^{2}$ belongs are necessary. It is then shown that the execution time of mdLVs with the standard shifts can be shortened by a suitable choice of the generalized Newton bound or the Kato-Temple bound.
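
The two ingredients named in the abstract, a Rayleigh quotient and a right endpoint, can be combined as in the following sketch. It assumes the standard form of the inequality: if rho is the Rayleigh quotient of a trial vector x, eps = ||Ax - rho x|| / ||x||, and beta satisfies rho < beta <= (second smallest eigenvalue of A), then the minimal eigenvalue is at least rho - eps^2 / (beta - rho). The function names, the trial vector, and the value of beta are hypothetical; how mdLVs chooses them is not shown here.

```python
import math

def gram_tridiagonal(diag, superdiag):
    """Main and off-diagonal of A = B^T B for an upper bidiagonal B."""
    n = len(diag)
    main = [diag[i] ** 2 + (superdiag[i - 1] ** 2 if i > 0 else 0.0)
            for i in range(n)]
    off = [diag[i] * superdiag[i] for i in range(n - 1)]
    return main, off

def tridiagonal_matvec(main, off, x):
    """Product of a symmetric tridiagonal matrix with a vector."""
    n = len(main)
    y = [main[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += off[i] * x[i + 1]
        y[i + 1] += off[i] * x[i]
    return y

def kato_temple_lower_bound(diag, superdiag, x, beta):
    """Return (rho, lower bound on lambda_min, lower bound on sigma_min).

    The caller must guarantee rho < beta <= second smallest eigenvalue.
    """
    main, off = gram_tridiagonal(diag, superdiag)
    ax = tridiagonal_matvec(main, off, x)
    xx = sum(v * v for v in x)
    rho = sum(a * v for a, v in zip(ax, x)) / xx
    if rho >= beta:
        raise ValueError("beta must exceed the Rayleigh quotient")
    eps2 = sum((a - rho * v) ** 2 for a, v in zip(ax, x)) / xx
    lam_lower = rho - eps2 / (beta - rho)
    # A negative bound carries no information for a singular value.
    sigma_lower = math.sqrt(max(lam_lower, 0.0))
    return rho, lam_lower, sigma_lower

# Example: B has diagonal (2, 1) and superdiagonal (1); x is a rough
# guess at the eigenvector of the minimal eigenvalue of B^T B.
rho, lam_lower, sigma_lower = kato_temple_lower_bound(
    [2.0, 1.0], [1.0], [1.0, -1.6], 5.0)
```

Because the Rayleigh quotient is always an upper bound on the minimal eigenvalue, the pair (lam_lower, rho) brackets it, and the bracket tightens quadratically as the trial vector improves.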

2015 10th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC) | 2015

A New Parallel Symmetric Tridiagonal Eigensolver Based on Bisection and Inverse Iteration Algorithms for Shared-Memory Multi-core Processors

Hiroyuki Ishigami; Kinji Kimura; Yoshimasa Nakamura

In order to accelerate the subset computation of eigenpairs for real symmetric tridiagonal matrices on shared-memory multi-core processors, a parallel symmetric tridiagonal eigensolver is proposed, which computes eigenvalues of target matrices using the parallel bisection algorithm and computes the corresponding eigenvectors using the block inverse iteration algorithm with reorthogonalization (BIR algorithm). The BIR algorithm is based on the simultaneous inverse iteration (SI) algorithm, which is a variant of the inverse iteration algorithm, and introduces a block parameter. Since the BIR algorithm is mainly composed of matrix multiplications, the proposed eigensolver is expected to accelerate the computation of eigenpairs even on massively parallel computers. Numerical experiments on shared-memory multi-core processors show that the BIR algorithm is faster than the SI algorithm and achieves good parallel efficiency. In addition, many of the numerical experiments also show that the proposed eigensolver, including the parallel bisection and the BIR algorithm, is more accurate than parallel implementations of other eigensolvers, such as the QR iteration algorithm, the divide-and-conquer algorithm, and the multiple relatively robust representations algorithm.
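
The bisection stage mentioned in this abstract can be illustrated with the textbook Sturm-count approach. This is a serial sketch under that assumption, not the authors' parallel implementation, and the function names are hypothetical. Each eigenvalue is located independently of the others, which is what makes the method easy to distribute across cores.

```python
def count_below(d, e, x):
    """Number of eigenvalues of the symmetric tridiagonal matrix
    (diagonal d, off-diagonal e) that are smaller than x."""
    count = 0
    q = 1.0
    for i in range(len(d)):
        off = e[i - 1] ** 2 if i > 0 else 0.0
        q = d[i] - x - off / q
        if q == 0.0:
            q = 1e-30  # crude guard against division by zero
        if q < 0.0:
            count += 1
    return count

def kth_eigenvalue(d, e, k, tol=1e-12):
    """k-th smallest eigenvalue (0-based) by bisection."""
    n = len(d)
    # Gershgorin discs give an interval containing every eigenvalue.
    radius = [(abs(e[i - 1]) if i > 0 else 0.0) +
              (abs(e[i]) if i < n - 1 else 0.0) for i in range(n)]
    lo = min(d[i] - radius[i] for i in range(n))
    hi = max(d[i] + radius[i] for i in range(n))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(d, e, mid) > k:
            hi = mid  # at least k + 1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Example: the matrix [[4, 2], [2, 2]] has eigenvalues 3 -/+ sqrt(5).
smallest = kth_eigenvalue([4.0, 2.0], [2.0], 0)
largest = kth_eigenvalue([4.0, 2.0], [2.0], 1)
```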

Symbolic Numeric Computation | 2009

A method for finding zeros of polynomial equations using a contour integral based eigensolver

Tetsuya Sakurai; Junko Asakura; Hiroto Tadano; Tsutomu Ikegami; Kinji Kimura

In this paper, we present a method for finding zeros of polynomial equations in a given domain. We apply a numerical eigensolver using contour integral for a polynomial eigenvalue problem that is derived from polynomial equations. The Dixon resultant is used to derive the matrix polynomial of which eigenvalues involve roots of the polynomial equations with respect to one variable. The matrix polynomial obtained by the Dixon resultant is sometimes singular. By applying the singular value decomposition for a matrix which appears in the eigensolver, we can obtain the roots of given polynomial systems. Experimental results demonstrate the efficiency of the proposed method.

International Workshop on Eigenvalue Problems: Algorithms, Software and Applications in Petascale Computing | 2015

A Parallel Bisection and Inverse Iteration Solver for a Subset of Eigenpairs of Symmetric Band Matrices

Hiroyuki Ishigami; Hidehiko Hasegawa; Kinji Kimura; Yoshimasa Nakamura

The tridiagonalization and its back-transformation for computing eigenpairs of real symmetric dense matrices are known to be the bottleneck of the execution time in parallel processing owing to the communication cost and the number of floating-point operations. To overcome this problem, we focus on real symmetric band eigensolvers proposed by Gupta and Murata since their eigensolvers are composed of the bisection and inverse iteration algorithms and include neither the tridiagonalization of real symmetric band matrices nor its back-transformation. In this paper, the following parallel solver for computing a subset of eigenpairs of real symmetric band matrices is proposed on the basis of Murata’s eigensolver: the desired eigenvalues of the target band matrices are computed directly by using a parallel version of Murata’s bisection algorithm. The corresponding eigenvectors are computed by using the block inverse iteration algorithm with reorthogonalization, which can be parallelized with lower communication cost than the inverse iteration algorithm. Numerical experiments on shared-memory multi-core processors show that the proposed eigensolver is faster than the conventional solvers.

Parallel, Distributed and Network-Based Processing | 2014

GPU Implementation of Inverse Iteration Algorithm for Computing Eigenvectors

Hiroyuki Ishigami; Kinji Kimura; Yoshimasa Nakamura

Effective GPU implementations of an inverse iteration algorithm with reorthogonalization are proposed for computing eigenvectors of symmetric tridiagonal matrices. The key to effectively accelerating the inverse iteration algorithm in GPU computing is the adoption of reorthogonalization code optimal for the GPU. The CGS2 algorithm and the compact WY orthogonalization algorithm, which can be implemented using level 2 BLAS routines, are implemented using CUBLAS. The size of the data transferred between the CPU and GPU is also optimally reduced. The proposed code of the inverse iteration algorithm using the CGS2 algorithm is shown to map well to a GPU and to achieve high performance through numerical experiments on a CPU-GPU heterogeneous computer.

IEEE International Conference on High Performance Computing Data and Analytics | 2012

Accelerating the Reorthogonalization of Singular Vectors with a Multi-core Processor

Hiroki Toyokawa; Hiroyuki Ishigami; Kinji Kimura; Masami Takata; Yoshimasa Nakamura

The dLV twisted factorization is an algorithm to compute singular vectors for given singular values fast and in parallel. However, the orthogonality of the computed singular vectors may deteriorate if a matrix has clustered singular values. In order to improve the orthogonality, reorthogonalization by, for example, the modified Gram-Schmidt algorithm should be done. The problem is that this process is time-consuming. In this paper, an algorithm to accelerate the reorthogonalization of singular vectors with a multi-core processor is devised.
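
For reference, the modified Gram-Schmidt step named in this abstract can be sketched as follows. This is a plain serial version with hypothetical function names, not the accelerated multi-core algorithm the paper devises. The inner loop depends on the vector updated in the previous pass, and that sequential dependence is one reason the step is slow.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def modified_gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # The projection uses the already updated w, which is what
            # distinguishes the modified from the classical variant.
            coeff = dot(q, w)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

# Nearly parallel input vectors, loosely mimicking the vectors that
# arise when singular values are clustered.
q = modified_gram_schmidt([
    [1.0, 0.0, 0.0],
    [1.0, 1e-3, 0.0],
    [1.0, 1e-3, 1e-3],
])
```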

JSIAM Letters | 2009