Masami Takata | Researchain

Publication

Featured research published by Masami Takata.

parallel computing technologies | 2007

Accelerating the singular value decomposition of rectangular matrices with the CSX600 and the integrable SVD

Yusaku Yamamoto; Takeshi Fukaya; Takashi Uneyama; Masami Takata; Kinji Kimura; Masashi Iwasaki; Yoshimasa Nakamura

We propose an approach to speed up the singular value decomposition (SVD) of very large rectangular matrices using the CSX600 floating-point coprocessor. The CSX600-based acceleration board we use offers 50 GFLOPS of sustained performance, which is many times greater than that provided by standard microprocessors. However, this performance can be achieved only when a vendor-supplied matrix-matrix multiplication routine is used and the matrix size is sufficiently large. In this paper, we optimize two of the major components of rectangular SVD, namely, QR decomposition of the input matrix and back-transformation of the left singular vectors by matrix Q, so that large matrix multiplications can be used efficiently. In addition, we use the Integrable SVD algorithm to compute the SVD of an intermediate bidiagonal matrix. This helps to further speed up the computation and reduce the memory requirements. As a result, we achieved a speedup of up to 3.5 times over the Intel Math Kernel Library running on a 3.2 GHz Xeon processor when computing the SVD of a 100,000 × 4000 matrix.
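
The three stages named in the abstract (QR decomposition, SVD of an intermediate bidiagonal matrix, back-transformation by Q) fit the usual factorization chain for a tall matrix, sketched below. The factor names $U_1, V_1, U_2, V_2$ and the dimensions $m \ge n$ are notation assumed here for illustration; the abstract does not give them.

```latex
% Assumed setting: A is m x n with m >= n.
\begin{aligned}
A &= Q R
  && \text{(QR decomposition; $Q$ is $m \times n$ with orthonormal columns, $R$ is $n \times n$)} \\
R &= U_1 B V_1^{T}
  && \text{(reduction of $R$ to a bidiagonal matrix $B$)} \\
B &= U_2 \Sigma V_2^{T}
  && \text{(SVD of the bidiagonal matrix)} \\
A &= \underbrace{Q\,(U_1 U_2)}_{U}\; \Sigma\; \underbrace{(V_1 V_2)^{T}}_{V^{T}}
  && \text{(back-transformation of the left singular vectors by $Q$)}
\end{aligned}
```

Only the first and last steps touch the long dimension $m$, which is consistent with the abstract singling them out as the places where large matrix multiplications pay off.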

parallel, distributed and network-based processing | 2011

Scaleable Sparse Matrix-Vector Multiplication with Functional Memory and GPUs

Noboru Tanabe; Yuuka Ogawa; Masami Takata; Kazuki Joe

Sparse matrix-vector multiplication on GPUs faces a serious problem when the vector is too long to be stored in the GPU's device memory. To solve this problem, we propose a novel software-hardware hybrid method for a heterogeneous system with GPUs and functional memory modules connected by PCI Express. The functional memory provides a huge memory capacity and scatter/gather operations. We perform a preliminary evaluation of the proposed method using a sparse matrix benchmark collection. We observe that the proposed method on a single GPU, which converts indirect references to direct references without exhausting the GPU's cache memory, achieves a 4.1 times speedup compared with conventional methods. The proposed method is intrinsically highly scalable in the number of GPUs because communication among GPUs is completely eliminated. Therefore, we estimate that the performance of the proposed method can be expressed as the single-GPU execution performance, which may be limited by the burst-transfer bandwidth of PCI Express, multiplied by the number of GPUs.
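
To make "converting indirect references to direct references" concrete, here is a minimal software-only sketch in Python. It assumes the matrix is stored in CSR format, which the abstract does not state, and the function names are hypothetical. In the paper the gather is done by the functional memory hardware; here it is an ordinary list comprehension, so the sketch shows only the data flow, not the performance effect.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Conventional CSR SpMV: y = A @ x with an indirect read x[col_idx[k]]."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]  # indirect reference into x
    return y


def spmv_csr_gathered(row_ptr, col_idx, values, x):
    """Same product, but x is gathered first so the inner loop reads directly."""
    # Gather step (assumed here to stand in for the memory-side gather):
    # one element of x per stored nonzero, in storage order.
    gathered = [x[j] for j in col_idx]
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * gathered[k]  # direct, sequential reference
    return y


# Example matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
y_indirect = spmv_csr(row_ptr, col_idx, values, x)
y_direct = spmv_csr_gathered(row_ptr, col_idx, values, x)
```

After the gather, the compute side streams through `values` and `gathered` in lockstep and never needs the whole of `x`, which is the property that matters when `x` does not fit in device memory.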

conference on multimedia modeling | 2007

Similarity searching techniques in content-based audio retrieval via hashing

Yi Yu; Masami Takata; Kazuki Joe

In this work we study suitable indexing techniques to support efficient content-based music retrieval in large acoustic databases. To obtain an index-based retrieval mechanism applicable to audio content, we focus on the design of Locality Sensitive Hashing (LSH) and partial sequence comparison, and propose a fast and efficient query-by-content audio retrieval framework. On the basis of this indexable framework, four different retrieval schemes, LSH-Dynamic Programming (DP), LSH-Sparse DP (SDP), Exact Euclidean LSH (E2LSH)-DP, and E2LSH-SDP, are presented and evaluated in order to gain an extensive understanding of the retrieval algorithms' performance. The experimental results indicate that, compared to the other three schemes, E2LSH-SDP exhibits the best tradeoff in terms of response time, retrieval ratio, and computation cost.
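
For readers unfamiliar with E2LSH, the following is a minimal sketch of its standard hash family for Euclidean distance, h(v) = floor((a·v + b) / w), with a Gaussian vector a and an offset b drawn uniformly from [0, w). The function names, the bucket width `w`, the number of hashes `k`, and the toy vectors are all assumptions for illustration; the abstract does not specify the audio features or parameters the authors used.

```python
import math
import random


def make_e2lsh_hash(dim, w, rng):
    """Return one hash h(v) = floor((a . v + b) / w) from the Gaussian family."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random projection direction
    b = rng.uniform(0.0, w)                        # random offset within one bucket

    def h(v):
        projection = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((projection + b) / w)

    return h


def make_signature(dim, w, k, seed=0):
    """Concatenate k independent hashes into one bucket key (a tuple of ints)."""
    rng = random.Random(seed)
    hashes = [make_e2lsh_hash(dim, w, rng) for _ in range(k)]
    return lambda v: tuple(h(v) for h in hashes)


# Hypothetical 3-dimensional feature vectors.
signature = make_signature(dim=3, w=4.0, k=4, seed=1)
key_a = signature([1.0, 2.0, 3.0])
key_b = signature([1.0, 2.0, 3.0])
```

Vectors that are close in Euclidean distance land in the same bucket with higher probability than distant ones, so only the candidates sharing a bucket key need the more expensive sequence comparison (DP or SDP in the abstract's terminology).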

international conference on e science | 2006

Automatic Viewpoint Selection for a Visualization I/F in a PSE

Machiko Nakagawa; Masami Takata; Kazuki Joe

Visualization plays an important role in PSEs. Some PSEs that support scientific simulation provide a visualization I/F for simulation results. Users without deep knowledge of visualization need automatic viewpoint selection for the resulting visualization of simulation results. In this paper, we propose an automatic viewpoint selection method, called viewpoint potential, and show some experimental results.

irregular applications: architectures and algorithms | 2011

A memory accelerator with gather functions for bandwidth-bound irregular applications

Noboru Tanabe; Boonyasitpichai Nuttapon; Hironori Nakajo; Yuuka Ogawa; Junko Kogou; Masami Takata; Kazuki Joe

Compute-intensive processing can be easily accelerated using processors with many cores, such as GPUs. However, memory bandwidth limitations become more serious year by year for memory-bandwidth-intensive applications such as sparse matrix-vector multiplication (SpMV). In order to accelerate memory-bandwidth-intensive applications, we have proposed a memory system with additional scatter and gather functions. For the preliminary evaluation of our proposed system, we assumed that the throughput of the memory system was sufficient. In this paper, we propose a memory system with scattering and gathering that uses many narrow memory channels. We evaluate the feasible throughput of the proposed memory system based on DDR3 DRAM with a modified DRAMsim2 simulator. In addition, we evaluate the performance of SpMV using our method with the proposed memory system and a GPU. We have confirmed that the proposed memory system has good performance and good stability under matrix shape variation while using fewer pins for external memory.

International Conference on Informatics Education and Research for Knowledge-Circulating Society (icks 2008) | 2008

Application of the Kato-Temple Inequality for Eigenvalues of Symmetric Matrices to Numerical Algorithms with Shift for Singular Values

Kinji Kimura; Masami Takata; Masashi Iwasaki; Yoshimasa Nakamura

The Kato-Temple inequality for eigenvalues of symmetric matrices gives a lower bound on the minimal eigenvalue $\lambda_m$. Let $A$ be a symmetric positive definite tridiagonal matrix defined by $A = B^T B$, where $B$ is bidiagonal. Then the so-called Kato-Temple bound gives a lower bound on the minimal singular value $\sigma_m$ of $B$. In this paper we discuss how to apply the Kato-Temple inequality to the shift of origin that appears, for example, in the mdLVs algorithm for computing all singular values of $B$. To make use of the Kato-Temple inequality, a Rayleigh quotient for the matrix $A = B^T B$ and a right endpoint of an interval to which $\lambda_m = \sigma_m^2$ belongs are necessary. It is then shown that the execution time of mdLVs with the standard shifts can be shortened by a possible choice of the generalized Newton bound or the Kato-Temple bound.
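
For reference, the lower bound discussed in this abstract can be written out as follows. This is the standard textbook form of the Kato-Temple inequality specialized to the smallest eigenvalue, not a formula quoted from the paper; the symbols $x$, $\rho$, $\varepsilon$ and $\beta$ are notation assumed here.

```latex
% Assumed notation: x is any unit vector, \rho its Rayleigh quotient,
% \varepsilon the residual norm, and \beta a right endpoint such that
% the open interval (\lambda_m, \beta) contains no eigenvalue of A.
\[
  \rho = x^{T} A x, \qquad
  \varepsilon = \lVert A x - \rho x \rVert_2, \qquad
  \lVert x \rVert_2 = 1 .
\]
\[
  \rho < \beta
  \;\Longrightarrow\;
  \lambda_m \;\ge\; \rho - \frac{\varepsilon^{2}}{\beta - \rho},
  \qquad\text{hence}\qquad
  \sigma_m = \sqrt{\lambda_m} \;\ge\; \sqrt{\rho - \frac{\varepsilon^{2}}{\beta - \rho}}
  \quad\text{whenever the radicand is nonnegative.}
\]
```

The two ingredients the abstract calls necessary appear directly: the Rayleigh quotient $\rho$ of $A = B^T B$ and the right endpoint $\beta$ of an interval containing $\lambda_m = \sigma_m^2$.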

international conference on algorithms and architectures for parallel processing | 2013

Character of Graph Analysis Workloads and Recommended Solutions on Future Parallel Systems

Noboru Tanabe; Sonoko Tomimori; Masami Takata; Kazuki Joe

Graph500 is a benchmark suite for big data analysis. Matrices used for Graph500 inherit the properties of graph analysis workloads such as breadth-first search for SNS and PageRank for web search engines. Power saving is especially important for their execution on future massively parallel processors and clouds. The spatial locality of the sparse matrices used for Graph500 and their behavior in cache memory are investigated. The experimental results show that the spatial locality of the sparse matrices used for Graph500 is very low. It is very difficult to solve this problem by a software approach alone because of the huge size and the randomness of the accesses. Therefore, we recommend hardwired scatter/gather functions on the memory side. They improve the processing speed by an order of magnitude. To achieve both low power and high random-access throughput, we recommend implementing hardwired scatter/gather functions on the logic base of the Hybrid Memory Cube (HMC). We also briefly consider power saving for applications with a low cache hit rate, such as Graph500. For example, when the hit rate is 15%, the power saving ratio of memory access is about 30-fold.

Innovative Architecture for Future Generation High-Performance Processors and Systems (Cat. No.PR00650) | 1999

A heuristic approach to improve a branch and bound based program partitioning algorithm

Masami Takata; Yoshitoshi Kunieda; Kazuki Joe

In this paper, we propose several heuristics that improve the branch and bound based program partitioning algorithm proposed by Girkar et al., and evaluate their effectiveness experimentally. The heuristic depends heavily on the order of the edges of a given task graph. Therefore, it is necessary to sort the edges carefully to make effective use of the heuristic. Different sorting methods are investigated and experimentally evaluated. Approximate solutions that provide a sufficiently practical partitioning were obtained using the accelerated heuristic, and both the execution time and the error relative to the optimal solutions decreased considerably when the edges of the task graph were sorted.

international conference on pattern recognition applications and methods | 2014

A Multi-fonts Kanji Character Recognition Method for Early-modern Japanese Printed Books with Ruby Characters

Taeka Awazu; Manami Fukuo; Masami Takata; Kazuki Joe

The web site of the National Diet Library in Japan provides many early-modern (AD 1868-1945) Japanese printed books to the public, but full-text search is essentially impossible. In order to perform advanced searches of historical literature, automatic textualization of the images is required. However, the ruby system, which is peculiar to Japanese books, poses a serious obstacle to textualization. When we apply existing OCRs to early-modern Japanese printed books, the recognition rate is extremely low. To solve this problem, we have already proposed a multi-font Kanji character recognition method using the PDC feature and an SVM. In this paper, we propose a ruby character removal method for early-modern Japanese printed books using genetic programming, and evaluate our multi-font Kanji character recognition method with 1,000 types of early-modern printed Kanji characters.

international semantic technology conference | 2013

Ontology Construction Support for Specialized Books

Yuki Eguchi; Yuri Iwakata; Minami Kawasaki; Masami Takata; Kazuki Joe

In this paper, we present a support system for constructing an ontology from a given specialized book alone, at the lowest possible cost. The system combines minimal manual work with mostly automatic construction. It extracts the information required for the ontology design from the specialized book by presenting yes-no selections to an expert on the book. The ontology construction is then performed automatically using only the answers of the expert. The constructed ontology is reliable and highly technical since it is constructed on the basis of the specialized book. In addition, since user operations are restricted to yes-no selection, any expert can make use of our system without any special knowledge of ontologies.
