Network


Latest external collaborations at the country level.

Hotspot


Research topics in which Yasunori Kimura is active.

Publication


Featured research published by Yasunori Kimura.


ieee international conference on high performance computing data and analytics | 2008

Performance prediction of large-scale parallell system and application using macro-level simulation

Ryutaro Susukita; Hisashige Ando; Mutsumi Aoyagi; Hiroaki Honda; Yuichi Inadomi; Koji Inoue; Shigeru Ishizuki; Yasunori Kimura; Hidemi Komatsu; Motoyoshi Kurokawa; Kazuaki Murakami; Hidetomo Shibamura; Shuji Yamamura; Yunqing Yu

Predicting application performance on an HPC system is an important technology for designing the computing system and developing applications. However, accurate prediction is a challenge, particularly for a future system with higher performance. In this paper, we present a new method for predicting application performance on HPC systems. The method combines modeling of sequential performance on a single processor with macro-level simulation of the application for parallel performance on the entire system. In the simulation, the execution flow is traced, but kernel computations are omitted to reduce the execution time. Validation on a real terascale system showed that the predicted and measured performance agreed within 10% to 20%. We employed the method in designing a hypothetical petascale system of 32,768 SIMD-extended processor cores. Predicting application performance on the petascale system required several hours of macro-level simulation.
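The core idea, tracing the parallel execution flow while replacing each kernel computation with a time taken from a sequential performance model, can be illustrated with a minimal sketch. The bulk-synchronous loop, cost function, and parameters below are illustrative assumptions, not the authors' simulator.

```python
# Minimal sketch of macro-level simulation: trace the parallel execution
# flow, but replace each kernel computation with a modeled time taken from
# a single-processor performance model. All names and numbers here are
# illustrative assumptions, not the authors' simulator.

def modeled_kernel_time(rank, step):
    # Stand-in for the sequential performance model (e.g. predicted
    # seconds per kernel invocation on one processor core).
    return 1.0e-3 * (1.0 + 0.05 * (rank % 4))

def simulate(num_procs, num_steps, comm_latency=5.0e-6):
    """Predict elapsed time of a bulk-synchronous loop without running kernels."""
    clocks = [0.0] * num_procs
    for step in range(num_steps):
        # Local computation: advance each process clock by its modeled time.
        for p in range(num_procs):
            clocks[p] += modeled_kernel_time(p, step)
        # Global synchronization: every process waits for the slowest one.
        barrier = max(clocks) + comm_latency
        clocks = [barrier] * num_procs
    return clocks[0]

if __name__ == "__main__":
    print("predicted elapsed time: %.3f s" % simulate(num_procs=1024, num_steps=100))
```

At this level of abstraction, predicting a run on tens of thousands of cores reduces to bookkeeping over per-process clocks, which is why a macro-level simulation can stay within hours even for a petascale configuration.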


Systems and Computers in Japan | 2001

Building a design support tool for superscalar processors and its case studies

Yasunori Kimura; Kouya Shimura; Haruko Nishimoto; Motoyuki Kawaba; Takeshi Eguchi

We describe Paratool, a support tool for efficiently designing superscalar processors, together with case studies of its use. Paratool is a software tool that runs on a sequential processor and predicts the performance of a superscalar processor under design from execution traces collected on a sequential processor. It can model almost any superscalar microarchitecture, including an ideal microarchitecture with unlimited resources. We also present the results of the case studies and report findings that designers tend to overlook. The tool is useful not only for architecture design but also for tuning compilers and application software and for education in universities.
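As a rough illustration of trace-driven performance prediction of the kind Paratool performs, the sketch below replays a sequential instruction trace and schedules each instruction as early as its register dependences and a configurable issue width allow; making the width effectively unlimited models the "ideal" machine. The trace format, latency model, and scheduling rule are simplified assumptions, not Paratool's actual model.

```python
# Rough sketch of trace-driven superscalar performance estimation: replay a
# sequential instruction trace and schedule each instruction as early as its
# register dependences and the issue width allow. The trace format and the
# one-cycle latency model are hypothetical simplifications.
from collections import defaultdict

def predict_cycles(trace, issue_width=4, latency=1):
    """trace: list of (dest_reg, [src_regs]) tuples in program order."""
    ready = defaultdict(int)            # cycle at which each register value is ready
    issued_in_cycle = defaultdict(int)  # how many instructions issued per cycle
    finish = 0
    for dest, srcs in trace:
        cycle = max([ready[s] for s in srcs], default=0)
        # Respect the machine's issue width (use a huge width to model the
        # "ideal", resource-unlimited microarchitecture).
        while issued_in_cycle[cycle] >= issue_width:
            cycle += 1
        issued_in_cycle[cycle] += 1
        ready[dest] = cycle + latency
        finish = max(finish, ready[dest])
    return finish

trace = [("r1", []), ("r2", ["r1"]), ("r3", ["r1"]), ("r4", ["r2", "r3"])]
print("2-wide machine:", predict_cycles(trace, issue_width=2))
print("ideal machine :", predict_cycles(trace, issue_width=10**9))
```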


Archive | 1991

Incremental Garbage Collection Scheme in KL1 and Its Architectural Support of PIM

Yasunori Kimura; Takashi Chikayama; Tsuyoshi Shinogi; Atsuhiro Goto

This paper describes an incremental garbage collection (GC) scheme for the parallel logic programming language KL1 that uses one extra bit of information attached to pointers to data objects, rather than to the data objects themselves.
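A minimal sketch of a single-reference-bit scheme in this spirit is shown below: the extra bit travels with the pointer and records whether other references may exist, so a cell reached through an unflagged pointer can be reclaimed as soon as that reference is consumed. The data structures and operation names are simplified assumptions, not the actual KL1/PIM implementation.

```python
# Sketch of a one-bit-per-pointer reclamation scheme: the bit lives in the
# pointer (a "multiple reference" flag), not in the object. A pointer whose
# flag is clear is known to be the only reference, so the cell can be
# reclaimed immediately when that reference is consumed; flagged cells are
# left to a later full GC. Simplified illustration only.

class Ptr:
    def __init__(self, cell, multi=False):
        self.cell = cell      # referenced heap cell
        self.multi = multi    # the single extra bit carried by the pointer

free_list = []

def copy_pointer(p):
    # Duplicating a reference: both copies must now assume multiple references.
    p.multi = True
    return Ptr(p.cell, multi=True)

def consume(p):
    # Read a value through a reference that will not be used again.
    value = p.cell
    if not p.multi:
        # Provably the last reference: reclaim the cell incrementally.
        free_list.append(p.cell)
    return value

a = Ptr(cell={"car": 1, "cdr": None})
b = copy_pointer(a)       # now neither a nor b may reclaim eagerly
consume(b)
single = Ptr(cell={"car": 2, "cdr": None})
consume(single)           # reclaimed at once, without a full GC
print("cells reclaimed incrementally:", len(free_list))
```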


conference on logic programming | 1985

Design and Evaluation of a Prolog Compiler

Mitsuhiro Kishimoto; Tsuyoshi Shinogi; Yasunori Kimura; Akira Hattori

This paper discusses a Prolog compiler for the FACOM α, a symbolic data processing machine. The compiler includes several optimizations, such as separated predicate frames, extended mode declarations, and fast goal invocation. Compiled programs run at 30 to 40 KLIPS (thousand logical inferences per second).


New Generation Computing | 1991

Locally parallel cache design based on KL1 memory access characteristics

Akira Matsumoto; Takayuki Nakagawa; Masatoshi Sato; Yasunori Kimura; Kenji Nishida; Atsuhiro Goto

The parallel inference machine (PIM) is now being developed at ICOT. It consists of a dozen or more clusters, each of which is a tightly coupled multiprocessor (comprising about eight processing elements) with shared global memory and a common bus. Kernel Language 1 (KL1), a parallel logic programming language based on Guarded Horn Clauses (GHC), is executed on each PIM cluster.

This paper describes the memory access characteristics of KL1 parallel execution and a locally parallel cache mechanism with a hardware lock. The most important issue in locally parallel cache design is how to reduce common bus traffic. A write-back cache protocol with five cache states, specially optimized for KL1 execution on each PIM cluster, is described. We introduce new software-controlled memory access commands, named DW, ER, and RP. A hardware lock mechanism is attached to the cache of each processor. This lock mechanism enables efficient word-by-word locking, reducing common bus traffic by exploiting the cache states.

The effects of the PIM cache are evaluated with measurements obtained by software simulation.
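The sketch below illustrates, in a deliberately generic MSI-style model, why a per-word lock served out of an exclusively owned cache line generates no common bus traffic after the first acquisition; it does not reproduce the paper's five KL1-specific cache states or the DW, ER, and RP commands.

```python
# Very simplified write-back, invalidation-based cache on a shared bus,
# used only to show how per-line coherence state lets a word-level lock be
# handled locally once the line is exclusively owned. Generic MSI-style
# model, not the five-state KL1-specific protocol of the paper.

INVALID, SHARED, MODIFIED = "I", "S", "M"

class Bus:
    def __init__(self):
        self.transactions = 0

class Cache:
    def __init__(self, bus):
        self.state = {}        # line address -> coherence state
        self.locked = set()    # (line, word) pairs locked in this cache
        self.bus = bus

    def write(self, line):
        if self.state.get(line, INVALID) != MODIFIED:
            self.bus.transactions += 1   # fetch the line with ownership
            self.state[line] = MODIFIED
        # already MODIFIED: local hit, no bus traffic

    def lock_word(self, line, word):
        self.write(line)                 # exclusive ownership => lock is local
        self.locked.add((line, word))

    def unlock_word(self, line, word):
        self.locked.discard((line, word))

bus = Bus()
cache = Cache(bus)
for _ in range(1000):                    # repeated lock/unlock of one word
    cache.lock_word(line=0x40, word=3)
    cache.unlock_word(line=0x40, word=3)
print("bus transactions:", bus.transactions)   # 1: only the first acquisition
```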


european conference on object oriented programming | 2000

OpenJIT: An Open-Ended, Reflective JIT Compiler Framework for Java

Hirotaka Ogawa; Kouya Shimura; Satoshi Matsuoka; Fuyuhiko Maruyama; Yukihiko Sohda; Yasunori Kimura


Archive | 1999

Switching multi-context processor and method overcoming pipeline vacancies

Yasunori Kimura


international conference on logic programming | 1987

Multiple Reference Management in Flat GHC.

Takashi Chikayama; Yasunori Kimura


symposium on logic programming (SLP) | 1987

An Abstract KL1 Machine and Its Instruction Set.

Yasunori Kimura; Takashi Chikayama


conference on object-oriented programming systems, languages, and applications | 2000

OpenJIT—A Reflective Java JIT Compiler

Satoshi Matsuoka; Hirotaka Ogawa; Kouya Shimura; Yasunori Kimura; Koichiro Hotta

Collaboration


An overview of Yasunori Kimura's collaborations.

Top Co-Authors

Fuyuhiko Maruyama (Tokyo Institute of Technology)

Satoshi Matsuoka (Tokyo Institute of Technology)

Yukihiko Sohda (Tokyo Institute of Technology)

Hirotaka Ogawa (Tokyo Institute of Technology)