Kozo Kimura
Panasonic
Publication
Featured research published by Kozo Kimura.
International Symposium on Computer Architecture | 1992
Hiroaki Hirata; Kozo Kimura; Satoshi Nagamine; Yoshiyuki Mochizuki; Akio Nishimura; Yoshimori Nakase; Teiji Nishizawa
In this paper, we propose a multithreaded processor architecture which improves machine throughput. In our processor architecture, instructions from different threads (not a single thread) are issued simultaneously to multiple functional units, and these instructions can begin execution unless there are functional unit conflicts. This parallel execution scheme greatly improves the utilization of the functional unit. Simulation results show that by executing two and four threads in parallel on a nine-functional-unit processor, a 2.02 and a 3.72 times speed-up, respectively, can be achieved over a conventional RISC processor. Our architecture is also applicable to the efficient execution of a single loop. In order to control functional unit conflicts between loop iterations, we have developed a new static code scheduling technique. Another loop execution scheme, by using the multiple control flow mechanism of our architecture, makes it possible to parallelize loops which are difficult to parallelize in vector or VLIW machines.
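The issue scheme described in this abstract can be illustrated with a toy cycle-level model. This is a minimal sketch, not the authors' simulator: the function name `simulate`, the unit kinds, the round-robin priority, the one-cycle unit occupancy, and the limit of one in-order instruction per thread per cycle are all assumptions made for illustration.

```python
from collections import Counter


def simulate(threads, units):
    """Return the number of cycles needed to issue every instruction.

    threads: list of instruction streams; each instruction is just the
             name of the functional unit kind it needs (e.g. "alu").
    units:   dict mapping a unit kind to the number of copies available.

    Assumptions (not from the paper): each thread issues at most one
    instruction per cycle, in order, and a unit is busy for one cycle.
    """
    # Reject streams that could never make progress.
    for stream in threads:
        for kind in stream:
            if units.get(kind, 0) < 1:
                raise ValueError(f"no functional unit of kind {kind!r}")

    pcs = [0] * len(threads)  # next instruction index for each thread
    cycles = 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        free = Counter(units)  # every unit is free at the start of a cycle
        # Rotate priority each cycle so no thread is permanently favoured.
        order = [(cycles + i) % len(threads) for i in range(len(threads))]
        for tid in order:
            if pcs[tid] >= len(threads[tid]):
                continue  # this thread has finished
            kind = threads[tid][pcs[tid]]
            if free[kind] > 0:  # no functional unit conflict: issue
                free[kind] -= 1
                pcs[tid] += 1
            # otherwise the thread stalls and retries next cycle
        cycles += 1
    return cycles


if __name__ == "__main__":
    units = {"alu": 1, "load": 1}
    a = ["alu", "load", "alu", "load"]
    b = ["load", "alu", "load", "alu"]
    print(simulate([a], units), simulate([a, b], units))
```

In this toy setup, two threads whose unit demands interleave finish in the same number of cycles as one thread alone, while two threads competing for a single unit take turns. The figures it produces are illustrative only and are unrelated to the speed-ups reported in the paper.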
International Symposium on Industrial Electronics | 1994
Kozo Kimura; Hiroaki Hirata; Tokuzo Kiyohara; S. Asahara; Takayuki Sagishima; Takao Onoye; Isao Shirakawa
A multithreaded processor is a good approach to increasing performance by exploiting coarse-grain parallelism. However, executing multiple threads in parallel produces complicated behavior that makes performance hard to predict, so instruction-level simulation is necessary for performance evaluation. In practice, it is very difficult to select the optimum microarchitecture configuration by simulating a wide variety of candidates, because simulation times are long. This paper presents a method for evaluating microarchitectures for multithreaded processors. The method consists of three steps: first, the characteristics of the application are analysed; second, candidate microarchitectures are selected in light of those characteristics; last, the selected architectures are evaluated through instruction-level simulation using a practical application program. Experimental results using a computer graphics application show that the proposed evaluation method is very effective for increasing the performance of multithreaded processors.
International Symposium on Circuits and Systems | 1994
Takayuki Sagishima; Kozo Kimura; Hiroaki Hirata; Tokuzo Kiyohara; Shigeo Asahara; Takao Onoye; Isao Shirakawa
Multiple instruction execution is a major approach to designing high-performance processors. Attention usually focuses on superscalar and VLIW processors, which exploit instruction-level parallelism. A multithreaded processor, on the other hand, can be expected to achieve a high degree of multiple instruction execution by exploiting coarse-grain parallelism. Many computer graphics applications (such as the radiosity method and the ray-tracing method) can be optimized by reorganizing the code to take advantage of coarse-grain parallelism, but their degree of instruction-level parallelism is not sufficient for a superscalar processor. Experimental results using the radiosity method show that a 4-thread multithreaded processor achieves a 2.9 times speedup over a single thread, while a 4-issue superscalar processor manages around 1.5 times. By duplicating two kinds of functional units, the performance of the multithreaded processor increases to 3.7 times, whereas the performance of the superscalar processor saturates at around 1.5 times. Therefore, for computer graphics applications, the multithreaded processor is a better approach than the superscalar processor.
The Journal of The Institute of Image Information and Television Engineers | 1998
Kozo Kimura; Hiroyuki Okuhata; Takao Onoye; Isao Shirakawa; Tokuzo Kiyohara; Takayuki Sagishima
In this paper, we present a data cache control method for a multithreaded processor, together with its evaluation. A multithreaded processor is effective for 3D-CG; however, an increase in working set size is unavoidable, and this limits the effectiveness of the data cache. Usually, the size and/or the associativity of the cache is increased to achieve a higher cache hit rate. This increases the chip size, but performance remains limited. We propose an inter-thread non-blocking cache control method for reducing cache miss penalties. This control method achieves higher performance than the blocking cache method and requires much less hardware than a traditional non-blocking cache method. With the proposed cache control method, the performance degradation is halved and performance reaches 80-90% of the ideal-cache case.
Archive | 1997
Kozo Kimura; Tokuzo Kiyohara; Kousuke Yoshioka
Archive | 1993
Kozo Kimura; Hiroaki Hirata
Archive | 2004
Hiroyuki Morishita; Atsushi Ito; Satoshi Takashima; Hideshi Nishida; Kozo Kimura; Tokuzo Kiyohara; Akira Miyoshi; Hiroshi Kadota
Archive | 2001
Kosuke Yoshioka; Makoto Hirai; Tokuzo Kiyohara; Kozo Kimura
Archive | 1992
Kozo Kimura; Kosuke Yoshioka; Tokuzo Kiyohara
Archive | 2002
Nobuo Higaki; Tetsuya Tanaka; Kunihiko Hayashi; Hiroshi Kadota; Tokuzo Kiyohara; Kozo Kimura; Hideshi Nishida