Masaaki Shimizu
Hitachi
Publications
Featured research published by Masaaki Shimizu.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2014
Taku Shimosawa; Balazs Gerofi; Masamichi Takagi; Gou Nakamura; Tomoki Shirasawa; Yuji Saeki; Masaaki Shimizu; Atsushi Hori; Yutaka Ishikawa
Turning towards exascale systems and beyond, it has been widely argued that currently available systems software will not remain feasible, due to requirements such as the ability to deal with heterogeneous architectures, the need for system-level optimization targeting specific applications, the elimination of OS noise, and, at the same time, compatibility with legacy applications. To cope with these issues, a hybrid operating system design in which lightweight specialized kernels cooperate with a traditional OS kernel seems adequate, and a number of recent research projects are now heading in this direction. This paper presents the Interface for Heterogeneous Kernels (IHK), a general framework enabling hybrid kernel designs in systems equipped with manycore processors and/or accelerators. IHK provides a range of capabilities, such as resource partitioning, management of heterogeneous OS kernels, and a low-level communication layer among the kernels. We describe IHK's interface and demonstrate its feasibility for hybrid kernel designs by executing various lightweight OS kernels on top of it, each specialized for certain types of applications. We use the Intel Xeon Phi, Intel's latest manycore coprocessor, as our experimental platform.
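The resource-partitioning idea at the heart of IHK can be illustrated with a small model: the host kernel hands exclusive sets of cores and a memory reservation to each lightweight co-kernel it boots. This is a minimal hypothetical sketch in Python (the real IHK is a Linux kernel module; all class and parameter names here are invented for illustration):

```python
# Illustrative model of IHK-style resource partitioning: the host OS carves
# out cores and memory for lightweight kernels booted alongside it.
from dataclasses import dataclass, field

@dataclass
class LightweightKernel:
    name: str
    cores: list = field(default_factory=list)  # CPU cores owned exclusively by this kernel
    mem_mb: int = 0                            # memory reserved for this kernel

class HostKernel:
    """Models the host OS that partitions resources among co-kernels."""
    def __init__(self, total_cores, total_mem_mb):
        self.free_cores = list(range(total_cores))
        self.free_mem_mb = total_mem_mb
        self.kernels = []

    def boot_kernel(self, name, n_cores, mem_mb):
        if n_cores > len(self.free_cores) or mem_mb > self.free_mem_mb:
            raise ValueError("insufficient free resources")
        lwk = LightweightKernel(
            name, [self.free_cores.pop() for _ in range(n_cores)], mem_mb)
        self.free_mem_mb -= mem_mb
        self.kernels.append(lwk)
        return lwk

# Partition a Xeon Phi-like card: most cores go to the lightweight kernel,
# a few stay with the host for management duties.
host = HostKernel(total_cores=60, total_mem_mb=8192)
lwk = host.boot_kernel("lwk0", n_cores=56, mem_mb=7168)
print(len(lwk.cores), len(host.free_cores), host.free_mem_mb)
```

The key property modeled here is exclusivity: once cores are handed to a co-kernel, the host no longer schedules on them, which is what eliminates OS noise for the application running on the lightweight kernel.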
Cluster Computing and the Grid | 2006
Taisuke Boku; Mitsuhisa Sato; Akira Ukawa; Daisuke Takahashi; Shinji Sumimoto; Kouichi Kumon; Takashi Moriyama; Masaaki Shimizu
We have been developing a large-scale PC cluster named PACS-CS (Parallel Array Computer System for Computational Sciences) at the Center for Computational Sciences, University of Tsukuba, for a wide variety of computational science applications such as computational physics, computational material science, and computational biology. We consider memory access bandwidth the most important issue for a computation node, so each node is equipped with a single CPU, unlike ordinary high-end PC clusters. The interconnection network for parallel processing is configured as a multi-dimensional hyper-crossbar network based on trunking of Gigabit Ethernet, to support large-scale scientific computation with physical space modeling. Based on this concept, we are developing an original motherboard that configures a single-CPU node with 8 Gigabit Ethernet ports, which can be implemented in half the size of a 19-inch rack-mountable 1U platform. In a preliminary performance evaluation, we confirmed that the computation part of a practical Lattice QCD code can achieve 30% of peak performance, and that up to 600 Mbyte/s of bandwidth can be achieved for single-direction neighboring communication. PACS-CS will start operation in July 2006 with 2560 CPUs and 14.3 Tflops of peak performance.
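The figures quoted in the abstract can be checked with simple arithmetic: 14.3 Tflops spread over 2560 single-CPU nodes gives the per-node peak, 30% of that is the expected sustained Lattice QCD rate, and the 600 Mbyte/s neighboring-communication figure can be compared against the 125 Mbyte/s raw ceiling of one Gigabit Ethernet link. A back-of-envelope sketch (illustrative arithmetic only, using only numbers from the abstract):

```python
# Back-of-envelope check of the PACS-CS figures quoted above.
peak_tflops = 14.3
cpus = 2560
per_node_gflops = peak_tflops * 1000 / cpus      # peak per single-CPU node
sustained_gflops = per_node_gflops * 0.30        # 30% efficiency on Lattice QCD

gbe_mb_per_s = 1000 / 8          # raw Gigabit Ethernet ceiling: 125 Mbyte/s/link
measured_mb_per_s = 600          # reported neighboring-communication bandwidth
links_equiv = measured_mb_per_s / gbe_mb_per_s   # effective trunked links used

print(f"{per_node_gflops:.2f} Gflops peak per node, "
      f"{sustained_gflops:.2f} Gflops sustained")
print(f"600 Mbyte/s corresponds to {links_equiv:.1f} of the 8 GbE links per node")
```

The last line shows why trunking matters: a single Gigabit Ethernet link could not deliver 600 Mbyte/s; the hyper-crossbar design aggregates several of the node's 8 ports per dimension.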
Grid Computing | 2010
Masaaki Shimizu; Akinori Yonezawa
Dedicated processors that are specialized for numerical computation, such as the Cell/B.E. and vector processors, tend to have low performance on the integer computations required by operating systems. To solve this problem, we propose a remote process and remote file I/O management architecture that enables processes on compute nodes with dedicated processors to be executed from a management node with general-purpose processors. The architecture allows the processes and files to be managed as a single system. The management node provides general OS functions such as process management and file I/O, while the compute nodes are dedicated to executing numerical application programs. This makes it possible to take advantage of the characteristics of each processor and achieves efficient execution of both OS functions and applications. In this architecture, our heterogeneity-aware binary loader allows programs to be executed on compute nodes with different types of processors, while our remote file I/O function transparently executes file I/O issued by programs running on the compute nodes at the management node. The proposed architecture has been integrated into the Linux kernel. The system was evaluated using a cluster of one x86_64 node and 16 Cell/B.E. nodes. The results showed that, compared to using only compute nodes, process invocation is 41 times faster than with rsh, and MPI program start-up is 1.6 times faster. For remote file I/O, throughput twice that of NFS is achieved, and a 30% reduction in execution time was confirmed for the NAS Parallel Benchmark BTIO.
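The remote file I/O idea described above can be sketched as a forwarding stub: I/O calls issued on a compute node are shipped to the management node, which performs them against its own filesystem. This is a hypothetical Python model (the real system implements this transparently inside the Linux kernel; the class names and the in-process "transport" here are invented for illustration, with an in-memory buffer standing in for the real filesystem):

```python
# Illustrative model of remote file I/O forwarding: the compute node holds
# no files; every call is serviced by the management node.
import io

class ManagementNode:
    """Owns the real files; services forwarded I/O requests."""
    def __init__(self):
        self.files = {}       # path -> in-memory file (stand-in for a real FS)
        self.handles = {}     # fd -> open file
        self.next_fd = 3

    def handle(self, op, *args):
        if op == "open":
            (path,) = args
            fd, self.next_fd = self.next_fd, self.next_fd + 1
            self.handles[fd] = self.files.setdefault(path, io.BytesIO())
            return fd
        if op == "write":
            fd, data = args
            return self.handles[fd].write(data)
        if op == "read":
            fd, n = args
            self.handles[fd].seek(0)
            return self.handles[fd].read(n)
        raise ValueError(f"unknown op {op!r}")

class ComputeNodeFS:
    """Stub on the compute node: every call is forwarded, none handled locally."""
    def __init__(self, mgmt):
        self.mgmt = mgmt
    def open(self, path):      return self.mgmt.handle("open", path)
    def write(self, fd, data): return self.mgmt.handle("write", fd, data)
    def read(self, fd, n):     return self.mgmt.handle("read", fd, n)

mgmt = ManagementNode()
fs = ComputeNodeFS(mgmt)
fd = fs.open("/results/out.dat")
fs.write(fd, b"checkpoint")
print(fs.read(fd, 10))  # the data round-trips through the management node
```

The design choice this models is the division of labor from the abstract: the dedicated processor never runs filesystem code, so its weak integer performance is never on the critical path of OS work.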
Archive | 2007
Masaaki Shimizu; Katsuji Mitsui; Masahiko Shimizu; Shinichi Yamamoto; Atsushi Shimizu; Masahiko Shinotsuka; Masaaki Yamagishi; Hiroshi Tsuji
Archive | 2011
Manabu Kato; Masaaki Shimizu
Archive | 1995
Toshiyuki Ukai; Toyohiko Kagimasa; Toshiaki Mori; Masaaki Shimizu
Archive | 1997
Toshiyuki Ukai; Masaaki Shimizu; Fujio Fujita
Archive | 2008
Masaaki Shimizu; Naonobu Sukegawa
Archive | 2002
Hitoshi Doi; Katsushi Mitsui; Atsushi Shimizu; Masaaki Shimizu
Archive | 2007
Masaaki Shimizu; Naonobu Sukegawa